How can I get an array of frequencies from audio? - java

Right now I can only get an array of bytes from the audio file, but I need the frequencies of the sound. How can I get them? I'm trying an FFT, but afterwards I get very large numbers that are not frequencies. Of course, I can't multiply by i, because this is Java:
private static double[] fft(byte[] bytes) {
    double[] fft = new double[bytes.length];
    for (int k = 0; k < bytes.length; k++) {
        for (int n = 0; n < bytes.length; n++) {
            fft[k] += bytes[n] * Math.pow(Math.E, -2 * Math.PI * k * n / bytes.length);
        }
    }
    return fft;
}

I would suggest using a complex type for your FFT calculations; it will make everything simpler (or at least more readable) and doesn't add a lot of overhead.
I am not a Java person, and the JDK doesn't seem to have a built-in complex type, but implementations like this exist:
https://introcs.cs.princeton.edu/java/97data/Complex.java.html
Your FFT could then be something like this (a bit of unoptimized pseudo code!):
private static Complex[] fft(byte[] bytes) {
    Complex[] fft = new Complex[bytes.length];
    for (int k = 0; k < bytes.length; k++) {
        fft[k] = new Complex(0, 0);
        for (int n = 0; n < bytes.length; n++) {
            Complex temp = new Complex(0, -2 * Math.PI * k * n / bytes.length);
            fft[k] = fft[k].plus(temp.exp().scale(bytes[n]));
        }
    }
    return fft;
}
You can get the magnitudes with something like
fft[k].abs()
I would also look at your outer loop (over k): this is the size of your FFT, and it is currently the length of the input. This may or may not be what you want; I would suggest reading up on signal windowing.
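As an aside, the original real-valued loop can also be made correct without a complex class by keeping separate real and imaginary accumulators. A minimal sketch (a naive O(n²) DFT for illustration, not an optimized FFT; the class name is mine):

```java
public class DftSketch {
    // Naive O(n^2) DFT: accumulate real and imaginary parts separately,
    // then take the magnitude of each bin.
    static double[] dftMagnitudes(byte[] bytes) {
        int n = bytes.length;
        double[] mags = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re += bytes[t] * Math.cos(angle);
                im += bytes[t] * Math.sin(angle);
            }
            mags[k] = Math.sqrt(re * re + im * im); // magnitude, not yet a frequency
        }
        return mags;
    }

    public static void main(String[] args) {
        // For a constant signal, all energy lands in bin 0 (DC).
        double[] mags = dftMagnitudes(new byte[]{1, 1, 1, 1});
        System.out.println(mags[0]); // 4.0
    }
}
```

Remember the magnitudes are still indexed by bin, not by Hz; bin k corresponds to k * sampleRate / n Hz.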

Related

Looping twice but costing the same time in Java

I have a loop like this:
double[][] ED = new double[n][m];
for (int k = 0; k < 20000; k++)
    for (int i = 0; i < 200; i++)
        for (int j = 0; j < 200; j++)
            ED[i][j] = dis(qx[k][i], qy[k][i], dx[k][j], dy[k][j]);
"dis" is a function that calculates the distance between (x1,y1) and (x2,y2); don't mind it. The problem appears when I add another boolean assignment in the loop, like this:
double[][] ED = new double[n][m];
boolean[][] bool = new boolean[n][m];
for (int k = 0; k < 20000; k++)
    for (int i = 0; i < 200; i++)
        for (int j = 0; j < 200; j++) {
            ED[i][j] = dis(qx[k][i], qy[k][i], dx[k][j], dy[k][j]);
            bool[i][j] = ED[i][j] > 5000;
        }
The new loop costs 1.5 times as much as the first one, which I think is too much. For testing, I broke the 2 assignments into 2 loops. A strange thing happens: the two timings are the same. Sometimes code 3 even costs less time than code 2:
double[][] ED = new double[n][m];
boolean[][] bool = new boolean[n][m];
for (int k = 0; k < 20000; k++) {
    for (int i = 0; i < 200; i++)
        for (int j = 0; j < 200; j++) {
            ED[i][j] = dis(qx[k][i], qy[k][i], dx[k][j], dy[k][j]);
        }
    for (int i = 0; i < 200; i++)
        for (int j = 0; j < 200; j++) {
            bool[i][j] = ED[i][j] > 5000;
        }
}
My aim is to spend as little time as possible calculating bool[i][j]. How should I do that?
Introducing the new, big array bool[][] may have more impact than it seems.
When only the single array ED[i][j] is used, you put less stress on the L1 processor cache.
With the second array, you have twice as much data, so the cache will be invalidated more often.
Could you try, instead of using two arrays (bool and ED), a single array that holds both the double and the boolean? There will be significant overhead for an array of objects, but (maybe) the compiler will be smart enough to destructure the object.
With a single array, you will have better data locality.
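A cheaper variant of the locality idea: a Java double[][] is an array of separate row objects, so even flattening ED into one contiguous one-dimensional array can help. A small sketch (names mirror the question):

```java
public class FlatArray {
    // Index of element (i, j) of an n x m matrix stored row-major in one block.
    static int index(int i, int j, int m) {
        return i * m + j;
    }

    public static void main(String[] args) {
        int n = 200, m = 200;
        double[] ED = new double[n * m]; // one contiguous allocation, cache friendly
        ED[index(2, 3, m)] = 42.0;       // stands for ED[2][3] in the 2-D version
        System.out.println(ED[2 * m + 3]); // 42.0
    }
}
```

Whether this actually wins depends on the JIT and access pattern, so measure it rather than assume.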
Also, as suggested in the comments, make sure you do your microbenchmarking properly.
Use JMH (http://openjdk.java.net/projects/code-tools/jmh/) and read its documentation on how to use it correctly.
Another solution that would help on a multicore system is parallel processing. You may create a ThreadPoolExecutor (with pool size equal to the number of cores you have), then submit each operation as a task to the executor. An operation may be the innermost loop (with counter j) or even the two innermost loops (with counters i and j).
There will be some overhead for coordinating the work, but execution time should be much faster if you have 4 or 8 cores.
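A hedged sketch of that executor idea, where one task fills one row per k (dis and the qx/qy/dx/dy names are stand-ins mirroring the question, and sizes are shrunk for the demo):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelDistance {
    // Stand-in for the asker's dis(): Euclidean distance.
    static double dis(double x1, double y1, double x2, double y2) {
        double dx = x1 - x2, dy = y1 - y2;
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Parallelize the i loop for one value of k: each task fills one row.
    static void computeOneK(int k, double[][] qx, double[][] qy,
                            double[][] dx, double[][] dy,
                            double[][] ED, boolean[][] bool,
                            ExecutorService pool) {
        int n = ED.length, m = ED[0].length;
        CountDownLatch latch = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            final int ii = i;
            pool.submit(() -> {
                for (int j = 0; j < m; j++) {
                    ED[ii][j] = dis(qx[k][ii], qy[k][ii], dx[k][j], dy[k][j]);
                    bool[ii][j] = ED[ii][j] > 5000;
                }
                latch.countDown();
            });
        }
        try {
            latch.await(); // finish this k before moving to the next
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int n = 4, m = 4; // 200 x 200 in the real code
        double[][] ED = new double[n][m];
        boolean[][] bool = new boolean[n][m];
        double[][] qx = new double[1][n], qy = new double[1][n];
        double[][] dx = new double[1][m], dy = new double[1][m];
        qx[0][1] = 3; qy[0][1] = 4; // distance from the origin is 5
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        computeOneK(0, qx, qy, dx, dy, ED, bool, pool);
        pool.shutdown();
        System.out.println(ED[1][0]); // 5.0
    }
}
```

Per-row tasks keep the coordination overhead small relative to the 200-element inner loop; per-element tasks would drown in submission overhead.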
Yet another idea is to change the input data structure. Instead of operating on 4 two-dimensional arrays (qx, qy, dx, dy), could you use a single array? It may make the dis function faster.

Fastest linear algebra library in terms of Cholesky factorization

I'd really appreciate it if any of you could point me towards the most optimized and computationally quick linear algebra library in terms of Cholesky factorization.
So far I've been using the Apache Commons Math library, but perhaps there are more robust and better-optimized options already available.
For instance, would PColt, EJML or ojAlgo be better choices? The most urgent concern is mainly one: I need to iteratively calculate (within a loop of generally 2048 iterations) the lower triangular Cholesky factor of up to three different matrices; the largest size the matrices will reach is about 2000x2000.
Cholesky factorisation is quite a simple algorithm. Here's the (unoptimised) C# code that I use. C# and Java are quite similar, so it should be an easy job for you to convert it to Java and make whatever improvements you deem necessary.
public class CholeskyDecomposition {
    public static double[,] Do(double[,] input) {
        int size = input.GetLength(0);
        if (input.GetLength(1) != size)
            throw new Exception("Input matrix must be square");
        double[] p = new double[size];
        double[,] result = new double[size, size];
        Array.Copy(input, result, input.Length);
        for (int i = 0; i < size; i++) {
            for (int j = i; j < size; j++) {
                double sum = result[i, j];
                for (int k = i - 1; k >= 0; k--)
                    sum -= result[i, k] * result[j, k];
                if (i == j) {
                    if (sum < 0.0)
                        throw new Exception("Matrix is not positive definite");
                    p[i] = System.Math.Sqrt(sum);
                } else
                    result[j, i] = sum / p[i];
            }
        }
        for (int r = 0; r < size; r++) {
            result[r, r] = p[r];
            for (int c = r + 1; c < size; c++)
                result[r, c] = 0;
        }
        return result;
    }
}
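Since the question is about Java, here is a hedged sketch of such a port (same algorithm, but building the lower-triangular factor L directly rather than copying the input; the class and method names are mine):

```java
public class CholeskyJava {
    // Returns the lower-triangular L such that L * L^T == input.
    public static double[][] decompose(double[][] input) {
        int size = input.length;
        if (input[0].length != size)
            throw new IllegalArgumentException("Input matrix must be square");
        double[][] L = new double[size][size];
        for (int i = 0; i < size; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = input[i][j];
                for (int k = 0; k < j; k++)
                    sum -= L[i][k] * L[j][k];
                if (i == j) {
                    if (sum <= 0.0)
                        throw new IllegalArgumentException("Matrix is not positive definite");
                    L[i][i] = Math.sqrt(sum);
                } else {
                    L[i][j] = sum / L[j][j];
                }
            }
        }
        return L;
    }

    public static void main(String[] args) {
        // [[4,2],[2,3]] factors as L = [[2,0],[1,sqrt(2)]]
        double[][] L = decompose(new double[][]{{4, 2}, {2, 3}});
        System.out.println(L[0][0] + " " + L[1][0] + " " + L[1][1]);
    }
}
```

For 2000x2000 matrices a tuned library will still beat this naive triple loop, but it is a useful correctness baseline.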
Have a look at the Java Matrix Benchmark. The "Inver Symm" case tests inverting a matrix using the Cholesky decomposition. If you get the source code for the benchmark, there is also a pure Cholesky decomposition test that you can turn on.
Here's another comparison of various matrix decompositions between ojAlgo and JAMA

Java generate all possible permutations of a String

I know this question has been asked many times, but I'm looking for a very fast algorithm to generate all strings of length 8, where each character in the string can be any of the characters 0-9 or a-z (36 total options). Currently, this is the code I have to do it:
for (idx[2] = 0; idx[2] < ch1.length; idx[2]++)
    for (idx[3] = 0; idx[3] < ch1.length; idx[3]++)
        for (idx[4] = 0; idx[4] < ch1.length; idx[4]++)
            for (idx[5] = 0; idx[5] < ch1.length; idx[5]++)
                for (idx[6] = 0; idx[6] < ch1.length; idx[6]++)
                    for (idx[7] = 0; idx[7] < ch1.length; idx[7]++)
                        for (idx[8] = 0; idx[8] < ch1.length; idx[8]++)
                            for (idx[9] = 0; idx[9] < ch1.length; idx[9]++)
                                String name = String.format("%c%c%c%c%c%c%c%c%c%c", ch1[idx[0]], ch2[idx[1]], ch3[idx[2]], ch4[idx[3]], ch5[idx[4]], ch6[idx[5]], ch7[idx[6]], ch8[idx[7]], ch9[idx[8]], ch10[idx[9]]);
As you can see, this code is not pretty by any means. Also, this code can generate 280 thousand Strings per second. I'm looking for an algorithm to do it even faster than that.
I've tried a recursive approach, but that seems to run slower than this approach does. Suggestions?
Should be faster (generates way above million outputs per second), and at least it's definitely more pleasant to read:
final long count = 36L * 36L * 36L * 36L * 36L * 36L * 36L * 36L;
for (long i = 0; i < count; ++i) {
    String name = StringUtils.leftPad(Long.toString(i, 36), 8, '0');
}
This exploits the fact that your problem:
generate a String of length 8, where each character in the String can be any of the characters 0-9 or a-z (36 total options)
Can be reformulated to:
Print all numbers from 0 until 36^8 in the base-36 system.
A few notes:
the output is sorted by definition, nice!
I'm using StringUtils.leftPad() for simplicity, see also: How can I pad an integer with zeros on the left?
what you are looking for is not really a permutation
by exploiting the fact that you generate all subsequent numbers, you can easily improve this algorithm even further:
final int MAX = 36;
final long count = 1L * MAX * MAX * MAX * MAX * MAX * MAX * MAX * MAX; // 36^8
final char[] alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();
final int[] digits = new int[8];
final char[] output = "00000000".toCharArray();
for (long i = 0; i < count; ++i) {
    final String name = String.valueOf(output);
    // "increment"
    for (int d = 7; d >= 0; --d) {
        digits[d] = (digits[d] + 1) % MAX;
        output[d] = alphabet[digits[d]];
        if (digits[d] > 0) {
            break; // no carry, done
        }
    }
}
The program above, on my computer, generates more than 30 million strings per second. And there's still much room for improvement.
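If you'd rather avoid the Apache Commons dependency, the padding in the first version can be done with plain JDK string handling. A small hedged sketch (method name is mine):

```java
public class Base36Pad {
    // Left-pad the base-36 representation of i to 8 characters with '0'.
    static String pad36(long i) {
        String s = Long.toString(i, 36);
        return "00000000".substring(s.length()) + s;
    }

    public static void main(String[] args) {
        System.out.println(pad36(35)); // 0000000z
    }
}
```

Note that Long.toString(i, 36) produces lowercase digits, matching the question's 0-9/a-z alphabet.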
This code may look a little prettier - or at least more complex ;)
boolean incrementIndex(int[] idx, final int maxValue) {
    int i = 0;
    int currIndexValue;
    do {
        currIndexValue = idx[i];
        currIndexValue++;
        if (currIndexValue > maxValue) {
            currIndexValue = 0;
        }
        idx[i] = currIndexValue;
        i++;
    } while (currIndexValue == 0 && i < idx.length);
    // more combinations remain unless every position wrapped around to 0
    return currIndexValue != 0;
}
do {
    // create String from idx[0]...idx[8]
} while (incrementIndex(idx, 35));

Neural Network Backpropagation does not compute weights correctly

Currently, I am having problems with the backpropagation algorithm.
I am trying to implement it and use it to recognize the direction of faces (left, right, down, straight).
Basically, I have N images; I read the pixels and rescale their values (0 to 255) to values from 0.0 to 1.0. All images are 32*30.
I have an input layer of 960 neurons, a hidden layer of 3 neurons and an output layer of 4 neurons. For example, the output <0.1,0.9,0.1,0.1> means that the person looks to the right.
I followed the pseudo-code. However, it doesn't work right: it does not compute the correct weights and consequently can't handle the training and test examples.
Here are parts of the code:
// main function - it runs the algorithm
private void runBackpropagationAlgorithm() {
    for (int i = 0; i < 900; ++i) {
        for (ImageUnit iu : images) {
            double[] error = calcOutputError(iu.getRatioMatrix(), iu.getClassification());
            changeHiddenUnitsOutWeights(error);
            error = calcHiddenError(error);
            changeHiddenUnitsInWeights(error, iu.getRatioMatrix());
        }
    }
}

// it creates the neural network
private void createNeuroneNetwork() {
    Random generator = new Random();
    for (int i = 0; i < inHiddenUnitsWeights.length; ++i) {
        for (int j = 0; j < hiddenUnits; ++j) {
            inHiddenUnitsWeights[i][j] = generator.nextDouble();
        }
    }
    for (int i = 0; i < hiddenUnits; ++i) {
        for (int j = 0; j < 4; ++j) {
            outHddenUnitsWeights[i][j] = generator.nextDouble();
        }
    }
}

// Calculates the error in the network. It runs through the whole network.
private double[] calcOutputError(double[][] input, double[] expectedOutput) {
    int currentEdge = 0;
    Arrays.fill(hiddenUnitNodeValue, 0.0);
    for (int i = 0; i < input.length; ++i) {
        for (int j = 0; j < input[0].length; ++j) {
            for (int k = 0; k < hiddenUnits; ++k) {
                hiddenUnitNodeValue[k] += input[i][j] * inHiddenUnitsWeights[currentEdge][k];
            }
            ++currentEdge;
        }
    }
    double[] out = new double[4];
    for (int j = 0; j < 4; ++j) {
        for (int i = 0; i < hiddenUnits; ++i) {
            out[j] += outHddenUnitsWeights[i][j] * hiddenUnitNodeValue[i];
        }
    }
    double[] error = new double[4];
    Arrays.fill(error, 4);
    for (int i = 0; i < 4; ++i) {
        error[i] = ((expectedOutput[i] - out[i]) * (1.0 - out[i]) * out[i]);
        //System.out.println((expectedOutput[i] - out[i]) + " " + expectedOutput[i] + " " + out[i]);
    }
    return error;
}

// Changes the weights of the outgoing edges of the hidden neurons
private void changeHiddenUnitsOutWeights(double[] error) {
    for (int i = 0; i < hiddenUnits; ++i) {
        for (int j = 0; j < 4; ++j) {
            outHddenUnitsWeights[i][j] += learningRate * error[j] * hiddenUnitNodeValue[i];
        }
    }
}

// goes back to the hidden units to calculate their error.
private double[] calcHiddenError(double[] outputError) {
    double[] error = new double[hiddenUnits];
    for (int i = 0; i < hiddenUnits; ++i) {
        double currentHiddenUnitErrorSum = 0.0;
        for (int j = 0; j < 4; ++j) {
            currentHiddenUnitErrorSum += outputError[j] * outHddenUnitsWeights[i][j];
        }
        error[i] = hiddenUnitNodeValue[i] * (1.0 - hiddenUnitNodeValue[i]) * currentHiddenUnitErrorSum;
    }
    return error;
}

// changes the weights of the incoming edges to the hidden neurons. input is the matrix of ratios
private void changeHiddenUnitsInWeights(double[] error, double[][] input) {
    int currentEdge = 0;
    for (int i = 0; i < input.length; ++i) {
        for (int j = 0; j < input[0].length; ++j) {
            for (int k = 0; k < hiddenUnits; ++k) {
                inHiddenUnitsWeights[currentEdge][k] += learningRate * error[k] * input[i][j];
            }
            ++currentEdge;
        }
    }
}
As the algorithm runs, it computes bigger and bigger weights, which finally approach infinity (NaN values). I checked the code, but alas, I didn't manage to solve my problem.
I will be very grateful to anyone who tries to help me.
I didn't check all of your code. I just want to give you some general advice. I don't know if your goal is (1) to learn the direction of faces or (2) to implement your own neural network.
In case (1) you should consider one of those libraries. They just work and give you much more flexible configuration options. For example, standard backpropagation is one of the worst optimization algorithms for neural networks. The convergence depends on the learning rate. I can't see which value you chose in your implementation, but it could be too high. There are other optimization algorithms that don't require a learning rate or adapt it during training. In addition, 3 neurons in the hidden layer is most likely not enough. Most of the neural networks that have been used for images have hundreds and sometimes even thousands of hidden units. I would suggest you first try to solve your problem with a fully developed library. If that works, try implementing your own ANN or be happy. :)
In case (2) you should first try to solve a simpler problem. Take a very simple artificial data set, then take a standard benchmark, and then try it with your data. A good way to verify that your backpropagation implementation works is a comparison with a numerical differentiation method.
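To illustrate that numerical-differentiation check: compute your analytic derivative and a central finite difference of the same function; if the gradient code is right, the two agree to many decimal places. A hedged sketch using the logistic function as a stand-in for a network output:

```java
import java.util.function.DoubleUnaryOperator;

public class GradientCheck {
    public static void main(String[] args) {
        DoubleUnaryOperator f = x -> 1.0 / (1.0 + Math.exp(-x)); // logistic
        double eps = 1e-6;
        for (double x : new double[]{-2.0, 0.0, 1.5}) {
            double s = f.applyAsDouble(x);
            double analytic = s * (1.0 - s); // known derivative of the logistic
            double numeric =
                (f.applyAsDouble(x + eps) - f.applyAsDouble(x - eps)) / (2 * eps);
            // a correct gradient makes this difference tiny (~1e-10 or less)
            System.out.println("x=" + x + "  |analytic - numeric|="
                + Math.abs(analytic - numeric));
        }
    }
}
```

In a real backprop check you would perturb one weight at a time, recompute the loss, and compare against the gradient your code produced for that weight.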
Your code is missing the transfer functions. It sounds like you want the logistic function with a softmax output. You need to include the following in calcOutputError
// Logistic transfer function for the hidden layer.
for (int k = 0; k < hiddenUnits; ++k) {
    hiddenUnitNodeValue[k] = logistic(hiddenUnitNodeValue[k]);
}
and
// Softmax transfer function for the output layer.
double sum = 0;
for (int j = 0; j < 4; ++j) {
    out[j] = logistic(out[j]);
    sum += out[j];
}
for (int j = 0; j < 4; ++j) {
    out[j] = out[j] / sum;
}
where the logistic function is
public double logistic(double x) {
    return 1 / (1 + Math.exp(-x));
}
Note that the softmax transfer function gives you outputs that sum to 1, so they can be interpreted as probabilities.
Also, your calculation of the error gradient for the output layer is incorrect. It should simply be
for (int i = 0; i < 4; ++i) {
    error[i] = (expectedOutput[i] - out[i]);
}
I haven't tested your code, but I am almost certain that you start out with weights that are too large.
Most introductions to the subject leave it at "initialize the weights with random values", leaving out that the algorithm actually diverges (goes to Inf) for some starting values.
Try using smaller starting values, for example between -1/5 and 1/5, and shrink them down from there.
Additionally, write a method for matrix multiplication; you have (only) used that pattern 4 times, and it would be much easier to see if there is a problem there.
I had a similar problem with a neural network processing grayscale images. You have 960 input values ranging between 0 and 255. Even with small initial weights, you can end up having inputs to your neurons with a very large magnitude and the backpropagation algorithm gets stuck.
Try dividing each pixel value by 255 before passing it into the neural network. That's what worked for me. Just starting with extremely small initial weights wasn't enough, I believe due to the floating-point precision issue brought up in the comments.
As suggested in another answer, a good way to test your algorithm is to see if your network can learn a simple function like XOR.
And for what it's worth, 3 neurons in the hidden layer was plenty for my purpose (identifying the gender of a facial image).
I wrote an entirely new neural-network library and it works. I'm sure that in my previous attempt I missed the idea of using transfer functions and their derivatives. Thank you, all!

Spectral analysis using FFT, fundamental frequency derivation

I need to perform spectral analysis of a simple wav file.
The things I have already done:
Read the file into a byte array:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
int bytesRead = 0;
while ((bytesRead = audioStream.read(buffer)) != -1) {
    baos.write(buffer, 0, bytesRead);
}
fileByteArray = baos.toByteArray();
Then I transform it to real values (doubles), so I've got the sample values stored in a double[] array.
How can I compute the FFT of those samples and estimate the fundamental frequency?
Using the JTransforms library I tried something like this:
DoubleFFT_1D fft = new DoubleFFT_1D(reader.getSpectrum().getYvalues().length);
double[] x = reader.getSpectrum().getYvalues();
double[] frequencyArray = new double[x.length / 2];
double[] amplitudeArray = new double[x.length / 2];
fft.realForward(x);
int i = 0;
for (int j = 0; j < x.length - 2; j += 2) {
    frequencyArray[i] = i;
    amplitudeArray[i] = Math.sqrt(Math.pow(x[j], 2) + Math.pow(x[j + 1], 2));
    i++;
}
Is it correct?
All suggestions are appreciated ;)
You should use autocorrelation, which can be computed efficiently with an FFT:
DoubleFFT_1D fft = new DoubleFFT_1D(reader.getSpectrum().getYvalues().length);
DoubleFFT_1D ifft = new DoubleFFT_1D(reader.getSpectrum().getYvalues().length);
fft.realForward(x);
for (int i = 0; i < x.length / 2; i++) {
    x[2 * i] = Math.sqrt(Math.pow(x[2 * i], 2) + Math.pow(x[2 * i + 1], 2));
    x[2 * i + 1] = 0;
}
ifft.realInverse(x, true); // JTransforms realInverse takes a scaling flag
for (int i = 1; i < x.length; i++)
    x[i] /= x[0];
x[0] = 1.0;
This code gives you a list of values in which x[i] is the correlation of the signal with itself shifted by i samples.
So, for example, if you have a high value (close to 1) for x[n], that means you have a fundamental signal period of n*(1000/sampleRateHz) msecs, which is equivalent to a frequency of sampleRateHz/n Hz.
The values in the frequency array need to be related to the sample rate and the length of the FFT: bin i of an N-point FFT corresponds to i * sampleRate / N Hz.
You will still need to solve the problem of distinguishing the fundamental frequency from the peak frequencies. For that, you may want to use a pitch detection/estimation algorithm, of which there are many (look for research papers on the topic).
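As a sketch of that bin-to-frequency relationship (hypothetical names; amplitudeArray as in the question, with an assumed fftSize and sampleRate), noting that the loudest bin is the peak frequency, which is not necessarily the fundamental:

```java
public class PeakFrequency {
    // Returns the frequency (Hz) of the largest-magnitude bin.
    static double peakFrequency(double[] amplitudeArray, int fftSize, double sampleRate) {
        int peak = 0;
        for (int i = 1; i < amplitudeArray.length; i++) {
            if (amplitudeArray[i] > amplitudeArray[peak]) {
                peak = i;
            }
        }
        return peak * sampleRate / fftSize; // bin i corresponds to i * Fs / N Hz
    }

    public static void main(String[] args) {
        double[] mags = new double[512]; // magnitudes for bins 0..N/2-1 of a 1024-point FFT
        mags[10] = 1.0;                  // pretend bin 10 dominates
        // 10 * 44100 / 1024 ≈ 430.66 Hz
        System.out.println(peakFrequency(mags, 1024, 44100.0));
    }
}
```

For finer resolution than sampleRate/N you would interpolate around the peak or use a longer FFT.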
