How can I get the magnitudes and corresponding frequencies after performing an FFT on a dataset? I need to plot magnitude vs. frequency for the dataset. Also, why is the FFT array allocated at twice the size of the actual dataset, and why is the size of the resulting output array different again? Please help me understand this FFT code. Further, when should complexForward be used and when realForward? What is the difference between the two?
int length = data.length;
FloatFFT_1D fftDo = new FloatFFT_1D(length);
float[] fft = new float[length * 2];
System.arraycopy(data, 0, fft, 0, length);
fftDo.complexForward(fft);
//for(double d: fft) {
//System.out.println(d);
//}
float outputfft[] = new float[(fft.length+1)/2];
if(fft.length%2==0){
for(int i = 0; i < length/2; i++){
outputfft[i]= (float) Math.sqrt((Math.pow(fft[2*i],2))+(Math.pow(fft[(2*(i))+1], 2)));
}
}else{
for(int i = 0; i < (length/2)+1; i++){
outputfft[i]= (float) Math.sqrt((Math.pow(fft[2*i],2))+(Math.pow(fft[(2*i)+1], 2)));
}
}
for (float f : outputfft) {
System.out.println(f);
}
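The FFT of a real-valued data vector is complex and conjugate-symmetric. If you have a vector of N samples you will get an FFT of N frequency bins spaced Fs/N apart, where Fs is the sampling frequency. Your FFT array is twice the size of the data because the complex values are interleaved: [re, im, re, im, ...]. Note that complexForward expects its input in that same interleaved layout, so the real samples belong at the even indices with zeros at the odd indices, which the System.arraycopy call in the question does not do.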
The output data is half the size because the spectrum is symmetric: you only need the first half, corresponding to frequencies [0 .. Fs/2]; the upper half corresponds to [-Fs/2 .. 0).
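If your input data is real-valued, you may use the realForward function instead: it takes the N real samples directly, without interleaved zero imaginary parts, and returns only the non-redundant half of the spectrum in a packed layout. complexForward is the general case for complex-valued input.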
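To make the layout concrete, here is a minimal sketch of the whole path from real samples to magnitudes and frequencies. To keep it runnable without any library, it uses a naive O(N^2) DFT called complexForwardNaive, which is a hypothetical stand-in I made up for an in-place complex forward FFT; the class and method names are assumptions too, not part of JTransforms.

```java
final class SpectrumSketch {

    // Naive O(N^2) complex DFT on an interleaved [re, im, re, im, ...] buffer.
    // Hypothetical stand-in for an in-place complex forward FFT.
    static void complexForwardNaive(float[] a) {
        int n = a.length / 2;
        float[] out = new float[a.length];
        for (int k = 0; k < n; k++) {
            double sumRe = 0, sumIm = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                double c = Math.cos(angle), s = Math.sin(angle);
                sumRe += a[2 * t] * c - a[2 * t + 1] * s;
                sumIm += a[2 * t] * s + a[2 * t + 1] * c;
            }
            out[2 * k] = (float) sumRe;
            out[2 * k + 1] = (float) sumIm;
        }
        System.arraycopy(out, 0, a, 0, a.length);
    }

    // Magnitudes for bins 0..N/2 (DC up to Nyquist) of N real samples.
    static float[] magnitudes(float[] data) {
        int n = data.length;
        float[] fft = new float[2 * n];
        for (int i = 0; i < n; i++) {
            fft[2 * i] = data[i]; // real part at even index; odd index stays 0
        }
        complexForwardNaive(fft);
        float[] mag = new float[n / 2 + 1];
        for (int k = 0; k < mag.length; k++) {
            mag[k] = (float) Math.hypot(fft[2 * k], fft[2 * k + 1]);
        }
        return mag;
    }

    // Frequency in Hz of each bin returned by magnitudes(): k * Fs / N.
    static float[] frequencies(int n, float sampleRate) {
        float[] f = new float[n / 2 + 1];
        for (int k = 0; k < f.length; k++) {
            f[k] = k * sampleRate / n;
        }
        return f;
    }
}
```

With JTransforms you would presumably swap the stand-in for fftDo.complexForward(fft) and keep the interleaving and magnitude loops as they are. Plotting magnitudes()[k] against frequencies()[k] gives the magnitude vs. frequency graph asked for.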
I'm trying to get the most representative frequency (or first harmonic) from an audio file using the Noise FFT library (https://github.com/paramsen/noise). I have an array with the values of size x and the output array's size is x+2. I'm not familiar with the Fourier Transform, so maybe I'm missing something, but from my understanding I should have something that represents the frequencies and stores the magnitude (or in this case a complex number from which to calculate it) of each one.
The thing is: since each position in the array should be a frequency, how can I know the range of the output frequencies, what frequency is each position or something like that?
Edit: This is part of the code I'm using
float[] mono = new float[size];
// I fill the array with the appropriate values
Noise noise = Noise.real(size);
float[] dst = new float[size + 2];
float[] fft = noise.fft(mono, dst);
// The result array has the pairs of real+imaginary floats in a one dimensional array; even indices
// are real, odd indices are imaginary. DC bin is located at index 0, 1, nyquist at index n-2, n-1
double greatest = 0;
int greatestIdx = 0;
for(int i = 0; i < fft.length / 2; i++) {
float real = fft[i * 2];
float imaginary = fft[i * 2 + 1];
double magnitude = Math.sqrt(real*real+imaginary*imaginary);
if (magnitude > greatest) {
greatest = magnitude;
greatestIdx = i;
}
System.out.printf("index: %d, real: %.5f, imaginary: %.5f\n", i, real, imaginary);
}
I just noticed something I had overlooked. The comment just before the for loop (which comes from the sample code provided on GitHub) says that Nyquist is located at the last pair of values of the array. From what I found, Nyquist is 22050 Hz, so to know the frequency corresponding to greatestIdx, should I map the range [0,size+2] to the range [0,22050] and calculate the new value? It seems like a pretty imprecise measure.
Taking the prior things into account, maybe I should use another library for more precision? If that is the case, what would be one that lets me specify the output frequency range or that gives me approximately the human hearing range by default?
I believe that the answer to your question is here if I understand it correctly https://stackoverflow.com/a/4371627/9834835
To determine the frequency for each FFT bin you may use the formula
F = i * sample / nFft
where:
i = the FFT bin index
sample = the sample rate
nFft = your FFT size
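As a rough sketch of that mapping for the code in the question: an output of size + 2 floats holds size / 2 + 1 complex bins, so the bin index runs from 0 (DC) to size / 2 (Nyquist), and the spacing between bins is the sample rate divided by the FFT size. The class name, the method names, and the example values (44100 Hz, size 4096) are my assumptions, not something taken from the Noise library.

```java
final class BinFrequency {

    // Frequency in Hz at the centre of a given bin.
    // fftSize is the number of real input samples, not the output array length.
    static double binToHz(int bin, double sampleRate, int fftSize) {
        return bin * sampleRate / fftSize;
    }

    // Spacing between neighbouring bins in Hz.
    static double resolutionHz(double sampleRate, int fftSize) {
        return sampleRate / fftSize;
    }

    public static void main(String[] args) {
        double sampleRate = 44100.0; // assumed sample rate
        int size = 4096;             // assumed FFT size
        int greatestIdx = 41;        // example peak index, made up for illustration
        System.out.println("peak at about " + binToHz(greatestIdx, sampleRate, size) + " Hz");
        System.out.println("bin spacing " + resolutionHz(sampleRate, size) + " Hz");
    }
}
```

So the precision is set by the bin spacing rather than by the library: with these assumed values it is about 10.8 Hz, and a longer FFT narrows it.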
I'm trying to understand how to implement the Fast Fourier Transform using the separability property.
Image from the book: Gonzalez, R. C., and R. E. Woods. "Digital Image Processing, 4th Global Edition."
According to this, as I understand, we must compute the matrix of complex sinewave values like this:
Complex[] F = new Complex[i.width*i.height];
for (int x=0;x<i.img.length;x++){
F[x] = new Complex();
double theta = -2 * Math.PI * (x/(double)i.img.length);
F[x] = new Complex(Math.cos(theta), Math.sin(theta));
}
So, our matrix is a 1D array with the size of our image.
And then we have to take an image and multiply it with row-by-row array F, and then take the processed image and multiply it with column-by-column array F.
What I can't understand is how to properly perform this row-by-row and column-by-column multiplication.
Let's say we take F and separate its columns by chunks. I did it like this:
int chunk = image.width;
for(int f=0;f<F.length;f+=chunk){
row = Arrays.copyOfRange(F, f, Math.min(F.length,f+chunk));
for(int k=0; k < row.length; k++){
row[k] = row[k].mul(image[k]);
clone[k] = (byte) row[k].r;
}
}
But this is wrong, the row gets empty.
Should we actually cut the 1D F array like this in order to multiply it with the pixel values? Or is there another way to compute the row-by-row and column-by-column multiplication? How can it be implemented in Java?
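For what it's worth, here is one way the separability idea could be sketched. The 2-D transform F(u,v) = sum over x and y of f(x,y) * exp(-j*2*pi*(u*x/M + v*y/N)) factors into a 1-D transform along each row followed by a 1-D transform along each column of the intermediate result. The kernel depends on the product u*x, so a single array of complex exponentials indexed by x alone is not enough. The sketch uses a naive O(N^2) 1-D DFT rather than a fast algorithm, and every name in it is hypothetical.

```java
final class Dft2dSketch {

    // In-place naive 1-D DFT of one line of complex samples (re[i] + j*im[i]).
    static void dft1d(double[] re, double[] im) {
        int n = re.length;
        double[] outRe = new double[n];
        double[] outIm = new double[n];
        for (int u = 0; u < n; u++) {
            for (int x = 0; x < n; x++) {
                double theta = -2 * Math.PI * u * x / n;
                double c = Math.cos(theta), s = Math.sin(theta);
                outRe[u] += re[x] * c - im[x] * s;
                outIm[u] += re[x] * s + im[x] * c;
            }
        }
        System.arraycopy(outRe, 0, re, 0, n);
        System.arraycopy(outIm, 0, im, 0, n);
    }

    // In-place 2-D DFT of an image stored as re[y][x], im[y][x].
    static void dft2d(double[][] re, double[][] im) {
        int rows = re.length, cols = re[0].length;
        // Pass 1: transform every row.
        for (int y = 0; y < rows; y++) {
            dft1d(re[y], im[y]);
        }
        // Pass 2: transform every column of the row-transformed data.
        double[] colRe = new double[rows];
        double[] colIm = new double[rows];
        for (int x = 0; x < cols; x++) {
            for (int y = 0; y < rows; y++) {
                colRe[y] = re[y][x];
                colIm[y] = im[y][x];
            }
            dft1d(colRe, colIm);
            for (int y = 0; y < rows; y++) {
                re[y][x] = colRe[y];
                im[y][x] = colIm[y];
            }
        }
    }
}
```

For an image, fill re with the pixel values, leave im at zero, and call dft2d. Replacing dft1d with a real FFT keeps the same row-then-column structure.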
I am trying to read a NetCDF file with 4 parameters (Time, Depth, Latitude, Longitude), I want to read the file at a constant Time and depth.
Right now I am reading the whole file and then getting the values in a 4D grid and then parsing the grid to get the values at constant depth and time into a 2D array
//I have read the values of time and depth in TimeArr and depthArr respectively
int depthIndex = binarySearchInArray(depthArr, d);
int timeIndex = binarySearchInArray(timeArr, d);
ArrayFloat.D4 tempArr = (ArrayFloat.D4) v.read();
float[][] grid = new float[(int) latArr.getSize()][(int) lonArr.getSize()];
for (int i = 0; i < latArr.getSize(); i++) {
for (int j = 0; j < lonArr.getSize(); j++) {
grid[i][j] = tempArr.get(timeIndex, depthIndex, i, j);
}
}
return grid;
The line ArrayFloat.D4 tempArr = (ArrayFloat.D4) v.read(); takes a lot of time to read the file if it's too large.
Also, it is pointless to read all the dimensions when I need it for only one.
Is there a way to directly read a file along 2 dimensions only (with 2 dimensions, Time and Depth, kept constant)?
Thank you so much in advance.
One way you can just read the data you need would be to use the read(int[] origin, int[] shape) method on the Variable:
// define the indexes where you would like the array
// subset to start
int[] origin = new int[] {timeIndex, depthIndex, 0, 0};
// define the overall size of the read to be done, starting
// at origin
int[] size = new int[] {1, 1, latSize, lonSize};
// read the subset
Array data4D = v.read(origin, size);
// remove any dimensions of size 1
Array data2D = data4D.reduce();
where latSize and lonSize are the size of those dimensions, respectively.
For more information, as well as for a few other approaches, see the netCDF-Java tutorial (specifically the Reading data from a Variable section).
Cheers!
I am not so proficient in Java, so please keep it quite simple. I will, though, try to understand everything you post. Here's my problem.
I have written code to record audio from an external microphone and store that in a .wav. Storing this file is relevant for archiving purposes. What I need to do is a FFT of the stored audio.
My approach to this was loading the wav file as a byte array and transforming that, with the problem that 1. There's a header in the way I need to get rid of, but I should be able to do that and 2. I got a byte array, but most if not all FFT algorithms I found online and tried to patch into my project work with complex / two double arrays.
I tried to work around both these problems and finally was able to plot my FFT array as a graph, when I found out it was just giving me back "0"s. The .wav file is fine though, I can play it back without problems. I thought maybe converting the bytes into doubles was the problem for me, so here's my approach to that (I know it's not pretty)
byte ByteArray[] = Files.readAllBytes(wav_path);
String s = new String(ByteArray);
double[] DoubleArray = toDouble(ByteArray);
// build 2^n array, fill up with zeroes
boolean exp = false;
int i = 0;
int pow = 0;
while (!exp) {
pow = (int) Math.pow(2, i);
if (pow > ByteArray.length) {
exp = true;
} else {
i++;
}
}
System.out.println(pow);
double[] Filledup = new double[pow];
for (int j = 0; j < DoubleArray.length; j++) {
Filledup[j] = DoubleArray[j];
System.out.println(DoubleArray[j]);
}
for (int k = DoubleArray.length; k < Filledup.length; k++) {
Filledup[k] = 0;
}
This is the function I'm using to convert the byte array into a double array:
public static double[] toDouble(byte[] byteArray) {
ByteBuffer byteBuffer = ByteBuffer.wrap(byteArray);
double[] doubles = new double[byteArray.length / 8];
for (int i = 0; i < doubles.length; i++) {
doubles[i] = byteBuffer.getDouble(i * 8);
}
return doubles;
}
The header still is in there, I know that, but that should be the smallest problem right now. I transformed my byte array to a double array, then filled up that array to the next power of 2 with zeroes, so that the FFT can actually work (it needs an array of 2^n values). The FFT algorithm I'm using gets two double arrays as input, one being the real, the other being the imaginary part. I read, that for this to work, I'd have to keep the imaginary array empty (but its length being the same as the real array).
Worth mentioning: I'm recording at 44100 Hz, 16 bit and mono.
If necessary, I'll post the FFT I'm using.
If I try to print the values of the double array, I get kind of weird results:
...
-2.0311904060823147E236
-1.3309975624948503E241
1.630738286366793E-260
1.0682002560745842E-255
-5.961832069690704E197
-1.1476447092561027E164
-1.1008407401197794E217
-8.109566204271759E298
-1.6104556241572942E265
-2.2081172620352248E130
NaN
3.643749694745671E-217
-3.9085815506127892E202
-4.0747557114875874E149
...
I know that somewhere the problem lies with me overlooking something very simple I should be aware of, but I can't seem to find the problem. My question finally is: How can I get this to work?
There's a header in the way I need to get rid of […]
You need to use javax.sound.sampled.AudioInputStream to read the file if you want to "skip" the header. This is useful to learn anyway, because you would need the data in the header to interpret the bytes if you did not know the exact format ahead of time.
I'm recording with 44100 Hz, 16 bit and mono.
So, this almost certainly means the data in the file is encoded as 16-bit integers (short in Java nomenclature).
Right now, your ByteBuffer code makes the assumption that it's already 64-bit floating point and that's why you get strange results. In other words, you are reinterpreting the binary short data as if it were double.
What you need to do is read in the short data and then convert it to double.
For example, here's a rudimentary routine that does what you're trying to do (supporting 8-, 16-, 32- and 64-bit signed integer PCM):
import javax.sound.sampled.*;
import javax.sound.sampled.AudioFormat.Encoding;
import java.io.*;
import java.nio.*;
static double[] readFully(File file)
throws UnsupportedAudioFileException, IOException {
AudioInputStream in = AudioSystem.getAudioInputStream(file);
AudioFormat fmt = in.getFormat();
byte[] bytes;
try {
if(fmt.getEncoding() != Encoding.PCM_SIGNED) {
throw new UnsupportedAudioFileException();
}
// read the data fully; available() plus a single read() is not
// guaranteed to return every byte (readAllBytes() needs Java 9+)
bytes = in.readAllBytes();
} finally {
in.close();
}
int bits = fmt.getSampleSizeInBits();
double max = Math.pow(2, bits - 1);
ByteBuffer bb = ByteBuffer.wrap(bytes);
bb.order(fmt.isBigEndian() ?
ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
double[] samples = new double[bytes.length * 8 / bits];
// convert sample-by-sample to a scale of
// -1.0 <= samples[i] < 1.0
for(int i = 0; i < samples.length; ++i) {
switch(bits) {
case 8: samples[i] = ( bb.get() / max );
break;
case 16: samples[i] = ( bb.getShort() / max );
break;
case 32: samples[i] = ( bb.getInt() / max );
break;
case 64: samples[i] = ( bb.getLong() / max );
break;
default: throw new UnsupportedAudioFileException();
}
}
return samples;
}
The FFT algorithm I'm using gets two double arrays as input, one being the real, the other being the imaginary part. I read, that for this to work, I'd have to keep the imaginary array empty (but its length being the same as the real array).
That's right. The real part is the audio sample array from the file, the imaginary part is an array of equal length, filled with 0's e.g.:
double[] realPart = mySamples;
double[] imagPart = new double[realPart.length];
myFft(realPart, imagPart);
More info... "How do I use audio sample data from Java Sound?"
The samples in a wave file are not going to be already 8-byte doubles that can be directly copied as per your posted code.
You need to look up (partially from the WAVE header format and from the RIFF specification) the data type, format, length and endianness of the samples before converting them to doubles.
Try 2 byte little-endian signed integers as a likely possibility.
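A minimal sketch of that conversion, assuming the samples really are 16-bit little-endian signed mono PCM. The header length is left as a parameter (dataOffset, a name I made up) because it should come from parsing the file rather than from a hard-coded guess.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class Pcm16Sketch {

    // Converts 16-bit little-endian signed PCM bytes, starting at dataOffset,
    // to doubles in the range -1.0 <= sample < 1.0.
    static double[] pcm16LeToDoubles(byte[] bytes, int dataOffset) {
        ByteBuffer bb = ByteBuffer
                .wrap(bytes, dataOffset, bytes.length - dataOffset)
                .order(ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[bb.remaining() / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = bb.getShort() / 32768.0; // 2 bytes per sample
        }
        return samples;
    }
}
```

The resulting array has half as many entries as there are data bytes, which is also the length to use when padding to the next power of two.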
I have recorded an array[1024] of data from my mic on my Android phone and passed it through a 1D forward DFT of the real data (with a further 1024 entries set to 0). I saved the array to a text file, and repeated this 8 times.
I got back 16384 results. I opened the text file in Excel and made a graph to see what it looked like(x=index of array, y=size of number returned). There are some massive spikes (both positive and negative) in magnitude around 110, 232, and small spikes continuing in that fashion until around 1817 and 1941 where the spikes get big again, then drop again.
My problem is that wherever I look for help on the topic it mentions getting the real and imaginary numbers, but I only have a 1D array that I got back from the method I used from Piotr Wendykier's class:
DoubleFFT_1D.realForwardFull(audioDataArray); // from the library JTransforms.
My question is: What do I need to do to this data to return a frequency?
The sound recorded was me playing an 'A' on the bottom string (5th fret) of my guitar (at roughly 440Hz) .
The complex data is interleaved, with real components at even indices and imaginary components at odd indices, i.e. the real components are at index 2*i, the imaginary components are at index 2*i+1.
To get the magnitude of the spectrum at index i, you want:
re = fft[2*i];
im = fft[2*i+1];
magnitude[i] = sqrt(re*re+im*im);
Then you can plot magnitude[i] for i = 0 to N / 2 to get the power spectrum. Depending on the nature of your audio input you should see one or more peaks in the spectrum.
To get the approximate frequency of any given peak you can convert the index of the peak as follows:
freq = i * Fs / N;
where:
freq = frequency in Hz
i = index of peak
Fs = sample rate in Hz (e.g. 44100 Hz, or whatever you are using)
N = size of FFT (e.g. 1024 in your case)
Note: if you have not previously applied a suitable window function to the time-domain input data then you will get a certain amount of spectral leakage and the power spectrum will look rather "smeared".
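As an illustration of the windowing step, here is a small sketch using a Hann window, which is one common choice among several. The class and method names are hypothetical. The window is multiplied sample by sample into the time-domain data before the FFT.

```java
final class HannWindowSketch {

    // Hann window coefficients: w[i] = 0.5 * (1 - cos(2*pi*i / (n - 1))).
    static double[] hann(int n) {
        double[] w = new double[n];
        if (n == 1) {
            w[0] = 1.0; // avoid dividing by zero for a single sample
            return w;
        }
        for (int i = 0; i < n; i++) {
            w[i] = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
        }
        return w;
    }

    // Returns a windowed copy of the input, leaving the original untouched.
    static double[] apply(double[] data) {
        double[] w = hann(data.length);
        double[] out = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i] * w[i];
        }
        return out;
    }
}
```

The window tapers both ends of the block towards zero, which reduces the leakage at the cost of slightly wider peaks.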
To expand on this further, here is pseudo-code for a complete example where we take audio data and identify the frequency of the largest peak:
N = 1024 // size of FFT and sample window
Fs = 44100 // sample rate = 44.1 kHz
data[N] // input PCM data buffer
fft[N * 2] // FFT complex buffer (interleaved real/imag)
magnitude[N / 2] // power spectrum
// capture audio in data[] buffer
// ...
// apply window function to data[]
// ...
// copy real input data to complex FFT buffer
for i = 0 to N - 1
fft[2*i] = data[i]
fft[2*i+1] = 0
// perform in-place complex-to-complex FFT on fft[] buffer
// ...
// calculate power spectrum (magnitude) values from fft[]
for i = 0 to N / 2 - 1
re = fft[2*i]
im = fft[2*i+1]
magnitude[i] = sqrt(re*re+im*im)
// find largest peak in power spectrum
max_magnitude = -INF
max_index = -1
for i = 0 to N / 2 - 1
if magnitude[i] > max_magnitude
max_magnitude = magnitude[i]
max_index = i
// convert index of largest peak to frequency
freq = max_index * Fs / N