I get a result I don't understand when I apply the JTransforms FFT.
The output frequency differs from what I expect.
I am using the JTransforms library, specifically realForward(double[] a).
To test the application, I used the following parameters:
input frequency = 50 Hz
sample rate = 1 kHz
signal length = 1024
Below is a code snippet of the test method I wrote:
private static void test() {
//double[] signal = {980, 988, 1160, 1080, 928, 1068, 1156, 1152, 1176, 1264};
int signalLength = 1024;
double[] signal = new double[signalLength];
double sampleRate = 1000;
// Generate sine signal: f = 50 Hz, sample rate = 1000 Hz (sample period = 0.001 s)
for (int i = 0; i < signal.length; i++) {
signal[i] = Math.sin(2 * Math.PI * i * 50.0 / sampleRate);
}
// Copy signal for columbiaFFT
double signal2[] = signal.clone();
// Calculate FFT using Jtransforms
DoubleFFT_1D fft_1D = new DoubleFFT_1D(signal.length);
fft_1D.realForward(signal);
double[] magResult = new double[signal.length / 2];
double re, im;
magResult[0] = signal[0];
for (int i = 1; i < magResult.length - 1; i++) {
re = signal[i * 2];
im = signal[i * 2 + 1];
magResult[i] = Math.sqrt(re * re + im * im);
}
// converting bin to frequency values
double[] bin2freq = new double[magResult.length];
// sampleRate is in Hz
for (int i = 0; i < bin2freq.length; i++) {
bin2freq[i] = i * sampleRate / magResult.length;
//bin2freq[i] = i * sampleRate / * signal.length;
}
System.out.println("freq 1 " + bin2freq[1]);
// Calculate FFT using columbiaFFT
FFTColumbia fftColumbia = new FFTColumbia(signalLength);
double[] imaginary = new double[signal2.length];
fftColumbia.fft(signal2, imaginary);
double[] magColumbia = new double[signal2.length];
for (int i = 0; i < magColumbia.length; i++) {
magColumbia[i] = Math.sqrt(Math.pow(signal2[i], 2) + Math.pow(imaginary[i], 2));
}
}
When I plot the magnitude of the signal, I see noise and negative amplitude values, which I think could come from not applying a window. Beyond that, the frequency plot I obtain from the JTransforms FFT is not what I expect (image here).
I would also like to ask whether the FFT Columbia algorithm outputs frequency and amplitude directly, or whether it also outputs bins that I would have to convert to frequencies.
See Plotting FFT Columbia vs Jtransform
Blue signal is FFT Columbia output
Red signal is FFT Jtransform output
If that's the case I might have generated the signal wrong.
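For reference, an N-point DFT of a signal sampled at rate fs places bin k at k * fs / N, where N is the full transform length (1024 here), not the length of the half-size magnitude array. The sketch below illustrates that mapping with a naive O(N^2) DFT so that it has no dependencies; it does not use JTransforms, and the class and method names are made up for illustration.

```java
final class BinFrequencyDemo {

    /** Frequency in Hz of bin k for an n-point DFT at the given sample rate. */
    static double binToFrequency(int k, double sampleRate, int n) {
        return k * sampleRate / n;
    }

    /** Magnitudes of bins 0 .. n/2 - 1, computed with a naive O(n^2) DFT (illustration only). */
    static double[] magnitudes(double[] x) {
        int n = x.length;
        double[] mag = new double[n / 2];
        for (int k = 0; k < mag.length; k++) {
            double re = 0;
            double im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im -= x[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im); // a magnitude is never negative
        }
        return mag;
    }

    /** Index of the largest magnitude. */
    static int peakBin(double[] mag) {
        int best = 0;
        for (int k = 1; k < mag.length; k++) {
            if (mag[k] > mag[best]) {
                best = k;
            }
        }
        return best;
    }

    /** n samples of a unit-amplitude sine at hz, sampled at sampleRate. */
    static double[] sine(double hz, double sampleRate, int n) {
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = Math.sin(2 * Math.PI * i * hz / sampleRate);
        }
        return x;
    }

    public static void main(String[] args) {
        double sampleRate = 1000;
        int n = 1024;
        int peak = peakBin(magnitudes(sine(50, sampleRate, n)));
        // Divide by the full transform length n, not by n / 2.
        System.out.println("peak bin " + peak + " is " + binToFrequency(peak, sampleRate, n) + " Hz");
    }
}
```

With 1024 samples at 1 kHz the bin spacing is a little under 1 Hz, so a 50 Hz tone falls between bins and its peak lands in bin 51 (about 49.8 Hz). Dividing by the half-length array instead would report roughly twice that frequency.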
tl;dr: when I calculate and visualize an FFT for any audio sample, the visualization is full of background noise to the point of swallowing the signal. Why?
Full question/details: I'm (long-term) attempting to make an audio fingerprinter following the blog post here. The code given is incomplete and this is my first time doing this kind of audio processing, so I'm filling in blanks in both the code and my knowledge as I go.
The post first explains running the audio sample through a windowed FFT. I'm using the Apache Commons FastFourierTransform class for this, and I've sanity checked some very simple bit patterns against their computed FFTs with good results.
The post then detours into making a basic spectrum analyzer to confirm that the FFT is working as intended, and here's where I see my issue.
The post's spectrum analyzer is very simple code. results is a Complex[][] containing the raw results of the FFT.
for(int i = 0; i < results.length; i++) {
int freq = 1;
for(int line = 1; line < size; line++) {
// To get the magnitude of the sound at a given frequency slice
// get the abs() from the complex number.
// In this case I use Math.log to get a more manageable number (used for color)
double magnitude = Math.log(results[i][freq].abs()+1);
// The more blue in the color the more intensity for a given frequency point:
g2d.setColor(new Color(0,(int)magnitude*10,(int)magnitude*20));
// Fill:
g2d.fillRect(i*blockSizeX, (size-line)*blockSizeY,blockSizeX,blockSizeY);
// I used an improvised logarithmic scale and normal scale:
if (logModeEnabled && (Math.log10(line) * Math.log10(line)) > 1) {
freq += (int) (Math.log10(line) * Math.log10(line));
} else {
freq++;
}
}
}
The post's visualization results as shown are good quality. This is a picture of a sample from Aphex Twin's song "Equation", which has an image of the artist's face encoded into it:
Indeed, when I take a short sample from the song (starting around 5:25) and run it through the online spectrum analyzer here for a sanity check, I get a pretty legible rendition of the face:
But my own results on the exact same audio file are a lot noisier, to the point that I have to mess with the spectrum analyzer's colors just to get something to show at all, and I never get to see the full face:
I get this kind of heavy background noise with any audio sample I try, across a variety of factors - MP3 or WAV, mono or stereo, short sample or long sample, a simple audio pattern or a complex song.
I've experimented with different FFT window sizes, conversion from raw FFT frequency output to power or dB, and different ways of visualizing the FFT output just in case the issue is with the visualization. None of that has helped.
I looked up the WebAudio implementation behind the Academo online spectrum analyzer, and it looks like there's a lot going on there: a Blackman window instead of my simple rectangular window to smooth the audio sampling; an interesting FFT with a built-in multiplication by 1/N, which seems to match the Unitary normalization provided by Apache Commons' FFT class; a smoothing function on the frequency data; and conversion from frequency values to dB to top it all off. Just for fun, I tried mimicking the WebAudio setup, but with about the same or even worse noise in the results, which suggests the issue is in the FFT step rather than any of the pre or post processing. I'm not sure how this can be the case when the FFT passes my basic calculation checks. I suppose the issue could be in the audio reading step, that I'm passing garbage into the FFT and getting garbage back, but I've experimented with reading the audio file and immediately writing a copy back to disk, and the new copy sounds just fine.
Here's a simplified version of my code that demonstrates the issue:
//Application.java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
public class Application {
public static void main(String[] args) {
sanityCheckFft();
File inputFile = new File("C:\\Aphex Twin face.mp3");
AudioProcessor audioProcessor = new AudioProcessor();
SpectrumAnalyzer debugSpectrumAnalyzer = new SpectrumAnalyzer();
try {
AudioInputStream audioStream = readAudioFile(inputFile);
byte[] bytes = audioStream.readAllBytes();
AudioFormat audioFormat = audioStream.getFormat();
FftChunk[] fft = audioProcessor.calculateFft(bytes, audioFormat, 4096);
debugSpectrumAnalyzer.debugFftSpectrum(fft);
}
catch (Exception e) {
e.printStackTrace();
}
}
//https://github.com/hendriks73/ffsampledsp#usage
private static AudioInputStream readAudioFile(File file) throws IOException, UnsupportedAudioFileException {
// compressed stream
AudioInputStream mp3InputStream = AudioSystem.getAudioInputStream(file);
// AudioFormat describing the compressed stream
AudioFormat mp3Format = mp3InputStream.getFormat();
// AudioFormat describing the desired decompressed stream
int sampleSizeInBits = 16;
int frameSize = 16 * mp3Format.getChannels() / 8;
AudioFormat pcmFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
mp3Format.getSampleRate(),
sampleSizeInBits,
mp3Format.getChannels(),
frameSize,
mp3Format.getSampleRate(),
mp3Format.isBigEndian());
// actually decompressed stream (signed PCM)
final AudioInputStream pcmInputStream = AudioSystem.getAudioInputStream(pcmFormat, mp3InputStream);
return pcmInputStream;
}
private static void sanityCheckFft() {
AudioProcessor audioProcessor = new AudioProcessor();
//pattern 1: one block
byte[] bytePattern1 = new byte[] {2, 1, -1, 5, 0, 3, 0, -4};
FftChunk[] fftResults1 = audioProcessor.calculateFft(bytePattern1, null, 8);
//expected results: [6 + 0J, -5.778 - 3.95J, 3 + -3J, 9.778 - 5.95J, -4 + 0J, 9.778 + 5.95J, 3 + 3J, -5.778 + 3.95J]
//expected results verified with https://engineering.icalculator.info/discrete-fourier-transform-calculator.html
//pattern 2: two blocks
byte[] bytePattern2 = new byte[] {2, 1, -1, 5, 0, 3, 0, -4};
FftChunk[] fftResults2 = audioProcessor.calculateFft(bytePattern2, null, 4);
//expected results: [7 + 0J, 3 + 4J, -5 + 0J, 3 - 4J], [-1 + 0J, 0 - 7J, 1 + 0J, 0 + 7J]
//expected results verified with https://engineering.icalculator.info/discrete-fourier-transform-calculator.html
/* pattern 3
* "Try a signal of alternate ones and negative ones with zeros between each. (i.e. 1,0,-1,0, 1,0,-1,0, ...) For a real FFT of length 1024, this should give you a single peak at out[255] ( the 256th frequency bin)"
* - https://stackoverflow.com/questions/8887896/why-does-my-kiss-fft-plot-show-duplicate-peaks-mirrored-on-the-y-axis#comment11127476_8887896
*/
byte[] bytePattern3 = new byte[1024];
byte[] pattern3Phases = new byte[] {1, 0, -1, 0};
for (int pattern3Index = 0; pattern3Index < bytePattern3.length; pattern3Index++) {
int pattern3PhaseIndex = pattern3Index % pattern3Phases.length;
byte pattern3Phase = pattern3Phases[pattern3PhaseIndex];
bytePattern3[pattern3Index] = pattern3Phase;
}
FftChunk[] fftResults3 = audioProcessor.calculateFft(bytePattern3, null, 1024);
//expected results: 0s except for fftResults[256]
}
}
//AudioProcessor.java
import javax.sound.sampled.AudioFormat;
import org.apache.commons.math3.complex.Complex;
import org.apache.commons.math3.transform.DftNormalization;
import org.apache.commons.math3.transform.FastFourierTransformer;
import org.apache.commons.math3.transform.TransformType;
public class AudioProcessor {
public FftChunk[] calculateFft(byte[] bytes, AudioFormat audioFormat, int debugActualChunkSize) {
//final int BITS_PER_BYTE = 8;
//final int PREFERRED_CHUNKS_PER_SECOND = 60;
/* turn the audio bytes into chunks. Each chunk represents the audio played during a certain window of time, defined by the audio's play rate (frame rate * frame size = the number of bytes processed per second)
* and the number of chunks we want to cut each second of audio into.
* frame rate * frame size = 1 second worth of bytes
* if we divide each second worth of data into chunksPerSecond chunks, that gives us:
* 1 chunk in bytes = 1 second in bytes / chunksPerSecond
* 1 chunk in bytes = frame rate * frame size / chunksPerSecond
*/
//float oneSecondByteLength = audioFormat.getChannels() * audioFormat.getSampleRate() * (audioFormat.getSampleSizeInBits() / BITS_PER_BYTE);
//int preferredChunkSize = (int)(oneSecondByteLength / PREFERRED_CHUNKS_PER_SECOND);
//int actualChunkSize = getPreviousPowerOfTwo(preferredChunkSize);
int chunkCount = bytes.length / debugActualChunkSize;
FastFourierTransformer fastFourierTransformer = new FastFourierTransformer(DftNormalization.STANDARD);
FftChunk[] fftResults = new FftChunk[chunkCount];
//set up each chunk individually for FFT processing
for (int timeIndex = 0; timeIndex < chunkCount; timeIndex++) {
//to map the input into the frequency domain, we need complex numbers (we only use the normal half of the Complex, but we need to provide & receive the entire Complex value)
Complex[] currentChunkComplexRepresentation = new Complex[debugActualChunkSize];
for (int currentChunkIndex = 0; currentChunkIndex < debugActualChunkSize; currentChunkIndex++) {
//get the next byte in the current audio chunk
int currentChunkCurrentByteIndex = (timeIndex * debugActualChunkSize) + currentChunkIndex;
byte currentChunkCurrentByte = bytes[currentChunkCurrentByteIndex];
//put the time domain data into a complex number with imaginary part as 0
currentChunkComplexRepresentation[currentChunkIndex] = new Complex(currentChunkCurrentByte, 0);
}
//perform FFT analysis on the chunk
Complex[] currentChunkFftResults = fastFourierTransformer.transform(currentChunkComplexRepresentation, TransformType.FORWARD);
FftChunk fftResult = new FftChunk(currentChunkFftResults);
fftResults[timeIndex] = fftResult;
}
return fftResults;
}
}
//FftChunk.java
import org.apache.commons.math3.complex.Complex;
import lombok.Data;
import lombok.RequiredArgsConstructor;
@Data
@RequiredArgsConstructor
public class FftChunk {
private final Complex[] fftResults;
}
//SpectrumAnalyzer.java
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import javax.swing.JComponent;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.ScrollPaneConstants;
import javax.swing.WindowConstants;
import org.apache.commons.math3.complex.Complex;
public class SpectrumAnalyzer {
private JFrame frame;
private SpectrumAnalyzerComponent spectrumAnalyzerComponent;
public void debugFftSpectrum(FftChunk[] spectrum) {
Dimension windowSize = new Dimension(1000, 600);
spectrumAnalyzerComponent = new SpectrumAnalyzerComponent();
JScrollPane scrollPanel = new JScrollPane(spectrumAnalyzerComponent);
scrollPanel.setHorizontalScrollBarPolicy(ScrollPaneConstants.HORIZONTAL_SCROLLBAR_ALWAYS);
scrollPanel.setPreferredSize(windowSize);
frame = new JFrame();
frame.add(scrollPanel);
frame.setSize(windowSize);
frame.setVisible(true);
frame.setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
spectrumAnalyzerComponent.analyze(spectrum);
}
}
@SuppressWarnings("serial")
class SpectrumAnalyzerComponent extends JComponent {
private FftChunk[] spectrum;
private boolean useLogScale = true;
private int blockSizeX = 1;
private int blockSizeY = 1;
private BufferedImage cachedImage;
public void analyze(FftChunk[] spectrum) {
this.spectrum = spectrum;
if (spectrum == null) {
cachedImage = null;
}
else {
int newWidth = (spectrum.length * blockSizeX) + blockSizeX;
int newHeight = 0;
for (FftChunk audioChunk : spectrum) {
Complex[] chunkFftResults = audioChunk.getFftResults();
int chunkHeight = calculatePixelHeight(chunkFftResults);
if (chunkHeight > newHeight) {
newHeight = chunkHeight;
}
}
Dimension newSize = new Dimension(newWidth, newHeight);
this.setPreferredSize(newSize);
this.setSize(newSize);
this.revalidate();
cachedImage = new BufferedImage(newWidth, newHeight, BufferedImage.TYPE_INT_RGB);
drawSpectrum(cachedImage.createGraphics());
}
this.repaint(); //force an immediate redraw
}
@Override
public void paint(Graphics graphics) {
if (cachedImage != null) {
graphics.drawImage(cachedImage, 0, 0, null);
}
}
//based on the spectrum analyzer from https://www.royvanrijn.com/blog/2010/06/creating-shazam-in-java/
private void drawSpectrum(Graphics2D graphics) {
if (this.spectrum == null) {
return;
}
int windowHeight = this.getSize().height;
for (int timeIndex = 0; timeIndex < spectrum.length; timeIndex++) {
System.out.println(String.format("Drawing time chunk %d/%d", timeIndex + 1, spectrum.length));
FftChunk currentChunk = spectrum[timeIndex];
Complex[] currentChunkFftResults = currentChunk.getFftResults();
int fftIndex = 0;
int yIndex = 1;
/* each chunk contains N elements, where N is the size of the FFT window. The first N/2 elements are positive and the last N/2 elements are negative, but they're otherwise mirrors
* of each other. We only want the positive half.
* Additionally, because we're working with audio samples, our FFT is a "real" FFT (FFT on real numbers -
* https://stackoverflow.com/questions/8887896/why-does-my-kiss-fft-plot-show-duplicate-peaks-mirrored-on-the-y-axis/10744384#10744384 ), which produces a mirror of its own inside
* the positive elements. We need to further divide the positive elements in half. This leaves us with the first N/4 elements after all is said and done.
*/
while (fftIndex < currentChunkFftResults.length / 4) {
Complex currentChunkFftResult = currentChunkFftResults[fftIndex];
// To get the magnitude of the sound at a given frequency slice
// get the abs() from the complex number.
// In this case I use Math.log10 to get a more manageable number (used for color)
double magnitude = Math.log10(currentChunkFftResult.abs() + 1);
// The more blue in the color the more intensity for a given frequency point:
/*int red = 0;
int green = (int) magnitude * 10;
int blue = (int) magnitude * 20;
graphics.setColor(new Color(red, green, blue));*/
float hue = (float)(magnitude / 255 * 100);
int colorValue = Color.HSBtoRGB(hue, 100, 50);
graphics.setColor(new Color(colorValue));
// Fill:
graphics.fillRect(timeIndex * blockSizeX, (windowHeight - yIndex) * blockSizeY, blockSizeX, blockSizeY);
// I used an improvised logarithmic scale and normal scale:
int normalScaleFrequencyDelta = 1;
int logScaleFrequencyDelta = (int)(Math.log10(yIndex) * Math.log10(yIndex));
if (logScaleFrequencyDelta < 1) {
logScaleFrequencyDelta = 1;
}
if (useLogScale) {
fftIndex = fftIndex + logScaleFrequencyDelta;
}
else {
fftIndex = fftIndex + normalScaleFrequencyDelta;
}
yIndex = yIndex + 1;
}
}
}
private int calculatePixelHeight(Complex[] fftResults) {
int fftIndex = 1;
int tempPixelCount = 1;
int pixelCount = 1;
while (fftIndex < fftResults.length / 4) {
pixelCount = tempPixelCount;
int normalScaleFrequencyDelta = 1;
int logScaleFrequencyDelta = (int)(Math.log10(tempPixelCount) * Math.log10(tempPixelCount));
if (logScaleFrequencyDelta < 1) {
logScaleFrequencyDelta = 1;
}
if (useLogScale) {
fftIndex = fftIndex + logScaleFrequencyDelta;
}
else {
fftIndex = fftIndex + normalScaleFrequencyDelta;
}
tempPixelCount = tempPixelCount + 1;
}
return pixelCount;
}
}
//build.gradle
plugins {
//Java application plugin
id 'application'
//Project Lombok plugin
id 'io.freefair.lombok' version '6.5.0.2'
}
repositories {
// Use Maven Central for resolving dependencies.
mavenCentral()
}
dependencies {
implementation 'com.tagtraum:ffsampledsp-complete:0.9.46'
implementation 'org.apache.commons:commons-math3:3.6.1'
}
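One thing worth checking in the code above: readAudioFile decodes to 16-bit signed PCM, but calculateFft hands each individual byte to the FFT as if it were one sample. A minimal sketch of decoding such a buffer into one value per frame before the transform follows. It assumes 16-bit signed PCM with interleaved channels and simply averages the channels to mono; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PcmDecoder {

    /**
     * Converts interleaved 16-bit signed PCM bytes into one mono sample per frame,
     * scaled to the range [-1, 1). Trailing bytes that do not fill a frame are ignored.
     */
    static double[] toMonoSamples(byte[] pcm, int channels, boolean bigEndian) {
        ByteBuffer buf = ByteBuffer.wrap(pcm)
                .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        int bytesPerFrame = 2 * channels; // 2 bytes per 16-bit sample
        int frames = pcm.length / bytesPerFrame;
        double[] samples = new double[frames];
        for (int f = 0; f < frames; f++) {
            double sum = 0;
            for (int c = 0; c < channels; c++) {
                sum += buf.getShort() / 32768.0;
            }
            samples[f] = sum / channels; // average the channels to mono
        }
        return samples;
    }
}
```

With the question's variables this would be called as toMonoSamples(bytes, audioFormat.getChannels(), audioFormat.isBigEndian()), and the chunk size would then count samples rather than bytes. Note also that for real-valued input the unique half of an N-point FFT is bins 0 to N/2, not N/4.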
I am using javax.sound to generate sounds. However, when I play them there is some sort of noise in the background, which even drowns out the sound if I play a few notes at once. Here is the code:
public final static double notes[] = new double[] {130.81, 138.59, 146.83, 155.56, 164.81, 174.61, 185,
196, 207.65, 220, 233.08, 246.94, 261.63, 277.18, 293.66,
311.13, 329.63, 349.23, 369.99, 392, 415.3, 440, 466.16,
493.88, 523.25, 554.37};
public static void playSound(int note, int type) throws LineUnavailableException { //type 0 = sin, type 1 = square
Thread t = new Thread() {
public void run() {
try {
int sound = (int) (notes[note] * 100);
byte[] buf = new byte[1];
AudioFormat af = new AudioFormat((float) sound, 8, 1, true,
false);
SourceDataLine sdl;
sdl = AudioSystem.getSourceDataLine(af);
sdl = AudioSystem.getSourceDataLine(af);
sdl.open(af);
sdl.start();
int maxi = (int) (1000 * (float) sound / 1000);
for (int i = 0; i < maxi; i++) {
double angle = i / ((float) 44100 / 440) * 2.0
* Math.PI;
double val = 0;
if (type == 0) val = Math.sin(angle)*100;
if (type == 1) val = square(angle)*50;
buf[0] = (byte) (val * (maxi - i) / maxi);
sdl.write(buf, 0, 1);
}
sdl.drain();
sdl.stop();
sdl.close();
} catch (LineUnavailableException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
};
};
t.start();
}
public static double square (double angle){
angle = angle % (Math.PI*2);
if (angle > Math.PI) return 1;
else return 0;
}
This code is from here: https://stackoverflow.com/a/1932537/3787777
In this answer I will cover 1) your code, 2) a better approach (IMHO :) and 3) playing two notes at the same time.
Your code
First, the sample rate should not depend on note frequency. Therefore try:
AudioFormat(44100,...
Next, use 16-bit sampling (sounds better!). Here is your code modified to play a simple tone without noise, though I would do it a bit differently (see later). Please look at the comments:
Thread t = new Thread() {
public void run() {
try {
int sound = (440 * 100); // play A
AudioFormat af = new AudioFormat(44100, 16, 1, true, false);
SourceDataLine sdl;
sdl = AudioSystem.getSourceDataLine(af);
sdl.open(af, 4096 * 2);
sdl.start();
int maxi = (int) (1000 * (float) sound / 1000); // number of samples; should not depend on the note's frequency!
byte[] buf = new byte[maxi * 2]; // 2 bytes per 16-bit sample; try to find a better length!
int i = 0;
while (i < maxi * 2) {
// formula is changed to be a simple sine!!
// i counts bytes (2 per sample), so Math.PI * i equals 2 * Math.PI * sampleIndex
double val = Math.sin(Math.PI * i * 440 / 44100);
short s = (short) (Short.MAX_VALUE * val);
buf[i++] = (byte) s;
buf[i++] = (byte) (s >> 8); // little endian
}
sdl.write(buf, 0, buf.length); // write the whole buffer, not only the first maxi bytes
sdl.drain();
sdl.stop();
sdl.close();
} catch (LineUnavailableException e) {
e.printStackTrace();
}
}
};
t.start();
Proposal for better code
Here is a simplified version of your code that plays a note (frequency) without noise. I like it better because we first create an array of doubles, which are universal values. These values can be combined, stored, or further modified. Then we convert them to (8-bit or 16-bit) sample values.
private static byte[] buffer = new byte[4096 * 2 / 3];
private static int bufferSize = 0;
// plays a sample in range (-1, +1).
public static void play(SourceDataLine line, double in) {
if (in < -1.0) in = -1.0; // just sanity checks
if (in > +1.0) in = +1.0;
// convert to bytes - need 2 bytes for 16 bit sample
short s = (short) (Short.MAX_VALUE * in);
buffer[bufferSize++] = (byte) s;
buffer[bufferSize++] = (byte) (s >> 8); // little Endian
// send to line when buffer is full
if (bufferSize >= buffer.length) {
line.write(buffer, 0, buffer.length);
bufferSize = 0;
}
// todo: be sure that whole buffer is sent to line!
}
// prepares array of doubles, not related with the sampling value!
private static double[] tone(double hz, double duration) {
double amplitude = 1.0;
int N = (int) (44100 * duration);
double[] a = new double[N + 1];
for (int i = 0; i <= N; i++) {
a[i] = amplitude * Math.sin(2 * Math.PI * i * hz / 44100);
}
return a;
}
// finally:
public static void main(String[] args) throws LineUnavailableException {
AudioFormat af = new AudioFormat(44100, 16, 1, true, false);
SourceDataLine sdl = AudioSystem.getSourceDataLine(af);
sdl.open(af, 4096 * 2);
sdl.start();
double[] tones = tone(440, 2.0); // play A for 2 seconds
for (double t : tones) {
play(sdl, t);
}
sdl.drain();
sdl.stop();
sdl.close();
}
Sounds nice ;)
Play two notes at the same time
Just combine two notes:
double[] a = tone(440, 1.0); // note A
double[] b = tone(523.25, 1.0); // note C (i hope:)
for (int i = 0; i < a.length; i++) {
a[i] = (a[i] + b[i]) / 2;
}
for (double t : a) {
play(sdl, t);
}
Remember that with a double array you can combine and manipulate your tones, e.g. to compose tones that are played at the same time. Of course, if you add 3 tones, you need to normalize the value by dividing by 3, and so on.
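The "divide by the number of tones" rule can be sketched as a small helper that mixes any number of sample arrays. This is only an illustration: the class and method names are hypothetical, and treating shorter arrays as silence after they end is an assumption made here.

```java
final class ToneMixer {

    /**
     * Averages the given sample arrays element by element.
     * The result is as long as the longest input; shorter inputs
     * contribute silence (0.0) once they run out.
     */
    static double[] mix(double[]... tones) {
        int length = 0;
        for (double[] tone : tones) {
            length = Math.max(length, tone.length);
        }
        double[] out = new double[length];
        if (tones.length == 0) {
            return out;
        }
        for (double[] tone : tones) {
            for (int i = 0; i < tone.length; i++) {
                out[i] += tone[i];
            }
        }
        // Dividing by the number of tones keeps the sum inside [-1, +1].
        for (int i = 0; i < length; i++) {
            out[i] /= tones.length;
        }
        return out;
    }
}
```

With the tone method from the answer, a three-note chord would be mix(tone(440, 1.0), tone(523.25, 1.0), tone(659.25, 1.0)), where the last frequency is just an example value.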
Ding Dong :)
The answer has already been provided, but I want to provide some information that might help understanding the solution.
Why 44100?
44.1 kHz audio is widely used because it is the sampling rate used on CDs. Analog audio is recorded by sampling it 44,100 times per second (1 cycle per second = 1 Hz), and these samples are then used to reconstruct the audio signal on playback. The reason this particular frequency was chosen is rather complex and unimportant for this explanation. That said, the suggestion of using 22000 is not very good: a sampled signal can only represent frequencies up to half the sampling rate, so 22 kHz sampling tops out at 11 kHz, well inside the human hearing range (20 Hz - 20 kHz). You would want a sampling rate higher than 40 kHz for good sound quality. Some high-resolution formats go up to 96 kHz.
Why 16-bit?
The standard used for CDs is 44.1 kHz/16-bit. High-resolution audio is often distributed as 96 kHz/24-bit. The sample rate refers to how many X-bit samples are recorded every second. CD-quality sampling uses 44,100 16-bit samples per second to reproduce sound.
Why is this explanation important?
The thing to remember is that you are trying to produce digital sound (not analog). This means that these bits and bytes have to be processed by an audio CODEC. In hardware, an audio CODEC is a device that encodes analog audio as digital signals and decodes digital back into analog. For audio outputs, the digitized sound must go through a Digital-to-Analog Converter (DAC) in order for proper sound to come out of the speakers. Two of the most important characteristics of a DAC are its bandwidth and its signal-to-noise ratio and the actual bandwidth of a DAC is characterized primarily by its sampling rate.
Basically, you can't use an arbitrary sampling rate because the audio will not be reproduced well by your audio device for the reasons stated above. When in doubt, check your computer hardware and find out what your CODEC supports.
For some reason the frequencies are displaced:
391 hz => 1162
440 hz => 2196
493 hz => 2454
I am using these values:
final int audioFrames= 1024;
final float sampleRate= 44100.0f;
final int bitsPerRecord= 16;
final int channels= 1;
final boolean bigEndian = true;
final boolean signed= true;
byteData= new byte[audioFrames * 2]; //two bytes per audio frame, 16 bits
dData= new double[audioFrames * 2]; // real & imaginary
This is how I read the data and transform it to doubles:
format = new AudioFormat(sampleRate, bitsPerRecord, channels, signed, bigEndian);
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
microphone = (TargetDataLine) AudioSystem.getLine(info);
microphone.open(format);
microphone.start();
int numBytesRead = microphone.read(byteData, 0, byteData.length);
Once the data is read, I convert it from 16-bit, big-endian, signed values to double:
public void byteToDouble(){
ByteBuffer buf= ByteBuffer.wrap(byteData);
buf.order(ByteOrder.BIG_ENDIAN);
int i=0;
while(buf.remaining()>1){
short s = buf.getShort();
dData[ 2 * i ] = (double) s / 32768.0; //real
dData[ 2 * i + 1] = 0.0; // imag
++i;
}
}
And at last, run the FFT and find the frequency:
public void findFrequency(){
double frequency;
DoubleFFT_1D fft= new DoubleFFT_1D(audioFrames);
/* edu/emory/mathcs/jtransforms/fft/DoubleFFT_1D.java */
fft.complexForward(dData); // do the magic so we can find peak
for(int i = 0; i < audioFrames; i++){
re[i] = dData[i*2];
im[i] = dData[(i*2)+1];
mag[i] = Math.sqrt((re[i] * re[i]) + (im[i]*im[i]));
}
double peak = -1.0;
int peakIn=-1;
for(int i = 0; i < audioFrames; i++){
if(peak < mag[i]){
peakIn=i;
peak= mag[i];
}
}
frequency = (sampleRate * (double)peakIn) / (double)audioFrames;
System.out.print("Peak: "+peakIn+", Frequency: "+frequency+"\n");
}
You can interpolate between FFT result bins (parabolic or Sinc interpolation) to get a more accurate estimate of frequency. But you may have a bigger problem: your frequency source may be producing (or be being clipped to produce) some very strong odd harmonics or overtones that mask any fundamental sinusoid in the FFT result magnitudes. Thus you should try using a pitch detection/estimation algorithm instead of just trying to look for a (possibly missing) FFT peak.
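As a rough sketch of the parabolic variant mentioned above: fit a parabola through the peak bin and its two neighbours, and take the vertex as the refined peak position. The class and method names are hypothetical, and this is applied to linear magnitudes here as an assumption; it is often applied to log magnitudes instead.

```java
final class PeakInterpolation {

    /**
     * Refines an integer peak bin k to a fractional bin by fitting a parabola
     * through mag[k - 1], mag[k] and mag[k + 1] and returning its vertex.
     * Falls back to k itself at the array edges or for a flat neighbourhood.
     */
    static double refinePeakBin(double[] mag, int k) {
        if (k <= 0 || k >= mag.length - 1) {
            return k;
        }
        double a = mag[k - 1];
        double b = mag[k];
        double c = mag[k + 1];
        double denominator = a - 2 * b + c;
        if (denominator == 0) {
            return k;
        }
        return k + 0.5 * (a - c) / denominator;
    }

    /** Converts a (possibly fractional) bin index to Hz for an fftSize-point transform. */
    static double binToFrequency(double bin, double sampleRate, int fftSize) {
        return bin * sampleRate / fftSize;
    }
}
```

In the question's findFrequency this would replace peakIn with refinePeakBin(mag, peakIn) before converting to Hz. With 1024 frames at 44100 Hz the bins are about 43 Hz apart, so the refinement matters for telling nearby notes apart.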
Firstly, if the audio you're recording is long, you'll need to do the FFT in chunks, preferably windowing each chunk before performing the FFT. Each FFT gives you one spectrum for its whole chunk, so if the frequency changes over time you need to take FFTs at many places.
Accuracy can also be improved from sliding windows. This means that you would take a chunk, then slide over slightly and take another chunk, so that the chunks overlap. How much you slide over is variable, and the size of each chunk is also variable.
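A minimal sketch of the chunking described above, cutting a signal into overlapping frames and applying a window to each. The Hann window is one common choice and is an assumption here, as are the class and method names.

```java
final class OverlappingFrames {

    /** Hann window of length n (n >= 2); the first and last coefficients are zero. */
    static double[] hann(int n) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = 0.5 * (1 - Math.cos(2 * Math.PI * i / (n - 1)));
        }
        return w;
    }

    /**
     * Splits the signal into frames of frameSize samples, starting a new frame
     * every hop samples, and multiplies each frame by a Hann window.
     * A hop smaller than frameSize makes consecutive frames overlap.
     */
    static double[][] frames(double[] signal, int frameSize, int hop) {
        if (frameSize < 2 || hop < 1) {
            throw new IllegalArgumentException("frameSize must be >= 2 and hop >= 1");
        }
        if (signal.length < frameSize) {
            return new double[0][];
        }
        int count = 1 + (signal.length - frameSize) / hop;
        double[] window = hann(frameSize);
        double[][] out = new double[count][frameSize];
        for (int f = 0; f < count; f++) {
            int start = f * hop;
            for (int i = 0; i < frameSize; i++) {
                out[f][i] = signal[start + i] * window[i];
            }
        }
        return out;
    }
}
```

Each returned frame would then be passed to the FFT on its own. A hop of half the frame size (50% overlap) is a common starting point, but both values are tunable.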
Then, the FFT alone might produce false results. You can do further analysis, such as Cepstrum analysis or Harmonic Product Spectrum analysis, on the power spectrum produced by the FFT to estimate the pitch more accurately.
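To give an idea of what a Harmonic Product Spectrum does, here is a simplified sketch: for each candidate bin k it multiplies the magnitudes at k, 2k, 3k, and so on, so that a bin whose harmonics are all present scores highest even when one harmonic is stronger than the fundamental. The names are hypothetical, and practical implementations usually add windowing and interpolation on top of this.

```java
final class HarmonicProductSpectrum {

    /**
     * Returns the bin whose product of magnitudes over the first `harmonics`
     * multiples (k, 2k, ..., harmonics * k) is largest, skipping the DC bin.
     * Returns -1 if the spectrum is too short for the requested harmonics.
     */
    static int fundamentalBin(double[] mag, int harmonics) {
        if (harmonics < 1) {
            throw new IllegalArgumentException("harmonics must be >= 1");
        }
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int k = 1; k * harmonics < mag.length; k++) {
            double score = 1.0;
            for (int h = 1; h <= harmonics; h++) {
                score *= mag[k * h];
            }
            if (score > bestScore) {
                bestScore = score;
                best = k;
            }
        }
        return best;
    }
}
```

The input is expected to be the magnitudes of the non-mirrored half of the spectrum; the returned bin converts to Hz in the usual way (bin * sampleRate / fftSize).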
I'm having some trouble with generating a sound with a specific frequency. I've set up my app so you can slide back and forth on a seekbar to select a specific frequency, which the app should then use to generate a tone.
I'm currently getting a tone just fine, but it's a completely different frequency from the one I set. (I know the problem is not in passing the value from the seekbar to the tone-generating process, so it must be the way I generate the tone.)
What's wrong with this code?
Thanks
private final int duration = 3; // seconds
private final int sampleRate = 8000;
private final int numSamples = duration * sampleRate;
private final double sample[] = new double[numSamples];
double dbFreq = 0; // I assign the frequency to this double
private final byte generatedSnd[] = new byte[2 * numSamples];
...
void genTone(double dbFreq){
// fill out the array
for (int i = 0; i < numSamples; ++i) {
sample[i] = Math.sin(2 * Math.PI * i / (sampleRate/dbFreq));
}
// convert to 16 bit pcm sound array
// assumes the sample buffer is normalised.
int idx = 0;
for (final double dVal : sample) {
// scale to maximum amplitude
final short val = (short) ((dVal * 32767));
// in 16 bit wav PCM, first byte is the low order byte
generatedSnd[idx++] = (byte) (val & 0x00ff);
generatedSnd[idx++] = (byte) ((val & 0xff00) >>> 8);
}
}
void playSound(){
final AudioTrack audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC,
sampleRate, AudioFormat.CHANNEL_CONFIGURATION_MONO,
AudioFormat.ENCODING_PCM_16BIT, numSamples,
AudioTrack.MODE_STATIC);
audioTrack.write(generatedSnd, 0, generatedSnd.length);
audioTrack.play();
}
Your code is actually correct but have a look at the Sampling theorem.
In short: you must set the sampling rate higher than 2*max_frequency. So set sampleRate = 44000 and even higher frequencies should sound correct.
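To make the sampling theorem concrete, the sketch below computes which frequency you would actually hear when a tone above half the sample rate is generated. The class and method names are hypothetical, and it assumes an ideal sampler with non-negative input frequencies.

```java
final class AliasDemo {

    /**
     * Frequency (Hz) that a sampled sine of frequency hz appears at when
     * sampled at sampleRate. Frequencies up to sampleRate / 2 are unchanged;
     * anything above folds back into the range 0 .. sampleRate / 2.
     */
    static double aliasedFrequency(double hz, double sampleRate) {
        double folded = hz % sampleRate; // position within one sample-rate period
        return folded > sampleRate / 2 ? sampleRate - folded : folded;
    }

    public static void main(String[] args) {
        double sampleRate = 8000; // the rate used in the question
        for (double hz : new double[] {3000, 5000, 9000}) {
            System.out.println(hz + " Hz is heard as " + aliasedFrequency(hz, sampleRate) + " Hz");
        }
    }
}
```

At the question's 8000 Hz sample rate, a requested 5000 Hz tone comes out at 3000 Hz, which matches the symptom of hearing a different frequency from the one selected once the slider goes past 4000 Hz.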
I have calculated the mean and SD of a set of values. Now I need to draw a bell curve from those values to show the normal distribution in Java Swing. How do I proceed?
List : 204 297 348 528 681 684 785 957 1044 1140 1378 1545 1818
Total count : 13
Average value (Mean): 877.615384615385
Standard deviation (SD) : 477.272626245539
If I can get the x and y coordinates I can do it, but how do I get those values?
First you need to calculate the variance for the set. The variance is computed as the average squared deviation of each number from its mean.
double variance(double[] population) {
long n = 0;
double mean = 0;
double s = 0.0;
// Welford's online algorithm: update the running mean and the sum of squared deviations
for (double x : population) {
n++;
double delta = x - mean;
mean += delta / n;
s += delta * (x - mean);
}
// population variance; take Math.sqrt of this to get the standard deviation
return (s / n);
}
Once you have that you can choose x depending on your graph resolution compared to your value set spread and plug it in to the following equation to get y.
protected double stdDeviation, variance, mean;
public double getY(double x) {
// Unnormalized Gaussian: 1.0 at x == mean. Multiply by 1 / (stdDeviation * Math.sqrt(2 * Math.PI)) for the probability density.
return Math.exp(-(((x - mean) * (x - mean)) / ((2 * variance))));
}
To display the resulting set: say we take the population set you laid out and decide to show x = 0 to x = 2000 on a graph with an x resolution of 1000 pixels. Then you would run a loop (int x = 0; x <= 2000; x += 2) and feed those values into the equation above to get the y value for each pair. Since the y you want to show is in the range 0-1, you map these values to whatever your y resolution is, with appropriate rounding so the graph doesn't end up too jagged. So if your y resolution is 500 pixels, you map 0 to 0, 1 to 500, 0.5 to 250, and so on. This is a contrived example and you might need a lot more flexibility, but I think it illustrates the point. Most graphing libraries will handle these details for you.
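The mapping described above can be sketched as follows: evaluate the normal density across the x range, scale it by its peak so that y runs from 0 to 1, and convert to pixel coordinates. The class and method names are hypothetical, and flipping y (because Swing's y axis grows downwards) is a choice made for this sketch.

```java
final class BellCurvePoints {

    /** Normal probability density with the given mean and standard deviation. */
    static double gaussian(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math.PI));
    }

    /**
     * Returns one {xPixel, yPixel} pair per pixel column for the curve over
     * [xMin, xMax]. The curve's peak maps to yPixel 0 (top of the drawing area)
     * and a density of zero maps to height - 1 (bottom).
     */
    static int[][] points(double mean, double sd, double xMin, double xMax, int width, int height) {
        if (width < 2 || height < 1) {
            throw new IllegalArgumentException("width must be >= 2 and height >= 1");
        }
        double peak = gaussian(mean, mean, sd);
        int[][] pts = new int[width][2];
        for (int px = 0; px < width; px++) {
            double x = xMin + (xMax - xMin) * px / (width - 1);
            double y = gaussian(x, mean, sd) / peak; // scaled to 0..1
            pts[px][0] = px;
            pts[px][1] = (int) Math.round((1 - y) * (height - 1));
        }
        return pts;
    }
}
```

For the values in the question this would be points(877.615384615385, 477.272626245539, 0, 2000, 1000, 500). In a Swing component, consecutive pairs can then be joined with Graphics.drawLine inside paintComponent.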
Here's an example of plotting some Gaussian curves using XChart. The code can be found here. Disclaimer: I'm the creator of the XChart Java charting library.
public class ThemeChart03 implements ExampleChart {
public static void main(String[] args) {
ExampleChart exampleChart = new ThemeChart03();
Chart chart = exampleChart.getChart();
new SwingWrapper(chart).displayChart();
}
@Override
public Chart getChart() {
// Create Chart
Chart_XY chart = new ChartBuilder_XY().width(800).height(600).theme(ChartTheme.Matlab).title("Matlab Theme").xAxisTitle("X").yAxisTitle("Y").build();
// Customize Chart
chart.getStyler().setPlotGridLinesVisible(false);
chart.getStyler().setXAxisTickMarkSpacingHint(100);
// Series
List<Integer> xData = new ArrayList<Integer>();
for (int i = 0; i < 640; i++) {
xData.add(i);
}
List<Double> y1Data = getYAxis(xData, 320, 60);
List<Double> y2Data = getYAxis(xData, 320, 100);
List<Double> y3Data = new ArrayList<Double>(xData.size());
for (int i = 0; i < 640; i++) {
y3Data.add(y1Data.get(i) - y2Data.get(i));
}
chart.addSeries("Gaussian 1", xData, y1Data);
chart.addSeries("Gaussian 2", xData, y2Data);
chart.addSeries("Difference", xData, y3Data);
return chart;
}
private List<Double> getYAxis(List<Integer> xData, double mean, double std) {
List<Double> yData = new ArrayList<Double>(xData.size());
for (int i = 0; i < xData.size(); i++) {
yData.add((1 / (std * Math.sqrt(2 * Math.PI))) * Math.exp(-(((xData.get(i) - mean) * (xData.get(i) - mean)) / ((2 * std * std)))));
}
return yData;
}
}
The resulting plot looks like this: