I'm having some trouble with generating a sound with a specific frequency. I've set up my app so you can slide back and forth on a seekbar to select a specific frequency, which the app should then use to generate a tone.
I'm currently getting a tone just fine, but it's a completely different frequency from the one I set. (I know the problem is not in passing the value from the seekbar to the tone-generating code, so it must be the way I generate the tone.)
What's wrong with this code?
Thanks
private final int duration = 3; // seconds
private final int sampleRate = 8000;
private final int numSamples = duration * sampleRate;
private final double sample[] = new double[numSamples];
double dbFreq = 0; // I assign the frequency to this double
private final byte generatedSnd[] = new byte[2 * numSamples];
...
void genTone(double dbFreq){
// fill out the array
for (int i = 0; i < numSamples; ++i) {
sample[i] = Math.sin(2 * Math.PI * i / (sampleRate/dbFreq));
}
// convert to 16 bit pcm sound array
// assumes the sample buffer is normalised.
int idx = 0;
for (final double dVal : sample) {
// scale to maximum amplitude
final short val = (short) ((dVal * 32767));
// in 16 bit wav PCM, first byte is the low order byte
generatedSnd[idx++] = (byte) (val & 0x00ff);
generatedSnd[idx++] = (byte) ((val & 0xff00) >>> 8);
}
}
void playSound(){
final AudioTrack audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC,
sampleRate, AudioFormat.CHANNEL_CONFIGURATION_MONO,
AudioFormat.ENCODING_PCM_16BIT, numSamples,
AudioTrack.MODE_STATIC);
audioTrack.write(generatedSnd, 0, generatedSnd.length);
audioTrack.play();
}
Your code is actually correct, but have a look at the sampling theorem.
In short: the sampling rate must be higher than 2*max_frequency. With sampleRate = 8000, only tones below 4000 Hz come out at the right pitch. So set sampleRate to something like 44100 and you should hear even higher frequencies correctly.
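To make the sampling theorem concrete, here is a minimal sketch that computes which frequency you would actually hear when a pure tone is sampled at a given rate. The class and method names are made up for illustration, and the method assumes a non-negative tone frequency. Tones above half the sample rate fold back to a lower frequency.

```java
final class AliasingDemo {
    /** Frequency (Hz) heard when a pure tone of toneHz (>= 0) is sampled at sampleRateHz. */
    static double aliasedFrequency(double toneHz, double sampleRateHz) {
        // Position of the tone within one sampling band [0, sampleRate).
        double folded = toneHz % sampleRateHz;
        // Anything above half the sample rate is mirrored back below it.
        if (folded > sampleRateHz / 2.0) {
            folded = sampleRateHz - folded;
        }
        return folded;
    }

    public static void main(String[] args) {
        double[] tones = {1000, 3000, 5000, 7000};
        for (double f : tones) {
            System.out.println(f + " Hz sampled at 8000 Hz is heard as "
                    + aliasedFrequency(f, 8000) + " Hz");
        }
    }
}
```

With the question's 8000 Hz sample rate, any seekbar value above 4000 Hz ends up in the mirrored range, while the same value at 44100 Hz is left unchanged.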
Related
I have no experience with Java, but I want to demonstrate sound reduction (cancellation) to high school students using a single-frequency sound in class.
I found sound generation and frequency detection code online. I can now detect the frequency, but I cannot generate a phase-shifted sound to cancel the original.
Could you advise me on how to generate a phase-shifted sound to reduce the noise?
//Single frequency sound generator
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
public class MakeSound {
public static void main(String[] args) throws LineUnavailableException {
System.out.println("Generate Noise!");
byte[] buf = new byte[2];
int samplingsize = 44100;
AudioFormat af = new AudioFormat((float) samplingsize, 16, 1, true, false);
SourceDataLine sdl = AudioSystem.getSourceDataLine(af);
sdl.open();
sdl.start();
int duration = 500000; // noise generating duration [ms]
int noise_frequency = 315; // noise frequency
System.out.println("Noise Frequency:"+noise_frequency+"Hz");
for (int i = 0; i < duration*(float) 44100/1000; i++) {
float numberOfSamplesToRepresentFullSin = (float) samplingsize / noise_frequency;
double angle = i / (numberOfSamplesToRepresentFullSin/ 2.0) * Math.PI;
short a = (short) (Math.sin(angle) * 32767); //32767 - max value for sample to take (-32767 to 32767)
buf[0] = (byte) (a & 0xFF);
buf[1] = (byte) (a >> 8);
sdl.write(buf, 0, 2);
}
sdl.drain();
sdl.stop();
}
}
//Frequency detection & phase shifting sound generator
package fft_1;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.TargetDataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import org.apache.commons.math3.complex.Complex;
import org.apache.commons.math3.transform.DftNormalization;
import org.apache.commons.math3.transform.FastFourierTransformer;
import org.apache.commons.math3.transform.TransformType;
@SuppressWarnings("unused")
public class AudioInput {
TargetDataLine microphone;
final int audioFrames= 8192; // must be a power of 2
final float sampleRate= 8000.0f;
final int bitsPerRecord= 16;
final int channels= 1;
final boolean bigEndian = true;
final boolean signed= true;
byte byteData[]; // length=audioFrames * 2
double doubleData[]; // length=audioFrames only reals needed for apache lib.
AudioFormat format;
FastFourierTransformer transformer;
public AudioInput () {
byteData= new byte[audioFrames * 2]; //two bytes per audio frame, 16 bits
doubleData= new double[audioFrames * 2]; // real & imaginary
doubleData= new double[audioFrames]; // only real for apache
transformer = new FastFourierTransformer(DftNormalization.STANDARD);
System.out.print("Microphone initialization\n");
format = new AudioFormat(sampleRate, bitsPerRecord, channels, signed, bigEndian);
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format); // format is an AudioFormat object
if (!AudioSystem.isLineSupported(info)) {
System.err.print("isLineSupported failed");
System.exit(1);
}
try {
microphone = (TargetDataLine) AudioSystem.getLine(info);
microphone.open(format);
System.out.print("Microphone opened with format: "+format.toString()+"\n");
microphone.start();
}
catch(Exception ex){
System.out.println("Microphone failed: "+ex.getMessage());
System.exit(1);
}
}
public int readPcm(){
int numBytesRead=
microphone.read(byteData, 0, byteData.length);
if(numBytesRead!=byteData.length){
System.out.println("Warning: read less bytes than buffer size");
System.exit(1);
}
return numBytesRead;
}
@SuppressWarnings({ })
public void byteToDouble(){
ByteBuffer buf= ByteBuffer.wrap(byteData);
buf.order(ByteOrder.BIG_ENDIAN);
int i=0;
while(buf.remaining()>=2){ // >= 2 so the last sample is not dropped
short s = buf.getShort();
doubleData[ i ] = (new Short(s)).doubleValue();
++i;
}
System.out.println("Parsed "+i+" doubles from "+byteData.length+" bytes");
}
public void findFrequency() throws LineUnavailableException{
float frequency;
Complex[] cmplx= transformer.transform(doubleData, TransformType.FORWARD);
double real = 0;
double im = 0;
double mag[] = new double[cmplx.length];
byte[] buf = new byte[2];
int samplingsize = 44100;
AudioFormat af = new AudioFormat((float) samplingsize, 16, 1, true, false);
SourceDataLine sdl = AudioSystem.getSourceDataLine(af);
sdl.open();
sdl.start();
for(int i = 0; i < cmplx.length; i++){
real = cmplx[i].getReal();
im = cmplx[i].getImaginary();
mag[i] = Math.sqrt((real * real) + (im*im));
}
double peak = -1.0;
int index=-1;
for(int i = 0; i < cmplx.length; i++){
if(peak < mag[i]){
index=i;
peak= mag[i];
}
}
frequency = (sampleRate * index) / audioFrames;
System.out.print("Index: "+index+", Frequency: "+frequency+"\n");
int duration = 3000; // duration millisecond
int beatpersec = (int) Math.round(frequency);
for (int i = 0; i < frequency/2 ; i++) {
System.out.println("i"+i);
}
for (int i = 0; i < duration*(float) 44100/1000; i++) {
float numberOfSamplesToRepresentFullSin = (float) samplingsize / beatpersec;
double angle = i / (numberOfSamplesToRepresentFullSin/ 2.0) * Math.PI;
short a = (short) (Math.sin(angle) * 32767); //32767 - max value for sample to take (-32767 to 32767)
buf[0] = (byte) (a & 0xFF);
buf[1] = (byte) (a >> 8);
sdl.write(buf, 0, 2);
}
sdl.drain();
sdl.stop();
}
public void printFreqs(){
for (int i=0; i<audioFrames/4; i++){
//System.out.println("bin "+i+", freq: "+(sampleRate*i)/audioFrames);
System.out.println("End");
}
}
public static void main(String[] args) throws LineUnavailableException {
AudioInput ai= new AudioInput();
int turns=1;
while(turns-- > 0){
ai.readPcm();
ai.byteToDouble();
ai.findFrequency();
}
ai.printFreqs();
}
}
I'm just looking at the MakeSound class. I'm assuming if we had a controlled way of altering its phase, that would be sufficient for your needs.
First off, include a slider control in the project. Its output should go from 0 to one full period, depending on your "angle" units. If it's degrees, it could be 0 to 359.
Put the sdl.write method in its own thread, inside a while loop and keep it running continuously.
Make a class or function that provides the "next" block of sine data on demand. If you need an array size for the write, something like 4K might be a good starting guess. In my experience, anything from 1k to 8k works fine. The while loop holding the sdl calls this function once per each write operation.
Now, the angle value in your data-on-demand function needs to be determined by adding two parts: (1) the part that cycles on pitch continuously (similar to what you are already doing), (2) a "phase" variable that holds an angle value that can range from 0 to one full period.
Have the slider tied to the "phase" variable. Probably some form of loose coupling would be good, to prevent the changes to the "phase" variable from blocking the sine-wave calculation.
A couple of cautions, though. For one, as you move the slider, you will likely create some clicks unless you build in a function to spread out the changes in the "phase" value over, say, 128 PCM values. Secondly, the volumes have to match for true cancellation, so a volume slider as well as the phase slider might be needed. The "volume" slider can range from 0 to 1, creating a factor that you multiply against the PCM values that you are holding in the short array.
The main thing: since there is a single starting point for this continuous signal (thanks to running the sdl continuously in the while loop), there should be some point on the slider that best corresponds to the cancellation. It will be a different point on the slider each time, of course.
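The steps above can be sketched as a small block generator. This is only an illustration under assumptions: the class name, method names, and the 128-frame ramp are hypothetical choices, and the output format (16-bit signed little-endian mono) is taken from the MakeSound code in the question.

```java
final class PhasedSineSource {
    private final double sampleRate;
    private final double frequency;
    private double runningAngle = 0.0;          // part (1): advances with the pitch
    private double currentPhase = 0.0;          // phase actually applied, in radians
    private volatile double targetPhase = 0.0;  // part (2): written by the slider thread
    private volatile double volume = 1.0;       // 0..1, written by the volume slider

    PhasedSineSource(double sampleRate, double frequency) {
        this.sampleRate = sampleRate;
        this.frequency = frequency;
    }

    /** Called from the UI thread; volatile keeps the audio thread from blocking. */
    void setPhase(double radians) { targetPhase = radians; }

    void setVolume(double v) { volume = Math.max(0.0, Math.min(1.0, v)); }

    /** Returns the next block of 16-bit little-endian mono PCM (2 bytes per frame). */
    byte[] nextBlock(int frames) {
        byte[] out = new byte[frames * 2];
        double step = 2.0 * Math.PI * frequency / sampleRate;
        double goal = targetPhase;   // read the shared values once per block
        double vol = volume;
        // Spread a phase change over up to 128 frames to avoid clicks.
        int rampFrames = Math.max(1, Math.min(128, frames));
        double phaseStep = (goal - currentPhase) / rampFrames;
        for (int i = 0; i < frames; i++) {
            if (i < rampFrames) {
                currentPhase += phaseStep;
            }
            short s = (short) (Math.sin(runningAngle + currentPhase) * vol * Short.MAX_VALUE);
            out[2 * i] = (byte) s;            // low byte first (little endian)
            out[2 * i + 1] = (byte) (s >> 8);
            runningAngle += step;
            if (runningAngle >= 2.0 * Math.PI) {
                runningAngle -= 2.0 * Math.PI;  // keep the angle small
            }
        }
        currentPhase = goal;  // discard accumulated rounding error from the ramp
        return out;
    }
}
```

The playback thread would then loop on something like `byte[] block = source.nextBlock(2048); sdl.write(block, 0, block.length);` while the two sliders call `setPhase` and `setVolume`.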
I have the following code:
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
public class Test {
private static final int MAX_LENGTH = 1000;
private Random r = new Random();
protected static final int SAMPLE_RATE = 32 * 1024;
public static byte[] createSinWaveBuffer(double freq, int ms) {
int samples = ((ms * SAMPLE_RATE) / 1000);
byte[] output = new byte[samples];
double period = (double) SAMPLE_RATE / freq;
for (int i = 0; i < output.length; i++) {
double angle = 2.0 * Math.PI * i / period;
output[i] = (byte) (Math.sin(angle) * 0x7f);
}
return output;
}
public static void main(String[] args) throws LineUnavailableException {
List<Double> freqs = new Test().generate();
System.out.println(freqs);
final AudioFormat af = new AudioFormat(SAMPLE_RATE, 8, 1, true, true);
SourceDataLine line = AudioSystem.getSourceDataLine(af);
line.open(af, SAMPLE_RATE);
line.start();
freqs.forEach(a -> {
byte[] toneBuffer = createSinWaveBuffer(a, 75);
line.write(toneBuffer, 0, toneBuffer.length);
});
line.drain();
line.close();
}
private List<Double> generate() {
List<Double> frequencies = new ArrayList<>();
double[] values = new double[] { 4.0/3,1.5,1,2 };
double current = 440.00;
frequencies.add(current);
while (frequencies.size() < MAX_LENGTH) {
//Generate a frequency in Hz based on harmonics and a bit math.
boolean goUp = Math.random() > 0.5;
if (current < 300)
goUp = true;
else if (current > 1000)
goUp = false;
if (goUp) {
current *= values[Math.abs(r.nextInt(values.length))];
} else {
current *= Math.pow(values[Math.abs(r.nextInt(values.length))], -1);
}
frequencies.add(current);
}
return frequencies;
}
}
I want to generate a random "melody", beginning from A(hz=440). I do this using random numbers to determine, whether the tone goes up or down.
My problem:
I can generate the melody, but if I play it, there is always a "knocking" sound between each tone. What could I do to remove it, so it sounds better?
At the beginning and ending of each tone, the signal going to your SourceDataLine jumps from volume 0 to the full out sine wave instantaneously. Or, it jumps from some arbitrary value in one sine wave to the beginning value in the next. Large jumps can create many overtones which are often heard as clicks.
To remedy this, in your method createSinWaveBuffer, it would be helpful to smooth out the start and end of the buffer by multiplying the values by a factor that ranges from 0 to 1 for the start of the tone, and 1 to 0 for the end of the tone. The number of frames over which you do this depends mostly on aesthetics and the sample rate. I think 1 millisecond transitions might work as a ballpark minimum. A commercial digital synth that I have uses that as the smallest value. For 44100 fps, that comes to dividing the transition into 44 steps, e.g., 0/44, 1/44, 2/44, etc. that you multiply against the data values at the start of the buffer, and the reverse that you multiply against the end of the buffer.
I'd be tempted to prefer 64 or 128 steps. 128 steps at 44100 comes to a note onset that only takes about 0.003 seconds, and it should make the transition smooth enough to eliminate the "discontinuity" in the signal. Of course you can choose longer transitions if it sounds more pleasing.
If you do this (if the transition is long enough) there shouldn't be any need to apply low-pass filtering.
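A minimal sketch of such a ramp, assuming the 8-bit signed mono buffers produced by createSinWaveBuffer in the question. The class and method names are hypothetical, and the linear ramp shape is just the simplest option.

```java
final class FadeEnvelope {
    /**
     * Scales the first and last rampFrames samples of an 8-bit signed mono
     * buffer linearly between 0 and full volume, in place.
     */
    static void applyFade(byte[] samples, int rampFrames) {
        // The two ramps must not overlap on very short buffers.
        int n = Math.min(rampFrames, samples.length / 2);
        for (int i = 0; i < n; i++) {
            double gain = (double) i / n;   // 0/n, 1/n, 2/n, ...
            // Fade in at the start of the buffer.
            samples[i] = (byte) Math.round(samples[i] * gain);
            // Fade out, mirrored, at the end of the buffer.
            int j = samples.length - 1 - i;
            samples[j] = (byte) Math.round(samples[j] * gain);
        }
    }
}
```

In the question's code this would be called as `FadeEnvelope.applyFade(toneBuffer, 128);` right before `line.write(...)`. At the question's sample rate of 32 * 1024, 128 frames is roughly 4 ms.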
I get a result that I don't understand when I apply the JTransforms FFT.
The output frequency I get is different from what I expect.
Currently I am trying to use JTransforms. From this library, I used realForward(double[] a).
To test the application, I used the following parameters:
input frequency = 50 Hz
sample rate = 1 kHz
signal length = 1024
Below is a code snippet of the test method I wrote:
private static void test() {
//double[] signal = {980, 988, 1160, 1080, 928, 1068, 1156, 1152, 1176, 1264};
int signalLength = 1024;
double[] signal = new double[signalLength];
double sampleRate = 1000;
// Generate sine signal: f = 50 Hz, sample period = 0.001 s
for (int i = 0; i < signal.length; i++) {
signal[i] = Math.sin(2 * Math.PI * i * 50.0 / sampleRate);
}
// Copy signal for columbiaFFT
double signal2[] = signal.clone();
// Calculate FFT using Jtransforms
DoubleFFT_1D fft_1D = new DoubleFFT_1D(signal.length);
fft_1D.realForward(signal);
double[] magResult = new double[signal.length / 2];
double re, im;
magResult[0] = signal[0];
for (int i = 1; i < magResult.length - 1; i++) {
re = signal[i * 2];
im = signal[i * 2 + 1];
magResult[i] = Math.sqrt(re * re + im * im);
}
// converting bin to frequency values
double[] bin2freq = new double[magResult.length];
// sampleRate is in Hz
for (int i = 0; i < bin2freq.length; i++) {
bin2freq[i] = i * sampleRate / magResult.length;
//bin2freq[i] = i * sampleRate / * signal.length;
}
System.out.println("freq 1 " + bin2freq[1]);
// Calculate FFT using columbiaFFT
FFTColumbia fftColumbia = new FFTColumbia(signalLength);
double[] imaginary = new double[signal2.length];
fftColumbia.fft(signal2, imaginary);
double[] magColumbia = new double[signal2.length];
for (int i = 0; i < magColumbia.length; i++) {
magColumbia[i] = Math.sqrt(Math.pow(signal2[i], 2) + Math.pow(imaginary[i], 2));
}
}
When I plot the magnitude of the signal, apart from seeing noise and negative amplitude values (which I think could come from not applying a window), I get an unexpected frequency plot from the JTransforms FFT (image here).
I would also like to ask whether the FFT Columbia algorithm outputs frequency and amplitude directly, or whether it also outputs bins that I would have to convert to frequencies.
See Plotting FFT Columbia vs Jtransform
Blue signal is FFT Columbia output
Red signal is FFT Jtransform output
If that's the case I might have generated the signal wrong.
I am using javax.sound to make sounds. However, when I play them there is some sort of noise in the background, which even drowns out the sound if I play a few notes at once. Here is the code:
public final static double notes[] = new double[] {130.81, 138.59, 146.83, 155.56, 164.81, 174.61, 185,
196, 207.65, 220, 233.08, 246.94, 261.63, 277.18, 293.66,
311.13, 329.63, 349.23, 369.99, 392, 415.3, 440, 466.16,
493.88, 523.25, 554.37};
public static void playSound(int note, int type) throws LineUnavailableException { //type 0 = sin, type 1 = square
Thread t = new Thread() {
public void run() {
try {
int sound = (int) (notes[note] * 100);
byte[] buf = new byte[1];
AudioFormat af = new AudioFormat((float) sound, 8, 1, true,
false);
SourceDataLine sdl;
sdl = AudioSystem.getSourceDataLine(af);
sdl = AudioSystem.getSourceDataLine(af);
sdl.open(af);
sdl.start();
int maxi = (int) (1000 * (float) sound / 1000);
for (int i = 0; i < maxi; i++) {
double angle = i / ((float) 44100 / 440) * 2.0
* Math.PI;
double val = 0;
if (type == 0) val = Math.sin(angle)*100;
if (type == 1) val = square(angle)*50;
buf[0] = (byte) (val * (maxi - i) / maxi);
sdl.write(buf, 0, 1);
}
sdl.drain();
sdl.stop();
sdl.close();
} catch (LineUnavailableException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
};
};
t.start();
}
public static double square (double angle){
angle = angle % (Math.PI*2);
if (angle > Math.PI) return 1;
else return 0;
}
This code is from here: https://stackoverflow.com/a/1932537/3787777
In this answer I will cover 1) your code, 2) a better approach (IMHO :) and 3) playing two notes at the same time.
Your code
First, the sample rate should not depend on note frequency. Therefore try:
AudioFormat(44100,...
Next, use 16-bit sampling (sounds better!). Here is your code changed to play a simple tone without noise, although I would structure it a bit differently (see later). Please look at the comments:
Thread t = new Thread() {
public void run() {
try {
int sound = (440 * 100); // play A
AudioFormat af = new AudioFormat(44100, 16, 1, true, false);
SourceDataLine sdl;
sdl = AudioSystem.getSourceDataLine(af);
sdl.open(af, 4096 * 2);
sdl.start();
int maxi = (int) (1000 * (float) sound / 1000); // should not depend on notes frequency!
byte[] buf = new byte[maxi * 2]; // try to find better len!
int i = 0;
while (i < maxi * 2) {
// formula is changed to be simple sine!!
double val = Math.sin(Math.PI * i * 440 / 44100);
short s = (short) (Short.MAX_VALUE * val);
buf[i++] = (byte) s;
buf[i++] = (byte) (s >> 8); // little endian
}
sdl.write(buf, 0, buf.length); // write the whole buffer (maxi samples = maxi * 2 bytes)
sdl.drain();
sdl.stop();
sdl.close();
} catch (LineUnavailableException e) {
e.printStackTrace();
}
}
};
t.start();
Proposal for better code
Here is a simplified version of your code that plays some note (frequency) without noise. I like it better as we first create array of doubles, which are universal values. These values can be combined together, or stored or further modified. Then we convert them to (8bit or 16bit) samples values.
private static byte[] buffer = new byte[4096 * 2 / 3];
private static int bufferSize = 0;
// plays a sample in range (-1, +1).
public static void play(SourceDataLine line, double in) {
if (in < -1.0) in = -1.0; // just sanity checks
if (in > +1.0) in = +1.0;
// convert to bytes - need 2 bytes for 16 bit sample
short s = (short) (Short.MAX_VALUE * in);
buffer[bufferSize++] = (byte) s;
buffer[bufferSize++] = (byte) (s >> 8); // little Endian
// send to line when buffer is full
if (bufferSize >= buffer.length) {
line.write(buffer, 0, buffer.length);
bufferSize = 0;
}
// todo: be sure that whole buffer is sent to line!
}
// prepares array of doubles, not related with the sampling value!
private static double[] tone(double hz, double duration) {
double amplitude = 1.0;
int N = (int) (44100 * duration);
double[] a = new double[N + 1];
for (int i = 0; i <= N; i++) {
a[i] = amplitude * Math.sin(2 * Math.PI * i * hz / 44100);
}
return a;
}
// finally:
public static void main(String[] args) throws LineUnavailableException {
AudioFormat af = new AudioFormat(44100, 16, 1, true, false);
SourceDataLine sdl = AudioSystem.getSourceDataLine(af);
sdl.open(af, 4096 * 2);
sdl.start();
double[] tones = tone(440, 2.0); // play A for 2 seconds
for (double t : tones) {
play(sdl, t);
}
sdl.drain();
sdl.stop();
sdl.close();
}
Sounds nice ;)
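The conversion from doubles in the range -1 to +1 to 16-bit little-endian bytes can also be done for a whole array at once, which sidesteps the question of flushing a partially filled buffer. A minimal sketch; the class and method names are hypothetical:

```java
final class Pcm16 {
    /** Converts samples in [-1, +1] to 16-bit signed little-endian PCM bytes. */
    static byte[] toLittleEndian(double[] samples) {
        byte[] out = new byte[samples.length * 2];   // 2 bytes per 16-bit sample
        for (int i = 0; i < samples.length; i++) {
            // Clamp out-of-range values instead of letting them wrap around.
            double v = Math.max(-1.0, Math.min(1.0, samples[i]));
            short s = (short) (Short.MAX_VALUE * v);
            out[2 * i] = (byte) s;            // low byte first (little endian)
            out[2 * i + 1] = (byte) (s >> 8); // then the high byte
        }
        return out;
    }
}
```

With the tone method above, usage would look like `byte[] pcm = Pcm16.toLittleEndian(tone(440, 2.0)); sdl.write(pcm, 0, pcm.length);`. The write call blocks until all the data has been handed to the line.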
Play two notes at the same time
Just combine two notes:
double[] a = tone(440, 1.0); // note A
double[] b = tone(523.25, 1.0); // note C (i hope:)
for (int i = 0; i < a.length; i++) {
a[i] = (a[i] + b[i]) / 2;
}
for (double t : a) {
play(sdl, t);
}
Remember that with a double array you can combine and manipulate your tones, e.g. to compose several tones that are played at the same time. Of course, if you add 3 tones, you need to normalize the values by dividing by 3, and so on.
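Generalized to any number of tones, the same idea might look like the sketch below. The helper name mix is made up for illustration; dividing by the number of tones is the simple normalization described above, and shorter tones are treated as silent once they end.

```java
final class ToneMixer {
    /** Averages any number of tones sample by sample so the result stays in [-1, +1]. */
    static double[] mix(double[]... tones) {
        if (tones.length == 0) {
            return new double[0];
        }
        // The mix is as long as the longest tone.
        int length = 0;
        for (double[] t : tones) {
            length = Math.max(length, t.length);
        }
        double[] out = new double[length];
        // Sum all tones; a tone that has ended contributes nothing.
        for (double[] t : tones) {
            for (int i = 0; i < t.length; i++) {
                out[i] += t[i];
            }
        }
        // Normalize by the number of tones, as with (a + b) / 2 for two notes.
        for (int i = 0; i < length; i++) {
            out[i] /= tones.length;
        }
        return out;
    }
}
```

For a three-note chord this would be called as `double[] chord = ToneMixer.mix(tone(440, 1.0), tone(523.25, 1.0), tone(659.25, 1.0));` and then played sample by sample as before.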
Ding Dong :)
The answer has already been provided, but I want to provide some information that might help understanding the solution.
Why 44100?
44.1 kHz audio is widely used because it is the sampling rate used for CDs. Analog audio is recorded by sampling it 44,100 times per second, and these samples are then used to reconstruct the audio signal on playback. The reason behind the selection of this exact frequency is rather complex, and unimportant for this explanation. That said, the suggestion of using 22000 is not very good: a sampling rate can only represent frequencies up to half its value, so 22 kHz covers only up to 11 kHz, well short of the upper end of the human hearing range (20 Hz - 20 kHz). You would want a sampling rate higher than 40 kHz for good sound quality. Some high-resolution formats go up to 96 kHz.
Why 16-bit?
The standard used for CDs is 44.1 kHz/16-bit. Some high-resolution formats use 96 kHz/24-bit. The sample rate refers to how many X-bit samples are recorded every second. CD-quality sampling uses 44,100 16-bit samples per second to reproduce sound.
Why is this explanation important?
The thing to remember is that you are trying to produce digital sound (not analog). This means that these bits and bytes have to be processed by an audio CODEC. In hardware, an audio CODEC is a device that encodes analog audio as digital signals and decodes digital back into analog. For audio outputs, the digitized sound must go through a Digital-to-Analog Converter (DAC) in order for proper sound to come out of the speakers. Two of the most important characteristics of a DAC are its bandwidth and its signal-to-noise ratio and the actual bandwidth of a DAC is characterized primarily by its sampling rate.
Basically, you can't use an arbitrary sampling rate because the audio will not be reproduced well by your audio device for the reasons stated above. When in doubt, check your computer hardware and find out what your CODEC supports.
For some reason the frequencies are displaced:
391 hz => 1162
440 hz => 2196
493 hz => 2454
I am using these values:
final int audioFrames= 1024;
final float sampleRate= 44100.0f;
final int bitsPerRecord= 16;
final int channels= 1;
final boolean bigEndian = true;
final boolean signed= true;
byteData= new byte[audioFrames * 2]; //two bytes per audio frame, 16 bits
dData= new double[audioFrames * 2]; // real & imaginary
This is how I read the data and transform it to doubles:
format = new AudioFormat(sampleRate, bitsPerRecord, channels, signed, bigEndian);
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
microphone = (TargetDataLine) AudioSystem.getLine(info);
microphone.open(format);
microphone.start();
int numBytesRead = microphone.read(byteData, 0, byteData.length);
Once the data is read, cast from 16 bit, big endian, signed to double
public void byteToDouble(){
ByteBuffer buf= ByteBuffer.wrap(byteData);
buf.order(ByteOrder.BIG_ENDIAN);
int i=0;
while(buf.remaining()>1){
short s = buf.getShort();
dData[ 2 * i ] = (double) s / 32768.0; //real
dData[ 2 * i + 1] = 0.0; // imag
++i;
}
}
And at last, run the FFT and find the frequency:
public void findFrequency(){
double frequency;
DoubleFFT_1D fft= new DoubleFFT_1D(audioFrames);
/* edu/emory/mathcs/jtransforms/fft/DoubleFFT_1D.java */
fft.complexForward(dData); // do the magic so we can find peak
for(int i = 0; i < audioFrames; i++){
re[i] = dData[i*2];
im[i] = dData[(i*2)+1];
mag[i] = Math.sqrt((re[i] * re[i]) + (im[i]*im[i]));
}
double peak = -1.0;
int peakIn=-1;
for(int i = 0; i < audioFrames; i++){
if(peak < mag[i]){
peakIn=i;
peak= mag[i];
}
}
frequency = (sampleRate * (double)peakIn) / (double)audioFrames;
System.out.print("Peak: "+peakIn+", Frequency: "+frequency+"\n");
}
You can interpolate between FFT result bins (parabolic or Sinc interpolation) to get a more accurate estimate of frequency. But you may have a bigger problem: your frequency source may be producing (or be being clipped to produce) some very strong odd harmonics or overtones that mask any fundamental sinusoid in the FFT result magnitudes. Thus you should try using a pitch detection/estimation algorithm instead of just trying to look for a (possibly missing) FFT peak.
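As a rough sketch of the parabolic option: fit a parabola through the peak bin and its two neighbours and use the position of its vertex as a fractional bin index. The class and method names are hypothetical. The peak search is limited to the lower half of the spectrum because, for real input, the upper half mirrors the lower half.

```java
final class PeakInterpolation {
    /** Index of the largest magnitude among bins 1 .. N/2 - 1 of an N-point spectrum. */
    static int peakBin(double[] mag) {
        int best = 1;
        for (int i = 2; i < mag.length / 2; i++) {
            if (mag[i] > mag[best]) {
                best = i;
            }
        }
        return best;
    }

    /**
     * Refines peak bin k (1 <= k <= mag.length - 2) by fitting a parabola
     * through mag[k-1], mag[k], mag[k+1] and returns the frequency in Hz.
     */
    static double interpolatedFrequency(double[] mag, int k, double sampleRate, int fftSize) {
        double a = mag[k - 1];
        double b = mag[k];
        double c = mag[k + 1];
        double denom = a - 2.0 * b + c;
        // Vertex offset in bins, between -0.5 and +0.5 for a true local peak.
        double offset = (denom == 0.0) ? 0.0 : 0.5 * (a - c) / denom;
        return (k + offset) * sampleRate / fftSize;
    }
}
```

With the variables from the question, this would be used as `int k = PeakInterpolation.peakBin(mag); double f = PeakInterpolation.interpolatedFrequency(mag, k, sampleRate, audioFrames);`. Interpolating on the logarithm of the magnitudes is a common variation.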
Firstly, if the audio you're recording is long, you'll need to do the FFT in chunks, preferably windowing each chunk before performing the FFT. A single FFT gives one spectrum for its whole chunk, so you need to take FFTs at many places if the frequency changes over time.
Accuracy can also be improved from sliding windows. This means that you would take a chunk, then slide over slightly and take another chunk, so that the chunks overlap. How much you slide over is variable, and the size of each chunk is also variable.
Then, FFT alone might produce false results. You can do further analysis, such as cepstrum analysis or Harmonic Product Spectrum analysis, on the power spectrum produced by the FFT to estimate the pitch more accurately.
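The chunking and windowing steps can be sketched as follows. This is only an illustration: the names are hypothetical, the Hann window is one common choice among several, and the chunk size and hop (how far each chunk slides) are left as parameters. The chunk size must be at least 2.

```java
final class WindowedChunks {
    /** Hann window of the given size (size >= 2): zero at both ends, 1 in the middle. */
    static double[] hann(int size) {
        double[] w = new double[size];
        for (int n = 0; n < size; n++) {
            w[n] = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * n / (size - 1)));
        }
        return w;
    }

    /**
     * Cuts the signal into chunks of `size` samples whose starts are `hop`
     * samples apart (hop < size gives overlapping chunks) and applies the
     * window to each chunk. Each returned row is ready to be passed to an FFT.
     */
    static double[][] chunks(double[] signal, int size, int hop) {
        if (signal.length < size) {
            return new double[0][];
        }
        int count = (signal.length - size) / hop + 1;
        double[] w = hann(size);
        double[][] out = new double[count][size];
        for (int c = 0; c < count; c++) {
            for (int n = 0; n < size; n++) {
                out[c][n] = signal[c * hop + n] * w[n];
            }
        }
        return out;
    }
}
```

A hop of half the chunk size (50% overlap) is a reasonable starting point; each windowed chunk then yields its own peak or pitch estimate.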