How do I apply the hanning function to my audio sample? - java

I am a university student developing a music identification system for my final year project. According to the "Robust Audio Fingerprint Extraction Algorithm Based on 2-D Chroma" research paper, the following functions need to be included in my system:
Capture Audio Signal ----> Framing Window (hanning window) -----> FFT ----->
High Pass Filter -----> etc.....
I was able to code the audio capture function and I have applied the FFT API to the code as well, but I am confused about how to apply the Hanning window function to my code. Can someone help me with this? Please tell me where I need to add this function and how to add it to the code.
Here is my audio capturing and FFT code:
private class RecordAudio extends AsyncTask<Void, double[], Void> {
    @Override
    protected Void doInBackground(Void... params) {
        started = true;
        try {
            DataOutputStream dos = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(recordingFile)));
            int bufferSize = AudioRecord.getMinBufferSize(frequency,
                    channelConfiguration, audioEncoding);
            audioRecord = new AudioRecord(MediaRecorder.AudioSource.MIC,
                    frequency, channelConfiguration, audioEncoding, bufferSize);
            short[] buffer = new short[blockSize];
            double[] toTransform = new double[blockSize];
            long t = System.currentTimeMillis();
            long end = t + 15000;
            audioRecord.startRecording();
            while (started && System.currentTimeMillis() < end) {
                int bufferReadResult = audioRecord.read(buffer, 0, blockSize);
                for (int i = 0; i < blockSize && i < bufferReadResult; i++) {
                    toTransform[i] = (double) buffer[i] / 32768.0;
                    dos.writeShort(buffer[i]);
                }
                // new part: window the frame, then transform it
                toTransform = hanningWindow(toTransform);
                transformer.ft(toTransform);
                publishProgress(toTransform);
            }
            audioRecord.stop();
            dos.close();
        } catch (Throwable t) {
            Log.e("AudioRecord", "Recording Failed");
        }
        return null;
    }
}
These links provide the Hanning window algorithm and code snippets:
WindowFunction.java
Hanning - MATLAB
The following code is what I used to apply the Hanning function in my application, and it works for me:
public double[] hanningWindow(double[] recordedData) {
    // multiply each sample in the buffer by the Hann window value at that index
    for (int n = 0; n < recordedData.length; n++) {
        recordedData[n] *= 0.5 * (1 - Math.cos((2 * Math.PI * n)
                / (recordedData.length - 1)));
    }
    // return the windowed buffer to the FFT function
    return recordedData;
}

At first, I think you should consider fixing your FFT length. If I understand your code correctly, you are currently using some kind of minimum buffer size as the FFT length as well. The FFT length has a huge effect on the performance and resolution of your calculation.
Your link to WindowFunction.java can generate an array that should be the same length as your FFT length (blockSize in your case, I think). You should then multiply each sample of your buffer by the value at the same index in the array returned from the WindowFunction.
This should be done before the FFT, as in the sketch below.
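As a minimal sketch of that order of operations (precompute the window once, multiply per frame, then transform), reusing blockSize, toTransform, and transformer from the question's code rather than the WindowFunction.java API:

// Precompute the Hann window once; it has the same length as the FFT input.
double[] window = new double[blockSize];
for (int n = 0; n < blockSize; n++) {
    window[n] = 0.5 * (1 - Math.cos((2 * Math.PI * n) / (blockSize - 1)));
}

// Inside the recording loop: window the frame in place, then run the FFT.
for (int i = 0; i < blockSize; i++) {
    toTransform[i] *= window[i];
}
transformer.ft(toTransform);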

Related

MP3 Files get distorted during read process

I'm currently working on an application that plays back sound. I implemented playback for standard WAV files with the Java Sound API, no problems there, everything working fine. Now I want to add support for MP3 as well, but I'm having a strange problem: the playback gets distorted. I'm trying to figure out what I'm doing wrong and would appreciate any leads in the right direction.
I'm using Mp3SPI (http://www.javazoom.net/mp3spi/documents.html) for playing back the MP3 files.
I have already tried taking a look at the output: I recorded a wav file with the output I get from the MP3, then compared the waveforms of the original and the recorded file. As it turns out, in the recorded file there are a lot of samples that are 0, or very close to it. Longer tones get broken up and the waveform returns to 0 all the time, then jumps back to where the waveform is in the original.
I open the file like this:
private AudioInputStream mp3;
private AudioInputStream rawMp3;

private void openMP3(File file) {
    // open the audio input stream
    try {
        rawMp3 = AudioSystem.getAudioInputStream(file);
        AudioFormat baseFormat = rawMp3.getFormat();
        AudioFormat decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                baseFormat.getSampleRate(),
                16,
                baseFormat.getChannels(),
                baseFormat.getChannels() * 2,
                baseFormat.getSampleRate(),
                false);
        mp3 = AudioSystem.getAudioInputStream(decodedFormat, rawMp3);
    } catch (UnsupportedAudioFileException | IOException ex) {
        Logger.getLogger(SoundFile.class.getName()).log(Level.SEVERE, null, ex);
    }
}
The part where I read the MP3 file:
byte[] data = new byte[length];
// read the data into the buffer, tracking the offset so a partial
// read does not overwrite the bytes already read
int offset = 0;
int nBytesRead = 0;
while (nBytesRead != -1 && offset < length) {
    nBytesRead = mp3.read(data, offset, length - offset);
    if (nBytesRead > 0) {
        offset += nBytesRead;
    }
}
I also convert the byte array to doubles; perhaps I'm doing something wrong here (I'm fairly new to using bitwise operators, so maybe the problem is there):
double[][] frameBuffer = new double[2][1024]; // 2-channel stereo buffer
int nFramesRead = 0;
int byteIndex = 0;
// convert the data into doubles and write it to frameBuffer;
// each stereo frame consumes 4 bytes (2 bytes per channel)
for (int i = 0; i < length / 4 && i < frameBuffer[0].length; ++i) {
    for (int c = 0; c < 2; ++c) {
        byte a = data[byteIndex++]; // least significant byte
        byte b = data[byteIndex++]; // most significant byte
        // mask 'a' to 0..255 so its sign bit does not wipe out b's bits
        int val = (a & 0xff) | (b << 8);
        frameBuffer[c][i] = (double) val / (double) Short.MAX_VALUE;
        nFramesRead++;
    }
}
The double array is then later used to play back the sound. When playing a wav file, I do the exact same thing to the buffer, so I'm pretty sure it has to be something during the read process, not me sending faulty bytes to the output.
I would expect this to work out of the box with Mp3SPI, but somehow something breaks the audio along the way.
I am also open to trying other libraries to play back the MP3, if you have any recommendations. Just a decoder for the raw MP3 data would actually be enough.
As it turns out, the AudioFormat from the MP3 (input) and the AudioFormat of the output didn't match, which obviously resulted in distortion. With those matched up, playback is fine!
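For illustration, here is a minimal sketch of what "matching up" can look like, assuming playback goes through a SourceDataLine (the question's playback path isn't shown, so that part is an assumption); mp3 is the decoded stream from openMP3 above:

// Open the output line with the SAME format the decoder produces,
// so no implicit (and possibly wrong) conversion happens on playback.
AudioFormat decodedFormat = mp3.getFormat();
SourceDataLine line = AudioSystem.getSourceDataLine(decodedFormat);
line.open(decodedFormat);
line.start();

byte[] buf = new byte[4096];
int n;
while ((n = mp3.read(buf, 0, buf.length)) != -1) {
    line.write(buf, 0, n);
}
line.drain();
line.close();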

Compatibility of Android's AudioRecord with Java's SE AudioFormat

I am writing an Android application which sends recorded sound to a server, and I need to adapt its format to the required one. I was told that the server's audio format is specified by the javax.sound.sampled.AudioFormat class constructor with the following parameters: AudioFormat(44100, 8, 1, true, true), which means that the required sound should have a 44100 Hz sample rate, 8-bit sample size, a mono channel, and be signed and encoded with big-endian byte order. My question is how I can convert my recorded sound to the required one. I think the biggest problem might be Android's 16-bit restriction on the smallest sample size.
You can record 44100 Hz, 8-bit audio directly with AudioRecord by specifying the format in the constructor:
int bufferSize = Math.max(
        AudioRecord.getMinBufferSize(44100,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_8BIT),
        ENOUGH_SIZE_FOR_BUFFER);
AudioRecord audioRecord = new AudioRecord(MediaRecorder.AudioSource.MIC,
        44100, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_8BIT, bufferSize);
then pull data from audioRecord using the read(byte[], int, int) method:
byte[] myBuf = new byte[bufferSize];
audioRecord.startRecording();
while (audioRecord.getRecordingState() == AudioRecord.RECORDSTATE_RECORDING) {
    int l = audioRecord.read(myBuf, 0, myBuf.length);
    if (l > 0) {
        // process data
    }
}
In this case the data in the buffer will be as you want: 8-bit, mono, 44100 Hz.
However, some devices may not support 8-bit recording. In that case you can record the data in 16-bit format and obtain it using the read(short[], int, int) method. You then need to convert the samples on your own:
short[] recordBuf = new short[bufferSize];
byte[] myBuf = new byte[bufferSize];
...
int l = audioRecord.read(recordBuf, 0, recordBuf.length);
if (l > 0) {
    for (int i = 0; i < l; i++) {
        // keep the high byte of each 16-bit sample: signed 16-bit -> signed 8-bit
        myBuf[i] = (byte) (recordBuf[i] >> 8);
    }
    // process data
}
Using the same approach, you can convert any PCM format to another.
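As one more sketch of that claim, here is a conversion in the other direction that the server's javax.sound.sampled format would need if you kept 16-bit samples: packing Java shorts into big-endian byte order. The buffer names are illustrative, not from the answer above.

// Convert 16-bit samples (Java shorts) to a big-endian byte stream.
short[] samples = new short[1024];          // e.g. filled by audioRecord.read(...)
byte[] bigEndian = new byte[samples.length * 2];
for (int i = 0; i < samples.length; i++) {
    bigEndian[2 * i]     = (byte) (samples[i] >> 8);   // high byte first
    bigEndian[2 * i + 1] = (byte) (samples[i] & 0xff); // low byte second
}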

How to get Audio for encoding using Xuggler

I'm writing an application that records the screen and audio. While the screen recording works perfectly, I'm having difficulty getting the raw audio using the JDK libraries. Here's the code:
try {
    // Now, we're going to loop
    long startTime = System.nanoTime();
    System.out.println("Encoding Image.....");
    while (!Thread.currentThread().isInterrupted()) {
        // take the screen shot
        BufferedImage screen = robot.createScreenCapture(screenBounds);
        // convert to the right image type
        BufferedImage bgrScreen = convertToType(screen,
                BufferedImage.TYPE_3BYTE_BGR);
        // encode the image
        writer.encodeVideo(0, bgrScreen, System.nanoTime() - startTime,
                TimeUnit.NANOSECONDS);
        /* Need to get audio here and then encode using Xuggler. Something like:
        WaveData wd = new WaveData();
        TargetDataLine line;
        AudioInputStream aus = new AudioInputStream(line);
        short[] samples = getSourceSamples();
        writer.encodeAudio(0, samples); */
        if (timeCreation < 10) {
            timeCreation = getGMTTime();
        }
        // sleep for framerate milliseconds
        try {
            Thread.sleep((long) (1000 / FRAME_RATE.getDouble()));
        } catch (Exception ex) {
            System.err.println("stopping....");
            break;
        }
    }
    // Finally we tell the writer to close and write the trailer if needed
} finally {
    writer.close();
}
This page has some pseudocode like:
while (haveMoreAudio()) {
    short[] samples = getSourceSamples();
    writer.encodeAudio(0, samples);
}
but what exactly should I do for getSourceSamples()?
Also, a bonus question - is it possible to choose from multiple microphones in this approach?
See also:
Xuggler encoding and muxing
Try this:
// Pick a format. It needs 16 bits; the rest can be set to anything.
// It is better to enumerate the formats that the system supports,
// because getLine() can error out with any particular format.
AudioFormat audioFormat = new AudioFormat(44100.0F, 16, 2, true, false);

// Get a default TargetDataLine with that format
DataLine.Info dataLineInfo = new DataLine.Info(TargetDataLine.class, audioFormat);
TargetDataLine line = (TargetDataLine) AudioSystem.getLine(dataLineInfo);

// Open and start capturing audio
line.open(audioFormat, line.getBufferSize());
line.start();

while (true) {
    // read as raw bytes
    byte[] audioBytes = new byte[line.getBufferSize() / 2]; // best size?
    int numBytesRead = line.read(audioBytes, 0, audioBytes.length);

    // convert to signed shorts representing samples
    int numSamplesRead = numBytesRead / 2;
    short[] audioSamples = new short[numSamplesRead];
    if (audioFormat.isBigEndian()) {
        for (int i = 0; i < numSamplesRead; i++) {
            // mask the low byte so its sign bit is not extended
            audioSamples[i] = (short) ((audioBytes[2 * i] << 8) | (audioBytes[2 * i + 1] & 0xff));
        }
    } else {
        for (int i = 0; i < numSamplesRead; i++) {
            audioSamples[i] = (short) ((audioBytes[2 * i + 1] << 8) | (audioBytes[2 * i] & 0xff));
        }
    }

    // use audioSamples in Xuggler etc.
}
To pick a microphone, you'd probably have to do this:
Mixer.Info[] mixerInfo = AudioSystem.getMixerInfo();
// Look through and select a mixer here; different mixers should be different inputs
int selectedMixerIndex = 0;
Mixer mixer = AudioSystem.getMixer(mixerInfo[selectedMixerIndex]);
TargetDataLine line = (TargetDataLine) mixer.getLine(dataLineInfo);
I think it's possible that multiple microphones will show up in one mixer as different source data lines. In that case you'd have to open them and call dataLine.getControl(FloatControl.Type.MASTER_GAIN).setValue(volume); to turn them on and off.
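To see what inputs are actually available before choosing an index, here is a small sketch that lists each mixer and whether it offers a TargetDataLine for the capture format (dataLineInfo is the same object as above; the printed layout is just an example):

Mixer.Info[] mixerInfos = AudioSystem.getMixerInfo();
for (int i = 0; i < mixerInfos.length; i++) {
    Mixer m = AudioSystem.getMixer(mixerInfos[i]);
    // Only mixers that can supply our TargetDataLine are usable as inputs
    boolean supportsCapture = m.isLineSupported(dataLineInfo);
    System.out.println(i + ": " + mixerInfos[i].getName()
            + " (capture: " + supportsCapture + ")");
}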
See:
WaveData.java
Sound wave from TargetDataLine
How to set volume of a SourceDataLine in Java

Java: retrieving byte array from an 8 bit wav file and normalizing it to -1.0 to 1.0

Bear with me, as I'm very new to working with audio and I have been googling for days for a solution without finding any.
So I retrieve the byte array of a .wav file with this (source: Wav file convert to byte array in java):
ByteArrayOutputStream out = new ByteArrayOutputStream();
BufferedInputStream in = new BufferedInputStream(new FileInputStream(WAV_FILE));
int read;
byte[] buff = new byte[1024];
while ((read = in.read(buff)) > 0) {
    out.write(buff, 0, read);
}
out.flush();
byte[] audioBytes = out.toByteArray();
And then I convert the byte array to a float array and normalize it from -1.0 to 1.0 (source: Convert wav audio format byte array to floating point):
ShortBuffer sbuf =
        ByteBuffer.wrap(audioBytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
short[] audioShorts = new short[sbuf.capacity()];
sbuf.get(audioShorts);

float[] audioFloats = new float[audioShorts.length];
for (int i = 0; i < audioShorts.length; i++) {
    audioFloats[i] = ((float) audioShorts[i]) / 0x8000;
}
return audioFloats;
Later I convert this to line drawings, which output the waveform using Java Swing:
class Panel2 extends JPanel {
    float[] audioFloats;
    Dimension d;

    public Panel2(Dimension d, float[] audioFloats) {
        // set a preferred size for the custom panel
        this.d = d;
        setPreferredSize(d);
        this.audioFloats = audioFloats;
    }

    @Override
    public void paint(Graphics g) {
        //super.paintComponent(g);
        super.paint(g);
        // shift by 45 because the first 44 bytes are used for the header
        for (int i = 45; i < audioFloats.length; i++) {
            Graphics2D g2 = (Graphics2D) g;
            float inc = (i - 45) * ((float) d.width) / ((float) (audioFloats.length - 45 - 1));
            Line2D lin = new Line2D.Float(inc, d.height / 2, inc,
                    (audioFloats[i] * d.height + d.height / 2));
            g2.draw(lin);
        }
    }
}
The waveform only looks right for 16-bit wav files (I've cross-checked with GoldWave, and both my waveform and theirs look similar for 16 bits).
How do I do this for 8-bit .wav files?
Because this is for homework, my only restriction is to read the wav file byte by byte.
I also know the wav files are PCM coded and have the first 44 bytes reserved as the header.
You need to adapt this part of the code:
ShortBuffer sbuf =
        ByteBuffer.wrap(audioBytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
short[] audioShorts = new short[sbuf.capacity()];
sbuf.get(audioShorts);
float[] audioFloats = new float[audioShorts.length];
for (int i = 0; i < audioShorts.length; i++) {
    audioFloats[i] = ((float) audioShorts[i]) / 0x8000;
}
You don't need a ByteBuffer at all: you already have your byte array. One caveat: 8-bit WAV PCM is stored unsigned (0 to 255, with silence at 128), unlike 16-bit, which is signed, so re-center each byte before normalizing:
float[] audioFloats = new float[audioBytes.length];
for (int i = 0; i < audioBytes.length; i++) {
    // mask to 0..255, shift to -128..127, then normalize to -1.0..1.0
    audioFloats[i] = ((audioBytes[i] & 0xff) - 128) / 128f;
}
Audio streams are usually interleaved, with one channel of data followed by the other channel. So, for example, the first 16 bits would be the left channel, then the next 16 bits would be the right channel. Each of these pairs is considered one frame of data. I would make sure that your 8-bit stream is only one channel, because it looks like your methods are only set up to read one channel.
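For instance, here is a minimal sketch of de-interleaving a 16-bit stereo stream into separate per-channel arrays (the variable names are illustrative, not from the question):

// Interleaved 16-bit stereo samples: L0, R0, L1, R1, ...
short[] interleaved = new short[2048];           // e.g. from a decoded buffer
int frames = interleaved.length / 2;
float[] left = new float[frames];
float[] right = new float[frames];
for (int f = 0; f < frames; f++) {
    left[f]  = interleaved[2 * f]     / 32768f;  // channel 0
    right[f] = interleaved[2 * f + 1] / 32768f;  // channel 1
}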
Also, in your example, to convert the frames you are grabbing each individual channel as a short, then finding a decimal value by dividing by 0x8000 hex, the maximum value of a signed short:
short[] audioShorts = new short[sbuf.capacity()];
sbuf.get(audioShorts);
...
audioFloats[i] = ((float)audioShorts[i])/0x8000;
My guess is that you need to read the 8-bit stream as type 'byte' instead of short. Since 8-bit WAV samples are unsigned, shift each value down by 128 and then divide by 128, the maximum magnitude of a signed 8-bit value. This will involve making a whole new method that processes 8-bit streams instead of 16-bit streams, with the following changes:
// no ShortBuffer needed; work directly on the raw bytes read earlier
byte[] audioBytes = out.toByteArray();
...
// treat each byte as unsigned and re-center around zero
audioFloats[i] = ((audioBytes[i] & 0xff) - 128) / 128f;
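Putting this together, here is a self-contained sketch of an 8-bit reader (a hypothetical helper, not from the original answers): it skips the 44-byte header mentioned in the question and treats the samples as unsigned:

public static float[] convert8BitWav(byte[] audioBytes) {
    final int headerBytes = 44; // PCM WAV header, per the question
    float[] audioFloats = new float[audioBytes.length - headerBytes];
    for (int i = 0; i < audioFloats.length; i++) {
        // 8-bit WAV PCM is unsigned: 0..255, with silence at 128
        int unsigned = audioBytes[i + headerBytes] & 0xff;
        audioFloats[i] = (unsigned - 128) / 128f;
    }
    return audioFloats;
}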

Generating sound with javax.sound.sampled

I'm trying to generate sound with Java. Eventually, I want to continuously send sound to the sound card, but for now I would be happy to send a single sound wave.
So, I filled an array with 44100 signed integers representing a simple sine wave, and I would like to send it to my sound card, but I just can't get it to work.
int samples = 44100; // 44100 samples/s
int[] data = new int[samples];

// Generate all samples
for (int i = 0; i < samples; ++i) {
    data[i] = (int) (Math.sin((double) i / (double) samples * 2 * Math.PI) * (Integer.MAX_VALUE / 2));
}
And I send it to a sound line using:
AudioFormat format = new AudioFormat(Encoding.PCM_SIGNED, 44100, 16, 1, 1, 44100, false);
Clip clip = AudioSystem.getClip();
AudioInputStream inputStream = new AudioInputStream(ais, format, 44100);
clip.open(inputStream);
clip.start();
My problem resides between these two code snippets: I just can't find a way to convert my int[] to an input stream!
Firstly, I think you want short samples rather than int:
short[] data = new short[samples];
because your AudioFormat specifies 16-bit samples. A short is 16 bits wide, but an int is 32 bits.
An easy way to convert it to a stream is:
1. Allocate a ByteBuffer
2. Populate it using putShort calls
3. Wrap the resulting byte[] in a ByteArrayInputStream
4. Create an AudioInputStream from the ByteArrayInputStream and the format
Example:
float frameRate = 44100f; // 44100 samples/s
int channels = 2;
double duration = 1.0;
int sampleBytes = Short.SIZE / 8;
int frameBytes = sampleBytes * channels;

AudioFormat format =
        new AudioFormat(Encoding.PCM_SIGNED,
                frameRate,
                Short.SIZE,
                channels,
                frameBytes,
                frameRate,
                true);

int nFrames = (int) Math.ceil(frameRate * duration);
int nSamples = nFrames * channels;
int nBytes = nSamples * sampleBytes;
ByteBuffer data = ByteBuffer.allocate(nBytes);
double freq = 440.0;

// Generate all samples
for (int i = 0; i < nFrames; ++i) {
    double value = Math.sin((double) i / (double) frameRate * freq * 2 * Math.PI) * Short.MAX_VALUE;
    for (int c = 0; c < channels; ++c) {
        int index = (i * channels + c) * sampleBytes;
        data.putShort(index, (short) value);
    }
}

// the stream length is measured in frames, not samples
AudioInputStream stream =
        new AudioInputStream(new ByteArrayInputStream(data.array()), format, nFrames);
Clip clip = AudioSystem.getClip();
clip.open(stream);
clip.start();
clip.drain();
Note: I changed your AudioFormat to stereo, because it threw an exception when I requested a mono line. I also increased the frequency of your waveform to something in the audible range.
Update - the previous modification (writing directly to the data line) was not necessary - using a Clip works fine. I have also introduced some variables to make the calculations clearer.
If you want to play a simple sound, you should use a SourceDataLine.
Here's an example:
import javax.sound.sampled.*;

public class Sound implements Runnable {

    // Specify the format as:
    // 44100 samples per second (sample rate),
    // 16-bit samples,
    // mono sound,
    // signed values,
    // big-endian byte order
    final AudioFormat format = new AudioFormat(44100f, 16, 1, true, true);

    // Your output line that sends the audio to the speakers
    SourceDataLine line;

    public Sound() {
        try {
            line = AudioSystem.getSourceDataLine(format);
            line.open(format);
        } catch (LineUnavailableException oops) {
            oops.printStackTrace();
        }
        new Thread(this).start();
    }

    public void run() {
        // a buffer to store the audio samples
        byte[] buffer = new byte[1000];
        int bufferposition = 0;
        // a counter to generate the samples
        long c = 0;
        // The pitch of your sine wave (440.0 Hz in this case)
        double wavelength = 44100.0 / 440.0;
        while (true) {
            // Generate a sample
            short sample = (short) (Math.sin(2 * Math.PI * c / wavelength) * 32000);
            // Split the sample into two bytes and store them in the buffer
            buffer[bufferposition] = (byte) (sample >>> 8);
            bufferposition++;
            buffer[bufferposition] = (byte) (sample & 0xff);
            bufferposition++;
            // if the buffer is full, send it to the speakers
            if (bufferposition >= buffer.length) {
                line.write(buffer, 0, buffer.length);
                line.start();
                // Reset the buffer
                bufferposition = 0;
            }
            // Increment the counter inside the loop, so the wave advances
            c++;
        }
    }

    public static void main(String[] args) {
        new Sound();
    }
}
In this example you're continuously generating a sine wave, but you can use this code to play sound from any source you want. You just have to make sure that the samples are formatted correctly. In this case, I'm using raw, uncompressed 16-bit samples at a sample rate of 44100 Hz. However, if you want to play audio from a file, you can use a Clip object:
public void play(File file)
        throws LineUnavailableException, IOException, UnsupportedAudioFileException {
    Clip clip = AudioSystem.getClip();
    clip.open(AudioSystem.getAudioInputStream(file));
    clip.loop(1);
}
