Writing double[] as WAV file in Java

I'm trying to save a double[] array as .WAV file using this method:
public static void saveWav(String filename, double[] samples) {
// assumes 44,100 samples per second
// use 16-bit audio, 2 channels, signed PCM, little Endian
AudioFormat format = new AudioFormat(SAMPLE_RATE * 2, 16, 1, true, false);
byte[] data = new byte[2 * samples.length];
for (int i = 0; i < samples.length; i++) {
int temp = (short) (samples[i] * MAX_16_BIT);
data[2*i + 0] = (byte) temp;
data[2*i + 1] = (byte) (temp >> 8);
}
// now save the file
try {
ByteArrayInputStream bais = new ByteArrayInputStream(data);
AudioInputStream ais = new AudioInputStream(bais, format, samples.length);
if (filename.endsWith(".wav") || filename.endsWith(".WAV")) {
AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
}
else if (filename.endsWith(".au") || filename.endsWith(".AU")) {
AudioSystem.write(ais, AudioFileFormat.Type.AU, new File(filename));
}
else {
throw new RuntimeException("File format not supported: " + filename);
}
}
catch (IOException e) {
System.out.println(e);
System.exit(1);
}
}
But when I reload the files I saved, every value song[i] differs from the original. I use this method to read WAV files:
public static double[] read(String filename) {
byte[] data = readByte(filename);
int N = data.length;
double[] d = new double[N/2];
for (int i = 0; i < N/2; i++) {
d[i] = ((short) (((data[2*i+1] & 0xFF) << 8) + (data[2*i] & 0xFF))) / ((double) MAX_16_BIT);
}
return d;
}
private static byte[] readByte(String filename) {
byte[] data = null;
AudioInputStream ais = null;
try {
// try to read from file
File file = new File(filename);
if (file.exists()) {
ais = AudioSystem.getAudioInputStream(file);
data = new byte[ais.available()];
ais.read(data);
}
// try to read from URL
else {
URL url = StdAudio.class.getResource(filename);
ais = AudioSystem.getAudioInputStream(url);
data = new byte[ais.available()];
ais.read(data);
}
}
catch (IOException e) {
System.out.println(e.getMessage());
throw new RuntimeException("Could not read " + filename);
}
catch (UnsupportedAudioFileException e) {
System.out.println(e.getMessage());
throw new RuntimeException(filename + " in unsupported audio format");
}
return data;
}
I need both double[] arrays to have the exact same values, and that's not the case.
When I listen to the song I can't tell the difference, but I still need those original values.
Any help appreciated.
Guy

A double requires 64 bits of storage and carries a lot of precision. Your code stores each sample in 16 bits, so you can't throw away 48 bits of data in the round trip and expect to get the exact same value back. It is analogous to starting with a high-resolution image, converting it to a thumbnail, and then expecting to magically recover the original high-resolution image. In the real world, the human ear is not going to be able to distinguish between the two. The higher resolution is useful during computation and signal processing routines to reduce the accumulation of computational errors. That being said, if you want to store 64-bit values you'll need to use something other than .WAV. The closest you'll get is 32-bit.
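To make the loss concrete, here is a minimal sketch of the same 16-bit round trip on a single sample, next to a raw 64-bit round trip that does preserve the value. The class and method names are made up for illustration, and MAX_16_BIT is assumed to be Short.MAX_VALUE (32767), since the question does not show its definition.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class RoundTripDemo {
    // Assumption: the question's MAX_16_BIT constant equals Short.MAX_VALUE.
    static final double MAX_16_BIT = Short.MAX_VALUE;

    // Quantize to 16-bit little-endian PCM and decode again, as saveWav/read do.
    static double through16Bit(double sample) {
        short q = (short) (sample * MAX_16_BIT);
        byte lo = (byte) q;
        byte hi = (byte) (q >> 8);
        short back = (short) (((hi & 0xFF) << 8) + (lo & 0xFF));
        return back / MAX_16_BIT;
    }

    // Store all 64 bits of every double; this is not a WAV file, just raw bytes.
    static double[] throughRaw64Bit(double[] samples) {
        ByteBuffer buf = ByteBuffer.allocate(8 * samples.length).order(ByteOrder.LITTLE_ENDIAN);
        for (double s : samples) {
            buf.putDouble(s);
        }
        buf.flip();
        double[] out = new double[samples.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getDouble();
        }
        return out;
    }

    public static void main(String[] args) {
        double original = 0.123456789;
        double restored = through16Bit(original);
        System.out.println(original + " -> " + restored);
        System.out.println("error = " + Math.abs(original - restored));
    }
}
```

For samples in [-1, 1] the 16-bit error stays below 1/32767, which is why the audio sounds identical even though the doubles differ.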

Related

MediaRecorder record audio in a loop

I'm developing a sound recognition system. I'm using a TensorFlow model developed in Python to convert MFCC values to labels. I'm using the MediaRecorder class to record the audio, and I'm doing it in a loop so that I constantly get microphone audio and then get the label from the model. Here is the recording loop:
temp = 0;
while (true) {
audioPath = getApplicationContext().getFilesDir().getAbsolutePath();
audioPath += "/Recording" + temp + ".3gp";
audioFile = new File(audioPath);
mediaRecorder = new MediaRecorder();
mediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);
mediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
mediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
mediaRecorder.setOutputFile(audioPath);
try {
mediaRecorder.prepare();
} catch (IOException e) {
e.printStackTrace();
}
mediaRecorder.start();
sleep(2000);
if (!isRunning) {
mediaRecorder.stop();
return;
}
try {
int amplitude = mediaRecorder.getMaxAmplitude();
Log.d("volume", Integer.toString(amplitude));
//finished = false;
avgVolumeTask task = new avgVolumeTask();
task.execute(amplitude);
} catch (Exception e) {
Log.d("Exception in startMediaRecorder()", e.toString());
}
mediaRecorder.stop();
mediaRecorder.release();
soundRecognition task2 = new soundRecognition();
task2.execute();
audioFile.delete();
temp++;
}
This is the soundRecognition method:
private class soundRecognition extends AsyncTask<Integer, Integer, Long> {
@Override
protected Long doInBackground(Integer... level) {
float[] mfccValues = null;
Interpreter tflite = null;
float[][] labelProbArray = null;
try {
mfccValues = computeMFCC();
labelList = loadLabelList();
labelProbArray = new float[1][labelList.size()];
tflite = new Interpreter(loadModel());
} catch (IOException e) {
e.printStackTrace();
} catch (UnsupportedAudioFileException e) {
e.printStackTrace();
}
tflite.run(mfccValues, labelProbArray);
for (int i = 0; i < labelProbArray[0].length; i++) {
float value = labelProbArray[0][i];
//if (i == 1f){
//Log.d("Output at " + Integer.toString(i) + ": ", Float.toString(value));
//doAlert(i);
//}
}
return null;
}
}
The computeMFCC method is this:
public float[] computeMFCC() throws IOException, UnsupportedAudioFileException {
FileInputStream in2 = new FileInputStream(audioPath);
int i;
// InputStream to byte array
byte[] buf = IOUtils.toByteArray(in2);
in2.close();
i = Integer.MAX_VALUE;
// byte array to short array
short[] shortArr = new short[buf.length / 2];
ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shortArr);
int count = 0;
while (count <= shortArr.length) { // Still have data to process.
for (int n = 0; n < nSubframePerBuf; n++) { // Process audio signal in ArrayList and shift by one subframe each time
int k = 0;
for (i = (n * frameShift); i < (n + 1) * frameShift; i++) {
subx[k] = shortArr[i];
k++;
}
subframeList.add(subx); // Add the current subframe to the subframe list. Later, a number of
}
count++;
}
// Need at least nSubframePerMfccFrame to get one analysis frame
x = extractOneFrameFromList(nSubframePerMfccFrame);
MFCC mfcc = new MFCC(samplePerFrm, 16000, numMfcc);
double[] mfccVals = mfcc.doMFCC(x);
float[] floatArray = new float[mfccVals.length];
for (i = 0 ; i < mfccVals.length; i++)
{
floatArray[i] = (float) mfccVals[i];
}
return floatArray;
}
And the doMFCC method is from a downloaded java file here:
https://github.com/enmwmak/ScreamDetector/blob/master/src/edu/polyu/mfcc/MFCC.java
The issue I'm having is that after a few iterations the file doesn't get created, and I then get a null error when passing the results from the input stream to the TensorFlow model.
Possible Issues
One reason could be where the file is stored. I've been trying to send the file to local storage because I was worried that not all devices would have external storage.
Another reason could be that I'm not calling the sound recognition in the right spot. I waited until after the mediaRecorder is stopped to make sure that the file is written with the mic audio, but when I review the contents of the FileInputStream, it does not appear to be working, and in each loop the file is always the same.
Any help would be much appreciated.
It may be tricky to have a sleep(2000) inside the while loop.
It may be better to check the elapsed milliseconds and stay in the recording step until 2000 ms have passed.
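One way to read that suggestion, as a rough sketch in plain Java rather than Android-specific code: poll a clock in short steps so the loop can react to a stop flag instead of blocking for the full two seconds. The TimedWait class, the waitWhile method and its parameters are hypothetical names, not part of MediaRecorder or the question's code.

```java
import java.util.function.BooleanSupplier;

class TimedWait {
    /**
     * Waits until maxMillis have elapsed, checking keepGoing every pollMillis.
     * Returns true if the full time elapsed, false if keepGoing turned false
     * or the thread was interrupted first.
     */
    static boolean waitWhile(BooleanSupplier keepGoing, long maxMillis, long pollMillis) {
        long start = System.nanoTime();
        long maxNanos = maxMillis * 1_000_000L;
        while (System.nanoTime() - start < maxNanos) {
            if (!keepGoing.getAsBoolean()) {
                return false; // caller asked to stop early
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

In the recording loop this would stand in for sleep(2000), for example waitWhile(() -> isRunning, 2000, 50), with the recorder stopped and released on either outcome.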

How to change volume of WAV file byte by byte in Java?

I can read a WAV file (8 bits per sample) with the following function and copy it to another file. I want to adjust the overall volume of the source file with a given scale parameter, which is in the range [0, 1]. My naive approach was to multiply each byte by scale and convert it to a byte again. All I got was a noisy file. How can I achieve this byte-by-byte volume adjustment?
public static final int BUFFER_SIZE = 10000;
public static final int WAV_HEADER_SIZE = 44;
public void changeVolume(File source, File destination, float scale) {
RandomAccessFile fileIn = null;
RandomAccessFile fileOut = null;
byte[] header = new byte[WAV_HEADER_SIZE];
byte[] buffer = new byte[BUFFER_SIZE];
try {
fileIn = new RandomAccessFile(source, "r");
fileOut = new RandomAccessFile(destination, "rw");
// copy the header of source to destination file
int numBytes = fileIn.read(header);
fileOut.write(header, 0, numBytes);
// read & write audio samples in blocks of size BUFFER_SIZE
int seekDistance = 0;
int bytesToRead = BUFFER_SIZE;
long totalBytesRead = 0;
while(totalBytesRead < fileIn.length()) {
if (seekDistance + BUFFER_SIZE <= fileIn.length()) {
bytesToRead = BUFFER_SIZE;
} else {
// read remaining bytes
bytesToRead = (int) (fileIn.length() - totalBytesRead);
}
fileIn.seek(seekDistance);
int numBytesRead = fileIn.read(buffer, 0, bytesToRead);
totalBytesRead += numBytesRead;
for (int i = 0; i < numBytesRead - 1; i++) {
// WHAT TO DO HERE?
buffer[i] = (byte) (scale * ((int) buffer[i]));
}
fileOut.write(buffer, 0, numBytesRead);
seekDistance += numBytesRead;
}
fileOut.setLength(fileIn.length());
} catch (FileNotFoundException e) {
System.err.println("File could not be found" + e.getMessage());
} catch (IOException e) {
System.err.println("IOException: " + e.getMessage());
} finally {
try {
fileIn.close();
fileOut.close();
} catch (IOException e) {
System.err.println("IOException: " + e.getMessage());
}
}
}
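A Java byte ranges from -128 to 127, while a sample byte in the 8-bit WAV PCM format ranges from 0 to 255. That is most likely why you are changing your PCM data to random/noisy values. Convert the byte to its unsigned value before scaling. Since silence in 8-bit PCM sits at 128 rather than 0, it is also worth scaling around that midpoint so the signal does not drift toward one rail:
buffer[i] = (byte) (128 + scale * ((buffer[i] & 0xFF) - 128));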
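As a small self-contained sketch of that idea, the helper below scales a block of 8-bit samples in place. It assumes unsigned 8-bit PCM with silence at 128; the class and method names are invented for the example, and the header bytes are assumed to have been skipped by the caller.

```java
class EightBitVolume {
    // Scales unsigned 8-bit PCM samples in buffer[offset .. offset+length) in place.
    static void scaleInPlace(byte[] buffer, int offset, int length, float scale) {
        for (int i = offset; i < offset + length; i++) {
            int unsigned = buffer[i] & 0xFF;           // 0..255
            int centered = unsigned - 128;             // -128..127, silence is 0
            int scaled = Math.round(centered * scale); // apply the volume factor
            int clamped = Math.max(-128, Math.min(127, scaled));
            buffer[i] = (byte) (clamped + 128);        // back to unsigned storage
        }
    }

    public static void main(String[] args) {
        byte[] samples = {(byte) 128, (byte) 255, 0, (byte) 192};
        scaleInPlace(samples, 0, samples.length, 0.5f);
        for (byte b : samples) {
            System.out.print((b & 0xFF) + " ");
        }
        System.out.println();
    }
}
```

The clamp is redundant for scale values in [0, 1] but keeps the helper safe if a larger factor is passed.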

Get the int values from a .wav file in Java

I seem to have reached a dead-end. I try to display the first int values of a 32 bits per sample, 2 channels .wav file with the following code :
public static void main(String[] args) {
File inFile = new File("C:/.../1.wav");
FileInputStream fstream = null;
try {
fstream = new FileInputStream(inFile);
} catch (FileNotFoundException e1) {
e1.printStackTrace();
}
BufferedInputStream in = new BufferedInputStream(fstream);
byte[] bytes = null;
try {
bytes = IOUtils.toByteArray(in);
} catch (IOException e) {
e.printStackTrace();
}
ByteBuffer wrapped = ByteBuffer.wrap(bytes);
int num = wrapped.getInt();
System.out.println(num);
}
But I'm not sure what the displayed number means. I get "1380533830". First of all, if I understood correctly, the int range is not 0 to 2^32 - 1 but -2^31 to 2^31 - 1.
If that's the case, it should be the first int from the first channel, but when I do an audioread in MATLAB I get a completely different value: "-1376256". I checked whether "1380533830" appeared anywhere in my audioread output, but it didn't, so I don't know what I did wrong.
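One observation that may help here: 1380533830 is 0x52494646, which is the ASCII text "RIFF" read as a big-endian int, so the code above appears to be printing the start of the file header rather than a sample. The sketch below decodes that value and reads a little-endian 32-bit sample at a given offset. The class name, method names and the 44-byte offset are assumptions for illustration; real files can carry extra chunks before the audio data, so a robust reader would parse the chunk list or use AudioSystem.getAudioInputStream instead.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class RiffPeek {
    // Interprets the four bytes of a big-endian int as ASCII characters.
    static String asAscii(int value) {
        byte[] b = ByteBuffer.allocate(4).putInt(value).array(); // ByteBuffer is big-endian by default
        return new String(b, StandardCharsets.US_ASCII);
    }

    // Reads one little-endian 32-bit sample starting at dataOffset.
    static int sampleAt(byte[] fileBytes, int dataOffset) {
        return ByteBuffer.wrap(fileBytes).order(ByteOrder.LITTLE_ENDIAN).getInt(dataOffset);
    }

    public static void main(String[] args) {
        System.out.println(asAscii(1380533830));
    }
}
```

WAV sample data is little-endian, while ByteBuffer defaults to big-endian, so both the offset and the byte order need to be set before the values can match what other tools report.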

Extracting Big Wav file into smaller chunks in Java

I have a big WAV file that I would like to split into smaller chunks. I also have a .cue file that gives the lengths, in frames, that the smaller chunks should have. I figured out how to split the WAV up, but all the WAV files that are created contain the same sound. It seems that every time I create a new WAV, the big WAV file starts from the beginning, so the new file has the correct length but the same sound.
I think I need a way to read the WAV up to a specific frame, write that to a file, then continue reading and write to another file, etc.
I've been at this for hours and can't seem to figure it out. Any help would be greatly appreciated. Here is my code; all the commented-out stuff is wrong code that I have been trying.
int count2 = 0;
int totalFramesRead = 0;
//cap contains the how many wav's are to be made
//counter contains the vector position.
String wavFile1 = "C:\\Users\\DC3\\Desktop\\wav&text\\testwav.wav";
//String wavFile2 = "C:\\Users\\DC3\\Desktop\\waver\\Battlefield.wav";
while(count2 != counter){
try {
AudioInputStream clip1 = AudioSystem.getAudioInputStream(new File(wavFile1));
int bytesPerFrame = clip1.getFormat().getFrameSize();
//System.out.println(bytesPerFrame);
// int numBytes = safeLongToInt(clip1.getFrameLength()) * bytesPerFrame;
// byte[] audioBytes = new byte[numBytes];
// int numBytesRead = 0;
// int numFramesRead = 0;
// // Try to read numBytes bytes from the file.
// while ((numBytesRead =
// clip1.read(audioBytes)) != -1) {
// // Calculate the number of frames actually read.
// clip1.read(audioBytes)
// numFramesRead = numBytesRead / bytesPerFrame;
// totalFramesRead += numFramesRead;
// System.out.println(totalFramesRead);
// }
long lengthofclip = Integer.parseInt(time.get(count2))- silence;
globallength = clip1.getFrameLength();
AudioInputStream appendedFiles = new AudioInputStream(clip1, clip1.getFormat(), lengthofclip);
//long test = (appendedFiles.getFrameLength() *24 *2)/8;
//int aaaaa = safeLongToInt(test);
//appendedFiles.mark(aaaaa);
AudioSystem.write(appendedFiles,
AudioFileFormat.Type.WAVE,
new File("C:\\Users\\DC3\\Desktop\\wav&text\\" + name.get(count2)));
count2++;
} catch (Exception e) {
e.printStackTrace();
}
}
}
public static int safeLongToInt(long l) {
if (l < Integer.MIN_VALUE || l > Integer.MAX_VALUE) {
throw new IllegalArgumentException
(l + " cannot be cast to int without changing its value.");
}
return (int) l;
}
Just a thought at first glance but I'm assuming it's this line giving trouble:
AudioInputStream clip1 = AudioSystem.getAudioInputStream(new File(wavFile1));
Take that and put it outside of your while loop so it doesn't get recreated every cycle. Like so:
//...
String wavFile1 = "C:\\Users\\DC3\\Desktop\\wav&text\\testwav.wav";
AudioInputStream clip1 = AudioSystem.getAudioInputStream(new File(wavFile1));
int bytesPerFrame = clip1.getFormat().getFrameSize();
while(count2 != counter){
try {
//...
This also assumes that your algorithm is correct, which I'm not going to waste time thinking about because you didn't ask that question :-D
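To illustrate why opening the stream once matters, here is a minimal in-memory sketch: each length-limited AudioInputStream wrapped around the same source continues where the previous one stopped. The class, the helper names and the synthetic audio data are made up for the example; it reads bytes directly instead of calling AudioSystem.write so that it runs without any files.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class SequentialChunks {
    // Reads the next `frames` frames from source through a length-limited wrapper.
    // The wrapper is deliberately not closed, since closing it would close source too.
    static byte[] readChunk(AudioInputStream source, long frames) {
        AudioInputStream chunk = new AudioInputStream(source, source.getFormat(), frames);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            int n;
            while ((n = chunk.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Builds a fake 16-bit mono stream whose sample i has the value i.
    static AudioInputStream syntheticSource(int frames) {
        byte[] data = new byte[frames * 2];
        for (int i = 0; i < frames; i++) {
            data[2 * i] = (byte) i;            // low byte (little-endian)
            data[2 * i + 1] = (byte) (i >> 8); // high byte
        }
        AudioFormat format = new AudioFormat(44100f, 16, 1, true, false);
        return new AudioInputStream(new ByteArrayInputStream(data), format, frames);
    }

    public static void main(String[] args) {
        AudioInputStream source = syntheticSource(10);
        byte[] first = readChunk(source, 4);
        byte[] second = readChunk(source, 6);
        System.out.println(first.length + " bytes, then " + second.length + " bytes");
        System.out.println("second chunk starts at sample " + second[0]);
    }
}
```

If the source were reopened before each chunk, every chunk would start at sample 0 again, which matches the symptom described in the question.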

Audio Mixing with Java (without Mixer API)

I am attempting to mix several different audio streams and trying to get them to play at the same time instead of one-at-a-time.
The code below plays them one-at-a-time and I cannot figure out a solution that does not use the Java Mixer API. Unfortunately, my audio card does not support synchronization using the Mixer API and I am forced to figure out a way to do it through code.
Please advise.
/////CODE IS BELOW////
class MixerProgram {
public static AudioFormat monoFormat;
private JFileChooser fileChooser = new JFileChooser();
private static File[] files;
private int trackCount;
private FileInputStream[] fileStreams = new FileInputStream[trackCount];
public static AudioInputStream[] audioInputStream;
private Thread trackThread[] = new Thread[trackCount];
private static DataLine.Info sourceDataLineInfo = null;
private static SourceDataLine[] sourceLine;
public MixerProgram(String[] s)
{
trackCount = s.length;
sourceLine = new SourceDataLine[trackCount];
audioInputStream = new AudioInputStream[trackCount];
files = new File[s.length];
}
public static void getFiles(String[] s)
{
files = new File[s.length];
for(int i=0; i<s.length;i++)
{
File f = new File(s[i]);
if (!f.exists())
System.err.println("Wave file not found: " + s[i]);
files[i] = f;
}
}
public static void loadAudioFiles(String[] s)
{
AudioInputStream in = null;
audioInputStream = new AudioInputStream[s.length];
sourceLine = new SourceDataLine[s.length];
for(int i=0;i<s.length;i++){
try
{
in = AudioSystem.getAudioInputStream(files[i]);
}
catch(Exception e)
{
System.err.println("Failed to assign audioInputStream");
}
monoFormat = in.getFormat();
AudioFormat decodedFormat = new AudioFormat(
AudioFormat.Encoding.PCM_SIGNED,
monoFormat.getSampleRate(), 16, monoFormat.getChannels(),
monoFormat.getChannels() * 2, monoFormat.getSampleRate(),
false);
monoFormat = decodedFormat; //give back name
audioInputStream[i] = AudioSystem.getAudioInputStream(decodedFormat, in);
sourceDataLineInfo = new DataLine.Info(SourceDataLine.class, monoFormat);
try
{
sourceLine[i] = (SourceDataLine) AudioSystem.getLine(sourceDataLineInfo);
sourceLine[i].open(monoFormat);
}
catch(LineUnavailableException e)
{
System.err.println("Failed to get SourceDataLine" + e);
}
}
}
public static void playAudioMix(String[] s)
{
final int tracks = s.length;
System.out.println(tracks);
Runnable playAudioMixRunner = new Runnable()
{
int bufferSize = (int) monoFormat.getSampleRate() * monoFormat.getFrameSize();
byte[] buffer = new byte[bufferSize];
public void run()
{
if(tracks==0)
return;
for(int i = 0; i < tracks; i++)
{
sourceLine[i].start();
}
int bytesRead = 0;
while(bytesRead != -1)
{
for(int i = 0; i < tracks; i++)
{
try
{
bytesRead = audioInputStream[i].read(buffer, 0, buffer.length);
}
catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
if(bytesRead >= 0)
{
int bytesWritten = sourceLine[i].write(buffer, 0, bytesRead);
System.out.println(bytesWritten);
}
}
}
}
};
Thread playThread = new Thread(playAudioMixRunner);
playThread.start();
}
}
The problem is that you are not adding the samples together. If we are looking at 4 tracks, 16-bit PCM data, you need to add all the different values together to "mix" them into one final output. So, from a purely-numbers point-of-view, it would look like this:
[Track1]   320    -16   2000    200    400
[Track2]    16      8    123    -87     91
[Track3]   -16    -34   -356   1200    805
[Track4]  1011   1230  -1230   -100     19
[Final!]  1331   1188    537   1213   1315
In your above code, you should only be writing a single byte array. That byte array is the final mix of all tracks added together. The problem is that you are writing a byte array for each different track (so there is no mixdown happening, as you observed).
If you want to guarantee you don't have any "clipping", you should take the average of all tracks (so add all four tracks above and divide by 4). However, there are artifacts from choosing that approach (for example, if you have silence on three tracks and one loud track, the final output will be much quieter than the one track that is not silent). There are more complicated algorithms you can use to do the mixing, but by then you are writing your own mixer :P.
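As a rough sketch of the summing approach, the method below adds already-decoded 16-bit samples track by track and clamps the result to the 16-bit range instead of averaging. The class and method names are hypothetical, and converting the byte buffers from each AudioInputStream into short arrays (and back into one byte array for a single SourceDataLine) is left out.

```java
class SimpleMixer {
    // Sums the tracks sample by sample; shorter tracks count as silence once they end.
    static short[] mix(short[][] tracks) {
        int length = 0;
        for (short[] t : tracks) {
            length = Math.max(length, t.length);
        }
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            int sum = 0; // int accumulator so the addition itself cannot overflow
            for (short[] t : tracks) {
                if (i < t.length) {
                    sum += t[i];
                }
            }
            // Hard clamp to the 16-bit range; this clips loud passages rather than wrapping around.
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return out;
    }

    public static void main(String[] args) {
        short[][] tracks = {
            {320, -16, 2000, 200, 400},
            {16, 8, 123, -87, 91},
            {-16, -34, -356, 1200, 805},
            {1011, 1230, -1230, -100, 19},
        };
        System.out.println(java.util.Arrays.toString(mix(tracks)));
    }
}
```

Clamping keeps quiet mixes at full volume at the cost of distortion when the sum exceeds the range; dividing the sum by the track count is the averaging alternative described above.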
