Audio Mixing with Java (without Mixer API)

I am attempting to mix several different audio streams and trying to get them to play at the same time instead of one-at-a-time.
The code below plays them one-at-a-time and I cannot figure out a solution that does not use the Java Mixer API. Unfortunately, my audio card does not support synchronization using the Mixer API and I am forced to figure out a way to do it through code.
Please advise.
/////CODE IS BELOW////
class MixerProgram {
public static AudioFormat monoFormat;
private JFileChooser fileChooser = new JFileChooser();
private static File[] files;
private int trackCount;
private FileInputStream[] fileStreams = new FileInputStream[trackCount];
public static AudioInputStream[] audioInputStream;
private Thread trackThread[] = new Thread[trackCount];
private static DataLine.Info sourceDataLineInfo = null;
private static SourceDataLine[] sourceLine;
public MixerProgram(String[] s)
{
trackCount = s.length;
sourceLine = new SourceDataLine[trackCount];
audioInputStream = new AudioInputStream[trackCount];
files = new File[s.length];
}
public static void getFiles(String[] s)
{
files = new File[s.length];
for(int i=0; i<s.length;i++)
{
File f = new File(s[i]);
if (!f.exists())
System.err.println("Wave file not found: " + s[i]);
files[i] = f;
}
}
public static void loadAudioFiles(String[] s)
{
AudioInputStream in = null;
audioInputStream = new AudioInputStream[s.length];
sourceLine = new SourceDataLine[s.length];
for(int i=0;i<s.length;i++){
try
{
in = AudioSystem.getAudioInputStream(files[i]);
}
catch(Exception e)
{
System.err.println("Failed to assign audioInputStream");
}
monoFormat = in.getFormat();
AudioFormat decodedFormat = new AudioFormat(
AudioFormat.Encoding.PCM_SIGNED,
monoFormat.getSampleRate(), 16, monoFormat.getChannels(),
monoFormat.getChannels() * 2, monoFormat.getSampleRate(),
false);
monoFormat = decodedFormat; //give back name
audioInputStream[i] = AudioSystem.getAudioInputStream(decodedFormat, in);
sourceDataLineInfo = new DataLine.Info(SourceDataLine.class, monoFormat);
try
{
sourceLine[i] = (SourceDataLine) AudioSystem.getLine(sourceDataLineInfo);
sourceLine[i].open(monoFormat);
}
catch(LineUnavailableException e)
{
System.err.println("Failed to get SourceDataLine" + e);
}
}
}
public static void playAudioMix(String[] s)
{
final int tracks = s.length;
System.out.println(tracks);
Runnable playAudioMixRunner = new Runnable()
{
int bufferSize = (int) monoFormat.getSampleRate() * monoFormat.getFrameSize();
byte[] buffer = new byte[bufferSize];
public void run()
{
if(tracks==0)
return;
for(int i = 0; i < tracks; i++)
{
sourceLine[i].start();
}
int bytesRead = 0;
while(bytesRead != -1)
{
for(int i = 0; i < tracks; i++)
{
try
{
bytesRead = audioInputStream[i].read(buffer, 0, buffer.length);
}
catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
if(bytesRead >= 0)
{
int bytesWritten = sourceLine[i].write(buffer, 0, bytesRead);
System.out.println(bytesWritten);
}
}
}
}
};
Thread playThread = new Thread(playAudioMixRunner);
playThread.start();
}
}

The problem is that you are not adding the samples together. If we are looking at 4 tracks, 16-bit PCM data, you need to add all the different values together to "mix" them into one final output. So, from a purely-numbers point-of-view, it would look like this:
[Track1] 320 -16 2000 200 400
[Track2] 16 8 123 -87 91
[Track3] -16 -34 -356 1200 805
[Track4] 1011 1230 -1230 -100 19
[Final!] 1331 1188 537 1213 1315
In your above code, you should only be writing a single byte array. That byte array is the final mix of all tracks added together. The problem is that you are writing a byte array for each different track (so there is no mixdown happening, as you observed).
If you want to guarantee you don't have any "clipping", you should take the average of all tracks (so add all four tracks above and divide by 4). However, there are artifacts from choosing that approach (for example, if you have silence on three tracks and one loud track, the final output will be much quieter than the volume of the one track that is not silent). There are more complicated algorithms you can use to do the mixing, but by then you are writing your own mixer :P.
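As a minimal sketch of the summing described above: assuming every track has already been decoded to 16-bit signed little-endian PCM with the same sample rate and channel count (as the question's decodedFormat does), the helper below adds the samples of several byte buffers and clamps each sum to the 16-bit range instead of averaging. The names PcmMixer and mix are hypothetical and not part of Java Sound.

```java
public final class PcmMixer {
    private PcmMixer() {}

    /**
     * Sums 16-bit signed little-endian PCM buffers sample by sample.
     * Buffers shorter than the longest one contribute silence past their end.
     * Sums outside the 16-bit range are clamped (hard clipping).
     */
    public static byte[] mix(byte[][] tracks) {
        int length = 0;
        for (byte[] track : tracks) {
            // Only count whole 2-byte samples.
            length = Math.max(length, track.length - (track.length % 2));
        }
        byte[] out = new byte[length];
        for (int i = 0; i < length; i += 2) {
            int sum = 0;
            for (byte[] track : tracks) {
                if (i + 1 < track.length) {
                    // Low byte first; the high byte carries the sign.
                    sum += (short) ((track[i + 1] << 8) | (track[i] & 0xff));
                }
            }
            sum = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
            out[i] = (byte) sum;
            out[i + 1] = (byte) (sum >> 8);
        }
        return out;
    }
}
```

In the question's playback loop, this would mean reading one buffer from each audioInputStream, passing them to mix, and writing only the returned array to a single SourceDataLine.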

Related

how to get the decibel of byte audio data

I have been working on a Java program that captures microphone audio as byte data and then sends it somewhere else (another part of my program). Is there any way I can calculate the decibel value of the data?
I am using a TargetDataLine. In each iteration I save the captured data to a tempData holder, which I then write into a ByteArrayOutputStream, and on each iteration I am trying to calculate the decibel value of tempData.
Keep in mind that I don't really understand much about sound in computers, or in Java in general, so please forgive my lack of knowledge.
This is class 1, "Foo", which handles when to stop the capturing:
public class Foo {
public static void foo() {
AudioFormat format = new AudioFormat(8000.0f, 16, 1, true, true);
try (
var microphone = (TargetDataLine) AudioSystem.getLine(new DataLine.Info(TargetDataLine.class, format))
) {
var micListener = new MicListener(microphone);
ByteArrayOutputStream allData = new ByteArrayOutputStream();
byte[] tempData;
final int chunkSize = 1024;
while (true) {
// in this case the loop goes forever, but in my program it stops when the user stops capturing audio.
tempData = micListener.startRecording(chunkSize);
//calculate the decibel value of tempData; Utils.calculateDecibel(tempData)
//if decibel is high then do stuff
if (decibel > 50)
allData.write(tempData , 0 , micListener.getNumOfBytesRead());
}
} catch (LineUnavailableException e) {
e.printStackTrace();
}
}
}
This is class 2, "MicListener", which handles the capture of data:
public class MicListener {
private final TargetDataLine target;
private byte[] audioData;
private int numOfBytesRead = 0;
public MicListener(TargetDataLine target){
this.target = target;
audioData = new byte[target.getBufferSize() / 5];
}
public byte[] startRecording(int chunkSize) throws LineUnavailableException {
numOfBytesRead = target.read(audioData , 0 , chunkSize);
return audioData;
}
public int getNumOfBytesRead() {
return numOfBytesRead;
}
}
thanks for the help! have a great day
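One common way to get a level from raw PCM is to compute the RMS of the samples and express it in decibels relative to full scale (dBFS). The sketch below assumes the format used in the question (16-bit, signed, big-endian, mono). The names Levels and rmsDbfs are hypothetical. Note that dBFS values are 0 at most and negative for anything quieter, so a threshold such as "decibel > 50" would have to be replaced by something like "level > -30"; this is not a calibrated sound pressure level.

```java
public final class Levels {
    private Levels() {}

    /**
     * Returns the RMS level, in dBFS, of the first {@code length} bytes of
     * 16-bit signed big-endian PCM data. Pure silence (or no samples)
     * yields Double.NEGATIVE_INFINITY.
     */
    public static double rmsDbfs(byte[] data, int length) {
        int samples = length / 2;
        if (samples == 0) {
            return Double.NEGATIVE_INFINITY;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            // Big-endian: high byte first, low byte second.
            int sample = (short) ((data[2 * i] << 8) | (data[2 * i + 1] & 0xff));
            double normalized = sample / 32768.0; // scale to [-1.0, 1.0)
            sumSquares += normalized * normalized;
        }
        double rms = Math.sqrt(sumSquares / samples);
        return 20.0 * Math.log10(rms);
    }
}
```

In the capture loop this could be called as Levels.rmsDbfs(tempData, micListener.getNumOfBytesRead()), so that only the bytes actually read are measured.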

how to use javax.sound.sampled.LineListener?

I have a program that asks the user which song they want to play from a list of available songs. After the user selects one and the song finishes, it asks the user again which song they want to play. I have been told to use a LineListener for this, but I can't figure out how, even after reading the Oracle docs.
My code:
public class Main {
public static void main(String[] args) {
Scanner input = new Scanner(System.in);
String[] pathnames;
File MusicFileChosen;
String musicDir;
boolean songComplete = false;
pathnames = ProgramMap.musicDir.list();
// Print the names of files and directories
for (int ListNum = 0; ListNum < pathnames.length; ListNum++) {
System.out.println(ListNum + 1 + ". " + pathnames[ListNum]);
}
for (int playlistLength = 0; playlistLength < pathnames.length; playlistLength++){
if (!songComplete) {
System.out.println("Which Song would you like to play?");
int musicChoice = input.nextInt();
musicDir = ProgramMap.userDir + "\\src\\Music\\" + pathnames[musicChoice - 1];
MusicFileChosen = new File(musicDir);
PlaySound(MusicFileChosen, pathnames[musicChoice - 1]);
}
}
}
public static void PlaySound(File sound, String FileName){
try{
// Inits the Audio System
Clip clip = AudioSystem.getClip();
AudioInputStream AudioInput = AudioSystem.getAudioInputStream(sound);
//Finds and accesses the clip
clip.open(AudioInput);
//Starts the clip
clip.start();
System.out.println("Now Playing " + FileName);
clip.drain();
}catch (Exception e){
System.out.println("Error playing music");
}
}
}
Basically one thing which you need to change is to replace this:
for (int playlistLength = 0; playlistLength < pathnames.length; playlistLength++){
to something like:
while (true) {
System.out.println("Which Song would you like to play?");
int musicChoice = input.nextInt();
musicDir = ProgramMap.userDir + "\\src\\Music\\" + pathnames[musicChoice - 1];
MusicFileChosen = new File(musicDir);
PlaySound(MusicFileChosen, pathnames[musicChoice - 1]);
}
You can add some logic to break the loop.
I would also recommend changing the PlaySound method slightly:
public static void PlaySound(File sound, String FileName) {
try (final AudioInputStream in = getAudioInputStream(sound)) {
final AudioFormat outFormat = getOutFormat(in.getFormat());
Info info = new Info(SourceDataLine.class, outFormat);
try (final SourceDataLine line =
(SourceDataLine) AudioSystem.getLine(info)) {
if (line != null) {
line.open(outFormat);
line.start();
System.out.println("Now Playing " + FileName);
stream(getAudioInputStream(outFormat, in), line);
line.drain();
line.stop();
}
}
} catch (UnsupportedAudioFileException
| LineUnavailableException
| IOException e) {
System.err.println("Error playing music\n" + e.getMessage());
}
}
private static AudioFormat getOutFormat(AudioFormat inFormat) {
final int ch = inFormat.getChannels();
final float rate = inFormat.getSampleRate();
return new AudioFormat(PCM_SIGNED, rate, 16, ch, ch * 2, rate, false);
}
private static void stream(AudioInputStream in, SourceDataLine line)
throws IOException {
final byte[] buffer = new byte[4096];
for (int n = 0; n != -1; n = in.read(buffer, 0, buffer.length)) {
line.write(buffer, 0, n);
}
}
This change is needed if you want to play MP3 files, because otherwise you can run into an error such as:
Unknown frame size.
To add MP3 reading support to Java Sound, add the mp3plugin.jar of the JMF to the run-time classpath of the application. https://www.oracle.com/technetwork/java/javase/download-137625.html
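Since the question asks about LineListener specifically, here is a sketch of how a listener could be combined with the original Clip-based approach so that the caller blocks until playback has finished. It assumes the file is in a format Clip can open directly (for example WAV). The names ClipPlayer, stopSignal and playAndWait are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Clip;
import javax.sound.sampled.LineEvent;
import javax.sound.sampled.LineListener;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.UnsupportedAudioFileException;

public final class ClipPlayer {
    private ClipPlayer() {}

    /** Returns a listener that releases the latch when the line reports STOP. */
    public static LineListener stopSignal(CountDownLatch finished) {
        return event -> {
            if (event.getType() == LineEvent.Type.STOP) {
                finished.countDown();
            }
        };
    }

    /** Plays the file and blocks the calling thread until the clip has stopped. */
    public static void playAndWait(File sound)
            throws UnsupportedAudioFileException, IOException,
                   LineUnavailableException, InterruptedException {
        CountDownLatch finished = new CountDownLatch(1);
        try (AudioInputStream in = AudioSystem.getAudioInputStream(sound);
             Clip clip = AudioSystem.getClip()) {
            clip.addLineListener(stopSignal(finished));
            clip.open(in);
            clip.start();
            finished.await(); // released by the STOP event
        }
    }
}
```

With something like this in place, the while loop can call ClipPlayer.playAndWait(MusicFileChosen) and only prompt for the next song after it returns.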

MediaRecorder record audio in a loop

I'm developing a sound recognition system. I'm using a tensorflow model developed on python to convert MFCC values to labels. I'm using the MediaRecorder class to record the audio, and I'm doing it in a loop so I can be constantly getting microphone audio and then getting the label from the model. Here is the recording loop:
temp = 0;
while (true) {
audioPath = getApplicationContext().getFilesDir().getAbsolutePath();
audioPath += "/Recording" + temp + ".3gp";
audioFile = new File(audioPath);
mediaRecorder = new MediaRecorder();
mediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);
mediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
mediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
mediaRecorder.setOutputFile(audioPath);
try {
mediaRecorder.prepare();
} catch (IOException e) {
e.printStackTrace();
}
mediaRecorder.start();
sleep(2000);
if (!isRunning) {
mediaRecorder.stop();
return;
}
try {
int amplitude = mediaRecorder.getMaxAmplitude();
Log.d("volume", Integer.toString(amplitude));
//finished = false;
avgVolumeTask task = new avgVolumeTask();
task.execute(amplitude);
} catch (Exception e) {
Log.d("Exception in startMediaRecorder()", e.toString());
}
mediaRecorder.stop();
mediaRecorder.release();
soundRecognition task2 = new soundRecognition();
task2.execute();
audioFile.delete();
temp++;
}
This is the soundRecognition method:
private class soundRecognition extends AsyncTask<Integer, Integer, Long> {
@Override
protected Long doInBackground(Integer... level) {
float[] mfccValues = null;
Interpreter tflite = null;
float[][] labelProbArray = null;
try {
mfccValues = computeMFCC();
labelList = loadLabelList();
labelProbArray = new float[1][labelList.size()];
tflite = new Interpreter(loadModel());
} catch (IOException e) {
e.printStackTrace();
} catch (UnsupportedAudioFileException e) {
e.printStackTrace();
}
tflite.run(mfccValues, labelProbArray);
for (int i = 0; i < labelProbArray[0].length; i++) {
float value = labelProbArray[0][i];
//if (i == 1f){
//Log.d("Output at " + Integer.toString(i) + ": ", Float.toString(value));
//doAlert(i);
//}
}
return null;
}
}
The computeMFCC method is this:
public float[] computeMFCC() throws IOException, UnsupportedAudioFileException {
FileInputStream in2 = new FileInputStream(audioPath);
int i;
// InputStream to byte array
byte[] buf = IOUtils.toByteArray(in2);
in2.close();
i = Integer.MAX_VALUE;
// byte array to short array
short[] shortArr = new short[buf.length / 2];
ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shortArr);
int count = 0;
while (count <= shortArr.length) { // Still have data to process.
for (int n = 0; n < nSubframePerBuf; n++) { // Process audio signal in ArrayList and shift by one subframe each time
int k = 0;
for (i = (n * frameShift); i < (n + 1) * frameShift; i++) {
subx[k] = shortArr[i];
k++;
}
subframeList.add(subx); // Add the current subframe to the subframe list. Later, a number of
}
count++;
}
// Need at least nSubframePerMfccFrame to get one analysis frame
x = extractOneFrameFromList(nSubframePerMfccFrame);
MFCC mfcc = new MFCC(samplePerFrm, 16000, numMfcc);
double[] mfccVals = mfcc.doMFCC(x);
float[] floatArray = new float[mfccVals.length];
for (i = 0 ; i < mfccVals.length; i++)
{
floatArray[i] = (float) mfccVals[i];
}
return floatArray;
}
And the doMFCC method is from a downloaded java file here:
https://github.com/enmwmak/ScreamDetector/blob/master/src/edu/polyu/mfcc/MFCC.java
The issue I'm having is that after a few iterations the file doesn't get created, and I then get a null error when passing the results from the input stream to the tensorflow model.
Possible Issues
One reason could be where the file is stored. I've been trying to send the file to local storage because I was worried that not all devices would have external storage.
Another reason could be that I'm not calling the sound recognition in the right spot. I waited until after the mediaRecorder is stopped to make sure that the file is written with the mic audio, but when I review the contents of the FileInputStream, it appears not to be working, and in each loop the file is always the same.
Any help would be much appreciated.
Having a sleep(2000) inside the while loop may be tricky.
It may be better to check the elapsed milliseconds and break once 2000 ms have elapsed.
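One possible reading of this suggestion, sketched in plain Java: instead of a single long sleep, wait in short slices against a deadline and check a stop condition in between, so the loop can react before the full 2000 ms have passed. The names TimedWait and waitOrStop, and the 10 ms slice, are assumptions for illustration only.

```java
import java.util.function.BooleanSupplier;

public final class TimedWait {
    private TimedWait() {}

    /**
     * Waits until durationMillis have elapsed, checking stopRequested
     * roughly every 10 ms. Returns true if the full duration elapsed,
     * false if the stop condition became true first.
     */
    public static boolean waitOrStop(long durationMillis, BooleanSupplier stopRequested)
            throws InterruptedException {
        long deadline = System.nanoTime() + durationMillis * 1_000_000L;
        while (System.nanoTime() - deadline < 0) {
            if (stopRequested.getAsBoolean()) {
                return false;
            }
            Thread.sleep(10);
        }
        return true;
    }
}
```

In the recording loop, sleep(2000) could then become something like TimedWait.waitOrStop(2000, () -> !isRunning), with the recorder stopped and released as soon as it returns false.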

Sound class sounds layered and screechy on Windows

On Mac, this problem does not occur. On Windows, however, any sounds I play multiple times over each other start to sound screechy and layer over each other in an unpleasant way.
Here is relevant code from my Sound class:
public class NewerSound {
private boolean stop = true;
private boolean loopable;
private boolean isUrl;
private URL fileUrl;
private Thread sound;
private double volume = 1.0;
public NewerSound(URL url, boolean loopable) throws UnsupportedAudioFileException, IOException {
isUrl = true;
fileUrl = url;
this.loopable = loopable;
}
public void play() {
stop = false;
Runnable r = new Runnable() {
@Override
public void run() {
do {
try {
AudioInputStream in;
if(!isUrl)
in = getAudioInputStream(new File(fileName));
else
in = getAudioInputStream(fileUrl);
final AudioFormat outFormat = getOutFormat(in.getFormat());
final Info info = new Info(SourceDataLine.class, outFormat);
try(final SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info)) {
if(line != null) {
line.open(outFormat);
line.start();
AudioInputStream inputMystream = AudioSystem.getAudioInputStream(outFormat, in);
stream(inputMystream, line);
line.drain();
line.stop();
}
}
}
catch(UnsupportedAudioFileException | LineUnavailableException | IOException e) {
throw new IllegalStateException(e);
}
} while(loopable && !stop);
}
};
sound = new Thread(r);
sound.start();
}
private AudioFormat getOutFormat(AudioFormat inFormat) {
final int ch = inFormat.getChannels();
final float rate = inFormat.getSampleRate();
return new AudioFormat(PCM_SIGNED, rate, 16, ch, ch * 2, rate, false);
}
private void stream(AudioInputStream in, SourceDataLine line) throws IOException {
byte[] buffer = new byte[4];
for(int n = 0; n != -1 && !stop; n = in.read(buffer, 0, buffer.length)) {
byte[] bufferTemp = new byte[buffer.length];
for(int i = 0; i < bufferTemp.length; i += 2) {
short audioSample = (short) ((short) ((buffer[i + 1] & 0xff) << 8) | (buffer[i] & 0xff));
audioSample = (short) (audioSample * volume);
bufferTemp[i] = (byte) audioSample;
bufferTemp[i + 1] = (byte) (audioSample >> 8);
}
buffer = bufferTemp;
line.write(buffer, 0, n);
}
}
}
It is possible that it could be an issue of accessing the same resources when playing the same sound multiple times over itself when I use the NewerSound.play() method.
Please let me know if any other details are needed. Much appreciated :)
The method you are using to change the volume in the "stream" method is flawed. You have 16-bit encoding, so it takes two bytes to derive a single audio value. You need to assemble the value from each pair of bytes before the multiplication, then split the 16-bit result back into two bytes. There are a number of StackOverflow threads with code to do this.
I don't know if this is the whole reason for the problem you describe, but it definitely could be, and it definitely needs to be fixed.
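A minimal sketch of the assemble, scale, split sequence described above, assuming 16-bit signed little-endian PCM (the format produced by getOutFormat in the question). It only touches the bytes that were actually read and clamps the scaled value so that volumes above 1.0 do not wrap around. The names Gain and apply are hypothetical.

```java
public final class Gain {
    private Gain() {}

    /**
     * Scales the first {@code length} bytes of 16-bit signed little-endian
     * PCM in place. A trailing odd byte, if any, is left untouched.
     */
    public static void apply(byte[] buffer, int length, double volume) {
        for (int i = 0; i + 1 < length; i += 2) {
            // Assemble one sample: low byte first, high byte carries the sign.
            int sample = (short) ((buffer[i + 1] << 8) | (buffer[i] & 0xff));
            long scaled = Math.round(sample * volume);
            // Clamp to the 16-bit range to avoid wrap-around distortion.
            scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            // Split the result back into two bytes.
            buffer[i] = (byte) scaled;
            buffer[i + 1] = (byte) (scaled >> 8);
        }
    }
}
```

In the question's stream method this would be called as Gain.apply(buffer, n, volume) after each read, ideally with a larger buffer than 4 bytes, before line.write(buffer, 0, n).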

writing double[] as WAV file in Java

I'm trying to save a double[] array as .WAV file using this method:
public static void saveWav(String filename, double[] samples) {
// assumes 44,100 samples per second
// use 16-bit audio, 2 channels, signed PCM, little Endian
AudioFormat format = new AudioFormat(SAMPLE_RATE * 2, 16, 1, true, false);
byte[] data = new byte[2 * samples.length];
for (int i = 0; i < samples.length; i++) {
int temp = (short) (samples[i] * MAX_16_BIT);
data[2*i + 0] = (byte) temp;
data[2*i + 1] = (byte) (temp >> 8);
}
// now save the file
try {
ByteArrayInputStream bais = new ByteArrayInputStream(data);
AudioInputStream ais = new AudioInputStream(bais, format, samples.length);
if (filename.endsWith(".wav") || filename.endsWith(".WAV")) {
AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
}
else if (filename.endsWith(".au") || filename.endsWith(".AU")) {
AudioSystem.write(ais, AudioFileFormat.Type.AU, new File(filename));
}
else {
throw new RuntimeException("File format not supported: " + filename);
}
}
catch (IOException e) {
System.out.println(e);
System.exit(1);
}
}
but when I reload the files I saved, for every song[i], the double value is different than the original. I use this method to read WAV files:
public static double[] read(String filename) {
byte[] data = readByte(filename);
int N = data.length;
double[] d = new double[N/2];
for (int i = 0; i < N/2; i++) {
d[i] = ((short) (((data[2*i+1] & 0xFF) << 8) + (data[2*i] & 0xFF))) / ((double) MAX_16_BIT);
}
return d;
}
private static byte[] readByte(String filename) {
byte[] data = null;
AudioInputStream ais = null;
try {
// try to read from file
File file = new File(filename);
if (file.exists()) {
ais = AudioSystem.getAudioInputStream(file);
data = new byte[ais.available()];
ais.read(data);
}
// try to read from URL
else {
URL url = StdAudio.class.getResource(filename);
ais = AudioSystem.getAudioInputStream(url);
data = new byte[ais.available()];
ais.read(data);
}
}
catch (IOException e) {
System.out.println(e.getMessage());
throw new RuntimeException("Could not read " + filename);
}
catch (UnsupportedAudioFileException e) {
System.out.println(e.getMessage());
throw new RuntimeException(filename + " in unsupported audio format");
}
return data;
}
I need both double[] arrays to have the exact same values, and that's not the case.
When I hear the song playing I can't tell the difference, but I still need those original values.
Any help appreciated.
Guy
A double requires 64 bits of storage and has a lot of precision. You can't just throw away 48 bits of data in the round trip and expect to get the exact same value back. It is analogous to starting with a high resolution image, converting it to a thumbnail and then expecting that you can magically recover the original high resolution image. In the real world, the human ear is not going to be able to distinguish between the two. The higher resolution is useful during computation and signal processing routines to reduce the accumulation of computational errors. That being said, if you want to store 64-bit values you'll need to use something other than .WAV. The closest you'll get is 32-bit.
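The loss can be reproduced without any file I/O. The sketch below applies the same two conversions as saveWav and read to a single value. MAX_16_BIT is not shown in the question, so the value 32767 used here is an assumption, and the names Quantize and roundTrip are hypothetical.

```java
public final class Quantize {
    private Quantize() {}

    // Assumed value; the question does not show how MAX_16_BIT is defined.
    static final double MAX_16_BIT = 32767.0;

    /** Converts a sample to a 16-bit value and back, as saveWav and read do. */
    public static double roundTrip(double sample) {
        short stored = (short) (sample * MAX_16_BIT); // fractional part is truncated here
        return stored / MAX_16_BIT;
    }

    public static void main(String[] args) {
        double original = 0.123456789;
        double restored = roundTrip(original);
        System.out.println("original = " + original);
        System.out.println("restored = " + restored);
        System.out.println("error    = " + Math.abs(original - restored));
    }
}
```

For inputs between -1.0 and 1.0, the error per sample stays below one 16-bit step (1 / 32767 under the assumption above), which is why the audio sounds the same even though the doubles differ.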
