My aim is to take in two channels of audio from a live source, perform some FFT analysis and display realtime graphs of the data.
So far I have researched and gotten to the point where I can create a TargetDataLine from my audio interface with two channels of audio at a specified AudioFormat. I have created a buffer for this stream of bytes; however, I would like to treat each audio channel independently. Do I need to split the stream as it writes to the buffer and have two buffers? Or do I need to split the buffer into different arrays to process?
final AudioFormat format = getFormat();
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
TargetDataLine line = (TargetDataLine) m.getLine(info);
line.open(format);
line.start();
System.out.println("Line Started");
Thread captureThread = new Thread(){
int bufferSize = (int) (format.getSampleRate() * format.getFrameRate() * format.getChannels());
byte buffer[] = new byte[bufferSize / 5];
out = new ByteArrayOutputStream();
while(running) {
int numBytesRead = line.read(buffer, 0, buffer.length);
while (numBytesRead > 0) {
arraytoProcess = buffer;
Thread fftThread;
fftThread = new Thread(){
public void fftrun() {
try {
fftCalc();
} catch (ParseException ex) {
Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
}
};
};
while (numBytesRead == buffer.length){
fftThread.start();
}
}
I am sure I have gone far wrong, but any pointers would help. When I try running this at the moment, the 'fftThread' takes longer to complete than each pass of the buffer, so I get an IllegalThreadStateException (although it is currently getting all of the bytes, both stereo channels, passed to this thread). I have tried the good old search engines, but things aren't overly clear on how to deal with accessing the individual channels of a TargetDataLine.
I'm writing a program for simple voice transmission via UDP. This works fine, until I implement a method to check the average volume level of a sample before it is sent.
Here I have included the Audio class and the Receiver class and cut out some unimportant stuff.
public class Audio extends Thread
{
private AudioFormat defaultAudioFormat = null;
private TargetDataLine inputLine = null;
private byte[] sample = new byte[1024];
private SourceDataLine speakers = null;
private int voiceActivationLevel = 35;
private boolean running = true;
private VOIPSocket sender = null;
public Audio(int voiceActivationLevel, VOIPSocket socket) throws LineUnavailableException
{
this.voiceActivationLevel = voiceActivationLevel;
this.defaultAudioFormat = new AudioFormat(8000f, 16, 1, true, true);
DataLine.Info info = new DataLine.Info(TargetDataLine.class, this.defaultAudioFormat);
this.inputLine = (TargetDataLine) AudioSystem.getLine(info);
this.inputLine.open(defaultAudioFormat);
this.sender = socket;
DataLine.Info dataLineInfo = new DataLine.Info(SourceDataLine.class, this.defaultAudioFormat);
this.speakers = (SourceDataLine) AudioSystem.getLine(dataLineInfo);
this.speakers.open();
this.speakers.start();
}
public void run()
{
DataLine.Info info = new DataLine.Info(TargetDataLine.class, this.defaultAudioFormat);
this.inputLine.start();
while(running)
{
if(AudioSystem.isLineSupported(info))
{
int data = inputLine.read(this.sample, 0, sample.length);
int voiceLevel = calculateVolumeLevel(this.sample);
if(voiceLevel >= this.voiceActivationLevel)
this.sender.sendData(this.sample); //
}
}
}
public int calculateVolumeLevel(byte[] audioData)
{
long l = 0;
for(int i = 0; i < audioData.length; i++)
l = l + audioData [i];
double avg = l / audioData.length;
double sum = 0d;
for(int j = 0; j < audioData.length; j++)
sum += Math.pow(audioData[j] - avg, 2d);
double averageMeanSquare = sum / audioData.length;
return (int)(Math.pow(averageMeanSquare, 0.5d) + 0.5);
}
public void playSound(byte[] data)
{
synchronized(this.speakers) {
this.speakers.write(data, 0, data.length);
}
}
}
Note that calculateVolumeLevel does NOT modify audioData; it just returns an average volume level as an integer.
public class Receiver extends Thread
{
private VOIPSocket socket = null; //Just a simple class with some helper functions for the socket, not important for this
private byte[] buffer = new byte[4096];
private boolean isRunning = true;
private Audio audio = null;
public Receiver(VOIPSocket socket, Audio audio) throws SocketException
{
this.socket = socket;
this.audio = audio;
}
public void run()
{
DatagramPacket packet = new DatagramPacket(this.buffer, this.buffer.length);
while(isRunning)
{
if(!socket.getSocket().isClosed())
{
try {
socket.getSocket().receive(packet);
byte[] data = packet.getData();
this.audio.playSound(data);
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
}
As soon as I include the check for the volume level, the sound stutters and repeats over and over, with other glitches, until I flush or drain the speakers DataLine.
The data transmission via UDP is working correctly and needs no further investigation in my opinion.
My read is that as soon as I implement the volume check, the byte data is somehow corrupted or important parts of sample[] are not transmitted. This somehow puts errors on the speakers DataLine.
I don't know how to solve this. Any ideas?
edit:
According to https://docs.oracle.com/javase/10/troubleshoot/java-sound.htm#JSTGD490, I guess some over- or underrun condition is occurring.
If I don't have the volume check enabled, a continuous data stream is provided to the speakers DataLine. If it is enabled, this stream is of course interrupted most of the time, leading to corrupted data in either the input DataLine or the speakers DataLine (or both).
This can be worked around by flushing and closing the DataLines and then opening them again. Unfortunately that is not a suitable solution, as flushing can sometimes take up to 1 second during which no data can be played, which is not acceptable, since I would need to flush very often (essentially every time there is silence).
Any ideas on this?
I would consider buffering the volume-checking. For example, I wrote a line that takes input from a live microphone's TargetDataLine, converts it to PCM and passes the data on to an input track of an audio mixing tool. The key to getting this to work in my situation was providing an intermediate, concurrent-safe FIFO queue. I used a ConcurrentLinkedQueue<float[]>, with the float[] arrays holding signed, normalized float "packets" of PCM.
A more experienced coder than myself used to continuously harp on the following general principle in working with audio: avoid locks. I haven't read through the code you provided thoroughly enough to know if this is arising in your case, but seeing the keyword synchronized in the playSound method reminds me of his advice.
The only blocking when working with audio, I think, should be that which is built into the blocking queues employed by TargetDataLine or SourceDataLine read and write methods. I blindly pass on this advice and encourage you to inspect your code for blocking and find alternatives in any situation in your code where a block might occur.
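To illustrate the idea, here is a minimal sketch (my rough adaptation for your Audio class, not the code from my own project; the packet size and the busy-wait loop are arbitrary choices): the capture thread only reads and enqueues, while a separate consumer does the volume check and the send.
// Inside the Audio class from the question; needs: import java.util.concurrent.ConcurrentLinkedQueue;
private final ConcurrentLinkedQueue<byte[]> fifo = new ConcurrentLinkedQueue<>();

private void startCaptureAndSend() {
    // Capture thread: only reads from the line and enqueues copies of the data.
    Thread capture = new Thread(() -> {
        byte[] sample = new byte[1024];
        while (running) {
            int n = inputLine.read(sample, 0, sample.length); // blocking read
            if (n > 0) {
                byte[] copy = new byte[n];
                System.arraycopy(sample, 0, copy, 0, n);
                fifo.add(copy); // lock-free enqueue, no synchronized needed
            }
        }
    });

    // Consumer thread: the volume check and the UDP send happen off the capture thread,
    // so the TargetDataLine is drained at a steady rate.
    Thread consumer = new Thread(() -> {
        while (running) {
            byte[] packet = fifo.poll();
            if (packet == null) {
                continue; // queue empty; a short sleep here would reduce busy-waiting
            }
            if (calculateVolumeLevel(packet) >= voiceActivationLevel) {
                sender.sendData(packet);
            }
        }
    });

    capture.start();
    consumer.start();
}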
I'm writing a function to capture an audio clip for ~ 7.5 seconds using a TargetDataLine. The code executes and renders an 'input.wav' file, but when I play it there is no sound.
My approach, as shown in the code at the bottom of this post, is to do the following things:
Create an AudioFormat and get the Info for a Target Data Line.
Create the Target Data Line by getting the line from AudioSystem.
Open and Start the TargetDataLine, which allocates system resources for recording.
Create an auxiliary Thread that will record audio by writing to a file.
Start the auxiliary Thread, pause the main Thread in the meantime, and then close out the Target Data Line in order to stop recording.
What I have tried so far:
Changing the AudioFormat. Initially, I was using the other AudioFormat constructor which takes the encoding as well (where the first argument is AudioFormat.Encoding.PCM_SIGNED etc). I had a sample rate of 44100, 16 bits, 2 channels and little-endian settings on the other format, which yielded the same result.
Changing the order of commands on my auxiliary and main Thread (i.e. performing TLine.open() or start() in alternate locations).
Checking that my auxiliary thread does actually start.
For reference, I am using IntelliJ on macOS Big Sur.
public static void captureAudio() {
try {
AudioFormat f = new AudioFormat(22050, 8, 1, false, false);
DataLine.Info secure = new DataLine.Info(TargetDataLine.class, f);
if (!AudioSystem.isLineSupported(secure)) {
System.err.println("Unsupported Line");
}
TargetDataLine tLine = (TargetDataLine)AudioSystem.getLine(secure);
System.out.println("Starting recording...");
tLine.open(f);
tLine.start();
File writeTo = new File("input.wav");
Thread t = new Thread(){
public void run() {
try {
AudioInputStream is = new AudioInputStream(tLine);
AudioSystem.write(is, AudioFileFormat.Type.WAVE, writeTo);
} catch(IOException e) {
System.err.println("Encountered system I/O error in recording:");
e.printStackTrace();
}
}
};
t.start();
Thread.sleep(7500);
tLine.stop();
tLine.close();
System.out.println("Recording has ended.");
} catch(Exception e) {
e.printStackTrace();
}
}
Update 1: Some new testing and results
My microphone and speakers are both working with other applications - recorded working audio with QuickTimePlayer.
I did a lot of testing around what my TargetDataLines are and what the deal is with them. I ran the following code:
public static void main(String[] args) {
AudioFormat f = new AudioFormat(48000, 16, 2, true, false);
//DataLine.Info inf = new DataLine.Info(SourceDataLine.class, f);
try {
TargetDataLine line = AudioSystem.getTargetDataLine(f);
DataLine.Info test = new DataLine.Info(TargetDataLine.class, f);
TargetDataLine other = (TargetDataLine)AudioSystem.getLine(test);
String output = line.equals(other) ? "Yes" : "No";
if (output.equals("No")) {
System.out.println(other.toString());
}
System.out.println(line.toString());
System.out.println("_______________________________");
for (Mixer.Info i : AudioSystem.getMixerInfo()) {
Line.Info[] tli = AudioSystem.getMixer(i).getTargetLineInfo();
if (tli.length != 0) {
Line comp = AudioSystem.getLine(tli[0]);
System.out.println(comp.toString() + ":" +i.getName());
if (comp.equals(line) || comp.equals(other)) {
System.out.println("The TargetDataLine is from " + i.getName());
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
Long story short, the TargetDataLine I receive from doing
TargetDataLine line = AudioSystem.getTargetDataLine(f); and
TargetDataLine other = (TargetDataLine)AudioSystem.getLine(new DataLine.Info(TargetDataLine.class, f));
are different, and furthermore, don't match any of the TargetDataLines that are associated with my system's mixers.
The output of the above code was this (where the first two lines are other and line respectively):
com.sun.media.sound.DirectAudioDevice$DirectTDL#cc34f4d
com.sun.media.sound.DirectAudioDevice$DirectTDL#17a7cec2
_______________________________
com.sun.media.sound.PortMixer$PortMixerPort#79fc0f2f:Port MacBook Pro Speakers
com.sun.media.sound.PortMixer$PortMixerPort#4d405ef7:Port ZoomAudioDevice
com.sun.media.sound.DirectAudioDevice$DirectTDL#3f91beef:Default Audio Device
com.sun.media.sound.DirectAudioDevice$DirectTDL#1a6c5a9e:MacBook Pro Microphone
com.sun.media.sound.DirectAudioDevice$DirectTDL#37bba400:ZoomAudioDevice
Upon this realization I manually loaded up all the TargetDataLines from my mixers and tried recording audio with each of them to see if I got any sound.
I used the following method to collect all the TargetDataLines:
public static ArrayList<Line.Info> allTDL() {
ArrayList<Line.Info> all = new ArrayList<>();
for (Mixer.Info i : AudioSystem.getMixerInfo()) {
Line.Info[] tli = AudioSystem.getMixer(i).getTargetLineInfo();
if (tli.length != 0) {
for (int f = 0; f < tli.length; f += 1) {
all.add(tli[f]);
}
}
}
return all;
}
My capture/record audio method remained the same, except for switching the format to AudioFormat f = new AudioFormat(48000, 16, 2, true, false), changing the recording time to 5000 milliseconds, and writing the method header as public static void recordAudio(Line.Info inf) so I could load each TargetDataLine individually with its info.
I then executed the following code to rotate TargetDataLines:
public static void main(String[] args) {
for (Line.Info inf : allTDL()) {
recordAudio(inf);
try {
Thread.sleep(5000);
} catch(Exception e) {
e.printStackTrace();
}
if (!soundless(loadAsBytes("input.wav"))) {
System.out.println("The recording with " + inf.toString() + " has sound!");
} else {
System.out.println("The last recording with " + inf.toString() + " was soundless.");
}
}
}
The output was as such:
Recording...
Was unable to cast com.sun.media.sound.PortMixer$PortMixerPort#506e1b77 to a TargetDataLine.
End recording.
The last recording with SPEAKER target port was soundless.
Recording...
Was unable to cast com.sun.media.sound.PortMixer$PortMixerPort#5e9f23b4 to a TargetDataLine.
End recording.
The last recording with ZoomAudioDevice target port was soundless.
Recording...
End recording.
The last recording with interface TargetDataLine supporting 8 audio formats, and buffers of at least 32 bytes was soundless.
Recording...
End recording.
The last recording with interface TargetDataLine supporting 8 audio formats, and buffers of at least 32 bytes was soundless.
Recording...
End recording.
The last recording with interface TargetDataLine supporting 14 audio formats, and buffers of at least 32 bytes was soundless.
TL;DR the audio came out soundless for every TargetDataLine.
For completeness, here are the soundless and loadAsBytes functions:
public static byte[] loadAsBytes(String name) {
assert name.contains(".wav");
ByteArrayOutputStream out = new ByteArrayOutputStream();
File retrieve = new File("src/"+ name);
try {
InputStream input = AudioSystem.getAudioInputStream(retrieve);
int read;
byte[] b = new byte[1024];
while ((read = input.read(b)) > 0) {
out.write(b, 0, read);
}
out.flush();
byte[] full = out.toByteArray();
return full;
} catch(UnsupportedAudioFileException e) {
System.err.println("The File " + name + " is unsupported on this system.");
e.printStackTrace();
} catch (IOException e) {
System.err.println("Input-Output Exception on retrieval of file " + name);
e.printStackTrace();
}
return null;
}
static boolean soundless(byte[] s) {
if (s == null) {
return true;
}
for (int i = 0; i < s.length; i += 1) {
if (s[i] != 0) {
return false;
}
}
return true;
}
I'm not really sure what the issue could be at this point, save for an operating system quirk that doesn't allow Java to access audio lines, but I do not know how to fix that. Looking at System Preferences, there isn't any obvious way to allow access. I think it might have to be done with terminal commands, but I'm also not sure precisely which commands I'd have to execute.
I'm not seeing anything wrong in the code you are showing. I haven't tried testing it on my system though. (Linux, Eclipse)
It seems to me your code closely matches this tutorial. The author Nam Ha Minh is exceptionally conscientious about answering questions. You might try his exact code example and consult with him if his version also fails for you.
But first, what is the size of the resulting .wav file? Does the file size match the amount of data expected for the duration you are recording? If so, are you sure you have data incoming from your microphone? Nam has another code example where recorded sound is progressively read and placed into memory. Basically, instead of using the AudioInputStream as a parameter to the AudioSystem.write method, you execute multiple read method calls on the AudioInputStream and inspect the incoming data directly. That might be helpful for trouble-shooting whether the problem is occurring on the incoming vs outgoing part of the process.
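As a minimal sketch of that troubleshooting step (my own rough version, not Nam's code; the format and the number of reads are arbitrary assumptions), you can read directly from the line and check whether anything non-zero ever arrives:
import javax.sound.sampled.*;

public class MicProbe {
    public static void main(String[] args) throws Exception {
        AudioFormat f = new AudioFormat(48000, 16, 2, true, false);
        TargetDataLine line = AudioSystem.getTargetDataLine(f);
        line.open(f);
        line.start();

        byte[] buf = new byte[4096];
        boolean sawNonZero = false;
        // Read a couple of seconds' worth of buffers and inspect the bytes directly.
        for (int i = 0; i < 100; i++) {
            int n = line.read(buf, 0, buf.length);
            for (int j = 0; j < n; j++) {
                if (buf[j] != 0) { sawNonZero = true; break; }
            }
        }
        line.stop();
        line.close();
        System.out.println(sawNonZero ? "Got audio data from the line."
                                      : "The line only returned silence.");
    }
}
If this prints the silence message, the problem is on the capture side rather than in the file writing.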
I'm not knowledgeable enough about formats to know if the Mac does things differently. I'm surprised you are setting the format to unsigned. For my limited purposes, I stick with "CD quality stereo" and signed PCM at all junctures.
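For reference, by "CD quality stereo" signed PCM I mean something like the following (just the format I habitually use, not a guaranteed fix for the Mac):
// 44.1 kHz, 16-bit, stereo, signed PCM, little-endian
AudioFormat cdQuality = new AudioFormat(44100f, 16, 2, true, false);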
EDIT: based on feedback, it seems that the problem is that the incoming line is not returning data. From looking at other, similar tutorials, it seems that several people have had the same problem on their Mac systems.
First thing to verify: does your microphone work with other applications?
As far as next steps, I would try verifying the chosen line. The lines that are exposed to java can be enumerated/inspected. The tutorial Accessing Audio System Resources has some basic information on how to do this. It looks like AudioSystem.getMixerInfo() will return a list of available mixers that can be inspected. Maybe AudioSystem.getTargetLineInfo() would be more to the point.
I suppose it is possible that the default Line or Port being used when you obtain a TargetDataLine isn't the one that is running the microphone. If a particular line or port turns out to be the one you need, then it can be specified explicitly via an overridden getTargetDataLine method.
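For example (a sketch of what I have in mind; the mixer name is only a guess based on the output you posted), you can walk the mixers and ask for a TargetDataLine from the one that actually owns the microphone:
import javax.sound.sampled.*;

static TargetDataLine openMicLine(AudioFormat f) throws LineUnavailableException {
    for (Mixer.Info mi : AudioSystem.getMixerInfo()) {
        // Pick the mixer whose name matches the physical microphone.
        if (mi.getName().contains("MacBook Pro Microphone")) {
            // Overload of getTargetDataLine that targets a specific mixer.
            TargetDataLine line = AudioSystem.getTargetDataLine(f, mi);
            line.open(f);
            line.start();
            return line;
        }
    }
    return null; // nothing matched; fall back to the default line
}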
I'm reading that there might be a security policy that needs to be handled. I don't fully understand the code, but if that were the issue, an Exception presumably would have been thrown. Perhaps there are new security measures in macOS to prevent an external program from opening a mic line surreptitiously?
If you do get this solved, be sure and post the answer and mark it solved. This seems to be a live question for many people.
I'm trying to get an audio stream from a text-to-speech interface (MaryTTS) and stream it in an SIP RTP session (using Peers).
Peers wants a SoundSource to stream audio, which is an interface defined as
public interface SoundSource {
byte[] readData();
}
and MaryTTS synthesises a String to an AudioInputStream. I tried to simply read the stream and buffer it out to Peers by implementing SoundSource, along the lines of
MaryInterface tts = new LocalMaryInterface();
AudioInputStream audio = tts.generateAudio("This is a test.");
SoundSource soundSource = new SoundSource() {
@Override
public byte[] readData() {
try {
byte[] buffer = new byte[1024];
audio.read(buffer);
return buffer;
} catch (IOException e) {
return null;
}
}
};
// issue call with soundSource using Peers
The phone rings, and I hear a slow, low, noisy sound instead of the synthesised speech. I guess it could be something to do with the audio format the SIP RTP session expects, since the Peers documentation states
The sound source must be raw audio with the following format: linear PCM 8kHz, 16 bits signed, mono-channel, little endian.
How can I convert/read the AudioInputStream to satisfy these requirements?
One way I know is this (given the systems that you are using, I don't know if it will work):
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
try {
byte[] data = new byte[1024];
int k;
while(true) {
k = audioInputStream.read(data, 0, data.length);
if(k < 0) break;
outputStream.write(data, 0, k);
}
// Wrap the collected bytes in a new AudioInputStream that declares the desired format.
AudioFormat af = new AudioFormat(8000f, 16, 1, true, false);
byte[] audioData = outputStream.toByteArray();
InputStream byteArrayInputStream = new ByteArrayInputStream(audioData);
AudioInputStream audioInputStream2 = new AudioInputStream(byteArrayInputStream, af, audioData.length / af.getFrameSize());
outputStream.close();
}
catch(Exception ex) { ex.printStackTrace(); }
There is also this
AudioSystem.getAudioInputStream(AudioFormat targetFormat, AudioInputStream sourceStream)
which you can use with the above parameters.
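A sketch of that second approach, applied to the snippet from the question (untested; whether a one-step sample-rate conversion down to 8 kHz is available depends on the audio service providers installed, so an intermediate conversion may be needed):
// Assumes: import javax.sound.sampled.*; import java.io.IOException;
// Target format Peers expects: linear PCM, 8 kHz, 16-bit signed, mono, little-endian.
AudioFormat target = new AudioFormat(8000f, 16, 1, true, false);

MaryInterface tts = new LocalMaryInterface();
AudioInputStream source = tts.generateAudio("This is a test.");
AudioInputStream converted = AudioSystem.getAudioInputStream(target, source);

SoundSource soundSource = new SoundSource() {
    @Override
    public byte[] readData() {
        try {
            byte[] buffer = new byte[1024];
            int n = converted.read(buffer);
            if (n <= 0) {
                return null; // end of stream
            }
            byte[] out = new byte[n]; // return only the bytes actually read
            System.arraycopy(buffer, 0, out, 0, n);
            return out;
        } catch (IOException e) {
            return null;
        }
    }
};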
I'm trying to write 2 different buffers (buffer A and buffer B) from separate threads to a SourceDataLine so the sounds play at the same time, but it keeps switching between buffer A and buffer B. Do I need to merge the buffers before writing them to my SourceDataLine, or is there a way to play them synchronized?
class PlayThread extends Thread {
byte[] buffer = new byte[2 * 1024];
@Override
public void run() {
try {
while (true) {
DatagramPacket receive = new DatagramPacket(buffer, buffer.length);
mDatagramSocket.receive(receive);
mSourceDataLine.write(receive.getData(), 0, receive.getData().length);
System.out.println("Received!");
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
I have 2 PlayThread instances with a different incoming buffer. Below is the function where the SourceDataLine is initialized.
private void init() {
try {
DataLine.Info sourceDataLineInfo = new DataLine.Info(
SourceDataLine.class, audioFormat);
DataLine.Info targetDataLineInfo = new DataLine.Info(
TargetDataLine.class, audioFormat);
Mixer.Info[] mixerInfo = AudioSystem.getMixerInfo();
Mixer mixer = AudioSystem.getMixer(mixerInfo[3]);
mSourceDataLine = (SourceDataLine) AudioSystem
.getLine(sourceDataLineInfo);
mTargetDataLine = (TargetDataLine) mixer.getLine(targetDataLineInfo);
mSourceDataLine.open(audioFormat, 2 * 1024);
mSourceDataLine.start();
mTargetDataLine.open(audioFormat, 2 * 1024);
mTargetDataLine.start();
} catch (LineUnavailableException ex) {
ex.printStackTrace();
}
}
Thank you.
You absolutely do have to merge them. Imagine writing numbers to a file from two threads:
123456...
123456...
might become
11234235656...
Which is what's happening to you.
Another issue is that you need to buffer your data as it comes in from the network, or you will likely drop it. You need at least two threads -- one for reading and one for playing for this to work. However, in your case, you will probably have better luck with one reader thread for each input packet stream. (See my talk slides: http://blog.bjornroche.com/2011/11/slides-from-fundamentals-of-audio.html I specifically have a slide about streaming from http which is also relevant here)
So, instead of multiple PlayThreads, make multiple ReaderThreads, which wait for data and then write to a buffer of some sort (PipedInputStream and PipedOutputStream work well in Java). Then you need another thread to read the data from the buffers and write the COMBINED data to the line.
This leaves your original question of how to combine the data. The answer is that there's no single answer, but usually the easiest correct way is to average the data on a sample-by-sample basis. However, exactly how you do so depends on your data format, which your code doesn't include. Assuming it's big-endian 16-bit integer, you need to convert the incoming raw data to shorts, average the shorts, and convert the averaged short back to bytes.
The byte to short conversion is most easily accomplished using DataInputStream and DataOutputStream.
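As a rough sketch of that mixing step (assuming 16-bit signed big-endian samples and two packets of equal length; this is illustrative, not production mixing code):
import java.io.*;

// Average two buffers of 16-bit signed big-endian PCM, sample by sample.
static byte[] mix(byte[] a, byte[] b) throws IOException {
    DataInputStream inA = new DataInputStream(new ByteArrayInputStream(a));
    DataInputStream inB = new DataInputStream(new ByteArrayInputStream(b));
    ByteArrayOutputStream bytesOut = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytesOut);

    int samples = Math.min(a.length, b.length) / 2; // 2 bytes per 16-bit sample
    for (int i = 0; i < samples; i++) {
        short sa = inA.readShort();
        short sb = inB.readShort();
        out.writeShort((short) ((sa + sb) / 2)); // simple average, cannot clip
    }
    return bytesOut.toByteArray();
}
The combining thread would then write the result of mix(...) to the single SourceDataLine instead of letting two threads write independently.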
How can I control the mic on/off function using Java code? I need to control the time for which the mic is on.
I tried using the following code in Java:
final AudioFormat format = getFormat(); // getFormat() supplies the audio format
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
final TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
line.open(format); //open mic for input
line.start();
byte[] buffer = new byte[1048576];
OutputStream out = new ByteArrayOutputStream();//output the audio to buffer
boolean running = true;
try {
while (running) {
int count = line.read(buffer, 0, buffer.length);
running=false;
if (count > 0) {
out.write(buffer, 0, count);
}
}
out.close();
} catch (IOException e) {
System.out.println("Error");
System.err.println("I/O problems: " + e);
System.exit(-1);
}
But this basically depends on the size of the buffer, and each pass of the while loop can capture about 30 seconds of audio.
I need to take sample input for just 10 seconds.
Any help? Thanks. :)
It seems you are trying to control the duration of the miking via the size of the buffer. I'm pretty sure this isn't common practice. Usually one uses a buffer that is a fraction of a second in size (to keep latency low), and iterates through it repeatedly. To control the duration of an open-ended read or playback operation, it is more usual to change the value of the "running" boolean.
Thus, from outside of the loop, code updates the "running" boolean, and when the loop notices that there has been a request to stop, the read loop ends.
I'm not up on the specifics of how one gets permission to turn a mic on or off. I know the Java Sound tutorials talk about it.
http://docs.oracle.com/javase/tutorial/sound/capturing.html
In their example, they use a boolean "stopped" to control when to end the recording loop.
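As a sketch of that pattern (my own rough adaptation of the tutorial's idea, not code from the tutorial itself), use a small buffer, keep reading in a loop, and stop either when the flag is cleared or when roughly 10 seconds' worth of bytes have been collected:
import java.io.ByteArrayOutputStream;
import javax.sound.sampled.*;

// Capture roughly `seconds` of audio by reading a small buffer in a loop.
// In a real program `running` would be a volatile field that other code can clear.
static byte[] capture(TargetDataLine line, AudioFormat format, int seconds) {
    long bytesWanted = (long) (format.getFrameRate() * format.getFrameSize() * seconds);
    byte[] buffer = new byte[4096];              // a small buffer keeps latency low
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    boolean running = true;

    while (running && out.size() < bytesWanted) {
        int count = line.read(buffer, 0, buffer.length);
        if (count > 0) {
            out.write(buffer, 0, count);
        }
    }
    return out.toByteArray();
}
Flipping running to false from another thread (or calling line.stop()) ends the capture early, exactly as the tutorial's "stopped" flag does.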