How many audio clips can Java handle?

I'm making a game in Java. I want there to be about 100 different samples, and at any given time up to 10 of them could be playing. For each of those 10 samples, I want to be able to manipulate volume and pan.
As of right now, I request a line as follows: new DataLine.Info(Clip.class, format);
I do not specify the controls that I need for this line, but it appears that Clips always have MASTER_GAIN and BALANCE controls.
Is this correct?
Could I just create an array of 100 Clips and preload all of the samples? I don't quite understand whether Java's lines correspond to physical lines into a physical mixer or whether they are virtualized.
If I am limited, how can I swap samples in and out of lines? Is there a way to do this so that all of my, say, 100 samples stay preloaded? Or does preloading only help when you already have a line designated?
Again, if I am limited, is this the wrong approach? Should I either:
a. use a different programming language, and/or
b. combine audio streams manually and put them all through the same line.
Wow, that's a lot of questions. I didn't find answers in the documentation and I really hope that you guys can help. Please number your answers 1 to 4. Thank you very much!

1) I do NOT think it is safe to assume there will always be a BALANCE or even a MASTER_GAIN. Maybe there is. My experience with Java Controls for audio was vexing and short. I quickly decided to write my own mixer, and have done so. I'm willing to share this code. It includes basic provisions for handling volume and panning.
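If you do stick with the built-in controls, it is safer to test for them before use. A minimal sketch, assuming a Clip that has already been opened (classes from javax.sound.sampled):

if (clip.isControlSupported(FloatControl.Type.MASTER_GAIN)) {
    FloatControl gain = (FloatControl) clip.getControl(FloatControl.Type.MASTER_GAIN);
    gain.setValue(-6.0f); // attenuate by 6 dB; keep within gain.getMinimum()..gain.getMaximum()
}
if (clip.isControlSupported(FloatControl.Type.BALANCE)) {
    FloatControl balance = (FloatControl) clip.getControl(FloatControl.Type.BALANCE);
    balance.setValue(-1.0f); // -1.0 = full left, 1.0 = full right
}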
Even when they work, the Java Controls have a granularity limited by the buffer size being used, which severely limits how fast you can fade in or out without creating clicks, if you are trying to do fades. Setting and holding a single volume is no problem, though.
Another Java library (bare-bones but vetted by several game programmers at java-gaming.org) is "TinySound", available via GitHub. I've looked it over but not used it myself. It also mixes all sounds down to a single output SourceDataLine. I can't recall how volume or panning is handled. Its author included provisions for ogg/vorbis files.
2) I'm not sure how you envision Clips working when you mention "samples". Yes, you can preload an array of 100 Clips, and you would directly play one or another of these Clips on its own thread (assuming raw Java rather than an audio-mixing library), reset it back to frame 0, then play it again. But only one thread can play a given Clip at a time: Clips do not accommodate concurrent playback of themselves. (You can "retrigger", though, by stopping a given playback, moving the position back to frame 0, and replaying, as sketched below.)
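For instance, the preload-and-retrigger pattern might look like this (soundFiles is a hypothetical array of File objects; exception handling omitted):

Clip[] clips = new Clip[100];
for (int i = 0; i < clips.length; i++) {
    clips[i] = AudioSystem.getClip();
    clips[i].open(AudioSystem.getAudioInputStream(soundFiles[i])); // loads the whole sample into RAM
}
// Retrigger clip n, even if it is still playing:
clips[n].stop();
clips[n].setFramePosition(0);
clips[n].start(); // playback proceeds on a background thread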
How long are the Clips? 100 of them could be a LOT of memory. If each is a second long: 100 seconds * 44100 frames per second * 4 bytes per frame = 17,640,000 bytes (almost 18 MB of RAM dedicated just to sound!).
I guess, if you know you'll only need a few at a time and you can predict which ones will be needed, you can pre-load those and reuse them. But don't fall into the trap of thinking that Clips are meant to be loaded at the time of playback. If you are doing that, you should use SourceDataLines instead. They start playing back quicker since they don't have to wait until the entire external file has been put into memory (as Clips do). I'd recommend only using a Clip if you plan to reset it to the 0th frame and replay it (or loop it)!
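By way of comparison, a bare-bones SourceDataLine playback loop might look like this (not from the original answer; the file name is a placeholder and checked exceptions are omitted):

AudioInputStream in = AudioSystem.getAudioInputStream(new File("sample.wav"));
AudioFormat fmt = in.getFormat();
SourceDataLine sdl = AudioSystem.getSourceDataLine(fmt);
sdl.open(fmt);
sdl.start();
byte[] buf = new byte[4096];
int n;
while ((n = in.read(buf, 0, buf.length)) != -1) {
    sdl.write(buf, 0, n); // streams from disk; only this small buffer lives in RAM
}
sdl.drain();
sdl.close();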
3) Once it is loaded as a Clip, it is basically ready to go; there is no additional stage. I can't think of any intermediate stage between an external file and a Clip in memory that would be helpful.
Ah, another thought: you might want to create a thread pool (size = max number of concurrent sounds) and manage that, as in the sketch below. I don't know at what point the scaling justifies the extra management.
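A minimal sketch of that idea (playClip is a hypothetical helper that starts a Clip and blocks until it completes):

ExecutorService pool = Executors.newFixedThreadPool(10); // pool size = max concurrent sounds
pool.execute(() -> playClip(clips[n])); // later submissions queue up if all 10 workers are busy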
4) It IS possible to run concurrent SourceDataLines in many contexts, which relieves the need for holding the entire file in RAM. In that case, the only thing you can preload are the Strings for the File locations, I think. I may be wrong and you can preload the Files as well, but maybe not. You definitely can't reuse an AudioInputStream! On the plus side, a SourceDataLine kicks off pretty quickly compared to an UNLOADED Clip.
HOWEVER! There are systems (e.g., some Linux OS) that limit you to a single output, which might be either a Clip or a SourceDataLine. That was the clincher for me when I decided to build my own mixer.
I think if only 8 or 10 tones are playing at one time, you will probably be okay as long as the graphics are not too ambitious (not counting the above mentioned Linux OS situation). You'll have to test it.
I don't know what alternative languages you are considering. Some flavor of C is the only alternative I know of; most everything else I know of is not low-level or fast enough to handle that much audio processing. But I am only modestly experienced; I don't have a sound-engineering background and am self-taught.

Related

Sudden delay while recording audio over long time periods inside the JVM

I'm implementing an application which records and analyzes audio in real time (or at least as close to real time as possible), using JDK Version 8 Update 201. While performing a test which simulates typical use cases of the application, I noticed that after several hours of recording audio continuously, a sudden delay of somewhere between one and two seconds was introduced. Up until this point there was no noticeable delay; it was only after this critical point of several hours of recording that the delay started to occur.
What I've tried so far
To check if my code for timing the recording of the audio samples is wrong, I commented out everything related to timing. This left me essentially with this update loop which fetches audio samples as soon as they are ready (Note: Kotlin code):
while (!isInterrupted) {
    val audioData = read(sampleSize, false)
    listener.audioFrameCaptured(audioData)
}
This is my read method:
fun read(samples: Int, buffered: Boolean = true): AudioData {
    // Allocate a byte array in which the read audio samples will be stored.
    val bytesToRead = samples * format.frameSize
    val data = ByteArray(bytesToRead)
    // Calculate the maximum amount of bytes to read during each iteration.
    val bufferSize = (line.bufferSize / BUFFER_SIZE_DIVIDEND / format.frameSize).roundToInt() * format.frameSize
    val maxBytesPerCycle = if (buffered) bufferSize else bytesToRead
    // Read the audio data in one or multiple iterations.
    var bytesRead = 0
    while (bytesRead < bytesToRead) {
        bytesRead += (line as TargetDataLine).read(data, bytesRead, min(maxBytesPerCycle, bytesToRead - bytesRead))
    }
    return AudioData(data, format)
}
However, even without any timing on my side the problem was not resolved. Therefore, I went on to experiment a bit and let the application run using different audio formats, which led to very confusing results. (I'm going to use PCM signed 16-bit stereo, little-endian, at a sample rate of 44100.0 Hz as the default, unless specified otherwise.)
1. The critical amount of time that has to pass before the delay appears seems to differ depending on the machine used. On my Windows 10 desktop PC it is somewhere between 6.5 and 7 hours. On my laptop (also Windows 10) it is somewhere between 4 and 5 hours for the same audio format.
2. The number of audio channels used has an effect. If I change from stereo to mono, the time before the delay appears doubles, to somewhere between 13 and 13.5 hours on my desktop.
3. Decreasing the sample size from 16 bits to 8 bits also doubles the time before the delay appears: somewhere between 13 and 13.5 hours on my desktop.
4. Changing the byte order from little-endian to big-endian has no effect.
5. Switching from Stereo Mix to a physical microphone has no effect either.
6. I tried opening the line using different buffer sizes (1024, 2048, and 3072 sample frames) as well as its default buffer size. This also didn't change anything.
7. Flushing the TargetDataLine after the delay has started to occur results in all bytes being zero for approximately one to two seconds. After this I get non-zero values again. The delay, however, is still there. If I flush the line before the critical point, I don't get those zero bytes.
8. Stopping and restarting the TargetDataLine after the delay appeared does not change anything either.
9. Closing and reopening the TargetDataLine, however, does get rid of the delay, until it reappears after several hours from there on.
10. Automatically flushing the TargetDataLine's internal buffer every ten minutes does not help to resolve the issue. Therefore, an overflow of the internal buffer does not seem to be the cause.
11. Using a parallel garbage collector to avoid application freezes does not help either.
12. The sample rate used seems to matter. If I double the sample rate to 88200 Hz, the delay starts occurring somewhere between 3 and 3.5 hours of runtime.
13. If I let it run under Linux using my "default" audio format, it still runs fine after about 9 hours of runtime.
Conclusions that I've drawn:
These results lead me to the conclusion that the time for which I can record audio before this issue starts to happen depends on the machine on which the application is run and on the byte rate (i.e., frame size times sample rate) of the audio format. This seems to hold true (although I can't completely confirm it as of now) because, if I combine the changes made in points 2 and 3, I would expect to be able to record audio for four times as long (somewhere between 26 and 27 hours) as with my "default" audio format before the delay starts to appear. As I haven't found the time to let the application run that long yet, I can only say that it ran fine for about 15 hours before I had to stop it due to time constraints on my side. So, this hypothesis is still to be confirmed or denied.
According to the result of point 13, the whole issue seems to appear only when using Windows. Therefore, I think it might be a bug in the platform-specific parts of the javax.sound.sampled API.
Even though I think I might have found a way to change when this issue starts to happen, I'm not satisfied with the result. I could periodically close and reopen the line to keep the problem from appearing at all (see the sketch below). However, doing this would leave some arbitrary small amount of time during which I couldn't capture audio samples. Furthermore, the Javadoc states that some lines can't be reopened at all after being closed. Therefore, this is not a good solution in my case.
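For reference, that imperfect workaround would look roughly like this (plain Java for clarity; format and bufferSize as configured elsewhere in the application):

line.stop();
line.close();
// Reacquire and reopen the line; any audio arriving in this window is lost.
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
line = (TargetDataLine) AudioSystem.getLine(info);
line.open(format, bufferSize);
line.start();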
Ideally, this whole issue shouldn't be happening at all. Is there something I am completely missing or am I experiencing limitations of what is possible with the javax.sound.sampled API? How can I get rid of this issue at all?
Edit: By suggestion of Xtreme Biker and gidds I created a small example application. You can find it inside this Github repository.
I have (rather) vast experience with Java audio interfacing.
Here are a few points that may be useful in guiding you towards a proper solution:
It's not a matter of JVM version - the Java audio system has barely been upgraded since Java 1.3 or 1.5.
The Java audio system is a poor man's wrapper around whatever audio interface API the operating system has to offer. On Linux it's the PulseAudio library; for Windows, it's the DirectShow audio API (if I'm not mistaken about the latter).
Again, the audio system API is something of a legacy API - some of the features are not working or not implemented, and other behaviors are downright weird, as they depend on an obsolete design (I can provide examples if required).
It's not a matter of garbage collection - if your definition of "delay" is what I understand it to be (audio data is delayed by 1-2 seconds, meaning you start hearing stuff 1-2 seconds later), then the garbage collector cannot cause blank data to magically be captured by the target data line and the rest of the data to be appended as usual at a 2-second byte offset.
What's most likely happening here is either the hardware or driver providing you with 2 seconds worth of garbled data at some point, and then, streams the rest of the data as usual, resulting in the "delay" you are experiencing.
The fact that it works perfectly on Linux means it's not a hardware issue, but rather a driver-related one.
To confirm that suspicion, you can try capturing audio via FFmpeg for the same duration and see whether the issue is reproduced.
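To make that comparison concrete, on Windows a capture along these lines would record eight hours to a WAV file (the device name here is machine-specific and purely illustrative; "ffmpeg -list_devices true -f dshow -i dummy" prints the available names):

ffmpeg -f dshow -i audio="Microphone (Realtek Audio)" -t 28800 capture.wav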
If you are using specialized audio-capturing hardware, better approach your hardware manufacturer and ask them about the issue you are facing on Windows.
In any case, when writing an audio-capturing application from scratch I'd strongly suggest keeping away from the Java audio system if possible. It's nice for POCs, but it's an unmaintained legacy API. JNA is always a viable option (I've used it on Linux with ALSA/PulseAudio to control audio hardware attributes the Java audio system could not alter), so you could look for audio-capturing examples in C++ for Windows and translate them to Java. It'll give you fine-grained control over audio capture devices, much more than what the JVM provides OOTB. If you want to have a look at a living/breathing usable JNA example, check out my JNA AAC encoder project.
Again, if you use special capturing hardware, there's a good chance the manufacturer already provides its own low-level C API for interfacing with the hardware, and you should consider having a look at it as well. If that's not the case, maybe you and your company/client should consider using specialized capturing hardware (it doesn't have to be that expensive).

Can reading from disk from different threads optimize a program?

I am wondering whether there is a way to optimize reading from disk in Java. For example, I want to print the contents of all text files in some directory, but uppercased. I can create another thread to uppercase them, but can I optimize reading by adding another thread (or threads) to read the files too? I mean two, three, or more threads reading different files from disk. Is there some optimization for doing this or not? I hope I explained the problem clearly.
I want to print the contents of all text files
This is most likely your bottleneck. If not, you should focus on what your bottleneck is, as optimizing anything else is likely to complicate your code for no benefit.
I can create another thread to uppercase them,
You can, though passing the work to another thread could be more expensive than making it uppercase, depending on how you do this.
can I optimize reading by adding another thread (or threads) to read files too?
Possibly. How many disks do you have? If you have one disk, it can usually only do one thing at a time.
I mean two, three, or more threads reading different files from disk.
Most desktop drives can only do one operation at a time.
Is there some optimization for doing this or not?
Yes, but as I said, until you know what your bottleneck is, it's hard to jump to a solution.
I can create another thread to uppercase them
That's actually going in the right direction, but simply making all letters uppercase doesn't take enough time to really matter unless you're processing really large chunks of the file.
The standard single-threaded read-then-process model means you're either reading data or processing it, when you could be doing both at the same time.
For example, you could be creating a series of highly compressed (say, JPEG2000 because it's so CPU intensive) images from a large video stream file. You could have one thread reading frames from the stream, placing them into a queue to process, and then have N threads each processing a frame into an image.
You'd tune the number of threads reading data and the number of threads processing data to keep both your disks and CPUs maximally busy without excess contention.
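A minimal sketch of that queue-based split, applied to the uppercasing example from the question (the file list and thread count are illustrative; a real version would shut down with a poison-pill value instead of shutdownNow):

import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.List;
import java.util.concurrent.*;

public class UppercasePipeline {
    public static void main(String[] args) throws Exception {
        List<Path> paths = List.of(Paths.get("a.txt"), Paths.get("b.txt")); // hypothetical input files
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(64); // bounded: the reader can't run far ahead
        int workerCount = 4; // tune to the number of CPU cores
        ExecutorService pool = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String text = queue.take();             // wait for the reader thread
                        System.out.println(text.toUpperCase()); // the CPU-side work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // A single reader keeps the (single) disk doing sequential I/O:
        for (Path p : paths) {
            queue.put(new String(Files.readAllBytes(p), StandardCharsets.UTF_8));
        }
        pool.shutdownNow(); // sketch-level shutdown; drain the queue first in real code
    }
}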
There are some cases where you can use multiple threads to read from a single file to get better performance. But you need a system designed from the ground up to do that: lots of disks (fewer if they're SSDs), a pretty substantial IO infrastructure, a system with a lot of IO bandwidth, and a file system that can handle multiple simultaneous accesses to a single file. Then the code you write to get better read performance from multiple threads has to match things like the physical layout of your files on disk.
That works best if you're doing lots of random reads from a file spread over multiple devices. Like a large, high-powered database server.
For example, let's say I have a huge data file spread over four or five disks (or even RAID arrays), with the file spread out over the disks in 64KB chunks. A handful of threads doing 64KB reads would be ideal for reading or writing such a file in random-access mode. Let's say everything is really fast and you can read or write 1 GB/sec from such a file.
But if you turn around and just try to copy that data in a stream, you can still use multiple threads to get maximum performance - say 1 GB/sec - but if you just used a single thread to do read() calls in 1 MB chunks you'd probably get 950 MB/sec, or 95% of the maximum multithreaded read performance.
I've actually benchmarked such systems, and most of the time multithreaded IO isn't worth the trouble unless you've invested a lot of money in your hardware and software (open-source file systems tend not to do this very well - you need to get into the realm of IBM's GPFS and Oracle's (née LSC's, then Sun's) QFS) and you know exactly what you're doing when you set it up.

Android GPU profiling - OpenGL Live Wallpaper is slow

I'm developing a Live Wallpaper using OpenGL ES 3.0. I've set up according to the excellent tutorial at http://www.learnopengles.com/how-to-use-opengl-es-2-in-an-android-live-wallpaper/, adapting GLSurfaceView and using it inside the Live Wallpaper.
I have a decent knowledge of OpenGL/GLSL best practices, and I've set up a simple rendering pipeline where the draw loop is as tight as possible. No re-allocations, using one static VBO for non-changing data, a dynamic VBO for updates, using only one draw call, no branching in the shaders et cetera. I usually get very good performance, but at seemingly random but reoccurring times, the framerate drops.
Profiling with the on-screen bars gives me intervals where the yellow bar ("waiting for commands to complete") shoots up and pushes the total above the critical 60 fps threshold.
I've read all the resources on profiling and interpreting those numbers that I could get my hands on, including the nice in-depth SO question here. However, the main takeaway from that question seems to be that the yellow bar indicates time spent waiting for blocking operations to complete, and for frame dependencies. I don't believe I have any of those; I just draw everything at every frame. No reading.
My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.
Here are some details that may or may not have impact:
I'm rendering on demand, with onOffsetsChanged as the trigger (render when dirty).
There is one single texture (created and bound only once), 1024x1024 RGBA. Replacing the one texture2D call with a plain vec4 seems to help remove some of the framerate drops. Reducing the texture size to 512x512 does nothing for performance.
The shaders are not complex, and as stated before, contain no branching.
There is not much data in the scene. There are only ~300 vertices and the one texture.
A systrace shows no suspicious methods - the GL related methods such as buffer population and state calls are not on top of the list.
Update:
As an experiment, I tried to render only every other frame, not requesting a render on every onOffsetsChanged (swipe left/right). This was horrible for the look and feel, but got rid of the yellow lag spikes almost completely. This seems to tell me that 60 render requests per second is too much, but I can't figure out why.
My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.
(1) Accumulation of render state. Make sure you "glClear" the color/depth/stencil buffers before you start each render pass (although if you are rendering directly to the window surface this is unlikely to be the problem, as state is guaranteed to be cleared every frame unless you set EGL_BUFFER_PRESERVE).
(2) Buffer/texture ghosting. Rendering is deeply pipelined, but OpenGL ES tries to present a synchronous programming abstraction. If you try to write to a buffer (SubBuffer update, SubTexture update, MapBuffer, etc) which is still "pending" use in a GPU operation still queued in the pipeline then you either have to block and wait, or you force a copy of that resource to be created. This copy process can be "really expensive" for large resources.
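One common way to sidestep ghosting on a dynamic VBO is to "orphan" the old storage before each update, so the driver can hand out fresh memory instead of stalling or copying. A sketch (the buffer id, size, and vertex data are placeholders; GLES30 is android.opengl.GLES30):

GLES30.glBindBuffer(GLES30.GL_ARRAY_BUFFER, dynamicVboId);
// Re-specify the data store with null data: queued draws keep using the old
// storage while we get a fresh block to fill, avoiding a pipeline stall.
GLES30.glBufferData(GLES30.GL_ARRAY_BUFFER, sizeInBytes, null, GLES30.GL_DYNAMIC_DRAW);
GLES30.glBufferSubData(GLES30.GL_ARRAY_BUFFER, 0, sizeInBytes, vertexData);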
(3) Device DVFS (dynamic voltage and frequency scaling) can be quite sensitive on some devices, especially for content which happens to sit just around a level decision point between two frequencies. If the GPU or CPU frequency drops then you may well get a spike in the amount of time a frame takes to process. For debug purposes some devices provide a means to fix the frequency via sysfs - although there is no standard mechanism.
(4) Thermal limitations - most modern mobile devices can produce more heat than they can dissipate if everything is running at high frequency, so the maximum performance point cannot be sustained. If your content is particularly heavy then you may find that thermal management kicks in after a "while" (1-10 minutes depending on device, in my experience) and forcefully drops the frequency until thermal levels drop within safe margins. This shows up as somewhat random increases in frame processing time, and is normally unpredictable once a device hits the "warm" state.
If it is possible to share an API sequence which reproduces the issue it would be easier to provide more targeted advice - the question is really rather general and OpenGL ES is a very wide API ;)

Java (JavaSound): Is "clip.play()" an expensive call?

I've read here on StackOverflow that every time you play a clip in JavaSound, behind the scenes it creates a thread to play it. If that is true (and if it isn't, please tell me, as I have not found any documentation/source on it), would it be considered an expensive call, since creating threads is an expensive task in any OS/JVM? I am not sure yet, but I may need to play 10 to 20 clips concurrently, so I was wondering if that would be a problem.
PS: If it is an expensive call for reasons other than creating threads, please let me know.
Threads are NOT particularly expensive. I've personally made a program that has over 500 running. Server programs can spawn considerably more than that.
Sound processing is not inexpensive, but I don't know that it is much more cpu-intensive than many graphics effects, like lighting in 3D. I made a program that both played a sound and made a "glow ball" that grew and faded while the sound was playing. The "glow ball" continually updated a RadialGradientPaint to achieve this effect. I ran into a ceiling of about 10 balls and sounds, and it was the graphical balls that were the bigger processing load.
Still, you might not be able to do a whole lot else with 17 Clips playing. You'll have to test it, and will hear dropouts if the cpu can't keep up.
Your 17 Clips may take up a huge amount of RAM. You know that they are all loaded into memory, yes? At 44100 frames for each second, and typically 4 bytes per frame (stereo, 16-bit PCM), that starts to add up quickly.
So there may be reasons to consider using SourceDataLines instead, especially for the longer sounds.
Also, it seems some OS systems don't handle multiple sounds very well. I've had problems come up here with Linux in particular. I ended up writing a program to mix all the playing sounds into one output SourceDataLine as a way to handle this.
Another way I get some efficiency is that I load my own custom-made Clip. I've given this Clip multiple cursors (pointers) that can move independently through the audio data. This way, I can play a Clip multiple times (and at varying speeds), overlapping. To do this with a Java Clip, you have to load it into RAM multiple times. So, you might consider writing something like that. The output from the multiple cursors can be summed, as sketched below, and played via a SourceDataLine.
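The core of such a mixer is just summing samples with clamping. A minimal sketch (my illustration, not the actual mixer code mentioned above) for two 16-bit little-endian PCM buffers, whose output you would write to one SourceDataLine:

static byte[] mix(byte[] a, byte[] b) {
    byte[] out = new byte[Math.min(a.length, b.length)];
    for (int i = 0; i < out.length; i += 2) {
        // Decode two little-endian 16-bit samples.
        int sa = (short) ((a[i] & 0xFF) | (a[i + 1] << 8));
        int sb = (short) ((b[i] & 0xFF) | (b[i + 1] << 8));
        // Sum and clamp to the 16-bit range to avoid wrap-around distortion.
        int sum = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sa + sb));
        out[i] = (byte) sum;
        out[i + 1] = (byte) (sum >> 8);
    }
    return out;
}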

Mixing Sound in Java?

How do I play the same sound more than once at any given time, with numerous other sounds going on at the same moment? Right now I've got a Clip playing, but it won't overlap with itself (I hear one bullet fire, the sound finishes, then it plays again). I'm writing a game with a fast bullet-firing system, but I can't get the sound to work nicely. It just doesn't sound "right" to hear only one bullet shot every half second when you spawn 20+ on the screen each second.
Any help? Pointers? :D
This seems to answer your question:
http://my.safaribooksonline.com/9781598634761/ch09lev1sec3
Quote:
"In other words, a single Clip object cannot mix with itself, only with other sounds. This process works quite well if you use short sound effects, but can sound odd if your sound clips are one second or more in length. [...] If you want to repeatedly mix a single clip, there are two significant options (and one other unlikely option):
1) Load the sound file into multiple Clip objects (such as an array), and then play each one in order. Whenever you need to play this specific sound, just iterate through the array and locate a clip that has finished playing, and then start playing it again."
So in principle Java does do mixing, just not inside a single clip.
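In code, option 1 from the quote might look like this (loadCopies is a hypothetical helper that opens the same file into several Clips):

Clip[] copies = loadCopies("gunshot.wav", 8);

void playShot() {
    for (Clip c : copies) {
        if (!c.isRunning()) {      // found an idle copy
            c.setFramePosition(0);
            c.start();
            return;
        }
    }
    // All copies busy: drop the sound, or steal one by retriggering copies[0].
}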
Playing 20 bullet clips at once might be a little CPU-intensive. It might be fine. I made a windchime once that played 7 chimes, overlapping (each was about 3 or 4 seconds long), and got away with setting it to play about 100 chimes per 5-second block. But the program wasn't doing anything else.
With Clips, to do this you would need to make multiple copies, and all that audio data would be sitting there, taking up RAM. If they are really short, it's not such a sacrifice. But for rapid fire, the solution most games use is to just cut off the sound and restart. You don't have to play the sound through to the end.
myClip.stop();
myClip.setFramePosition(0);
myClip.start();
with each bullet fired. This is what is most often done. It uses a lot less CPU and less RAM than the overlapping-Clip solution.
AudioClip might be what you're looking for - I use it in games for playing short .wav sound effects. It's not perfect, but it works fine most of the time.
