Play .mp3 and a sequence of notes (MIDI) simultaneously in Java

I am currently developing an application where the user can load a .mp3 file and enter a sequence of notes. The goal is for the user to match this sequence of notes to the song in the .mp3 file.
This requires the ability to play the .mp3 file and the sequence of notes simultaneously. After some research I found that either the Java Sound API or JFugue can produce a sequence of notes (MIDI) from the user's input. As stated here, JLayer can be used to play .mp3 files in Java. (I could also convert the .mp3 to .wav and play the converted .wav some other way.)
However, would it be possible to play this .mp3 and the sequence of notes together without any problems, or should I first merge them into a single file?
The user should be able to play the .mp3 and his/her sequence of notes simultaneously from any timestamp, preferably without any delay, so the user can easily adapt a note to match the pitch in the file. Merging them into one file before playing seems like too much overhead when the user is almost constantly changing a note and replaying to check whether it matches the pitch.
Thanks in advance!

Java supports playback from multiple threads. All you need to do is run the .mp3 from one thread and the MIDI-generated notes on another, concurrently running thread.
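A minimal sketch of that approach, assuming JLayer is on the classpath (the file names song.mp3 and notes.mid are placeholders):

    import javax.sound.midi.MidiSystem;
    import javax.sound.midi.Sequencer;
    import java.io.BufferedInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import javazoom.jl.player.Player;

    public class DualPlayback {
        public static void main(String[] args) throws Exception {
            // JLayer's play() blocks, so decode the .mp3 on its own thread.
            Thread mp3Thread = new Thread(() -> {
                try (FileInputStream fis = new FileInputStream("song.mp3")) {
                    new Player(new BufferedInputStream(fis)).play();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });

            // The MIDI sequencer plays asynchronously on its own internal thread.
            Sequencer sequencer = MidiSystem.getSequencer();
            sequencer.open();
            sequencer.setSequence(MidiSystem.getSequence(new File("notes.mid")));

            mp3Thread.start();
            sequencer.start(); // returns immediately
        }
    }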
There used to be a few Linux systems that could only handle output from one audio source at a time. I don't know if this is still an issue.
Another, much more elaborate possibility, one that would let you do live mixing and output to a single line, is to read the song file using an AudioInputStream, convert the bytes to PCM on the fly (e.g., to floats ranging from -1 to 1) or preload and store the audio as PCM, add this to PCM data coming from a do-it-yourself synth, then convert the sum back to bytes and output it via a SourceDataLine.
That is a lot of trouble and you probably don't want to go that route, but if you did, following is some info to help break down the various steps of one possible realization.
Loading .wav data and converting it to an internal PCM form can be seen in the open-source AudioCue (the loadURL method at line 359). And here is an example (free download) of a real-time Java synth I made that runs via keystrokes. One of the voices is a simple organ, which outputs PCM audio data by simply adding four sine waves at harmonic frequencies. Other sounds are possible if you want to get into other forms of synthesis, but that gets more involved.
(I don't know how to convert data coming from a MIDI-controlled synth, unless perhaps a TargetDataLine can be identified and the data from it converted to PCM, similar to the conversion used when reading from an AudioInputStream in the AudioCue source example.)
Given two PCM sources, the two can be mixed in real time using addition, converted to bytes, and output via a single SourceDataLine (see the convertBufferToAudioBytes method at line 1387). The SourceDataLine can be kept running indefinitely if you input zeros from the contributors when they are not playing. An SDL spends the vast majority of its time in a blocked state, since audio data is processed much more quickly than the rate at which the system consumes it, so it uses very little CPU.
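To make the mixing step concrete, here is a minimal sketch under the assumptions above (two preloaded mono PCM buffers as floats in the -1 to 1 range, 44.1 kHz, output as signed 16-bit little-endian):

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.SourceDataLine;

    public class MixDown {
        public static void playMixed(float[] song, float[] synth) throws Exception {
            AudioFormat fmt = new AudioFormat(44100f, 16, 1, true, false);
            SourceDataLine sdl = AudioSystem.getSourceDataLine(fmt);
            sdl.open(fmt);
            sdl.start();

            int n = Math.min(song.length, synth.length);
            byte[] out = new byte[n * 2];
            for (int i = 0; i < n; i++) {
                // Mix by addition, then clamp to avoid wrap-around distortion.
                float mixed = Math.max(-1f, Math.min(1f, song[i] + synth[i]));
                int s = (int) (mixed * 32767);
                out[2 * i] = (byte) (s & 0xFF);            // low byte (little-endian)
                out[2 * i + 1] = (byte) ((s >> 8) & 0xFF); // high byte
            }
            sdl.write(out, 0, out.length); // blocks while the line drains the buffer
            sdl.drain();
            sdl.close();
        }
    }

In a real-time version you would loop over small buffers instead of writing everything at once, substituting zeros for whichever contributor is currently silent.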

Related

Extract audio data to memory using vlcj

I want to extract audio data to memory using vlcj (https://github.com/caprica/vlcj, version 4.2.0). I don't want to play the video at the same time, just extract the audio data as fast as performance allows.
Right now I'm using a workaround based on this: https://github.com/caprica/vlcj/blob/master/src/test/java/uk/co/caprica/vlcj/test/rip/RipAudioTest.java, i.e. output the data to a file first, and then read the file back. While this solution works, it's not optimal and it takes disk space.
Maybe the example above can be modified to direct the audio to a callback instead.
A second example is:
https://github.com/caprica/vlcj/blob/master/src/test/java/uk/co/caprica/vlcj/test/directaudio/DirectAudioPlayerTest.java
In that example, the audio data is extracted to memory, which is what I want. But it also plays the video in a window, which is not what I want. Maybe that example can be modified to turn off the video somehow, and make it run as fast as possible?
There's no perfect answer here, sadly.
Using the DirectAudioPlayerTest.java that you found already, you can change the media player factory creation to pass this parameter to prevent the video window being opened:
factory = new MediaPlayerFactory("--novideo");
You will receive the audio into a memory buffer at the rate at which VLC decodes it, so if you have a 5 minute video it will take 5 minutes to extract the audio - not ideal.
This is the only way with vlcj that you can grab the audio data in memory.
The RipAudioTest.java that you found, IIRC, extracts the audio as quickly as possible, which may be a lot faster than the normal playback speed - but here you can't grab the decoded audio directly into your Java application.
So the solution you already have, ripping the track to disk first, might actually be the best solution you can achieve with vlcj here (since it could be considerably quicker than using the direct audio player).

Multi channel audio within processing

I'm trying to build a sketch that shows me the levels of audio coming into a system. I want to be able to handle more than 2 channels, so I know that I need more than the processing.sound library can currently provide, and my searching has led me to javax.sound.sampled.*; however, this is as far as my searching and experimenting has got me.
Does anyone know how to query the system for how many lines are coming in, and how to get the amplitude of the audio on each line?
This is kind of a composite question.
For the number of lines, see Accessing Audio System Resources in the Java tutorials. There is sample code there for inspecting what lines are present. If some of the terms are confusing, most are defined in the tutorial immediately preceding this one.
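For example, a short sketch that enumerates the installed mixers and reports which ones expose target (capture) lines:

    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.Line;
    import javax.sound.sampled.Mixer;

    public class ListInputs {
        public static void main(String[] args) {
            for (Mixer.Info mi : AudioSystem.getMixerInfo()) {
                Mixer mixer = AudioSystem.getMixer(mi);
                Line.Info[] targets = mixer.getTargetLineInfo(); // lines you can capture from
                if (targets.length > 0) {
                    System.out.println(mi.getName() + ": " + targets.length + " target line(s)");
                }
            }
        }
    }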
To see what is on the line, check Capturing Audio.
To get levels, you will probably want to compute some sort of rolling average of the signal's energy, usually done as a root-mean-square (RMS) value. The "controls" sometimes provided at a higher level are kind of iffy for a variety of reasons.
In order to calculate those levels, though, you will have to convert the byte data to PCM. The example code in Using Files and Format Converters shows the point where the conversion would take place. In the first real example, under the heading "Reading Sound Files", take note of the comment that reads:
// Here, do something useful with the audio data that's
// now in the audioBytes array...
There are already StackOverflow questions that show the code needed to convert bytes to PCM.
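A minimal sketch of both steps, assuming signed 16-bit little-endian mono data in the audioBytes array from the tutorial example:

    // Convert 16-bit little-endian samples to floats and compute the RMS level.
    public static double rmsLevel(byte[] audioBytes, int byteCount) {
        double sumOfSquares = 0;
        int samples = byteCount / 2;
        for (int i = 0; i < samples; i++) {
            int lo = audioBytes[2 * i] & 0xFF;  // unsigned low byte
            int hi = audioBytes[2 * i + 1];     // high byte keeps the sign
            float sample = ((hi << 8) | lo) / 32768f; // normalize to -1..1
            sumOfSquares += sample * sample;
        }
        return Math.sqrt(sumOfSquares / samples);
    }

Calling this once per captured buffer (or over a rolling window of recent samples) gives a level reading for each line.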

Combine two wave files to create a single smoother wave file in java

I have a requirement to join two audio waves so that the final output wave has a smoother meeting point. That is, at the joining point, for, say, 10 seconds, the first audio should fade out while the other fades in.
I have already been able to concatenate the two audio files and produce a single output, but that output wave file has an abrupt change at the meeting point.
I am looking for Java code (i.e., the crossfading should happen through Java code without playing the files in any audio player; and just to mention, I am not targeting an Android solution), or a pointer to any helpful link demonstrating how to do this.
What you're talking about is called crossfading. Crossfading means slowly bringing up the volume of the new song while slowly bringing down the volume of the old one. For a time, both can be heard.
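A minimal sketch of a linear crossfade, assuming both inputs have already been decoded to mono PCM floats (-1 to 1) at the same sample rate, and that fadeLen is no longer than either input (for a 10-second fade at 44.1 kHz, fadeLen would be 441000):

    // Crossfade: a fades out while b fades in over fadeLen samples.
    public static float[] crossfade(float[] a, float[] b, int fadeLen) {
        float[] out = new float[a.length + b.length - fadeLen];
        System.arraycopy(a, 0, out, 0, a.length - fadeLen);
        for (int i = 0; i < fadeLen; i++) {
            float t = i / (float) fadeLen; // ramps 0 -> 1 across the overlap
            out[a.length - fadeLen + i] = a[a.length - fadeLen + i] * (1 - t) + b[i] * t;
        }
        System.arraycopy(b, fadeLen, out, a.length, b.length - fadeLen);
        return out;
    }

An equal-power curve (sine/cosine gains instead of linear ones) usually sounds smoother at the midpoint, but the structure is the same.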
You might want to look at these:
java sound fade out
concatenating mp3 files or joining mp3 files using java

Identify audio sample in a file

I want to be able to identify an audio sample (provided by the user) in an audio file I've got (mp3).
The mp3 file is a radio stream that I've kept for testing purposes, and I have the Pre-roll of the show. I want to identify it in the file and get the timestamp where it's playing in the file.
Note: The solution can be in any of the following programming languages: Java, Python, or C++. I don't know how to analyze the audio file, and any reference on this subject will help.
This problem falls under the category of audio fingerprinting. If you have matched a sample to a song, then you'll certainly know the timestamp where the sample occurs within the song. There is a great paper by the guys behind Shazam that describes their technique (http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf). They basically pick out the local maxima in the spectrogram and create a hash based on their relative positions.
Here is a good review on audio fingerprinting algorithms: http://mtg.upf.edu/files/publications/MMSP-2002-pcano.pdf
In any case, you'll likely be working a lot with FFT and spectrograms. This post talks about how to do that in Python.
I'd start by computing the FFT spectrogram of both the haystack and needle files (so to speak). Then you could try and (fuzzily) match the spectrograms - if you format them as images, you could even use off-the-shelf algorithms for that.
Not sure if that's the canonical or optimal way, but I feel like it should work.
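For illustration, here is a naive spectrogram sketch in Java; it uses an O(n^2) DFT per frame, so a real implementation should swap in an FFT library such as JTransforms:

    // Magnitude spectrogram: one row per frame, one column per frequency bin.
    public static double[][] spectrogram(float[] pcm, int frameSize, int hop) {
        int frames = (pcm.length - frameSize) / hop + 1;
        double[][] mag = new double[frames][frameSize / 2];
        for (int f = 0; f < frames; f++) {
            int off = f * hop;
            for (int k = 0; k < frameSize / 2; k++) {
                double re = 0, im = 0;
                for (int n = 0; n < frameSize; n++) {
                    double angle = -2 * Math.PI * k * n / frameSize;
                    re += pcm[off + n] * Math.cos(angle);
                    im += pcm[off + n] * Math.sin(angle);
                }
                mag[f][k] = Math.sqrt(re * re + im * im);
            }
        }
        return mag;
    }

Computing this for both the recording and the pre-roll, then sliding the smaller spectrogram along the larger one and scoring the overlap, gives a rough timestamp estimate.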

Converting and segmenting a collection of 44100 Hz, 16-bit mono wav files into 16 kHz, 16-bit mono wav files

I need to break apart a large collection of wav files into smaller segments and convert them into 16 kHz, 16-bit mono wav files. To segment the wav files, I downloaded a WavFile class from the following site: WavFile Class. I tweaked it a bit to allow skipping an arbitrary number of frames. Using that class, I created a WavSegmenter class that reads a source wav file and copies the frames between time x and time y into a new wav file. The start time and end time come from a provided XML file, and I can get the frame positions using sample rate * time. My problem is that I do not know how to convert the sample rate from 44,100 Hz to 16,000 Hz.
Currently, I am looking into Java's Sound API for this. I didn't consult it initially because I found the guides long, but if it's the best existing option, I am willing to go through them. I would still like to know if there's another way to do it, though. Finally, I would like to know whether I should adopt Java's Sound API completely and drop the WavFile class I am currently using. To me, it looks sound, but I would just like to be sure.
Thank you very much, in advance, for your time.
I believe the hardest part of your task is resampling from 44.1K to 16K samples per second. It would have been much simpler to downsample to 22K or 11K from there! You will need to do some interpolation for that.
EDIT: After further review and discussion with OP I believe the right choice for this situation is to go with Java Sound API because it provides methods for conversion between different sound file formats, including different sampling rates. Sticking with the WavFile API would require re-sampling which is quite complicated to implement in a 44.1K to 16K conversion case.
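A minimal sketch of that conversion path with the Java Sound API (file names are placeholders; whether a direct 44.1 kHz to 16 kHz converter is available depends on the JVM's installed service providers, hence the support check):

    import javax.sound.sampled.AudioFileFormat;
    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;
    import java.io.File;

    public class Downsample {
        public static void main(String[] args) throws Exception {
            AudioInputStream in = AudioSystem.getAudioInputStream(new File("in_44100.wav"));
            AudioFormat target = new AudioFormat(16000f, 16, 1, true, false); // 16 kHz, 16-bit mono

            // Not every JVM ships a converter for every rate pair, so check first.
            if (!AudioSystem.isConversionSupported(target, in.getFormat())) {
                throw new IllegalStateException("No 44.1 kHz -> 16 kHz converter available");
            }
            AudioInputStream converted = AudioSystem.getAudioInputStream(target, in);
            AudioSystem.write(converted, AudioFileFormat.Type.WAVE, new File("out_16000.wav"));
        }
    }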
http://www.jsresources.org/examples/SampleRateConverter.html - I suppose this would help you...
