I have an ogg file loaded with the Sound class. I set it to loop and it plays fine, but I hear a click when it starts over. The sound is barely 7 seconds long.
Is there a way to fix this? It is quite annoying.
There is undoubtedly a big discontinuity in the signal between the ending moment and the starting moment of the loop. With the Java Clip as it is constructed, there is no way to counteract this that I know of.
I recommend editing the sound in something like Audacity to try to make the two ends more similar. The easiest dodge is to taper both ends to silence, but that leaves a gap.
The other possibility is writing your own Clip and adding a provision that allows a degree of overlap between the two edges. Then you can add tapers in Audacity and have the tapered parts overlap. This is similar to how audio editing ("splicing") is done in a DAW.
That is a bit of work, though! I have done this and it does work well, but I am still a month or three away from releasing my audio library. To program it yourself, read the bytes into an array, converting to PCM. Then you can use TargetDataLine as an example of an interface that allows reading progressive data (it is overkill, but the read() method is the key). In your read, make a provision for starting a LERP (linear interpolation) of the PCM data at the overlap point.
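If it helps, here is a minimal sketch of that LERP overlap in Java, assuming the sound has already been decoded to a mono float array in the range [-1, 1] (the method name and parameters are mine, not from any released library):

```java
/**
 * Blends the tail of a decoded sound into its head so the loop point
 * has no discontinuity. pcm is mono float PCM in [-1, 1]; overlap is
 * the number of samples to crossfade (must be less than pcm.length).
 */
public static float[] makeSeamlessLoop(float[] pcm, int overlap) {
    int loopLen = pcm.length - overlap;      // length of the final loop
    float[] out = new float[loopLen];
    for (int i = 0; i < overlap; i++) {
        float t = (float) i / overlap;       // ramps 0 -> 1 across the overlap
        // LERP: the tail fades out while the head fades in.
        out[i] = (1f - t) * pcm[loopLen + i] + t * pcm[i];
    }
    // The remainder of the loop is copied unchanged.
    System.arraycopy(pcm, overlap, out, overlap, loopLen - overlap);
    return out;
}
```

Loop the returned array and the junction lands on a continuous stretch of the original signal, so there is no click.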
If another means of dealing with this issue comes up, I will be happy to learn about it!
I want to generate chords from a MIDI file, but I can't find any source code written in Java yet. So I want to write it myself. What I want to do first is gather the notes in the same position, but this is the problem: I don't know if there is a way to get the MIDI note position using JMusic. If not, is there any way to get this information? Thank you all~
Like slim mentioned, MIDI files are basically a collection of MIDI events, which are basically hex bytes that correspond to a MIDI action. EVERYTHING in MIDI, from very in-depth things like tempo and instrument-bank selection to typical things such as note on/off events and note volume (called velocity in MIDI), is controlled by those MIDI events. Unfortunately, you're going to need ALL of that information in order to do what you want.
My advice? There is this odd notion (one that I held too, before working with MIDI) that MIDI should be simple to work with because it's been around for so long and it's so easy to use in DAWs like FL Studio. Well, let me be the person to crush this notion for you: MIDI is complex, hard, and idiosyncratic. As a beginner, it will force you to take caffeine (to keep the gears rolling, and for the headaches), Tylenol (also for the headaches), and alcohol when you realize you just worked 6 hours on one thing and the fatigue is setting in. Turn back now, pay Dave Smith to help you, or hit the books, because it's going to get nasty.
HOWEVER: You will never feel greater success than when your baby starts playing music.
So I tested downsampling a music track in Java using the javax.sound API.
First of all, I had the original mp3, which I converted to .wav so that Java's AudioFormat would accept it. Then I used AudioSystem.getAudioInputStream(AudioFormat targetFormat, AudioInputStream sourceStream) to downsample the .wav file.
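For context, the conversion step looks roughly like this (file names and the 22050 Hz target are placeholders, and whether a rate-converting provider is available depends on the JRE):

```java
import javax.sound.sampled.*;
import java.io.File;

public class Downsample {
    public static void main(String[] args) throws Exception {
        AudioInputStream source =
                AudioSystem.getAudioInputStream(new File("track.wav"));
        AudioFormat src = source.getFormat();
        // Same encoding, bit depth and channel layout; half the sample rate.
        AudioFormat target = new AudioFormat(
                src.getEncoding(), 22050f, src.getSampleSizeInBits(),
                src.getChannels(), src.getFrameSize(), 22050f, src.isBigEndian());
        AudioInputStream down = AudioSystem.getAudioInputStream(target, source);
        AudioSystem.write(down, AudioFileFormat.Type.WAVE, new File("track-22k.wav"));
    }
}
```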
Here you can see the original mp3 file in Audacity:
After converting it with JLayer and applying Java's Sound API to it, the downsampled track looked like this:
However, by using another program, dBPoweramp, it looked like this:
You can see that the amplitudes of the wave are higher than in the version I downsampled with Java. Therefore it sounds louder and a bit more like the original mp3 file, whereas my own file sounds very quiet compared to the original.
Now my questions:
How can I achieve this effect? Is it better to have higher amplitudes, or are they just cut off, as you can see in the picture sampled by dBPoweramp? And why is there any difference at all?
I'm not entirely sure what you mean by quality here, but it's no surprise whatsoever that the nature of a downsampled signal will be different from the original: it will have been filtered to remove frequencies that would violate the Nyquist rate at the new sample frequency.
We would therefore expect the gain of the down-sampled signal to be lower than that of the original. From that perspective, the signal produced by JLayer looks far more plausible than that of dBPoweramp.
The second graph appears to have higher gain than the original, so I suspect make-up gain has been applied, and possibly dynamic range compression and brick-wall limiting (the signal has periods whose peaks appear to sit right at the limit). Or worse, it's simply clipped.
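To illustrate that last possibility: hard clipping is nothing more than gain followed by a clamp. A toy sketch on float samples (the gain value is arbitrary):

```java
// Toy illustration: apply make-up gain, then hard-clip to [-1, 1].
// Any peak pushed past full scale is flattened, which is exactly the
// plateau shape visible in a brick-wall-limited or clipped waveform.
static void gainAndClip(float[] samples, float gain) {
    for (int i = 0; i < samples.length; i++) {
        float s = samples[i] * gain;
        samples[i] = Math.max(-1f, Math.min(1f, s));
    }
}
```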
And this brings us back to the definition of quality: it's subjective. Lots of commercial music is heavily compressed and brick-wall limited as part of the production process, for a whole variety of reasons, one of which is artistic effect. It seems you're getting more of this from dBPoweramp, and it may well flatter your taste and the content.
It is probably not a clean conversion. In any objective measurement of system performance (e.g. PSNR), the quality would be lower.
If objective quality is what you're after, much better performance is achieved by rendering the mp3 at a lower sample rate directly, rather than decoding to PCM and then downsampling.
As a final word of caution: Audacity does some processing on the signal in order to render the time-amplitude graph. I suspect it is showing the maximum amplitude per point on the x-axis.
I am working on a personal project. Basically, I have a collection of small sound clips, like a clap or a beep noise. I want to create a program that listens via a mic or some other audio input, and when I play a sound clip it should identify that clip.
I have tried looking into this myself and have found this article.
http://www.redcode.nl/blog/2010/06/creating-shazam-in-java/
I tried replicating it, but I found that it doesn't work as expected. I am guessing the sound clips I am using to create my hashes are too short to produce enough values to compare.
I'm wondering if there are any well-known programs or algorithms that are capable of doing this.
Dan Ellis' slides are probably a good start. They explain the principal task of audio fingerprinting and the two best known approaches:
The Shazam algorithm by A. Wang (paper)
The Philips (now Gracenote) algorithm by Haitsma/Kalker (paper)
As you have already tried the landmark (Shazam) approach, perhaps it's worth your time to fiddle around with the stream-based approach. Since your queries are very short, you might also want to tweak the analysis frame length and overlap. Shorter frames and greater overlap may improve your results for very short samples. If you want to delve even deeper into the Haitsma/Kalker algorithm, you might also be interested in this unfortunately paywalled paper (by me).
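For what it's worth, the framing bookkeeping is easy to experiment with. Here is a sketch that slices a mono signal into overlapping, Hann-windowed analysis frames; frameLen and hop are the knobs to tweak (a smaller hop means greater overlap):

```java
// Slice a mono signal into overlapping, Hann-windowed analysis frames.
// Shorter frames and a smaller hop yield more frames, which matters
// when the query clips are only a second or two long.
static float[][] frames(float[] signal, int frameLen, int hop) {
    int count = Math.max(0, (signal.length - frameLen) / hop + 1);
    float[][] out = new float[count][frameLen];
    for (int f = 0; f < count; f++) {
        int start = f * hop;
        for (int i = 0; i < frameLen; i++) {
            // Hann window to reduce spectral leakage in the later FFT.
            double w = 0.5 - 0.5 * Math.cos(2 * Math.PI * i / (frameLen - 1));
            out[f][i] = (float) (signal[start + i] * w);
        }
    }
    return out;
}
```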
I'm making a program for Active Noise Control (some say Adaptive instead of Active, or Cancellation instead of Control).
The system is pretty simple:
get sound via a mic
turn the sound into data I can read (something like an integer array)
make the antiphase of the sound
turn the data back into a sound file
Following are my questions:
1. Can I read sound as an integer array?
2. If I can use an integer array, how can I make the antiphase? Just multiply every data point by -1?
3. Any useful thoughts about my project?
4. Is there any recommended language other than Java?
I heard that Stack Overflow has many top-class programmers, so I expect a critical answer :D
Answering your questions:
(1) When you read sound, a byte array is returned. The bytes can readily be decoded into integers, shorts, floats, whatever. Java supports many common formats, and probably has one that matches your microphone input and speaker output. For example, Java supports 16-bit encoding, stereo, 44100 frames per second, which is considered the standard for CD quality. There are several questions already on StackOverflow that show the code for decoding and re-encoding back to bytes.
(2) Yes, just multiply every element of your PCM array by -1. When you add the negative to the correctly lined-up counterpart, 0 will result.
(3 & 4) I don't know what the tolerances are for lag time! I think if you simply take the input, decode, multiply by -1, recode, and output, it might be possible to get a very small amount of processing time. I don't know what Java is capable of here, but I bet it will be on the scale of a dozen millis, give or take. How much is enough for cancellation? How far does the sound travel from mike to speaker location? How much time does that allow? (Or am I missing something about how this works? I haven't done this sort of thing before.)
Java is pretty darn fast, and you will be relatively close to the native code level with the reading and writing and simple numeric conversions. The core code (for testing) could probably be written in an afternoon, using the following tutorial examples as a template: Reading/Writing sound files, see code snippets. I'd pay particular attention to the spot where the comment reads "Here do something useful with the audio data that is in the bytes array..." At this point, you would put the code to convert the bytes to PCM values, multiply by -1, and convert back to bytes.
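A sketch of what that spot might contain, assuming 16-bit little-endian mono PCM (the format and the method name are my assumptions):

```java
// Decode 16-bit little-endian PCM bytes to shorts, invert the phase,
// and encode back into the same byte buffer.
static void invertPhase(byte[] buf, int validBytes) {
    for (int i = 0; i + 1 < validBytes; i += 2) {
        short sample = (short) ((buf[i] & 0xFF) | (buf[i + 1] << 8));
        // Negate; Short.MIN_VALUE has no positive counterpart, so clamp it.
        sample = (sample == Short.MIN_VALUE) ? Short.MAX_VALUE : (short) -sample;
        buf[i] = (byte) (sample & 0xFF);
        buf[i + 1] = (byte) ((sample >> 8) & 0xFF);
    }
}
```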
If Java doesn't prove fast enough, I assume the next thing to try would be some flavor of C.
So I want to make a new music player for Android, it's going to be open source and if you think this idea is any good feel free to let me know and maybe we can work on it.
I know it's possible to speed up and slow down a song and normalize the sound so that the voices and instruments still hit the same pitch.
I'd like to make a media player for Android aimed at joggers which will:
Beat match successive songs
Maintain a constant beat for running to
Beat can be established via accelerometer or manually
Alarms and notifications automatically at points in the run (geo-located or timer-based)
Now I know this will fall down in many use cases (slow songs sounding stupid, beat changes within a song getting messed up), but I feel those can be overcome. What I really need to know is how to get started writing an application in C++ (using the Android NDK) which will perform the analysis and adjust the stream.
Will it be feasible to do this on the fly? What approach would you use? A server that streams to the phone? Maybe offline analysis of the songs on a desktop that gets synched to your device via tether?
If this is too many questions for one post I am most interested in the easiest way of analysing the wave of an MP3 to find the beat. On top of that, how to perform the manipulation, to change the beat, would be my next point of interest.
I had a tiny crappy mp3 player that could do double speed on the fly so I'm sure it can be done!
Gav
This is technologically feasible on a smartphone-type device, although it is extremely difficult to achieve good-sounding pitch-shifting and time-stretching effects even on a powerful PC with no realtime constraint.
Pitch-shifting and time-stretching can be achieved on a relatively powerful mobile device in realtime (I've done it in .Net CF on a Samsung i760 smartphone) without overly taxing the processor (the simple version is not much more expensive than ordinary MP3 playback). The effect is not great, although it doesn't sound too bad if the pitch and time changes are relatively small.
Automatic determination of a song's tempo might be too time-consuming to do in real time, but this part of the process could be performed in advance of playback, or it could be done on the next song well before the current song is finished playing. I've never done this myself, so I dunno.
Everything else you mentioned is relatively easy to do. However: I don't know how easy Android's API is regarding audio output, or even whether it allows the low-level access to audio playback that this project would require.
Actually, you'll have 2 problems:
Finding the tempo of a song is not easy. The most common method involves autocorrelation (a sketch follows this list), which involves quite a bit of calculus, so I hope you've studied up.
Actually changing the tempo of a song without shifting its pitch is even harder, and it still leaves audible artifacts in the song. Typically it takes a long time to edit audio in this way, and a lot of tinkering to get the song to sound good. Performing this in real time would be very, very hard. The actual process involves taking the Fourier transform of the audio, shifting the frequency content, and taking the inverse Fourier transform. More calculus, this time with imaginary numbers.
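To give a feel for problem 1, here is a bare-bones sketch of the autocorrelation idea, run on an onset-energy envelope rather than on raw samples (all the constants are guesses to tune, and real beat trackers are considerably more involved):

```java
// Bare-bones tempo estimate: autocorrelate an energy envelope and pick
// the lag with the strongest self-similarity in the 60-180 BPM range.
// envelope holds one energy value per analysis frame; frameRate is the
// number of envelope frames per second.
static double estimateBpm(float[] envelope, double frameRate) {
    int minLag = (int) (frameRate * 60.0 / 180.0); // fastest tempo considered
    int maxLag = (int) (frameRate * 60.0 / 60.0);  // slowest tempo considered
    int bestLag = minLag;
    double best = Double.NEGATIVE_INFINITY;
    for (int lag = minLag; lag <= maxLag; lag++) {
        double sum = 0;
        for (int i = 0; i + lag < envelope.length; i++) {
            sum += envelope[i] * envelope[i + lag]; // correlation at this lag
        }
        if (sum > best) {
            best = sum;
            bestLag = lag;
        }
    }
    return 60.0 * frameRate / bestLag;             // convert lag to BPM
}
```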
If you really want to work on this I suggest taking a class in signals and systems from an Electrical Engineering department.
Perhaps an easier idea: find the tempo of all the songs in a user's library, and just focus on playing songs whose tempo is close to the jogger's pace. You still need to do #1, but you don't need to worry about #2.
Changing the audio speed on the fly is definitely doable; I'm just not sure whether it's doable on the G1.
Rather than writing your own source I would recommend looking at the MythTV source and/or the mplayer source code. They both support speeding up video playback while compensating the audio.
http://picard.exceed.hu/tcpmp/test/
tcpmp did everything you asked for on an itty-bitty Palm Centro... and more, including video! If it can be done on a Palm Centro, it sure as heck can be done on Android!!