The file format is "PCM_SIGNED 44100.0 Hz, 16 bit, stereo, 4 bytes/frame, little-endian", and I want to add two such files together while amplifying one of them. I plan to read the two wav files into two AudioInputStream instances, store their contents in two byte[] arrays, manipulate the arrays, and return the result as another AudioInputStream instance.
I have done a lot of research but I have got no good results.
I know there is a class from www.jsresources.org that mixes two AudioInputStreams, but it doesn't allow me to modify either of the two streams before mixing, and I want to decrease the volume of one of the streams before mixing them. What do you think I should do?
To do this, you can convert the streams to PCM data, multiply the channel whose volume you wish to change by the desired factor, add the PCM data from the results together, then convert back to bytes.
To access the AudioStreams on a per-byte basis, check out the first extended code fragment at the Java Tutorials section on Using Files and Format Converters. This shows how to get an array of sound byte data. There is a comment that reads:
// Here, do something useful with the audio data that's
// now in the audioBytes array...
At this point, iterate through the bytes, converting to PCM. A set of commands based on the following should work:
for (int i = 0; i < numBytes; i += 2)
{
// low byte is masked to stay unsigned; high byte keeps the sign
pcmA[i/2] = ( audioBytesA[i] & 0xff ) | ( audioBytesA[i + 1] << 8 );
pcmB[i/2] = ( audioBytesB[i] & 0xff ) | ( audioBytesB[i + 1] << 8 );
}
In the above, audioBytesA and audioBytesB are the byte arrays read from the two input streams (names based on the code from the example), and pcmA and pcmB could be either int arrays or short arrays, holding values that fit within the range of a short. It might be best to make the pcm arrays floats, since you will be doing some math that will result in fractions. Using floats as in the example below only adds one place worth of accuracy (better rounding than when using int), and int would perform faster. I think using floats is more common when the audio data gets normalized for use with additional processing.
From there, the best way to change volume is to multiply every PCM value by the same amount. For example, to increase volume by 25%,
pcmA[i] = pcmA[i] * 1.25f;
Then, add pcmA and pcmB, and convert back to bytes. You might also want to put in min or max functions to ensure that the volume & merging do not exceed values that can fit in the format's 16 bits.
I use the following to convert back to bytes:
for (int i = 0; i < numBytes / 2; i++) // one pass per 16-bit sample
{
outBuffer[i*2] = (byte) pcmCombined[i];
outBuffer[(i*2) + 1] = (byte)((int)pcmCombined[i] >> 8 );
}
Above assumes pcmCombined[] is a float array. The conversion code can be a bit simpler if it is a short[] or int[] array.
I cut and pasted the above from dev work I did for programs posted at my website, and edited it for your scenario, so if a typo or bug has crept in, please let me know in the comments and I will fix it.
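Putting the steps above together, here is a minimal sketch, assuming both inputs are already byte arrays of 16-bit signed little-endian samples. The class name PcmMixer and the method mix are made up for illustration, and the clamping policy (hard clipping) is only one possible choice.

```java
/** Sketch: mixes two 16-bit little-endian PCM buffers, scaling the first by a gain. */
final class PcmMixer {

    /**
     * Assumes both arrays hold 16-bit signed little-endian samples.
     * Only the overlapping part (shorter length, whole samples) is mixed.
     */
    static byte[] mix(byte[] a, byte[] b, float gainA) {
        int numBytes = Math.min(a.length, b.length) & ~1; // drop a trailing odd byte
        byte[] out = new byte[numBytes];
        for (int i = 0; i < numBytes; i += 2) {
            // Low byte is masked to stay unsigned; high byte carries the sign.
            int sampleA = (a[i] & 0xff) | (a[i + 1] << 8);
            int sampleB = (b[i] & 0xff) | (b[i + 1] << 8);
            int mixed = Math.round(sampleA * gainA) + sampleB;
            // Clamp to the 16-bit range so loud passages clip instead of wrapping.
            mixed = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, mixed));
            out[i] = (byte) mixed;
            out[i + 1] = (byte) (mixed >> 8);
        }
        return out;
    }
}
```

For example, PcmMixer.mix(bytesA, bytesB, 0.5f) halves the first file before adding the second. The result can be wrapped back into a stream with new AudioInputStream(new ByteArrayInputStream(out), format, out.length / format.getFrameSize()).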
I have recently come across the problem of creating arrays with values that have a specified bit length. Say an array with 13bits instead of 8,16,32 etc. I tried to look for a good tutorial/article about it as I am new to bit operations. Though I am not really sure of what to search for. I presume the array would work with a backing array of bytes or longs...
My ultimate question is if you can show me if there is a duplicate question or tutorial out there.
If not perhaps show me an example. AND if you got the time write a short explanation.
Thank you.
EDIT: The purpose is not to make an array of, say, longs and only use 40% of it. I want the values packed together to save space and to be compatible with the thing I'm making.
It's not possible to "create your own primitive types" in java. Also I don't think there is any library around here to do what you want. I think most people would go with the overhead of losing some memory, especially at bit level. Maybe C or Cpp would have been a wiser choice (and I'm not even sure).
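You'll have to create your own bit manipulation library. There are many ways to do it; I'll give you one. I began using a byte[] but it's more complex. As a rule, use the biggest normal type (e.g. for 48-bit elements, use 32-bit types as storage). To keep the numbers small, let's treat the storage as 16-bit words for 100 of your 13-bit values. Note that a Java int is really 32 bits: the int array below only uses the low 16 bits of each entry, and short[] or char[] would be the true 16-bit types. I'll use big-endian-style storage.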
int intArraySize = 100 * 13 / 16 + 1; // 1300 bits in 16-bit words; + 1 is just to be sure...
int[] intArray = new int[intArraySize];
Now, how do you access the value at index 6, for example? You'll need at most two ints of your array, plus an integer to hold the result.
int pos = 6;
int buffer = 0;
int firstPart = intArray[(pos * 13) / 16]; // 1010 0110 1100 0011
int secondPart = intArray[(pos * 13) / 16 + 1]; // 1001 1110 0101 1111
int begin = pos * 13 % 16;
The variable begin = 14 is the bit at which your number begins. That means that of your 13-bit element, 16 - 14 = 2 bits are in the first (left) int and the rest (13 - 2 = 11) are in the second (right).
The number you want is 1010 0110 1100 00{11 and 1001 1110 010}1 1111.
You're going to put these two ints into one now. Right shift the secondPart 5 times (so it becomes the right part of your final number), left shift the firstPart 11 times, and add them in the buffer. Because it's a 13-bit element, you'll need to clean (with a bitmask such as 0x1FFF) everything above the low 13 bits of the buffer, and voilà!
I'll let you guess how to insert a value in the array (try doing the same steps, but in reverse), and be careful not to erase other values. And if you haven't looked yet: https://docs.oracle.com/javase/tutorial/java/nutsandbolts/op3.html
Disclaimer: I didn't try the code, so there might be some errors; maybe you'll have to add or remove 1 to begin. But you get the general idea. The first thing you should do is make a function that prints/logs any integer (or byte, or whatever) in its binary representation. There are multiple possibilities here (see "Print an integer in binary format in Java"), because you're going to need them to test every step of your code.
I still think it's a bad idea to store your special numbers this way (seriously, memory is rarely going to be an issue), but I found the exercise interesting, and maybe you really need that kind of storage. If you're curious, take a look at ByteArrayOutputStream; I'm not sure you'll ever need it for what you're doing, but who knows.
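For comparison, here is a minimal sketch of a packed array with both get and set. It deliberately differs from the walkthrough above: it assumes a long[] backing store (64-bit words) and stores bits least-significant-first, which keeps the shifts simple. The class PackedIntArray and its method names are hypothetical, not from any library.

```java
/** Sketch: fixed-width values (1 to 32 bits each) packed back to back in a long[]. */
final class PackedIntArray {
    private final long[] words;
    private final int bitsPerValue;
    private final long mask;
    private final int size;

    PackedIntArray(int size, int bitsPerValue) {
        if (size < 0 || bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("size >= 0 and 1..32 bits per value expected");
        }
        this.size = size;
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        this.words = new long[(int) (((long) size * bitsPerValue + 63) / 64)];
    }

    long get(int index) {
        checkIndex(index);
        long bitPos = (long) index * bitsPerValue;
        int word = (int) (bitPos >>> 6);     // which long holds the first bit
        int offset = (int) (bitPos & 63);    // bit position inside that long
        long value = words[word] >>> offset;
        int bitsInFirst = 64 - offset;
        if (bitsInFirst < bitsPerValue) {    // value spills into the next long
            value |= words[word + 1] << bitsInFirst;
        }
        return value & mask;
    }

    void set(int index, long value) {
        checkIndex(index);
        value &= mask;
        long bitPos = (long) index * bitsPerValue;
        int word = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        // Clear the old bits first so neighbouring values are not disturbed.
        words[word] = (words[word] & ~(mask << offset)) | (value << offset);
        int bitsInFirst = 64 - offset;
        if (bitsInFirst < bitsPerValue) {
            words[word + 1] = (words[word + 1] & ~(mask >>> bitsInFirst)) | (value >>> bitsInFirst);
        }
    }

    int storageBytes() {
        return words.length * 8;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
    }
}
```

With new PackedIntArray(100, 13), the 100 values occupy 21 longs (168 bytes) instead of 200 bytes as a short[] or 400 bytes as an int[].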
I transfer data like this: [DATA - any size bit array][CRC 15 bits].
How can I compute a 15-bit CRC from a bit array of any size, to detect accidental changes to the raw data?
Here is the start of the code:
byte crc[15];
int data_length = *any size*;
byte data[data_length]; // for example data = {1,1,1,0,1,0,1,0,1,1,1,1,0,1,0,1,1,1,0,1,0,1,1,1,1,0,1,0,1,1}
crc = get_crc(crc);
get_crc - ?
As a commenter said, your code example contains bytes, not bits. I assume you really want the last 15 bits of some byte array named data, and that you have checked that data has at least 2 bytes.
It's as simple as
short crc = (short) (((data[data_length - 2] & 0x7f) << 8) | (data[data_length - 1] & 0xff));
Side note: Maybe you are aware of it already, but 15-bit CRCs are not exactly the most reliable thing (though they are fast). If your data is not completely unimportant, a stronger algorithm could make sense (a SHA hash, or depending on the use case something else such as Reed-Solomon encoding, etc.).
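If the goal is instead to compute a 15-bit CRC over the bit array, a minimal bitwise sketch could look like the following. It assumes the data is stored one bit per byte (0 or 1), as in the question, and it assumes the CRC-15 generator polynomial 0x4599 used by CAN, since the question does not name one. The class Crc15 and its method names are made up for illustration.

```java
import java.util.Arrays;

/** Sketch: bit-at-a-time CRC-15 over an array holding one bit (0 or 1) per byte. */
final class Crc15 {
    // Assumed generator polynomial (the one used by CAN); the question does not specify one.
    static final int POLY = 0x4599;

    static int compute(byte[] bits) {
        int crc = 0;
        for (byte bit : bits) {
            // Feedback is the register's top bit XOR the incoming data bit.
            boolean feedback = (((crc >> 14) & 1) ^ (bit & 1)) != 0;
            crc = (crc << 1) & 0x7FFF; // keep the register at 15 bits
            if (feedback) {
                crc ^= POLY;
            }
        }
        return crc;
    }

    /** Returns [DATA][CRC 15 bits], with the CRC appended most significant bit first. */
    static byte[] appendCrc(byte[] bits) {
        int crc = compute(bits);
        byte[] frame = Arrays.copyOf(bits, bits.length + 15);
        for (int i = 0; i < 15; i++) {
            frame[bits.length + i] = (byte) ((crc >> (14 - i)) & 1);
        }
        return frame;
    }

    /** With zero initial value and no final XOR, running the CRC over data plus CRC yields 0. */
    static boolean isValid(byte[] frame) {
        return frame.length >= 15 && compute(frame) == 0;
    }
}
```

The receiver can then call Crc15.isValid(frame) on the whole [DATA][CRC] frame rather than extracting and comparing the CRC separately.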
My goal is to be able to process a single note from a guitar (or other instrument), and convert it into a frequency value.
This does not have to be in real time- I figured it would be much easier to record a one-second sound and analyse the data afterwards.
I understand that to do this I need to use a Fourier transform (and have a class that will perform a FFT). However, I don't really understand the input / output of a FFT algorithm- the class I am using seems to use a complex vector input and give a complex vector output. What do these represent?
Also, could anyone recommend any Java classes that can detect and record an input (and if possible, give frequency or values that can be plugged into FFT?)?
Thanks in advance.
Input to your FFT will be a time-domain signal representing the audio. If you record some sound for a second from the mic, this will really contain a wave that is made up of various frequencies at different amounts - hopefully mostly the frequency/frequencies corresponding to the note which you are playing, plus some outside noise and noise introduced by the microphone and electronics. If in that 1 second you happen to have, say, 512 time points (so the mic can sample at 512 times a second), then each of those time points represents the intensity picked up by the mic. These sound intensity values can be turned from their time-domain representation to a frequency-domain representation using the FFT.
If you now give this to the FFT, as it is a real-valued input, you will get a conjugate-symmetric complex output (mirrored around the central value), so you can ignore the second half of the complex vector output and use only the first half: the magnitudes in the second half simply mirror those in the first. The output represents the contributions of each frequency to the input waveform. In essence, each "bin" or array index contains information about that frequency's amplitude. To extract the amplitude you want to do:
magnitudeFFTData[i] = Math.sqrt((real * real) + (imaginary * imaginary));
where real and imaginary are the real and imaginary parts of the complex number at that frequency bin. To get the frequency corresponding to a given bin, you need the following:
frequency = i * Fs / N;
where i is bin or array index number, Fs the sampling frequency and N the number of data points. From a project of mine wherein I recently used the FFT:
for (int i = (curPersonFFTData.length / 64); i < (curPersonFFTData.length / 40); i++) {
double rr = (curPersonFFTData[i].getReal());
double ri = (curPersonFFTData[i].getImaginary());
magnitudeCurPersonFFTData[i] = Math.sqrt((rr * rr) + (ri * ri));
ds.addValue(magnitudeCurPersonFFTData[i]);
}
The divisions by 64 and 40 are arbitrary and useful for my case only, to only get certain frequency components, as opposed to all frequencies, which you might want. You can easily do all this in real time.
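To tie the magnitude and bin-to-frequency formulas together, here is a minimal sketch that finds the strongest frequency in a block of samples. To stay self-contained it uses a naive O(N²) discrete Fourier transform rather than an FFT class, which gives the same bins but is only practical for short inputs. The class PeakFrequency and its method are hypothetical names.

```java
/** Sketch: finds the dominant frequency of a real-valued signal using a naive DFT. */
final class PeakFrequency {

    /** Returns the frequency in Hz of the strongest bin in the first half of the spectrum. */
    static double dominantFrequency(double[] samples, double sampleRate) {
        int n = samples.length;
        int bestBin = 1;
        double bestMagnitude = -1.0;
        // Bin 0 is the DC offset, and bins above n/2 mirror the first half, so skip both.
        for (int k = 1; k < n / 2; k++) {
            double real = 0.0;
            double imaginary = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                real += samples[t] * Math.cos(angle);
                imaginary -= samples[t] * Math.sin(angle);
            }
            double magnitude = Math.sqrt(real * real + imaginary * imaginary);
            if (magnitude > bestMagnitude) {
                bestMagnitude = magnitude;
                bestBin = k;
            }
        }
        // frequency = i * Fs / N
        return bestBin * sampleRate / n;
    }
}
```

The frequency resolution is Fs / N, so one second of audio resolves to 1 Hz steps regardless of the sample rate. For a real recording you would swap the inner loops for an FFT library call and keep the peak search.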
We have this use case where we would like to compress and store objects (in-memory) and decompress them as and when required.
The data we want to compress is quite varied, from float vectors to strings to dates.
Can someone suggest any good compression technique to do this ?
We are looking at ease of compression and speed of decompression as the most important factors.
Thanks.
If you want to compress instances of MyObject you could have it implement Serializable and then stream the objects into a compressed byte array, like so:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
GZIPOutputStream gzipOut = new GZIPOutputStream(baos);
ObjectOutputStream objectOut = new ObjectOutputStream(gzipOut);
objectOut.writeObject(myObj1);
objectOut.writeObject(myObj2);
objectOut.close();
byte[] bytes = baos.toByteArray();
Then to uncompress your byte[] back into the objects:
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
GZIPInputStream gzipIn = new GZIPInputStream(bais);
ObjectInputStream objectIn = new ObjectInputStream(gzipIn);
MyObject myObj1 = (MyObject) objectIn.readObject();
MyObject myObj2 = (MyObject) objectIn.readObject();
objectIn.close();
Similar to previous answers, except I suggest you use DeflaterOutputStream and InflaterInputStream, as these are simpler/faster/smaller than the alternatives. The output is smaller because these streams do only the compression, whereas the alternatives add file format extras such as CRC checks and headers.
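A minimal round-trip sketch with these two streams, assuming the data is already a byte array (for objects, an ObjectOutputStream could be layered on top as in the earlier answer). The class DeflateBytes and its method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Sketch: compresses and decompresses a byte array in memory with raw zlib streams. */
final class DeflateBytes {

    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // Closing the stream flushes the final compressed block.
        try (DeflaterOutputStream out = new DeflaterOutputStream(baos)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (InflaterInputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                baos.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }
}
```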
If size is important, you might like to have a simple serialization of your own. This is because ObjectOutputStream has a significant overhead making small objects much larger. (It improves for larger object especially when compressed)
For example, a serialized Integer takes 81 bytes, and compression won't help much for such a small number of bytes. It is possible to cut this significantly.
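The overhead is easy to measure. The following sketch writes the same int once through ObjectOutputStream and once through DataOutputStream and returns the resulting byte counts. SerializedSize and its method names are hypothetical helpers for this comparison only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

/** Sketch: compares the serialized size of one int under two stream types. */
final class SerializedSize {

    /** Size when written as a boxed Integer with standard Java serialization. */
    static int objectStreamSize(int value) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(baos)) {
            out.writeObject(Integer.valueOf(value));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.size();
    }

    /** Size when written as a raw 4-byte int with no type metadata. */
    static int dataStreamSize(int value) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(baos)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.size();
    }
}
```

The DataOutputStream version is always 4 bytes. The price is that the reader must know the layout in advance, since no class information is stored.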
One proposal could be to use a combination of the following streams:
ObjectOutputStream / ObjectInputStream for serializing/deserializing Java objects
GZIPOutputStream / GZIPInputStream for compressing/uncompressing. There are other options to be found in the java.util.zip package.
ByteArrayOutputStream / ByteArrayInputStream for storing the data in memory as a byte array
Compression of serialized objects in Java is usually, well, not so good.
First of all you need to understand that a Java object has a lot of additional information not needed. If you have millions of objects you have this overhead millions of times.
As an example, let us use a doubly linked list. Each element has a previous and a next pointer, plus a long value (timestamp), a byte for the kind of interaction, and two integers for the user IDs. Since we use pointer compression, we are at 6 bytes * 2 + 8 + 4 * 2 = 28 bytes. Java adds 8 bytes + 12 bytes for the padding. This makes 48 bytes per element.
Now we create 10 million lists with 20 Elements each (time series of click events of users for the last three years (we want to find patterns)).
So we have 200Million * 48 Bytes of elements = 10GB memory (ok not much).
OK, besides the fact that garbage collection kills us and the overhead inside the JDK skyrockets, we end up with 10GB of memory.
Now lets use our own memory / object storage. We store it as a column wise data table where each object is actually a single row. So we have 200Million rows in a timestamp, previous, next, userIdA and userIdB collection.
Previous and next now point to row IDs and become 4 bytes each (or 5 bytes if we exceed 4 billion entries, which is unlikely).
So we have 8 + 4 + 4 + 4 + 4 => 24 * 200 Mio = 4.8GB + no GC problem.
Since the timestamp column stores the timestamps in a min/max fashion and our timestamps are all within three years, we only need 5 bytes to store each timestamp. Since the pointers are now stored as relative offsets (+ and -) and the click series are closely related in time, we only need 2 bytes on average for each of previous and next. For the user IDs we use a dictionary: since the click series cover roughly 500k users, we only need three bytes each.
So we now have 5 + 2 + 2 + 3 + 3 => 15 * 200Mio => 3GB + Dictionary of 4 * 500k * 4 = 8MB = 3GB + 8MB. Sounds different to 10GB right?
But we are not finished yet. Since we now have rows and data instead of objects, we store each series as a table row and use special columns that are collections of arrays, each actually storing 5 values plus a pointer to the next five values and a pointer to the previous ones.
So we have 10 million lists with 20 entries each (plus some overhead). Per list we have 20 * (5 + 3 + 3) + 5 * 6 (let's add some overhead for partly filled elements) => 20 * 11 + 5 * 6 => 250 * 10 million => 2.5GB, and we can access the arrays faster than walking elements.
But hey, it's not over yet... the timestamps are now stored relatively, requiring only 3 bytes per entry + 5 at the first entry, so we save a lot more: 20 * 9 + 2 + 5 * 6 => 212 * 10 million => 2.12GB. Now, storing it all to memory and gzipping it, we end up at about 1GB, since we can store it all linearly: first the length of the array, then all timestamps, then all user IDs, which makes it very likely that there are patterns in the bits that can be compressed. Since we use a dictionary, we just sort it according to the probability of each userId being part of a series.
And since everything is a table, you can deserialize everything at almost read speed, so 1GB on a modern SSD costs 2 seconds to load. Try this with serialization / deserialization and you can hear the inner user cry.
So before you ever compress serialized data, store it in tables, check each column / property if it can be logically be compressed. And finally have fun with it.
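As a rough illustration of two of the ideas above (timestamps stored as offsets from a minimum, and user IDs replaced by dictionary codes), here is a minimal column-store sketch. It is not the author's implementation: the class ClickColumns and its fields are invented, and it assumes every timestamp offset fits in an int, so it fails loudly otherwise instead of using the 5-byte encoding described.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: rows stored as parallel columns with offset and dictionary encoding. */
final class ClickColumns {
    final long minTimestamp;
    final int[] timestampOffsets; // timestamp minus minTimestamp (assumed to fit in an int)
    final int[] userCodes;        // index into userDictionary
    final long[] userDictionary;  // each distinct user ID stored once

    ClickColumns(long[] timestamps, long[] userIds) {
        if (timestamps.length != userIds.length) {
            throw new IllegalArgumentException("columns must have equal length");
        }
        long min = timestamps.length == 0 ? 0 : timestamps[0];
        for (long t : timestamps) {
            min = Math.min(min, t);
        }
        minTimestamp = min;
        timestampOffsets = new int[timestamps.length];
        userCodes = new int[userIds.length];
        Map<Long, Integer> codes = new HashMap<>();
        List<Long> distinct = new ArrayList<>();
        for (int row = 0; row < timestamps.length; row++) {
            // toIntExact throws if the offset does not fit, rather than silently truncating.
            timestampOffsets[row] = Math.toIntExact(timestamps[row] - min);
            Integer code = codes.get(userIds[row]);
            if (code == null) {
                code = distinct.size();
                codes.put(userIds[row], code);
                distinct.add(userIds[row]);
            }
            userCodes[row] = code;
        }
        userDictionary = new long[distinct.size()];
        for (int i = 0; i < userDictionary.length; i++) {
            userDictionary[i] = distinct.get(i);
        }
    }

    long timestamp(int row) {
        return minTimestamp + timestampOffsets[row];
    }

    long userId(int row) {
        return userDictionary[userCodes[row]];
    }
}
```

Each row here costs 8 bytes (two ints) instead of 16 bytes for two longs, before any bit packing or gzip is applied.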
And remember, 1TB of (ECC) memory costs 10k today. It's nothing. And a 1TB SSD costs 340 Euro. So do not waste your time on that issue unless you really have to.
The best compression technology I know is ZIP. Java supports it through ZipOutputStream and ZipInputStream. All you need to do is serialize your object into a byte array and then zip it.
Tips: Use ByteArrayOutputStream, DataOutputStream, ZipOutputStream.
There are various compression algorithms implemented in the JDK. Check [java.util.zip](http://download.oracle.com/javase/6/docs/api/java/util/zip/package-summary.html) for all the algorithms implemented. However, it may not be a good idea to compress all your data. For instance, a serialized empty array may be several dozen bytes long, as the name of the underlying class is in the serialized data stream. Also, most compression algorithms are designed to remove redundancy from large data blocks. On small to medium Java objects you'll probably have very little or no gain at all.
This is a tricky problem:
First, using ObjectOutputStream is probably not the answer. The stream format includes a lot of type-related metadata. If you are serializing small objects, the mandatory metadata will make it hard for the compression algorithm to "break even", even if you implement custom serialization methods.
Using DataOutputStream with minimal (or no) added type information will give a better result, but mixed data is generally not that compressible with a general-purpose compression algorithm.
For better compression, you may need to look at the properties of the data that you are compressing. For instance:
Date objects could be represented as int values if you know that they have a precision of 1 day.
Sequences of int values could be run-length encoded, or delta-encoded if they have the right properties.
and so on.
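As one concrete instance of the ideas listed above, here is a minimal delta-encoding sketch for int sequences. The class DeltaCodec is a made-up name. It does not shrink anything by itself; the point is that slowly changing values turn into small numbers, which a variable-length encoding or a general-purpose compressor can then store more compactly.

```java
/** Sketch: delta encoding, storing each value as the difference from its predecessor. */
final class DeltaCodec {

    static int[] encode(int[] values) {
        int[] deltas = new int[values.length];
        int previous = 0;
        for (int i = 0; i < values.length; i++) {
            deltas[i] = values[i] - previous; // first entry is the value itself
            previous = values[i];
        }
        return deltas;
    }

    static int[] decode(int[] deltas) {
        int[] values = new int[deltas.length];
        int running = 0;
        for (int i = 0; i < deltas.length; i++) {
            running += deltas[i]; // a running sum restores the original values
            values[i] = running;
        }
        return values;
    }
}
```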
Whichever way you do it, you will need to do a serious amount of work to get a worthwhile amount of compression. IMO, a better idea would be to write the objects to a database, datastore or file and use caching to keep frequently used objects in memory.
If you need to compress arbitrary objects, a possible approach is to serialize the object into a byte array, and then use e.g. the DEFLATE algorithm (the one used by GZIP) to compress it. When you need the object, you can decompress and deserialize it. Not sure about how efficient this would be, but it will be completely general.
How can I merge two wav files using java?
I tried this but it didn't work correctly. Is there any other way to do it?
If you work with the bytes of a wav file directly you can use the same strategy in any programming language. For this example I'll assume the two source files have the same bitrate/numchannels and are the same length/size.
(if not you can probably edit them before starting the merge).
First look over the wav specification; I found a good one at a Stanford course website:
Common header lengths are 44 or 46 bytes.
If you want to concatenate two files (i.e. play one wav then the other in a single file):
1. Find out what format your wav files are.
2. Chop off the first 44/46 bytes, which are the headers; the remainder of the file is the data.
3. Create a new file and stick one of the headers in that.
new wav file = {header} = {44/46} bytes long
4. Add the two data parts from the original files.
new wav file = {header + data1 + data2} = {44/46 + size(data1) + size(data2)} bytes long
5. Modify your header in two places to reflect the new file's length.
a. Modify bytes 4+4 (i.e. 4 bytes starting at offset 4).
The new value should be a 4-byte little-endian integer holding the size of the new wav file in bytes minus 8, i.e. {44/46 + size(data1) + size(data2)} - 8.
b. Modify bytes 40+4 or 42+4 (the 4 bytes starting at offset 40 or 42, depending on whether you have a 44-byte or a 46-byte header).
The new value should be a 4-byte little-endian integer holding the size of the data part only, i.e. {size(data1) + size(data2)}.
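The concatenation steps can be sketched on byte arrays as follows. This assumes both files are already in memory, share the same format, and use the canonical 44-byte PCM header, where the field at offset 4 holds the file size minus 8 and the field at offset 40 holds the byte count of the sample data. WavConcat is a hypothetical helper, not part of javax.sound.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch: concatenates two in-memory wav files that share a 44-byte PCM header layout. */
final class WavConcat {
    // Assumption: canonical header with no extra chunks before the sample data.
    static final int HEADER_SIZE = 44;

    static byte[] concatenate(byte[] wav1, byte[] wav2) {
        if (wav1.length < HEADER_SIZE || wav2.length < HEADER_SIZE) {
            throw new IllegalArgumentException("input shorter than a wav header");
        }
        int dataSize1 = wav1.length - HEADER_SIZE;
        int dataSize2 = wav2.length - HEADER_SIZE;
        byte[] out = new byte[HEADER_SIZE + dataSize1 + dataSize2];
        // Header and samples of the first file, then only the samples of the second.
        System.arraycopy(wav1, 0, out, 0, wav1.length);
        System.arraycopy(wav2, HEADER_SIZE, out, wav1.length, dataSize2);
        // Wav size fields are little-endian 32-bit integers.
        ByteBuffer header = ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN);
        header.putInt(4, out.length - 8);         // RIFF chunk size: whole file minus 8
        header.putInt(40, dataSize1 + dataSize2); // data chunk size: samples only
        return out;
    }
}
```

Files with a 46-byte header or with additional chunks would need the offsets located by parsing the chunk list instead of being hard-coded.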
If you want to instead merge or mix the two files (so that they both play at the same time then):
1. You won't have to edit the header if both files are the same length.
2. Starting at byte 44/46, you will have to edit each sample to be the value in data1 + the value in data2.
So for example, if your BitsPerSample was 8 you would modify 1 byte; if it was 16 you would modify 2 bytes.
The rest of the file is just samples of 1 or 2 bytes, each storing an int value representing the waveform of the sound at that time.
a. For each of the remaining samples in the file, grab the 1- or 2-byte value and get the int value from both files, data1 and data2.
b. Add the two integers together, convert the result back to 1 or 2 bytes, and use that value in your output file.
c. You normally have to divide that number by 2 to get an average value that fits back in the original 1- or 2-byte sample block. I was getting distortion when I tried it in Objective-C (probably related to signed or unsigned ints) and just skipped the division part, since it is only likely to be a problem if you are merging very loud sounds together,
i.e. when data1 + data2 is larger than what 1 or 2 bytes can hold, the sound will clip. There was a discussion about the clipping issue here and you may want to try one of those clipping techniques.
Merge implies mixing, but it sounds like you mean concatenation here.
To concatenate with silence in the middle you need to insert a number of frames of silence into the file. A silent frame is one where every channel has a "0" - if you are using signed samples this is literally a 0, for unsigned, it is maxvalue/2.
Each frame will have one sample for each channel. So to generate one second of silence in CD format, you would insert 44100 (hz) * 2 (channels per frame) = 88200 16 bit signed ints with a value of 0 each. I am not sure how to access the raw file abstracted by the Java audio abstractions, but that is the data to insert.
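The silence data itself is easy to build as a byte array. The following sketch covers the two cases mentioned above, 16-bit signed and 8-bit unsigned PCM. Silence and its method names are invented for illustration; splicing the bytes into the file is left to whichever approach you use for the concatenation.

```java
import java.util.Arrays;

/** Sketch: generates raw PCM silence to insert between two clips. */
final class Silence {

    /** 16-bit signed PCM: silence is the sample value 0, so a zero-filled array is enough. */
    static byte[] signed16(double seconds, int sampleRate, int channels) {
        int frames = (int) Math.round(seconds * sampleRate);
        return new byte[frames * channels * 2]; // Java zero-fills new arrays
    }

    /** 8-bit unsigned PCM: silence sits at the midpoint of the range, 0x80. */
    static byte[] unsigned8(double seconds, int sampleRate, int channels) {
        int frames = (int) Math.round(seconds * sampleRate);
        byte[] out = new byte[frames * channels];
        Arrays.fill(out, (byte) 0x80);
        return out;
    }
}
```

For one second of CD-format audio, Silence.signed16(1.0, 44100, 2) returns 176400 bytes, which is the 88200 16-bit samples described above.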