I'm trying to encode raw pcm data as uLaw to save on the bandwidth required to transmit speech data.
I have come across a class called UlawEncoderInputStream on This page but there is no documentation! :(
The constructor takes an input stream and a max pcm value (whatever that is).
/**
 * Create an InputStream which takes 16 bit pcm data and produces ulaw data.
 * @param in InputStream containing 16 bit pcm data.
 * @param max pcm value corresponding to maximum ulaw value.
 */
public UlawEncoderInputStream(InputStream in, int max) {
After looking through the code, I suspect that I should calculate this "max" value using a supplied function, maxAbsPcm. The problem is, I don't really understand what I'm meant to pass into it! I am recording my raw PCM to a file on the SD card, so I don't have one continuous memory-resident array of data to pass to it.
/**
 * Compute the maximum of the absolute value of the pcm samples.
 * The return value can be used to set ulaw encoder scaling.
 * @param pcmBuf array containing 16 bit pcm data.
 * @param offset offset of start of 16 bit pcm data.
 * @param length number of pcm samples (not number of input bytes)
 * @return maximum abs of pcm data values
 */
public static int maxAbsPcm(byte[] pcmBuf, int offset, int length) {
Another problem I have using this code is that I am unsure what values to write out in the header for uLaw data. How do I determine how much less byte data there is after encoding with uLaw?
I have listened to one of the (potentially) uLaw-encoded files that I created in VLC media player (the only player I have that will attempt to read the file), and it sounds nasty, broken and clicky, but I can still make out the voice.
I am writing my wave header using code similar to a class I found called WaveHeader, which can be found here.
If anyone has any thoughts on this matter I would be most grateful to hear them!:)
Many thanks
Dexter
The max in the constructor is the maximum amplitude in the PCM data. It is used to scale the input before generating the output. If the input is very loud you need a higher value; if it's quiet you need a lower one. If you pass in 0, the encoder will use 8192 by default, which may be good enough.
The length in the other method is the number of 16-bit samples from which you want to find the maximum amplitude. This class assumes that the input PCM data is always encoded with 16-bit samples, which means that each sample spans two bytes: if your input is 2000 bytes long you have 1000 samples.
The encoder in this class produces one 8-bit µ-Law sample for every 16-bit PCM sample, so the size in bytes is halved.
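Since your PCM is already sitting in a file on the SD card rather than in one big array, one option is to stream the file in chunks and feed each chunk to maxAbsPcm, keeping the running maximum. A minimal sketch, assuming raw 16-bit PCM with no header (the path and buffer size are illustrative):

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Stream a raw 16-bit PCM file and return the largest absolute sample value,
// suitable for passing as "max" to the UlawEncoderInputStream constructor.
public static int maxAbsPcmInFile(String path) throws IOException {
    byte[] buf = new byte[4096];   // holds a whole number of 2-byte samples
    int max = 0;
    InputStream in = new FileInputStream(path);
    try {
        int bytesRead;
        while ((bytesRead = in.read(buf)) > 0) {
            // length is in samples, not bytes, hence bytesRead / 2
            int chunkMax = UlawEncoderInputStream.maxAbsPcm(buf, 0, bytesRead / 2);
            if (chunkMax > max) {
                max = chunkMax;
            }
        }
    } finally {
        in.close();
    }
    return max;
}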
This is the opposite of what you are trying to do, but I thought it could be helpful to someone. Here is an example method that will convert an 8-bit uLaw-encoded binary file into a 16-bit WAV file using built-in Java methods.
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat.Type;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioFormat.Encoding;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public static void convertULawFileToWav(String filename) {
    File file = new File(filename);
    if (!file.exists())
        return;
    try {
        long fileSize = file.length();
        int frameSize = 160;                     // bytes per frame
        long numFrames = fileSize / frameSize;
        // 8 kHz, 8-bit, mono uLaw; 50 frames/s * 160 bytes/frame = 8000 bytes/s
        AudioFormat audioFormat = new AudioFormat(Encoding.ULAW, 8000, 8, 1, frameSize, 50, true);
        AudioInputStream audioInputStream =
                new AudioInputStream(new FileInputStream(file), audioFormat, numFrames);
        AudioSystem.write(audioInputStream, Type.WAVE, new File("C:\\file.wav"));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
I am recording sound from the mic using AudioRecord. The recoChunk byte[] stores the raw recording, as shown below.
while (isRecording) {
    Log.w("myMsg", "recording..");
    recoChunk = new byte[minBuffSize];
    // read() fills recoChunk and returns the number of bytes actually read
    audioRecord.read(recoChunk, 0, minBuffSize);
    mFosRaw.write(recoChunk);
}
Now, from recoChunk, I want to find the largest amplitude recorded. How can I do that?
You can convert your byte array to an array of a type whose size matches the bit depth of your recorded audio. For 16-bit audio you can use short, since it holds a 16-bit signed integer value; for 8-bit you can just use the byte array without conversion. Then, simply, the largest number in the array (you will probably want to take the absolute value) is the sample with the highest amplitude.
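A minimal sketch of that idea, assuming 16-bit little-endian PCM such as Android's ENCODING_PCM_16BIT (the method and parameter names are just illustrative):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

// Find the largest absolute 16-bit sample value in a raw PCM byte buffer.
public static int maxAmplitude(byte[] pcm, int validBytes) {
    ShortBuffer samples = ByteBuffer.wrap(pcm, 0, validBytes)
            .order(ByteOrder.LITTLE_ENDIAN)    // 16-bit PCM from AudioRecord is little-endian
            .asShortBuffer();
    int max = 0;
    while (samples.hasRemaining()) {
        int abs = Math.abs((int) samples.get()); // widen to int so abs(-32768) is representable
        if (abs > max) {
            max = abs;
        }
    }
    return max;
}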
I'm confused. I needed to record sound from MIC in Android so I used the following code:
recorder = new AudioRecord(AudioSource.MIC, 44100,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT, N);
buffer = new byte[N];
//...
recorder.read(buffer, 0, N);
As we know, a byte can store values between -128 and +127, while a 16-bit sound sample needs more storage (e.g. a short or an int), but surprisingly Java and Android provide a read method on AudioRecord that saves the recorded data into a byte array.
How can that be possible? What am I missing?
You are thinking of byte as a short integer. It is just 8 bits. You need to store 1000111011100000 (16 bits)? The first byte is 10001110, the second byte is 11100000. That you can interpret these bits as numbers is not relevant here. More generally, byte[] is usually how you deal with binary "raw data" (be it audio streams, encrypted content, or anything else that you treat as a stream of bits).
If you have n "words" of 16 bits, then you will need 2n bytes to store them. Byte 0 will be the lower (or higher) part of word 0, byte 1 will be the rest of word 0, byte 2 will be the lower (or higher) part of word 1, and so on.
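A minimal sketch of reassembling 16-bit words from such a byte[], assuming the low byte comes first (little-endian), as with Android's ENCODING_PCM_16BIT; the method name is just illustrative:

// Combine pairs of bytes into 16-bit samples (little-endian: low byte first).
public static short[] bytesToShorts(byte[] raw) {
    short[] samples = new short[raw.length / 2];
    for (int i = 0; i < samples.length; i++) {
        int lo = raw[2 * i] & 0xff;      // mask to undo sign extension of the low byte
        int hi = raw[2 * i + 1];         // high byte keeps its sign
        samples[i] = (short) ((hi << 8) | lo);
    }
    return samples;
}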
This is part of a larger assignment that I've mostly got done except for this one part, which is a bit embarrassing because it sounds really simple on paper.
So basically, I've got a large amount of compressed data. I've been keeping track of the checksum using a CRC32:
CRC32 checksum = new CRC32();
...
//read input into buffer
checksum.update(buff, 0, bytesRead);
So it updates every time more input is read. I've also kept track of the uncompressed length using
uncompressedLength += manage.read(buff);
So it is an int value that holds the number of bytes of the original file. This is a little-endian machine.
From what I can tell, what I need is a four-byte CRC, for which I used
public byte[] longToBytes(long x) {
    ByteBuffer buffer = ByteBuffer.allocate(8);
    buffer.putLong(x);   // writes all 8 bytes, big-endian by default
    return buffer.array();
}
byte[] c = longToBytes(checksum.getValue());
BUT this is 8 bytes. CRC32.getValue() returns a long. Can I convert it to an int in this case without losing information I need?
And then the ISIZE is supposed to be the four-byte uncompressed length modulo 2^32. I've got the variable uncompressedLength, which is an int. I think I just have to convert it to bytes and that's all?
I've been hexdumping the result from gzip and the result from my program and my header and data are right, I'm just missing my trailer.
As for why I'm doing this manually, it's because of an assignment. Trust me, I'd love to just use GZIPOutputStream if I could.
A CRC32 has 32 bits; the class returns a long because the Checksum interface it implements declares getValue() that way.
The uncompressed length should be a long, since nowadays files larger than 2 GB aren't uncommon.
So in both cases, you need to convert the lowest 32 bits of a long to 4 bytes.
// Return the low 32 bits of v as 4 bytes, least significant byte first (little-endian).
static byte[] lower4bytes(long v)
{
    return new byte[] {
        (byte)(v      ),
        (byte)(v >>  8),
        (byte)(v >> 16),
        (byte)(v >> 24)
    };
}
To write an integer in little-endian form, simply write the low byte of the integer (i.e. modulo 256 or anded with 0xff), then shift it down eight bits or divide by 256, then write the resulting low byte, and repeat that two more times. You'll write four bytes. Since you only write four, you will automatically be writing the length modulo 2^32.
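For completeness, a minimal sketch of writing the 8-byte GZIP trailer with the lower4bytes helper above (the stream variable and the long type for the length are my assumptions):

import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;

// Write the GZIP trailer: CRC32 of the uncompressed data, then ISIZE,
// each as a 4-byte little-endian value.
static void writeGzipTrailer(OutputStream out, CRC32 checksum, long uncompressedLength)
        throws IOException {
    out.write(lower4bytes(checksum.getValue()));   // CRC32 of the uncompressed input
    out.write(lower4bytes(uncompressedLength));    // ISIZE = uncompressed size mod 2^32
}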
My experiment is like this:
First, I use MATLAB to create a specified wave file with a rate of 44100, which means any fragment lasting 1 s contains 44100 elements, and these elements are represented as doubles.
Then, I use the smartphone's microphone to capture the wave, with the sampling rate also set to 44100 in order to reconstruct the wave.
But AudioRecord stores the data as bytes, while what I want is doubles. Converting from byte to double sounds reasonable, but I am still confused: does a sampling rate of 44100 mean that AudioRecord should record 44100 bytes in 1 s, or 44100*4 bytes, since a double contains 4 bytes?
Another experiment I have conducted:
using recording software to capture the wave and store it in a .wav file
reading the .wav with MATLAB's wavread and with Java, respectively.
For 1 s, we get 44100 elements, listed below:
-0.00164794921875
1.52587890625E-4
2.74658203125E-4
-0.003326416015625
0.001373291015625
-4.2724609375E-4
0.00445556640625
9.1552734375E-5
-9.1552734375E-4
7.62939453125E-4
-0.003997802734375
9.46044921875E-4
-0.00103759765625
0.002471923828125
0.001922607421875
-0.00250244140625
8.85009765625E-4
-0.0032958984375
8.23974609375E-4
8.23974609375E-4
Does anyone know how many elements AudioRecord will retrieve in 1 s with a sampling rate of 44100?
The default for AudioRecord is to return 16 bits per channel for each sample (ENCODING_PCM_16BIT).
Now there are two read overloads that let you specify either a short[] (16 bits) or a byte[] (8 bits) buffer.
int read(short[] audioData, int offsetInShorts, int sizeInShorts)
int read(byte[] audioData, int offsetInBytes, int sizeInBytes)
So a 1 second mono buffer (1 channel) should have a short[] buffer of length 44100. Stereo (2 channels) would have 88200, etc...
I would avoid using the byte[] buffer unless you had set the AudioRecord format to ENCODING_PCM_8BIT for some reason (it is not guaranteed to be supported by all devices).
Now if you want to convert those short values to doubles, you have to realize that the double values you record in MATLAB are double-precision normalized samples in the range [-1, 1], while the short values are going to be in [-32768, 32767], so you have to write a conversion function instead of just trying to cast the numbers from a short to a double.
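A minimal sketch of such a conversion, dividing by the largest short magnitude so the result lands in [-1.0, 1.0] (the method name is just illustrative):

// Convert 16-bit PCM samples to normalized doubles in [-1.0, 1.0].
public static double[] shortsToDoubles(short[] pcm) {
    double[] normalized = new double[pcm.length];
    for (int i = 0; i < pcm.length; i++) {
        normalized[i] = pcm[i] / 32768.0;   // -32768 maps to -1.0, 32767 to just under 1.0
    }
    return normalized;
}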
I'm trying to play some .m4a files, and I understand that JAAD only supports decoding AAC, but there are songs for which I am able to get the sourceDataLine, and then when I try to play them, I get behavior like this:
We read: 1024 bytes.
We read: 512 bytes.
We read: -1 bytes.
When running this:
// read from the input
bytesRead = audioInputStream.read(tempBuffer, 0, tempBuffer.length);
System.out.println("We read: " + bytesRead + " bytes.");
until bytesRead == -1
For this particular file, I'm getting the AudioFormat baseformat to be this:
MPEG1L1 48000.0 Hz, unknown bits per sample, mono, unknown frame size, 125.0 frames/second.
Then the AudioFormat decodedFormat to be this:
PCM_SIGNED 48000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian
I use these lines of code to make the conversion:
AudioFormat baseFormat = audioInputStream.getFormat();
AudioFormat decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
        baseFormat.getSampleRate(),
        16,
        baseFormat.getChannels(),
        baseFormat.getChannels() * 2,
        baseFormat.getSampleRate(),
        false);
Am I doing something wrong here? I don't fully understand what that second line really does, but it's been working just fine for decoding MP3 files using the MP3SPI.
I'd really appreciate any guidance here.
Is the problem that your file is only showing 1024 + 512 bytes in length? (That would be awfully short for an audio file!) Or is the question about file formats? I've seen some people run into a problem when they try to decode an mp3 or a .wav file that happens to be incompatible with the range of wav file formats supported by Java.
I'm assuming the "second line" you refer to is the creation of a new AudioFormat, yes? That is simply the act of making a new format based upon the one being decoded from your input stream. Presumably, you will use the new format in your playback.
The point of the new format is probably to ensure the data will be played with a format compatible with your system. It says:
(1) that no matter the incoming encoding format, Signed PCM will be used.
(2) the same sample rate will be used (my system only supports 44100, so I am surprised to see yours allowing 48000).
(3) 16 bits will be the new number of bits per sample, regardless of the number of bits used in the original file.
(4) the same number of channels will be used as the original file.
(5) the number of channels * 2 will be deemed to be the frame size (makes sense, given 16 bits per sample)
(6) the same frames per second rate
(7) the byte order will be little-endian, regardless of the input file order.
You can look into the API of this constructor (some good documentation) under "Constructor Detail" at this link: http://docs.oracle.com/javase/6/docs/api/javax/sound/sampled/AudioFormat.html
If you need to convert some of the audio data in order to allow playback, then things get more involved. Have you read through this tutorial on audio file & format conversion?
http://docs.oracle.com/javase/tutorial/sound/converters.html
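In case it helps, here is a rough sketch of the playback pattern that is usually paired with this kind of format conversion (this is my guess at the surrounding code, not anything specific to JAAD): obtain the decoded stream with AudioSystem.getAudioInputStream(decodedFormat, audioInputStream) and push it through a SourceDataLine.

import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

// Play an already-decoded PCM stream through a SourceDataLine.
static void play(AudioInputStream decodedStream, AudioFormat decodedFormat)
        throws IOException, LineUnavailableException {
    DataLine.Info info = new DataLine.Info(SourceDataLine.class, decodedFormat);
    SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info);
    line.open(decodedFormat);
    line.start();

    byte[] buffer = new byte[4096];
    int bytesRead;
    while ((bytesRead = decodedStream.read(buffer, 0, buffer.length)) != -1) {
        line.write(buffer, 0, bytesRead);   // blocks until the data has been queued
    }
    line.drain();                           // let the last buffered audio finish playing
    line.close();
}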
I hope this is at least of partial help. I haven't used JAAD myself, so I can understand if this post isn't very helpful.