How does AudioRecord retrieve data with a specified sampling rate? - Java

My experiment is like this:
First, I use MATLAB to create a specified wave file with a sampling rate of 44100, which means any fragment lasting 1 s contains 44100 elements, and these elements are stored as doubles.
Then, I use a smartphone's microphone to capture the wave, also at a sampling rate of 44100, in order to reconstruct it.
But AudioRecord stores the data as bytes, while what I want is doubles. Converting from byte to double sounds reasonable, but I am still confused: does a sampling rate of 44100 mean AudioRecord should record 44100 bytes in 1 s, or 44100*8 bytes, since a double occupies 8 bytes?
Another experiment I have performed:
using recording software to capture a wave and store it in a .wav file, then
reading the .wav with MATLAB's wavread and with Java, respectively.
For 1 s, we get 44100 elements, listed below:
-0.00164794921875
1.52587890625E-4
2.74658203125E-4
-0.003326416015625
0.001373291015625
-4.2724609375E-4
0.00445556640625
9.1552734375E-5
-9.1552734375E-4
7.62939453125E-4
-0.003997802734375
9.46044921875E-4
-0.00103759765625
0.002471923828125
0.001922607421875
-0.00250244140625
8.85009765625E-4
-0.0032958984375
8.23974609375E-4
8.23974609375E-4
Does anyone know how many elements AudioRecord will retrieve in 1 s at a sampling rate of 44100?

The default for AudioRecord is to return 16 bits per channel for each sample (ENCODING_PCM_16BIT).
Now there are two read overloads that let you specify either a short[] (16 bits) or a byte[] (8 bits) buffer.
int read(short[] audioData, int offsetInShorts, int sizeInShorts)
int read(byte[] audioData, int offsetInBytes, int sizeInBytes)
So a 1 second mono buffer (1 channel) should have a short[] buffer of length 44100. Stereo (2 channels) would have 88200, etc...
I would avoid using the byte[] buffer unless you had set the AudioRecord format to ENCODING_PCM_8BIT for some reason (it is not guaranteed to be supported by all devices).
Now if you want to convert those short values to doubles, keep in mind that the double values you record in MATLAB are double-precision samples normalized to [-1, 1], while the short values range over [-32768, 32767]. So you have to write a conversion function rather than just casting each short to a double.
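A minimal sketch of that path, assuming mono input, the RECORD_AUDIO permission, and normalization by dividing by 32768.0 (the class and method names are mine for illustration):
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public class PcmToDouble {
    // Records one second of mono 16-bit PCM at 44100 Hz, normalized to [-1, 1).
    public static double[] recordOneSecond() {
        final int sampleRate = 44100;
        int minBuf = AudioRecord.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
                sampleRate, AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, Math.max(minBuf, sampleRate * 2));

        short[] samples = new short[sampleRate]; // 44100 shorts = 1 s of mono audio
        recorder.startRecording();
        int read = 0;
        while (read < samples.length) {
            int n = recorder.read(samples, read, samples.length - read);
            if (n < 0) break; // negative values are AudioRecord error codes
            read += n;
        }
        recorder.stop();
        recorder.release();

        double[] normalized = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            normalized[i] = samples[i] / 32768.0; // [-32768, 32767] -> [-1, 1)
        }
        return normalized;
    }
}
This also answers the element-count question directly: one second of mono audio at 44100 Hz is 44100 shorts, i.e. 88200 bytes.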

Related

Invert image pixels

I'm currently trying to convert a piece of MATLAB code to Java. The purpose of the code is to invert and normalize the pixels of an image file. In Java, the pixels are stored in a byte array. Below is the MATLAB code of importance:
inp2=1024.-inp.-min; %inp is the input array (double precision). min is the minimum value in that matrix.
The image is 16-bit but uses only 10 bits for storage, so that's where the 1024 comes from (2^10). I know definitively that this code works in MATLAB. However, I'm personally not proficient in MATLAB, and my Java translation isn't behaving the same way as its counterpart.
Below is the method where I've tried inverting the image matrix:
// bitsStored is the bit depth. In this test, it is 10.
// imageBytes is the pixel data in a byte array.
public static short[] invert(int bitsStored) {
    short min = min(imageBytes); // custom method; gets the minimum value in the byte array
    short range = (short) (2 << bitsStored);
    short[] holder = new short[imageBytes.length];
    for (int i = 0; i < imageBytes.length; i++) {
        holder[i] = (short) (range - imageBytes[i] - min);
    }
    imageBytes = holder;
    return imageBytes;
}
However, instead of inverting the color channels, the image loses some data and becomes much harsher looking (higher contrast, less blending, etc.). What am I doing wrong here?
Let me know if I can make anything clearer for you. Thank you.
UPDATE:
Hi, I have another question regarding this code. Can the above code (fixed to use short[] instead of byte[]) be used in reverse on the same file? That is, if I rerun this code on an inverted version of the original image, should I get back the original input/image from the start of the program? The only problem I see is that the min value changes between runs.
byte has a range from -128 to 127; it cannot hold 1024 different values. So either you need to use a wider type (like short) to model your points, or your byte array has to be unpacked before processing.
One more thing: double is floating point, and it does not play well with the integers used in the rest of your code. The following seems better:
short range = (short) (1 << bitsStored); // 2^bitsStored
The correct equation for inversion is:
newValue[i] = maxPossibleValue - currentValue[i]
For 10-bit data, your maxPossibleValue is 1023 (2^10 - 1).
Another thing: you can't store an image with a depth of 10 bits in an array of bytes (they only have 8 bits).
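Putting those fixes together, a corrected version might look like the sketch below. It assumes the pixel data has already been unpacked into a short[] (the parameter name imagePixels is mine), and it drops the min term in favor of the plain maxPossibleValue - value form:
// imagePixels is the pixel data, already unpacked from bytes into shorts.
public static short[] invert(short[] imagePixels, int bitsStored) {
    // Highest representable value for the bit depth, e.g. 1023 for 10 bits.
    short maxPossibleValue = (short) ((1 << bitsStored) - 1);
    short[] inverted = new short[imagePixels.length];
    for (int i = 0; i < imagePixels.length; i++) {
        inverted[i] = (short) (maxPossibleValue - imagePixels[i]);
    }
    return inverted; // applying this twice returns the original values
}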
On your second question about the reversibility of your algorithm.
Your formula looks like result[i] = 1024 - min(data) - data[i], where data ranges from 0 to 1023. Let's imagine that all your data points are 1023. Then min is 1023, so every result[i] will be 1024 - 1023 - 1023 = -1022.
So the result does not even fit in the same range as the data.
Then, if you run your algorithm on that result array to produce result1, all of its points will be 1024 - (-1022) - (-1022), i.e. 3068, not the original 1023.
So the answer is no: applying this algorithm twice does not reproduce the input.
Please note that the algorithm mentioned in another answer (maxPossibleValue - currentValue[i]) keeps the range, and it reverses itself when applied twice.
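A quick numeric check of that argument (plain Java, just applying the question's formula twice):
public class ReversibilityCheck {
    public static void main(String[] args) {
        int data = 1023;                    // every point at the maximum
        int min1 = 1023;                    // min of the all-1023 array
        int result = 1024 - min1 - data;    // first application: -1022
        int min2 = result;                  // min of the all-(-1022) array
        int result2 = 1024 - min2 - result; // second application: 3068
        System.out.println(result + " " + result2); // -1022 3068, not 1023
    }
}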
BTW, it should be
short range = (short) (1 << bitsStored);
instead of
short range = (short) (2 << bitsStored);
to produce 2^bitsStored.

How to extract amplitude from byte array in android?

I am recording sound from the mic using AudioRecord. The byte[] recoChunk stores the raw recording, as shown below:
while (isRecording) {
    Log.w("myMsg", "recording..");
    recoChunk = new byte[minBuffSize];
    audioRecord.read(recoChunk, 0, minBuffSize);
    mFosRaw.write(recoChunk);
}
Now, from recoChunk I want to find the largest amplitude recorded. How can I do that?
You can convert your byte array to an array of a type that matches the bit depth of your recorded audio. For example, for 16-bit audio you can use short, since it holds a 16-bit signed integer value. For 8-bit audio you can use the byte array directly, without conversion. Then, simply, the largest number in the array (you would probably want to take the absolute value) is the sample with the highest amplitude.
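A minimal sketch of that idea for 16-bit little-endian PCM, the default for ENCODING_PCM_16BIT (the method name and the ByteBuffer-based conversion are my choices, not a fixed API):
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Amplitude {
    // Returns the largest absolute sample value in a buffer of 16-bit LE PCM.
    public static int maxAmplitude(byte[] recoChunk) {
        ByteBuffer buf = ByteBuffer.wrap(recoChunk).order(ByteOrder.LITTLE_ENDIAN);
        int max = 0;
        while (buf.remaining() >= 2) {
            // Widen to int before abs: |Short.MIN_VALUE| does not fit in a short.
            int sample = Math.abs((int) buf.getShort());
            if (sample > max) {
                max = sample;
            }
        }
        return max;
    }
}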

Storing 16-bit audio in an 8-bit byte array in Android

I'm confused. I needed to record sound from MIC in Android so I used the following code:
recorder = new AudioRecord(AudioSource.MIC, 44100,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT, N);
buffer = new byte[N];
//...
recorder.read(buffer, 0, N);
As we know, a byte can store values between -128 and +127, while a 16-bit sound sample needs more storage (e.g. a short or an int), but surprisingly Java and Android have a read method which saves the recorded data into a byte array.
How can that be possible? What am I missing?
You are thinking of byte as a short integer. It is just 8 bits. You need to store 1000111011100000 (16 bits)? The first byte is 10001110, the second byte is 11100000. That you can interpret these bits as numbers is not relevant here. In a more general way, byte[] is usually how you deal with binary "raw data" (be it audio streams, encrypted content, or anything else that you treat as a stream of bits).
If you have n "words" of 16 bits then you will need 2n bytes to store them. Byte 0 will be the lower (or higher) part of word 0, byte 1 will be the rest of word 0, byte 2 will be the lower (or higher) part of word 1, and so on.
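For illustration, a hypothetical helper that packs such a byte[] back into 16-bit samples (assuming little-endian byte order, which is what Android's ENCODING_PCM_16BIT produces):
// Packs pairs of bytes (little-endian) back into 16-bit samples.
public static short[] bytesToShorts(byte[] buffer) {
    short[] samples = new short[buffer.length / 2];
    for (int i = 0; i < samples.length; i++) {
        int lo = buffer[2 * i] & 0xFF; // mask to avoid sign extension of the low byte
        int hi = buffer[2 * i + 1];    // the high byte carries the sign
        samples[i] = (short) ((hi << 8) | lo);
    }
    return samples;
}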

How to determine if 8bit WAV File is signed or unsigned, using Java and without javax.sound

I need to know whether a ".wav" file of 8 bits is signed or unsigned PCM, by only reading the file. I cannot use javax.sound.sampled.* or the AudioSystem classes.
8-bit (or lower) WAV files are always unsigned; 9-bit or higher are always signed:
Each sample is contained in an integer i. The size of i is the smallest number of bytes required to contain the specified sample size. The least significant byte is stored first. The bits that represent the sample amplitude are stored in the most significant bits of i, and the remaining bits are set to zero.
For example, if the sample size (recorded in nBitsPerSample) is 12 bits, then each sample is stored in a two-byte integer. The least significant four bits of the first (least significant) byte are set to zero.
The data format and maximum and minimum values for PCM waveform samples of various sizes are as follows:
Sample size       Data format         Maximum value                 Minimum value
1 to 8 bits       unsigned integer    255 (0xFF)                    0
9 or more bits    signed integer i    largest positive value of i   most negative value of i
Multimedia Programming Interface and Data Specifications 1.0 - IBM/Microsoft, August 1991
In a WAV file, 8-bit samples are stored as unsigned bytes ranging from 0 to 255, while 16-bit samples are stored as signed integers in 2's complement.
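Given that rule, deciding signedness without javax.sound just means reading nBitsPerSample from the header. The sketch below assumes a canonical WAV layout with the fmt chunk starting at byte 12; real files can carry extra chunks before it, so a robust version would scan for the "fmt " id instead:
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class WavSignedness {
    // True if the PCM samples are signed, i.e. more than 8 bits per sample.
    public static boolean isSigned(String path) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            byte[] header = new byte[36];
            in.readFully(header); // RIFF header plus the start of a canonical "fmt " chunk
            // nBitsPerSample is a little-endian 16-bit field at offset 34.
            int bitsPerSample = (header[34] & 0xFF) | ((header[35] & 0xFF) << 8);
            return bitsPerSample > 8; // 8 bits or fewer: unsigned; 9 or more: signed
        }
    }
}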

Android PCM to Ulaw encoding wav file

I'm trying to encode raw PCM data as uLaw to save on the bandwidth required to transmit speech data.
I have come across a class called UlawEncoderInputStream on This page, but there is no documentation! :(
The constructor takes an input stream and a max PCM value (whatever that is).
/**
 * Create an InputStream which takes 16 bit pcm data and produces ulaw data.
 * @param in InputStream containing 16 bit pcm data.
 * @param max pcm value corresponding to maximum ulaw value.
 */
public UlawEncoderInputStream(InputStream in, int max) {
After looking through the code, I suspect that I should calculate this "max" value using a supplied function, maxAbsPcm. The problem is, I don't really understand what I'm meant to pass into it! I am recording my raw PCM to a file on the SD card, so I don't have one continuous memory-resident array of data to pass to it.
/**
 * Compute the maximum of the absolute value of the pcm samples.
 * The return value can be used to set ulaw encoder scaling.
 * @param pcmBuf array containing 16 bit pcm data.
 * @param offset offset of start of 16 bit pcm data.
 * @param length number of pcm samples (not number of input bytes)
 * @return maximum abs of pcm data values
 */
public static int maxAbsPcm(byte[] pcmBuf, int offset, int length) {
Another problem I have using this code is that I am unsure what values to write out in the header for the uLaw data. How do I determine how much less byte data there is after encoding with uLaw?
I have listened to one of the (potentially) uLaw-encoded files that I created, in VLC media player (the only player I have that will attempt to read the file), and it sounds nasty, broken, and clicky, but I can still make out the voice.
I am writing my wave header using code similar to a class I found called WaveHeader, which can be found Here!
If anyone has any thoughts on this matter I would be most grateful to hear them!:)
Many thanks
Dexter
The max in the constructor is the maximum amplitude in the PCM data. It is used to scale the input before generating the output. If the input is very loud you need a higher value, if it's quiet you need a lower one. If you pass in 0 the encoder will use 8192 by default, which may be good enough.
The length in the other method is the number of 16-bit samples from which you want to find the maximum amplitude. This class assumes that the input PCM data is always encoded with 16-bit samples, which means that each sample spans two bytes: if your input is 2000 bytes long you have 1000 samples.
The encoder in this class produces one 8-bit µ-Law sample for every 16-bit PCM sample, so the size in bytes is halved.
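Tying that together, usage might look like the sketch below. It relies only on the constructor and maxAbsPcm shown in the question (UlawEncoderInputStream is assumed to be on the classpath); reading the whole PCM file into memory first is my simplification, and you could equally scan it in chunks to find the max:
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class UlawExample {
    public static byte[] encode(String pcmPath) throws IOException {
        byte[] pcm = Files.readAllBytes(Paths.get(pcmPath)); // raw 16-bit PCM
        // length is in samples, not bytes, hence the division by 2.
        int max = UlawEncoderInputStream.maxAbsPcm(pcm, 0, pcm.length / 2);
        UlawEncoderInputStream ulaw =
                new UlawEncoderInputStream(new ByteArrayInputStream(pcm), max);
        // One 8-bit uLaw sample per 16-bit PCM sample: output is half the input.
        byte[] out = new byte[pcm.length / 2];
        int n = 0;
        while (n < out.length) {
            int r = ulaw.read(out, n, out.length - n);
            if (r < 0) break; // end of stream
            n += r;
        }
        return out;
    }
}
For the header question: a WAV header for µ-Law data uses format tag 7 instead of 1 (PCM), and the data chunk size is half the PCM byte count.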
This is the opposite of what you are trying to do, but I thought it could be helpful to someone. Here is an example method that will convert an 8-bit uLaw-encoded binary file into a 16-bit WAV file using built-in Java methods.
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat.Type;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioFormat.Encoding;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public static void convertULawFileToWav(String filename) {
    File file = new File(filename);
    if (!file.exists())
        return;
    try {
        long fileSize = file.length();
        int frameSize = 160;
        long numFrames = fileSize / frameSize;
        // 8000 Hz, 8 bits per sample, mono uLaw.
        AudioFormat audioFormat = new AudioFormat(Encoding.ULAW, 8000, 8, 1, frameSize, 50, true);
        AudioInputStream audioInputStream =
                new AudioInputStream(new FileInputStream(file), audioFormat, numFrames);
        AudioSystem.write(audioInputStream, Type.WAVE, new File("C:\\file.wav"));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
