Create sample array from 32bit sample size WAV


I have a WAV file (32-bit sample size, 8 bytes per frame, 44100 Hz, PCM_FLOAT) from which I need to create a sample array. This is the code I have used for a WAV with 16-bit sample size, 4 bytes per frame, 44100 Hz, PCM_SIGNED:
private float[] getSampleArray(byte[] eightBitByteArray) {
int newArrayLength = eightBitByteArray.length
/ (2 * calculateNumberOfChannels()) + 1;
float[] toReturn = new float[newArrayLength];
int index = 0;
for (int t = 0; t + 4 < eightBitByteArray.length; t += 2) // t+2 -> skip
//2nd channel
{
int low=((int) eightBitByteArray[t++]) & 0x00ff;
int high=((int) eightBitByteArray[t++]) << 8;
double value = Math.pow(low+high, 2);
double dB = 0;
if (value != 0) {
dB = 20.0 * Math.log10(value); // calculate decibel
}
toReturn[index] = getFloatValue(dB); //minorly important conversion
//to normalized values
index++;
}
return toReturn;
}
Obviously this code can't work for the WAV with 32-bit samples, as I have to consider two more bytes per sample in the first channel.
Does anybody know how the two other bytes have to be added (and shifted) to calculate the amplitude? Unfortunately Google didn't help me at all.
Thanks in advance.

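Something like this should do the trick. Because the format is PCM_FLOAT, each sample is already a 4-byte float, so there is nothing to shift and add; ByteBuffer can decode the four bytes directly.
for (int t = 0; t + 4 <= eightBitByteArray.length; t += 8) // 8 bytes per frame:
// read the 1st channel, skip the 2nd channel
{
float value = ByteBuffer.wrap(eightBitByteArray, t, 4).order(ByteOrder.LITTLE_ENDIAN).getFloat();
double dB = 0;
if (value != 0) {
// calculate decibel; abs() avoids NaN for negative samples
dB = 20.0 * Math.log10(Math.abs(value));
}
toReturn[index] = getFloatValue(dB); //minorly important conversion
//to normalized values
index++;
}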
On another note - converting instantaneous samples to dB is nonsensical.
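Since a dB value only makes sense for a block of samples, here is a minimal sketch of one way to go about it: decode one channel of the interleaved float data, then compute the RMS level of a block in dB relative to full scale. The class and method names (FloatWavLevel, extractChannel, rmsDb) are made up for illustration, and the sketch assumes little-endian 32-bit float PCM as described in the question.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class FloatWavLevel {
    /** Extracts one channel from interleaved little-endian 32-bit float PCM. */
    static float[] extractChannel(byte[] pcm, int channels, int channel) {
        int bytesPerFrame = 4 * channels;
        int frames = pcm.length / bytesPerFrame;
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        float[] out = new float[frames];
        for (int f = 0; f < frames; f++) {
            // absolute read: start of frame f plus the offset of the wanted channel
            out[f] = buf.getFloat(f * bytesPerFrame + 4 * channel);
        }
        return out;
    }

    /** RMS level of samples[from .. from+length) in dB relative to full scale (1.0). */
    static double rmsDb(float[] samples, int from, int length) {
        double sum = 0;
        for (int i = from; i < from + length; i++) {
            sum += samples[i] * (double) samples[i];
        }
        double rms = Math.sqrt(sum / length);
        return rms == 0 ? Double.NEGATIVE_INFINITY : 20.0 * Math.log10(rms);
    }

    public static void main(String[] args) {
        // Two stereo frames: left = 0.5, -0.5; right = 0, 0
        ByteBuffer b = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        b.putFloat(0.5f).putFloat(0f).putFloat(-0.5f).putFloat(0f);
        float[] left = extractChannel(b.array(), 2, 0);
        System.out.println(rmsDb(left, 0, left.length)); // roughly -6.02
    }
}
```

In practice you would call rmsDb on consecutive windows (for example a few hundred frames each) rather than on the whole file.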

Related

Calculate the level/amplitude/db of audio for two channels

I have read two posts about extracting samples from an AudioInputStream and converting them into dB.
https://stackoverflow.com/a/26576548/8428414
https://stackoverflow.com/a/26824664/8428414
As far as I understand byte[] bytes; has structure like this:
Index 0: Sample 0 (Left Channel)
Index 1: Sample 0 (Right Channel)
Index 2: Sample 1 (Left Channel)
Index 3: Sample 1 (Right Channel)
Index 4: Sample 2 (Left Channel)
Index 5: Sample 2 (Right Channel)
The first post shows how to get samples from one channel (mono).
My problem is that I want to get the samples of the left channel and of the right channel separately, in order to calculate dB for each channel.
Here is the code. How can it be changed to get the right and left channels separately?
I can't understand how the index i changes...
final byte[] buffer = new byte[2048];
float[] samples = new float[buffer.length / 2];
for (int n = 0; n != -1; n = in.read(buffer, 0, buffer.length)) {
line.write(buffer, 0, n);
for (int i = 0, sampleIndex = 0; i < n; ) {
int sample = 0;
sample |= buffer[i++] & 0xFF; // (reverse these two lines
sample |= buffer[i++] << 8; // if the format is big endian)
// normalize to range of +/-1.0f
samples[sampleIndex++] = sample / 32768f;
}
float rms = 0f;
for (float sample : samples) {
rms += sample * sample;
}
rms = (float) Math.sqrt(rms / samples.length);
I hope you can help me. Thank you in advance.

The format in which the stereo signal is stored is called interleaved. That is, as you described correctly, the bytes are laid out as LLRRLLRRLLRR... (two bytes per 16-bit sample), so you first need to read a left sample, then a right sample, and so on.
I have edited your code to reflect this. However, there is some room for improvement via refactoring.
Note: The code changes only deal with interleaving. I have not checked the rest of your code.
final byte[] buffer = new byte[2048];
// create two buffers. One for the left, one for the right channel.
float[] leftSamples = new float[buffer.length / 4];
float[] rightSamples = new float[buffer.length / 4];
for (int n = 0; n != -1; n = in.read(buffer, 0, buffer.length)) {
line.write(buffer, 0, n);
for (int i = 0, sampleIndex = 0; i < n; ) {
int leftSample = 0;
int rightSample = 0;
leftSample |= buffer[i++] & 0xFF; // (reverse these two lines
leftSample |= buffer[i++] << 8; // if the format is big endian)
rightSample |= buffer[i++] & 0xFF; // (reverse these two lines
rightSample |= buffer[i++] << 8; // if the format is big endian)
// normalize to range of +/-1.0f
leftSamples[sampleIndex] = leftSample / 32768f;
rightSamples[sampleIndex] = rightSample / 32768f;
sampleIndex++;
}
// now compute RMS for left
float leftRMS = 0f;
for (float sample : leftSamples) {
leftRMS += sample * sample;
}
leftRMS = (float) Math.sqrt(leftRMS / leftSamples.length);
// ...and right
float rightRMS = 0f;
for (float sample : rightSamples) {
rightRMS += sample * sample;
}
rightRMS = (float) Math.sqrt(rightRMS / rightSamples.length);
}
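One of the possible refactorings, sketched under assumptions: the per-channel RMS can be moved into a helper that only looks at the n bytes actually returned by read(), so a short final read does not mix stale samples into the result. The names StereoRms and rmsPerChannel are hypothetical, and the sketch assumes signed 16-bit little-endian stereo data, as in the code above.

```java
class StereoRms {
    /**
     * Returns {leftRms, rightRms} for the first n bytes of interleaved
     * signed 16-bit little-endian stereo PCM.
     */
    static double[] rmsPerChannel(byte[] buffer, int n) {
        int frames = n / 4; // 2 channels * 2 bytes per sample
        if (frames == 0) {
            return new double[] {0.0, 0.0};
        }
        double sumLeft = 0, sumRight = 0;
        for (int f = 0; f < frames; f++) {
            int i = f * 4;
            // low byte unsigned, high byte keeps its sign
            int left = (buffer[i] & 0xFF) | (buffer[i + 1] << 8);
            int right = (buffer[i + 2] & 0xFF) | (buffer[i + 3] << 8);
            double l = left / 32768.0;
            double r = right / 32768.0;
            sumLeft += l * l;
            sumRight += r * r;
        }
        return new double[] {Math.sqrt(sumLeft / frames), Math.sqrt(sumRight / frames)};
    }

    public static void main(String[] args) {
        // One frame: left = 16384 (0x4000), right = -16384 (0xC000)
        byte[] frame = {0x00, 0x40, 0x00, (byte) 0xC0};
        double[] rms = rmsPerChannel(frame, frame.length);
        System.out.println(rms[0] + " " + rms[1]); // 0.5 0.5
    }
}
```

Inside the read loop this would be called as rmsPerChannel(buffer, n), skipping iterations where n is 0.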

JAVA - read binary file with header (trying to transfer c# code)

I have some old C# code that does what I need with the file type I'm working with, but I need to get it into Java. I've been reading up on binary I/O, but I can't figure out how to deal with the header, and I don't understand the C# code well enough to know what it's doing.
I would appreciate any assistance, mostly with understanding what the C# code means when it uses br.ReadInt32() and such, and how to emulate that in Java, which (as I understand it) reads binary data differently.
I don't understand binary files very well (nor do I want to; this is a one-off piece of code). I just want to get the data out; then I can work on the code that I understand better.
Thanks.
C# snippet:
public void ConvertEVDtoCSV(string fileName)
{
string[] fileArray = File.ReadAllLines(fileName);
float minX = 0;
float maxX = 0;
try
{
FileStream fs = new FileStream(fileName, FileMode.Open);
BinaryReader br = new BinaryReader(fs);
/*
16 + n*80*6 = sizeof(header) where n is the 9th nibble of the file (beginning of the 5th byte)
*/
//Reads "EVIS"
br.ReadBytes(4);
//Reads numDataSets
int numDataSets = br.ReadInt32();
//Reads lngNumPlotSurfaces
int lngNumPlotSurfaces = br.ReadInt32();
//Reads headerEvisive length
int headerEvisive = br.ReadInt32();
//skip all six title and axes text lines.
int remainingHeader = (lngNumPlotSurfaces * 6 * 80) + headerEvisive;
br.ReadBytes(remainingHeader); //could also use seek(remainingHeader+16), but streams don't support seek?
long dataSize = numDataSets * (2 + lngNumPlotSurfaces); //meb 6-8-2016: +2 for X and Y
string[] dataForCSVFile = new string[dataSize];
for (long cnt = 0; cnt < numDataSets; cnt++)
{
for (int j = 0; j < 2 + lngNumPlotSurfaces; j++) //+2 for X and Y
{
//don't read past the end of file
if (br.BaseStream.Position<br.BaseStream.Length) {
//This is where the data needs to be read in and converted from 32-bit single-precision floating point to strings for the csv file
float answerLittle = br.ReadSingle();
if (j == 0 && answerLittle > maxX)
maxX = answerLittle;
if (j == 0 && answerLittle < minX)
minX = answerLittle;
if (j > lngNumPlotSurfaces)
dataForCSVFile[cnt * (2 + lngNumPlotSurfaces) + j] = answerLittle.ToString() + "\r\n";
else
dataForCSVFile[cnt * (2 + lngNumPlotSurfaces) + j] = answerLittle.ToString() + ",";
}
}
}
fs.Close();
textBox_x_max.Text = (maxX).ToString("F2");
textBox_x_min.Text = (minX).ToString("F2");
StreamWriter sw = new StreamWriter(tempfile);
for (int i = 0; i < dataForCSVFile.Length; i++)
{
sw.Write(dataForCSVFile[i]);
}
sw.Close();
}
catch (Exception ex)
{ Console.WriteLine("Error reading data past eof."); }
}

From MSDN:
https://msdn.microsoft.com/en-us/library/system.io.binaryreader.readint32(v=vs.110).aspx?cs-save-lang=1&cs-lang=csharp#code-snippet-1
"Reads a 4-byte signed integer from the current stream and advances the current position of the stream by four bytes."
https://msdn.microsoft.com/en-us/library/system.io.binaryreader.readsingle(v=vs.110).aspx
"A 4-byte floating point value read from the current stream."
When reading these values in Java, you also have to pay attention to endianness. C#'s BinaryReader reads little-endian data, while Java's DataInputStream (and a ByteBuffer in its default configuration) reads big-endian data. So for each integer or float you have to read 4 bytes, reverse their order and recombine them, or let a ByteBuffer do that for you by setting its order to ByteOrder.LITTLE_ENDIAN.
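As a rough illustration of both options, the sketch below reads a little-endian 32-bit integer and a little-endian float from a byte array, once via ByteBuffer and once by recombining the bytes by hand. The class and method names are hypothetical stand-ins for BinaryReader.ReadInt32 and ReadSingle; in a real port the byte array would come from the file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianReader {
    /** Counterpart of BinaryReader.ReadInt32 for the 4 bytes starting at offset. */
    static int readInt32(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Counterpart of BinaryReader.ReadSingle for the 4 bytes starting at offset. */
    static float readSingle(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4).order(ByteOrder.LITTLE_ENDIAN).getFloat();
    }

    /** The same integer read, recombining the bytes manually (lowest byte first). */
    static int readInt32Manual(byte[] d, int o) {
        return (d[o] & 0xFF)
                | (d[o + 1] & 0xFF) << 8
                | (d[o + 2] & 0xFF) << 16
                | (d[o + 3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        // 42 as a little-endian int, followed by 1.0f (0x3F800000) as a little-endian float
        byte[] data = {0x2A, 0, 0, 0, 0, 0, (byte) 0x80, 0x3F};
        System.out.println(readInt32(data, 0));       // 42
        System.out.println(readInt32Manual(data, 0)); // 42
        System.out.println(readSingle(data, 4));      // 1.0
    }
}
```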

How to mix PCM audio sources (Java)?

Here's what I'm working with right now:
for (int i = 0, numSamples = soundBytes.length / 2; i < numSamples; i += 2)
{
// Get the samples.
int sample1 = ((soundBytes[i] & 0xFF) << 8) | (soundBytes[i + 1] & 0xFF); // Automatically converts to unsigned int 0...65535
int sample2 = ((outputBytes[i] & 0xFF) << 8) | (outputBytes[i + 1] & 0xFF); // Automatically converts to unsigned int 0...65535
// Normalize for simplicity.
float normalizedSample1 = sample1 / 65535.0f;
float normalizedSample2 = sample2 / 65535.0f;
float normalizedMixedSample = 0.0f;
// Apply the algorithm.
if (normalizedSample1 < 0.5f && normalizedSample2 < 0.5f)
normalizedMixedSample = 2.0f * normalizedSample1 * normalizedSample2;
else
normalizedMixedSample = 2.0f * (normalizedSample1 + normalizedSample2) - (2.0f * normalizedSample1 * normalizedSample2) - 1.0f;
int mixedSample = (int)(normalizedMixedSample * 65535);
// Replace the sample in soundBytes array with this mixed sample.
soundBytes[i] = (byte)((mixedSample >> 8) & 0xFF);
soundBytes[i + 1] = (byte)(mixedSample & 0xFF);
}
As far as I can tell, it's an accurate implementation of the algorithm defined on this page: http://www.vttoth.com/CMS/index.php/technical-notes/68
However, just mixing a sound with silence (all 0's) results in a sound that very obviously doesn't sound right; it is perhaps best described as higher-pitched and louder.
I would appreciate help in determining whether I'm implementing the algorithm correctly, or whether I simply need to go about it a different way (different algorithm/method).

In the linked article the author assumes A and B to represent entire streams of audio. More specifically, X means the maximum absolute value of all of the samples in stream X, where X is either A or B. So what his algorithm does is scan the entirety of both streams to compute the maximum absolute sample of each, and then scale things so that the output theoretically peaks at 1.0. You'll need to make multiple passes over the data in order to implement this algorithm, and if your data is streaming in then it simply will not work.
Here is an example of how I think the algorithm should work. It assumes that the samples have already been converted to floating point, to sidestep the issue of your conversion code being wrong. I'll explain what is wrong with it later:
double[] samplesA = ConvertToDouble(samples1);
double[] samplesB = ConvertToDouble(samples2);
double A = ComputeMaxPeak(samplesA);
double B = ComputeMaxPeak(samplesB);
// Z equals 1 whenever either peak is 1 (i.e. for normalized streams),
// which is an un-useful bit of information.
double Z = A+B-A*B;
// really need to find a value x such that xA+xB=1, i.e. x = 1/(A+B)
// (assumes at least one stream is not pure silence, so A+B > 0):
double x = 1 / (A + B);
// Now mix and scale the samples
double[] samples = MixAndScale(samplesA, samplesB, x);
Mixing and scaling:
double[] MixAndScale(double[] samplesA, double[] samplesB, double scalingFactor)
{
double[] result = new double[samplesA.length];
for (int i = 0; i < samplesA.length; i++)
result[i] = scalingFactor * (samplesA[i] + samplesB[i]);
return result;
}
Computing the max peak:
double ComputeMaxPeak(double[] samples)
{
double max = 0;
for (int i = 0; i < samples.length; i++)
{
double x = Math.abs(samples[i]);
if (x > max)
max = x;
}
return max;
}
And conversion. Notice how I'm using short so that the sign bit is properly maintained:
double[] ConvertToDouble(byte[] bytes)
{
double[] samples = new double[bytes.length/2];
for (int i = 0; i < samples.length; i++)
{
// big-endian: high byte first; mask the low byte so its sign does not leak in
short tmp = (short) ((bytes[i*2] << 8) | (bytes[i*2+1] & 0xFF));
samples[i] = tmp / 32767.0;
}
return samples;
}
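Put together, the pieces above might look like the following self-contained sketch. It is not the algorithm from the linked article; it is the plain "sum, then scale by 1/(A+B)" approach, where A and B are the peak absolute values of the two streams. The class name PeakScaledMixer and its method names are made up for illustration, and big-endian signed 16-bit input is assumed, as in the question.

```java
class PeakScaledMixer {
    /** Decodes big-endian signed 16-bit PCM into doubles in the range [-1, 1). */
    static double[] toDoubles(byte[] bytes) {
        double[] samples = new double[bytes.length / 2];
        for (int i = 0; i < samples.length; i++) {
            short s = (short) ((bytes[2 * i] << 8) | (bytes[2 * i + 1] & 0xFF));
            samples[i] = s / 32768.0;
        }
        return samples;
    }

    /** Largest absolute sample value in the stream. */
    static double maxPeak(double[] samples) {
        double max = 0;
        for (double s : samples) {
            max = Math.max(max, Math.abs(s));
        }
        return max;
    }

    /** Sums both streams and scales by 1/(A+B) so the result cannot exceed 1.0. */
    static double[] mix(double[] a, double[] b) {
        double peakSum = maxPeak(a) + maxPeak(b);
        double scale = peakSum == 0 ? 1.0 : 1.0 / peakSum; // both silent: nothing to scale
        int n = Math.min(a.length, b.length);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = scale * (a[i] + b[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] mixed = mix(new double[] {0.5, -1.0}, new double[] {0.5, 0.25});
        System.out.println(mixed[0] + " " + mixed[1]);
    }
}
```

Note that this needs both streams in full before mixing, so it has the same limitation for streaming data as described above.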

Normalize the PCM data

I am using the following code to normalize PCM audio data. Is this the correct way to normalize? After normalization I apply a low-pass filter (LPF). Does the order matter, i.e. should I apply the LPF first and normalize its output, or is my current order better? Also, my targetMax is set to 8000, a value I took from a post on this forum. What is the optimal value for it? My input is 16-bit mono PCM with a sample rate of 44100 Hz.
private static int findMaxAmplitude(short[] buffer) {
short max = Short.MIN_VALUE;
for (int i = 0; i < buffer.length; ++i) {
short value = buffer[i];
max = (short) Math.max(max, value);
}
return max;
}
short[] process(short[] buffer) {
short[] output = new short[buffer.length];
int maxAmplitude = findMaxAmplitude(buffer);
for (int index = 0; index < buffer.length; index++) {
output[index] = normalization(buffer[index], maxAmplitude);
}
return output;
}
private short normalization(short value, int rawMax) {
short targetMax = 8000;
double maxReduce = 1 - targetMax / (double) rawMax;
int abs = Math.abs(value);
double factor = (maxReduce * abs / (double) rawMax);
return (short) Math.round((1 - factor) * value);
}

Your findMaxAmplitude only looks at the positive excursions. It should use something like
max = (short) Math.max(max, Math.abs(value));
Your normalization seems quite involved. A simpler version would use:
return (short) Math.round(value * (double) targetMax / rawMax);
Whether a targetMax of 8000 is correct is a matter of taste. Normally I would expect normalisation of 16-bit samples to use the maximum range of values. So a targetMax of 32767 seems more logical.
The normalization should probably be done after the LPF operation, as the gain of the LPF may change the maximum value of your sequence.
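A minimal sketch of that simpler peak normalization, under the assumption that the whole buffer is available up front. The class name PeakNormalizer and the normalize signature are hypothetical; the peak is tracked as an int so that Math.abs(Short.MIN_VALUE) = 32768 does not overflow a short.

```java
class PeakNormalizer {
    /** Largest absolute sample value, considering negative excursions too. */
    static int findMaxAmplitude(short[] buffer) {
        int max = 0;
        for (short value : buffer) {
            max = Math.max(max, Math.abs((int) value));
        }
        return max;
    }

    /** Scales the buffer linearly so that its peak becomes targetMax. */
    static short[] normalize(short[] buffer, int targetMax) {
        short[] output = new short[buffer.length];
        int rawMax = findMaxAmplitude(buffer);
        if (rawMax == 0) {
            return output; // pure silence: nothing to scale
        }
        double gain = targetMax / (double) rawMax;
        for (int i = 0; i < buffer.length; i++) {
            long scaled = Math.round(buffer[i] * gain);
            // clamp defensively to the 16-bit range before casting
            output[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
        }
        return output;
    }

    public static void main(String[] args) {
        short[] out = normalize(new short[] {1000, -2000, 500}, 32767);
        System.out.println(out[1] + " " + out[2]); // -32767 8192
    }
}
```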

how to find 2 to the power of n . n ranges from 0 to 200

Assume my system is a 32-bit machine. Given this, if I use long int, then for n > 63 I get 0 as the value. How can I solve this?

double is perfectly capable of storing powers of two exactly for exponents up to 1023. Don't let someone tell you that floating point numbers are somehow always inexact. This is a special case where they aren't!
double x = 1.0;
for (int n = 0; n <= 200; ++n)
{
printf("2^%d = %.0f\n", n, x);
x *= 2.0;
}
Some output of the program:
2^0 = 1
2^1 = 2
2^2 = 4
2^3 = 8
2^4 = 16
...
2^196 = 100433627766186892221372630771322662657637687111424552206336
2^197 = 200867255532373784442745261542645325315275374222849104412672
2^198 = 401734511064747568885490523085290650630550748445698208825344
2^199 = 803469022129495137770981046170581301261101496891396417650688
2^200 = 1606938044258990275541962092341162602522202993782792835301376
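The same exactness claim can be checked in Java. The sketch below (the class name ExactPowersOfTwo is made up) builds 2^n as a double with Math.scalb, converts it to its exact decimal value with new BigDecimal(double), and compares that against BigInteger.ONE.shiftLeft(n).

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class ExactPowersOfTwo {
    /** True if the double 2^n holds exactly the integer 2^n. */
    static boolean isExact(int n) {
        double x = Math.scalb(1.0, n); // 1.0 * 2^n
        // new BigDecimal(double) yields the exact binary value, with no rounding
        BigInteger fromDouble = new BigDecimal(x).toBigIntegerExact();
        return fromDouble.equals(BigInteger.ONE.shiftLeft(n));
    }

    public static void main(String[] args) {
        boolean allExact = true;
        for (int n = 0; n <= 200; n++) {
            allExact &= isExact(n);
        }
        System.out.println("all exact: " + allExact);
        System.out.println(new BigDecimal(Math.scalb(1.0, 200)).toPlainString());
    }
}
```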

Just wait around for a 256-bit compiler, then use int :-)
No, seriously, since you just want to start with 1 and keep doubling, your best bet is to get a big integer library like GNU MP.
You would do that with a piece of code like (untested):
#include <stdio.h>
#include "gmp.h"
int main (void) {
int i;
mpz_t num;
mpz_init_set_ui (num, 1);
for (i = 0; i <= 200; i++) {
printf ("2^%d = ", i);
mpz_out_str (NULL, 10, num);
printf ("\n");
mpz_mul_ui (num, num, 2);
}
return 0;
}
You could code up your own data structure of an array of longs with only two operations, double and print but I think it would be far easier to just use GMP.
If you do want to roll your own, have a look at this. It's a variation/simplification of some big integer libraries I've developed in the past:
#include <stdio.h>
#include <stdlib.h>
// Use 16-bit integer for maximum portability. You could adjust
// these values for larger (or smaller) data types. SZ is the
// number of segments in a number, ROLLOVER is the maximum
// value of a segment plus one (needs to be less than the
// maximum value of your datatype divided by two). WIDTH is
// the width for printing (number of "0" characters in
// ROLLOVER).
#define SZ 20
#define ROLLOVER 10000
#define WIDTH 4
typedef struct {
int data[SZ];
} tNum;
// Create a number based on an integer. It allocates the segments
// then initialises all to zero except the last - that one is
// set to the passed-in integer.
static tNum *tNumCreate (int val) {
int i;
tNum *num = malloc (sizeof (tNum));
if (num == NULL) {
printf ("MEMORY ERROR\n");
exit (1);
}
for (i = 0; i < SZ - 1; i++) {
num->data[i] = 0;
}
num->data[SZ-1] = val;
return num;
}
// Destroy the number. Simple free operation.
static void tNumDestroy (tNum *num) {
free (num);
}
// Print the number. Ignores segments until the first non-zero
// one then prints it normally. All following segments are
// padded with zeros on the left to ensure number is correct.
// If no segments were printed, the number is zero so we just
// output "0". Then, no matter what, we output newline.
static void tNumPrint (tNum *num) {
int i, first;
for (first = 1, i = 0; i < SZ; i++) {
if (first) {
if (num->data[i] != 0) {
printf ("%d", num->data[i]);
first = 0;
}
} else {
printf ("%0*d", WIDTH, num->data[i]);
}
}
if (first) {
printf ("0");
}
printf ("\n");
}
// Double a number. Simplified form of add with carry. Carry is
// initialised to zero then we work with the segments from right
// to left. We double each one and add the current carry. If
// there's overflow, we adjust for it and set carry to 1, else
// carry is set to 0. If there's carry at the end, then we have
// arithmetic overflow.
static void tNumDouble (tNum *num) {
int i, carry;
for (carry = 0, i = SZ - 1; i >= 0; i--) {
num->data[i] = num->data[i] * 2 + carry;
if (num->data[i] >= ROLLOVER) {
num->data[i] -= ROLLOVER;
carry = 1;
} else {
carry = 0;
}
}
if (carry == 1) {
printf ("OVERFLOW ERROR\n");
exit (1);
}
}
// Test program to output all powers of 2^n where n is in
// the range 0 to 200 inclusive.
int main (void) {
int i;
tNum *num = tNumCreate (1);
printf ("2^ 0 = ");
tNumPrint (num);
for (i = 1; i <= 200; i++) {
tNumDouble (num);
printf ("2^%3d = ", i);
tNumPrint (num);
}
tNumDestroy (num);
return 0;
}
and its associated output:
2^ 0 = 1
2^ 1 = 2
2^ 2 = 4
2^ 3 = 8
2^ 4 = 16
2^ 5 = 32
2^ 6 = 64
2^ 7 = 128
2^ 8 = 256
2^ 9 = 512
: : : : :
2^191 = 3138550867693340381917894711603833208051177722232017256448
2^192 = 6277101735386680763835789423207666416102355444464034512896
2^193 = 12554203470773361527671578846415332832204710888928069025792
2^194 = 25108406941546723055343157692830665664409421777856138051584
2^195 = 50216813883093446110686315385661331328818843555712276103168
2^196 = 100433627766186892221372630771322662657637687111424552206336
2^197 = 200867255532373784442745261542645325315275374222849104412672
2^198 = 401734511064747568885490523085290650630550748445698208825344
2^199 = 803469022129495137770981046170581301261101496891396417650688
2^200 = 1606938044258990275541962092341162602522202993782792835301376

Python supports big integers out of the box. At any Linux prompt, run this:
$ python3 -c "for power in range(201): print(power, 2**power)"
0 1
1 2
2 4
3 8
4 16
5 32
6 64
<snip>
196 100433627766186892221372630771322662657637687111424552206336
197 200867255532373784442745261542645325315275374222849104412672
198 401734511064747568885490523085290650630550748445698208825344
199 803469022129495137770981046170581301261101496891396417650688
200 1606938044258990275541962092341162602522202993782792835301376
This can easily be made into a script if necessary. See any Python tutorial.

It's been ages since I've used Java seriously, but: BigInteger class? It has all the usual mathematical (multiply, pow) and bitwise (shiftLeft) operations.
Your tagging is a little confusing though, which language did you prefer?

Use java.math.BigInteger.shiftLeft.
for (int i = 0; i <= 200; i++) {
System.out.format("%d = %s%n", i, BigInteger.ONE.shiftLeft(i));
}
Excerpt of output:
0 = 1
1 = 2
2 = 4
3 = 8
4 = 16
:
197 = 200867255532373784442745261542645325315275374222849104412672
198 = 401734511064747568885490523085290650630550748445698208825344
199 = 803469022129495137770981046170581301261101496891396417650688
200 = 1606938044258990275541962092341162602522202993782792835301376
If BigInteger is unavailable, you can also just manually do the multiplication and store it in a String.
String s = "1";
for (int i = 0; i < 200; i++) {
StringBuilder sb = new StringBuilder();
int carry = 0;
for (char ch : s.toCharArray()) {
int d = Character.digit(ch, 10) * 2 + carry;
sb.append(d % 10);
carry = d / 10;
}
if (carry != 0) sb.append(carry);
s = sb.toString();
System.out.format("%d = %s%n", i + 1, sb.reverse());
}

In C/C++ I don't know of a standard way to store integers that big; pax's solution is the right way to go.
For Java, however, you do have a way out: BigInteger.

Use scheme!
1 => (expt 2 200)
1606938044258990275541962092341162602522202993782792835301376

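In Kotlin:
import java.math.BigDecimal
import java.text.DecimalFormat
val x = readLine()!!.toInt() // the exponent n, read from standard input
var y = BigDecimal(1)
for (i in 1..x)
{
y *= BigDecimal(2)
}
println(DecimalFormat().format(y))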

If unsigned long int is 64 bits then the largest value for 2^n that you can represent is 2^63 (i.e. n = 63):
unsigned long int x = (1UL << n); // n = 0..63
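For comparison, a hedged Java sketch of the same idea: Java has no unsigned long, but 1L << n produces the right bit pattern for n = 0..63, and Long.toUnsignedString prints it as an unsigned value. The class and method names are made up. Java only uses the low six bits of the shift count for a long, so 1L << 64 gives 1 again, which is why the range check is there.

```java
class PowersOfTwoLong {
    /** 2^n for n in 0..63, printed as an unsigned 64-bit value. */
    static String powerOfTwo(int n) {
        if (n < 0 || n > 63) {
            throw new IllegalArgumentException("n must be in 0..63");
        }
        return Long.toUnsignedString(1L << n);
    }

    public static void main(String[] args) {
        System.out.println(powerOfTwo(62)); // 4611686018427387904
        System.out.println(powerOfTwo(63)); // 9223372036854775808
    }
}
```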
