I have a .au audio file that I am trying to copy to another audio file, and I want the copied audio file to have half the volume. I have written the following code and it produces the following audio file:
for (int i = 24; i < bytes.length; i++) {
// bytes is a byte[] array containing every byte in the .au file
if (i % 2 == 0) {
short byteFrame = (short) (((bytes[i - 0]&0xFF) << 8) | ((bytes[i - 1]&0xFF)));
byteFrame >>= 1;
bytes[i - 0] = (byte) (byteFrame);
bytes[i - 1] = (byte) (byteFrame >>> 8);
}
}
The data I get from that code is this:
The following code is the same as above, only 'bytes[i - 0]' and 'bytes[i - 1]' have switched places. When I do that, the information in the channels gets swapped to the other channel.
for (int i = 24; i < bytes.length; i++) {
// bytes is a byte[] array containing every byte in the .au file
if (i % 2 == 0) {
short byteFrame = (short) (((bytes[i - 0]&0xFF) << 8) | ((bytes[i - 1]&0xFF)));
byteFrame *= 0.5;
bytes[i - 1] = (byte) (byteFrame);
bytes[i - 0] = (byte) (byteFrame >>> 8);
}
}
The data I get from that code is this (Information in the channels has been swapped):
I need to reduce the volume in both channels by half. Below is the Wikipedia page on the au file format. Any ideas on how to get the volume reduction to work properly? This file uses encoding 1 (8-bit G.711 mu-law), with 2 channels, 2 bytes per frame, and a sample rate of 48000. (The code works properly on encoding 3 but not on encoding 1.) Thanks in advance for any help offered.
http://en.wikipedia.org/wiki/Au_file_format
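One observation on the encoding mentioned in the question: G.711 mu-law stores each sample as a single companded (logarithmic) byte, so shifting or multiplying the raw bytes does not scale the amplitude linearly, which would explain why the same code works for linear encoding 3 but not for encoding 1. A minimal sketch of one way around this, assuming the usual G.711 constants (bias 0x84, clip 32635): decode each byte to a linear sample, halve it, and re-encode. The class and method names are hypothetical, and `dataOffset` is whatever offset the header declares (the question uses 24).

```java
final class MuLawHalver {
    private static final int BIAS = 0x84;   // standard G.711 mu-law bias
    private static final int CLIP = 32635;  // largest magnitude allowed before the bias is added

    /** Decodes one mu-law byte to a signed linear sample. */
    static int decode(byte mu) {
        int u = ~mu & 0xFF;                 // mu-law bytes are stored bit-inverted
        int exponent = (u >> 4) & 0x07;
        int mantissa = u & 0x0F;
        int magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS;
        return (u & 0x80) != 0 ? -magnitude : magnitude;
    }

    /** Encodes a signed linear sample (16-bit range) as one mu-law byte. */
    static byte encode(int sample) {
        int sign = sample < 0 ? 0x80 : 0x00;
        int magnitude = Math.min(Math.abs(sample), CLIP) + BIAS;
        // Position of the highest set bit above bit 7 selects the segment (0..7).
        int exponent = 31 - Integer.numberOfLeadingZeros(magnitude >> 7);
        int mantissa = (magnitude >> (exponent + 3)) & 0x0F;
        return (byte) ~(sign | (exponent << 4) | mantissa);
    }

    /** Halves every sample from dataOffset onwards; one byte is one sample, so both channels are covered. */
    static void halve(byte[] bytes, int dataOffset) {
        for (int i = dataOffset; i < bytes.length; i++) {
            bytes[i] = encode(decode(bytes[i]) / 2);
        }
    }
}
```

Because every byte is a complete sample, there is no byte pairing and therefore no risk of swapping the two channels.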
Use a ByteBuffer. It appears that you use 16 bit quantities in little endian order, and that you want to right shift them by 1.
Therefore:
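// requires: import java.nio.ByteBuffer; import java.nio.ByteOrder;
final ByteBuffer orig = ByteBuffer.wrap(bytes).asReadOnlyBuffer()
.order(ByteOrder.LITTLE_ENDIAN); // set the order last: asReadOnlyBuffer() returns a big-endian buffer
final ByteBuffer transformed = ByteBuffer.allocate(bytes.length)
.order(ByteOrder.LITTLE_ENDIAN);
while (orig.remaining() >= 2)
// mask to 16 bits first, so that >>> does not see the sign-extended int
transformed.putShort((short) ((orig.getShort() & 0xFFFF) >>> 1));
return transformed.array();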
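Note that >>> is what avoids carrying the sign bit. Because Java promotes a short to an int before shifting, the value has to be masked with 0xFFFF first for this to hold on 16-bit quantities.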
That is, trying to use >> 1 on:
1001 0111
will give:
1100 1011
i.e., the sign bit (the most significant bit) is carried. This is why >>> exists in Java: it does not carry the sign bit, so using >>> 1 on the above gives:
0100 1011
As seems logical when doing bit shifting!
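One caveat worth illustrating: on short and byte operands Java first promotes the value to int, so >>> only behaves as described once the value is masked. The sketch below compares the three variants on a 16-bit value; the class and method names are hypothetical. If the samples are signed linear PCM, it is the arithmetic shift >> that halves the amplitude, while the logical shift turns a negative sample into a large positive one.

```java
final class ShiftDemo {
    /** Arithmetic shift: keeps the sign, so a signed sample is halved. */
    static short halveSigned(short s) {
        return (short) (s >> 1);
    }

    /** Logical shift of the 16-bit pattern: mask first so no sign-extension bits exist. */
    static short shiftUnsigned(short s) {
        return (short) ((s & 0xFFFF) >>> 1);
    }

    /** >>> without the mask: the promoted int still carries the sign into bit 15. */
    static short shiftNaive(short s) {
        return (short) (s >>> 1);
    }

    public static void main(String[] args) {
        short s = (short) 0x9700; // -26880 as a signed 16-bit value
        System.out.println(halveSigned(s));   // -13440
        System.out.println(shiftUnsigned(s)); // 19328
        System.out.println(shiftNaive(s));    // -13440, same as >> after the cast
    }
}
```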
Related
I have an application in which I'm trying to send UDP messages using TSLv5. The protocol is somewhat complicated, but to do it I need to write 16 bit values as little-endian and then send that via UDP.
Here's my code doing just that:
buffer.order(ByteOrder.LITTLE_ENDIAN);
buffer.putShort(SCREEN_POS, screen);
buffer.putShort(INDEX_POS, index);
ByteBuffer text = ByteBuffer.wrap(value.getBytes());
short textLength = (short) value.getBytes().length;
buffer.putShort(LENGTH_POS, textLength);
ByteBuffer combined = ByteBufferUtils.concat(buffer, text);
short control = 0x00;
control |= rhTally << 0;
control |= textTally << 2;
control |= lhTally << 4;
control |= brightness << 6;
combined.putShort(CONTROL_POS, control);
short msgLength = (short) (combined.array().length - 2);
combined.putShort(PBC_POS, msgLength);
return new DatagramPacket(combined.array(), combined.array().length, ip, 9000);
This mostly works, but the problem is when I have values that are greater than 127.
For example, my index is 148 and when all is said and done, my control comes out to be 193. When I write those values to the ByteBuffer they become -108 and -63, respectively.
I know why this happens: a ByteBuffer is an array of bytes, and bytes can't be greater than 127. What I don't know is how to work around it. The protocol does not work if I send signed values; it has to be the exact number.
I can assure you that a signed Java byte will be read correctly as one of the two bytes of a short: the bits sent are the same whether you view them as signed or unsigned. I have simplified the code, writing the fields one after the other in linear fashion, with the message fields in front. I also used just one ByteBuffer.
(Maybe there is some small error like a wrong offset.)
I also send the text bytes as UTF-8. You used the implicit platform encoding, which may differ on every computer.
byte[] text = value.getBytes(StandardCharsets.UTF_8);
int textLength = text.length;
int length = 2 + 2 + 2 + 2 + 2 + textLength;
ByteBuffer buffer = ByteBuffer.allocate(length)
.order(ByteOrder.LITTLE_ENDIAN);
short control = 0x00;
control |= rhTally << 0;
control |= textTally << 2;
control |= lhTally << 4;
control |= brightness << 6;
buffer.putShort(/*CONTROL_POS,*/ control);
short msgLength = (short) (length - 2);
buffer.putShort(/*PBC_POS,*/ msgLength);
buffer.putShort(/*SCREEN_POS,*/ screen);
buffer.putShort(/*INDEX_POS,*/ index);
buffer.putShort(/*LENGTH_POS,*/ (short)textLength);
buffer.put(text, 0, textLength);
return new DatagramPacket(buffer.array(), length, ip, 9000);
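To make the point about signed bytes concrete, here is a small sketch (class and method names are hypothetical) showing that 148 written as a little-endian short produces the byte Java prints as -108, and that the receiver recovers 148 by masking. Only Java's view of the byte is signed; the bits in the packet are unchanged.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class UnsignedBytesDemo {
    /** Writes the low 16 bits of value as two little-endian bytes. */
    static byte[] encodeShortLE(int value) {
        return ByteBuffer.allocate(2)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putShort((short) value)
                .array();
    }

    /** Reads two little-endian bytes back as an unsigned value (0..65535). */
    static int decodeShortLE(byte[] bytes) {
        return ByteBuffer.wrap(bytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] wire = encodeShortLE(148);
        System.out.println(wire[0]);             // -108: Java's signed view of 0x94
        System.out.println(wire[0] & 0xFF);      // 148: the same bits, read as unsigned
        System.out.println(decodeShortLE(wire)); // 148
    }
}
```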
For a peer-to-peer audio client, I need to have the ability to change the output volume to a desired level. In my case, the volume is a floating point number between zero and one.
I modify the audio stream this way:
void play(byte[] buf)
{
for (int i = 0; i < buf.length; i += 2)
{
// sample size is 2 bytes, so convert to int and then back
int data = ((buf[i + 1] & 0xFF) << 8) | (buf[i] & 0xFF);
data = (int) (data * outputVolume);
buf[i] = (byte) (data & 0xFF);
buf[i + 1] = (byte) (data >> 8);
}
line.write(buf, 0, Math.min(buf.length, line.available()));
}
Now, when outputVolume is set to 0, the output is silent. When it is set to 1, it behaves normally and the quality is fine (as the data is not modified). But any number between 0 and 1 produces a horrible noise which is louder than the expected stream itself. At 0.5, the noise reaches its loudest point.
I don't want to use the controls of the audio mixer itself (like gain control or volume control) because I had compatibility problems this way and later on, I want to modify the byte stream even more so I have to iterate through the stream anyway.
Assuming the audio data is signed (because I think it would be pretty unusual to have unsigned 16-bit samples), there is a mistake in that code, because you also need to sign extend the sample.
You can remove the & 0xFF from the high byte which will let sign extension happen automatically:
int data = (buf[i + 1] << 8) | (buf[i] & 0xFF);
If for some reason you couldn't modify the and-shift-or expression, you could do sign extension like this:
int data = ((buf[i + 1] & 0xFF) << 8) | (buf[i] & 0xFF);
data = (data << 16) >> 16;
The result of the shifting expression is equivalent to this:
if (data > 32767)
data -= 65536;
And this:
if ((data & 0x80_00) != 0)
data |= 0xFF_FF_00_00;
(Those would also work.)
However, in your case you can just remove the & 0xFF from the high byte.
For a quick explanation, if you had some 16-bit sample like this (which is -1):
11111111_11111111
If you just convert to 32-bit without sign extending, you would get:
00000000_00000000_11111111_11111111
But that's 65535, not -1. Sign extension fills the upper bits with 1s if the MSB in the 16-bit value was set:
11111111_11111111_11111111_11111111
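Putting the fix together, a minimal sketch of the volume loop, assuming signed 16-bit little-endian PCM as in the question. The class name `PcmVolume` and the rounding and clamping steps are additions for illustration, not part of the original code.

```java
final class PcmVolume {
    /** Scales signed 16-bit little-endian PCM samples in place; volume is expected in [0, 1]. */
    static void scale(byte[] buf, double volume) {
        for (int i = 0; i + 1 < buf.length; i += 2) {
            // No mask on the high byte, so its sign extends into the int.
            int sample = (buf[i + 1] << 8) | (buf[i] & 0xFF);
            int scaled = (int) Math.round(sample * volume);
            // Clamp in case a caller passes a volume above 1.
            scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            buf[i] = (byte) scaled;            // low byte
            buf[i + 1] = (byte) (scaled >> 8); // high byte
        }
    }
}
```

With the mask left on the high byte, a sample of -1000 would be read as 64536 and scaled to 32268, which is the loud noise described in the question.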
I have to compress a list of short-values into a byte array, but only the last X bits of the value.
Given this method:
byte[] compress(int bitsPerWord, List<Short> input){
...
}
The values in the input will never be larger than what fits in bitsPerWord bits.
Example: 10 bits per word => maximum value 1023
I also may not waste bits, I have to save X bits in the first Y bytes, and then append the next X bits directly to them.
Example:
Input(Short) [ 500, 150, 100 ]
Input(Binary):0000000111110100 0000000010010110 0000000001100100
Output (10 bits per short): 0111110100 0010010110 0001100100
Output (As byte array):0111 1101 0000 1001 0110 0001 1001 0000
What the result should look like
Is there any way to do this efficiently? BitSet does not seem to fit this task, because I would have to set every single bit explicitly.
Thanks
Efficient in what way?
In terms of work required, extending BitSet adding a bulk put method and an index is super efficient; little work and thinking required.
The alternative, shifting and masking bits is moderately complicated in terms of programming effort if you know your ways with bitwise operations. It may be a major obstacle if you don't.
Considering you already use wrapper types and collections, which indicates that throughput is not your major concern, extending BitSet is probably all you need.
You need to perform some bit manipulations, and for that to work you need to find a repeatable pattern. In this case, you have a list of "short" values, but actually you just use the rightmost 10 bits. Since you want to pack those into bytes, the minimum repeatable pattern is 40 bits long (5 bytes, 4 10-bit values). That is the "block size" for processing.
You would then have a loop that would do the main parsing of full blocks, plus maybe a special case at the end for the final incomplete block.
byte[] pack10(List<Short> source) {
final int nBlk = source.size() / 4;
final int remBits = (source.size() % 4) * 10;
final int remBytes = (remBits / 8) + (remBits % 8 > 0 ? 1 : 0);
byte[] ret = new byte[nBlk*5 + remBytes];
final short bitPat = (short)0b0000001111111111;
for (int iBlk = 0; iBlk < nBlk; ++iBlk) {
// Parse full blocks
List<Short> curS = source.subList(iBlk*4, (iBlk+1)*4);
ret[iBlk*5 ] = (byte) ((curS.get(0) & bitPat) >> 2);
ret[iBlk*5+1] = (byte) ((curS.get(0) & bitPat) << 6
| (curS.get(1) & bitPat) >> 4);
ret[iBlk*5+2] = (byte) ((curS.get(1) & bitPat) << 4
| (curS.get(2) & bitPat) >> 6);
ret[iBlk*5+3] = (byte) ((curS.get(2) & bitPat) << 2
| (curS.get(3) & bitPat) >> 8);
ret[iBlk*5+4] = (byte) (curS.get(3) & bitPat);
}
// Parse final values
List<Short> remS = source.subList(nBlk*4, source.size());
if (remS.size() >= 1) {
ret[nBlk*5 ] = (byte) ((remS.get(0) & bitPat) >> 2);
ret[nBlk*5+1] = (byte) ((remS.get(0) & bitPat) << 6);
}
if (remS.size() >= 2) { // The second byte is appended to
ret[nBlk*5+1] |= (byte) ((remS.get(1) & bitPat) >> 4);
ret[nBlk*5+2] = (byte) ((remS.get(1) & bitPat) << 4);
}
if (remS.size() == 3) { // The third byte is appended to
ret[nBlk*5+2] |= (byte) ((remS.get(2) & bitPat) >> 6);
ret[nBlk*5+3] = (byte) ((remS.get(2) & bitPat) << 2);
}
return ret;
}
That is a specific version for 10-bit values; if you want a version for a generic number of bits you'd have to generalise from that. The bit pattern operations change, and the whole thing becomes less efficient if the pattern is computed at runtime (i.e. if the number of bits is a variable, as in your example).
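The generalisation mentioned above can be sketched with a bit accumulator instead of fixed blocks: push each value's low bits into a long, and emit a byte whenever eight or more bits are pending. This is only one possible approach, assuming 1 to 16 bits per word and most-significant-bit-first output with zero padding at the end; the class name `BitPacker` is hypothetical, and the method signature follows the one in the question.

```java
import java.util.List;

final class BitPacker {
    /** Packs the low bitsPerWord bits of each value back to back, MSB first. */
    static byte[] compress(int bitsPerWord, List<Short> input) {
        if (bitsPerWord < 1 || bitsPerWord > 16) {
            throw new IllegalArgumentException("bitsPerWord must be 1..16");
        }
        int totalBits = bitsPerWord * input.size();
        byte[] out = new byte[(totalBits + 7) / 8];
        int mask = (1 << bitsPerWord) - 1;
        long acc = 0;     // pending bits, right-aligned
        int accBits = 0;  // number of pending bits in acc
        int pos = 0;
        for (short value : input) {
            acc = (acc << bitsPerWord) | (value & mask);
            accBits += bitsPerWord;
            while (accBits >= 8) {
                accBits -= 8;
                out[pos++] = (byte) (acc >>> accBits); // emit the top 8 pending bits
            }
            acc &= (1L << accBits) - 1; // keep only the bits not yet written
        }
        if (accBits > 0) {
            out[pos] = (byte) (acc << (8 - accBits)); // left-align the tail, pad with zeros
        }
        return out;
    }
}
```

The inner loop runs at most three times per value, so the cost stays linear in the input size; the price for the generality is the runtime mask and shift amounts.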
There are several people who have already written a BitOutputStream in Java. Pick one of them, wrap it around a ByteArrayOutputStream, and you’re done.
As in the title. I am writing an application where I must save as much RAM as possible. I want to split a byte into two parts of 4 bits each (numbers from 0 to 15). How can I save and later read these values?
If what you want is store 2 numbers (range 0-15) in one byte, here's an example on how to save and restore them. Note that you have to make sure your original numbers are within the allowed range, otherwise this will not work.
// Store both numbers in one byte
byte firstNumber = 10;
byte secondNumber = 15;
final byte bothNumbers = (byte) ((firstNumber << 4) | secondNumber);
// Retrieve the original numbers
firstNumber = (byte) ((bothNumbers >> 4) & (byte) 0x0F);
secondNumber = (byte) (bothNumbers & 0x0F);
Also worth noting that if you want to save as much memory as possible you shouldn't be using Java to start with. JVM already consumes memory by itself. Native languages are much more suited to this requirement.
You can get lower bits as
byte lower = (byte) (b & 0xF);
higher bits as
byte higher = (byte) ((b >> 4) & 0xF);
and back to one byte
byte b = (byte) (lower + (higher << 4));
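If many such values need to be stored, the same shifts and masks can be hidden behind a small array-like class. This is only a sketch under the assumption that values are addressed by index and the even index goes in the high nibble; the class name `NibbleArray` is hypothetical.

```java
final class NibbleArray {
    private final byte[] data;

    /** Allocates room for size 4-bit values, two per byte. */
    NibbleArray(int size) {
        data = new byte[(size + 1) / 2];
    }

    /** Returns the value (0..15) stored at index. */
    int get(int index) {
        int b = data[index >> 1] & 0xFF; // mask so the byte is read as unsigned
        return (index & 1) == 0 ? b >> 4 : b & 0x0F;
    }

    /** Stores value (0..15) at index without touching the neighbouring nibble. */
    void set(int index, int value) {
        if (value < 0 || value > 15) {
            throw new IllegalArgumentException("value must be in 0..15");
        }
        int i = index >> 1;
        if ((index & 1) == 0) {
            data[i] = (byte) ((data[i] & 0x0F) | (value << 4)); // replace the high nibble
        } else {
            data[i] = (byte) ((data[i] & 0xF0) | value);        // replace the low nibble
        }
    }
}
```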
I am learning how to do VoIP over UDP in a small network. I know there are plenty of libraries ready to do (and overdo) everything I will ever need with a few method calls, but as I said I am learning, so I need to reinvent the wheel to see how it works.
I am currently investigating the DatagramPacket class, and I've noticed that it has no method for setting header information (i.e. the packet sequence number, which I need to know in order to do interleaving).
A little code to reflect the environment:
byte[] block;
DatagramPacket packet; // UDP packet
/* x Bytes per block , y blocks per second,
z ms time block playback duration */
block = recorder.getBlock(); // assume I have class that handles audio
// recording and returns speech in a
// uncompressed form of bytes
packet = new DatagramPacket(block, block.length, clientIP, PORT);
Firstly, I assume that because it is UDP, the sender doesn't care about anything besides the simple fact that he throws packets somewhere. So that is why there is no such method.
Secondly, I assume that I need to do it myself: add extra bytes to the byte block to be sent, which would contain the sequence number of the packet? However, I am also concerned that if I do that, how do I recognize which bytes are header bytes and which are audio bytes? I can assume that the first byte represents a number, but a byte can only represent 256 numbers. I've never really worked at the byte level before. Or are there maybe other techniques?
In short, to do interleaving I need to know how to set up a packet sequence number, as I can't order unordered packets :-)
Thank You,
You'll need to serialize/deserialize data types your program uses onto a byte array.
Let's assume you're talking about RTP, and you'd want to send a packet with these fields (look at chapter 5 in the RTP specs):
Version = 2
padding = 0
extension = 0
CSRC count = 1
marker = 0
payload type = 8 (G711 alaw)
sequence number = 1234
timestamp = 1
one CSRC = 4321
Let's put these into some variables, using integers for ease, or long when we need to deal with an unsigned 32 bit value:
int version = 2;
int padding = 0;
int extension = 0;
int csrcCount = 1;
int marker = 0;
int payloadType = 8;
int sequenceNumber = 1234;
long timestamp = 1;
long ourCsrc = 4321;
byte buf[] = ...; //allocate this big enough to hold the RTP header + audio data
//assemble the first bytes according to the RTP spec (note, the spec marks version as bit 0 and 1, but
//this is really the high bits of the first byte)
buf[0] = (byte) ((version & 0x3) << 6 | (padding & 0x1) << 5 | (extension & 0x1) << 4 | (csrcCount & 0xf));
//second byte
buf[1] = (byte)((marker & 0x1) << 7 | payloadType & 0x7f);
//sequence number, 2 bytes, in big endian format. So the MSB first, then the LSB.
buf[2] = (byte)((sequenceNumber & 0xff00) >> 8);
buf[3] = (byte)(sequenceNumber & 0x00ff);
//packet timestamp , 4 bytes in big endian format
buf[4] = (byte)((timestamp & 0xff000000) >> 24);
buf[5] = (byte)((timestamp & 0x00ff0000) >> 16);
buf[6] = (byte)((timestamp & 0x0000ff00) >> 8);
buf[7] = (byte) (timestamp & 0x000000ff);
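//our source identifier, 4 bytes in big endian format
//(note: in RFC 3550, bytes 8-11 of the fixed header hold the SSRC and any CSRC entries
//follow from byte 12, so a strictly conforming packet with CSRC count = 1 has a 16-byte header)
buf[ 8] = (byte)((ourCsrc & 0xff000000) >> 24);
buf[ 9] = (byte)((ourCsrc & 0x00ff0000) >> 16);
buf[10] = (byte)((ourCsrc & 0x0000ff00) >> 8);
buf[11] = (byte) (ourCsrc & 0x000000ff);
That's the header as sketched here; now you can copy the audio bytes into buf, starting at buf[12], and send buf as one packet.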
Now, the above is of course just to show the principles. An actual serializer for an RTP packet would have to deal with much more, in accordance with the RTP specification: e.g. you might need some extension headers, you might need more than one CSRC, you need the correct payload type for the format of the audio data you have, and you need to packetize and schedule the audio data correctly. For G.711 A-law, for example, you should fill each RTP packet with 160 bytes of audio data and send one packet every 20 milliseconds.
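The same serialization can be sketched more compactly with java.nio.ByteBuffer, which is big-endian by default. This is a hedged illustration rather than a complete RTP implementation: the class and method names are hypothetical, the CSRC count is set to 0, and the source identifier is written to bytes 8-11 as the SSRC, following the fixed 12-byte header layout in RFC 3550. It also addresses the question of telling header bytes from audio bytes: the header has a fixed length, so the receiver reads the sequence number at a known offset and treats everything after the header as audio.

```java
import java.nio.ByteBuffer;

final class RtpHeaderSketch {
    static final int HEADER_LENGTH = 12; // fixed RTP header without CSRC entries

    /** Builds header plus payload; only the low 16/32 bits of the numeric fields are written. */
    static byte[] packetize(int sequenceNumber, long timestamp, long ssrc,
                            int payloadType, byte[] audio) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH + audio.length);
        buf.put((byte) (2 << 6));             // version 2, no padding, no extension, CSRC count 0
        buf.put((byte) (payloadType & 0x7F)); // marker bit 0
        buf.putShort((short) sequenceNumber);
        buf.putInt((int) timestamp);
        buf.putInt((int) ssrc);
        buf.put(audio);
        return buf.array();
    }

    /** Reads the unsigned 16-bit sequence number from a received packet. */
    static int sequenceNumber(byte[] packet) {
        return ByteBuffer.wrap(packet).getShort(2) & 0xFFFF;
    }

    /** Copies the audio bytes that follow the fixed header. */
    static byte[] payload(byte[] packet) {
        byte[] audio = new byte[packet.length - HEADER_LENGTH];
        System.arraycopy(packet, HEADER_LENGTH, audio, 0, audio.length);
        return audio;
    }
}
```

On the receiving side, `sequenceNumber(packet)` gives the value needed to reorder packets; since it wraps around after 65535, a real receiver would also have to handle that rollover.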