Background
I am taking 8, 16, 24 or 32 bit audio data and converting it to integers. BigInteger cannot be recycled, and using it wastes a lot of memory, so I created this class to cut the memory consumption. ByteBuffer seems to do the job well, except when the input is 3 bytes long.
I have never done any bit or byte operations, so I am completely lost here.
Issue
None of the 3-bytes-to-int examples that I found on Stack Overflow give the wanted result. Check the bytes3ToInt method.
Question
Is there something obvious that I am doing completely wrong?
Is return new BigInteger(data).intValue(); really the only solution to this?
Code
import java.math.BigInteger;
import java.nio.ByteBuffer;
class BytesToInt {
// HELP
private static int bytes3ToInt(byte[] data) {
// none below seem to work, even if I swap first and last bytes
// these examples are taken from stackoverflow
//return (data[2] & 0xFF) | ((data[1] & 0xFF) << 8) | ((data[0] & 0x0F) << 16);
//return ((data[2] & 0xF) << 16) | ((data[1] & 0xFF) << 8) | (data[0] & 0xFF);
//return ((data[2] << 28) >>> 12) | (data[1] << 8) | data[0];
//return (data[0] & 255) << 16 | (data[1] & 255) << 8 | (data[2] & 255);
return (data[2] & 255) << 16 | (data[1] & 255) << 8 | (data[0] & 255);
// Only thing that works, but wastes memory
//return new BigInteger(data).intValue();
}
public static void main(String[] args) {
// Test with -666 example number
byte[] negativeByteArray3 = new byte[] {(byte)0xff, (byte)0xfd, (byte)0x66};
testWithData(negativeByteArray3);
}
private static void testWithData(byte[] data) {
// Compare our converter to BigInteger
// Which we know gives wanted result
System.out.println("Converter = " + bytes3ToInt(data));
System.out.println("BigInteger = " + new BigInteger(data).intValue());
}
}
Output
Converter = 6749695
BigInteger = -666
Full code here: http://ideone.com/qu9Ulw
First of all, your indices are wrong: for big-endian input the most significant byte is data[0], so the order is 0, 1, 2, not 2, 1, 0.
Secondly, the sign isn't being extended, so although the code works for positive values, negative values come out wrong.
If you don't mask the highest byte (of the 24 bits), it will sign-extend properly, filling the highest byte (of the 32-bit int) with 0x00 for positive values or 0xFF for negative values.
return (data[0] << 16) | (data[1] & 255) << 8 | (data[2] & 255);
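For completeness, here is a minimal sketch of the corrected converter together with a general variant for 1-4 big-endian bytes (the helper name bytesToIntSigned is mine, not from the question; it should cover the 8/16/24/32-bit audio cases):
class BytesToIntSketch {
    // Corrected 3-byte conversion: leave data[0] unmasked so its sign bit
    // is extended into the top 8 bits of the resulting int.
    static int bytes3ToInt(byte[] data) {
        return (data[0] << 16) | ((data[1] & 0xFF) << 8) | (data[2] & 0xFF);
    }

    // Sketch of a general variant for 1 to 4 big-endian bytes.
    // Packing into the low bits and then shifting left and arithmetically
    // right sign-extends the value to a full 32-bit int.
    static int bytesToIntSigned(byte[] data) {
        int value = 0;
        for (byte b : data) {
            value = (value << 8) | (b & 0xFF);
        }
        int unused = 32 - 8 * data.length;
        return (value << unused) >> unused;
    }

    public static void main(String[] args) {
        byte[] minus666 = {(byte) 0xFF, (byte) 0xFD, (byte) 0x66};
        System.out.println(bytes3ToInt(minus666));      // -666
        System.out.println(bytesToIntSigned(minus666)); // -666
    }
}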
Related
I'm trying to convert this snippet from C# to Java. The C# snippet correctly returns the value 3259945, while the Java code incorrectly returns -16855. I'm completely useless at bit manipulation and have no idea where to even start. Can anyone help?
If people need the input variables, I'll try to get the buffer byte array as a hex string so I can post it. The startIndex I'm using is 26.
C# snippet:
Int64 mantissa = ((Int64)(buffer[startIndex] & 0x7F) << (8 * 2))
| ((Int64)buffer[startIndex + 3] << (8 * 1))
| ((Int64)buffer[startIndex + 2] << (8 * 0));
Java Snippet:
long mantissa = ((long)(buffer[startIndex] & 0x7F) << (8 * 2))
| ((long)buffer[startIndex + 3] << (8 * 1))
| ((long)buffer[startIndex + 2] << (8 * 0));
As mentioned in the comments, in .NET a byte is unsigned (0 to 255) and in Java it is signed (-128 to 127). To normalize it, you need to use the & 0xFF mask.
long mantissa = ((long)(buffer[startIndex] & 0x7F) << (8 * 2))
| ((long)(buffer[startIndex + 3] & 0xFF) << (8 * 1))
| ((long)(buffer[startIndex + 2] & 0xFF) << (8 * 0));
For the first byte you don't need the extra mask, because the & 0x7F already clears the sign bit.
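To see the promotion difference in isolation, here is a small sketch using a made-up byte value (0xBE), not the asker's actual buffer:
class BytePromotionDemo {
    public static void main(String[] args) {
        byte b = (byte) 0xBE;           // 190 in .NET, but -66 as a signed Java byte

        long unmasked = (long) b;       // sign-extended: 0xFFFFFFFFFFFFFFBE == -66
        long masked   = b & 0xFF;       // masked back to the unsigned value: 0xBE == 190

        System.out.printf("unmasked = %d (0x%X)%n", unmasked, unmasked);
        System.out.printf("masked   = %d (0x%X)%n", masked, masked);
    }
}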
I know there are many similar questions on here, but I have a weird case. What I want to do is convert a byte[4] to an int.
Here is the conversion from int to byte:
int data24bit = 51;
for (int i = 0; i < 3; i++) {
data8bit = (byte)(data24bit & 0x0000FF);
data24bit = data24bit >> 8;
byte_file.put(data8bit);
}
So far that is clear enough. After that I want to read these 4 bytes back to get the 51. I tried to do it in different ways:
Reading 4 bytes:
byte[] bytes = new byte[4];
for (int i = 0; i < 3; i++) {
byte b = dis.readByte();
bytes[i] = b;
}
// bytes[3] = (byte)(0x000000);
Convert bytes to int:
int value = 0;
value = ((0xFF & bytes[0]) << 24) | ((0xFF & bytes[1]) << 16) |
((0xFF & bytes[2]) << 8) | (0xFF & bytes[3]);
or
value = ByteBuffer.wrap(bytes).getInt();
or
value = new BigInteger(bytes).intValue();
I always get 855638016 as the result, where 51 is expected.
When I debug the code and look into the byte array I can see the following content: [51, 0, 0, 0].
What am I doing wrong?
The problem is that you're writing the bytes in little-endian order (least significant byte first), but reading them back assuming big-endian.
After writing it out, your byte array looks like this:
[51, 0, 0, 0]
Then you're trying to convert that back into an integer, like in this example from your post:
value = ((0xFF & bytes[0]) << 24)
| ((0xFF & bytes[1]) << 16)
| ((0xFF & bytes[2]) << 8)
| (0xFF & bytes[3]);
If you fill in the actual values, that calculation is basically this:
value = 51 * 256 * 256 * 256
+ 0 * 256 * 256
+ 0 * 256
+ 0
= 855638016
While what you actually want is this:
value = 0 * 256 * 256 * 256
+ 0 * 256 * 256
+ 0 * 256
+ 51
= 51
The fixed calculation would thus be this:
value = ((0xFF & bytes[3]) << 24)
| ((0xFF & bytes[2]) << 16)
| ((0xFF & bytes[1]) << 8)
| (0xFF & bytes[0]);
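Alternatively, if you'd rather not shift by hand, ByteBuffer can do the same job once you tell it the data is little-endian. A small sketch, assuming the same [51, 0, 0, 0] array:
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianRead {
    public static void main(String[] args) {
        byte[] bytes = {51, 0, 0, 0};
        // ByteBuffer defaults to big-endian, so the byte order must be set explicitly.
        int value = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
        System.out.println(value); // prints 51
    }
}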
OK, stupid enough: I just didn't preserve the byte order.
[51, 0, 0, 0] -> is 855638016
[0, 0, 0, 51] -> is 51
public static int liEndVal (Byte[] mem) {
return (mem[0] & 0xFF)
| ((mem[1] & 0xFF) << 8)
| ((mem[2] & 0xFF) << 16)
| ((mem[3] & 0xFF) << 24);
}
How can I modify this method so that when my input is, for example, 45 A2 BD 8A, the little-endian integer output will not be negative? I don't understand why it keeps returning the two's complement integer.
When (mem[3] & 0xFF) > 0x7F, the returned int will be negative, since the maximum value of an int is 0x7FFFFFFF. If you want a positive return value, return a long.
public static long liEndVal (Byte[] mem) {
return (mem[0] & 0xFF)
| ((mem[1] & 0xFF) << 8)
| ((mem[2] & 0xFF) << 16)
| (((long)mem[3] & 0xFF) << 24);
}
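If you are on Java 8 or later, another option is to keep the int-returning version from the question and widen the result only when you need the unsigned value, via Integer.toUnsignedLong. A sketch (the class name and sample input are mine):
class UnsignedDemo {
    // Same little-endian packing as the question's int version.
    static int liEndVal(Byte[] mem) {
        return (mem[0] & 0xFF)
                | ((mem[1] & 0xFF) << 8)
                | ((mem[2] & 0xFF) << 16)
                | ((mem[3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        Byte[] mem = {0x45, (byte) 0xA2, (byte) 0xBD, (byte) 0x8A};
        int packed = liEndVal(mem);                      // negative as an int: 0x8ABDA245
        long unsigned = Integer.toUnsignedLong(packed);  // the same 32 bits, now non-negative
        System.out.println(Long.toHexString(unsigned));  // 8abda245
    }
}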
Because in that representation, the (signed) integer is negative. Looks like you need an unsigned int.
I think the answer here is probably that you shouldn't mind that the result is negative: just treat it as unsigned, and regard the signedness of the output as unimportant. With a 32-bit int you cannot eliminate the possibility of negative output, but I think you're wrong that it matters.
I'm having trouble making sense of how Java promotes bytes to ints with bitwise operations. I'm attempting to implement AES, and while my output is correct as a 2D byte array, I ultimately need to store it in a 1D int array. However, the following code changes some of the expected values:
ciphertexts[0] = ((state[0][0] & 0xFF) << 24) ^ ((state[1][0] & 0xFF) << 16)
^ ((state[2][0] & 0xFF) << 8) ^ state[3][0];
ciphertexts[1] = ((state[0][1] & 0xFF) << 24) ^ ((state[1][1] & 0xFF) << 16)
^ ((state[2][1] & 0xFF) << 8) ^ state[3][1];
ciphertexts[2] = ((state[0][2] & 0xFF) << 24) ^ ((state[1][2] & 0xFF) << 16)
^ ((state[2][2] & 0xFF) << 8) ^ state[3][2];
ciphertexts[3] = ((state[0][3] & 0xFF) << 24) ^ ((state[1][3] & 0xFF) << 16)
^ ((state[2][3] & 0xFF) << 8) ^ state[3][3];
I didn't particularly expect masking with 0xFF to help, since the mask should just return the original byte value, but then I tried this:
int zero = ((state[0][0] & 0xFF) << 24);
int one = ((state[0][1] & 0xFF) << 16);
int two = ((state[0][2] & 0xFF) << 8) ;
int three = (state[0][3] & 0xFF);
int total = zero ^ one ^ two ^ three;
printhex(zero);
printhex(one);
printhex(two);
printhex(three);
printhex(total);
Which gives the following output:
69000000
006A0000
0000D800
00000070
696AD870
Which is what I'm trying to do with the code above. Without the masking, this code gives the output below:
int zero = (state[0][0] << 24);
int one = (state[0][1] << 16);
int two = (state[0][2] << 8);
int three = state[0][3];
int total = zero ^ one ^ two ^ three;
69000000
006A0000
FFFFD800
00000070
9695D870
I also tried what seemed to me more sensible, which is masking after shifting, and got similarly messed up output:
ciphertexts[0] = ((state[0][0] << 24) & 0xFFFFFFFF) ^
((state[1][0] << 16) & 0xFFFFFF) ^ ((state[2][0] << 8) & 0xFFFF)
^ state[3][0];
ciphertexts[1] = ((state[0][1] << 24) & 0xFFFFFFFF) ^
((state[1][1] << 16) & 0xFFFFFF) ^ ((state[2][1] << 8) & 0xFFFF)
^ state[3][1];
ciphertexts[2] = ((state[0][2] << 24) & 0xFFFFFFFF) ^
((state[1][2] << 16) & 0xFFFFFF) ^ ((state[2][2] << 8) & 0xFFFF)
^ state[3][2];
ciphertexts[3] = ((state[0][3] << 24) & 0xFFFFFFFF) ^
((state[1][3] << 16) & 0xFFFFFF) ^ ((state[2][3] << 8) & 0xFFFF)
^ state[3][3];
Where "messed up" means:
ciphertext at round 9 is 963b1fd86a7b04302732488070b4c55a
instead of:
69C4E0D86A7B0430D8CDB78070B4C55A
So my questions are: how do I neatly OR bytes together into an int, and what is actually going on with the masking and shifting? I looked at other answers and can't figure out why they're not working in this case. Thanks!
That's the cruelty of a language lacking unsigned types
(one can get the same result in C by using a signed char, i.e. a signed byte).
Let's ignore the shifts and concentrate only on the assignment and the &.
The example value here is 0xfe instead of 0xd8
(the problem will happen with every value between 0x80 and 0xff).
With the problem, Java:
byte a = (byte) 0xfe;
int i = a;
With the problem, C:
signed char a = 0xfe;
int i = a;
What happens: a byte can hold values between -128 and +127.
0xfe maps to a negative number (two's complement): -2,
and so i gets the value -2, and i is not 8 bits but 32 bits wide.
According to the rules of two's complement, this gives 0xfffffffe
(http://en.wikipedia.org/wiki/Two%27s_complement).
So what does & change, given that masking 0xfe with 0xff
shouldn't change the value?
It does matter, because & is a "calculation" like + and -,
so the value gets expanded to 32 bits first
(because that is better suited to the processor's ALU).
That's more likely to be known by C/asm programmers,
but as you see, it's relevant in Java too.
(If necessary for an assignment to a variable smaller than 32 bits,
the result will be shortened again after the calculation.)
I.e. first -2 = 0xfe becomes the 32-bit -2 = 0xfffffffe,
then masking results in 0xfe again (already 32 bits wide),
which is assigned to i.
Your value of state[0][2] is the byte 0xD8. This has the most significant bit set to 1 (in binary: 1101 1000). Before the shift operation << is applied, the byte is converted to an int. Java has no unsigned byte type, so the value is treated as signed, and the byte's most significant bit is copied all the way up to the int's most significant bit.
In short: with bytes you need the & 0xFF mask, because it masks those filled-in bits away in the already converted int.
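A tiny sketch of that promotion, using the 0xD8 value from the question:
class PromotionDemo {
    public static void main(String[] args) {
        byte b = (byte) 0xD8;

        int unmasked = b << 8;          // sign-extended first: 0xFFFFD800
        int masked   = (b & 0xFF) << 8; // masked to 0x000000D8 first: 0x0000D800

        System.out.printf("%08X%n", unmasked); // FFFFD800
        System.out.printf("%08X%n", masked);   // 0000D800
    }
}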
I have a byte array of size 64. I am receiving 64 bytes of data from usbConnection.bulkTransfer(). I want to check whether I received a "Sync" packet or not. "Sync" is a long constant with the value 4006390527L. Here's my code.
byte[] buffer = new byte[64];
bytesReceived += usbConnection.bulkTransfer(usbEndpointIn, buffer, 64, 2000);
String l=Base64.encodeToString(buffer,0);
long ll=Long.parseLong(l);
if(C.SYNC_PAD_TO_HOST == ll) {
Log.d(TAG,"SyncReceived");//This is the Sync
gotSync=true;
System.arraycopy(buffer, 0, rcvbuf, 0, buffer.length);
}
I am getting very weird results. The if condition never becomes true. What's wrong here?
There are a few issues here. A USB Sync for full speed is 32 bits. An int is capable of holding the data, but not as an unsigned integer. The only reason to store it in a long is so that values from 0x80000000 to 0xFFFFFFFF come out as positive numbers; only the least significant 32 bits of the long are actually used.
To calculate the first little-endian unsigned 32-bit number in the stream and store it as a long, use:
long ll = (buffer[0] & 0xFF)
| ((buffer[1] & 0xFF) << 8)
| ((buffer[2] & 0xFF) << 16)
| ((buffer[3] & 0xFFL) << 24);
Here's a breakdown of what's happening:
For the breakdown, take 0x17E1444C as the example value (your actual constant 4006390527 is 0xEECCAAFF in hex, but the mechanics are identical). USB transmits this value little-endian, which means the least significant byte is sent first. Over the wire, the bytes come in this order:
4C 44 E1 17
To break down the steps:
long ll = buffer[0] & 0xFF;
// ll == 0x4C & 0x000000FF
// == (long) 0x0000004C
// == 0x000000000000004C
ll |= (buffer[1] & 0xFF) << 8;
// ll == 0x000000000000004C | ((0x44 & 0x000000FF) << 8)
// == 0x000000000000004C | (0x00000044 << 8)
// == 0x000000000000004C | 0x00004400
// == 0x000000000000004C | (long) 0x00004400
// == 0x000000000000004C | 0x0000000000004400
// == 0x000000000000444C
ll |= (buffer[2] & 0xFF) << 16;
// ll == 0x000000000000444C | ((0xE1 & 0x000000FF) << 16)
// == 0x000000000000444C | (0x000000E1 << 16)
// == 0x000000000000444C | 0x00E10000
// == 0x000000000000444C | (long) 0x00E10000
// == 0x000000000000444C | 0x0000000000E10000
// == 0x0000000000E1444C
That last step illustrates why we use & 0xFF. Here's what happens without the bitwise AND:
ll |= buffer[2] << 16;
// ll == 0x000000000000444C | (((int) 0xE1) << 16)
// == 0x000000000000444C | (0xFFFFFFE1 << 16)
// == 0x000000000000444C | 0xFFE10000
// == 0x000000000000444C | (long) 0xFFE10000
// == 0x000000000000444C | 0xFFFFFFFFFFE10000
// == 0xFFFFFFFFFFE1444C
This is because 0xE1 exceeds the maximum positive byte value (0x7F), so it's treated as a negative number. When promoted directly to int, the sign is preserved. To avoid this behaviour, we mask the promoted value with the full 8-bit AND.
Now back to the process. The last byte:
ll |= (buffer[3] & 0xFFL) << 24;
// ll == 0x0000000000E1444C | ((0x17 & 0x00000000000000FF) << 24)
// == 0x0000000000E1444C | (0x0000000000000017 << 24)
// == 0x0000000000E1444C | 0x0000000017000000
// == 0x0000000017E1444C
You'll notice the last bitwise AND performed above uses a long version of 0xFF. This is because a left shift of 24 bits (or more) can produce a negative int if the byte being shifted exceeds the maximum positive byte value (0x7F): its top bit lands in the int's sign bit. Imagine that instead of 17 the last byte is A7. Here's what happens when using & 0xFF instead of & 0xFFL:
ll |= (buffer[3] & 0xFF) << 24;
// ll == 0x0000000000E1444C | ((0xA7 & 0x000000FF) << 24)
// == 0x0000000000E1444C | (0x000000A7 << 24)
// == 0x0000000000E1444C | 0xA7000000
// == 0x0000000000E1444C | (long) 0xA7000000
// == 0x0000000000E1444C | 0xFFFFFFFFA7000000
// == 0xFFFFFFFFA7E1444C
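Putting it together, a minimal sketch of the receive-side check might look like this (the helper name readUInt32LE and the hard-coded sample bytes are mine, not from the original post; in real code the buffer would come from bulkTransfer):
class SyncCheckSketch {
    static final long SYNC_PAD_TO_HOST = 4006390527L; // 0xEECCAAFF

    // Read an unsigned little-endian 32-bit value starting at offset, as a long.
    static long readUInt32LE(byte[] buffer, int offset) {
        return (buffer[offset] & 0xFF)
                | ((buffer[offset + 1] & 0xFF) << 8)
                | ((buffer[offset + 2] & 0xFF) << 16)
                | ((buffer[offset + 3] & 0xFFL) << 24);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[64];
        // Little-endian wire order for 0xEECCAAFF: FF AA CC EE
        buffer[0] = (byte) 0xFF;
        buffer[1] = (byte) 0xAA;
        buffer[2] = (byte) 0xCC;
        buffer[3] = (byte) 0xEE;

        if (readUInt32LE(buffer, 0) == SYNC_PAD_TO_HOST) {
            System.out.println("Sync received");
        }
    }
}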