I'm having trouble making sense of how Java promotes bytes to ints with bitwise operations. I'm attempting to implement AES, and while my output is correct as a 2D byte array, I ultimately need to store it in a 1D int array. However, the following code changes some of the expected values:
ciphertexts[0] = ((state[0][0] & 0xFF) << 24) ^ ((state[1][0] & 0xFF) << 16)
^ ((state[2][0] & 0xFF) << 8) ^ state[3][0];
ciphertexts[1] = ((state[0][1] & 0xFF) << 24) ^ ((state[1][1] & 0xFF) << 16)
^ ((state[2][1] & 0xFF) << 8) ^ state[3][1];
ciphertexts[2] = ((state[0][2] & 0xFF) << 24) ^ ((state[1][2] & 0xFF) << 16)
^ ((state[2][2] & 0xFF) << 8) ^ state[3][2];
ciphertexts[3] = ((state[0][3] & 0xFF) << 24) ^ ((state[1][3] & 0xFF) << 16)
^ ((state[2][3] & 0xFF) << 8) ^ state[3][3];
I didn't particularly expect masking with 0xFF to help, since the mask should just return the original byte value, but then I tried this:
int zero = ((state[0][0] & 0xFF) << 24);
int one = ((state[0][1] & 0xFF) << 16);
int two = ((state[0][2] & 0xFF) << 8) ;
int three = (state[0][3] & 0xFF);
int total = zero ^ one ^ two ^ three;
printhex(zero);
printhex(one);
printhex(two);
printhex(three);
printhex(total);
Which gives the following output:
69000000
006A0000
0000D800
00000070
696AD870
Which is what I'm trying to do with the code above. Without the masking, the following code gives the following output:
int zero = (state[0][0] << 24);
int one = (state[0][1] << 16);
int two = (state[0][2] << 8);
int three = state[0][3];
int total = zero ^ one ^ two ^ three;
69000000
006A0000
FFFFD800
00000070
9695D870
I also tried what seemed to me more sensible, which is masking after shifting, and got similarly messed up output:
ciphertexts[0] = ((state[0][0] << 24) & 0xFFFFFFFF) ^
((state[1][0] << 16) & 0xFFFFFF) ^ ((state[2][0] << 8) & 0xFFFF)
^ state[3][0];
ciphertexts[1] = ((state[0][1] << 24) & 0xFFFFFFFF) ^
((state[1][1] << 16) & 0xFFFFFF) ^ ((state[2][1] << 8) & 0xFFFF)
^ state[3][1];
ciphertexts[2] = ((state[0][2] << 24) & 0xFFFFFFFF) ^
((state[1][2] << 16) & 0xFFFFFF) ^ ((state[2][2] << 8) & 0xFFFF)
^ state[3][2];
ciphertexts[3] = ((state[0][3] << 24) & 0xFFFFFFFF) ^
((state[1][3] << 16) & 0xFFFFFF) ^ ((state[2][3] << 8) & 0xFFFF)
^ state[3][3];
Where "messed up" means:
ciphertext at round 9 is 963b1fd86a7b04302732488070b4c55a
instead of:
69C4E0D86A7B0430D8CDB78070B4C55A
So my questions are: how do I neatly OR bytes together into an int, and what is actually going on with the masking and shifting? I looked at other answers and can't figure out why they're not working in this case. Thanks!
That's the cruelty of a language that lacks unsigned types
(you get the same result in C if you use a signed char, i.e. a signed byte).
Let's ignore the shifts and concentrate only on the assignment and the &.
The example value here is 0xfe instead of 0xd8
(the problem occurs with every value between 0x80 and 0xff).
With the problem, Java:
byte a = (byte) 0xfe; // the cast is required: 0xfe is outside the range of byte
int i = a;
With the problem, C:
signed char a = 0xfe;
int i = a;
What happens: a byte can hold values between -128 and +127.
0xfe maps to a negative number (two's complement): -2
...and so i gets the value -2, and i is not 8 bits but 32 bits long.
According to the rules of two's complement, this gives 0xfffffffe
(http://en.wikipedia.org/wiki/Two%27s_complement)
So what does & change, given that masking 0xfe with 0xff
shouldn't change the value?
It shouldn't, but: & is a "calculation" like + and -,
so the value is first widened to 32 bits
(Java's binary numeric promotion converts byte operands to int;
32 bits is also better suited to the processor's ALU).
This is more familiar to C/asm programmers,
but as you see, it's relevant in Java too.
(If the result is assigned to a variable smaller than 32 bits,
it is narrowed again after the calculation, which in Java requires an explicit cast.)
I.e. first the 8-bit -2 = 0xfe becomes the 32-bit -2 = 0xfffffffe,
then masking yields 0xfe again (now 32 bits wide),
which is assigned to i.
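The promotion described above can be checked with a few lines of Java. This is only a minimal sketch; the class and method names (SignExtensionDemo, widen, unsigned) are made up for illustration.

```java
class SignExtensionDemo {
    // Plain widening: the sign bit of the byte is copied into the upper 24 bits.
    static int widen(byte b) {
        return b;
    }

    // The byte is widened to int first, then the mask clears the upper 24 bits.
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte a = (byte) 0xfe;
        System.out.println(Integer.toHexString(widen(a)));    // fffffffe
        System.out.println(Integer.toHexString(unsigned(a))); // fe
    }
}
```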
Your value of state[0][2] is the byte 0xD8. Its most significant bit is set to 1; in binary: 1101 1000. Before the shift operation << is applied, the byte is converted to an int. Java has no unsigned byte type, so the byte is treated as signed and is sign-extended: its most significant bit is copied into every upper bit of the int.
In short: with bytes you need the mask 0xFF, because it clears the sign-extension bits in the already converted int. Note that in the question's ciphertexts snippets the last term, state[3][n], is not masked at all, so when that byte is 0x80 or above its sign-extension bits flip the upper 24 bits of the result.
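As a sketch of the masking advice, the four bytes of one state column can be combined like this. The names PackBytes, pack and column are hypothetical, and the state layout (state[row][column]) is assumed from the question's code.

```java
class PackBytes {
    // Mask every byte, including the last one, before shifting and combining.
    static int pack(byte b0, byte b1, byte b2, byte b3) {
        return ((b0 & 0xFF) << 24)
             | ((b1 & 0xFF) << 16)
             | ((b2 & 0xFF) << 8)
             |  (b3 & 0xFF);
    }

    // Packs column c of a 4x4 state array into one int.
    static int column(byte[][] state, int c) {
        return pack(state[0][c], state[1][c], state[2][c], state[3][c]);
    }

    public static void main(String[] args) {
        int word = pack((byte) 0x69, (byte) 0xC4, (byte) 0xE0, (byte) 0xD8);
        System.out.println(Integer.toHexString(word)); // 69c4e0d8
    }
}
```

Because the masked terms never overlap, | and ^ give the same result here.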
Related
Background
I am taking 8-, 16-, 24- or 32-bit audio data and converting it to integers, but a BigInteger cannot be reused and using it wastes a lot of memory, so I created this class to reduce memory consumption. ByteBuffer seems to do the job well, except when the input is 3 bytes long.
I have never done any bit or byte operations, so I am completely lost here.
Issue
None of the examples that I found on Stack Overflow for converting 3 bytes to an int give the wanted result. Check the bytes3ToInt method.
Question
Is there something obvious that I am doing completely wrong?
Is return new BigInteger(data).intValue(); really the only solution to this?
Code
import java.math.BigInteger;
import java.nio.ByteBuffer;
class BytesToInt {
// HELP
private static int bytes3ToInt(byte[] data) {
// none below seem to work, even if I swap first and last bytes
// these examples are taken from stackoverflow
//return (data[2] & 0xFF) | ((data[1] & 0xFF) << 8) | ((data[0] & 0x0F) << 16);
//return ((data[2] & 0xF) << 16) | ((data[1] & 0xFF) << 8) | (data[0] & 0xFF);
//return ((data[2] << 28) >>> 12) | (data[1] << 8) | data[0];
//return (data[0] & 255) << 16 | (data[1] & 255) << 8 | (data[2] & 255);
return (data[2] & 255) << 16 | (data[1] & 255) << 8 | (data[0] & 255);
// Only thing that works, but wastes memory
//return new BigInteger(data).intValue();
}
public static void main(String[] args) {
// Test with -666 example number
byte[] negativeByteArray3 = new byte[] {(byte)0xff, (byte)0xfd, (byte)0x66};
testWithData(negativeByteArray3);
}
private static void testWithData(byte[] data) {
// Compare our converter to BigInteger
// Which we know gives wanted result
System.out.println("Converter = " + bytes3ToInt(data));
System.out.println("BigInteger = " + new BigInteger(data).intValue());
}
}
Output
Converter = 6749695
BigInteger = -666
full code here http://ideone.com/qu9Ulw
First of all, your indices are wrong. It's not 2, 1, 0 but 0, 1, 2.
Secondly, the problem is that the sign isn't being extended, so although it would work for positive values, negative values come out wrong.
If you don't mask the highest byte (of the 24 bits), it will sign-extend properly, filling the highest byte (of the 32 bits) with 0x00 for positive values or 0xFF for negative values.
return (data[0] << 16) | (data[1] & 255) << 8 | (data[2] & 255);
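Put together as a self-contained sketch, with the class name Int24 and the method name fromBigEndian made up for illustration, and assuming (as in the question's test data) that the most significant byte comes first:

```java
class Int24 {
    // Big-endian signed 24-bit value: data[0] is left unmasked so that
    // its sign bit extends into the top byte of the int.
    static int fromBigEndian(byte[] data) {
        return (data[0] << 16) | ((data[1] & 0xFF) << 8) | (data[2] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] negative = {(byte) 0xff, (byte) 0xfd, (byte) 0x66};
        System.out.println(fromBigEndian(negative)); // -666
    }
}
```

If the audio data turns out to be little-endian, the same idea should apply with the indices reversed, leaving data[2] unmasked instead.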
public static int liEndVal (Byte[] mem) {
return (mem[0] & 0xFF)
| ((mem[1] & 0xFF) << 8)
| ((mem[2] & 0xFF) << 16)
| ((mem[3] & 0xFF) << 24);
}
How can I modify this method so that when my input is, for example, 45 A2 BD 8A, the little-endian integer output is not negative? I don't understand why it keeps returning the two's complement integer.
When the top bit of mem[3] is set (that is, its unsigned value is greater than 0x7F), the returned int will be negative, since the maximum value of int is 0x7FFFFFFF. If you want a positive return value, return a long.
public static long liEndVal (Byte[] mem) {
return (mem[0] & 0xFF)
| ((mem[1] & 0xFF) << 8)
| ((mem[2] & 0xFF) << 16)
| (((long)mem[3] & 0xFF) << 24);
}
Because in that representation, the (signed) integer is negative. Looks like you need an unsigned int.
I think the answer here is probably actually that you shouldn't mind that the answer is negative: just treat it as unsigned, and the signedness of the output as unimportant. You cannot eliminate the possibility of negative output, but I think you're wrong that it matters.
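One way to "treat it as unsigned" is to keep the int and only convert when the value is printed or compared. The sketch below assumes Java 8 or later, where Integer.toUnsignedLong and Integer.toUnsignedString are available; the class name UnsignedView is hypothetical, and the method takes a primitive byte[] rather than the question's Byte[].

```java
class UnsignedView {
    // Same little-endian assembly as in the question, on a primitive byte[].
    static int liEndVal(byte[] mem) {
        return (mem[0] & 0xFF)
             | ((mem[1] & 0xFF) << 8)
             | ((mem[2] & 0xFF) << 16)
             | ((mem[3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        int v = liEndVal(new byte[] {0x45, (byte) 0xA2, (byte) 0xBD, (byte) 0x8A});
        System.out.println(Integer.toHexString(v));     // 8abda245
        System.out.println(Integer.toUnsignedLong(v));   // 2327683653
        System.out.println(Integer.toUnsignedString(v)); // 2327683653
    }
}
```

The bit pattern is identical in both views; only the interpretation of the top bit differs.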
I am trying to understand the BSD checksum calculation algorithm, written in Java.
Wikipedia gives:
byte checksum(byte[] input) {
byte checksum = 0;
for (byte cur_byte: input) {
checksum = (byte) (((checksum & 0xFF) >>> 1) + ((checksum & 0x1) << 7)); // Rotate the accumulator
checksum = (byte) ((checksum + cur_byte) & 0xFF); // Add the next chunk
}
return checksum;
}
And my questions:
Why do we use bitwise & in this line: checksum = (byte) ((checksum + cur_byte) & 0xFF); ? 0xFF is binary "11111111", so doesn't this operation always return the same number?
What is the point of this operation: checksum = (byte) (((checksum & 0xFF) >>> 1) + ((checksum & 0x1) << 7)); ? I understand binary operations and logical and arithmetic shifts, but I don't understand what we are doing here.
Thanks for help :)
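b & 0xFF is often used to convert a signed byte to an int with the same low 8 bits. In the second line it is unnecessary, because the result is cast straight back to byte: (byte)(b & 0xFF) is identical to (byte) b. In the first line it is needed, because checksum is widened to int before >>> 1, and without the mask a sign-extension bit would be shifted into bit 7. For example, ((byte)-1) & 0xFF = 255.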
Labelling the eight bits of checksum as 12345678:
12345678 >>> 1 = 01234567
(12345678 & 0x1) << 7 = 80000000
ADD -> 81234567
Thus it is a cyclic rotation to the right by one bit.
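The rotation can be pulled out into its own method to make it easier to see. This is a sketch, not the Wikipedia code verbatim: the names BsdChecksum and rotateRight are made up, | is used instead of + (equivalent here because the two terms share no bits), and the redundant & 0xFF before the final cast is dropped.

```java
class BsdChecksum {
    // Rotate the 8 bits of b right by one: bit 0 wraps around to bit 7.
    static byte rotateRight(byte b) {
        int u = b & 0xFF;                 // mask so no sign bits get shifted in
        return (byte) ((u >>> 1) | ((u & 0x1) << 7));
    }

    static byte checksum(byte[] input) {
        byte checksum = 0;
        for (byte curByte : input) {
            checksum = rotateRight(checksum);        // rotate the accumulator
            checksum = (byte) (checksum + curByte);  // the cast keeps the low 8 bits
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(rotateRight((byte) 0x01) & 0xFF)); // 80
        System.out.println(Integer.toHexString(rotateRight((byte) 0x81) & 0xFF)); // c0
    }
}
```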
The task is to fetch each byte from a given integer. This is the approach I saw somewhere:
byte[] bytes = new byte[4];
bytes[0] = (byte) ((id >> 24) & 0xff);
bytes[1] = (byte) ((id >> 16) & 0xff);
bytes[2] = (byte) ((id >> 8) & 0xff);
bytes[3] = (byte) (id & 0xff);
It would result in the same break-up as this:
bytes[0] = (byte) (id >>> 24);
bytes[1] = (byte) (id >>> 16);
bytes[2] = (byte) (id >>> 8);
bytes[3] = (byte) (id);
where id is an integer value that will ALWAYS be unsigned. In fact, I don't see the need to AND with 0xff in the first approach, since we're always using the least significant byte.
Is there any difference in the two approaches and which one is preferred?
You do not need the & 0xff in the upper example either, because your example always chops off the bits that are different in sign-extended vs. non-sign-extended numbers.
Here is why: when you shift a number right by n bits using >>, the upper n bits will get the same value as the most significant bit of the number being shifted. The behavior of >>> differs only in that >>> forces the upper n bits to zero. The lower (32-n) bits are the same regardless of the kind of the shift that you use.
None of your examples shifts by more than 24 bits, so the lower eight bits would be the same if you replaced >>> with >> in your bottom example.
Since it is entirely unnecessary to mask with 0xff, I would use your second snippet using >> or >>> for the operator, because the code is shorter.
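A quick way to convince yourself is to run both variants on values with the top bit set and compare the results. The sketch below uses hypothetical names (IntToBytes, withMask, withoutMask) and also tries >> in the unmasked version.

```java
import java.util.Arrays;

class IntToBytes {
    // First approach from the question: signed shift plus mask.
    static byte[] withMask(int id) {
        return new byte[] {
            (byte) ((id >> 24) & 0xff),
            (byte) ((id >> 16) & 0xff),
            (byte) ((id >> 8) & 0xff),
            (byte) (id & 0xff)
        };
    }

    // No mask: the narrowing cast to byte keeps only the low 8 bits anyway.
    static byte[] withoutMask(int id) {
        return new byte[] {
            (byte) (id >> 24),
            (byte) (id >> 16),
            (byte) (id >> 8),
            (byte) id
        };
    }

    public static void main(String[] args) {
        int id = 0x8ABDA245; // top bit set, so >> shifts in ones
        System.out.println(Arrays.equals(withMask(id), withoutMask(id))); // true
    }
}
```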
I'm trying to mask an integer in order to separate each byte individually like so:
int a = (0xffffffff & 0xff000000) >> 24;
int b = (0xffffffff & 0x00ff0000) >> 16;
int c = (0xffffffff & 0x0000ff00) >> 8;
int d = 0xffffffff & 0x000000ff;
b, c and d give the correct answer in this case, 255. However, a continues to give me -1 and other negative numbers no matter what I change it to. I even tried:
int a = (0xefffffff & 0xff000000) >> 24;
and it gives me -17.
Does anyone know how to solve this problem so that in this boundary case a gives me 255, and other positive numbers in general?
This is because of sign extension. If the top-most bit is 1, then >> shifts in 1s. This is to preserve the sign of the argument. You want to use >>> which always shifts in 0. Or, mask after the shift:
int a = (0xffffffff >> 24) & 0x000000ff;
You are doing a signed shift, so the sign is preserved.
int a = (0xffffffff & 0xff000000) >>> 24; // unsigned shift.
or
int a = 0xffffffff >>> 24; // unsigned shift and all the bottom bits are lost anyway
int b = (0xffffffff >>> 16) & 0xFF;
int c = (0xffffffff >>> 8) & 0xFF;
int d = 0xffffffff & 0xFF;
I think you need an unsigned shift.
Try it this way:
(0xffffffff & 0xff000000) >>> 24
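The difference between the two shifts can be seen side by side in a small sketch. The names ShiftDemo, topByteSigned, topByteUnsigned and byteAt are made up for illustration.

```java
class ShiftDemo {
    // Signed shift: copies of the sign bit are shifted in from the left.
    static int topByteSigned(int x) {
        return (x & 0xff000000) >> 24;
    }

    // Unsigned shift: zeros are shifted in, so the result is 0..255.
    static int topByteUnsigned(int x) {
        return x >>> 24;
    }

    // General form: shift first, then mask. index 0 is the lowest byte.
    static int byteAt(int x, int index) {
        return (x >>> (8 * index)) & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(topByteSigned(0xffffffff));   // -1
        System.out.println(topByteSigned(0xefffffff));   // -17
        System.out.println(topByteUnsigned(0xffffffff)); // 255
        System.out.println(topByteUnsigned(0xefffffff)); // 239
    }
}
```

Masking after the shift, as in byteAt, gives the expected 0..255 result with either >> or >>>.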