I'm trying to mask an integer in order to separate each byte individually like so:
int a = (0xffffffff & 0xff000000) >> 24;
int b = (0xffffffff & 0x00ff0000) >> 16;
int c = (0xffffffff & 0x0000ff00) >> 8;
int d = 0xffffffff & 0x000000ff;
b, c and d give the correct answer in this case (255). However, a keeps giving me -1 or other negative numbers no matter what I change it to. I even tried:
int a = (0xefffffff & 0xff000000) >> 24;
and it gives me -17.
Does anyone know how to solve this problem so that in this boundary case a gives me 255, and a positive number in general?
This is because of sign extension. If the top-most bit is 1, then >> shifts in 1s. This is to preserve the sign of the argument. You want to use >>> which always shifts in 0. Or, mask after the shift:
int a = (0xffffffff >> 24) & 0x000000ff;
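To make the difference concrete, here is a minimal sketch that extracts the top byte in three ways. The class and method names (ShiftDemo, topByteSigned and so on) are made up for this illustration and are not from the question.

```java
final class ShiftDemo {
    // Arithmetic shift: the sign bit is copied into the vacated high bits.
    static int topByteSigned(int x) {
        return (x & 0xff000000) >> 24;
    }

    // Logical shift: the vacated high bits are always filled with zeros.
    static int topByteUnsigned(int x) {
        return x >>> 24;
    }

    // Arithmetic shift, then a mask that discards the copied sign bits.
    static int topByteMasked(int x) {
        return (x >> 24) & 0xff;
    }

    public static void main(String[] args) {
        System.out.println(topByteSigned(0xffffffff));   // -1
        System.out.println(topByteSigned(0xefffffff));   // -17
        System.out.println(topByteUnsigned(0xffffffff)); // 255
        System.out.println(topByteMasked(0xefffffff));   // 239
    }
}
```

Both the unsigned shift and the mask-after-shift variant give a value in the range 0 to 255 for any input.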
You are doing a signed shift, so the sign is preserved.
int a = (0xffffffff & 0xff000000) >>> 24; // unsigned shift.
or
int a = 0xffffffff >>> 24; // unsigned shift and all the bottom bits are lost anyway
int b = (0xffffffff >>> 16) & 0xFF;
int c = (0xffffffff >>> 8) & 0xFF;
int d = 0xffffffff & 0xFF;
I think you need an unsigned shift.
Try it this way:
(0xffffffff & 0xff000000) >>> 24
So I'm trying to understand base64 encoding better and I came across this implementation on Wikipedia:
private static String base64Encode(byte[] in) {
StringBuffer out = new StringBuffer((in.length * 4) / 3);
int b;
for (int i = 0; i < in.length; i += 3) {
b = (in[i] & 0xFC) >> 2;
out.append(codes.charAt(b));
b = (in[i] & 0x03) << 4;
if (i + 1 < in.length) {
b |= (in[i + 1] & 0xF0) >> 4;
out.append(codes.charAt(b));
b = (in[i + 1] & 0x0F) << 2;
if (i + 2 < in.length) {
b |= (in[i + 2] & 0xC0) >> 6;
out.append(codes.charAt(b));
b = in[i + 2] & 0x3F;
out.append(codes.charAt(b));
} else {
out.append(codes.charAt(b));
out.append('=');
}
} else {
out.append(codes.charAt(b));
out.append("==");
}
}
return out.toString();
}
And I'm following along and I get to the line:
b = (in[i] & 0xFC) >> 2;
and I don't get it. Why would you bitwise AND a number with 252 (0xFC) and then shift it right by 2? Wouldn't it be the same if you just shifted the byte itself without the bitwise operation? For example:
b = in[i] >> 2;
Say my in[i] was the letter e, represented as 101, or 01100101 in binary. If I shift that 2 to the right I get 011001, or 25. If I bitwise AND it with 0xFC I get
01100101
11111100
--------
01100100
but then the shift is going to chop off the last 2 bits anyway, so why bother doing it?
Can somebody clarify for me please. Thanks.
In in[i] >> 2, in[i] is converted to an int first. If it was a negative byte (with the high bit set), it is converted to a negative int (with the new upper 24 bits set as well).
In (in[i] & 0xFC) >> 2, in[i] is converted to an int as above, and then & 0xFC makes sure the extra bits are all reset to 0.
You're partially right, in that (in[i] & 0xFF) >> 2 would give the same result. & 0xFF is a common way to convert a byte to a non-negative int in the range 0 to 255.
The only way to know for sure why the original developer used 0xFC, and not 0xFF, is to ask them - but I speculate that it's to make it more obvious which bits are being used.
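The difference only shows up for bytes with the high bit set. The sketch below compares the three expressions on one ASCII byte and one high byte; the class name SextetDemo and its method names are invented for the example.

```java
final class SextetDemo {
    // No mask: the byte is sign-extended to int before the shift.
    static int unmasked(byte b) {
        return b >> 2;
    }

    // Mask used in the Wikipedia code: keeps only the top six bits of the byte.
    static int maskedFC(byte b) {
        return (b & 0xFC) >> 2;
    }

    // The more common byte-to-unsigned mask; the shift drops the low two bits.
    static int maskedFF(byte b) {
        return (b & 0xFF) >> 2;
    }

    public static void main(String[] args) {
        byte ascii = (byte) 'e';  // 0x65, high bit clear
        byte high = (byte) 0xE9;  // high bit set, as in non-ASCII data

        System.out.println(unmasked(ascii)); // 25
        System.out.println(maskedFC(ascii)); // 25
        System.out.println(unmasked(high));  // -6
        System.out.println(maskedFC(high));  // 58
        System.out.println(maskedFF(high));  // 58
    }
}
```

A negative value such as -6 would make codes.charAt(b) throw, which is why the mask cannot be dropped when the input may contain bytes of 0x80 and above.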
public static int liEndVal (Byte[] mem) {
return (mem[0] & 0xFF)
| ((mem[1] & 0xFF) << 8)
| ((mem[2] & 0xFF) << 16)
| ((mem[3] & 0xFF) << 24);
}
How can I modify this method so that when my input is, for example, 45 A2 BD 8A, the little-endian integer output is not negative? I don't understand why it keeps returning the two's complement integer.
When the high bit of mem[3] is set (that is, (mem[3] & 0xFF) > 0x7F), the returned int is negative, because the largest int value is 0x7FFFFFFF. If you want a positive return value, return a long.
public static long liEndVal (Byte[] mem) {
return (mem[0] & 0xFF)
| ((mem[1] & 0xFF) << 8)
| ((mem[2] & 0xFF) << 16)
| (((long)mem[3] & 0xFF) << 24);
}
Because in that representation, the (signed) integer is negative. Looks like you need an unsigned int.
I think the answer here is probably actually that you shouldn't mind that the answer is negative: just treat it as unsigned, and the signedness of the output as unimportant. You cannot eliminate the possibility of negative output, but I think you're wrong that it matters.
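If the int is kept, the same 32 bits can still be read as an unsigned quantity when needed. The following sketch assumes Java 8 or later for Integer.toUnsignedLong, and uses a primitive byte[] instead of the question's Byte[] for brevity; the class name UnsignedView is made up.

```java
final class UnsignedView {
    // Same assembly as in the question, on a primitive byte array.
    static int liEndVal(byte[] mem) {
        return (mem[0] & 0xFF)
             | ((mem[1] & 0xFF) << 8)
             | ((mem[2] & 0xFF) << 16)
             | ((mem[3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        byte[] mem = {0x45, (byte) 0xA2, (byte) 0xBD, (byte) 0x8A};
        int v = liEndVal(mem);

        System.out.println(v);                           // negative: bit 31 is set
        System.out.println(Integer.toHexString(v));      // 8abda245
        System.out.println(Integer.toUnsignedLong(v));   // same bits, read as unsigned
        System.out.println(v & 0xFFFFFFFFL);             // equivalent, works on older Java
    }
}
```

The bit pattern is identical in every case; only the interpretation at the point of printing or comparing changes.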
I'm trying to understand the BSD checksum calculation algorithm, written in Java.
Wikipedia gives:
byte checksum(byte[] input) {
byte checksum = 0;
for (byte cur_byte: input) {
checksum = (byte) (((checksum & 0xFF) >>> 1) + ((checksum & 0x1) << 7)); // Rotate the accumulator
checksum = (byte) ((checksum + cur_byte) & 0xFF); // Add the next chunk
}
return checksum;
}
And my questions:
Why do we use bitwise & in the line checksum = (byte) ((checksum + cur_byte) & 0xFF); ? 0xFF is binary "11111111", so doesn't this operation always return the same number?
What is the point of this operation? checksum = (byte) (((checksum & 0xFF) >>> 1) + ((checksum & 0x1) << 7)); I understand binary operations and logical and arithmetic shifts, but I don't understand what we are doing here.
Thanks for the help :)
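b & 0xFF is often used to convert a signed byte to an int with the same low eight bits and zeros above them. For example, ((byte)-1) & 0xFF = 255. In the second line it is unnecessary, because the result is cast back to byte: (byte)(x & 0xFF) is identical to (byte)(x).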
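For the first line, label the eight bits of the byte 1 to 8 (0 marks a cleared bit):
12345678 >>> 1 gives 01234567
(12345678 & 0x1) << 7 gives 80000000
Adding the two gives 81234567
Thus it is a cyclic rotation one bit to the right.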
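The rotation can be checked in isolation. This is a sketch that splits the quoted Wikipedia loop into a helper; the names RotateByte, rotateRight1 and maskIsRedundant are invented here, and the checksum method drops the & 0xFF on the assumption, explained above, that the cast to byte makes it redundant.

```java
final class RotateByte {
    // Rotate an 8-bit value right by one: bit 0 wraps around to bit 7.
    static byte rotateRight1(byte b) {
        int v = b & 0xFF;                        // view the byte as 0..255
        return (byte) ((v >>> 1) + ((v & 0x1) << 7));
    }

    // True when masking before the cast makes no difference.
    static boolean maskIsRedundant(int sum) {
        return (byte) (sum & 0xFF) == (byte) sum;
    }

    // Same algorithm as the quoted code, using the helper above.
    static byte checksum(byte[] input) {
        byte checksum = 0;
        for (byte cur : input) {
            checksum = rotateRight1(checksum);   // rotate the accumulator
            checksum = (byte) (checksum + cur);  // add the next byte, wrapping at 8 bits
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(rotateRight1((byte) 0x02));            // 1
        System.out.println(rotateRight1((byte) 0x01) & 0xFF);     // 128
        System.out.println(maskIsRedundant(300));                 // true
        System.out.println(checksum(new byte[] {1, 1}) & 0xFF);   // 129
    }
}
```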
I'm having trouble making sense of how Java promotes bytes to ints with bitwise operations. I'm attempting to implement AES, and while my output is correct as a 2d byte array, I ultimately need to store it in a 1d int array. However, the following code changes some of the expected values
ciphertexts[0] = ((state[0][0] & 0xFF) << 24) ^ ((state[1][0] & 0xFF) << 16)
^ ((state[2][0] & 0xFF) << 8) ^ state[3][0];
ciphertexts[1] = ((state[0][1] & 0xFF) << 24) ^ ((state[1][1] & 0xFF) << 16)
^ ((state[2][1] & 0xFF) << 8) ^ state[3][1];
ciphertexts[2] = ((state[0][2] & 0xFF) << 24) ^ ((state[1][2] & 0xFF) << 16)
^ ((state[2][2] & 0xFF) << 8) ^ state[3][2];
ciphertexts[3] = ((state[0][3] & 0xFF) << 24) ^ ((state[1][3] & 0xFF) << 16)
^ ((state[2][3] & 0xFF) << 8) ^ state[3][3];
I didn't particularly expect masking with 0xFF to help, since the mask should just return the original byte value, but then I tried this:
int zero = ((state[0][0] & 0xFF) << 24);
int one = ((state[0][1] & 0xFF) << 16);
int two = ((state[0][2] & 0xFF) << 8) ;
int three = (state[0][3] & 0xFF);
int total = zero ^ one ^ two ^ three;
printhex(zero);
printhex(one);
printhex(two);
printhex(three);
printhex(total);
Which gives the following output:
69000000
006A0000
0000D800
00000070
696AD870
Which is what I'm trying to do with the code above. Without the masking, the following code gives the following output:
int zero = (state[0][0] << 24);
int one = (state[0][1] << 16);
int two = (state[0][2] << 8);
int three = state[0][3];
int total = zero ^ one ^ two ^ three;
69000000
006A0000
FFFFD800
00000070
9695D870
I also tried what seemed to me more sensible, which is masking after shifting, and got similarly messed up output:
ciphertexts[0] = ((state[0][0] << 24) & 0xFFFFFFFF) ^
((state[1][0] << 16) & 0xFFFFFF) ^ ((state[2][0] << 8) & 0xFFFF)
^ state[3][0];
ciphertexts[1] = ((state[0][1] << 24) & 0xFFFFFFFF) ^
((state[1][1] << 16) & 0xFFFFFF) ^ ((state[2][1] << 8) & 0xFFFF)
^ state[3][1];
ciphertexts[2] = ((state[0][2] << 24) & 0xFFFFFFFF) ^
((state[1][2] << 16) & 0xFFFFFF) ^ ((state[2][2] << 8) & 0xFFFF)
^ state[3][2];
ciphertexts[3] = ((state[0][3] << 24) & 0xFFFFFFFF) ^
((state[1][3] << 16) & 0xFFFFFF) ^ ((state[2][3] << 8) & 0xFFFF)
^ state[3][3];
Where "messed up" means:
ciphertext at round 9 is 963b1fd86a7b04302732488070b4c55a
instead of:
69C4E0D86A7B0430D8CDB78070B4C55A
So my questions are: how do I neatly OR bytes together into an int, and what is actually going on with the masking and shifting? I looked at other answers and can't figure out why they're not working in this case. Thanks!
That's the cruelty of a language that lacks unsigned types.
(You get the same result in C if you use a signed char, i.e. a signed byte.)
Let's ignore the shifts and concentrate on the assignment and the &.
The example value here is 0xfe instead of 0xd8
(the problem occurs with every value between 0x80 and 0xff).
The problem, in Java:
byte a = (byte) 0xfe;
int i = a;
The problem, in C:
signed char a = 0xfe;
int i = a;
What happens: a byte can hold values between -128 and +127.
0xfe maps to a negative number in two's complement: -2.
So i gets the value -2, and i is not 8 bits wide but 32 bits.
By the rules of two's complement, that is 0xfffffffe
(http://en.wikipedia.org/wiki/Two%27s_complement).
So what does & change, given that masking 0xfe with 0xff
shouldn't change the value?
It shouldn't, but & is a calculation just like + and -,
so its operands are widened to 32 bits first
(that suits the processor's ALU, and Java promotes byte operands to int).
That is more likely to be known to C/asm programmers,
but as you see, it's relevant in Java too.
(If the result is assigned to a variable narrower than 32 bits,
it is narrowed again after the calculation.)
That is, first the byte -2 = 0xfe becomes the 32-bit -2 = 0xfffffffe,
then the mask turns it back into 0xfe (now as a 32-bit value),
which is what ends up in i when you write int i = a & 0xff.
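The widening described above can be observed directly. A minimal sketch follows; the class name WideningDemo is made up, and Byte.toUnsignedInt assumes Java 8 or later.

```java
final class WideningDemo {
    // Plain widening: the sign bit of the byte fills the upper 24 bits.
    static int widen(byte b) {
        return b;
    }

    // Widening followed by a mask that clears the upper 24 bits again.
    static int widenUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte a = (byte) 0xfe;

        System.out.println(widen(a));                       // -2
        System.out.println(Integer.toHexString(widen(a)));  // fffffffe
        System.out.println(widenUnsigned(a));               // 254
        System.out.println(Byte.toUnsignedInt(a));          // 254, library equivalent of the mask
    }
}
```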
Your value of state[0][2] is the byte 0xD8. Its most significant bit is 1; in binary it is 1101 1000. Before the shift operator << is applied, the byte is converted to an int. Java has no unsigned byte type, so the value is treated as signed, and the byte's most significant bit is copied into every higher bit of the int (sign extension), giving 0xFFFFFFD8.
In short: with bytes you need the 0xFF mask, because it clears the sign-extension bits in the already converted int.
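Applied to the question, every byte needs the mask, including the last one that is not shifted. The sketch below packs the first column from the expected output (69 C4 E0 D8) with and without the mask on the final byte. The class and method names are hypothetical; ByteBuffer is shown only as a standard-library alternative that reads big-endian by default.

```java
import java.nio.ByteBuffer;

final class PackBytes {
    // Mask every byte before shifting, then combine (big-endian order).
    static int pack(byte b0, byte b1, byte b2, byte b3) {
        return ((b0 & 0xFF) << 24)
             | ((b1 & 0xFF) << 16)
             | ((b2 & 0xFF) << 8)
             |  (b3 & 0xFF);
    }

    // Mirrors the question's first snippet: the last byte is left unmasked,
    // so a byte >= 0x80 sign-extends and flips the upper 24 bits under XOR.
    static int packLastUnmasked(byte b0, byte b1, byte b2, byte b3) {
        return ((b0 & 0xFF) << 24) ^ ((b1 & 0xFF) << 16) ^ ((b2 & 0xFF) << 8) ^ b3;
    }

    public static void main(String[] args) {
        byte b0 = 0x69, b1 = (byte) 0xC4, b2 = (byte) 0xE0, b3 = (byte) 0xD8;

        System.out.println(Integer.toHexString(pack(b0, b1, b2, b3)));             // 69c4e0d8
        System.out.println(Integer.toHexString(packLastUnmasked(b0, b1, b2, b3))); // 963b1fd8
        System.out.println(Integer.toHexString(
                ByteBuffer.wrap(new byte[] {b0, b1, b2, b3}).getInt()));           // 69c4e0d8
    }
}
```

The unmasked variant reproduces the 963b1fd8 seen in the "messed up" ciphertext, which suggests the trailing state[3][n] term is where the original code goes wrong.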
The task is to fetch each byte from a given integer. This is the approach I saw somewhere:
byte[] bytes = new byte[4];
bytes[0] = (byte) ((id >> 24) & 0xff);
bytes[1] = (byte) ((id >> 16) & 0xff);
bytes[2] = (byte) ((id >> 8) & 0xff);
bytes[3] = (byte) (id & 0xff);
It would result in the same break-up as this:
bytes[0] = (byte) (id >>> 24);
bytes[1] = (byte) (id >>> 16);
bytes[2] = (byte) (id >>> 8);
bytes[3] = (byte) (id);
where id is an integer value that will ALWAYS be unsigned. In fact, I don't see the need to AND with 0xff in the first approach, since the cast to byte keeps only the least significant byte anyway.
Is there any difference in the two approaches and which one is preferred?
You do not need the & 0xff in the first example either, because the cast to byte always chops off the bits that differ between sign-extended and non-sign-extended numbers.
Here is why: when you shift a number right by n bits using >>, the upper n bits will get the same value as the most significant bit of the number being shifted. The behavior of >>> differs only in that >>> forces the upper n bits to zero. The lower (32-n) bits are the same regardless of the kind of the shift that you use.
None of your examples shifts by more than 24 bits, so the lower eight bits would be the same if you replaced >>> with >> in your second example.
Since it is entirely unnecessary to mask with 0xff, I would use your second snippet using >> or >>> for the operator, because the code is shorter.
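The equivalence is easy to check on a few values, including negative ones. In this sketch the class name IntToBytes and the three method names are invented for the comparison.

```java
import java.util.Arrays;

final class IntToBytes {
    // First approach from the question: signed shift plus mask.
    static byte[] withMask(int id) {
        return new byte[] {
            (byte) ((id >> 24) & 0xff),
            (byte) ((id >> 16) & 0xff),
            (byte) ((id >> 8) & 0xff),
            (byte) (id & 0xff)
        };
    }

    // Second approach from the question: unsigned shift, no mask.
    static byte[] withUnsignedShift(int id) {
        return new byte[] {
            (byte) (id >>> 24), (byte) (id >>> 16), (byte) (id >>> 8), (byte) id
        };
    }

    // Signed shift, no mask: the cast to byte discards the sign-extended bits.
    static byte[] withSignedShift(int id) {
        return new byte[] {
            (byte) (id >> 24), (byte) (id >> 16), (byte) (id >> 8), (byte) id
        };
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 0x12345678, 0x8ABDA245, -1, Integer.MIN_VALUE};
        for (int id : samples) {
            boolean same = Arrays.equals(withMask(id), withUnsignedShift(id))
                        && Arrays.equals(withMask(id), withSignedShift(id));
            System.out.println(Integer.toHexString(id) + " -> " + same);
        }
    }
}
```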