Unsigned Shift Operation in Java

Could anybody tell me how this operation can leave "sar" holding a negative number?

data[0] is promoted to int before the shift operator is applied.
Therefore, if for example, data[0] is -128,
you are applying the shift on the int -128, whose binary representation is :
11111111 11111111 11111111 10000000
This results in
00000011 11111111 11111111 11111110
And after you cast that back to byte, you end up with a negative number
11111110 (-2)
If you want to ignore the 1 bits that were added as a result of the int promotion, you can write :
byte sar = (byte) ((data[0]&0xff)>>>6);
This will result in 2 (when data[0] is -128).
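Here is a minimal sketch of both cases. The question's own code is not shown above, so the unmasked expression and the one-element data array are assumptions made for illustration, as are the class and method names.

```java
class SarDemo {
    // Unmasked: the byte is sign-extended to int before >>> runs,
    // so the low 8 bits of the result still contain extension bits.
    static byte shiftUnmasked(byte value) {
        return (byte) (value >>> 6);
    }

    // Masked: & 0xff clears the 24 sign-extension bits before the shift.
    static byte shiftMasked(byte value) {
        return (byte) ((value & 0xff) >>> 6);
    }

    public static void main(String[] args) {
        byte[] data = { -128 }; // hypothetical input
        System.out.println(shiftUnmasked(data[0])); // -2
        System.out.println(shiftMasked(data[0]));   // 2
    }
}
```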

Related

Right Shift Operator With Brackets

I don't understand why there's a difference between this code:
byte b = (byte) (0xff >> 1);
(so now b = 01111111),
and this code:
byte b = (byte) 0xff;
b >>= 1;
(but now b = 11111111).
Thanks in advance for your help!
In the first code, (0xff >> 1) is 255 >> 1, which is 127. That is calculated with ints and then you cast it to a byte. 127 as a byte is 01111111 bin.
In the second code, you start with (byte) 0xff, which is 11111111 bin, which is the two's complement representation of -1 in 8 bits. So (byte) 0xff is -1.
When you perform shifting, the byte value -1 is promoted to the int value -1. That's 11111111 11111111 11111111 11111111 bin.
Shifting it right one place with the arithmetic right shift operator, (-1) >> 1 gives you 11111111 11111111 11111111 11111111 again, because the >> operator on a negative number moves the bits to the right and fills in the left with ones instead of zeroes.
Then, since you're using >>=, the result is cast back to a byte to be stored in b. That only retains the last 8 bits, which are 11111111.
Alternatively, if you used the logical right shift operator, (-1) >>> 1 would give you 01111111 11111111 11111111 11111111 in binary (a zero followed by 31 ones). Since the last 8 bits are the same, this would still give you 11111111 when it is cast back to a byte.
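To see the difference side by side, here is a small sketch. The class and helper names are made up for illustration; lowByteBits just prints the 8 bits that survive the cast.

```java
class CompoundShiftDemo {
    // Shift happens on the int 255, then the result is narrowed.
    static byte shiftAsIntThenCast() {
        return (byte) (0xff >> 1); // 255 >> 1 = 127
    }

    // Narrow first (giving -1), then shift the promoted int -1.
    static byte castThenCompoundShift() {
        byte b = (byte) 0xff; // -1
        b >>= 1;              // same as b = (byte) (b >> 1)
        return b;
    }

    // Renders the 8 bits of a byte, zero-padded on the left.
    static String lowByteBits(byte b) {
        String bits = Integer.toBinaryString(b & 0xff);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(lowByteBits(shiftAsIntThenCast()));    // 01111111
        System.out.println(lowByteBits(castThenCompoundShift())); // 11111111
    }
}
```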

>> and >>> operator with negative byte value in java

I have a sample code snippet like this:
byte a = -0b00001111;
byte b = a;
byte c = a;
System.out.println("a=" + a );
System.out.println("b=" + (b >> 1) );
System.out.println("c=" + (c >>> 1) );
and, it prints:
a=-15
b=-8
c=2147483640
I don't quite understand how b and c ended up with those two values. Could someone show me, step by step, how they were calculated?
For byte a, you have the literal 0b00001111, which is binary for 15, so a is -15. The bit representation of -15 for a byte is:
11110001
In Java, unary numeric promotion occurs on the operands of bit-shift operators <<, >>, and >>>.
Unary numeric promotion (§5.6.1) is performed on each operand separately.
This means that the value is promoted to int before shifting.
The code (b >> 1) promotes b to an int, then shifts the value with sign extension. This means that if the value was already negative, a 1 bit is shifted in from the left so that it stays negative.
11110001
is promoted to
11111111 11111111 11111111 11110001
which is -15 as an int. After shifting to the right one bit, with sign extension:
11111111 11111111 11111111 11111000
which is -8.
However, for the code (c >>> 1), the >>> unsigned right shift operator does not perform sign extension, even if the promotion to int does maintain the sign and the value.
11110001
is promoted to
11111111 11111111 11111111 11110001
which is -15 as an int as before. After unsigned shifting to the right one bit:
01111111 11111111 11111111 11111000
The first bit is now 0. Without the most significant bit set, the value is now 2^31 - 8, or 2147483640.
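If you want to reproduce the steps yourself, a sketch like the following prints the promoted value and both shift results as 32-bit patterns. The class name and the bits32 helper are made up for illustration.

```java
class PromotionDemo {
    // Renders an int as exactly 32 binary digits, zero-padded on the left.
    static String bits32(int value) {
        String bits = Integer.toBinaryString(value);
        return String.format("%32s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte a = -0b00001111; // -15
        int promoted = a;     // the same promotion the shift operators apply

        System.out.println(bits32(promoted));
        System.out.println(bits32(promoted >> 1) + " = " + (promoted >> 1));   // -8
        System.out.println(bits32(promoted >>> 1) + " = " + (promoted >>> 1)); // 2147483640
        System.out.println((1L << 31) - 8); // 2147483640, i.e. 2^31 - 8
    }
}
```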

Why Java unsigned bit shifting for a negative byte is so strange?

I have a byte variable:
byte varB = (byte) -1; // binary view: 1111 1111
I want to see the two left-most bits, so I do an unsigned right shift by 6 bits:
varB = (byte) (varB >>> 6);
But I'm getting -1, as if it were an int, and I get 3 only if I shift by 30!
How can I work around this and get the result with only a 6-bit shift?
The reason is the sign extension associated with the numeric promotion to int that occurs when bit-shifting. The value varB is promoted to int before shifting. The unsigned bit-shift to the right does occur, but its effects are dropped when casting back to byte, which only keeps the last 8 bits:
varB (byte) : 11111111
promoted to int : 11111111 11111111 11111111 11111111
shift right 6 : 00000011 11111111 11111111 11111111
cast to byte : 11111111
You can use the bitwise-and operator & to mask out the unwanted bits before shifting. Bit-anding with 0xFF keeps only the 8 least significant bits.
varB = (byte) ((varB & 0xFF) >>> 6);
Here's what happens now:
varB (byte) : 11111111
promoted to int : 11111111 11111111 11111111 11111111
bit-and mask : 00000000 00000000 00000000 11111111
shift right 6 : 00000000 00000000 00000000 00000011
cast to byte : 00000011
Because that's how shifting for bytes in Java is defined in the language: https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.19.
The gist is that types smaller than int are silently widened to int, shifted and then narrowed back.
Which makes your single line effectively the equivalent of:
byte b = -1; // 1111_1111
int temp = b; // 1111_1111_1111_1111_1111_1111_1111_1111
temp >>>= 6; // 0000_0011_1111_1111_1111_1111_1111_1111
b = (byte) temp; // 1111_1111
To shift just the byte, you need to make the widening conversion explicitly yourself with unsigned semantics (and the narrowing conversion needs to be done manually, too):
byte b = -1; // 1111_1111
int temp = b & 0xFF; // 0000_0000_0000_0000_0000_0000_1111_1111
temp >>>= 6; // 0000_0000_0000_0000_0000_0000_0000_0011
b = (byte) temp; // 0000_0011
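The explicit widening can be wrapped in a small helper. This is only a sketch: ByteShifts and its method names are made up for illustration, and distance is assumed to be between 0 and 7. Byte.toUnsignedInt (Java 8 and later) does the same job as & 0xFF.

```java
class ByteShifts {
    // Hypothetical helper: logical right shift confined to the byte's 8 bits.
    static byte unsignedShiftRight(byte value, int distance) {
        int widened = value & 0xFF; // widen without sign extension
        return (byte) (widened >>> distance);
    }

    // Same idea using the standard library instead of a manual mask.
    static byte unsignedShiftRightViaApi(byte value, int distance) {
        return (byte) (Byte.toUnsignedInt(value) >>> distance);
    }

    public static void main(String[] args) {
        byte b = -1; // 1111_1111
        System.out.println(unsignedShiftRight(b, 6));       // 3
        System.out.println(unsignedShiftRightViaApi(b, 6)); // 3
        System.out.println((byte) (b >>> 6));               // -1, the unmasked version
    }
}
```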
One problem with the top answer is that, although it works correctly for unsigned >>> right shift, it doesn't for signed >> right shift. This is because >> depends on the sign bit (the one farthest to the left) which moves when it's promoted to int. This means when you use >>, you'll get 00000011 when you might expect 11111111. If you want a trick that works for both, try shifting left by 24, doing your chosen right shift, then shifting back to the right by 24. That way your byte data's sign bit is in the right place.
varB = (byte) (varB << 24 >> 6 >> 24);
I've [bracketed] the sign bit. Here's what's happening:
varB (byte) : [1]1111111
promoted to int : [1]1111111 11111111 11111111 11111111
shift left 24 : [1]1111111 00000000 00000000 00000000
signed shift right 6 : [1]1111111 11111100 00000000 00000000
shift right 24 : [1]1111111 11111111 11111111 11111111
cast to byte : [1]1111111
Here you can see it also works for >>>:
varB = (byte) (varB << 24 >>> 6 >> 24);
varB (byte) : [1]1111111
promoted to int : [1]1111111 11111111 11111111 11111111
shift left 24 : [1]1111111 00000000 00000000 00000000
unsigned shift right 6 : [0]0000011 11111100 00000000 00000000
shift right 24 : [0]0000000 00000000 00000000 00000011
cast to byte : [0]0000011
This costs more operations for the convenience of not having to remember the rules about which one you should and shouldn't bitmask. So use whatever solution works for you.
Btw, it's good to know that short is also promoted to int which means everything in these answers applies to it as well. The only difference is that you shift left/right by 16, and the bitmask is 0xFFFF.
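For short, the same two approaches might look like the sketch below. The class and method names are made up for illustration, and distance is assumed to be between 0 and 15.

```java
class ShortShifts {
    // Mask approach: only meaningful for the unsigned shift.
    static short maskedUnsignedShift(short value, int distance) {
        return (short) ((value & 0xFFFF) >>> distance);
    }

    // Align approach: move the short's sign bit into the int's sign bit,
    // shift, then move the result back down.
    static short alignedSignedShift(short value, int distance) {
        return (short) (value << 16 >> distance >> 16);
    }

    static short alignedUnsignedShift(short value, int distance) {
        return (short) (value << 16 >>> distance >> 16);
    }

    public static void main(String[] args) {
        short s = -1; // 16 one bits
        System.out.println(maskedUnsignedShift(s, 14));  // 3
        System.out.println(alignedSignedShift(s, 14));   // -1
        System.out.println(alignedUnsignedShift(s, 14)); // 3
    }
}
```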

Why both (byte) 400000 and (byte) -400000 result -128

In Java, why do both (byte) 400000 and (byte) -400000 result in -128?
Actually, I followed the calculation method from https://stackoverflow.com/a/9085666/1037217
For case: 400000
Binary: 1100001101010000000
Trim to 8 digits: 10000000
Since the left most digit is 1, so -1 from it: 01111111
Then invert it: 10000000
Result: -128
For case: -400000
Binary: -1100001101010000000
Trim to 8 digits: 10000000
Since the left most digit is 1, so -1 from it: 01111111
Then invert it: 10000000
Result: 128
The same method works on
(short) 40000 = -25536
(short) -40000 = 25536
Casting an int to byte will preserve the int number's last 8 bits (the last byte).
400000 = 0x61a80
-400000 = 0xfff9e580
Both of your numbers have the same last 8 bits: 0x80, which is -128 in 2's complement.
For example:
System.out.println((byte)0x23403); // Prints 3 (the last 8 bits: 0x03 = 3)
System.out.println((byte)0x23483); // Prints -125 (last 8 bits: 0x83 = -125)
// (in 2's complement: 0x83 = -(128-3) = -125)
Because byte has the range -128 to 127. Both of your values overflow and are then subject to a narrowing conversion. To quote JLS Example 5.1.3-2. Narrowing Primitive Conversions that lose information,
// An int value too big for byte changes sign and magnitude:
As you say:
For case: 400000
Binary: 1100001101010000000
Trim to 8 digits: 10000000
Since the left most digit is 1, so -1 from it: 01111111
Then invert it: 10000000
Result: -128
For case: -400000
Binary: -1100001101010000000
Trim to 8 digits: 10000000
Since the left most digit is 1, so -1 from it: 01111111
Then invert it: 10000000
Result: 128
In both cases, the bit pattern you get is 10000000. That equates to -128 both times. A byte cannot represent the value 128; it is out of range.
However, your procedure is not quite right. You can't just put a negative sign there and then "trim to 8 digits". A negative sign is not a valid state for a bit. You should probably study the 2s complement representation of integers.
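You can check the bit patterns directly instead of working them out by hand. A minimal sketch, with made-up class and helper names:

```java
class NarrowingDemo {
    // Narrowing cast: keeps only the low 8 bits of the int.
    static byte toByte(int value) {
        return (byte) value;
    }

    // The low 8 bits of an int as two hex digits.
    static String lowByteHex(int value) {
        return String.format("%02x", value & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(400000));  // 61a80
        System.out.println(Integer.toHexString(-400000)); // fff9e580
        System.out.println(lowByteHex(400000));           // 80
        System.out.println(lowByteHex(-400000));          // 80
        System.out.println(toByte(400000));               // -128
        System.out.println(toByte(-400000));              // -128
        System.out.println((short) 40000);                // -25536
        System.out.println((short) -40000);               // 25536
    }
}
```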

Preventing Sign Extension with Byte Mask

I've been reading the book TCP/IP Sockets in Java, 2nd Edition. I was hoping to get more clarity on something, but since the book's website doesn't have a forum or anything, I thought I'd ask here.
In several places, the book uses a byte mask to avoid sign extension. Here's an example:
private final static int BYTEMASK = 0xFF; //8 bits
public static long decodeIntBigEndian(byte[] val, int offset, int size) {
long rtn = 0;
for(int i = 0; i < size; i++) {
rtn = (rtn << Byte.SIZE) | ((long) val[offset + i] & BYTEMASK);
}
return rtn;
}
So here's my guess of what's going on. Let me know if I'm right.
BYTEMASK in binary should look like 00000000 00000000 00000000 11111111.
To make things easy, let's just say the val byte array only contains 1 short, so the offset is 0. So let's set the byte array to val[0] = 11111111, val[1] = 00001111.
At i = 0, rtn is all 0's, so rtn << Byte.SIZE just keeps the value the same. Then there's (long)val[0], making it 8 bytes with all 1's due to sign extension. But when you use & BYTEMASK, it sets all those extra 1's to 0's, leaving that last byte all 1's. Then you get rtn | val[0], which basically flips on any 1's in the last byte of rtn.
For i = 1, (rtn << Byte.SIZE) pushes the least-significant byte over and leaves all 0's in its place. Then (long)val[1] makes a long with all zeros plus 00001111 for the least-significant byte, which is what we want, so using & BYTEMASK doesn't change it. Then when rtn | val[1] is used, it sets the low four bits of rtn's least-significant byte. The final return value is now rtn = 00000000 00000000 00000000 00000000 00000000 00000000 11111111 00001111.
So, I hope this wasn't too long, and that it was understandable. I just want to know if the way I'm thinking about this is correct, and not just completely wacked out logic. Also, one thing that confuses me is that BYTEMASK is 0xFF. In binary, this would be 11111111, so if it's being implicitly cast to an int, wouldn't it actually be 11111111 11111111 11111111 11111111 due to sign-extension? If that's the case, then it doesn't make sense to me how BYTEMASK would even work. Thank you for reading.
Everything is right except for the last point:
0xFF is already an int (0x000000FF), so it won't be sign-extended. In general, integer number literals in Java are ints unless they end with an L or l and then they are longs.
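To see what the mask buys you, here is a sketch that runs the book's method next to a variant with the mask removed. The decodeWithoutMask variant and the class name are made up for comparison and are not from the book; the input is the two-byte example from the question.

```java
class BigEndianDecodeDemo {
    private static final int BYTEMASK = 0xFF; // an int literal: 0x000000FF

    // The method from the question.
    static long decodeIntBigEndian(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            rtn = (rtn << Byte.SIZE) | ((long) val[offset + i] & BYTEMASK);
        }
        return rtn;
    }

    // Hypothetical variant without the mask: sign extension leaks into rtn.
    static long decodeWithoutMask(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            rtn = (rtn << Byte.SIZE) | (long) val[offset + i];
        }
        return rtn;
    }

    public static void main(String[] args) {
        byte[] val = { (byte) 0b11111111, 0b00001111 };
        System.out.println(decodeIntBigEndian(val, 0, 2)); // 65295 (0xFF0F)
        System.out.println(decodeWithoutMask(val, 0, 2));  // -241
        System.out.println(0xFF);                          // 255, not -1
    }
}
```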
