Weird behaviour of bit-shifting with byte in Java - java

As I was using bit-shifting on byte, I noticed I was getting weird results when using the unsigned right shift (>>>). With int, both right shifts (signed: >> and unsigned: >>>) behave as expected:
int min1 = Integer.MIN_VALUE>>31; //min1 = -1
int min2 = Integer.MIN_VALUE>>>31; //min2 = 1
But when I do the same with byte, strange things happen with unsigned right shift:
byte b1 = Byte.MIN_VALUE; //b1 = -128
b1 >>= 7; //b1 = -1
byte b2 = Byte.MIN_VALUE; //b2 = -128
b2 >>>= 7; //b2 = -1; NOT 1!
b2 >>>= 8; //b2 = -1; NOT 0!
I figured that the compiler might be converting the byte to int internally, but that alone does not seem sufficient to explain the behaviour.
Why is bit-shifting behaving that way with byte in Java?

This happens exactly because byte is promoted to int prior to performing bitwise operations. The int -128 is represented as:
11111111 11111111 11111111 10000000
Thus, after shifting right by 7 or 8 bits, the lowest 8 bits (including bit 7) are still all 1s, so the result is narrowed to a negative byte value (-1).
Compare:
System.out.println((byte) (b >>> 7)); // -1
System.out.println((byte) ((b & 0xFF) >>> 7)); // 1
With b & 0xFF, the upper 24 bits are cleared prior to the shift, so the result is produced as expected.
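A minimal sketch of the masking idiom, wrapped in a helper. The class name ByteShift and the method unsignedShiftRight are made up for illustration and are not part of the JDK; the shift distance n is assumed to be in the range 0..31.

```java
public class ByteShift {
    // Hypothetical helper: treats b as an unsigned 8-bit value (0..255)
    // before shifting, so no sign-extended 1 bits can enter from the left.
    static byte unsignedShiftRight(byte b, int n) {
        return (byte) ((b & 0xFF) >>> n);
    }

    public static void main(String[] args) {
        byte b = Byte.MIN_VALUE;                       // -128, bits 10000000
        System.out.println((byte) (b >>> 7));          // -1
        System.out.println(unsignedShiftRight(b, 7));  // 1
        System.out.println(unsignedShiftRight(b, 8));  // 0
    }
}
```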

Shift operators for byte, short and char are always done on int.
Therefore, the value really being shifted is the int value -128, which looks like this
int b = 0b11111111_11111111_11111111_10000000;
When you do b2 >>= 7; what you are really doing is shifting the above value 7 places to the right, then casting back to a byte by only considering the last 8 bits.
After shifting 7 places to the right we get
0b11111111_11111111_11111111_11111111;
When we convert this back to a byte, we get just 11111111, which is -1, because the byte type is signed.
If you want to get the answer 1 you could shift 31 places without sign extension.
byte b2 = Byte.MIN_VALUE; //b2 = -128
b2 >>>= 31;
System.out.println(b2); // 1

Refer to JLS 15.19 Shift Operators:
Unary numeric promotion (§5.6.1) is performed on each operand separately.
and in 5.6.1 Unary Numeric Promotion :
if the operand is of compile-time type byte, short, or char, it is promoted to a value of type int by a widening primitive conversion
So, your byte operands are promoted to int before shifting. The value -128 is 11111111111111111111111110000000 .
After shifting by 7 or 8 bits, the lowest 8 bits are all 1s. When the result is assigned back to a byte, a narrowing primitive conversion occurs, which keeps only those 8 bits and yields -1. Refer to JLS 5.1.3 Narrowing Primitive Conversion:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T.
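The promote, shift, narrow sequence described by these JLS sections can be spelled out as separate steps. This is only a sketch: the class PromotionSteps and the method compoundUnsignedShift are hypothetical names that mimic what b >>>= n does for a byte.

```java
public class PromotionSteps {
    // Hypothetical helper that spells out what `b >>>= n` does for a byte.
    static byte compoundUnsignedShift(byte b, int n) {
        int promoted = b;              // widening: sign-extends to 32 bits
        int shifted = promoted >>> n;  // the shift happens on the int
        return (byte) shifted;         // narrowing: keeps the lowest 8 bits
    }

    public static void main(String[] args) {
        int promoted = Byte.MIN_VALUE;
        System.out.println(Integer.toHexString(promoted));        // ffffff80
        System.out.println(Integer.toHexString(promoted >>> 7));  // 1ffffff
        System.out.println(compoundUnsignedShift(Byte.MIN_VALUE, 7));  // -1

        byte b2 = Byte.MIN_VALUE;
        b2 >>>= 7;                     // same three steps, done implicitly
        System.out.println(b2);        // -1
    }
}
```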

Related

Finding the absolute value of a Byte variable in Kotlin or Java?

Why isn't there any function in the standard library of Kotlin/Java for taking the absolute value of a Byte/byte variable? Am I missing something?
Math.abs() is only defined for int, long, double and float.
For context: in the audio world you can easily run into byte arrays representing amplitude. I'm interested in calculating the average of the absolute values of a byte array. For example, see this listener related to Visualizer in Android.
I know I can cast it to an integer and take the absolute value of that, but I would still be interested in why this is not predefined.
The operations in java.lang.Math are in line with all other arithmetic operations in Java. Integer operations always work in either 64-bit long or 32-bit int.
As stated in JLS, §4.2.2. Integer Operations
If an integer operator other than a shift operator has at least one operand of type long, then the operation is carried out using 64-bit precision, and the result of the numerical operator is of type long. If the other operand is not long, it is first widened (§5.1.5) to type long by numeric promotion (§5.6).
Otherwise, the operation is carried out using 32-bit precision, and the result of the numerical operator is of type int. If either operand is not an int, it is first widened to type int by numeric promotion.
In other words, not even the following, equivalent to abs, would compile:
byte a = 42, absA = a < 0? -a: a;
as the numeric operation -a will promote a to int before negating.
It’s important that a cast of the result to byte would not be a lossless operation here. The byte datatype has a value range from -128 to +127, so if the value is -128, its absolute value +128 is outside the byte value range and a cast to byte would overflow back to -128.
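A short sketch of that overflow, assuming the absolute value is computed in int and then narrowed. The class ByteAbsOverflow and the helper narrowedAbs are illustrative names only.

```java
public class ByteAbsOverflow {
    // Hypothetical helper: absolute value computed in int, then narrowed.
    static byte narrowedAbs(byte b) {
        return (byte) Math.abs(b);  // b is promoted to int for Math.abs(int)
    }

    public static void main(String[] args) {
        System.out.println(Math.abs(Byte.MIN_VALUE));     // 128 (an int)
        System.out.println(narrowedAbs(Byte.MIN_VALUE));  // -128
        System.out.println(narrowedAbs((byte) -127));     // 127
    }
}
```

Only -128 is affected, since it is the one byte value whose magnitude has no byte representation.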
Therefore, to have a correct and efficient calculation, you should do as always in Java when it comes to byte, short, or char calculations, calculate everything using int and only cast the final result back to your data type. When you want to calculate the average, you have to calculate the sum using int anyway (or even long if you have more than 16777215 array elements).
byte[] array = { 1, -1, -128, 127 }; // e.g. a test case
int sum = 0;
for(byte b: array) sum += Math.abs(b);
int average = sum/array.length;
// if you really need a byte result
byte byteAverage = average == 128? 127: (byte)average;
I don’t know about Kotlin, but in Java, the automatic promotion to int also works if the operand is of type Byte, so you don’t need to “cast it to an integer” to call Math.abs(int). You only have to deal with the fact that the result will be an int, as with all arithmetic operations on byte, short, char, or their wrapper types.
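In Java, byte is signed, with values from -128 to 127. Reinterpreted as an unsigned int via 0xFF & b, the negative values -128 .. -1 map to 128 .. 255, and 0 .. 127 stay unchanged.
Math.abs is probably irrelevant here, as the data most likely consists of unsigned byte values.
import java.util.Arrays;
int[] bytesToInt(byte[] bs) {
    int[] is = new int[bs.length];
    // Arrays.setAll takes a generator function; Arrays.fill only takes a fixed value
    Arrays.setAll(is, i -> bs[i] & 0xFF);
    return is;
}
byte byteAbs(byte b) {
    // -b is an int, so the whole expression has to be cast back to byte
    return (byte) (b >= 0 ? b : b == -128 ? 127 : -b);
}
byteAbs, given for completeness, reduces the range to 7 bits and has the artefact that -128 maps to 127, as there is no byte value 128.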

>> and >>> operator with negative byte value in java

I have a sample code snippet like this:
byte a = -0b00001111;
byte b = a;
byte c = a;
System.out.println("a=" + a );
System.out.println("b=" + (b >> 1) );
System.out.println("c=" + (c >>> 1) );
and, it prints:
a=-15
b=-8
c=2147483640
I don't quite understand how b and c became those two values. Could someone please show me step by step how they were calculated?
For byte a, you have the literal 0b00001111, which is binary for 15, so a is -15. The bit representation of -15 for a byte is:
11110001
In Java, unary numeric promotion occurs on the operands of bit-shift operators <<, >>, and >>>.
Unary numeric promotion (§5.6.1) is performed on each operand separately.
This means that the value is promoted to int before shifting.
The code (b >> 1) promotes b to an int, then shifts the value with sign extension. This means that if the value was already negative, a 1 bit is shifted in from the left to ensure it's still negative.
11110001
is promoted to
11111111 11111111 11111111 11110001
which is -15 as an int. After shifting to the right one bit, with sign extension:
11111111 11111111 11111111 11111000
which is -8.
However, for the code (c >>> 1), the >>> unsigned right shift operator does not perform sign extension, even if the promotion to int does maintain the sign and the value.
11110001
is promoted to
11111111 11111111 11111111 11110001
which is -15 as an int as before. After unsigned shifting to the right one bit:
01111111 11111111 11111111 11111000
The first bit is now 0. Without the most significant bit set, the value is now 2^31 - 8, or 2147483640.
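The steps in this answer can be reproduced with a small trace program. This is a sketch only; ShiftTrace and its bits helper are hypothetical names, and the helper simply pads Integer.toBinaryString to 32 digits and groups them by byte to match the notation used here.

```java
public class ShiftTrace {
    // Hypothetical helper: 32 binary digits, zero-padded, grouped by byte.
    static String bits(int value) {
        String s = String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
        return s.substring(0, 8) + " " + s.substring(8, 16) + " "
                + s.substring(16, 24) + " " + s.substring(24);
    }

    public static void main(String[] args) {
        byte a = -0b00001111;               // -15
        System.out.println(bits(a));        // 11111111 11111111 11111111 11110001
        System.out.println(bits(a >> 1));   // 11111111 11111111 11111111 11111000
        System.out.println(a >> 1);         // -8
        System.out.println(bits(a >>> 1));  // 01111111 11111111 11111111 11111000
        System.out.println(a >>> 1);        // 2147483640
    }
}
```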

Narrowing from int to short [duplicate]

I am working on narrowing conversions and tried the following code:
int i = 131072;
short s = (short)i;
System.out.println(s); //giving 0
This narrowing outputs 0. I don't understand the logic behind it.
131072 int is 00000000 00000010 00000000 00000000 in binary.
When you cast it to short, only the lowest 16 bits remain - 00000000 00000000.
When you cast a primitive to a smaller primitive the top bits are dropped.
Written another way you can see what is happening.
int i = 0x20000;
short s = (short) (i & 0xFFFF);
Note: the lower 16 bits of your integer are all zero, so the answer is 0.
In binary, casting to (short) keeps only the lower 16 bits:
00000000 00000010 (00000000 00000000)
If you were to cast a longer number, it would still take the lower bits in each case. Note: the & in each case is redundant and only to help clarity.
long l = 0x0FEDCBA987654321L;
// i = 0x87654321
int i = (int) (l & 0xFFFFFFFFL);
// c = \u4321
char c = (char) (l & 0xFFFF);
// s = 0x4321
short s = (short) (l & 0xFFFF);
// b = 0x21
byte b = (byte) (l & 0xFF);
Primitive values (int, short,...) are stored as binary values. int uses more bits than short. When you try to downcast you're cutting away bits which truncates (and potentially ruins) the value.
This is not a downcast (which refers to objects); it's a narrowing cast, or truncation. When you perform such a cast, you just copy the two least significant bytes of the int to your short. If the integer is non-negative and smaller than 2^15, you'd just discard bytes containing zeroes, so it would just work.
This is not the case here, however. If you examine the binary representation of 131072 you'll see it's 100000000000000000. So, the two least significant bytes are clearly 0, which is exactly what you're getting.
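A brief sketch of this truncation, including values on either side of the short range. NarrowingDemo and low16 are names invented for this example.

```java
public class NarrowingDemo {
    // Hypothetical helper: the bits that (short) i keeps, as an explicit mask.
    static int low16(int i) {
        return i & 0xFFFF;
    }

    public static void main(String[] args) {
        int i = 131072;                              // 0x20000, i.e. 2^17
        System.out.println(Integer.toHexString(i));  // 20000
        System.out.println(low16(i));                // 0
        System.out.println((short) i);               // 0
        System.out.println((short) 32767);           // 32767
        System.out.println((short) 32768);           // -32768
        System.out.println((short) 65537);           // 1
    }
}
```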

Why bit or operation will lead to sign extension but bit and won't?

I need to cast a byte to int in Java but I don't want sign extension so I did
byte b = -1;
(int) (b & 0xF) // this returns 15, which is what I want
(int) (b | 0) // this returns -1, which is essentially 0xFFFFFFFF, sign extension happens, not what I want
I thought the above two should give same results but it turns out that's not the case.
I must be missing something about bit operations.
The trick is to print the binary representation of those values and perform the binary operations on them
byte b = -1;
int a = (int) (b & 0xF); // this returns 15, which is what I want
int c = (int) (b | 0); // this returns -1, which is essentially 0xFFFFFFFF
System.out.println("b:" + Integer.toBinaryString(b));
System.out.println("a:" + Integer.toBinaryString(a));
System.out.println("c:" + Integer.toBinaryString(c));
System.out.println("0xF:" + Integer.toBinaryString(0xF));
prints
b:11111111111111111111111111111111
a:1111
c:11111111111111111111111111111111
0xF:1111
So b & 0xF is
11111111111111111111111111111111
00000000000000000000000000001111 (AND)
--------------------------------
1111 (15)
and b | 0 is
11111111111111111111111111111111
00000000000000000000000000000000 (OR)
--------------------------------
11111111111111111111111111111111 (-1)
Hot Licks explains why the byte value -1 is represented in binary as it is.
The issue here is that bitwise operators work on ints or longs, not bytes. b & 0xF is essentially treated as ((int)b) & ((int)0xF). You can trace it all from the JLS definitions of each operation.
First JLS 15.22.1 (which defines & and |) explains that when both operands are convertible to integer primitive types, "binary numeric promotion is first performed on the operands (§5.6.2)."
JLS 5.6.2, in turn, says that unless either operand is a float, double or long, both values are widened to int.
Finally, widening is defined in JLS 5.1.2 and states that "widening conversion of a signed integer value to an integral type T simply sign-extends the two's-complement representation of the integer value to fill the wider format." Bytes are signed (JLS 4.2).
So, your b byte is widened to an int using sign extension before being AND'd or OR'ed with the right operand.
Note that this would imply that the result of b & 0xF should be an int, not a byte. This is in fact the case (meaning that your explicit cast to int is superfluous). You can test this by auto-boxing it to an Object and then checking that object's type:
byte b = -1;
Object o = (b & 0xF);
System.out.println(o.getClass());
// prints "class java.lang.Integer", not "class java.lang.Byte"
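For the original goal of converting a byte to an int without sign extension, a sketch comparing the mask with the JDK method Byte.toUnsignedInt (available since Java 8). Note that keeping the full byte needs the mask 0xFF; 0xF keeps only the low 4 bits. The class UnsignedByteDemo and the helper unsigned are hypothetical names.

```java
public class UnsignedByteDemo {
    // Hypothetical helper: the promoted int is masked down to its low 8 bits.
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = -1;
        System.out.println(unsigned(b));            // 255
        System.out.println(Byte.toUnsignedInt(b));  // 255
        System.out.println(b & 0xF);                // 15 (low 4 bits only)
        System.out.println(b | 0);                  // -1 (sign-extended bits kept)
    }
}
```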

Behaviour of unsigned right shift applied to byte variable

Consider the following snippet of Java code:
byte b=(byte) 0xf1;
byte c=(byte)(b>>4);
byte d=(byte) (b>>>4);
output:
c=0xff
d=0xff
expected output:
c=0x0f
How?
b in binary is 1111 0001.
After an unsigned right shift by 4 it should be 0000 1111, hence 0x0f. Why is it 0xff?
The problem is that all arguments are first promoted to int before the shift operation takes place:
byte b = (byte) 0xf1;
b is signed, so its value is -15.
byte c = (byte) (b >> 4);
b is first sign-extended to the integer -15 = 0xfffffff1, then shifted right to 0xffffffff and truncated to 0xff by the cast to byte.
byte d = (byte) (b >>> 4);
b is first sign-extended to the integer -15 = 0xfffffff1, then shifted right to 0x0fffffff and truncated to 0xff by the cast to byte.
You can do (b & 0xff) >>> 4 to get the desired effect.
I'd guess that b is sign extended to int before shifting.
So this might work as expected:
(byte)((0x000000FF & b)>>4)
According to Bitwise and Bit Shift Operators:
The unsigned right shift operator ">>>" shifts a zero into the leftmost position, while the leftmost position after ">>" depends on sign extension.
So with b >> 4 you transform 1111 0001 to 1111 1111 (b is negative, so 1s are shifted in from the left), which is 0xff.
Java tries to skimp on having explicit support for unsigned basic types by defining the two different shift operators instead.
The question talks about unsigned right shift, but the example does both (signed and unsigned) and shows the expected value for the signed shift (>>).
Your calculations would be right for an unsigned shift (>>>).
The byte operand is promoted to an int before the shift.
See https://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.19
Unary numeric promotion (§5.6.1) is performed on each operand separately. (Binary numeric promotion (§5.6.2) is not performed on the operands.)
And https://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html#jls-5.6.1
Otherwise, if the operand is of compile-time type byte, short, or char, it is promoted to a value of type int by a widening primitive conversion (§5.1.2).
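byte b = (byte) 0xf1;
byte d;
if (b < 0)
    // shift once with sign extension, clear the top bit, then shift the remaining 3 places
    d = (byte) ((byte) ((byte) (b >> 1) & (byte) 0x7F) >>> 3);
else
    d = (byte) (b >>> 4);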
First, check the value:
If the value is negative, make one right shift and then apply & 0x7F; this makes it positive. Then you can do the remaining right shifts (4 - 1 = 3) safely.
If the value is non-negative, do the whole shift with >> 4 or >>> 4. It makes no difference to the result.
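As a sanity check, the branch-based approach can be compared against the simpler (b & 0xff) >>> 4 mask for every byte value. This is a sketch; ShiftEquivalence, branchShift4 and maskShift4 are names introduced here for illustration.

```java
public class ShiftEquivalence {
    // The branch-based approach described above, for a shift of 4.
    static byte branchShift4(byte b) {
        if (b < 0) {
            return (byte) ((byte) ((byte) (b >> 1) & (byte) 0x7F) >>> 3);
        }
        return (byte) (b >>> 4);
    }

    // The masking approach: clear the sign-extended bits, then shift.
    static byte maskShift4(byte b) {
        return (byte) ((b & 0xFF) >>> 4);
    }

    public static void main(String[] args) {
        boolean same = true;
        for (int v = Byte.MIN_VALUE; v <= Byte.MAX_VALUE; v++) {
            same &= branchShift4((byte) v) == maskShift4((byte) v);
        }
        System.out.println(same);                      // true
        System.out.println(maskShift4((byte) 0xf1));   // 15, i.e. 0x0f
    }
}
```

Both give the same result, so the mask is usually the shorter way to write it.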
