Bitwise differences in Java/Perl - java

I was messing around and noticed differences that I don't understand between Java and Perl when I bit-shifted -1.
I thought integers are stored in two's complement binary, so if there are 32 bits, -1 is 11...11 (32 times).
I'd expect -1 >> 1 to give $2^31 - 1$ = 011...11 = 2147483647.
I'd expect -1 << 1 to give = 11...110 = -2.
What is the reason for these different behaviors and where are these standards stated in different languages?
Code and printouts for Perl & Java below:
In Perl:
print (-1 >> 1, "\n");
print (-1 << 1, "\n");
2147483647
4294967294
In Java:
public class Tempy {
public static void main(String[] args){
System.out.println( -1 >> 1);
System.out.println( -1 << 1);
}
}
-1
-2

Perl's bit shift is inherently unsigned so -1 is treated as 2^32 -1 and it automatically fills with 0 so -1 >> 1 is 2^31-1 and -1 << 1 is 2^32-2.
[Edit] Thanks #Powerlord using integer will force perl to use signed values.
Java's bit shift sign extends (if using >>) so -1 << 1 is still -1 and -1 >> 1 is -2. If you don't want to sign extend you have to use the logical version >>>.

Related

How can -1 divided by 2 result in -1? (bitwise operation >>)

I was surprised to see that -1 divided by 2 using bitwise operations results -1.
I was expecting 0 to be returned.
Just like when you divide 1 or -1 by 2 the decimal part is removed and we get zero.
This might have to do with Two's complement but is just a guess and a guess that I don't fully understand.
can somebody explain it?
-1 >> 1 = -1
-1 / 2 = 0
public class JavaFiddle
{
public static void main(String[] args)
{
System.out.println(-1 >> 1);
System.out.println(-1 / 2);
}
}
Negative number in java is representated using a notation called 2's complement. If we assume the size of the signed integer is 8. you can think of 2's complement like this
2 will be 00000010
1 will be 00000001
0 will be 00000000
-1 will be 11111111 (Count in reverse from max)
-2 will be 11111110
-3 will be 11111101
(Actually in java the size of int is 4 bytes)
>> this is signed bitwise right shift operator. according to the documentation it fills 0 on the leftmost position for positive numbers and for negative numbers it will fill the same position with 1.
Which means shifting -1 any number of time gives -1 only.
11111111 >> 1 = 11111111
This is because of the Non-equivalence of arithmetic right shift and division meaning for negative numbers division by 2 and right shift should not be considered equal in all cases

Why does a shifted integer mask with a tilde casted to a long return zero? (Java, bit shift)

I am trying to create a mask to view specific bits on a long in Java. I tried the following:
long mask = ~ (0xffffffff << 32);
If I print this on the console it will return 0 but I am expecting 4294967295 since my result should look like 0x00000000FFFFFFFFL and 2^32 - 1 equals 4294967295. When I shift a long mask it works but I do not understand why.
long mask = ~ (0xFFFFFFFFFFFFFFFFL << 32);
Can anyone explain me this behavior?
Java assumes that if you're performing arithmetic operations on ints, then you want to get an int back, not a long. (The fact that you assign the output to a long after the calculation is done does not affect the calculation itself.)
Left-shifting an int (which is 32 bits) by 32 places does nothing. When you left-shift an int, only the five lowest-order bits of the right-hand operand are used, giving a number in the range 0 to 31.
http://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.19
That's why (0xffffffff<<32)==0xffffffff, and ~(0xffffffff<<32)==0
When shifting a long (which is 64 bits), the six lowest-order bits are used, giving a number in the range 0 to 63.
If you use 0xffffffffL, then Java will know to produce another long. So you can shift by 32 places without going off the left end of the number.
Left shift is modulus† the size of the data type, e.g. a shift of an int of 32 bits has no effect.
(0xffffffff << 32) ==
(0xffffffff << (32 % Integer.SIZE)) ==
(0xffffffff << (32 % 32)) ==
(0xffffffff << 0) ==
0xffffffff
And ~ of 0xffffffff is 0x00000000, i.e. 0 which is what you see.
Then with 64 bit, the full 32 bit shift is applied as it is less than 64:
(0xffffffffL << 32) ==
(0xffffffffL << (32 % Long.SIZE) ==
(0xffffffffL << (32 % 64) ==
(0xffffffffL << 32) ==
0xffffffff00000000L
† Strictly speaking it's taking the last 5 bits for ints and last 6 for longs, which makes a difference over modulus for negative left shifts.

Understanding unsigned right shift

The JLS 15.19 describes the formula for >>> operator.
The value of n >>> s is n right-shifted s bit positions with
zero-extension, where:
If n is positive, then the result is the same as that of n >> s.
If n is negative and the type of the left-hand operand is int, then
the result is equal to that of the expression (n >> s) + (2 << ~s).
If n is negative and the type of the left-hand operand is long, then
the result is equal to that of the expression (n >> s) + (2L << ~s).
Why does n >>> s = (n >> s) + (2 << ~s), where ~s = 31 - s for int and ~s = 63 - s for long?
If n is negative it means that the sign bit is set.
>>> s means shift s places to the right introducing zeros into the vacated slots.
>> s means shift s places to the right introducing copies of the sign bit into the vacated slots.
E.g.
10111110000011111000001111100000 >>> 3 == 00010111110000011111000001111100
10111110000011111000001111100000 >> 3 == 11110111110000011111000001111100
Obviously if n is not negative, n >> s and n >>> s are the same. If n is negative, the difference will consist of s ones at the left followed by all zeros.
In other words:
(n >>> s) + X == n >> s (*)
where X consists of s ones followed by 32 - s zeros.
Because there are 32 - s zeros in X, the right-most one in X occurs in the position of the one in 1 << (32 - s), which is equal to 2 << (31 - s), which is the same as 2 << ~s (because ~s == -1 - s and shift amounts work modulo 32 for ints).
Now what happens when you add 2 << ~s to X? You get zero! Let's demonstrate this in the case s == 7. Notice that the the carry disappears off the left.
11111110000000000000000000000000
+ 00000010000000000000000000000000
________________________________
00000000000000000000000000000000
It follows that -X == 2 << ~s. Therefore adding -X to both sides of (*) we get
n >>> s == (n >> s) + (2 << ~s)
For long it's exactly the same, except that shift amounts are done modulo 64, because longs have 64 bits.
Here is some additional context that will help you understand pbabcdefp's answer if you don't already know the basics he assumes:
To understand bitwise operators you must think about numbers as strings of binary digits, eg. 20 = 00010100 and -4 = 11111100 (For the sake of clarity and not having to write so many digits I will be writing all binary numbers as bytes; ints are the same but four times as long). If you are unfamiliar with binary and binary operations, you can read more here. Note how the first digit is special: It makes numbers negative, as if it had a place value (remember elementary math, the ones/tens/hundreds places?) of the most negative number possible, so Byte.MIN_VALUE = -128 = 1000000, and setting any other bit to 1 always increases the number. To easily read a negative number such as 11110011, know that -1 = 11111111, then read the 0s as if they were 1s in a positive number, then that number is how far away you are from -1. So 11110011 = -1 - 00001100 = -1 - 12 = -13.
Also understand that ~s is bitwise NOT: It takes all the digits and flips them, this is actually equivalent to ~s = -1 - s. Eg ~5 (00000101) is -6 (11111010). Observe how my suggested method for reading negative binary numbers is simply a trick to be able to read the bitwise NOT of the number rather than the number itself, which is easier for negative numbers close to zero because those numbers have fewer 0s than 1s.

Get n Least Significant Bits from an Int

This seems fairly straightforward, but I cant find an answer. If I have an int X, what is the best way to get N least significant bits from this int, in Java?
This should work for all non-negative N < 33 32:
x & ((1 << N) - 1)
It's worth elaborating on how this works for N == 31 and N == 32. For N == 31, we get 1 << N == Integer.MIN_VALUE. When you subtract 1 from that, Java silently wraps around to Integer.MAX_VALUE, which is exactly what you need. For N == 32, the 1 bit is shifted completely out, so 1 << N == 0; then (1 << N) - 1 == -1, which is all 32 bits set.
For N == 32, this unfortunately doesn't work because (thanks, #zstring!) the << operator only shifts by the right side mod 32. Instead, if you want to avoid testing for that case specially, you could use:
x & ((int)(1L << N) - 1)
By shifting a long, you get the full 32-bit shift, which, after casting back to an int, gets you 0. Subtracting 1 gives you -1 and x & -1 is just x for any int value x (and x is the value of the lower 32 bits of x).
Ted's approach is likely to be faster but here is another approach
x << -N >>> -N
This shift all the bit up and then down to chop off the top bits.
int i = -1;
System.out.println(Integer.toBinaryString(i));
i = i << -5 >>> -5;
System.out.println(Integer.toBinaryString(i));
prints
11111111111111111111111111111111
11111
You can also use a mask. If you use the & bitwise operator you can then remove whatever bit you would want to remove (say the highest x bits);
int mask = 0x7FFFFFFF //Example mask where you will remove the
// most significant bit
// (0x7 = 0111b and 0xF = 1111b).
int result = numberToProcess & mask; //And apply the mask with the &bitwise op.
The disadvantage to this is that you will need to make a mask for each bit, so perhaps this is better seen as another method of approach in general.

bit shift operation does not return expected result

Why does Java return -2147483648 when I bit shift 1 << 63 ?
The expected result is 9 223 372 036 854 775 808, tested with Wolfram Alpha and my calculator.
I tested:
System.out.print((long)(1 << (63)));
There's an important thing to note about the line
System.out.print((long)(1 << (63)));
You first take (1 << 63), and then you cast to long. As a result, you are actually left-shifting in integers, so the long cast doesn't have any effect. That's why shifting 63 bits left gives the min integer rather than the min long.
But there's another, more important point. Java longs are always signed, so even the line
System.out.print(1L << 63);
would give a negative number. Under two's complement, whenever the leftmost bit is a 1 the number is negative.
You actually cannot represent the number 263 = 9223372036854775808 in a Java primitive type, because that number is bigger than the maximum long, and long is the largest primitive type. You can represent this number as a BigInteger, though. You can even generate it via a left-shift by 63 with the code
BigInteger.ONE.shiftLeft(63)
You are having an integer overflow [twice].
1 << 32 == 1
1 << 31 == -2147483648 [ becuase this is the binary representation in 2's complement for -2147483648]
1 << 63 == 1 << (32 + 31) == (1 << 32) << 31 == 1 << 31 == -2147483648
When you do (long)(1 << (63)) you are only casting the result of 1 << (63) [which is -2147483648] to a long - and it does not change its value.

Categories