why Integer.MAX_VALUE + 1 == Integer.MIN_VALUE? - java

System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE);
is true.
I understand that integer in Java is 32 bit and can't go above 231-1, but I can't understand why adding 1 to its MAX_VALUE results in MIN_VALUE and not in some kind of exception. Not mentioning something like transparent conversion to a bigger type, like Ruby does.
Is this behavior specified somewhere? Can I rely on it?

Because the integer overflows. When it overflows, the next value is Integer.MIN_VALUE. Relevant JLS
If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format. If overflow occurs, then the sign of the result is not the same as the sign of the mathematical sum of the two operand values.

The integer storage gets overflowed and that is not indicated in any way, as stated in JSL 3rd Ed.:
The built-in integer operators do not indicate overflow or underflow in any way. Integer operators can throw a NullPointerException if unboxing conversion (§5.1.8) of a null reference is required. Other than that, the only integer operators that can throw an exception (§11) are the integer divide operator / (§15.17.2) and the integer remainder operator % (§15.17.3), which throw an ArithmeticException if the right-hand operand is zero, and the increment and decrement operators ++(§15.15.1, §15.15.2) and --(§15.14.3, §15.14.2), which can throw an OutOfMemoryError if boxing conversion (§5.1.7) is required and there is not sufficient memory available to perform the conversion.
Example in a 4-bits storage:
MAX_INT: 0111 (7)
MIN_INT: 1000 (-8)
MAX_INT + 1:
0111+
0001
----
1000

You must understand how integer values are represented in binary form, and how binary addition works. Java uses a representation called two's complement, in which the first bit of the number represents its sign. Whenever you add 1 to the largest java Integer, which has a bit sign of 0, then its bit sign becomes 1 and the number becomes negative.
This links explains with more details: http://www.cs.grinnell.edu/~rebelsky/Espresso/Readings/binary.html#integers-in-java
--
The Java Language Specification treats this behavior here: http://docs.oracle.com/javase/specs/jls/se6/html/expressions.html#15.18.2
If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format. If overflow occurs, then the sign of the result is not the same as the sign of the mathematical sum of the two operand values.
Which means that you can rely on this behavior.

On most processors, the arithmetic instructions have no mode to fault on an overflow. They set a flag that must be checked. That's an extra instruction so probably slower. In order for the language implementations to be as fast as possible, the languages are frequently specified to ignore the error and continue. For Java the behaviour is specified in the JLS. For C, the language does not specify the behaviour, but modern processors will behave as Java.
I believe there are proposals for (awkward) Java SE 8 libraries to throw on overflow, as well as unsigned operations. A behaviour, I believe popular in the DSP world, is clamp the values at the maximums, so Integer.MAX_VALUE + 1 == Integer.MAX_VALUE [not Java].
I'm sure future languages will use arbitrary precision ints, but not for a while yet. Requires more expensive compiler design to run quickly.

The same reason why the date changes when you cross the international date line: there's a discontinuity there. It's built into the nature of binary addition.

This is a well known issue related to the fact that Integers are represented as two's complement down at the binary layer. When you add 1 to the max value of a two's complement number you get the min value. Honestly, all integers behaved this way before java existed, and changing this behavior for the Java language would have added more overhead to integer math, and confused programmers coming from other languages.

When you add 3 (in binary 11) to 1 (in binary 1), you must change to 0 (in binary 0) all binary 1 starting from the right, until you got 0, which you should change to 1. Integer.MAX_VALUE has all places filled up with 1 so there remain only 0s.

Easy to understand with byte example=>
byte a=127;//max value for byte
byte b=1;
byte c=(byte) (a+b);//assigns -128
System.out.println(c);//prints -128
Here we are forcing addition and casting it to be treated as byte.
So what will happen is that when we reach 127 (largest possible value for a byte) and we add plus 1 then the value flips (as shown in image) from 127 and it becomes -128.
The value starts circling around the type.
Same is for integer.
Also integer + integer stays integer ( unlike byte + byte which gets converted to int [unless casted forcefully as above]).
int int1=Integer.MAX_VALUE+1;
System.out.println(int1); //prints -2147483648
System.out.println(Integer.MIN_VALUE); //prints -2147483648
//below prints 128 as converted to int as not forced with casting
System.out.println(Byte.MAX_VALUE+1);

Cause overflow and two-compliant nature count goes on "second loop", we was on far most right position 2147483647 and after summing 1, we appeared at far most left position -2147483648, next incrementing goes -2147483647, -2147483646, -2147483645, ... and so forth to the far most right again and on and on, its nature of summing machine on this bit depth.
Some examples:
int a = 2147483647;
System.out.println(a);
gives: 2147483647
System.out.println(a+1);
gives: -2147483648 (cause overflow and two-compliant nature count goes on "second loop", we was on far most right position 2147483647 and after summing 1, we appeared at far most left position -2147483648, next incrementing goes -2147483648, -2147483647, -2147483646, ... and so fores to the far most right again and on and on, its nature of summing machine on this bit depth)
System.out.println(2-a);
gives:-2147483645 (-2147483647+2 seems mathematical logical)
System.out.println(-2-a);
gives: 2147483647 (-2147483647-1 -> -2147483648, -2147483648-1 -> 2147483647 some loop described in previous answers)
System.out.println(2*a);
gives: -2 (2147483647+2147483647 -> -2147483648+2147483646 again mathematical logical)
System.out.println(4*a);
gives: -4 (2147483647+2147483647+2147483647+2147483647 -> -2147483648+2147483646+2147483647+2147483647 -> -2-2 (according to last answer) -> -4)`

Related

Java - Masking out sign bit causes unexpected behavior

I have the following value:
int x = -51232;
Java integers are 32 bits, so in binary this should be the following:
10000000000000001100100000100000
The sign bit on the left is set to 1 since x is negative.
Then I do the operation
x = (x & Integer.MAX_VALUE);
Integer.MAX_VALUE is 2147483647 and in binary that would be:
01111111111111111111111111111111
0 on the left because the value is positive.
So why does x & Integer.MAX_VALUE yield 2147432416? The AND operator should only retrieve bits that x and Integer.MAX_VALUE have in common, which should be equivalent to -x (since they do not share the same sign bit).
What's going on here?
Your misunderstanding is caused by lack of knowledge regarding how negative integers are represented in binary in Java. You should read about 2's complement.
10000000000000001100100000100000 is not the binary representation of -51232.
11111111111111110011011111100000 is.
And when you run bitwise AND, you get:
11111111111111110011011111100000 (-51232)
01111111111111111111111111111111 (Integer.MAX_VALUE)
--------------------------------
01111111111111110011011111100000 (2147432416)
Here's the binary representation of -51232 next to the binary representation of 51232. You can see that their sum is 232. that's always the case with 2's complement, for any pair of ints x and -x.
00000000000000001100100000100000 (-51232)
11111111111111110011011111100000 (51232)
Integers are stored in two complement: https://en.m.wikipedia.org/wiki/Two%27s_complement.
Thus while the leftmost bit indicate negative numbers, it is not really a sign bit as it would be for floating point numbers in common representation.
The main reason of this representation is that it make it easy to do addition and substraction among other at the hardware level.
One should not that with two complement notation, -1 is represented by all bit set to 1.

Rules governing narrowing of double to int

Please note I am NOT looking for code to cast or narrow a double to int.
As per JLS - $ 5.1.3 Narrowing Primitive Conversion
A narrowing conversion of a signed integer to an integral type T
simply discards all but the n lowest order bits, where n is the number
of bits used to represent type T.
So, when I try to narrow a 260 (binary representation as 100000100) to a byte then result is 4 because the lowest 8 bits is 00000100 which is a decimal 4 OR a long value 4294967296L (binary representation 100000000000000000000000000000000) to a byte then result is 0.
Now, why I want to know the rule for narrowing rule from double to int, byte etc. is when I narrow a double value 4294967296.0 then result is 2147483647 but when I narrow a long 4294967296L value then result is 0.
I have understood the long narrowing to int, byte etc. (discards all but the n lowest order bits) but I want to know what is going under the hoods in case of double narrowing.
I have understood the long narrowing to int, byte etc. (discards all but the n lowest order bits) but I want to know what is going under the hoods in case of double narrowing.
... I want to understand the why and how part.
The JLS (JLS 5.1.3) specifies what the result is. A simplified version (for int) is:
a NaN becomes zero
an Inf becomes "max-int" or "min-int"
otherwise:
round towards zero to get a mathematical integer
if the rounded number is too big for an int, the result becomes "min-int" or "max-int"
"How" is implementation specific. For examples of how it could be implemented, look at the Hotspot source code (OpenJDK version) or get the JIT compiler to dump some native code for you to look at. (I imagine that the native code maps uses a single instruction to do the actual conversion .... but I haven't checked.)
"Why" is unknowable ... unless you can ask one of the original Java designers / spec authors. A plausible explanation is a combination of:
it is easy to understand
it is consistent with C / C++,
it can be implemented efficiently on common hardware platforms, and
it is better than (hypothetical) alternatives that the designers considered.
(For example, throwing an exception for NaN, Inf, out-of-range would be inconsistent with other primitive conversions, and could be more expensive to implement.)
Result is Integer.MAX_VALUE when converting a double to an integer, and the value exceeds the range of an integer. Integer.MAX_VALUE is 2^31 - 1.
When you start with the double value 4294967296.0, it is greater than the greatest long value which is 2147483647 so the following rule is applied (from the page you cited) : The value must be too large (a positive value of large magnitude or positive infinity), and the result of the first step is the largest representable value of type int or long and you get 0x7FFFFFF = 2147483647
But when you try to convert 4294967296L = 0x100000000, you start from an integral type, so the rule is : A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits so if n is less than 32 (8 bytes) you just get a 0.

Why is the maximum capacity of a Java HashMap 1<<30 and not 1<<31?

Why is the maximum capacity of a Java HashMap 1<<30 and not 1<<31, even though the max value of an int is 231-1? The maximum capacity is initialized as static final int MAXIMUM_CAPACITY = 1 << 30;
Java uses signed integers which means the first bit is used to store the sign of the number (positive/negative).
A four byte integer has 32 bits in which the numerical portion may only span 31 bits due to the signing bit. This limits the range of the number to 2^31 - 1 (due to inclusion of 0) to - (2^31).
While it would be possible for a hash map to handle quantities of items between 2^30 and 2^31-1 without having to use larger integer types, writing code which works correctly even near the upper limits of a language's integer types is difficult. Further, in a language which treats integers as an abstract algebraic ring that "wraps" on overflow, rather than as numbers which should either yield numerically-correct results or throw exceptions when they cannot do so, it may be hard to ensure that there aren't any cases where overflows would cause invalid operations to go undetected.
Specifying an upper limit of 2^30 or even 2^29, and ensuring correct behavior on things no larger than that, is often much easier than trying to ensure correct behavior all the way up to 2^31-1. Absent a particular reason to squeeze out every last bit of range, it's generally better to use the simpler approach.
By default, the int data type is a 32-bit signed two's complement integer, which has a minimum value of -2^31 and a maximum value of (2^31)-1, ranges from –2,147,483,648 to 2,147,483,647.
The first bit is reserved for the sign bit — it is 1 if the number is negative and 0 if it is positive.
1 << 30 is equal to 1,073,741,824
it's two's complement binary integer is 01000000-00000000-00000000-00000000.
1 << 31 is equal to -2,147,483,648.
it's two's complement binary integer is 10000000-00000000-00000000-00000000.
It says the maximum size to which hash-map can expand is 1,073,741,824 = 2^30.
You are thinking of unsigned, with signed upper range is (2^31)-1

Why do integers in Java integer not use all the 32 or 64 bits?

I was looking into 32-bit and 64-bit. I noticed that the range of integer values that can stored in 32 bits is ±4,294,967,295 but the Java int is also 32-bit (If I am not mistaken) and it stores values up to ±2 147 483 648. Same thing for long, it stores values from 0 to ±2^63 but 64-bit stores ±2^64 values. How come these values are different?
Integers in Java are signed, so one bit is reserved to represent whether the number is positive or negative. The representation is called "two's complement notation." With this approach, the maximum positive value represented by n bits is given by
(2 ^ (n - 1)) - 1
and the corresponding minimum negative value is given by
-(2 ^ (n - 1))
The "off-by-one" aspect to the positive and negative bounds is due to zero. Zero takes up a slot, leaving an even number of negative numbers and an odd number of positive numbers. If you picture the represented values as marks on a circle—like hours on a clock face—you'll see that zero belongs more to the positive range than the negative range. In other words, if you count zero as sort of positive, you'll find more symmetry in the positive and negative value ranges.
To learn this representation, start small. Take, say, three bits and write out all the numbers that can be represented:
0
1
2
3
-4
-3
-2
-1
Can you write the three-bit sequence that defines each of those numbers? Once you understand how to do that, try it with one more bit. From there, you imagine how it extends up to 32 or 64 bits.
That sequence forms a "wheel," where each is formed by adding one to the previous, with noted wraparound from 3 to -4. That wraparound effect (which can also occur with subtraction) is called "modulo arithemetic."
In 32 bit you can store 2^32 values. If you call these values 0 to 4294967295 or -2147483648 to +2147483647 is up to you. This difference is called "signed type" versus "unsigned type". The language Java supports only signed types for int. Other languages have different types for an unsigned 32bit type.
NO laguage will have a 32bit type for ±4294967295, because the "-" part would require another bit.
That's because Java ints are signed, so you need one bit for the sign.

Basic question on Java's int

Why does the below code prints 2147483647, the actual value being 2147483648?
i = (int)Math.pow(2,31) ;
System.out.println(i);
I understand that the max positive value that a int can hold is 2147483647. Then why does a code like this auto wraps to the negative side and prints -2147483648?
i = (int)Math.pow(2,31) +1 ;
System.out.println(i);
i is of type Integer. If the second code sample (addition of two integers) can wrap to the negative side if the result goes out of the positive range,why can't the first sample wrap?
Also ,
i = 2147483648 +1 ;
System.out.println(i);
which is very similar to the second code sample throws compile error saying the first literal is out of integer range?
My question is , as per the second code sample why can't the first and third sample auto wrap to the other side?
For the first code sample, the result is narrowed from a double to an int. the JLS 5.1.3 describes how narrowing conversions for doubles to ints are performed.
The relevant part is:
The value must be too large (a
positive value of large magnitude or
positive infinity), and the result of
the first step is the largest
representable value of type int or
long.
This is why 2^31 (2147483648) is reduced to Integer.MAX_VALUE (2147483647). The same is true for
i = (int)(Math.pow(2,31)+100.0) ; // addition note the parentheses
and
i = (int)10000000000.0d; // == 2147483647
When the addition is done without parentheses, as in your second example, we are then dealing with integer addition. Integral types use 2's complement to represent values. Under this scheme adding 1 to
0x7FFFFFFF (2147483647)
gives
0x80000000
Which is 2's complement for -2147483648. Some languages perform overflow checking for arithmetic operations (e.g. Ada will throw an exception). Java, with it's C heritage does not check for overflow. CPUs typically set an overflow flag when an arithmetic operation overflows or underflows. Language runtimes can check this flag, although this introduces additional overhead, which some feel is unnecessary.
The third example doesn't compile since the compiler checks literal values against the range of their type, and gives a compiler error for values out of range. See JLS 3.10.1 - Integer Literals.
Then why does a code like this auto wraps to the negative side and prints -2147483648?
This is called overflow. Java does it because C does it. C does it because most processors do it. In some languages this does not happen. For example some languages will throw an exception, in others the type will change to something that can hold the result.
My question is , as per the second code sample why can't the first and third sample auto wrap to the other side?
Regarding the first program: Math.pow returns a double and does not overflow. When the double is converted to an integer it is truncated.
Regarding your third program: Overflow is rarely a desirable property and is often a sign that your program is no longer working. If the compiler can see that it gets an overflow just from evaluating a constant that is almost certainly an error in the code. If you wanted a large negative number, why would you write a large positive one?

Categories