I have two integer values stored in a ByteBuffer, in little-endian format. These integers are actually the 32-bit halves of a long. I have to store them as a class's member variables, loBits and hiBits.
This is what I did:
long loBits = buffer.getInt(offset);
long hiBits = buffer.getInt(offset + Integer.BYTES);
I want to know why directly assigning signed int to long is wrong. I kind of know what's going on, but would really appreciate an explanation.
The int I read from the buffer is signed (because Java). If it is negative, then directly assigning it to a long (or casting it with (long)) fills all the higher-order bits of the long with the value of the sign bit.
For example, the hex representation of the int -1684168480 is 9b9da0e0. If I assign this int to a long, all of the higher-order 32 bits become 1 (F in hex).
int negativeIntValue = -1684168480;
long val1 = negativeIntValue;
long val2 = (long) negativeIntValue;
Hex representation of:
negativeIntValue is 0x9b9da0e0
val1 is 0xffffffff9b9da0e0
val2 is 0xffffffff9b9da0e0
However, if I mask the negativeIntValue with 0x00000000FFFFFFFFL, I get a long which has the same hex representation as negativeIntValue and a positive long value of 2610798816.
So my questions are:
Is my understanding correct?
Why does this happen?
Yes, your understanding is correct (at least if I understood your understanding correctly).
The reason this happens is that (most) computers use two's complement to store signed values. So when a smaller datatype is assigned to a larger one, the value is sign-extended, meaning the excess bits of the larger datatype are filled with 0s or 1s depending on whether the original value was positive or negative.
Also related is the difference between the >> and >>> operators in Java. The first one sign-extends (keeping negative values negative); the second one does not (shifting a negative value makes it positive).
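For illustration, here is a minimal, self-contained sketch (the class name is only illustrative) showing sign extension, the masking fix, and the two shift operators, using the value from the question:

public class SignExtensionDemo {
    public static void main(String[] args) {
        int negative = 0x9b9da0e0;                 // -1684168480 as a signed int

        long widened = negative;                   // sign-extended when widening to long
        long masked  = negative & 0xFFFFFFFFL;     // zero-extended instead

        System.out.println(Long.toHexString(widened)); // ffffffff9b9da0e0
        System.out.println(Long.toHexString(masked));  // 9b9da0e0 (decimal 2610798816)

        System.out.println(Integer.toHexString(negative >> 4));  // f9b9da0e (sign-extending shift)
        System.out.println(Integer.toHexString(negative >>> 4)); // 9b9da0e  (zero-filling shift, leading zero dropped by toHexString)
    }
}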
The reason for this is that negative values are stored as two's complement.
Why do we use two's complement?
In a fixed-width numbering system, what happens if you subtract 1 from 0?
0000b - 0001b -> 1111b
And what is the next number below 0? It is -1.
Therefore we treat a binary number with all bits set (for a signed datatype) as -1.
The big advantage is that the CPU does not need any special operation when crossing from positive to negative numbers; it handles 5 - 3 the same way as 3 - 5.
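A quick way to see both points from Java (a minimal sketch that just prints bit patterns; the class name is only illustrative):

public class TwosComplementDemo {
    public static void main(String[] args) {
        // 0 - 1 wraps around to all bits set, which is interpreted as -1
        System.out.println(Integer.toBinaryString(0 - 1)); // 11111111111111111111111111111111

        // The same addition/subtraction circuitry handles both orderings; only the interpretation differs
        System.out.println(Integer.toBinaryString(5 - 3)); // 10
        System.out.println(Integer.toBinaryString(3 - 5)); // 11111111111111111111111111111110
    }
}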
Related
When I do
Long.parseUnsignedLong("FBD626CC4961A4FC", 16)
I get back -300009666327239428
Which seems wrong, since the meaning of unsigned long according to this answer https://stackoverflow.com/a/2550367/1754020 is that the range is always positive.
To get the correct number from this HEX value I do
BigInteger value = new BigInteger("FBD626CC4961A4FC", 16);
When I print value it prints the correct number, but if I do value.longValue() I again get -300009666327239428. Is this because the number is too big and overflows?
Java 8 does (somewhat) support unsigned longs; however, you can't just print them directly. Doing so will give you the result that you saw.
If you have an unsigned long
Long number = Long.parseUnsignedLong("FBD626CC4961A4FC", 16);
you can get the correct string representation with the function
String numberToPrint = Long.toUnsignedString(number);
If you now print numberToPrint you get
18146734407382312188
To be more exact, your number is still going to be a regular signed long, which is why it looks overflowed if printed directly. However, there are new static functions that treat the value as if it were unsigned, such as Long.toUnsignedString(long x) or Long.compareUnsigned(long x, long y).
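For example (a small sketch using the value from this question; the class name is only illustrative), compare the signed view with the unsigned-aware one:

public class UnsignedCompareDemo {
    public static void main(String[] args) {
        long x = Long.parseUnsignedLong("FBD626CC4961A4FC", 16); // stored as -300009666327239428

        // The signed view sees a negative number ...
        System.out.println(x < 0L); // true
        // ... but the unsigned view treats it as 18146734407382312188, larger than Long.MAX_VALUE
        System.out.println(Long.compareUnsigned(x, Long.MAX_VALUE) > 0); // true
        System.out.println(Long.toUnsignedString(x)); // 18146734407382312188
    }
}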
The hexadecimal number "FBD626CC4961A4FC", converted to decimal, is exactly 18146734407382312188. That number is indeed larger than the maximum possible long, defined as Long.MAX_VALUE, which is equal to 2^63-1, or 9223372036854775807:
System.out.println(new BigInteger("FBD626CC4961A4FC", 16)); // 18146734407382312188
System.out.println(Long.MAX_VALUE); // 9223372036854775807
As such, it's normal that you get back a negative number.
You do not get an exception, because it is exactly the purpose of those new *Unsigned* methods added in Java 8 to give the ability to handle unsigned longs (like compareUnsigned or divideUnsigned). Since the type long in Java is still signed, those methods work by interpreting negative values as values greater than MAX_VALUE: they simulate an unsigned long. parseUnsignedLong says:
An unsigned integer maps the values usually associated with negative numbers to positive numbers larger than MAX_VALUE.
If you print a long that was the result of parseUnsignedLong, and it is negative, all it means is that the value is greater than the max long value as defined by the language, but that methods taking unsigned longs as parameter will correctly interpret those values, as if they were greater than the max value. As such, instead of printing it directly, if you pass that number to toUnsignedString, you'll get the right output, like shown in this other answer. Not all of these methods are new to Java 8, for example toHexString also interprets the given long as an unsigned long in base 16, and printing Long.toHexString(Long.parseUnsignedLong("FBD626CC4961A4FC", 16)) will give you back the right hex String.
parseUnsignedLong will throw an exception only when the value cannot be represented as an unsigned long at all, i.e. it is not a number, or it is greater than 2^64-1 (and not 2^63-1, which is the maximum value for a signed long).
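For instance (a minimal sketch; the class name is only illustrative), 2^64-1 parses fine while 2^64 is rejected:

public class ParseUnsignedDemo {
    public static void main(String[] args) {
        // 2^64-1, the largest value an unsigned 64-bit integer can hold: parses fine
        System.out.println(Long.toUnsignedString(
                Long.parseUnsignedLong("FFFFFFFFFFFFFFFF", 16))); // 18446744073709551615

        try {
            // 2^64 does not fit in 64 bits, so this throws
            Long.parseUnsignedLong("10000000000000000", 16);
        } catch (NumberFormatException e) {
            System.out.println("too large for an unsigned long: " + e.getMessage());
        }
    }
}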
Yes, it overflows when you try to print it, as it is converted to the Java long type. To understand why, let's take the log2 of your decimal value.
First, the original value is 18146734407382312188. Its log2 is ~63.9763437545.
Second, look into the documentation: in Java the long type has a minimum value of -2^63 and a maximum value of 2^63-1.
So, your value is obviously greater than 2^63-1, hence it overflows:
-2^63 + (18146734407382312188 - 2^63) = -300009666327239428
But as @Keiwan brilliantly mentioned, you can still print the proper value using Long.toUnsignedString(number);
Internally, unsigned and signed numbers are represented the same way, i.e. as 8 bytes in the case of a long. The difference is only in how the "sign" bit is interpreted; if you did the same in a C/C++ program and stored your value into a uint64_t, then cast/mapped it to a signed int64_t, you would get the same result.
Since the maximum value that 8 bytes, or 64 bits, can hold is 2^64-1, that's the hard constraint for such numbers. Also, Java doesn't directly support unsigned numbers, so the only way to store an unsigned long in a long is to let values higher than the signed Long.MAX_VALUE wrap into the negative range. In fact, Java doesn't know whether the string/hex code you're reading is meant to represent a signed or an unsigned long, so it's up to you to provide that interpretation, either by converting back to a string or by using a larger datatype such as BigInteger.
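One way to see the two interpretations side by side (a minimal sketch with BigInteger, as suggested above; the class name is only illustrative):

import java.math.BigInteger;

public class UnsignedLongView {
    public static void main(String[] args) {
        long raw = Long.parseUnsignedLong("FBD626CC4961A4FC", 16);

        // Signed interpretation: what Java prints directly
        System.out.println(raw); // -300009666327239428

        // Unsigned interpretation: undo the 2^64 wraparound for negative values
        BigInteger unsigned = BigInteger.valueOf(raw);
        if (unsigned.signum() < 0) {
            unsigned = unsigned.add(BigInteger.ONE.shiftLeft(64)); // add 2^64
        }
        System.out.println(unsigned); // 18146734407382312188
    }
}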
Here: http://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.2.3
it says that:
The finite nonzero values of any floating-point value set can all be expressed in the form s · m · 2^(e - N + 1), where s is +1 or -1, m is a positive integer less than 2^N, and e is an integer between Emin = -(2^(K-1)-2) and Emax = 2^(K-1)-1, inclusive, and where N and K are parameters that depend on the value set.
and there is a table below:
Parameter float
N 24
K 8
So let's say N = 24 and K = 8 then we can have the following value from the formula:
s · 2^N · 2^(2^(K-1)-1 - N + 1), which, using the values specified in the table, gives us:
s · 2^24 · 2^(127 - 24 + 1), which is equal to s · 2^128. But float has only 32 bits, so it's not possible to store such a big number in it.
So it's obvious that the initial formula should be read in a different way. How, then?
Also in javadoc for Float max value: http://docs.oracle.com/javase/7/docs/api/java/lang/Float.html#MAX_VALUE
it says:
A constant holding the largest positive finite value of type float, (2-2^-23)·2^127
This also doesn't make sense, as the resulting value is much larger than 2^32, which is presumably the biggest value that can be stored in a float variable. So again, I'm misreading this notation. How should it be read?
The idea with the floating point notation is to store a much larger range of numbers than can be stored in the same space (bytes) with the integer representation. So, for example, you say that the "resulting value is much larger than 2^32". But, that would only be a problem if we're storing a typical binary number as one computes in a typical math class.
Instead, floating-point representations break those 32 bits into two main parts:
- significand
- exponent
For simplicity, imagine that 3 bytes are used for the significand and 1 byte for the exponent. Also assume that each of these is your typical binary integer style of representation. So, the three bytes can hold values up to about 2^24, or 2^23 if you want to keep one bit for the sign.
The other byte can hold exponents up to about 2^7 (if you want a sign there too).
So, you could express a number like 500 · 2^100 by storing the 500 in the three bytes and the 100 in the one byte.
Essentially, one cannot store every number precisely. One converts it into significand-and-exponent form, and one can keep only as many significant digits as fit in the portion reserved for the significand (3 bytes in this example).
Rather than try to explain the complications, check this Wikipedia article for more.
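If you want to poke at the actual layout from Java, here is a minimal sketch (the bit positions are the standard IEEE 754 single-precision ones, and the class name is only illustrative). Float.MAX_VALUE is exactly the (2-2^-23)·2^127 value quoted in the question:

public class FloatBitsDemo {
    public static void main(String[] args) {
        float f = Float.MAX_VALUE; // (2 - 2^-23) * 2^127

        int bits = Float.floatToIntBits(f);
        int sign     = bits >>> 31;          // 1 sign bit
        int exponent = (bits >>> 23) & 0xFF; // 8 exponent bits, biased by 127
        int mantissa = bits & 0x7FFFFF;      // 23 explicitly stored significand bits

        System.out.println(sign);                          // 0
        System.out.println(exponent - 127);                // 127 (unbiased exponent)
        System.out.println(Integer.toHexString(mantissa)); // 7fffff (all significand bits set)
    }
}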
Why is the maximum capacity of a Java HashMap 1<<30 and not 1<<31, even though the max value of an int is 2^31-1? The maximum capacity is initialized as static final int MAXIMUM_CAPACITY = 1 << 30;
Java uses signed integers which means the first bit is used to store the sign of the number (positive/negative).
A four-byte integer has 32 bits, of which only 31 can be used for the magnitude because of the sign bit. This limits the range of the number to -(2^31) through (2^31) - 1 (the asymmetry comes from the inclusion of 0).
While it would be possible for a hash map to handle quantities of items between 2^30 and 2^31-1 without having to use larger integer types, writing code which works correctly even near the upper limits of a language's integer types is difficult. Further, in a language which treats integers as an abstract algebraic ring that "wraps" on overflow, rather than as numbers which should either yield numerically-correct results or throw exceptions when they cannot do so, it may be hard to ensure that there aren't any cases where overflows would cause invalid operations to go undetected.
Specifying an upper limit of 2^30 or even 2^29, and ensuring correct behavior on things no larger than that, is often much easier than trying to ensure correct behavior all the way up to 2^31-1. Absent a particular reason to squeeze out every last bit of range, it's generally better to use the simpler approach.
By default, the int data type is a 32-bit signed two's complement integer, which has a minimum value of -2^31 and a maximum value of (2^31)-1; it ranges from -2,147,483,648 to 2,147,483,647.
The first bit is reserved for the sign bit — it is 1 if the number is negative and 0 if it is positive.
1 << 30 is equal to 1,073,741,824
Its two's complement binary representation is 01000000-00000000-00000000-00000000.
1 << 31 is equal to -2,147,483,648.
Its two's complement binary representation is 10000000-00000000-00000000-00000000.
This means the maximum size to which a HashMap can expand is 1,073,741,824 = 2^30.
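You can see the difference directly (a minimal sketch; the class name is only illustrative):

public class ShiftDemo {
    public static void main(String[] args) {
        System.out.println(1 << 30); //  1073741824, a valid positive capacity
        System.out.println(1 << 31); // -2147483648, the shifted 1 landed on the sign bit

        System.out.println(Integer.toBinaryString(1 << 30).length()); // 31 bits used
        System.out.println(Integer.toBinaryString(1 << 31).length()); // 32 bits used, including the sign bit
    }
}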
You are thinking of unsigned; with signed, the upper range is (2^31)-1.
I would like one of the columns in a specific table to never contain negative numbers.
Is there a way to declare an unsigned int in derby DB ?
Q: Is there a way to declare a column "unsigned int" in a Derby DB table?
A: I believe answer is "No":
http://db.apache.org/derby/docs/10.0/manuals/reference/sqlj124.html
... HOWEVER ...
You should easily be able to "CAST" the stored value in any query.
The answer is no. Java (unfortunately) has no support for unsigned arithmetic.
But you can still use Java's int and make your own methods to do unsigned arithmetic, like this. Even better is to create an UnsignedInteger class and let it handle all that arithmetic.
I don't know if it's worth it though. You can just use long as jtahlborn suggests.
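That said, if you are on Java 8 or later, the Integer class already ships static helpers for this kind of unsigned arithmetic, which may save you from writing your own (a minimal sketch; the class name and values are only illustrative):

public class UnsignedIntDemo {
    public static void main(String[] args) {
        int a = 0xFFFFFFFE; // -2 as signed, 4294967294 when viewed as unsigned
        int b = 3;

        System.out.println(Integer.toUnsignedLong(a));          // 4294967294
        System.out.println(Integer.toUnsignedString(a));        // 4294967294
        System.out.println(Integer.compareUnsigned(a, b) > 0);  // true: 4294967294 > 3
        System.out.println(Integer.divideUnsigned(a, b));       // 1431655764
    }
}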
Java does NOT support the type unsigned int, unlike C or C++.
On the other hand, there is a simple way to circumvent this limitation by using:
An intermediate long variable/literal
A cast to (int)
First, let us note that both signed and unsigned int have the same number of bits; they are both 32 bits in size.
The main difference is that for signed integers the MSB (Most Significant Bit), bit 31, is used to indicate whether the integer is negative or positive.
If this bit is set to 1 the integer is negative; if it is set to 0, the integer is positive. So, you end up with 31 bits for the value of the integer and 1 bit for the sign.
For unsigned integers, all the 32 bits are used to represent the value of the integer. You can now have larger numbers.
The signed integers range from -2^31 to (2^31 - 1)
The unsigned integers range from 0 to (2^32 - 1)
To show the size limitation issue, try to compile the following snippet:
public static void main(String[] args) {
    int i = 2_147_483_648; // This line will generate a compiler error
}
The compiler will complain saying integer number too large.
Let us see how we can circumvent this and still store this number in a Java integer, and retrieve it back.
The idea behind the proposed solution stems from the fact that a series of bits is interpreted to mean something. It could be a representation of an image, a Java object, a C struct, a character, etc.
You "interpret" the series of bits based on what you "expect" that series of bits to represent. If you are expecting a character, you might be looking to match the series against an ASCII table. If you are expecting an image, you might be looking to decode a JPEG.
Now, and coming back to integers, if you are looking for signed integers you will interpret the MSB as the sign bit. If you are looking for unsigned integers you will interpret the MSB as part of the value.
To give an example, let us assume that you have the following series of 32 bits:
in hex 0x8000_0000 or in binary 0b1000_0000_0000_0000_0000_0000_0000_0000
The MSB is set to 1; all other bits are 0.
If you are looking for/expecting a signed integer these 32 bits would be interpreted as the representation of the negative decimal number -2_147_483_648
If you are looking for/expecting an unsigned integer these same 32 bits would be interpreted as the representation of the positive decimal number 2_147_483_648
Hence, the solution would be to store the values in a signed 32-bit integer and interpret them as an unsigned 32-bit integer.
Here is how we will modify the previous snippet and save 2_147_483_648 into an integer value and be able to print it correctly.
To do so, we will:
First, use a long intermediate datatype/value
Then cast that value to int to save it
Finally, mask/ignore the extra bits with & and display it
public static void main(String[] args) {
    // 1. Cast the 64-bit long literal down to a 32-bit int (the low 32 bits are kept)
    int i = (int) 2_147_483_648L;
    // 2. Mask away the sign-extended upper bits when widening back to long, then print
    System.out.println(i & 0x0000_0000_FFFF_FFFFL);
}
A long variable holds 64 bits, so its MSB, or sign bit, is not touched by the lower 32 bits we care about. You will notice a bitwise operation happening here:
i & 0x0000_0000_FFFF_FFFFL
This masks away, i.e. ignores, all the bits beyond the first 32 bits we are interested in. Our integer is 32 bits wide.
If you want to do some arithmetic operation on the int value before saving it, you could resort again to a long intermediate variable.
Here is how, and this will be the proposed solution:
public static void main(String[] args) {
    int i = (int) 2_147_483_648L;
    int j = 1_000_000;
    long tempL = (i & 0x0000_0000_FFFF_FFFFL) + j; // use a temp long to perform the operations on the unsigned value
    i = (int) tempL;  // we save back into i and store it in a DB, for example
    // Now we use the saved value and display it, for example
    System.out.println(i & 0x0000_0000_FFFF_FFFFL);
}
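On Java 8 and later you can also skip the manual masking and let the unsigned helpers on Integer do the interpretation (a sketch of the same idea; the class name is only illustrative):

public class UnsignedDisplayDemo {
    public static void main(String[] args) {
        int i = (int) 2_147_483_648L; // bit pattern 0x8000_0000

        System.out.println(Integer.toUnsignedString(i)); // 2147483648
        System.out.println(Integer.toUnsignedLong(i));   // 2147483648

        // Unsigned arithmetic without an intermediate long variable:
        // int addition wraps modulo 2^32, which is exactly unsigned addition
        System.out.println(Integer.toUnsignedString(i + 1_000_000)); // 2148483648
    }
}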
Good luck and hope the above helps!
I was looking into 32-bit and 64-bit. I noticed that the range of integer values that can be stored in 32 bits is ±4,294,967,295, but the Java int is also 32-bit (if I am not mistaken) and it only stores values up to ±2,147,483,648. Same thing for long: it stores values from 0 to ±2^63, but 64 bits store ±2^64 values. How come these values are different?
Integers in Java are signed, so one bit is reserved to represent whether the number is positive or negative. The representation is called "two's complement notation." With this approach, the maximum positive value represented by n bits is given by
(2 ^ (n - 1)) - 1
and the corresponding minimum negative value is given by
-(2 ^ (n - 1))
The "off-by-one" aspect to the positive and negative bounds is due to zero. Zero takes up a slot, leaving an even number of negative numbers and an odd number of positive numbers. If you picture the represented values as marks on a circle—like hours on a clock face—you'll see that zero belongs more to the positive range than the negative range. In other words, if you count zero as sort of positive, you'll find more symmetry in the positive and negative value ranges.
To learn this representation, start small. Take, say, three bits and write out all the numbers that can be represented:
0
1
2
3
-4
-3
-2
-1
Can you write the three-bit sequence that defines each of those numbers? Once you understand how to do that, try it with one more bit. From there, you imagine how it extends up to 32 or 64 bits.
That sequence forms a "wheel," where each value is formed by adding one to the previous, with the noted wraparound from 3 to -4. That wraparound effect (which can also occur with subtraction) is called "modulo arithmetic."
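Here is a quick way to print that three-bit wheel from Java (a minimal sketch; Java has no 3-bit type, so the values are masked down to 3 bits for display, and the class name is only illustrative):

public class ThreeBitWheel {
    public static void main(String[] args) {
        // Walk the wheel: 0, 1, 2, 3, then wrap around to -4, -3, -2, -1
        for (int value : new int[] {0, 1, 2, 3, -4, -3, -2, -1}) {
            // Keep only the low 3 bits and left-pad with zeros for display
            String bits = String.format("%3s", Integer.toBinaryString(value & 0b111)).replace(' ', '0');
            System.out.println(bits + " -> " + value);
        }
    }
}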
In 32 bits you can store 2^32 values. Whether you call these values 0 to 4294967295 or -2147483648 to +2147483647 is up to you. This difference is called "signed type" versus "unsigned type". The Java language supports only signed types for int. Other languages have a separate type for an unsigned 32-bit integer.
NO language will have a 32-bit type for ±4294967295, because the "-" part would require another bit.
That's because Java ints are signed, so you need one bit for the sign.