buff.getInt() & 0xffffffffL is an identity?


Here is some code I have been looking at:
public static long getUnsignedInt(ByteBuffer buff) {
return (long) (buff.getInt() & 0xffffffffL);
}
Is there any reason to do buff.getInt() & 0xffffffffL (0xffffffffL has 32 bits of 1's in the 32 least significant bits)? It looks to me like the result will always be buff.getInt().

In short, it's because the method needs to convert a signed int (which all Java ints are) to an unsigned quantity.
If you were to just do (long) buff.getInt(), and buff.getInt() returned -1, you'd end up with -1. And that's a signed quantity -- not what the method is supposed to return.
So the method forces the result of buff.getInt() to be treated as unsigned by ANDing the int bits with 0x00000000FFFFFFFF. This effectively "reinterprets" the bits of the signed int as an unsigned int (really a signed long, but as only the lower 32 bits are ever going to be set, it works as an unsigned int), producing the desired result.
For example, (working with bytes for brevity).
Say buff.getInt() is really buff.getByte(), and returns -1 == 0xFF
Try to cast that to an int, you'll end up with 0xFFFFFFFF -- still -1, due to the magic of sign extension.
However, mask that with 0xFF, and you'll end up with 0x000000FF == 255 -- the desired value.
I believe the explicit (long) cast is unnecessary (the code behaves the same without it on my machine), but I could be missing something...
Edit: Turns out the cast actually is unnecessary. From JLS section 5.6.2:
Widening primitive conversion (§5.1.2) is applied to convert either or both operands as specified by the following rules:
If either operand is of type double, the other is converted to double.
Otherwise, if either operand is of type float, the other is converted to float.
Otherwise, if either operand is of type long, the other is converted to long.
Otherwise, both operands are converted to type int.
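To make the difference concrete, here is a minimal sketch of the method from the question without the redundant cast, run against a buffer that holds two copies of -1. The class name UnsignedIntDemo and the main method are only for illustration and are not part of the original code.

```java
import java.nio.ByteBuffer;

class UnsignedIntDemo {
    // Reads 4 bytes and returns them as a value in the range 0..4294967295.
    static long getUnsignedInt(ByteBuffer buff) {
        // The long literal promotes the int to long (with sign extension)
        // before the AND is applied, so no explicit cast is needed.
        return buff.getInt() & 0xffffffffL;
    }

    public static void main(String[] args) {
        ByteBuffer buff = ByteBuffer.allocate(8);
        buff.putInt(-1).putInt(-1);
        buff.flip();
        System.out.println((long) buff.getInt()); // -1
        System.out.println(getUnsignedInt(buff)); // 4294967295
    }
}
```

The plain cast keeps the value -1, while the masked version yields the unsigned interpretation of the same 32 bits.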

If buff.getInt() is a negative number, it will still be a negative number when cast to a long, because the cast sign-extends it.
So, if you want to preserve the bit pattern, e.g. to interpret the values as unsigned values, & 0xffffffffL will mask off the sign-extended upper bits.
e.g. if buff.getInt() returns -2147483648, the returned int will have a bit pattern of 0x80000000. Cast to long, that'll still be -2147483648, but with a bit pattern of 0xffffffff80000000. 0xffffffff80000000 & 0xffffffffL preserves the original bit pattern of 0x80000000, which as a long is 2147483648.

Integers in Java (int and long) are signed, two's complement numbers. That means the high bit is 0 if the number is positive (or 0), and 1 if it's negative.
To convert from an int to a long, Java uses sign extension. This means that the numbers it "fills in" to the left depend on that leftmost digit. To illustrate with going from 4-bit to 8-bit numbers:
1 = 0001 -> 0000 0001
-1 = 1111 -> 1111 1111
So if you want to interpret 1111 as an unsigned number (ie, 15), you can't just convert it from 4 bits to 8: you'd get 1111 1111, which is -1. What you want is 0000 1111, which is 15.
To do this, you need to do a bitwise AND on the number to mask out the high bits which have been filled in, and turn those into 0's.
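The same thing can be observed for the real int-to-long case by printing the bit patterns in hex. This is only a sketch; the class and method names (SignExtensionDemo, signExtended, zeroExtended) are made up for the example.

```java
class SignExtensionDemo {
    // Plain widening: the sign bit is copied into the upper 32 bits.
    static long signExtended(int i) {
        return i;
    }

    // Widening followed by a mask: the upper 32 bits are cleared.
    static long zeroExtended(int i) {
        return i & 0xffffffffL;
    }

    public static void main(String[] args) {
        int i = Integer.MIN_VALUE; // bit pattern 0x80000000
        System.out.println(Long.toHexString(signExtended(i))); // ffffffff80000000
        System.out.println(Long.toHexString(zeroExtended(i))); // 80000000
        System.out.println(signExtended(i));                   // -2147483648
        System.out.println(zeroExtended(i));                   // 2147483648
    }
}
```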

Related

Can Java's byte actually store 32bit?

My goal is to understand how byte is stored in Java.
System.out.println("(byte) 0xFF:\r\n" +
Integer.toBinaryString((byte) 0xFF));
My expected result of (byte) 0xFF is 0xFF.
My actual result of (byte) 0xFF is 0xFFFFFFFF
The output:
(byte) 0xFF:
11111111111111111111111111111111
If this is true, is storing a negative number in a byte actually no different from storing a negative number in an int?

toBinaryString accepts an int. Your input was just autopromoted to int, hence it was represented with 32 digits (0xFF == -1 in two's complement, which promoted to int becomes 0xFFFFFFFF which is still -1 but represented with 32 bits, still in two's complement).
Notice that
If the unsigned magnitude is zero, it is represented by a single zero character '0' ('\u0030'); otherwise, the first character of the representation of the unsigned magnitude will not be the zero character.
Which means that if there are leading 0s they won't be part of the output (unless the output is 0), which means you'll get fewer than 32 digits.

When the byte is implicitly widened to an int, its first bit is treated as a sign bit, and when it is stretched out to 4 bytes it retains its negative value. Nothing overflows: the value is still -1, which as an int has all 32 bits set.

Java bytes are signed and can represent the values -128 to 127, and a hex value of 0xff is treated as -1. When you call Integer.toBinaryString, the byte is cast to an int, preserving the sign. The way this works is called sign extension, and the highest bit from the byte is copied all the way up. This is why you see 0xfffff...
To perform an unsigned conversion, mask the value with 0xff, which is itself an int unless specified otherwise.
byte b = (byte) 0xff;
Integer.toBinaryString(b & 0xff);
And to answer your original question, Java doesn't really support a byte type as you might expect. It's always a 32-bit value except when dealing with byte arrays and byte buffers. Using a type of byte simply informs the JVM to perform certain type casting rules which have the effect of clearing or setting the upper bits accordingly.
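If the goal is to see exactly the 8 bits of a byte, one possible approach is to mask with 0xff and then pad the result on the left, since toBinaryString drops leading zeros. The helper name toBits below is hypothetical, and this is just a sketch of the idea.

```java
class ByteBitsDemo {
    // Returns the 8 bits of b as a string, e.g. "00000001" for 1.
    static String toBits(byte b) {
        // Mask first so that sign extension does not leak into the output.
        String s = Integer.toBinaryString(b & 0xff);
        // toBinaryString omits leading zeros, so pad back to 8 characters.
        return String.format("%8s", s).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(toBits((byte) 0xFF)); // 11111111
        System.out.println(toBits((byte) 1));    // 00000001
        System.out.println(toBits((byte) -128)); // 10000000
    }
}
```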

How does overflow work in java?

I've read about overflow, I know that "Overflow is when a number is so large that it will no longer fit within the data type, so the system “wraps around” to the next lowest value and counts up from there".
For example:
short s = (short)1921222; // Stored as 20678
In that example we started counting from -32768 (Short.MIN_VALUE), but when I try to verify this with other integer data types, it doesn't seem to work the same way...
byte b = (byte)400; // Stored as -112
In the example above, counting from 0 was the only way I found to get -112.
I don't know if I am doing something wrong.

The Java Language Specification says:
The integral types are byte, short, int, and long, whose values are 8-bit, 16-bit, 32-bit and 64-bit signed two's-complement integers, respectively, and char, whose values are 16-bit unsigned integers representing UTF-16 code units.
So, short and byte are both two's complement integers.
short is 16 bits, meaning it can hold 2^16 = 65536 different values. After the 65536th value, it overflows.
1921222 modulo 65536 is 20678 . This is less than 32768 (2^15, the turning point for the two's complement) so we keep a positive number.
byte is 8 bits, meaning it can hold 2^8 = 256 different values. This one overflows after the 256th value.
400 modulo 256 is 144. This value is at least 128, the turning point for the two's complement, so it is interpreted as a negative two's complement number: 144 - 256 = -112.

The cast is truncating the number. (JLS)
0000 0001 1001 0000
loses the high byte to become
1001 0000
which is -112.

In Java, the byte primitive type is an 8-bit signed integer, which is why you got -112 from:
byte b = (byte) 400;
You can avoid that and get its unsigned value (144 here) by bitwise ANDing it with 0xFF like this:
int b = (byte) 400 & 0xFF;
For further details you can check:
Java Primitive data types Documentation.
How to Convert Int to Unsigned Byte and Back

In addition to the other answers, you can get to that answer by manual calculation as well.
In Java, the data type byte is an 8-bit, signed integer. So the values are in the interval [-128, 127]. If you have a value of 400 and you want to see the actual value for that type, you can subtract the size of the interval from that number until you reach a value that's inside the interval.
As I said, byte is 8 bit, so the size of the interval is 256. Subtract that from your initial value: 400 - 256 = 144. This value is still outside of the interval so you have to subtract again: 144 - 256 = -112. This value is now inside the interval and is indeed the value you've seen in your test.
The same is true for your first example: short is 16 bit and signed, so the interval is [-32768, 32767] with size 65536. Doing repeated subtraction from the value 1921222 will eventually give you the value 20678 as seen in your test.
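The manual calculation can be checked against the real casts with a small helper. Note that this sketch uses a modulo (Math.floorMod) rather than literally repeating the subtraction, which gives the same result in one step; the name wrap is hypothetical.

```java
class WrapDemo {
    // Wraps value into the signed range of a two's complement type
    // with the given number of bits, e.g. [-128, 127] for bits = 8.
    static long wrap(long value, int bits) {
        long size = 1L << bits;              // size of the interval, e.g. 256
        long half = size >> 1;               // turning point, e.g. 128
        long r = Math.floorMod(value, size); // remainder in 0..size-1
        return r >= half ? r - size : r;     // upper half maps to negatives
    }

    public static void main(String[] args) {
        System.out.println(wrap(400, 8));       // -112
        System.out.println((byte) 400);         // -112
        System.out.println(wrap(1921222, 16));  // 20678
        System.out.println((short) 1921222);    // 20678
    }
}
```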

java - Why is 0x000F stored as unsigned?

I was reading through examples trying to understand how to convert signed bytes to unsigned integer counter parts.
The most popular method that I have come across is:
a & 0xFF
Where a is the signed byte.
My question is why is 0xFF stored as unsigned? Are all hex values stored as unsigned? If so why?
And how does "and"-ing turn off the sign bit in the sign integer?
It would be great if someone could break down the process step by step.

You probably saw this in code that converted a byte to an integer, where they wanted to treat the byte as an unsigned value in the range 0-255. It does not apply to integers in general. If you want to make an integer "unsigned", you can do:
int unsignedA = a & 0x7FFFFFFF;
This will ensure that unsignedA is positive - but it does that by chopping off the high bit, so for example if a was -1, then unsignedA is Integer.MAX_VALUE.
There is no way to turn a 32-bit signed Java integer into a 32-bit unsigned Java integer because there is no datatype in Java for a 32-bit unsigned integer. The only unsigned integral datatype in Java is 16 bits long: char.
If you want to store a 32-bit unsigned integral value in Java, you need to store it in a long:
long unsignedA = a & 0xFFFFFFFFL;
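A short comparison of the two masks, as a sketch with made-up method names. It also shows Integer.toUnsignedLong and Integer.toUnsignedString, which exist since Java 8 and perform the same conversion as the second mask; the answer above does not mention them, so treat that part as an addition.

```java
class UnsignedIntOptions {
    // Clears the sign bit: always non-negative, but changes the bit pattern.
    static int clearSignBit(int a) {
        return a & 0x7FFFFFFF;
    }

    // Keeps all 32 bits and stores them in a long.
    static long maskToLong(int a) {
        return a & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        int a = -1;
        System.out.println(clearSignBit(a));             // 2147483647
        System.out.println(maskToLong(a));               // 4294967295
        System.out.println(Integer.toUnsignedLong(a));   // 4294967295
        System.out.println(Integer.toUnsignedString(a)); // 4294967295
    }
}
```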

To elaborate on Erwin's answer about converting a byte to an integer: In Java, byte is a signed integer type. That means it has values in the range -128 to 127. If you say:
byte a;
int b;
a = -64;
b = a;
The language will preserve the value; that is, it will set b to -64.
But if you really want to convert your byte to a value from 0 to 255 (which I guess you call the "unsigned counterpart" of the byte value), you can use a & 0xFF. Here's what happens:
Java does not do arithmetic directly on byte or short types. So when it sees a & 0xFF, it converts both sides to an int. The hex value of a, which is a byte, looks like
a = C0
When it's converted to a 32-bit integer, the value (-64) has to be preserved, so that means the 32-bit integer has to have 1 bits in the upper 24 bits. Thus:
a = C0
(int)a = FFFFFFC0
But then you "and" it with 0xFF:
a = C0
(int)a = FFFFFFC0
& 000000FF
--------
a & FF = 000000C0
And the result is an integer in the range 0 to 255.
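The walkthrough above can be reproduced directly. In this sketch the class and method names are illustrative; Byte.toUnsignedInt is a standard library method (Java 8 and later) that does the same masking.

```java
class ByteMaskDemo {
    // Converts a signed byte to its 0..255 counterpart.
    static int toUnsigned(byte a) {
        // a is promoted to int (sign-extended) before the AND is applied.
        return a & 0xFF;
    }

    public static void main(String[] args) {
        byte a = -64;
        System.out.println(Integer.toHexString(a));        // ffffffc0
        System.out.println(Integer.toHexString(a & 0xFF)); // c0
        System.out.println(toUnsigned(a));                 // 192
        System.out.println(Byte.toUnsignedInt(a));         // 192
    }
}
```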

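In Java, a literal such as 1, 0x2A, or 0xFF is an int with a non-negative value unless you explicitly write a minus sign; that's how we intuitively write numbers. (The exception is a hex literal that sets bit 31, such as 0xFFFFFFFF, which is the int -1.) So 0xFF is simply the int 255.
This previous question answers your question about converting to unsigned: Understanding Java unsigned numbers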

Type casting into byte in Java

I am a beginner in Java. I came across a concept called Type Casting.
I have the following snippet-
class Demo
{
    public static void main(String[] args)
    {
        byte b;
        int a=257;
        double d= 323.142;
        b=(byte)a;
        System.out.println(b);
        b=(byte)d;
        System.out.println(b);
    }
}
The output for the code is 1
67
Can anybody explain the outputs to me?
Thanks in Advance!

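The byte type is encoded on 8 bits, so it takes values between -128 and 127. Casting to byte keeps only the low 8 bits, which for non-negative values is the same as taking the value modulo 256 and then subtracting 256 if the result is above 127 (a double is first truncated to an int). For your values the remainders are below 128, so the following code prints the same output:
int a = 257;
double d = 323.142;
System.out.println(a % 256);
System.out.println((int) d % 256);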

In both cases you are doing a narrowing conversion, which may result in loss of information.
Conversion of int to byte
you convert an int whose value is 257 (00000000 00000000 00000001 00000001 in binary) to a byte. Therefore, only the lowest (right) byte of the int is kept. Therefore the result is 00000001 in binary, which is 1.
Conversion of double to byte
This conversion is more complicated.
In the first step 323.142 is converted from double to int, so it becomes 323.
The second step is the same as the first conversion :
323 is 00000000 00000000 00000001 01000011 in binary.
Converting 323 to byte keeps the lowest (right) byte, which gives you 67.
Here's what the JLS says about this conversion :
A narrowing conversion of a floating-point number to an integral type T takes two steps:
In the first step, the floating-point number is converted either to a long, if T is long, or to an int, if T is byte, short, char, or int, as follows:
If the floating-point number is NaN (§4.2.3), the result of the first step of the conversion is an int or long 0.
Otherwise, if the floating-point number is not an infinity, the floating-point value is rounded to an integer value V, rounding toward zero using IEEE 754 round-toward-zero mode (§4.2.3). Then there are two cases:
a. If T is long, and this integer value can be represented as a long, then the result of the first step is the long value V.
b. Otherwise, if this integer value can be represented as an int, then the result of the first step is the int value V.
Otherwise, one of the following two cases must be true:
a. The value must be too small (a negative value of large magnitude or negative infinity), and the result of the first step is the smallest representable value of type int or long.
b. The value must be too large (a positive value of large magnitude or positive infinity), and the result of the first step is the largest representable value of type int or long.
In the second step:
If T is int or long, the result of the conversion is the result of the first step.
If T is byte, char, or short, the result of the conversion is the result of a narrowing conversion to type T (§5.1.3) of the result of the first step.
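The two steps quoted from the JLS can be written out explicitly. The sketch below (class and method names are illustrative) splits the cast into its int step and its byte step, and includes the NaN and out-of-range cases the quote describes.

```java
class NarrowDoubleDemo {
    // Mirrors (byte) d as two explicit steps.
    static byte narrow(double d) {
        // Step 1: round toward zero; NaN becomes 0, and values outside
        // the int range saturate at Integer.MIN_VALUE / Integer.MAX_VALUE.
        int step1 = (int) d;
        // Step 2: keep only the low 8 bits of the int.
        return (byte) step1;
    }

    public static void main(String[] args) {
        System.out.println(narrow(323.142));    // 67
        System.out.println(narrow(-323.9));     // -67
        System.out.println(narrow(Double.NaN)); // 0
        System.out.println(narrow(1e20));       // -1 (low byte of 0x7FFFFFFF)
        System.out.println(narrow(-1e20));      // 0  (low byte of 0x80000000)
    }
}
```

The last two lines show that a very large double does not saturate at 127 or -128: it saturates at the int limits first, and the byte step then keeps whatever the low 8 bits of that limit happen to be.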

byte b;
int a=257;
double d= 323.142;
b=(byte)a; // 257-256=1
System.out.println(b); // now b is 1
b=(byte)d; // 323.142 -> 323 (fraction dropped), then 323-256=67
System.out.println(b); // now b is 67
The byte data type is an 8-bit signed two's complement integer, so it has a range of -2^7 to 2^7-1, that is, -128 to 127. Since 257 is above 127, it wraps around to 257-256=1: 256 is added or subtracted until the value falls into range. The same happens in the second case, except that the fractional part is discarded first (byte is an integer type), so 323.142 becomes 323 and then 323-256=67.

A byte occupies 1 byte (8 bits), so it can store values in the range -128 to 127.
The binary value of 257 is 100000001. Converting to byte keeps only the 8 low-order bits, 00000001, which is 1. Because you are explicitly converting from a wider data type to a narrower one, data loss can occur. The same applies to the second conversion. Hope this helps.

Hexadecimal representation of
(257) = 0x101
(323) = 0x143
A byte stores only one byte of data, so only the low-order byte (the last two hex digits) of each representation above is kept:
b = 0x01 = 1 for 257 and b = 0x43 = 67 for the integer part of 323.142
Note: In Java a double uses a 52-bit mantissa, so it can represent any 32-bit integer without loss of data.

Because 257 = 100000001b
but when you cast it in byte you get only 8 bit of this number:
00000001b = 1

Casting of primitives type

I am beginner in Java. I cannot understand this line even after a long try.
byte num=(byte)135;
This line gives the result -121. Why is the result a signed (negative) number?
Can anyone elaborate on it?

In Java, bytes are always signed, and they are in the range -128 to 127. When the int literal 135 is cast down to a byte, the result is a negative number because the 8th bit, which is the sign bit of a byte, is set.
1000 0111
Specifically, the JLS, Section 5.1.3, states:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
When you cast an int literal such as 135 to a byte, that is a narrowing primitive conversion.
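As a quick check of the rule quoted above, the following sketch (names are illustrative) prints the low 8 bits of 135 and the value they take when read as a signed byte.

```java
class NarrowIntDemo {
    // Narrowing primitive conversion: keeps only the 8 lowest-order bits.
    static byte narrow(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        int value = 135;
        byte num = narrow(value);
        System.out.println(Integer.toBinaryString(value)); // 10000111
        System.out.println(num);                           // -121
        System.out.println(num & 0xFF);                    // 135
    }
}
```

The bits are unchanged by the cast; only their interpretation differs, since 135 - 256 = -121.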
