Converting 0xFF gives me -1 -> Signed vs Unsigned?

Converting 0xFF gives me -1 -> Signed vs Unsigned? - java

Here is my code:
byte b = (byte) 0xFF;
System.out.println(b);
I expect to see 255, but instead I see -1. What's happening? Is the compiler using a signed instead of an unsigned byte?

Maybe you are confused about the difference between storage and representation.
As the others told you, Java only uses signed integral types. As a byte is only 8 bit, this means you have a range from -128 to 127. But keep in mind: This is only a decimal representation that does not tell you about the binary storage!
A better reminder for the range of Java's byte type would be the following (decimal) representation:
0 1 ... 126 127 -128 -127 ... -2 -1
This directly corresponds to the following binary representation:
00000000 00000001 ... 11111110 11111111
The signed decimal representation is called the two's complement. As you can see, the binary value 11111111, which is exactly 0xFF, corresponds to the (signed) decimal value -1.
If you now send this value via a stream, you will get the full 0xFF value, regardless what you see in a Java program's console output.

In Java type byte is signed, it is not possible to use an "unsigned byte" in java. A workaround would be to use short types

Everytime you operate on a byte variable in Java, (byte) 0xFF is going to be implicitly converted to (int) 0xFFFFFF (i.e. -1). You can do 0xFFFFFF & 0xFF and then you'll get (int) 255 as you wish (you will never be able to get (byte) 255, only (int) 255).
If you only need to store bytes, it doesn't matter how 0xFF is printed on the screen, it's still (byte) 0xFF internally. If you need to operate on them (comparing to an int, adding, subtracting, converting to a string, printing on the screen, and basically anything else), then you need to be aware that they'll get converted to int 0xFFFFFF which is interpreted -1.

Related

Can Java's byte actually store 32bit?

My goal is to understand how byte is stored in Java.
System.out.println("(byte) 0xFF:\r\n" +
Integer.toBinaryString((byte) 0xFF));
My expected result of (byte) 0xFF is 0xFF.
My actual result of (byte) 0xFF is 0xFFFFFFFF
The output:
(byte) 0xFF:
11111111111111111111111111111111
If this is true, does storing negative number in byte actually is no different than storing negative number in int?

toBinaryString accepts an int. Your input was just autopromoted to int, hence it was represented with 32 digits (0xFF == -1 in two's complement, which promoted to int becomes 0xFFFFFFFF which is still -1 but represented with 32 bits, still in two's complement).
Notice that
If the unsigned magnitude is zero, it is represented by a single zero character '0' ('\u0030'); otherwise, the first character of the representation of the unsigned magnitude will not be the zero character.
Which means that if there are leading 0s they won't be part of the output (unless the output is 0), which means you'll get less than 32 digits.

When the byte is implicitly re-cast as an integer, it sees the first bit as a sign bit and when it stretches out to be 4 bytes long, it retains it's value as negative. You've effectively overflowed to the smallest negative integer value.

Java bytes are signed and can represent the values -128 to 127, and a hex value of 0xff is treated as -1. When you call Integer.toBinaryString, the byte is cast to an int, preserving the sign. The way this works is called sign extension, and the highest bit from the byte is copied all the way up. This is why you see 0xfffff...
To perform an unsigned conversion, mask the value with 0xff, which itself an int unless specified otherwise.
byte b = (byte) 0xff;
Integer.toBinaryString(b & 0xff);
And to answer your original question, Java doesn't really support a byte type as you might expect. It's always a 32-bit value except when dealing with byte arrays and byte buffers. Using a type of byte simply informs the JVM to perform certain type casting rules which have the effect of clearing or setting the upper bits accordingly.

java - Why is 0x000F stored as unsigned?

I was reading through examples trying to understand how to convert signed bytes to unsigned integer counter parts.
The most popular method that I have come across is:
a & 0xFF
Where a is the signed byte.
My question is why is 0xFF stored as unsigned? Are all hex values stored as unsigned? If so why?
And how does "and"-ing turn off the sign bit in the sign integer?
It would be great if someone could break down the process step by step.

You probably saw this in code that converted a byte to an integer, where they wanted to treat the byte as an unsigned value in the range 0-255. It does not apply to integers in general. If you want to make an integer a "unsigned", you can do:
int unsignedA = a & 0x7FFFFFFF;
This will ensure that unsignedA is positive - but it does that by chopping off the high bit, so for example if a was -1, then unsignedA is Integer.MAX_VALUE.
There is no way to turn a 32-bit signed Java integer into a 32-bit unsigned Java integer because there is no datatype in Java for a 32-bit unsigned integer. The only unsigned integral datatype in Java is 16 bits long: char.
If you want to store a 32-bit unsigned integral value in Java, you need to store it in a long:
long unsignedA = a & 0xFFFFFFFFL;

To elaborate on Erwin's answer about converting a byte to an integer: In Java, byte is a signed integer type. That means it has values in the range -128 to 127. If you say:
byte a;
int b;
a = -64;
b = a;
The language will preserve the value; that is, it will set b to -64.
But if you really want to convert your byte to a value from 0 to 255 (which I guess you call the "unsigned counterpart" of the byte value), you can use a & 0xFF. Here's what happens:
Java does not do arithmetic directly on byte or short types. So when it sees a & 0xFF, it converts both sides to an int. The hex value of a, which is a byte, looks like
a = C0
When it's converted to a 32-bit integer, the value (-64) has to be preserved, so that means the 32-bit integer has to have 1 bits in the upper 24 bits. Thus:
a = C0
(int)a = FFFFFFC0
But then you "and" it with 0xFF:
a = C0
(int)a = FFFFFFC0
& 000000FF
--------
a & FF = 000000C0
And the result is an integer in the range 0 to 255.

In Java, literals (1, 0x2A, etc) are positive unless you explicitly indicate that they are negative. It's how we intuitively write numbers.
This previous question answers you question about converting to unsigned. Understanding Java unsigned numbers

Byte initialization using 0xFF instead of 0xFFFFFFFF

I have a question about initializing byte in java, I want to initialize a byte value allBitsOne and all bits of it are 1:
Method 1:
byte allBitsOne = 0xFF;
Wrong, it says that 0xFF is a integer type and over the range of byte, so i do it like below
Method 2:
byte allBitsOne = (byte)0xFF;
Works fine.
Method 3:
byte allBitsOne = 0xFFFFFFFF;
It works fine as well, but if 0xFF exceeds the range of a byte, why doesn't 0xFFFFFFFF?
Thank you all, I found this: link

byte is a signed integer type, going from -128 to 127.
0xFF is 255, so it's larger than 127.
0xFFFFFFFF is -1, so it's within the bounds of the byte type.
See http://en.wikipedia.org/wiki/Two%27s_complement

literal integers in Java are signed 32 bit numbers, so:
0xff is an integer type which equals to 255, which is over the limit for byte.
0xffffffff is an integer type which equals to -1, which is not over the limit for byte.

the byte-variable in the java in the java can hold the values from -128 to 127.
if you want to set all the bit to 1. then u can store the -128 to it.

Understanding Java bytes

So at work yesterday, I had to write an application to count the pages in an AFP file. So I dusted off my MO:DCA spec PDF and found the structured field BPG (Begin Page) and its 3-byte identifier. The app needs to run on an AIX box, so I decided to write it in Java.
For maximum efficiency, I decided that I would read the first 6 bytes of each structured field and then skip the remaining bytes in the field. This would get me:
0: Start of field byte
1-2: 2-byte length of field
3-5: 3-byte sequence identifying the type of field
So I check the field type and increment a page counter if it's BPG, and I don't if it's not. Then I skip the remaining bytes in the field rather than read through them. And here, in the skipping (and really in the field length) is where I discovered that Java uses signed bytes.
I did some googling and found quite a bit of useful information. Most useful, of course, was the instruction to do a bitwise & to 0xff to get the unsigned int value. This was necessary for me to get a length that could be used in the calculation for the number of bytes to skip.
I now know that at 128, we start counting backwards from -128. What I want to know is how the bitwise operation works here--more specifically, how I arrive at the binary representation for a negative number.
If I understand the bitwise & properly, your result is equal to a number where only the common bits of your two numbers are set. So assuming byte b = -128, we would have:
b & 0xff // 128
1000 0000-128
1111 1111 255
---------
1000 0000 128
So how would I arrive at 1000 0000 for -128? How would I get the binary representation of something less obvious like -72 or -64?

In order to obtain the binary representation of a negative number you calculate two's complement:
Get the binary representation of the positive number
Invert all the bits
Add one
Let's do -72 as an example:
0100 1000 72
1011 0111 All bits inverted
1011 1000 Add one
So the binary (8-bit) representation of -72 is 10111000.
What is actually happening to you is the following: You file has a byte with value 10111000. When interpreted as an unsigned byte (which is probably what you want), this is 88.
In Java, when this byte is used as an int (for example because read() returns an int, or because of implicit promotion), it will be interpreted as a signed byte, and sign-extended to 11111111 11111111 11111111 10111000. This is an integer with value -72.
By ANDing with 0xff you retain only the lowest 8 bits, so your integer is now 00000000 00000000 00000000 10111000, which is 88.

What I want to know is how the bitwise operation works here--more specifically, how I arrive at the binary representation for a negative number.
The binary representation of a negative number is that of the corresponding positive number bit-flipped with 1 added to it. This representation is called two's complement.

I guess the magic here is that the byte is stored in a bigger container, likely a 32 bit int. And if the byte was interpreted as being a signed byte it gets expanded to represent the same number in the 32 bit int, that is if the most significant bit (the first one) of the byte is a 1 then in the 32 bit int all the bits left of that 1 are also turned to 1 (that's due to the way negative numbers are represented, two's complement).
Now, if you & 0xFF that int you cut off those 1's and end up with a "positive" int representing the byte value you've read.

Not sure what you really want :) I assume you are asking how to extract a signed multi-byte value? First, look at what happens when you sign extend a single byte:
byte[] b = new byte[] { -128 };
int i = b[0];
System.out.println(i); // prints -128!
So, the sign is correctly extendet to 32 bits without doing anything special. The byte 1000 0000 extends correctly to 1111 1111 1111 1111 1111 1111 1000 0000.
You already know how to suppress sign extension by AND'ing with 0xFF - for multi byte values, you want only the sign of the most significant byte to be extendet, and the less significant bytes you want to treat as unsigned (example assumes network byte order, 16-bit int value):
byte[] b = new byte[] { -128, 1 }; // 0x80, 0x01
int i = (b[0] << 8) | (b[1] & 0xFF);
System.out.println(i); // prints -32767!
System.out.println(Integer.toHexString(i)); // prints ffff8001
You need to suppress the sign extension of every byte except the most significant one, so to extract a signed 32-bit int to a 64-bit long:
byte[] b = new byte[] { -54, -2, -70, -66 }; // 0xca, 0xfe, 0xba, 0xbe
long l = ( b[0] << 24) |
((b[1] & 0xFF) << 16) |
((b[2] & 0xFF) << 8) |
((b[3] & 0xFF) );
System.out.println(l); // prints -889275714
System.out.println(Long.toHexString(l)); // prints ffffffffcafebabe
Note: on intel based systems, bytes are often stored in reverse order (least significant byte first) because the x86 architecture stores larger entities in this order in memory. A lot of x86 originated software does use it in file formats, too.

To get the unsigned byte value you can either.
int u = b & 0xFF;
or
int u = b < 0 ? b + 256 : b;

For bytes with bit 7 set:
unsigned_value = signed_value + 256
Mathematically when you compute with bytes you compute modulo 256. The difference between signed and unsigned is that you choose different representatives for the equivalence classes, while the underlying representation as a bit pattern stays the same for each equivalence class. This also explains why addition, subtraction and multiplication have the same result as a bit pattern, regardless of whether you compute with signed or unsigned integers.

Why byte b = (byte) 0xFF is equals to integer -1?

Why byte b = (byte) 0xFF is equal to integer -1?
Ex:
int value = byte b = (byte) 0xFF;
System.out.println(value);
it will print -1?

Bytes are signed in Java. In binary 0x00 is 0, 0x01 is 1 and so on but all 1s (ie 0xFF) is -1, 0xFE is -2 and so on. See Two's complement, which is the binary encoding mechanism used.

b is promoted to an int in determining which overload of system.out.println to call.
All bytes in Java are signed.
The signed byte 0xff represents the value -1. This is because Java uses two's complement to represent signed values. The signed byte 0xff represents -1 because its most significant bit is 1 (so therefore it represents a negative value) and its value is -128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = -1.

perhaps your confusion comes from why (byte)0xFF is somehow equal to (int)0xFFFFFFFF. What's happening here is the promotion from smaller to larger signed types causes the smaller value to be sign extended, whereby the most significant bit is copied to all of the new bits of the promoted value. an unsigned type will not become sign-extended, they get zero extended, the new bits will always be zero.
If it helps you to swallow it, think of it this way, every integer of any size also has some 'phantom' bits that are too significant to be represented. they are there, just not stored in the variable. a negative number has those bits nonzero, and positive numbers have all zeros for phantom bits when you promote a smaller value to a larger one, those phantom bits become real bits.

If you are using a signed int then 0xFF = -1 due to the 2-complement.
This wiki article explains it well, see the table on the right:
http://en.wikipedia.org/wiki/Two%27s_complement

Because Java (and most languages) represent negative integer values using two's-complement math. In two's-complement, 0xFF (11111111) represents (in a signed int) the value -1.

reduced modulo
byte = 256
0xff = 255
255 / 256 -> remainder 255
So 255 - 256 = -1
Simple Logic
Cheers

Its not just Java that does 2's complement math. That is the way every microprocessor and DSP that I can think of does math. So, its the way every programming language represents it.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.