Problem
I wanted to perform a bit operation in Java and was expecting the same behavior as in C or C++. However, it is not working as intended.
In C or C++
printf("%d", 0 > 0xFFFFFFFF);
this would return 0 (which is false)
However in Java
System.out.println(0 > 0xFFFFFFFF);
returns true
What I understand
I know how 2's complement works. Below is my guess at what happens internally in the two languages.
C++ or C just treats 0xFFFFFFFF as 0xFFFFFFFF itself, so the value 0 is smaller than 0xFFFFFFFF and the result is false.
Java interprets 0xFFFFFFFF in 2's complement as -1, so the value 0 is bigger than -1 and the result is true.
Question
Is there any possible way that Java can work just like C++ or C did? I would like Java's hex values to be recognized as hex values instead of converting them into signed int values?
In Java the literal 0xFFFFFFFF represents the int whose value is -1. In Java int is a signed type.
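In C / C++ an unsuffixed hex literal such as 0xFFFFFFFF gets the first type in the sequence int, unsigned int, long, unsigned long, ... that can represent its value. On the common platforms where int is 32 bits wide, that type is unsigned int, so the literal represents 2^32 - 1 ... a very large positive integer. It never represents -1, which is why 0 > 0xFFFFFFFF is false there.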
Is there any possible way that Java can work just like C++ or C did?
No.
I would like Java's hex values to be recognized as hex values instead of converting them into signed int values?
Well the problem is that Java int is signed. And you can't change that.
However, there are methods in the Integer class that will treat an int value as if it was unsigned; e.g. Integer.compareUnsigned, divideUnsigned and parseUnsignedInt. See the javadocs for more details.
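As a minimal sketch of the two usual workarounds: compare with Integer.compareUnsigned, or write the literal with an L suffix so that it is a long holding 4294967295. The class name UnsignedCompareDemo and the helper greaterUnsigned are made up for this illustration; the Integer methods are the standard ones added in Java 8.

```java
class UnsignedCompareDemo {
    // Hypothetical helper: true when a > b if both ints are read as unsigned 32-bit values.
    static boolean greaterUnsigned(int a, int b) {
        return Integer.compareUnsigned(a, b) > 0;
    }

    public static void main(String[] args) {
        // Signed comparison: 0xFFFFFFFF is the int -1, so this prints true.
        System.out.println(0 > 0xFFFFFFFF);

        // Unsigned comparison of the same 32 bits: prints false, as in C.
        System.out.println(greaterUnsigned(0, 0xFFFFFFFF));

        // With the L suffix the literal is the long 4294967295: prints false.
        System.out.println(0 > 0xFFFFFFFFL);

        // Reading the int's bits as an unsigned value: prints 4294967295.
        System.out.println(Integer.toUnsignedString(0xFFFFFFFF));
    }
}
```

Using a long is often the simpler choice when the values are known to fit in 32 bits, since ordinary operators then behave as they would on a C unsigned int for comparison purposes.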
Related
When I do
Long.parseUnsignedLong("FBD626CC4961A4FC", 16)
I get back -300009666327239428
Which seems wrong, since the meaning of unsigned long according to this answer https://stackoverflow.com/a/2550367/1754020 is that the range is always positive.
To get the correct number from this HEX value I do
BigInteger value = new BigInteger("FBD626CC4961A4FC", 16);
When I print value, it prints the correct value, but if I call value.longValue()
I again get the same -300009666327239428. Is this because the number is too big and overflows?
Java 8 does (somewhat) support unsigned longs, however, you can't just print them directly. Doing so will give you the result that you saw.
If you have an unsigned long
Long number = Long.parseUnsignedLong("FBD626CC4961A4FC", 16);
you can get the correct string representation with the function
String numberToPrint = Long.toUnsignedString(number);
If you now print numberToPrint you get
18146734407382312188
To be more exact, your number is still a regular signed long, which is why it appears to have overflowed when printed directly. However, there are new static functions that treat the value as if it were unsigned, such as Long.toUnsignedString(long x) or Long.compareUnsigned(long x, long y).
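Putting those calls together, here is a small sketch of the signed and unsigned views of the same 64 bits. The class name UnsignedLongDemo and the helper unsignedDecimal are invented for the example; everything else is the Java 8 Long API mentioned above.

```java
class UnsignedLongDemo {
    // Hypothetical helper: parses hex text as an unsigned 64-bit value and
    // returns its unsigned decimal representation.
    static String unsignedDecimal(String hex) {
        long bits = Long.parseUnsignedLong(hex, 16);
        return Long.toUnsignedString(bits);
    }

    public static void main(String[] args) {
        long number = Long.parseUnsignedLong("FBD626CC4961A4FC", 16);

        System.out.println(number);                          // -300009666327239428 (signed view)
        System.out.println(Long.toUnsignedString(number));   // 18146734407382312188 (unsigned view)

        // The signed comparison puts the value below Long.MAX_VALUE,
        // the unsigned comparison puts it above.
        System.out.println(number < Long.MAX_VALUE);                          // true
        System.out.println(Long.compareUnsigned(number, Long.MAX_VALUE) > 0); // true

        // Unsigned division also has a dedicated method.
        System.out.println(Long.divideUnsigned(number, 2));  // 9073367203691156094
    }
}
```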
The hexadecimal number "FBD626CC4961A4FC", converted to decimal, is exactly 18146734407382312188. That number is indeed larger than the maximum possible long, defined as Long.MAX_VALUE, which is equal to 2^63-1, or 9223372036854775807:
System.out.println(new BigInteger("FBD626CC4961A4FC", 16)); // 18146734407382312188
System.out.println(Long.MAX_VALUE); // 9223372036854775807
As such, it's normal that you get back a negative number.
You do not get an exception, because that is exactly the purpose of the new *Unsigned* methods added in Java 8: to give the ability to handle unsigned longs (like compareUnsigned or divideUnsigned). Since the type long in Java is still signed, those methods work by treating negative values as values greater than MAX_VALUE, which simulates an unsigned long. The documentation of parseUnsignedLong says:
An unsigned integer maps the values usually associated with negative numbers to positive numbers larger than MAX_VALUE.
If you print a long that was the result of parseUnsignedLong, and it is negative, all it means is that the value is greater than the max long value as defined by the language, but that methods taking unsigned longs as parameter will correctly interpret those values, as if they were greater than the max value. As such, instead of printing it directly, if you pass that number to toUnsignedString, you'll get the right output, like shown in this other answer. Not all of these methods are new to Java 8, for example toHexString also interprets the given long as an unsigned long in base 16, and printing Long.toHexString(Long.parseUnsignedLong("FBD626CC4961A4FC", 16)) will give you back the right hex String.
parseUnsignedLong will throw an exception only when the value cannot be represented as an unsigned long, i.e. it is not a number at all, or it is greater than 2^64-1 (and not 2^63-1, which is the maximum value for a signed long).
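The limits described above can be checked with a short sketch. The class ParseLimitsDemo and the helper fitsUnsignedLong are hypothetical names used only here; the behavior relied on is that Long.parseUnsignedLong throws NumberFormatException for unparsable or out-of-range input.

```java
class ParseLimitsDemo {
    // Hypothetical helper: reports whether the hex text fits in an unsigned 64-bit value.
    static boolean fitsUnsignedLong(String hex) {
        try {
            Long.parseUnsignedLong(hex, 16);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsUnsignedLong("FBD626CC4961A4FC"));   // true: above 2^63-1 but below 2^64
        System.out.println(fitsUnsignedLong("FFFFFFFFFFFFFFFF"));   // true: exactly 2^64-1
        System.out.println(fitsUnsignedLong("10000000000000000"));  // false: 2^64 is out of range
        System.out.println(fitsUnsignedLong("not a number"));       // false: not a number at all

        // toHexString also reads the long as unsigned, so the round trip is lossless.
        System.out.println(Long.toHexString(Long.parseUnsignedLong("FBD626CC4961A4FC", 16))); // fbd626cc4961a4fc
    }
}
```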
Yes, it overflows when you are trying to print it, as it is converted to Java long type. To understand why let's take log2 of your dec value.
First, the original value is 18146734407382312188. Its log2 is ~63.9763437545.
Second, look at the documentation: in Java the long type has a minimum value of -2^63 and a maximum value of 2^63-1.
So your value is obviously greater than 2^63-1, hence it overflows and wraps around by 2^64:
-2^63 + (18146734407382312188 - 2^63) = -300009666327239428
But as @Keiwan brilliantly mentioned, you can still print the proper value using Long.toUnsignedString(number);
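The wrap-around arithmetic can be double-checked with BigInteger. This is only a sketch: WrapAroundDemo and wrapToSigned are made-up names, and the helper assumes its argument lies in the range 0 to 2^64-1.

```java
import java.math.BigInteger;

class WrapAroundDemo {
    // Hypothetical helper: the signed long that shares its 64 bits with the
    // given unsigned value (assumed to be in the range 0 .. 2^64-1).
    static long wrapToSigned(BigInteger unsignedValue) {
        BigInteger twoTo63 = BigInteger.ONE.shiftLeft(63);
        BigInteger twoTo64 = BigInteger.ONE.shiftLeft(64);
        // Values of 2^63 or more land in the negative half: subtract 2^64.
        BigInteger signed = unsignedValue.compareTo(twoTo63) >= 0
                ? unsignedValue.subtract(twoTo64)
                : unsignedValue;
        return signed.longValueExact();
    }

    public static void main(String[] args) {
        BigInteger value = new BigInteger("18146734407382312188");

        System.out.println(value.bitLength());     // 64: one bit more than a positive long can hold
        System.out.println(wrapToSigned(value));   // -300009666327239428
        System.out.println(value.longValue());     // -300009666327239428 (keeps only the low 64 bits)
    }
}
```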
Internally, unsigned and signed numbers are represented in the same way, i.e. as 8 bytes in the case of a long. The only difference is how the "sign" bit is interpreted, i.e. if you did the same in a C/C++ program, stored your value in a uint64_t and then cast it to a signed int64_t, you should get the same result.
Since the maximum value 8 bytes or 64 bits can hold is 2^64-1 that's the hard constraint for such numbers. Also Java doesn't directly support unsigned numbers and thus the only way to store an unsigned long in a long is to allow for a value that's higher than the signed Long.MAX_VALUE. In fact Java doesn't know whether the string/hexcode you're reading is meant to represent a signed or unsigned long so it's up to you to provide that interpretation, either by converting back to a string or using a larger datatype such as BigInteger.
I'm working on an assignment for school and I'm getting strange output. So, I figured I should start checking some of my more basic methods before I get to the fancier ones. The question I have is this:
would the method
public static short get16(byte a, byte b){
return (short)(a*Math.pow(2,8)+b);
}
return a short where the first 8 bits are byte a and the last 8 bits are byte b?
I don't see why it wouldn't, since multiplying by 2^8 would be the same as left shifting 8 bits to the left. And adding the second byte would make up for the 8 0's achieved by multiplying by 2^8. Is this correct?
I wouldn't recommend using Math.pow to compute 256. pow is notoriously hard to implement correctly; some extant implementations don't even get the exact cases right!
Also, bytes in Java are signed, so you probably want to say (a&255) and (b&255) rather than just a and b. Sign extension will ruin everything for you.
Some things you should know:
"Math.pow" is a floating-point function. Don't do integer calculation by calling floating-point functions and then rounding the result.
The Java virtual machine is internally a 32-bit system. All "byte" and "short" mathematical expressions are internally evaluated as "int". Even an addition of two bytes goes internally like this: 1) convert the bytes to ints, 2) add the ints, 3) convert the lower 8 bits to byte.
The correct way is:
return (short) ((a << 8) + (b & 255));
or
return (short) ((a << 8) | (b & 255));
When "byte" is converted to "int", the sign bit gets copied into the new bits. For example 0b01010101 becomes 0b00000000_00000000_00000000_01010101, because the first bit was 0, but 0b10101010 becomes 0b11111111_11111111_11111111_10101010.
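To make the effect of sign extension visible, here is a small sketch comparing the masked version with two unmasked variants. The class Get16Demo and the names get16Unmasked and get16Pow are invented for the comparison; get16Pow mirrors the formula from the question.

```java
class Get16Demo {
    // Combines two bytes into a short: a becomes the high 8 bits, b the low 8 bits.
    static short get16(byte a, byte b) {
        return (short) ((a << 8) | (b & 0xFF));
    }

    // Same without the mask, kept only to show the sign-extension problem.
    static short get16Unmasked(byte a, byte b) {
        return (short) ((a << 8) | b);
    }

    // The floating-point formula from the question, for comparison.
    static short get16Pow(byte a, byte b) {
        return (short) (a * Math.pow(2, 8) + b);
    }

    public static void main(String[] args) {
        byte a = 0x12;
        byte b = (byte) 0xAB; // -85 as a signed Java byte

        // Mask with 0xFFFF before printing so the short is shown as 4 hex digits.
        System.out.println(Integer.toHexString(get16(a, b) & 0xFFFF));         // 12ab
        System.out.println(Integer.toHexString(get16Unmasked(a, b) & 0xFFFF)); // ffab
        System.out.println(Integer.toHexString(get16Pow(a, b) & 0xFFFF));      // 11ab
    }
}
```

When b is non-negative all three agree, which is why this kind of bug tends to show up only with some inputs.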
Does Python have an equivalent to Java's Byte.MAX_VALUE representing the max byte? I had a look at Python's sys module, but I only managed to find sys.maxint. Does it have anything like sys.maxbyte?
UPDATE:
In my case, I am doing an HBase row key scan. My row key looks like rk1_rk2. In order to scan all results for rk1 without knowing the exact rk2, my Java code looks like:
byte[] startRowBytes = "rk1".getBytes();
byte[] endRowBytes = ("rk1" + (char) Byte.MAX_VALUE).getBytes();
HbaseScanQuery query = new HbaseScanQuery(tableName, colFamily);
query.setStartRow(startRowBytes).setStopRow(endRowBytes);
I am just trying to work out the Python equivalent of the Byte.MAX_VALUE part.
I think you will have to define the value yourself. A byte has 2^8 = 256 unique states, so the largest unsigned integer it can represent is 255. Java's byte type, however, is a signed byte, so half the states are reserved for positives (and 0) and the other half is used for negatives. Therefore the equivalent of Java's Byte.MAX_VALUE is 127, and the equivalent of Java's Byte.MIN_VALUE is -128.
Since Python bytes are unsigned, the equivalent of Java's Byte.MIN_VALUE would be 128, which is the representation of -128 in 2's complement notation (the de facto standard for representing signed integers). Thanks to Ignacio Vazquez-Abrams for pointing that out.
I haven't dealt with Python in a while, but I believe what you want is ("rk1"+chr(127))
Given your update, there is an even better answer: Don't worry about what the max byte value is. According to the HBase documentation, the setStartRow and setStopRow methods work just like Python's slicing; namely, the start is inclusive, but the stop is exclusive, meaning your endRowBytes should simply be 'rk2'.
Also, the documentation mentions that you can make the stop row inclusive by adding a zero byte, so another alternative is 'rk1' + chr(0) (or 'rk1\0' or 'rk1\x00', whichever is clearest to you). In fact, the example used to explain HBase scans in the linked documentation illustrates exactly your use case.
I want to do an "AND" operation between a and b (both decimal). The value of b is determined inside the code. As a result, if I write something like:
String g= Integer.toHexString(b);
int k=a & g;
I get an error, because it should be something like:
int k=a & 0xFF;
Somehow 0x should come before the hex value, and at the same time it can't be of type String. I did not find any example on the Internet for cases where the second operand is a variable. Should I manually write a for loop to apply the AND operation bit by bit, or is there a straightforward solution?
I appreciate your help.
If b is already an integer, your code should just be:
int k = a & b;
A number is a number is a number. The human representations of 0xFF vs 255 makes no difference at all to the & operator. There is no such thing as a "Hex Integer" vs a "Decimal Integer" to the computer, it's just a value. The different ways of writing it are on the human end.
Just do this:
int k = a & b;
There's no need to convert an int to a hexadecimal before applying the & operator. In fact, that is an error, because the value returned by toHexString() is a String, and the & operator works for integers only.
When you write the hex literal 0xFF in your java code, it's the same as writing integer literal 255.
Hence there's no point converting your integer to string, doing a & b is sufficient. The hex notation is just how you specify your literal
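A short sketch of the point made in these answers: the mask can be any int variable, and if all you have is hex text, parse it back to an int first. The class AndDemo and the helper andWithHex are hypothetical names, and b is fixed to 255 here only so the output is predictable.

```java
class AndDemo {
    // Hypothetical helper: ANDs a with a mask that arrives as hex text such as "ff".
    // parseUnsignedInt accepts everything Integer.toHexString can produce,
    // including 8-digit strings like "ffffffff".
    static int andWithHex(int a, String hexMask) {
        return a & Integer.parseUnsignedInt(hexMask, 16);
    }

    public static void main(String[] args) {
        int a = 0x1234;
        int b = 255; // stands in for a value determined at run time

        System.out.println(a & b);     // 52 (0x34)
        System.out.println(a & 0xFF);  // 52, the literal 0xFF is the same int as 255

        String g = Integer.toHexString(b);     // "ff"
        System.out.println(andWithHex(a, g));  // 52
    }
}
```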
This is what I see in java, and it puzzles me.
Long.toHexString(0xFFFFFFFF) returns ffffffffffffffff
Similarly, 0xFFFFFFFF and Long.parseLong("FFFFFFFF", 16) are unequal.
As others have said, 0xFFFFFFFF evaluates to the int value -1, which is promoted to a long.
To get the result you were expecting, qualify the constant with the L suffix to indicate it should be treated as a long, i.e. Long.toHexString(0xFFFFFFFFL).
This:
Long.toHexString(0xFFFFFFFF)
is equivalent to:
Long.toHexString(-1)
which is equivalent to:
Long.toHexString(0xFFFFFFFFFFFFFFFFL)
Basically, the problem is that you're specifying a negative int value, which is then being converted to the equivalent negative long value, which consists of "all Fs". If you really want 8 Fs, you should use:
Long.toHexString(0xFFFFFFFFL)
Of course, a long in Java is 64 bits wide! 0xFFFFFFFF means -1 as an int; written in 64 bits, it's ffffffffffffffff.
However, if the number were unsigned, the string would be ffffffff [but there's no unsigned int in Java].
0xFFFFFFFF is an int literal. When using ints (32 bit in Java) 0xFFFFFFFF equals -1. What your code does:
the compiler parses 0xFFFFFFFF as an int with value -1
the Java runtime calls Long.toHexString(-1) (the -1 is widened automatically to a long, which is the parameter type expected here)
And when using longs (64 bit in Java) -1 is 0xffffffffffffffff.
long literals are post-fixed by an L. So your expected behaviour is written in Java as:
Long.toHexString(0xFFFFFFFFL)
and Long.toHexString(0xFFFFFFFFL) is "ffffffff"
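The answers above can be checked side by side with a brief sketch. The class HexLiteralDemo and the helper widenUnsigned are names made up for this example; the helper shows one common way to widen an int to a long without sign extension.

```java
class HexLiteralDemo {
    // Hypothetical helper: widens an int to a long while treating its 32 bits as unsigned.
    static long widenUnsigned(int value) {
        return value & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // int literal -1, sign-extended to 64 bits before the call.
        System.out.println(Long.toHexString(0xFFFFFFFF));    // ffffffffffffffff

        // long literal 4294967295, no sign extension involved.
        System.out.println(Long.toHexString(0xFFFFFFFFL));   // ffffffff

        // The int literal and the parsed long differ, the long literal matches.
        System.out.println(0xFFFFFFFF == Long.parseLong("FFFFFFFF", 16));   // false
        System.out.println(0xFFFFFFFFL == Long.parseLong("FFFFFFFF", 16));  // true

        // Widening an existing int without sign extension.
        System.out.println(Long.toHexString(widenUnsigned(0xFFFFFFFF)));    // ffffffff
    }
}
```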