Difference in BigInteger values from a String and a byte array - java

Can someone please explain the difference between the below two initialisations of BigInteger.
Input:
BigInteger bi1 = new BigInteger("EF", 16);
byte[] ba = new byte[] {(byte)0xEF};
BigInteger bi2 = new BigInteger(ba);
Log.d("BIGINTEGER", "Big Integer1 = " + bi1.toString(16));
Log.d("BIGINTEGER", "Big Integer2 = " + bi2.toString(16));
Output:
Big Integer1 = ef
Big Integer2 = -11
How can I initialise a BigInteger with the value "EF" from a byte array?

From the BigInteger docs
Constructor and Description
BigInteger(byte[] val)
Translates a byte array containing the two's-complement binary
representation of a BigInteger into a BigInteger.
Two's complement is the real reason.
Let's see how...
(byte)0xEF in binary = 11101111
Read back as a signed 8-bit value, that is -17 (base 10) or -11 (base 16).
Now take a look at
byte[] ba = new byte[] {0, (byte)0xEF};
This has the same (byte)0xEF, but preceded by a zero byte, so the array holds 00000000 11101111. The sign bit is now 0, and the conversion gives the expected result (239, or ef in hex).
Why was the previous case different?
Check out 2's complement rules - SO Answer, Mandatory Wikipedia link
Another way of thinking about this
0xEF in Decimal = 239
Range of byte = -128 to 127
We have overflow.
239 - 128 = 111
Now count 111 steps up from the bottom of the range, -128 (numeric data types have this circular behaviour, again due to the 2's complement representation): -128 + 111 = -17.
For example: (byte)129 = -127
(129 - 128 = 1; one step up from -128 is -127)
Shortcut: if x >= 128 && x < 256, then (byte)x = (x - 128) - 128 = x - 256
Here x = 239, so (byte)x = -17
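The wrap-around can be checked directly. This is only a small sketch; the class and method names (ByteWrapDemo, toSignedByte) are made up for illustration and are not part of any API:

```java
class ByteWrapDemo {
    // Reinterprets a value in the range 0..255 as a signed 8-bit
    // two's-complement number, using the x - 256 shortcut.
    static int toSignedByte(int x) {
        return x >= 128 ? x - 256 : x;
    }

    public static void main(String[] args) {
        System.out.println(toSignedByte(0xEF));  // -17
        System.out.println((byte) 0xEF);         // -17, the cast wraps the same way
        System.out.println(toSignedByte(129));   // -127
        System.out.println(Integer.toString(toSignedByte(0xEF), 16)); // -11
    }
}
```

The helper agrees with a plain (byte) cast for every value from 0 to 255, which is all the "counting from the back" rule is saying.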

Put a leading zero into the byte[]:
byte[] ba = new byte[] {0, (byte)0xEF};
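If you would rather not copy the array, BigInteger also has a constructor that takes a sign and an unsigned magnitude, BigInteger(int signum, byte[] magnitude). A minimal sketch comparing the three options; the class and helper names (UnsignedBytesDemo, fromUnsigned) are hypothetical:

```java
import java.math.BigInteger;

class UnsignedBytesDemo {
    // Treats the bytes as an unsigned big-endian magnitude.
    // The first argument is the sign: 1 means positive.
    static BigInteger fromUnsigned(byte[] magnitude) {
        return new BigInteger(1, magnitude);
    }

    public static void main(String[] args) {
        byte[] ba = {(byte) 0xEF};
        // Two's-complement interpretation: the top bit is a sign bit.
        System.out.println(new BigInteger(ba).toString(16));                           // -11
        // Leading zero byte: the sign bit is now 0.
        System.out.println(new BigInteger(new byte[] {0, (byte) 0xEF}).toString(16)); // ef
        // Sign-magnitude constructor: no padding needed.
        System.out.println(fromUnsigned(ba).toString(16));                             // ef
    }
}
```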

You need to add a zero into the byte[] array:
byte[] ba = new byte[] {0, (byte)0xEF};
BigInteger bi2 = new BigInteger(ba);
Log.d("BIGINTEGER", "Big Integer1 = " + bi1.toString(16));
Log.d("BIGINTEGER", "Big Integer2 = " + bi2.toString(16));
Why?
The reason is related to the language specification:
Decimal literals have a property that hexadecimal literals do not share: decimal literals are all positive [JLS 3.10.1].
To write a negative decimal constant, you use the unary negation operator (-) in combination with a decimal literal.
In this way, you can write any int or long value, whether positive or negative, in decimal form, and negative decimal constants are clearly identifiable by the presence of a minus sign.
Not so for hexadecimal or octal literals.
They can take on both positive and negative values: a hex or octal literal is negative if its high-order bit is set, which for an int literal means bit 31.
So 0xEF by itself is the positive int 239, but the cast (byte)0xEF keeps only the low 8 bits, whose high-order bit is set, and the resulting byte is negative (-17).
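A quick way to see the difference between the literal and the cast. This is just an illustrative sketch; HexLiteralDemo and narrowToByte are made-up names:

```java
class HexLiteralDemo {
    // Narrows an int to a byte, then widens it back to int
    // (the same widening println performs on a byte).
    static int narrowToByte(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        System.out.println(0xEF);               // 239: int literal, bit 31 is clear
        System.out.println(0xFFFFFFEF);         // -17: int literal, bit 31 is set
        System.out.println(narrowToByte(0xEF)); // -17: only the low 8 bits survive the cast
    }
}
```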

public BigInteger(byte[] val)
Translates a byte array containing the two's-complement binary representation of a BigInteger into a BigInteger. The input array is assumed to be in big-endian byte-order: the most significant byte is in the zeroth element.
public BigInteger(String val,
int radix)
Translates the String representation of a BigInteger in the specified radix into a BigInteger. [...]
Source: Oracle Java 7 Docs
Your initialization from a byte array does not behave as expected because 0xEF cast to a byte has the bits {1, 1, 1, 0, 1, 1, 1, 1}.
According to the specs quoted above, these bits are turned into an integer as follows:
1*2^0 + 1*2^1 + 1*2^2 + 1*2^3 + 0*2^4 + 1*2^5 + 1*2^6 - 1*2^7 = -17 = -0x11
In two's complement the highest bit is subtracted rather than added. So adding a 0 byte to the beginning of the byte array fixes the problem:
byte[] ba = new byte[] {0, (byte)0xEF};
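The weighted sum can be written out as code. This is only a sketch of the arithmetic, not how BigInteger is implemented; the class and method names are hypothetical, and it assumes a non-empty string of at most 64 '0'/'1' characters:

```java
class TwosComplementBits {
    // Interprets a string of '0'/'1' characters (most significant bit first)
    // as a two's-complement number: every bit adds 2^k, except the top bit,
    // which subtracts 2^(n-1).
    static long valueOf(String bits) {
        int n = bits.length();
        long value = 0;
        for (int i = 0; i < n; i++) {
            if (bits.charAt(i) == '1') {
                long weight = 1L << (n - 1 - i);
                value += (i == 0) ? -weight : weight;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(valueOf("11101111"));         // -17
        System.out.println(valueOf("0000000011101111")); // 239
    }
}
```

With the extra zero byte in front, the top bit is 0, so nothing is subtracted and the result is 239.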

Related

Java Integer.parseInt() for 32-bit signed binary string throws NumberFormatException

Is this Java Api's bug?
int i = 0xD3951892;
System.out.println(i); // -745203566
String binString = Integer.toBinaryString(i);
int radix = 2;
int j = Integer.valueOf(binString, radix );
Assertions.assertThat(j).isEqualTo(i);
I expect it to be true without any question. But it throws below exception:
java.lang.NumberFormatException: For input string: "11010011100101010001100010010010"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:495)
at java.lang.Integer.valueOf(Integer.java:556)
at com.zhugw.temp.IntegerTest.test_valueof_binary_string(IntegerTest.java:14)
So if I have a binary String , e.g. 11010011100101010001100010010010, How can I get its decimal number(-745203566) in Java? DIY? Write code to implement below equation?
Integer.valueOf(String, int radix) and Integer.parseInt(String, int radix) will only parse numbers of value -2 147 483 648 to 2 147 483 647, i.e. the values of 32-bit signed integers.
These functions cannot interpret two's complement numbers for binary (radix = 2), because the string being passed can be of any length, and so a leading 1 could be part of the number or the sign bit. I guess Java's developers decided that the most logical way to proceed is to never accept two's complement, rather than assume that a 32nd bit is a sign bit.
They read your input binary string as unsigned 3 549 763 730 (bigger than max int value). To read a negative value, you'd want to give a positive binary number with a - sign in front. For example for -5:
Integer.parseInt("1011", 2); // 11
// Even if you extended the 1s to try and make two's complement of 5,
// it would always read it as a positive binary value
Integer.parseInt("-101", 2); // -5, this is right
Solutions:
I suggest, first, that if you can store it as a positive number with extra sign information on your own (e.g. a - symbol), do that. For example:
String binString;
if (i < 0)
    binString = "-" + Integer.toBinaryString(-i);
else // positive i
    binString = Integer.toBinaryString(i);
If you need to use signed binary strings, in order to take a negative number in binary two's complement form (as a 32-character string with a leading 1) and parse it to an int, I suggest you take the two's complement manually, convert that into int, and then correct the sign. Recall that two's complement = one's complement + 1, and one's complement is just inverting each bit.
As an example implementation:
String binString = "11010011100101010001100010010010";
StringBuilder onesComplementBuilder = new StringBuilder();
for (char bit : binString.toCharArray()) {
    // if bit is '0', append a 1. if bit is '1', append a 0.
    onesComplementBuilder.append((bit == '0') ? 1 : 0);
}
String onesComplement = onesComplementBuilder.toString();
System.out.println(onesComplement); // should be the NOT of binString
int converted = Integer.valueOf(onesComplement, 2);
// two's complement = one's complement + 1. This is the positive value
// of our original binary string, so make it negative again.
int value = -(converted + 1);
You could also write your own version of Integer.parseInt for 32-bit two's complement binary numbers. This, of course, assumes you're not using Java 8 and can't just use Integer.parseUnsignedInt, which @llogiq pointed out while I was typing this.
EDIT: You could also use Long.parseLong(String, 2) first, then calculate the two's complement (and mask it by 0xFFFFFFFF), then downgrade the long down to int. Faster to write, probably faster code.
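A rough sketch of the Long.parseLong route. Here the plain narrowing cast to int stands in for the masking step, since it keeps the low 32 bits. The class and method names are made up, and it assumes the string has at most 32 binary digits (shorter strings are read as non-negative, with no sign extension):

```java
class BinaryToInt {
    // Parses up to 32 binary digits and reinterprets them as a signed int.
    static int parseTwosComplement32(String bits) {
        if (bits.length() > 32) {
            throw new NumberFormatException("more than 32 bits: " + bits);
        }
        long wide = Long.parseLong(bits, 2); // at most 2^32 - 1, which fits in a long
        return (int) wide;                   // narrowing keeps the low 32 bits
    }

    public static void main(String[] args) {
        String bits = Integer.toBinaryString(0xD3951892);
        System.out.println(parseTwosComplement32(bits));  // -745203566
        System.out.println(parseTwosComplement32("101")); // 5
    }
}
```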
The API docs for Integer.toBinaryString(..) explicitly state:
The value of the argument can be recovered from the returned string s by calling Integer.parseUnsignedInt(s, 8).
(as of Java 8u25) I think this is a documentation error, and it should read Integer.parseUnsignedInt(s, 2). Note the Unsigned. This is because the toBinaryString output will include the sign bit.
Edit: Note that even though this looks like it would produce an unsigned value, it isn't. This is because Java does not really have a notion of unsigned values, only a few static methods to work with ints as if they were unsigned.
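On Java 8 or later the round trip looks roughly like this. The class and method names are illustrative only; Integer.parseUnsignedInt and Integer.toUnsignedString are the real JDK methods:

```java
class UnsignedRoundTrip {
    // toBinaryString emits the raw 32-bit pattern (without leading zeros);
    // parseUnsignedInt reads that pattern back into the same int bits.
    static int roundTrip(int i) {
        String bits = Integer.toBinaryString(i);
        return Integer.parseUnsignedInt(bits, 2);
    }

    public static void main(String[] args) {
        int i = 0xD3951892;
        System.out.println(roundTrip(i));                // -745203566
        System.out.println(Integer.toUnsignedString(i)); // 3549763730
    }
}
```

The int that comes back is still a signed int with the same 32 bits. It only prints as 3549763730 when you go through one of the unsigned helper methods.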

parseInt on a string of 8 bits returns a negative value when the first bit is 1

I've got a huge string of bits (with some \n in it too) that I pass as a parameter to a method, which should isolate the bits 8 by 8, and convert them all to bytes using parseInt().
Thing is, every time the substring of 8 bits starts with a 1, the resulting byte is a negative number. For example, the first substring is '10001101', and the resulting byte is -115. I can't seem to figure out why, can someone help? It works fine with other substrings.
Here's my code, if needed :
static String bitsToBytes(String geneString) {
    String geneString_temp = "", sub;
    for(int i = 0; i < geneString.length(); i = i+8) {
        sub = geneString.substring(i, i+8);
        if (sub.indexOf("\n") != -1) {
            if (sub.indexOf("\n") != geneString.length())
                sub = sub.substring(0, sub.indexOf("\n")) + sub.substring(sub.indexOf("\n")+1, sub.length()) + geneString.charAt(i+9);
        }
        byte octet = (byte) Integer.parseInt(sub, 2);
        System.out.println(octet);
        geneString_temp = geneString_temp + octet;
    }
    geneString = geneString_temp + "\n";
    return geneString;
}
In Java, byte is a signed type, meaning that when the most significant bit is set to 1, the number is interpreted as negative.
This is precisely what happens when you print your byte here:
System.out.println(octet);
Since PrintStream does not have an overload of println that takes a single byte, the overload that takes an int gets called. Since octet's most significant bit is set to 1, the number gets sign-extended by replicating its sign bit into bits 9..32, resulting in printout of a negative number.
byte is a signed two's complement integer. So this is a normal behavior: the two's complement representation of a negative number has a 1 in the most-significant bit. You could think of it like a sign bit.
If you don't like this, you can use the following idiom:
System.out.println( octet & 0xFF );
This will pass the byte as an int while preventing sign extension. You'll get an output as if it were unsigned.
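Java doesn't have an unsigned byte type (char is the only unsigned integral primitive), so the only other thing you could do is store the numbers in a wider representation, e.g. short or int.
In Java, byte, short, int and long are all signed, and the most significant bit is the sign bit.
Note that parseInt itself is not what makes the value negative: Integer.parseInt("10001101", 2) returns 141. The cast to byte keeps only the low 8 bits, and because the first of them is 1, the byte is read as a negative number (-115). Keep the value in an int, or mask it with & 0xFF, when you need the unsigned value.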
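Putting those pieces together for the first substring from the question. A small sketch; UnsignedByteDemo and parseOctet are hypothetical names, and Byte.toUnsignedInt needs Java 8 or later:

```java
class UnsignedByteDemo {
    // Parses 8 binary digits, stores them in a byte, and returns the
    // unsigned value (0..255) by masking off the sign extension.
    static int parseOctet(String eightBits) {
        byte octet = (byte) Integer.parseInt(eightBits, 2);
        return octet & 0xFF;
    }

    public static void main(String[] args) {
        int parsed = Integer.parseInt("10001101", 2);
        byte octet = (byte) parsed;
        System.out.println(parsed);                    // 141
        System.out.println(octet);                     // -115
        System.out.println(octet & 0xFF);              // 141
        System.out.println(Byte.toUnsignedInt(octet)); // 141
    }
}
```

The bits stored in the byte are the same in every case. Only the interpretation when printing differs.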

BigInteger.toByteArray() returns purposeful leading zeros?

I'm transforming bigints into binary, radix16 and radix64 encoding and seeing mysterious msb zero paddings. Is this a biginteger problem that I can workaround by stripping zero padding or perhaps doing something else?
My test code:
String s;
System.out.printf( "%s length %d\n", s = "123456789A", (new BigInteger( s, 16 )).toByteArray().length );
System.out.printf( "%s length %d\n", s = "F23456789A", (new BigInteger( s, 16 )).toByteArray().length );
Produces output:
123456789A length 5
F23456789A length 6
Of which the longer array has zero padding at the front. Upon inspection of BigInteger.toByteArray() I see:
public byte[] toByteArray() {
int byteLen = bitLength()/8 + 1;
byte[] byteArray = new byte[byteLen];
Now, I can find private int bitLength;, but I can't quite find where bitLength() is defined to figure out exactly why this class does this - connected to sign extension perhaps?
Yes, this is the documented behaviour:
The byte array will be in big-endian byte-order: the most significant byte is in the zeroth element. The array will contain the minimum number of bytes required to represent this BigInteger, including at least one sign bit, which is (ceil((this.bitLength() + 1)/8)).
bitLength() is documented as:
Returns the number of bits in the minimal two's-complement representation of this BigInteger, excluding a sign bit.
So in other words, two values with the same magnitude will always have the same bit length, regardless of sign. Think of a BigInteger as being an unsigned integer and a sign bit - and toByteArray() returns all the data from both parts, which is "the number of bits required for the unsigned integer, and one bit for the sign".
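You can see the relationship between bitLength() and the array length with the two values from the question. A minimal sketch; the class and helper names are made up, and the formula is the one from the toByteArray() source quoted in the question:

```java
import java.math.BigInteger;

class ByteLengthDemo {
    // Same formula as in the quoted toByteArray() source.
    static int expectedLength(BigInteger v) {
        return v.bitLength() / 8 + 1;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123456789A", "F23456789A"}) {
            BigInteger v = new BigInteger(s, 16);
            byte[] bytes = v.toByteArray();
            System.out.printf("%s bitLength %d, array length %d, first byte %d%n",
                    s, v.bitLength(), bytes.length, bytes[0]);
        }
        // 123456789A has bitLength 37, so 5 bytes are enough, sign bit included.
        // F23456789A has bitLength 40, so a 6th byte is needed to hold the sign bit.
    }
}
```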
Thanks Jon Skeet for your answer. Here's some code I'm using to convert, very likely it can be optimized.
import java.math.BigInteger;
import java.util.Arrays;
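public class UnsignedBigInteger {
    public static byte[] toUnsignedByteArray(BigInteger value) {
        if (value.signum() < 0) {
            throw new IllegalArgumentException("value must be a non-negative BigInteger");
        }
        byte[] signedValue = value.toByteArray();
        // toByteArray() adds a leading zero byte only when the top bit of the
        // magnitude is set; strip it in that case, otherwise return the array as is.
        if (signedValue.length > 1 && signedValue[0] == 0x00) {
            return Arrays.copyOfRange(signedValue, 1, signedValue.length);
        }
        return signedValue;
    }

    public static BigInteger fromUnsignedByteArray(byte[] value) {
        // Prepend a zero byte so the sign bit is always 0.
        byte[] signedValue = new byte[value.length + 1];
        System.arraycopy(value, 0, signedValue, 1, value.length);
        return new BigInteger(signedValue);
    }
}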

Convert an int to byte in java

String boxVal = "FB";
int val = Integer.parseInt(boxVal, 16);
System.out.println(val); //prints out 251
byte sboxValue = (byte) val;
System.out.println("sboxValue = " + Integer.toHexString(sboxValue)); //fffffffb
The last line should print out "fb". I am not sure why it prints out "fffffffb."
What am I doing wrong? How should I fix my code to print "fb"?
You have an overflow when you convert 251 to a byte. Byte has a minimum value of -128 and a maximum value of 127 (inclusive)
See here: http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
Why does it print "fffffffb": because you first convert the byte value (which is -5) to an integer with value -5 and then print that integer.
The easiest way to get the output you want is:
System.out.printf("sboxValue = %02x\n", sboxValue);
Or, you could also use:
System.out.println("sboxValue = " + Integer.toHexString(sboxValue & 0xff));
What happens here in detail:
The byte value fb is converted to an int. Since the value is negative (the leftmost bit is 1), it is sign-extended to 32 bits: fffffffb.
Masking with 0xff (the bitwise AND operation &) keeps only the lower 8 bits, which gives the int value 000000fb.
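Side by side, the unmasked and masked conversions look like this. A small sketch; HexByteDemo and toHex are made-up names:

```java
class HexByteDemo {
    // Formats a byte as exactly two lowercase hex digits.
    static String toHex(byte b) {
        return String.format("%02x", b & 0xFF);
    }

    public static void main(String[] args) {
        byte sboxValue = (byte) 251; // stored as -5
        System.out.println(Integer.toHexString(sboxValue));        // fffffffb (sign-extended)
        System.out.println(Integer.toHexString(sboxValue & 0xFF)); // fb
        System.out.println(toHex(sboxValue));                      // fb
        System.out.println(toHex((byte) 5));                       // 05
    }
}
```

The %02x format has the small advantage of keeping the leading zero for values below 0x10, which Integer.toHexString drops.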

Unable to parse 64 bit binary numbers to long

Given binary number in a string "0", I converted it to long to find its Bitwise Not/Complement.
long number = Long.parseLong("0",2);
number = ~number;
System.out.println(Long.toBinaryString(number));
which prints
1111111111111111111111111111111111111111111111111111111111111111
i.e., 64 1's. But I'm unable to find complement of this.
Long.parseLong("111111111111111111111111111111111111111111111111111111111111111",2); //fails
I get java.lang.NumberFormatException. What am I to do?
When you invert zero
number = ~number
you get negative one. The Long.parseLong(String, int) method expects negative numbers to be represented with a minus prefix. When you pass 64 1-s to the method, it treats the value as an overflow and throws a NumberFormatException.
One way to fix this is to check that the length is less than 64 before you parse the value. If the length is exactly 64, chop off the first digit, and parse the rest of the number. Then check the initial digit. If it is zero, leave the parsed number as is; otherwise, use binary OR to set the most significant bit:
String s = "1111111111111111111111111111111111111111111111111111111111111111";
long res;
if (s.length() < 64) {
    res = Long.parseLong(s, 2);
} else {
    res = Long.parseLong(s.substring(1), 2);
    if (s.charAt(0) == '1') {
        res |= (1L << 63);
    }
}
The complement of 0 is 64 1's, which is equivalent to -1, since Java uses two's complement.
Long.parseLong(String, int)
expects a signed long (i.e. if the number is negative, it expects a leading -), but you are passing it 64 1's, which are supposed to represent -1 but do not in this form.
Since it expects a minus sign for negatives, passing it 64 1's makes it treat the number as too large.
EDIT (explanation of dasblinkenlight's fix: couldn't properly format in comment):
So if String s =
"1111111111111111111111111111111111111111111111111111111111111111";
, and we have:
long res = Long.parseLong(s.substring(1), 2);
The binary form of res is:
0111111111111111111111111111111111111111111111111111111111111111
Now, if we know that the first char of s is '1', then we do the following:
res |= (1L << 63);
(1L << 63) produces:
1000000000000000000000000000000000000000000000000000000000000000
So, the bitwise-or assignment to res yields 64 1's, which in two's complement is -1, as desired.
This is because Long.parseLong (as well as Integer.parseInt etc.) cannot parse two's complement; for it, a string of 64 1's is a positive number that exceeds Long.MAX_VALUE. But we can use BigInteger:
long l = new BigInteger("1111111111111111111111111111111111111111111111111111111111111111", 2).longValue()
This produces the expected result, -1.
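On Java 8 or later, Long.parseUnsignedLong(String, int) is another option that reads the full 64-bit pattern. A minimal sketch comparing it with the BigInteger route; the class and method names are illustrative, and the 64 1's are generated with Long.toBinaryString(-1L) rather than typed out:

```java
import java.math.BigInteger;

class Binary64Demo {
    // Reads up to 64 binary digits as a raw bit pattern.
    static long parse64(String bits) {
        return Long.parseUnsignedLong(bits, 2);
    }

    public static void main(String[] args) {
        String ones = Long.toBinaryString(-1L);                   // 64 1's
        System.out.println(ones.length());                        // 64
        System.out.println(parse64(ones));                        // -1
        System.out.println(new BigInteger(ones, 2).longValue());  // -1
        System.out.println(Long.toBinaryString(~parse64(ones)));  // 0
    }
}
```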
