Calculating number of bits and number of words BigInteger

While converting a String into a BigInteger, Java internally calculates the number of bits and then the number of words (each word is a group of 9 digits, I think) in the BigInteger, as can be seen here from line 325 to line 327. numWords is then used to create an array that can accommodate that BigInteger.
I don't understand the logic used for calculating numBits in line 325 and then the logic for numWords in line 326.
Logically, I think that for the string "123456789" numWords should be 1 and for "12345678912" numWords should be 2, but that's not always the case. For example, for "12345678912345678912" numWords should be 3, but it comes out to be 2.
Can anyone please explain the logic used in lines 325 and 326?

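Representing a decimal number with numDigits digits as a binary number requires about
numDigits * Math.log(10) / Math.log(2)
bits. The BigInteger source approximates this with integer arithmetic:
int numBits = (int)(((numDigits * bitsPerDigit[radix]) >>> 10) + 1);
In this calculation bitsPerDigit[10] is 3402, which is the number of bits per decimal digit scaled by 2^10 (1024) and rounded up; the >>> 10 divides the product by 1024 again:
Math.log(10) / Math.log(2) * Math.pow(2, 10) = 3401.6543691646593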
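As a rough illustration of the estimate above, the sketch below recomputes numBits for the three strings from the question. The constant 3402 is taken from the answer; the word formula (numBits + 31) >>> 5 is an assumption about what line 326 does (rounding up to whole 32-bit words), and the class and method names are made up for this example.

```java
import java.math.BigInteger;

// Hypothetical helper class; not part of the JDK.
class BitEstimate {
    // bitsPerDigit[10] as quoted above: log2(10) * 1024, rounded up
    static final long BITS_PER_DECIMAL_DIGIT = 3402;

    // Upper estimate of the bits needed for a decimal string of numDigits digits.
    static int estimateBits(int numDigits) {
        return (int) (((numDigits * BITS_PER_DECIMAL_DIGIT) >>> 10) + 1);
    }

    // Assumed word calculation: round the bit count up to whole 32-bit ints.
    static int estimateWords(int numBits) {
        return (numBits + 31) >>> 5;
    }

    public static void main(String[] args) {
        String[] samples = {"123456789", "12345678912", "12345678912345678912"};
        for (String s : samples) {
            int bits = estimateBits(s.length());
            int actualBits = new BigInteger(s).bitLength();
            System.out.println(s + ": estimated bits=" + bits
                    + ", estimated words=" + estimateWords(bits)
                    + ", actual bitLength=" + actualBits);
        }
    }
}
```

The estimate is an upper bound, so it can exceed what the value really needs: a 20-digit string is estimated at 3 words, but 12345678912345678912 is smaller than 2^64 and therefore fits in 2 words.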

In Java, BigIntegers are not stored as strings or as bytes holding one digit each. They are stored as an array of 32-bit integers, which together form the so-called magnitude of the BigInteger. There can be no leading zero integers (*), so the BigInteger is stored as compactly as possible.
The "words" mentioned are these 32-bit integers. They are not groups of 9 digits; they are used in full, so each bit counts.
So to get the number of bits, start with the number of 32-bit integers stored (the length of the internal array) times 32. But the top integer can still have leading zero bits, so you must get the number of leading zero bits of that top integer and subtract it from the product. In pseudo-code:
numBits = internalArray.length * 32 - numberOfLeadingZeroBits(internalArray[0]);
Note that the internal array is stored with the top integer at the lowest address (I have no idea why that is), so the top integer is at index 0 of the array.
(*) In reality, the above is a little more complicated, since the top item may be stored at an offset from the start of the array (probably to make certain calculations easier), but to understand the mechanism, you can pretend there are no extra integers.
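The pseudo-code above can be tried out on a hand-built array. This is only a sketch: the real magnitude array inside BigInteger is private, so the int[] here is a stand-in, and the class and method names are hypothetical. The result is compared against the public bitLength() method.

```java
import java.math.BigInteger;

// Hypothetical illustration; the real field inside BigInteger is private.
class MagnitudeBits {
    // Bit count of a big-endian magnitude with no leading zero ints.
    static int numBits(int[] mag) {
        if (mag.length == 0) {
            return 0; // zero is modelled as an empty magnitude in this sketch
        }
        return mag.length * 32 - Integer.numberOfLeadingZeros(mag[0]);
    }

    public static void main(String[] args) {
        // 2^32 + 5 as two 32-bit words, most significant word first
        int[] mag = {1, 5};
        BigInteger same = BigInteger.ONE.shiftLeft(32).add(BigInteger.valueOf(5));
        System.out.println(numBits(mag) + " vs bitLength() " + same.bitLength());
    }
}
```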

"Words" here doesn't mean words in the everyday sense; it refers to words as fixed-size blocks of memory.
https://en.wikipedia.org/wiki/Word_(computer_architecture)

Related

Floating point notation representation in java specification

Here: http://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.2.3
it says that:
The finite nonzero values of any floating-point value set can all be expressed in the form s · m · 2^(e - N + 1), where s is +1 or -1, m is a positive integer less than 2^N, and e is an integer between Emin = -(2^(K-1)-2) and Emax = 2^(K-1)-1, inclusive, and where N and K are parameters that depend on the value set.
and there is a table below:
Parameter float
N 24
K 8
So let's say N = 24 and K = 8 then we can have the following value from the formula:
s · 2^N · 2^(2^(K-1)-1 - N + 1) which gives us according to values specified in the table:
s * 2^24 * 2^(127 - 24) which is equal to s * 2^127. But a float has only 32 bits, so it's not possible to store such a big number in it.
So it's obvious that the initial formula should be read in a different way. How then?
Also in javadoc for Float max value: http://docs.oracle.com/javase/7/docs/api/java/lang/Float.html#MAX_VALUE
it says:
A constant holding the largest positive finite value of type float, (2-2^-23)·2^127
This also doesn't make sense, as the resulting value is much larger than 2^32, which is possibly the biggest value that can be stored in a float variable. So again, I'm misreading this notation. How should it be read?

The idea of floating-point notation is to store a much larger range of numbers than can be stored in the same space with an integer representation. So, for example, you say that the "resulting value is much larger than 2^32". But that would only be a problem if we were storing a plain binary integer, the way one computes in a typical math class.
Instead, floating-point representations break those 32 bits into two main parts:
- significand
- exponent
For simplicity, imagine that 3 bytes are used for the significand and 1 byte for the exponent. Also assume that each of these is a plain binary integer. So, the three bytes can hold values up to 2^24, or 2^23 if you want to keep one bit for the sign.
The other byte can store values up to 2^7 (if you want a sign there too).
So, you could express 500 · 2^100 by storing the 500 in the three bytes and the 100 in the one byte.
Essentially, one cannot store every number precisely. One converts the number into significand-and-exponent form and can store only as many significant digits as fit in the portion reserved for the significand (3 bytes in this example).
Rather than try to explain the complications, check this Wikipedia article for more.
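To connect this to the actual float type, here is a small sketch that pulls a float apart with Float.floatToIntBits. It assumes the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits with bias 127, 23 stored fraction bits), which is slightly different from the simplified 3-byte/1-byte picture above; the class and method names are made up. It also checks that both the Javadoc expression for MAX_VALUE and the JLS form m · 2^(e - N + 1) with m = 2^24 - 1 describe the same number.

```java
// Hypothetical demo class; names are illustrative only.
class FloatParts {
    // Unbiased exponent of a normal float (8-bit field, bias 127).
    static int exponentOf(float f) {
        int bits = Float.floatToIntBits(f);
        return ((bits >>> 23) & 0xFF) - 127;
    }

    // The 23 stored fraction bits (the leading 1 is implicit for normal values).
    static int fractionOf(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float max = Float.MAX_VALUE;
        System.out.println("exponent = " + exponentOf(max));
        System.out.println("fraction bits = 0x" + Integer.toHexString(fractionOf(max)));

        // Javadoc form: (2 - 2^-23) * 2^127
        double javadocForm = Math.scalb(2.0 - Math.pow(2, -23), 127);
        // JLS form with N = 24: m * 2^(e - N + 1), m = 2^24 - 1, e = 127
        double jlsForm = Math.scalb((double) ((1 << 24) - 1), 127 - 24 + 1);
        System.out.println(javadocForm == max && jlsForm == max);
    }
}
```

The point is that the 32 bits do not hold the value itself, only a 24-bit significand and a small exponent, which is why values near 2^128 are representable.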

Converting array of integers/decimal points to a single number

I was wondering if there is a way to take an array of integers (and possibly decimal points) and convert them into a single number (an int or a double)? For example, if I had the array {4, 1, ., 9}, is it possible to convert it into a double, 41.9? I am implementing a way to do it by iterating and multiplying by 10 to various powers, but I'm not sure if it will work out because of rounding errors and such (will it calculate, e.g., 9 * (10^-1) correctly all the time?).

I guess a quick and dirty solution is to convert your array of integers and points (in which case you probably have a char array like so: {'4','1','.','9'}) into a string, "41.9", and then use Double.parseDouble(s) to parse the string to a double. It won't be fast, though.
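A minimal sketch of that approach, assuming the input really is a char array of digits with at most one '.'; the class and method names are hypothetical:

```java
// Hypothetical helper; assumes the chars form a valid decimal number.
class CharsToDouble {
    static double parse(char[] chars) {
        // Build the string "41.9" from {'4','1','.','9'} and let the JDK parse it.
        return Double.parseDouble(new String(chars));
    }

    public static void main(String[] args) {
        char[] input = {'4', '1', '.', '9'};
        System.out.println(parse(input));
    }
}
```

Double.parseDouble throws a NumberFormatException if the array does not form a valid number, so malformed input is at least detected.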

This only works if the possible values in the array are very limited (for instance 0 to 9) and the number of elements is not that large.
The following algorithm does this:
static long toLong(int[] array, int radix) {
    long l = 0;
    for (int i = 0; i < array.length; i++) {
        l *= radix;     // shift the digits collected so far one place to the left
        l += array[i];  // append the next digit
    }
    return l;
}
// e.g. toLong(new int[] {5, 2, 3, 6, 7, 8}, 10) gives 523678
Here radix is the maximum possible value + 1 (10 for decimal digits).
Even then, however, a long has a maximum value of 2^63-1 or 9 223 372 036 854 775 807. That means that if the number of possible values per item is 10, you can only safely store 18 items.
A double won't solve the problem. Since doubles use floating-point semantics, they can only store about 15 to 17 significant decimal digits correctly. If you store more in a double, information will get lost: the least significant digits are rounded away, and how they are rounded depends on the more significant ones.
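To make both limits visible, here is a sketch of the same loop using Math.multiplyExact and Math.addExact, which throw an ArithmeticException instead of silently overflowing. This is a variation on the answer's loop, not the original code, and the names are made up. The main method also shows a long value that a double cannot hold exactly.

```java
// Hypothetical variant of the digit-accumulation loop with overflow checks.
class DigitsToLong {
    static long digitsToLong(int[] digits, int radix) {
        long value = 0;
        for (int d : digits) {
            // Both calls throw ArithmeticException if the result leaves the long range.
            value = Math.addExact(Math.multiplyExact(value, (long) radix), (long) d);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(digitsToLong(new int[] {5, 2, 3, 6, 7, 8}, 10));

        // 2^53 + 1 is the first positive integer a double cannot represent exactly.
        long big = (1L << 53) + 1;
        System.out.println(big + " as double and back: " + (long) (double) big);
    }
}
```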

Hamming Code: Number of parity bits

I'm trying to write a method in java that will take an input of any number of 0 or 1 digits and output that line after being encoded with Hamming Code.
I have managed to write the code when knowing the number of digits the input will have (in this case 16) because knowing the number of digits in the input, I immediately know the number of parity bits there have to be added (5 in this case) to a total of 21 digits in the final output. I am working with int arrays so I need to declare a size in the beginning and my code works based on those exact sizes.
Can you guys think of any way/algorithm that can give me the number of digits the output will have (after adding the relevant parity digits to the number of input digits) based solely on the number of input digits?
Or do I have to tackle this problem in a totally different way? Any suggestions? Thank you in advance!
Cheers!

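From my understanding, the number of parity bits grows roughly with the base-2 logarithm of the input length, so a first approximation is floor(lg(n)) + 1, which in Java you can get by using 32 - Integer.numberOfLeadingZeros(n). This matches your 16-bit case (5 parity bits), but it undercounts for some lengths: r parity bits are enough for n data bits only if 2^r >= n + r + 1, so 12 data bits, for example, already need 5 parity bits.
Assuming your input is made up entirely of 0s and 1s, the approximation would be
int parityDigits = 32 - Integer.numberOfLeadingZeros(input.length());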
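For the exact count, a small loop over the standard Hamming condition 2^r >= n + r + 1 is enough. The sketch below is one way to write it; the class and method names are hypothetical, and it only computes the sizes, not the encoding itself.

```java
// Hypothetical helper for sizing a Hamming-encoded array.
class HammingParity {
    // Smallest r such that 2^r >= dataBits + r + 1.
    static int parityBits(int dataBits) {
        int r = 0;
        while ((1L << r) < (long) dataBits + r + 1) {
            r++;
        }
        return r;
    }

    // Total length of the encoded output: data bits plus parity bits.
    static int encodedLength(int dataBits) {
        return dataBits + parityBits(dataBits);
    }

    public static void main(String[] args) {
        int n = 16;
        System.out.println(n + " data bits -> " + parityBits(n)
                + " parity bits, " + encodedLength(n) + " total");
    }
}
```

With this, the int array for the output can be allocated as new int[encodedLength(input.length())] before the encoding starts.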

Is your input a String or individual bits? If you input as a String, you can convert each character to a bit, and the length of the String gives you the length of the array.
If you need to input the bits one at a time, store them in an ArrayList. When all bits have been entered, you can convert your list to an array easily, or use the size of the list etc.

Why is my byte array displaying the wrong length?

BigInteger number = new BigInteger("7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450");
byte[] array = number.toByteArray();
System.out.println((int)array.length);
I was working on problem 8 of Project Euler, where the length of the number is supposed to be 1000, but whenever I run this program, I get 416. Could someone please explain to me why this isn't working?

One char doesn't mean one byte here. For example, the number 11 is 00001011 in binary, which can be represented by just 1 byte.
Similarly, in your case
7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450
is in binary
10110010001100111000001011110110000101110010001100111100000010000000000101001001010111001001101001000010101110100011010111001001000111101101011011110011001101111011101010000110110111010110010101110010000001011101011000001001000100110101110101111000101001100001010101011011000001101001110001110010010111110010010101101101101110110101111000101111010010110101101110001101111110110110001101101101101100011101000010110100011011101100100111000100100000000110111001011101011000111100100100101101100011111011111011000100110001100010001111110010101111100010001110100000100000111100001111010110100110101000010111100010100000010010001011111100001100000111111100000101101000101101011001110110000000011000111001110000000001001001011101010001000100010101001011110011001100110001100011101010101000010101010110001110110101100100001010101001100101101111000111000110000010010111001110000010011011111110011011110110001110111101011010100010010001101001100100111101010011011101100000000100111001010111111101101010011010000110111111100010110011101110100100011100000101001110100010111010111110001110010000110101111110010001010100011011000010001111110110010100101011000000010011001111000110010101111110100111111001001010110110100000101001001011100000101010001100100110100010011000111011011111100010000010000111010111111110110101000100101010111110111010000101110110010000000011000111111011001110110011111111001000011001101111101101100001011010001011101011110001011011110100101010000000011101001110110010000110011000100011100010000101101100110001110010001100101001101110000101101101001100101001011111110001001010110011001111001110010111000000001001101100001100010010011110111101011001010100100100001111101010111111010101011010110010010100000110001100100101111010010001100010111110011110111010101110101101111111101010110100110111010000110100101100101101010011001000101100000001101010011001010101101100110001010110001111000110001001100100111011110111111111001011100000111111100001100100011000111111010111001100010010010101001010
01100011110110110000101001111010101001011101000101011011011000010010000001000110001000100101000000110010000100101000101001101111010010011101010001110011001110000001011011001111100100110010101101011000101001111110011101101010001111000111101101110111001001111101001010000011001101000111110100000000100011101000101111101001111100101111101010000100011101100000110010010001001110010100101010101101000100111000001110100010011110110000100001111001001001010111101001111001100010101000110000101111100101110001110001000011001010001000001101111100010110001000111111010101110110100100111011100100010000011111100001100011001011110010111111100011111010010100100000111100110101011000010100011100100001000101011101110000011101010110100111101101110000110010011110101110110100011001101110111101010110100000010001011111110011000010111111111101101110011110010100101011100100111101000110100001011011011010101111100001101111010110011110111000000010101101000111000100101101010110010110110010100001000000110000110011100011000111101011010110011010000100000111000101101100111111101111110100110010010011001011001010110001110111011100110101010101011000110100100001000011101111011100111010001101101011111011111001010110111011101110000001110010011001101010000010110001100101101111011111011111000100010100001000011001100010101100010100100101011101111010
Now check how many bytes are required to represent this number: at 8 bits per byte, the binary string above fits in 416 bytes.
More generally, you can check this as follows:
A binary string of length N can represent numbers up to 2^N - 1.
For length 2: the largest binary string is 11 = 2^2 - 1 = 3 (in base 10).

This is because toByteArray saves the binary representation of the number, not a decimal one. You can think of each byte as representing a single digit in base 256. That's why the space required for the representation is less than half the number of decimal digits.
If you need to save each digit to a byte, convert your BigInteger to String: its length is going to equal the number of digits (plus one character for the minus character '-' if the number is negative).
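The difference between the two lengths can be seen with a short sketch. It uses 10^999 as a stand-in 1000-digit number instead of the one from the question, and the class and method names are made up:

```java
import java.math.BigInteger;

// Hypothetical comparison of byte length and decimal digit count.
class ByteVsDigitLength {
    // Number of bytes in the two's-complement binary representation.
    static int byteLength(BigInteger n) {
        return n.toByteArray().length;
    }

    // Number of decimal digits, ignoring a leading minus sign.
    static int digitLength(BigInteger n) {
        return n.abs().toString().length();
    }

    public static void main(String[] args) {
        BigInteger n = BigInteger.TEN.pow(999); // a 1 followed by 999 zeros
        System.out.println("decimal digits: " + digitLength(n));
        System.out.println("bytes: " + byteLength(n));
    }
}
```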

I don't know precisely how BigInteger stores values, but my guess would be that rather than storing them as a string, with one byte per digit, it stores them as one long number, with log_2(n) bits being used to store the number n, and therefore ceiling(log_2(n) / 8) bytes being used.

Because a byte array is a number in base 256 (since every digit can have the range 0-255 or 0x00-0xFF) while the input number is in base 10. When you convert your number into a byte array you obtain a number in a different base, which therefore has a different number of digits.
To prove it you can apply the change of base of logarithms:
logA(C) = logB(C) / logB(A)
log10(C) = log256(C) / log256(10)
1000 ~= 416 / log256(10)
1000 ~= 416 / (log2(10)/log2(256))
1000 ~= 416 / (3.3219/8)
1000 ~= 416 / 0.4152
1000 * 0.4152 ~= 416
415.2 ~= 416
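The same change-of-base arithmetic can be written as a small estimate function. This is only a sketch of the calculation above: it ignores the sign bit that toByteArray also stores and the size of the leading digit, so it is an approximation rather than an exact prediction, and the names are hypothetical.

```java
// Hypothetical estimator: decimal digits -> approximate byte count.
class ByteEstimate {
    static long estimateBytes(int decimalDigits) {
        // log256(10) = log(10) / log(256), about 0.4152 bytes per decimal digit
        double bytesPerDigit = Math.log(10) / Math.log(256);
        return (long) Math.ceil(decimalDigits * bytesPerDigit);
    }

    public static void main(String[] args) {
        System.out.println(estimateBytes(1000));
    }
}
```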

Padding a number with zeroes

Let's say I have the number 345 and I want to end up with 0345, i.e.
int i = 0345;
How can I take an existing number and shift it along or prepend a 0?

I know you are talking about an int, but maybe what you want is to pad a number with leading 0s. A quick way is with the String.format static method.
int num = 345;
String.format("%04d", num);
would return:
"0345"
The %04d tells it to pad with 0s on the left if the number has fewer than 4 digits, so you can change the 4 to a 5 and it would give you:
"00345"

Using a 0 on the start of the number when declaring it means it's octal, so 0345 is actually 229 in decimal. I'm not sure how you expect to add a zero to a number using bitwise operations, which work on the binary representation of the number. If you want to add it to the decimal representation, it won't mean anything, since the number is always stored in binary, and the value is converted for your convenience to decimal when being displayed. When doing any computations, the decimal value is not important, only the binary one.
If you're interested only in displaying the value with a 0 at the start, then you could append the 0 to a String containing that number which can be easily done like this "0" + i.
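The points from both answers can be checked with a short sketch: padding through String.format, prepending "0" by concatenation, and the octal meaning of a literal with a leading 0. The class and method names are made up for this example.

```java
// Hypothetical demo of zero padding versus octal literals.
class ZeroPad {
    // Pads value with leading zeros up to the given width.
    static String pad(int value, int width) {
        return String.format("%0" + width + "d", value);
    }

    // Simple concatenation: always adds exactly one leading zero.
    static String prependZero(int value) {
        return "0" + value;
    }

    public static void main(String[] args) {
        System.out.println(pad(345, 4));      // padded to 4 characters
        System.out.println(pad(345, 5));      // padded to 5 characters
        System.out.println(prependZero(345)); // "0" + i
        int octal = 0345;                     // leading 0 makes this an octal literal
        System.out.println(octal);            // printed in decimal
    }
}
```

Padding only affects the string form; the int itself has no notion of leading zeros.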
