BitCount method - Java

Can anyone explain this bitCount method?
public static int bitCount(int i) {
    // Hacker's Delight, Figure 5-2
    i -= (i >> 1) & 0x55555555;
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    i = ((i >> 4) + i) & 0x0F0F0F0F;
    i += i >> 8;
    i += i >> 16;
    return i & 0x0000003F;
}

It's based on three observations:
The bitcount of a single bit is that bit itself.
The bitcount of the concatenation of two bitstrings is the sum of their bitcounts.
The bitcount of any string takes no more bits than that string itself.
The first two points together give you a simple recursive algorithm: split the string into two halves, recurse on both, and return the sum. The base case is a single bit, which you return as-is. Simple so far.
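As a minimal sketch of that recursion (a hypothetical helper for illustration, not the code being asked about):

// Count the 1-bits in the low 'width' bits of x by splitting the string in half.
static int bitCountRecursive(int x, int width) {
    if (width == 1) return x & 1; // base case: a single bit is its own count
    int half = width / 2;
    int low = x & ((1 << half) - 1);                       // lower half of the string
    int high = (x >>> half) & ((1 << (width - half)) - 1); // upper half of the string
    return bitCountRecursive(low, half) + bitCountRecursive(high, width - half);
}

Calling bitCountRecursive(i, 32) on an int gives the same count the optimized method computes.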
The third observation is very important: it means that if you replace each substring with its bitcount, the count always fits in the space that substring occupied. It follows that if you give every count twice as much space (by separating the odd and the even groups), you can sum them and there will be no carry from one group into the next. Then you can rewrite the recursion in this iterative form:
i = (i & 0x55555555) + ((i >> 1) & 0x55555555); // sum groups of 1
i = (i & 0x33333333) + ((i >> 2) & 0x33333333); // sum groups of 2
i = (i & 0x0f0f0f0f) + ((i >> 4) & 0x0f0f0f0f); // sum groups of 4
...
And so on. What happens here is that on the left side of the + we take the even groups (0th, 2nd, 4th, etc.) and on the right side we take the odd groups and align them with their corresponding even groups. For example, summing groups of 2:
[10 01 00 01 00 01 10 10] // note that 11 cannot occur
split
even: [0001 0001 0001 0010]
odd: [1000 0000 0000 1000]
align: [0010 0000 0000 0010]
sum: [0011 0001 0001 0100]
Then Hacker's Delight uses various tricks to optimize some operations away. For example, groups of 4 can be summed using only the masking at the end, because each count is at most 4, so summing two directly gives at most 8, which still fits in the 4 bits available to it.

Why don't you add some logging code to display i in binary at each step and see if you can work out what's happening?
Or reduce it to a smaller number of bits (8, say) and work through it on paper.
It'll give you a much better feel for the code than someone simply explaining it to you.
This page may help.
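For example, a minimal sketch of that logging (assuming we simply reprint i after every step of the method above):

public class BitCountTrace {
    // pad to 32 bits so the bit groups line up
    static String bin(int i) {
        return String.format("%32s", Integer.toBinaryString(i)).replace(' ', '0');
    }

    public static void main(String[] args) {
        int i = 0xDEADBEEF; // any test value will do
        System.out.println("start  " + bin(i));
        i -= (i >> 1) & 0x55555555;
        System.out.println("step 1 " + bin(i)); // counts in 2-bit groups
        i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
        System.out.println("step 2 " + bin(i)); // counts in 4-bit groups
        i = ((i >> 4) + i) & 0x0F0F0F0F;
        System.out.println("step 3 " + bin(i)); // counts in 8-bit groups
        i += i >> 8;
        System.out.println("step 4 " + bin(i));
        i += i >> 16;
        System.out.println("step 5 " + bin(i));
        System.out.println("count  " + (i & 0x0000003F));
    }
}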

This algorithm dates back at least to HAKMEM item 169:
LDB B,[014300,,A] ;or MOVE B,A then LSH B,-1
AND B,[333333,,333333]
SUB A,B
LSH B,-1
AND B,[333333,,333333]
SUBB A,B ;each octal digit is replaced by number of 1's in it
LSH B,-3
ADD A,B
AND A,[070707,,070707]
IDIVI A,77 ;casting out 63.'s
These ten instructions, with constants extended, would work on word lengths up to 62.; eleven suffice up to 254..
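For comparison, here is one common adaptation of the same trick to Java's 32-bit int (a sketch, not the original PDP-10 code): the octal masks mirror the ones above, and since the masked value can have the sign bit set, the final "casting out 63s" uses Integer.remainderUnsigned.

// Each octal digit of t becomes the number of 1-bits in that digit of n.
static int hakmemBitCount(int n) {
    int t = n - ((n >>> 1) & 033333333333) - ((n >>> 2) & 011111111111);
    // Sum adjacent octal digits into 6-bit groups, then reduce mod 63
    // (this works because 64 is congruent to 1 modulo 63).
    return Integer.remainderUnsigned((t + (t >>> 3)) & 030707070707, 63);
}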

Related

Concatenation of Consecutive Binary Numbers: modulo operation behavior

Here I am contributing my Java solution for the problem.
Concatenation of Consecutive Binary Numbers:
Given an integer n, return the decimal value of the binary string formed by concatenating the binary representations of 1 to n in order, modulo 10^9 + 7.
class Solution {
    public int concatenatedBinary(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum = ((sum << Integer.toBinaryString(i).length()) + i) % 1000000007;
        }
        return (int) sum;
    }
}
Now I have a doubt: we take the modulo at each step within the for loop. It doesn't change anything until sum reaches 1000000007, but after that it changes the sum variable, and this cycle repeats. Why doesn't this modulo affect the overall result? Thanks in advance.
Let's take a simpler problem: Take the number 1000, write it as bits, then take the number 1001, write that as bits, concatenate the two, what's that, in decimal?
1000 in bits is 11 1110 1000
1001 in bits is 11 1110 1001
Thus, the answer'd be 1111 1010 0011 1110 1001, or 1025001.
But, let's do a more mathy take on this: "concatenate the two" boils down to: "Shift the bits in the first number to the left to make enough room, then add the second number". And "shift left by X" is the same as "multiply by 2^X". Just like if I have the number '1234', and I tell you to 'shift that left by 2 spots', it's the same as multiplying by 100: That turns it into 123400, which is 1234 * 100, and 100 is just 10^2. So, 'shift left by X spots' is the same as 'multiply by b^X', where b is the 'base' of the number system we use; 2 in binary, 10 in decimal.
Thus, a different way to state the same result is: 'Take the number 1000, multiply it by 2^10, add 1001 to it.' Sure enough: 1000 * 2^10 + 1001 is indeed 1025001.
Thus, a single 'loop' in your algorithm is effectively: take the result we have so far, multiply it by 2 a bunch of times (X times, where X is the bit length of the number we're processing this loop), then add the new number.
So, it's just multiplication and addition.
Modulo has the property that it is stable for those operations.
Consider basic math: You were probably taught about the number line. A horizontal line, infinite in size.
A modulo system is no different, except the number line is a big loop. It's a circle. In modulo 1000000007 space, the numbers '5 and 6' are just as adjacent as the numbers '0 and 1000000006' are.
Given, on the normal number line, a * b = c, modulo has the property that this also means that (a%Z * b%Z)%Z = c%Z for any Z. The same goes for addition; if a + b = c, then (a%Z + b%Z)%Z = c%Z is also true. You can try a bunch of numbers and witness this, or try to prove this yourself, or search the web for proof of this property.
Example:
12 * 18 = 216
(12%7 * 18%7)%7 = 216%7
Yup, that checks out:
5 * 4 = 20
20%7 = 6.
216%7 is also 6.
Thus:
Your algorithm boils down to a lot of applications of multiplication and addition.
Multiplication and addition translate to modulo math without issue.
Therefore, your algorithm works.
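A quick sketch that checks the property for one pair of values (arbitrary test numbers, using long so the direct product doesn't overflow):

public class ModuloStability {
    public static void main(String[] args) {
        long a = 1_500_000_000L, b = 1_500_000_000L, z = 1_000_000_007L;
        long direct = (a * b) % z;               // reduce once at the end
        long stepwise = ((a % z) * (b % z)) % z; // reduce at every step
        System.out.println(direct == stepwise);  // true
    }
}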

The method add mod 2^512

It's addition modulo 2^512. Could you explain to me why we do >> 8 and then & 0xFF here?
I know I'm bad at math.
int[] AddModulo512(int[] a, int[] b)
{
    int t = 0;
    int[] result = new int[a.length];
    for (int i = 63; i >= 0; i--)
    {
        t = a[i] + b[i] + (t >> 8);
        result[i] = t & 0xFF; //?
    }
    return result;
}
The mathematical effect of a bitwise right shift (>>) on an integer is to divide by two (truncating any remainder). By shifting right 8 times, you divide by 2^8, or 256.
The bitwise & with 0xFF keeps only the lowest byte of the result, a range of 0-255.
Each individual step therefore works modulo 256; the 2^512 in the name applies to the 64-element number as a whole, as explained below.
It looks like you have 64 ints in each array, but your math is modulo 2^512. 512 divided by 64 is 8, so you are only using the least significant 8 bits in each int.
Here, t is used to store an intermediate result that may be more than 8 bits long.
In the first loop, t is 0, so it doesn't figure in the addition in the first statement; there's nothing to carry yet. But the addition may produce a value that needs more than 8 bits to store. So the second line keeps only the least significant 8 bits to store in the current result element, and t itself is left intact for the next iteration.
What does the previous value of t do in the next iteration? It functions as the carry in the addition. Bit-shifting it right 8 positions turns any bits beyond the first 8 of the previous loop's result into a carry into the current position.
Example, with just 2-element arrays, to illustrate the carrying:
[1, 255] + [1, 255]
First loop:
t = 255 + 255 + (0) = 510; // 1 11111110
result[i] = 510 & 0xFF = 254; // 11111110
The & 0xFF here takes only the least significant 8 bits. In the analogy with normal math, 9 + 9 = 18, but in an addition problem with many digits, we say "8 carry the 1". The bitmask here performs the same function as extracting the "8" out of 18.
Second loop:
// 1 11111110 >> 8 yields 0 00000001
t = 1 + 1 + (510 >> 8) = 1 + 1 + 1 = 3; // The 1 from above is carried here.
result[i] = 3 & 0xFF = 3;
The >> 8 extracts the possible carry amount. In the analogy with normal math, 9 + 9 = 18, but in an addition problem with many digits, we say "8 carry the 1". The bit shift here performs the same function as extracting the "1" out of 18.
The result is [3, 254].
Notice how any carry left over from the last iteration (i == 0) is ignored. This implements the modulo 2^512: any carry out of the last iteration would represent 2^512 and is simply discarded.
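Here is the same carry mechanics as a runnable sketch (a hypothetical base-256 helper, generalized to any array length so the 2-element example above can be fed in directly):

public class Base256Add {
    // Digit-by-digit addition, least significant digit last; carry rides in t's high bits.
    static int[] add(int[] a, int[] b) {
        int[] result = new int[a.length];
        int t = 0;
        for (int i = a.length - 1; i >= 0; i--) {
            t = a[i] + b[i] + (t >> 8); // previous t's bits beyond 8 are the carry
            result[i] = t & 0xFF;       // keep the low 8 bits as this digit
        }
        return result; // any final carry is discarded: modulo 256^length
    }

    public static void main(String[] args) {
        int[] sum = add(new int[]{1, 255}, new int[]{1, 255});
        System.out.println(sum[0] + ", " + sum[1]); // 3, 254
    }
}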
>> is a bitwise shift.
The signed left shift operator "<<" shifts a bit pattern to the left,
and the signed right shift operator ">>" shifts a bit pattern to the
right. The bit pattern is given by the left-hand operand, and the
number of positions to shift by the right-hand operand. The unsigned
right shift operator ">>>" shifts a zero into the leftmost position,
while the leftmost position after ">>" depends on sign extension.
& is a bitwise and
The bitwise & operator performs a bitwise AND operation.
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/op3.html
http://www.tutorialspoint.com/java/java_bitwise_operators_examples.htm
>> is the bitshift operator
0xFF is the hexadecimal literal for 255.
I think your question misses a very important part, the data format, i.e. how data are stored in a[] and b[]. To answer it, I make some assumptions:
Since it's modulo arithmetic, a, b < 2^512. Thus, a and b have at most 512 bits.
Since a and b have 64 elements each, only the 8 right-most bits of each element are used. In other words, a[i], b[i] < 256.
Then what remains is very straightforward: just consider each a[i] and b[i] as one digit (each digit is 8 bits, so the base is 256) and perform the addition digit-by-digit from right to left.
t is the carry variable, which stores the result (with carry) of the addition at the previous digit. t >> 8 throws away the right-most 8 bits, which were already used for the previous digit, and leaves the carry for the current addition. (t & 0xFF) takes the right-most 8 bits of t, which form the current digit.
Since it's modulo addition, the final carry is thrown away.

Why Integer.toBinaryString returns 32 bits if the argument is negative?

I was just messing around the Integer.toBinaryString(int) method.
When I pass a positive number, say 7, it outputs 111, but when I pass negative 7, it outputs 11111111111111111111111111111001. I understand that Java uses two's complement to represent negative numbers, but why 32 bits (I also know that an int is 32 bits long, but that alone doesn't explain it)?
Ok so I did some digging...
I wrote up a little program probably close to what you did.
public class IntTest {
    public static void main(String[] args) {
        int a = 7;
        int b = -7;
        System.out.println(Integer.toBinaryString(a));
        System.out.println(Integer.toBinaryString(b));
    }
}
My output:
111
11111111111111111111111111111001
So 111 is the same as if it had 29 "0"s in front of it. Printing those would just be wasted space and time.
If we follow the instructions for two's complement from this guy here, you can see that what we must do is flip the bits (zeros become ones and ones become zeros), then add 1 to the result.
So 0000 0000 0000 0000 0000 0000 0000 0111 becomes 1111 1111 1111 1111 1111 1111 1111 1001
The ones cannot be thrown out, because they are significant in the two's complement representation. This is why you get 32 bits in the second case.
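You can check the flip-then-add-one rule directly (any test value works):

public class TwosComplementCheck {
    public static void main(String[] args) {
        int a = 7; // any test value
        System.out.println(Integer.toBinaryString(~a));     // flipped bits
        System.out.println(Integer.toBinaryString(~a + 1)); // flipped bits plus one
        System.out.println(Integer.toBinaryString(-a));     // identical to the previous line
    }
}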
Hope this helps! -- Code on!!!
Because Java ints are signed 32-bit. If you use a negative number the first bit must be 1.
System.out.println(Integer.toBinaryString(0));
System.out.println(Integer.toBinaryString(Integer.MAX_VALUE)); // 31 bits
System.out.println(Integer.toBinaryString(Integer.MAX_VALUE - 1)); // 31 bits
System.out.println(Integer.toBinaryString(Integer.MAX_VALUE + 1)); // 32 bits
System.out.println(Integer.SIZE);
Output is
0
1111111111111111111111111111111
1111111111111111111111111111110
10000000000000000000000000000000
32
Note that Integer.MAX_VALUE + 1 is Integer.MIN_VALUE (and it has an extra bit).
It outputs the smallest number of digits it can, stripping leading zeros. For a negative number, the first of the 32 bits is the sign bit and is always 1 (for example, -1 is all 32 ones). Since the sign bit is significant, it has to output all 32 bits.
Here's a cool semi-relevant example of using the sign bit and the unsigned shift operator :). If you do:
int x = {positive_value};
int y = {other_positive_value};
int avg = (x + y) >>> 1;
The x and y integers can both use the first 31 bits, since the 32nd bit is the sign bit. This way, if their sum overflows, it overflows into the sign bit and makes the value negative. The >>> is the unsigned shift operator, which shifts the value back one bit to the right; that is effectively a divide-by-two-and-floor operation, which gives the proper average.
If you, on the other hand, had done:
int x = {value};
int y = {other_value};
int avg = (x + y) / 2;
And you had gotten an overflow, you would end up with the wrong result as you'd be dividing a negative value by 2.
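A quick sketch showing both versions on values large enough to overflow (arbitrary test numbers):

public class AverageOverflow {
    public static void main(String[] args) {
        int x = 2_000_000_000;
        int y = 2_000_000_000;
        // x + y wraps around to a negative int, but >>> treats it as unsigned:
        System.out.println((x + y) >>> 1); // 2000000000, the correct average
        System.out.println((x + y) / 2);   // wrong: signed division of the wrapped value
    }
}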

Create and fill a variable-sized binary truth table

I am trying to solve the 0/1 knapsack problem through brute force. The simplest (it seems) way to do it would be to set up a 2D matrix of 1s and 0s signifying present and not present in the knapsack, respectively. The number of items gives the columns, so the number of rows should be 2^numOfItems. But since the number of items isn't constant, I can't think of how to fill the matrix. I was told that bit-shifting would work, but I do not understand how that works. Can someone point me in the right direction?
EDIT: by truth table I mean the 'A' part of one of these: http://www.johnloomis.org/ece314/notes/devices/binary_to_BCD/bcd03.png
You don't have to store all the bit sequences in a matrix, it's unnecessary and will waste way too much memory. You can simply use an integer to denote the current set. The integer will go from 0 to 2^n-1 where n is the number of elements that you can choose from. Here's the basic idea.
int max = 1 << n; // 2^n possible subsets
for (int set = 0; set < max; set++)
{
    for (int e = 0; e < n; e++)
    {
        if ((set & (1 << e)) != 0)
        {
            // eth bit is 1: the eth item is in our set
        }
        else
        {
            // eth bit is 0: the eth element will not be put in the knapsack
        }
    }
}
The algorithm relies on logical left bit shifting. (1 << n) means that we shift 1 n positions to the left, padding zeros on the right side of the number. So, for example, if we represent 1 as an 8-bit number 00000001, then (1 << 1) == 00000010, (1 << 2) == 00000100, etc.

The bitwise-and operator takes two arguments and "ands" every pair of bits with the same index: bit 0 with bit 0, bit 1 with bit 1, and so on. The output bit is 1 if and only if both input bits are 1, otherwise it's 0. Why is this useful? We need it to test bits. For example, assume we have some set represented as a bit-string and we want to determine whether the ith bit is one or zero. We can do that with a shift-left operation followed by a bitwise-and.
Example
Set = 00101000
we want to test Set(3) (remember that the rightmost bit is bit 0)
We can do that by shifting 1 3 places to the left, so it becomes 00001000. Then we "and" the shifted 1 with the set
00101000
&
00001000
---------
00001000
As you can see, if the bit I am testing is a 1, then the output of the & will be non zero, otherwise it'll be zero.
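Putting it together, a minimal runnable sketch of the enumeration (the weights array is a hypothetical example input):

public class SubsetEnumeration {
    public static void main(String[] args) {
        int[] weights = {2, 3, 5}; // hypothetical item weights
        int n = weights.length;
        for (int set = 0; set < (1 << n); set++) {
            int total = 0;
            StringBuilder items = new StringBuilder();
            for (int e = 0; e < n; e++) {
                if ((set & (1 << e)) != 0) { // eth bit set: item e is in this subset
                    total += weights[e];
                    items.append(e).append(' ');
                }
            }
            System.out.println("set {" + items.toString().trim() + "} weight " + total);
        }
    }
}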

What is the most programmatically efficient way to determine if a number is a power of 2?

Just a simple boolean true/false check that is very efficient would be nice. Should I use recursion, or is there some better way to determine it?
From here:
Determining if an integer is a power of 2
unsigned int v; // we want to see if v is a power of 2
bool f; // the result goes here
f = (v & (v - 1)) == 0;
Note that 0 is incorrectly considered a power of 2 here. To remedy
this, use:
f = v && !(v & (v - 1));
Why does this work? An integer power of two only ever has a single bit set. Subtracting 1 has the effect of changing that bit to a zero and all the bits below it to one. AND'ing that with the original number will always result in all zeros.
An integer power of two (greater than 1) is a 1 followed by one or more zeros, i.e.:
value (binary)   value   value-1 (binary)
10               2       1
100              4       11
1000             8       111
10000            16      1111
As Mitch said,
(value & (value - 1)) == 0
when value is a power of 2. The test also passes for 0, which is not a power of 2, and for 1, but 1 is normally regarded as 2 raised to the power of zero, so that case is correct.
For Mitch's solution, here are some numbers > 0 that are not powers of 2:
value     value-1   value & (value-1)
1000001   1000000   1000000
1000010   1000001   1000000
1000100   1000011   1000000
1001000   1000111   1000000
1000011   1000010   1000010
1000101   1000100   1000100
1000111   1000110   1000110
The result is never zero.
Subtracting 1 from a number flips the bits up to and including the lowest 1. For powers of two there is only one '1', so value & (value - 1) == 0; for other numbers, the second and subsequent 1s are left unaffected.
Zero will need to be excluded.
Another possible solution (probably slightly slower) is
A & (-A) == A
Powers of 2:
A -A
00001 & 11111 = 00001
00010 & 11110 = 00010
00100 & 11100 = 00100
Some other numbers:
A -A
00011 & 11101 = 00001
00101 & 11011 = 00001
Again you need to exclude 0 as well (and note that Integer.MIN_VALUE also passes this test, since -Integer.MIN_VALUE == Integer.MIN_VALUE).
To solve this problem, I did the following:
Write the number in binary; you will see that a power of 2 has only a single one in it.
Fiddle with the various bitwise/boolean operators at the bit level and see what works.
Doing this, I found that the following also work:
A & (-A) == A
(~A) | (~A + 1) == -1 /* the complement form of A & (A - 1) == 0 */
Not sure whether you mean efficient in terms of computation speed, or in terms of lines of code. But you could try value == Integer.highestOneBit(value). Don't forget to exclude zero if you need to.
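Putting the pieces together, a minimal sketch of the favoured test, with the zero case excluded as discussed above:

// (v & (v - 1)) == 0 clears the lowest set bit; a power of two has exactly one.
// v > 0 excludes zero (and negative values such as Integer.MIN_VALUE).
static boolean isPowerOfTwo(int v) {
    return v > 0 && (v & (v - 1)) == 0;
}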
