Constrained counting sets on binary codes

Constrained counting sets on binary codes - java

Problem formulation: Given a binary code of length l, for every t bits set (l%t=0), if there exists at least one bit of value 1, we add 1 to the result.
My question : How to efficiently get the final result?
For example, we have a binary code 010 110 000, and t=3. Then the final result is 2. Since for 000, there is no bit of value 1. For 110, there exists at least one bit of value 1. We add 1 to the result. For 010, there also exists one bit of value 1. We add 1 to the result. Thus, the final result is 2.
My question is that how to efficiently solve this problem without scanning through every t bits, which causes a time complexity linear with the length of the binary code.
For the counting set problem (calculate how many 1's in a binary code), there are some algorithms which take constant time to solve it by taking a limited number of mask and shift operations such as the MIT HAKMEM Count algorithm.
However, existing algorithms for the traditional counting set problem cannot be used to solve my problem.
Does anyone know some tricks for my problems? You can assume a maximum length of the input binary code if it makes the problem easier to solve.

From comment:
Then you can assume that the input is a binary code of 64-bit integer for this problem.
Here are two different ways to do it.
The first run in O(I/t) and works by testing each set for all 0-bits.
public static int countSets(int setSize, long bits) {
Long mask = (1L << setSize) - 1;
int count = 0;
for (long b = bits; b != 0; b >>>= setSize)
if ((b & mask) != 0)
count++;
return count;
}
The second run in O(log I) and works by collapsing the bits of a set to the right-most bit of the set, then counting bits set.
public static int countSets(int setSize, int bitCount, long bits) {
long b = bits, mask = 1;
for (int i = 1; i < setSize; i++)
b |= bits >>> i;
for (int i = setSize; i < bitCount; i <<= 1)
mask |= mask << i;
return Long.bitCount(b & mask);
}
Further explanation
The first method builds a mask for the set, e.g. with t = 3, the mask is 111.
It then shifts the value to the right, t bits at a time, e.g. with input = 010 110 000 and t = 3, you get:
mask = 111
b = 010 110 000 -> b & mask = 000 -> don't count
b = 010 110 -> b & mask = 110 -> count
b = 010 -> b & mask = 010 -> count
result: 2
The second method first merge bits to the right-most bit of the sets, e.g. with input = 010 110 000 and t = 3, you get:
bits = 010 110 000
bits >> 1 = 001 011 000
bits >> 2 = 000 101 100
b (OR'd) = 011 111 100
It then builds a mask for only checking right-most bits, and then applies mask and counts the bits set:
mask = 001 001 001
b & mask = 001 001 000
result: 2

Related

Bitwise operations to remove specific bits expressed as a mask

As part of a serial data protocol decoder, I must decode data that has sync bits inserted (bits that are not part of the data and are always '1'). I need to remove the sync bits and assemble the data by shifting the remaining bits left. Each 32-bit word has a different pattern of sync bits. I know what the patterns are, but I cannot come up with a generalized was of removing the sync bits.
For example, I might have a bit pattern like this (just showing 12 bits for example):
0 1 1 1 1 0 0 1 1 0 1 1
I know that some of those bits are sync bits, specifically those that are '1' in this mask:
0 0 1 1 0 0 0 0 1 0 0 1
The resulting data should be those data bits with a '0' in the corresponding mask, shifted to remove the sync bits, padded right with zeros. The mask above could be understood as "take first 2 bits, skip next 2 bits, take next 4 bits, skip next bit, take 2 bits, skip 1 bit".
E.g I should end up with:
0 1 1 0 0 1 0 1 0 0 0 0
Trying to do this in Java but I don't see any bit mask/shift operations that would make this work.
Best method I could come up with (does not left align the results, but that is OK for my purposes):
private static final int MSB_ONLY = 0x80000000;
private static int squeezeBits(int data, int mask) {
int v = 0;
for (int i=0; i<32; i++) {
if ((mask & MSB_ONLY) != MSB_ONLY) {
// There is a 0 in the mask, so we want this data bit
v = (v << 1) | ((data & MSB_ONLY) >>> 31);
}
else {
// Throw bit away
}
mask = mask << 1;
data = data << 1;
}
return v;
}

Use of << and >>> in a hash function

I have a project in C where I need to created a suitable hash function for void pointers which could contain alphanumeric chars, ints or just plain ol' chars.
I need to use a polynomial hash fuction, where instead of multiplying by a constant, I should use a cyclic shift of partial sums by a fixed number of bits.
In this page here
, there's the java code(I assume this is java because of the use of strings):
static int hashCode(String s) {
int h = 0;
for (int i = 0; i < s.length(); i++) {
h = (h << 5) | (h >>> 27); // 5-bit cyclic shift of the running sum
h += (int) s.charAt(i); // add in next character
}
return h;
}
What exactly is this line, below, doing?
h = (h << 5) | (h >>> 27); // 5-bit cyclic shift of the running sum
Yes, the comment says 5bit cyclic shift, but how does the <<, | and >>> operands work in this regard? I've never seen or used any of them before.

As it says, it's a 5-bit cyclic left shift. This means that all the bits are shifted left, with the bit "shifted off" added to the right side, five times.
The code replaces the value of h with the value of two bit patterns ORed together. The first bit pattern is the original value shifted left 5 bits. The second value is the original value shifted right 27 bits.
The left shift of 5 bits puts all the bits but the leftmost five in their final position. The leftmost 5 bits get "shifted out" by that shift and replaced with zeroes as the rightmost bits of the output. The right shift of 27 bits put the leftmost five bits in their final position as the rightmost bits, shifting in zeroes for the leftmost 27 bits. ORing them together produces the desired output.
The >>> is Java's unsigned shift operation. In C or C++, you'd just use >>.

The method add mod 2^512

It's add modulo 2^512. Could you explain me why we doing here >>8 and then &oxFF?
I know i'm bad in math.
int AddModulo512(int []a, int []b)
{
int i = 0, t = 0;
int [] result = new int [a.length];
for(i = 63; i >= 0; i--)
{
t = (a[i]) + (int) (b[i]) + (t >> 8);
result[i] = (t & 0xFF); //?
}
return result;
}

The mathematical effect of a bitwise shift right (>>) on an integer is to divide by two (truncating any remainder). By shifting right 8 times, you divide by 2^8, or 256.
The bitwise & with 0xFF means that the result will be limited to the first byte, or a range of 0-255.
Not sure why it references modulo 512 when it actually divides by 256.

It looks like you have 64 ints in each array, but your math is modulo 2^512. 512 divided by 64 is 8, so you are only using the least significant 8 bits in each int.
Here, t is used to store an intermediate result that may be more than 8 bits long.
In the first loop, t is 0, so it doesn't figure in the addition in the first statement. There's nothing to carry yet. But the addition may result in a value that needs more than 8 bits to store. So, the second line masks out the least significant 8 bits to store in the current result array. The result is left intact to the next loop.
What does the previous value of t do in the next iteration? It functions as a carry in the addition. Bit-shifting it to the right 8 positions makes any bits beyond 8 in the previous loop's result into a carry into the current position.
Example, with just 2-element arrays, to illustrate the carrying:
[1, 255] + [1, 255]
First loop:
t = 255 + 255 + (0) = 510; // 1 11111110
result[i] = 510 & 0xFF = 254; // 11111110
The & 0xFF here takes only the least significant 8 bits. In the analogy with normal math, 9 + 9 = 18, but in an addition problem with many digits, we say "8 carry the 1". The bitmask here performs the same function as extracting the "8" out of 18.
Second loop:
// 1 11111110 >> 8 yields 0 00000001
t = 1 + 1 + (510 >> 8) = 1 + 1 + 1 = 3; // The 1 from above is carried here.
result[i] = 3 & 0xFF = 3;
The >> 8 extracts the possible carry amount. In the analogy with normal math, 9 + 9 = 18, but in an addition problem with many digits, we say "8 carry the 1". The bit shift here performs the same function as extracting the "1" out of 18.
The result is [3, 254].
Notice how any carry leftover from the last iteration (i == 0) is ignored. This implements the modulo 2^512. Any carryover from the last iteration represents 2^512 and is ignored.

>> is a bitwise shift.
The signed left shift operator "<<" shifts a bit pattern to the left,
and the signed right shift operator ">>" shifts a bit pattern to the
right. The bit pattern is given by the left-hand operand, and the
number of positions to shift by the right-hand operand. The unsigned
right shift operator ">>>" shifts a zero into the leftmost position,
while the leftmost position after ">>" depends on sign extension.
& is a bitwise and
The bitwise & operator performs a bitwise AND operation.
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/op3.html
http://www.tutorialspoint.com/java/java_bitwise_operators_examples.htm

>> is the bitshift operator
0xFF is the hexadecimal literal for 255.

I think your question misses a very important part, the data format, i.e. how data are stored in a[] and b[]. To solve this question, I make some assumptions:
Since it's modulo arithmetic, a, b <= 2^512. Thus, a and b have 512 bits.
Since a and b have 64 elements, only 8 right-most bits of each elements are used. In other words, a[i], b[i] <= 256.
Then, what remains is very straightforward. Just consider each a[i] and b[i] as a digit (each digit is 8-bit) in a base 2^512 addition and then perform addition by adding digit-by-digit from right-to-left.
t is the carry variable which stores the value (with carry) of the addition at the last digit. t>>8 throws a way the right-most 8 bits that has been used for the last addition which is used as carry for the current addition. (t & 0xFF) gets the right-most 8 bits of t which is used for the current digit.
Since it's modulo addition, the final carry is thrown away.

Implementation logic behind adding two numbers without using +?

I found this code online. But, I am unable to get the logic behind the following code:
public static int add(int a, int b) {
if (b == 0) return a;
int sum = a ^ b; // add without carrying
System.out.println("sum is : "+sum);
int carry = (a & b) << 1; // carry, but don’t add
return add(sum, carry); // recurse
}

Let's look at an example (using 8 bits for simplicity)
a = 10010110
b = 00111101
a^b is the xor, which gives 1 for places where there is a 1 in one number and 0 in the other. In our example:
a^b = 10101011
Since 0 + 0 = 0, 0 + 1 = 1 and 1 + 0 = 1, the only columns left to deal with are the ones that have a 1 in both of the numbers. In our example, a^b is short by whatever the answer to
00010100
+ 00010100
is. In binary, 1 + 1 = 10, so the answer to the above sum is
00101000
or (a & b) << 1. Therefore the sum of a^b and (a & b) << 1 is the same as a + b.
So, assuming the process is guaranteed to terminate, the answer will be correct. But the process will terminate because each time we call sum recursively the second parameter has at least one more 0 at the end, due to the bit shift <<. Therefore, we are guaranteed to eventually end up with the second argument consisting entirely of 0s, so that the line if (b == 0) return a; can end the process and give us an answer.

Consider, as an example, 5+7:
5 = 101 (Base 2)
7 = 111 (Base 2)
Now consider adding the two (base 2) digits:
0+0 = 0 = 0 carry 0
1+0 = 1 = 1 carry 0
0+1 = 1 = 1 carry 0
1+1 = 10 = 0 carry 1
The sum (without carrying) of A+B is A^B and the carry is A&B; and when you carry a number it is shifted one digit to the left (hence (A&B)<<1).
So:
5 = 101 (Base 2)
7 = 111 (Base 2)
5^7 = 010 (sum without carrying)
5&7 = 101 (the carry shifted left)
Then we can recurse to add the carry:
A = 010
B = 1010
A^B = 1000 (sum without carrying)
A&B = 0010 (the carry shifted left)
Then we can recurse again as we still have more to carry:
A' = 1000
B' = 100 (without the leading zeros)
A'^B' = 1100 (sum without carrying)
A'&B' = 0000 (the carry shifted left)
Now there is nothing to carry - so we can stop and the answer is 1100 (base 2) = 12 (base 10).
The algorithm is just implementing decimal addition as (longhand) binary addition using the ors to add and the bitshifted ands to find the carry and will recurse until there is nothing more to carry (which will always occur as the bitshift appends another zero to the carry each time so with each recursion at least one more bit will not generate a carry value each time).

We are adding converting the integers to bits and using bitwise operators .
EXOR i.e ^ : 0 ^0 and 1 ^1 =0 , other cases give 1.
AND i.e & 1^1 =1 , ..other cases give 0.
<< or left shift . i.e shift left and append a 0 bit : 0010 becomes 0100
eg.
add(2,3)
2= 0010
3=0011
exor both : to get initial sum : 0001
carry : a &b = 0010
Left shift by 1 bit : 0100 i.e 4
add(1,4)
exor both : 0001 0100 and u get 0101 i.e 5
carry = 0000 <<1 i.e 0000 ..
since carry is 0 , it stops addition and returns previous sum

This is the table for addition:
+ 0 1
-- --
0 | 0 1
1 | 1 10
▲
If you ignore the carry bit ▲ you'll see that it's the same as the XOR table:
^ 0 1
-- --
0 | 0 1
1 | 1 0
So if you combine two numbers with bitwise XOR you get bit-by-bit addition without carry.
Now, what is the carry? It's a bit that's only there when both inputs are 1.
You can get that with AND:
& 0 1
-- --
0 | 0 0
1 | 0 1
But it needs to be added to the sum after being shifted one position to the left, because it's "carried" over, hence the (a & b) << 1
So you can compute the addition without carry and the carry itself. How do you add them together without using addition? Simple! By recursing on this very definition of addition!
See #pbabcdefp's answer on why the recursion always terminates.

how to assign negative numbers as binary to int variable in java?

I was seeing how to represent binary numbers in java.
one option is to use as String and use Integer.parseInt() to get the decimal value;
The other option (assignment of 2 to b):
int b = 0b0010; //2
System.out.println(b);
System.out.println(Integer.toBinaryString(-2));
output:
2
11111111111111111111111111111110
using this format, how to represent -2:
int c=??//-b

int values are stored in 32 bits where the most significant bit is the sign.
0b0010; //2
is actually
0b00000000_00000000_00000000_00000010; //2
To convert that to a negative number, you flip the 0s to 1s and the 1s to 0s` and add 1. So
0b00000000_00000000_00000000_00000010; //2
0b11111111_11111111_11111111_11111101
0b11111111_11111111_11111111_11111110; // +1
And so
int b = 0b11111111_11111111_11111111_11111110;
would have the value -2.

Use the Bitwise NOT operator:
00000000000000000000000000000010 // 2 = 0b0010
00000000000000000000000000000001 // 1 = 0b0010-0b0001
11111111111111111111111111111110 // ~1 = ~(0b0010-0b0001)
11111111111111111111111111111110 // -2 = ~(0b0010-0b0001)
So you just subtract 0b001 and flip all the bits with the ~ Bitwise operator:
int b = ~(0b0010-0b0001); // ~(2-1) = ~1 = -2
System.out.println(b);
System.out.println(Integer.toBinaryString(b));
Demo

You are confusing storage with presentation. int b has no base, regardless of whether you assign 2, 0b010 or 0x02 to it. Base 10 is just something println assumes.
You can use Integer.toString(number, radix) to print properly signed binary numbers.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Constrained counting sets on binary codes - java

Related

Bitwise operations to remove specific bits expressed as a mask

Use of << and >>> in a hash function

The method add mod 2^512

Implementation logic behind adding two numbers without using +?

how to assign negative numbers as binary to int variable in java?

Categories

Resources