Java BitSet wrong conversion from/to byte array


Working with BitSets I have a failing test:
BitSet bitSet = new BitSet();
bitSet.set(1);
bitSet.set(100);
logger.info("BitSet: " + BitSetHelper.toString(bitSet));
BitSet fromByteArray = BitSetHelper.fromByteArray(bitSet.toByteArray());
logger.info("fromByteArray: " + BitSetHelper.toString(bitSet));
Assert.assertEquals(2, fromByteArray.cardinality());
Assert.assertTrue(fromByteArray.get(1)); // <-- assertion fails
Assert.assertTrue(fromByteArray.get(100)); // <-- assertion fails
Even more weirdly, the String representations of both BitSets look the same:
17:34:39.194 [main] INFO c.i.uniques.helper.BitSetHelperTest - BitSet: 00000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000
17:34:39.220 [main] INFO c.i.uniques.helper.BitSetHelperTest - fromByteArray: 00000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000
They are equal! What's happening here?
The methods used in this example are:
public static BitSet fromByteArray(byte[] bytes) {
BitSet bits = new BitSet();
for (int i = 0; i < bytes.length * 8; i++) {
if ((bytes[bytes.length - i / 8 - 1] & (1 << (i % 8))) > 0) {
bits.set(i);
}
}
return bits;
}
And the method used to get the String representation:
public static String toString(BitSet bitSet) {
StringBuffer buffer = new StringBuffer();
for (byte b : bitSet.toByteArray()) {
buffer.append(String.format("%8s", Integer.toBinaryString(b & 0xFF)).replace(' ', '0'));
}
return buffer.toString();
}
Could someone explain what's going on here?

Note that BitSet has a valueOf(byte[]) that already does this for you.
Inside your fromByteArray method
for (int i = 0; i < bytes.length * 8; i++) {
if ((bytes[bytes.length - i / 8 - 1] & (1 << (i % 8))) > 0) {
bits.set(i);
}
}
you're traversing your byte[] in reverse. On the first iteration,
bytes.length - i / 8 - 1
will evaluate to
13 - (0 / 8) - 1
which is 12 (toByteArray() returns 13 bytes here, because the highest set bit is bit 100), so it accesses the most significant byte. This is the byte containing bit 100 from your original bitset. Within that byte it is bit 4, so your loop sets bit 4 instead of bit 100. If you check the bits set in your generated BitSet, you'll notice that bits 4 and 97 are set (the 5th and 98th, counting from one). The two log lines look identical only because the second one also prints bitSet rather than fromByteArray.
But the byte[] returned by toByteArray() contains "a little-endian representation of all the bits in this bit set".
You need to read the byte[] in the appropriate order:
for (int i = 0; i < bytes.length * 8; i++) {
if ((bytes[i / 8] & (1 << (i % 8))) > 0) {
bits.set(i);
}
}
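The corrected loop and the built-in BitSet.valueOf(byte[]) can be compared in a small round-trip check. This is a minimal sketch: the class name BitSetRoundTrip is made up for illustration, and fromByteArray is the helper from the question with the index fix applied.

```java
import java.util.BitSet;

class BitSetRoundTrip {
    // Reads the array in little-endian order, matching BitSet.toByteArray()
    static BitSet fromByteArray(byte[] bytes) {
        BitSet bits = new BitSet();
        for (int i = 0; i < bytes.length * 8; i++) {
            if ((bytes[i / 8] & (1 << (i % 8))) != 0) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet original = new BitSet();
        original.set(1);
        original.set(100);
        byte[] bytes = original.toByteArray();

        System.out.println(bytes.length);                           // 13
        System.out.println(fromByteArray(bytes).equals(original));  // true
        System.out.println(BitSet.valueOf(bytes).equals(original)); // true
    }
}
```

Unless there is a reason to keep a hand-written helper, BitSet.valueOf is probably the simpler choice, since it is defined as the inverse of toByteArray().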

Related

Hex to Bytes, ruby and java

I have the following code to convert hex string to bytes in Java:
String s = "longhex";
int len = s.length();
byte[] data = new byte[(len / 2)];
for (int i = 0; i < len; i += 2)
{
data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4) + Character.digit(s.charAt(i + 1), 16));
}
is this a correct way to reproduce it in ruby?
s = "longhex"
bytes = []
(0..s.length / 2 - 1).step(2).each do |i|
bytes[i / 2] = s[i].ord << 4 + s[i + 1].ord
end

No, it is not correct. In Ruby, << has lower operator precedence than +, so the shift needs parentheses. Note that even the Java version has parentheses around the shift expression. Also, this isn't idiomatic Ruby; it's C written with an almost-Ruby syntax.
str.codepoints.
each_slice(2).
map { |f, l| (f << 4) + l }
would probably do what you want, but without seeing an expected outcome it's hard to say.
The correct version, as suggested by Ilya, is:
str.scan(/.{1}/).
each_slice(2).
map { |f, l| (Integer(f,16) << 4) + Integer(l,16) }

this is code for Rearrange array in alternating positive & negative items with O(1) extra space can you please explain what is & 0x01 elaborately? [duplicate]

I was going through a piece of code in the Apache commons library and was wondering what these conditions do exactly.
public static byte[] decodeHex(final char[] data) throws DecoderException {
final int len = data.length;
if ((len & 0x01) != 0) { // what does this condition do
throw new DecoderException("Odd number of characters.");
}
final byte[] out = new byte[len >> 1];
// two characters form the hex value.
for (int i = 0, j = 0; j < len; i++) {
int f = toDigit(data[j], j) << 4;
j++;
f = f | toDigit(data[j], j);
j++;
out[i] = (byte) (f & 0xFF); // what is happening here.
}
return out;
}
thanks in advance.

This checks whether the last digit in the binary representation of len is a 1.
xxxxxxxy
& 00000001
gives 1 if y is 1, 0 if y is 0, ignoring the other digits.
If y is 1, the length of the char array is odd, which shouldn't happen in this hex writing, hence the exception.
Another solution would have been
if (len%2 != 0) {
which would have been clearer in my opinion. I doubt the slight performance increase just before a loop really matters.

It's a 1337 (high performance) way of coding:
if (len % 2 == 1)
i.e. is len odd. It works because the binary representation of every odd integer has its least significant (i.e. last) bit set. Performing a bitwise AND with 1 masks all other bits, leaving a result of either 1 if it's odd or 0 if it's even.
It's a carryover from C, where you can code simply:
if (len & 1)

This line checks if len is an odd number or not.
If len isn't odd, len & 1 will be equal to 0. (1 and 0x01 are the same value, 0x01 is just the hexadecimal notation)
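The parity checks discussed above, and the f & 0xFF line the question also asks about, can be tried out in a short sketch. The class and method names here are made up for illustration and are not part of Apache Commons. For an array length, which is never negative, all three parity tests agree; they only differ for negative inputs.

```java
class ParityCheck {
    // Tests the least significant bit; works for negative numbers too
    static boolean isOddMask(int n) {
        return (n & 1) != 0;
    }

    // Remainder-based test that is also safe for negative numbers
    static boolean isOddRemainder(int n) {
        return n % 2 != 0;
    }

    // In Java, -3 % 2 is -1, so this misses negative odd numbers
    static boolean isOddNaive(int n) {
        return n % 2 == 1;
    }

    // Keeps only the low 8 bits of f, as in out[i] = (byte) (f & 0xFF)
    static byte lowByte(int f) {
        return (byte) (f & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(isOddMask(7));      // true
        System.out.println(isOddMask(-3));     // true
        System.out.println(isOddNaive(-3));    // false
        System.out.println(lowByte(0xAB));     // -85, the byte with bit pattern 10101011
    }
}
```

The narrowing cast to byte already discards everything above the low 8 bits, so the & 0xFF in decodeHex mostly documents the intent.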

Java. Extracting integers from bits in a byte array not fitting the byte boundary

I have the following array of bytes:
01010110 01110100 00100101 01001011
These bytes are broken into two groups to encode seven integers. I know that the first group consists of 3 values 4 bits each (0101 0110 0111) that represent numbers 5,6,7. The second group consists of 4 values 5 bits each (01000 01001 01010 01011), which represent integers 8,9,10, and 11.
To extract the integers, I am currently using the following approach. Convert the array into a binary string:
public static String byteArrayToBinaryString(byte[] byteArray)
{
String[] arrayOfStrings = new String[byteArray.length];
for(int i=0; i<byteArray.length; i++)
{
arrayOfStrings[i] = byteToBinaryString(byteArray[i]);
}
String bitsetString = "";
for(String testArrayStringElement : arrayOfStrings)
{
bitsetString += testArrayStringElement;
}
return bitsetString;
}
// Taken from here: http://helpdesk.objects.com.au/java/converting-large-byte-array-to-binary-string
public static String byteToBinaryString(byte byteIn)
{
StringBuilder sb = new StringBuilder("00000000");
for (int bit = 0; bit < 8; bit++)
{
if (((byteIn >> bit) & 1) > 0)
{
sb.setCharAt(7 - bit, '1');
}
}
return sb.toString();
}
Then, I split the binary string into 2 substrings: 12 characters and 20 characters. Then I split each substring into new substrings, each of which has length that equals the number of bits. Then I convert each sub-substring into an integer.
It works but a byte array representing thousands of integers takes 30 seconds to a minute to extract.
I am a bit at a loss here. How do I do this using bitwise operators?
Thanks a lot!

I assume you have an understanding of the basic bit operations and how to express them in Java.
Use a pencil to draw a synthetic picture of the problem
byte 0   byte 1   byte 2   byte 3
01010110 01110100 00100101 01001011
\__/\__/ \__/\____/\___/\____/\___/
a   b    c   d     e    f     g
To extract a, b and c we need to do the following
value  source byte        operation    result
a      byte 0 = 01010110  shift >>> 4  0.. 0101
b      byte 0 = 01010110  and 0x0f     0.. 0110
c      byte 1 = 01110100  shift >>> 4  0.. 0111
In Java
int a = byteArray[0] >>> 4, b = byteArray[0] & 0xf, c = byteArray[1] >>> 4;
The other values d, e, f and g are computed similarly but some of them require to read two bytes from the array (d and f actually).
value  source bits                                                        result
d      low 4 bits of byte 1 (01110100), then top bit of byte 2 (00100101)  0.. 01000
e      bits 6..2 of byte 2 (00100101)                                      0.. 01001
To compute d we need to isolate the lowest four bits of byte 1 with byteArray[1] & 0xf, then make space for the bit from byte 2 with (byteArray[1] & 0xf) << 1, extract that bit with byteArray[2] >>> 7 and finally merge the two results together.
int d = (byteArray[1] & 0xf) << 1 | byteArray[2] >>> 7;
int e = (byteArray[2] & 0x7c) >>> 2;
int f = (byteArray[2] & 0x3) << 3 | byteArray[3] >>> 5;
int g = byteArray[3] & 0x1f;
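The seven expressions above can be collected into one runnable sketch. The class and method names are made up for illustration. One assumption goes beyond the original: each byte is first masked with 0xff, because a byte of 0x80 or above is sign-extended when promoted to int and >>> would then return the wrong value. The example bytes are all below 0x80, so the unmasked expressions happen to work for them.

```java
import java.util.Arrays;

class FixedLayoutExtract {
    // Decodes three 4-bit values followed by four 5-bit values from 4 bytes
    static int[] extractSeven(byte[] p) {
        // Mask to 0..255 so that bytes >= 0x80 are not sign-extended
        int b0 = p[0] & 0xff;
        int b1 = p[1] & 0xff;
        int b2 = p[2] & 0xff;
        int b3 = p[3] & 0xff;

        int a = b0 >>> 4;
        int b = b0 & 0xf;
        int c = b1 >>> 4;
        int d = (b1 & 0xf) << 1 | b2 >>> 7;  // 4 bits from byte 1, 1 bit from byte 2
        int e = (b2 & 0x7c) >>> 2;
        int f = (b2 & 0x3) << 3 | b3 >>> 5;  // 2 bits from byte 2, 3 bits from byte 3
        int g = b3 & 0x1f;
        return new int[] { a, b, c, d, e, f, g };
    }

    public static void main(String[] args) {
        byte[] bits = { 0x56, 0x74, 0x25, 0x4b };
        System.out.println(Arrays.toString(extractSeven(bits))); // [5, 6, 7, 8, 9, 10, 11]
    }
}
```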
When you are comfortable with handling bit operations you may consider generalizing the function that extracts the integers.
I made a function int extract(byte[] bits, int[] sizes, int[] res). It takes an array of bytes bits, an array sizes in which the even indices hold the size in bits of the integers to extract and the odd indices hold how many integers of that size to extract, and an output array res large enough to hold all the results. It extracts from bits all the integers described by sizes.
It returns the number of integers extracted.
For example the original problem can be solved as
int res[] = new int[8];
byte bits[] = new byte[]{0x56, 0x74, 0x25, 0x4b};
//Extract 3 integers of 4 bits and 4 integers of 5 bits
int ints = BitsExtractor.extract(bits, new int[]{4, 3, 5, 4}, res);
public class BitsExtractor
{
public static int extract(byte[] bits, int[] sizes, int[] res)
{
int currentByte = 0; //Index into the bits array
int intProduced = 0; //Number of ints produced so far
int bitsLeftInByte = 8; //How many bits left in the current byte
int howManyInts = 0; //Number of integers to extract
//Scan the sizes array two items at a time
for (int currentSize = 0; currentSize < sizes.length - 1; currentSize += 2)
{
//Size, in bits, of the integers to extract
int intSize = sizes[currentSize];
howManyInts += sizes[currentSize+1];
int temp = 0; //Temporary value of an integer
int sizeLeft = intSize; //How many bits left to extract
//Do until we have enough integer or we exhaust the bits array
while (intProduced < howManyInts && currentByte < bits.length)
{
//How many bit we can extract from the current byte
int bitSize = Math.min(sizeLeft, bitsLeftInByte); //sizeLeft <= bitsLeftInByte ? sizeLeft : bitsLeftInByte;
//The value to mask out the number of bit extracted from
//The current byte (e.g. for 3 it is 7)
int byteMask = (1 << bitSize) - 1;
//Extract the new bits (Note that we extract starting from the
//RIGHT so we need to consider the bits left in the byte)
int newBits = (bits[currentByte] >>> (bitsLeftInByte - bitSize)) & byteMask;
//Create the new temporary value of the current integer by
//inserting the bits in the lowest positions
temp = temp << bitSize | newBits;
//"Remove" the bits processed from the byte
bitsLeftInByte -= bitSize;
//If the byte has been exhausted, move to the next
if (bitsLeftInByte == 0)
{
bitsLeftInByte = 8;
currentByte++;
}
//"Remove" the bits processed from the size
sizeLeft -= bitSize;
//If we have extracted all the bits, save the integer
if (sizeLeft == 0)
{
res[intProduced++] = temp;
temp = 0;
sizeLeft = intSize;
}
}
}
return intProduced;
}
}

Well, I did the first group; the second can be done in a similar fashion.
public static void main(String args[]) {
//an example 32 bits like your example
byte[] bytes = new byte[4];
bytes[0] = 31;//0001 1111
bytes[1] = 54;//0011 0110
bytes[2] = 67;
bytes[3] = 19;
//System.out.println(bytes[0]);
int x = 0;
int j = -1; // the byte number
int k = 0; // the bit number in that byte
int n = 0; // the place of the bit in the integer we are trying to read
for (int i = 0; i < 32; i++) {
if (i < 12) { //first group
if (i % 8 == 0) {
j++;
k = 0;
}
if (i % 4 == 0) {
x = 0;
n = 0;
}
byte bit = (byte) ((bytes[j] & (1 << (7 - k))) >> (7 - k));
System.out.println("j is :" + j + " k is :" + k + " " + bit);
x = x | bit << (3 - n);
if ((i + 1) % 4 == 0) {
System.out.println(x);
}
k++;
n++;
} else {
}
}
}
It's a bit tricky because each integer is encoded in fewer bits than Java's smallest integer type, byte (8 bits). So I had to take each bit and "construct" the int from them.
To get each bit
byte bit = (byte) ((bytes[j] & (1 << (7 - k))) >> (7 - k));
This takes the byte we are at and ANDs it with a single-bit mask. For example, to get the bit at position k = 3 of the first byte (counting from 0 at the most significant bit), I do
bytes[0] & 1 << (7 - 3)
but this leaves the bit at its original position inside the byte, so I still have to shift it down with >> (7 - 3) to get a plain 0 or 1.
Then I just OR it into x (the int we are trying to decode), putting it at the right position with << (3 - n). The 3 is there because your integer is encoded over 4 bits.
Try running the code and reading the output.
I am honestly not sure if this is the best way, but I believe it's at least faster than dealing with Strings
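The bit-by-bit idea above can be generalized so that it also covers the second group. The following is a minimal sketch, not a tuned implementation: readBits and the class name are hypothetical helpers, bits are read most-significant first, and the caller is assumed to know the layout (three 4-bit values, then four 5-bit values).

```java
import java.util.Arrays;

class BitFieldReader {
    // Reads `width` bits (1..32) starting at absolute bit offset `bitOffset`,
    // most significant bit first
    static int readBits(byte[] data, int bitOffset, int width) {
        int value = 0;
        for (int i = 0; i < width; i++) {
            int pos = bitOffset + i;
            // Byte index is pos / 8; bit 0 of each byte is the leftmost one
            int bit = (data[pos >> 3] >> (7 - (pos & 7))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] bits = { 0x56, 0x74, 0x25, 0x4b };
        int[] out = new int[7];
        int offset = 0;
        for (int i = 0; i < 7; i++) {
            int width = i < 3 ? 4 : 5;  // layout from the question
            out[i] = readBits(bits, offset, width);
            offset += width;
        }
        System.out.println(Arrays.toString(out)); // [5, 6, 7, 8, 9, 10, 11]
    }
}
```

Reading one bit per iteration is slower than masking whole groups, but it avoids building Strings and is easy to check by hand.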

Bit shift operations on a byte array in Java

How do I shift a byte array n positions to the right? For instance shifting a 16 byte array right 29 positions? I read somewhere it can be done using a long? Would using a long work like this:
Long k1 = byte array from 0 to 7
Long k2 = byte array from 8 to 15
Then right rotating these two longs using Long.rotateRight(Long x, number of rotations). How would the two longs be joined back into a byte array?

I believe you can do this using java.math.BigInteger which supports shifts on arbitrarily large numbers. This has advantage of simplicity, but disadvantage of not padding into original byte array size, i.e. input could be 16 bytes but output might only be 10 etc, requiring additional logic.
BigInteger approach
byte [] array = new byte[]{0x7F,0x11,0x22,0x33,0x44,0x55,0x66,0x77};
// create from array
BigInteger bigInt = new BigInteger(array);
// shift
BigInteger shiftInt = bigInt.shiftRight(4);
// back to array
byte [] shifted = shiftInt.toByteArray();
// print it as hex
for (byte b : shifted) {
System.out.print(String.format("%x", b));
}
Output
7f1122334455667 <== shifted 4 to the right. Looks OK
Long manipulation
I don't know why you'd want to do this with rotateRight(), as it makes life more difficult: you have to blank out the bits that wrap around and appear at the left-hand side of k1, etc. You'd be better off using a shift, IMO, as described below. I've used a shift of 20 because it is divisible by 4, which makes it easier to see the nibbles move in the output.
1) Use ByteBuffer to form two longs from 16 byte array
byte[] array = { 0x00, 0x00, 0x11, 0x11, 0x22, 0x22, 0x33, 0x33, 0x44, 0x44, 0x55, 0x55, 0x66, 0x66, 0x77, 0x77 };
ByteBuffer buffer = ByteBuffer.wrap(array);
long k1 = buffer.getLong();
long k2 = buffer.getLong();
2) Shift each long n bits to the right
int n = 20;
long k1Shift = k1 >> n;
long k2Shift = k2 >> n;
System.out.println(String.format("%016x => %016x", k1, k1Shift));
System.out.println(String.format("%016x => %016x", k2, k2Shift));
0000111122223333 => 0000000001111222
4444555566667777 => 0000044445555666
3) Determine bits from k1 that "got pushed off the edge"
long k1CarryBits = (k1 << (64 - n));
System.out.println(String.format("%016x => %016x", k1, k1CarryBits));
0000111122223333 => 2333300000000000
4) Join the k1 carry bits onto k2 on the left-hand side
long k2WithCarray = k2Shift | k1CarryBits;
System.out.println(String.format("%016x => %016x", k2Shift, k2WithCarray));
0000044445555666 => 2333344445555666
5) Write the two longs back into a ByteBuffer and extract as a byte array
buffer.position(0);
buffer.putLong(k1Shift);
buffer.putLong(k2WithCarray);
for (byte each : buffer.array()) {
System.out.print(Long.toHexString(each));
}
000011112222333344445555666
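The steps above can be folded into one helper. This is a sketch under a few assumptions that are not in the original answer: the name shiftRight128 is made up, the shift is logical (>>> rather than >>, so a set top bit is not smeared across the result), and n is limited to 0..63 because Java only uses the low six bits of a long shift distance.

```java
import java.nio.ByteBuffer;

class TwoLongShift {
    // Logical right shift of a 16-byte big-endian value by n bits, 0 <= n < 64
    static byte[] shiftRight128(byte[] in, int n) {
        if (in.length != 16 || n < 0 || n >= 64) {
            throw new IllegalArgumentException("need 16 bytes and 0 <= n < 64");
        }
        ByteBuffer src = ByteBuffer.wrap(in);
        long hi = src.getLong();
        long lo = src.getLong();

        long newHi = hi;
        long newLo = lo;
        if (n != 0) {
            // hi << 64 would be a no-op in Java, hence the n == 0 special case
            newHi = hi >>> n;
            newLo = (lo >>> n) | (hi << (64 - n));
        }
        return ByteBuffer.allocate(16).putLong(newHi).putLong(newLo).array();
    }

    public static void main(String[] args) {
        byte[] array = { 0x00, 0x00, 0x11, 0x11, 0x22, 0x22, 0x33, 0x33,
                         0x44, 0x44, 0x55, 0x55, 0x66, 0x66, 0x77, 0x77 };
        ByteBuffer out = ByteBuffer.wrap(shiftRight128(array, 20));
        System.out.println(String.format("%016x", out.getLong())); // 0000000001111222
        System.out.println(String.format("%016x", out.getLong())); // 2333344445555666
    }
}
```

Shifts of 64 bits or more would first need whole bytes (or the whole upper long) moved down, which this sketch leaves out.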

Here is what I came up with to shift a byte array by some arbitrary number of bits left:
/**
* Shifts input byte array len bits left. This method will alter the input byte array.
*/
public static byte[] shiftLeft(byte[] data, int len) {
int word_size = (len / 8) + 1;
int shift = len % 8;
byte carry_mask = (byte) ((1 << shift) - 1);
int offset = word_size - 1;
for (int i = 0; i < data.length; i++) {
int src_index = i+offset;
if (src_index >= data.length) {
data[i] = 0;
} else {
byte src = data[src_index];
byte dst = (byte) (src << shift);
if (src_index+1 < data.length) {
dst |= data[src_index+1] >>> (8-shift) & carry_mask;
}
data[i] = dst;
}
}
return data;
}

1. Manually implemented
Here are left and right shift implementations that do not use BigInteger (i.e. they do not create a copy of the input array) and that provide an unsigned right shift (BigInteger only supports arithmetic shifts, of course).
Left Shift <<
/**
* Left shift of whole byte array by shiftBitCount bits.
* This method will alter the input byte array.
*/
static byte[] shiftLeft(byte[] byteArray, int shiftBitCount) {
final int shiftMod = shiftBitCount % 8;
final byte carryMask = (byte) ((1 << shiftMod) - 1);
final int offsetBytes = (shiftBitCount / 8);
int sourceIndex;
for (int i = 0; i < byteArray.length; i++) {
sourceIndex = i + offsetBytes;
if (sourceIndex >= byteArray.length) {
byteArray[i] = 0;
} else {
byte src = byteArray[sourceIndex];
byte dst = (byte) (src << shiftMod);
if (sourceIndex + 1 < byteArray.length) {
dst |= byteArray[sourceIndex + 1] >>> (8 - shiftMod) & carryMask;
}
byteArray[i] = dst;
}
}
return byteArray;
}
Unsigned Right Shift >>>
/**
* Unsigned/logical right shift of whole byte array by shiftBitCount bits.
* This method will alter the input byte array.
*/
static byte[] shiftRight(byte[] byteArray, int shiftBitCount) {
final int shiftMod = shiftBitCount % 8;
final byte carryMask = (byte) (0xFF << (8 - shiftMod));
final int offsetBytes = (shiftBitCount / 8);
int sourceIndex;
for (int i = byteArray.length - 1; i >= 0; i--) {
sourceIndex = i - offsetBytes;
if (sourceIndex < 0) {
byteArray[i] = 0;
} else {
byte src = byteArray[sourceIndex];
byte dst = (byte) ((0xff & src) >>> shiftMod);
if (sourceIndex - 1 >= 0) {
dst |= byteArray[sourceIndex - 1] << (8 - shiftMod) & carryMask;
}
byteArray[i] = dst;
}
}
return byteArray;
}
2. Using BigInteger
Be aware that BigInteger internally converts the byte array into an int[] array so this may not be the most optimized solution:
Arithmetic Left Shift <<:
byte[] result = new BigInteger(byteArray).shiftLeft(3).toByteArray();
Arithmetic Right Shift >>:
byte[] result = new BigInteger(byteArray).shiftRight(2).toByteArray();
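BigInteger.toByteArray() returns the shortest two's-complement form, so the result can be shorter than the input or carry an extra leading zero byte. One way to get a fixed-width logical shift is sketched below. The helper name shiftRightFixed is made up, and the choice of the BigInteger(int signum, byte[] magnitude) constructor, which treats the bytes as an unsigned magnitude, is an assumption about what is wanted here.

```java
import java.math.BigInteger;
import java.util.Arrays;

class FixedWidthBigIntegerShift {
    // Logical right shift by n bits that returns an array of the same length
    static byte[] shiftRightFixed(byte[] in, int n) {
        // signum 1: interpret the bytes as an unsigned big-endian magnitude
        BigInteger shifted = new BigInteger(1, in).shiftRight(n);
        byte[] raw = shifted.toByteArray();

        // Right-align the result in a zero-filled array of the original size,
        // dropping a possible leading sign byte
        byte[] out = new byte[in.length];
        int copy = Math.min(raw.length, out.length);
        System.arraycopy(raw, raw.length - copy, out, out.length - copy, copy);
        return out;
    }

    public static void main(String[] args) {
        byte[] in = { (byte) 0xF0, 0x00 };
        System.out.println(Arrays.toString(shiftRightFixed(in, 4))); // [15, 0]
    }
}
```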
3. External Library
Using the Bytes java library*:
Add to pom.xml:
<dependency>
<groupId>at.favre.lib</groupId>
<artifactId>bytes</artifactId>
<version>{latest-version}</version>
</dependency>
Code example:
Bytes b = Bytes.wrap(someByteArray);
b.leftShift(3);
b.rightShift(3);
byte[] result = b.array();
*Full Disclaimer: I am the developer.

This is an old post, but I want to update Adam's answer.
The long-based solution works with a few tweaks.
First, use >>> instead of >>, because >> pads with the sign bit, which changes the original value when the top bit is set.
Second, the code that prints the bytes drops leading zeros (a byte of 0x00 prints as a single 0).
Use this instead:
private String getHexString(byte[] b) {
StringBuilder result = new StringBuilder();
for (int i = 0; i < b.length; i++)
result.append(Integer.toString((b[i] & 0xff) + 0x100, 16)
.substring(1));
return result.toString();
}

Storing int value of bitmask - extract 1 valued bits

I am calculating the int equivalent of a given set of bits and storing that in memory. From there, I would like to determine all 1 value bits from the original bitmask. Example:
33 --> [1,6]
97 --> [1,6,7]
Ideas for an implementation in Java?

On BitSet
Use java.util.BitSet to store, well, a set of bits.
Here's how you can convert from an int to a BitSet, based on which bits in the int are set:
static BitSet fromInt(int num) {
BitSet bs = new BitSet();
for (int k = 0; k < Integer.SIZE; k++) {
if (((num >> k) & 1) == 1) {
bs.set(k);
}
}
return bs;
}
So now you can do the following:
System.out.println(fromInt(33)); // prints "{0, 5}"
System.out.println(fromInt(97)); // prints "{0, 5, 6}"
And just for completeness, here's the reverse transformation:
static int toInt(BitSet bs) {
int num = 0;
for (int k = -1; (k = bs.nextSetBit(k + 1)) != -1; ) {
num |= (1 << k);
}
return num;
}
So composing both together, we always get back the original number:
System.out.println(toInt(fromInt(33))); // prints "33"
System.out.println(toInt(fromInt(97))); // prints "97"
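The same two conversions can also be written without explicit loops, using BitSet.valueOf(long[]) and toLongArray() from the standard library. This is only a sketch of an alternative: the class name is made up, and the int is masked to 32 bits first so that a negative value does not sign-extend into bits 32 to 63.

```java
import java.util.BitSet;

class IntBitSetConversion {
    static BitSet fromInt(int num) {
        // Widen to long without sign extension, then let BitSet decode the word
        return BitSet.valueOf(new long[] { num & 0xFFFFFFFFL });
    }

    static int toInt(BitSet bs) {
        long[] words = bs.toLongArray();  // empty when no bit is set
        return words.length == 0 ? 0 : (int) words[0];
    }

    public static void main(String[] args) {
        System.out.println(fromInt(33));          // {0, 5}
        System.out.println(fromInt(97));          // {0, 5, 6}
        System.out.println(toInt(fromInt(-1)));   // -1
    }
}
```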
On 0-based indexing
Note that this uses 0-based indexing, which is the more commonly used indexing for bits (and most everything else in Java). This is also more correct. In the following, ^ denotes exponentiation:
33 = 2^0 + 2^5 = 1 + 32
33 -> {0, 5}
97 = 2^0 + 2^5 + 2^6 = 1 + 32 + 64
97 -> {0, 5, 6}
If you insist on using 1-based indexing, however, you can use bs.set(k+1); and (1 << (k-1)) in the above snippets. I would advise strongly against this recommendation, however.
Related questions
What does the ^ operator do in Java? -- it's actually not exponentiation

For bit fiddling, java.lang.Integer has some very helpful static methods. Try this code as a starting base for your problem:
public int[] extractBitNumbers(int value) {
// determine how many ones are in value
int bitCount = Integer.bitCount(value);
// allocate storage
int[] oneBits = new int[bitCount];
int putIndex = 0;
// loop until no more bits are set
while (value != 0) {
// find the number of the lowest set bit
int bitNo = Integer.numberOfTrailingZeros(value);
// store the bit number in array
oneBits[putIndex++] = bitNo+1;
// clear the bit we just processed from the value
value &= ~(1 << bitNo);
}
return oneBits;
}

I can show you C# implementation, Java should be very similar.
int value = 33;
int index = 1;
while (value > 0)
{
if ((value % 2) == 1)
Console.WriteLine(index);
index++;
value /= 2;
}

If you want to get an array like that, you'll likely need to loop over the number of bits you want to check and AND the integer with a mask that is shifted left by one at each step.
Something like (pseudo):
Init array
mask = 1
for (0 to BitCount):
if Integer & mask
array[] = pos
mask = mask << 1

A bit-crunching variation would be something like:
int[] getBits(int value) {
int bitValue = 1;
int index = 1;
int[] bits = new int[33];
while (index < bits.length) // visits bits 1..32, also for negative values
{
bits[index++] = (value & bitValue); // non-zero if this bit is set
bitValue <<= 1; // or: bitValue *= 2;
}
return bits;
}
Note that since the bits are indexed from 1 as you requested, bits[0] is left unused.
