BitSet toString() and valueOf() are difficult to understand - java

I just want to convert an integer to a binary string using BitSet.
My code is below.
import java.util.Arrays;
import java.util.BitSet;

public class BitSetTest {
    public static void main(String[] args) {
        // Method_1
        int value = 10; // 0b1010
        String bits = Integer.toBinaryString(value);
        BitSet bs = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                bs.set(i);
            } else {
                bs.clear(i);
            }
        }
        System.out.println(bs); // {0, 2} so 0th index and 2nd index are set.
        System.out.println(Arrays.toString(bs.toLongArray())); // prints [5]
        System.out.println(Arrays.toString(bs.toByteArray()));

        // Method_2
        value = 42;
        System.out.println(Integer.toBinaryString(value)); // 101010
        BitSet bitSet = BitSet.valueOf(new long[] { value });
        System.out.println(bitSet);
        System.out.println(Arrays.toString(bitSet.toLongArray())); // prints [42]
        System.out.println(Arrays.toString(bitSet.toByteArray()));
    }
}
Q1) What I don't understand is which approach is correct (Method_1 or Method_2). Method_1 seems to be the correct one, but bs.toLongArray() gives different results.
Q2) Could you please explain why the API public static BitSet valueOf(long[] longs) accepts an array of long values instead of a single long, and what it really does with this array?
The Javadoc says the following, but I really don't get the meaning:
More precisely, BitSet.valueOf(longs).get(n) == ((longs[n/64] & (1L<<(n%64))) != 0) for all n < 64 * longs.length.
Please help.

Bits are numbered from the right.
42 = 0b101010
543210 <-- bit numbers
Hence output of {1, 3, 5}.
Your method #1 numbers bits from the left.
You also don't need to call bs.clear(i), since a new BitSet doesn't have any bits set.
All bits are initially false.
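For example, here is a minimal sketch of Method_1 with the index flipped so the bits are numbered from the right, which makes it agree with Method_2 and with toLongArray():

String bits = Integer.toBinaryString(10); // "1010"
BitSet bs = new BitSet(bits.length());
for (int i = 0; i < bits.length(); i++) {
    if (bits.charAt(i) == '1') {
        bs.set(bits.length() - 1 - i); // rightmost character is bit 0
    }
}
System.out.println(bs);                                // {1, 3}
System.out.println(Arrays.toString(bs.toLongArray())); // [10]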
As for how BitSet.valueOf() works, it's fairly simple.
There are two versions for byte data (byte[], ByteBuffer), and two versions for long data (long[], LongBuffer).
A byte consists of 8 bits, and a long consists of 64 bits. The BitSet will then be built with the first N (8 or 64) bits from the first value, the next N bits from the second value, and so forth.
E.g. if you call BitSet.valueOf(new long[] { 1, 2, 3 }), bits 0-63 come from the first number, bits 64-127 come from the second number, and bits 128-191 come from the third number, resulting in {0, 65, 128, 129}.
If you call BitSet.valueOf(new byte[] { 1, 2, 3 }), bits 0-7 come from the first number, bits 8-15 come from the second number, and bits 16-23 come from the third number, resulting in {0, 9, 16, 17}.
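A quick sketch reproducing those two results:

// each long contributes 64 bits, each byte contributes 8 bits, least significant first
System.out.println(BitSet.valueOf(new long[] { 1, 2, 3 })); // {0, 65, 128, 129}
System.out.println(BitSet.valueOf(new byte[] { 1, 2, 3 })); // {0, 9, 16, 17}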

Related

Java binary translation?

I am trying to find some code that will easily break down a binary string. I'm not even sure I'm asking this question correctly, but I want to get the value of each "active bit". For example, if I have a binary string of 100000001, I would like to return the values 256, 1 in an array. I'm trying to figure this out so I can use a lookup table in SQL which has an integer column and a text column. The integer column will be used to determine which text values will be written to a new table. So, the value "Text1" at 1, and "Text2" at 256 would both be written to the new table, but the number submitted to get those values would be 257.
I know I'm rambling, but I would input a value, 257, and I convert it to a binary string of 100000001. Now I want some code to break that binary string into two values... 1 and 256. Am I making any sense?
You don't need to convert to a binary string if you use Integer.highestOneBit. You can loop through the one bits, filling in an array of size Integer.bitCount with each call to Integer.highestOneBit. Afterwards, you can xor with the value of the highest bit to remove it from the number.
public static int[] getOneBits(int num) {
    int[] oneBits = new int[Integer.bitCount(num)];
    for (int i = 0; i < oneBits.length; i++) {
        oneBits[i] = Integer.highestOneBit(num);
        num ^= oneBits[i];
    }
    return oneBits;
}
This will produce an array, where all of the values are powers of 2 in descending order, where the sum of all the elements will be the original number. For example, 257 will produce [256, 1], and 127 will produce [64, 32, 16, 8, 4, 2, 1].
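A quick usage sketch (assuming java.util.Arrays is imported):

System.out.println(Arrays.toString(getOneBits(257))); // [256, 1]
System.out.println(Arrays.toString(getOneBits(127))); // [64, 32, 16, 8, 4, 2, 1]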

(Java) Converting String of symbols to single digit integer array

I'm trying to take a String, e.g. $!#, convert each individual symbol to its corresponding ASCII value (here 36, 33, 35), and then end up with an array of integers with each single digit stored in a separate index, meaning 3, 6, 3, 3, 3, 5.
In short: From $!# to $,!,# to 36, 33, 35 to 3, 6, 3, 3, 3, 5
Since I am using Processing (wrapper for Java), my SO-research got me so far:
String str = "$!#";
byte[] b = str.getBytes();
for (int i = 0; i < b.length; i++) {
    println(b[i]);
}
I'm ending up with the ASCII values. The byte array b contains [36, 33, 35].
Now if those were to be Strings instead of bytes, I would use String.valueOf(b) to get 363335 and then split the whole thing again into a single digit integer array.
I wonder if this approach is unnecessarily complicated and can be done with less conversion steps. Any suggestions?
Honestly, if I were you, I wouldn't worry too much about "efficiency" until you have an actual problem. Write code that you understand. If you have a problem with efficiency, you'll have to define exactly what you mean: is it taking too many steps? Is it taking too long? What is the size of your input?
Note that the approach you outlined and the approach below are both O(N), which is probably the best you're going to get.
If you have a specific bottleneck inside that O(N), then you should do some profiling to figure out where that is. But you shouldn't bother with micro-optimizations (or, shudder, premature optimizations) just because you "feel" like something is "inefficient".
To quote from this answer:
...in the absence of measured performance issues you shouldn't optimize because you think you will get a performance gain.
If I were you, I would just use the charAt() function to get each char in the String, and then use the int() function to convert that char into an int. Then you could add those int values to a String (or better yet, a StringBuilder):
String str = "$!#";
StringBuilder digits = new StringBuilder();
for (int i = 0; i < str.length(); i++) {
    int c = int(str.charAt(i));
    digits.append(c);
}
Then you could use the split() function to split them into individual digits:
String[] digitArray = digits.toString().split("");
for (String d : digitArray) {
    println(d);
}
Prints:
3
6
3
3
3
5
There are a ton of different ways to do this: you could use the toCharArray() function instead of charAt(), for example. But none of this is going to be any more or less "efficient" until you define exactly what you mean by efficiency.
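For instance, a small sketch of that toCharArray() variant (plain Java here, so System.out.println instead of Processing's println() and a cast instead of int()):

String str = "$!#";
StringBuilder digits = new StringBuilder();
for (char ch : str.toCharArray()) {
    digits.append((int) ch); // appends the ASCII value, e.g. 36 for '$'
}
System.out.println(digits); // 363335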
Java 8 stream solution:
int[] digits = "$!#".chars()
        .mapToObj(Integer::toString)
        .flatMapToInt(CharSequence::chars)
        .map(c -> c - '0')
        .toArray();
System.out.println(Arrays.toString(digits)); // prints [3, 6, 3, 3, 3, 5]
One solution could be doing something like this:
// Create an array of int of size 3 * str.length()
int[] result = new int[3 * str.length()];
for (int i = 0; i < str.length(); i++)
{
    int n = (int) str.charAt(i);
    int a = n / 100;                  // hundreds digit
    int b = (n - (a * 100)) / 10;     // tens digit
    int c = n - (a * 100) - (b * 10); // units digit
    // push a, b and c into your array (choose the order and whether
    // you want to keep the potential leading 0's or not)
    result[i * 3] = a;
    result[i * 3 + 1] = b;
    result[i * 3 + 2] = c;
}
But as others said before, I'm not sure if it's worth playing like that. It depends on your application, I guess.

First Byte's Bit Off-By-One after Steganography

Currently working on a Steganography project where, given a message in bytes and the number of bits to modify per byte, hide a message in an arbitrary byte array.
In the first decoded byte of the resulting message, the value has its first (leftmost) bit set to '1' instead of '0'. For example, when using message "Foo".getBytes() and maxBits = 1 the result is "Æoo", not "Foo" (0b01000110 gets changed to 0b11000110). With message "Æoo".getBytes() and maxBits = 1 the result is "Æoo", meaning the bit is not getting flipped as far as I can tell.
Only certain values of maxBits for certain message bytes cause this error, for example "Foo" encounters this problem at maxBits equal to 1, 5, and 6, whereas "Test" encounters this problem at maxBits equal to 1, 3, and 5. Only the resulting first character ends up with its first bit set, and this problem only occurs at the specified values of this.maxBits related to the initial data.
Why, for certain values of maxBits, is the first bit of the resulting decoded message always 1?
Why do different inputs have different values for maxBits that work fine, and others that do not?
What is the pattern between the value of maxBits and the resulting erroneous results in relation to the original data?
Encode and Decode Methods:
public byte[] encodeMessage(byte[] data, byte[] message) {
    byte[] encoded = data;
    boolean[] messageBits = byteArrToBoolArr(message);
    int index = 0;
    for (int x = 0; x < messageBits.length; x++) {
        encoded[index] = messageBits[x] ? setBit(encoded[index], x % this.maxBits) : unsetBit(encoded[index], x % this.maxBits);
        if (x % this.maxBits == 0 && x != 0)
            index++;
    }
    return encoded;
}

public byte[] decodeMessage(byte[] data) {
    boolean[] messageBits = new boolean[data.length * this.maxBits];
    int index = 0;
    for (int x = 0; x < messageBits.length; x++) {
        messageBits[x] = getBit(data[index], x % this.maxBits);
        if (x % this.maxBits == 0 && x != 0)
            index++;
    }
    return boolArrToByteArr(messageBits);
}
Unset, Set, and Get Methods:
public byte unsetBit(byte data, int pos) {
    return (byte) (data & ~((1 << pos)));
}

public byte setBit(byte data, int pos) {
    return (byte) (data | ((1 << pos)));
}

public boolean getBit(byte data, int pos) {
    return ((data >>> pos) & 0x01) == 1;
}
Conversion Methods:
public boolean[] byteArrToBoolArr(byte[] b) {
    boolean bool[] = new boolean[b.length * 8];
    for (int x = 0; x < bool.length; x++) {
        bool[x] = false;
        if ((b[x / 8] & (1 << (7 - (x % 8)))) > 0)
            bool[x] = true;
    }
    return bool;
}

public byte[] boolArrToByteArr(boolean[] bool) {
    byte[] b = new byte[bool.length / 8];
    for (int x = 0; x < b.length; x++) {
        for (int y = 0; y < 8; y++) {
            if (bool[x * 8 + y]) {
                b[x] |= (128 >>> y);
            }
        }
    }
    return b;
}
Sample Code and Output:
test("Foo", 1);//Æoo
test("Foo", 2);//Foo
test("Foo", 3);//Foo
test("Foo", 4);//Foo
test("Foo", 5);//Æoo
test("Foo", 6);//Æoo
test("Foo", 7);//Foo
test("Foo", 8);//Foo
test("Test", 1);//Ôest
test("Test", 2);//Test
test("Test", 3);//Ôest
test("Test", 4);//Test
test("Test", 5);//Ôest
test("Test", 6);//Test
test("Test", 7);//Test
test("Test", 8);//Test
private static void test(String s, int x) {
    BinaryModifier bm = null;
    try {
        bm = new BinaryModifier(x); // Takes maxBits as constructor param
    } catch (BinaryException e) {
        e.printStackTrace();
    }
    System.out.println(new String(bm.decodeMessage(bm.encodeMessage(new byte[1024], s.getBytes()))));
    return;
}
Your logic of incrementing index has two flaws, which overwrite the first bit of the first letter. Obviously, the bug is expressed when the overwriting bit is different to the first bit.
if (x % this.maxBits == 0 && x != 0)
    index++;
The first problem has to do with embedding only one bit per byte, i.e. maxBits = 1. After you have embedded the very first bit and reached the above conditional, x is still 0, since it will be incremented at the end of the loop. You should be incrementing index at this point, but x != 0 prevents you from doing so. Therefore, the second bit will also be embedded in the first byte, effectively overwriting the first bit. Since this logic also exists in the decode method, you read the first two bits from the first byte.
More specifically, if you embed a 00 or 11, it will be fine. But a 01 will be read as 11 and a 10 will be read as 00, i.e., whatever the value of the second bit is. If the first letter has an ASCII code less than or equal to 63 (00xxxxxx), or greater than or equal to 192 (11xxxxxx), it will come out fine. For example:
# -> # : 00100011 (35) -> 00100011 (35)
F -> Æ : 01000110 (70) -> 11000110 (198)
The second problem has to do with the x % this.maxBits == 0 part. Consider the case where we embed 3 bits per byte. After the 3rd bit, when we reach the conditional we still have x = 2, so the modulo operation will return false. After we have embedded a 4th bit, we do have x = 3 and we're allowed to move on to the next byte. However, this extra 4th bit will be written at the 0th position of the first byte, since x % this.maxBits will be 3 % 3. So again, we have a bit overwriting our very first bit. However, after the first cycle the modulo operation will correctly write only 3 bits per byte, so the rest of our message will be unaffected.
Consider the binary for "F", which is 01000110. By embedding N bits per byte, we effectively embed the following groups in the first few bytes.
1 bit 01 0 0 0 1 1 0
2 bits 010 00 11 0x
3 bits 0100 011 0xx
4 bits 01000 110x
5 bits 010001 10xxxx
6 bits 0100011 0xxxxx
7 bits 01000110
8 bits 01000110x
As you can see, for groups of 5 and 6 bits, the last bit of the first group is 1, which will overwrite our initial 0 bit. For all other cases the overwrite doesn't affect anything. Note that for 8 bits, we end up using the first bit of the second letter. If that letter happened to have an ASCII code greater than or equal to 128, it would again overwrite the very first 0 bit.
To address all problems, use either
for (int x = 0; x < messageBits.length; x++) {
    // code in between
    if ((x + 1) % this.maxBits == 0)
        index++;
}
or
for (int x = 0; x < messageBits.length; ) {
    // code in between
    x++;
    if (x % this.maxBits == 0)
        index++;
}
Your code has another potential problem which hasn't been expressed. If your data array has a size of 1024, but you only embed 3 letters, you will affect only the first few bytes, depending on the value of maxBits. However, for the extraction, you define your array to have a size of data.length * this.maxBits. So you end up reading bits from all of the bytes of the data array. This is currently no problem, because your array is populated by 0s, which are converted to empty strings. However, if your array had actual numbers, you'd end up reading a lot of garbage past the point of your embedded data.
There are two general ways of addressing this. You either
append a unique sequence of bits at the end of your message (marker), such that when you encounter that sequence you terminate the extraction, e.g. eight 0s, or
you add a few bits before embedding your actual data (header), which will tell you how to extract your data, e.g., how many bytes and how many bits per byte to read.
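As a rough sketch of the header idea (assuming java.nio.ByteBuffer is available, message is the byte[] passed to encodeMessage, and the decoder knows to read these four length bytes back first):

// prepend the message length as a 4-byte header before embedding
byte[] withHeader = ByteBuffer.allocate(4 + message.length)
        .putInt(message.length)
        .put(message)
        .array();
// embed withHeader instead of message; when decoding, extract the first 4 bytes,
// read the length, and stop after that many further message bytes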
One thing you're probably going to run afoul of is the nature of character encoding.
When you call s.getBytes() you are turning the string to bytes using your JVM's default encoding. Then you modify the bytes and you create a new String from the modified bytes again using the default encoding.
So the question is what is that encoding and precisely how does it work. For example, the encoding may well in some cases only be looking at the lower 7 bits of a byte relating to the character, then your setting of the top bit won't have any effect on the string created from the modified bytes.
If you really want to tell if your code is working right, do your testing by directly examining the byte[] being produced by your encode and decode methods, not by turning the modified bytes into strings and looking at the strings.
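For example, a small sketch of such a check (hypothetical, reusing bm from the test method above and assuming java.util.Arrays is imported):

byte[] original = "Foo".getBytes();
byte[] roundTrip = bm.decodeMessage(bm.encodeMessage(new byte[1024], original));
// compare raw bytes instead of Strings; decoding returns trailing filler bytes, so trim first
System.out.println(Arrays.equals(Arrays.copyOf(roundTrip, original.length), original));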

Byte to "Bit"array

A byte is the smallest numeric datatype Java offers, but yesterday I came in contact with byte streams for the first time: at the beginning of every packet a marker byte is sent which gives further instructions on how to handle the packet. Every bit of the byte has a specific meaning, so I need to untangle the byte into its 8 bits.
You could probably convert the byte to a boolean array or create a switch for every case, but that certainly can't be best practice.
How is this possible in Java, and why are there no bit datatypes in Java?
Because there is no bit data type on a physical computer. The smallest allotment you can allocate on most modern computers is a byte, also known as an octet, or 8 bits. When you display a single bit you are really just pulling that bit out of the byte with arithmetic and adding it to a new byte, which still uses an 8-bit space. If you want to put bit data inside of a byte you can, but it will be stored as at least a single byte no matter what programming language you use.
You could load the byte into a BitSet. This abstraction hides the gory details of manipulating single bits.
import java.util.BitSet;

public class Bits {
    public static void main(String[] args) {
        byte[] b = new byte[]{10};
        BitSet bitset = BitSet.valueOf(b);
        System.out.println("Length of bitset = " + bitset.length());
        for (int i = 0; i < bitset.length(); ++i) {
            System.out.println("bit " + i + ": " + bitset.get(i));
        }
    }
}
$ java Bits
Length of bitset = 4
bit 0: false
bit 1: true
bit 2: false
bit 3: true
You can ask for any bit, but the length tells you that all the bits past length() - 1 are set to 0 (false):
System.out.println("bit 75: " + bitset.get(75));
bit 75: false
Have a look at java.util.BitSet.
You might use it to interpret the byte read and can use the get method to check whether a specific bit is set like this:
byte b = (byte) stream.read(); // read() returns an int, so cast it down to a byte
final BitSet bitSet = BitSet.valueOf(new byte[]{b});
if (bitSet.get(2)) {
    state.activateComponentA();
} else {
    state.deactivateComponentA();
}
state.setFeatureBTo(bitSet.get(1));
On the other hand, you can create your own bitmask easily and convert it to a byte array (or just byte) afterwards:
final BitSet output = BitSet.valueOf(ByteBuffer.allocate(1));
output.set(3, state.isComponentXActivated());
if (state.isY) {
    output.set(4);
}
final byte w = output.toByteArray()[0];
How is this possible in Java, and why are there no bit datatypes in Java?
There are no bit data types in most languages. And most CPU instruction sets have few (if any) instructions dedicated to addressing single bits. You can think of the lack of these as a trade-off between (language or CPU) complexity and need.
Manipulating a single bit can be thought of as a special case of manipulating multiple bits; and languages as well as CPUs are equipped for the latter.
Very common operations like testing, setting, clearing, inverting as well as exclusive or are all supported on the integer primitive types (byte, short/char, int, long), operating on all bits of the type at once. By choosing the parameters appropriately you can select which bits to operate on.
If you think about it, a byte array is a bit array where the bits are grouped in packages of 8. Addressing a single bit in the array is relatively simple using logical operators (AND &, OR |, XOR ^ and NOT ~).
For example, testing if bit N is set in a byte can be done using a logical AND with a mask where only the bit to be tested is set:
public boolean testBit(byte b, int n) {
    int mask = 1 << n; // equivalent of 2 to the nth power
    return (b & mask) != 0;
}
Extending this to a byte array is no magic either, each byte consists of 8 bits, so the byte index is simply the bit number divided by 8, and the bit number inside that byte is the remainder (modulo 8):
public boolean testBit(byte[] array, int n) {
    int index = n >>> 3;     // divide by 8
    int mask = 1 << (n & 7); // n modulo 8
    return (array[index] & mask) != 0;
}
Here is a sample; I hope it's useful for you!
DatagramSocket socket = new DatagramSocket(6160, InetAddress.getByName("0.0.0.0"));
socket.setBroadcast(true);
while (true) {
    byte[] recvBuf = new byte[26];
    DatagramPacket packet = new DatagramPacket(recvBuf, recvBuf.length);
    socket.receive(packet);
    String bitArray = toBitArray(recvBuf);
    System.out.println(Integer.parseInt(bitArray.substring(0, 8), 2));  // convert first byte's binary to decimal
    System.out.println(Integer.parseInt(bitArray.substring(8, 16), 2)); // convert second byte's binary to decimal
}
public static String toBitArray(byte[] byteArray) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < byteArray.length; i++) {
        sb.append(String.format("%8s", Integer.toBinaryString(byteArray[i] & 0xFF)).replace(' ', '0'));
    }
    return sb.toString();
}

Reverse Engineer Sorting Algorithm

I have been given 3 algorithms to reverse engineer and explain how they work. So far I have worked out that I have been given a quick sort algorithm and a bubble sort algorithm; however, I'm not sure what this third algorithm is. I understand how the quick sort and bubble sort work, but I just can't get my head around this one. I'm unsure what the variables are and was hoping someone out there would be able to tell me what's going on here:
public static ArrayList<Integer> SortB(ArrayList<Integer> a)
{
    ArrayList<Integer> array = CopyArray(a);
    Integer[] zero = new Integer[a.size()];
    Integer[] one = new Integer[a.size()];
    int i, b;
    Integer x, p;
    // Change from 8 to 32 for whole integers - will run 4 times slower
    for (b = 0; b < 8; ++b)
    {
        int zc = 0;
        int oc = 0;
        for (i = 0; i < array.size(); ++i)
        {
            x = array.get(i);
            p = 1 << b;
            if ((x & p) == 0)
            {
                zero[zc++] = array.get(i);
            }
            else
            {
                one[oc++] = array.get(i);
            }
        }
        for (i = 0; i < oc; ++i) array.set(i, one[i]);
        for (i = 0; i < zc; ++i) array.set(i + oc, zero[i]);
    }
    return (array);
}
This is a Radix Sort, limited to the least significant eight bits. It does not complete the sort unless you change the loop to go 32 times instead of 8.
Each iteration processes a single bit b. It prepares a mask called p by shifting 1 left b times. This produces a power of two - 1, 2, 4, 8, ..., or 1, 10, 100, 1000, 10000, ... in binary.
For each bit, the elements of the original array are separated into two buckets called one and zero, depending on whether bit b is set to 1 or 0. Once the separation is over, the elements are placed back into the original array (ones first, then zeros), and the algorithm proceeds to the next iteration.
This implementation uses two times more storage than the size of the original array, and goes through the array a total of 16 times (64 times in the full version - once for reading and once for writing of data for each bit). The asymptotic complexity of the algorithm is linear.
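As a quick hypothetical check of that behaviour (this only runs next to the original SortB and its CopyArray helper, which isn't shown, and assumes the usual java.util imports):

ArrayList<Integer> input = new ArrayList<>(Arrays.asList(5, 258, 2));
System.out.println(SortB(input)); // [5, 258, 2]: ordered by the low byte only, and in descending order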
Looks like a bit-by-bit radix sort to me, but it seems to be sorting backwards.
