First Byte's Bit Off-By-One after Steganography - java

I'm currently working on a steganography project that, given a message in bytes and the number of bits to modify per byte, hides the message in an arbitrary byte array.
In the first decoded byte of the resulting message, the value has its first (leftmost) bit set to '1' instead of '0'. For example, when using message "Foo".getBytes() and maxBits = 1 the result is "Æoo", not "Foo" (0b01000110 gets changed to 0b11000110). With message "Æoo".getBytes() and maxBits = 1 the result is "Æoo", meaning the bit is not getting flipped as far as I can tell.
Only certain values of maxBits for certain message bytes cause this error, for example "Foo" encounters this problem at maxBits equal to 1, 5, and 6, whereas "Test" encounters this problem at maxBits equal to 1, 3, and 5. Only the resulting first character ends up with its first bit set, and this problem only occurs at the specified values of this.maxBits related to the initial data.
Why, for certain values of maxBits, is the first bit of the resulting decoded message always 1?
Why do different inputs have different values for maxBits that work fine, and others that do not?
What is the pattern with the value of maxBits and the resulting erroneous results in relation to the original data?
Encode and Decode Methods:
public byte[] encodeMessage(byte[] data, byte[] message) {
byte[] encoded = data;
boolean[] messageBits = byteArrToBoolArr(message);
int index = 0;
for (int x = 0; x < messageBits.length; x++) {
encoded[index] = messageBits[x] ? setBit(encoded[index], x % this.maxBits) : unsetBit(encoded[index], x % this.maxBits);
if (x % this.maxBits == 0 && x != 0)
index++;
}
return encoded;
}
public byte[] decodeMessage(byte[] data) {
boolean[] messageBits = new boolean[data.length * this.maxBits];
int index = 0;
for (int x = 0; x < messageBits.length; x++) {
messageBits[x] = getBit(data[index], x % this.maxBits);
if (x % this.maxBits == 0 && x != 0)
index++;
}
return boolArrToByteArr(messageBits);
}
Unset, Set, and Get Methods:
public byte unsetBit(byte data, int pos) {
return (byte) (data & ~((1 << pos)));
}
public byte setBit(byte data, int pos) {
return (byte) (data | ((1 << pos)));
}
public boolean getBit(byte data, int pos) {
return ((data >>> pos) & 0x01) == 1;
}
Conversion Methods:
public boolean[] byteArrToBoolArr(byte[] b) {
boolean bool[] = new boolean[b.length * 8];
for (int x = 0; x < bool.length; x++) {
bool[x] = false;
if ((b[x / 8] & (1 << (7 - (x % 8)))) > 0)
bool[x] = true;
}
return bool;
}
public byte[] boolArrToByteArr(boolean[] bool) {
byte[] b = new byte[bool.length / 8];
for (int x = 0; x < b.length; x++) {
for (int y = 0; y < 8; y++) {
if (bool[x * 8 + y]) {
b[x] |= (128 >>> y);
}
}
}
return b;
}
Sample Code and Output:
test("Foo", 1);//Æoo
test("Foo", 2);//Foo
test("Foo", 3);//Foo
test("Foo", 4);//Foo
test("Foo", 5);//Æoo
test("Foo", 6);//Æoo
test("Foo", 7);//Foo
test("Foo", 8);//Foo
test("Test", 1);//Ôest
test("Test", 2);//Test
test("Test", 3);//Ôest
test("Test", 4);//Test
test("Test", 5);//Ôest
test("Test", 6);//Test
test("Test", 7);//Test
test("Test", 8);//Test
private static void test(String s, int x) {
BinaryModifier bm = null;
try {
bm = new BinaryModifier(x);//Takes maxBits as constructor param
} catch (BinaryException e) {
e.printStackTrace();
}
System.out.println(new String(bm.decodeMessage(bm.encodeMessage(new byte[1024], s.getBytes()))));
return;
}

Your logic of incrementing index has two flaws, which overwrite the first bit of the first letter. Obviously, the bug is expressed when the overwriting bit is different to the first bit.
if (x % this.maxBits == 0 && x != 0)
index++;
The first problem has to do with embedding only one bit per byte, i.e. maxBits = 1. After you have embedded the very first bit and reached the above conditional, x is still 0, since it will be incremented at the end of the loop. You should be incrementing index at this point, but x != 0 prevents you from doing so. Therefore, the second bit will also be embedded in the first byte, effectively overwriting the first bit. Since this logic also exists in the decode method, you read the first two bits from the first byte.
More specifically, if you embed a 00 or 11, it will be fine. But a 01 will be read as 11 and a 10 will be read as 00, i.e., whatever value the second bit has. If the first letter has an ASCII code less than or equal to 63 (00xxxxxx), or greater than or equal to 192 (11xxxxxx), it will come out fine. For example:
# -> # : 00100011 (35) -> 00100011 (35)
F -> Æ : 01000110 (70) -> 11000110 (198)
The second problem has to do with the x % this.maxBits == 0 part. Consider the case where we embed 3 bits per byte. After the 3rd bit, when we reach the conditional we still have x = 2, so the condition evaluates to false. After we have embedded a 4th bit, we do have x = 3 and we're allowed to move on to the next byte. However, this extra 4th bit will be written at the 0th position of the first byte, since x % this.maxBits will be 3 % 3 = 0. So again, we have a bit overwriting our very first bit. However, after the first cycle the modulo operation will correctly write only 3 bits per byte, so the rest of our message will be unaffected.
Consider the binary for "F", which is 01000110. By embedding N bits per byte, we effectively embed the following groups in the first few bytes.
1 bit 01 0 0 0 1 1 0
2 bits 010 00 11 0x
3 bits 0100 011 0xx
4 bits 01000 110x
5 bits 010001 10xxxx
6 bits 0100011 0xxxxx
7 bits 01000110
8 bits 01000110x
As you can see, for groups of 5 and 6 bits, the last bit of the first group is 1, which will overwrite our initial 0 bit. For all other cases the overwrite doesn't affect anything. Note that for 8 bits, we end up using the first bit of the second letter. If that happened to have an ASCII code greater than or equal to 128, it would again overwrite the very first 0 bit.
To address all problems, use either
for (int x = 0; x < messageBits.length; x++) {
// code in between
if ((x + 1) % this.maxBits == 0)
index++;
}
or
for (int x = 0; x < messageBits.length; ) {
// code in between
x++;
if (x % this.maxBits == 0)
index++;
}
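Another way to avoid the increment condition altogether is to derive both the carrier index and the bit position from the running bit counter. The following is a minimal sketch under that assumption. The class name StegoIndexSketch and the encode/decode signatures are made up for illustration and are not the asker's BinaryModifier; decode also takes the message length as an explicit parameter, which the original code does not do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; not the asker's BinaryModifier class.
class StegoIndexSketch {

    // Hide the message bits (most significant bit first) in the low maxBits bits of each carrier byte.
    static byte[] encode(byte[] carrier, byte[] message, int maxBits) {
        byte[] out = Arrays.copyOf(carrier, carrier.length); // leave the caller's array untouched
        int totalBits = message.length * 8;
        for (int x = 0; x < totalBits; x++) {
            boolean bit = ((message[x / 8] >> (7 - x % 8)) & 1) == 1;
            int index = x / maxBits; // carrier byte that holds message bit x
            int pos = x % maxBits;   // bit position inside that carrier byte
            if (bit) {
                out[index] |= (byte) (1 << pos);
            } else {
                out[index] &= (byte) ~(1 << pos);
            }
        }
        return out;
    }

    // Read back messageLength bytes using the same index and position arithmetic.
    static byte[] decode(byte[] carrier, int messageLength, int maxBits) {
        byte[] message = new byte[messageLength];
        for (int x = 0; x < messageLength * 8; x++) {
            int index = x / maxBits;
            int pos = x % maxBits;
            if (((carrier[index] >> pos) & 1) == 1) {
                message[x / 8] |= (byte) (0x80 >>> (x % 8));
            }
        }
        return message;
    }

    public static void main(String[] args) {
        byte[] msg = "Foo".getBytes(StandardCharsets.US_ASCII);
        for (int maxBits = 1; maxBits <= 8; maxBits++) {
            byte[] hidden = encode(new byte[1024], msg, maxBits);
            String back = new String(decode(hidden, msg.length, maxBits), StandardCharsets.US_ASCII);
            System.out.println(maxBits + " -> " + back);
        }
    }
}
```

Because index and pos are both computed from x, there is no separate counter that can fall out of step with the bit position.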
Your code has another potential problem which hasn't been expressed. If your data array has a size of 1024, but you only embed 3 letters, you will affect only the first few bytes, depending on the value of maxBits. However, for the extraction, you define your array to have a size of data.length * this.maxBits. So you end up reading bits from all of the bytes of the data array. This is currently no problem, because your array is populated by 0s, which decode to NUL characters that usually don't show up when printed. However, if your array had actual numbers, you'd end up reading a lot of garbage past the point of your embedded data.
There are two general ways of addressing this. You either
append a unique sequence of bits at the end of your message (marker), such that when you encounter that sequence you terminate the extraction, e.g. eight 0s, or
you add a few bits before embedding your actual data (header), which will tell you how to extract your data, e.g., how many bytes and how many bits per byte to read.
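The header option could look roughly like the sketch below. It assumes a 4-byte big-endian length prefix, which is one choice among many; the class and method names (LengthHeaderSketch, frame, unframe) are hypothetical. The framed bytes are what you would then embed, and unframe is applied to whatever the extraction returns.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of a length header; the 4-byte big-endian layout is an assumption.
class LengthHeaderSketch {

    // Prefix the message with its length so the extractor knows where to stop.
    static byte[] frame(byte[] message) {
        return ByteBuffer.allocate(4 + message.length)
                .putInt(message.length)
                .put(message)
                .array();
    }

    // Read the length, then return exactly that many bytes and ignore whatever follows.
    static byte[] unframe(byte[] extracted) {
        ByteBuffer buf = ByteBuffer.wrap(extracted);
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("implausible length header: " + length);
        }
        byte[] message = new byte[length];
        buf.get(message);
        return message;
    }

    public static void main(String[] args) {
        byte[] framed = frame("Foo".getBytes(StandardCharsets.US_ASCII));
        byte[] extracted = Arrays.copyOf(framed, 32); // simulate extra bytes read past the payload
        extracted[20] = 0x55;                         // garbage that must not leak into the result
        System.out.println(new String(unframe(extracted), StandardCharsets.US_ASCII));
    }
}
```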

One thing you're probably going to run afoul of is the nature of character encoding.
When you call s.getBytes() you are turning the string to bytes using your JVM's default encoding. Then you modify the bytes and you create a new String from the modified bytes again using the default encoding.
So the question is what that encoding is and precisely how it works. For example, the encoding may in some cases only look at the lower 7 bits of a byte relating to the character, in which case setting the top bit won't have any effect on the string created from the modified bytes.
If you really want to tell if your code is working right, do your testing by directly examining the byte[] being produced by your encode and decode methods, not by turning the modified bytes into strings and looking at the strings.
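A small helper for that kind of inspection might look like this. It is only a sketch; ByteDumpSketch and its methods are names invented here. It prints each byte as eight binary digits, so a flipped top bit is visible regardless of the platform's character encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for looking at raw bytes instead of decoded strings.
class ByteDumpSketch {

    // Render one byte as exactly eight binary digits, most significant bit first.
    static String toBinary(byte b) {
        return String.format("%8s", Integer.toBinaryString(b & 0xFF)).replace(' ', '0');
    }

    // Render a whole array, with bytes separated by spaces.
    static String dump(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(toBinary(bytes[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 01000110 01101111 01101111
        System.out.println(dump("Foo".getBytes(StandardCharsets.US_ASCII)));
    }
}
```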


Change the least significant bit (LSB) in java

I am trying to change the LSB of a numerical value, say 50, whose LSB is 0 because 50 % 2 is 0 (remainder operator), to a value of 1. Thus change the LSB from 0 to 1 in this case.
The code is below:
//Get the LSB from 50 using the modulus operator
lsb = 50 % 2;
//if the character equals 1
//and the least significant bit is 0, add 1
if(binaryValue == '1' && lsb ==0)
{
//This clearly does not work.
//How do I assign the altered LSB (1) to the value of 50?
50 = lsb + 1;
}
I am having problems inside the if statement, where I am trying to assign the altered LSB, which in this case is 1, to the value of 50. This is not the full code, thus all values are different.
Thanks
The xor operation ^ can be used to flip the value of a single bit. For example
int value = 4;
value = value ^ 1;
System.out.println(value);
Will output 5 since the least significant bit was changed to one.
XOR in java:
System.out.println(50 ^ 1);
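Note that XOR toggles the bit, so applying it to a value whose LSB is already 1 would clear it. If the goal is to force the LSB to a particular value, OR and AND are the usual tools. A minimal sketch (LsbSketch and its method names are made up for illustration):

```java
// Hypothetical helpers showing set, clear, flip and assign on the least significant bit.
class LsbSketch {

    static int setLsb(int v) {
        return v | 1;          // LSB becomes 1, other bits unchanged
    }

    static int clearLsb(int v) {
        return v & ~1;         // LSB becomes 0, other bits unchanged
    }

    static int flipLsb(int v) {
        return v ^ 1;          // LSB is inverted
    }

    static int withLsb(int v, int bit) {
        return (v & ~1) | (bit & 1); // LSB becomes the given bit (0 or 1)
    }

    public static void main(String[] args) {
        System.out.println(setLsb(50));     // 51
        System.out.println(setLsb(51));     // 51
        System.out.println(flipLsb(51));    // 50
        System.out.println(withLsb(50, 1)); // 51
    }
}
```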

What does this binary documentation mean?

I'm trying to decode somebody's byte array and I'm stuck at this part:
<state> ::= "01" (2 bits) for A
"10" (2 bits) for B
"11" (2 bits) for C
I think this wants me to look at the next 2 bits of the next byte. Would that mean the least or most significant digits of the byte? I suppose I would just throw away the last 6 bits if it means the least significant?
I found this code for looking at the bits of a byte:
for (int i = 0; i < byteArray.Length; i++)
{
byte b = byteArray[i];
byte mask = 0x01;
for (int j = 0; j < 8; j++)
{
bool value = (b & mask) != 0; // true if the bit selected by mask is set
mask <<= 1; // move the mask to the next higher bit
}
}
Can someone expand on what this does exactly?
Just to give you a start:
To extract individual bits of a byte, you use "&", called the bitwise and operator. The bitwise and operation means "preserve all bits which are set on both sides". E.g. when you calculate the bitwise-and of two bytes, e.g. 00000011 & 00000010, then the result is 00000010, because only the bit at the second last position is set in both sides.
In the Java programming language, the very same example looks like this:
int a = 3;
int b = 2;
int bitwiseAndResult = a & b; // bitwiseAndResult will be equal to 2 after this
Now to examine if the n'th bit of some int is set, you can do this:
int intToExamine = ...;
if (((intToExamine >> n) & 1) != 0) {
// here we know that the n'th bit was set
}
The >> is called the bitshift operator. It simply shifts the bits from left to right, like this: 00011010 >> 2 will have the result 00000110.
So from the above you can see that for extracting the n'th bit of some value, you first shift the n'th bit to position 0 (note that the first bit is bit 0, not bit 1), and then you use the bitwise and operator (&) to only keep that bit 0.
Here are some simple examples of bitwise and bit shift operators:
http://www.tutorialspoint.com/java/java_bitwise_operators_examples.htm
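The same shift-and-mask idea extends to fields wider than one bit. The sketch below pulls 2-bit fields out of a byte. It assumes the fields are packed starting at the most significant bits, which the quoted documentation does not state; TwoBitFields and field are hypothetical names.

```java
// Hypothetical sketch: extract 2-bit fields from a byte, assuming MSB-first packing.
class TwoBitFields {

    // fieldIndex 0 is the two most significant bits, fieldIndex 3 the two least significant.
    static int field(byte b, int fieldIndex) {
        int shift = 6 - 2 * fieldIndex;       // 6, 4, 2 or 0
        return ((b & 0xFF) >> shift) & 0b11;  // b & 0xFF avoids sign extension
    }

    public static void main(String[] args) {
        byte b = (byte) 0b01101100; // fields: 01 10 11 00
        for (int i = 0; i < 4; i++) {
            System.out.println("field " + i + " = " + field(b, i));
        }
    }
}
```

If the format turns out to be LSB-first instead, the only change would be the shift, which becomes 2 * fieldIndex.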

One-byte bool. Why?

In C++, why does a bool require one byte to store true or false where just one bit is enough for that, like 0 for false and 1 for true? (Why does Java also require one byte?)
Secondly, how much safer is it to use the following?
struct Bool {
bool trueOrFalse : 1;
};
Thirdly, even if it is safe, is the above field technique really going to help? Since I have heard that we save space there, but still compiler generated code to access them is bigger and slower than the code generated to access the primitives.
Why does a bool require one byte to store true or false where just one bit is enough
Because every object in C++ must be individually addressable* (that is, you must be able to have a pointer to it). You cannot address an individual bit (at least not on conventional hardware).
How much safer is it to use the following?
It's "safe", but it doesn't achieve much.
is the above field technique really going to help?
No, for the same reasons as above ;)
but still compiler generated code to access them is bigger and slower than the code generated to access the primitives.
Yes, this is true. On most platforms, this requires accessing the containing byte (or int or whatever), and then performing bit-shifts and bit-mask operations to access the relevant bit.
If you're really concerned about memory usage, you can use a std::bitset in C++ or a BitSet in Java, which pack bits.
* With a few exceptions.
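For the Java side, a minimal sketch of packing flags with java.util.BitSet could look like this. The class BitSetFlagsSketch and the pack helper are invented for illustration; only the BitSet calls themselves are standard library API.

```java
import java.util.BitSet;

// Hypothetical sketch: copy a boolean[] into a BitSet, which stores the flags packed into longs.
class BitSetFlagsSketch {

    static BitSet pack(boolean[] flags) {
        BitSet bits = new BitSet(flags.length);
        for (int i = 0; i < flags.length; i++) {
            if (flags[i]) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        boolean[] flags = new boolean[4000];
        flags[3] = true;
        flags[64] = true;
        BitSet bits = pack(flags);
        System.out.println(bits.get(64));       // true
        System.out.println(bits.cardinality()); // 2
    }
}
```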
Using a single bit is much slower and much more complicated to allocate. In C/C++ there is no way to get the address of one bit so you wouldn't be able to do &trueOrFalse as a bit.
Java has BitSet and EnumSet, which both use bitmaps. If you have a very small number of flags it may not make much difference, e.g. objects have to be at least byte aligned and in HotSpot are 8-byte aligned (in C++ a new object can be 8- to 16-byte aligned). This means saving a few bits might not save any space.
In Java at least, Bits are not faster unless they fit in cache better.
public static void main(String... ignored) {
BitSet bits = new BitSet(4000);
byte[] bytes = new byte[4000];
short[] shorts = new short[4000];
int[] ints = new int[4000];
for (int i = 0; i < 100; i++) {
long bitTime = timeFlip(bits) + timeFlip(bits);
long bytesTime = timeFlip(bytes) + timeFlip(bytes);
long shortsTime = timeFlip(shorts) + timeFlip(shorts);
long intsTime = timeFlip(ints) + timeFlip(ints);
System.out.printf("Flip time bits %.1f ns, bytes %.1f, shorts %.1f, ints %.1f%n",
bitTime / 2.0 / bits.size(), bytesTime / 2.0 / bytes.length,
shortsTime / 2.0 / shorts.length, intsTime / 2.0 / ints.length);
}
}
private static long timeFlip(BitSet bits) {
long start = System.nanoTime();
for (int i = 0, len = bits.size(); i < len; i++)
bits.flip(i);
return System.nanoTime() - start;
}
private static long timeFlip(short[] shorts) {
long start = System.nanoTime();
for (int i = 0, len = shorts.length; i < len; i++)
shorts[i] ^= 1;
return System.nanoTime() - start;
}
private static long timeFlip(byte[] bytes) {
long start = System.nanoTime();
for (int i = 0, len = bytes.length; i < len; i++)
bytes[i] ^= 1;
return System.nanoTime() - start;
}
private static long timeFlip(int[] ints) {
long start = System.nanoTime();
for (int i = 0, len = ints.length; i < len; i++)
ints[i] ^= 1;
return System.nanoTime() - start;
}
prints
Flip time bits 5.0 ns, bytes 0.6, shorts 0.6, ints 0.6
for sizes of 40000 and 400K
Flip time bits 6.2 ns, bytes 0.7, shorts 0.8, ints 1.1
for 4M
Flip time bits 4.1 ns, bytes 0.5, shorts 1.0, ints 2.3
and 40M
Flip time bits 6.2 ns, bytes 0.7, shorts 1.1, ints 2.4
If you want to store only one bit of information, there is nothing more compact than a char, which is the smallest addressable memory unit in C/C++. (Depending on the implementation, a bool might have the same size as a char but it is allowed to be bigger.)
A char is guaranteed by the C standard to hold at least 8 bits; however, it can also consist of more. The exact number is available via the CHAR_BIT macro defined in limits.h (in C) or climits (C++). Today, it is most common that CHAR_BIT == 8 but you cannot rely on it. It is guaranteed to be 8, however, on POSIX compliant systems and on Windows.
Though it is not possible to reduce the memory footprint for a single flag, it is of course possible to combine multiple flags. Besides doing all bit operations manually, there are some alternatives:
If you know the number of bits at compile time
bitfields (as in your question). But beware, the ordering of fields is not guaranteed, which may result in portability issues.
std::bitset
If you know the size only at runtime
boost::dynamic_bitset
If you have to deal with large bitvectors, take a look at the BitMagic library. It supports compression and is heavily tuned.
As others have pointed out already, saving a few bits is not always a good idea. Possible drawbacks are:
Less readable code
Reduced execution speed because of the extra extraction code.
For the same reason, increases in code size, which may outweigh the savings in data consumption.
Hidden synchronization issues in multithreaded programs. For example, flipping two different bits by two different threads may result in a race condition. In contrast, it is always safe for two threads to modify two different objects of primitive types (e.g., char).
Typically, it makes sense when you are dealing with huge data because then you will benefit from less pressure on memory and cache.
Why don't you just store the state in a byte? I haven't actually tested the code below, but it should give you an idea. You can even use a short or an int for 16 or 32 states. I believe I have a working Java example as well. I'll post it when I find it.
__int8 state = 0x0;
bool getState(int bit)
{
return (state & (1 << bit)) != 0x0;
}
void setAllOnline(bool online)
{
state = -online;
}
void reverseState(int bit)
{
state ^= (1 << bit);
}
Alright, here's the Java version. I've stored the state in an int since, if I remember correctly, even a byte would take up 4 bytes anyway. This obviously isn't being used as an array.
public class State
{
private int STATE;
public State() {
STATE = 0x0;
}
public State(int previous) {
STATE = previous;
}
/*
* #Usage - Used along side the #setMultiple(int, boolean);
* #Returns the value of a single bit.
*/
public static int valueOf(int bit)
{
return 1 << bit;
}
/*
* #Usage - Used along side the #setMultiple(int, boolean);
* #Returns the value of an array of bits.
*/
public static int valueOf(int... bits)
{
int value = 0x0;
for (int bit : bits)
value |= (1 << bit);
return value;
}
/*
* #Returns the value currently stored or the values of all 32 bits.
*/
public int getValue()
{
return STATE;
}
/*
* #Usage - Turns all bits online or offline.
* #Return - <TRUE> if all states are online. Otherwise <FALSE>.
*/
public boolean setAll(boolean online)
{
STATE = online ? -1 : 0;
return online;
}
/*
* #Usage - sets multiple bits at once to a specific state.
* #Warning - DO NOT SET BITS TO THIS! Use setMultiple(State.valueOf(#), boolean);
* #Return - <TRUE> if states were set to online. Otherwise <FALSE>.
*/
public boolean setMultiple(int value, boolean online)
{
STATE |= value;
if (!online)
STATE ^= value;
return online;
}
/*
* #Usage - sets a single bit to a specific state.
* #Return - <TRUE> if this bit was set to online. Otherwise <FALSE>.
*/
public boolean set(int bit, boolean online)
{
STATE |= (1 << bit);
if(!online)
STATE ^= (1 << bit);
return online;
}
/*
* #return = the new current state of this bit.
* #Usage = Good for situations that are reversed.
*/
public boolean reverse(int bit)
{
return ((STATE ^= (1 << bit)) & (1 << bit)) != 0; // test only the flipped bit, not the whole value
}
/*
* #return = <TRUE> if this bit is online. Otherwise <FALSE>.
*/
public boolean online(int bit)
{
int value = 1 << bit;
return (STATE & value) == value;
}
/*
* #return = a String contains full debug information.
*/
@Override
public String toString()
{
StringBuilder sb = new StringBuilder();
sb.append("TOTAL VALUE: ");
sb.append(STATE);
for (int i = 0; i < 0x20; i++)
{
sb.append("\nState(");
sb.append(i);
sb.append("): ");
sb.append(online(i));
sb.append(", ValueOf: ");
sb.append(State.valueOf(i));
}
return sb.toString();
}
}
Also I should point out that you really shouldn't utilize a special class for this, but to just have the variable stored within the class that'll be most likely utilizing it. If you plan to have 100's or even 1000's of Boolean values consider an array of bytes.
E.g. the below example.
boolean[] states = new boolean[4096];
can be converted into the below.
int[] states = new int[128];
Now you're probably wondering how you'll access index 4095 in an array of 128 ints. Simplified, this is what happens: 4095 is shifted 5 bits to the right, which is the same as dividing by 32 and rounding down, so 4095 >> 5 = 127 and we are at index 127 of the array. Then we perform 4095 & 31, which masks the index down to a value between 0 and 31, the bit position within that int. This masking only works with powers of two minus 1, e.g. 0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, etc.
So now we can access the bit at that position. As you can see this is very very compact and beats having 4096 booleans in a file :) This will also provide a much faster read/write to a binary file. I have no idea what this BitSet stuff is, but it looks like complete garbage and since byte,short,int,long are already in their bit forms technically you might as well use them as is. Then creating some complex class to access the individual bits from memory which is what I could grasp from reading a few posts.
boolean getState(int index)
{
return (states[index >> 5] & 1 << (index & 0x1F)) != 0x0;
}
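The read accessor above has a natural write counterpart. The following sketch wraps both around an int[]; the class PackedBits and its methods are hypothetical names, and no bounds checking beyond what the array itself provides is attempted.

```java
// Hypothetical sketch: 32 flags per int, addressed by a single flat index.
class PackedBits {
    private final int[] words;

    PackedBits(int bitCount) {
        words = new int[(bitCount + 31) >>> 5]; // round up to whole ints
    }

    boolean get(int index) {
        return (words[index >> 5] & (1 << (index & 0x1F))) != 0;
    }

    void set(int index, boolean on) {
        if (on) {
            words[index >> 5] |= 1 << (index & 0x1F);    // turn the bit on
        } else {
            words[index >> 5] &= ~(1 << (index & 0x1F)); // turn the bit off
        }
    }

    int wordCount() {
        return words.length;
    }

    public static void main(String[] args) {
        PackedBits states = new PackedBits(4096);
        states.set(4095, true);
        System.out.println(states.wordCount()); // 128
        System.out.println(states.get(4095));   // true
        System.out.println(states.get(4094));   // false
    }
}
```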
Further information...
Basically if the above was a bit confusing here's a simplified version of what's happening.
The types "byte", "short", "int", "long" all are data types which have different ranges.
You can view this link: http://msdn.microsoft.com/en-us/library/s3f49ktz(v=vs.80).aspx
To see the data ranges of each.
So a byte is equal to 8 bits. So an int which is 4 bytes will be 32 bits.
Now there isn't any easy way to raise some value to the N'th power. However, thanks to bit shifting we can simulate it for powers of two. Performing 1 << N equates to 1 * 2^N. So if we did 2 << N we'd be doing 2 * 2^N. So to compute powers of two always do "1 << N".
Now we know that an int has 32 bits, so we can use each bit individually and simply index them.
To keep things simple, think of the "&" operator as a way to check whether a value contains the bits of another value. Say we have the value 31. To get to 31 we add up bits 0 through 4, which are 1, 2, 4, 8 and 16. When we perform 31 & 16, the result is 16, because bit 4 (2^4 = 16) is set in 31. Now say we perform 31 & 20, which checks whether bits 2 and 4 are set. This returns 20, since both bits are set (2^2 + 2^4 = 4 + 16 = 20). Finally, say we perform 31 & 48, which checks for bits 4 and 5. Bit 5 is not set in 31, so this returns only 16. It does not return 0. So when checking several bits at once, compare the result against the mask itself rather than checking whether it is nonzero.
The below will verify if an individual bit is at 0 or 1. 0 being false, and 1 being true.
bool getState(int bit)
{
return (state & (1 << bit)) != 0x0;
}
The below is example of checking two values if they contain those bits. Think of it like each bit is represented as 2^BIT so when we do
I'll quickly go over some of the operators. We've just recently explained the "&" operator slightly. Now for the "|" operator.
When performing the following
int value = 31;
value |= 16;
value |= 16;
value |= 16;
value |= 16;
The value will still be 31. This is because bit 4 or 2^4=16 is already turned on or set to 1. So performing "|" returns that value with that bit turned on. If it's already turned on no changes are made. We utilize "|=" to actually set the variable to that returned value.
Instead of doing -> "value = value | 16;". We just do "value |= 16;".
Now let's look a bit further into how the "&" and "|" can be utilized.
/*
* This contains bits 0,1,2,3,4,8,9 turned on.
*/
const int CHECK = 1 | 2 | 4 | 8 | 16 | 256 | 512;
/*
* This is some value where we add bits 0 through 9, but we skip 0 and 8.
*/
int value = 2 | 4 | 8 | 16 | 32 | 64 | 128 | 512;
So when we perform the below code.
int return_code = value & CHECK;
The return code will be 2 + 4 + 8 + 16 + 512 = 542
So we were checking for 799, but we received 542. This is because bits 0 and 8 are offline; together they equal 256 + 1 = 257, and 799 - 257 = 542.
The above is a great way to check, say in a video game, whether any of a given set of buttons is pressed. We can test all of those bits with a single check, which is more efficient than performing a Boolean check on every single state.
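As a rough Java illustration of the two kinds of check (any of the bits versus all of the bits), using the same CHECK and value constants as above; ButtonMask, anySet and allSet are names invented for this sketch:

```java
// Hypothetical sketch: "any bit set" versus "all bits set" against one mask.
class ButtonMask {
    static final int CHECK = 1 | 2 | 4 | 8 | 16 | 256 | 512; // 799

    // True if at least one bit of mask is set in value.
    static boolean anySet(int value, int mask) {
        return (value & mask) != 0;
    }

    // True only if every bit of mask is set in value.
    static boolean allSet(int value, int mask) {
        return (value & mask) == mask;
    }

    public static void main(String[] args) {
        int value = 2 | 4 | 8 | 16 | 32 | 64 | 128 | 512;
        System.out.println(value & CHECK);        // 542
        System.out.println(anySet(value, CHECK)); // true
        System.out.println(allSet(value, CHECK)); // false, bits 0 and 8 are missing
    }
}
```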
Now let's say we have Boolean value which is always reversed.
Normally you'd do something like
bool state = false;
state = !state;
Well this can be done with bits as well utilizing the "^" operator.
Just as we performed "1 << N" to choose the whole value of that bit. We can do the same with the reverse. So just like we showed how "|=" stores the return we will do the same with "^=". So what this does is if that bit is on we turn it off. If it's off we turn it on.
void reverseState(int bit)
{
state ^= (1 << bit);
}
You can even have it return the current state. If you wanted it to return the previous state just swap "!=" to "==". So what this does is performs the reversal then checks the current state.
bool reverseAndGet(int bit)
{
return ((state ^= (1 << bit)) & (1 << bit)) != 0x0;
}
Storing multiple non single bit aka bool values into a int can also be done. Let's say we normally write out our coordinate position like the below.
int posX = 0;
int posY = 0;
int posZ = 0;
Now let's say these never went past 1023, so 0 through 1023 is the full range on each axis. I chose 1023 because, as previously mentioned, you can use the "&" operator to force a value into the range 0 to 2^N - 1. If your range is 0 through 1023, "value & 1023" always yields a value between 0 and 1023 without any range checks (out-of-range inputs wrap around rather than being rejected). Keep in mind that, as previously mentioned, this only works with powers of two minus one: 2^10 - 1 = 1024 - 1 = 1023.
E.g. no more if (value >= 0 && value <= 1023).
So 2^10 = 1024, which requires 10 bits in order to hold a number between 0 and 1023.
So 10 x 3 = 30, which is less than or equal to 32, so an int is sufficient for holding all three values.
So we can perform the following. The shift amounts are 0, 10 and 20. I put the 0 there to show you visually that 2^0 = 1, so # * 1 = #. We need y << 10 because x uses up the low 10 bits, which hold 0 through 1023, so we need to multiply y by 1024 to keep the values from overlapping. Then z needs to be multiplied by 2^20, which is 1,048,576.
int position = (x << 0) | (y << 10) | (z << 20);
This makes comparisons fast.
We can now do
return this.position == position;
as opposed to
return this.x == x && this.y == y && this.z == z;
Now what if we wanted the actual positions of each?
For the x we simply do the following.
int getX()
{
return position & 1023;
}
Then for the y we need to perform a right bit shift and then AND it.
int getY()
{
return (position >> 10) & 1023;
}
As you may guess the Z is the same as the Y, but instead of 10 we use 20.
int getZ()
{
return (position >> 20) & 1023;
}
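Put together in Java, the packing and unpacking might look like the sketch below. PackedPosition is a hypothetical class name, and the inputs are masked with 1023 on the way in, so out-of-range coordinates wrap instead of corrupting neighbouring fields.

```java
// Hypothetical sketch: three 10-bit coordinates packed into one int.
class PackedPosition {
    private static final int MASK = 1023; // 2^10 - 1

    static int pack(int x, int y, int z) {
        return (x & MASK) | ((y & MASK) << 10) | ((z & MASK) << 20);
    }

    static int x(int position) {
        return position & MASK;
    }

    static int y(int position) {
        return (position >> 10) & MASK;
    }

    static int z(int position) {
        return (position >> 20) & MASK;
    }

    public static void main(String[] args) {
        int position = pack(5, 1023, 512);
        System.out.println(x(position) + ", " + y(position) + ", " + z(position)); // 5, 1023, 512
    }
}
```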
I hope whoever views this will find it worth while information :).
If you really want to use 1 bit, you can use a char to store 8 booleans, and bitshift to get the value of the one you want. I doubt it will be faster, and it's probably going to give you a lot of headaches working that way, but technically it's possible.
On a side note, an attempt like this could prove useful for systems that don't have a lot of memory available for variables but do have more processing power than you need. I highly doubt you will ever need it though.

How to find the closest value of 2^N to a given input?

I somehow have to keep my program running until the output of the exponent function exceeds the input value, and then compare that to the previous output of the exponent function. How would I do something like that, even if in just pseudocode?
Find logarithm to base 2 from given number => x := log (2, input)
Round the value acquired in step 1 both down and up => y := floor(x), z := floor(x) + 1
Find 2^y, 2^z, compare them both with input and choose the one that suits better
Depending on which language you're using, you can do this easily using bitwise operations. You want either the value with a single 1 bit set greater than the highest one bit set in the input value, or the value with the highest one bit set in the input value.
If you do set all of the bits below the highest set bit to 1, then add one you end up with the next greater power of two. You can right shift this to get the next lower power of two and choose the closer of the two.
unsigned closest_power_of_two(unsigned value)
{
unsigned above = (value - 1); // handle case where input is a power of two
above |= above >> 1; // set all of the bits below the highest bit
above |= above >> 2;
above |= above >> 4;
above |= above >> 8;
above |= above >> 16;
++above; // add one, carrying all the way through
// leaving only one bit set.
unsigned below = above >> 1; // find the next lower power of two.
return (above - value) < (value - below) ? above : below;
}
See Bit Twiddling Hacks for other similar tricks.
Apart from the looping there's also one solution that may be faster depending on how the compiler maps the nlz instruction:
public int nextPowerOfTwo(int val) {
return 1 << (32 - Integer.numberOfLeadingZeros(val - 1));
}
No explicit looping and certainly more efficient than the solutions using Math.pow. Hard to say more without looking what code the compiler generates for numberOfLeadingZeros.
With that we can then easily get the lower power of 2 and then compare which one is nearer - the last part has to be done for each solution it seems to me.
set x to 1.
while x < target, set x = 2 * x
then just return x or x / 2, whichever is closer to the target.
public static int neareastPower2(int in) {
if (in <= 1) {
return 1;
}
int result = 2;
while (in > 3) {
in = in >> 1;
result = result << 1;
}
if (in == 3) {
return result << 1;
} else {
return result;
}
}
I will use 5 as input for an easy example instead of 50.
Convert the input to bits/bytes, in this case 101
Since you are looking for powers of two, your answer will always be of the form 10000...00 (a one followed by a certain number of zeros). You take the input value (3 bits) and calculate the integer value of 100 (3 bits) and 1000 (4 bits). The integer 100 will be smaller than the input, the integer 1000 will be larger.
You calculate the difference between the input and the two possible values and use the smallest one. In this case 100 = 4 (difference of 1) while 1000 = 8 (difference of 3), so the searched answer is 4
public static int neareastPower2(int in) {
return (int) Math.pow(2, Math.round(Math.log(in) / Math.log(2)));
}
Here's the pseudo code for a function that takes the input number and returns your answer.
int findit( int x) {
int a = int(log(x)/log(2));
if(x >= 2^a + 2^(a-1))
return 2^(a+1)
else
return 2^a
}
Here's a bitwise solution. It will return the lesser of 2^N and 2^(N+1) in case of a tie. This should be very fast compared to invoking the log() function. It assumes unsigned arithmetic and value >= 1.
let mask = (~0 >> 1) + 1
while ( mask > value )
mask = mask >> 1
return ( (value - mask) <= (mask >> 1) ) ? mask : mask << 1
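In Java, the highest set bit is available directly through Integer.highestOneBit, so a comparable routine can be sketched without a loop. This is an illustration rather than a translation of the pseudocode: the class name NearestPow2Bits is made up, ties go to the lower power, and inputs are restricted to 1 through 2^30 so that the upper candidate cannot overflow an int.

```java
// Hypothetical sketch: nearest power of two using Integer.highestOneBit.
class NearestPow2Bits {

    static int nearest(int value) {
        if (value < 1 || value > (1 << 30)) {
            throw new IllegalArgumentException("value must be in [1, 2^30]: " + value);
        }
        int below = Integer.highestOneBit(value); // largest power of two <= value
        if (below == value) {
            return value;                         // already a power of two
        }
        int above = below << 1;                   // smallest power of two > value
        return (value - below <= above - value) ? below : above;
    }

    public static void main(String[] args) {
        System.out.println(nearest(5));  // 4
        System.out.println(nearest(6));  // 4 (tie, lower wins)
        System.out.println(nearest(7));  // 8
        System.out.println(nearest(50)); // 64
    }
}
```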

Set a BitSet as a primitive type?

In Java, can you create a BitSet of size 8 and store it as a byte in order to output it? The documentation on BitSets doesn't mention it. Does that mean no?
You can't cast BitSet to byte.
You can write code to do what you want though. Given a BitSet named bits, here you go:
byte output = 0;
for (int i = 0; i < 8; i++) {
if (bits.get(i)) {
output |= 1 << (7 - i);
}
}
Update: The above code assumes that your bits are indexed 0 to 7 from left to right. E.g. assuming the bits 01101001 you consider bit 0 to be the leftmost 0. If however you're assigning the bits from right to left then bit 0 would be the rightmost 1. In which case you want output |= 1 << i instead.
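For the right-to-left convention (bit 0 is the least significant bit), the standard BitSet.toByteArray() and BitSet.valueOf(byte[]) methods, available since Java 7, already use that layout. A minimal sketch, with BitSetToByte and its two helper names being hypothetical:

```java
import java.util.BitSet;

// Hypothetical sketch: convert between a BitSet of at most 8 bits and a byte, bit 0 = LSB.
class BitSetToByte {

    // toByteArray() returns an empty array when no bit is set, so guard for that.
    static byte lowByte(BitSet bits) {
        byte[] bytes = bits.toByteArray();
        return bytes.length == 0 ? 0 : bytes[0];
    }

    static BitSet fromByte(byte b) {
        return BitSet.valueOf(new byte[] { b });
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(8);
        bits.set(0);
        bits.set(6);
        System.out.println(lowByte(bits));                 // 65, i.e. 0b01000001
        System.out.println(fromByte((byte) 65).get(6));    // true
    }
}
```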
There's nothing built in for that. You could implement that yourself obviously.
A BitSet is an array of bits.
The JVM operand stack works with 32-bit slots.
A primitive boolean carries 1 bit of information but is handled as a 32-bit int on the stack; an array of boolean is typically stored as an array of bytes.
In a BitSet, each component of the bit set has a boolean value.
From the BitSet documentation: "Every bit set has a current size, which is the number of bits of space currently in use by the bit set. Note that the size is related to the implementation of a bit set, so it may change with implementation. The length of a bit set relates to logical length of a bit set and is defined independently of implementation."
The BitSet class is obviously not intended to export or import its bits to native datatypes, and it is also quite heavy if you just want to deal with the fixed size of a single byte. It might thus not be what you need if you just want to manipulate the bits of a byte independently and then use the resulting byte. It seems you might just want to use an API like this:
SimpleBitSet bs = new SimpleBitSet( 'A' );
bs.setBit( 5 );
byte mybyte = bs.getByte();
So an implementation of such a simplified bit set could look like this:
public class SimpleBitSet
{
private byte bits;
public SimpleBitSet( int bits )
{
this.bits = (byte) bits;
}
public byte getByte()
{
return bits;
}
public boolean getBit( int idx )
{
checkIndex( idx );
return ( bits & ( 1 << idx ) ) != 0;
}
public void setBit( int idx )
{
checkIndex( idx );
bits |= 1 << idx;
}
public void clearBit( int idx )
{
checkIndex( idx );
bits &= ~( 1 << idx );
}
protected void checkIndex( int idx )
{
if( idx < 0 || idx > 7 )
throw new IllegalArgumentException( "index: " + idx );
}
}
