For an application I need to have an unsigned data-type with 24 bits.
Unfortunately such a data type is not available in Java. I'm planning to implement it as a new class. But I'm not sure about the performance of such an implementation.
Is it advisable to write my own class?
If it is advisable, is it possible to get good performance?
Presumably, you mean implement it as a class which uses a larger data type and bounds checking, like this:
public class Unsigned24 {
private static final int MAX_UNSIGNED24 = (1 << 24) - 1;
private static final int MIN_UNSIGNED24 = 0;
private final int value;
public Unsigned24(int value) {
if (value > MAX_UNSIGNED24 || value < MIN_UNSIGNED24)
throw new IllegalArgumentException("value out of bounds: " + value);
this.value = value;
}
public int getValue() {
return value;
}
// ... other methods, such as equals(), comparison, addition, subtraction, etc.
}
This would work, but might not be worth the trouble. Also, it doesn't really take only 24 bits of memory, but rather 32 plus the overhead for an object.
It really depends on your goals. Why do you want a 24-bit integer?
Is it just because you have bounds-constraints on the values? If so, you might want to do something like the above.
Is it because you have a lot of them, and want to save memory? If so, you might want to build some class that abstracts an array of 24-bit integers, and internally saves them consecutively in a byte-array.
Is it because you are interfacing with some hardware or network interface that expects exactly 24 bits? In that case, you might want to look at the java.nio classes.
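The memory-saving option mentioned above (an array of 24-bit integers stored consecutively in a byte array) could look roughly like the following. This is only a sketch: the class name PackedUInt24Array and its methods are made up for illustration, and the byte order (most significant byte first) is an arbitrary assumption.

```java
/** Hypothetical helper: stores unsigned 24-bit values back to back in a byte[]. */
final class PackedUInt24Array {
    private static final int MAX_VALUE = (1 << 24) - 1; // 16777215
    private final byte[] data;

    PackedUInt24Array(int length) {
        data = new byte[length * 3]; // 3 bytes per element
    }

    int length() {
        return data.length / 3;
    }

    /** Reads element i as an int in the range 0..16777215. */
    int get(int i) {
        int p = i * 3;
        // Mask with 0xFF because Java bytes are signed.
        return ((data[p] & 0xFF) << 16) | ((data[p + 1] & 0xFF) << 8) | (data[p + 2] & 0xFF);
    }

    /** Stores value at element i, most significant byte first. */
    void set(int i, int value) {
        if (value < 0 || value > MAX_VALUE) {
            throw new IllegalArgumentException("value out of bounds: " + value);
        }
        int p = i * 3;
        data[p] = (byte) (value >>> 16);
        data[p + 1] = (byte) (value >>> 8);
        data[p + 2] = (byte) value;
    }
}
```

Each element costs exactly 3 bytes plus one array header for the whole collection, at the price of a few shifts and masks per access.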
If you want to save space, you could use int for calculation and map the least significant 3 bytes to a byte[] or just three bytes:
public static byte[] convert(int i) {
    // Most significant of the three bytes first; the casts are required in Java.
    return new byte[]{ (byte) (i >> 16), (byte) (i >> 8), (byte) i };
}
public static int convert(byte[] b) {
    if (b == null || b.length != 3)
        throw new IllegalArgumentException();
    // Mask with 0xff because Java bytes are signed.
    return ((b[0] & 0xff) << 16) | ((b[1] & 0xff) << 8) | (b[2] & 0xff);
}
(can't verify if it is bug free but at least it should give an idea)
I want to extrapolate a byte array into a double array.
I know how to do it in Java. But the AS converter doesn't work for this... :-D
This is the class I want to write in Kotlin:
class ByteArrayToDoubleArrayConverter {
public double[] invoke(byte[] bytes) {
double[] doubles = new double[bytes.length / 2];
int i = 0;
for (int n = 0; n < bytes.length; n = n + 2) {
doubles[i] = (bytes[n] & 0xFF) | (bytes[n + 1] << 8);
i = i + 1;
}
return doubles;
}
}
This would be a typical example of what results are expected:
class ByteArrayToDoubleArrayConverterTest {
@Test
fun `check typical values`() {
val bufferSize = 8
val bytes = ByteArray(bufferSize)
bytes[0] = 1
bytes[1] = 0
bytes[2] = 0
bytes[3] = 1
bytes[4] = 0
bytes[5] = 2
bytes[6] = 1
bytes[7] = 1
val doubles = ByteArrayToDoubleArrayConverter().invoke(bytes)
assertTrue(1.0 == doubles[0])
assertTrue(256.0 == doubles[1])
assertTrue(512.0 == doubles[2])
assertTrue(257.0 == doubles[3])
}
}
Any idea? Thanks!!!
I think this would be clearest with a helper function. Here's an extension function that uses a lambda to convert pairs of bytes into a DoubleArray:
inline fun ByteArray.mapPairsToDoubles(block: (Byte, Byte) -> Double)
= DoubleArray(size / 2){ i -> block(this[2 * i], this[2 * i + 1]) }
That uses the DoubleArray constructor which takes an initialisation lambda as well as a size, so you don't need to loop through setting values after construction.
The required function then simply needs to know how to convert each pair of bytes into a double. Though it would be more idiomatic as an extension function rather than a class:
fun ByteArray.toDoubleSamples() = mapPairsToDoubles{ a, b ->
(a.toInt() and 0xFF or (b.toInt() shl 8)).toDouble()
}
You can then call it with e.g.:
bytes.toDoubleSamples()
(.toXxx() is the conventional name for a function which returns a transformed version of an object. The standard name for this sort of function would be toDoubleArray(), but that normally converts each value to its own double; what you're doing is more specialised, so a more specialised name would avoid confusion.)
The only awkward thing there (and the reason why the direct conversion from Java fails) is that Kotlin is much more fussy about its numeric types, and won't automatically promote them the way Java and C do; it also doesn't have byte overloads for its bitwise operators. So you need to call toInt() explicitly on each byte before you can call and and shl, and then call toDouble() on the result.
The result is code that is a lot shorter, hopefully much more readable, and also very efficient! (No intermediate arrays or lists, and — thanks to the inline — not even any unnecessary function calls.)
(It's a bit more awkward than most Kotlin code, as primitive arrays aren't as well-supported as reference-based arrays — which are themselves not as well-supported as lists. This is mainly for legacy reasons to do with Java compatibility. But it's a shame that there's no chunked() implementation for ByteArray, which could have avoided the helper function, though at the cost of a temporary list.)
A byte is the smallest numeric data type Java offers, but yesterday I came into contact with byte streams for the first time. At the beginning of every packet a marker byte is sent which gives further instructions on how to handle the packet. Every bit of the byte has a specific meaning, so I need to untangle the byte into its 8 bits.
You could probably convert the byte to a boolean array or create a switch for every case, but that certainly can't be the best practice.
How is this possible in Java? Why are there no bit data types in Java?
Because there is no bit data type on the physical computer. The smallest unit you can allocate on most modern computers is a byte, which is also known as an octet or 8 bits. When you display a single bit you are really just pulling that bit out of the byte with arithmetic and putting it into a new byte, which still uses 8 bits of space. If you want to put bit data inside a byte you can, but it will be stored as at least a single byte no matter what programming language you use.
You could load the byte into a BitSet. This abstraction hides the gory details of manipulating single bits.
import java.util.BitSet;
public class Bits {
public static void main(String[] args) {
byte[] b = new byte[]{10};
BitSet bitset = BitSet.valueOf(b);
System.out.println("Length of bitset = " + bitset.length());
for (int i=0; i<bitset.length(); ++i) {
System.out.println("bit " + i + ": " + bitset.get(i));
}
}
}
$ java Bits
Length of bitset = 4
bit 0: false
bit 1: true
bit 2: false
bit 3: true
You can ask for any bit, but the length tells you that all the bits past length() - 1 are set to 0 (false):
System.out.println("bit 75: " + bitset.get(75));
bit 75: false
Have a look at java.util.BitSet.
You might use it to interpret the byte read and can use the get method to check whether a specific bit is set like this:
// InputStream.read() returns an int (-1 at end of stream), so a cast is needed
byte b = (byte) stream.read();
final BitSet bitSet = BitSet.valueOf(new byte[]{b});
if (bitSet.get(2)) {
state.activateComponentA();
} else {
state.deactivateComponentA();
}
state.setFeatureBTo(bitSet.get(1));
On the other hand, you can create your own bitmask easily and convert it to a byte array (or just byte) afterwards:
final BitSet output = new BitSet(8);
output.set(3, state.isComponentXActivated());
if (state.isY) {
    output.set(4);
}
// toByteArray() returns an empty array when no bit is set
final byte[] packed = output.toByteArray();
final byte w = packed.length == 0 ? 0 : packed[0];
How is this possible in Java? Why are there no bit data types in Java?
There are no bit data types in most languages. And most CPU instruction sets have few (if any) instructions dedicated to addressing single bits. You can think of the lack of these as a trade-off between (language or CPU) complexity and need.
Manipulating a single bit can be thought of as a special case of manipulating multiple bits; and languages as well as CPUs are equipped for the latter.
Very common operations like testing, setting, clearing, inverting as well as exclusive or are all supported on the integer primitive types (byte, short/char, int, long), operating on all bits of the type at once. By choosing the parameters appropriately you can select which bits to operate on.
If you think about it, a byte array is a bit array where the bits are grouped in packages of 8. Addressing a single bit in the array is relatively simple using logical operators (AND &, OR |, XOR ^ and NOT ~).
For example, testing if bit N is set in a byte can be done using a logical AND with a mask where only the bit to be tested is set:
public boolean testBit(byte b, int n) {
int mask = 1 << n; // equivalent of 2 to the nth power
return (b & mask) != 0;
}
Extending this to a byte array is no magic either, each byte consists of 8 bits, so the byte index is simply the bit number divided by 8, and the bit number inside that byte is the remainder (modulo 8):
public boolean testBit(byte[] array, int n) {
int index = n >>> 3; // divide by 8
int mask = 1 << (n & 7); // n modulo 8
return (array[index] & mask) != 0;
}
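For the marker-byte case from the question, the same masking idea can be wrapped in named constants. The following is a minimal sketch; the flag names and bit positions are invented for illustration and do not come from any real protocol.

```java
/** Hypothetical marker-byte layout; the flag names and bit positions are made up. */
final class MarkerByte {
    static final int FLAG_COMPRESSED = 1 << 0; // bit 0
    static final int FLAG_ENCRYPTED  = 1 << 1; // bit 1
    static final int FLAG_LAST_PART  = 1 << 7; // bit 7

    /** True if every bit of the mask is set in the marker. */
    static boolean has(byte marker, int mask) {
        // & 0xFF drops the sign extension that happens when a byte becomes an int.
        return ((marker & 0xFF) & mask) == mask;
    }

    /** Returns the marker with the masked bits switched on or off. */
    static byte with(byte marker, int mask, boolean on) {
        int bits = marker & 0xFF;
        return (byte) (on ? (bits | mask) : (bits & ~mask));
    }
}
```

A call such as MarkerByte.has(marker, MarkerByte.FLAG_ENCRYPTED) then reads like the protocol description, without any boolean array or switch.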
Here is a sample; I hope it is useful for you!
DatagramSocket socket = new DatagramSocket(6160, InetAddress.getByName("0.0.0.0"));
socket.setBroadcast(true);
while (true) {
byte[] recvBuf = new byte[26];
DatagramPacket packet = new DatagramPacket(recvBuf, recvBuf.length);
socket.receive(packet);
String bitArray = toBitArray(recvBuf);
System.out.println(Integer.parseInt(bitArray.substring(0, 8), 2)); // convert first byte binary to decimal
System.out.println(Integer.parseInt(bitArray.substring(8, 16), 2)); // convert second byte binary to decimal
}
public static String toBitArray(byte[] byteArray) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < byteArray.length; i++) {
sb.append(String.format("%8s", Integer.toBinaryString(byteArray[i] & 0xFF)).replace(' ', '0'));
}
return sb.toString();
}
EDIT:
I think my purpose was not understood, hence the negative votes and comments. I am NOT interested in knowing what bit of a floating point (double) means what, and if they match the same position in a long; this is totally irrelevant to me. My problem is the following: I want to use a single primitive array to store all my primitive values. If I choose a double[] as "storage", I need to be able to store longs in it too (I could also do it the other way around, but the problem would not go away, just be reversed). Since both are the same size, it should work, somehow. Using Double.doubleToRawLongBits(double) and Double.longBitsToDouble(long) allows me to do that. But what I wanted to know was: "Can I just cast a long to a double and back, and always get the same long back?" If THAT is true, then it solves my problem, and I don't care if the bits get moved around internally. So I wanted to test if I can safely do this. The output says all 64 bits could be accessed and modified individually, but possibly this is not sufficient to prove that no long bit gets lost/modified.
I just discovered, with a small test, that I can correctly "address" every bit in a double, just by casting to long and back. Here the test program (which succeeds, at least on Java 7 / Windows 7 64bit):
import static org.junit.Assert.assertTrue;
import org.junit.Test;
public class TestDoubleBits {
private static double set(final double d, final int bit, final boolean value) {
if (value) {
return ((long) d) | (1L << bit);
} else {
return ((long) d) & ~(1L << bit);
}
}
private static boolean get(final double d, final int bit) {
return (((long) d) & (1L << bit)) != 0;
}
@Test
public void testDoubleBits() {
final double value = Math.random();
for (int bit = 0; bit < 64; bit++) {
assertTrue((get(set(value, bit, false), bit) == false));
assertTrue((get(set(value, bit, true), bit) == true));
}
}
}
Assuming my test program correctly "proves" that every bit of a double can be accessed, just by casting to long and back, why do we have the following native methods:
Double.doubleToRawLongBits(double)
Double.longBitsToDouble(long)
Since native methods are usually slower (the content of the method might be faster, but the overhead of the native call makes it slower), is there any benefit to using those methods?
The bit pattern of a floating point number will NEVER (with ONE exception) remotely resemble the bit pattern of the corresponding integer value, if one exists.
I suggest you run the following program
public class Test
{
public static void main(String[] args) {
double d = 1.3;
long d1 = (long) d;
long d2 = (Double.doubleToLongBits(d));
System.out.printf("cast %016X bits %016X\n", d1, d2);
}
}
Then read What Every Computer Scientist Should Know About Floating-Point Arithmetic
(The exception is, of course, the value zero)
If you want to investigate further, there's a neat interactive floating point converter at CUNY that displays everything you'd ever want to know about any given number's float representations.
This is the test I should have used. It fails at bit 53, which (I assume) means that only the lowest 53 bits (bits 0 through 52) of a long can be stored in a double "safely" without using those native methods (this also precludes using any negative values).
public class TestDoubleBits {
public static void main(final String[] args) {
int failsAt = -1;
long value = 1;
for (int bit = 1; bit < 64; bit++) {
value = value | (1L << bit);
final double d = value;
final long l2 = (long) d;
if (value != l2) {
failsAt = bit;
break;
}
}
System.out.println("failsAt: " + failsAt);
value = value & ~(1L << failsAt);
System.out.println("Max value decimal: " + value);
System.out.println("Max value hex: " + Long.toHexString(value));
System.out.println("Max value binary: " + Long.toBinaryString(value));
}
}
The problem with my original test was that I tested the bits individually. By always setting the first bit to 1, I can find out when I start losing data, because the least significant bit is the "first to go".
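To make the difference between the two approaches concrete, here is a small sketch that contrasts a value cast with a raw-bits round trip. The class and method names are hypothetical. One caveat stated as an assumption: the Javadoc of Double.longBitsToDouble warns that some NaN bit patterns may not survive on every platform, so a long whose bits happen to form a NaN is not guaranteed to round-trip everywhere.

```java
/** Hypothetical helper for keeping long values in double-typed storage. */
final class LongInDoubleSlot {
    /** Reinterprets the 64 bits of a long as a double, without numeric conversion. */
    static double store(long bits) {
        return Double.longBitsToDouble(bits);
    }

    /** Recovers the 64 bits previously stored with store(). */
    static long load(double slot) {
        return Double.doubleToRawLongBits(slot);
    }

    /** Numeric conversion: rounds to the nearest representable double and back. */
    static long castRoundTrip(long value) {
        return (long) (double) value;
    }
}
```

With value = (1L << 53) + 1, load(store(value)) gives the original long back, while castRoundTrip(value) yields 1L << 53, because a double only has a 53-bit significand.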
In C++, why does a bool require one byte to store true or false where just one bit is enough for that, like 0 for false and 1 for true? (Why does Java also require one byte?)
Secondly, how much safer is it to use the following?
struct Bool {
bool trueOrFalse : 1;
};
Thirdly, even if it is safe, is the above field technique really going to help? Since I have heard that we save space there, but still compiler generated code to access them is bigger and slower than the code generated to access the primitives.
Why does a bool require one byte to store true or false where just one bit is enough
Because every object in C++ must be individually addressable* (that is, you must be able to have a pointer to it). You cannot address an individual bit (at least not on conventional hardware).
How much safer is it to use the following?
It's "safe", but it doesn't achieve much.
is the above field technique really going to help?
No, for the same reasons as above ;)
but still compiler generated code to access them is bigger and slower than the code generated to access the primitives.
Yes, this is true. On most platforms, this requires accessing the containing byte (or int or whatever), and then performing bit-shifts and bit-mask operations to access the relevant bit.
If you're really concerned about memory usage, you can use a std::bitset in C++ or a BitSet in Java, which pack bits.
* With a few exceptions.
Using a single bit is much slower and much more complicated to allocate. In C/C++ there is no way to get the address of one bit so you wouldn't be able to do &trueOrFalse as a bit.
Java has BitSet and EnumSet, which both use bitmaps. If you have a very small number of flags it may not make much difference, e.g. objects have to be at least byte aligned and in HotSpot are 8-byte aligned (in C++ a new object can be 8- to 16-byte aligned). This means saving a few bits might not save any space.
In Java at least, Bits are not faster unless they fit in cache better.
public static void main(String... ignored) {
BitSet bits = new BitSet(4000);
byte[] bytes = new byte[4000];
short[] shorts = new short[4000];
int[] ints = new int[4000];
for (int i = 0; i < 100; i++) {
long bitTime = timeFlip(bits) + timeFlip(bits);
long bytesTime = timeFlip(bytes) + timeFlip(bytes);
long shortsTime = timeFlip(shorts) + timeFlip(shorts);
long intsTime = timeFlip(ints) + timeFlip(ints);
System.out.printf("Flip time bits %.1f ns, bytes %.1f, shorts %.1f, ints %.1f%n",
bitTime / 2.0 / bits.size(), bytesTime / 2.0 / bytes.length,
shortsTime / 2.0 / shorts.length, intsTime / 2.0 / ints.length);
}
}
private static long timeFlip(BitSet bits) {
long start = System.nanoTime();
for (int i = 0, len = bits.size(); i < len; i++)
bits.flip(i);
return System.nanoTime() - start;
}
private static long timeFlip(short[] shorts) {
long start = System.nanoTime();
for (int i = 0, len = shorts.length; i < len; i++)
shorts[i] ^= 1;
return System.nanoTime() - start;
}
private static long timeFlip(byte[] bytes) {
long start = System.nanoTime();
for (int i = 0, len = bytes.length; i < len; i++)
bytes[i] ^= 1;
return System.nanoTime() - start;
}
private static long timeFlip(int[] ints) {
long start = System.nanoTime();
for (int i = 0, len = ints.length; i < len; i++)
ints[i] ^= 1;
return System.nanoTime() - start;
}
prints
Flip time bits 5.0 ns, bytes 0.6, shorts 0.6, ints 0.6
for sizes of 40000 and 400K
Flip time bits 6.2 ns, bytes 0.7, shorts 0.8, ints 1.1
for 4M
Flip time bits 4.1 ns, bytes 0.5, shorts 1.0, ints 2.3
and 40M
Flip time bits 6.2 ns, bytes 0.7, shorts 1.1, ints 2.4
If you want to store only one bit of information, there is nothing more compact than a char, which is the smallest addressable memory unit in C/C++. (Depending on the implementation, a bool might have the same size as a char but it is allowed to be bigger.)
A char is guaranteed by the C standard to hold at least 8 bits, however, it can also consist of more. The exact number is available via the CHAR_BIT macro defined in limits.h (in C) or climits (C++). Today, it is most common that CHAR_BIT == 8 but you cannot rely on it (see here). It is guaranteed to be 8, however, on POSIX compliant systems and on Windows.
Though it is not possible to reduce the memory footprint for a single flag, it is of course possible to combine multiple flags. Besides doing all bit operations manually, there are some alternatives:
If you know the number of bits at compile time
bitfields (as in your question). But beware, the ordering of fields is not guaranteed, which may result in portability issues.
std::bitset
If you know the size only at runtime
boost::dynamic_bitset
If you have to deal with large bitvectors, take a look at the BitMagic library. It supports compression and is heavily tuned.
As others have pointed out already, saving a few bits is not always a good idea. Possible drawbacks are:
Less readable code
Reduced execution speed because of the extra extraction code.
For the same reason, increases in code size, which may outweigh the savings in data consumption.
Hidden synchronization issues in multithreaded programs. For example, flipping two different bits by two different threads may result in a race condition. In contrast, it is always safe for two threads to modify two different objects of primitive types (e.g., char).
Typically, it makes sense when you are dealing with huge data because then you will benefit from less pressure on memory and cache.
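The synchronization drawback applies to Java in the same way: once several flags share one int, a plain read-modify-write such as flags ^= mask can lose an update made by another thread. One possible way around it, sketched here with a hypothetical AtomicFlags class, is to keep the packed word in an AtomicInteger.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical flag word shared between threads; each flag is one bit of an int. */
final class AtomicFlags {
    private final AtomicInteger bits = new AtomicInteger();

    /** Atomically inverts one bit; concurrent flips of other bits are not lost. */
    void flip(int bit) {
        bits.getAndUpdate(current -> current ^ (1 << bit));
    }

    /** Returns whether the given bit is currently set. */
    boolean get(int bit) {
        return (bits.get() & (1 << bit)) != 0;
    }

    /** Returns all 32 flags as one int. */
    int value() {
        return bits.get();
    }
}
```

The atomic update retries internally when another thread changed the word in the meantime, which is part of the extra cost that packed flags carry compared with separate variables.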
Why don't you just store the state in a byte? I haven't actually tested the below, but it should give you an idea. You can even use a short or an int for 16 or 32 states. I believe I have a working Java example as well. I'll post it when I find it.
__int8 state = 0x0;
bool getState(int bit)
{
return (state & (1 << bit)) != 0x0;
}
void setAllOnline(bool online)
{
state = -online;
}
void reverseState(int bit)
{
state ^= (1 << bit);
}
Alright, here's the Java version. I've stored it in an int, since, if I remember correctly, even a byte would take up 4 bytes anyway. And this obviously isn't meant to be used as an array.
public class State
{
private int STATE;
public State() {
STATE = 0x0;
}
public State(int previous) {
STATE = previous;
}
/*
* #Usage - Used along side the #setMultiple(int, boolean);
* #Returns the value of a single bit.
*/
public static int valueOf(int bit)
{
return 1 << bit;
}
/*
* #Usage - Used along side the #setMultiple(int, boolean);
* #Returns the value of an array of bits.
*/
public static int valueOf(int... bits)
{
int value = 0x0;
for (int bit : bits)
value |= (1 << bit);
return value;
}
/*
* #Returns the value currently stored or the values of all 32 bits.
*/
public int getValue()
{
return STATE;
}
/*
* #Usage - Turns all bits online or offline.
* #Return - <TRUE> if all states are online. Otherwise <FALSE>.
*/
public boolean setAll(boolean online)
{
STATE = online ? -1 : 0;
return online;
}
/*
* #Usage - sets multiple bits at once to a specific state.
* #Warning - DO NOT SET BITS TO THIS! Use setMultiple(State.valueOf(#), boolean);
* #Return - <TRUE> if states were set to online. Otherwise <FALSE>.
*/
public boolean setMultiple(int value, boolean online)
{
STATE |= value;
if (!online)
STATE ^= value;
return online;
}
/*
* #Usage - sets a single bit to a specific state.
* #Return - <TRUE> if this bit was set to online. Otherwise <FALSE>.
*/
public boolean set(int bit, boolean online)
{
STATE |= (1 << bit);
if(!online)
STATE ^= (1 << bit);
return online;
}
/*
* #return = the new current state of this bit.
* #Usage = Good for situations that are reversed.
*/
public boolean reverse(int bit)
{
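// Flip the bit, then test only that bit (comparing the whole state would fail if other bits are set).
return ((STATE ^= (1 << bit)) & (1 << bit)) != 0;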
}
/*
* #return = <TRUE> if this bit is online. Otherwise <FALSE>.
*/
public boolean online(int bit)
{
int value = 1 << bit;
return (STATE & value) == value;
}
/*
* #return = a String contains full debug information.
*/
@Override
public String toString()
{
StringBuilder sb = new StringBuilder();
sb.append("TOTAL VALUE: ");
sb.append(STATE);
for (int i = 0; i < 0x20; i++)
{
sb.append("\nState(");
sb.append(i);
sb.append("): ");
sb.append(online(i));
sb.append(", ValueOf: ");
sb.append(State.valueOf(i));
}
return sb.toString();
}
}
Also, I should point out that you really shouldn't use a special class for this, but rather just store the variable within the class that will most likely be using it. If you plan to have hundreds or even thousands of Boolean values, consider an array of bytes.
E.g. the below example.
boolean[] states = new boolean[4096];
can be converted into the below.
int[] states = new int[128];
Now you're probably wondering how you'll access index 4095 in an array of 128 elements. Put simply, the index 4095 is shifted 5 bits to the right, which is the same as dividing by 32 and rounding down: 4095 / 32 = 127, so we are at element 127 of the array. Then we perform 4095 & 31, which masks the index to a value between 0 and 31 (here 31), the bit position inside that element. This masking trick only works with powers of two minus 1, e.g. 0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, etc.
So now we can access the bit at that position. As you can see this is very compact and beats having 4096 booleans in a file :) It will also provide much faster reads and writes to a binary file. I have no idea what this BitSet stuff is, but it looks like complete garbage to me: since byte, short, int and long are already in their bit forms, you might as well use them as they are instead of creating some complex class to access the individual bits in memory, which is what I could grasp from reading a few posts.
boolean getState(int index)
{
return (states[index >> 5] & 1 << (index & 0x1F)) != 0x0;
}
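The getter above only reads a bit. A matching setter uses the same index arithmetic; the following sketch puts both into a hypothetical PackedBits class (the name and the constructor are assumptions made for this example).

```java
/** Hypothetical bit array backed by an int[], 32 flags per element. */
final class PackedBits {
    private final int[] words;

    PackedBits(int size) {
        words = new int[(size + 31) >>> 5]; // round up to whole ints
    }

    /** Reads the flag at the given index. */
    boolean get(int index) {
        return (words[index >> 5] & (1 << (index & 0x1F))) != 0;
    }

    /** Switches the flag at the given index on or off. */
    void set(int index, boolean on) {
        int mask = 1 << (index & 0x1F); // bit position inside the int
        if (on) {
            words[index >> 5] |= mask;
        } else {
            words[index >> 5] &= ~mask;
        }
    }

    /** Number of ints used for storage. */
    int wordCount() {
        return words.length;
    }
}
```

For 4096 flags this allocates 128 ints, matching the numbers in the explanation above.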
Further information...
Basically if the above was a bit confusing here's a simplified version of what's happening.
The types "byte", "short", "int", "long" all are data types which have different ranges.
You can view this link: http://msdn.microsoft.com/en-us/library/s3f49ktz(v=vs.80).aspx
To see the data ranges of each.
So a byte is equal to 8 bits. So an int which is 4 bytes will be 32 bits.
Now there isn't any simple operator for raising some value to the Nth power. However, thanks to bit shifting we can simulate it for powers of two. Performing 1 << N equates to 1 * 2^N, so if we did 2 << N we'd be doing 2 * 2^N. So to compute powers of two, always do "1 << N".
Now we know that an int has 32 bits, so we can use each bit individually and simply index them.
To keep things simple, think of the "&" operator as a way to check if a value contains the bits of another value. Let's say we had the value 31. To get to 31 we must add bits 0 through 4, which are 1, 2, 4, 8 and 16; these all add up to 31. When we perform 31 & 16 this returns 16, because bit 4 (2^4 = 16) is set in this value. Now let's say we performed 31 & 20, which checks whether bits 2 and 4 are set in this value. This returns 20, since both bits 2 and 4 are set: 2^2 + 2^4 = 4 + 16 = 20. Now let's say we did 31 & 48. This checks for bits 4 and 5. We don't have bit 5 in 31, so this only returns 16; it does not return 0. So when checking for multiple bits at once you must check that the result equals the mask itself, instead of checking whether it is non-zero.
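The distinction between "any of these bits" and "all of these bits" can be written down as two small helpers. This is a sketch in Java; the class and method names are made up for illustration.

```java
/** Hypothetical helpers contrasting the two ways of checking a mask. */
final class MaskChecks {
    /** True if at least one bit of the mask is set in value. */
    static boolean hasAny(int value, int mask) {
        return (value & mask) != 0;
    }

    /** True only if every bit of the mask is set in value. */
    static boolean hasAll(int value, int mask) {
        return (value & mask) == mask;
    }
}
```

With the numbers from the text, hasAll(31, 20) is true, while for the mask 48 hasAny(31, 48) is true but hasAll(31, 48) is false, because 31 & 48 is 16 rather than 48.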
The below will verify if an individual bit is at 0 or 1. 0 being false, and 1 being true.
bool getState(int bit)
{
return (state & (1 << bit)) != 0x0;
}
The below is example of checking two values if they contain those bits. Think of it like each bit is represented as 2^BIT so when we do
I'll quickly go over some of the operators. We've just recently explained the "&" operator slightly. Now for the "|" operator.
When performing the following
int value = 31;
value |= 16;
value |= 16;
value |= 16;
value |= 16;
The value will still be 31. This is because bit 4 or 2^4=16 is already turned on or set to 1. So performing "|" returns that value with that bit turned on. If it's already turned on no changes are made. We utilize "|=" to actually set the variable to that returned value.
Instead of doing -> "value = value | 16;". We just do "value |= 16;".
Now let's look a bit further into how the "&" and "|" can be utilized.
/*
* This contains bits 0,1,2,3,4,8,9 turned on.
*/
const int CHECK = 1 | 2 | 4 | 8 | 16 | 256 | 512;
/*
* This is some value where we add bits 0 through 9, but we skip 0 and 8.
*/
int value = 2 | 4 | 8 | 16 | 32 | 64 | 128 | 512;
So when we perform the below code.
int return_code = value & CHECK;
The return code will be 2 + 4 + 8 + 16 + 512 = 542
So we were checking for 799, but we received 542. This is because bits 0 and 8 are offline; together they equal 256 + 1 = 257, and 799 - 257 = 542.
The above is a great way to check, let's say in a video game, whether any of a given set of buttons is pressed. We could test all of those bits with one check, which is far more efficient than performing a Boolean check on every single state.
Now let's say we have Boolean value which is always reversed.
Normally you'd do something like
bool state = false;
state = !state;
Well this can be done with bits as well utilizing the "^" operator.
Just as we performed "1 << N" to choose the whole value of that bit. We can do the same with the reverse. So just like we showed how "|=" stores the return we will do the same with "^=". So what this does is if that bit is on we turn it off. If it's off we turn it on.
void reverseState(int bit)
{
state ^= (1 << bit);
}
You can even have it return the current state. If you wanted it to return the previous state just swap "!=" to "==". So what this does is performs the reversal then checks the current state.
bool reverseAndGet(int bit)
{
return ((state ^= (1 << bit)) & (1 << bit)) != 0x0;
}
Storing multiple values that are wider than a single bit (i.e. not just bools) in an int can also be done. Let's say we normally write out our coordinate position like the below.
int posX = 0;
int posY = 0;
int posZ = 0;
Now let's say these never went past 1023, so 0 through 1023 was the maximum range of all of these. I chose 1023 on purpose: as previously mentioned, you can use the "&" operator to force a value into the range 0 to 2^N - 1. So if your range is 0 through 1023, we can perform "value & 1023" and the result will always be between 0 and 1023 without any range checks. Keep in mind that, as previously mentioned, this only works with powers of two minus one: 2^10 - 1 = 1024 - 1 = 1023.
E.g. no more if (value >= 0 && value <= 1023).
So 2^10 = 1024, which requires 10 bits in order to hold a number between 0 and 1023.
So 10 x 3 = 30 bits, which is less than 32, so an int is sufficient for holding all these values.
So we can perform the following. The shift amounts are 0, 10 and 20. I wrote the shift by 0 to show visually that 2^0 = 1, so # * 1 = #. We need y << 10 because x uses up 10 bits (0 through 1023), so we need to multiply y by 1024 to keep the values from overlapping. Then z needs to be multiplied by 2^20, which is 1,048,576.
int position = (x << 0) | (y << 10) | (z << 20);
This makes comparisons fast.
We can now do
return this.position == position;
as opposed to
return this.x == x && this.y == y && this.z == z;
Now what if we wanted the actual positions of each?
For the x we simply do the following.
int getX()
{
return position & 1023;
}
Then for the y we need to perform a right bit shift and then AND it.
int getY()
{
return (position >> 10) & 1023;
}
As you may guess the Z is the same as the Y, but instead of 10 we use 20.
int getZ()
{
return (position >> 20) & 1023;
}
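Putting the packing and the three getters together gives something like the following Java sketch. The class name PackedPosition and the static-method style are assumptions for this example; the layout (10 bits each for x, y and z) is the one described above.

```java
/** Hypothetical helper: packs three 10-bit coordinates (0..1023) into one int. */
final class PackedPosition {
    private static final int BITS = 10;
    private static final int MASK = (1 << BITS) - 1; // 1023

    static int pack(int x, int y, int z) {
        // Masking each component keeps an out-of-range value from spilling into its neighbour.
        return (x & MASK) | ((y & MASK) << BITS) | ((z & MASK) << (2 * BITS));
    }

    static int x(int position) {
        return position & MASK;
    }

    static int y(int position) {
        return (position >> BITS) & MASK;
    }

    static int z(int position) {
        return (position >> (2 * BITS)) & MASK;
    }
}
```

Two positions are then equal exactly when their packed ints are equal, which is what makes the single comparison shown earlier possible.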
I hope whoever views this will find it worth while information :).
If you really want to use 1 bit, you can use a char to store 8 booleans, and bitshift to get the value of the one you want. I doubt it will be faster, and it's probably going to give you a lot of headaches working that way, but technically it's possible.
On a side note, an attempt like this could prove useful for systems that don't have a lot of memory available for variables but do have more processing power than you need. I highly doubt you will ever need it though.
In Java, I would like to store (>10'000) arrays of boolean values (boolean[]) with length 32 to the disk and read them again later on for further computation and comparison.
Since a single array will have a length of 32, I wonder whether it makes sense to store it as an integer value to speed up the reading and writing (on a 32 bit machine). Would you suggest using BitSet and then convert to int? Or even forget about int and use bytes?
For binary storage, use int and a DataOutputStream (DataInputStream for reading).
I think boolean arrays are stored as byte or int arrays internally in Java, so you may want to consider avoiding the overhead and keeping the int encoding all the time, i.e. not use boolean[] at all.
Instead, have something like
public class BooleanArray32 {
private int values;
public boolean get(int pos) {
return (values & (1 << pos)) != 0;
}
public void set(int pos, boolean value) {
int mask = 1 << pos;
values = (values & ~mask) | (value ? mask : 0);
}
public void write(DataOutputStream dos) throws IOException {
dos.writeInt(values);
}
public void read(DataInputStream dis) throws IOException {
values = dis.readInt();
}
public int compare(BooleanArray32 b2) {
return countBits(b2.values & values);
}
// From http://graphics.stanford.edu/~seander/bithacks.html
// Disclaimer: I did not fully double check whether this works for Java's signed ints
public static int countBits(int v) {
v = v - ((v >>> 1) & 0x55555555); // reuse input as temporary
v = (v & 0x33333333) + ((v >>> 2) & 0x33333333); // temp
return ((v + (v >>> 4) & 0xF0F0F0F) * 0x1010101) >>> 24;
}
}
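If the data already exists as boolean[] arrays of length 32, the conversion to int and the stream round trip might look like the sketch below. The class BooleanPacking and its methods are hypothetical, and an in-memory byte array stands in for the file so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper: converts boolean[32] to int and stores ints in binary form. */
final class BooleanPacking {
    /** Packs 32 flags into one int; flags[i] becomes bit i. */
    static int pack(boolean[] flags) {
        if (flags.length != 32) {
            throw new IllegalArgumentException("expected 32 flags");
        }
        int packed = 0;
        for (int i = 0; i < 32; i++) {
            if (flags[i]) {
                packed |= 1 << i;
            }
        }
        return packed;
    }

    /** Expands an int back into 32 flags. */
    static boolean[] unpack(int packed) {
        boolean[] flags = new boolean[32];
        for (int i = 0; i < 32; i++) {
            flags[i] = (packed & (1 << i)) != 0;
        }
        return flags;
    }

    /** Writes each int as 4 bytes, as DataOutputStream.writeInt does. */
    static byte[] writeAll(int[] values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads back the ints written by writeAll(). */
    static int[] readAll(byte[] data) {
        int[] values = new int[data.length / 4];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }
}
```

For real files the ByteArrayOutputStream would be replaced by a buffered FileOutputStream; each array of 32 booleans then occupies 4 bytes on disk.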
I am under the strong impression that any compression you are going to make to pack your boolean values will increase the read and write time. (my mistake, I was clearly missing my medication). You will rather gain in terms of storage involved.
BitSet is a sensible choice on your business logic side. It internally stores a long, which you could convert to an int. However, since BitSet is prude enough not to show you its privates, you need to get each bit index in sequence. This means that I guess there is no real advantage converting to an int rather than just using bytes directly.
The roll-your-own solution of Stefan Haustein (extended as necessary to mimic BitSet) is therefore preferable for your storage requirement, since you do not incur any unnecessary overhead.