I am working on a BitBuffer that will take x bits from a ByteBuffer as an int, long, etc, but I seem to be having a whole lot of problems.
I've tried loading a long at a time and using bit shifting, but the difficulty comes from rolling from one long into the next. I am wondering if there's just a better way. Anyone have any suggestions?
public class BitBuffer
{
final private ByteBuffer bb;
public BitBuffer(byte[] bytes)
{
this.bb = ByteBuffer.wrap(bytes);
}
public int takeInt(int bits)
{
int bytes = (bits + 7) / 8; // number of whole bytes needed
if (bytes > 4) throw new RuntimeException("Too many bits requested");
int i=0;
// take bits from bb and fill it into an int
return i;
}
}
More specifically, I am trying to take x bits from the buffer and return them as an int (the minimal case). I can access bytes from the buffer, but let's say I want to take just the first 4 bits instead.
Example:
If my buffer is filled with "101100001111" and I run these in order:
takeInt(4) // should return 11 (1011)
takeInt(2) // should return 0 (00)
takeInt(2) // should return 0 (00)
takeInt(1) // should return 1 (1)
takeInt(3) // should return 7 (111)
I would like to use something like this for bit packed encoded data where an integer can be stored in just a few bits of a byte.
The BitSet and ByteBuffer ideas were a bit too difficult to control, so instead I went with a binary string approach that takes a whole lot of the headache out of managing an intermediate buffer of bits.
public class BitBuffer
{
final private String bin;
private int start;
public BitBuffer(byte[] bytes)
{
this.bin = toBinaryString(bytes); // TODO: create this function (a possible sketch is below)
this.start = 0;
}
public int takeInt(int nbits)
{
// TODO: handle edge cases
String bits = bin.substring(start, start+=nbits);
return Integer.parseInt(bits, 2);
}
}
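The toBinaryString helper is the one piece left as a TODO; a minimal sketch of it (my assumption: most significant bit first within each byte, to match the example above) could look like this:

// Hypothetical helper, not part of the class above
private static String toBinaryString(byte[] bytes)
{
    StringBuilder sb = new StringBuilder(bytes.length * 8);
    for (byte b : bytes)
    {
        for (int bit = 7; bit >= 0; bit--)
        {
            sb.append((b >> bit) & 1);
        }
    }
    return sb.toString();
}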
Out of everything I've tried this was the cleanest and easiest approach, but I am open to suggestions!
You can convert the ByteBuffer into BitSet and then you'll have continuous access to the bits
public class BitBuffer
{
final private BitSet bs;
public BitBuffer(byte[] bytes)
{
this.bs = BitSet.valueOf(bytes);
}
public int takeInt(int bits)
{
int bytes = (bits + 7) / 8; // number of whole bytes needed
if (bytes > 4) throw new RuntimeException("Too many bits requested");
int i=0;
// take bits from bs and fill them into an int (one possible completion is sketched below)
return i;
}
}
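One possible way to fill in that loop (my sketch, not part of the answer above; note that BitSet.valueOf() numbers bits little-endian within each byte, so reading most-significant-bit first as in the question needs a little index arithmetic):

private int pos = 0; // bit cursor into the BitSet

public int takeInt(int nbits)
{
    if (nbits > 32) throw new RuntimeException("Too many bits requested");
    int result = 0;
    for (int i = 0; i < nbits; i++, pos++)
    {
        int byteIndex = pos / 8;
        int bitInByte = 7 - (pos % 8); // MSB-first within each byte
        boolean bit = bs.get(byteIndex * 8 + bitInByte);
        result = (result << 1) | (bit ? 1 : 0);
    }
    return result;
}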
I need to store a boolean array with 80,000 items in a file. I don't care how much time saving takes; I'm only interested in the loading time of the array.
I didn't try to store it with DataOutputStream because it requires a separate write for each value.
I tried to do this with 3 approaches:
serialize the boolean array
use a BitSet instead of a boolean array and serialize it
convert the boolean array into a byte array (1 for true, 0 for false) and write it with a FileChannel using a ByteBuffer
To test reading from files with these approaches, I ran each approach 1,000 times in a loop. The results look like this:
deserialization of the boolean array - 574 ms
deserialization of the BitSet - 379 ms
getting a byte array from the FileChannel via MappedByteBuffer - 170 ms
The first and second approaches take too long; the third is perhaps not really an approach at all.
Perhaps there is a better way to accomplish this, so I need your advice.
EDIT
With each method run once, the times were 13.8 ms, 8.71 ms and 6.46 ms respectively.
What about writing a byte for each boolean and developing a custom parser? This will probably be one of the fastest methods.
If you want to save space you could also pack 8 booleans into one byte, but this requires some bit-shifting operations (a sketch of that is at the end of this answer).
Here is a short example code:
public void save() throws IOException
{
boolean[] testData = new boolean[80000];
for(int X=0;X < testData.length; X++)
{
testData[X] = Math.random() > 0.5;
}
FileOutputStream stream = new FileOutputStream(new File("test.bin"));
for (boolean item : testData)
{
stream.write(item ? 1 : 0);
}
stream.close();
}
public boolean[] load() throws IOException
{
long start = System.nanoTime();
File file = new File("test.bin");
FileInputStream inputStream = new FileInputStream(file);
int fileLength = (int) file.length();
byte[] data = new byte[fileLength];
boolean[] output = new boolean[fileLength];
new DataInputStream(inputStream).readFully(data); // a plain read() is not guaranteed to fill the whole array
inputStream.close();
for (int X = 0; X < data.length; X++)
{
output[X] = data[X] != 0;
}
long end = System.nanoTime() - start;
System.out.println("Time: " + end + " ns");
return output;
}
It takes about 2 ms to load 80,000 booleans.
Tested with JDK 1.8.0_45
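If you do want to pack 8 booleans into each byte as suggested above, a minimal sketch could look like this (my assumption: most significant bit first within each byte; these helpers are hypothetical and not part of the code above):

static byte[] pack(boolean[] flags)
{
    byte[] packed = new byte[(flags.length + 7) / 8];
    for (int i = 0; i < flags.length; i++)
    {
        if (flags[i]) packed[i / 8] |= (0x80 >> (i % 8));
    }
    return packed;
}

static boolean[] unpack(byte[] packed, int count)
{
    boolean[] flags = new boolean[count];
    for (int i = 0; i < count; i++)
    {
        flags[i] = (packed[i / 8] & (0x80 >> (i % 8))) != 0;
    }
    return flags;
}

This shrinks the file from 80,000 bytes to 10,000 bytes, at the cost of the shifting shown.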
So I had a very similar use case where I wanted to serialise/deserialise a very large boolean array.
I implemented something like this:
First, I converted the boolean array to an integer array, simply to pack multiple boolean values together (this makes storage more efficient and there are no issues with bit padding).
This means we have to build wrapper methods that get and set the individual true/false values:
// 'container' is the backing int[] holding the packed bits; 'buckets' is the number of bits per int (32)
private boolean get (int index) {
int holderIndex = (int) Math.floor(index/buckets);
int internalIndex = index % buckets;
return 0 != (container[holderIndex] & (1 << internalIndex));
}
and
private void set (int index) {
int holderIndex = (int) Math.floor(index/buckets);
int internalIndex = index % buckets;
int value = container[holderIndex];
int newValue = value | (1 << internalIndex);
container[holderIndex] = newValue;
}
Now, to serialise and deserialise, you can convert this int array directly to a byte stream and write it to a file.
My source code, for reference.
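A minimal sketch of that serialisation step (my own sketch, not the linked source; it writes the packed int[] container with DataOutputStream, which uses big-endian byte order on both write and read):

static void writeContainer(int[] container, File file) throws IOException {
    try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
        out.writeInt(container.length); // length header so the reader can size the array
        for (int word : container) {
            out.writeInt(word);
        }
    }
}

static int[] readContainer(File file) throws IOException {
    try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
        int[] container = new int[in.readInt()];
        for (int i = 0; i < container.length; i++) {
            container[i] = in.readInt();
        }
        return container;
    }
}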
I am experiencing some difficulties with storing the structure {int, int, long} as a byte array in Java and reading it as a binary structure in C++.
I have tried nearly everything. My biggest success was when I could read the long value properly, but the integers were random numbers.
I am worried about endianness and I am not sure how to determine whether each language uses little or big endian. Can anybody please tell me how I can store primitive types such as int, long and double in Java and read them in C++?
Thank you, it would be really helpful.
EDIT:
I know how I want to read it in C++:
struct tick {
int x;
int y;
long time;
};
...
tick helpStruct;
input.open("test_file", ios_base::in | ios_base::binary);
input.read((char*) &helpStruct, sizeof(tick));
In Java, I've tried many ways; my last try was:
DataOutput stream = new DataOutputStream(new FileOutputStream(new File("test_file")));
byte[] bytes = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(1).array();
for (byte b : bytes) {
stream.write(b);
}
but the Java code is open to change.
You wrote only the very first integer. You never wrote the second one, followed by the long.
Thus any values you read would be random, of course. Just remember that sizeof(long) in C++ might not actually be 8 as it is in Java! Also don't forget that the structure in C++ might be padded, so it'd be better to read each value one at a time into the struct's fields.
This works.
On the Java side:
package test;
import java.io.*;
import java.nio.*;
public class Test {
public static void main(String[] args) throws FileNotFoundException, IOException {
DataOutputStream stream = new DataOutputStream(new FileOutputStream(new File("C:/Users/Brandon/Desktop/test_file.dat")));
int sizeofint = 4;
int sizeoflong = 4; // C++ long is 4 bytes with this compiler (MSVC on Windows)
ByteBuffer buffer = ByteBuffer.allocate(sizeofint + sizeofint + sizeoflong).order(ByteOrder.LITTLE_ENDIAN);
buffer.putInt(5).putInt(6).putInt(7); // the last putInt fills the 4-byte C++ long
byte[] bytes = buffer.array();
for (byte b : bytes) {
stream.write(b);
}
stream.close();
}
}
and on the C++ side:
#include <fstream>
#include <iostream>
struct tick
{
int x;
int y;
long time;
};
int main()
{
std::fstream file("C:/Users/Brandon/Desktop/test_file.dat", std::ios::in | std::ios::binary);
if (file.is_open())
{
tick t = {0};
file.read(reinterpret_cast<char*>(&t), sizeof(t));
file.close();
std::cout<<t.x<<" "<<t.y<<" "<<t.time<<"\n";
}
}
Results are: 5 6 7.
It might even be better to do:
file.read(reinterpret_cast<char*>(&t.x), sizeof(t.x));
file.read(reinterpret_cast<char*>(&t.y), sizeof(t.y));
file.read(reinterpret_cast<char*>(&t.time), sizeof(t.time));
As the title already says, I need to convert an int[] to a ByteBuffer in Java. Is there a recommended way to do this?
I want to pass the ByteBuffer over JNI to C++. What do I have to look out for regarding any specific endian conversions in this case?
Edit: Sorry, I mistakenly wrote ByteArray but meant the type ByteBuffer.
Edit: Sample code:
I stripped out the unnecessary parts. I call a Java function over JNI from C++ to load a resource and pass it back to C++ as a ByteBuffer. This works with various other resources. Now I have an int[] and would like to know whether there is an elegant way to convert it to a ByteBuffer, or whether I have to go the old-fashioned way and fill it in a for loop.
ByteBuffer resource= null;
resource = ByteBuffer.allocateDirect((x*y+2)*4).order(ByteOrder.nativeOrder());
.
.
ByteBuffer GetResourcePNG(String text)
{
.
.
int [] pix = new int[x*y]; // must be allocated before getPixels fills it
map.getPixels(pix,0,x,0,0,x,y);
return resource;
}
You have to use ByteBuffer.allocateDirect if you want to be able to use JNI's GetDirectBufferAddress.
Use ByteBuffer.order(ByteOrder.nativeOrder()) to adjust the ByteBuffer instance's endianness to match the current platform.
After the ByteBuffer's endianness is properly configured, use ByteBuffer.asIntBuffer() to get a view of it as a java.nio.IntBuffer and fill it with your data.
Full Example:
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
public class Test {
static final int bytes_per_datum = 4;
public static void main(String args[]) {
main2("Native Endian", ByteOrder.nativeOrder());
main2("Big Endian", ByteOrder.BIG_ENDIAN);
main2("Little Endian", ByteOrder.LITTLE_ENDIAN);
}
static void main2(String comment, ByteOrder endian) {
int[] data = { 1, 0xF, 0xFF, 0xFFF, 0xFFFF, 0xFFFFF, 0xFFFFFF, 0xFFFFFFF, 0xFFFFFFFF };
ByteBuffer bb = ByteBuffer.allocateDirect(data.length * bytes_per_datum);
bb.order(endian); // endian must be set before putting ints into the buffer
put_ints(bb, data);
System.out.println(comment + ": ");
print(bb);
}
static void put_ints(ByteBuffer bb, int[] data) {
IntBuffer b = bb.asIntBuffer(); // created IntBuffer starts only from the ByteBuffer's relative position
// if you plan to reuse this IntBuffer, be mindful of its position
b.put(data); // position of this IntBuffer changes by +data.length;
} // this IntBuffer goes out of scope
static void print(ByteBuffer bb) { // prints from start to limit
ByteBuffer bb_2 = bb.duplicate(); // shares backing content, but has its own capacity/limit/position/mark (equivalent to original buffer at initialization)
bb_2.rewind();
for (int x = 0, xx = bb_2.limit(); x < xx; ++x) {
System.out.print((bb_2.get() & 0xFF) + " "); // 0xFF for display, since java bytes are signed
if ((x + 1) % bytes_per_datum == 0) {
System.out.print(System.lineSeparator());
}
}
}
}
You could convert it to a matrix of byte arrays in this way:
public static final byte[] intToByteArray(int value) {
return new byte[] {
(byte)(value >>> 24),
(byte)(value >>> 16),
(byte)(value >>> 8),
(byte)value};
}
int[] arrayOfInt = {1,2,3,4,5,6};
byte[][] matrix = new byte[arrayOfInt.length][]; // each row is the 4-byte encoding of one int
for(int i=0;i<arrayOfInt.length;i++)
matrix[i] = intToByteArray(arrayOfInt[i]);
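To end up with the ByteBuffer the question asks for, you could then copy those rows into one buffer (a sketch; note that intToByteArray above produces big-endian bytes):

ByteBuffer bb = ByteBuffer.allocate(arrayOfInt.length * 4);
for (byte[] row : matrix) {
    bb.put(row);
}
bb.flip(); // prepare the buffer for reading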
Any reason for not passing the int[] array directly to the C++ or C code using JNI as per the example mentioned here?
I have a BitSet and want to write it to a file. I came across a solution that uses an ObjectOutputStream with the writeObject method.
I looked at ObjectOutputStream in the Java API and saw that you can write other things (byte, int, short, etc.).
To check out the class, I tried to write a byte to a file using the following code, but the result is a file with 7 bytes instead of 1 byte.
My question is: what are the first 6 bytes in the file, and why are they there?
This is relevant to a BitSet because I don't want to start writing lots of data to a file and then realize I have random bytes inserted in the file without knowing what they are.
Here is the code:
byte[] bt = new byte[]{'A'};
File outFile = new File("testOut.txt");
FileOutputStream fos = new FileOutputStream(outFile);
ObjectOutputStream oos = new ObjectOutputStream(fos);
oos.write(bt);
oos.close();
Thanks for any help,
Avner
The other bytes are serialization metadata: a stream header plus markers describing what follows.
Basically, ObjectOutputStream is a class used to write Serializable objects to some destination (usually a file). It makes more sense if you think about ObjectInputStream. It has a readObject() method on it. How does Java know what object to instantiate? Easy: there is type information in there.
You could be writing any objects out to an ObjectOutputStream, so the stream holds information about the types written as well as the data needed to reconstitute the object.
If you know that the stream will always contain a BitSet, don't use an ObjectOutputStream. And if space is at a premium, convert the BitSet to an array of bytes where each bit corresponds to a bit in the BitSet, then write that directly to the underlying stream (e.g. a FileOutputStream as in your example).
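On Java 7 and later, BitSet already has toByteArray() and valueOf() for exactly this conversion; a minimal sketch (the file name is just an example, and the calling method would need to handle IOException):

BitSet bits = new BitSet();
bits.set(3);
bits.set(64);
try (FileOutputStream fos = new FileOutputStream("bitset.bin")) {
    fos.write(bits.toByteArray()); // raw bits only, no serialization header
}
byte[] raw = Files.readAllBytes(Paths.get("bitset.bin"));
BitSet restored = BitSet.valueOf(raw); // restored.equals(bits) is true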
The serialisation format, like many others, includes a header with magic number and version information. When you use the DataOutput/OutputStream methods on ObjectOutputStream, the values are placed in the middle of the serialised data (with no type information). This is typically only done in writeObject implementations after a call to defaultWriteObject or use of putFields.
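A quick way to see those extra bytes for yourself (a small sketch; the commented output is what the standard stream format typically produces):

ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(baos);
oos.write(new byte[]{'A'});
oos.close();
for (byte b : baos.toByteArray()) {
    System.out.printf("%02X ", b);
}
// typically prints: AC ED 00 05 77 01 41
// AC ED = stream magic, 00 05 = version, 77 = block-data marker, 01 = length, 41 = 'A'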
If you only use the saved BitSet in Java, serialization works fine. However, it's kind of annoying if you want to share the bitset across multiple platforms. Besides the overhead of Java serialization, the BitSet is stored in units of 8 bytes, which can generate too much overhead if your bitset is small.
We wrote this small class so we can extract byte arrays from a BitSet. Depending on your use case, it might work better for you than Java serialization.
public class ExportableBitSet extends BitSet {
private static final long serialVersionUID = 1L;
public ExportableBitSet() {
super();
}
public ExportableBitSet(int nbits) {
super(nbits);
}
public ExportableBitSet(byte[] bytes) {
this(bytes == null? 0 : bytes.length*8);
for (int i = 0; i < size(); i++) {
if (isBitOn(i, bytes))
set(i);
}
}
public byte[] toByteArray() {
if (size() == 0)
return new byte[0];
// Find highest bit
int hiBit = -1;
for (int i = 0; i < size(); i++) {
if (get(i))
hiBit = i;
}
int n = (hiBit + 8) / 8;
byte[] bytes = new byte[n];
if (n == 0)
return bytes;
Arrays.fill(bytes, (byte)0);
for (int i=0; i<n*8; i++) {
if (get(i))
setBit(i, bytes);
}
return bytes;
}
protected static int BIT_MASK[] =
{0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01};
protected static boolean isBitOn(int bit, byte[] bytes) {
int size = bytes == null ? 0 : bytes.length*8;
if (bit >= size)
return false;
return (bytes[bit/8] & BIT_MASK[bit%8]) != 0;
}
protected static void setBit(int bit, byte[] bytes) {
int size = bytes == null ? 0 : bytes.length*8;
if (bit >= size)
throw new ArrayIndexOutOfBoundsException("Byte array too small");
bytes[bit/8] |= BIT_MASK[bit%8];
}
}
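Possible usage, as a quick sketch (assuming the class above):

ExportableBitSet bits = new ExportableBitSet(16);
bits.set(0);
bits.set(9);
byte[] raw = bits.toByteArray(); // compact, platform-neutral bytes (2 bytes here)
ExportableBitSet restored = new ExportableBitSet(raw);
System.out.println(restored.get(0) + " " + restored.get(9)); // true true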
What's the most efficient way to put as many bytes as possible from a ByteBuffer bbuf_src into another ByteBuffer bbuf_dest (as well as know how many bytes were transferred)? I'm trying bbuf_dest.put(bbuf_src) but it seems to want to throw a BufferOverflowException and I can't get the javadocs from Sun right now (network problems) when I need them. >:( argh.
Edit: darnit, #Richard's approach (using put() from the backing array of bbuf_src) won't work if bbuf_src is a read-only buffer, as you can't get access to that array. What can I do in that case?
As you've discovered, getting the backing array doesn't always work (it fails for read only buffers, direct buffers, and memory mapped file buffers). The better alternative is to duplicate your source buffer and set a new limit for the amount of data you want to transfer:
int maxTransfer = Math.min(bbuf_dest.remaining(), bbuf_src.remaining());
// use a duplicated buffer so we don't disrupt the limit of the original buffer
ByteBuffer bbuf_tmp = bbuf_src.duplicate ();
bbuf_tmp.limit (bbuf_tmp.position() + maxTransfer);
bbuf_dest.put (bbuf_tmp);
// now discard the data we've copied from the original source (optional)
bbuf_src.position(bbuf_src.position() + maxTransfer);
OK, I've adapted #Richard's answer:
public static int transferAsMuchAsPossible(
ByteBuffer bbuf_dest, ByteBuffer bbuf_src)
{
int nTransfer = Math.min(bbuf_dest.remaining(), bbuf_src.remaining());
if (nTransfer > 0)
{
bbuf_dest.put(bbuf_src.array(),
bbuf_src.arrayOffset()+bbuf_src.position(),
nTransfer);
bbuf_src.position(bbuf_src.position()+nTransfer);
}
return nTransfer;
}
and a test to make sure it works:
public static boolean transferTest()
{
ByteBuffer bb1 = ByteBuffer.allocate(256);
ByteBuffer bb2 = ByteBuffer.allocate(50);
for (int i = 0; i < 100; ++i)
{
bb1.put((byte)i);
}
bb1.flip();
bb1.position(5);
ByteBuffer bb1a = bb1.slice();
bb1a.position(2);
// bb1a is a slice of bb1 covering values 5..99; its position is now 2, so the next value is 7
bb2.put((byte)77);
// something to see this works when bb2 isn't empty
int n = transferAsMuchAsPossible(bb2, bb1a);
boolean itWorked = (n == 49);
if (bb1a.position() != 51)
itWorked = false;
if (bb2.position() != 50)
itWorked = false;
bb2.rewind();
if (bb2.get() != 77)
itWorked = false;
for (int i = 0; i < 49; ++i)
{
if (bb2.get() != i+7)
{
itWorked = false;
break;
}
}
return itWorked;
}
You get the BufferOverflowException because your bbuf_dest is not big enough.
You will need to use bbuf_dest.remaining() to find out the maximum number of bytes you can transfer from bbuf_src:
int maxTransfer = Math.min(bbuf_dest.remaining(), bbuf_src.remaining());
bbuf_dest.put(bbuf_src.array(), bbuf_src.arrayOffset() + bbuf_src.position(), maxTransfer); // start at the source's current position, not at index 0