Deep copy duplicate() of Java's ByteBuffer

java.nio.ByteBuffer#duplicate() returns a new byte buffer that shares the old buffer's content. Changes to the old buffer's content will be visible in the new buffer, and vice versa. What if I want a deep copy of the byte buffer?

I think the deep copy need not involve byte[]. Try the following:

public static ByteBuffer clone(ByteBuffer original) {
    ByteBuffer clone = ByteBuffer.allocate(original.capacity());
    original.rewind();  // copy from the beginning (this resets the original's position and discards its mark)
    clone.put(original);
    original.rewind();  // leave the original rewound
    clone.flip();       // make the clone ready for reading
    return clone;
}
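
A quick sanity check (a sketch; note that clone() rewinds the original as a side effect, so its position ends up at 0 and any mark is discarded):

ByteBuffer original = ByteBuffer.allocate(8);
original.putInt(42);              // position is now 4
ByteBuffer copy = clone(original);
copy.putInt(0, 7);                // absolute write into the copy only
assert original.getInt(0) == 42;  // the original content is untouched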

As this question still comes up as one of the first search hits for copying a ByteBuffer, I will offer my solution. This solution does not touch the original buffer, including any mark that has been set, and returns a deep copy with the same capacity as the original.

public static ByteBuffer cloneByteBuffer(final ByteBuffer original) {
    // Create a clone with the same capacity as the original.
    final ByteBuffer clone = (original.isDirect()) ?
            ByteBuffer.allocateDirect(original.capacity()) :
            ByteBuffer.allocate(original.capacity());

    // Create a read-only view of the original.
    // This allows reading from the original without modifying it.
    final ByteBuffer readOnlyCopy = original.asReadOnlyBuffer();

    // Flip the view and copy: the bytes [0, original position) are transferred.
    readOnlyCopy.flip();
    clone.put(readOnlyCopy);
    return clone;
}
If you need the position, limit, or byte order to match the original, that's an easy addition to the above:

clone.position(original.position());
clone.limit(original.limit());
clone.order(original.order());
return clone;

Based on mingfai's solution:

This will give you an almost-true deep copy. The only thing lost is the mark. If orig is a heap buffer with a non-zero array offset, or with a capacity smaller than its backing array, the outlying data is not copied.
public static ByteBuffer deepCopy(ByteBuffer orig) {
    int pos = orig.position(), lim = orig.limit();
    try {
        orig.position(0).limit(orig.capacity()); // set range to entire buffer
        ByteBuffer toReturn = deepCopyVisible(orig); // deep copy range
        toReturn.position(pos).limit(lim); // set range to original
        return toReturn;
    } finally {
        // restore in a finally block so we don't bork the orig if something goes wrong
        orig.position(pos).limit(lim);
    }
}

public static ByteBuffer deepCopyVisible(ByteBuffer orig) {
    int pos = orig.position();
    try {
        ByteBuffer toReturn;
        // try to maintain the implementation type to keep performance
        if (orig.isDirect())
            toReturn = ByteBuffer.allocateDirect(orig.remaining());
        else
            toReturn = ByteBuffer.allocate(orig.remaining());

        toReturn.put(orig);
        toReturn.order(orig.order());
        return (ByteBuffer) toReturn.position(0);
    } finally {
        orig.position(pos); // restore the original's position
    }
}

One more simple solution:

public ByteBuffer deepCopy(ByteBuffer source, ByteBuffer target) {
    int sourceP = source.position();
    int sourceL = source.limit();

    if (null == target) {
        target = ByteBuffer.allocate(source.remaining());
    }
    target.put(source);
    target.flip();

    source.position(sourceP);
    source.limit(sourceL);
    return target;
}

You'll need to iterate the entire buffer and copy by value into the new buffer.
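
For example, here is a minimal sketch of that approach using absolute indexed get/put, which copies every byte without disturbing either buffer's position (the mark and the read-only trait are still not carried over):

public static ByteBuffer copyByValue(ByteBuffer original) {
    ByteBuffer copy = original.isDirect()
            ? ByteBuffer.allocateDirect(original.capacity())
            : ByteBuffer.allocate(original.capacity());
    int limit = original.limit();
    original.limit(original.capacity());   // temporarily expose the whole buffer
    for (int i = 0; i < original.capacity(); i++) {
        copy.put(i, original.get(i));      // absolute get/put: positions are untouched
    }
    original.limit(limit);                 // restore the caller's limit
    copy.limit(limit);
    copy.position(original.position());
    copy.order(original.order());
    return copy;
}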

I believe this should supply a full deep copy, including the mark and the "out-of-bounds" data (the bytes between the limit and the capacity), in case you need the most complete, sandbox-safe carbon copy of a ByteBuffer.

The only thing it doesn't copy is the read-only trait, which you can easily get by calling this method and tacking on .asReadOnlyBuffer().
public static ByteBuffer cloneByteBuffer(ByteBuffer original) {
    // Get position, limit, and mark
    int pos = original.position();
    int limit = original.limit();
    int mark = -1;
    try {
        original.reset();
        mark = original.position();
    } catch (InvalidMarkException e) {
        // This happens when the original's mark is unset, so leave mark at its default of -1
    }

    // Create clone with matching capacity and byte order
    ByteBuffer clone = (original.isDirect())
            ? ByteBuffer.allocateDirect(original.capacity())
            : ByteBuffer.allocate(original.capacity());
    clone.order(original.order());

    // Copy the FULL buffer contents, including the "out-of-bounds" part beyond the limit
    original.limit(original.capacity());
    original.position(0);
    clone.put(original);

    // Set the mark of both buffers to what it was originally
    if (mark != -1) {
        original.position(mark);
        original.mark();
        clone.position(mark);
        clone.mark();
    }

    // Set the position and limit of both buffers to what they were originally
    original.position(pos);
    original.limit(limit);
    clone.position(pos);
    clone.limit(limit);
    return clone;
}
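
A quick round-trip check (a sketch; the assertions just confirm that position, limit, and mark survive the copy):

ByteBuffer buf = ByteBuffer.allocate(16);
buf.putInt(0xCAFEBABE);       // position is now 4
buf.mark();                   // mark at position 4
buf.putInt(0xDEADBEEF);       // position is now 8
buf.limit(12);

ByteBuffer copy = cloneByteBuffer(buf);
assert copy.position() == buf.position() && copy.limit() == buf.limit();
copy.reset();                 // jumps back to the copied mark
assert copy.position() == 4;
assert copy.getInt(0) == buf.getInt(0);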

Related

MappedByteBuffer (in Android Studio) constructor is broken (super constructor broken)

I have a byte array and it has to be converted to a MappedByteBuffer. But when I try to create the MappedByteBuffer, an error occurs:

error: cannot find symbol method MappedByteBuffer(int,int,int,int,byte[],int)
MappedByteBuffer.java
package java.nio;
import java.io.FileDescriptor;
import sun.misc.Unsafe;
public abstract class MappedByteBuffer
    extends ByteBuffer
{
    ...
    // Android-added: Additional constructor for use by Android's DirectByteBuffer.
    MappedByteBuffer(int mark, int pos, int lim, int cap, byte[] buf, int offset) {
        super(mark, pos, lim, cap, buf, offset); // <- when I hover the mouse here, a "ByteBuffer() in ByteBuffer cannot be applied to ..." message appears with a red underline
        this.fd = null;
    }
    ...
}
ByteBuffer.java
package java.nio;
import libcore.io.Memory;
import dalvik.annotation.codegen.CovariantReturnType;
public abstract class ByteBuffer
    extends Buffer
    implements Comparable<ByteBuffer>
{
    // These fields are declared here rather than in Heap-X-Buffer in order to
    // reduce the number of virtual method invocations needed to access these
    // values, which is especially costly when coding small buffers.
    //
    final byte[] hb;        // Non-null only for heap buffers
    final int offset;
    boolean isReadOnly;     // Valid only for heap buffers

    // Creates a new buffer with the given mark, position, limit, capacity,
    // backing array, and array offset
    //
    ByteBuffer(int mark, int pos, int lim, int cap,   // package-private
               byte[] hb, int offset)
    {
        // Android-added: elementSizeShift parameter (log2 of element size).
        super(mark, pos, lim, cap, 0 /* elementSizeShift */);
        this.hb = hb;
        this.offset = offset;
    }
    ...
}
What I find strange is that when I go to the definition of extends ByteBuffer in MappedByteBuffer.java, it opens ByteBuffer.annotated.java, not ByteBuffer.java:
ByteBuffer.annotated.java
// -- This file was mechanically generated: Do not edit! -- //
package java.nio;
@SuppressWarnings({"unchecked", "deprecation", "all"})
public abstract class ByteBuffer extends java.nio.Buffer implements java.lang.Comparable<java.nio.ByteBuffer> {
    ByteBuffer(int mark, int pos, int lim, int cap) { super(0, 0, 0, 0, 0); throw new RuntimeException("Stub!"); }
I don't know what {classname}.annotated.java does, so this might not be an error, but I pasted it because I think it's odd.

So how can I create a MappedByteBuffer from a byte array? There is only one constructor, and it's broken.
There is only one constructor, and it's broken.
That constructor isn't public (it's package-private), so you can't call it.
So how can I create a MappedByteBuffer from a byte array?
You can't, not without writing it to a file first. From the docs:
A direct byte buffer whose content is a memory-mapped region of a file.
If you do need to create a MappedByteBuffer specifically, and not just a ByteBuffer, from a byte array, you need to write the array to a file and use FileChannel.map, as sketched below. If you just need a ByteBuffer, you can use ByteBuffer.wrap.
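
For instance, a sketch of the file round trip (the temp-file handling here is illustrative, not prescriptive; adjust the map mode and cleanup to your needs):

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

static MappedByteBuffer mapBytes(byte[] data) throws IOException {
    File tmp = File.createTempFile("mapped", ".bin");
    tmp.deleteOnExit();
    try (FileOutputStream out = new FileOutputStream(tmp)) {
        out.write(data);
    }
    try (RandomAccessFile raf = new RandomAccessFile(tmp, "r");
         FileChannel ch = raf.getChannel()) {
        // A READ_ONLY mapping of the whole file; per the FileChannel docs,
        // the mapping remains valid after the channel is closed.
        return ch.map(FileChannel.MapMode.READ_ONLY, 0, data.length);
    }
}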

How to copy native memory to DirectByteBuffer

I know one way: using memcpy on the C++ side.
C++ method:
void CopyData(void* buffer, int size)
{
    memcpy(buffer, source, size);
}
JNR mapping:
void CopyData(@Pinned @Out ByteBuffer byteBuffer, @Pinned @In int size);
Java invocation:
ByteBuffer buffer = ByteBuffer.allocateDirect(size);
adapter.CopyData(buffer, size);
But I would like to handle the case where the native code does not copy the data, but instead returns a pointer to the memory to be copied:
C++ methods:
void* GetData1()
{
    return source;
}

// or

struct Data
{
    void* data;
};

void GetData2(Data* outData)
{
    outData->data = source;
}
I know how to write JNR mapping to be able to copy data to HeapByteBuffer:
Pointer GetData1();
// or
void GetData2(@Pinned @Out Data outData);

final class Data extends Struct {
    public final Struct.Pointer data;

    public Data(Runtime runtime) {
        super(runtime);
        data = new Struct.Pointer();
    }
}
Java invocation:
ByteBuffer buffer = ByteBuffer.allocate(size);
Pointer dataPtr = adapter.GetData1();
dataPtr.get(0, buffer.array(), 0, buffer.array().length);
// or
ByteBuffer buffer = ByteBuffer.allocate(size);
Data outData = new Data(runtime);
adapter.GetData2(outData);
Pointer dataPtr = outData.data.get();
dataPtr.get(0, buffer.array(), 0, buffer.array().length);
But I have not found a way to copy the memory into a DirectByteBuffer instead of a HeapByteBuffer. The snippets above do not work for a DirectByteBuffer, because buffer.array() is unavailable for such a buffer; it is backed by a native memory area rather than a Java array. Please help.
I have found several ways to copy JNR native memory to a DirectByteBuffer; they differ in efficiency. Currently I use the following approach. I don't know whether it is the best one, or the one intended by the JNR authors:
ByteBuffer buffer = ByteBuffer.allocateDirect(size);
Pointer dataPtr = adapter.GetData1();
long destAddress = ((DirectBuffer)buffer).address();
Pointer destPtr = AsmRuntime.pointerValue(destAddress, runtime);
assert dataPtr.isDirect() && destPtr.isDirect();
dataPtr.transferTo(0, destPtr, 0, size);
or
ByteBuffer buffer = ByteBuffer.allocateDirect(size);
Data outData = new Data(runtime);
adapter.GetData2(outData);
Pointer dataPtr = outData.data.get();
long destAddress = ((DirectBuffer)buffer).address();
Pointer destPtr = AsmRuntime.pointerValue(destAddress, runtime);
assert dataPtr.isDirect() && destPtr.isDirect();
dataPtr.transferTo(0, destPtr, 0, size);
It is important that the assert clause above holds. It guarantees that the pointers are jnr.ffi.provider.jffi.DirectMemoryIO instances, so the efficient memcpy method is used for copying (check the implementation of DirectMemoryIO.transferTo()).

The alternative is to wrap the DirectByteBuffer using one of the following methods:
Pointer destPtr = Pointer.wrap(runtime, destAddress);
or
Pointer destPtr = Pointer.wrap(runtime, destAddress, size);
but not:
Pointer destPtr = Pointer.wrap(runtime, buffer);
The first and second pointers are backed by DirectMemoryIO, but the third is backed by ByteBufferMemoryIO, which involves slow byte-by-byte copying.

The one drawback is that a DirectMemoryIO instance is quite heavyweight. It allocates 32 bytes on the JVM heap, so with many JNR invocations the DirectMemoryIO instances can consume a significant amount of memory.

Java: How to convert a java.nio.Buffer into a ByteBuffer

I'm facing a little problem: I have two libraries. One gives me its output as a java.nio.Buffer, and the other takes its input as a java.nio.ByteBuffer. How do I make the conversion? Thanks.

The Buffer comes from JavaCV, from this piece of code:
private BytePointer[] image_ptr;
private Buffer[] image_buf;
// Determine required buffer size and allocate buffer
int size = avpicture_get_size(fmt, width, height);
image_ptr = new BytePointer[] { new BytePointer(av_malloc(size)).capacity(size) };
image_buf = new Buffer[] { image_ptr[0].asBuffer() };
// Assign appropriate parts of buffer to image planes in picture_rgb
// Note that picture_rgb is an AVFrame, but AVFrame is a superset of AVPicture
avpicture_fill(new AVPicture(picture_rgb), image_ptr[0], fmt, width, height);
picture_rgb.format(fmt);
picture_rgb.width(width);
picture_rgb.height(height);
First of all, since Buffer is an abstract class, and ByteBuffer is one of its subclasses, it's entirely possible that the output you're getting from the first library is in fact a ByteBuffer. If possible, check to see which implementation of Buffer the library is returning, because if it's actually returning a ByteBuffer you can just cast the output to ByteBuffer and be done.
If you don't know which implementation of Buffer the library returns, you'll have to resort to instanceof tests to determine the subclass, and copy the data from the returned Buffer into a new ByteBuffer after downcasting. This is because the Buffer class itself doesn't provide any methods to read the data; only the subclasses (ByteBuffer, ShortBuffer, LongBuffer, etc.) do. Fortunately, there are only seven possible subclasses of Buffer, one for each primitive type.
Once you've determined which subclass of Buffer you have, you can copy the data to a ByteBuffer using the asXXXBuffer() method described in this answer, as @Tunaki pointed out.
The code would look something like this:
Buffer outputBuffer = library.getBuffer();
ByteBuffer byteBuffer;

if (outputBuffer instanceof ByteBuffer) {
    byteBuffer = (ByteBuffer) outputBuffer;
} else if (outputBuffer instanceof CharBuffer) {
    byteBuffer = ByteBuffer.allocate(outputBuffer.capacity() * 2); // a char is 2 bytes
    byteBuffer.asCharBuffer().put((CharBuffer) outputBuffer);
} else if (outputBuffer instanceof ShortBuffer) {
    byteBuffer = ByteBuffer.allocate(outputBuffer.capacity() * 2);
    byteBuffer.asShortBuffer().put((ShortBuffer) outputBuffer);
} else if (outputBuffer instanceof IntBuffer) {
    byteBuffer = ByteBuffer.allocate(outputBuffer.capacity() * 4);
    byteBuffer.asIntBuffer().put((IntBuffer) outputBuffer);
} else if (outputBuffer instanceof LongBuffer) {
    byteBuffer = ByteBuffer.allocate(outputBuffer.capacity() * 8);
    byteBuffer.asLongBuffer().put((LongBuffer) outputBuffer);
} else if (outputBuffer instanceof FloatBuffer) {
    byteBuffer = ByteBuffer.allocate(outputBuffer.capacity() * 4);
    byteBuffer.asFloatBuffer().put((FloatBuffer) outputBuffer);
} else if (outputBuffer instanceof DoubleBuffer) {
    byteBuffer = ByteBuffer.allocate(outputBuffer.capacity() * 8);
    byteBuffer.asDoubleBuffer().put((DoubleBuffer) outputBuffer);
}
Note that the size of the ByteBuffer you allocate depends on which subclass of Buffer you're copying from, since different primitive types are stored using different numbers of bytes. For example, since an int is 4 bytes, if your library gives you an IntBuffer, you need to allocate a ByteBuffer with 4 times the capacity.
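
One caveat worth noting: the asXXXBuffer() views honor the ByteBuffer's current byte order, which defaults to BIG_ENDIAN. If the source buffer was filled with a different order (native order is common for buffers coming from JNI or JavaCV), set the order before taking the view; for example (a sketch, assuming the producer used native order):

byteBuffer = ByteBuffer.allocate(outputBuffer.capacity() * 4)
        .order(ByteOrder.nativeOrder()); // match the producer's byte order
byteBuffer.asIntBuffer().put((IntBuffer) outputBuffer);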

Are there any tricks to reduce memory usage when storing String data type in hashmap?

I need to store value pairs (word and number) in a Map.

I am trying to use TObjectIntHashMap from the Trove library with char[] as the key, because I need to minimize memory usage. But with this approach, I cannot retrieve values with the get() method. I suspect a primitive char array can't be used as a Map key because of hashCode/equals issues: arrays use identity-based equality, so a lookup with a different array instance never matches.

I tried TCharArrayList, but that takes a lot of memory as well.

I read another Stack Overflow question similar to mine that suggested using TLongIntHashMap, storing the String word encoded in a long. In my case the words may contain Latin characters or various other characters that appear in Wikipedia collections, so I do not know whether a long is enough to encode them.

I have also tried a Trie data structure, but I need to consider performance too and choose the best trade-off between memory usage and speed.

Do you have any ideas or suggestions for this issue?
It sounds like the most compact way to store the data is to use a byte[] encoded in UTF-8 or similar. You can wrap this in your own class (see the sketch below) or write your own HashMap which allows byte[] as a key.

I would reconsider how much time it is worth spending to save some memory. If you are talking about a PC or server, at minimum wage you need to save about 1 GB for an hour's work, so if you are only looking to save 100 MB, that's about 6 minutes including testing.
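
As a sketch of the byte[] wrapper idea (the class name is hypothetical; equals() and hashCode() delegate to java.util.Arrays so that two keys with the same bytes match, which a raw byte[] key would not, since arrays use identity equality):

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public final class ByteArrayKey {
    private final byte[] bytes;
    private final int hash;   // cached; the key is immutable

    public ByteArrayKey(String word) {
        this.bytes = word.getBytes(StandardCharsets.UTF_8);
        this.hash = Arrays.hashCode(bytes);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ByteArrayKey
                && Arrays.equals(bytes, ((ByteArrayKey) o).bytes);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

A Map<ByteArrayKey, Integer> then stores each word as UTF-8 bytes plus one cached int, rather than a full String.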
Write your own class that implements CharSequence, with your own implementations of equals() and hashCode(). The implementation would also pre-allocate large shared char[] storage and use bits of it at a time. (You can definitely incorporate @Peter Lawrey's excellent suggestion into this, too, and use byte[] storage.)

There's also an opportunity to do a 'soft intern()' using an LRU cache; I've noted where the cache would go.

Here's a simple demonstration of what I mean. Note that if you need heavily concurrent writes, you can try to improve the locking scheme below.
public final class CompactString implements CharSequence {
    private final char[] _data;
    private final int _offset;
    private final int _length;
    private final int _hashCode;

    private static final Object _lock = new Object();
    private static char[] _storage;
    private static int _nextIndex;

    private static final int LENGTH_THRESHOLD = 128;

    private CompactString(char[] data, int offset, int length, int hashCode) {
        _data = data; _offset = offset; _length = length; _hashCode = hashCode;
    }

    private static final CompactString EMPTY = new CompactString(new char[0], 0, 0, "".hashCode());

    private static void allocateStorage() {
        synchronized (_lock) {
            _storage = new char[1024];
            _nextIndex = 0;
        }
    }

    private static CompactString storeInShared(String value) {
        synchronized (_lock) {
            if (_nextIndex + value.length() > _storage.length) {
                allocateStorage();
            }
            int start = _nextIndex;
            // You would need to change this loop and length to do UTF encoding.
            for (int i = 0; i < value.length(); ++i) {
                _storage[_nextIndex++] = value.charAt(i);
            }
            return new CompactString(_storage, start, value.length(), value.hashCode());
        }
    }

    static {
        allocateStorage();
    }

    public static CompactString valueOf(String value) {
        // You can implement a soft .intern-like solution here.
        if (value == null) {
            return null;
        } else if (value.length() == 0) {
            return EMPTY;
        } else if (value.length() > LENGTH_THRESHOLD) {
            // You would need to change .toCharArray() and length to do UTF encoding.
            return new CompactString(value.toCharArray(), 0, value.length(), value.hashCode());
        } else {
            return storeInShared(value);
        }
    }

    // left to the reader: implement equals(), hashCode(), and the CharSequence methods
}
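
Usage would then look something like this (a sketch; it assumes equals() and hashCode() have been implemented as noted above):

Map<CompactString, Integer> counts = new HashMap<>();
counts.put(CompactString.valueOf("word"), 42);
Integer n = counts.get(CompactString.valueOf("word")); // relies on content-based equals()/hashCode()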

Store and retrieve a float[] to/from Cassandra using Hector

I have the following Cassandra schema:
ColumnFamily: FloatArrays {
    SCKey: SuperColumn Key (Integer) {
        Key: FloatArray (float[]) {
            field (String): value (String)
        }
    }
}
In order to insert data that adheres to this schema I created the following template in Hector:
template = new ThriftSuperCfTemplate<Integer, FloatArray, String>(
        keyspace, "FloatArrays", IntegerSerializer.get(),
        FloatArraySerializer.get(), StringSerializer.get());
To (de-)serialize the FloatArray I created (and unit tested) a custom Serializer:
public class FloatArraySerializer extends AbstractSerializer<FloatArray> {

    private static final FloatArraySerializer instance =
            new FloatArraySerializer();

    public static FloatArraySerializer get() {
        return instance;
    }

    @Override
    public FloatArray fromByteBuffer(ByteBuffer buffer) {
        buffer.rewind();
        FloatBuffer floatBuf = buffer.asFloatBuffer();
        float[] floats = new float[floatBuf.limit()];
        if (floatBuf.hasArray()) {
            floats = floatBuf.array();
        } else {
            floatBuf.get(floats, 0, floatBuf.limit());
        }
        return new FloatArray(floats);
    }

    @Override
    public ByteBuffer toByteBuffer(FloatArray theArray) {
        float[] floats = theArray.getFloats();
        ByteBuffer byteBuf = ByteBuffer.allocate(4 * floats.length);
        FloatBuffer floatBuf = byteBuf.asFloatBuffer();
        floatBuf.put(floats);
        byteBuf.rewind();
        return byteBuf;
    }
}
Now comes the tricky bit. Storing and then retrieving an array of floats does not return the same result. In fact, the number of elements in the array isn't even the same. The code I use to retrieve the result is shown below:
SuperCfResult<Integer, FloatArray, String> result =
        template.querySuperColumns(hash);
for (FloatArray floatArray : result.getSuperColumns()) {
    // Do something with the FloatArrays
}
Do I make a conceptual mistake here, since I'm quite new to Cassandra/Hector? Right now I don't even have a clue where it goes wrong. The Serializer seems to be OK. Can you please provide me with some pointers to continue my search? Many thanks!
I think you're on the right track. When I work with ByteBuffers I find I sometimes need the statement:
import org.apache.thrift.TBaseHelper;
...
ByteBuffer aCorrectedByteBuffer = TBaseHelper.rightSize(theByteBufferIWasGiven);
The byte buffer sometimes has its value stored at an offset into its backing buffer, but the Serializers seem to assume that the value starts at offset 0. As best I can tell, TBaseHelper corrects the offset so that the assumptions in the Serializer implementation hold.

The difference in length between the array you put in and the array you get out is the result of starting at the wrong offset: the first byte or two of the serialized value contain the length of the array.
Thanks to Chris I solved the problem. The Serializer now looks like this:

public class FloatArraySerializer extends AbstractSerializer<FloatArray> {

    private static final FloatArraySerializer instance =
            new FloatArraySerializer();

    public static FloatArraySerializer get() {
        return instance;
    }

    @Override
    public FloatArray fromByteBuffer(ByteBuffer buffer) {
        ByteBuffer rightBuffer = TBaseHelper.rightSize(buffer); // This does the trick
        FloatBuffer floatBuf = rightBuffer.asFloatBuffer();
        float[] floats = new float[floatBuf.limit()];
        if (floatBuf.hasArray()) {
            floats = floatBuf.array();
        } else {
            floatBuf.get(floats, 0, floatBuf.limit());
        }
        return new FloatArray(floats);
    }

    @Override
    public ByteBuffer toByteBuffer(FloatArray theArray) {
        float[] floats = theArray.getFloats();
        ByteBuffer byteBuf = ByteBuffer.allocate(4 * floats.length);
        FloatBuffer floatBuf = byteBuf.asFloatBuffer();
        floatBuf.put(floats);
        byteBuf.rewind();
        return byteBuf;
    }
}
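
A quick round-trip check of the serializer (a sketch; FloatArray is assumed to be a simple wrapper exposing getFloats()):

FloatArraySerializer serializer = FloatArraySerializer.get();
FloatArray in = new FloatArray(new float[] { 1.0f, 2.5f, -3.75f });
FloatArray out = serializer.fromByteBuffer(serializer.toByteBuffer(in));
assert Arrays.equals(in.getFloats(), out.getFloats());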
