Store and retrieve a float[] to/from Cassandra using Hector - java

I have the following Cassandra schema:
ColumnFamily: FloatArrays {
SCKey: SuperColumn Key (Integer) {
Key: FloatArray (float[]) {
field (String): value (String)
}
}
}
In order to insert data that adheres to this schema I created the following template in Hector:
template = new ThriftSuperCfTemplate<Integer, FloatArray, String>(
keyspace, "FloatArrays", IntegerSerializer.get(),
FloatArraySerializer.get(), StringSerializer.get());
To (de-)serialize the FloatArray I created (and unit tested) a custom Serializer:
public class FloatArraySerializer extends AbstractSerializer<FloatArray> {
private static final FloatArraySerializer instance =
new FloatArraySerializer();
public static FloatArraySerializer get() {
return instance;
}
@Override
public FloatArray fromByteBuffer(ByteBuffer buffer) {
buffer.rewind();
FloatBuffer floatBuf = buffer.asFloatBuffer();
float[] floats = new float[floatBuf.limit()];
if (floatBuf.hasArray()) {
floats = floatBuf.array();
} else {
floatBuf.get(floats, 0, floatBuf.limit());
}
return new FloatArray(floats);
}
@Override
public ByteBuffer toByteBuffer(FloatArray theArray) {
float[] floats = theArray.getFloats();
ByteBuffer byteBuf = ByteBuffer.allocate(4 * floats.length);
FloatBuffer floatBuf = byteBuf.asFloatBuffer();
floatBuf.put(floats);
byteBuf.rewind();
return byteBuf;
}
}
Now comes the tricky bit. Storing and then retrieving an array of floats does not return the same result. In fact, the number of elements in the array isn't even the same. The code I use to retrieve the result is shown below:
SuperCfResult<Integer, FloatArray, String> result =
template.querySuperColumns(hash);
for (FloatArray floatArray: result.getSuperColumns()) {
// Do something with the FloatArrays
}
Do I make a conceptual mistake here since I'm quite new to Cassandra/Hector? Right now I don't even have a clue on where it goes wrong. The Serializer seems to be ok. Can you please provide me with some pointers to continue my search? Many thanks!

I think you're on the right track. When I work with ByteBuffers I find I sometimes need the statement:
import org.apache.thrift.TBaseHelper;
...
ByteBuffer aCorrectedByteBuffer = TBaseHelper.rightSize(theByteBufferIWasGiven);
The byte buffer sometimes has its value stored as an offset into its buffer but the Serializers seem to assume that the byte buffer's value starts at offset 0. The TBaseHelper corrects the offsets as best I can tell so the assumptions in the Serializer implementation are made valid.
The difference in lengths of the array in and array out are the result of starting at the wrong offset. The first byte or two of the serialized value contain the length of the array.
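To see the offset problem in isolation, here is a minimal sketch that uses only the JDK (no Hector or Thrift). The class and method names are made up for illustration, and the 4-byte header is an assumed stand-in for whatever bytes precede the value in the real buffer. It shows that rewinding exposes the header bytes as an extra float, while reading from the buffer's current position returns the original values.

```java
import java.nio.ByteBuffer;
import java.nio.FloatBuffer;

// Hypothetical demo class; not part of Hector or Thrift.
final class FloatBufferOffsetDemo {

    // Builds a buffer whose float payload starts 4 bytes into the backing array,
    // mimicking a value that sits at an offset inside a larger buffer.
    static ByteBuffer payloadAtOffset(float... values) {
        int headerBytes = 4; // assumed framing bytes, for illustration only
        ByteBuffer whole = ByteBuffer.allocate(headerBytes + Float.BYTES * values.length);
        whole.position(headerBytes);
        for (float v : values) {
            whole.putFloat(v);
        }
        whole.position(headerBytes); // the payload is now [position, limit)
        return whole;
    }

    // Wrong: rewind() moves the position back to 0, so the header is decoded too.
    static float[] readAfterRewind(ByteBuffer buffer) {
        ByteBuffer copy = buffer.duplicate();
        copy.rewind();
        return drain(copy.asFloatBuffer());
    }

    // Right: asFloatBuffer() starts at the current position, so leave it alone.
    static float[] readFromPosition(ByteBuffer buffer) {
        return drain(buffer.duplicate().asFloatBuffer());
    }

    // Copies every remaining float of the view into a fresh array.
    private static float[] drain(FloatBuffer view) {
        float[] floats = new float[view.remaining()];
        view.get(floats);
        return floats;
    }
}
```

With three floats stored behind the header, readAfterRewind returns four elements and readFromPosition returns three, which matches the symptom of getting back an array of a different length.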

Thanks to Chris I solved the problem. The Serializer now looks like this:
public class FloatArraySerializer extends AbstractSerializer<FloatArray> {
private static final FloatArraySerializer instance =
new FloatArraySerializer();
public static FloatArraySerializer get() {
return instance;
}
@Override
public FloatArray fromByteBuffer(ByteBuffer buffer) {
ByteBuffer rightBuffer = TBaseHelper.rightSize(buffer); // This does the trick
FloatBuffer floatBuf = rightBuffer.asFloatBuffer();
float[] floats = new float[floatBuf.limit()];
if (floatBuf.hasArray()) {
floats = floatBuf.array();
} else {
floatBuf.get(floats, 0, floatBuf.limit());
}
return new FloatArray(floats);
}
@Override
public ByteBuffer toByteBuffer(FloatArray theArray) {
float[] floats = theArray.getDescriptor();
ByteBuffer byteBuf = ByteBuffer.allocate(4 * floats.length);
FloatBuffer floatBuf = byteBuf.asFloatBuffer();
floatBuf.put(floats);
byteBuf.rewind();
return byteBuf;
}
}


How to map binary data in java class object?

Input data: a 64-byte hexadecimal string
String binaryData="01000076183003104000800180f5010100010100000063000000630000006300000063000000000000000000820000000200b8010307010700640005e1cbe180";
The goal is to read this binary data and populate the fields of a class object.
Here is the model:
public class Transaction_PLUSale {
public byte opcode;
public byte[] code=new byte[7];
public byte flag1;
public byte flag2;
public byte flag3;
public byte flag4;
public byte flag5;
public short deptnum;
public byte multi_sell_unit;
public byte return_type;
public byte tax_pointer;
public int qty;
public int price;
public int amount;
public int no_tax_price;
public int no_tax_amount;
public int return_surcharge_percent;
public byte product_code;
public byte flags;
public TransactionTail tail;
}
This is how I currently set the value of each field:
String hexArray[]= binaryData.split("(?<=\\G..)");
public static void readPLUSalesData(String hexArray[]) {
Transaction_PLUSale pluSale=new Transaction_PLUSale();
pluSale.setOpcode(Byte.valueOf(hexArray[0]));
byte arr[]=new byte[7];
for(int i=1;i<=7;i++) {
arr[i-1]=Byte.valueOf(hexArray[i]);
}
pluSale.setCode(arr);
pluSale.setFlag1(Byte.valueOf(hexArray[8]));
pluSale.setFlag2(Byte.valueOf(hexArray[9]));
pluSale.setFlag3(Byte.valueOf(hexArray[10]));
pluSale.setFlag4(Byte.valueOf(hexArray[11]));
pluSale.setFlag5(Byte.valueOf(hexArray[12]));
pluSale.setDeptnum((short)Integer.parseInt((hexArray[14]+hexArray[13]),16));
pluSale.setMulti_sell_unit(Byte.valueOf(hexArray[15]));
pluSale.setReturn_type(Byte.valueOf(hexArray[16]));
pluSale.setTax_pointer(Byte.valueOf(hexArray[17]));
pluSale.setQty(Integer.parseInt((hexArray[21]+hexArray[20]+hexArray[19]+hexArray[18]),16));
pluSale.setPrice(Integer.parseInt((hexArray[25]+hexArray[24]+hexArray[23]+hexArray[22]),16));
pluSale.setAmount(Integer.parseInt((hexArray[29]+hexArray[28]+hexArray[27]+hexArray[26]),16));
pluSale.setNo_tax_price(Integer.parseInt((hexArray[33]+hexArray[32]+hexArray[31]+hexArray[30]),16));
pluSale.setNo_tax_amount(Integer.parseInt((hexArray[37]+hexArray[36]+hexArray[35]+hexArray[34]),16));
pluSale.setReturn_surcharge_percent(Integer.parseInt((hexArray[41]+hexArray[40]+hexArray[39]+hexArray[38]),16));
pluSale.setProduct_code(Byte.valueOf(hexArray[42]));
pluSale.setFlags(Byte.valueOf(hexArray[43]));
}
It works fine, but I want it to be generic: instead of assigning the values byte by byte, I want to map the data directly onto the class fields.
In .NET we use marshalling for this.
Here is an example:
foreach (KeyValuePair<string, byte[]> s in t)
{
//byte array consist of bytes of the above hexadecimal string.
Ticket ticket = new Ticket();
int count = Marshal.SizeOf(typeof(Transaction_Coupon));
MemoryStream ms = new MemoryStream(s.Value);
byte[] readBuffer = new byte[count];
BinaryReader br = new BinaryReader(ms);
readBuffer = br.ReadBytes(count);
GCHandle handle = GCHandle.Alloc(readBuffer, GCHandleType.Pinned);
//here we are mapping byte data to each field
Transaction_PLUSale t_plusale = (Transaction_PLUSale)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(Transaction_PLUSale));
}
Java has no in-memory struct template that a byte[] can be mapped onto, so the fields of the class have to be read one by one. A good way to do that is a ByteBuffer, filled either from a byte array or from the bytes read from an InputStream.
public static void readPLUSalesData(String[] hexArray) {
byte[] bytes = new byte[hexArray.length];
for (int i = 0; i < bytes.length; ++i) {
// Byte.parseByte(s, 16) rejects values above 7f, so parse as int and cast
bytes[i] = (byte) Integer.parseInt(hexArray[i], 16);
}
ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
Transaction_PLUSale pluSale=new Transaction_PLUSale();
pluSale.setOpcode(buf.get());
byte[] arr = new byte[7];
buf.get(arr);
pluSale.setCode(arr);
pluSale.setFlag1(buf.get());
pluSale.setFlag2(buf.get());
pluSale.setFlag3(buf.get());
pluSale.setFlag4(buf.get());
pluSale.setFlag5(buf.get());
pluSale.setDeptnum(buf.getShort());
pluSale.setMulti_sell_unit(buf.get());
pluSale.setReturn_type(buf.get());
pluSale.setTax_pointer(buf.get());
pluSale.setQty(buf.getInt());
pluSale.setPrice(buf.getInt());
pluSale.setAmount(buf.getInt());
pluSale.setNo_tax_price(buf.getInt());
pluSale.setNo_tax_amount(buf.getInt());
pluSale.setReturn_surcharge_percent(buf.getInt());
pluSale.setProduct_code(buf.get());
pluSale.setFlags(buf.get());
}
There are other solutions, such as reflection, but that is inefficient.
I used little-endian byte order here; the default in Java is big endian.
There is also Java's built-in serialization (ObjectOutputStream, Serializable).
It stores class data too, so it is not the language-agnostic format you want.
While developing with a ByteBuffer it makes sense to check the read position.
If you are interested in XML persistence, JAXB with annotations offers a nice reflection-based approach that does not require handling every field.
A remark: Type[] variable is the preferred notation; Type var[] was originally included in Java for compatibility with C/C++.
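As a small illustration of the hex conversion, the byte order, and the read-position check, here is a sketch that decodes only the first 15 bytes of the sample record (opcode, code, flag1 to flag5, deptnum). The class and method names are hypothetical, and the field offsets are taken from the model class in the question.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class for illustration.
final class HexRecordDemo {

    // Converts a hex string such as "01f5" into bytes. Integer.parseInt plus a
    // cast is used because Byte.parseByte(s, 16) throws NumberFormatException
    // for values above 7f.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Reads only the deptnum field, checking the read position on the way.
    static int readDeptnum(byte[] record) {
        ByteBuffer buf = ByteBuffer.wrap(record).order(ByteOrder.LITTLE_ENDIAN);
        buf.get();              // opcode (1 byte)
        buf.get(new byte[7]);   // code (7 bytes)
        buf.get(new byte[5]);   // flag1..flag5 (5 bytes)
        if (buf.position() != 13) {
            throw new IllegalStateException("unexpected read position " + buf.position());
        }
        return buf.getShort();  // 2 bytes, little endian
    }
}
```

For the prefix 01000076183003104000800180f501 of the sample data, the two deptnum bytes f5 01 decode to 0x01f5, that is 501.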

Casting double[] to byte[] using stream in Java

Hey, I'm working on an app that uses Paho MQTT.
Now I'm trying to cast the contents of a couple of objects to byte arrays so I can send them to the broker. There are a few different objects that all extend an abstract class, but the one I started with contains a double[].
Here's the function I'm trying to implement:
@Override
public byte[] getBytes() {
return Arrays.stream(driveVector).map(d -> Double.valueOf(d).byteValue()).toArray();
}
I thought this would work, but I get an error saying that the return value is a double[].
I think I either don't understand the map method or I'm going about this all wrong in general. (I looked at the ByteBuffer class, but it seems like a pain to implement this with it.)
Thanks in advance.
You can't cast a double[] to a byte[] for the fundamental reason that they are unrelated types, and you can only cast between related types.
Casts in Java, unlike, say, C++, don't actually create a new object: they are merely a way to tell the compiler "I know more about the type of this object than you; trust me." For example, you might know that a variable of type Object actually holds a reference to a String, something which the compiler cannot know; in that case, you can cast the reference.
You can, however, construct a new array:
byte[] output = new byte[input.length];
for (int j = 0; j < input.length; j++) {
output[j] = (byte) input[j];
}
There is no way to do this with streams. Or rather, there is, in that you could crowbar this code into a stream operation on a Stream<double[]>, say; but involving streams like that clearly adds no benefit.
You can use ByteBuffer for it:
double[] doubles = new double[] {1,2,3,4,5};
ByteBuffer buffer = ByteBuffer.allocate(doubles.length * Double.BYTES);
Arrays.stream(doubles).forEach(buffer::putDouble);
buffer.array();
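The receiving side can reverse this with the same class. Below is a sketch of the full round trip, assuming both sides keep ByteBuffer's default big-endian order; the class name DoubleArrayCodec is made up for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical codec class; both directions use the default big-endian order.
final class DoubleArrayCodec {

    // Packs each double into 8 bytes, preserving the full value.
    static byte[] encode(double[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Double.BYTES);
        buffer.asDoubleBuffer().put(values); // bulk put through a double view
        return buffer.array();
    }

    // Rebuilds the double[] from bytes produced by encode.
    static double[] decode(byte[] bytes) {
        double[] values = new double[bytes.length / Double.BYTES];
        ByteBuffer.wrap(bytes).asDoubleBuffer().get(values);
        return values;
    }
}
```

Unlike a (byte) cast per element, which keeps only one byte of each value, this keeps all 8 bytes of every double, so decode(encode(x)) gives back the original values.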
Java Streams is not the right tool here, especially not since there is no ByteStream in Java.
Your method can be implemented as a simple for loop.
@Override
public byte[] getBytes() {
byte[] arr = new byte[driveVector.length];
for (int i = 0; i < arr.length; i++)
arr[i] = (byte) driveVector[i];
return arr;
}
In my MQTT application I read a single double value and post that to the broker. However, there is no real difference between a single double and an array of doubles, except that the client needs to know the array length, whereas with a single value it always knows there is exactly one.
I'm confident that you can adapt my code to write multiple values: change toMessage so that it writes several double values.
public abstract class SensorMonitor {
protected final MqttAsyncClient client;
protected final String topic;
protected final Logger logger = Logger.getLogger(getClass().getName());
private final ByteArrayOutputStream byteOut = new ByteArrayOutputStream(8);
private final DataOutputStream dataOut = new DataOutputStream(byteOut);
public SensorMonitor(MqttAsyncClient mqttClient, String topic) {
this.client = mqttClient;
this.topic = topic;
}
public void start(ScheduledExecutorService service) {
service.scheduleWithFixedDelay(this::publish, 0, 30, TimeUnit.SECONDS);
}
protected void publish() {
try {
MqttMessage message = toMessage(readNewValue());
client.publish(topic, message);
} catch (MqttException | IOException e) {
logger.log(Level.SEVERE, "Could not publish message", e);
}
}
private MqttMessage toMessage(double value) throws IOException {
byteOut.reset();
dataOut.writeDouble(value);
return new MqttMessage(byteOut.toByteArray());
}
protected abstract double readNewValue();
}
DataOutputStream.writeDouble uses Double.doubleToLongBits to create an IEEE 754 floating-point "double format" bit layout.
In my case I could preallocate and reuse the byteOut output stream because I knew the required size of the byte[] up front.
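Adapting toMessage to an array, as suggested above, might look like the following sketch. It assumes a simple wire format chosen for illustration (a 4-byte element count followed by the doubles); the class and method names are hypothetical, and the resulting byte[] would be passed to new MqttMessage(...) as in the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical payload helper; the length-prefixed format is an assumption.
final class DoubleArrayPayload {

    // Writes the element count first so the receiver knows how many doubles follow.
    static byte[] toPayload(double[] values) {
        ByteArrayOutputStream byteOut =
                new ByteArrayOutputStream(Integer.BYTES + values.length * Double.BYTES);
        try (DataOutputStream dataOut = new DataOutputStream(byteOut)) {
            dataOut.writeInt(values.length);
            for (double value : values) {
                dataOut.writeDouble(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return byteOut.toByteArray();
    }

    // Reads the count, then exactly that many doubles.
    static double[] fromPayload(byte[] payload) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            double[] values = new double[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readDouble();
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```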

Need some help ensuring my ByteUtil class is accurate

I am not very familiar with all the implications of working with bytes, and even less so with charsets, simply because I have not used them often. However, I am working on a project in which I need to convert every Java primitive type (and Strings) to and from bytes. I want all text to use the UTF-8 charset, but I'm not sure whether I am converting everything properly.
I am fairly sure that all the number-to-byte and byte-to-number conversions are correct, but I need to be 100% sure. If someone has solid experience with bytes, numbers, and charsets, could you look over the class below and point out any issues?
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
public class ByteUtil
{
//TO BYTES FROM PRIMITIVES & STRINGS
public static byte[] getBytes(short i)
{
return ByteBuffer.allocate(2).putInt(i).array();
}
public static byte[] getBytes(int i)
{
return ByteBuffer.allocate(4).putInt(i).array();
}
public static byte[] getBytes(long i)
{
return ByteBuffer.allocate(8).putLong(i).array();
}
public static byte getBytes(boolean i)
{
return (byte) (i ? 1 : 0);
}
public static byte[] getBytes(char i)
{
return getBytes(String.valueOf(i).trim());
}
public static byte[] getBytes(String i)
{
return i.getBytes(StandardCharsets.UTF_8);
}
public static byte[] getBytes(float i)
{
return getBytes(Float.floatToIntBits(i));
}
public static byte[] getBytes(double i)
{
return getBytes(Double.doubleToLongBits(i));
}
//TO PRIMITIVES & STRINGS FROM BYTES
public static short getShort(byte[] b)
{
ByteBuffer wrapped = ByteBuffer.wrap(b);
return wrapped.getShort();
}
public static int getInt(byte[] b)
{
ByteBuffer wrapped = ByteBuffer.wrap(b);
return wrapped.getInt();
}
public static long getLong(byte[] b)
{
ByteBuffer wrapped = ByteBuffer.wrap(b);
return wrapped.getLong();
}
public static boolean getBoolean(byte b)
{
return(b == 1 ? true : false);
}
public static char getChar(byte[] b)
{
return getString(b).trim().toCharArray()[0];
}
public static String getString(byte[] b)
{
return new String(b, StandardCharsets.UTF_8);
}
public static float getFloat(byte[] b)
{
return Float.intBitsToFloat(getInt(b));
}
public static double getDouble(byte[] b)
{
return Double.longBitsToDouble(getLong(b));
}
}
Additionally, all the data that goes in and comes out is only read internally by my own code. For example, the boolean conversion may or may not be the correct way to do it in general, but in this case it won't matter since I know what I am checking for.
You don't even need to do this. You can use a DataOutputStream to write your primitive types and Strings to a ByteArrayOutputStream. You can then use toByteArray() to get a byte[] that you put into a ByteArrayInputStream. You can wrap that InputStream in a DataInputStream to get back your primitives.
If you're doing a school assignment where you need to implement this yourself (which sounds like a dumb assignment), you can look up the implementations of ByteArrayOutputStream and ByteArrayInputStream on GrepCode. Copy/pasting is a bad idea, but it might give you some hints about considerations to take into account.
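A minimal sketch of the DataOutputStream and DataInputStream round trip described above, with a made-up class name and an arbitrary choice of fields. Note that writeUTF stores the String in Java's modified UTF-8 with a 2-byte length prefix, which is fine when, as here, the same code reads the data back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class showing one value of several types going to bytes and back.
final class DataStreamRoundTrip {

    static byte[] write(short s, int i, double d, boolean b, String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeShort(s);   // 2 bytes
            out.writeInt(i);     // 4 bytes
            out.writeDouble(d);  // 8 bytes
            out.writeBoolean(b); // 1 byte
            out.writeUTF(text);  // 2-byte length + modified UTF-8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Values must be read in the same order they were written.
    static String readBack(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            short s = in.readShort();
            int i = in.readInt();
            double d = in.readDouble();
            boolean b = in.readBoolean();
            String text = in.readUTF();
            return s + "|" + i + "|" + d + "|" + b + "|" + text;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Using writeShort for a short also sidesteps the putInt call in the posted getBytes(short), which would overflow its 2-byte buffer.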

2 dimensional array changing during serialisation

I'm serialising and deserialising a large two dimensional array of objects. Each object contains instructions to creating a BufferedImage - this is done to get around BufferedImage not being directly serializable itself.
The class being serialised is:
public final class MapTile extends TransientImage
{
private static final long serialVersionUID = 0;
private transient BufferedImage f;
transient BufferedImage b;
int along;
int down;
boolean flip = false;
int rot = 0;
public MapTile(World w, int a, int d)
{
// f = w.getMapTiles();
along = a;
down = d;
assignImage();
}
public MapTile(World w, int a, int d, int r, boolean fl)
{
// f = w.getMapTiles();
along = a;
down = d;
rot = r;
flip = fl;
assignImage();
}
public int getA()
{
return along;
}
public int getD()
{
return down;
}
@Override
public void assignImage()
{
if (f == null)
{
f = World.mapTiles;
}
b = f.getSubimage(along, down, World.squareSize, World.squareSize);
if (rot != 0)
{
b = SmallMap.rotateImage(b, rot);
}
if (flip)
{
b = SmallMap.flipImage(b);
}
super.setImage(b);
f.flush();
b.flush();
f = null;
b = null;
}
}
which extends:
public abstract class TransientImage implements Serializable
{
private transient BufferedImage image;
public BufferedImage getImage()
{
return image;
}
public void setImage(BufferedImage i)
{
image = i;
}
public abstract void assignImage();
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException
{
in.defaultReadObject();
assignImage();
}
}
This will ultimately be part of a map - usually it is created randomly but certain areas must be the same each time, hence serialising them and reading the array back in. As I will never need to save the image during normal usage I am putting in the write code:
try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("verticalroad.necro")))
{
//out.writeObject(mapArray);
//}
//catch (IOException e) {
//}
in the class that creates the map, the read code:
try{
FileInputStream door = new FileInputStream(new File(f.getPath()+ "//verticalroad.necro"));
ObjectInputStream reader = new ObjectInputStream(door);
homeTiles = (MapTile[][]) reader.readObject();
}
catch (IOException | ClassNotFoundException e)
{
System.out.println("Thrown an error" + e.getMessage());
}
in the initialising class and commenting in and out as needed.
However. Each time I run the program the contents of the two dimensional array (mapArray in write, homeTiles in read) is different. Not only different from the one I (thought) I wrote, but also different each time the program is opened.
As can be seen, I'm printing out the toString to System.out, which reveals further oddities. As it's just a standard array, the toString isn't 100% helpful, but it seems to cycle between several distinct values. However, even when the toString gives the same value, the contents of the array as displayed are not the same.
An example of a toString is hometiles:[[Lriseofthenecromancer.MapTile;@7681720a. Looking at the documentation for Array.toString, it seems to be badly formed, lacking a trailing ]. I'm not sure if this is a clue to the issue or if it's simply that the array is very large (several thousand objects) and it's an issue of display space (I'm using NetBeans).
Any insight as to why this is changing would be appreciated. My working assumption is that it's serializing the array but not the contents. But I have no idea a) if that's the case and b) if it is, what to do about it.
EDIT: Looking into this a bit further, it seems that instance variables aren't being set immediately. Printing them out directly after the call to setImage() has them all at zero; printing them from the calling class has them where they should be.
The underlying problem was that I'm an idiot. The specific expression of this in this particular case was that I forgot that subclasses couldn't inherit private methods. As such, the assignImage call wasn't being made and the image wasn't being set up.
Sorry for wasting the time of anyone who looked at this. I feel quite embarrassed.
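One pitfall that matches the EDIT above (fields still at zero inside the hook) can be reproduced in isolation. The following is a sketch with made-up class names, not the asker's code: Java deserialization restores a superclass's state, and runs its private readObject, before the subclass's fields are read, so an overridden method called from the superclass's readObject sees default values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical base class that calls an overridable hook while being deserialized.
abstract class HookedBase implements Serializable {
    private static final long serialVersionUID = 1L;
    transient int valueSeenDuringRead;

    abstract int currentValue();

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Subclass fields have not been restored yet at this point.
        valueSeenDuringRead = currentValue();
    }
}

// Hypothetical subclass whose field is read by the hook.
final class DemoTile extends HookedBase {
    private static final long serialVersionUID = 1L;
    int along;

    DemoTile(int along) {
        this.along = along;
    }

    @Override
    int currentValue() {
        return along;
    }
}

final class ReadOrderDemo {
    // Serializes the tile to memory and reads it back.
    static DemoTile roundTrip(DemoTile tile) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(tile);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (DemoTile) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After roundTrip(new DemoTile(7)), the copy's along is 7 but valueSeenDuringRead is 0. If a hook needs the subclass's fields, one option is to give the subclass its own private readObject and call the hook from there.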

Javolution ByteBuffer question

I have the following implementation with Javolution:
public class RunScan extends Struct
{
public final Signed32 numOfClusters = new Signed32();
public final ClusterData[] clusters;
public final Signed32 numOfRecons = new Signed32();
public final ReconData[] recons ;
public RunScan (int numOfClusters, int numOfRecons)
{
this.numOfClusters.set(numOfClusters);
this.numOfRecons.set(numOfRecons);
clusters = array(new ClusterData[numOfClusters]);
recons = array(new ReconData[numOfRecons]);
}
}
public class ClusterData extends Struct
{
public final UTF8String scanType = new UTF8String(CommInterfaceFieldConstants.SCAN_TYPE_SIZE);
public final UTF8String patientId = new UTF8String(CommInterfaceFieldConstants.PATIENT_ID_SIZE);
.
.
.
}
public class ReconData extends Struct
{
public final UTF8String patientId = new UTF8String(CommInterfaceFieldConstants.PATIENT_ID_SIZE);
public final UTF8String scanSeriesId = new UTF8String(CommInterfaceFieldConstants.SCAN_SERIES_ID_SIZE);
.
.
.
}
In our communication class, before we put data onto the socket, we need to get the byte[] of the RunScan object, but we get a BufferUnderflowException on the line marked with "//<<<<<<<":
private byte[] getCmdBytes(Struct scCmd)
{
ByteBuffer cmdBuffer = scCmd.getByteBuffer();
int cmdSize = scCmd.size();
byte[] cmdBytes = new byte[cmdSize];
if (cmdBuffer.hasArray())
{
int offset = cmdBuffer.arrayOffset() + scCmd.getByteBufferPosition();
System.arraycopy(cmdBuffer.array(), offset, cmdBytes, 0, cmdSize);
}
else
{
String msg = "\n\ncmdBufferRemaining=" + cmdBuffer.remaining() + ", cmdBytesSize=" + cmdBytes.length + "\n\n";
System.out.println(msg);
cmdBuffer.position(scCmd.getByteBufferPosition());
cmdBuffer.get(cmdBytes); //<<<<<<<<<< underFlowException
}
return cmdBytes;
}
This method works in other cases. The exception happens because this line,
ByteBuffer cmdBuffer = scCmd.getByteBuffer();
only returns an 8-byte ByteBuffer (according to remaining()) for the RunScan object, which I think covers just those two Signed32 fields. But this line,
int cmdSize = scCmd.size();
returns the right length for the RunScan object, including the size of those two arrays.
If I create the two arrays where I declare them (rather than in the constructor), with a hard-coded length, it works fine without any exception.
Can anybody help me figure out what's wrong with our implementation?
I ran into a similar situation with my code. Generally, with the current Struct object, you cannot have a variable length array defined in the same struct as the member that contains the number of elements in the array.
Try something like this:
public class RunScanHeader extends Struct
{
public final Signed32 numOfClusters = new Signed32();
public final Signed32 numOfRecons = new Signed32();
}
public class RunScanBody extends Struct
{
public final ClusterData[] clusters;
public final ReconData[] recons ;
public RunScanBody (int numOfClusters, int numOfRecons)
{
clusters = array(new ClusterData[numOfClusters]);
recons = array(new ReconData[numOfRecons]);
}
}
You'll then need a two phase approach to read and write, first read/write the header data, then read/write the body data.
Sorry I don't have more details at this time, if you can't solve this, let me know and I'll dig back through my code.
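The two-phase idea can be illustrated without Javolution. The sketch below uses a plain ByteBuffer and stands in a single int for each record, which is a simplification assumed for illustration; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical framing helper: a fixed-size header carries the element count,
// and the variable-size body is sized from that count.
final class TwoPhaseFraming {

    static byte[] write(int[] clusterIds) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + clusterIds.length * Integer.BYTES);
        buf.putInt(clusterIds.length);   // phase 1: header
        for (int id : clusterIds) {
            buf.putInt(id);              // phase 2: body
        }
        return buf.array();
    }

    static int[] read(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        int count = buf.getInt();        // phase 1: read the header first
        int[] ids = new int[count];      // only now is the body size known
        for (int i = 0; i < count; i++) {
            ids[i] = buf.getInt();       // phase 2: read the body
        }
        return ids;
    }
}
```

With Javolution the same split would presumably mean reading RunScanHeader first and then constructing RunScanBody with the counts it contains.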
The initialization order is important as it defines the position of each field. Either the initialization is done where the field is declared (the most common case), or, if you do it in the constructor, you have to remember that the constructor body runs after the field initializers. Here is an example with all initialization done in the constructor:
public class RunScan extends Struct {
public final Signed32 numOfClusters;
public final ClusterData[] clusters;
public final Signed32 numOfRecons;
public final ReconData[] recons ;
public RunScan (int numOfClusters, int numOfRecons) {
// Initialization done in the constructor for all members
// Order is important, it should match the declarative order to ensure proper positioning.
this.numOfClusters = new Signed32();
this.clusters = array(new ClusterData[numOfClusters]);
this.numOfRecons = new Signed32();
this.recons = array(new ReconData[numOfRecons]);
// Only after all the members have been initialized the set method can be used.
this.numOfClusters.set(numOfClusters);
this.numOfRecons.set(numOfRecons);
}
}
get() will move the position of the ByteBuffer.
scCmd.getByteBuffer().slice().get(dest) might solve your issue with moving the position and unintended side effects.
scCmd.getByteBuffer().duplicate().get(dest) might also solve your issue if slice() produces the wrong picture of the origin buffer.
Additionally, it appears as though scCmd.getByteBuffer() creates a redundant reference and you are calling the source and child reference in the same method.
If scCmd.getByteBuffer() is already passing you a slice(), your redundant access to these methods is certainly going to do something other than what you planned.
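The effect of duplicate() on the position can be checked with plain JDK buffers. This is a small sketch with a hypothetical helper name; it sizes the destination from remaining(), because get(byte[]) throws BufferUnderflowException when the array is longer than the number of bytes left.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: copies the readable bytes without moving the caller's position.
final class NonDestructiveRead {

    static byte[] copyRemaining(ByteBuffer source) {
        ByteBuffer view = source.duplicate();      // shares content, independent position/limit
        byte[] dest = new byte[view.remaining()];  // never ask for more than is available
        view.get(dest);                            // advances only the duplicate
        return dest;
    }
}
```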
