Simple data serialization in C - java

I am currently re-designing an application and stumbled upon a problem serializing some data.
Say I have an array of size mxn
double **data;
that I want to serialize into a
char *dataSerialized
using simple delimiters (one for rows, one for elements).
De-serialization is fairly straightforward, counting delimiters and allocating size for the data to be stored. However, what about the serialize function, say
serialize_matrix(double **data, int m, int n, char **dataSerialized);
What would be the best strategy to determine the size needed by the char array and allocate the appropriate memory for it?
Perhaps using some fixed-width exponential representation of doubles in a string? Is it possible to just convert all bytes of a double into chars and have a sizeof(double)-aligned char array? How would I keep the accuracy of the numbers intact?
NOTE:
I need the data in a char array, not in binary, not in a file.
The serialized data will be sent over the network using ZeroMQ between a C server and a Java client. Given the array dimensions and sizeof(double), can the data always be accurately reconstructed on both sides?

Java has pretty good support for reading raw bytes and converting into whatever you want.
You can decide on a simple wire format, serialize to it in C, and deserialize in Java.
Here's an example of an extremely simple format, with code to serialize and deserialize.
I've written a slightly larger test program that I can dump somewhere if you want; it creates a random data array in C, serializes it, and writes the serialized string base64-encoded to stdout. The much smaller Java program then reads, decodes and deserializes this.
C code to serialize:
/*
I'm using this format:
[number of elements in outer array]  32-bit signed int
[number of elements in inner array]  32-bit signed int
[elements]                           see below
[elements] is built like
[element(0,0)][element(0,1)]...[element(0,y)][element(1,0)]...
Each element is sent as a 64-bit IEEE 754 "double". If your C compiler/architecture is doing something different with its doubles, look forward to hours of fun :)
I'm using a couple of non-standard functions for byte-swapping here, originally from a BSD, but present in glibc>=2.9.
*/
#include <endian.h> /* htobe32, htobe64 (glibc; the header differs on the BSDs) */
#include <stddef.h> /* size_t */
#include <stdint.h> /* int32_t, uint64_t */
#include <string.h> /* memcpy */
/* Calculate the bytes required to store a message of x*y doubles */
size_t calculate_size(size_t x, size_t y)
{
/* The two dimensions in the array - each in 32 bits - (2 * 4)*/
size_t sz = 8;
/* a 64-bit IEEE 754 double is by definition 8 bytes long :) */
sz += ((x * y) * 8);
/* and a NUL */
sz++;
return sz;
}
/* Helpers */
static char* write_int32(int32_t, char*);
static char* write_double(double, char*);
/* Actual conversion. That wasn't so hard, was it? */
void convert_data(double** src, size_t x, size_t y, char* dst)
{
dst = write_int32((int32_t) x, dst);
dst = write_int32((int32_t) y, dst);
for(size_t i = 0; i < x; i++) {
for(size_t j = 0; j < y; j++) {
dst = write_double(src[i][j], dst);
}
}
*dst = '\0';
}
static char* write_int32(int32_t num, char* c)
{
char* byte;
int i = sizeof(int32_t);
/* Convert to network byte order */
num = htobe32(num);
byte = (char*) (&num);
while(i--) {
*c++ = *byte++;
}
return c;
}
static char* write_double(double d, char* c)
{
/* Here I'm assuming your C program uses IEEE 754 double precision natively.
If it doesn't, you should be able to convert into this format. A helper library most likely already exists for your platform.
Note that IEEE 754 doesn't define endianness, but in practice, normal platforms use the same byte order for doubles as they do for integers.
*/
char* byte;
int i = sizeof(uint64_t);
uint64_t num;
/* copy the bits with memcpy (from <string.h>); casting the pointer to uint64_t* would break strict aliasing */
memcpy(&num, &d, sizeof num);
/* convert to network byte order */
num = htobe64(num);
byte = (char*) (&num);
while(i--) {
*c++ = *byte++;
}
return c;
}
Java code to unserialize:
/* The raw char array from c is now read into the byte[] `bytes` in java */
DataInputStream stream = new DataInputStream(new ByteArrayInputStream(bytes));
int dim_x; int dim_y;
double[][] data;
try {
dim_x = stream.readInt();
dim_y = stream.readInt();
data = new double[dim_x][dim_y];
for(int i = 0; i < dim_x; ++i) {
for(int j = 0; j < dim_y; ++j) {
data[i][j] = stream.readDouble();
}
}
System.out.println("Client:");
System.out.println("Dimensions: "+dim_x+" x "+dim_y);
System.out.println("Data:");
for(int i = 0; i < dim_x; ++i) {
for(int j = 0; j < dim_y; ++j) {
System.out.print(" "+data[i][j]);
}
System.out.println();
}
} catch(IOException e) {
System.err.println("Error reading input");
System.err.println(e.getMessage());
System.exit(1);
}
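To exercise the Java side without the C server, a message in the same wire format can be produced in Java as well. The following is only a sketch under the assumptions stated in the C comment (two big-endian 32-bit dimensions, big-endian IEEE 754 doubles, a trailing NUL); the class name MatrixWire and its methods are made up for this illustration, and the array is assumed to be rectangular.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

final class MatrixWire {
    /** Writes rows, cols, then every element row by row (ByteBuffer is big-endian by default). */
    static byte[] encode(double[][] data) {
        int rows = data.length;
        int cols = rows == 0 ? 0 : data[0].length; // assumes a rectangular array
        ByteBuffer buf = ByteBuffer.allocate(8 + rows * cols * 8 + 1);
        buf.putInt(rows).putInt(cols);
        for (double[] row : data) {
            for (double v : row) {
                buf.putDouble(v);
            }
        }
        buf.put((byte) 0); // trailing NUL, mirroring the C serializer
        return buf.array();
    }

    /** Reads the format back the same way the client code above does. */
    static double[][] decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int rows = in.readInt();
            int cols = in.readInt();
            double[][] data = new double[rows][cols];
            for (int i = 0; i < rows; i++) {
                for (int j = 0; j < cols; j++) {
                    data[i][j] = in.readDouble();
                }
            }
            return data;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[][] m = {{0.1, -2.5}, {Math.PI, 1e300}};
        System.out.println(Arrays.deepToString(decode(encode(m))));
    }
}
```

Because the doubles travel as their raw 64 bits, the round trip should reproduce every value exactly, which is the property the question asks about.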

If you are writing a binary file, you should think of a good way to serialize the actual binary data (64bit) of your double. This could go from directly writing the content of the double to the file (minding endianness) to some more elaborate normalizing serialization schemes (e.g. with a well-defined representation of NaN). That's up to you really. If you expect to be basically among homogeneous architectures, a direct memory dump would probably suffice.
If you want to write to a text file and are looking for an ASCII representation, I would strongly discourage a decimal numerical representation. Instead, you could convert the 64-bit raw data to ASCII using base64 or something like that.
You really want to keep all the precision that you have in your double!
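As a rough illustration of the base64 idea, the sketch below turns the raw 64 bits of a double into ASCII text and back. The class name DoubleBase64 is hypothetical, and big-endian byte order is an assumption (it is simply ByteBuffer's default).

```java
import java.nio.ByteBuffer;
import java.util.Base64;

final class DoubleBase64 {
    /** Encodes the raw 8 bytes of the double (big-endian) as base64 text. */
    static String encode(double value) {
        byte[] raw = ByteBuffer.allocate(Double.BYTES).putDouble(value).array();
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Restores the double from the text produced by encode(). */
    static double decode(String text) {
        return ByteBuffer.wrap(Base64.getDecoder().decode(text)).getDouble();
    }

    public static void main(String[] args) {
        String text = encode(0.1);
        System.out.println(text + " -> " + decode(text));
    }
}
```

Each double costs 12 ASCII characters this way, and no decimal rounding is involved at any point.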

Related

How to see number representation in floating point binary

For example, I have the number 0.1:
double n = 0.1;
It's represented in IEEE-754 big endian as:
0 01111111011 1001100110011001100110011001100110011001100110011010
How can I output 0.1 in this binary format?
The Float class can do that for a 32-bit float through the method Float.floatToIntBits:
final int intBits = Float.floatToIntBits(4.1f);
final String binary = Integer.toBinaryString(intBits);
System.out.println(binary);
You can verify the result by toggling the individual bits in this converter:
https://www.h-schmidt.net/FloatConverter/IEEE754.html
To convert a double to binary you'll need to call Double.doubleToLongBits(x) and Long.toBinaryString(x).
So you could try String binary = Long.toBinaryString( Double.doubleToLongBits(0.1) );
To get a full 64-bit representation you'd then have to prepend as many 0s as needed.
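Put together, the conversion and the zero padding might look like the sketch below. The class and method names (DoubleBits, toBinary64) are made up for the example.

```java
final class DoubleBits {
    /** Returns all 64 bits of the double as a string, sign bit first. */
    static String toBinary64(double value) {
        String bits = Long.toBinaryString(Double.doubleToLongBits(value));
        StringBuilder sb = new StringBuilder(64);
        // Long.toBinaryString drops leading zeros, so put them back
        for (int i = bits.length(); i < 64; i++) {
            sb.append('0');
        }
        return sb.append(bits).toString();
    }

    public static void main(String[] args) {
        String s = toBinary64(0.1);
        // split into sign (1 bit), exponent (11 bits) and mantissa (52 bits)
        System.out.println(s.substring(0, 1) + " " + s.substring(1, 12) + " " + s.substring(12));
    }
}
```

For 0.1 the three groups printed by main should match the representation quoted in the question.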
Edit:
Since you asked for a C version, I'll try and add one (though I'm no C expert so I might miss something like std lib functions):
#include <stdio.h>
#include <stdint.h>
union binary {
double d;
uint64_t l;
} binary;
int main() {
union binary b;
b.d = 0.1; //set the value as double
uint64_t bin = b.l; //read the value as 64-bit unsigned integer
char c[65];
c[64] = '\0'; //string terminator
//iterate from 63 to 0
for( int i = sizeof(uint64_t) * 8 - 1; i >= 0; i--) {
if( bin & 1 ) {
c[i]='1';
} else {
c[i]='0';
}
bin >>= 1; //right-shift by 1, i.e. 0010 -> 0001 etc.
}
printf("%s\n",c);
return 0;
}
This basically uses a union that lets you write a double and read the same bytes back as a 64-bit unsigned integer (uint64_t, typically unsigned long long). The code then iterates over a copy of that integer, checks whether the lowest bit is set, sets the corresponding element of the character array, and finally right-shifts the bits by 1.
Note that with some pointer casting you could do the same without the union: double dbl = 0.1; uint64_t bin = *((uint64_t*)(&dbl)); (You'd need a variable here to have something to point to; alternatively, provide a function and take a pointer to the parameter.) Be aware that this cast violates C's strict aliasing rules, so the union or a memcpy into the integer is the safer choice.
A final warning though: you'll have to make sure that the data types you're using have equal size (i.e. they occupy exactly the same memory) or otherwise you'll run into access violations or other unpleasant stuff.

Fourier transforming a byte array

I am not so proficient in Java, so please keep it quite simple. I will, though, try to understand everything you post. Here's my problem.
I have written code to record audio from an external microphone and store that in a .wav. Storing this file is relevant for archiving purposes. What I need to do is a FFT of the stored audio.
My approach to this was loading the wav file as a byte array and transforming that, with the problem that 1. There's a header in the way I need to get rid of, but I should be able to do that and 2. I got a byte array, but most if not all FFT algorithms I found online and tried to patch into my project work with complex / two double arrays.
I tried to work around both these problems and finally was able to plot my FFT array as a graph, when I found out it was just giving me back "0"s. The .wav file is fine though, I can play it back without problems. I thought maybe converting the bytes into doubles was the problem for me, so here's my approach to that (I know it's not pretty)
byte ByteArray[] = Files.readAllBytes(wav_path);
String s = new String(ByteArray);
double[] DoubleArray = toDouble(ByteArray);
// build 2^n array, fill up with zeroes
boolean exp = false;
int i = 0;
int pow = 0;
while (!exp) {
pow = (int) Math.pow(2, i);
if (pow > ByteArray.length) {
exp = true;
} else {
i++;
}
}
System.out.println(pow);
double[] Filledup = new double[pow];
for (int j = 0; j < DoubleArray.length; j++) {
Filledup[j] = DoubleArray[j];
System.out.println(DoubleArray[j]);
}
for (int k = DoubleArray.length; k < Filledup.length; k++) {
Filledup[k] = 0;
}
This is the function I'm using to convert the byte array into a double array:
public static double[] toDouble(byte[] byteArray) {
ByteBuffer byteBuffer = ByteBuffer.wrap(byteArray);
double[] doubles = new double[byteArray.length / 8];
for (int i = 0; i < doubles.length; i++) {
doubles[i] = byteBuffer.getDouble(i * 8);
}
return doubles;
}
The header still is in there, I know that, but that should be the smallest problem right now. I transformed my byte array to a double array, then filled up that array to the next power of 2 with zeroes, so that the FFT can actually work (it needs an array of 2^n values). The FFT algorithm I'm using gets two double arrays as input, one being the real, the other being the imaginary part. I read, that for this to work, I'd have to keep the imaginary array empty (but its length being the same as the real array).
Worth mentioning: I'm recording at 44100 Hz (44.1 kHz), 16 bit, mono.
If necessary, I'll post the FFT I'm using.
If I try to print the values of the double array, I get kind of weird results:
...
-2.0311904060823147E236
-1.3309975624948503E241
1.630738286366793E-260
1.0682002560745842E-255
-5.961832069690704E197
-1.1476447092561027E164
-1.1008407401197794E217
-8.109566204271759E298
-1.6104556241572942E265
-2.2081172620352248E130
NaN
3.643749694745671E-217
-3.9085815506127892E202
-4.0747557114875874E149
...
I know that somewhere the problem lies with me overlooking something very simple I should be aware of, but I can't seem to find the problem. My question finally is: How can I get this to work?
There's a header in the way I need to get rid of […]
You need to use javax.sound.sampled.AudioInputStream to read the file if you want to "skip" the header. This is useful to learn anyway, because you would need the data in the header to interpret the bytes if you did not know the exact format ahead of time.
I'm recording at 44100 Hz (44.1 kHz), 16 bit, mono.
So, this almost certainly means the data in the file is encoded as 16-bit integers (short in Java nomenclature).
Right now, your ByteBuffer code makes the assumption that it's already 64-bit floating point and that's why you get strange results. In other words, you are reinterpreting the binary short data as if it were double.
What you need to do is read in the short data and then convert it to double.
For example, here's a rudimentary routine that does what you're trying to do (supporting 8-, 16-, 32- and 64-bit signed integer PCM):
import javax.sound.sampled.*;
import javax.sound.sampled.AudioFormat.Encoding;
import java.io.*;
import java.nio.*;
static double[] readFully(File file)
throws UnsupportedAudioFileException, IOException {
AudioInputStream in = AudioSystem.getAudioInputStream(file);
AudioFormat fmt = in.getFormat();
byte[] bytes;
try {
if(fmt.getEncoding() != Encoding.PCM_SIGNED) {
throw new UnsupportedAudioFileException();
}
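// read the data fully; a single read() may return fewer bytes than
// requested and available() is only an estimate, so loop until EOF
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buf = new byte[8192];
int n;
while((n = in.read(buf)) != -1) {
out.write(buf, 0, n);
}
bytes = out.toByteArray();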
} finally {
in.close();
}
int bits = fmt.getSampleSizeInBits();
double max = Math.pow(2, bits - 1);
ByteBuffer bb = ByteBuffer.wrap(bytes);
bb.order(fmt.isBigEndian() ?
ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
double[] samples = new double[bytes.length * 8 / bits];
// convert sample-by-sample to a scale of
// -1.0 <= samples[i] < 1.0
for(int i = 0; i < samples.length; ++i) {
switch(bits) {
case 8: samples[i] = ( bb.get() / max );
break;
case 16: samples[i] = ( bb.getShort() / max );
break;
case 32: samples[i] = ( bb.getInt() / max );
break;
case 64: samples[i] = ( bb.getLong() / max );
break;
default: throw new UnsupportedAudioFileException();
}
}
return samples;
}
The FFT algorithm I'm using gets two double arrays as input, one being the real, the other being the imaginary part. I read, that for this to work, I'd have to keep the imaginary array empty (but its length being the same as the real array).
That's right. The real part is the audio sample array from the file, the imaginary part is an array of equal length, filled with 0's e.g.:
double[] realPart = mySamples;
double[] imagPart = new double[realPart.length];
myFft(realPart, imagPart);
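The zero padding to a power-of-two length that the question does with a loop can be written more compactly. The sketch below is one possible way; FftPadding and its methods are hypothetical names, not part of any FFT library. Unlike the loop in the question, it leaves an array whose length is already a power of two unchanged instead of doubling it.

```java
import java.util.Arrays;

final class FftPadding {
    /** Smallest power of two that is greater than or equal to n (minimum 1). */
    static int nextPowerOfTwo(int n) {
        if (n < 1) {
            return 1;
        }
        if (n > (1 << 30)) {
            throw new IllegalArgumentException("no int power of two >= " + n);
        }
        int high = Integer.highestOneBit(n);
        return high == n ? n : high << 1;
    }

    /** Copies the samples into a power-of-two sized array; the tail stays 0.0. */
    static double[] padToPowerOfTwo(double[] samples) {
        return Arrays.copyOf(samples, nextPowerOfTwo(samples.length));
    }

    public static void main(String[] args) {
        double[] real = padToPowerOfTwo(new double[1000]);
        double[] imag = new double[real.length]; // all zeros, same length as real
        System.out.println(real.length + " " + imag.length);
    }
}
```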
More info... "How do I use audio sample data from Java Sound?"
The samples in a wave file are not going to be already 8-byte doubles that can be directly copied as per your posted code.
You need to look up (partly from the WAVE header format and partly from the RIFF specification) the data type, format, length and endianness of the samples before converting them to doubles.
Try 2 byte little-endian signed integers as a likely possibility.
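If the samples do turn out to be 2-byte little-endian signed integers, the conversion could be sketched as follows. This assumes the header has already been removed from the byte array and that the data is mono; the class name Pcm16 is invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class Pcm16 {
    /** Converts raw 16-bit little-endian PCM bytes to doubles in the range -1.0 <= x < 1.0. */
    static double[] toDoubles(byte[] pcm) {
        ByteBuffer bb = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[pcm.length / 2]; // two bytes per sample
        for (int i = 0; i < samples.length; i++) {
            samples[i] = bb.getShort() / 32768.0;
        }
        return samples;
    }

    public static void main(String[] args) {
        byte[] pcm = {0x00, 0x40, 0x00, (byte) 0x80}; // 16384 and -32768
        for (double s : toDoubles(pcm)) {
            System.out.println(s);
        }
    }
}
```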

Byte to "Bit"array

A byte is the smallest numeric data type Java offers, but yesterday I came into contact with byte streams for the first time: at the beginning of every packet a marker byte is sent which gives further instructions on how to handle the packet. Every bit of the byte has a specific meaning, so I need to untangle the byte into its 8 bits.
You could probably convert the byte to a boolean array or create a switch for every case, but that certainly can't be the best practice.
How is this possible in Java? Why are there no bit data types in Java?
Because no bit data type exists on the physical computer. The smallest unit you can allocate on most modern computers is a byte, which is also known as an octet or 8 bits. When you display a single bit you are really just pulling that bit out of the byte with arithmetic and putting it into a new byte, which still uses 8 bits of space. If you want to put bit data inside of a byte you can, but it will be stored as at least a single byte no matter what programming language you use.
You could load the byte into a BitSet. This abstraction hides the gory details of manipulating single bits.
import java.util.BitSet;
public class Bits {
public static void main(String[] args) {
byte[] b = new byte[]{10};
BitSet bitset = BitSet.valueOf(b);
System.out.println("Length of bitset = " + bitset.length());
for (int i=0; i<bitset.length(); ++i) {
System.out.println("bit " + i + ": " + bitset.get(i));
}
}
}
$ java Bits
Length of bitset = 4
bit 0: false
bit 1: true
bit 2: false
bit 3: true
You can ask for any bit, but the length tells you that all the bits past length() - 1 are set to 0 (false):
System.out.println("bit 75: " + bitset.get(75));
bit 75: false
Have a look at java.util.BitSet.
You might use it to interpret the byte read and can use the get method to check whether a specific bit is set like this:
byte b = (byte) stream.read(); // read() returns an int; -1 signals end of stream
final BitSet bitSet = BitSet.valueOf(new byte[]{b});
if (bitSet.get(2)) {
state.activateComponentA();
} else {
state.deactivateComponentA();
}
state.setFeatureBTo(bitSet.get(1));
On the other hand, you can build your own bitmask just as easily and convert it to a byte array (or a single byte) afterwards:
final BitSet output = new BitSet(8);
output.set(3, state.isComponentXActivated());
if (state.isY){
output.set(4);
}
// toByteArray() returns an empty array when no bit is set
final byte[] bytes = output.toByteArray();
final byte w = bytes.length == 0 ? 0 : bytes[0];
How is this possible in Java? Why are there no bit data types in Java?
There are no bit data types in most languages. And most CPU instruction sets have few (if any) instructions dedicated to addressing single bits. You can think of the lack of these as a trade-off between (language or CPU) complexity and need.
Manipulating a single bit can be thought of as a special case of manipulating multiple bits, and languages as well as CPUs are equipped for the latter.
Very common operations like testing, setting, clearing, inverting as well as exclusive or are all supported on the integer primitive types (byte, short/char, int, long), operating on all bits of the type at once. By choosing the parameters appropriately you can select which bits to operate on.
If you think about it, a byte array is a bit array where the bits are grouped in packages of 8. Addressing a single bit in the array is relatively simple using bitwise operators (AND &, OR |, XOR ^ and NOT ~).
For example, testing if bit N is set in a byte can be done using a logical AND with a mask where only the bit to be tested is set:
public boolean testBit(byte b, int n) {
int mask = 1 << n; // equivalent of 2 to the nth power
return (b & mask) != 0;
}
Extending this to a byte array is no magic either, each byte consists of 8 bits, so the byte index is simply the bit number divided by 8, and the bit number inside that byte is the remainder (modulo 8):
public boolean testBit(byte[] array, int n) {
int index = n >>> 3; // divide by 8
int mask = 1 << (n & 7); // n modulo 8
return (array[index] & mask) != 0;
}
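The same masking idea can unpack a whole marker byte into eight flags and pack them again. The following is a small sketch; MarkerByte, toFlags and fromFlags are hypothetical names, and bit 0 is taken to be the least significant bit.

```java
import java.util.Arrays;

final class MarkerByte {
    /** flags[n] is true when bit n (0 = least significant) of b is set. */
    static boolean[] toFlags(byte b) {
        boolean[] flags = new boolean[8];
        for (int n = 0; n < 8; n++) {
            flags[n] = ((b >> n) & 1) != 0;
        }
        return flags;
    }

    /** Inverse of toFlags: sets bit n for every true flags[n]. */
    static byte fromFlags(boolean[] flags) {
        int value = 0;
        for (int n = 0; n < 8; n++) {
            if (flags[n]) {
                value |= 1 << n;
            }
        }
        return (byte) value;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toFlags((byte) 10)));
    }
}
```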
Here is a sample that I hope is useful for you:
DatagramSocket socket = new DatagramSocket(6160, InetAddress.getByName("0.0.0.0"));
socket.setBroadcast(true);
while (true) {
byte[] recvBuf = new byte[26];
DatagramPacket packet = new DatagramPacket(recvBuf, recvBuf.length);
socket.receive(packet);
String bitArray = toBitArray(recvBuf);
System.out.println(Integer.parseInt(bitArray.substring(0, 8), 2)); // convert first byte binary to decimal
System.out.println(Integer.parseInt(bitArray.substring(8, 16), 2)); // convert second byte binary to decimal
}
public static String toBitArray(byte[] byteArray) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < byteArray.length; i++) {
sb.append(String.format("%8s", Integer.toBinaryString(byteArray[i] & 0xFF)).replace(' ', '0'));
}
return sb.toString();
}

Current best way to populate mixed type byte array

I'm trying to send and receive a byte stream in which certain ranges of bytes represent different pieces of data. I've found ways to convert single primitive datatypes into bytes, but I'm wondering if there's a straightforward way to place certain pieces of data into specified byte regions.
For example, I might need to produce or read something like the following:
byte 1 - int
byte 2-5 - int
byte 6-13 - double
byte 14-21 - double
byte 25 - int
byte 26-45 - string
Any suggestions would be appreciated.
Try DataOutputStream/DataInputStream or, for arrays, the ByteBuffer class.
For storing the integer in X bytes, you may use the following method. If you think it is badly named, you may use the much less descriptive i2os name which is used in several (crypto) algorithm descriptions. Note that the returned octet string uses Big Endian encoding of unsigned ints, which you should specify for your protocol.
public static byte[] possitiveIntegerToOctetString(
final long value, final int octets) {
if (value < 0) {
throw new IllegalArgumentException("Cannot encode negative values");
}
if (octets < 1) {
throw new IllegalArgumentException("Cannot encode a number in negative or zero octets");
}
final int longSizeBytes = Long.SIZE / Byte.SIZE;
final int byteBufferSize = Math.max(octets, longSizeBytes);
final ByteBuffer buf = ByteBuffer.allocate(byteBufferSize);
for (int i = 0; i < byteBufferSize - longSizeBytes; i++) {
buf.put((byte) 0x00);
}
buf.mark();
buf.putLong(value);
// more bytes than long encoding
if (octets >= longSizeBytes) {
return buf.array();
}
// less bytes than long encoding (reset to mark first)
buf.reset();
for (int i = 0; i < longSizeBytes - octets; i++) {
if (buf.get() != 0x00) {
throw new IllegalArgumentException("Value does not fit in " + octets + " octet(s)");
}
}
final byte[] result = new byte[octets];
buf.get(result);
return result;
}
EDIT: before storing the string, think of a padding mechanism (spaces are the most common choice) and a character encoding, e.g. String.getBytes(StandardCharsets.US_ASCII) or StandardCharsets.ISO_8859_1 (Latin-1). Those are the most common encodings with a single byte per character. Calculating the size of UTF-8 is slightly more difficult (encode first, then add 0x20-valued bytes at the end using ByteBuffer).
You may want to consider having a constant size for each data type. For example, a 32-bit Java int takes up 4 bytes, a long takes 8, etc. In fact, if you use Java's DataInputStream and DataOutputStream, you'll basically be doing that anyway. They have really nice methods like readInt/writeInt, etc.
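A fixed-size record along those lines could be sketched with ByteBuffer as follows. The layout here (a 1-byte int, a 4-byte int, two doubles and a 20-byte space-padded ASCII string) is a simplified, hypothetical variant of the one in the question, and MixedRecord with its methods is invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MixedRecord {
    static final int NAME_LENGTH = 20;
    static final int SIZE = 1 + 4 + 8 + 8 + NAME_LENGTH; // 41 bytes in total

    /** Packs the fields back to back, big-endian (ByteBuffer's default order). */
    static byte[] write(int kind, int id, double x, double y, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        if (nameBytes.length > NAME_LENGTH) {
            throw new IllegalArgumentException("name longer than " + NAME_LENGTH + " bytes");
        }
        byte[] padded = new byte[NAME_LENGTH];
        Arrays.fill(padded, (byte) ' '); // pad with 0x20
        System.arraycopy(nameBytes, 0, padded, 0, nameBytes.length);
        return ByteBuffer.allocate(SIZE)
                .put((byte) kind)
                .putInt(id)
                .putDouble(x)
                .putDouble(y)
                .put(padded)
                .array();
    }

    /** Reads the trailing string field and strips the padding. */
    static String readName(byte[] record) {
        return new String(record, SIZE - NAME_LENGTH, NAME_LENGTH, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) {
        byte[] record = write(7, 1234, 1.5, -2.25, "sensor-A");
        ByteBuffer in = ByteBuffer.wrap(record);
        System.out.println(in.get() + " " + in.getInt() + " " + in.getDouble()
                + " " + in.getDouble() + " " + readName(record));
    }
}
```

ByteBuffer also offers absolute variants such as getInt(index) and getDouble(index), which can help when a field has to be read from a specific byte offset.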

Send integers from an Arduino to a Java program using a serial port and converting a two-byte array to an integer in Java

I want to send integer values from my Arduino to a Java application. To do so, I am using a serial port. In the processing program for the Arduino I am printing the following to the serial port:
Serial.print(accel, DEC)
accel is a signed int. In my Java code I am implementing a SerialPortEventListener, and I am reading an array of bytes from the inputstream input.
public synchronized void serialEvent(SerialPortEvent oEvent) {
if (oEvent.getEventType() == SerialPortEvent.DATA_AVAILABLE) {
try {
int available = input.available();
byte[] chunk = new byte[available];
input.read(chunk);
System.out.println (new String(chunk));
} catch (Exception e) {
System.err.println(e.toString());
}
}
}
I need to convert this array of bytes into integers, but I am not sure how to do this. The Arduino tutorial Lesson 4 - Serial communication and playing with data says that the Arduino stores signed ints in two bytes (16 bits). I have tried reading just two bytes of data into an array chunk and then decoding those two bytes into integer with the following code.
public synchronized void serialEvent(SerialPortEvent oEvent) {
if (oEvent.getEventType() == SerialPortEvent.DATA_AVAILABLE) {
try {
int available = input.available();
int n = 2;
byte[] chunk = new byte[n];
input.read(chunk);
int value = 0;
value |= chunk[0] & 0xFF;
value <<= 8;
value |= chunk[1] & 0xFF;
System.out.println(value);
} catch (Exception e) {
System.err.println(e.toString());
}
}
}
This should print out a stream of integers fluctuating between -512 and -516, but it doesn’t. The code prints out the following.
2949173
3211317
851978
2949173
3211317
851978
2949173
3211319
How can the bytes coming from the Arduino through the serial port to integers be decoded?
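You can use ByteBuffer to convert the bytes to a number. Note that getInt() consumes four bytes, so for a two-byte chunk use getShort() instead; your code will look something like this:
ByteBuffer bb = ByteBuffer.wrap(chunk);
System.out.println(bb.getShort());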
Edit: I should have recognized the ascii digits :-)
Why not make a line-oriented protocol? Use a Java BufferedReader and .readLine(), then parse the strings using Integer.valueOf(). Note that the newline is used as a framing character.
This is how most GPS devices and all sorts of quite important devices are used.
Not optimal in any way, but there you go...
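A minimal sketch of such a line-oriented reader is shown below. It assumes the Arduino sends each value on its own line (for example with Serial.println instead of Serial.print); LineProtocol and readInts are hypothetical names, and the serial port is replaced by an in-memory stream so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LineProtocol {
    /** Reads one decimal integer per line until the stream ends. */
    static List<Integer> readInts(InputStream in) {
        List<Integer> values = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) { // skip blank lines between values
                    values.add(Integer.valueOf(line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] fake = "-512\r\n-514\r\n-516\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readInts(new ByteArrayInputStream(fake)));
    }
}
```

On a live serial port the stream normally never ends, so there the loop would process each line as it arrives rather than collect everything into a list.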
It turns out that the Serial.print() function on the Arduino converts the value to a string and sends the string to the serial port, which is why my bit shifting wasn't working. One solution is to collect the values in my Java code as a string and then convert the string to an int.
However, to avoid the step of converting the values from string to int, you can use the Serial.write() function to send ints to the serial port. To do this, in the Arduino sketch, define a union of an int and an array of bytes of the same size. Then create an instance of the union, set the integer, and Serial.write() the matching byte array. Collect the byte array in the Java program and do a bit shift to get the integer.
My Arduino sketch is coded like this:
union{
int i;
byte b[2];
}u;
u.i = accel;
Serial.write(u.b, 2);
My java code for reading the byte array and converting it to an int looks like this:
int n = 2;
byte[] chunk = new byte[n];
input.read(chunk);
short value = 0;
// get 2 bytes, unsigned 0..255
int low = chunk[0] & 0xff;
int high = chunk[1] & 0xff;
value = (short) (high << 8 | low) ;
System.out.println(value);
I am still looking for a way to get a float or double from the Arduino into the Java code. I know I can write the float or double by replacing the int in the union with a double or float, setting the double or float, and calling Serial.write() on the matching byte array. My problem comes on the Java side. Once I have collected the byte array I have not found a way to correctly bit shift the data into the float or double. Any help here?
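One possible approach on the Java side avoids manual bit shifting altogether and lets ByteBuffer reinterpret the bytes. This is a sketch under the assumption that the board sends a 4-byte IEEE 754 float in little-endian order, as AVR-based Arduinos do (on those boards a double is also 4 bytes, so it is received the same way). ArduinoFloat is a hypothetical class name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ArduinoFloat {
    /** Interprets four little-endian bytes as an IEEE 754 single-precision float. */
    static float decode(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).order(ByteOrder.LITTLE_ENDIAN).getFloat();
    }

    public static void main(String[] args) {
        // 1.0f is 0x3F800000; the least significant byte is sent first
        byte[] chunk = {0x00, 0x00, (byte) 0x80, 0x3F};
        System.out.println(decode(chunk));
    }
}
```

The chunk must contain all four bytes before it is decoded, so the reading code has to keep reading until that many bytes have arrived.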
My problem was similar. I used Arduino Serial.write to send a set of known int values (2 bytes per int). The Arduino serial monitor interpreted those values correctly. However, the Java program got all the numbers wrong (except for zero). Apparently the byte order is reversed: the Arduino sends the low byte first, while ByteBuffer reads big-endian by default. I would expect byte order to be the problem with other data types output by the Arduino as well. jSerialComm was used for reading the serial port. Here's the Java code:
// declarations
byte[] byter = new byte[16];
ByteBuffer bufr = ByteBuffer.wrap(byter);
short[] shortr = new short[8];
// code
comPort.readBytes(byter, 16);
for(int i=0; i<8; i++){
shortr[i] = Short.reverseBytes(bufr.getShort(2*i));
}
