Cannot read in Java data serialized in C++ - java

I have a Java client connected via socket to a C++ server.
The C++ server sends back to the client serialized objects.
However, serialization works differently in Java and C++, so I cannot read the objects this way:
objectInputStream.readObject();
This forces me to read each single value of the object manually:
byte[] buffer = read(FOUR_BYTES);
int flag = convertBufferToInt(buffer);
buffer = read(FOUR_BYTES); // reuse the variable; declaring it a second time would not compile
float price = convertBufferToFloat(buffer);
// More stuff
myObject.setFlag(flag);
myObject.setPrice(price);
// More stuff
That's very hard to maintain. Isn't there an easier way to fill in my object with data?

To solve this in general you would need to write a Java parser for objects serialized in C++. This is no small task.
Rather, I would recommend that you find a serialization format that is easy to parse and to share between your Java and C++ programs, preferably one for which both Java and C++ libraries exist for serialization/deserialization. JSON and Google Protocol Buffers are obvious candidates.

Yes there is (are). You have 2 options using only the standard library:
Using the DataInputStream class
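Check out the DataInputStream class. It has methods to read values of primitive types, such as readByte(), readInt(), readLong(), readFloat(), readChar() and readUTF() (which reads a String in Java's modified UTF-8 encoding, prefixed with a 2-byte length). Note that it always reads multi-byte values in big-endian order.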
So your code becomes as simple as:
// Obtain InputStream from Socket:
InputStream is = ...;
// Create DataInputStream:
DataInputStream dis = new DataInputStream(is);
myObject.setFlag(dis.readInt());
myObject.setPrice(dis.readFloat());
Using the ByteBuffer class
For this you first have to read the whole data into a byte array. Once you've done that, you can create a ByteBuffer using the ByteBuffer.wrap(byte[] array) method. The ByteBuffer class also supports reading primitive types, just like the DataInputStream class.
The good thing about ByteBuffer is that it supports changing the byte order (the order in which the low and high bytes of a multi-byte value like int are read/written) via ByteBuffer.order(ByteOrder bo). This is very useful if you're communicating with systems which use a different byte order (which might apply in your case).
Example using ByteBuffer:
// Read all your input data:
byte[] data = ...;
// Create ByteBuffer:
ByteBuffer bb = ByteBuffer.wrap(data);
myObject.setFlag(bb.getInt());
myObject.setPrice(bb.getFloat());
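To make the byte-order point concrete, here is a minimal sketch. It assumes the C++ server writes a 32-bit int followed by a 32-bit IEEE 754 float, both little-endian and without padding. That layout is a guess about your server, and WireRecord is a made-up name, so adjust both to match what the server really sends.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical holder for the two fields from the question.
final class WireRecord {
    final int flag;
    final float price;

    WireRecord(int flag, float price) {
        this.flag = flag;
        this.price = price;
    }

    // Decodes an 8-byte record: a 32-bit int followed by a 32-bit float.
    // ASSUMPTION: the sender uses little-endian order and no struct padding.
    static WireRecord decode(byte[] data) {
        ByteBuffer bb = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int flag = bb.getInt();
        float price = bb.getFloat();
        return new WireRecord(flag, price);
    }
}
```

If the server turns out to send big-endian ("network order") values, dropping the order(...) call is enough, because big-endian is ByteBuffer's default.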

Related

Best way for transfering files via socket in java

I have a BufferedReader object and a PrintWriter object, so I can pass String objects produced by json-io from any type (e.g. List, Map, MyOwnClass).
My class has a byte[] attribute that holds the bytes of a file, such as an image.
The JSON generated from my class is, obviously, very large, so I started to think there must be a better way to transfer files.
Should I change the whole mechanism to transfer only byte[] instead of String? Does anyone know what mechanism chat programs use? Should I reserve the first 20 bytes of the array for the message identification?
I would write it to the socket in binary:
Assuming a class with one String and one byte[].
The String
The length of the String is written with DataOutputStream.writeInt(int) (or methods for smaller integers) and then OutputStream.write(byte[]) on the return value of String.getBytes(String) with the charset explicitly specified.
The byte[]
The length is written with DataOutputStream.writeInt(int) (or methods for smaller integers) and then OutputStream.write(byte[]) for the byte[] to transfer.
On the other side you would do the exact opposite of this procedure.
I chose this binary approach over JSON because even though you could transmit the byte[] with JSON almost as efficiently as in binary, it would defeat the very purpose of JSON: being human-readable.
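A rough sketch of the procedure described above, for a class with one String and one byte[]. FileMessage, writeTo and readFrom are names invented for this illustration, and UTF-8 is an assumed charset; any charset works as long as both sides agree on it.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical message type: one String and one byte[].
final class FileMessage {
    final String name;
    final byte[] content;

    FileMessage(String name, byte[] content) {
        this.name = name;
        this.content = content;
    }

    // Writes: int length + UTF-8 bytes of the name, then int length + raw content.
    static void writeTo(OutputStream rawOut, FileMessage m) {
        try {
            DataOutputStream out = new DataOutputStream(rawOut);
            byte[] nameBytes = m.name.getBytes(StandardCharsets.UTF_8);
            out.writeInt(nameBytes.length);
            out.write(nameBytes);
            out.writeInt(m.content.length);
            out.write(m.content);
            out.flush(); // flush but do not close: the socket stays usable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the fields back in exactly the order they were written.
    static FileMessage readFrom(InputStream rawIn) {
        try {
            DataInputStream in = new DataInputStream(rawIn);
            byte[] nameBytes = new byte[in.readInt()];
            in.readFully(nameBytes); // blocks until all bytes have arrived
            byte[] content = new byte[in.readInt()];
            in.readFully(content);
            return new FileMessage(new String(nameBytes, StandardCharsets.UTF_8), content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In real code you would probably want to check the received lengths against a sane upper bound before allocating the arrays, since the values come from the network.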

Using Java's ByteBuffer to replicate Python's struct.pack

First off, I saw Java equivalent of Python's struct.pack?... this is a clarification.
I am new to Java and trying to mirror some of the techniques that I have used in Python. I am trying to send data over the network, and want to ensure I know what it looks like. In Python, I would use struct.pack. For example:
data = struct.pack('i', 10)
data += b"Some string"
data += struct.pack('i', 500)
print(data)
That would print the packed portions in byte order with the string in plaintext in the middle.
I tried to replicate that with ByteBuffer:
String somestring = "Some string";
ByteBuffer buffer = ByteBuffer.allocate(100);
buffer.putInt(10);
buffer.put(somestring.getBytes());
buffer.putInt(500);
System.out.println(buffer.array());
What part am I not understanding?
That sounds more complicated than you really need.
I suggest using DataOutputStream and BufferedOutputStream:
DataOutputStream dos = new DataOutputStream(
new BufferedOutputStream(socket.getOutputStream()));
dos.writeInt(10);
dos.writeUTF("some string"); // this includes a 16-bit unsigned length
dos.writeInt(500);
This avoids creating more objects than needed by writing directly to the stream.
If you use https://github.com/raydac/java-binary-block-parser, the code becomes much simpler:
JBBPOut.BeginBin().Int(10).Utf8("Some string").Int(500).End().toByteArray();
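For completeness, a sketch that stays with ByteBuffer, in case the confusion comes from the output. As far as I can tell, System.out.println(buffer.array()) prints the array's identity (something like [B@1b6d3586), not its contents, and buffer.array() on a 100-byte buffer also includes the unused trailing zeros. The sketch below sizes the buffer exactly and prints with Arrays.toString. It assumes little-endian output, which is what struct.pack('i', ...) produces on a typical x86 machine, whereas ByteBuffer defaults to big-endian. PackDemo is a name made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PackDemo {
    // Packs an int, the raw bytes of a string, and a second int, roughly like
    // struct.pack('<i', first) + text + struct.pack('<i', second) in Python.
    // ASSUMPTION: little-endian ints and UTF-8 text.
    static byte[] pack(int first, String text, int second) {
        byte[] textBytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(4 + textBytes.length + 4)
                .order(ByteOrder.LITTLE_ENDIAN);
        buffer.putInt(first);
        buffer.put(textBytes);
        buffer.putInt(second);
        return buffer.array(); // exactly sized, so there are no trailing zeros
    }

    public static void main(String[] args) {
        // Arrays.toString shows the byte values instead of the array identity.
        System.out.println(Arrays.toString(pack(10, "Some string", 500)));
    }
}
```

Unlike writeUTF(), this writes the string without a length prefix, so the receiver has to know where the text ends.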

How to read data in Java from a Python struct.pack

I am using a Java UDP DatagramPacket to receive a Python packet which contains a struct.pack string. How can I unpack it in Java?
If you have the packet as a byte[] array in Java, you can use java.io.ByteArrayInputStream to create an InputStream from it, which can be wrapped in a java.io.DataInputStream, which provides methods to read several simple data types.
Be aware that DataInputStream works with big endian. If the sender uses little endian, some byte juggling will be necessary for multi-byte types.
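The "byte juggling" could look like the sketch below. It assumes, purely for illustration, that the Python side sent struct.pack('<if', id, value), i.e. a little-endian int followed by a little-endian float. StructUnpack and its methods are invented names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class StructUnpack {
    // DataInputStream reads big-endian, so swap the bytes for little-endian data.
    static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    static float readFloatLE(DataInputStream in) throws IOException {
        return Float.intBitsToFloat(Integer.reverseBytes(in.readInt()));
    }

    // ASSUMPTION: payload was produced by struct.pack('<if', id, value).
    // Only the first 'length' bytes of the array are treated as valid data.
    static String describe(byte[] payload, int length) {
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(payload, 0, length))) {
            int id = readIntLE(in);
            float value = readFloatLE(in);
            return id + ":" + value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a received DatagramPacket named packet this would be called as describe(packet.getData(), packet.getLength()), since getData() returns the whole receive buffer rather than just the bytes that arrived. If the sender used '>' or '!' in the format string, the plain readInt() and readFloat() calls work without any swapping.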

Java and Binary data in the context of sockets

Java newbie here. Are there any helper functions to serialize data in and out of byte arrays? I am writing a Java package that implements a network protocol, so I have to write some typical variables like a version (1 byte), a sequence number (long) and binary data (bytes) in a loop. How do I do this in Java? Coming from C, I am thinking of creating a byte array of the required size and then, since there is no memcpy(), converting the long into a temporary byte array and copying it into the actual byte array. That seems inefficient and also really error-prone. Is there a class I could use to marshal and unmarshal parameters to a byte array?
Also, why do all the Socket classes only deal with char[] and not byte[]? A socket by definition has to deal with binary data as well. How is this done in Java?
I am sure what I am missing is the Java mindset. I'd appreciate it if someone could point it out to me.
EDIT: I did look at DataOutputStream and DataInputStream, but I can only convert the bytes to a String, not to a byte[], which means information might be lost in the conversion when writing to a socket.
Pav
Have a look at DataInputStream, DataOutputStream, ObjectInputStream and ObjectOutputStream. Check first if the layout of the data is acceptable to you. Also, Serialization.
Sockets neither deal with char[] nor with byte[] but with InputStream and OutputStream which are used to read and write bytes.
If you are sending the data over a socket, then you don't need a temporary byte array at all; you can wrap the socket's OutputStream with DataOutputStream or ObjectOutputStream and just write what you want to write.
There might be an aspect I've missed that means you do actually need temporary byte arrays. If so, look at ByteArrayOutputStream. Also, there's no memcpy(), sure, but there is System.arraycopy.
As above, DataInputStream and DataOutputStream are exactly what you are looking for. Re your comment about String: if you're planning to use Java Strings over the wire, you're not designing a network protocol, you're designing a Java protocol. There are readUTF() and writeUTF() if you're sure the other end is Java or if you can code the other end to understand these formats. Or you can send the text as bytes along with the appropriate charset, or predefine the charset for the entire protocol if that makes sense.
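Putting the suggestions above together for the fields named in the question (a 1-byte version, a long sequence number and binary data), a sketch might look like this. Packet, marshal and unmarshal are hypothetical names, and the 4-byte length prefix in front of the payload is my own assumption about the protocol.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical protocol unit: version, sequence number, opaque payload.
final class Packet {
    final byte version;
    final long sequence;
    final byte[] payload;

    Packet(byte version, long sequence, byte[] payload) {
        this.version = version;
        this.sequence = sequence;
        this.payload = payload;
    }

    // Layout (big-endian): 1 byte version, 8 bytes sequence,
    // 4 bytes payload length (ASSUMED framing), then the payload itself.
    static byte[] marshal(Packet p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(p.version);
            out.writeLong(p.sequence);
            out.writeInt(p.payload.length);
            out.write(p.payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Packet unmarshal(byte[] wire) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            byte version = in.readByte();
            long sequence = in.readLong();
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            return new Packet(version, sequence, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

When writing straight to a socket, the ByteArrayOutputStream can be replaced by the socket's OutputStream, and no temporary array is needed at all.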

Should I use DataInputStream or BufferedInputStream

I want to read each line from a text file and store them in an ArrayList (each line being one entry in the ArrayList).
So far I understand that a BufferedInputStream writes to the buffer and only does another read once the buffer is empty which minimises or at least reduces the amount of operating system operations.
Am I correct - do I make sense?
If the above is the case in what situations would anyone want to use DataInputStream. And finally which of the two should I be using and why - or does it not matter.
Use a normal InputStream (e.g. FileInputStream) wrapped in an InputStreamReader and then wrapped in a BufferedReader - then call readLine on the BufferedReader.
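A minimal sketch of that wrapping, collecting every line into an ArrayList. LineLoader is a made-up name, and UTF-8 is an assumption about the file's encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LineLoader {
    // Reads all lines from the stream; line terminators are not included.
    // ASSUMPTION: the text is UTF-8 encoded.
    static List<String> readLines(InputStream source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

For a file you would pass something like new FileInputStream("input.txt"), where the file name is only a placeholder.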
DataInputStream is good for reading primitives, length-prefixed strings etc.
The two classes are not mutually exclusive - you can use both of them if your needs suit.
As you picked up, BufferedInputStream is about reading in blocks of data rather than a single byte at a time. It does not itself offer readLine(); that convenience method belongs to BufferedReader, its character-stream counterpart. BufferedInputStream can also be used for peeking at data further in the stream and then rolling back to a previous part of the stream if required (see the mark() and reset() methods).
DataInputStream/DataOutputStream provide convenience methods for reading/writing certain data types. For example, they have methods to write/read a UTF String. If you were to do this yourself, you'd have to decide how to determine the end of the String (i.e. with a terminator byte or by specifying the length of the string).
This is different from BufferedReader's readLine() which, as the name suggests, only returns a single line. writeUTF()/readUTF() deal with Strings, and such a String can contain as many lines as it wants.
BufferedReader is suitable for most text processing purposes. If you're doing something special like trying to serialize the fields of a class to a file, you'd want to use DataInputStream/DataOutputStream, as they offer greater control of the data at a binary level.
Hope that helps.
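To illustrate the mark()/reset() peeking mentioned above, here is a small sketch that checks whether a stream begins with some expected bytes and then rewinds, so the caller can still read the stream from the start. PeekDemo and startsWith are names invented for this example.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class PeekDemo {
    // Returns true if the next bytes in the stream equal 'magic'.
    // The stream position is restored before returning.
    static boolean startsWith(BufferedInputStream in, byte[] magic) {
        try {
            in.mark(magic.length); // remember the current position
            byte[] head = new byte[magic.length];
            int n = 0;
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break; // stream ended before enough bytes arrived
                }
                n += r;
            }
            in.reset(); // roll back to the marked position
            return n == magic.length && Arrays.equals(head, magic);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The argument to mark() is the number of bytes that may be read before the mark becomes invalid, so it has to cover everything read before reset() is called.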
You can always use both:
final InputStream inputStream = ...;
final BufferedInputStream bufferedInputStream =
new BufferedInputStream(inputStream);
final DataInputStream dataInputStream =
new DataInputStream(bufferedInputStream);
InputStream: Base class for reading bytes from a stream (network or file); it provides the ability to read bytes from the stream and detect the end of the stream.
DataInputStream: Reads data directly as primitive data types.
BufferedInputStream: Reads data from the input stream and uses a buffer to speed up access to the data.
You should use DataInputStream when you need to interpret primitive types in a file written by a language other than Java in a platform-independent manner.
I would advocate using Apache (formerly Jakarta) Commons IO and its readLines() methods (of whatever variety).
It'll look after buffering/closing etc. and give you back a list of text lines. I'll happily roll my own input stream wrapping with buffering etc., but nine times out of ten the Commons IO stuff works fine and is sufficient/more concise/less error prone etc.
The differences are:
DataInputStream works with binary data, while BufferedReader works with character data.
All primitive data types can be handled using the corresponding methods of the DataInputStream class, while only string data can be read from a BufferedReader, and it then needs to be parsed into the respective primitives.
DataInputStream is one of the filter streams, while BufferedReader is not.
DataInputStream tends to consume less memory because it is a binary stream, whereas BufferedReader consumes more because it is a character stream.
DataInputStream handles a limited set of data types, whereas BufferedReader can handle a wide range of characters.
