java.io.StreamCorruptedException with ruby sender and java client - java

I have a Ruby program that writes data to a socket with sock.write, and I'm reading the data with ObjectInputStream in a Java file. I'm getting an invalid header error whose value translates to the first few characters of my stream.
I've read that if you use ObjectInputStream you must write with ObjectOutputStream, but since the writing side is in Ruby I'm not sure how to accomplish this.

As you say, ObjectInputStream assumes that the bytes it's receiving have been formatted by an ObjectOutputStream. That is, it is expecting the incoming bytes to be a specific representation of a Java primitive or object.
Your Ruby code is unlikely to format bytes in such a way.
You need to define exactly the byte format of the message passing from the Ruby to the Java process. You could tell us more about that message format, but it's likely you will need to use Java's ByteArrayInputStream (https://docs.oracle.com/javase/7/docs/api/java/io/ByteArrayInputStream.html). The data will come into the Java program as a raw array of bytes, and you will need to parse/unpack/process these bytes into whatever objects are appropriate.
Unless performance is critical, you'd probably be best off using JSON or YAML as the intermediate format. They would make it simple to send simple objects such as strings, arrays, and hashes (maps).
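As a rough sketch of what "defining the byte format" could look like: assume the Ruby side sends a 4-byte big-endian length followed by that many bytes of UTF-8 text (for example JSON), written with something like sock.write([payload.bytesize].pack('N') + payload). That framing, and the names FramedTextReader and readFrame, are assumptions made for illustration and are not part of the original question. The Java side can then read the frame with DataInputStream instead of ObjectInputStream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads frames of the assumed form
// [4-byte big-endian length][UTF-8 payload].
final class FramedTextReader {

    static String readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();            // big-endian, matches Ruby's pack('N')
        if (length < 0) {
            throw new IOException("Negative frame length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload);                // blocks until the whole payload has arrived
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream(): build one frame in memory.
        byte[] text = "{\"name\":\"ruby\"}".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        out.writeInt(text.length);
        out.write(text);

        System.out.println(readFrame(new ByteArrayInputStream(wire.toByteArray())));
    }
}
```

The returned string can then be handed to whichever JSON library you prefer. With a real socket you would pass socket.getInputStream() to readFrame.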

Related

Best way for transfering files via socket in java

I have a BufferedReader object and a PrintWriter object. So I can work passing String objects made by json-io of any type (e.g.: List, Map, MyOwnClass)
My class have a byte[] attribute, this byte[] will keep a file bytes, such as an image.
The JSON generated from my class is, unsurprisingly, very large. So I started to think there must be a better way to transfer files.
Should I change all the mechanism to transfer only byte[] instead of String? Does someone know what is the mechanism used by chat programs? Should I reserve the first 20 bytes of the array for the message identification?
I would write it to the socket in binary. Assuming a class with one String and one byte[]:
The String: the length of the String is written with DataOutputStream.writeInt(int) (or methods for smaller integers), followed by OutputStream.write(byte[]) on the return value of String.getBytes(String) with the charset explicitly specified.
The byte[]: the length is written with DataOutputStream.writeInt(int) (or methods for smaller integers), followed by OutputStream.write(byte[]) for the byte[] to transfer.
On the other side you would do the exact opposite of this procedure.
I chose this binary approach over JSON because even though you could transmit the byte[] with JSON almost as efficiently as in binary, it would defeat the very purpose of JSON: being human-readable.
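The procedure described above can be sketched roughly as follows. The class name FileMessageCodec, the method names, and the choice of UTF-8 are assumptions for illustration; in-memory streams stand in for the socket's streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical codec for a message made of one String and one byte[].
final class FileMessageCodec {

    static void writeMessage(DataOutputStream out, String name, byte[] content) throws IOException {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8); // charset chosen explicitly
        out.writeInt(nameBytes.length);   // length of the String in bytes
        out.write(nameBytes);
        out.writeInt(content.length);     // length of the byte[]
        out.write(content);
        out.flush();
    }

    // The reader must consume the fields in the same order they were written.
    static String readName(DataInputStream in) throws IOException {
        byte[] nameBytes = new byte[in.readInt()];
        in.readFully(nameBytes);
        return new String(nameBytes, StandardCharsets.UTF_8);
    }

    static byte[] readContent(DataInputStream in) throws IOException {
        byte[] content = new byte[in.readInt()];
        in.readFully(content);
        return content;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeMessage(new DataOutputStream(wire), "photo.png", new byte[] {1, 2, 3});

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readName(in) + ", " + readContent(in).length + " bytes");
    }
}
```

Over a real connection, the DataOutputStream would wrap socket.getOutputStream() and the DataInputStream would wrap socket.getInputStream() on the other end.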

Parse byte array as HTTP object

In Java, how would I convert a byte array (TCP packet payload from a pcap file) into some kind of HTTP object that I can use to get HTTP headers and content body?
One of the stupid lovely things about Java is a total lack of unsigned types. So, a good place to start would be taking your byte array and converting it into a short array to make sure that you don't have any rollover problems. (16 bits versus 8 bits per number).
From there, you could use a BufferedOutputStream to write your data to a file and parse it with one of the Java built-in XML readers, such as JaxB or DOM. BufferedOutputStream writes hex directly to a file, and can take an input of an int, byte, or short array. After you write it out, using the OutputStream it should be very simple to parse the HTML out of it.
If you need any help with any of these individual steps, I'd be happy to help.
EDIT: as maerics has pointed out, perhaps I didn't grasp what you were asking. Regardless, writing your byte array with a BufferedOutputStream is the way to go in my opinion, and I could still help you build a parser if you want.
JNetPcap can do exactly this.
Here are examples for
Opening a pcap file
Parsing http (in the example, we extract an image)
Drawback: HTTP parsing in this library is deprecated*, but that doesn't mean it doesn't work.
*I can't post any more links without more reputation. Sorry. You can Google for "jnetpcap http deprecated".

Why to avoid using ByteStream much in Java

According to the Sun documentation, we shouldn't use byte streams:
actually it represents a kind of low-level I/O that you should avoid.
What exactly is low-level I/O, and what exactly is the problem with using a byte stream?
So the Java docs say:
CopyBytes seems like a normal program, but it actually represents a
kind of low-level I/O that you should avoid. Since xanadu.txt contains
character data, the best approach is to use character streams, as
discussed in the next section. There are also streams for more
complicated data types. Byte streams should only be used for the most
primitive I/O.
The byte streams give you access to the file as it is. Just the bytes. No interpretation of any kind. That means no character set conversion, no handling of ints or floats in binary or ASCII representation, no dealing with byte orders, or any of that. The higher-level streams provide some of these.
Of course, a program that copies a file is actually a pretty good example of something that needs a raw byte stream, because it doesn't need or want to do any kind of interpretation of the data; it just wants to copy it verbatim.
So what they really mean is: use byte streams if you think you need them, but be sure you know what you are doing :)
The suggestion is in the context of reading a text file that is discussed in the tutorial. For that purpose it is better to use character streams to handle character set translation properly:
The Java platform stores character values using Unicode conventions.
Character stream I/O automatically translates this internal format to
and from the local character set.
A program that uses character streams in place of byte streams
automatically adapts to the local character set and is ready for
internationalization — all without extra effort by the programmer.
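To make the difference concrete, here is a small sketch that reads the same data once as raw bytes and once through a character stream. The class name ByteVsCharDemo and its methods are made up for illustration, and the charset is passed explicitly as UTF-8 rather than relying on the local default the tutorial describes, which is a deliberate assumption to keep the result predictable.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: the same data seen through a byte stream and a character stream.
final class ByteVsCharDemo {

    // Byte stream: every read() returns one raw byte, with no interpretation.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Character stream: the reader decodes bytes into chars using the given charset.
    static int countChars(InputStream in, Charset charset) throws IOException {
        Reader reader = new InputStreamReader(in, charset);
        int count = 0;
        while (reader.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "héllo": the é takes two bytes in UTF-8 but is a single char.
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(new ByteArrayInputStream(data)));                         // prints 6
        System.out.println(countChars(new ByteArrayInputStream(data), StandardCharsets.UTF_8)); // prints 5
    }
}
```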

Java limit at json format

I want to transfer some database data through a TCP socket. The data is formatted to JSON.
Since the database size might grow, I'm afraid that the String object maximum size will not be enough to store the entire data with JSON formatting.
I already had a problem transferring the data using the DataOutput method writeUTF().
What should I do? Maybe convert the database rows to CSV and transfer it through the Internet line by line? Or do I not need to worry about String limits and solve the writeUTF() problem by getting the bytes of the String, transferring them through the socket and rebuilding the String from the bytes at the destination?
Java strings can be extremely long - you're unlikely to run into problems with the String type itself. If you convert the string to binary first, then use writeInt to write the number of bytes, then the bytes themselves, that should be fine. The problem with writeUTF is that it uses writeShort, so it only handles up to 64K of data.
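A minimal sketch of that suggestion, assuming UTF-8 as the encoding; the class LargeStringTransfer and its method names are hypothetical. The main method also shows writeUTF() rejecting a string whose encoded form exceeds 65535 bytes, which it does by throwing UTFDataFormatException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: int length prefix instead of writeUTF's 2-byte length.
final class LargeStringTransfer {

    static void writeLargeString(DataOutputStream out, String value) throws IOException {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);   // 4-byte length, so not limited to 65535 bytes
        out.write(bytes);
        out.flush();
    }

    static String readLargeString(DataInputStream in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        char[] chars = new char[70_000];
        Arrays.fill(chars, 'x');
        String big = new String(chars);

        try {
            new DataOutputStream(new ByteArrayOutputStream()).writeUTF(big);
        } catch (UTFDataFormatException e) {
            System.out.println("writeUTF rejected the string");
        }

        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeLargeString(new DataOutputStream(wire), big);
        String copy = readLargeString(new DataInputStream(new ByteArrayInputStream(wire.toByteArray())));
        System.out.println(copy.equals(big));
    }
}
```

For very large result sets it may still be worth sending the data in several such messages, for example one per row or batch, so that neither side has to hold everything in memory at once.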

Java and Binary data in the context of sockets

Java newbie here. Are there any helper functions to serialize data in and out of byte arrays? I am writing a Java package that implements a network protocol. So I have to write some typical variables like a version (1byte), sequence Number (long) and binary data (bytes) in a loop. How do I do this in Java? Coming from C I am thinking of creating a byte array of the required size and then since there is no memcpy() I am converting the long into a temporary byte array and then copying it into the actual byte array. It seems so inefficient and also really error prone. Is there a class I could use to marshall and unmarshall parameters to a byte array?
Also, why do all the Socket classes deal only with char[] and not byte[]? A socket by definition has to deal with binary data too. How is this done in Java?
I am sure what I am missing is the Java mindset. I'd appreciate it if someone could point it out to me.
EDIT: I did look at DataOutputStream and DataInputStream, but it seems I can only convert the bytes to a String, not to a byte[], which means information might be lost in the conversion when writing to a socket.
Pav
Have a look at DataInputStream, DataOutputStream, ObjectInputStream and ObjectOutputStream. Check first if the layout of the data is acceptable to you. Also, Serialization.
Sockets neither deal with char[] nor with byte[] but with InputStream and OutputStream which are used to read and write bytes.
If you are sending the data over a socket, then you don't need a temporary byte array at all; you can wrap the socket's OutputStream with DataOutputStream or ObjectOutputStream and just write what you want to write.
There might be an aspect I've missed that means you do actually need temporary byte arrays. If so, look at ByteArrayOutputStream. Also, there's no memcpy(), sure, but there is System.arraycopy.
As above, DataInputStream and DataOutputStream are exactly what you are looking for. Re your comment about String: if you're planning to use Java Strings over the wire, you're not designing a network protocol, you're designing a Java protocol. There are readUTF() and writeUTF() if you're sure the other end is Java or if you can code the other end to understand these formats. Or you can send the text as bytes along with the appropriate charset, or predefine the charset for the entire protocol if that makes sense.
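For the fields mentioned in the question (a one-byte version, a long sequence number, and binary data), a sketch with DataOutputStream and DataInputStream might look like this. The names PacketCodec and Packet are hypothetical, and the 4-byte payload length field is an added assumption so the reader knows how many bytes to expect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical codec for: [version: 1 byte][sequence: 8 bytes][length: 4 bytes][payload].
final class PacketCodec {

    static final class Packet {
        final int version;
        final long sequence;
        final byte[] payload;

        Packet(int version, long sequence, byte[] payload) {
            this.version = version;
            this.sequence = sequence;
            this.payload = payload;
        }
    }

    static byte[] encode(int version, long sequence, byte[] payload) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeByte(version);        // low 8 bits only
        out.writeLong(sequence);       // big-endian, 8 bytes
        out.writeInt(payload.length);  // assumed length field
        out.write(payload);
        out.flush();
        return buffer.toByteArray();
    }

    static Packet decode(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        int version = in.readUnsignedByte();
        long sequence = in.readLong();
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);
        return new Packet(version, sequence, payload);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encode(1, 42L, new byte[] {9, 8});
        Packet packet = decode(wire);
        System.out.println(wire.length + " bytes, version " + packet.version
                + ", sequence " + packet.sequence);
    }
}
```

When writing straight to a socket, the ByteArrayOutputStream is unnecessary: wrapping socket.getOutputStream() in the DataOutputStream sends the same bytes without a temporary array.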
