Java TCP/IP socket send byte-aligned data - java

I am having a hard time sending byte-aligned data over a socket because the examples I've been following use a PrintWriter class which converts everything to a string representation.
I want to send 3 float values, with a header and a footer. This way the consumer knows exactly how many bytes to read per transmission. My client sends something like the following:
//Add a header:
float type = params[0];
if (type == TransmitService.ACC_TYPE) out.print('a');
else if (type == TransmitService.GYR_TYPE) out.print('g');
else out.print('u'); //unknown - wtf? hasn't happened yet but just in case
//Payload:
out.print(params[1]);
out.print(params[2]);
out.print(params[3]);
//Footer:
out.print('e');
Here is the initialization of network objects:
echoSocket = new Socket(HOST, PORT);
out = new PrintWriter(echoSocket.getOutputStream(), true);
Then on the server side I want to do something like read exactly 26 bytes at a time (8 bytes per 3 floats on a 64-bit system, 1 byte for header char, 1 byte for footer char). The exact number doesn't matter, I can test and figure that out.
What's problematic is that out.print() converts everything to a string, so if I have 0.001000.. with trailing zeros, it will truncate to 0.001 as a string, which is 5 chars, giving me inconsistent byte transaction amounts for my server.
It's in MATLAB unfortunately and doing the following:
t=tcpip('0.0.0.0', 8000, 'NetworkRole', 'server');
fopen(t);
bytesToRead = 26;
data = fread(t,bytesToRead);
What should I do to consistently send my header, 3 floats, and footer to my server?
Cheers

So don't use a PrintWriter. Use a DataOutputStream. It has methods for sending floats and all other primitives.
'Byte-aligned' has nothing to do with it. All data is byte-aligned.
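A minimal sketch of that suggestion, assuming the one-character header and footer from the question are each sent as a single byte. The names FrameWriter and writeFrame are made up for illustration, and a ByteArrayOutputStream stands in for the socket's output stream. Note that a Java float is always 4 bytes, whatever the platform, so a frame is 1 + 3*4 + 1 = 14 bytes, and DataOutputStream writes the floats big-endian.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class FrameWriter {
    // Writes one fixed-size frame: 1 header byte, three 4-byte floats, 1 footer byte.
    static void writeFrame(OutputStream raw, char type, float x, float y, float z)
            throws IOException {
        DataOutputStream out = new DataOutputStream(raw);
        out.writeByte(type);   // low 8 bits of the char, e.g. 'a' or 'g'
        out.writeFloat(x);     // IEEE 754 single precision, big-endian
        out.writeFloat(y);
        out.writeFloat(z);
        out.writeByte('e');    // footer
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for echoSocket.getOutputStream()
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeFrame(sink, 'a', 0.001f, 1.5f, -2.0f);
        System.out.println(sink.size() + " bytes per frame");
    }
}
```

With this layout the server would read 14 bytes per frame and interpret bytes 2 to 13 as three big-endian single-precision values.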

You shouldn't use PrintWriter, I believe.
Try using e.g. an ObjectOutputStream.
http://docs.oracle.com/javase/7/docs/api/java/io/ObjectOutputStream.html
With this one you can write bytes, chars, floats, basically
anything (including primitive types and objects).
And you don't go through a String, you write binary data.
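One caveat worth checking before going this route with a non-Java server: ObjectOutputStream wraps what you write in Java's serialization stream format (a stream header that starts with the bytes 0xAC 0xED, plus block-data markers around primitive writes), so the bytes on the wire are not just your floats. A small sketch to compare the two stream classes; the class name StreamOverhead is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

final class StreamOverhead {
    // Three floats through DataOutputStream: exactly 12 bytes.
    static byte[] viaDataStream(float a, float b, float c) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeFloat(a);
            out.writeFloat(b);
            out.writeFloat(c);
        }
        return sink.toByteArray();
    }

    // The same floats through ObjectOutputStream: payload plus serialization framing.
    static byte[] viaObjectStream(float a, float b, float c) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.writeFloat(a);
            out.writeFloat(b);
            out.writeFloat(c);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("DataOutputStream:   " + viaDataStream(1f, 2f, 3f).length + " bytes");
        System.out.println("ObjectOutputStream: " + viaObjectStream(1f, 2f, 3f).length + " bytes");
    }
}
```

If the consumer has to count bytes by hand, as the MATLAB server here does, the extra framing has to be accounted for or avoided.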

Related

Converting byte array with ASCII encoding to String produces weird result

I'm making a socket application in Java that receives some HTML data from the server in ASCII and then parse the data accordingly.
byte[] receivedContent = new byte[12500];
receivedSize = inputStream.read(receivedContent);
receivedContent = Arrays.copyOf(receivedContent, receivedSize+1);
if (receivedSize == -1) {
System.out.println("ERROR! NO DATA RECEIVED");
System.exit(-1);
}
lastReceived = new String(receivedContent, StandardCharsets.US_ASCII);
This should really be quite straightforward, but it's not. I printed out some debug messages and found that despite receiving some bytes of data (for example, printing receivedSize tells me it has received 784 bytes), the resulting string from those bytes is only a few chars long, like this:
Ard</a></li><li><a
I'm expecting a full HTML document, and so this is clearly wrong. There's also no obvious pattern as to when this might happen. It seems totally random. Since I'm allocating new memory for the buffer there really shouldn't be any old data in it that messes with the new data from the socket. Can someone shed some light on this strange behavior? Also, this seems to happen less frequently on my Windows machine running OracleJDK than on my remote Ubuntu machine that runs OpenJDK. Could that be the reason, and how would I fix that?
UPDATE:
In the end I manually inspected the byte array's contents against an ASCII table and found that the server is intentionally sending garbled data. Mystery solved.
Instead of using:
inputStream.read(receivedContent);
You need to read all data from the stream. Using something like (from apache commons io):
IOUtils.readFully(inputStream, receivedContent)
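If you'd rather not pull in a library, the same idea can be sketched with plain JDK classes. This assumes the server closes the connection when it is done, so end of stream marks the end of the document; the helper name readAll is made up for the example, and a ByteArrayInputStream stands in for the socket stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ReadAll {
    // Reads until end of stream; each read() may return fewer bytes than the chunk size.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            collected.write(chunk, 0, n);   // keep only the bytes actually read
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the socket's input stream
        InputStream in = new ByteArrayInputStream(
                "<html>...</html>".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(readAll(in), StandardCharsets.US_ASCII));
    }
}
```

The point is that a single read() only hands back whatever has arrived so far, so the decoding to String should happen once, after the loop.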

Sending buffered images between Java client and Twisted Python socket server

I have a server-side function that draws an image with the Python Imaging Library. The Java client requests an image, which is returned via socket and converted to a BufferedImage.
I prefix the data with the size of the image to be sent, followed by a CR. I then read this number of bytes from the socket input stream and attempt to use ImageIO to convert to a BufferedImage.
In abbreviated code for the client:
public String writeAndReadSocket(String request) {
// Write text to the socket
BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
bufferedWriter.write(request);
bufferedWriter.flush();
// Read text from the socket
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
// Read the prefixed size
int size = Integer.parseInt(bufferedReader.readLine());
// Get that many bytes from the stream
char[] buf = new char[size];
bufferedReader.read(buf, 0, size);
return new String(buf);
}
public BufferedImage stringToBufferedImage(String imageBytes) {
return ImageIO.read(new ByteArrayInputStream(imageBytes.getBytes()));
}
and the server:
# Twisted server code here
# The analog of the following method is called with the proper client
# request and the result is written to the socket.
def worker_thread():
img = draw_function()
buf = StringIO.StringIO()
img.save(buf, format="PNG")
img_string = buf.getvalue()
return "%i\r%s" % (sys.getsizeof(img_string), img_string)
This works for sending and receiving Strings, but image conversion (usually) fails. I'm trying to understand why the images are not being read properly. My best guess is that the client is not reading the proper number of bytes, but I honestly don't know why that would be the case.
Side notes:
I realize that the char[]-to-String-to-bytes-to-BufferedImage Java logic is roundabout, but reading the bytestream directly produces the same errors.
I have a version of this working where the client socket isn't persistent, ie. the request is processed and the connection is dropped. That version works fine, as I don't need to care about the image size, but I want to learn why the proposed approach doesn't work.
BufferedReader.read() isn't guaranteed to fill the buffer, and converting the image to String and back is not only pointless but wrong.
String is not a container for binary data, and the round-trip isn't guaranteed to work.
It would be better to redesign the protocol so that you can get rid of the readLine(), and send the length in binary and can read the entire stream with a DataInputStream.
In general when dealing with binary protocols, the answer is always DataInputStream and DataOutputStream, unless the byte order isn't the canonical network byte order, which is a protocol design mistake, and in which case you need to look into byte-ordered ByteBuffers.
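A rough sketch of what the redesigned protocol could look like on the Java side, assuming the server is changed to send the length as a 4-byte integer in network byte order followed by the raw image bytes. The class and method names are hypothetical, and byte arrays stand in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class LengthPrefixed {
    // Reads a 4-byte big-endian length followed by exactly that many payload bytes.
    static byte[] readMessage(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        int length = in.readInt();
        if (length < 0) throw new IOException("negative length: " + length);
        byte[] payload = new byte[length];
        in.readFully(payload);   // blocks until all bytes arrive, or throws EOFException
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // Build a fake wire message: length prefix plus the first bytes of a PNG signature.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        byte[] image = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A};
        out.writeInt(image.length);
        out.write(image);

        byte[] got = readMessage(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(got.length + " payload bytes");
    }
}
```

The returned array can then be handed to ImageIO.read(new ByteArrayInputStream(payload)) without ever passing through a String.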
In the server code, your use of sys.getsizeof is wrong. That returns the size of the bytestring object, whereas what you want is the number of bytes in the bytestring, i.e. its length len(img_string).
Also, in the client code the .readLine method reads characters until it sees either '\r' possibly followed by '\n', or '\n' on its own, so using '\r' as the terminator will cause a problem if the first byte of the image data happens to be 0x0A, i.e. '\n'.
I expect that the problem is that you are trying to use a Reader and getBytes() to read binary data (the image).
The Reader stack will be taking the bytes from the underlying socket stream, converting them to characters (using the platform's default character encoding), and returning them as a String. Then you convert the String contents back into bytes using the default encoding again. The initial conversion of bytes to characters is likely to be "lossy" for binary data.
The fix is not to use a Reader / BufferedReader. Use an InputStream and a BufferedInputStream. You are not making it easy for yourself by sending the image size encoded as text, but you can deal with that by reading bytes one at a time until you get the newline, and converting them "by hand" into an integer.
(If the size was sent as a fixed-sized binary integer in "network order" you could use DataInputStream instead ... )
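For the existing text-prefixed protocol, reading the size "by hand" might look roughly like this. It is only a sketch: the class and method names are invented, it assumes the prefix is plain ASCII digits ended by a single '\r', and it does no overflow checking.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class TextLengthReader {
    // Reads ASCII digits up to the terminator byte and returns them as an int.
    static int readDecimal(InputStream in, int terminator) throws IOException {
        int value = 0;
        int b;
        while ((b = in.read()) != terminator) {
            if (b == -1) throw new EOFException("stream ended inside the length prefix");
            if (b < '0' || b > '9') throw new IOException("unexpected byte in length: " + b);
            value = value * 10 + (b - '0');
        }
        return value;
    }

    // Reads exactly 'length' bytes, looping because read() may return fewer.
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] data = new byte[length];
        int pos = 0;
        while (pos < length) {
            int n = in.read(data, pos, length - pos);
            if (n == -1) throw new EOFException("stream ended after " + pos + " bytes");
            pos += n;
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // "3\r" followed by three binary bytes; the first one is 0x0A on purpose.
        byte[] wire = {'3', '\r', 0x0A, 0x01, 0x02};
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(wire));
        int size = readDecimal(in, '\r');
        byte[] image = readExactly(in, size);
        System.out.println("read " + image.length + " image bytes");
    }
}
```

Because nothing here interprets the payload as characters, a leading 0x0A in the image data is preserved.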

Why does Java read random amounts from a socket but not the whole message?

I am working on a project and have a question about Java sockets. The source file can be found here.
After successfully transmitting the file size in plain text I need to transfer binary data. (DVD .Vob files)
I have a loop such as
// Read this files size
long fileSize = Integer.parseInt(in.readLine());
// Read the block size they are going to use
int blockSize = Integer.parseInt(in.readLine());
byte[] buffer = new byte[blockSize];
// Bytes "red"
long bytesRead = 0;
int read = 0;
while(bytesRead < fileSize){
System.out.println("received " + bytesRead + " bytes" + " of " + fileSize + " bytes in file " + fileName);
read = socket.getInputStream().read(buffer);
if(read < 0){
// Should never get here since we know how many bytes there are
System.out.println("DANGER WILL ROBINSON");
break;
}
binWriter.write(buffer,0,read);
bytesRead += read;
}
I read a random number of bytes close to 99%. I am using Socket, which is TCP based,
so I shouldn't have to worry about lower layer transmission errors.
The received number changes but is always very near the end
received 7258144 bytes of 7266304 bytes in file GLADIATOR/VIDEO_TS/VTS_07_1.VOB
The app then hangs there in a blocking read. I am confounded. The server is sending the correct
file size and has a successful implementation in Ruby but I can't get the Java version to work.
Why would I read less bytes than are sent over a TCP socket?
The above is because of a bug many of you pointed out below.
BufferedReader ate 8 KB of my socket's input. The correct implementation can be found here.
If your in is a BufferedReader then you've run into the common problem with buffering more than needed. The default buffer size of BufferedReader is 8192 characters which is approximately the difference between what you expected and what you got. So the data you are missing is inside BufferedReader's internal buffer, converted to characters (I wonder why it didn't break with some kind of conversion error).
The only workaround is to read the first lines byte-by-byte without using any buffered reader classes. Java doesn't provide an unbuffered InputStreamReader with readLine() capability as far as I know (with the exception of the deprecated DataInputStream.readLine(), as indicated in the comments below), so you have to do it yourself. I would do it by reading single bytes, putting them into a ByteArrayOutputStream until I encounter an EOL, then converting the resulting byte array into a String using the String constructor with the appropriate encoding.
Note that while you can't use a BufferedReader, nothing stops you from using a BufferedInputStream from the very beginning, which will make byte-by-byte reads more efficient.
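Roughly, that approach could look like the following. It is a sketch under a few assumptions: the header lines are ASCII and end in '\n' (an optional '\r' before it is dropped), and the class and method names are mine, not from the question's source.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class LineThenBinary {
    // Reads bytes up to '\n' (dropping a trailing '\r') and decodes them as ASCII.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != '\n') {
            if (b == -1) throw new EOFException("stream ended before end of line");
            line.write(b);
        }
        byte[] bytes = line.toByteArray();
        int len = bytes.length;
        if (len > 0 && bytes[len - 1] == '\r') len--;
        return new String(bytes, 0, len, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        // Two text lines followed by binary data, all on one stream.
        byte[] wire = {'4', '\n', '2', '\n', (byte) 0xDE, (byte) 0xAD, (byte) 0xBE, (byte) 0xEF};
        // One BufferedInputStream for the whole connection, used for text and binary alike.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(wire));
        long fileSize = Long.parseLong(readLine(in));
        int blockSize = Integer.parseInt(readLine(in));
        System.out.println("fileSize=" + fileSize + " blockSize=" + blockSize
                + " firstBinaryByte=" + in.read());
    }
}
```

The key design choice is that every read, text or binary, goes through the same buffered stream object, so nothing can be stranded in a second buffer.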
Update
In fact, I am doing something like this right now, only a bit more complicated. It is an application protocol that involves exchanging some data structures that are nicely represented in XML, but they sometimes have binary data attached to them. We implemented this by having two attributes in the root XML: fragmentLength and isLastFragment. The first one indicates how many bytes of binary data follow the XML part and isLastFragment is a boolean attribute indicating the last fragment so the reading side knows that there will be no more binary data. XML is null-terminated so we don't have to deal with readLine(). The code for reading looks like this:
InputStream ins = new BufferedInputStream(socket.getInputStream());
while (!finished) {
ByteArrayOutputStream buf = new ByteArrayOutputStream();
int b;
while ((b = ins.read()) > 0) {
buf.write(b);
}
if (b == -1)
throw new EOFException("EOF while reading from socket");
// b == 0
Document xml = readXML(new ByteArrayInputStream(buf.toByteArray()));
processAnswers(xml);
Element root = xml.getDocumentElement();
if (root.hasAttribute("fragmentLength")) {
int length = DatatypeConverter.parseInt(
root.getAttribute("fragmentLength"));
boolean last = DatatypeConverter.parseBoolean(
root.getAttribute("isLastFragment"));
int read = 0;
while (read < length) {
// split incoming fragment into 4Kb blocks so we don't run
// out of memory if the client sent a really large fragment
int l = Math.min(length - read, 4096);
byte[] fragment = new byte[l];
int pos = 0;
while (pos < l) {
int c = ins.read(fragment, pos, l - pos);
if (c == -1)
throw new EOFException(
"Preliminary EOF while reading fragment");
pos += c;
read += c;
}
// process fragment
}
} // end if (fragmentLength)
} // end while (!finished)
Using null-terminated XML for this turned out to be a really great thing as we can add additional attributes and elements without changing the transport protocol. At the transport level we also don't have to worry about handling UTF-8 because XML parser will do it for us. In your case you're probably fine with those two lines, but if you need to add more metadata later you may wish to consider null-terminated XML too.
Here is your problem. In the first few lines of the program you're using in.readLine(), which probably belongs to some sort of BufferedReader. BufferedReaders will read data off the socket in 8K chunks. So when you did the first readLine() it read the first 8K into the buffer. The first 8K contains your two numbers followed by newlines, then some portion of the head of the VOB file (that's the missing chunk). Now when you switched to using the getInputStream() off the socket you are 8K into the transmission, while assuming you're starting at zero.
socket.getInputStream().read(buffer); // you can't do this without losing data.
While the BufferedReader is nice for reading character data, switching between binary and character data in a stream is not possible with it. You'll have to switch to using InputStream instead of Reader and convert the first few portions by hand to character data. If you read the file using a buffered byte array you can read the first chunk, look for your newlines and convert everything to the left of that to character data. Then write everything to the right to your file, then start reading the rest of the file.
This used to be easier with DataInputStream, but it doesn't do a good job handling character conversion for you (readLine is deprecated with BufferedReader being the only replacement - doh). Probably should write a DataInputStream replacement that under the covers uses Charset to properly handle string conversion. Then switching between characters and binary would be easier.
Your basic problem is that BufferedReader will read as much data as is available and place it in its buffer. It will give you the data as you ask for it. This is the whole point of buffering, i.e. to reduce the number of calls to the OS. The only safe way to use a buffered input is to use the same buffer over the life of the connection.
In your case, you only use the buffer to read two lines, but it is highly likely that 8192 bytes (the default size of the buffer) have been read into it. Say the first two lines consist of 32 bytes; this leaves 8160 bytes waiting for you to read. However, you bypass the buffer and perform the read() on the socket directly, so those 8160 bytes are left in the buffer and end up discarded (the amount you are missing).
BTW: You should be able to see this in a debugger if you inspect the contents of your buffered reader.
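If you don't have a debugger handy, the read-ahead can also be reproduced in isolation. The sketch below uses a ByteArrayInputStream in place of the socket and a made-up class name; exactly how much gets pulled into the reader is an implementation detail, so the code only reports what is left on the underlying stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class ReadAheadDemo {
    // Returns how many bytes are still readable from the raw stream after a
    // BufferedReader wrapped around it has returned a single line.
    static int bytesLeftAfterOneLine(byte[] wire) throws IOException {
        InputStream raw = new ByteArrayInputStream(wire);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.ISO_8859_1));
        in.readLine();           // logically consumes one short line...
        return raw.available();  // ...but the reader has already pulled more from 'raw'
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = new byte[1000];   // "42\n" followed by 997 bytes of "binary" payload
        wire[0] = '4';
        wire[1] = '2';
        wire[2] = '\n';
        int left = bytesLeftAfterOneLine(wire);
        System.out.println("line was 3 bytes, raw stream has " + left
                + " of " + (wire.length - 3) + " payload bytes left");
    }
}
```

Anything missing from the raw stream at that point is sitting in the reader's buffer, which is exactly the data the question's loop never sees.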
Sergei may have been right about data being lost inside the buffer, but I'm not sure about his explanation. (BufferedReaders don't usually hold onto data inside their buffers. He may be thinking of a problem with BufferedWriters, which can lose data if the underlying stream is shut down prematurely.) [Never mind; I had misread Sergei's answer. The rest of this is valid AFAIK.]
I think you have a problem that's specific to your application. In your client code, you start reading as follows:
public static void recv(Socket socket){
try {
BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
//...
int numFiles = Integer.parseInt(in.readLine());
... and you proceed to use in for the start of the exchange. But then you switch to using the raw socket stream:
while(bytesRead < fileSize){
read = socket.getInputStream().read(buffer);
Because in is a BufferedReader, it's already going to have filled its buffer with up to 8192 bytes from the socket input stream. Any bytes that are in that buffer, and which you don't read from in, will be lost. Your app is hanging because it believes that the server is holding onto some bytes, but the server doesn't have them.
The solution is not to do byte-by-byte reads from the socket (ouch! your poor CPU!), but to use the BufferedReader consistently. Or, to use buffering with binary data, change the BufferedReader to a BufferedInputStream that wraps the socket's InputStream.
By the way, TCP is not as reliable as many people assume it to be. For example, when the server socket closes, it's possible for it to have written data into the socket which then gets lost as the socket connection is shutdown. Calling Socket.setSoLinger can help to prevent this problem.
EDIT: Also BTW, you're playing with fire by treating byte and character data as if they're interchangeable, as you do below. If the data really is binary, then the conversion to String risks corrupting the data. Perhaps you want to be writing into a BufferedOutputStream?
// Java is retarded and reading and writing operate with
// fundamentally different types. So we write a String of
// binary data.
fileWriter.write(new String(buffer));
bytesRead += read;
EDIT 2: Clarified (or attempted to clarify :-} the handling of binary vs. String data.

Java socket listener load problem

I have made a socket listener in Java that listens on two ports for data and does operations on the listened data. Now the scenario is such that when both the listener and the device that transmits data are up and running, the listener receives data, one at a time ( each data starts with a "#S" and ends with a ".") and when the listener is not up or is not listening, the device stores the data in its local memory and as soon as the listener is up it sends all the data in the appended form like:
"#S ...DATA...[.]#S...DATA...[.]..."
Now I have implemented this in a way that, whatever data the listener gets on either port, it converts into the hex form, and then carries out operations on the hex format of the input data. The hex form of "#S" is "2353" and the hex form of "." is "2e". The code for handling the hex-converted form of the input data is as follows.
hexconverted1 is a string that contains the hex-converted form of the whole input data, that comes on any port.
String store[];
store=hexconverted1.split("2353");
for(int m=0;m<store.length;m++)
store[m]="2353"+store[m];
PrintWriter out2 = new PrintWriter(new BufferedWriter(new FileWriter("C:/Listener/array.bin", true)));
for(int iter=0;iter<store.length; iter++)
out2.println(store[iter]);
out2.close();
What I am trying to accomplish with the above code is that, whenever a bunch of data arrives, I scan through it, separate out every single packet from the bunch, and store the packets in a string array so that the operations I wish to carry out on the hex-converted form of the data can be done more easily. But when I write the contents of the array to a BIN file, the output varies for the same input. When I send a bunch of 280 data packets, appended one after the other, at times the array contains 180 and at other times 270. For smaller bunch sizes I get the desired results and the size of the 'store' array is also as expected.
I'm pretty clueless about what's going on and any pointers would be of great help.
To make matters more lucid, the data I get on the ports is mostly unreadable, and often the only readable parts are the starting bits "#S" and the end bit ".". So I'm using a combination of BufferedInputStream and InputStream to read the incoming data and convert it into the hex format, and I'm quite sure that the conversion to hex is coming about alright.
I'm using a combination of BufferedInputStream and InputStream to read the incoming data
Clutching at straws here. If you read from a Stream using both InputStream and BufferedInputStream methods, you'll get into difficulty:
InputStream is = ...
BufferedInputStream bis = new BufferedInputStream(is);
// This is OK
int b = bis.read();
...
// Reading the InputStream directly at this point is liable to
// give unpredictable results. It is likely that some bytes still
// remain in "bis"'s buffer, and a read on "is" will not return them.
int b2 = is.read();

How to get data out of network packet data in Java

In C if you have a certain type of packet, what you generally do is define some struct and cast the char * into a pointer to the struct. After this you have direct programmatic access to all data fields in the network packet. Like so :
struct rdp_header {
int version;
char serverId[20];
};
When you get a network packet you can do the following quickly :
char * packet;
// receive packet
struct rdp_header * pckt = (struct rdp_header *) packet;
printf("Servername : %20.20s\n", pckt->serverId);
This technique works really great for UDP based protocols, and allows for very quick and very efficient packet parsing and sending using very little code, and trivial error handling (just check the length of the packet). Is there an equivalent, just as quick way in java to do the same ? Or are you forced to use stream based techniques ?
Read your packet into a byte array, and then extract the bits and bytes you want from that.
Here's a sample, sans exception handling:
DatagramSocket s = new DatagramSocket(port);
DatagramPacket p;
byte buffer[] = new byte[4096];
while (true) {
p = new DatagramPacket(buffer, buffer.length);
s.receive(p);
// your packet is now in buffer[];
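// mask each byte to avoid sign extension, then combine in network (big-endian) order
int version = ((buffer[0] & 0xFF) << 24) | ((buffer[1] & 0xFF) << 16) | ((buffer[2] & 0xFF) << 8) | (buffer[3] & 0xFF);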
byte[] serverId = new byte[20];
System.arraycopy(buffer, 4, serverId, 0, 20);
// and process the rest
}
In practise you'll probably end up with helper functions to extract data fields in network order from the byte array, or as Tom points out in the comments, you can use a ByteArrayInputStream(), from which you can construct a DataInputStream() which has methods to read structured data from the stream:
...
while (true) {
p = new DatagramPacket(buffer, buffer.length);
s.receive(p);
ByteArrayInputStream bais = new ByteArrayInputStream(buffer, 0, p.getLength()); // only the bytes actually received
DataInput di = new DataInputStream(bais);
int version = di.readInt();
byte[] serverId = new byte[20];
di.readFully(serverId);
...
}
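Another option in the same spirit is java.nio.ByteBuffer, which lets you pick the byte order explicitly. Here is a sketch for the rdp_header layout from the question. It assumes the fields are packed as a 4-byte big-endian int followed by 20 bytes of NUL-padded ASCII; a C struct cast would use the sender's native order and padding, so adjust ByteOrder as needed. The RdpHeader class is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class RdpHeader {
    final int version;
    final String serverId;

    private RdpHeader(int version, String serverId) {
        this.version = version;
        this.serverId = serverId;
    }

    // Parses the first 'length' bytes of 'data' laid out as: int version, char serverId[20].
    static RdpHeader parse(byte[] data, int length) {
        if (length < 24) throw new IllegalArgumentException("packet too short: " + length);
        ByteBuffer buf = ByteBuffer.wrap(data, 0, length).order(ByteOrder.BIG_ENDIAN);
        int version = buf.getInt();
        byte[] id = new byte[20];
        buf.get(id);
        // Assumption: serverId is NUL-padded ASCII, so stop at the first zero byte.
        int end = 0;
        while (end < id.length && id[end] != 0) end++;
        return new RdpHeader(version, new String(id, 0, end, StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        byte[] packet = ByteBuffer.allocate(24)
                .putInt(3)
                .put("server-01".getBytes(StandardCharsets.US_ASCII))
                .array();
        RdpHeader h = parse(packet, packet.length);
        System.out.println("version=" + h.version + " serverId=" + h.serverId);
    }
}
```

In the receive loop above this would be called as RdpHeader.parse(p.getData(), p.getLength()), with the length check taking the place of the C-style "just check the length of the packet".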
I don't believe this technique can be done in Java, short of using JNI and actually writing the protocol handler in C. The other way to do the technique you describe is variant records and unions, which Java doesn't have either.
If you had control of the protocol (it's your server and client) you could use serialized objects (inc. xml), to get the automagic (but not so runtime efficient) parsing of the data, but that's about it.
Otherwise you're stuck with parsing Streams or byte arrays (which can be treated as Streams).
Mind you the technique you describe is tremendously error prone and a source of security vulnerabilities for any protocol that is reasonably interesting, so it's not that great a loss.
I wrote something to simplify this kind of work. Like most tasks, it was much easier to write a tool than to try to do everything by hand.
It consisted of two classes. Here's an example of how it was used:
// Resulting byte array is 9 bytes long.
byte[] ba = new ByteArrayBuilder()
.writeInt(0xaaaa5555) // 4 bytes
.writeByte(0x55) // 1 byte
.writeShort(0x5A5A) // 2 bytes
.write( (new BitBuilder()) // 2 bytes---0xBA12
.write(3, 5) // 101 (3 bits value of 5)
.write(2, 3) // 11 (2 bits value of 3)
.write(3, 2) // 010 (...)
.write(2, 0) // 00
.write(2, 1) // 01
.write(4, 2) // 0010
).getBytes();
I wrote the ByteArrayBuilder to simply accumulate bits. I used a method chaining pattern (Just returning "this" from all methods) to make it easier to write a bunch of statements together.
All the methods in the ByteArrayBuilder were trivial, just like 1 or 2 lines of code (I just wrote everything to a data output stream)
This is to build a packet, but tearing one apart shouldn't be any harder.
The only interesting method in BitBuilder is this one:
public BitBuilder write(int bitCount, int value) {
int bitMask=0xffffffff;
bitMask <<= bitCount; // If bitcount is 4, bitmask is now fffffff0
bitMask = ~bitMask; // and now it's 0000000f, a great mask
bitRegister <<= bitCount; // make room
bitRegister |= (value & bitMask); // or in the value (masked for safety)
bitsWritten += bitCount;
return this;
}
Again, the logic could be inverted very easily to read a packet instead of build one.
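To give an idea of what the inverted logic might look like, here is a sketch of a reader for the same kind of bit fields. This is not the original tool: the BitReader class is invented for illustration, and it assumes the packed bits fit in a long and that fields are read most significant first, in the order they were written.

```java
final class BitReader {
    private final long register;  // the packed bits, right-aligned
    private int bitsLeft;         // number of unread bits remaining

    BitReader(long register, int totalBits) {
        this.register = register;
        this.bitsLeft = totalBits;
    }

    // Returns the next 'bitCount' bits (1..31) as an unsigned value.
    int read(int bitCount) {
        if (bitCount < 1 || bitCount > 31 || bitCount > bitsLeft) {
            throw new IllegalArgumentException("cannot read " + bitCount + " bits");
        }
        bitsLeft -= bitCount;                 // the field now ends at bit 'bitsLeft'
        int mask = (1 << bitCount) - 1;       // e.g. bitCount 4 gives 0xF
        return ((int) (register >>> bitsLeft)) & mask;
    }

    public static void main(String[] args) {
        // 0xBA12 is the 16-bit value built in the BitBuilder example above.
        BitReader r = new BitReader(0xBA12L, 16);
        System.out.println(r.read(3) + " " + r.read(2) + " " + r.read(3) + " "
                + r.read(2) + " " + r.read(2) + " " + r.read(4));
    }
}
```

Reading the fields back with the same widths used when writing (3, 2, 3, 2, 2, 4) recovers the original values.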
edit: I had proposed a different approach in this answer, I'm going to post it as a separate answer because it's completely different.
Look at the Javolution library and its struct classes, they will do just what you are asking for. In fact, the author has this exact example, using the Javolution Struct classes to manipulate UDP packets.
This is an alternate proposal for an answer I left above. I suggest you consider implementing it because it would act pretty much the same as a C solution where you could pick fields out of a packet by name.
You might start it out with an external text file something like this:
OneByte, 1
OneBit, .1
TenBits, .10
AlsoTenBits, 1.2
SignedInt, +4
It could specify the entire structure of a packet, including fields that may repeat. The language could be as simple or complicated as you need--
You'd create an object like this:
PacketReader packetReader = new PacketReader("PacketStructure.txt", packet); // packet is the received byte[]
Your constructor would iterate over the PacketStructure.txt file and store each string as the key of a hashtable, and the exact location of its data (both bit offset and size) as the data.
Once you created an object, passing in the bitStructure and a packet, you could randomly access the data with statements as straight-forward as:
int x=packetReader.getInt("AlsoTenBits");
Also note, this stuff would be much less efficient than a C struct, but not as much as you might think--it's still probably many times more efficient than you'll need. If done right, the specification file would only be parsed once, so you would only take the minor hit of a single hash lookup and a few binary operations for each value you read from the packet--not bad at all.
The exception is if you are parsing packets from a high-speed continuous stream, and even then I doubt a fast network could flood even a slowish CPU.
Short answer, no you can't do it that easily.
Longer answer, if you can use Serializable objects, you can hook your InputStream up to an ObjectInputStream and use that to deserialize your objects. However, this requires you have some control over the protocol. It also works easier if you use a TCP Socket. If you use a UDP DatagramSocket, you will need to get the data from the packet and then feed that into a ByteArrayInputStream.
If you don't have control over the protocol, you may be able to still use the above deserialization method, but you're probably going to have to implement the readObject() and writeObject() methods rather than using the default implementation given to you. If you need to use someone else's protocol (say because you need to interop with a native program), this is likely the easiest solution you are going to find.
Also, remember that Java uses UTF-16 internally for strings, but I'm not certain that it serializes them that way. Either way, you need to be very careful when passing strings back and forth to non-Java programs.
