I'm trying to have a Java server and C++ clients communicate over TCP in two modes: text mode and binary/encrypted mode. My problem is with the EOF indicator for end of stream that makes DataInputStream's read(byte[]) return -1. If I send binary data, what's to prevent a random byte sequence happening to represent an EOF and falsely indicating to read() that the stream is ending? It seems I'm limited to text mode. I can live with that until I need to scale, but then I have the problem that I am going to encrypt the text and add message authentication. Even if I were sending from another Java program rather than C++, encrypting a string with AES+MAC would produce binary output, not a normal string. What's to prevent some encrypted sequence containing a part identical to an EOF?
So, what are the solutions here?
If I send binary data, what's to prevent a random byte sequence happening to represent an eof and falsely indicating to read() that the stream is ending?
In most cases (including TCP/IP and similar network protocols) there is no specific data representation for an EOF. Rather, EOF is a logical abstraction that means that you have reached the end of the data stream. For example, with a Socket it means that the input side of the socket has been closed and you have read all outstanding bytes. (And for a file, it means that you have read the last bytes of the file.)
Since there is no data representation for the (logical) EOF, you don't need to worry about getting false EOFs. In short, there is no problem to be solved here.
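To make this concrete, here is a minimal sketch that feeds arbitrary bytes, including 0xFF, through an InputStream. A ByteArrayInputStream stands in for the socket stream, and the class and method names are made up for illustration. read() reports data bytes as values 0 to 255, and -1 shows up only once the source is exhausted.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class EofIsNotAByte {
    // Calls read() until end of stream and records every result,
    // including the final -1.
    static List<Integer> readAll(InputStream in) throws IOException {
        List<Integer> results = new ArrayList<>();
        int b;
        do {
            b = in.read();
            results.add(b);
        } while (b != -1);
        return results;
    }

    public static void main(String[] args) throws IOException {
        // Arbitrary binary payload; 0xFF is -1 when viewed as a Java byte,
        // but read() returns it as the int 255.
        byte[] payload = {(byte) 0xFF, 0x00, (byte) 0xFF, 0x04, 0x1A};
        System.out.println(readAll(new ByteArrayInputStream(payload)));
        // prints [255, 0, 255, 4, 26, -1]
    }
}
```

With a real Socket the final -1 would appear only after the peer closes its side of the connection.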
"end of stream" in TCP is normally signaled by closing the socket -- that is what makes the stream actually end. If you don't really want the stream to end, but just to signal the end of a "packet" (to be followed, quite possibly, by other packets on the same connection), you can start each packet with an unencrypted length indicator (say, 2 or 4 bytes depending on your need). DataInputStream, according to its docs, is suitable only to receive streams sent by a DataOutputStream, which appears to have nothing to do with your use case as you describe it.
Usually when using TCP streams you have a data header format which, at a minimum, has a field holding the length of the data that follows, so that the receiver knows exactly how many bytes to expect. A simple example is the TLV (type-length-value) format.
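A rough sketch of such a TLV frame in Java follows. The field widths (1-byte type, 4-byte length), the type constants, and the size limit are assumptions made for this example, not part of any standard. DataOutputStream.writeInt() writes the length big-endian, so a C++ peer would have to use network byte order for that field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class TlvFraming {
    // Hypothetical message types and size limit for this sketch.
    static final int TYPE_TEXT = 1;
    static final int TYPE_ENCRYPTED = 2;
    static final int MAX_LENGTH = 1 << 20;

    // Writes one frame: 1-byte type, 4-byte big-endian length, then the value.
    static void writeFrame(DataOutputStream out, int type, byte[] value) throws IOException {
        out.writeByte(type);
        out.writeInt(value.length);
        out.write(value);
        out.flush();
    }

    // Reads one frame, checks its type, and returns exactly `length` value bytes.
    static byte[] readFrame(DataInputStream in, int expectedType) throws IOException {
        int type = in.readUnsignedByte();
        if (type != expectedType) {
            throw new IOException("unexpected frame type " + type);
        }
        int length = in.readInt();
        if (length < 0 || length > MAX_LENGTH) {
            throw new IOException("bad frame length " + length);
        }
        byte[] value = new byte[length];
        in.readFully(value); // blocks until all `length` bytes have arrived
        return value;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory buffer stands in for the socket in this demo.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeFrame(out, TYPE_ENCRYPTED, new byte[] {(byte) 0xFF, 0x00, (byte) 0xFF});
        writeFrame(out, TYPE_TEXT, "hello".getBytes("US-ASCII"));

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readFrame(in, TYPE_ENCRYPTED).length); // prints 3
        System.out.println(new String(readFrame(in, TYPE_TEXT), "US-ASCII")); // prints hello
    }
}
```

Because the receiver knows the length up front, the value can contain any byte pattern, so encrypted data needs no escaping.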
As Thomas Pornin replied to Alex Martelli, DataInputStream is used even on data not sent by DataOutputStream or by Java. My question is about the consequences of, as the documentation says, DataInputStream's read() returning when the stream ends. That is, is there some sequence of bytes that read() interprets as the end of the stream, so that I cannot use it if there is any possibility of that sequence occurring in the data I'm sending, as there is when I send generic binary data?
My problem is over the eof indicator for end of stream that DataInputStream's read(byte []) uses to return with -1.
No it isn't. This problem is imaginary. -1 is the return code of InputStream.read() that indicates that the peer has closed the connection. It has nothing whatsoever to do with the data being sent over the connection.
I am creating a simple TCP server in Java which accepts clients. Each client that connects to the server sends a message, and the server sends an answer based on the message.
In order to accept clients I am using the ServerSocket class. In order to read the client's message and write back to the client, I am allowed to use only the Socket, DataOutputStream and DataInputStream classes, plus StandardCharsets.UTF_8. In addition, every message that the server receives and sends must be in UTF-8 encoding.
However, I am not sure how to read and write messages in UTF-8 encoding using those classes. The size of the messages is unbounded. I tried to read about the read function of the DataInputStream class, but I couldn't understand how to use it (if this is indeed the function I need to use).
DataOutputStream has a writeUTF() method, and DataInputStream has a readUTF() method. Just note that these methods deal in modified UTF-8 rather than standard UTF-8, and they limit the UTF-8 data to 65535 bytes max.
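A small sketch of both caveats, using an in-memory buffer instead of a socket (the class and helper names are invented for the demo): writeUTF() prefixes a 2-byte length and encodes U+0000 as two bytes, and it throws UTFDataFormatException once the encoded form exceeds 65535 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ModifiedUtf8Demo {
    // Returns exactly what writeUTF puts on the wire for the given string.
    static byte[] writeUtfBytes(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeUTF(s);
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Modified UTF-8: 2-byte length (00 02), then U+0000 encoded as C0 80.
        System.out.println(Arrays.toString(writeUtfBytes("\u0000")));
        // prints [0, 2, -64, -128]

        // Standard UTF-8 encodes U+0000 as a single zero byte.
        System.out.println(Arrays.toString("\u0000".getBytes(StandardCharsets.UTF_8)));
        // prints [0]

        // More than 65535 encoded bytes does not fit the 2-byte length prefix.
        char[] big = new char[70000];
        Arrays.fill(big, 'a');
        try {
            writeUtfBytes(new String(big));
        } catch (UTFDataFormatException e) {
            System.out.println("too long for writeUTF");
        }
    }
}
```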
To provide interoperability with non-Java clients, and/or handle longer strings, it would be better to just handle the UTF-8 yourself manually.
The sender can use String.getBytes() to encode a String to a byte[] array using StandardCharsets.UTF_8, then send the array's length using DataOutputStream.writeInt(), and then finally send the actual bytes using DataOutputStream.write(byte[]).
The receiver can then read the array length using DataInputStream.readInt(), then allocate a byte[] array and read it using DataInputStream.readFully(), and then finally use the String constructor to decode the bytes using StandardCharsets.UTF_8.
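Put together, the two steps above might look like the following sketch. The class and method names are made up, and the check for a negative length is an addition for safety rather than something the classes require.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class Utf8Messages {
    // Sender: 4-byte length prefix, then the standard UTF-8 bytes.
    static void send(DataOutputStream out, String message) throws IOException {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
        out.flush();
    }

    // Receiver: read the length, then exactly that many bytes, then decode.
    static String receive(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative length: " + length);
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // In a real server these streams would wrap socket.getOutputStream()
        // and socket.getInputStream(); a byte array stands in for the network here.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        send(new DataOutputStream(wire), "h\u00E9llo w\u00F6rld");
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(receive(in));
    }
}
```

Note that the length is counted in bytes, not in chars, which matters as soon as the text contains non-ASCII characters.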
Right now, I'm trying to write a GUI based Java tic-tac-toe game that functions over a network connection. It essentially works at this point, however I have an intermittent error in which several chars sent over the network connection are lost during gameplay. One case looked like this, when println statements were added to message sends/reads:
Player 1:
Just sent ROW 14 COLUMN 11 GAMEOVER true
Player 2:
Just received ROW 14 COLUMN 11 GAMEOV
I'm pretty sure the error is happening when I read over the network. The read takes place in its own thread, with a BufferedReader wrapped around the socket's InputStream, and looks like this:
try {
    int input;
    while ((input = dataIn.read()) != -1) {
        char msgChar = (char) input;
        String message = msgChar + "";
        while (dataIn.ready()) {
            msgChar = (char) dataIn.read();
            message += msgChar;
        }
        System.out.println("Just received " + message);
        this.processMessage(message);
    }
    this.sock.close();
}
My sendMessage method is pretty simple, (just a write over a DataOutputStream wrapped around the socket's outputstream) so I don't think the problem is happening there:
try {
    dataOut.writeBytes(message);
    System.out.println("Just sent " + message);
}
Any thoughts would be highly appreciated. Thanks!
As it turns out, the ready() method guarantees only that the next read WON'T block when it returns true. Consequently, !ready() does not guarantee that the next read WILL block. Just that it could.
I believe that the problem here had to do with TCP itself. Being stream-oriented, TCP guarantees that bytes arrive in the order they were sent, but it makes no guarantees about how the bytes written to the socket are grouped for delivery. I suspect that the TCP stack was splitting the sent string into pieces as it saw fit, and that when the first piece had been consumed and the rest had not yet arrived, ready() returned false even though more of the message was still on its way.
I refactored the code to add a newline character to every message send, then simply performed a readLine() instead. This allowed my network protocol to be dependent on the newline character as a message delimiter, rather than the ready() method. I'm happy to say this fixed the problem.
Thanks for all your input!
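For anyone landing here later, the newline-delimited approach described above can be sketched roughly like this. The class and method names are invented, and a StringWriter/StringReader pair stands in for the socket streams.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

class LineProtocol {
    // Sender: every message is terminated by a newline, then flushed.
    static void sendMessage(Writer out, String message) throws IOException {
        out.write(message);
        out.write('\n');
        out.flush();
    }

    // Receiver: readLine() blocks until a full line has arrived, however the
    // bytes were split in transit, and returns null at end of stream.
    static List<String> readMessages(BufferedReader in) throws IOException {
        List<String> messages = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            messages.add(line);
        }
        return messages;
    }

    public static void main(String[] args) throws IOException {
        StringWriter wire = new StringWriter();
        sendMessage(wire, "ROW 14 COLUMN 11 GAMEOVER true");
        sendMessage(wire, "ROW 3 COLUMN 7 GAMEOVER false");
        BufferedReader in = new BufferedReader(new StringReader(wire.toString()));
        for (String message : readMessages(in)) {
            System.out.println("Just received " + message);
        }
    }
}
```

This only works as long as the messages themselves never contain a newline.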
Try flushing the OutputStream on the sender side. The last bytes might remain in some internal buffers.
It really matters which stream classes you use to handle the data. It seems to me that this trouble is caused by the fact that you use DataOutputStream for sending but something else for receiving. Try to send and receive with DataOutputStream and DataInputStream respectively.
As a matter of fact, if you send something by calling dataOut.writeBoolean(b) but try to receive it by calling dataIn.readUTF(), you will not get back what you sent. DataInputStream and DataOutputStream are type-sensitive: each read method has to match the write method used on the other side. Try to refactor your code keeping that in mind.
Moreover, the read() method of an input stream returns a single byte. Here you convert that single value into a char, while in Java a char is two bytes wide:
msgChar = (char)dataIn.read();
Check whether this is a cause of the data loss.
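On the byte-versus-char point, one thing that can be checked on the sender side: DataOutputStream.writeBytes() writes only the low eight bits of each char. The sketch below (names invented for the demo) shows that plain ASCII messages like the ones in the question survive unchanged, while other characters do not.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class WriteBytesDemo {
    // Returns the bytes that writeBytes() produces for the given string.
    static byte[] viaWriteBytes(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeBytes(s);
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // ASCII text: one byte per char, identical to the US-ASCII encoding.
        System.out.println(viaWriteBytes("ROW 14 COLUMN 11").length); // prints 16

        // U+20AC (euro sign): writeBytes keeps only the low byte 0xAC,
        // whereas UTF-8 needs three bytes for it.
        System.out.println(viaWriteBytes("\u20AC").length); // prints 1
        System.out.println("\u20AC".getBytes(StandardCharsets.UTF_8).length); // prints 3
    }
}
```

So for the ASCII-only messages shown in the question, this is unlikely to be what truncates them.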
I am trying to transfer a text file to another server using TCP and it is behaving differently than expected. The code sending the data is:
System.out.println("sending file name...");
String outputFileNameWithDelimiter = outputFileName + "\r\n"; //These 4 lines send the fileName with the delimiter
byte[] fileNameData = outputFileNameWithDelimiter.getBytes("US-ASCII");
outToCompression.write(fileNameData, 0, fileNameData.length);
outToCompression.flush();
System.out.println("sending content...");
System.out.println(new String(buffer, dataBegin, dataEnd-dataBegin));
outToCompression.write(buffer, dataBegin, dataEnd-dataBegin); //send the content
outToCompression.flush();
System.out.println("sending magic String...");
byte[] magicStringData = "--------MagicStringCSE283Miami".getBytes("US-ASCII"); //sends the magic string to tell Compression server the data being sent is done
outToCompression.write(magicStringData, 0, magicStringData.length);
outToCompression.flush();
Because this is TCP and you can't send discrete packets like in UDP, I expected all of the data to be in the input stream and I could just use delimiters to separate the file name, content, and ending string and then each in.read() would just give me the next subsequent amount of data.
Instead this is the data I am getting on each read:
On the first in.read() byteBuffer appears to only have "fileName\r\n".
On the second in.read() byteBuffer still has the same information.
On the third in.read() byteBuffer now holds the content I sent.
On the fourth in.read() byteBuffer holds the content I sent minus a few letters.
On the fifth in.read() I get the magicString + part of the message.
I am flushing on every send from the Webserver, but input streams don't seem to implement Flushable.
Can anyone explain why this is happening?
EDIT:
This is how I am reading things in. Basically this runs in a loop, and then I write to a file.
in.read(byteBuffer, 0, BUFSIZE);
If your expectation is that read() will fill the buffer, or receive exactly what was sent by a single write() by the peer, it is your expectation that is at fault here, not read(). It isn't specified to transfer more than one byte at a time, and there is no guarantee about preserving write boundaries.
It is quite impossible to write correct code without storing the result of read() into a variable.
When you read from an InputStream, you're giving it a byte array to write into (and optionally an offset and a maximum amount to read). InputStream makes no guarantees that the array will be filled with fresh data. The return value is the number of bytes that was actually read into it.
What's happening in your example is this:
The first TCP packet comes in with "fileName\r\n", gets written into your buffer, everything fine so far.
You call read() again, and the buffer still contains "fileName\r\n". I first assumed that read() had returned 0 because the next packet hadn't arrived yet and it didn't want to block. Edit: as pointed out, read() always blocks until it reads at least one byte, so that can't be the explanation. I don't really know why the buffer didn't change then.
On the third read, the content has arrived.
The first bit of the content gets overwritten with the second part of the message, the last bit still contains part of the old message (I think that's what you meant).
etc., you get the idea
You need to check the return value, wait for the data to arrive, and only use as much of the buffer as was written by the last read().
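A minimal sketch of such a loop is below. The class name, the trickle() helper (which imitates a socket that delivers data in small pieces) and the buffer size are all made up for the demo. The key point is that only the first n bytes of the buffer are used after each read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadLoop {
    // Collects everything up to end of stream, trusting only the byte count
    // that each read() call reports.
    static byte[] readAll(InputStream in, int bufSize) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufSize];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            collected.write(buffer, 0, n); // never touch buffer[n..] here
        }
        return collected.toByteArray();
    }

    // Demo helper: a stream that hands out at most `chunk` bytes per read.
    static InputStream trickle(byte[] data, int chunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        String magic = "--------MagicStringCSE283Miami";
        byte[] sent = ("fileName\r\nsome content" + magic).getBytes(StandardCharsets.US_ASCII);

        // Delimiters are searched for in the accumulated data, not per read().
        String all = new String(readAll(trickle(sent, 7), 1024), StandardCharsets.US_ASCII);
        int nameEnd = all.indexOf("\r\n");
        int contentEnd = all.indexOf(magic);
        System.out.println(all.substring(0, nameEnd)); // prints fileName
        System.out.println(all.substring(nameEnd + 2, contentEnd)); // prints some content
    }
}
```

Reading to end of stream only suits the case where the sender closes the connection afterwards; otherwise the loop would stop once the magic string has been seen.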
Is there another class in Java, like BufferedReader, for reading data received from a server?
The server sends a string without a newline, and the client doesn't print anything until the server closes the connection on timeout (the timeout message does have a newline). Only then does the client print all the received messages together with the timeout message sent by the server.
Help me, thanks!
Don't read line by line with the readLine() method; read char by char with the read() method instead.
for (int c = 0; (c = reader.read()) > -1;) {
    System.out.print((char) c);
}
You asked for another class to use, so in that case give Scanner a try. It's usually used for delimiting input based on patterns or by the types inferred from the input (e.g. reading on a byte-by-byte basis or an int-by-int basis, or some combination thereof). However, you can use it as just a general-purpose "reader" here as well, to cover your use case.
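As a rough illustration of the delimiter idea, here is a sketch that assumes the server separates its messages with ';' (a made-up convention for this example, as are the class and method names). A StringReader stands in for the socket; with a real connection the Scanner would wrap the socket's InputStream, and next() would still block until a delimiter or the end of the stream arrives.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerDelimiterDemo {
    // Splits the incoming text on the given delimiter pattern.
    static List<String> readTokens(Readable source, String delimiter) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(source).useDelimiter(delimiter)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(readTokens(new StringReader("HELLO;WAIT;TIMEOUT"), ";"));
        // prints [HELLO, WAIT, TIMEOUT]
    }
}
```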
When you read anything from a server, you have to strictly follow the communication protocol. For example the server might be an HTTP server or an SMTP server, it may encrypt the data before sending it, some of the data may be encoded differently, and so on.
So you should basically ask: What kind of server do I want to access? How does it send the bytes to me? And has someone else already done the work of interpreting the bytes so that I can quickly get to the data that I really want?
If it is an HTTP server, you can use the code new URL("http://example.org/").openStream(). Then you will get a stream of bytes. How you convert these bytes into characters, strings and other stuff is another task.
You could try
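InputStream is = ... // from input
String text = IOUtils.toString(is, StandardCharsets.UTF_8); // IOUtils is from Apache Commons IO; UTF-8 is assumed here
This turns the input into text without depending on newlines (it preserves the original newlines as well). Note that it reads until the end of the stream.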
(Forgive me because I do not write in Java very often.)
I'm writing a client-side network application in Java and I'm having an interesting issue. Every call to readInt() throws an EOFException. The variable is of type DataInputStream (initialized as: DataInputStream din = new DataInputStream(new BufferedInputStream(sock.getInputStream())); where sock is of type Socket).
Now, sock.isInputShutdown() returns false and sock.isConnected() returns true. I'm assuming that this means that I have a valid connection to the other machine I'm connecting to. I've also performed other checks to ensure that I'm properly connected to the other machine.
Is it possible that the DataInputStream was not set up correctly? Are there any preconditions that I have missed?
Any help is greatly appreciated.
@tofubeer: I actually wrote 17 bytes to the socket. The socket is connected to another machine and I'm waiting on input from that machine (I'm sorry if this was unclear). I successfully read from the stream (to initiate a handshake) first and this worked just fine. I'm checking now to see if my sent requests are malformed, but I don't think they are. Also, I tried reading a single byte from the stream (via read()) and it returned -1.
Are you writing 4 bytes to the socket? According to the JavaDoc it will throw an EOFException if this stream reaches the end before reading all the bytes.
Try calling readByte() 4 times in a row instead of readInt() and see what happens (likely not all of them will work).
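To illustrate the JavaDoc behavior mentioned above, here is a small sketch with an in-memory stream (the class and helper names are made up): readInt() needs four bytes and throws EOFException if the stream ends before it gets them.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

class ReadIntEof {
    // Returns the int read from the given bytes, or null if the stream
    // ended before four bytes could be read.
    static Integer tryReadInt(byte[] available) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(available));
        try {
            return in.readInt();
        } catch (EOFException e) {
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tryReadInt(new byte[] {0, 0, 1}));    // prints null
        System.out.println(tryReadInt(new byte[] {0, 0, 1, 2})); // prints 258
    }
}
```

On a socket, the end of the stream is only reached once the peer has closed its side, so an EOFException there suggests the other machine closed the connection.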
Edit (given your edit).
Find out how many times you can call read() before you get the -1.
When read() returns -1 it means that it has hit the end of file.
Also find out what each read() returns to make sure what you are reading in is what you actually wrote out.
It sounds like a problem either with the read code reading more than you think while doing the handshake, or with the other side not writing what you think it is writing.
Some things to check:
Did the handshake consume more than 13 bytes, leaving fewer than four for the readInt()?
Was the integer you want to read written via DataOutputStream.writeInt()?
Did you flush the stream from the sender?
Edit: I took a look at the Java sources (I have the 1.4 sources on my desktop, not sure which version you're using) and the problem might be in BufferedInputStream. DataInputStream.readInt() is just calling BufferedInputStream.read() four times. BufferedInputStream.read() is calling BufferedInputStream.fill() if its buffer is exhausted (e.g., if its first read only got 16 bytes). BufferedInputStream.fill() calls the underlying InputStream's read(byte[], int, int) method, which by contract might not actually read anything! If this happens, BufferedInputStream.read() will return an erroneous EOF.
This is all assuming that I'm reading all of this correctly, which might not be the case. I only took a quick peek at the sources.
I suspect that your BufferedInputStream is only getting the first 16 bytes of the stream in its first read. I'd be curious what your DataInputStream's available() returns right before the readInt. If you're not already, I'd suggest you flush your OutputStream after writing the int you can't read as a possible workaround.