InputStream's available() tends to halt - java

This is a bit of an obscure problem that only seems to happen when I'm on certain computers.
I was having this issue today on our school's XP computers and I can't seem to replicate this on my home computer (W7).
Anyway, reading/writing to sockets in Java tends to be problematic whenever I use this code (where: int avail, int read, InputStream input, byte[] buffer, String output):
while( (avail = input.available()) > 0 )
{
read = input.read( buffer );
output += new String( buffer, 0, read );
}
It seems to make sense (reading all the data until no data is available to a temporary buffer, then to a string), but on our school computers (testing it using IE7), the whole thing somehow pauses. I'm thinking input.available() is causing it to somehow block because the thread just keeps running without ever reaching an endpoint... effectively just pausing somewhere.
OH, I forgot to mention: whenever I run this in debug mode and perform each line step-by-step, it works completely like it should... which just confuses me even more.
When I got home to replicate this issue, it works just fine (just using Firefox and IE8). I have no idea what would be a better alternative to this.
PS:
If the buffer is large enough and I just use:
read = input.read( buffer );
output += new String( buffer, 0, read );
It works just fine, but there's always a worry that the data sent will exceed the buffer size.

You're thinking about available() the wrong way. That method tells you approximately how many bytes can be read right now, without blocking. The commonly accepted idiom for what you're trying to do is
int length;
while ((length = in.read(buffer)) != -1) {
output += new String(buffer, 0, length);
}
or something along those lines (not compiled/tested).
Update: I think you misunderstand the concept of "end of the stream". "End of the stream" doesn't mean that all the data you want to read has been read. It means that there isn't, and won't ever be, anything else to read. For instance, it might mean that you were reading a file and have come to the end of it, or it might mean you were reading from an in-memory byte array and came to the end of that. Those are "end of streams".
In your question, you indicated, or at least implied, that you're reading from a Socket. Are you aware that you'll never get to the end of that stream until the associated Socket or the remote end of the connection is closed? Just because you received a bit of data from it doesn't make it the end of the stream.

Why not use a buffered reader? Something like:
BufferedReader reader = new BufferedReader(new InputStreamReader(input));
String output = "";
try {
String readLine = null;
while ((readLine = reader.readLine()) != null) {
output += readLine + "\n";
}
} catch (IOException e) {
System.err.println("Error: " + e);
}
System.out.println("Read from Socket:" + output);

Your code is invalid. This is a misuse of available(). All it does is tell you how many bytes may be available for reading without blocking. It cannot be used to indicate how many bytes will ever be sent by the peer, and it has no necessary relationship with peer messages. There are no messages in TCP, only a byte stream. If you want to read to EOS, just remove the available() test and read until it returns -1. If you want to read a message, the peer will have to delimit it for you somehow, e.g. by an out-of-band terminator, a length word prefix, or a self-describing protocol such as Object Serialization or XML.
It 'works' in debug mode because you are radically changing the timing with breakpoints. This is further proof that what you are doing is incorrect.
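To make the "length word prefix" option concrete, here is a minimal sketch, assuming both peers agree on a 4-byte big-endian length followed by a UTF-8 payload. The class and method names (LengthPrefixedFraming, writeMessage, readMessage) are made up for illustration, and the byte-array streams in main stand in for the socket's streams:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: frames each message as <int length><payload bytes>.
final class LengthPrefixedFraming {

    // Writes one message: a 4-byte big-endian length, then the UTF-8 payload.
    static void writeMessage(DataOutputStream out, String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
    }

    // Reads exactly one message. readFully() blocks until the whole payload has
    // arrived and throws EOFException if the stream ends first.
    static String readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for socket.getOutputStream() / socket.getInputStream().
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeMessage(out, "hello");
        writeMessage(out, "world");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readMessage(in));
        System.out.println(readMessage(in));
    }
}
```

With this kind of framing the reader never needs available(): it always knows how many bytes belong to the current message. A real protocol would probably also want an upper bound on the length before allocating the array.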

Related

Java NIO ByteBuffer, write after flip

I'm new to Java ByteBuffers and was wondering what the correct way to write to a ByteBuffer after it has been flipped.
In my use case, I am writing an outputBuffer to a socket:
outBuffer.flip();
//Non-blocking SocketChannel
int bytesWritten = getSocket().write(outBuffer);
After this, the output buffer has to be written to again. Also not all of the bytes in the outBuffer may have been written to the socket.
Since it is currently flipped, how can I make it writable again, without overriding any data if it is still in the buffer and wasn't written to the socket?
If I am right, outBuffer.position() == bytesWritten and limit should be at how much data there was to write.
So would using the following in order to reuse the output buffer be right? :
int limit = outBuffer.limit();
outBuffer.limit(outBuffer.capacity());
outBuffer.position(limit);
Again from the API spec.:
The following loop copies bytes from one channel to another via the buffer buf:
while (in.read(buf) >= 0 || buf.position() != 0) {
buf.flip();
out.write(buf);
buf.compact(); // In case of partial write
}
since it is currently flipped
It will stay flipped. The write doesn't change that.
how can I make it writable again, without overriding any data if it is still in the buffer and wasn't written to the socket?
You don't have to do anything, but if you want to read before you write again you should do flip/write/compact. If you just want to repeat the write just call write() again, with the buffer still in its current state.
But I prefer to always keep these buffers ready for reading, so there is no possibility of a slip-up, and to flip/write/compact (or flip/get/compact) when those operations are necessary, atomically as it were.
Note that you should not use clear(), unless you are certain that the write was complete and the buffer is now empty. In that case compact and clear are equivalent. But it is simpler to just always compact.
If you're copying in blocking mode, use the loop quoted by @zlakad.
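As a rough sketch of the flip/write/compact habit described above: the wrapper below always leaves the buffer ready for further put() calls, and a partial write simply leaves the unwritten bytes at the front. OutboundBuffer, enqueue, flushTo and pending are hypothetical names, and the throttled channel in main is only a stand-in for a non-blocking SocketChannel that accepts a few bytes at a time:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper: between calls the buffer is always in "fill" mode
// (position = end of pending data, limit = capacity).
final class OutboundBuffer {
    private final ByteBuffer buffer;

    OutboundBuffer(int capacity) {
        buffer = ByteBuffer.allocate(capacity);
    }

    // Appends data after whatever is still pending
    // (throws BufferOverflowException if it does not fit).
    void enqueue(byte[] data) {
        buffer.put(data);
    }

    // Attempts one write. Bytes the channel did not accept are kept for next time.
    int flushTo(WritableByteChannel channel) throws IOException {
        buffer.flip();            // switch to "drain" mode
        try {
            return channel.write(buffer);
        } finally {
            buffer.compact();     // move leftovers to the front, back to "fill" mode
        }
    }

    // Number of bytes still waiting to be written.
    int pending() {
        return buffer.position();
    }

    public static void main(String[] args) throws IOException {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Stand-in channel that accepts at most 3 bytes per call.
        WritableByteChannel slow = new WritableByteChannel() {
            public int write(ByteBuffer src) {
                int n = Math.min(3, src.remaining());
                for (int i = 0; i < n; i++) sink.write(src.get());
                return n;
            }
            public boolean isOpen() { return true; }
            public void close() { }
        };

        OutboundBuffer out = new OutboundBuffer(64);
        out.enqueue("abcde".getBytes(StandardCharsets.US_ASCII));
        out.flushTo(slow);                       // partial write
        out.enqueue("fg".getBytes(StandardCharsets.US_ASCII));
        while (out.pending() > 0) out.flushTo(slow);
        System.out.println(new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

In a real selector loop you would not spin on pending() like main does; you would register interest in OP_WRITE and call flushTo when the channel becomes writable.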

Last few chars in a string sent over socket sometimes missing in Java network program

Right now, I'm trying to write a GUI based Java tic-tac-toe game that functions over a network connection. It essentially works at this point, however I have an intermittent error in which several chars sent over the network connection are lost during gameplay. One case looked like this, when println statements were added to message sends/reads:
Player 1:
Just sent ROW 14 COLUMN 11 GAMEOVER true
Player 2:
Just received ROW 14 COLUMN 11 GAMEOV
I'm pretty sure the error is happening when I read over the network. The read takes place in its own thread, with a BufferedReader wrapped around the socket's InputStream, and looks like this:
try {
int input;
while((input = dataIn.read()) != -1 ){
char msgChar = (char)input;
String message = msgChar + "";
while(dataIn.ready()){
msgChar = (char)dataIn.read();
message+= msgChar;
}
System.out.println("Just received " + message);
this.processMessage(message);
}
this.sock.close();
}
My sendMessage method is pretty simple, (just a write over a DataOutputStream wrapped around the socket's outputstream) so I don't think the problem is happening there:
try {
dataOut.writeBytes(message);
System.out.println("Just sent " + message);
}
Any thoughts would be highly appreciated. Thanks!
As it turns out, ready() returning true guarantees only that the next read WON'T block. Consequently, !ready() does not guarantee that the next read WILL block, just that it could.
I believe that the problem here had to do with the TCP stack itself. Being stream-oriented, TCP preserves the order of the bytes written to the socket but makes no guarantees about how they are grouped on the wire. I suspect that the TCP stack was breaking up the sent string in a way that made sense to it, and that ready() returned false in the gap between two pieces simply because the rest of the message had not arrived yet, even though more was on its way.
I refactored the code to add a newline character to every message send, then simply performed a readLine() instead. This allowed my network protocol to be dependent on the newline character as a message delimiter, rather than the ready() method. I'm happy to say this fixed the problem.
Thanks for all your input!
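For anyone landing here later, a minimal sketch of the newline-delimited approach described above. LineProtocol, sendMessage and receiveMessage are illustrative names, the byte-array streams stand in for the socket's streams, and it assumes a message never contains a newline itself:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: one message per line, '\n' is the delimiter.
final class LineProtocol {

    // Sends one message terminated by a newline and flushes it out immediately.
    static void sendMessage(Writer out, String message) throws IOException {
        out.write(message);
        out.write('\n');
        out.flush();
    }

    // Blocks until a whole line has arrived; returns null once the peer has closed.
    static String receiveMessage(BufferedReader in) throws IOException {
        return in.readLine();
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the two ends of a socket.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        Writer out = new OutputStreamWriter(wire, StandardCharsets.UTF_8);
        sendMessage(out, "ROW 14 COLUMN 11 GAMEOVER true");

        BufferedReader in = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(wire.toByteArray()), StandardCharsets.UTF_8));
        String message;
        while ((message = receiveMessage(in)) != null) {
            System.out.println("Just received " + message);
        }
    }
}
```

The receive loop never consults ready(); the message boundary comes from the protocol, not from the timing of the bytes.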
Try flushing the OutputStream on the sender side. The last bytes might remain in some internal buffers.
It really matters which stream classes you use to handle the data. It seems to me that this trouble is caused by the fact that you use DataOutputStream for sending but something else for receiving. Try to send and receive with DataOutputStream and DataInputStream respectively.
As a matter of fact, if you send something by calling dataOut.writeBoolean(b)
but try to receive it by calling dataIn.readUTF(), you will not get back what you sent. DataInputStream and DataOutputStream are type-sensitive. Try to refactor your code with that in mind.
Moreover, some input streams return a single byte from read(). Here you cast that single byte to a char, while in Java a char is two bytes wide.
msgChar = (char)dataIn.read();
Check whether this is the cause of the data loss.

Java.net Which goes fastest when you parse html from online?

Using java.net, java.io, what is the fastest way to parse html from online, and load it to a file or the console? Is buffered writer/buffered reader faster than inputstreamreader/outputstreamwriter? Are writers and readers faster than outputstreams and inputstreams?
I am experiencing serious lag with the following output writer/stream:
URLConnection ii;
BufferedReader iik = new BufferedReader(new InputStreamReader(ii.getInputStream()));
String op;
while(iik.readLine()!=null) {
op=iik.readLine();
System.out.println(op);
}
But curiously I am experiencing close to no lag with the following code:
URLConnection ii=i.openConnection();
Reader xh=new InputStreamReader(ii.getInputStream());
int r;
Writer xy=new PrintWriter(System.out);
while((r=xh.read())!=-1) {
xy.write(r);
}
xh.close();
xy.close();
What is going on here?
Your first snippet is wrong: it reads the next line, tests if it's null, ignores it, then reads the next line without testing if it's null, and prints it.
The second snippet writes each character as soon as it is read; Writer.write(int) writes a single character, not its integer value.
Both snippets use the same underlying streams and readers, and, if coded correctly, the first one should probably be a bit faster thanks to buffering. But of course, you'll have something printed on the screen only when the line is ended. If the server sends a single line of text of 10 MBs, you'll have to read the whole 10 MBs before something is printed to the screen.
Make sure to close the readers in finally blocks.
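A corrected version of the first loop might look like the sketch below: readLine() is called exactly once per iteration, and try-with-resources takes care of closing the reader. LineEcho and copyLines are made-up names, and the StringReader in main is a stand-in for new InputStreamReader(ii.getInputStream()):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper that copies a character stream line by line.
final class LineEcho {

    // Copies every line from source to sink and returns the number of lines copied.
    static int copyLines(Reader source, PrintWriter sink) throws IOException {
        int count = 0;
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // read once, test for null, then use that same line
            while ((line = reader.readLine()) != null) {
                sink.println(line);
                count++;
            }
        }
        sink.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        Reader page = new StringReader("<html>\n<body>hi</body>\n</html>");
        int lines = copyLines(page, new PrintWriter(System.out));
        System.out.println(lines + " lines copied");
    }
}
```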
Readers/Writers shouldn't be inherently faster than Input/OutputStreams.
That said, going through readLine() and println() probably isn't the optimal way of transferring bytes. In your case, if the file you're loading doesn't contain many newline characters, BufferedReader will have to buffer a lot of data before readLine() will return.
The canonical non-terrible way of transferring data between streams is doing it in chunks by using a buffer:
byte[] buf = new byte[1<<12];
InputStream in = urlConnection.getInputStream();
int read = -1;
while ((read = in.read(buf)) != -1) {
System.out.write(buf, 0, read);
}
It might be faster yet to use NIO; the code for it is a little less straightforward, and I just use the one found in this blog post.
If you're writing to/from a file, the best method is to use a zero-copy approach, which Java makes available with FileChannel.transferFrom() and transferTo(). Sample code is available in a DeveloperWorks article.
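A hedged sketch of what the FileChannel.transferFrom() route can look like when the source is a plain InputStream (for example urlConnection.getInputStream()). ChannelDownload and saveTo are hypothetical names. Note that when the source is not itself a file channel, the JDK still copies through an intermediate buffer, so this is tidy rather than truly zero-copy:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper that drains a stream into a file via FileChannel.transferFrom().
final class ChannelDownload {

    // Writes everything from 'in' to 'target' and returns the number of bytes stored.
    static long saveTo(InputStream in, Path target) throws IOException {
        try (ReadableByteChannel source = Channels.newChannel(in);
             FileChannel file = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // transferFrom may move fewer bytes than requested, so keep going;
            // with a blocking source, 0 means the end of the stream was reached.
            while ((transferred = file.transferFrom(source, position, 1 << 20)) > 0) {
                position += transferred;
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("page", ".html");
        byte[] body = "<html><body>hello</body></html>".getBytes();
        long stored = saveTo(new ByteArrayInputStream(body), target);
        System.out.println(stored + " bytes written to " + target);
        Files.delete(target);
    }
}
```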

Reading socket one byte a time, how to change this, optimisation

I need some help about optimisation. I am trying to improve this open-source game server made with JAVA. Each player has its own thread, and each thread goes something like this:
BufferedReader _in = new BufferedReader(new InputStreamReader(_socket.getInputStream()));
String packet = "";
char charCur[] = new char[1];
while(_in.read(charCur, 0, 1)!=-1)
{
if (charCur[0] != '\u0000' && charCur[0] != '\n' && charCur[0] != '\r')
{
packet += charCur[0];
}else if(!packet.isEmpty())
{
parsePlayerPacket(packet);
packet = "";
}
}
I have been told so many times that this code is stupid, and I agree because when profiling it I see that reading each byte and appending it using packet += "" is just stupid and slow.
I want to improve this but I don't know how. I'm sure I can find something, but I'm afraid it will be even slower than this, because I have to split packets based on the '\u0000', '\n', or '\r' to parse them. And I know that splitting 3 times is very slow.
Can someone give me an idea? Or a piece of code for this? It will make my day.
If you're going to explain, please, please use very simple words, with code examples; I'm just a Java beginner. Thanks.
There is no significant performance issue with reading from a BufferedReader either in large chunks, or even one character at a time.
Unless your profiling has identified the BufferedReader.read() method as a specific hotspot in your code, the best thing you can do is make the code simple and readable, and not spend time optimizing it.
For your particular case:
yes that code is a bit lame, but
no it is unlikely to make a lot of difference from the performance perspective.
The real performance bottleneck is most likely the network itself. There are application level things that you can do to address this, but ultimately you can only send / receive data at a rate that the end-to-end network connection will support.
My profiling result is saying that it's coming from: BufferedReader.read(). What does it mean really?
Are you sure that the time is really being spent in the Socket's read method? If it is, then the real issue is that your application threads are spending lots of time waiting for network packets to arrive. If that is the case, then the only thing you could do is to reduce the number of client and server side flushes so that the network doesn't have to deal with so many small packets. Depending on your application, that may be infeasible.
I'd write your code as follows:
BufferedReader _in = new BufferedReader(
new InputStreamReader(_socket.getInputStream()));
StringBuilder packet = new StringBuilder();
int ch;
while ((ch = _in.read()) != -1) {
if (ch != '\u0000' && ch != '\n' && ch != '\r') {
packet.append((char) ch);
} else if (packet.length() > 0) { // StringBuilder.isEmpty() only exists on Java 15+
parsePlayerPacket(packet.toString());
packet = new StringBuilder();
}
}
But I don't think it will make much difference to the performance ... unless the "packets" are typically hundreds of characters long. (The real point of my tweaks is to reduce the number of temporary strings that are created while reading a packet. I don't think that there's a simple way to make it spend less real time in the read calls.)
Perhaps you should look into the readLine() method of BufferedReader. Looks like you're reading Strings, calling BufferedReader.readLine() gives you the next line (sans the newline/linefeed).
Something like this:
String packet = _in.readLine();
while(packet!=null) {
parsePlayerPacket(packet);
packet = _in.readLine();
}
Just like your implementation, readLine() will block until either the stream is closed or there's a newline/linefeed.
EDIT: yeah, this isn't going to split on '\0'. Your best bet is probably a PushbackReader, reading into some buffer of chars (like David Oliván Ubieto suggests)
// the pushback capacity must cover the largest unread() below; the default is a single char
PushbackReader _in = new PushbackReader(new InputStreamReader(_socket.getInputStream()), 1024);
StringBuilder packet = new StringBuilder();
char[] buffer = new char[1024];
// read in as much as we can
int bytesRead = _in.read(buffer);
while(bytesRead > 0) {
boolean process = false;
int index = 0;
// see if what we've read contains a delimiter
for(index = 0;index<bytesRead;index++) {
if(buffer[index]=='\n' ||
buffer[index]=='\r' ||
buffer[index]=='\u0000') {
process = true;
break;
}
}
if(process) {
// got a delimiter, process entire packet and push back the stuff we don't care about
_in.unread(buffer,index+1,bytesRead-(index+1)); // we don't want to push back our delimiter
packet.append(buffer,0,index);
parsePlayerPacket(packet.toString());
packet = new StringBuilder();
}
else {
// no delimiter, append to current packet and read some more
packet.append(buffer,0,bytesRead);
}
bytesRead = _in.read(buffer);
}
I didn't debug that, but you get the idea.
Note that using String.split('\u0000') has the problem where a packet ending with '\u0000' won't get processed until a newline/linefeed is sent across the stream. Since you're writing some kind of game, I assume it's important to process an incoming packet as soon as you get it.
Read as many bytes as you can using a large buffer (1K). Check if the "terminator" is found ('\u0000', '\n', '\r'). If not, copy to a temporary buffer (larger than the one used to read the socket), read again and copy to the temporary buffer until the "terminator" is found. When you have all the necessary bytes, copy the temporary buffer to any "final" buffer and process it. The remaining bytes should be considered as the "next" message and copied to the start of the temporary buffer.
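The chunk-then-scan idea can be sketched roughly as follows, using a StringBuilder as the temporary buffer so that leftover characters after a terminator simply start the next packet. PacketSplitter and readPackets are made-up names, the Consumer stands in for parsePlayerPacket, and the StringReader in main stands in for the socket reader:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

// Hypothetical helper: reads in 1K chunks and splits on '\u0000', '\n' or '\r'.
final class PacketSplitter {

    // Hands every non-empty packet to 'handler' as soon as its terminator is seen.
    static void readPackets(Reader in, Consumer<String> handler) throws IOException {
        char[] chunk = new char[1024];
        StringBuilder packet = new StringBuilder(); // the "temporary buffer"
        int n;
        while ((n = in.read(chunk)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = chunk[i];
                if (c == '\u0000' || c == '\n' || c == '\r') {
                    // terminator: emit what we have, skipping empty packets (e.g. "\r\n")
                    if (packet.length() > 0) {
                        handler.accept(packet.toString());
                        packet.setLength(0);
                    }
                } else {
                    packet.append(c);
                }
            }
        }
        // Like the original loop, an unterminated trailing packet is dropped at end of stream.
    }

    public static void main(String[] args) throws IOException {
        Reader socketReader = new StringReader("MOVE 1\u0000MOVE 2\r\nCHAT hi\n");
        readPackets(socketReader, packet -> System.out.println("packet: " + packet));
    }
}
```

Because a single read(char[]) call returns whatever has arrived so far, a packet is still processed as soon as its terminator shows up; nothing waits for the 1K chunk to fill.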

Why does Java read random amounts from a socket but not the whole message?

I am working on a project and have a question about Java sockets. The source file can be found here.
After successfully transmitting the file size in plain text I need to transfer binary data. (DVD .Vob files)
I have a loop such as
// Read this files size
long fileSize = Integer.parseInt(in.readLine());
// Read the block size they are going to use
int blockSize = Integer.parseInt(in.readLine());
byte[] buffer = new byte[blockSize];
// Bytes "red"
long bytesRead = 0;
int read = 0;
while(bytesRead < fileSize){
System.out.println("received " + bytesRead + " bytes" + " of " + fileSize + " bytes in file " + fileName);
read = socket.getInputStream().read(buffer);
if(read < 0){
// Should never get here since we know how many bytes there are
System.out.println("DANGER WILL ROBINSON");
break;
}
binWriter.write(buffer,0,read);
bytesRead += read;
}
I read a random number of bytes close to 99%. I am using Socket, which is TCP based,
so I shouldn't have to worry about lower layer transmission errors.
The received number changes but is always very near the end
received 7258144 bytes of 7266304 bytes in file GLADIATOR/VIDEO_TS/VTS_07_1.VOB
The app then hangs there in a blocking read. I am confounded. The server is sending the correct
file size and has a successful implementation in Ruby but I can't get the Java version to work.
Why would I read fewer bytes than are sent over a TCP socket?
The above is because of a bug many of you pointed out below.
BufferedReader ate 8 KB of my socket's input. The correct implementation can be found here.
If your in is a BufferedReader then you've run into the common problem with buffering more than needed. The default buffer size of BufferedReader is 8192 characters which is approximately the difference between what you expected and what you got. So the data you are missing is inside BufferedReader's internal buffer, converted to characters (I wonder why it didn't break with some kind of conversion error).
The only workaround is to read the first lines byte-by-byte without using any buffered reader classes. Java doesn't provide an unbuffered InputStreamReader with readLine() capability as far as I know (with the exception of the deprecated DataInputStream.readLine(), as indicated in the comments below), so you have to do it yourself. I would do it by reading single bytes, putting them into a ByteArrayOutputStream until I encounter an EOL, then converting the resulting byte array into a String using the String constructor with the appropriate encoding.
Note that while you can't use a BufferedReader, nothing stops you from using a BufferedInputStream from the very beginning, which will make byte-by-byte reads more efficient.
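A minimal sketch of that do-it-yourself readLine(), assuming the header lines are plain ASCII numbers terminated by '\n' (optionally preceded by '\r'). HeaderThenBinary is a made-up name, and the byte array in main stands in for the socket stream:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: text header lines followed by binary data on ONE stream.
final class HeaderThenBinary {

    // Reads up to and including the next '\n' and nothing beyond it.
    // Returns null if the stream ends before any byte was read.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {          // assumption: '\r' never carries meaning in a header
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        // assumption: headers are ASCII; use the protocol's real encoding here
        return new String(line.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = {'5', '\n', 10, 20, 30, 40, 50};
        // One BufferedInputStream for the whole connection: buffered, yet no bytes are lost,
        // because both the header and the binary part are read through the same object.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(wire));
        int fileSize = Integer.parseInt(readLine(in));
        byte[] data = new byte[fileSize];
        int total = 0;
        while (total < fileSize) {
            int n = in.read(data, total, fileSize - total);
            if (n == -1) break;
            total += n;
        }
        System.out.println("read " + total + " of " + fileSize + " binary bytes");
    }
}
```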
Update
In fact, I am doing something like this right now, only a bit more complicated. It is an application protocol that involves exchanging some data structures that are nicely represented in XML, but they sometimes have binary data attached to them. We implemented this by having two attributes in the root XML: fragmentLength and isLastFragment. The first one indicates how many bytes of binary data follow the XML part, and isLastFragment is a boolean attribute indicating the last fragment so the reading side knows that there will be no more binary data. XML is null-terminated so we don't have to deal with readLine(). The code for reading looks like this:
InputStream ins = new BufferedInputStream(socket.getInputStream());
while (!finished) {
ByteArrayOutputStream buf = new ByteArrayOutputStream();
int b;
while ((b = ins.read()) > 0) {
buf.write(b);
}
if (b == -1)
throw new EOFException("EOF while reading from socket");
// b == 0
Document xml = readXML(new ByteArrayInputStream(buf.toByteArray()));
processAnswers(xml);
Element root = xml.getDocumentElement();
if (root.hasAttribute("fragmentLength")) {
int length = DatatypeConverter.parseInt(
root.getAttribute("fragmentLength"));
boolean last = DatatypeConverter.parseBoolean(
root.getAttribute("isLastFragment"));
int read = 0;
while (read < length) {
// split incoming fragment into 4Kb blocks so we don't run
// out of memory if the client sent a really large fragment
int l = Math.min(length - read, 4096);
byte[] fragment = new byte[l];
int pos = 0;
while (pos < l) {
int c = ins.read(fragment, pos, l - pos);
if (c == -1)
throw new EOFException(
"Preliminary EOF while reading fragment");
pos += c;
read += c;
}
// process fragment
}
}
}
Using null-terminated XML for this turned out to be a really great thing as we can add additional attributes and elements without changing the transport protocol. At the transport level we also don't have to worry about handling UTF-8 because XML parser will do it for us. In your case you're probably fine with those two lines, but if you need to add more metadata later you may wish to consider null-terminated XML too.
Here is your problem. In the first few lines of the program you're using in.readLine(), which is probably some sort of BufferedReader. BufferedReaders will read data off the socket in 8K chunks. So when you did the first readLine() it read the first 8K into the buffer. The first 8K contains your two numbers followed by newlines, then some portion of the head of the VOB file (that's the missing chunk). Now when you switch to using getInputStream() off the socket you are 8K into the transmission while assuming you're starting at zero.
socket.getInputStream().read(buffer); // you can't do this without losing data.
While the BufferedReader is nice for reading character data, switching between binary and character data in a stream is not possible with it. You'll have to switch to using InputStream instead of Reader and convert the first few portions by hand to character data. If you read the file using a buffered byte array you can read the first chunk, look for your newlines and convert everything to the left of that to character data. Then write everything to the right to your file, then start reading the rest of the file.
This used to be easier with DataInputStream, but it doesn't do a good job handling character conversion for you (readLine is deprecated with BufferedReader being the only replacement - doh). Probably should write a DataInputStream replacement that under the covers uses Charset to properly handle string conversion. Then switching between characters and binary would be easier.
Your basic problem is that BufferedReader will read as much data as is available and place it in its buffer. It will give you the data as you ask for it. This is the whole point of buffering, i.e. to reduce the number of calls to the OS. The only safe way to use a buffered input is to use the same buffer over the life of the connection.
In your case, you only use the buffer to read two lines, however it is highly likely that 8192 bytes have been read into the buffer (the default size of the buffer). Say the first two lines consist of 32 bytes; this leaves 8160 waiting for you to read. However, you bypass the buffer to perform the read() on the socket directly, so the 8160 bytes left in the buffer end up discarded (the amount you are missing).
BTW: You should be able to see this in a debugger if you inspect the contents of your buffered reader.
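If you'd rather see the read-ahead without a debugger, a small in-memory experiment along these lines should show it. ReadAheadDemo is a made-up name and the ByteArrayInputStream stands in for the socket stream; the exact amount a reader pulls in is an implementation detail, but on the JDKs I'm aware of a short input like this one is swallowed whole:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical experiment: how much of the raw stream is left after one readLine()?
final class ReadAheadDemo {

    // Returns the number of bytes still unread in the underlying stream after a
    // BufferedReader layered on top of it has returned only the first line.
    static int bytesLeftAfterFirstLine(byte[] data) throws IOException {
        ByteArrayInputStream raw = new ByteArrayInputStream(data);
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.ISO_8859_1));
        reader.readLine();           // we only asked for the header line...
        return raw.available();      // ...but the reader has read ahead into the "binary" part
    }

    public static void main(String[] args) throws IOException {
        byte[] header = "7266304\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] data = new byte[header.length + 1000];   // header plus 1000 bytes of "file" data
        System.arraycopy(header, 0, data, 0, header.length);
        System.out.println("bytes left for a direct read: " + bytesLeftAfterFirstLine(data));
    }
}
```

Anything the reader has already consumed is no longer available to a direct socket.getInputStream().read(...), which is exactly the shortfall described above.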
Sergei may have been right about data being lost inside the buffer, but I'm not sure about his explanation. (BufferedReaders don't usually hold onto data inside their buffers. He may be thinking of a problem with BufferedWriters, which can lose data if the underlying stream is shut down prematurely.) [Never mind; I had misread Sergei's answer. The rest of this is valid AFAIK.]
I think you have a problem that's specific to your application. In your client code, you start reading as follows:
public static void recv(Socket socket){
try {
BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
//...
int numFiles = Integer.parseInt(in.readLine());
... and you proceed to use in for the start of the exchange. But then you switch to using the raw socket stream:
while(bytesRead < fileSize){
read = socket.getInputStream().read(buffer);
Because in is a BufferedReader, it's already going to have filled its buffer with up to 8192 bytes from the socket input stream. Any bytes that are in that buffer, and which you don't read from in, will be lost. Your app is hanging because it believes that the server is holding onto some bytes, but the server doesn't have them.
The solution is not to do byte-by-byte reads from the socket (ouch! your poor CPU!), but to use the BufferedReader consistently. Or, to use buffering with binary data, change the BufferedReader to a BufferedInputStream that wraps the socket's InputStream.
By the way, TCP is not as reliable as many people assume it to be. For example, when the server socket closes, it's possible for it to have written data into the socket which then gets lost as the socket connection is shut down. Calling Socket.setSoLinger can help to prevent this problem.
EDIT: Also BTW, you're playing with fire by treating byte and character data as if they're interchangeable, as you do below. If the data really is binary, then the conversion to String risks corrupting the data. Perhaps you want to be writing into a BufferedOutputStream?
// Java is retarded and reading and writing operate with
// fundamentally different types. So we write a String of
// binary data.
fileWriter.write(new String(buffer));
bytesRead += read;
EDIT 2: Clarified (or attempted to clarify :-} the handling of binary vs. String data.
