Java NIO ByteBuffer, write after flip - java

I'm new to Java ByteBuffers and was wondering what the correct way is to write to a ByteBuffer after it has been flipped.
In my use case, I am writing an outputBuffer to a socket:
outBuffer.flip();
//Non-blocking SocketChannel
int bytesWritten = getSocket().write(outBuffer);
After this, the output buffer has to be written to again. Also not all of the bytes in the outBuffer may have been written to the socket.
Since it is currently flipped, how can I make it writable again, without overriding any data if it is still in the buffer and wasn't written to the socket?
If I am right, outBuffer.position() == bytesWritten and limit should be at how much data there was to write.
So would using the following in order to reuse the output buffer be right? :
int limit = outBuffer.limit();
outBuffer.limit(outBuffer.capacity());
outBuffer.position(limit);

Again from the API spec.:
The following loop copies bytes from one channel to another via the buffer buf:
while (in.read(buf) >= 0 || buf.position() != 0) {
buf.flip();
out.write(buf);
buf.compact(); // In case of partial write
}

since it is currently flipped
It will stay flipped. The write doesn't change that.
how can I make it writable again, without overriding any data if it is still in the buffer and wasn't written to the socket?
You don't have to do anything, but if you want to put more data into the buffer before you write again, you should do flip/write/compact. If you just want to repeat the write, just call write() again, with the buffer still in its current state.
But I prefer to always keep these buffers ready for filling (that is, ready for a channel read() or a put()), so there is no possibility of a slip-up, and to flip/write/compact (or flip/get/compact) when those operations are necessary, atomically as it were.
Note that you should not use clear(), unless you are certain that the write was complete and the buffer is now empty. In that case compact and clear are equivalent. But it is simpler to just always compact.
If you're copying in blocking mode, use the loop quoted by @zlakad.
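A minimal sketch of the flip/write/compact cycle described above, in case it helps to see the positions move. ThrottledChannel is a hypothetical stand-in for a non-blocking SocketChannel that only accepts a few bytes per call; writeSome and demo are names made up for this illustration, not anything from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

final class FlipWriteCompactDemo {

    /** Hypothetical stand-in for a busy non-blocking socket: accepts at most maxPerWrite bytes per call. */
    static final class ThrottledChannel implements WritableByteChannel {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private final int maxPerWrite;

        ThrottledChannel(int maxPerWrite) { this.maxPerWrite = maxPerWrite; }

        @Override public int write(ByteBuffer src) {
            int n = Math.min(maxPerWrite, src.remaining());
            for (int i = 0; i < n; i++) sink.write(src.get());
            return n;
        }
        @Override public boolean isOpen() { return true; }
        @Override public void close() { }
    }

    /** The buffer is expected in fill mode on entry and is left in fill mode on exit. */
    static int writeSome(ByteBuffer buf, WritableByteChannel ch) throws IOException {
        buf.flip();                  // fill mode -> drain mode
        int written = ch.write(buf); // may be a partial write
        buf.compact();               // move unwritten bytes to the front, back to fill mode
        return written;
    }

    static String demo() {
        try {
            ThrottledChannel ch = new ThrottledChannel(4);
            ByteBuffer out = ByteBuffer.allocate(16);
            out.put("hello ".getBytes(StandardCharsets.US_ASCII));
            writeSome(out, ch);                                    // only 4 of the 6 bytes go out
            out.put("world".getBytes(StandardCharsets.US_ASCII)); // appended after the 2 leftover bytes
            while (out.position() > 0) writeSome(out, ch);        // drain whatever is still pending
            return new String(ch.sink.toByteArray(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the buffer is back in fill mode after every writeSome call, new data can be appended at any time without overwriting the bytes that have not been sent yet.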

Related

How to re-read an InputStream after calling IOUtils.copy?

I simply use
IOUtils.copy(myInputStream, myOutputStream);
And I see that before calling IOUtils.copy the input stream is available to read, and afterwards it is not.
flux.available()
(int) 1368181 (before)
(int) 0 (after)
I saw some explanation on this post, and I see I can copy the bytes from my InputStream to a ByteArrayInputStream and then use mark(0) and read(), in order to read multiple times an input stream.
Here is the resulting code (which works).
I find this code very verbose, and I'd like to know if there is a better way to do that.
ByteArrayInputStream fluxResetable = new ByteArrayInputStream(IOUtils.toByteArray(myInputStream));
fluxResetable.mark(0);
IOUtils.copy(fluxResetable, myOutputStream);
fluxResetable.reset();
An InputStream, unless otherwise stated, is single shot: you consume it once and that's it.
If you want to read it many times, that isn't just a stream any more, it's a stream with a buffer. Your solution reflects that accurately, so it is acceptable. The one thing I would probably change is to store the byte array and always create a new ByteArrayInputStream from it when needed, rather than resetting the same one:
byte [] content = IOUtils.toByteArray(myInputStream);
IOUtils.copy(new ByteArrayInputStream(content), myOutputStream);
doSomethingElse(new ByteArrayInputStream(content));
The effect is more or less the same but it's slightly easier to see what you're trying to do.
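For what it's worth, here is a rough sketch of the same idea wrapped in a small class, using only the JDK instead of Commons IO. ReplayableContent, drain and open are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ReplayableContent {
    private final byte[] content;

    /** Drains the single-shot source once and keeps the bytes. */
    ReplayableContent(InputStream source) throws IOException {
        this.content = drain(source);
    }

    /** Every call returns a fresh stream positioned at the first byte. */
    InputStream open() {
        return new ByteArrayInputStream(content);
    }

    /** Reads until end of stream (read() returns -1) and returns everything read. */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) copy.write(chunk, 0, n);
        return copy.toByteArray();
    }

    static String demo() {
        try {
            InputStream singleShot = new ByteArrayInputStream("payload".getBytes(StandardCharsets.UTF_8));
            ReplayableContent content = new ReplayableContent(singleShot);
            String first = new String(drain(content.open()), StandardCharsets.UTF_8);
            String second = new String(drain(content.open()), StandardCharsets.UTF_8);
            // The original stream is exhausted, so read() reports end of stream.
            return first + "|" + second + "|" + singleShot.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is the same as in the answer: the whole content sits in memory, which is fine for a megabyte or so but not for arbitrarily large streams.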

In PrintWriter, why doesn't the print() function also auto-flush?

When looking at the PrintWriter contract for the following constructor:
public PrintWriter(OutputStream out, boolean autoFlush)
Creates a new PrintWriter from an existing OutputStream. This convenience constructor creates the necessary intermediate OutputStreamWriter, which will convert characters into bytes using the default character encoding.
Parameters:
out - An output stream
autoFlush - A boolean; if true, the println, printf, or format methods will flush the output buffer
See Also:
OutputStreamWriter.OutputStreamWriter(java.io.OutputStream)
Notice the autoFlush flag only works on println, printf, and format. Now, I know that printf and format basically do the exact same thing as print except with more options, but I just don't see why they didn't include print as well in the contract. Why did they make this decision?
I suspect it's because the Java authors are making assumptions about performance:
Consider the following code:
public static void printArray(int[] array, PrintWriter writer) {
for(int i = 0; i < array.length; i++) {
writer.print(array[i]);
if(i != array.length - 1) writer.print(',');
}
}
You almost certainly would not want such a method to call flush() after every single call. It could be a big performance hit, especially for large arrays. And, if for some reason you did want that, you could just call flush yourself.
The idea is that the printf, format, and println methods are likely going to be printing a good chunk of text all at once, so it makes sense to flush after every one. But it would rarely, if ever, make sense to flush after only one or a few characters.
After some searching, I have found a citation for this reasoning (emphasis mine):
Most of the examples we've seen so far use unbuffered I/O. This means each read or write request is handled directly by the underlying OS. This can make a program much less efficient, since each such request often triggers disk access, network activity, or some other operation that is relatively expensive.
To reduce this kind of overhead, the Java platform implements buffered I/O streams. Buffered input streams read data from a memory area known as a buffer; the native input API is called only when the buffer is empty. Similarly, buffered output streams write data to a buffer, and the native output API is called only when the buffer is full.
<snip>
It often makes sense to write out a buffer at critical points, without waiting for it to fill. This is known as flushing the buffer.
Some buffered output classes support autoflush, specified by an optional constructor argument. When autoflush is enabled, certain key events cause the buffer to be flushed. For example, an autoflush PrintWriter object flushes the buffer on every invocation of println or format.
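The difference is easy to observe with a small experiment. This is only a sketch: AutoFlushDemo and observe are names invented here, and a ByteArrayOutputStream stands in for whatever stream you are really writing to.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

final class AutoFlushDemo {
    /** Returns the byte counts seen in the underlying stream after print() and after println(). */
    static int[] observe() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(sink, true); // autoFlush = true

        writer.print("abc");           // stays in the writer's internal buffer
        int afterPrint = sink.size();

        writer.println();              // autoflush pushes "abc" plus the line separator through
        int afterPrintln = sink.size();

        return new int[] { afterPrint, afterPrintln };
    }
}
```

With autoflush on, nothing reaches the sink after print(), and everything buffered so far arrives once println() is called.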

InputStream's available() tends to halt

This is a bit of an obscure problem that only seems to happen when I'm on certain computers.
I was having this issue today on our school's XP computers and I can't seem to replicate this on my home computer (W7).
Anyway, reading/writing to sockets in Java tends to be problematic whenever I use this code (where: int avail, int read, InputStream input, byte[] buffer, String output):
while( (avail = input.available()) > 0 )
{
read = input.read( buffer );
output += new String( buffer, 0, read );
}
It seems to make sense (reading all the data until no data is available to a temporary buffer, then to a string), but on our school computers (testing it using IE7), the whole thing somehow pauses. I'm thinking input.available() is causing it to somehow block because the thread just keeps running without ever reaching an endpoint... effectively just pausing somewhere.
OH, I forgot to mention: whenever I run this in debug mode and perform each line step-by-step, it works completely like it should... which just confuses me even more.
When I got home to replicate this issue, it works just fine (just using Firefox and IE8). I have no idea what would be a better alternative to this.
PS:
If the buffer is large enough and I just use:
read = input.read( buffer );
output += new String( buffer, 0, read );
It works just fine, but there's always a worry that the data sent will exceed the buffer size.
You're thinking about available() the wrong way. That method tells you approximately how many bytes can be read right now, without blocking. The commonly accepted idiom for what you're trying to do is
int length;
while ((length = in.read(buffer)) != -1) {
output += new String(buffer, 0, length);
}
or something along those lines (not compiled/tested).
Update: I think you misunderstand the concept of "end of the stream". "End of the stream" doesn't mean that all the data you want to read has been read. It means that there isn't, and won't ever be, anything else to read. For instance, it might mean that you were reading a file and have come to the end of it, or it might mean you were reading from an in-memory byte array and came to the end of that. Those are "end of streams".
In your question, you indicated, or at least implied, that you're reading from a Socket. Are you aware that you'll never get to the end of that stream until the associated Socket or the remote end of the connection is closed? Just because you received a bit of data from it doesn't make it the end of the stream.
Why not use a buffered reader? Something like:
BufferedReader reader = new BufferedReader(new InputStreamReader(input));
String output = "";
try {
String readLine = null;
while ((readLine = reader.readLine()) != null) {
output += readLine + "\n";
}
} catch (IOException e) {
System.err.println("Error: " + e);
}
System.out.println("Read from Socket:" + output);
Your code is invalid. This is a misuse of available(). All it does is tell you how many bytes may be available for reading without blocking. It cannot be used to indicate how many bytes will ever be sent by the peer, and it has no necessary relationship with peer messages. There are no messages in TCP, only a byte stream. If you want to read to EOS, just remove the available() test and read until it returns -1. If you want to read a message, the peer will have to delimit it for you somehow, e.g. by an out-of-band terminator, a length word prefix, or a self-describing protocol such as Object Serialization or XML.
It 'works' in debug mode because you are radically changing the timing with breakpoints. This is further proof that what you are doing is incorrect.
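To illustrate the "length word prefix" option mentioned above, here is a minimal sketch using DataInputStream and DataOutputStream. The 4-byte length header, the UTF-8 payload and the class and method names are assumptions made for this example, not something the question's protocol defines.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class LengthPrefixedMessages {

    /** Writes a 4-byte big-endian length followed by the UTF-8 payload. */
    static void writeMessage(DataOutputStream out, String text) throws IOException {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        out.writeInt(payload.length);
        out.write(payload);
    }

    /** Blocks until one whole message has arrived; throws EOFException if the stream ends first. */
    static String readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload); // keeps reading until the array is full, however the bytes are chunked
        return new String(payload, StandardCharsets.UTF_8);
    }

    static String demo() {
        try {
            // An in-memory byte array stands in for the socket's streams.
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(wire);
            writeMessage(out, "hello");
            writeMessage(out, "bye");

            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
            return readMessage(in) + "|" + readMessage(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

No call to available() is needed: the reader knows exactly how many bytes belong to the message and simply blocks until they are all there.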

Java NIO: transferFrom until end of stream

I'm playing around with the NIO library. I'm attempting to listen for a connection on port 8888 and once a connection is accepted, dump everything from that channel to somefile.
I know how to do it with ByteBuffers, but I'd like to get it working with the allegedly super efficient FileChannel.transferFrom.
This is what I got:
ServerSocketChannel ssChannel = ServerSocketChannel.open();
ssChannel.socket().bind(new InetSocketAddress(8888));
SocketChannel sChannel = ssChannel.accept();
FileChannel out = new FileOutputStream("somefile").getChannel();
while (... sChannel has not reached the end of the stream ...) <-- what to put here?
out.transferFrom(sChannel, out.position(), BUF_SIZE);
out.close();
So, my question is: How do I express "transferFrom some channel until end-of-stream is reached"?
Edit: Changed 1024 to BUF_SIZE, since the size of the buffer used, is irrelevant for the question.
There are a few ways to handle the case. First, some background on how transferTo/transferFrom is implemented internally and when it can be superior.
First and foremost, you should know how many bytes you have to transfer, i.e. use FileChannel.size() to determine the maximum available and sum the results. This case refers to FileChannel.transferTo(socketChannel).
The method does not return -1
The method is emulated on Windows. Windows doesn't have an API function to xfer from filedescriptor to socket, it does have one (two) to xfer from the file designated by name - but that's incompatible with java API.
On Linux the standard sendfile (or sendfile64) is used, on Solaris it's called sendfilev64.
In short, for (long xferBytes=0; startPos + xferBytes<fchannel.size();) doXfer() will work for transfer from file -> socket.
There is no OS function that transfers from socket to file (which the OP is interested in). Since the socket data is not in the OS cache it can't be done as effectively, so it's emulated. The best way to implement the copy is via a standard loop using a pooled direct ByteBuffer sized to the socket read buffer. Since I use only non-blocking IO, that involves a selector as well.
That being said, regarding "I'd like to get it working with the allegedly super efficient FileChannel.transferFrom": it is not efficient here, and it's emulated on all OSes, hence it will end the transfer when the socket is closed, gracefully or not. The function will not even throw the inherited IOException, provided there was ANY transfer (if the socket was readable and open).
I hope the answer is clear: the only interesting use of FileChannel.transferFrom happens when the source is a file. The most efficient (and interesting) case is file->socket, and file->file is implemented via FileChannel.map/unmap(!!).
Answering your question directly:
while( (count = socketChannel.read(this.readBuffer) ) >= 0) {
/// do something
}
But if this is what you do you do not use any benefits of non-blocking IO because you actually use it exactly as blocking IO. The point of non-blocking IO is that 1 network thread can serve several clients simultaneously: if there is nothing to read from one channel (i.e. count == 0) you can switch to other channel (that belongs to other client connection).
So, the loop should actually iterate different channels instead of reading from one channel until it is over.
Take a look at this tutorial: http://rox-xmlrpc.sourceforge.net/niotut/
I believe it will help you to understand the issue.
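A rough sketch of what "iterate over the channels that are ready" can look like. To keep it runnable without a network, two java.nio.channels.Pipe objects stand in for two client connections; the class, the method names and the map of received text are all hypothetical and only serve the illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

final class SelectorLoopDemo {

    /** One pass over whatever is readable right now; it never waits on a single idle channel. */
    static void serviceReadyChannels(Selector selector, ByteBuffer buf, Map<String, StringBuilder> received)
            throws IOException {
        selector.select(1000); // wait up to one second for any registered channel to become readable
        Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
        while (keys.hasNext()) {
            SelectionKey key = keys.next();
            keys.remove();
            if (!key.isReadable()) continue;
            buf.clear();
            int count = ((ReadableByteChannel) key.channel()).read(buf);
            if (count < 0) { key.cancel(); continue; } // end of stream: stop watching this channel
            buf.flip();
            received.get((String) key.attachment()).append(StandardCharsets.US_ASCII.decode(buf));
        }
    }

    /** Registers the readable end of a pipe in non-blocking mode, tagged with a client name. */
    static void register(Selector selector, Pipe pipe, String name, Map<String, StringBuilder> received)
            throws IOException {
        pipe.source().configureBlocking(false);
        pipe.source().register(selector, SelectionKey.OP_READ, name);
        received.put(name, new StringBuilder());
    }

    static String demo() {
        try (Selector selector = Selector.open()) {
            Pipe a = Pipe.open();
            Pipe b = Pipe.open();
            Map<String, StringBuilder> received = new TreeMap<>();
            register(selector, a, "a", received);
            register(selector, b, "b", received);

            a.sink().write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.US_ASCII)));
            b.sink().write(ByteBuffer.wrap("pong".getBytes(StandardCharsets.US_ASCII)));

            ByteBuffer buf = ByteBuffer.allocate(64);
            while (received.get("a").length() < 4 || received.get("b").length() < 4) {
                serviceReadyChannels(selector, buf, received);
            }
            a.sink().close(); a.source().close();
            b.sink().close(); b.source().close();
            return received.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real server the registered channels would be SocketChannels accepted from a ServerSocketChannel, and a read count of 0 just means "nothing for this client right now, move on to the next key".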
I'm not sure, but the JavaDoc says:
An attempt is made to read up to count bytes from the source channel
and write them to this channel's file starting at the given position.
An invocation of this method may or may not transfer all of the
requested bytes; whether or not it does so depends upon the natures
and states of the channels. Fewer than the requested number of bytes
will be transferred if the source channel has fewer than count bytes
remaining, or if the source channel is non-blocking and has fewer than
count bytes immediately available in its input buffer.
I think you may say that telling it to copy infinite bytes (of course not in a loop) will do the job:
out.transferFrom(sChannel, out.position(), Integer.MAX_VALUE);
So, I guess when the socket connection is closed, the state will get changed, which will stop the transferFrom method.
But as I already said: I'm not sure.
allegedly super efficient FileChannel.transferFrom.
If you want both the benefits of DMA access and nonblocking IO the best way is to memory-map the file and then just read from the socket into the memory mapped buffers.
But that requires that you preallocate the file.
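A minimal sketch of that approach, assuming the final size is known up front. MappedDownload and readInto are hypothetical names, and the blocking in-memory channel in demo only stands in for a socket; with a truly non-blocking channel you would return to the selector when read() returns 0 instead of looping.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class MappedDownload {

    /** Reads up to expectedSize bytes from source directly into a mapping of target; returns bytes stored. */
    static int readInto(Path target, ReadableByteChannel source, int expectedSize) throws IOException {
        try (FileChannel file = FileChannel.open(target,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Preallocate: writing one byte at the last offset grows the file to expectedSize.
            file.write(ByteBuffer.wrap(new byte[1]), expectedSize - 1);
            MappedByteBuffer mapped = file.map(FileChannel.MapMode.READ_WRITE, 0, expectedSize);
            while (mapped.hasRemaining()) {
                if (source.read(mapped) < 0) break; // end of stream before the file was full
            }
            mapped.force(); // ask the OS to write the mapped pages to disk
            return mapped.position();
        }
    }

    static boolean demo() {
        try {
            byte[] data = "downloaded bytes".getBytes(StandardCharsets.US_ASCII);
            Path target = Files.createTempFile("mapped", ".bin");
            target.toFile().deleteOnExit();
            int stored = readInto(target, Channels.newChannel(new ByteArrayInputStream(data)), data.length);
            return stored == data.length && Arrays.equals(data, Files.readAllBytes(target));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```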
This way:
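URLConnection connection = new URL("target").openConnection();
File file = new File(connection.getURL().getPath().substring(1));
FileChannel download = new FileOutputStream(file).getChannel();
// Wrap the input stream once, rather than creating a new channel on every iteration
ReadableByteChannel source = Channels.newChannel(connection.getInputStream());
long position = 0;
long transferred;
while ((transferred = download.transferFrom(source, position, 1024)) > 0) {
position += transferred; // transferFrom does not advance the file channel's own position
// Some calculations to get the current speed ;)
}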
transferFrom() returns a count. Just keep calling it, advancing the position/offset, until it returns zero. But start with a much larger count than 1024, more like a megabyte or two, otherwise you're not getting much benefit from this method.
EDIT To address all the commentary below, the documentation says that "Fewer than the requested number of bytes will be transferred if the source channel has fewer than count bytes remaining, or if the source channel is non-blocking and has fewer than count bytes immediately available in its input buffer." So provided you are in blocking mode it won't return zero until there is nothing left in the source. So looping until it returns zero is valid.
EDIT 2
The transfer methods are certainly mis-designed. They should have been designed to return -1 at end of stream, like all the read() methods.
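Here is a small sketch of the "keep calling it until it returns zero" loop, assuming a blocking source. TransferUntilEnd and copyToFile are hypothetical names, the 1 MiB chunk size is an arbitrary choice, and an in-memory channel stands in for the socket so the example can run on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class TransferUntilEnd {
    private static final long CHUNK = 1 << 20; // 1 MiB requested per call (arbitrary)

    /** Copies from a blocking source into out until transferFrom reports that nothing more arrived. */
    static long copyToFile(ReadableByteChannel source, FileChannel out) throws IOException {
        long start = out.position();
        long position = start;
        long n;
        while ((n = out.transferFrom(source, position, CHUNK)) > 0) {
            position += n; // transferFrom does not move the channel's own position
        }
        return position - start;
    }

    static boolean demo() {
        try {
            byte[] data = new byte[3_000_000];
            for (int i = 0; i < data.length; i++) data[i] = (byte) i;

            Path target = Files.createTempFile("transfer", ".bin");
            target.toFile().deleteOnExit();
            long copied;
            try (FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE)) {
                copied = copyToFile(Channels.newChannel(new ByteArrayInputStream(data)), out);
            }
            return copied == data.length && Arrays.equals(data, Files.readAllBytes(target));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a non-blocking source a return value of zero only means "nothing available right now", so this loop is only appropriate in blocking mode, as the answer says.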
Building on top of what other people here have written, here's a simple helper method which accomplishes the goal:
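public static void transferFully(FileChannel fileChannel, ReadableByteChannel sourceChannel, long totalSize) throws IOException {
// Note: this never terminates if the source ends before totalSize bytes have arrived
for (long bytesWritten = 0; bytesWritten < totalSize;) {
// transferFrom may move fewer bytes than requested, so continue from where it stopped
bytesWritten += fileChannel.transferFrom(sourceChannel, bytesWritten, totalSize - bytesWritten);
}
}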

Trying to packetize TCP with non-blocking IO is hard! Am I doing something wrong?

Oh how I wish TCP was packet-based like UDP is! [see comments] But alas, that's not the case, so I'm trying to implement my own packet layer. Here's the chain of events so far (ignoring writing packets)
Oh, and my Packets are very simply structured: two unsigned bytes for length, and then byte[length] data. (I can't imagine if they were any more complex, I'd be up to my ears in if statements!)
Server is in an infinite loop, accepting connections and adding them to a list of Connections.
PacketGatherer (another thread) uses a Selector to figure out which Connection.SocketChannels are ready for reading.
It loops over the results and tells each Connection to read().
Each Connection has a partial IncomingPacket and a list of Packets which have been fully read and are waiting to be processed.
On read():
Tell the partial IncomingPacket to read more data. (IncomingPacket.readData below)
If it's done reading (IncomingPacket.complete()), make a Packet from it and stick the Packet into the list waiting to be processed and then replace it with a new IncomingPacket.
There are a couple problems with this. First, only one packet is being read at a time. If the IncomingPacket needs only one more byte, then only one byte is read this pass. This can of course be fixed with a loop but it starts to get sorta complicated and I wonder if there is a better overall way.
Second, the logic in IncomingPacket is a little bit crazy, to be able to read the two bytes for the length and then read the actual data. Here is the code, boiled down for quick & easy reading:
int readBytes; // number of total bytes read so far
byte length1, length2; // each byte in an unsigned short int (see getLength())
public int getLength() { // will be inaccurate if readBytes < 2
return ((length1 & 0xFF) << 8) | (length2 & 0xFF); // mask so negative bytes are not sign-extended
}
public void readData(SocketChannel c) {
if (readBytes < 2) { // we don't yet know the length of the actual data
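ByteBuffer lengthBuffer = ByteBuffer.allocate(2 - readBytes);
int numBytesRead = c.read(lengthBuffer); // may be 0, or -1 at end of stream
lengthBuffer.flip(); // switch to draining so get() returns the bytes just read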
if(readBytes == 0) {
if(numBytesRead >= 1)
length1 = lengthBuffer.get();
if(numBytesRead == 2)
length2 = lengthBuffer.get();
} else if(readBytes == 1) {
if(numBytesRead == 1)
length2 = lengthBuffer.get();
}
readBytes += numBytesRead;
}
if(readBytes >= 2) { // then we know we have the entire length variable
// lazily-instantiate data buffers based on getLength()
// read into data buffers, increment readBytes
// (does not read more than the amount of this packet, so it does not
// need to handle overflow into the next packet's data)
}
}
public boolean complete() {
return (readBytes > 2 && readBytes == getLength()+2);
}
Basically I need feedback on my code and overall process. Please suggest any improvements. Even overhauling my entire system would be okay, if you have suggestions for how better to implement the whole thing. Book recommendations are welcome too; I love books. I just get the feeling that something isn't quite right.
Here's the general solution I came up with thanks to Juliano's answer: (feel free to comment if you have any questions)
public void fillWriteBuffer() {
while(!writePackets.isEmpty() && writeBuf.remaining() >= writePackets.peek().size()) {
Packet p = writePackets.poll();
assert p != null;
p.writeTo(writeBuf);
}
}
public void fillReadPackets() {
do {
if(readBuf.position() < 1+2) {
// haven't yet received the length
break;
}
short packetLength = readBuf.getShort(1);
if(readBuf.position() >= 1+2 + packetLength) { // position(), not limit(): before flip(), limit() is just the capacity
// we have a complete packet!
readBuf.flip();
byte packetType = readBuf.get();
packetLength = readBuf.getShort();
byte[] packetData = new byte[packetLength];
readBuf.get(packetData);
Packet p = new Packet(packetType, packetData);
readPackets.add(p);
readBuf.compact();
} else {
// not a complete packet
break;
}
} while(true);
}
Probably this is not the answer you are looking for, but someone would have to say it: You are probably overengineering the solution for a very simple problem.
You do not have packets before they arrive completely, not even IncomingPackets. You have just a stream of bytes without defined meaning. The usual, simple solution is to keep the incoming data in a buffer (it can be a simple byte[] array, but a proper elastic and circular buffer is recommended if performance is an issue). After each read, you check the contents of the buffer to see if you can extract an entire packet from there. If you can, you construct your Packet, discard the correct number of bytes from the beginning of the buffer and repeat. If or when you cannot extract an entire packet, you keep those incoming bytes there until the next time you read something from the socket successfully.
While you are at it, if you are doing datagram-based communication over a stream channel, I would recommend you to include a magic number at the beginning of each "packet" so that you can test that both ends of the connection are still synchronized. They may get out of sync if for some reason (a bug) one of them reads or writes the wrong number of bytes to/from the stream.
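The "check the buffer after each read" loop above could look roughly like this. It is only a sketch: it assumes the question's framing (a 2-byte unsigned big-endian length followed by the payload), leaves out the magic number for brevity, and PacketExtractor and extract are names made up here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class PacketExtractor {

    /** The buffer is in fill mode on entry and exit. Returns every complete packet currently buffered. */
    static List<byte[]> extract(ByteBuffer buf) {
        List<byte[]> packets = new ArrayList<>();
        buf.flip();                                              // drain mode: look at what has arrived
        while (buf.remaining() >= 2) {
            int length = buf.getShort(buf.position()) & 0xFFFF;  // peek at the header without consuming it
            if (buf.remaining() < 2 + length) break;             // payload not complete yet
            buf.getShort();                                      // consume the header
            byte[] payload = new byte[length];
            buf.get(payload);
            packets.add(payload);
        }
        buf.compact();                                           // keep the partial packet for the next read
        return packets;
    }

    static String demo() {
        ByteBuffer buf = ByteBuffer.allocate(64);

        buf.put(new byte[] { 0, 3, 'a', 'b' });          // header says 3 bytes, only 2 have arrived
        int afterFirstRead = extract(buf).size();

        buf.put(new byte[] { 'c', 0, 1, 'z', 0 });       // rest of packet 1, all of packet 2, half a header
        StringBuilder out = new StringBuilder().append(afterFirstRead);
        for (byte[] p : extract(buf)) {
            out.append(';').append(new String(p, StandardCharsets.US_ASCII));
        }
        return out.append(';').append(buf.position()).toString(); // position = bytes still waiting
    }
}
```

In real use the put() calls would be replaced by channel.read(buf), and extract would be called after every successful read.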
Can't you just read whatever number of bytes that are ready to be read, and feed all incoming bytes into a packet parsing state machine? That would mean treating the incoming (TCP) data stream like any other incoming data stream (via serial line, or USB, a pipe, or whatever...)
So you would have some Selector determining from which connection(s) there are incoming bytes to be read, and how many. Then you would (for each connection) read the available bytes, and then feed those bytes into a (connection specific) state machine instance (the reading and feeding could be done from the same class, though). This packet parsing state machine class would then spit out finished packets from time to time, and hand those over to whoever will handle those complete and parsed packets.
For an example packet format like
2 magic header bytes to mark the start
2 bytes of payload size (n)
n bytes of payload data
2 bytes of checksum
the state machine would have states like (try an enum, Java has those now, I gather)
wait_for_magic_byte_0,
wait_for_magic_byte_1,
wait_for_length_byte_0,
wait_for_length_byte_1,
wait_for_payload_byte (with a payload_offset variable counting),
wait_for_chksum_byte_0,
wait_for_chksum_byte_1
and on each incoming byte you can switch the state accordingly. If the incoming byte does not properly advance the state machine, discard the byte by resetting the state machine to wait_for_magic_byte_0.
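A minimal sketch of such a state machine as a Java enum. Several details are assumptions made only so the example is complete: the magic bytes are 0xCA 0xFE, the length is big-endian, and the checksum is the sum of the unsigned payload bytes modulo 65536. None of these come from the packet format above beyond its general layout.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class PacketParser {
    enum State { MAGIC_0, MAGIC_1, LENGTH_0, LENGTH_1, PAYLOAD, CHECKSUM_0, CHECKSUM_1 }

    private State state = State.MAGIC_0;
    private int length, offset, expectedSum, actualSum;
    private byte[] payload;
    final List<byte[]> completed = new ArrayList<>();

    /** Feed one incoming byte; complete, checksum-valid payloads end up in completed. */
    void feed(byte b) {
        int v = b & 0xFF;
        switch (state) {
            case MAGIC_0:
                state = (v == 0xCA) ? State.MAGIC_1 : State.MAGIC_0;
                break;
            case MAGIC_1: // a repeated first magic byte keeps us waiting for the second one
                state = (v == 0xFE) ? State.LENGTH_0 : (v == 0xCA ? State.MAGIC_1 : State.MAGIC_0);
                break;
            case LENGTH_0:
                length = v << 8;
                state = State.LENGTH_1;
                break;
            case LENGTH_1:
                length |= v;
                payload = new byte[length];
                offset = 0;
                actualSum = 0;
                state = (length == 0) ? State.CHECKSUM_0 : State.PAYLOAD;
                break;
            case PAYLOAD:
                payload[offset++] = b;
                actualSum = (actualSum + v) & 0xFFFF;
                if (offset == length) state = State.CHECKSUM_0;
                break;
            case CHECKSUM_0:
                expectedSum = v << 8;
                state = State.CHECKSUM_1;
                break;
            case CHECKSUM_1:
                expectedSum |= v;
                if (expectedSum == actualSum) completed.add(payload); // drop packets that fail the check
                state = State.MAGIC_0;
                break;
        }
    }

    static String demo() {
        byte[] wire = {
            0x00,                                                             // stray byte, ignored
            (byte) 0xCA, (byte) 0xFE, 0x00, 0x02, 'h', 'i', 0x00, (byte) 0xD1, // valid: 'h' + 'i' = 0x00D1
            (byte) 0xCA, (byte) 0xFE, 0x00, 0x01, 'x', 0x00, 0x00              // wrong checksum, dropped
        };
        PacketParser parser = new PacketParser();
        for (byte b : wire) parser.feed(b);

        StringBuilder out = new StringBuilder();
        for (byte[] p : parser.completed) out.append(new String(p, StandardCharsets.US_ASCII)).append(';');
        return out.toString();
    }
}
```

Since the parser keeps all its state between calls, it does not matter how the socket chops the stream up: you can feed it one byte or a whole read buffer at a time.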
Ignoring client disconnects and server shutdown for now, here's a more or less traditional structure of a socket server:
Selector, handles sockets:
polls open sockets
if it's the server socket, create new Connection object
for each active client socket find the Connection, call it with event (read or write)
Connection (one per socket), handles I/O on one socket:
Communicates to Protocol via two queues, input and output
keeps two buffers, one for reading, one for writing, and respective offsets
on read event: read all available input bytes, look for message boundaries, put whole messages onto Protocol input queue, call Protocol
on write event: write the buffer, or if it's empty, take a message from the output queue into the buffer and start writing it
Protocol (one per connection), handles application protocol exchange on one connection:
take message from input queue, parse application portion of the message
do the server work (here's where the state machine is - some messages are appropriate in one state, while not in the other), generate response message, put it onto output queue
That's it. Everything could be in a single thread. The key here is separation of responsibilities.
Hope this helps.
I think you're approaching the issue from a slightly wrong direction. Instead of thinking of packets, think of a data structure. That's what you're sending. Effectively, yes, it's an application layer packet, but just think of it as a data object. Then, at the lowest level, write a routine which will read off the wire, and output data objects. That will give you the abstraction layer I think you're looking for.
