ReadableByteChannel.read(ByteBuffer dest) reads capped at 8KB. Why?

I've got some code that:
reads from a ReadableByteChannel into a ByteBuffer,
takes note of the bytes transferred,
pauses a few tens to hundreds of milliseconds,
passes the ByteBuffer on to a WritableByteChannel.
Some details:
Both Channels are TCP/IP sockets.
The total connection read size is in the tens of megabytes.
The source socket (which the ReadableByteChannel is getting bytes from) is on the same machine.
Debian Lenny 64-bit on HP DL380s
Sun Java 1.6.0 update 20
The problem is, no matter how large a ByteBuffer is allocated, whether with .allocate() or .allocateDirect(), the number of bytes read into the ByteBuffer maxes out at 8KB. My target ByteBuffer size is 256KB, of which only a tiny fraction (1/32nd) is being used. About 10% of the time only 2896 bytes are read in.
I've checked the OS TCP buffer settings, and they look fine. This is confirmed by watching netstat's report on how many bytes are in the buffer--both have data in the socket buffers exceeding 8KB.
tcp 0 192384 1.2.3.4:8088 1.2.3.4:53404 ESTABLISHED
tcp6 110144 0 1.2.3.4:53404 1.2.3.4:8088 ESTABLISHED
One thing that stands out here is the mix of TCP and TCP6, but that should not be a problem, I think. My Java client is on port 53404 in the above output.
I've tried setting the socket properties to favor bandwidth over latency, but no change.
Socket socket = new Socket(host.getHostName(), host.getPort());
socket.setPerformancePreferences(1, 0, 2); //bw > connection time > latency
When I log the value of socket.getReceiveBufferSize(), it consistently reports a mere 43856 bytes. While that is smaller than I would like, it is still more than 8KB. (It is also not a very round number, which I would have expected.)
I'm really stumped as to what the problem is here. In theory, AFAIK, this should not be happening. It would not be desirable to 'downgrade' to a stream-based solution, although that is where we are going next if a solution cannot be found.
What am I missing? What can I do to correct it?

OK, I've found the issue! (And am answering my own question in case someone else has the same problem.)
I was creating the ReadableByteChannel not directly from the Socket instance, but from the InputStream returned by the HttpEntity.getContent() method (Apache HTTP Commons Client). The HTTP Commons client had been passed the socket early on via the DefaultHttpClientConnection.bind() method. What I had not understood is that, I think, the Channel was wrapping a BufferedInputStream buried inside the HTTP Commons Client implementation. (8KB just happens to be that class's default buffer size in Java 6.)
My solution, therefore, was to grab the ReadableByteChannel off the raw Socket instance.
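A minimal sketch of that fix, assuming a plain java.net.Socket whose own InputStream is wrapped with Channels.newChannel() (the host, port, and buffer size below are placeholders, not the original code):

import java.io.IOException;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class RawSocketChannelRead {
    public static void main(String[] args) throws IOException {
        // Placeholder host/port; in the original setup this socket had been
        // handed to DefaultHttpClientConnection.bind() earlier.
        Socket socket = new Socket("localhost", 8088);

        // Wrap the socket's own stream rather than HttpEntity.getContent(),
        // so no 8KB BufferedInputStream sits between the socket and the channel.
        ReadableByteChannel channel = Channels.newChannel(socket.getInputStream());

        ByteBuffer buffer = ByteBuffer.allocateDirect(256 * 1024);
        int bytesRead = channel.read(buffer); // no longer limited to 8KB by an intermediate buffer
        System.out.println("Read " + bytesRead + " bytes");

        socket.close();
    }
}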

Related

How to speed up data transfer over socket?

Currently I am using this code on both Server and Client Side. Client is an android device.
BufferedOutputStream os = new BufferedOutputStream(socket.getOutputStream(), 10000000);
BufferedInputStream sin = new BufferedInputStream(socket.getInputStream(), 10000000);
os.write("10000000\n".getBytes());
os.flush();
for (int i = 0; i < 10000000; i++) {
    os.write((sampleRead[i] + " ").getBytes());
}
os.flush();
The problem is that this code takes about 80 seconds to transfer data from the Android client to the server, while it takes only 8 seconds to transfer the data back from the server to the client. The code is the same on both sides and the buffer is also the same. I have also tried different buffer sizes, but the problem is with this segment:
for (int i = 0; i < 10000000; i++) {
    os.write((sampleRead[i] + " ").getBytes());
}
The buffering takes most of the time, while the actual transfer takes only about 6-7 seconds on a 150 Mbps hotspot connection. What could be the problem and how can I solve it?
First of all, as a commenter has already noted, using a monstrously large buffer is likely to be counterproductive. Once your stream buffer is bigger than the size of a network packet, app-side buffering loses its effectiveness. (The data in your "big" buffer needs to be split into packet-sized chunks by the TCP/IP stack before it goes onto the network.) Indeed, if the app-side buffer is really large, you may find that your data gets stuck in the buffer for a long time waiting for the buffer to fill ... while the network is effectively idle.
(The Buffered... readers, writers and streams are primarily designed to avoid lots of syscalls that transfer tiny amounts of data. Above 10K or so, the buffering doesn't help performance much.)
The other thing to note is that in a lot of OS environments, the network throughput is actually limited by virtualization and default network stack tuning parameters. To get better throughput, you may need to tune at the OS level.
Finally, if your traffic is going over a network path that is congested, has high end-to-end latency, or includes links with constrained data rates, then you are unlikely to get fast data transfers no matter how you tune things.
(Compression might help ... if you can afford the CPU overhead at both ends ... but some data links already do compression transparently.)
You could compress the data you transfer; it will save memory, and transferring a compressed stream of data is cheaper. To do that, implement compression logic on the client side and decompression logic on the server side; see GZIPOutputStream and GZIPInputStream. Also try reducing the buffer size; it is huge for a mobile device.
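A rough sketch of the compression idea, with GZIPOutputStream on the sending side and GZIPInputStream on the receiving side. Note that it also switches from per-value String conversion to DataOutputStream.writeInt(), which is a separate change from the compression itself, and the 8KB buffer size is just an assumed sensible default:

import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipSampleTransfer {

    // Sender: compress the stream and write the samples in binary form.
    static void send(Socket socket, int[] samples) throws IOException {
        GZIPOutputStream gzip = new GZIPOutputStream(
                new BufferedOutputStream(socket.getOutputStream(), 8192));
        DataOutputStream out = new DataOutputStream(gzip);
        out.writeInt(samples.length);      // tell the receiver how many values follow
        for (int sample : samples) {
            out.writeInt(sample);          // binary write: no String building per value
        }
        out.flush();
        gzip.finish();                     // flush the gzip trailer without closing the socket
    }

    // Receiver: mirror the wrapping with GZIPInputStream.
    static int[] receive(Socket socket) throws IOException {
        DataInputStream in = new DataInputStream(
                new GZIPInputStream(socket.getInputStream(), 8192));
        int count = in.readInt();
        int[] samples = new int[count];
        for (int i = 0; i < count; i++) {
            samples[i] = in.readInt();
        }
        return samples;
    }
}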

Use of socket.setReceiveBufferSize()

I'm confused with the use of Socket's setReceiveBufferSize() from java.net.
From the API, I know that setting the receive buffer size for the socket defines (or gives a hint to) the data limit that the socket can receive at a time. However, every time I try to read from the socket's input stream, I've found out that it can store more than what I set with setReceiveBufferSize().
Consider the following code:
InputStream input_stream = socket.getInputStream();
socket.setReceiveBufferSize(1024);
byte[] byte_array = new byte[4096];
input_stream.read(byte_array);
Every time I read from input_stream, I've found that I can actually read more than 1024 bytes at a time (and fill the 4096-byte array), as long as the sender side has already sent at least that much data.
Can anyone give an explanation as to why this happens? Am I just missing something? Thank you.
From the API, I know that setting the receive buffer size for the socket defines (or gives a hint to) the data limit that the socket can receive at a time.
No it doesn't. It gives a hint to TCP as to the total receive buffer size, which in turn affects the maximum receive window that can be advertised. 'Receive at a time' doesn't really have anything to do with it.
However, every time I try to read from the socket's input stream, I've found out that it can store more than what I set with setReceiveBufferSize().
TCP is free to adjust the hint up or down. In this case, 1024 is a ludicrously small size that any implementation would increase to at least 8192. You can find out how much TCP actually used with getReceiveBufferSize().
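A quick way to see this adjustment in practice (the host and port below are placeholders; the only point is the difference between the requested and reported sizes):

import java.io.IOException;
import java.net.Socket;

public class ReceiveBufferHint {
    public static void main(String[] args) throws IOException {
        Socket socket = new Socket("localhost", 8080); // placeholder endpoint
        socket.setReceiveBufferSize(1024);             // a hint; TCP may adjust it up or down
        System.out.println("Requested 1024, actually using "
                + socket.getReceiveBufferSize() + " bytes");
        socket.close();
    }
}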

TCP packet sizing at application level for max throughput

At the application level, say using Java, how much do I have to worry about the actual TCP packet size? So, for example, I am trying to write an application that sends data over a TCP socket's OutputStream; do I always have to take the size of the data written to the stream into account? Since Java sockets are streaming sockets, I haven't actually considered the size of the data units, but if TSO (TCP Segmentation Offload) is turned on for the OS/NIC, then I could write a 64KB data slice or an MSS-sized chunk to the OutputStream and thus try to save the precious CPU time of slicing the data into chunks of less than 1500 bytes (the MTU). How effective could my programming be, in terms of being able to take care of this dynamically? I know we can get NetworkInterface.getMTU() to determine the OS/NIC MTU size, but I'm not sure how that helps.
So, I can say that overall, I am a bit confused about how to maximize my throughput when writing bytes to the OutputStream.
how much do I have to worry about the actual TCP packet size?
Almost never. You can call setTcpNoDelay(true), but this rarely makes a difference.
So, for example, I am trying to write an application that sends data over a TCP socket's OutputStream; do I always have to take the size of the data written to the stream into account?
I doubt it. If you have a 1 Gb connection or slower, you will have trouble writing a program so inefficient it can't use this bandwidth.
Since Java sockets are streaming sockets, I haven't actually considered the size of the data units, but if TSO (TCP Segmentation Offload) is turned on for the OS/NIC, then I could write a 64KB data slice or an MSS-sized chunk to the OutputStream and thus try to save the precious CPU time of slicing the data into chunks of less than 1500 bytes (the MTU).
I don't see how, given that most decent network adapters support TCP offloading.
How effective could my programming be, in terms of being able to take care of this dynamically?
Java doesn't support it in any case.
I know we can get NetworkInterface.getMTU() to determine the OS/NIC MTU size, but I'm not sure how that helps.
Me neither.
So, I can say that overall, I am a bit confused about how to maximize my throughput when writing bytes to the OutputStream.
The most significant change you can make in Java is to use NIO. I suggest blocking NIO, as this is the simplest change from plain IO. If you use direct ByteBuffers, this can save redundant memory copies from the Java heap to native memory.
Do you know that you have a problem using the maximum bandwidth of your network? If you haven't measured that this is the cause of your problem, it's just a guess.
TCP buffers, paces, decides segment sizes etc behind the scenes for you. There is nothing you can do to help except write as much as possible as fast as possible, and use a large socket send buffer at the sender and a large socket receive buffer at the receiver.
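For what it's worth, a minimal sketch of the blocking-NIO-with-direct-buffers suggestion above (host, port, payload, and buffer size are all placeholders):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class BlockingNioSender {
    public static void main(String[] args) throws IOException {
        // A SocketChannel is blocking by default, which keeps the code simple.
        SocketChannel channel = SocketChannel.open(new InetSocketAddress("localhost", 9000));

        // A direct buffer lets channel.write() hand data to the OS without the JDK
        // first copying it into a temporary direct buffer behind the scenes.
        ByteBuffer buffer = ByteBuffer.allocateDirect(64 * 1024);
        while (buffer.hasRemaining()) {
            buffer.put((byte) 'x');        // dummy payload
        }
        buffer.flip();

        while (buffer.hasRemaining()) {
            channel.write(buffer);         // blocking write; loop until the buffer is drained
        }
        channel.close();
    }
}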

Correct way to set the socket send buffer size on linux?

I have an NIO server that gets small client requests which result in ~1 MB responses. The server uses the following to accept a new client:
SocketChannel clientChannel = server.accept();
clientChannel.configureBlocking(false);
clientChannel.socket().setSendBufferSize(2 * 1024 * 1024);
I then log out a "client connected" line that includes the result of clientChannel.socket().getSendBufferSize().
On Windows, the call changes the client socket's send buffer size from 8KB to 2MB. But on Linux, the socket says its send buffer is 131,071 bytes.
This results in lousy performance, as my clientChannel.write() only writes 128KB at a time, so it takes 7 more passes to get all the data written. On Windows, the setSendBufferSize change significantly improved performance.
Linux appears to be configured to allow a large socket send buffer:
$ cat /proc/sys/net/ipv4/tcp_wmem
4096 16384 4194304
The platform is free to adjust the requested buffer size up or down, and that's what Linux appears to be doing. Nothing you can do about that except maybe tune the maxima via kernel configuration.
Note that my comments in the question you linked about setting buffer sizes > 64k apply to the receive buffer, not the send buffer, because the receive buffer size affects the window scaling option, so it needs to be set before the socket is connected, which sets the window scale in stone.
I don't see why requiring 'more passes' should cause such a major performance difference that you're even asking this question. It seems to me you would be better off adjusting the receiver's window size upwards, and doing it prior to connection as above.
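A sketch of doing exactly that on the client side: create an unconnected Socket, set the receive buffer, and only then connect (host, port, and size are placeholders):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PreConnectReceiveBuffer {
    public static void main(String[] args) throws IOException {
        Socket socket = new Socket();                  // not yet connected
        socket.setReceiveBufferSize(2 * 1024 * 1024);  // set before connect() so the
                                                       // window scale can reflect it
        socket.connect(new InetSocketAddress("localhost", 9000));
        System.out.println("Receive buffer: " + socket.getReceiveBufferSize());
        socket.close();
    }
}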
You can subclass ServerSocket, and override ServerSocket.implAccept(Socket socket).
This method, in your override, will receive an "empty" Socket instance on which you can call Socket.setSendBufferSize(int).
Then:
ServerSocketChannel server = myServerSocket.getChannel();
...
SocketChannel clientChannel = server.accept(); // <-- this guy will have the new send buffer size.
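A rough sketch of that subclass (the 2MB size is a placeholder; here the buffer is set right after super.implAccept() fills in the connection, which still happens before accept() returns):

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class BigSendBufferServerSocket extends ServerSocket {
    public BigSendBufferServerSocket(int port) throws IOException {
        super(port);
    }

    @Override
    protected void implAccept(Socket socket) throws IOException {
        super.implAccept(socket);                   // populate the accepted connection
        socket.setSendBufferSize(2 * 1024 * 1024);  // applied before accept() hands the socket back
    }
}

Whether the accepted SocketChannel picks this up depends on how the ServerSocket and its channel are created, so treat this as a starting point rather than a drop-in fix.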

Transmitting files of 64KB size through sockets fails

I am developing an application that uses Java sockets between server and client apps. I need to send files of 64KB from the client to the server through these sockets. When I run the whole system locally (both server and client) everything goes OK, but when I run the server and client on different machines it fails.
I am using JSON to process the file content, so the exception thrown on the server is "net.sf.json.util.JSONTokener.syntaxError". However, the problem is not JSON; it is the size of the file. When I send files smaller than 8KB everything goes OK, but bigger sizes truncate the sent information, so a JSONTokener.syntaxError is thrown when the server tries to interpret the truncated data.
I am defining a socket buffer of 64KB as follows (I am using the NIO API):
SocketChannel sc;
private static final int BUFFER_SIZE = (int) Math.pow(2, 16);
.....
sc.socket().setReceiveBufferSize( BUFFER_SIZE );
sc.socket().setSendBufferSize( BUFFER_SIZE );
What do I need to do to enlarge the network packet size when I run my system remotely? Do you have any idea what the problem is?
Thank you very much in advance.
Oscar
The buffer sizes are typically over 64KB, so you could actually be shrinking them.
The MTU for TCP packets is typically 1.5 KB and changing this is highly unlikely to help you. In any case, 9000 bytes is the most I have ever seen it set to.
The problem you have could be in the way you are writing the data, but I suspect it's in the way you are reading the data. It is a common mistake to assume you will receive the same size of data you sent, or that you will receive it all at once.
Streams do not "know" what size of data you wrote and you don't know when all the data sent has been received unless you have a protocol which includes the length.
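One common way to add such a length prefix, sketched with DataOutputStream/DataInputStream (the UTF-8 JSON payload is an assumption based on the question; socket and stream setup are omitted):

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class LengthPrefixedMessages {

    // Sender: write a 4-byte length, then exactly that many payload bytes.
    static void writeMessage(OutputStream out, String json) throws IOException {
        byte[] payload = json.getBytes("UTF-8");
        DataOutputStream dataOut = new DataOutputStream(out);
        dataOut.writeInt(payload.length);
        dataOut.write(payload);
        dataOut.flush();
    }

    // Receiver: read the length, then keep reading until that many bytes have arrived.
    static String readMessage(InputStream in) throws IOException {
        DataInputStream dataIn = new DataInputStream(in);
        int length = dataIn.readInt();
        byte[] payload = new byte[length];
        dataIn.readFully(payload);          // loops internally until the full message is read
        return new String(payload, "UTF-8");
    }
}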
