I'm seeing some really strange behavior in Java and I can't tell whether it happens on purpose or by chance.
I have a socket connection to a server that sends me a response to a request. I am reading this response from the socket with the following loop, which is wrapped in a try-with-resources statement.
try (BufferedInputStream remoteInput = new BufferedInputStream(remoteSocket.getInputStream())) {
    final byte[] response = new byte[512];
    int bytes_read;
    while ((bytes_read = remoteInput.read(response, 0, response.length)) != -1) {
        // message parsing stuff which does not affect the behaviour
    }
}
According to my understanding, the read() method fills as many bytes as possible into the byte array; the limiting factors are either the number of bytes received or the size of the array.
Unfortunately, that is not what's happening: the protocol I'm using answers my request with several smaller answers which are sent one after another over the same socket connection.
In my case the read() method always returns with exactly one of those smaller answers in the array. The length of the answers varies, but the 512 bytes that fit into the array are always enough. That means my array always contains exactly one message and the rest of the array remains untouched.
If I intentionally make the byte array smaller than my messages, read() returns several completely filled arrays and one last array that contains the remaining bytes of the message.
(A 100-byte answer with an array length of 30 returns three completely filled arrays and one with only 10 bytes used.)
The InputStream, or a socket connection in general, shouldn't interpret the transmitted bytes in any way, which is why I am very confused right now. My program is not aware of the protocol being used in any way. In fact, my entire program is only this loop and the code needed to establish a socket connection.
If I could rely on this behavior it would make parsing the response extremely easy, but since I do not know what causes it in the first place, I don't know whether I can count on it.
The protocol I'm transmitting is LDAP, but since my program is completely unaware of that, it shouldn't matter.
According to my understanding, the read() method fills as many bytes as possible into the byte array.
Your understanding is incorrect. The whole point of that method returning the "number of bytes read" is that it might return any number. To be precise: a blocking read, when it returns without hitting end of stream, has read something, so it returns a number >= 1.
In other words: you should never, ever rely on read() reading a specific number of bytes. Always check the returned number, and if you are waiting for a certain amount of data, you have to handle that in your code yourself (for example by buffering again, until you have "enough" bytes in your own buffer to proceed).
Thing is: there is a whole, huge stack of elements involved in such read operations: the network, the operating system, the JVM. You can't control what exactly happens, so you cannot and should not build implicit assumptions like this into your code.
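For illustration, here is a minimal sketch of that kind of buffering. It assumes a hypothetical 4-byte length prefix in front of each message, which is not what LDAP actually does (LDAP messages are BER-encoded), so treat it as a pattern rather than a drop-in parser:
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class MessageReader {
    // Hypothetical framing: a 4-byte length prefix followed by the payload.
    // readFully() keeps calling read() internally until the requested number
    // of bytes has arrived, no matter how the network chunks them.
    static byte[] readOneMessage(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();      // assumed length prefix, not part of the question's protocol
        byte[] message = new byte[length];
        data.readFully(message);
        return message;
    }
}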
While you might see this behaviour on a given machine, especially over loopback, once you start using real networks and different hardware this can change.
If messages are sent with enough of a delay between them and you read them fast enough, you will see one message at a time. However, if messages are sent close enough together or your reader is delayed in any way, you can get multiple messages in a single read.
Also, if your message is large enough, e.g. around the MTU or more, a single message can be broken up even if your buffer is more than large enough.
Related
Background: I'm currently creating an application in which two Java programs communicate over a network using a DataInputStream and DataOutputStream.
Before every communication, I'd like to send an indication of what type of data is being sent, so the program knows how to handle it. I was thinking of sending an integer for this, but a byte offers enough possible values.
So my question is, is Java's DataInputStream's readByte() faster than readInt()?
Also, on the other side, is Java's DataOutputStream's writeByte() faster than writeInt()?
If one byte is enough for your data then readByte and writeByte will indeed be faster (because they read/write less data). It won't be a noticeable difference though, because the amount of data is very small in both cases: 1 byte vs 4 bytes.
If you have lots of data coming from the stream then choosing readByte or readInt will not make a speed difference - for example, calling readByte 4 times instead of readInt once. Just use the one that matches the kind of data you expect and makes your code easier to understand. You will have to read the whole thing anyway :)
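As a concrete illustration, a one-byte type tag written with writeByte and read back with readByte could look like this (the type code and method names here are made up for the example):
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class TypeTagExample {
    static final byte TYPE_TEXT = 1;   // illustrative type code

    static void send(DataOutputStream out, String text) throws IOException {
        out.writeByte(TYPE_TEXT);      // 1 byte on the wire instead of writeInt's 4
        out.writeUTF(text);
        out.flush();
    }

    static void receive(DataInputStream in) throws IOException {
        byte type = in.readByte();     // matches the writeByte on the other side
        if (type == TYPE_TEXT) {
            String text = in.readUTF();
            System.out.println("received: " + text);
        }
    }
}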
I am converting the details that have to be sent from my C++ function to Java into strings and a char* which will be sent through a socket.
My buffer size is 10 MB. Can I send the 10MB in one shot or should I split and send as chunks of smaller memory?
What is the difference between those two approaches? If I should send as smaller memory what should be the chunk size?
Can I send the 10MB in one shot
Yes.
or should I split and send as chunks of smaller memory?
No.
What is the difference between those two approaches?
The difference is that in case 1 you are letting TCP make all the decisions it is good at, with all the extra knowledge it has that you don't have, about the path MTU, the RTT, the receive window at the peer, ... whereas in case 2 you're trying to do TCP's job for it. Keeping a dog and barking yourself.
If I should send as smaller memory what should be the chunk size?
As big as possible.
When you call the write() function, you provide a buffer and the number of bytes you want to write. However, it is not guaranteed that the OS will send/write all the bytes you want to write in a single shot. (In the case of blocking sockets, the write() call blocks until it has copied the entire chunk into the TCP buffer. In the case of non-blocking sockets, write() does not block and writes just the bytes it is able to.)
The TCP/IP stack runs in the OS, and each OS has its own implementation of the stack. This stack determines the buffer sizes, and TCP itself takes care of all the low-level details such as the MSS and the available receiver window size, which let TCP run its flow-control and congestion-control algorithms.
Therefore it is best to let TCP decide how it wants to send your data. Instead of breaking the data into chunks yourself, let the TCP stack do it for you.
Just be careful to always check the number of bytes actually sent, which is returned by the write() call.
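The same rule looks like this on the Java side; a minimal sketch for a non-blocking SocketChannel (shown in Java for consistency with the rest of this page, the C write() loop is analogous; the yield is a placeholder for registering OP_WRITE interest with a Selector in real code):
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

class WriteLoop {
    // write() may accept fewer bytes than remain in the buffer (or zero on a
    // non-blocking channel), so keep calling it until everything has been taken.
    static void writeFully(SocketChannel channel, ByteBuffer buffer) throws IOException {
        while (buffer.hasRemaining()) {
            int written = channel.write(buffer);
            if (written == 0) {
                // real code would register OP_WRITE with a Selector instead of spinning
                Thread.yield();
            }
        }
    }
}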
Let's say we have a SocketChannel (in non-blocking mode) that is registered with a Selector for read interest. Let's say after select() the Selector tells us that this channel is ready for read, and we have some ByteBuffer. We want to read some bytes from the channel into this buffer (the ByteBuffer is cleared before reading). For this we use the channel's read() method, which returns the actual number of bytes read. Let's suppose that this number is positive after the read and that the ByteBuffer's hasRemaining() method returns true. Is it practical in this situation to immediately try to read some more from the same channel?
The same question for write(): if write() returns a positive value and not all of the buffer's contents were sent, is it practical to immediately try again until write() returns zero?
If you get a short read result, there is no more data to read without blocking, so you must not read again until there is. Otherwise the next read will almost certainly return zero or -1.
If the read fills the buffer, it might make sense from the point of view of that one connection to keep reading until it returns <= 0, but you are stealing cycles from the other channels. You need to consider fairness as well. In general you should probably do one read and keep iterating over the selected keys. If there's more data there the select will tell you next time.
Use big buffers.
This also means that it's wrong to clear the buffer before each read. You should get the data out with a flip/get/compact cycle, then the buffer is ready to read again and you don't risk losing data. This in turn implies that you need a buffer per connection.
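A minimal sketch of that per-connection flip/compact cycle, assuming a hypothetical handleBytes method that consumes whatever complete messages are available:
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

class ConnectionState {
    // One buffer per connection; it is NOT cleared between reads,
    // so partially received messages survive until the next read.
    private final ByteBuffer buffer = ByteBuffer.allocate(64 * 1024);

    void onReadable(SocketChannel channel) throws IOException {
        int n = channel.read(buffer);        // one read per select round
        if (n == -1) {
            channel.close();
            return;
        }
        buffer.flip();                       // switch to reading mode
        handleBytes(buffer);                 // placeholder: consume what you can
        buffer.compact();                    // keep any unconsumed bytes for next time
    }

    private void handleBytes(ByteBuffer readable) {
        // hypothetical parsing; consume complete messages, leave partial ones
    }
}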
It all depends on the data rate at which data is arriving, and the latency requirements of your application. If you don't care about latency at all, you might get slightly higher bandwidth by delaying your read interest until you suspect enough data has arrived to fill your buffer.
You have to be careful, though. Delaying reads could force the kernel to buffer more data, possibly fill its buffer, and have to start dropping packets or otherwise engage some flow control. That will more than kill any benefits from the last paragraph.
So generally, you want to read as much as you can, as early as you can. The benefits for batching reads are minor at best, and the potential pitfalls can be major. And keep in mind that the fact that you're seeing non-full reads means you're processing the data faster than it is coming in. In other words, you're in a state where you have CPU to burn, so the extra overhead of smaller reads is essentially free.
This is more a matter of preference than a technological issue :p
I'm writing some Java code to download files from a server... For that, I'm using the BufferedOutputStream method write() and the BufferedInputStream method read().
So my question is: if I use a buffer to hold the bytes, how many bytes should I read at a time? Sure, I could read byte by byte using just int b = read() and then write(b), or I could use a buffer. If I take the second approach, are there any aspects I must pay attention to when choosing the number of bytes to read/write each time? What will this number affect in my program?
Thanks
Unless you have a really fast network connection, the size of the buffer will make little difference. I'd say that 4k buffers would be fine, though there's no harm in using buffers a bit bigger.
The same probably applies to using read() versus read(byte[]) ... assuming that you are using a BufferedInputStream.
Unless you have an extraordinarily fast / low-latency network connection, the bottleneck is going to be the data rate that the network and your computers' network interfaces can sustain. For a typical internet connection, the application can move the data two or more orders of magnitude faster than the network can. So unless you do something silly (like doing 1-byte reads on an unbuffered stream), your Java code won't be the bottleneck.
BufferedInputStream and BufferedOutputStream typically rely on System.arraycopy for their implementations. System.arraycopy has a native implementation, which likely relies on memmove or bcopy. The amount of memory that is copied will depend on the available space in your buffer, but regardless, the implementation down to the native code is pretty efficient, unlikely to affect the performance of your application regardless of how many bytes you are reading/writing.
However, with respect to BufferedInputStream, if you set a mark with a high limit, a new internal buffer may need to be created. If you do use a mark, reading more bytes than are available in the old buffer may cause a temporary performance hit, though the amortized performance is still linear.
As Stephen C mentioned, you are more likely to see performance issues due to the network.
What is the MTU (maximum transmission unit) of your network connection? If you are using UDP, for example, you can check this value and use a smaller array of bytes. If that is not a concern, you need to check how much memory your program uses. I think 1024 - 4096 bytes would be a good choice for holding the data while you continue to receive.
If you pump data you normally do not need to use any Buffered streams. Just make sure you use a decently sized (8-64k) temporary byte[] buffer passed to the read method (or use a pump method which does it). The default buffer size is too small for most usages (and if you use a larger temp array it will be ignored anyway)
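For reference, a pump loop of that kind might look like the following sketch (the 16 KB buffer is just one choice in the 8-64k range mentioned above):
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamPump {
    // Copies everything from 'in' to 'out' using a reusable temporary buffer.
    static void pump(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[16 * 1024];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);   // only write the bytes actually read
        }
        out.flush();
    }
}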
I have two scenarios in Netty where I am trying to minimize memory copies and optimize memory usage:
(1) Reading a very large frame (20 megabytes).
(2) Reading lots of very little frames (20 megabytes at 50 bytes per frame) to rebuild into one message at a higher level in the pipeline.
For the first scenario, as I get a length at the beginning of the frame, I extended FrameDecoder. Unfortunately, as I don't see how to return the length to Netty (I only indicate whether the frame is complete or not), I believe Netty is going through multiple fill-buffer, copy and realloc cycles, thus using far more memory than is required. Is there something I am missing here? Or should I avoid the FrameDecoder entirely if I expect this scenario?
In the second scenario, I am currently creating a linked list of all the little frames which I wrap using ChannelBuffers.wrappedBuffer (which I can then wrap in a ChannelBufferInputStream), but I am again using far more memory than I expected to use (perhaps because the allocated ChannelBuffers have spare space?). Is this the right way to use Netty ChannelBuffers?
There is a specialized version of the frame decoder called LengthFieldBasedFrameDecoder. It's handy when you have a header containing the message length. It can even extract the message length from the header when you give it an offset.
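As a rough illustration, assuming a hypothetical header with a 4-byte length field at offset 0 (Netty 3 package names, to match the ChannelBuffers/FrameDecoder APIs in the question), the decoder could be added to the pipeline like this:
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder;

class PipelineSetup {
    static ChannelPipeline build() {
        ChannelPipeline pipeline = Channels.pipeline();
        // maxFrameLength = 32 MB, length field at offset 0, 4 bytes long,
        // no length adjustment, strip the 4 length bytes from the output frame
        pipeline.addLast("frameDecoder",
                new LengthFieldBasedFrameDecoder(32 * 1024 * 1024, 0, 4, 0, 4));
        // pipeline.addLast("handler", new YourBusinessHandler());  // hypothetical handler
        return pipeline;
    }
}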
Actually, ChannelBuffers.wrappedBuffer does not create copies of the received data; it creates a composite buffer from the given buffers, so your received frame data will not be copied. If you hold on to the composite buffers / your custom wrapper in the code and forget to nullify them, memory leaks can happen.
These are practices I follow:
Allocate direct buffers for long-lived objects and slice them on use.
When I want to join/encode multiple buffers into one big buffer, I use ChannelBuffers.wrappedBuffer.
If I have a buffer and want to do something with it or a portion of it, I make a slice of it by calling slice or slice(0, ...) on the channel buffer instance.
If I have a channel buffer and know the position of the data, and the data is small, I always use the getXXX methods.
If I have a channel buffer which is used in many places to make something out of it, I keep it modifiable and slice it on use.
Note: channelBuffer.slice does not make a copy of the data; it creates a channel buffer with new reader & writer indexes.
In the end, it turned out the best way to handle my FrameDecoder issue was to write my own handler on top of SimpleChannelUpstreamHandler. As soon as I had determined the length from the header, I created the ChannelBuffer with a size exactly matching that length. This (along with other changes) significantly improved the memory performance of my application.