I'm running a multithreaded, minimalistic HTTP(S) server (not a full web server, though) that accepts connections on three server sockets: local, internet and internet-ssl.
Each socket has an SO timeout of 1000 ms (which might be lowered in the future).
The worker threads read requests like this:
byte[] reqBuffer = new byte[512];
theSocket.getInputStream().read(reqBuffer);
The problem now is that the newly implemented SSL socket runs into the 1/n-1 record splitting technique. Some clients also split in other strange ways when using SSL (4/n-4 etc.), so I thought I might just perform multiple reads like this:
byte[] reqBuffer = new byte[512];
InputStream is = theSocket.getInputStream();
int read = is.read(reqBuffer, 0, 128); // initial read - with x/n-x this is very small
int pos = 0;
if (read > 0) {
    pos = read;
}
int i = 0;
do {
    read = is.read(reqBuffer, pos, 128);
    if (read > 0) {
        pos += read;
    }
    i++;
} while (read == 128 && i < 3); // max. 3 more reads (4 total = 512 bytes) or until fewer than 128 bytes are read (request should be completely read)
This works with browsers like Firefox or Chrome and other clients using that technique.
Now my problem is that the new method is much slower. Requests to the local socket are so slow that a script with a 2-second timeout times out while requesting (I have no idea why). Maybe there is a logical problem in my code?
Is there a better way to read from an SSL socket? Because there are up to hundreds or even a thousand requests per second, and the new read method slows down even the plain HTTP requests.
Note: the SSL socket is not in use at the moment and will not be used until I can fix this problem.
I have also tried reading line by line using a BufferedReader, since we are talking about HTTP here, but the server exploded, running out of file descriptors (the limit is 20,000). That might have been because of my implementation, though.
I'm thankful for every suggestion regarding this problem. If you need more information about the code, just tell me and I will post it asap.
EDIT:
I actually put a little bit more thought into what I am trying to do, and I realized that it comes down to reading HTTP headers. So the best solution would be to read the request line by line (or character by character) and stop reading after x lines, or when an empty line (marking the end of the header) is reached.
My current approach would be to put a BufferedInputStream around the socket's InputStream and read it with an InputStreamReader, which is in turn read by a BufferedReader (question: does it make sense to use a BufferedInputStream when I'm using a BufferedReader?).
This BufferedReader reads the request character by character, detects ends of lines (\r\n) and continues to read until either a line longer than 64 characters is reached, a maximum of 8 lines have been read, or an empty line is reached (marking the end of the HTTP header). I will test my implementation tomorrow and edit this edit accordingly.
EDIT:
I almost forgot to write my results here: it works, on every socket, and even faster than the previously working approach. Thanks everyone for pointing me in the right direction. I ended up implementing it like this:
List<String> requestLines = new ArrayList<String>(6);
InputStream is = this.cSocket.getInputStream();
BufferedInputStream bis = new BufferedInputStream(is, 1024);
InputStreamReader isr = new InputStreamReader(bis, Config.REQUEST_ENCODING);
BufferedReader br = new BufferedReader(isr);
/* read input character by character
 * maximum line size is 768 characters
 * maximum number of lines is 6
 * lines are defined as char sequences ending with \r\n
 * read lines are added to a list
 * reading stops at the first empty line => HTTP header end
 */
int readChar; // the last read character
int characterCount = 0; // the character count in the line that is currently being read
int lineCount = 0; // the overall line count
char[] charBuffer = new char[768]; // create a character buffer with space for 768 characters (max line size)
// read as long as the stream is not closed / EOF, the character count in the current line is below 768 and the number of lines read is below 6
while ((readChar = br.read()) != -1 && characterCount < 768 && lineCount < 6) {
    charBuffer[characterCount] = (char) readChar; // fill the char buffer with the read character
    if (readChar == '\n' && characterCount > 0 && charBuffer[characterCount - 1] == '\r') { // if end of line is detected (\r\n)
        if (characterCount == 1) { // if empty line
            break; // stop reading after an empty line (HTTP header ended)
        }
        requestLines.add(new String(charBuffer, 0, characterCount - 1)); // add the read line to the requestLines list (and leave out the \r)
        // charBuffer = new char[768]; // clear the buffer - not required
        characterCount = 0; // reset character count for next line
        lineCount++; // increase read line count
    } else {
        characterCount++; // if not end of line, increase read character count
    }
}
This is most likely slower because you are waiting for the other end to send more data, possibly data it is never going to send.
A better approach is to give it a larger buffer, e.g. 32 KB (128 bytes is very small), and to read only the data which is available. If this data needs to be reassembled into messages of some sort, you shouldn't be using timeouts or a fixed number of loops, as read() is only guaranteed to return at least one byte.
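For illustration, a minimal sketch of that idea (variable names follow the question; the 32 KB figure is the suggestion above, not a measured optimum): perform one blocking read with a large buffer, then keep reading only while more data has already arrived.

byte[] reqBuffer = new byte[32 * 1024]; // one large buffer instead of 128-byte slices
InputStream is = theSocket.getInputStream();
int read = is.read(reqBuffer); // blocks until at least one byte has arrived
int pos = Math.max(read, 0);
// continue only while more data is already available; never block waiting for a full buffer
while (read > 0 && pos < reqBuffer.length && is.available() > 0) {
    read = is.read(reqBuffer, pos, reqBuffer.length - pos);
    if (read > 0) {
        pos += read;
    }
}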
You should certainly wrap a BufferedInputStream around the SSLSocket's input stream.
Your technique of reading 128 bytes at a time and advancing the offset is completely pointless. Just read as much as you can at a time and deal with it, or read one byte at a time from the buffered stream.
Similarly you should certainly wrap the SSLSocket's output stream in a BufferedOutputStream.
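A sketch of the wrapping (buffer sizes here are illustrative):

InputStream in = new BufferedInputStream(sslSocket.getInputStream(), 8192);
OutputStream out = new BufferedOutputStream(sslSocket.getOutputStream(), 8192);
int b = in.read(); // single-byte reads now hit the in-memory buffer, not a new SSL record
// ... write the response ...
out.flush(); // without the flush, buffered output may never reach the client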
Related
I am making a Java program that reads data from a binary stream (using a DataInputStream).
Sometimes during this process I need to read a data chunk, but the method (which I cannot modify) that reads it will stop before reaching the end of the chunk (this is its normal behavior; apparently it just doesn't need the last bytes, but I can't do anything about the fact that they are there). This is not a problem in itself, because I know exactly how long the chunk is, i.e. I know how many bytes are in the chunk, so I can skip bytes (with the skipBytes(int) method) until the end of the chunk. The problem is: I don't actually know how many bytes the method actually read (or left over), so I don't know how many bytes I need to skip to reach the end of the chunk.
Is there any way to:
know how many bytes were read from a stream since a certain point in time?
know how many bytes were read from a stream since it was opened?
any other way I could find out how many bytes my data-chunk-reading method just read (since it won't directly tell me)?
Just in case, I made a small diagram.
Thanks in advance
ImageInputStream can do what you want. It implements DataInput and it has most of the methods of InputStream. And it has getStreamPosition, seek and skipBytes methods.
However, as you correctly point out, ImageIO.read(ImageInputStream) would close the stream, preventing you from reading more than one image.
The solution is to avoid using ImageIO.read, and instead obtain an ImageReader explicitly, using ImageIO.getImageReaders. Then you can invoke an ImageReader’s read method, which does not close the stream.
Here’s how I implemented it:
public void readImages(InputStream source,
                       Consumer<? super BufferedImage> imageHandler)
        throws IOException {
    // Every image is at a byte index which is a multiple of this number.
    int boundary = 5000;
    try (ImageInputStream stream = ImageIO.createImageInputStream(source)) {
        while (true) {
            long pos = stream.getStreamPosition();
            Iterator<ImageReader> readers = ImageIO.getImageReaders(stream);
            if (!readers.hasNext()) {
                break;
            }
            ImageReader reader = readers.next();
            reader.setInput(stream);
            BufferedImage image = reader.read(0);
            imageHandler.accept(image);
            pos = stream.getStreamPosition();
            long bytesToSkip = boundary - (pos % boundary);
            if (bytesToSkip < boundary) {
                stream.skipBytes(bytesToSkip);
            }
        }
    }
}
And here’s how I tested it:
try (InputStream source = new BufferedInputStream(
        Files.newInputStream(Path.of(filename)))) {
    reader.readImages(source, img -> EventQueue.invokeLater(() -> {
        JOptionPane.showMessageDialog(null, new ImageIcon(img));
    }));
}
All the buffered read methods return the actual number of bytes read.
Quoting documentation for InputStream#read(byte[] b):
Returns:
the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.
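If the chunk-reading method performs its reads internally and won't report a count, one option (a sketch of my own, not something spelled out above; Apache Commons IO ships a ready-made CountingInputStream that does the same) is to interpose a counting FilterInputStream below the DataInputStream and diff its counter around the call:

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class CountingInputStream extends FilterInputStream {
    private long count = 0; // bytes consumed so far

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) count++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) count += n;
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;
        return skipped;
    }

    long getCount() {
        return count;
    }
}

Wrap it once (CountingInputStream counter = new CountingInputStream(rawIn); DataInputStream data = new DataInputStream(counter);), record counter.getCount() before the chunk-reading method, and afterwards skip chunkLength minus the difference.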
I'm trying to receive a file in byte[], and I'm using:
byte[] buffer = new byte[16384]; // How many bytes to read each run
InputStream in = socket.getInputStream(); // Get the data (bytes)
while ((count = in.read(buffer)) > 0) { // While there is more data, keep running
    fos.write(buffer); // Write the data to the file
    times++; // Get the amount of times the loop ran
    System.out.println("Times: " + times);
}
System.out.println("Loop ended");
The loop stops after 1293 iterations and then stops printing the times. But the code never reaches System.out.println("Loop ended"); it seems like the loop is waiting for something...
Why doesn't the loop break?
Your loop terminates only at the end of the input stream. Has the sender terminated the stream (closed the socket)? If not, then there is no end yet.
In such a case, read() will block until there is at least one byte available.
If the socket cannot be closed at the end of the file, for some reason, then you will need to find another way for the recipient to know when to exit the loop. A usual method is to first send the number of bytes that will be sent.
Your write-to-file is faulty as well, since it will attempt to write the entire buffer. But the read can return a partial buffer; that's why it returns a count. The returned count needs to be used in the write to the output file.
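To make that concrete, here is a sketch of a corrected receive loop (assuming the sender first writes the byte count with DataOutputStream.writeInt; fos is the question's FileOutputStream):

DataInputStream in = new DataInputStream(socket.getInputStream());
int length = in.readInt(); // the sender writes the total byte count first
byte[] buffer = new byte[16384];
int remaining = length;
while (remaining > 0) {
    int count = in.read(buffer, 0, Math.min(buffer.length, remaining));
    if (count == -1) {
        throw new EOFException("stream ended after " + (length - remaining) + " of " + length + " bytes");
    }
    fos.write(buffer, 0, count); // write only the bytes actually read
    remaining -= count;
}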
In my application, I'm trying to compress/decompress byte array using java's Inflater/Deflater class.
Here's part of the code I used at first:
ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length);
byte[] buffer = new byte[1024];
while (!inflater.finished()) {
    int count = inflater.inflate(buffer);
    outputStream.write(buffer, 0, count);
}
Then, after I deployed the code, it would randomly (very rarely) cause the whole application to hang, and when I took a thread dump I could identify one hanging thread:
at java.util.zip.Inflater.inflateBytes(Native Method)
at java.util.zip.Inflater.inflate(Inflater.java:259)
- locked java.util.zip.ZStreamRef#fc71443
at java.util.zip.Inflater.inflate(Inflater.java:280)
It doesn't happen very often. Then I googled everywhere and found that it could be caused by empty byte data being passed to the inflater, in which case finished() will never return true.
So I used a workaround, instead of using
while (!inflater.finished())
to determine if it's finished, I used
while (inflater.getRemaining() > 0)
But it happened again.
Now it makes me wonder what the real cause of the issue is. There shouldn't be any empty array passed to the inflater, and even if there were, how come the getRemaining() method did not break the while loop?
Can anybody help, please? It's really bugging me.
Confused by the same problem, I found this page.
This is my workaround; it may help:
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
while (!inflater.finished()) {
    int i = inflater.inflate(buffer);
    if (i == 0) {
        break;
    }
    byteArrayOutputStream.write(buffer, 0, i);
}
The javadoc of inflate:
Uncompresses bytes into specified buffer. Returns actual number of bytes uncompressed. A return value of 0 indicates that needsInput() or needsDictionary() should be called in order to determine if more input data or a preset dictionary is required. In the latter case, getAdler() can be used to get the Adler-32 value of the dictionary required.
So @Wildo Luo was certainly right to check for 0 being returned.
byte[] buffer = new byte[1024];
while (!inflater.finished()) {
    int count = inflater.inflate(buffer);
    if (count != 0) {
        outputStream.write(buffer, 0, count);
    } else {
        if (inflater.needsInput()) { // Not everything read
            inflater.setInput(...);
        } else if (inflater.needsDictionary()) { // Dictionary to be loaded
            inflater.setDictionary(...);
        }
    }
}
inflater.end();
I can only imagine that elsewhere the code is not entirely right, maybe on the compression side. Better to first check the general code: there is the Inflater(boolean nowrap) constructor requiring an extra byte, the end() call, exception handling (try-finally), etcetera.
For unknown data and unknown occurrences: use a try-catch, capture the compressed data to check whether it is a data-based error, and use it for testing any solution.
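To illustrate those points, a sketch only (it assumes the whole compressed chunk is already in a byte array named compressed): set the input explicitly, treat a zero return that wants more input as an error, and always call end() in a finally block.

Inflater inflater = new Inflater(); // use new Inflater(true) only if the deflater used nowrap
try {
    inflater.setInput(compressed);
    ByteArrayOutputStream out = new ByteArrayOutputStream(compressed.length * 2);
    byte[] buffer = new byte[1024];
    while (!inflater.finished()) {
        int count = inflater.inflate(buffer);
        if (count == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
            // no progress is possible: truncated data or a missing preset dictionary
            throw new IOException("Incomplete or mismatched deflate stream");
        }
        out.write(buffer, 0, count);
    }
    byte[] result = out.toByteArray(); // the decompressed data
} catch (DataFormatException e) {
    // a data-based error: keep the offending chunk around for offline testing
    throw new IOException("Bad compressed data", e);
} finally {
    inflater.end(); // releases the native zlib memory immediately
}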
Having the same problem...
What I'm sure about:
I have an infinite loop, confirmed by printed logs.
inflater.inflate() returns 0, and the output buffer's remaining size is 0.
My loop is like this (Hive ORC code):
while (!(inflater.finished() || inflater.needsDictionary() ||
         inflater.needsInput())) {
    try {
        int count = inflater.inflate(out.array(),
                                     out.arrayOffset() + out.position(),
                                     out.remaining());
        out.position(count + out.position());
    } catch (DataFormatException dfe) {
        throw new IOException("Bad compression data", dfe);
    }
}
After the out buffer is consumed and its remaining size is 0, the loop runs forever.
But I'm not sure whether it's ORC or zlib that causes this. On the ORC side, it fills the original data using the same compression buffer size and then compresses, so theoretically it's not possible for me to get a compressed chunk larger than the buffer size. The remaining possibilities may be zlib or hardware.
That being said, breaking the loop when count == 0 is dangerous, since there may still be uncompressed data in the inflater.
Currently, I am relying on the ObjectInputStream.available() method to tell me how many bytes are left in a stream. The reason for this: I am writing some unit/integration tests on certain functions that deal with streams, and I am just trying to ensure that the available() method returns 0 after I am done.
Unfortunately, upon testing for failure (i.e., I have sent about 8 bytes down the stream), my assertion that available() == 0 is coming up true when it should be false. It should show > 0, namely 8 bytes!
I know that the available() method is classically unreliable, but I figured it would at least show something > 0!
Is there a more reliable way of checking whether a stream is empty or not? (That is my main goal here, after all.) Perhaps in the Apache IO domain or some other library out there?
Does anyone know why the available() method is so profoundly unreliable, and what is the point of it? Or is there a specific, proper way of using it?
Update:
So, as many of you can read from the comments, the main issue I am facing is that on one end of a stream, I am sending a certain number of bytes but on the other end, not all the bytes are arriving!
Specifically, I am sending 205498 bytes on one end and only getting 204988 on the other, consistently. I am controlling both sides of this operation between threads in a socket, but that shouldn't matter.
Here is the code I have written to collect all the bytes.
public static int copyStream(InputStream readFrom, OutputStream writeTo, int bytesToRead)
        throws IOException {
    int bytesReadTotal = 0, bytesRead = 0, countTries = 0, available = 0, bufferSize = 1024 * 4;
    byte[] buffer = new byte[bufferSize];
    while (bytesReadTotal < bytesToRead) {
        if (bytesToRead - bytesReadTotal < bufferSize)
            buffer = new byte[bytesToRead - bytesReadTotal];
        if (0 < (available = readFrom.available())) {
            bytesReadTotal += (bytesRead = readFrom.read(buffer));
            writeTo.write(buffer, 0, bytesRead);
            countTries = 0;
        } else if (countTries < 1000)
            try {
                countTries++;
                Thread.sleep(1L);
            } catch (InterruptedException ignore) {}
        else
            break;
    }
    return bytesReadTotal;
}
I put the countTries variable in there just to see what happens. Even without countTries, it will block forever before it reaches bytesToRead.
What would cause the stream to suddenly block indefinitely like that? I know the other end fully sends the bytes (it actually uses the same method, and I can see that it completes the function with bytesReadTotal matching the full bytesToRead in the end), but the receiver doesn't. In fact, when I look at the arrays, they match up perfectly up to the end as well.
UPDATE2
I noticed that when I added a writeTo.flush() at the end of my copyStream method, it seems to work again. Hmm... why are flushes so vital in this situation? That is, why would not using one cause a stream to block forever?
The available() method only returns how many bytes can be read without blocking (which may be 0). In order to see if there are any bytes left in the stream, you have to read() or read(byte[]) which will return the number of bytes read. If the return value is -1 then you have reached the end of file.
This little code snippet will loop through an InputStream until it gets to the end (read() returns -1). I don't think read() can ever return 0 here, because it should block until it can either read 1 byte or discover there is nothing left to read (and therefore return -1).
int currentBytesRead = 0;
int totalBytesRead = 0;
byte[] buf = new byte[1024];
while ((currentBytesRead = in.read(buf)) > 0) {
    totalBytesRead += currentBytesRead;
}
This very well may just be a KISS moment, but I feel like I should ask anyway.
I have a thread that reads from a socket's InputStream. Since I am dealing with particularly small data sizes (the data I can expect to receive is on the order of 100 - 200 bytes), I set the buffer array size to 256. As part of my read function, I have a check to ensure that when I read from the InputStream, I got all of the data. If I didn't, then I recursively call the read function again. For each recursive call, I merge the two buffer arrays back together.
My problem is: while I never anticipate needing more than the 256-byte buffer, I want to be safe. But if sheep begin to fly and the data is significantly larger, the read function will (by my estimation) take exponentially more time to complete.
How can I increase the efficiency of the read function and/or the buffer merging?
Here is the read function as it stands.
int BUFFER_AMOUNT = 256;

private int read(byte[] buffer) throws IOException {
    int bytes = mInStream.read(buffer); // Read the input stream
    if (bytes == -1) { // If bytes == -1 then we didn't get all of the data
        byte[] newBuffer = new byte[BUFFER_AMOUNT]; // Try to get the rest
        int newBytes;
        newBytes = read(newBuffer); // Recurse until we have all the data
        byte[] oldBuffer = new byte[bytes + newBytes]; // make the final array size
        // Merge buffer into the beginning of old buffer.
        // We do this so that once the method finishes, we can just add the
        // modified buffer to a queue later in the class for processing.
        for (int i = 0; i < bytes; i++)
            oldBuffer[i] = buffer[i];
        for (int i = bytes; i < bytes + newBytes; i++) // Merge newBuffer into the latter half of oldBuffer
            oldBuffer[i] = newBuffer[i];
        // Used for the recursion
        buffer = oldBuffer; // And now we set buffer to the new buffer full of all the data.
        return bytes + newBytes;
    }
    return bytes;
}
EDIT: Am I being (unjustifiably) paranoid, and should I just set the buffer to 2048 and call it done?
BufferedInputStream, as noted by Roland, and DataInputStream.readFully(), which replaces all the looping code.
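A sketch of the readFully variant (expectedLength is a hypothetical value you would obtain from a length header or a fixed protocol size; mInStream is the question's stream):

DataInputStream in = new DataInputStream(new BufferedInputStream(mInStream));
byte[] buffer = new byte[expectedLength];
in.readFully(buffer); // blocks until the buffer is completely filled, or throws EOFException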
int BUFFER_AMOUNT = 256;
Should be final if you don't want it changing at runtime.
if (bytes == -1) {
Should be !=
Also, I'm not entirely clear on what you're trying to accomplish with this code. Do you mind shedding some light on that?
I have no idea what you mean by "small data sizes". You should measure whether the time is spent in kernel mode (then you are issuing too many reads directly on the socket) or in user mode (then your algorithm is too complicated).
In the former case, just wrap the input with a BufferedInputStream with 4096 bytes of buffer and read from it.
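That wrapping is a one-liner (a sketch; the socket name is illustrative):

InputStream in = new BufferedInputStream(socket.getInputStream(), 4096);
// single-byte and small reads are now served from the 4096-byte buffer,
// so roughly one system call per 4 KB reaches the kernel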
In the latter case, just use this code:
/**
 * Reads as much as possible from the stream.
 * @return The number of bytes read into the buffer, or -1
 *         if nothing has been read because the end of file has been reached.
 */
static int readGreedily(InputStream is, byte[] buf, int start, int len)
        throws IOException {
    int nread;
    int ptr = start; // index at which the data is put into the buffer
    int rest = len;  // number of bytes that we still want to read
    while ((nread = is.read(buf, ptr, rest)) > 0) {
        ptr += nread;
        rest -= nread;
    }
    int totalRead = len - rest;
    return (nread == -1 && totalRead == 0) ? -1 : totalRead;
}
This code completely avoids creating new objects and calling unnecessary methods; furthermore, it is straightforward.