How to avoid EOFException for an empty HTTP response from HttpURLConnection? - java

I'm sending an HTTP request to a server which legitimately returns a blank response with HTTP response code = 200 (OK).
But this causes the marked line below to throw an EOFException:
InputStream responseStream = httpURLConnection.getInputStream();
final String contentEncoding = httpURLConnection.getContentEncoding();
if ("gzip".equalsIgnoreCase(contentEncoding)) {
    responseStream = new GZIPInputStream(responseStream); // throws EOFException
}
There may be an obvious way to prevent this, but I can't find it. Should I perhaps be checking something like connection.getContentLength() > 0 or responseStream.available() > 0 before creating the GZIPInputStream? Neither of these seems quite right, and I haven't come across anything similar in example code fragments...

Should I perhaps be checking something like connection.getContentLength() > 0
Yes.
or responseStream.available() > 0
Definitely not. available() == 0 isn't a valid test for EOF, and the Javadoc explicitly says so.
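For example, a minimal sketch of that check, assuming an already-connected HttpURLConnection (the helper name openResponseStream is made up for illustration):

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.util.zip.GZIPInputStream;

static InputStream openResponseStream(HttpURLConnection connection) throws IOException {
    InputStream responseStream = connection.getInputStream();
    String contentEncoding = connection.getContentEncoding();
    // getContentLength() returns -1 when the length is unknown (e.g. with
    // chunked transfer encoding), so this guard assumes the server sends
    // an explicit Content-Length header.
    if ("gzip".equalsIgnoreCase(contentEncoding) && connection.getContentLength() > 0) {
        responseStream = new GZIPInputStream(responseStream);
    }
    return responseStream;
}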

Shouldn't your server return HTTP status 204 (No Content) instead of 200, except for a HEAD request? See the HTTP status code definitions.
Apart from that, you could indeed add the check to work around what looks like a not-quite-compliant server implementation, or catch the exception and then determine the proper action depending on the type of request you sent.
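A sketch of the catch-based variant, reusing the question's variable names (the exception here is java.io.EOFException); the raw empty stream is kept as-is, so callers simply read zero bytes:

InputStream responseStream = httpURLConnection.getInputStream();
if ("gzip".equalsIgnoreCase(httpURLConnection.getContentEncoding())) {
    try {
        responseStream = new GZIPInputStream(responseStream);
    } catch (EOFException e) {
        // The body was empty: keep the raw stream, which will just report EOF.
    }
}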

What I understand from your question is that this exception is thrown consistently every time you get a 200 response from the server.
If responseStream.available() > 0 at that point, it would mean the stream actually contains something but is incomplete or ends prematurely by some other means, since available() states that there ARE readable bytes in the stream.
Update
Based on your comment (and after reading InputStream's docs again out of sheer curiosity) I also believe connection.getContentLength() > 0 is the way to go. If you're using Java 7, though, connection.getContentLengthLong() is preferred over that because it returns a long directly.
Given what the docs say about InputStream#available:
The available method for class InputStream always returns 0.
And
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.
This rules out that check as an option.

Unless I'm missing something, a response with Content-Encoding "gzip" by definition cannot be empty: even gzipping zero bytes of input produces roughly 20 bytes of gzip header and trailer.

Related

non blocking writes between listeners

I'm reading an HTTP response using the Apache async client. Every time I read a chunk of data I want to write it to the ServletOutputStream in non-blocking mode. Something like this:
// decoder.read is executed when data is available for reading
while (decoder.read(this.bbuf) > 0) {
    this.bbuf.flip();
    int numBytesRead = this.bbuf.remaining(); // bytes made readable by flip()
    byte[] arr = new byte[numBytesRead];
    this.bbuf.get(arr);
    // Blocking write to the ServletOutputStream 'sos'
    this.sos.write(arr);
    this.bbuf.compact();
}
Obviously this does not work even if I wrap the 'sos' in a WritableChannel, because I always end up with the ByteArrayOutputStream from the ServletOutputStream.
So I should add a WriteListener to the ServletOutputStream to switch to NIO mode, but here comes the problem I'm not able to solve: how can I pass every chunk of data from my HTTP callback to the WriteListener so the whole thing works asynchronously and without blocking?
Is this possible? If so, can anyone give me a clue about how to do it?
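One possible shape for this, sketched under these assumptions: the servlet has already called request.startAsync(), the container supports Servlet 3.1 non-blocking I/O, and the ChunkRelay name and its offer/sourceFinished methods are made up for illustration. The async client callback adds chunks to a queue, and the WriteListener drains the queue whenever the container reports the stream is ready:

import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import javax.servlet.AsyncContext;
import javax.servlet.ServletOutputStream;
import javax.servlet.WriteListener;

class ChunkRelay implements WriteListener {
    private final Queue<byte[]> queue = new ConcurrentLinkedQueue<>();
    private final ServletOutputStream sos;
    private final AsyncContext asyncContext;
    private volatile boolean sourceDone = false;

    ChunkRelay(AsyncContext asyncContext) throws IOException {
        this.asyncContext = asyncContext;
        this.sos = asyncContext.getResponse().getOutputStream();
        this.sos.setWriteListener(this);
    }

    // Called from the async client's read callback with each chunk.
    synchronized void offer(byte[] chunk) throws IOException {
        queue.add(chunk);
        onWritePossible(); // try to drain right away if the stream is ready
    }

    synchronized void sourceFinished() throws IOException {
        sourceDone = true;
        onWritePossible();
    }

    @Override
    public synchronized void onWritePossible() throws IOException {
        // Only write while the container says it can accept data without
        // blocking; when isReady() returns false, the container will call
        // onWritePossible() again once the stream drains.
        while (sos.isReady()) {
            byte[] chunk = queue.poll();
            if (chunk == null) {
                if (sourceDone) {
                    asyncContext.complete();
                }
                return; // wait for more data from the client callback
            }
            sos.write(chunk);
        }
    }

    @Override
    public void onError(Throwable t) {
        asyncContext.complete();
    }
}

The synchronized methods are a blunt way to keep the client thread and the container thread from writing at the same time; production code would want something more careful.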

Weird issues with gzip encoded responses

OK, so I'm running my own fork of NanoHttpd (a minimalist Java web server; the fork is quite complex, though), and I had to implement gzip compression on top of it.
It has worked fine, but it just turned out that Firefox 33.0 on Linux Mint 17.1 will not execute gzipped JS files at all, although they load just fine and the headers look OK. This does not happen on the same PC with Chrome, or with any other browser I've tried, but it is still something I must get fixed.
Also, the JS resources execute just fine if I disable gzipping. I also tried removing Connection: keep-alive, but that did not have any effect.
Here's the code responsible for gzipping:
private void sendAsFixedLength(OutputStream outputStream) throws IOException {
    int pending = data != null ? data.available() : 0; // This is to support partial sends, see serveFile()
    headerLines.add("Content-Length: " + pending + "\r\n");
    boolean acceptEncoding = shouldAcceptEnc();
    if (acceptEncoding) {
        headerLines.add("Content-Encoding: gzip\r\n");
    }
    headerLines.add("\r\n");
    dumpHeaderLines(outputStream); // writes header to outputStream
    if (acceptEncoding)
        outputStream = new java.util.zip.GZIPOutputStream(outputStream);
    if (requestMethod != Method.HEAD && data != null) {
        int BUFFER_SIZE = 16 * 1024;
        byte[] buff = new byte[BUFFER_SIZE];
        while (pending > 0) {
            int read = data.read(buff, 0, ((pending > BUFFER_SIZE) ? BUFFER_SIZE : pending));
            if (read <= 0) {
                break;
            }
            outputStream.write(buff, 0, read);
            pending -= read;
        }
    }
    outputStream.flush();
    outputStream.close();
}
FWIW, the example I copied this from did not close the outputStream, but without doing that the gzipped resources did not load at all, while non-gzipped resources still loaded OK. So I'm guessing that part is off in some way.
EDIT: Firefox won't give any errors, it just does not execute the script, e.g.:
index.html:
<html><head><script src="foo.js"></script></head></html>
foo.js:
alert("foo");
Nothing happens, even though the resources are loaded OK. No warnings in the console, nothing. It works fine when gzip is disabled, and in other browsers.
EDIT 2:
If I request foo.js directly, it loads just fine.
EDIT 3:
Tried checking the responses & headers with TemperData while having gzipping on/off.
The only difference was that when gzipping is turned on, there is Content-Encoding: gzip in the response header, which is not very surprising. Other than that, 100% identical responses.
EDIT 4:
Turns out that removing Content-Length from the header made it work again... Not sure of the side effects, though, but at least this pinpoints it better.
I think the cause of your problem is that you are writing the Content-Length header before compressing the data, which sends the browser inconsistent information: the declared length is that of the uncompressed body, but the bytes on the wire are the (shorter) compressed body. I guess that each browser implementation handles this situation in its own way, and it seems that Firefox does it the strict way.
If you don't know the size of the compressed data in advance (which is understandable), you'd better avoid writing the Content-Length header; it is not mandatory.
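If you do want to keep Content-Length, a sketch of the alternative, reusing the question's data, headerLines and dumpHeaderLines members (the method name is made up): compress the body into a buffer first, so the length you write matches the bytes you send. This holds the whole compressed body in memory, so it only suits reasonably small responses:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

private void sendGzippedWithLength(OutputStream outputStream) throws IOException {
    // Compress everything up front so the real compressed size is known
    // before any header is written.
    ByteArrayOutputStream compressed = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
        byte[] buff = new byte[16 * 1024];
        int read;
        while ((read = data.read(buff)) != -1) {
            gzip.write(buff, 0, read);
        }
    }
    headerLines.add("Content-Encoding: gzip\r\n");
    headerLines.add("Content-Length: " + compressed.size() + "\r\n");
    headerLines.add("\r\n");
    dumpHeaderLines(outputStream); // writes the headers to outputStream
    compressed.writeTo(outputStream);
    outputStream.flush();
}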

Java ObjectOutputStream reset error

My project consists of two parts: a server side and a client side. When I start the server side everything is OK, but when I start the client side, from time to time I get this error:
java.io.IOException: stream active
at java.io.ObjectOutputStream.reset(Unknown Source)
at client.side.TcpConnection.sendUpdatedVersion(TcpConnection.java:77)
at client.side.Main.sendCharacter(Main.java:167)
at client.side.Main.start(Main.java:121)
at client.side.Main.main(Main.java:60)
When I tried to run this project on another PC this error occurred even more frequently. In the Java docs I found this bit:
Reset may not be called while objects are being serialized. If called
inappropriately, an IOException is thrown.
And this is the function where the error is thrown:
void sendUpdatedVersion(CharacterControlData data) {
    try {
        ServerMessage msg = new ServerMessage(SEND_MAIN_CHARACTER);
        msg.setCharacterData(data);
        oos.writeObject(msg);
        oos.reset();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
I tried adding flush(), but that didn't help. Any ideas? Also, there are no errors on the server side.
I think you're misunderstanding what reset() does. It resets the stream to disregard any object instances previously written to it. This is pretty clearly not what you want in your case, since you're sending an object to the stream and then resetting straight away, which is pointless.
It looks like all you need is a flush(); if that's insufficient then the problem is on the receiving side.
I think you are confusing close() with reset().
Use
oos.close();
instead of oos.reset();
Calling reset() is a perfectly valid thing to want to do. It is possible that data is reused, or some field in data is reused, and without the reset() the second call to sendUpdatedVersion would not send that part. So those who complain that the use is invalid are not accurate. Now, as to why you are getting this error message:
What the error message is saying is that you are not at the top level of your writeObject call chain; sendUpdatedVersion must be being called from a method that was itself called from another writeObject.
I'm assuming that some object implements a custom writeObject(), and that method is calling this method.
So you have to differentiate when sendUpdatedVersion is being called at the top level of the call chain, and only use reset() in those cases.
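A defensive sketch of that differentiation, reusing the question's oos field: attempt the reset() after each top-level write and skip it when the stream reports that serialization is still active higher up the call chain. Note the catch is broad, since reset() only documents a plain IOException for this case:

oos.writeObject(msg);
try {
    // Valid at the top level: forget previously written instances so that
    // re-sent (mutated) objects are serialized fresh next time.
    oos.reset();
} catch (IOException streamActive) {
    // reset() was invoked while an outer writeObject() is still running
    // (e.g. from a custom writeObject implementation); leave the reset
    // to the top-level caller instead.
}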

How to Ensure Input from URL isn't from a Redirected Page

I have the following lines of code that gathers the source code from a given URL:
URL url = new URL(websiteAddress);
URLConnection connection = url.openConnection(); // throws an IOException
connection.setConnectTimeout(timeoutInMilliseconds);
BufferedReader bufferedReader =
        new BufferedReader(new InputStreamReader(connection.getInputStream()));
String outputString = "";
String line;
while ((line = bufferedReader.readLine()) != null) {
    outputString += line;
}
However, the problem that I'm having is that wi-fi hotspots often redirect you to a page where you have to click "I Agree." If you run this code before you have clicked that checkbox, then it gathers the source code from the hotspot login page, rather than the intended page.
What I want to do is have some way of checking whether or not the intended page was reached. I was hoping that calling connection.getURL() after creating the InputStreamReader would show me the web page that was actually reached, but no such luck. How can I determine whether or not the intended URL has been redirected?
One way would be to look for some specific element in your web page; if it's not there, then you know you may be on some other page (possibly redirected to a login page).
The only thing I can suggest is to have a server where you know what the response is, and query that first to ensure connectivity to at least that server. That will (typically) be enough to assume full connectivity.
You can then go on to query the URL you're interested in.
The challenge is that if a computer asks for the page at some URL, the way a lot of wifi hotspots work is to intercept that request and return their own page. There's often no clue, from the computer's point of view, that the page returned is not the page requested.
One option would be to call setFollowRedirects(false). By default, a connection will quietly follow redirects and try to reach a page which returns a 200 HTTP response. Disabling redirect following will make confirming the expected page is returned easier, simply confirm the response is a 200.
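A minimal sketch of that check; the helper name reachedIntendedPage and the 5-second timeout are arbitrary choices:

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

static boolean reachedIntendedPage(String websiteAddress) throws IOException {
    HttpURLConnection connection =
            (HttpURLConnection) new URL(websiteAddress).openConnection();
    connection.setInstanceFollowRedirects(false); // per-connection variant of setFollowRedirects
    connection.setConnectTimeout(5000);
    // Any 3xx response here means something (e.g. a hotspot login page)
    // redirected the request instead of serving the page itself.
    return connection.getResponseCode() == HttpURLConnection.HTTP_OK;
}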
That said, #rec's comment is worth taking into account - it isn't enough to simply check the response code, because there are many different ways a router could interrupt your request, many of which are not detectable. A malicious router could, for instance, intercept all your requests and change the responding content in a subtle but dangerous way - this is called a man-in-the-middle attack.
By definition you cannot avoid MitM attacks unless you can open a secure and trusted connection (generally, HTTPS) between yourself and the remote site, however assuming you aren't really concerned about attacks, the better tactic is simply to assume the data you get back could be broken in any number of ways, and instead make your scraping logic more robust to that possibility.
I can't speak directly to how you would make your logic more robust without understanding your use case and the issues you've run into, however the gist would be to add checks where issues might arise, and throw an exception that you then handle gracefully higher up the stack.
For instance, if your code was:
System.out.println(outputString.substring(outputString.indexOf('A')));
This would fail if outputString didn't actually contain an 'A' character. So check that explicitly:
int aPos = outputString.indexOf('A');
if (aPos < 0) {
    throw new InvalidParseException("Didn't find an 'A', cannot proceed");
}
System.out.println(outputString.substring(aPos));
And handle the InvalidParseException wherever makes the most sense for your use case.

Should I close the servlet outputstream? [duplicate]

Possible Duplicate:
Should one call .close() on HttpServletResponse.getOutputStream()/.getWriter()?
Am I responsible for closing the HttpServletResponse.getOutputStream() (or the getWriter(), or even the input stream), or should I leave that to the container?
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    OutputStream o = response.getOutputStream();
    ...
    o.close(); // yes/no?
}
You indeed don't need to do so.
Rule of thumb: if you didn't create/open it yourself using new SomeOutputStream(), then you don't need to close it yourself. If it were, for example, a new FileOutputStream("c:/foo.txt"), then you obviously need to close it yourself.
One reason some people still do it is to ensure that nothing more can be written to the response body. If that ever happens, it will cause an IllegalStateException in the appserver logs, but it won't affect the client, so the client still gets the proper response. This also makes it easier to spot potential problems in the request-response chain that you wouldn't see at first glance, for example something else appending more data to the response body further down the chain.
Another reason you see among beginners is that they just want to prevent more data from being written to the response body. You see this often when JSP incorrectly plays a role in the response; they just ignore the IllegalStateExceptions in the logs. Needless to say, closing the stream for this particular purpose is bad practice.
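A small sketch of that rule of thumb (payload stands in for whatever bytes you are writing):

// Container-provided stream: just write; the container closes it.
OutputStream out = response.getOutputStream();
out.write(payload);

// A stream you opened yourself: you are responsible for closing it.
try (OutputStream file = new FileOutputStream("c:/foo.txt")) {
    file.write(payload);
}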
No, you don't need to close it. If you do, you basically end the response to the client: after closing the stream you cannot send anything else to the client until the next request. You didn't open the stream, so you don't have to close it.
