When using HttpURLConnection does the InputStream need to be closed if we do not 'get' and use it?
i.e. is this safe?
HttpURLConnection conn = (HttpURLConnection) uri.getURI().toURL().openConnection();
conn.connect();
// check for content type I don't care about
if (conn.getContentType.equals("image/gif") return;
// get stream and read from it
InputStream is = conn.getInputStream();
try {
// read from is
} finally {
is.close();
}
Secondly, is it safe to close an InputStream before all of it's content has been fully read?
Is there a risk of leaving the underlying socket in ESTABLISHED or even CLOSE_WAIT state?
According to http://docs.oracle.com/javase/6/docs/technotes/guides/net/http-keepalive.html
and OpenJDK source code.
(When keepAlive == true)
If client called HttpURLConnection.getInputSteam().close(), the later call to HttpURLConnection.disconnect() will NOT close the Socket. i.e. The Socket is reused (cached)
If client does not call close(), call disconnect() will close the InputStream and close the Socket.
So in order to reuse the Socket, just call InputStream.close(). Do not call HttpURLConnection.disconnect().
is it safe to close an InputStream
before all of it's content has been
read
You need to read all of the data in the input stream before you close it so that the underlying TCP connection gets cached. I have read that it should not be required in latest Java, but it was always mandated to read the whole response for connection re-use.
Check this post: keep-alive in java6
Here is some information regarding the keep-alive cache. All of this information pertains Java 6, but is probably also accurate for many prior and later versions.
From what I can tell, the code boils down to:
If the remote server sends a "Keep-Alive" header with a "timeout" value that can be parsed as a positive integer, that number of seconds is used for the timeout.
If the remote server sends a "Keep-Alive" header but it doesn't have a "timeout" value that can be parsed as a positive integer and "usingProxy" is true, then the timeout is 60 seconds.
In all other cases, the timeout is 5 seconds.
This logic is split between two places: around line 725 of sun.net.www.http.HttpClient (in the "parseHTTPHeader" method), and around line 120 of sun.net.www.http.KeepAliveCache (in the "put" method).
So, there are two ways to control the timeout period:
Control the remote server and configure it to send a Keep-Alive header with the proper timeout field
Modify the JDK source code and build your own.
One would think that it would be possible to change the apparently arbitrary five-second default without recompiling internal JDK classes, but it isn't. A bug was filed in 2005 requesting this ability, but Sun refused to provide it.
If you really want to make sure that the connection is close you should call conn.disconnect().
The open connections you observed are because of the HTTP 1.1 connection keep alive feature (also known as HTTP Persistent Connections).
If the server supports HTTP 1.1 and does not send a Connection: close in the response header Java does not immediately close the underlaying TCP connection when you close the input stream. Instead it keeps it open and tries to reuse it for the next HTTP request to the same server.
If you don't want this behaviour at all you can set the system property http.keepAlive to false:
System.setProperty("http.keepAlive","false");
When using HttpURLConnection does the InputStream need to be closed if we do not 'get' and use it?
Yes, it always needs to be closed.
i.e. is this safe?
Not 100%, you run the risk of getting a NPE. Safer is:
InputStream is = null;
try {
is = conn.getInputStream()
// read from is
} finally {
if (is != null) {
is.close();
}
}
You also have to close error stream if the HTTP request fails (anything but 200):
try {
...
}
catch (IOException e) {
connection.getErrorStream().close();
}
If you don't do it, all requests that don't return 200 (e.g. timeout) will leak one socket.
Since Java 7 the recommended way is
try (InputStream is = conn.getInputStream()) {
// read from is
// ...
}
as for all other classes implementing Closable. close() is called at the end of the try {...} block.
Closing the input stream also means you are done with reading. Otherwise the connection hangs around until the finalizer closes the stream.
Same applies to the output stream, if you are sending data.
There is no need to get an close the ErrorStream. Even if it implements the InputStream interface: It's using the InputStream in combination with a buffer. Closing the InputStream is sufficient.
Related
I'm working on project where I need to read several files from the server.
I was wondering if I can read the output and connect to somewhere else with the same instance.
Apparently I can (see below).
Tried this pseudocode:
c.connect(google)
BufferedReader r1 = ... c.getInputStream();
c.connect(somewhereelse)
BufferedReader r2 = ... c.getInputStream();
print(r1)
print(r2)
And the output was correct.
I don't know how does IO streams works or what does this conn function returns (I'll check that during winter evenings :-))
Real question: Can I rely on the fact I'll always get correct data?
i.e. Does buffered reader keep some reference to data object that is not anymore bound to connection itself?
No, I think you can't trust this code. InputStream needs a connection to fetch data.
For example, if you open java.net.SocketInputStream (url connections use this stream) and see read method, you will see such code:
// connection reset
if (impl.isConnectionReset()) {
throw new SocketException("Connection reset");
}
a bit lower you can see, that stream tries to fetch buffered data if exists:
/*
* We receive a "connection reset" but there may be bytes still
* buffered on the socket
*/
if (gotReset) {...}
and lower is checking for a connections lost:
/*
* If we get here we are at EOF, the socket has been closed,
* or the connection has been reset.
*/
if (impl.isClosedOrPending()) {
throw new SocketException("Socket closed");
}
About your working implementation:
First guess is that you did not close your first connection. Linking c to another connection did not close previous one.
Second guess is that your data was buffered. By connection or by your buffer reader.
I could be wrong in my thoughts, but it seams to be true at first sight.
I have implemented a small HTTP-server which allows clients to connect via HTTP and stream audio-data to them.
My problem is, that in case there's currently no audio-data available, the connection seems to break, either because the client is disconnecting, or due to another reason inside Android.
I'm acting like the following way:
serverSocket = new ServerSocket(0);
Socket socket = serverSocket.accept();
socket.setKeepAlive(true);
BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
out.write("HTTP/1.1 200 OK\r\n");
out.write("Content-Type: audio/wav\r\n");
out.write("Accept-Ranges: none\r\n");
out.write("Connection: keep-alive\r\n"); // additionally added due to answer below
out.write("\r\n");
out.flush();
..
while(len=otherInput.read(audioBuffer)){
out.write(audioBuffer, 0, len);)
}
For sure this is just a snipped of the real code, but it shows what I'm doing.
Now, in case the "otherinput.read()" takes a long time because there's no data available at the moment, I get a
java.net.SocketException: sendto failed: EPIPE (Broken pipe)
at libcore.io.IoBridge.maybeThrowAfterSendto(IoBridge.java:499)
at libcore.io.IoBridge.sendto(IoBridge.java:468)
at java.net.PlainSocketImpl.write(PlainSocketImpl.java:508)
at java.net.PlainSocketImpl.access$100(PlainSocketImpl.java:46)
at java.net.PlainSocketImpl$PlainSocketOutputStream.write(PlainSocketImpl.java:270)
at java.io.BufferedOutputStream.flushInternal(BufferedOutputStream.java:185)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:139)
Who can tell me how I can prevent the connection from breaking/closing without a manual heartbeat? Do I miss some header or am I using something the wrong way?
Thanks for your help in advance, tried and searched myself crazy meanwhile.
There are at least two problems here.
Clients of HTTP servers are not well-behaved in the way you seem to expect. Consider a browser. The user can shut it down, go back, navigate away etc, any time he likes, even in the middle of a page load. If you get any error transmitting to the client there's nothing you can do except close the connection and forget about it. Same applies to any server really, but it applies unsolder to HTTP servers.
You're not reading the entire request sent by the client. You need to read all the headers until a blank line, then you need to read the body up to the length specified in the Content-length: header, or all the chunks, or until end of stream, as the case may be: see RFC 2616. The effect of this may be that you cause the behaviour at (1).
I'm trying to write out to URLConnection#getOutputStream, however, no data is actually sent until I call URLConnection#getInputStream. Even if I set URLConnnection#doInput to false, it still will not send. Does anyone know why this is? There's nothing in the API documentation that describes this.
Java API Documentation on URLConnection: http://download.oracle.com/javase/6/docs/api/java/net/URLConnection.html
Java's Tutorial on Reading from and Writing to a URLConnection: http://download.oracle.com/javase/tutorial/networking/urls/readingWriting.html
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.net.URL;
import java.net.URLConnection;
public class UrlConnectionTest {
private static final String TEST_URL = "http://localhost:3000/test/hitme";
public static void main(String[] args) throws IOException {
URLConnection urlCon = null;
URL url = null;
OutputStreamWriter osw = null;
try {
url = new URL(TEST_URL);
urlCon = url.openConnection();
urlCon.setDoOutput(true);
urlCon.setRequestProperty("Content-Type", "text/plain");
////////////////////////////////////////
// SETTING THIS TO FALSE DOES NOTHING //
////////////////////////////////////////
// urlCon.setDoInput(false);
osw = new OutputStreamWriter(urlCon.getOutputStream());
osw.write("HELLO WORLD");
osw.flush();
/////////////////////////////////////////////////
// MUST CALL THIS OTHERWISE WILL NOT WRITE OUT //
/////////////////////////////////////////////////
urlCon.getInputStream();
/////////////////////////////////////////////////////////////////////////////////////////////////////////
// If getInputStream is called while doInput=false, the following exception is thrown: //
// java.net.ProtocolException: Cannot read from URLConnection if doInput=false (call setDoInput(true)) //
/////////////////////////////////////////////////////////////////////////////////////////////////////////
} catch (Exception e) {
e.printStackTrace();
} finally {
if (osw != null) {
osw.close();
}
}
}
}
The API for URLConnection and HttpURLConnection are (for better or worse) designed for the user to follow a very specific sequence of events:
Set Request Properties
(Optional) getOutputStream(), write to the stream, close the stream
getInputStream(), read from the stream, close the stream
If your request is a POST or PUT, you need the optional step #2.
To the best of my knowledge, the OutputStream is not like a socket, it is not directly connected to an InputStream on the server. Instead, after you close or flush the stream, AND call getInputStream(), your output is built into a Request and sent. The semantics are based on the assumption that you will want to read the response. Every example that I've seen shows this order of events. I would certainly agree with you and others that this API is counterintuitive when compared to the normal stream I/O API.
The tutorial you link to states that "URLConnection is an HTTP-centric class". I interpret that to mean that the methods are designed around a Request-Response model, and make the assumption that is how they will be used.
For what it's worth, I found this bug report that explains the intended operation of the class better than the javadoc documentation. The evaluation of the report states "The only way to send out the request is by calling getInputStream."
Although the getInputStream() method can certainly cause a URLConnection object to initiate an HTTP request, it is not a requirement to do so.
Consider the actual workflow:
Build a request
Submit
Process the response
Step 1 includes the possibility of including data in the request, by way of an HTTP entity. It just so happens that the URLConnection class provides an OutputStream object as the mechanism for providing this data (and rightfully so for many reasons that aren't particularly relevant here). Suffice to say that the streaming nature of this mechanism provides the programmer an amount of flexibility when supplying the data, including the ability to close the output stream (and any input streams feeding it), before finishing the request.
In other words, step 1 allows for supplying a data entity for the request, then continuing to build it (such as by adding headers).
Step 2 is really a virtual step, and can be automated (like it is in the URLConnection class), since submitting a request is meaningless without a response (at least within the confines of the HTTP protocol).
Which brings us to Step 3. When processing an HTTP response, the response entity -- retrieved by calling getInputSteam() -- is just one of the things we might be interested in. A response consists of a status, headers, and optionally an entity. The first time any one of these is requested, the URLConnection will perform virtual step 2 and submit the request.
No matter if an entity is being sent via the connection's output stream or not, and no matter whether a response entity is expected back, a program will ALWAYS want to know the result (as provided by the HTTP status code). Calling getResponseCode() on the URLConnection provides this status, and switching on the result may end the HTTP conversation without ever calling getInputStream().
So, if data is being submitted, and a response entity is not expected, don't do this:
// request is now built, so...
InputStream ignored = urlConnection.getInputStream();
... do this:
// request is now built, so...
int result = urlConnection.getResponseCode();
// act based on this result
As my experiments have shown (java 1.7.0_01) the code:
osw = new OutputStreamWriter(urlCon.getOutputStream());
osw.write("HELLO WORLD");
osw.flush();
Doesn't send anything to the server. It just saves what's written there to the memory buffer. Thus in case you're going to upload a large file via POST - you need to be sure that you have enough memory. On desktop/server it may not be such a big problem, but on android that may result in out of memory error. Here's the example of how the stack trace looks when trying to write to output stream, and memory runs out.
Exception in thread "Thread-488" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:78)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:135)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:220)
at java.io.Writer.write(Writer.java:157)
at maxela.tables.weboperations.POSTRequest.makePOST(POSTRequest.java:138)
On the bottom of the trace you can see the makePOST() method which does the following:
writer = new OutputStreamWriter(conn.getOutputStream());
for (int j = 0 ; j < 3000 * 100 ; j++)
{
writer.write("&var" + j + "=garbagegarbagegarbage_"+ j);
}
writer.flush();
And writer.write() throws the exception.
Also my experiments have shown that any exception related to the actual connection/IO with the server is thrown only after urlCon.getOutputStream() is called. Even urlCon.connect() seems to be "dummy" method which doesn't do any physical connection.
However if you call urlCon.getContentLengthLong() which returns Content-Length: header field from the server response-headers - then URLConnection.getOutputStream() will be called automatically and in case there's exception - it will be thrown.
The exceptions thrown by urlCon.getOutputStream() are all IOException, and I have met the follwing ones:
try
{
urlCon.getOutputStream();
}
catch (UnknownServiceException ex)
{
System.out.println("UnkownServiceException():" + ex.getMessage());
}
catch (ConnectException ex)
{
System.out.println("ConnectException()");
Logger.getLogger(POSTRequest.class.getName()).log(Level.SEVERE, null, ex);
}
catch (IOException ex) {
System.out.println("IOException():" + ex.getMessage());
Logger.getLogger(POSTRequest.class.getName()).log(Level.SEVERE, null, ex);
}
Hopefully my little research helps to people, as URLConnection class is a bit counter-intuitive in some cases thus, when implementing it - one needs to know what's it deals with.
Second reason is: when working with servers - the work with server may fail because of many reasons (connection, dns, firewall, httpresponses, server not being able to accept connection, server not being able to process request timely). Thus it is important to understand how exceptions raised can explain about what's actually happening with the connection.
Calling getInputStream() signals that the client is finished sending it's request, and is ready to receive the response (per HTTP spec). It seems that the URLConnection class has this notion built into it, and must be flush()ing the output stream when the input stream is asked for.
As the other responder noted, you should be able to call flush() yourself to trigger the write.
The fundamental reason is that it has to compute a Content-length header automatically (unless you are using chunked or streaming mode). It can't do that until it has seen all the output, and it has to send it before the output, so it has to buffer the output. And it needs a decisive event to know when the last output has actually been written. So it uses getInputStream() for that. At that time it writes the headers including the content-length, then the output, then it starts reading the input.
(Repost from your first question. Shameless self-plug)
Don't fiddle around with URLConnection yourself, let Resty handle it.
Here's the code you would need to write (I assume you are getting text back):
import static us.monoid.web.Resty.*;
import us.monoid.web.Resty;
...
new Resty().text(TEST_URL, content("HELLO WORLD")).toString();
How can I detect that the client side of a tomcat servlet request has disconnected? I've read that I should do a response.getOutputStream().print(), then a response.getOutputStream().flush() and catch an IOException, but is there a way I can detect this without writing any data?
EDIT:
The servlet sends out a data stream that doesn't necessarily end, but doesn't necessarily have any data flowing through it (it's a stream of real time events). I need to actually detect when the client disconnects because I have some cleanup I have to do at that point (resources to release, etcetera). If I have the HttpServletRequest available, will trying to read from that throw an IOException if the client disconnects?
is there a way I can detect this
without writing any data?
No because there isn't a way in TCP/IP to detect it without writing any data.
Don't worry about it. Just complete the request actions and write the response. If the client has disappeared, that will cause an IOException: connection reset, which will be thrown into the servlet container. Nothing you have to do about that.
I need to actually detect when the client disconnects because I have some cleanup I have to do at that point (resources to release, etcetera).
There the finally block is for. It will be executed regardless of the outcome. E.g.
OutputStream output = null;
try {
output = response.getOutputStream();
// ...
output.flush();
// ...
} finally {
// Do your cleanup here.
}
If I have the HttpServletRequest available, will trying to read from that throw an IOException if the client disconnects?
Depends on how you're reading from it and how much of request body is already in server memory. In case of normal form encoded requests, whenever you call getParameter() beforehand, it will usually be fully parsed and stored in server memory. Calling the getInputStream() won't be useful at all. Better do it on the response instead.
Have you tried to flush the buffer of the response:
response.flushBuffer();
Seems to throw an IOException when the client disconnected.
In the method service(), we use
PrintWriter out = res.getWriter();
Please tell me how it returns the PrintWriter class object, and then makes a connection to the Browser and sends the data to the Browser.
It doesn't make a connection to the browser - the browser has already made a connection to the server. It either buffers what you write in memory, and then transmits the data at the end of the request, or it makes sure all the headers have been written to the network connection and then returns a PrintWriter which writes data directly to that network connection.
In the buffering scenario there may be a fixed buffer size, and if you exceed that the data written so far will be "flushed" to the network connection. The big advantage of having a buffer at all is that if something goes wrong half-way through, you can change your response to an error page. If you've already started writing the response when something goes wrong, there's not a lot you can do to indicate the error cleanly.
(There's also the matter of transmitting the content length before any of the content, for keep-alive connections. If you run out of buffer before completing the response, I'm reliably informed that the response will use a chunked encoding.)
One fairly simple implementation:
PrintWriter getWriter() throws java.io.IOException {
return new PrintWriter(socket.getOutputStream());
}
Also note that several open source implementations of the Servlet API is available. This allows you to see how it can be done.
I believe the official implementation has been open sourced too, and is included with the Glassfish server.