Jersey: Close I/O resources after HTTP response

My Setup: I have created a REST service (Jersey/Dropwizard) that streams large content from a database. During a GET operation, the service taps into the database through a connection pool, wraps the data into a stream and performs some on-the-fly transformation to render the requested data in various encodings (CSV, JSON, ...). The life time of the database connection is tied to the life time of the stream and only when the stream is closed, the database connection is released.
The stream transformation is performed by an Encoder class that returns a StreamingOutput which is then passed to the Response object. The Encoder currently handles resource closing when the stream is fully consumed.
My Problem: Since StreamingOutput does not implement AutoCloseable, connection leaks may occur when the output is only partially consumed.
I sometimes observe that stale active connections are piling up in the connection pool, and I suspect that they arise from aborted HTTP connections. As you can see below, the current code handles exceptions that occur in the try block. What I cannot handle are Exceptions that occur after the return statement and I don't know how to attach any instructions for resource closing to the Response object.
My Question: How can I inform the Response object to close particular resources after the request has terminated (regularly or due to an error)? Or: Is there a better way to safely close any associated resources when the request context ends?
@GET
//@Produces(...)
public Response streamData(
        @PathParam("key") String key,
        // ... other params
) {
    // decode and validate params
    Stream<Pojo> ps = null;
    try {
        // connect to db and obtain data stream for <key>
        ps = loadData(db, key);
        // apply detailed encoding instructions and create a StreamingOutput
        final StreamingOutput stream = Encoder.encodeData(ps, encodingArgs);
        return Response.ok(stream).build();
    } catch (Exception e) {
        closeOnException(ps); // wrapper for ps.close();
        throw e;
    }
}

I received a good answer from the Dropwizard mailing list that solves my problem, and I want to reference it here in case somebody encounters the same problem.
https://groups.google.com/forum/#!topic/dropwizard-user/62GoLDBrQuo
Citing from Shawn's response:
Jersey supports CloseableService that lets you register Closeable objects to be closed when the request is complete:
public Response streamData(..., @Context CloseableService closer) {
    ...
    closer.add(closeable);
    return Response.ok(...).build();
}
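The following is a minimal, JDK-only sketch of the pattern described above, not Jersey code. `RequestCloser` is a hypothetical stand-in for Jersey's `CloseableService`, and the `Stream.of(...)` call stands in for `loadData(db, key)`. It illustrates one detail that is easy to miss: a `java.util.stream.Stream` is `AutoCloseable` but not `Closeable`, so it has to be adapted (here with a method reference) before it can be registered.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

public class RequestCloserSketch {

    /** Hypothetical stand-in for Jersey's CloseableService: collects resources and closes them at request end. */
    static final class RequestCloser implements Closeable {
        private final List<Closeable> resources = new ArrayList<>();

        boolean add(Closeable c) {
            return resources.add(c);
        }

        @Override
        public void close() {
            // Close every registered resource, even if one of them fails to close.
            for (Closeable c : resources) {
                try {
                    c.close();
                } catch (IOException e) {
                    // A real container would log this; the sketch simply continues.
                }
            }
            resources.clear();
        }
    }

    /** Simulates a request that is aborted after the handler returned; reports whether the "connection" was released. */
    static boolean connectionReleasedAfterAbort() {
        AtomicBoolean released = new AtomicBoolean(false);
        // Stand-in for loadData(db, key): closing the stream releases the database connection.
        Stream<String> ps = Stream.of("a", "b", "c").onClose(() -> released.set(true));

        RequestCloser closer = new RequestCloser();
        closer.add(ps::close); // adapt the AutoCloseable stream to Closeable

        // ... the resource method returns here and the client disconnects mid-transfer ...

        closer.close(); // the container would run this when the request ends
        return released.get();
    }

    public static void main(String[] args) {
        System.out.println(connectionReleasedAfterAbort());
    }
}
```

In an actual Jersey resource, the equivalent registration would presumably be `closer.add(ps::close)` on the injected `CloseableService`, placed right after `loadData` succeeds.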

You can add to your method HttpServletResponse:
@Produces(MediaType.APPLICATION_OCTET_STREAM)
public Object streamData(
        @PathParam("key") String key,
        @Context HttpServletResponse response,
        // ... other params
) {
    ...
    response.getOutputStream().write(....);
    response.flushBuffer();
    response.getOutputStream().close();
    return null;
}

Related

How to close Response object when we get http body directly in JAX-RS?

When we write REST client with jersey we should close Response like this:
Client c = ClientBuilder.newClient();
Response r = null;
try {
    r = c.target("http://localhost:8080/testrest/customers/854878").request().get();
    Customer cus = r.readEntity(Customer.class);
    /* process result */
} catch (Exception e) {
    /* log here */
} finally {
    // close on every path, not only when an exception occurred
    if (r != null) {
        r.close();
    }
}
How should we access the Response object when we read the HTTP body directly?
Client c = ClientBuilder.newClient();
Customer cus = c.target("http://localhost:8080/testrest/customers/854878").request().get(Customer.class);
/* close Response object and process result */

Assuming you are using Glassfish's jersey-client implementation version 2.3.1 (or check the other versions too), you can follow the calls that get(Class) makes. A little down the line you will find a call to
org.glassfish.jersey.message.internal.InboundMessageContext#readEntity(Class<T>, Type, Annotation[], PropertiesDelegate)
which, based on some rules, closes the response
if (!buffered && !(t instanceof Closeable) && !(t instanceof Source)) {
entityContent.close(); // wrapper to the actual response stream
}
where t is the object created based on the specified Class object.
The API itself doesn't seem to say anything about this so an implementation doesn't have to close the underlying response stream. The only thing I could find is from Client javadoc which states
Client instances must be properly closed before being disposed to
avoid leaking resources.
So do not depend on a specific implementation, make sure to close everything yourself, even if that means you have to break your Fluent method invocations and store intermediate object references in variables.

Client has a close method. Look into its sources. If Client.close doesn't clean up its resources then you must obtain a reference to Response and close it. Otherwise you'll have hanging connections.
If code allows you to do something, it doesn't mean you should. But from your questions I gather that you understand it.

How to execute code when connection finishes in Apache HttpRequestHandler

I'm using Apache's HttpRequestHandler to serve data to HTTP clients. I'm generating content (probably a costly process) for clients, on demand.
I want to take care of two cases:
normal consumption, it ends, I want to close resources
client closes connection prematurely, I don't want to keep processing things
I'm using an InputStream (and an InputStreamEntity) that does the process, but I'd like to know if the client closes the resource prematurely (or not) and take actions at the end in both cases.
I've realized that InputStreamEntity.writeTo (which is the method used to send the content to the client) doesn't close the input stream I've declared.
What am I missing?

I've solved this by subclassing InputStreamEntity and making writeTo call close after original behaviour:
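public class ClosingInputStreamEntity extends InputStreamEntity {
    // InputStreamEntity has no no-arg constructor, so one must be declared
    public ClosingInputStreamEntity(InputStream in, long length) {
        super(in, length);
    }

    @Override
    public void writeTo(OutputStream os) throws IOException {
        try {
            super.writeTo(os);
        } finally {
            // close resources
        }
    }
}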

Why do you have to call URLConnection#getInputStream to be able to write out to URLConnection#getOutputStream?

I'm trying to write out to URLConnection#getOutputStream, however, no data is actually sent until I call URLConnection#getInputStream. Even if I set URLConnection#doInput to false, it still will not send. Does anyone know why this is? There's nothing in the API documentation that describes this.
Java API Documentation on URLConnection: http://download.oracle.com/javase/6/docs/api/java/net/URLConnection.html
Java's Tutorial on Reading from and Writing to a URLConnection: http://download.oracle.com/javase/tutorial/networking/urls/readingWriting.html
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.net.URL;
import java.net.URLConnection;
public class UrlConnectionTest {
private static final String TEST_URL = "http://localhost:3000/test/hitme";
public static void main(String[] args) throws IOException {
URLConnection urlCon = null;
URL url = null;
OutputStreamWriter osw = null;
try {
url = new URL(TEST_URL);
urlCon = url.openConnection();
urlCon.setDoOutput(true);
urlCon.setRequestProperty("Content-Type", "text/plain");
////////////////////////////////////////
// SETTING THIS TO FALSE DOES NOTHING //
////////////////////////////////////////
// urlCon.setDoInput(false);
osw = new OutputStreamWriter(urlCon.getOutputStream());
osw.write("HELLO WORLD");
osw.flush();
/////////////////////////////////////////////////
// MUST CALL THIS OTHERWISE WILL NOT WRITE OUT //
/////////////////////////////////////////////////
urlCon.getInputStream();
/////////////////////////////////////////////////////////////////////////////////////////////////////////
// If getInputStream is called while doInput=false, the following exception is thrown: //
// java.net.ProtocolException: Cannot read from URLConnection if doInput=false (call setDoInput(true)) //
/////////////////////////////////////////////////////////////////////////////////////////////////////////
} catch (Exception e) {
e.printStackTrace();
} finally {
if (osw != null) {
osw.close();
}
}
}
}

The APIs for URLConnection and HttpURLConnection are (for better or worse) designed for the user to follow a very specific sequence of events:
1. Set Request Properties
2. (Optional) getOutputStream(), write to the stream, close the stream
3. getInputStream(), read from the stream, close the stream
If your request is a POST or PUT, you need the optional step #2.
To the best of my knowledge, the OutputStream is not like a socket, it is not directly connected to an InputStream on the server. Instead, after you close or flush the stream, AND call getInputStream(), your output is built into a Request and sent. The semantics are based on the assumption that you will want to read the response. Every example that I've seen shows this order of events. I would certainly agree with you and others that this API is counterintuitive when compared to the normal stream I/O API.
The tutorial you link to states that "URLConnection is an HTTP-centric class". I interpret that to mean that the methods are designed around a Request-Response model, and make the assumption that is how they will be used.
For what it's worth, I found this bug report that explains the intended operation of the class better than the javadoc documentation. The evaluation of the report states "The only way to send out the request is by calling getInputStream."

Although the getInputStream() method can certainly cause a URLConnection object to initiate an HTTP request, it is not a requirement to do so.
Consider the actual workflow:
1. Build a request
2. Submit
3. Process the response
Step 1 includes the possibility of including data in the request, by way of an HTTP entity. It just so happens that the URLConnection class provides an OutputStream object as the mechanism for providing this data (and rightfully so for many reasons that aren't particularly relevant here). Suffice to say that the streaming nature of this mechanism provides the programmer an amount of flexibility when supplying the data, including the ability to close the output stream (and any input streams feeding it), before finishing the request.
In other words, step 1 allows for supplying a data entity for the request, then continuing to build it (such as by adding headers).
Step 2 is really a virtual step, and can be automated (like it is in the URLConnection class), since submitting a request is meaningless without a response (at least within the confines of the HTTP protocol).
Which brings us to Step 3. When processing an HTTP response, the response entity -- retrieved by calling getInputStream() -- is just one of the things we might be interested in. A response consists of a status, headers, and optionally an entity. The first time any one of these is requested, the URLConnection will perform virtual step 2 and submit the request.
No matter if an entity is being sent via the connection's output stream or not, and no matter whether a response entity is expected back, a program will ALWAYS want to know the result (as provided by the HTTP status code). Calling getResponseCode() on the URLConnection provides this status, and switching on the result may end the HTTP conversation without ever calling getInputStream().
So, if data is being submitted, and a response entity is not expected, don't do this:
// request is now built, so...
InputStream ignored = urlConnection.getInputStream();
... do this:
// request is now built, so...
int result = urlConnection.getResponseCode();
// act based on this result
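As a rough, self-contained illustration of this advice, the sketch below posts a body and then asks only for the status code, never touching getInputStream(). The throwaway local server (the JDK's built-in com.sun.net.httpserver.HttpServer), the class name and the 204 reply are assumptions made purely so the example can run on its own.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class PostWithoutReadingBody {

    /** POSTs the body and returns the HTTP status; no response entity is requested. */
    static int post(URL url, String body) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/plain");
        try (OutputStream out = con.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try {
            // Asking for the status is enough to submit the request.
            return con.getResponseCode();
        } finally {
            con.disconnect();
        }
    }

    /** Starts a local test server (an assumption of this sketch), posts to it, returns the status seen by the client. */
    static int demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/test/hitme", exchange -> {
                try (InputStream in = exchange.getRequestBody()) {
                    while (in.read() != -1) {
                        // discard the request body
                    }
                }
                exchange.sendResponseHeaders(204, -1); // 204 No Content, no response body
                exchange.close();
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                return post(new URL("http://127.0.0.1:" + port + "/test/hitme"), "HELLO WORLD");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("status = " + demo());
    }
}
```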

As my experiments have shown (Java 1.7.0_01), the code:
osw = new OutputStreamWriter(urlCon.getOutputStream());
osw.write("HELLO WORLD");
osw.flush();
does not send anything to the server. It only saves what is written into a memory buffer. So if you are going to upload a large file via POST, you need to be sure that you have enough memory. On a desktop or server this may not be a big problem, but on Android it may result in an out-of-memory error. Here is an example of what the stack trace looks like when writing to the output stream runs out of memory:
Exception in thread "Thread-488" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:78)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:135)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:220)
at java.io.Writer.write(Writer.java:157)
at maxela.tables.weboperations.POSTRequest.makePOST(POSTRequest.java:138)
On the bottom of the trace you can see the makePOST() method which does the following:
writer = new OutputStreamWriter(conn.getOutputStream());
for (int j = 0 ; j < 3000 * 100 ; j++)
{
writer.write("&var" + j + "=garbagegarbagegarbage_"+ j);
}
writer.flush();
And writer.write() throws the exception.
My experiments have also shown that any exception related to the actual connection/IO with the server is thrown only after urlCon.getOutputStream() is called. Even urlCon.connect() seems to be a "dummy" method that does not establish any physical connection.
However, if you call urlCon.getContentLengthLong(), which returns the Content-Length header field from the server's response headers, then URLConnection.getOutputStream() will be called automatically, and if there is an exception, it will be thrown.
The exceptions thrown by urlCon.getOutputStream() are all IOExceptions, and I have met the following ones:
try
{
urlCon.getOutputStream();
}
catch (UnknownServiceException ex)
{
System.out.println("UnknownServiceException():" + ex.getMessage());
}
catch (ConnectException ex)
{
System.out.println("ConnectException()");
Logger.getLogger(POSTRequest.class.getName()).log(Level.SEVERE, null, ex);
}
catch (IOException ex) {
System.out.println("IOException():" + ex.getMessage());
Logger.getLogger(POSTRequest.class.getName()).log(Level.SEVERE, null, ex);
}
Hopefully my little research helps people, as the URLConnection class is a bit counter-intuitive in some cases, so when using it one needs to know what it actually does.
The second reason is that work with a server may fail for many reasons (connection, DNS, firewall, HTTP responses, the server not being able to accept the connection, the server not being able to process the request in time). It is therefore important to understand what the raised exceptions say about what is actually happening with the connection.

Calling getInputStream() signals that the client is finished sending its request, and is ready to receive the response (per HTTP spec). It seems that the URLConnection class has this notion built into it, and must be flush()ing the output stream when the input stream is asked for.
As the other responder noted, you should be able to call flush() yourself to trigger the write.

The fundamental reason is that it has to compute a Content-length header automatically (unless you are using chunked or streaming mode). It can't do that until it has seen all the output, and it has to send it before the output, so it has to buffer the output. And it needs a decisive event to know when the last output has actually been written. So it uses getInputStream() for that. At that time it writes the headers including the content-length, then the output, then it starts reading the input.
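The "chunked or streaming mode" exception mentioned above can be sketched as follows. HttpURLConnection offers setFixedLengthStreamingMode (length known up front) and setChunkedStreamingMode (length unknown), and with either one the body is not buffered in memory to compute Content-Length. The local test server, the class name, the 200 000-byte body and the 8192-byte chunk size are assumptions chosen only to make the example self-contained.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class StreamingModeSketch {

    /** Uploads the body without internal buffering and returns the HTTP status. */
    static int post(URL url, byte[] body, boolean lengthKnown) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        if (lengthKnown) {
            con.setFixedLengthStreamingMode(body.length); // Content-Length is declared up front
        } else {
            con.setChunkedStreamingMode(8192);            // Transfer-Encoding: chunked
        }
        try (OutputStream out = con.getOutputStream()) {
            out.write(body);
        }
        try {
            return con.getResponseCode();
        } finally {
            con.disconnect();
        }
    }

    /** Sends the same body in both modes to a local test server that checks the received length. */
    static boolean demo() {
        byte[] body = new byte[200_000];
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/upload", exchange -> {
                long received = 0;
                byte[] buf = new byte[8192];
                try (InputStream in = exchange.getRequestBody()) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        received += n;
                    }
                }
                // 200 if every byte arrived, 400 otherwise; no response body either way
                exchange.sendResponseHeaders(received == body.length ? 200 : 400, -1);
                exchange.close();
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/upload");
                return post(url, body, true) == 200 && post(url, body, false) == 200;
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("both modes accepted: " + demo());
    }
}
```

One trade-off to keep in mind: in a streaming mode the connection cannot transparently retry or follow redirects that require resending the body, because the body is no longer held in memory.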

(Repost from your first question. Shameless self-plug)
Don't fiddle around with URLConnection yourself, let Resty handle it.
Here's the code you would need to write (I assume you are getting text back):
import static us.monoid.web.Resty.*;
import us.monoid.web.Resty;
...
new Resty().text(TEST_URL, content("HELLO WORLD")).toString();

Safe use of HttpURLConnection

When using HttpURLConnection does the InputStream need to be closed if we do not 'get' and use it?
i.e. is this safe?
HttpURLConnection conn = (HttpURLConnection) uri.getURI().toURL().openConnection();
conn.connect();
// check for content type I don't care about
if ("image/gif".equals(conn.getContentType())) return;
// get stream and read from it
InputStream is = conn.getInputStream();
try {
// read from is
} finally {
is.close();
}
Secondly, is it safe to close an InputStream before all of its content has been fully read?
Is there a risk of leaving the underlying socket in ESTABLISHED or even CLOSE_WAIT state?

According to http://docs.oracle.com/javase/6/docs/technotes/guides/net/http-keepalive.html
and OpenJDK source code.
(When keepAlive == true)
If the client called HttpURLConnection.getInputStream().close(), a later call to HttpURLConnection.disconnect() will NOT close the Socket, i.e. the Socket is reused (cached).
If the client does not call close(), calling disconnect() will close the InputStream and close the Socket.
So in order to reuse the Socket, just call InputStream.close(). Do not call HttpURLConnection.disconnect().

is it safe to close an InputStream before all of its content has been read
You need to read all of the data in the input stream before you close it so that the underlying TCP connection gets cached. I have read that it should not be required in latest Java, but it was always mandated to read the whole response for connection re-use.
Check this post: keep-alive in java6
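A small helper along these lines reads and discards whatever is left before closing. This is only a sketch: the class and method names are made up, and a ByteArrayInputStream stands in for the stream returned by conn.getInputStream(). Whether draining is still needed depends on the Java version, as noted above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class DrainAndClose {

    /** Reads and discards the remaining bytes, then closes the stream; returns how many bytes were discarded. */
    static long drainAndClose(InputStream in) throws IOException {
        if (in == null) {
            return 0;
        }
        long discarded = 0;
        byte[] buf = new byte[4096];
        try (InputStream stream = in) {
            int n;
            while ((n = stream.read(buf)) != -1) {
                discarded += n;
            }
        }
        return discarded;
    }

    /** Consumes 3 bytes of a 10-byte stand-in "response", then drains the rest. */
    static long demo() {
        try {
            InputStream body = new ByteArrayInputStream(new byte[10]);
            body.read(new byte[3]);     // the caller looked at only part of the response
            return drainAndClose(body); // the 7 remaining bytes are discarded
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("discarded " + demo() + " bytes");
    }
}
```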

Here is some information regarding the keep-alive cache. All of this information pertains to Java 6, but is probably also accurate for many prior and later versions.
From what I can tell, the code boils down to:
If the remote server sends a "Keep-Alive" header with a "timeout" value that can be parsed as a positive integer, that number of seconds is used for the timeout.
If the remote server sends a "Keep-Alive" header but it doesn't have a "timeout" value that can be parsed as a positive integer and "usingProxy" is true, then the timeout is 60 seconds.
In all other cases, the timeout is 5 seconds.
This logic is split between two places: around line 725 of sun.net.www.http.HttpClient (in the "parseHTTPHeader" method), and around line 120 of sun.net.www.http.KeepAliveCache (in the "put" method).
So, there are two ways to control the timeout period:
Control the remote server and configure it to send a Keep-Alive header with the proper timeout field
Modify the JDK source code and build your own.
One would think that it would be possible to change the apparently arbitrary five-second default without recompiling internal JDK classes, but it isn't. A bug was filed in 2005 requesting this ability, but Sun refused to provide it.
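The three timeout rules listed above can be restated as a small function. This is a paraphrase of the described behaviour for illustration, not the JDK's actual code. The class and method names are hypothetical, and the header parsing is deliberately simplistic.

```java
public class KeepAliveTimeoutSketch {

    /** Returns the keep-alive timeout in seconds according to the rules described above. */
    static int timeoutSeconds(String keepAliveHeader, boolean usingProxy) {
        Integer parsed = parseTimeout(keepAliveHeader);
        if (parsed != null && parsed > 0) {
            return parsed;  // rule 1: the server supplied a usable timeout
        }
        if (keepAliveHeader != null && usingProxy) {
            return 60;      // rule 2: header present, no usable timeout, going through a proxy
        }
        return 5;           // rule 3: everything else
    }

    /** Extracts the "timeout" parameter from a header value such as "timeout=15, max=100"; null if absent or unparsable. */
    static Integer parseTimeout(String header) {
        if (header == null) {
            return null;
        }
        for (String part : header.split(",")) {
            String[] kv = part.trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("timeout")) {
                try {
                    return Integer.parseInt(kv[1].trim());
                } catch (NumberFormatException e) {
                    return null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(timeoutSeconds("timeout=15, max=100", false));
        System.out.println(timeoutSeconds("max=100", true));
        System.out.println(timeoutSeconds(null, false));
    }
}
```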

If you really want to make sure that the connection is closed, you should call conn.disconnect().
The open connections you observed are due to the HTTP 1.1 keep-alive feature (also known as HTTP persistent connections).
If the server supports HTTP 1.1 and does not send Connection: close in the response header, Java does not immediately close the underlying TCP connection when you close the input stream. Instead it keeps it open and tries to reuse it for the next HTTP request to the same server.
If you don't want this behaviour at all you can set the system property http.keepAlive to false:
System.setProperty("http.keepAlive","false");

When using HttpURLConnection does the InputStream need to be closed if we do not 'get' and use it?
Yes, it always needs to be closed.
i.e. is this safe?
Not 100%, you run the risk of getting a NPE. Safer is:
InputStream is = null;
try {
is = conn.getInputStream();
// read from is
} finally {
if (is != null) {
is.close();
}
}

You also have to close error stream if the HTTP request fails (anything but 200):
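try {
    ...
}
catch (IOException e) {
    // getErrorStream() may return null, e.g. if the connection was never established
    InputStream es = connection.getErrorStream();
    if (es != null) {
        es.close();
    }
}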
If you don't do it, all requests that don't return 200 (e.g. timeout) will leak one socket.

Since Java 7 the recommended way is
try (InputStream is = conn.getInputStream()) {
// read from is
// ...
}
as for all other classes implementing Closeable. close() is called at the end of the try {...} block.
Closing the input stream also means you are done with reading. Otherwise the connection hangs around until the finalizer closes the stream.
The same applies to the output stream, if you are sending data.
There is no need to get and close the ErrorStream. Even though it implements the InputStream interface, it uses the InputStream in combination with a buffer. Closing the InputStream is sufficient.

Detecting client disconnect in tomcat servlet?

How can I detect that the client side of a tomcat servlet request has disconnected? I've read that I should do a response.getOutputStream().print(), then a response.getOutputStream().flush() and catch an IOException, but is there a way I can detect this without writing any data?
EDIT:
The servlet sends out a data stream that doesn't necessarily end, but doesn't necessarily have any data flowing through it (it's a stream of real time events). I need to actually detect when the client disconnects because I have some cleanup I have to do at that point (resources to release, etcetera). If I have the HttpServletRequest available, will trying to read from that throw an IOException if the client disconnects?

is there a way I can detect this without writing any data?
No, because there isn't a way in TCP/IP to detect it without writing any data.
Don't worry about it. Just complete the request actions and write the response. If the client has disappeared, that will cause an IOException: connection reset, which will be thrown into the servlet container. Nothing you have to do about that.

I need to actually detect when the client disconnects because I have some cleanup I have to do at that point (resources to release, etcetera).
That is what the finally block is for. It will be executed regardless of the outcome. E.g.
OutputStream output = null;
try {
output = response.getOutputStream();
// ...
output.flush();
// ...
} finally {
// Do your cleanup here.
}
If I have the HttpServletRequest available, will trying to read from that throw an IOException if the client disconnects?
Depends on how you're reading from it and how much of request body is already in server memory. In case of normal form encoded requests, whenever you call getParameter() beforehand, it will usually be fully parsed and stored in server memory. Calling the getInputStream() won't be useful at all. Better do it on the response instead.
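The finally-based cleanup can be tried out without a servlet container. In the sketch below, `FlakyClientStream` is a made-up stand-in for `response.getOutputStream()` that starts throwing IOException on flush() after a few successful flushes, which is roughly how a disconnected client shows up. The class names and the event format are assumptions for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class CleanupOnDisconnect {

    /** Simulated client connection that "disconnects" after a fixed number of successful flushes. */
    static final class FlakyClientStream extends OutputStream {
        private int flushesLeft;

        FlakyClientStream(int flushesLeft) {
            this.flushesLeft = flushesLeft;
        }

        @Override
        public void write(int b) {
            // bytes are simply discarded in this simulation
        }

        @Override
        public void flush() throws IOException {
            if (flushesLeft-- <= 0) {
                throw new IOException("connection reset (simulated)");
            }
        }
    }

    /** Writes events until finished or the client goes away; cleanup runs on every path. Returns events delivered. */
    static int streamEvents(OutputStream out, int events, Runnable cleanup) {
        int delivered = 0;
        try {
            for (int i = 0; i < events; i++) {
                out.write(("event " + i + "\n").getBytes(StandardCharsets.UTF_8));
                out.flush(); // a vanished client surfaces here as an IOException
                delivered++;
            }
        } catch (IOException e) {
            // client disconnected; there is nobody left to send to
        } finally {
            cleanup.run(); // release resources whether the stream ended normally or not
        }
        return delivered;
    }

    public static void main(String[] args) {
        int delivered = streamEvents(new FlakyClientStream(3), 10,
                () -> System.out.println("cleanup done"));
        System.out.println("delivered " + delivered + " events");
    }
}
```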

Have you tried to flush the buffer of the response:
response.flushBuffer();
Seems to throw an IOException when the client disconnected.
