AWS SDK S3 Socket Closed exception - java

My application uses close to 10 threads, each of which makes perhaps 7,000 Put Requests to S3 per minute. (I'm running it on a powerful EC2 box which can handle the load quite well.) It runs beautifully for close to an hour, but after about an hour it starts getting Unable to execute HTTP request: Socket Closed exceptions:
http.AmazonHttpClient: Unable to execute HTTP request: Socket Closed
java.net.SocketException: Socket Closed
at java.net.AbstractPlainSocketImpl.setOption(AbstractPlainSocketImpl.java:206)
at java.net.Socket.setSoTimeout(Socket.java:1105)
at sun.security.ssl.SSLSocketImpl.setSoTimeout(SSLSocketImpl.java:2414)
at org.apache.http.impl.io.SocketInputBuffer.isDataAvailable(SocketInputBuffer.java:106)
at org.apache.http.impl.AbstractHttpClientConnection.isResponseAvailable(AbstractHttpClientConnection.java:246)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.isResponseAvailable(ManagedClientConnectionImpl.java:180)
at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:238)
at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doSendRequest(SdkHttpRequestExecutor.java:47)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:713)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:518)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:446)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:256)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3641)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1438)
at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:128)
at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:120)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.upload(UploadMonitor.java:176)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:134)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
The Put Requests are done asynchronously, using the AWS SDK TransferManager. I imagine that, in the time it takes for one put request to fully complete, about 10 have been made asynchronously.
Googling that exception, I found two possible causes:
The limit on MaxConnections. I've raised it from the default 50 to 3000, to no avail.
Premature garbage collection. I've tried keeping a reference to the Upload objects returned by TransferManager (in a concurrent queue), and, again, no help.
How can I fix this? Again, the app runs well for close to an hour but, consistently, hits this wall after about an hour. (I'm running on the Amazon Linux AMI on EC2.)
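Roughly, the relevant setup looks like this (the bucket name, key handling, and class structure here are simplified placeholders, not the real code):

import java.io.File;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import com.amazonaws.ClientConfiguration;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.Upload;

public class Uploader {
    // References to in-flight uploads, kept so they cannot be garbage collected prematurely.
    private final Queue<Upload> inFlight = new ConcurrentLinkedQueue<Upload>();
    private final TransferManager transferManager;

    public Uploader() {
        ClientConfiguration config = new ClientConfiguration();
        config.setMaxConnections(3000); // raised from the default 50, as mentioned above
        transferManager = new TransferManager(new AmazonS3Client(config));
    }

    public void submit(String key, File file) {
        // upload() returns immediately; the actual PUT runs asynchronously in the SDK's thread pool
        Upload upload = transferManager.upload("my-bucket", key, file);
        inFlight.add(upload);
    }
}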
Update
No code other than the AWS SDK touches the sockets, or even knows about them. All the HTTP work is done exclusively through the AWS SDK.
So, if something's closing them, it must be something in the AWS SDK.
The code is running on an EC2 server; there's no reason to expect any kind of network connectivity issues between EC2 and S3, and certainly no reason they should happen predictably (about an hour into the run) each time.

I'm not sure if this is the answer, but http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html states that "if you expect a rapid increase in the request rate for a bucket to more than 300 PUT/LIST/DELETE requests per second or more than 800 GET requests per second, we recommend that you open a support case to prepare for the workload and avoid any temporary limits on your request rate". Perhaps, since I exceeded that limit, AWS starts aborting connections; the SDK, detecting the idle sockets, closes them, and, voilà, we get exceptions.
UPDATE: Not sure if this is correct. Amazon seems to state that, in this case, you'll get an explicit "Slow Down" error message, not an unexpected close. So, the puzzle remains.

The exception is a SocketException thrown from the setSoTimeout() method in java.net.Socket (see the stack trace). The method can be viewed here: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/net/Socket.java#Socket.setSoTimeout%28int%29
A possible reason is that the requests to S3 are still pending/incomplete, causing the thread to wait. Once the wait time exceeds the socket timeout, the socket is closed and the exception is thrown.

I think you should try ClientConfiguration.setSocketTimeout(int). If the socket is being closed asynchronously, I think it's because of the timeout. According to the Amazon documentation:
public void setSocketTimeout(int socketTimeout)
Sets the amount of time to wait (in milliseconds) for data to be transferred
over an established, open connection before the connection times out and is closed.
A value of 0 means infinity, and isn't recommended.
So, according to the document, if the connection times out, I think it is automatically closed.
link: http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/ClientConfiguration.html#setSocketTimeout(int)
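For example, something like this (a minimal sketch; the 70-second value is arbitrary):

import com.amazonaws.ClientConfiguration;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.transfer.TransferManager;

public class S3Clients {
    public static TransferManager newTransferManager() {
        ClientConfiguration config = new ClientConfiguration();
        // 70 seconds instead of the default; 0 means infinity and isn't recommended
        config.setSocketTimeout(70 * 1000);
        return new TransferManager(new AmazonS3Client(config));
    }
}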

There is only one cause of this exception. You or your framework closed the socket and then continued to use it.

Related

URLConnection, why two different timeouts? (connect and read) [duplicate]

This question already has answers here:
What is the difference between connection and read timeout for sockets?
Just curiosity: is there a good reason why the URLConnection class needs two different timeouts?
The connectTimeout is the maximum time in milliseconds to wait while connecting. Connecting to a server will fail with a SocketTimeoutException if the timeout elapses before a connection is established.
The readTimeout is the maximum time to wait for an input stream read to complete before giving up. Reading will fail with a SocketTimeoutException if the timeout elapses before data becomes available.
Can you give me a good reason why these two values should be different? Why would a call need more time to establish the connection than to receive some data (or vice versa)?
I am asking this because I have to configure these values and my idea is to set the same value for both.
Let's say the server is busy, is configured to accept 'N' connections, and all of those connections are long-running, and all of a sudden you send in a request. What should happen? Should you wait indefinitely or should you time out? That's the connect timeout.
Now let's say your server goes brain-dead, just accepting connections and doing nothing with them (or, say, the server synchronously goes to the database for some time-consuming activity and ends up in a deadlock), while the client keeps waiting for the response. In this case, what should the client do? Should it wait indefinitely for the response, or should it time out? That's the read timeout.
The connection timeout is how long you're prepared to wait to get some sort of response from the server. It's not particularly related to what it is that you're trying to achieve.
But suppose you had a service that would allow you to give it a large number, and have it return its prime factors. The server might take quite a while to generate the answer and send it to you.
You might well have clear expectations that the server would quickly respond to the connection: maybe even a delay of 5 seconds here tells you that the server is likely to be down. But the read timeout might need to be much higher: it might be a few minutes before you get to be able to read the server's answer to your query.
The connect timeout is the timeout within which you want a (in normal situations TCP) connection to be established. The default timeouts specified in the internet RFCs and implemented by the various OSes are normally in the minutes range. But we know that if a server is available and reachable, it will respond in a matter of milliseconds, and otherwise not at all. A normal value would be a couple of seconds at most.
The read timeout is the time within which the server is expected to respond after it has received the request. Read timeouts therefore depend on how long you expect the server to take to deliver the result. They depend on the type of request you are making and should be larger if the processing takes some time or the server may be very busy in some situations. Especially if you retry after a read timeout, it is best not to set read timeouts too low; normally a factor of 3-4 times the expected time.
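To make the distinction concrete, here is a small illustrative snippet (the URL and the timeout values are arbitrary):

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class TimeoutDemo {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://example.com/slow-report"); // placeholder URL
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();

        // Fail fast if the server cannot even be reached: a healthy, reachable
        // server completes the connection within a few seconds.
        conn.setConnectTimeout(5000);

        // Allow much longer for the answer, since producing it may take a while.
        conn.setReadTimeout(120000);

        try (InputStream in = conn.getInputStream()) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // consume the response; each blocking read() may wait up to 120 seconds
            }
        }
    }
}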

Netty Channel configuration for connection timeout & number of open connections

I am trying to implement an HTTP server using Netty, and I wanted to know a few things that I could not understand from the Netty API. I have read many other Netty-related Stack Overflow questions, but I still couldn't understand these points.
1. If I want the connection from the client to stay open for a certain period of time, what should I use: CONNECT_TIMEOUT_MILLIS, or a read timeout handler with a timeout set on it? Basically, I want to understand the difference between the two. And what is the default value of CONNECT_TIMEOUT_MILLIS?
2. What is the default value of SO_BACKLOG? I read in one answer that it is equal to SOMAXCONN in io.netty.util.NetUtil, but what is that value? Also, I want to be sure: does SO_BACKLOG limit the number of worker threads? I mean, if I set it to, say, 1000, does that mean Netty won't allow more than 1000 open connections at a time?
3. Can somebody explain how Netty responds to an HTTP request internally, in terms of writing to and reading from a channel?
Thanks in advance!!!
CONNECT_TIMEOUT_MILLIS is the timeout for the connection attempt. Once the connection is established, it has no effect. What you are interested in is ReadTimeoutHandler.
The default SO_BACKLOG is NetUtil.SOMAXCONN. It does not limit the number of worker threads. For more information about SO_BACKLOG, please refer to this question. To limit the number of worker threads, you must specify the count when you construct an NioEventLoopGroup. SO_BACKLOG is also unrelated to the maximum number of concurrent connections.
Re: how HTTP works in Netty - the question is too broad to give a simple answer. Please use your debugger to step into the Netty internals to find out how it works.
One way of limiting the number of concurrent connections is to limit the number of open files a process can have. This property can be set on Linux using the ulimit command or the limits.conf file.
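Putting the pieces together, a minimal Netty 4 sketch (the 30-second read timeout, the backlog of 1024, and the 16 worker threads are arbitrary illustration values):

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.http.HttpServerCodec;
import io.netty.handler.timeout.ReadTimeoutHandler;

public class HttpServer {
    public static void main(String[] args) throws Exception {
        NioEventLoopGroup bossGroup = new NioEventLoopGroup(1);
        NioEventLoopGroup workerGroup = new NioEventLoopGroup(16); // worker thread count is set here, not via SO_BACKLOG
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
             .channel(NioServerSocketChannel.class)
             .option(ChannelOption.SO_BACKLOG, 1024) // pending-accept queue, not a limit on open connections
             .childHandler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     // Close connections that stay silent for 30 seconds after they are established.
                     ch.pipeline().addLast(new ReadTimeoutHandler(30));
                     ch.pipeline().addLast(new HttpServerCodec());
                     // application handler(s) would go here
                 }
             });
            b.bind(8080).sync().channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}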

Java, URLConnection class, behaviour of timeouts

I'm working on a component of an Android/Java app, currently responsible for sending GET requests to a remote server. My code is based on this sample:
HTTP Client Template.
I've used the setConnectTimeout() and setReadTimeout() methods from the URLConnection class, but I lack a full understanding of their impact. Say I specify a value of 10 seconds for both:
Does it mean it should give up after 10 seconds of being unable to establish a connection, and will never time out if the connection is open and active?
Or does it give up 10 seconds from the moment of the call, even if, say, the connection was actually established after 2 seconds but the data transfer could not finish during the next 8 seconds?
Or is it some other case entirely?
Also, the concept is clear for timing out a connect attempt, but how does a receive timeout occur? As far as I know, the OS automatically receives and holds data sent to you in its local buffer even before you ask to receive it, since the data may arrive before your code makes the call, and the OS does this to guarantee that data isn't lost.
So is my receive timeout value passed on to the OS for it to handle?
Thanks in advance; I hope I've done my part well in phrasing the question.

App Engine: Keep Socket Open more than 2 Minutes

I'm using the App Engine Trusted Tester Sockets to connect to APNS. Writing to the socket works fine.
But the problem is that the socket gets reclaimed after 2 minutes of inactivity. The Trusted Tester website says that any socket operation keeps the socket alive for a further 2 minutes. It would be nicer to keep the socket open until APNS decides to close the connection.
After trying pretty much all of the Socket API methods short of writing to the output stream, the socket gets closed after 2 minutes no matter what. What have I missed?
Deployed on a Java backend.
You can't artificially keep a socket connected to APNS open without sending actual push notifications. The only way to keep it open would be to send some arbitrary data/bytes, but that would result in the socket being closed immediately: APNS closes the connection as soon as it detects something that does not conform to the protocol, i.e. something that is not an actual push notification.
SO_KEEPALIVE
What about SO_KEEPALIVE? App Engine explicitly says it is supported. I think that just means it won't throw an exception when you call Socket.setKeepAlive(true); calls that tried to set socket options raised Not Implemented exceptions before. Even if you enable keep-alive, your socket will be reclaimed (closed) if you don't send something for more than 2 minutes, at least on App Engine as of now.
Actually, that's not a big surprise. RFC 1122, which specifies TCP keep-alive, explicitly states that TCP keep-alives are not to be sent more than once every two hours, and then only if there was no other traffic. Although it also says that this interval must be configurable, there is no API on java.net.Socket you could use to configure it (most probably because it's highly OS-dependent), and I doubt it would be set to 2 minutes on App Engine.
SO_TIMEOUT
What about SO_TIMEOUT? It is for something else entirely. The Javadoc of Socket.setSoTimeout() states:
Enable/disable SO_TIMEOUT with the specified timeout, in milliseconds. With this option set to a non-zero timeout, a read() call on the InputStream associated with this Socket will block for only this amount of time. If the timeout expires, a java.net.SocketTimeoutException is raised, though the Socket is still valid. The option must be enabled prior to entering the blocking operation to have effect. The timeout must be > 0. A timeout of zero is interpreted as an infinite timeout.
That is, when read() blocks for too long because there's nothing to read, you can say "OK, I don't want to wait (block) anymore; let's do something else instead". It's not going to help with our "2 minutes" problem.
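A tiny illustration of what SO_TIMEOUT actually buys you (host, port and values are placeholders): the read stops blocking after the timeout, but the socket stays open and, on App Engine, stays idle.

import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {
    public static void main(String[] args) throws Exception {
        Socket socket = new Socket("push.example.com", 2195); // placeholder endpoint
        socket.setSoTimeout(30 * 1000); // a blocking read() waits at most 30 seconds

        InputStream in = socket.getInputStream();
        byte[] buf = new byte[256];
        try {
            int n = in.read(buf); // returns data, returns -1, or times out within 30 seconds
            System.out.println("read " + n + " bytes");
        } catch (SocketTimeoutException e) {
            // Nothing arrived in time; the socket itself is still valid and still idle.
        }
    }
}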
What then?
The only way you can work around this problem is this: detect when a connection has been reclaimed/closed, throw it away, and open a new connection. And there is a library which supports exactly that.
Check out java-apns-gae.
It's an open-source Java APNS library that was specifically designed to work (and be used) on Google App Engine.
https://github.com/ZsoltSafrany/java-apns-gae
Did you try getSoLinger()? That may be the getSocketOpt that (kind of) works currently, and it may reset the 2-minute timeout. In theory, doing a zero-byte read would as well, but I'm not sure it would; if you try that, use this method on the InputStream:
public int read(byte b[], int off, int len)
If these suggestions don't work, please file an issue with the App Engine issue tracker.
There will be some other fixes coming, e.g. using socket options etc.
Use getpeername().
From https://developers.google.com/appengine/docs/java/sockets/overview ...
Sockets may be reclaimed after 2 minutes of inactivity; any socket operation (e.g. getpeername) keeps the socket alive for a further 2 minutes. (Notice that you cannot Select between multiple available sockets because that requires java.nio.SocketChannel which is not currently supported.)
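For what it's worth, a minimal sketch of the "poke the socket periodically" idea from the answers above (the interval and the particular calls are illustrative; whether they actually reset App Engine's idle timer is exactly what is in question here):

import java.net.Socket;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SocketPoker {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // Performs cheap socket operations well inside the 2-minute window.
    public void start(final Socket socket) {
        scheduler.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                try {
                    socket.getSoLinger();            // the socket-option getter suggested above
                    socket.getRemoteSocketAddress(); // Java's closest analogue to getpeername()
                } catch (Exception e) {
                    // The socket was reclaimed anyway; time to reconnect.
                    scheduler.shutdown();
                }
            }
        }, 90, 90, TimeUnit.SECONDS);
    }
}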

Java socket not throwing exceptions on a dead socket?

We have a simple client-server architecture between our mobile device and our server, both written in Java: an extremely simple ServerSocket and Socket implementation. However, one problem is that when the client terminates abruptly (without closing the socket properly), the server does not know that it is disconnected. Furthermore, the server can continue to write to this socket without getting any exceptions. Why?
According to the documentation, Java sockets should throw exceptions if you try to write to a socket that is not reachable on the other end!
The connection will eventually be timed out by the retransmission timeout (RTO). However, the RTO is calculated using a complicated algorithm based on network latency (RTT); see this RFC:
http://www.ietf.org/rfc/rfc2988.txt
So on a mobile network, this can be minutes. Wait 10 minutes to see if you can get a timeout.
The solution to this kind of problem is to add a heartbeat to your own application protocol and tear down the connection when you don't get an ACK for the heartbeat.
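A rough sketch of such a heartbeat on one side of the connection (the message format, interval, and timeout are made up; a real protocol would interleave this with normal traffic):

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.Socket;

public class Heartbeat {
    private static final byte PING = 0x1;
    private static final byte PONG = 0x2;

    // Sends a PING and waits up to 10 seconds for a PONG.
    // Returns false if the peer is unresponsive, so the caller can tear the connection down.
    public static boolean peerIsAlive(Socket socket) {
        try {
            socket.setSoTimeout(10 * 1000); // bound the blocking read below
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            DataInputStream in = new DataInputStream(socket.getInputStream());

            out.writeByte(PING);
            out.flush();
            return in.readByte() == PONG; // times out or fails if the peer silently died
        } catch (Exception e) {
            return false;
        }
    }
}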
The key words here are "without closing the socket properly".
Sockets should always be acquired and disposed of in this way:
final Socket socket = ...; // connect code
try
{
    use( socket ); // use socket
}
finally
{
    socket.close( ); // dispose
}
Even with these precautions, you should specify application timeouts specific to your protocol.
My experience has shown that, unfortunately, you cannot use any of the Socket timeout functionality reliably (e.g. there is no timeout for write operations, and even read operations may sometimes hang forever).
That's why you need a watchdog thread that enforces your application timeouts and disposes of sockets that have been unresponsive for a while.
One convenient way of doing this is to obtain the Socket and ServerSocket through the corresponding channels in java.nio. The main advantage of such sockets is that they are interruptible; that way you can simply interrupt the thread that runs the socket protocol and be sure that the socket is properly disposed of.
Note that you should enforce application timeouts on both sides, as it is only a matter of time and bad luck before you experience unresponsive sockets.
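A rough sketch of that channel-based approach (the host, port, and 30-second deadline are illustrative): the worker blocks on an interruptible SocketChannel, and a watchdog interrupts it when the application deadline passes, which also closes the channel.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class WatchdogExample {
    public static void main(String[] args) throws Exception {
        // A blocking SocketChannel is interruptible, unlike a plain Socket. Placeholder host/port.
        final SocketChannel channel = SocketChannel.open(new InetSocketAddress("example.com", 80));

        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                ByteBuffer buf = ByteBuffer.allocate(4096);
                try {
                    while (channel.read(buf) != -1) {
                        buf.clear(); // handle the protocol here
                    }
                } catch (IOException e) {
                    // ClosedByInterruptException lands here: the watchdog fired
                    // and the channel has already been closed for us.
                }
            }
        });
        worker.start();

        // Watchdog: enforce an application-level timeout of 30 seconds.
        worker.join(30 * 1000);
        if (worker.isAlive()) {
            worker.interrupt(); // closes the channel and unblocks the read
        }
    }
}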
TCP/IP communications can be very strange. TCP will retry for quite a while at the bottom layers of the stack without ever letting the upper layers know that anything happened.
I would fully expect that after some time period (30 seconds to a few minutes) you should see an error, but I haven't tested this; I'm just going off how TCP apps tend to work.
You might be able to tighten the TCP settings (retry, timeout, etc.), but again, I haven't messed with that much.
Also, it may be that I'm totally wrong and the implementation of Java you are using is just flaky.
To answer the first part of the question (about not knowing that the client has disconnected abruptly), in TCP, you can't know whether a connection has ended until you try to use it.
The notion of guaranteed delivery in TCP is quite subtle: delivery isn't actually guaranteed to the application at the other end (it depends on what guaranteed means really). Section 2.6 of RFC 793 (TCP) gives more details on this topic. This thread on the Restlet-discuss list and this thread on the Linux kernel list might also be of interest.
For the second part (not detecting the failure when you write to the socket), this is probably a question of buffering and timeouts (as others have already suggested).
I am facing the same problem.
I think that when you register the socket with a selector, it doesn't throw any exceptions.
Are you using a selector with your socket?
