We have a Java client/server RMI application that uses a Persistence Framework. Upon starting a client session we start the following thread:
Thread checkInThread = new Thread() {
    public void run() {
        while (true) {
            try {
                getServer().checkIn(userId);
            }
            catch (Exception ex) {
                JOptionPane.showMessageDialog(parentFrame, "The connection to the server was lost.");
                ex.printStackTrace();
            }
            try {
                Thread.sleep(15000);
            }
            catch (InterruptedException e) {
            }
        }
    }
};
This is used to keep track of whether a client session has lost its connection to the server. If a client does not check in for 45 seconds, there are a number of things we need to clean up from that client's session. On the first check-in after the 45-second threshold has been exceeded, we boot the client from the system, which then allows them to log back in. In theory the only time this should happen is if the client PC loses connectivity to the server.
However, we have come across scenarios where the thread checks in every 15 seconds just fine and then, for an unknown reason, goes out to lunch for 45+ seconds. Eventually the client checks back in, but it seems like something is blocking the execution of the thread during that time. We have experienced this using both Swing and JavaFX on the client side. The client and server run only on Windows.
Is there an easy way to figure out what is causing this to happen, or a better approach to take to make sure the check-ins occur regularly at 15-second intervals, assuming there is connectivity between client and server?
getServer().checkIn(userId);
If the getServer() or checkIn() calls take more than 15 seconds to return, that by itself explains why "the thread will just go out to lunch for 45+ seconds".
This can happen when the client machine goes into sleep or hibernate mode. Usually when it's a laptop that just had its cover closed.
There can also be temporary network outages that last more than 15 seconds but allow connections to resume automatically when the network comes back. In that case, the client is stuck in checkIn(), not in sleep().
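If the blocking call itself is the suspect, one way to keep the loop responsive is to bound how long a single check-in may take. A minimal sketch, assuming the getServer()/checkIn() methods from the question (the Server interface and the 10-second limit are invented here for illustration):

import java.util.concurrent.*;

// Sketch only. "Server" stands in for whatever remote interface getServer()
// returns in the question; checkIn(String) is taken from the posted code, the
// rest (the interface, the 10-second limit) is invented for illustration.
class CheckInWithTimeout {
    interface Server {
        void checkIn(String userId) throws Exception;
    }

    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Server server;

    CheckInWithTimeout(Server server) {
        this.server = server;
    }

    /** Returns false if the check-in did not complete within 10 seconds. */
    boolean checkIn(final String userId) {
        Future<?> f = executor.submit(new Callable<Void>() {
            public Void call() throws Exception {
                server.checkIn(userId);        // the potentially blocking remote call
                return null;
            }
        });
        try {
            f.get(10, TimeUnit.SECONDS);       // bound the wait
            return true;
        } catch (TimeoutException te) {
            f.cancel(true);                    // give up on this attempt
            return false;                      // treat it like a lost connection
        } catch (Exception other) {
            return false;                      // RemoteException and friends
        }
    }
}

Note that cancel(true) only interrupts the worker thread; an RMI call stuck in socket I/O may not respond to the interrupt, but the calling loop regains control and can treat the timeout the same way it treats a lost connection.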
You should absolutely and positively not do this. There is no such thing as a connection in RMI, ergo you are testing for a condition that does not exist. You are also interfering with RMI's connection pooling. The correct way to accomplish what you're attempting is via the remote session pattern and the Unreferenced interface. RMI can already tell you when a client loses connectivity, without all this overhead. 'Still connected' has no meaning in RMI.
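For reference, the Unreferenced hook mentioned in that answer looks roughly like this; the session interface and its methods are invented here for illustration:

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.rmi.server.Unreferenced;

// Hypothetical session interface; the real methods depend on the application.
interface ClientSession extends Remote {
    void doSomething() throws RemoteException;
}

// One exported session object per logged-in client. When the client JVM exits
// or loses connectivity, distributed garbage collection eventually drops its
// lease and unreferenced() fires on the server, which is where the per-session
// cleanup belongs.
class ClientSessionImpl extends UnicastRemoteObject implements ClientSession, Unreferenced {
    private final String userId;

    ClientSessionImpl(String userId) throws RemoteException {
        this.userId = userId;
    }

    public void doSomething() throws RemoteException {
        // ...normal session work...
    }

    public void unreferenced() {
        // Clean up everything tied to this client's session.
        System.out.println("Session for " + userId + " is no longer referenced");
    }
}

Keep in mind that the DGC lease (java.rmi.dgc.leaseValue) defaults to 10 minutes, so unreferenced() will not fire within 45 seconds unless the lease is shortened.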
Related
I have an application that uses LDAP and communicates client/server style using Sun's JNDI library. The problem is that when many connections try to be established at once, I see a lot of failed connections because the bind response is not sent within the desired time interval.
Is there a way to enhance this?
It is not unusual that there are >200 connections at once. Everything works OK until ~60 connections and after that it becomes too slow.
P.S. There is no possibility to increase the waiting time.
Every connection is running in a separate thread like this:
...
serverSocket = new ServerSocket(port);
while (true) {
    newSocket = serverSocket.accept();
    newSocket.setTcpNoDelay(true);
    Thread t = new Thread(/* runnable that does something */);
    t.start();
}
Thanks!
Just wanted to share with you that I set a higher value for the backlog and also cleaned up the run method a lot, making the transfer part the first thing that executes and then doing the analysis. Thanks for your help.
You probably have networking code in the constructor of the Runnable. Move it to the run() method so that it runs in its own thread instead of the thread that is calling accept().
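A sketch of that change (the handler class and its work are hypothetical; only the placement of the blocking code matters):

import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

// Hypothetical handler class; the point is only where the blocking work happens.
class ClientHandler implements Runnable {
    private final Socket socket;

    // Keep the constructor trivial: this code runs on the thread that called accept().
    ClientHandler(Socket socket) {
        this.socket = socket;
    }

    // All blocking I/O belongs here, on the handler's own thread.
    public void run() {
        try (InputStream in = socket.getInputStream()) {
            // ...read the request, do the LDAP work, write the response...
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

The accept loop then just does new Thread(new ClientHandler(newSocket)).start() and immediately goes back to accept().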
I am using the App Engine Trusted Tester Sockets to connect to APNS. Writing to the socket works fine.
The problem is that the socket gets reclaimed after 2 minutes of inactivity. The Trusted Tester documentation says that any socket operation keeps the socket alive for a further 2 minutes. It would be nicer to keep the socket open until APNS decides to close the connection.
After trying pretty much all of the Socket API methods short of writing to the OutputStream, the socket gets closed after 2 minutes no matter what. What have I missed?
Deployed on the Java backend.
You can't keep a socket connected to APNS artificially open without sending actual push notifications. The only way to keep it open would be to send some arbitrary data/bytes, but that would result in an immediate closure of the socket: APNS closes the connection as soon as it detects something that does not conform to the protocol, i.e. anything that is not an actual push notification.
SO_KEEPALIVE
What about SO_KEEPALIVE? App Engine explicitly says it is supported, but I think that just means it won't throw an exception when you call Socket.setKeepAlive(true); calls that tried to set socket options used to raise Not Implemented exceptions. Even if you enable keep-alive, your socket will still be reclaimed (closed) if you don't send something for more than 2 minutes, at least on App Engine as of now.
Actually, that's not a big surprise. RFC 1122, which specifies TCP keep-alive, explicitly states that keep-alives must not be sent more than once every two hours, and even then only if there was no other traffic. Although it also says that this interval must be configurable, there is no API on java.net.Socket to configure it (most probably because it is highly OS-dependent), and I doubt it would be set to 2 minutes on App Engine.
SO_TIMEOUT
What about SO_TIMEOUT? That is for something else entirely. The javadoc of Socket.setSoTimeout() states:
Enable/disable SO_TIMEOUT with the specified timeout, in milliseconds. With this option set to a non-zero timeout, a read() call on the InputStream associated with this Socket will block for only this amount of time. If the timeout expires, a java.net.SocketTimeoutException is raised, though the Socket is still valid. The option must be enabled prior to entering the blocking operation to have effect. The timeout must be > 0. A timeout of zero is interpreted as an infinite timeout.
That is, when read() blocks for too long because there is nothing to read, you can say "OK, I don't want to wait (block) any more; let's do something else instead." It's not going to help with our "2 minutes" problem.
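For completeness, this is roughly what SO_TIMEOUT usage looks like; it bounds a blocking read but, as explained above, does nothing about the 2-minute reclaim (the host and port are placeholders):

import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port; substitute whatever you actually connect to.
        try (Socket socket = new Socket("example.com", 443)) {
            socket.setSoTimeout(30000);           // a read() may block for at most 30 s
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[256];
            try {
                int n = in.read(buf);             // blocks until data, EOF, or timeout
                System.out.println("read " + n + " bytes");
            } catch (SocketTimeoutException e) {
                // Nothing arrived within 30 s; the socket itself is still valid.
                System.out.println("read timed out, socket still open");
            }
        }
    }
}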
What then?
The only way you can work around this problem is this: detect when a connection is reclaimed/closed then throw it away and open a new connection. And there is a library which supports exactly that.
Check out java-apns-gae.
It's an open-source Java APNS library that was specifically designed to work (and be used) on Google App Engine.
https://github.com/ZsoltSafrany/java-apns-gae
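If you roll the reconnect logic yourself instead of using the library, the idea looks roughly like this; the endpoint is a placeholder, and the TLS setup and APNS frame format are omitted:

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;

// Rough sketch of "use the connection until it breaks, then reopen it".
// openApnsSocket() and the notification bytes are placeholders; real code
// would use TLS and the APNS frame format.
class ReconnectingSender {
    private Socket socket;

    void send(byte[] notification) throws IOException {
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                if (socket == null || socket.isClosed()) {
                    socket = openApnsSocket();
                }
                OutputStream out = socket.getOutputStream();
                out.write(notification);
                out.flush();
                return;                            // sent successfully
            } catch (IOException e) {
                closeQuietly();                    // the socket was probably reclaimed; retry once
            }
        }
        throw new IOException("could not send even after reopening the connection");
    }

    private Socket openApnsSocket() throws IOException {
        return new Socket("example.com", 443);     // placeholder endpoint
    }

    private void closeQuietly() {
        try {
            if (socket != null) socket.close();
        } catch (IOException ignored) {
        }
        socket = null;
    }
}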
Did you try getSoLinger()? That may be the getSocketOpt call that (kind of) works currently, and it may reset the 2-minute timeout. In theory, doing a zero-byte read would as well, but I'm not sure it would; if you try that, use this method on the InputStream:
public int read(byte b[], int off, int len)
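If you want to experiment with those two suggestions, the calls look like this. Whether either one actually resets App Engine's 2-minute idle timer is unverified; note that the InputStream contract allows a zero-length read to return 0 immediately without touching the network at all.

import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

class SocketTouch {
    // Experimental "touch the socket" operations from the answer above;
    // whether either of them keeps an App Engine socket alive is unverified.
    static void touch(Socket socket) throws IOException {
        socket.getSoLinger();                 // query SO_LINGER
        InputStream in = socket.getInputStream();
        in.read(new byte[0], 0, 0);           // zero-byte read; may return 0 without any I/O
    }
}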
If these suggestions don't work, please file an issue with the App Engine issue tracker.
There will be some other fixes coming, e.g. using socket options etc.
Use getpeername().
From https://developers.google.com/appengine/docs/java/sockets/overview ...
Sockets may be reclaimed after 2 minutes of inactivity; any socket operation (e.g. getpeername) keeps the socket alive for a further 2 minutes. (Notice that you cannot Select between multiple available sockets because that requires java.nio.SocketChannel which is not currently supported.)
I am currently trying to write a very simple chat application to introduce myself to Java socket programming and multithreading. It consists of 2 modules, a pseudo-server and a pseudo-client; however, my design has led me to believe that I'm trying to implement an impossible concept.
The Server
The server waits on localhost port 4000 for a connection, and when it receives one, it starts 2 threads: a listener thread and a speaker thread. The speaker thread constantly waits for user input from the console and sends it to the client when it receives said input. The listener thread blocks on the ObjectInputStream of the socket for any messages sent by the client and then prints the message to the console.
The Client
The client connects the user to the server on port 4000 and then starts 2 threads, a listener and a speaker. These threads have the same functionality as the server's threads but, for obvious reasons, handle input/output in the opposite way.
The First Problem
The problem I am running into is that in order to end the chat, a user must type "Bye". Now, since my threads have been looped to block for input:
while (connected()) {
    // block for input
    // do something with this input
    // determine if the connection still exists (was the message "Bye"?)
}
This makes exiting the application a really interesting scenario. If the client types "Bye", then its sending thread returns, and the thread that listened for the "Bye" on the server also returns. This leaves us with the problem that the client-side listener and the server-side speaker do not know that "Bye" has been typed, and thus continue execution.
I resolved this issue by creating a class Synchronizer that holds a boolean variable that both threads access in a synchronized manner:
public class Synchronizer {
    boolean chatting;

    public Synchronizer() {
        chatting = true;
        onChatStatusChanged();
    }

    synchronized void stopChatting() {
        chatting = false;
        onChatStatusChanged();
    }

    synchronized boolean chatting() {
        return chatting;
    }

    public void onChatStatusChanged() {
        System.out.println("Chat status changed!: " + chatting);
    }
}
I then passed the same instance of this class into each thread as it was created. There was still one issue, though.
The Second Problem
This is where I deduced that what I am trying to do is impossible using the methods I am currently employing. Given that one user has to type "Bye" to exit the chat, the other 2 threads that aren't being utilized still go on to pass the check for a connection and begin blocking for I/O. While they are blocking, the original 2 threads realize that the connection has been terminated, but even though they change the boolean value, the other 2 threads have already passed the check, and are already blocking for I/O.
This means that even though you will terminate the thread on the next iteration of the loop, you will still be trying to receive input from the threads that have already terminated properly. This led me to my final conclusion and question.
My Question
Is it possible to asynchronously receive and send data in the manner which I am trying to do? (2 threads per client/server that both block for I/O) Or must I send a heartbeat every few milliseconds back and forth between the server and client that requests for any new data and use this heartbeat to determine a disconnect?
The problem seems to reside in the fact that my threads are blocking for I/O before they realize that the partner thread has disconnected. This leads to the main issue: how do you asynchronously stop a thread that is blocking for I/O?
I feel as though this should be possible, since the behavior can be seen throughout social media applications.
Any clarification or advice would be greatly appreciated!
I don't know Java, but if it has threads, the ability to invoke functions on threads, and the ability to kill threads, then even if it doesn't have tasks, you can add tasks, which is all you need to start building your own ASync interface.
For that matter, if you can kill threads, then the exiting threads could just kill the other threads.
Also, a "Bye" (or some other code) should be sent in any case where the window is closing and the connection is open - If Java has Events, and the window you're using has a Close event, then that's the place to put it.
Alternatively, you could test for a valid/open window and send the "Bye" if the window is invalid/closed. Think of that as a poor man's event handler.
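In Swing, the window-close hook mentioned above looks something like this; sendBye is a placeholder for however your speaker thread actually sends the message:

import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;
import javax.swing.JFrame;

class ChatWindowHook {
    // sendBye is a placeholder for whatever your client uses to send "Bye".
    static void install(JFrame frame, final Runnable sendBye) {
        frame.addWindowListener(new WindowAdapter() {
            @Override
            public void windowClosing(WindowEvent e) {
                sendBye.run();   // tell the peer we are leaving before the window goes away
            }
        });
    }
}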
Also, make sure you know how to (and have permission to) manually add exceptions to your networks' firewall(s).
Also, always test it over a live network. Just because it works in a loopback, doesn't mean it'll work over the network. Although you probably already know that.
Just to clarify for anyone who might stumble upon this post in the future, I ended up solving this problem by tweaking the structure of my threads a bit. First of all, I removed my old threads and replaced them with AsyncSender and AsyncReader, respectively. These threads constantly send and receive regardless of user input. When there is no user input, they simply send/receive a blank string and only print it to the console if it is anything but a blank string.
The Workaround
try {
    if ((obj = in.readObject()) != null) {
        if (obj instanceof String)
            output = (String) obj;
        if (output.equalsIgnoreCase("Bye"))
            s.stop();
    }
}
catch (ClassNotFoundException e) {
    e.printStackTrace();
}
catch (IOException e) {
    e.printStackTrace();
}
In this iteration of the receiver thread, it does not block for input, but rather tests if the object read was null (no object was in the stream). The same is done in the sender thread.
This successfully bypasses the problem of having to stop a thread that is blocking for I/O.
Note that there are still other ways to work around this issue, such as using an InterruptibleChannel.
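For reference, the InterruptibleChannel route works because a read on java.nio.channels.SocketChannel can be broken out of by interrupting the blocked thread; a minimal, self-contained sketch (the host and port are placeholders):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.SocketChannel;

public class InterruptibleReaderDemo {
    public static void main(String[] args) throws Exception {
        final SocketChannel channel = SocketChannel.open(new InetSocketAddress("example.com", 80));

        Thread reader = new Thread() {
            public void run() {
                ByteBuffer buf = ByteBuffer.allocate(1024);
                try {
                    while (channel.read(buf) != -1) {
                        buf.clear();                 // handle received data here
                    }
                } catch (ClosedByInterruptException e) {
                    // Another thread interrupted us; the channel is now closed.
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        };
        reader.start();

        Thread.sleep(1000);
        reader.interrupt();   // breaks the blocked read() and closes the channel
        reader.join();
    }
}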
I have a Java program that gathers data from the web.
Unfortunately, my network has a problem: it goes off and on.
I need a way to make my program wait until the connection is up again and then continue its job.
I use URLConnection to connect. I made a loop to reconnect when a ConnectException is caught, but it doesn't work.
Any suggestions?
my network has a problem: it goes off and on. I need a way to make my program wait until the connection is up again and then continue its job
It depends on what your program's purpose is.
How long are these intermittent failures? If they are short enough, you could use setConnectTimeout(0) to indicate an infinite timeout period while trying to connect, but if your program has to report something back, that is not a good option for the end user.
You could set a relatively low timeout so that when you start to lose the network you will get a java.net.SocketTimeoutException.
When you catch this, you could wait for a period and try again in a loop, e.g. 3 times, and then perhaps report a failure.
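A rough shape for that retry loop; the limits and timeouts here are arbitrary:

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class RetryingFetcher {
    // Arbitrary limits for illustration.
    private static final int MAX_ATTEMPTS = 3;
    private static final int TIMEOUT_MS = 10000;
    private static final long WAIT_BETWEEN_ATTEMPTS_MS = 30000;

    static InputStream openWithRetry(String url) throws IOException, InterruptedException {
        IOException last = null;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
                conn.setConnectTimeout(TIMEOUT_MS);
                conn.setReadTimeout(TIMEOUT_MS);
                return conn.getInputStream();            // network is up, carry on
            } catch (IOException e) {                    // covers ConnectException and SocketTimeoutException
                last = e;
                Thread.sleep(WAIT_BETWEEN_ATTEMPTS_MS);  // wait for the network to come back
            }
        }
        throw last;                                      // give up and report the failure
    }
}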
It depends on what you are trying to do.
(See this question in ServerFault)
I have a Java client that uses Socket to open concurrent connections to the same machine. I am witnessing a phenomenon where one request completes extremely fast, but the others see a delay of 100-3000 milliseconds. Packet inspection with Wireshark shows that all SYN packets beyond the first wait a long time before leaving the client. I am seeing this on both Windows (Server 2008) and Linux client boxes. What could be causing this?
Code attached:
import java.util.*;
import java.util.concurrent.*;
import java.net.*;
public class Tester {
    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            usage();
            return;
        }
        final int n = Integer.parseInt(args[0]);
        final String ip = args[1];
        final int port = Integer.parseInt(args[2]);
        ExecutorService executor = Executors.newFixedThreadPool(n);
        ArrayList<Callable<Long>> tasks = new ArrayList<Callable<Long>>();
        for (int i = 0; i < n; ++i)
            tasks.add(new Callable<Long>() {
                public Long call() {
                    Date before = new Date();
                    try {
                        Socket socket = new Socket();
                        socket.connect(new InetSocketAddress(ip, port));
                        socket.close();   // avoid leaking sockets between attempts
                    }
                    catch (Throwable e) {
                        e.printStackTrace();
                    }
                    Date after = new Date();
                    return after.getTime() - before.getTime();
                }
            });
        System.out.println("Invoking");
        List<Future<Long>> results = executor.invokeAll(tasks);
        System.out.println("Invoked");
        for (Future<Long> future : results) {
            System.out.println(future.get());
        }
        executor.shutdown();
    }

    private static void usage() {
        System.out.println("Usage: prog <threads> <ip> <port>");
        System.out.println("Examples:");
        System.out.println("    prog 10 127.0.0.1 2000");
    }
}
Update - the problem reproduces consistently if I clear the relevant ARP entry before running the test program. I've tried tuning the TCP retransmission timeout, but that didn't help. Also, we ported this program to .Net, but the problem still happens.
Update 2 - 3 seconds is the delay RFC 1122 specifies for retransmitting when creating new connections. I still don't fully understand why there is a retransmission here; it should be handled by the MAC layer. Also, we reproduced the problem using netcat, so it has nothing to do with Java.
It looks like you are using a single underlying HTTP connection, so another request can't be made until you call close() on the InputStream of the HttpURLConnection, i.e. until you process the response.
Alternatively, you could use a pool of HTTP connections.
You are doing the right thing in reducing the size of the problem space. On the surface this is an impossible problem - something that moves between IP stacks, languages and machines, and yet is not arbitrarily reproducible (e.g. I cannot reproduce it using your code on either Windows or Linux).
Some suggestions, going from the top of the stack to the bottom:
Code -- you say this happens on .Net and Java. Are there any language/compiler combinations for which it does not happen? I used your client talking to the SocketTest program from sourceforge and also "nc" with identical results - no delays. Similarly JDK 1.5 vs 1.6 made no difference for me.
-- Suppose you pace the speed at which the client sends requests, say one every 500ms. Does the problem repro?
IP stack -- maybe something is getting stuck in the stack on the way out. I see you've ruled out Nagle, but don't forget silly stuff like firewalls/iptables. I'd find it hard to believe that the TCP stack on Windows and Linux was that hosed, but you never know.
-- loopback interface handling can be freaky. Does it repro when you use the machine's real IP? What about across the network (or better, back-to-back with a x-over cable to another machine)?
NIC -- if the packets are making it to the cards, consider features of the cards (TCP offload or other 'special' handling) or quirks in the NICs themselves. Do you get the same results with other brands of NIC?
I haven't found a real answer from this discussion. The best theory I've come up with is:
TCP layer sends a SYN to the MAC layer. This happens from several threads.
First thread sees that IP has no match in the ARP table, sends an ARP request.
Subsequent threads see there is a pending ARP request so they drop the packet altogether. This behavior is probably implemented in the kernel of several operating systems!
ARP reply returns, the original SYN request from the first thread leaves the machine and a TCP connection is established.
For the other threads, the TCP layer waits 3 seconds as stated in RFC 1122, then retransmits the SYN and succeeds.
I've tried tweaking the timeout in Windows 7 but wasn't successful. If anyone can reproduce the problem and provide a workaround, I'd be most grateful. Also, if anyone has more details on why exactly this phenomenon happens only with multiple threads, it would be interesting to hear.
I'll try to accept this answer as I don't think any of the answers provided a true explanation (see this discussion on meta).
If either of the machines is a Windows box, I'd take a look at the Max Concurrent Connections setting on both. See: http://www.speedguide.net/read_articles.php?id=1497
I think this is an app-level limit in some cases, so you'll have to follow the guide to raise it.
In addition, if this is what happens, you should see something in the System Event Log on the offending machine.
Java client that uses HttpURLConnection to open concurrent connections to the same machine.
The same machine? What application accepts the clients? If you wrote that program yourself, maybe you should time how fast your server can accept clients. Maybe it is just a badly written (or slow) server application. The server code probably looks something like this:
ServerSocket ss = ...;
while (acceptingMoreClients) {
    Socket s = ss.accept();
    // At this moment the client is connected to the server, so start timing.
    long start = System.currentTimeMillis();
    ClientHandler handler = new ClientHandler(s);
    handler.start();
    // After handler.start(), the handler thread has been started,
    // so the next two statements will complete very quickly.
    // That means the server is ready to accept a new client. Stop timing.
    long stop = System.currentTimeMillis();
    System.out.println("Client accepted in " + (stop - start) + " millis");
}
If these results are bad, then you know where the problem lies.
I hope this brings you closer to the solution.
Question:
To do the test, do you use the IP address you received from the DHCP server or 127.0.0.1?
If it is the one from the DHCP server, everything goes through the router/switch/... of your company, which can slow down the whole process.
Otherwise:
On Windows, all localhost-to-localhost TCP traffic is redirected in the software layer of the system (not the hardware layer), which is why you cannot see that traffic with Wireshark; Wireshark only sees traffic that passes the hardware layer.
On Linux, Wireshark can likewise only see traffic at the hardware layer, but Linux doesn't redirect at the software layer. That is also the reason why InetAddress.getLocalHost().getAddress() returns 127.0.0.1.
So when you use Windows, it is perfectly normal that you cannot see the SYN packet with Wireshark.
Martijn.
The fact that you see this on multiple clients, with different OS's, and with different application environments on (I assume) the same OS is a strong indication that it's a problem with either the network or the server, not the client. This is reinforced by your comment that clearing the ARP table reproduces the problem.
Do you perhaps have two machines on the switch with the same MAC address? (one of which will probably be a router that's spoofing the MAC address).
Or more likely, if I recall ARP correctly, two machines that have the same hardcoded IP address. When the client sends out "who is IP 123.456.123.456", both will answer, but only one will actually be listening.
Another possibility (I've seen this happen in a corporate environment) is a rogue DHCP server, again giving out the same IP addresses to two machines.
Since the problem isn't reproducible unless you clear the associated ARP cache, what does the entire packet trace look like from a timing perspective, from the time the ARP request is issued until after the 3 second delay?
What happens if you open connections to two different IPs? Will the first connections to both succeed? If so, that should rule out any JVM or library issues.
The first SYN can't be sent until the ARP response arrives. Maybe the OS or TCP stack uses a timeout instead of an event for threads beyond the first one that try to open a connection when the associated MAC address isn't known.
Imagine the following scenario:
Thread #1 tries to connect, but the SYN can't be sent because the ARP cache is empty, so it queues the ARP request.
Next, Thread #2 (through #N) tries to connect. It also can't send the SYN packet because the ARP cache is empty. This time, though, instead of sending another ARP request, the thread goes to sleep for 3 seconds, as it says in the RFC.
Next, the ARP response arrives. Thread #1 wakes up immediately and sends the SYN.
Thread #2 isn't waiting on the ARP request; it has a hard-coded 3-second sleep. So after 3 seconds, it wakes up, finds the ARP entry it needs, and sends the SYN.
I have seen similar behavior when I was getting DNS timeouts. To test this, you can either use the IP address directly or enter the IP address in your hosts file.
Does setting socket.setTcpNoDelay( true ) help?
Have you tried to see what system calls are made by running your client with strace?
It's been very helpful to me in the past, while debugging some mysterious networking issues.
What is the listen backlog on the server? How quickly is it accepting connections? If the backlog fills up, the OS ignores connection attempts. 3 seconds later, the client tries again and gets in now that the backlog has cleared.
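If the backlog turns out to be the problem, it can be raised when the ServerSocket is created (the default is 50), and accept() should hand connections off as quickly as possible; a minimal sketch:

import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BacklogServer {
    public static void main(String[] args) throws Exception {
        int port = 2000;                                          // matches the example in the question
        ServerSocket serverSocket = new ServerSocket(port, 200);  // backlog of 200 instead of the default 50
        ExecutorService workers = Executors.newFixedThreadPool(32);
        while (true) {
            final Socket client = serverSocket.accept();
            workers.submit(new Runnable() {                       // hand off immediately so accept() keeps up
                public void run() {
                    // ...handle the connection...
                }
            });
        }
    }
}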