We have an application which is using an embedded tomcat version 7.0.32. I am observing a peculiar situation with respect to latency.
I am doing some load tests on the application, what i have observed is the very first request to tomcat takes quite some amount of time, e.g. rate of about 300+ ms. Subsequent requests take about 10-15ms.
I am using a BIO connector. I know that persistent connections are used since i am using HTTP 1.1, which has that support by default. So ideally only 1 TCP connection is created and all request pushed on the same connection, till the keep alive timeout is elapsed.
I get the creating a TCP connection will have some costs involved, but the difference is just large.
Any idea what could be causing this huge difference in latency between the 1st and subsequent request and can we do anything to reduce/eliminate it.
Thanks,
Vikram
If you are using JSPs, they are compiled.
If you are connecting to databases, the connection pool might be empty before.
Generally speaking, if you have singletons which are initialized lazily, the first request has to wait.
On top of this, the JIT plays its role: So after the first request, the JIT might have applied some optimizations.
If it is a load test (or perfomance test), I would just ignore the first requests/runs, because this is still the "warm up" phase.
Update
You might find the information regarding a micro benchmark interesting.
Related
UPDATE:
My goal is to learn what factors could overwhelm my little tomcat server. And when some exception happens, what I could do to resolve or remediate it without switching my server to a better machine. This is not a real app in a production environment but just my own experiment (Besides some changes on the server-side, I may also do something on my client-side)
Both of my client and server are very simple: the server only checks the URL format and send 201 code if it is correct. Each request sent from my client only includes an easy JSON body. There is no database involved. The two machines (t2-micro) only run client and server respectively.
My client is OkHttpClient(). To avoid timeout exceptions, I already set timeout 1,000,000 milli secs via setConnectTimeout, setReadTimeout, and setWriteTimeout. I also go to $CATALINA/conf/server.xml on my server and set connectionTimeout = "-1"(infinite)
ORIGINAL POST:
I'm trying to stress out my server by having a client launching 3000+ threads sending HTTP requests to my server. Both of my client and server reside on different ec2 instances.
Initially, I encountered some timeout issues, but after I set the connection, read and write timeout to a bigger value, this exception has been resolved. However, with the same specification, I'm getting java.net.ConnectException: Failed to connect to my_host_ip:8080 exception. And I do not know its root cause. I'm new to multithreading and distributed system, can anyone please give me some insights of this exception?
Below is some screenshot of from my ec2:
1. Client:
2. Server:
Having gone through similar exercise in past I can say that there is no definitive answer to the problem of scaling.
Here are some general trouble shooting steps that may lead to more specific information. I would suggest trying out tests by tweaking a few parameters in each test and measure the changes in Cpu, logs etc.
Please provide what value you have put for the timeout. Increasing timeout could cause your server (or client) to run out of threads quickly (cause each thread can process for longer). Question the need for increasing timeout. Is there any processing that slows your server?
Check application logs, JVM usage, memory usage on the client and Server. There will be some hints there.
Your client seems to be hitting 99%+ and then come down. This implies that there could be a problem at the client side in that it maxes out during the test. Your might want to resize your client to be able to do more.
Look at open file handles. The number should be sufficiently high.
Tomcat has some limit on thread count to handle load. You can check this in server.xml and if required change it to handle more. Although cpu doesn't actually max out on server side so unlikely that this is the problem.
If you a database then check the performance of the database. Also check jdbc connect settings. There is thread and timeout config at jdbc level as well.
Is response compression set up on the Tomcat? It will give much better throughout on server especially if the data being sent back by each request is more than a few kbs.
--------Update----------
Based on update on question few more thoughts.
Since the application is fairly simple, the path in terms of stressing the server should be to start low and increase load in increments whilst monitoring various things (cpu, memory, JVM usage, file handle count, network i/o).
The increments of load should be spread over several runs.
Start with something as low as 100 parallel threads.
Record as much information as you can after each run and if the server holds up well, increase load.
Suggested increments 100, 200, 500, 1000, 1500, 2000, 2500, 3000.
At some level you will see that the server can no longer take it. That would be your breaking point.
As you increase load and monitor you will likely discover patterns that suggest tuning of specific parameters. Each tuning attempt should then be tested again the same level of multi threading. The improvement of available will be obvious from the monitoring.
I am writing a benchmarking tool to run against a web application. The problem that I am facing is that the first request to the server always takes significantly longer than subsequent requests.
I have experienced this problem with the apache http client 3.x, 4.x and the Google http client. The apache http client 4.x shows the biggest difference (first request takes about seven times longer than subsequent ones. For Google and 3.x it is about 3 times longer.
My tool has to be able to benchmark simultaneous requests with threads. I can not use one instance of e.g. HttpClient and call it from all the threads, since this throws a Concurrent exception. Therefore, I have to use an individual instance in each thread, which will only execute a single request. This changes the overall results dramatically.
I do not understand this behavior. I do not think that it is due to a caching mechanism on the server because a) the webapp under consideration does not employ any caching (to my knowledge) and b) this effect is also visible when first requesting www.hostxy.com and afterwards www.hostxy.com/my/webapp.
I use System.nanoTime() immediately before and after calling client.execute(get) or get.execute(), respectively.
Does anyone have an idea where this behavior stems from? Do these httpclients themselves do any caching? I would be very grateful for any hints.
Read this: http://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html for connection pooling.
Your first connection probably takes the longest, because its a Connect: keep-alive connection, thus following connections can reuse that connection, once it has been established. This is justa guess
Are you hitting a JSP for the first time after server start? If the server flushes it's working directory on each start, then the first hit the JSPs compile and it takes a long time.
Also done on the first transaction: If the transaction uses a ca cert trust store, it will be loaded and cached.
You better once see it about caching
http://hc.apache.org/httpcomponents-client-ga/tutorial/html/caching.html
If your problem is that "first http request to specific host significantly slower", maybe the cause of this symptom is on the server, while you are concerned about the client.
If the "specific host" that you are calling is an Google App Engine application (or any other Cloud Plattform), it is normal that the first call to that application make you wait a little more. That is because Google put it under a dormant state upon inactivity.
After a recent call (that usually takes longer to respond), the subsequent calls have faster responses, because the server instances are all awake.
Take a look on this:
Google App Engine Application Extremely slow
I hope it helps you, good luck!
We're using Glassfish 3.0.1 and experiencing very long response times; in the order of 5 minutes for 25% of our POST/PUT requests, by the time the response comes back the front facing load balancer has timed out.
My theory is that the requests are queuing up and waiting for an available thread.
The reason I think this is because the access logs reveal that the requests are taking a few seconds to complete however the time at which the requests are being executed are five minutes later than I'd expect.
Does anyone have any advice for debugging what is going on with the thread pools? or what the optimum settings should be for them?
Is it required to do a thread dump periodically or will a one off dump be sufficient?
At first glance, this seems to have very little to do with the threadpools themselves. Without knowing much about the rest of your network setup, here are some things I would check:
Is there a dead/nonresponsive node in the load balancer pool? This can cause all requests to be tried against this node until they fail due to timeout before being redirected to the other node.
Is there some issue with initial connections between the load balancer and the Glassfish server? This can be slow or incorrect DNS lookups (though the server should cache results), a missing proxy, or some other network-related problem.
Have you checked that the clocks are synchronized between the machines? This could cause the logs to get out of sync. 5min is a pretty strange timeout period.
If all these come up empty, you may simply have an impedance mismatch between the load balancer and the web server and you may need to add webservers to handle the load. The load balancer should be able to give you plenty of stats on the traffic coming in and how it's stacking up.
Usually you get this behaviour if you configured not enough worker threads in your server. Default values range from 15 to 100 threads in common webservers. However if your application blocks the server's worker threads (e.g. by waiting for queries) the defaults are way too low frequently.
You can increase the number of workers up to 1000 without problems (assure 64 bit). Also check the number of workerthreads (sometimes referred to as 'max concurrent/open requests') of any in-between server (e.g. a proxy or an apache forwarding via mod_proxy).
Another common pitfall is your software sending requests to itself (e.g. trying to reroute or forward a request) while blocking an incoming request.
Taking threaddump is the best way to debug what is going on with the threadpools. Please take 3-4 threaddumps one after another with 1-2 seconds gap between each threaddump.
From threaddump, you can find the number of worker threads by their name. Find out long running threads from the multiple threaddumps.
You may use TDA tool (http://java.net/projects/tda/downloads/download/tda-bin-2.2.zip) for analyzing threaddumps.
Two questions about EC2 ELB:
First is how to properly run JMeter tests. I've found the following http://osdir.com/ml/jmeter-user.jakarta.apache.org/2010-04/msg00203.html, which basically says to set -Dsun.net.inetaddr.ttl=0 when starting JMeter (which is easy) and the second point it makes is that the routing is per ip not per request. So aside from starting a farm of jmeter instances I don't see how to get around that. Any ideas are welcome, or possibly I'm mis-reading the explanation(?)
Also, I have a web service that is making a server side call to another web service in java (and both behind ELB), so I'm using HttpClient and it's MultiThreadedHttpConnectionManager, where I provide some large-ish routes to host value in the connection manager. And I'm wondering if that will break the load balancing behavior ELB because the connections are cached (and also, that the requests all originate from the same machine). I can switch to use a new HttpClient each time (kind of lame) but that doesn't get around the fact that all requests are originating from a small number of hosts.
Backstory: I'm in the process of perf testing a service using ELB on EC2 and the traffic is not distributing evenly (most traffic to 1-2 nodes, almost no traffic to 1 node, no traffic at all to a 4th node). And so the issues above are the possible culprits I've identified.
I have had very simular problems. One thing is the ELB does not scale well under burst load. So when you are trying to test it, it is not scaling up immediately. It takes a lot of time for it to move up. Another thing that is a drawback is the fact that it uses a CNAME as the DNS look up. This alone is going to slow you down. There are more performance issues you can research.
My recommendation is to use haproxy. You have much more control, and you will like the performance. I have been very happy with it. I use heartbeat to setup a redundant server and I am good to go.
Also if you plan on doing SSL with the ELB, you will suffer more because I found the performance to be below par.
I hope that helps some. When it comes down to it, AWS has told me personally that load testing the ELB does not really work, and if you are planning on launching with a large amount of load, you need to tell them so they can scale you up ahead of time.
You don't say how many jmeter instances you're running, but in my experience it should be around 2x the number of AZs you're scaling across. Even then, you will probably see unbalanced loads - it is very unusual to see the load scaled exactly across your back-end fleet.
You can help (a bit) by running your jmeter instances in different regions.
Another factor is the duration of your test. ELBs do take some time to scale up - you can generally tell how many instances are running by doing an nslookup against the ELB name. Understand your scaling patterns, and build tests around them. (So if it takes 20 minutes to add another instance to the ELB pool, include a 25-30 minute warm-up to your test.) You also get AWS to "pre-warm" the ELB pool if necessary.
If your ELB pool size is sufficient for your test, and can verify that the pool does not change during a test run, you can always try running your tests directly against the ELB IPs - i.e. manually balancing the traffic.
I'm not sure what you expect to happen with the 2nd tier of calls - if you're opening a connection, and re-using it, there's obviously no way to have that scaled across instances without closing & re-opening the connection. Are these calls running on the same set of servers, or a different set? You can create an internal ELB, and use that endpoint to connect to, but I'm not sure that would help in the scenario you've described.
My web app is running on 64-bit Java 6.0.23, Tomcat 6.0.29 (with Apache Portable Runtime 1.4.2), on Linux (CentOS). Tomcat's JAVA_OPTS includes -Xincgc, which is supposed to help prevent long garbage collections.
The app is under heavy load and has intermittent failures, and I'd like to troubleshoot it.
Here is the symptom: Very intermittently, an HTTP client will send an HTTP request to the web app and get an empty response back.
The app doesn't use a database, so it's definitely not a problem with JDBC connections. So I figure the problem is perhaps one of: memory (perhaps long garbage collections), out of threads, or out of file descriptors.
I used javamelody to view the number of threads that are being used, and it seems that maxThreads is set high enough to not be running out of threads. Similarly, we have the number of available of file descriptors set to a very high number.
The app does use a lot of memory. Does it seem like memory is probably the culprit here, or is there something else that I might be overlooking?
I guess my main confusion, though, is why garbage collections would cause HTTP requests to fail. Intuitively, I would guess that a long garbage collection might cause an HTTP request to take a long time to run, but I would not guess that a long garbage collection would cause an HTTP request to fail.
Additional info in response to Jon Skeet's comments...
The client is definitely not timing out. The empty response happens fairly quickly. When it fails, there is no data and no HTTP headers.
I very much doubt that garbage collection is responsible for the issue.
You really really need to find out exactly what this "empty response" consists of:
Does the server just chop the connection?
Does the client perhaps time out?
Does the server give a valid HTTP response but with no data?
Each of these could suggest very different ways of finding out what's going on. Determining the failure mode should be your primary concern, IMO. Until you know that, it's complete guesswork.