I have a Java + Spring app that queries Elasticsearch using the Jest client (a poor choice, because it is poorly documented). Elasticsearch has response times of about 8-20 ms with 150 concurrent connections, but my app's response times go up to 900-1500 ms. A quick look at VisualVM tells me that processor usage is below 10%, and profiling shows that 98% of the time the app is just waiting on the following method
that is part of Apache HttpCore, a dependency of Jest. I am not limited by the number of threads that can run on Tomcat (the max is 200, and VisualVM says that the maximum number of threads during the experiment was 174), so it is not waiting for free threads.
I think that the latency increase is excessive and I suspect that Jest is using an internal threadpool that has not enough threads to cope with all the requests, but I don't know.
I think that the latency increase is excessive and I suspect that Jest is using an internal threadpool that has not enough threads to cope with all the requests...
In poking around the source real fast I see that you should be able to inject a ClientConfig into the Jest client factory.
The ClientConfig has the following setters which seem to impact the internal Apache http client connection manager:
Maybe tweaking some of those will give you more connections? Take a look at the JestClientFactory source to see what it is doing. We've definitely had to tweak those values in the past when making a large number of connections to the same server using HttpClient.
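To get a feel for how much latency a small connection pool alone can add, here is a self-contained sketch that uses a plain java.util.concurrent.Semaphore as a stand-in for the HTTP connection pool. It does not use Jest or HttpClient at all; PoolWaitDemo, the pool size of 2 and the 15 ms service time are assumptions picked for illustration, loosely based on the numbers in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class PoolWaitDemo {

    /** Sends `callers` simulated requests at once through `poolSize` permits; returns the worst latency in ms. */
    static long worstLatencyMillis(int callers, int poolSize, long serviceMillis) {
        Semaphore pool = new Semaphore(poolSize, true);   // stand-in for the connection pool
        ExecutorService executor = Executors.newFixedThreadPool(callers);
        Callable<Long> request = () -> {
            long start = System.nanoTime();
            pool.acquire();                               // blocks while every "connection" is leased
            try {
                Thread.sleep(serviceMillis);              // simulated backend round trip
            } finally {
                pool.release();
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        };
        try {
            List<Future<Long>> latencies = new ArrayList<>();
            for (int i = 0; i < callers; i++) {
                latencies.add(executor.submit(request));
            }
            long worst = 0;
            for (Future<Long> latency : latencies) {
                worst = Math.max(worst, latency.get());
            }
            return worst;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("simulation failed", e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // Assumed figures: 150 concurrent callers, 15 ms per backend call.
        System.out.println("pool of 2:   " + worstLatencyMillis(150, 2, 15) + " ms worst case");
        System.out.println("pool of 150: " + worstLatencyMillis(150, 150, 15) + " ms worst case");
    }
}
```

With only 2 permits, the 150 callers go through in about 75 rounds of 15 ms each, so the slowest caller waits on the order of a second even though each backend call is fast and the CPU is idle. With enough permits the worst case stays close to the service time.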
I would test this with just one connection and see what the average response time is. With just one thread you should have more than enough threads, resources, etc. Most likely the process is waiting on an external resource like a database or a network service.
My goal is to learn what factors could overwhelm my little tomcat server. And when some exception happens, what I could do to resolve or remediate it without switching my server to a better machine. This is not a real app in a production environment but just my own experiment (Besides some changes on the server-side, I may also do something on my client-side)
Both my client and server are very simple: the server only checks the URL format and sends a 201 status code if it is correct. Each request sent from my client only includes a simple JSON body. There is no database involved. The two machines (t2-micro) run only the client and the server respectively.
My client is OkHttpClient(). To avoid timeout exceptions, I already set the timeouts to 1,000,000 milliseconds via setConnectTimeout, setReadTimeout, and setWriteTimeout. I also edited $CATALINA/conf/server.xml on my server and set connectionTimeout = "-1" (infinite).
I'm trying to stress my server by having a client launch 3000+ threads that send HTTP requests to it. The client and the server run on different EC2 instances.
Initially, I encountered some timeout issues, but after I set the connect, read and write timeouts to bigger values, that exception was resolved. However, with the same setup, I'm now getting a java.net.ConnectException: Failed to connect to my_host_ip:8080 exception, and I do not know its root cause. I'm new to multithreading and distributed systems; can anyone please give me some insight into this exception?
Below are some screenshots from my EC2 instances:
1. Client:
2. Server:
Having gone through a similar exercise in the past, I can say that there is no definitive answer to the problem of scaling.
Here are some general troubleshooting steps that may lead to more specific information. I would suggest running tests in which you tweak a few parameters at a time and measure the changes in CPU, logs, etc.
Please state what value you have set for the timeout. Increasing the timeout could cause your server (or client) to run out of threads quickly (because each thread can stay busy for longer). Question the need for increasing the timeout. Is there any processing that slows your server?
Check application logs, JVM usage, memory usage on the client and Server. There will be some hints there.
Your client seems to be hitting 99%+ and then coming down. This implies that there could be a problem on the client side, in that it maxes out during the test. You might want to resize your client so that it can do more.
Look at open file handles. The number should be sufficiently high.
Tomcat has a limit on the number of threads it uses to handle load. You can check this in server.xml and, if required, change it to handle more. However, the CPU doesn't actually max out on the server side, so it is unlikely that this is the problem.
If you use a database, then check the performance of the database. Also check the JDBC connection settings. There is thread and timeout configuration at the JDBC level as well.
Is response compression set up on the Tomcat? It will give much better throughput on the server, especially if the data being sent back by each request is more than a few KB.
Based on the update to the question, a few more thoughts.
Since the application is fairly simple, the path in terms of stressing the server should be to start low and increase load in increments whilst monitoring various things (cpu, memory, JVM usage, file handle count, network i/o).
The increments of load should be spread over several runs.
Start with something as low as 100 parallel threads.
Record as much information as you can after each run and if the server holds up well, increase load.
Suggested increments 100, 200, 500, 1000, 1500, 2000, 2500, 3000.
At some level you will see that the server can no longer take it. That would be your breaking point.
As you increase load and monitor, you will likely discover patterns that suggest tuning of specific parameters. Each tuning attempt should then be tested against the same level of multithreading. Any improvement will be obvious from the monitoring.
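The stepped approach above could be driven by something like the following sketch. SteppedLoad, StepResult and runStep are names made up for this example, and the request is passed in as a Callable so that you can plug in your own HTTP call (returning true on a 201, for instance). The placeholder in main only sleeps, so it never fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SteppedLoad {

    /** Outcome of one load step. */
    static final class StepResult {
        final int threads;
        final int failures;
        final long elapsedMillis;

        StepResult(int threads, int failures, long elapsedMillis) {
            this.threads = threads;
            this.failures = failures;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Fires `threads` parallel requests and counts how many returned false or threw. */
    static StepResult runStep(int threads, Callable<Boolean> request) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        int failures = 0;
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(executor.submit(request));
            }
            for (Future<Boolean> future : futures) {
                try {
                    if (!future.get()) {
                        failures++;
                    }
                } catch (ExecutionException e) {
                    failures++;                           // e.g. a ConnectException from the request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    failures++;
                }
            }
        } finally {
            executor.shutdown();
        }
        return new StepResult(threads, failures, (System.nanoTime() - start) / 1_000_000);
    }

    public static void main(String[] args) {
        int[] steps = {100, 200, 500, 1000, 1500, 2000, 2500, 3000};
        // Placeholder request: replace with a real HTTP call against your server.
        Callable<Boolean> request = () -> {
            Thread.sleep(5);
            return true;
        };
        for (int threads : steps) {
            StepResult result = runStep(threads, request);
            System.out.println(result.threads + " threads: " + result.failures
                    + " failures in " + result.elapsedMillis + " ms");
            if (result.failures > 0) {
                break;                                    // first failing level: record it and stop
            }
        }
    }
}
```

Record the server-side metrics after each printed line. The first step that reports failures is the level to re-test after every tuning change.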
I am working on an enterprise Java application which already has a lot of tools/frameworks in it, such as Struts, JAX-RS and Spring MVC. It contains UIs and REST endpoints bundled together in a .war file.
The project is evolving and we are getting rid of older tools, aiming to stick with only Spring MVC/WebFlux.
The application searches millions of XML/JSON records, and recently the search engine was switched from MarkLogic to Elasticsearch.
What we have noticed is that in production, under fairly light usage (up to 1.7k rpm on 2-4 application nodes), response times on some of the endpoints increase over time.
Elasticsearch has room to grow and does not show any signs of heavy load.
So currently we have to restart/replace prod instances once every week or two, when the average response time is over 3 seconds instead of the regular 200-300 ms.
I tried to get CPU and heap flame graphs using async-profiler, but the load profile changes on every measurement because we have a bunch of features available, so I cannot really compare how the graphs change over time.
Can you advise me on some tactics/approaches for finding the problematic place in the code?
Found the issue. It is related to connection pooling in the HTTP client.
What we have noticed is that over time the number of active Tomcat threads was growing together with response times:
In the image you can also see that the server was restarted on May 9th.
I was able to get a heap dump before the server restarted, and after some digging I found an interesting repeated piece in the thread dump:
Thread xxx
at sun.misc.Unsafe.park(ZJ)V (Native Method)
at java.util.concurrent.locks.LockSupport.park(Ljava/lang/Object;)V (LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()V (AbstractQueuedSynchronizer.java:2039)
at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(Ljava/lang/Object;Ljava/lang/Object;JLjava/util/concurrent/TimeUnit;Ljava/util/concurrent/Future;)Lorg/apache/http/pool/PoolEntry; (AbstractConnPool.java:377)
at org.apache.http.pool.AbstractConnPool.access$200(Lorg/apache/http/pool/AbstractConnPool;Ljava/lang/Object;Ljava/lang/Object;JLjava/util/concurrent/TimeUnit;Ljava/util/concurrent/Future;)Lorg/apache/http/pool/PoolEntry; (AbstractConnPool.java:67)
at org.apache.http.pool.AbstractConnPool$2.get(JLjava/util/concurrent/TimeUnit;)Lorg/apache/http/pool/PoolEntry; (AbstractConnPool.java:243)
at org.apache.http.pool.AbstractConnPool$2.get(JLjava/util/concurrent/TimeUnit;)Ljava/lang/Object; (AbstractConnPool.java:191)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(Ljava/util/concurrent/Future;JLjava/util/concurrent/TimeUnit;)Lorg/apache/http/HttpClientConnection; (PoolingHttpClientConnectionManager.java:282)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(JLjava/util/concurrent/TimeUnit;)Lorg/apache/http/HttpClientConnection; (PoolingHttpClientConnectionManager.java:269)
at org.apache.http.impl.execchain.MainClientExec.execute(Lorg/apache/http/conn/routing/HttpRoute;Lorg/apache/http/client/methods/HttpRequestWrapper;Lorg/apache/http/client/protocol/HttpClientContext;Lorg/apache/http/client/methods/HttpExecutionAware;)Lorg/apache/http/client/methods/CloseableHttpResponse; (MainClientExec.java:191)
at org.apache.http.impl.execchain.ProtocolExec.execute(Lorg/apache/http/conn/routing/HttpRoute;Lorg/apache/http/client/methods/HttpRequestWrapper;Lorg/apache/http/client/protocol/HttpClientContext;Lorg/apache/http/client/methods/HttpExecutionAware;)Lorg/apache/http/client/methods/CloseableHttpResponse; (ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(Lorg/apache/http/conn/routing/HttpRoute;Lorg/apache/http/client/methods/HttpRequestWrapper;Lorg/apache/http/client/protocol/HttpClientContext;Lorg/apache/http/client/methods/HttpExecutionAware;)Lorg/apache/http/client/methods/CloseableHttpResponse; (RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(Lorg/apache/http/conn/routing/HttpRoute;Lorg/apache/http/client/methods/HttpRequestWrapper;Lorg/apache/http/client/protocol/HttpClientContext;Lorg/apache/http/client/methods/HttpExecutionAware;)Lorg/apache/http/client/methods/CloseableHttpResponse; (RedirectExec.java:111)
at org.apache.http.impl.client.InternalHttpClient.doExecute(Lorg/apache/http/HttpHost;Lorg/apache/http/HttpRequest;Lorg/apache/http/protocol/HttpContext;)Lorg/apache/http/client/methods/CloseableHttpResponse; (InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(Lorg/apache/http/client/methods/HttpUriRequest;Lorg/apache/http/protocol/HttpContext;)Lorg/apache/http/client/methods/CloseableHttpResponse; (CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(Lorg/apache/http/client/methods/HttpUriRequest;)Lorg/apache/http/client/methods/CloseableHttpResponse; (CloseableHttpClient.java:108)
at io.searchbox.client.http.JestHttpClient.executeRequest(Lorg/apache/http/client/methods/HttpUriRequest;)Lorg/apache/http/client/methods/CloseableHttpResponse; (JestHttpClient.java:136)
at io.searchbox.client.http.JestHttpClient.execute(Lio/searchbox/action/Action;Lorg/apache/http/client/config/RequestConfig;)Lio/searchbox/client/JestResult; (JestHttpClient.java:70)
at io.searchbox.client.http.JestHttpClient.execute(Lio/searchbox/action/Action;)Lio/searchbox/client/JestResult; (JestHttpClient.java:63)
In our case we are using Jest library to talk to Elasticsearch.
Internally it is using Apache HTTP client & Apache HTTP Async Client.
As you can see in the thread dump, this thread was waiting for an available connection in the HTTP client's connection pool (AbstractConnPool.getPoolEntryBlocking). And there were more threads with exactly the same stack.
What I also discovered is that we set maxTotal (maximum total number of connections) to 20 and defaultMaxPerRoute (maximum connections per route) to 2:
By default the pool allows only 20 concurrent connections in total and two concurrent connections per a unique route. The two connection limit is due to the requirements of the HTTP specification. However, in practical terms this can often be too restrictive.
See Connection pools description.
So the fix was to increase those values to 50 and 40 respectively.
I would still prefer to have these parameters unbounded so that they grow with usage, but for now we are sticking to these values.
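A rough sanity check on why such small limits hurt, using Little's law (connections in use = arrival rate × time each request holds a connection). The 250 ms hold time and the 14 requests per second below are assumed figures for illustration, not measurements from our system, and PoolCapacity is just a name for this sketch.

```java
final class PoolCapacity {

    /** Upper bound on requests per second that one route can sustain before callers start queueing. */
    static double maxThroughputPerSecond(int maxConnectionsPerRoute, double holdTimeMillis) {
        return maxConnectionsPerRoute * 1000.0 / holdTimeMillis;
    }

    /** Connections needed to serve the given arrival rate without queueing (Little's law, rounded up). */
    static int connectionsNeeded(double requestsPerSecond, double holdTimeMillis) {
        return (int) Math.ceil(requestsPerSecond * holdTimeMillis / 1000.0);
    }

    public static void main(String[] args) {
        double holdMillis = 250.0;                                      // assumed time a request keeps a connection
        System.out.println(maxThroughputPerSecond(2, holdMillis));      // prints 8.0
        System.out.println(maxThroughputPerSecond(40, holdMillis));     // prints 160.0
        System.out.println(connectionsNeeded(14.0, holdMillis));        // prints 4
    }
}
```

Under these assumptions, two connections per route top out at about 8 requests per second, so anything above that piles up in getPoolEntryBlocking. The estimate ignores bursts, so the real limit should leave generous headroom above the computed minimum.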
I have a very simple Java REST service. At lower traffic volumes, the service runs perfectly with response times of ~1ms and zero server backlog.
When traffic rises past a certain threshold the response times skyrocket from 1ms to 2.0 seconds, the http active session queue and open file counts spike, and the server is performing unacceptably. I posted a metrics graph of a typical six hour window where traffic starts low and goes above the problem threshold.
Any ideas on what could be causing this or how to diagnose further?
Your webapp uses a thread (borrowed from a thread pool) to serve each request.
Under heavy load many threads are created, and if the number of requests exceeds the capacity of the pool, requests have to queue until a thread becomes available again.
If your service does not run fast enough (especially if you're doing IO, such as opening files), the wait time increases, leading to slow responses.
The CPU has to switch between many threads, hence the CPU will spike under load.
That's why people use load balancing with several instances of the webapp serving as one service. The load is distributed across the instances, which improves the end user experience.
The usual approach to diagnosis is to create load with JMeter and investigate the results with Java VisualVM, Eclipse Memory Analyzer and so on. I don't know whether you have tried it.
I need to pull data from a lot of clients connecting to a java server through a web socket.
There are a lot of web socket implementations, and I picked vert.x.
I made a simple demo where I listen for text frames of JSON, parse them with Jackson and send a response back. The JSON parser doesn't significantly affect the throughput.
I am getting an overall rate of about 2.5k per second with 2 or 10 clients.
Then I tried buffering, so that clients don't wait for every single response but send a batch of messages (30k - 90k) after a confirmation from the server; the rate increased to 8k per second.
I see that the Java process has a CPU bottleneck: one core is at 100%.
Meanwhile, the Node.js client's CPU consumption is only 5%.
Even one client causes the server to eat almost a whole core.
Do you think it's worth trying other websocket implementations like Jetty?
Is there a way to scale vert.x across multiple cores?
After I changed the log level from debug to info, I get 70k. The debug level causes vert.x to print a message for every frame.
It's possible to specify the number of verticle (thread) instances, e.g. by configuring DeploymentOptions: http://vertx.io/docs/vertx-core/java/#_specifying_number_of_verticle_instances
You were able to create more than 60k connections on a single machine, so I assume the average duration of a connection was less than a second. Is that what you expect in production? To compare other solutions you can try to run https://github.com/smallnest/C1000K-Servers
Something doesn't sound right. That's very low performance. Sounds like vert.x is not configured properly for parallelization. Are you using only one verticle (thread)?
The Kaazing Gateway is one of the first WS implementations. By default, it uses multiple cores and is further configurable with threads, etc. We have users using it for massive IoT deployments so your issue is not the JVM.
In case you're interested, here's the github repo: https://github.com/kaazing/gateway
Full disclosure: I work for Kaazing
I created a web service, both client and server, and wanted to performance test it. I tried JMeter with a sample test plan. JBoss processed up to 3000 requests, but with more than 3000 requests some of them are not processed (they fail with Can't open connection: Connection refused). Where do I have to make changes to handle more than 10000 requests at the same time? Is it a JBoss issue or a system throughput limit?
JMeter config: 300 threads, 1 second ramp-up and a loop count of 10.
System (Server Config) : Windows 7, 4G RAM
Where do I have to make changes to handle more than 10000 requests at the same time
10 thousand concurrent requests in Tomcat (which I believe is used in JBoss) is quite a lot. In a typical setup (with the blocking IO connector) you need one thread per HTTP connection. This is way too much for an ordinary JVM. On a 64-bit server machine each thread needs 1 MiB of stack (check out the -Xss parameter). And you only have 4 GiB.
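The arithmetic behind that, as a small sketch. It assumes the 1 MiB stack size mentioned above and counts reserved thread stack space only (heap and everything else come on top, and the OS may not commit all of it up front). ThreadStackBudget is just a name for this example.

```java
final class ThreadStackBudget {

    /** Stack space reserved by `threads` threads, in MiB, for a given per-thread stack size in KiB. */
    static long stackMiB(int threads, long stackSizeKiB) {
        return threads * stackSizeKiB / 1024;
    }

    public static void main(String[] args) {
        long ramMiB = 4L * 1024;                                  // the 4 GiB machine from the question
        long needed = stackMiB(10_000, 1024);                     // 10,000 threads at 1 MiB each
        System.out.println(needed + " MiB of stack vs " + ramMiB + " MiB of RAM");
        System.out.println(stackMiB(200, 1024) + " MiB for 200 threads");
    }
}
```

Ten thousand threads would reserve about 10,000 MiB for stacks alone, more than twice the machine's RAM, while 200 threads need only about 200 MiB.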
Moreover, the number of context switches will kill your performance. You would need hundreds of cores to effectively handle all these connections. And if your request is I/O or database bound - you'll see a bottleneck elsewhere.
That being said you need a different approach. Either try out non-blocking I/O or asynchronous servlets (since 3.0) or... scale out. By default Tomcat can handle 100-200 concurrent connections (reasonable default) and a similar amount of connections are queued. Everything above that is rejected and you are probably experiencing that.
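The "fixed number of workers, bounded backlog, reject the rest" behaviour can be reproduced with a plain ThreadPoolExecutor. This is only an analogy and not how Tomcat is implemented internally (there the backlog is the socket accept queue); BoundedAcceptDemo and the 200/200 figures are assumptions for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedAcceptDemo {

    /** Submits `requests` long-running tasks to `workers` threads with a queue of `backlog`; returns the rejected count. */
    static int countRejected(int workers, int backlog, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(backlog));       // default handler (AbortPolicy) throws on overflow
        CountDownLatch release = new CountDownLatch(1);   // keeps every worker busy until we are done submitting
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();              // simulated slow request
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                           // the analogue of "Connection refused"
                }
            }
        } finally {
            release.countDown();                          // let the accepted requests finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 200 workers + 200 queued; everything beyond that is turned away.
        System.out.println(countRejected(200, 200, 3000)); // prints 2600
    }
}
```

As long as requests arrive faster than they complete, everything beyond workers + backlog is refused, which matches the symptom of errors appearing only above a certain request count.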
See also
Advanced IO and Tomcat
Asynchronous Support in Servlet 3.0
There are two common problems that I can think of.
First, if you run JBoss on Linux as a normal user, you can run into 'Too many open files', if you did not edit the limits.conf file. See https://community.jboss.org/thread/155699. Each open socket counts as an 'open file' for Linux, so the OS could block your connections because of this.
Second, the maximum threadpool size for incoming connections is 200 by default. This limits the number of concurrent requests, i.e. requests that are in progress at the same time. If you have jmeter doing 300 threads, the jboss connector threadpool should be larger. You can find this in jboss6 in the jboss-web.sar/server.xml. Look for 'maxThreads' in the element: http://docs.jboss.org/jbossweb/latest/config/http.html.
200 is the recommended maximum for a single core CPU. Above that, the context switches start to give too much overhead, like Tomasz says. So for production use, only increase to 400 on a dual core, 800 on a quad core, etc.