I created a web service, both client and server, and wanted to performance-test it. I ran JMeter with a sample test plan. JBoss processed up to 3000 requests, but beyond 3000 some requests are not processed (they fail with "Can't open connection: Connection refused"). What do I have to change to handle more than 10000 requests at the same time? Is this a JBoss issue or a limit of the system's throughput?
JMeter config: 300 threads, 1 second ramp-up, 10 loops.
System (server config): Windows 7, 4 GB RAM
What do I have to change to handle more than 10000 requests at the same time?
10 thousand concurrent requests in Tomcat (which I believe JBoss uses) is quite a lot. In a typical setup (with the blocking I/O connector) you need one thread per HTTP connection. That is far too many for an ordinary JVM: on a 64-bit server machine each thread needs about 1 MiB of stack (check out the -Xss parameter), so 10,000 threads would ask for roughly 10 GiB for stacks alone. And you only have 4 GiB.
Moreover, the number of context switches will kill your performance. You would need hundreds of cores to handle all these connections effectively. And if your requests are I/O-bound or database-bound, you'll hit a bottleneck elsewhere.
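To put the 1 MiB per thread figure above into numbers, here is a small back-of-the-envelope sketch. The class and method names are made up for illustration, and the result is an upper bound on reserved stack space: operating systems usually commit stack pages lazily, so real usage is typically lower.

```java
class ThreadStackEstimate {
    /** Stack space reserved for the given number of threads, in MiB. */
    static long stackMiB(int threads, int stackSizeKiB) {
        return (long) threads * stackSizeKiB / 1024;
    }

    public static void main(String[] args) {
        // Assumption: 1 MiB (1024 KiB) of stack per thread, the figure quoted above.
        long mib = stackMiB(10_000, 1024);
        System.out.printf("10000 threads x 1 MiB = %d MiB (about %.1f GiB)%n",
                mib, mib / 1024.0);
    }
}
```

With one thread per connection, 10,000 connections reserve about 10,000 MiB of stack, which is well over the 4 GiB of RAM on the test machine.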
That being said, you need a different approach: either try non-blocking I/O or asynchronous servlets (available since Servlet 3.0), or scale out. By default Tomcat can handle 100-200 concurrent connections (a reasonable default) and queues a similar number. Everything above that is rejected, which is probably what you are experiencing.
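The "fixed number of workers plus a bounded queue, everything else rejected" behaviour can be reproduced with a plain java.util.concurrent.ThreadPoolExecutor. This is only an analogy, not Tomcat's actual connector code (Tomcat's queue is the socket accept backlog), and the class and method names are invented for the sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolDemo {
    /** Submits `requests` tasks that all block, and returns how many were rejected. */
    static int countRejected(int maxThreads, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),      // bounded waiting room
                new ThreadPoolExecutor.AbortPolicy());    // reject when full
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                // Each "request" holds its worker until the latch is released,
                // like a slow request holding a connector thread.
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++;   // no free worker and no room in the queue
            }
        }
        release.countDown();  // let the accepted requests finish
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected=" + countRejected(200, 200, 3000));
    }
}
```

With 200 workers and a queue of 200, a burst of 3000 slow requests leaves 2600 rejected, which is the same shape as the "Connection refused" errors in the question.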
See also
Advanced IO and Tomcat
Asynchronous Support in Servlet 3.0
There are two common problems that I can think of.
First, if you run JBoss on Linux as a normal user, you can run into 'Too many open files', if you did not edit the limits.conf file. See https://community.jboss.org/thread/155699. Each open socket counts as an 'open file' for Linux, so the OS could block your connections because of this.
Second, the maximum thread pool size for incoming connections is 200 by default. This limits the number of concurrent requests, i.e. requests that are in progress at the same time. If you have JMeter running 300 threads, the JBoss connector thread pool needs to be larger than the default. In JBoss 6 you can find this in jboss-web.sar/server.xml. Look for 'maxThreads' in the Connector element: http://docs.jboss.org/jbossweb/latest/config/http.html.
200 is the recommended maximum for a single core CPU. Above that, the context switches start to give too much overhead, like Tomasz says. So for production use, only increase to 400 on a dual core, 800 on a quad core, etc.
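As a quick sketch, the rule of thumb above (200 threads per core) can be turned into a number for the machine at hand. The constant comes from this answer and is not a measured value, and the class name is hypothetical. Note that availableProcessors() counts logical processors, so hyper-threaded cores are counted twice.

```java
class ConnectorSizing {
    // Rule of thumb from the answer above, not a benchmark result.
    static final int THREADS_PER_CORE = 200;

    /** Suggested upper bound for maxThreads on a machine with `cores` cores. */
    static int suggestedMaxThreads(int cores) {
        return Math.max(1, cores) * THREADS_PER_CORE;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores=" + cores
                + " suggested maxThreads=" + suggestedMaxThreads(cores));
    }
}
```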
Related
We have IBM HTTP Server (IHS) in front of our app server.
We run around 50 concurrent requests during stress testing, but we get errors and cannot sustain 50 concurrent requests.
The server runs successfully with 40 concurrent requests. Memory and CPU utilization have not peaked, so there are enough resources to handle the requests.
However, we cannot find any request in the app server log, while the plugin logs do show the requests.
The following are the configurations.
From the IBM site, my understanding is that IHS can process 600 concurrent requests by default and that the app server has 50 threads to process requests by default.
From the plugin logs, I infer that the requests are processed by IHS, but that the threads available in the app server (50 by default) are not enough to process 50 concurrent requests. Is there a one-to-one mapping between threads and concurrent requests? Do we need to increase the maximum number of threads? Which parameters in IHS, the WebSphere app server and the plugin configuration need to be modified to increase the concurrent request handling of WebSphere 8.0?
httpd.conf:
# Windows MPM
ThreadLimit 2048
ThreadsPerChild 250
MaxRequestsPerChild 0
plugin.conf
<ServerCluster CloneSeparatorChange="false" GetDWLMTable="false" IgnoreAffinityRequests="true" LoadBalance="Round Robin" Name="xxxx" PostBufferSize="64" PostSizeLimit="-1" RemoveSpecialHeaders="true" RetryInterval="60" ServerIOTimeoutRetry="-1">
<Server ConnectTimeout="0" ExtendedHandshake="false" MaxConnections="-1" Name="xxxx" ServerIOTimeout="900" WaitForContinue="false">
<Transport Hostname="SA16" Port="9080" Protocol="http"/>
</Server>
</ServerCluster>
WebContainer Settings in app server
(Application servers > server1 > Thread pools > WebContainer)
Minimum thread - 50
Maximum thread - 50
You have 50 web container threads, so you can have roughly 50 threads in synchronous application code processing HTTP requests -- unless you are programming against async servlet APIs and using "other" threads (e.g. one of the executor APIs, or asynch beans API).
It should be trivial to start your load test then look at the performance monitor in the WAS console, or even more simply to look at thread activity in a javacore from a kill -3 on the appserver JVM.
It doesn't seem like you need any more scale in the web server tier. At that tier the best monitoring, proprietary to IHS, is the output of mod_mpmstats, which reports your web server thread usage.
I need to pull data from a lot of clients connecting to a java server through a web socket.
There are a lot of web socket implementations, and I picked vert.x.
I made a simple demo where I listen for text frames of JSON, parse them with Jackson and send a response back. The JSON parser doesn't significantly affect throughput.
I am getting an overall rate of about 2.5k messages per second with 2 or 10 clients.
Then I tried buffering: clients don't wait for every single response but send a batch of messages (30k - 90k) after a confirmation from the server. The rate increased to 8k per second.
I see that the Java process has a CPU bottleneck: one core is at 100%.
Meanwhile the Node.js client's CPU consumption is only 5%.
Even 1 client causes the server to eat almost a whole core.
Do you think it's worth trying other WebSocket implementations like Jetty?
Is there a way to scale vert.x across multiple cores?
After I changed the log level from debug to info I get 70k. At debug level vert.x prints a message for every frame.
It's possible to specify the number of verticle (thread) instances, e.g. by configuring DeploymentOptions: http://vertx.io/docs/vertx-core/java/#_specifying_number_of_verticle_instances
You were able to create more than 60k connections on a single machine, so I assume the average lifetime of a connection was less than a second. Is that what you expect in production? To compare other solutions you can try running https://github.com/smallnest/C1000K-Servers
Something doesn't sound right. That's very low performance. Sounds like vert.x is not configured properly for parallelization. Are you using only one verticle (thread)?
The Kaazing Gateway is one of the first WS implementations. By default, it uses multiple cores and is further configurable with threads, etc. We have users using it for massive IoT deployments so your issue is not the JVM.
In case you're interested, here's the github repo: https://github.com/kaazing/gateway
Full disclosure: I work for Kaazing
We're using Glassfish 3.0.1 and experiencing very long response times, on the order of 5 minutes for 25% of our POST/PUT requests. By the time the response comes back, the front-facing load balancer has timed out.
My theory is that the requests are queuing up and waiting for an available thread.
I think this because the access logs show that the requests take only a few seconds to complete, but they start executing five minutes later than I'd expect.
Does anyone have any advice for debugging what is going on with the thread pools? or what the optimum settings should be for them?
Is it required to do a thread dump periodically or will a one off dump be sufficient?
At first glance, this seems to have very little to do with the threadpools themselves. Without knowing much about the rest of your network setup, here are some things I would check:
Is there a dead/nonresponsive node in the load balancer pool? This can cause all requests to be tried against this node until they fail due to timeout before being redirected to the other node.
Is there some issue with initial connections between the load balancer and the Glassfish server? This can be slow or incorrect DNS lookups (though the server should cache results), a missing proxy, or some other network-related problem.
Have you checked that the clocks are synchronized between the machines? This could cause the logs to get out of sync. 5min is a pretty strange timeout period.
If all these come up empty, you may simply have an impedance mismatch between the load balancer and the web server and you may need to add webservers to handle the load. The load balancer should be able to give you plenty of stats on the traffic coming in and how it's stacking up.
Usually you get this behaviour if you have not configured enough worker threads in your server. Default values range from 15 to 100 threads in common web servers. However, if your application blocks the server's worker threads (e.g. by waiting for queries), the defaults are frequently far too low.
You can increase the number of workers up to 1000 without problems (make sure you are on 64-bit). Also check the number of worker threads (sometimes referred to as 'max concurrent/open requests') of any in-between server (e.g. a proxy or an Apache forwarding via mod_proxy).
Another common pitfall is your software sending requests to itself (e.g. trying to reroute or forward a request) while blocking an incoming request.
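To tell "waiting for a worker" apart from "slow to execute", it can help to measure the time a request spends in the queue separately from the time it spends running. A minimal sketch, using a plain ExecutorService as a stand-in for the server's worker pool (this is not GlassFish's pool, and the class and method names are made up for illustration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class QueueWaitDemo {
    /**
     * Runs `tasks` jobs of `workMillis` each on `threads` workers and returns
     * the longest time (in ms) any job sat in the queue before it started.
     */
    static long maxQueueWaitMillis(int threads, int tasks, long workMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong maxWait = new AtomicLong();
        for (int i = 0; i < tasks; i++) {
            long enqueued = System.nanoTime();          // when the "request" arrived
            pool.execute(() -> {
                long waited = (System.nanoTime() - enqueued) / 1_000_000;
                maxWait.accumulateAndGet(waited, Math::max);
                try {
                    Thread.sleep(workMillis);           // simulated request work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return maxWait.get();
    }

    public static void main(String[] args) {
        // Same work per request, different pool sizes.
        System.out.println("1 worker:  max wait ms = " + maxQueueWaitMillis(1, 5, 100));
        System.out.println("5 workers: max wait ms = " + maxQueueWaitMillis(5, 5, 100));
    }
}
```

Each job takes the same time to execute in both runs; only the queue wait grows when the pool is too small, which matches access logs that show short execution times but late start times.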
Taking thread dumps is the best way to debug what is going on with the thread pools. Take 3-4 thread dumps one after another, with a gap of 1-2 seconds between them.
From a thread dump you can find the number of worker threads by their names. Compare the dumps to find long-running threads.
You can use the TDA tool (http://java.net/projects/tda/downloads/download/tda-bin-2.2.zip) to analyze thread dumps.
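The "count worker threads by name" step can also be sketched in code. The snippet below only sees threads of the JVM it runs in, so it would have to run inside the server (for example from a diagnostic servlet); for an external process a tool such as jstack produces the dump instead. The default prefix "http-thread-pool" is an assumption about how the worker threads are named; check a real dump for the actual names.

```java
import java.util.EnumMap;
import java.util.Map;

class ThreadNameCensus {
    /** Counts live threads whose name starts with `prefix`, grouped by thread state. */
    static Map<Thread.State, Integer> statesByPrefix(String prefix) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(prefix)) {
                counts.merge(t.getState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed prefix; pass the real one as the first argument.
        String prefix = args.length > 0 ? args[0] : "http-thread-pool";
        System.out.println(prefix + " -> " + statesByPrefix(prefix));
    }
}
```

If nearly all workers show up as RUNNABLE, BLOCKED or TIMED_WAITING inside application code across several samples, the pool is saturated; many workers in WAITING on the pool's own queue means there are idle threads.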
When I send about 100 users to my web service, I get responses and the web service performs fine, but when I test with 1000 concurrent users none of the requests get a reply.
I am using JMeter for testing.
When I send 1000 concurrent users, my Glassfish admin panel times out in the browser and only opens after 4-5 minutes. The same happens for the WSDL URL.
I have tested my web service on our LAN and it works for 2000 queries without any issues.
Please help me find a solution.
Edit 1.0
Some more findings
On your recommendation, I simply returned a string from the web service function call: no lookup, no DAO, nothing, just returning a string.
The thread pool is 2000, so no issues there.
Now when I ran JMeter with 1000 users, it ran much faster and responses came back for ~200 requests.
So this means that my PC running Windows 7 with an i5 processor and 4 GB RAM is outperforming a dedicated HostGator server with 4 GB RAM and an 8-core Xeon 5*** :(
This is not what I am paying $220 a month for....
Correct me if my finding is wrong: I tested my app on the LAN between two PCs, and it can process 2000+ messages smoothly.
Edit 1.1
After a lot of reading and experiments, I have come to the conclusion that network latency is responsible for this behavior.
I increased the bean pool size in Glassfish's admin panel, and that raised the number of concurrent users to 300, but the issue comes back no matter how many beans I keep in the pool.
So, friends, the question is: please suggest other settings I can change in Glassfish's admin panel to fix this issue at its root!
You need to add some performance logging for the various steps that your service performs. Does it do multiple steps? Is computation slow? Is database access slow? Does your connection pool not scale well? Do things need to be tweaked in the web server to allow for such high concurrency? You'll need to measure these things to find the bottlenecks so you can eliminate them.
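A minimal sketch of such per-step timing, assuming nothing about the actual service: the StepTimer class and the step names are hypothetical, and the timer is not thread-safe, so one instance per request would be needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

class StepTimer {
    private final Map<String, Long> nanos = new LinkedHashMap<>();

    /** Runs `body`, adds its elapsed time to the named step, and returns its result. */
    <T> T time(String step, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            nanos.merge(step, System.nanoTime() - start, Long::sum);
        }
    }

    /** Accumulated time per step in milliseconds, in first-seen order. */
    Map<String, Long> millis() {
        Map<String, Long> out = new LinkedHashMap<>();
        nanos.forEach((step, n) -> out.put(step, n / 1_000_000));
        return out;
    }

    private static String pause(long ms, String result) {
        try {
            Thread.sleep(ms);   // stands in for real work
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer();
        // Hypothetical steps of one request.
        timer.time("lookup", () -> pause(20, "bean"));
        timer.time("database", () -> pause(50, "rows"));
        timer.time("serialize", () -> pause(5, "xml"));
        System.out.println(timer.millis());   // log this once per request
    }
}
```

Logging the map once per request under load shows which step grows as concurrency rises.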
I had the same problem on a server (with 200+ simultaneous users). I studied the official Glassfish tuning guide, but one very important parameter doesn't appear there. I used JMeter too, and in my case the response time increased exponentially while the server's CPU usage stayed low.
In the Glassfish admin console (configurations/server-config/Network config/thread pools/http-thread-pool) you can see how many users your server can handle. (The parameters are different in Glassfish 2 and 3.)
Max Queue Size: The maximum number of threads in the queue. A value of –1 indicates that there is no limit to the queue size.
Max Thread Pool Size: The maximum number of threads in the thread pool
Min Thread Pool Size: The minimum number of threads in the thread pool
Idle Thread Timeout: The maximum amount of time that a thread can remain idle in the pool. After this time expires, the thread is removed from the pool.
I recommend setting Max Thread Pool Size to 100 or 200 to fix the problem.
You can also set other JVM options, for example:
-Xmx/s/m
-server
-XX:ParallelGCThreads
-XX:+AggressiveOpts
I hope it helps.
When my Tomcat (6.0.20) maxThreads limit is reached, I get the expected error:
Maximum number of threads (XXX) created for connector with address null and port 80
Then requests start hanging in the queue and eventually time out. So far, so good.
The problem is that when the load goes down, the server does not recover and stays paralysed forever instead of coming back to life.
Any hints?
Consider switching to NIO; then you don't need to worry about the requirement of one thread per connection. Without NIO, the limit is about 5K threads (5K HTTP connections), and beyond that it fails like this. With NIO, a single Java thread can manage multiple connections, so the limit is much higher. The practical limit is the available heap memory: with about 2 GB you can go up to 20K connections.
Configuring Tomcat to use NIO is as simple as changing the protocol attribute of the <Connector> element in /conf/server.xml to "org.apache.coyote.http11.Http11NioProtocol".
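To illustrate what "one thread managing many connections" means, here is a small self-contained sketch with java.nio. It is not Tomcat's NIO connector, just the same idea in miniature: a single selector thread accepts and echoes for every client. The class name and the loopback test harness are made up for the example.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;

class OneThreadEchoServer {
    /** The whole server: one thread, one selector, any number of connections. */
    private static void serve(Selector selector, ServerSocketChannel server) {
        ByteBuffer buf = ByteBuffer.allocate(256);
        try {
            while (!Thread.currentThread().isInterrupted()) {
                selector.select();                       // wait until some channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel c = server.accept();
                        if (c != null) {
                            c.configureBlocking(false);
                            c.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel c = (SocketChannel) key.channel();
                        buf.clear();
                        if (c.read(buf) < 0) { c.close(); continue; }   // client closed
                        buf.flip();
                        while (buf.hasRemaining()) c.write(buf);       // fine for tiny demo messages
                    }
                }
            }
        } catch (IOException | IllegalStateException e) {
            // A channel or the selector was closed: end the demo loop.
        }
    }

    /** Opens `clients` connections at once and returns how many got their message echoed. */
    static int echoAll(int clients) {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));         // any free port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
            Thread loop = new Thread(() -> serve(selector, server), "selector-loop");
            loop.start();
            int echoed = 0;
            try {
                Socket[] sockets = new Socket[clients];
                for (int i = 0; i < clients; i++) {                      // all open at the same time
                    sockets[i] = new Socket("127.0.0.1", port);
                    sockets[i].setSoTimeout(5000);
                }
                byte[] msg = "ping".getBytes(StandardCharsets.US_ASCII);
                for (Socket s : sockets) {
                    s.getOutputStream().write(msg);
                    byte[] reply = new byte[msg.length];
                    new DataInputStream(s.getInputStream()).readFully(reply);
                    if (Arrays.equals(msg, reply)) echoed++;
                    s.close();
                }
            } finally {
                loop.interrupt();                                        // wakes select()
                loop.join(1000);
            }
            return echoed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("echoed=" + echoAll(100));
    }
}
```

All client connections stay open simultaneously while only the one "selector-loop" thread serves them; a blocking connector would need one thread per open socket for the same test.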
I think this may be a bug in Tomcat. According to this issue:
https://issues.apache.org/bugzilla/show_bug.cgi?id=48843
it should be fixed in Tomcat 6.0.27 and 5.5.30.