I am using Tomcat 7.0.32 with Java 1.7. I have a use case where my client is single-threaded and sends requests at a high rate. My problem is that the server processes all the requests, but at the expense of high latency. This is quite obvious since the communication is not concurrent.
I know that if my connector is BIO and I set maxThreads to 1 and acceptCount to 1, then sending 3 concurrent requests fails with Connection Refused, which is expected.
However, if the client is single-threaded, the above does not apply: I can continue to send as many requests as I want, although the latency gets worse. This again is natural. I am now trying to find out whether there is any server (i.e. Tomcat) configuration that can make the OS refuse connections when the time to acquire a connection is high.
Any ideas?
This is quite obvious since the communication is not concurrent.
It's not obvious to me. I run a dozen highly concurrent Tomcats.
If you are suggesting that Tomcat is not a highly concurrent server you are mistaken. It is. You need to reexamine your observations and your assumptions.
My goal is to learn what factors could overwhelm my little Tomcat server, and, when some exception happens, what I could do to resolve or remediate it without moving my server to a better machine. This is not a real app in a production environment, just my own experiment. (Besides some changes on the server side, I may also change something on the client side.)
Both my client and my server are very simple: the server only checks the URL format and sends a 201 code if it is correct. Each request sent from my client only includes a simple JSON body. There is no database involved. The two machines (t2.micro) only run the client and the server respectively.
My client is OkHttpClient(). To avoid timeout exceptions, I have already set the timeouts to 1,000,000 milliseconds via setConnectTimeout, setReadTimeout, and setWriteTimeout. I also went to $CATALINA/conf/server.xml on my server and set connectionTimeout="-1" (infinite).
I'm trying to stress my server by having a client launch 3000+ threads that send HTTP requests to it. My client and server reside on different EC2 instances.
Initially, I encountered some timeout issues, but after I set the connection, read, and write timeouts to a bigger value, that exception was resolved. However, with the same setup, I'm now getting a java.net.ConnectException: Failed to connect to my_host_ip:8080 exception, and I do not know its root cause. I'm new to multithreading and distributed systems. Can anyone please give me some insight into this exception?
[Screenshots from my EC2 instances: 1. Client, 2. Server]
Having gone through a similar exercise in the past, I can say that there is no definitive answer to the problem of scaling.
Here are some general troubleshooting steps that may lead to more specific information. I would suggest running tests in which you tweak a few parameters at a time and measure the changes in CPU usage, logs, etc.
Please state what value you have set for the timeout. Increasing the timeout could cause your server (or client) to run out of threads quickly, because each thread can stay busy for longer. Question the need for increasing the timeout. Is there any processing that slows your server?
Check application logs, JVM usage, and memory usage on the client and server. There will be some hints there.
Your client seems to be hitting 99%+ and then coming down. This implies that there could be a problem on the client side, in that it maxes out during the test. You might want to resize your client so it can do more.
Look at open file handles. The number should be sufficiently high.
Tomcat has a limit on the thread count it uses to handle load. You can check this in server.xml and, if required, change it to handle more. However, the CPU doesn't actually max out on the server side, so it is unlikely that this is the problem.
If you use a database, check the performance of the database. Also check the JDBC connection settings. There is thread and timeout configuration at the JDBC level as well.
Is response compression set up on Tomcat? It will give much better throughput on the server, especially if the data sent back by each request is more than a few KB.
Based on the update to the question, a few more thoughts.
Since the application is fairly simple, the way to stress the server should be to start low and increase the load in increments whilst monitoring various things (CPU, memory, JVM usage, file handle count, network I/O).
The increments of load should be spread over several runs.
Start with something as low as 100 parallel threads.
Record as much information as you can after each run and if the server holds up well, increase load.
Suggested increments 100, 200, 500, 1000, 1500, 2000, 2500, 3000.
At some level you will see that the server can no longer take it. That would be your breaking point.
As you increase load and monitor, you will likely discover patterns that suggest tuning specific parameters. Each tuning attempt should then be tested against the same level of multithreading. Any improvement will be obvious from the monitoring.
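The stepwise approach described above could be sketched roughly as follows. This is only a minimal illustration: the class name LoadRamp, the method runStep, and the fake request are all hypothetical, and the stand-in task would have to be replaced by a real HTTP call against the server under test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical load-ramp harness; names and step sizes are illustrative only.
class LoadRamp {

    // Runs `threads` copies of `request` in parallel and returns how many failed.
    static int runStep(int threads, Callable<Boolean> request) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                tasks.add(request);
            }
            int failures = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Boolean> result : pool.invokeAll(tasks)) {
                try {
                    if (!result.get()) {
                        failures++;
                    }
                } catch (Exception e) {
                    failures++; // an exception counts as a failed request
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] steps = {100, 200, 500, 1000};
        // Stand-in for a real HTTP call; replace with the client under test.
        Callable<Boolean> fakeRequest = () -> {
            Thread.sleep(5);
            return true;
        };
        for (int threads : steps) {
            long start = System.nanoTime();
            int failures = runStep(threads, fakeRequest);
            long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println(threads + " threads: " + failures + " failures, " + millis + " ms");
            if (failures > 0) {
                break; // possible breaking point: stop and look at the monitoring data
            }
        }
    }
}
```

Running each step as a separate pool keeps the runs independent, so the numbers recorded for one load level are not skewed by threads left over from the previous one.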
I am using Elasticsearch 1.5.1 and Tomcat 7. The web application creates a TCP client instance as a singleton during server startup through the Spring Framework.
I just noticed that I failed to close the client during server shutdown.
Analysis with various tools such as VisualVM, JConsole, and MAT in Eclipse shows that the threads created by the Elasticsearch client are still alive even after the server (Tomcat) shuts down.
Note: after introducing client.close() via Context Listener destroy methods, the threads are killed gracefully.
But my questions are:
How can I check the memory occupied by these live threads?
What is the memory leak impact of these threads?
We have had a few "OutOfMemoryError: PermGen space" errors in production. This might be the reason, but I would still like to measure it and provide stats.
Any suggestions/help please.
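As a starting point for gathering such stats, leftover threads can be counted from inside the JVM by name. The sketch below is an assumption-laden illustration, not the Elasticsearch client itself: FakeClient and the "fake-client-worker-" prefix are hypothetical stand-ins for a client that owns background threads. It only shows how many threads survive with and without close(); it does not measure memory directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a client that owns background threads.
class LingeringThreads {

    static class FakeClient implements AutoCloseable {
        private final List<Thread> workers = new ArrayList<>();
        private volatile boolean running = true;

        FakeClient(int threads) {
            for (int i = 0; i < threads; i++) {
                Thread t = new Thread(() -> {
                    while (running) {
                        try {
                            Thread.sleep(50);
                        } catch (InterruptedException e) {
                            return; // interrupted by close()
                        }
                    }
                }, "fake-client-worker-" + i);
                t.start();
                workers.add(t);
            }
        }

        @Override
        public void close() throws InterruptedException {
            running = false;
            for (Thread t : workers) {
                t.interrupt();
            }
            for (Thread t : workers) {
                t.join(); // wait until each thread has really exited
            }
        }
    }

    // Counts live threads in this JVM whose name starts with the given prefix.
    static long countLiveThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(prefix))
                .count();
    }

    public static void main(String[] args) throws Exception {
        FakeClient client = new FakeClient(3);
        System.out.println("before close: " + countLiveThreads("fake-client-worker-"));
        client.close();
        System.out.println("after close:  " + countLiveThreads("fake-client-worker-"));
    }
}
```

Logging such a count before and after an undeploy gives a concrete number to report alongside what VisualVM or MAT shows.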
Typically clients run in a different process than the services they communicate with. For example, I can open a web page in a web browser, and then shutdown the webserver, and the client will remain open.
This has to do with the underlying design choices of TCP/IP. Glossing over the details, in most cases a client only detects that its server is gone during its next request to the server. (Again, generally speaking) it does not continually poll the server to see if it is alive, nor does the server generally send a "please disconnect" message when shutting down.
The reason that clients don't generally poll servers is because it allows the server to handle more clients. With a polling approach, the server is limited by the number of clients running, but without a polling approach, it is limited by the number of clients actively communicating. This allows it to support more clients because many of the running clients aren't actively communicating.
The reason that servers typically don't send an "I'm shutting down" message is that the server often goes down uncontrollably (power outage, operating system crash, fire, short circuit, etc.). This means that a protocol which requires such a message will leave the clients in a corrupt state if the server goes down in an uncontrolled manner.
So losing a connection is really a function of a failed request to the server. The client will still typically be running until it makes the next attempt to do something.
Likewise, opening a connection to a server often does very little by itself. To validate that you really have a working connection to a server, you must ask it for some data and get a reply. Most protocols do this automatically to simplify the logic; but if you ever write your own service and you don't ask for data from the server, then even if the API says you have a good "connection", you might not. The API can report a good "connection" as soon as everything is configured successfully on your machine. To really know that it works 100% on the other machine, you need to ask for data (and get it).
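To make the difference between "connected" and "actually working" concrete, here is a small sketch under stated assumptions. The line-based "ping"/"pong" exchange is a made-up protocol for illustration only, and the names RoundTripCheck, isResponsive, and startPongServer are hypothetical. The point is that the check only succeeds when a reply really comes back within the timeout.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical "ping"/"pong" protocol, used only to illustrate a round-trip check.
class RoundTripCheck {

    // Returns true only if the peer actually answers "pong" within the timeout.
    static boolean isResponsive(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis); // bound the wait for the reply
            PrintWriter out = new PrintWriter(
                    new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            out.println("ping");
            return "pong".equals(in.readLine());
        } catch (IOException e) {
            return false; // refused, timed out, or reset
        }
    }

    // Starts a tiny one-shot server that answers a single "ping" with "pong".
    static ServerSocket startPongServer() throws IOException {
        ServerSocket server = new ServerSocket(0); // 0 = any free port
        Thread t = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
                if ("ping".equals(in.readLine())) {
                    out.println("pong");
                }
            } catch (IOException ignored) {
                // server socket closed or client went away
            }
        });
        t.setDaemon(true);
        t.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket answering = startPongServer();
             ServerSocket silent = new ServerSocket(0)) { // connects at TCP level, never replies
            System.out.println("answering peer: "
                    + isResponsive("127.0.0.1", answering.getLocalPort(), 1000));
            System.out.println("silent peer:    "
                    + isResponsive("127.0.0.1", silent.getLocalPort(), 300));
        }
    }
}
```

The silent socket is the interesting case: the connect call succeeds because the operating system completes the handshake, yet no application ever answers, so only the read timeout reveals that the "connection" is not usable.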
Finally servers sometimes lose their clients, but because they don't waste bandwidth chattering with clients just to see if they are there, often the servers will put a "timeout" on the client connection. Basically if the server doesn't hear from the client in 10 minutes (or the configured value) then it closes the cached connection information for the client (recreating the connection information as necessary if the client comes back).
From your description it is not clear which of the scenarios you might be seeing, but hopefully this general knowledge will help you understand why after closing one side of a connection, the other side of a connection might still think it is open for a while.
There are ways to configure the network connection to report closures more promptly, but I would avoid using them unless you are willing to lose a lot of your network bandwidth to keep-alive messages and can accept servers that respond less quickly than they otherwise could.
We're using Glassfish 3.0.1 and experiencing very long response times, on the order of 5 minutes for 25% of our POST/PUT requests. By the time the response comes back, the front-facing load balancer has timed out.
My theory is that the requests are queuing up and waiting for an available thread.
The reason I think this is that the access logs reveal that the requests take only a few seconds to complete, but the times at which they are executed are five minutes later than I'd expect.
Does anyone have any advice for debugging what is going on with the thread pools, or on what the optimum settings for them should be?
Is it required to do a thread dump periodically, or will a one-off dump be sufficient?
At first glance, this seems to have very little to do with the threadpools themselves. Without knowing much about the rest of your network setup, here are some things I would check:
Is there a dead/nonresponsive node in the load balancer pool? This can cause all requests to be tried against this node until they fail due to timeout before being redirected to the other node.
Is there some issue with initial connections between the load balancer and the Glassfish server? This can be slow or incorrect DNS lookups (though the server should cache results), a missing proxy, or some other network-related problem.
Have you checked that the clocks are synchronized between the machines? This could cause the logs to get out of sync. 5min is a pretty strange timeout period.
If all these come up empty, you may simply have an impedance mismatch between the load balancer and the web server and you may need to add webservers to handle the load. The load balancer should be able to give you plenty of stats on the traffic coming in and how it's stacking up.
You usually get this behaviour if you have not configured enough worker threads in your server. Default values range from 15 to 100 threads in common web servers. However, if your application blocks the server's worker threads (e.g. by waiting for queries), the defaults are frequently far too low.
You can increase the number of workers up to 1000 without problems (make sure you are on 64-bit). Also check the number of worker threads (sometimes referred to as 'max concurrent/open requests') of any in-between server (e.g. a proxy or an Apache forwarding via mod_proxy).
Another common pitfall is your software sending requests to itself (e.g. trying to reroute or forward a request) while blocking an incoming request.
Taking thread dumps is the best way to debug what is going on with the thread pools. Please take 3-4 thread dumps one after another, with a gap of 1-2 seconds between them.
From the thread dumps, you can find the number of worker threads by their names. Compare the dumps to find long-running threads.
You may use the TDA tool (http://java.net/projects/tda/downloads/download/tda-bin-2.2.zip) for analyzing thread dumps.
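If taking external dumps is inconvenient, a similar summary can be produced from inside the JVM. The following is a minimal sketch, assuming that pooled threads share a name and differ only by a trailing number; the class ThreadDumpSummary and its helper poolName are hypothetical names, and real server thread names may need a different grouping rule.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that groups live threads by pool name and state.
class ThreadDumpSummary {

    // Strips a trailing number such as "-12" so that pooled threads group together.
    // Assumption: pool members are named "<pool>-<n>" or "<pool>#<n>".
    static String poolName(String threadName) {
        return threadName.replaceAll("[-#]?\\d+$", "");
    }

    // Returns "pool name / state" -> number of threads, for all live threads in this JVM.
    static Map<String, Integer> summarize() {
        Map<String, Integer> counts = new TreeMap<>();
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // thread ended while the dump was being taken
            }
            String key = poolName(info.getThreadName()) + " / " + info.getThreadState();
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        summarize().forEach((key, count) -> System.out.println(count + " x " + key));
    }
}
```

Calling summarize() a few times, a second or two apart, and comparing the counts per state gives a rough view of whether a pool's workers are all busy or blocked.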
I am creating a web application with a login page, where a number of users can try to log in at the same time, so I need to handle a number of requests at a time.
I know this is already implemented by a number of popular sites, such as Google Talk.
So I have some questions in mind.
How many requests can a port handle at a time?
How many sockets can I (the server) create? Are there any limitations?
For example, as we know, when we implement client-server communication using socket programming (TCP), we pass a port number (an unreserved port number) to the server for creating a socket.
So what I mean is: if 100,000 requests arrive at the same time, how will the port deal with all of these requests?
Does it maintain some queue for all these requests, or does it just accept requests up to its limit? If so, what is the request limit of a port?
I want to know how a server serves multiple requests simultaneously. I don't know anything about it. I know we connect to a server via its IP address and port number, and that's it.
So I thought there is only one port, and many requests come to that port from different clients, so how does the server manage all the requests?
This is all I want to know. If you could explain this concept in detail, it would be very helpful. Thanks anyway.
A port doesn't handle requests; it receives packets. Depending on the implementation of the server, these packets may be handled by one or more processes/threads, so this is theoretically unlimited. But you'll always be limited by bandwidth and processing performance.
If lots of packets arrive at one port and cannot be handled in a timely manner, they will be buffered (by the server, the operating system, or hardware). If those buffers are full, the congestion may be handled by network components (routers, switches) and the protocols the network traffic is based on. TCP, for example, has some methods to avoid or control congestion: http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Congestion_control
This is typically configured in the application/web server you are using. How you limit the number of concurrent requests is by limiting the number of parallel worker threads you allow the server to spawn to serve requests. If more requests come in than there are available threads to handle them, they will start to queue up. This is the second thing you typically configure, the socket back-log size. When the back-log is full, the server will start responding with "connection refused" when new requests come in.
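The interplay of the two limits can be imitated in plain Java. The sketch below is an in-process analogy only, not how a web server or the OS listen backlog is actually implemented: a ThreadPoolExecutor stands in for the worker threads, a bounded queue stands in for the backlog, and a rejected task stands in for a refused connection. The names BoundedServerPool, newPool, and countRejected are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// In-process analogy: workers ~ max worker threads, queue ~ backlog, rejection ~ refusal.
class BoundedServerPool {

    static ThreadPoolExecutor newPool(int workers, int backlog) {
        return new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(backlog),
                new ThreadPoolExecutor.AbortPolicy()); // reject once workers and queue are full
    }

    // Submits `requests` tasks that all stay busy, and returns how many were rejected.
    static int countRejected(int workers, int backlog, int requests) throws InterruptedException {
        ThreadPoolExecutor pool = newPool(workers, backlog);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a request that is still being processed
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // no free worker and no room in the queue
            }
        }
        release.countDown(); // let the accepted requests finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 worker, backlog 1, 3 requests -> rejected: " + countRejected(1, 1, 3));
        System.out.println("2 workers, backlog 2, 10 requests -> rejected: " + countRejected(2, 2, 10));
    }
}
```

While every accepted task is still busy, anything beyond workers plus backlog is rejected, which is the same arithmetic as a server with one worker thread and a backlog of one turning away the third concurrent request.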
Then you'll probably be restricted by the number of file descriptors your OS supports (in the case of *nix) or the number of simultaneous connections your web server supports. The OS maximum on my machine seems to be 75000.
100,000 concurrent connections should be easily possible in Java if you use something like Netty.
You need to be able to:
Accept incoming connections fast enough. The NIO framework helps enormously here, which is what Netty uses internally. There is a smallish queue for incoming requests, so you need to be able to handle these faster than the queue can fill up.
Create connections for each client (this implies some memory overhead for things like connection info, buffers etc.) - you may need to tweak your VM settings to have enough free memory for all the connections
See this article from 2009 where they discuss achieving 100,000 concurrent connections with about 20% CPU usage on a quad-core server.
I'm developing a Java client/server application in which there will be a great number of servers with which the clients are going to have to connect. The problem is that probably the vast majority of them will not be serving at the same time. The client needs to find at least one available in the list, so it will iterate it, looking for an available server (when it finds the first it stops, one is enough).
The problem is that the list will probably be long, tens of thousands of entries, possibly even hundreds of thousands, and it may happen that only 1% of them are connected (i.e. running the server). That's why I need a clever and fast way to know whether a server is connected, without waiting for time-outs or the like. I accept all kinds of suggestions.
I have thought about ordering the server list statistically, so that the servers that are available more often are the first hosts attempted. But this is not enough.
Perhaps multicasting UDP datagrams? The connections between clients and servers are TCP, but perhaps to find a server it's better to do a UDP multicast first and wait for the answer, for example... what do you think?
Both the server and client use thread pools.
The server pool handles 200 threads concurrently and, when the pool is full, queues the rest until the queue is 200 runnables long. Then it blocks and stops accepting connections until there is free room in the queue again.
The client has a cached thread pool, so it can make as many concurrent requests to the server as you want (within reason, obviously...).
This is just an initial thought and would add some overhead, but you could have the servers periodically ping some centralized server, which the clients would connect through. Then if a server doesn't ping for some set time, it gets removed.
You might want to use a peer-to-peer network.
Have a look at JXTA/JXSE:
If it is your own code which is running on each of these servers, could you send an "alive" message to a central server (which is controlled by you and is guaranteed to be up at all times)? The central server can then maintain an up-to-date list of all servers which are active. The client just needs to get a copy of this list from the central server and can then start whatever communication it needs.
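The bookkeeping on such a central server could look roughly like this. It is a minimal in-memory sketch under stated assumptions: the class HeartbeatRegistry, its methods, and the time-to-live parameter are all hypothetical, and how the "alive" messages travel over the network is left out entirely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry kept by the central server.
class HeartbeatRegistry {
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();
    private final long ttlMillis;

    // A server is dropped when no heartbeat has arrived for more than ttlMillis.
    HeartbeatRegistry(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Called whenever a server reports in; the clock value is passed in to keep this testable.
    void heartbeat(String serverAddress, long nowMillis) {
        lastSeen.put(serverAddress, nowMillis);
    }

    // Removes expired entries and returns the servers still considered alive, sorted by address.
    List<String> aliveServers(long nowMillis) {
        lastSeen.values().removeIf(seen -> nowMillis - seen > ttlMillis);
        List<String> alive = new ArrayList<>(lastSeen.keySet());
        Collections.sort(alive);
        return alive;
    }

    public static void main(String[] args) {
        HeartbeatRegistry registry = new HeartbeatRegistry(1000);
        registry.heartbeat("10.0.0.1:9000", 0);
        registry.heartbeat("10.0.0.2:9000", 900);
        // At t=1500 the first server has been silent for longer than the TTL.
        System.out.println(registry.aliveServers(1500));
    }
}
```

With a list like this, a client asks once for the live servers instead of probing tens of thousands of addresses, at the cost of one extra component that must stay reachable.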
Sounds like a job for Threads. You cannot speed up the connection, it takes time to contact the server.
IMHO, the best way is to have a few hundred threads march through the list of servers. The first one to find a live server wins. Then signal the other threads to die out.
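One way to get this "first one wins" behaviour without hand-written signalling is ExecutorService.invokeAny, which returns the first successful result and cancels the remaining tasks. The sketch below makes several assumptions: the class FirstAliveFinder and its methods are hypothetical names, the probe is a plain TCP connect with a short timeout, and "alive" simply means the connect succeeded.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

// Hypothetical helper: probes many servers in parallel and returns the first that answers.
class FirstAliveFinder {

    // Simple probe: true if a TCP connection can be opened within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Returns the first server for which `probe` succeeds, or null if none does.
    static String findFirstAlive(List<String> servers, Predicate<String> probe, int threads)
            throws InterruptedException {
        if (servers.isEmpty()) {
            return null;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String server : servers) {
                tasks.add(() -> {
                    if (probe.test(server)) {
                        return server;
                    }
                    throw new IllegalStateException(server + " is down");
                });
            }
            // invokeAny returns the first successful result and cancels the rest.
            return pool.invokeAny(tasks);
        } catch (ExecutionException e) {
            return null; // every probe failed
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder addresses and port; replace with the real server list.
        List<String> servers = List.of("127.0.0.1", "localhost");
        String alive = findFirstAlive(servers, host -> canConnect(host, 9000, 500), 100);
        System.out.println(alive == null ? "no server reachable" : "first alive: " + alive);
    }
}
```

The short connect timeout bounds how long each dead server costs, and the pool size bounds how many probes are in flight at once.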
Btw, did you really mean to order the server list "sadistically"? :)