IO handling in Tomcat - Java

I noticed a major difference in processing time between two servlets communicating within the same Tomcat instance versus two separate Tomcat instances on the same host. The servlets communicate using HTTP. Does Tomcat or Java have some mechanism that optimizes HTTP communication when both endpoints are in the same Tomcat or JVM? I'm trying to confirm this observation is not related to the host I'm running on.

It could be the difference between blocking and non-blocking I/O.
Tomcat uses the multi-threaded model: it keeps a pool of threads for processing requests and a queue for incoming requests. The server assigns a thread to each incoming request, performs the task, sends back the response, and returns the thread to the pool. The queue holds requests that back up while all threads are busy.
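As a rough sketch of that model in plain Java (not Tomcat's actual code; the port and pool size are arbitrary):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolServer {
    public static void main(String[] args) throws IOException {
        // Fixed pool of worker threads; requests that arrive while all
        // workers are busy wait in the executor's internal queue.
        ExecutorService pool = Executors.newFixedThreadPool(200);
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();    // blocks until a connection arrives
                pool.execute(() -> handle(client)); // hand the request to a pooled thread
            }
        }
    }

    private static void handle(Socket client) {
        try (Socket c = client) {
            // read the request, do the work, write the response ...
        } catch (IOException ignored) {
            // a real server would log this
        }
    }
}
```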
Non-blocking IO, as employed by Netty, is something different.
Perhaps the two requests are being queued up when they are processed by the same Tomcat.

Some more information about these tests: both are run on SunOS 5.10 with Apache Tomcat 6.0.20 and JDK 1.6.0_23. The HTTP transfers can involve fairly large files, around 5 MB. Thread handling might explain some of it, but a timing difference of more than a factor of 10 makes me suspect that no data actually has to leave the JVM in the single-Tomcat case. Some form of blocking vs. non-blocking I/O might fit the timing difference.

Related

How are blocking calls like database access handled internally by a Jetty Non Blocking IO servlet?

I have read a lot of material trying to clearly understand the gains that a non-blocking web application server like Jetty can or can't offer.
So far, what I understand (in part by referring to this: How do Jetty and other containers leverage NIO while sticking to the Servlet specification?) is that with a non-blocking I/O model, a web server like Jetty runs a single thread (or one per CPU core) - the selector thread - that detects which connections are ready for some I/O. Connections that are ready are then dispatched to an internal thread pool that processes the request.
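As a rough illustration of that pattern in plain Java NIO (a sketch, not Jetty's actual source; the port, pool size, and handling are illustrative):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SelectorServer {
    public static void main(String[] args) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(8);
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                    // blocks until some channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {         // new connection: register it for reads
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {    // data ready: hand off to the worker pool
                    key.interestOps(0);           // stop selecting it while a worker owns it
                    workers.execute(() -> process((SocketChannel) key.channel()));
                }
            }
        }
    }

    private static void process(SocketChannel channel) {
        // read the request bytes, do the servlet work, write the response;
        // a real server would re-register the channel for reads afterwards
    }
}
```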
I can see how such an architecture could allow you to serve many more connections with far fewer resources. However, what I am not clear about is this:
If I wrote a servlet that ran a long running database operation using a standard JDBC driver performing blocking I/O, wouldn't the handler thread dispatched from the pool to handle this request block?
And if requests came in faster than the database requests were fulfilled, the handler thread pool would be exhausted at some point?
And so, with an application like this, is there any benefit to running it on a non-blocking Jetty web server? Is the non-blocking benefit only truly accrued if the servlet itself uses another layer of non-blocking access to the database? Or is there something I am missing?
Please do explain if there's some magic through which Jetty will pay less of a price for the blocking database operations than, say, a blocking web server.
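To make the scenario concrete, here is a minimal sketch of the Servlet 3.0 async pattern that "another layer of non-blocking access" would start from; the runBlockingQuery() helper, pool size, and URL pattern are hypothetical. Note that the JDBC call still blocks a thread - just one owned by the application, not one of the container's handler threads:

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/report", asyncSupported = true)
public class AsyncJdbcServlet extends HttpServlet {
    // Application-owned pool: blocking DB calls tie up these threads,
    // not the container's handler threads.
    private final ExecutorService dbExecutor = Executors.newFixedThreadPool(10);

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync();     // container thread is released here
        dbExecutor.execute(() -> {
            try {
                String result = runBlockingQuery();  // long JDBC call blocks this pool only
                ctx.getResponse().getWriter().write(result);
            } catch (IOException e) {
                // log the error in a real application
            } finally {
                ctx.complete();                  // tell the container we are done
            }
        });
    }

    private String runBlockingQuery() {
        return "result";                         // placeholder for the real JDBC work
    }
}
```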
P.S.: For contrast, I read about Node.js here - How the single threaded non blocking IO model works in Node.js - and it seems to suggest that Node uses libuv underneath and applies other techniques to translate all blocking operations in code (such as database access and sleep()) into event callbacks, ensuring the event loop and the internal thread pool never get blocked in a blocking call. It's still a little gobbledygook to me, but assuming that's true for Node, can Jetty promise the same? And for servlets etc. that are not written in a non-blocking way?

How does a RESTful WS work with multiple clients at the same time?

I am new to the RESTful Webservices world and I have a question regarding how WS works.
Context:
I am developing a RESTful WS that will be under high load; at any given time I can have, let's say, up to 10 clients sending multiple requests. All the requests will be sent to port 80.
I am developing the WS with Jersey (Java) and deploying it on a Tomcat web server.
Question:
Let's say we have 5 clients that send requests at the same time, and each one sends 2 requests to port 80; will they be treated in FIFO order? Can we have some sort of multi-threading if, say, we don't care about the order?
It all depends on which server you use and how it is configured. The standard configuration (you have to work hard to make it non-standard) is to have multiple threads. In other words, the server usually creates or reuses a thread for each new request, and it is almost certain that requests will be processed in parallel.
You can actually see this from inside your running code by using java.lang.Thread.currentThread(): print the name of the current thread together with each REST request and you will see.
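For example, a trivial JAX-RS resource like this (the path and class name are just for illustration) lets you watch the thread names - hit it from several clients at once and each response reports which pooled thread served it:

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;

@Path("/whoami")
public class ThreadInfoResource {

    @GET
    public String whichThread() {
        // Each concurrent request is typically served by a different pooled thread
        return "Served by: " + Thread.currentThread().getName();
    }
}
```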
To answer your question: a thread is fetched from the thread pool to serve every request you send. The server does not care about the order beyond that; the request that comes first is served first.
More about the servers:
I suggest you use Nginx or Apache in front as a reverse proxy for high performance. A thread is fetched from the thread pool to serve each request, and you can increase the thread pool size to improve performance. Too many threads, on the other hand, will reduce performance, because the overhead of switching between threads grows; you don't want a very large thread pool.
If you are using Apache + Tomcat, you are basically in the same situation as with Tomcat alone, but Apache is better suited than Tomcat to act as the front-end web server. In real life, companies use Apache as a reverse proxy that dispatches requests to Tomcat.
Apache and Tomcat are thread-based servers, and their performance degrades when there are too many requests. If you have to handle a lot of requests, you can use Nginx.
Nginx is an event-based server: it uses a queue to store requests and dispatches them in FIFO order. It can handle a lot of requests with far fewer threads, so its performance stays more stable even under a larger request load. With an extremely large number of requests, however, Nginx will also be overwhelmed, as its event loop has no room for extra requests.
Companies deal with that situation using distributed-system concepts, for example a load balancer. But for your question that's a little too much. Check this article and this article to get a better idea about Nginx and Apache.

Socket best practices in Java

When writing any kind of web service in Java (be it a web server, a RESTful webapp, or a microservice), you end up using sockets for two-way communication between client and server.
Using the common Socket and ServerSocket classes is trivial, but since sockets are blocking, you end up creating a thread for each request. With this threaded approach your server will work perfectly, but it won't scale very well.
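In code, that naive pattern looks something like this sketch (contrast it with handing connections to a fixed pool instead of spawning one thread per connection):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadPerConnectionServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();          // blocks until a client connects
                new Thread(() -> handle(client)).start(); // one full thread (and stack) per client
            }
        }
    }

    private static void handle(Socket client) {
        try (Socket c = client) {
            // blocking read of the request, blocking write of the response ...
        } catch (IOException ignored) {
            // a real server would log this
        }
    }
}
```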
The alternative is non-blocking I/O by means of SocketChannel, ServerSocketChannel, and Selector, which is clearly not as trivial as plain sockets.
My question is: which of these two approaches is used in production-ready code? I'm talking about medium to big projects like Tomcat, Jetty, Sparkjava, and the like.
I suppose they all use the channel-based approach, right?
To make a web server really scalable, you'll have to implement it with non-blocking I/O - which means that you should make it in such a way that threads will never get blocked waiting for I/O operations to complete.
Threads are relatively expensive objects. For example, for each thread memory needs to be allocated for its call stack. By default this is in the order of one or a few MB. Which means that if you create 1000 threads, just the call stacks for all those threads will already cost you ~ 1 GB memory.
In a naïve server application, you might create a thread for each accepted connection (each client). This won't scale very well if you have many concurrent users.
I don't know the implementation details of servers like Tomcat and Jetty, but they are most likely implemented using non-blocking I/O.
Some info about non-blocking I/O in Tomcat: Understanding the Tomcat NIO Connector
One of the most well-known non-blocking I/O libraries in Java is Netty.

Tomcat NIO connector with a blocking application

After reading about the Tomcat NIO connector I still don't get one thing: is the NIO connector beneficial if the application code is blocking, i.e. it blocks on reading from the database, on reading the file system, or on calling external web services?
So, for example, you have a REST-like API that receives a request, reads something from the database, and returns a response. It doesn't use servlet 3 async, it just writes to the response.
I didn't find a full description of the thread pools used by the NIO connector, but I imagine it has a thread pool for handling the requests, so each request ends up in its own thread, which it can block.
If that's the case, are the benefits of NIO still there, or the blocking code diminishes the benefits of NIO (in terms of resource utilization)?
Is the nio connector beneficial if the application code is blocking...?
Yes, the NIO connector is built on the assumption that your app will block somewhere. The NIO connector essentially holds a set of socket placeholders and keeps accepting new incoming requests until information starts getting written back.
I didn't find a full description of the thread pools used by the NIO connector
I think this is the start of your confusion. Tomcat NIO has a selector pool, not a thread pool (reference). The connector code polls each selector to see if it has incoming or outgoing bytes to send. In this sense, the selector for a given request will continue to receive information until there is enough to process the request with a Request/Response object that bridges the gap between synchronous I/O and asynchronous I/O (reference).
The polling code never blocks for longer than the time it takes to serialize a packet of information, so it is free to handle new requests. The only real limitation is the amount of memory available to Tomcat. While there is a thread pool, the number of actual threads used is much lower than the number of connections the application can handle (reference).
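You can see that ratio directly in the connector configuration. As a sketch, a server.xml NIO connector along these lines (the values shown match the commonly documented defaults, but check them against your Tomcat version) allows far more open connections than worker threads:

```xml
<!-- NIO connector: up to 10000 kept-open connections handled by the
     poller, but at most 200 worker threads doing servlet work. -->
<Connector port="8080"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           maxThreads="200"
           maxConnections="10000"
           acceptCount="100"
           connectionTimeout="20000" />
```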
While there are performance differences between Tomcat Connectors (reference), the difference in raw request/response time is pretty small when the servlet itself blocks. However, the difference in the number of simultaneous requests that Tomcat can handle is vastly different when you use non-blocking I/O.

Threaded apache cxf clients and performance on high frequency requests

I have a relatively simple java service that fetches information from various SOAP webservices and does so using apache cxf 2.5.2 under the hood. The service launches 20 worker threads to churn through 1000-8000 requests every hour and each request could make 2-5 webservice calls depending on the nature of the request.
Setup
I am using connection pooling on the webservice connections
Connection timeout is set to 2 seconds to realistically keep up with the volume of requests (see the sketch after this list for how this is typically configured).
All connections are going out through a http proxy.
20 Worker Threads
Grunty 16 cpu box
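For reference, here is a sketch of how such a 2-second connection timeout is typically set programmatically on a CXF 2.x client port; the receive-timeout value is just illustrative:

```java
import org.apache.cxf.endpoint.Client;
import org.apache.cxf.frontend.ClientProxy;
import org.apache.cxf.transport.http.HTTPConduit;
import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

public class CxfTimeoutConfig {
    // `port` is the generated JAX-WS proxy for the remote SOAP service.
    public static void configure(Object port) {
        Client client = ClientProxy.getClient(port);
        HTTPConduit conduit = (HTTPConduit) client.getConduit();

        HTTPClientPolicy policy = new HTTPClientPolicy();
        policy.setConnectionTimeout(2000);   // 2 s to establish the TCP connection
        policy.setReceiveTimeout(10000);     // e.g. 10 s to wait for the response
        conduit.setClient(policy);
    }
}
```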
The problem is that I am starting to see 'connect time out' errors in the logs, quite a large number of them, and it seems the application is also affecting the machine's network performance: curl from the command line takes more than 5 seconds just to establish a connection to the same web services. However, when I stop the application, curl connection times improve drastically, to under 5 ms.
How have other people tackled this situation using CXF? Did it work, or did they switch to a different library? If you were to start from scratch, how would you design for 'small payload, high frequency' transactions?
We once had a problem similar to yours, where requests took a very long time to complete. It is not a CXF issue; any web service stack will slow down under very frequent requests.
To solve it we implemented a JMS EJB message-driven bean. The flow was as follows: when users sent requests to the web service, each request was put onto a JMS queue, so the response to the user came back very quickly and the request was left to be processed in the background. Later, users could check the status of their operations: still queued, currently processing, completed successfully, or failed for some reason.
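A minimal sketch of such a message-driven bean (the queue name, activation properties, and payload handling are illustrative and depend on your container):

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// The web service only enqueues the request; this bean picks it up
// and does the slow work in the background.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
                              propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination",
                              propertyValue = "jms/RequestQueue")
})
public class RequestProcessorBean implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            String payload = ((TextMessage) message).getText();
            process(payload);   // the long-running work
            // update the request's status so the user can poll for completion
        } catch (JMSException e) {
            // log and let the container handle redelivery
        }
    }

    private void process(String payload) {
        // call the backend SOAP services, persist results, etc.
    }
}
```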
If I had to design a high-frequency transaction application, I would definitely use JMS for it.
Hope this helps.
