We need to implement a simple toggle server (a REST application) that takes a toggle name and returns whether it is enabled or disabled. We are expecting a load of tens of thousands of requests per day.
Does Spring (reactive) WebFlux make sense here?
My understanding is that reactive REST APIs are useful when there is a possibility of idle time on the HTTP thread - that is, the thread is waiting for some job to finish and can't proceed until it receives a response, such as from DB reads or REST calls to other services.
Our use case is just to return the toggle value being queried (probably from some cache). Will a reactive REST service be useful in our case? Does it provide any advantages over a simple Spring Boot application?
I'm coming from a background of "traditional" Spring/Spring MVC application development, and these days I'm also starting to learn Spring WebFlux. Based on the data provided in the question, here are my observations (disclaimer: since I'm a beginner in this area, as I said, take this answer with a grain of salt):
1. WebFlux is less "straightforward" to implement than a traditional application: the maintenance cost is higher, debugging is harder, etc.
2. WebFlux will shine if your operations are I/O bound. Reading data from an in-memory cache is not an I/O-bound operation. I also understand that the nature of "toggle" data is that it doesn't change much but gets read frequently, so keeping it in some in-memory cache indeed makes sense here - unless you're building something huge that won't fit in memory at all, but that is a different story.
3. WebFlux + Netty will let you serve thousands of requests simultaneously. Tomcat, with its traditional "thread per request" model, still allows 200 threads plus 100 requests in the queue by default; if you exceed those values it will fail, whereas Netty will "survive". Based on the data presented in the question, I don't see that you'll benefit from Netty here.
4. Tens of thousands of requests per day is something any kind of server can handle easily - Tomcat, Jetty, whatever; you don't need that kind of "high load" handling here.
5. As I mentioned in item 3, WebFlux is good at handling many simultaneous requests, but you probably won't gain any performance improvement over the traditional approach; it's not about speed, it's about better resource utilization.
6. If you're going to read the data from a database and you do want to go with WebFlux, make sure you have reactive drivers for your database - when you run the flow, you should be "reactive" all the way; blocking on DB access doesn't make sense.
So, bottom line: if I were you, I would start with a regular server and consider moving to a reactive stack later (that "later" will probably never come, as long as the expectations specified in the question don't change).
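Just to illustrate what I mean by "regular server", here's a minimal sketch of a plain blocking Spring MVC toggle endpoint backed by an in-memory map (class names, the URL path, and how the map gets populated are all made up for the example):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/toggles")
public class ToggleController {

    // The "cache": a simple concurrent map, refreshed elsewhere
    // (scheduled job, admin endpoint, etc. - not shown here).
    private final Map<String, Boolean> toggles = new ConcurrentHashMap<>();

    @GetMapping("/{name}")
    public ResponseEntity<Boolean> isEnabled(@PathVariable String name) {
        Boolean enabled = toggles.get(name);
        return enabled == null
                ? ResponseEntity.notFound().build()  // unknown toggle
                : ResponseEntity.ok(enabled);        // plain blocking read, no reactive types needed
    }
}
```

Nothing here waits on I/O, so there is nothing for a reactive stack to overlap; this is about as much machinery as the use case needs.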
Indeed, it aims to minimize thread idling and get more performance out of fewer threads than the traditional multithreading approach, where a thread per request is used - or, in reality, a pool of worker threads to prevent too many threads from being created.
If you're only getting tens of thousands of requests per day, and your use case is as simple as that, it doesn't sound like you need to plan anything special for it. A regular webapp will perform just fine.
Related
Our gRPC service needs to handle 1000 QPS, and each request requires a list of sequential operations to happen, including one that reads data from the DB using JDBC. Handling a single request takes at most 50 ms.
Our application can be written in two ways:
Option 1 - Classic blocking thread per request: we create a large thread pool (~200), simply assign one thread per request, and have that thread block while it waits for the DB.
Option 2 - Each request handled in a truly non-blocking fashion: this would require a non-blocking MySQL client, which I don't know exists, but for now let's assume it does.
My understanding is that the non-blocking approach has these pros and cons:
Pro: Allows us to reduce the number of threads required and, as such, reduce the memory footprint
Pro: Saves some overhead on the OS, since it doesn't need to give CPU time to threads waiting for IO
Con: For a large application (where each task subscribes a callback to the previous task), it requires splitting a single request across multiple threads, creating a different kind of overhead. And if the same request gets executed on multiple physical cores, it potentially adds overhead, since the data might not be available in the L1/L2 core cache.
Question 1: Even though non-blocking applications seem to be the new cool thing, my understanding is that for an application that isn't memory bound and where creating more threads isn't a problem, it's not clear that writing a non-blocking application is actually more CPU efficient than writing a blocking one. Is there any reason to believe otherwise?
Question 2: My understanding is also that if we use JDBC, the connection is actually blocking, so even if we make the rest of our application non-blocking, we lose all the benefit because of the JDBC client - and in that case Option 1 is most likely better?
For question 1, you are correct -- non-blocking is not inherently better (and with the arrival of Virtual Threads, it's about to become a lot worse in comparison to good old thread-per-request). At best, you could look at the tools you are working with and do some performance testing with a small scale example. But frankly, that is down to the tool, not the strategy (at least, until Virtual Threads get here).
For question 2, I would strongly encourage you to choose the solution that works best with your tool/framework. Staying within your ecosystem will allow you to make more flexible moves when the time comes to optimize.
But all things equal, I would strongly encourage you to stick with thread-per-request, since you are working with Java. Ignoring Virtual Threads, thread-per-request allows you to work with and manage simple, blocking, synchronous code. You don't have to deal with callbacks or tracing the logic through confusing and piecemeal logs. Simply make a thread per request, let it block where it does, and then let your scheduler handle which thread should have the CPU core at any given time.
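For what it's worth, here is a rough sketch of how that "simple, blocking" thread-per-request style could look once virtual threads are in play (assumes Java 21+; `handleRequest` and `fetchFromDb` are placeholders standing in for the gRPC handler and the blocking JDBC call):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class VirtualThreadPerRequestSketch {

    public static void main(String[] args) {
        // One cheap virtual thread per request: the code stays plain and blocking,
        // but waiting on the DB no longer ties up a scarce platform thread.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                int requestId = i;
                executor.submit(() -> handleRequest(requestId));
            }
        } // close() waits for the submitted tasks to finish
    }

    static void handleRequest(int requestId) {
        // Placeholder for the sequential operations from the question.
        String result = fetchFromDb(requestId);
        System.out.println("request " + requestId + " -> " + result);
    }

    static String fetchFromDb(int requestId) {
        // Simulated blocking I/O; in reality this would be the JDBC query.
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "row-" + requestId;
    }
}
```

The logic reads exactly like the Option 1 code, which is the point: you keep blocking, synchronous code and let the runtime deal with the cheapness of threads.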
Pro: Saves some overhead on the OS, since it doesn't need to give CPU time to threads waiting for IO
It's not just the CPU time given to waiting threads, but also the overhead of switching between threads competing for the CPU. As you have more threads, more of them will be in a runnable state, and the CPU time must be spread between them; all that context switching has its own cost (saving and restoring each thread's state).
Con: For a large application (where each task subscribes a callback to the previous task), it requires splitting a single request across multiple threads, creating a different kind of overhead. And if the same request gets executed on multiple physical cores, it potentially adds overhead, since the data might not be available in the L1/L2 core cache.
This also happens with the “classic” approach since blocking calls will cause the CPU to switch to a different thread, and, as stated before, the CPU will even have to switch between runnable threads to share the CPU time as their number increases.
Question 1: […] for an application that isn't memory bound and where creating more threads isn't a problem
In the current state of Java, creating more threads is always going to become a problem at some point. With the thread-per-request model, it depends on how many requests you have in parallel: 1,000 is probably OK; 10,000… maybe not.
it's not clear that writing a non-blocking application is actually more CPU efficient than writing a blocking one. Is there any reason to believe otherwise?
It is not just a question of efficiency, but also scalability. For the performance itself, this would require proper load testing. You may also want to check Is non-blocking I/O really faster than multi-threaded blocking I/O? How?
Question 2: My understanding is also that if we use JDBC, the connection is actually blocking, so even if we make the rest of our application non-blocking, we lose all the benefit because of the JDBC client - and in that case Option 1 is most likely better?
JDBC is indeed a synchronous API. Oracle was working on ADBA as an asynchronous equivalent, but they discontinued it, considering that Project Loom will make it irrelevant. R2DBC provides an alternative which supports MySQL. Spring even supports reactive transactions.
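As a rough illustration of what "non-blocking all the way down to the database" can look like (not a drop-in implementation; the table, column, and connection URL are invented, and it assumes an R2DBC MySQL driver such as r2dbc-mysql on the classpath), here is a sketch using Spring's DatabaseClient:

```java
import org.springframework.r2dbc.core.DatabaseClient;

import io.r2dbc.spi.ConnectionFactories;
import io.r2dbc.spi.ConnectionFactory;

import reactor.core.publisher.Mono;

public class ReactiveDbSketch {

    public static void main(String[] args) {
        ConnectionFactory factory =
                ConnectionFactories.get("r2dbc:mysql://localhost:3306/mydb"); // example URL

        DatabaseClient client = DatabaseClient.create(factory);

        // Nothing blocks here: the query is only described as a Mono and is
        // executed when something subscribes to it.
        Mono<String> status = client
                .sql("SELECT status FROM orders WHERE id = :id") // invented table/column
                .bind("id", 42L)
                .map((row, metadata) -> row.get("status", String.class))
                .one();

        status.subscribe(s -> System.out.println("status = " + s));
    }
}
```

If instead the call to the database blocks (plain JDBC), that blocking call eats the benefit the rest of the reactive pipeline was supposed to provide, which is exactly the concern in Question 2.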
I am using SpringDoc OpenAPI 3 for my REST API. Right now I'm using the WebMvc version. Is there any advantage to switching to the WebFlux version? Wouldn't using WebClient (or some other async client) on the client side be the same thing, except the async part would happen on the client side instead? At the end of the day, a REST method can use async methods internally, but I'm trying to see if it is worth migrating the exposed methods to WebFlux.
No, it's a completely different thing. Spring MVC operates in a thread-per-request model: 100 concurrent requests means 100 threads to handle those requests. 100 threads is already a lot; now imagine 1k, 10k or even 100k - completely impossible in this model.
The point is that those threads are not doing work 100% of the time. If you call a database or another service, the thread just waits for the response instead of doing the work it could be doing in that time.
That's the WebFlux way: you use fewer threads because, instead of waiting for the response from an external service, those threads keep working in that time, making it possible to handle 1k concurrent requests without much of a problem.
Why is everyone not doing this then - fewer resources used, better performance, etc.? First, and I think most important: it's incredibly hard to do. Program flow is not as easy to follow as in synchronous programming, debugging is really hard, stack traces become basically useless, and you need to be very careful not to block anything. Second, the benefit only becomes worthwhile past some threshold, and most apps don't need to handle thousands of concurrent users; below that threshold, not only is there no benefit, performance may even be worse, while you still pay the price in developer knowledge and experience mentioned in the first point. Third, for this to work the whole flow needs to be asynchronous, otherwise you will just block the event loop - calls to external services and, most importantly, the database: you need an asynchronous database driver for that, and not every database supports one.
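To make the difference concrete, here's a rough WebFlux-style sketch (the downstream service URL, path, and controller name are made up): the handler returns a Mono immediately, and no thread sits idle while the downstream call is in flight.

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;

import reactor.core.publisher.Mono;

@RestController
public class PriceController {

    // Non-blocking HTTP client; the base URL is just an example.
    private final WebClient client = WebClient.create("http://inventory-service");

    @GetMapping("/price/{productId}")
    public Mono<String> price(@PathVariable String productId) {
        // Returns immediately with a Mono; the event-loop thread is free to
        // serve other requests while the downstream call is in flight.
        return client.get()
                .uri("/products/{id}/price", productId)
                .retrieve()
                .bodyToMono(String.class);
    }
}
```

The moment anything in this chain blocks (a synchronous HTTP client, a JDBC call), the event loop stalls, which is the "you need the whole flow to be asynchronous" caveat above.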
I've recently seen that there are different frameworks out there that allow a messaging architecture to be implemented in process, using either the same or different threads. The ones I know about are Spring, Guava EventBus and Reactor.
My question is about the good use cases where someone would want to use them instead of sending messages to a full-fledged broker. I understand that their usage allows for better decoupling of the business logic, but in a microservices architecture you would normally publish events to be consumed by other microservices. The advantage of that is the fault tolerance you get by adding a cluster of brokers, where an erroneous message caused by a failure in one instance can be retried by another. Implementing logic that is decomposed and executed by sending messages that are later consumed by the same system, especially when the subscribers are executed in different threads, seems to me to make it difficult to put the data back into a consistent state.
The advantage of microservices over in-process messaging is not really in the change it represents for message consumption.
Microservices allow you to execute portions of your code on specific nodes within a cluster, letting you allocate the heavy calculations to powerful machines and secondary or lighter work to less powerful ones. Overall it allows you to balance performance better and scale your resources on the portions of code that require it.
Also, whenever you update the code of a microservice you do not impact the other services, so your changes (and errors) are isolated. If everything runs within the same process, any bad update might render the entire solution unusable.
In the end, getting the communication out of your process (a third-party broker) allows you to share it with more people, agents, processes, etc. Otherwise others have to become part of your process (a module?), and that is really not efficient.
Honestly, the only good reason for intra-process communication within your monolith is speed (in-memory communication rather than on-the-wire communication).
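For completeness, here is a minimal sketch of the kind of in-process messaging the question mentions, using Spring's application events (all names are illustrative). Everything stays in memory, so you get the decoupling but none of the broker's durability or retries:

```java
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.event.EventListener;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Component;
import org.springframework.stereotype.Service;

// The event is just a plain object.
record OrderPlaced(String orderId) {}

@Service
class OrderService {

    private final ApplicationEventPublisher events;

    OrderService(ApplicationEventPublisher events) {
        this.events = events;
    }

    void placeOrder(String orderId) {
        // ... business logic ...
        events.publishEvent(new OrderPlaced(orderId)); // in-process, no broker involved
    }
}

@Component
class NotificationListener {

    // Runs on another thread if @EnableAsync is configured; synchronously otherwise.
    @Async
    @EventListener
    void on(OrderPlaced event) {
        System.out.println("Sending confirmation for order " + event.orderId());
    }
}
```

If the process dies between publishing and handling, the event is simply gone - which is exactly the consistency concern raised in the question.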
The throughput of our web application seems to be limited by slow connections. In load tests, we easily achieve about 5000 requests/s. But in practice, we max out at about 1000 requests/s. The server isn't really under serious load, neither IO nor CPU wise. The same applies to the database. The main difference seems to be that most worker threads are slowed down by clients that cannot accept the response fast enough (often responses are several MB in size).
We hardly have any static resources; the problem is about dynamically generated content. It's implemented with the Spring Framework, but I think it wouldn't be different for any other servlet-based implementation.
So what are our options for improving throughput? Is there some sort of caching available that would quickly absorb the response, free up the worker threads and then asynchronously deliver it to the client at their speed?
We'd rather not increase the number of processing threads as they keep a database connection open for most of their processing. We're really looking for a solution where a small number of worker threads can work at full speed.
First, I would suggest using standard techniques such as gzip compression for responses.
The second suggestion is to use asynchronous processing in Spring MVC. See Making a Controller Method Asynchronous to learn more about this.
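A rough sketch of what that could look like (the controller name, path, and `generateReport` method are placeholders): returning a Callable lets the servlet container thread go back to the pool while the response body is produced on Spring MVC's async task executor.

```java
import java.util.concurrent.Callable;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ReportController {

    @GetMapping("/reports/{id}")
    public Callable<ResponseEntity<byte[]>> report(@PathVariable String id) {
        // The container thread is released immediately; the Callable runs on
        // the async task executor and the response is written when it completes.
        return () -> ResponseEntity.ok(generateReport(id));
    }

    private byte[] generateReport(String id) {
        // Placeholder for the slow, dynamically generated content from the question.
        return ("report-" + id).getBytes();
    }
}
```

Whether this helps with slow clients specifically depends on where the time actually goes (generating the response vs. writing it out), so it is worth measuring before committing to it.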
I am currently building a Java-servlet-based web application that should offer its service to quite a lot of users (don't ask me how many "a lot" is :-) - I don't know yet).
However, while the application is being used, some long-running processing might occur on the server side.
In order to avoid bad UI responsiveness, I decided to move these processing operations into their own threads.
This means that once a user is logged in, 1-10 threads might be running in the background (per user!).
I once heard that using multiple threads in a web application is a "bad idea".
Is this true and if yes: Why?
Update: I forgot to mention that my application relies heavily on AJAX calls. Every user action causes a new AJAX call, so when the main servlet thread is busy, the AJAX call takes very long to process. That's why I want to use multiple threads.
It is a bad idea to create the threads manually yourself. This has been discussed a lot here on SO; see this question for example.
Another question discusses alternative solutions.
The "bad idea" isn't multiple threads. Java EE was originally written so multi-threading was in the hands of the app server, so users were discouraged from starting their own threads.
I think what you really want is asynchronous processing for long-running tasks so users won't have to wait for them to finish before proceeding.
You could do that with JMS and stay within the lines of the Java EE coloring book. I think it's also safe to do it on your own, now that there are new classes and constructs in the java.util.concurrent package.
It's still not an easy thing to do. Multi-threaded code isn't trivial. But I think it's easier than it used to be in Java.
Part of the problem might be that you're asking that servlet to do too much. Servlets should listen for HTTP requests and orchestrate getting a response from other classes, not do all the processing themselves. Perhaps your servlet is telling you that it's time to refactor a bit. This will help your testing, since you'll be able to unit test those async classes without having a servlet/JSP engine running.
AJAX calls to services via HTTP need not block. If the service can return a token, a la FedEx, that tells the app when and how to get the response, there's no reason why the service can't process asynchronously. It's an implementation detail for the services that you should hide from clients.
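A hedged sketch of that token idea using java.util.concurrent (the servlet path, parameter names, and `doLongRunningWork` are invented): the servlet submits the work to a shared executor and immediately returns an id the client can poll.

```java
import java.io.IOException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/tasks")
public class TaskServlet extends HttpServlet {

    // Shared, bounded pool instead of spawning raw threads per user.
    private static final ExecutorService EXECUTOR = Executors.newFixedThreadPool(8);
    // In a real app this map needs cleanup; kept simple for the sketch.
    private static final Map<String, Future<String>> TASKS = new ConcurrentHashMap<>();

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String input = req.getParameter("input");   // capture before leaving the request thread
        String token = UUID.randomUUID().toString();
        TASKS.put(token, EXECUTOR.submit(() -> doLongRunningWork(input)));
        resp.getWriter().write(token);               // the "FedEx tracking number" the client polls with
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        Future<String> task = TASKS.get(req.getParameter("token"));
        if (task == null) {
            resp.sendError(HttpServletResponse.SC_NOT_FOUND);
        } else if (!task.isDone()) {
            resp.getWriter().write("PENDING");
        } else {
            try {
                resp.getWriter().write(task.get());
            } catch (Exception e) {
                resp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
            }
        }
    }

    private String doLongRunningWork(String input) throws InterruptedException {
        Thread.sleep(5_000); // placeholder for the real processing
        return "done: " + input;
    }
}
```

The AJAX front end then polls the GET endpoint until the result is ready, so no request thread ever sits blocked on the long-running work.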
1.
Brilliant Idea.
It's not common, but there's nothing wrong with it.
If you think asynchronous tasks are needed for a better user experience, just use them.
2.
You need to be careful with it.
2.1.
Creating and destroying threads adds a lot of overhead to your server.
You'd better use an executor, like java.util.concurrent.ThreadPoolExecutor.
2.2.
Don't just use Executors.newFixedThreadPool(). It is for beginners and hides dangerous details.
You need to know the edge behavior of ThreadPoolExecutor and configure it properly (see the sketch after this list).
How many threads are enough for your tasks? You need to work it out.
What happens if there are no free threads in your pool? Different configurations can make it wait, queue, or abandon new tasks. What should you expect?
What happens if a task runs for too long (such as an infinite loop)? There is no real timeout-and-exit mechanism in Java. How do you prevent these cases?
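As a sketch of what "configure it properly" could mean (all the numbers here are placeholders you'd have to work out for your own load):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BackgroundPoolConfig {

    static ThreadPoolExecutor createPool() {
        return new ThreadPoolExecutor(
                4,                                        // core threads kept alive
                16,                                       // hard upper bound on threads
                30, TimeUnit.SECONDS,                     // idle non-core threads are reclaimed
                new ArrayBlockingQueue<>(100),            // bounded queue: back-pressure instead of OOM
                new ThreadPoolExecutor.CallerRunsPolicy() // when saturated: slow the caller down
                                                          // instead of silently dropping tasks
        );
    }
}
```

One edge behavior worth knowing: with a bounded queue, extra threads (beyond the core size, up to the maximum) are only created once the queue is full, which tends to surprise people.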
If the application requires it, then I say go ahead and do the background threads, but, since you don't know how many users you will have, you are taking a great risk that you will overwhelm your server. You might consider some alternatives, if they will work in your situation. Can you run the background tasks completely offline, e.g. in a batch job? Can you limit the number of threads that each logged in user will need? How will you get the results of the background threads back to the user?
This is a bad idea for three main reasons:
An excessive number of running threads can exhaust system resources and cause strange things such as starvation and priority inversion. Often this can be solved with a thread pool.
User session duration is unpredictable. The user can fire an action and go for a coffee, or they might complain about the delay and redo the action. This can cause multiple background jobs to be created, so it requires complex control, and when we talk about threads, we never know for sure whether we've left race conditions or unanticipated scenarios behind.
Most likely the servlets will have some interaction with the threads. Now suppose your application needs to scale, so you use a clustered container (after all, you have "a lot" of users). The container can passivate a session and restore it on another node, but your threads will remain on the initial node, so the link between the session and the threads will be broken. This ends in unexpected exceptions and error 500 - server failure.
I think the best solution is to design your application so that it won't create so many background threads.
But if you insist or really need it, try using Java EE message-driven beans (MDBs) and make your servlet invoke them using JMS, like #duffymo said.
The challenge is how to handle communication between the MDBs and user sessions. Perhaps your servlet can create a JMS queue or topic and pass it to the MDBs for them to reply on, but I don't know whether the servlet side of the JMS connection can be passivated and restored.
Other forms of communication would be JNDI, or an external database or file, but these require polling, which might be unresponsive or CPU-intensive.
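For reference, a minimal MDB sketch along the lines of what #duffymo suggested (the queue name and class name are made up); the hard part, as said above, remains getting the result back to the user's session:

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// Consumes long-running jobs posted by the servlet; runs on container-managed threads.
@MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationLookup", propertyValue = "jms/LongTaskQueue"),
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue")
})
public class LongTaskMdb implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            String payload = ((TextMessage) message).getText();
            // ... do the long-running work here ...
            // The result would be written somewhere the web tier can poll (DB, reply queue, ...).
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}
```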