role of multithreading in web application

I have been using Java (Servlets, JSPs) for 2 years for web application development. In those 2 years I never needed to use multithreading explicitly in any project (I know that servlet containers use threads to serve the same servlet to different requests).
But whenever I attend an interview for a Web Developer position (Java), there are several questions related to threads in Java. I know the basics of Java threading, so answering the questions is not a problem. But sometimes I wonder whether I am missing something by not using multithreading when developing web applications.
So my question is: what is the role of multithreading in a web application? Any example where multithreading can be used in a web application will be appreciated.
Thanks in advance.

Multi-threading can be used in web apps mainly when you are interested in asynchronous calls.
Consider, for example, a web application that activates a user's state on a GSM network (e.g. activates a 4G plan) and sends a confirmation SMS or email message at the end.
Knowing that this call could take several minutes, especially if the GSM network is under load, it does not make sense to make it directly from the web request thread.
So basically, when a user clicks "Activate", the server returns something like "Thanks for activating the 4G plan. Your plan will be activated in a few minutes and you will receive a confirmation SMS/email".
In that case, your server has to hand the work to another thread, ideally taken from a thread pool, so that it runs asynchronously, and immediately return a response to the user.
Workflow:
1- User clicks "Activate" button
2- Servlet receives request and activates a new "Activate 4G Plan" task in a thread pool.
3- Servlet immediately returns an HTML response to the user without waiting for the task to be finalized.
4- End of Http transaction
.
.
.
Asynchronously, the 4G plan gets activated later and the user gets notified through SMS or email, etc...
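
The workflow above can be sketched roughly as follows. This is a minimal illustration in plain Java rather than a real servlet: the class name `ActivationHandler`, the pool size of 4, and the 200 ms sleep that stands in for the slow GSM call are all assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ActivationHandler {
    // One shared pool for background tasks (size 4 is an arbitrary choice).
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    // Exposed only so the example can observe that the task eventually ran.
    final AtomicBoolean activated = new AtomicBoolean(false);

    // Plays the role of the servlet method: queue the task, answer immediately.
    String handleActivate(String userId) {
        pool.execute(() -> activatePlan(userId));
        return "Thanks for activating the 4G plan. You will receive a confirmation SMS/email.";
    }

    // Hypothetical slow operation; a real one would call the GSM network
    // and then send the SMS or email.
    private void activatePlan(String userId) {
        try {
            Thread.sleep(200);
            activated.set(true);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stop accepting tasks and wait for the queued ones to finish.
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In a container you would typically create the pool once at application startup and shut it down when the application is undeployed, rather than per request.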

Speaking about a real-world example, there are several reasons to use multi-threading, and I wouldn't hire a web developer who doesn't know about it. But in the end, the reasons to use multi-threading are the same for standard and web development: you either want something that takes a while (i.e. blocking) done in the background so you can give the user some response in the meantime, or you have a task that can be sped up by running it on several cores. When multi-threading is actually useful is, however, a different question.
Situation 1: A web server that does require some processing and has low hits/second
Here multi-threading (if applicable to the algorithm) is a good thing, as idle cores are utilized and threading can result in a faster response to the user.
Situation 2: A web server that does require some processing and has high hits/second
Here multi-threading is possible, but as the cores are usually busy with other requests, there are no resources left to use it properly. Spreading the task across several threads can even have a negative impact on response time: the task is now fragmented and all parts need to complete, but the order in which threads execute is undefined. So one client could receive a response immediately, while others might run into a timeout waiting for their last fragment to be processed.
Situation 3: A web server has to do some processing that takes a very long time
Here multi-threading is required, there is no way around it. A client cannot wait minutes or probably hours till it receives the response. In this case a callback system is usually implemented, so basically each task has an "API" that can be queried for the current state. Most online-shops are an example for this: you order something and later you can query your order status.
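
A status "API" of the kind described for Situation 3 could look roughly like this. It is only a sketch: `JobRegistry`, its `Status` values, and the in-memory map are assumptions for illustration, and a real system would persist job state somewhere that survives a restart.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class JobRegistry {
    enum Status { PENDING, RUNNING, DONE, FAILED }

    private final Map<String, Status> jobs = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // Queue the work and hand back an id the client can use to ask for status.
    String submit(Runnable work) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, Status.PENDING);
        pool.execute(() -> {
            jobs.put(id, Status.RUNNING);
            try {
                work.run();
                jobs.put(id, Status.DONE);
            } catch (RuntimeException e) {
                jobs.put(id, Status.FAILED);
            }
        });
        return id;
    }

    // Returns null for an unknown id.
    Status status(String id) {
        return jobs.get(id);
    }

    // Stop accepting jobs and wait up to 5 seconds for running ones.
    boolean shutdownAndWait() {
        pool.shutdown();
        try {
            return pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The request that starts the job returns the id right away; a second endpoint would simply call `status(id)`.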
The alternative to threading is process-forking, as Apache does in its standard configuration. The benefit is that load is spread across cores (mostly applicable to situation 2), and the web-code itself doesn't have to do anything to use all those cores, as the OS handles that automatically. However if you have imbalanced load, some cores can be idle and resources are not used in an optimal way. A threading situation is almost always the better solution, if it is done right. But the Apache/Tomcat standard configuration uses a very outdated threading model, by spawning one thread for each request. Effectively given a certain amount of hits/second, the CPU is more busy with threading than with actually processing those requests.

Well, this is a nice question, and I think most developers who work on web applications don't use multithreading explicitly.
The reason is fairly obvious: since you deploy your application to an application server, the server internally manages a thread pool for incoming requests.
Then why use multithreading explicitly? Why would a web application developer need to deal with multithreading at all?
When you work on a large-scale application that has to serve many requests concurrently, it is difficult to serve every kind of request synchronously, because a particular kind of request may do a lot of processing, which could degrade the performance of your application.
Let's take an example where a web application, after serving a particular kind of request, has to notify users by email and SMS. Doing this synchronously on the request thread could degrade the performance of your web application. This is where multithreading comes in.
In such cases it is advisable to develop a standalone multithreaded application, reachable over the network, that is responsible only for sending email and SMS.

Multi-threading in a web application can be used when you want actions to run in parallel, e.g., fetching data from multiple addresses.
As I understand it, this use of multi-threading is a different situation from the container's thread pool, which is used to handle requests from multiple clients.
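
As a rough sketch of the "fetch from multiple addresses" case, the following runs one lookup per address in parallel and collects the results in the original order. The `fetcher` function is a stand-in for whatever HTTP or database call you would really make, and the cap of 8 threads is an arbitrary assumption.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

class ParallelFetch {
    // Runs fetcher once per address, in parallel, and returns results in input order.
    static List<String> fetchAll(List<String> addresses, Function<String, String> fetcher) {
        int threads = Math.max(1, Math.min(addresses.size(), 8));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Start every fetch before waiting on any of them.
            List<CompletableFuture<String>> futures = addresses.stream()
                    .map(a -> CompletableFuture.supplyAsync(() -> fetcher.apply(a), pool))
                    .collect(Collectors.toList());
            // join() blocks until each result is available.
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }
}
```

With slow fetches, the total time is roughly that of the slowest single fetch rather than the sum of all of them, as long as there are enough threads.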

Related

asynchronous http request handling with tomcat and spring

It's my first SO question so be patient with me :)
I'm trying to create a service that:
Receives HTTP GET requests containing a URL to query
For a single GET request the service extracts the URL
Queries a local DB about the URL
If a result was found in the DB it will return it to the client and if not it will need to query some external services (that may take relatively long time to respond)
Return the result of the URL to the client
I'm running this on a virtual machine and Tomcat7 with spring.
I'll apologize in advance and mention that I'm pretty new to Tomcat
Anyway, I'm expecting a lot of concurrent GET requests to this service (hundreds of thousands of simultaneous requests)
What I'm basically trying to achieve is to make this service as scalable as possible (and if that's not possible then at least a service that can handle hundreds of thousands of simultaneous requests)
I've been reading A LOT about asynchronous requests handling in services and especially in Tomcat but I have some things that are still unclear to me:
From the official tomcat website it seems that Tomcat contains number of acceptor threads and number of working threads.
If so, why should I use AsyncContext? What's the benefit of releasing one of Tomcat's worker threads and occupying a different thread in my application to do the exact same work? (There's still 1 active thread in the system.)
Somewhat similar to the first question but are there any benefits for creating the AsyncContext and using it with a different thread? (a thread from a thread pool created in my application)
Regarding the same issue, I've seen here that I can also return a Callable or a DeferredResult and process it with either one of Tomcat's threads or with one of my own threads. Are there any benefits for returning a Callable or using a DeferredResult over just processing the AsyncContext from the requests?
Also, if I decide to return a Callable, from which thread pool does Tomcat get the thread to process it? Are the threads used here the same Tomcat worker threads that I mentioned earlier? If so, what benefit do I get from releasing one Tomcat worker thread and using a different one instead?
I've seen in Oracle's documentation that I can pass AsyncContext a Runnable object that will be processed concurrently. Where do the threads used to execute this Runnable come from? Do I have any control over them? Also, are there any benefits to passing the AsyncContext a Runnable over just passing the AsyncContext to one of my own threads?
I apologize for asking so many questions about the same things, but my colleagues and I have been arguing over them for more than a week without any concrete answer.
I have 1 more general question:
What do you think is the best way to make the service I described scalable (putting aside adding more machines for the moment)? Could you post any examples or references for the proposed solution?
I'd post more of the links I've been looking at, but my current reputation doesn't allow it.
I'll be grateful for any understandable references or for concrete examples and I'll obviously be happy to clarify on any relevant issue
Cheers!

There are a lot of questions packed into this, but I'll try to address some of them.
Asynchronous I/O is a good thing, especially on servers that serve large volumes of requests - it allows you to use fewer threads to process more requests. In the case of a proxy such as you are writing, you really want your HTTP client (that makes the requests to foreign URLs) to be asynchronous as well, so that neither processing the request nor receiving the remote response involves blocking I/O.
That said, you may have a harder time doing this stuff with Tomcat or Java EE servers in general, which have had asynchronous I/O bolted onto them as an afterthought, than using a framework like Netty that is asynchronous from the ground up. As the author of a framework which builds on top of Netty, I'm a bit biased.
To demonstrate how little code you'd need to do what you describe, I wrote a small server that does what you describe here in 3 Java source files and put it on github - it builds a standalone JAR you can run with java -jar to try it out, and I tried to comment it clearly.
What it comes down to is, networked applications spend most of their time waiting for I/O to happen. In the case of a proxy in particular, with traditional, threaded I/O, you would get a request, and the thread that received the request would be responsible for answering it synchronously - that means, if it has to make a network request to another server, that thread is blocked waiting for the answer to come from the remote server. Meaning that thread can't be used for anything else. So, if you have 10 threads, and all of them are waiting on responses, your server can't answer any more requests until one of them finishes and frees up a thread. With asynchronous I/O, you get a callback when some I/O completes. In other words, instead of standing still until the OS flushes your data to the socket and out the network card, your code simply gets a friendly tap on the shoulder when there is something to do (like a response arriving from your proxy request). While your code is waiting for that HTTP request to complete, the thread that sent the proxy request is free to be used to handle another request. That means one thread can do a little work on one request, do a little work on another, and another, and eventually finish the first request. Since threads are a finite resource provided by your operating system, this allows you to do a lot more with a lot less hardware.
As to Callable vs. DeferredResult, using a Callable just moves when the work happens around (the Callable gets executed later, on some thread or other, but is still expected to return a result synchronously); DeferredResult sounds more like what you'd need, since that allows your code to go off and do whatever work it wants, and then set the result (triggering completion of the response) whenever it has something to set.
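
To make the "set the result later" idea concrete without pulling in Spring, here is a sketch that uses the JDK's `CompletableFuture` as a stand-in for `DeferredResult`. It is not Spring's API; `DeferredLookup`, the `localDb` map, and the `externalService` function are hypothetical names modeled on the service described in the question.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class DeferredLookup {
    // Answers from the local DB when possible; otherwise returns a future that
    // the external client completes later, so no thread sits blocked meanwhile.
    static CompletableFuture<String> lookup(
            String url,
            Map<String, String> localDb,
            Function<String, CompletableFuture<String>> externalService) {
        String cached = localDb.get(url);
        if (cached != null) {
            // Already known: the result is available immediately.
            return CompletableFuture.completedFuture(cached);
        }
        // Not known: the caller gets a placeholder that is filled in whenever
        // the (assumed non-blocking) external call finishes.
        return externalService.apply(url);
    }
}
```

The point of the design is that the calling thread returns as soon as `lookup` returns; whoever eventually calls `complete(...)` on the pending future triggers the rest of the response handling.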
Honestly, I think if you want to implement this really efficiently, you'd be better off staying away from the Java EE stack - so much of it has baked in assumptions that I/O is synchronous that trying to do async stuff with it is swimming upstream (for example, JDBC has synchronous I/O baked into its bones - if you really want this to scale and you want to use an SQL database, you'd be better off with something like this ).
For another example of using Netty for this sort of thing, see the tiny-maven-proxy project - the code is less pretty, but it shows an example of doing an HTTP proxy where the response body is fed to the client chunk-by-chunk, as it arrives - so you never actually pull the full response body into memory, meaning even requests with huge responses won't run the proxy out of memory. Tiny-maven-proxy also caches on the filesystem. I didn't do those things in the demo because it would have made the code more complicated.

Monitor database with GWT

Maybe I'm overthinking this, but I'd like some advice. Customers can place an order inside my GWT application, and on a secondary computer I want to monitor those submissions inside the GWT application and flash an alarm every time an order is submitted, provided the user has OK'd this. I can't figure out the best way to do this. Orders are submitted to a MySQL database, if that makes any difference. Does anyone have a suggestion on what to do or try?

There are two options: 1) polling or 2) pushing which would allow your server (in the servlet handling the GWT request) to notify you (after the order is successfully placed).
In 1) polling, the client (meaning the browser you are using to monitor the app) will periodically call the server to see if there is data waiting. It may be more resource intensive as many calls are made for infrequent data. It may also be slower due to the delay between calls. If only your monitoring client is calling though it wouldn't be so resource intensive.
In 2) pushing, the client makes a request and the request is held open until there is data. It is less resource intensive and can be faster. Once data is returned, the client sends another request (this is long polling). Alternatively, streaming is an option, where the server doesn't send a complete response and just keeps sending data. Streaming requires a client- or browser-specific implementation, though. If it's just you monitoring, you should know the client and could set it up specifically for that.
See the demo project in GWT Event Service
Here is the documentation (user manual) for it.
Also see GWT Server Push FAQ
There are other ways of doing it other than GWT Event Service of course. Just google "GWT server push" and you'll find comet, DWR, etc., and if you are using Google's App Engine the Channel API

Update vs. Request-Reply in Storm DRPC

I'm building a real-time API handling 2 types of calls:
Updates,
Computation requests.
Internally, the updates are broadcast among workers. The workers keep working data structures (such as hash tables) in RAM and modify their contents as updates come in.
When a computation request comes, exactly one idle worker handles it, using multiple threads, working with the local copy in RAM.
I'm wondering whether I could migrate my current implementation to Storm. As I understand it, Storm is pretty real-time and could help me a lot with scalability and fault-tolerance.
Currently, I'm using UWSGI/Python to handle the API requests, and Java workers to do the computation. I'm thinking of putting the Java workers into the Storm topology as bolts. However, I'm not quite sure about the spouts.
As I understand it, I could use DRPC to handle the computation requests, just by connecting to a DRPC server from python. It is clearly written in the docs that DRPC can handle the whole life-cycle of the request-reply paradigm. But what about updates?
My question is: Is it a good idea (or is it even possible?) to use DRPC only to submit updates in a non-blocking manner, without waiting for replies (because there are no results)?

For non-blocking, asynchronous updates you should use a job server like Gearman.
This lets you submit a job without waiting for any response. Gearman is used by Instagram to share photos to Facebook/Twitter whenever a user uploads a photo using the Instagram app.

Architectural issue with Tomcat cluster environment

I am working on project in which we have an authentication mechanism. We are following the below steps in the authentication mechanism.
The user opens a browser and enter his/her email in a text box and click the login button.
The request goes to a server. We generate a random string (for example, 123456), send a notification to the user's Android/iPhone, and make the current thread wait with the help of the wait() method.
The user enters a password on his/her phone and clicks the submit button on his/her phone.
Once the user clicks the submit button, we are making a webservice hit the server and passing the previously generated string (for example, 123456) and password.
If the password is correct against the previously entered email, we call the notify() method to the previously waiting thread and send success as the response and the user gets entered into our system.
If the password is incorrect against the previously entered email, we call the notify() method to the previously waiting thread and send failed as the response and display an invalid credential message to the user.
Everything was working fine, but recently we moved to a clustered environment. We found that some threads are not notified even after the user has replied, and they wait indefinitely.
For the server, we are using Tomcat 5.5, and we are following The Apache Tomcat 5.5 Servlet/JSP Container documentation to set up the Tomcat cluster.
Answer :: Possible problem and solution
The possible problem is the multiple JVMs in a clustered environment. Now we are also sending the clustered Tomcat URL to the user Android application along with generated string.
And when the user clicks on the reply button, we are sending the generated string along with the clustered Tomcat URL so in this case both requests are going to the same JVM, and it works fine.
But I am wondering if there is a single solution for the above issue.
There is a problem in this solution. What happens if the clustered Tomcat crashes? The load balancer will send a request to the second clustered Tomcat and again the same problem will arise.

The underlying reason for your problems is that Java EE was designed to work in a different way - attempting to block/wait on a service thread is one of the important no-no's. I'll give the reason for this first, and how to solve the issue after that.
Java EE (both the web and EJB tier) is designed to be able to scale to very large size (hundreds of computers in a cluster). However, in order to do that, the designers had to make the following assumptions, which are specific limitations on how to code:
Transactions are:
Short lived (eg don't block or wait for periods greater than a second or so)
Independent of each other (eg no communication between threads)
For EJBs, managed by the container
All user state is maintained in specific data storage containers, including:
A data store accessed through, eg, JDBC. You can use a traditional SQL database or a NoSQL backend
Stateful session beans, if you use EJBs. Think of these as Java Beans that persist their fields to a database. Stateful session beans are managed by the container
Web session: this is a key-value store (kinda like a NoSQL database but without the scale or search capabilities) that persists data for a specific user over their session. It's managed by the Java EE container and has the following properties:
It will automatically relocate if the node crashes in a cluster
Users can have more than one current web session (i.e. on two different browsers)
Web sessions end when the user ends their session by logging out, or when the session is inactive for longer than the configurable timeout.
All values that are stored must be serializable for them to be persisted or transferred between nodes in a cluster.
If we follow those rules, the Java EE container can successfully manage a cluster, including shutting down nodes, starting new ones and migrating user sessions, without any specific developer code. Developers write the graphical interface and the business logic - all the 'plumbing' is managed by configurable container features.
Also, at run time, the Java EE container can be monitored and managed by some pretty sophisticated software that can trace application performance and behavioural issues on a live system.
< snark >Well, that was the theory. Practice suggests there are pretty important limitations that were missed, which led to AOP and code injection techniques, but that's another story < /snark >
[There are many discussions around the 'net on this. One which focuses on EJBs is here: Why is spawning threads in Java EE container discouraged? Exactly the same is true for web containers such as Tomcat]
Sorry for the essay - but this is important to your problem. Because of the limitations on threads, you should not block on the web request waiting for another, later request.
Another problem with the current design is what should happen if the user becomes disconnected from the network, runs out of power, or simply decides to give up? Presumably you will time out, but after how long? Just too soon for some customers, perhaps, which will cause satisfaction problems. If the timeout is too long, you could end up blocking all worker threads in Tomcat and the server will freeze. This opens your organisation up for a denial of service attack.
EDIT : Improved suggestions after a more detailed description of the algorithm was published.
Notwithstanding the discussion above on the bad practice of blocking a web worker thread and also the possible denial of service, it's clear that the user is presented with a small time window in which to react to the notification on the Android phone, and this can be kept reasonably small to enhance security. This time window can also be kept below Tomcat's timeout for responses. So the thread blocking approach could be used.
There are two ways this problem can be resolved:
Change the focus of the solution to the client end - polling the server using Javascript on the browser
Communication between nodes in the cluster allowing the node receiving the authorization response from the Android App to unblock the node blocking the servlet's response.
For approach 1, the browser polls the server via Javascript with an AJAX call to a web service on Tomcat; the AJAX call returns True if the Android app authenticated. Advantage: client side, minimal implementation on the server, no thread blocking on the server. Disadvantages: During the waiting period, you have to make frequent calls (maybe one a second - the user will not notice this latency) which amounts to a lot of calls and some additional load on the server.
For approach 2, there is again choice:
Block the thread with an Object.wait() optionally storing the node ID, IP or other identifier in a shared data store: If so, the node receiving the Android app authorization needs to:
Either find the node that is currently blocking or broadcast to all nodes in the cluster
For each node in 1. above, send a message that identifies the user session to unblock. The message could be sent via:
Have an internal-only servlet on each node - this is called by the servlet performing the Android app authorization. The internal servlet will call Object.notify() on the correct thread
Use a JMS pub-sub message queue to broadcast to all members of the cluster. Each node is a subscriber that, on receipt of a notification will call Object.notify() on the correct thread.
Poll a data store until the thread is authorized to continue: In this case, all the Android app needs to do is save the state in a SQL DB

Using wait/notify can be tricky. Remember that any thread can be suspended at any time. So it's possible for notify to be called before wait, in which case wait will then block for ever.
I wouldn't expect this in your case, as you have user interaction involved. But for the type of synchronisation you are doing, try using a Semaphore. Create a Semaphore with 0 (zero) quantity. The waiting thread calls acquire() and it will block until another thread calls release().
Using a Semaphore in this way is much more robust than wait/notify for the task you described.
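
A minimal sketch of that idea, using `java.util.concurrent.Semaphore`. The class name `ReplyGate` and the timeout parameter are assumptions for the example. Note that, like wait/notify, this only coordinates threads inside a single JVM, so on its own it does not solve the cluster problem.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class ReplyGate {
    // Zero permits: acquire blocks until someone releases.
    private final Semaphore permits = new Semaphore(0);

    // Called by the thread that handles the user's reply.
    void signalReply() {
        permits.release();
    }

    // Called by the waiting thread; returns false if no reply arrived in time.
    boolean awaitReply(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because a released permit is remembered, it does not matter whether `signalReply()` runs before or after `awaitReply(...)`, which is exactly the case where a bare notify() would be lost. The timeout also keeps a thread from waiting forever if the user never replies.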

Consider using an in-memory grid so that the instances in the cluster can share state. We used Hazelcast to share data between instances so in case a response reaches a different instance it still can handle it.
E.g. you could use a distributed countdown latch with a value of 1: the first thread waits on it after sending the message, and when the response from the client arrives at a different instance, that instance decrements the latch to 0, letting the first thread run.

Your clustered deployment means that any node in the cluster could receive any response.
Using wait/notify using threads for a web app risks accumulating a lot of threads that may not be notified which could leak memory or create a lot of blocked threads. This could eventually affect the reliability of your server.
A more robust solution would be to send the request to the Android app, store the current state of the user's request for later processing, and complete the HTTP request. To store the state you could consider:
A database that all tomcat nodes connect to
A java cache solution that will work across tomcat nodes like hazelcast
This state would be visible to all nodes in your tomcat cluster.
When the reply from the android app arrives on a different node, restore the state of what your thread was doing and continue processing on that node.
If the UI of the application is waiting on a response from the server, you might consider using an ajax request to poll for the response state from the server. The node processing the android app response does not need to be the same one handling UI requests.

Using Object.wait() in a web service environment is a colossal mistake. Instead, maintain a database of user/token pairs and expire them at intervals.
If you want a cluster, then use a database that is clusterable. I would recommend something like memcached since it's in-memory (and fast) and low on overhead (key/value pairs are dead simple, so you don't need RDBMS, etc.). memcached handles expiration of tokens for you already, so it seems like a perfect fit.
I think the username -> token -> password strategy is unnecessary, especially because you have two different components sharing the same 2-factor authentication responsibility. I think you can further reduce your complexity, reduce confusion for your users, and save yourself some money in SMS-send fees.
The interaction with your web service is simple:
User logs into your website using username + password
If primary authentication (username/password) is successful, generate a token and insert userid=token into memcached
Send the token to the user's phone
Present "enter token" page to the user
User receives token via phone and enters it into the form
Fetch the token value from memcached based upon the user's id. If it matches, expire the token in memcached and consider the second-factor successful
Tokens will auto-expire after whatever amount of time you want to set in memcached
There are no threading problems with the above solution and it will scale across as many JVMs as you need to support your own software.
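
The token steps above can be sketched as follows. This is an in-memory stand-in, not memcached: `TokenStore`, the injectable clock, and the method names are assumptions for illustration, and in a cluster the map would be replaced by a shared store such as memcached so that every JVM sees the same tokens.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

class TokenStore {
    private static final class Entry {
        final String token;
        final long expiresAtMillis;

        Entry(String token, long expiresAtMillis) {
            this.token = token;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested

    TokenStore(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Called after primary authentication succeeds (userid=token).
    void issue(String userId, String token) {
        entries.put(userId, new Entry(token, clock.getAsLong() + ttlMillis));
    }

    // Called when the user submits the token from the phone.
    boolean verify(String userId, String submitted) {
        Entry entry = entries.get(userId);
        if (entry == null) {
            return false;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(userId, entry); // expired: drop it
            return false;
        }
        if (!entry.token.equals(submitted)) {
            return false;
        }
        // Expire on success; only the first successful caller gets true.
        return entries.remove(userId, entry);
    }
}
```

No request thread ever waits here: each HTTP request does one short lookup and returns.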

After analysing your question, I came to the conclusion that the exact problem is of multiple JVMs in a clustered environment.

The exact problem is because of the cluster environment. Both requests are not going to the same JVM. But we know that a normal/simple notify works on the same JVM when the previous thread is waiting.
You should try to execute both requests on the same JVM (the first request, and the second request when the user replies from the Android application).

I'm afraid threads cannot migrate across classic Java EE clusters.
You have to rethink your architecture to implement the wait/notify differently (connection-less).
Or, you may give terracotta.org a try. It looks like it lets you cluster an entire JVM process over multiple machines. Maybe it's your only solution.
Read a quick introduction in Introduction to OpenTerracotta.

I guess the problem is that your first thread sends a notification to the user's Android application from JVM 1, and when the user replies, control goes to JVM 2. That's the main problem.
Somehow, both threads need to reach the same JVM for the wait and notify logic to work.

Solution:
Create a single point of contact for all waiting threads. In a clustered environment, all the threads then wait on a third JVM (the single point of contact), so requests from any clustered Tomcat contact the same JVM for the wait and notify logic, and no thread waits for an unlimited time. When a reply arrives, the waiting thread is notified, because the wait and the notify are both performed on the same object in that JVM.

How to optimize number of database connections?

We have a Java (Spring) web application with Tomcat servlet container.
We have something like a blog.
But the blog must load its posts dynamically with Ajax.
The client's Ajax script checks for new posts every second.
I.e. Ajax must ask the server for new posts every second, and that will be very heavy for the database.
But what if we have hundreds of thousands of simultaneous connections?
I think we must retrieve all posts with cron every second and then save them somewhere. But where? The main idea is to take load off the database.
Any ideas about architecture?
Thanks in advance!

There is another architecture for polling that could be more optimal, depending on the case:
Long polling
Long polling is a variation of the traditional polling technique and allows emulation of an information push from a server to a client. With long polling, the client requests information from the server in a similar way to a normal poll. However, if the server does not have any information available for the client, instead of sending an empty response, the server holds the request and waits for some information to be available. Once the information becomes available (or after a suitable timeout), a complete response is sent to the client. The client will normally then immediately re-request information from the server, so that the server will almost always have an available waiting request that it can use to deliver data in response to an event. In a web/AJAX context, long polling is also known as Comet programming.
Long Polling
Example of Implementations of this technology:
Push Server
You could also use the observer pattern to register the requests, and notify them when an update is done.
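
A rough sketch of that observer idea: held requests register as waiters, and publishing a post completes all of them at once. `PostNotifier` and its method names are made up for the example, and a real long-polling endpoint would also need a timeout and a non-blocking way to hold the request open (for instance Servlet 3.0 async support), which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class PostNotifier {
    private final List<CompletableFuture<String>> waiting = new ArrayList<>();

    // Called for each held-open request: register as an observer.
    synchronized CompletableFuture<String> awaitNextPost() {
        CompletableFuture<String> waiter = new CompletableFuture<>();
        waiting.add(waiter);
        return waiter;
    }

    // Called once when a new post is saved: wake every registered request.
    void publish(String post) {
        List<CompletableFuture<String>> toNotify;
        synchronized (this) {
            // Take a snapshot and reset the list under the lock, so a waiter
            // that registers concurrently is never dropped without an answer.
            toNotify = new ArrayList<>(waiting);
            waiting.clear();
        }
        for (CompletableFuture<String> waiter : toNotify) {
            waiter.complete(post);
        }
    }
}
```

With this shape the database is touched once per new post rather than once per client per second.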

Hundreds of thousands of concurrent users all polling your site every second makes for a huge amount of traffic. If you truly expect this load, you are going to have to design your platform accordingly, probably by clustering multiple web, application and DB servers.
Remember that with a database connection pool you don't need a DB connection for every user.

I'm not as familiar with Tomcat, but in WebSphere we can set up connection pools to prepare a certain number of connections.
Also, are you mainly worried about reads or the same number of writes?
Plus, you may also want to have the database "split" depending on region etc. This way there is no single heavy load across the entire database, but it can then be split and even load balanced.
There are also the "NoSQL" databases to look into. Maybe something to consider. Just ideas to help out.
