Architectural issue with Tomcat cluster environment


I am working on a project that has an authentication mechanism. It follows the steps below.
The user opens a browser, enters his/her email in a text box and clicks the login button.
The request goes to a server. We generate a random string (for example, 123456), send a notification to the user's Android/iPhone and make the current thread wait with the help of the wait() method.
The user enters a password on his/her phone and clicks the submit button on his/her phone.
Once the user clicks the submit button, the phone calls a web service on the server, passing the previously generated string (for example, 123456) and the password.
If the password is correct for the previously entered email, we call notify() on the previously waiting thread and send success as the response, and the user is logged in to our system.
If the password is incorrect for the previously entered email, we call notify() on the previously waiting thread, send failed as the response, and display an invalid credential message to the user.
Everything was working fine, but recently we moved to a clustered environment. We found that some threads are never notified, even after the user has replied, and they wait indefinitely.
For the server, we are using Tomcat 5.5, and we followed The Apache Tomcat 5.5 Servlet/JSP Container documentation to set up the Tomcat cluster.
Answer :: Possible problem and solution
The possible problem is the multiple JVMs in a clustered environment. Now we are also sending the clustered Tomcat URL to the user Android application along with generated string.
And when the user clicks on the reply button, we are sending the generated string along with the clustered Tomcat URL so in this case both requests are going to the same JVM, and it works fine.
But I am wondering if there is a single solution for the above issue.
There is a problem in this solution. What happens if the clustered Tomcat crashes? The load balancer will send a request to the second clustered Tomcat and again the same problem will arise.

The underlying reason for your problems is that Java EE was designed to work in a different way - attempting to block/wait on a service thread is one of the important no-no's. I'll give the reason for this first, and how to solve the issue after that.
Java EE (both the web and EJB tier) is designed to be able to scale to very large size (hundreds of computers in a cluster). However, in order to do that, the designers had to make the following assumptions, which are specific limitations on how to code:
Transactions are:
Short lived (eg don't block or wait for periods greater than a second or so)
Independent of each other (eg no communication between threads)
For EJBs, managed by the container
All user state is maintained in specific data storage containers, including:
A data store accessed through, eg, JDBC. You can use a traditional SQL database or a NoSQL backend
Stateful session beans, if you use EJBs. Think of these as a Java Bean that persists its fields to a database. Stateful session beans are managed by the container
Web session: this is a key-value store (kinda like a NoSQL database but without the scale or search capabilities) that persists data for a specific user over their session. It's managed by the Java EE container and has the following properties:
It will automatically relocate if the node crashes in a cluster
Users can have more than one current web session (i.e. on two different browsers)
Web sessions end when the user ends their session by logging out, or when the session is inactive for longer than the configurable timeout.
All values that are stored must be serializable for them to be persisted or transferred between nodes in a cluster.
If we follow those rules, the Java EE container can successfully manage a cluster, including shutting down nodes, starting new ones and migrating user sessions, without any specific developer code. Developers write the graphical interface and the business logic - all the 'plumbing' is managed by configurable container features.
Also, at run time, the Java EE container can be monitored and managed by some pretty sophisticated software that can trace application performance and behavioural issues on a live system.
< snark >Well, that was the theory. Practice suggests there are pretty important limitations that were missed, which led to AOP and code injection techniques, but that's another story < /snark >
[There are many discussions around the 'net on this. One which focuses on EJBs is here: Why is spawning threads in Java EE container discouraged? Exactly the same is true for web containers such as Tomcat]
Sorry for the essay - but this is important to your problem. Because of the limitations on threads, you should not block on the web request waiting for another, later request.
Another problem with the current design is what should happen if the user becomes disconnected from the network, runs out of power, or simply decides to give up? Presumably you will time out, but after how long? Just too soon for some customers, perhaps, which will cause satisfaction problems. If the timeout is too long, you could end up blocking all worker threads in Tomcat and the server will freeze. This opens your organisation up for a denial of service attack.
EDIT : Improved suggestions after a more detailed description of the algorithm was published.
Notwithstanding the discussion above on the bad practice of blocking a web worker thread and also the possible denial of service, it's clear that the user is presented with a small time window in which to react to the notification on the Android phone, and this can be kept reasonably small to enhance security. This time window can also be kept below Tomcat's timeout for responses. So the thread blocking approach could be used.
There are two ways this problem can be resolved:
Change the focus of the solution to the client end - polling the server using Javascript on the browser
Communication between nodes in the cluster allowing the node receiving the authorization response from the Android App to unblock the node blocking the servlet's response.
For approach 1, the browser polls the server via Javascript with an AJAX call to a web service on Tomcat; the AJAX call returns True if the Android app authenticated. Advantage: client side, minimal implementation on the server, no thread blocking on the server. Disadvantages: During the waiting period, you have to make frequent calls (maybe one a second - the user will not notice this latency) which amounts to a lot of calls and some additional load on the server.
For approach 2, there is again choice:
Block the thread with an Object.wait() optionally storing the node ID, IP or other identifier in a shared data store: If so, the node receiving the Android app authorization needs to:
Either find the node that is currently blocking or broadcast to all nodes in the cluster
For each node in 1. above, send a message that identifies the user session to unblock. The message could be sent via:
Have an internal-only servlet on each node - this is called by the servlet performing the Android app authorization. The internal servlet will call Object.notify on the correct thread
Use a JMS pub-sub message queue to broadcast to all members of the cluster. Each node is a subscriber that, on receipt of a notification will call Object.notify() on the correct thread.
Poll a data store until the thread is authorized to continue: In this case, all the Android app needs to do is save the state in a SQL DB
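The polling variants above (browser polling in approach 1, or a thread polling a data store) both reduce to the same small state machine. The following is a minimal sketch of it, not the poster's code: `PendingLoginStore`, `LoginStatus` and the method names are hypothetical, and the in-memory map only stands in for the shared SQL database that every Tomcat node would read and write.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Status of one login attempt, keyed by the generated string (e.g. "123456").
enum LoginStatus { PENDING, APPROVED, REJECTED, UNKNOWN }

// Hypothetical store. In a cluster this would be backed by the shared
// database; the map below is only an in-memory stand-in for it.
class PendingLoginStore {
    private final Map<String, LoginStatus> rows = new ConcurrentHashMap<>();

    // Browser login request: record the attempt and return immediately.
    void start(String token) {
        rows.put(token, LoginStatus.PENDING);
    }

    // Phone request, possibly handled by another node: record the outcome.
    // Only a PENDING attempt can be resolved, and only once.
    boolean resolve(String token, boolean passwordOk) {
        LoginStatus next = passwordOk ? LoginStatus.APPROVED : LoginStatus.REJECTED;
        return rows.replace(token, LoginStatus.PENDING, next);
    }

    // Poll request: report the current state without blocking.
    LoginStatus status(String token) {
        return rows.getOrDefault(token, LoginStatus.UNKNOWN);
    }

    public static void main(String[] args) {
        PendingLoginStore store = new PendingLoginStore();
        store.start("123456");
        store.resolve("123456", true);
        System.out.println(store.status("123456"));
    }
}
```

Because no request waits on another request, it does not matter which node serves the browser, the phone, or the poll.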

Using wait/notify can be tricky. Remember that any thread can be suspended at any time. So it's possible for notify to be called before wait, in which case wait will then block for ever.
I wouldn't expect this in your case, as you have user interaction involved. But for the type of synchronisation you are doing, try using a Semaphore. Create a Semaphore with 0 (zero) quantity. The waiting thread calls acquire() and it will block until another thread calls release().
Using Semaphore in this way is much more robust than wait/notify for the task you described.
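A minimal sketch of the Semaphore idea, assuming both requests reach the same JVM (so it does not solve the cluster problem by itself). `LoginGate` and its method names are hypothetical. A timeout is added so that a waiting thread cannot block for ever.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: one gate per pending login. It only works when both
// requests are handled by the same JVM.
class LoginGate {
    private final Semaphore permit = new Semaphore(0); // no permits at start
    private volatile boolean passwordOk;

    // Called by the thread that handles the reply from the phone.
    void complete(boolean ok) {
        passwordOk = ok;
        permit.release(); // the permit is kept even if nobody is waiting yet
    }

    // Called by the thread that handles the browser login request.
    // Returns true only if a positive reply arrived within the timeout.
    boolean await(long timeout, TimeUnit unit) {
        try {
            return permit.tryAcquire(timeout, unit) && passwordOk;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag set
            return false;
        }
    }

    public static void main(String[] args) {
        LoginGate gate = new LoginGate();
        gate.complete(true); // the reply arrives before anyone waits
        System.out.println(gate.await(1, TimeUnit.SECONDS));
    }
}
```

Unlike notify(), a release() that happens before the acquire is not lost, which is the robustness gain described above.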

Consider using an in-memory grid so that the instances in the cluster can share state. We used Hazelcast to share data between instances so in case a response reaches a different instance it still can handle it.
E.g. you could use a distributed countdown latch with a count of 1: the first thread waits on it after sending the message, and when the response from the client arrives at a different instance, that instance counts the latch down to 0, which lets the first thread run.

Your clustered deployment means that any node in the cluster could receive any response.
Using wait/notify using threads for a web app risks accumulating a lot of threads that may not be notified which could leak memory or create a lot of blocked threads. This could eventually affect the reliability of your server.
A more robust solution would be to send the request to the Android app, store the current state of the user's request for later processing, and complete the HTTP request. To store the state you could consider:
A database that all tomcat nodes connect to
A java cache solution that will work across tomcat nodes like hazelcast
This state would be visible to all nodes in your tomcat cluster.
When the reply from the android app arrives on a different node, restore the state of what your thread was doing and continue processing on that node.
If the UI of the application is waiting on a response from the server, you might consider using an ajax request to poll for the response state from the server. The node processing the android app response does not need to be the same one handling UI requests.

Using Object.wait() in a web service environment is a colossal mistake. Instead, maintain a database of user/token pairs and expire them at intervals.
If you want a cluster, then use a database that is clusterable. I would recommend something like memcached since it's in-memory (and fast) and low on overhead (key/value pairs are dead simple, so you don't need RDBMS, etc.). memcached handles expiration of tokens for you already, so it seems like a perfect fit.
I think the username -> token -> password strategy is unnecessary, especially because you have two different components sharing the same 2-factor authentication responsibility. I think you can further reduce your complexity, reduce confusion for your users, and save yourself some money in SMS-send fees.
The interaction with your web service is simple:
User logs into your website using username + password
If primary authentication (username/password) is successful, generate a token and insert userid=token into memcached
Send the token to the user's phone
Present "enter token" page to the user
User receives token via phone and enters it into the form
Fetch the token value from memcached based upon the user's id. If it matches, expire the token in memcached and consider the second-factor successful
Tokens will auto-expire after whatever amount of time you want to set in memcached
There are no threading problems with the above solution and it will scale across as many JVMs as you need to support your own software.
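The token steps above can be sketched as follows. This is an illustration under stated assumptions, not a memcached client: `TokenStore` is a hypothetical class whose in-memory map stands in for memcached, and the current time is passed in explicitly so the expiry logic is easy to follow. With real memcached, expiry would be handled by the server.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for "userid=token" entries in memcached.
class TokenStore {
    private static final class Entry {
        final String token;
        final long expiresAtMillis;

        Entry(String token, long expiresAtMillis) {
            this.token = token;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();
    private final long ttlMillis;

    TokenStore(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // After username/password succeed: create a six-digit token for the user.
    String issue(String userId, long nowMillis) {
        String token = String.format("%06d", random.nextInt(1_000_000));
        entries.put(userId, new Entry(token, nowMillis + ttlMillis));
        return token;
    }

    // When the user submits the token: it must match, be unexpired, and is
    // removed on success so it cannot be used twice.
    boolean verify(String userId, String candidate, long nowMillis) {
        Entry entry = entries.get(userId);
        if (entry == null) {
            return false;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            entries.remove(userId, entry); // drop the expired token
            return false;
        }
        if (!entry.token.equals(candidate)) {
            return false;
        }
        return entries.remove(userId, entry);
    }

    public static void main(String[] args) {
        TokenStore store = new TokenStore(60_000);
        String token = store.issue("alice", System.currentTimeMillis());
        System.out.println(store.verify("alice", token, System.currentTimeMillis()));
    }
}
```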

After analysing your question, I came to the conclusion that the exact problem is of multiple JVMs in a clustered environment.

The exact problem is because of the cluster environment. Both requests are not going to the same JVM. But we know that a normal/simple notify works on the same JVM when the previous thread is waiting.
You should try to execute both requests (first request, second request when the user replies from an Android application).

I'm afraid threads cannot migrate between nodes in classic Java EE clusters.
You have to rethink your architecture to implement the wait/notify differently (connection-less).
Or, you may give it a try with terracotta.org. It appears to allow clustering an entire JVM process over multiple machines. Maybe it's your only solution.
Read a quick introduction in Introduction to OpenTerracotta.

I guess the problem is that your first thread sends a notification to the user's Android application from JVM 1, and when the user replies, control goes to JVM 2. That's the main problem.
Somehow, both threads need to reach the same JVM for the wait and notify logic to work.

Solution:
Create a single point of contact for all waiting threads. In a clustered environment, all the threads will then wait on a third JVM (the single point of contact), so requests from any clustered Tomcat contact the same JVM for the wait and notify logic, and no thread waits for an unlimited time. When a reply arrives, the thread is notified because the wait and the notify happen on the same object in that JVM.

Related

Server-Sent events in scalable backend

I have deployed a Java web application in Heroku.
Now, I want to change the back-end so that it can notify connected users regarding specific events. I thought I could use server-sent events to do that and the way I thought it would work is the following:
When user opens up the front-end, it would establish a connection for the server-sent events.
When the back-end receives such a request, it would create such a connection (basically an EventOutput) and store it somewhere along with the user's ID (let's say in a Map in memory).
When a new event comes along, the back-end will find the user that needs to be notified, retrieve his connection according to his ID and send him the notification.
This works just fine when you have only one machine handling the requests.
My problem starts when I want to scale up my app and introduce more machines. Then, I cannot really store these connections in memory in one machine anymore, I need to use some centralized location. But the centralized location will need to serialize/deserialize the connection, which means that it's not the same connection anymore!
How do you usually do something like that?

One solution is to use session affinity (a.k.a. sticky sessions), which will ensure that a single session's requests are "always" routed to the same process (I say "always" because there are some caveats). You can turn this feature on by running this command:
$ heroku labs:enable http-session-affinity
In this way, you can keep things in memory and will not have to serialize the session.
Here is an article describing this feature in more detail: https://blog.heroku.com/archives/2015/4/28/introducing_session_affinity

You could use a pub-sub solution (ex: Redis pub-sub) that is accessible to each of your dynos.
On starting, your app subscribes to the appropriate channels. When an event happens, it is published to a channel. This means all instances of your app (spread across multiple dynos) receive that event, and any of them that have SSE connections open can respond to the event.
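A minimal sketch of that fan-out, with everything in one process so it can be run as-is. `EventChannel` stands in for the external pub-sub service (for example a Redis channel) and `DynoNode` for one app instance. Both names are hypothetical, and the list of strings per user only stands in for an open SSE connection.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Stand-in for the shared pub-sub channel: every subscriber sees every event.
class EventChannel {
    private final List<BiConsumer<String, String>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(BiConsumer<String, String> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(String userId, String event) {
        for (BiConsumer<String, String> subscriber : subscribers) {
            subscriber.accept(userId, event);
        }
    }
}

// One app instance. Connections stay in local memory and are never serialized.
class DynoNode {
    // userId -> events written to that user's open connection (stand-in for SSE)
    private final Map<String, List<String>> connections = new ConcurrentHashMap<>();

    DynoNode(EventChannel channel) {
        channel.subscribe(this::onEvent); // subscribe on startup
    }

    void connect(String userId) {
        connections.put(userId, new CopyOnWriteArrayList<>());
    }

    // Every node receives the event; only the node holding the connection acts.
    private void onEvent(String userId, String event) {
        List<String> out = connections.get(userId);
        if (out != null) {
            out.add(event);
        }
    }

    List<String> sent(String userId) {
        return connections.getOrDefault(userId, Collections.emptyList());
    }
}
```

The connection object never leaves the node that accepted it. Only the small event message crosses machine boundaries.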

Monitor database with GWT

Maybe I'm overthinking this but I'd like some advice. Customers can place an order inside my GWT application and on a secondary computer I want to monitor those submittals inside the GWT application and flash an alarm every time an order is submitted, provided the user has OK'd this. I can't figure out the best way to do this. Orders are submitted to a MySQL database if that makes any difference. Does anyone have a suggestion on what to do or try?

There are two options: 1) polling or 2) pushing which would allow your server (in the servlet handling the GWT request) to notify you (after the order is successfully placed).
In 1) polling, the client (meaning the browser you are using to monitor the app) will periodically call the server to see if there is data waiting. It may be more resource intensive as many calls are made for infrequent data. It may also be slower due to the delay between calls. If only your monitoring client is calling though it wouldn't be so resource intensive.
In 2) pushing, the client will make a request and the request will be held open until there is data. It is less resource intensive and can be faster. Once data is returned, the client sends another request (this is long polling). Alternatively, streaming is an option where the server doesn't send a complete response and just keeps sending data. This streaming option requires a client-/browser-specific implementation though. If it's just you monitoring though, you should know the client and could set it up specifically for that.
See the demo project in GWT Event Service
Here is the documentation (user manual) for it.
Also see GWT Server Push FAQ
There are other ways of doing it other than GWT Event Service of course. Just google "GWT server push" and you'll find comet, DWR, etc., and if you are using Google's App Engine the Channel API

role of multithreading in web application

I have been using Java (Servlets, JSPs) for 2 years for web application development. In those 2 years I never needed to use multithreading explicitly in any project (I know that servlet containers use threads to serve the same servlet to different requests).
But whenever I attend an interview for a Web Developer position (Java), there are several questions related to threads in Java. I know the basics of Java threading, so answering the questions is not a problem. But sometimes I wonder whether I am missing something while developing web applications by not using multithreading.
So my question is: what is the role of multithreading in a web application? Any example where multithreading can be used in a web application will be appreciated.
Thanks in advance.

Multi-threading can be used in Web Apps mainly when you are interested in asynchronous calls.
Consider for example you have a Web application that activates a user's state on a GSM network (e.g activate 4G plan) and sends a confirmatory SMS or email message at the end.
Knowing that the Web call would take several minutes - especially if the GSM network is stressed - it does not make sense to call it directly from the Web thread.
So basically, when a user clicks "Activate", the Server returns something like "Thanks for activating the 4G plan. Your plan will be activated in a few minutes and you will receive a confirmation SMS/email".
In that case, your server has to spawn a new thread, ideally using a thread pool, in an asynchronous manner, and immediately return a response to the user.
Workflow:
1- User clicks "Activate" button
2- Servlet receives request and activates a new "Activate 4G Plan" task in a thread pool.
3- Servlet immediately returns an HTML response to the user without waiting for the task to be finalized.
4- End of Http transaction
.
.
.
Asynchronously, the 4G plan gets activated later and the user gets notified through SMS or email, etc...
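The workflow above could look roughly like this in plain Java. It is a sketch under assumptions: `ActivationService` and its methods are hypothetical, the slow GSM call is replaced by a placeholder, and daemon threads are used only so the demo JVM can exit. In a servlet container you would normally shut the pool down when the application stops.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical service that runs slow activations outside the request thread.
class ActivationService {
    // Fixed-size pool; daemon threads so this demo never keeps the JVM alive.
    private final ExecutorService pool = Executors.newFixedThreadPool(4, runnable -> {
        Thread thread = new Thread(runnable, "activation-worker");
        thread.setDaemon(true);
        return thread;
    });

    // Step 2: hand the task to the pool. The future completes later.
    CompletableFuture<String> submitActivation(String userId) {
        return CompletableFuture.supplyAsync(() -> activate(userId), pool);
    }

    // Steps 2-3 as seen by the servlet: submit, then answer immediately.
    String handleRequest(String userId) {
        submitActivation(userId);
        return "Thanks for activating the 4G plan. You will receive a confirmation SMS/email.";
    }

    // Placeholder for the slow GSM network call and the SMS/email notification.
    private String activate(String userId) {
        return "4G plan activated for " + userId;
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

The request thread only pays for the submit call, so it is free again long before the activation finishes.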

Speaking about a real-world example, there are several reasons to use multi-threading, and I wouldn't hire a web-developer who doesn't know about it. But in the end, the reasons to use multi-threading are the same for standard- and web-development: you either want something that takes a while (aka blocking) done in the background to give the user some response in between, or you have a task that can be sped up by having it run on several cores. When multi-threading is actually useful is however a different question.
Situation 1: A web server that does require some processing and has low hits/second
Here multi-threading (if applicable to the algorithm) is a good thing, as idle cores are utilized and threading can result in a faster response to the user.
Situation 2: A web server that does require some processing and has high hits/second
Here multi-threading is possible, but as cores are usually busy with other requests, there are no resources left to use it properly. Actually spreading out the task to several threads can even have a negative impact on the response time, as the task is now fragmented and all parts need to complete, but the order of execution with threads is undefined. So one client could immediately receive a response, while others might wait into time-out till their last fragment eventually gets processed.
Situation 3: A web server has to do some processing that takes a very long time
Here multi-threading is required, there is no way around it. A client cannot wait minutes or probably hours till it receives the response. In this case a callback system is usually implemented, so basically each task has an "API" that can be queried for the current state. Most online-shops are an example for this: you order something and later you can query your order status.
The alternative to threading is process-forking, as Apache does in its standard configuration. The benefit is that load is spread across cores (mostly applicable to situation 2), and the web-code itself doesn't have to do anything to use all those cores, as the OS handles that automatically. However if you have imbalanced load, some cores can be idle and resources are not used in an optimal way. A threading situation is almost always the better solution, if it is done right. But the Apache/Tomcat standard configuration uses a very outdated threading model, by spawning one thread for each request. Effectively given a certain amount of hits/second, the CPU is more busy with threading than with actually processing those requests.

Well, this is a nice question, and I think most developers who work in web application development don't use multithreading explicitly.
The reason is quite obvious: since you are using an application server to deploy your application, the application server internally manages a thread pool for incoming requests.
Then why use multithreading explicitly? Why does a web application developer need to know about multithreading?
When you work on a large-scale application where you have to serve many requests concurrently, it is difficult to serve every kind of request synchronously, because a particular kind of request could involve a lot of processing, which could bring down the performance of your application.
Let's take an example where a web application, after serving a particular kind of request, has to notify users through email and SMS. Doing it synchronously with the request thread could bring down the performance of your web application. So here comes the role of multithreading.
In such cases it is advisable to develop a stand alone multithreaded application over the network which is responsible for sending email and SMS only.

Multi-threading in a web application can be used when you are interested in parallel action, e.g., fetching data from multiple addresses.
As I understand, multi-threading is used in different situation from thread-pool, which can be used to handle requests from multiple clients.

How to optimize number of database connections?

We have a Java (Spring) web application with Tomcat servlet container.
We have something like a blog.
But the blog must load its posts dynamically with Ajax.
The client's ajax script checks for new posts every second.
I.e. Ajax must ask the server for new posts every second and it will be very heavy for database.
But what if we have hundreds of thousands of simultaneous connections?
I think that we must retrieve all posts with cron every second and after that save it somewhere. But where? The main idea is to unload the database.
Any ideas about architecture?
Thanks in advance!

There is another architecture for polling that could be more optimal, depending on the case:
Long polling
Long polling is a variation of the traditional polling technique and allows emulation of an information push from a server to a client. With long polling, the client requests information from the server in a similar way to a normal poll. However, if the server does not have any information available for the client, instead of sending an empty response, the server holds the request and waits for some information to be available. Once the information becomes available (or after a suitable timeout), a complete response is sent to the client. The client will normally then immediately re-request information from the server, so that the server will almost always have an available waiting request that it can use to deliver data in response to an event. In a web/AJAX context, long polling is also known as Comet programming.
Long Polling
Example of Implementations of this technology:
Push Server
You could also use the observer pattern to register the requests, and notify them when an update is done.
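A minimal sketch of combining long polling with the observer idea, assuming a single JVM. `PostFeed` and its methods are hypothetical. Each held request registers a future, and publishing a post completes every registered future at once, so the database is read once per update rather than once per poll. Note that `awaitPost` blocks the calling thread for simplicity; a container with asynchronous request support could complete the response from `register()` instead.

```java
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical feed: held requests are observers of "new post" events.
class PostFeed {
    private final Queue<CompletableFuture<String>> waiters = new ConcurrentLinkedQueue<>();

    // Register one held request as an observer.
    CompletableFuture<String> register() {
        CompletableFuture<String> waiter = new CompletableFuture<>();
        waiters.add(waiter);
        return waiter;
    }

    // Called once when a post is saved: wake every held request.
    void publish(String post) {
        CompletableFuture<String> waiter;
        while ((waiter = waiters.poll()) != null) {
            waiter.complete(post);
        }
    }

    // Long poll: returns the next post, or null after the timeout so the
    // client can simply ask again.
    String awaitPost(long timeout, TimeUnit unit) {
        CompletableFuture<String> waiter = register();
        try {
            return waiter.get(timeout, unit);
        } catch (TimeoutException e) {
            waiters.remove(waiter); // stop observing
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            waiters.remove(waiter);
            return null;
        } catch (ExecutionException e) {
            return null;
        }
    }
}
```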

Hundreds of thousands of concurrent users all polling your site every second makes for a huge amount of traffic. If you truly expect this load you are going to have to design your platform accordingly, probably by clustering multiple web, application and DB servers.
Remember that with a database connection pool you don't need a DB connection for every user.

I'm not as familiar with Tomcat, but in WebSphere we can set up connection pools to prepare a certain number of connections.
Also, are you mainly worried about reads or the same number of writes?
Plus, you may also want to have the database "split" depending on region etc. This way there is no single heavy load across the entire database, but it can then be split and even load balanced.
There are also the "NoSQL" databases to look into. Maybe something to consider. Just ideas to help out.

Ensuring serial processing of JMS messages in an OC4J cluster

We have an application that processes JMS message using a message driven bean. This application is deployed on an OC4J application server. (10.1.3)
We are planning to deploy this application on multiple OC4J application servers that will be configured to run in a cluster.
The problem is with JMS message processing in this cluster. We must ensure, that only a single message is being processed in the entire OC4J cluster at a single time. This is required, since the messages have to be processed in chronological order.
Do you know of a configuration parameter, that would control message processing across an OC4J cluster?
Or do you think we have to implement our own synchronisation code that will synchronise the message driven beans across the cluster?

I've done sequential processing of messages in a cluster on a pretty large scale - 1.5 million+ message/day, using a combination of the Competing Consumers pattern and a Lease pattern.
Here's the kicker, though - your requirement that you can only process one trans at a time is going to keep you from achieving your goals. We had the same basic requirement - messages had to be processed in order. At least, we thought we did. Then we had an epiphany - as we gave the problem more thought, we realized that we didn't require total ordering. We actually required ordering only within each account. Therefore, we could distribute the load across the servers in a cluster by assigning ranges of accounts to different servers in the cluster. Then, each server was responsible to process messages for a given account in order.
Here's the second clever part - we used a Lease pattern to dynamically assign account ranges to various servers in the cluster. If one server in the cluster went down, another would grab the lease and take over the first server's responsibility.
This worked for us, and the process lived in production for about 4 years before being replaced due to a company merger.
Edit:
I explain this solution in more detail here: http://coders-log.blogspot.com/2008/12/favorite-projects-series-installment-2.html
Edit:
Okay, gotcha. You're already doing the processing at the level you need, but since you're being deployed to a cluster, you need to make sure that only one instance of your MDB is actively pulling messages from the queue. Plus, you need the simplest workable solution.
You don't need to abandon your MDB mechanism that you have now, I don't think. Essentially what we're talking about here is a requirement for a distributed lock mechanism, not to put too fancy a phrase to it.
So, let me suggest this. At the point where your MDB registers to receive messages from the queue, it should check the distributed lock, and see if it can grab it. The first MDB to grab the lock wins, and only it will register to receive messages. So, now you have your serialization. What form should this lock take? There are many possibilities. Well, how about this. If you have access to a database, its transactional locking already provides some of what you need. Create a table with a single row. In the row is the identifier of the server that currently holds the lock, and an expiration time. This is the server's lease. Each server needs to have a way to generate its unique identifier, perhaps the server name plus a thread ID, for example.
If a server can get update access to the row, and the lease is expired, it should grab it. Otherwise, it gives up. If it grabs the lease, it needs to update the row with a time in the near future, like five minutes or so, and commit the update. The active server should update the lease before it expires. I recommend updating it when there's half the time remaining, so, every 2-1/2 minutes if the lease expires in five. With this, you now have failover. If the active MDB dies, another MDB (and only one) will take over.
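The lease rules described above fit in a few lines. This is a sketch, not the original implementation: `LeaseTable` is a hypothetical class whose two fields stand in for the single database row, and `synchronized` stands in for the row lock a real database transaction would provide. The current time is passed in so the rules are easy to trace.

```java
// Hypothetical stand-in for the single-row lease table.
class LeaseTable {
    private String holder;        // id of the server holding the lease, or null
    private long expiresAtMillis; // when the current lease runs out

    // Grab the lease if it is free or expired, or renew it if we already hold
    // it. Returns true if serverId holds the lease after the call.
    synchronized boolean tryAcquireOrRenew(String serverId, long nowMillis, long leaseMillis) {
        boolean free = holder == null || nowMillis >= expiresAtMillis;
        if (free || serverId.equals(holder)) {
            holder = serverId;
            expiresAtMillis = nowMillis + leaseMillis;
            return true;
        }
        return false; // someone else holds a live lease: stay dormant
    }

    synchronized String holder() {
        return holder;
    }

    public static void main(String[] args) {
        LeaseTable lease = new LeaseTable();
        long fiveMinutes = 5 * 60 * 1000L;
        System.out.println(lease.tryAcquireOrRenew("server-a", 0, fiveMinutes));
        System.out.println(lease.tryAcquireOrRenew("server-b", 1_000, fiveMinutes));
    }
}
```

The active server would call `tryAcquireOrRenew` at half the lease time, and dormant servers would call it occasionally. Whichever call returns true decides who registers for messages.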
That should be pretty straightforward, I think. Now, you want to have the dormant MDBs check the lock occasionally to see if it's freed up.
So, the active MDB and the dormant MDBs all have to do something periodically. You might have them spawn a separate thread to do this. Many application engine vendors won't be happy if you do this, but adding one thread is no big deal, especially since it spends most of its time sleeping. Another option would be to tie into the timer mechanism that many engines provide, and have it wake up your MDB periodically to check the lease.
Oh, and by the way - make sure the server admins employ NTP to keep the clocks reasonably synced.

First point: this is a pretty crappy design and you'll seriously limit performance only being able to process a single message at a time. I assume you are clustering only for fault tolerance, because you won't get performance improvements?
Are you using the default JMS implementation with OC4J or another one?
I've used IBM's MQ in the past and that had a feature that a queue could be marked as exclusive, which meant only one client could connect to it. This would appear to offer what you want.
An alternative would be to introduce a sequence ID (as simple as an incrementing counter) and the client processing the message would check that the sequence ID is the next expected value, if not then the message put back. This approach requires the different clients to persist the last valid sequence ID they've seen in some centrally shared data store, such as a database.
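A minimal sketch of the sequence ID check. `SequenceGate` is a hypothetical name, and the counter is held in memory only for illustration. As the answer says, in a cluster the last accepted value has to live in a shared store such as a database, with the check and the increment done atomically there.

```java
// Hypothetical gate that only lets messages through in sequence order.
class SequenceGate {
    private long nextExpected;

    SequenceGate(long firstSequenceId) {
        this.nextExpected = firstSequenceId;
    }

    // Returns true if the message is the next one and may be processed.
    // Returns false if it is out of order and should be put back on the queue.
    synchronized boolean tryAccept(long sequenceId) {
        if (sequenceId != nextExpected) {
            return false;
        }
        nextExpected++;
        return true;
    }

    public static void main(String[] args) {
        SequenceGate gate = new SequenceGate(1);
        System.out.println(gate.tryAccept(2)); // arrives early
        System.out.println(gate.tryAccept(1));
        System.out.println(gate.tryAccept(2)); // redelivered, now in order
    }
}
```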

I agree with stevendick: maybe you're off track with the design. Regarding sequence IDs or similar approaches, I suggest you get insight on messaging architectures with Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions (by Gregor Hohpe and Bobby Woolf). It's a great book, with plenty of useful patterns... I'm sure the forces and the problem you are facing are well described there.
