Server-sent events in a scalable back-end - Java

I have deployed a Java web application in Heroku.
Now, I want to change the back-end so that it can notify connected users about specific events. I thought I could use server-sent events for that, and the way I imagined it working is the following:
When a user opens the front-end, it establishes a connection for the server-sent events.
When the back-end receives such a request, it creates the connection (basically an EventOutput) and stores it somewhere along with the user's ID (say, in a Map in memory).
When a new event comes along, the back-end finds the user that needs to be notified, retrieves that user's connection by ID, and sends the notification.
This works just fine when you have only one machine handling the requests.
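In code, the single-machine version looks roughly like the sketch below. I'm assuming Jersey 2's SSE support here (which is where EventOutput comes from); the resource path and the notifyUser helper are just illustrations.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;

import org.glassfish.jersey.media.sse.EventOutput;
import org.glassfish.jersey.media.sse.OutboundEvent;
import org.glassfish.jersey.media.sse.SseFeature;

@Path("events")
public class SseResource {

    // One open SSE connection per user ID, held in memory on this machine.
    private static final Map<String, EventOutput> CONNECTIONS = new ConcurrentHashMap<>();

    @GET
    @Path("{userId}")
    @Produces(SseFeature.SERVER_SENT_EVENTS)
    public EventOutput connect(@PathParam("userId") String userId) {
        EventOutput output = new EventOutput();
        CONNECTIONS.put(userId, output);
        return output; // Jersey keeps the response open and streams events through it
    }

    // Called by the back-end when an event arrives for a specific user.
    public static void notifyUser(String userId, String message) throws java.io.IOException {
        EventOutput output = CONNECTIONS.get(userId);
        if (output != null && !output.isClosed()) {
            output.write(new OutboundEvent.Builder()
                    .name("notification")
                    .data(String.class, message)
                    .build());
        }
    }
}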
My problem starts when I want to scale up the app and introduce more machines. Then I can no longer store these connections in memory on one machine; I'd need some centralized location. But a centralized store would have to serialize/deserialize the connection, which means it's not the same live connection anymore!
How do you usually do something like that?

One solution is to use session affinity (a.k.a. sticky sessions), which will ensure that a single session's requests are "always" routed to the same process (I say "always" because there are some caveats). You can turn this feature on by running this command:
$ heroku labs:enable http-session-affinity
In this way, you can keep things in memory and will not have to serialize the session.
Here is an article describing this feature in more detail: https://blog.heroku.com/archives/2015/4/28/introducing_session_affinity

You could use a pub/sub solution (e.g. Redis pub/sub) that is accessible to each of your dynos.
On starting, your app subscribes to the appropriate channels. When an event happens, it is published to a channel. This means all instances of your app (spread across multiple dynos) receive that event, and any of them that have SSE connections open can respond to the event.
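A minimal sketch of that idea using the Jedis client; the channel name, the "userId:payload" message format, and the SseResource.notifyUser helper from the question are assumptions, not a fixed API.

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class EventSubscriber {

    // Run once per dyno at startup, on its own thread (subscribe blocks).
    public static void start() {
        new Thread(() -> {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.subscribe(new JedisPubSub() {
                    @Override
                    public void onMessage(String channel, String message) {
                        // Message is assumed to be "userId:payload"; split and
                        // forward to any SSE connection this dyno happens to hold.
                        String[] parts = message.split(":", 2);
                        try {
                            SseResource.notifyUser(parts[0], parts[1]);
                        } catch (java.io.IOException ignored) {
                            // Connection already closed on this dyno.
                        }
                    }
                }, "user-events");
            }
        }, "redis-subscriber").start();
    }

    // Any dyno (or a background worker) publishes; every subscribed dyno receives it.
    public static void publish(String userId, String payload) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.publish("user-events", userId + ":" + payload);
        }
    }
}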

Related

How to send events to all instances of the application in PCF

I am not able to find a way to send/broadcast a message to all application instances in Pivotal Cloud Foundry. How can we notify all app instances of an event? If we use an HTTP request, the PCF router will dispatch it to a single instance of the app. How can we solve this problem?
What @Florian said is probably the safer option, but if you want something quick and easy, you can send HTTP requests directly to an app instance by using the X-CF-APP-INSTANCE header. The format of the header is YOUR-APP-GUID:YOUR-INSTANCE-INDEX.
https://docs.cloudfoundry.org/concepts/http-routing.html#app-instance-routing
So given an app GUID, you could iterate over the number of instances, say 0 to 5, and send an HTTP request to each one. Make sure to check the response to confirm that each one succeeded.
This also requires that you know the app GUID for your app (i.e. cf app <name> --guid) and the number of instances of your app.
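A rough sketch of that loop using Java 11's HttpClient; the app GUID, instance count, route, and payload are placeholders you would fill in yourself.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class InstanceBroadcaster {
    public static void main(String[] args) throws Exception {
        String appGuid = "YOUR-APP-GUID";      // from: cf app <name> --guid
        int instanceCount = 5;                 // from: cf app <name>
        HttpClient client = HttpClient.newHttpClient();

        for (int i = 0; i < instanceCount; i++) {
            // The router delivers this request to one specific instance.
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://your-app.example.com/api/event"))
                    .header("X-CF-APP-INSTANCE", appGuid + ":" + i)
                    .POST(HttpRequest.BodyPublishers.ofString("{\"event\":\"refresh\"}"))
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // Check each response; retry or log if an instance did not answer.
            System.out.println("instance " + i + " -> " + response.statusCode());
        }
    }
}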
CF, out of the box, does not provide any event queue mechanism where apps can subscribe to.
What I would do (assuming you have two app instances, A and B):
Provide an event endpoint in your application code, e.g. POST /api/event (alternatively, if the event arises from another app, e.g. another microservice, that app could send messages onto the queue directly)
All app instances listen on an internal event queue for new events
Instance A receives the call from the CF router and processes it by issuing an event on the internal event queue; the instance does not react to the event yet
When A publishes the event, both A and B receive it and process it accordingly
Which internal event queue you can use depends highly on your deployment. On AWS you could probably use SQS or SNS or something similar. PCF, as far as I know, also provides a messaging system that would suit here: RabbitMQ. You could also use features of other services that let you subscribe to events, such as Redis (its pub/sub commands).
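As a hedged sketch of that queue-based broadcast, here is roughly what it could look like with RabbitMQ's Java client (com.rabbitmq:amqp-client) and a fanout exchange; the exchange name and connection details are illustrative.

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

public class EventBroadcast {
    private static final String EXCHANGE = "app-events";

    // Each instance runs this at startup: bind a private queue to the fanout
    // exchange so every instance gets its own copy of each event.
    public static void subscribe() throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // on PCF, use your bound service's URI
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.exchangeDeclare(EXCHANGE, "fanout");
        String queue = channel.queueDeclare().getQueue(); // exclusive, auto-delete
        channel.queueBind(queue, EXCHANGE, "");

        DeliverCallback onEvent = (tag, delivery) -> {
            String event = new String(delivery.getBody(), "UTF-8");
            // React to the event on this instance (both A and B run this).
            System.out.println("received: " + event);
        };
        channel.basicConsume(queue, true, onEvent, tag -> { });
    }

    // The instance that handled POST /api/event publishes once; all receive it.
    public static void publish(Channel channel, String event) throws Exception {
        channel.basicPublish(EXCHANGE, "", null, event.getBytes("UTF-8"));
    }
}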
If you provide more concrete information about what you want to achieve, a more detailed answer would be possible, though.

Java client-server updates without polling

When building a server, one sometimes performs asynchronous tasks from client to server (where the server responds to the client at some later time), or the server needs to send the client a message.
Now, if the client is listening at all times (meaning polling), it takes a lot of resources, which is problematic.
This is where I assume the operating system steps in, assumes the role of watching the appropriate port, and lets the application know using the appropriate event (the application subscribes using the OS API).
Am I right in my assumptions?
How do I subscribe to a port using the OS's API (let's say Android, for the sake of argument)?
How does a message from server to client work exactly?
And how does the server know the client's IP at all times?
I have seen many questions on the subject, but wasn't able to figure out the big picture.
Edit:
I am using GCM on Android, but I have seen other apps that do not use it and still manage to do it right; also, it's a more general question as to what the right approach is in Java vs. whatever operating system it runs on (Ubuntu, Windows, Android, etc.).
Totally right - polling is typically a waste of resources. Until recently, many apps would either keep a socket open and poll every few minutes to keep it alive, or make periodic HTTP calls to a server.
Nowadays, Google Cloud Messaging is used by most apps to push data instead of constantly polling. As you correctly guessed, this is implemented by maintaining a persistent connection with Google's servers. The advantage of this is that it's very efficient for battery life, and that all apps can use this one resource to send push notifications, instead of each app having to poll a different server or create its own persistent connection.
The idea is that you send requests to GCM from your server (this can be in response to user activity, etc), which sends it to all of the client's devices. You can either send a message with a small payload (up to 4kb) or a "send-to-sync" message, which tells an app to contact the server (e.g. to sync new data from the server after user changes).
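For illustration, here is roughly what the server-side send could look like against GCM's legacy HTTP endpoint. The API key and registration token are placeholders, and note that GCM has since been superseded by Firebase Cloud Messaging.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class GcmSender {
    public static void main(String[] args) throws Exception {
        String apiKey = "YOUR_SERVER_API_KEY";
        String registrationToken = "DEVICE_REGISTRATION_TOKEN";
        String body = "{\"to\":\"" + registrationToken + "\","
                + "\"data\":{\"message\":\"sync\"}}"; // payload up to 4 KB

        URL url = new URL("https://gcm-http.googleapis.com/gcm/send");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Authorization", "key=" + apiKey);
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes("UTF-8"));
        }
        // GCM replies with per-message results; check them before assuming delivery.
        System.out.println("GCM responded: " + conn.getResponseCode());
    }
}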
"This is where I assume the operating system steps in, assumes the role of watching the appropriate port, and lets the application know using the appropriate event (the application subscribes using the OS API)"
GCM pushes messages to clients, so there isn't active waiting like you'd see in a simple polling system.
"How does a message from server to client work exactly? And how does the server know the client's IP at all times?"
There's no need for servers to know the client IP, as any online Android device will typically maintain a connection with GCM. Targeting specific users is done via User Notifications.
(Oh, and I realize that your question is more general than just Android, which is where I have more experience, but iOS has a similar system in place. Some developers I've met like to use Parse for managing push notifications.)

Architectural issue with Tomcat cluster environment

I am working on a project in which we have an authentication mechanism. It follows the steps below.
The user opens a browser, enters his/her email in a text box, and clicks the login button.
The request goes to the server. We generate a random string (for example, 123456), send a notification to the user's Android/iPhone, and make the current thread wait with the help of the wait() method.
The user enters a password on his/her phone and clicks the submit button on the phone.
Once the user clicks the submit button, the phone makes a web service call to the server, passing the previously generated string (for example, 123456) and the password.
If the password is correct for the previously entered email, we call the notify() method on the previously waiting thread, send success as the response, and the user is let into our system.
If the password is incorrect for the previously entered email, we call the notify() method on the previously waiting thread, send failed as the response, and display an invalid-credentials message to the user.
Everything was working fine, but recently we moved to a clustered environment. We found that some threads are never notified, even after the user replies, and they wait indefinitely.
For the server we are using Tomcat 5.5, and we followed The Apache Tomcat 5.5 Servlet/JSP Container documentation to set up the Tomcat cluster environment.
Answer: possible problem and solution
The likely problem is the multiple JVMs in a clustered environment. We now also send the clustered Tomcat URL to the user's Android application along with the generated string.
When the user clicks the reply button, we send the generated string along with that clustered Tomcat URL, so both requests go to the same JVM, and it works fine.
But I am wondering whether there is a single, cleaner solution to the above issue.
There is a problem with this workaround: what happens if that clustered Tomcat instance crashes? The load balancer will send the request to a second Tomcat instance, and the same problem arises again.
The underlying reason for your problems is that Java EE was designed to work in a different way - blocking/waiting on a service thread is one of the important no-nos. I'll give the reason for this first, and how to solve the issue after that.
Java EE (both the web and EJB tier) is designed to be able to scale to very large size (hundreds of computers in a cluster). However, in order to do that, the designers had to make the following assumptions, which are specific limitations on how to code:
Transactions are:
Short-lived (e.g. don't block or wait for periods greater than a second or so)
Independent of each other (e.g. no communication between threads)
For EJBs, managed by the container
All user state is maintained in specific data storage containers, including:
A data store accessed through, e.g., JDBC. You can use a traditional SQL database or a NoSQL back end
Stateful session beans, if you use EJBs. Think of these as Java Beans that persist their fields to a database. Stateful session beans are managed by the container
The web session. This is a key-value store (kind of like a NoSQL database, but without the scale or search capabilities) that persists data for a specific user over their session. It's managed by the Java EE container and has the following properties:
It automatically relocates if its node crashes in a cluster
Users can have more than one current web session (e.g. on two different browsers)
Web sessions end when the user logs out, or when the session is inactive for longer than the configurable timeout
All stored values must be serializable for them to be persisted or transferred between nodes in a cluster
If we follow those rules, the Java EE container can successfully manage a cluster, including shutting down nodes, starting new ones and migrating user sessions, without any specific developer code. Developers write the graphical interface and the business logic - all the 'plumbing' is managed by configurable container features.
Also, at run time, the Java EE container can be monitored and managed by some pretty sophisticated software that can trace application performance and behavioural issues on a live system.
<snark>Well, that was the theory. Practice suggests there are pretty important limitations that were missed, which led to AOP and code-injection techniques, but that's another story.</snark>
[There are many discussions around the 'net on this. One which focuses on EJBs is here: Why is spawning threads in Java EE container discouraged? Exactly the same is true for web containers such as Tomcat]
Sorry for the essay - but this is important to your problem. Because of the limitations on threads, you should not block on one web request while waiting for another, later request.
Another problem with the current design: what should happen if the user becomes disconnected from the network, runs out of power, or simply decides to give up? Presumably you will time out, but after how long? Too soon for some customers, perhaps, which will cause satisfaction problems; too long, and you could end up blocking all worker threads in Tomcat, freezing the server. This opens your organisation up to a denial-of-service attack.
EDIT: Improved suggestions after a more detailed description of the algorithm was published.
Notwithstanding the discussion above on the bad practice of blocking a web worker thread, and also the possible denial of service, it's clear that the user is presented with a small time window in which to react to the notification on the Android phone, and this can be kept reasonably small to enhance security. This time window can also be kept below Tomcat's response timeout. So the thread-blocking approach could be used.
There are two ways this problem can be resolved:
Change the focus of the solution to the client end: poll the server using JavaScript in the browser
Communicate between nodes in the cluster, allowing the node receiving the authorization response from the Android app to unblock the thread on the node holding the servlet's response
For approach 1, the browser polls the server via JavaScript with an AJAX call to a web service on Tomcat; the AJAX call returns true once the Android app has authenticated. Advantages: client side, minimal implementation on the server, no thread blocking on the server. Disadvantages: during the waiting period you have to make frequent calls (maybe one a second - the user will not notice this latency), which amounts to a lot of calls and some additional load on the server.
For approach 2, there is again choice:
Block the thread with Object.wait(), optionally storing the node ID, IP, or another identifier in a shared data store. If so, the node receiving the Android app authorization needs to:
Either find the node that is currently blocking, or broadcast to all nodes in the cluster
For each node in 1. above, send a message that identifies the user session to unblock. The message could be delivered in one of these ways:
An internal-only servlet on each node, called by the servlet performing the Android app authorization. The internal servlet calls Object.notify() on the correct thread
A JMS pub-sub topic used to broadcast to all members of the cluster. Each node is a subscriber that, on receipt of a notification, calls Object.notify() on the correct thread (see the sketch after this list)
Poll a data store until the thread is authorized to continue. In this case, all the Android app authorization needs to do is save the state in a SQL DB
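As a sketch of the JMS option above (assuming ActiveMQ as the provider; the topic name and the session-to-monitor registry are illustrative, not a prescribed API):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import javax.jms.Connection;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

import org.apache.activemq.ActiveMQConnectionFactory;

public class AuthNotifier {
    // Auth session ID -> monitor object the servlet thread is waiting on.
    private static final Map<String, Object> WAITERS = new ConcurrentHashMap<>();

    // Each node runs this at startup and subscribes to the shared topic.
    public static void startSubscriber() throws Exception {
        Connection connection =
                new ActiveMQConnectionFactory("tcp://localhost:61616").createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("auth-replies");
        MessageConsumer consumer = session.createConsumer(topic);
        consumer.setMessageListener(message -> {
            try {
                String authSessionId = ((TextMessage) message).getText();
                Object monitor = WAITERS.remove(authSessionId);
                if (monitor != null) {          // this node holds the waiter
                    synchronized (monitor) {
                        monitor.notify();
                    }
                }
            } catch (javax.jms.JMSException ignored) { }
        });
    }

    // Called by the servlet that received the Android app's reply, on any node.
    public static void publishReply(Session session, String authSessionId) throws Exception {
        MessageProducer producer = session.createProducer(session.createTopic("auth-replies"));
        producer.send(session.createTextMessage(authSessionId));
    }
}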
Using wait/notify can be tricky. Remember that any thread can be suspended at any time, so it's possible for notify to be called before wait, in which case wait will block forever.
I wouldn't expect this in your case, as you have user interaction involved. But for the type of synchronisation you are doing, try using a Semaphore: create a Semaphore with zero permits; the waiting thread calls acquire() and blocks until another thread calls release().
Used this way, a Semaphore is much more robust than wait/notify for the task you described.
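A minimal sketch of that Semaphore handshake; the session registry and the 30-second timeout are my own assumptions.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class AuthGate {
    private static final Map<String, Semaphore> GATES = new ConcurrentHashMap<>();

    // Login request thread: create a zero-permit semaphore and block on it.
    public static boolean awaitReply(String authSessionId) throws InterruptedException {
        Semaphore gate = new Semaphore(0);
        GATES.put(authSessionId, gate);
        try {
            // Unlike wait(), a release() that happens first is not lost:
            // tryAcquire() will find the permit already available.
            return gate.tryAcquire(30, TimeUnit.SECONDS);
        } finally {
            GATES.remove(authSessionId);
        }
    }

    // Thread handling the phone's reply: release the waiting login thread.
    public static void replyArrived(String authSessionId) {
        Semaphore gate = GATES.get(authSessionId);
        if (gate != null) {
            gate.release();
        }
    }
}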
Consider using an in-memory data grid so that the instances in the cluster can share state. We used Hazelcast to share data between instances, so if a response reaches a different instance, that instance can still handle it.
E.g. you could use a distributed countdown latch with a count of 1: the thread that sent the message waits on the latch, and when the response arrives at a different instance, that instance counts the latch down to 0, letting the first thread run.
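Roughly, with Hazelcast's distributed latch (the latch naming scheme is illustrative, and getCPSubsystem() assumes Hazelcast 4.x; in 3.x it was hz.getCountDownLatch(name)):

import java.util.concurrent.TimeUnit;

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.cp.ICountDownLatch;

public class DistributedAuthLatch {
    private static final HazelcastInstance HZ = Hazelcast.newHazelcastInstance();

    // Instance that sent the push message: wait on a cluster-wide latch.
    public static boolean awaitReply(String authSessionId) throws InterruptedException {
        ICountDownLatch latch = HZ.getCPSubsystem().getCountDownLatch("auth-" + authSessionId);
        latch.trySetCount(1);
        return latch.await(30, TimeUnit.SECONDS); // false if the user never replied
    }

    // Whichever instance receives the phone's reply: count down to release it.
    public static void replyArrived(String authSessionId) {
        HZ.getCPSubsystem().getCountDownLatch("auth-" + authSessionId).countDown();
    }
}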
Your clustered deployment means that any node in the cluster could receive any response.
Using wait/notify across threads in a web app risks accumulating a lot of threads that may never be notified, which could leak memory or leave a lot of blocked threads around. This could eventually affect the reliability of your server.
A more robust solution would be to send the request to the Android app, store the current state of the user's request for later processing, and complete the HTTP request. To store the state you could consider:
A database that all Tomcat nodes connect to
A Java cache solution that works across Tomcat nodes, such as Hazelcast
This state would be visible to all nodes in your Tomcat cluster.
When the reply from the Android app arrives on a different node, restore the state of what your thread was doing and continue processing on that node.
If the UI of the application is waiting on a response from the server, consider using an AJAX request to poll the server for the response state. The node processing the Android app response does not need to be the same one handling UI requests.
Using Thread.wait in a web service environment is a colossal mistake. Instead, maintain a database of user/token pairs and expire them at intervals.
If you want a cluster, then use a database that is clusterable. I would recommend something like memcached, since it's in-memory (and fast) and low on overhead (key/value pairs are dead simple, so you don't need an RDBMS, etc.). memcached already handles expiration of tokens for you, so it seems like a perfect fit.
I think the username -> token -> password strategy is unnecessary, especially because you have two different components sharing the same 2-factor authentication responsibility. I think you can further reduce your complexity, reduce confusion for your users, and save yourself some money in SMS-send fees.
The interaction with your web service is simple:
1. User logs into your website using username + password
2. If primary authentication (username/password) succeeds, generate a token and insert userid=token into memcached
3. Send the token to the user's phone
4. Present an "enter token" page to the user
5. User receives the token on their phone and enters it into the form
6. Fetch the token value from memcached by the user's ID. If it matches, expire the token in memcached and consider the second factor successful
7. Tokens auto-expire after whatever amount of time you set in memcached
There are no threading problems with the above solution and it will scale across as many JVMs as you need to support your own software.
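For illustration, a sketch of the token store using the spymemcached client; the key naming and the five-minute expiry are assumptions, not part of the answer above.

import java.net.InetSocketAddress;
import java.security.SecureRandom;

import net.spy.memcached.MemcachedClient;

public class TokenStore {
    private final MemcachedClient client;

    public TokenStore() throws java.io.IOException {
        client = new MemcachedClient(new InetSocketAddress("localhost", 11211));
    }

    // Step 2: generate a token and store it against the user, auto-expiring.
    public String issueToken(String userId) {
        String token = Integer.toString(new SecureRandom().nextInt(900000) + 100000);
        client.set("token:" + userId, 300, token); // expires in 300 seconds
        return token;
    }

    // Step 6: compare the submitted token, expiring it on success.
    public boolean verify(String userId, String submitted) {
        Object stored = client.get("token:" + userId);
        if (stored != null && stored.equals(submitted)) {
            client.delete("token:" + userId);
            return true;
        }
        return false;
    }
}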
After analysing your question, I came to the conclusion that the exact problem is the multiple JVMs in the clustered environment.
The problem arises because of the cluster: the two requests do not go to the same JVM, and a plain notify() only works when the previously waiting thread is in the same JVM.
You should try to get both requests (the first request, and the second request made when the user replies from the Android application) to execute on the same JVM.
I'm afraid threads cannot migrate across classic Java EE clusters.
You have to rethink your architecture to implement the wait/notify differently (connection-less).
Or you may give terracotta.org a try. It appears to allow clustering an entire JVM process over multiple machines. Maybe it's your only solution.
Read a quick introduction in Introduction to OpenTerracotta.
I guess the problem is: your first thread sends a notification to the user's Android application from JVM 1, and when the user replies, the request goes to JVM 2. That's the main problem.
Somehow, both requests must reach the same JVM for the wait and notify logic to work.
Solution:
Create a single point of contact for all waiting threads. In a clustered environment, all threads would wait in a third JVM (the single point of contact). That way every request, from any clustered Tomcat, contacts the same JVM for the wait and notify logic, and no thread waits indefinitely: when the reply arrives, the thread that waited on that object is notified.

Strategy to merge java applications in one

Situation: two small Java applications, both connecting to a remote service and sending some data there (the first application listens on a local socket, processes data, sends it for verification to the remote service, and processes the response; the second application starts at a scheduled time, processes some data from the database, and sends that data to the remote service). The problem is that the remote service allows only one connection (the connection is an SMPP session), which means that if one application is running and the other starts and tries to connect, bad things will happen...
The idea is to combine the 2 applications into 1 (maybe there are other solutions?) and create some kind of control/workflow functionality whose responsibility is to manage the applications and avoid collisions when connecting to the remote service.
Can someone give me advice about this idea? Maybe there is a design pattern which would help me avoid pitfalls when implementing it? (It would be even better if there were open-source applications that manage a similar problem, so I could browse the source code and gather some useful information.)
Thank you.
Wrap the data in classes together with the necessary metadata.
Place your applications in separate threads, and instead of sending the data, add it to a queue.
Then, in another thread, read the queue and send the data from it to the service.
I'd give BlockingQueue a try (http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/BlockingQueue.html).
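A minimal sketch of that design, where both former applications become producers and a single sender thread owns the one allowed connection; the Message class and the send call are placeholders for your own types.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SingleConnectionSender {

    // Wrap the data together with whatever metadata the service needs.
    public static class Message {
        final String payload;
        final String source; // e.g. "socket-listener" or "scheduled-job"
        public Message(String payload, String source) {
            this.payload = payload;
            this.source = source;
        }
    }

    private final BlockingQueue<Message> queue = new LinkedBlockingQueue<>();

    // Both former applications, now threads, call this instead of connecting.
    public void submit(Message message) throws InterruptedException {
        queue.put(message);
    }

    // The only place in the whole process that talks to the remote service.
    public void startSenderThread() {
        Thread sender = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Message message = queue.take(); // blocks until work arrives
                    // Your single SMPP session would send here, one message at a time.
                    System.out.println("sending " + message.payload);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "smpp-sender");
        sender.setDaemon(true);
        sender.start();
    }
}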
The most obvious solution is to write a simple reverse-proxy server that gathers requests into a queue and sends them one by one to your remote service. Or the proxy could run a new instance of the service for each request.
http://en.wikipedia.org/wiki/Reverse_proxy

How to keep stateful web clients in sync, when multiple clients are looking at the same data?

In a RIA web client created with GWT, the state is maintained in the web client, while the server is (almost) stateless (this is the preferred technique to keep the site scalable).
However, if multiple users are looking at the same data in their browsers and one user changes something, which is sent to the server and stored in the database, the other users still have the old data in their browser state. For example, a tree is shown and one of the users adds/removes an item from it.
Since there can be a lot of state in the client, what is the best technique to keep the state in the clients in sync? Are there any java frameworks that take care of this?
Push changes (deltas) only, if applicable, and if not, re-sync the client completely. That's what we do with our remote clients (not only GWT but with Eclipse RCP too). We send delta contexts while the changes are small and local, and on a global change we re-sync. This requires designing a sophisticated diff protocol, and often requires redesigning the remote client protocol from scratch.
The most promising HTTP Push (Comet) library I tried so far is the StreamHub Project:
StreamHub is a highly-scalable HTTP Comet and Reverse Ajax server allowing you to push live data to a web browser without requiring any plugins or security-policy changes. It uses a technique known as Comet or Reverse Ajax to keep a persistent connection open to the browser.
That might be what you are looking for to keep your clients' state up to date. They have a GWT adapter project, too.
Comet support is also available in GWT using the rocket-gwt project (which also provides a bunch of other cool features, like lightweight collections, drag-and-drop, etc.) -- Comet is provided by the Remoting package.
I am facing the same dilemma in my Flex application.
It seems that the best way to deal with this is to keep an interval of a few seconds between server and client and have each client poll the server for state.
I took the following approach; note that it does not eliminate out-of-sync situations, it just makes them significantly less likely.
On the server side, I keep one cache for each collection call.
On the client side, each application instance keeps its own cache of those same collections.
Instance one loads some array of objects into a grid (building the collection's initial state from the server).
Instance two loads the data and makes changes, submitting the changed data to the server; the DB info is persisted and the server cache is rebuilt.
(Instance two also maintains its local cache, so it doesn't need to fetch the server collection again.)
Instance one is now out of sync (it will be back in sync at the next polling interval);
instance two is in sync, because it is the app responsible for the changes.
Both instances poll the server from time to time, say at a 10-second interval, for changes.
(If the server-side cache has changed, the next interval call brings the new info to all clients.)
If nothing changed on the server side, no info is sent to an already-registered client.
(This means no data is exchanged between server and client, reducing overhead.)
If a third client comes in, its state is fresh and it performs the necessary calls to build its current cache as well.
There is a delay, but it surely helps propagate changes to the clients.
The drawback is that the client consumes some extra memory by keeping its cache state.
I do this per screen: once a screen goes out of view, the client cache is nulled; once that screen is opened again, the local cache is created, the timer starts, and the polling begins.
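For what it's worth, here is a rough server-side sketch of that version-check poll as a plain servlet; the parameter name, version counter, and JSON shape are illustrative.

import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ChangesServlet extends HttpServlet {
    // Bumped each time the server-side cache of the collection is rebuilt.
    private static final AtomicLong CACHE_VERSION = new AtomicLong(0);

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String param = req.getParameter("version");
        long clientVersion = (param == null) ? -1 : Long.parseLong(param);
        long serverVersion = CACHE_VERSION.get();
        if (clientVersion == serverVersion) {
            // No changes since the client's last poll: nothing is exchanged.
            resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return;
        }
        // Cache changed (or a fresh client): return the current collection.
        resp.setContentType("application/json");
        resp.getWriter().write("{\"version\":" + serverVersion
                + ",\"data\":[]}"); // your serialized collection goes here
    }
}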
Hope that helps,
Ernani
