Point-to-point messaging in a scalable application? - java

After googling how messages are sent and received in a chat messenger like WhatsApp, I found that they use a queue-based messaging system. I am just trying to figure out what the high-level design of this feature could be.
HLD per my understanding:
Say Friend 1 and Friend 2 are online. Friend 1 has established an HTTP web connection to web server 1 and Friend 2 has established an HTTP web connection to web server 2. Friend 1 sends a message to Friend 2.
Now, as soon as the message reaches web server 1, it needs to be conveyed to web server 2 so that the message can be pushed to Friend 2 through the already established web connection.
I believe distributed custom Java queues can be used here to propagate the message from one server to another. As soon as a message reaches a server, that server pushes it onto a distributed queue (distributed for load balancing and high availability) together with the message content, fromUserId and toUserId. A listener on the queue inspects the destination userId of each newly arrived message and finds out which web server the destination user is active on. If the user is active, the message is popped off and pushed to the client; otherwise it is stored in the DB so it can be pulled once the user comes online. To see which user is active on which server, we can maintain a TreeMap with userId as the key and serverName as the value for efficient lookup.
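A minimal sketch of that routing idea in Java (the class, field and method names are illustrative assumptions, and a ConcurrentHashMap is used instead of a TreeMap since lookups are by exact userId and the registry is accessed concurrently):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical presence registry plus queue listener; not tied to any particular queue library.
    public class MessageRouter {

        static class ChatMessage {
            final String fromUserId, toUserId, content;
            ChatMessage(String from, String to, String content) {
                this.fromUserId = from; this.toUserId = to; this.content = content;
            }
        }

        // userId -> name of the web server holding that user's live connection
        private final Map<String, String> presence = new ConcurrentHashMap<>();

        void onUserConnected(String userId, String serverName) { presence.put(userId, serverName); }
        void onUserDisconnected(String userId)                 { presence.remove(userId); }

        // Called by the queue listener for each message popped from the distributed queue.
        void route(ChatMessage msg) {
            String server = presence.get(msg.toUserId);
            if (server != null) {
                forwardToServer(server, msg);   // push over that server's open client connection
            } else {
                persistForLaterDelivery(msg);   // recipient offline: store so it can be pulled on next login
            }
        }

        void forwardToServer(String serverName, ChatMessage msg) { /* e.g. internal RPC or pub/sub to that server */ }
        void persistForLaterDelivery(ChatMessage msg)            { /* e.g. INSERT into a messages table */ }
    }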
The actual design is probably more complex/scalable than this brief outline. Would like to know if this is the right direction for a scalable chat messenger?
Also, I believe we need multiple distributed queues instead of one for such a scalable application. But if we have multiple distributed queues, how will the system ensure FIFO message delivery across them?

Would like to know if this is the right direction for a scalable chat messenger?
Designing this application using message queues has the following benefits:
Decoupling of client and server, and a reduced failure blast radius: queues gracefully handle traffic peaks by temporarily growing in size, returning to normal once traffic does (or once any transient failures have been fixed).
In a messaging application, clients (mobiles) can be offline for long periods. As a result, a synchronous design would not work, since the clients might not be accessible for message delivery. However, with an asynchronous design as with message queues, the responsibility of message delivery is on the client side. As a result, the client can poll for new messages as soon as it gets online.
So, yes, this design could be quite scalable in terms of performance and usability. The only thing to keep in mind is that this design would require a separate queue for each user, so the number of queues would scale linearly with the number of the application's users (which could be a significant financial & scalability issue).
But if we have multiple distributed queues, how will the system ensure FIFO message delivery across them?
Many queues, either open-source (RabbitMQ, ActiveMQ) or commercial (AWS SQS), support FIFO ordering. However, the FIFO guarantee inside the queue is not enough, since the messages sent by a single client could arrive at the queue in a different order due to asynchronicity in the network (unless you are using a single, non-distributed queue over TCP, which guarantees ordered delivery).
However, you could implement FIFO ordering on the client side. Following this approach, the messages would include a timestamp, which each client would use to sort the messages when receiving them. The only side effect is that a client could see a message without having seen all the previous messages first. However, when the previous messages arrive, they will be shown in the correct order in the client's UI, so eventually the user sees all the messages in the correct order.
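A minimal client-side sketch of that re-ordering idea (the message type and the sender-side timestamp are illustrative assumptions; re-sorting the whole list on every arrival is fine for a sketch, but a real client would insert into position):

    import java.time.Instant;
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // Hypothetical client-side buffer that re-orders messages by sender timestamp (Java 16+ record for brevity).
    class ConversationView {

        record ChatMessage(String fromUserId, String text, Instant sentAt) {}

        private final List<ChatMessage> timeline = new ArrayList<>();

        // Called whenever a message arrives, possibly out of order.
        void onMessageReceived(ChatMessage msg) {
            timeline.add(msg);
            timeline.sort(Comparator.comparing(ChatMessage::sentAt)); // late-arriving earlier messages slot back in
            render();
        }

        private void render() {
            timeline.forEach(m -> System.out.println(m.sentAt() + " " + m.fromUserId() + ": " + m.text()));
        }
    }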

Would like to know if this is the right direction for a scalable chat messenger?
I would probably prefer a slightly different approach. Your ideas are correct, but I would like to add a bit more to them. I happened to build such a chat messenger a few years ago, and it was meant to be quite similar to WhatsApp. I am sure that when you googled, you came across XMPP (Extensible Messaging and Presence Protocol). We were using Openfire as the server that maintains the connections. The scenario that you described, where
Say Friend 1 and Friend 2 are online. Friend 1 has established an HTTP web connection to web server 1 and Friend 2 has established an HTTP web connection to web server 2. Friend 1 sends a message to Friend 2.
is called federation, and Openfire can be run in a federated mode. After reading through your comments, I came across the one-queue-per-user point. I am sure you already know that this approach is not scalable, as it is very resource intensive. A good approach would be to use an actor framework such as Akka. Each actor is like a lightweight thread in Java, and each actor has an inbox, so messaging is taken care of in this case.
So your scenario becomes: Friend 1 opens a connection to the Openfire XMPP server and a Friend 1 actor is initialized. When he types a message, it is transferred to the Friend 1 actor's inbox (each actor in Akka has an in-memory inbox). This is communicated to the XMPP server. The server has a database of its own and, since it is federated with other XMPP servers, it will try to find out whether Friend 2 is online. The XMPP server keeps the message in its database until Friend 2 comes online. Once Friend 2 establishes a connection to any of the XMPP servers, a Friend 2 actor is created, its presence is propagated to all the other servers, and XMPP server 1 notifies Friend 2's actor. Friend 2's actor inbox will now get the message.
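A minimal sketch of the per-friend actor idea, using Akka's classic Java API (the class and message names are illustrative, and the Openfire/XMPP integration is omitted):

    import akka.actor.AbstractActor;
    import akka.actor.ActorRef;
    import akka.actor.ActorSystem;
    import akka.actor.Props;

    public class FriendActor extends AbstractActor {

        // Illustrative message type placed in the actor's in-memory inbox.
        static final class ChatMessage {
            final String fromUserId;
            final String text;
            ChatMessage(String fromUserId, String text) { this.fromUserId = fromUserId; this.text = text; }
        }

        @Override
        public Receive createReceive() {
            return receiveBuilder()
                    .match(ChatMessage.class,
                           msg -> System.out.println("push to client: " + msg.fromUserId + ": " + msg.text))
                    .build();
        }

        public static void main(String[] args) {
            ActorSystem system = ActorSystem.create("chat");
            ActorRef friend2 = system.actorOf(Props.create(FriendActor.class), "friend2");
            // Friend 1's message lands in Friend 2's inbox and is processed asynchronously.
            friend2.tell(new ChatMessage("friend1", "hello"), ActorRef.noSender());
            system.terminate();
        }
    }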
Optional: there is also the option of delivery receipts. Once Friend 2 reads the message, a delivery receipt can be sent to Friend 1 to indicate the status of the message, i.e. read, unread, delivered, not delivered, etc.

Related

Purpose of message brokers in websocket communication?

I am implementing websockets for collaborative editing. For that I am using Spring 5 websockets.
The simplest example would be, two web clients are connected via websockets to my server. User 1 does some action which creates an event and sends this event info to my server. Now this event has to be sent to User 2 so that they can do appropriate UI changes.
I have two questions here:
Since there will be multiple instances of this server running, User 1 might connect to Server 1 and User 2 might connect to Server 2. In this case, how would the changes made by User 1 reach User 2?
Also, I was following this tutorial. This tutorial implements websockets without any message broker; some tutorials additionally use a message broker (mostly AMQP). What is the point of a message broker in this case? Is it only used because there might be too many messages and the server would process them one by one?
Just wanted to add: we cannot get away with a peer-to-peer connection on the client side, as the server needs to store the data for future use.
By default, Spring 5 uses an in-memory simple STOMP broker for all connections.
If you want to scale horizontally, you need an external message broker like RabbitMQ.
Let's imagine that user1 and user2 are connected to server1 and user3 is connected to server2. When user1 sends a message to user3, it would not be delivered, because server1 does not know about user3.
If we have a broker, this issue is solved.
So scalability is needed to handle the load, and in production you will always have more than one instance for high availability and fault tolerance.
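A rough sketch of swapping the default simple broker for an external STOMP relay in Spring 5 (the endpoint path, relay host and port are illustrative assumptions, and the relay additionally needs the reactor-netty dependency on the classpath):

    import org.springframework.context.annotation.Configuration;
    import org.springframework.messaging.simp.config.MessageBrokerRegistry;
    import org.springframework.web.socket.config.annotation.EnableWebSocketMessageBroker;
    import org.springframework.web.socket.config.annotation.StompEndpointRegistry;
    import org.springframework.web.socket.config.annotation.WebSocketMessageBrokerConfigurer;

    @Configuration
    @EnableWebSocketMessageBroker
    public class WebSocketConfig implements WebSocketMessageBrokerConfigurer {

        @Override
        public void registerStompEndpoints(StompEndpointRegistry registry) {
            registry.addEndpoint("/ws").withSockJS(); // endpoint path is a placeholder
        }

        @Override
        public void configureMessageBroker(MessageBrokerRegistry registry) {
            // Route /topic and /queue destinations through an external broker (e.g. RabbitMQ's STOMP plugin)
            // instead of the default in-memory simple broker, so all server instances share subscriptions.
            registry.enableStompBrokerRelay("/topic", "/queue")
                    .setRelayHost("rabbitmq.internal")  // hypothetical host
                    .setRelayPort(61613);
            registry.setApplicationDestinationPrefixes("/app");
        }
    }

With this in place, a message published on server1 reaches the broker, which fans it out to the subscription held by user3's session on server2.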

Need suggestions for reliable data broadcasting inside a LAN using Java or Android

We are working on an Android project with the below requirements.
The application should be able to send data to all the devices running our application within the WiFi LAN.
Some payloads are expected to be of size >= 5MB.
The data shouldn't be lost, and if it is lost, the client should know about the failure.
All the devices should be able to communicate with each other. No message is targeted at a specific device; instead, every message should reach all the devices in the network.
No internet hence no remote server.
Study we have done:
UDP broadcasting - UDP doesn't guarantee message delivery, but delivery is a prime requirement in our case. Hence not an option.
TCP - TCP guarantees message delivery but requires the receiver's IP address to be known beforehand, and in our case we need to send the message to all the devices inside the LAN. Hence not a straightforward option.
Solutions we are looking into:
A hybrid approach - name one of the devices in the network as the server. Post all the messages to this local server. The server keeps an open socket to all the devices (which have our application), and when a message arrives from a device it routes the message to all the devices. The disadvantages of this approach are:
The server has multiple sockets open, one per device. But in our case we expect <= 5 devices in the LAN.
Server discovery using continuous UDP broadcast.
We want all the data on all the devices. So if we introduce a new device into the LAN, that device needs to get all the data from the server.
So my question: have you ever worked on these kinds of hybrid approaches? Or can you suggest any other approaches?
Your hybrid approach is the way to go.
Cleanly split your problem into parts and solve them independently:
Discovery: Devices need to be able to discover the server, if there is any.
Select server: Decide which of your devices assumes the server role.
Server implementation: The server distributes all data to all devices and sends notifications as necessary. Push or pull with notifications does not matter.
Client implementation: Clients only talk to the server. The device which contains the server should also contain a normal client, potentially passing data to the server directly, but using the same abstract protocol.
Discovery: You could use mDNS (aka Bonjour or zeroconf) for the discovery, but I would not even recommend that. It often creates more problems than it solves, and it does not solve your 'I need one server' problem. I would suggest you handcraft a simple UDP broadcast protocol for the discovery, which already tells you who the server is, if there is one.
Select server: One approach is to use network metadata which you have anyway, for example 'use the device with the highest IP address'. This often works better than fancy arbitration algorithms. Once you have established a server, new devices would use it, rather than taking over the server role.
Use UDP broadcast for the discovery, with manual heuristic repeats. No fancy logic, just make your protocol resilient against repeated packets and repeat your packets. (Your WLAN router may repeat your packets without your knowledge anyway.)
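A minimal sketch of that handcrafted UDP discovery probe (the port number and payload strings are illustrative assumptions; the caller simply retries on timeout):

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;
    import java.nio.charset.StandardCharsets;

    public class ServerDiscovery {
        static final int DISCOVERY_PORT = 48999; // hypothetical, agreed on by all devices

        // Client side: shout a probe on the broadcast address and wait briefly for the server's reply.
        static InetAddress findServer() throws Exception {
            try (DatagramSocket socket = new DatagramSocket()) {
                socket.setBroadcast(true);
                socket.setSoTimeout(2000); // throws SocketTimeoutException -> repeat the probe
                byte[] probe = "WHO_IS_SERVER".getBytes(StandardCharsets.UTF_8);
                socket.send(new DatagramPacket(probe, probe.length,
                        InetAddress.getByName("255.255.255.255"), DISCOVERY_PORT));
                byte[] buf = new byte[64];
                DatagramPacket reply = new DatagramPacket(buf, buf.length);
                socket.receive(reply); // the server answers, e.g. "I_AM_SERVER"
                return reply.getAddress();
            }
        }
    }

The server side would listen on DISCOVERY_PORT and answer every probe; if no reply arrives after a few repeats, the device can assume the server role itself (e.g. using the highest-IP-address rule above).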
You may want to use two TCP connections per client, potentially to two different server ports, but that does not matter much: One control connection (always very responsive, no big amounts of data, just a few hundred bytes per message) and one data connection (for the bulk of the data, your > 5 MB chunks). This is so that everything stays responsive.

Uniform delivery in ZeroMQ for realtime purposes

I need to implement an analytics system with a server and terminals that works in real time.
I use the ZeroMQ library (pub/sub mode) to send messages to clients (~40 bytes).
If I connect with 1 client, messages arrive with a delay (sometimes more than 250 ms).
If I connect with 100 clients, many of them lose uniformity of delivery (more than 750 ms without a single message, followed by a huge burst of data). This is a critical issue for me.
I have to publish to more than 6000 terminals...
Publishing every 30 ms, it is about 1700 bytes to each client in the worst case (TCP).
Maybe I should use another technology to deliver messages in real time?
As I said in the comment, multicast is the way. The primary overriding concern is whether your terminals can join the group you are publishing on - irrespective of how far away they are.
You've not indicated how the terminals connect to your network (for example VPN over the internet, a private line, whatever). You asked for a better technology - it's multicast.
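For illustration, a minimal sketch of one-to-many fan-out with plain Java multicast (the group address and port are illustrative assumptions; ZeroMQ itself also offers multicast transports such as pgm/epgm where the network supports them):

    import java.net.DatagramPacket;
    import java.net.InetAddress;
    import java.net.MulticastSocket;
    import java.nio.charset.StandardCharsets;

    public class MulticastPublisher {
        public static void main(String[] args) throws Exception {
            InetAddress group = InetAddress.getByName("239.192.0.1"); // hypothetical multicast group
            try (MulticastSocket socket = new MulticastSocket()) {
                byte[] payload = "tick".getBytes(StandardCharsets.UTF_8);
                // A single send reaches every subscribed terminal; no per-client fan-out loop on the publisher.
                socket.send(new DatagramPacket(payload, payload.length, group, 5556));
            }
        }
    }

Each terminal would open a MulticastSocket on the same port, join the group, and block on receive(); the publisher's cost then no longer grows with the number of terminals.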
Now there are some options if you are going to go down the TCP route:
Build a load-balancing infrastructure which sits in front of your service, meaning that your terminals don't connect to your service directly, but to a set of load balancers which then connect to your service. If you have 10 of these, for example, each only has to deal with 600 clients; your problem becomes much smaller and you can scale this way. Don't forget to use asynchronous IO.
Buy better hardware - for example, Solace or Tervela make hardware message brokers which can scale to very large numbers of concurrent TCP connections, but this is not cheap.

Java Sockets - Messages between many clients

So the problem is I have fifteen clients which need to be able to communicate with each other. My question is how should this be done? Clearly one way is to simply make the clients also act as servers, but that means 105 unique connections to fully connect the fifteen clients. I'd rather not do this as it seems messy.
Current solution:
Each new connection has the server spin off a separate thread to listen to it. Each client has a separate thread monitoring the channel for incoming information.
The server acts as a message router: when Process 1 needs to send a message to Process 2, it sends a message to the server indicating the intended recipient, the sender, and the message itself.
Upon receiving the message, the server passes it to Process 2. The listening thread detects it and hands it to the process.
And so on for each message between the clients.
This seems clunky. Is there a better methodology/package to use for this?
A UDP multicast system would work for this but will get complicated for you to do yourself (since you have to worry about synchronization and fault detection/correction yourself, as well as nodes dropping in and out of the group).
There are various middleware solutions, including distributed caches, that already address this problem pretty well. Look at Infinispan. If that's too high-level and you just want a lower-level solution, try JGroups. I only list those because I know they are quick and usable, but there are many others out there.
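For a flavour of how little code the middleware route needs, here is a minimal group-messaging sketch assuming the JGroups 3.x/4.x API (the cluster name is a placeholder):

    import org.jgroups.JChannel;
    import org.jgroups.Message;
    import org.jgroups.ReceiverAdapter;

    public class ChatNode {
        public static void main(String[] args) throws Exception {
            JChannel channel = new JChannel(); // default UDP-based protocol stack
            channel.setReceiver(new ReceiverAdapter() {
                @Override
                public void receive(Message msg) {
                    System.out.println("got: " + msg.getObject());
                }
            });
            channel.connect("chat-cluster");                    // hypothetical cluster name
            channel.send(new Message(null, "hello everyone"));  // null destination = send to the whole group
            channel.close();
        }
    }

Every node that connects to the same cluster name receives every message, which replaces the hand-rolled router and the per-connection threads.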

What is better for instant messenger TCP or UDP?

I need to implement client/server instant messenger using pure sockets in Java lang.
The server should serve a large number of clients, and I need to decide which sockets I should use - TCP or UDP.
Thanks, Costa.
TCP
Reason:
TCP: "There is absolute guarantee that the data transferred remains intact and arrives in the same order in which it was sent."
UDP: "There is no guarantee that the messages or packets sent would reach at all."
Learn more at: http://www.diffen.com/difference/TCP_vs_UDP
Would you want your chat message possibly lost?
Edit: I missed the part about the "large chat program". I think, because of the nature of a chat program, it needs to be a TCP server; I cannot imagine sending the actual text content typed by users over a UDP protocol.
The often-quoted limit of 65536 simultaneous TCP connections is really a limit on ports per client IP address, not on the server as a whole; in practice a server's connection count is bounded by file descriptors and memory, and can go well beyond that figure. If you really do outgrow a single machine, you could create a dispatcher server that sends incoming connects to the appropriate server depending on current server loads.
You could use both. Use TCP for exchanging the actual messages, so no data is lost and streaming large messages (e.g. containing JPEGs) is possible. Use UDP only for sending short 'connectNow' messages to clients for which there are messages queued. The clients could have states like (NotLoggedIn, TCPconnected, TCPdisconnected, LoggedOut) with various timeouts to control the state transitions as well as the normal message-exchange events. The UDP 'connectNow' message would instruct clients in 'TCPdisconnected' to connect and so move to 'TCPconnected', where they would stay, exchanging messages, until some inactivity timer instructs the client to disconnect for now. This would, of course, be unreliable, so you may wish to repeat the 'connectNow' message every X seconds up to N times until the client connects. The client should, in any case, attempt a poll every X minutes, just in case...
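A tiny server-side sketch of that unreliable 'connectNow' nudge (the port, payload and retry interval are illustrative assumptions):

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;
    import java.nio.charset.StandardCharsets;

    // Hypothetical nudge sender: repeat the datagram a few times until the client
    // opens its TCP connection (the check itself lives elsewhere in the server).
    class ConnectNowNudger {
        private static final byte[] PAYLOAD = "connectNow".getBytes(StandardCharsets.UTF_8);

        void nudge(InetAddress clientAddr, int clientUdpPort, int attempts) throws Exception {
            try (DatagramSocket socket = new DatagramSocket()) {
                for (int i = 0; i < attempts && !clientHasConnected(clientAddr); i++) {
                    socket.send(new DatagramPacket(PAYLOAD, PAYLOAD.length, clientAddr, clientUdpPort));
                    Thread.sleep(5_000); // "every X seconds"
                }
            }
        }

        boolean clientHasConnected(InetAddress clientAddr) { return false; /* consult TCP session state */ }
    }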
It depends whether the user needs to know if the messages have been delivered to the server. UDP packets have no inherent acknowledgement. If the client sends an IM message to the server and it gets lost in transit, neither the client nor the server will know about it.
(The short answer is "use TCP" ... but it is worth thinking through the design implications for yourself.)
TCP would give you reliability, which is certainly desirable for instant messaging - you would not want messages to be dropped mid-conversation.
However, if you intend to use group messaging, then you might end up using multicast. For such cases, UDP would be the right choice, since UDP can handle point-to-multipoint. Using TCP for multicast-style applications would be hard, since the sender would then have to keep track of retransmissions and the sending rate for multiple receivers. One alternative could be to use TCP for point-to-point chat and UDP for group messaging.
