Achieving JMS/AMQP messaging patterns using Redis - java

This question arises as I came across some mentions (such as this) about using messaging software such as ZeroMQ along with Redis, but I keep hearing about Redis itself being used as a messaging system. So, if Redis is used along with other messaging systems, does it mean Redis has some serious deficiencies when used as a messaging system by itself?
While the use of Redis for caching and pub/sub is clear to me, it is not clear if Redis can be used in place of a full-fledged messaging system such as JMS, AMQP or ZeroMQ.
Leaving alone the standards-compliance aspect and only concentrating on the functionality/features, does Redis provide support for all the messaging patterns/models required of a messaging system ?
The messaging patterns I am talking about are :
RPC/Request-reply (an example using ActiveMQ/JMS and another using RabbitMQ/AMQP)
Pipeline/Work queues (each message consumed once and at most once)
Broadcast (everyone subscribed to the channel)
Multicast (filtering of messages at server based on consumers' selectors)
Any other messaging pattern ?
If yes, then Redis seems to solve two (possibly more) aspects at once: Caching and Messaging.
I am looking at this in the context of a web-application backed by a Java/Java EE server.
And I am looking at this not from a proof-of-concept point-of-view but from a large-scale software development angle.
Edit1:
user:791406 asked a valid question:
"Who cares if redis supports those patterns; will redis meet your SLA
and QoS needs?"
I thought it is better to give this detail as part of the question instead of in the comments section.
My current needs are less to do with SLA and QOS and more to do with choosing a tool for my job (messaging) that I can use even when my requirements grow (reasonably) in future.
I am starting with simplistic requirements at first and we all know requirements tend to grow.
And NO, I am not looking for one tool that does it all.
I just want to know if Redis fulfills the usual requirements expected of a messaging system, like ActiveMQ/RabbitMQ does. Of course, if my SLA/QoS needs are extreme/eccentric, I would need to get a special tool for satisfying them. For example, in some cases ZeroMQ could be chosen over RabbitMQ due to specific SLA requirements. I am not talking about such special requirements. I am focusing on average enterprise requirements.
I was afraid (based on my little understanding) that even though Redis could be used as a basic tool for my messaging needs of today, it might be the wrong tool for a real messaging job in future. I have experience with messaging systems like ActiveMQ/RabbitMQ and know that they could be used for simple to (reasonably) complex messaging needs.
Edit2:
The redis website mentions "Redis is often used as a messaging server" but how to achieve the messaging patterns is not clear.
Salvatore Sanfilippo mentions Redis users tend to use it as a database, as a messaging bus, or as a cache. To what extent it can serve as a "messaging bus" is not clear.
When I was trying to find out which messaging requirements of JMS Redis doesn't support, I came across something that Redis supports but JMS doesn't: pattern-matching subscriptions, i.e. clients may subscribe to glob-style patterns in order to receive all the messages sent to channel names matching a given pattern.
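To make the glob-style subscription idea concrete, here is a minimal in-memory sketch of how a channel name could be matched against subscribed patterns. It is not a Redis client and makes no claim about how Redis implements matching: the class name `GlobChannelMatcher` and its methods are hypothetical, and only the `*` and `?` wildcards are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** In-memory illustration of glob-style channel matching (hypothetical, not a Redis client). */
class GlobChannelMatcher {

    /** Translates a glob containing '*' and '?' wildcards into a regex for whole-string matching. */
    static Pattern compile(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");   // any run of characters
            } else if (c == '?') {
                regex.append('.');    // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // literal character
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the subscribed patterns that would receive a message published on the channel. */
    static List<String> matchingPatterns(List<String> patterns, String channel) {
        List<String> hits = new ArrayList<>();
        for (String p : patterns) {
            if (compile(p).matcher(channel).matches()) {
                hits.add(p);
            }
        }
        return hits;
    }
}
```

With subscriptions `news.*`, `h?llo` and `sports.*`, a message published on `news.tech` would reach only the `news.*` subscriber in this sketch.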
Conclusion:
I have decided to use JMS for my messaging needs and use Redis for Caching.

What are your needs?
I think the question you should be asking yourself is "what quality of messaging do I need to support my application?" before deciding on a messaging platform. Who cares if redis supports those patterns; will redis meet your SLA and QoS needs? Focus on that first, then make your technology decision based on that assessment.
Having said that, I'll offer my input on how you can make that decision...
Highly-Reliable/Persistent/Durable Messaging
Let's take an extreme case: say you're building a trading or financial application. Such applications demand strict SLAs where message persistence, reliability, exactly-once delivery and durability are paramount. Using redis as the messaging backbone for this case is probably a bad choice, for lots of reasons. You would have to deal with:
message redelivery (when the sh*t hits the fan)
message store replication when redis goes down
message transactions (redis can't do XA)
producer/subscriber fault tolerance and disconnection resilience
message order sequencing
sending messages when broker is down (store-and-forward)
single-threaded redis may become a bottleneck
If your system has strict SLAs, some or all of these issues will most definitely arise, so how will you handle these limitations? You can implement custom code around redis to address some issues, but why bother when mature messaging platforms like ActiveMQ, WebSphere MQ, and WebLogic JMS offer persistence, reliability, and fault tolerance? You said you're on a Java/Java EE stack, so you're in a position to use some of the most robust messaging frameworks available, either open source or commercial. If you're doing financial transactions, then you need to consider these options.
High Performance/Large Distributed Systems Messaging
If you're building a social network or gaming platform where you want performance over reliability, ZeroMQ is probably a good fit. It's a socket communication library wrapped in a messaging-like API. It's decentralized (no broker), very fast, and highly resilient and fault tolerant. If you need to do things like N-to-N pub/sub with broker intermediaries, flow control, message persistence, or point-to-point synchronization, ZeroMQ offers the necessary facilities and code samples to do it all with minimal code, without building solutions from the ground up. It's written in C but has client libraries for nearly every popular language.
Hope that helps...

Related

Local message queue for sharing data between two processes

I have a server application A that produces records as requests arrive. I want these records to be persisted in a database. However, I don't want to let application A threads spend time persisting the records by communicating directly with the database. Therefore, I thought about using a simple producers-consumers architecture where application A threads produce records and, another application B threads are the consumers that persist the records to the database.
I'm looking for the "best" way to share these records between applications A and B. An important requirement is that application A threads will always be able to send records to the IPC system (e.g. a queue, but that may be some other solution). Therefore, I think the records must always be stored locally so that application A threads will be able to send records even if the network is down.
The initial idea that came to my mind was to use a local message queue (e.g. ActiveMQ). Do you think a local message queue is appropriate? If yes, do you recommend a specific message queue implementation? Note that both applications are written in Java.
Thanks, Mickael
For this type of need, a queueing solution seems to be the best fit, as the producer and consumer of the events can work in isolation. There are many solutions out there, and I have personally worked with RabbitMQ and ActiveMQ. Both are equally good. I don't wish to compare their performance characteristics here, but RabbitMQ is written in Erlang, which is a language tailor-made for building real-time applications.
Since you're already on the Java platform, ActiveMQ might be a better option, and it is capable of producing high throughput. With a solution like this, the consumer does not have to be online all the time. Based on how critical your event data is, you may also want to have persistent queues and messages so that in the event of a message broker failure, you can still recover important "event" messages your application A produced.
If there are many applications producing events and you later wish to scale out (scale horizontally) the broker service because it's becoming a bottleneck, both of the above solutions provide clustering services.
Last but not least, if you want to share these events between different platforms you may wish to share messages in AMQP format, which is a platform-independent wire-level protocol for sharing messages between heterogeneous systems, though I'm not sure if this is a requirement for you. RabbitMQ and ActiveMQ both support AMQP. Both of these solutions also support MQTT, which is a lightweight messaging protocol, but it seems that you don't wish to use MQTT.
There are other products such as HornetQ and Apache Qpid which are also production ready solutions but I have not used them personally.
I think a queueing solution is the best approach in terms of maintainability, loose coupling of the participating applications, and performance.
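The decoupling described above can be sketched inside a single JVM with a `BlockingQueue`. This is only an illustration of the producer/consumer shape, not a substitute for a broker: it offers no persistence and no cross-process delivery, and the class `RecordPipeline`, its methods, and the in-memory list standing in for the database are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** In-process sketch: request threads enqueue records, one consumer thread "persists" them. */
class RecordPipeline {
    private static final String POISON = "__STOP__";           // sentinel that stops the consumer
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> persisted = new ArrayList<>();  // stand-in for the database
    private final Thread consumer;

    RecordPipeline() {
        consumer = new Thread(() -> {
            try {
                while (true) {
                    String record = queue.take();               // blocks until a record arrives
                    if (POISON.equals(record)) {
                        return;
                    }
                    synchronized (persisted) {
                        persisted.add(record);                  // a real consumer would write to the DB here
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
    }

    /** Called by producer threads; returns immediately because the queue is unbounded. */
    void produce(String record) {
        queue.add(record);
    }

    /** Drains the queue, stops the consumer, and returns everything that was persisted. */
    List<String> shutdownAndGetPersisted() {
        queue.add(POISON);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for consumer", e);
        }
        synchronized (persisted) {
            return new ArrayList<>(persisted);
        }
    }
}
```

A broker such as ActiveMQ plays the role of the queue here, with the added benefits the answer mentions: the consumer can be offline, and persistent messages survive a restart.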

Duplex streaming in Java EE

I'm looking for a full duplex streaming solution with Java EE.
The situation: client applications (JavaFX) read data from a peripheral device. This data needs to be transferred in near real-time to a server for processing and also get the response back asynchronously, all while it keeps sending new data for processing.
Communication with the server needs to have as little overhead as possible. Data coming in is basically some sensor data, and after processing it is turned into what can be described as a set of commands.
What I've looked into:
A TCP/IP server (this is a non-Java EE approach). This would be the obvious solution. Two connections opened in parallel from each client app: one for upstream data and one for downstream data.
Remote & stateless EJBs. This would mean that there's no streaming involved and that I pack sensor data in smaller windows (1-2 seconds worth of sensor data) which I then send to the server for processing and get the processing result as a response. For this approach, while it is scalable, I am not sure how fast it will be considering I have to make a request every 1-2 seconds. I still need to test this but I have my doubts.
RMI. Is this any different than EJBs, technically?
Two servlets (up/down) with long polling. I've not done this before, so it's something to be tested.
For now I would like to test the performance for my approach #2. The first solution will work for sure, but I'm not too fond of having a separate server (next to Tomcat, where I already have something running).
However, meanwhile, it would be worth knowing if there are any other Java specific (EE or not) technologies that could easily solve this. If anyone has an idea, then please share it.
This looks like a good place for using JMS. Instead of stateless EJBs, you will probably be using Message-Driven Beans.
This gives you an approach similar to your first solution, using two message queues instead of TCP/IP connections. JMS makes your communications fully asynchronous and is low-overhead in the sense that your clients can send messages as fast as they can regardless of how fast your server can consume them. You also get delivery guarantees and other JMS goodness.
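The two-queue shape can be illustrated without any JMS dependency. The sketch below uses two in-memory `BlockingQueue`s in place of JMS queues and a plain thread in place of a Message-Driven Bean; the class `DuplexChannel`, the threshold of 50.0, and the command names are made up for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** In-memory sketch of full-duplex messaging over two queues (upstream data, downstream commands). */
class DuplexChannel {
    private final BlockingQueue<Double> upstream = new LinkedBlockingQueue<>();
    private final BlockingQueue<String> downstream = new LinkedBlockingQueue<>();
    private final Thread server;

    DuplexChannel() {
        server = new Thread(() -> {
            try {
                while (true) {
                    double reading = upstream.take();
                    // Hypothetical processing rule: turn a sensor reading into a command.
                    downstream.put(reading > 50.0 ? "COOL_DOWN" : "HOLD");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interrupted by close(): stop processing
            }
        });
        server.setDaemon(true);
        server.start();
    }

    /** Client side: send a reading without waiting for the server to process it. */
    void send(double reading) {
        upstream.add(reading);
    }

    /** Client side: wait up to timeoutMillis for the next command; null if none arrived in time. */
    String receive(long timeoutMillis) {
        try {
            return downstream.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    void close() {
        server.interrupt();
    }
}
```

The client can keep calling `send` while results trickle back through `receive`, which is the asynchronous behaviour the question asks for; a real JMS setup would add delivery guarantees on top.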
Tomcat does not come with JMS, however. You might try TomEE or integrate your existing Tomcat with a JMS implementation like ActiveMQ.
There are numerous options you could try. Appropriate solutions depend on the nature of your application, communication protocol, data transfer type, control you have over the client and server and firewall restrictions on client server routes.
There's not much info on this in your question, but given what you have provided, you may like to look at netty as it is quite general purpose and flexible and seems to fit your requirements. Netty also includes a duplex websocket implementation. Note that a netty based solution may be more complex to implement and require more background study than some other solutions (such as jms).
Yet another possible solution is GraniteDS, which advertises a JavaFX client integration and multiple server integrations for full duplex client/server communication, though I have not used it. GraniteDS uses comet (your two asynchronous servlets with long polling model) with the Action Message Format for data, which you may be familiar with from Flex/Flash.
Have you looked at websockets as a solution? They are known to keep persistent connections and hence the asynchronous response will be quick.

AMQP or XMPP for Client Notifications

I'm designing a replacement for a custom messaging system that is currently used to notify a JavaScript web application about changed content from the server side (Java). This legacy messaging system works through Flash XMLSocket by use of custom text based protocol and plain Java sockets.
The replacement will be used not only by the web application (through web sockets instead of Flash) but by an additional desktop client application written in C# too.
My requirements are:
user authentication
transport encryption (SSL/TLS)
bidirectional message exchange
some sort of (automatic) publish/subscribe so that users get only the messages they are allowed to receive
message exchange based on an established protocol (so that we may use existing libraries where possible)
cluster-able server components
At this time the messaging system will only be used to publish updates down to the clients. The clients will react to these messages and acquire further information directly from the server (not via the messaging system). If this new messaging system is established successfully it may be used for more advanced use cases in the future. Some possibilities may include users chatting, file exchange and remote control of the server components.
I did a little research about viable technology to implement these requirements and I think my choices boil down to either using ejabberd (XMPP) or RabbitMQ (AMQP). What are the main pros and cons of these two systems regarding my requirements? We already use RabbitMQ for other parts of our system infrastructure so it was my natural choice. I'm just not sure if it would be a good idea to have client applications connect directly to such a critical main component. This may be mitigated though by just using a different RabbitMQ installation for the client notifications.
Well, you could use either protocol to fulfill your needs. XMPP is an extensible protocol, so no doubt the things you're looking for already exist as a plugin, or whatever the correct term for a protocol extension is. However, some, me included, might actually see this as a downside, since extensions add extra complexity to the protocol. Another thing to keep in mind is that XMPP was primarily designed to be an instant messaging protocol. For example, publish/subscribe is an extension of XMPP and not part of the protocol itself.
That being said xmpp is backed by organisations such as Google, which means that there are some major players using this protocol, so no doubt some of the extensions are really good and well written/thought out.
On the other hand you have AMQP, a protocol almost specifically designed for what you're after. It is backed by such organisations as JP Morgan, Cisco, Credit Suisse etc., so no doubt AMQP is a protocol to be counted on, even though early versions of it have been criticized.
When it comes to using RabbitMQ there seem to be some issues with memory, however I cannot speak much of that since I was only notified about this issue, and never really got into actually fix it or even understand it. However there seem to be quite a few people experiencing this on different versions of RMQ.
But for me RabbitMQ has never crashed (hey if erlang is known for one thing, it's stability), it has been a joy to setup and it is really easy to setup in a cluster, and you can have your queues easily mirrored on several instances of RMQ and you can have an extra layer of security by having one or more instances write messages to disk.
So I'd say go with RabbitMQ and AMQP. I believe it's a protocol well suited for your needs, but having said that, XMPP could probably do the work just fine as well.
I've read this book, which is a nice introduction to AMQP and RabbitMQ but I find it lacking in the technical aspect, it's basically a good tutorial.
PS: I feel I should be honest and say that I'm not really sure what bidirectional message exchange entails, but if it means sending and receiving messages, you're in the clear on that point also with AMQP. :)
I hope this helped shed some light on which protocol to choose.
Edit
RabbitMQ has something called virtual hosts, each of which acts like its own instance of RabbitMQ, so you don't have to start by setting up a cluster just to handle separate responsibilities. Depending on how you set up your queues and exchanges I don't see a problem with clients connecting to the RabbitMQ server, but clustering is without a doubt a good idea. It also seems that setting up RabbitMQ with HAProxy is very easy, but yet again that is something I have no experience of.

JGroups, Terracotta & Hazelcast

Trying to wrap my head around these 3 projects and they all seem to handle slightly different problems that arise when trying to cluster. But all the documentation for them is sort of written for developers that are already "in the know", and are difficult for a newbie like me to make good sense of.
What are the specific problems each of them are trying to solve, and how do these problems differ from one another?
How does clustering with each of them differ from clustering app servers (like JBoss's or GlassFish's built-in clustering capabilities)?
Are the problems these frameworks solve different enough to warrant their use on the same project? Or are they competitors with each other and thus have different solutions to the same/similar problem?
Thanks in advance for any insight into these curious, yet elusive frameworks!
jgroups is more about task distribution and cluster management while hazelcast/terracotta are more distributed caches (data grids) - there is certainly overlap between them when you compare all the functionality - you need to figure out what functionality is more important and perhaps easier to implement.
hazelcast allows clustering through either tcp based addressing or multicasting. It supports maps, multimaps, lists, queues, topics - for disk based backups, you have to implement load/store interfaces.
With EhCache, you can use JGroups, JMS or RMI replication for caches.
In short, if you're looking for a distributed data cache/grid, hazelcast or ehcache would be the tools to look at - if you're looking for task distribution using a library and are not concerned about existing data grid caches, JGroups would work for you.
It is possible to distinguish two categories of technologies: i. enabler (i.e. middleware API) and ii. end-to-end or ready-to-use solution (i.e. applicative API).
JGroups is an enabler technology as its core implements the Group Communication Primitives like Reliable Unicast, Multicast and Broadcast which are building blocks for more complex distributed protocols like Atomic Broadcast; it falls into the category of Group Communication Toolkits (GCTs).
Hazelcast as well as Terracotta are end-to-end service technologies as they provide a rich set of services for distributed applications; they fall into the category of In-Memory Data Grids (IMDGs), also known as distributed and in-memory caching solutions, which are very well suited to computing on data with low latency.
In terms of capabilities:
JGroups provides a set of primitives to enable group membership, which is a key concept in any clustering scenario where a group of joining/leaving participants/nodes has to be managed in terms of lifecycle and roles; it allows a rich set of protocols to be created, based on a Protocol Kernel design, by stacking micro protocols on the foundation APIs (as said, group membership), which rely on reliable distribution of messages over both TCP and UDP. Out of the box, JGroups does not provide any composite service: such services can be built on top of the basic capabilities it provides.
Hazelcast provides a rich set of distributed data structures which can be fully replicated or sharded with an implicit replication factor; distributed List, Map, Queue and Lock are example of basic data structures implemented using the Java Collection interfaces, clearly the distribution and so replication implicitly needs Group Membership services which are provided by its Engine and in particular by the Cluster Manager with Cloud Discovery SPI module. Hazelcast can achieve Group Membership management via IP Multicast, IP Multicast with TCP and 3rd party Cloud Service (e.g. Zookeeper). Potentially, Hazelcast might use JGroups services for Node discovery and Cluster Management (aka Group Membership service).
Terracotta, on the other hand, if associated to the popular Ehcache then it provides a Distributed Caching service which in turn is based on a set of basic Group Membership capabilities; in terms of implementation, Terracotta Ehcache may be based on top of JGroups services and provides a specific set of APIs specific of a Caching system and so less generic than Hazelcast.
Considering the relationship between the two types of technology, JGroups is effectively an enabling service (i.e. building block) for compound services (i.e. semantically rich APIs) like Hazelcast and Terracotta which provide end-to-end or ready-to-use services for 3rd party applications, managing all the reliable distribution aspects behind the scenes. Definitely, JGroups is a middleware and Hazelcast and Terracotta are applications which may embed their own middleware implementation for clustering services.

what is JMS good for? [closed]

I'm looking for (simple) examples of problems for which JMS is a good solution, and also reasons why JMS is a good solution in these cases. In the past I've simply used the database as a means of passing messages from A to B when the message cannot necessarily be processed by B immediately.
A hypothetical example of such a system is where all newly registered users should be sent a welcome e-mail within 24 hours of registration. For the sake of argument, assume the DB does not record the time when each user registered, but instead a reference (foreign key) to each new user is stored in the pending_email table. The e-mail sender job runs once every 24 hours, sends an e-mail to all the users in this table, then deletes all the pending_email records.
This seems like the kind of problem for which JMS should be used, but it's not clear to me what benefit JMS would have over the approach I've described. One advantage of the DB approach is that the messages are persistent. I understand that JMS message queues can also be persisted, but in that case there seems to be little difference between JMS and the "database as message queue" approach I've described?
What am I missing?
- Don
JMS and messaging is really about 2 totally different things.
publish and subscribe (sending a message to as many consumers as are interested - a bit like sending an email to a mailing list, where the sender does not need to know who is subscribed)
high performance reliable load balancing (message queues)
See more info on how a queue compares to a topic
The case you are talking about is the second case, where yes you can use a database table to kinda simulate a message queue.
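The difference between the two delivery models can be shown with a small in-memory sketch. It deliberately ignores everything a JMS provider adds (buffering when no consumer is connected, persistence, acknowledgements, concurrency) and only illustrates who receives a message; `QueueVsTopic`, `Topic` and `WorkQueue` are hypothetical names, and round-robin is just one possible way to pick the single receiving consumer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** In-memory sketch contrasting topic (broadcast) and queue (one consumer per message) delivery. */
class QueueVsTopic {

    /** Topic: every current subscriber receives every published message. */
    static class Topic {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(String message) {
            for (Consumer<String> s : subscribers) {
                s.accept(message);
            }
        }
    }

    /** Queue: each message goes to exactly one consumer, chosen round-robin in this sketch. */
    static class WorkQueue {
        private final List<Consumer<String>> consumers = new ArrayList<>();
        private int next = 0;

        void addConsumer(Consumer<String> consumer) {
            consumers.add(consumer);
        }

        void send(String message) {
            if (consumers.isEmpty()) {
                // A real queue would hold the message until a consumer connects.
                throw new IllegalStateException("no consumers registered");
            }
            consumers.get(next).accept(message);
            next = (next + 1) % consumers.size();
        }
    }
}
```

With two subscribers on a `Topic`, both see every message; with two consumers on a `WorkQueue`, the messages are split between them, which is the load-balancing behaviour discussed next.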
The main difference is that a JMS message queue is a high performance, highly concurrent load balancer designed for huge throughput; you can usually send tens of thousands of messages per second to many concurrent consumers in many processes and threads. The reason for this is that a message queue is basically highly asynchronous - a good JMS provider will stream messages ahead of time to each consumer so that there are thousands of messages available to be processed in RAM as soon as a consumer is available. This leads to massive throughput and very low latency.
e.g. imagine writing a web load balancer using a database table :)
When using a database table, typically one thread tends to lock the whole table so you tend to get very low throughput when trying to implement a high performance load balancer.
But like most middleware it all depends on what you need; if you've a low throughput system with only a few messages per second - feel free to use a database table as a queue. But if you need low latency and high throughput - then JMS queues are highly recommended.
In my opinion JMS and other message-based systems are intended to solve problems that need:
Asynchronous communications: An application needs to notify another that an event has occurred with no need to wait for a response.
Reliability. Ensure once-and-only-once message delivery. With your DB approach you have to "reinvent the wheel", especially if you have several clients reading the messages.
Loose coupling. Not all systems can communicate using a database. So JMS is pretty good to be used in heterogeneous environments with decoupled systems that can communicate over system boundaries.
The JMS implementation is "push", in the sense that you don't have to poll the queue to discover new messages, but you register a callback that gets called as soon as a new message arrives.
to address the original comment. what was originally described is the gist of (point-to-point) JMS. the benefits of JMS are, however:
you don't need to write the code yourself (and possibly screw up the logic so that it's not quite as persistent as you think it is). also, third-party impl might be more scalable than simple database approach.
jms handles publish/subscribe, which is a bit more complicated than the point-to-point example you gave
you are not tied to a specific implementation, and can swap it out if your needs change in the future, w/out messing w/ your java code.
One advantage of JMS is that it enables asynchronous processing, which can be done by a database solution as well. However, the following are some other benefits of JMS over a database solution:
a) The consumer of the message can be in a remote location. Exposing a database for remote access is dangerous. You can work around this by providing an additional service for reading messages from the database, but that requires more effort.
b) In the case of a database, the message consumer has to poll the database for messages, whereas JMS provides a callback when a message has arrived (as sk mentioned)
c) Load balancing - if there are lot of messages coming it is easy to have pool of message processors in JMS.
d) In general implementation via JMS will be simpler and take less effort than database route
JMS is an API used to transfer messages between two or more clients. Its specification is defined under JSR 914.
The major advantage of JMS is the decoupled nature of communicating entities - Sender need not have information about the receivers. Other advantages include the ability to integrate heterogeneous platforms, reduce system bottlenecks, increase scalability, and respond more quickly to change.
JMS is just a set of interfaces/APIs, and the concrete classes must be implemented. These are already implemented by various organizations, which are called JMS providers. Examples are WebSphere by IBM, FioranoMQ by Fiorano Software, ActiveMQ by Apache, HornetQ, OpenMQ, etc. Other terms used are administered objects (topics, queues, connection factories), JMS producer/publisher, JMS client, and the message itself.
So coming to your question - what is JMS good for?
I would like to give a practical example to illustrate its importance.
Day Trading
There is this feature called LVC(Last value cache)
In trading, share prices are published by a publisher at regular intervals. Each share has an associated topic to which its price is published. If you know what a topic is, you know that messages are not stored the way they are on queues: they are delivered only to the subscribers alive at the time the message was published. (The exception is durable subscribers, which get all the messages published since the subscription was created, but we don't want stale stock prices, which rules out using them.) So if a client wants to know a stock price, it creates a subscriber and then has to wait until the next price is published, which is again not what we want. This is where LVC comes into the picture. Each LVC message has an associated key. If a message is sent with an LVC key (for a particular stock) and then another update message is sent with the same key, the latter overrides the former. Whenever a subscriber subscribes to a topic that has LVC enabled, it gets the latest message for each distinct LVC key. If we keep a distinct key per listed company, then when a client subscribes it gets the latest stock price immediately and then all subsequent updates.
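The last-value-cache behaviour described above can be sketched in plain Java. This is only an in-memory illustration of the idea, not any provider's actual LVC API: the class `LastValueCacheTopic`, its methods, and the ticker symbols used in the example are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** In-memory sketch of a topic that remembers the latest message per key. */
class LastValueCacheTopic {
    // Latest value per LVC key; a newer message with the same key overrides the older one.
    private final Map<String, String> lastByKey = new LinkedHashMap<>();
    private final List<BiConsumer<String, String>> subscribers = new ArrayList<>();

    /** Publishes an update: caches it under its key, then delivers it to current subscribers. */
    synchronized void publish(String key, String value) {
        lastByKey.put(key, value);
        for (BiConsumer<String, String> s : subscribers) {
            s.accept(key, value);
        }
    }

    /** A new subscriber first receives the latest value of every key, then all later updates. */
    synchronized void subscribe(BiConsumer<String, String> subscriber) {
        for (Map.Entry<String, String> e : lastByKey.entrySet()) {
            subscriber.accept(e.getKey(), e.getValue());
        }
        subscribers.add(subscriber);
    }
}
```

If prices `10.0` and then `10.5` are published under key `ACME` before anyone subscribes, a late subscriber in this sketch receives only `10.5` on subscribing and does not have to wait for the next tick.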
Of course, this is just one of the factors, other than reliability, security, etc., which make JMS so powerful.
Guido has the full definition. From my experience all of these are important for a good fit.
One of the uses I've seen is for order distribution in warehouses. Imagine an office supply company that has a fair number of warehouses that supply large offices with office supplies. Those orders would come into a central location and then be batched up for the correct warehouse to distribute. The warehouses don't have or want high speed connections in most cases so the orders are pushed down to them over dialup modems and this is where asynchronous comes in. The phone lines are not really that important either so half the orders may get in and this is where reliability is important.
The key advantage is decoupling unrelated systems rather than having them share common databases or building custom services to pass data around.
Banks are a keen example, with intraday messaging being used to pass around live data changes as they happen. It's very easy for the source system to throw a message "over the wall"; the downside is there's very little in the way of contract between these systems, and you normally see hospitalisation being implemented on the consumer's side. It's almost too loosely coupled.
Other advantages are down to the support for JMS out of the box for many application servers, etc. and all the tools around that: durability, monitoring, reporting and throttling.
There's a nice write-up with some examples here: http://www.winslam.com/laramee/jms/index.html
The 'database as message queue' solution may be heavy for the task. The JMS solution is less tightly coupled in that the message sender does not need to know anything about the recipient. This could be accomplished with some additional abstraction in the 'database as message queue' as well so it is not a huge win...Also, you can use the queue in a 'publish and subscribe' way which can be handy depending on what you are trying to accomplish. It is also a nice way to further decouple your components. If all of your communication is within one system and/or having a log that is immediately available to an application is very important, your method seems good. If you are communicating between separate systems JMS is a good choice.
JMS in combination with JTA (Java Transaction API) and JPA (Java persistence API) can be very useful. With a simple annotation you can put several database actions + message sending/receiving in the same transaction. So if one of them fails everything gets rolled back using the same transaction mechanism.
