Pub/Sub paradigm: Can I know if subscribers are alive?

I am using JMS (ActiveMQ) between a Topic Publisher and a variable number of Topic Subscribers.
I would have to check if, at a given moment in time, some of the subscribers are "offline" (disconnected, shutdown, unable to communicate, etc...).
Is there a way that JMS allows the publisher to know which subscribers are "registered"?
Right now, I have it implemented so that subscribers send an "alive" message on a specific Queue (acting as producers) and the publisher receives them (acting as a consumer): if it is detected that any of the subscribers didn't "ping" for X seconds (threshold), it is assumed to be offline.
It works, but I was curious to know if I reinvented the wheel...
I know that this feature is not completely related to Messaging or Pub/Sub paradigm and I also know that Pub/Sub is specifically designed so that the publisher doesn't have to worry about who/where/when its messages will be consumed....but I was wondering that, if it DID want to know, maybe there was a way.
After all, it doesn't seem like a particularly uncommon usecase....
Thank you very much.

I don't think there is a way to tell your publisher the information about the subscribers directly.
What you should do is use ActiveMQ advisory messages to keep track of the status of your subscribers. Read the article - it provides all the information you need.

JMS explicitly decouples publishers from subscribers. The whole point was supposed to be that the two did not need to know or care about the state of the other. Therefore JMS has no facility to do what you require. On the other hand, providers will have vendor-specific administration APIs and, as Paul notes, the AMQ advisory messages are what you want with Active MQ. Because these are vendor-specific it will not be portable to any other provider though and will not be JMS compliant. Not the fault of AMQ, just that it's not part of the JMS spec.
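
For comparison with advisory messages, the portable heartbeat scheme described in the question can be sketched roughly as below. This is only an illustration in plain Java with no JMS calls: the class name HeartbeatTracker, its methods, and the 5-second threshold are all made up for the example, and the caller is assumed to pass in the current time whenever an "alive" message arrives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: remembers when each subscriber last sent an "alive" ping.
class HeartbeatTracker {
    private final long thresholdMillis;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatTracker(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Intended to be called from the listener that consumes the "alive" queue.
    void recordHeartbeat(String subscriberId, long nowMillis) {
        lastSeen.put(subscriberId, nowMillis);
    }

    // Returns the subscribers whose last ping is older than the threshold, sorted by id.
    List<String> offlineSubscribers(long nowMillis) {
        List<String> offline = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            if (nowMillis - entry.getValue() > thresholdMillis) {
                offline.add(entry.getKey());
            }
        }
        Collections.sort(offline);
        return offline;
    }

    public static void main(String[] args) {
        HeartbeatTracker tracker = new HeartbeatTracker(5_000);
        tracker.recordHeartbeat("subscriber-a", 0);
        tracker.recordHeartbeat("subscriber-b", 4_000);
        // At t = 6000 ms only subscriber-a has been silent for more than 5000 ms.
        System.out.println(tracker.offlineSubscribers(6_000));
    }
}
```

Note that a tracker like this only knows about subscribers that have pinged at least once, which is one reason broker-side information such as advisory messages can be more complete.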

Related

Local message queue for sharing data between two processes

I have a server application A that produces records as requests arrive. I want these records to be persisted in a database. However, I don't want to let application A threads spend time persisting the records by communicating directly with the database. Therefore, I thought about using a simple producers-consumers architecture where application A threads produce records and, another application B threads are the consumers that persist the records to the database.
I'm looking for the "best" way to share these records between applications A and B. An important requirement is that application A threads must always be able to send records to the IPC system (e.g. a queue, but it may be some other solution). Therefore, I think the records must always be stored locally, so that application A threads are able to send records even if the network is down.
The initial idea that came to my mind was to use a local message queue (e.g. ActiveMQ). Do you think a local message queue is appropriate? If yes, do you recommend a specific message queue implementation? Note that both applications are written in Java.
Thanks, Mickael

For this kind of need a queueing solution seems to be the best fit, as the producer and consumer of the events can work in isolation. There are many solutions out there, and I have personally worked with RabbitMQ and ActiveMQ. Both are equally good. I don't wish to compare their performance characteristics here, but RabbitMQ is written in Erlang, which is a language tailor-made for building real-time applications.
Since you're already on the Java platform, ActiveMQ might be a better option, and it is capable of high throughput. With a solution like this, the consumer does not have to be online all the time. Depending on how critical your event data is, you may also want persistent queues and messages, so that in the event of a message broker failure you can still recover the important "event" messages your application A produced.
If there are many applications producing events and you later wish to scale out (scale horizontally) the broker service because it is becoming a bottleneck, both of the above solutions provide clustering.
Last but not least, if you want to share these events between different platforms, you may wish to exchange messages over AMQP, which is a platform-independent wire-level protocol for sharing messages between heterogeneous systems; I'm not sure if this is a requirement for you. RabbitMQ and ActiveMQ both support AMQP. Both also support MQTT, a lightweight messaging protocol, but it seems that you don't need MQTT.
There are other products, such as HornetQ and Apache Qpid, which are also production-ready solutions, but I have not used them personally.
I think a queueing solution is the best approach in terms of maintainability, loose coupling of the participating applications, and performance.

JMS topic receive in a Queue listener

I have a question regarding JMS. I've been reading some blogs that show how a message sent to a topic can be received by a queue listener. Is that even possible? As far as I know, only a client subscribed to a particular topic can receive a message published to it.
Regards.

So, given you publish to a topic, you want to consume the messages from a queue. I assume you use ActiveMQ since you added that tag.
The main reason for this setup is to be able to load balance multiple cluster nodes of the consumer. Plain durable subscriptions won't allow that in JMS 1.x. I guess your case is similar.
In generic JMS, this is not possible. However, in JMS 2.0 durable subscriptions can be load balanced and hence work a bit like queues. Not all JMS brokers implement JMS 2.0. ActiveMQ does not implement JMS 2.0, but ActiveMQ Artemis does.
ActiveMQ allows this through a concept called Virtual Topics. With Virtual Topics you give the topic a name that follows a certain pattern, say VirtualTopic.MyTopic, and the broker forwards all published messages to every queue whose name matches the pattern Consumer.MyConsumer.VirtualTopic.MyTopic.
Example topic name:
VirtualTopic.GameScores
Example queue names:
Consumer.ScoreBoardService.VirtualTopic.GameScores
Consumer.BettingService.VirtualTopic.GameScores
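
The naming convention above can be captured in a small helper so that publishers and consumers cannot drift apart. This is a sketch only: the class VirtualTopicNames and its methods are hypothetical, and it assumes the broker uses the default "VirtualTopic." and "Consumer." prefixes shown in the examples.

```java
// Hypothetical helper that builds destination names following the
// "VirtualTopic.<name>" / "Consumer.<consumer>.VirtualTopic.<name>" convention.
class VirtualTopicNames {
    static final String TOPIC_PREFIX = "VirtualTopic.";
    static final String CONSUMER_PREFIX = "Consumer.";

    // Name of the topic the publisher sends to.
    static String topicName(String name) {
        return TOPIC_PREFIX + name;
    }

    // Name of the queue a given consumer group reads from.
    static String consumerQueue(String consumerName, String virtualTopic) {
        if (!virtualTopic.startsWith(TOPIC_PREFIX)) {
            throw new IllegalArgumentException("not a virtual topic name: " + virtualTopic);
        }
        return CONSUMER_PREFIX + consumerName + "." + virtualTopic;
    }

    public static void main(String[] args) {
        String topic = topicName("GameScores");
        System.out.println(topic);
        System.out.println(consumerQueue("ScoreBoardService", topic));
        System.out.println(consumerQueue("BettingService", topic));
    }
}
```

All instances of one service would listen on the same Consumer queue, which is what gives the load balancing mentioned earlier.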

A Scalable Architecture for Reconstructing events

I have been tasked to develop the architecture for a data transformation pipeline.
Essentially, data comes in at one end and is routed through various internal systems acquiring different forms before ending up in its destination.
The main objectives are -
Fault Tolerant. The message should be recoverable if one of the intermediate systems were down.
Replay/ Resequence - The message can be replayed from any stage and it should be possible to recreate the events in an idempotent manner.
I have a few custom solutions in mind to address
Implement a checkpoint system where a message can be logged at both entry and exit points at each checkpoint so we know where failure happens.
Implement a recovery mechanism that can go to the logged storage ( database, log file etc.. ) and reconstruct events programmatically.
However, I have a feeling this is a fairly standard problem with well defined solutions.
So, I would welcome any thoughts on a suitable architecture to go with, any tools/packages/patterns to refer to etc..
Thanks

Akka is an obvious choice. Of course the Scala version is more powerful, but even with the Java bindings you can achieve a lot.
I think you can follow the CQRS approach and use the Akka Persistence module. In that case it's easy to replay any sequence of events, because you always have a persistent journal.
In general, the Actor Model gives you fault tolerance through supervision.
Akka Clustering will give you the scalability you need.
Really awesome example of using Akka Clustering with Akka Persistence and Cassandra - https://github.com/boldradius/akka-dddd-template (only Scala unfortunately).

One common solution is JMS, where a central component (the JMS broker) keeps a transactional store of pending messages. Because it does nothing other than that, it can have a high uptime (uptime can be increased further with a failover cluster, in which case you'll likely want its persistence store to be a failover cluster, too).
Sending a JMS message can be made transactional, as can consuming a message. These transactions can be synchronized with database transactions through XA transactions, which do their utmost to get as close to exactly-once delivery as possible, but this is rather heavy machinery.
In many cases (idempotent receiver), at-least-once delivery is sufficient. This can be accomplished by sending the message with a synchronous transaction (that is, the sender only succeeds once the broker has acknowledged receipt of the message), and consuming a message only after it has been processed.
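
As a rough illustration of the idempotent receiver mentioned above, the sketch below ignores redeliveries by remembering message ids. It is plain Java rather than JMS code; the class IdempotentReceiver and its method names are invented for the example, and a real system would presumably keep the set of processed ids in the same durable store (and transaction) as the side effects instead of in memory.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical receiver that tolerates at-least-once delivery by deduplicating on message id.
class IdempotentReceiver {
    private final Set<String> processedIds = new HashSet<>();
    private final List<String> applied = new ArrayList<>();

    // Returns true if the message was applied, false if it was a duplicate delivery.
    synchronized boolean onMessage(String messageId, String payload) {
        if (!processedIds.add(messageId)) {
            return false; // already seen: safe to acknowledge again without side effects
        }
        applied.add(payload); // stand-in for the real processing step
        return true;
    }

    // Copy of the payloads that were actually applied, in order.
    synchronized List<String> applied() {
        return new ArrayList<>(applied);
    }

    public static void main(String[] args) {
        IdempotentReceiver receiver = new IdempotentReceiver();
        receiver.onMessage("id-1", "create order");
        receiver.onMessage("id-1", "create order"); // redelivery after a lost acknowledgement
        receiver.onMessage("id-2", "ship order");
        System.out.println(receiver.applied());
    }
}
```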

Can I somehow overwrite JMS provider behavior in messaging?

I know this might sound ridiculous to some experts; however, it's been in my head for quite a while and I still haven't found a concrete answer.
In PUBLISH/SUBSCRIBE MESSAGING WITH JMS TOPICS: JMS publisher sends a msg to JMS provider, and JMS provider sends the msg to JMS subscribers and receives their acknowledgement.
Is it possible to somehow modify the JMS provider so that it only sends out every other message it receives from the JMS publisher?
Totally newbie in this field, so any suggestion is welcomed.

If what you want is for the subscriber to be able to configure to receive messages in batches, where each subscriber can have a different batch size, then JMS will not provide this functionality. This is not a typical pubsub type scenario.
If you want to accomplish this, I would suggest you add some custom buffering on your subscriber side that will queue up the incoming messages and then do a batch notify when your queue is full. This could then be easily configured on a per subscriber basis.
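
A minimal sketch of such subscriber-side buffering might look like the following. It is not part of the JMS API: BatchingBuffer, its constructor parameters, and the handler callback are hypothetical names, and the idea is that the subscriber's message listener would simply call onMessage for every incoming message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical buffer: collects incoming messages and hands them over in fixed-size batches.
class BatchingBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> batchHandler;
    private final List<T> pending = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<List<T>> batchHandler) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.batchHandler = batchHandler;
    }

    // Called once per received message; notifies the handler when a batch is full.
    synchronized void onMessage(T message) {
        pending.add(message);
        if (pending.size() >= batchSize) {
            List<T> batch = new ArrayList<>(pending);
            pending.clear();
            batchHandler.accept(batch);
        }
    }

    // Number of messages still waiting for the current batch to fill up.
    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        BatchingBuffer<String> buffer =
                new BatchingBuffer<>(2, batch -> System.out.println("batch: " + batch));
        buffer.onMessage("m1");
        buffer.onMessage("m2"); // second message completes the first batch
        buffer.onMessage("m3"); // stays pending until another message arrives
    }
}
```

Each subscriber can construct its buffer with its own batch size. A production version would probably also flush on a timer so that a partially filled batch does not wait forever.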
The only messaging system that I know provides a similar functionality is pubsub in XMPP, but even then the batches are determined by a timed interval instead of number.

You could look at filtering on your JMS subscriptions using JMS API message selectors. You can then read/process only the messages that match certain criteria.
With more information about what you are trying to accomplish (filtering? testing dropped messages? load balancing? something else entirely?) you might get a better answer.

Why would you want to do this? Would it not defeat the whole gambit of messaging, which is not to lose any messages? Or is it that you want to control exactly how the message gets distributed to subscribers? Even this would go against the basic JMS specifications.

what is JMS good for? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
I'm looking for (simple) examples of problems for which JMS is a good solution, and also reasons why JMS is a good solution in these cases. In the past I've simply used the database as a means of passing messages from A to B when the message cannot necessarily be processed by B immediately.
A hypothetical example of such a system is where all newly registered users should be sent a welcome e-mail within 24 hours of registration. For the sake of argument, assume the DB does not record the time when each user registered, but instead a reference (foreign key) to each new user is stored in the pending_email table. The e-mail sender job runs once every 24 hours, sends an e-mail to all the users in this table, then deletes all the pending_email records.
This seems like the kind of problem for which JMS should be used, but it's not clear to me what benefit JMS would have over the approach I've described. One advantage of the DB approach is that the messages are persistent. I understand that JMS message queues can also be persisted, but in that case there seems to be little difference between JMS and the "database as message queue" approach I've described?
What am I missing?
- Don

JMS and messaging are really about two quite different things:
1. Publish and subscribe: sending a message to as many consumers as are interested. It is a bit like sending an email to a mailing list; the sender does not need to know who is subscribed.
2. High-performance, reliable load balancing (message queues).
See more info on how a queue compares to a topic
The case you are talking about is the second case, where yes you can use a database table to kinda simulate a message queue.
The main difference is that a JMS message queue is a high-performance, highly concurrent load balancer designed for huge throughput; you can usually send tens of thousands of messages per second to many concurrent consumers in many processes and threads. The reason for this is that a message queue is basically highly asynchronous - a good JMS provider will stream messages ahead of time to each consumer, so that there are thousands of messages available to be processed in RAM as soon as a consumer is available. This leads to massive throughput and very low latency.
e.g. imagine writing a web load balancer using a database table :)
When using a database table, typically one thread tends to lock the whole table so you tend to get very low throughput when trying to implement a high performance load balancer.
But like most middleware it all depends on what you need; if you've a low throughput system with only a few messages per second - feel free to use a database table as a queue. But if you need low latency and high throughput - then JMS queues are highly recommended.
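
To make the "queue as load balancer" idea concrete, here is a small in-memory sketch in which several consumers compete for messages from one queue. It deliberately uses java.util.concurrent instead of a JMS broker, so it only illustrates the pattern and says nothing about real broker throughput; the class name CompetingConsumersDemo, the poison-pill marker, and the 10-second timeout are assumptions made for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for a message queue with competing consumers.
class CompetingConsumersDemo {
    private static final String STOP = "__STOP__"; // poison pill that ends one consumer

    // Sends messageCount messages to consumerCount consumers; returns how many were processed.
    static int process(int messageCount, int consumerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumerCount);
        for (int i = 0; i < consumerCount; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String message = queue.take(); // each message goes to exactly one consumer
                        if (message.equals(STOP)) {
                            return;
                        }
                        processed.incrementAndGet(); // stand-in for real work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            for (int i = 0; i < messageCount; i++) {
                queue.put("msg-" + i);
            }
            for (int i = 0; i < consumerCount; i++) {
                queue.put(STOP); // one pill per consumer
            }
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("consumers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for consumers", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("processed: " + process(1_000, 4));
    }
}
```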

In my opinion JMS and other message-based systems are intended to solve problems that need:
Asynchronous communication: an application needs to notify another that an event has occurred, with no need to wait for a response.
Reliability: ensure once-and-only-once message delivery. With your DB approach you have to "reinvent the wheel", especially if you have several clients reading the messages.
Loose coupling. Not all systems can communicate using a database. So JMS is pretty good to be used in heterogeneous environments with decoupled systems that can communicate over system boundaries.

The JMS implementation is "push", in the sense that you don't have to poll the queue to discover new messages, but you register a callback that gets called as soon as a new message arrives.

to address the original comment. what was originally described is the gist of (point-to-point) JMS. the benefits of JMS are, however:
you don't need to write the code yourself (and possibly screw up the logic so that it's not quite as persistent as you think it is). also, third-party impl might be more scalable than simple database approach.
jms handles publish/subscribe, which is a bit more complicated than the point-to-point example you gave
you are not tied to a specific implementation, and can swap it out if your needs change in the future, w/out messing w/ your java code.

One advantage of JMS is that it enables asynchronous processing, which can be done with a database solution as well. However, the following are some other benefits of JMS over a database solution:
a) The consumer of the message can be in a remote location. Exposing database for remote access is dangerous. You can workaround this by providing additional service for reading messages from database, that requires more effort.
b) In the case of a database, the message consumer has to poll the database for messages, whereas JMS provides a callback when a message has arrived (as sk mentioned)
c) Load balancing - if there are lot of messages coming it is easy to have pool of message processors in JMS.
d) In general implementation via JMS will be simpler and take less effort than database route

JMS is an API used to transfer messages between two or more clients. Its specs are defined under JSR 914.
The major advantage of JMS is the decoupled nature of communicating entities - Sender need not have information about the receivers. Other advantages include the ability to integrate heterogeneous platforms, reduce system bottlenecks, increase scalability, and respond more quickly to change.
JMS is just a set of interfaces/APIs, and the concrete classes must be implemented. Various organizations have already implemented them; these implementations are called JMS providers. Examples are WebSphere by IBM, FioranoMQ by Fiorano Software, ActiveMQ by Apache, HornetQ, OpenMQ, etc. Other terms used are Admin Objects (Topics, Queues, ConnectionFactories), JMS producer/publisher, JMS client, and the message itself.
So coming to your question - what is JMS good for?
I would like to give a practical example to illustrate its importance.
Day Trading
There is this feature called LVC(Last value cache)
In trading, share prices are published by a publisher at regular intervals. Each share has an associated Topic to which its prices are published. Now, if you know what a Topic is, then you know that messages are not stored as they are in queues. Messages are delivered to the subscribers that are alive at the time the message is published (the exception being durable subscribers, which get all the messages published since the subscription was created, but then again we don't want stale stock prices, which rules that option out). So if a client wants to know a stock price, it creates a subscriber and then has to wait until the next stock price is published (which is again not what we want). This is where LVC comes into the picture. Each LVC message has an associated key. If a message is sent with an LVC key (for a particular stock) and then another update message is sent with the same key, the latter overrides the former. Whenever a subscriber subscribes to a topic (which has LVC enabled), the subscriber gets all the messages with distinct LVC keys. If we keep a distinct key per listed company, then when a client subscribes it gets the latest stock price and, from then on, all the updates.
Of course, this is just one of the factors, other than reliability, security, etc., that make JMS so powerful.
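
The last-value-cache behaviour described above can be sketched in a few lines of plain Java. This is only a conceptual model, not any provider's actual LVC implementation: the class LastValueCache, its method names, and the ticker symbols are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Conceptual model of a last value cache: one retained message per key.
class LastValueCache {
    private final Map<String, String> latestByKey = new LinkedHashMap<>();

    // A later message with the same key overrides the earlier one.
    synchronized void publish(String key, String value) {
        latestByKey.put(key, value);
    }

    // What a newly connected subscriber would receive first: the latest value per key.
    synchronized Map<String, String> snapshotForNewSubscriber() {
        return new LinkedHashMap<>(latestByKey);
    }

    public static void main(String[] args) {
        LastValueCache cache = new LastValueCache();
        cache.publish("ACME", "10.00");
        cache.publish("INIT", "5.00");
        cache.publish("ACME", "10.50"); // replaces the earlier ACME price
        System.out.println(cache.snapshotForNewSubscriber());
    }
}
```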

Guido has the full definition. From my experience all of these are important for a good fit.
One of the uses I've seen is for order distribution in warehouses. Imagine an office supply company that has a fair number of warehouses that supply large offices with office supplies. Those orders would come into a central location and then be batched up for the correct warehouse to distribute. The warehouses don't have or want high speed connections in most cases so the orders are pushed down to them over dialup modems and this is where asynchronous comes in. The phone lines are not really that important either so half the orders may get in and this is where reliability is important.

The key advantage is decoupling unrelated systems rather than having them share common databases or building custom services to pass data around.
Banks are a keen example, with intraday messaging being used to pass around live data changes as they happen. It's very easy for the source system to throw a message "over the wall"; the downside is there's very little in the way of contract between these systems, and you normally see hospitalisation being implemented on the consumer's side. It's almost too loosely coupled.
Other advantages are down to the support for JMS out of the box for many application servers, etc. and all the tools around that: durability, monitoring, reporting and throttling.

There's a nice write-up with some examples here: http://www.winslam.com/laramee/jms/index.html

The 'database as message queue' solution may be heavy for the task. The JMS solution is less tightly coupled in that the message sender does not need to know anything about the recipient. This could be accomplished with some additional abstraction in the 'database as message queue' as well so it is not a huge win...Also, you can use the queue in a 'publish and subscribe' way which can be handy depending on what you are trying to accomplish. It is also a nice way to further decouple your components. If all of your communication is within one system and/or having a log that is immediately available to an application is very important, your method seems good. If you are communicating between separate systems JMS is a good choice.

JMS in combination with JTA (Java Transaction API) and JPA (Java persistence API) can be very useful. With a simple annotation you can put several database actions + message sending/receiving in the same transaction. So if one of them fails everything gets rolled back using the same transaction mechanism.
