I work on a data processing application in which concurrency is achieved by putting several units of work on a message queue that multiple instances of a message driven bean (MDB) listen to. Other than achieving concurrency in this manner, we do not have any specific reason to use the messaging infrastructure and MDBs.
This led me to wonder why the same could not have been achieved using multiple threads.
So my question is: in what situations can asynchronous messaging (e.g. JMS) be used as an alternative to multithreading as a means to achieve concurrency? What are some advantages and disadvantages of using one approach over the other?
It can't be used as an alternative to multithreading; it is a way of implementing multithreading. There are three basic kinds of solutions here:
You are responsible for both ends of the queue;
You are responsible for sending data; or
You are responsible for receiving data.
Receiving data is the kicker here, because there's really no way of doing that without some form of multithreading or multiprocessing; otherwise you'll only be processing one request at a time. Sending data without multithreading is much more viable, but there you're only really pushing the responsibility for dealing with those messages to an external system. So it's not an alternative to multithreading.
In your case with message driven beans, the container is creating and managing threads for you, so it's not an alternative to multithreading; you're simply using someone else's implementation.
There are two additional bonuses that I don't think have been mentioned: transactions and durability.
While it isn't required and quite often isn't the default configuration, JMS providers can be configured to persist the messages and also to participate in an XA transaction with little or no code changes.
In an EJB container, actually, there is no alternative, since you're not allowed to create your own threads in an EJB container. JMS is doing all of that work for you, at a cost of running it through the queue processor. You could also create a Java Connector, which has a more intimate relationship with the container (and thus, can have threads), but it's a lot more work.
If the overhead of using the JMS queue isn't having a performance impact, then it's the easiest solution.
Performance-wise, multithreading should be faster than messaging, because messaging adds an extra layer (usually a network hop to the broker) between producer and consumer.
Application-wise, messaging helps you avoid locking and data sharing issues, as there is no common object.
From a scaling perspective, messaging is a lot better, as you can add more nodes on several servers by configuring the message service instead of changing the application.
Messaging can greatly reduce the number of errors in multithreaded applications, since it reduces the risk of data races. It also makes it simpler to add new threads without changing the rest of the app.
That said, I think JMS is slightly misused here. java.util.concurrent's thread-safe queues and libraries like Jetlang may give you better performance.
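To make the java.util.concurrent suggestion concrete, here is a minimal sketch of an in-process work queue: one producer puts units of work on a BlockingQueue and several worker threads take from it. The class name, the poison-pill value and the "sum of squares" workload are assumptions made up for illustration, and this only applies outside an EJB container, where you are free to start your own threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class; names are illustrative only.
class InProcessWorkQueue {
    // Sentinel ("poison pill") that tells a worker to stop.
    private static final Integer POISON = Integer.MIN_VALUE;

    /** Sums the squares of 1..n using `workers` consumer threads fed by one queue. */
    static long sumOfSquares(int n, int workers) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicLong total = new AtomicLong();
        List<Thread> threads = new ArrayList<>();

        // Consumers: each one plays the role an MDB instance plays with JMS.
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        Integer unit = queue.take();      // blocks until work arrives
                        if (unit.equals(POISON)) {
                            return;                       // no more work for this worker
                        }
                        total.addAndGet((long) unit * unit);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }

        try {
            // Producer: enqueue the units of work, then one pill per worker.
            for (int i = 1; i <= n; i++) {
                queue.put(i);
            }
            for (int i = 0; i < workers; i++) {
                queue.put(POISON);
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while distributing work", e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(100, 4));
    }
}
```

Unlike a JMS queue, nothing here is persisted or transactional: if the JVM dies, the queued work is lost.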
With multithreading you achieve concurrency by sharing the CPU cores of a single machine. With JMS you can additionally balance the load and delegate tasks to other systems.
For example, suppose your application needs to send an email on completion of a certain task, and you want to send the emails concurrently. You can either spawn a thread and process the email asynchronously, or delegate the mail sending to another system using JMS. The number of receiver threads is configurable in JMS. Multiple nodes can also listen to the same JMS queue, which balances the load. And you can use further features such as persistent queues and transaction-managed queues as your application requires.
In simple words, JMS can be a better alternative to multithreading, depending on the application architecture.
I've seen recently that there are different frameworks out there that allow the use of a messaging architecture implemented in-process, using either the same thread or different threads. The ones I know about are Spring, Guava EventBus and Reactor.
My question is: what are good use cases where someone would want to use them instead of sending messages to a full-fledged broker? I understand that their usage allows for better decoupling of the business logic, but in a microservices architecture you would normally publish events to be consumed by other microservices. The advantage of that is the fault tolerance you get by adding a cluster of brokers, where a message that failed because of an error in one instance can be retried by another one. When logic is decomposed and executed by sending messages that are later consumed by the same system, especially when the subscribers run in different threads, it seems difficult to me to bring the data back to a consistent state afterwards.
The advantage of microservices over in-process messaging is not really in how messages are consumed.
Microservices allow you to execute portions of your code on specific nodes within a cluster, so you can allocate the heavy calculations to powerful machines and secondary or light work to less powerful ones. Overall this lets you balance performance better and scale resources for the portions of code that require it.
Also, whenever you update the code of a micro-service you do not impact the other services, so that your changes (and errors) are isolated. If everything runs within the same process any wrong update might actually render the entire solution unusable.
In the end, getting the communication out of your process (a third-party broker) allows you to share it with more people, agents, processes, etc. Otherwise they have to become part of your process (as a module?), and this is really not efficient.
Honestly, the only good reason for intra-process communication within your monolith is speed (in-memory communication rather than on-the-wire communication).
Within a Java EE 5 environment, I have the problem of ensuring that some data written by another part of the system exists before I continue processing my own data.
Historically (in J2EE times), this was done by waiting for e.g. 500ms via Thread.sleep and then putting the data object to be processed into an internal JMS queue.
But this does not feel like the best way to handle that problem, so I have 2 questions:
Is there any problem with using the sleep method within a Java EE context?
What is a reasonable solution for delaying some processing within a Java EE 5 application?
Edit:
I should have mentioned that my processing takes place while handling objects from a JMS queue via an MDB.
And it may be the case that the data I'm waiting for never shows up, so there must be some sort of timeout, after which I can do some special processing with my data.
You can use the EJB TimerService feature. Using threads directly in a managed environment should be avoided.
I agree with @dkaustubh about timers and avoiding thread manipulation in Java EE.
Another possibility is to use a JMS queue with delayed delivery. Although it is not part of the Java EE API, most messaging system vendors support it.
I think it's possible with a more advanced threading approach. Rather than doing manual synchronization and thread management, you can use the java.util.concurrent package.
A Future can be one way to do this. Please refer to the java.util.concurrent package.
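As a rough illustration of the Future idea, the sketch below waits for the other part's data with a timeout and falls back to special processing when it never arrives. The class name, method name and returned strings are invented for this example. CompletableFuture (Java 8+) is used only to drive the demo; the waiting code relies on Future.get(long, TimeUnit), which has existed since Java 5. Keep in mind that this blocks the calling thread while it waits, which may not be acceptable inside an MDB.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical example class; names and result strings are illustrative only.
class AwaitWithTimeout {

    /** Waits up to timeoutMillis for the data, then processes it or falls back. */
    static String awaitOrFallback(Future<String> data, long timeoutMillis) {
        try {
            // Blocks until the data is available or the timeout elapses.
            return "processed:" + data.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The data never showed up: do the special processing here.
            return "timed-out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            // The producer of the data failed.
            return "failed";
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> data = new CompletableFuture<>();
        // Stand-in for "the other part" writing its data a little later.
        new Thread(() -> data.complete("row-42")).start();
        System.out.println(awaitOrFallback(data, 500));
    }
}
```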
Use notifications via Object#wait() / Object#notifyAll(),
i.e. in a multithreaded setup, the producer notifies the consumer.
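A minimal sketch of that wait/notifyAll hand-off, including the timeout the question asks for, might look like this. DataLatch and its methods are hypothetical names. The loop around wait() is there because wait() can return spuriously or after the deadline, so the condition and the remaining time are rechecked each time.

```java
// Hypothetical example class; names are illustrative only.
class DataLatch {
    private final Object lock = new Object();
    private String data; // guarded by lock

    /** Producer side: store the data and wake up every waiting consumer. */
    void publish(String value) {
        synchronized (lock) {
            data = value;
            lock.notifyAll();
        }
    }

    /** Consumer side: returns the data, or null if it did not arrive in time. */
    String await(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (lock) {
            while (data == null) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return null; // timed out; caller does its special processing
                }
                try {
                    lock.wait(remainingMs); // releases the lock while waiting
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            return data;
        }
    }

    public static void main(String[] args) {
        DataLatch latch = new DataLatch();
        new Thread(() -> latch.publish("ready")).start();
        System.out.println(latch.await(1000));
    }
}
```

As with the other thread-based answers, this assumes you are allowed to block and coordinate threads yourself, which a managed Java EE environment discourages.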
When trying to distribute work that requires a multiple-stage processing pipeline, what are the communication, synchronization and throughput costs and limitations of JMS vs JavaSpaces?
If you want SEDA, sending messages from stage to stage, then JMS implementations are typically much faster and more scalable, since MOMs are designed to not require locks so they can be highly asynchronous and concurrent. With JMS you can set up a consumer on startup and the message broker will typically push messages to your application ASAP, so that there are many in-memory objects available at any time to be processed as soon as your application can process them, avoiding any network round trips or locking etc. See for example how prefetch works with ActiveMQ.
Using JavaSpaces for messaging tends to be less efficient as they are generally implemented using a more database-centric approach of using locks with read/writes to entries etc. So you tend to query for objects then process them with JavaSpaces which tends to be a bit more chatty and less efficient for messaging.
The big win of the JavaSpaces approach though is if you want shared state; you can use a JavaSpace as a kinda database. Though maybe if you really want a database, you could use a relational database with JMS; but JavaSpace folks like to use a single system for shared state and messaging.
FWIW there's often no silver bullet with middleware; sometimes in-memory SEDA is all you need, sometimes JMS, sometimes a relational database, sometimes files in a directory. It totally depends on your requirements, scalability, throughput, reliability and so forth. I tend to recommend that folks hide middleware APIs from their code so that they can switch to whatever middleware they want easily via a simple one-line config change, such as with Apache Camel.
JMS is an API, not a product. It cannot have any "communication, synchronization and throughput costs". A specific implementation of JMS (WebLogic, JBoss, Tibco, ...) can.
There are no synchronization functions in JMS, btw: a queue is a queue, and you cannot make one message (in one queue) wait for another message (in another queue).
One other point to consider: JMS queues don't provide the ability to block based on size, so a pure SEDA implementation has a hard time working with pure JMS queues, as it relies on the queues 'filling up' and applying back pressure on upstream stages.
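For contrast, an in-memory stage queue can apply that back pressure simply by being bounded. The sketch below uses a java.util.concurrent ArrayBlockingQueue with no consumer attached and counts how many offers succeed before the queue is full; the class name, capacity and timeout are assumptions chosen for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; names are illustrative only.
class BoundedStage {

    /**
     * Offers n items to a stage queue of the given capacity while nothing is
     * consuming from it, and returns how many items were accepted.
     */
    static int acceptedWithoutConsumer(int capacity, int n) {
        BlockingQueue<Integer> stageQueue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < n; i++) {
            try {
                // offer() waits briefly for space, then gives up: the upstream
                // stage feels the back pressure instead of flooding the queue.
                if (stageQueue.offer(i, 10, TimeUnit.MILLISECONDS)) {
                    accepted++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(acceptedWithoutConsumer(3, 5)); // prints 3
    }
}
```

Using put() instead of offer() would block the upstream stage until the downstream stage catches up, which is the classic SEDA behaviour.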
I am looking for a lightweight messaging framework in Java. My task is to process events in a SEDA manner: I know that some stages of the processing can be completed quickly and others cannot, and I would like to decouple these stages of processing.
Let's say I have components A and B, and a processing engine (be it a container or whatever else) invokes component A, which in turn invokes component B. I do not care if the execution time of component B is 2s, but I do care that the execution time of component A stays below 50ms, for example. Therefore, it seems most reasonable for component A to submit a message to B, which B will process at the desired time.
I am aware of the different JMS implementations and Apache ActiveMQ: they are too heavyweight for this. I searched for some lightweight messaging (with really basic features like message serialization and the simplest routing) to no avail.
Do you have anything to recommend in this issue?
Do you need any kind of persistence (e.g. if your JVM dies in between processing thousands of messages) and do you need messages to traverse to any other JVMs?
If it's all in a single JVM and you don't need to worry about transactions, recovery or message loss if a JVM dies, then as Chris says above, Executors are fine.
ActiveMQ is pretty lightweight; you can use it in a single JVM only, with no persistence, if you want to; you can then enable transactions / persistence / recovery / remoting (working with multiple JVMs) as and when you need it. But if you need none of these things then it's overkill; just use Executors.
Incidentally another option if you are not sure which steps might need persistence/reliability or load balancing to multiple JVMs would be to hide the use of middleware completely so you can switch between in memory SEDA queues with executors to JMS/ActiveMQ as and when you need to.
e.g. it might be that some steps need to be reliable & recoverable (so needing some kind of persistence) and other times you don't.
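One way to picture "hiding the middleware" without any framework is a small interface of your own that the stages depend on, with an in-memory implementation backed by an executor. Everything here (StageChannel, InMemoryChannel, the constructor arguments) is a hypothetical design invented for illustration, not Camel's or ActiveMQ's API; a JMS-backed implementation of the same interface could be swapped in later without touching the code that calls send().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical abstraction: stages only ever see this interface.
interface StageChannel<T> {
    void send(T message);
}

// In-memory implementation: hands each message to an executor thread.
class InMemoryChannel<T> implements StageChannel<T> {
    private final ExecutorService executor;
    private final Consumer<T> handler;

    InMemoryChannel(ExecutorService executor, Consumer<T> handler) {
        this.executor = executor;
        this.handler = handler;
    }

    @Override
    public void send(T message) {
        // Returns immediately; the handler runs later on the executor.
        executor.execute(() -> handler.accept(message));
    }
}

class ChannelDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        StageChannel<String> channel =
                new InMemoryChannel<>(pool, msg -> System.out.println("handled " + msg));
        channel.send("order-1");
        channel.send("order-2");
        pool.shutdown();                              // already queued messages still run
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```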
Really lightweight? Executors. :-) So you set up an executor (B, in your description), and A simply submits tasks to the executor.
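A minimal sketch of that idea, with hypothetical names (StageHandOff, componentA, componentB) and a 200ms sleep standing in for B's slow work: A does its quick part, submits the rest to B's executor, and returns right away.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; names are illustrative only.
class StageHandOff {
    // The executor plays the role of "B" and owns B's thread.
    private final ExecutorService stageB = Executors.newSingleThreadExecutor();

    /** Component A: does its fast work, hands off to B, returns without waiting. */
    Future<String> componentA(String event) {
        String validated = event.trim();              // A's quick work
        Callable<String> task = () -> componentB(validated);
        return stageB.submit(task);                   // queued for B
    }

    /** Component B: the slow stage, running on the executor's thread. */
    private String componentB(String event) throws InterruptedException {
        Thread.sleep(200);                            // stands in for slow processing
        return event.toUpperCase();
    }

    void shutdown() {
        stageB.shutdown();
    }

    public static void main(String[] args) throws Exception {
        StageHandOff stages = new StageHandOff();
        Future<String> result = stages.componentA("  hello ");
        System.out.println(result.get(5, TimeUnit.SECONDS)); // prints HELLO
        stages.shutdown();
    }
}
```

If A does not need B's result at all, it can ignore the returned Future, or use execute() with a Runnable instead.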
I think Apache Camel covers all your needs. It works within the JVM and supports SEDA style (http://camel.apache.org/seda.html) and simple routing. It can be used on its own, or with Spring, with a JMS provider or other adaptors.
Sorry for resurrecting an old thread, but maybe it helps somebody else reading it... I think FFMQ is a good candidate for a lightweight messaging framework.
UPDATE: however, I'm not sure if it supports redelivery delays (the dead-letter-queue problem). I would find this useful even for lightweight providers. But I guess it could be possible with a combination of a MessageSelector query and message properties.
To help anybody else reading this thread:
One of the lightest messaging frameworks is MBassador.
MBassador is a very light-weight message (event) bus implementation following the publish subscribe pattern. It is designed for ease of use and aims to be feature rich and extensible while preserving resource efficiency and performance.
The core of MBassador's high performance is a specialized data structure that minimizes lock contention such that performance degradation of concurrent access is minimal.
Features: Declarative listener definition via annotations, sync and/or async event delivery, weak-references, message filtering