I'm looking for a reasonably fast event handling mechanism in Java to generate and handle events across different JVMs running on different hosts.
For event handling across multiple threads in a single JVM, I found some good candidates like Jetlang. But in my search for a distributed equivalent, I couldn't find anything that was lightweight enough to offer good performance.
Does anyone know of any implementations that fit the bill?
Edit:
Putting numbers to indicate performance is a bit difficult. But for example, if you implement a heartbeating mechanism using events and the heartbeat interval is 5 seconds, the heartbeat receiver should receive a sent heartbeat within say a second or two.
Generally, a lightweight implementation gives good performance. An event-handling mechanism involving a web server or any kind of centralized hub that requires powerful hardware (definitely not lightweight) to give good performance is not what I'm looking for.
Hazelcast Topic is a distributed pub-sub messaging solution.
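import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.MessageListener;
import com.hazelcast.core.Topic;

public class Sample implements MessageListener {

    public static void main(String[] args) {
        Sample sample = new Sample();
        // Look up the distributed topic named "default"
        Topic topic = Hazelcast.getTopic("default");
        topic.addMessageListener(sample);
        topic.publish("my-message-object");
    }

    // Invoked for each message published to the topic
    public void onMessage(Object msg) {
        System.out.println("Message received = " + msg);
    }
}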
Hazelcast also supports events on distributed queue, map, set, list. All events are ordered too.
Regards,
-talip
http://www.hazelcast.com
Depending on your use case, Terracotta may be an excellent choice.
AMQP (Advanced Message Queuing Protocol) is probably what you're looking for. More details: http://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol
It is used by financial services companies for their high-performance requirements. Apache has an implementation in progress: http://cwiki.apache.org/qpid/
OpenAMQ (http://www.openamq.org/) is an older reference implementation.
For distributed event processing you could use Esper. It could process up to 500 000 events/s on dual-CPU 2 GHz Intel-based hardware. It's very stable because many banks use this solution. It supports JMS input and output adapters based on Spring JMS templates, so you could use any JMS implementation for event processing, e.g. ActiveMQ.
ZeroMQ - http://www.zeromq.org/
Although this is a transport layer, it can be tailored for event handling.
Whichever tool you use I'd recommend hiding the middleware APIs from your application logic. For example if you used the Apache Camel approach to hiding middleware you could then easily switch from AMQP to SEDA to JMS to ActiveMQ to JavaSpaces to your own custom MINA transport based on your exact requirements.
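To make the "hide the middleware" advice concrete, here is a minimal sketch in plain Java. It is not Camel's API: the `EventChannel` interface and the `InMemoryEventChannel` class are hypothetical names invented for illustration, and the in-memory implementation only stands in for whatever JMS, AMQP or other transport you would plug in behind the same interface.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class MiddlewareFacadeDemo {

    /** Hypothetical application-facing API; no middleware types leak through it. */
    interface EventChannel {
        void publish(String event);
        void subscribe(Consumer<String> handler);
    }

    /** In-process stand-in. A JMS- or AMQP-backed class would implement the same interface. */
    static final class InMemoryEventChannel implements EventChannel {
        private final List<Consumer<String>> handlers = new CopyOnWriteArrayList<>();

        @Override
        public void publish(String event) {
            for (Consumer<String> handler : handlers) {
                handler.accept(event); // synchronous delivery keeps the sketch short
            }
        }

        @Override
        public void subscribe(Consumer<String> handler) {
            handlers.add(handler);
        }
    }

    /** Application logic depends only on EventChannel, so the transport can be swapped. */
    static List<String> run(EventChannel channel) {
        List<String> received = new CopyOnWriteArrayList<>();
        channel.subscribe(received::add);
        channel.publish("heartbeat-1");
        channel.publish("heartbeat-2");
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run(new InMemoryEventChannel()));
    }
}
```

Switching transports then means writing another `EventChannel` implementation; `run` and the rest of the application logic stay untouched.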
If you want to use a message broker I'd recommend using Apache ActiveMQ which is the most popular and powerful open source message broker with the largest most active community behind it both inside Apache and outside it.
Take a look at Akka (http://akka.io/). It offers a distributed actor model for the JVM in the same vein as Erlang, with both Java and Scala APIs.
You need to implement the Observer design pattern for distributed event handling in Java. I am using event streaming with a MongoDB capped collection and observers to achieve this.
You can build an architecture in which your triggers publish a document to a capped collection and your observer thread waits for it using a tailable cursor.
If you did not understand what I have said above you need to brush up your MongoDB and java skills
If a JMS implementation isn't for you, then you may be interested in an XMPP approach. There are multiple implementations, and XMPP also has a Publish-Subscribe extension.
The Avis event router might be suitable for your needs. It's fast enough for near-real-time event delivery, such as sending mouse events for remote mouse control (an application we use it for daily).
Avis is also being used for chat, virtual presence, and smart room automation where typically 10-20 computers are communicating over an Avis-based messaging bus. Its commercial cousin (Mantara Elvin) is used for high-volume commercial trade event processing.
Related
I have set up a ZeroMQ pipeline through a VPN. However, the producer doesn't consider the consumption capacity of the consumer. The producer keeps sending messages to the consumer, and as a result RAM consumption has increased dramatically.
I want to find the reason behind this problem. Maybe it's due to the UDP VPN channel.
Q: (find the reason behind this problem) Maybe it's due to the UDP VPN channel.
Disclaimer, given the information asymmetry here:
The question provides no details (and no MCVE, the Minimum Complete Verifiable Example of code that the community promotes) about the UDP channel or about the composition, configuration and interconnection of the "producer" and "consumer" entities; it only mentions ZeroMQ itself. This answer can therefore rest only on generally available knowledge.
Answer:
The ZeroMQ framework is based on a few cardinal principles:
Rule 1) It is broker-less, i.e. the resulting ecosystem resembles a network of independently working agents.
Rule 2) The user can implement whatever add-on capability is needed to extend Rule 1.
That said, if you configure the application-domain behaviour of your ZeroMQ-interconnected agents properly, the local agent may receive some indication from the remote agent(s) about their problems with RAM (for details about possible restrictive configurations of the Tx/Rx queues, see the published API documentation).
Finally:
It is almost certain that UDP, and even less the VPN, has hardly anything to do with the "problem"
(the producer doesn't consider the consumption capacity of the consumer),
as this is a by-design property of the ZeroMQ concept, unless one implements an application-domain-specific distributed-FSA layer atop ZeroMQ's trivial, scalable formal communication pattern archetypes, a layer that provides such app-level signalling/messaging among the otherwise autonomous agents (entities each equipped with a ZeroMQ Context() instance).
If interested, feel free to read more about ZeroMQ and get inspired by its beauty and power.
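One common shape for such an app-level signalling layer is credit-based flow control: the consumer grants the producer a fixed number of credits, and the producer may only send while it holds one. The sketch below is plain Java, not ZeroMQ code. The `Semaphore` is an assumption standing in for credit messages that, in a real deployment, would travel back from consumer to producer over a second socket, and the unbounded queue stands in for an unconstrained send path. (ZeroMQ sockets also expose high-water-mark options, ZMQ_SNDHWM and ZMQ_RCVHWM, which cap queue sizes per socket; how a socket reacts when the mark is hit depends on the socket type.)

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class CreditFlowControlDemo {

    /** Sends `total` messages with at most `window` unacknowledged; returns the peak backlog seen. */
    static int run(int total, int window) {
        BlockingQueue<Integer> pipe = new LinkedBlockingQueue<>(); // unbounded on purpose
        Semaphore credits = new Semaphore(window);                  // credits granted by the consumer
        AtomicInteger peakBacklog = new AtomicInteger();

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    pipe.take();       // "process" one message
                    credits.release(); // stand-in for a credit message sent back to the producer
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (int i = 0; i < total; i++) {
                credits.acquire();     // producer blocks once the consumer has fallen behind
                pipe.put(i);
                peakBacklog.accumulateAndGet(pipe.size(), Math::max);
            }
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peakBacklog.get();
    }

    public static void main(String[] args) {
        System.out.println("peak backlog = " + run(1_000, 10));
    }
}
```

Because a credit is taken before every send and returned only after the consumer has taken a message, the backlog can never exceed the window, no matter how fast the producer is.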
I'm a beginner with Spring WebFlux. I wrote a controller as follows:
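import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
public class FirstController
{
    @GetMapping("/first")
    public Mono<String> getAllTweets()
    {
        return Mono.just("I am First Mono");
    }
}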
I know one of the benefits of reactive programming is backpressure, which can balance the request or response rate. I want to understand how to get a backpressure mechanism in Spring WebFlux.
Backpressure in WebFlux
In order to understand how backpressure works in the current implementation of the WebFlux framework, we have to recap the transport layer used by default here. As we may remember, normal communication between browser and server (and usually server-to-server communication as well) is done through a TCP connection. WebFlux also uses that transport for communication between a client and the server.
Then, in order to get the meaning of the backpressure control term, we have to recap what backpressure means from the Reactive Streams specification perspective.
The basic semantics define how the transmission of stream elements is regulated through back-pressure.
So, from that statement, we may conclude that in Reactive Streams backpressure is a mechanism that regulates demand by transmitting (notifying) how many elements the recipient can consume. And here we have a tricky point: TCP has a bytes abstraction rather than a logical-elements abstraction. What we usually mean by backpressure control is control over the number of logical elements sent to or received from the network. Even though TCP has its own flow control, this flow control still applies to bytes rather than to logical elements.
In the current implementation of the WebFlux module, the backpressure is regulated by the transport flow control, but it does not expose the real demand of the recipient. In order to finally see the interaction flow, please see the following diagram:
For simplicity, the above diagram shows the communication between two microservices where the left one sends streams of data, and the right one consumes that stream. The following numbered list provides a brief explanation of that diagram:
1. This is the WebFlux framework, which takes care of converting logical elements to bytes and back, and of transferring them to and receiving them from TCP (the network).
2. This is the start of the long-running processing of an element; the next elements are requested once the job is completed.
3. Here, while there is no demand from the business logic, WebFlux enqueues the bytes that come from the network without acknowledging them.
4. Because of the nature of TCP flow control, Service A may still send data to the network.
As we may notice from the diagram above, the demand exposed by the recipient is different from the demand of the sender (demand here in logical elements). It means that the demand of each side is isolated: it works only for the WebFlux <-> business logic (Service) interaction and does not really expose backpressure for the Service A <-> Service B interaction. All that means that backpressure control in WebFlux is not as fair as we might expect.
But I still want to know how to control backpressure
If we still want to have an unfair control of backpressure in WebFlux, we may do that with the support of Project Reactor operators such as limitRate(). The following example shows how we may use that operator:
@PostMapping("/tweets")
public Mono<Void> postAllTweets(Flux<Tweet> tweetsFlux) {
    return tweetService.process(tweetsFlux.limitRate(10))
                       .then();
}
As we may see from the example, the limitRate() operator allows defining the number of elements to be prefetched at once. That means that even if the final subscriber requests Long.MAX_VALUE elements, the limitRate operator splits that demand into chunks and does not allow more than that to be consumed at once. We may do the same for the sending side:
@GetMapping("/tweets")
public Flux<Tweet> getAllTweets() {
    return tweetService.retreiveAll()
                       .limitRate(10);
}
The above example shows that even if WebFlux requests more than 10 elements at a time, limitRate() throttles the demand to the prefetch size and prevents more than the specified number of elements from being consumed at once.
Another option is to implement our own Subscriber or to extend the BaseSubscriber from Project Reactor. For instance, the following is a naive example of how we may do that:
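import org.reactivestreams.Subscription;
import reactor.core.publisher.BaseSubscriber;

class MyCustomBackpressureSubscriber<T> extends BaseSubscriber<T> {

    int consumed;
    final int limit = 5;

    @Override
    protected void hookOnSubscribe(Subscription subscription) {
        // initial demand: one batch
        request(limit);
    }

    @Override
    protected void hookOnNext(T value) {
        // do business logic there
        consumed++;
        // once a full batch has been consumed, ask for the next one
        if (consumed == limit) {
            consumed = 0;
            request(limit);
        }
    }
}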
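To try the same request-in-batches idea without Spring or Reactor on the classpath, here is a sketch using the JDK's own `java.util.concurrent.Flow` API (Java 9+) with `SubmissionPublisher` as the source. This is a swapped-in stand-in for Reactor's `BaseSubscriber`, not the WebFlux machinery itself; `BatchingSubscriber`, `requestCalls` and `run` are names made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

final class BatchingSubscriberDemo {

    /** Requests `limit` elements, then asks for the next batch once they are consumed. */
    static final class BatchingSubscriber<T> implements Flow.Subscriber<T> {
        final List<T> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        int requestCalls; // how many times demand was signalled upstream
        private final int limit;
        private Flow.Subscription subscription;
        private int consumed;

        BatchingSubscriber(int limit) { this.limit = limit; }

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            requestBatch();
        }

        @Override
        public void onNext(T item) {
            received.add(item); // business logic would go here
            if (++consumed == limit) {
                consumed = 0;
                requestBatch();
            }
        }

        @Override public void onError(Throwable error) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }

        private void requestBatch() {
            requestCalls++;
            subscription.request(limit);
        }
    }

    static BatchingSubscriber<Integer> run(int count, int limit) {
        BatchingSubscriber<Integer> subscriber = new BatchingSubscriber<>(limit);
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 0; i < count; i++) {
                publisher.submit(i); // buffered until the subscriber signals demand
            }
        } // close() completes the stream once buffered items are delivered
        try {
            subscriber.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return subscriber;
    }

    public static void main(String[] args) {
        BatchingSubscriber<Integer> s = run(12, 5);
        System.out.println(s.received.size() + " items in " + s.requestCalls + " request calls");
    }
}
```

The publisher never pushes more than the subscriber has asked for, so the subscriber sets the pace in units of logical elements, which is exactly what plain TCP flow control cannot express.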
Fair backpressure with RSocket Protocol
In order to achieve logical-element backpressure across network boundaries, we need an appropriate protocol. Fortunately, there is one, called the RSocket protocol. RSocket is an application-level protocol that allows transferring real demand across network boundaries.
There is an RSocket-Java implementation of that protocol that allows setting up an RSocket server. For server-to-server communication, the same RSocket-Java library provides a client implementation as well. To learn more about how to use RSocket-Java, see the examples that accompany the library.
For browser-server communication, there is an RSocket-JS implementation which allows wiring the streaming communication between browser and server through WebSocket.
Known frameworks on top of RSocket
Nowadays there are a few frameworks built on top of the RSocket protocol.
Proteus
One of these frameworks is the Proteus project, which offers full-fledged microservices built on top of RSocket. Proteus is also well integrated with the Spring framework, so we can now achieve fair backpressure control.
Further readings
https://www.netifi.com/proteus
https://medium.com/netifi
http://scalecube.io/
I've been reading about this and found some libraries that really muddled my thinking, like Akka, Quasar, Reactor and Disruptor. Akka and Quasar implement the Actor pattern, Disruptor is an inter-thread messaging library, and Reactor is based on . So what are the advantages and use cases of a message-driven architecture over simple method calls?
Given a RabbitMQ queue listener, I receive a message from the method, decide which Type the RabbitMQ message is (NewOrder,Payment,...).
With a message-driven library I could do the following.
Pseudo code:
actor.tell('decider-mailbox', message)
Which basically says "I'm putting this message here; when you guys can handle it, do it", and so on until it gets saved.
And the actor is ready again to receive another message.
But wouldn't directly calling the method, like messageHandler.handle(message), be better and less abstracted?
The Actors Model looks a lot like people working together; it is based on message-passing but there is much more to it and I'd say that not all message-passing models are the same, for example Quasar actually supports not only Erlang-like actors but also Go-like channels which are simpler but don't provide a fault-tolerance model (and fibers BTW, that are just like threads but much more lightweight, which you can use even without any message-passing at all).
Methods/functions follow a strict, nestable call-return (so request-response) discipline and usually don't involve any concurrency (at least in imperative and non-pure functional languages).
Message passing instead, very broadly speaking, allows looser coupling because it doesn't enforce a request-response discipline and allows the communicating parties to execute concurrently, which also helps in isolating failures, in hot upgrades and in maintenance generally (for example, the Actors Model offers these features). Often message passing will also allow looser data contracts by using more dynamic typing for messages (this is especially true for the Actors Model, where each party, or actor, has a single incoming channel, that is, its mailbox).
Other than that, the details depend a lot on the messaging model/solution you're considering; for example, the communication channels can synchronize the interacting parts or have limited/unlimited buffering, allow multiple producers and/or multiple consumers, etc.
Note that RPC is really message passing but with a strict request-response communication discipline.
This means that, depending on the situation, one or the other may suit you better: methods/functions are better when you're in a call-return discipline and/or you're simply making your sequential code more modular. Message-passing is better when you need a network of potentially concurrent, autonomous "agents" that communicate but not necessarily in a request-response discipline.
As for the Actors Model I think you can build more insight about it for example by reading the first part of this blog post (notice: I'm the main author of the post and I'm part of the Parallel Universe - and Quasar - development team):
The actor model is a design pattern for fault-tolerant and highly scalable systems. Actors are independent worker-modules that communicate with other actors only through message-passing, can fail in isolation from other actors but can monitor other actors for failure and take some recovery measures when that happens. Actors are simple, isolated yet coordinated, concurrent workers.
Actor-based design brings many benefits:
Adaptive behaviour: interacting only through a message-queue makes actors loosely coupled and allows them to:
Isolate faults: mailboxes are decoupling message queues that allow actor restart without service disruption.
Manage evolution: they enable actor replacement without service disruption.
Regulate concurrency: receiving messages very often and discarding overflow or, alternatively, increasing mailbox size can maximize concurrency at the expense of reliability or memory usage respectively.
Regulate load: reducing the frequency of receive calls and using small mailboxes reduces concurrency and increases latencies, applying back-pressure through the boundaries of the actor system.
Maximum concurrency capacity:
Actors are extremely lightweight both in memory consumption and management overhead, so it's possible to spawn even millions in a single box.
Because actors do not share state, they can safely run in parallel.
Low complexity:
Each actor can implement stateful behaviour by mutating its private state without worrying about concurrent modification.
Actors can simplify their state transition logic by selectively receiving messages from the mailbox in logical, rather than arrival order.
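The "regulate concurrency" and "regulate load" points in the list above come down to how a bounded mailbox treats a sender when it is full. A rough plain-Java illustration, using `ArrayBlockingQueue` as the mailbox (this is not Quasar's or Akka's mailbox API; `sendOrDrop`, `sendOrWait` and `countDropped` are invented names):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

final class BoundedMailboxDemo {

    /** Fire-and-forget send: returns false (message discarded) when the mailbox is full. */
    static <T> boolean sendOrDrop(BlockingQueue<T> mailbox, T message) {
        return mailbox.offer(message);
    }

    /** Back-pressure send: the caller waits for room, up to the timeout, before giving up. */
    static <T> boolean sendOrWait(BlockingQueue<T> mailbox, T message, long timeoutMillis) {
        try {
            return mailbox.offer(message, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Sends `count` messages to a mailbox nobody drains and counts the discarded ones. */
    static int countDropped(int capacity, int count) {
        BlockingQueue<Integer> mailbox = new ArrayBlockingQueue<>(capacity);
        int dropped = 0;
        for (int i = 0; i < count; i++) {
            if (!sendOrDrop(mailbox, i)) {
                dropped++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        System.out.println("dropped = " + countDropped(3, 5));
        BlockingQueue<String> full = new ArrayBlockingQueue<>(1);
        full.add("first");
        System.out.println("accepted = " + sendOrWait(full, "second", 50));
    }
}
```

Dropping keeps senders fast at the cost of reliability; waiting slows senders down, which is the back-pressure the quoted post describes.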
The difference is that the processing takes place in a different thread, so the current one is ready to receive and forward the next message. When you call the handler from the current thread, it is blocked until processing is finished.
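A small plain-Java sketch of that difference, assuming a single-threaded executor as the actor's mailbox. `Actor`, `tell` and `MessageHandler` here are illustrative names only, not Akka's or Quasar's classes:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class TellVersusCallDemo {

    /** Hypothetical handler; stands in for messageHandler.handle(message) from the question. */
    static final class MessageHandler {
        final List<String> handled = Collections.synchronizedList(new ArrayList<>());

        void handle(String message) {
            handled.add(message);
        }
    }

    /** Minimal actor-like wrapper: one thread drains the mailbox in arrival order. */
    static final class Actor {
        private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
        private final MessageHandler handler;

        Actor(MessageHandler handler) { this.handler = handler; }

        /** Returns immediately; the message is processed later on the actor's thread. */
        void tell(String message) {
            mailbox.execute(() -> handler.handle(message));
        }

        /** Stops accepting messages and waits for the queued ones to finish. */
        boolean shutdown() {
            mailbox.shutdown();
            try {
                return mailbox.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    static List<String> run() {
        MessageHandler handler = new MessageHandler();
        handler.handle("direct");   // the caller is blocked until handle() returns
        Actor actor = new Actor(handler);
        actor.tell("NewOrder");     // the caller is free again right away
        actor.tell("Payment");
        actor.shutdown();
        return handler.handled;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With `tell`, the RabbitMQ listener thread could go straight back to receiving while the handler works through its mailbox one message at a time.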
In a way it is just a matter of defining abstractions. Some say that originally object oriented programming was actually supposed to be based on message passing, and calling a method on an object would have the semantics of sending it a message (with similar async non-blocking behavior as in actors).
The way we implemented OO in most popular languages is such that it became what it is today: a "synchronous blocking order" to an object, controlled and run from the same execution context (thread/process) as the caller. This is nice because it is easy to understand, but it has its limitations when designing concurrent systems.
In theory, you could create a language with similar syntax as Java, but give it different semantics - making object.method(arg) actually internally be something similar to actor.tell(msg). There are a lot of idioms that try to hide asynchronous calling and message passing behind simple method invocations, but as always it depends on the use case.
Akka provides a nice new syntax which makes it clear that what we are doing is something completely different from invoking methods on an object, in part to cause less confusion and make the message passing more explicit. In the end, you are stating the same thing: you are sending a message to an actor in the system, but you are doing it with fewer constraints than if you were calling one of its methods directly.
I have a J2EE application that receives and processes messages (events). These messages contain various blocks of data. Different types of processing can be triggered depending on the type of data contained in a message.
I would like to have a simple internal event/message bus that can be used by the main processing thread to invoke different post-processors dependent on message content. For example, if a message is received of type A, I would like to be able to send an internal event to all post-processors that have subscribed to events of type A. The post-processors can then work their magic in their own time/thread. It would be nice (though not required) if the post-processors could be added/removed from the application via some sort of plugin-framework.
I understand that there are various message buses available. I am really seeking advice on an appropriate (lightweight) choice or perhaps a design pattern/example to cook my own.
Thanks in anticipation
Guava has a nice EventBus implementation. See the documentation.
You can also check out MBassador https://github.com/bennidi/mbassador.
It is annotation driven, very lightweight and uses weak references (thus easy to integrate in environments where object lifecycle management is done by a framework like Spring or Guice or something similar). It provides an object filtering mechanism and synchronous or asynchronous dispatch/message handling. And it's very fast!
EDIT: I created a performance and feature comparison for a selection of available event bus implementations including Guava, MBassador and some more. The results are quite interesting. Check it out here
http://codeblock.engio.net/?p=37
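For anyone who would rather "cook their own", as the question puts it, the core of a type-keyed bus fits in a few lines of plain Java. This is only a sketch and not how Guava or MBassador are implemented: `SimpleEventBus` is a made-up class, it dispatches on the exact event class only (no supertypes), and it has no unsubscribe or error handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

final class SimpleEventBus {
    private final Map<Class<?>, List<Consumer<Object>>> subscribers = new ConcurrentHashMap<>();
    private final Executor executor;

    /** The executor decides where handlers run: a thread pool, or Runnable::run for same-thread. */
    SimpleEventBus(Executor executor) {
        this.executor = executor;
    }

    /** Registers a post-processor for events whose class is exactly `type`. */
    <T> void subscribe(Class<T> type, Consumer<? super T> handler) {
        subscribers.computeIfAbsent(type, key -> new CopyOnWriteArrayList<>())
                   .add(event -> handler.accept(type.cast(event)));
    }

    /** Hands the event to every subscriber of its class; events nobody subscribed to are ignored. */
    void post(Object event) {
        for (Consumer<Object> handler : subscribers.getOrDefault(event.getClass(), List.of())) {
            executor.execute(() -> handler.accept(event));
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        SimpleEventBus bus = new SimpleEventBus(Runnable::run); // same-thread dispatch for the demo
        bus.subscribe(String.class, log::add);
        bus.subscribe(Integer.class, n -> log.add("int " + n));
        bus.post("type A message");
        bus.post(42);
        System.out.println(log);
    }
}
```

Passing something like `Executors.newFixedThreadPool(4)` instead of `Runnable::run` lets the post-processors work in their own threads, as the question asks, provided the handlers themselves are thread-safe.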
I would like to design a simple application (without J2EE and JMS) that can process a massive amount of messages (like in trading systems).
I have created a service that can receive messages and place them in a queue so that the system won't get stuck when overloaded.
Then I created a service (QueueService) that wraps the queue and has a pop method that pops a message off the queue and returns null if there are no messages. This method is marked as "synchronized" for the next step.
I have created a class that knows how to process the message (MessageHandler) and another class that can "listen" for messages in a new thread (MessageListener). The thread has a "while(true)" loop and constantly tries to pop a message.
If a message is returned, the thread calls the MessageHandler class, and when it's done, it asks for another message.
Now, I have configured the application to open 10 MessageListeners to allow multiple messages to be processed in parallel.
I now have 10 threads that are in a loop all the time.
Is that a good design?
Can anyone refer me to some books or sites on how to handle such a scenario?
Thanks,
Ronny
Seems from your description that you are on the right path, with one little exception. You implemented a busy wait on the retrieval of messages from the Queue.
A better way is to block your threads in the synchronised popMessage() method, doing a wait() on the queue resource when no more messages can be popped. When one or more messages are added to the queue, the waiting threads are woken up via notifyAll(); one or more threads will get a message and the rest re-enter the wait() state.
This way the distribution of CPU resources will be smoother.
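A minimal sketch of that wait()/notifyAll() approach, reusing the QueueService name from the question. The `push` and `demo` methods are assumptions added so the example is complete:

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class QueueService<T> {
    private final Queue<T> queue = new ArrayDeque<>();

    /** Adds a message and wakes up any listener threads blocked in pop(). */
    synchronized void push(T message) {
        queue.add(message);
        notifyAll();
    }

    /** Blocks instead of returning null; the loop also guards against spurious wake-ups. */
    synchronized T pop() throws InterruptedException {
        while (queue.isEmpty()) {
            wait(); // releases the monitor until push() calls notifyAll()
        }
        return queue.remove();
    }

    /** One listener thread pops three messages while the caller pushes them; returns their sum. */
    static int demo() {
        QueueService<Integer> service = new QueueService<>();
        int[] sum = new int[1];
        Thread listener = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    sum[0] += service.pop();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        listener.start();
        for (int i = 1; i <= 3; i++) {
            service.push(i);
        }
        try {
            listener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println("sum = " + demo());
    }
}
```

The listener threads now sleep inside pop() while the queue is empty instead of spinning, so ten idle listeners cost next to no CPU.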
I understand that queuing providers like WebSphere and Sonic cost money, but there's always JBoss Messaging, FUSE with ActiveMQ, and others. Don't try to make a better JMS than JMS. Most JMS providers have persistence capabilities that provide fault tolerance if the queue or app server dies. Don't reinvent the wheel.
Reading between the lines a little, it sounds like you're not using a JMS provider such as MQ. Your solution sounds mostly OK; however, I would question your reasons for not using JMS.
You mention something about trading. I can confirm that a lot of trading systems use JMS, with and without J2EE. If you really want high performance, reliability and peace of mind, don't reinvent the wheel by writing your own queuing system; take a look at some of the JMS providers and their client APIs.
karl
Event loop
How about using an event loop/message pump instead? I actually learned this technique from watching the excellent node.js video presentation from Ryan, which I think you should really watch if you haven't already.
You push at most 10 messages from thread a to thread b (blocking if full). Thread a has an unbounded LinkedBlockingQueue. Thread b has a bounded ArrayBlockingQueue of size 10 (new ArrayBlockingQueue(10)). Both thread a and thread b have an endless "while loop". Thread b will process messages available from the ArrayBlockingQueue. This way you will only have 2 endless "while loops". As a side note, judging by the following sentence in the specification, it might even be better to use 2 ArrayBlockingQueues:
Linked queues typically have higher throughput than array-based queues but less predictable performance in most concurrent applications.
Of course, the array-backed queue has the disadvantage that it will use more memory, because you have to set the size before use (too small is bad, as it will block when full; too big could also be a problem if you are low on memory).
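Roughly, the two-queue setup described above could look like this. It is a sketch under a few assumptions: the `STOP` sentinel is a hypothetical poison pill added so the two loops can terminate, and upper-casing a string stands in for the real MessageHandler work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class EventLoopDemo {
    private static final String STOP = "__stop__"; // hypothetical poison pill ending both loops

    static List<String> run(List<String> messages) {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>(); // thread a: unbounded
        BlockingQueue<String> work = new ArrayBlockingQueue<>(10); // thread b: at most 10 pending
        List<String> processed = new ArrayList<>();

        Thread a = new Thread(() -> {
            try {
                String message;
                do {
                    message = inbox.take(); // sleeps while the inbox is empty
                    work.put(message);      // blocks while thread b has 10 messages pending
                } while (!message.equals(STOP));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread b = new Thread(() -> {
            try {
                for (String m = work.take(); !m.equals(STOP); m = work.take()) {
                    processed.add(m.toUpperCase()); // stand-in for the real message handling
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        a.start();
        b.start();

        inbox.addAll(messages); // the receiving service only ever touches the inbox
        inbox.add(STOP);
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("order", "payment", "cancel")));
    }
}
```

Both loops block in take() when there is nothing to do, so there is no busy waiting, and all the locking lives inside java.util.concurrent.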
Accepted solution:
In my opinion you should prefer my solution over the accepted solution. The reason is that, if at all possible, you should only use the java.util.concurrent package. Writing proper threaded code is hard. When you make a mistake you will end up with deadlocks, starvation, etc.
Redis:
Like others already mentioned you should use a JMS for this. My suggestion is something along the line of this, but in my opinion simpler to use/install. First of all I assume your server is running Linux. I would advise you to install Redis. Redis is really awesome/fast and you should also use it as your datastore. It has blocking list operations which you can use. Redis will store your results to disc, but in a very efficient manner.
Good luck!
While it is now showing its age, Practical .NET for Financial Markets demonstrates some of the universal concepts you should consider when developing a financial trading system. Although it is geared toward .NET, you should be able to translate the general concepts to Java.
The separation of listening for the message and its processing seems sensible to me. Having a scalable number of processing threads is also good; you can tune the number as you find out how much parallel processing works on your platform.
The bit I'm less happy about is the way that the threads poll for message arrival: here you're doing busy work, and if you add sleeps to reduce that then you don't react immediately to message arrival. The JMS APIs and MDBs take a more event-driven approach. I would take a look at how that's implemented in an open source JMS so that you can see alternatives. [I also endorse the opinion that re-inventing JMS for yourself is probably a bad idea.] The thing to bear in mind is that as your systems get more complex and you add more queues and more processing, busy work has a greater impact.
The other concern that I have is that you will hit the limitations of using a single machine. First, you may gain greater scalability by allowing listeners to be on many machines. Second, you have a single point of failure. Clearly, solving this sort of stuff is where the messaging vendors make their money. This is another reason why Buy rather than Build tends to be a win for complex middleware.
You need a very light, super fast, scalable queuing system. Try the Hazelcast distributed queue!
It is a distributed implementation of java.util.concurrent.BlockingQueue. Check out the documentation for details.
Hazelcast is actually a little more than a distributed queue; it is a transactional, distributed implementation of queue, topic, map, multimap, lock and executor service for Java.
It is released under Apache license.