I would like to test a distributed algorithm that is supposed to run over multiple servers (each server running the same code and logic). The endpoints will communicate by broadcasting messages to each other.
For the purpose of pre-testing the algorithm, I thought of developing a single-process application where each endpoint is simulated by its own thread.
Is there a framework that lets me just define how many threads to run, implement the messages, and write the code that each thread will execute?
Thanks
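Even without a dedicated framework, a rough single-process harness can be sketched with plain java.util.concurrent. This is only a minimal sketch, and the Endpoint class and message handling below are made up for illustration: each simulated server is a Runnable with its own inbox queue, and broadcasting simply means offering the message to every inbox.

    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    // Minimal single-process harness: one thread per simulated endpoint,
    // broadcast implemented by putting the message on every endpoint's inbox.
    public class BroadcastSimulation {

        static class Endpoint implements Runnable {
            final int id;
            final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
            final List<Endpoint> all;   // every endpoint, used for broadcasting

            Endpoint(int id, List<Endpoint> all) {
                this.id = id;
                this.all = all;
            }

            void broadcast(String msg) {
                for (Endpoint e : all) {
                    e.inbox.offer(msg);              // deliver to every endpoint
                }
            }

            @Override
            public void run() {
                try {
                    if (id == 0) {
                        broadcast("hello from endpoint 0");   // kick off the algorithm
                    }
                    while (!Thread.currentThread().isInterrupted()) {
                        String msg = inbox.take();            // block until a message arrives
                        System.out.println("endpoint " + id + " received: " + msg);
                        // ... run the algorithm's message handler here ...
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();       // shut down cleanly
                }
            }
        }

        public static void main(String[] args) throws InterruptedException {
            int n = 4;                                        // number of simulated servers
            List<Endpoint> endpoints = new CopyOnWriteArrayList<>();
            for (int i = 0; i < n; i++) {
                endpoints.add(new Endpoint(i, endpoints));
            }

            ExecutorService pool = Executors.newFixedThreadPool(n);
            for (Endpoint e : endpoints) {
                pool.submit(e);
            }

            Thread.sleep(1000);                               // let the simulation run briefly
            pool.shutdownNow();                               // interrupt the endpoint threads
        }
    }

A framework such as Akka (suggested below) gives you the same shape without managing the threads and queues yourself.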
You might find that Akka does what you need. Akka uses actors to implement the business logic; these actors react to events produced by other actors.
Akka provides this API and handles the coordination of the actors: threads are used underneath, but the developer doesn't have to deal with them directly.
The final benefit in the context of your question is that Akka can be distributed over multiple machines, and I don't believe moving from a single machine to multiple machines requires much, if any, modification of the program. I assume you need to ensure your events implement Serializable.
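For a feel of the API, here is a minimal sketch using Akka's classic (untyped) Java actor API; the Node actor, the Peers message, and the actor names are made up for illustration. Each actor plays one endpoint and "broadcasts" by telling the message to the peer references it was given:

    import java.util.ArrayList;
    import java.util.List;

    import akka.actor.AbstractActor;
    import akka.actor.ActorRef;
    import akka.actor.ActorSystem;
    import akka.actor.Props;

    // Minimal sketch with Akka classic actors: each actor simulates one endpoint.
    public class AkkaBroadcastSketch {

        // Message telling an actor who its peers are.
        static class Peers {
            final List<ActorRef> refs;
            Peers(List<ActorRef> refs) { this.refs = refs; }
        }

        static class Node extends AbstractActor {
            private List<ActorRef> peers = new ArrayList<>();

            @Override
            public Receive createReceive() {
                return receiveBuilder()
                    .match(Peers.class, p -> peers = p.refs)     // remember the other endpoints
                    .matchEquals("start", s ->                   // kick off: broadcast to all peers
                        peers.forEach(peer ->
                            peer.tell("hello from " + getSelf().path().name(), getSelf())))
                    .match(String.class, msg ->
                        System.out.println(getSelf().path().name() + " received: " + msg))
                    .build();
            }
        }

        public static void main(String[] args) {
            ActorSystem system = ActorSystem.create("simulation");
            int n = 4;                                           // number of simulated endpoints
            List<ActorRef> nodes = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                nodes.add(system.actorOf(Props.create(Node.class), "node" + i));
            }
            nodes.forEach(node -> node.tell(new Peers(nodes), ActorRef.noSender()));
            nodes.get(0).tell("start", ActorRef.noSender());
        }
    }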
Related
Suppose I have a Java Spring application with a synchronized (thread-safe) method. If my application is deployed on a single instance, I can say that it's thread-safe. However, to scale, we must deploy multiple instances. How can we achieve inter-instance synchronization?
The question doesn't seem to be about thread-safety. A single application can have multiple threads, and certain methods in an application can be thread-safe or -unsafe.
With that said, if you have multiple instances of a given application that need to communicate, then it's as @Michael said in the comments: a form of inter-process communication (IPC). There are two ways to achieve this: message passing and shared memory. Some examples can be found in this answer: How to have 2 JVMs talk to one another
However, since you are using Java Spring, a web framework, you are likely already working with network-based services. Why not have multiple instances of the same service share state through an external shared service? This sounds like a great use case for message brokers like Redis or RabbitMQ. As the Redis homepage states:
Chat, messaging, and queues
Redis supports Pub/Sub with pattern matching and a variety of data structures such as lists, sorted sets, and hashes. This allows Redis to support high performance chat rooms, real-time comment streams, social media feeds and server intercommunication. The Redis List data structure makes it easy to implement a lightweight queue. Lists offer atomic operations as well as blocking capabilities, making them suitable for a variety of applications that require a reliable message broker or a circular list.
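As a rough illustration of that pub/sub approach, here is a minimal sketch assuming the Jedis client and a Redis server on localhost; the channel name and payload are made up. Each instance subscribes to a shared channel, and any instance can publish to it:

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPubSub;

    // Minimal sketch: instances coordinate by publishing/subscribing on a shared Redis channel.
    public class RedisPubSubSketch {

        private static final String CHANNEL = "instance-events";  // hypothetical channel name

        public static void main(String[] args) {
            // Subscriber: run on each instance, on its own thread, because subscribe() blocks.
            new Thread(() -> {
                try (Jedis subscriber = new Jedis("localhost", 6379)) {
                    subscriber.subscribe(new JedisPubSub() {
                        @Override
                        public void onMessage(String channel, String message) {
                            System.out.println("received on " + channel + ": " + message);
                            // ... react to the event, e.g. refresh shared state ...
                        }
                    }, CHANNEL);
                }
            }).start();

            // Publisher: any instance can broadcast an event to all subscribers.
            try (Jedis publisher = new Jedis("localhost", 6379)) {
                publisher.publish(CHANNEL, "state-changed by instance A");
            }
        }
    }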
I would like to write a verticle that renders graphs using GraphViz. I would like to do it by loading the native (shared) libs into my JVM and calling them via JNI. Now, GraphViz itself is not thread-safe. It is not enough to run each of the multi-instance verticles always in its own thread; I must additionally ensure that each verticle gets its own instance of the native code, or in other words, that every verticle runs in a separate process, each utilizing one of the cores.
Most descriptions of Vert.x talk only about isolation between threads (not sharing data, etc.). I have found nothing about process isolation.
Basically I'm looking for a framework to create a couple of instances of a REST server, all listening on the same socket or with a load balancer in front, without having to write any of that code myself. Sort of what PM2 does for Node.js. Can I do that with Vert.x?
I realize this may be against the spirit of Vert.x, as the core documentation makes clear:
Instead of a single event loop, each Vertx instance maintains several event loops. By default we choose the number based on the number of available cores on the machine, but this can be overridden.
This means a single Vertx process can scale across your server, unlike Node.js.
But as I am using native libraries, which can only be loaded once per JVM and in my case cannot execute concurrently, and which therefore prevent scaling out to multiple cores, I guess I really do want the Node.js pattern, only in Java.
My requirement is also much simpler than what is described in the documentation of the clustered event bus, e.g. the Zookeeper example, because I need no communication between the instances.
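As far as I know, Vert.x itself does not ship a PM2-style process manager, so one hedged workaround is a small launcher that forks one child JVM per core, giving each child its own copy of the native library. The main class name and port scheme below are hypothetical; the children could sit behind a load balancer as described above:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical PM2-style launcher: fork one child JVM per core so each gets its own
    // copy of the native library. MainVerticleRunner and the port scheme are made up.
    public class MultiProcessLauncher {

        public static void main(String[] args) throws IOException, InterruptedException {
            int cores = Runtime.getRuntime().availableProcessors();
            String classpath = System.getProperty("java.class.path");
            List<Process> children = new ArrayList<>();

            for (int i = 0; i < cores; i++) {
                int port = 8080 + i;  // each child listens on its own port; a load balancer sits in front
                Process child = new ProcessBuilder(
                        "java", "-cp", classpath,
                        "com.example.MainVerticleRunner",   // hypothetical main class deploying the verticle
                        "--port", String.valueOf(port))
                    .inheritIO()                            // forward child stdout/stderr to this console
                    .start();
                children.add(child);
            }

            // Keep the launcher alive until the children exit.
            for (Process child : children) {
                child.waitFor();
            }
        }
    }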
I've recently seen that there are different frameworks out there that provide a messaging architecture implemented in-process, using either the same or different threads. The ones I know about are Spring, Guava EventBus, and Reactor.
My question is: what are good use cases for using them instead of sending messages to a full-fledged broker? I understand that they allow better decoupling of the business logic, but in a microservices architecture you would normally publish events to be consumed by other microservices. The advantage there is the fault tolerance you get from a cluster of brokers, where a message that fails because of a failure in one instance can be retried by another. Decomposing logic into messages that are later consumed by the same system, especially when the subscribers run in different threads, seems to me to make it difficult to bring the data back to a consistent state.
The advantage of microservices over in-process messaging is not really in how messages are consumed.
Microservices allow you to execute portions of your code on specific nodes within a cluster, letting you allocate heavy calculations to powerful machines and secondary or lightweight work to less powerful resources. Overall, this lets you balance performance better and scale resources for the portions of code that require it.
Also, whenever you update the code of a microservice you do not impact the other services, so your changes (and errors) are isolated. If everything runs within the same process, a bad update might actually render the entire solution unusable.
In the end, getting the communication out of your process (via a third-party broker) allows you to share it with more people, agents, processes, etc. Otherwise people have to become part of your process (a module?), and that is really not efficient.
Honestly, the only good reason for intra-process communication within your monolith is speed (in-memory communication rather than on-the-wire communication).
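For reference, this is roughly what that in-process, in-memory messaging looks like with Guava's EventBus; the OrderPlaced event and EmailNotifier subscriber are made up, and with the plain (non-async) EventBus the handlers run synchronously on the publishing thread:

    import com.google.common.eventbus.EventBus;
    import com.google.common.eventbus.Subscribe;

    // Minimal in-process messaging sketch with Guava EventBus.
    public class InProcessMessagingSketch {

        // Hypothetical domain event.
        static class OrderPlaced {
            final String orderId;
            OrderPlaced(String orderId) { this.orderId = orderId; }
        }

        // Subscriber: any object with @Subscribe methods, registered on the bus.
        static class EmailNotifier {
            @Subscribe
            public void on(OrderPlaced event) {
                System.out.println("sending confirmation email for order " + event.orderId);
            }
        }

        public static void main(String[] args) {
            EventBus bus = new EventBus();      // handlers run on the posting thread
            bus.register(new EmailNotifier());
            bus.post(new OrderPlaced("42"));    // in-memory dispatch, no broker involved
        }
    }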
I work on a data processing application in which concurrency is achieved by putting several units of work on a message queue that multiple instances of a message driven bean (MDB) listen to. Other than achieving concurrency in this manner, we do not have any specific reason to use the messaging infrastructure and MDBs.
This led me to think why the same could not have been achieved using multiple threads.
So my question is: in what situations can asynchronous messaging (e.g. JMS) be used as an alternative to multithreading as a means of achieving concurrency? What are some advantages/disadvantages of using one approach over the other?
It can't be used as an alternative to multithreading; it is a way of implementing multithreading. There are three basic situations here:
You are responsible for both ends of the queue;
You are responsible for sending data; or
You are responsible for receiving data.
Receiving data is the kicker here, because there's really no way of doing that without some form of multithreading/multiprocessing; otherwise you'll only be processing one request at a time. Sending data without multithreading is much more viable, but there you're really just pushing the responsibility for dealing with those messages onto an external system. So it's not an alternative to multithreading.
In your case with message-driven beans, the container creates and manages threads for you, so it's not an alternative to multithreading; you're simply using someone else's implementation.
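To make that concrete, a bare-bones MDB looks roughly like the following sketch; the queue name is illustrative. The container decides how many bean instances to pool and on which threads onMessage() runs:

    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    // Bare-bones message-driven bean: the container pools instances and manages the
    // threads that deliver messages to onMessage(). The queue name is illustrative.
    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destinationLookup", propertyValue = "java:/jms/queue/workItems")
    })
    public class WorkItemMDB implements MessageListener {

        @Override
        public void onMessage(Message message) {
            try {
                if (message instanceof TextMessage) {
                    String payload = ((TextMessage) message).getText();
                    // ... process one unit of work; concurrency comes from the container's MDB pool ...
                    System.out.println("processing: " + payload);
                }
            } catch (JMSException e) {
                throw new RuntimeException(e);  // let the container handle redelivery/rollback
            }
        }
    }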
There are two additional bonuses that I don't think have been mentioned: transactions and durability.
While it isn't required, and quite often isn't the default configuration, JMS providers can be configured to persist messages and also to participate in an XA transaction with little or no code change.
In an EJB container there is actually no alternative, since you're not allowed to create your own threads there. JMS does all of that work for you, at the cost of running everything through the queue processor. You could also create a Java Connector, which has a more intimate relationship with the container (and thus can have threads), but it's a lot more work.
If the overhead of using the JMS queue isn't having a performance impact, then it's the easiest solution.
Performance-wise, multithreading should be faster than any messaging, because messaging adds an extra network layer.
Application-wise, messaging helps you avoid locking and data-sharing issues, as there is no common object.
From a scaling perspective, messaging is a lot better, because you can simply configure more nodes on several servers through the messaging service instead of changing the application.
Messaging can greatly reduce the number of errors in multithreaded applications, since it reduces the risk of data races. It also makes it easier to add new threads without changing the rest of the app.
That said, I think JMS is slightly misused here; java.util.concurrent's thread-safe queues and libraries like jetlang may give you better performance.
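For comparison, the in-process equivalent of the queue-plus-MDB pattern is just a thread pool draining its work queue; this minimal sketch uses a fixed ExecutorService and made-up work items:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    // In-process equivalent of the queue + MDB pattern: units of work are submitted to a
    // fixed thread pool, which drains its internal work queue with a handful of threads.
    public class InProcessWorkQueue {

        public static void main(String[] args) throws InterruptedException {
            ExecutorService workers = Executors.newFixedThreadPool(4);  // like 4 MDB instances

            for (int i = 0; i < 10; i++) {
                final int unit = i;
                workers.submit(() -> {
                    // ... process one unit of work, entirely in memory ...
                    System.out.println(Thread.currentThread().getName() + " processed unit " + unit);
                });
            }

            workers.shutdown();
            workers.awaitTermination(10, TimeUnit.SECONDS);
        }
    }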
With multithreading, you achieve concurrency by sharing the cores of one CPU. If you use JMS instead, you can balance the load and delegate tasks to other systems.
For example, suppose your application needs to send an email on completion of a certain task, and you want to send the email concurrently. You can either spawn a thread and process it asynchronously, or delegate the mail-sending task to another system using JMS. The number of receiver threads is configurable in JMS, and multiple nodes can listen to the same JMS queue, which balances the load. You can also use further features such as persistent queues or transaction-managed queues as the application requires.
In short, JMS can be a better alternative to multithreading, depending on the application architecture.
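As a rough sketch of the delegation side (using the JMS 1.1-style API; the JNDI names are illustrative), the code that completes the task just drops a message on a queue and moves on, while the mail-sending consumers, possibly on other nodes, pick it up:

    import javax.annotation.Resource;
    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.JMSException;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;

    // Rough sketch of delegating mail sending to another system via JMS.
    // The injected resources' JNDI names are illustrative.
    public class MailDelegator {

        @Resource(lookup = "java:/ConnectionFactory")
        private ConnectionFactory connectionFactory;

        @Resource(lookup = "java:/jms/queue/mail")
        private Queue mailQueue;

        public void onTaskCompleted(String recipient) throws JMSException {
            Connection connection = connectionFactory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(mailQueue);
                // Fire and forget: the mail-sending consumers (possibly on other nodes) do the work.
                producer.send(session.createTextMessage("send-confirmation-to:" + recipient));
            } finally {
                connection.close();
            }
        }
    }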