How to achieve inter-instance synchronization in Java?

Suppose I have a Java Spring application with a synchronized (thread-safe) method. If the application is deployed on a single instance, I can say it is thread-safe. However, to scale, we must deploy multiple instances. How do we achieve inter-instance synchronization?

The question doesn't really seem to be about thread safety. A single application can have multiple threads, and individual methods in that application can be thread-safe or unsafe.
With that said, if you have multiple instances of an application that need to communicate, then, as @Michael said in the comments, it is a form of inter-process communication (IPC). There are two broad ways to achieve this: message passing and shared memory. Some examples can be found in this answer: How to have 2 JVMs talk to one another
However, since you are using Java Spring, a web framework, you are likely already working with network-based services. Why not have multiple instances of the same service share state through an external shared service? This sounds like a good use case for a message broker such as Redis or RabbitMQ. As the Redis homepage states:
Chat, messaging, and queues
Redis supports Pub/Sub with pattern matching and a variety of data structures such as lists, sorted sets, and hashes. This allows Redis to support high performance chat rooms, real-time comment streams, social media feeds and server intercommunication. The Redis List data structure makes it easy to implement a lightweight queue. Lists offer atomic operations as well as blocking capabilities, making them suitable for a variety of applications that require a reliable message broker or a circular list.
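To actually serialize a critical section across instances (the original question), a common pattern on top of such a shared service is a distributed lock. Below is a minimal sketch using Spring Integration's RedisLockRegistry (the spring-integration-redis module); the bean wiring, the key names and the 30-second expiry are illustrative assumptions, not part of the original answer.

    import java.util.concurrent.locks.Lock;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.redis.connection.RedisConnectionFactory;
    import org.springframework.integration.redis.util.RedisLockRegistry;
    import org.springframework.stereotype.Service;

    @Configuration
    class LockConfig {

        // Locks are stored in Redis, so every instance that talks to the same
        // Redis server sees the same locks. "my-app-locks" is an arbitrary prefix,
        // and 30_000 ms is an assumed expiry for locks held by a crashed instance.
        @Bean
        RedisLockRegistry lockRegistry(RedisConnectionFactory connectionFactory) {
            return new RedisLockRegistry(connectionFactory, "my-app-locks", 30_000);
        }
    }

    @Service
    class InventoryService {

        private final RedisLockRegistry lockRegistry;

        InventoryService(RedisLockRegistry lockRegistry) {
            this.lockRegistry = lockRegistry;
        }

        // Replaces a JVM-local synchronized method: only one instance in the
        // cluster can run the critical section for a given key at a time.
        void reserveItem(String itemId) {
            Lock lock = lockRegistry.obtain("inventory:" + itemId);
            lock.lock();
            try {
                // ... work that must not run concurrently on any instance ...
            } finally {
                lock.unlock();
            }
        }
    }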

Related

What are good use cases for an in-process events system vs microservices with a broker?

I've recently seen that there are various frameworks out there that provide a messaging architecture implemented in-process, using either the same or different threads. The ones I know about are Spring, Guava EventBus and Reactor.
My question is: what are good use cases for using these instead of sending messages to a full-fledged broker? I understand that they allow better decoupling of the business logic, but in a microservices architecture you would normally publish events to be consumed by other microservices. The advantage there is the fault tolerance you gain by adding a cluster of brokers, where an erroneous message caused by a failure in one instance can be retried by another. Decomposing logic so that it is executed by sending messages that are later consumed by the same system, especially when the subscribers run on different threads, seems to me to make it difficult to bring the data back to a consistent state.
The advantage of microservices over in-process messaging is not really about how messages are consumed.
Microservices allow you to run portions of your code on specific nodes within a cluster, letting you place heavy calculations on powerful machines and secondary or lightweight work on less powerful ones. Overall this lets you balance performance better and scale resources for the portions of code that need it.
Also, whenever you update the code of one microservice you do not impact the other services, so your changes (and errors) are isolated. If everything runs within the same process, any bad update can render the entire solution unusable.
In the end, getting the communication out of your process (via a third-party broker) allows you to share it with more people, agents, processes, and so on. Otherwise they have to become part of your process (a module?), which is really not efficient.
Honestly, the only good reason for intra-process communication within your monolith is speed (in-memory communication rather than on-the-wire communication).
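For the in-process case, the frameworks named in the question keep that in-memory path very simple. A minimal sketch with Guava's EventBus; the event and listener names are made up for illustration:

    import com.google.common.eventbus.EventBus;
    import com.google.common.eventbus.Subscribe;

    public class InProcessEventsDemo {

        // Plain value object carrying the event payload.
        record OrderPlaced(String orderId) {}

        static class AuditListener {
            @Subscribe
            public void onOrderPlaced(OrderPlaced event) {
                // Runs on the publishing thread by default; no network hop involved.
                System.out.println("audit: order " + event.orderId());
            }
        }

        public static void main(String[] args) {
            EventBus bus = new EventBus();
            bus.register(new AuditListener());
            bus.post(new OrderPlaced("42")); // delivered synchronously, in memory
        }
    }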

Sharing thread between processes

I suppose this is not possible, but I am looking for the best way to separate the different layers of my service while still being able to access each layer quickly, without the overhead of IPC/RMI.
The main programming language I am using is Java, but I can use C++ if required.
What we have right now is a server that hosts the database and access control, and we use RMI for consumers to request data. This is slow and doesn't scale very well.
We need performance and scalability, which we don't have at the moment.
What we are thinking of is a layered architecture with the database at the base and access control on top of it, along with a notification bus to notify clients of changes in the database.
The main problem is the communication overhead that we want to avoid or at least minimize.
Is there any magic thread that can run in two contexts (switching context) and share information that way? I know the short answer is no, but what are the options?
Update
We are currently using Java RMI.
Our base layer will provide an API that can be used to create plugins that will run on top, so it's not a fixed set of collectors/consumers; we can have 5-6 collectors running and the same number of consumers.
We can have up to 1000 consumers.
My first suggestion is that you should buy a book (or find an online tutorial) on building scalable applications, because you seem to be pretty lost.
Sharing a thread between processes doesn't make sense at any level - it is meaningless, but you can share the data that the thread accesses, which is probably what you want.
The fastest method will be C-based IPC (e.g., shared memory, semaphores, etc.; see shmget). You say you want to avoid the overhead of IPC, but really, it isn't going to get any faster than that.
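From Java, the closest thing to that kind of shared memory is a memory-mapped file (FileChannel/MappedByteBuffer), which two JVMs can map at the same time. A rough sketch; the file path and layout are assumptions, and real code still needs its own cross-process synchronization (e.g. FileLock or a sequence counter):

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public class SharedMemoryWriter {

        public static void main(String[] args) throws IOException {
            // Map a file into memory; a second JVM mapping the same file sees the
            // same bytes without copying them over a socket.
            try (FileChannel channel = FileChannel.open(
                    Path.of("/tmp/shared.dat"),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.READ,
                    StandardOpenOption.WRITE)) {

                MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
                buffer.putLong(0, System.currentTimeMillis()); // publish a value at offset 0
            }
        }
    }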
But why do you want multiple processes? If you are worried about the overhead of communicating between processes, just run your threads in one process. There is no reason your different layers have to live in different processes.
But anyway, I am not convinced that your original statement that RMI is slow and doesn't scale is completely correct. If it is not scaling, you are probably not using the right framework; perhaps the issue is that you only have a single RMI endpoint on the server. Have you considered a J2EE system with stateless session beans?
Without knowing about your requirements, it is hard to say.
It is not possible in general to share a thread between two processes, due to OS design. The problem of sharing data between two or more processes is usually solved by sharing files, sharing a database or sharing messages (which in turn can be synchronous or asynchronous), by having processes communicate via pipes (say, on Linux), or even by sharing memory. Your scenario description is not very precise: you need to describe all the processes and how information is supposed to flow, what triggers the information flow, and so on.
Most likely you need a high-performance messaging library; https://github.com/real-logic/Aeron/ is one. But to get a precise answer you would need to describe better exactly what overhead you want to minimize.
If your goal is to notify users, you should consider publish/subscribe messaging (pub/sub). There are many middleware vendors out there that provide this architecture, though most are expensive in production scenarios. For open source, check out http://redis.io/topics/pubsub. (No affiliation.)
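A minimal pub/sub sketch against Redis using the Jedis client; the host, channel name and message format are assumptions:

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPubSub;

    public class ChangeNotifier {

        // Publisher side: the data layer announces that a record changed.
        public static void publishChange(String recordId) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.publish("db-changes", recordId);
            }
        }

        // Subscriber side: each consumer blocks on the channel and reacts to updates.
        public static void listenForChanges() {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.subscribe(new JedisPubSub() {
                    @Override
                    public void onMessage(String channel, String message) {
                        System.out.println("record changed: " + message);
                    }
                }, "db-changes");
            }
        }
    }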

Java - Messaging Framework

I would like to test a distributed algorithm that is supposed to run over multiple servers (each server running the same code and logic). The end-points communicate by broadcasting messages to each other.
For the purpose of pre-testing the algorithm I thought of developing a single-process application in which each end-point is simulated by a single thread.
Is there any framework that provides something like this: just defining how many threads there are, and implementing the messages and the code to be executed by each thread?
Thanks
You might find that Akka does what you need. Akka uses actors to implement the business logic: these actors react to events produced by other actors.
Akka provides this API and deals with the coordination of the actors; threads are used underneath, but the developer doesn't have to deal with them.
The final benefit in the context of your question is that Akka can be distributed over multiple machines, and I don't believe the change from single-machine to multi-machine involves much, if any, modification of the program. I assume you need to ensure your events implement Serializable.
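A minimal sketch of what that looks like with Akka's classic actor API; the actor and message names are invented for illustration, and a real test would spawn one actor per simulated end-point:

    import akka.actor.AbstractActor;
    import akka.actor.ActorRef;
    import akka.actor.ActorSystem;
    import akka.actor.Props;

    public class BroadcastDemo {

        // One simulated end-point; each actor processes its messages on a
        // thread managed by Akka's dispatcher.
        static class Endpoint extends AbstractActor {
            @Override
            public Receive createReceive() {
                return receiveBuilder()
                        .match(String.class, msg ->
                                System.out.println(getSelf().path().name() + " got: " + msg))
                        .build();
            }
        }

        public static void main(String[] args) {
            ActorSystem system = ActorSystem.create("algo-test");

            ActorRef a = system.actorOf(Props.create(Endpoint.class), "endpoint-a");
            ActorRef b = system.actorOf(Props.create(Endpoint.class), "endpoint-b");

            // "Broadcast" by sending the same message to every end-point actor.
            a.tell("round-1", ActorRef.noSender());
            b.tell("round-1", ActorRef.noSender());

            system.terminate();
        }
    }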

When is messaging (e.g. JMS) an alternative for multithreading?

I work on a data processing application in which concurrency is achieved by putting several units of work on a message queue that multiple instances of a message driven bean (MDB) listen to. Other than achieving concurrency in this manner, we do not have any specific reason to use the messaging infrastructure and MDBs.
This led me to think why the same could not have been achieved using multiple threads.
So my question is: in what situations can asynchronous messaging (e.g. JMS) be used as an alternative to multithreading as a means to achieve concurrency? What are the advantages/disadvantages of one approach over the other?
It can't be used as an alternative to multithreading; it is a way of implementing multithreading. There are three basic kinds of solutions here:
You are responsible for both ends of the queue;
You are responsible for sending data; or
You are responsible for receiving data.
Receiving data is the kicker here, because there's really no way of doing that without some form of multithreading/multiprocessing; otherwise you'll only be processing one request at a time. Sending data without multithreading is much more viable, but there you're only really pushing the responsibility for dealing with those messages onto an external system. So it's not an alternative to multithreading.
In your case with message-driven beans, the container is creating and managing threads for you, so it's not an alternative to multithreading; you're simply using someone else's implementation.
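For illustration, such a message-driven bean might look like the sketch below; the queue name and activation properties are assumptions, and the container decides how many bean instances run concurrently:

    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    // The container instantiates a pool of these beans and dispatches incoming
    // messages to them on its own threads; the bean code itself is single-threaded.
    @MessageDriven(activationConfig = {
            @ActivationConfigProperty(propertyName = "destinationType",
                                      propertyValue = "javax.jms.Queue"),
            @ActivationConfigProperty(propertyName = "destinationLookup",
                                      propertyValue = "jms/workQueue")
    })
    public class WorkItemMdb implements MessageListener {

        @Override
        public void onMessage(Message message) {
            try {
                String payload = ((TextMessage) message).getText();
                // ... process one unit of work ...
                System.out.println("processed: " + payload);
            } catch (Exception e) {
                // Rethrowing lets the container roll back and redeliver the message.
                throw new RuntimeException(e);
            }
        }
    }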
There are two additional bonuses that I don't think have been mentioned: transactions and durability.
While it isn't required, and quite often isn't the default configuration, JMS providers can be configured to persist messages and to participate in an XA transaction with little or no code change.
In an EJB container there is actually no alternative, since you're not allowed to create your own threads in an EJB container. JMS is doing all of that work for you, at the cost of running everything through the queue processor. You could also create a Java Connector, which has a more intimate relationship with the container (and thus can have threads), but it's a lot more work.
If the overhead of using the JMS queue isn't having a performance impact, then it's the easiest solution.
Performance-wise, multithreading should be faster than any messaging, because messaging adds an extra network layer.
Application-wise, messaging helps you avoid locking and data-sharing issues, since there is no shared object.
From a scaling perspective, messaging is a lot better: you can simply add more nodes on several servers by configuring the messaging service, instead of changing the application.
Messaging can greatly reduce the number of errors in multithreaded applications, since it reduces the risk of data races. It also makes it simpler to add new threads without changing the rest of the app.
That said, I think JMS is slightly misused here. java.util.concurrent's thread-safe queues and libraries like JetLang may give you better performance.
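For comparison, the same work-queue pattern inside a single JVM needs no broker at all; a small java.util.concurrent sketch (the queue size and task names are arbitrary):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class InProcessQueueDemo {

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(100);
            ExecutorService workers = Executors.newFixedThreadPool(4);

            // Consumers: block until a unit of work arrives, then process it.
            for (int i = 0; i < 4; i++) {
                workers.submit(() -> {
                    try {
                        while (!Thread.currentThread().isInterrupted()) {
                            String work = queue.take();
                            System.out.println(Thread.currentThread().getName() + " -> " + work);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }

            // Producer: enqueue units of work; no broker or network involved.
            for (int i = 0; i < 10; i++) {
                queue.put("task-" + i);
            }

            Thread.sleep(500);   // give the workers time to drain the queue
            workers.shutdownNow();
        }
    }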
Using multithreading you achieve concurrency by sharing the cores of a CPU. If you use JMS instead, you can balance the load and delegate the task to another system.
For example, suppose your application needs to send an email on completion of a certain task, and you want to send that email concurrently. You can either spin up a thread and process it asynchronously, or delegate the mail-sending task to another system using JMS. The number of receiver threads is configurable in JMS, and multiple nodes can listen to the same JMS queue, which balances the load. You can also use features such as persistent queues or transaction-managed queues as the application requires.
In short, whether JMS is a better alternative to multithreading depends on the application architecture.
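The producer side of that email example could be as small as the following sketch with Spring's JmsTemplate; the queue name and bean wiring are assumptions:

    import org.springframework.jms.core.JmsTemplate;
    import org.springframework.stereotype.Service;

    @Service
    public class MailDispatcher {

        private final JmsTemplate jmsTemplate;

        public MailDispatcher(JmsTemplate jmsTemplate) {
            this.jmsTemplate = jmsTemplate;
        }

        // Instead of spawning a thread, hand the task to the broker; any listener
        // on "mail.queue" (possibly on another node) will pick it up.
        public void sendWelcomeMail(String recipientAddress) {
            jmsTemplate.convertAndSend("mail.queue", recipientAddress);
        }
    }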
