I have a Java application named 'X'. On Windows, more than one instance of the application may be running at any given time.
I want a common piece of code to be executed sequentially in application 'X', no matter how many instances of the application are running. Is that possible, and how can it be achieved? Any suggestions will help.
Example: I have a class named Executor with a method execute(). Assuming two or more instances of the application may be running at any given time, how can I make execute() run sequentially across the different instances?
Is there something like a lock that can be accessed from two instances, so that each can check whether the lock is currently held?
I think what you are looking for is a distributed lock (i.e. a lock which is visible and controllable from many processes). There are quite a few 3rd party libraries that have been developed with this in mind and some of them are discussed on this page.
Distributed Lock Service
There are also some other suggestions in this post which use a file on the underlying system as a synchronization mechanism.
Cross process synchronization in Java
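A minimal sketch of the file-based approach, assuming every instance can open the same lock file (the class name CrossProcessLock and the lock file path are hypothetical, not taken from the linked posts). It relies on FileChannel.lock(), which blocks until an exclusive lock on the file is granted:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CrossProcessLock {
    /**
     * Runs the task while holding an exclusive lock on lockFile.
     * Other processes calling this method with the same file wait in lock().
     */
    public static void runExclusively(Path lockFile, Runnable task) throws IOException {
        try (FileChannel channel = FileChannel.open(
                     lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {   // blocks until the lock is available
            task.run();
        }   // the lock is released first, then the channel is closed
    }
}
```

Each instance would then wrap the critical call, for example CrossProcessLock.runExclusively(path, () -> executor.execute()). File locks are held on behalf of the whole JVM, so threads inside one instance still need ordinary synchronization; a second lock() on the same file from the same JVM throws OverlappingFileLockException instead of waiting.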
To my knowledge, you cannot do this that easily. You could implement TCP calls between processes... but I wouldn't advise it.
You would be better off creating an external process in charge of executing the task, and requesting all the tasks to execute by sending messages to a JMS queue that your executor process would consume.
...Or maybe you don't really need several processes running at the same time; what you might need is just one application with several threads working concurrently and one thread dedicated to the Executor. That way, synchronizing the execute() method (or the whole Executor) would be enough and would save you some time.
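A minimal sketch of the single-process variant, assuming all callers live in one JVM (the class name SequentialExecutor is hypothetical). Instead of a synchronized method it uses a single-threaded ExecutorService, which runs submitted tasks one at a time in submission order:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SequentialExecutor {
    // One dedicated worker thread: tasks never overlap.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues the task; the returned Future completes when the task has run. */
    public Future<?> execute(Runnable task) {
        return worker.submit(task);
    }

    /** Lets already queued tasks finish, then stops the worker thread. */
    public void shutdown() {
        worker.shutdown();
    }
}
```

Any thread may call execute(); callers that need the result of the work can wait on the returned Future.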
You cannot achieve this with Executors or anything like that because Java virtual machines will be separate.
If you really need to synchronize between multiple independent instances, one approach is to dedicate an internal port and implement a simple internal server within the application. Look into ServerSocket; RMI is a full-blown solution if you need extensive communication. The first instance binds to the dedicated application port and becomes the master node. All later instances find the application port taken, but can then use it to make an HTTP (or plain TCP/IP) call to the master node, reporting the activities they need done.
As you only need to execute some action sequentially, any slave node may ask the master to do this rather than executing it itself.
A potential problem with this approach is that if the user shuts down the master node, it may be complex to implement a way for another running node to take its place. If only one node is active at any time (receiving input from the user), it may take over the role of the master node after discovering that the master is not responding and that the port is no longer occupied.
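The "first instance to bind becomes the master" step could be sketched as follows. This is only an illustration under assumptions: the class name MasterElection is hypothetical, the port number is whatever you dedicate to the application, and the protocol the slaves use to talk to the master is left out.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class MasterElection {
    /**
     * Tries to bind the dedicated port on the loopback interface.
     * Returns the listening socket if this instance became the master,
     * or null if another instance already holds the port.
     */
    public static ServerSocket tryBecomeMaster(int port) throws IOException {
        try {
            return new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
        } catch (BindException portAlreadyTaken) {
            return null;   // act as a slave and connect to the master instead
        }
    }
}
```

An instance that gets null back would open a Socket to the same port and send its request; an instance that later finds the master unresponsive could call tryBecomeMaster again to take over.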
A distributed queue could be used for this type of load balancing. You put one or more 'request messages' into a queue, and the next available consumer application picks one up and processes it. Each such request message could describe a task to process.
This type of queue could be implemented as JMS queue (e.g. using ActiveMQ http://activemq.apache.org/), or on Windows there is also MSMQ: https://msdn.microsoft.com/en-us/library/ms711472(v=vs.85).aspx.
If performance is an issue and you have C/C++ developers available, the 'shared memory queue' could also be interesting: shmemq API
Related
I'm building a small client/Server chat application. I came across NIO.2 after I tried to simulate it using the classic NIO library.
The goal of my "simulation" of the NIO.2 lib with classic NIO was to use multiple selectors in multiple threads, connected in pairs through an ArrayBlockingQueue, to avoid the network read and write times.
My question is, how are multiple events at the same time handled with in the NIO.2 lib using AsynchronousSocketChannels and CompletionHandlers (which act to my understanding as callbacks)?
The classic NIO lib uses Selectors, which deliver a key set after a select call. This key set can then be iterated over, and each event (read, accept and write) can be handled one after another.
The NIO.2 callbacks, on the other hand, don't have such a sequence. They are asynchronous. So what happens if, for example, 2 clients send a message to the server at exactly the same moment?
Do then 2 callbacks run at the same time? And if yes, then how?
Do they each run in separate threads or not?
And if I were to take those messages from each of the callbacks and tried to enqueue them in the aforementioned ArrayBlockingQueue, would they wait for each other or not?
So what happens if, for example, 2 clients send a message to the server at exactly the same moment?
The clients do not share a common connection with the server. On the server side, you'd call AsynchronousSocketChannel#read with your callback for each client, and the callback fires when some bytes arrive on that client's connection.
For that reason, two callbacks can run simultaneously (as they're asynchronous), but they're still independent for each client, so there won't be a problem.
Do they each run in separate threads or not?
This depends on the backing AsynchronousChannelGroup's thread pool (which you can specify yourself or use the default group).
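A rough sketch of specifying the group yourself, under stated assumptions: the class name QueueingServer, the queue capacity and the buffer size are hypothetical, and messages are treated as whatever bytes a single read returns. With a fixed pool of more than one thread, completion handlers for different clients may run at the same time on different pool threads; ArrayBlockingQueue is thread-safe, so concurrent handlers can enqueue into it safely.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousChannelGroup;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;

public class QueueingServer {
    // Shared by all callbacks; safe to use from several pool threads at once.
    public final BlockingQueue<String> messages = new ArrayBlockingQueue<>(1024);
    private final AsynchronousChannelGroup group;
    private final AsynchronousServerSocketChannel server;

    public QueueingServer(int port, int poolSize) throws IOException {
        // Completion handlers are invoked on this pool's threads.
        group = AsynchronousChannelGroup.withFixedThreadPool(
                poolSize, Executors.defaultThreadFactory());
        server = AsynchronousServerSocketChannel.open(group)
                .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        acceptNext();
    }

    public int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    private void acceptNext() {
        server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            @Override public void completed(AsynchronousSocketChannel client, Void att) {
                acceptNext();                                  // re-arm for the next client
                readNext(client, ByteBuffer.allocate(1024));   // one buffer per client
            }
            @Override public void failed(Throwable exc, Void att) {
                // accept failed or the group was shut down; nothing to clean up here
            }
        });
    }

    private void readNext(AsynchronousSocketChannel client, ByteBuffer buf) {
        client.read(buf, null, new CompletionHandler<Integer, Void>() {
            @Override public void completed(Integer bytesRead, Void att) {
                if (bytesRead < 0) { close(client); return; }  // peer closed the connection
                buf.flip();
                // offer() never blocks the pool thread; it drops the message if the queue is full
                messages.offer(StandardCharsets.UTF_8.decode(buf).toString());
                buf.clear();
                readNext(client, buf);
            }
            @Override public void failed(Throwable exc, Void att) { close(client); }
        });
    }

    private static void close(AsynchronousSocketChannel client) {
        try { client.close(); } catch (IOException ignored) { }
    }

    public void shutdown() throws IOException {
        group.shutdownNow();   // closes all channels in the group
    }
}
```

Using put() instead of offer() would make a callback wait while the queue is full, which ties up a pool thread; whether that is acceptable depends on the pool size and on how fast the consumer drains the queue.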
I created a simple networking library with NIO.2, which I think would help you: https://github.com/jhg023/SimpleNet
We can use nodejs cluster to run multiple processes...
While the equivalent in java is multi-thread...
I have a http listener running on nodejs (without clustering), and I'm using Java to call this nodejs http (using java.lang.Thread class)
If I have concurrently 300 request, will it create multiple instances of nodejs? Will nodejs be a bottle neck?
NodeJS is single-threaded. This means that however many HTTP calls you make, it will queue them and process them. You'll see longer response times, though, if you overload NodeJS with hundreds of calls in a few seconds.
See this guide about the event loop for further information
Edit: I did not see the cluster part. It allows you to run multiple instances, hence using more cores of your processor and processing more actions at the same time. I would say that the best thing to do is to benchmark a lot of operations to see whether it is enough to process hundreds of calls in a few seconds
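Even though NodeJS runs your JavaScript on a single thread, asynchronous I/O does not block that thread: the event loop hands the operation to the operating system or to an internal thread pool and runs the callback once it completes.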
If I have concurrently 300 request, will it create multiple instances of nodejs?
No, unless you are running a node cluster, only a single Node process (and thread) will handle the requests.
Will nodejs be a bottle neck?
If most of your work is asynchronous, then it will be able to perform those tasks in parallel and shouldn't be a bottleneck. Also, you can scale the application by creating a node process for each available core in the CPU and/or by deploying the process in multiple computer instances.
However, it's important to note the distinctions between a Java multithreaded application and a Node cluster (or multiprocess) application.
processes are typically independent, while threads exist as subsets of a process
processes carry considerably more state information than threads, whereas multiple threads within a process share process state as well as memory and other resources
processes have separate address spaces, whereas threads share their address space
processes interact only through system-provided inter-process communication mechanisms
context switching between threads in the same process is typically faster than context switching between processes.
Therefore, if memory is scarce in your context, and if your instance has a multi-core processor, then NodeJS might indeed become a bottleneck.
I am trying to see whether there is an existing implementation for "distributed threads" in Java.
Nowadays almost everything has moved to the cloud. When I have a queue full of messages, I can use a simple ThreadPoolExecutor and spawn several threads to process them. Of course, all of them belong to the same VM (virtual machine).
What about when I have a system with 3 VMs? Is there any framework that supports such scaling without caring where the threads belong?
Let's say something like a distributed ThreadPoolExecutor, so that the threads might belong to multiple VMs?
You can set up a messaging queue and a simple (scalable) application that listens to that queue. You can then monitor the queue and scale up if things get busy.
I have 2 Java processes. Process1 is responsible for importing some external data into the database; Process2 runs the rest of the application using the same database, i.e. it hosts the web module and everything else. Process1 would normally import data once a day.
What I require is that when Process1 has finished its work, it notifies Process2, so that Process2 can perform some subsequent tasks. That is it; this will be the limit of their interaction with each other. No other data has to be shared later.
Now, I know I can do this in one of the following ways:
Have Process1 write an entry in the database when it has finished its execution, and have a daemon thread in Process2 polling for that entry. Once this entry is read, complete the task in Process2. Even though this might be the easiest to implement in the existing ecosystem, I think having a thread poll the database just for one notification looks kind of ugly. However, it could be optimised by starting the thread only when the import job starts and stopping it after the notification is received.
Use a socket. I have never worked with sockets before, so this might be an interesting learning curve. But after my initial reading I am afraid it might be overkill.
Use RMI
I would like to hear from people who have worked on similar problems which approach they chose and why, and I would also like to know what an appropriate solution for my problem would be.
Edit.
I went through this but found that, for a beginner in interprocess communication, it lacks basic examples. That is what I am looking for in this post.
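As a basic example of the socket option listed above, a one-shot notification could be sketched like this. It assumes both processes run on the same machine and agree on a port; the class name ImportNotification and the single-byte "done" message are hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ImportNotification {
    /** Process2 side: blocks until one connection arrives and returns the byte it sent. */
    public static int awaitNotification(ServerSocket listener) throws IOException {
        try (Socket peer = listener.accept()) {
            return peer.getInputStream().read();   // -1 if the peer sent nothing
        }
    }

    /** Process1 side: connects to Process2 on the agreed port and sends a single byte. */
    public static void notifyDone(int port) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            socket.getOutputStream().write(1);
        }
    }
}
```

Process2 would create the ServerSocket once at startup and call awaitNotification on a background thread; Process1 would call notifyDone as the last step of the import.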
I would say take a look at Chronicle-Queue
It uses a memory-mapped file and stores data off-heap (so no problem with GC). It also provides TCP replication for failover scenarios.
It scales pretty well and supports distributed processing when more than one machine is available.
I might have a problem with my application. There is a client running multiple threads which might execute rather time consuming calls to the server over Java RMI. Of course a time consuming call from one client should not block everyone else.
I tested it, and it works on my machine. So I created two Threads on the client and a dummy call on the server. On startup the clients both call the dummy method which just does a huge number of sysout. It can be seen that these calls are handled in parallel, without blocking.
I was very satisfied until a colleague pointed out that the RMI spec does not necessarily guarantee that behavior.
And indeed, a text on the homepage of the University of Lancaster states that
“A method dispatched by the RMI runtime to a remote object
implementation (a server) may or may not execute in a separate thread.
Calls originating from different clients Virtual Machines will execute
in different threads. From the same client machine it is not
guaranteed that each method will run in a separate thread” [1]
What can I do about that? Is it possible that it just won't work in practice?
In theory, yes, you may have to worry about this. In reality, all mainstream RMI implementations multi-thread all incoming calls, so unless you are running against some obscure JVM, you don't have anything to worry about.
What that wording means is that you can't assume it will all execute in the same thread. So you are responsible for any required synchronization.
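A small illustration of that responsibility, assuming client and server run in one JVM purely for demonstration (the names RmiCounterDemo, Counter and CounterImpl are hypothetical). The remote method is synchronized, so the count stays correct whether RMI dispatches concurrent calls on one thread or on many:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class RmiCounterDemo {
    /** Hypothetical remote interface used only for this illustration. */
    public interface Counter extends Remote {
        int increment() throws RemoteException;
    }

    public static class CounterImpl implements Counter {
        private int value;

        // synchronized because RMI may run concurrent calls on different threads
        @Override
        public synchronized int increment() {
            return ++value;
        }
    }

    /** Calls the remote counter from several client threads and returns the total count. */
    public static int runDemo(int threads, int callsPerThread) throws Exception {
        // Assumption: everything runs on one machine, so pin the stub to loopback.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        CounterImpl impl = new CounterImpl();
        Counter stub = (Counter) UnicastRemoteObject.exportObject(impl, 0);
        try {
            Thread[] workers = new Thread[threads];
            for (int i = 0; i < threads; i++) {
                workers[i] = new Thread(() -> {
                    try {
                        for (int j = 0; j < callsPerThread; j++) {
                            stub.increment();
                        }
                    } catch (RemoteException e) {
                        throw new IllegalStateException(e);
                    }
                });
                workers[i].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
            return stub.increment() - 1;   // subtract the extra call made to read the value
        } finally {
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }
}
```

Without the synchronized keyword, concurrent increments could be lost whenever the runtime does dispatch calls in parallel.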
Based on my testing on a Mac laptop, every single client request received in parallel seems to be executed on a separate thread (I tried up to a thousand threads without any issues. I don't know if there is an upper bound, though. My guess is that the maximum number of threads is limited only by memory).
These threads then hang around for some time (a minute or two), in case they can service more clients. If they are unused for some time, they get GC'ed.
Note that I used Thread.sleep() on the server to hold up every request, hence none of the threads could finish the task and move on to another request.
The point is that, if required, the JVM can even allocate a separate thread for each client request. If work is done and threads are free, it could reuse existing threads without creating new ones.
I don't see a situation where any client request would be stuck waiting due to RMI constraints. No matter how many threads on the server are "busy" processing existing requests, new client requests will be received.