I'm trying to accomplish something that is conceptually very simple: I want to synchronize a block of Java code between different machines. There are two instances of a program running on different machines, and the block must not run on both at the same time.
I've heard of ZooKeeper, JGroups, and Akka too, but while reading the documentation they seemed a bit overkill for what I'm trying to do. Does anyone have any idea if there's something more straightforward to use?
Thanks in advance,
Rui
I think Hazelcast's Distributed Lock ( http://docs.hazelcast.org/docs/3.6/manual/html-single/index.html#lock ) may be helpful. Hazelcast is relatively lightweight so should hopefully not be overkill.
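For illustration, here is roughly what that looks like with the 3.x API from the linked manual (a minimal sketch; the lock name is arbitrary):

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.ILock;

    public class DistributedSection {
        public static void main(String[] args) {
            // Each machine starts a Hazelcast node; the nodes discover
            // each other over the network and form a cluster.
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // The same name on every machine refers to the same cluster-wide lock.
            ILock lock = hz.getLock("my-critical-section");
            lock.lock();
            try {
                // ... the block that must never run on two machines at once ...
            } finally {
                lock.unlock();
            }
        }
    }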
If all the technologies you mentioned (also take a look at Terracotta) are too sophisticated for your needs, maybe simple database locking?
A SELECT ... FOR UPDATE statement will lock the selected database record, making other clients that run the same query block until the lock is released. Simple, yet safe and reliable.
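A rough sketch of how that looks from JDBC (the app_lock table, its single row, and the connection URL are made-up examples; any database that supports SELECT ... FOR UPDATE works similarly):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class DbLockSketch {
        public static void main(String[] args) throws Exception {
            // Assumes e.g.: CREATE TABLE app_lock (name VARCHAR(64) PRIMARY KEY);
            //               INSERT INTO app_lock VALUES ('my-critical-section');
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://dbhost/mydb", "user", "pass")) {
                conn.setAutoCommit(false);
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT name FROM app_lock WHERE name = ? FOR UPDATE")) {
                    ps.setString(1, "my-critical-section");
                    ps.executeQuery(); // blocks while another client holds the row lock

                    // ... critical section: only one client at a time gets here ...
                }
                conn.commit(); // releases the row lock
            }
        }
    }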
A very basic solution would be to use RMI.
Designate one machine as the master, and give it a method that uses a lock to let only one caller through at a time.
All other (slave) instances have to call this method via RMI before they run the special Java code block, as sketched below.
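A rough sketch of the idea (all names are made up; note a Semaphore rather than a plain mutex, because RMI may dispatch the acquire and release calls on different server threads):

    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.rmi.registry.LocateRegistry;
    import java.rmi.server.UnicastRemoteObject;
    import java.util.concurrent.Semaphore;

    // Shared remote interface (both master and slaves compile against it).
    interface LockService extends Remote {
        void acquire() throws RemoteException, InterruptedException;
        void release() throws RemoteException;
    }

    // Runs on the master machine.
    public class LockServer implements LockService {
        // A Semaphore instead of a ReentrantLock: a semaphore may be
        // released by a different thread than the one that acquired it.
        private final Semaphore permit = new Semaphore(1, true);

        @Override public void acquire() throws InterruptedException { permit.acquire(); }
        @Override public void release() { permit.release(); }

        public static void main(String[] args) throws Exception {
            LockService stub =
                    (LockService) UnicastRemoteObject.exportObject(new LockServer(), 0);
            LocateRegistry.createRegistry(1099).rebind("lock", stub);
            // Slaves: look up "lock" on the master, call acquire() before the
            // code block and release() in a finally block afterwards.
        }
    }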
Sorry, I am a newbie with distributed locks and Redis. I have heard that Redis is a single-threaded server, so my question is: why do we need distributed lock management for Redis? For example, the second command (initiated by client B) will not interrupt the previous one until the operation of the first command (initiated by client A) is complete, even if both are working on the same data. I know I must be missing something; please kindly help correct me. Thanks.
I think a distributed lock is not really about Redis itself (you shouldn't care whether it's single-threaded or not), but rather about your application.
It's clear what a "regular" (non-distributed) lock does, but it only works for a multi-threaded application inside a single JVM.
The word "distributed" adds a way to synchronize access to some resource across many JVMs, so that only one JVM at a time executes the critical section.
Now, as stated in the article you refer to, it is possible to implement the lock with the SET command, but the fundamental issue with such an implementation is that Redis itself becomes a single point of failure. That's why they talk about the Redlock algorithm, which acquires a lock based on the state of several independent Redis machines. Note that at no point here do we care whether Redis itself is single- or multi-threaded.
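To make the SET-based (single-instance, non-Redlock) variant concrete, a sketch using the Jedis client (assuming a Jedis 3.x-style API; key and timeout are arbitrary):

    import java.util.Collections;
    import java.util.UUID;
    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.params.SetParams;

    public class RedisLockSketch {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                String token = UUID.randomUUID().toString();

                // SET key value NX PX 30000: succeeds only if the key doesn't
                // exist, and expires so a crashed holder can't block forever.
                String ok = jedis.set("lock:my-resource", token,
                        SetParams.setParams().nx().px(30_000));
                if ("OK".equals(ok)) {
                    try {
                        // ... critical section ...
                    } finally {
                        // Release only if we still own the lock; the
                        // compare-and-delete must be atomic, hence Lua.
                        String script =
                            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
                            "  return redis.call('del', KEYS[1]) else return 0 end";
                        jedis.eval(script,
                                Collections.singletonList("lock:my-resource"),
                                Collections.singletonList(token));
                    }
                }
            }
        }
    }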
I suppose this is not possible, but I am looking for the best way to separate the different layers of my service while still being able to access the layers quickly, without the overhead of IPC/RMI.
The main programming language I am using is Java, but I can use C++ if required.
What we have right now is a server that hosts the database and access control, and we use RMI for consumers to request data. This is slow and doesn't scale very well.
We need performance and scalability, which we don't have at the moment.
What we are thinking of is a layered architecture with the database at the base and access control on top of it, along with a notification bus to notify clients of changes in the database.
The main problem is the overhead of communication, which we want to avoid or minimize.
Is there any magic thread that can run in two contexts (switching between them) and share information that way? I know the short answer is probably no, but what are the options?
Update
We are currently using Java RMI.
Our base layer will provide an API that can be used to create plugins that run on top, so it's not a fixed set of collectors/consumers: we can have 5-6 collectors running and the same number of consumers.
We can have up to 1000 consumers.
My first suggestion is that you should buy a book (or find an online tutorial) on building scalable applications, because you seem to be pretty lost.
Sharing a thread between processes doesn't make sense at any level, but you can share the data that the thread accesses, which is probably what you want.
The fastest method will be C-based IPC (e.g., shared memory, semaphores, etc.: shmget). You say you want to avoid the overhead of IPC, but really, it isn't going to get any faster than that.
But why do you want multiple processes? If you are worried about the overhead of communicating between processes, just have all your threads in one process. There is no reason your different layers have to be in different processes.
But anyway, I am not convinced that your original claim that RMI is slow and doesn't scale is correct. If it is not scaling, you are probably not using the right framework; maybe you only have one RMI endpoint on the server. Have you considered a J2EE system with stateless session beans?
Without knowing about your requirements, it is hard to say.
It is not possible in general to share a thread between two processes due to OS design. The problem of sharing data between two or more processes is usually solved by sharing files, a database, or messages (which in turn can be synchronous or asynchronous), by having processes communicate via pipes (say, on Linux), or by sharing memory. Your scenario description is not very precise: you need to describe all the processes, how information is supposed to flow, what triggers the flow, etc.
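If the processes stay on one machine, the closest Java equivalent to shared memory is a memory-mapped file. A rough sketch (the file path is arbitrary, and a real design still needs a synchronization protocol on top of it):

    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class SharedMemorySketch {
        public static void main(String[] args) throws Exception {
            // Both processes map the same file; writes by one become
            // visible to the other without copying through the kernel.
            try (FileChannel ch = FileChannel.open(Paths.get("/tmp/shared.dat"),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.READ,
                    StandardOpenOption.WRITE)) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                buf.putInt(0, 42);                    // process A writes at offset 0
                System.out.println(buf.getInt(0));    // process B, with the same mapping, reads it
            }
        }
    }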
Most likely you need a high-performance messaging library; https://github.com/real-logic/Aeron/ is one. But to get a precise answer, you would need to describe better what overhead exactly you want to minimize.
If your goal is to notify users, you should consider publish/subscribe messaging (pub/sub). There are many middleware vendors out there that provide this architecture though most are expensive in production scenarios. For open source, check out http://redis.io/topics/pubsub. (No affiliation.)
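As a concrete example of the pub/sub pattern with Redis, a sketch using the Jedis client (channel name and payload are made up):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPubSub;

    public class PubSubSketch {
        public static void main(String[] args) {
            // Subscriber: subscribe() blocks this thread and delivers every
            // message published to "db-changes".
            new Thread(() -> {
                try (Jedis sub = new Jedis("localhost", 6379)) {
                    sub.subscribe(new JedisPubSub() {
                        @Override
                        public void onMessage(String channel, String message) {
                            System.out.println("change notification: " + message);
                        }
                    }, "db-changes");
                }
            }).start();

            // Publisher: the database/access-control layer announces a change.
            try (Jedis pub = new Jedis("localhost", 6379)) {
                pub.publish("db-changes", "row 42 updated");
            }
        }
    }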
The project I am working on triggers various asynchronous jobs to do some work. As I look into it more, these asynchronous jobs are actually run as separate JVMs (separate Java processes). Does that mean I would not be able to use any of the following if I need to synchronize between these processes:
synchronized methods/blocks
any lock that implements java.util.concurrent.locks.Lock
Because it seems to me they are all thread-level?
Does Java provide support for IPC like semaphores between processes?
That's right. You cannot use any of the standard synchronization mechanisms, because they only work within one JVM.
Solutions
You can use file locks (java.nio.channels.FileLock; usable with try-with-resources since Java 7); see the sketch after this list.
You can use synchronization via database entities.
An already-implemented solution such as Terracotta may be helpful.
Re-think your design. If you are a beginner in the Java world, talk it through in detail with more experienced engineers; IMHO, your question suggests you are on the wrong track.
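A minimal sketch of the file-lock option above (the lock-file path is arbitrary):

    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class FileLockSketch {
        public static void main(String[] args) throws Exception {
            try (FileChannel ch = FileChannel.open(Paths.get("/tmp/myapp.lock"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                // lock() blocks until no other process holds the OS-level lock.
                try (FileLock lock = ch.lock()) {
                    // ... code that only one JVM may execute at a time ...
                } // closing releases the lock
            }
        }
    }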
You can use the synchronized keyword, locks, atomic objects, etc., but they are all local to the JVM. So if you have two JVMs running the same program, they can still run the same synchronized method at the same time: one caller on each JVM, but no more than one within each.
Solutions:
terracotta provides distributed locking
hazelcast as well
you can use manual synchronization on file system or database
I'm using the distributed lock provided by Redisson to synchronize work across different JVMs.
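For reference, this is roughly what the Redisson usage looks like (the address and lock name are examples):

    import org.redisson.Redisson;
    import org.redisson.api.RLock;
    import org.redisson.api.RedissonClient;
    import org.redisson.config.Config;

    public class RedissonLockSketch {
        public static void main(String[] args) {
            Config config = new Config();
            config.useSingleServer().setAddress("redis://127.0.0.1:6379");
            RedissonClient redisson = Redisson.create(config);

            // The same lock name in every JVM refers to one cluster-wide lock.
            RLock lock = redisson.getLock("jvm-sync");
            lock.lock();
            try {
                // ... critical section: at most one JVM at a time ...
            } finally {
                lock.unlock();
            }
            redisson.shutdown();
        }
    }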
they are all thread-level?
That's correct, synchronized etc only work within the context of a single process.
Does Java provide support for IPC like semaphores between processes?
One way to implement communication between Java processes is using RMI.
I have implemented a Java IPC lock using files (FileBasedLock) and an IPC semaphore backed by a shared database via JDBC (JdbcSemaphore). Both implementations are part of spf4j.
If you have a ZooKeeper instance, take a look at the ZooKeeper-based lock recipes from Apache Curator.
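For example, Curator's InterProcessMutex recipe (connection string and lock path are examples):

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.locks.InterProcessMutex;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class CuratorLockSketch {
        public static void main(String[] args) throws Exception {
            CuratorFramework client = CuratorFrameworkFactory.newClient(
                    "localhost:2181", new ExponentialBackoffRetry(1000, 3));
            client.start();

            // All processes using the same ZooKeeper path contend for one lock.
            InterProcessMutex mutex = new InterProcessMutex(client, "/locks/my-job");
            mutex.acquire();
            try {
                // ... critical section ...
            } finally {
                mutex.release();
            }
            client.close();
        }
    }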
Does anybody know of a way to lock down individual threads within a Java process to specific CPU cores (on Linux)? I've done this in C, but can't find how to do this in Java. My instincts are that this will require a JNI call, but I was hoping someone here might have some insight or might have done it before.
Thanks!
You can't do this in pure Java. But if you really need it, you can use JNI to call native code that does the job. These are the places to start:
http://ovatman.blogspot.com/2010/02/using-java-jni-to-set-thread-affinity.html
http://blog.toadhead.net/index.php/2011/01/22/cputhread-affinity-in-java/
UPD: After some thought, I've decided to create my own class for this: ThreadAffinity.java. It's JNA-based and very simple, so if you want to use it in production you should probably spend some time making it more stable, but for benchmarking and testing it works well as is.
UPD 2: There is another library for working with thread affinity in Java. It uses the same method as noted above, but has a different interface.
I know it's been a while, but if anyone comes across this thread, here's how I solved this problem. I wrote a script that does the following:
Run "jstack -l ".
Take the results and find the "nid"s of the threads I want to manually lock down to cores.
Use taskset on those threads.
You might want to take a look at https://github.com/peter-lawrey/Java-Thread-Affinity/blob/master/src/test/java/com/higherfrequencytrading/affinity/AffinityLockBindMain.java
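For what it's worth, a sketch of how that library is used under its current OpenHFT coordinates (net.openhft.affinity; the linked revision still used the com.higherfrequencytrading package, so treat the exact names as an assumption):

    import net.openhft.affinity.AffinityLock;

    public class AffinitySketch {
        public static void main(String[] args) {
            // Reserves a free CPU and pins the current thread to it.
            AffinityLock lock = AffinityLock.acquireLock();
            try {
                System.out.println("pinned to CPU " + lock.cpuId());
                // ... latency-sensitive work on a dedicated core ...
            } finally {
                lock.release();
            }
        }
    }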
IMO, this will not be possible unless you use native calls. The JVM is supposed to be platform-independent, and any system calls made to achieve this will not result in portable code.
It's not possible (at least with plain Java).
You can use thread pools to limit the amount of threads (and therefore cores) used for different types of work, but there is no way to specify a core to use.
There is even the (small) possibility that your Java runtime doesn't support native threading for your OS or hardware. In this case, green threads are used and only one core will be used for the whole JVM.
This is a bit related to this question.
I'm using make to extract some information about some C programs. I'm wrapping the compilation in a bash script that runs my Java program and then gcc. Basically, I'm doing:
make CC=~/my_script.sh
I would like to use several jobs (the -j option of make), which runs several processes according to the dependency rules.
If I understand correctly, I would have as many instances of the JVM as jobs, right?
The thing is that I'm using sqlite-jdbc to collect some info, so the problem is how to avoid several processes trying to modify the DB at the same time.
It seems that the SQLite lock is JVM-dependent (I mean, a lock can be "seen" only inside the JVM that holds it), and that the same is true for RandomAccessFile.lock().
Do you have any idea how to do that? (Creating a tmp file and then checking whether it exists seems to be one possibility, but may be expensive. A locking table in the DB?)
Thanks
java.nio.channels.FileLock allows OS-level cross-process file locking.
However, using make to start a bash scripts that runs several JVMs in parallel before calling gcc sounds altogether too Rube-Goldbergian and brittle to me.
There are several solutions for this.
If your lock only needs to work within a single machine, you can use a server socket to implement it: the process that manages to bind to the port first owns the lock, and the other processes wait for the port to become available (see the sketch below).
If you need a lock that spans multiple machines, you can use a memcached lock. This requires a running memcached server. I can paste some code if you are interested in this solution.
You can get a Java library to connect to memcached here.
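A sketch of the server-socket trick from the first option (the port number is an arbitrary choice):

    import java.io.IOException;
    import java.net.InetAddress;
    import java.net.ServerSocket;

    public class SocketLock {
        private static final int LOCK_PORT = 49152; // arbitrary, must be the same in all processes

        public static ServerSocket acquire() throws InterruptedException {
            while (true) {
                try {
                    // Binding succeeds for exactly one process on the machine.
                    return new ServerSocket(LOCK_PORT, 1, InetAddress.getLoopbackAddress());
                } catch (IOException alreadyHeld) {
                    Thread.sleep(200); // another process holds the "lock"; retry
                }
            }
        }

        public static void main(String[] args) throws Exception {
            try (ServerSocket lock = acquire()) {
                // ... critical section; closing the socket releases the lock ...
            }
        }
    }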
You may try Terracotta for sharing objects between various JVM instances. It may appear too heavyweight for your needs, but it is at least worth considering.