Lock across several jvm? - java

This is somewhat related to this question.
I'm using make to extract some information about some C programs. I'm wrapping the compilation in a bash script that runs my Java program and then gcc. Basically, I'm doing:
make CC=~/my_script.sh
I would like to use several jobs (the -j option of make), which runs several processes in parallel according to the dependency rules.
If I understand correctly, I would have as many JVM instances as jobs, right?
The thing is that I'm using sqlite-jdbc to collect some info. So the problem is: how do I prevent several processes from trying to modify the DB at the same time?
It seems that the SQLite lock is JVM-dependent (I mean a lock can be "seen" only inside the JVM that holds it), and that the same is true for RandomAccessFile.lock().
Do you have any idea how to do that? (Creating a temp file and then checking whether it exists seems to be one possibility, but it may be expensive. A locking table in the DB?)
Thanks

java.nio.channels.FileLock allows OS-level cross-process file locking.
However, using make to start a bash script that runs several JVMs in parallel before calling gcc sounds altogether too Rube-Goldbergian and brittle to me.
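A minimal sketch of the FileLock approach, assuming a dedicated lock file that every JVM started by make agrees on. The class name CrossProcessLock, the method runExclusively, and the lock file path are hypothetical, not part of any library:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: serializes a critical section across processes
// by taking an exclusive OS-level lock on a shared lock file.
final class CrossProcessLock {

    /** Runs the action while holding an exclusive lock on lockFile. */
    static void runExclusively(Path lockFile, Runnable action) throws IOException {
        try (FileChannel channel = FileChannel.open(lockFile,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             // lock() blocks until no other process holds the lock
             FileLock lock = channel.lock()) {
            action.run();
        } // the lock is released and the channel closed here
    }
}
```

Each JVM would wrap its database write in something like CrossProcessLock.runExclusively(Paths.get("/tmp/mydb.lock"), () -> writeToDb()), where writeToDb is a placeholder. Note that a FileLock is held on behalf of the whole JVM, so threads inside one JVM still need ordinary in-process synchronization, and the behaviour on network file systems is platform-dependent.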

There are several solutions for this.
If the lock only needs to work within a single machine, you can use a server socket to implement it: the process that manages to bind to the port first owns the lock, and the other processes wait for the port to become available.
If you need a lock that spans multiple machines, you can use a memcached lock. This requires a running memcached server. I can paste some code if you are interested in this solution.
You can get a Java library to connect to memcached here.
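The server socket idea could be sketched roughly as follows. This is an illustration under assumptions, not the answerer's code: the class name PortLock and the polling interval are made up, and the port number is whatever free port all cooperating processes agree on.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper: whoever binds the agreed port on the loopback
// interface holds the lock; closing the socket releases it.
final class PortLock implements AutoCloseable {
    private final ServerSocket socket;

    private PortLock(ServerSocket socket) {
        this.socket = socket;
    }

    /** Tries once; returns null if another process already holds the port. */
    static PortLock tryAcquire(int port) throws IOException {
        try {
            return new PortLock(
                    new ServerSocket(port, 1, InetAddress.getLoopbackAddress()));
        } catch (BindException alreadyBound) {
            return null;
        }
    }

    /** Polls until the port becomes available. */
    static PortLock acquire(int port, long retryMillis)
            throws IOException, InterruptedException {
        PortLock lock;
        while ((lock = tryAcquire(port)) == null) {
            Thread.sleep(retryMillis);
        }
        return lock;
    }

    @Override
    public void close() throws IOException {
        socket.close();
    }
}
```

Typical use would be try (PortLock lock = PortLock.acquire(47321, 50)) { ... }, with 47321 standing in for the agreed port. The lock disappears automatically if the owning process dies, but any unrelated program that happens to use the same port will also block everyone.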

You may try Terracotta for sharing objects between JVM instances. It may seem too heavyweight for your needs, but it is at least worth considering.

Related

"In a distributed environment, one does not use multithreading" - Why?

I am working on a platform that hosts small Java applications, all of which currently use a single thread, live inside a Docker engine, consume data from a Kafka server, and log to a central DB.
Now I need to add another Java application to this platform. This app uses multithreading relatively heavily. I have already tested it inside a Docker container and it works perfectly there, so I'm ready to deploy it on the platform, where it would be scaled manually: some human would define the number of containers to start, each of them containing an instance of this app.
My architect has an objection, saying that "In a distributed environment we never use multithreading". So now I have to refactor my application, eliminating any thread-related logic from it and making it single-threaded. I asked him for more detailed reasoning, but he yells "If you are not aware of this principle, you have no place near Java".
Is it really a mistake to use a multithreaded Java application in a distributed system? The system is a simple cluster of ten or twenty physical machines, each hosting a number of virtual machines, which in turn run Docker containers with Java applications inside them.
Honestly, I don't see the problem of multithreading inside a container.
Is it really a mistake or somehow "forbidden"?
Thanks.
When you write for example a web application that will run in a Java EE application server, then normally you should not start up your own threads in your web application. The application server will manage threads, and will allocate threads to process incoming requests on the server.
However, there is no hard rule or reason why it is never a good idea to use multi-threading in a distributed environment.
There are advantages to making applications single-threaded: the code will be simpler and you won't have to deal with difficult concurrency issues.
But "in a distributed environment we never use multithreading" is not necessarily always true and "if you are not aware of this principle, you have no place near Java" sounds arrogant and condescending.
I guess he only tells you this because using a single thread eliminates multithreading and data-ordering issues.
There is nothing wrong with multithreading though.
Distributed systems usually have tasks that are heavily I/O bound.
If I/O calls are blocking in your system:
The only way to achieve concurrency within the process is to spawn new threads to do other useful work (multithreading).
The caveat with this approach is that if there are too many threads in flight, the operating system will spend too much time context switching between threads, which is wasted work.
If I/O calls are non-blocking in your system:
Then you can avoid the multithreaded approach and use a single thread to service all your requests (read about event loops, Java's Netty framework, or Node.js).
The upside of the single-threaded approach:
The OS does not perform wasteful thread context switches.
You will not run into concurrency problems such as deadlocks or race conditions.
The downside is that:
It is often harder to code and think in a non-blocking fashion.
You typically end up using more memory in the form of blocking queues.
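For the blocking case, one common way to keep the number of threads in flight under control is a fixed-size thread pool. The following is only a sketch under that assumption; the class name BoundedBlockingIo and the method fetchAll are hypothetical, and the callables stand in for whatever blocking calls the application makes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs blocking calls concurrently, but never on
// more than maxThreads threads, to limit context-switching overhead.
final class BoundedBlockingIo {

    static List<String> fetchAll(List<Callable<String>> blockingCalls, int maxThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<String> results = new ArrayList<>();
            // invokeAll waits for all tasks and returns futures in submission order
            for (Future<String> future : pool.invokeAll(blockingCalls)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Extra tasks simply queue up inside the pool until a worker thread becomes free, so the thread count stays fixed no matter how many calls are submitted.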
What? We use RxJava and Spring Reactor pretty heavily in our application and it works pretty fine. You can't work with threads across two JVMs anyway. So just make sure that your logic is working as you expect on a single JVM.

Remote debugging threads with different debuggers

I have an application that is a scheduler running different threads.
The application may load new Runnable classes and run them.
Currently the application is in production, that is it's running on remote server.
My team consists of 3 people developing Runnable classes.
When the class is ready, it's uploaded to server and loaded to scheduler.
I would like to give my team the ability to debug specific threads.
That is: person A may debug the threads of Runnable A, person B those of Runnable B, and so on.
Giving them full access to the remote JVM is not a solution, because
the developers are not allowed to see the system core or each other's solutions.
So my question is: how to allow multiple remote debugging with thread specific connections?
Preferable IDE: Eclipse
EDIT:
It's possible to connect remotely to specific thread with jdb
http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
Here is an example: http://www.itec.uni-klu.ac.at/~harald/CSE/Content/debugging.html
1) Find your thread with jdb threads
2) Put breakpoint and enter the wanted thread
Still the security issue stays.
One solution was to compile the protected code without debug symbols, but that only protects the core; developers can still see each other's threads.
So, the next step is digging into the Security Manager. Maybe there's a privilege layer suitable for my situation.
I'm not sure I've got a good answer to your question, but let's see how it pans out.
As I understand it you want to allow different developers to debug their class alone, and their class runs as a thread as part of a single Java process.
On the face of it that sort of runs counter to the nature of debugging in that normally you have access to everything in the process. I don't imagine that Java is any different to any other language in this respect (I'm no Java programmer).
So how about running the classes in separate Java processes. That way I presume the standard Eclipse tools would allow each developer to remote attach and debug their class.
However I presume that these classes need to interact with each other in some way, otherwise you wouldn't be asking your question in the first place. And running each class in a separate process (JVM) sounds like a bad thing as far as interaction is concerned.
So how about a different form of interaction where the process boundary between each class doesn't really matter that much? You could look at using JCSP which, as far as I can tell, doesn't really care if two threads are in the same process or not.
It's a completely different interaction model, based solely on synchronous message passing. You get some nice fringe benefits - scalability is suddenly no longer a massive problem, and it allows you to dodge many pitfalls normally associated with multithreaded programs (deadlock, etc). However if you've already written a large amount of code, adopting JCSP is probably a significant rewrite.
Is that anywhere near the mark? Good luck.

Java synchronization between different JVMs

The project I am working on would trigger various asynchronous jobs to do some work. As I look into it more these asynchronous jobs are actually being run as separate JVMs (separate java processes). Does it mean I would not be able to use any of the following if I need to synchronize between these processes:
synchronized methods/blocks
any lock that implements java.util.concurrent.locks
Because it seems to me they are all thread-level?
Does Java provide support for IPC like semaphores between processes?
That's right. You cannot use any of the standard synchronization mechanisms, because they only work within a single JVM.
Solutions
You can use file locks (java.nio.channels.FileLock, available since Java 1.4).
You can use synchronization via database entities.
One of the existing solutions, such as Terracotta, may be helpful.
Rethink your design. If you are a beginner in the Java world, try to talk it through in detail with more experienced engineers. IMHO your question suggests that you are on the wrong track.
You can use synchronized keyword, locks, atomic objects, etc. - but they are local to the JVM. So if you have two JVMs running the same program, they can still e.g. run the same synchronized method at the same time - one on each JVM, but not more.
Solutions:
terracotta provides distributed locking
hazelcast as well
you can use manual synchronization on file system or database
I'm using distributed lock provided by Redisson to synchronize work of different JVMs
they are all thread-level?
That's correct, synchronized etc only work within the context of a single process.
Does Java provide support for IPC like semaphores between processes?
One way to implement communication between Java processes is using RMI.
I have implemented a Java IPC lock using files (FileBasedLock) and an IPC semaphore using a shared DB via JDBC (JdbcSemaphore). Both implementations are part of spf4j.
If you have a zookeeper instance take a look at the Zookeeper based Lock recipes from Apache Curator

How to PIN a Java thread to a processor on Linux? (with JNI, native code, linux trick, etc.) [duplicate]

Does anybody know of a way to lock down individual threads within a Java process to specific CPU cores (on Linux)? I've done this in C, but can't find how to do this in Java. My instincts are that this will require a JNI call, but I was hoping someone here might have some insight or might have done it before.
Thanks!
You can't do this in pure Java. But if you really need it, you can use JNI to call native code that does the job. These are places to start:
http://ovatman.blogspot.com/2010/02/using-java-jni-to-set-thread-affinity.html
http://blog.toadhead.net/index.php/2011/01/22/cputhread-affinity-in-java/
UPD: After some thinking, I've decided to create my own class for this: ThreadAffinity.java. It's JNA-based and very simple, so if you want to use it in production, maybe you should spend some time making it more stable, but for benchmarking and testing it works well as is.
UPD 2: There is another library for working with thread affinity in Java. It uses the same method as noted above, but has a different interface.
I know it's been a while, but if anyone comes across this thread, here's how I solved this problem. I wrote a script that would do the following:
Run "jstack -l " against the Java process.
Take the results and find the "nid" values of the threads I want to manually lock down to cores.
Run taskset on those threads.
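The parsing step of that script could look something like the sketch below. It assumes the jstack thread header format in which the native thread id appears as a hexadecimal nid=0x... field; the class name JstackNids and both method names are hypothetical. The nid has to be converted to decimal because taskset expects a decimal thread id:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts native thread ids from jstack output
// and builds the matching taskset command lines.
final class JstackNids {
    // Assumed header shape: "worker-1" #12 prio=5 ... nid=0x1a2b runnable
    private static final Pattern THREAD_LINE =
            Pattern.compile("^\"([^\"]+)\".*\\bnid=0x([0-9a-fA-F]+)");

    /** Maps thread name to native thread id (decimal). */
    static Map<String, Long> nativeIds(String jstackOutput) {
        Map<String, Long> ids = new LinkedHashMap<>();
        for (String line : jstackOutput.split("\\R")) {
            Matcher m = THREAD_LINE.matcher(line);
            if (m.find()) {
                ids.put(m.group(1), Long.parseLong(m.group(2), 16));
            }
        }
        return ids;
    }

    /** Builds a command that pins one native thread to one CPU. */
    static String tasksetCommand(long nativeId, int cpu) {
        return "taskset -cp " + cpu + " " + nativeId;
    }
}
```

The resulting command strings can then be executed from the script, or from Java via ProcessBuilder. Thread ids change on every JVM start, so the pinning has to be redone after each restart.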
You might want to take a look at https://github.com/peter-lawrey/Java-Thread-Affinity/blob/master/src/test/java/com/higherfrequencytrading/affinity/AffinityLockBindMain.java
IMO, this will not be possible unless you use native calls. The JVM is supposed to be platform independent, and any system calls made to achieve this will not result in portable code.
It's not possible (at least with plain Java).
You can use thread pools to limit the amount of threads (and therefore cores) used for different types of work, but there is no way to specify a core to use.
There is even the (small) possibility that your Java runtime doesn't support native threading for your OS or hardware. In this case, green threads are used and only one core will be used for the whole JVM.

Distributed synchronized execution

I'm trying to accomplish something that is conceptually very simple. I want to synchronize a block of Java code between different machines. There are two instances of a program running on different machines, and they must not run that block at the same time.
I've heard of zookeeper, jgroups and akka too, but while reading the documentation it seemed to me a bit overkill for what I'm trying to do. Does anyone have any idea if there's anything more straight forward to use?
Thanks in advance,
Rui
I think Hazelcast's Distributed Lock ( http://docs.hazelcast.org/docs/3.6/manual/html-single/index.html#lock ) may be helpful. Hazelcast is relatively lightweight so should hopefully not be overkill.
If all the technologies you mentioned (also take a look at Terracotta) are too sophisticated for your needs, maybe simple database locking?
A SELECT ... FOR UPDATE statement locks the given database record, making other clients that run the same query block until the lock is released. Simple, yet safe and reliable.
A very basic solution would be to use RMI.
Pick one machine as the master and give it a method that uses a mutex lock so that only one caller can pass at a time.
All other (slave) instances must call this method via RMI before running the protected block of Java code.
