I know that using ProcessBuilder we can create a process in Java, but how do I create a thread inside a process? Also, if I want to create multiple threads inside a process, what is the best way of doing that?
Thanks in advance.
I'm looking for thread creation inside of a new process.
Once launched, the launching application is not in control of the threads within the new process. Additional threads in the new process will be started as and when that process's code decides to. Only if you are the author of the code for the other process would you be able to change how and when it spawns new threads.
The difference between processes and threads with respect to Java is that threads run within the same JVM instance, while processes run in different JVM instances.
For example, launching two instances of the same Java application results in two processes, each running in their own JVM. Despite being the same application, they run independently of each other unless the application includes a means of communicating between its instances.
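To make the launching side concrete, here is a minimal sketch of starting a second JVM with ProcessBuilder; the main class name is a placeholder, and the child's threads remain entirely under the child's own control:

```java
import java.io.IOException;

public class LaunchSecondInstance {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Re-use the java binary and classpath of the current JVM.
        String javaBin = System.getProperty("java.home") + "/bin/java";
        String classpath = System.getProperty("java.class.path");

        // "com.example.MyApp" is a placeholder for your own main class.
        ProcessBuilder pb = new ProcessBuilder(javaBin, "-cp", classpath, "com.example.MyApp");
        pb.inheritIO();                       // share stdout/stderr with the parent
        Process process = pb.start();         // the new process runs in its own JVM

        int exitCode = process.waitFor();     // the parent can only wait for or destroy it, not manage its threads
        System.out.println("Child JVM exited with code " + exitCode);
    }
}
```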
Thread creation in a different process would be the responsibility of that process's Java code. If you are looking to create threads in one JVM under the direction of code in another JVM, you will have to implement an inter-process control mechanism (e.g. socket, control file, RMI, JMX, etc.).
Without knowing your reason for spawning threads in a different process, I can only assume that you want some type of isolation. If it is data isolation you seek, consider revising your application's architecture to provide it intrinsically and follow one of the suggestions in Peter Lawrey's comment. A good starting point for ExecutorService is Java 8 Concurrency Tools: Threads and Executors.
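As a minimal sketch of the ExecutorService route (pool size and tasks are arbitrary), creating multiple threads inside the current process looks roughly like this:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MultiThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        // A fixed pool of 4 worker threads inside this one process/JVM.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        for (int i = 0; i < 10; i++) {
            final int taskId = i;
            pool.submit(() -> System.out.println(
                    "Task " + taskId + " running on " + Thread.currentThread().getName()));
        }

        pool.shutdown();                              // stop accepting new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS);  // wait for the submitted tasks to finish
    }
}
```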
Related
I would like to write a verticle that renders graphs using GraphViz, by loading the native (shared) libs into my JVM and calling them via JNI. Now, GraphViz itself is not thread-safe. It is not enough to run each of the multi-instance verticles in its own thread; I must additionally ensure that each verticle gets its own instance of the native code, or in other words, that every verticle runs in a separate process, each utilizing one of the cores.
Most descriptions of Vert.x talk only about isolation between threads (not sharing data, etc.). I have found nothing about process isolation.
Basically I'm looking for a framework to create a couple of instances of a REST server, all listening on the same socket or with a load balancer in front, without having to write any of that code myself. Sort of what PM2 does for Node.js. Can I do that with Vert.x?
I realize this may be against the spirit of Vert.x, as the core documentation makes clear:
Instead of a single event loop, each Vertx instance maintains several event loops. By default we choose the number based on the number of available cores on the machine, but this can be overridden.
This means a single Vertx process can scale across your server, unlike Node.js.
But as I am using native libraries, which can only be loaded once per JVM and in my case cannot execute concurrently, and which therefore prevent scaling out to multiple cores within one process, I guess I really do want the Node.js pattern, only in Java.
My requirement is also much simpler than what is described in the documentation of the clustered event bus, e.g. the Zookeeper example, because I need no communication between the instances.
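For context, the kind of verticle I have in mind is roughly the following (class name and port are placeholders); I would launch one JVM process per core, each running a single instance, with the load balancer in front:

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.Vertx;

// Placeholder verticle: each JVM process would run exactly one of these,
// so the (non-thread-safe) native GraphViz code is loaded once per process.
public class RenderVerticle extends AbstractVerticle {

    @Override
    public void start() {
        vertx.createHttpServer()
             .requestHandler(req -> req.response().end("rendered graph goes here"))
             .listen(8080); // a load balancer in front would spread requests over the processes
    }

    public static void main(String[] args) {
        Vertx.vertx().deployVerticle(new RenderVerticle());
    }
}
```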
I am working on a platform that hosts small Java applications, all of which currently use a single thread, living inside a Docker engine, consuming data from a Kafka server and logging to a central DB.
Now, I need to put another Java application onto this platform. The app at hand uses multithreading relatively heavily; I already tested it inside a Docker container and it works perfectly there, so I'm ready to deploy it on the platform, where it would be scaled manually, that is, some human would define the number of containers to be started, each of them containing an instance of this app.
My Architect has an objection, saying that "In a distributed environment we never use multithreading". So now, I have to refactor my application eliminating any thread related logic from it, making it single threaded. I requested a more detailed reasoning from him, but he yells "If you are not aware of this principle, you have no place near Java".
Is it really a mistake to use a multithreaded Java application in a distributed system - a simple cluster with ten or twenty physical machines, each hosting a number of virtual machines, which then run Docker containers with Java applications inside them?
Honestly, I don't see the problem of multithreading inside a container.
Is it really a mistake or somehow "forbidden"?
Thanks.
When you write, for example, a web application that will run in a Java EE application server, then normally you should not start up your own threads in your web application. The application server manages threads and allocates them to process incoming requests on the server.
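For illustration, in a Java EE 7 container the usual pattern is to borrow threads from a container-managed pool instead of creating your own; a rough sketch, assuming the javax namespace and a JAX-RS resource (all names are placeholders):

```java
import javax.annotation.Resource;
import javax.enterprise.concurrent.ManagedExecutorService;
import javax.ws.rs.GET;
import javax.ws.rs.Path;

@Path("/report")
public class ReportResource {

    // The container injects and manages this pool; the application never calls new Thread().
    @Resource
    private ManagedExecutorService executor;

    @GET
    public String generate() {
        executor.submit(() -> {
            // long-running work runs on a container-managed thread
        });
        return "report generation started";
    }
}
```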
However, there is no hard rule or reason why it is never a good idea to use multi-threading in a distributed environment.
There are advantages to making applications single-threaded: the code will be simpler and you won't have to deal with difficult concurrency issues.
But "in a distributed environment we never use multithreading" is not necessarily always true and "if you are not aware of this principle, you have no place near Java" sounds arrogant and condescending.
I guess he only tells you this because using a single thread eliminates multithreading and data-ordering issues.
There is nothing wrong with multithreading though.
Distributed systems usually have tasks that are heavily I/O bound.
If I/O calls are blocking in your system
The only way to achieve concurrency within the process is to spawn new threads to do other useful work (multi-threading). The caveat with this approach is that, if there are too many threads in flight, the operating system will spend too much time context switching between threads, which is wasteful work.
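A minimal sketch of that multi-threaded, blocking style, using a bounded pool so the number of in-flight threads stays under control (the fetch method and pool size are made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingIoWithPool {
    public static void main(String[] args) {
        // Bounded pool: limits the context-switching overhead of having too many threads in flight.
        ExecutorService ioPool = Executors.newFixedThreadPool(8);

        List<CompletableFuture<String>> results = new ArrayList<>();
        for (String id : List.of("a", "b", "c")) {
            results.add(CompletableFuture.supplyAsync(() -> blockingFetch(id), ioPool));
        }

        results.forEach(f -> System.out.println(f.join()));
        ioPool.shutdown();
    }

    // Stand-in for a blocking I/O call (DB query, HTTP request, ...).
    static String blockingFetch(String id) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result for " + id;
    }
}
```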
If I/O calls are non-blocking in your system
Then you can avoid the multi-threading approach and use a single thread to service all your requests (read about event loops, Java's Netty framework, or Node.js); a minimal sketch follows after the trade-offs below.
The upsides of the single-thread approach:
The OS does not do any wasteful thread context switches.
You will NOT run into concurrency problems like deadlocks or race conditions.
The downsides are that:
It is often harder to code/think in a non-blocking fashion.
You typically end up using more memory in the form of blocking queues.
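And here is the promised sketch of the non-blocking style, using Java 11's HttpClient as one example of non-blocking I/O (the URL is a placeholder); the calling thread is never blocked waiting for the response:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CountDownLatch;

public class NonBlockingIoDemo {
    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/")).build();
        CountDownLatch done = new CountDownLatch(1);

        // sendAsync returns immediately; the callback runs when the response arrives.
        client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
              .thenApply(HttpResponse::body)
              .thenAccept(body -> {
                  System.out.println("Received " + body.length() + " characters");
                  done.countDown();
              });

        System.out.println("Request issued, the main thread is free to do other work");
        done.await(); // only here so the demo does not exit before the response arrives
    }
}
```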
What? We use RxJava and Spring Reactor pretty heavily in our application and it works just fine. You can't work with threads across two JVMs anyway, so just make sure that your logic works as you expect on a single JVM.
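For instance, a tiny Project Reactor sketch (assuming reactor-core is on the classpath); the work is spread over a scheduler's thread pool inside a single JVM:

```java
import reactor.core.publisher.Flux;
import reactor.core.scheduler.Schedulers;

public class ReactorDemo {
    public static void main(String[] args) throws InterruptedException {
        Flux.range(1, 8)
            .parallel()                    // split the stream into parallel rails
            .runOn(Schedulers.parallel())  // run each rail on the shared parallel scheduler
            .map(i -> i * i)
            .subscribe(v -> System.out.println(
                    Thread.currentThread().getName() + " -> " + v));

        Thread.sleep(500); // crude wait so the demo does not exit before the async work finishes
    }
}
```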
So I have an existing Spring library that performs some blocking tasks (exposed as services) that I intend to wrap using Scala Futures to showcase multi-processor capabilities. The intention is to get people interested in the Scala/Akka tech stack.
Here is my problem.
Let's say I get two services from the existing Spring library. These services perform different blocking tasks (IO, DB operations).
How do I make sure that these tasks (service calls) are carried out across multiple cores?
For example, how do I make use of custom execution contexts?
Do I need one per service call?
How do the execution context(s) / thread pools relate to multi-core operations?
I appreciate any help in understanding this.
You cannot ensure that tasks will be executed on different cores. The workflow for the sample program would be as follows.
Write a program that does two things on two different threads (Futures, Java threads, Actors, you name it).
The JVM sees that you want two threads, so it starts two JVM threads and submits them to the OS process dispatcher (or the other way round, it doesn't matter).
The OS decides on which core to execute each thread. Usually it will try to put threads on different cores to maximize the overall efficiency, but this is not guaranteed; you might have a situation where your 10 JVM threads are all executed on one core, although this is extreme.
The rule of thumb for writing concurrent and seemingly parallel applications is: "Here, take my, say, 10 threads and TRY to split them among the cores."
There are some tricks, like tuning CPU affinity (low-level, very risky) or spawning a plethora of threads to make sure that they are parallelized (a lot of overhead and work for the GC). However, in general the OS is usually not that overloaded, and if you create two actors, e.g. one for the DB and one for network IO, they should work well in parallel.
UPDATE:
The global ExecutionContext manages the thread pool. However, you can define your own and submit runnables to it with myThreadPool.submit(runnable: Runnable). Have a look at the links provided in the comment.
I have a program in Java that performs some computation in parallel. I can either run it on a single machine or using multiple different machines.
When executing on a single machine, thread synchronization is successfully achieved by using the CyclicBarrier class from the java.util.concurrent package. The idea is that all the threads must wait for the other threads to arrive at the same point before proceeding with the computation.
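To illustrate, this is roughly what the single-machine version does (thread count and phases are simplified):

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class SingleMachineBarrier {
    public static void main(String[] args) {
        int parties = 4;
        // The barrier action runs once per phase, after all parties have arrived.
        CyclicBarrier barrier = new CyclicBarrier(parties, () -> System.out.println("--- phase done ---"));

        for (int i = 0; i < parties; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    System.out.println("Worker " + id + " computing phase 1");
                    barrier.await();   // wait until every worker reaches this point
                    System.out.println("Worker " + id + " computing phase 2");
                    barrier.await();
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }
}
```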
When executing on multiple different machines, inter-process communication is implemented via RMI (Remote Method Invocation). I have the same problem in this situation: I want the threads of these processes to wait for the others to arrive at the same point before continuing. I cannot use a shared CyclicBarrier object between the different processes because this class is not serializable.
What are my alternatives to get this barrier behavior on threads executing on different processes on multiple machines?
Thanks
You don't need to pass a CyclicBarrier between processes. You can do an RMI call which in turn uses a CyclicBarrier. I suggest you look at Hazelcast, as it supports a distributed Lock and many other collections.
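A rough sketch of that idea, with all names invented for illustration: one process hosts the CyclicBarrier and exports a thin wrapper over RMI, and every participant calls the remote method at each synchronization point:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Remote interface exposed to the other machines.
interface RemoteBarrier extends Remote {
    void enter() throws RemoteException;
}

// Server-side implementation: the real CyclicBarrier never leaves this JVM.
class RemoteBarrierImpl implements RemoteBarrier {
    private final CyclicBarrier barrier;

    RemoteBarrierImpl(int parties) {
        this.barrier = new CyclicBarrier(parties);
    }

    @Override
    public void enter() throws RemoteException {
        try {
            barrier.await(); // blocks the RMI worker thread until all parties have called enter()
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new RemoteException("barrier broken", e);
        }
    }
}

public class BarrierServer {
    public static void main(String[] args) throws Exception {
        // A real server should keep a strong reference to the implementation for its whole lifetime.
        RemoteBarrierImpl impl = new RemoteBarrierImpl(4);
        RemoteBarrier stub = (RemoteBarrier) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("barrier", stub);
        System.out.println("Barrier exported; clients call enter() instead of sharing the object");
    }
}
```

A client would then look up "barrier" via LocateRegistry.getRegistry(host) and call enter() wherever it previously called await().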
IMHO, I would reconsider whether you really need all the processes to checkpoint, and find a way to avoid needing this in the first place.
Without having the source code for a Java API, is there any way to know whether the API methods create multiple threads? Are there any conventions to follow if you are writing Java APIs and they create multiple threads? This may be a very fundamental question, but it spawned out of a discussion in which the crux question was: "How do you know which Java APIs create threads and which don't?"
One way of determining which libraries create new threads is by disallowing Thread creation and ThreadGroup modification in the SecurityManager. See the java.lang.SecurityManager.checkAccess(Thread) method. By implementing your own SecurityManager, you are able to react to the creation of Threads.
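A rough sketch of that approach (note that thread construction is actually reported through the ThreadGroup variant of checkAccess, and that the SecurityManager is deprecated for removal since Java 17, so treat this as a diagnostic trick for older runtimes):

```java
public class ThreadWatcher {
    public static void main(String[] args) {
        System.setSecurityManager(new SecurityManager() {
            @Override
            public void checkAccess(ThreadGroup g) {
                // Invoked when a new Thread (or ThreadGroup) is created in that group.
                System.out.println("Thread creation detected in group: " + g.getName());
                new Exception("creator stack trace").printStackTrace();
            }

            @Override
            public void checkAccess(Thread t) {
                // Invoked when an existing thread is modified (interrupt, setPriority, ...).
                System.out.println("Thread modification: " + t.getName());
            }

            @Override
            public void checkPermission(java.security.Permission perm) {
                // Allow everything else; we only want to observe, not block.
            }

            @Override
            public void checkPermission(java.security.Permission perm, Object context) {
                // Allow everything else.
            }
        });

        // Any library call after this point that creates a thread will be reported.
        new Thread(() -> {}, "probe-thread").start();
    }
}
```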
To answer the other question: many libraries create new threads, even if you don't expect it. For example, APIs for HTTP communication create timers for keep-alives or session timeouts, Java 2D creates a signalling thread, and Java itself has multiple threads, e.g. the finalizer thread, the AWT/Swing event dispatch thread, etc.
There's no way to tell. Actually, I don't think you normally would care that much unless you're in some kind of constrained environment. What I've found is more relevant is to determine whether a method is written with an expectation of being run on a particular thread (the AWT event dispatch thread, in the case I've seen). There's no way to do that either, unless the code is using some kind of naming convention, or it's documented.
In my experience, if you are looking at core Java, not J2EE, the only time I can think of that threads are created in core Java is with Swing.
I haven't seen any example of other threads being created by the core Java APIs, except for the Thread class, of course. :)
But if you are using other libraries, then it may be that they are creating threads. If you don't want to profile, you may want to use AspectJ to log whenever a new thread is created, along with the stack trace of what called it, so you can see what is creating the threads.
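A rough sketch of that AspectJ idea, assuming load-time weaving is configured with the AspectJ weaver agent (the aspect name is made up; only call sites the weaver actually processes will be matched):

```java
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;

@Aspect
public class ThreadCreationLogger {

    // Matches every constructor call on Thread or its subclasses in the woven code.
    @Before("call(java.lang.Thread+.new(..))")
    public void logThreadCreation(JoinPoint joinPoint) {
        System.out.println("Thread being created at: " + joinPoint.getSourceLocation());
        new Exception("creation stack trace").printStackTrace();
    }
}
```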
UPDATE:
Swing uses 4 threads, according to this post, but he also explains how you can go about killing off the threads, if needed.
http://www.herongyang.com/Swing/jframe_2.html
If you want to see active threads, just fire up the jvisualvm application (located in your $JDK/bin directory) to connect to any local Java process. You'll be able to see a multitude of information about the process, including thread names, status, and history. Get more information here.