Limiting the number of threads used by the JVM

How can I limit the number of threads that someone can create? I run other people's code (something like ideone) and want to limit the number of threads that code can spawn. Is there a JVM setting for this, or some other way?
EDIT
I am adding more specific information because some people are not getting my point.
Some random person sends me code, which my computer is going to execute.
The code must execute using at most k threads.
Everything must be automated, working like SPOJ, ideone, etc.

On Linux, you could run the program as a separate user and use the shell command ulimit -u nprocs to limit the number of threads (processes) for that user. If an attempt is made to exceed the limit, the JVM will throw an OutOfMemoryError.
But why do you want to do this? Are you worried that the program will consume all the CPU resources of the computer? If so, you might want to consider running the JVM at lower scheduling priority, using nice, so other processes will get preferential use of the CPU:
NPROCS=100 # for example
NICENESS=13 # for example
ulimit -u $NPROCS
nice -n $NICENESS java ...
Using nice in that manner should reduce the priority of all the threads, but it is not clear that it does so for Linux.
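For an automated judge, the shell lines above could be assembled from Java before the child JVM is started. This is only a sketch under assumptions: `SandboxLauncher`, `buildCommand` and the class name `Main` are hypothetical, it is Linux-only, and it assumes the launcher already runs as the dedicated low-privilege user.

```java
import java.util.List;

// Hypothetical helper: builds the command line for running submitted code
// under a per-user process/thread limit and a lowered priority (Linux only).
class SandboxLauncher {

    // nprocs and niceness play the role of NPROCS and NICENESS in the shell
    // snippet; mainClass is the (assumed) entry point of the submitted program.
    static List<String> buildCommand(int nprocs, int niceness, String mainClass) {
        // The name ends up inside a shell script, so reject anything that is
        // not a plain Java class name.
        if (!mainClass.matches("[A-Za-z_$][\\w$.]*")) {
            throw new IllegalArgumentException("not a class name: " + mainClass);
        }
        // ulimit is a shell builtin, so it must run in a shell that then
        // execs the JVM; the java process inherits the limit.
        String script = "ulimit -u " + nprocs
                + " && exec nice -n " + niceness + " java " + mainClass;
        return List.of("sh", "-c", script);
    }

    public static void main(String[] args) {
        List<String> command = buildCommand(100, 13, "Main");
        System.out.println(String.join(" ", command));
        // To actually launch it: new ProcessBuilder(command).inheritIO().start();
    }
}
```

Because `ulimit -u` counts every process and thread owned by the user, the limit has to leave room for the JVM's own internal threads.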

You can create your own subclass of Thread that performs the desired checking in the constructor(s) or in the start method.
To ensure the code you are running uses your custom thread class, you must load the code with your own custom class loader. That class loader simply catches any request for the java.lang.Thread class and hands out your custom class instead (the concept can be extended to other classes as well).
Warning: Implementing this properly is not trivial.
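The checking subclass on its own might look like the sketch below. `LimitedThread`, `MAX_THREADS` and `runningCount` are made-up names and the limit of 2 is arbitrary; the class-loader part is not shown, so code that instantiates plain `java.lang.Thread` would still bypass the check.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Thread subclass that refuses to start once MAX_THREADS
// instances are running. The limit of 2 is an arbitrary example value.
class LimitedThread extends Thread {
    static final int MAX_THREADS = 2;
    private static final AtomicInteger running = new AtomicInteger();

    LimitedThread(Runnable task) {
        super(task);
    }

    @Override
    public void start() {
        // Reserve a slot first; give it back if the limit is exceeded.
        if (running.incrementAndGet() > MAX_THREADS) {
            running.decrementAndGet();
            throw new IllegalStateException("thread limit of " + MAX_THREADS + " reached");
        }
        try {
            super.start();
        } catch (RuntimeException | Error e) {
            running.decrementAndGet(); // start failed, release the slot
            throw e;
        }
    }

    @Override
    public void run() {
        try {
            super.run(); // runs the Runnable passed to the constructor
        } finally {
            running.decrementAndGet(); // slot is free again when the task ends
        }
    }

    static int runningCount() {
        return running.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Runnable waiter = () -> {
            try {
                release.await(); // keep the thread alive until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread a = new LimitedThread(waiter);
        Thread b = new LimitedThread(waiter);
        a.start();
        b.start();
        try {
            new LimitedThread(waiter).start(); // third thread exceeds the limit
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        release.countDown(); // let the two running threads finish
        a.join();
        b.join();
    }
}
```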

AFAIK, the limit depends purely on the OS, not on the JVM.
You can manage the threads with an ExecutorService:
An Executor that provides methods to manage termination and methods that can produce a Future for tracking progress of one or more asynchronous tasks.
ExecutorService pool = Executors.newFixedThreadPool(n);
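To show what the fixed pool does and does not bound, here is a small sketch; `BoundedPoolDemo` and `maxConcurrency` are illustrative names, and the 20 ms sleep stands in for real work. The pool caps how many submitted tasks run at once, but it cannot stop a task from creating threads of its own with `new Thread(...)`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPoolDemo {

    // Runs `tasks` short jobs on a pool of `poolSize` threads and returns the
    // highest number of jobs that were ever running at the same time.
    static int maxConcurrency(int poolSize, int tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the maximum seen
                try {
                    Thread.sleep(20); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown();                             // accept no new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS); // wait for the queued ones
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrency: " + maxConcurrency(3, 12));
    }
}
```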

Related

How to change the priority of a running java process?

In a related question we explored using ProcessBuilder to start external processes in low priority using OS-dependent commands. I also discovered that if a parent process is low priority, then all of its spawned processes start in low priority. So my new question is about starting a Java program (run via double-clicking an executable jar in Windows) in low priority, or changing its priority programmatically during the run. I have tried altering the thread priority, but this has no effect on the Windows process priority.
I have tried the following, but it does not change the process priority in the task manager
public class hello {
    public hello() {
        try {
            Thread.currentThread().setPriority(1);
            Thread.sleep(10000);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The only other thing I can think of is to run the program using a batch file, but I would rather keep this in the family so to speak. So does anyone know of a java-based way to change the current process priority? Ideally, it would be nice to be able to change the priority of the process in response to user input while the program is running.
Perhaps you are trying to do something the OS does for you.
In Unix, under load, each process is given a short time slice to do its work. If it uses all of its time slice, it is assumed that the process is CPU bound and its priority is lowered. If it blocks on IO, it is assumed to be IO bound and its priority is raised (because it didn't use all of its time slice).
All this only matters if there isn't enough CPU. If you keep your CPU load below 100% most of the time, every process will get as much CPU as it needs and the priority doesn't make much difference.
https://stackoverflow.com/questions/257859 discusses how to change the priority of a thread in Windows. I don't know of any Java API to do this, so you're going to have to fall back on JNI to call into the Windows API. In your shoes I think I'd start with JNA which will let you map the functions easily, or find a ready-written Java wrapper to the API if there is one.
For Windows 10, you can still set the running process to low priority using the deprecated WMIC command:
static void setSelfLowPrio() {
    try {
        Runtime.getRuntime()
                .exec(String.format("wmic process where processid=%d CALL setpriority \"idle\"", ProcessHandle.current().pid()));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
You can also set it to "low" or "below normal" if "idle" is not enough for your process.
(The title does not address windows specifically, but the tags do. However I think it might be relevant to know the differences.)
In general, scheduling of threads and processes is a kernel-dependent feature, and there is hardly a portable way to control it. In fact, what a priority value means varies greatly. For example, on NT a high value such as 24 means realtime and a value of 1 means idle. On Unix it is the opposite: lower nice values mean higher priority and larger values mean lower priority.
Of course Java abstracts this information away using .setPriority with a range of 1 (lowest) to 10 (highest).
Something not pointed out yet, but a pretty big problem on many Unixes: by default a user cannot increase the priority of a process (that is, reduce its nice value), even if the same user decreased the priority right before.
In contrast on NT I think you can reraise your priority back to default priority.
Simply put: .setPriority may work on windows, but will most likely not work on unix.
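As a small illustration of the 1 to 10 range, the sketch below creates a thread at the lowest Java priority. `PriorityDemo` and `lowPriorityThread` are made-up names, and the value is only a hint to the scheduler: whether the OS honours it is platform-dependent, as described above.

```java
class PriorityDemo {

    // Creates (but does not start) a thread with the lowest Java priority.
    static Thread lowPriorityThread(Runnable task) {
        Thread worker = new Thread(task);
        worker.setPriority(Thread.MIN_PRIORITY); // a hint, not a guarantee
        return worker;
    }

    public static void main(String[] args) {
        Thread worker = lowPriorityThread(() -> { /* work would go here */ });
        System.out.println("range " + Thread.MIN_PRIORITY + ".." + Thread.MAX_PRIORITY
                + ", default " + Thread.NORM_PRIORITY
                + ", worker " + worker.getPriority());
    }
}
```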

Manually Increasing the Amount of CPU a Java Application Uses

I've just made a program with Eclipse that takes a really long time to execute. It's taking even longer because it's loading my CPU to 25% only (I'm assuming that is because I'm using a quad-core and the program is only using one core). Is there any way to make the program use all 4 cores to max it out? Java is supposed to be natively multi-threaded, so I don't understand why it would only use 25%.
You still have to create and manage threads manually in your application. Java can't determine that two tasks can run asynchronously and automatically split the work into several threads.
This is a pretty vague question because we don't know much about what your program does. If your program is single-threaded, then no number of cores on your machine is going to make it run any faster. Java does have threading support, but it won't automatically parallelize your code for you. To speed it up, you'll need to identify parts of the computation that can be run in parallel with one another and add code as appropriate to split up and reconstitute the work. Without more info on what your program does, I can't help you out.
Another important detail to note is that Java threads are not guaranteed to be the same as system threads. Mainstream JVMs such as HotSpot back each Java thread with an OS thread and leave scheduling to the kernel, while some older JVMs scheduled "green" threads themselves; either way, there's no guarantee that threads are scheduled fairly.
Yes, Java is multi-threaded, but the multi-threading doesn't happen "by magic".
Have a look at either the Thread class or the Executor framework. Essentially you need to split your job into "subtasks", each of which can run on a single processor, then do something like this:
Executor ex = Executors.newFixedThreadPool(4);
while (thereAreMoreSubtasksToDo) {
    ex.execute(new Runnable() {
        public void run() {
            // ... do subtask ...
        }
    });
}
Turning a serial routine/algorithm into a parallel one isn't necessarily trivial: you need to know in particular about a range of issues broadly termed "thread-safety". You may be interested in some material I've written about thread-safety in Java, and threading in general if you follow the links: the key thing to bear in mind is that if any data/objects are being shared among the different threads running, then you need to take special precautions. That said, for independent things that you just want to "run at the same time", then the above pattern will get you started.
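As a concrete instance of independent subtasks, here is a sketch that splits a sum over 1..n into chunks that share no data. `ParallelSum` and its parameters are illustrative, and a real workload would replace the inner loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSum {

    // Sums 1..n by splitting the range into up to `parts` independent subtasks.
    static long sum(long n, int parts) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Callable<Long>> subtasks = new ArrayList<>();
            long chunk = (n + parts - 1) / parts; // ceiling division
            for (long start = 1; start <= n; start += chunk) {
                final long from = start;
                final long to = Math.min(n, start + chunk - 1);
                // Each subtask only touches its own local variables.
                subtasks.add(() -> {
                    long partial = 0;
                    for (long i = from; i <= to; i++) {
                        partial += i;
                    }
                    return partial;
                });
            }
            long total = 0;
            // invokeAll blocks until every subtask has completed.
            for (Future<Long> f : pool.invokeAll(subtasks)) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sum(1_000_000, 4)); // n(n+1)/2 = 500000500000
    }
}
```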
Java is multi-threaded but if your application runs in only one thread, only one thread will be used. (Apart from the internal threads Java uses for finalization, garbage collection and so on.)
If you want your code to use multiple threads, you have to split it up manually, either by starting threads by yourself or using a third party thread pool. I'd suggest the latter option as it's safer but both can work equally well.
You've got a bit of learning ahead of you (actually, quite a bit of learning) - but it's learning you should do if you are going to be doing any serious programming.
Here's a starting point: http://download.oracle.com/javase/tutorial/essential/concurrency/
But you might want to look into a good book on Java multi-threading (I did this so long ago that any book I could recommend would be out of print). This sort of hard topic is well suited for learning from a text instead of online tutorials.

Multithreading, Multiprocessing with STOP and Continue Signals

I am working on a project where I need to get the native stack of the Java application. I am able to achieve this partially thanks to ptrace, multiprocessing, and signals.
On Linux, a normal Java application has, at a minimum, 14 threads. Out of these 14, I am interested in only the main thread of which I have to get the native stack. Considering this objective, I have started a separate process using fork() which is monitoring the native stack of the main thread. In short, I have 2 separate processes: one is being monitored and the other does the monitoring using ptrace and signal handling.
Steps in the monitoring process:
Get the main thread ID out of the 14 threads from the monitored process.
ptrace_attach on the main ID.
ptrace_cont on the main ID.
continuous loop starts
{
    kill(main_ID, SIGSTOP)
    nanosleep and check the status from the /proc/[pid]/stat file.
    ptrace_peekdata to read the stack and navigate.
    ptrace_cont on the main ID.
    nanosleep and check the status from the /proc/[pid]/stat file.
}
ptrace_detach on the main ID.
This perfectly gives the native stack information continuously. However, sometimes I encounter an issue:
When I kill(main_ID, SIGSTOP) the main thread, the other threads of the process go into a finished or stopped state (T) and the entire process blocks. This behavior is not consistent; sometimes the entire process executes correctly. I cannot understand this, as I am only signaling the main thread. Why are the other threads affected?
Can someone help me analyze this problem?
I also tried sending SIGCONT and SIGSTOP to all of the threads of the process but the issue still occurs sometimes.
Thanks,
Sandeep
Assuming you are using Linux, you should be using tkill(2) or tgkill(2) instead of kill(2). On FreeBSD, you should use the SYS_thr_kill2 syscall. Per the tkill(2) manpage:
tgkill() sends the signal sig to the thread with the thread ID tid in
the thread group tgid. (By contrast, kill(2) can only be used to send
a signal to a process (i.e., thread group) as a whole, and the signal
will be delivered to an arbitrary thread within that process.)
Ignore the stuff about tkill(2) and friends being for internal thread library usage; they are commonly used by debuggers/tracers to send signals to specific threads.
Also, you should use waitpid(2) (or some variation of it) to wait for the thread to receive the SIGSTOP instead of polling on /proc/[pid]/stat. This approach will be more efficient and more responsive.
Finally, it appears that you are doing some sort of stack sampling. You may want to check out Google PerfTools as these tools include a CPU sampler that is doing stack sampling to obtain estimates of what functions are consuming the most CPU time. You could maybe reuse the work these tools have already done, as stack sampling can be tricky to make robust.

What is the difference between Go's multithreading and pthread or Java Threads?

What is the difference between Go's multithreading approach and other approaches, such as pthread, boost::thread or Java Threads?
Quoted from Day 3 Tutorial <- read this for more information.
Goroutines are multiplexed as needed onto system threads. When a goroutine executes a blocking system call, no other goroutine is blocked.
We will do the same for CPU-bound goroutines at some point, but for now, if you want user-level parallelism you must set $GOMAXPROCS or call runtime.GOMAXPROCS(n).
A goroutine does not necessarily correspond to an OS thread. It can have smaller initial stack size and the stack will grow as needed.
Multiple goroutines may be multiplexed onto a single thread when needed.
More importantly, the concept is as outlined above, that a goroutine is a sequential program that may block itself but does not block other goroutines.
Goroutines are implemented as pthreads in gccgo, so a goroutine can be identical to an OS thread, too.
It's separating the concept of OS thread and our thinking of multithreading when programming.
IMO, what makes the multi-threading in Go appealing is the communication facilities: unlike pthread where one must build the communications infrastructure (mutex, queues etc.), in Go it is available by default in a convenient form.
In short, there is "low-friction" to using threads because of the good communication facilities (akin to Erlang if I can say so).
In the reference compilers (5g/6g/8g), the master scheduler (src/pkg/runtime/proc.c) creates N OS threads, where N is controlled by runtime.GOMAXPROCS(n) (default 1). Each scheduler thread pulls a new goroutine off the master list and starts running it. The goroutine(s) will continue to run until a syscall is made (e.g. printf) or an operation on a channel is made, at which point the scheduler will grab the next goroutine and run it from the point at which it left off (see gosched() calls in src/pkg/runtime/chan.c).
The scheduling, for all intents and purposes, is implemented with coroutines. The same functionality could be written in straight C using setjmp() and longjmp(); Go (like other languages that implement lightweight/green threads) is just automating the process for you.
The upside to lightweight threads is that, since it's all userspace, creating a "thread" is very cheap (allocating a small default stack) and can be very efficient due to the inherent structure of how the threads talk to each other. The downside is that they are not true threads, which means a single lightweight thread can block the entire program, even when it appears all the threads should be running concurrently.
As previous answers have stated, goroutines do not necessarily correspond to system threads. However, I found the following useful if you must have the performance increase of multi-threading right now:
The current implementation of the Go runtime will not parallelize this code by default. It dedicates only a single core to user-level processing. An arbitrary number of goroutines can be blocked in system calls, but by default only one can be executing user-level code at any time. It should be smarter and one day it will be smarter, but until it is if you want CPU parallelism you must tell the run-time how many goroutines you want executing code simultaneously. There are two related ways to do this. Either run your job with environment variable GOMAXPROCS set to the number of cores to use or import the runtime package and call runtime.GOMAXPROCS(NCPU). A helpful value might be runtime.NumCPU(), which reports the number of logical CPUs on the local machine. Again, this requirement is expected to be retired as the scheduling and run-time improve.
quote source
An example program that maxes out my i5 processor is this (uses all 4 cores at 100% in htop):
package main

import (
    "fmt"
    "time"
    "runtime"
)

func main() {
    runtime.GOMAXPROCS(4) // Set the maximum number of threads/processes
    d := make(chan string)
    go boring("boring!", d, 1)
    go boring("boring!", d, 2)
    go boring("boring!", d, 3)
    go boring("boring!", d, 4)
    for i := 0; i < 10; i++ {
        time.Sleep(time.Second)
    }
    fmt.Println("You're boring; I'm leaving.")
}

func boring(msg string, c chan string, id int) {
    for i := 0; ; i++ {
    }
}
Now that doesn't actually 'do' anything, but see how short/easy/simple that is compared to writing multithreaded applications in other languages such as Java.

How good is the JVM at parallel processing? When should I create my own Threads and Runnables? Why might threads interfere?

I have a Java program that runs many small simulations. It runs a genetic algorithm, where each fitness function is a simulation using parameters on each chromosome. Each one takes maybe 10 or so seconds if run by itself, and I want to run a pretty big population size (say 100?). I can't start the next round of simulations until the previous one has finished. I have access to a machine with a whack of processors in it and I'm wondering if I need to do anything to make the simulations run in parallel. I've never written anything explicitly for multicore processors before and I understand it's a daunting task.
So this is what I would like to know: To what extent and how well does the JVM parallelize? I have read that it creates low-level threads, but how smart is it? How efficient is it? Would my program run faster if I made each simulation a thread? I know this is a huge topic, but could you point me towards some introductory literature concerning parallel processing and Java?
Thanks very much!
Update:
Ok, I've implemented an ExecutorService and made my small simulations implement Runnable and have run() methods. Instead of writing this:
Simulator sim = new Simulator(args);
sim.play();
return sim.getResults();
I write this in my constructor:
ExecutorService executor = Executors.newFixedThreadPool(32);
And then each time I want to add a new simulation to the pool, I run this:
RunnableSimulator rsim = new RunnableSimulator(args);
executor.execute(rsim);
return rsim.getResults();
The RunnableSimulator::run() method calls the Simulator::play() method, neither have arguments.
I think I am getting thread interference, because now the simulations error out. By error out I mean that variables hold values that they really shouldn't. No code from within the simulation was changed, and before the simulation ran perfectly over many, many different arguments. The sim works like this: each turn it's given a game piece and loops through all the locations on the game board. It checks to see if the location given is valid, and if so, commits the piece and measures that board's goodness. Now, obviously invalid locations are being passed to the commit method, resulting in index out of bounds errors all over the place.
Each simulation is its own object right? Based on the code above? I can pass the exact same set of arguments to the RunnableSimulator and Simulator classes and the runnable version will throw exceptions. What do you think might cause this and what can I do to prevent it? Can I provide some code samples in a new question to help?
Java Concurrency Tutorial
If you're just spawning a bunch of stuff off to different threads, and it isn't going to be talking back and forth between different threads, it isn't too hard; just write each in a Runnable and pass them off to an ExecutorService.
You should skim the whole tutorial, but for this particular task, start here.
Basically, you do something like this:
ExecutorService executorService = Executors.newFixedThreadPool(n);
where n is the number of things you want running at once (usually the number of CPUs). Each of your tasks should be an object that implements Runnable, and you then execute it on your ExecutorService:
executorService.execute(new SimulationTask(parameters...));
Executors.newFixedThreadPool(n) will start up n threads, and execute will insert the tasks into a queue that feeds to those threads. When a task finishes, the thread it was running on is no longer busy, and the next task in the queue will start running on it. Execute won't block; it will just put the task into the queue and move on to the next one.
The thing to be careful of is that you really AREN'T sharing any mutable state between tasks. Your task classes shouldn't depend on anything mutable that will be shared among them (i.e. static data). There are ways to deal with shared mutable state (locking), but if you can avoid the problem entirely it will be a lot easier.
EDIT: Reading your edits to your question, it looks like you really want something a little different. Instead of implementing Runnable, implement Callable. Your call() method should be pretty much the same as your current run(), except it should return getResults();. Then, submit() it to your ExecutorService. You will get a Future in return, which you can use to test if the simulation is done, and, when it is, get your results.
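A minimal sketch of that Callable/Future pattern follows. `SimulationTask` and `SimulationRunner` are hypothetical stand-ins for the question's classes, and squaring the parameter is only a placeholder for running the simulation and collecting its results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SimulationRunner {

    // Submits one task per parameter and collects the results in order.
    static List<Integer> runAll(List<Integer> parameters, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int p : parameters) {
                futures.add(executor.submit(new SimulationTask(p))); // does not block
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks until that simulation is done
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(List.of(1, 2, 3, 4), 2)); // [1, 4, 9, 16]
    }
}

// Hypothetical stand-in for the simulation: all state lives in instance
// fields, nothing is static or shared between tasks.
class SimulationTask implements Callable<Integer> {
    private final int parameter;

    SimulationTask(int parameter) {
        this.parameter = parameter;
    }

    @Override
    public Integer call() {
        return parameter * parameter; // placeholder for play() + getResults()
    }
}
```

Reading the result through `Future.get()` avoids asking for results before the task has actually run, which calling `getResults()` immediately after `execute()` would risk.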
You can also look at the new fork/join framework by Doug Lea. One of the best books on the subject is certainly Java Concurrency in Practice. I would strongly recommend that you take a look at the fork/join model.
Java threads are just too heavyweight. We have implemented parallel branches in Ateji PX as very lightweight scheduled objects. As in Erlang, you can create tens of millions of parallel branches before you start noticing an overhead. But it's still Java, so you don't need to switch to a different language.
If you are doing full-out processing all the time in your threads, you won't benefit from having more threads than processors. If your threads occasionally wait on each other or on the system, then Java scales well up to thousands of threads.
I wrote an app that discovered a class B network (65,000) in a few minutes by pinging each node, and each ping had retries with an increasing delay. When I put each ping on a separate thread (this was before NIO, I could probably improve it now), I could run to about 4000 threads in Windows before things started getting flaky. On Linux the number was nearer 1000 (never figured out why).
No matter what language or toolkit you use, if your data interacts, you will have to pay some attention to those areas where it does. Java uses the synchronized keyword to prevent two threads from executing a section at the same time. If you write your Java in a more functional manner (making all your members final), you can run without synchronization, but solving problems that way takes a different approach.
Java has other tools to manage units of independent work; look in the java.util.concurrent package for more information.
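A short sketch of the keyword in use: several threads increment one shared counter, and `synchronized` keeps the increments from interleaving. `SharedCounter` and `countWith` are illustrative names.

```java
class SharedCounter {
    private long count = 0;

    // Only one thread at a time can be inside a synchronized method of the
    // same object, so the read-modify-write below cannot interleave.
    synchronized void increment() {
        count++;
    }

    synchronized long get() {
        return count;
    }

    // Starts `threads` workers that each increment one shared counter.
    static long countWith(int threads, int incrementsPerThread) throws InterruptedException {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join(); // wait for every worker to finish
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWith(4, 10_000)); // 40000
    }
}
```

Without `synchronized` on `increment()`, the same program could report fewer than 40000 because `count++` is a read followed by a write.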
Java is pretty good at parallel processing, but there are two caveats:
Java threads are relatively heavyweight (compared with e.g. Erlang), so don't start creating them in the hundreds or thousands. Each thread gets its own stack memory (the default size depends on the platform and JVM, commonly somewhere between 256 KB and 1 MB, and can be changed with -Xss) and you could run out of memory, among other things.
If you run on a very powerful machine (especially with a lot of CPUs and a large amount of RAM), then the VM's default settings (especially concerning GC) may result in suboptimal performance and you may have to spend some time tuning them via command line options. Unfortunately, this is not a simple task and requires a lot of knowledge.
