In a related question we explored using ProcessBuilder to start external processes in low priority using OS-dependent commands. I also discovered that if a parent process is low priority, then all of its spawned processes start in low priority. So my new question is about starting a Java program (run via double-clicking an executable jar in Windows) in low priority, or changing its priority programmatically during the run. I have tried altering the thread priority, but this has no effect on the Windows process priority.
I have tried the following, but it does not change the process priority in the task manager:
public class hello{
public hello(){
try{
Thread.currentThread().setPriority(1);
Thread.sleep(10000);
}catch(Exception e){e.printStackTrace();}
}
}
The only other thing I can think of is to run the program using a batch file, but I would rather keep this in the family so to speak. So does anyone know of a java-based way to change the current process priority? Ideally, it would be nice to be able to change the priority of the process in response to user input while the program is running.
Perhaps you are trying to do something the OS does for you.
In Unix, under load, each process is given a short time slice to do its work. If it uses all of its time slice, the process is assumed to be CPU bound and its priority is lowered. If it blocks on IO, it is assumed to be IO bound and its priority is raised (because it didn't use all of its time slice).
All this only matters if there isn't enough CPU. If you keep your CPU load below 100% most of the time, every process will get as much CPU as it needs and the priority doesn't make much difference.
https://stackoverflow.com/questions/257859 discusses how to change the priority of a thread in Windows. I don't know of any Java API to do this, so you're going to have to fall back on JNI to call into the Windows API. In your shoes I think I'd start with JNA which will let you map the functions easily, or find a ready-written Java wrapper to the API if there is one.
For Windows 10, you can still set the running process to low priority with the deprecated WMIC command:
static void setSelfLowPrio(){
try {
Runtime.getRuntime()
.exec(String.format("wmic process where processid=%d CALL setpriority \"idle\"", ProcessHandle.current().pid()));
} catch (IOException e) {
e.printStackTrace();
}
}
You can also set it to "low" or "below normal" if "idle" is not enough for your process.
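A variation on the same idea, as a rough sketch only: ProcessBuilder takes the arguments as a list, so no command string has to be assembled, and the exit code can be checked. The class and method names here are made up for illustration, and the sketch assumes wmic is actually installed (it is deprecated and may be missing on newer Windows builds).

```java
import java.io.IOException;
import java.util.List;

public class SelfPriority {

    // Builds the argument list for the same wmic call as above.
    // "priority" is passed through unchanged, e.g. "idle".
    static List<String> wmicCommand(long pid, String priority) {
        return List.of("wmic", "process", "where", "processid=" + pid,
                "CALL", "setpriority", priority);
    }

    // Runs wmic against the current JVM process and returns wmic's exit code.
    static int lowerOwnPriority(String priority) throws IOException, InterruptedException {
        long pid = ProcessHandle.current().pid();
        Process p = new ProcessBuilder(wmicCommand(pid, priority))
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // ignore wmic's chatter
                .start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        if (!windows) {
            System.out.println("Not on Windows, skipping wmic call.");
            return;
        }
        System.out.println("wmic exit code: " + lowerOwnPriority("idle"));
    }
}
```

A non-zero exit code (or an IOException because wmic cannot be found) is the signal to fall back to something else, such as the JNA route mentioned in another answer.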
(The title does not address windows specifically, but the tags do. However I think it might be relevant to know the differences.)
In general, scheduling of threads and processes is a kernel-dependent feature, and there is hardly a portable way to do this. In fact, what priority means varies greatly. For example, on NT a high value of 24 means realtime and a value of 1 means idle. On Unix this is the opposite: 1 is fastest and larger values are slower.
Of course Java abstracts this information away using .setPriority with a range of 1 (lowest) to 10 (highest).
Something not pointed out yet, but a pretty big problem on many Unixes: by default a user cannot increase the priority of a process (that is, reduce the nice value), even if that same user decreased the priority right before.
In contrast, on NT I think you can raise your priority back to the default priority.
Simply put: .setPriority may work on Windows, but will most likely not work on Unix.
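Whatever the OS ends up doing with the value, the Java side looks the same everywhere. The following is only a small sketch of the 1 to 10 range; the clamp helper is my own addition, and nothing here shows how a given kernel maps the value.

```java
public class ThreadPriorityRange {

    // Clamps a requested priority into the range Java accepts;
    // setPriority throws IllegalArgumentException for values outside it.
    static int clamp(int requested) {
        return Math.max(Thread.MIN_PRIORITY, Math.min(Thread.MAX_PRIORITY, requested));
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> System.out.println(
                "Java priority of worker: " + Thread.currentThread().getPriority()));
        worker.setPriority(clamp(0)); // 0 is out of range, so it becomes MIN_PRIORITY
        worker.start();
        worker.join();
    }
}
```

Note that this only changes the priority of one Java thread relative to others; it does not change the priority class of the whole process as shown in the Windows task manager.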
How can I limit the number of threads that someone can create? I am running someone else's code (something like ideone) and want to limit the number of threads it can spawn. How can I do so? Is there some JVM setting or something else?
EDIT
I am adding more specific info because some people are not getting my point.
Some random person sends me code which my computer is going to execute
The code must be executed using a maximum of k threads
Everything must be automated, working like SPOJ, ideone, etc.
On Linux, you could run the program as a separate user and use the shell command ulimit -u nprocs to limit the number of threads (processes) for that user. If an attempt is made to exceed the limit, the JVM will throw an OutOfMemoryError.
But why do you want to do this? Are you worried that the program will consume all the CPU resources of the computer? If so, you might want to consider running the JVM at lower scheduling priority, using nice, so other processes will get preferential use of the CPU:
NPROCS=100 # for example
NICENESS=13 # for example
ulimit -u $NPROCS
nice -n $NICENESS java ...
Using nice in that manner should reduce the priority of all the threads, but it is not clear that it does so for Linux.
You can create your own subclass for thread that performs the desired checking in the constructor(s) or in the start method.
To ensure the code you are running uses your custom thread class, you must load the code with your own custom class loader and that class loader simply catches any request for the java.lang.Thread class and hands out your custom class instead (the concept can be extended to other classes as well).
Warning: Implementing this properly is not trivial.
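To give an idea of the first half (the Thread subclass that checks in the start method), here is a minimal sketch. The class name and the limit of 2 are invented for the example, and it deliberately leaves out the hard part, namely the custom class loader that would force untrusted code to use this class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class LimitedThread extends Thread {

    // Hypothetical limit, chosen only for this illustration.
    static final int MAX_THREADS = 2;

    // Number of LimitedThread instances currently running.
    private static final AtomicInteger live = new AtomicInteger();

    public LimitedThread(Runnable target) {
        super(target);
    }

    @Override
    public void start() {
        // Reserve a slot before starting; release it again if over the limit.
        if (live.incrementAndGet() > MAX_THREADS) {
            live.decrementAndGet();
            throw new IllegalStateException("thread limit of " + MAX_THREADS + " reached");
        }
        super.start();
    }

    @Override
    public void run() {
        try {
            super.run(); // runs the Runnable passed to the constructor
        } finally {
            live.decrementAndGet(); // free the slot when the thread ends
        }
    }

    // Tries to start "attempts" threads that all block until released,
    // and returns how many were actually allowed to start.
    static int startUpTo(int attempts) {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> started = new ArrayList<>();
        for (int i = 0; i < attempts; i++) {
            try {
                Thread t = new LimitedThread(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                t.start();
                started.add(t);
            } catch (IllegalStateException refused) {
                // over the limit: this thread never ran
            }
        }
        release.countDown();
        for (Thread t : started) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return started.size();
    }

    public static void main(String[] args) {
        System.out.println("started " + startUpTo(5) + " of 5 requested threads");
    }
}
```

On its own this does nothing against code that creates a plain java.lang.Thread or its own executor, which is exactly why the class loader step is needed and why the warning above applies.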
AFAIK, the limit depends purely on the OS, not on the JVM.
You can also manage your threads with an ExecutorService:
An Executor that provides methods to manage termination and methods that can produce a Future for tracking progress of one or more asynchronous tasks.
ExecutorService pool = Executors.newFixedThreadPool(n);
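As a small sketch of what that one-liner buys you: however many tasks are handed to the pool, they are run by at most n worker threads. The class and method names are invented for the example. Keep in mind this only caps the threads used for tasks you submit yourself; it does not stop submitted code from creating threads of its own.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FixedPoolDemo {

    // Runs "tasks" small jobs on a pool of "n" threads and returns
    // how many distinct worker threads actually executed them.
    static int distinctWorkers(int n, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        Set<String> names = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // accept no new tasks; queued ones still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return names.size();
    }

    public static void main(String[] args) {
        System.out.println("50 tasks ran on " + distinctWorkers(3, 50) + " worker thread(s)");
    }
}
```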
I am working on a project that is both memory and computationally intensive. A significant portion of the execution utilizes multi-threading by a FixedThreadPool. In short; I have 1 thread for fetching data from several remote locations (using URL connections) and populating a BlockingQueue with objects to be analyzed and n threads that pick these objects and run the analysis. edit: see code below
Now this setup works like a charm on my Linux machine running OpenSUSE 11.3, but a colleague who is testing it on a very similar machine running Win7 is getting custom notifications of timeouts on the queue polling (see code below), lots of them actually. I have been trying to monitor the processor use on her machine, and it appears that the software does not get any more than 15% of the CPUs, while on my machine the processor usage hits the roof, just as I intended.
My question is, then, can this be a sign of "starvation" of the queue? Could it be so that the producer thread is not getting enough cpu time? If so how do I go about giving one particular thread in the pool higher priority?
UPDATE:
I have been trying to pinpoint the problem, with no joy... I did however gain some new insights.
Profiling the execution of the code with JVisualVM demonstrates a very peculiar behavior. The methods are called in short bursts of CPU-time with several seconds of no progress in between. This to me means that somehow the OS is hitting the brakes on the process.
Disabling the anti-virus and back-up daemons does not have any significant effect on the matter
Changing the priority of java.exe (the only instance) through task manager (advised here) does not change anything either. (That being said, I could not give "realtime" priority to java, and had to be content with "high" prio)
Profiling the network usage shows good flow of data in and out, so I am guessing that is not the bottleneck (while it is a considerable part of the execution time of the process, but that I know already and is pretty much the same percentage as what I get on my Linux machine).
Any ideas as to how the Win7 OS might be limiting the CPU time to my project? If it's not the OS, what could be the limiting factor? I would like to stress yet again that the machine is NOT running any other computation-intensive task at the same time and there is almost no load on the CPUs other than my software. This is driving me crazy...
EDIT: relevant code
public ConcurrencyService(Dataset d, QueryService qserv, Set<MyObject> s){
timeout = 3;
this.qs = qserv;
this.bq = qs.getQueue();
this.ds = d;
this.analyzedObjects = s;
this.drc = DebugRoutineContainer.getInstance();
this.started = false;
int nbrOfProcs = Runtime.getRuntime().availableProcessors();
poolSize = nbrOfProcs;
pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(poolSize);
drc.setScoreLogStream(new PrintStream(qs.getScoreLogFile()));
}
public void serve() throws InterruptedException {
try {
this.ds.initDataset();
this.started = true;
pool.execute(new QueryingAction(qs));
for(;;){
MyObject p = bq.poll(timeout, TimeUnit.MINUTES);
if(p != null){
if (p.getId().equals("0"))
break;
pool.submit(new AnalysisAction(ds, p, analyzedObjects, qs.getKnownAssocs()));
}else
drc.log("Timed out while waiting for an object...");
}
} catch (Exception ex) {
ex.printStackTrace();
String exit_msg = "Unexpected error in core analysis, terminating execution!";
}finally{
drc.log("--DEBUG: Termination criteria found, shutdown initiated..");
drc.getMemoryInfo(true); // dump meminfo to log
pool.shutdown();
int mins = 2;
int nCores = poolSize;
long totalTasks = pool.getTaskCount(),
compTasks = pool.getCompletedTaskCount(),
tasksRemaining = totalTasks - compTasks,
timeout = mins * tasksRemaining / nCores;
drc.log("--DEBUG: Shutdown commenced, thread pool will terminate once all objects are processed, " +
"or will timeout in : " + timeout + " minutes... \n" + compTasks + " of " + (totalTasks -1) +
" objects have been analyzed so far, " + "mean process time is: " +
drc.getMeanProcTimeAsString() + " milliseconds.");
pool.awaitTermination(timeout, TimeUnit.MINUTES);
}
}
The class QueryingAction is a simple Runnable that calls the data acquisition method in the designated QueryService object which then populates a BlockingQueue. The AnalysisAction class does all the number-crunching for a single instance of MyObject.
I suspect the producer thread is not getting/loading the source data fast enough. This might not be a lack of CPU but an IO related issue. (not sure why you have time outs on your BlockingQueue)
It might be worth having a thread which periodically logs things like the number of tasks added and the length of the queue (e.g. every 5-15 seconds)
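Something along these lines would do as a first attempt. It is only a sketch: QueueMonitor and sample are made-up names, and it stops after a fixed number of readings so that it terminates, whereas a real monitor would keep running in the background with a period of 5000-15000 ms.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class QueueMonitor {

    // Samples the queue length "samples" times, "periodMillis" apart,
    // printing each reading and returning all of them.
    static List<Integer> sample(BlockingQueue<?> queue, int samples, long periodMillis) {
        List<Integer> readings = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(samples);
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            if (done.getCount() == 0) {
                return; // enough readings taken, waiting to be shut down
            }
            int size = queue.size();
            readings.add(size);
            System.out.println("queue length: " + size);
            done.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.shutdownNow();
        }
        return readings;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        sample(queue, 3, 100);
    }
}
```

If the logged length stays near zero while the consumers sit idle, the producer is the bottleneck; if it keeps growing, the consumers are.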
So, if I correctly understand your problem, you have one thread to fetch data, and several threads to analyse the fetched data. Your problem is that the threads are not correctly synchronized to run together and take full advantage of the processor.
You have a typical producer-consumer problem with a single producer and several consumers.
I advise you to remake your code a bit to have, instead, several independent consumer threads that are always waiting for resources to be available and only then running. This way you guarantee the maximum processor use.
Consumer thread:
while (!terminate)
{
synchronized (Producer.getLockObject())
{
try
{
//sleep (no processing at all)
Producer.getLockObject().wait();
}
catch (Exceptions..)
}
MyObject p = Producer.getObjectFromQueue(); //this function should be synchronized
//Analyse fetched data, and submit it to somewhere...
}
Producer thread:
while (!terminate)
{
MyObject newData = fetchData(); //fetch data from remote location
addDataToQueue(newData); //this should also be synchronized
synchronized (getLockObject())
{
//wake up one thread to deal with the data
getLockObject().notify();
}
}
You see that this way, your threads are always performing useful work or sleeping.
This is just draft code to exemplify.
See more explanation here: http://www.javamex.com/tutorials/wait_notify_how_to.shtml
and here: http://www.java-samples.com/showtutorial.php?tutorialid=306
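For comparison, the same single-producer, several-consumers layout can be written with java.util.concurrent.BlockingQueue, which does the waiting and waking internally, so a consumer cannot miss a notification that arrives before it starts waiting. This is a sketch rather than a drop-in replacement: the STOP sentinel, the class name, and the use of plain integers in place of MyObject are all assumptions made for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class ProducerConsumerDemo {

    // Sentinel telling a consumer to stop (an assumption of this sketch).
    private static final Integer STOP = Integer.MIN_VALUE;

    // One producer feeds the numbers 1..items to "consumers" threads;
    // returns the sum the consumers computed together.
    static long run(int items, int consumers) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicLong sum = new AtomicLong();

        Thread[] workers = new Thread[consumers];
        for (int i = 0; i < consumers; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (true) {
                        Integer item = queue.take(); // sleeps until data arrives
                        if (item.equals(STOP)) {
                            break;
                        }
                        sum.addAndGet(item); // stand-in for the analysis
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }

        Thread producer = new Thread(() -> {
            for (int i = 1; i <= items; i++) {
                queue.add(i); // stand-in for fetching remote data
            }
            for (int i = 0; i < consumers; i++) {
                queue.add(STOP); // one stop marker per consumer
            }
        });
        producer.start();

        try {
            producer.join();
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println("sum computed by consumers: " + run(100, 4));
    }
}
```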
Priority won't help, since the problem is not an issue of deciding who gets precious resources -- resource usage isn't maxed. The only way the producer thread would not be getting enough CPU time is if it wasn't ready-to-run.
How many cores does the machine have? It's possible that the producer thread is running full speed and there still just isn't enough CPU to go around. It's also possible the producer is I/O bound.
You can try to separate the producer thread from the pool (i.e. create a distinct Thread and give the pool one thread fewer than its current capacity) and then set its priority to maximum via setPriority. See what happens, although priority rarely accounts for such a difference in performance.
When you say URL connection, do you mean local or remote? It could be that network speed is slowing your producer down
So after weeks of fiddling, wrestling in code and other types of suffering I think I had a breakthrough, "a moment of clarity" if you will...
I managed to show that the program can exhibit the same slow behavior on my Linux machine and can indeed run full throttle on the problematic Win-7 machine. The crux of the problem appears to be some sort of corruption of the system/cache files that are used to store the results of previous queries and, overall, speed up the analysis. You've got to love the irony: in this case they appeared to be the reason for EXTREMELY slow analysis. In retrospect, I should have known (a la Occam's razor)...
I am still not sure how the corruption occurs, but at least it's probably not related to the different OS. Using the system files from my machine increases the output on the Win7 host, but only up to about 40%. Profiling the process more has also revealed that, oddly enough, there is significantly more GC activity on Win7, which apparently took lots of CPU time from number crunching. Giving -Xmx2g takes care of the excessive garbage collection, the CPU usage for the process shoots up to 95-96%, and the threads run smoothly.
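For anyone chasing a similar symptom: the JVM exposes its own GC counters through the standard management beans, so a rough check does not need a profiler. This is a sketch I put together afterwards, not part of the original project; GcReport and its method names are invented.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcReport {

    // Total time in milliseconds the JVM reports having spent in GC so far.
    // Collectors that do not report a time return -1 and are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Maximum heap the JVM will try to use, in MiB (this is what -Xmx changes).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("GC time so far (ms): " + totalGcMillis());
    }
}
```

Logging these two values periodically on both machines would presumably have shown the difference in heap limit and GC time much earlier.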
Now that my original question is answered, I have to say that overall Java responsiveness is definitely better in the Linux environment: even without allocating more heap memory, I can easily multi-task while I am running an extensive analysis in the background. Things are not as smooth on Win-7, e.g. resizing the GUI is significantly slower once the analysis takes off at full speed.
Thanks for all the replies, I am sorry for the partially misleading problem description. I merely shared what I found out while debugging to the best of my abilities. Anyways, I believe the bounty goes to Peter Lawrey, since he early on pointed to an I/O issue and it was his suggestion about a logger thread which eventually led me to the answer.
I would think it was some OS specific issue because that is the core difference between the two units. More specifically, something is slowing down the data arriving through the remote connection.
Find some traffic analysis tool such as Wireshark and/or Networx and try to discover if there is anything throttling the Win PC. Perhaps it is going through a proxy that has some kind of rate cap configured.
Sorry not really an answer but did not fit inside comment and still it is worth the read I think:
well i am not JAVA friendly
but i recently had the same problem with C++ projects for machine control through USB.
On XP or W2K all goes perfectly for months of 24/7 operation on any 2 or more core machine
On W7 and strong enough machine all goes OK but sometimes (cca 1x per few hours) freezes for few seconds without obvious reason.
On W7 and relatively weak machine (2 core 1.66GHz T2300E notebook) the threads are freezing for some time and run again which under/overflows USB/WIN/App FIFOs and collapse communication ...
it appears that nothing is blocked but the W7 scheduler just does not give CPU to the right threads occasionally.
i thought that the USB driver (JUNGO) communication freezes but that is not true, I measured it and it is OK even during a freeze
the freeze was about 6-15 seconds cca once per minute.
after adding some safety sleeps to the thread loops the freeze has shortened to about 0.5 sec
but still there
even if App do not Under/Overflows FIFOs the windows USB driver side do (few times per minute for few ms)
Change of exe/threads priority and class do not affect performance on W7 (on XP,W2K work as it should)
As you can see it seems we have most likely the same problem. In my case:
is not I/O related (when i replace USB thread with simulation of device it behaves similar)
adding Sleep to time critical code helps a lot
error is present also in low count of threads [2 fast (17ms) + 1 slow (250ms) + App code = 4]
my CPU consumption on W7 slow machine is also not 100% but about 95% which is OK because I have sleeps everywhere
my Apps use about 40-100MB of memory but are CPU computation demanding ...
but not that much it could run safely on much slower machines
but because of USB driver connection and multiple device support it need at least 2 cores
my next step is to add some kind of execution time logging/analyze to see what is happening in more detail
and also little rewrite of send/receive threads to see if it helps
When i learn something new/useful will add it.
I've just made a program with Eclipse that takes a really long time to execute. It's taking even longer because it's loading my CPU to 25% only (I'm assuming that is because I'm using a quad-core and the program is only using one core). Is there any way to make the program use all 4 cores to max it out? Java is supposed to be natively multi-threaded, so I don't understand why it would only use 25%.
You still have to create and manage threads manually in your application. Java can't determine that two tasks can run asynchronously and automatically split the work into several threads.
This is a pretty vague question because we don't know much about what your program does. If your program is single-threaded, then no number of cores on your machine is going to make it run any faster. Java does have threading support, but it won't automatically parallelize your code for you. To speed it up, you'll need to identify parts of the computation that can be run in parallel with one another and add code as appropriate to split up and reconstitute the work. Without more info on what your program does, I can't help you out.
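Just to show the shape of "split up and reconstitute", here is a toy sketch that has nothing to do with your actual program: it sums the numbers 1..n by cutting the range into chunks, summing each chunk on its own pool thread, and adding the partial sums at the end. All names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSum {

    // Sums 1..n by splitting the range into at most "parts" chunks, summing
    // each chunk on a pool thread, and adding the partial results together.
    static long sum(long n, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Callable<Long>> chunks = new ArrayList<>();
            long chunkSize = (n + parts - 1) / parts; // round up
            for (long start = 1; start <= n; start += chunkSize) {
                final long from = start;
                final long to = Math.min(n, start + chunkSize - 1);
                chunks.add(() -> {
                    long partial = 0;
                    for (long i = from; i <= to; i++) {
                        partial += i;
                    }
                    return partial;
                });
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(chunks)) {
                total += f.get(); // waits for that chunk to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("sum = " + sum(1_000_000L, cores));
    }
}
```

This works because the chunks are completely independent; whether your own computation can be cut up like this is the part only you can answer.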
Another important detail to note is that Java threads are not the same as system threads. The JVM often has its own thread scheduler that tries to put Java threads onto actual system threads in a way that's fair, but there's no actual guarantee that it will do so.
Yes, Java is multi-threaded, but the multi-threading doesn't happen "by magic".
Have a look at either at the Thread class or at the Executor framework. Essentially you need to split your job into "subtasks" each of which can run on a single processor, then do something like this:
Executor ex = Executors.newFixedThreadPool(4);
while (thereAreMoreSubtasksToDo) {
ex.execute(new Runnable() {
public void run() {
... do subtask ...
}
});
}
Turning a serial routine/algorithm into a parallel one isn't necessarily trivial: you need to know in particular about a range of issues broadly termed "thread-safety". You may be interested in some material I've written about thread-safety in Java, and threading in general if you follow the links: the key thing to bear in mind is that if any data/objects are being shared among the different threads running, then you need to take special precautions. That said, for independent things that you just want to "run at the same time", then the above pattern will get you started.
Java is multi-threaded but if your application runs in only one thread, only one thread will be used. (Apart from the internal threads Java uses for finalization, garbage collection and so on.)
If you want your code to use multiple threads, you have to split it up manually, either by starting threads by yourself or using a third party thread pool. I'd suggest the latter option as it's safer but both can work equally well.
You've got a bit of learning ahead of you (actually, quite a bit of learning) - but it's learning you should do if you are going to be doing any serious programming.
Here's a starting point: http://download.oracle.com/javase/tutorial/essential/concurrency/
But you might want to look into a good book on Java multi-threading (I did this so long ago that any book I could recommend would be out of print). This sort of hard topic is well suited for learning from a text instead of online tutorials.
What is the difference between Go's multithreading approach and other approaches, such as pthread, boost::thread or Java Threads?
Quoted from Day 3 Tutorial <- read this for more information.
Goroutines are multiplexed as needed onto system threads. When a goroutine executes a blocking system call, no other goroutine is blocked.
We will do the same for CPU-bound goroutines at some point, but for now, if you want user-level parallelism you must set $GOMAXPROCS or call runtime.GOMAXPROCS(n).
A goroutine does not necessarily correspond to an OS thread. It can have smaller initial stack size and the stack will grow as needed.
Multiple goroutines may be multiplexed into a single thread when needed.
More importantly, the concept is as outlined above, that a goroutine is a sequential program that may block itself but does not block other goroutines.
Goroutines are implemented as pthreads in gccgo, so they can be identical to OS threads, too.
It's separating the concept of OS thread and our thinking of multithreading when programming.
IMO, what makes the multi-threading in Go appealing is the communication facilities: unlike pthread where one must build the communications infrastructure (mutex, queues etc.), in Go it is available by default in a convenient form.
In short, there is "low-friction" to using threads because of the good communication facilities (akin to Erlang if I can say so).
In the reference compilers (5g/6g/8g), the master scheduler (src/pkg/runtime/proc.c) creates N OS threads, where N is controlled by runtime.GOMAXPROCS(n) (default 1). Each scheduler thread pulls a new goroutine off the master list and starts running it. The goroutine(s) will continue to run until a syscall is made (e.g. printf) or an operation on a channel is made, at which point the scheduler will grab the next goroutine and run it from the point at which it left off (see gosched() calls in src/pkg/runtime/chan.c).
The scheduling, for all intents and purposes, is implemented with coroutines. The same functionality could be written in straight C using setjmp() and longjmp(), Go (and other languages that implement lightweight/green threads) are just automating the process for you.
The upside to lightweight threads is that, since it's all userspace, creating a "thread" is very cheap (allocating a small default stack) and can be very efficient due to the inherent structure of how the threads talk to each other. The downside is that they are not true threads, which means a single lightweight thread can block the entire program, even when it appears all the threads should be running concurrently.
As previous answers have stated, goroutines do not necessarily correspond to system threads. However, I found the following useful if you must have the performance increase of multi-threading right now:
The current implementation of the Go runtime will not parallelize this code by default. It dedicates only a single core to user-level processing. An arbitrary number of goroutines can be blocked in system calls, but by default only one can be executing user-level code at any time. It should be smarter and one day it will be smarter, but until it is if you want CPU parallelism you must tell the run-time how many goroutines you want executing code simultaneously. There are two related ways to do this. Either run your job with environment variable GOMAXPROCS set to the number of cores to use or import the runtime package and call runtime.GOMAXPROCS(NCPU). A helpful value might be runtime.NumCPU(), which reports the number of logical CPUs on the local machine. Again, this requirement is expected to be retired as the scheduling and run-time improve.
quote source
An example program that maxes out my i5 processor is this (uses all 4 cores at 100% in htop):
package main
import (
"fmt"
"time"
"runtime"
)
func main() {
runtime.GOMAXPROCS(4) // Set the maximum number of threads/processes
d := make(chan string)
go boring("boring!", d, 1)
go boring("boring!", d, 2)
go boring("boring!", d, 3)
go boring("boring!", d, 4)
for i := 0; i < 10; i++ {
time.Sleep(time.Second);
}
fmt.Println("You're boring; I'm leaving.")
}
func boring(msg string, c chan string, id int) {
for i := 0; ; i++ {
}
}
Now that doesn't actually 'do' anything, but see how short/easy/simple that is compared to writing multithreaded applications in other languages such as Java.
I have a Java program that runs many small simulations. It runs a genetic algorithm, where each fitness function is a simulation using parameters on each chromosome. Each one takes maybe 10 or so seconds if run by itself, and I want to run a pretty big population size (say 100?). I can't start the next round of simulations until the previous one has finished. I have access to a machine with a whack of processors in it and I'm wondering if I need to do anything to make the simulations run in parallel. I've never written anything explicitly for multicore processors before and I understand it's a daunting task.
So this is what I would like to know: To what extent and how well does the JVM parallel-ize? I have read that it creates low level threads, but how smart is it? How efficient is it? Would my program run faster if I made each simulation a thread? I know this is a huge topic, but could you point me towards some introductory literature concerning parallel processing and Java?
Thanks very much!
Update:
Ok, I've implemented an ExecutorService and made my small simulations implement Runnable and have run() methods. Instead of writing this:
Simulator sim = new Simulator(args);
sim.play();
return sim.getResults();
I write this in my constructor:
ExecutorService executor = Executors.newFixedThreadPool(32);
And then each time I want to add a new simulation to the pool, I run this:
RunnableSimulator rsim = new RunnableSimulator(args);
executor.execute(rsim);
return rsim.getResults();
The RunnableSimulator::run() method calls the Simulator::play() method, neither have arguments.
I think I am getting thread interference, because now the simulations error out. By error out I mean that variables hold values that they really shouldn't. No code from within the simulation was changed, and before the simulation ran perfectly over many many different arguments. The sim works like this: each turn it's given a game-piece and loops through all the locations on the game board. It checks to see if the location given is valid, and if so, commits the piece, and measures that board's goodness. Now, obviously invalid locations are being passed to the commit method, resulting in index out of bounds errors all over the place.
Each simulation is its own object right? Based on the code above? I can pass the exact same set of arguments to the RunnableSimulator and Simulator classes and the runnable version will throw exceptions. What do you think might cause this and what can I do to prevent it? Can I provide some code samples in a new question to help?
Java Concurrency Tutorial
If you're just spawning a bunch of stuff off to different threads, and it isn't going to be talking back and forth between different threads, it isn't too hard; just write each in a Runnable and pass them off to an ExecutorService.
You should skim the whole tutorial, but for this particular task, start here.
Basically, you do something like this:
ExecutorService executorService = Executors.newFixedThreadPool(n);
where n is the number of things you want running at once (usually the number of CPUs). Each of your tasks should be an object that implements Runnable, and you then execute it on your ExecutorService:
executorService.execute(new SimulationTask(parameters...));
Executors.newFixedThreadPool(n) will start up n threads, and execute will insert the tasks into a queue that feeds to those threads. When a task finishes, the thread it was running on is no longer busy, and the next task in the queue will start running on it. Execute won't block; it will just put the task into the queue and move on to the next one.
The thing to be careful of is that you really AREN'T sharing any mutable state between tasks. Your task classes shouldn't depend on anything mutable that will be shared among them (i.e. static data). There are ways to deal with shared mutable state (locking), but if you can avoid the problem entirely it will be a lot easier.
EDIT: Reading your edits to your question, it looks like you really want something a little different. Instead of implementing Runnable, implement Callable. Your call() method should be pretty much the same as your current run(), except it should return getResults();. Then, submit() it to your ExecutorService. You will get a Future in return, which you can use to test if the simulation is done, and, when it is, get your results.
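Roughly like this. I don't know what your Simulator looks like, so SimulationTask below is a made-up stand-in that just squares its parameter; the point is the submit/Future/get pattern, and that each task keeps all of its state in its own fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SimulationRunner {

    // Hypothetical stand-in for the real simulation: no shared state,
    // everything it needs comes in through the constructor.
    static class SimulationTask implements Callable<Integer> {
        private final int parameter;

        SimulationTask(int parameter) {
            this.parameter = parameter;
        }

        @Override
        public Integer call() {
            return parameter * parameter; // pretend this is the fitness score
        }
    }

    // Runs one generation: submit every task first, then collect the
    // results in submission order once each one has finished.
    static List<Integer> runGeneration(List<Integer> parameters, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int p : parameters) {
                futures.add(pool.submit(new SimulationTask(p)));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks until that simulation is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runGeneration(List.of(1, 2, 3, 4), 2));
    }
}
```

The important difference from calling getResults() right after execute() is that get() waits for the task to finish, so you never read results from a simulation that is still running. Submitting everything before the first get() is what lets the simulations overlap.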
You can also look at the new fork/join framework by Doug Lea. One of the best books on the subject is certainly Java Concurrency in Practice. I would strongly recommend you take a look at the fork/join model.
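To give a flavour of the model, here is a minimal sketch using the java.util.concurrent classes that ship with the JDK since Java 7. The array-sum task and the threshold value are arbitrary choices for the example, not something taken from the question.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSum extends RecursiveTask<Long> {

    private static final int THRESHOLD = 10_000; // arbitrary cut-off for this sketch

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: just do the work directly.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Otherwise split in two, run the halves in parallel, and combine.
        int mid = (from + to) >>> 1;
        ForkJoinSum left = new ForkJoinSum(data, from, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, to);
        left.fork(); // schedule the left half asynchronously
        return right.compute() + left.join();
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        java.util.Arrays.fill(data, 1L);
        System.out.println("sum = " + sum(data));
    }
}
```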
Java threads are just too heavyweight. We have implemented parallel branches in Ateji PX as very lightweight scheduled objects. As in Erlang, you can create tens of millions of parallel branches before you start noticing an overhead. But it's still Java, so you don't need to switch to a different language.
If you are doing full-out processing all the time in your threads, you won't benefit from having more threads than processors. If your threads occasionally wait on each other or on the system, then Java scales well up to thousands of threads.
I wrote an app that discovered a class B network (65,000) in a few minutes by pinging each node, and each ping had retries with an increasing delay. When I put each ping on a separate thread (this was before NIO, I could probably improve it now), I could run to about 4000 threads on Windows before things started getting flaky. On Linux the number was nearer 1000 (never figured out why).
No matter what language or toolkit you use, if your data interacts, you will have to pay some attention to those areas where it does. Java uses the synchronized keyword to prevent two threads from accessing a section at the same time. If you write your Java in a more functional manner (making all your members final) you can run without synchronization, but it can be--well, let's just say solving problems takes a different approach that way.
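A tiny sketch of the synchronized case, with invented names: several threads bump one shared counter, and because both methods are synchronized the increments cannot overlap, so none are lost.

```java
public class SynchronizedCounter {

    private long count = 0;

    // Only one thread at a time can be inside a synchronized method
    // of the same object, so count++ cannot interleave with itself.
    synchronized void increment() {
        count++;
    }

    synchronized long get() {
        return count;
    }

    // Increments one shared counter from several threads and returns the total.
    static long countWith(int threads, int perThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("total = " + countWith(4, 10_000));
    }
}
```

Without the synchronized keyword on increment(), the total can come out lower than expected, because count++ is a read, an add and a write that two threads can interleave.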
Java has other tools to manage units of independent work; look in the java.util.concurrent package for more information.
Java is pretty good at parallel processing, but there are two caveats:
Java threads are relatively heavyweight (compared with e.g. Erlang), so don't start creating them in the hundreds or thousands. Each thread gets its own stack memory (the default depends on the JVM and platform, often somewhere between 256KB and 1MB) and you could run out of memory, among other things.
If you run on a very powerful machine (especially with a lot of CPUs and a large amount of RAM), then the VM's default settings (especially concerning GC) may result in suboptimal performance and you may have to spend some time tuning them via command line options. Unfortunately, this is not a simple task and requires a lot of knowledge.