Dividing long application to Threads

I have a process that runs every night and does a lot of data analysis for a set of companies. I do this in a plain for loop that runs through the company list. Sometimes the process takes about 1 hour to complete. Sometimes, because of some errors, it crashes in between causing . I then have to manually restart it, and it processes the analysis for all the remaining companies.
Since each iteration of the for loop runs a separate company's data analysis, would multithreading inside the for loop be a good solution?
Thanks for any suggestions.

ThreadPoolExecutor is your friend!

Since each iteration of the for loop runs a separate company's data analysis, would multithreading inside the for loop be a good solution?
Maybe yes, maybe no.
Let's look at the facts:
Sometimes the process takes about 1 hour to complete
By itself, this should not be a problem. One hour is not a long time, especially since you probably have a ~12 hour window to do it.
And multi-threading won't necessarily significantly reduce the elapsed time. It depends on the nature of tasks, the processing algorithms, and the nature of your hardware and system configurations.
Sometimes cause of some errors it crashes in between causing (what?).
Multi-threading won't fix that. If you do each company run in a separate thread, then the same error would still cause that thread to crash. And depending on the cause of the error, and the consequences of the error, the crash for one company could crash the others too ... or cause them to work incorrectly in other ways.
I have to manually restart it and it processes all remaining company's analysis.
Threading won't entirely fix that either.
You'll still have to fix the problem(s) that caused the original crash(es) and then manually restart. And you still have the problem of distinguishing and recording the companies that need to be rerun so that you don't repeat the others unnecessarily.
In summary, multi-threading could make the application go faster (it probably will IMO), but I don't really think it is going to solve your root problem ... which appears to be either bad data or bugs causing processing to fail.
Finally, on a technical level, it is probably a bad idea to simply fire off a thread for each company. If you try to do the work in parallel, the threads will be competing for local resources and resources on your back-end database. It is probably better to use something like ThreadPoolExecutor with a limited pool size.
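To make the last two points concrete, here is a minimal sketch of a bounded pool that also records which companies failed, so that only those need a rerun. It is an illustration under assumptions, not the asker's code: the `Analyzer` interface, the `runAll` method and the use of company names as plain strings are all hypothetical, and it uses `Executors.newFixedThreadPool`, which is backed by a `ThreadPoolExecutor`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class NightlyAnalysis {

    /** Hypothetical stand-in for the real per-company analysis. */
    interface Analyzer {
        void analyze(String company) throws Exception;
    }

    /**
     * Runs the analysis for every company on a fixed-size pool and returns
     * the companies whose analysis did not complete, in submission order.
     */
    static List<String> runAll(List<String> companies, Analyzer analyzer, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Map<String, Future<Void>> pending = new LinkedHashMap<String, Future<Void>>();
        List<String> failed = new ArrayList<String>();
        try {
            for (String company : companies) {
                Callable<Void> task = () -> {
                    analyzer.analyze(company);
                    return null;
                };
                pending.put(company, pool.submit(task));
            }
            for (Map.Entry<String, Future<Void>> entry : pending.entrySet()) {
                try {
                    entry.getValue().get(); // waits for this company to finish
                } catch (ExecutionException e) {
                    // The analysis threw; log e.getCause() here. Other companies keep running.
                    failed.add(entry.getKey());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    failed.add(entry.getKey());
                }
            }
        } finally {
            pool.shutdown();
        }
        return failed;
    }
}
```

The pool size caps how many analyses compete for the database at once, and the returned list is what you would persist so that a restart only touches the failed companies. Whether this is any faster than the plain loop still depends on where the hour is actually spent.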

Why don't you add a wrapper for error handling: log the error and proceed. That way you don't have to restart in case of errors.
for (Company company : companies) {   // your company list
    try {
        // your tasks for this company
    } catch (Exception e) {
        // log the error and proceed with the next company
    }
}
Are your tasks independent for your company list?
If so, you can create a new thread to process each task.
If not, you can process them sequentially in the required order.

Related

Multithreaded legacy Java application's threads taking turns in order

TL;DR - A user has an error (ORA-01438: value larger than specified precision allows for this column). I can't recreate it locally because when my machine runs the multithreaded app, only one of ten threads runs at a time, in sequence. Furthermore, running it often results in my machine running out of heap even with 8GB allocated to the heap, and then I happen to hit a NullPointerException instead of the user's issue.
I'm attempting to debug a multithreaded legacy Java app (JDK 1.6) written years ago by people that are no longer around. It is attempting to insert some data into an Oracle DB. The app usually runs on a Weblogic 11G server and takes about 5 minutes to finish running the calculations. However, debugging locally, the threads don't work concurrently, they're taking turns on my local machine. This makes the running time go from the aforementioned 5 minutes to ~1 hour and still manages to run out of heap (I gave it 8GB) or throw a NullPointerException if I'm lucky, but that isn't the business user's error. I've thought about cutting it down to use only one thread since it's taking turns anyway, but after touching this for a week, the business impact is becoming real and I can't just keep hitting it with a hammer.
This may be a long shot given I haven't provided any of the code, but does anyone have experience with a similar issue? Specifically, why are the threads taking turns?
EDIT: the user's error is a constraint violation, so I think it's modifying the inputted data and doing something like adding extra precision.
The problem: The application's 10 threads are working in sequence rather than concurrently, and the code potentially contains a memory leak, so the app crashes without ever reaching the code that throws the constraint violation exception the business user is encountering.
Edit 2: I suspect the threads trading off, rather than running concurrently, could be causing them not to run garbage collection on my local machine perhaps? Though, it still doesn't explain the issue of me receiving a different error than the business user if I'm lucky enough not to run out of heap.

You may well be correct in your instincts which tell you that the "threads" are working against you and that your predecessor simply left you with an unworkable design which he could never manage to fix.
"The eventual recipient," in all cases, "is the [Oracle ...] database." No matter what the application does in presenting requests to it, the only thing that matters is the requests that it receives. Obviously the clients are colliding with themselves, and it is therefore probable that there's no reason for having multiple threads at all.

How do I begin to optimise my program [closed]

I have a web server program written in Java and my boss wants it to run faster.
I've always been happy if it ran without errors, so efficiency is new to me.
I tried a profiler, but it crashed my computer and turned out to be a dead open-source project.
I have no idea what I am doing apart from reading a few questions on here. I see that refactoring code is the best option, but I'm not sure how to go about that, and I understand that I need a profiler to see which code to refactor.
So does anyone know of a free profiler that I can use? I'm using Java and Eclipse. If possible, some instructions or a link to easy instructions would be great.
But what I really want, if anyone can give it, is a basic introduction to the subject so I can understand enough to go and do in-depth research to get the best results.
I am a complete beginner when it comes to optimising code, and the subject seems very complex from what I have seen so far. Any help with how to get started would be greatly appreciated.
I'm new to Java as well, so saying things like "check garbage collection" would mean nothing to me; I'd need a more detailed explanation.
EDIT: The program uses Tomcat for the networking. It connects to an SQL database. The main function is a polling loop which checks all attached devices on the network, reads events from them, writes each event to the database, and then performs the event functions.
I am trying to improve the polling loop. The program is heavily multithreaded and uses a lot of interfaces and proxies, so it is hard to see where the code goes the farther you get from the polling loop.
I hope this information helps you offer solutions. Also, I did not build it; I inherited the code.

First of all detect the bottlenecks. There is no point in optimizing a method from 500ms to 400ms when there is a method running for 5 seconds, when it should run for 100ms.
You can try using VisualVM as a profiler; it is bundled with the JDK.

If you want a free profiler, use VisualVM, which comes with Java. It is likely to be enough.
You should ask your boss exactly what he or she would like to go faster. There is no point optimising random pieces of code he or she might not care about. (It's easily done.)
You can also log key points in your task/request to determine what it spends the most time doing.

EDIT: The program uses Tomcat for the networking. It connects to an SQL database. The main function is a polling loop which checks all attached devices on the network, reads events from them, writes each event to the database, and then performs the event functions.
I am trying to improve the polling loop. The program is heavily multithreaded and uses a lot of interfaces and proxies, so it is hard to see where the code goes the farther you get from the polling loop.
This sounds like you have a heavily I/O bound application. There really isn't much that you can do about that because I/O bound applications aren't inefficiently using the CPU--they're stuck waiting for I/O operations on other devices to complete.
FWIW, this scenario is actually why a lot of big companies are contemplating moving toward cheap, ARM-based solutions. They're wasting a lot of power and resources on powerful x86 CPUs that get underutilized while their code sits there waiting for a remote MySQL or Oracle server to finish doing its thing. With such an application, why throw more CPU than you need?

If you're new to Java, then optimization sounds like a bad idea. It's very easy to get wrong. It's not trivial to rewrite code and keep all the outputs the same while changing the inner workings.
Possibly have a look at your stored procedures and replace any IN statements with INNER JOIN. That's a fairly low-risk, high-reward way of speeding things up.

Start by identifying the time taken by various steps in your application (use logging to identify). Notice if there is anything unusual.
Step into each of these steps to see if there are any bottlenecks. Identify if something can be cached to save a db call. Identify if there is scope of parallelism by breaking down your tasks into independent units.
Hope you have some unit/ integration tests to ensure you don't accidentally break anything.

1. Measure (with a profiler - as others suggested, VisualVM is good) and locate the spots where your program spends most of its time.
2. Analyze the hot spots and try to improve their performance.
3. Measure again to verify that your changes had the expected effect.
4. If needed, repeat from step 1.

Start very simple.
Make a list of what's slow from a user perspective.
Try to do high-level profiling yourself. Maybe an interceptor that prints the run time for your actions.
Then profile only those actions with Start time = System.currentTime...
This easy way could be a starting point into more advanced profiling, and if you're lucky it may fix your problems.
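As a rough illustration of this kind of do-it-yourself timing, here is a minimal sketch. The `StepTimer` class and its `timeMillis` method are made up for the example; it uses `System.nanoTime`, which is meant for measuring elapsed time, and prints to standard output where a real application would use its logging framework.

```java
import java.util.concurrent.TimeUnit;

class StepTimer {

    /** Runs the step, prints how long it took, and returns the elapsed milliseconds. */
    static long timeMillis(String label, Runnable step) {
        long start = System.nanoTime();
        step.run();
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(label + " took " + elapsedMillis + " ms");
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // Stand-in for one action of the application, e.g. one pass of a polling loop.
        timeMillis("sum loop", () -> {
            long sum = 0;
            for (int i = 0; i < 50_000_000; i++) {
                sum += i;
            }
            System.out.println("sum = " + sum);
        });
    }
}
```

Wrapping only the few actions that users complain about keeps the output readable, and the numbers tell you which one deserves a real profiler.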

Before you start optimizing, you have to understand your workload, and you have to be able to recreate that workload. One easy way to do that is to log all requests, in production, with enough detail that you can recreate the requests in a development environment.
At the same time that you log your load, you can also log the performance of those requests: the time from the start of the request to the end. One way to do that (and, incidentally, to capture the data needed to log the request) is to add a servlet filter into your stack.
Then you can start to think about optimization.
1. Establish performance goals. Simply saying "make it faster" is pointless. Instead, you need to establish goals such as "all pages should respond within 1.5 seconds, as long as there are less than 100 concurrent users."
2. Identify the requests that fail your performance goals. Focus on the biggest failure first.
3. Identify why the request takes so long.
To do #3, you need to be able to recreate load in a development environment. Then you can either use a profiler, or simply add trace-level logging into your application to find out how long each step of the process takes.
There is also a whole field of holistic optimization, of which garbage collection tuning is probably the most important. But again, you need to establish and replicate your workload, otherwise you'll be flailing.

When starting to optimize an application, the main risk is trying to optimize every step, which often does not improve the program's efficiency as expected and results in unmaintainable code.
It is likely that 80% of the execution time of your program is caused by a single step, which is itself only 20% of the code base.
The first thing to do is to identify this bottleneck. For example, you can log timestamps (using System.nanoTime and/or System.currentTimeMillis and your favorite logging framework) to do this.
Once the step has been identified, try to write a test class which runs this step, and run it with a profiler. I have good experience with both HPROF (http://java.sun.com/developer/technicalArticles/Programming/HPROF.html) although it might require some time to get familiar with, and Eclipse Test and Performance Tools Platform (http://www.eclipse.org/tptp/). If you have never used a profiler, I recommend you start with Eclipse TPTP.
The execution profile will help you find out in what methods your program spends time. Once you know them, look at the source code, and try to understand why it is slow. It might be because (this list is not exhaustive) :
unnecessary costly operations are performed,
a sub-optimal algorithm is used,
the algorithm generates lots of objects, thus giving a lot of work to the garbage collector (especially true for objects which have a medium to long life expectancy).
If there is no visible defect in the code, then you might consider :
making the algorithm more parallel in order to leverage all your CPUs
buying faster hardware.
Regarding JVM options, the two most important ones for performance are :
-server, in order to use the server VM (enabled by default depending on the hardware) which provides better performance at the price of a slower startup (http://stackoverflow.com/questions/198577/real-differences-between-java-server-and-java-client),
-Xms and -Xmx which define the heap size available on startup, and the maximum amount of memory that the JVM can use. If the JVM is not given enough memory, garbage collection will use a lot of your CPU resources, slowing down your program, however if the JVM already has enough memory, increasing the heap size will not improve performance, and might even cause longer GC pauses. (http://stackoverflow.com/questions/1043817/speed-tradeoff-of-javas-xms-and-xmx-options)
Other parameters usually have lower impact, you can consult them at http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html.
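Before changing -Xms or -Xmx it helps to confirm what the running JVM actually got. The following is only a small sketch using `java.lang.Runtime`; the `HeapSettings` class name is made up, and the value reported by `maxMemory()` is approximately, not necessarily exactly, the -Xmx setting.

```java
class HeapSettings {

    /** Describes the heap limits of the current JVM in megabytes. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return "max=" + (rt.maxMemory() / mb) + "MB"            // upper bound, roughly -Xmx
                + ", committed=" + (rt.totalMemory() / mb) + "MB" // heap currently reserved by the JVM
                + ", free=" + (rt.freeMemory() / mb) + "MB";      // unused part of the committed heap
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with and once without an explicit option such as -Xmx512m shows whether the option reaches the JVM that runs your program, which is not always the case when an IDE or an application server launches it.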

Java Memory Usage / Thread Pool Performance Problem

These things obviously require close inspection and availability of code to thoroughly analyze and give good suggestions. Nevertheless, that is not always possible and I hope it may be possible to provide me with good tips based on the information I provide below.
I have a server application that uses a listener thread to listen for incoming data. The incoming data is interpreted into application specific messages and these messages then give rise to events.
Up to that point I don't really have any control over how things are done.
Because this is a legacy application, these events were previously taken care of by that same listener thread (largely a single-threaded application). The events are sent to a blackbox and out comes a result that should be written to disk.
To improve throughput, I wanted to employ a threadpool to take care of the events. The idea being that the listener thread could just spawn new tasks every time an event is created and the threads would take care of the blackbox invocation. Finally, I have a background thread performing the writing to disk.
With just the previous setup and the background writer, everything works OK and the throughput is ~1.6 times more than previously.
When I add the thread pool, however, performance degrades. At the start, everything seems to run smoothly, but then after a while everything is very slow and finally I get OutOfMemoryErrors. The weird thing is that when I print the number of active threads each time a task is added to the pool (along with info on how many tasks are queued and so on) it looks as if the thread pool has no problem keeping up with the producer (the listener thread).
Using top -H to check for CPU usage, it's quite evenly spread out at the outset, but at the end the worker threads are barely ever active and only the listener thread is active. Yet it doesn't seem to be submitting more tasks...
Can anyone hypothesize a reason for these symptoms? Do you think it's more likely that there's something in the legacy code (that I have no control over) that just goes bad when multiple threads are added? The out of memory issue should be because some queue somewhere grows too large but since the threadpool almost never contains queued tasks it can't be that.
Any ideas are welcome. Especially ideas of how to more efficiently diagnose a situation like this. How can I get a better profile on what my threads are doing etc.
Thanks.

Slowing down then out of memory implies a memory leak.
So I would start by using some Java memory analyzer tools to identify if there is a leak and what is being leaked. Sometimes you get lucky and the leaked object is well-known and it becomes pretty clear who is hanging on to things that they should not.

Thank you for the answers. I read up on Java VisualVM and used that as a tool. The results and conclusions are detailed below. Hopefully the pictures will work long enough.
I first ran the program and created some heap dumps thinking I could just analyze the dumps and see what was taking up all the memory. This would probably have worked except the dump file got so large and my workstation was of limited use in trying to access it. After waiting two hours for one operation, I realized I couldn't do this.
So my next option was something I, stupidly enough, hadn't thought about. I could just reduce the number of messages sent to the application, and the trend of increasing memory usage should still be there. Also, the dump file will be smaller and faster to analyze.
It turns out that when sending messages at a slower rate, no out of memory issue occurred! A graph of the memory usage can be seen below.
The peaks are results of cumulative memory allocations and the troughs that follow are after the garbage collector has run. Although the amount of memory usage certainly is quite alarming and there are probably issues there, no long term trend of memory leakage can be observed.
I started to incrementally increase the rate of messages sent per second to see where the application hits the wall. The image below shows a very different scenario than the previous one...
Because this happens when the rate of messages sent is increased, my guess is that my freeing up the listener thread results in it being able to accept a lot of messages very quickly and this causes more and more allocations. The garbage collector doesn't run and the memory usage hits a wall.
There's of course more to this issue but given what I have found out today I have a fairly good idea of where to go from here. Of course, any additional suggestions/comments are welcome.
This question should probably be recategorized as dealing with memory usage rather than threadpools... The threadpool wasn't the problem at all.
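If the guess above is right and the freed-up listener simply accepts messages faster than the rest of the pipeline can absorb them, one common way to push back on the producer is a bounded queue with a caller-runs rejection policy. The sketch below shows the idea under assumptions: the `BoundedEventPool` class, the pool and queue sizes and the dummy events are invented for illustration, and in the real application the bound would have to sit on whichever stage is actually slow, which may be the writer rather than the pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedEventPool {

    /** Fixed-size pool whose queue holds at most queueCapacity waiting tasks. */
    static ThreadPoolExecutor create(int workers, int queueCapacity) {
        return new ThreadPoolExecutor(
                workers, workers,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                // When the queue is full, the submitting thread runs the task itself,
                // which slows the producer down instead of letting work pile up on the heap.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Submits eventCount dummy events and returns how many were processed. */
    static int demo(int eventCount) {
        ThreadPoolExecutor pool = create(4, 16);
        AtomicInteger processed = new AtomicInteger();
        for (int i = 0; i < eventCount; i++) {
            pool.execute(() -> processed.incrementAndGet()); // stand-in for the blackbox call
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }
}
```

With this setup the number of events waiting in memory is capped by the queue capacity, at the price of the listener occasionally doing blackbox work itself.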

I agree with @djna.
The thread pool from the Java concurrency package works. It does not create threads if it does not need them. You see that the number of threads is as expected. This means that something in your legacy code is probably not ready for multithreading. For example, some code fragment may not be synchronized. As a result, some element is not removed from a collection, or some additional elements are stored in a collection, so the memory usage grows.
BTW I did not understand exactly which part of the application uses threadpool now. Did you have one thread that processes events and now you have several threads that do this? Have you probably changed the inter-thread communication mechanism? Added queues? This may be yet another direction of your investigation.
Good luck!

As mentioned by djna, it's likely some type of memory leak. My guess would be that you're keeping a reference to the request around somewhere:
In the dispatcher thread that's queuing the requests
In the threads that deal with the requests
In the black box that's handling the requests
In the writer thread that writes to disk.
Since you said everything works fine before you add the thread pool into the mix, my guess would be that the threads in the pool are keeping a reference to the request somewhere. The idea being that, without the threadpool, you aren't reusing threads so the information goes away.
As recommended by djna, you can use a Java memory analyzer to help figure out where the data is stacking up.

Can't see my own application methods in Java VisualVM

I'm trying to profile my java app, just to find out the methods in which most time is being spent. Given the poor reactions here to TPTP, I thought I'd give Java VisualVM a go.
It all seemed rather simple to use - except that I can't seem to get anything consistent or useful out of it.
I can't seem to see anything relating to MY OWN code - all I get is a whole bunch of calls to things like java.* methods.
I've tried restricting instrumentation to only my own packages, which seems to cut down the number of methods instrumented, but still I don't ever seem to see my own.
Each time I run, I get varying numbers of methods instrumented, ranging from 10's to 1000's.
I've tried putting in a sleep at the start of my app, to make sure I get VisualVM up and running before my app starts to do anything interesting, to make sure it's profiling when the interesting stuff is running.
Is there something I have to do to ensure my classes get instrumented ?
Are there timing issues ? ..like, have to wait for classes to be loaded etc ?
I've also tried running the guts of the code twice, to make sure all the code does get exercised...
I'm just running an app, with a main, from Eclipse. I've tried using the Eclipse integration so that VisualVM starts up when I start the app - results are the same.
I've also tried exporting the app as a runnable app, and running it standalone from the command line, rather than through Eclipse - same result.
My app is not a long running web app etc - just a main that calls some other of my own classes to do some processing, then quits.
I'd be grateful for any advice about what I might be doing wrong ! :)
Thanks !

I too am struggling with VisualVM, which is a shame because its user interface is fantastic while its profiling output seems horrific. You can see my question here.
Java VisualVM giving bizarre results for CPU profiling - Has anyone else run into this?
I can tell you a couple of odd things that I have learned about VisualVM and the way it seems to do its profiling.
VisualVM appears to be counting the total time spent inside a method (wall-clock time). I have a thread in my application which starts a number of other threads and then immediately blocks waiting for a message on a queue. VisualVM will not register this method in the profiler until one of the other threads sends the message the first thread was waiting for (when the application terminates). Suddenly the blocking method call dominates the profiling output and is recorded as taking up more than 80% of the application time.
Other profilers, such as JProfiler and the one used by Azul, do not count a blocked thread as taking up time. This means that, in VisualVM, blocking methods which probably aren't interesting (situation dependent) for performance profiling are obscuring your view of the code that is eating your CPU time.
When I am running my profiling I end up with
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run()
obscuring my profiling right up until that message comes back to the waiting thread and then the top spot is shared between these two totally irrelevant methods, as well as various other uninteresting methods which don't appear on other profilers.
Secondly and I think quite importantly the method filtering mechanism doesn't work as I would have expected. This means that I can't filter out the I am trying to track down what the story is with this right now.
Not a really helpful answer. The solution as I see it right now is to pay for JProfiler - VisualVM just doesn't seem trustworthy for this task.

You could take a look at AppDynamics Lite. It has some nice features, such as business transaction discovery, which can sample all calls made to a specific method in your code.
The Lite version has a lot of limitations, such as a maximum of 10 minutes of sampling and a maximum of 30 discovered business transactions.
It would be nice to have a free tool that does the same.

I assume this isn't just an academic question - you would like to see if you could make the app run faster. I assume you also wouldn't mind a little "out of the box" thinking. There are many popular ideas about performance that are actually pretty fuzzy.
For example, you say you're looking for "methods in which most time is being spent". If by that you mean "self time" (program counter actually in the method) there is probably very little, unless you've got some intense loops. Methods generally spend time by calling other methods, sometimes doing I/O.
Another fuzzy idea is that measuring method time or counting the number of calls can tell you very much about where bottlenecks are. Bottlenecks are specific lines of code, not methods, so even if you know approximately where to look, you're still playing detective.
So those are a few of the fuzzy ideas. Here is a bunch more. Let me suggest how one should think about it, and how that leads to results.
When you eventually fix something, it will reduce execution time by some percent, like (pick a number) 30%, right? (Otherwise you didn't fix anything.) OK, during that 30% it was doing something, something that it didn't need to do because later you got rid of it. So, you don't need to measure. You do need to find out what it is doing in that time, so you know what to get rid of.
A very simple way is to "pause" it 10 (or some number of) times at random. Understand what it is doing and why, by looking at the call stack and possibly some of the data. On about 3 of those times you will see it doing something you could get rid of.
You will know approximately how much it will save by seeing what percent of samples is showing it. Approximate is good enough. You can easily see exactly how much time is saved by stopwatching it before and after.
Then, don't stop. You've made the app faster. Do it again, and make it faster yet. Sooner or later you get to a point where you can't make it any faster, but it's probably in more than one step.
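The "pause it at random and look at the call stack" step needs no special tooling. From outside the process, the JDK's jstack tool prints the stacks of a running JVM; from inside, a sketch like the following does much the same using `Thread.getAllStackTraces()`. The `StackSampler` class is made up for illustration and is not part of the answer above.

```java
import java.util.Map;

class StackSampler {

    /** Returns a text snapshot of what every live thread is doing right now. */
    static String snapshot() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append("\" ")
               .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Take one sample; in practice you would take about ten at random moments
        // while the slow operation is running and read each one by hand.
        System.out.println(snapshot());
    }
}
```

If the same line of your own code shows up in, say, three of ten snapshots, that line is responsible for roughly 30% of the time, which is the estimate described above.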

Java Random Slowdowns on Mac OS cont'd

I asked this question a few weeks ago, but I'm still having the problem and I have some new hints. The original question is here:
Java Random Slowdowns on Mac OS
Basically, I have a java application that splits a job into independent pieces and runs them in separate threads. The threads have no synchronization or shared memory items. The only resources they do share are data files on the hard disk, with each thread having an open file channel.
Most of the time it runs very fast, but occasionally it will run very slow for no apparent reason. If I attach a CPU profiler to it, then it will start running quickly again. If I take a CPU snapshot, it says it's spending most of its time in "self time" in a function that doesn't do anything except check a few (unshared unsynchronized) booleans. I don't know how this could be accurate because 1, it makes no sense, and 2, attaching the profiler seems to knock the threads out of whatever mode they're in and fix the problem. Also, regardless of whether it runs fast or slow, it always finishes and gives the same output, and it never dips in total CPU usage (in this case ~1500%), implying that the threads aren't getting blocked.
I have tried different garbage collectors, different sizings of the parts of the memory space, writing data output to non-RAID drives, and putting all data output in threads separate from the main worker threads.
Does anyone have any idea what kind of problem this could be? Could it be the operating system (OS X 10.6.2)? I have not been able to duplicate it on a Windows machine, but I don't have one with a similar hardware configuration.

It's probably a bit late to reply, but I could observe similar slowdowns using Random in Threads, related to a volatile variable used within java.util.Random - see How can assigning a variable result in a serious performance drop while the execution order is (nearly) untouched? for details. If the answer I got is correct (and it sounds pretty reasonable to me), the slowdown might be related to the in-memory-addresses of the volatile variables used within Random (Have a look at the answer of user 'irreputable' to my question, which explains the problem much better than I do here).
In case you're creating the Random-instances within the run-method of your Threads, you could simply try to turn them into object-variables and initialize them within the constructor of your Thread: This would most likely ensure that the volatile fields of your Random instances will end up in 'different areas' in RAM, which do not have to get synchronized between the processor cores.
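A minimal sketch of that suggestion, assuming each worker is a Thread subclass that needs random numbers: the Random becomes an instance field created once in the constructor instead of inside run(). The `RandomWorker` class, its seed parameter and the summing loop are invented for the example, and whether this removes the slowdown depends on the cause, which is only conjectured here.

```java
import java.util.Random;

class RandomWorker extends Thread {

    private final Random random; // one instance per worker, created once in the constructor
    private final int draws;
    private long sum;

    RandomWorker(long seed, int draws) {
        this.random = new Random(seed);
        this.draws = draws;
    }

    @Override
    public void run() {
        // The hot loop only touches this worker's own Random instance.
        for (int i = 0; i < draws; i++) {
            sum += random.nextInt(100);
        }
    }

    /** Call only after the worker has finished, e.g. after join(). */
    long sum() {
        return sum;
    }
}
```

Each worker is then started with start() and collected with join() as usual; no Random instance is shared between threads.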

How do you know it's running slow? How do you know that it runs quicker when CPU profiler is active? If you do the entire run under the profiler does it ever run slow? If you restrict the number of threads to one does it ever run slow?

Actually this is an interesting problem; I'm curious to know what the cause is.
First, in your previous question, you say you split the job between "multiple" processors. Are they physically multiple, as in multiple machines, or a multi-core CPU?
Second, I'm not sure if Snow Leopard has something to do with it, but we know that SL introduced a few new features for multi-processor machines. So there might be some problem with the VM on the new OS. Try another Java version; I know SL uses Java 6 by default. Try Java 5.
Third, did you try making the thread pool a little smaller? You are talking about 100 threads running at the same time. Try 20 or 40, for example, and see if it makes a difference.
Finally, I would be interested in seeing how you implemented the multi-threading solution. Small parts of the code would be good.
