I've been developing a Reporting Engine (RE is to generate PDF-reports) in C++ on Linux. If a PDF-report being generated must contain some charts, I need to build them while building the report. ChartBuilder is written in Java (with JFreeChart of Java-TeeChart - it does not matter anyway). Well, while RE is building a report, it invokes some ChartBuilder-API functions via JNI to build a chart (or several charts) step by step (ChartBuilder is packed into .jar-file). The problem is that it takes a lot of time to build the first chart (that is, to execute every ChartBuilder-API function for the first time during the process lifetime)! More specifically, it takes about 1.5 seconds to build the first chart. If there are several charts to be created, the rest of charts are built during about (~0.05, ~0.1) seconds. That is 30 times faster than the first one! It's worth to note, that this first chart is the same with the rest of them (except for data). The problem seems to be fundamental for Java (and I'm not very expirienced in this platform).
Below is the picture that illustrates described problem:
I wonder if there is a way to hasten the first execution. It would be great to understand how to avoid the overhead on the first execution at all because now it hampers the whole performance of RE.
In addition I'd like to describe the way it works: Somebody invokes C++RE::CreateReport with all needed parameters. This function, if it's needed, creates a JVM and makes requests to it via JNI. When a report is created, the JVM is destroyed.
Thanks in advance!
Just-in-time compilation. Keep your JVM alive as a service to avoid paying JIT compilation cost multiple times.
I this it is likely a combination of things as people have pointed out in the comments and other answer - JVM startup, class loader, the fact that Java 'interprets' your code when it is running it etc.
Most fall into the category of 'first time start up' overhead - hence the higher performance in subsequent runs.
I would personally be inclined to agree with Thomas (in the comments to your question) that the highest overhead is possibly the class loader.
There are tools you can use to profile the Java JVM to get a feel for what is taking the most time within the JVM itself - such as:
visualvm (http://visualvm.java.net)
JVM monitor (http://jvmmonitor.org)
You have to be careful using these tools to interpret the results with some thought - you may want to measure first runs and subsequent runs separately, and you also may want to add your own system timings into your C++ code that wraps the JNI calls to get a better picture of the end to end timings. With performance monitoring, multiple test runs are very important to allow for slow and fast individual runs for one reason or another (e.g. other load on the computer - even on a non shared laptop).
As LeffeBrune mentions if you can have the chart builder running as a service already, it will likely speed up the first run, although you will probably need to experiment to see how much difference it makes if it has not actually been running on a processor for a while, for example.
Related
What I want to do is generate a call tree with CPU timing information for a Java application as it goes through a scripted task. The idea is to see how much time is spent in each part of the code, and how this changes when I change the code or the task, but to do so in a consistently repeatable way.
In Java VisualVM I can do this interactively by clicking to start and stop profiling, but I would like to automate the process so I can get more consistent results (and not get so bored). Can VisualVM do this, or is there another profiler that can?
If I were a profiler vendor I would have to be concerned about providing people what they think they want, even if what they think they want does not solve the problem they have.
The thing is, only some problems can be found by knowing how long routines typically take, and if you ignore the ones you don't find that way, they will become the dominant part of how much time your program takes.
An example of what I mean is this recent example:
A program spends 50% of its wall-clock time reading .dll files to look up string resources to get the names of files so that the strings can be displayed on a splash screen so the user can see that something is happening during application startup. That means, if there were some other way to provide eye-candy to the user, the app could start up twice as fast.
During this process, the call stack is typically 15-20 functions deep, so it's really hard to tell what's going on just by having timing numbers for the functions.
What makes the problem difficult is that it is semantic. No particular routine is "hot" in a way that it could be speeded up.
The only "hot" thing is the general description, overall, of what the program is doing, and no tool can isolate it for you.
Only you can recognize it.
However, if you simply interrupted the program and examined the call stack during startup, the probability is 50% that you would see the entire explanation for the time being spent.
If you do it several times, it's the basis of the random pausing technique that some programmers rely on because it will find every problem profilers can find, and more, and others look down on because it isn't a tool.
And do it interactively, either that or extract a small number of stack samples by using something analogous to pstack.
I'm developing an action-platformer game using Java and just recently finished coding the AI for an enemy unit. I started to notice that at certain points in the game, it would slowdown and lag before going back to normal. See an example here on this video. The lagging part is somewhere at the midpoint of the video and is annotated, so you won't miss it:
LoGaP 06.14.2013 on YouTube
I would guess that there's some code that it tries executing that becomes a bottleneck. Is this typically the symptom of a memory issue or CPU issue?
More importantly, what is the best way in this case to identify what the problematic code is so that I could analyze how to optimize it? The only optimization tool I've only ever used for Java is jvisualvm and I only used it for a while. Will that do the trick in this scenario?
It is a common problem in profiling that you have a method that typically executes in acceptable time, but every once in a while becomes a bottleneck.
Profiling the entire run will likely not help you since the one slow invocation will be drowned out by all the other regular invocations.
In JProfiler, you can mark a method so that exceptional method runs are kept separately and you can examine the slowest operations in detail. You can use the method statistics view to see the call distribution and mark them for exceptional method recording there:
In the call tree view, the slowest invocations will be marked with an [exceptional run] suffix and you can investigate their call tree in isolation:
Disclaimer: My company develops JProfiler.
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 2 years ago.
Improve this question
I have a web server program written in java and my boss wants it to run faster.
Iv always been happy if it ran without error so efficiency is new to me.
I tried a profiler but it crashed my computer and turned out to be a dead opensource project.
I have no idea what I am doing except from reading a few questions on here. I see that re factoring code is the best option but Im not sure how to go about that and that i need a profiler to see what code to re factor.
So does anyone know of a free profiler that I can use ? Im using java and eclipse. if possible some instructions or a like to easy instruction would be great.
But what I really want if anyone can give it is a basic introduction to the subject so I can understand enough to go do in depth research on the subject to get the best results.
I am a complete beginner when it comes to optimising code and the subject seems very complex from what I have seen so far, any help with how to get started would be greatly appreciated.
Im new to java as well so saying things like check garbage collection would mean nothing to me, id need a more detailed explanation.
EDIT: the program uses tomcat for the networking. it connects to an SQL database. the main function is a polling loop which checks all attached devices on the network, reads events from them writes the event to the database and the performs the event functions.
I am trying to improve the polling loop. the program is heavily multithreaded and uses a lot of interfaces and proxies so it is hart to see were code goes the farther you get from the polling loop.
I hope this information helps you offer solutions. also I did not build it, I inherited the code.
First of all detect the bottlenecks. There is no point in optimizing a method from 500ms to 400ms when there is a method running for 5 seconds, when it should run for 100ms.
You can try using the VisualVM as a profiler, which is built-in in the JDK.
If you want a free profiler, use VisualVM when comes with Java. It is likely to be enough.
You should ask your boss exact what he would like to go faster. There is no point optimising random pieces of code he/she might not care about. (Its easily done)
You can also log key points in you task/request to determine what it spends the most time doing.
EDIT: the program uses tomcat for the networking. it connects to an
SQL database. the main function is a polling loop which checks all
attached devices on the network, reads events from them writes the
event to the database and the performs the event functions.
I am trying to improve the polling loop. the program is heavily
multithreaded and uses a lot of interfaces and proxies so it is hart
to see were code goes the farther you get from the polling loop
This sounds like you have a heavily I/O bound application. There really isn't much that you can do about that because I/O bound applications aren't inefficiently using the CPU--they're stuck waiting for I/O operations on other devices to complete.
FWIW, this scenario is actually why a lot of big companies are contemplating moving toward cheap, ARM-based solutions. They're wasting a lot of power and resources on powerful x86 CPUs that get underutilized while their code sits there waiting for a remote MySQL or Oracle server to finish doing its thing. With such an application, why throw more CPU than you need?
If your new to java then Optimization sounds like a bad idea. Its very easy to get wrong. Its not trivial to rewrite code and keep all the outputs the same while changing the inner workings.
Possibly have a look at your stored procedures and replace any IN statments with INNER JOIN. Thats a fairly low risk and high reward way of speeding thing up.
Start by identifying the time taken by various steps in your application (use logging to identify). Notice if there is anything unusual.
Step into each of these steps to see if there are any bottlenecks. Identify if something can be cached to save a db call. Identify if there is scope of parallelism by breaking down your tasks into independent units.
Hope you have some unit/ integration tests to ensure you don't accidentally break anything.
Measure (with a profiler - as others suggested, VisualVM is good) and locate the spots where your program spends most of its time.
Analyze the hot spots and try to improve their performance.
Measure again to verify that your changes had the expected effect.
If needed, repeat from step 1.
Start very simple.
Make a list of whats slow from a user perspective.
Try to do high level profiling yourself. Maybe an interceptor that prints the run time for your actions.
Then profile only those actions with Start time = System.currentTime...
This easy way could be a starting point into more advanced profiling and if your lucky it may fix your problems.
Before you start optimizing, you have to understand your workload, and you have to be able to recreate that workload. One easy way to do that is to log all requests, in production, with enough detail that you can recreate the requests in a development environment.
At the same time that you log your load, you can also log the performance of those requests: the time from the start of the request to the end. One way to do that (and, incidentally, to capture the data needed to log the request) is to add a servlet filter into your stack.
Then you can start to think about optimization.
Establish performance goals. Simply saying "make it faster" is pointless. Instead, you need to establish goals such as "all pages should respond within 1.5 seconds, as long as there are less than 100 concurrent users."
Identify the requests that fail your performance goals. Focus on the biggest failure first.
Identify why the request takes so long.
To do #3, you need to be able to recreate load in a development environment. Then you can either use a profiler, or simply add trace-level logging into your application to find out how long each step of the process takes.
There is also a whole field of holistic optimization, of which garbage collection tuning is probably the most important. But again, you need to establish and replicate your workload, otherwise you'll be flailing.
When starting to optimize an application, the main risk is to try to optimize every step, which does often not improve the program efficiency as expected and results in unmaintainable code.
It is likely that 80% of the execution time of your program is caused by a single step, which is itself only 20% of the code base.
The first thing to do is to identify this bottleneck. For example, you can log timestamps (using System.nanoTime and/or System.currentTimeMillis and you favorite logging framework) to do this.
Once the step has been identified, try to write a test class which runs this step, and run it with a profiler. I have good experience with both HPROF (http://java.sun.com/developer/technicalArticles/Programming/HPROF.html) although it might require some time to get familiar with, and Eclipse Test and Performance Tools Platform (http://www.eclipse.org/tptp/). If you have never used a profiler, I recommend you start with Eclipse TPTP.
The execution profile will help you find out in what methods your program spends time. Once you know them, look at the source code, and try to understand why it is slow. It might be because (this list is not exhaustive) :
unnecessary costly operations are performed,
a sub-optimal algorithm is used,
the algorithm generates lots of objects, thus giving a lot of work to the garbage collector (especially true for objects which have a medium to long life expectancy).
If there is no visible defect in the code, then you might consider :
making the algorithm more parallel in order to leverage all your CPUs
buying faster hardware.
Regarding JVM options, the two most important ones for performance areĀ :
-server, in order to use the server VM (enabled by default depending on the hardware) which provides better performance at the price of a slower startup (http://stackoverflow.com/questions/198577/real-differences-between-java-server-and-java-client),
-Xms and -Xmx which define the heap size available on startup, and the maximum amount of memory that the JVM can use. If the JVM is not given enough memory, garbage collection will use a lot of your CPU resources, slowing down your program, however if the JVM already has enough memory, increasing the heap size will not improve performance, and might even cause longer GC pauses. (http://stackoverflow.com/questions/1043817/speed-tradeoff-of-javas-xms-and-xmx-options)
Other parameters usually have lower impact, you can consult them at http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html.
We have an Java ERP type of application. Communication between server an client is via RMI. In peak hours there can be up to 250 users logged in and about 20 of them are working at the same time. This means that about 20 threads are live at any given time in peak hours.
The server can run for hours without any problems, but all of a sudden response times get higher and higher. Response times can be in minutes.
We are running on Windows 2008 R2 with Sun's JDK 1.6.0_16. We have been using perfmon and Process Explorer to see what is going on. The only thing that we find odd is that when server starts to work slow, the number of handles java.exe process has opened is around 3500. I'm not saying that this is the acual problem.
I'm just curious if there are some guidelines I should follow to be able to pinpoint the problem. What tools should I use? ....
Can you access to the log configuration of this application.
If you can, you should change the log level to "DEBUG". Tracing the DEBUG logs of a request could give you a usefull information about the contention point.
If you can't, profiler tools are can help you :
VisualVM (Free, and good product)
Eclipse TPTP (Free, but more complicated than VisualVM)
JProbe (not Free but very powerful. It is my favorite Java profiler, but it is expensive)
If the application has been developped with JMX control points, you can plug a JMX viewer to get informations...
If you want to stress the application to trigger the problem (if you want to verify whether it is a charge problem), you can use stress tools like JMeter
Sounds like the garbage collection cannot keep up and starts "halt-the-world" collecting for some reason.
Attach with jvisualvm in the JDK when starting and have a look at the collected data when the performance drops.
The problem you'r describing is quite typical but general as well. Causes can range from memory leaks, resource contention etcetera to bad GC policies and heap/PermGen-space allocation. To point out exact problems with your application, you need to profile it (I am aware of tools like Yourkit and JProfiler). If you profile your application wisely, only some application cycles would reveal the problems otherwise profiling isn't very easy itself.
In a similar situation, I have coded a simple profiling code myself. Basically I used a ThreadLocal that has a "StopWatch" (based on a LinkedHashMap) in it, and I then insert code like this into various points of the application: watch.time("OperationX");
then after the thread finishes a task, I'd call watch.logTime(); and the class would write a log that looks like this: [DEBUG] StopWatch time:Stuff=0, AnotherEvent=102, OperationX=150
After this I wrote a simple parser that generates CSV out from this log (per code path). The best thing you can do is to create a histogram (can be easily done using excel). Averages, medium and even mode can fool you.. I highly recommend to create a histogram.
Together with this histogram, you can create line graphs using average/medium/mode (which ever represents data best, you can determine this from the histogram).
This way, you can be 100% sure exactly what operation is taking time. If you can't determine the culprit, binary search is your friend (fine grain the events).
Might sound really primitive, but works. Also, if you make a library out of it, you can use it in any project. It's also cool because you can easily turn it on in production as well..
Aside from the GC that others have mentioned, Try taking thread dumps every 5-10 seconds for about 30 seconds during your slow down. There could be a case where DB calls, Web Service, or some other dependency becomes slow. If you take a look at the tread dumps you will be able to see threads which don't appear to move, and you could narrow your culprit that way.
From the GC stand point, do you monitor your CPU usage during these times? If the GC is running frequently you will see a jump in your overall CPU usage.
If only this was a Solaris box, prstat would be your friend.
For acute issues like this a quick jstack <pid> should quickly point out the problem area. Probably no need to get all fancy on it.
If I had to guess, I'd say Hotspot jumped in and tightly optimised some badly written code. Netbeans grinds to a halt where it uses a WeakHashMap with newly created objects to cache file data. When optimised, the entries can be removed from the map straight after being added. Obviously, if the cache is being relied upon, much file activity follows. You probably wont see the drive light up, because it'll all be cached by the OS.
I'm trying to profile my java app, just to find out the methods in which most time is being spent. Given the poor reactions here to TPTP, I thought I'd give Java VisualVM a go.
It all seemed rather simple to use - except that I can't seem to get anything consistent or useful out of it.
I can't seem to see anything relating to MY OWN code - all I get is a whole bunch of calls to things like java.* methods.
I've tried restricting instrumentation to only my own packages, which seems to cut down the number of methods instrumented, but still I don't ever seem to see my own.
Each time I run, I get varying numbers of methods instrumented, ranging from 10's to 1000's.
I've tried putting in a sleep at the start of my app, to make sure I get VisualVM up and running before my app starts to do anything interesting, to make sure it's profiling when the interesting stuff is running.
Is there something I have to do to ensure my classes get instrumented ?
Are there timing issues ? ..like, have to wait for classes to be loaded etc ?
I've also tried running the guts of the code twice, to make sure all the code does get exercised...
I'm just running an app, with a main, from Eclipse. I've tried using the Eclipse integration so that VisualVM starts up when I start the app - results are the same.
I've also tried exporting the app as a runnable app, and running it standalone from the command line, rather than through Eclipse - same result.
My app is not a long running web app etc - just a main that calls some other of my own classes to do some processing, then quits.
I'd be grateful for any advice about what I might be doing wrong ! :)
Thanks !
I too am struggling with VisualVM, which is a shame because its user interface is fantastic while its profiling output seems horrific. You can seem my question here.
Java VisualVM giving bizarre results for CPU profiling - Has anyone else run into this?
I can tell you a couple of odd things that I have learned about VisualVM and the way it seems to do its profiling.
VisualVM appears to be counting the total time spent inside a method (wall-clock time). I have a thread in my application which starts a number of other threads and then immediately blocks waiting for a message on a queue. VisualVM will not register this method in the profiler until one of the other threads sends the message the first thread was waiting for (when the application terminates). Suddenly the blocking method call dominates the profiling output and is recorded as taking up more than 80% of the application time.
Other profilers, such as JProfiler and the one used by Azul do not count a blocked thread as taking up time for the profiler. This means that blocking methods which probably aren't interesting (situation dependant) for performance profiling are obscuring your view of that code that is eating your CPU time.
When I am running my profiling I end up with
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run()
obscuring my profiling right up until that message comes back to the waiting thread and then the top spot is shared between these two totally irrelevant methods, as well as various other uninteresting methods which don't appear on other profilers.
Secondly and I think quite importantly the method filtering mechanism doesn't work as I would have expected. This means that I can't filter out the I am trying to track down what the story is with this right now.
Not a really helpful answer. The solution as I see it right now is to pay for JProfiler - VisualVM just doesn't seem trustworthy for this task.
you could take a look at Appdynamics lite , it's has a nice features such as business transaction discovery which can samples all call made to a specific method in your code.
The lite version has a lot of limitation such as 10min sampling max and 30 business transaction discovery max.
It's would be nice to have a free tools that do the same
I assume this isn't just an academic question - you would like to see if you could make the app run faster. I assume you also wouldn't mind a little "out of the box" thinking. There are many popular ideas about performance that are actually pretty fuzzy.
For example, you say you're looking for "methods in which most time is being spent". If by that you mean "self time" (program counter actually in the method) there is probably very little, unless you've got some intense loops. Methods generally spend time by calling other methods, sometimes doing I/O.
Another fuzzy idea is that measuring method time or counting the number of calls can tell you very much about where bottlenecks are. Bottlenecks are specific lines of code, not methods, so even if you know approximately where to look, you're still playing detective.
So those are a few of the fuzzy ideas. Here is a bunch more. Let me suggest how one should think about it, and how that leads to results.
When you eventually fix something, it will reduce execution time by some percent, like (pick a number) 30%, right? (Otherwise you didn't fix anything.) OK, during that 30% it was doing something, something that it didn't need to do because later you got rid of it. So, you don't need to measure. You do need to find out what it is doing in that time, so you know what to get rid of.
A very simple way is to "pause" it 10 (or some number of) times at random. Understand what it is doing and why, by looking at the call stack and possibly some of the data. On about 3 of those times you will see it doing something you could get rid of.
You will know approximately how much it will save by seeing what percent of samples is showing it. Approximate is good enough. You can easily see exactly how much time is saved by stopwatching it before and after.
Then, don't stop. You've made the app faster. Do it again, and make it faster yet. Sooner or later you get to a point where you can't make it any faster, but it's probably in more than one step.