Java Profiling, Performance Tuning and Memory Profiling exercises - java

I am about to conduct a workshop profiling, performance tuning, memory profiling, memory leak detection etc. of java applications using JProfiler and Eclipse Tptp.
I need a set of exercises that I could offer to participants where they can:
Use the tool to to profile the discover the problem: bottleneck, memory leak, suboptimal code etc. I am sure there is plenty experience and real-life examples around.
Resolve the problem and implement optimized code
Demonstrate the solution by performing another session of profiling
Ideally, write the unit test that demonstrates the performance gain
Problems nor solutions should not be overly complicated; it should be possible to resolve them in matter of minutes at best and matter of hours at worst.
Some interesting areas to exercise:
Resolve memory leaks
Optimize loops
Optimize object creation and management
Optimize string operations
Resolve problems exacerbated by concurrency and concurrency bottlenecks
Ideally, exercises should include sample unoptimized code and the solution code.

I try to find real life examples that I've seen in the wild (maybe slightly altered, but the basic problems were all very real). I've also tried to cluster them around the same scenario, so you can build up a session easily.
Scenario: you have a time consuming function that you want to do many times for different values, but the same values may pop up again (ideally not too long after it was created). A good and simple example is url-web page pairs that you need to download and process (for the exercise it should be probably simulated).
Loops:
You want to check if any of a set of words pops up in the pages. Use your function in a loop, but with the same value, pseudo code:
for (word : words) {
checkWord(download(url))
}
One solution is quite easy, just download the page before the loop.
Other solution is below.
Memory leak:
simple one: you can also solve your problem with a kind of cache. In the simplest case you can just put the results to a (static) map. But if you don't prevent it, its size will grow infinitely -> memory leak.
Possible solution: use an LRU map. Most likely performance will not degrade too much, but the memory leak should go away.
trickier one: say you implement the previous cache using a WeakHashMap, where the keys are the URLs (NOT as strings, see later), values are instances of a class that contain the URL, the downloaded page and something else. You may assume that it should be fine, but in fact it is not: as the value (which is not weakly referenced) has a reference to the key (the URL) the key will never be eligible to clean up -> nice memory leak.
Solution: remove the URL from the value.
Same as before, but the urls are interned strings ("to save some memory if we happen to have the same strings again"), value does not refer to this. I did not try it, but it seems to me that it would also cause a leak, because interned Strings can not be GC-ed.
Solution: do not intern, which will also lead to the advice that you must not skip: don't do premature optimization, as it is the root of all evil.
Object creation & Strings:
say you want to display the text of the pages only (~remove html tags). Write a function that does it line by line, and appends it to a growing result. At first the result should be a string, so appending will take a lot of time and object allocation. You can detect this problem from performance point of view (why appends are so slow) and from object creation point of view (why we created so many Strings, StringBuffers, arrays, etc).
Solution: use a StringBuilder for the result.
Concurrency:
You want to speed the whole stuff up by doing downloading/filtering in parallel. Create some threads and run your code using them, but do everything inside a big synchronized block (based on the cache), just "to protect the cache from concurrency problems". Effect should be that you effectively use just one thread, as all the others are waiting to acquire the lock on the cache.
Solution: synchronize only around cache operations (e.g. use `java.util.collections.synchronizedMap())
Synchronize all tiny little pieces of code. This should kill performance, probably prevent normal parallel execution. If you are lucky/smart enough you can come up with a dead lock also.
Moral of this: synchronization should not be an ad hoc thing, on an "it will not hurt" basis, but a well thought thing.
Bonus exercise:
Fill up your cache at the beginning and don't do too much allocation afterward, but still have a small leak somewhere. Usually this pattern is not too easy to catch. You can use a "bookmark", or "watermark" feature of the profiler, which should be created right after the caching is done.

Don't ignore this method because it works very well for any language and OS, for these reasons. An example is here. Also, try to use examples with I/O and significant call depth. Don't just use little cpu-bound programs like Mandelbrot. If you take that C example, which isn't too large, and recode it in Java, that should illustrate most of your points.
Let's see:
Resolve memory leaks.
The whole point of a garbage collector is to plug memory leaks. However, you can still allocate too much memory, and that shows up as a large percent of time in "new" for some objects.
Optimize loops.
Generally loops don't need to be optimized unless there's very little done inside them (and they take a good percent of time).
Optimize object creation and management.
The basic approach here is: keep data structure as simple as humanly possible. Especially stay away from notification-style attempts to keep data consistent, because those things run away and make the call tree enormously bushy. This is a major reason for performance problems in big software.
Optimize string operations.
Use string builder, but don't sweat code that doesn't use a solid percent of execution time.
Concurrency.
Concurrency has two purposes.
1) Performance, but this only works to the extent that it allows multiple pieces of hardware to get cranking at the same time. If the hardware isn't there, it doesn't help. It hurts.
2) Clarity of expression, so for example UI code doesn't have to worry about heavy calculation or network I/O going on at the same time.
In any case, it can't be emphasized enough, don't do any optimization before you've proved that something takes a significant percent of time.

I have used JProfiler for profiling our application.But it hasn't been of much help.Then I used JHat.Using JHat you cannot see the heap in real time.You have to take a heap dump and then analyse it. Using the OQL(Object Query Language) is a good technique to find heap leaks.

Related

How to memory profile in Java?

I'm still learning the ropes of Java so sorry if there's a obvious answer to this. I have a program that is taking a ton of memory and I want to figure a way to reduce its usage, but after reading many SO questions I have the idea that I need to prove where the problem is before I start optimizing it.
So here's what I did, I added a break point to the start of my program and ran it, then I started visualVM and had it profile the memory(I also did the same thing in netbeans just to compare the results and they are the same). My problem is I don't know how to read them, I got the highest area just saying char[] and I can't see any code or anything(which makes sense because visualvm is connecting to the jvm and can't see my source, but netbeans also does not show me the source as it does when doing cpu profiling).
Basically what I want to know is which variable(and hopefully more details like in which method) all the memory is being used so I can focus on working there. Is there a easy way to do this? I right now I am using eclipse and java to develop(and installed visualVM and netbeans specifically for profiling but am willing to install anything else that you feel gets this job done).
EDIT: Ideally, I'm looking for something that will take all my objects and sort them by size(so I can see which one is hogging memory). Currently it returns generic information such as string[] or int[] but I want to know which object its referring to so I can work on getting its size more optimized.
Strings are problematic
Basically in Java, String references ( things that use char[] behind the scenes ) will dominate most business applications memory wise. How they are created determines how much memory they consume in the JVM.
Just because they are so fundamental to most business applications as a data type, and they are one of the most memory hungry as well. This isn't just a Java thing, String data types take up lots of memory in pretty much every language and run time library, because at the least they are just arrays of 1 byte per character or at the worse ( Unicode ) they are arrays of multiple bytes per character.
Once when profiling CPU usage on a web app that also had an Oracle JDBC dependency I discovered that StringBuffer.append() dominated the CPU cycles by many orders of magnitude over all other method calls combined, much less any other single method call. The JDBC driver did lots and lots of String manipulation, kind of the trade off of using PreparedStatements for everything.
What you are concerned about you can't control, not directly anyway
What you should focus on is what in in your control, which is making sure you don't hold on to references longer than you need to, and that you are not duplicating things unnecessarily. The garbage collection routines in Java are highly optimized, and if you learn how their algorithms work, you can make sure your program behaves in the optimal way for those algorithms to work.
Java Heap Memory isn't like manually managed memory in other languages, those rules don't apply
What are considered memory leaks in other languages aren't the same thing/root cause as in Java with its garbage collection system.
Most likely in Java memory isn't consumed by one single uber-object that is leaking ( dangling reference in other environments ).
It is most likely lots of smaller allocations because of StringBuffer/StringBuilder objects not sized appropriately on first instantantations and then having to automatically grow the char[] arrays to hold subsequent append() calls.
These intermediate objects may be held around longer than expected by the garbage collector because of the scope they are in and lots of other things that can vary at run time.
EXAMPLE: the garbage collector may decide that there are candidates, but because it considers that there is plenty of memory still to be had that it might be too expensive time wise to flush them out at that point in time, and it will wait until memory pressure gets higher.
The garbage collector is really good now, but it isn't magic, if you are doing degenerate things, it will cause it to not work optimally. There is lots of documentation on the internet about the garbage collector settings for all the versions of the JVMs.
These un-referenced objects may just have not reached the time that the garbage collector thinks it needs them to for them to be expunged from memory, or there could be references to them held by some other object ( List ) for example that you don't realize still points to that object. This is what is most commonly referred to as a leak in Java, which is a reference leak more specifically.
EXAMPLE: If you know you need to build a 4K String using a StringBuilder create it with new StringBuilder(4096); not the default, which is like 32 and will immediately start creating garbage that can represent many times what you think the object should be size wise.
You can discover how many of what types of objects are instantiated with VisualVM, this will tell you what you need to know. There isn't going to be one big flashing light that points at a single instance of a single class that says, "This is the big memory consumer!", that is unless there is only one instance of some char[] that you are reading some massive file into, and this is not possible either, because lots of other classes use char[] internally; and then you pretty much knew that already.
I don't see any mention of OutOfMemoryError
You probably don't have a problem in your code, the garbage collection system just might not be getting put under enough pressure to kick in and deallocate objects that you think it should be cleaning up. What you think is a problem probably isn't, not unless your program is crashing with OutOfMemoryError. This isn't C, C++, Objective-C, or any other manual memory management language / runtime. You don't get to decide what is in memory or not at the detail level you are expecting you should be able to.
In JProfiler, you can take go to the heap walker and activate the biggest objects view. You will see the objects the retain most memory. "Retained" memory is the memory that would be freed by the garbage collector if you removed the object.
You can then open the object nodes to see the reference tree of the retained objects. Here's a screen shot of the biggest object view:
Disclaimer: My company develops JProfiler
I would recommend capturing heap dumps and using a tool like Eclipse MAT that lets you analyze them. There are many tutorials available. It provides a view of the dominator tree to provide insight into the relationships between the objects on the heap. Specifically for what you mentioned, the "path to GC roots" feature of MAT will tell you where the majority of those char[], String[] and int[] objects are being referenced. JVisualVM can also be useful in identifying leaks and allocations, particularly by using snapshots with allocation stack traces. There are quite a few walk-throughs of the process of getting the snapshots and comparing them to find the allocation point.
Java JDK comes with JVisualVM under bin folder, once your application server (for example is running) you can run visualvm and connect it to your localhost, which will provide you memory allocation and enable you to perform heap dump
If you use visualVM to check your memory usage, it focuses on the data, not the methods. Maybe your big char[] data is caused by many String values? Unless you are using recursion, the data will not be from local variables. So you can focus on the methods that insert elements into large data structures. To find out what precise statements cause your "memory leakage", I suggest you additionally
read Josh Bloch's Effective Java Item 6: (Eliminate obsolete object references)
use a logging framework an log instance creations on the highest verbosity level.
There are generally two distinct approaches to analyse Java code to gain an understanding of its memory allocation profile. If you're trying to measure the impact of a specific, small section of code – say you want to compare two alternative implementations in order to decide which one gives better runtime performance – you would use a microbenchmarking tool such as JMH.
While you can pause the running program, the JVM is a sophisticated runtime that performs a variety of housekeeping tasks and it's really hard to get a "point in time" snapshot and an accurate reading of the "level of memory usage". It might allocate/free memory at a rate that does not directly reflect the behaviour of the running Java program. Similarly, performing a Java object heap dump does not fully capture the low-level machine specific memory layout that dictates the actual memory footprint, as this could depend on the machine architecture, JVM version, and other runtime factors.
Tools like JMH get around this by repeatedly running a small section of code, and observing a long-running average of memory allocations across a number of invocations. E.g. in the GC profiling sample JMH benchmark the derived *·gc.alloc.rate.norm metric gives a reasonably accurate per-invocation normalised memory cost.
In the more general case, you can attach a profiler to a running application and get JVM-level metrics, or perform a heap dump for offline analysis. Some commonly used tools for profiling full applications are Async Profiler and the newly open-sourced Java Flight Recorder in conjunction with Java Mission Control to visualise results.

What does "costly" mean in terms of software operations?

What is meant by Operation is costly or the resource is costly in-terms of Software. When i come across with some documents they mentioned something like Opening a file every-time is a Costly Operation. I can have more examples like this (Database connection is a costly operation, Thread pool is a cheaper one, etc..). At what basis it decided whether the task or operation is costly or cheaper? When we calculating this what the constraints to consider? Is based on the Time also?
Note : I already checked in the net with this but i didn't get any good explanation. If you found kindly share with me and i can close this..
Expensive or Costly operations are those which cause a lot of resources to be used, such as the CPU, Disk Drive(s) or Memory
For example, creating an integer variable in code is not a costly or expensive operation
By contrast, creating a connection to a remote server that hosts a relational database, querying several tables and returning a large results set before iterating over it while remaining connected to the data source would be (relatively) expensive or costly, as opposed to my first example with the Integer.
In order to build scalable, fast applications you would generally want to minimize the frequency of performing these costly/expensive actions, applying techniques of optimisation, caching, parallelism (etc) where they are essential to the operation of the software.
To get a degree of accuracy and some actual numbers on what is 'expensive' and what is 'cheap' in your application, you would employ some sort of profiling or analysis tool. For JavaScript, there is ySlow - for .NET applications, dotTrace - I'd be certain that whatever the platform, a similar solution exists. It's then down to someone to comprehend the output, which is probably the most important part!
Running time, memory use or bandwidth consumption are the most typical interpretations of "cost". Also consider that it may apply to cost in development time.
I'll try explain through some examples:
If you need to edit two field in each row of a Database, if you do it one field at a time that's gonna be close to twice the time as if it was properly done both at same time.
This extra time was not only your waste of time, but also a connection opened longer then needed, memory occupied longer then needed and at the end of the day, your eficience goes down the drain.
When you start scalling, very small amount of time wasted grows into a very big waste of Company resources.
It is almost certainly talking about a time penalty to perform that kind of input / output. Lots of memory shuffling (copying of objects created from classes with lots of members) is another time waster (pass by reference helps eliminate a lot of this).
Usually costly means, in a very simplified way, that it'll take much longer then an operation on memory.
For instance, accessing a File in your file system and reading each line takes much longer then simply iterating over a list of the same size in memory.
The same can be said about database operations, they take much longer then in-memory operations, and so some caution should be used not to abuse these operations.
This is, I repeat, a very simplistic explanation. Exactly what costly means depends on your particular context, the number of operations you're performing, and the overall architecture of the system.

Should I inline long code in a loop, or move it in a separate method?

Assume I have a loop (any while or for) like this:
loop{
A long code.
}
From the point of time complexity, should I divide this code in parts, write a function outside the loop, and call that function repeatedly?
I read something about functions very long ago, that calling a function repeatedly takes more time or memory or like something, I don't remember it exactly. Can you also provide some good reference about things like this (time complexity, coding style)?
Can you also provide some reference book or tutorial about heap memory, overheads etc. which affects the performance of program?
The performance difference is probably very minimal in this case. I would concentrate on clarity rather than performance until you identify this portion of your code to be a serious bottleneck.
It really does depend on what kind of code you're running in the loop, however. If you're just doing a tiny mathematical operation that isn't going to take any CPU time, but you're doing it a few hundred thousand times, then inlining the calculation might make sense. Anything more expensive than that, though, and performance shouldn't be an issue.
There is an overhead of calling a function.
So if the "long code" is fast compared to this overhead (and your application cares about performance), then you should definitely avoid the overhead.
However, if the performance is not noticably worse, it's better to make it more readable, by using a (or better multiple) function.
Rule one of performance optmisation: Measure it.
Personally, I go for readable code first and then optimise it IF NECESSARY. Usually, it isn't necessary :-)
See the first line in CHAPTER 3 - Measurement Is Everything
"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil." - Donald Knuth
In this case, the difference in performance will probably be minimal between the two solutions, so writing clearer code is the way to do it.
There really isnt a simple "tutorial" on performance, it is a very complex subject and one that even seasoned veterans often dont fully understand. Anyway, to give you more of an idea of what the overhead of "calling" a function is, basically what you are doing is "freezing" the state of your function(in Java there are no "functions" per se, they are all called methods), calling the method, then "unfreezing", where your method was before.
The "freezing" essentially consists of pushing state information(where you were in the method, what the value of the variables was etc) on to the stack, "unfreezing" consists of popping the saved state off the stack and updating the control structures to where they were before you called the function. Naturally memory operations are far from free, but the VM is pretty good at keeping the performance impact to an absolute minimum.
Now keep in mind Java is almost entirely heap based, the only things that really have to get pushed on the stack are the value of pointers(small), your place in the program(again small), and whatever primitives you have local to your method, and a tiny bit of control information, nothing else. Furthermore, although you cannot explicitly inline in Java(though Im sure there are bytecode editors out there that essentially let you do that), most VMs, including the most popular HotSpot VM, will do this automatically for you. http://java.sun.com/developer/technicalArticles/Networking/HotSpot/inlining.html
So the bottom line is pretty much 0 performance impact, if you want to verify for yourself you can always run benchmarking and profiling tools, they should be able to confirm it for you.
From a execution speed point of view it shouldn't matter, and if you still believe this is a bottleneck it is easy to measure.
From a development performance perspective, it is a good idea to keep the code short. I would vote for turning the loop contents into one (or more) properly named methods.
Forget it! You can't gain any performance by doing the job of the JIT. Let JIT inline it for you. Keep the methods short for readability and also for performance, as JIT works better with short methods.
There are microptimizations which may help you gain some performance, but don't even think about them. I suggest the following rules:
Write clean code using appropriate objects and algorithms for readability and for performance.
In case the program is too slow, profile and identify the critical parts.
Think about improving them using better objects and algorithms.
As a last resort, you may also consider microoptimizations.

frequent garbage collection java web app

I have a web app that serializes a java bean into xml or json according to the user request.
I am facing a mind bending problem when I put a little bit of load on it, it quickly uses all allocated memory, and reach max capacity. I then observe full GC working really hard every 20-40 seconds.
Doesnt look like a memory leak issue... but I am not quite sure how to trouble shoot this?
The bean that is serialized to xml/json has reference to other beans and those to others. I use json-lib and jaxb to serialize the beans.
yourkit memory profiler is telling me that a char[] is the most memory consuming live object...
any insight is appreciated.
There are two possibilities: you've got a memory leak, or your webapp is just generating lots of garbage.
The brute-force way to tell if you've got a memory leak is to run it for a long time and see if it falls over with an OOME. Or turn on GC logging, and see if the average space left after garbage collection continually trends upwards over time.
Whether or not you have a memory leak, you can probably improve performance (reduce the percentage GC time) by increasing the max heap size. The fact that your webapp is seeing lots of full GCs suggests to me that it needs more heap. (This is just a bandaid solution if you have a memory leak.)
If it turns out that you are not suffering from a memory leak, then you should take a look at why your application is generating so much garbage. It could be down to the way that you are doing the XML and JSON serialization.
Why do you think you have a problem? GC is a natural and normal thing to happen. We have customers that GC every second (for less than 100ms duration), and that's fine as long as memory keeps getting reclaimed.
GCing every 20-40 seconds isn't a problem IMO - as long as it doesn't take a large % of that 20-40s. Most major commercial JVMs aim to keep GC in the 5-10% of time range (so 1-4 seconds of that 20-40s). Posting more data in the form of the GC logs might help, and I'd also suggest tools like GCMV would help you visualize and get recommendations on what your GC profile looks like.
It's impossible to diagnose this without a lot more information - code and GC logs - but my guess would be that you're reading data in as large strings, then chopping out little bits with substring(). When you do that, the substring string is made using the same underlying character array as the parent string, and so as long as it's alive, will keep that array in memory. That means code like this:
String big = a string of one million characters;
String small = big.substring(0, 1);
big = null;
Will still keep the huge string's character data in memory. If this is the case, then you can address it by forcing the small strings to use fresh, smaller, character arrays by constructing new instances:
small = new String(small);
But like i said, this is just a guess.
I'm not sure how much of it is in your code and how much might be in the tools you are using, but there are some key things to watch for.
One of the worst is if you constantly add to strings in loops. A simple "hello"+"world" is no problem at all, it's actually very smart about that, but if you do it in a loop it will constantly reallocate the string. Use StringBuilder where you can.
There are profilers for Java that should quickly point you to where the allocations are taking place. Just fool around with a profiler for a while while your java app is running and you will probably be able to reduce your GCs to virtually nothing unless the problem is inside your libraries--and even then you may figure out some way to fix it.
Things you allocate and then free quickly don't require time in the GC phase--it's pretty much free. Be sure you aren't keeping Strings around longer than you need them. Bring them in, process them and return to your previous state before returning from your request handler.
You should attach yourkit and record allocations (e.g., every 10th allocation; including all large ones). They have a step by step guide on diagnosing excessive gc:
http://www.yourkit.com/docs/90/help/excessive_gc.jsp
To me that sounds like you are trying to serialize a recursive object by some encoder which is not prepared for it.
(or at least: very deep/almost recursive)
Java's native XML API is really "noisy" and generally wasteful in terms of resources which means that if your requests and XML/JSON generation cycles are short-lived, the GC will have lots to clean up for.
I have debugged a very similar case and found out this the hard way, only way I could at least somewhat improve the situation without major refactorings was implicitly calling GC with the appropriate VM flags which actually turn System.gc(); from a non-op call to maybe-op call.
I would start by inspecting my running application to see what was being created on the heap.
HPROF can collect this information for you, which you can then analyse using HAT.
To debug issues with memory allocations, InMemProfiler can be used at the command line. Collected object allocations can be tracked and collected objects can be split into buckets based on their lifetimes.
In trace mode this tool can be used to identify the source of memory allocations.

Java Object Reuse

Should Java Objects be reused as often as it can be reused ? Or should we reuse it only when they are "heavyweight", ie have OS resources associated with it ?
All old articles on the internet talk about object reuse and object pooling as much as possible, but I have read recent articles that say new Object() is highly optimized now ( 10 instructions ) and Object reuse is not as big a deal as it used to be.
What is the current best practice and how are you people doing it ?
I let the garbage collector do that kind of deciding for me, the only time I've hit heap limit with freshly allocated objects was after running a buggy recursive algorithm for a couple of seconds which generated 3 * 27 * 27... new objects as fast as it could.
Do what's best for readability and encapsulation. Sometimes reusing objects may be useful, but generally you shouldn't worry about it.
If you use them very intensively and the construction is costly, you should try to reuse them as much as you can.
If your objects are very small, and cheap to create ( like Object ) you should create new ones.
For instance connections database are pooled because the cost of creating a new one is higher than those of creating .. mmhh new Integer for instance.
So the answer to your question is, reuse when they are heavy AND are used often ( it is not worth to pool a 3 mb object that is only used twice )
Edit:
Additionally, this item from Effective Java:Favor Immutability is worth reading and may apply to your situation.
Object creation is cheap, yes, but sometimes not cheap enough.
If you create a lot (and I mean A LOT) temporary objects in rapid succession, the costs for the garbage collector are considerable. However even with a good profiler you may not necessarily see the costs easily, as the garbage collector nowadays works in short intervals instead of blocking the whole application for a second or two.
Most of the performance improvements I got in my projects came from either avoiding object creation or avoiding the whole work (including the object creation) through aggressive caching. No matter how big or small the object is, it still takes time to create it and to manage the references and heap structures for it. (And of course, the cleanup and the internal heap-defrag/copying also takes time.)
I would not start to be religious about avoiding object creation at all cost, but if you see a jigsaw pattern in your memory-profiler, it means your garbage collector is on heavy duty. And if your garbage collector uses the CPU, the CPI is not available for your application.
Regarding object pooling: Doing it right and not running into either memory leaks or invalid states or spending more time on the management than you would save is difficult. So I never used that strategy.
My strategy has been to simply strive for immutable objects. Immutable things can be cached easily and therefore help to keep the system simple.
However, no matter what you do: Make sure you check your hotspots with a profiler first. Premature optimization is the root of most evilness.
Let the garbage collector do its job, it can be considered better than your code.
Unless a profiler proves it guilty. And don't even use common sense to try to figure out when it's wrong. In unusual cases even cheap objects like byte arrays are better pooled.
Rule 1 of optimization: don't do it.
Rule 2 (for experts only): don't do it yet.
The rule of thumb should be to use your common sense and reuse objects when their creation consumes significant resources such as I/O, network traffic, DB connections, etc...
If it's just creating a new String(), forget about the reuse, you'll gain nothing from it. Code readability has higher preference.
I would worry about performance issues if they arise. Do what makes sense first (would you do this with primatives), if you then run a profiling tool and find that it is new causing you problems, start to think about pre-allocation (ie. when your program isn't doing much work).
Re-using objects sounds like a disaster waiting to happen by the way:
SomeClass someObject = new SomeClass();
someObject.doSomething();
someObject.changeState();
someObject.changeOtherState();
someObject.sendSignal();
// stuff
//re-use
someObject.reset(); // urgh, had to put this in to support reuse
someObject.doSomethingElse(); // oh oh, this is wrong after calling changeOtherState, regardless of reset
someObject.changeState(); // crap, now this is wrong but it's not obvious yet
someObject.doImportantStuff(); // what's going on?
Object creation is certainly faster than it used to be. The newer generational GC in JDKs 5 and higher are improvements, too.
I don't think either of these makes excessive creation of objects cost-free, but they do reduce the importance of object pooling. I think pooling makes sense for database connections, but I don't attempt it for my own domain objects.
Reuse puts a premium on thread-safety. You need to think carefully to ensure that you can reuse objects safely.
If I decided that object reuse was important I'd do it with products like Terracotta, Tangersol, GridGain, etc. and make sure that my server had scads of memory available to it.
Second the above comments.
Don't try and second guess the GC and Hotspot. Object pooling may have been useful once but these days its not so useful unless you are talking about database connections or unique system resources.
Just try and write clean and simple code and be amazed at what Hotspot can do.
Why not use VisualVM or a profiler to take a look at your code?

Categories