Alternatives to Java string interning - java

Since Java's default string interning has received a lot of bad press, I am looking for an alternative.
Can you suggest an API that is a good alternative to Java string interning? My application uses Java 6. My requirement is mainly to avoid duplicate strings via interning.
Regarding the bad press:
String.intern() is implemented via a native method, and the C implementation uses a fixed-size table of some 1k entries, so it scales very poorly for large numbers of strings.
Java 6 stores interned strings in the perm gen, so they are not GC'd and can lead to perm gen errors. I know this is fixed in Java 7, but I can't upgrade to Java 7.
Why do I need to use interning?
My application is a server app with a heap size of 10-20G across different deployments.
During profiling we have found that hundreds of thousands of strings are duplicates, and we can significantly improve memory usage by avoiding storing duplicate strings.
Memory has been a bottleneck for us, so we are targeting it rather than doing any premature optimization.

String.intern() is implemented via a native method, and the C implementation uses a fixed-size table of some 1k entries, so it scales very poorly for large numbers of strings.
It scales poorly once you have many thousands of Strings.
Java 6 stores interned strings in the perm gen, so they are not GC'd.
They will be cleaned up when the perm gen is collected, which is not often, but it can mean you reach the limit of that space if you don't increase it.
My application is a server app with a heap size of 10-20G across different deployments.
I suggest you consider using off-heap memory. I have 500 GB in off-heap memory and about 1 GB on heap in one application. It isn't useful in all cases, but it's worth considering.
During profiling we have found that hundreds of thousands of strings are duplicates, and we can significantly improve memory usage by avoiding storing duplicate strings.
For this I have used a simple array of Strings. This is very lightweight, and you can easily control the upper bound on the number of Strings stored.
Here is an example of a generic interner:
class Interner<T> {
    private final T[] cache;

    @SuppressWarnings("unchecked")
    public Interner(int primeSize) {
        cache = (T[]) new Object[primeSize];
    }

    public T intern(T t) {
        // Map the hash code onto the fixed-size table.
        int hash = Math.abs(t.hashCode() % cache.length);
        T t2 = cache[hash];
        if (t2 != null && t.equals(t2))
            return t2;   // a duplicate is already cached: return the canonical instance
        cache[hash] = t; // otherwise overwrite whatever occupied the slot
        return t;
    }
}
An interesting property of this cache is that it doesn't matter that it's not thread-safe.
For extra speed you can use a power-of-2 size and a bit mask, but it's more complicated and may not work very well depending on how your hashCodes are calculated.
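Here is a minimal sketch of that power-of-2 variant (my illustration, not code from the original; the bit-spreading step is one common way to compensate for weak low-order bits in hash codes):

class PowerOfTwoInterner<T> {
    private final T[] cache;
    private final int mask;

    @SuppressWarnings("unchecked")
    public PowerOfTwoInterner(int powerOfTwoSize) {
        cache = (T[]) new Object[powerOfTwoSize];
        mask = powerOfTwoSize - 1; // works only because the size is a power of 2
    }

    public T intern(T t) {
        int h = t.hashCode();
        h ^= (h >>> 16);    // spread high bits into the low bits
        int slot = h & mask; // bit mask instead of Math.abs(... % length)
        T t2 = cache[slot];
        if (t2 != null && t.equals(t2))
            return t2;
        cache[slot] = t;
        return t;
    }
}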

Related

How is "Size" calculated in JVisualVM's Heap Dump

I'm running Java Visual VM to analyze performance on a Mule app to reduce the amount of memory used. When I look at the heap dump, I see that char[] has a size of over 37 MB, String has a size of a bit over 28 MB.
What I'm not clear on is how the size column accounts for the amount of memory used. In particular, since a String is an abstraction over a char[], I'm wondering whether some of that 37 MB of char arrays is also counted within the 28 MB of Strings, or whether they are allocated and counted separately.
On top of that, I also have a class that I suspect is hogging a great deal of memory and contains several Strings, but according to my heap dump, this class only uses 6.5% of the total memory in the heap.
So I guess my question is: if I were to make my custom class more efficient by using fewer String objects, would I see a reduction in the amount of memory reported for Strings and char[]s, or just for that specific class?
Thanks!
Holger's comment is all I needed...
"The sizes only include the memory for an object itself, not any referenced object (and arrays are objects)."
This alone gives me a much better idea of how I can optimize.
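To make that concrete (a rough worked example; exact numbers vary by JVM and pointer size): a Java 6 String instance itself is roughly 24 bytes (object header, three int fields for offset/count/hash, and one reference to its char[]), while the character data sits in a separate char[] object (about 12 bytes of header plus 2 bytes per char). So the 28 MB of Strings and the 37 MB of char[]s are disjoint shallow totals, and making the custom class leaner only shrinks that class's own line unless it also eliminates the Strings and arrays it references.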

Java throwing out of memory exception before it's really out of memory?

I wish to make a large int array that very nearly fills all of the memory available to the JVM. Take this code, for instance:
final int numBuffers = (int) ((runtime.freeMemory() - 200000L) / (BUFFER_SIZE));
System.out.println(runtime.freeMemory());
System.out.println(numBuffers*(BUFFER_SIZE/4)*4);
buffers = new int[numBuffers*(BUFFER_SIZE / 4)];
When run with a heap size of 10M, this throws an OutOfMemoryError, despite the output from the printlns being:
9487176
9273344
I realise the array is going to have some overhead, but not 200k, surely? Why does Java fail to allocate memory for something it claims to have enough space for? I have to set the subtracted constant to something around 4M before Java will run this (by which time the printlns look more like:
9487176
5472256
)
Even more bewilderingly, if I replace buffers with a 2D array:
buffers = new int[numBuffers][BUFFER_SIZE / 4];
Then it runs without complaint using the 200k subtraction shown above, even though the number of integers being stored is the same in both arrays (and wouldn't the overhead on a 2D array be larger than that of a 1D array, since it has all those references to other arrays to store?).
Any ideas?
The VM divides the heap memory into different areas (mainly for the garbage collector), so you can run out of memory when you attempt to allocate a single object of nearly the entire heap size.
Also, some memory will already have been used up by the JRE. 200k is nothing with today's memory sizes, and a 10M heap is almost unrealistically small for most applications.
The actual overhead of an array is relatively small: on a 32-bit VM it's 12 bytes IIRC (plus what gets wasted if the size is less than the minimal granularity, which is AFAIK 8 bytes). So in the worst case you have something like 19 bytes of overhead per array.
Note that Java has no true 2D (multi-dimensional) arrays; it implements them internally as arrays of arrays.
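A quick illustration of that point (my own snippet, nothing specific to the question's code):

int[][] grid = new int[1000][]; // one object: an array of 1000 references
for (int i = 0; i < 1000; i++) {
    grid[i] = new int[1000];    // 1000 separate int[] objects, each allocated independently
}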
In the 2D case, you are allocating more, smaller objects. The memory manager objects to the single large object taking up most of the heap. Why this is objectionable is a detail of the garbage collection scheme: it's probably that the collector can move the smaller objects between generations, while the heap won't accommodate moving the single large object around.
This might be due to memory fragmentation and the JVM's inability to allocate an array of that size given the current heap.
Imagine your heap is 10 x long:
xxxxxxxxxx
Then, you allocate an object 0 somewhere. This makes your heap look like:
xxxxxxx0xx
Now you can no longer allocate those 10 x spaces. You cannot even allocate 8 xs, despite the fact that the available memory is 9 xs.
The fact is that an array of arrays does not suffer from the same problem because it's not contiguous.
EDIT: Please note that the above is a very simplistic view of the problem. When in need of space in the heap, Java's garbage collector will try to collect as much memory as it can and, if really, really necessary, try to compact the heap. However, some objects might not be movable or collectible, creating heap fragmentation and putting you in the above situation.
There are also many other factors to consider, some of which include: memory leaks either in the VM (not very likely) or your application (also not likely for a simple scenario), the unreliability of Runtime.freeMemory() (the GC might run right after the call and the available free memory could change), implementation details of each particular JVM, etc.
The point is, as a rule of thumb, don't expect to have the full amount of Runtime.freeMemory() available to your application.
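A minimal sketch of a more defensive estimate (my illustration; the 20% slack factor is an arbitrary assumption, not a magic number):

// Account for memory that is allocated but not yet used, and leave headroom.
Runtime rt = Runtime.getRuntime();
long used = rt.totalMemory() - rt.freeMemory();
long available = rt.maxMemory() - used;  // upper bound on what could still be allocated
long budget = (long) (available * 0.8);  // leave 20% slack for GC areas and the JRE
final int numBuffers = (int) (budget / BUFFER_SIZE);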

Reducing memory churn when processing large data set

Java has a tendency to create a large number of objects that need to be garbage collected when processing large data sets. This happens fairly frequently when streaming large amounts of data from the database, creating reports, etc. Is there a strategy to reduce the memory churn?
In this example, the object-based version spends a significant amount of time (2+ seconds) generating objects and performing garbage collection, whereas the boolean-array version completes in a fraction of a second without any garbage collection whatsoever.
How do I reduce the memory churn (the need for a large number of garbage collections) when processing large data sets?
java -verbose:gc -Xmx500M UniqChars
...
----------------
[GC 495441K->444241K(505600K), 0.0019288 secs] x 45 times
70000007
================
70000007
import java.util.HashSet;
import java.util.Set;

public class UniqChars {
    static String a = null;

    public static void main(String[] args) {
        // Generate data set
        StringBuffer sb = new StringBuffer("sfdisdf");
        for (int i = 0; i < 10000000; i++) {
            sb.append("sfdisdf");
        }
        a = sb.toString();
        sb = null; // free sb
        System.out.println("----------------");
        compareAsSet();
        System.out.println("================");
        compareAsAry();
    }

    public static void compareAsSet() {
        Set<String> uniqSet = new HashSet<String>();
        int n = 0;
        for (int i = 0; i < a.length(); i++) {
            String chr = a.substring(i, i + 1);
            uniqSet.add(chr);
            n++;
        }
        System.out.println(n);
    }

    public static void compareAsAry() {
        boolean uniqSet[] = new boolean[65536];
        int n = 0;
        for (int i = 0; i < a.length(); i++) {
            int chr = (int) a.charAt(i);
            uniqSet[chr] = true;
            n++;
        }
        System.out.println(n);
    }
}
Well, as pointed out by one of the comments, it's your code, not Java, at fault for the memory churn. So let's see: you've written code that builds an insanely large String from a StringBuffer, calls toString() on it, then calls substring() on that insanely large String in a loop, creating a.length() new Strings. Then it does some in-place work on an array that really will perform pretty damn fast, since there is no object creation, but ultimately writes true to the same 5-6 locations in a huge array. Waste much? So what did you think would happen? Ditch StringBuffer and use StringBuilder, since it's not synchronized, which will be a little faster.
OK, so here's where your algorithm is probably spending its time. The StringBuffer allocates an internal character array to store stuff each time you call append(). When that character array fills up entirely, it has to allocate a larger character array, copy all the junk you just wrote into the new array, then append what you originally called it with. So your code keeps filling the array up, allocating a bigger chunk, and copying the junk over, repeating that process as the loop runs 10000000 times. You can avoid that by pre-allocating the character array for the StringBuffer. Roughly, that's 10000000 * "sfdisdf".length(). This will keep Java from creating tons of memory that it just dumps over and over.
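A one-line sketch of that pre-sizing suggestion (using StringBuilder, per the point above; the +1 accounts for the seed string passed to the constructor):

// Reserve room for the seed "sfdisdf" plus 10000000 appended copies up front,
// so append() never has to grow and copy the internal array.
StringBuilder sb = new StringBuilder((10000000 + 1) * "sfdisdf".length());
sb.append("sfdisdf");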
Next is the compareAsSet() mess. Your line String chr = a.substring(i, i + 1); creates a NEW String a.length() times. Since a.substring(i, i + 1) is only a single character, you could just use charAt(i), and then there's no allocation happening. There's also the option of a CharSequence, which doesn't create a new String with its own character array but simply points at the original underlying char[] with an offset and length: see String.subSequence().
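A minimal rewrite along those lines, assuming you still want a Set of the distinct characters (my sketch, not the original poster's code):

public static void compareAsCharSet() {
    Set<Character> uniqSet = new HashSet<Character>();
    int n = 0;
    for (int i = 0; i < a.length(); i++) {
        // charAt(i) allocates nothing; the autoboxed Character for small
        // values like 's', 'd', 'f', 'i' comes from a shared cache.
        uniqSet.add(a.charAt(i));
        n++;
    }
    System.out.println(n);
}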
Plug this same code into any other language and it'll suck there too; in fact, I'd say far, far worse. Just try this in C++ and watch it be significantly worse than Java, should you allocate and deallocate this much. Java memory allocation is way, way faster than C++ because everything in Java is allocated from a memory pool, so creating objects is magnitudes faster. But there are limits. Furthermore, Java compacts its memory should it become too fragmented; C++ doesn't. So as you allocate memory and dump it, in just the same way, you'll probably run the risk of fragmenting the memory in C++. That could mean your StringBuffer loses the ability to grow large enough to finish, and crashes.
In fact, that might also explain some of the performance issues with the GC, because it has to make room for a contiguous block big enough after lots of trash has been taken out. So Java is not only cleaning up the memory, it's also having to compact the memory address space so it can get a block big enough for your StringBuffer.
Anyway, I'm sure you're just kicking the tires, but testing with code like this isn't really smart, because it'll never perform well: it's unrealistic memory allocation. You know the old adage: Garbage In, Garbage Out. And that's what you got: garbage.
In your example your two methods are doing very different things.
In compareAsSet() you are generating the same 4 Strings ("s", "d", "f" and "i") and calling String.hashCode() and String.equals(String) (HashSet does this when you try to add them) 70000007 times. What you end up with is a HashSet of size 4. While you are doing this, you are allocating a String object each time String.substring(int, int) returns, which forces a minor collection every time the 'new' generation of the garbage collector fills up.
In compareAsAry() you allocate a single array 65536 elements wide, change some values in it, and then let it go out of scope when the method returns. This is a single heap memory operation vs. the 70000007 done in compareAsSet(). You do have a local int variable being changed 70000007 times, but that happens in stack memory, not heap memory. This method does not really generate much garbage in the heap compared to the other method (basically just the array).
Regarding churn, your options are to recycle objects or to tune the garbage collector.
Recycling is not really possible with Strings in general, as they are immutable; though the VM may perform interning operations, this only reduces the total memory footprint, not the garbage churn. A recycling solution targeted at the above scenario could be written, but the implementation would be brittle and inflexible.
Tuning the garbage collector so that the 'new' generation is larger could reduce the total number of collections performed during your method call and thus increase the throughput of the call. You could also just increase the heap size in general, which would accomplish the same thing; see the flags sketched below.
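For instance (a hedged illustration; the exact sizes are arbitrary), enlarging the young generation with the standard HotSpot flags looks like:

java -verbose:gc -Xmx500M -Xmn256M UniqChars

or, expressed as a ratio of young to old space:

java -verbose:gc -Xmx500M -XX:NewRatio=1 UniqChars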
For further reading on garbage collector tuning in Java 6, I recommend the Oracle white paper linked below.
http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html
For comparison, if you wrote this it would do the same thing.
public static void compareLength() {
    // All the loop does is count the length in a complex way.
    System.out.println(a.length());
}

// I assume you intended to write this.
public static void compareAsBitSet() {
    BitSet uniqSet = new BitSet();
    for (int i = 0; i < a.length(); i++)
        uniqSet.set(a.charAt(i));
    System.out.println(uniqSet.size());
}
Note: the BitSet uses 1 bit per element, rather than 1 byte per element. It also expands as required, so if you have ASCII text, the BitSet might use 128 bits, or 16 bytes (plus 32 bytes of overhead). The boolean[] uses 64 KB, which is much higher. Ironically, using a boolean[] can be faster, as it involves less bit shifting, and only the portion of the array actually used needs to be in memory.
As you can see, with either solution you get a much more efficient result, because you use a better algorithm for what needs to be done.

HashSet: slow performance in big set

I have encountered a problem for which I cannot find a solution.
I am using a HashSet to store values. The values I store are of the custom type Cycle, where I have overridden hashCode and equals as follows, in order to make sure the slow performance is not caused by the hashCode or equals methods.
I have also set the initial capacity of the HashSet to 10,000,000.
@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + (int) (cycleId ^ (cycleId >>> 32));
    return result;
}

@Override
public boolean equals(Object obj) {
    if (this == obj)
        return true;
    if (obj == null)
        return false;
    if (getClass() != obj.getClass())
        return false;
    Cycle other = (Cycle) obj;
    if (cycleId != other.cycleId)
        return false;
    return true;
}
After the first 1,500,000 values, when I try to add a new value (with the add method of the HashSet class), the program becomes very slow. Eventually I get a Java out-of-memory exception (Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space) before the stored values reach 1,600,000.
The IDE I use is Eclipse, so the next step was to increase the JVM heap size from the default value to 1 gig (using the commands -Xmx1000M and -Xms1000M).
Now Eclipse starts with 10 times more memory available (I can see that in the bottom right, where the total heap size and used memory are shown), but again I get the same slow performance and the same out-of-memory error AT THE SAME VALUES as before (after 1,500,000 and before 1,600,000), which is very odd.
Does anyone have an idea what the problem might be?
Thank you in advance
You don't want to increase the JVM heap for Eclipse, you want to set it for your program.
Go to Run > Run Configurations (or Debug Configurations) and set the VM Options there.
Not enough heap memory (increase it via -Xmx, e.g. -Xmx512m). When free memory runs very low, much, much time is spent by the garbage collector, which furiously scans the heap for unreachable objects.
Your hashCode() is fine, extra points for using all bits of the cycleId long.
Edit: Now I see you did increase the memory, and it didn't help. First of all, are you sure you actually managed to increase the memory? You could check this with jconsole: connect to your app and see its heap size.
As an alternative explanation to be verified: is there any particular pattern in your cycleId that could make this hashCode() implementation bad? Like its 32 high-order bits being mostly similar to its 32 low-order bits. (Yeah, right.)
But no. Even if that were the case, you would be seeing a gradual degradation of performance, not a sharp drop at a specific point (and you do get an OutOfMemoryError and frenzied GC activity). So my best guess is still a memory issue. You either didn't increase the heap size as you thought, or there is some other code grabbing memory at some point. (You could use a tool like VisualVM to profile this, get a heap dump upon OOME, and see what objects it contains.)
Edit 2: I made bold the correct part of the above.
The memory size available to the application you start from Eclipse should be configured from the Run menu. Try:
Run -> Run Configurations -> Arguments
-> VM Arguments -> -Xmx1000M
The reason your program is slow is the garbage collector: it kicks in each time memory is about to hit the limit.
Have you tested your hashCode method implementation? It always returns 31 for any value of cycleId. No surprise that your HashMap works slowly: it has linear performance.
If you want to increase the memory your program can use, it won't help to increase Eclipse's heap size. You must put the parameter into the VM parameters of your program's launch configuration.
The JVM throws 'out of memory' NOT based on available memory alone. It is thrown when too much time is being spent on garbage collection. Check this. The exact implementation details vary by JVM and garbage collector implementation.
Increasing memory would not help in this case. You may have to choose another approach.
Maybe your computer doesn't have enough memory, hence it has to swap to disk.
How are you initializing your HashSet? You need to be aware of its growth pattern. With every add operation, it checks whether it is getting close to capacity. If it reaches a certain point (determined by its 'load factor'), it performs a resizing operation, which can be expensive. From the JavaDoc (of HashMap, the collection that backs HashSet):
As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
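A minimal sketch of sizing the set up front per that rule (using the question's 10,000,000 figure; Cycle is the question's own type):

// capacity > expected / loadFactor  =>  no rehash while adding up to 'expected' entries
int expected = 10000000;
float loadFactor = 0.75f;
Set<Cycle> cycles = new HashSet<Cycle>((int) (expected / loadFactor) + 1, loadFactor);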
I'm pretty disappointed at the number of answers telling the OP to increase his heap size in his application. That's not a solution--that's a quick-and-dirty patch, which won't address any underlying problem.
I found this presentation extremely informative:
http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/memory-efficient-java-tutorial.pdf
Mainly the page listing the minimal byte sizes of each collection when empty:
ArrayList: 40 or 48
LinkedList: 48
HashMap: 56 or 120
HashSet: 72 or 136
Turns out that a HashSet is practically a HashMap, and (counterintuitively) takes up more memory, despite holding only values instead of key-value pairs.

At what point is it worth reusing arrays in Java?

How big does a buffer need to be in Java before it's worth reusing?
Or, put another way: I can repeatedly allocate, use, and discard byte[] objects, OR run a pool to keep and reuse them. I might allocate a lot of small buffers that get discarded often, or a few big ones that don't. At what size is it cheaper to pool them than to reallocate, and how do small allocations compare to big ones?
EDIT:
OK, specific parameters. Say an Intel Core 2 Duo CPU and the latest VM version for the OS of choice. This question isn't as vague as it sounds... a little code and a graph could answer it.
EDIT2:
You've posted a lot of good general rules and discussions, but the question really asks for numbers. Post 'em (and code too)! Theory is great, but the proof is the numbers. It doesn't matter if results vary somewhat from system to system; I'm just looking for a rough estimate (order of magnitude). Nobody seems to know whether the performance difference will be a factor of 1.1, 2, 10, or 100+, and this is something that matters. It is important for any Java code working with big arrays: networking, bioinformatics, etc.
Suggestions to get a good benchmark:
Warm up the code before running it in the benchmark. Methods should all be called at least 10000 times to get full JIT optimization.
Make sure benchmarked methods run for at least 10 seconds, and use System.nanoTime if possible, to get accurate timings.
Run the benchmark on a system that is only running minimal applications.
Run the benchmark 3-5 times and report all times, so we see how consistent it is.
I know this is a vague and somewhat demanding question. I will check this question regularly, and answers will get comments and rated up consistently. Lazy answers will not (see below for criteria). If I don't have any answers that are thorough, I'll attach a bounty. I might anyway, to reward a really good answer with a little extra.
What I know (and don't need repeated):
Java memory allocation and GC are fast and getting faster.
Object pooling used to be a good optimization, but now it hurts performance most of the time.
Object pooling is "not usually a good idea unless objects are expensive to create." Yadda yadda.
What I DON'T know:
How fast should I expect memory allocations to run (MB/s) on a standard modern CPU?
How does allocation size affect allocation rate?
What's the break-even point for number/size of allocations vs. re-use in a pool?
Routes to an ACCEPTED answer (the more the better):
A recent whitepaper showing figures for allocation & GC on modern CPUs (recent as in last year or so, JVM 1.6 or later)
Code for a concise and correct micro-benchmark I can run
Explanation of how and why the allocations impact performance
Real-world examples/anecdotes from testing this kind of optimization
The Context:
I'm working on a library adding LZF compression support to Java. This library extends the H2 DBMS LZF classes, by adding additional compression levels (more compression) and compatibility with the byte streams from the C LZF library. One of the things I'm thinking about is whether or not it's worth trying to reuse the fixed-size buffers used to compress/decompress streams. The buffers may be ~8 kB, or ~32 kB, and in the original version they're ~128 kB. Buffers may be allocated one or more times per stream. I'm trying to figure out how I want to handle buffers to get the best performance, with an eye toward potentially multithreading in the future.
Yes, the library WILL be released as open source if anyone is interested in using this.
If you want a simple answer, it is that there is no simple answer. No amount of calling answers (and by implication people) "lazy" is going to help.
How fast should I expect memory allocations to run (MB/s) on a standard modern CPU?
At the speed at which the JVM can zero memory, assuming that the allocation does not trigger a garbage collection. If it does trigger garbage collection, it is impossible to predict without knowing what GC algorithm is used, the heap size and other parameters, and an analysis of the application's working set of non-garbage objects over the lifetime of the app.
How does allocation size effect allocation rate?
See above.
What's the break-even point for number/size of allocations vs. re-use in a pool?
If you want a simple answer, it is that there is no simple answer.
The golden rule is, the bigger your heap is (up to the amount of physical memory available), the smaller the amortized cost of GC'ing a garbage object. With a fast copying garbage collector, the amortized cost of freeing a garbage object approaches zero as the heap gets larger. The cost of the GC is actually determined by (in simplistic terms) the number and size of non-garbage objects that the GC has to deal with.
Under the assumption that your heap is large, the lifecycle cost of allocating and GC'ing a large object (in one GC cycle) approaches the cost of zeroing the memory when the object is allocated.
EDIT: If all you want is some simple numbers, write a simple application that allocates and discards large buffers and run it on your machine with various GC and heap parameters and see what happens. But beware that this is not going to give you a realistic answer because real GC costs depend on an application's non-garbage objects.
I'm not going to write a benchmark for you because I know that it would give you bogus answers.
EDIT 2: In response to the OP's comments.
So, I should expect allocations to run about as fast as System.arraycopy, or a fully JITed array initialization loop (about 1GB/s on my last bench, but I'm dubious of the result)?
Theoretically yes. In practice, it is difficult to measure in a way that separates the allocation costs from the GC costs.
By heap size, are you saying allocating a larger amount of memory for JVM use will actually reduce performance?
No, I'm saying it is likely to increase performance. Significantly. (Provided that you don't run into OS-level virtual memory effects.)
Allocations are just for arrays, and almost everything else in my code runs on the stack. It should simplify measuring and predicting performance.
Maybe. Frankly, I think that you are not going to get much improvement by recycling buffers.
But if you are intent on going down this path, create a buffer pool interface with two implementations. The first is a real, thread-safe buffer pool that recycles buffers. The second is a dummy pool that simply allocates a new buffer each time alloc is called and treats dispose as a no-op. Finally, allow the application developer to choose between the pool implementations via a setBufferPool method and/or constructor parameters and/or runtime configuration properties. The application should also be able to supply a buffer pool class / instance of its own making.
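A minimal sketch of that design, assuming fixed-size byte[] buffers; the names here (BufferPool, acquire, release) are illustrative, not an existing API:

import java.util.ArrayDeque;
import java.util.Deque;

interface BufferPool {
    byte[] acquire(int size);
    void release(byte[] buffer);
}

// Real pool: thread-safe, recycles buffers.
class RecyclingBufferPool implements BufferPool {
    private final Deque<byte[]> free = new ArrayDeque<byte[]>();

    public synchronized byte[] acquire(int size) {
        byte[] b = free.pollFirst();
        // Reuse only if the recycled buffer is big enough; otherwise allocate fresh.
        return (b != null && b.length >= size) ? b : new byte[size];
    }

    public synchronized void release(byte[] buffer) {
        free.addFirst(buffer);
    }
}

// Dummy pool: allocates every time; release is a no-op, so the GC reclaims buffers.
class ThrowawayBufferPool implements BufferPool {
    public byte[] acquire(int size) {
        return new byte[size];
    }

    public void release(byte[] buffer) {
        // intentionally empty
    }
}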
When it is larger than the young space.
If your array is larger than the thread-local young space, it is allocated directly in the old space. Garbage collection on the old space is far slower than on the young space, so if your array is larger than the young space, it might make sense to reuse it.
On my machine, 32 KB exceeds the young space, so it would make sense to reuse it.
You've neglected to mention anything about thread safety. If it's going to be reused by multiple threads you'll have to worry about synchronization.
An answer from a completely different direction: let the user of your library decide.
Ultimately, however optimized you make your library, it will only be a component of a larger application. And if that larger application makes infrequent use of your library, there's no reason that it should pay to maintain a pool of buffers -- even if that pool is only a few hundred kilobytes.
So create your pooling mechanism as an interface, and based on some configuration parameter select the implementation that's used by your library. Set the default to be whatever your benchmark tests determine to be the best solution (1). And yes, if you use an interface you'll have to rely on the JVM being smart enough to inline calls (2).
(1) By "benchmark," I mean a long-running program that exercises your library outside of a profiler, passing it a variety of inputs. Profilers are extremely useful, but so is measuring the total throughput after an hour of wall-clock time. On several different computers with differing heap sizes, and several different JVMs, running in single and multi-threaded modes.
(2) This can get you into another line of debate about the relative performance of the various invoke opcodes.
Short answer: don't buffer.
The reasons are as follows:
Don't optimize yet; wait until it becomes a bottleneck.
If you recycle, the overhead of managing the pool will be another bottleneck.
Try to trust the JIT. In the latest JVMs, your array may be allocated on the stack rather than the heap.
Trust me, the JRE usually handles this faster and better than you can DIY.
Keep it simple, for easier reading and debugging.
When you should recycle an object:
Only if it is heavy. Memory size alone won't make it heavy, but native resources and CPU cycles do, since they cost additional finalization and CPU cycles.
You may want to recycle buffers if they are ByteBuffers rather than byte[], as sketched below.
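A small sketch of why a direct ByteBuffer is a better recycling candidate (my illustration; allocateDirect reserves native memory outside the heap, which is comparatively expensive to create and free):

import java.nio.ByteBuffer;

class DirectBufferReuse {
    // Costly native allocation, done once and kept for the object's lifetime.
    private final ByteBuffer buf = ByteBuffer.allocateDirect(32 * 1024);

    void handleNextChunk() {
        buf.clear(); // reset position and limit so the same native memory is reused
        // ... read into buf, flip(), then drain it ...
    }
}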
Keep in mind that cache effects will probably be more of an issue than the cost of "new int[size]" and its corresponding collection. Reusing buffers is therefore a good idea if you have good temporal locality. Reallocating the buffer instead of reusing it means you might get a different chunk of memory each time. As others mentioned, this is especially true when your buffers don't fit in the young generation.
If you allocate but then don't use the whole buffer, it also pays to reuse as you don't waste time zeroing out memory you never use.
I forgot that this is a managed-memory system.
Actually, you probably have the wrong mindset. The appropriate way to determine when it is useful is dependent on the application, system it is running on, and user usage pattern.
In other words - just profile the system, determine how much time is being spent in garbage collection as a percentage of total application time in a typical session, and see if it is worthwhile to optimize that.
You will probably find out that GC isn't even being called at all, so writing code to optimize this would be a complete waste of time.
With today's large memory spaces, I suspect 90% of the time it isn't worth doing at all. You can't really determine this based on parameters; it is too complex. Just profile: easy and accurate.
Looking at a micro-benchmark (code below), there is no appreciable difference in time on my machine regardless of the size and the number of times the array is used (I am not posting the times; you can easily run it on your machine :-). I suspect this is because the garbage is alive for so short a time that there is not much to do for cleanup. Array allocation should probably be a call to calloc or malloc/memset. Depending on the CPU this will be a very fast operation. If the arrays survived long enough to make it past the initial GC area (the nursery), then the version that allocates several arrays might take a bit longer.
Code:
import java.util.Random;

public class Main
{
    public static void main(String[] args)
    {
        final int size;
        final int times;
        size = 1024 * 128;
        times = 100;
        // uncomment only one of the ones below for each run
        test(new NewTester(size), times);
        // test(new ReuseTester(size), times);
    }

    private static void test(final Tester tester, final int times)
    {
        final long total;
        // warmup
        testIt(tester, 1000);
        total = testIt(tester, times);
        System.out.println("took: " + total);
    }

    private static long testIt(final Tester tester, final int times)
    {
        long total;
        total = 0;
        for (int i = 0; i < times; i++)
        {
            final long start;
            final long end;
            final int value;
            start = System.nanoTime();
            value = tester.run();
            end = System.nanoTime();
            total += (end - start);
            // make sure the value is used so the VM cannot optimize too much
            System.out.println(value);
        }
        return (total);
    }
}

interface Tester
{
    int run();
}

abstract class AbstractTester
    implements Tester
{
    protected final Random random;

    {
        random = new Random(0);
    }

    public final int run()
    {
        int value;
        value = 0;
        // make sure the random number generator always has the same work to do
        random.setSeed(0);
        // make sure that we have something to return so the VM cannot optimize the code out of existence
        value += doRun();
        return (value);
    }

    protected abstract int doRun();
}

class ReuseTester
    extends AbstractTester
{
    private final int[] array;

    ReuseTester(final int size)
    {
        array = new int[size];
    }

    public int doRun()
    {
        final int size;
        // make sure the lookup of array.length happens once
        size = array.length;
        for (int i = 0; i < size; i++)
        {
            array[i] = random.nextInt();
        }
        return (array[size - 1]);
    }
}

class NewTester
    extends AbstractTester
{
    private int[] array;
    private final int length;

    NewTester(final int size)
    {
        length = size;
    }

    public int doRun()
    {
        final int size;
        // make sure the lookup of the length happens once
        size = length;
        array = new int[size];
        for (int i = 0; i < size; i++)
        {
            array[i] = random.nextInt();
        }
        return (array[size - 1]);
    }
}
I came across this thread and, since I was implementing a Floyd-Warshall all-pairs connectivity algorithm on a graph with one thousand vertices, I tried implementing it both ways (re-using matrices or creating new ones) and checked the elapsed times.
For the computation I need 1000 different matrices of size 1000 x 1000, so it seems a decent test.
My system is Ubuntu Linux with the following virtual machine.
java version "1.7.0_65"
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
Re-using matrices was about 10% slower (average running time over 5 executions: 17354 ms vs 15708 ms). I don't know whether creating new matrices would still be faster if the matrices were much bigger.
Here is the relevant code:
private void computeSolutionCreatingNewMatrices() {
    computeBaseCase();
    smallest = Integer.MAX_VALUE;
    for (int k = 1; k <= nVertices; k++) {
        current = new int[nVertices + 1][nVertices + 1];
        for (int i = 1; i <= nVertices; i++) {
            for (int j = 1; j <= nVertices; j++) {
                if (previous[i][k] != Integer.MAX_VALUE && previous[k][j] != Integer.MAX_VALUE) {
                    current[i][j] = Math.min(previous[i][j], previous[i][k] + previous[k][j]);
                } else {
                    current[i][j] = previous[i][j];
                }
                smallest = Math.min(smallest, current[i][j]);
            }
        }
        previous = current;
    }
}

private void computeSolutionReusingMatrices() {
    computeBaseCase();
    current = new int[nVertices + 1][nVertices + 1];
    smallest = Integer.MAX_VALUE;
    for (int k = 1; k <= nVertices; k++) {
        for (int i = 1; i <= nVertices; i++) {
            for (int j = 1; j <= nVertices; j++) {
                if (previous[i][k] != Integer.MAX_VALUE && previous[k][j] != Integer.MAX_VALUE) {
                    current[i][j] = Math.min(previous[i][j], previous[i][k] + previous[k][j]);
                } else {
                    current[i][j] = previous[i][j];
                }
                smallest = Math.min(smallest, current[i][j]);
            }
        }
        matrixCopy(current, previous);
    }
}

private void matrixCopy(int[][] source, int[][] destination) {
    assert source.length == destination.length : "matrix sizes must be the same";
    for (int i = 0; i < source.length; i++) {
        assert source[i].length == destination[i].length : "matrix sizes must be the same";
        System.arraycopy(source[i], 0, destination[i], 0, source[i].length);
    }
}
More important than buffer size are the number of objects allocated and the total memory allocated.
Is memory usage an issue at all? If it is a small app, it may not be worth worrying about.
The real advantage of pooling is avoiding memory fragmentation. The overhead of allocating/freeing memory is small, but the disadvantage is that if you repeatedly allocate many objects of many different sizes, memory becomes more fragmented. Using a pool prevents fragmentation.
I think the answer you need is related to the 'order' (measuring space, not time!) of the algorithm.
Copy file example
For example, if you want to copy a file, you need to read from an input stream and write to an output stream. The TIME order is O(n), because the time will be proportional to the size of the file. But the SPACE order will be O(1), because the program doing it occupies a fixed amount of memory (you need only one fixed buffer). In this case it's clear that it's convenient to reuse the very buffer you instantiated at the beginning of the program.
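A minimal sketch of that O(1)-space copy (my illustration; the 8 KB buffer size is an arbitrary choice, and the caller owns the streams):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // one fixed buffer, reused for the entire file
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }
}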
Relate the buffer policy to your algorithm's execution structure
Of course, if your algorithm needs an endless supply of buffers and each buffer is a different size, you probably cannot reuse them. But this gives you some clues:
Try to fix the size of the buffers (even sacrificing a little bit of memory).
Try to see the structure of the execution: for example, if your algorithm traverses some kind of tree and your buffers are related to each node, maybe you only need O(log n) buffers... so you can make an educated guess of the space required.
Also, if you need different buffers but can arrange things to share different segments of the same array... maybe that's a better solution.
When you release a buffer, you can add it to a pool of buffers. That pool can be a heap ordered by the "fitting" criteria (buffers that fit the most should be first).
What I'm trying to say is: there's no fixed answer. If you instantiated something that you can reuse... it's probably better to reuse it. The tricky part is finding how to do so without incurring buffer-management overhead. Here's where algorithm analysis comes in handy.
Hope it helps... :)
