I am doing some experiments on memory. The first problem I ran into is how to allocate a given amount of memory at runtime, say 500MB. I need the program's process to hold it until the program exits.
I guess there may be several ways to achieve this? I prefer a simple but practical one.
Well, Java hides memory management from you, so there are two answers to your question:
1. Create data structures of the size you are going to need and hold a reference to them in some thread until the program exits, because once there is no reference to data on the heap from an active thread, it becomes eligible for garbage collection. An int takes 4 bytes, so 500MB is roughly an int array of 125,000,000 cells, or 125 int arrays of 1,000,000 cells.
2. If you just want to have the memory allocated and available, but not filled up, then start the virtual machine with -Xms512m. This makes the VM reserve 512 MB of heap for your program on startup, but it stays empty (just allocated) until you need it (see point 1). -Xmx sets the maximum heap size your program can allocate.
public static void main( String[] args ) {
    final byte[] x = new byte[500*1024 ]; // 500 Kbytes
    final byte[] y = new byte[500*1024*1024]; // 500 Mbytes
    ...
    System.out.println( x.length + y.length );
}
jmalloc lets you do it, but I wouldn't recommend it unless you're truly an expert. You're giving up something that's central to Java - garbage collection. You might as well be writing C.
Java NIO allocates byte buffers off heap this way. I think this is where Oracle is going for memory mapping JARs and getting rid of perm gen, too.
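As a rough illustration of the off-heap route, here is a minimal sketch that uses ByteBuffer.allocateDirect from java.nio. The class name OffHeapHold, the method name allocateOffHeap and the 4096-byte page size are assumptions made up for this example, not something from the answers above.

```java
import java.nio.ByteBuffer;

public class OffHeapHold {
    // Assumed page size; only used to decide which bytes to touch.
    private static final int PAGE = 4096;

    // Reserves `bytes` of direct (off-heap) memory. The JVM normally zero-fills
    // direct buffers already, so touching one byte per page is only a precaution
    // to make sure the pages are really backed by memory.
    static ByteBuffer allocateOffHeap(int bytes) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes);
        for (int i = 0; i < bytes; i += PAGE) {
            buffer.put(i, (byte) 1);
        }
        return buffer;
    }
}
```

Keeping the returned buffer reachable until exit (for example in a static field) holds the memory, e.g. allocateOffHeap(500 * 1024 * 1024). Direct memory is capped by -XX:MaxDirectMemorySize, which on HotSpot follows the maximum heap size by default as far as I know, so a 500MB request may need that limit raised.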
I have implemented a Java program. It is basically a multithreaded service with a fixed number of threads. Each thread takes one task at a time and creates a HashSet; the size of a single HashSet can vary from 10 to 20,000+ items. At the end of each task, the result is added to a shared List inside a synchronized block.
The problem is that at some point I start getting an out-of-memory exception. After doing a bit of research, I found that this exception occurs when the GC is busy clearing memory, at which point it stops the whole world from executing anything.
Please give me suggestions for how to deal with such a large amount of data. Is HashSet the correct data structure to use? How should I deal with the memory exception? One way is to call System.gc(), which is again not good as it will slow down the whole process. Or is it possible to dispose of the "HashSet hsN" after I add it to the shared List?
Please let me know your thoughts and point out wherever I am going wrong. This service is going to deal with a huge amount of data processing.
Thanks
//business object - to save the result of thread execution
public class Location{
    Integer taskIndex;
    HashSet<Integer> hsN;
}
//task to be performed by each thread
public class MyTask implements Runnable {
    private final long task;
    MyTask(long task) {
        this.task = task;
    }
    @Override
    public void run() {
        HashSet<Integer> hsN = GiveMeResult(task);//some function calling which returns a collection of integer where the size vary from 10 to 20000
        synchronized (locations) {
            locations.add(task,hsN);
        }
    }
}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Main {
    private static final int NTHREDS = 8;
    private static List<Location> locations;
    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(NTHREDS);
        for (int i = 0; i < 216000; i++) {
            Runnable worker = new MyTask(i);
            executor.execute(worker);
        }
        // This will make the executor accept no new tasks
        // and finish all existing tasks in the queue
        executor.shutdown();
        // Wait until all tasks are finished
        while (!executor.isTerminated()) {
        }
        System.out.println("Finished all threads");
    }
}
For such an implementation, is Java the best choice, or C# (.NET 4)?
A couple of issues that I can see:
You synchronize on the MyTask object, which is created separately for each execution. You should be synchronizing on a shared object, preferably the one that you are modifying i.e. the locations object.
216,000 runs, multiplied by say 10,000 returned objects each, multiplied by a minimum of 12 bytes per Integer object is about 24 GB of memory. Do you even have that much physical memory available on your computer, let alone available to the JVM?
32-bit JVMs have a heap size limit of less than 2 GB. On a 64-bit JVM on the other hand, an Integer object takes about 16 bytes, which raises the memory requirements to over 30 GB.
With these numbers it's hardly surprising that you get an OutOfMemoryError...
PS: If you do have that much physical memory available and you still think that you are doing the right thing, you might want to have a look at tuning the JVM heap size.
EDIT:
Even with 25GB of memory available to the JVM it could still be pushing it:
Each Integer object requires 16 bytes on modern 64-bit JVMs.
You also need an 8-byte reference that will point to it, regardless of which List implementation you are using.
If you are using a linked list implementation, each entry will also have an overhead of at least 24 bytes for the list entry object.
At best you could hope to store about 1,000,000,000 Integer objects in 25GB - half that if you are using a linked list. That means that each task could not produce more than 5,000 (2,500 respectively) objects on average without causing an error.
I am unsure of your exact requirement, but have you considered returning a more compact object? For example an int[] array produced from each HashSet would only keep the minimum of 4 bytes per result without the object container overhead.
EDIT 2:
I just realized that you are storing the HashSet objects themselves in the list. HashSet objects use a HashMap internally, which in turn uses a HashMap.Entry object for each entry. On a 64-bit JVM the entry object consumes about 40 bytes of memory in addition to the stored object:
The key reference which points to the Integer object - 8 bytes.
The value reference (in a HashSet, always the same shared placeholder object) - 8 bytes.
The next entry reference - 8 bytes.
The hash value - 4 bytes.
The object overhead - 8 bytes.
Object padding - 4 bytes.
I.e. for each Integer object you need 56 bytes for storage in a HashSet. With the typical HashMap load factor of 0.75, you should add another 10 or so bytes for the HashMap array references. With 66 bytes per Integer you can only store about 400,000,000 such objects in 25 GB, without taking into account the rest of your application and any other overhead. That's less than 2,000 objects per task...
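The back-of-the-envelope numbers above can be reproduced with a few lines of arithmetic. This is only a sketch: the class name HeapEstimate and the method maxObjects are invented for the example, 25 GB is taken as 25,000,000,000 bytes, and the per-object costs (24, 48 and 66 bytes) are the estimates from this answer, not measured values.

```java
public class HeapEstimate {
    // How many objects of the given per-object cost fit into heapBytes.
    static long maxObjects(long heapBytes, long bytesPerObject) {
        return heapBytes / bytesPerObject;
    }

    public static void main(String[] args) {
        long heap = 25_000_000_000L;  // 25 GB, as assumed in the answer
        long tasks = 216_000;         // number of tasks in the question
        // Estimated bytes per stored Integer: array-backed list, linked list, HashSet
        long[] costs = {24, 48, 66};
        for (long cost : costs) {
            long objects = maxObjects(heap, cost);
            System.out.println(cost + " bytes each: " + objects
                    + " objects, about " + (objects / tasks) + " per task");
        }
    }
}
```

With these inputs the per-task budget comes out at roughly 4,800, 2,400 and 1,750 objects respectively, which matches the rounded figures quoted above.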
EDIT 3:
You would be better off storing a sorted int[] array instead of a HashSet. That array is searchable in logarithmic time for any arbitrary integer and minimizes the memory consumption to 4 bytes per number. Considering the memory I/O, it would also be as fast as (or faster than) the HashSet implementation.
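A minimal sketch of that conversion, assuming each task still builds a Set<Integer> first. The class and method names (CompactResult, toSortedArray, contains) are hypothetical; only Arrays.sort and Arrays.binarySearch come from the JDK.

```java
import java.util.Arrays;
import java.util.Set;

public class CompactResult {
    // Copies the set into a primitive array (4 bytes per value) and sorts it.
    static int[] toSortedArray(Set<Integer> set) {
        int[] values = new int[set.size()];
        int i = 0;
        for (int v : set) {
            values[i++] = v;
        }
        Arrays.sort(values);
        return values;
    }

    // Membership test in logarithmic time via binary search on the sorted array.
    static boolean contains(int[] sorted, int value) {
        return Arrays.binarySearch(sorted, value) >= 0;
    }
}
```

Each task would then store the int[] in the shared list and let the HashSet become garbage as soon as run() returns.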
If you want a more memory-efficient solution, I would use TIntHashSet or a sorted int[]. In this case, you get a Full GC before an OutOfMemoryError. Neither is the cause of the problem; both are symptoms. The cause of the problem is that you are using too much memory for the amount you are allowing as your maximum heap.
Another solution is to create tasks as you go instead of creating all your tasks in advance. You can do this by breaking your work into NTHREDS tasks instead. However, it appears that you are trying to retain every solution. If so, this won't help much. Instead you need to find a way to reduce consumption.
Depending on your distribution of numbers, a BitSet may be more efficient. It uses 1 bit per integer in a range. E.g. if your range is 0 - 20,000, it will use only 2.5 KB.
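For the BitSet idea, a small sketch could look like this. It assumes all values are non-negative and no larger than a known maximum (20,000 in the example above); BitSetResult and toBitSet are made-up names.

```java
import java.util.BitSet;

public class BitSetResult {
    // Marks every value in `values` as present; values must lie in 0..maxValue.
    static BitSet toBitSet(Iterable<Integer> values, int maxValue) {
        BitSet bits = new BitSet(maxValue + 1);
        for (int v : values) {
            bits.set(v);
        }
        return bits;
    }
}
```

bits.get(v) then answers membership and bits.cardinality() gives the number of stored values. The memory cost depends on the range, not on how many values are present, so this only pays off when the sets are reasonably dense.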
If you are going to keep 216000 * 10000 Integers in memory, you do need a huge amount of memory.
You can try setting -Xmx to the maximum your system allows and see how many objects you can store before you run out of memory.
It is not clear why you want to store the results of processing so many tasks. What is the next step? If you really need to store that much data, you probably need to use a database.
Now after doing bit of research, I found that this memory exception
occurs when GC is busy clearing the memory and at that point it stops
the whole world to execute anything.
No - not true. Memory exceptions occur because you are using more memory than was allocated to your program. Very rarely is a memory exception due to some behavior of the GC. This can happen if you configure the GC poorly.
Have you tried running with a larger -Xmx value? And why don't you just use a Hashtable for locations?
You probably need to increase the size of your heap. Please look at the -Xmx JVM setting.
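To check what a given -Xmx value actually gives you, the program can ask the runtime. A minimal sketch follows; HeapInfo and its method names are invented for illustration, while the Runtime methods are standard.

```java
public class HeapInfo {
    private static final long MIB = 1024L * 1024L;

    // Upper bound the JVM will try to use for the heap (driven by -Xmx).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    // Heap currently in use: memory claimed so far minus the free part of it.
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap:  " + maxHeapMiB() + " MiB");
        System.out.println("used heap: " + usedHeapMiB() + " MiB");
    }
}
```

Running it as java -Xmx2g HeapInfo and again without the flag shows whether the setting took effect.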
I'm reusing the same ArrayList in a for loop, and I use
for loop
results = new ArrayList<Integer>();
experts = new ArrayList<Integer>();
output = new ArrayList<String>();
....
to create new ones.
I guess this is wrong, because I'm allocating new memory. Is this correct ?
If yes, how can I empty them ?
Added: another example
I'm creating new variables each time I call this method. Is this good practice ? I mean to create new precision, relevantFound.. etc ? Or should I declare them in my class, outside the method to not allocate more and more memory ?
public static void computeMAP(ArrayList<Integer> results, ArrayList<Integer> experts) {
//compute MAP
double precision = 0;
int relevantFound = 0;
double sumprecision = 0;
thanks
ArrayList.clear() will empty them for you; note that doing it your way is also 'okay', since Java is garbage-collected, so the old allocations will eventually get cleaned up. Still, it's better to avoid lots of new allocations (and garbage generation), so the better way would be to move those declarations outside the loop and put in calls to clear inside it.
For your second example, either way is fine; primitive types are typically going to get allocated only once (on the stack, when you enter the function), declaring them inside a loop doesn't increase the cost any. It's only heap allocations (i.e. calls to new) you need to worry about.
In response to comment:
If it doesn't make sense for those things to be instance members, then don't make them such. Also, using new to 'clean' them means allocating new objects every time; definitely don't do that - if your method needs a new copy of something every time it's invoked, and it isn't used anywhere except that method, then it has no business being an instance variable.
In general, worrying about such micro-optimizations at this point is counter-productive; you only think about it if you really, absolutely have to, and then measure whether there's a benefit before doing anything.
The code snippet below measures the difference between allocating a new list inside the loop and calling clear() to reuse an existing list.
Allocating a new list is slower, as pointed out a few times above. This gives an idea of how much.
Note that the code loops 100,000 times to get those numbers. For UI code the difference may not matter. For other applications it can be a significant improvement to reuse the list.
This is the result of three runs:
Elapsed time - in the loop: 2198
Elapsed time - with clear(): 1621
Elapsed time - in the loop: 2291
Elapsed time - with clear(): 1621
Elapsed time - in the loop: 2182
Elapsed time - with clear(): 1605
Having said that, if the lists are holding hundreds or even thousands of objects, the allocation of the array itself will pale in comparison with the allocation of the objects. The performance bottleneck will be related to the objects being added to the array, not with the array.
For completeness: code was measured with Java 1.6.0_19, running on a Centrino 2 laptop with Windows. However, the main point is the difference between them, not the exact number.
import java.util.*;

public class Main {
    public static void main(String[] args) {
        // Allocates a new list inside the loop
        long startTime = System.currentTimeMillis();
        for( int i = 0; i < 100000; i++ ) {
            List<String> l1 = new ArrayList<String>();
            for( int j = 0; j < 1000; j++ )
                l1.add( "test" );
        }
        System.out.println( "Elapsed time - in the loop: " + (System.currentTimeMillis() - startTime) );

        // Reuse the list
        startTime = System.currentTimeMillis();
        List<String> l2 = new ArrayList<String>();
        for( int i = 0; i < 100000; i++ ) {
            l2.clear();
            for( int j = 0; j < 1000; j++ )
                l2.add( "test" );
        }
        System.out.println( "Elapsed time - with clear(): " + (System.currentTimeMillis() - startTime) );
    }
}
first, allocating primitive types is practically free in java, so don't worry about that.
with regard to objects, it really depends on the loop. if it's a tight loop to 100k then yes, it's a big deal to allocate 3 array list objects each time through the loop. it'd be better to allocate them outside of the loop and use List.clear().
you also have to consider where the code is running. if it's a mobile platform you will be more concerned about frequent garbage collection than you would on a server with 256GB of ram and 64 CPUs.
that all being said, no one is going to beat you up for coding for performance, whatever the platform. performance is often a trade off with code cleanliness. for example, on the android platform they recommend using the for (int i = 0 ...) syntax to loop through array lists vs. for (Object o: someList). the latter method is cleaner, but on a mobile platform the performance difference is significant. in this case i don't think clear()'ing outside of the loop makes things any harder to understand.
ArrayLists allocate a default capacity of 10 entries. These entries are references, which need 4 bytes each (depending on architecture, maybe even 8 bytes), so 40 bytes. An array list also contains an int for its "real" length, which brings it to 44 bytes. Add the default 16 bytes which every object (even without instance variables) has, and you end up with at least 60 bytes for each ArrayList(). Depending on whether you store them all, and on how many you have, this might be a performance loss.
Note, however, that starting with Java 1.6.16, the JVM has a (default off?) feature, escape analysis, which "inlines" objects within a function if no reference to those objects leaves the method's context. In this case all instance variables are compiled as "local" variables of the calling function, so no real objects will be created.
Another issue to take into consideration here is how garbage collection is affected. It is clear that reusing the same ArrayList references and using ArrayList.clear() reduces instance creations.
However, garbage collection is not so simple, and apparently here we force 'old' objects to reference 'newer' objects. That means more old-to-young references (i.e. references from objects in the old generation to ones in the young generation). These kinds of references result in more work during garbage collection (See this article for example).
I never tried to benchmark this, and I don't know how significant this is, but I thought it could be relevant for this discussion. Maybe if the number of list items significantly outnumbers the number of lists, it is not worthwhile to use the same lists.
How big does a buffer need to be in Java before it's worth reusing?
Or, put another way: I can repeatedly allocate, use, and discard byte[] objects OR run a pool to keep and reuse them. I might allocate a lot of small buffers that get discarded often, or a few big ones that don't. At what size is it cheaper to pool them than to reallocate, and how do small allocations compare to big ones?
EDIT:
Ok, specific parameters. Say an Intel Core 2 Duo CPU, latest VM version for OS of choice. This question isn't as vague as it sounds... a little code and a graph could answer it.
EDIT2:
You've posted a lot of good general rules and discussions, but the question really asks for numbers. Post 'em (and code too)! Theory is great, but the proof is the numbers. It doesn't matter if results vary some from system to system, I'm just looking for a rough estimate (order of magnitude). Nobody seems to know if the performance difference will be a factor of 1.1, 2, 10, or 100+, and this is something that matters. It is important for any Java code working with big arrays -- networking, bioinformatics, etc.
Suggestions to get a good benchmark:
Warm up code before running it in the benchmark. Methods should all be called at least 10000 times to get full JIT optimization.
Make sure benchmarked methods run for at least 10 seconds and use System.nanoTime if possible, to get accurate timings.
Run benchmark on a system that is only running minimal applications
Run benchmark 3-5 times and report all times, so we see how consistent it is.
I know this is a vague and somewhat demanding question. I will check this question regularly, and answers will get comments and rated up consistently. Lazy answers will not (see below for criteria). If I don't have any answers that are thorough, I'll attach a bounty. I might anyway, to reward a really good answer with a little extra.
What I know (and don't need repeated):
Java memory allocation and GC are fast and getting faster.
Object pooling used to be a good optimization, but now it hurts performance most of the time.
Object pooling is "not usually a good idea unless objects are expensive to create." Yadda yadda.
What I DON'T know:
How fast should I expect memory allocations to run (MB/s) on a standard modern CPU?
How does allocation size affect allocation rate?
What's the break-even point for number/size of allocations vs. re-use in a pool?
Routes to an ACCEPTED answer (the more the better):
A recent whitepaper showing figures for allocation & GC on modern CPUs (recent as in last year or so, JVM 1.6 or later)
Code for a concise and correct micro-benchmark I can run
Explanation of how and why the allocations impact performance
Real-world examples/anecdotes from testing this kind of optimization
The Context:
I'm working on a library adding LZF compression support to Java. This library extends the H2 DBMS LZF classes, by adding additional compression levels (more compression) and compatibility with the byte streams from the C LZF library. One of the things I'm thinking about is whether or not it's worth trying to reuse the fixed-size buffers used to compress/decompress streams. The buffers may be ~8 kB, or ~32 kB, and in the original version they're ~128 kB. Buffers may be allocated one or more times per stream. I'm trying to figure out how I want to handle buffers to get the best performance, with an eye toward potentially multithreading in the future.
Yes, the library WILL be released as open source if anyone is interested in using this.
If you want a simple answer, it is that there is no simple answer. No amount of calling answers (and by implication people) "lazy" is going to help.
How fast should I expect memory allocations to run (MB/s) on a standard modern CPU?
At the speed at which the JVM can zero memory, assuming that the allocation does not trigger a garbage collection. If it does trigger garbage collection, it is impossible to predict without knowing what GC algorithm is used, the heap size and other parameters, and an analysis of the application's working set of non-garbage objects over the lifetime of the app.
How does allocation size affect allocation rate?
See above.
What's the break-even point for number/size of allocations vs. re-use in a pool?
If you want a simple answer, it is that there is no simple answer.
The golden rule is, the bigger your heap is (up to the amount of physical memory available), the smaller the amortized cost of GC'ing a garbage object. With a fast copying garbage collector, the amortized cost of freeing a garbage object approaches zero as the heap gets larger. The cost of the GC is actually determined by (in simplistic terms) the number and size of non-garbage objects that the GC has to deal with.
Under the assumption that your heap is large, the lifecycle cost of allocating and GC'ing a large object (in one GC cycle) approaches the cost of zeroing the memory when the object is allocated.
EDIT: If all you want is some simple numbers, write a simple application that allocates and discards large buffers and run it on your machine with various GC and heap parameters and see what happens. But beware that this is not going to give you a realistic answer because real GC costs depend on an application's non-garbage objects.
I'm not going to write a benchmark for you because I know that it would give you bogus answers.
EDIT 2: In response to the OP's comments.
So, I should expect allocations to run about as fast as System.arraycopy, or a fully JITed array initialization loop (about 1GB/s on my last bench, but I'm dubious of the result)?
Theoretically yes. In practice, it is difficult to measure in a way that separates the allocation costs from the GC costs.
By heap size, are you saying allocating a larger amount of memory for JVM use will actually reduce performance?
No, I'm saying it is likely to increase performance. Significantly. (Provided that you don't run into OS-level virtual memory effects.)
Allocations are just for arrays, and almost everything else in my code runs on the stack. It should simplify measuring and predicting performance.
Maybe. Frankly, I think that you are not going to get much improvement by recycling buffers.
But if you are intent on going down this path, create a buffer pool interface with two implementations. The first is a real thread-safe buffer pool that recycles buffers. The second is a dummy pool which simply allocates a new buffer each time alloc is called, and treats dispose as a no-op. Finally, allow the application developer to choose between the pool implementations via a setBufferPool method and/or constructor parameters and/or runtime configuration properties. The application should also be able to supply a buffer pool class / instance of its own making.
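A minimal sketch of that design, under a few assumptions: buffers are fixed-size byte[] arrays, the recycling pool is unbounded, and the type names (BufferPool, RecyclingBufferPool, AllocatingBufferPool) are invented here. Only alloc and dispose come from the description above.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

interface BufferPool {
    byte[] alloc();
    void dispose(byte[] buffer);
}

// Recycles fixed-size buffers; safe to share between threads.
final class RecyclingBufferPool implements BufferPool {
    private final int bufferSize;
    private final ConcurrentLinkedQueue<byte[]> free = new ConcurrentLinkedQueue<>();

    RecyclingBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    @Override
    public byte[] alloc() {
        byte[] buffer = free.poll();  // null when the pool is empty
        return buffer != null ? buffer : new byte[bufferSize];
    }

    @Override
    public void dispose(byte[] buffer) {
        // Ignore foreign buffers so the pool only ever hands out bufferSize arrays.
        if (buffer != null && buffer.length == bufferSize) {
            free.offer(buffer);
        }
    }
}

// Dummy pool: always allocates, never keeps anything.
final class AllocatingBufferPool implements BufferPool {
    private final int bufferSize;

    AllocatingBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    @Override
    public byte[] alloc() {
        return new byte[bufferSize];
    }

    @Override
    public void dispose(byte[] buffer) {
        // no-op: the buffer is left to the garbage collector
    }
}
```

Note that a recycled buffer still contains whatever the previous user wrote into it, whereas a freshly allocated one is zeroed. Code that relies on zeroed buffers would behave differently with the two implementations.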
When it is larger than young space.
If your array is larger than the thread-local young space, it is directly allocated in the old space. Garbage collection on the old space is way slower than on the young space. So if your array is larger than the young space, it might make sense to reuse it.
On my machine, 32kb exceeds the young space. So it would make sense to reuse it.
You've neglected to mention anything about thread safety. If it's going to be reused by multiple threads you'll have to worry about synchronization.
An answer from a completely different direction: let the user of your library decide.
Ultimately, however optimized you make your library, it will only be a component of a larger application. And if that larger application makes infrequent use of your library, there's no reason that it should pay to maintain a pool of buffers -- even if that pool is only a few hundred kilobytes.
So create your pooling mechanism as an interface, and based on some configuration parameter select the implementation that's used by your library. Set the default to be whatever your benchmark tests determine to be the best solution (1). And yes, if you use an interface you'll have to rely on the JVM being smart enough to inline calls (2).
(1) By "benchmark," I mean a long-running program that exercises your library outside of a profiler, passing it a variety of inputs. Profilers are extremely useful, but so is measuring the total throughput after an hour of wall-clock time. On several different computers with differing heap sizes, and several different JVMs, running in single and multi-threaded modes.
(2) This can get you into another line of debate about the relative performance of the various invoke opcodes.
Short answer: Don't buffer.
The reasons are as follows:
Don't optimize it until it becomes a bottleneck
If you recycle it, the overhead of the pool management will be another bottleneck
Try to trust the JIT. In the latest JVM, your array may be allocated on the STACK rather than the HEAP.
Trust me, the JRE usually handles them faster and better than your DIY code.
Keep it simple, so it is easier to read and debug
When you should recycle an object:
Only if it is heavy. The size of memory won't make it heavy, but native resources and CPU cycles do, since they cost additional finalization and CPU cycles.
You may want to recycle them if they are "ByteBuffer" rather than byte[]
Keep in mind that cache effects will probably be more of an issue than the cost of "new int[size]" and its corresponding collection. Reusing buffers is therefore a good idea if you have good temporal locality. Reallocating the buffer instead of reusing it means you might get a different chunk of memory each time. As others mentioned, this is especially true when your buffers don't fit in the young generation.
If you allocate but then don't use the whole buffer, it also pays to reuse as you don't waste time zeroing out memory you never use.
I forgot that this is a managed-memory system.
Actually, you probably have the wrong mindset. The appropriate way to determine when it is useful is dependent on the application, system it is running on, and user usage pattern.
In other words - just profile the system, determine how much time is being spent in garbage collection as a percentage of total application time in a typical session, and see if it is worthwhile to optimize that.
You will probably find out that gc isn't even being called at all. So writing code to optimize this would be a complete waste of time.
With today's large memory spaces I suspect 90% of the time it isn't worth doing at all. You can't really determine this based on parameters - it is too complex. Just profile - easy and accurate.
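One cheap way to get the "percentage of time in GC" figure without a full profiler is the standard java.lang.management API. The sketch below is an assumption about how one might do it, not a replacement for profiling; GcShare and its methods are hypothetical names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcShare {
    // Sum of the approximate accumulated collection times of all collectors, in ms.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();  // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // GC time as a percentage of JVM uptime so far.
    static double gcPercent() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime == 0 ? 0.0 : 100.0 * totalGcMillis() / uptime;
    }
}
```

Printing gcPercent() at the end of a typical session gives a first impression. If the number is tiny, pooling buffers is unlikely to buy much.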
Looking at a micro benchmark (code below) there is no appreciable difference in time on my machine regardless of the size and the times the array is used (I am not posting the times, you can easily run it on your machine :-). I suspect that this is because the garbage is alive for so short a time there is not much to do for cleanup. Array allocation is probably a call to calloc or malloc/memset. Depending on the CPU this will be a very fast operation. If the arrays survived for a longer time to make it past the initial GC area (the nursery) then the time for the one that allocated several arrays might take a bit longer.
code:
import java.util.Random;

public class Main
{
    public static void main(String[] args)
    {
        final int size;
        final int times;

        size = 1024 * 128;
        times = 100;

        // uncomment only one of the ones below for each run
        test(new NewTester(size), times);
        // test(new ReuseTester(size), times);
    }

    private static void test(final Tester tester, final int times)
    {
        final long total;

        // warmup
        testIt(tester, 1000);
        total = testIt(tester, times);
        System.out.println("took: " + total);
    }

    private static long testIt(final Tester tester, final int times)
    {
        long total;

        total = 0;

        for(int i = 0; i < times; i++)
        {
            final long start;
            final long end;
            final int value;

            start = System.nanoTime();
            value = tester.run();
            end = System.nanoTime();
            total += (end - start);

            // make sure the value is used so the VM cannot optimize too much
            System.out.println(value);
        }

        return (total);
    }
}

interface Tester
{
    int run();
}

abstract class AbstractTester
    implements Tester
{
    protected final Random random;

    {
        random = new Random(0);
    }

    public final int run()
    {
        int value;

        value = 0;

        // make sure the random number generator always has the same work to do
        random.setSeed(0);

        // make sure that we have something to return so the VM cannot optimize the code out of existence.
        value += doRun();

        return (value);
    }

    protected abstract int doRun();
}

class ReuseTester
    extends AbstractTester
{
    private final int[] array;

    ReuseTester(final int size)
    {
        array = new int[size];
    }

    public int doRun()
    {
        final int size;

        // make sure the lookup of the array.length happens once
        size = array.length;

        for(int i = 0; i < size; i++)
        {
            array[i] = random.nextInt();
        }

        return (array[size - 1]);
    }
}

class NewTester
    extends AbstractTester
{
    private int[] array;
    private final int length;

    NewTester(final int size)
    {
        length = size;
    }

    public int doRun()
    {
        final int size;

        // make sure the lookup of the length happens once
        size = length;
        array = new int[size];

        for(int i = 0; i < size; i++)
        {
            array[i] = random.nextInt();
        }

        return (array[size - 1]);
    }
}
I came across this thread and, since I was implementing a Floyd-Warshall all pairs connectivity algorithm on a graph with one thousand vertices, I tried to implement it in both ways (re-using matrices or creating new ones) and check the elapsed time.
For the computation I need 1000 different matrices of size 1000 x 1000, so it seems a decent test.
My system is Ubuntu Linux with the following virtual machine.
java version "1.7.0_65"
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
Re-using matrices was about 10% slower (average running time over 5 executions: 17354ms vs 15708ms). I don't know whether creating new matrices would still be faster if the matrix was much bigger.
Here is the relevant code:
private void computeSolutionCreatingNewMatrices() {
    computeBaseCase();
    smallest = Integer.MAX_VALUE;
    for (int k = 1; k <= nVertices; k++) {
        current = new int[nVertices + 1][nVertices + 1];
        for (int i = 1; i <= nVertices; i++) {
            for (int j = 1; j <= nVertices; j++) {
                if (previous[i][k] != Integer.MAX_VALUE && previous[k][j] != Integer.MAX_VALUE) {
                    current[i][j] = Math.min(previous[i][j], previous[i][k] + previous[k][j]);
                } else {
                    current[i][j] = previous[i][j];
                }
                smallest = Math.min(smallest, current[i][j]);
            }
        }
        previous = current;
    }
}

private void computeSolutionReusingMatrices() {
    computeBaseCase();
    current = new int[nVertices + 1][nVertices + 1];
    smallest = Integer.MAX_VALUE;
    for (int k = 1; k <= nVertices; k++) {
        for (int i = 1; i <= nVertices; i++) {
            for (int j = 1; j <= nVertices; j++) {
                if (previous[i][k] != Integer.MAX_VALUE && previous[k][j] != Integer.MAX_VALUE) {
                    current[i][j] = Math.min(previous[i][j], previous[i][k] + previous[k][j]);
                } else {
                    current[i][j] = previous[i][j];
                }
                smallest = Math.min(smallest, current[i][j]);
            }
        }
        matrixCopy(current, previous);
    }
}

private void matrixCopy(int[][] source, int[][] destination) {
    assert source.length == destination.length : "matrix sizes must be the same";
    for (int i = 0; i < source.length; i++) {
        assert source[i].length == destination[i].length : "matrix sizes must be the same";
        System.arraycopy(source[i], 0, destination[i], 0, source[i].length);
    }
}
More important than buffer size is the number of allocated objects, and the total memory allocated.
Is memory usage an issue at all? If it is a small app, it may not be worth worrying about.
The real advantage of pooling is to avoid memory fragmentation. The overhead for allocating/freeing memory is small, but the disadvantage is that if you repeatedly allocate many objects of many different sizes, memory becomes more fragmented. Using a pool prevents fragmentation.
I think the answer you need is related to the 'order' (measuring space, not time!) of the algorithm.
Copy file example
For example, if you want to copy a file you need to read from an InputStream and write to an OutputStream. The TIME order is O(n) because the time will be proportional to the size of the file. But the SPACE order will be O(1) because the program you'll need to do it will occupy a fixed amount of memory (you'll need only one fixed buffer). In this case it's clear that it's convenient to reuse that very buffer you instantiated at the beginning of the program.
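The copy loop described here can be sketched as follows. The 8 KB buffer size and the names StreamCopy and copy are arbitrary choices for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {
    // Copies everything from in to out using one fixed buffer: O(n) time, O(1) space.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];  // allocated once, reused for every chunk
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

However large the file is, the loop never needs more than that single buffer.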
Relate the buffer policy with your algorithm execution structure
Of course, if your algorithm needs an endless supply of buffers and each buffer is a different size, you probably cannot reuse them. But it gives you some clues:
Try to fix the size of the buffers (even at the cost of sacrificing a little bit of memory).
Try to see what the structure of the execution is: for example, if your algorithm traverses some kind of tree and your buffers are related to each node, maybe you only need O(log n) buffers... so you can make an educated guess of the space required.
Also, if you need different buffers but can arrange things so that they share different segments of the same array... maybe that's a better solution.
When you release a buffer you can add it to a pool of buffers. That pool can be a heap ordered by a "fitting" criterion (buffers that fit best should come first).
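A minimal sketch of the last idea, a pool that hands back the best-fitting free buffer. Note that it swaps the heap for a sorted map (TreeMap), because ceilingEntry gives "smallest buffer that is large enough" directly. The class and method names are made up for illustration, and the sketch is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

final class BestFitBufferPool {
    // Free buffers grouped by capacity; the TreeMap keeps capacities sorted.
    private final TreeMap<Integer, Deque<byte[]>> free = new TreeMap<>();

    /** Returns the smallest pooled buffer holding at least minSize bytes, or a new one. */
    byte[] acquire(int minSize) {
        Map.Entry<Integer, Deque<byte[]>> entry = free.ceilingEntry(minSize);
        if (entry == null) {
            return new byte[minSize]; // nothing suitable in the pool: allocate
        }
        Deque<byte[]> bucket = entry.getValue();
        byte[] buffer = bucket.pop();
        if (bucket.isEmpty()) {
            free.remove(entry.getKey()); // drop empty buckets so lookups stay accurate
        }
        return buffer;
    }

    /** Puts a buffer back into the pool for later reuse. */
    void release(byte[] buffer) {
        free.computeIfAbsent(buffer.length, k -> new ArrayDeque<>()).push(buffer);
    }
}
```

A caller would acquire(n), use the first n bytes, and release the buffer when done. Whether the bookkeeping pays off against plain allocation is something to measure for the workload at hand.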
What I'm trying to say is: there's no fixed answer. If you instantiated something that you can reuse... it's probably better to reuse it. The tricky part is finding a way to do it without incurring buffer-management overhead. That's where the algorithm analysis comes in handy.
Hope it helps... :)
I'm having a contest with another student to make the fastest version of our homework assignment, and I'm not using an ArrayList for performance reasons (resizing the array myself cut the benchmark time from 56 seconds to 4), but I'm wondering how much I should resize the array every time I need to. Specifically the relevant parts of my code are this:
private Node[] list;
private int size; // The number of items in the list
private static final int N; // How much to resize the list by every time
public MyClass(){
list = new Node[N];
}
public void add(Node newNode){
if(size == list.length){
list = Arrays.copyOf(list, size + N);
}
list[size] = newNode;
size++;
}
TL;DR: What should I make N?
It's recommended to double the size of the array when resizing. Doubling the size leads to amortized linear-time cost.
The naive idea is that there are two costs associated with the resize value:
Copying performance costs - costs of copying the elements from previous array to new one, and
Memory overhead costs - cost of the allotted memory that is not used.
If you were to resize the array by adding one element at a time, the memory overhead would be zero, but the copying cost would become quadratic. If you were to allocate too many slots, the copying cost would be linear, but the memory overhead would be too high.
Doubling leads to a linear amortized cost (i.e. over a long time, the cost of copying is linear with respect to the size of the array), and you are guaranteed not to waste more than half of the array.
UPDATE: By the way, apparently Java's ArrayList expands by a factor of 3/2. This makes it a bit more memory-conservative, but costs a bit more in terms of copying. Benchmarking for your use case wouldn't hurt.
MINOR Correction: Doubling makes the total cost of resizing linear (amortized), which means each insertion takes amortized constant time. Check CMU's lecture on Amortized Analysis.
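To put numbers on the quadratic-versus-linear argument, here is a small sketch that only counts how many element copies each growth policy performs while appending n items. The class and method names are hypothetical, and it simulates the capacities rather than allocating real arrays.

```java
final class GrowthCost {
    /** Counts element copies while appending n items, growing capacity by a fixed step. */
    static long copiesWithFixedStep(int n, int step) {
        long copies = 0;
        int capacity = step;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;   // a resize copies every existing element
                capacity += step;
            }
        }
        return copies;
    }

    /** Counts element copies while appending n items, doubling capacity when full. */
    static long copiesWithDoubling(int n, int initialCapacity) {
        long copies = 0;
        int capacity = initialCapacity;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;
                capacity *= 2;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("fixed step 10: " + copiesWithFixedStep(n, 10)); // 49500
        System.out.println("doubling:      " + copiesWithDoubling(n, 10));  // 1270
    }
}
```

With doubling, the resize sizes form a geometric series, so the total number of copies stays below 2n. With a fixed step, the total grows roughly as n squared divided by twice the step.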
3/2 is likely chosen as "something that divides cleanly but is less than phi". There was an epic thread on comp.lang.c++.moderated back in November 2003 exploring how phi establishes an upper bound on reusing previously-allocated storage during reallocation for a first-fit allocator.
See post #7 from Andrew Koenig for the first mention of phi's application to this problem.
If you know roughly how many items there are going to be, then pre-assign the array or the ArrayList to that size, and you'll never have to expand. Unbeatable performance!
Failing that, a reasonable way to achieve good amortized cost is to keep increasing by some percentage. 100% or 50% are common.
You should resize your lists as a multiple of the previous size, rather than adding a constant amount each time.
for example:
newSize = oldSize * 2;
not
newSize = oldSize + N;
Double the size each time you need to resize unless you know that more or less would be best.
If memory isn't an issue, just start off with a big array to begin with.
Your code seems to do pretty much what ArrayList does. If you know you will be using a large list, you can pass it an initial size when you create the list and avoid resizing altogether. This of course assumes that you're going for raw speed and that memory consumption is not an issue.
From the comments of one of the answers:
The problem is that memory isn't an issue, but I'm reading an arbitrarily large file.
Try this:
new ArrayList<Node>((int)file.length());
You could do it with your array as well. Then there should be no resizing in either case, since the array will be the size of the file (assuming that the file's length fits in an int...).
For maximum performance, you're going to want to resize as rarely as possible. Set the initial size to be as large as you'll typically need, rather than starting with N elements. The value you choose for N will matter less in that case.
If you are going to create a large number of these list objects, of varying sizes, then you'll want to use a pool-based allocator, and not free the memory until you exit.
And to eliminate the copy operation altogether, you can use a list of arrays.
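One way the list-of-arrays idea could look, as a sketch only: elements live in fixed-size chunks, and growing means adding a chunk instead of copying what is already stored. ChunkedList and CHUNK_SIZE are hypothetical names and values. The outer ArrayList of chunk references still grows in the usual way, but it holds only one reference per chunk.

```java
import java.util.ArrayList;
import java.util.List;

final class ChunkedList<T> {
    private static final int CHUNK_SIZE = 1024; // hypothetical chunk size
    private final List<Object[]> chunks = new ArrayList<>();
    private int size;

    /** Appends an item; when the last chunk is full, a new chunk is added. */
    void add(T item) {
        if (size == chunks.size() * CHUNK_SIZE) {
            chunks.add(new Object[CHUNK_SIZE]); // existing elements are never moved
        }
        chunks.get(size / CHUNK_SIZE)[size % CHUNK_SIZE] = item;
        size++;
    }

    /** Returns the item at the given index by locating its chunk and offset. */
    @SuppressWarnings("unchecked")
    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (T) chunks.get(index / CHUNK_SIZE)[index % CHUNK_SIZE];
    }

    int size() {
        return size;
    }
}
```

The trade-off is an extra indirection on every access, so it is worth benchmarking against a plain doubling array before committing to it.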
Here's an analogy for you: long, long ago, when I used to work on a mainframe, we used a filing system called VSAM, which required you to specify the initial file size and the amount of free space required.
Whenever the amount of free space dropped below the required threshold, that amount of free space would be allocated in the background while the program continued to process.
It would be interesting to see if this could be done in Java, using a separate thread to allocate the additional space and 'attach' it to the end of the array while the main thread continues processing.