So what I have here is a Java program that manipulates a huge amount of data and stores it in objects (mainly HashMaps). At some point during the run the data becomes useless and I need to discard it so I can free up some memory.
My question is: what is the best way to discard this data so that it gets garbage collected?
I have tried map.clear(), however this is not enough to free the memory allocated by the map.
EDIT (To add alternatives I have tried)
I have also tried System.gc() to force the garbage collector to run, however it did not help.
HashMap#clear will throw all entries out of the HashMap, but it will not shrink it back to its initial capacity. That means you will have an empty backing array with (in your case, I guess) space for tens of thousands of entries.
If you do not intend to re-use the HashMap (with roughly the same amount of data), just throw away the whole HashMap instance (set it to null).
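For illustration, a minimal sketch of the difference between clearing and dropping the map (the sizes here are made up):

import java.util.HashMap;
import java.util.Map;

public class ClearVsNull {
    public static void main(String[] args) {
        Map<Integer, byte[]> map = new HashMap<>();
        for (int i = 0; i < 100_000; i++) {
            map.put(i, new byte[64]);   // grows the backing table as entries are added
        }

        map.clear();   // entries become unreachable, but the enlarged backing array is kept
        map = null;    // drops the whole HashMap, backing array included, so it can be collected
    }
}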
In addition to the above:
if the entries of the Map are still referenced by some other part of your system, they won't be garbage-collected even when they are removed from the Map (because they are needed elsewhere)
Garbage collection happens in the background, and only when it is required. So you may not immediately see a lot of memory being freed, and that may not be a problem.
System.gc()
is not recommended, as the JVM should be the only one taking care of garbage collection. Use the class WeakHashMap<K,V> in this case.
Entries will automatically be removed once their keys are no longer strongly referenced.
Please read this link for reference
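To make that concrete, here is a minimal sketch of a WeakHashMap dropping an entry once its key becomes unreachable (GC timing is not guaranteed, so the second size may occasionally still be 1):

import java.util.Map;
import java.util.WeakHashMap;

public class WeakHashMapDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<Object, String> cache = new WeakHashMap<>();

        Object key = new Object();
        cache.put(key, "some large value");
        System.out.println(cache.size());   // 1

        key = null;        // no strong reference to the key remains
        System.gc();       // only a hint
        Thread.sleep(100);

        System.out.println(cache.size());   // usually 0: the entry was dropped with its key
    }
}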
Related
Backstory: So I had this great idea, right? Sometimes you're collecting a massive amount of data, and you don't need to access all of it all the time, but you also may not need it after the program has finished, and you don't really want to muck around with database tables, etc. What if you had a library that would silently and automatically serialize objects to disk when you're not using them, and silently bring them back when you needed them? So I started writing a library; it has a number of collections like "DiskList" or "DiskMap" where you put your objects. They keep your objects via WeakReferences. While you're still using a given object, it has strong references to it, so it stays in memory. When you stop using it, the object is garbage collected, and just before that happens, the collection serializes it to disk (*). When you want the object again, you ask for it by index or key, like usual, and the collection deserializes it (or returns it from its inner cache, if it hasn't been GCd yet).
(*) See now, this is the sticking point. In order for this to work, I need to be able to be notified JUST BEFORE the object is GCd - after no other references to it exist (and therefore the object can no longer be modified), but before the object is wiped from memory. This is proving difficult. I thought briefly that using a ReferenceQueue would save me, but alas, it returns a Reference, whose referent has thus far always been null.
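To show the sticking point concretely, here is a minimal sketch of the ReferenceQueue behavior described above; by the time the reference is enqueued, its referent has already been cleared (GC timing is not guaranteed, so the loop may spin for a while):

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class ReferenceQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload, queue);

        payload = null;                    // drop the last strong reference
        Reference<?> dequeued = null;
        while (dequeued == null) {
            System.gc();                   // only a hint
            dequeued = queue.remove(100);  // wait up to 100 ms for the reference to show up
        }

        // The referent is already gone: there is no window in which to serialize it from here.
        System.out.println(dequeued.get());   // prints "null"
    }
}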
Is there a way, having been given an arbitrary object, to receive (via callback or queue, etc.) the object after it is ready to be garbage collected, but before it IS garbage collected?
I know (Object).finalize() can basically do that, but I'll have to deal with classes that don't belong to me, and whose finalize methods I can't legitimately override. I'd prefer not to go as arcane as custom classloaders, bytecode manipulation, or reflection, but I will if I have to.
(Also, if you know of existing libraries that do transparent disk caching, I'd look favorably on that, though my requirements on such a library would be fairly stringent.)
You can look for a cache that supports write-behind caching and tiering. Notable products would be EHCache, Hazelcast and Infinispan.
Or you can build something yourself with a cache and a time-to-idle expiry.
Then, the cache access would be "the usage" of the object.
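A very rough sketch of the do-it-yourself variant (time-to-idle eviction that hands expired entries to a write-behind callback; all class and method names here are invented placeholders):

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Illustrative only: entries not touched for idleMillis are passed to a
// write-behind callback and then dropped from memory.
class IdleEvictingCache<K, V> {
    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long idleMillis;

    IdleEvictingCache(long idleMillis) {
        this.idleMillis = idleMillis;
    }

    void put(K key, V value) {
        map.put(key, new Entry<>(value));
    }

    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) {
            return null;                                // caller would reload from disk here
        }
        e.lastAccess = System.currentTimeMillis();      // the cache access is "the usage"
        return e.value;
    }

    // Call this from a periodic timer or scheduled executor.
    void evictIdle(BiConsumer<K, V> writeBehind) {
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<K, Entry<V>>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, Entry<V>> me = it.next();
            if (now - me.getValue().lastAccess > idleMillis) {
                writeBehind.accept(me.getKey(), me.getValue().value);   // e.g. serialize to disk
                it.remove();
            }
        }
    }

    private static final class Entry<V> {
        final V value;
        volatile long lastAccess = System.currentTimeMillis();
        Entry(V value) { this.value = value; }
    }
}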
Is there a way, having been given an arbitrary object, to receive (via callback or queue, etc.) the object after it is ready to be garbage collected, but before it IS garbage collected?
This interferes heavily with garbage collection. Chances are high that it will bring down your application or the whole system. What you want to do is start disk I/O and potentially allocate additional objects at the very moment the system is low on, or out of, memory. If you manage to make it work, you'll end up using more heap than before, since the heap must always be extended when the GC kicks in.
How can I manually delete a specific object before the garbage collector would ever collect it?
For example, I want to delete the requestToken object. How can I do that?
The short answer is that you can't, and that you don't need to. The GC will reclaim the memory when it needs to ... and there is no reason to interfere with that.
The only situation I can think of for needing to delete an object sooner is when the object contains information that needs to be erased ... for information security reasons. (The classic example is when you are processing a password provided by the user and you are worried that it might leak via a core dump or something.) In that case, you need to implement a method on your object that erases its fields by overwriting them. But this requires careful design; e.g. you have to make sure that you find and erase all traces of the information.
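As a hedged sketch of that "erase by overwriting" idea (the class and field names are made up for the example):

import java.util.Arrays;

// Illustrative only: a credential holder that can wipe its own state.
final class Secret {
    private final char[] password;

    Secret(char[] password) {
        this.password = password;    // keep the only copy; avoid creating Strings from it
    }

    void erase() {
        Arrays.fill(password, '\0'); // overwrite the sensitive characters in place
    }
}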
It is also sometimes necessary to "help" the GC a bit to avoid potential memory leaks. A classic example of this is the ArrayList class, which uses a Java array to represent the list content. The array is often larger than the list's logical size, and the elements of the array could contain pointers to objects that have been removed from the list. The ArrayList class deals with this by assigning null to these elements.
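A simplified sketch of that pattern, roughly what a list's removal code does internally (this is not the real ArrayList source):

import java.util.Arrays;
import java.util.NoSuchElementException;

// A toy array-backed stack that nulls out slots so removed elements do not
// stay reachable through the backing array.
class SimpleStack<E> {
    private Object[] elements = new Object[16];
    private int size;

    void push(E e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = e;
    }

    @SuppressWarnings("unchecked")
    E pop() {
        if (size == 0) throw new NoSuchElementException();
        E result = (E) elements[--size];
        elements[size] = null;   // "help" the GC: drop the stale reference
        return result;
    }
}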
Note that neither of these examples involve actually deleting objects. In both cases, the problem / issue is addressed another way.
It is also worth noting that calling System.gc() is usually a bad idea:
It is not guaranteed to do anything at all.
In most situations, it won't do anything that wouldn't happen anyway.
In most situations, it is inefficient. The JVM is in a better position than application code to know the ergonomically most efficient time to run the GC. (Read this for a first-principles explanation of the ergonomics.)
The only cases where running the GC in production code is advisable are when you are trying to manage GC pauses, and you know that a pause is acceptable at a particular point in time. (For example, when you are changing levels in an interactive game ... )
You cannot delete an object; you can only try to make it eligible for garbage collection in its next cycle. The best you can do is set the reference to null and call System.gc();
Note: a System.gc() call only requests that the JVM run the garbage collector; it cannot force it to.
I am generating a large data structure and writing it to the hard disk. Afterwards I want to get rid of the object to reduce memory consumption. My problem is that even after I have forced a garbage collection, the amount of used memory is at least as high as it was before. I have added a minimal working example of what I am doing.
DataStructure data = new DataStructure();
data.generateStructure(pathToData);
Writer.writeData(data);
WeakReference<Object> ref = new WeakReference<Object>(data);
data = null;                    // drop the strong reference
while (ref.get() != null) {     // spin until the weak reference has been cleared
    System.gc();
}
The code should force garbage collection of the data object, as recommended in this thread:
Forcing Garbage Collection in Java?
I know this way of forcing garbage collection does not guarantee that the data object is deleted, but in the past I had more success using the approach described at the link than by simply calling System.gc().
Maybe someone has an answer as to the best way to get rid of large objects.
It seems that this is premature optimization (or rather an illusion of it). System.gc(); is not guaranteed to force a garbage collection. What you are doing here is busy-waiting for some non-guaranteed GC to happen. But if the heap does not get filled up, the JVM might not start a garbage collection at all.
I think that you should start thinking about this problem when you stress test your application and you can identify this part as a bottleneck.
So, in a nutshell, you can't really force a GC, and this is intentional. The JVM knows when and how to free up space. I think that if you clear your references and call System.gc(), you can move on without caring about whether it gets cleaned up or not. You may read the official documentation about how to fine-tune the garbage collector. Rather than asking Java to GC from your code, do some GC tuning according to that documentation.
Just a side note: the JVM will expand some of the heap's generations if the need arises. As far as I know there are configuration options that set the percentage of free heap at which the JVM will shrink or grow the heap. Use -XX:MinHeapFreeRatio / -XX:MaxHeapFreeRatio if you don't want Java to reserve memory that it does not need.
This idiom is broken for a whole range of reasons; here are some:
System.gc() doesn't force anything; it is just a hint to the garbage collector;
there is no guarantee when a weak reference will be cleared. The spec says "Suppose that the garbage collector determines at a certain point in time that an object is weakly reachable". When that happens, it is up to the implementation;
even after the weak reference is cleared, there is no telling when its referent's memory will actually be reclaimed. The only thing you know at that point is that the object has transitioned from "weakly reachable" to "finalizable". It may even be resurrected from the finalizer.
From my experience, just calling System.gc() a fixed number of times, for example three, with delays of half a second to a second between them (your GC could be ConcurrentMarkSweep), gives much more stable results than these supposedly "smart" approaches.
A final note: never use System.gc in production code. It is almost impossible to make it bring any value to your project. I use it only for microbenchmarking.
UPDATE
In the comments you provide a key piece of information which is not in your question: you are interested in reducing the total heap size (Runtime#totalMemory) after you are done with your object, and not just the heap occupancy (Runtime#totalMemory-Runtime#freeMemory). This is completely outside of programmatic control and on certain JVM implementations it never happens: once the heap has increased, the memory is never released back to the operating system.
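To make the distinction concrete, a small sketch of the two measurements:

public class HeapStats {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory();          // current heap size reserved from the OS
        long used  = total - rt.freeMemory();   // heap occupancy: what objects currently take
        System.out.printf("heap size: %d MB, used: %d MB%n",
                total / (1024 * 1024), used / (1024 * 1024));
    }
}

Seeing used drop while heap size stays constant after you release the data structure is the expected outcome; shrinking the heap size itself is up to the JVM.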
I am a newbie in Java and my English is not very good; I hope everyone will forgive me.
Question 1:
I have an ArrayList (J2SE) or Vector (J2ME), and I have a class (for example, Bullet). When I fire, I add an instance of that class to the list, and after a bullet hits the target I need to destroy it and remove it from the list. I want to ask: how do I completely delete the object I need to remove, i.e. free all the memory that object takes (the same as deleting a pointer in C++)? With a normal object we can use "= null", but here the object is inside a List and we cannot do that. I tried using System.gc(), but that is a bad idea: the program slows down and memory increases more than when I do not use gc(). If I only use List.remove(bullet_index), memory increases a bit, but it does not decrease.
Question 2:
Does anyone have another idea for making a gun that can shoot an unlimited number of bullets while staying safe with memory?
I am making a simple 2D shooting game.
You simply can't. Java works over the JVM which provides a managed environment for the allocation and deallocation of your memory through a garbage collector.
You can hint to the JVM that it should clean up memory by deallocating unused objects, but that is not usually how it is meant to be used. Just remove the Bullet instance from the list; if there are no other references to it, then eventually its memory will be released.
If you really want to save memory you should think about reusing the same instances. This works if you plan to have at most a certain number of bullets on screen at the same time: instead of discarding an expired bullet when you remove it from the list, you add it to another list from which new bullets are picked up (by resetting their attributes). In this way you can avoid going over a certain threshold.
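A minimal sketch of such a reuse scheme (a simple object pool; the Bullet fields here are invented for the example):

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative only: bullets are recycled instead of being created and discarded.
class BulletPool {
    static class Bullet {
        float x, y, dx, dy;   // invented fields
        boolean alive;
    }

    private final List<Bullet> active = new ArrayList<>();
    private final Deque<Bullet> free = new ArrayDeque<>();

    Bullet fire(float x, float y, float dx, float dy) {
        Bullet b = free.isEmpty() ? new Bullet() : free.pop();   // reuse if possible
        b.x = x; b.y = y; b.dx = dx; b.dy = dy; b.alive = true;
        active.add(b);
        return b;
    }

    void recycleDead() {
        for (int i = active.size() - 1; i >= 0; i--) {
            Bullet b = active.get(i);
            if (!b.alive) {      // hit a target or left the screen
                active.remove(i);
                free.push(b);    // keep the instance around for the next shot
            }
        }
    }
}

With a pool like this the number of live Bullet objects stays bounded, so the GC has little to do no matter how many shots are fired.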
Java is memory managed, so you can't reliably free memory. You can be sure that memory will not be freed as long as you have pointers to your object. Setting an object to null means that nothing points to it, but doesn't necessarily mean that it will be garbage collected at that point. For an in-depth explanation on memory management in Java, check this out.
Also, avoid using Vector: it is synchronized, and usually not used in newer code.
I am trying to implement an LRU cache in Java which should be able to:
Change size dynamically, in the sense that I plan to hold the values as SoftReferences registered with a ReferenceQueue. So, depending on memory consumption, the cache size will vary.
I plan to use a ConcurrentHashMap where the value will be a soft reference, and then periodically clear the queue to update the map.
But the problem with the above is: how do I make it LRU?
I know that we have no control over the GC, but can we manage references to the values (in the cache) in such a way that the objects in the cache become softly reachable (and thus collectible) depending on usage (i.e. the last time they were accessed) and not in some random order?
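For concreteness, a minimal sketch of the ConcurrentHashMap-plus-SoftReference idea described above (all class and method names are illustrative; the LRU part is exactly what is still missing):

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative only: values are held softly; the queue is drained to remove
// map entries whose values the GC has already cleared.
class SoftCache<K, V> {
    private static final class Ref<K, V> extends SoftReference<V> {
        final K key;   // remember the key so the entry can be cleaned up later
        Ref(K key, V value, ReferenceQueue<V> q) {
            super(value, q);
            this.key = key;
        }
    }

    private final ConcurrentMap<K, Ref<K, V>> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    void put(K key, V value) {
        drain();
        map.put(key, new Ref<>(key, value, queue));
    }

    V get(K key) {
        drain();
        Ref<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();   // may be null if already cleared
    }

    @SuppressWarnings("unchecked")
    private void drain() {
        Reference<? extends V> r;
        while ((r = queue.poll()) != null) {
            map.remove(((Ref<K, V>) r).key, r);  // remove only if still mapped to this reference
        }
    }
}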
Neither weak nor soft references are really well suited for this. WeakReferences tend to get cleared immediately, as soon as the object has no stronger references any more, and soft references get cleared only after the heap has grown to its maximum size and an OutOfMemoryError would otherwise need to be thrown.
Typically it's more efficient to use some time-based approach with regular strong references, which are much cheaper for the VM than the Reference subclasses (faster for the program and the GC to handle, and no extra memory is used for the reference itself). I.e. release all objects that have not been used for a certain time. You can check this with a periodic TimerTask, which you would need anyway to operate your reference queue. The idea is that if it takes, say, 10 ms to create the object and you keep it at most 1 s after it was last used, you will on average only be 1% slower than if you kept all objects forever. But since this will most likely use less memory, it will actually be faster.
Edit: One way to implement this would be to use 3 buckets internally. Objects that are placed into the cache always get inserted into bucket 0. When an object is requested, the cache looks for it in all 3 buckets in order and places it into bucket 0 if it was not already there. The TimerTask is invoked at fixed intervals and simply drops bucket 2 and places a new empty bucket at the front of the bucket list, so that the new bucket 0 is empty, the former bucket 0 becomes bucket 1, and the former bucket 1 becomes bucket 2. This ensures that idle objects survive at least one and at most two timer intervals, and that objects accessed more than once per interval are very fast to retrieve. The total maintenance overhead of such a data structure will be considerably smaller than anything based on reference objects and reference queues.
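A minimal sketch of that three-bucket scheme (not thread-safe; synchronization and the timer wiring are left out):

import java.util.HashMap;
import java.util.Map;

// Illustrative only: idle entries survive at least one and at most two rotations.
class BucketCache<K, V> {
    private Map<K, V> b0 = new HashMap<>();   // most recently used entries
    private Map<K, V> b1 = new HashMap<>();
    private Map<K, V> b2 = new HashMap<>();   // next in line to be dropped

    void put(K key, V value) {
        b0.put(key, value);                   // new entries always go into bucket 0
    }

    V get(K key) {
        V v = b0.get(key);
        if (v != null) return v;
        v = b1.remove(key);
        if (v == null) v = b2.remove(key);
        if (v != null) b0.put(key, v);        // promote on access
        return v;
    }

    // Invoke from a fixed-rate TimerTask or scheduled executor.
    void rotate() {
        b2 = b1;                              // the old bucket 2 and its entries become unreachable
        b1 = b0;
        b0 = new HashMap<>();                 // fresh, empty bucket for new and promoted entries
    }
}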
Your question doesn't really make sense unless you want several of these caches at the same time. If you have only a single cache, don't give it a size limit and always use WeakReference. That way, the cache will automatically use all available free memory.
Prepare for some hot discussions with your sysadmins, though, since they will come complaining that your app has a memory leak and "will crash any moment!" sigh
The other option is to use a mature cache library like EHCache since it already knows everything that there is to know about caches and they spent years getting them right - literally. Unless you want to spend years debugging your code to make it work with every corner case of the Java memory model, I suggest that you avoid reinventing the wheel this time.
I would use a LinkedHashMap, as it supports access order and can be used as an LRU map. It can have a variable maximum size.
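For example, a minimal access-ordered LRU map along those lines (the maximum size here is just a placeholder the caller would supply):

import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: an access-ordered LinkedHashMap that evicts the least
// recently used entry once the cap is exceeded.
class LruMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    LruMap(int maxSize) {
        super(16, 0.75f, true);   // accessOrder = true gives LRU iteration order
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;  // evict the eldest entry once over the cap
    }
}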
Switching between weak and soft references based on usage is very difficult to get right, because it is hard to determine a) how much memory your cache is using exclusively, b) how much is being used by the rest of the system, and c) how much would be used after a full GC.
You should note that weak and soft references are only cleaned on a GC, and that discarding them or changing them won't free memory until a GC is run.