java: use existing object or create a new one? - java

What is better:
This
public Move(String moveString) {
    ArrayList<Square> moveSquares = splitToSquares(moveString.toLowerCase());
    // Copy the two squares we need into new objects.
    this.from = new Square(moveSquares.get(0));
    this.to = new Square(moveSquares.get(1));
}
or this:
public Move(String moveString) {
    ArrayList<Square> moveSquares = splitToSquares(moveString.toLowerCase());
    // Keep references to the list's own elements; no copies are made.
    this.from = moveSquares.get(0);
    this.to = moveSquares.get(1);
}
In the first, I use the information from the Square objects in the list to create new ones.
In the second, I directly use the object.
It doesn't make much difference for my program now, but I am wondering if Java needs to keep the complete ArrayList because I referenced them. If that is a huge list, it would be better to just copy the two objects I need and let the rest be collected by the GC, wouldn't it?
Or is the GC intelligent enough to do that itself? Then the first method would make unnecessary copies of the objects. Not a big deal in this case, but in another there may be hundreds or thousands of such objects.

In both cases, the ArrayList will not be referenced once you exit the Move() method. It does not matter that you reference elements within the list. If the list itself is unreachable, it becomes eligible for garbage collection.
The two referenced elements will remain alive as long as they are referenced by another live object. In your second example, those two elements will be reachable at least as long as your Move object is reachable. But if there are other elements in the list, and they are not referenced outside of the list, then those elements will be eligible for garbage collection when the list goes out of scope.

Your list moveSquares becomes unreachable as soon as the Move constructor ends (this is a reasonable assumption, although it ultimately depends on what exactly the splitToSquares method does).
In addition to that, the fact that one item of the list is still reachable has nothing to do with the reachability of the list itself, or of any other list item. The list and the items you did not keep will all become unreachable, and thus collectible.
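To make the second variant concrete, here is a minimal sketch. The Square class and the body of splitToSquares are hypothetical stand-ins (the question does not show them); the sketch assumes splitToSquares builds a fresh list on every call and keeps no reference to it.

```java
import java.util.ArrayList;
import java.util.List;

final class MoveDemo {

    /** Hypothetical stand-in for the question's Square class. */
    static final class Square {
        final char file;
        final int rank;

        Square(char file, int rank) {
            this.file = file;
            this.rank = rank;
        }
    }

    /** Assumed behavior: returns a fresh list per call, e.g. "e2e4" -> [e2, e4]. */
    static List<Square> splitToSquares(String move) {
        List<Square> squares = new ArrayList<>();
        for (int i = 0; i + 1 < move.length(); i += 2) {
            squares.add(new Square(move.charAt(i), move.charAt(i + 1) - '0'));
        }
        return squares;
    }

    static final class Move {
        final Square from;
        final Square to;

        Move(String moveString) {
            List<Square> moveSquares = splitToSquares(moveString.toLowerCase());
            // Keep the two elements themselves; no copies are made.
            this.from = moveSquares.get(0);
            this.to = moveSquares.get(1);
            // moveSquares is a local variable. Once the constructor returns,
            // nothing refers to the list any more, so the list and any
            // elements we did not keep become eligible for collection.
        }
    }

    public static void main(String[] args) {
        Move move = new Move("E2E4");
        // prints "e2 -> e4"
        System.out.println(move.from.file + "" + move.from.rank
                + " -> " + move.to.file + move.to.rank);
    }
}
```

The Move object holds only the two Square references, so the size of the temporary list does not affect how long the rest of it stays in memory.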

Related

Garbage collector vs. collections

I have read a few posts about garbage collection in Java, but I still cannot decide whether clearing a collection explicitly is considered good practice or not. Since I could not find a clear answer, I decided to ask here.
Consider this example:
List<String> list = new LinkedList<>();
// here we use the list, perhaps adding hundreds of items in it...
// ...and now the work is done, the list is not needed anymore
list.clear();
list = null;
From what I saw in implementations of e.g. LinkedList or HashSet, the clear() method basically just loops over all the items in the given collection, setting all its elements (in the case of LinkedList, also the references to the next and previous elements) to null.
If I got it right, setting the list to null just removes one reference from list - considering it was the only reference to it, the garbage collector will eventually take care of it. I just don't know how long would it take until also the list's elements are processed by garbage collector in this case.
So my question is - do the last two lines of the above listed example code actually help the garbage collector to work more efficiently (i.e. to collect the list's elements earlier) or would I just make my application busy with "irrelevant tasks"?
The last two lines do not help.
Once the list variable goes out of scope*, if that's the last reference to the linked list then the list becomes eligible for garbage collection. Setting list to null immediately beforehand adds no value.
Once the list becomes eligible for garbage collection, so too do its elements if the list holds the only references to them. Clearing the list is unnecessary.
For the most part you can trust the garbage collector to do its job and do not need to "help" it.
* Pedantically speaking, it's not scope that controls garbage collection, but reachability. Reachability isn't easy to sum up in one sentence. See this Q&A for an explanation of this distinction.
One common exception to this rule is if you have code that will retain references longer than they're needed. The canonical example of this is with listeners. If you add a listener to some component, and later on that listener is no longer needed, you need to explicitly remove it. If you don't, that listener can inhibit garbage collection of both itself and of the objects it has references to.
Let's say I added a listener to a button like so:
button.addListener(event -> label.setText("clicked!"));
Then later on the label is removed, but the button remains.
window.removeChild(label);
This is a problem because the button has a reference to the listener and the listener has a reference to the label. The label can't be garbage collected even though it's no longer visible on screen.
This is a time to take action and get on the GC's good side. I need to remember the listener when I add it...
Listener listener = event -> label.setText("clicked!");
button.addListener(listener);
...so that I can remove it when I'm done with the label:
window.removeChild(label);
button.removeListener(listener);
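The button and label API above is not tied to a specific toolkit, so the following is only a sketch with hypothetical Button and Label classes. It illustrates why the listener has to be stored in a variable: removal works by object identity, so a second lambda with the same source text is a different object and removes nothing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ListenerDemo {

    /** Hypothetical component that keeps strong references to its listeners. */
    static final class Button {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void addListener(Consumer<String> listener) { listeners.add(listener); }
        void removeListener(Consumer<String> listener) { listeners.remove(listener); }
        int listenerCount() { return listeners.size(); }

        void click() {
            // Iterate over a copy so listeners may unregister themselves safely.
            for (Consumer<String> listener : new ArrayList<>(listeners)) {
                listener.accept("click");
            }
        }
    }

    /** Hypothetical label; the lambda below captures a reference to it. */
    static final class Label {
        String text = "";
        void setText(String text) { this.text = text; }
    }

    /** Remembering the listener lets us unregister exactly that object. */
    static int countAfterRemovingStoredListener() {
        Button button = new Button();
        Label label = new Label();
        Consumer<String> listener = event -> label.setText("clicked!");
        button.addListener(listener);
        button.click();
        button.removeListener(listener);
        return button.listenerCount();
    }

    /** A textually identical lambda is a different object, so nothing is removed. */
    static int countAfterRemovingLookalike() {
        Button button = new Button();
        Label label = new Label();
        button.addListener(event -> label.setText("clicked!"));
        button.removeListener(event -> label.setText("clicked!"));
        return button.listenerCount();
    }

    public static void main(String[] args) {
        System.out.println(countAfterRemovingStoredListener());
        System.out.println(countAfterRemovingLookalike());
    }
}
```

As long as the button is reachable and still holds the listener, the label captured by that listener stays reachable as well.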
It depends on the following factors
how clear() is implemented
the allocation patterns for the entries held by the collection
the garbage collector
whether there might be other things holding onto the collection or subviews of it (does not apply to your example but common in the real world)
For a primitive, non-generational, tracing garbage collector, clearing out references only means extra work for your code without making things much easier for the GC. But clearing may still help if you cannot guarantee that all references to the collection are nulled out in a timely manner.
For generational GCs, and especially G1GC, nulling out references inside a collection (or a reference array) may be helpful under some circumstances by reducing cross-region references.
But that only helps if you actually have allocation patterns that create objects in different regions and put them into a collection living in another region. It also depends on the clear() implementation nulling out those references, which turns clearing into an O(n) operation when it could often be implemented as an O(1) one.
So for your concrete example the answer would be as follows:
If
your list is long-lived
the lists created on that code-path make up/hold onto a significant fraction of the garbage your application produces
you're using G1 or a similar multi-generational collector
it slowly accumulates objects before eventually being released (this usually puts them in different regions, thus creating cross-region references)
you wish to trade CPU-time on clearing for reduced GC workload
the clear() implementation is O(n) instead of O(1), i.e. nulls out all entries. OpenJDK's 1.8 LinkedList does this.
then it may be beneficial to call clear() before releasing the collection itself.
So at best this is a very workload-specific micro-optimization that should only be applied after profiling/monitoring the application under realistic conditions and determining that GC overhead justifies the extra cost of clearing.
For reference, OpenJDK 1.8's LinkedList::clear
/**
 * Removes all of the elements from this list.
 * The list will be empty after this call returns.
 */
public void clear() {
    // Clearing all of the links between nodes is "unnecessary", but:
    // - helps a generational GC if the discarded nodes inhabit
    //   more than one generation
    // - is sure to free memory even if there is a reachable Iterator
    for (Node<E> x = first; x != null; ) {
        Node<E> next = x.next;
        x.item = null;
        x.next = null;
        x.prev = null;
        x = next;
    }
    first = last = null;
    size = 0;
    modCount++;
}
I don't believe clear() will help in this instance. The GC will remove items once there are no more references to them, so in theory, just setting list = null will have the same effect.
You cannot control when the GC will run, so in my view it's not worth worrying about unless you have specific resource/performance requirements. Personally I'd stick with list = null;
If you want to reuse the list variable, then of course clear() is the best option rather than creating a new list object.
In Java an object is either alive (reachable via a reference owned by some other live object) or dead (not reachable via a reference owned by any live object). Objects that are only reachable from dead objects are also considered dead and eligible for garbage collection.
If no live object has a reference to your collection, then it is unreachable and eligible for garbage collection. What this also means is that all of your collection's elements (and any other helper objects that it may have created) are also unreachable unless some other live object has a reference to them.
Therefore, the clear method has no effect other than erasing a reference from one dead object to another. They will get garbage collected either way.
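The reachability rule can be observed with java.lang.ref.WeakReference, which does not keep its referent alive. This is only a sketch: System.gc() is a request the JVM may ignore, so the second result printed by main is typical on common JVMs but not guaranteed. The class and method names are made up for the illustration.

```java
import java.lang.ref.WeakReference;
import java.util.LinkedList;
import java.util.List;

final class ReachabilityDemo {

    /** While a live local variable refers to the list, the weak probe is not cleared. */
    static boolean visibleWhileStronglyHeld() {
        List<String> list = new LinkedList<>();
        list.add("payload");
        WeakReference<List<String>> probe = new WeakReference<>(list);
        System.gc(); // only a request; the JVM may ignore it
        boolean visible = probe.get() != null;
        // Using the list here keeps it strongly reachable up to this point.
        return visible && list.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println("visible while held: " + visibleWhileStronglyHeld());

        List<String> list = new LinkedList<>();
        list.add("payload");
        WeakReference<List<String>> probe = new WeakReference<>(list);
        list = null;  // drop the only strong reference; no clear() beforehand
        System.gc();  // a hint, so the outcome below is not guaranteed
        System.out.println("collected after dropping the reference: " + (probe.get() == null));
    }
}
```

Note that the list in main is never cleared; dropping the last strong reference is what makes it, its nodes, and its elements collectible.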

Java ArrayList reference

I am creating an ArrayList of objects using generics. Each thread does some calculating and stores the object in the array list.
However when looking at the ArrayList which is static and volatile all the object attributes are set as null. My thoughts are something to do with the garbage collector removing the instances in the threads so once the threads have finished there is no reference to them.
Any help would be really helpful?
The garbage collector will not remove instances [1] from an array list. That is not the problem.
The problem is most likely that you are accessing and updating the array list object without proper synchronization. If you do not synchronize properly, one thread won't always see the changes made by another one.
Declaring the reference to the ArrayList object as volatile only guarantees that the threads will see the same list object reference. It makes no guarantees about what happens with the operations on the list object.
[1] Assuming that the array list is reachable when the GC runs, all elements that have been properly added to the list will also be reachable. Nothing that is reachable will be deleted by the garbage collector. Besides, the GC won't ever reach into an object that your application can still see and change ordinary references to null.
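One possible way to add the missing synchronization, sketched under the assumption that each worker simply appends its results to a shared list, is to wrap the list with Collections.synchronizedList and to join the workers before reading. The class name and the worker logic are hypothetical; the question's actual code is not shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SharedListDemo {

    /** Starts the given number of workers, each adding perThread values, and returns the final size. */
    static int runWorkers(int threads, int perThread) {
        // Every add() on the wrapper is synchronized on the wrapper itself.
        List<Integer> results = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();

        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    results.add(id * perThread + i); // stand-in for a real calculation
                }
            });
            workers.add(worker);
            worker.start();
        }

        try {
            // join() makes everything the workers did visible to this thread.
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return results.size();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 1000)); // prints 4000
    }
}
```

With a plain ArrayList in place of the synchronized wrapper, concurrent add() calls can lose elements or leave null slots, which matches the symptom described in the question.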

clearing or set null to objects in java

I was recently looking into freeing up memory occupied by Java objects. While doing that I got confused about how objects are copied (shallow/deep) in Java and how to avoid accidentally clearing/nullifying objects while they are still in use.
Consider following scenarios:
passing an ArrayList<Object> as an argument to a method.
passing an ArrayList<Object> to a runnable class to be processed by a thread.
putting an ArrayList<Object> into a HashMap.
Now in these cases, if I call list = null; or list.clear();, what happens to the objects? In which cases are the objects lost, and in which cases is only the reference set to null?
I guess it has to do with shallow and deep copying of objects, but in which cases does shallow copying happen and in which cases does deep copying happen in Java?
Firstly, you never set an object to null. That concept has no meaning. You can assign a value of null to a variable, but you need to distinguish between the concepts of "variable" and "object" very carefully. Once you do, your question will sort of answer itself :)
Now in terms of "shallow copy" vs "deep copy" - it's probably worth avoiding the term "shallow copy" here, as usually a shallow copy involves creating a new object, but just copying the fields of an existing object directly. A deep copy would take a copy of the objects referred to by those fields as well (for reference type fields). A simple assignment like this:
ArrayList<String> list1 = new ArrayList<String>();
ArrayList<String> list2 = list1;
... doesn't do either a shallow copy or a deep copy in that sense. It just copies the reference. After the code above, list1 and list2 are independent variables - they just happen to have the same values (references) at the moment. We could change the value of one of them, and it wouldn't affect the other:
list1 = null;
System.out.println(list2.size()); // Just prints 0
Now if instead of changing the variables, we make a change to the object that the variables' values refer to, that change will be visible via the other variable too:
list2.add("Foo");
System.out.println(list1.get(0)); // Prints Foo
So back to your original question - you never store actual objects in a map, list, array etc. You only ever store references. An object can only be garbage collected when there are no ways of "live" code reaching that object any more. So in this case:
List<String> list = new ArrayList<String>();
Map<String, List<String>> map = new HashMap<String, List<String>>();
map.put("Foo", list);
list = null;
... the ArrayList object still can't be garbage collected, because the Map has an entry which refers to it.
To clear the variable:
As far as I know,
if you are going to reuse the variable, then use
list.clear();
If you are not going to reuse it, then assign
list = null;
Note:
Compared to removeAll(), clear() is faster.
Please correct me if I am wrong.
It depends on how many variables refer to each of your objects. Some code explains this best:
Object myAwesomeObject = new Object();
List<Object> myList = new ArrayList<Object>();
myList.add(myAwesomeObject);
myList = null; // Your object is not eligible for GC yet: the variable "myAwesomeObject" still refers to it
myAwesomeObject = null; // done, now your object is eligible for garbage collection.
So it doesn't depend on whether you pass your ArrayList as an argument to a method or the like; it depends on how many variables still refer to your objects.
If you passed an ArrayList to a method, then list = null will have no effect if there is a live reference to the list somewhere else, e.g. in the calling code. If you call list.clear() anywhere in the code, the list's references to its elements will be nulled. Passing a reference to a method is not shallow copying; it is passing the reference by value.
Java's GC automatically reclaims objects when they are no longer referenced anywhere, so in most cases you will not have to set the reference to null explicitly.
As soon as the scope of the variable ends the object becomes eligible for GC and gets freed up if no other reference points to the object.
Java is pass by value so if you set the list as null in the method then it will not affect the original reference that was passed to you in the method.
import java.util.ArrayList;
import java.util.List;

public class A {
    private List<Integer> list = new ArrayList<Integer>();

    public static void main(String[] args) {
        A a = new A();
        B b = new B();
        b.method(a.list);
        System.out.println(a.list.size()); // Will print 0 and not throw NullPointerException
    }
}

class B {
    public void method(List<Integer> list) {
        list = null;
        // Only this local copy of the reference is set to null, not the original one,
        // so the list held by A will not be GCed.
    }
}
If you put the list into a hash map, the hash map now holds a reference to the list.
If you pass the list as an argument to a method, the method will have a reference to it for the duration of the method.
If you pass it to a thread to be manipulated, the thread will have a reference to the object until it terminates.
In all of these cases, if you set list = null, the other references will still be maintained, and the list will only become collectible once those references disappear as well.
If you simply clear the list, the references will still be valid, but will now point to a list that has suddenly been emptied, by means that may be unknown to the programmer and may be considered a bug, especially if you use the thread.
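The difference between nulling a variable and clearing the shared object can be checked with a short sketch. The class and method names are made up for this illustration; only java.util collections are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SharedReferenceDemo {

    /** Reassigns only the method's own copy of the reference. */
    static void reassignParameter(List<String> list) {
        list = null;
    }

    /** Empties the one list object that caller and callee both refer to. */
    static void clearParameter(List<String> list) {
        list.clear();
    }

    static int callerSizeAfterReassign() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        reassignParameter(list);
        return list.size(); // the caller's variable is untouched
    }

    static int callerSizeAfterClear() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        clearParameter(list);
        return list.size(); // the shared object was emptied
    }

    static int mapViewSizeAfterClear() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        Map<String, List<String>> map = new HashMap<>();
        map.put("key", list);
        list.clear(); // the map entry refers to the same, now empty, list
        return map.get("key").size();
    }

    public static void main(String[] args) {
        System.out.println(callerSizeAfterReassign()); // prints 2
        System.out.println(callerSizeAfterClear());    // prints 0
        System.out.println(mapViewSizeAfterClear());   // prints 0
    }
}
```

In none of these cases is the list copied, shallowly or deeply; every holder sees the same object.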
I was recently looking into freeing up memory occupied by java objects.
A piece of advice.
It is usually a bad idea to think about this. And it is usually a worse idea to try to "help". In 99.8% of cases, the Java garbage collector is able to do a better job of collecting the garbage if you actually just let it get on with it ... and don't waste your effort by assigning null to things. Indeed, the chances are that the fields you are nulling are in objects that are about to become unreachable anyway. And in that case, the GC is not even going to look at the fields that you've nulled.
If you take this (pragmatic) view, all your thinking about shallow versus deep copies and when it is safe to null things is moot.
There is a tiny percentage of cases where it is advisable to assign null ... to avoid medium or long term storage leaks. And if you are in one of those rare situations where "recycling" objects is actually a good idea, then nulling is also advisable.

Question about Garbage Collection in Java

Suppose I have a doubly linked list. I create it as such:
MyList list = new MyList();
Then I add some nodes, use it and afterwards decide to throw away the old list like this:
list = new MyList();
Since I just created a new list, the nodes inside the old memory area are still pointing to each other. Does that mean the region with the old nodes won't get garbage collected? Do I need to make each node point to null so they're GC'd?
No, you don't. The Java GC handles cyclic references just fine.
Conceptually, each time the GC runs, it looks at all the "live" root references in the system:
Local variables in every stack frame
"this" references in every instance method stack frame
Effectively, all static variables (in fact these are really referenced by Class objects, which are in turn referenced by ClassLoaders, but let's ignore that for the moment).
With those "known live" objects, it examines the fields within them, adding to the list. It recurses down into those referenced objects, and so on, until it's found every live object in the system. It then garbage collects everything that it hasn't deemed to be live.
Your cyclically referenced nodes refer to each other, but no live object refers to them, so they're eligible for garbage collection.
Note that this is a grossly simplified summary of how a garbage collector conceptually works. In reality they're hugely complicated, with generations, compaction, concurrency issues and the like.
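The conceptual marking step can be modeled in a few lines. This is a toy model over a hand-built object graph, not how any real JVM collector is implemented; the Node class and method names are invented for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class MarkDemo {

    /** Toy heap object: a name plus outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) { this.name = name; }
    }

    /** Follows references outward from the roots and returns everything that was reached. */
    static Set<Node> reachableFrom(List<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (live.add(node)) {          // first visit: mark it and queue its references
                pending.addAll(node.refs); // cycles terminate because marked nodes are skipped
            }
        }
        return live;
    }

    /** Two nodes that only point at each other are not reached from the root. */
    static boolean cycleIsLive() {
        Node root = new Node("root");
        Node kept = new Node("kept");
        root.refs.add(kept);

        Node a = new Node("a");
        Node b = new Node("b");
        a.refs.add(b);
        b.refs.add(a);

        Set<Node> live = reachableFrom(Collections.singletonList(root));
        return live.contains(a) || live.contains(b);
    }

    public static void main(String[] args) {
        System.out.println(cycleIsLive()); // prints false
    }
}
```

Because the walk starts at the roots rather than counting incoming references, the mutual references between a and b never come into play.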
If you wrote your own doubly linked list, the list stores its items in containers (nodes), and only those containers are linked to one another.
So in your list, object A is held by container A'. A' is linked to B', the container that holds B, and so on. The items themselves do not have to reference each other.
Normally those containers are not accessible from outside (only the content is of interest), so only your list has references to its containers (remember that the content is not aware of its container).
If you remove the last reference to your list (the list itself, not a container or the content), the GC can collect everything the list held, which is your containers and possibly your content.
Since your containers are not accessible from outside, the only references to them come from each other and from the main list. That is called an island of isolation. As for the content, if it is still referenced elsewhere in your application it will survive the GC; if not, it won't.
So when you drop your list, A' and B' will be collected: they still reference each other, but those references are part of an island. If A and B have no other references, they will be collected too.
No -- Java (at least as normally implemented) doesn't use reference counting, it uses a real garbage collector. That means (in essence) when it runs out of memory, it looks at the pointers on the stack, in registers, and other places that are always accessible, and "chases" them to find everything that's accessible from them.
Pointers within other data structures like your doubly-linked list simply don't matter unless there's some outside pointer (that is accessible) that leads to them.
No, the GC will reclaim them anyways so you don't need to point them to null. Here's a good one paragraph description from this JavaWorld article:
Any garbage collection algorithm must do two basic things. First, it must detect garbage objects. Second, it must reclaim the heap space used by the garbage objects and make it available to the program. Garbage detection is ordinarily accomplished by defining a set of roots and determining reachability from the roots. An object is reachable if there is some path of references from the roots by which the executing program can access the object. The roots are always accessible to the program. Any objects that are reachable from the roots are considered live. Objects that are not reachable are considered garbage, because they can no longer affect the future course of program execution.
The garbage collector looks if objects are referenced by live threads. If objects are not reachable by any live threads, they are eligible for garbage collection.
It doesn't matter if the objects are referencing each other.
As others have pointed out, the Java garbage collector doesn't simply look at reference counting; instead it essentially looks at a graph where the nodes are the objects that currently exist and links are a reference from one object to another. It starts from a node that is known to be live (the main method, for instance) and then garbage collects anything that can't be reached.
The Wikipedia article on garbage collection discusses a variety of ways that this can be done, although I don't know exactly which method is used by any of the JVM implementations.
The garbage collector looks for objects that aren't referenced anywhere.
So if you create an object and then lose the reference to it, as in your example, the garbage collector will collect it.

Usefulness of ArrayList<E>.clear()?

I was looking through the Java documentation, and I came across the clear() method of ArrayLists.
What's the use of this, as opposed to simply reassigning a new ArrayList object to the variable?
Because there might be multiple references to the one list, it might be preferable and/or more practical to clear it than reassigning all the references.
If you empty your array a lot (within, say, a large loop) there's no point creating lots of temporary objects. Sure the garbage collector will eventually clean them up but there's no point being wasteful with resources if you don't have to be.
And because clearing the list is less work than creating a new one.
You might have a final field (class variable) List:
private final List<Thing> things = ...
Somewhere in the class you want to clear (remove all) things. Since things is final it can't be reassigned. Furthermore, you probably don't want to reassign a new List instance as you already have a perfectly good List instantiated.
Imagine a situation where there are multiple references to the same java.util.ArrayList throughout your code. It would be very difficult, if not impossible, to create a new instance of the list and assign it to all of those variables. But java.util.ArrayList.clear() does the trick!
You pay less with clear than you do with creating a new object if your objective was to really clear.
Reassigning a reference doesn't clear the object. The assumption is that if there are no other references to it, it would be reclaimed by the GC relatively soon. Otherwise, you just got yourself a mess.
In addition to the reasons mentioned, clearing a list is often more semantically correct than creating a new one. If your desired outcome is a cleared list, the action you should take is to clear the list, not create a new list.
clear() doesn't reallocate a new object, so it's less of a performance hit.
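As an illustration of the reuse argument, here is a small sketch that processes values in batches while reusing a single buffer list. The batching task and all names are invented for the example; whether the saved allocations matter in practice depends on the workload.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchDemo {

    private static int sum(List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    /** Sums the input in batches of batchSize, reusing one buffer list throughout. */
    static List<Integer> batchSums(int[] values, int batchSize) {
        List<Integer> sums = new ArrayList<>();
        List<Integer> batch = new ArrayList<>(batchSize); // allocated once
        for (int value : values) {
            batch.add(value);
            if (batch.size() == batchSize) {
                sums.add(sum(batch));
                batch.clear(); // empty the same object instead of allocating a new list
            }
        }
        if (!batch.isEmpty()) {
            sums.add(sum(batch)); // trailing partial batch
        }
        return sums;
    }

    public static void main(String[] args) {
        System.out.println(batchSums(new int[] {1, 2, 3, 4, 5}, 2)); // prints [3, 7, 5]
    }
}
```

Because batch is never reassigned, it could equally be a final field, which is the situation described in one of the answers above.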
