Java Threads and Garbage Collection

I have read in countless places that running threads are garbage collection roots (i.e. they reside on the stack, the GC identifies them and traces through them to determine whether the objects inside them are still reachable). Furthermore, a GC root will never be garbage collected itself.
My confusion is here: if the objects allocated from within a thread can never be garbage collected until the thread is stopped, how then is anything garbage collected in single-threaded programs where the only thread is the main thread?
Clearly I'm missing something here.

First, a thread (stack) is only a GC root while it is alive. When the thread terminates, it is no longer a GC root.
You asked: "If the objects allocated from within a thread can never be garbage collected until the thread is stopped, how then is anything garbage collected in single-threaded programs where the only thread is the main thread?"
Your premise is incorrect.
It doesn't matter which thread allocates an object. What matters is whether an object allocated by your thread remains reachable. If it becomes unreachable (for example, if you didn't put the object's reference into a local variable) ... then it can be collected.

Objects are in the heap, regardless of which thread created them.
Objects may be reachable through references. Some of these references can be on the call stack of one or more threads.
An object can be collected when it is no longer reachable through any reference, regardless of whether its allocating thread is still running or not.
For example, the thread below repeatedly allocates new StringBuilder objects. During a call to foo(), the thread has references on its call stack to a StringBuilder object. When foo() returns, there are no further references to the StringBuilder object. Therefore, that object is no longer reachable, and is eligible for garbage collection.
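Thread thread = new Thread(new Runnable() {
    @Override
    public void run() {
        while (true) {
            foo();
        }
    }
    public void foo() {
        // The local variable strBuilder holds the only reference to this object.
        StringBuilder strBuilder = new StringBuilder("This new object is allocated on the heap.");
        System.out.println(strBuilder);
    }
});
// start() executes run() on the new thread; calling run() directly would loop on the calling thread.
thread.start();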
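The reachability rule described above can be observed with a weak reference, which does not keep its referent alive. The following is only a sketch: ReachabilityDemo and its method names are made up for illustration, and System.gc() is merely a request, so the second part of main may report either result.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    // True if the object was still there after a GC request made while a local variable referred to it.
    static boolean survivesWhileReferenced() {
        StringBuilder sb = new StringBuilder("held by a local variable");
        WeakReference<StringBuilder> weak = new WeakReference<>(sb);
        System.gc();
        // sb is used after the GC request, so the object was strongly reachable throughout.
        return weak.get() == sb;
    }

    // Allocates an object and keeps no strong reference to it once the method returns.
    static WeakReference<StringBuilder> allocateAndDrop() {
        return new WeakReference<>(new StringBuilder("no strong reference survives this call"));
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("survives while referenced: " + survivesWhileReferenced());

        WeakReference<StringBuilder> dropped = allocateAndDrop();
        // Ask for a collection a few times; the JVM is free to ignore the requests.
        for (int i = 0; i < 10 && dropped.get() != null; i++) {
            System.gc();
            Thread.sleep(50);
        }
        // Often true on common JVMs, but never guaranteed.
        System.out.println("dropped object collected: " + (dropped.get() == null));
    }
}
```

The main thread is alive the whole time, yet the dropped object can still be reclaimed, because nothing on the thread's stack refers to it any more.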

Related

How does Java prevent an as yet unassigned instance from being garbage collected?

Example:
public void foo() {
A a = new A();
}
What if there is this sequence of events?
Java allocates memory for A.
The A() constructor runs. Now there is an instance in the heap.
The GC runs.
There are no refs to the object and it is removed before being assigned to a.
How does it prevent this from occurring? I would greatly appreciate links to article where it is explained.
This is easy to answer once you know that the garbage collector (GC) traverses the call stack of every live thread. While the constructor runs, a reference to the new object is already held by the executing frames (as this inside the constructor, and as a temporary value in foo before it is stored into a), so when the GC traverses the stack of the method foo, it simply sees that there is a reference pointing to that heap memory.
In order to know what is garbage, the GC first has to find everything that is alive. Since there is a reference pointing to that memory (the new A()), the object is treated as alive, at least until a is last used.
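The scenario from the question can be tried out directly by requesting a collection from inside the constructor, before the reference has been assigned to any variable. This is a minimal sketch with hypothetical names (ConstructionDemo, survivedGcDuringConstruction); it relies only on the fact that this is used after the GC request and is therefore reachable from the constructor's frame.

```java
import java.lang.ref.WeakReference;

class ConstructionDemo {
    static final class A {
        final boolean survivedGcDuringConstruction;

        A() {
            // A weak reference does not keep the object alive by itself.
            WeakReference<A> self = new WeakReference<>(this);
            // Request a collection while the object is still being constructed
            // and has not yet been assigned to the variable in demo().
            System.gc();
            // "this" is still on the stack here, so the object cannot have been collected.
            survivedGcDuringConstruction = (self.get() == this);
        }
    }

    static boolean demo() {
        A a = new A();
        return a.survivedGcDuringConstruction;
    }

    public static void main(String[] args) {
        System.out.println("survived GC during construction: " + demo());
    }
}
```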

Why enqueuing of PhantomReference takes more GC cycles than WeakReference or SoftReference?

I decided to continue https://stackoverflow.com/a/41998907/2674303 in a separate question.
Let's consider the following example:
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
public class SimpleGCExample {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        SimpleGCExample e = new SimpleGCExample();
        Reference<Object> pRef = new PhantomReference<>(e, queue),
                          wRef = new WeakReference<>(e, queue);
        e = null;
        for (int count = 0, collected = 0; collected < 2; ) {
            // Wait up to 100 ms for a reference to be enqueued.
            Reference<?> ref = queue.remove(100);
            if (ref == null) {
                System.gc();
                count++;
            } else {
                collected++;
                System.out.println((ref == wRef ? "weak" : "phantom")
                        + " reference enqueued after " + count + " gc polls");
            }
        }
    }
    @Override
    protected void finalize() throws Throwable {
        System.out.println("finalizing the object in " + Thread.currentThread());
        Thread.sleep(100);
        System.out.println("done finalizing.");
    }
}
Java 11 prints the following:
finalizing the object in Thread[Finalizer,8,system]
weak reference enqueued after 1 gc polls
done finalizing.
phantom reference enqueued after 3 gc polls
The first two lines can appear in either order; it looks like they are produced in parallel.
The last line sometimes reports 2 gc polls and sometimes 3.
So I see that enqueuing a PhantomReference takes more GC cycles. How can this be explained? Is it mentioned somewhere in the documentation (I can't find it)?
P.S.
WeakReference java doc:
Suppose that the garbage collector determines at a certain point in
time that an object is weakly reachable. At that time it will
atomically clear all weak references to that object and all weak
references to any other weakly-reachable objects from which that
object is reachable through a chain of strong and soft references. At
the same time it will declare all of the formerly weakly-reachable
objects to be finalizable. At the same time or at some later time it
will enqueue those newly-cleared weak references that are registered
with reference queues
PhantomReference java doc:
Suppose the garbage collector determines at a certain point in time
that an object is phantom reachable. At that time it will atomically
clear all phantom references to that object and all phantom references
to any other phantom-reachable objects from which that object is
reachable. At the same time or at some later time it will enqueue
those newly-cleared phantom references that are registered with
reference queues
The difference is not clear to me.
P.S. We are speaking about an object with a non-trivial finalize method.
I got an answer to my question from @Holger:
He pointed me to the package documentation and noted that the definition of phantom reachability contains an extra condition compared with the definitions for soft and weak references:
An object is weakly reachable if it is neither strongly nor softly
reachable but can be reached by traversing a weak reference. When the
weak references to a weakly-reachable object are cleared, the object
becomes eligible for finalization.
An object is phantom reachable if
it is neither strongly, softly, nor weakly reachable, it has been
finalized, and some phantom reference refers to it
My next question was what "it has been finalized" means. I expected it to mean that the finalize method had finished.
To check this, I modified the application like this:
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
public class SimpleGCExample {
    static SimpleGCExample object;
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        SimpleGCExample e = new SimpleGCExample();
        Reference<Object> pRef = new PhantomReference<>(e, queue),
                          wRef = new WeakReference<>(e, queue);
        e = null;
        for (int count = 0, collected = 0; collected < 2; ) {
            Reference<?> ref = queue.remove(100);
            if (ref == null) {
                System.gc();
                count++;
            } else {
                collected++;
                System.out.println((ref == wRef ? "weak" : "phantom")
                        + " reference enqueued after " + count + " gc polls");
            }
        }
    }
    @Override
    protected void finalize() throws Throwable {
        System.out.println("finalizing the object in " + Thread.currentThread());
        Thread.sleep(10000);
        System.out.println("done finalizing.");
        // Resurrect the object by storing it in a static field.
        object = this;
    }
}
I see the following output:
weak reference enqueued after 1 gc polls
finalizing the object in Thread[Finalizer,8,system]
done finalizing.
And the application hangs. I think this is because for weak/soft references the GC works in the following way: as soon as the GC detects that an object is weakly/softly reachable, it does two things in parallel:
it enqueues the weak/soft reference into the registered ReferenceQueue instance
it runs the finalize method
So for adding to the ReferenceQueue it doesn't matter whether the object was resurrected or not.
But for PhantomReference the actions are different. As soon as the GC detects that an object is phantom reachable, it does the following sequentially:
it runs the finalize method
it checks that the object is still only phantom reachable (i.e. that it was not resurrected during the execution of finalize), and only if it is does the GC add the phantom reference to the ReferenceQueue
But @Holger said that "it has been finalized" means that the JVM has initiated the finalize() invocation, and that for adding the PhantomReference to the ReferenceQueue it doesn't matter whether the invocation has finished or not. But my example seems to show that it really does matter.
Frankly speaking, I don't understand why adding to the ReferenceQueue works differently for weak and soft references. What was the idea?
The key point is the definition of “phantom reachable” in the package documentation:
An object is phantom reachable if it is neither strongly, softly, nor weakly reachable, it has been finalized, and some phantom reference refers to it.
(emphasis on "it has been finalized" is mine)
Note that when we remove the finalize() method, the phantom reference gets enqueued immediately, together with the weak reference.
This is the consequence of JLS §12.6:
For efficiency, an implementation may keep track of classes that do not override the finalize method of class Object, or override it in a trivial way.
…
We encourage implementations to treat such objects as having a finalizer that is not overridden, and to finalize them more efficiently, as described in §12.6.1.
Unfortunately, §12.6.1 does not go into the consequences of “having a finalizer that is not overridden”, but it’s easy to see that the implementation just treats those objects as already finalized, never enqueuing them for finalization and hence being able to reclaim them immediately. This affects the majority of all objects in typical Java applications.
Another point of view is that the necessary steps for ensuring that the finalize() method will eventually get invoked, i.e. the creation and linking of a Finalizer instance, will be omitted for objects with a trivial finalizer. Also, eliminating the creation of purely local objects after Escape Analysis only works for those objects.
Since there is no behavioral difference between weak references and phantom references for objects without a finalizer, we can say that the presence of finalization, and its possibility to resurrect objects, is the only reason for the existence of phantom references: they make it possible to perform an object’s cleanup only when it is safe to assume that it can’t get resurrected anymore¹.
¹ Though, before Java 9, this safety was not bullet-proof, as phantom references were not automatically cleared and deep reflection made it possible to subvert the whole concept.
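One API-level consequence of this design can be checked deterministically: PhantomReference.get() always returns null, so the referent can never be retrieved (and thereby resurrected) through a phantom reference, whereas WeakReference.get() returns the referent until the reference is cleared. A minimal sketch, with the class and method names (ReferenceGetDemo, check) chosen only for illustration:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReferenceGetDemo {
    // Returns { nothingEnqueuedYet, phantomReturnsNull, weakReturnsReferent }.
    static boolean[] check() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> weak = new WeakReference<>(referent, queue);
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);

        // The referent is still strongly reachable (it is used below),
        // so neither reference can have been cleared or enqueued.
        boolean nothingEnqueuedYet = queue.poll() == null;
        // A phantom reference never hands out its referent.
        boolean phantomReturnsNull = phantom.get() == null;
        // A weak reference does, as long as it has not been cleared.
        boolean weakReturnsReferent = weak.get() == referent;

        return new boolean[] { nothingEnqueuedYet, phantomReturnsNull, weakReturnsReferent };
    }

    public static void main(String[] args) {
        boolean[] r = check();
        System.out.println("nothing enqueued yet: " + r[0]);
        System.out.println("phantom.get() == null: " + r[1]);
        System.out.println("weak.get() == referent: " + r[2]);
    }
}
```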
PhantomReferences will only be enqueued after any associated finalizer has finished execution. Note a finalizer can resurrect an object (used to good effect by Princeton's former Secure Internet Project).
Exact behaviour beyond the spec is not specified. Here be implementation dependent stuff.
So what seems to be happening? Once an object is weakly reachable, it is also finalisable. So the WeakReferences can be enqueued and the objects queued for finalisation in the same stop-the-world event. The finalisation thread(s) is (are) running in parallel with your ReferenceQueue thread (main). Hence you may see the first two lines of your output in either order, always (unless wildly delayed) followed by the third.
Only some time after your finalizer has exited is the PhantomReference enqueueable. Hence the gc count is strictly greater. The code looks like a reasonably fair race. Perhaps changing the millisecond timeouts would change things. Most things GC don't have exact guarantees.

Garbage Collection (Local references)

I am confused about how GC works in Java.
Below is the code snippet that confuses me:
private Data data = new Data();
void main() {
    for (int i = 0; i < 100; i++) {
        MyThread thread = new MyThread(data);
        thread.start();
    }
    System.gc();
    // Long running process
}
class MyThread extends Thread {
    private Data dataReference;
    MyThread(Data data) {
        dataReference = data;
    }
}
In the above example, if GC is invoked before continuing further (// Long running process), will the local threads be garbage collected?
Or will the GC mark them (the MyThread local references) as alive, since each of them holds a reference to the global reference data?
The MyThread instances may be garbage collected only after they are done (i.e. their run method is done). After the for loop ends, any instances of MyThread whose run method is done may be garbage collected (since there are no references to them).
The fact that the MyThread instances each hold a reference to a Data instance that doesn't get garbage collected doesn't affect the time at which the MyThread instances become eligible for garbage collection.
Your MyThread instances will not be eligible for garbage collection until they have finished running.
The thread stack and local variables for any live (i.e. started but not terminated) thread are reachable by definition.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. (JLS 12.6.1)
Furthermore, since a live thread could call Thread.currentThread(), the thread's Thread object must also be reachable as long as the thread is live ... irrespective of any other references to it.
However, if the reference to a Thread object becomes unreachable before the start() method has been called, it will be eligible for garbage collection. If this was not so, creating and not starting a Thread would be a memory leak!
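The claim that a live thread keeps its own Thread object reachable can be checked with a weak reference, since the caller then holds no strong reference of its own. The sketch below uses hypothetical names (LiveThreadDemo, startWorker) and two latches purely to keep the worker alive while the collection is requested.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.CountDownLatch;

class LiveThreadDemo {
    // Starts a worker and returns only a weak reference to its Thread object.
    static WeakReference<Thread> startWorker(CountDownLatch started, CountDownLatch release) {
        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                release.await(); // stay alive until the caller lets us finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        return new WeakReference<>(worker);
    }

    // True if the running thread's Thread object was still present after a GC request.
    static boolean liveThreadSurvivesGc() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        WeakReference<Thread> weak = startWorker(started, release);
        try {
            started.await();       // the worker is now running
            System.gc();           // only a request to the JVM
            Thread t = weak.get(); // still available because the thread is alive
            boolean survived = t != null && t.isAlive();
            release.countDown();   // let the worker terminate
            if (t != null) {
                t.join();
            }
            return survived;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            release.countDown();
            return false;
        }
    }

    // A Thread that was never started is not alive; it is an ordinary object to the GC.
    static boolean unstartedThreadIsAlive() {
        return new Thread(() -> { }).isAlive();
    }

    public static void main(String[] args) {
        System.out.println("live thread survived GC request: " + liveThreadSurvivesGc());
        System.out.println("unstarted thread is alive: " + unstartedThreadIsAlive());
    }
}
```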
You can always request a garbage collection, but it is not guaranteed to run at that moment (it may or may not, depending on your JVM), because garbage collection runs on JVM-managed daemon threads on the JVM's own schedule.
An object becomes eligible for garbage collection (GC) if it's not reachable from any live thread or from any static reference. In other words, an object becomes eligible once no reachable reference to it remains, for example because every variable that referred to it has been set to null. Cyclic dependencies do not count as references that keep objects alive, so if object A has a reference to object B and object B has a reference to object A, and they don't have any other live reference, then both objects A and B will be eligible for garbage collection.
There is no guarantee that a GC will be executed after a System.gc() call. A System.gc() call simply SUGGESTS that the VM do a garbage collection.
And a thread is not a target for GC. A thread won't be cleaned up until it has finished running.
Generally speaking, objects are judged to be alive if they are still referenced by other live objects.
You should never need to call System.gc(). The system will run a collection for you when it is low on memory.
In Java, GC works on a system called Mark and Sweep. The algorithm works like this:
1. Start with a set of root objects (GC roots) and a set of all the objects allocated.
2. Mark those roots.
3. Mark every object reachable from those roots, by visiting every field of these objects recursively.
4. When every possible object is marked, walk the list of all objects. If an item is not marked, free it.
(This is a simplification; modern implementations work roughly like this, but are far more sophisticated.)
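The steps above can be sketched as a toy simulation. This is not how any real JVM is implemented; Node, heap and collect are hypothetical names that model objects, the set of all allocated objects, and one collection cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class ToyMarkSweep {
    // A simulated heap object: a name plus the references stored in its fields.
    static final class Node {
        final String name;
        final List<Node> fields = new ArrayList<>();
        boolean marked;

        Node(String name) {
            this.name = name;
        }
    }

    // Marks everything reachable from the roots, removes the rest from the heap,
    // and returns the names of the survivors in heap order.
    static List<String> collect(List<Node> heap, List<Node> roots) {
        for (Node n : heap) {
            n.marked = false;
        }
        // Mark phase: an explicit worklist instead of recursion.
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) {
                continue; // already visited, which also makes cycles harmless
            }
            n.marked = true;
            pending.addAll(n.fields);
        }
        // Sweep phase: free (here: drop) every unmarked object.
        heap.removeIf(n -> !n.marked);
        List<String> survivors = new ArrayList<>();
        for (Node n : heap) {
            survivors.add(n.name);
        }
        return survivors;
    }

    static List<String> demo() {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        root.fields.add(a); // a is reachable from the root
        b.fields.add(c);    // b and c only reference each other (a cycle)
        c.fields.add(b);
        d.fields.add(a);    // d points at a live object, but nothing points at d
        List<Node> heap = new ArrayList<>(Arrays.asList(root, a, b, c, d));
        return collect(heap, Arrays.asList(root));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [root, a]
    }
}
```

The cycle between b and c is swept because marking starts from the roots rather than counting references, and d is swept even though it refers to a live object, because reachability only flows from the roots outward.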
So what is a GC root? Any object stored in a local variable still in scope, in a static variable, in a JNI reference, and all threads that are currently running.
So no, a thread won't be cleaned up until it has finished running. That's why threads so easily create memory leaks: as long as they run, any object they have a reference to cannot be freed, because a GC root (the thread) has a reference to it.
But the relationship always goes down from the root to other objects. If Foo holds a reference to Bar, Foo can be deleted regardless of whether Bar can be. But if Foo can't be deleted, then neither can Bar.

Garbage collection when application ends

As far as I know, objects become available for garbage collection when a null value is assigned to the variable:
Object a = new Object();
a = null; //it is now available for garbage collection
or when the object goes out of scope because the method's execution is done:
public void gc() {
    Object a = new Object();
} //once the gc method is done, the object that a refers to will be available for garbage collection
Given that, isn't going out of scope the same as the application simply ending?
class Ink {}
public class Main {
    Ink k = new Ink();
    public void getSomething() {
        //method codes here
    }
    public static void main(String[] args) {
        Main n = new Main();
    }
}
where I expect 2 objects (Ink object and Main object) should be garbage collected when the application ends.
When the Java application terminates, the JVM typically also terminates in the scope of the OS, so GC at that point is moot. All resources have returned to the OS after as orderly a shutdown of the JVM as the app defined.
You are confusing the event of an object becoming eligible for garbage collection with the actual process of collecting garbage or, more precisely, reclaiming memory.
The garbage collector doesn’t run just because a reference became null or an object went out of scope; that would be a waste of resources. It usually runs because either memory is low or CPU resources are unused.
Also, the term “garbage collection” is misleading. The actual task for the JVM is to mark all objects being still alive (also known as reachable objects). Everything else is considered reclaimable, aka garbage. Since at the termination of the JVM, the entire memory is reclaimed per se, there is no need to search for reachable references.
That said, it’s helpful to understand that most thinking about the memory management is useless. E.g. in your code:
public void gc() {
    Object a = new Object();
    // even here the object might get garbage collected as it is unused in subsequent code
}
the optimizer might remove the entire creation of the object, as it has no observable effect. Then, there will be no garbage collection, as the object hasn’t been created in the first place.
The JVM tracks the GC roots: if an object is not reachable from any GC root, then it is a candidate for garbage collection. GC roots can be:
local variables
active java threads
static variables
jni references

Why is finalize not being called?

I have a couple of questions regarding the garbage collector in Java.
Q1. As far as I understand, finalize() gets called when an object is out of scope and the JVM is about to collect garbage. I thought the finalize() method was called automatically by the garbage collector, but that does not seem to happen in this case. What is the explanation? Why do I need to call the finalize() method explicitly?
public class MultipleConstruct {
    int x, y;
    public MultipleConstruct(int x) {
        this.x = x;
        y = 5;
        System.out.println("ONE");
    }
    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("FINALIZED");
    }
    public static void main(String[] args) throws Throwable {
        MultipleConstruct construct = new MultipleConstruct(3);
    }
}
Q2. Also, when is the garbage collector invoked? I understand that the GC runs in a daemon thread and is invoked by the JVM depending on the remaining heap size. Does that mean the JVM waits for the program to reach a threshold of resource use and then notifies the GC to sweep garbage objects?
EDIT: How does the GC resolve circular references?
There is a lot to say about the finalize() method, frankly too much to write here, but in short:
An object is in the finalized state if it is still unreachable after its finalize method, if any, has been run. A finalized object is awaiting deallocation. Note that the VM implementation controls when the finalizer is run. You are almost always better off doing your own cleanup instead of relying on a finalizer. Using a finalizer can also leave behind critical resources that won't be recovered for an indeterminate amount of time.
In your case, the reason it does not print is that you do not know when the finalizer thread will call the finalize() method. What is happening is that the program terminates before anything can get printed. To check it,
replace the code inside main with the following (NOTE: this does not guarantee anything, nor should you ever rely on it, but it does print some of the time):
for (int i = 0; i < 1000000; i++) {
    MultipleConstruct construct = new MultipleConstruct(3);
    construct = null;
}
There are a lot of disadvantages to using finalize(), from slower object construction to the possibility of memory leaks and memory starvation. If you store a strong reference to the object itself inside finalize(), then finalize() is never called a second time, which can leave the system in an undesired state, and so on.
The only place where you should use finalize() is as a safety net for disposing of resources, the way InputStream uses it to close (and again, there is no guarantee that it will be run while your program is still alive). Another place to use it is with native code, where the garbage collector has no control.
For more info visit:
http://jatinpuri.com/?p=106
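The advice above to do your own cleanup rather than rely on a finalizer is usually followed with AutoCloseable and try-with-resources, which run the cleanup at a known point. The sketch below is only an illustration: TrackedResource and the log list are hypothetical stand-ins for a real resource such as a stream or native handle.

```java
import java.util.ArrayList;
import java.util.List;

class ExplicitCleanupDemo {
    // A stand-in for a resource that must be released deterministically.
    static final class TrackedResource implements AutoCloseable {
        private final List<String> log;

        TrackedResource(List<String> log) {
            this.log = log;
            log.add("opened");
        }

        void use() {
            log.add("used");
        }

        @Override
        public void close() {
            // Runs when the try block exits, normally or via an exception.
            log.add("closed");
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (TrackedResource resource = new TrackedResource(log)) {
            resource.use();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [opened, used, closed]
    }
}
```

Unlike finalize(), close() here does not depend on whether or when the garbage collector runs.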
q1) The finalize method is called when the object is about to be garbage collected; thus, if no GC is performed, your finalizer may not be called. You need to call super.finalize() simply to preserve the behavior provided by the Object implementation.
q2) The exact moment at which GC is performed depends on a lot of factors, such as which JVM you are using, tuning parameters, amount of free heap, etc. So it does not rely only on a used-heap threshold. You can also ask for a GC to be performed through System.gc(), but you have no guarantee about if and when it will actually be executed.
You can find some details on how to configure GC in http://java.sun.com/performance/reference/whitepapers/tuning.html
It gets called eventually, or not at all.
Basically, the GC scans the heap for everything that is not reachable and runs the finalizer on those objects (after which it needs to prove again that an object is not reachable before it can be freed).
However, it can take a while (effectively undefined, and actually dependent on program behavior) for the GC to find it, which is why you shouldn't really rely on it to dispose of critical data.
Edit: as for circular references, the GC distinguishes between objects with a finalize method and objects without one.
For an object to be freed (deleted from main memory), it must not be reachable by any code (this includes finalizers that still need to run).
When two objects with finalizers are eligible to have their finalizers run, the GC arbitrarily selects one object and runs its finalizer, and then it can run the other object's finalizer.
Note that a finalizer can run while the objects referenced by its fields may or may not have been finalized already.
The finalize() method is called automatically during garbage collection. The System.gc() method requests a run of the garbage collector (the JVM may ignore the request), but the object has to become unreachable first.
Example:
public class Sample
{
    public Sample()
    {
        System.out.println("Object created");
    }
    @Override
    public void finalize()
    {
        System.out.println("Object Destroyed");
    }
    public static void main(String args[])
    {
        Sample x = new Sample();
        Sample y = new Sample();
        // Drop the only references so both objects become unreachable.
        x = null;
        y = null;
        System.gc();
    }
}
