CompletableFuture Chain uncompleted -> Garbage Collector?

If I have one (or more) CompletableFuture that has not been started yet, and a few thenApplyAsync() and anyOf() calls chained onto it:
Will the garbage collector remove all of that?
If there is a join()/get() at the end of that chain, same question: will the garbage collector remove all of that?
Maybe more information about the context of the join() is needed.
That join() is the last statement in a thread, and there are no side effects.
So is the thread still active in that case? (See: Java Thread Garbage collected or not)
Anyway, is it a good idea to push a poison pill down the chain if I am sure (maybe in a try-catch-finally) that I will not start that CompletableFuture chain, or is that not necessary?
The question arises because of something like this: https://bugs.openjdk.java.net/browse/JDK-8160402
A related question: when is the thread executor signaled to schedule a new task? I think it is when the CompletableFuture moves on to the next chained CompletableFuture. So do I only need to worry about memory leaks and not thread leaks?
Edit: What do I mean by a not-started CompletableFuture?
I mean a var notStartedCompletableFuture = new CompletableFuture<Object>(); instead of a CompletableFuture.supplyAsync(....);
I can start the notStartedCompletableFuture like this:
notStartedCompletableFuture.complete(new Object()); later in the program flow or from another thread.
Edit 2: A more detailed Example:
AtomicReference<CompletableFuture<Object>> outsideReference = new AtomicReference<>();
final var myOuterThread = new Thread(() ->
{
    final var A = new CompletableFuture<Object>();
    final var B = new CompletableFuture<Object>();
    final var C = A.thenApplyAsync((element) -> new Object());
    final var D = CompletableFuture.anyOf(A, C);
    A.complete(new Object());
    // throw new RuntimeException();
    // outsideReference.set(B);
    B.complete(new Object()); // <-- Edit: this shouldn't be here, I will remove it in my next iteration
    D.join();
});
myOuterThread.start();
// The myOuterThread variable is not referenced anywhere else; it is, so to speak, a local variable, named only so that my text can point to it ^^
So in the normal case in my example I don't have an outside reference. The CompletableFutures in the thread never have a chance of being completed. Normally the GC could safely erase both the thread and its contents, the CompletableFutures. But I don't think that this would happen?
If I abort this by throwing an exception, so that the join() is never reached, then I think everything would be erased by the GC?
If I hand one of the CompletableFutures to the outside through that AtomicReference, then there could be a chance to unblock the join(). There should be no GC here until the unblock happens. BUT the myOuterThread waiting on that join() doesn't have to do anything more after the join(). So it could be an optimization to erase that thread before someone from outside completes B. But I think this would not happen either?!
One more question: how can I prove that behavior, i.e. whether threads are blocked waiting on a join() or have been returned to a thread pool, where the thread also "blocks"?

You seem to be struggling with different ways that CompletableFuture might leak, depending on how you created it. But it doesn't matter how, where, when or why it was created. The only thing that matters is whether or not it is still reachable.
Will the Garbage Collector remove all of that?
There are two places where we would expect there to be references to a CompletableFuture:
In the Runnable (or whatever) that would complete the future.
In any other code that would (at some point) attempt to get the eventual value from the future.
If you have a call to thenApplyAsync() or anyOf(), then the reference to the Runnable (or function) is in the arguments to that call. If the call can still happen, then the reference to the Runnable must still be reachable.
In your example:
var notStartedCompletableFuture = new CompletableFuture<Object>();
if the variable notStartedCompletableFuture is still accessible by some code that is still executing, then that CompletableFuture is reachable and won't be garbage collected.
On the other hand, if notStartedCompletableFuture is no longer accessible, and if the future is no longer reachable by some other path, then it won't be reachable at all ... and will be a candidate for garbage collection.
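One way to probe this is with a WeakReference, which the collector clears once its referent is no longer strongly reachable. The sketch below is an illustration only: the class and method names are made up, and System.gc() is merely a request, so a "not cleared" result for the unreachable chain proves nothing. The only outcome that is guaranteed is that the chain kept in a static field is never cleared.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.CompletableFuture;

final class UncompletedChainGcDemo {

    // Strong reference used only for the "still reachable" case.
    static CompletableFuture<Object> kept;

    // Builds root -> thenApplyAsync -> anyOf without ever completing the root,
    // and hands back only a weak reference to the root.
    static WeakReference<CompletableFuture<Object>> buildChain(boolean keepRoot) {
        CompletableFuture<Object> root = new CompletableFuture<>();
        CompletableFuture<Object> next = root.thenApplyAsync(x -> new Object());
        CompletableFuture.anyOf(root, next);
        kept = keepRoot ? root : null;
        return new WeakReference<>(root);
    }

    // Requests a GC a few times and reports whether the referent was cleared.
    static boolean clearedAfterGc(WeakReference<?> ref) {
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc(); // a hint only; the JVM may ignore it
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("unreachable chain cleared: " + clearedAfterGc(buildChain(false)));
        System.out.println("reachable chain cleared:   " + clearedAfterGc(buildChain(true)));
    }
}
```

Whether the future was ever "started" plays no role here; the two cases differ only in whether a strong reference to the root survives.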
If there is a join() / get() at the end of that chain -> same question: Will the Garbage Collector remove all of that?
That makes no difference. It is all based on reachability. (The only wrinkle is that a thread that is currently alive [1] is always reachable, irrespective of any other references to its Thread object. The same applies to its Runnable, and other objects reachable from the Runnable.)
But it is worth noting that if you call join() or get() on a thread / future that never terminates / completes, you will block the current thread, potentially for ever. And that is as bad as a thread leak.
1 - A thread is "alive" from when it is started to when it terminates.
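If there is any doubt that a future will be completed, the waiting side can bound the wait instead of blocking indefinitely. The following is a minimal sketch using the timed get(long, TimeUnit) overload; the helper name getOrFallback and the fallback value are assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedWaitDemo {

    // Waits at most timeoutMillis for the future; returns fallback if it is still incomplete.
    static <T> T getOrFallback(CompletableFuture<T> future, long timeoutMillis, T fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return fallback; // nobody completed the future in time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        System.out.println(getOrFallback(neverCompleted, 100, "timed out"));
    }
}
```

With this, the calling thread terminates (or goes back to its pool) after the timeout, even if the future stays incomplete.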
When is the Thread-Executor signaled to schedule a new task?
It depends what you mean by "schedule". If you mean, when is the task submitted, the answer is when submit is called. If you mean, when is it actually run ... well it goes into the queue, and it runs when it gets to the head of the queue and a worker thread is free to execute it.
In the case of thenApplyAsync(), the function is handed to the executor (i.e. the execute(...) call occurs) when the stage it depends on completes; if that stage has already completed, this happens during the thenApplyAsync() call itself. anyOf() and allOf() do not submit tasks at all; they only register a completion action on each of the futures passed in.
The chain itself is built by ordinary Java expression evaluation: if thenApplyAsync is being called on the result of a previous call, then that call must return first.
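One way to observe when a dependent function actually reaches the executor is to pass thenApplyAsync an executor that counts what it receives. This is a sketch under assumptions: the counting executor is a hypothetical stand-in that runs each task inline on the calling thread, which a real pool would not do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

final class SubmissionTimingDemo {

    // Returns {tasks submitted before complete(), tasks submitted after complete()}.
    static int[] countSubmissions() {
        AtomicInteger submitted = new AtomicInteger();
        // Hypothetical executor: counts each task and runs it on the calling thread.
        Executor counting = task -> {
            submitted.incrementAndGet();
            task.run();
        };

        CompletableFuture<String> source = new CompletableFuture<>();
        CompletableFuture<Integer> length = source.thenApplyAsync(String::length, counting);

        int before = submitted.get(); // source is still incomplete
        source.complete("hello");     // completing the source triggers the dependent stage
        int after = submitted.get();

        length.join();                // already complete, because the executor ran inline
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] counts = countSubmissions();
        System.out.println("before complete(): " + counts[0] + ", after: " + counts[1]);
    }
}
```

As long as the source future is never completed, the executor never sees a task, so an uncompleted chain occupies memory but no worker thread.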
In general you don't need try / finally or try with resources to clean up potential memory leaks.
All you need to do is to make sure that you don't keep references to the various futures, stages, etc in variables, data structures, etc that will remain accessible / reachable beyond the lifetime of your computation. If you do that ... those references are liable to be the source of the leaks.
Thread leaks should not be your concern. If your code is not creating threads explicitly, they are being managed by the executor service / pool.

If a thread calls join() or get() on a CompletableFuture that will never be completed, it will remain blocked forever (unless, in the case of get(), it gets interrupted; join() does not respond to interruption), holding a reference to that future.
If that future is the root of a chain of descendant futures (+ tasks and executors), it will also keep a reference to those, which will also remain in memory (as well as all transitively referenced objects).
A future does not normally hold references to its “parent(s)” when created through the then*() methods, so they should normally be garbage collected if there are no other references – but pay attention to those, e.g. local variables in the calling thread, reference to a List<CompletableFuture<?>> used in a lambda after allOf() etc.
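Whether a thread is stuck on such a future can be checked from outside with Thread.getState(). The sketch below, with made-up class and method names, parks a thread in get() on a future nobody completes, records its state, and then releases it by interruption. It uses get() rather than join() because only get() reacts to an interrupt.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

final class BlockedWaiterDemo {

    // Returns "<state while blocked> -> <state after interrupt>".
    static String observeBlockedThenInterrupt() {
        CompletableFuture<Object> never = new CompletableFuture<>();
        Thread waiter = new Thread(() -> {
            try {
                never.get(); // blocks: nothing ever completes this future
            } catch (InterruptedException | ExecutionException e) {
                // interrupted: fall through and let the thread terminate
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        try {
            // Poll until the waiter has actually parked (or give up after 5 seconds).
            long deadline = System.currentTimeMillis() + 5_000;
            while (waiter.getState() != Thread.State.WAITING
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
            Thread.State whileBlocked = waiter.getState();
            waiter.interrupt(); // get() responds to interruption; join() would not
            waiter.join(5_000);
            return whileBlocked + " -> " + waiter.getState();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(observeBlockedThenInterrupt());
    }
}
```

A thread dump (for example via jstack) shows the same information for every thread, including idle pool workers, which are parked on the pool's work queue rather than on a future.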

This answer only addresses the three follow-up questions in your "Edit 2".
So in the normal case here in my example i don't have a outside
reference.
I assume that you are referring to the version with the commented out statements.
The CompletableFutures in the thread have never a chance getting completed.
Incorrect. First, A is completed here:
A.complete(new Object());
Next B is completed here:
B.complete(new Object());
Then you call D.join(). Since D is an anyOf stage, this completes when either of A and C completes. A has already completed, so D.join() may not need to wait for C to complete. But since C applies the function asynchronously, it could complete immediately too.
Normally the GC can safely erase both the thread and and the content in there, the CompletableFutures. But I don't think so, that this would happen?
When D.join() returns, the thread terminates. At that point, its local variables (A, B, C, and D) will be unreachable.
If I abort this by throwing an exception -> the join() is never reached, then i think all would be erased by the GC?
A completes as before, but B, C and D don't.
However, the exception terminates the thread, so the local variables A, B, C, and D then become unreachable.
If I give one of the CompletableFutures to the outside by that AtomicReference, there then could be an chance to unblock the join().
Three points:
The AtomicReference is assigned B so the join() on D is not affected.
As we saw above, it doesn't matter for the variables A, B, C, and D whether a hypothetical join() on outsideReference.get() happens or not. Those variables become unreachable, whichever way the thread terminates.
However, you have now assigned a reference to one of the CompletableFuture objects to a variable which has a different lifetime to the thread. That may mean that that CompletableFuture object stays reachable after the thread has terminated.
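The claim that the thread in "Edit 2" terminates can be checked directly. This is a sketch of the question's thread body without the B.complete(...) line and with lower-case variable names; the wrapper class and the 5-second limit are assumptions for illustration.

```java
import java.util.concurrent.CompletableFuture;

final class EditTwoDemo {

    // Runs the thread body from the question and reports whether the thread finished.
    static boolean threadTerminates() {
        Thread worker = new Thread(() -> {
            CompletableFuture<Object> a = new CompletableFuture<>();
            CompletableFuture<Object> b = new CompletableFuture<>(); // never completed here
            CompletableFuture<Object> c = a.thenApplyAsync(element -> new Object());
            CompletableFuture<Object> d = CompletableFuture.anyOf(a, c);
            a.complete(new Object());
            d.join(); // returns: d completes as soon as a (or c) has completed
        });
        worker.start();
        try {
            worker.join(5_000); // wait for the worker, but not indefinitely
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker terminated: " + threadTerminates());
    }
}
```

Once the worker has terminated, nothing references a, b, c or d any more, so all four futures are candidates for collection even though b was never completed.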

Related

The behavior of "Mark & Sweep" in Java, especially for Future object

I'm wondering about the lifetime of a Future object that is not bound to a named variable.
I learned that Java adopts mark & sweep style garbage collection.
In that case, any un-named object can be immediately deleted from heap. So I'm wondering if the Future might be swept out from memory even before the Runnable completes, or the memory might never be released.
Any information would be helpful, thanks!
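import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class Main {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> { return true; }); // not bound to a variable
        Thread.sleep(1000);
        executor.shutdown(); // added so that the JVM can exit
    }
}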

I learned that Java adopts mark & sweep style garbage collection.
That is mostly incorrect. Most modern Java garbage collectors are NOT mark & sweep. They are mostly copying collectors that work by evacuating (copying) objects to a "to" space as they are marked. And most Java garbage collectors are also generational.
There is a lot of material published by Oracle about the Java garbage collectors and how they work. And there are good textbooks on the subject too.
In that case, any un-named object can be immediately deleted from heap.
Names have nothing to do with it. References are not names, and neither are variables. Java objects are deleted by the GC only if it finds that they are unreachable; i.e. if no code will ever be able to find them again [1]. Furthermore they are not deleted immediately, or even (necessarily) at the next GC run.
So I'm wondering if the Future might be swept out from memory even before the Runnable completes, or the memory might never be released.
(That's a Callable rather than a Runnable. A Runnable doesn't return anything.)
The answer is no it won't.
The life cycle is something like this:
1. You call submit passing a Callable.
2. A Future is created. (For the standard executors this is a FutureTask wrapping the Callable, not a CompletableFuture.)
3. The Future, together with the Callable, is added to the executor's queue.
4. The Future is returned to the caller. (In your case, the caller throws it away.)
5. At a later point, a worker thread takes the Future and the Callable from the queue and executes the Callable.
6. Then the worker thread stores the result in the Future.
7. Finally, something will typically call Future.get to obtain the result. (But not in your example.)
In order for step 6 to work, the Future must still be reachable. It won't be thrown away until all references are lost or discarded. Certainly not until after step 6 has completed.
Bottom line: Java handles the Future just like it would any other (normal) object. Don't worry about it. If anything needs it, it won't disappear.
1 - Reachability is a bit more complicated when you consider finalization and Reference types. But the same general principle applies. If any code could still see an object, it won't be deleted.

"Submit" means to give something to someone, in this case you're giving a piece of code (in the form of a Callable) to the ExecutorService for later execution. In return, the method returns a Future object that will be updated with the result when it is done.
In order for the ExecutorService to update the Future object, it needs to hold on to the Future object too, together with the code reference (the Callable).
Therefore, the ExecutorService maintains references to both the Callable object and the Future object until the job has been completed. Those references make both objects reachable, preventing the objects from being garbage-collected.
Since your code discarded the returned Future object, the object will become eligible for GC as soon as the job completes, but not before that.
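That the executor itself keeps the Future alive can be checked with a WeakReference. In this sketch (class and method names are invented for illustration) the caller keeps no strong reference to the Future, and the task is held open with a latch while a GC is requested. The check relies only on the guarantee that a strongly reachable object is never cleared.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class ExecutorHoldsFutureDemo {

    // Submits a task, keeps only a weak reference to its Future, and checks whether
    // the Future survives a GC request while the task has not finished yet.
    static boolean futureSurvivesGcWhilePending() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        Callable<Boolean> task = () -> release.await(5, TimeUnit.SECONDS);
        try {
            WeakReference<Future<Boolean>> weak = new WeakReference<>(executor.submit(task));
            System.gc(); // only a hint, but a strongly reachable object is never cleared
            boolean stillThere = weak.get() != null;
            release.countDown(); // let the task finish
            return stillThere;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Future alive while task pending: " + futureSurvivesGcWhilePending());
    }
}
```

After the task completes the executor drops its reference, and only then can the Future be collected; the sketch does not try to observe that second half, because collection timing is up to the JVM.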

If you discard a reference to a Future, will it still run?

In my method, I have to call another method (AnotherMethod) that returns a future.
eg.
private static void myMethod() {
    Future<MyObj> myObj = AnotherMethod();
    return;
}
I don't actually care about the value returned by AnotherMethod (eg. the value of myObj), but I do want AnotherMethod to run fully.
If I discard the reference to the future (as in the above example), will AnotherMethod still finish running?
I understand it won't finish before returning from myMethod, but will it still complete at some point even though there's no reference to myObj anymore?

First of all, AnotherMethod will always be executed from start to end because you call it. As for concurrency, if AnotherMethod starts a Thread or submits a task to an executor, then that concurrent execution will not be interrupted. The garbage collector does not interrupt threads, because live threads are GC roots, i.e. top-level objects in the JVM.
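A quick way to convince yourself is to give the task a visible side effect and then throw the Future away. In the sketch below, anotherMethod is a hypothetical stand-in for the AnotherMethod of the question (its real implementation is not shown there), and the latch exists only to make the side effect observable.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class DiscardedFutureDemo {

    // Hypothetical stand-in for AnotherMethod(): submits work and returns its Future.
    static Future<String> anotherMethod(ExecutorService executor, CountDownLatch finished) {
        return executor.submit(() -> {
            finished.countDown(); // side effect that proves the task body ran
            return "done";
        });
    }

    // Calls anotherMethod() and ignores the returned Future, as in the question.
    static boolean taskRunsWithoutReference() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch finished = new CountDownLatch(1);
        try {
            anotherMethod(executor, finished); // Future discarded on purpose
            return finished.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("task ran: " + taskRunsWithoutReference());
    }
}
```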

Memory consistency effects in ConcurrentMap

According to the ConcurrentMap Javadoc:
Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread.
What is the meaning of above statement? And how does it work as the get() method in ConcurrentHashMap is not blocking (compared to BlockingQueue for example)?

The meaning is rather simple. Assume you have two pieces of code:
a = new A();
b = ...
someConcurrentHashMap.put(b, whatever);
And then:
Whatever value = someConcurrentHashMap.get(b);
c = new C();
When b is the same object, these two pieces of code are executed by two different threads, and the get() observes the entry stored by the put(), then it is guaranteed that a = new A() happens before c = new C().
For further reading on "happens before" - see here.
For the implementation details I recommend you to study the source code - that contains tons of (non-javadoc!) comments that explain the inner workings of this class.

GhostCat already explained the meaning of happens-before. However, it might be worth noting the difference between this and "blocking".
In a blocking queue, a thread trying to poll from the queue will wait until something is available.
For something like a ConcurrentHashMap, this is not the case. A happens-before relationship simply means that everything you did before adding it to the map has already happened when the other thread accesses it. But it doesn't mean the other thread will wait for something with the given key to be available.
To give an example where this is important, consider two classes Foo and Bar. In the constructor of Foo, we add it to a list in Bar. Now we put this instance of Foo in the ConcurrentHashMap and access it in another thread. It is logical that everything we did to that instance of Foo has already happened. However, Java will also make sure that the instance of Foo has already been added to the list in Bar.
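The difference between "visible" and "blocking" can be sketched with a plain field that is written before the object is published through the map. The Box class, the key and the value 42 are assumptions chosen for illustration. Note that the reader has to poll, because get() returns null immediately when the key is absent.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class MapHappensBeforeDemo {

    // Deliberately a plain (non-volatile, non-final) field.
    static final class Box {
        int value;
    }

    // The writer sets the field and then publishes the object through the map.
    // Once the reader sees the object, the earlier write to 'value' is visible too.
    static int readAfterPublish() {
        ConcurrentMap<String, Box> map = new ConcurrentHashMap<>();
        Thread writer = new Thread(() -> {
            Box box = new Box();
            box.value = 42;      // happens-before the put below
            map.put("key", box);
        });
        writer.start();

        Box seen;
        while ((seen = map.get("key")) == null) {
            Thread.yield();      // get() never blocks, so the reader has to poll
        }
        return seen.value;
    }

    public static void main(String[] args) {
        System.out.println(readAfterPublish());
    }
}
```

Without the map (for example, publishing the Box through a plain static field), the reader would have no such guarantee and could in principle observe the default value of the field.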

Garbage collection and asynchronous calls / Future objects

Below is a sample code that utilizes the Future interface to make an asynchronous call. I need some clarification about the get() method.
Future<String> future = getAsyncString();
//do something ...
String msg = "";
if (validation)
return;
else
msg = future.get();
//do something else...
return;
The future variable is initialized in a method, so the variable will soon be cleared by the GC after the method's execution, as it is no longer used.
So if the code enters the if branch, what will be the state of the JVM? How is the JVM going to handle the wrapped result in case no one is going to read it back? Does it affect the thread pool or the executor?

How is the JVM going to handle the wrapped result in case that noone is going to read it back?
Presumably you got the Future object from an Executor. For this executor to be able to set the result in the Future, it holds a reference to the Future. In other words, just because the method local reference to the object disappears as the call stack is popped, doesn't mean that the Future object (which is on the heap) is automatically eligible for garbage collection.
The async call is not cancelled or anything like that. The executor will perform the call, fill in the result, and presumably drop its reference to the Future object. At this point the object becomes unreachable and eligible for garbage collection.
If you're certain that your code doesn't keep a reference to the Future object (i.e. leaking it in the // do something... part) then you can be sure that the Future object is (eventually) collected by the GC. (The executor doesn't have any subtle memory leaks here.)
[...] so the variable will be soon cleared by the GC.
To be precise, the variable will be discarded as the call stack is popped. This will eventually cause the Future object to be unreachable and eligible for garbage collection. The object will however typically not be garbage collected immediately as the method returns.

How is the JVM going to handle the wrapped result in case that noone is going to read it back?
If nobody (I mean any program) is going to read it back, then the GC will take care of it during garbage collection. But that does not mean getAsyncString() will not be executed completely; it will run to completion just like any normal method.

I guess a scheduled future will have some internal references from the thread pool's queues until the task completes, so it can't be collected by the GC before the task is complete.
Maybe there is an additional level of abstraction between the future and the executor, so that the future can be collected earlier. But I'm sure that once a task is submitted it will be run, no matter whether a reference to the future was saved or not.

You have a guarantee that the object will not be GCed while you are in the scope in which a reference to it is defined, or while there is a reference to the object somewhere else in the code.
This applies to all objects, and Future is no different here.
So once your method ends and its stack frame is cleared, your object will at some point become eligible for garbage collection, but certainly not while a reference to it still exists on the method's call stack.

when will finalize() be called on my class instance in this scenario?

I know that finalize() is called whenever a class instance is collected by the garbage collector. However, I am a little bit confused when passing an instance of a class to another thread via a queue.
Let's say this is a skeleton of Thread1:
for(i=0; i<1000; i++) {
Packet pkt = new Packet(); // instance of class
pkt.id = i;
thread2.queue.put(pkt);
}
Then, thread 2 will remove the packet from the queue and perform lengthy operations. Does this second thread get a "copy" of the packet, or is it by some form of reference? The importance is that, if it is by copy, the finalize() on the instance created in thread 1 can be called before thread 2 is done with the packet. If it is by reference, I am guaranteed that finalize() is only called once for the information in the packet.
This basic example may not show the importance, but I am storing a C-pointer (from JNI) in the packet to destroy some memory when I am done with the object. If it is passed by copy, the memory may get destroyed before the second thread is done with it. If it is passed by reference, then it should only be destroyed once the GC sees it is no longer in use by BOTH threads (my desired behavior). If this latter scenario is not guaranteed, I will not use finalize() and use something else but it will be more complex.

The second thread receives the same actual object instance. You're safe from premature finalization.
It receives a copy of the object reference, if you want to think of it that way.
In addition, finalize is not necessarily run when the garbage collector finds that the object has become garbage - the VM is free to run it at any later time, and to actually reclaim the memory some time after that. You really can't rely on when finalize will be run. However, since what you care about is knowing that finalize won't be called before the second thread finishes with the object, that's immaterial. But worth knowing!
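That the consumer gets the same object rather than a copy can be verified with a reference comparison. The following sketch uses a minimal stand-in for the Packet class and a LinkedBlockingQueue; both are assumptions, since the question does not show the real queue type.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class SameInstanceDemo {

    // Minimal stand-in for the Packet class from the question.
    static final class Packet {
        int id;
    }

    // Puts a packet on a queue, takes it on another thread, and reports whether
    // the consumer received the very same object (reference equality).
    static boolean consumerSeesSameObject() {
        BlockingQueue<Packet> queue = new LinkedBlockingQueue<>();
        Packet sent = new Packet();
        sent.id = 7;
        Packet[] received = new Packet[1];
        Thread consumer = new Thread(() -> {
            try {
                received[0] = queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            queue.put(sent);
            consumer.join(); // join() also makes the consumer's write to received[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0] == sent;
    }

    public static void main(String[] args) {
        System.out.println("same instance: " + consumerSeesSameObject());
    }
}
```

Because both threads refer to one object, it stays reachable until the last of those references is gone, and finalize() cannot run before that.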
