Old Generation memory increases when using ThreadLocal

I have class Collector and ThreadLocalScope like this:
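class Collector {
    private LinkedList<Event> events;

    Collector() {
        events = new LinkedList<>();
    }

    void add(Event e) {
        events.add(e);
    }

    void flush() {
        // Hand the current list to a background thread and start a fresh one.
        LinkedList<Event> copy = events;
        new Thread(() -> {
            for (Event e : copy) {
                sendToServer(e);
            }
            copy.clear();
        }).start();
        events = new LinkedList<>();
    }
}

class ThreadLocalScope {
    public static ThreadLocal<Collector> local = new ThreadLocal<>() {
        @Override
        protected Collector initialValue() {
            return new Collector();
        }
    };

    // Accessor assumed from the ThreadLocalScope.get() calls in the Job code.
    static Collector get() {
        return local.get();
    }
}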
Collector simply adds events and when flush is called sends those events to an API in a new thread. The Collector is initialized in a ThreadLocal.
I also have a Job class which is executed several times (using Quartz). When defined like this everything works great:
class Job {
    void execute() {
        for (int i = 0; i < 100_000; i++) {
            ThreadLocalScope.get().add(new Event());
        }
        ThreadLocalScope.get().flush();
    }
}
However if instead I hold onto Collector like this:
class Job {
    Collector collector;

    Job() {
        collector = ThreadLocalScope.get();
    }

    void execute() {
        for (int i = 0; i < 100_000; i++) {
            collector.add(new Event());
        }
        collector.flush();
    }
}
I see my Old Generation Memory usage increasing rapidly and Stop-the-world Garbage Collection cycles happening all the time. The only difference is I have added Collector as a member variable rather than calling ThreadLocalScope.get() every time.
The increase could only mean that the Events are being moved into Old Generation. But why would that happen? Collector immediately clears all its references to the Events, so even if it is not GCed, the events should be.

I said:
I think you might have a thread-safety issue here.
Incorrect. I think it is simpler than that.
In the first version you are calling ThreadLocalScope.get() in the context of the thread that is executing the job.
In the second version you are calling ThreadLocalScope.get() in the context of the thread that is creating the Job() object. It is then squirreled away in a variable and used later in the executor thread. Assuming that the Job() objects are all created on the same thread, that means that your execute() methods share the same Collector object. And they are potentially running on different threads. Since Collector is not thread-safe, that is a hazard.
There is another thing that you might not be aware of. It is likely that Quartz is using a thread pool. That means that when an execute() call terminates, the thread goes back to the pool. Next time around, if Quartz uses the same thread, it will reuse the Collector object from last time.
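The difference between the two versions can be reproduced in isolation. The following is a minimal sketch, not the asker's code: ThreadLocalCaptureDemo and LOCAL are hypothetical names, and a plain Object stands in for Collector. It resolves a thread local on the calling thread and compares it with what a pool thread gets from the same ThreadLocal.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalCaptureDemo {
    // Hypothetical stand-in for ThreadLocalScope.local: one fresh object per thread.
    static final ThreadLocal<Object> LOCAL = ThreadLocal.withInitial(Object::new);

    /** True if the value captured on the calling thread is the one the worker thread sees. */
    public static boolean capturedValueIsWorkersValue() {
        Object captured = LOCAL.get(); // resolved on the calling thread, like the Job() constructor
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Object> lookup = LOCAL::get; // resolved on the pool thread, like execute()
            Object seenByWorker = pool.submit(lookup).get();
            return captured == seenByWorker;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(capturedValueIsWorkersValue()); // prints "false"
    }
}
```

A value captured in a constructor therefore belongs to the constructing thread, whichever thread later uses it.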

Related

StringCoding has threadLocal [duplicate]

Does anyone have an example of how to do this? Are they handled by the garbage collector? I'm using Tomcat 6.

The javadoc says this:
"Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist)."
If your application or (if you are talking about request threads) container uses a thread pool, that means that threads don't die. If necessary, you would need to deal with the thread locals yourself. The only clean way to do this is to call the ThreadLocal.remove() method.
There are two reasons you might want to clean up thread locals for threads in a thread pool:
to prevent memory (or hypothetically resource) leaks, or
to prevent accidental leakage of information from one request to another via thread locals.
Thread local memory leaks should not normally be a major issue with bounded thread pools, since any thread locals are likely to get overwritten eventually, i.e. when the thread is reused. However, if you make the mistake of creating new ThreadLocal instances over and over again (instead of using a static variable to hold a singleton instance), the thread local values won't get overwritten, and will accumulate in each thread's threadlocals map. This could result in a serious leak.
Assuming that you are talking about thread locals that are created / used during a webapp's processing of an HTTP request, then one way to avoid the thread local leaks is to register a ServletRequestListener with your webapp's ServletContext and implement the listener's requestDestroyed method to cleanup the thread locals for the current thread.
Note that in this context you also need to consider the possibility of information leaking from one request to another.
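Both concerns (stale values surviving in pooled threads, and information leaking from one request to the next) can be shown with a one-thread pool. This is a minimal sketch under assumed names: ThreadLocalCleanupDemo and CURRENT_USER are hypothetical, and the two submitted tasks stand in for two consecutive requests handled by the same worker thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalCleanupDemo {
    // Hypothetical per-request state, held in a static ThreadLocal.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Runs two "requests" on the same pooled thread and returns what the second one sees. */
    public static String valueSeenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstRequest = () -> {
                CURRENT_USER.set("alice");
                try {
                    // ... handle the request ...
                } finally {
                    if (cleanUp) {
                        CURRENT_USER.remove(); // the clean way to drop the value
                    }
                }
            };
            pool.submit(firstRequest).get();
            Callable<String> secondRequest = CURRENT_USER::get;
            return pool.submit(secondRequest).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByNextTask(false)); // prints "alice"
        System.out.println(valueSeenByNextTask(true));  // prints "null"
    }
}
```

In a servlet container the finally block would correspond to the listener's requestDestroyed method.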

Here is some code to clean all thread local variables from the current thread when you do not have a reference to the actual thread local variable. You can also generalize it to clean up thread local variables for other threads. Note that it reflects on JDK-internal fields, so on recent JDKs (16 and later) it fails unless java.base/java.lang and java.base/java.lang.ref are opened with --add-opens:
import java.lang.ref.Reference;
import java.lang.reflect.Array;
import java.lang.reflect.Field;

private void cleanThreadLocals() {
    try {
        // Get a reference to the thread locals table of the current thread
        Thread thread = Thread.currentThread();
        Field threadLocalsField = Thread.class.getDeclaredField("threadLocals");
        threadLocalsField.setAccessible(true);
        Object threadLocalTable = threadLocalsField.get(thread);
        if (threadLocalTable == null) {
            return; // this thread has never used a thread local
        }
        // Get a reference to the array holding the thread local variables inside the
        // ThreadLocalMap of the current thread
        Class<?> threadLocalMapClass = Class.forName("java.lang.ThreadLocal$ThreadLocalMap");
        Field tableField = threadLocalMapClass.getDeclaredField("table");
        tableField.setAccessible(true);
        Object table = tableField.get(threadLocalTable);
        // The key to the ThreadLocalMap is a WeakReference object. The referent field of this object
        // is a reference to the actual ThreadLocal variable
        Field referentField = Reference.class.getDeclaredField("referent");
        referentField.setAccessible(true);
        for (int i = 0; i < Array.getLength(table); i++) {
            // Each entry in the table array of ThreadLocalMap is an Entry object
            // representing the thread local reference and its value
            Object entry = Array.get(table, i);
            if (entry != null) {
                // Get a reference to the thread local object and remove it from the table
                ThreadLocal<?> threadLocal = (ThreadLocal<?>) referentField.get(entry);
                // The key is weakly referenced, so it may already have been cleared
                if (threadLocal != null) {
                    threadLocal.remove();
                }
            }
        }
    } catch (Exception e) {
        // Reflection on JDK internals can fail; surface the failure to the caller
        throw new IllegalStateException(e);
    }
}

There is no way to clean up ThreadLocal values except from within the thread that put them in there in the first place (or when the thread is garbage collected, which is not the case with worker threads). This means you should take care to clean up your ThreadLocals when a servlet request is finished (or before transferring AsyncContext to another thread in Servlet 3), because after that point you may never get a chance to enter that specific worker thread, and hence will leak memory in situations when your web app is undeployed while the server is not restarted.
A good place to do such cleanup is ServletRequestListener.requestDestroyed().
If you use Spring, all the necessary wiring is already in place, you can simply put stuff in your request scope without worrying about cleaning them up (that happens automatically):
RequestContextHolder.getRequestAttributes().setAttribute("myAttr", myAttr, RequestAttributes.SCOPE_REQUEST);
. . .
RequestContextHolder.getRequestAttributes().getAttribute("myAttr", RequestAttributes.SCOPE_REQUEST);

Reading the Javadoc documentation again carefully:
'Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).'
There is no need to clean anything; there is an 'AND' condition for the leak to survive. So even in a web container where threads outlive the application, as long as the webapp class is unloaded (only being referenced from a static class loaded in the parent class loader would prevent this, and that has nothing to do with ThreadLocal but is a general issue with shared jars holding static data), the second leg of the AND condition is no longer met, so the thread's copy of the thread local is eligible for garbage collection.
Thread locals can't be the cause of memory leaks, as far as the implementation meets the documentation.

I would like to contribute my answer to this question even though it's old. I had been plagued by the same problem (gson threadlocal not getting removed from the request thread), and had even gotten comfortable restarting the server any time it ran out of memory (which sucks big time!!).
In the context of a Java web app that is set to dev mode (in that the server is set to bounce every time it senses a change in the code, and possibly also running in debug mode), I quickly learned that threadlocals can be awesome and sometimes be a pain. I was using a threadlocal Invocation for every request. Inside the Invocation, I'd sometimes also use gson to generate my response. I would wrap the Invocation inside a 'try' block in the filter, and destroy it inside a 'finally' block.
What I observed (I have no metrics to back this up for now) is that if I made changes to several files and the server was constantly bouncing in between my changes, I'd get impatient and restart the server (Tomcat, to be precise) from the IDE. More likely than not, I'd end up with an 'Out of memory' exception.
How I got around this was to include a ServletRequestListener implementation in my app, and my problem vanished. I think what was happening is that if the server bounced several times in the middle of a request, my threadlocals were not getting cleared up (gson included), so I'd get a warning about the threadlocals, and two or three warnings later the server would crash. With the ServletRequestListener explicitly closing my threadlocals, the gson problem vanished.
I hope this makes sense and gives you an idea of how to overcome threadlocal issues. Always close them around their point of usage. In the ServletRequestListener, test each threadlocal wrapper, and if it still has a valid reference to some object, destroy it at that point.
I should also point out that you should make it a habit to wrap a threadlocal as a static variable inside a class. That way you can be guaranteed that by destroying it in the ServletRequestListener, you won't have to worry about other instances of the same class hanging around.

@lyaffe's answer is the best possible for Java 6. There are a few issues that this answer resolves using what is available in Java 8.
@lyaffe's answer was written for Java 6 before MethodHandle became available. It suffers from performance penalties due to reflection. If used as below, MethodHandle provides zero overhead access to fields and methods.
@lyaffe's answer also goes through the ThreadLocalMap.table explicitly and is prone to bugs. There is a method ThreadLocalMap.expungeStaleEntries() now available that does the same thing.
The code below has 3 initialization methods to minimize the cost of invoking expungeStaleEntries().
private static final MethodHandle s_getThreadLocals = initThreadLocals();
private static final MethodHandle s_expungeStaleEntries = initExpungeStaleEntries();
private static final ThreadLocal<Object> s_threadLocals = ThreadLocal.withInitial(() -> getThreadLocals());
public static void expungeThreadLocalMap()
{
Object threadLocals;
threadLocals = s_threadLocals.get();
try
{
s_expungeStaleEntries.invoke(threadLocals);
}
catch (Throwable e)
{
throw new IllegalStateException(e);
}
}
private static Object getThreadLocals()
{
ThreadLocal<Object> local;
Object result;
Thread thread;
local = new ThreadLocal<>();
local.set(local); // Force ThreadLocal to initialize Thread.threadLocals
thread = Thread.currentThread();
try
{
result = s_getThreadLocals.invoke(thread);
}
catch (Throwable e)
{
throw new IllegalStateException(e);
}
return(result);
}
private static MethodHandle initThreadLocals()
{
MethodHandle result;
Field field;
try
{
field = Thread.class.getDeclaredField("threadLocals");
field.setAccessible(true);
result = MethodHandles.
lookup().
unreflectGetter(field);
result = Preconditions.verifyNotNull(result, "result is null");
}
catch (NoSuchFieldException | SecurityException | IllegalAccessException e)
{
throw new ExceptionInInitializerError(e);
}
return(result);
}
private static MethodHandle initExpungeStaleEntries()
{
MethodHandle result;
Class<?> clazz;
Method method;
Object threadLocals;
threadLocals = getThreadLocals();
clazz = threadLocals.getClass();
try
{
method = clazz.getDeclaredMethod("expungeStaleEntries");
method.setAccessible(true);
result = MethodHandles.
lookup().
unreflect(method);
}
catch (NoSuchMethodException | SecurityException | IllegalAccessException e)
{
throw new ExceptionInInitializerError(e);
}
return(result);
}

The JVM automatically cleans up all the objects inside the ThreadLocal that are no longer referenced.
Another way to clean up those objects (for example, they could be all the thread-unsafe objects that exist around) is to put them inside some object holder class, which basically holds the object, and to override its finalize method to clean up the object that resides within it. Again, when the finalize method is invoked depends on the garbage collector and its policies.
Here is a code sample:
public class MyObjectHolder {
private MyObject myObject;
public MyObjectHolder(MyObject myObj) {
myObject = myObj;
}
public MyObject getMyObject() {
return myObject;
}
protected void finalize() throws Throwable {
myObject.cleanItUp();
}
}
public class SomeOtherClass {
static ThreadLocal<MyObjectHolder> threadLocal = new ThreadLocal<MyObjectHolder>();
.
.
.
}

final ThreadLocal<T> old = backend;
// try to clean by reflect
try {
// BGN copy from apache ThreadUtils#getAllThreads
ThreadGroup systemGroup = Thread.currentThread().getThreadGroup();
while (systemGroup.getParent() != null) {
systemGroup = systemGroup.getParent();
}
int count = systemGroup.activeCount();
Thread[] threads;
do {
threads = new Thread[count + (count / 2) + 1]; //slightly grow the array size
count = systemGroup.enumerate(threads, true);
//return value of enumerate() must be strictly less than the array size according to javadoc
} while (count >= threads.length);
// END
// remove by reflect
final Field threadLocalsField = Thread.class.getDeclaredField("threadLocals");
threadLocalsField.setAccessible(true);
Class<?> threadLocalMapClass = Class.forName("java.lang.ThreadLocal$ThreadLocalMap");
Method removeMethod = threadLocalMapClass.getDeclaredMethod("remove", ThreadLocal.class);
removeMethod.setAccessible(true);
for (int i = 0; i < count; i++) {
final Object threadLocalMap = threadLocalsField.get(threads[i]);
if (threadLocalMap != null) {
removeMethod.invoke(threadLocalMap, old);
}
}
}
catch (Exception e) {
throw new ThreadLocalAttention(e);
}

Is it safe to store threads in a ConcurrentMap?

I am building a backend service whereby a REST call to my service creates a new thread. The thread waits for another REST call; if it does not receive anything within, say, 5 minutes, the thread will die.
To keep track of all the threads I have a collection of all the currently running threads, so that when the REST call finally comes in (such as a user accepting or declining an action) I can identify that thread using the userID. If it's declined we just remove that thread from the collection; if it's accepted the thread can carry on with the next action. I have implemented this using a ConcurrentMap to avoid concurrency issues.
Since this is my first time working with threads, I want to make sure that I am not overlooking any issues that may arise. Please have a look at my code and tell me if I could do it better or if there are any flaws.
public class UserAction extends Thread {
    int userID;
    boolean isAccepted = false;
    boolean isDeclined = false;
    long timeNow = System.currentTimeMillis();
    long timeElapsed = timeNow + 50000;

    public UserAction(int userID) {
        this.userID = userID;
    }

    public void declineJob() {
        this.isDeclined = true;
    }

    public void acceptJob() {
        this.isAccepted = true;
    }

    public boolean waitForApproval() {
        while (System.currentTimeMillis() < timeElapsed) {
            System.out.println("waiting for approval");
            if (isAccepted) {
                return true;
            } else if (isDeclined) {
                return false;
            }
        }
        return isAccepted;
    }

    @Override
    public void run() {
        if (!waitForApproval()) {
            // must've timed out or user declined so remove from list and return thread immediately
            Controller.tCollection.remove(userID);
            // end the thread here
            return;
        }
        // must've been accepted so continue working
    }
}

public class Controller {
    public static ConcurrentHashMap<Integer, UserAction> tCollection = new ConcurrentHashMap<>();

    public static void main(String[] args) throws InterruptedException {
        int barberID1 = 1;
        int barberID2 = 2;
        tCollection.put(barberID1, new UserAction(barberID1));
        tCollection.put(barberID2, new UserAction(barberID2));
        tCollection.get(barberID1).start();
        tCollection.get(barberID2).start();
        Thread.sleep(1000);
        // simulate REST call accepting/declining job after 1 second. Usually this would be in a spring mvc RESTcontroller in a different class.
        tCollection.get(barberID1).acceptJob();
        tCollection.get(barberID2).declineJob();
    }
}

You don't need (explicit) threads for this. Just a shared pool of task objects that are created on the first rest call.
When the second rest call comes, you already have a thread to use (the one that's handling the rest call). You just need to retrieve the task object according to the user id. You also need to get rid of expired tasks, which can be done with for example a DelayQueue.
Pseudocode:
public void rest1(User u) {
    UserTask ut = new UserTask(u);
    pool.put(u.getId(), ut);
    delayPool.put(ut); // Assuming UserTask implements Delayed with a 5 minute delay
}

public void rest2(User u, Action a) {
    UserTask ut = pool.get(u.getId());
    if (!a.isAccepted() || ut == null)
        pool.remove(u.getId());
    else
        process(ut);
    // Clean up the pool from any expired tasks, can also be done in the beginning
    // of the method, if you want to make sure that expired actions aren't performed
    UserTask expired;
    while ((expired = delayPool.poll()) != null)
        pool.remove(expired.getId());
}
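The pseudocode above can be turned into something runnable roughly as follows. This is a sketch under assumptions: PendingApprovals, firstCall, secondCall and the ttlMillis parameter are hypothetical names, user ids are plain ints, and eviction is done at the start of the second call so that expired tasks are never accepted.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class PendingApprovals {
    /** A pending task that expires ttlMillis after creation. */
    static final class UserTask implements Delayed {
        final int userId;
        final long expiresAtNanos;

        UserTask(int userId, long ttlMillis) {
            this.userId = userId;
            this.expiresAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(ttlMillis);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(expiresAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    private final ConcurrentMap<Integer, UserTask> pool = new ConcurrentHashMap<>();
    private final DelayQueue<UserTask> delayPool = new DelayQueue<>();
    private final long ttlMillis;

    public PendingApprovals(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** First REST call: register a pending task; no thread is created. */
    public void firstCall(int userId) {
        UserTask task = new UserTask(userId, ttlMillis);
        pool.put(userId, task);
        delayPool.put(task);
    }

    /** Second REST call: true only if a non-expired task existed and was accepted. */
    public boolean secondCall(int userId, boolean accepted) {
        evictExpired();
        UserTask task = pool.remove(userId);
        return task != null && accepted;
    }

    private void evictExpired() {
        UserTask expired;
        // poll() only returns tasks whose delay has elapsed
        while ((expired = delayPool.poll()) != null) {
            pool.remove(expired.userId, expired); // remove only if it is still this task
        }
    }
}
```

The request-handling thread does all the work here, so no thread sits idle for the five-minute window.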

There's a synchronization issue: you should make your flags isAccepted and isDeclined AtomicBoolean (or at least volatile).
A critical concept is that you need to take steps to make sure changes to memory in one thread are communicated to other threads that need that data. The mechanisms that do this are called memory fences, and they often occur implicitly between synchronization calls.
The idea of a (simple) Von Neumann architecture with a 'central memory' is false for most modern machines, and you need to know data is being shared between caches/threads correctly.
Also, as others suggest, creating a thread for each task is a poor model. It scales badly and leaves your application vulnerable to keeling over if too many tasks are submitted. There is some limit to memory, so you can only have so many pending tasks at a time, but the ceiling for threads will be much lower.
That is made all the worse because you're spin waiting. Spin waiting puts a thread into a loop waiting for a condition. A better model would wait on a condition variable (in Java, Object.wait()/notify() or java.util.concurrent.locks.Condition) so threads not doing anything (other than waiting) could be suspended by the operating system until notified that the thing they're waiting for is (or may be) ready.
There are often significant overheads in time and resources to creating and destroying threads. Given that most platforms can only execute a relatively small number of threads simultaneously, creating lots of 'expensive' threads to have them spend most of their time swapped out (suspended) doing nothing is very inefficient.
The right model launches a pool of a fixed (or relatively fixed) number of threads and places tasks in a shared queue that the threads 'take' work from and process.
That model is known generically as a "Thread Pool".
The entry level implementation you should look at is ThreadPoolExecutor:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
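As an illustration of waiting without spinning, here is a minimal sketch that replaces the two boolean flags with a single CompletableFuture. ApprovalGate and its method names are hypothetical; the point is only that the waiting thread is parked until a decision arrives or the timeout elapses, and that completing the future safely publishes the decision to the waiter.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ApprovalGate {
    // Completed exactly once: true = accepted, false = declined.
    private final CompletableFuture<Boolean> decision = new CompletableFuture<>();

    public void accept() {
        decision.complete(true);
    }

    public void decline() {
        decision.complete(false);
    }

    /** Blocks (without busy-waiting) until a decision or the timeout; a timeout counts as a decline. */
    public boolean awaitApproval(long timeout, TimeUnit unit) {
        try {
            return decision.get(timeout, unit);
        } catch (TimeoutException | ExecutionException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

A task submitted to a ThreadPoolExecutor could call awaitApproval, although holding a pool thread for minutes is still wasteful compared with handling the decision in the thread that serves the second REST call.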

Non blocking function that preserves order

I have the following method:
void store(SomeObject o) {
}
The idea of this method is to store o in permanent storage, but the function should not block, i.e. I must not do the actual storage in the same thread that called store.
I also cannot start a thread and store the object from that other thread, because store might be called a "huge" number of times and I don't want to start spawning threads.
So I have two options, and I don't see how either can work well:
1) Use a thread pool (Executor family)
2) In store, put the object in an array list and return. When the array list reaches e.g. 1000 (random number), start another thread to "flush" the array list to storage. But I would still possibly have the problem of too many threads (thread pool?)
In both cases the only requirement I have is that the objects are persisted in exactly the same order in which they were passed to store. And using multiple threads mixes things up.
How can this be solved?
How can I ensure:
1) Non blocking store
2) Accurate insertion order
3) I don't care about any storage guarantees. If e.g. something crashes I don't care about losing data e.g. cached in the array list before storing them.

I would use a SingleThreadExecutor and a BlockingQueue.
SingleThreadExecutor, as the name says, has one single thread. Use it to poll from the queue and persist objects, blocking if empty.
You can add to the queue without blocking in your store method.
EDIT
Actually, you do not even need that extra queue. The JavaDoc of newSingleThreadExecutor says:
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So I think it's exactly what you need.
private final ExecutorService persistor = Executors.newSingleThreadExecutor();

public void store(final SomeObject o) {
    persistor.submit(new Runnable() {
        @Override
        public void run() {
            // your persist-code here.
        }
    });
}
The advantage of using a Runnable that has a quasi-endless-loop and using an extra queue would be the possibility to code some "Burst"-functionality. For example you could make it wait to persist only when 10 elements are in queue or the oldest element has been added at least 1 minute ago ...
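To check the ordering claim, here is a small self-contained sketch of the single-thread-executor approach. OrderedStore and drain are hypothetical names, Strings stand in for SomeObject, and an in-memory list stands in for the permanent storage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OrderedStore {
    private final ExecutorService persistor = Executors.newSingleThreadExecutor();
    // Stand-in for the real storage; only ever touched by the single worker thread.
    private final List<String> storage = new ArrayList<>();

    /** Returns immediately; the write happens later on the worker thread, in submission order. */
    public void store(String o) {
        persistor.execute(() -> storage.add(o));
    }

    /** Waits for all queued writes, returns a snapshot of the storage, and stops the worker. */
    public List<String> drain() {
        // The snapshot task is queued behind every earlier store() call.
        Callable<List<String>> snapshot = () -> new ArrayList<>(storage);
        try {
            return persistor.submit(snapshot).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            persistor.shutdown();
        }
    }
}
```

Note that the executor's queue is unbounded, so a producer that is persistently faster than the storage will grow the queue without limit.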

I suggest using Chronicle-Queue, which is a library I designed.
It allows you to write in the current thread without blocking. It was originally designed for low latency trading systems. For small messages it takes around 300 ns to write a message.
You don't need to use a background thread or an on-heap queue, and by default it doesn't wait for the data to be written to disk. It also ensures consistent order for all readers. If the program dies at any point after you call finish(), the message is not lost (unless the OS crashes or loses power). It also supports replication to avoid data loss.

Have one separate thread that gets items from the end of a queue (blocking on an empty queue), and writes them to disk. Your main thread's store() function just adds items to the beginning of the queue.
Here's a rough idea (though I assume there will be cleaner or faster ways for doing this in production code, depending on how fast you need things to be):
import java.util.*;
import java.io.*;
import java.util.concurrent.*;
class ObjectWriter implements Runnable {
private final Object END = new Object();
BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
public void store(Object o) throws InterruptedException {
queue.put(o);
}
public ObjectWriter() {
new Thread(this).start();
}
public void close() throws InterruptedException {
queue.put(END);
}
public void run() {
while (true) {
try {
Object o = queue.take();
if (o == END) {
// close output file.
return;
}
System.out.println(o.toString()); // serialize as appropriate
} catch (InterruptedException e) {
}
}
}
}
public class Test {
public static void main(String[] args) throws Exception {
ObjectWriter w = new ObjectWriter();
w.store("hello");
w.store("world");
w.close();
}
}

The comments on your question make it sound like you are unfamiliar with multi-threading, but it's really not that difficult.
You simply need another thread, responsible for writing to the storage, which picks items off a queue. Your store function just adds the objects to the in-memory queue and continues on its way.
Some pseudo-ish code:
final List<SomeObject> queue = new ArrayList<SomeObject>();

void store(SomeObject o) {
    // add it to the queue - note that modifying o after this will also alter the
    // instance in the queue
    synchronized (queue) {
        queue.add(o);
        queue.notify(); // tell the storage thread there's something in the queue
    }
}

void storageThread() throws InterruptedException {
    SomeObject item;
    while (notfinished) {
        synchronized (queue) {
            if (queue.size() > 0) {
                item = queue.get(0); // get from start to ensure same order
                queue.remove(0);
            } else {
                // wait for something
                queue.wait();
                continue;
            }
        }
        // write outside the lock so store() is never blocked by slow storage
        writeToStorage(item);
    }
}

What if the finalize method does not finish? [duplicate]

What will the Finalizer thread do if there is an infinite loop or deadlock in the Java finalize method?

The spec writes:
Before the storage for an object is reclaimed by the garbage collector, the Java Virtual Machine will invoke the finalizer of that object.
The Java programming language does not specify how soon a finalizer will be invoked, except to say that it will happen before the storage for the object is reused.
I read this to mean that the finalizer must have completed before the storage may be reused.
The Java programming language does not specify which thread will invoke the finalizer for any given object.
It is important to note that many finalizer threads may be active (this is sometimes needed on large shared memory multiprocessors), and that if a large connected data structure becomes garbage, all of the finalize methods for every object in that data structure could be invoked at the same time, each finalizer invocation running in a different thread.
That is, finalization may occur in the garbage collector thread, in a separate thread, or even a separate thread pool.
A JVM is not permitted to simply abort executing a finalizer, and can only use a finite number of threads (threads are operating system resources, and operating systems don't support arbitrarily many threads). Non-terminating finalizers will therefore of necessity starve that thread pool, thereby inhibit collection of any finalizable objects, and cause a memory leak.
The following test program confirms this behavior:
public class Test {
    byte[] memoryHog = new byte[1024 * 1024];

    @Override
    protected void finalize() throws Throwable {
        System.out.println("Finalizing " + this + " in thread " + Thread.currentThread());
        for (;;);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            new Test();
        }
    }
}
On Oracle JDK 7, this prints:
Finalizing tools.Test@1f1fba0 in thread Thread[Finalizer,8,system]
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at tools.Test.<init>(Test.java:5)
    at tools.Test.main(Test.java:15)

I would say that since the Java Specification doesn't tell how the finalize method must be invoked (just that it must be invoked, before the object is garbage collected), the behaviour is implementation specific.
The spec doesn't rule out having multiple threads running the process, but doesn't require it:
It is important to note that many finalizer threads may be active
(this is sometimes needed on large shared memory multiprocessors), and
that if a large connected data structure becomes garbage, all of the
finalize methods for every object in that data structure could be
invoked at the same time, each finalizer invocation running in a
different thread.
Looking at the sources of the JDK7, the FinalizerThread keeps the queue of objects scheduled for finalization (actually objects are added to the queue by the GC, when proven to be unreachable - check ReferenceQueue doc):
private static class FinalizerThread extends Thread {
private volatile boolean running;
FinalizerThread(ThreadGroup g) {
super(g, "Finalizer");
}
public void run() {
if (running)
return;
running = true;
for (;;) {
try {
Finalizer f = (Finalizer)queue.remove();
f.runFinalizer();
} catch (InterruptedException x) {
continue;
}
}
}
}
Each object is removed from the queue, and runFinalizer method is run on it. Check is done if the finalization had run on the object, and if not it is being invoked, as a call to a native method invokeFinalizeMethod. The method simply is calling the finalize method on the object:
JNIEXPORT void JNICALL
Java_java_lang_ref_Finalizer_invokeFinalizeMethod(JNIEnv *env, jclass clazz,
jobject ob)
{
jclass cls;
jmethodID mid;
cls = (*env)->GetObjectClass(env, ob);
if (cls == NULL) return;
mid = (*env)->GetMethodID(env, cls, "finalize", "()V");
if (mid == NULL) return;
(*env)->CallVoidMethod(env, ob, mid);
}
This should lead to a situation, where the objects get queued in the list, while the FinalizerThread is blocked on the faulty object, which in turn should lead to OutOfMemoryError.
So to answer the original question:
what will the Finalizer thread do if there is a infinite loop or deadlock in the Java finalize method.
It will simply sit there and run that infinite loop until OutOfMemoryError.
public class FinalizeLoop {
    public static void main(String[] args) {
        Thread thread = new Thread() {
            @Override
            public void run() {
                for (;;) {
                    new FinalizeLoop();
                }
            }
        };
        thread.setDaemon(true);
        thread.start();
        while (true);
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("Finalize called");
        while (true);
    }
}
Note that "Finalize called" is printed only once on JDK6 and JDK7.

The objects will not be "freed"; that is, their memory will not be reclaimed, and resources that would be released in the finalize method will remain reserved throughout.
Basically there is a queue holding all the objects waiting for their finalize() method to be executed. The Finalizer thread picks up objects from this queue, runs finalize, and releases the object.
If this thread is deadlocked, the ReferenceQueue will keep growing and at some point an OOM error becomes inevitable. The resources held by the objects in this queue will also stay hogged. Hope this helps!!
for(;;)
{
Finalizer f = java.lang.ref.Finalizer.ReferenceQueue.remove();
f.get().finalize();
}

How to deal with multiple worker threads that may create new work items

I have a queue that contains work items and I want to have multiple threads work in parallel on those items. When a work item is processed it may result in new work items. The problem I have is that I can't find a solution on how to determine if I'm done. The worker looks like that:
public class Worker implements Runnable {
public void run() {
while (true) {
WorkItem item = queue.nextItem();
if (item != null) {
processItem(item);
}
else {
// the queue is empty, but there may still be other workers
// processing items which may result in new work items
// how to determine if the work is completely done?
}
}
}
}
This seems like a pretty simple problem actually but I'm at a loss. What would be the best way to implement that?
thanks
clarification:
The worker threads have to terminate once none of them is processing an item, but as long as at least one of them is still working they have to wait because it may result in new work items.

What about using an ExecutorService which will allow you to wait for all tasks to finish: ExecutorService, how to wait for all tasks to finish

I'd suggest wait/notify calls. In the else case, your worker threads would wait on an object until notified by the queue that there is more work to do. When a worker creates a new item, it adds it to the queue, and the queue calls notify on the object the workers are waiting on. One of them will wake up to consume the new item.
The methods wait, notify, and notifyAll of class Object support an efficient transfer of control from one thread to another. Rather than simply "spinning" (repeatedly locking and unlocking an object to see whether some internal state has changed), which consumes computational effort, a thread can suspend itself using wait until such time as another thread awakens it using notify. This is especially appropriate in situations where threads have a producer-consumer relationship (actively cooperating on a common goal) rather than a mutual exclusion relationship (trying to avoid conflicts while sharing a common resource).
Source: Threads and Locks

I'd look at something higher level than wait/notify. It's very difficult to get right and avoid deadlocks. Have you looked at java.util.concurrent.CompletionService<V>? You could have a simpler manager thread that polls the service and take()s the results, which may or may not contain a new work item.

Using a BlockingQueue containing items to process along with a synchronized set that keeps track of all elements being processed currently:
BlockingQueue<WorkItem> bQueue;
Set<WorkItem> beingProcessed = Collections.synchronizedSet(new HashSet<WorkItem>());
bQueue.put(workItem);
...
// the following runs over many threads in parallel
while (!(bQueue.isEmpty() && beingProcessed.isEmpty())) {
WorkItem currentItem = bQueue.poll(50L, TimeUnit.MILLISECONDS); // null for empty queue
if (currentItem != null) {
beingProcessed.add(currentItem);
processItem(currentItem); // possibly bQueue.add(newItem) is called from processItem
beingProcessed.remove(currentItem);
}
}
EDIT: as @Hovercraft Full Of Eels suggested, an ExecutorService is probably what you should really use. You can add new tasks as you go along. You can semi-busy wait for termination of all tasks at regular intervals with executorService.awaitTermination(time, timeUnits) and kill all your threads after that.
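With an ExecutorService, one way to know when tasks that spawn further tasks are all finished is to count them. The sketch below is an assumption-laden illustration rather than part of the answer above: SpawningTasks is a hypothetical class, the "work" is a toy in which item n > 0 creates two items n - 1, and a counter plus a CountDownLatch replace the polling loop.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class SpawningTasks {
    private final ExecutorService pool;
    private final AtomicInteger pending = new AtomicInteger();   // submitted but not yet finished
    private final AtomicInteger processed = new AtomicInteger(); // finished work items
    private final CountDownLatch done = new CountDownLatch(1);

    private SpawningTasks(int threads) {
        pool = Executors.newFixedThreadPool(threads);
    }

    private void submit(int n) {
        pending.incrementAndGet(); // count the item before handing it to the pool
        pool.execute(() -> {
            try {
                processed.incrementAndGet();
                if (n > 0) {       // toy rule: each item n > 0 creates two items n - 1
                    submit(n - 1);
                    submit(n - 1);
                }
            } finally {
                // Children were counted above, so zero here means nothing is left anywhere.
                if (pending.decrementAndGet() == 0) {
                    done.countDown();
                }
            }
        });
    }

    /** Processes the whole tree of work items and returns how many were handled. */
    public static int run(int depth, int threads) {
        SpawningTasks job = new SpawningTasks(threads);
        job.submit(depth);
        try {
            job.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            job.pool.shutdown();
        }
        return job.processed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 3)); // prints "31"
    }
}
```

The key ordering is that a task registers its children before it decrements the counter for itself, so the counter cannot reach zero while work is still outstanding.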

Here's the beginnings of a queue to solve your problem. Basically, you need to track new work and in-process work.
public class WorkQueue<T> {
    private final List<T> _newWork = new LinkedList<T>();
    private int _inProcessWork;

    public synchronized void addWork(T work) {
        _newWork.add(work);
        notifyAll();
    }

    public synchronized T startWork() throws InterruptedException {
        // wait only while there is nothing queued but someone may still produce work
        while (_newWork.isEmpty() && (_inProcessWork > 0)) {
            wait();
        }
        if (!_newWork.isEmpty()) {
            _inProcessWork++;
            return _newWork.remove(0);
        }
        // everything is done
        return null;
    }

    public synchronized void finishWork() {
        _inProcessWork--;
        if ((_inProcessWork == 0) && _newWork.isEmpty()) {
            notifyAll();
        }
    }
}
Your workers will look roughly like:
public class Worker<T> implements Runnable {
    private final WorkQueue<T> _queue;

    public Worker(WorkQueue<T> queue) {
        _queue = queue;
    }

    @Override
    public void run() {
        try {
            T work = null;
            while ((work = _queue.startWork()) != null) {
                try {
                    // do work here...
                } finally {
                    _queue.finishWork();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
The one trick is that you need to add the first work item before you start any workers (otherwise they will all immediately exit).
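To see the termination condition at work, here is a compact, self-contained variant of the same counting idea with a toy workload. Everything in it is a hypothetical illustration: WorkQueueDemo and processTree are invented names, and each work item n > 0 simply produces two items n - 1.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

public class WorkQueueDemo {
    /** Reports "done" (null) only when the queue is empty and no worker is mid-item. */
    static final class WorkQueue<T> {
        private final Queue<T> newWork = new ArrayDeque<>();
        private int inProcess;

        synchronized void addWork(T work) {
            newWork.add(work);
            notifyAll();
        }

        synchronized T startWork() throws InterruptedException {
            while (newWork.isEmpty() && inProcess > 0) {
                wait(); // a busy worker may still add items
            }
            if (newWork.isEmpty()) {
                return null; // nothing queued and nobody working: finished
            }
            inProcess++;
            return newWork.remove();
        }

        synchronized void finishWork() {
            inProcess--;
            if (inProcess == 0 && newWork.isEmpty()) {
                notifyAll(); // release the waiting workers so they can exit
            }
        }
    }

    /** Runs the toy workload and returns the number of items processed. */
    public static int processTree(int depth, int workers) {
        WorkQueue<Integer> queue = new WorkQueue<>();
        AtomicInteger processed = new AtomicInteger();
        queue.addWork(depth); // seed before starting any worker
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    Integer n;
                    while ((n = queue.startWork()) != null) {
                        try {
                            processed.incrementAndGet();
                            if (n > 0) {
                                queue.addWork(n - 1);
                                queue.addWork(n - 1);
                            }
                        } finally {
                            queue.finishWork();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(processTree(4, 3)); // prints "31"
    }
}
```

All worker threads exit on their own once the last in-process item finishes with an empty queue, which is the behaviour the question asks for.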
