TL;DR:
Is there a way to detect that JVM shutdown is being prevented only by threads started by my own code? For example, is it possible to trigger AutoCloseable.close() automatically on shutdown?
Context
I am building a library, that should be used by several customers. This means, besides providing a documentation, I can't enforce certain things.
Architecture
(I try to describe it as abstract as possible and avoid unnecessary details)
I have a "Manager" object (a kind of factory) that is used to create a "Service" object, which in turn needs some data to work properly. Since that data is loaded from a "slow" backend service (and may change from time to time), I use a separate daemon thread that checks for updates and injects new data into the service as soon as it is available. (This also means that until the first update arrives, the service is simply in "noop mode". But that's OK.)
Now the "Updater" (which runs in my daemon thread) uses a library that in turn starts a thread when opening a connection, and "close" must be called to ensure that this secondary thread is stopped. Otherwise the JVM cannot shut down properly.
As a safety-net I call the close() method inside the finalize() method of my "Manager" (which keeps a reference to all Updater instances). This is not 100% safe, since it's not predictable when GC runs (even more during shutdown!), but it's my only option.
Update: Here is some abstract example code that illustrates the architecture and the according problem
Problem
This architecture causes two possible pitfalls:
If the implementation does not keep a reference to the manager instance, the manager will be garbage collected at some point, and the finalize method will then stop the necessary background updates.
If the implementation keeps an instance of the manager, it must call the close method when the surrounding system shuts down, otherwise the JVM can't terminate properly.
So my actual problem is the "potential unreliability" of the developers who are using the library.
Does anyone have an idea how to build a solution that could handle both pitfalls?
It would be nice to have some Auto-AutoCloseable ;) that is called during Shutdown (e.g by the DestroyJavaVM Thread or similar).
Solutions I tried unsuccessfully
Inside the Updater I close the "problematic" connection in a "try-finally" block, but the daemon thread is not interrupted or stopped automatically, so the finally block never runs.
I registered a shutdown hook via Runtime.getRuntime().addShutdownHook(...) to close all connections, but the hook never runs: unless something calls System.exit() or the process receives a termination signal, shutdown only begins once all user (non-daemon) threads have stopped.
Update: Solved for my implementation, but not the general problem
I solved my own case when I found that the third-party library (RabbitMQ Client) offers a setThreadFactory method that I can use to ensure the spawned threads are daemon threads.
I was lucky with my third-party library, but the problem described above can still occur.
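The daemon-thread fix can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: DaemonThreadFactory and its name prefix are hypothetical, and the factory would be handed to the client library through the setThreadFactory method mentioned above.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: a ThreadFactory whose threads never keep the JVM alive. */
final class DaemonThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger();

    DaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread t = new Thread(task, prefix + "-" + counter.incrementAndGet());
        // Daemon threads do not prevent the JVM from exiting once all
        // user (non-daemon) threads have finished.
        t.setDaemon(true);
        return t;
    }
}
```

Keep in mind that daemon threads are simply abandoned when the JVM exits, so this only suits work that may be cut off at any point.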
You want the AutoCloseable resources to be closed so the shutdown is orderly, I guess.
AutoCloseable objects should be used (by your library clients) in a manner that ensures they are closed when they are no longer needed. In almost all cases, they should be using a try-with-resources block, so they are closed even if an exception is thrown.
You should take advantage of that by requiring your library clients to perform a controlled shutdown of each thread when they receive a request to shut down the program. A thread performs a controlled shutdown by returning from its Runnable.run method, or by letting an exception propagate out of it. I believe this is the only reliable means of closing resources, because it ensures nested resource allocations are released in the correct order. More generally, as a library writer you cannot know what other operations your library clients might want to perform on shutdown, so you should give them complete control over the shutdown.
You can help them do that by having your library code properly handle InterruptedException and the thread's interrupt status (Thread.interrupted()).
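A minimal sketch of such a cooperative worker is shown below. DemoConnection and InterruptibleUpdater are hypothetical stand-ins for the real connection and updater; the point is only that an interrupt ends the loop and try-with-resources closes the resource on the way out.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a connection that must be closed. */
final class DemoConnection implements AutoCloseable {
    final AtomicBoolean closed = new AtomicBoolean();

    void poll() throws InterruptedException {
        Thread.sleep(50); // simulates a blocking call to the backend
    }

    @Override
    public void close() {
        closed.set(true);
    }
}

/** Hypothetical updater loop that ends cleanly when its thread is interrupted. */
final class InterruptibleUpdater implements Runnable {
    private final DemoConnection connection;

    InterruptibleUpdater(DemoConnection connection) {
        this.connection = connection;
    }

    @Override
    public void run() {
        // try-with-resources closes the connection however the loop ends.
        try (DemoConnection c = connection) {
            while (!Thread.currentThread().isInterrupted()) {
                c.poll();
            }
        } catch (InterruptedException e) {
            // Restore the flag so code further up the stack can see the request.
            Thread.currentThread().interrupt();
        }
    }

    /** Starts the updater, requests shutdown, and reports whether cleanup happened. */
    static boolean demo() {
        DemoConnection connection = new DemoConnection();
        Thread worker = new Thread(new InterruptibleUpdater(connection), "updater");
        worker.start();
        worker.interrupt(); // the client's shutdown request
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive() && connection.closed.get();
    }
}
```

With this shape, the library client decides when to interrupt, and the library only has to guarantee that an interrupt leads to a prompt, clean return from run().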
Ok, I guess there is no good solution for that problem, but since I found that my 3rd party library gives the possibility to create Daemon-Threads, this is exactly what was necessary to fix it.
Maybe this is also a problem that should be solved on the "human level" by providing a good documentation and ensure a proper usage of AutoCloseable. A good developer should know how to deal with that.
From a technical point of view I found these possible solutions, which are just "safety nets" no one should rely on.
call "close()" inside the finalize method
I implemented a service that can be used to register AutoCloseable resources. It runs in a daemon thread and checks every few seconds whether the DestroyJavaVM thread exists; if so, it closes all registered AutoCloseables and stops itself.
Disclosure: If a system runs "outside of the main method" (the DestroyJavaVM thread will run all the time), this solution won't work.
Update: The AutoCloseable registration service was a very ugly, hacky solution. I deleted it and plan to change the design to avoid such situations, but ultimately it is the responsibility of the implementing developer to close opened resources properly.
Related
Assume that I have some code that because of a bug sometimes does
while(true){/*...*/}
It is in a third-party library that I have no source code for.
In Java I can isolate it in a CompletableFuture. But since the third-party code does not cooperate, a timeout on the future will not work. Stopping a custom thread pool does not work either. I tested that with a literal 'while true...'
Java's Thread.stop() is deprecated. It works more or less on older releases, but since Java 20 it throws UnsupportedOperationException.
How should I stop such never returning non cooperating code after a certain time in Java? Should I use a process?
Same question for C# and NodeJS.
In a C# context you have Thread.Abort(), which tries to interrupt the thread between managed code execution steps and stop its execution. It is only available on .NET Framework (.NET Core and .NET 5+ do not support it), and it is not recommended in any case: the thread can be killed in a corrupted state, unmanaged resources can be left hanging, and other unwanted behavior can follow.
A better idea is to run this third-party code in a separate process and kill that process when your timeout expires. That incurs a greater performance penalty, but you do not risk leaving resources or locks hanging.
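For the Java side of the question, the separate-process approach might look like the sketch below. BoundedProcess is a hypothetical helper, and how the third-party code is packaged into a launchable command is left open.

```java
import java.io.IOException;
import java.util.List;
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that runs a command in a child process with a deadline. */
final class BoundedProcess {

    /**
     * Returns the exit code if the process finished in time,
     * or an empty OptionalInt if it had to be killed.
     */
    static OptionalInt run(List<String> command, long timeout, TimeUnit unit)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                // Discard output so a chatty child cannot block on a full pipe (Java 9+).
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        if (process.waitFor(timeout, unit)) {
            return OptionalInt.of(process.exitValue());
        }
        // Deadline passed: kill the child. Unlike stopping a thread, this cannot
        // leave locks or half-updated objects behind in the parent JVM.
        process.destroyForcibly();
        process.waitFor();
        return OptionalInt.empty();
    }
}
```

Results then have to come back through the process boundary (exit code, files, or a pipe), which is part of the performance cost mentioned above.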
We are running a Java server app that is using ScheduledThreadPoolExecutor to manage some work. There are multiple instances running, for different types of work, but each instance only uses one thread. It's not important why this is as there's really no way around it. What we noticed on the production server is that one of these instances stopped working at some point, completely and silently. Restarting the server brought it back again, but the problem isn't solved.
I know that a task scheduled with scheduleAtFixedRate stops running if it throws an exception at some point, but this isn't the case here. We had a recurring task that simply stopped executing, and new tasks submitted with the schedule() method didn't execute either. I presume that the thread it was using died and wasn't restarted.
My question is, are there any circumstances under which this could happen? Is there anything I should look out for?
It looks like the simplest explanation is the answer: all threads hang.
In my case the cause seems to be HTTP requests that never time out. This can happen in certain situations and I have yet to find a good solution for the problem. I think the best option is to implement a timeout on the scheduled task itself to make sure we avoid any issues.
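One way to put a timeout on the scheduled task itself is sketched below. TimeLimitedTask is a hypothetical wrapper: the scheduler thread hands the real work to a separate worker pool and waits for it only for a bounded time, so a hung request cannot occupy the single scheduler thread forever.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper that bounds how long one run of a scheduled task may take. */
final class TimeLimitedTask implements Runnable {
    private final ExecutorService worker;
    private final Runnable work;
    private final long timeout;
    private final TimeUnit unit;
    final AtomicInteger timeouts = new AtomicInteger(); // number of runs that timed out

    TimeLimitedTask(ExecutorService worker, Runnable work, long timeout, TimeUnit unit) {
        this.worker = worker;
        this.work = work;
        this.timeout = timeout;
        this.unit = unit;
    }

    @Override
    public void run() {
        Future<?> future = worker.submit(work);
        try {
            future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            timeouts.incrementAndGet();
        } catch (ExecutionException e) {
            // Report and swallow: an exception escaping run() would cancel
            // all further executions of a periodic task.
            System.err.println("Scheduled work failed: " + e.getCause());
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
        }
    }
}
```

The wrapper would be scheduled in place of the raw task, for example scheduler.scheduleAtFixedRate(new TimeLimitedTask(pool, work, 30, TimeUnit.SECONDS), 0, 1, TimeUnit.MINUTES). Note that cancel(true) only interrupts the worker thread, and blocking socket reads generally do not react to interrupts, so connect and read timeouts on the HTTP client are still needed.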
Does anybody know a mechanism that can capture the state of a running thread and serialize that for further resume?
Is there anything available for the JVM?
How about pthreads?
My main goal is to be able to migrate a running thread to a remote machine.
With the cooperation of that thread, you can do it by any mechanism that thread supports. Without the cooperation of that thread, it is impossible. What happens if that thread holds a lock that your serialize code needs?
What happens if you migrate a running thread that is currently using some kernel resource such as a pipe. Will you migrate that resource?
The right solution to your problem may be to have the thread support a migration mechanism. How you do that depends on precisely what that thread is doing. You'll get answers that are more likely to help you solve your actual problem if you explain precisely what it is.
The answer to this is really going to depend on what constitutes the state of the running thread.
If the state is local thread data which allows for the thread state to be copied and saved and then inserted back into a new thread, then the mechanism is basically to just save the state with some kind of a serializable object which is then used to create a new thread with the saved state and to then begin it running.
However if the thread state depends on external objects or entities, the problem is much tougher. For instance if you have a thread which is acting as a server using TCP and you want to save its state then restart it later, the socket is going to change and the client which was accessing the server thread will know that the server thread stopped communicating for a while.
This means that any external entities that depend on the thread will need to know that the thread is being saved and frozen. They will need something that allows them either to fail over to an alternative or to save and freeze themselves, and there will need to be some kind of protocol so that the restarted thread can let the other entities know that it is back in business and what its current state is.
Also if the thread is depending on some external entities then those entities must be able to deal with the thread being frozen. There may need to be some kind of a mechanism in place so that the thread can release various resources, whose states are saved, and then when restarted, be able to reclaim those resources or comparable resources and then reset those resources to the saved state.
If you want to move a running JVM from one machine to another, you will most likely not do it by yourself but instead use the live migration functionality of a VM manager.
The VM managers will move entire virtual machines from one physical machine to another without stopping the virtual machine or processes, but it's quite a bit higher level than serializing/deserializing a thread. Since a thread may use resources that are local to the operating system such as file systems or sockets, the whole operating system needs to follow the thread to the other physical machine.
I'm not aware of any way that you can send a thread, per se. However, you could use a pattern such as the memento pattern to save the state of your thread.
See these references before continuing so you know the terminology:
Memento pattern, oodesign.com
Memento pattern, Wikipedia
Basically, you'll have this:
Design a job (thread) that can run with any starting state, including a state from mid-execution.
When it needs to be migrated, get the state of that thread.
In Java, you could use ThreadLocal variables to store the thread state.
Serialize that state to the other machine.
Use the state to start a new thread with the state you deserialized.
This is a better approach than actually migrating a thread with its state, stack, etc., since you can pick and choose what absolutely needs to be moved instead of moving everything no matter what.
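The steps above can be sketched as follows. SumState and ResumableSum are hypothetical names for a toy job (summing 1..limit); the state is held in an explicit serializable object rather than in ThreadLocal variables, which is one possible choice and not the only one. Sending the bytes to the other machine is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical memento: everything the job needs in order to resume. */
final class SumState implements Serializable {
    private static final long serialVersionUID = 1L;
    final int next;   // next number to add
    final long total; // running sum so far

    SumState(int next, long total) {
        this.next = next;
        this.total = total;
    }
}

/** Hypothetical job: sums 1..limit and can start from any saved state. */
final class ResumableSum {

    /** Runs at most maxSteps additions from the given state and returns the new state. */
    static SumState run(SumState start, int limit, int maxSteps) {
        int next = start.next;
        long total = start.total;
        for (int steps = 0; next <= limit && steps < maxSteps; steps++) {
            total += next++;
        }
        return new SumState(next, total);
    }

    /** Turns the memento into bytes that could be sent to another machine. */
    static byte[] serialize(SumState state) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds the memento on the receiving side. */
    static SumState deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (SumState) in.readObject();
        }
    }
}
```

The receiving machine starts a fresh thread that calls run() with the deserialized state, so only the two fields in SumState ever cross the wire.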
Problem scenario: The problem is noticed in a Sonic MF container (the JVM). The container hosts some Java services responsible for DB operations and message transformations. Once started, the container runs fine for 2-3 weeks and then terminates on its own without throwing any exceptions.
After much research, we are unable to find out why or what has triggered the jvm (MF Container) to shutdown.
Is there a way to get thread dumps when the JVM goes down automatically? I'm using Java 1.6. Is there any other approach to this problem I should follow?
Thanks in advance.
You could try java.lang.Runtime.addShutdownHook(), and have the hook iterate over all threads and dump their stack traces with Thread.getAllStackTraces(). However, if the JVM was shutdown by Runtime.halt() then the hooks won't be called. More complicated would be to use instrumentation to hook into the calls to Runtime.exit() and Runtime.halt() (or Shutdown.sequence(), see edit #2), so you can see exactly what's happening at the time that either is called.
EDIT: Another way of doing it would be to install a SecurityManager which doesn't enforce any security, but which dumps the list of threads whenever SecurityManager.checkExit() is invoked, since both halt() and exit() call that security manager method. This would be a lot easier than using instrumentation, and you could even decide to throw an exception in addition to logging what the threads are doing.
EDIT 2: The system the JVM is running on can tell the JVM to terminate, in which case using a security manager won't work. Nor will using instrumentation on Runtime.exit() or Runtime.halt(), since the method that gets invoked is java.lang.Shutdown.exit(). And if the JVM is shutting down because the last non-daemon thread finished, then Shutdown.shutdown() is invoked. But shutdown hooks will work in either of those situations. So you should always use the shutdown hooks, even if you're also going to use the security manager or instrumentation.
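The shutdown-hook approach might look like the sketch below. ThreadDumpOnShutdown is a hypothetical class name, and the code avoids lambdas so that it also compiles on Java 1.6; where the dump is written (here System.err) is an assumption.

```java
import java.util.Map;

/** Hypothetical helper that dumps all thread stacks when the JVM shuts down. */
final class ThreadDumpOnShutdown {

    /** Formats the stack traces of all live threads as text. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" daemon=").append(t.isDaemon())
              .append(" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Registers the hook; call once early in startup. */
    static void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                System.err.println("JVM shutting down; live threads:");
                System.err.print(dumpAllThreads());
            }
        }, "thread-dump-hook"));
    }
}
```

If some application thread called System.exit(), its stack trace in this dump should show the exit call, which helps identify the code that triggered the shutdown.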
See also https://docs.oracle.com/javase/7/docs/webnotes/tsg/TSG-VM/html/hangloop.html "Troubleshooting Hanging or Looping Processes"
However, at least in my case, Eclipse is hung, and does not respond to any of these.
For various reasons calling System.exit is frowned upon when writing Java Applications, so how can I notify the calling process that not everything is going according to plan?
Edit: The 1 is a standin for any non-zero exit code.
The use of System.exit is frowned upon when the 'application' is really a sub-application (e.g. servlet, applet) of a larger Java application (server): in this case the System.exit could stop the JVM and hence also all other sub-applications. In this situation, throwing an appropriate exception, which could be caught and handled by the application framework/server is the best option.
If the Java application is really meant to be run as a standalone application, there is nothing wrong with using System.exit. In this case, setting an exit value is probably the easiest (and also most used) way of communicating failure or success to the parent process.
I agree with the "throw an Exception" crowd. One reason is that calling System.exit makes your code difficult to use if you want other code to be able to use it. For example, if you find out that your class would be useful from a web app, or some kind of message consuming app, it would be nice to allow those containers the opportunity to deal with the failure somehow. A container may want to retry the operation, decide to log and ignore the problem, send an email to an administrator, etc.
An exception to this would be your main() method; this could trap the Exception, and call System.exit() with some value that can be recognized by the calling process or shell script.
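A minimal sketch of that arrangement follows. ExitCodeMain, its exit-code constants, and the file check are all hypothetical; the point is that only main() calls System.exit(), while the rest of the code reports failure by throwing.

```java
import java.io.File;
import java.io.FileNotFoundException;

/** Hypothetical entry point: exceptions inside, exit codes only at the edge. */
final class ExitCodeMain {
    static final int EXIT_OK = 0;
    static final int EXIT_FAILURE = 1;
    static final int EXIT_USAGE = 2;

    /** Application logic: reports failure by throwing, never by exiting. */
    static void process(String[] args) throws Exception {
        if (args.length != 1) {
            throw new IllegalArgumentException("usage: ExitCodeMain <file>");
        }
        if (!new File(args[0]).exists()) {
            throw new FileNotFoundException("The file '" + args[0] + "' doesn't exist");
        }
    }

    /** Maps outcomes to exit codes; reusable from tests or a container. */
    static int run(String[] args) {
        try {
            process(args);
            return EXIT_OK;
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            return EXIT_USAGE;
        } catch (Exception e) {
            e.printStackTrace();
            return EXIT_FAILURE;
        }
    }

    public static void main(String[] args) {
        // The only place in the program that terminates the JVM.
        System.exit(run(args));
    }
}
```

A web app or message consumer can call process() directly and handle the exception itself, while a shell script sees the documented exit codes.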
System.exit() blocks until all shutdown hooks have finished, so it deadlocks if a shutdown hook waits for (for example, joins) the thread that called it.
Our company's policy is that it's OK (even preferred) to call System.exit(-1), but only in init() methods. I would definitely think twice before calling it during a program's normal flow.
I think throwing an exception is what you should do when something goes wrong. This way, if your application is not running as a stand-alone app the caller can react to it and has some information about what went wrong. It is also easier for debugging purposes because you as well get a better idea about what went wrong when you see a stack trace.
One important thing to note is that when an uncaught exception terminates the main thread and no other non-daemon threads are running, the VM exits with a return code of 1, so outside applications that use the return code can see that something went wrong.
The only case where I think System.exit() makes sense is when your app is meant to be called by applications which are not Java and therefore have to use return codes to see if your app worked or not and you want those applications to have a chance to react differently on different things going wrong, i.e. you need different return codes.
It can be dangerous / problematic in web servlet environments also.
Throwing an Exception is generally considered the other alternative.
Throwing exceptions is the best way to send information about a certain error up and out of the app.
A number doesn't tell you as much as:
Exception at thread 'main': FileNotFoundException "The file 'foo' doesn't exist"
(or something close to that)
It's frowned upon for normal exits. If "not everything is going according to plan", then System.exit is fine.
Update: I should add that I assume your '1' has meaning that is documented somewhere.
I feel impelled to add some salt of mine too. This is a great question that always pops up when I write an application.
As everyone here seems to agree, you should be careful using System.exit(), and if possible, use exceptions. However, System.exit() still seems to be the only way to return basic information to the system, and is thus required if you want to make your application scriptable. If you don't need that, just throw an exception and be done with it.
But if (and only if) your application is single-threaded, then it's fine to use it: nothing else is going on, and no resources are open (at least if you consistently use the try-with-resources idiom, which I'd highly recommend as it also makes the code cleaner and more compact).
On the other hand, as soon as your application creates any kind of thread that may write resources, System.exit() is a total "no, no", because it can (and, with time, will) corrupt data.
To be able to use a multi-threaded and scripted application and still guarantee the data integrity, my best solution so far is to save any resource-modifying thread you create (for example by consistently using a factory method which adds the thread to the list), and also installing a shutdown hook that cleanly ends each thread by interrupting and joining it. As the shutdown hook is also called by System.exit(), this will guarantee (minus programming errors) that no thread is killed in mid-resource writing.
Oh yes, maybe I shouldn't even mention it, but: never, EVER, use that horrible Runtime.halt() method. It just shoots the VM in the head and doesn't call any shutdown hook.
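The thread-tracking idea described above might be sketched like this. TrackedThreads is a hypothetical name, and the sketch assumes every tracked task reacts to interruption by finishing its current write and returning.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical registry of resource-modifying threads that are stopped on shutdown. */
final class TrackedThreads {
    private final List<Thread> threads = new CopyOnWriteArrayList<>();

    /** Factory method: every resource-modifying thread is created and started here. */
    Thread start(Runnable task, String name) {
        Thread t = new Thread(task, name);
        threads.add(t);
        t.start();
        return t;
    }

    /**
     * Interrupts every tracked thread, then waits up to joinMillis (must be > 0)
     * for each one. Returns true if all of them have finished.
     */
    boolean stopAll(long joinMillis) {
        for (Thread t : threads) {
            t.interrupt();
        }
        boolean allStopped = true;
        for (Thread t : threads) {
            try {
                t.join(joinMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            allStopped &= !t.isAlive();
        }
        return allStopped;
    }

    /** Runs stopAll() when the JVM shuts down, including via System.exit(). */
    void installShutdownHook(long joinMillis) {
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> stopAll(joinMillis), "stop-tracked-threads"));
    }
}
```

The join timeout is a design choice: without one, a thread that ignores interrupts would block the shutdown hook, and with it System.exit(), indefinitely.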