File cannot be deleted because the JVM holds it - a tricky one

My post got a little too long, sorry. Here is a summary:
File on disk cannot be deleted ("the JVM holds the file" error), both when deleting from the Java code and when trying to manually delete the file from Windows.
All streams to that file are closed and set to null. All file objects set to null.
The program does nothing at that point, but waiting 30 minutes allows me to delete the file from Windows. Weird. Is the file not used by Java anymore? Plus, since nothing happens in the program, it cannot be some stream I forgot (plus, I triple checked nothing is open).
Invoking System.gc() seemed to work when files were small. Did not help when they got to about 20MB.
[EDIT2] - I tried writing some basic code to explain, but it's tricky. I am sorry, I know it's difficult to answer like that. I can however write how I open and close streams, of course:
BufferedWriter bw = new BufferedWriter(new FileWriter(new File("C:\\folder\\myFile.txt")));
for(int i = 0; i < 10; i++)
{
bw.write("line " + i);
bw.newLine();
}
bw.close();
bw = null;
If I've used a file object:
File f = new File("C:\\folder\\myFile.txt");
// use it...
f = null;
Basic code, I know. But this is essentially what I do.
I know for a fact I've closed all streams in this exact way.
I know for a fact that nothing happens in the program in that 30-minutes interval in which I cannot delete the file, until I somehow magically can.
thank you for your input even without the coherent code.
I appreciate that.
Sorry for not providing any specific code here, since I can't pinpoint the problem (not exactly specific-code related). In any case, here is the thing:
I have written a program which reads, writes and modifies files on disk. For several reasons, the handling of the read/write is done in a different thread, which is constantly operating.
At some point, I am terminating the "read/write" thread, keeping only the main thread - it waits for input from a socket, totally unrelated to the file, and does nothing. Then, I try to delete the file (using either File.delete(), even tried nio.Files delete option).
The thing is - and it's very weird - sometimes it works, sometimes it doesn't. Even manually, going to the folder and trying to delete the file via windows, gives me the "The file is open by the JVM" message.
Now, I am well aware that keeping references from all kinds of streams to the file prevents me from deleting it. Well past that by now :)
I have made sure that all streams are closed. I even set their values to null, including any "File" objects I have used (even though it shouldn't make any difference). All set to null, all closed. And the thread which generates all of them - the "read/write" thread - well, it's terminated since it reached the end of its run() method.
Usually, if I wait about 30 minutes, while the JVM still operates, I can delete the file manually from windows. The error magically disappears. When the JVM is closed, I can always delete the file right away.
I am lost here. Tried specifically invoking System.gc() before trying to delete the file, even called it like 10 times (not that it should matter). Sometimes it helped, but on other occasions, for example, when the file got larger (say 20MB), that didn't help.
What am I missing here?
Obviously, this couldn't be my implicit fault (not closing some stream), since the read/write thread is dead, the main thread awaits something unrelated (so the program is at a "standstill"), I have explicitly closed all streams, even nullified the references (inStream = null), invoked the garbage collector.
What am I missing? Why is the file "deletable" after 30 minutes (nothing happens at that time - not something in my code). Am I missing some gentle reference/garbage collection thingy?

What you're doing just calls for problems. You say that "if an IOexception occurred, it is printed immediately" and it may be true, but given that something inexplicable happens, let's better doubt it.
I'd first ensure that everything gets always closed, and then I'd care about related logic (logging, exiting, ...).
Anyway, what you did is not how resources should be managed. The answer above is not exactly correct either. Try-with-resources is (besides @lombok.Cleanup) about the only way that clearly shows nothing ever gets left open. Anything else is more complicated and more error-prone. I'd strongly recommend using it everywhere. This may be quite some work, but it also forces you to re-inspect all the critical code pieces.
Things like nullifying references and calling the GC should not help... and if they seem to, it may be by chance.
Some ideas:
Are you using memory mapped files?
Are you sure System.exit is not disabled by a security manager?
Are you running an antivirus? They love to scan files just after they get written.
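On the memory-mapped question: the Javadoc for FileChannel.map says a mapping stays valid until the buffer itself is garbage-collected, and that closing the channel has no effect on it. That would at least be consistent with System.gc() helping only some of the time. A minimal sketch of the situation, assuming a made-up class name (MappedFileDemo) and a temp file, not the asker's actual code:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; only the JDK calls are real.
final class MappedFileDemo {
    /** Maps the whole file read-only and returns the buffer after the channel is closed. */
    static MappedByteBuffer mapThenCloseChannel(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        } // the channel is closed here, but the mapping is not released
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped", ".txt");
        Files.write(file, "hello".getBytes(StandardCharsets.US_ASCII));
        MappedByteBuffer buf = mapThenCloseChannel(file);
        // Still readable: the mapping lives until 'buf' is garbage-collected.
        System.out.println((char) buf.get(0));
        // On Windows, deleting 'file' now may fail while the mapping exists.
        file.toFile().deleteOnExit();
    }
}
```

If something like this is in play, no amount of close() calls on the channel or stream will release the file; only the buffer becoming unreachable and collected does.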
Btw., locking files is one reason why the WOW never started for me. Sometimes the locks persisted long after the culprit was gone, at least according to tools I could use.

Are you closing your streams in a try...finally or try(A a = new A()) block? If not the streams may not be closed.
I would strongly recommend using either Automatic Resource Block Management ( try(A a = new A()) ) or a try...finally block for all external resources.
try (BufferedWriter br = new BufferedWriter(new FileWriter(new File("C:\\folder\\myFile.txt")))) {
    for (int i = 0; i < 10; i++) {
        br.write("line " + i);
        br.newLine();
    }
} // br is closed here, even if write() threw
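To find out why a delete fails rather than just that it failed, java.nio.file.Files.delete is handy: it throws an IOException describing the problem, whereas File.delete() only returns false. A minimal sketch combining it with try-with-resources (the class and method names are made up for illustration):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class for illustration.
final class WriteThenDelete {
    static void writeLines(Path file, int count) throws IOException {
        // The writer is closed when the block exits, normally or via an exception.
        try (BufferedWriter bw = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (int i = 0; i < count; i++) {
                bw.write("line " + i);
                bw.newLine();
            }
        }
    }

    static void deleteOrExplain(Path file) {
        try {
            Files.delete(file);
        } catch (IOException e) {
            // Typically a FileSystemException whose message carries the OS-level reason.
            System.err.println("Could not delete " + file + ": " + e);
        }
    }
}
```

Calling writeLines(path, 10) followed by deleteOrExplain(path) either removes the file or prints what the operating system reported.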

Related

what is the proper way to close streams on java's exec?

After
Runtime.getRuntime().exec(command);
i see syscalls happening that show 2~3 file descriptors (FIFO pipes). What is the proper way to close them with try-with-resource pattern?
Most historical tribal knowledge found on java forums suggest:
// out of date!
... } finally {
IOUtils.closeQuietly(p.getOutputStream());
IOUtils.closeQuietly(p.getInputStream());
IOUtils.closeQuietly(p.getErrorStream());
}
but that doesn't sound right because 1) method closeQuietly is deprecated and most libraries suggest using try-with-resource, 2) it is inelegant as I might not necessarily have all streams.
And simply moving the exec() call into try feels wrong as it is not the resource i will call close() on.

Closing them isn't necessary; they close by themselves when the process dies. If the process never dies, it is also not necessary: Either you make a new never-dying process every so often, in which case your system is going to crash and run out of resources whether you close these or not, or you make it only once, in which case these resources aren't going to count for much. For what it is worth, these are quite lightweight resources, and often they simply cannot be 'closed' in the sense that the resources can be 'freed' - closing them either keeps them open but denies further chat (and sends EOFs where needed), or reroutes them to /dev/null; generally processes just have 3 pipes on them and will continue to have them until the process dies.
Yes, closeQuietly is a silly idea for virtually all purposes, and so it is here. If closing these streams somehow fails, you probably don't want to silently ignore that.
If you must close them, the individual streams from these 3 are closable. However, note that you're reading rules of thumb and attempting to apply them as if they are gospel truth. try-with-resources is not always the right answer, and try-with-resources is not a 100% always replacement for close, let alone closeQuietly.
For example, try-with-resources specifically is designed around a period of usage. You declare the span of statements within which the resource should be available (the braces that go with the try block), and the construct will then ensure that the resource is closed only once code flow transitions out of that span of statements, no matter how it exits this. That makes it probably irrelevant here, too!
You are starting a long-lived process and don't care about the in/out. You just want the process to run and to keep running. This means there is no span at all, and you should just call close() on these if somehow you feel it is important to try to save the resources even though most likely this accomplishes nothing at all. No span-of-statements means try-with-resources isn't right.
You are starting a short-lived process that you interact with. The right thing to 'close' is the process itself, except you can't use try-with-resources for that. That can only be used on auto-closables (resources where the class that represents them implements AutoCloseable). Most do, some don't. Lock is a famous one. Process is another: To 'close' it, you invoke destroy() or even destroyForcibly(). You cannot use try-with-resources (not without ugly hacks that defeat the purpose) to do this! Once you close/destroy the process, the streams that went along with it are dead too.
More generally the principle is: If you create it, you close it. If you never call getOutputStream() you never created them. On some OSes, fetching these streams and then closing them wastes more resources than not doing this. Thus, if the argument is based on some sort of purity model, then you shouldn't close them either. If it's based on pragmatics, you'd have to test how heavy these resources really are (most likely, extremely light), whether closing them actually saves you some pipes (most likely, it will not), and whether close()-ing the result of invoking getOutputStream() on the process even helps if the answers to the above questions make that relevant (it probably will, but the spec does not guarantee this).
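For the short-lived case, one possible shape is sketched below. It is only an illustration under a few assumptions: the class and method names are invented, stderr is merged into stdout via redirectErrorStream(true) so that only one stream is ever fetched, and the process "close" is a plain destroy() in a finally block. The one stream that was created is the one that gets closed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration.
final class ShortLivedProcess {
    /** Runs the command, returns its combined output, and makes sure the process is gone. */
    static String runAndCapture(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
        // We created this reader (by calling getInputStream()), so we close it.
        try (BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            // Drain the output before waiting, so the child cannot stall on a full pipe.
            String text = out.lines().collect(Collectors.joining(System.lineSeparator()));
            p.waitFor();
            return text;
        } finally {
            p.destroy(); // has no effect if the process has already exited
        }
    }
}
```

For example, runAndCapture(List.of("java", "-version")) returns the version banner, assuming a java executable is on the PATH.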

They are very light processes that in almost every case don't require closing...

Is there any harm in failing to close a file when a Java program terminates?

When my program starts, it opens a file and writes to it periodically. (It's not a log file; it's one of the outputs of the program.) I need to have the file available for the length of the program, but I don't need to do anything in particular to end the file; just close it.
I gather that for file I/O in Java I'm supposed to implement AutoCloseable and wrap it in a try-with-resources block. However, because this file is long-lived, and it's one of a few outputs of the program, I'm finding it hard to organize things such that all the files I open are wrapped in try-with-resources blocks. Furthermore, the top-level classes (where my main() function lies) don't know about this file.
Here's my code; note the lack of writer.close():
public class WorkRecorder {
public WorkRecorder(String recorderFile) throws FileNotFoundException {
writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(recorderFile)));
}
private Writer writer;
public void record(Data data) throws Exception {
// format Data object to match expected file format
// ...
writer.write(event.toString());
writer.write(System.lineSeparator());
writer.flush();
}
}
tl;dr do I need to implement AutoCloseable and call writer.close() if the resource is an opened output file, and I never need to close it until the program is done? Can I assume the JVM and the OS (Linux) will clean things up for me automatically?
Bonus (?): I struggled with this in C#'s IDisposable too. The using block, like Java's try-with-resources construct, is a nice feature when I have something that I'm going to open, do something with quickly, and close right away. But often that's not the case, particularly with files, when the access to that resource hangs around for a while, or when needing to manage multiple such resources. If the answer to my question is "always use try-with-resources blocks" I'm stuck again.

I have similar code that doesn't lend itself to being wrapped in a try-with-resources statement. I think that is fine, as long as you close it when the program is done.
Just make sure you account for any Exceptions that may happen. For example, in my program, there is a cleanup() method that gets called when the program is shut down. This calls writer.close(). This is also called if there is any abnormal behavior that would cause the program to shut down.
If this is just a simple program, and you're expecting the Writer to be open for its duration, I don't think it's really a big deal for it to not be closed when the program terminates...but it is good practice to make sure your resources are closed, so I would go ahead and add that to wherever your program may shut down.
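A rough sketch of that cleanup idea, in case it helps. The class name LineRecorder and its methods are invented for illustration (it is not the asker's WorkRecorder), and the shutdown hook is just one way to get close() called when the JVM shuts down in an orderly way:

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical long-lived recorder that can still be closed explicitly.
final class LineRecorder implements AutoCloseable {
    private final Writer writer;

    LineRecorder(Path file) throws IOException {
        writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
    }

    void record(String line) throws IOException {
        writer.write(line);
        writer.write(System.lineSeparator());
        writer.flush(); // hand the data to the OS after every record
    }

    @Override
    public void close() throws IOException {
        writer.close();
    }

    /** Registers a JVM shutdown hook that closes this recorder. */
    void closeOnShutdown() {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }));
    }
}
```

Because it implements AutoCloseable, the same class also works in a try-with-resources block for the cases where the lifetime does fit a single scope.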

You should always close resources or set them to null so it can be picked up by the garbage collector in Java. Using try-with-resource blocks is a great way to have Java automatically close resources when you're done with them. Even if you use it for the duration of the program, it is good programming practice to close it even at the end. Some might say you don't need to, I personally would say just go ahead and do it and here's why:
"When a stream is no longer needed, always close it using the close() method or automatically close it using a try-with-resource statement. Not closing streams may cause data corruption in the output file, or other programming errors."
-Introduction to Java Programming 10th Edition, Y. Daniel Liang
If possible, just run the .close() method on the resource at the very end of the program.

I (now) think a better answer is "It depends" :-). A detailed treatment is provided by Lukas Eder here. Also check out the Lambda EG group post.
But in general, it's a good idea to return the resource back to the operating system when you are done with it and use try-with-resources all the time (except when you know what you are doing).

Setting up a blocking file read in Java

I'd like to set up a blocking file read in Java. That is, have a file such that when it is wrapped by FileInputStream and any read() method is called, the call blocks.
I can't think of an easy OS-independent way - on Unix-like OSes I could try to create a FIFO using mkfifo and read from that file. A possible work around would be to just create a very large file and read from that - the read is unlikely to complete before I capture the stack, but it's ugly and slow (and indeed reads can still be incredibly fast when cached).
The corresponding socket read() case is trivial to set up - create a socket yourself and read from it, and you can have deterministic blocking.
The purpose is to examine stack of the method to determine what the top frames are in such a case. Imagine I have a component which periodically samples the stacks traces of all running threads and then tries to categorize what that thread is doing at the moment. One thing it could be doing is file IO. So I need to know what the "top of stack" looks like during file IO. I have already determined that by experimentation (simply read a file in a variety of ways and sample the stack), but I want to write a test that will fail if this ever changes.
The natural way to write such a test is to kick off a thread which does a file read, then examine the top frame(s). To do this reliably, I want a blocking read (or else the thread may finish its read before the stack trace is taken, etc).

To get guaranteed blocking I/O, read from a console, e.g. /dev/console on Linux or CON on Windows.
To make this platform-independent, you may hack the FileDescriptor of FileInputStream:
// Open a dummy FileInputStream
File f = File.createTempFile("dummy", ".tmp");
f.deleteOnExit();
FileInputStream fis = new FileInputStream(f);
// Replace FileInputStream's descriptor with stdin
Field fd = FileInputStream.class.getDeclaredField("fd");
fd.setAccessible(true);
fd.set(fis, FileDescriptor.in);
System.out.println("Reading...");
fis.read();
System.out.println("Complete");
UPDATE
I've realized you don't even need a method to block. In order just to get a proper stacktrace you may invoke read() on an invalid FileInputStream:
FileInputStream fis = new FileInputStream(new FileDescriptor());
fis.read(); // This will throw IOException exactly with the right stacktrace
If you still need a blocking read(), named pipes is the way to go: run mkfifo using Runtime.exec on POSIX systems or create \\.\PIPE\MyPipeName on Windows.
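For use in a test, the invalid-descriptor trick from the update can be wrapped so that it returns the captured frames. This is only a sketch: the class and method names are invented, and the exact frames you get depend on the JDK version.

```java
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical probe class for illustration.
final class FileReadStackProbe {
    /** Returns the stack captured when read() fails on an invalid descriptor. */
    static StackTraceElement[] captureReadStack() {
        // A freshly constructed FileDescriptor is invalid, so read() is expected to throw.
        // There is nothing to close afterwards because no file was ever opened.
        FileInputStream fis = new FileInputStream(new FileDescriptor());
        try {
            fis.read();
            throw new IllegalStateException("read() unexpectedly succeeded");
        } catch (IOException expected) {
            return expected.getStackTrace();
        }
    }

    public static void main(String[] args) {
        for (StackTraceElement frame : captureReadStack()) {
            System.out.println(frame);
        }
    }
}
```

A test could then assert on the first few elements of the returned array and fail if a future JDK changes them.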

I don't know of any way to make a File in an OS-independent way that will always block when read.
If I were trying to find the stack trace when a specific function were called, I would run the program under a debugger and set a break point on that function. Although, method breakpoints will slow down your program and give you different results than you would normally get if timing is important.
If you have access to the source code of the program, you could make a fake FileInputStream that extends the real one but always blocks on a read. All you need to do is to switch out the import statements throughout the code. However, this won't capture places where you are not able to switch out import statements and it could be a pain if there is a lot of code.
If you want to use your own FileInputStream without changing the program source code or compiling, you can make a custom class loader that loads your custom FileInputStream class instead of the real one. You can specify which class loader to use on the command line by:
java -Djava.system.class.loader=com.test.MyClassLoader xxx
Now that I think about it, I have an even better idea, instead of making a custom FileInputStream that blocks on read(), make a custom FileInputStream that prints out the stack traces on read(). The custom class can then call the real version of read(). This way you will get all of the stack traces for all calls.
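Something along these lines, as a sketch. The class name TracingFileInputStream is made up, and it collects the traces in a list instead of printing them so a test can inspect them. Note that the trace is taken in Java code just before delegating, so it will not include the native frames that a sampler would see during the actual read.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical subclass that records who called read() and then does the real read.
class TracingFileInputStream extends FileInputStream {
    // Stack traces captured at each traced read call, oldest first.
    private final List<StackTraceElement[]> traces = new ArrayList<>();

    TracingFileInputStream(File file) throws FileNotFoundException {
        super(file);
    }

    @Override
    public int read() throws IOException {
        traces.add(Thread.currentThread().getStackTrace());
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        traces.add(Thread.currentThread().getStackTrace());
        return super.read(b, off, len);
    }

    List<StackTraceElement[]> traces() {
        return traces;
    }
}
```

Other entry points, such as read(byte[]), would need the same treatment if callers use them.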

From my understanding you want to write a test which inspects the stack trace of FileInputStream.read() method. What about descendants of FileInputStream if they override the read() method?
If you don't need to inspect the descendants, I think you can use the JVM Tool Interface by inserting a break point at runtime in the desired method, and in the event processing of this event (break point) - dump the stack trace.
After the dump is completed you remove the break point and continue the execution.
(This all occurs in runtime using this API, no black magic :) )

You could have a separate thread watch for changes to the file's access time and generate a jvm thread dump when that happens. As to generating the thread dump in code I haven't tried but looks like that's answered here: Generate a Java thread dump without restarting.
I don't know how well this will work with the timing between your threads but I imagine this should come pretty close. I'm also not 100% on the OS independence of this solution as I haven't tested it, but it should work for most modern-ish systems. See the javadocs on the java.nio.file.attribute.BasicFileAttributes to see what will return if it's not supported.

One trick is : if it is possible to modify your API to return a Reader instead of a File, then you can wrap a String with a custom StringReader (class SlowAsRubyStringReader extends Reader, say) that overrides the various int read() methods with a Thread.sleep(500) before it does the real work. Only during testing, of course.
See http://docs.oracle.com/javase/7/docs/api/java/io/StringReader.html
I think there is a larger issue here, not just Files : you want to inspect the context in which an API is getting called during your test cases is it not? That is, you want to be able to examine the stack and say, "aha! I caught you calling the MudFactory API from the JustTookABath object, OUTRAGEOUS!". If this is the case, then you may have to delve into dynamic proxies, which will allow you to hijack function calls or use aspect-oriented programming, which allows you to do the same, but in a more systematic way. See http://en.wikipedia.org/wiki/Pointcut
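For the slow-reader trick mentioned above, a minimal sketch could look like this. The class name SlowStringReader and the configurable delay are my own choices for illustration (the 500 ms figure above would simply be passed in):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical test-only reader that sleeps before every read.
final class SlowStringReader extends Reader {
    private final StringReader delegate;
    private final long delayMillis;

    SlowStringReader(String text, long delayMillis) {
        this.delegate = new StringReader(text);
        this.delayMillis = delayMillis;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        try {
            Thread.sleep(delayMillis); // simulate a slow source
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while simulating a slow read", e);
        }
        return delegate.read(cbuf, off, len);
    }

    @Override
    public void close() {
        delegate.close();
    }
}
```

Reader's single-character read() is implemented in terms of read(char[], int, int), so overriding that one method is enough to slow down the other read variants as well.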

read() dives quickly into native code, so yes, you probably need to go native to block at that level. Alternatively you may want to consider logging a stack trace at the point in your code before or after read().
Something like:
log ( ExceptionUtils.getStackTrace(new Exception()) );
ExceptionUtils doco is here: https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/exception/ExceptionUtils.html

How to Wait for windows process to finish before opening file in java

I have implemented a listener that notifies if we receive a new file in a particular directory. This is implemented by polling and using a TimerTask.
Now the program is so set up that once it receives a new file it calls another java program that opens the file and validates whether it is the correct file. My problem is that since the polling happens a specified number of seconds later there can arise a case in which a file is being copied in that directory and hence is locked by windows.
This throws an IOException since the other java program that tries to open it for validation cannot ("File is being used by another process").
Is there a way I can know when windows has finished copying and then call the second program to do the validations from java?
I will be more than happy to post code snippets if someone needs them in order to help.
Thanks

Thanks a lot for all the help, I was having the same problem with WatchEvent.
Unfortunately, as you said, file.canRead() and file.canWrite() both return true, even if the file is still locked by Windows. So I discovered that if I try to "rename" it with the same name, I know if Windows is working on it or not. So this is what I did:
while(!sourceFile.renameTo(sourceFile)) {
// Cannot read from file, windows still working on it.
Thread.sleep(10);
}
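A variant of the same loop with a timeout, so it cannot spin forever if the file never becomes available. This is a sketch only: the class and method names are invented, and it relies on the observed behaviour that renameTo fails while Windows still has the file locked, which is platform-dependent.

```java
import java.io.File;

// Hypothetical helper for illustration.
final class RenameProbe {
    /**
     * Polls until renaming the file onto itself succeeds or the timeout elapses.
     * Returns true if the rename succeeded.
     */
    static boolean waitUntilRenamable(File file, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!file.renameTo(file)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // still locked (or missing) after the timeout
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }
}
```

Usage would be something like waitUntilRenamable(sourceFile, 60_000, 10), treating a false result as "give up and report an error".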

This one is a bit tricky. It would have been a piece of cake if you could control or at least communicate with the program copying the file but this won't be possible with Windows I guess. I had to deal with a similar problem a while ago with SFU software, I resolved it by looping on trying to open the file for writing until it becomes available.
To avoid high CPU usage while looping, the file can be checked at exponentially increasing intervals (exponential backoff).
EDIT A possible solution:
File fileToCopy = new File(pathname); // pathname: path of the incoming file
int sleepTime = 1000; // Sleep 1 second
while (!fileToCopy.canWrite()) {
    // Cannot write to file, Windows still working on it
    Thread.sleep(sleepTime);
    sleepTime *= 2; // Double the sleep time on every retry
    if (sleepTime > 30000) {
        // Set a maximum sleep time to ensure we are not sleeping forever :)
        sleepTime = 30000;
    }
}
// Here, we have access to the file, go process it
processFile(fileToCopy);
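The "loop on trying to open the file for writing" approach described above could be sketched like this. The names are invented for illustration, and the idea that a successful open means the copy has finished is an assumption that depends on the copying program holding the file without write sharing, as on Windows.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration.
final class OpenWithBackoff {
    /** Tries to open the file for writing, doubling the wait between attempts up to a cap. */
    static boolean waitUntilWritable(Path file, long initialDelayMillis,
                                     long maxDelayMillis, int maxAttempts)
            throws InterruptedException {
        long delay = initialDelayMillis;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                return true; // the open succeeded; the channel is closed again right away
            } catch (IOException stillBusy) {
                Thread.sleep(delay);
                delay = Math.min(delay * 2, maxDelayMillis); // exponential backoff, capped
            }
        }
        return false; // gave up after maxAttempts
    }
}
```

Unlike canWrite(), this actually attempts the open, so a sharing violation shows up as an IOException instead of being missed.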

I think you can create the File object and then use canRead or canWrite to know whether file ready to be used by the other java program.
http://docs.oracle.com/javase/6/docs/api/java/io/File.html
The other option is to try to open the file in the first program and, if it throws an exception, not call the other Java program. But I'd recommend the 'File' option above.

Java keeps lock on files for no apparent reason

Despite closing streams in finally clauses I seem to constantly run into cleaning up problems when using Java. File.delete() fails to delete files, Windows Explorer fails too. Running System.gc() helps sometimes but nothing short of terminating the VM helps consistently and that is not an option.
Does anyone have any other ideas I could try? I use Java 1.6 on Windows XP.
UPDATE: FLAC code sample removed, the code worked if I isolated it.
UPDATE:
More info, this happens in Apache Tomcat, Commons FileUpload is used to upload the file and could be the culprit, also I use Runtime.exec() to execute LAME in a separate process to encode the file, but that seems unlikely to cause this since ProcessExplorer clearly indicates that java.exe has a RW lock on the file and LAME terminates fine.
UPDATE: I am working with the assumption that there is a missing close() or a close() that does not get called somewhere in my code or external library. I just can't find it!

The code you posted looks good - it should not cause the issues you are describing. I understand you posted just a piece of the code you have - can you try extracting just this part to a separate program, run it and see if the issue still happens?
My guess is that there is some other place in the code that does new FileInputStream(path); and does not close the stream properly. You might be just seeing the results here when you try to delete the file.

I assume you're using jFlac. I downloaded jFlac 1.3 and tried your sample code on a flac freshly downloaded from the internet live music archive. For me, it worked. I even monitored it with ProcessExplorer and saw the file handles be opened and then released. Is your test code truly as simple as what you gave us, or is that a simplified version of your code? For me, once close() was called, the handle was released and the file was subsequently successfully deleted.
Try changing your infinite loop to:
File toDelete = new File(path);
if (!toDelete.delete()) {
System.out.println("Could not delete " + path);
System.out.println("Does it exist? " + toDelete.exists());
}
or if you want to keep looping, then put a 1 second sleep between attempts to delete the file. I tried this with JDK6 on WinXP Pro.
Don't forget to put a try/catch around your close() and log errors if the close throws an exception.

Make sure you have your close calls in the finally block not in the try block. If there is no try/finally because the method throws the exception then add a try/finally and put the close in there.
Look at the Windows Task Manager. For the Processes add the "Handles" column (under the View menu). Watch to see if the handles keep going up without ever dropping.
Use a profiler to see if you have Stream/Reader/Writer objects around that you do not think you should have.
EDIT:
Thanks for posting the code... off to see it. One thing - your close methods are not both guaranteed to execute - the first close might throw and then the second won't run.
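To illustrate the point about the second close being skipped: with try-with-resources (Java 7+, so not available on 1.6) every declared resource gets its close() call even if an earlier close() throws. A self-contained sketch with a made-up Tracked resource standing in for the real streams:

```java
import java.io.IOException;

// Hypothetical demo; Tracked stands in for a real stream.
final class CloseBothDemo {
    /** Records whether close() ran and can be told to fail on close. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final boolean failOnClose;
        boolean closed;

        Tracked(String name, boolean failOnClose) {
            this.name = name;
            this.failOnClose = failOnClose;
        }

        @Override
        public void close() throws IOException {
            closed = true;
            if (failOnClose) {
                throw new IOException("close failed: " + name);
            }
        }
    }

    /** Returns {inClosed, outClosed} after a block in which the first close() to run throws. */
    static boolean[] closeBoth() {
        Tracked in = new Tracked("in", false);
        Tracked out = new Tracked("out", true);
        // Resources are closed in reverse order of declaration: 'o' first, then 'i'.
        try (Tracked i = in; Tracked o = out) {
            // ... use both resources ...
        } catch (IOException e) {
            System.err.println("Logged: " + e.getMessage());
        }
        return new boolean[] {in.closed, out.closed};
    }
}
```

On Java 6 the equivalent is nesting the try/finally blocks, one per stream, so that each close() sits in its own finally.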
EDIT 2:
final WavWriter wavWriter = new WavWriter(os);
FLACDecoder decoder = new FLACDecoder(is);
The above two lines will presumably cause the streams to be kept in instance variables. As a test, see if you can set the stream references to null after the decoder.decode() call (make a decoder.cleanup() method perhaps). See if holding onto the closed streams is causing a problem.
Also, do you do any wrapping of the streams passed into the above constructors? If so you might have to close the streams via the wrappers.

Your code sample should definitely work. In fact I ran it on Java 1.6/Vista with jflac 1.3 and the source file is deleted, without any looping.
I'm guessing in your case another process is keeping the file open, perhaps a desktop search indexer or an antivirus. You can use procexp (Process Explorer) to find which process is actually holding onto the file.

Isn't that an empty while loop?
you have:
try
{
...code
}
finally
{
}
while (something);
put some whitespace in there, and you actually have:
try
{
...code
}
finally
{
}
while (something)
;
your while loop isn't related to your try/finally. if your original try statement fails and the file isn't created, that while loop will never complete, because the try/finally will never execute a second time.
did you intend to make that a do{ all your code } while (your while statement)?
because that isn't what you have there.
EDIT to clarify:
my suggestion would be to change your while loop to have more info of why it can't delete:
while (!file.delete())
{
if (!file.exists())
break; // the file doesn't even exist, of course delete will fail
if (!file.canRead())
break; // the file isn't readable, delete will fail
if (!file.canWrite())
break; // the file isn't writable, delete will fail
}
because if delete fails once, it's just going to fail over and over and over, so of course it's going to hang there. You aren't changing the state of the file in the loop.
Now that you've added other info, like Tomcat, etc., is this a permissions issue? Are you trying to write to a file that the user Tomcat is running as (nobody?) can't create? Or delete a file that the Tomcat process can't delete?
If process explorer/etc say java has a lock on the file, then something still has an open stream using it. someone might have not properly called close() on whatever streams are writing to the file?

If you are out of clues and ideas: In cygwin, cd to your javaroot and run something like:
find . -name '*.java' -print0 | xargs -0 grep "new.*new.*putStream"
It might provide a few suspects...

Another thing to try since you're using Tomcat: in your Context Descriptor (typically Tomcat/conf/Catalina/localhost/your-context.xml), you can set antiResourceLocking=true, which is designed to "avoid resource locking on Windows". The default for this (if you don't specify) is false. Worth a try.
