I'm trying to debug a problem that just surfaced in my program. Until now, I've been writing, reading, and updating a properties file with no problems using the following code structure:
public void setAndReplacePropValue(String dir, String key, String value) throws FileNotFoundException, IOException {
    if (value != null) {
        File file = new File(dir);
        if (!file.exists()) {
            System.out.println("File: " + dir + " is not present. Attempting to create new file now..");
            new FilesAndFolders().createTextFileWithDirsIfNotPresent(dir);
        }
        if (file.exists()) {
            try {
                FileInputStream fileInputStream = null;
                fileInputStream = new FileInputStream(file);
                if (fileInputStream != null) {
                    Properties properties = new Properties();
                    properties.load(fileInputStream);
                    fileInputStream.close();
                    if (properties != null) {
                        FileOutputStream fileOutputStream = new FileOutputStream(file);
                        properties.setProperty(key, value);
                        properties.store(fileOutputStream, null);
                        fileOutputStream.close();
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            System.out.println("File: " + dir + " does not exist and attempt to create new file failed");
        }
    }
}
However, recently I noticed that a specific file (let's call it C:\\Users\\Admin\\Desktop\\props.txt) is being deleted after being updated from multiple threads. I'm not sure of the exact source of this error, as it seems to happen randomly.
I thought that, perhaps, if two threads call setAndReplacePropValue() and the first thread executes FileOutputStream fileOutputStream = new FileOutputStream(file); (which truncates the file) before it has a chance to re-write the data (via properties.store(fileOutputStream, null)), then the second thread might call fileInputStream = new FileInputStream(file); on an empty file, causing that thread to wipe out the previous data when it writes the 'empty' properties back to the file.
To test my hypothesis I tried calling setAndReplacePropValue() from multiple threads several hundred to several thousand times in a row, making changes to setAndReplacePropValue() as needed. Here are my results:
1. If setAndReplace() is declared as static + synchronized, the original props data is preserved. This remains true even when I add a random delay after opening the FileOutputStream, as long as the JVM exits normally. If the JVM is killed/terminated (after the FileOutputStream is opened), the previous data is deleted.
2. If I remove both the static and synchronized modifiers from setAndReplace() and call it 5,000 times, the old data is still preserved (why?), as long as the JVM exits normally. This appears to be true even when I add a random delay in setAndReplace() (after opening the FileOutputStream).
3. When I try modifying the props file using an ExecutorService (I occasionally access setAndReplacePropValue() via an ExecutorService in my program), the file content is preserved as long as there is no delay after opening the FileOutputStream. If I add a delay and the delay is greater than the 'timeout' value set in future.get() (so an interrupted exception is thrown), the data is NOT preserved. This remains true even if I add the static + synchronized keywords to the method.
In short, my question is: what is the most likely explanation for why the file is being deleted? (I thought point 3 might explain the error, but I'm not actually sleeping after calling new FileOutputStream(), so presumably that would not prevent data from being written back to the file after the stream is opened.) Is there another possibility I didn't think of?
Also, why is point 2 true? If the method is not declared static/synchronized, shouldn't that allow one thread to create an InputStream from an empty file? Thanks.
Unfortunately it is very difficult to provide feedback on your code without a ton of additional information, but hopefully my comments will be helpful.
In general, having multiple threads reading and writing the same file is a really bad idea. I can't agree more with @Hovercraft-Full-Of-Eels, who recommends that you have one thread do the reading/writing and the other threads just add updates to a shared BlockingQueue, as sketched below.
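To make that concrete, here is a minimal sketch of the single-writer pattern (the class and method names are made up for illustration; only the writer thread ever touches the file):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SingleWriter implements Runnable {

    // Hypothetical holder for one pending key/value update.
    public static class Update {
        final String key;
        final String value;
        Update(String key, String value) { this.key = key; this.value = value; }
    }

    private final BlockingQueue<Update> queue = new LinkedBlockingQueue<Update>();

    // Worker threads call this instead of touching the file themselves.
    public void submit(String key, String value) {
        queue.add(new Update(key, value));
    }

    @Override
    public void run() {
        try {
            while (true) {
                Update update = queue.take(); // blocks until an update arrives
                // Only this thread ever loads, modifies, and stores the
                // properties file, so no thread can read a half-written file.
                // e.g. setAndReplacePropValue(dir, update.key, update.value);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shut the writer down
        }
    }
}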
But, that said, here are some comments.
If setAndReplace() is declared as static + synchronized the original props data is preserved.
Right, this stops the terrible race condition in your code where two threads could be trying to write to the output file at the same time, or where one thread starts to write and another thread reads the momentarily empty file, causing data to be lost.
If JVM is killed/terminated (after FileOutputStream is called) then previous data will be deleted.
I don't quite understand this part, but your code should have good try/finally clauses to make sure that the files are closed appropriately when the JVM terminates. If the JVM is hard-killed then the file may have been opened but not yet written (depending on timing). In this case, I would recommend that you write to a temporary file and then rename it to your properties file, which is atomic. Then you might miss the update if the JVM is killed, but the file will never be overwritten and left empty.
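A minimal sketch of that write-then-rename idea, assuming Java 7+ and java.nio.file (the file names are examples, and note that how ATOMIC_MOVE interacts with an existing target is platform-dependent):

import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

public class AtomicPropertiesStore {

    public static void store(Properties properties, Path target) throws IOException {
        // Write to a temp file in the same directory (target is assumed to
        // have a parent) so the rename stays on one file system, which is
        // required for an atomic move.
        Path temp = Files.createTempFile(target.getParent(), "props", ".tmp");
        try (OutputStream out = Files.newOutputStream(temp)) {
            properties.store(out, null);
        }
        // If the JVM dies before this line, the old file is untouched;
        // we lose one update but never end up with an empty file.
        Files.move(temp, target,
                StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.ATOMIC_MOVE);
    }
}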
If I remove both static and synchronized modifiers from setAndReplace() and call setAndReplace() 5,000 times, the old data is still preserved (why?)
No idea. Depends on race conditions. Maybe you are just getting lucky.
When I try modifying props file using ExecutorService (I occasionally access setAndReplacePropValue() via ExecutorService in my program), file content is preserved as long as there's no delay after FileOutputStream. If I add delay and the delay is > 'timeout' value set in future.get() (so interrupted exception is thrown) the data is NOT preserved. This remains true even if I add static + synchronized keywords to method.
I can't answer that without seeing the specific code.
This would actually be a good idea if you had a fixed thread pool with one thread; then each of the threads that wants to update a value would just submit the field/value pair to the thread pool. This is approximately what @Hovercraft-Full-Of-Eels was talking about.
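A sketch of that variant; the PropertyUpdater wrapper is hypothetical, and the actual write is assumed to be the question's setAndReplacePropValue():

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PropertyUpdater {

    // One thread means updates are applied strictly one at a time.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    public void update(final String dir, final String key, final String value) {
        executor.submit(new Runnable() {
            @Override
            public void run() {
                try {
                    // The only place the file is ever touched.
                    setAndReplacePropValue(dir, key, value);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }

    public void shutdown() {
        executor.shutdown(); // queued updates are still applied first
    }

    // Stand-in for the question's method; the real body would go here.
    private void setAndReplacePropValue(String dir, String key, String value) throws Exception {
    }
}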
Hope this helps.
UPDATE 2: Please close this question
After further debugging, I found that the problem is not in the inner try block but in a bug inside the 'while' loop. An exception was thrown there and not caught, which skips the inner try block. Apologies for my mistake; please delete this thread.
UPDATE: added logging to capture errors during delete.
I am downloading 8000ish GZ files from a server, processing their content locally, then deleting the downloaded copy upon completion. I am running this over a number of threads, each processing a disjoint batch of GZ files. But I do not understand why my code occasionally (not always) fails to delete the GZ files. The code generally looks like this:
....
private static final Logger LOG = Logger.getLogger(....class.getName());
.....
for (String inputGZFile : gzFiles) { // gzFiles is a list of URLs to be processed by this thread
    try {
        URL downloadFrom = new URL(inputGZFile); // built from the URL string (not shown in the original)
        File downloadTo = new File(this.outFolder + "/" + new File(downloadFrom.getPath()).getName());
        FileUtils.copyURLToFile(downloadFrom, downloadTo);
        InputStream fileStream = new FileInputStream(downloadTo);
        InputStream gzipStream = new GZIPInputStream(fileStream);
        Reader decoder = new InputStreamReader(gzipStream, Charset.forName("utf8"));
        Scanner inputScanner = new Scanner(decoder);
        inputScanner.useDelimiter(" .");
        while (inputScanner.hasNextLine() && (content = inputScanner.nextLine()) != null) {
            // do something
        }
        try {
            inputScanner.close();
            FileUtils.forceDelete(downloadTo);
        } catch (Exception e) {
            LOG.info("\t thread " + id + " deleting gz file error " + inputGZFile);
            LOG.info("\t thread " + id + ExceptionUtils.getFullStackTrace(e));
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
The only reason I can think of is that the scanner did not close the file or release the file handle. But that would be strange, because I already call the close method on the scanner.
Any suggestions are highly appreciated.
Without the ability to look into your log files, or debug your system first hand, it is close to impossible to tell you what is going wrong here.
But here is what you can definitely do: put that call to FileUtils.forceDelete(downloadTo); inside a finally block, for example.
The whole point of try/catch/finally is to enable you to enforce that specific actions always take place, no matter what happened in the try block!
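A hedged sketch of how that looks, reusing the question's variables (decoder, content, and downloadTo are assumed to be defined as in the question's loop):

Scanner inputScanner = null;
try {
    inputScanner = new Scanner(decoder);
    inputScanner.useDelimiter(" .");
    while (inputScanner.hasNextLine() && (content = inputScanner.nextLine()) != null) {
        // do something
    }
} finally {
    // Runs even if the loop above throws; closing the Scanner also
    // closes the decoder/gzip/file streams it wraps, so the handle
    // is released before the delete is attempted.
    if (inputScanner != null) {
        inputScanner.close();
    }
    FileUtils.forceDelete(downloadTo);
}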
Also note: if you are unable to tell what your code is doing, add logging support to it, so that instead of printStackTrace() you log the whole exception to a place where it does not get lost.
Meaning: the real answer here is that you step back and take the necessary actions to find out where your problems are coming from.
I am always curious how a rolling file is implemented in logging frameworks.
How would one even start creating a file-writing class, in any language, that ensures the file size is not exceeded?
The only possible solution I can think of is this:
write method:
    size = file size + size of string to write
    if (size > limit)
        close the file writer
        open a file reader
        read the file
        close the file reader
        open the file writer (clears the whole file)
        remove size from the beginning to accommodate the new string to write
        write the new truncated string
    write the string we received
This seems like a terrible implementation, but I cannot think of anything better.
Specifically, I would love to see a solution in Java.
EDIT: By 'remove size from the beginning' I mean: say I have a 20-byte string (which is the limit) and I want to write another 3-byte string; I remove 3 bytes from the beginning, am left with the last 17 bytes, and by appending the new string I am back at 20 bytes.
Because your question made me look into it, here's an example from the logback logging framework. The RollingFileAppender#rollover() method looks like this:
public void rollover() {
    synchronized (lock) {
        // Note: This method needs to be synchronized because it needs exclusive
        // access while it closes and then re-opens the target file.
        //
        // make sure to close the hereto active log file! Renaming under windows
        // does not work for open files
        this.closeOutputStream();
        try {
            rollingPolicy.rollover(); // this actually does the renaming of files
        } catch (RolloverFailure rf) {
            addWarn("RolloverFailure occurred. Deferring roll-over.");
            // we failed to roll-over, let us not truncate and risk data loss
            this.append = true;
        }
        try {
            // update the currentlyActiveFile
            currentlyActiveFile = new File(rollingPolicy.getActiveFileName());
            // This will also close the file. This is OK since multiple
            // close operations are safe.
            // COMMENT MINE: this also sets the new OutputStream for the new file
            this.openFile(rollingPolicy.getActiveFileName());
        } catch (IOException e) {
            addError("setFile(" + fileName + ", false) call failed.", e);
        }
    }
}
As you can see, the logic is pretty similar to what you posted. They close the current OutputStream, perform the rollover, then open a new one (openFile()). Obviously, this is all done in a synchronized block since many threads are using the logger, but only one rollover should occur at a time.
A RollingPolicy is a policy for how to perform a rollover, and a TriggeringPolicy decides when to perform one. With logback, you usually base these policies on file size or time.
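To bring it back to the question: below is a bare-bones, size-triggered roller in plain Java. It renames the full file to a backup instead of truncating from the front, which is what most frameworks do; all names are made up, and the size accounting counts chars rather than encoded bytes for brevity:

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

public class SimpleRollingWriter {

    private final File file;
    private final long limit; // rough maximum size before rolling
    private Writer writer;
    private long written;

    public SimpleRollingWriter(File file, long limit) throws IOException {
        this.file = file;
        this.limit = limit;
        this.written = file.length();
        this.writer = new FileWriter(file, true); // append mode
    }

    public synchronized void write(String line) throws IOException {
        if (written + line.length() > limit) {
            rollover();
        }
        writer.write(line);
        written += line.length();
    }

    private void rollover() throws IOException {
        writer.close(); // must close before renaming, especially on Windows
        File backup = new File(file.getPath() + ".1");
        backup.delete(); // drop the previous backup, if any
        if (!file.renameTo(backup)) {
            throw new IOException("Rollover failed for " + file);
        }
        writer = new FileWriter(file); // fresh, empty active file
        written = 0;
    }
}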
I've made two apps designed to run concurrently (I do not want to combine them); one reads from a certain file and the other writes to it. When only one or the other is running there are no errors, but if they are both running I get an 'access is denied' error.
Relevant code of the first:
class MakeImage implements Runnable {
    @Override
    public void run() {
        File file = new File("C:/Users/jeremy/Desktop/New folder (3)/test.png");
        while (true) {
            try {
                // make image
                if (image != null) {
                    file.createNewFile();
                    ImageIO.write(image, "png", file);
                    hello.repaint();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
Relevant code of the second:
BufferedImage image = null;
try {
    // Read from a file
    image = ImageIO.read(new File("C:/Users/jeremy/Desktop/New folder (3)/test.png"));
} catch (Exception e) {
    e.printStackTrace();
}
if (image != null) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ImageIO.write(image, "png", baos);
    baos.flush();
    byte[] imageInByte = baos.toByteArray();
    baos.close();
    returns = Base64.encodeBase64String(imageInByte);
}
I looked at this: Java: how to handle two process trying to modify the same file, but that covers the case where both are writing to the file, whereas here only one is. I tried the retry-later method suggested in that question's answer, without any luck. Any help would be greatly appreciated.
Unless you use OS-level file locking of some sort and check for the locks, you're not going to be able to do this reliably very easily. A fairly reliable way to manage it would be to use another file in the directory as a semaphore: "touch" it when you're writing or reading, and remove it when you're done. Check for the existence of the semaphore before accessing the file. Otherwise you will need to use a database of some sort to store the file lock (guaranteed consistency) and check for it there.
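A rough sketch of that semaphore-file idea (the .lock name is just an example, and as noted above this is cooperative, not a real OS lock, so a crash can leave the lock file behind):

import java.io.File;
import java.io.IOException;

public class SemaphoreFile {

    private final File semaphore;

    public SemaphoreFile(File protectedFile) {
        this.semaphore = new File(protectedFile.getPath() + ".lock");
    }

    // Returns true only if we created the lock file ourselves;
    // createNewFile() is atomic, so at most one process wins.
    public boolean tryAcquire() throws IOException {
        return semaphore.createNewFile();
    }

    public void release() {
        semaphore.delete();
    }
}

Each side would call tryAcquire() before touching test.png, do its read or write in a try block with release() in the finally, and back off and retry when tryAcquire() returns false.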
That said, you really should just combine this into 1 program.
Try RandomAccessFile.
This is a useful but very dangerous feature. It goes like this: if you create different instances of RandomAccessFile for the same file, you can concurrently write to different parts of that file.
You can create multiple threads pointing to different parts of the file using the seek method, and those threads can update the file at the same time. seek allows you to move to any part of the file, even one that does not exist yet (past EOF), so you can move to any location in a newly created file and write bytes at that location. You can open multiple instances of the same file, seek to different locations, and write to multiple locations at the same time, as sketched below.
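A small sketch of that idea, assuming the two threads write to fixed, non-overlapping regions of the file (the file name and 64-byte record size are arbitrary):

import java.io.RandomAccessFile;
import java.util.Arrays;

public class SeekWriteDemo {

    static final int RECORD_SIZE = 64;

    public static void main(String[] args) throws Exception {
        Thread a = writer("data.bin", 0 * RECORD_SIZE, (byte) 'A');
        Thread b = writer("data.bin", 1 * RECORD_SIZE, (byte) 'B');
        a.start();
        b.start();
        a.join();
        b.join();
    }

    // Each thread gets its own RandomAccessFile instance and its own region.
    static Thread writer(final String path, final long offset, final byte fill) {
        return new Thread(new Runnable() {
            @Override
            public void run() {
                try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
                    raf.seek(offset); // may point past the current end of file
                    byte[] block = new byte[RECORD_SIZE];
                    Arrays.fill(block, fill);
                    raf.write(block);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }
}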
Use synchronized on the method that modifies the file.
Edited:
As per the definition of a thread-safe class: "A class is said to be thread safe when it works correctly in the presence of the underlying OS's interleaving and scheduling, with no synchronization mechanism required from the client side."
I believe the file is to be accessed from a different machine, so there must be some client-server mechanism. If there is, let the server side provide the synchronization, and then it does not matter how many clients access it.
If not, synchronized is more than enough.
I'm trying to delete a file that another thread within my program has previously worked with.
I'm unable to delete the file but I'm not sure how to figure out which thread may be using the file.
So how do I find out which thread is locking the file in java?
I don't have a straight answer (and I don't think there is one either; this is controlled at the OS level (native), not at the JVM level), and I also don't really see the value of the answer (you still can't close the file programmatically once you have found out which thread it is). But I suspect you don't yet know that the inability to delete is usually caused by the file still being open. This may happen when you do not explicitly call Closeable#close() on the InputStream, OutputStream, Reader or Writer that is constructed around the File in question.
Basic demo:
public static void main(String[] args) throws Exception {
    File file = new File("c:/test.txt"); // Precreate this test file first.
    FileOutputStream output = new FileOutputStream(file); // This opens the file!
    System.out.println(file.delete()); // false
    output.close(); // This explicitly closes the file!
    System.out.println(file.delete()); // true
}
In other words, ensure that throughout your entire Java IO stuff the code is properly closing the resources after use. The normal idiom is to do this in the try-with-resources statement, so that you can be certain that the resources will be freed up anyway, even in case of an IOException. E.g.
try (OutputStream output = new FileOutputStream(file)) {
    // ...
}
Do this for any InputStream, OutputStream, Reader, Writer, or anything else that implements AutoCloseable, which you're opening yourself (using the new keyword).
This is technically not needed for certain implementations, such as ByteArrayOutputStream, but for the sake of clarity, just adhere to the close-in-finally idiom everywhere to avoid misconceptions and refactoring bugs.
In case you're not on Java 7 or newer yet, use the try-finally idiom below instead.
OutputStream output = null;
try {
    output = new FileOutputStream(file);
    // ...
} finally {
    if (output != null) try { output.close(); } catch (IOException logOrIgnore) {}
}
Hope this helps to nail down the root cause of your particular problem.
Regarding this question: I also tried to find the answer, asked this question myself, and found the following:
Every time a JVM thread locks a file exclusively, the JVM also locks some Java object; for example, in my case:
sun.nio.fs.NativeBuffer
sun.nio.ch.Util$BufferCache
So you just need to find that locked Java object and analyze it, and you will find which thread locked your file.
I am not sure this works if the file is merely open (without being locked exclusively), but I am sure it works if the file is locked exclusively by a thread (using java.nio.channels.FileLock, java.nio.channels.FileChannel and so on).
For more info, see this question.
This is perhaps similar to previous posts, but I want to be specific about the use of locking on a network, rather than locally. I want to write a file to a shared location, so it may well go on a network (certainly a Windows network, maybe Mac). I want to prevent other people from reading any part of this file whilst it is being written. This will not be a highly concurrent process, and the files will typically be less than 10MB.
I've read the FileLock documentation and File documentation and am left somewhat confused, as to what is safe and what is not. I want to lock the entire file, rather than portions of it.
Can I use FileChannel.tryLock()? Is it safe on a network, or does it depend on the type of network? Will it work on a standard Windows network (if there is such a thing)?
If that does not work, is the best approach to create a zero-byte file or directory as a lock file and then write out the main file? Why does the File.createNewFile() documentation say not to use it for file locking? I appreciate this is subject to race conditions and is not ideal.
This can't be reliably done on a network file system. As long as your application is the only one that accesses the file, it's best to implement some kind of cooperative locking process (perhaps writing a lock file to the network filesystem when you open the file). The reason that is not recommended, however, is that if your process crashes, or the network goes down, or any number of other issues happen, your application gets into a nasty, dirty state.
You can have an empty token file lying on the server you want to write to.
When you want to write to the server, you first catch the token. Only while you hold the token should you write to any file on the server.
When you are done with your file operations, or an exception is thrown, you release the token.
The helper class can look like this (the constant values and the TokenException stand-in below are filled in as examples, since the originals were not shown):
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class SLTokenLock {

    // Example values; the original constants were not shown.
    private static final String TOKEN_FILE = "token.lock";
    private static final String CANT_CATCH_TOKEN = "Could not acquire the commit token";
    private static final String CANT_RELEASE_TOKEN = "Could not release the commit token";

    private RandomAccessFile raf;
    private FileLock lock;
    private File tokenFile;

    public SLTokenLock(String serverDirectory) {
        String tokenFilePath = serverDirectory + File.separator + TOKEN_FILE;
        tokenFile = new File(tokenFilePath);
    }

    public void catchCommitToken() throws TokenException {
        try {
            raf = new RandomAccessFile(tokenFile, "rw"); //$NON-NLS-1$
            FileChannel channel = raf.getChannel();
            lock = channel.tryLock();
            if (lock == null) {
                throw new TokenException(CANT_CATCH_TOKEN);
            }
        } catch (Exception e) {
            throw new TokenException(CANT_CATCH_TOKEN, e);
        }
    }

    public void releaseCommitToken() throws TokenException {
        try {
            if (lock != null && lock.isValid()) {
                lock.release();
            }
            if (raf != null) {
                raf.close(); // also closes the channel that held the lock
            }
        } catch (Exception e) {
            throw new TokenException(CANT_RELEASE_TOKEN, e);
        }
    }

    // Minimal stand-in for the author's exception type.
    public static class TokenException extends Exception {
        TokenException(String message) { super(message); }
        TokenException(String message, Throwable cause) { super(message, cause); }
    }
}
Your operations then should look like this:
try {
    token.catchCommitToken();
    // WRITE or READ to files inside the directory
} finally {
    token.releaseCommitToken();
}
I found this bug report, which describes why the note about file locking was added to the File.createNewFile documentation.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4676183
It states:
If you mark the file as deleteOnExit before invoking createNewFile, but the file already exists, you run the risk of deleting a file you didn't create and dropping someone else's lock! On the other hand, if you mark the file after creating it, you lose atomicity: if the program exits before the file is marked, it won't get deleted and the lock will be "wedged".
So it looks like the main reason locking is discouraged with File.createNewFile() is that you can end up with orphaned lock files if the JVM unexpectedly terminates before you have a chance to delete them. If you can deal with orphaned lock files then it could be used as a simple locking mechanism, as sketched below. However, I wouldn't recommend the method suggested in the comments of the bug report, as it has race conditions around reading/writing the timestamp value and reclaiming the expired lock.
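For illustration, a bare-bones sketch of such a createNewFile()-based lock, using the "mark after creating" ordering the report discusses (the .lock suffix is just a convention):

import java.io.File;
import java.io.IOException;

public class CreateNewFileLock {

    private final File lockFile;

    public CreateNewFileLock(File target) {
        this.lockFile = new File(target.getPath() + ".lock");
    }

    // createNewFile() atomically creates the file only if it does not
    // already exist, so at most one process can acquire the lock.
    public boolean tryAcquire() throws IOException {
        boolean acquired = lockFile.createNewFile();
        if (acquired) {
            // Marking AFTER creating avoids deleting someone else's lock,
            // at the cost of the atomicity gap the report describes: if the
            // JVM exits between the two calls, the lock file is "wedged".
            lockFile.deleteOnExit();
        }
        return acquired;
    }

    public void release() {
        lockFile.delete();
    }
}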
Rather than implementing a locking strategy which will, in all likelihood, rely on readers to adhere to your convention but will not force them to, perhaps you can write the file out to a hidden or obscurely named file where it will be effectively invisible to readers. When the write operation is complete, rename the file to the expected public name.
The downside is that hiding and/or renaming without additional IO may require you to use native OS commands, but the procedure to do so should be fairly simple and deterministic.
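With java.nio.file (Java 7+), the rename itself needs no native commands; only a true "hidden" attribute does on Windows, where a leading dot is not enough. A minimal sketch of the write-then-rename flow (the dot-prefixed temp name is just an example convention):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class WriteThenRename {

    public static void publish(byte[] data, Path publicName) throws IOException {
        // Write under an obscure name that readers will not look for.
        Path hidden = publicName.resolveSibling("." + publicName.getFileName() + ".part");
        Files.write(hidden, data);
        // Swap it into place; readers only ever see a complete file.
        Files.move(hidden, publicName, StandardCopyOption.REPLACE_EXISTING);
    }
}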