Java properties file under low disk space conditions on Linux

I unwittingly ran a Java program on Linux where the partition that it, its properties, and its logs resided on was close to 100% full. It failed, but after clearing up the disk space problem, I ran it again and it failed a second time because its properties file was 0 bytes long.
I don't have the source code and I don't want to go as far as decompiling the class files, but I was wondering whether the corruption of the properties could be because the program failed to write to the properties file.
The mystery is that I expected the properties to be read-only, and I don't recall any items being updated by the program. Could it be that even if the properties are only read, the file is opened in read-write mode and could disappear if the partition is full?
N.b. this program has run without failure or incident at least 1000 times over several years.

I don't have the source code and I don't want to go as far as decompiling the class files, but I was wondering whether the corruption of the properties could be because the program failed to write to the properties file.
That is the most likely explanation. There would have been an exception, but the application could have squashed it ... or maybe you didn't notice the error message. (Indeed, if the application tried to log the error to a file, that would most likely fail too.)
Could it be that even if the properties are only read, the file is opened in read-write mode and could disappear if the partition is full?
That is unlikely to be the answer in this case. Unlike many languages, in Java the code for reading a file and writing a file involves different stream classes. It is hard to imagine how / why the application's developer would open a property file for writing (with truncation) if there was never any intention to write it.
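To illustrate the distinction, here is a minimal sketch (the file name app.properties and the update itself are hypothetical). The key point is that merely opening a FileOutputStream truncates the file to 0 bytes before a single byte is written, so on a full disk the truncation can succeed while the subsequent write fails, leaving a 0-byte file:

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.Properties;

    public class PropsDemo {
        public static void main(String[] args) throws IOException {
            Properties props = new Properties();

            // Reading: an input stream; the file is never modified.
            try (FileInputStream in = new FileInputStream("app.properties")) {
                props.load(in);
            }

            // Writing: opening the output stream truncates the file to
            // 0 bytes *before* any data is written. If the disk is full,
            // store() fails and an empty file is left behind.
            try (FileOutputStream out = new FileOutputStream("app.properties")) {
                props.store(out, "updated settings");
            }
        }
    }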
The most plausible explanation is that the application does update the properties file. (Try installing the program again, using it, stopping it, and looking at the property file's modification timestamp.)
N.b. this program has run without failure or incident at least 1000 times over several years.
And I bet this is the first time you've run it while the disk was full :-)

Related

Inter-process file exchange: efficiency and race conditions

The story:
A few days ago I was thinking about inter-process communication based on file exchange. Say process A creates several files during its work and process B reads these files afterwards. To ensure that all files were correctly written, it would be convenient to create a special file whose existence will signal that all operations were done.
Simple workflow:
process A creates file "file1.txt"
process A creates file "file2.txt"
process A creates file "processA.ready"
Process B is waiting until file "processA.ready" appears and then reads file1 and file2.
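For concreteness, process A's side might look like this minimal Java sketch (the exchange directory and the file contents are hypothetical):

    import java.nio.file.Files;
    import java.nio.file.Path;

    public class ProcessA {
        public static void main(String[] args) throws Exception {
            Path dir = Path.of("exchange");   // hypothetical shared directory
            Files.createDirectories(dir);

            Files.writeString(dir.resolve("file1.txt"), "payload 1\n");
            Files.writeString(dir.resolve("file2.txt"), "payload 2\n");

            // The marker is created last; its existence signals that
            // file1.txt and file2.txt are complete.
            Files.createFile(dir.resolve("processA.ready"));
        }
    }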
Doubts:
File operations are performed by the operating system, specifically by the file subsystem. Since implementations can differ between Unix, Windows, and macOS, I'm uncertain about the reliability of file-exchange inter-process communication. Even if the OS guarantees this consistency, there are things like the JIT compiler in Java, which can reorder program instructions.
Questions:
1. Are there any real specifications on file operations in operating systems?
2. Is JIT really allowed to reorder file operation program instructions for a single program thread?
3. Is file exchange still a relevant option for inter-process communication nowadays, or is it unconditionally better to choose TCP/HTTP/etc.?
You don't need to know OS details in this case. The Java IO API is documented well enough to tell you whether a file was saved or not.
The JVM can't reorder native calls. This is not written explicitly in the JMM, but it is implied: the JVM can't guess what the impact of a native call is, so reordering such calls could be quite dangerous.
There are some disadvantages of using files as a way of communication:
It uses IO, which is slow
It is difficult to split the processes across different machines if you ever need to (there are ways, using Samba for example, but they are quite platform-dependent)
You could use a file watcher (WatchService) in Java to receive a signal when your .ready file appears.
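A minimal sketch of that idea, assuming the same hypothetical exchange directory as above:

    import java.nio.file.FileSystems;
    import java.nio.file.Path;
    import java.nio.file.StandardWatchEventKinds;
    import java.nio.file.WatchEvent;
    import java.nio.file.WatchKey;
    import java.nio.file.WatchService;

    public class ReadyWatcher {
        public static void main(String[] args) throws Exception {
            Path dir = Path.of("exchange");   // hypothetical shared directory
            try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
                dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
                while (true) {
                    WatchKey key = watcher.take();   // blocks until events arrive
                    for (WatchEvent<?> event : key.pollEvents()) {
                        Path created = (Path) event.context();
                        if (created.toString().endsWith(".ready")) {
                            System.out.println("Marker appeared; safe to read file1 and file2");
                            return;
                        }
                    }
                    key.reset();   // required, or no further events are delivered
                }
            }
        }
    }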
Reordering could apply, but it shouldn't hurt your application logic in this case - refer to the following link:
https://assylias.wordpress.com/2013/02/01/java-memory-model-and-reordering/
I don't know the size of your data, but I feel it would still be better to use a Message Queue (MQ) solution in this case. File IO is a relatively slow operation that could slow down the system.
I used a file-exchange-based approach on one of my projects. It's based on renaming file extensions when a process is done, so another process can retrieve files by matching file-name patterns.
The FTP process downloads a file and gives it a '.downloaded' extension.
The main task processor searches the directory for '*.downloaded' files.
Before starting, the job renames the file to '.processing'.
When finished, it renames the file to '.done'.
In case of error, it creates a supplementary file with an '.error' extension and puts the last processed line and the exception trace there. On retries, if this file exists, the job reads it and resumes from the correct position.
A locator process searches for '.done' files and, according to its configuration, moves them to a backup folder or deletes them.
This approach works fine under huge load in a mobile operator network.
One important consideration is to use unique file names, because the behaviour of moving a file varies between operating systems: for example, Windows gives an error when the same file already exists at the destination, whereas Unix overwrites it.
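A sketch of one such rename step in Java (the extensions follow this answer's convention; whether a move replaces an existing target is platform- and option-dependent, which is exactly why unique names matter):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    public class StageRename {
        // Claim a downloaded file by renaming file.downloaded -> file.processing.
        // ATOMIC_MOVE makes the rename all-or-nothing where the file system
        // supports it; if the target already exists, the outcome differs by
        // platform, hence the advice to use unique names.
        static Path claim(Path downloaded) throws IOException {
            String name = downloaded.getFileName().toString()
                    .replace(".downloaded", ".processing");
            return Files.move(downloaded, downloaded.resolveSibling(name),
                    StandardCopyOption.ATOMIC_MOVE);
        }
    }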

How to poll a directory and not hit a file-transfer race condition?

I am working on an application that polls a directory for new input files at a defined interval. The general process is:
Input files FTP'd to landing strip directory by another app
Our app wakes up
List files in the input directory
Atomic-move the files to a separate staging directory
Kick off worker threads (via a work-distributing queue) to consume the files from the staging directory
Go back to sleep
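In Java, steps 3-4 might look roughly like this sketch (directory paths are hypothetical; note this is also exactly where the race described below occurs, because the move can claim a file that is still being uploaded):

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    public class InputPoller {
        // One poll cycle: claim every file currently in the landing directory
        // by atomically moving it into staging. ATOMIC_MOVE requires both
        // directories to live on the same file system.
        static void pollOnce(Path landing, Path staging) throws IOException {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(landing)) {
                for (Path file : files) {
                    Files.move(file, staging.resolve(file.getFileName()),
                            StandardCopyOption.ATOMIC_MOVE);
                    // ... hand the staged file to the worker queue here
                }
            }
        }
    }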
I've uncovered a problem where the app will pick up an input file while it is incomplete and still in the middle of being transferred, resulting in a worker thread error, requiring manual intervention. This is a scenario we need to avoid.
I should note that the file transfer will complete successfully and the server will get a complete copy, but this happens only after the app has already given up due to an error.
I'd like to solve this in a clean way, and while I have some ideas for solutions, they all have problems I don't like.
Here's what I've considered:
Force the other apps (some of which are external to our company) to initially transfer the input files to a holding directory, then atomic-move them into the input directory once they're transferred. This is the most robust idea I've had, but I don't like this because I don't trust that it will always be implemented correctly.
Retry a finite number of times on error. I don't like this because it's a partial solution: it makes assumptions about transfer time and file size that could be violated. It would also blur the line between a genuinely bad file and one that's just been incompletely transferred.
Watch the file sizes and only pick up the file if its size hasn't changed for a defined period of time. I don't like this because it's too complex in our environment: the poller is a non-concurrent clustered Quartz job, so I can't just persist this info in memory because the job can bounce between servers. I could store it in the JobDetail, but this solution just feels too complicated.
I can't be the first to have encountered this problem, so I'm sure I'll get better ideas here.
I had that situation once; we got the other guys to upload the files with a different extension, e.g. *.tmp, and then, after the file copy completed, rename the file to the extension my code polls for. Not sure if that is as easily done when the files are coming in by FTP, though.

FileNotFoundException (File too large)

I am getting this exception when trying to download a file
Caused by: java.io.FileNotFoundException: /repository/PWWVFSYWDW0STLHYVEEKHMYBXZTTETGROCQ4FGdsadadaXR1407709207964905350810526.jpg (File too large)
at java.io.FileOutputStream.open(Native Method)
It is clear that the file exists. In addition, the same program works properly on my PC, but there is a problem on the server, which runs Unix.
Any ideas what might be causing this?
I think that this is an obscure error that is actually coming from the OS level or the JVM's native code implementation. The message "File too large" is the error message that you would get if the perror C library method was used to render the EFBIG error number.
Now ordinarily, this should not happen. According to the UNIX / Linux manual entries, the various open library calls should not fail with EFBIG.
However, I have seen various error reports that imply that fopen (etcetera) can fail like that on some file systems, and/or when a C / C++ program has been built with 64-bit file size support disabled.
So what does this mean?
It is not clear, but I suspect that it means that you are either:
using a flaky implementation of Java,
running a flaky release of UNIX / Linux, or
you are trying to use some type of file system that is not well supported by your server's OS. (Might it be on a FUSE file system?)
A possibly related Java bug:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7009975 (Java 7 on Solaris.)
So, it is solved. The problem was that the disk was full, and as a result the stream took a long time.
I cleaned up the disk, and after that there was no problem.
I received this message when trying to write a file to a directory on a RedHat server that already had the maximum number of files in it. I subdivided my files into subdirectories and the error did not reappear.
POSIX (and thus Unix) systems are allowed to impose a maximum length on the path (what you get from File.getPath()) or on the components of a path (the last of which you can get with File.getName()). You might be seeing this problem because of the long name for the file.
In that case, the file open operating system call will fail with an ENAMETOOLONG error code.
However, the message "File too large" is typically associated with the EFBIG error code. That is more likely to result from a write system call:
An attempt was made to write a file that exceeds the implementation-dependent maximum file size or the process' file size limit.
Perhaps the file is being opened for appending, and the implied lseek to the end of the file is giving the EFBIG error.
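To rule the path-length theory in or out, a quick check along these lines may help (a sketch; the 255-byte figure is the typical NAME_MAX on Unix file systems, not a universal constant):

    import java.io.File;
    import java.nio.charset.StandardCharsets;

    public class NameLengthCheck {
        // Warn before opening a file whose name may exceed the file
        // system's per-component limit (commonly 255 bytes on Unix).
        static void checkName(File f) {
            int nameBytes = f.getName().getBytes(StandardCharsets.UTF_8).length;
            if (nameBytes > 255) {
                throw new IllegalArgumentException("File name is " + nameBytes
                        + " bytes, which exceeds the typical 255-byte NAME_MAX");
            }
        }
    }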
Irrespective of the JVM error output (which might be misleading or slightly off), you may want to check that your Unix process has enough open file handles. Exhausting process file handles can lead to all kinds of FS-related error codes.
problem:
java.io.FileNotFoundException: /sent/plain/2009/04/Sterling_AS2_to_edt_AS2_Sterling_SJMVM00 A.20090429115945.All_to_cil_QM_214.GS13173.cgs_to_cil.PR120301900.node (File too large)
Files that do not have a very long filename are successfully written out to the same directory.
solution:
Reduce the length of the assigned filename, or remove older archived files with longer filenames from that directory, and try again.

Reading a file that is being written to - Locking it?

There is a file - stored on an external server which is updated very frequently by a vendor. My application polls this file every minute getting the values out. All I am doing is reading the file.
I am worried that by doing this I could inadvertently lock the file so it can't be written to by the vendor. Is this a possibility?
Further to Eric's answer: you could check the Last Modified property of the temp file and only merge it with your 'working' file when it changes. That should protect you from read/write conflicts and only merge files just after the vendor has written to the temp. Though this is messy and mrab's comment is valid, a better solution should be found.
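A minimal sketch of that last-modified check (the polling class is hypothetical; note that lastModified() has coarse, often one-second, resolution on some file systems):

    import java.io.File;

    public class ChangePoller {
        private long lastSeen = -1;   // timestamp of the last version we merged

        // Returns true only when the file's modification time has advanced
        // since the previous poll, i.e. the vendor has written again.
        boolean hasChanged(File f) {
            long modified = f.lastModified();
            if (modified > lastSeen) {
                lastSeen = modified;
                return true;
            }
            return false;
        }
    }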
I have faced this problem several times, and as Peter Lawrey says there isn't any portable way to do this. If your environment is Unix, this should not be an issue at all, as these concurrent access conditions are properly managed by the operating system. However, Windows does not handle this at all (yes, that's the consequence of using an amateur OS for production work, lol).
That said, there is a way to solve this if your vendor is flexible enough. They could write to a temp file and, when finished, move the temp file to the final destination. By doing this you avoid any concurrent access to the file between you and the vendor.
Another way is to precisely (difficult?) know the timing of your vendor's updates and avoid reading the file during certain time frames. For instance, if your vendor updates the file every hour, avoid reading from five-to-the-hour to five-past-the-hour.
Hope it helps.
There is the Windows Shadow Copy service for volumes. This would allow you to read the backup copy.
If the third-party software is in Java too, and uses a Logger, that should be tweakable: every minute, write to the next of 10 rotating files or so.
I would try to relentlessly read the file (when modified since the last read), until something goes wrong. Maybe you can make a test run with hundreds of reads on the weekend or at midnight, when no harm is done.
My answer:
Maybe you need a local watch program, a watch service for a directory, that waits until the file is modified and then makes a fast copy; after that, the copy can be transmitted.

What happens if I delete the xx.jar file after I have started executing the xx.jar?

I have a server program running Java binary code (an xx.jar file). While it was running, I erroneously deleted the xx.jar file. The program continues to run, but I am not sure whether the results will be correct, or whether the program will fail.
When I deleted the xx.jar file, the program had been in one method for a long time, and it is still in that method call. When it calls another method, will my program fail?
I am asking this question because, if deleting the file does no harm, I will gain about 3-4 hours on a server machine.
There is no guarantee that the JVM will load all classes from a .jar file into memory, although it may pre-load some or all of the .jar as an optimization.
If this fails, and I imagine it would at some point, it would not happen during the middle of execution of a method. It would be at a point where a new class must be loaded from the classpath and the JVM can't access that file anymore. Then you would fail with NoClassDefFoundError or worse.
So, no, I would definitely not advise you to do this, even if it happens to work in some cases.
Depending on your operating system, this will or will not be a problem. On Linux, for example, a file isn't really deleted until all applications that have it opened close it. The file will be gone from the directory listing but it still exists and can be read (and even written!) by any application with a valid file descriptor open.
Whether or not the JVM keeps file descriptors open to all jar files of your application, I don't know. I wouldn't rely on it doing so, even if it does seem to work ok sometimes.
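As a small illustration of that POSIX behaviour (a sketch; on Linux this prints the line even though the file has been deleted, while on Windows the delete would typically fail while the reader is open):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class DeleteWhileOpen {
        public static void main(String[] args) throws IOException {
            Path p = Files.writeString(
                    Files.createTempFile("demo", ".txt"), "still readable\n");
            try (BufferedReader reader = Files.newBufferedReader(p)) {
                Files.delete(p);                       // unlink: directory entry gone
                System.out.println(reader.readLine()); // the open descriptor still works
            }
        }
    }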
You will come to know when you completely deploy and redeploy an application and restart it.
Dependent functionality will fail and an exception will be thrown.
