Files not closing on BTRFS and old versions reappearing - java

After a lot of testing and digging around in the BTRFS man pages, I need help from some Linux / BTRFS folks.
I have a Java application that writes data files to disk using Java's MappedByteBuffer. The application uses a byte buffer of ~16,000 bytes when writing to disk. When a new file is written, it creates a temp file of the buffer size, and due to the Java implementation of memory-mapped files the code does not explicitly close the file. Instead we call Linux's drop_caches to force unused memory maps to flush to disk.
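To make the write path concrete, it looks roughly like the sketch below (names, paths and data are illustrative, not our actual code):

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative sketch only: a ~16,000-byte temp file is created, mapped,
// written through the mapping, and then neither forced nor explicitly closed;
// flushing is left to the OS (and to the later drop_caches call).
public class MappedWriteSketch {
    private static final int BUFFER_SIZE = 16_000;

    public static void main(String[] args) throws Exception {
        RandomAccessFile raf = new RandomAccessFile("/data/output.tmp", "rw");
        FileChannel channel = raf.getChannel();
        MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, BUFFER_SIZE);
        map.put("record data".getBytes());
        // no map.force(), no channel.close(), no raf.close()
    }
}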
On ext4 these files are automatically closed and the file size is adjusted correctly.
On BTRFS these files stay at ~16,000 bytes and are missing some data (possibly paging issues).
On BTRFS, when I delete these files and the software re-runs and creates them again, the same issue occurs each time, AND the modified dates are from when the files were originally created.
Server info:
We are running the latest CentOS 7.2 and are up to date with patches.
OS: CentOS 7 x64 (kernel 3.10.0-514.10.2.el7.x86_64)
btrfs-progs v4.4.1
Java 1.8.0_111
Testing performed
We have a replica server running on ext4, and this issue is not happening there.
We are currently using COW and compression, so I tried disabling them both, rebooting, deleting the old data, and restarting the software. The issue still occurred.
I have also tried disabling space_cache and recovery, and I also tried setting commit=5 with flushoncommit... this also didn't help with the non-closing files / the incorrect modified dates.

and due to the Java implementation of memory-mapped files the code does not explicitly close the file.
That does not make much sense. File-backed memory mappings do not require their file descriptors to be kept open, so you absolutely can close the file after creating the mapped buffer.
Instead we call Linux's drop_caches to force unused memory maps to flush to disk.
That is massive overkill. Instead:
use MappedByteBuffer::force to sync the changes to disk
rename the temp file into place
fsync the directory, which is necessary after renames for crash-durability (see the references below):
try (FileChannel dir = FileChannel.open(Paths.get("/path/directory"), StandardOpenOption.READ)) {
    dir.force(true);
}
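Putting those three steps together, a sketch of the whole write path could look like this (the paths, sizes and data are placeholders, not your actual code):

import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;

class DurableMappedWrite {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/path/directory");   // placeholder
        Path tmp = dir.resolve("data.tmp");
        Path dst = dir.resolve("data.bin");

        // 1. write through the mapping, then force the mapped pages to disk
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 16_000);
            map.put("record data".getBytes());
            map.force();
        } // closing the channel here is fine; the mapping does not need it

        // 2. rename the temp file into place atomically
        Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);

        // 3. fsync the parent directory so the rename itself is durable
        try (FileChannel dirCh = FileChannel.open(dir, StandardOpenOption.READ)) {
            dirCh.force(true);
        }
    }
}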
https://lwn.net/Articles/457667/
https://danluu.com/file-consistency/

Related

GC overhead limit exceeded when reading large xls file

When I run my project in the NetBeans IDE (compiling and testing it), it works fine. It lets me read an xls file of about 25,000 rows, extract all the information from it, and then save it into the database.
The problem appears when I generate the installer and deliver it. When I install my application and run it, I get this error:
java.lang.OutOfMemoryError: GC overhead limit exceeded
at jxl.read.biff.File.read(File.java:217)
at jxl.read.biff.Record.getData(Record.java:117)
at jxl.read.biff.CellValue.<init>(CellValue.java:94)
at jxl.read.biff.LabelSSTRecord.<init>(LabelSSTRecord.java:53)
at jxl.read.biff.SheetReader.read(SheetReader.java:412)
at jxl.read.biff.SheetImpl.readSheet(SheetImpl.java:716)
at jxl.read.biff.WorkbookParser.getSheet(WorkbookParser.java:257)
at com.insy2s.importer.SemapExcelImporter.launchImport(SemapExcelImporter.java:82)
I even tried the POI libraries, but I got the same result.
UPDATE:
In the messages.log file of my application, I found these strange values (I have since changed them in netbeans.conf):
Input arguments:
-Xms24m
-Xmx64m
-XX:MaxPermSize=256m
-Dnetbeans.user.dir=C:\Program Files\insy2s_semap_app
-Djdk.home=C:\Program Files\Java\jdk1.8.0_05
-Dnetbeans.home=C:\Program Files\insy2s_semap_app\platform
OK, I got the answer... Let's begin from the beginning.
It is true that the libraries for handling Microsoft documents need a lot of resources, but not so many as to cause the application failure I was seeing, as I thought at the beginning. In fact, the problem revealed a weakness and a configuration gap.
Because I am working with NetBeans 8.0.2, the new app.conf property file should be taken into consideration. It has everything needed to configure our applications.
But it is not possible to edit it directly, so to increase the maximum permitted memory we have to change the values in harness/etc/app.conf in the NetBeans installation directory. For more details look here.
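If I understand the harness correctly, that file carries a default_options line whose -J-prefixed entries are passed straight to the JVM, something like the line below (the branding token and the values are only an illustration of raising the limits, not recommended numbers):

default_options="--branding myapp -J-Xms24m -J-Xmx512m -J-XX:MaxPermSize=256m"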

FileNotFoundException (File too large)

I am getting this exception when trying to download a file
Caused by: java.io.FileNotFoundException: /repository/PWWVFSYWDW0STLHYVEEKHMYBXZTTETGROCQ4FGdsadadaXR1407709207964905350810526.jpg (File too large)
at java.io.FileOutputStream.open(Native Method)
It is clear that the file exists. In addition, the same program works properly on my PC, but there is a problem on the server, which runs Unix.
Any ideas what might be causing this?
I think that this is an obscure error that is actually coming from the OS level or the JVM's native code implementation. The message "File too large" is the error message that you would get if the perror C library method was used to render the EFBIG error number.
Now ordinarily, this should not happen. According to the UNIX / Linux manual entries, the various open library calls should not fail with EFBIG.
However, I have seen various error reports that imply that fopen (etcetera) can fail like that on some file systems, and/or when a C / C++ program has been built with 64-bit file size support disabled.
So what does this mean?
It is not clear, but I suspect that it means that you are either:
using a flaky implementation of Java,
running a flaky release of UNIX / Linux, or
you are trying to use some type of file system that is not well supported by your server's OS. (Might it be on a FUSE file system?)
A possibly related Java bug:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7009975 (Java 7 on Solaris.)
So, it is solved. The problem was that the disk was full, and as a result the stream took a long time. I cleaned up the disk, and after that there was no problem.
I received this message when trying to write a file to a directory on a RedHat server that already had the maximum number of files in it. I subdivided my files into subdirectories and the error did not reappear.
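In case it helps, the split can be done by bucketing files into subdirectories keyed by a hash prefix of the name; a minimal sketch (the two-hex-digit bucket scheme is just an illustration):

import java.io.IOException;
import java.nio.file.*;

// Sketch: spread files across subdirectories derived from the file name,
// so no single directory accumulates too many entries.
class BucketedPaths {
    static Path bucketedPath(Path baseDir, String fileName) throws IOException {
        String bucket = String.format("%02x", fileName.hashCode() & 0xFF); // 256 buckets
        Path dir = baseDir.resolve(bucket);
        Files.createDirectories(dir);
        return dir.resolve(fileName);
    }
}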
POSIX (and thus Unix) systems are allowed to impose a maximum length on a path (what you get from File.getPath()) or on the components of a path (the last of which you can get with File.getName()). You might be seeing this problem because of the long name for the file.
In that case, the file open operating system call will fail with an ENAMETOOLONG error code.
However, the message "File too large" is typically associated with the EFBIG error code. That is more likely to result from a write system call:
An attempt was made to write a file that exceeds the implementation-dependent maximum file size or the process' file size limit.
Perhaps the file is being opened for appending, and the implied lseek to the end of the file is giving the EFBIG error.
Irrespective of the JVM error output (which might be misleading or slightly off), you may want to check that your Unix process has enough open file handles. Exhausting process file handles can lead to all kinds of FS-related error codes.
problem:
java.io.FileNotFoundException: /sent/plain/2009/04/Sterling_AS2_to_edt_AS2_Sterling_SJMVM00 A.20090429115945.All_to_cil_QM_214.GS13173.cgs_to_cil.PR120301900.node (File too large)
Files that do not have a very long filename are successfully written out to the same directory.
solution:
Reduce the assigned filename length, or remove older archived files with the longer filenames from that directory, and try again.
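If the long names cannot simply be shortened by hand, one option is to derive a short, stable name from the original; the hashing scheme below is only a sketch:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Sketch: replace an over-long filename with a short digest of it, keeping the extension.
class ShortNames {
    static String shorten(String longName) throws Exception {
        String ext = longName.contains(".") ? longName.substring(longName.lastIndexOf('.')) : "";
        byte[] hash = MessageDigest.getInstance("SHA-256")
                .digest(longName.getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 8; i++) sb.append(String.format("%02x", hash[i]));
        return sb.toString() + ext;   // 16 hex chars plus the original extension
    }
}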

Java properties file under low disk space conditions on Linux

I unwittingly ran a Java program on Linux where the partition holding the program, its properties, and its logs was close to 100% full. It failed, but after clearing up the problem with the disk space, I ran it again and it failed a second time because its properties file was 0 bytes long.
I don't have the source for the program and I don't want to go as far as decompiling the class files, but I was wondering whether the corruption of the properties could be because the program failed to write to the properties file.
The mystery is that I expected the properties to be read-only and don't recall any items being updated by the program. Could it be that, even if the properties are only read, the file is opened in read-write mode and could disappear if the partition is full?
N.b. this program has run without failure or incident at least 1000 times over several years.
I don't have the source for the program and I don't want to go as far as decompiling the class files, but I was wondering whether the corruption of the properties could be because the program failed to write to the properties file.
That is the most likely explanation. There would have been an exception, but the application could have squashed it ... or maybe you didn't notice the error message. (Indeed, if the application tried to log the error to a file, that would most likely fail too.)
Could it be that, even if the properties are only read, the file is opened in read-write mode and could disappear if the partition is full?
That is unlikely to be the answer in this case. Unlike many languages, in Java the code for reading a file and writing a file involves different stream classes. It is hard to imagine how / why the application's developer would open a property file for writing (with truncation) if there was never any intention to write it.
The most plausible explanation is that the application does update the properties file. (Try installing the program again, using it, stopping it, and looking at the property file's modification timestamp.)
N.b. this program has run without failure or incident at least 1000 times over several years.
And I bet this is the first time you've run it while the disk was full :-)
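If it turns out the application does rewrite its properties on exit, a write-to-temp-then-rename pattern avoids being left with a 0-byte file when the disk fills up; a sketch (file names are placeholders):

import java.io.OutputStream;
import java.nio.file.*;
import java.util.Properties;

// Sketch: write the properties to a temp file first, then atomically replace
// the real file, so a failed write (e.g. disk full) never truncates the original.
class SafePropertiesSave {
    static void save(Properties props, Path target) throws Exception {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            props.store(out, "saved settings");
        }
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}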

How do I stop .mdmp files from being created

I have an instance of Solr, hosted with Tomcat, that recently started creating minidump files. There are no errors in any of the logs, and Solr continues to work without a hitch.
The files are approximately 14 GB and are filling up the hard drive. Is there a way to turn this off while we investigate the issue?
Generally speaking, when the JVM crashes, the content of the hs_err error log file (controlled by -XX:ErrorFile) is often enough to point to what the trouble may be.
To prevent the Oracle HotSpot JVM from generating Windows minidumps (.mdmp files), the JVM option to use on the command line is: -XX:-CreateMinidumpOnCrash
It has existed since 2011 but was very difficult to find: How to disable minidump (mdmp) file generation with the Java HotSpot JVM on Windows
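Assuming a HotSpot JVM and Tomcat on Windows, the flag goes on the Java command line that starts Tomcat, for example via CATALINA_OPTS in bin\setenv.bat (created if it does not exist); a minimal example:

set CATALINA_OPTS=-XX:-CreateMinidumpOnCrash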
This article has decent information on JVM dump files on both Linux and Windows. I have yet to test it myself on my current version of Java 7...
From that site:
Disabling Text dump Files
If you suspect problems with the creation of text dump files you can turn off the text dump file by using the option: -XXnoJrDump.
Disabling the Binary Crash Files
You can turn off the binary crash file by using the option: -XXdumpSize:none.
Are you using Java 7? In that case, revert to Java 5 or 6. Lucene/Solr and Java 7 don't go well together, and it could be that this is what creates the dump files. Otherwise, if everything is working, just disable the dumping of files.
I never found a way to disable the Java minidumps on Windows. The strange part here is that everything on the server worked correctly, apart from the hard drive filling up with minidumps.
We eventually re-installed everything, the same versions of Solr/Java/Tomcat, onto a Linux machine and didn't have the problem any more. I would imagine that re-installing everything onto a Windows machine would also have fixed the problem. This was a strange one.

My java process's file descriptors going "bad" and I have no idea why

I have a java webapp, built with Lucene, and I keep getting various "file already closed" exceptions - depending on which Directory implementation I use. I've been able to get "java.io.IOException Bad File Descriptor" and "java.nio.channels.ClosedChannelException" out of Lucene, usually wrapped around an AlreadyClosedException for the IndexReader.
The funny thing is, I haven't closed the IndexReader, and it seems the file descriptors are going stale on their own. I'm using the latest version of Lucene 3.0 (haven't had time to upgrade out of the 3.0 series), the latest version of Oracle's JDK 6, the latest version of Tomcat 6, and the latest version of CentOS. I can replicate the bug with the same software on other Linux systems, but not on Windows systems, and I don't have an OS X machine to test with. The Linux servers are virtualized with QEMU, if that matters at all.
This seems to also be load related - how frequently this happens corresponds to the amount of requests/second that Tomcat is serving (to this particular webapp). For example, on one server every request completes as expected until it has to deal with ~2 reqs/sec, then about 10% start having their file descriptors closed from under them, mid-request (the code checks for a valid IndexReader object and creates one at the beginning of processing the request). Once it gets to about 3 reqs/sec, all of the requests start failing with bad file descriptors.
My best guess is that somehow there's resource starvation at an OS level and the OS is cleaning up fds... but that's simply because I've eliminated every other idea I've had. I've already checked the ulimits and the filesystem fd limits and the number of open descriptors is well below either limit (example output from sysctl fs.file-nr: 1020 0 203404, ulimit -n: 10240).
I'm almost completely out of things to test and I'm no closer to solving this than the day that I found out about it. Has anyone experienced anything similar?
EDIT 07/12/2011: I found an OS X machine to use for some testing and have confirmed that this happens on OS X. I've also done testing on physical Linux boxes and replicated the issue, so the only OS on which I've been unable to replicate it is Windows. I'm guessing this has something to do with the POSIX handling of file descriptors, because that seems to be the only relevant difference between the test systems (the JDK version, Tomcat version, and webapp were identical across all platforms).
The reason you probably don't see this happening on Windows might be that FSDirectory.open defaults to SimpleFSDirectory there.
Check out the warnings at the top of the FSDirectory and NIOFSDirectory javadocs, i.e. the text in red at http://lucene.apache.org/java/3_3_0/api/core/org/apache/lucene/store/NIOFSDirectory.html:
NOTE: Accessing this class either directly or indirectly from a thread while it's interrupted can close the underlying file descriptor immediately if at the same time the thread is blocked on IO. The file descriptor will remain closed and subsequent access to NIOFSDirectory will throw a ClosedChannelException. If your application uses either Thread.interrupt() or Future.cancel(boolean) you should use SimpleFSDirectory in favor of NIOFSDirectory
https://issues.apache.org/jira/browse/LUCENE-2239
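If interrupts turn out to be the cause, one way to sidestep it is to construct SimpleFSDirectory explicitly instead of letting FSDirectory.open choose NIOFSDirectory on Linux; a sketch against the Lucene 3.0.x API (the index path is a placeholder):

import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.SimpleFSDirectory;

class SimpleDirExample {
    public static void main(String[] args) throws Exception {
        // Explicitly pick SimpleFSDirectory so a Thread.interrupt()/Future.cancel(true)
        // during I/O cannot close the shared channel out from under other requests.
        SimpleFSDirectory dir = new SimpleFSDirectory(new File("/path/to/index"));
        IndexReader reader = IndexReader.open(dir, true); // read-only reader
        System.out.println("docs: " + reader.maxDoc());
        reader.close();
    }
}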
