Do I need FileDescriptor "sync" and MappedByteBuffer "force"?

When I want to make sure that changes to a MappedByteBuffer are synced back to disk, do I need randomAccessFile.getFD().sync(), mappedByteBuffer.force(), or both? (In my simple tests neither of them seems to be required, which is confusing...)
Does anybody have an idea about the actual underlying operations, or could at least explain the differences, if any?

First, FileDescriptor.sync is equivalent to FileChannel.force (both call the POSIX fsync function).
Second, in the book "Java NIO" by Ron Hitchens (via Google Books), the chapter on MappedByteBuffer says:
MappedByteBuffer.force() is similar to the method of the same name in the FileChannel class. It forces any changes made to the mapped buffer to be flushed out to permanent disc storage. When updating a file through MappedByteBuffer object, you should always use MappedByteBuffer.force() rather than FileChannel.force(). The channel object may not be aware of all file updates made through the mapped buffer. MappedByteBuffer doesn't give you the option of not flushing file metadata - it's always flushed too. Note, that the same considerations regarding nonlocal filesystems apply here as they do for FileChannel.force
So, yes. You need to call MappedByteBuffer.force!
BUT then I found this bug, which indicates that both calls may still be necessary, at least on Windows.
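A minimal sketch of that belt-and-braces approach (the file name and sizes are illustrative only):

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class ForceAndSync {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile("data.bin", "rw")) {
            raf.setLength(1024);
            MappedByteBuffer buf = raf.getChannel()
                    .map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            buf.putInt(0, 42);   // the update goes through the mapping, not the channel
            buf.force();         // flush the dirty mapped pages to the device
            raf.getFD().sync();  // extra fsync, in case the Windows bug above applies
        }
    }
}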

Java NIO: Why do the method names "read" and "write" seem reversed?

I am new to Java NIO and I am not a native English speaker.
When I read the method calls for reading and writing buffers, I always get confused; they sound backwards to me.
For example,
fileChannel.read(buffer)
According to FileChannel javadoc, it means
Reads a sequence of bytes from this channel into the given buffers
I wonder why it does not call "write" instead.
In English, "write into" and "read from" sound like a more natural pairing than "write from" and "read into". The same goes for reading the code:
fileChannel.write(buffer)
FileChannel writes a sequence of bytes into the given buffers
This clearly states what the actor is going to do.
Now I need to read everything the opposite way in order to get things right...
Maybe they are named like that for historical reasons, following the naming in the java.io package? Or maybe I am missing something obvious? I hope you can advise me on how to interpret them correctly.
Thanks!
When you have object.verb(…), the verb is generally being performed on the object, so if you want to read the fileChannel, you should have fileChannel.read(…).
In any data transfer, there's always a source from which data is read, and a destination to which the data is written. Perhaps because files are rather more permanent than memory buffer contents, it's just always been the convention to name transfer operations involving a file and a memory buffer with respect to the role of the file rather than that of the buffer. So when the file is the source and the buffer the destination, it's a "read" and when the file is the destination and the buffer the source, it's a "write".
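A short illustration of that convention (a sketch; channel and buffer setup omitted):

// "read" transfers bytes from the channel into the buffer:
fileChannel.read(buffer);   // file -> buffer

// "write" transfers bytes from the buffer into the channel:
buffer.flip();              // switch the buffer from filling to draining
fileChannel.write(buffer);  // buffer -> file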

Atomically write byte[] to file

(This is a hypothetical question since it's very broad, and workarounds exist for specific cases.)
Is it possible to atomically write a byte[] to a file (as a FileOutputStream or FileWriter)?
If writing fails, then it's unacceptable that part of the array is written. For example, if the array is 1,000,000 bytes and the disk is full after 500,000 bytes, then no bytes should be written to the file, or the changes should somehow be rolled back. This should even be the case if a medium is physically disconnected mid-write.
Assume that the maximum size of the array is known.
Atomic writes to files are not possible. Operating systems don't support it, and since they don't, programming language libraries can't do it either.
The best you are going to get with files in a conventional file system is an atomic file rename; i.e.
write the new file into the same file system as the old one
use FileDescriptor.sync() to ensure that the new file is written to disk
rename the new file over the old one, e.g. using
java.nio.file.Files.move(Path source, Path target, CopyOption... options)
with the CopyOption ATOMIC_MOVE. According to the javadocs, this may not be supported, but if it isn't you should get an exception.
But note that the atomicity is implemented in the OS, and if the OS cannot give strong enough guarantees, you are out of luck.
(One issue is what might happen in the event of a hard disk error. If the disk dies completely, then atomicity is moot. But if the OS is still able to read data from the disk after the failure, then the outcome may depend on the OS'es ability to repair a possibly inconsistent file system.)
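A minimal sketch of the write-then-rename recipe above (atomicWrite is a hypothetical helper; it assumes target has a parent directory on the same file system):

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

static void atomicWrite(Path target, byte[] data) throws IOException {
    // write the new file into the same file system as the old one
    Path tmp = Files.createTempFile(target.getParent(), "tmp-", ".part");
    try (FileOutputStream out = new FileOutputStream(tmp.toFile())) {
        out.write(data);
        out.getFD().sync();  // ensure the new file's data is on disk
    }
    // rename over the old file; throws if ATOMIC_MOVE is unsupported
    Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
}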

Is there a clean way to determine if a RandomAccessFile is read-only?

I'm looking for a "good" way to determine if a RandomAccessFile is read-only.
Non-"good" ways include:
Using reflection to read the value of the "rw" field (not implementation independent)
Read a byte, seek backward one, write the byte (not threadsafe)
I have tried writing an empty byte array, but the call succeeds regardless of whether the RandomAccessFile is constructed in read-only or read-write mode.
The one viable approach I've discovered so far is to get a FileChannel and then attempt to create a read-write map.
RandomAccessFile raf = new RandomAccessFile("someFile.txt", "r");
raf.getChannel().map(MapMode.READ_WRITE, 0, 0);
In this example, the attempt to map throws a NonWritableChannelException which could then be caught and used to indicate that the RandomAccessFile is read-only.
However, I have a few concerns with the above approach.
The API does not currently provide a way to free memory maps; unmapping is handled by the garbage collector, and on some operating systems (e.g. Windows) a file cannot be deleted while part of it is memory-mapped
(NOTE: I am hopeful that creating a map with size=0 and position=0 might avoid the Windows mapping restriction, but I haven't tested it yet)
According to the FileChannel.map() documentation, creating a map is a somewhat expensive operation
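For reference, here is the map-based probe from the snippet above wrapped into a method (isReadOnlyViaMap is a hypothetical name):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel.MapMode;
import java.nio.channels.NonWritableChannelException;

static boolean isReadOnlyViaMap(RandomAccessFile raf) throws IOException {
    try {
        raf.getChannel().map(MapMode.READ_WRITE, 0, 0); // zero-length map
        return false;
    } catch (NonWritableChannelException e) {
        return true; // a read-write map requires a writable channel
    }
}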
Get the FileChannel and try to acquire a lock. It will fail if the RandomAccessFile is read-only, because exclusive locks require a writable channel. Why you need to know this at all is another question.
But I agree with @Andreas: this is a question you shouldn't have to ask. Your application should be able to tell how it opened a file without resorting to trickery.
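A sketch of the lock-based probe (isReadOnlyViaLock is a hypothetical name; note that tryLock may also return null if another process already holds the lock):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.NonWritableChannelException;

static boolean isReadOnlyViaLock(RandomAccessFile raf) throws IOException {
    try (FileLock lock = raf.getChannel().tryLock()) {
        return false; // acquiring an exclusive lock implies a writable channel
    } catch (NonWritableChannelException e) {
        return true;  // exclusive locks require a channel open for writing
    }
}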

Optimising Java's NIO for small files

We have a file I/O bottleneck. We have a directory which contains lots of JPEG files, and we want to read them in, in real time, as a movie. Obviously this is not an ideal format, but this is a prototype object-tracking system and there is no possibility of changing the format, as the files are used elsewhere in the code.
From each file we build a frame object, which basically means having a BufferedImage and an explicit ByteBuffer containing all of the information from the image.
What is the best strategy for this? The data is on an SSD which in theory has read/write rates around 400MB/s, but in practice the naive implementation reads no more than 20 files per second (3-4MB/s):
bufferedImg = ImageIO.read(imageFile);[1]
byte[] data = ((DataBufferByte)bufferedImg.getRaster().getDataBuffer()).getData();[2]
imgBuf = ByteBuffer.wrap(data);
However, Java provides lots of possibilities for improving this:
(1) Channels, especially FileChannels.
(2) Gathering/scattering.
(3) Direct buffers.
(4) Memory-mapped buffers.
(5) Multithreading: use a bunch of Callables to access many files simultaneously.
(6) Wrapping the files in a single large file.
(7) Other things I haven't thought of yet.
I would just like to know if anyone has extensively tested the different options and knows what is optimal. I assume that (3) is a must, but I would still like to optimise the reading of a single file as far as possible, and I am unsure of the best strategy.
Bonus question: in the code snippet above, when does the JVM actually 'hit the disk' and read in the contents of the file? Is it at [1], or is that just a file handle which 'points' to the object? It would make sense to evaluate lazily, but I don't know how the ImageIO class is implemented.
Since ImageIO.read(imageFile) returns a fully decoded BufferedImage, it must hit the disk and read the file's contents at [1]; it does not return a lazy file handle.
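As a starting point for option (3), here is a sketch that reads a whole file into a direct ByteBuffer through a FileChannel (it assumes each file fits in an int-sized buffer; whether this beats ImageIO's own I/O would need measuring):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

static ByteBuffer readFully(Path file) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
        ByteBuffer buf = ByteBuffer.allocateDirect((int) ch.size());
        while (buf.hasRemaining() && ch.read(buf) >= 0) {
            // loop until the buffer is full or EOF is reached
        }
        buf.flip(); // prepare the buffer for decoding
        return buf;
    }
}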

IO errors using memory mapped files in Java

I'm using memory-mapped files in some Java code to quickly write to a 2GB file. I'm mapping the entire file into memory. The issue I have with my solution is that if the file I'm writing to mysteriously disappears, or the disk has some type of error, those errors don't bubble up to the Java code.
In fact, from the Java code, it looks as though my write completed successfully. Here's the unit test I created to simulate this type of failure:
File twoGigFile = new File("big.bin");
RandomAccessFile raf = new RandomAccessFile(twoGigFile, "rw");
raf.setLength(Integer.MAX_VALUE);
raf.seek(30000); // Totally arbitrary
raf.writeInt(42);
raf.writeInt(42);
MappedByteBuffer buf = raf.getChannel().map(MapMode.READ_WRITE, 0, Integer.MAX_VALUE);
buf.force();
buf.position(1000000); // Totally arbitrary
buf.putInt(0);
assertTrue(twoGigFile.delete()); // the file is deleted out from under the mapping
buf.putInt(0);                   // yet this write still "succeeds" silently
raf.close();
This code runs without any errors at all. This is quite an issue for me. I can't seem to find anything out there that speaks about this type of issue. Does anyone know how to get memory mapped files to correctly throw exceptions? Or if there is another way to ensure that the data is actually written to the file?
I'm trying to avoid using a RandomAccessFile directly because it is much slower than a memory-mapped file. However, there might not be any other option.
You can't. To quote the JavaDoc:
All or part of a mapped byte buffer may become inaccessible at any time [...] An attempt to access an inaccessible region of a mapped byte buffer will not change the buffer's content and will cause an unspecified exception to be thrown either at the time of the access or at some later time.
And here's why: when you use a mapped buffer, you are changing memory, not the file. The fact that the memory happens to be backed by the file is irrelevant until the OS attempts to write the buffered blocks to disk, which is something that is managed entirely by the OS (i.e. your application will not know it's happening).
If you expect the file to disappear underneath you, then you'll have to use an alternate mechanism to see if this happens. One possibility is to occasionally touch the file using a RandomAccessFile, and catch the error that it will throw. Depending on your OS, even this may not be sufficient: on Linux, for example, a file exists for those programs that have an open handle to it, even if it has been deleted externally.
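A sketch of such a probe (fileStillHealthy is a hypothetical name; as noted above, on Linux a deleted-but-open file may still pass this check):

import java.io.IOException;
import java.io.RandomAccessFile;

static boolean fileStillHealthy(RandomAccessFile raf) {
    try {
        raf.length();        // touch the file's metadata
        raf.getFD().sync();  // force pending writes; fails on a dead disk
        return true;
    } catch (IOException e) {
        return false;        // the disk errored or the file has gone away
    }
}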
