Assume that I have the following code fragment:
operation1();
bw.close();
operation2();
When I call BufferedWriter.close() from my code, I assume the JVM makes a system call that ensures the buffer has been flushed and written to disk. Does close() wait for that system call to complete, or does execution proceed to operation2() without waiting for close() to finish?
To rephrase my question, when I do operation2(), can I assume that bw.close() has completed successfully?
when I do operation2(), can I assume that bw.close() has completed successfully?
Yes
Close the stream, flushing it first. Once a stream has been closed, further write() or flush() invocations will cause an IOException to be thrown. Closing a previously-closed stream, however, has no effect.
Though the documentation does not say anything specifically, I would assume this call does block until finished. In fact, I'm pretty sure nothing in the java.io package is non-blocking.
The JavaDoc for java.io.BufferedReader.close() is taken exactly from the contract it fulfills with java.io.Reader.
The Doc says:
Closes the stream and releases any system resources associated with it. Once the stream has been closed, further read(), ready(), mark(), reset(), or skip() invocations will throw an IOException. Closing a previously closed stream has no effect.
While this makes no explicit claim of blocking until the file system operation is complete, all other operations on this same instance of BufferedReader will throw an exception once close() returns. Although the JavaDoc could be seen as ambiguous about when the operation completes, if the file system flush and close were not complete when this method returned, it would violate the spirit of the contract and be a bug in Java (implementation or documentation).
NO! You cannot be sure for the following reason:
A BufferedWriter is a Wrapper for another Writer. A close() to the BufferedWriter just propagates to the underlying Writer.
IF this underlying Writer is an OutputStreamWriter, and IF the OutputStream is a FileOutputStream, THEN the close will issue a system call to close the file handle.
You are completely free to write a Writer whose close() is a no-op, or whose close() is implemented as non-blocking, but when using only classes from java.io this is never the case.
A Writer (or BufferedWriter) is a black box that writes a stream of characters somewhere, not necessarily to the disk. A call to close() must (by method contract) flush its buffered content before closing, and should (normally) block until all its "essential" work is done. But this depends on the implementation and the environment (you cannot know about caches that are below the Java layer, for example). As far as the work done by the Java writer itself is concerned (e.g. making the system call to write to disk, in the case of a FileWriter or similar, and closing the file handle), yes, you can assume that when close() returns it has already done all its work.
In general with any i/o operation you can make no assumptions about what has happened after the write() operation completes, even after you close. The idea of delivery is a subjective concept relative to the medium.
For instance, what if the writer represents a TCP connection, and then the data is lost in between client and server? Or what if the kernel writes data to a disk, but the drive physically fails to write it? Or if the writer represents a carrier pigeon that gets shot en route?
Furthermore, imagine the case when the write has no way of confirming that the endpoint has received the data (read: udp/datagrams). What should the blocking policy be in that situation?
The buffer will have been flushed to the operating system and the file handle closed, so the Java operations required will have been completed.
BUT the operating system will have cached or queued the write to the actual disk, pipe, network, whatever - there is no guarantee that the physical write has completed. FileChannel.force() provides a way to do that for files on local disks: see the Javadoc.
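As a minimal sketch of the FileChannel.force() approach mentioned above: the class name ForceToDisk and the helper writeDurably are made up for illustration, and how strong the durability guarantee is still depends on the operating system and the storage device.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ForceToDisk {

    // Hypothetical helper: writes the text and asks the OS to push it to the
    // storage device before the channel is closed.
    public static void writeDurably(Path file, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            // A single write() call may not consume the whole buffer.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            // true = also force file metadata, not only the content.
            channel.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("force-demo", ".txt");
        writeDurably(file, "checkpoint 42");
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

When writing through a FileOutputStream instead of a channel, getFD().sync() plays a similar role.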
Yes, IF you reach operation2();, the stream would've had to have been completely closed. However, close() throws IOException, so you may not even get to operation2();. This may or may not be the behavior that you expect.
Closing and flushing IO resources is very important and seldom done correctly (at least by me). The reason for this is that most of the time, it still works without doing it correctly. Files are closed by the garbage collector, which happens from time to time in most applications. Flushing is done automatically when a stream is closed (possibly also by the garbage collector) or when a lot of data is written.
Java 7's try-with-resources makes it much easier to close IO resources if their lifetime coincides with the lifetime of a local variable. Not so much if they should e.g. live as long as some other object, but that is another story.
Since I started writing programs complex enough to need resources that wrap other resources, I have found that it's much harder to decide what to close and/or flush than when to do it. Examples of wrapping a resource in another resource are:
Creating an InputStreamReader from an InputStream.
Creating an InputStream from a ReadableByteChannel.
Creating a DataOutputStream from an OutputStream.
Creating a PrintStream or OutputStreamWriter from an OutputStream.
This may also happen multiple layers deep, like wrapping a ReadableByteChannel in an InputStream in a GZIPInputStream in an InputStreamReader in a BufferedReader (never had to do that but seems plausible). Almost always the wrapping and the wrapped resources should have the same lifetime and it is most convenient if flushing can be done on the outermost resource, where writing is also done, so that only one object needs to be passed around.
In all this time I've never seen a satisfactory explanation of how closing and flushing interacts with resources wrapped in other resources. My assumptions are the following:
Flushing a resource (i.e. calling flush() on it) also flushes wrapped resources recursively until data is pushed onto e.g. the disk or the network.
Closing a resource (i.e. calling close() on it) also closes wrapped resources recursively until some operating system resource is freed.
Now to my question; are these assumptions correct when using JDK implementations of IO resources, specifically of the interfaces InputStream, OutputStream, ReadableByteChannel, WritableByteChannel, Reader and Writer?
If one or both assumptions are not correct at all, what assumptions would be better?
If those assumptions are not always correct, where does the behavior of an implementation differ and what are the reasons?
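One way to probe these two assumptions for a particular wrapper chain is to put a recording sink at the bottom and watch which calls reach it. The sketch below does this for BufferedWriter over OutputStreamWriter; RecordingSink and WrapperPropagation are made-up names, and the result says nothing about third-party wrappers or about what the operating system does after it receives the bytes.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class WrapperPropagation {

    // Hypothetical in-memory sink that records flush() and close() calls.
    static class RecordingSink extends ByteArrayOutputStream {
        int flushCalls = 0;
        boolean closed = false;

        @Override
        public void flush() throws IOException {
            flushCalls++;
            super.flush();
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Writes through two wrapper layers and reports how many bytes had
    // reached the sink before anything was flushed or closed.
    public static int bytesInSinkBeforeFlush(String text) throws IOException {
        RecordingSink sink = new RecordingSink();
        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
            writer.write(text);
            return sink.size();
        }
    }

    // Writes, flushes and closes only the outermost writer, then returns
    // the sink so the caller can inspect what propagated down to it.
    public static RecordingSink writeFlushClose(String text) throws IOException {
        RecordingSink sink = new RecordingSink();
        Writer writer = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.UTF_8));
        writer.write(text);
        writer.flush();
        writer.close();
        return sink;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes before flush: " + bytesInSinkBeforeFlush("hello"));
        RecordingSink sink = writeFlushClose("hello");
        System.out.println("flush reached sink: " + (sink.flushCalls > 0));
        System.out.println("close reached sink: " + sink.closed);
    }
}
```

For short inputs the text stays in the wrappers' buffers until the outermost flush() or close(), which is why operating on the outermost resource alone is usually enough for these JDK classes.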
In the following scenario
ObjectOutputStream output = new ObjectOutputStream(socket.getOutputStream());
output.flush();
// Do stuff with it
Why is it always necessary to flush the buffer after initial creation?
I see this all the time and I don't really understand what has to be flushed. I kind of expect newly created variables to be empty unless otherwise is specified.
Kind of like buying a trash-can and finding a tiny pile of trash inside that came with it.
In over 15 years of writing Java on a professional level I've never once encountered a need to flush a stream before writing to it.
The flush operation would do nothing at all, as there's nothing to flush.
You do want to flush the stream before closing it, though. The close operation should do that for you, but it is often considered best practice to do it explicitly (and I have encountered situations where that did make a difference, where the close operation apparently did not flush first).
Maybe you are confused with that?
When you write data out to a stream, some amount of buffering will occur, and you never know for sure exactly when the last of the data will actually be sent. You might perform many write operations on a stream before closing it, and invoking the flush() method guarantees that the last of the data you thought you had already written actually gets out to the file. Whenever you're done using a file, either reading it or writing to it, you should invoke the close() method. When you are doing file I/O you're using expensive and limited operating system resources, so when you're done, invoking close() will free up those resources.
This is needed when using ObjectInputStream and ObjectOutputStream together, because ObjectOutputStream writes a header to the stream when it is constructed, before the first write is called. The call to flush() makes sure that header is sent to the remote side.
According to the spec, the header consists of the following:
magic version
If the header has not arrived when an ObjectInputStream is constructed, the constructor will block until it has received the header bytes.
This means that if the protocol in question is written with object streams, each side should flush after creating an ObjectOutputStream.
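To see the header for yourself, the sketch below constructs an ObjectOutputStream over an in-memory sink, flushes it, and looks at what came out. The class name StreamHeaderDemo is made up; a ByteArrayOutputStream stands in for the socket stream from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;

class StreamHeaderDemo {

    // Returns whatever an ObjectOutputStream emits before any object is written.
    public static byte[] headerBytes() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(sink);
        out.flush(); // make sure the header has left any internal buffer
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        StringBuilder hex = new StringBuilder();
        for (byte b : headerBytes()) {
            hex.append(String.format("%02X ", b));
        }
        // Two 16-bit values: the magic number, then the stream version.
        System.out.println(hex.toString().trim());
        System.out.printf("STREAM_MAGIC=%04X STREAM_VERSION=%d%n",
                ObjectStreamConstants.STREAM_MAGIC & 0xFFFF,
                ObjectStreamConstants.STREAM_VERSION);
    }
}
```

If both peers construct their ObjectInputStream before their ObjectOutputStream header has been sent, each constructor waits for a header the other side has not yet delivered, which is the hang described above.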
I was always wondering: What is the end of a stream?
In the javadoc of most readLine methods in the java.io package, you can read that "this returns null if the end of the stream is reached" - though I never actually got a null, as most streams (in the case of a network stream that I use most often) just block the program execution until something is written into the stream on the remote end
Are there ways to make this behavior actually happen without an exception being thrown? I am simply curious ...
Think of a file being read. There is an end of stream there, the end of the file. If you try to read beyond that, you simply can't. If you have a network connection though, there doesn't need to be an end of stream if you simply wait for more data to be sent.
In the case of the file, we know for a fact that there is no more data to be read. In the case of a network stream we (usually) don't.
As for blocking a FileReader when no more data is available and waking it up when there is: the simple answer is that you can't. The fundamental difference is that you read a file actively, but when you listen to a network stream you read passively. When something comes from the network, your hardware sends a sort of signal to the operating system, which then gives the new data to your JVM, and the JVM then awakens your process to read the new data (so to speak). But we don't have that with a file, at least not immediately.
A possible workaround would be to make a wrapper to the StreamReader you have, with a listener that is notified when the file is changed, which then awakens you to read further. In Java 7 you can use the WatchService.
At some point, the socket will be closed, and no more data can be sent via that stream. This is when the InputStream will signal EOF by returning -1 from read() and its overloads. This state is irreversible. That stream is dead.
Simply blocking for more data on an open stream is not an EOF condition.
I never actually got a null, as most streams (in the case of a network stream that I use most often) just block the program execution until something is written into the stream on the remote end
No. You never got a null because the peer never closed the connection. That's what 'end of stream' means. It doesn't mean 'no more data for the time being'.
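A small sketch of what end of stream looks like on a source that really does end: a StringReader stands in for a file or for a socket whose peer has closed the connection. The class name EndOfStreamDemo is made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class EndOfStreamDemo {

    // Reads until readLine() reports end of stream by returning null.
    public static int countLines(Reader source) throws IOException {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines(new StringReader("first\nsecond\nthird")));
    }
}
```

The loop ends without any exception because the source is exhausted for good. On an open socket the same loop simply blocks inside readLine() until the peer sends more data or closes its end.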
I just read that
Some buffered output classes support autoflush, specified by an
optional constructor argument. When autoflush is enabled, certain key
events cause the buffer to be flushed. For example, an autoflush
PrintWriter object flushes the buffer on every invocation of println
or format.
So if I am keeping a reference to a BufferedReader for some time and it gets flushed, how will all the data be retained? Is there some callback mechanism that will automatically flush it and read the content again, or will I lose the data and need to request it again?
So if I am keeping a reference to a BufferedReader for some time and it gets flushed, how will all the data be retained?
I think you mean BufferedWriter. (Neither the Reader or InputStream APIs have a flush() method. Flushing doesn't make any sense on a "source".)
The flushed data is written to the stream's "sink"; i.e. the file or socket or whatever. So if you look in the file (or whatever), the data will be there if the stream has been flushed (successfully).
Is there some call back mechanism that will automatically flush it and again read the content
There is no callback mechanism [1]. (At least, not in any of the buffered stream classes that the standard class library provides: who knows what a custom class might do ...)
Data is flushed automatically when certain things happen. For example, when the application calls println ... for a PrintWriter.
... or will I lose the data and again I need to call for it?
This doesn't make sense, either grammatically or semantically. I don't know what you are trying to ask.
Perhaps you don't understand what flushing does. Flushing simply means pushing the data out of the buffers and out to wherever the stream sends its data. An explicit flush() call or an automatic flush just means "write it NOW".
1 - Incidentally, BufferedWriter doesn't have a finalize() method either. This means that if one of these objects becomes unreachable while it still has output buffered, that output will never be written.
I think you're getting confused between buffered readers and writers. Your statement is talking about buffered writers, so if you're writing out to a stream then you shouldn't really care whether it is physically written or only written to the buffer - it doesn't matter to Java.
I would hope that a buffered reader would never be flushed, but depending on the type of buffer it might be OK. For example, if reading from a file, the buffer could be flushed and the file would just need to be re-read from the file system when you try to read(). However, for other streaming content, you wouldn't want it to be automatically flushed, as you would lose whatever data was in the buffer.
When writing to a file, should we call flush() before closing it? If so, what exactly will it do? Don't streams flush automatically?
EDIT:
So what does flush actually do?
Writers and streams usually buffer some of your output data in memory and try to write it in bigger blocks at a time. flushing will cause an immediate write to disk from the buffer, so if the program crashes that data won't be lost. Of course there's no guarantee, as the disk may not physically write the data immediately, so it could still be lost. But then it wouldn't be the Java program's fault :)
PrintWriters flush automatically on println, printf, or format, but only if they were created with autoflush enabled (it is off by default), and of course streams and buffers flush when you close them. Other than that, flushing happens only when the buffer is full.
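The difference autoflush makes can be observed with an in-memory sink. This is a sketch only: AutoFlushDemo and visibleAfterPrintln are made-up names, and a StringWriter behind a BufferedWriter stands in for a file or socket.

```java
import java.io.BufferedWriter;
import java.io.PrintWriter;
import java.io.StringWriter;

class AutoFlushDemo {

    // Returns what has reached the sink immediately after one println() call.
    public static String visibleAfterPrintln(boolean autoFlush) {
        StringWriter sink = new StringWriter();
        // The BufferedWriter holds data back until it is flushed or fills up.
        PrintWriter writer = new PrintWriter(new BufferedWriter(sink), autoFlush);
        writer.println("hello");
        String visible = sink.toString();
        writer.close();
        return visible;
    }

    public static void main(String[] args) {
        System.out.println("autoflush off: [" + visibleAfterPrintln(false).trim() + "]");
        System.out.println("autoflush on:  [" + visibleAfterPrintln(true).trim() + "]");
    }
}
```

Note that print() and write() do not trigger autoflush; only println, printf, and format do.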
I would highly recommend calling flush before close. Basically, it writes any remaining buffered data to the file.
If you call flush explicitly you may be sure that any IOException coming out of close is really catastrophic and related to releasing system resources.
When you flush yourself, you can handle its IOException in the same way as you handle your data write exceptions.
You don't need to do a flush because close() will do it for you.
From the javadoc:
"Close the stream, flushing it first. Once a stream has been closed, further write() or flush() invocations will cause an IOException to be thrown. Closing a previously-closed stream, however, has no effect."
To answer your question as to what flush actually does, it makes sure that anything you have written to the stream - a file in your case - does actually get written to the file there and then.
Java can perform buffering, which means that it will hold onto written data in memory until it has a certain amount, and then write it all to the file in one go, which is more efficient. The downside of this is that the file is not necessarily up-to-date at any given time. Flush is a way of saying "make the file up-to-date."
Close calls flush first to ensure that after closing the file has what you would expect to see in it, hence as others have pointed out, no need to flush before closing.
Close automatically flushes. You don't need to call it.
There's no point in calling flush() just before a close(), as others have said. The time to use flush() is if you are keeping the file open but want to ensure that previous writes have been fully completed.
As said, you don't usually need to flush.
It only makes sense if, for some reason, you want another process to see the complete contents of a file you're working with, without closing it. For example, it could be used for a file that is concurrently modified by multiple processes, although with a LOT of care :-)
FileWriter is an evil class as it picks up whatever character set happens to be there, rather than taking an explicit charset. Even if you do want the default, be explicit about it.
The usual solution is OutputStreamWriter and FileOutputStream. It is possible for the decorator to throw an exception. Therefore you need to be able to close the stream even if the writer was never constructed. If you are going to do that, you only need to flush the writer (in the happy case) and always close the stream. (Just to be confusing, some decorators, for instance for handling zips, have resources that do require closing.)
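A minimal sketch of that combination, using try-with-resources so the FileOutputStream is closed even if constructing or using the writer fails. The names ExplicitCharsetWriter and writeUtf8 are made up, and UTF-8 is just one possible explicit choice.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ExplicitCharsetWriter {

    // Writes text with an explicitly chosen charset instead of the platform default.
    public static void writeUtf8(Path file, String text) throws IOException {
        // Resources are closed in reverse order: the writer first (which
        // flushes it), then the stream. The stream is closed even if the
        // OutputStreamWriter constructor or write() throws.
        try (OutputStream stream = new FileOutputStream(file.toFile());
             Writer writer = new OutputStreamWriter(stream, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("charset-demo", ".txt");
        writeUtf8(file, "h\u00e9llo");
        System.out.println(Files.readAllBytes(file).length + " bytes written");
        Files.delete(file);
    }
}
```

Declaring the stream as its own resource is what covers the case where the decorator is never successfully constructed.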
Another use case for flushing is writing the progress of a long-running job to a file (so it can be stopped and restarted later). You want to be sure that the data is safe on the drive.
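int progress = 0;
while (hasMoreWork()) {  // hasMoreWork() is a hypothetical stop condition; while (true) would make close() unreachable
    computeStuff();
    progress += 1;
    out.write(String.format("%d%n", progress));  // one progress value per line
    out.flush();  // hand this record to the OS now instead of waiting for the buffer to fill
}
out.close();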