Why does Java's InputStream.close() block?

My Java program uses ProcessBuilder (with redirectErrorStream set to true) and has a loop that calls the process's input stream's read method, which blocks. The external program I'm calling then comes to a stop, waiting for input on stdin. I now want to kill the process. Isn't this done by (in a separate thread) calling the process's destroy method and calling the input stream's close method so that read stops blocking, letting my initial thread end its life?
For some reason process.getInputStream().close() blocks. From the JavaDoc I don't see why this can happen. Furthermore, I don't understand why the javadoc says "The close method of InputStream does nothing." (link to javadoc) Could someone explain this?
Thanks :-)

Regarding the blocking behavior, there is a known issue in Java that can cause deadlock when communicating with another process. I can't tell if this is what you're seeing, but it's worth looking into. The documentation for java.lang.Process says:
Because some native platforms only provide limited buffer size for standard input and output streams, failure to promptly write the input stream or read the output stream of the subprocess may cause the subprocess to block, and even deadlock.
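
To make "promptly read the output" concrete, here is a minimal sketch (the command name is a placeholder): with redirectErrorStream(true), a single loop drains stdout and stderr together, so the child can never block on a full pipe.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class DrainExample {
    public static void main(String[] args) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("some-command")
                .redirectErrorStream(true)  // merge stderr into stdout
                .start();

        // Drain the merged output promptly so the child never blocks
        // on a full pipe buffer.
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line);
            }
        }
        System.out.println("Exit code: " + p.waitFor());
    }
}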

"For some reason process.getInputStream().close() blocks. From the JavaDoc I don't see why this can happen. Furthermore, I don't understand why the javadoc says 'The close method of InputStream does nothing.' (link to javadoc) Could someone explain this?"
If you look at the Javadoc, you'll see that InputStream is an abstract class. Subclasses that extend InputStream are expected to override the close() method should they need to. Clearly the InputStream subclass that you're using does something in its close() method.
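
In other words, the base class's close() is an empty default, and concrete subclasses replace it with real work. A toy illustration (class name mine):

import java.io.IOException;
import java.io.InputStream;

class NoisyInputStream extends InputStream {
    @Override
    public int read() throws IOException {
        return -1; // always at end-of-stream; enough for the illustration
    }

    @Override
    public void close() throws IOException {
        // A real subclass would release file handles, sockets, etc. here.
        System.out.println("releasing resources");
    }
}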

Adding onto what jdigital wrote, check this article. It deals with the Runtime.exec() method, and ProcessBuilder was only introduced in Java 5, but it seems to me the discussion can be extrapolated to system processes in general.

I think I figured this out. Evidently it is important to call process.getOutputStream().close() before calling process.getInputStream().close() and process.getErrorStream().close().
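
For what it's worth, a hedged sketch of that ordering (the helper name is mine): closing the child's stdin first sends EOF, so a child blocked waiting for input can finish or die cleanly before the read ends are closed.

import java.io.IOException;

public class ProcessShutdown {
    static void shutDown(Process process) throws IOException {
        process.getOutputStream().close(); // the child's stdin: it now sees EOF
        process.getInputStream().close();  // the child's stdout
        process.getErrorStream().close();  // the child's stderr
        process.destroy();
    }
}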

Related

What is the proper way to close streams on Java's exec?

After
Runtime.getRuntime().exec(command);
I see syscalls happening that show 2-3 file descriptors (FIFO pipes). What is the proper way to close them with the try-with-resources pattern?
Most of the historical tribal knowledge found on Java forums suggests:
// out of date!
... } finally {
    IOUtils.closeQuietly(p.getOutputStream());
    IOUtils.closeQuietly(p.getInputStream());
    IOUtils.closeQuietly(p.getErrorStream());
}
but that doesn't sound right, because (1) the closeQuietly method is deprecated and most libraries suggest using try-with-resources, and (2) it is inelegant, as I might not necessarily have all the streams.
And simply moving the exec() call into the try feels wrong, as it is not the resource I will call close() on.
Closing them isn't necessary; they close by themselves when the process dies. If the process never dies, it still isn't necessary: either you make a new never-dying process every so often, in which case your system is going to crash and run out of resources whether you close these or not, or you make it only once, in which case these resources aren't going to count for much.
For what it's worth, these are quite lightweight resources, and often they cannot really be 'closed' in the sense that the resources can be 'freed': closing them either keeps them open but denies further chat (and sends EOFs where needed), or reroutes them to /dev/null. Generally a process just has three pipes on it and will keep them until it dies.
Yes, closeQuietly is a silly idea for virtually all purposes, and it is here too. If closing these streams somehow fails, you probably don't want to silently ignore that.
If you must close them, the individual streams from these three are closeable. However, note that you're reading rules of thumb and attempting to apply them as if they were gospel truth. try-with-resources is not always the right answer, and try-with-resources is not a 100% always-applicable replacement for close(), let alone closeQuietly.
For example, try-with-resources is specifically designed around a period of usage. You declare the span of statements within which the resource should be available (the braces that go with the try block), and the construct will then ensure that the resource is closed once code flow transitions out of that span of statements, no matter how it exits. That probably makes it irrelevant here, too!
You are starting a long-lived process and don't care about the in/out: you just want the process to run and keep running. This means there is no span at all, and you should just call close() on these if you somehow feel it is important to try to save the resources, even though most likely this accomplishes nothing at all. No span of statements means try-with-resources isn't right.
You are starting a short-lived process that you interact with. The right thing to 'close' is the process itself, except you can't use try-with-resources for that: it can only be used on auto-closeables, i.e. resources whose representing class implements AutoCloseable. Most do, some don't; Lock is a famous one, and Process is another. To 'close' a process, you invoke destroy() or even destroyForcibly(). You cannot use try-with-resources (not without ugly hacks that defeat the purpose) to do this! Once you close/destroy the process, the streams that went along with it are dead too.
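
To illustrate the short-lived case (the command name is a placeholder): since Process is not AutoCloseable, a plain try/finally plays the role that try-with-resources cannot.

import java.io.IOException;

public class ShortLivedProcess {
    public static void main(String[] args) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("some-command").start();
        try {
            // ... interact with the process here ...
            p.waitFor();
        } finally {
            p.destroy(); // or p.destroyForcibly() if it refuses to die
        }
    }
}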
More generally the principle is: if you create it, you close it. If you never call getOutputStream(), you never created them. On some OSes, fetching these streams and then closing them wastes more resources than not doing so. Thus, if the argument is based on some sort of purity model, then you shouldn't close them either. If it's based on pragmatics, you'd have to test how heavy these resources really are (most likely, extremely light), whether closing them actually saves you some pipes (most likely, it will not), and whether close()-ing the result of invoking getOutputStream() on the process even helps, if the answers to the above questions make that relevant (it probably will, but the spec does not guarantee it).
They are very lightweight resources that in almost every case don't require closing...

Is there any harm in failing to close a file when a Java program terminates?

When my program starts, it opens a file and writes to it periodically. (It's not a log file; it's one of the outputs of the program.) I need to have the file available for the length of the program, but I don't need to do anything in particular to end the file; just close it.
I gather that for file I/O in Java I'm supposed to implement AutoCloseable and wrap it in a try-with-resources block. However, because this file is long-lived, and it's one of a few outputs of the program, I'm finding it hard to organize things such that all the files I open are wrapped in try-with-resources blocks. Furthermore, the top-level classes (where my main() function lies) don't know about this file.
Here's my code; note the lack of writer.close():
import java.io.BufferedWriter;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class WorkRecorder {

    private final Writer writer;

    public WorkRecorder(String recorderFile) throws FileNotFoundException {
        writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(recorderFile)));
    }

    public void record(Data data) throws Exception {
        // format Data object to match expected file format
        // ...
        writer.write(data.toString());
        writer.write(System.lineSeparator());
        writer.flush();
    }
}
tl;dr do I need to implement AutoCloseable and call writer.close() if the resource is an opened output file, and I never need to close it until the program is done? Can I assume the JVM and the OS (Linux) will clean things up for me automatically?
Bonus (?): I struggled with this in C#'s IDisposable too. The using block, like Java's try-with-resources construct, is a nice feature when I have something that I'm going to open, do something with quickly, and close right away. But often that's not the case, particularly with files, when access to the resource hangs around for a while, or when I need to manage multiple such resources. If the answer to my question is "always use try-with-resources blocks", I'm stuck again.
I have similar code that doesn't lend itself to being wrapped in a try-with-resources statement. I think that is fine, as long as you close it when the program is done.
Just make sure you account for any Exceptions that may happen. For example, in my program, there is a cleanup() method that gets called when the program is shut down. This calls writer.close(). This is also called if there is any abnormal behavior that would cause the program to shut down.
If this is just a simple program and you're expecting the Writer to be open for its duration, I don't think it's really a big deal for it not to be closed when the program terminates... but it is good practice to make sure your resources are closed, so I would go ahead and add that wherever your program may shut down.
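
A minimal sketch of that cleanup() idea using a JVM shutdown hook (the method name and Writer parameter are mine, for illustration):

import java.io.IOException;
import java.io.Writer;

public class CleanupHook {
    static void registerCleanup(Writer writer) {
        // Runs on normal JVM shutdown (end of main, System.exit, SIGINT);
        // it will not run if the process is killed hard.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                writer.close(); // flushes buffered output, releases the handle
            } catch (IOException e) {
                e.printStackTrace(); // don't swallow close failures silently
            }
        }));
    }
}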
You should always close resources, or set them to null so they can be picked up by the garbage collector in Java. Using try-with-resources blocks is a great way to have Java automatically close resources when you're done with them. Even if you use a resource for the duration of the program, it is good programming practice to close it at the end. Some might say you don't need to; I personally would say just go ahead and do it, and here's why:
"When a stream is no longer needed, always close it using the close() method or automatically close it using a try-with-resource statement. Not closing streams may cause data corruption in the output file, or other programming errors."
-Introduction to Java Programming 10th Edition, Y. Daniel Liang
If possible, just run the .close() method on the resource at the very end of the program.
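
If you do want try-with-resources, one option is to make the whole program run the span. A sketch, assuming the WorkRecorder from the question (record() simplified to take a String here):

import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

class WorkRecorder implements Closeable {
    private final Writer writer;

    WorkRecorder(String recorderFile) throws IOException {
        writer = new BufferedWriter(new FileWriter(recorderFile));
    }

    void record(String line) throws IOException {
        writer.write(line);
        writer.write(System.lineSeparator());
        writer.flush();
    }

    @Override
    public void close() throws IOException {
        writer.close(); // flushes remaining output and releases the handle
    }
}

class Program {
    public static void main(String[] args) throws IOException {
        // The entire run of the program is the resource's span.
        try (WorkRecorder recorder = new WorkRecorder("out.dat")) {
            recorder.record("the program's real work goes here");
        } // closed here even if an exception escapes
    }
}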
I (now) think a better answer is "It depends" :-). A detailed treatment is provided by Lukas Eder here. Also check out the Lambda EG group post.
But in general, it's a good idea to return the resource back to the operating system when you are done with it and use try-with-resources all the time (except when you know what you are doing).

Setting up a blocking file read in Java

I'd like to set up a blocking file read in Java. That is, have a file such that when it is wrapped by a FileInputStream and any read() method is called, the call blocks.
I can't think of an easy OS-independent way: on Unix-like OSes I could try to create a FIFO using mkfifo and read from that file. A possible workaround would be to just create a very large file and read from that, since the read is unlikely to complete before I capture the stack, but it's ugly and slow (and indeed reads can still be incredibly fast when cached).
The corresponding socket read() case is trivial to set up - create a socket yourself and read from it, and you can have deterministic blocking.
The purpose is to examine the stack of the method to determine what the top frames are in such a case. Imagine I have a component which periodically samples the stack traces of all running threads and then tries to categorize what each thread is doing at the moment. One thing it could be doing is file I/O. So I need to know what the "top of stack" looks like during file I/O. I have already determined that by experimentation (simply read a file in a variety of ways and sample the stack), but I want to write a test that will fail if this ever changes.
The natural way to write such a test is to kick off a thread which does a file read, then examine the top frame(s). To do this reliably, I want a blocking read (or else the thread may finish its read before the stack trace is taken, etc).
To get a guaranteed blocked I/O, read from a console, e.g. /dev/console on Linux or CON on Windows.
To make this platform-independent, you may hack the FileDescriptor of FileInputStream:
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.lang.reflect.Field;

// Note: on JDK 9+ this reflection needs --add-opens java.base/java.io=ALL-UNNAMED
public class BlockOnStdin {
    public static void main(String[] args) throws Exception {
        // Open a dummy FileInputStream
        File f = File.createTempFile("dummy", ".tmp");
        f.deleteOnExit();
        FileInputStream fis = new FileInputStream(f);
        // Replace FileInputStream's descriptor with stdin
        Field fd = FileInputStream.class.getDeclaredField("fd");
        fd.setAccessible(true);
        fd.set(fis, FileDescriptor.in);
        System.out.println("Reading...");
        fis.read(); // blocks like a file read, but waits on console input
        System.out.println("Complete");
    }
}
UPDATE
I've realized you don't even need a method to block. In order just to get a proper stacktrace you may invoke read() on an invalid FileInputStream:
FileInputStream fis = new FileInputStream(new FileDescriptor());
fis.read(); // This will throw IOException exactly with the right stacktrace
If you still need a blocking read(), named pipes are the way to go: run mkfifo using Runtime.exec on POSIX systems, or create \\.\PIPE\MyPipeName on Windows.
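
A hedged sketch of the POSIX variant (the FIFO path is arbitrary). Note that on a FIFO even constructing the FileInputStream blocks until a writer opens the other end, and read() then blocks until data arrives:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class FifoBlockingRead {
    public static void main(String[] args) throws IOException, InterruptedException {
        File fifo = new File("/tmp/blocking-read-test.fifo");
        Process mkfifo = new ProcessBuilder("mkfifo", fifo.getPath()).start();
        if (mkfifo.waitFor() != 0) {
            throw new IOException("mkfifo failed");
        }
        try (FileInputStream in = new FileInputStream(fifo)) { // blocks here already
            in.read(); // and here, until some process writes to the FIFO
        } finally {
            fifo.delete();
        }
    }
}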
I don't know of any way to make a File in an OS-independent way that will always block when read.
If I were trying to find the stack trace when a specific function is called, I would run the program under a debugger and set a breakpoint on that function. Be aware, though, that method breakpoints will slow down your program and can give you different results than you would normally get if timing matters.
If you have access to the source code of the program, you could make a fake FileInputStream that extends the real one but always blocks on a read. All you need to do is switch out the import statements throughout the code. However, this won't capture places where you are not able to switch out import statements, and it could be a pain if there is a lot of code.
If you want to use your own FileInputStream without changing the program source code or recompiling, you can make a custom class loader that loads your custom FileInputStream class instead of the real one. You can specify which class loader to use on the command line:
java -Djava.system.class.loader=com.test.MyClassLoader xxx
Now that I think about it, I have an even better idea: instead of making a custom FileInputStream that blocks on read(), make a custom FileInputStream that prints out the stack trace on read(). The custom class can then call the real version of read(). This way you will get the stack traces for all calls.
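
A minimal sketch of that tracing idea (class name mine): a drop-in FileInputStream that dumps the current stack on each read, then delegates to the real implementation.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

class TracingFileInputStream extends FileInputStream {
    TracingFileInputStream(File file) throws FileNotFoundException {
        super(file);
    }

    @Override
    public int read() throws IOException {
        Thread.dumpStack(); // prints the current stack trace to stderr
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        Thread.dumpStack();
        return super.read(b, off, len);
    }
}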
From my understanding, you want to write a test which inspects the stack trace of the FileInputStream.read() method. What about descendants of FileInputStream that override the read() method?
If you don't need to inspect the descendants, I think you can use the JVM Tool Interface (JVMTI) by inserting a breakpoint at runtime in the desired method; in the handler for that breakpoint event, dump the stack trace.
After the dump is completed you remove the break point and continue the execution.
(This all occurs in runtime using this API, no black magic :) )
You could have a separate thread watch for changes to the file's access time and generate a jvm thread dump when that happens. As to generating the thread dump in code I haven't tried but looks like that's answered here: Generate a Java thread dump without restarting.
I don't know how well this will work with the timing between your threads, but I imagine it should come pretty close. I'm also not 100% sure about the OS independence of this solution, as I haven't tested it, but it should work for most modern-ish systems. See the javadocs on java.nio.file.attribute.BasicFileAttributes to see what it will return if access times aren't supported.
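
For reference, reading the access time through BasicFileAttributes looks roughly like this (the file name is a placeholder; on filesystems that don't track access times, the value may simply mirror the last-modified time):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

public class AccessTimeCheck {
    public static void main(String[] args) throws Exception {
        Path p = Paths.get("watched-file.dat");
        BasicFileAttributes attrs = Files.readAttributes(p, BasicFileAttributes.class);
        System.out.println("last access: " + attrs.lastAccessTime());
    }
}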
One trick: if it is possible to modify your API to return a Reader instead of a File, then you can wrap a String with a custom StringReader (class SlowAsRubyStringReader extends Reader, say) that overrides the various int read() methods with a Thread.sleep(500) before doing the real work. Only during testing, of course.
See http://docs.oracle.com/javase/7/docs/api/java/io/StringReader.html
I think there is a larger issue here, not just files: you want to inspect the context in which an API is getting called during your test cases, is it not? That is, you want to be able to examine the stack and say, "aha! I caught you calling the MudFactory API from the JustTookABath object, OUTRAGEOUS!". If this is the case, then you may have to delve into dynamic proxies, which will allow you to hijack function calls, or use aspect-oriented programming, which allows you to do the same in a more systematic way. See http://en.wikipedia.org/wiki/Pointcut
read() dives quickly into native code, so yes, you probably need to go native to block at that level. Alternatively, you may want to consider logging a stack trace at the point in your code before or after read().
Something like:
log(ExceptionUtils.getStackTrace(new Exception()));
The ExceptionUtils docs are here: https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/exception/ExceptionUtils.html

Accessing program messages output to error stream

I've created a class which processes files and if it encounters certain specific errors, it outputs relevant error messages to the error stream.
I am working on another class that needs to access these error messages, and I'm not sure how to do this. I am a beginner in Java programming. Based on my limited knowledge, I thought my two options would be either to call the main method of the first class (but I don't know how I would get the error messages in this case) or to execute the compiled class and access the messages through the getErrorStream() method of the Process class. But I am having trouble with the system deadlocking, or possibly not even executing the exec command, so I'm not sure how to implement the second option either.
I'm not quite sure what you're asking here, but a potential problem with your code is that you're not reading from the process' stdout. Per the Process API, "failure to promptly ... read the output stream of the subprocess may cause the subprocess to block, and even deadlock." Is this the "trouble" you mentioned?
Edit: So yeah, you can either do what you're doing (but be sure to read both the error stream and the output stream; see my comment), or you could just call the main method directly from your code, in which case the error output will be written to System.err. You could use System.setErr() to install your own stream that would let you capture what's written to it, but keep in mind that any error output from your own app (the one that's running the other app) will also show up there. It sounds like spawning a separate process, as you're already doing, is what you want.
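
For completeness, a small sketch of the System.setErr() approach (a plain println stands in for invoking the other class's main method):

import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CaptureStderr {
    public static void main(String[] args) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream original = System.err;
        System.setErr(new PrintStream(buffer, true)); // capture stderr in memory
        try {
            // Here you would call the other class's main method directly;
            // a placeholder message is used instead.
            System.err.println("example error message");
        } finally {
            System.setErr(original); // restore the real stderr
        }
        System.out.println("captured: " + buffer.toString());
    }
}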
You can't build modularity based on many little programs with a main method. You have to make blocks of function as classes that are designed to be called from elsewhere -- and that means returning status information in some programmatic fashion, not just blatting it onto System.err. If it really is an error, throw an exception. If you have to return status, design a data structure to hold the status and return it. But don't go launching new processes all over the place and reading their error streams.

Can you remap System.in, System.out and System.err to Java threads?

I have some old C++ code that uses stdio for input and output. The code also spawns new processes via forking. It remaps stdio to each new process so each session gets its respective data.
I am looking at using threads in Java to create child processes. However, I am stuck when it comes to finding out how to remap System.in, System.out and System.err to the child threads on creation.
Could anyone please point me in the right direction if this is possible?
The simple answer is to not write your code to access System.out/in/err directly. Instead, have an InputStream and two OutputStreams passed in to your object's constructor. The object then works directly with those objects and does not depend on what they are actually mapped to. To get access to print() and println(), you pass the OutputStream in to the constructor of a PrintStream.
Then, based on what you actually want to do, you can call the constructor with System.out or with a FileOutputStream.
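
A minimal sketch of that constructor-injection idea (class and method names are mine, for illustration):

import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintStream;

class Worker {
    private final InputStream in;
    private final PrintStream out;
    private final PrintStream err;

    Worker(InputStream in, OutputStream out, OutputStream err) {
        this.in = in;
        this.out = new PrintStream(out, true); // gives access to print()/println()
        this.err = new PrintStream(err, true);
    }

    void run() {
        out.println("working..."); // goes wherever the caller pointed `out`
    }
}

class WorkerDemo {
    public static void main(String[] args) {
        // Map the worker to the real console here, or to files, sockets,
        // or in-memory buffers, without changing Worker at all.
        new Worker(System.in, System.out, System.err).run();
    }
}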
Spawning a thread is not the same as spawning a process. When you spawn a thread in Java (and C++), it shares the same memory space as the spawner (i.e. they share the same stdio streams). If you wanted to spawn a new process in Java, you would use Runtime.exec(), and then you would have to manually pipe the I/O into the new process; Java does not support sharing I/O streams across process boundaries.
Java doesn't have fork(), but it has ProcessBuilder and Runtime.exec() for starting new processes (objects of the Process class). You can think of them as a fork()/exec() pair, but without the capability to perform something in between, like dup2(). This means you can't redirect the child process's stdio, but you can explicitly write to its stdin and read from its stdout and stderr using the corresponding Process methods, or, to be precise, the corresponding methods of the input/output streams returned by the getInputStream()/getOutputStream()/getErrorStream() methods of the Process class. This can be a valid workaround if you want to have processes instead of threads.
If you want to use threads, then they all share the same stdio. You can redirect it, but it would be pointless, since the redirection would affect all the threads. You can use some sort of imitation of IPC with threads using custom InputStream/OutputStream implementations, or you may wish to have a look at the PipedInputStream/PipedOutputStream pair. These can actually be used to set up something like an IPC pipe, probably in conjunction with BufferedInputStream/BufferedOutputStream to avoid excessive blocking.
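
A minimal sketch of the PipedInputStream/PipedOutputStream pairing between two threads:

import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipeDemo {
    public static void main(String[] args) throws IOException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out); // connect the two ends

        Thread producer = new Thread(() -> {
            try {
                out.write("hello from the producer".getBytes());
                out.close(); // the reader then sees end-of-stream
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        producer.start();

        int b;
        while ((b = in.read()) != -1) { // blocks until the producer writes
            System.out.print((char) b);
        }
        System.out.println();
    }
}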
I would recommend using separate processes, or else explicitly assigning a PrintStream to each thread.
However, it is possible to forward writes to System.out to different places for each thread, even though each thread sees the same object for System.out. In your startup code, you would call System.setOut(PrintStream) with a custom PrintStream. This PrintStream would override all the print and write methods. In these methods, it would look up the thread's PrintStream with an InheritableThreadLocal and forward the method call to it.
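
A sketch of that forwarding PrintStream (names are mine; only two overloads are shown, and a real version would override all the print/write methods):

import java.io.OutputStream;
import java.io.PrintStream;

class ThreadLocalPrintStream extends PrintStream {
    // Captured at class-load time, before System.setOut() replaces stdout.
    private static final PrintStream REAL_OUT = System.out;

    private static final InheritableThreadLocal<PrintStream> TARGET =
            new InheritableThreadLocal<PrintStream>() {
                @Override
                protected PrintStream initialValue() {
                    return REAL_OUT; // threads default to the real stdout
                }
            };

    ThreadLocalPrintStream() {
        super(discardStream()); // the superclass target is never used
    }

    private static OutputStream discardStream() {
        return new OutputStream() {
            @Override
            public void write(int b) { /* discard */ }
        };
    }

    static void redirectCurrentThread(PrintStream target) {
        TARGET.set(target);
    }

    @Override
    public void println(String s) {
        TARGET.get().println(s);
    }

    @Override
    public void write(byte[] buf, int off, int len) {
        TARGET.get().write(buf, off, len);
    }
}

Install it once at startup with System.setOut(new ThreadLocalPrintStream()); each thread then calls redirectCurrentThread(...) with its own destination, and child threads inherit their parent's target through the InheritableThreadLocal.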
