Setting up a blocking file read in Java

I'd like to set up a blocking file read in Java. That is, have a file such that, when it is wrapped in a FileInputStream and any read() method is called, the call blocks.
I can't think of an easy OS-independent way. On Unix-like OSes I could try to create a FIFO using mkfifo and read from that file. A possible workaround would be to just create a very large file and read from that: the read is unlikely to complete before I capture the stack, but it's ugly and slow (and indeed reads can still be incredibly fast when cached).
The corresponding socket read() case is trivial to set up - create a socket yourself and read from it, and you can have deterministic blocking.
The purpose is to examine the stack of the blocked thread to determine what the top frames are in such a case. Imagine I have a component which periodically samples the stack traces of all running threads and then tries to categorize what each thread is doing at the moment. One thing it could be doing is file IO. So I need to know what the "top of stack" looks like during file IO. I have already determined that by experimentation (simply read a file in a variety of ways and sample the stack), but I want to write a test that will fail if this ever changes.
The natural way to write such a test is to kick off a thread which does a file read, then examine the top frame(s). To do this reliably, I want a blocking read (or else the thread may finish its read before the stack trace is taken, etc).

To get a guaranteed blocked I/O, read from a console, e.g. /dev/console on Linux or CON on Windows.
To make this platform-independent, you may hack the FileDescriptor of FileInputStream:
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.lang.reflect.Field;

// Open a dummy FileInputStream
File f = File.createTempFile("dummy", ".tmp");
f.deleteOnExit();
FileInputStream fis = new FileInputStream(f);

// Replace FileInputStream's descriptor with stdin via reflection;
// stdin produces no data unless you type something, so read() blocks
Field fd = FileInputStream.class.getDeclaredField("fd");
fd.setAccessible(true);
fd.set(fis, FileDescriptor.in);

System.out.println("Reading...");
fis.read(); // blocks on standard input
System.out.println("Complete");
UPDATE
I've realized you don't even need the call to block. Just to get a proper stack trace, you can invoke read() on an invalid FileInputStream:
FileInputStream fis = new FileInputStream(new FileDescriptor());
fis.read(); // throws IOException with exactly the right stack trace
If you still need a blocking read(), named pipes are the way to go: run mkfifo via Runtime.exec on POSIX systems, or create \\.\PIPE\MyPipeName on Windows.
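For illustration, a hedged sketch of the POSIX branch (the path and class name are made up; it assumes mkfifo is on the PATH). One FIFO subtlety: opening the read end itself blocks until some writer opens the pipe, so this sketch holds the write end open from a second thread to make sure the blocking happens inside read(), where the sampler can see it:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;

public class FifoBlockingRead {
    public static void main(String[] args) throws Exception {
        File fifo = new File("/tmp/blocking-read-test.fifo"); // hypothetical path
        new ProcessBuilder("mkfifo", fifo.getAbsolutePath()).start().waitFor();
        fifo.deleteOnExit();

        // Hold the write end open without ever writing, so the reader's
        // open() returns and the block happens inside read() instead
        Thread writeEndHolder = new Thread(() -> {
            try (FileOutputStream ignored = new FileOutputStream(fifo)) {
                Thread.sleep(Long.MAX_VALUE);
            } catch (Exception e) {
                // interrupted during test teardown
            }
        });
        writeEndHolder.setDaemon(true);
        writeEndHolder.start();

        try (FileInputStream fis = new FileInputStream(fifo)) {
            System.out.println("Reading...");
            fis.read(); // blocks: no data ever arrives on the pipe
        }
    }
}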

I don't know of any way to make a file, in an OS-independent way, that will always block when read.
If I were trying to find the stack trace when a specific method is called, I would run the program under a debugger and set a breakpoint on that method. Be aware, though, that method breakpoints slow your program down and, if timing is important, will give you different results than a normal run would.
If you have access to the source code of the program, you could make a fake FileInputStream that extends the real one but always blocks on read(). All you need to do is switch out the import statements throughout the code. However, this won't capture places where you can't switch out the import statements, and it could be a pain if there is a lot of code.
If you want to use your own FileInputStream without changing the program's source code or recompiling, you can make a custom class loader that loads your custom FileInputStream class instead of the real one. You can specify which class loader to use on the command line:
java -Djava.system.class.loader=com.test.MyClassLoader xxx
Now that I think about it, I have an even better idea: instead of making a custom FileInputStream that blocks on read(), make a custom FileInputStream that prints out the stack trace on read(). The custom class can then call the real version of read(). This way you will get the stack traces for all calls.
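A rough sketch of that last idea (the class name is hypothetical); it dumps the stack and then delegates to the real read():

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

// Test-only subclass that records who is calling read()
public class TracingFileInputStream extends FileInputStream {

    public TracingFileInputStream(File file) throws FileNotFoundException {
        super(file);
    }

    @Override
    public int read() throws IOException {
        new Exception("read() called here:").printStackTrace();
        return super.read(); // then do the real read
    }
}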

From my understanding you want to write a test which inspects the stack trace of FileInputStream.read() method. What about descendants of FileInputStream if they override the read() method?
If you don't need to inspect the descendants, I think you can use the JVM Tool Interface (JVMTI): insert a breakpoint at runtime in the desired method and, while handling the breakpoint event, dump the stack trace.
After the dump completes, you remove the breakpoint and continue execution.
(This all happens at runtime using this API, no black magic :) )

You could have a separate thread watch for changes to the file's access time and generate a JVM thread dump when that happens. As for generating the thread dump in code, I haven't tried it, but it looks like that's answered here: Generate a Java thread dump without restarting.
I don't know how well this will work with the timing between your threads, but I imagine it should come pretty close. I'm also not 100% sure of the OS independence of this solution, as I haven't tested it, but it should work on most modern-ish systems. See the javadocs for java.nio.file.attribute.BasicFileAttributes to see what is returned when an attribute isn't supported.
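For what it's worth, a best-effort sketch of the access-time polling (path and poll interval are made up; note that many Linux systems mount filesystems with relatime or noatime, so the access time may not update on every read):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

public class AccessTimeWatcher {
    public static void main(String[] args) throws Exception {
        Path watched = Paths.get("/tmp/watched-file.dat"); // hypothetical path
        FileTime last = Files.readAttributes(watched, BasicFileAttributes.class).lastAccessTime();
        while (true) {
            FileTime now = Files.readAttributes(watched, BasicFileAttributes.class).lastAccessTime();
            if (!now.equals(last)) {
                // the file was touched: generate the thread dump here
                break;
            }
            Thread.sleep(50); // arbitrary poll interval
        }
    }
}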

One trick: if it is possible to modify your API to return a Reader instead of a File, then you can wrap a String in a custom reader (say, class SlowAsRubyStringReader extends Reader) that overrides the various read() methods with a Thread.sleep(500) before doing the real work, as sketched below. Only during testing, of course.
See http://docs.oracle.com/javase/7/docs/api/java/io/StringReader.html
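A minimal sketch of such a reader (the name comes from the answer; the delay is arbitrary and for tests only):

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class SlowAsRubyStringReader extends Reader {
    private final StringReader delegate;

    public SlowAsRubyStringReader(String s) {
        this.delegate = new StringReader(s);
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        try {
            Thread.sleep(500); // make the caller observably "block" in read()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return delegate.read(cbuf, off, len);
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }
}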
I think there is a larger issue here, not just files: you want to inspect the context in which an API is getting called during your test cases, is it not? That is, you want to be able to examine the stack and say, "aha! I caught you calling the MudFactory API from the JustTookABath object, OUTRAGEOUS!". If this is the case, then you may have to delve into dynamic proxies, which let you hijack function calls, or into aspect-oriented programming, which does the same in a more systematic way. See http://en.wikipedia.org/wiki/Pointcut

read() dives quickly into native code, so yes, you would probably need to go native to block at that level. Alternatively, you may want to consider logging a stack trace at the point in your code just before or after the read().
Something like:
log(ExceptionUtils.getStackTrace(new Exception()));
The ExceptionUtils documentation is here: https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/exception/ExceptionUtils.html

Related

Is there any harm in failing to close a file when a Java program terminates?

When my program starts, it opens a file and writes to it periodically. (It's not a log file; it's one of the outputs of the program.) I need to have the file available for the length of the program, but I don't need to do anything in particular to end the file; just close it.
I gather that for file I/O in Java I'm supposed to implement AutoCloseable and wrap it in a try-with-resources block. However, because this file is long-lived, and it's one of a few outputs of the program, I'm finding it hard to organize things such that all the files I open are wrapped in try-with-resources blocks. Furthermore, the top-level classes (where my main() function lies) don't know about this file.
Here's my code; note the lack of writer.close():
public class WorkRecorder {

    public WorkRecorder(String recorderFile) throws FileNotFoundException {
        writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(recorderFile)));
    }

    private Writer writer;

    public void record(Data data) throws Exception {
        // format Data object to match expected file format
        // ...
        writer.write(data.toString());
        writer.write(System.lineSeparator());
        writer.flush();
    }
}
tl;dr do I need to implement AutoCloseable and call writer.close() if the resource is an opened output file, and I never need to close it until the program is done? Can I assume the JVM and the OS (Linux) will clean things up for me automatically?
Bonus (?): I struggled with this in C#'s IDisposable too. The using block, like Java's try-with-resources construct, is a nice feature when I have something I'm going to open, use briefly, and close right away. But often that's not the case, particularly with files, when access to the resource hangs around for a while or when I need to manage several such resources. If the answer to my question is "always use try-with-resources blocks", I'm stuck again.
I have similar code that doesn't lend itself to being wrapped in a try-with-resources statement. I think that is fine, as long as you close it when the program is done.
Just make sure you account for any exceptions that may happen. For example, in my program there is a cleanup() method that gets called when the program shuts down, and it calls writer.close(). It is also called on any abnormal condition that would cause the program to shut down.
If this is just a simple program and you expect the Writer to stay open for its duration, I don't think it's really a big deal for it not to be closed when the program terminates... but it is good practice to make sure your resources are closed, so I would go ahead and add that wherever your program may shut down.
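As a sketch of that wiring (assuming WorkRecorder is given a close() method that delegates to writer.close(); note shutdown hooks run on normal exit and Ctrl-C, but not on kill -9):

WorkRecorder recorder = new WorkRecorder("output.rec"); // hypothetical file name
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    try {
        recorder.close(); // assumes a close() method was added to WorkRecorder
    } catch (Exception e) {
        e.printStackTrace(); // not much else to do this late
    }
}));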
You should always close resources, or set them to null so they can be picked up by the garbage collector. Using try-with-resources blocks is a great way to have Java automatically close resources when you're done with them. Even if you use a resource for the duration of the program, it is good practice to close it at the end. Some might say you don't need to; I personally would say just go ahead and do it, and here's why:
"When a stream is no longer needed, always close it using the close() method or automatically close it using a try-with-resource statement. Not closing streams may cause data corruption in the output file, or other programming errors."
-Introduction to Java Programming 10th Edition, Y. Daniel Liang
If possible, just run the .close() method on the resource at the very end of the program.
I (now) think a better answer is "It depends" :-). A detailed treatment is provided by Lukas Eder here. Also check out the Lambda EG group post.
But in general, it's a good idea to return the resource back to the operating system when you are done with it and use try-with-resources all the time (except when you know what you are doing).

Reading System.Out

So, what I want to do is basically filter everything that is passed through System.out and System.err. Is it possible to apply a filter of some sort, or to create my own OutputStream to divert System.out through, and then process it normally?
Edit for clarity:
To clarify, I need to read what goes out of System.out from other programs and manipulate it how I see fit, so a logger is not really an option as I do not have control over what the other program will use.
Edit for more clarity:
I am creating a plugin for a larger program that needs to read everything written to System.out from other plugins. Because it is a plugin-based system the process my plugin is running on will always be the same one other plugins are running on.
Something I am not clear about: you have mentioned that you want to "read what goes out of System.out from other programs". Is your application creating the process and monitoring its standard out/err, or do you want to monitor the standard out/err of your own process?
In the former case, after you create the process, you can get its output, input, and error streams.
In the latter case, you can replace the standard out of your Java process via System.setOut(yourOwnPrintStream);
However, if you are trying to deal with the streams of a totally unrelated process, I believe there is no way of doing so without having the caller pipe stdout/stderr to your process (except through some platform-specific methods).
Update:
In order to "intercept" the standard out, it is no different from all those classic "filter output streams". What you can do is something like this:
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class AnalyzingOutputStream extends FilterOutputStream {

    public AnalyzingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        // do whatever analysis you want on b here
        super.write(b); // delegate to the superclass' write, which
                        // forwards to the wrapped output stream
    }

    // other overrides (write(byte[], int, int) etc.) as needed
}
In order to use it, your main logic should do something like:
AnalyzingOutputStream analyzingOutputStream = new AnalyzingOutputStream(System.out);
System.setOut(new PrintStream(analyzingOutputStream)); // System.setOut() requires a PrintStream
// then you can call your methods of AnalyzingOutputStream to do whatever you want
You should use a logger, but if you don't want to, you can create your own PrintStream which wraps System.out, then call System.setOut(yourPrintStream).
To capture all output from another process in Java, the simplest way is to launch the process yourself using Runtime.exec, which gives you a Process object. Its getInputStream() is connected to the stdout of the launched process, so reading from that stream gives you the child's standard output.
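A sketch of that approach with the newer ProcessBuilder API (the command is just an example):

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class CaptureChildOutput {
    public static void main(String[] args) throws Exception {
        // redirectErrorStream(true) folds stderr into the same stream,
        // which also avoids the classic fill-the-pipe deadlock
        Process p = new ProcessBuilder("ls", "-l")
                .redirectErrorStream(true)
                .start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println("child: " + line);
            }
        }
        p.waitFor();
    }
}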
If you're trying to capture the output of an arbitrary process, you've got a bigger task ahead of you - and it's probably not a good idea. I have no idea how it works on Windows. Under Linux, you'll need to be running under the same user as the process you want to investigate or (more likely) root. You can look at the file descriptors for any given process under /proc/<process id>/fd/<file descriptor>. The file descriptors for stdin, stdout and stderr are 0, 1 and 2 respectively. I would recommend you stop at this point and re-think what you're trying to do.

Is Opening Java FileOutputStream efficient?

I'm writing a singleton logger for my program right now, and I was wondering whether it would be better to open and close it every time I log something, or to open the stream at the creation of the singleton and close it at the termination of the program? And if I were to do that, how would I close it at termination?
The main advantage of opening the file once is performance. You save yourself the penalty of an open call each time, plus the seek to the end of the file for appending; this gets worse if the file is big (and some logs tend to be).
The cons are:
You might not be able to read the last log line immediately if there is some buffering in the writer (delayed writes). However, this can be fixed by flushing after each write (you might lose some performance, but that is not usually relevant).
You cannot simultaneously write to the same log from different processes. But you probably don't need this - and if you need it, the open-and-close solution still needs to deal with concurrency.
Some external log processing (typically, log rotation with renaming) becomes problematic. To allow for this, you might need to implement some signalling that closes and reopens the file.
Typically, the advantages outweigh the cons, so the general rule is to keep the log file open. But that depends on the scenario.
(As other answers point out, normally you'd prefer to use some standard logging library instead of implementing this on your own. But it's instructive to give it a try, or at least to think of all the issues involved).
Do not close it, just flush; this is what Log4j's FileAppender does by default.
You should open once (and close once). If you do nothing, the file will be closed for you when the JVM exits. You may prefer to explicitly override Object.finalize().
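For illustration, a minimal open-once singleton sketch (all names hypothetical; it closes via a shutdown hook rather than finalize(), since finalize() is deprecated in recent JDKs):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

public final class FileLogger {
    private static final FileLogger INSTANCE = new FileLogger("app.log");
    private final Writer writer;

    private FileLogger(String path) {
        try {
            writer = new BufferedWriter(new FileWriter(path, true)); // open once, append mode
        } catch (IOException e) {
            throw new ExceptionInInitializerError(e);
        }
        Runtime.getRuntime().addShutdownHook(new Thread(this::closeQuietly));
    }

    public static FileLogger get() {
        return INSTANCE;
    }

    public synchronized void log(String msg) {
        try {
            writer.write(msg);
            writer.write(System.lineSeparator());
            writer.flush(); // see the flushing trade-off discussed above
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void closeQuietly() {
        try {
            writer.close();
        } catch (IOException ignored) {
        }
    }
}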

Java: A FileInputStream that blocks in read() while other thread downloads remainder of file?

I have an FFmpeg-based video-playing app which is able to play content from any arbitrary InputStream.
It is important that the app is able to play a video file which is in the process of being downloaded. What I seem to need for this is a special kind of FileInputStream that will (a) share file access with the downloading thread, and (b) if it reaches the end of the downloaded portion, will quietly block until more content becomes available.
(a) seems easy enough thanks to RandomAccessFile, but I'm a bit puzzled about (b). I can probably hack something up that will work, but I am wondering if there's a standard approach to implementing this. Thinking about it in detail gives me a feeling that I may be missing something obvious.
Any thoughts? How would you guys do this?
If you can push the data not into the file but into an OutputStream (or perhaps write simultaneously to both a FileOutputStream and a shared PipedOutputStream), this would be the easiest solution:
Use PipedOutputStream and PipedInputStream. This will allow you to implement both (a) and (b); however, you will need to implement video buffering on the viewer side somehow.
Basically, your downloader thread will write every bit of data it gets to the PipedOutputStream. write() does not normally block, as the data is pushed into the pipe's internal buffer (it will block if that buffer fills up).
Your viewer thread will simply read() from the PipedInputStream; as the API says: "This method blocks until input data is available, the end of the stream is detected, or an exception is thrown."
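A minimal runnable sketch of the piped approach (buffer size and timings are arbitrary):

import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipedDownloadDemo {
    public static void main(String[] args) throws Exception {
        PipedInputStream viewerEnd = new PipedInputStream(64 * 1024); // arbitrary buffer size
        PipedOutputStream downloaderEnd = new PipedOutputStream(viewerEnd);

        // Downloader thread: push each "downloaded" byte into the pipe
        new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) {
                    downloaderEnd.write(i);
                    Thread.sleep(200); // simulate a slow network
                }
                downloaderEnd.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }).start();

        // Viewer side: read() blocks until the downloader delivers data
        int b;
        while ((b = viewerEnd.read()) != -1) {
            System.out.println("got byte " + b);
        }
    }
}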
You have to poll the length of the file. There is no way to block waiting for the length of the file to change using the file alone. You can busy poll, or poll every 10 or 100 ms.
If the writer and reader are in the same process, you can use locking/synchronized blocks to notify the reader when more data has been added.
With multiple processes, you could use a socket to either send the data, or at least notify when the length has changed allowing the reader to block.
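A rough sketch of the polling approach as a blocking stream (class name hypothetical; the 100 ms interval is arbitrary):

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

// At EOF, sleep and retry instead of returning -1, until the
// downloader signals that the file is complete
public class GrowingFileInputStream extends InputStream {
    private final RandomAccessFile file;
    private volatile boolean downloadComplete = false;

    public GrowingFileInputStream(File f) throws IOException {
        this.file = new RandomAccessFile(f, "r");
    }

    public void markDownloadComplete() {
        downloadComplete = true;
    }

    @Override
    public int read() throws IOException {
        while (true) {
            int b = file.read();
            if (b != -1) return b;            // data available
            if (downloadComplete) return -1;  // real end of stream
            try {
                Thread.sleep(100);            // wait for the file to grow
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while waiting for data", e);
            }
        }
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}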
In case you do not control the download process and want to play ANY file that is being downloaded (even by some other downloader), you can watch the directory for changes.
It needs to be mentioned that this method is cross-platform and cross-filesystem. Here's a quote from the same article:
Most file system implementations have native support for file change notification. The Watch Service API takes advantage of this support where available. However, when a file system does not support this mechanism, the Watch Service will poll the file system, waiting for events.
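For illustration, a minimal WatchService sketch (the directory path is hypothetical):

import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class DownloadWatcher {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/tmp/downloads"); // hypothetical directory
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);
            while (true) {
                WatchKey key = watcher.take(); // blocks until something changes
                for (WatchEvent<?> event : key.pollEvents()) {
                    System.out.println("changed: " + event.context());
                    // wake up the reader side here
                }
                if (!key.reset()) break; // directory no longer watchable
            }
        }
    }
}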
I believe there is no real answer to this question. I've got something which works, but it looks like inelegant hacking to me. Perhaps sometimes that's inevitable.

Accessing program messages output to error stream

I've created a class which processes files and if it encounters certain specific errors, it outputs relevant error messages to the error stream.
I am working on another class that needs to access these error messages. I'm not sure how to do this. I am a beginner in Java programming. Based on my limited knowledge, I thought that my two options would be to either call the main method of the first class (but I don't know how I would get the error messages in this case) or to execute the compiled class and access the messages through the getErrorStream() method of the Process class. But, I am having trouble with the system deadlocking or possibly not even executing the exec command, so I'm not sure how implement the second case either.
I'm not quite sure what you're asking here, but a potential problem with your code is that you're not reading from the process' stdout. Per the Process API, "failure to promptly ... read the output stream of the subprocess may cause the subprocess to block, and even deadlock." Is this the "trouble" you mentioned?
Edit: So yeah, you can either do what you're doing, but be sure to read both the error stream and the output stream (see my comment), or you could just call the main method directly from your code, in which case the error output will be written to System.err. You could use System.setErr() to install your own stream that would let you get what's written to it, but keep in mind that any error output from your own app--the one that's running the other app--will also show up here. It sounds like spawning a separate process, like you're already doing, is what you want.
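If you do go the in-process System.setErr() route, here is a minimal sketch (the called class and its arguments are hypothetical):

import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CaptureStderr {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        PrintStream original = System.err;
        System.setErr(new PrintStream(captured, true)); // autoflush
        try {
            // FileProcessor.main(new String[] { "input.txt" }); // hypothetical class
            System.err.println("pretend error message");
        } finally {
            System.setErr(original); // always restore the real stream
        }
        System.out.println("captured: " + captured.toString());
    }
}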
You can't build modularity based on many little programs with a main method. You have to make blocks of function as classes that are designed to be called from elsewhere -- and that means returning status information in some programmatic fashion, not just blatting it onto System.err. If it really is an error, throw an exception. If you have to return status, design a data structure to hold the status and return it. But don't go launching new processes all over the place and reading their error streams.
