Java - RandomAccessFile (Emulating the Linux tail function)

The Java IO implementation of unix/linux "tail -f" has a similar problem, but the solution there is not viable for log files that generate about 50-100 lines per second.
I have an algorithm that emulates the tail functionality in Linux. For example,
File _logFile = new File("/tmp/myFile.txt");
long _filePtr = _logFile.length();
while (true)
{
    long length = _logFile.length();
    if (length < _filePtr)
    {
        // means the file was truncated
    }
    else if (length > _filePtr)
    {
        // means something was added to the file
    }
    // we ignore length == _filePtr ... nothing was written to the file
}
My problem is the "something was added to the file" case (the else if branch above).
else if (length > _filePtr)
{
    RandomAccessFile raf = new RandomAccessFile(_logFile, "r");
    raf.seek(_filePtr);
    String curLine;
    while ((curLine = raf.readLine()) != null)
        myTextPane.append(curLine);
    _filePtr = raf.getFilePointer();
    raf.close();
}
The program blocks at while ((curLine = raf.readLine()) ... after 15 seconds of run-time! (Note that the program runs correctly for the first 15 seconds.)
It appears that raf.readLine() never returns null, because I believe this log file is being written so fast that we end up in an "endless cat and mouse" loop.
What's the best way to emulate Linux's tail?

I would think that you would be best served by grabbing a block of bytes based on the file's length, then releasing the file and parsing a ByteArrayInputStream (instead of trying to read directly from the file).
So use RandomAccessFile#read(byte[]), and size the buffer using the returned file length. You won't always show the exact end of the file, but that is to be expected with this sort of polling algorithm.
As an aside, this algorithm is horrible - you are running IO operations in a crazy tight loop - the calls to File#length() will block, but not very much. Expect this routine to bring your app to its knees CPU-wise. I don't necessarily have a better solution for you (well - actually, I do - have the source application write to a stream instead of a file - but I recognize that isn't always feasible).
In addition to the above, you may want to introduce a polling delay (sleep the thread for 100ms each loop - it looks like you are displaying to a GUI, so a 100ms delay won't hurt anyone and will greatly improve the performance of the Swing operations).
OK - final beef: you are adjusting a Swing component from what (I hope) is code not running on the EDT. Use SwingUtilities#invokeLater() to update your text pane.
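Putting those three suggestions together - bulk read, polling delay, EDT-safe update - a minimal sketch might look like the following. The Consumer<String> callback standing in for the asker's custom append method is my own placeholder, not part of the original code:
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.util.function.Consumer;
import javax.swing.SwingUtilities;

public final class BlockTail {
    // Polls logFile and hands each newly appended chunk to appendToPane on the EDT.
    public static void tail(File logFile, Consumer<String> appendToPane)
            throws IOException, InterruptedException {
        long filePtr = logFile.length();
        while (true) {
            long length = logFile.length();
            if (length < filePtr) {
                filePtr = length; // file was truncated; jump to the new end
            } else if (length > filePtr) {
                byte[] block = new byte[(int) (length - filePtr)];
                try (RandomAccessFile raf = new RandomAccessFile(logFile, "r")) {
                    raf.seek(filePtr);
                    raf.readFully(block); // one bulk read, then release the file
                    filePtr = raf.getFilePointer();
                }
                // parse the block off-file so readLine() can never chase the writer
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(new ByteArrayInputStream(block)));
                StringBuilder chunk = new StringBuilder();
                String line;
                while ((line = reader.readLine()) != null) {
                    chunk.append(line).append('\n');
                }
                String text = chunk.toString();
                SwingUtilities.invokeLater(() -> appendToPane.accept(text));
            }
            Thread.sleep(100); // polling delay keeps the CPU usage sane
        }
    }
}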

It appears I have found the problem and created a solution.
Under the else if statement:
while ((curLine = raf.readLine()) != null)
myTextPane.append(curLine);
This was the problem: the append(String) method of myTextPane (a class derived from JTextPane) invoked setCaretPosition() on every appended line, which is bad!
That meant setCaretPosition() was being called at 50-100 Hz while trying to "scroll down," which caused a blocking overhead in the interface.
A simple solution was to append curLine to a StringBuffer until raf.readLine() returned null.
Then append the whole StringBuffer in one call, and voila ... no more blocking from setCaretPosition()!
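In code, the batching fix looks roughly like this (myTextPane.append is the asker's own helper, so its exact behavior is assumed):
String curLine;
StringBuffer batch = new StringBuffer();
while ((curLine = raf.readLine()) != null) {
    batch.append(curLine).append('\n');
}
_filePtr = raf.getFilePointer();
// one append, and therefore one setCaretPosition() call, instead of 50-100 per second
myTextPane.append(batch.toString());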
Thanks to Kevin for bringing me towards the correct direction.

You could always exec the tail program:
BufferedReader in = new BufferedReader(new InputStreamReader(
        Runtime.getRuntime().exec("tail -F /tmp/myFile.txt").getInputStream()));
String line;
while ((line = in.readLine()) != null) {
    // process line
}

Related

Reading external process Error Stream heavily impacts performance

I have a Java program that (among other things) reads from an external Python application using an InputStream.
Here is the code I use to read it:
InputStreamReader isr = new InputStreamReader(p.getInputStream()),
        isrError = new InputStreamReader(p.getErrorStream());
BufferedReader br = new BufferedReader(isr), brError = new BufferedReader(isrError);
new Thread() {
    @Override
    public void run() {
        try {
            while (brError.readLine() != null);
        } catch (Exception e) {
        }
    }
}.start();
while ((line = br.readLine()) != null) { // line is a previously declared String
    // do whatever with line
}
I create the thread to read the error stream too, because the Python application throws errors when something goes wrong (I can't edit it, it is third-party software), and for some reason the InputStream eventually blocks if I don't read the ErrorStream.
Is there any way to make while (brError.readLine() != null); have less impact on performance?
Right now I am looking at performance with VisualVM. The Java software usually stays between 0-5% CPU usage, which is pretty nice, but around 60-65% of that usage comes from this loop in this thread, whose only function is to prevent the main loop from blocking. And I need to improve the performance as much as possible (this is going into industrial lines, so using resources correctly is really important).
Thank you all.
For easier handling (if you don't need the contents while running), use redirectError(File) in ProcessBuilder.
ProcessBuilder pb = new ProcessBuilder("foo", "-bar");
pb.redirectError(new File("/tmp/errors.log"));
pb.start();
If you're getting cpu spinning from while (brError.readLine() != null);, you should look at what the error stream is returning. Since readLine() is a blocking call, it would mean that the error stream is pumping a lot of lines out.
You're converting the throw-away stream to characters needlessly, which may be a bit costly, especially with UTF-8 (and relying on the platform default encoding is usually wrong anyway).
Drop the Reader, use BufferedInputStream for the throw-away stream.
However, for external processes, the redirection is surely superior as there's no processing in Java at all.
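If you do keep an in-process drain, a sketch of the Reader-free version (p is the running Process; variable names are illustrative):
// drain stderr as raw bytes; no charset decoding for data we only discard
InputStream err = new BufferedInputStream(p.getErrorStream());
Thread drainer = new Thread(() -> {
    byte[] scratch = new byte[8192];
    try {
        while (err.read(scratch) != -1) {
            // discard
        }
    } catch (IOException ignored) {
        // the stream ends when the process exits
    }
});
drainer.setDaemon(true);
drainer.start();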

Usefulness of DELETE_ON_CLOSE

There are many examples on the internet showing how to use StandardOpenOption.DELETE_ON_CLOSE, such as this:
Files.write(myTempFile, ..., StandardOpenOption.DELETE_ON_CLOSE);
Other examples similarly use Files.newOutputStream(..., StandardOpenOption.DELETE_ON_CLOSE).
I suspect all of these examples are probably flawed. The purpose of writing a file is that you're going to read it back at some point; otherwise, why bother writing it? But wouldn't DELETE_ON_CLOSE cause the file to be deleted before you have a chance to read it?
If you create a work file (to work with large amounts of data that are too large to keep in memory) then wouldn't you use RandomAccessFile instead, which allows both read and write access? However, RandomAccessFile doesn't give you the option to specify DELETE_ON_CLOSE, as far as I can see.
So can someone show me how DELETE_ON_CLOSE is actually useful?
First of all, I agree with you: in the Files.write(myTempFile, ..., StandardOpenOption.DELETE_ON_CLOSE) example the use of DELETE_ON_CLOSE is meaningless. After a (not so intense) search through the internet, the only example I could find showing that usage was the one you might have gotten it from (http://softwarecave.org/2014/02/05/create-temporary-files-and-directories-using-java-nio2/).
This option is not intended for use with Files.write(...) alone. The API docs make it quite clear:
This option is primarily intended for use with work files that are used solely by a single instance of the Java virtual machine. This option is not recommended for use when opening files that are open concurrently by other entities.
Sorry I can't give you a meaningful short example, but think of such a file like a swap file/partition used by an operating system: the current JVM needs to store data on disc temporarily, and after shutdown the data are of no use anymore. As a practical example, it is similar to a JEE application server deciding to serialize some entities to disc to free up memory.
Edit: Maybe the following (oversimplified) code can be taken as an example to demonstrate the principle. (So please, nobody start a discussion about how this "data management" could be done differently, how a fixed temporary filename is bad, and so on.)
- in the try-with-resources block you need, for some reason, to externalize data (the reasons are not the subject of the discussion)
- you have random read/write access to this externalized data
- this externalized data is of use only inside the try-with-resources block
- with the StandardOpenOption.DELETE_ON_CLOSE option you don't need to handle the deletion after use yourself; the JVM will take care of it (the limitations and edge cases are described in the API)
static final int RECORD_LENGTH = 20;
static final String RECORD_FORMAT = "%-" + RECORD_LENGTH + "s";

// add exception handling, left out only for the example
public static void main(String[] args) throws Exception {
    EnumSet<StandardOpenOption> options = EnumSet.of(
            StandardOpenOption.CREATE,
            StandardOpenOption.WRITE,
            StandardOpenOption.READ,
            StandardOpenOption.DELETE_ON_CLOSE
    );
    Path file = Paths.get("/tmp/external_data.tmp");
    try (SeekableByteChannel sbc = Files.newByteChannel(file, options)) {
        // during your business processing the below two cases might happen
        // several times in random order

        // example of a huge data structure to externalize
        String[] sampleData = {"some", "huge", "datastructure"};
        for (int i = 0; i < sampleData.length; i++) {
            byte[] buffer = String.format(RECORD_FORMAT, sampleData[i])
                    .getBytes();
            ByteBuffer byteBuffer = ByteBuffer.wrap(buffer);
            sbc.position(i * RECORD_LENGTH);
            sbc.write(byteBuffer);
        }

        // example of processing which needs the externalized data
        Random random = new Random();
        byte[] buffer = new byte[RECORD_LENGTH];
        ByteBuffer byteBuffer = ByteBuffer.wrap(buffer);
        for (int i = 0; i < 10; i++) {
            sbc.position(RECORD_LENGTH * random.nextInt(sampleData.length));
            sbc.read(byteBuffer);
            byteBuffer.flip();
            System.out.printf("loop: %d %s%n", i, new String(buffer));
        }
    }
}
DELETE_ON_CLOSE is intended for working temp files.
If you need to perform an operation whose data must be stored temporarily in a file, but you don't need the file outside of the current execution, DELETE_ON_CLOSE is a good solution for that.
An example is when you need to store information that can't be kept in memory, for example because it is too large.
Another example is when you need to store the information temporarily, will need it only at a later point, and don't want to occupy memory for it in the meantime.
Imagine also a situation in which a process needs a lot of time to complete. You store the information in a file and only use it later (perhaps many minutes or hours after). This guarantees that memory is not used for that information while you don't need it.
DELETE_ON_CLOSE tries to delete the file when you explicitly close it by calling close(), or when the JVM shuts down if it was not closed manually before.
Here are two possible ways it can be used:
1. When calling Files.newByteChannel
This method returns a SeekableByteChannel suitable for both reading and writing, in which the current position can be modified.
Seems quite useful for situations where some data needs to be stored out of memory for read/write access and doesn't need to be persisted after the application closes.
2. Write to a file, read back, delete:
An example using an arbitrary text file:
Path p = Paths.get("C:\\test", "foo.txt");
System.out.println(Files.exists(p));
try {
    Files.createFile(p);
    System.out.println(Files.exists(p));
    try (BufferedWriter out = Files.newBufferedWriter(p, Charset.defaultCharset(), StandardOpenOption.DELETE_ON_CLOSE)) {
        out.append("Hello, World!");
        out.flush();
        try (BufferedReader in = Files.newBufferedReader(p, Charset.defaultCharset())) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
} catch (IOException ex) {
    ex.printStackTrace();
}
System.out.println(Files.exists(p));
This outputs (as expected):
false
true
Hello, World!
false
This example is obviously trivial, but I imagine there are plenty of situations where such an approach may come in handy.
However, I still believe the old File.deleteOnExit method may be preferable, as you won't need to keep the output stream open for the duration of any read operations on the file.
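For comparison, a minimal deleteOnExit sketch (the file name is hypothetical); the file can be closed and reopened freely, and deletion happens at normal JVM shutdown:
File work = new File("work.tmp");
work.deleteOnExit(); // registered once; removed at normal JVM exit
// ... write to it, close it, reopen and read it as often as needed ...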

Reading error stream from a process

I am writing a java program to read the error stream from a process . Below is the structure of my code --
ProcessBuilder probuilder = new ProcessBuilder(command);
Process process = probuilder.start();
InputStream error = process.getErrorStream();
InputStreamReader isrerror = new InputStreamReader(error);
BufferedReader bre = new BufferedReader(isrerror);
String linee;
while ((linee = bre.readLine()) != null) {
    System.out.println(linee);
}
The above code works fine if something is actually written to the error stream of the invoked process. However, if nothing is written to the error stream, the call to readLine hangs indefinitely. I want to make my code generic so that it works in all scenarios. How can I modify my code to achieve this?
Regards,
Dev
readLine() is a blocking call. It will block until there's a line to be read (terminated by an end-of-line character) or the underlying stream is closed (returning EOF).
You need logic that checks BufferedReader.ready(), or just uses BufferedReader.read() and bails out if you decide you've waited long enough (or want to do something else and then check again).
Edit to add: That being said, it shouldn't hang "indefinitely" as-is; it should return once the invoked process terminates. By any chance is your invoked process also outputting something to stdout? If that's the case, you need to read from that as well, or the buffer will fill and block the external process, which will prevent it from exiting, which ... leads to your problem.
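A sketch of reading both streams concurrently so neither pipe can fill (process is the Process from the question; names are illustrative):
// drain stderr on its own thread so a full pipe can't stall the child process
BufferedReader err = new BufferedReader(new InputStreamReader(process.getErrorStream()));
Thread errThread = new Thread(() -> {
    try {
        String errLine;
        while ((errLine = err.readLine()) != null) {
            System.err.println(errLine);
        }
    } catch (IOException ignored) {
        // the stream ends when the process exits
    }
});
errThread.start();

// meanwhile, read stdout on the current thread
BufferedReader out = new BufferedReader(new InputStreamReader(process.getInputStream()));
String outLine;
while ((outLine = out.readLine()) != null) {
    System.out.println(outLine);
}
errThread.join();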
This is a late reply, but the issue hasn't really been solved and it's on the first page for some searches. I had the same issue, and BufferedReader.ready() could still set up a situation where it would lock.
The following workaround will not work if you need a persistent stream. However, if you're just running a program and waiting for it to close, this should be fine.
The workaround I'm using is to call ProcessBuilder.redirectError(File). Then I read the file and use it to present the error stream to the user. It worked fine and didn't lock. I did call Process.destroyForcibly() after Process.waitFor(), but this is likely unnecessary.
Some pseudocode below:
File thisFile = new File("somefile.ext");
ProcessBuilder pb = new ProcessBuilder(yourStringList);
pb.redirectError(thisFile);
Process p = pb.start();
p.waitFor();
p.destroyForcibly();
List<String> fileContents = Files.readAllLines(thisFile.toPath());
I hope this helps with at least some of your use cases.
Something like this might also work and avoid the blocking behaviour (without requiring you to create a File):
InputStream error = process.getErrorStream();
// Read from InputStream
for (int k = 0; k < error.available(); ++k)
    System.out.println("Error stream = " + error.read());
From the Javadoc of InputStream.available
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
The simplest answer would be to simply redirect the error stream to stdout (InputStream.transferTo requires Java 9+):
process.getErrorStream().transferTo(System.out);

Read output from external process

I am trying to run a .csh script and read its output into a StringBuffer.
The output sometimes comes back empty although running the script from the console produces output. The same flow can sometimes return output and sometimes not, although nothing changes in the way the process starts (same script, path, args) and the script isn't changed either.
I'm not getting any exceptions thrown.
What might cause the output not to be read correctly/successfully?
the code segment is
public static String getOutpoutScript(Process p) {
    InputStream outpout = p.getInputStream();
    logger.info("Retrieved script output stream");
    BufferedReader buf = new BufferedReader(new InputStreamReader(outpout));
    String line = "";
    StringBuffer write = new StringBuffer();
    try {
        while ((line = buf.readLine()) != null) {
            write.append(line);
        }
    } catch (IOException e) {
        // do something
    }
    return write.toString().trim();
}
Besides the fact that not closing the streams is bad practice, could this or something else in the code prevent the output from being read correctly under some circumstances?
thanks,
If you launch it with ProcessBuilder, you can combine the error stream into the output stream. This way, if the program prints to stderr, you'll capture that too. Alternatively you could just read both. Additionally, you may not want to use readLine: you could be stuck for a while if the program does not print an end-of-line character at the end.
Note that p.getOutputStream() is the stream connected to the process's stdin; some processes block waiting on input, so you may have to write to it as well.
In general you must read and write asynchronously - one possible solution is to use different threads, e.g. one thread reading, another writing, and one monitoring the process.
If the process writes errors, these go to getErrorStream() by default. If you have a problem, I would ensure you are reading this somewhere.
If the buffer for this stream fills, the external program will stop, waiting for you to read it.
A simple way around these issues is to use ProcessBuilder.redirectErrorStream(true)
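A minimal sketch of the merged-stream approach (the script path is a placeholder):
ProcessBuilder pb = new ProcessBuilder("/path/to/script.csh");
pb.redirectErrorStream(true); // stderr is merged into stdout
Process p = pb.start();
try (BufferedReader r = new BufferedReader(
        new InputStreamReader(p.getInputStream()))) {
    String line;
    while ((line = r.readLine()) != null) {
        System.out.println(line); // or append to a StringBuffer, as in the question
    }
}
p.waitFor();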

What's the best way to monitor an InputStream?

I'm reading a file in via apache.commons.FtpClient.
This works fine 99.9% of the time but sometimes it just dies in the read() method...
InputStream inStream = ftp.retrieveFileStream(path + file.getName());
String fileAsString = "";
if (inStream == null) {
    return;
}
int c;
while ((c = inStream.read()) != -1) { // this is where the code sometimes just hangs
    fileAsString += Character.valueOf((char) c);
}
My question is what is the most reliable way to protect against this locking up the system indefinitely. Should I be setting a timer in a separate thread? Or is there a simpler way to do it?
If your code hangs, it means your FTP server has not sent the entire file. You can use a Timer, but I believe FtpClient allows you to set a timeout.
BTW: the way you read the file is very inefficient. If your file is larger than a few KB it will use increasing amounts of CPU.
You are creating a Character from a byte (which is a bad idea in itself) and a new String object for every byte in the file.
I suggest using the copy method provided, or the one which comes with the commons-io library, to copy the data to a ByteArrayOutputStream.
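A sketch of both fixes together, assuming the commons-net FTPClient (setDataTimeout and completePendingCommand are from its API as I recall it; verify against your version):
// set a timeout so a stalled transfer can't hang read() forever
ftp.setDataTimeout(30_000);
InputStream inStream = ftp.retrieveFileStream(path + file.getName());
if (inStream == null) {
    return;
}
// read in buffered blocks instead of one Character and String per byte
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buf = new byte[8192];
int n;
while ((n = inStream.read(buf)) != -1) {
    baos.write(buf, 0, n);
}
inStream.close();
ftp.completePendingCommand(); // finish the FTP transaction
String fileAsString = new String(baos.toByteArray(), StandardCharsets.UTF_8);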
Just from a quick look at the docs, if you did...
while (inStream.available() > 0 && (c = inStream.read()) != -1)
It seems like it would double check that you can read without blocking before you actually read. I'm not certain on this though.
