Parallelize a blocking call in Java


I am currently using the below code to run a process and print its output from within my own program. My problem is that I'd like to do more than just print the output. However, since the Process being executed scarcely prints output to the console, the commented line is almost never executed because the readLine() method is blocking.
Process p = Runtime.getRuntime().exec(executablePath);
ProcessWatcher pw = new ProcessWatcher(p);
BufferedReader output = new BufferedReader(new InputStreamReader(p.getInputStream()));
while (!pw.isFinished()) {
    String line = output.readLine();
    if (line != null) {
        System.out.println(line);
    }
    // I want to do something in parallel here
}
My problem would be solved if I could give the readLine() call some sort of timeout, or if I could run it in its own Thread. The latter would be preferred, but the former is acceptable. My primary intention is to keep the code simple.
Is there a nice, simple way to parallelize a blocking call, or will I need to create a new class that implements Runnable just for this?

You can wrap your code in a new thread:
new Thread(new Runnable() {
    public void run() {
        // put your code here
    }
}).start();
You cannot do a non-blocking readLine or one with a timeout. In any case, writing code to handle non-blocking connections is usually 10x more complex. ;)
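To make the thread-wrapping suggestion concrete, here is a minimal sketch that moves the blocking readLine() loop into its own thread so the main thread stays free. The class name BackgroundLineReader, the method drainInBackground and the in-memory stand-in stream are assumptions made for illustration; in the question's setup the stream would be p.getInputStream().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class BackgroundLineReader {

    // Starts a daemon thread that reads lines from `in` until end of stream
    // and appends each line to `sink`. Returns the started thread.
    public static Thread drainInBackground(final InputStream in, final List<String> sink) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                try (BufferedReader r = new BufferedReader(
                        new InputStreamReader(in, StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = r.readLine()) != null) {
                        sink.add(line);
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for p.getInputStream(); a real Process would be used in practice.
        InputStream fake = new ByteArrayInputStream(
                "one\ntwo\n".getBytes(StandardCharsets.UTF_8));
        List<String> lines = new CopyOnWriteArrayList<String>();
        Thread reader = drainInBackground(fake, lines);
        // ... the main thread is free to do other work here ...
        reader.join();
        System.out.println(lines);
    }
}
```

The sink is a thread-safe list because the reader thread and the main thread both touch it; any other thread-safe collection would do.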

The readLine() call is indeed blocking; however, the ready() call on the BufferedReader is not. You can check whether the reader is ready (a non-blocking call) and only then perform the read, something like this:
Process p = Runtime.getRuntime().exec(executablePath);
ProcessWatcher pw = new ProcessWatcher(p);
BufferedReader output = new BufferedReader(new InputStreamReader(p.getInputStream()));
while (!pw.isFinished()) {
    if (output.ready()) {
        String line = output.readLine();
        if (line != null) {
            System.out.println(line);
        }
    }
    // I want to do something in parallel here
}
This will allow you to execute something else in the while loop while the process is running.

IMO, you'd be better off putting the process-spawning code in a separate thread of execution, since spawning a process can take an indeterminate amount of time, not to mention the plethora of things that can go wrong with it. Bundling all of that in a separate logical unit would be much cleaner.

Related

Capturing stdout from java.lang.reflect method invocation

I have an application that, among other things, runs Java methods via java.lang.reflect. It normally functions as normal; however, a user used it with one of their JARs, and it broke somewhat.
As you can see in the code below, I attempt to capture both stdout and stderr from the method. However, when the method is invoked, only the first line of what the method writes to stdout is actually captured.
Here's the relevant code. If you need to see more of the code, let me know, and I'll add some more:
String retVal = "";
ByteArrayOutputStream out = new ByteArrayOutputStream();
ByteArrayOutputStream err = new ByteArrayOutputStream();
PrintStream origOut = System.out;
PrintStream origErr = System.err;
System.setOut(new PrintStream(out));
System.setErr(new PrintStream(err));
Exception myException = null;
try {
    Object myRetVal = null;
    myRetVal = m.invoke(obj, convertedMethodArguments);
    if (myRetVal != null)
        retVal = myRetVal.toString();
} catch (Exception e) {
    myException = e;
}
returnObj.addProperty("stdout", out.toString());
returnObj.addProperty("stderr", err.toString());
returnObj.addProperty("rv", retVal);
returnObj.addProperty("rt", m.getReturnType().toString());
if (myException != null && myException.getCause() != null)
    returnObj.addProperty("exception", myException.getCause().toString());
else
    returnObj.addProperty("exception", "");
System.setOut(origOut);
System.setErr(origErr);
System.out.print(new Gson().toJson(returnObj));
// TODO: remove, debug purposes only
// Should use normal stdout
try {
    System.out.println();
    m.invoke(obj, convertedMethodArguments);
} catch (Exception e) {
    System.out.println(e.toString());
}
When I execute the above code, it only prints out the first line of stdout. However, at the bottom of the code block, I invoke the method again, but this time without any redirection, and I get all of the stdout from the method.
Any help would be greatly appreciated!
EDIT #1: OK, get this. For fun, I commented-out the two lines where I redirect the default System streams (e.g. System.setOut and System.setErr). With these gone, I now expect all stdout to be written to the console directly when I run the app.
I added a message (e.g. System.out.println("Testing...");) at the very end of my code, so that it's the last thing that is executed. When I test the app, I get the first line of stdout, followed by my testing message, and THEN the rest of the stdout.
I have no clue what's going on here.
EDIT #2: Per @Titus's suggestion, I looked into whether or not the method I'm invoking is spinning off its own threads. Turns out, it is. Two threads are created, AWT-AppKit and AWT-Shutdown. The former thread seems to stay in RUNNABLE state, whereas the latter thread stays in the TIMED_WAITING state.
Over time, the AWT-Shutdown thread goes away, but the other one stays alive in its RUNNABLE state. Once my application exits, I believe the method I'm invoking also exits, and at that point the extra messages are displayed to the screen (which explains why I can't capture this bit of STDOUT).
What I don't understand is why this method won't terminate within my application.

Try to flush the streams after you call the method.
Here is an example:
PrintStream outPR = new PrintStream(out);
System.setOut(outPR);
....
outPR.flush();
returnObj.addProperty("stdout", out.toString());
You can even do this:
System.setOut(new PrintStream(out, true));
....
System.out.println();
returnObj.addProperty("stdout", out.toString());
The PrintStream is automatically flushed (if you use the constructor that I've used) when a \n (new line) is written to it.
Based on the edits to your question, it is possible that the method you're calling creates new threads, which means these threads may print to the console after the method returns.
If that is the case, you'll have to wait until these threads finish in order to get all the output.
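As a rough sketch of the two points above (auto-flushing capture, and output that only appears once worker threads have run), the helper below redirects System.out around a task and restores it afterwards. The class name StdoutCapture and the method capture are hypothetical, and the task here joins its own worker thread; a method that spawns threads it never joins would still lose output.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class StdoutCapture {

    // Runs `task` with System.out redirected to an in-memory buffer and
    // returns everything printed while the task was running.
    public static String capture(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream capture = new PrintStream(buffer, true); // auto-flush on println
        System.setOut(capture);
        try {
            task.run();
        } finally {
            capture.flush();          // push out anything still buffered
            System.setOut(original);  // always restore the real stdout
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String captured = capture(new Runnable() {
            public void run() {
                System.out.println("from caller");
                Thread worker = new Thread(new Runnable() {
                    public void run() {
                        System.out.println("from worker");
                    }
                });
                worker.start();
                try {
                    worker.join(); // without this, the worker's line may be missed
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        System.out.print(captured);
    }
}
```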

In java, Is there a way to be notified when data becomes available on a stream or pipe

I want to be able to have a thread wait for data on one or more streams/pipes to be available and then poll-read the data ( and then wait again for more )
InputStream is = process.getInputStream() // stdout of process
is.setOnDataAvailable((stream)-> {
    readerThread.signal ... // tell thread to wake up or put worker task in queue
});
// reader thread
dataAvailable = true; // start assuming data there.
for(;;) {
    if(!dataAvailable) {
        waitForSignal();
    }
    dataAvailable = false;
    for(allINputs) {
        if(input.dataAvailable()) {
            read it and do something
            dataAvailable = true; // might be more check inputs again
        }
    }
}

The standard Java IO API is a blocking implementation. In other words, if you have a thread with a loop that looks like:
byte[] input = ...;
while (stream.read(input) != -1)
{
...
}
Then that thread will block until there is something to read from the stream (or the stream closes, there's no more data, etc.), which from your description sounds like what you want to do.
Furthermore, from your code it looks like you want to read process output. Make sure you drain the other streams from the process as well (like stderr); otherwise the process can stall once the pipe buffer for the unread stream fills up.
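One simple way to avoid leaving stderr undrained is to merge it into stdout with ProcessBuilder.redirectErrorStream(true), so a single reader sees both. This is only a sketch of that option; the class name MergedOutput and its helper methods are made up for the example, and the command run in main (the current JVM's java -version) is just a convenient program that writes to stderr.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;

public class MergedOutput {

    // Builds a ProcessBuilder whose stderr is merged into stdout, so only
    // one stream has to be drained.
    public static ProcessBuilder merged(String... command) {
        return new ProcessBuilder(command).redirectErrorStream(true);
    }

    // Starts the command, reads the merged output to end of stream and
    // waits for the process to exit.
    public static String runAndCapture(String... command)
            throws IOException, InterruptedException {
        Process p = merged(command).start();
        StringBuilder sb = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        p.waitFor();
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String java = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        System.out.print(runAndCapture(java, "-version"));
    }
}
```

If stdout and stderr must stay separate, each needs its own reader thread instead.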

Java thread stuck after join

I have this Transmitter class, which contains one BufferedReader and one PrintWriter. The idea is, on the main class, to use Transmitter.receive() and Transmitter.transmit() to the main socket. The problem is:
public void receive() throws Exception {
    // Reads from the socket
    Thread listener = new Thread(new Runnable() {
        public void run() {
            String res;
            try {
                while((res = input.readLine()) != null) {
                    System.out.println("message received: " + res);
                    outputMessage = (res);
                    if (res.equals("\n")) {
                        break;
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        };
    });
    listener.start();
    listener.join();
}
The thread changes the 'outputMessage' value, which I can get using an auxiliary method. The problem is that, without join, my client gets the outputMessage but I want to use it several times on my main class, like this:
trans1.receive();
while(trans1.getOutput() == null);
System.out.println("message: " + trans1.getOutput());
But with join this system.out never executes because trans1.receive() is stuck... any thoughts?
Edit 1: here is the transmitter class https://titanpad.com/puYBvlVery

You might send \n; that doesn't mean that you will see it in your Java code.
As it says in the Javadoc for BufferedReader.readLine() (emphasis mine):
(Returns) A String containing the contents of the line, not including any line-termination characters
so "\n" will never be returned.
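A small check of that behaviour, using an in-memory StringReader in place of the socket (the class name ReadLineDemo and the sample text are assumptions for the example): a blank line comes back as the empty string, so a test like res.equals("\n") can never succeed, while res.isEmpty() can.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class ReadLineDemo {

    // Reads every line from `text` the same way the receive() loop does.
    public static List<String> readAll(String text) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader r = new BufferedReader(new StringReader(text));
        String line;
        while ((line = r.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        for (String line : readAll("hello\n\nbye\n")) {
            // The blank line in the middle is reported as "" (length 0).
            System.out.println("[" + line + "] empty=" + line.isEmpty());
        }
    }
}
```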

Doing this:
{
    Thread listener = new Thread(new Runnable() {
        public void run() {
            doSomeWork();
        };
    });
    listener.start();
    listener.join();
}
will create a new thread and then wait for it to do its work and finish. Therefore it's more or less the same as just directly doing:
doSomeWork();
The new thread doesn't serve any real purpose here.
Also, the extra thread introduces synchronization problems because in your code you don't make sure your variables are synchronized.
Thirdly, your thread keeps reading lines from the input in a loop until there's nothing more to read, and unless the other side closes the stream, it will block on the readLine() call. What you will see with getOutput() will be whichever line happens to be there at the moment you look; the next time you look it might be the same line, or some completely different line. Some lines will be read and immediately overwritten without the main thread ever noticing them.
You can just call input.readLine() directly in your main thread when you actually need to get a new line message from the input, you don't need an extra reader thread. You could store the read messages into a Queue as yshavit suggests, if that's desirable, e.g. for performance reasons it might be better to read the messages as soon as they are available and have them ready in memory. But if you only need to read messages one by one then you can simply call input.readLine() only when you actually need it.
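If the queue route is taken, it could look roughly like the sketch below: a reader thread puts every line on a BlockingQueue and the main thread takes messages when it wants them, so no line is overwritten. The class name QueuedReceiver, the method startReading and the EOF sentinel are hypothetical, and a StringReader stands in for the socket reader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueuedReceiver {

    // Sentinel object placed on the queue when the stream ends.
    // It is compared by identity (==), so it cannot clash with real input.
    public static final String EOF = new String("<EOF>");

    // Starts a daemon thread that forwards every line from `input` to the
    // returned queue, followed by EOF when the stream ends or fails.
    public static BlockingQueue<String> startReading(final BufferedReader input) {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        Thread t = new Thread(new Runnable() {
            public void run() {
                try {
                    String line;
                    while ((line = input.readLine()) != null) {
                        queue.put(line);
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    queue.offer(EOF);
                }
            }
        });
        t.setDaemon(true);
        t.start();
        return queue;
    }

    public static void main(String[] args) throws InterruptedException {
        BufferedReader input = new BufferedReader(new StringReader("first\nsecond\n"));
        BlockingQueue<String> messages = startReading(input);
        String msg;
        while ((msg = messages.take()) != EOF) {
            System.out.println("message: " + msg);
        }
    }
}
```

Using take() blocks until a message arrives; poll with a timeout is the alternative when the caller must not wait indefinitely.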

Synchronizing on System.out

I changed System.out to print to a file, by invoking System.setOut and System.setErr.
Every night at midnight, we want to rename (archive) the current log file, and create a new one.
if (out != null) {
    out.close();
    out = null;
    File f = new File(outputfilename);
    f.renameTo(new File(dir.getPath().replace(".log", "-" + System.currentTimeMillis() + ".log")));
    StartLogFile();
}
The StartLogFile():
if (out == null) {
    out = new FileOutputStream(outputfilename, true);
    System.setOut(new PrintStream(out));
    System.setErr(new PrintStream(out));
}
I've left exception-handling out.
My concern is that if something tries to print in between out.close() and setOut/setErr that I'm going to miss a log.
My real question is, how can I make this atomic with other calls to System.out.println?
I was thinking about trying
synchronized (System.out) {
}
but I'm not actually sure if the intrinsic lock here does anything. Especially since I'm nullifying the out object during the operation.
Does anyone know how I can ensure proper synchronization here?

I would create the new out before closing the old one:
PrintStream old = System.out;
out = new FileOutputStream(outputfilename, true);
System.setOut(new PrintStream(out));
old.close();
This way the old PrintStream is not closed until the new one is created and assigned. At all times there is a valid PrintStream in System.out.
There is no need for a synchronized block, because everything is in the same thread.

Yes, you can achieve proper synchronization that way. Here is a sample test.
@Test
public void test() throws InterruptedException {
    new Thread(()->{
        while(true){
            System.out.println("printing something");
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
    }).start();
    Thread.sleep(500);
    synchronized (System.out){
        System.out.println("changin system out");
        Thread.sleep(2000);
        System.out.println("finished with sysout");
    }
    Thread.sleep(2000);
}
and the output will be:
printing something
printing something
printing something
printing something
printing something
changin system out
finished with sysout
printing something
printing something
printing something
printing something
printing something
printing something
printing something
printing something
printing something
printing something
printing something
printing something
printing something
printing something

There is no way to make this work safely, since you have no control what the calling code is doing with System.out. Think of this:
public void doSomethingTakingALongTime(PrintStream target) {
// lots of code
}
// somewhere else
doSomethingTakingALongTime(System.out);
You can never be sure there isn't a copy of System.out reference somewhere out there in a local variable or method parameter.
The proper way to solve this would be to set System.out only once, at the very start of the program, and instead of using a standard PrintStream, you use your own implementation that delegates everything to the current target.
You are then in complete control of every output made through System.out and can synchronize at your leisure where required. If your own implementation synchronizes every operation, the question of what happens while you're changing the logging target doesn't even arise: every other caller will simply be blocked by the synchronization lock.
By the way, it's questionable to use System.out for logging. The de facto standard for logging would be log4j. Consider switching to that.
Edit: Actually, implementing this delegation can be rather easy. There is a constructor PrintStream(OutputStream). That means you can just implement delegation in an OutputStream (which has considerably fewer methods than PrintStream) and set System.out to your new PrintStream(YourRetargettingOutputStream).
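A minimal sketch of such a delegating stream follows, assuming every operation simply synchronizes on the stream itself. The class name RetargetingOutputStream and the method retarget are illustrative, and the in-memory targets in main stand in for the old and new log files.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

public class RetargetingOutputStream extends OutputStream {

    private OutputStream target;

    public RetargetingOutputStream(OutputStream initial) {
        this.target = initial;
    }

    // Swaps the target while holding the lock, then flushes and closes the
    // old one. Writers block for the duration, so no output is lost.
    public synchronized void retarget(OutputStream next) throws IOException {
        OutputStream old = target;
        target = next;
        old.flush();
        old.close();
    }

    @Override
    public synchronized void write(int b) throws IOException {
        target.write(b);
    }

    @Override
    public synchronized void write(byte[] b, int off, int len) throws IOException {
        target.write(b, off, len);
    }

    @Override
    public synchronized void flush() throws IOException {
        target.flush();
    }

    @Override
    public synchronized void close() throws IOException {
        target.close();
    }

    public static void main(String[] args) throws IOException {
        PrintStream console = System.out;
        ByteArrayOutputStream firstLog = new ByteArrayOutputStream();
        RetargetingOutputStream out = new RetargetingOutputStream(firstLog);
        System.setOut(new PrintStream(out, true)); // set once, at startup
        System.out.println("goes to the first target");
        out.retarget(console);                     // e.g. the midnight rollover
        System.out.println("goes to the console");
        System.setOut(console);
        System.out.println("first target captured: " + firstLog.toString().trim());
    }
}
```

For the log rotation in the question, the rename of the old file would happen inside retarget (or under the same lock) before the new FileOutputStream is opened.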

You can define an object explicitly for locking like
static final Object lock = new Object();
How about locking over it like below
synchronized(lock){
    if(out != null) {
        out.close();
        out = null;
        File f = new File(outputfilename);
        f.renameTo(new File(dir.getPath().replace(".log", "-" + System.currentTimeMillis() + ".log")));
        StartLogFile();
    }
}

How to determine the exact state of a BufferedReader?

I have a BufferedReader (generated by new BufferedReader(new InputStreamReader(process.getInputStream()))). I'm quite new to the concept of a BufferedReader but as I see it, it has three states:
A line is waiting to be read; calling bufferedReader.readLine will return this string instantly.
The stream is open, but there is no line waiting to be read; calling bufferedReader.readLine will hang the thread until a line becomes available.
The stream is closed; calling bufferedReader.readLine will return null.
Now I want to determine the state of the BufferedReader, so that I can determine whether I can safely read from it without hanging my application. The underlying process (see above) is notoriously unreliable and so might have hung; in this case, I don't want my host application to hang. Therefore I'm implementing a kind of timeout. I tried to do this first with threading but it got horribly complicated.
Calling BufferedReader.ready() will not distinguish between cases (2) and (3) above. In other words, if ready() returns false, it might be that the stream just closed (in other words, my underlying process closed gracefully) or it might be that the underlying process hung.
So my question is: how do I determine which of these three states my BufferedReader is in without actually calling readLine? Unfortunately I can't just call readLine to check this, as it opens my app up to a hang.
I am using JDK version 1.5.

There is a state where some data may be in the buffer, but not necessarily enough to fill a line. In this case, ready() would return true, but calling readLine() would block.
You should easily be able to build your own ready() and readLine() methods. Your ready() would actually try to build up a line, and only when it has done so successfully would it return true. Then your readLine() could return the fully-formed line.
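A rough sketch of that idea: a wrapper that pulls characters only while the underlying Reader reports ready(), and hands out a line only once its terminator has arrived. The class name LineAssembler is hypothetical, and the sketch is simplified in that it treats '\n' as the only terminator and drops '\r' characters. It also inherits the limitation discussed in the question: it cannot tell a closed stream from a silent one.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class LineAssembler {

    private final Reader in;
    private final StringBuilder partial = new StringBuilder();
    private String completed; // a full line waiting to be handed out, or null

    public LineAssembler(Reader in) {
        this.in = in;
    }

    // Consumes whatever characters can be read without blocking and returns
    // true once a complete line has been assembled.
    public boolean ready() throws IOException {
        while (completed == null && in.ready()) {
            int c = in.read();
            if (c == -1) {
                break; // end of stream; a trailing partial line is not returned
            }
            if (c == '\n') {
                completed = partial.toString();
                partial.setLength(0);
            } else if (c != '\r') {
                partial.append((char) c);
            }
        }
        return completed != null;
    }

    // Returns the next complete line, or null if none is available yet.
    public String readLine() throws IOException {
        if (!ready()) {
            return null;
        }
        String line = completed;
        completed = null;
        return line;
    }

    public static void main(String[] args) throws IOException {
        LineAssembler lines = new LineAssembler(new StringReader("ab\ncd"));
        System.out.println(lines.readLine()); // "ab" is complete
        System.out.println(lines.readLine()); // "cd" has no terminator yet
    }
}
```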

Finally I found a solution to this. Most of the answers here rely on threads, but as I specified earlier, I am looking for a solution which doesn't require threads. However, my basis was the process. What I found was that processes seem to exit if both the output (called "input") and error streams are empty and closed. This makes sense if you think about it.
So I just polled the output and error streams and also tried to determine if the process had exited or not. Below is a rough copy of my solution.
public String readLineWithTimeout(Process process, long timeout) throws IOException, TimeoutException {
    BufferedReader output = new BufferedReader(new InputStreamReader(process.getInputStream()));
    BufferedReader error = new BufferedReader(new InputStreamReader(process.getErrorStream()));
    long startTime = System.currentTimeMillis();
    while (true) {
        if (output.ready()) {
            return output.readLine();
        } else if (error.ready()) {
            error.readLine();
        } else {
            try {
                process.exitValue();
                return null;
            } catch (IllegalThreadStateException ex) {
                // Expected behaviour while the process is still running
            }
        }
        if (System.currentTimeMillis() > startTime + timeout) {
            throw new TimeoutException();
        }
    }
}

This is a pretty fundamental issue with java's blocking I/O API.
I suspect you're going to want to pick one of:
(1) Re-visit the idea of using threading. This doesn't have to be complicated, done properly, and it would let your code escape a blocked I/O read fairly gracefully, for example:
final BufferedReader reader = ...
ExecutorService executor = // create an executor here, using the Executors factory class.
Callable<String> task = new Callable<String>() {
    public String call() throws IOException {
        return reader.readLine();
    }
};
Future<String> futureResult = executor.submit(task);
String line = futureResult.get(timeout, TimeUnit.MILLISECONDS); // throws a TimeoutException if the read doesn't return in time
(2) Use java.nio instead of java.io. This is a more complicated API, but it has non-blocking semantics.
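For option (1), a self-contained version might look like the sketch below. The class name TimedReadLine, the choice of a cached daemon thread pool and the millisecond timeout unit are assumptions for the example. Note the caveat: when the timeout fires, the worker thread is still blocked inside readLine(), so a later read on the same reader can race with it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedReadLine {

    // Daemon threads, so a read that never returns cannot keep the JVM alive.
    private static final ExecutorService EXECUTOR =
            Executors.newCachedThreadPool(new ThreadFactory() {
                public Thread newThread(Runnable r) {
                    Thread t = new Thread(r, "timed-readline");
                    t.setDaemon(true);
                    return t;
                }
            });

    // Reads one line, giving up with a TimeoutException after timeoutMillis.
    public static String readLine(final BufferedReader reader, long timeoutMillis)
            throws IOException, TimeoutException, InterruptedException {
        Future<String> future = EXECUTOR.submit(new Callable<String>() {
            public String call() throws IOException {
                return reader.readLine();
            }
        });
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            if (e.getCause() instanceof IOException) {
                throw (IOException) e.getCause();
            }
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) throws Exception {
        BufferedReader reader = new BufferedReader(new StringReader("x\ny\n"));
        System.out.println(readLine(reader, 2000));
    }
}
```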

Have you confirmed by experiment your assertion that ready() will return false even if the underlying stream is at end of file? Because I would not expect that assertion to be correct (although I haven't done the experiment).

You could use InputStream.available() to see if there is new output from the process. This should work the way you want it if the process outputs only full lines, but it's not really reliable.
A more reliable approach to the problem would be to have a separate thread dedicated to reading from the process and pushing every line it reads to some queue or consumer.

In general, you have to implement this with multiple threads. There are special cases, like reading from a socket, where the underlying stream has a timeout facility built-in.
However, it shouldn't be horribly complicated to do this with multiple threads. This is a pattern I use:
private static final ScheduledExecutorService worker =
    Executors.newSingleThreadScheduledExecutor();
private static class Timeout implements Callable<Void> {
    private final Closeable target;
    private Timeout(Closeable target) {
        this.target = target;
    }
    public Void call() throws Exception {
        target.close();
        return null;
    }
}
...
InputStream stream = process.getInputStream();
Future<?> task = worker.schedule(new Timeout(stream), 5, TimeUnit.SECONDS);
/* Use the stream as you wish. If it hangs for more than 5 seconds,
   the underlying stream is closed, raising an IOException here. */
...
/* If you get here without timing out, cancel the asynchronous timeout
   and close the stream explicitly. */
if(task.cancel(false))
    stream.close();

You could make your own wrapper around InputStream or InputStreamReader that works on a byte-by-byte level, for which ready() returns accurate values.
Your other options are threading which could be done simply (look into some of the concurrent data structures Java offers) and NIO, which is very complex and probably overkill.

If you just want the timeout then the other methods here are possibly better. If you want a non-blocking buffered reader, here's how I would do it, with threads: (please note I haven't tested this and at the very least it needs some exception handling added)
public class MyReader implements Runnable {
    private final BufferedReader reader;
    private final ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<String>();
    // volatile so the flag set by the reader thread is visible to callers
    private volatile boolean closed = false;
    public MyReader(BufferedReader reader) {
        this.reader = reader;
    }
    public void run() {
        try {
            String line;
            while((line = reader.readLine()) != null) {
                queue.add(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            closed = true;
        }
    }
    // Returns true iff there is at least one line on the queue
    public boolean ready() {
        return(queue.peek() != null);
    }
    // Returns true if the underlying connection has closed
    // Note that there may still be data on the queue!
    public boolean isClosed() {
        return closed;
    }
    // Get next line
    // Returns null if there is none
    // Never blocks
    public String readLine() {
        return(queue.poll());
    }
}
Here's how to use it:
BufferedReader b; // Initialise however you normally do
MyReader reader = new MyReader(b);
new Thread(reader).start();
// True if there is data to be read regardless of connection state
reader.ready();
// True if the connection is closed
reader.isClosed();
// Gets the next line, never blocks
// Returns null if there is no data
// This doesn't necessarily mean the connection is closed, it might be waiting!
String line = reader.readLine(); // Gets the next line
There are four possible states:
Connection is open, no data is available
Connection is open, data is available
Connection is closed, data is available
Connection is closed, no data is available
You can distinguish between them with the isClosed() and ready() methods.
