How to close std-streams from java.lang.Process appropriately? - java

This question is about java.lang.Process and its handling of stdin, stdout and stderr.
We have a class in our project that extends org.apache.commons.io.IOUtils. It contains a fairly new method for closing the std-streams of a Process object. Is this implementation appropriate or not?
/**
* Method closes all underlying streams from the given Process object.
* If Exit-Code is not equal to 0 then Process will be destroyed after
* closing the streams.
*
* It is guaranteed that everything possible is done to release resources
* even when Throwables are thrown in between.
*
* If multiple Throwables occur, the first one that occurred
* is thrown as Error, RuntimeException or (wrapped) IOException.
*
* The method is null-safe.
*/
public static void close(@Nullable Process process) throws IOException {
if(process == null) {
return;
}
Throwable t = null;
try {
close(process.getOutputStream());
}
catch(Throwable e) {
t = e;
}
try{
close(process.getInputStream());
}
catch(Throwable e) {
t = (t == null) ? e : t;
}
try{
close(process.getErrorStream());
}
catch (Throwable e) {
t = (t == null) ? e : t;
}
try{
try {
if(process.waitFor() != 0){
process.destroy();
}
}
catch(InterruptedException e) {
t = (t == null) ? e : t;
process.destroy();
}
}
catch (Throwable e) {
t = (t == null) ? e : t;
}
if(t != null) {
if(t instanceof Error) {
throw (Error) t;
}
if(t instanceof RuntimeException) {
throw (RuntimeException) t;
}
throw t instanceof IOException ? (IOException) t : new IOException(t);
}
}
public static void closeQuietly(@Nullable Logger log, @Nullable Process process) {
try {
close(process);
}
catch (Exception e) {
//log if Logger provided, otherwise discard
logError(log, "Error closing the Process object (incl. underlying streams)!", e);
}
}
public static void close(@Nullable Closeable closeable) throws IOException {
if(closeable != null) {
closeable.close();
}
}
Methods like these are typically used in finally blocks.
What I really want to know is if I am safe with this implementation? Considering things like: Does a process object always return the same stdin, stdout and stderr streams during its lifetime? Or may I miss closing streams previously returned by process' getInputStream(), getOutputStream() and getErrorStream() methods?
There is a related question on StackOverflow.com: java: closing subprocess std streams?
Edit
As pointed out by me and others here:
InputStreams have to be consumed completely. Otherwise the subprocess may not terminate, because there is outstanding data in its output streams.
All three std-streams have to be closed, regardless of whether they were used before.
When the subprocess terminates normally, everything should be fine. When it does not, it has to be terminated forcibly.
When the subprocess returns an exit code, we do not need to destroy() it. It has terminated. (Not necessarily normally with exit code 0, but it has terminated.)
We need to monitor waitFor() and interrupt it when a timeout is exceeded, giving the process a chance to terminate normally but killing it when it hangs.
Unanswered parts:
Consider Pros and Cons of consuming the InputStreams in parallel. Or must they be consumed in particular order?
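The points from the edit above can be sketched roughly as follows. This is only a sketch under assumptions, not the code from our project: the class name ProcessCleanup and its methods are made up for illustration, it uses plain JDK calls (Process.waitFor(long, TimeUnit) and destroyForcibly() need Java 8 or later), and it drains stdout and stderr in parallel so that neither pipe can fill up and block the subprocess.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.TimeUnit;

public final class ProcessCleanup {
    private ProcessCleanup() {}

    /** Reads the stream to its end, discards the data and returns the number of bytes read. */
    public static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    /** Drains and then closes the stream on a daemon thread. */
    private static Thread drainInBackground(InputStream in, String name) {
        Thread t = new Thread(() -> {
            try (InputStream s = in) {
                drain(s);
            } catch (IOException ignored) {
                // most likely the process died; there is nothing left to drain
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Closes stdin, drains stdout/stderr, waits up to the timeout, then kills; returns the exit code. */
    public static int closeAndWait(Process process, long timeout, TimeUnit unit)
            throws IOException, InterruptedException {
        Thread out = drainInBackground(process.getInputStream(), "stdout-drain");
        Thread err = drainInBackground(process.getErrorStream(), "stderr-drain");
        long millis = Math.max(1L, unit.toMillis(timeout));
        try {
            process.getOutputStream().close();      // signals EOF on the child's stdin
            if (!process.waitFor(timeout, unit)) {  // give it a chance to exit normally
                process.destroyForcibly();          // it hangs, so kill it
                process.waitFor();
            }
            out.join(millis);
            err.join(millis);
            return process.exitValue();
        } finally {
            if (process.isAlive()) {                // e.g. when this thread was interrupted
                process.destroyForcibly();
            }
        }
    }
}
```

With two independent drain threads the order of consumption does not matter, which is the main argument for consuming in parallel; the cost is two short-lived threads per process.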

An attempt at simplifying your code:
public static void close(@Nullable Process process) throws IOException
{
if(process == null) { return; }
try
{
close(process.getOutputStream());
close(process.getInputStream());
close(process.getErrorStream());
if(process.waitFor() != 0)
{
process.destroy();
}
}
catch(InterruptedException e)
{
process.destroy();
}
catch (RuntimeException e)
{
// a RuntimeException can never be an IOException, so always wrap it
throw new IOException(e);
}
}
By catching Throwable I assume you wish to catch all unchecked exceptions, that is, any subclass of RuntimeException or Error. However, Error should never be caught, so I have replaced Throwable with RuntimeException.
(It is still not a good idea to catch all RuntimeExceptions.)

As the question you linked to states, it is better to read and discard the output and error streams. If you are using apache commons io, something like,
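new Thread(new Runnable() {public void run() {try {IOUtils.copy(process.getInputStream(), new NullOutputStream());} catch (IOException e) {/* process is probably dead, nothing left to drain */}}}).start();
new Thread(new Runnable() {public void run() {try {IOUtils.copy(process.getErrorStream(), new NullOutputStream());} catch (IOException e) {/* process is probably dead, nothing left to drain */}}}).start();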
You want to read and discard stdout and stderr in a separate thread to avoid problems such as the process blocking when it writes enough info to stderr or stdout to fill the buffer.
If you are worried about having too many threads, see this question
I don't think you need to worry about handling IOExceptions when copying stdout and stderr to NullOutputStream, since if there is an IOException reading from the process stdout/stderr, it is probably due to the process being dead itself, and writing to NullOutputStream will never throw an exception.
You don't need to check the return status of waitFor().
Do you want to wait for the process to complete? If so, you can do,
while(true) {
try
{
process.waitFor();
break;
} catch(InterruptedException e) {
//ignore, spurious interrupted exceptions can occur
}
}
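The loop above swallows the interrupt. A variant that keeps waiting but restores the thread's interrupt flag afterwards might look like this (only a sketch; the class and method names are made up for illustration):

```java
public final class Waits {
    private Waits() {}

    /** Waits for the process to exit even if the calling thread is interrupted meanwhile. */
    public static int waitForUninterruptibly(Process process) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    return process.waitFor();
                } catch (InterruptedException e) {
                    interrupted = true; // remember the interrupt and keep waiting
                }
            }
        } finally {
            if (interrupted) {
                // hand the interrupt back to the caller instead of losing it
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

The caller can then still check Thread.currentThread().isInterrupted() and react to the interrupt once the process has finished.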
Looking at the link you provided you do need to close the streams when the process is complete, but destroy will do that for you.
So in the end, the method becomes,
public void close(Process process) {
if(process == null) return;
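new Thread(new Runnable() {public void run() {try {IOUtils.copy(process.getInputStream(), new NullOutputStream());} catch (IOException e) {/* process is probably dead, nothing left to drain */}}}).start();
new Thread(new Runnable() {public void run() {try {IOUtils.copy(process.getErrorStream(), new NullOutputStream());} catch (IOException e) {/* process is probably dead, nothing left to drain */}}}).start();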
while(true) {
try
{
process.waitFor();
//this will close stdin, stdout and stderr for the process
process.destroy();
break;
} catch(InterruptedException e) {
//ignore, spurious interrupted exceptions can occur
}
}
}

Just to let you know what I have currently in our codebase:
public static void close(@Nullable Process process) throws IOException {
if (process == null) {
return;
}
Throwable t = null;
try {
flushQuietly(process.getOutputStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
close(process.getOutputStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
skipAllQuietly(null, TIMEOUT, process.getInputStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
close(process.getInputStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
skipAllQuietly(null, TIMEOUT, process.getErrorStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
close(process.getErrorStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
try {
Thread monitor = ThreadMonitor.start(TIMEOUT);
process.waitFor();
ThreadMonitor.stop(monitor);
}
catch (InterruptedException e) {
t = mostImportantThrowable(t, e);
process.destroy();
}
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
if (t != null) {
if (t instanceof Error) {
throw (Error) t;
}
if (t instanceof RuntimeException) {
throw (RuntimeException) t;
}
throw t instanceof IOException ? (IOException) t : new IOException(t);
}
}
skipAllQuietly(...) consumes the InputStreams completely. Internally it uses an implementation similar to org.apache.commons.io.ThreadMonitor to interrupt consumption once a given timeout is exceeded.
mostImportantThrowable(...) decides which Throwable should be returned. Errors take precedence over everything else, and a Throwable that occurred earlier has higher priority than a later one. This is not very important, since these Throwables are most probably discarded later anyway. We want to keep working here and can only throw one, so we have to decide which one to throw at the end, if any.
close(...) are null-safe implementations that close things but throw an exception when something goes wrong.
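For illustration, mostImportantThrowable(...) could look roughly like the sketch below. It is not the exact code from our codebase: the enclosing class name is made up, and attaching the losing Throwable via addSuppressed (Java 7 or later) is an addition that goes beyond the description above, so that nothing is lost completely.

```java
public final class Throwables {
    private Throwables() {}

    /**
     * Picks the Throwable to report: an Error wins over anything that is not an Error,
     * otherwise the earlier one wins. The other one is attached as suppressed.
     */
    public static Throwable mostImportantThrowable(Throwable earlier, Throwable later) {
        if (earlier == null) {
            return later;
        }
        if (later == null || later == earlier) {
            return earlier;
        }
        if (later instanceof Error && !(earlier instanceof Error)) {
            later.addSuppressed(earlier);
            return later;
        }
        earlier.addSuppressed(later);
        return earlier;
    }
}
```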

Related

Constructing a DAG of cancelable Java tasks

I want to create a DAG out of tasks in Java, where the tasks may depend upon the output of other tasks. If there is no directed path between two tasks, they may run in parallel. Tasks may be canceled. If any task throws an exception, all tasks are canceled.
I wanted to use CompletableFuture for this, but despite implementing the Future interface (including Future.cancel(boolean)), CompletableFuture does not support cancelation of a running task: the mayInterruptIfRunning argument of CompletableFuture.cancel(true) is simply ignored. (Does anybody know why?)
Therefore, I am resorting to building my own DAG of tasks using Future. It's a lot of boilerplate, and complicated to get right. Is there any better method than this?
Here is an example:
I want to call Process process = Runtime.getRuntime().exec(cmd) to start a commandline process, creating a Future<Process>. Then I want to launch (fan out to) three subtasks:
One task that consumes input from process.getInputStream()
One task that consumes input from process.getErrorStream()
One task that calls process.waitFor(), and then waits for the result.
Then I want to wait for all three of the launched sub-tasks to complete (i.e. fan-in / a completion barrier). This should be collected in a final Future<Integer> exitCode that collects the exit code returned by the process.waitFor() task. The two input consumer tasks simply return Void, so their output can be ignored, but the completion barrier should still wait for their completion.
I want a failure in any of the launched subtasks to cause all subtasks to be canceled, and the underlying process destroyed.
Note that Process process = Runtime.getRuntime().exec(cmd) in the first step can throw an exception, which should cause the failure to cascade all the way to exitCode.
@FunctionalInterface
public static interface ConsumerThrowingIOException<T> {
public void accept(T val) throws IOException;
}
public static Future<Integer> exec(
ConsumerThrowingIOException<InputStream> stdoutConsumer,
ConsumerThrowingIOException<InputStream> stderrConsumer,
String... cmd) {
Future<Process> processFuture = executor.submit(
() -> Runtime.getRuntime().exec(cmd));
AtomicReference<Future<Void>> stdoutProcessorFuture = //
new AtomicReference<>();
AtomicReference<Future<Void>> stderrProcessorFuture = //
new AtomicReference<>();
AtomicReference<Future<Integer>> exitCodeFuture = //
new AtomicReference<>();
Runnable cancel = () -> {
try {
processFuture.get().destroy();
} catch (Exception e) {
// Ignore (exitCodeFuture.get() will still detect exceptions)
}
if (stdoutProcessorFuture.get() != null) {
stdoutProcessorFuture.get().cancel(true);
}
if (stderrProcessorFuture.get() != null) {
stderrProcessorFuture.get().cancel(true);
}
if (exitCodeFuture.get() != null) {
exitCodeFuture.get().cancel(true);
}
};
if (stdoutConsumer != null) {
stdoutProcessorFuture.set(executor.submit(() -> {
try {
InputStream inputStream = processFuture.get()
.getInputStream();
stdoutConsumer.accept(inputStream != null
? inputStream
: new ByteArrayInputStream(new byte[0]));
return null;
} catch (Exception e) {
cancel.run();
throw e;
}
}));
}
if (stderrConsumer != null) {
stderrProcessorFuture.set(executor.submit(() -> {
try {
InputStream errorStream = processFuture.get()
.getErrorStream();
stderrConsumer.accept(errorStream != null
? errorStream
: new ByteArrayInputStream(new byte[0]));
return null;
} catch (Exception e) {
cancel.run();
throw e;
}
}));
}
exitCodeFuture.set(executor.submit(() -> {
try {
return processFuture.get().waitFor();
} catch (Exception e) {
cancel.run();
throw e;
}
}));
// Async completion barrier -- wait for process to exit,
// and for output processors to complete
return executor.submit(() -> {
Exception exception = null;
int exitCode = 1;
try {
exitCode = exitCodeFuture.get().get();
} catch (InterruptedException | CancellationException
| ExecutionException e) {
cancel.run();
exception = e;
}
if (stderrProcessorFuture.get() != null) {
try {
stderrProcessorFuture.get().get();
} catch (InterruptedException | CancellationException
| ExecutionException e) {
cancel.run();
if (exception == null) {
exception = e;
} else if (e instanceof ExecutionException) {
exception.addSuppressed(e);
}
}
}
if (stdoutProcessorFuture.get() != null) {
try {
stdoutProcessorFuture.get().get();
} catch (InterruptedException | CancellationException
| ExecutionException e) {
cancel.run();
if (exception == null) {
exception = e;
} else if (e instanceof ExecutionException) {
exception.addSuppressed(e);
}
}
}
if (exception != null) {
throw exception;
} else {
return exitCode;
}
});
}
Note: I realize that Runtime.getRuntime().exec(cmd) should be non-blocking, so doesn't require its own Future, but I wrote the code using one anyway, to make the point about DAG construction.
No way. Process has no asynchronous interface (except for Process.onExit()). So you have to use threads to wait for process creation and to read from the InputStreams. Other components of your DAG can be async tasks (CompletableFutures).
This is not a big problem. The only advantage of async tasks over threads is lower memory consumption. Your Process consumes a lot of memory anyway, so there is not much sense in saving memory here.
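The fan-in part of such a DAG can be sketched with CompletableFuture alone. This is a minimal sketch under assumptions: the class FanIn and the method barrier are hypothetical names, and the tasks that actually read the streams are expected to run on their own threads, as described above. The barrier completes with the main result once all futures are done, and runs a cleanup action once if any of them fails.

```java
import java.util.concurrent.CompletableFuture;

public final class FanIn {
    private FanIn() {}

    /**
     * Completes with main's value when main and all others have completed normally.
     * On the first failure, fails the result and runs onFailure exactly once.
     */
    public static <T> CompletableFuture<T> barrier(
            CompletableFuture<T> main, Runnable onFailure, CompletableFuture<?>... others) {
        CompletableFuture<T> result = new CompletableFuture<>();
        CompletableFuture<?>[] all = new CompletableFuture<?>[others.length + 1];
        all[0] = main;
        System.arraycopy(others, 0, all, 1, others.length);
        for (CompletableFuture<?> f : all) {
            f.whenComplete((value, ex) -> {
                // completeExceptionally returns true only for the first failure
                if (ex != null && result.completeExceptionally(ex)) {
                    onFailure.run();
                }
            });
        }
        // allOf completes normally only if every future did, so join() cannot throw here
        CompletableFuture.allOf(all).thenRun(() -> result.complete(main.join()));
        return result;
    }
}
```

On Java 9 or later, a call along the lines of barrier(process.onExit().thenApply(Process::exitValue), process::destroy, stdoutTask, stderrTask) would tie this to a Process, where stdoutTask and stderrTask are assumed to be futures completed by the stream-reading threads.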

How to throw exception in spark streaming

We have a Spark Streaming program which pulls messages from Kafka and processes each individual message using the foreachPartition action.
In case there is a specific error in the processing function, we would like to throw the exception back and halt the program. This does not seem to happen. Below is the code we are trying to execute.
JavaInputDStream<KafkaDTO> stream = KafkaUtils.createDirectStream( ...);
stream.foreachRDD(new Function<JavaRDD<KafkaDTO>, Void>() {
public Void call(JavaRDD<KafkaDTO> rdd) throws PropertiesLoadException, Exception {
rdd.foreachPartition(new VoidFunction<Iterator<KafkaDTO>>() {
@Override
public void call(Iterator<KafkaDTO> itr) throws PropertiesLoadException, Exception {
while (itr.hasNext()) {
KafkaDTO dto = itr.next();
try{
//process the message here.
} catch (PropertiesLoadException e) {
// throw Exception if property file is not found
throw new PropertiesLoadException(" PropertiesLoadException: "+e.getMessage());
} catch (Exception e) {
throw new Exception(" Exception : "+e.getMessage());
}
}
}
});
return null;
}
});
In the above code even if we throw a PropertiesLoadException the program doesn't halt and streaming continues. The max retries we set in Spark configuration is only 4. The streaming program continues even after 4 failures. How should the exception be thrown to stop the program?
I am not sure if this is the best approach, but we surrounded the main batch with try and catch, and when an exception occurs we close the context. In addition, you need to make sure that graceful stop is off (false).
Example code:
try {
process(dataframe);
} catch (Exception e) {
logger.error("Failed on write - will stop spark context immediately!!" + e.getMessage());
closeContext(jssc);
if (e instanceof InterruptedException) {
Thread.currentThread().interrupt();
}
throw e;
}
And close function:
private void closeContext(JavaStreamingContext jssc) {
logger.warn("stopping the context");
jssc.stop(false, jssc.sparkContext().getConf().getBoolean("spark.streaming.stopGracefullyOnShutdown", false));
logger.error("Context was stopped");
}
In config :
spark.streaming.stopGracefullyOnShutdown false
I think that with your code it should look like this:
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, streamBatch);
JavaInputDStream<KafkaDTO> stream = KafkaUtils.createDirectStream( jssc, ...);
stream.foreachRDD(new Function<JavaRDD<KafkaDTO>, Void>() {
public Void call(JavaRDD<KafkaDTO> rdd) throws PropertiesLoadException, Exception {
try {
rdd.foreachPartition(new VoidFunction<Iterator<KafkaDTO>>() {
@Override
public void call(Iterator<KafkaDTO> itr) throws PropertiesLoadException, Exception {
while (itr.hasNext()) {
KafkaDTO dto = itr.next();
try {
//process the message here.
} catch (PropertiesLoadException e) {
// throw Exception if property file is not found
throw new PropertiesLoadException(" PropertiesLoadException: " + e.getMessage());
} catch (Exception e) {
throw new Exception(" Exception : " + e.getMessage());
}
}
}
});
return null;
} catch (Exception e){
logger.error("Failed on write - will stop spark context immediately!!" + e.getMessage());
closeContext(jssc);
if (e instanceof InterruptedException) {
Thread.currentThread().interrupt();
}
throw e;
}
}
});
In addition, please note that my stream runs on Spark 2.1 standalone (not YARN / Mesos) in client mode. I also implemented the graceful stop myself using ZK.

What causes BlockingOperationException in Netty 4?

Recently I found some BlockingOperationExceptions in my Netty 4 project.
Some people say that using the sync() method when starting Netty's ServerBootstrap can cause a deadlock, because sync() invokes the await() method, and await() calls a method named 'checkDeadLock'.
But I don't think so. ServerBootstrap uses the EventLoopGroup called bossGroup, and the Channel uses the workerGroup for IO operations. I don't think they influence each other, since they have different EventExecutors.
In my experience, the deadlock exception doesn't appear during Netty startup; it mostly occurs after calling await on the future returned by the Channel's writeAndFlush.
From analyzing the source code of checkDeadLock, the BlockingOperationException is thrown when the current thread and the executor's thread are the same.
My project code is below:
private void channelWrite(T message) {
boolean success = true;
boolean sent = true;
int timeout = 60;
try {
ChannelFuture cf = cxt.write(message);
cxt.flush();
if (sent) {
success = cf.await(timeout);
}
if (cf.isSuccess()) {
logger.debug("send success.");
}
Throwable cause = cf.cause();
if (cause != null) {
this.fireError(new PushException(cause));
}
} catch (LostConnectException e) {
this.fireError(new PushException(e));
} catch (Exception e) {
this.fireError(new PushException(e));
} catch (Throwable e) {
this.fireError(new PushException("Failed to send message", e));
}
if (!success) {
this.fireError(new PushException("Failed to send message"));
}
}
I know the Netty maintainers advise against using the sync() or await() methods, but I want to know in which situations a deadlock is caused, i.e. when the current thread and the executor's thread are the same.
I changed my project code:
private void pushMessage0(T message) {
try {
ChannelFuture cf = cxt.writeAndFlush(message);
cf.addListener(new ChannelFutureListener() {
@Override
public void operationComplete(ChannelFuture future) throws PushException {
if (future.isSuccess()) {
logger.debug("send success.");
} else {
throw new PushException("Failed to send message.");
}
Throwable cause = future.cause();
if (cause != null) {
throw new PushException(cause);
}
}
});
} catch (LostConnectException e) {
this.fireError(new PushException(e));
} catch (Exception e) {
this.fireError(new PushException(e));
} catch (Throwable e) {
this.fireError(new PushException(e));
}
}
But I face a new problem: I can't get the PushException from the ChannelFutureListener.
BlockingOperationException will be thrown by Netty if you call sync*() or await*() on a Future in the same thread that the EventExecutor is using and to which the Future is tied. This is usually the EventLoop that is used by the Channel itself.
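The idea behind that check can be illustrated without Netty. The following is a plain-JDK analogue under assumptions, not Netty's implementation: the class LoopGuard and its methods are made up, a single-thread executor stands in for the EventLoop, and IllegalStateException stands in for BlockingOperationException. A task running on the loop thread that blocks on a future served by the same thread could never be woken up, so the guard refuses to wait unless the future is already done.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class LoopGuard {
    private final ExecutorService loop;
    private volatile Thread loopThread;

    public LoopGuard() {
        // single worker thread, playing the role of the event loop
        loop = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "event-loop");
            t.setDaemon(true);
            loopThread = t;
            return t;
        });
    }

    public boolean inEventLoop() {
        return Thread.currentThread() == loopThread;
    }

    public <T> Future<T> submit(Callable<T> task) {
        return loop.submit(task);
    }

    /** Blocks for the result, but refuses to block on the loop thread itself. */
    public <T> T await(Future<T> future) throws InterruptedException, ExecutionException {
        if (!future.isDone() && inEventLoop()) {
            // waiting here would stall the only thread that can complete the future
            throw new IllegalStateException("blocking wait inside the event-loop thread");
        }
        return future.get();
    }

    public void shutdown() {
        loop.shutdownNow();
    }
}
```

Calling await from any other thread works as usual; only a wait issued from the loop thread on a not yet completed future is rejected.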
That you cannot call await in the IO thread is understandable. However, there are 2 points.
1. If you call the code below in a channel handler, no exception will be reported most of the time, because the isDone check in await returns true: you are in the IO thread, and the IO thread writes the data synchronously, so the data has already been written when await is called.
ChannelFuture f = ctx.writeAndFlush(msg);
f.await();
2. If you add the handler in a different EventExecutorGroup, this check is not necessary, since that executor is newly created and is not the same as the channel's IO executor.

Handling the cause of an ExecutionException

Suppose I have a class defining a big block of work to be done, that can produce several checked Exceptions.
class WorkerClass{
public Output work(Input input) throws InvalidInputException, MiscalculationException {
...
}
}
Now suppose I have a GUI of some sort that can call this class. I use a SwingWorker to delegate the task.
final Input input = getInput();
SwingWorker<Output, Void> worker = new SwingWorker<Output, Void>() {
@Override
protected Output doInBackground() throws Exception {
return new WorkerClass().work(input);
}
};
How can I handle the possible exceptions thrown from the SwingWorker? I want to differentiate between the Exceptions of my worker class (InvalidInputException and MiscalculationException), but the ExecutionException wrapper complicates things. I only want to handle these Exceptions - an OutOfMemoryError should not be caught.
try{
worker.execute();
worker.get();
} catch(InterruptedException e){
//Not relevant
} catch(ExecutionException e){
try{
throw e.getCause(); //is a Throwable!
} catch(InvalidInputException e){
//error handling 1
} catch(MiscalculationException e){
//error handling 2
}
}
//Problem: Since a Throwable is thrown, the compiler demands a corresponding catch clause.
catch (ExecutionException e) {
Throwable ee = e.getCause ();
if (ee instanceof InvalidInputException)
{
//error handling 1
} else if (ee instanceof MiscalculationException)
{
//error handling 2
}
else throw e; // Not ee here
}
You could use an ugly (smart?) hack to convert the throwable into an unchecked exception. The advantage is that the calling code will receive whatever exception was thrown by your worker thread, whether checked or unchecked, but you don't have to change the signature of your method.
try {
future.get();
} catch (InterruptedException ex) {
} catch (ExecutionException ex) {
if (ex.getCause() instanceof InvalidInputException) {
//do your stuff
} else {
UncheckedThrower.throwUnchecked(ex.getCause());
}
}
With UncheckedThrower defined as:
class UncheckedThrower {
public static <R> R throwUnchecked(Throwable t) {
return UncheckedThrower.<RuntimeException, R>throw0(t);
}
@SuppressWarnings("unchecked")
private static <E extends Throwable, R> R throw0(Throwable t) throws E {
throw (E) t;
}
}
Try/multi-catch:
try {
worker.execute();
worker.get();
} catch (InterruptedException e) {
//Not relevant
} catch (InvalidInputException e) {
//stuff
} catch (MiscalculationException e) {
//stuff
}
Or with the ExecutionException wrapper:
catch (ExecutionException e) {
// getCause() returns a Throwable, so it needs its own variable
Throwable cause = e.getCause();
if (cause.getClass() == InvalidInputException.class) {
//stuff
} else if (cause.getClass() == MiscalculationException.class) {
//stuff
}
}
Or if you want exceptions' subclasses to be treated like their parents:
catch (ExecutionException e) {
// getCause() returns a Throwable, so it needs its own variable
Throwable cause = e.getCause();
if (cause instanceof InvalidInputException) {
//stuff
} else if (cause instanceof MiscalculationException) {
//stuff
}
}
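Putting the instanceof approach together, a complete and compilable version could look like the sketch below. The exception classes are stand-ins for the ones in the question, and the class CauseHandling and the method describe are made up for illustration. Errors and unchecked exceptions are rethrown unchanged, so something like an OutOfMemoryError is not swallowed.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public final class CauseHandling {
    private CauseHandling() {}

    // stand-ins for the checked exceptions thrown by the worker class
    public static class InvalidInputException extends Exception {}
    public static class MiscalculationException extends Exception {}

    /** Waits for the future and maps the known checked causes to a description. */
    public static String describe(Future<?> future) throws InterruptedException {
        try {
            future.get();
            return "ok";
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof InvalidInputException) {
                return "invalid input";          // error handling 1
            }
            if (cause instanceof MiscalculationException) {
                return "miscalculation";         // error handling 2
            }
            if (cause instanceof Error) {
                throw (Error) cause;             // never swallow Errors
            }
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            }
            throw new IllegalStateException("unexpected checked exception", cause);
        }
    }
}
```

Since SwingWorker implements Future, a worker can be passed to describe directly once execute() has been called.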

Java: Synchronizing socket input

This is my first attempt at writing a client for a PHP socket server, and I'm running into a little trouble and being flooded with info!
With the server we want an open connection. I want my client end to wait until it receives data before notifying the thread to start parsing the input stream. Is this achievable without using a loop? I'd rather be able to call lock.notify().
I was also looking at NIO. Is this a viable option for what I want?
Here's the code I have so far, but again, I'm just trying to avoid the for(;;) and maybe even queue the received messages, as they will most likely just be JSON.
Thread serverRecieve = new Thread(new Runnable() {
@Override
public void run() {
try {
for (;;) {
if (in != null) {
String line;
while ((line = in.readLine()) != null) {
sout(line);
}
} else {
sout("inputstream is null! Waiting for a second to test again");
}
try {
Thread.sleep(1000);
} catch (InterruptedException ex) {
Logger.getLogger(WebManager.class.getName()).log(Level.SEVERE, null, ex);
}
}
} catch (IOException ex) {
Logger.getLogger(WebManager.class.getName()).log(Level.SEVERE, null, ex);
}
}
});
Thanks guys!
PS: I did look through A LOT of socket threads on here but decided it would be easier just to ask what I need.
I think you can use a while loop and put a condition using in != null as:
while(in == null){
//wait for a second before checking the in stream again
try {
sout("inputstream is null! Waiting for a second to test again");
Thread.sleep(1000);
} catch (InterruptedException ex) {
Logger.getLogger(WebManager.class.getName()).log(Level.SEVERE, null, ex);
}
}
//now your in is available. Read the data and proceed
String line = null;
while ((line = in.readLine()) != null) {
sout(line);
}
The first while loop will terminate as soon as the in stream is available.
How about creating dedicated subtype of Runnable for reading from socket, like this:
class Reader implements Runnable {
private final Socket socket;
private volatile boolean stopped;
Reader(Socket socket) {
this.socket = socket;
}
@Override
public void run() {
try {
int in;
// read() returns -1 at end of stream, so stop there instead of spinning forever
while ((in = socket.getInputStream().read()) != -1) {
// process in here
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (!stopped) socket.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
public void stop() {
try {
stopped = true;
socket.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
class Client {
private volatile Reader reader;
void start() throws IOException {
reader = new Reader(new Socket(serverHost, serverPort));
Thread readerThread = new Thread(reader, "Reader-thread");
readerThread.start();
}
void stop() {
Reader reader = this.reader;
// reader.stop() will close socket making `run()` method finish because of IOException
// reader.socket is final, thus we have proper visibility of its value across threads
if (reader != null) reader.stop();
}
}
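For the part of the question about queueing received messages instead of polling, a reader thread can hand lines over through a BlockingQueue. This is a sketch under assumptions: the class LineQueueReader is a made-up name, it works on any BufferedReader (for a socket that would be one wrapped around socket.getInputStream()), and an empty Optional is used as an end-of-stream marker. Both readLine() and take() block, so no sleep loop and no explicit lock.notify() are needed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public final class LineQueueReader implements Runnable {
    private final BufferedReader in;
    private final BlockingQueue<Optional<String>> queue = new LinkedBlockingQueue<>();

    public LineQueueReader(BufferedReader in) {
        this.in = in;
    }

    @Override
    public void run() {
        try {
            String line;
            // readLine() blocks until a full line arrives or the stream ends
            while ((line = in.readLine()) != null) {
                queue.put(Optional.of(line));
            }
        } catch (IOException e) {
            // stream was closed or the connection broke; fall through to the end marker
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            queue.offer(Optional.empty()); // tells the consumer that no more lines will come
        }
    }

    /** Blocks until the next line is available; an empty Optional means end of stream. */
    public Optional<String> take() throws InterruptedException {
        return queue.take();
    }
}
```

The consumer calls take() in its own thread and parses each line (for example as JSON) until it receives the empty Optional.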
