I am trying to get a very basic RxJava based application to work. I have defined the following Observable class which reads and returns lines from a file:
public Observable<String> getObservable() throws IOException
{
return Observable.create(subscribe -> {
InputStream in = getClass().getResourceAsStream("/trial.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
String line = null;
try {
while((line = reader.readLine()) != null)
{
subscribe.onNext(line);
}
} catch (IOException e) {
subscribe.onError(e);
}
finally {
subscribe.onCompleted();
}
});
}
Next I have defined the subscriber code:
public static void main(String[] args) throws IOException, InterruptedException {
Thread thread = new Thread(() ->
{
RxObserver observer = new RxObserver();
try {
observer.getObservable()
.observeOn(Schedulers.io())
.subscribe( x ->System.out.println(x),
t -> System.out.println(t),
() -> System.out.println("Completed"));
} catch (IOException e) {
e.printStackTrace();
}
});
thread.start();
thread.join();
}
The file has close to 50000 records. When running the app I am getting "rx.exceptions.MissingBackpressureException". I have gone through some of the documentation and, as suggested, I tried adding the ".onBackpressureBuffer()" method to the call chain. Now I no longer get the exception, but the completed callback isn't fired either.
What is the right way to handle a scenario in which we have a fast-producing Observable?
The first problem is that your readLine logic ignores backpressure. You can apply onBackpressureBuffer() just before observeOn to start with, but there is a recent addition, SyncOnSubscribe, that lets you generate values one by one and takes care of backpressure:
SyncOnSubscribe.createSingleState(() -> {
// Opening the resource throws no checked exception, so no try/catch is needed here
InputStream in = getClass().getResourceAsStream("/trial.txt");
return new BufferedReader(new InputStreamReader(in));
},
(s, o) -> {
try {
String line = s.readLine();
if (line == null) {
o.onCompleted();
} else {
o.onNext(line);
}
} catch (IOException ex) {
o.onError(ex);
}
},
s -> {
try {
s.close();
} catch (IOException ex) {
}
});
The second problem is that your Thread will complete way before all elements on the io thread have been delivered, and thus the main program exits. Either remove the observeOn, add .toBlocking, or use a CountDownLatch.
RxObserver observer = new RxObserver();
try {
CountDownLatch cdl = new CountDownLatch(1);
observer.getObservable()
.observeOn(Schedulers.io())
.subscribe( x ->System.out.println(x),
t -> { System.out.println(t); cdl.countDown(); },
() -> { System.out.println("Completed"); cdl.countDown(); });
cdl.await();
} catch (IOException | InterruptedException e) {
e.printStackTrace();
}
The problem here is the observeOn operator: since each Observer's onNext() call is scheduled to run on a separate thread, your Observable keeps producing those scheduled calls in a loop regardless of the subscriber's (observeOn's) capacity.
If you keep this synchronous, the Observable will not emit the next element until the subscriber is done with the previous one, since it is all done on one thread, and you will not have backpressure problems anymore.
If you still want to use observeOn, you will have to implement backpressure logic in your Observable's OnSubscribe#call method.
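To illustrate the point about producer speed versus subscriber capacity without RxJava, here is a minimal plain-Java sketch. It is not RxJava code: a bounded ArrayBlockingQueue stands in for the subscriber's limited buffer, and the class name BoundedHandoff, the method transfer and the END sentinel are made up for this example. The producer blocks in put() whenever the consumer falls behind, which is the effect backpressure is meant to achieve.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BoundedHandoff {
    // Sentinel compared by identity, so a real line containing "END" is not mistaken for it
    private static final String END = new String("END");

    /** Pushes all lines through a queue of the given capacity; returns how many the consumer saw. */
    static int transfer(Iterable<String> lines, int capacity) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int[] consumed = {0};

        Thread consumer = new Thread(() -> {
            try {
                for (String line = queue.take(); line != END; line = queue.take()) {
                    consumed[0]++; // stand-in for the real per-line work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        for (String line : lines) {
            queue.put(line); // blocks while the queue is full: the producer is slowed down
        }
        queue.put(END);
        consumer.join(); // join() also makes consumed[0] visible to this thread
        return consumed[0];
    }
}
```

With a capacity of 16, for example, at most 16 lines are waiting in the queue at any time, however fast the file is read.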
Related
I'm trying to remove elements manually from ArrayBlockingQueue by using the removeIf() method, using threads.
And then have another thread trying to put an element into the ArrayBlockingQueue.
It doesn't work, which is weird because I thought that put() tries and tries until there's space and puts an element in successfully or is my understanding of it wrong?
What's the problem and what should I do to work around this problem?
Here's my code
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CyclicBarrier;
class HelloWorld {
static ArrayBlockingQueue<Integer> q = new ArrayBlockingQueue(2);
static CyclicBarrier cb = new CyclicBarrier(2);
public static void main(String[] args) {
try {
Thread t1 = new Thread(() -> {
try {
q.put(2);
} catch (Exception e) {
e.printStackTrace();
}
});
Thread t2 = new Thread(() -> {
try {
q.put(2);
} catch (Exception e) {
e.printStackTrace();
}
});
Thread t3 = new Thread(() -> {
try {
q.put(1);
} catch (Exception e) {
e.printStackTrace();
}
});
t1.start();
t2.start();
t3.start();
t1.join();
t2.join();
System.out.println(q.size());
q.removeIf(ii -> ii == 2);
System.out.println(q.size());
t3.join();
System.out.println(q.size());
} catch (Exception e) {
e.printStackTrace();
}
}
}
The first output was 2, the second was 0 after the removeIf() call, and the third output never arrived. I think it's because the put() method was never triggered.
I expected the third output to be 1, as I thought put() would go through once space was vacated by removeIf(), but the third output never came no matter how long I waited.
I want to create a DAG out of tasks in Java, where the tasks may depend upon the output of other tasks. If there is no directed path between two tasks, they may run in parallel. Tasks may be canceled. If any task throws an exception, all tasks are canceled.
I wanted to use CompletableFuture for this, but despite implementing the Future interface (including Future.cancel(boolean)), CompletableFuture does not support cancelation of a running task -- the mayInterruptIfRunning argument of CompletableFuture.cancel(true) is simply ignored. (Does anybody know why?)
Therefore, I am resorting to building my own DAG of tasks using Future. It's a lot of boilerplate, and complicated to get right. Is there any better method than this?
Here is an example:
I want to call Process process = Runtime.getRuntime().exec(cmd) to start a commandline process, creating a Future<Process>. Then I want to launch (fan out to) three subtasks:
One task that consumes input from process.getInputStream()
One task that consumes input from process.getErrorStream()
One task that calls process.waitFor(), and then waits for the result.
Then I want to wait for all three of the launched sub-tasks to complete (i.e. fan-in / a completion barrier). This should be collected in a final Future<Integer> exitCode that collects the exit code returned by the process.waitFor() task. The two input consumer tasks simply return Void, so their output can be ignored, but the completion barrier should still wait for their completion.
I want a failure in any of the launched subtasks to cause all subtasks to be canceled, and the underlying process destroyed.
Note that Process process = Runtime.getRuntime().exec(cmd) in the first step can throw an exception, which should cause the failure to cascade all the way to exitCode.
@FunctionalInterface
public static interface ConsumerThrowingIOException<T> {
public void accept(T val) throws IOException;
}
public static Future<Integer> exec(
ConsumerThrowingIOException<InputStream> stdoutConsumer,
ConsumerThrowingIOException<InputStream> stderrConsumer,
String... cmd) {
Future<Process> processFuture = executor.submit(
() -> Runtime.getRuntime().exec(cmd));
AtomicReference<Future<Void>> stdoutProcessorFuture = //
new AtomicReference<>();
AtomicReference<Future<Void>> stderrProcessorFuture = //
new AtomicReference<>();
AtomicReference<Future<Integer>> exitCodeFuture = //
new AtomicReference<>();
Runnable cancel = () -> {
try {
processFuture.get().destroy();
} catch (Exception e) {
// Ignore (exitCodeFuture.get() will still detect exceptions)
}
if (stdoutProcessorFuture.get() != null) {
stdoutProcessorFuture.get().cancel(true);
}
if (stderrProcessorFuture.get() != null) {
stderrProcessorFuture.get().cancel(true);
}
if (exitCodeFuture.get() != null) {
exitCodeFuture.get().cancel(true);
}
};
if (stdoutConsumer != null) {
stdoutProcessorFuture.set(executor.submit(() -> {
try {
InputStream inputStream = processFuture.get()
.getInputStream();
stdoutConsumer.accept(inputStream != null
? inputStream
: new ByteArrayInputStream(new byte[0]));
return null;
} catch (Exception e) {
cancel.run();
throw e;
}
}));
}
if (stderrConsumer != null) {
stderrProcessorFuture.set(executor.submit(() -> {
try {
InputStream errorStream = processFuture.get()
.getErrorStream();
stderrConsumer.accept(errorStream != null
? errorStream
: new ByteArrayInputStream(new byte[0]));
return null;
} catch (Exception e) {
cancel.run();
throw e;
}
}));
}
exitCodeFuture.set(executor.submit(() -> {
try {
return processFuture.get().waitFor();
} catch (Exception e) {
cancel.run();
throw e;
}
}));
// Async completion barrier -- wait for process to exit,
// and for output processors to complete
return executor.submit(() -> {
Exception exception = null;
int exitCode = 1;
try {
exitCode = exitCodeFuture.get().get();
} catch (InterruptedException | CancellationException
| ExecutionException e) {
cancel.run();
exception = e;
}
if (stderrProcessorFuture.get() != null) {
try {
stderrProcessorFuture.get().get();
} catch (InterruptedException | CancellationException
| ExecutionException e) {
cancel.run();
if (exception == null) {
exception = e;
} else if (e instanceof ExecutionException) {
exception.addSuppressed(e);
}
}
}
if (stdoutProcessorFuture.get() != null) {
try {
stdoutProcessorFuture.get().get();
} catch (InterruptedException | CancellationException
| ExecutionException e) {
cancel.run();
if (exception == null) {
exception = e;
} else if (e instanceof ExecutionException) {
exception.addSuppressed(e);
}
}
}
if (exception != null) {
throw exception;
} else {
return exitCode;
}
});
}
Note: I realize that Runtime.getRuntime().exec(cmd) should be non-blocking, so doesn't require its own Future, but I wrote the code using one anyway, to make the point about DAG construction.
No way. Process has no asynchronous interface (except for Process.onExit()). So you have to use threads to wait for process creation and while reading from InputStreams. Other components of your DAG can be async tasks (CompletableFutures).
This is not a big problem. The only advantage of async tasks over threads is lower memory consumption. Your Process consumes a lot of memory anyway, so there is not much sense in saving memory here.
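For the parts of the DAG that can be CompletableFutures, the fan-in with "cancel everything on the first failure" can be sketched roughly as follows. The names FailFastBarrier and allOrCancel are hypothetical. Note that CompletableFuture.cancel only completes the future with a CancellationException; it does not interrupt a thread that is still running the task, so the stream readers still need their own threads and process.destroy().

```java
import java.util.concurrent.CompletableFuture;

final class FailFastBarrier {
    /**
     * Returns a future that completes when all tasks have completed.
     * As soon as any task fails or is cancelled, all other tasks are cancelled too.
     */
    static CompletableFuture<Void> allOrCancel(CompletableFuture<?>... tasks) {
        CompletableFuture<Void> barrier = CompletableFuture.allOf(tasks);
        for (CompletableFuture<?> task : tasks) {
            task.whenComplete((value, error) -> {
                if (error != null) {
                    for (CompletableFuture<?> other : tasks) {
                        other.cancel(true); // no effect on futures that are already done
                    }
                }
            });
        }
        return barrier;
    }
}
```

The returned barrier completes exceptionally if any task failed, so a final thenApply or join on it sees the failure.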
I have a Producer-Consumer problem to implement in Java, where I want the producer thread to run for a specific amount of time e.g. 1 day, putting objects in a BlockingQueue -specifically tweets, streamed from Twitter Streaming API via Twitter4j- and the consumer thread to consume these objects from the queue and write them to file. I've used the PC logic from Read the 30Million user id's one by one from the big file, where producer is the FileTask and consumer is the CPUTask (check first answer; my approach uses the same iterations/try-catch blocks with it). Of course I adapted the implementations accordingly.
My main function is:
public static void main(String[] args) {
....
final int threadCount = 2;
// BlockingQueue with a capacity of 200
BlockingQueue<Tweet> tweets = new ArrayBlockingQueue<>(200);
// create thread pool with given size
ExecutorService service = Executors.newFixedThreadPool(threadCount);
Future<?> f = service.submit(new GathererTask(tweets));
try {
f.get(1,TimeUnit.MINUTES); // Give specific time to the GathererTask
} catch (InterruptedException | ExecutionException | TimeoutException e) {
f.cancel(true); // Stop the Gatherer
}
try {
service.submit(new FileTask(tweets)).get(); // Wait til FileTask completes
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
service.shutdownNow();
try {
service.awaitTermination(7, TimeUnit.DAYS);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
Now, the problem is that, although it does stream the tweets and writes them to file, it never terminates and never gets to the f.cancel(true) part. What should I change for it to work properly? Also, could you explain in your answer what went wrong here with the thread logic, so I learn from my mistake? Thank you in advance.
These are the run() functions of my PC classes:
Producer:
@Override
public void run() {
StatusListener listener = new StatusListener(){
public void onStatus(Status status) {
try {
tweets.put(new Tweet(status.getText(),status.getCreatedAt(),status.getUser().getName(),status.getHashtagEntities()));
} catch (InterruptedException e) {
e.printStackTrace();
Thread.currentThread().interrupt(); // Also tried this command
}
}
public void onException(Exception ex) {
ex.printStackTrace();
}
};
twitterStream.addListener(listener);
... // More Twitter4j commands
}
Consumer:
public void run() {
Tweet tweet;
try(PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("out.csv", true)))) {
while(true) {
try {
// block if the queue is empty
tweet = tweets.take();
writeTweetToFile(tweet,out);
} catch (InterruptedException ex) {
break; // GathererTask has completed
}
}
// poll() returns null if the queue is empty
while((tweet = tweets.poll()) != null) {
writeTweetToFile(tweet,out);
}
} catch (IOException e) {
e.printStackTrace();
}
}
You should check whether your Thread classes handle the InterruptedException; if they do not, they will wait forever. This might help.
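A minimal sketch of what handling the InterruptedException can look like, using plain integers instead of tweets. The class and method names are made up. The producer loops until it is interrupted by cancel(true), and the consumer treats an interrupt as the signal to drain the queue and finish. It assumes the producer task itself blocks in put(); a task that only registers a listener and returns would need a different stop signal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedProducerConsumer {
    /** Lets the producer run for the given time, then stops both tasks and returns what was consumed. */
    static List<Integer> run(long millis) throws InterruptedException, ExecutionException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(200);
        List<Integer> written = Collections.synchronizedList(new ArrayList<>());
        ExecutorService service = Executors.newFixedThreadPool(2);

        Future<?> producer = service.submit(() -> {
            try {
                for (int i = 0; ; i++) {
                    queue.put(i); // throws InterruptedException once the task is cancelled
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and end the task
            }
        });
        service.submit(() -> {
            try {
                while (true) {
                    written.add(queue.take()); // blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                Integer rest; // interrupted: write what is still queued, then finish
                while ((rest = queue.poll()) != null) {
                    written.add(rest);
                }
            }
        });

        try {
            producer.get(millis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            producer.cancel(true); // interrupts the producer thread
        }
        service.shutdownNow(); // interrupts the consumer blocked in take()
        service.awaitTermination(5, TimeUnit.SECONDS);
        return written;
    }
}
```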
long end=System.currentTimeMillis()+60*10;
InputStreamReader fileInputStream=new InputStreamReader(System.in);
BufferedReader bufferedReader=new BufferedReader(fileInputStream);
try
{
while((System.currentTimeMillis()<end) && (bufferedReader.readLine()!=null))
{
}
bufferedReader.close();
}
catch(java.io.IOException e)
{
e.printStackTrace();
}
I actually tried doing the above to read for 600 milliseconds, after which it should not allow reading, but the readLine of the BufferedReader is blocking. Please help.
Using available() as suggested by Sibbo isn't reliable. The documentation of InputStream.available() states:
Returns an estimate of the number of bytes that can be read... It is never correct to use the return value of this method to allocate a buffer.
In other words, you cannot rely on this value, e.g., it can return 0 even if some characters are actually available.
I did some research and unless you are able to close the process input stream from outside, you need to resort to an asynchronous read from a different thread. You can find an example of how to read line by line without blocking here.
Update: Here is a simplified version of the code from the link above:
public class NonblockingBufferedReader {
private final BlockingQueue<String> lines = new LinkedBlockingQueue<String>();
private volatile boolean closed = false;
private Thread backgroundReaderThread = null;
public NonblockingBufferedReader(final BufferedReader bufferedReader) {
backgroundReaderThread = new Thread(new Runnable() {
@Override
public void run() {
try {
while (!Thread.interrupted()) {
String line = bufferedReader.readLine();
if (line == null) {
break;
}
lines.add(line);
}
} catch (IOException e) {
throw new RuntimeException(e);
} finally {
closed = true;
}
}
});
backgroundReaderThread.setDaemon(true);
backgroundReaderThread.start();
}
public String readLine() throws IOException {
try {
return closed && lines.isEmpty() ? null : lines.poll(500L, TimeUnit.MILLISECONDS);
} catch (InterruptedException e) {
throw new IOException("The BackgroundReaderThread was interrupted!", e);
}
}
public void close() {
if (backgroundReaderThread != null) {
backgroundReaderThread.interrupt();
backgroundReaderThread = null;
}
}
}
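You could check with available() > 0 on the underlying InputStream whether there are chars to read (BufferedReader itself has no available() method, only ready()).
String s = "";
while (System.currentTimeMillis() < end)
{
// available() is defined on InputStream, not on BufferedReader
if (System.in.available() > 0)
s += bufferedReader.readLine();
}
bufferedReader.close();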
long end=System.currentTimeMillis()+60*10;
InputStreamReader fileInputStream = new InputStreamReader(System.in);
BufferedReader bufferedReader = new BufferedReader(fileInputStream);
try {
while ((System.currentTimeMillis() < end)) {
if (bufferedReader.ready()) {
System.out.println(bufferedReader.readLine());
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (bufferedReader != null) {
bufferedReader.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
The only reliable way would be to start a worker thread and do the actual reading inside it, while the caller thread would monitor the latency.
If the worker thread waits longer than allowed, the master thread would terminate it and throw an exception.
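A rough sketch of that worker-thread idea, using an ExecutorService and Future.get with a timeout. TimedLineReader and its readLine method are hypothetical names. Keep in mind that cancel(true) only interrupts the worker; a read that is blocked in native I/O (for example on System.in) may not react to the interrupt, which is why the worker is a daemon thread here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedLineReader {
    /** Reads one line, or throws TimeoutException if none arrives within timeoutMillis. */
    static String readLine(BufferedReader reader, long timeoutMillis)
            throws IOException, TimeoutException, InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "line-reader");
            t.setDaemon(true); // a stuck read must not keep the JVM alive
            return t;
        });
        Callable<String> task = reader::readLine; // the blocking call runs in the worker
        Future<String> line = worker.submit(task);
        try {
            return line.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        } finally {
            line.cancel(true);    // no effect if the read already finished
            worker.shutdownNow(); // release the executor
        }
    }
}
```

After a timeout the reader should not be reused, because the abandoned worker may still consume the next line.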
BufferedReader.readLine() can block for a very long time if a line is extremely long, like 1M chars.
Does your file contain such long lines?
If yes, you may have to break up the lines, or use per-char read methods like BufferedReader.read().
This question is about java.lang.Process and its handling of stdin, stdout and stderr.
We have a class in our project that is an extension to org.apache.commons.io.IOUtils. There we have a fairly new method for closing the std-streams of a Process object. Is it appropriate or not?
/**
* Method closes all underlying streams from the given Process object.
* If Exit-Code is not equal to 0 then Process will be destroyed after
* closing the streams.
*
* It is guaranteed that everything possible is done to release resources
* even when Throwables are thrown in between.
*
* In case multiple Throwables occur, the first one that occurred
* will be thrown as Error, RuntimeException or (masked) IOException.
*
* The method is null-safe.
*/
public static void close(@Nullable Process process) throws IOException {
if(process == null) {
return;
}
Throwable t = null;
try {
close(process.getOutputStream());
}
catch(Throwable e) {
t = e;
}
try{
close(process.getInputStream());
}
catch(Throwable e) {
t = (t == null) ? e : t;
}
try{
close(process.getErrorStream());
}
catch (Throwable e) {
t = (t == null) ? e : t;
}
try{
try {
if(process.waitFor() != 0){
process.destroy();
}
}
catch(InterruptedException e) {
t = (t == null) ? e : t;
process.destroy();
}
}
catch (Throwable e) {
t = (t == null) ? e : t;
}
if(t != null) {
if(t instanceof Error) {
throw (Error) t;
}
if(t instanceof RuntimeException) {
throw (RuntimeException) t;
}
throw t instanceof IOException ? (IOException) t : new IOException(t);
}
}
public static void closeQuietly(@Nullable Logger log, @Nullable Process process) {
try {
close(process);
}
catch (Exception e) {
//log if Logger provided, otherwise discard
logError(log, "Error while closing the Process object (incl. underlying streams)!", e);
}
}
public static void close(@Nullable Closeable closeable) throws IOException {
if(closeable != null) {
closeable.close();
}
}
Methods like these are basically used in finally-blocks.
What I really want to know is if I am safe with this implementation? Considering things like: Does a process object always return the same stdin, stdout and stderr streams during its lifetime? Or may I miss closing streams previously returned by process' getInputStream(), getOutputStream() and getErrorStream() methods?
There is a related question on StackOverflow.com: java: closing subprocess std streams?
Edit
As pointed out by me and others here:
InputStreams have to be consumed completely. Otherwise the subprocess may not terminate, because there is outstanding data in its output streams.
All three std-streams have to be closed, regardless of whether they were used before or not.
When the subprocess terminates normally, everything should be fine. When it does not, it has to be terminated forcibly.
When an exit code is returned by the subprocess, we do not need to destroy() it. It has terminated. (Not necessarily normally with exit code 0, but it has terminated.)
We need to monitor waitFor() and interrupt it when a timeout is exceeded, to give the process a chance to terminate normally while still killing it when it hangs.
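The waitFor() monitoring in the last point can be sketched with the timed waitFor(long, TimeUnit) and destroyForcibly() that java.lang.Process has offered since Java 8. The class and method names below are made up, and the timeout is whatever the caller considers reasonable.

```java
import java.util.concurrent.TimeUnit;

final class ProcessTimeouts {
    /**
     * Waits up to the given timeout for the process to exit on its own,
     * kills it if it has not, and returns the exit code.
     */
    static int waitOrKill(Process process, long timeout, TimeUnit unit)
            throws InterruptedException {
        try {
            if (!process.waitFor(timeout, unit)) {
                process.destroyForcibly(); // still running after the timeout
            }
            return process.waitFor(); // the process has exited or is being killed
        } catch (InterruptedException e) {
            process.destroyForcibly(); // do not leave the child behind
            throw e;
        }
    }
}
```

This replaces a separate monitor thread for the waiting part only; the streams still have to be consumed and closed as described above.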
Unanswered parts:
Consider Pros and Cons of consuming the InputStreams in parallel. Or must they be consumed in particular order?
An attempt at simplifying your code:
public static void close(@Nullable Process process) throws IOException
{
if(process == null) { return; }
try
{
close(process.getOutputStream());
close(process.getInputStream());
close(process.getErrorStream());
if(process.waitFor() != 0)
{
process.destroy();
}
}
catch(InterruptedException e)
{
process.destroy();
}
catch (RuntimeException e)
{
// A RuntimeException can never be an IOException, so always wrap it
throw new IOException(e);
}
}
By catching Throwable I assume you wish to catch all unchecked exceptions, that is, anything derived from RuntimeException or Error. However, an Error should never be caught, so I have replaced Throwable with RuntimeException.
(It is still not a good idea to catch all RuntimeExceptions.)
As the question you linked to states, it is better to read and discard the output and error streams. If you are using apache commons io, something like,
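new Thread(new Runnable() {public void run() {try {IOUtils.copy(process.getInputStream(), new NullOutputStream());} catch (IOException e) {/* stream closed, process most likely ended */}}}).start();
new Thread(new Runnable() {public void run() {try {IOUtils.copy(process.getErrorStream(), new NullOutputStream());} catch (IOException e) {/* stream closed, process most likely ended */}}}).start();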
You want to read and discard stdout and stderr in a separate thread to avoid problems such as the process blocking when it writes enough info to stderr or stdout to fill the buffer.
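If Commons IO is not available, the same read-and-discard idea can be sketched with the standard library only. StreamDrainer and drain are hypothetical names; the thread is a daemon so that a stream that never ends cannot keep the JVM alive.

```java
import java.io.IOException;
import java.io.InputStream;

final class StreamDrainer {
    /** Starts a daemon thread that reads the stream to its end and throws the data away. */
    static Thread drain(InputStream in) {
        Thread t = new Thread(() -> {
            byte[] buffer = new byte[8192];
            try {
                while (in.read(buffer) != -1) {
                    // discard: the only goal is to keep the pipe from filling up
                }
            } catch (IOException ignored) {
                // typically means the stream was closed or the process ended
            }
        }, "stream-drainer");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

It would be called once per stream, for example StreamDrainer.drain(process.getInputStream()) and StreamDrainer.drain(process.getErrorStream()).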
If you are worried about having too many threads, see this question.
I don't think you need to worry about handling IOExceptions when copying stdout and stderr to NullOutputStream, since if there is an IOException reading from the process stdout/stderr, it is probably due to the process being dead itself, and writing to NullOutputStream will never throw an exception.
You don't need to check the return status of waitFor().
Do you want to wait for the process to complete? If so, you can do,
while(true) {
try
{
process.waitFor();
break;
} catch(InterruptedException e) {
//ignore, spurious interrupted exceptions can occur
}
}
Looking at the link you provided you do need to close the streams when the process is complete, but destroy will do that for you.
So in the end, the method becomes,
public void close(Process process) {
if(process == null) return;
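new Thread(new Runnable() {public void run() {try {IOUtils.copy(process.getInputStream(), new NullOutputStream());} catch (IOException e) {/* stream closed, process most likely ended */}}}).start();
new Thread(new Runnable() {public void run() {try {IOUtils.copy(process.getErrorStream(), new NullOutputStream());} catch (IOException e) {/* stream closed, process most likely ended */}}}).start();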
while(true) {
try
{
process.waitFor();
//this will close stdin, stdout and stderr for the process
process.destroy();
break;
} catch(InterruptedException e) {
//ignore, spurious interrupted exceptions can occur
}
}
}
Just to let you know what I have currently in our codebase:
public static void close(@Nullable Process process) throws IOException {
if (process == null) {
return;
}
Throwable t = null;
try {
flushQuietly(process.getOutputStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
close(process.getOutputStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
skipAllQuietly(null, TIMEOUT, process.getInputStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
close(process.getInputStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
skipAllQuietly(null, TIMEOUT, process.getErrorStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
close(process.getErrorStream());
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
try {
try {
Thread monitor = ThreadMonitor.start(TIMEOUT);
process.waitFor();
ThreadMonitor.stop(monitor);
}
catch (InterruptedException e) {
t = mostImportantThrowable(t, e);
process.destroy();
}
}
catch (Throwable e) {
t = mostImportantThrowable(t, e);
}
if (t != null) {
if (t instanceof Error) {
throw (Error) t;
}
if (t instanceof RuntimeException) {
throw (RuntimeException) t;
}
throw t instanceof IOException ? (IOException) t : new IOException(t);
}
}
skipAllQuietly(...) consumes the InputStreams completely. Internally it uses an implementation similar to org.apache.commons.io.ThreadMonitor to interrupt consumption if a given timeout is exceeded.
mostImportantThrowable(...) decides which Throwable should be thrown. Errors take precedence over everything else; otherwise an earlier Throwable has higher priority than a later one. Nothing very important here, since these Throwables are most probably discarded later anyway. We want to go on working here and can only throw one, so we have to decide what to throw at the end, if anything.
close(...) are null-safe implementations that close things but throw an Exception when something goes wrong.