Terminating a Future and getting the intermediate result - java

I have a long-running Scala Future that operates as follows:
1. Calculate initial result
2. Improve result
3. If no improvement then terminate, else go to step 2
After receiving an external signal (meaning the Future won't have any a priori knowledge about how long it's supposed to run for), I would like to be able to tell the Future to terminate and give me its intermediate result. I can do this using some sort of side channel (note: this is a Java program using Akka, hence the reason I'm creating a Scala future in Java along with all of the attendant boilerplate):
public void doCalculation(final AtomicBoolean interrupt, final AtomicReference<Object> output) {
    Futures.future(new Callable<Boolean>() {
        public Boolean call() {
            Object previous = // calculate initial value
            output.set(previous);
            while (!interrupt.get()) {
                Object next = // calculate next value
                if (/* next is better than previous */) {
                    previous = next;
                    output.set(previous);
                } else return true;
            }
            return false;
        }
    }, TypedActor.dispatcher());
}
This way whoever is calling doCalculation can get intermediate values via output and can terminate the Future via interrupt. However, I'm wondering if there is a way to do this without resorting to side channels, as this is going to make it somewhat difficult for somebody else to maintain my code. We're using Java 7.

This doesn't sound much like a Future to me. Consider instead a Runnable that simply updates an AtomicReference field. Your runnable can update the reference as often as needed, and callers can poll the field whenever they want.
You could have your class implement Supplier if you want it to expose a standard interface for getting a value.
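For illustration only, here is a minimal sketch of that idea, keeping the same "improve until no better" loop from the question. Since this is Java 7, Supplier would have to be Guava's com.google.common.base.Supplier rather than java.util.function.Supplier, and the abstract methods stand in for the question's placeholder comments:
public abstract class Calculation implements Runnable, Supplier<Object> {
    private final AtomicBoolean interrupt = new AtomicBoolean(false);
    private final AtomicReference<Object> best = new AtomicReference<Object>();

    @Override
    public void run() {
        Object previous = computeInitialValue();
        best.set(previous);
        while (!interrupt.get()) {
            Object next = improve(previous);
            if (!isBetter(next, previous)) {
                return;                 // converged: no further improvement
            }
            previous = next;
            best.set(previous);         // publish each intermediate improvement
        }
    }

    @Override
    public Object get() {               // callers poll the latest result whenever they like
        return best.get();
    }

    public void terminate() {           // external signal to stop improving
        interrupt.set(true);
    }

    // These stand in for the question's "calculate initial value" / "calculate next value" placeholders.
    protected abstract Object computeInitialValue();
    protected abstract Object improve(Object previous);
    protected abstract boolean isBetter(Object next, Object previous);
}
The caller starts this on whatever thread or dispatcher it likes, polls get() for intermediate values, and calls terminate() when it wants the computation to stop.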

Related

Java Event Listener return value

I am using Java 8. I have a listener that calls onSuccess with a customToken when it completes.
@Override
public String getCustomToken(Person person) {
    FirebaseAuth.getInstance().createCustomToken(person.getUid()).addOnSuccessListener(new OnSuccessListener<String>() {
        @Override
        public void onSuccess(String customToken) {
            // I would like to return the customToken
        }
    });
    return null;
}
Question
How do I get this method to return the String customToken?
Your question is intriguing, but the accepted answer unfortunately provides you with the wrong means.
The problem with your question is that of API. You are trying to use callbacks in a way they are not designed to be used. A callback, by definition, is supposed to provide a means to do something asynchronously. It is more like a specification of what to do when something happens (in future). Making a synchronous method like getCustomToken() return something that is a result of an inherently asynchronous operation like onSuccess() implies a fundamental disconnect.
While dealing with callbacks, it is critical to understand the importance of continuations: taking actions when certain events of interest happen. Note that these events may not even happen. But you are specifying in the code the actions to take, if and when those events occur. Thus, continuation style is a shift from procedural style.
What adds to the data flow complexity is the syntax of the anonymous inner classes. You tend to think "oh, why can't I just return from here what onSuccess() returns? After all, the code is right here." But imagine that Java had no inner classes (and as you may know, an (anonymous) inner class can easily be replaced by a class that is not an inner class). You'd have needed to do something like:
OnSuccessListener<String> listener = new SomeImplementation();
FirebaseAuth.getInstance().createCustomToken(person.getUid()).addOnSuccessListener(listener);
Now, the code that returned data (String) is gone. You can even visually reason that in this case, there is no way for your method to return a string -- it is simply not there!
So, I encourage you to think of what should happen if and when (in future) onSuccess() is called on the OnSuccessListener instance that you pass in. In other words, think twice if you really want to provide in your API, the getCustomToken() method (that returns a token string, given a Person instance).
If you absolutely must provide such a method, you:
- Should document that the returned token may be null (or something more meaningful like None) and that your clients must try again if they want a valid value.
- Should provide a listener that updates a thread-safe container of tokens that this method reads (see the sketch below).
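As a rough sketch of those two points (the tokenCache field and keying it by uid are assumptions for illustration, not part of the Firebase API):
private final ConcurrentHashMap<String, String> tokenCache = new ConcurrentHashMap<>();

public String getCustomToken(final Person person) {
    FirebaseAuth.getInstance().createCustomToken(person.getUid())
            .addOnSuccessListener(new OnSuccessListener<String>() {
                @Override
                public void onSuccess(String customToken) {
                    tokenCache.put(person.getUid(), customToken); // remember the token for later calls
                }
            });
    return tokenCache.get(person.getUid()); // may be null; callers must retry until a token has arrived
}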
Googling around, I found the Firebase documentation. This also seems to suggest taking an action on success (in a continuation style):
FirebaseAuth.getInstance().createCustomToken(uid)
        .addOnSuccessListener(new OnSuccessListener<String>() {
            @Override
            public void onSuccess(String customToken) {
                // **Send token back to client**
            }
        });
The other problem with trying to provide such an API is the apparent complexity of the code for something trivial. The data flow becomes quite complex and difficult to understand.
If blocking is acceptable to you as a solution, then perhaps you can use the Callable-Future style where you pass a Callable and then later do a get() on the Future that may block. But I am not sure if that is a good design choice here.
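As a rough sketch of that Callable/Future style, assuming blocking the calling thread is acceptable (CompletableFuture is available since the question uses Java 8):
public String getCustomToken(Person person) throws InterruptedException, ExecutionException {
    final CompletableFuture<String> future = new CompletableFuture<>();
    FirebaseAuth.getInstance().createCustomToken(person.getUid())
            .addOnSuccessListener(new OnSuccessListener<String>() {
                @Override
                public void onSuccess(String customToken) {
                    future.complete(customToken); // hand the token to the waiting thread
                }
            });
    return future.get(); // blocks until onSuccess has fired
}
Note that this blocks forever if onSuccess never fires; a timeout via future.get(timeout, unit) would be safer.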
This would work syntactically:
final List<String> tokenContainer = new ArrayList<>();
FirebaseAuth.getInstance().createCustomToken(person.getUid()).addOnSuccessListener(new OnSuccessListener<String>() {
#Override
public void onSuccess(String customToken) {
tokenContainer.add(customToken);
}
});
return tokenContainer.get(0);
As said, this works syntactically. But whether it really works depends on whether the overall flow happens on one thread or on multiple threads.
In other words: if the above code executes in sequence, that list will contain exactly one entry in the end. But if the callback happens on a different thread, you need a more complicated solution. A hackish way could be to prepend
return tokenContainer.get(0);
with
while (tokenContainer.isEmpty()) {
    Thread.sleep(50); // note: sleep throws InterruptedException, which the method must handle or declare
}
return tokenContainer.get(0);
In other words: have the "outer thing" sit and wait for the callback to happen. The saner approach, though, would be to use a field of the surrounding class instead.
Edit: whether the above is regarded as a hack or not might depend on your context to a certain degree. The only thing that really troubles me with your code is the fact that you are creating a new listener, which gets added "somewhere" ... to stay there?! What I mean is: shouldn't there be code to unregister that listener somewhere?
The original accepted answer suggests sleeping the thread, which is a bad solution because you can't know how long the thread needs to sleep. A better solution is to use a semaphore (or similarly, a latch). After the listener gets the value, it releases a semaphore, which allows your thread to return the value, as shown below.
private final AtomicReference<String> tokenReference = new AtomicReference<>();
private final Semaphore semaphore = new Semaphore(0);

public String getCustomToken(Person person) {
    FirebaseAuth.getInstance().createCustomToken(person.getUid()).addOnSuccessListener(customToken -> {
        this.tokenReference.set(customToken);
        this.semaphore.release();
    });
    this.semaphore.acquireUninterruptibly();
    return this.tokenReference.get();
}
Notice also that I used an AtomicReference: for what you asked for to be possible at all, the listener must be called on a different thread from the one that called getCustomToken, and we want the value to be safely published between those threads (I'd guess that behind the scenes Firebase is creating a thread, or this call occurs over the network). Since this.tokenReference will be overwritten, it is possible to get a newer value when getCustomToken is called more than once, which may or may not be acceptable depending on your use case.
Extract a variable into a suitable scope (class attribute or method variable)
private String customToken;

@Override
public String getCustomToken(Person person) {
    FirebaseAuth.getInstance().createCustomToken(person.getUid()).addOnSuccessListener(new OnSuccessListener<String>() {
        @Override
        public void onSuccess(String token) {
            customToken = token; // assign to the enclosing class's field
        }
    });
    return null;
}

Implementing Future interface for shared computation

I'm implementing the Future<Collection<Integer>> interface in order to share the result of some bulk computation among all threads in the application.
In fact, I intended to just put an instance of a class implementing Future<Collection<Integer>> into an ApplicationScope object, so that any other thread that needs the result just asks the object for the Future and calls get() on it, thereby reusing the computation performed by some other thread.
My question is about implementing the cancel method. For now, I would write something like this:
public class CustomerFutureImpl implements Future<Collection<Integer>> {

    private Thread computationThread;
    private boolean started;
    private boolean cancelled;
    private Collection<Integer> computationResult;

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
        if (computationResult != null)
            return false;
        if (!started) {
            cancelled = true;
            return true;
        } else {
            if (mayInterruptIfRunning)
                computationThread.interrupt();
            return true; // incomplete, see the question below
        }
    }

    // The rest of the methods
}
But the method implementation doesn't satisfy the documentation of Future, because we need to throw CancellationException in any thread awaiting the result (i.e., any thread that has called the get() method).
Should I add one more field, like private Collection<Thread> waitingForTheResultThreads;, then interrupt each thread from the Collection, catch the InterruptedException, and throw new CancellationException()?
The thing is that such a solution seems kind of weird to me... Not sure about that.
Generally you should avoid implementing Future directly at all. Concurrency code is very hard to get right, and frameworks for distributed execution - notably ExecutorService - will provide Future instances referencing the units of work you care about.
You may know that already and are intentionally creating a new similar service, but I feel it's important to call out that for the vast majority of use cases, you should not need to define your own Future implementation.
You might want to look at the concurrency tools Guava provides, in particular ListenableFuture, which is a sub-interface of Future that provides additional features.
Assuming that you really do want to define a custom Future type, use Guava's AbstractFuture implementation as a starting point, so that you don't have to reinvent the complex details you're running into.
To your specific question, if you look at the implementation of AbstractFuture.get(), you'll see that it's implemented with a while loop that looks for value to become non-null, at which time it calls getDoneValue() which either returns the value or raises a CancellationException. So essentially, each thread that is blocking on a call to Future.get() is polling the Future.value field every so often and raising a CancellationException if it detects that the Future has been cancelled. There's no need to keep track of a Collection<Thread> or anything of the sort, since each thread can inspect the state of the Future independently, and return or throw as needed.
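To make that pattern concrete, here is a minimal wait/notify sketch of the same idea. This is not Guava's actual code, just an illustration of each waiting thread independently deciding whether to return a value or throw:
public final class SimpleSharedFuture<V> {
    private final Object lock = new Object();
    private V value;
    private boolean cancelled;

    public void set(V result) {
        synchronized (lock) {
            value = result;
            lock.notifyAll();       // wake up every thread blocked in get()
        }
    }

    public void cancel() {
        synchronized (lock) {
            cancelled = true;
            lock.notifyAll();
        }
    }

    public V get() throws InterruptedException {
        synchronized (lock) {
            while (value == null && !cancelled) {
                lock.wait();        // each waiter inspects the state independently
            }
            if (cancelled) {
                throw new java.util.concurrent.CancellationException();
            }
            return value;
        }
    }
}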

ParallelStreams in java

I'm trying to use parallel streams to call an API endpoint to get some data back. I am using an ArrayList<String> and sending each String to a method that uses it in making a call to my API. I have set up parallel streams to call a method that will call the endpoint and marshal the data that comes back. The problem for me is that when viewing this in htop I see ALL the cores on the db server light up the second I hit this method ... then as the first group finishes I see 1 or 2 cores light up. My issue here is that I think I am truly getting the result I want ... but only for the first set of calls; from monitoring it looks like the rest of the calls get made one at a time.
I think it may have something to do with the recursion but I'm not 100% sure.
private void generateObjectMap(Integer count) {
    ArrayList<String> myList = getMyList();
    myList.parallelStream().forEach(f -> performApiRequest(f, count));
}

private void performApiRequest(String myString, Integer count) {
    if (count < 10) {
        TreeMap<Integer, TreeMap<Date, MyObj>> tempMap = new TreeMap<>();
        try {
            tempMap = myJson.getTempMap(myRestClient.executeGet(myString));
        } catch (SocketTimeoutException e) {
            count += 1;
            performApiRequest(myString, count);
        }
        ...
    } else {
        System.exit(1);
    }
}
This seems an unusual use for parallel streams. In general the idea is that you are informing the JVM that the operations on the stream are truly independent and can run in any order, in one thread or multiple. The results will subsequently be reduced or collected as part of the stream. The important point to remember here is that side effects are undefined (which is why variables changed in streams need to be final or effectively final) and you shouldn't be relying on how the JVM organises execution of the operations.
I can imagine the following being a reasonable usage:
list.parallelStream().map(item -> getDataUsingApi(item))
    .collect(Collectors.toList());
Where the api returns data which is then handed to downstream operations with no side effects.
So, in conclusion, if you want tight control over how the API calls are executed, I would recommend not using parallel streams for this. Traditional Thread instances, possibly with a ThreadPoolExecutor, will serve you much better, as sketched below.
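A rough sketch of that alternative, reusing performApiRequest and getMyList from the question (the method name generateObjectMapWithPool and the pool size of 8 are arbitrary choices for illustration):
private void generateObjectMapWithPool() throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(8); // explicit control over concurrency
    try {
        List<Future<?>> pending = new ArrayList<>();
        for (final String item : getMyList()) {
            pending.add(pool.submit(new Runnable() {
                @Override
                public void run() {
                    performApiRequest(item, 0);
                }
            }));
        }
        for (Future<?> f : pending) {
            f.get(); // wait for each request, surfacing any failure
        }
    } finally {
        pool.shutdown();
    }
}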

Java : How to return intermediate results from a Thread

Using Java 7
I am trying to build a watcher that watches a data store (some collection type) and then will return certain items from it at certain points.
In this case they are timestamps; when a timestamp passes the current time, I want it to be returned to the starting thread. Please see the code below.
@Override
public void run() {
    while (!data.isEmpty()) {
        for (LocalTime dataTime : data) {
            if (new LocalTime().isAfter(dataTime)) {
                // return a result but continue running
            }
        }
    }
}
I have read about Futures and Callables, but they seem to stop the thread on a return.
I do not particularly want to return a value, stop the thread, and then start another task (as with a Callable), unless that is the best way.
What are the best techniques to use for this? There seems to be such a wide range of ways to do it.
Thanks
You can put the intermediate results in a BlockingQueue so that they are available to consumer threads as and when they are produced:
private final LinkedBlockingQueue<Result> results = new LinkedBlockingQueue<Result>();

@Override
public void run() {
    while (!data.isEmpty()) {
        for (LocalTime dataTime : data) {
            if (new LocalTime().isAfter(dataTime)) {
                results.offer(result); // result built from dataTime; offer never blocks on an unbounded queue
            }
        }
    }
}

public Result takeResult() throws InterruptedException {
    return results.take();
}
Consumer threads can simply call the takeResult method to use the intermediate results. The advantage of using a BlockingQueue is that you don't have to reinvent the wheel, since this looks like a typical producer-consumer scenario that can be solved with a blocking data structure.
Note: here, Result can be a POJO that represents the intermediate result object.
You are on the right path, assuming proper synchronization is in place and you get all your timestamps on time :) You should ideally choose a data structure that doesn't require you to scan through all the items. Choose something like a min heap or an ascending/descending sorted list; when you iterate, delete the element from this data store and put it on a BlockingQueue, and have a thread listening on this queue to proceed further (a rough sketch follows below).
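For illustration only, a minimal sketch of that idea, using epoch-millisecond timestamps instead of the question's LocalTime to keep it dependency-free (the field names are assumptions):
private final PriorityBlockingQueue<Long> data = new PriorityBlockingQueue<Long>();
private final LinkedBlockingQueue<Long> results = new LinkedBlockingQueue<Long>();

@Override
public void run() {
    while (!data.isEmpty()) {
        Long earliest = data.peek();                        // head of the min heap = earliest timestamp
        if (earliest != null && earliest <= System.currentTimeMillis()) {
            data.remove(earliest);                          // delete it from the data store, as suggested above
            results.offer(earliest);                        // hand it to whoever is listening on the queue
        }
    }
}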

How should I maintain a cache of values read from a file?

Setup
There is a program running that is performing arbitrary computations and writing a status (an integer value, representing progress) to a file. The integer values can only be incremented.
Now I am developing another application that can (among other things) perform arithmetic operations, e.g., comparisons, on those integer values. The files are repeatedly deleted and rewritten by a different program. As such, there is no guarantee that a file exists at any given time.
Basically, the application needs to execute something arbitrary, but has a constraint on the other program's progress, i.e., it may only execute something if the other program has done enough work.
Problem
When performing the arithmetic operations, the application should not care about where the integer values come from. In particular, accessing those integer values must not throw an exception. How should I isolate all the bad things that can happen when performing I/O access?
Note that I do not want the execution thread to block until a value can be read from the file. E.g., if the file system dies somehow, the integer values will not be updated, but the main thread should still continue to work. This desire is driven by the definition of the arithmetic comparison as a predicate, which has exactly two outcomes, true and false, but no third "error" outcome. That's why I think the values read from the file need to be cached somehow.
Limitation
Java 1.7, Scala 2.11
Current Approach
I have a solution that looks as if it would work, but I am not sure whether something could go wrong.
The solution is to maintain a cache of those integer values for each file. The core functionality is provided by the getters of the cache, while a separate "updater" thread constantly reads the files and updates the caches.
If an error occurs, the producer should take notice (i.e., log the error) but continue to run, because an incomplete computation should not affect subsequent computations.
A minimal example of what I am currently doing would look something like this:
object Application {

  def main(args: Array[String]) {
    val caches = args.map(filename => new Cache(Paths.get(filename)))
    val producer = new Thread(new Updater(caches))
    producer.start()
    execute(caches)
    producer.interrupt()
  }

  def execute(values: Seq[AccessValue]) {
    while (values.head.getValue < 5) {/* This should never throw an exception */}
  }
}

class Updater(caches: Array[Cache]) extends Runnable {

  def run() {
    var interrupted = false
    while (!interrupted) {
      caches.foreach { cache =>
        try {
          val input = Files.newInputStream(cache.file)
          cache.updateValue(parse(input))
        } catch {
          case _: InterruptedException =>
            interrupted = true
          case t: Throwable =>
            log.error(t)
            /* continue as if nothing happened */
        }
      }
    }
  }

  def parse(input: InputStream): Int = input.read() /* In reality, some xml parsing */
}

trait AccessValue {
  def getValue: Int // should not throw an exception
}

class Cache(val file: Path) extends AccessValue {
  private var value = 0

  def getValue = value

  def updateValue(newValue: Int) { value = newValue }
}
Doing it like this works on a synthetic test setup, but I am wondering whether something bad can happen. Also, if anyone would approach the problem differently, I would be glad to hear how.
Could there be a throwable that could cause other threads to go wild? I am thinking of something like OutOfMemoryError or StackOverflowError. Would I need to handle them differently, or does it not matter because, e.g., the whole application would die anyway?
What would happen if the InterruptedException is thrown outside the try block, or even in the catch block? Is there a better way to terminate a thread?
Must the member value of class Cache be declared volatile? I do not care much about the ordering of reads and writes, but the compiler must not "optimize" reading the value away just because it deduces that the value is constant.
There are a lot of different concurrency-related libraries. Do you suggest using something other than new Thread(...).start()? If yes, what facility do you suggest? I know of Scala's ExecutionContext, Futures, and Java's Executors class, which provides various static constructors for thread pools. However, I have never used any of these before and I do not know their advantages and disadvantages. I also stumbled upon the name "Akka", but my guess is that using Akka is overkill for what I want to achieve.
Thank you
I would recommend reading through Oracle's documentation on concurrency.
When one thread writes a value and a different thread reads that value, you should always use a synchronized block or declare the value as volatile. Otherwise there is no guarantee that the value written by one thread is visible to the other thread (see Oracle's documentation on establishing a happens-before relationship).
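As a minimal Java sketch of that point (the question's Cache reduced to the field in question; without volatile, the reader thread may never observe the updater's writes):
class Cache {
    private volatile int value; // volatile establishes the happens-before needed for visibility

    int getValue() { return value; }

    void updateValue(int newValue) { value = newValue; }
}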
An OutOfMemoryError can influence the other threads, as the heap space to which the OutOfMemoryError refers is shared among threads. A StackOverflowError would kill only the thread in which it occurs, because each thread has its own stack.
If you do not need some sort of synchronization between the two threads then you probably do not need any Futures or Executors.
