Proper termination of a stuck Couchbase Observable

I'm trying to delete a batch of couchbase documents in rapid fashion according to some constraint (or update the document if the constraint isn't satisfied). Each deletion is dubbed a "parcel" according to my terminology.
When executing, I run into very strange behavior: the thread in charge of this task works as expected for a few iterations (at best). After this "grace period", Couchbase gets "stuck" and the Observable doesn't call any of its Subscriber's methods (onNext, onCompleted, onError) within the defined period of 30 seconds.
When the latch timeout occurs (see implementation below), the method returns but the Observable keeps executing (I noticed this because it kept printing debug messages while I was stopped at a breakpoint outside the scope of this method).
I suspect Couchbase is stuck because after a few seconds, many Observables are left in some kind of "ghost" state: alive and reporting to their Subscribers, which in turn have nothing to do because the method in which they were created has already finished, eventually leading to java.lang.OutOfMemoryError: GC overhead limit exceeded.
I don't know if what I claim here makes sense, but I can't think of another reason for this behavior.
How should I properly terminate an Observable upon timeout? Should I? Any other way around?
public List<InfoParcel> upsertParcels(final Collection<InfoParcel> parcels) {
final CountDownLatch latch = new CountDownLatch(parcels.size());
final List<JsonDocument> docRetList = new LinkedList<JsonDocument>();
Observable<JsonDocument> obs = Observable
.from(parcels)
.flatMap(parcel ->
Observable.defer(() ->
{
return bucket.async().get(parcel.key).firstOrDefault(null);
})
.map(doc -> {
// In-memory manipulation of the document
return updateDocs(doc, parcel);
})
.flatMap(doc -> {
boolean shouldDelete = ... // Decide by inner logic
if (shouldDelete) {
if (doc.cas() == 0) {
return Observable.just(doc);
}
return bucket.async().remove(doc);
}
return (doc.cas() == 0 ? bucket.async().insert(doc) : bucket.async().replace(doc));
})
);
obs.subscribe(new Subscriber<JsonDocument>() {
@Override
public void onNext(JsonDocument doc) {
docRetList.add(doc);
latch.countDown();
}
@Override
public void onCompleted() {
// Due to a bug in RxJava, onError() / retryWhen() does not intercept exceptions thrown from within the map/flatMap methods.
// Therefore, we need to recalculate the "conflicted" parcels and send them for update again.
while (latch.getCount() > 0) {
latch.countDown();
}
}
@Override
public void onError(Throwable e) {
// Same reason as above
while (latch.getCount() > 0) {
latch.countDown();
}
}
});
latch.await(30, TimeUnit.SECONDS);
// Recalculating remaining failed parcels and returning them for another cycle of this method (there's a loop outside)
}

I think this is indeed due to the fact that using a countdown latch doesn't signal the source that the flow of data processing should stop.
You could use more of RxJava here: use toList().timeout(30, TimeUnit.SECONDS).toBlocking().single() instead of collecting into an external list (which is unsynchronized and thus unsafe) and instead of using the CountDownLatch.
This will block until a List of your documents is returned.
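The underlying point, that a timeout should also cancel the outstanding work rather than merely stop waiting for it, can be illustrated without RxJava. The sketch below is a JDK-only analogy and not the Couchbase or RxJava API: ExecutorService.invokeAll with a timeout cancels every task that has not completed when the deadline passes. The class name TimeoutCancelSketch and the simulated tasks are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TimeoutCancelSketch {
    /** Runs the tasks and returns only the results that finished within the timeout. */
    static List<String> runWithTimeout(List<Callable<String>> tasks, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        List<String> done = new ArrayList<>();
        try {
            // invokeAll cancels every task that is still unfinished when the timeout expires.
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<String> f : futures) {
                if (!f.isCancelled()) {
                    try {
                        done.add(f.get());
                    } catch (Exception e) {
                        // a failed task is simply skipped in this sketch
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return done;
    }

    /** Hypothetical demo: one fast task and one that would run far past the timeout. */
    static List<String> demo() {
        Callable<String> fast = () -> "fast";
        Callable<String> stuck = () -> { Thread.sleep(60_000); return "stuck"; };
        return runWithTimeout(Arrays.asList(fast, stuck), 500);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Unlike a bare latch timeout, nothing keeps running in the background after the method returns, which is the property the RxJava timeout operator is meant to give you as well.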

When you create your Couchbase environment in code, set computationPoolSize to something large. When the Couchbase client runs out of threads while using async, it just stops working and won't ever call the callback.

Related

Waiting for something to finish, should I use thread.sleep or ReentrantLock?

I have a Java program. The logic is as follow:
place order out (relying on Interactive Broker / Binance API)
Once the order is filled (there will be a callback from the API), immediately execute a method called "calculateSomething"
The order is placed using Interactive Broker / Binance API. Once the order is filled, the API callback method will return a message.
The problem is that I do not know how to write the code that detects that the order has been filled, so I can immediately execute the "calculateSomething" method with minimal waiting time.
I can think of two ways:
while loop and thread.sleep
ReentrantLock.
Method 1 works, but it's not instantaneous. Hence, I am exploring ReentrantLock, though I am not sure the code is correct. Which method is the most efficient and can execute "calculateSomething" immediately once the order is completed? If there is a more efficient approach, please give me some help, as I have been stuck on this problem for many days.
pseudocode below.
Method 1 - thread.sleep
placeOrder(); // place order to binance <- API method
while(order is not completed){
Thread.sleep(1000)
if(order is completed){
return
}
}
calculateSomething();
Method 2 - ReentrantLock
ReentrantLock lock = new ReentrantLock();
lock.lock();
System.out.println("1. Locked");
try {
while(lock.isLocked()) {
if(isOrderCompleted() == true){
lock.unlock();
}
}
} catch(Exception e){
e.printStackTrace();
}finally {
if(lock.isLocked()) {
lock.unlock();
}
}
calculateSomething();

You can have a blocking queue.
BlockingQueue<?> finishedOrders = new ArrayBlockingQueue<>(512);
Then you have a loop that processes finished orders.
public void processFinishedOrders() throws InterruptedException{
while(!Thread.interrupted()){
finishedOrders.take();
doSomethingRelevant();
}
}
I would also suggest populating finishedOrders with a meaningful class.
BlockingQueue<Order> finishedOrders;
Order fin = finishedOrders.take();
doSomethingRelevant( fin );
That way the thread waiting on the API call can create an order and add it to the finished orders queue, and the processing thread will have the relevant information.
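Put together, the hand-off might look like the following minimal sketch. It assumes the broker API invokes a callback on its own thread; the names OrderHandoffSketch, Order, onOrderFilled and the order ids are hypothetical and not part of the Binance or Interactive Brokers APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class OrderHandoffSketch {
    /** Hypothetical stand-in for whatever the broker API reports about a filled order. */
    static final class Order {
        final String id;
        Order(String id) { this.id = id; }
    }

    private final BlockingQueue<Order> finishedOrders = new ArrayBlockingQueue<>(512);
    private final List<String> calculated = new ArrayList<>();

    /** Called from the API's callback thread when an order is filled. */
    void onOrderFilled(String orderId) throws InterruptedException {
        finishedOrders.put(new Order(orderId));
    }

    /** Runs on the processing thread and wakes up as soon as an order arrives. */
    void processFinishedOrders(int expected) throws InterruptedException {
        for (int i = 0; i < expected; i++) {
            Order filled = finishedOrders.take(); // blocks without polling or sleeping
            calculateSomething(filled);
        }
    }

    private void calculateSomething(Order order) {
        calculated.add(order.id);
    }

    /** Simulates two callbacks arriving from another thread and returns the processed ids. */
    static List<String> demo() {
        OrderHandoffSketch sketch = new OrderHandoffSketch();
        Thread callbackThread = new Thread(() -> {
            try {
                sketch.onOrderFilled("order-1");
                sketch.onOrderFilled("order-2");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        callbackThread.start();
        try {
            sketch.processFinishedOrders(2);
            callbackThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sketch.calculated;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because take() parks the thread until an element is available, the processing thread reacts as soon as the callback enqueues the order, with no sleep interval to wait out.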

is it safe to store threads in a ConcurrentMap?

I am building a backend service whereby a REST call to my service creates a new thread. The thread waits for another REST call; if it does not receive anything within, say, 5 minutes, the thread will die.
To keep track of all the currently running threads I have a collection, so that when the REST call finally comes in (such as a user accepting or declining an action) I can identify the thread using the userID. If it's declined we just remove that thread from the collection; if it's accepted the thread can carry on with the next action. I have implemented this using a ConcurrentMap to avoid concurrency issues.
Since this is my first time working with threads I want to make sure that I am not overlooking any issues that may arise. Please have a look at my code and tell me if I could do it better or if there's any flaws.
public class UserAction extends Thread {
int userID;
boolean isAccepted = false;
boolean isDeclined = false;
long timeNow = System.currentTimeMillis();
long timeElapsed = timeNow + 50000;
public UserAction(int userID) {
this.userID = userID;
}
public void declineJob() {
this.isDeclined = true;
}
public void acceptJob() {
this.isAccepted = true;
}
public boolean waitForApproval(){
while (System.currentTimeMillis() < timeElapsed){
System.out.println("waiting for approval");
if (isAccepted) {
return true;
} else if (isDeclined) {
return false;
}
}
return isAccepted;
}
@Override
public void run() {
if (!waitForApproval()) {
// mustve timed out or user declined so remove from list and return thread immediately
Controller.tCollection.remove(userID);
// end the thread here
return;
}
// mustve been accepted so continue working
}
}
public class Controller {
public static ConcurrentHashMap<Integer, UserAction> tCollection = new ConcurrentHashMap<>();
public static void main(String[] args) throws InterruptedException {
int barberID1 = 1;
int barberID2 = 2;
tCollection.put(barberID1, new UserAction(barberID1));
tCollection.put(barberID2, new UserAction(barberID2));
tCollection.get(barberID1).start();
tCollection.get(barberID2).start();
Thread.sleep(1000);
// simulate REST call accepting/declining job after 1 second. Usually this would be in a spring mvc RESTcontroller in a different class.
tCollection.get(barberID1).acceptJob();
tCollection.get(barberID2).declineJob();
}
}

You don't need (explicit) threads for this. Just a shared pool of task objects that are created on the first rest call.
When the second rest call comes, you already have a thread to use (the one that's handling the rest call). You just need to retrieve the task object according to the user id. You also need to get rid of expired tasks, which can be done with for example a DelayQueue.
Pseudocode:
public void rest1(User u) {
UserTask ut = new UserTask(u);
pool.put(u.getId(), ut);
delayPool.put(ut); // Assuming UserTask implements Delayed with a 5 minute delay
}
public void rest2(User u, Action a) {
UserTask ut = pool.get(u.getId());
if(!a.isAccepted() || ut == null)
pool.remove(u.getId());
else
process(ut);
// Clean up the pool from any expired tasks, can also be done in the beginning
// of the method, if you want to make sure that expired actions aren't performed
UserTask expired;
while ((expired = delayPool.poll()) != null)
pool.remove(expired.getId());
}
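The pseudocode assumes that UserTask implements Delayed. A minimal sketch of what that could look like is below; the constructor signature, the field names and the countExpired helper are assumptions made for illustration. DelayQueue.poll() only hands back elements whose delay has run out, which is what makes the clean-up loop work.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class UserTask implements Delayed {
    private final int userId;
    private final long expiresAtNanos;

    UserTask(int userId, long ttl, TimeUnit unit) {
        this.userId = userId;
        this.expiresAtNanos = System.nanoTime() + unit.toNanos(ttl);
    }

    int getId() { return userId; }

    /** Remaining time until expiry; zero or negative means the task has expired. */
    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(expiresAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    /** Orders tasks by remaining delay, as DelayQueue requires. */
    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                other.getDelay(TimeUnit.NANOSECONDS));
    }

    /** Removes every expired task from the queue and returns how many there were. */
    static int countExpired(DelayQueue<UserTask> delayPool) {
        int expired = 0;
        while (delayPool.poll() != null) {
            expired++;
        }
        return expired;
    }

    public static void main(String[] args) {
        DelayQueue<UserTask> delayPool = new DelayQueue<>();
        delayPool.put(new UserTask(1, 0, TimeUnit.MINUTES)); // already expired
        delayPool.put(new UserTask(2, 5, TimeUnit.MINUTES)); // expires in five minutes
        System.out.println(countExpired(delayPool));
    }
}
```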

There's a synchronization issue: you should make your flags isAccepted and isDeclined instances of AtomicBoolean (or at least declare them volatile).
A critical concept is that you need to take steps to make sure changes to memory in one thread are communicated to other threads that need that data. They're called memory fences and they often occur implicitly between synchronization calls.
The idea of a (simple) Von Neumann architecture with a 'central memory' is false for most modern machines and you need to know data is being shared between caches/threads correctly.
Also as others suggest, creating a thread for each task is a poor model. It scales badly and leaves your application vulnerable to keeling over if too many tasks are submitted. There is some limit to memory so you can only have so many pending tasks at a time but the ceiling for threads will be much lower.
That will be made all the worse because you're spin waiting. Spin waiting puts a thread into a loop waiting for a condition. A better model would wait on a condition variable (java.util.concurrent.locks.Condition in Java) so threads not doing anything (other than waiting) could be suspended by the operating system until notified that the thing they're waiting for is (or may be) ready.
There are often significant overheads in time and resources to creating and destroying threads. Given that most platforms can be simultaneously only executing a relatively small number of threads creating lots of 'expensive' threads to have them spend most of their time swapped out (suspended) doing nothing is very inefficient.
The right model launches a pool of a fixed number of threads (or relatively fixed number) and places tasks in a shared queue that the threads 'take' work from and process.
That model is known generically as a "Thread Pool".
The entry level implementation you should look at is ThreadPoolExecutor:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
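As a rough illustration of waiting on a condition instead of spinning, here is a minimal sketch using ReentrantLock and Condition from the JDK. The class ApprovalSketch and its method names are hypothetical, and the decision is kept in a lock-guarded field rather than in AtomicBoolean flags, which is one possible design among several.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class ApprovalSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition decided = lock.newCondition();
    private Boolean accepted; // null until the user decides; guarded by lock

    /** Called by the thread handling the second REST call. */
    void decide(boolean accept) {
        lock.lock();
        try {
            accepted = accept;
            decided.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Sleeps until a decision arrives or the timeout passes; true only if accepted. */
    boolean waitForApproval(long timeout, TimeUnit unit) {
        long nanosLeft = unit.toNanos(timeout);
        lock.lock();
        try {
            while (accepted == null) {
                if (nanosLeft <= 0) {
                    return false; // timed out without a decision
                }
                // awaitNanos releases the lock while waiting and returns the time left
                nanosLeft = decided.awaitNanos(nanosLeft);
            }
            return accepted;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    /** Hypothetical demo: another thread accepts while this one waits. */
    static boolean demoAccepted() {
        ApprovalSketch approval = new ApprovalSketch();
        new Thread(() -> approval.decide(true)).start();
        return approval.waitForApproval(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        System.out.println(demoAccepted());
    }
}
```

The waiting thread consumes no CPU while parked, and the loop around awaitNanos guards against spurious wake-ups.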

Non blocking function that preserves order

I have the following method:
void store(SomeObject o) {
}
The idea of this method is to store o to permanent storage, but the function should not block, i.e. I cannot/must not do the actual storage in the same thread that called store.
I also cannot start a thread and store the object from that other thread, because store might be called a "huge" number of times and I don't want to start spawning threads.
So I see two options, neither of which I can see working well:
1) Use a thread pool (Executor family)
2) In store, put the object in an array list and return. When the array list reaches e.g. 1000 entries (arbitrary number), start another thread to "flush" the array list to storage. But I would still possibly have the problem of too many threads (thread pool?)
In both cases the one requirement I have is that the objects are stored persistently in exactly the same order in which they were passed to store, and using multiple threads mixes things up.
How can this be solved?
How can I ensure:
1) Non blocking store
2) Accurate insertion order
3) I don't care about any storage guarantees. If e.g. something crashes I don't care about losing data e.g. cached in the array list before storing them.

I would use a SingleThreadExecutor and a BlockingQueue.
SingleThreadExecutor, as the name says, has one single thread. Use it to poll from the queue and persist objects, blocking if the queue is empty.
In your store method you can add to the queue without blocking.
EDIT
Actually, you do not even need that extra queue. The JavaDoc of newSingleThreadExecutor says:
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So I think it's exactly what you need.
private final ExecutorService persistor = Executors.newSingleThreadExecutor();
public void store( final SomeObject o ){
persistor.submit( new Runnable(){
@Override public void run(){
// your persist-code here.
}
} );
}
The advantage of using a Runnable that has a quasi-endless-loop and using an extra queue would be the possibility to code some "Burst"-functionality. For example you could make it wait to persist only when 10 elements are in queue or the oldest element has been added at least 1 minute ago ...
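A minimal sketch of that "burst" idea follows, using a smaller batch size and wait time than the example figures above so the demo finishes quickly. The class BurstWriterSketch, its fields and the in-memory persisted list (standing in for real storage) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class BurstWriterSketch {
    private static final Object END = new Object(); // marker that stops the worker
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private final List<Object> persisted = new ArrayList<>(); // stands in for real storage
    private final int batchSize;
    private final long maxWaitMillis;
    private final Thread worker;

    BurstWriterSketch(int batchSize, long maxWaitMillis) {
        this.batchSize = batchSize;
        this.maxWaitMillis = maxWaitMillis;
        this.worker = new Thread(this::drainLoop);
        this.worker.start();
    }

    /** Non-blocking for callers: the queue is unbounded, so add never waits. */
    void store(Object o) { queue.add(o); }

    private void drainLoop() {
        List<Object> batch = new ArrayList<>();
        try {
            while (true) {
                Object next = queue.take(); // sleep until there is something to do
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
                // Collect until the batch is full, the oldest element is too old, or END arrives.
                while (next != null && next != END) {
                    batch.add(next);
                    if (batch.size() >= batchSize) break;
                    next = queue.poll(deadline - System.nanoTime(), TimeUnit.NANOSECONDS);
                }
                persisted.addAll(batch); // one "write" per batch, in arrival order
                batch.clear();
                if (next == END) return;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Flushes what is queued, stops the worker and returns everything persisted. */
    List<Object> close() {
        queue.add(END);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return persisted;
    }

    /** Hypothetical demo: 25 items, batches of 10, at most 50 ms of waiting per batch. */
    static List<Object> demo() {
        BurstWriterSketch writer = new BurstWriterSketch(10, 50);
        for (int i = 0; i < 25; i++) {
            writer.store(i);
        }
        return writer.close();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Since a single worker drains a FIFO queue, batching changes how often storage is touched but not the order in which objects are written.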

I suggest using a Chronicle-Queue which is a library I designed.
It allows you to write in the current thread without blocking. It was originally designed for low latency trading systems. For small messages it takes around 300 ns to write a message.
You don't need to use a background thread or an on-heap queue, and it doesn't wait for the data to be written to disk by default. It also ensures consistent order for all readers. If the program dies at any point after you call finish(), the message is not lost (unless the OS crashes or loses power). It also supports replication to avoid data loss.

Have one separate thread that gets items from the end of a queue (blocking on an empty queue), and writes them to disk. Your main thread's store() function just adds items to the beginning of the queue.
Here's a rough idea (though I assume there will be cleaner or faster ways for doing this in production code, depending on how fast you need things to be):
import java.util.*;
import java.io.*;
import java.util.concurrent.*;
class ObjectWriter implements Runnable {
private final Object END = new Object();
BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
public void store(Object o) throws InterruptedException {
queue.put(o);
}
public ObjectWriter() {
new Thread(this).start();
}
public void close() throws InterruptedException {
queue.put(END);
}
public void run() {
while (true) {
try {
Object o = queue.take();
if (o == END) {
// close output file.
return;
}
System.out.println(o.toString()); // serialize as appropriate
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // preserve the interrupt and stop
return;
}
}
}
}
public class Test {
public static void main(String[] args) throws Exception {
ObjectWriter w = new ObjectWriter();
w.store("hello");
w.store("world");
w.close();
}
}

The comments in your question make it sound like you are unfamiliar with multi-threading, but it's really not that difficult.
You simply need another thread, responsible for writing to the storage, which picks items off a queue. Your store function just adds the objects to the in-memory queue and continues on its way.
Some pseudo-ish code:
final List<SomeObject> queue = new LinkedList<SomeObject>();
void store(SomeObject o) {
// add it to the queue - note that modifying o after this will also alter the
// instance in the queue
synchronized(queue) {
queue.add(o);
queue.notify(); // tell the storage thread there's something in the queue
}
}
void storageThread() throws InterruptedException {
SomeObject item;
while (notfinished) {
synchronized(queue) {
if (!queue.isEmpty()) {
item = queue.remove(0); // take from the start to ensure same order
} else {
// wait for something
queue.wait();
continue;
}
}
writeToStorage(item);
}
}

Rejection handler in Executors.newScheduledThreadPool

I have an ArrayBlockingQueue on which a single-threaded, fixed-rate scheduled executor works.
A task may fail. When it does, I want to re-run it, or re-insert it into the queue at high priority (at the top).

Some thoughts here:
Why are you using ArrayBlockingQueue and not PriorityBlockingQueue? It sounds like what you need. At first, give all your elements equal priority.
In case you receive an exception, re-insert the task into the queue with a higher priority.
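A minimal sketch of that idea, assuming a hypothetical Job wrapper in which a lower priority value runs first and a sequence number keeps first-in-first-out order among jobs of equal priority (PriorityBlockingQueue itself gives no ordering guarantee for ties). The simulated failure is only for the demo, which uses poll() so that it terminates; a real worker would block on take().

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class RetryQueueSketch {
    /** Hypothetical task wrapper: lower priority value runs first, seq breaks ties. */
    static final class Job implements Comparable<Job> {
        private static final AtomicLong COUNTER = new AtomicLong();
        final String name;
        final int priority;
        final long seq = COUNTER.getAndIncrement();

        Job(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }

        /** A copy of this job that sorts ahead of its original priority level. */
        Job asRetry() {
            return new Job(name, priority - 1);
        }

        @Override
        public int compareTo(Job other) {
            int byPriority = Integer.compare(priority, other.priority);
            return byPriority != 0 ? byPriority : Long.compare(seq, other.seq);
        }
    }

    /** Simulates a failure of the first job and returns the resulting execution order. */
    static String demo() {
        PriorityBlockingQueue<Job> queue = new PriorityBlockingQueue<>();
        queue.add(new Job("a", 0));
        queue.add(new Job("b", 0));
        queue.add(new Job("c", 0));
        StringBuilder order = new StringBuilder();
        boolean failedOnce = false;
        Job job;
        while ((job = queue.poll()) != null) {
            order.append(job.name);
            if (!failedOnce) { // pretend the very first run fails
                failedOnce = true;
                queue.add(job.asRetry()); // jumps ahead of "b" and "c"
            }
        }
        return order.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "aabc"
    }
}
```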

Simplest thing might be a priority queue. Attach a retry number to the task. It starts as zero. After an unsuccessful run, throw away all the ones and increment the zeroes and put them back in the queue at a high priority. With this method, you can easily decide to run everything three times, or more, if you want to later. The down side is you have to modify the task class.
The other idea would be to set up another, non-blocking, thread-safe, high-priority queue. When looking for a new task, you check the non-blocking queue first and run what's there. Otherwise, go to the blocking queue. This might work for you as is, and so far it's the simplest solution. The problem is the high priority queue might fill up while the scheduler is blocked on the blocking queue.
To get around this, you'd have to do your own blocking. Both queues should be non-blocking. (Suggestion: java.util.concurrent.ConcurrentLinkedQueue.) After polling both queues with no results, wait() on a monitor. When anything puts something in a queue, it should call notifyAll() and the scheduler can start up again. Great care is needed lest the notification occur after the scheduler has checked both queues but before it calls wait().
Addition:
Prototype code for third solution with manual blocking. Some threading is suggested, but the reader will know his/her own situation best. Which bits of code are apt to block waiting for a lock, which are apt to tie up their thread (and core) for minutes while doing extensive work, and which cannot afford to sit around waiting for the other code to finish all needs to be considered. For instance, if a failed run can immediately be rerun on the same thread with no time-consuming cleanup, most of this code can be junked.
private final ConcurrentLinkedQueue<Runnable> mainQueue = new ConcurrentLinkedQueue<Runnable>();
private final ConcurrentLinkedQueue<Runnable> prioQueue = new ConcurrentLinkedQueue<Runnable>();
private final Object entryWatch = new Object();
/** Adds a new job to the queue. */
public void addjob( Runnable runjob ) {
mainQueue.add( runjob );
synchronized (entryWatch) { entryWatch.notifyAll(); }
}
/** The endless loop that does the work. */
public void schedule() {
for (;;) {
Runnable run = getOne(); // Avoids lock if successful.
if (run == null) {
// Both queues are empty.
synchronized (entryWatch) {
// Need to check again. Someone might have added and notifiedAll
// since last check. From this point until, wait, we can be sure
// entryWatch is not notified.
run = getOne();
if (run == null) {
// Both queues are REALLY empty.
try { entryWatch.wait(); }
catch (InterruptedException ie) {}
}
}
}
// Still null right after a wake-up; loop around and poll the queues again.
if (run != null) runit( run );
}
}
/** Helper method for the endless loop. */
private Runnable getOne() {
Runnable run = (Runnable) prioQueue.poll();
if (run != null) return run;
return (Runnable) mainQueue.poll();
}
/** Runs a new job. */
public void runit( final Runnable runjob ) {
// Do everything in another thread. (Optional)
new Thread() {
@Override public void run() {
// Run run. (Possibly in own thread?)
// (Perhaps best in thread from a thread pool.)
runjob.run();
// Handle failure (runit only, NOT in runitLast).
// Defining "failure" left as exercise for reader.
if (failure) {
// Put code here to handle failure.
// Put back in queue.
prioQueue.add( runjob );
synchronized (entryWatch) { entryWatch.notifyAll(); }
}
}
}.start();
}
/** Reruns a job. */
public void runitLast( final Runnable runjob ) {
// Same code as "runit", but don't put "runjob" in "prioQueue" on failure.
}

Threadsafe double buffered cache (not for graphics) in Java?

I was recently looking for a way to implement a doubly buffered thread-safe cache for regular objects.
The need arose because we had some cached data structures that were being hit numerous times for each request and needed to be reloaded from cache from a very large document (1s+ unmarshalling time) and we couldn't afford to let all requests be delayed by that long every minute.
Since I couldn't find a good threadsafe implementation I wrote my own and now I am wondering if it's correct and if it can be made smaller... Here it is:
package nl.trimpe.michiel;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
/**
* Abstract class implementing a double buffered cache for a single object.
*
* Implementing classes can load the object to be cached by implementing the
* {@link #retrieve()} method.
*
* @param <T>
* The type of the object to be cached.
*/
public abstract class DoublyBufferedCache<T> {
private static final Log log = LogFactory.getLog(DoublyBufferedCache.class);
private Long timeToLive;
private long lastRetrieval;
private T cachedObject;
private Object lock = new Object();
private volatile Boolean isLoading = false;
public T getCachedObject() {
checkForReload();
return cachedObject;
}
private void checkForReload() {
if (cachedObject == null || isExpired()) {
if (!isReloading()) {
synchronized (lock) {
// Recheck expiration because another thread might have
// refreshed the cache before we were allowed into the
// synchronized block.
if (isExpired()) {
isLoading = true;
try {
cachedObject = retrieve();
lastRetrieval = System.currentTimeMillis();
} catch (Exception e) {
log.error("Exception occurred retrieving cached object", e);
} finally {
isLoading = false;
}
}
}
}
}
}
protected abstract T retrieve() throws Exception;
private boolean isExpired() {
return (timeToLive > 0) ? ((System.currentTimeMillis() - lastRetrieval) > (timeToLive * 1000)) : true;
}
private boolean isReloading() {
return cachedObject != null && isLoading;
}
public void setTimeToLive(Long timeToLive) {
this.timeToLive = timeToLive;
}
}

What you've written isn't threadsafe. In fact, you've stumbled onto a common fallacy that is quite a famous problem. It's called the double-checked locking problem and many such solutions as yours (and there are several variations on this theme) all have issues.
There are a few potential solutions to this but imho the easiest is simply to use a ScheduledExecutorService and reload what you need every minute or however often you need to. When you reload it, put it into the cache result, and the calls for it just return the latest version. This is threadsafe and easy to implement. Sure, it's not loaded on demand but, apart from the initial value, you'll never take a performance hit while you retrieve the value. I'd call this over-eager loading rather than lazy-loading.
For example:
import java.util.concurrent.*;
public class Cache<T> {
private final ScheduledExecutorService executor =
Executors.newSingleThreadScheduledExecutor();
private final Callable<T> method;
private final Runnable refresh;
// volatile so that readers always see the most recently published result
private volatile Future<T> result;
private final long ttl;
public Cache(final Callable<T> method, final long ttl) {
if (method == null) {
throw new NullPointerException("method cannot be null");
}
if (ttl <= 0) {
throw new IllegalArgumentException("ttl must be positive");
}
this.method = method;
this.ttl = ttl;
// initial hits may result in a delay until we've loaded
// the result once, after which there will never be another
// delay because we will only refresh with complete results
result = executor.submit(method);
// schedule the refresh process
refresh = new Runnable() {
public void run() {
try {
// We are already on the executor's only thread, so run the load
// directly; submitting it and waiting on it here would deadlock.
FutureTask<T> future = new FutureTask<T>(method);
future.run();
future.get(); // throws if the load failed
result = future;
} catch (Exception e) {
// keep serving the previous result if the reload fails
} finally {
executor.schedule(this, ttl, TimeUnit.MILLISECONDS);
}
}
};
executor.schedule(refresh, ttl, TimeUnit.MILLISECONDS);
}
public T getResult() throws InterruptedException, ExecutionException {
return result.get();
}
}
That takes a little explanation. Basically, you're creating a generic wrapper for caching the result of a Callable, which will be your document load. Submitting a Callable (or Runnable) returns a Future. Calling Future.get() blocks until the task completes.
So what this does is implement getResult() in terms of a Future so initial queries won't fail (they will block). After that, every 'ttl' milliseconds the refresh task runs. It loads a fresh value and waits for the load to complete; only then does it replace the 'result' member. Subsequent getResult() calls will return the new value.
There is a scheduleAtFixedRate() method on ScheduledExecutorService but I avoid it because if the Callable takes longer than the period, the following runs start late and queue up back to back, and then you have to worry about that or about throttling. It's easier just for the process to reschedule itself at the end of a refresh.

I'm not sure I understand your need. Is your need to have faster loading (and reloading) of the cache, for a portion of the values?
If so, I would suggest breaking your datastructure into smaller pieces.
Just load the piece that you need at the time. If you divide the size by 10, you will divide the loading time by something related to 10.
This could apply to the original document you are reading, if possible. Otherwise, it would be the way you read it, where you skip a large part of it and load only the relevant part.
I believe that most data can be broken down into pieces. Choose whichever split is most appropriate; here are examples:
by starting letter: A*, B* ...
partition your id into two parts: the first part is a category; look for it in the cache, load it if needed, then look for your second part inside.

If your need is not the initial loading time, but the reloading, maybe you don't mind the actual time for reloading, but want to be able to use the old version while loading the new?
If that is your need, I suggest making your cache an instance (as opposed to static) that is available in a field.
You trigger reloading every minute with a dedicated thread (or at least not the regular request threads), so that you don't delay your regular threads.
Reloading creates a new instance, loads it with data (takes 1 second), and then simply replaces the old instance with the new one. (The old one will get garbage-collected.) Replacing one object reference with another is an atomic operation; declare the field volatile so that other threads are guaranteed to see the new instance.
Analysis: what happens in that case is that any other thread can access the old cache until the last instant.
In the worst case, just after a thread gets the old cache instance, another thread replaces the old instance with a new one. But this doesn't make your code faulty: asking the old cache instance will still give a value that was correct just before, which is acceptable by the requirement I gave in my first sentence.
To make your code more correct, you can create your cache instance as immutable (no setters available, no way to modify internal state). This makes it clearer that it is correct to use it in a multi-threaded context.
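A minimal sketch of this swap-the-whole-instance approach, assuming the cached data can be represented as a Map of strings (the class SnapshotCacheSketch and its methods are hypothetical). AtomicReference is used here to publish the new snapshot; a volatile field would serve the same purpose.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class SnapshotCacheSketch {
    // Readers always see either the old or the new snapshot, never a half-built one.
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    /** Called by request threads; never blocks. */
    String get(String key) {
        return current.get().get(key);
    }

    /** Called by the dedicated reload thread: build the snapshot fully, then publish it. */
    void reload(Map<String, String> freshData) {
        Map<String, String> snapshot =
                Collections.unmodifiableMap(new HashMap<>(freshData));
        current.set(snapshot);
    }

    /** Hypothetical demo showing that readers keep the old value until the swap. */
    static String demo() {
        SnapshotCacheSketch cache = new SnapshotCacheSketch();
        Map<String, String> data = new HashMap<>();
        data.put("k", "v1");
        cache.reload(data);
        String before = cache.get("k");
        data.put("k", "v2"); // changing the source does not touch the published copy
        String stillOld = cache.get("k");
        cache.reload(data);
        return before + "," + stillOld + "," + cache.get("k");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "v1,v1,v2"
    }
}
```

Because each snapshot is copied and wrapped as unmodifiable before it is published, no reader can observe it changing, which is the immutability property recommended above.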

You appear to be locking more than is required: in your good case (cache full and valid) every request acquires a lock. You can get away with only locking if the cache is expired.
If we are reloading, do nothing.
If we are not reloading, check whether the cache is expired; if it is not expired, go ahead.
If we are not reloading and we are expired, get the lock and double-check the expiry to make sure the cache has not been successfully loaded since the last check.
Also note you may wish to reload the cache in a background thread so that not even the one request is held up waiting for the cache to fill.
private void checkForReload() {
if (cachedObject == null || isExpired()) {
if (!isReloading()) {
if (isExpired()) {
synchronized (lock) {
// Recheck expiration because another thread might have
// refreshed the cache before we were allowed into the
// synchronized block.
if (isExpired()) {
isLoading = true;
try {
cachedObject = retrieve();
lastRetrieval = System.currentTimeMillis();
} catch (Exception e) {
log.error("Exception occurred retrieving cached object", e);
} finally {
isLoading = false;
}
}
}
}
}
}
}
