Is there an opposite for the DelayQueue? - java

I would need a queue that will automatically remove elements that are older than a given amount of milliseconds - basically, I want the items in the queue to expire after some time.
I see there is a delay queue that seems to be doing the opposite: 'an element can only be taken when its delay has expired.' (I've never used it).
Maybe there is a queue implementation that does what I need? It would be better if it was bounded.

The problem here is deciding who removes the expired elements, and when. If your concern is keeping the queue from growing beyond certain limits, you will need a separate "cleaner" thread that removes items from your queue as they expire. You can implement it with a DelayQueue: offer adds to both an internal LinkedHashSet and the DelayQueue, poll operates on the set, and the cleaner thread polls the DelayQueue, removing items from the set as they "ripen".
If you do not care all that much about items being removed the moment they expire, you can simply override the poll method of a standard queue to check the expiration of the head and discard any heads that have already expired before returning the next live element (with a fixed TTL and FIFO insertion, entries expire in insertion order).
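A minimal sketch of that second approach (the class name and the fixed TTL are illustrative, and it is not synchronized):

import java.util.ArrayDeque;
import java.util.Queue;

// Sketch only: entries carry their expiry time; poll() drops expired heads.
// With a fixed TTL and FIFO insertion, entries expire in insertion order.
class ExpiringQueue<E> {
    private static class Entry<E> {
        final E value;
        final long expiresAt;
        Entry(E value, long ttlMillis) {
            this.value = value;
            this.expiresAt = System.currentTimeMillis() + ttlMillis;
        }
    }

    private final Queue<Entry<E>> queue = new ArrayDeque<>();
    private final long ttlMillis;

    ExpiringQueue(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    void offer(E e) {
        queue.offer(new Entry<>(e, ttlMillis));
    }

    E poll() {
        long now = System.currentTimeMillis();
        // Discard every entry whose time has passed before returning one.
        while (!queue.isEmpty() && queue.peek().expiresAt <= now) {
            queue.poll();
        }
        Entry<E> head = queue.poll();
        return head == null ? null : head.value;
    }
}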

If you want to remove expired objects you need a DelayQueue and a thread that extracts expired objects from it, something like this:
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

static class Wrapper<E> implements Delayed {
    final E target;
    final long exp = System.currentTimeMillis() + 5000; // 5000 ms delay

    Wrapper(E target) {
        this.target = target;
    }

    E get() {
        return target;
    }

    @Override
    public int compareTo(Delayed o) {
        // Order by remaining delay so the queue releases the oldest entry first.
        return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                            o.getDelay(TimeUnit.MILLISECONDS));
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(exp - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }
}

public static void main(String[] args) throws Exception {
    final DelayQueue<Wrapper<Integer>> q = new DelayQueue<>();
    q.add(new Wrapper<>(1));
    Thread.sleep(3000);
    q.add(new Wrapper<>(2));
    new Thread() {
        public void run() {
            try {
                for (;;) {
                    // take() blocks until the head's delay has elapsed,
                    // i.e. until the wrapped element has expired.
                    Wrapper<Integer> w = q.take();
                    System.out.println(w.get());
                }
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
    }.start();
}

I guess there isn't a native implementation like that in Java, though I'm not sure. But you can use a cache for this situation. I'm not sure if it is the best approach, but you can use Google Guava for that, setting an expiration time for your items so you'll only recover values that haven't expired.
Here are the docs for the Google Guava cache implementation: Guava Doc
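For example, a minimal sketch assuming Guava is on the classpath (the key/value types and numbers are illustrative):

import java.util.concurrent.TimeUnit;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

// Entries become invisible to getIfPresent() once their TTL has passed,
// and maximumSize keeps the structure bounded as the question asks.
Cache<String, String> cache = CacheBuilder.newBuilder()
        .maximumSize(1000)
        .expireAfterWrite(5, TimeUnit.SECONDS)
        .build();

cache.put("key", "value");
String v = cache.getIfPresent("key"); // null once the entry has expired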
Hope it helps!

Related

Alternative to ConcurrentLinkedQueue, do I need to use LinkedList with locks?

I am currently using a ConcurrentLinkedQueue so that I can have natural FIFO ordering and also use it in a thread-safe application. I have a requirement to log the size of the queue every minute. Given that this collection does not guarantee size and the cost to calculate size is O(n), is there any alternative bounded non-blocking concurrent queue where obtaining the size is not a costly operation, while the add/remove operations are not expensive either?
If there is no such collection, do I need to use a LinkedList with locks?
If you really (REALLY) need to log a correct, current size of the queue you are dealing with - you need to block. There is simply no other way. You might think that maintaining a separate LongAdder field could help, perhaps by making your own interface as a wrapper around ConcurrentLinkedQueue, something like:
interface KnownSizeQueue<T> {
    T poll();
    long size();
}

And an implementation:

import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.LongAdder;

static class ConcurrentKnownSizeQueue<T> implements KnownSizeQueue<T> {
    private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
    private final LongAdder currentSize = new LongAdder();

    @Override
    public T poll() {
        T result = queue.poll();
        if (result != null) {
            currentSize.decrement();
        }
        return result;
    }

    @Override
    public long size() {
        return currentSize.sum();
    }
}
I just encourage you to add one more method, like remove, to the interface and try to reason about the code. You will very quickly realize that such implementations still give you a wrong result: a thread can be paused between queue.poll() and currentSize.decrement(), so size() can transiently disagree with the queue's actual contents. So, do not do it.
The only reliable way to get the size, if you really need it, is to block around each operation. This comes at a high price, because ConcurrentLinkedQueue is documented as:
This implementation employs an efficient non-blocking...
You will lose those properties, but if the hard requirement outweighs them, you could write your own:
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

static class ParallelKnownSizeQueue<T> implements KnownSizeQueue<T> {
    private final Queue<T> queue = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();

    @Override
    public T poll() {
        lock.lock();
        try {
            return queue.poll();
        } finally {
            lock.unlock();
        }
    }

    @Override
    public long size() {
        lock.lock();
        try {
            return queue.size();
        } finally {
            lock.unlock();
        }
    }
}
Or, of course, you can use an already existing structure, like LinkedBlockingDeque or ArrayBlockingQueue, etc - depending on what you need.
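For instance, ArrayBlockingQueue maintains an exact element count under its single internal lock, so size() is O(1) and accurate (a small sketch; the capacity is illustrative):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Bounded, FIFO and thread-safe; size() is cheap and consistent because
// all operations go through the same internal lock.
BlockingQueue<String> q = new ArrayBlockingQueue<>(10_000);
q.offer("item");
int size = q.size(); // accurate at the moment it is read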

Non blocking function that preserves order

I have the following method:
void store(SomeObject o) {
}
The idea of this method is to store o in permanent storage, but the function should not block, i.e. I cannot/must not do the actual storing in the same thread that called store.
I also cannot start a new thread to store the object from there, because store might be called a "huge" number of times and I don't want to keep spawning threads.
So I have two options, neither of which I see working well:
1) Use a thread pool (Executor family)
2) In store, put the object in an array list and return. When the array list reaches e.g. 1000 (random number), start another thread to "flush" the array list to storage. But I would still possibly have the problem of too many threads (thread pool?)
In both cases the only requirement I have is that the objects are stored persistently in exactly the same order they were passed to store, and using multiple threads mixes things up.
How can this be solved?
How can I ensure:
1) Non blocking store
2) Accurate insertion order
3) I don't care about any storage guarantees. If e.g. something crashes I don't care about losing data e.g. cached in the array list before storing them.
I would use a SingleThreadExecutor and a BlockingQueue.
A SingleThreadExecutor, as the name says, has one single thread. Use it to poll from the queue and persist objects, blocking when the queue is empty.
Your store method can then add to the queue without blocking.
EDIT
Actually, you do not even need that extra queue - the JavaDoc of newSingleThreadExecutor says:
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So I think it's exactly what you need.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

private final ExecutorService persistor = Executors.newSingleThreadExecutor();

public void store(final SomeObject o) {
    persistor.submit(new Runnable() {
        @Override
        public void run() {
            // your persist code here
        }
    });
}
The advantage of using a Runnable with a quasi-endless loop and an extra queue would be the possibility of adding some "burst" functionality: for example, you could make it persist only when 10 elements are queued, or when the oldest element was added at least 1 minute ago.
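A sketch of that burst idea (the batch size, the timeout, and the persist() callback are all illustrative): the worker blocks for the first element, then keeps taking until it holds 10 elements, flushing early once the oldest element has waited a minute:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Runs on the single persistor thread.
void persistLoop(BlockingQueue<SomeObject> queue) throws InterruptedException {
    List<SomeObject> batch = new ArrayList<>();
    while (true) {
        // Block until the first element of the next batch arrives.
        batch.add(queue.take());
        long deadline = System.nanoTime() + TimeUnit.MINUTES.toNanos(1);
        while (batch.size() < 10) {
            SomeObject o = queue.poll(deadline - System.nanoTime(), TimeUnit.NANOSECONDS);
            if (o == null) {
                break; // the oldest element is a minute old: flush what we have
            }
            batch.add(o);
        }
        persist(batch); // hypothetical batch write, e.g. one SQL insert
        batch.clear();
    }
}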
I suggest using a Chronicle-Queue which is a library I designed.
It allows you to write in the current thread without blocking. It was originally designed for low latency trading systems. For small messages it takes around 300 ns to write a message.
You don't need to use a background thread or an on-heap queue, and it doesn't wait for the data to be written to disk by default. It also ensures consistent order for all readers. If the program dies at any point after you call finish(), the message is not lost (unless the OS crashes/loses power). It also supports replication to avoid data loss.
Have one separate thread that gets items from the end of a queue (blocking on an empty queue), and writes them to disk. Your main thread's store() function just adds items to the beginning of the queue.
Here's a rough idea (though I assume there will be cleaner or faster ways for doing this in production code, depending on how fast you need things to be):
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ObjectWriter implements Runnable {
    private final Object END = new Object();
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();

    public void store(Object o) throws InterruptedException {
        queue.put(o);
    }

    public ObjectWriter() {
        new Thread(this).start();
    }

    public void close() throws InterruptedException {
        queue.put(END); // sentinel telling the writer thread to finish
    }

    public void run() {
        while (true) {
            try {
                Object o = queue.take();
                if (o == END) {
                    // close output file.
                    return;
                }
                System.out.println(o.toString()); // serialize as appropriate
            } catch (InterruptedException e) {
                // ignore and retry the take
            }
        }
    }
}

public class Test {
    public static void main(String[] args) throws Exception {
        ObjectWriter w = new ObjectWriter();
        w.store("hello");
        w.store("world");
        w.close();
    }
}
The comments in your question make it sound like you are unfamiliar with multi-threading, but it's really not that difficult.
You simply need another thread responsible for writing to storage, which picks items off a queue - your store function just adds the objects to the in-memory queue and continues on its way.
Some pseudo-ish code:
import java.util.LinkedList;
import java.util.Queue;

final Queue<SomeObject> queue = new LinkedList<>();

void store(SomeObject o) {
    // add it to the queue - note that modifying o after this will also alter the
    // instance in the queue
    synchronized (queue) {
        queue.add(o);
        queue.notify(); // tell the storage thread there's something in the queue
    }
}

void storageThread() {
    while (notFinished) {
        SomeObject item;
        synchronized (queue) {
            item = queue.poll(); // take from the head to preserve insertion order
            if (item == null) {
                try {
                    queue.wait(); // wait for something to arrive
                } catch (InterruptedException e) {
                    return;
                }
                continue;
            }
        }
        writeToStorage(item);
    }
}

Queue Worker Thread stops working, thread safety issue?

I want to introduce my problem first.
I have several WorkingThreads that receive a string, process it, and afterwards send the processed string to a global queue, like this:
class Main {
    public static Queue<String> Q;

    public static void main(String[] args) {
        // start working threads
    }
}

WorkingThread.java:

class WorkingThread extends Thread {
    public void run() {
        String input;
        // do something with input
        Main.Q.add(processedString);
    }
}
So now, every 800 ms another thread called Inserter dequeues all the entries to formulate some SQL, but that's not important.
class Inserter extends Thread {
    public void run() {
        while (!Main.Q.isEmpty()) {
            System.out.print(".");
            // dequeue and formulate some SQL
        }
    }
}
Everything works for about 5 to 10 minutes, but then suddenly I cannot see any dots printed (which is basically a heartbeat for the Inserter). The queue is not empty, I can assure that, but the Inserter just won't work even though it gets started regularly.
I have a suspicion that there is a problem when a worker wants to insert something while the Inserter is dequeuing the queue. Could this possibly be some kind of "deadlock"?
I really hope somebody has an explanation for this behaviour. I am looking forward to learning ;).
EDIT: I am using
Queue<String> Q = new LinkedList<String>();
You are not using a synchronized or thread-safe Queue, therefore you have a race hazard. Your use of a LinkedList shows a (slightly scary) lack of awareness of this fact. You may want to read more about threading and thread safety before you tackle any more threaded code.
You must either synchronize manually or use one of the existing implementations provided by the JDK. Producer/consumer patterns are usually implemented using one of the BlockingQueue implementations.
A BlockingQueue of a bounded size will block producers trying to put if the queue is full. A BlockingQueue will always block consumers if the queue is empty.
This allows you to remove all of your custom logic that spins on the queue and waits for items.
A simple example using Java 8 lambdas would look like:
import java.util.Random;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public static void main(String[] args) throws Exception {
    final BlockingQueue<String> q = new LinkedBlockingQueue<>();
    final ExecutorService executorService = Executors.newFixedThreadPool(4);

    final Runnable consumer = () -> {
        while (true) {
            try {
                System.out.println(q.take());
            } catch (InterruptedException e) {
                return;
            }
        }
    };
    executorService.submit(consumer);

    final Stream<Runnable> producers = IntStream.range(0, 5).mapToObj(i -> () -> {
        final Random random = ThreadLocalRandom.current();
        while (true) {
            q.add("Producer " + i + " putting " + random.nextDouble());
            try {
                TimeUnit.MILLISECONDS.sleep(random.nextInt(2000));
            } catch (InterruptedException e) {
                // ignore
            }
        }
    });
    producers.forEach(executorService::submit);
}
The consumer blocks on the BlockingQueue.take method; as soon as an item is available it will be woken and will print the item. If there are no items, the thread is suspended, allowing the physical CPU to do something else.
The producers each push a String onto the queue using add. As the queue is unbounded, add will always return true. In the case where there is likely to be a backlog of work for the consumer, you can bound the queue and use the put method (which throws an InterruptedException and so requires a try..catch, which is why it's easier to use add) - this will automatically create flow control.
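A bounded variant of the producer side might look like this (the capacity is illustrative):

// Bounding the queue makes put() block whenever the consumer falls behind
// by more than 1000 items, which throttles the producers automatically.
final BlockingQueue<String> q = new LinkedBlockingQueue<>(1000);
q.put("work item"); // may block; declares InterruptedException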
This seems more like a synchronization issue. You are essentially simulating the Producer-Consumer problem. You need to synchronize your Queue or use a BlockingQueue. You probably have a race condition.
You are going to need to synchronize access to your Queue, or
use ConcurrentLinkedQueue; see http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html
or, as also suggested, a BlockingQueue (depending on your requirements): http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/BlockingQueue.html
For a more detailed explanation of the BlockingQueue see
http://tutorials.jenkov.com/java-util-concurrent/blockingqueue.html

Data buffering in multithreaded java application

I have a multi-threaded application which has one producer thread and several consumer threads.
The data is stored in a shared thread safe collection and flushed to a database when there is sufficient data in the buffer.
From the javadocs -
BlockingQueue<E>
A Queue that additionally supports operations that wait for the queue to become non-empty when retrieving an element, and wait for space to become available in the queue when storing an element.
take()
Retrieves and removes the head of this queue, waiting if necessary until an element becomes available.
My questions -
Is there another collection that has an E[] take(int n) method? I.e., a BlockingQueue waits until one element is available; what I want is for it to wait until 100 or 200 elements are available.
Alternatively, is there another method I could use to address the problem without polling?
I think the only way is to either extend some implementation of BlockingQueue or create some kind of utility method using take:
public <E> void take(BlockingQueue<E> queue, List<E> to, int max)
        throws InterruptedException {
    for (int i = 0; i < max; i++)
        to.add(queue.take());
}
The drainTo method isn't exactly what you're looking for, but would it serve your purpose?
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/BlockingQueue.html#drainTo(java.util.Collection, int)
EDIT
You could implement a slightly more performant batch blocking take(min) using a combination of take and drainTo:

public <E> void drainTo(final BlockingQueue<E> queue, final List<E> list, final int min)
        throws InterruptedException {
    int drained = 0;
    do {
        if (queue.size() > 0)
            drained += queue.drainTo(list, min - drained);
        else {
            list.add(queue.take());
            drained++;
        }
    } while (drained < min);
}
I am not sure if there's a similar class in the standard library that has a take(int n) type method, but you should be able to wrap the default BlockingQueue to add that function without too much hassle, don't you think?
An alternative would be to trigger an action when you put elements into the collection, where a threshold set by you triggers the flushing.
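That alternative could look something like this sketch (the threshold and flush callback are illustrative; flushing happens synchronously on the producer thread):

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ThresholdBuffer<E> {
    private final List<E> buffer = new ArrayList<>();
    private final int threshold;
    private final Consumer<List<E>> flush; // e.g. a batch database write

    ThresholdBuffer(int threshold, Consumer<List<E>> flush) {
        this.threshold = threshold;
        this.flush = flush;
    }

    synchronized void add(E e) {
        buffer.add(e);
        if (buffer.size() >= threshold) {
            // Hand the callback its own copy of the batch, then reset.
            flush.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}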
So this should be a thread-safe queue that lets you block on taking an arbitrary number of elements. More eyes to verify the threading code is correct would be welcome.
package mybq;

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ChunkyBlockingQueue<T> {
    protected final LinkedList<T> q = new LinkedList<T>();
    protected final Object lock = new Object();

    public void add(T t) {
        synchronized (lock) {
            q.add(t);
            lock.notifyAll();
        }
    }

    public List<T> take(int numElements) {
        synchronized (lock) {
            while (q.size() < numElements) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    // restore the interrupt flag; a pending interrupt will
                    // make the next wait() return immediately
                    Thread.currentThread().interrupt();
                }
            }
            ArrayList<T> l = new ArrayList<T>(numElements);
            l.addAll(q.subList(0, numElements));
            q.subList(0, numElements).clear();
            return l;
        }
    }
}

Threadsafe double buffered cache (not for graphics) in Java?

I was recently looking for a way to implement a double-buffered thread-safe cache for regular objects.
The need arose because we had some cached data structures that were being hit numerous times for each request and needed to be reloaded from a very large document (1s+ unmarshalling time), and we couldn't afford to let all requests be delayed by that long every minute.
Since I couldn't find a good thread-safe implementation I wrote my own, and now I am wondering if it's correct and if it can be made smaller... Here it is:
package nl.trimpe.michiel;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * Abstract class implementing a double buffered cache for a single object.
 *
 * Implementing classes can load the object to be cached by implementing the
 * {@link #retrieve()} method.
 *
 * @param <T>
 *            The type of the object to be cached.
 */
public abstract class DoublyBufferedCache<T> {

    private static final Log log = LogFactory.getLog(DoublyBufferedCache.class);

    private Long timeToLive;

    private long lastRetrieval;

    private T cachedObject;

    private Object lock = new Object();

    private volatile Boolean isLoading = false;

    public T getCachedObject() {
        checkForReload();
        return cachedObject;
    }

    private void checkForReload() {
        if (cachedObject == null || isExpired()) {
            if (!isReloading()) {
                synchronized (lock) {
                    // Recheck expiration because another thread might have
                    // refreshed the cache before we were allowed into the
                    // synchronized block.
                    if (isExpired()) {
                        isLoading = true;
                        try {
                            cachedObject = retrieve();
                            lastRetrieval = System.currentTimeMillis();
                        } catch (Exception e) {
                            log.error("Exception occurred retrieving cached object", e);
                        } finally {
                            isLoading = false;
                        }
                    }
                }
            }
        }
    }

    protected abstract T retrieve() throws Exception;

    private boolean isExpired() {
        return (timeToLive > 0) ? ((System.currentTimeMillis() - lastRetrieval) > (timeToLive * 1000)) : true;
    }

    private boolean isReloading() {
        return cachedObject != null && isLoading;
    }

    public void setTimeToLive(Long timeToLive) {
        this.timeToLive = timeToLive;
    }
}
What you've written isn't thread-safe. In fact, you've stumbled onto a common fallacy that is quite a famous problem: the double-checked locking problem. Many solutions like yours (and there are several variations on this theme) have issues.
There are a few potential solutions to this, but imho the easiest is simply to use a ScheduledExecutorService and reload what you need every minute, or however often you need to. When the reload completes you swap it into the cached result, and calls for it just return the latest version. This is thread-safe and easy to implement. Sure, it's not loaded on demand, but apart from the initial value you'll never take a performance hit while you retrieve the value. I'd call this over-eager loading rather than lazy loading.
For example:
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class Cache<T> {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Callable<T> method;
    private final Runnable refresh;
    private volatile Future<T> result;
    private final long ttl;

    public Cache(Callable<T> method, long ttl) {
        if (method == null) {
            throw new NullPointerException("method cannot be null");
        }
        if (ttl <= 0) {
            throw new IllegalArgumentException("ttl must be positive");
        }
        this.method = method;
        this.ttl = ttl;
        // initial hits may result in a delay until we've loaded
        // the result once, after which there will never be another
        // delay because we only ever swap in completed results
        result = executor.submit(method);
        // schedule the refresh process
        refresh = new Runnable() {
            public void run() {
                // run the Callable directly on this worker thread rather
                // than submitting it to the same single-threaded executor,
                // which would deadlock waiting for ourselves
                FutureTask<T> future = new FutureTask<>(Cache.this.method);
                future.run();
                try {
                    future.get(); // verify the refresh succeeded
                    result = future;
                } catch (InterruptedException | ExecutionException e) {
                    // keep serving the previous result on failure
                }
                executor.schedule(this, Cache.this.ttl, TimeUnit.MILLISECONDS);
            }
        };
        executor.schedule(refresh, ttl, TimeUnit.MILLISECONDS);
    }

    public T getResult() throws InterruptedException, ExecutionException {
        return result.get();
    }
}
That takes a little explanation. Basically, you're creating a generic interface for caching the result of a Callable, which will be your document load. Submitting a Callable (or Runnable) returns a Future; calling Future.get() blocks until it completes.
So what this does is implement getResult() in terms of a Future, so initial queries won't fail (they will block). After that, every ttl milliseconds the refresh task runs the method, waits for the new value to complete, and then replaces the result member. Subsequent Cache.getResult() calls return the new value.
There is a scheduleAtFixedRate() method on ScheduledExecutorService, but I avoid it because if the Callable takes longer than the scheduled delay you will end up with multiple refreshes running at the same time and then have to worry about throttling. It's easier for the task to reschedule itself at the end of a refresh.
I'm not sure I understand your need. Is your need to have faster loading (and reloading) of the cache, for a portion of the values?
If so, I would suggest breaking your data structure into smaller pieces.
Just load the piece that you need at the time. If you divide the size by 10, you will divide the loading time by something like 10.
This could apply to the original document you are reading, if possible. Otherwise, it would be the way you read it, where you skip a large part of it and load only the relevant part.
I believe that most data can be broken down into pieces. Choose the most appropriate; here are some examples:
by starting letter : A*, B* ...
partition your id into two part : first part is a category, look for it in the cache, load it if needed, then look for your second part inside.
If your need is not the initial loading time but the reloading, maybe you don't mind the actual time for reloading, but want to be able to use the old version while loading the new?
If that is your need, I suggest making your cache an instance (as opposed to static) that is available in a field.
You trigger reloading every minute with a dedicated thread (or at least not the regular threads), so that you don't delay your regular threads.
Reloading creates a new instance, loads it with data (takes 1 second), and then simply replaces the old instance with the new. (The old one will get garbage-collected.) Replacing one object reference with another is an atomic operation.
Analysis: what can happen here is that another thread keeps using the old cache right up to the last instant.
In the worst case, in the instruction just after getting the old cache instance, another thread replaces the old instance with a new one. But this doesn't make your code faulty: asking the old cache instance will still give a value that was correct just before, which is acceptable by the requirement I gave in the first sentence.
To make your code more correct, you can make your cache instance immutable (no setters available, no way to modify internal state). This makes it clearer that it is correct to use in a multi-threaded context.
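A minimal sketch of that swap-on-reload idea (the names are illustrative; the volatile reference swap is the only shared state):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Readers always see either the complete old snapshot or the complete new
// one, never a half-loaded state, because the reference swap is atomic.
abstract class SwappingCache<T> {
    private volatile T snapshot;
    private final ScheduledExecutorService reloader =
            Executors.newSingleThreadScheduledExecutor();

    SwappingCache(long reloadSeconds) {
        snapshot = load(); // initial, blocking load
        reloader.scheduleWithFixedRate(
                () -> snapshot = load(), reloadSeconds, reloadSeconds, TimeUnit.SECONDS);
    }

    protected abstract T load(); // builds a fresh, preferably immutable, instance

    public T get() {
        return snapshot;
    }
}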
You appear to be locking more than is required: in your good case (cache full and valid), every request acquires a lock. You can get away with locking only when the cache is expired.
If we are reloading, do nothing.
If we are not reloading, check if expired; if not expired, go ahead.
If we are not reloading and we are expired, get the lock and double-check expired, to make sure we have not successfully loaded since the last check.
Also note you may wish to reload the cache in a background thread, so that not even the one request is held up waiting for the cache to fill.
private void checkForReload() {
    if (cachedObject == null || isExpired()) {
        if (!isReloading()) {
            // Recheck expiration because another thread might have
            // refreshed the cache before we were allowed into the
            // synchronized block.
            if (isExpired()) {
                synchronized (lock) {
                    if (isExpired()) {
                        isLoading = true;
                        try {
                            cachedObject = retrieve();
                            lastRetrieval = System.currentTimeMillis();
                        } catch (Exception e) {
                            log.error("Exception occurred retrieving cached object", e);
                        } finally {
                            isLoading = false;
                        }
                    }
                }
            }
        }
    }
}
