How to implement a queue that can be processed by multiple threads? - java

I think I'm doing it wrong. I am creating threads that are supposed to crunch some data from a shared queue. My problem is that the program is slow and a memory hog, and I suspect that the queue may not be as shared as I hoped it would be. I suspect this because I added a line that prints the size of the queue: if I launch 2 threads, I get two outputs with completely different numbers that seem to increment on their own. (I thought it could be the same number jumping from 100 to 2 and so on, but after watching it, it shows 105 and 5 and they grow at different rates. If I have 4 threads, I see 4 different numbers.)
Here's a snippet of the relevant parts. I create a static class with the data I want in the queue at the top of the program:
static class queue_class {
int number;
int[] data;
Context(int number, int[] data) {
this.number = number;
this.data = data;
}
}
Then I create the queue after sending some jobs to the callable..
static class process_threaded implements Callable<Void> {
// queue with contexts to process
private Queue<queue_class> queue;
process_threaded(queue_class request) {
queue = new ArrayDeque<queue_class>();
queue.add(request);
}
public Void call() {
while(!queue.isEmpty()) {
System.out.println("in contexts queue with a size of " + queue.size());
Context current = contexts.poll();
//get work and process it, if it work great then the solution goes elsewhere
//otherwise, depending on the data, its either discarded or parts of it is added back to queue
queue.add(new queue_class(k, data_list));
As you can see, there are 3 options for the data: it gets sent off if the data is good, discarded if it's totally horrible, or sent back to the queue. I think the queues grow when data gets sent back, and I suspect that is because each thread is working on its own queue and not a shared one.
Is this guess correct and am I doing this wrong?

You are correct in your assessment that each thread is (probably) working with its own queue, since you are creating a queue in the constructor of your Callable. (It's actually very weird to have a Callable<Void> -- isn't that just a Runnable?)
There are other problems there, for example, the fact that you're working with a queue that isn't thread-safe, or the fact that your code won't compile as it is written.
The important question, though, is do you really need to explicitly create a queue in the first place? Why not have an ExecutorService to which you submit your Callables (or Runnables if you decide to make that switch): Pass a reference to the executor into your Callables, and they can add new Callables to the executor's queue of tasks to run. No need to reinvent the wheel.
For example:
static class process_threaded implements Runnable {
// Reference to an executor
private final ExecutorService exec;
// Reference to the job counter
private final AtomicInteger jobCounter;
// Request to process
private queue_class request;
process_threaded( ExecutorService exec, AtomicInteger counter, queue_class request) {
this.exec = exec;
this.jobCounter = counter;
this.jobCounter.incrementAndGet(); // Assuming that you will always
// submit the process_threaded to
// the executor if you create it.
this.request = request;
}
public void run() {
//get work and process **request**; if it works, the solution goes elsewhere
//otherwise, depending on the data, it is either discarded or parts of it are added back to the executor
exec.submit( new process_threaded( exec, jobCounter, new queue_class(k, data_list) ) );
// Can do some more work
// Always run before returning: counter update and notify the launcher
synchronized(jobCounter){
jobCounter.decrementAndGet();
jobCounter.notifyAll();
}
}
}
Edit:
To solve your problem of when to shut down the executor, I think the simplest solution is to have a job counter, and shutdown when it reaches 0. For thread-safety an AtomicInteger is probably the best choice. I added some code above to incorporate the change. Then your launching code would look something like this:
void theLauncher() throws InterruptedException {
AtomicInteger jobCounter = new AtomicInteger( 0 );
ExecutorService exec = Executors.newFixedThreadPool( Runtime.getRuntime().availableProcessors() );
exec.submit( new process_threaded( exec, jobCounter, someProcessRequest ) );
// Can submit some other things here of course...
// Wait for jobs to complete:
synchronized( jobCounter ){ // wait() must be called while holding the object's monitor
while( jobCounter.get() > 0 )
jobCounter.wait();
}
// Now you can shutdown:
exec.shutdown();
}
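For reference, here is a minimal self-contained sketch of the same job-counter idea. The class name ResubmittingJobs, the depth parameter and the "each job spawns two smaller jobs" rule are made up for illustration; they stand in for "parts of the data go back to the queue". The counter is incremented before a job is handed to the executor and decremented when the job ends, so it can only reach zero when nothing is left to run.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ResubmittingJobs {
    // Jobs that have been submitted but have not finished yet.
    private final AtomicInteger pending = new AtomicInteger();
    // How many jobs actually ran (only used to check the result).
    private final AtomicInteger processed = new AtomicInteger();
    private final ExecutorService exec = Executors.newFixedThreadPool(4);

    // Hypothetical job: a job of depth n > 0 puts two jobs of depth n - 1 back.
    private void submit(int depth) {
        pending.incrementAndGet(); // count the job before the pool sees it
        exec.submit(() -> {
            try {
                processed.incrementAndGet();
                if (depth > 0) {
                    submit(depth - 1);
                    submit(depth - 1);
                }
            } finally {
                // Decrement and notify under the same monitor the launcher waits on.
                synchronized (pending) {
                    if (pending.decrementAndGet() == 0) {
                        pending.notifyAll();
                    }
                }
            }
        });
    }

    // Runs the whole job tree and returns how many jobs were processed.
    static int runAll(int depth) {
        ResubmittingJobs jobs = new ResubmittingJobs();
        jobs.submit(depth);
        synchronized (jobs.pending) {
            while (jobs.pending.get() > 0) {
                try {
                    jobs.pending.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        jobs.exec.shutdown();
        return jobs.processed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(3)); // 1 + 2 + 4 + 8 jobs, prints 15
    }
}
```

Because child jobs are counted before their parent decrements the counter, the launcher cannot wake up early while work is still being added.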

Don't reinvent the wheel! How about using ConcurrentLinkedQueue? From the javadocs:
An unbounded thread-safe queue based on linked nodes. This queue orders elements FIFO (first-in-first-out). The head of the queue is that element that has been on the queue the longest time. The tail of the queue is that element that has been on the queue the shortest time. New elements are inserted at the tail of the queue, and the queue retrieval operations obtain elements at the head of the queue. A ConcurrentLinkedQueue is an appropriate choice when many threads will share access to a common collection.
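A minimal sketch of how that could look, assuming the work items are plain Integers and the class names (SharedQueueDemo, Worker) are invented for the example. The important part is that the queue is created once and passed into every worker, instead of being created in the worker's constructor. Note that the workers use poll() and check for null rather than calling isEmpty() first, since another thread may empty the queue between the two calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedQueueDemo {
    static class Worker implements Callable<Integer> {
        private final Queue<Integer> queue; // shared: passed in, not created here

        Worker(Queue<Integer> queue) {
            this.queue = queue;
        }

        @Override
        public Integer call() {
            int handled = 0;
            // poll() returns null when the queue is empty, so no separate isEmpty() check.
            while (queue.poll() != null) {
                handled++; // real processing would go here
            }
            return handled;
        }
    }

    // Fills one queue, lets several workers drain it, returns the total handled.
    static int drainWithWorkers(int items, int workers) {
        Queue<Integer> shared = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < items; i++) {
            shared.add(i);
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                results.add(pool.submit(new Worker(shared)));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(drainWithWorkers(1000, 4)); // every item handled once: prints 1000
    }
}
```

This simple loop stops as soon as a worker sees the queue empty, so it is not enough on its own when workers put items back while others are about to quit; that case needs some form of "work still in flight" tracking.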

Related

Blocking async queues Java

I'm trying to figure out a way to implement the following in Java.
Thread1 will add jobs to queue1.
Another different thread (Thread2) will add jobs to queue2.
In the run() method of Thread1 I wait until there's a job in queue 1, and let's say I will print it, if and only if there are no awaiting jobs in queue2.
How may I notify Thread1 that Thread2 has added a job in queue2?
Here is Thread1 Class
public class Thread1 implements Runnable {
private List queue1 = new LinkedList();
public void processData(byte [] data, int count) {
byte[] dataCopy = new byte[count];
System.arraycopy(data, 0, dataCopy, 0, count);
synchronized(queue1) {
queue1.add(data);
queue1.notify();
}
}
public void run() {
byte [] data;
while(true) {
// Wait for data to become available
synchronized(queue1) {
while(queue1.isEmpty()) {
try {
queue1.wait();
} catch (InterruptedException e) {}
}
data = (byte[]) queue1.remove(0);
}
// print data only if queue2 has no awaiting jobs in it
}
}
Your question is not explained very clearly, and I am not sure what you are trying to ask; it is confusing to read as written. Also, I don't see any code for Thread2 and queue2.
So I am going to give some general advice:
1. Use an existing implementation of BlockingQueue instead of doing private List queue1 = new LinkedList(); and then synchronized(queue1).
See the documentation of the BlockingQueue interface. You can use the LinkedBlockingQueue class as the implementation.
2. Sample code: the BlockingQueue documentation includes an example showing how to write consumers and producers. There, the queue instance is not created inside the thread class but set via the constructor. That way you can share a single queue with as many threads as you like, by passing a reference to the queue into each Runnable's constructor.
3. BlockingQueue implementations are thread-safe, so you don't have to synchronize on queue instances. You can freely pass a queue instance to as many threads as you like, knowing that its methods can safely be called concurrently.
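A rough sketch of these three points, under the assumption that jobs are Strings and that "print" can be modelled by collecting the job in a list. The names TwoQueueDemo, Consumer and the STOP marker are made up for the example. Both queues are created outside the consumer and handed in through the constructor; take() blocks while queue1 is empty, and queue2.isEmpty() is checked without any explicit locking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class TwoQueueDemo {
    static final String STOP = "__STOP__"; // hypothetical end-of-input marker

    static class Consumer implements Runnable {
        private final BlockingQueue<String> queue1;
        private final BlockingQueue<String> queue2;
        final List<String> handled = new ArrayList<>(); // stands in for "print it"

        Consumer(BlockingQueue<String> queue1, BlockingQueue<String> queue2) {
            this.queue1 = queue1;
            this.queue2 = queue2;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    String job = queue1.take(); // blocks while queue1 is empty
                    if (STOP.equals(job)) {
                        return;
                    }
                    if (queue2.isEmpty()) {     // only handle it if queue2 has nothing waiting
                        handled.add(job);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Feeds jobs into queue1 while queue2 holds queue2Jobs; returns what was handled.
    static List<String> run(List<String> jobs, List<String> queue2Jobs) {
        BlockingQueue<String> q1 = new LinkedBlockingQueue<>();
        BlockingQueue<String> q2 = new LinkedBlockingQueue<>(queue2Jobs);
        Consumer consumer = new Consumer(q1, q2);
        Thread thread = new Thread(consumer);
        thread.start();
        try {
            for (String job : jobs) {
                q1.put(job);
            }
            q1.put(STOP);
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumer.handled;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b", "c"), Collections.<String>emptyList()));
    }
}
```

What should happen to a queue1 job when queue2 is not empty (drop it, put it back, wait) is not stated in the question, so the sketch simply skips it.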
So I suggest that you try to rewrite your program using the above constructs and code samples, and come back with any further questions.
Hope it helps!

Need algo for executing tasks sequentially submitted by a particular customer

Need a high-performance algorithm.
Scenario: Tasks are submitted to a thread pool for different customers at very frequent intervals.
Requirement: Tasks submitted by a particular customer must be handled sequentially, but tasks submitted by different customers can be executed in parallel.
Here is an example of how you can write your worker class. The worker class keeps a concurrent map of all the customers currently being processed. Whenever the worker gets the next work item off the queue, it checks whether this customer is currently being processed. If so, it re-enqueues the task at the end of the queue.
Let me know if you have any questions.
public class MyWorker extends Thread {
private static int instance = 0;
private final Queue<Task> queue;
// Holds the customers whose tasks are being processed at this time.
private final ConcurrentHashMap<String, Boolean> inProcessCustomers;
public MyWorker(Queue<Task> queue, ConcurrentHashMap<String, Boolean> inProcessCustomers) {
this.queue = queue;
this.inProcessCustomers = inProcessCustomers;
setName("MyWorker:" + (instance++));
}
@Override
public void run() {
while ( true ) {
try {
Task task;
synchronized ( queue ) {
while ( queue.isEmpty() )
queue.wait();
// Get the next work item off of the queue
task = queue.remove();
// If the customer is in process, add the task back to the end of the queue and try the next one.
// (A plain return here would terminate the worker thread.)
if (inProcessCustomers.containsKey(task.getCustomerId())) {
queue.add(task);
continue;
}
inProcessCustomers.put(task.getCustomerId(), true);
}
// Process the work item
task.run();
inProcessCustomers.remove(task.getCustomerId());
}
catch ( InterruptedException ie ) {
break; // Terminate
}
}
}
private void doWork(Runnable work) { ... }
}
It sounds like you need a way to map your customer to a queue of tasks.
Assumption: Each customer has a way of being uniquely identified.
I would suggest implementing the hashCode method (and, consistently with it, equals) on whatever object represents your customer, so that it can be used as a map key.
As the tasks are submitted you create a mapping (using HashMap) where the key is your customer and the value is a queue - I suggest ConcurrentLinkedQueue - then add either the task or the thread to the queue. As you process tasks remove them (or their thread depending on design choice) from the queue.
EDIT:
For the purposes of continued discussion I'm going to assume the tasks will be the objects stored in the queue.
Above when I wrote "As you process tasks remove them..." I meant that the task would remain in the queue until completed. You can do this using the peek method of the queue.
Regarding how to process tasks once they are added to the queue the task can be given a reference to the queue object so that once the task is completed it can trigger the next task. The basic algorithm for this piece would go something like this: the controller thread responsible for adding tasks to the queue would check to see if the queue is empty or not. If the queue is not empty it will only add the next task to the queue because the next task is triggered when the current task finishes. If the queue is empty the controller triggers the next task - which it should already have a reference to. When the current task finishes it will call its queue's poll method to remove itself from the head of the queue and then calls peek to obtain the next task. The next task is then executed.
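One possible way to get the same "each task triggers the next one for its customer" behaviour is sketched below. It is a swapped-in technique, not the queue-with-peek design described above: instead of an explicit queue per customer, it keeps the last submitted task per customer as a CompletableFuture and chains each new task after it. The class name PerCustomerExecutor and the customer ids are invented for the example. Unlike re-enqueueing a blocked task at the tail of a shared queue, chaining cannot reorder several waiting tasks of the same customer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PerCustomerExecutor {
    private final ExecutorService pool;
    // Last submitted task per customer; the next task is chained behind it.
    private final ConcurrentHashMap<String, CompletableFuture<Void>> tails =
            new ConcurrentHashMap<>();

    PerCustomerExecutor(ExecutorService pool) {
        this.pool = pool;
    }

    CompletableFuture<Void> submit(String customerId, Runnable task) {
        return tails.compute(customerId, (id, tail) -> {
            CompletableFuture<Void> previous =
                    (tail == null) ? CompletableFuture.<Void>completedFuture(null) : tail;
            // exceptionally(...) keeps one failed task from cancelling the rest of the chain.
            return previous.exceptionally(t -> null).thenRunAsync(task, pool);
        });
    }

    // Demo: interleave submissions for two customers and record the execution order.
    static Map<String, List<Integer>> demo(int tasksPerCustomer) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        PerCustomerExecutor exec = new PerCustomerExecutor(pool);
        Map<String, List<Integer>> seen = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> all = new ArrayList<>();
        for (int i = 0; i < tasksPerCustomer; i++) {
            for (String customer : new String[] {"alice", "bob"}) {
                final int n = i;
                seen.putIfAbsent(customer, Collections.synchronizedList(new ArrayList<>()));
                all.add(exec.submit(customer, () -> seen.get(customer).add(n)));
            }
        }
        all.forEach(CompletableFuture::join);
        pool.shutdown();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo(5));
    }
}
```

The map in this sketch never removes entries for idle customers; a long-running service would need some cleanup policy for that.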

Non blocking function that preserves order

I have the following method:
void store(SomeObject o) {
}
The idea of this method is to store o to a permanent storage but the function should not block. I.e. I can not/must not do the actual storage in the same thread that called store.
I can not also start a thread and store the object from the other thread because store might be called a "huge" amount of times and I don't want to start spawning threads.
So I see two options, but I don't see how either of them can work well:
1) Use a thread pool (Executor family)
2) In store store the object in an array list and return. When the array list reaches e.g. 1000 (random number) then start another thread to "flush" the array list to storage. But I would still possibly have the problem of too many threads (thread pool?)
In both cases the only requirement I have is that the objects are stored persistently in exactly the same order in which they were passed to store. And using multiple threads mixes things up.
How can this be solved?
How can I ensure:
1) Non blocking store
2) Accurate insertion order
3) I don't care about any storage guarantees. If e.g. something crashes I don't care about losing data e.g. cached in the array list before storing them.
I would use a SingleThreadExecutor and a BlockingQueue.
A SingleThreadExecutor, as the name says, has one single thread. Use it to take from the queue and persist objects, blocking if the queue is empty.
In your store method you can add to the queue without blocking.
EDIT
Actually, you do not even need that extra queue. The JavaDoc of newSingleThreadExecutor says:
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So I think it's exactly what you need.
private final ExecutorService persistor = Executors.newSingleThreadExecutor();
public void store( final SomeObject o ){
persistor.submit( new Runnable(){
@Override public void run(){
// your persist-code here.
}
} );
}
The advantage of using a Runnable that has a quasi-endless-loop and using an extra queue would be the possibility to code some "Burst"-functionality. For example you could make it wait to persist only when 10 elements are in queue or the oldest element has been added at least 1 minute ago ...
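A minimal sketch of that extra-queue variant, assuming String payloads and an in-memory list as a stand-in for the real storage. The class name BatchingWriter, the END marker and the maxBatch parameter are made up for illustration. It implements only the simplest kind of burst: block for the first element, then grab whatever else is already waiting (up to maxBatch) with drainTo. A time-based rule such as "oldest element is at least 1 minute old" would need poll with a timeout on top of this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BatchingWriter implements Runnable {
    // Unique end marker, compared by identity so it cannot clash with real data.
    private static final String END = new String("END");
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> persisted = new ArrayList<>(); // stand-in for real storage
    private final int maxBatch;

    BatchingWriter(int maxBatch) {
        this.maxBatch = maxBatch;
    }

    // Never blocks: the queue is unbounded, so add() returns immediately.
    void store(String s) {
        queue.add(s);
    }

    void close() {
        queue.add(END);
    }

    @Override
    public void run() {
        List<String> batch = new ArrayList<>();
        try {
            while (true) {
                batch.clear();
                batch.add(queue.take());            // block until at least one item arrives
                queue.drainTo(batch, maxBatch - 1); // then take whatever else is waiting
                for (String s : batch) {
                    if (s == END) {
                        return;                     // everything before END is already persisted
                    }
                    persisted.add(s);               // a real writer would persist the batch in one go
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stores all items from the calling thread and returns them in persisted order.
    static List<String> writeAll(List<String> items, int maxBatch) {
        BatchingWriter writer = new BatchingWriter(maxBatch);
        Thread thread = new Thread(writer);
        thread.start();
        for (String s : items) {
            writer.store(s);
        }
        writer.close();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return writer.persisted;
    }

    public static void main(String[] args) {
        System.out.println(writeAll(Arrays.asList("a", "b", "c", "d"), 3));
    }
}
```

Since a single thread takes from a FIFO queue, the persisted order matches the order of the store calls regardless of how the items happen to be grouped into batches.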
I suggest using a Chronicle-Queue which is a library I designed.
It allows you to write in the current thread without blocking. It was originally designed for low latency trading systems. For small messages it takes around 300 ns to write a message.
You don't need to use a background thread or an on-heap queue, and by default it doesn't wait for the data to be written to disk. It also ensures a consistent order for all readers. If the program dies at any point after you call finish(), the message is not lost (unless the OS crashes or loses power). It also supports replication to avoid data loss.
Have one separate thread that gets items from the end of a queue (blocking on an empty queue), and writes them to disk. Your main thread's store() function just adds items to the beginning of the queue.
Here's a rough idea (though I assume there will be cleaner or faster ways for doing this in production code, depending on how fast you need things to be):
import java.util.*;
import java.io.*;
import java.util.concurrent.*;
class ObjectWriter implements Runnable {
private final Object END = new Object();
private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
public void store(Object o) throws InterruptedException {
queue.put(o);
}
public ObjectWriter() {
new Thread(this).start();
}
public void close() throws InterruptedException {
queue.put(END);
}
public void run() {
while (true) {
try {
Object o = queue.take();
if (o == END) {
// close output file.
return;
}
System.out.println(o.toString()); // serialize as appropriate
} catch (InterruptedException e) {
}
}
}
}
public class Test {
public static void main(String[] args) throws Exception {
ObjectWriter w = new ObjectWriter();
w.store("hello");
w.store("world");
w.close();
}
}
The comments in your question make it sound like you are unfamiliar with multi-threading, but it's really not that difficult.
You simply need another thread, responsible for writing to the storage, which picks items off a queue. Your store function just adds the objects to the in-memory queue and continues on its way.
Some pseudo-ish code:
final List<SomeObject> queue = new LinkedList<SomeObject>();
void store(SomeObject o) {
// add it to the queue - note that modifying o after this will also alter the
// instance in the queue
synchronized(queue) {
queue.add(o);
queue.notify(); // tell the storage thread there's something in the queue
}
}
void storageThread() throws InterruptedException {
SomeObject item;
while (notfinished) {
synchronized(queue) {
if (!queue.isEmpty()) {
item = queue.remove(0); // take from the start to ensure same order
} else {
// wait for something
queue.wait();
continue;
}
}
writeToStorage(item);
}
}

How to use preexisting runnables, limiting the number of runnables to create?

Problem Statement:
I have a 5000 id's that point to rows in a database.[ Could be more than 5000 ]
Each Runnable retrieves the row in a database given an id and performs some time consuming tasks
public class BORunnable implements Callable<Properties>{
private String branchID;
public BORunnable(String branchID) {
this.branchID=branchID;
}
public void setBranchId(String branchID){
this.branchID=branchID;
}
public Properties call(){
//Get the branchID
//Do some time consuming tasks. Merely takes 1 sec to complete
return propObj;
}
}
I am going to submit these runnables to the executor service.
For that, I need to create and submit 5000 or even more runnables to the executor service. This creation of runnables, in my environment could throw out of memory exception.
[given that 5000 is just an example]
So I came up with an approach; I would be thankful if you could suggest anything different:
Created a thread pool of fixed size 10.
int corePoolSize = 10;
ThreadPoolExecutor executor = new ThreadPoolExecutor(corePoolSize,
corePoolSize + 5, 10, TimeUnit.SECONDS,
new LinkedBlockingQueue<Runnable>());
Collection<Future<Properties>> futuresCollection =
new LinkedList<Future<Properties>>();
Added all of the branchIDs to the branchIdQueue
Queue<String> branchIdQueue = new LinkedList<String>();
Collections.addAll(branchIdQueue, branchIDs);
I am trying to reuse the runnables, so I first create a batch of them.
I want this number of elements to be dequeued, with one runnable created for each:
int noOfElementsToDequeue = Math.min(corePoolSize, branchIdQueue.size());
ArrayList<BORunnable>runnablesList = dequeueAndSubmitRunnable(
branchIdQueue,noOfElementsToDequeue);
ArrayList<BORunnable> dequeueAndSubmitRunnable(Queue<String> branchIdQueue,
int noOfElementsToDequeue){
ArrayList<BORunnable> runnablesList= new ArrayList<BORunnable>();
for (int i = 0; i < noOfElementsToDequeue; i++) {
//Create this number of runnables
runnablesList.add(new BORunnable(branchIdQueue.remove()));
}
return runnablesList;
}
Submitting the retrieved runnables to the executor
for(BORunnable boRunnableObj:runnablesList){
futuresCollection.add(executor.submit(boRunnableObj));
}
If the queue is empty at this point, I have created all the runnables I needed. If it is not, I want to reuse the existing runnables and submit them to the executor again.
Here I compute the number of runnables to be reused = the total count - the current active count
[An approximate value is enough for me]
int coreSize=executor.getCorePoolSize();
while(!branchIdQueue.isEmpty()){
//Total size - current active count
int runnablesToBeReused=coreSize-executor.getActiveCount();
if(runnablesToBeReused!=0){
ArrayList<String> branchIDsTobeReset = removeElementsFromQueue(
branchIdQueue,runnablesToBeReused);
ArrayList<BORunnable> boRunnableToBeReusedList =
getBORunnableToBeReused(boRunnableList,runnablesToBeReused);
for(BORunnable aRunnable:boRunnableList){
//aRunnable.set(branchIDSTobeRest.get(0));
}
}
}
My problem is:
I am not able to find out which Runnable has been released by the thread pool so that I could reuse it for the next submission.
Hence, I randomly take a few runnables and try to set the branchId, but then a race condition may occur. [I don't want to use volatile]
Reusing the Runnables makes no sense as the problem is not the cost of creating or freeing the runnable instances. These come almost for free in Java.
What you want to do is to limit the number of pending jobs which is easy to achieve: just provide a limit to the queue you are passing to the executor service. That’s as easy as passing an int value (the limit) to the LinkedBlockingQueue’s constructor. Note that you can also use an ArrayBlockingQueue then as a LinkedBlockingQueue does not provide an advantage for bounded queue usage.
Once the queue has a limit, the executor will reject new jobs when the queue is full and all threads up to the maximum pool size are busy. The only thing left to do is to provide an appropriate RejectedExecutionHandler to the executor. For example, CallerRunsPolicy runs the rejected job in the submitting thread, which is sufficient to keep the caller from creating more jobs while the threads are all busy and the queue is full.
After execution, the Runnables are subject to garbage collection.
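A minimal sketch of that setup, with made-up sizes (4 threads, queue capacity 8) and a trivial stand-in job instead of the database work. The class name BoundedSubmission is invented for the example. When both the pool and the queue are full, submit() does not throw; the submitting loop runs the job itself and therefore cannot race ahead creating more jobs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedSubmission {
    // Processes every id with at most 8 queued jobs at any time.
    static List<String> processAll(List<String> branchIds) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                4, 4, 10, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(8),                // bounded: at most 8 pending jobs
                new ThreadPoolExecutor.CallerRunsPolicy()); // full queue: caller runs the job
        List<Future<String>> futures = new ArrayList<>();
        try {
            for (String id : branchIds) {
                // Stand-in for the time-consuming work on one branch id.
                Callable<String> job = () -> "done:" + id;
                futures.add(executor.submit(job));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList("b1", "b2", "b3")));
    }
}
```

The list of futures still grows with the number of ids; if even that is too much, results can be consumed as they complete instead of being collected at the end.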

Rejection handler in Executors.newScheduledThreadPool

I have an ArrayBlockingQueue that a single-threaded, fixed-rate scheduled task works through.
Some tasks may fail. I want to re-run a failed task, or re-insert it into the queue at high priority (at the head).
Some thoughts here -
Why are you using ArrayBlockingQueue and not PriorityBlockingQueue? That sounds like what you need. Initially, give all your elements equal priority.
If a task throws an exception, re-insert it into the queue with a higher priority.
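A small sketch of this idea, assuming a hypothetical Job class whose priority is its retry count, with a sequence number as tie-breaker so jobs of equal priority keep their insertion order (PriorityBlockingQueue makes no ordering promise for equal elements on its own). The failure rule "the job named failsOnce fails on its first attempt" is invented purely to make the demo deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class RetryQueueDemo {
    private static final AtomicLong SEQ = new AtomicLong();

    static class Job implements Comparable<Job> {
        final String name;
        final int retries;
        final long seq = SEQ.getAndIncrement(); // tie-breaker: insertion order

        Job(String name, int retries) {
            this.name = name;
            this.retries = retries;
        }

        @Override
        public int compareTo(Job other) {
            // More retries sorts first, so a failed job jumps ahead of fresh ones.
            int byRetries = Integer.compare(other.retries, retries);
            return byRetries != 0 ? byRetries : Long.compare(seq, other.seq);
        }
    }

    // Runs all jobs; the job named failsOnce "fails" on its first attempt.
    static List<String> run(List<String> names, String failsOnce) {
        PriorityBlockingQueue<Job> queue = new PriorityBlockingQueue<>();
        for (String n : names) {
            queue.add(new Job(n, 0));
        }
        List<String> attempts = new ArrayList<>();
        Job job;
        while ((job = queue.poll()) != null) {
            attempts.add(job.name + "#" + job.retries);
            boolean failed = job.name.equals(failsOnce) && job.retries == 0;
            if (failed) {
                queue.add(new Job(job.name, job.retries + 1)); // re-insert at higher priority
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // prints [a#0, b#0, b#1, c#0]: the failed b is retried before c
        System.out.println(run(Arrays.asList("a", "b", "c"), "b"));
    }
}
```

A scheduled worker would call take() or poll() on the same queue instead of the loop in run.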
The simplest thing might be a priority queue. Attach a retry number to each task, starting at zero. After an unsuccessful run, throw away the tasks whose retry number is already one, increment the ones at zero, and put them back in the queue at a high priority. With this method, you can easily decide later to run everything up to three times, or more. The downside is that you have to modify the task class.
The other idea would be to set up another, non-blocking, thread-safe, high-priority queue. When looking for a new task, you check the non-blocking queue first and run what's there. Otherwise, go to the blocking queue. This might work for you as is, and so far it's the simplest solution. The problem is the high priority queue might fill up while the scheduler is blocked on the blocking queue.
To get around this, you'd have to do your own blocking. Both queues should be non-blocking. (Suggestion: java.util.concurrent.ConcurrentLinkedQueue.) After polling both queues with no results, wait() on a monitor. When anything puts something in a queue, it should call notifyAll() and the scheduler can start up again. Great care is needed lest the notification occur after the scheduler has checked both queues but before it calls wait().
Addition:
Prototype code for third solution with manual blocking. Some threading is suggested, but the reader will know his/her own situation best. Which bits of code are apt to block waiting for a lock, which are apt to tie up their thread (and core) for minutes while doing extensive work, and which cannot afford to sit around waiting for the other code to finish all needs to be considered. For instance, if a failed run can immediately be rerun on the same thread with no time-consuming cleanup, most of this code can be junked.
private final ConcurrentLinkedQueue<Runnable> mainQueue = new ConcurrentLinkedQueue<Runnable>();
private final ConcurrentLinkedQueue<Runnable> prioQueue = new ConcurrentLinkedQueue<Runnable>();
private final Object entryWatch = new Object();
/** Adds a new job to the queue. */
public void addjob( Runnable runjob ) {
mainQueue.add( runjob ); // enqueue before notifying, so a woken scheduler finds the job
synchronized (entryWatch) { entryWatch.notifyAll(); }
}
/** The endless loop that does the work. */
public void schedule() {
for (;;) {
Runnable run = getOne(); // Avoids lock if successful.
if (run == null) {
// Both queues are empty.
synchronized (entryWatch) {
// Need to check again. Someone might have added and notifiedAll
// since last check. From this point until, wait, we can be sure
// entryWatch is not notified.
run = getOne();
if (run == null) {
// Both queues are REALLY empty.
try { entryWatch.wait(); }
catch (InterruptedException ie) {}
}
}
}
if (run != null) runit( run ); // run is still null right after a wake-up; loop and poll again
}
}
/** Helper method for the endless loop. */
private Runnable getOne() {
Runnable run = (Runnable) prioQueue.poll();
if (run != null) return run;
return (Runnable) mainQueue.poll();
}
/** Runs a new job. */
public void runit( final Runnable runjob ) {
// Do everything in another thread. (Optional)
new Thread() {
@Override public void run() {
// Run run. (Possibly in own thread?)
// (Perhaps best in thread from a thread pool.)
runjob.run();
// Handle failure (runit only, NOT in runitLast).
// Defining "failure" left as exercise for reader.
if (failure) {
// Put code here to handle failure.
// Put back in queue.
prioQueue.add( runjob );
synchronized (entryWatch) { entryWatch.notifyAll(); }
}
}
}.start();
}
/** Reruns a job. */
public void runitLast( final Runnable runjob ) {
// Same code as "runit", but don't put "runjob" in "prioQueue" on failure.
}
