Non blocking function that preserves order - java

I have the following method:
void store(SomeObject o) {
}
The idea of this method is to store o in permanent storage, but the function must not block. I.e. I cannot do the actual storage in the same thread that called store.
I also cannot start a new thread per call and store the object from there, because store might be called a "huge" number of times and I don't want to keep spawning threads.
So I have two options, but I don't see how either can work well:
1) Use a thread pool (Executor family)
2) In store, add the object to an array list and return. When the array list reaches e.g. 1000 entries (random number), start another thread to "flush" the array list to storage. But I would still possibly have the problem of too many threads (thread pool?)
In both cases the only requirement I have is that the objects are persisted in exactly the same order in which they were passed to store. And using multiple threads mixes things up.
How can this be solved?
How can I ensure:
1) Non blocking store
2) Accurate insertion order
3) I don't care about any storage guarantees. If e.g. something crashes I don't care about losing data e.g. cached in the array list before storing them.

I would use a SingleThreadExecutor and a BlockingQueue.
A SingleThreadExecutor, as the name says, has one single thread. Use it to poll from the queue and persist objects, blocking if the queue is empty.
In your store method you can add to the queue without blocking.
EDIT
Actually, you do not even need that extra queue. The JavaDoc of newSingleThreadExecutor says:
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So I think it's exactly what you need.
private final ExecutorService persistor = Executors.newSingleThreadExecutor();
public void store( final SomeObject o ){
    persistor.submit( new Runnable(){
        @Override public void run(){
            // your persist-code here.
        }
    } );
}
The advantage of using a Runnable that has a quasi-endless-loop and using an extra queue would be the possibility to code some "Burst"-functionality. For example you could make it wait to persist only when 10 elements are in queue or the oldest element has been added at least 1 minute ago ...
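A minimal sketch of that extra-queue variant, in a simplified form: a single worker thread waits for the first element and then drains whatever else is already queued, up to a batch size. The class name BatchingStore, the batch size of 10, the 100 ms poll timeout and the in-memory list standing in for real storage are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical batching writer: only one thread drains the queue, so order is preserved.
class BatchingStore<T> implements Runnable {
    private static final int MAX_BATCH = 10;             // assumed batch size
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final List<T> persisted = new ArrayList<>(); // stand-in for the real storage
    private final Thread worker = new Thread(this, "batching-store");
    private volatile boolean running = true;

    BatchingStore() {
        worker.start();
    }

    // Does not block the caller: offer() on an unbounded queue returns immediately.
    void store(T item) {
        queue.offer(item);
    }

    @Override
    public void run() {
        List<T> batch = new ArrayList<>(MAX_BATCH);
        try {
            // Keep going after close until the queue has been emptied.
            while (running || !queue.isEmpty()) {
                T first = queue.poll(100, TimeUnit.MILLISECONDS);
                if (first == null) {
                    continue; // nothing arrived, re-check the running flag
                }
                batch.add(first);
                queue.drainTo(batch, MAX_BATCH - 1); // take what is already waiting
                persist(batch);
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void persist(List<T> batch) {
        synchronized (persisted) {
            persisted.addAll(batch); // replace with the real write
        }
    }

    // Stops the worker once the queue is empty and returns what was written.
    List<T> closeAndGet() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (persisted) {
            return new ArrayList<>(persisted);
        }
    }
}
```

A time-based rule such as "flush when the oldest element is a minute old" would need a timestamp per element, which this sketch leaves out.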

I suggest using a Chronicle-Queue which is a library I designed.
It allows you to write in the current thread without blocking. It was originally designed for low latency trading systems. For small messages it takes around 300 ns to write a message.
You don't need a background thread or an on-heap queue, and by default it doesn't wait for the data to be written to disk. It also ensures a consistent order for all readers. If the program dies at any point after you call finish(), the message is not lost (unless the OS crashes or loses power). It also supports replication to avoid data loss.

Have one separate thread that takes items from the head of a queue (blocking while the queue is empty) and writes them to disk. Your main thread's store() function just adds items to the tail of the queue.
Here's a rough idea (though I assume there will be cleaner or faster ways for doing this in production code, depending on how fast you need things to be):
import java.util.*;
import java.io.*;
import java.util.concurrent.*;
class ObjectWriter implements Runnable {
    // sentinel object that tells the writer thread to stop
    private final Object END = new Object();
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    public ObjectWriter() {
        new Thread(this).start();
    }
    // the queue is unbounded, so put() returns without waiting
    public void store(Object o) throws InterruptedException {
        queue.put(o);
    }
    public void close() throws InterruptedException {
        queue.put(END);
    }
    public void run() {
        while (true) {
            try {
                Object o = queue.take();
                if (o == END) {
                    // close output file.
                    return;
                }
                System.out.println(o.toString()); // serialize as appropriate
            } catch (InterruptedException e) {
                // treat interruption as a request to stop
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
public class Test {
    public static void main(String[] args) throws Exception {
        ObjectWriter w = new ObjectWriter();
        w.store("hello");
        w.store("world");
        w.close();
    }
}

The comments in your question make it sound like you are unfamiliar with multi-threading, but it's really not that difficult.
You simply need another thread responsible for writing to the storage, which picks items off a queue. Your store function just adds the objects to the in-memory queue and continues on its way.
Some pseudo-ish code:
final List<SomeObject> queue = new LinkedList<SomeObject>();
volatile boolean finished = false;
void store(SomeObject o) {
    // add it to the queue - note that modifying o after this will also alter the
    // instance in the queue
    synchronized (queue) {
        queue.add(o);
        queue.notify(); // tell the storage thread there's something in the queue
    }
}
void storageThread() throws InterruptedException {
    while (!finished) {
        SomeObject item;
        synchronized (queue) {
            // loop, because wait() can return without anything having been added
            while (queue.isEmpty()) {
                queue.wait();
            }
            item = queue.remove(0); // take from the start to ensure same order
        }
        writeToStorage(item); // outside the lock, so store() never waits on I/O
    }
}


What's the best way to implement a list of elements that will have to have elements added/removed from different threads?

I'm currently trying to implement a system list that would run in a few different threads:
1) First thread is listening to incoming requests and adds them to the list.
2) A new thread is created for each request to perform certain operations.
3) Another thread iterates through the list, checks the status of each request, and removes them from the list when they're complete.
Now, the way I have it in a very simplified pseudocode can be viewed below:
private List<Job> runningJobs = new ArrayList<>(); // our list of requests
private Thread monitorThread;
private Runnable monitor = new Runnable() { // this runnable is later called in a new thread to monitor the list and remove completed requests
@Override
public void run() {
boolean monitorRun = true;
while(monitorRun) {
try {
Thread.sleep(1000);
if (runningJobs.size()>0){
Iterator<Job> i = runningJobs.iterator();
while (i.hasNext()) {
try {
Job job = i.next();
if (job.jobStatus() == 1) { // if job is complete
i.remove();
}
}
catch (java.util.ConcurrentModificationException e){
e.printStackTrace();
}
}
}
if (Thread.currentThread().isInterrupted()){
monitorRun = false;
}
} catch (InterruptedException e) {
monitorRun = false;
}
}
}
};
private void addRequest(Job job){
this.runningJobs.add(job);
// etc
}
In short, the Runnable monitor is what runs continuously in the third thread; the first thread is calling addRequest() occasionally.
While my current implementation somewhat works, I'm concerned about the order of operations here and possible java.util.ConcurrentModificationException (and the system is anything but robust). I'm certain there is a much better way to organize this mess.
What's the proper or a better way to do this?
Your requirements would be met nicely with an ExecutorService. For each request, create Job, and submit it to the service. Internally, the service uses a BlockingQueue, which would address your question directly, but you don't have to worry about it with an ExecutorService.
Specifically, something like this:
/* At startup... */
ExecutorService workers = Executors.newCachedThreadPool();
/* For each request... */
Job job = ... ;
workers.submit(job); /* Assuming Job implements Runnable */
// workers.submit(job::jobEntryPoint); /* If Job has some other API */
/* At shutdown... */
workers.shutdown();
There are a few different ways.
You can synchronize the list. This is possibly the most brute-force approach, and unless you also hold the lock for the whole iteration it still won't prevent an insert while you are iterating over it.
There are also a few concurrent collections in java.util.concurrent. These tend to be better but have ramifications. For instance CopyOnWriteArrayList will work, but it copies its internal array on every modification, and its iterators work on a snapshot and do not support remove(), so you would remove through the list itself. This is good for occasionally updated collections.
There is a ConcurrentLinkedQueue. Since it's "Linked" you can't reference an item in the middle by index.
Look through the implementations of the "List" interface and pick the one that best suits your problem.
If your problem is a queue instead of a list, there are a few implementations of that as well and they will tend to be better suited for that type of problem.
In general my answer is that you should probably scan through the Javadocs every time java does a major release and examine (at least) the new collections. You might be surprised at the stuff that's in there.
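As a rough illustration of the CopyOnWriteArrayList option applied to the monitor from the question: the list can be added to from one thread while another removes completed entries, without a ConcurrentModificationException. TrackedJob and JobRegistry are made-up names standing in for the question's Job class and its owner; this is a sketch, not a drop-in replacement.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the Job class in the question.
class TrackedJob {
    private final AtomicBoolean done = new AtomicBoolean(false);

    void complete() {
        done.set(true);
    }

    boolean isComplete() {
        return done.get();
    }
}

class JobRegistry {
    // Safe to add to while another thread iterates or removes.
    private final List<TrackedJob> runningJobs = new CopyOnWriteArrayList<>();

    // Called by the thread that accepts requests.
    void addRequest(TrackedJob job) {
        runningJobs.add(job);
    }

    // Called periodically by the monitor thread; returns how many jobs are still running.
    int removeCompleted() {
        runningJobs.removeIf(TrackedJob::isComplete);
        return runningJobs.size();
    }
}
```

Every add and remove copies the backing array, so this only pays off while the list stays small or changes rarely.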

Java start multiple threads in a class

I am consuming from a certain source (say Kafka) and periodically dumping the collected messages (to, say, S3). My class definition is as follows:
public class ConsumeAndDump {
private List<String> messages;
public ConsumeAndDump(){
messages = new ArrayList<>();
// initialize required resources
}
public void consume(){
// this runs continuously and keeps consuming from the source.
while(true){
final String message = ...// consume from Kafka
messages.add(message);
}
}
public void dump(){
while(true){
final String allMessages = String.join("\n", messages);
messages.clear(); // shown here simply, but i am synchronising this to avoid race conditions
// dump to destination (file, or S3, or whatever)
TimeUnit.SECONDS.sleep(60); // sleep for a minute
}
}
public void run() {
// This is where I don't know how to proceed.
// How do I start consume() and dump() as separate threads?
// Is it even possible in Java?
// start consume() as thread
// start dump() as thread
// wait for those to finish
}
}
I want to have two threads - consume and dump. consume should run continuously whereas dump wakes up periodically, dumps the messages, clears the buffer and then goes back to sleep again.
I am having trouble starting consume() and dump() as threads. Honestly, I don't know how to do that. Can we even run member methods as threads? Or do I have to make separate Runnable classes for consume and dump? If so, how would I share messages between those?
First of all, you can't really use ArrayList for this. ArrayList is not thread-safe. Check out BlockingQueue for example. You will have to deal with things like back pressure. Don't use an unbounded queue.
Starting a thread is pretty simple, you can use lambdas for it.
public void run() {
new Thread(this::consume).start();
new Thread(this::dump).start();
}
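This should work, but it gives you little to no control over when those threads should end. Note that dump() has to catch the InterruptedException from sleep itself, because Runnable.run() cannot throw checked exceptions.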
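If you want more control over shutdown, one possible arrangement is a bounded BlockingQueue as the buffer, a single-thread executor for the consume loop and a ScheduledExecutorService for the periodic dump. This is only a sketch: the Supplier stands in for the Kafka poll and the Consumer for the S3/file write, and the class name, the capacity of 10,000 and the 5 second shutdown wait are arbitrary assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

class ConsumeAndDumpSketch {
    // Bounded, so a stalled dump eventually blocks the consumer instead of exhausting memory.
    private final BlockingQueue<String> messages = new LinkedBlockingQueue<>(10_000);
    private final ExecutorService consumer = Executors.newSingleThreadExecutor();
    private final ScheduledExecutorService dumper = Executors.newSingleThreadScheduledExecutor();

    void start(Supplier<String> source, Consumer<String> sink, long periodSeconds) {
        consumer.execute(() -> {
            // buffer() returns false once stop() has interrupted this thread
            while (buffer(source.get())) { }
        });
        dumper.scheduleWithFixedDelay(() -> sink.accept(dump()),
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    boolean buffer(String message) {
        try {
            messages.put(message); // blocks while the buffer is full
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Moves everything buffered so far out of the queue; later put() calls land in the next dump.
    String dump() {
        List<String> batch = new ArrayList<>();
        messages.drainTo(batch);
        return String.join("\n", batch);
    }

    // Stops both threads and returns whatever was still buffered.
    String stop() {
        consumer.shutdownNow(); // interrupts the consume loop
        dumper.shutdown();      // lets a running dump finish, schedules no more
        try {
            consumer.awaitTermination(5, TimeUnit.SECONDS);
            dumper.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return dump();
    }
}
```

Whether interrupting is enough to stop the real consume call depends on the client library, so treat the shutdown path as something to verify against your source.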

Queue Worker Thread stops working, thread safety issue?

I want to introduce my problem first.
I have several WorkingThreads that are receiving a string, processing the string and afterwards sending the processed string to a global Queue like this:
class Main {
public static Queue<String> Q;
public static void main(String[] args) {
//start working threads
}
}
WorkingThread.java:
class WorkingThread extends Thread {
public void run() {
String input;
//do something with input
Main.Q.add(processedString);
}
}
So now every 800ms another Thread called Inserter dequeues all the entries to formulate some SQL, but that's not important.
class Inserter extends Thread {
public void run() {
while(!Main.Q.isEmpty()) {
System.out.print(".");
// dequeue and formulate some SQL
}
}
}
Everything works for about 5 to 10 minutes, but then suddenly I cannot see any dots printed (which is basically a heartbeat for the Inserter). The Queue is not empty, I can assure that, but the Inserter just won't work even though it gets started regularly.
I have a suspicion that there is a problem when a worker wants to insert something while the Inserter dequeues from the Queue. Could this possibly be some kind of "deadlock"?
I really hope somebody has an explanation for this behaviour. I am looking forward to learning ;).
EDIT: I am using
Queue<String> Q = new LinkedList<String>();
You are not using a synchronized or thread safe Queue therefore you have a race hazard. Your use of a LinkedList shows a (slightly scary) lack of knowledge of this fact. You may want to read more about threading and thread safety before you try and tackle any more threaded code.
You must either synchronize manually or use one of the existing implementations provided by the JDK. Producer/consumer patterns are usually implemented using one of the BlockingQueue implementations.
A BlockingQueue of a bounded size will block producers trying to put if the queue is full. A BlockingQueue will always block consumers if the queue is empty.
This allows you to remove all of your custom logic that spins on the queue and waits for items.
A simple example using Java 8 lambdas would look like:
public static void main(String[] args) throws Exception {
final BlockingQueue<String> q = new LinkedBlockingQueue<>();
final ExecutorService executorService = Executors.newFixedThreadPool(6); // 1 consumer + 5 producers, none of which ever finish
final Runnable consumer = () -> {
while (true) {
try {
System.out.println(q.take());
} catch (InterruptedException e) {
return;
}
}
};
executorService.submit(consumer);
final Stream<Runnable> producers = IntStream.range(0, 5).mapToObj(i -> () -> {
final Random random = ThreadLocalRandom.current();
while (true) {
q.add("Producer " + i + " putting " + random.nextDouble());
try {
TimeUnit.MILLISECONDS.sleep(random.nextInt(2000));
} catch (InterruptedException e) {
//ignore
}
}
});
producers.forEach(executorService::submit);
}
The consumer blocks on the BlockingQueue.take method, and as soon as an item is available it will be woken and will print the item. If there are no items, the thread will be suspended, allowing the physical CPU to do something else.
The producers each push a String onto the queue using add. As the queue is unbounded, add will always return true. Where there is likely to be a backlog of work for the consumer, you can bound the queue and use the put method (it throws InterruptedException, so it requires a try..catch, which is why add is easier to use here). This automatically creates flow control.
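To make the bounded variant concrete, here is a small sketch with an ArrayBlockingQueue of fixed capacity: the producer's put blocks whenever the consumer falls behind, and a sentinel value ends the consumer. The class name, the POISON sentinel and the in-order check are illustrative assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedQueueDemo {
    static final int POISON = -1; // sentinel telling the consumer to stop

    // Sends `items` integers through a queue of the given capacity and
    // returns how many the consumer received, checking order on the way.
    static int transfer(int items, int capacity) {
        BlockingQueue<Integer> q = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    q.put(i); // blocks while the queue already holds `capacity` items
                }
                q.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int received = 0;
        try {
            while (true) {
                int next = q.take(); // blocks while the queue is empty
                if (next == POISON) {
                    break;
                }
                if (next != received) {
                    throw new IllegalStateException("out of order: " + next);
                }
                received++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }
}
```

With a single producer and a single consumer the items arrive in exactly the order they were put; with several producers only the per-producer order is kept.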
This seems more like a synchronization issue. You are trying to simulate the producer-consumer problem. You need to synchronize your Queue or use a BlockingQueue. You probably have a race condition.
You are going to need to synchronize access to your Queue or
use ConcurrentLinkedQueue see http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html
or as also suggested using a BlockingQueue (depending on your requirements) http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/BlockingQueue.html
For a more detailed explanation of the BlockingQueue see
http://tutorials.jenkov.com/java-util-concurrent/blockingqueue.html

how to deal with multiple worker threads that may create new work items

I have a queue that contains work items and I want to have multiple threads work in parallel on those items. When a work item is processed it may result in new work items. The problem I have is that I can't find a solution on how to determine if I'm done. The worker looks like that:
public class Worker implements Runnable {
public void run() {
while (true) {
WorkItem item = queue.nextItem();
if (item != null) {
processItem(item);
}
else {
// the queue is empty, but there may still be other workers
// processing items which may result in new work items
// how to determine if the work is completely done?
}
}
}
}
This seems like a pretty simple problem actually but I'm at a loss. What would be the best way to implement that?
thanks
clarification:
The worker threads have to terminate once none of them is processing an item, but as long as at least one of them is still working they have to wait because it may result in new work items.
What about using an ExecutorService which will allow you to wait for all tasks to finish: ExecutorService, how to wait for all tasks to finish
I'd suggest wait/notify calls. In the else case, your worker threads would wait on an object until notified by the queue that there is more work to do. When a worker creates a new item, it adds it to the queue, and the queue calls notify on the object the workers are waiting on. One of them will wake up to consume the new item.
The methods wait, notify, and notifyAll of class Object support an efficient transfer of control from one thread to another. Rather than simply "spinning" (repeatedly locking and unlocking an object to see whether some internal state has changed), which consumes computational effort, a thread can suspend itself using wait until such time as another thread awakens it using notify. This is especially appropriate in situations where threads have a producer-consumer relationship (actively cooperating on a common goal) rather than a mutual exclusion relationship (trying to avoid conflicts while sharing a common resource).
Source: Threads and Locks
I'd look at something higher level than wait/notify. It's very difficult to get right and avoid deadlocks. Have you looked at java.util.concurrent.CompletionService<V>? You could have a simpler manager thread that polls the service and take()s the results, which may or may not contain a new work item.
Using a BlockingQueue containing items to process along with a synchronized set that keeps track of all elements being processed currently:
BlockingQueue<WorkItem> bQueue;
Set<WorkItem> beingProcessed = Collections.synchronizedSet(new HashSet<WorkItem>());
bQueue.put(workItem);
...
// the following runs over many threads in parallel
while (!(bQueue.isEmpty() && beingProcessed.isEmpty())) {
WorkItem currentItem = bQueue.poll(50L, TimeUnit.MILLISECONDS); // null for empty queue
if (currentItem != null) {
beingProcessed.add(currentItem);
processItem(currentItem); // possibly bQueue.add(newItem) is called from processItem
beingProcessed.remove(currentItem);
}
}
EDIT: as @Hovercraft Full Of Eels suggested, an ExecutorService is probably what you should really use. You can add new tasks as you go along. You can semi-busy wait for termination of all tasks at regular intervals with executorService.awaitTermination(time, timeUnits) and kill all your threads after that.
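One way to tell when an ExecutorService is "completely done" while tasks keep adding tasks is to count outstanding items yourself: increment before submitting, decrement when an item finishes, and signal when the count reaches zero. The sketch below assumes a made-up workload in which every item of depth > 0 creates two children; the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class SpawningWork {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final AtomicInteger pending = new AtomicInteger();   // submitted but not yet finished
    private final AtomicInteger processed = new AtomicInteger(); // finished items, for the demo
    private final CountDownLatch done = new CountDownLatch(1);

    // Count the item before handing it to the pool, so pending cannot reach 0 too early.
    void submit(int depth) {
        pending.incrementAndGet();
        pool.execute(() -> {
            try {
                process(depth);
            } finally {
                // Children were counted inside process(), before this decrement.
                if (pending.decrementAndGet() == 0) {
                    done.countDown();
                }
            }
        });
    }

    // Hypothetical work item: each item with depth > 0 creates two new items.
    private void process(int depth) {
        processed.incrementAndGet();
        if (depth > 0) {
            submit(depth - 1);
            submit(depth - 1);
        }
    }

    // Blocks until no item is queued or running, then shuts the pool down.
    int runToCompletion(int rootDepth) {
        submit(rootDepth);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return processed.get();
    }
}
```

The important detail is the ordering: a parent increments the counter for its children before its own decrement, so zero really means nothing is left.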
Here's the beginnings of a queue to solve your problem. Basically, you need to track new work and in-process work.
public class WorkQueue<T> {
private final List<T> _newWork = new LinkedList<T>();
private int _inProcessWork;
public synchronized void addWork(T work) {
_newWork.add(work);
notifyAll();
}
public synchronized T startWork() throws InterruptedException {
// wait only while there is nothing to do but someone is still working
while(_newWork.isEmpty() && (_inProcessWork > 0)) {
wait();
}
if(!_newWork.isEmpty()) {
_inProcessWork++;
return _newWork.remove(0);
}
// everything is done
return null;
}
public synchronized void finishWork() {
_inProcessWork--;
if((_inProcessWork == 0) && _newWork.isEmpty()) {
notifyAll();
}
}
}
Your workers will look roughly like:
public class Worker<T> implements Runnable {
private final WorkQueue<T> _queue;
public Worker(WorkQueue<T> queue) {
_queue = queue;
}
public void run() {
try {
T work = null;
while((work = _queue.startWork()) != null) {
try {
// do work here...
} finally {
_queue.finishWork();
}
}
} catch(InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
The one trick is that you need to add the first work item before you start any workers (otherwise they will all immediately exit).

Java POJO: strategies for handling a queue of request objects to a server

Right now I'm torn up in deciding the best way of handling request objects that I send up to a server. In other words, I have tracking request objects for things such as impression and click tracking within my app. Simple requests with very low payloads. There are places in my app where said objects that need to be tracked appear concurrently next to each other (at most three concurrent objects that I have to track), so every time said objects are visible for example, I have to create a tracking request object for each of them.
Now I already know that I can easily create a singleton queue thread which adds those objects into a vector and my thread either processes them in the main loop or calls wait on the queue until we have objects to process. While this sounds like a clear cut solution, the queue can accumulate into the dozens, which can be cumbersome at times, since it's making one connection for each request, thus it won't run concurrently.
What I had in mind was to create a thread pool which would allow up to two concurrent connections via a semaphore and process thread objects that would contain my tracking event requests. In other words, I wanted to create a function that would create a new thread Object and add it into a Vector, in which the thread pool would iterate through the set of threads and process them two at a time. I know I can create a function that would add objects like so:
public boolean addThread(Runnable r){
synchronized(_queue){
while(!dead){
_queue.addElement(r);
//TODO: How would I notify my thread pool object to iterate through the list to process the queue? Do I call notify on the queue object, but that would only work on a thread right??
return true;
}
return false;
}
}
What I am wondering is how will the threads themselves get executed. How can I write a function that would execute the thread pool after adding a thread to the list? Also, since the semaphore will block after the second connection, will that lock up my app until there is an open slot, or will it just lock up in the thread pool object while looping through the list?
As always, since I am targeting a J2ME/Blackberry environment, only pre-1.5 answers will be accepted, so no Generics or any class from the Concurrent package.
EDIT: So I take it that this is what it should look like more or less:
class MyThreadPool extends Thread{
    private final Vector _queue = new Vector();
    private CappedSemaphore _sem;
    private volatile boolean dead = false;
    public MyThreadPool(){
        _sem = new CappedSemaphore(2);
        this.start();
    }
    public void run(){
        while(!dead){
            Runnable r = null;
            synchronized(_queue){
                if(_queue.isEmpty()){
                    try {
                        _queue.wait();
                    } catch(InterruptedException e) {
                        return;
                    }
                } else {
                    r = (Runnable) _queue.elementAt(0);
                    _queue.removeElementAt(0);
                }
            }
            if(r != null){
                _sem.take();
                r.run();
                _sem.release();
            }
        }
    }
    public boolean addThread(Runnable r){
        synchronized(_queue){
            if(!dead){
                _queue.addElement(r);
                _queue.notifyAll();
                return true;
            }
            return false;
        }
    }
}
What you would want to do on the thread side is have each thread wait on the queue. For example:
class MyWaitingThread extends Thread{
    private final Vector _queue;
    public MyWaitingThread (Vector _queue){
        this._queue = _queue;
    }
    public void run(){
        while(true){
            Runnable r = null;
            synchronized(_queue){
                if(_queue.isEmpty()){
                    try {
                        _queue.wait();
                    } catch(InterruptedException e) {
                        return; // treat interruption as a request to stop
                    }
                } else {
                    r = (Runnable) _queue.elementAt(0);
                    _queue.removeElementAt(0);
                }
            }
            if(r != null) r.run();
        }
    }
}
And in your other logic it would look like:
public void addThread(Runnable r){
    if(!dead){
        synchronized(_queue){
            _queue.addElement(r);
            _queue.notifyAll();
        }
    }
}
That _queue.notifyAll will wake up all threads waiting on the _queue instance. Also, notice I moved the while(!dead) outside of the synchronized block and changed it to if(!dead). I can imagine keeping it the way you originally had it wouldn't have worked exactly as you hoped.
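Putting the pieces together, a possible sketch of the whole pool: two worker threads share one Vector, which caps concurrency at two without a separate semaphore. It is written without generics or java.util.concurrent to match the pre-1.5 constraint (modern compilers will warn about the raw Vector). TwoWorkerPool, addTask and shutdown are made-up names.

```java
import java.util.Vector;

// Pre-1.5 style on purpose: raw Vector, anonymous Runnable, wait/notifyAll.
class TwoWorkerPool {
    private final Vector queue = new Vector();
    private boolean dead = false; // only read or written while holding the queue lock
    private final Thread[] workers = new Thread[2];

    TwoWorkerPool() {
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    workLoop();
                }
            });
            workers[i].start();
        }
    }

    private void workLoop() {
        while (true) {
            Runnable r;
            synchronized (queue) {
                // Re-check the condition in a loop after every wake-up.
                while (queue.isEmpty() && !dead) {
                    try {
                        queue.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                if (queue.isEmpty()) {
                    return; // dead and nothing left to run
                }
                r = (Runnable) queue.elementAt(0);
                queue.removeElementAt(0);
            }
            r.run(); // outside the lock, so the other worker can proceed
        }
    }

    boolean addTask(Runnable r) {
        synchronized (queue) {
            if (dead) {
                return false;
            }
            queue.addElement(r);
            queue.notifyAll();
            return true;
        }
    }

    // Lets queued tasks finish, then stops both workers.
    void shutdown() {
        synchronized (queue) {
            dead = true;
            queue.notifyAll();
        }
        for (int i = 0; i < workers.length; i++) {
            try {
                workers[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

addTask only takes the queue lock briefly, so the caller is never held up by a running request; only the two worker threads ever block.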
