Before I reinvent the wheel, is there a topic-like concurrent queue in plain Java? I have the following requirements:
Multiple readers/consumers
Multiple writers/producers
Every message must be consumed by every (active) consumer
After every consumer reads a message it should become garbage (i.e. no more references)
Writing to the queue should not be O(N) in the number of consumers
Concurrent, preferably non-blocking
Not JMS based: it's for a much lighter/embeddable environment
That's pretty much everything I need. Any pointers?
Basically you are talking about multiplexing, and no, there isn't anything like this in the standard library, but it is pretty simple to create one. Presuming that your clients aren't interested in messages published before they subscribe, you need one queue per consumer, and publication simply offers the item to each queue:
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;

public class Multiplexer<M> {

    // CopyOnWriteArrayList lets consumers register while publishers are iterating
    private final List<BlockingQueue<M>> consumers =
            new CopyOnWriteArrayList<BlockingQueue<M>>();

    public void publish(M msg) {
        for (BlockingQueue<M> q : consumers) {
            q.offer(msg);
        }
    }

    public void addConsumer(BlockingQueue<M> consumer) {
        consumers.add(consumer);
    }
}
This version allows consumers to use whatever blocking queue implementation they might want. You could obviously provide a standard implementation and a nice interface for the client if you want.
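For example, wiring a consumer up might look roughly like this (a minimal sketch that assumes the Multiplexer class above; the demo class, queue choice and message type are just illustrative):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MultiplexerDemo {
    public static void main(String[] args) {
        Multiplexer<String> mux = new Multiplexer<String>();

        // each consumer registers its own queue and drains it on its own thread
        BlockingQueue<String> myQueue = new LinkedBlockingQueue<String>();
        mux.addConsumer(myQueue);

        new Thread(() -> {
            try {
                while (true) {
                    String msg = myQueue.take();   // blocks until the next published message
                    System.out.println("got " + msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }).start();

        mux.publish("hello");   // every registered queue receives "hello"
    }
}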
The third condition is not met by anything in plain Java, but you can use a non-blocking linked queue with a separate head for each consumer (you can rely on the GC to collect the unreferenced nodes).
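A rough, hypothetical sketch of that idea (the class and method names are made up, and waking up idle consumers is left out): producers append to a shared tail in O(1), each consumer advances its own cursor, and a node becomes garbage once every cursor has moved past it.

import java.util.concurrent.atomic.AtomicReference;

public class BroadcastQueue<M> {

    private static final class Node<T> {
        final T item;                 // null only for the initial sentinel
        volatile Node<T> next;
        Node(T item) { this.item = item; }
    }

    // shared tail; the cost of appending does not depend on the number of consumers
    private final AtomicReference<Node<M>> tail =
            new AtomicReference<>(new Node<>(null));

    public void publish(M msg) {
        Node<M> node = new Node<>(msg);
        Node<M> prev = tail.getAndSet(node);   // atomically claim the old tail
        prev.next = node;                      // link the new node in
    }

    // each consumer gets its own cursor; it sees only messages published afterwards
    public Cursor<M> newCursor() {
        return new Cursor<>(tail.get());
    }

    public static final class Cursor<T> {
        private Node<T> current;               // last node this consumer has seen

        Cursor(Node<T> start) { this.current = start; }

        // returns the next message, or null if nothing new has been published yet
        public T poll() {
            Node<T> next = current.next;
            if (next == null) {
                return null;
            }
            current = next;                    // drop the reference to the old node
            return next.item;
        }
    }
}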
The simplest strategy is to pass each message to every consumer. I wouldn't expect to have so many consumers that the number of consumers matters; you can add a message to dozens of consumer queues in a few microseconds.
One way to avoid this is to have a circular ring buffer with many readers. This is tricky to implement and means the consumers will be limited in the number of sources of messages they can have.
Have just one pseudo-consumer and let the real consumers register with it. When a producer sends a message, the pseudo-consumer wakes up and consumes the message. On consuming the message, the pseudo-consumer creates a separate Runnable for each real consumer registered with it and executes them on a thread pool.
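A rough sketch of that approach (all names are made up; shutdown and error handling are omitted): one thread drains the real queue and submits one task per registered consumer to a pool.

import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

public class PseudoConsumer<M> implements Runnable {

    private final BlockingQueue<M> queue = new LinkedBlockingQueue<>();
    private final List<Consumer<M>> consumers = new CopyOnWriteArrayList<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public void register(Consumer<M> consumer) {
        consumers.add(consumer);
    }

    public BlockingQueue<M> queue() {
        return queue;     // producers put messages here
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                M msg = queue.take();                         // wake up on a new message
                for (Consumer<M> consumer : consumers) {
                    pool.submit(() -> consumer.accept(msg));  // one Runnable per real consumer
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}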
Related
I have a producer-consumer model in which an Arduino generates packets, and on my PC a Java program takes those packets and puts them into a BlockingQueue. There are threads that process these packets.
Let's say X is the producer and A, B and C are consumer threads. There is one queue which all consumers have a reference to. Messages (packets) are immutable objects (i.e. consumers can't change the state of the elements). My question is: how can I know that all threads are done with a specific element in the queue, so I can remove it?
Here is my consumer run() method:

@Override
public void run()
{
    while (isStarted && !queue.isEmpty()) {
        updateMap(queue.peek());
    }
}
One design I'm thinking of is to use a bounded queue. When the producer finds that the queue is full, it removes the first element. But I'm not sure this is a safe approach. I read this tutorial and a few more, and what I get is:
Producer should wait if Queue or bucket is full and Consumer should wait if queue or
bucket is empty.
I'm sorry if this sounds obvious, but I'm new to multithreaded programming and the concurrent nature of the code scares me.
EDIT:
A, B and C all work independently: one is for statistics, one for updating the network map, etc.
EDIT 2:
As @Augusto suggested, there is another approach in which each of A, B and C has its own queue. My network listener passes the packets to each queue and they process them. It works, but how can I do this with only one queue? Is it even possible to implement this scenario using only one queue, and if so, how and when do I need to remove an element from the queue?
(A follow-up to my comment after your edit.) I would suggest having a different queue per thread. This is actually a very well-known pattern called publish-subscribe.
Taken from the link above:
A Publish-Subscribe Channel works like this: It has one input channel
that splits into multiple output channels, one for each subscriber.
When an event is published into the channel, the Publish-Subscribe
Channel delivers a copy of the message to each of the output channels.
Each output channel has only one subscriber, which is only allowed to
consume a message once. In this way, each subscriber only gets the
message once and consumed copies disappear from their channels.
The main difference between a ConcurrentLinkedQueue and a BlockingQueue is that you can put a limit on how many elements are in the BlockingQueue. This is good in the case that the producer reading from the network generates data faster than the consumers can process it. If you use an unbounded queue and this goes on for a while, you'll end up with an OutOfMemoryError.
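As a toy illustration of that difference (not part of your application; the capacity and message type are arbitrary), a bounded ArrayBlockingQueue makes the producer block on put() instead of letting the backlog grow without limit:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);   // capacity of 2

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    Thread.sleep(100);                               // deliberately slow consumer
                    System.out.println("consumed " + queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        for (int i = 0; i < 5; i++) {
            queue.put("packet-" + i);                                // blocks while the queue is full
            System.out.println("produced packet-" + i);
        }
        consumer.join();
    }
}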
What I expect from the queue is the following:
Let's say the queue contains A-B-C-D-E and I have 3 consumers. What I need is that ALL consumers consume item A before A is dequeued and they move on to consume B.
That is to say, all the consumers get ALL the items in the queue. I have a trade-off solution, which is to use 3 ArrayBlockingQueues for the 3 consumers, but I'd like to know whether such a queue exists.
The requirements you describe form the publish/subscribe design pattern. It is implemented by JMS providers such as ActiveMQ.
No, there isn't. Sounds like a pretty easy thing to implement, though. Some options that jump to mind are:
Include a CountDownLatch in the object to consume, and discard the item only when it reaches 0 (see the sketch after these options)
Have two of the consumers not consume the item (peek the queue rather than pop it)
Don't use consumers, use listeners
Or if you're willing to use something outside of rt.jar, you could use a messaging queue. ActiveMQ and RabbitMQ are popular, they both support publish-subscribe.
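Here is a tiny sketch of the CountDownLatch option (SharedItem is a made-up name; it only shows the counting part, not how items are handed to the consumers):

import java.util.concurrent.CountDownLatch;

public class SharedItem<T> {

    public final T payload;
    private final CountDownLatch remaining;

    public SharedItem(T payload, int consumerCount) {
        this.payload = payload;
        this.remaining = new CountDownLatch(consumerCount);
    }

    // each consumer calls this exactly once when it has finished with the item
    public void done() {
        remaining.countDown();
    }

    // whoever owns the queue waits here, then drops the last reference to the item
    public void awaitAllDone() throws InterruptedException {
        remaining.await();
    }
}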
EDIT: brief description of listener pattern
Instead of multiple consumers, have a single ListenerManager consumer. It pulls an object from the queue, then passes it to all Listeners that have previously registered themselves with the ListenerManager. Finally, the ListenerManager disposes of the object. Have a single ListenerManager per queue, and a single queue per event/object type. This makes it easy to manage.
You can use the Disruptor with multiple consumers.
Please follow the link. You might have to specify the dependency between consumers.
Please see the example code I tried to attach: two handlers with one queue.
Is there any reason why the simple solution of just putting n references to item A into the queue instead of just one wouldn't work? If you have multiple producers you would have to synchronize their access, but if it's important that A finishes before B, you need that anyhow.
That's actually at least as memory-efficient as using an additional class to handle this: an instance of such a class adds at least 2 words of overhead, while the proposed approach only adds 2 extra references (which can be smaller than 2 words).
What you need to do is have a single thread listen on the queue, which then spreads each item to the consumers.
// listener thread
try {
    while (!Thread.currentThread().isInterrupted()) {
        Element element = queue.take();              // blocks until the next item
        for (final Listener listener : myListeners) {
            listener.getQueue().put(element);        // fan out to each consumer's own queue
        }
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();              // stop when interrupted
}
Basically a producer pattern combined with an observer pattern.
Use a separate queue for every consumer, and let the producer add each message to all of them. There's not really a drawback here.
It's not clear whether you need all consumers to have consumed A before any of them moves on to B; if so, you'll have to adjust.
I have to write a heavy-load system with a pretty easy task to do, so I decided to split this task across multiple workers in different locations (or clouds). To communicate, I want to use a RabbitMQ queue.
In my system there will be two kinds of software nodes: schedulers and workers. Schedulers take user input from queue_input, split it into smaller tasks and put these smaller tasks into workers_queue. Workers read this queue and 'do the thing'. I used round-robin load balancing here, and all works pretty well until some worker crashes. Then I lose information about task completion (it's not allowed to do a single operation twice; each task contains a pack of 50 iterations of worker code with different data).
I'm considering something like technical_queue, another channel for scheduler-worker communication, and I wonder how to design it well. I used the tutorials from the RabbitMQ page, so my worker thread looks like:
while (true) {
    message = consume(QUEUE, ...);
    handle(message); // do 50 simple tasks in a loop for the data in the message
}
How can I handle the second queue? Another thread with some while(true) {} loop, or is there a better solution? Maybe I should reuse the existing queue with a topic exchange? (But I wanted an independent way of communicating while handling the task, which may take some time.)
You should probably take a look at spring-amqp (doc). I hate to tell you to add a layer, but that Spring library takes care of the threading issues and management of threads with its SimpleMessageListenerContainer. Each container goes to a queue and you can specify the number of threads (i.e. workers) per queue.
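For illustration only (the queue name, host and thread count are placeholders; check the spring-amqp docs for the exact setup), a container per queue might look roughly like this:

import org.springframework.amqp.core.Message;
import org.springframework.amqp.core.MessageListener;
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;

public class WorkerContainer {
    public static void main(String[] args) {
        CachingConnectionFactory connectionFactory = new CachingConnectionFactory("localhost");

        SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
        container.setConnectionFactory(connectionFactory);
        container.setQueueNames("workers_queue");   // the queue the schedulers fill
        container.setConcurrentConsumers(5);        // number of worker threads for this queue
        container.setMessageListener(new MessageListener() {
            @Override
            public void onMessage(Message message) {
                handle(message.getBody());          // "do the thing" for one task
            }
        });
        container.start();
    }

    private static void handle(byte[] body) {
        // process the 50 iterations contained in the message
    }
}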
Alternatively, you can make your own using an ExecutorService, but you will probably end up rewriting what SimpleMessageListenerContainer does. Also, you could just launch (via the OS or batch scripts) more processes, which will add more consumers to each queue.
As far as queue topology is concerned, it is entirely dependent on business logic/concerns and generally less on performance needs. More often you add more queues for business reasons and more workers for performance reasons, but if a queue gets backed up with the same type of message, consider giving that type of message its own queue. What you're describing sounds like two queues with multiple consumers on your worker queue.
Other than the threading issue and queue topology, I'm not entirely sure what else you are asking.
I would recommend that you create a second consumer for the queue:
consumer1 -> queue_process
consumer2 -> queue_process
Both consumers should listen to the same queue.
Regards, I hope this helps.
I have an application which applies the producer-consumer design pattern. It is written in Java. In short, the producers put items in a blocking queue and the consumers take them from there. The consumers should run until signaled by a producer to stop.
What is the neatest way to deliver this signal from the producers to the consumers? The chief designer said he wants to keep producers and consumers separate, but I don't see any other way than invoking a method on the consumer thread pool.
The Chief Programmer is right. Keeping them separate leads to highly decoupled code which is excellent.
There are several ways to do this. One of them is called Poison Pill. Here's how it works - place a known item on the Queue when the Consumer see that item, they kill themselves or take another action.
This can be tricky if there are multiple Consumers (you mentioned a ThreadPool) or bounded Queues. Please look this up in Java Concurrency in Practice by Brian Goetz et al. It is explained there best.
Send a cancel message through the queue. Your consumers' run methods would look like:
try {
    while (true) {
        Message message = queue.take();
        if (message == Message.Cancel) {
            queue.offer(message);   // so that the other consumers can read the Cancel message
            break;
        }
        // handle a normal message here
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
Create a ConsumerHalter class. Register all consumers that want to get data from the queue with the ConsumerHalter, and have the producer trigger a halt event in the ConsumerHalter. The ConsumerHalter then calls onStopConsuming() on each consumer.
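A minimal sketch of that idea, assuming a hypothetical HaltableConsumer interface with an onStopConsuming() callback (none of these names come from a library):

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

interface HaltableConsumer {
    void onStopConsuming();
}

class ConsumerHalter {

    private final List<HaltableConsumer> consumers = new CopyOnWriteArrayList<>();

    void register(HaltableConsumer consumer) {
        consumers.add(consumer);
    }

    // called by a producer when all consumers should stop
    void haltAll() {
        for (HaltableConsumer consumer : consumers) {
            consumer.onStopConsuming();
        }
    }
}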
I am scared to use the event bus http://code.google.com/p/guava-libraries/wiki/EventBusExplained as I think it uses an unbounded queue internally. If a lot of messages are posted to it, it might run into full GC.
Does there exist a similar implementation which does the same thing, but without the unbounded queue?
Indeed, Guava uses a ConcurrentLinkedQueue, which is unbounded:
An unbounded thread-safe queue based on linked nodes.
See lines 151-158 of EventBus.java.
/** queues of events for the current thread to dispatch */
private final ThreadLocal<ConcurrentLinkedQueue<EventWithHandler>> eventsToDispatch =
        new ThreadLocal<ConcurrentLinkedQueue<EventWithHandler>>() {
            @Override protected ConcurrentLinkedQueue<EventWithHandler> initialValue() {
                return new ConcurrentLinkedQueue<EventWithHandler>();
            }
        };
You could always modify the code to use e.g. an ArrayBlockingQueue. Have you looked into similar solutions, e.g. the Disruptor?
I do agree with Arjit that an unbounded queue can be a disadvantage in certain scenarios. For example, suppose I have a service running that consumes messages from various sources, and I don't know the rate of incoming messages. It might exceed the processing speed of my worker/consumer, and I might want to establish the following contract: posting new messages to the workers will fail or block if there are still many messages pending. This not only prevents running out of memory but also guarantees that messages will actually be processed within a certain time frame. Additionally, clients receive direct feedback if the service is running at its limit.
@Arjit: You can check out MBassador at https://github.com/bennidi/mbassador
It is very similar to the Google Guava event bus but offers more features, bounded message queues being one of them. It is also very fast, and its internal design allows a great deal of customization and extension. Up to now, I have been able to address most of the feature requests from other users within a short period of time. Maybe give it a try.