I have a bunch of producer threads adding to a BlockingQueue and one worker thread taking objects. Now I want to extend this so two worker threads are taking objects, but doing different work on the objects. Here's the twist: I want an object that has been put on the queue to be worked on by both of the receiving threads.
If I keep using BlockingQueue, the two threads will compete for the objects, and only one of the worker threads will get the object.
So I'm looking for something similar to BlockingQueue, but with broadcast behaviour.
The application: The producer threads are actually creating performance measurements, and one of the workers is writing the measurements to a file, while the other worker is aggregating statistics.
I'm using Java 6.
So does such a mechanism exist? In Java SE? Elsewhere? Or do I need to code my own?
I'm looking for solutions with a small footprint - I'd prefer not to install some framework to do this.
One option: have three blocking queues. Your main producer puts items into a "broadcast" queue. You then have a consumer of that queue which consumes each item, putting it into both of the other queues, each of which is serviced by a single consumer:
                                        +--> Q2 --> Real Consumer 1
                                        |
Producer --> Q1 --> Broadcast Consumer -+
                                        |
                                        +--> Q3 --> Real Consumer 2
Alternatively, you could give two blocking queues to the producer, and just get it to put the items it produces into both. That's less elegant, but slightly simpler overall :)
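For illustration, here is a minimal sketch of the broadcast-consumer approach; the queue names are made up, String stands in for your real measurement type, and it sticks to Java 6 syntax:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BroadcastExample {

    public static void main(String[] args) {
        // Q1: producers put measurements here (String stands in for the real measurement type)
        final BlockingQueue<String> broadcastQueue = new LinkedBlockingQueue<String>();
        // Q2 and Q3: one per real consumer
        final BlockingQueue<String> fileQueue  = new LinkedBlockingQueue<String>();
        final BlockingQueue<String> statsQueue = new LinkedBlockingQueue<String>();

        // the broadcast consumer: takes each item once and forwards it to both queues
        Thread broadcastConsumer = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        String measurement = broadcastQueue.take();
                        fileQueue.put(measurement);   // for the file-writing worker
                        statsQueue.put(measurement);  // for the statistics worker
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // exit on interrupt
                }
            }
        }, "broadcast-consumer");
        broadcastConsumer.start();

        // the file writer and the statistics aggregator each take() from their own queue
    }
}

Note that both real consumers end up holding a reference to the same object, so the measurement objects should be effectively immutable once they are enqueued.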
Jon Skeet's idea is beautiful in its simplicity. Besides that, you can use the disruptor pattern, which is faster and addresses exactly this problem. I can give you a code example with CoralQueue, an implementation of the disruptor pattern done by Coral Blocks, with which I am affiliated. It provides a data structure called Splitter that accepts a single producer offering messages and multiple consumers polling messages, in such a way that every message is delivered to each and every consumer.
package com.coralblocks.coralqueue.sample.splitter;
import com.coralblocks.coralqueue.splitter.AtomicSplitter;
import com.coralblocks.coralqueue.splitter.Splitter;
import com.coralblocks.coralqueue.util.Builder;
public class Basics {
private static final int NUMBER_OF_CONSUMERS = 4;
public static void main(String[] args) {
Builder<StringBuilder> builder = new Builder<StringBuilder>() {
@Override
public StringBuilder newInstance() {
return new StringBuilder(1024);
}
};
final Splitter<StringBuilder> splitter = new AtomicSplitter<StringBuilder>(1024, builder, NUMBER_OF_CONSUMERS);
Thread producer = new Thread(new Runnable() {
private final StringBuilder getStringBuilder() {
StringBuilder sb;
while((sb = splitter.nextToDispatch()) == null) {
// splitter can be full if the size of the splitter
// is small and/or the consumer is too slow
// busy spin (you can also use a wait strategy instead)
}
return sb;
}
@Override
public void run() {
StringBuilder sb;
while(true) { // the main loop of the thread
// (...) do whatever you have to do here...
// and whenever you want to send a message to
// the other thread you can just do:
sb = getStringBuilder();
sb.setLength(0);
sb.append("Hello!");
splitter.flush();
// you can also send in batches to increase throughput:
sb = getStringBuilder();
sb.setLength(0);
sb.append("Hi!");
sb = getStringBuilder();
sb.setLength(0);
sb.append("Hi again!");
splitter.flush(); // dispatch the two messages above...
}
}
}, "Producer");
final Thread[] consumers = new Thread[NUMBER_OF_CONSUMERS];
for(int i = 0; i < consumers.length; i++) {
final int index = i;
consumers[i] = new Thread(new Runnable() {
@SuppressWarnings("unused")
@Override
public void run() {
while (true) { // the main loop of the thread
// (...) do whatever you have to do here...
// and whenever you want to check if the producer
// has sent a message you just do:
long avail;
while((avail = splitter.availableToPoll(index)) == 0) {
// splitter can be empty!
// busy spin (you can also use a wait strategy instead)
}
for(int i = 0; i < avail; i++) {
StringBuilder sb = splitter.poll(index);
// (...) do whatever you want to do with the data
// just don't call toString() to create garbage...
// copy byte-by-byte instead...
}
splitter.donePolling(index);
}
}
}, "Consumer" + index);
}
for(int i = 0; i < consumers.length; i++) {
consumers[i].start();
}
producer.start();
}
}
Disclaimer: I am one of the developers of CoralQueue.
I have the producer code that generates the random character:
public class Producer implements Runnable {
@Override
public void run() {
Stream<Character> generate = Stream.generate(this::generateRandomCharacter).limit(15);
generate.forEach(character -> {
MyEvent myEvent = new MyEvent();
myEvent.setMesage(character + "");
LOG.info("Producer: " + name + " is waiting to transfer...");
try {
boolean added = transferQueue.tryTransfer(myEvent, 4000, TimeUnit.MILLISECONDS);
if (added) {
numberOfProducedMessages.incrementAndGet();
LOG.info("Producer: " + name + " transferred element: A");
} else {
LOG.info("can not add an element due to the timeout");
}
} catch (InterruptedException e) {
e.printStackTrace();
}
});
}
}
The consumer code is provided:
public class Consumer implements Runnable {
private static final Logger LOG = Logger.getLogger(Consumer.class.getName());
private final TransferQueue<MyEvent> transferQueue;
private final String name;
final int numberOfMessagesToConsume;
final AtomicInteger numberOfConsumedMessages = new AtomicInteger();
Consumer(TransferQueue<MyEvent> transferQueue, String name, int numberOfMessagesToConsume) {
this.transferQueue = transferQueue;
this.name = name;
this.numberOfMessagesToConsume = numberOfMessagesToConsume;
}
@Override
public void run() {
while (true){
try {
LOG.info("Consumer: " + name + " is waiting to take element...");
MyEvent element = transferQueue.take();
longProcessing(element);
System.out.println("Consumer: " + name + " received element with messgae : " + element.getMesage());
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
private void longProcessing(MyEvent element) throws InterruptedException {
numberOfConsumedMessages.incrementAndGet();
Thread.sleep(5);
}
}
This is the code that invokes the producer and consumer:
TransferQueue<MyEvent> transferQueue = new LinkedTransferQueue<>();
ExecutorService exService = Executors.newFixedThreadPool(2);
Producer producer = new Producer( transferQueue, "1", 2);
Consumer consumer = new Consumer(transferQueue, "1", 2);
exService.execute(producer);
exService.execute(consumer);
boolean isShutDown = exService.awaitTermination(5000, TimeUnit.MILLISECONDS);
if (!isShutDown) {
exService.shutdown();
}
The producer will create only a limited number of characters for the consumer to consume. How do I know when the producer has finished generating characters?
I thought about implementing a timeout to detect that the producer is no longer sending characters, but there might be a better option for this implementation.
There are various alternative ways to achieve this:
Use a special type of event to show that the producer has finished. (This is basically what the answer by Krzysztof Cichocki suggests). Pros: simplicity. Cons: you have to make sure that whatever special event you choose to signify "finished" cannot possibly be a real event emitted by the producer.
Use a count. It looks like this is what your code is already trying to do. For example, pass 15 in the numberOfMessagesToConsume argument to the consumer constructor, and the run() method then stops once it has consumed 15 messages. Pros: simplicity. Cons: inflexibility, and you might not know beforehand how many messages the producer will produce.
Monitor the state of the producer thread. For example, the consumer can check while (producerThread.isAlive()) {...}. The producer thread will terminate when it has finished producing the messages. Pros: flexibility. Cons: you don't want the consumer to know about the producer thread, as that's too much coupling. For example, you might start the producer using new Thread(...) or you might use an ExecutorService or a CompletableFuture. The consumer shouldn't need to know.
One way to mitigate the disadvantage of option 3 is to pass a function to the consumer, decoupling the test of producer state from the threading details:
Constructor:
Consumer(TransferQueue<MyEvent> transferQueue, String name, BooleanSupplier isProducerStillProducing)
Call the constructor with a lambda:
new Consumer(transferQueue, name, () -> producerThread.isAlive())
Test it in the run() method:
while (isProducerStillProducing.getAsBoolean()) { ... }
You can just send an event from the producer with a message, e.g. "finished".
Then, in your consumer, just check for this message to know the stream is finished.
A timeout is not such a good idea, because it might trigger for reasons other than the stream being closed.
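For illustration, here is a minimal, self-contained sketch of that "poison pill" idea; it uses plain Strings rather than the MyEvent/TransferQueue types from the question, and the sentinel value is made up - it just has to be something the producer can never emit as real data:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PoisonPillDemo {

    // the sentinel must be something the producer can never emit as real data
    private static final String POISON_PILL = "__FINISHED__";

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 15; i++) {
                    queue.put("message-" + i);
                }
                queue.put(POISON_PILL); // tell the consumer there is nothing more to come
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // consumer: stop as soon as the sentinel is seen
        while (true) {
            String item = queue.take();
            if (POISON_PILL.equals(item)) {
                break;
            }
            System.out.println("consumed: " + item);
        }
    }
}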
Sometimes coordinating the shut-down of producer and consumer can be a very puzzling task. Sometimes it is easier in one programming language than another due to differences in syntax.
The following example written using the Ada programming language creates a producer and a consumer. The producer sends a series of characters to the consumer. The consumer prints each character as it is received. The consumer terminates when the producer terminates.
This example uses the Ada Rendezvous mechanism for communication between tasks (aka threads).
with Ada.Text_IO; use Ada.Text_IO;
procedure Main is
task producer;
task consumer is
entry send (Item : in Character);
end consumer;
task body producer is
subtype lower is Character range 'a' .. 'z';
subtype upper is Character range 'A' .. 'Z';
begin
for C in lower loop
consumer.send (C);
delay 0.05;
end loop;
for C in upper loop
consumer.send (C);
delay 0.05;
end loop;
end producer;
task body consumer is
Char : Character;
begin
loop
select
accept send (Item : in Character) do
Char := Item;
end send;
Put (Char);
if Char = 'z' then
New_Line(2);
end if;
or
terminate;
end select;
end loop;
end consumer;
begin
null;
end Main;
The output of this program is:
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
So my goal is to measure the performance of a Streaming Engine. It's basically a library to which I can send data packages.
I thought of implementing it like this: the Data Generator runs in a thread and generates data packages in an endless loop with a certain Thread.sleep(X) at the end. When running the tests, the idea is to minimize this Thread.sleep(X) to see whether it has an impact on the Streaming Engine's performance. The Data Generator writes the created packages into a queue, namely a ConcurrentLinkedQueue, which at the same time is a Singleton.
In another thread I instantiate the Streaming Engine, which continuously removes the packages from the queue by calling queue.remove(). This is done in an endless loop without any sleeping, because it should just run as fast as possible.
In a first attempt to implement this I ran into a problem. It seems as if the Data Generator is not able to put the packages into the queue as it should; it is doing so too slowly. My suspicion is that the endless loop of the Streaming Engine thread is eating up all the resources and therefore slowing down everything else.
I would be happy to hear how to approach this issue, or about other design patterns that could solve it elegantly.
The requirements are: two threads which basically run in parallel. One is putting data into a queue; the other one is reading/removing from the queue. I also want to measure the size of the queue regularly, in order to know whether the engine that is reading/removing from the queue is fast enough to process the generated packages.
You can use a BlockingQueue, for example an ArrayBlockingQueue. You can initialize these with a certain size, so the number of queued items will never exceed a certain number, as in this example:
// create queue, max size 100
final ArrayBlockingQueue<String> strings = new ArrayBlockingQueue<>(100);
final String stop = "STOP";
// start producing
Runnable producer = new Runnable() {
@Override
public void run() {
try {
for(int i = 0; i < 1000; i++) {
strings.put(Integer.toHexString(i));
}
strings.put(stop);
} catch(InterruptedException ignore) {
}
}
};
Thread producerThread = new Thread(producer);
producerThread.start();
// start monitoring
Runnable monitor = new Runnable() {
@Override
public void run() {
try {
while (true){
System.out.println("Queue size: " + strings.size());
Thread.sleep(5);
}
} catch(InterruptedException ignore) {
}
}
};
Thread monitorThread = new Thread(monitor);
monitorThread.start();
// start consuming
Runnable consumer = new Runnable() {
@Override
public void run() {
// infinite loop, returns when the stop marker is taken
try {
while(true) {
String value = strings.take();
if(value.equals(stop)){
return;
}
System.out.println(value);
}
} catch(InterruptedException ignore) {
}
}
};
Thread consumerThread = new Thread(consumer);
consumerThread.start();
// wait for producer and consumer to finish
producerThread.join();
consumerThread.join();
// interrupt the monitor (the consumer already returned at the stop marker)
monitorThread.interrupt();
You could also have a third thread monitoring the size of the queue, to give you an idea of which thread is outpacing the other.
Also, you can use the timed put method and the timed or untimed offer methods, which will give you more control over what to do when the queue is full or empty. In the above example, execution will block until there is space for the next element or until an element becomes available in the queue.
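As a rough sketch of the timed variants, reusing the strings queue from the example above (the 100 ms timeouts are arbitrary, and java.util.concurrent.TimeUnit is assumed to be imported):

try {
    // producer side: wait up to 100 ms for free space in the bounded queue
    boolean accepted = strings.offer("some value", 100, TimeUnit.MILLISECONDS);
    if (!accepted) {
        // the queue stayed full for the whole timeout - drop, log or retry
    }

    // consumer side: wait up to 100 ms for an element; null means nothing arrived in time
    String value = strings.poll(100, TimeUnit.MILLISECONDS);
    if (value == null) {
        // the queue stayed empty for the whole timeout
    }
} catch (InterruptedException ignore) {
}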
I want to introduce my problem first.
I have several WorkingThreads that are receiving a string, processing the string and afterwards sending the processed string to a global Queue like this:
class Main {
public static Queue<String> Q;
public static void main(String[] args) {
//start working threads
}
}
WorkingThread.java:
class WorkingThread extends Thread {
public void run() {
String input;
//do something with input
Main.Q.add(processedString);
}
}
So now, every 800 ms, another thread called Inserter dequeues all the entries to formulate some SQL, but that's not important.
class Inserter extends Thread {
public void run() {
while(!Main.Q.isEmpty()) {
System.out.print(".");
// dequeue and formulate some SQL
}
}
}
Everything works for about 5 to 10 minutes, but then suddenly I cannot see any dots printed any more (which are basically a heartbeat for the Inserter). The queue is not empty, I can assure that, but the Inserter just won't work even though it gets started regularly.
I have a suspicion that there is a problem when a worker wants to insert something while the Inserter is dequeuing the queue. Could this possibly be some kind of "deadlock"?
I really hope somebody has an explanation for this behaviour. I am looking forward to learning ;).
EDIT: I am using
Queue<String> Q = new LinkedList<String>();
You are not using a synchronized or thread-safe Queue, therefore you have a race hazard. Your use of a LinkedList shows a (slightly scary) lack of knowledge of this fact. You may want to read more about threading and thread safety before you tackle any more threaded code.
You must either synchronize manually or use one of the existing implementations provided by the JDK. Producer/consumer patterns are usually implemented using one of the BlockingQueue implementations.
A BlockingQueue of a bounded size will block producers trying to put if the queue is full. A BlockingQueue will always block consumers if the queue is empty.
This allows you to remove all of your custom logic that spins on the queue and waits for items.
A simple example using Java 8 lambdas would look like:
public static void main(String[] args) throws Exception {
final BlockingQueue<String> q = new LinkedBlockingQueue<>();
final ExecutorService executorService = Executors.newFixedThreadPool(4);
final Runnable consumer = () -> {
while (true) {
try {
System.out.println(q.take());
} catch (InterruptedException e) {
return;
}
}
};
executorService.submit(consumer);
final Stream<Runnable> producers = IntStream.range(0, 5).mapToObj(i -> () -> {
final Random random = ThreadLocalRandom.current();
while (true) {
q.add("Consumer " + i + " putting " + random.nextDouble());
try {
TimeUnit.MILLISECONDS.sleep(random.nextInt(2000));
} catch (InterruptedException e) {
//ignore
}
}
});
producers.forEach(executorService::submit);
}
The consumer blocks on the BlockingQueue.take method; as soon as an item is available it will be woken and will print the item. If there are no items, the thread will be suspended, allowing the physical CPU to do something else.
The producers each push a String onto the queue using add. As the queue is unbounded, add will always return true. In the case where there is likely to be a backlog of work for the consumer, you can bound the queue and use the put method (which throws an InterruptedException and so requires a try..catch, which is why it's easier to use add here) - this will automatically create flow control.
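For instance, a bounded variant of the queue above might look roughly like this (the capacity of 100 is an arbitrary example, and i and random refer to the variables in the producer lambda above):

// bounded queue: holds at most 100 items (the capacity is an arbitrary example)
final BlockingQueue<String> q = new LinkedBlockingQueue<>(100);

// producer side: put() blocks while the queue is full, throttling fast producers
// until the consumer catches up (unlike add(), it throws InterruptedException)
try {
    q.put("Producer " + i + " putting " + random.nextDouble());
} catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // stop producing if interrupted
}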
This seems more like a synchronization issue. You are trying to simulate the producer-consumer problem. You need to synchronize your Queue or use a BlockingQueue. You probably have a race condition.
You are going to need to synchronize access to your Queue or
use ConcurrentLinkedQueue see http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html
or as also suggested using a BlockingQueue (depending on your requirements) http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/BlockingQueue.html
For a more detailed explanation of the BlockingQueue see
http://tutorials.jenkov.com/java-util-concurrent/blockingqueue.html
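For example, a rough sketch of what the ConcurrentLinkedQueue variant could look like with the class and field names from the question (this is only an assumption about how the rest of your code is wired up):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class Main {
    // thread-safe queue shared by the WorkingThreads and the Inserter
    public static Queue<String> Q = new ConcurrentLinkedQueue<String>();
}

class Inserter extends Thread {
    public void run() {
        String s;
        // poll() returns null when the queue is empty and is safe to call
        // while WorkingThreads keep adding elements concurrently
        while ((s = Main.Q.poll()) != null) {
            System.out.print(".");
            // formulate some SQL from s here
        }
    }
}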
The current project I am working on requires that I implement a way to efficiently pass a set of objects from one thread, that runs continuously, to the main thread. The current setup is something like the following.
I have a main thread which creates a new thread. This new thread operates continuously and calls a method based on a timer. This method fetches a group of messages from an online source and organizes them in a TreeSet.
This TreeSet then needs to be passed back to the main thread so that the messages it contains can be handled independent of the recurring timer.
For better reference my code looks like the following
// Called by the main thread on start.
void StartProcesses()
{
if(this.IsWindowing)
{
return;
}
this._windowTimer = Executors.newSingleThreadScheduledExecutor();
Runnable task = new Runnable() {
public void run() {
WindowCallback();
}
};
this.CancellationToken = false;
_windowTimer.scheduleAtFixedRate(task,
0, this.SQSWindow, TimeUnit.MILLISECONDS);
this.IsWindowing = true;
}
/////////////////////////////////////////////////////////////////////////////////
private void WindowCallback()
{
ArrayList<Message> messages = new ArrayList<Message>();
//TODO create Monitor
if((!CancellationToken))
{
try
{
//TODO fix epochWindowTime
long epochWindowTime = 0;
int numberOfMessages = 0;
Map<String, String> attributes;
// Setup the SQS client
AmazonSQS client = new AmazonSQSClient(new
ClasspathPropertiesFileCredentialsProvider());
client.setEndpoint(this.AWSSQSServiceUrl);
// get the NumberOfMessages to optimize how to
// Receive all of the messages from the queue
GetQueueAttributesRequest attributesRequest =
new GetQueueAttributesRequest();
attributesRequest.setQueueUrl(this.QueueUrl);
attributesRequest.withAttributeNames(
"ApproximateNumberOfMessages");
attributes = client.getQueueAttributes(attributesRequest).
getAttributes();
numberOfMessages = Integer.valueOf(attributes.get(
"ApproximateNumberOfMessages")).intValue();
// determine if we need to Receive messages from the Queue
if (numberOfMessages > 0)
{
if (numberOfMessages < 10)
{
// just do it inline it's less expensive than
//spinning threads
ReceiveTask(numberOfMessages);
}
else
{
//TODO Create a multithreading version for this
ReceiveTask(numberOfMessages);
}
}
if (!CancellationToken)
{
//TODO testing
_setLock.lock();
Iterator<Message> _setIter = _set.iterator();
//TODO
while(_setIter.hasNext())
{
Message temp = _setIter.next();
Long value = Long.valueOf(temp.getAttributes().
get("Timestamp"));
if(value.longValue() < epochWindowTime)
{
messages.add(temp);
_set.remove(temp);
}
}
_setLock.unlock();
// TODO deduplicate the messages
// TODO reorder the messages
// TODO raise new Event with the results
}
if ((!CancellationToken) && (messages.size() > 0))
{
if (messages.size() < 10)
{
Pair<Integer, Integer> range =
new Pair<Integer, Integer>(Integer.valueOf(0),
Integer.valueOf(messages.size()));
DeleteTask(messages, range);
}
else
{
//TODO Create a way to divide this work among
//several threads
Pair<Integer, Integer> range =
new Pair<Integer, Integer>(Integer.valueOf(0),
Integer.valueOf(messages.size()));
DeleteTask(messages, range);
}
}
}catch (AmazonServiceException ase){
ase.printStackTrace();
}catch (AmazonClientException ace) {
ace.printStackTrace();
}
}
}
As can be seen from some of the comments, my currently preferred way to handle this is to raise an event in the timer thread if there are messages. The main thread will then be listening for this event and handle it appropriately.
Presently I am unfamiliar with how Java handles events, or how to create/listen for them. I also do not know if it is possible to create events and have the information contained within them passed between threads.
Can someone please give me some advice/insight on whether or not my methods are possible? If so, where might I find some information on how to implement them as my current searching attempts are not proving fruitful.
If not, can I get some suggestions on how I would go about this, keeping in mind I would like to avoid having to manage sockets if at all possible.
EDIT 1:
The main thread will also be responsible for issuing commands based on the messages it receives, or issuing commands to get required information. For this reason the main thread cannot wait on receiving messages, and should handle them in an event based manner.
Producer-Consumer Pattern:
One thread (producer) continuously adds objects (messages) to a queue.
Another thread (consumer) reads and removes objects from the queue.
If your problem fits this, try BlockingQueue.
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/BlockingQueue.html
It is easy and effective.
If the queue is empty, the consumer will be blocked, which means the thread waits (and so does not use CPU time) until the producer puts some objects; otherwise the consumer continuously consumes objects.
And if the queue is full, the producer will be blocked until the consumer consumes some objects to make room in the queue, and vice versa.
Here's an example:
(the queue should be the same object in both the producer and the consumer)
(Producer thread)
Message message = createMessage();
queue.put(message);
(Consumer thread)
Message message = queue.take();
handleMessage(message);
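Put together, a minimal self-contained sketch of the same idea might look like this (String stands in for your TreeSet of messages, and the loop counts and sleep are arbitrary):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class TimerToMainDemo {

    public static void main(String[] args) throws InterruptedException {
        // the same queue object is shared by the timer thread and the main thread
        final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(100);

        Thread timerThread = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < 5; i++) {
                        queue.put("batch fetched at tick " + i); // blocks if the queue is full
                        Thread.sleep(200);                       // stands in for the timer period
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        timerThread.start();

        // main thread: handle each batch as it arrives, without polling or busy-waiting
        for (int i = 0; i < 5; i++) {
            String batch = queue.take(); // blocks (without burning CPU) until a batch is available
            System.out.println("main thread handling: " + batch);
        }
    }
}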
I retrieved 50,000 records from the database and stored them in an ArrayList. I split the ArrayList in half, with 25,000 records stored in ArrayList1 (even rows) and the other 25,000 in ArrayList2 (odd rows).
Now I need to use multi-threading to process these such that all 50,000 records are processed at once. The main aim is to speed up the transaction.
The problem is that userList gets too heavy and the processing takes time.
How can I use an ExecutorService to speed this up?
Hoping to receive your suggestions asap.
List<String[]> userList = new ArrayList<String[]>();
void getRecords()
{
String [] props=null;
while (rs.next()) {
props = new String[2];
props[0] = rs.getString("useremail");
props[1] = rs.getString("active");
userList.add(props);
if (userList.size()>0) sendEmail();
}
}
void sendEmail()
{
String [] user=null;
for (int k=0; k<userList.size(); k++)
{
user = userList.get(k);
userEmail = user[0];
//send email code
}
}
Thanks in advance.
There's a simpler approach: producer-consumer. Leave all items in a single list and define a processing task that encapsulates a data item:
class Task implements Runnable {
private Object data;
public Task(Object data) {
this.data = data;
}
public void run() {
// process data
}
}
Create a thread pool and feed it the tasks one by one:
ExecutorService exec = Executors.newFixedThreadPool(4); // 4 threads
for(Object obj: itemList) {
exec.submit(new Task(obj));
}
exec.shutdown();
exec.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
Now you have parallel execution and load balancing (!!!) since the threads execute work on-demand as they finish previous tasks. By splitting the array into contiguous sections you don't have this guarantee.
I would create an ArrayList for each Thread. That way each thread only reads one list and you won't have a multi-threading issue.
ExecutorService service = ...
List<Work> workList = ...
int blockSize = (workList.size() + threads - 1)/threads;
for(int i = 0; i < threads;i ++) {
int start = i * blockSize;
int end = Math.min((i + 1) * blockSize, workList.size());
final List<Work> someWork = workList.subList(start, end);
service.submit(new Runnable() {
public void run() {
process(someWork);
}
});
}
You can use any number of threads, but I suggest using the smallest number which gives you a performance increase.
I don't know why you've split the list into two lists. Why not keep them in one, and run two threads - one processing the even rows, one processing the odd rows?
Regardless, check out the Java Executor framework. It allows you to easily write jobs and submit them for running (using thread pools, scheduling them etc.). Given that the executor framework can handle arbitrary numbers of threads, I would split your workload more intelligently (perhaps into sublists of 'n' elements) and determine (by changing the number of jobs/threads) which configuration runs fastest in your particular scenario.
I would use a Queue instead of a List, probably a ConcurrentLinkedQueue. That should be thread-safe and thus allow concurrent access from different threads.
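A rough sketch of that idea, with several worker threads draining one shared queue (the pool size, the dummy data and the placeholder for the email-sending code are all assumptions):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EmailQueueDemo {

    public static void main(String[] args) throws InterruptedException {
        final Queue<String[]> userQueue = new ConcurrentLinkedQueue<String[]>();
        // fill the queue instead of the ArrayList (normally from the ResultSet)
        for (int i = 0; i < 50000; i++) {
            userQueue.add(new String[] { "user" + i + "@example.com", "Y" });
        }

        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < 4; t++) {
            pool.submit(new Runnable() {
                public void run() {
                    String[] user;
                    // poll() returns null once the queue is drained, ending this worker
                    while ((user = userQueue.poll()) != null) {
                        // send the email to user[0] here
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}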