I have this piece of code:
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
public void run(){
    while(!intervals.isEmpty()){
        //remove one interval
        //do calculations
        //add some intervals
    }
}
This code is executed by a fixed number of threads at the same time. As you can see, the loop should go on until there are no more intervals left in the collection, but there is a problem. At the beginning of each iteration an interval is removed from the collection, and at the end some number of intervals might be added back to the same collection.
The problem is that while one thread is inside the loop the collection might become empty, so other threads trying to enter the loop won't be able to and will finish their work prematurely, even though the collection might be refilled once the first thread finishes its iteration. I want the thread count to remain constant (or at most some number n) until all work is really finished.
That means that no threads are currently working in the loop and there are no elements left in the collection. What are possible ways of accomplishing that? Any ideas are welcome.
One way to solve this problem in my specific case is to give every thread a different piece of the original collection. But once a thread finished its piece it wouldn't be used by the program anymore, even though it could help other threads with their calculations. I don't like this solution, because it's important to utilize all cores of the machine in my problem.
This is the simplest minimal working example I could come up with. It might be too lengthy.
public class Test{
    private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
    private int threadNumber;
    private Thread[] threads;
    private double result;
    public Test(int threadNumber){
        intervals.add(new Interval(0, 1));
        this.threadNumber = threadNumber;
        threads = new Thread[threadNumber];
    }
    public double find(){
        for(int i = 0; i < threadNumber; i++){
            threads[i] = new Thread(new Finder());
            threads[i].start();
        }
        for(int i = 0; i < threadNumber; i++){
            try{
                threads[i].join();
            }
            catch(InterruptedException e){
            }
        }
        return result;
    }
    private class Finder implements Runnable{
        public void run(){
            while(!intervals.isEmpty()){
                Interval interval = intervals.poll();
                if(interval.high - interval.low > 1e-6){
                    double middle = (interval.high + interval.low) / 2;
                    boolean something = true;
                    if(something){
                        intervals.add(new Interval(interval.low + 0.1, middle - 0.1));
                        intervals.add(new Interval(middle + 0.1, interval.high - 0.1));
                    }
                    else{
                        intervals.add(new Interval(interval.low + 0.1, interval.high - 0.1));
                    }
                }
            }
        }
    }
    private class Interval{
        double low;
        double high;
        public Interval(double low, double high){
            this.low = low;
            this.high = high;
        }
    }
}
What you might need to know about the program: After every iteration interval should either disappear (because it's too small), become smaller or split into two smaller intervals. Work is finished after no intervals are left. Also, I should be able to limit number of threads that are doing this work with some number n. The actual program looks for a maximum value of some function by dividing the intervals and throwing away the parts of those intervals that can't contain the maximum value using some rules, but this shouldn't really be relevant to my problem.
The CompletableFuture class is also an interesting solution for this kind of task.
It automatically distributes the workload over a number of worker threads.
static CompletableFuture<Integer> fibonacci(int n) {
    if(n < 2) return CompletableFuture.completedFuture(n);
    else {
        return CompletableFuture.supplyAsync(() -> {
            CompletableFuture<Integer> f1 = fibonacci(n - 1);
            CompletableFuture<Integer> f2 = fibonacci(n - 2);
            return f1.thenCombineAsync(f2, (a, b) -> a + b);
        }).thenComposeAsync(f -> f);
    }
}
public static void main(String[] args) throws Exception {
    int fib = fibonacci(10).get();
}
You can use an atomic flag, e.g.:
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
private AtomicBoolean inUse = new AtomicBoolean();
public void run() {
    while (!intervals.isEmpty() && inUse.compareAndSet(false, true)) {
        // work
        inUse.set(false); // release the flag so the next iteration can claim it
    }
}
The question has been updated, so here is a better solution. It is a more "classic" solution using a blocking queue:
// ArrayBlockingQueue needs an explicit capacity; 1024 is an arbitrary choice
private BlockingQueue<Interval> intervals = new ArrayBlockingQueue<>(1024);
private volatile boolean finished = false;
public void run() {
    try {
        while (!finished) {
            Interval next = intervals.take();
            // put work there
            // after you decide work is finished just set finished = true
            intervals.put(next); // anyway, return interval to queue
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // keep the interrupt status for the caller
    }
}
Now it seems better to rewrite the solution and divide the range into sub-ranges, one for each thread.
Your problem looks like a recursive one - processing one task (interval) might produce some sub-tasks (sub intervals).
For that purpose I would use ForkJoinPool and RecursiveAction:
class Interval {
}
class IntervalAction extends RecursiveAction {
    private Interval interval;
    private IntervalAction(Interval interval) {
        this.interval = interval;
    }
    @Override
    protected void compute() {
        if (...) {
            // we need two sub-tasks
            IntervalAction sub1 = new IntervalAction(new Interval(...));
            IntervalAction sub2 = new IntervalAction(new Interval(...));
            invokeAll(sub1, sub2); // fork both and wait for them
        } else if (...) {
            // we need just one sub-task
            IntervalAction sub3 = new IntervalAction(new Interval(...));
            sub3.compute(); // run it directly in the current thread
        } else {
            // current task doesn't need any sub-tasks, just return
        }
    }
    public static void compute(Interval initial) {
        ForkJoinPool pool = new ForkJoinPool();
        pool.invoke(new IntervalAction(initial));
        // invoke will return when all the processing is completed
    }
}
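As a concrete, runnable illustration of the skeleton above, here is a minimal sketch. It assumes a simplified rule (an interval is bisected until it is no wider than a threshold) instead of the question's actual pruning rules, and the names BisectAction, countLeaves and minWidth are made up for this example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: bisect [low, high] until each piece is at most minWidth wide.
class BisectAction extends RecursiveAction {
    private final double low, high, minWidth;
    private final AtomicInteger leaves; // shared count of intervals that were not split further

    BisectAction(double low, double high, double minWidth, AtomicInteger leaves) {
        this.low = low;
        this.high = high;
        this.minWidth = minWidth;
        this.leaves = leaves;
    }

    @Override
    protected void compute() {
        if (high - low <= minWidth) {
            leaves.incrementAndGet(); // small enough: no sub-tasks needed
            return;
        }
        double middle = (low + high) / 2;
        // Fork both halves and wait for them; idle workers steal queued sub-tasks.
        invokeAll(new BisectAction(low, middle, minWidth, leaves),
                  new BisectAction(middle, high, minWidth, leaves));
    }

    // Runs the whole computation on a pool with the given target parallelism.
    static int countLeaves(double low, double high, double minWidth, int parallelism) {
        AtomicInteger leaves = new AtomicInteger();
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // invoke returns only after the root task and all of its sub-tasks are done
            pool.invoke(new BisectAction(low, high, minWidth, leaves));
            return leaves.get();
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling BisectAction.countLeaves(0.0, 1.0, 0.25, 4) splits [0, 1] into quarter-width pieces. The pool takes care of both requirements from the question: the parallelism is bounded, and the call returns only when no task is running and none is queued.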
I had the same problem, and I tested the following solution.
In my test example I have a queue (the equivalent of your intervals) filled with integers. For the test, at each iteration one number is taken from the queue, incremented and placed back in the queue if the new value is below 7 (arbitrary). This has the same impact as your interval generation on the mechanism.
Here is a working example (note that I develop in Java 1.8 and use the Executor framework to handle my thread pool):
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
public class Test {
final int numberOfThreads;
final BlockingQueue<Integer> queue;
final BlockingQueue<Integer> availableThreadsTokens;
final BlockingQueue<Integer> sleepingThreadsTokens;
final ThreadPoolExecutor executor;
public static void main(String[] args) {
final Test test = new Test(2); // arbitrary number of thread => 2
private Test(int numberOfThreads){
this.numberOfThreads = numberOfThreads;
this.queue = new PriorityBlockingQueue<Integer>();
this.availableThreadsTokens = new LinkedBlockingQueue<Integer>(numberOfThreads);
this.sleepingThreadsTokens = new LinkedBlockingQueue<Integer>(numberOfThreads);
this.executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(numberOfThreads);
public void launch() {
// put some elements in queue at the beginning
for(int i = 0; i < numberOfThreads; i++){
boolean algorithmIsFinished = false;
if(sleepingThreadsTokens.size() != numberOfThreads){
try {
} catch (final InterruptedException e) {
// some treatment should be put there in case of failure
if(!queue.isEmpty()){ // Continuation condition
executor.submit(new Loop(queue.poll(), queue, availableThreadsTokens));
algorithmIsFinished = true;
public static class Loop implements Runnable{
int element;
final BlockingQueue<Integer> queue;
final BlockingQueue<Integer> availableThreadsTokens;
public Loop(Integer element, BlockingQueue<Integer> queue, BlockingQueue<Integer> availableThreadsTokens){
this.element = element;
this.queue = queue;
this.availableThreadsTokens = availableThreadsTokens;
public void run(){
System.out.println("taking element "+element);
for(Long l = (long) 0; l < 500000000L; l++){
for(Long l = (long) 0; l < 500000000L; l++){
for(Long l = (long) 0; l < 500000000L; l++){
if(element < 7){
System.out.println("Inserted element"+(element + 1));
System.out.println("no insertion");
I ran this code as a check, and it seems to work properly. However, there are certainly some improvements that can be made:
sleepingThreadsTokens does not have to be a BlockingQueue, since only the main thread accesses it. I used this interface because it allowed a nice sleepingThreadsTokens.drainTo(availableThreadsTokens);
I'm not sure whether queue has to be blocking or not, since only main takes from it and does not wait for elements (it waits only for tokens).
The idea is that the main thread checks for termination, and for this it has to know how many threads are currently working (so that it does not stop the algorithm prematurely just because the queue is empty). To do so, two specific queues are created: availableThreadsTokens and sleepingThreadsTokens. Each element in availableThreadsTokens symbolizes a thread that has finished an iteration and is waiting to be given another one. Each element in sleepingThreadsTokens symbolizes a thread that was available to take a new iteration, but the queue was empty, so it had no job and went to "sleep". So at each moment availableThreadsTokens.size() + sleepingThreadsTokens.size() = numberOfThreads - threadsExecutingIteration.
Note that the elements of availableThreadsTokens and sleepingThreadsTokens only symbolize thread activity; they are not threads, nor do they designate a specific thread.
Case of termination: let's suppose we have N threads (an arbitrary, fixed number). The N threads are waiting for work (N tokens in availableThreadsTokens), there is only 1 remaining element in the queue, and the treatment of this element won't generate any other element. Main takes the first token, finds that the queue is not empty, polls the element and sends the thread to work. The next N-1 tokens are consumed one by one, and since the queue is empty the tokens are moved into sleepingThreadsTokens one by one. Main knows that there is 1 thread working in the loop, since there is no token in availableThreadsTokens and only N-1 in sleepingThreadsTokens, so it waits (.take()). When the thread finishes and releases the token, Main consumes it, discovers that the queue is now empty and puts the last token in sleepingThreadsTokens. Since all tokens are now in sleepingThreadsTokens, Main knows that 1) all threads are inactive and 2) the queue is empty (otherwise the last token wouldn't have been transferred to sleepingThreadsTokens, since the thread would have taken the job).
Note that if the working thread finishes the treatment before all the availableThreadsTokens are moved to sleepingThreadsTokens, it makes no difference.
Now if we suppose that the treatment of the last element had generated M new elements in the queue, then Main would have put all the tokens from sleepingThreadsTokens back into availableThreadsTokens and started to assign them treatments again. We put all the tokens back even if M < N because we don't know how many elements will be inserted in the future, so we have to keep all the threads available.
I would suggest a master/worker approach then.
The master process goes through the intervals and assigns the calculations of that interval to a different process. It also removes/adds as necessary. This way, all the cores are utilized, and only when all intervals are finished, the process is done. This is also known as dynamic work allocation.
A possible example:
public void run(){
//remove one interval
Thread t = new Thread(new Runnable()
//do calculations
//add some intervals
The possible solution you provided is known as static allocation, and you're correct: it will finish only as fast as the slowest processor, whereas the dynamic approach keeps all processors busy.
I've run into this problem as well. The way I solved it was to use an AtomicInteger to know what is in the queue. Before each offer() increment the integer. After each poll() decrement the integer. The CLQ has no real isEmpty() since it must look at head/tail nodes and this can change atomically (CAS).
This is not a 100% guarantee, because another thread may increment the counter right after this thread has seen it at zero, so you need to check again before ending the thread. It is still better than relying on while(...isEmpty()).
Other than that, you may need to synchronize.
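A minimal sketch of the counter idea, with one assumption that differs slightly from the description above: the counter tracks items that are either queued or still being processed, and it is decremented only after an item's follow-up items have been counted. That way it can reach zero only when all work is really done. PendingCounterDemo, drain and limit are hypothetical names, and the "work" is the integer test from the other answer (each value below the limit re-enqueues value + 1).

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

class PendingCounterDemo {
    // Processes start, start + 1, ..., limit with threadCount workers; returns the number of items processed.
    static int drain(int start, int limit, int threadCount) {
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        AtomicInteger pending = new AtomicInteger();   // items queued or currently being processed
        AtomicInteger processed = new AtomicInteger();

        pending.incrementAndGet();                     // count the item before offering it
        queue.offer(start);

        Runnable worker = () -> {
            // Exit only when nothing is queued AND nobody is still working on an item.
            while (pending.get() > 0) {
                Integer item = queue.poll();
                if (item == null) {
                    Thread.yield();                    // queue is momentarily empty; another worker may refill it
                    continue;
                }
                if (item < limit) {
                    pending.incrementAndGet();         // count the follow-up item first ...
                    queue.offer(item + 1);
                }
                processed.incrementAndGet();
                pending.decrementAndGet();             // ... then mark the current item as finished
            }
        };

        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(worker);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return processed.get();
    }
}
```

The idle workers spin with Thread.yield() here to keep the sketch short; a blocking queue with a timed poll would avoid burning CPU while waiting.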
In §7.5.1 of The Art of Multiprocessor Programming by Herlihy et al. (2nd ed., 2020), the authors present a simple lock that uses an array queue to achieve FIFO locking. Intuitively, the nth thread has a (thread-local) index into an array, and then spins on that array element until the (n - 1)th thread unlocks the lock. Its code looks like this:
public class ALock {
    ThreadLocal<Integer> mySlotIndex = new ThreadLocal<>() {
        @Override protected Integer initialValue() { return 0; }
    };
    AtomicInteger tail;
    volatile boolean[] flag;
    int size;
    public ALock(int capacity) {
        size = capacity;
        tail = new AtomicInteger(0);
        flag = new boolean[capacity];
        flag[0] = true;
    }
    public void lock() {
        int slot = tail.getAndIncrement() % size;
        mySlotIndex.set(slot); // remember the slot so unlock() can find it
        while (!flag[slot]) {};
    }
    public void unlock() {
        int slot = mySlotIndex.get();
        flag[slot] = false;
        flag[(slot + 1) % size] = true;
    }
}
I am using a minimal test program to check that this lock is fair. In a nutshell, I create NUM_THREADS threads and map each one to an array index id. Each thread tries to acquire the same lock. Once it succeeds, it increments a global COUNT and also increments RUNS_PER_THREAD[id].
If the lock is correct, the final value of COUNT should equal the sum of the values in RUNS_PER_THREAD. If the lock is fair, the elements of RUNS_PER_THREAD should be approximately equal.
public class Main {
    static long COUNT = 0;
    static int NUM_THREADS = 16;
    // static Lock LOCK = new ReentrantLock(true);
    static ALock LOCK = new ALock(NUM_THREADS);
    static long[] RUNS_PER_THREAD = new long[NUM_THREADS];
    static Map<Long, Integer> THREAD_IDS = new HashMap<>();
    public static void main(String[] args) {
        var threads = IntStream.range(0, NUM_THREADS).mapToObj(Main::makeWorker).toArray(Thread[]::new);
        for (int i = 0; i < threads.length; i++) THREAD_IDS.put(threads[i].getId(), i);
        for (var thread: threads) thread.start();
        try { Thread.sleep(300L); } catch (InterruptedException e) {}
        for (var thread: threads) thread.interrupt();
        try { Thread.sleep(100L); } catch (InterruptedException e) {}
        for (int i = 0; i < NUM_THREADS; i++) System.out.printf("Thread %d:\t%12d%n", i, RUNS_PER_THREAD[i]);
        System.out.println("Counted up to: \t\t\t" + COUNT);
        System.out.println("Sum for all threads: \t" + Arrays.stream(RUNS_PER_THREAD).sum());
    }
    private static Thread makeWorker(int i) {
        return new Thread(() -> {
            while (true) {
                if (Thread.interrupted()) return;
                LOCK.lock();
                try {
                    var id = THREAD_IDS.get(Thread.currentThread().getId());
                    COUNT++;
                    RUNS_PER_THREAD[id]++;
                } finally {
                    LOCK.unlock();
                }
            }
        });
    }
}
If the test program is run with a fair ReentrantLock, the final count of runs per thread with 16 threads (on my M1 Max Mac with Java 17) is almost exactly equal. If the same test is run with ALock, the first few threads seem to acquire the lock approximately 10 times more frequently than the last few threads.
Is ALock, as presented, unfair, and if so, why? Alternatively, is my minimal test flawed, and if so, why does it seem to demonstrate the fairness of ReentrantLock?
Your test code has a non-thread-safe update in COUNT++. Switch to COUNT.incrementAndGet() and:
static AtomicLong COUNT = new AtomicLong();
ALock will give unfair results, especially when the number of threads exceeds the number of CPUs. The implementation relies on a high-CPU spin loop, while (!flag[slot]), and not all threads get the same opportunity to enter their lock spin loops: the first few threads perform more of the lock-unlock cycles. Adding Thread.yield() should balance out the threads' access to the boolean array so that all threads have similar opportunities to run through their own lock spin loop.
while (!flag[slot]) {
    Thread.yield();
}
You should see different results if you try setting NUM_THREADS to be same or less than Runtime.getRuntime().availableProcessors() - the use of Thread.yield() may not make a difference compared to when NUM_THREADS > Runtime.getRuntime().availableProcessors().
Using this lock class will lead to slower throughput as at any one time up to N-1 threads are in high CPU spin loop waiting for the current locking thread to call unlock(). In ideal lock implementations, N-1 waiters won't be consuming CPU.
The ALock locking strategy only works if no more threads contend for the lock than the capacity passed to new ALock(NUM_THREADS), because otherwise int slot = tail.getAndIncrement() % size; may result in 2 threads spinning on the same slot.
Note that any code relying on spin loop or Thread.yield() to work is not an effective implementation and should not be used in production code. Both can be avoided with the classes of java.util.concurrent.*.
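For experimenting with the points above, here is a hedged sketch of an array-based queue lock. It is not the book's code: it assumes an AtomicIntegerArray for the flags (declaring an array reference volatile does not make accesses to its elements volatile), it yields inside the spin loop as suggested above, and it assumes that at most capacity threads contend at once. ArrayQueueLock and countWithLock are names invented for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class ArrayQueueLock {
    private final ThreadLocal<Integer> mySlot = ThreadLocal.withInitial(() -> 0);
    private final AtomicInteger tail = new AtomicInteger();
    private final AtomicIntegerArray flag;   // flag[i] == 1 means the owner of slot i may enter
    private final int size;

    ArrayQueueLock(int capacity) {
        size = capacity;
        flag = new AtomicIntegerArray(capacity);
        flag.set(0, 1);                      // the first arrival gets the lock immediately
    }

    void lock() {
        int slot = Math.floorMod(tail.getAndIncrement(), size);
        mySlot.set(slot);                    // remembered for unlock()
        while (flag.get(slot) == 0) {
            Thread.yield();                  // give other threads a chance while spinning
        }
    }

    void unlock() {
        int slot = mySlot.get();
        flag.set(slot, 0);                   // close our own slot
        flag.set((slot + 1) % size, 1);      // hand the lock to the next slot in FIFO order
    }

    // Correctness check: threadCount threads each add perThread to a plain counter under the lock.
    static long countWithLock(int threadCount, int perThread) {
        ArrayQueueLock lock = new ArrayQueueLock(threadCount);
        long[] counter = {0};
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }
}
```

This only checks mutual exclusion. How evenly acquisitions are spread across threads still depends on the scheduler and on the ratio of threads to cores.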
package com.playground.concurrency;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
public class MyRunnable implements Runnable {
private String taskName;
public String getTaskName() {
return taskName;
public void setTaskName(String taskName) {
this.taskName = taskName;
private int processed = 0;
public MyRunnable(String name) {
this.taskName = name;
private boolean keepRunning = true;
public boolean isKeepRunning() {
return keepRunning;
public void setKeepRunning(boolean keepRunning) {
this.keepRunning = keepRunning;
private BlockingQueue<Integer> elements = new LinkedBlockingQueue<Integer>(10);
public BlockingQueue<Integer> getElements() {
return elements;
public void setElements(BlockingQueue<Integer> elements) {
this.elements = elements;
public void run() {
while (keepRunning || !elements.isEmpty()) {
try {
Integer element = elements.take();
System.out.println(taskName +" :: "+elements.size());
System.out.println("Got :: " + element);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
System.out.println("Exiting thread");
public int getProcessed() {
return processed;
public void setProcessed(int processed) {
this.processed = processed;
package com.playground.concurrency.service;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import com.playground.concurrency.MyRunnable;
public class TestService {
public static void main(String[] args) throws InterruptedException {
int roundRobinIndex = 0;
int noOfProcess = 10;
List<MyRunnable> processes = new ArrayList<MyRunnable>();
for (int i = 0; i < noOfProcess; i++) {
processes.add(new MyRunnable("Task : " + i));
ExecutorService threadPoolExecutor = Executors.newFixedThreadPool(5);
for (MyRunnable process : processes) {
int totalMessages = 1000;
long start = System.currentTimeMillis();
for (int i = 1; i <= totalMessages; i++) {
if (roundRobinIndex == noOfProcess) {
roundRobinIndex = 0;
System.out.println("Done putting all the elements");
for (MyRunnable process : processes) {
try {
threadPoolExecutor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
long totalProcessed = 0;
for (MyRunnable process : processes) {
System.out.println("task " + process.getTaskName() + " processd " + process.getProcessed());
totalProcessed += process.getProcessed();
long end = System.currentTimeMillis();
System.out.println("total time" + (end - start));
I have a simple task that reads elements from a LinkedBlockingQueue. I create multiple instances of this task and execute them with an ExecutorService. This program works as expected when noOfProcess and the thread pool size are the same (for example, noOfProcess=10 and thread pool size=10).
However, if noOfProcess=10 and thread pool size=5, then the main thread keeps waiting at the line below after processing a few items.
processes.get(roundRobinIndex++).getElements().put(i);
What am I doing wrong here?
Ah yes. The good old deadlock.
What happens is: You submit 10 Tasks to the ExecutorService, and then send jobs via .put(i). This blocks for Task 5 as expected when its queue is full. Now Task 5 is not currently being executed, and as a matter of fact will never be, since Task 0 to 4 are still clogging up your FixedThreadPool, blocking at .take() in the run() Method waiting for new Jobs from .put(i), which they will never get.
This error is a fundamental design flaw in your code and there are myriad ways to fix it, one of which is increasing the thread pool size.
My suggestion is that you go back to the drawing board and rethink the structure in the main Method.
And since you posted your code, have some tips:
1. Posting your entire code can be interpreted as a call to 'pls fix my code', and you are encouraged to omit all unnecessary details (like all those getters and setters). Maybe check https://stackoverflow.com/help/minimal-reproducible-example
2. Posting two classes in the same body made things kinda complicated. Split it next time.
3. (nitpick) Combining two operations in one statement, as you did with processes.get(roundRobinIndex++), is bad style since it makes your code less readable for others. You could just have written:
processes.get(i % noOfProcess).getElements().put(i);
To fix the behavior, you need to do one of the following:
have enough Runnables, each with enough queue capacity, to take all 1,000 messages (for example: 100 Runnables with capacity 10 or more; or 10 Runnables with capacity 100 or more), or
have a thread pool that is large enough to accommodate all of your Runnables so that each of them can start running.
Without one of those happening, the ExecutorService will not start the extra Runnables. The main thread will continue adding items to each queue, including those of non-running Runnables, until it encounters a queue that is full, at which point it blocks. With 10 Runnables and thread pool size 5, the first queue to fill up will be that of the 6th Runnable. This is the same as if you had just 6 Runnables. The significant point is that you have at least one more Runnable than you have room for in your thread pool.
From newFixedThreadPool() Javadoc:
If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available.
Consider a simpler example of 2 processes and thread pool size of 1. You'll be allowed to create the first process and submit it to the ExecutorService (so the ExecutorService will start and run it). The second process however, will not be allowed to run by the ExecutorService. Your main thread does not pay attention to this, however, and it will continue putting elements into the queue for the second process even though nothing is consuming it.
Your code is OK with noOfProcess=10 and thread pool size=5 if you also change your queue size to 100, like this: new LinkedBlockingQueue<>(100).
You can observe this behavior, where the queue of a non-running Runnable fills up, if you change this line:
processes.get(roundRobinIndex++).getElements().put(i);
to this (which is the same logical code, but has object references saved for use inside the println() output):
MyRunnable runnable = processes.get(roundRobinIndex++);
BlockingQueue<Integer> elements = runnable.getElements();
System.out.println("attempt to put() for " + runnable.getTaskName() + " with " + elements.size() + " elements");
elements.put(i);
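One way to restructure the program so that it no longer depends on every task getting a thread is sketched below. This is an alternative design, not the original poster's, and it assumes messages do not have to be bound to a particular task: all workers read from one shared bounded queue, and the producer ends the run with one stop marker per worker. SharedQueueDemo, process and STOP are names invented for this sketch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SharedQueueDemo {
    private static final int STOP = Integer.MIN_VALUE; // hypothetical stop marker, never used as a message

    // Sends the messages 1..totalMessages through one bounded queue; returns how many were consumed.
    static int process(int totalMessages, int workers) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(10);
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // One consumer per pool thread, so every submitted task is actually running.
        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        int element = queue.take();
                        if (element == STOP) {
                            return;                    // this worker is done
                        }
                        processed.incrementAndGet();   // stand-in for the real work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (int i = 1; i <= totalMessages; i++) {
                queue.put(i);                          // blocks only until some running worker takes an element
            }
            for (int w = 0; w < workers; w++) {
                queue.put(STOP);                       // one stop marker per worker
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return processed.get();
    }
}
```

Because the number of consumers equals the pool size, put() can never wait on a queue that nobody is reading, which is the situation that blocked the main thread in the original program.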
Hello, I've never tried using threads before. This is my first attempt, but it doesn't stop. The normal version works.
If I remove awaitTermination it looks like it works, but I need the method to finish when it's all sorted out (pun intended XD).
Can you tell me what I did wrong?
Thank you.
public class Sorting {
    private Sorting() {};
    private static Random r = new Random();
    private static int cores = Runtime.getRuntime().availableProcessors();
    private static ExecutorService executor = Executors.newFixedThreadPool(cores);
    public static void qsortP(int[] a) {
        qsortParallelo(a, 0, a.length - 1);
    }
    private static void qsortParallelo(int[] a, int first, int last) {
        while (first < last) {
            int p = first + r.nextInt(last - first + 1);
            int px = a[p];
            int i = first, j = last;
            do {
                while (a[i] < px)
                    i++;
                while (a[j] > px)
                    j--;
                if (i <= j) {
                    scambia(a, i++, j--);
                }
            } while (i <= j);
            executor.submit(new qsortThread(a, first, j));
            first = i;
        }
        try {
            executor.awaitTermination(1, TimeUnit.DAYS);
        } catch (InterruptedException e) {
        }
    }
    private static void scambia(int[] a, int x, int y) {
        int temp = a[x];
        a[x] = a[y];
        a[y] = temp;
    }
    public static class qsortThread implements Runnable {
        final int a[], first, last;
        public qsortThread(int[] a, int first, int last) {
            this.a = a;
            this.first = first;
            this.last = last;
        }
        public void run() {
            qsortParallelo(a, first, last);
        }
    }
}
Instead of waiting for termination of the entire executor service (which probably isn't what you want at all), you should save all the Futures returned by executor.submit() and wait until they're all done (by calling get() on them, for example).
And though it's tempting to do this in the qsortParallelo() method, that would actually lead to a deadlock by exhaustion of the thread pool: parent tasks would hog the worker threads waiting for their child tasks to complete, but the child tasks would never be scheduled to run because there would be no available worker threads.
So you have to collect all the Future objects into a concurrent collection first, return the result to qsortP() and wait there for the Futures to finish.
Or use a ForkJoinPool, which was designed for exactly this kind of task and does all the donkey work for you. Recursively submitting tasks to an executor from application code is generally not a very good idea, it's very easy to get it wrong.
As an aside, the reason your code is deadlocked as it is is that every worker thread is stuck in executor.awaitTermination(), thereby preventing the termination of the executor service.
In general, the two most useful tools for designing and debugging multi-threaded applications are:
A thread dump. You can generate that with jstack, VisualVM or any other tool, but it's invaluable in deadlock situations, it gives you an accurate image of what's (not) going on with your threads.
A pen, a piece of paper and drawing a good old fashioned swimlane chart.
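The ForkJoinPool route suggested above could look roughly like this. It is a sketch, not a tuned implementation: it reuses the partition scheme from the question, and the class name ParallelQuickSort, the cutoff SEQUENTIAL_THRESHOLD and the decision to fall back to Arrays.sort for small ranges are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.ThreadLocalRandom;

class ParallelQuickSort extends RecursiveAction {
    private static final int SEQUENTIAL_THRESHOLD = 1 << 12; // arbitrary cutoff for this sketch
    private final int[] a;
    private final int first, last; // inclusive bounds

    ParallelQuickSort(int[] a, int first, int last) {
        this.a = a;
        this.first = first;
        this.last = last;
    }

    @Override
    protected void compute() {
        if (first >= last) {
            return; // zero or one element
        }
        if (last - first < SEQUENTIAL_THRESHOLD) {
            Arrays.sort(a, first, last + 1); // small range: sort it in the current thread
            return;
        }
        int px = a[ThreadLocalRandom.current().nextInt(first, last + 1)];
        int i = first, j = last;
        do {
            while (a[i] < px) i++;
            while (a[j] > px) j--;
            if (i <= j) {
                int temp = a[i];
                a[i] = a[j];
                a[j] = temp;
                i++;
                j--;
            }
        } while (i <= j);
        // Both halves become sub-tasks; invokeAll returns once both are sorted.
        invokeAll(new ParallelQuickSort(a, first, j), new ParallelQuickSort(a, i, last));
    }

    static void sort(int[] a) {
        ForkJoinPool pool = new ForkJoinPool(); // defaults to one worker per available processor
        try {
            pool.invoke(new ParallelQuickSort(a, 0, a.length - 1));
        } finally {
            pool.shutdown();
        }
    }
}
```

Unlike a fixed thread pool, a ForkJoinPool worker that waits in invokeAll keeps executing other queued sub-tasks, so parent tasks waiting for their children do not exhaust the pool.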
You are calling executor.awaitTermination inside a Thread which was launched by your executor. Thread will not stop until executor comes out of the awaitTermination and executor will not come out of awaitTermination until the Thread terminates. You need to move this code:
try {
    executor.awaitTermination(1, TimeUnit.DAYS);
} catch (InterruptedException e) {
}
to the end of the qsortP method.
The mistake in this code is simply the while-loop in qsortParallelo. first and last are never modified. Apart from that, you don't need the while-loop, since the further sorting already happens in the executor. And you'll need to start another task for the second half of the array.
I am writing a multithreaded parser.
Parser class is as follows.
public class Parser extends HTMLEditorKit.ParserCallback implements Runnable {
private static List<Item> itemList = Collections.synchronizedList(new ArrayList<Item>());
private boolean h2Tag = false;
private int count;
private static int threadCount = 0;
public static List<Item> parse() {
for (int i = 1; i <= 1000; i++) { //1000 of the same type of pages that need to parse
while (threadCount == 20) { //limit the number of simultaneous threads
try {
} catch (InterruptedException ex) {
Thread thread = new Thread(new Parser());
threadCount++; //increase the number of working threads
return itemList;
public void run() {
//Here is a piece of code responsible for creating links based on
//the thread name and passed as a parameter remained i,
//connection, start parsing, etc.
//In general, nothing special. Therefore, I won't paste it here.
threadCount--; //reduce the number of running threads when current stops
private static void addItem(Item item) {
//This method retrieves the necessary information after the H2 tag is detected
public void handleText(char[] data, int pos) {
if (h2Tag) {
String itemName = new String(data).trim();
//Item - the item on which we receive information from a Web page
Item item = new Item();
//Display information about an item in the console
System.out.println(count + " = " + itemName);
public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
if (HTML.Tag.H2 == t) {
h2Tag = true;
public void handleEndTag(HTML.Tag t, int pos) {
if (HTML.Tag.H2 == t) {
h2Tag = false;
From another class parser runs as follows:
List<Item> list = Parser.parse();
All is good, but there is a problem. At the end of parsing, the final list itemList contains 980 elements instead of 1000. But the console shows all 1000 elements (items). That is, some threads for some reason did not call the addItem method from the handleText method.
I already tried changing the type of itemList to ArrayList, CopyOnWriteArrayList and Vector. I made the addItem method synchronized and wrapped its call in a synchronized block. All this only changes the number of elements a little; the full thousand is never reached.
I also tried parsing a smaller number of pages (ten). As a result the list is empty, but the console shows all 10.
If I remove multi-threading, then everything works fine but, of course, slowly. That's not good.
If I decrease the number of concurrent threads, the number of items in the list gets close to the desired 1000; if I increase it, the number moves a little further from 1000. That is, I think there is contention for writing to the list. But then why is synchronization not working?
What's the problem?
After your parse() call returns, all of your 1000 threads have been started, but it is not guaranteed that they have finished. In fact, they haven't, and that's the problem you see. I would strongly recommend not writing this yourself but using the tools the SDK provides for this kind of job.
The Thread Pools documentation and ThreadPoolExecutor are good starting points. Again, don't implement this yourself unless you are absolutely sure you have to, because writing such multi-threaded code is pure pain.
Your code should look something like this:
ExecutorService executor = Executors.newFixedThreadPool(20);
List<Future<?>> futures = new ArrayList<Future<?>>(1000);
for (int i = 0; i < 1000; i++) {
    futures.add(executor.submit(new Runnable() {...}));
}
for (Future<?> f : futures) {
    f.get(); // blocks until that task has finished
}
There is no problem with the code; it works as you have coded it. The problem is with the last batch. All earlier iterations work properly, but during the last one, from 980 to 1000, the threads are created and the main thread does not wait for them to complete before returning the list. Therefore you will get some number between 980 and 1000 if you are working with 20 threads at a time.
You can try adding Thread.sleep(50) before returning the list. In that case the main thread will wait for some time, and by then the other threads might have finished processing.
Or you can use one of Java's synchronization APIs. Instead of Thread.sleep(), use a CountDownLatch; this lets you wait for the threads to complete their processing, and then you can create new threads.
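A small sketch of the CountDownLatch variant, assuming a fixed thread pool instead of hand-made threads. The actual page parsing is replaced by a stand-in that just adds the page number to the list; LatchDemo and collect are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LatchDemo {
    // Runs `tasks` jobs on `poolSize` threads and returns only after every job has finished.
    static List<Integer> collect(int tasks, int poolSize) {
        List<Integer> items = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(tasks);
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        for (int i = 1; i <= tasks; i++) {
            final int page = i;
            executor.execute(() -> {
                try {
                    items.add(page);     // stand-in for parsing page number `page`
                } finally {
                    done.countDown();    // count down even if the job throws
                }
            });
        }
        try {
            done.await();                // the caller blocks here until the count reaches zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return items;
    }
}
```

With LatchDemo.collect(1000, 20), the returned list holds all 1000 entries because the method does not return while any job is still running. The pool also replaces the manual threadCount bookkeeping, which was itself an unsynchronized shared counter.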
So this seems like a pretty common use case, and maybe I'm overthinking it, but I'm having an issue with keeping centralized metrics from multiple threads. Say I have multiple worker threads all processing records, and every 1000 records I want to spit out some metric. Now I could have each thread log individual metrics, but then to get throughput numbers I'd have to add them up manually (and of course time boundaries won't be exact). Here's a simple example:
public class Worker implements Runnable {
    private static int count = 0;
    private static long processingTime = 0;
    public void run() {
        while (true) {
            ...get record
            long start = System.currentTimeMillis();
            ...do work
            long end = System.currentTimeMillis();
            processingTime += (end-start);
            if (count % 1000 == 0) {
                ... log some metrics
                processingTime = 0;
                count = 0;
            }
        }
    }
}
Hope that makes some sense. Also I know the two static variables will probably be AtomicInteger and AtomicLong . . . but maybe not. Interested in what kinds of ideas people have. I had thought about using Atomic variables and using a ReentrantReadWriteLock, but I really don't want the metrics to stop the processing flow (i.e. the metrics should have very, very minimal impact on the processing). Thanks.
Offloading the actual processing to another thread can be a good idea. The idea is to encapsulate your data and hand it off to a processing thread quickly so you minimize impact on the threads that are doing meaningful work.
There is a small handoff contention, but that cost is usually so much smaller than any other type of synchronization that it should be a good candidate in many situations. I think M. Jessup's solution is pretty close to mine, but hopefully the following code illustrates the point clearly.
public class Worker implements Runnable {
    private static final Metrics metrics = new Metrics();
    public void run() {
        while (true) {
            ...get record
            long start = System.currentTimeMillis();
            ...do work
            long end = System.currentTimeMillis();
            // process the metric asynchronously
            metrics.addMetric(end - start);
        }
    }
    private static final class Metrics {
        // a single "background" thread that actually handles
        // processing
        private final ExecutorService metricThread =
            Executors.newSingleThreadExecutor();
        // data (no synchronization needed)
        private int count = 0;
        private long processingTime = 0;
        public void addMetric(final long time) {
            metricThread.execute(new Runnable() {
                public void run() {
                    processingTime += time;
                    if (count % 1000 == 0) {
                        ... log some metrics
                        processingTime = 0;
                        count = 0;
                    }
                }
            });
        }
    }
}
I would suggest that if you don't want the logging to interfere with the processing, you should have a separate log worker thread and have your processing threads simply provide some type of value object that can be handed off. In the example I chose a LinkedBlockingQueue since it has the ability to block for an insignificant amount of time using offer(), and you can defer the blocking to another thread that pulls the values from the queue. You might need additional logic in the MetricProcessor to order data, etc., depending on your requirements, but even if it is a long-running operation it won't keep the VM thread scheduler from restarting the real processing threads in the meantime.
public class Worker implements Runnable {
public void run() {
while (true) {
... do some stuff
if (count % 1000 == 0) {
... log some metrics
new WorkerMetrics(processingTime, count, ...)) {
processingTime = 0;
count = 0;
} else {
//the call would have blocked for a more significant
//amount of time, here the results
//could be abandoned or just held and attempted again
//as a larger data set later
public class WorkerMetrics {
...some interesting data
public WorkerMetrics(... data){
...getter setters etc
public class MetricProcessor implements Runnable {
    LinkedBlockingQueue<WorkerMetrics> metrics = new LinkedBlockingQueue<>();
    public boolean addMetrics(WorkerMetrics m) {
        return metrics.offer(m); //This may block, but not for a significant amount of time.
    }
    public void run() {
        while(true) {
            WorkerMetrics m = metrics.take(); //wait here for something to come in
            //the above call does all the significant blocking without
            //interrupting the real processing
            ...do some actual logging, aggregation, etc of the metrics
        }
    }
}
If you depend on the state of count and the state of processingTime being in sync, then you have to use a Lock. For example, if when ++count % 1000 == 0 is true you want to evaluate the metrics of processingTime at THAT time.
For that case, it would make sense to use a ReentrantLock. I wouldn't use a ReentrantReadWriteLock because there isn't really an instance where a pure read is occurring. It is always a read/write set. But you would need to lock around all of
processingTime += (end-start);
if (count % 1000 == 0) {
    ... log some metrics
    processingTime = 0;
    count = 0;
}
Whether or not count++ is going to be at that location, you will need to lock around that also.
Finally if you are using a Lock, you do not need an AtomicLong and AtomicInteger. It just adds to the overhead and isn't more thread-safe.
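To make the locking advice concrete, here is a minimal sketch in which a single ReentrantLock guards both fields, so the batch total is read at exactly the moment the batch completes. The class LockedMetrics, the method record and the parameter batchSize are invented for this example, and returning the batch total stands in for "log some metrics".

```java
import java.util.concurrent.locks.ReentrantLock;

class LockedMetrics {
    private final ReentrantLock lock = new ReentrantLock();
    private final int batchSize;
    private int count = 0;            // plain fields: the lock already guards them
    private long processingTime = 0;

    LockedMetrics(int batchSize) {
        this.batchSize = batchSize;
    }

    // Adds one measurement. Returns the summed time of the batch it completes, or -1 otherwise.
    long record(long elapsedMillis) {
        lock.lock();
        try {
            processingTime += elapsedMillis;
            if (++count % batchSize == 0) {
                long batchTotal = processingTime; // consistent with count because both are under the lock
                processingTime = 0;
                count = 0;
                return batchTotal;
            }
            return -1;
        } finally {
            lock.unlock();
        }
    }
}
```

A worker would call record(end - start) after each record and log whenever the result is not -1. The critical section covers only a few arithmetic operations, so the actual logging can happen outside the lock.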