ArrayBlockingQueue - queue reports full about once a month - Java

I have a producer-consumer setup for processing uploaded images, with:
imageQueueSize = 150
imageConsumersNumber = 10
Below is the queue initialization:
ArrayBlockingQueue<Context> imageQueue = new ArrayBlockingQueue<Context>(imageQueueSize);
for (int i = 0; i < imageConsumersNumber; i++) {
    new Thread(new QueueProcessor(imageQueue, executorsFacade)).start();
}
Below is how it is used:
if (queue.offer(context, queueOfferTimeOut, TimeUnit.SECONDS)) {
    // debug info ...
    return true;
} else {
    log.info("queue {} is full, contentData {} will not be processed", mimeFamilyType,
            context);
    return false;
}
Below is the run() method implementation:
try {
    while (true) {
        Context context = queue.take();
        Executor executor = executorsFacade.getExecutor(context.getContentData()
                .getMimeFamilyType());
        executor.execute(context);
    }
} catch (Exception e) {
    log.error("Exception in run method", e);
}
About once a month, I see only "queue IMAGE is full" in the logs.
Does anyone know why this is happening?
It seems that the take() method stops being executed, so the queue is never drained.
I can see in the logs that it successfully processed 10K+ requests, but once a month it gets stuck.
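One detail worth noting in the run() implementation above: the try/catch sits outside the while (true) loop, so a single runtime exception from executor.execute() (a RejectedExecutionException, for instance) would terminate that consumer thread for good, and once all ten consumers die this way nothing drains the queue and every offer() times out. A minimal sketch of a consumer loop that survives per-item failures (same names as above, logging assumed):
public void run() {
    while (true) {
        try {
            Context context = queue.take();
            Executor executor = executorsFacade.getExecutor(
                    context.getContentData().getMimeFamilyType());
            executor.execute(context);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and exit cleanly
            break;
        } catch (Exception e) {
            // one bad item must not kill the consumer thread
            log.error("Exception processing one item; consumer keeps running", e);
        }
    }
}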

Related

Executor framework to process 1 million records

I had a requirement to process a file containing 1 million records and save them in a Redis cache. I was supposed to use a Redis pipeline, but I didn't find any information on it; I asked about that separately in an earlier question.
So I decided to use multithreading via the executor framework. I am new to multithreading.
Here is my code:
@Async
public void createSubscribersAsync(Subscription subscription, MultipartFile file) throws EntityNotFoundException, InterruptedException, ExecutionException, TimeoutException {
    ExecutorService executorService = Executors.newFixedThreadPool(8);
    Collection<Callable<String>> callables = new ArrayList<>();
    List<Subscriber> cache = new ArrayList<>();
    int batchSize = defaultBatchSize.intValue();
    while ((line = br.readLine()) != null) {
        try {
            Subscriber subscriber = createSubscriber(subscription, line);
            cache.add(subscriber);
            if (cache.size() >= batchSize) {
                IntStream.rangeClosed(1, 8).forEach(i -> {
                    callables.add(createCallable(cache, subscription.getSubscriptionId()));
                });
            }
        } catch (InvalidSubscriberDataException e) {
            invalidRows.add(line + ":" + e.getMessage());
            invalidCount++;
        }
    }
    List<Future<String>> taskFutureList = executorService.invokeAll(callables);
    for (Future<String> future : taskFutureList) {
        String value = future.get(4, TimeUnit.SECONDS);
        System.out.println(String.format("TaskFuture returned value %s", value));
    }
}
private Callable<String> createCallable(List<Subscriber> cache, String subscriptionId) {
    return new Callable<String>() {
        public String call() throws Exception {
            System.out.println(String.format("starting expensive task thread %s", Thread.currentThread().getName()));
            processSubscribers(cache, subscriptionId);
            System.out.println(String.format("finished expensive task thread %s", Thread.currentThread().getName()));
            return "Finish Thread:" + Thread.currentThread().getName();
        }
    };
}
private void processSubscribers(List<Subscriber> cache, String subscriptionId) {
    subscriberRedisRepository.saveAll(cache);
    cache.clear();
}
The idea here is that I want to split the file into batches and save each batch using a thread, so I created a pool of 8 threads.
Is this a correct way to use the executor framework? If not, could you please help me out? Appreciate the help.
Quick modifications to your current code to achieve what you are asking:
In your while loop, once the current cache reaches the batch size, create a callable that takes the current cache, then reset the cache: create a new list and assign it as cache.
You are creating a list of callables and submitting them all as one batch; why not submit each callable right after creating it? That way, already-read records start being written to Redis while your main thread continues reading the file.
List<Future<String>> taskFutureList = new LinkedList<Future<String>>();
while ((line = br.readLine()) != null) {
    try {
        Subscriber subscriber = createSubscriber(subscription, line);
        cache.add(subscriber);
        if (cache.size() >= batchSize) {
            taskFutureList.add(executorService.submit(createCallable(cache, subscription.getSubscriptionId())));
            cache = new ArrayList<>(); // the submitted callable keeps the old list; start a fresh batch
        }
    } catch (InvalidSubscriberDataException e) {
        invalidRows.add(line + ":" + e.getMessage());
        invalidCount++;
    }
}
// submit the last batch, which can be smaller than batchSize
if (!cache.isEmpty()) {
    taskFutureList.add(executorService.submit(createCallable(cache, subscription.getSubscriptionId())));
}
You do not have to store a separate list of callables.
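After the loop you would still drain the futures and shut the pool down, reusing the code you already have. A sketch:
// Wait for every submitted batch, then release the pool threads.
for (Future<String> future : taskFutureList) {
    String value = future.get(4, TimeUnit.SECONDS); // may need a larger timeout for big batches
    System.out.println(String.format("TaskFuture returned value %s", value));
}
executorService.shutdown();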

Process large text file concurrently

So I have a large text file, in this case roughly 4.5 GB, and I need to process the entire file as fast as possible. Right now I have multi-threaded this using 3 threads (not including the main thread): an input thread for reading the input file, a processing thread to process the data, and an output thread to write the processed data to a file.
Currently, the bottleneck is the processing section. Therefore, I'd like to add more processing threads into the mix. However, this creates a situation where I've got multiple threads accessing the same BlockingQueue, and their results are therefore not maintaining the order of the input file.
An example of the functionality I'm looking for would be something like this:
Input file: 1, 2, 3, 4, 5
Output file: ^ the same. Not 2, 1, 4, 3, 5 or any other combination.
I've written a dummy program that is identical in functionality to the actual program minus the processing part (I can't share the actual program because the processing class contains confidential information). I should also mention that all of the classes (Input, Processing, and Output) are inner classes of a Main class that contains the initialise() method and the class-level variables mentioned in the main-thread code listed below.
Main thread:
static volatile boolean readerFinished = false; // class level variables
static volatile boolean writerFinished = false;

private void initialise() throws IOException {
    BlockingQueue<String> inputQueue = new LinkedBlockingQueue<>(1_000_000);
    BlockingQueue<String> outputQueue = new LinkedBlockingQueue<>(1_000_000); // capacity 1 million
    String inputFileName = "test.txt";
    String outputFileName = "outputTest.txt";
    BufferedReader reader = new BufferedReader(new FileReader(inputFileName));
    BufferedWriter writer = new BufferedWriter(new FileWriter(outputFileName));
    Thread T1 = new Thread(new Input(reader, inputQueue));
    Thread T2 = new Thread(new Processing(inputQueue, outputQueue));
    Thread T3 = new Thread(new Output(writer, outputQueue));
    T1.start();
    T2.start();
    T3.start();
    while (!writerFinished) {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
    reader.close();
    writer.close();
    System.out.println("Exited.");
}
Input thread: (Please forgive the commented-out debug code; I was using it to ensure the reader thread was actually executing properly.)
class Input implements Runnable {
    BufferedReader reader;
    BlockingQueue<String> inputQueue;

    Input(BufferedReader reader, BlockingQueue<String> inputQueue) {
        this.reader = reader;
        this.inputQueue = inputQueue;
    }

    @Override
    public void run() {
        String poisonPill = "ChH92PU2KYkZUBR";
        String line;
        //int linesRead = 0;
        try {
            while ((line = reader.readLine()) != null) {
                inputQueue.put(line);
                //linesRead++;
                /*
                if (linesRead == 500_000) {
                    //batchesRead += 1;
                    //System.out.println("Batch read");
                    linesRead = 0;
                }
                */
            }
            inputQueue.put(poisonPill);
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
        readerFinished = true;
    }
}
Processing thread: (Normally this would actually do something to the line, but for the purposes of the mockup it just pushes straight to the output queue. If necessary, we can simulate work by making the thread sleep for a small amount of time per line.)
class Processing implements Runnable {
    BlockingQueue<String> inputQueue;
    BlockingQueue<String> outputQueue;

    Processing(BlockingQueue<String> inputQueue, BlockingQueue<String> outputQueue) {
        this.inputQueue = inputQueue;
        this.outputQueue = outputQueue;
    }

    @Override
    public void run() {
        while (true) {
            try {
                if (inputQueue.isEmpty() && readerFinished) {
                    break;
                }
                String line = inputQueue.take();
                outputQueue.put(line);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
Output thread:
class Output implements Runnable {
    BufferedWriter writer;
    BlockingQueue<String> outputQueue;

    Output(BufferedWriter writer, BlockingQueue<String> outputQueue) {
        this.writer = writer;
        this.outputQueue = outputQueue;
    }

    @Override
    public void run() {
        String line;
        ArrayList<String> outputList = new ArrayList<>();
        while (true) {
            try {
                line = outputQueue.take();
                if (line.equals("ChH92PU2KYkZUBR")) {
                    for (String outputLine : outputList) {
                        writer.write(outputLine);
                    }
                    System.out.println("Writer finished - executing termination");
                    writerFinished = true;
                    break;
                }
                line += "\n";
                outputList.add(line);
                if (outputList.size() == 500_000) {
                    for (String outputLine : outputList) {
                        writer.write(outputLine);
                    }
                    System.out.println("Writer wrote batch");
                    outputList = new ArrayList<>();
                }
            } catch (IOException | InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
So right now the general data flow is very linear, looking something like this:
Input > Processing > Output.
But what I'd like to have is something more like Input > multiple Processing threads in parallel > Output.
The catch is that when the data gets to Output, it either needs to be sorted back into the correct order, or it needs to already be in the correct order.
Recommendations or examples on how to go about this would be greatly appreciated.
In the past I have used the Future and Callable interfaces to solve a task involving parallel data flows like this, but unfortunately that code was not reading from a single queue, and so is of minimal help here.
I should also add, for those of you who will notice, that batchSize and poisonPill are normally defined in the main thread and passed around via variables; they are not usually hard-coded as they are in the Input thread code and the writer-thread check. I was just a wee bit lazy when writing the mockup for experimentation at ~1am.
Edit: I should also mention that this is required to run on Java 8 at most. Java 9 features and above cannot be used, because those versions are not installed in the environments where this program will run.
What you could do:
Take X threads for processing, where X is the number of cores available for processing.
Give each thread its own input queue.
The reader thread hands records to each thread's input queue round-robin in a predictable fashion.
Since the output is too big to hold in memory, you write X output files, one per thread, with the thread's index in each file name, so that you can reconstitute the original order from the file names.
After processing completes, you merge the X output files: one line from thread 1's file, one from thread 2's, and so on, round-robin again. This reconstitutes the original order.
As an added bonus, since you have an input queue per thread, you have no lock contention between consumers on a shared queue (only between the reader and each worker). You could even optimize this by putting things into the input queues in batches larger than 1; a sketch of the hand-out follows below.
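A minimal sketch of the round-robin hand-out (names are hypothetical; the enclosing method is assumed to declare throws IOException, InterruptedException):
// One bounded queue per worker: worker i receives lines i, i+X, i+2X, ...
// and writes them to its own output file, so a round-robin merge of the
// X files restores the original order.
int X = Runtime.getRuntime().availableProcessors();
List<BlockingQueue<String>> queues = new ArrayList<>(X);
for (int i = 0; i < X; i++) {
    queues.add(new LinkedBlockingQueue<>(100_000));
}
String line;
long n = 0;
while ((line = reader.readLine()) != null) {
    queues.get((int) (n++ % X)).put(line);
}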
As was also proposed by Alexei, you can create OrderedTask:
class OrderedTask implements Comparable<OrderedTask> {
    private final Integer index;
    private final String line;

    public OrderedTask(Integer index, String line) {
        this.index = index;
        this.line = line;
    }

    @Override
    public int compareTo(OrderedTask o) {
        // compare the int values; == on Integer boxes is only reliable for small cached values
        return Integer.compare(index, o.getIndex());
    }

    public Integer getIndex() {
        return index;
    }

    public String getLine() {
        return line;
    }
}
As an output queue, you can use your own queue backed by a PriorityQueue:
class OrderedTaskQueue {
    private final ReentrantLock lock;
    private final Condition waitForOrderedItem;
    private final int maxQueueSize;
    private final PriorityQueue<OrderedTask> backedQueue;
    private int expectedIndex;

    public OrderedTaskQueue(int maxQueueSize, int startIndex) {
        this.maxQueueSize = maxQueueSize;
        this.expectedIndex = startIndex;
        this.backedQueue = new PriorityQueue<>(2 * this.maxQueueSize);
        this.lock = new ReentrantLock();
        this.waitForOrderedItem = this.lock.newCondition();
    }

    public boolean put(OrderedTask item) {
        ReentrantLock lock = this.lock;
        lock.lock();
        try {
            // the item everyone is waiting for bypasses the size limit
            while (this.backedQueue.size() >= maxQueueSize && item.getIndex() != expectedIndex) {
                this.waitForOrderedItem.await();
            }
            boolean result = this.backedQueue.add(item);
            this.waitForOrderedItem.signalAll();
            return result;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        } finally {
            lock.unlock();
        }
    }

    public OrderedTask take() {
        ReentrantLock lock = this.lock;
        lock.lock();
        try {
            while (this.backedQueue.peek() == null || this.backedQueue.peek().getIndex() != expectedIndex) {
                this.waitForOrderedItem.await();
            }
            OrderedTask result = this.backedQueue.poll();
            expectedIndex++;
            this.waitForOrderedItem.signalAll();
            return result;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        } finally {
            lock.unlock();
        }
    }
}
startIndex is the index of the first ordered task, and
maxQueueSize is used to pause the processing of further tasks (so memory does not fill up) while we wait for some earlier task to finish. It should be double or triple the number of processing threads, so that processing is not stopped immediately and can scale.
Then you should create your tasks:
int indexOrder = 0;
while ((line = reader.readLine()) != null) {
    inputQueue.put(new OrderedTask(indexOrder++, line));
}
Line-by-line is used here only because of your example; you should change OrderedTask to support a batch of lines.
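For completeness, a sketch of how the pieces could be wired together (process() is a stand-in for the confidential processing step; interrupt and poison-pill handling omitted):
// Worker thread: arrival order at the output queue does not matter,
// because OrderedTaskQueue.take() re-sequences by index.
while (true) {
    OrderedTask task = inputQueue.take(); // plain BlockingQueue<OrderedTask>
    String processed = process(task.getLine());
    orderedOutputQueue.put(new OrderedTask(task.getIndex(), processed));
}

// Writer thread: tasks come out strictly in index order.
while (true) {
    OrderedTask next = orderedOutputQueue.take();
    writer.write(next.getLine() + "\n");
}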
Why not reverse the flow?
The output side asks for X batches.
Generate X promises/tasks (promise pattern), each of which calls one of the processing cores (keeping a batch number to pass through to the input side), and collect the call handlers into an ordered list.
Each processing core, in turn, asks the input side for a batch.
Enjoy?
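A rough sketch of that idea with CompletableFuture (Java 8), where readNextBatch() and processBatch() are hypothetical helpers; because the futures are stored in submission order, writing them out in list order restores the input order (I/O exception handling omitted):
ExecutorService pool = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors());
List<CompletableFuture<List<String>>> pending = new ArrayList<>();
// Batches are read on this thread, in file order, and handed to the pool.
for (List<String> batch = readNextBatch(reader); !batch.isEmpty(); batch = readNextBatch(reader)) {
    final List<String> b = batch; // effectively final copy for the lambda
    pending.add(CompletableFuture.supplyAsync(() -> processBatch(b), pool));
}
// join() in submission order: output order == input order.
for (CompletableFuture<List<String>> f : pending) {
    for (String out : f.join()) {
        writer.write(out);
    }
}
pool.shutdown();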

One Producer ten consumers file-processing with Executors.newSingleThreadExecutor()

I have a LinkedBlockingQueue with an arbitrarily picked capacity of 10, and an input file with 1000 lines. In the main method of the service class I have one ExecutorService-type variable that, to my knowledge, first handles a single thread (via Executors.newSingleThreadExecutor()) that calls buffer.readLine() until the line == null, and then, within a loop that also uses Executors.newSingleThreadExecutor(), handles ten threads that process lines and write them to output files until !queue.take().equals("Stop"). However, after some lines are written to files, when I am in debug mode I see that the queue eventually reaches its maximum capacity (10) and the processing threads never execute queue.take(). All threads are in the running state, but the process halts after queue.put(). What would cause this problem, and is it solvable using some combination of thread pooling or multiple ExecutorService handler variables instead of a single variable?
Outline for current state of main method in service:
// app settings to get values for keys within a properties file
AppSettings appSettings = new AppSettings();
BlockingQueue<String> queue = new LinkedBlockingQueue<String>(10);
maxProdThreads = 1;
maxConsThreads = 10;
ExecutorService execSvc = null;
for (int i = 0; i < maxProdThreads; i++) {
    execSvc = Executors.newSingleThreadExecutor();
    execSvc.submit(new ReadJSONMessage(appSettings, queue));
}
for (int i = 0; i < maxConsThreads; i++) {
    execSvc = Executors.newSingleThreadExecutor();
    execSvc.submit(new ProcessJSONMessage(appSettings, queue));
}
Reading method code:
buffer = new BufferedReader(new FileReader(inputFilePath));
while ((line = buffer.readLine()) != null) {
    line = line.trim();
    queue.put(line);
}
Processing and Writing code:
while (!(line = queue.take()).equals("Stop")) {
    if (line.length() > 10) {
        try {
            if (processMessage(line, outputFilePath) == true) {
                ++count;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

public boolean processMessage(String line, String outputFilePath) {
    CustomObject cO = new CustomObject();
    cO.setText(line);
    writeOutputAToFile(cO, ...);
    writeOutputBToFile(cO, ...);
    return true;
}

public void writeOutputAToFile(CustomObject cO, ...) {
    synchronized (cO) {
        ...
        org.apache.commons.io.FileUtils.writeStringToFile(...)
    }
}

public void writeOutputBToFile(CustomObject cO, ...) {
    synchronized (cO) {
        ...
        org.apache.commons.io.FileUtils.writeStringToFile(...)
    }
}
In the processing and writing code, ensure that all resources are closed properly. If a resource is not closed properly, a thread can keep running and the ExecutorService cannot find an idle thread.
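Independent of resource handling, one thing the outline above never shows is who enqueues "Stop": the consumer loop exits only when it takes that literal string, but the reading method never puts it, so all ten consumers can end up blocked in queue.take() once the file is exhausted. A sketch of one common fix, assuming one pill per consumer thread (names reused from the question):
// Reader: after draining the file, enqueue one "Stop" pill per consumer
// so each of the ten processing threads eventually takes one and exits.
buffer = new BufferedReader(new FileReader(inputFilePath));
while ((line = buffer.readLine()) != null) {
    queue.put(line.trim());
}
for (int i = 0; i < maxConsThreads; i++) {
    queue.put("Stop");
}
buffer.close();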

Monitoring Thread Execution

I have the code below, where I first create an on-demand DB connection and then share it with multiple threads. This works fine. Now I would like to know whether it is possible to track when all the threads using this database connection have finished executing, so that I can close the connection. Any guidance on how to achieve this would be helpful.
Connection connection = null;
Properties connectionProperties = getProperties();
for (int reqIdx = 0; reqIdx < requests.length(); reqIdx++) {
    connection = DBonnection.getConnection(connectionProperties);
    ConnectorRunner connectorRunner = null;
    try {
        connectorRunner = new ConnectorRunner(someConnector);
        connectorRunner.setDBConnection(connection);
    } catch (Exception e) {
        e.printStackTrace();
    }
    executorService.execute(connectorRunner);
}
The easiest way is to use a CountDownLatch from the standard JDK facilities. In your main thread, do:
CountDownLatch doneSignal = new CountDownLatch(requests.length());
for (Request req : requests) {
    ConnectorRunner connectorRunner = new ConnectorRunner(doneSignal);
    connectorRunner.setConnection(DBonnection.getConnection());
    executorService.execute(connectorRunner);
}
doneSignal.await();
DBonnection.dispose();
The ConnectorRunner must simply call doneSignal.countDown() when it's done.
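For example, a sketch of the run() method inside ConnectorRunner (the actual work is assumed):
@Override
public void run() {
    try {
        // ... do the work that uses the shared connection ...
    } finally {
        // count down even if the task fails; otherwise doneSignal.await()
        // in the main thread would wait forever
        doneSignal.countDown();
    }
}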
Apart from the above comments, if you are looking to do something when all your threads have finished, you can take either of the approaches below.
ExecutorService es = Executors.newCachedThreadPool();
for (int i = 0; i < 5; i++)
    es.execute(new Runnable() { /* your task */ });
es.shutdown();
boolean finished = es.awaitTermination(1, TimeUnit.MINUTES);
// all tasks have finished or the timeout has been reached
OR
for (Thread thread : threads) {
    thread.join();
}
Please note that the second approach will block the current thread.

Issue with wait - notify implementation

I am working on Java multithreading, where I start 4 threads after assigning 4 different files to them to be uploaded to the server.
My objective is that when one thread completes its file upload, I start another thread, assigning a new file to it.
After each file upload, I receive a notification from the server.
// The code for adding the first set of files
for (int count = 0; count < 4; count++) {
    if (it.hasNext()) {
        File current = new File((String) it.next());
        try {
            Thread t = new Thread(this, current);
            t.start();
            t.sleep(100);
        } catch (Exception e) {
        }
    }
}
Now I assign a file to another thread and keep that thread in a wait state.
When a previous thread notifies, the current thread should start its upload.
if (tempThreadCounter == 4) {
    if (it.hasNext()) {
        File current = new File((String) it.next());
        try {
            Thread t = new Thread(this, current);
            t.start();
            synchronized (this) {
                t.wait();
            }
            tempThreadCounter++;
        } catch (Exception e) {
        }
    }
}
In the final statement of the run method, I add the following:
public void run() {
    // Performing different operations
    // Final statement of the run method below
    synchronized (this) {
        this.notifyAll();
    }
}
Currently, all 5 threads start uploading at the same time.
What should happen is that the first 4 threads start uploading, and the fifth thread starts only when it is notified by a thread that has completed its operation.
Any suggestions on what is incorrect in this Thread implementation?
You can use an ExecutorService created with newFixedThreadPool, specifying a concurrency of 1. But really, then why do you need multiple threads? One thread doing all the uploads, so that the user interface remains responsive, should be enough.
ExecutorService exec = Executors.newFixedThreadPool(1); // 1 thread at a time
for (int count = 0; count < 4; count++) {
    if (it.hasNext()) {
        File current = new File((String) it.next());
        exec.execute(new Runnable() {
            @Override
            public void run() {
                upload(current);
            }
        });
    }
}
exec.shutdown();
exec.awaitTermination(900, TimeUnit.SECONDS);
Throw it all away and use java.util.concurrent.Executor.
You can join on the thread instead of waiting on it:
try {
    t.join();
} catch (InterruptedException e) {
    throw new RuntimeException(e);
}
tempThreadCounter++;
