LinkedBlockingQueue limit ignored? - java

I created a Java LinkedBlockingQueue with new LinkedBlockingQueue(1) to limit the size of the queue to 1. However, in my testing, this limit seems to be ignored and there are often several items in the queue at any given time. Why is this?

How did you check the number of entries in the queue? If you call size(), it should always return 0 or 1.
When the queue reaches its capacity, the put() call simply blocks. When you have very short tasks, this may give you the illusion that multiple things are in the queue.

LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<String>(5);
queue.add("ddd");
queue.size(); // = 1
queue.remainingCapacity(); // = 4
// capacity = size() + remainingCapacity() = 5 (there is no public count field)
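The capacity limit can be checked directly with the non-blocking insert methods. The following is a minimal sketch, not part of the original answers; the class name BoundedQueueDemo and the method describe() are made up for illustration. It shows that a queue created with capacity 1 refuses a second element: offer() returns false and add() throws IllegalStateException, while put() would block instead.

```java
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class; only the LinkedBlockingQueue calls are JDK API.
class BoundedQueueDemo {
    static String describe() {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
        boolean first = queue.offer("a");   // accepted: the queue was empty
        boolean second = queue.offer("b");  // rejected: the queue is full, returns false
        boolean addThrew = false;
        try {
            queue.add("c");                 // add() throws instead of returning false
        } catch (IllegalStateException e) {
            addThrew = true;
        }
        return "first=" + first + " second=" + second + " addThrew=" + addThrew
                + " size=" + queue.size() + " remaining=" + queue.remainingCapacity();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If size() ever reports more than the capacity, the queue being inspected is probably not the one that was constructed with the limit.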

Related

Concurrent in-order processing of work items from a Java BlockingQueue

I have part of a system that processes a BlockingQueue of input items within a worker thread, and puts the results on a BlockingQueue of output items, where the relevant code (simplified) looks something like this:
while (running()) {
InputObject a=inputQueue.take(); // Get from input BlockingQueue
OutputObject b=doProcessing(a); // Process the item
outputQueue.put(b); // Place on output BlockingQueue
}
doProcessing is the main performance bottleneck in this code, but the processing of queue items could be parallelised since the processing steps are all independent of each other.
I would therefore like to improve this so that items can be processed concurrently by multiple threads, with the constraint that this must not change the order of outputs (e.g. I can't simply have 10 threads running the loop above, because that might result in outputs being ordered differently depending on processing times).
What is the best way to achieve this in pure, idiomatic Java?
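Parallel streams created from a List preserve encounter order when collected:
List<InputObject> input = ...
List<OutputObject> output = input.parallelStream()
    .filter(item -> running()) // running() takes no argument, so a lambda is needed
    .map(this::doProcessing)
    .collect(Collectors.toList());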
PriorityBlockingQueue can be used if your work items can be compared to one another, and you will wait until running() is false before reading from the output queue:
outputQueue = new PriorityBlockingQueue<>();
Or you could order them after they have all been processed (if they can be compared to one another):
outputQueue.drainTo(outputList);
outputList.sort(null);
A simple way to implement the comparison would be to assign a sequential ID to each element put into the input queue.
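The sequential-ID idea can be sketched as follows. This is only an illustration under stated assumptions: the wrapper class Sequenced, the method processInOrder, and the upper-casing stand-in for doProcessing are all hypothetical, and the sketch waits for every task to finish before restoring the order, as the answer describes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class SequencedProcessing {
    // Hypothetical wrapper pairing a result with the position of its input.
    static final class Sequenced {
        final long id;
        final String value;
        Sequenced(long id, String value) { this.id = id; this.value = value; }
    }

    // Stand-in for the question's doProcessing (assumption: any pure function).
    static String doProcessing(String in) { return in.toUpperCase(); }

    static List<String> processInOrder(List<String> inputs, int threads) {
        BlockingQueue<Sequenced> outputQueue = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long nextId = 0;
        for (String in : inputs) {
            final long id = nextId++;  // ID assigned in input order
            pool.execute(() -> outputQueue.add(new Sequenced(id, doProcessing(in))));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("processing timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Results arrive in completion order; sorting by ID restores input order.
        List<Sequenced> results = new ArrayList<>();
        outputQueue.drainTo(results);
        results.sort(Comparator.comparingLong(s -> s.id));
        List<String> ordered = new ArrayList<>();
        for (Sequenced s : results) ordered.add(s.value);
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(processInOrder(List.of("a", "b", "c", "d"), 3));
    }
}
```

The drawback of this approach is that no output is available until all items are done, so it suits batch processing better than a continuous stream.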
Create X event-loop threads, where X is the number of steps that can be processed in parallel.
The steps run in parallel, but never on the same item: while one step is being carried out on one item, the previous step is already being carried out on the next item, and so on.
To further optimize it, you can use concurrent queues provided by JCTools, which are optimized for Single-Producer Single-Consumer scenarios (JDK's BlockingQueue implementations support Multiple-Producer Multiple-Consumer).
// Thread 1
while (running()) {
InputObject a = inputQueue.take();
OutputObject b = doProcessingStep1(a);
queue1.put(b);
}
// Thread 2
while (running()) {
InputObject a = queue1.take();
OutputObject b = doProcessingStep2(a);
queue2.put(b);
}
// Thread 3
while (running()) {
InputObject a = queue2.take();
OutputObject b = doProcessingStep3(a);
outputQueue.put(b);
}

Strange behaviour arrayBlockingQueue with array elements

I am seeing some strange behavior with an ArrayBlockingQueue which I use to communicate between certain threads in a Java application.
I am using 1 static ArrayBlockingQueue as initialised like this:
protected static BlockingQueue<long[]> commandQueue;
Followed by the constructor which has this as one of its lines:
commandQueue = new ArrayBlockingQueue<long[]>(amountOfThreads*4);
Where amountOfThreads is given as a constructor argument.
I then have a producer that creates a long[2] array, gives it some values, and offers it to the queue. Directly afterwards I change one of the values in the array and offer it to the queue again:
long[] temp = new long[2];
temp[0] = currentThread().getId();
temp[1] = gyrAddress;//Address of an i2c sensor
CommunicationThread.commandQueue.offer(temp);//CommunicationThread is where the commandqueue is located
temp[1] = axlAddress;//Change the address to a different sensor
CommunicationThread.commandQueue.offer(temp);
The consumer will then take this data and open up an i2c connection to a specific sensor, get some data from said sensor and communicate the data back using another queue.
For now however I have set the consumer to just consume the head and print the data.
long[] command = commandQueue.take();//This will hold the program until there is at least 1 command in the queue
if (command.length!=2){
throw new ArrayIndexOutOfBoundsException("The command given is of incorrect format");
}else{
System.out.println("The thread with thread id " + command[0] + " has given the command to get data from address " +Long.toHexString(command[1]));
}
Now for testing I have a producer thread with these addresses (byte) 0x34, (byte)0x44
If things are going correctly my output should be:
The thread with thread id 14 has given the command to get data from address 44
The thread with thread id 14 has given the command to get data from address 34
However I get:
The thread with thread id 14 has given the command to get data from address 34
The thread with thread id 14 has given the command to get data from address 34
Which would mean that it is sending the temp array after it has changed it.
Things that I did to try and fix it:
I tried a sleep, if I added a 150 ms sleep then the response is correct.
However this method will quite obviously affect performance...
Since the offer method returns a true I tried the following piece of code
boolean tempBool = false;
while(!tempBool){
tempBool = CommunicationThread.commandQueue.offer(temp);
System.out.println(tempBool);
}
Which prints out true. This did not have an effect.
I tried printing temp[1] after this while loop and at that moment it is the correct value. (It prints out 44, however the consumer receives 34.)
Most likely this is a synchronisation issue, however I thought that the point of a BlockingQueue based object would be to solve this.
Any help or suggestion on the workings of this BlockingQueue would be greatly appreciated. Let me end on a note that this is my first time working with queues between threads in Java and that the final program will be running on a Raspberry Pi using the pi4j library to communicate with the sensors.
Since you asked about how BlockingQueue works exactly, let's start with that:
A blocking queue is a queue that blocks when you try to dequeue from it while the queue is empty, or when you try to enqueue items to it while the queue is already full. A thread trying to dequeue from an empty queue is blocked until some other thread inserts an item into the queue.
So blocking queues prevent different threads from reading from or writing to a queue while that is not yet possible because it is either empty or full.
As Andy Turner and JB Nizet already explained, the queue stores a reference (a.k.a. a pointer) to your array, not a copy of it. Both offer() calls therefore enqueue the same array object, and when you change temp[1] after the first offer, you change the contents of the element that is already in the queue. If the consumer has not read that element yet, it sees the new value. In non-threaded code this usually goes unnoticed because statements execute strictly in order, so a value is read before it is changed. A way to circumvent this is to create a new array (which is allocated in new memory) every time you add an entry to the queue; this way you make sure you do not overwrite data before it has been processed by the other thread. A simple way to do this is:
long[] tempGyr = new long[2];
tempGyr[0] = currentThread().getId();
tempGyr[1] = gyrAddress;
CommunicationThread.commandQueue.offer(tempGyr);//CommunicationThread is where the commandqueue is located
long[] tempAxl = new long[2];
tempAxl[0] = currentThread().getId();
tempAxl[1] = axlAddress;
CommunicationThread.commandQueue.offer(tempAxl);
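The effect can be reproduced without any threads. The sketch below is illustrative only: the class and method names are invented, and the values 14, 0x44 and 0x34 are borrowed from the output shown in the question. It compares offering the same array twice with offering a copy made by clone().

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class showing that a queue holds references, not copies.
class QueueAliasingDemo {
    // Offers the same array object twice, changing it in between.
    static long[] sharedArray() {
        BlockingQueue<long[]> queue = new ArrayBlockingQueue<>(4);
        long[] temp = { 14, 0x44 };
        queue.offer(temp);
        temp[1] = 0x34;               // also changes the element already queued
        queue.offer(temp);
        return new long[] { queue.poll()[1], queue.poll()[1] };
    }

    // Offers an independent copy each time, so later changes are not visible.
    static long[] copiedArray() {
        BlockingQueue<long[]> queue = new ArrayBlockingQueue<>(4);
        long[] temp = { 14, 0x44 };
        queue.offer(temp.clone());
        temp[1] = 0x34;
        queue.offer(temp.clone());
        return new long[] { queue.poll()[1], queue.poll()[1] };
    }

    public static void main(String[] args) {
        long[] shared = sharedArray();
        long[] copied = copiedArray();
        System.out.println(Long.toHexString(shared[0]) + " " + Long.toHexString(shared[1]));
        System.out.println(Long.toHexString(copied[0]) + " " + Long.toHexString(copied[1]));
    }
}
```

With the shared array both queued elements read 34; with copies they read 44 and then 34. The 150 ms sleep in the question only appeared to help because it gave the consumer time to read the first element before it was changed.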
Hope this explains the subject. If not, feel free to ask additional questions :)

Multithread - OutOfMemory

I am using a ThreadPoolExecutor with 5 active threads; the number of tasks is huge: 20,000.
The queue is filled up (pool.execute(new WorkingThreadTask())) with instances of Runnable tasks almost immediately.
Each WorkingThreadTask has a HashMap:
Map<Integer, HashMap<Integer, String>> themap ;
Each map can have up to 2000 items, and each sub-map has 5 items. There is also a shared BlockingQueue.
When the process is running I get an out-of-memory error. I'm running with: (32bit -Xms1024m -Xmx1024m)
How can I handle this problem? I don't think I have leaks in hashmap... When the thread is finished hashmap is cleaned right?
Update:
After running a profiler and checking the memory, the biggest hit is:
byte[] 2,516,024 hits, 918 MB
I don't know from where it's called or used.
Name                                                  Instance count  Size (bytes)
byte[ ]                                               2519560         918117496
oracle.jdbc.ttc7.TTCItem                              2515402         120739296
char[ ]                                               357882          15549280
java.lang.String                                      9677            232248
int[ ]                                                2128            110976
short[ ]                                              2097            150024
java.lang.Class                                       1537            635704
java.util.concurrent.locks.ReentrantLock$NonfairSync  1489            35736
java.util.Hashtable$Entry                             1417            34008
java.util.concurrent.ConcurrentHashMap$HashEntry[ ]   1376            22312
java.util.concurrent.ConcurrentHashMap$Segment        1376            44032
java.lang.Object[ ]                                   1279            60216
java.util.TreeMap$Entry                               828             26496
oracle.jdbc.dbaccess.DBItem[ ]                        802             10419712
oracle.jdbc.ttc7.v8TTIoac                             732             52704
I'm not sure about the inner map but I suspect the problem is that you are creating a large number of tasks that is filling memory. You should be using a bounded task queue and limit the job producer.
Take a look at my answer here: Process Large File for HTTP Calls in Java
To summarize it, you should create your own bounded queue and then use a RejectedExecutionHandler to block the producer until there is space in the queue. Something like:
// ThreadPoolExecutor requires a BlockingQueue<Runnable>
final BlockingQueue<Runnable> queue =
    new ArrayBlockingQueue<Runnable>(100);
ThreadPoolExecutor threadPool =
    new ThreadPoolExecutor(nThreads, nThreads, 0L, TimeUnit.MILLISECONDS, queue);
// we need our RejectedExecutionHandler to block if the queue is full
threadPool.setRejectedExecutionHandler(new RejectedExecutionHandler() {
    @Override
    public void rejectedExecution(Runnable task,
            ThreadPoolExecutor executor) {
        try {
            // this will block the producer until there's room in the queue
            executor.getQueue().put(task);
        } catch (InterruptedException e) {
            // restore the interrupt flag before reporting the failure
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException(
                "Unexpected InterruptedException", e);
        }
    }
});
Edit:
I don't think I have leaks in hashmap... When the thread is finished hashmap is cleaned right?
You might consider aggressively calling clear() on the work HashMap and other collections when the task completes. Although they should be reaped by the GC eventually, giving the GC some help may solve your problem if you have limited memory.
If this doesn't work, a profiler is the way to go to help you identify where the memory is being held.
Edit:
After looking at the profiler output, the byte[] is interesting. Typically this indicates some sort of serialization or other IO. You may also be storing blobs in a database. The oracle.jdbc.ttc7.TTCItem is very interesting however. That indicates to me that you are not closing a database connection somewhere. Make sure to use proper try/finally blocks to close your connections.
HashMap carries quite a lot of overhead in terms of memory usage. It carries about 36 bytes minimum per entry, plus the size of the key/value itself; each will be at least 32 bytes (I think that's about the typical value for a 32-bit Sun JVM). Doing some quick math:
20,000 tasks, each with map with 2000 entry hashmap. The value in the map is another map with 5 entries.
-> 5-entry map is 1 * Map + 5 * Map.Entry + 5 * keys + 5 * values = 16 objects at 32 bytes => 512 bytes per sub-map.
-> 2000-entry map is 1 * Map, 2000 * Map.Entry + 2000 keys + 2000 submaps (each is 512 bytes) => 2000*(512+32+32) + 32 => 1.1MB
-> 20,000 tasks, each of 1.1MB -> 23GB
So, your overall footprint is 23GB.
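The same back-of-the-envelope estimate written out as code, so the individual factors can be adjusted. The 32-byte per-object figure is the rough assumption from the text above, not a measured value, and the class and method names are invented for this sketch.

```java
// Hypothetical helper that reproduces the rough estimate from the answer.
class HeapEstimate {
    static final long OBJECT_BYTES = 32;   // assumed size of every object, as in the text
    static final long SUB_MAP_ENTRIES = 5;
    static final long MAP_ENTRIES = 2000;
    static final long TASKS = 20_000;

    // 1 map + 5 entry objects + 5 keys + 5 values = 16 objects
    static long subMapBytes() {
        return (1 + 3 * SUB_MAP_ENTRIES) * OBJECT_BYTES;
    }

    // per outer entry: one sub-map, one entry object, one key; plus the map itself
    static long taskMapBytes() {
        return MAP_ENTRIES * (subMapBytes() + OBJECT_BYTES + OBJECT_BYTES) + OBJECT_BYTES;
    }

    static long totalBytes() {
        return TASKS * taskMapBytes();
    }

    public static void main(String[] args) {
        System.out.println(subMapBytes() + " bytes per sub-map");
        System.out.println(taskMapBytes() + " bytes per task");
        System.out.println(totalBytes() + " bytes in total");
    }
}
```

Under these assumptions one task holds 1,152,032 bytes and all 20,000 tasks together about 23,040,640,000 bytes, which is far more than the 1024 MB heap in the question.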
The logical solution is to restrict the depth of the blocking queue feeding the ExecutorService, and only create enough child tasks to keep it busy. Set a limit of about 64 entries in the queue, and then you will never have more than 64 + 5 tasks instantiated at one time. When space becomes available in the executor's queue, you can create and add another task.
You can improve the efficiency by not adding so many tasks ahead of what is being processed. Try checking the queue and only adding to it if there are fewer than 1000 entries.
You can also make the data structures more efficient. A Map with an Integer key can often be reduced to an array of some kind.
Lastly, 1 GB isn't that much these days. My mobile phone has 2 GB. If you are going to process large amount of data, I suggest getting a machine with 32-64 GB of memory and a 64-bit JVM.
From the large byte[]s, I'd suspect IO related issues (unless you are handling video/audio or something).
Things to look at:
DB: Are you trying to read a large amount of stuff at once? You can e.g. use a cursor to avoid that.
File/Network: Are you trying to read large amounts of stuff from file/network at once? You should "propagate the load" to whatever is reading and regulate the rate of read.
UPDATE: OK, so you are using a cursor to read from DB. Now you need to make sure that the reading from the cursor only progresses as you finish stuff (aka "propagate the load"). To do this, use a thread pool like this:
BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>(queueSize);
ThreadPoolExecutor tpe = new ThreadPoolExecutor(
threadNum,
threadNum,
1000,
TimeUnit.HOURS,
queue,
new ThreadPoolExecutor.CallerRunsPolicy());
Now when you post to this service from your code which reads from the DB, it will block when the queue is full (the calling thread is used to run tasks and hence blocks).
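A small self-contained sketch of this backpressure effect follows. The class name, the 5 ms sleep that stands in for real work, and the returned counters are assumptions made for the demonstration; the executor configuration mirrors the snippet above.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of CallerRunsPolicy throttling a fast producer.
class CallerRunsDemo {
    // Returns {tasks completed, largest queue size seen by the producer}.
    static int[] run(int taskCount, int threadNum, int queueSize) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(queueSize);
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(
                threadNum, threadNum, 1, TimeUnit.SECONDS, queue,
                new ThreadPoolExecutor.CallerRunsPolicy());
        AtomicInteger completed = new AtomicInteger();
        int maxQueued = 0;
        for (int i = 0; i < taskCount; i++) {
            // When the queue is full, this call runs the task on the producer thread.
            tpe.execute(() -> {
                try {
                    Thread.sleep(5);  // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
            maxQueued = Math.max(maxQueued, queue.size());
        }
        tpe.shutdown();
        try {
            tpe.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { completed.get(), maxQueued };
    }

    public static void main(String[] args) {
        int[] r = run(50, 2, 4);
        System.out.println("completed=" + r[0] + " maxQueued=" + r[1]);
    }
}
```

No task is dropped, and the number of waiting tasks never exceeds the queue capacity, so the memory held by pending tasks stays bounded no matter how many tasks the producer wants to submit.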

Java- FixedThreadPool with known pool size but unknown workers

So I think I sort of understand how fixed thread pools work (using Executors.newFixedThreadPool built into Java), but from what I can see, there's usually a set number of jobs you want done and you know how many when you start the program. For example:
int numWorkers = Integer.parseInt(args[0]);
int threadPoolSize = Integer.parseInt(args[1]);
ExecutorService tpes =
Executors.newFixedThreadPool(threadPoolSize);
WorkerThread[] workers = new WorkerThread[numWorkers];
for (int i = 0; i < numWorkers; i++) {
workers[i] = new WorkerThread(i);
tpes.execute(workers[i]);
}
Where each WorkerThread does something really simple; that part is arbitrary. What I want to know is: what if you have a fixed pool size (say 8 max) but you don't know how many workers you'll need to finish the task until runtime?
The specific example is: If I have a pool size of 8 and I'm reading from standard input. As I read, I split the input into blocks of a set size. Each one of these blocks is given to a thread (along with some other information) so that they can compress it. As such, I don't know how many threads I'll need to create as I need to keep going until I reach the end of the input. I also have to somehow ensure that the data stays in the same order. If thread 2 finishes before thread 1 and just submits its work, my data will be out of order!
Would a thread pool be the wrong approach in this situation then? It seems like it'd be great (since I can't use more than 8 threads at a time).
Basically, I want to do something like this:
ExecutorService tpes = Executors.newFixedThreadPool(threadPoolSize);
BufferedInputStream inBytes = new BufferedInputStream(System.in);
byte[] buff = new byte[BLOCK_SIZE];
byte[] dict = new byte[DICT_SIZE];
WorkerThread worker;
int bytesRead = 0;
while((bytesRead = inBytes.read(buff)) != -1) {
System.arraycopy(buff, BLOCK_SIZE-DICT_SIZE, dict, 0, DICT_SIZE);
worker = new WorkerThread(buff, dict);
tpes.execute(worker);
}
This is not working code, I know, but I'm just trying to illustrate what I want.
I left out a bit, but see how buff and dict have changing values and that I don't know how long the input is. I don't think I can actually do this, though, because worker already exists after the first call! I can't just say worker = new WorkerThread a bunch of times, since isn't it already pointing towards an existing thread (true, a thread that might be dead)? And obviously, in this implementation, even if it did work I wouldn't be running in parallel. But my point is, I want to keep creating threads until I hit the max pool size, wait till a thread is done, then keep creating threads until I hit the end of the input.
I also need to keep stuff in order, which is the part that's really annoying.
Your solution is completely fine (the only point is that parallelism is perhaps not necessary if the workload of your WorkerThreads is very small).
With a thread pool, the number of submitted tasks is not relevant. There may be fewer or more tasks than threads in the pool; the thread pool takes care of that.
However, and this is important: You rely on some kind of order of the results of your WorkerThreads, but when using parallelism, this order is not guaranteed! It doesn't matter whether you use a thread pool, or how many worker threads you have, etc.; it will always be possible that your results are finished in an arbitrary order!
To keep the order right, give each WorkerThread the number of the current item in its constructor, and let them put their results in the right order after they are finished:
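int noOfWorkItem = 0;
while ((bytesRead = inBytes.read(buff)) != -1) {
    System.arraycopy(buff, BLOCK_SIZE-DICT_SIZE, dict, 0, DICT_SIZE);
    // pass copies (java.util.Arrays): buff and dict are overwritten by the next iteration
    worker = new WorkerThread(Arrays.copyOf(buff, bytesRead), dict.clone(), noOfWorkItem++);
    tpes.execute(worker);
}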
As @ignis points out, parallel execution may not be the best answer for your situation.
However, to answer the more general question, there are several other Executor implementations to consider beyond FixedThreadPool, some of which may have the characteristics that you desire.
As far as keeping things in order, typically you would submit tasks to the executor, and for each submission, you get a Future (which is an object that promises to give you a result later, when the task finishes). So, you can keep track of the Futures in the order that you submitted tasks, and then when all tasks are done, invoke get() on each Future in order, to get the results.
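The Future-based approach might look like the sketch below. It is an illustration, not the asker's code: the class OrderedFutures, the method processBlocks, and the upper-casing process() that stands in for the compression step are assumptions. Each block is copied before submission so that reusing the read buffer cannot change a task's input.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OrderedFutures {
    // Stand-in for the real per-block work, e.g. compression (assumption).
    static String process(byte[] block) {
        return new String(block, StandardCharsets.UTF_8).toUpperCase();
    }

    static List<String> processBlocks(List<byte[]> blocks, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // Futures are stored in submission order, whatever order tasks finish in.
            List<Future<String>> futures = new ArrayList<>();
            for (byte[] block : blocks) {
                byte[] copy = block.clone();  // the task gets its own data
                Callable<String> task = () -> process(copy);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());         // blocks until that task is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<byte[]> blocks = new ArrayList<>();
        for (String s : new String[] { "ab", "cd", "ef" }) {
            blocks.add(s.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(processBlocks(blocks, 2));
    }
}
```

For input of unknown length, the same pattern works if results are consumed while reading, for example by calling get() on the oldest Future whenever more than a fixed number of tasks are outstanding; that also bounds memory use.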

Java Priority Queue With Linked Lists

I have been trying to figure out a question on a recent assignment for a few days now, and I can't seem to wrap my head around it. The question reads as follows:
Create a PriorityQueue class that contains two fields noOfPriorities
and a LinkedList… It should have one constructor that takes in an int
value assign that value to the noOfPriorities… at the same time add as
many LinkedLists as numberOfPriorities.. Enqueue method that takes in
a priority and an object.. Dequeue method that returns the next
priority element… and remove it from the list…
A large part of my problem is that I can't determine exactly what the professor is looking for because the wording seems a bit weird to me... simply asking about it yielded no help either.
Just to clarify, I'm not looking for anyone to give me the answer. I'm simply looking for a push in the right direction. If anyone could help, it would be greatly appreciated.
Cheers on being honest about this being homework.
I think you can understand the problem better if you read up on what a priority queue is.
Let's take a small example. You have a few tasks to do, and each task has a priority.
Pri 1 - breathe, eat, sleep
Pri 2 - study, play
Pri 3 - watch a movie
All the above information can be handled by your PriorityQueue. You have 3 kinds of priorities, so you have 3 lists. Each list is to maintain tasks with the same priority.
Once you construct the empty PriorityQueue by calling PriorityQueue(3), you can add tasks to it.
Let's say you want to add the task "study" that has priority 2.
You can say, priorityQueue.enqueue(2, "study"). You would then go to the list that maintains priority 2 items, and add the task "study" to that list.
Similarly, when you want to find out what the next priority 3 item is, you can say, priorityQueue.dequeue(3). You would then find the list that handles priority 3 items, and remove the last element from that list.
This should give you a good understanding to start working. :)
Agreed, the assignment's badly worded.
at the same time add as many LinkedLists as numberOfPriorities..
This should probably be "at the same time add as many nodes to the LinkedList as numberOfPriorities.."
The next question to ask yourself is, just what type of thing should I store in all these linked nodes...?
I think you'd need an array of linked lists instead of just one. The problem description is contradictory, saying that you need both a class that has one linked list, and to create a number of them when you construct an object.
Here's a constructor for your class:
MyPriorityQueue(int npriorities)
{
noOfPriorities = npriorities;
queueArray = new ArrayList<List<T>>();
for (int i = 0; i < npriorities; ++i) queueArray.add(new LinkedList<T>());
}
You'd then have a mapping of priorities to queues. Your enqueue method would take an object and a priority (an int representing a priority) and would add the object to the queue specified by the priority. Your dequeue method would simply return the end of the highest priority queue with an element in it.
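One possible reading of that description, as a sketch only: it assumes priorities are numbered from 1 (highest) to noOfPriorities, that items of equal priority leave in first-in, first-out order, and that dequeue() takes no argument. The assignment text does not settle any of these points, so treat them as assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.NoSuchElementException;

class MyPriorityQueue<T> {
    private final int noOfPriorities;
    private final List<LinkedList<T>> queueArray = new ArrayList<>();

    MyPriorityQueue(int npriorities) {
        noOfPriorities = npriorities;
        for (int i = 0; i < npriorities; ++i) queueArray.add(new LinkedList<T>());
    }

    // Assumption: priority 1 is the highest and maps to the list at index 0.
    void enqueue(int priority, T item) {
        if (priority < 1 || priority > noOfPriorities) {
            throw new IllegalArgumentException("priority out of range: " + priority);
        }
        queueArray.get(priority - 1).addLast(item);
    }

    // Removes the oldest item from the highest-priority non-empty list.
    T dequeue() {
        for (LinkedList<T> list : queueArray) {
            if (!list.isEmpty()) return list.removeFirst();
        }
        throw new NoSuchElementException("queue is empty");
    }

    public static void main(String[] args) {
        MyPriorityQueue<String> q = new MyPriorityQueue<>(3);
        q.enqueue(2, "study");
        q.enqueue(3, "watch a movie");
        q.enqueue(1, "eat");
        System.out.println(q.dequeue());
    }
}
```

Here the list index doubles as the priority-to-queue mapping, so no separate map is needed.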
Make any sense?
