Does my implementation of LinkedBlockingQueue need to be synchronized? - java

To begin with, I have used search and found n topics related to this question. Unfortunately, they didin't help me, so it'll be n++ topics :)
Situation: I will have few working threads (the same class, just many dublicates) (let's call them WT) and one result writing thread (RT).
WT will add objects to the Queue, and RT will take them. Since there will be many WT won't there be any memory problems(independant from the max queue size)? Will those operations wait for each other to be completed?
Moreover, as I understand, BlockingQueue is quite slow, so maybe I should leave it and use normal Queue while in synchronized blocks? Or should I consider my self by using SynchronizedQueue?

LinkedBlockingQueue is designed to handle multiple threads writing to the same queue. From the documentation:
BlockingQueue implementations are thread-safe. All queuing methods achieve their effects atomically using internal locks or other forms of concurrency control. However, the bulk Collection operations addAll, containsAll, retainAll and removeAll are not necessarily performed atomically unless specified otherwise in an implementation.
Therefore, you are quite safe (unless you expect the bulk operations to be atomic).
Of course, if thread A and thread B are writing to the same queue, the order of A's items relative to B's items will be indeterminate unless you synchronize A and B.
As to the choice of queue implementation, go with the simplest that does the job, and then profile. This will give you accurate data on where the bottlenecks are so you won't have to guess.

Related

How to convert ReentrantReadWriteLock logic to LMAX Disruptor with barriers

I have a shared collection an ArrayList and also i use a ReentrantReadWriteLock lock to secure the entering on a critical area from different threads. My threads are three a writer,read,delete thread. i acquire the correct lock on each case. The logic is that i insert data to ArrayList, read them when necessary, and also when the timer reaches limits delete some entries. The process runs smoothly and everything is perfect.
My question now is can i transfer the above logic somehow and implemented it with an LMAX disruptor in order to avoid lock overheads and improve performance. If yes can you describe me an ideal case and if you are able to also post code i would really appreciate it.
i assume that instead of ArrayList data will be entered in ringbuffer and i will have 2 producers write, delete, and a consumer for read. Also i must make sure that i use producer barriers. Will the performance will be increased from lock case. i am not sure if i understand everything correctly please help me and give me directions?
If your shared state is the ArrayList and you have one thread that is reading and processing the elements in the ArrayList and you want the updates to that shared state synchronised then usually the ArrayList would be owned by one EventHandler that processes events such as Process, Write, Delete and updates and processes the shared state.
This would all run on one thread but that pretty much is what is happening now as you cannot Read at the same time as Writing/Deleting.
As you only have one reading thread there is not much to be gained from using a ReadWriteLock as you will never have concurrent reads.

Using LinkedBlockingQueue good enough for multi thread java program?

I have a consumer and a producer that adds and deletes Item objects from the queue. If I use the put() and take() methods. Is there any thread safety issues I need to still cover? This is similar to the bounded buffer problem and I was just wondering if using the blocking queue instead replaces the need for semaphores or monitors. The Item object itself would probably need synchronization (setters but getters don't need lock), am I right? And lastly, I'm not quite sure how to test if it is thread safe since I can't simultaneously make both threads call the take() because to order of execution is underterministic. Any ideas? Thanks.
It is perfectly thread-safe for what you're doing, in fact this is what it's designed for. The description of BlockingQueue (which is the interface implemented by LinkedBlockingQueue) states:
BlockingQueue implementations are thread-safe. All queuing methods
achieve their effects atomically using internal locks or other forms
of concurrency control.
Simultaneous put() and take() are not thread-safe since they use 2 different locks.
This is already answered here : Are LinkedBlockingQueue's insert and remove methods thread safe?

Concurrency design principles in practice

I have a Results object which is written to by several threads concurrently. However, each thread has a specific purpose and owns certain fields, so that no data is actually modified by more than one thread. The consumer of this data will not try to read it until all of the writer threads are done writing it. Because I know this to be true, there is no synchronization on the data writes and reads.
There is a RunningState object associated with this Results object which serves to coordinate this work. All of its methods are synchronized. When a thread is done with its work on this Results object, it calls done() on the RunningState object, which does the following: decrements a counter, checks if the counter has gone to 0 (indicating that all writers are done), and if so, puts this object on a concurrent queue. That queue is consumed by a ResultsStore which reads all of the fields and stores data in the database. Before reading any data, the ResultsStore calls RunningState.finalizeResult(), which is an empty method whose sole purpose is to synchronize on the RunningState object, to ensure that writes from all of the threads are visible to the reader.
Here are my concerns:
1) I believe that this will work correctly, but I feel like I'm violating good design principles to not synchronize on the data modifications to an object that is shared by multiple threads. However, if I were to add synchronization and/or split things up so each thread only saw the data it was responsible for, it would complicate the code. Anyone who modifies this area had better understand what's going on in any case or they're likely to break something, so from a maintenance standpoint I think the simpler code with good comments explaining how it works is a better way to go.
2) The fact that I need to call this do-nothing method seems like an indication of wrong design. Is it?
Opinions appreciated.
This seems mostly right, if a bit fragile (if you change the thread-local nature of one field, for instance, you may forget to synchronize it and end up with hard-to-trace data races).
The big area of concern is in memory visibility; I don't think you've established it. The empty finalizeResult() method may be synchronized, but if the writer threads didn't also synchronize on whatever it synchronizes on (presumably this?), there's no happens-before relationship. Remember, synchronization isn't absolute -- you synchronize relative to other threads that are also synchronized on the same object. Your do-nothing method will indeed do nothing, not even ensure any memory barrier.
You somehow need to establish a happens-before relationship between each thread doing its writes, and the thread that eventually reads. One way to do this without synchronization is via a volatile variable, or an AtomicInteger (or other atomic classes).
For instance, each writer thread can invoke counter.incrementAndGet(1) on the object, and the reading thread can then check that counter.get() == THE_CORRECT_VALUE. There's a happens-before relationship between a volatile/atomic field being written and it being read, which gives you the needed visibility.
Your design is sound, but it can be improved if you are using a true concurrent queue since a concurrent queue from the java.util.concurrent package already guarantees a happens before relationship between the thread putting an item into the queue, and the thread taking an item out, so this precludes needing to call finalizeResult() in the taking thread (so no need for that "do nothing" method call).
From java.util.concurrent package description:
The methods of all classes in java.util.concurrent and its subpackages
extend these guarantees to higher-level synchronization. In
particular:
Actions in a thread prior to placing an object into any
concurrent collection happen-before actions subsequent to the access
or removal of that element from the collection in another thread.
The comments in another answer concerning using an AtomicInteger instead of synchronization are also wise (as using an AtomicInteger to do your thread counting will likely perform better than synchronization), just make sure to get the value of the count after the atomic decrement (e.g. decrementAndGet()) when comparing to 0 in order to avoid adding to the queue twice.
What you've described is indeed safe, but it also sounds, frankly, brittle and (as you note) maintenance could become an issue. Without sample code, it's really hard to tell what's really easiest to understand, so an already subjective question becomes frankly unanswerable. Could you ask a coworker for a code review? (Particularly one that's likely to have to deal with this pattern.) I'm going to trust you that this is indeed the simplest approach, but doing something like wrapping synchronized blocks around writes would increase safety now and in the future. That said, you obviously know your code better than I do.

Analysing a BlockingQueue usage example

I was looking at the "usage example based on a typical producer-consumer scenario" at:
http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/BlockingQueue.html#put(E)
Is the example correct?
I think the put and take operations need a lock on some resource before proceeding to modify the queue, but that is not happening here.
Also, had this been a Concurrent kind of a queue, the lack of locks would have been understandable since atomic operations on a concurrent queue do not need locks.
I do not think there is something to add to what is written in api:
A Queue that additionally supports operations that wait for the queue to become non-empty when retrieving an element, and wait for space to become available in the queue when storing an element.
BlockingQueue implementations are thread-safe. All queuing methods achieve their effects atomically using internal locks or other forms of concurrency control.
BlockingQueue is just an interface. This implementation could be using synchronzed blocks, Lock or be lock-free. AFAIK most methods use Lock in the implementation.

Parallel-processing in Java; advice needed i.e. on Runnanble/Callable interfaces

Assume that I have a set of objects that need to be analyzed in two different ways, both of which take relatively long time and involve IO-calls, I am trying to figure out how/if I could go about optimizing this part of my software, especially utilizing the multiple processors (the machine i am sitting on for ex is a 8-core i7 which almost never goes above 10% load during execution).
I am quite new to parallel-programming or multi-threading (not sure what the right term is), so I have read some of the prior questions, particularly paying attention to highly voted and informative answers. I am also in the process of going through the Oracle/Sun tutorial on concurrency.
Here's what I thought out so far;
A thread-safe collection holds the objects to be analyzed
As soon as there are objects in the collection (they come a couple at a time from a series of queries), a thread per object is started
Each specific thread takes care of the initial pre-analysis preparations; and then calls on the analyses.
The two analyses are implemented as Runnables/Callables, and thus called on by the thread when necessary.
And my questions are:
Is this a reasonable scheme, if not, how would you go about doing this?
In order to make sure things don't get out of hand, should I implement a ThreadManager or some thing of that sort, which starts and stops threads, and re-distributes them when they are complete? For example, if i have 256 objects to be analyzed, and 16 threads in total, the ThreadManager assigns the first finished thread to the 17th object to be analyzed etc.
Is there a dramatic difference between Runnable/Callable other than the fact that Callable can return a result? Otherwise should I try to implement my own interface, in that case why?
Thanks,
You could use a BlockingQueue implementation to hold your objects and spawn your threads from there. This interface is based on the producer-consumer principle. The put() method will block if your queue is full until there is some more space and the take() method will block if the queue is empty until there are some objects again in the queue.
An ExecutorService can help you manage your pool of threads.
If you are awaiting a result from your spawned threads then Callable interface is a good idea to use since you can start the computation earlier and work in your code assuming the results in Future-s. As far as the differencies with the Runnable interface, from the Callable javadoc:
The Callable interface is similar to Runnable, in that both are designed for classes whose instances are potentially executed by another thread. A Runnable, however, does not return a result and cannot throw a checked exception.
Some general things you need to consider in your quest for java concurrency:
Visibility is not coming by defacto. volatile, AtomicReference and other objects in the java.util.concurrent.atomic package are your friends.
You need to carefully ensure atomicity of compound actions using synchronization and locks.
Your idea is basically sound. However, rather than creating threads directly, or indirectly through some kind of ThreadManager of your own design, use an Executor from Java's concurrency package. It does everything you need, and other people have already taken the time to write and debug it. An executor manages a queue of tasks, so you don't need to worry about providing the threadsafe queue yourself either.
There's no difference between Callable and Runnable except that the former returns a value. Executors will handle both, and ready them the same.
It's not clear to me whether you're planning to make the preparation step a separate task to the analyses, or fold it into one of them, with that task spawning the other analysis task halfway through. I can't think of any reason to strongly prefer one to the other, but it's a choice you should think about.
The Executors provides factory methods for creating thread pools. Specifically Executors#newFixedThreadPool(int nThreads) creates a thread pool with a fixed size that utilizes an unbounded queue. Also if a thread terminates due to a failure then a new thread will be replaced in its place. So in your specific example of 256 tasks and 16 threads you would call
// create pool
ExecutorService threadPool = Executors.newFixedThreadPool(16);
// submit task.
Runnable task = new Runnable(){};;
threadPool.submit(task);
The important question is determining the proper number of threads for you thread pool. See if this helps Efficient Number of Threads
Sounds reasonable, but it's not as trivial to implement as it may seem.
Maybe you should check the jsr166y project.
That's probably the easiest solution to your problem.

Categories