Recently I started reading 'Java 7 Concurrency Cookbook', and in the section "Creating and running a daemon thread" I found code where the main thread creates one instance of ArrayDeque and shares its reference with three producers and one consumer. The producers call deque.addFirst(event) and the consumer calls deque.getLast().
But JavaDoc of ArrayDeque clearly states that:
Array deques are not thread-safe; in the absence of external synchronization, they do not support concurrent access by multiple threads.
So I wonder whether it is a mistake or I just don't understand something?
Array deques are not thread safe, meaning you have to provide external synchronization.
However, the reason it appears to work is, as holger said:
You are using addFirst(e), which is an insert method that changes the underlying data structure.
You are using getLast(), which is an examine method that does not change the underlying data structure.
That is why it seems to work. If you had used removeLast() instead of getLast(), the consumer would also be modifying the structure, and you would be much more likely to see failures such as lost elements or unexpected exceptions. Note that even the current code has no guarantee of working, since nothing ensures that one thread sees another thread's writes.
Hope this clears everything up. Cheers
The documentation clearly states that if you do not provide external synchronization, ArrayDeque will not synchronize for you, unlike Vector, which has thread safety built in.
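As a rough illustration of what "external synchronization" could look like here, the sketch below wraps an ArrayDeque so that every access goes through the same monitor. The class name GuardedDeque and its method set are hypothetical and are not taken from the book.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical wrapper (not from the book): all access shares one monitor.
final class GuardedDeque<E> {
    private final Deque<E> deque = new ArrayDeque<>();

    // Producers insert at the head while holding the wrapper's monitor.
    public synchronized void addFirst(E e) {
        deque.addFirst(e);
    }

    // Consumers read the tail under the same monitor; returns null when empty.
    public synchronized E peekLast() {
        return deque.peekLast();
    }

    // Removal is a structural change, so it takes the same lock.
    public synchronized E pollLast() {
        return deque.pollLast();
    }

    public synchronized int size() {
        return deque.size();
    }
}
```

If a wrapper is not wanted, java.util.concurrent.ConcurrentLinkedDeque offers the same addFirst/getLast style of access and is documented as thread-safe.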
I have an Actor that - in its very essence - maintains a list of objects. It has three basic operations, an add, update and a remove (where sometimes the remove is called from the add method, but that aside), and works with a single collection. Obviously, that backing list is accessed concurrently, with add and remove calls interleaving each other constantly.
My first version used a ListBuffer, but I read somewhere it's not meant for concurrent access. I haven't gotten concurrent access exceptions, but I did note that finding & removing objects from it does not always work, possibly due to concurrency.
I was halfway rewriting it to use a var List, but removing items from Scala's default immutable List is a bit of a pain - and I doubt it's suitable for concurrent access.
So, basic question: What collection type should I use in a concurrent access situation, and how is it used?
(Perhaps secondary: Is an Actor actually a multithreaded entity, or is that just my wrong conception and does it process messages one at a time in a single thread?)
(Tertiary: In Scala, what collection type is best for inserts and random access (delete / update)?)
Edit: To the kind responders: Excuse my late reply, I'm making a nasty habit out of dumping a question on SO or mailing lists, then moving on to the next problem, forgetting the original one for the moment.
Take a look at the scala.collection.mutable.Synchronized* traits/classes.
The idea is that you mixin the Synchronized traits into regular mutable collections to get synchronized versions of them.
For example:
import scala.collection.mutable._
val syncSet = new HashSet[Int] with SynchronizedSet[Int]
val syncArray = new ArrayBuffer[Int] with SynchronizedBuffer[Int]
You don't need to synchronize the state of the actors. The aim of the actors is to avoid tricky, error prone and hard to debug concurrent programming.
The actor model ensures that an actor consumes messages one by one and that you never have two threads consuming messages for the same actor.
Scala's immutable collections are suitable for concurrent usage.
As for actors, a couple of things are guaranteed, as explained in the Akka documentation:
the actor send rule: where the send of the message to an actor happens before the receive of the same actor.
the actor subsequent processing rule: where processing of one message happens before processing of the next message by the same actor.
You are not guaranteed that the same thread processes the next message, but you are guaranteed that the current message will finish processing before the next one starts, and also that at any given time, only one thread is executing the receive method.
So that takes care of a given Actor's persistent state. With regard to shared data, the best approach as I understand it is to use immutable data structures and lean on the Actor model as much as possible. That is, "do not communicate by sharing memory; share memory by communicating."
What collection type should I use in a concurrent access situation, and how is it used?
See @hbatista's answer.
Is an Actor actually a multithreaded entity, or is that just my wrong conception and does it process messages one at a time in a single thread
The second (though the thread on which messages are processed may change, so don't store anything in thread-local data). That's how the actor can maintain invariants on its state.
I have many threads adding result-like objects to an array, and would like to improve the performance of this area by removing synchronization.
To do this, I would like each thread to post its results to a ThreadLocal array instead. Once processing is complete, I can combine the arrays for the following phase. Unfortunately, for this purpose ThreadLocal has a glaring issue: I cannot combine the collections at the end, as no thread has access to the collection of another.
I can work around this by additionally adding each ThreadLocal array to a list next to the ThreadLocal as they are created, so I have all the lists available later on (this will require synchronization but only needs to happen once for each thread), however in order to avoid a memory leak I will have to somehow get all the threads to return at the end to clean up their ThreadLocal cache... I would much rather the simple process of adding a result be transparent, and not require any follow up work beyond simply adding the result.
Is there a programming pattern or existing ThreadLocal-like object which can solve this issue?
You're right, ThreadLocal objects are designed to be only accessible to the current thread. If you want to communicate across threads you cannot use ThreadLocal and should use a thread-safe data structure instead, such as ConcurrentHashMap or ConcurrentLinkedQueue.
For the use case you're describing it would be easy enough to share a ConcurrentLinkedQueue between your threads and have them all write to the queue as needed. Once they're all done (Thread.join() will wait for them to finish) you can read the queue into whatever other data structure you need.
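A minimal sketch of that approach, assuming each worker just produces integers; the class name ResultCollector and the parameters threads and perThread are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical example: several threads post results to one shared queue.
final class ResultCollector {
    static List<Integer> collect(int threads, int perThread) {
        Queue<Integer> results = new ConcurrentLinkedQueue<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    results.add(id * perThread + i); // thread-safe, no explicit lock
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until this producer has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        // All writers are done, so copying the queue is safe here.
        return new ArrayList<>(results);
    }
}
```

Adding a result stays a single call for the worker, and nothing has to be cleaned up per thread afterwards. The order of the combined list depends on thread scheduling.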
I have an ArrayList which I add items to, within a broadcastreceiver callback.
However the arraylist will eventually be attached to an adapter and then I wish to display the contents of the array to the screen.
The array contains peer information from a P2P app I'm working on so it will be subject to change frequently as devices drop in and out of connection/range.
So basically the arraylist will be read and written to frequently.
I come from a c++ background so I would normally use a lock to protect my arraylist, when accessing it, but I'm unsure what I should use in java/android.
Any Advice please.
Using a lock is never wrong. All synchronized does is use a lock under the hood. Some Java purists may complain, but you tend to get more flexibility out of just using a semaphore (and sometimes it's the only way to be correct). There are also some ugly corner cases with wait/notify that you have to really understand to get right, and that semaphores simply avoid. If you're familiar with locks, I wouldn't hesitate to use them just because you're in Java now.
Use a BlockingQueue instead of an ArrayList. BlockingQueue implementations are thread-safe. As per the documentation:
A Queue that additionally supports operations that wait for the queue to become non-empty when retrieving an element, and wait for space to become available in the queue when storing an element.
The synchronized keyword locks on whatever object is specified. If a method is marked as synchronized and it's an instance method, it locks on the enclosing instance. If the method is static, it locks on the class object. If an object is specified in parentheses after the synchronized keyword in a synchronized block, the lock is held on that object. I would typically use a thread-safe collection like AndroidWarrior proposed, but if that's not possible, just make sure that your accessors and mutators lock on the same object.
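To make "lock on the same object" concrete, here is a small sketch. PeerList and its methods are hypothetical names, and the peers are plain strings for simplicity:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the peer list; every accessor and mutator
// synchronizes on the same private lock object.
final class PeerList {
    private final Object lock = new Object();
    private final List<String> peers = new ArrayList<>();

    void add(String peer) {
        synchronized (lock) {
            peers.add(peer);
        }
    }

    boolean remove(String peer) {
        synchronized (lock) {
            return peers.remove(peer);
        }
    }

    // Returns a copy so the caller can iterate without holding the lock.
    List<String> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(peers);
        }
    }
}
```

One possible use is for the broadcast receiver to call add and remove, while the UI side takes a snapshot and hands that copy to the adapter, so the adapter never reads the list while it is being modified.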
Could you please clarify if we need to use explicit synchronization or locks for using ConcurrentLinkedQueue? I am specifically interested in knowing if sync calls are needed for following ConcurrentLinkedQueue methods.
add
clear
size
Possibly size is the only method that might require explicit sync, since it is not an atomic method, but the ConcurrentLinkedQueue Javadoc says:
"Beware that, unlike in most collections, the size method is NOT a constant-time operation. Because of the asynchronous nature of these queues, determining the current number of elements requires a traversal of the elements."
This makes me believe that although the size call may be slow, it doesn't require any explicit sync call.
Thanks in advance ...
clear() is not an atomic operation (it is implemented in the AbstractQueue class). As the Javadoc and source say: "This implementation repeatedly invokes poll until it returns null." poll is atomic, but if you call offer while clear() is in progress, you will add something in the middle of the clearing, and clear() will delete it...
If you need to use clear(), you should use LinkedBlockingQueue instead of ConcurrentLinkedQueue.
You don't need any explicit synchronization or locks. As the docs state, it is a thread-safe collection. This means each of these methods is correctly atomic (though as you point out, size() may be slow).
You should not and do not need to use explicit locking on any of those methods.
Yeah, you do not need to use explicit synchronization because this is a thread-safe collection. Any concurrent access is allowed without worry.
It is unnecessary to synchronize to preserve the internal structure of the queue. However it may be necessary to linearise other invariants of your structure.
For instance size() is fairly meaningless in any shared mutable container. All it can ever tell you is something about what it was the last time you asked, not what it is now, unless you stop the world and prevent concurrent modification. It is only useful for indicative monitoring purposes, you should never use it in your algorithm.
Similarly, clear() doesn't really mean much without some kind of external intervention. Clear what? The things that are in it at the time you call clear? In a concurrent structure answering that is a difficult if not impossible question.
So, you are better off using it as a simple thread-safe queue (only offering and polling) and steering clear of the others unless you externally lock.
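As one way to read that advice, the sketch below empties a queue using only poll(), so the caller knows exactly which elements it removed, rather than relying on clear(). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical helper: takes elements one at a time with poll().
final class QueueDrain {
    // Removes and returns elements until poll() reports the queue is empty.
    // Elements offered concurrently may or may not be included; whatever is
    // not returned here is still in the queue for a later call.
    static <E> List<E> drain(Queue<E> queue) {
        List<E> taken = new ArrayList<>();
        E item;
        while ((item = queue.poll()) != null) {
            taken.add(item);
        }
        return taken;
    }
}
```

With a ConcurrentLinkedQueue, each poll() is atomic, so two threads draining at once never receive the same element.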
I am facing this issue:
I have lots of threads (1024) that access one large collection, a Vector.
Question:
Is it possible to do something that would allow me to perform concurrent actions on it without having to synchronize everything (since that takes time)? What I mean is something like the way a MySQL database works, where you don't have to worry about synchronization and thread-safety issues. Is there a collection like that in Java? Thanks
Vector is a very old Java class - predates the Collections API. It synchronizes on every operation, so you're not going to have any luck trying to speed it up.
You should consider reworking your code to use something like ConcurrentHashMap or a LinkedBlockingQueue, which are highly optimized for concurrent access.
Failing that, you mention that you'd like performance and access semantics similar to a database - why not use a dedicated database or a message queue? They are likely to implement it better than you ever will, and it's less code for you to write!
[edit] Given your comment:
all what thread does is adding elements to vector
(only if num of elements in vector = 0) &
removing elements from vector. (if vector size > 0)
it sounds very much like you should be using something much more like a queue than a list! A bounded queue with size 1 will give you these semantics - although I'd question why you can't add elements if there is already something there. When you've got thousands of threads this seems like a very inefficient design.
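A bounded queue of size 1 could look roughly like this, here using ArrayBlockingQueue with non-blocking offer and poll. SingleSlot and its method names are invented for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical single-slot container built on a bounded queue of capacity 1.
final class SingleSlot {
    private final BlockingQueue<String> slot = new ArrayBlockingQueue<>(1);

    // Adds the item only if the slot is empty; returns false otherwise.
    boolean tryPut(String item) {
        return slot.offer(item);
    }

    // Removes and returns the item, or returns null if the slot is empty.
    String tryTake() {
        return slot.poll();
    }
}
```

The "add only if empty" check and the insert happen as one atomic step inside offer, so no separate size test is needed. put and take could be used instead if the threads should wait rather than give up.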
Well, first off, this design doesn't sound right. It sounds like you need to think about using a proper database rather than a simple data structure, even if this means just using something like an in-memory instance of HypersonicDB.
However, if you insist on doing things this way, then the java.util.concurrent package has a number of highly concurrent, non-locking data structures. One of them might suit your purpose (e.g. ConcurrentHashMap, if you can use a Map rather than a List)
It looks like you are implementing the producer-consumer pattern. You should google "producer consumer java" or have a look at the BlockingQueue interface.
I agree with skaffman about looking at java.util.concurrent.
ConcurrentHashMap is very scalable. However, the size() call on it returns only an approximation. So e.g. your app will occasionally be adding elements to it even if !(num of elements in vector = 0).
If you want to strictly enforce the condition you gave, there is no other way than to synchronize.
Instead of having tons of context switches, I guess you could let your user threads post a Callable to a queue and have only one thread deal with the mutation. This eliminates the need for synchronization on the collection. The user threads can wait on Future.get().
Just an idea.
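A sketch of that idea, assuming a single-threaded ExecutorService plays the role of the queue plus the one mutating thread. SingleWriter, add and addAndWait are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: only the executor's single thread touches the list.
final class SingleWriter {
    private final ExecutorService mutator = Executors.newSingleThreadExecutor();
    private final List<String> items = new ArrayList<>(); // mutator thread only

    // Queues the mutation and returns a Future holding the new size.
    Future<Integer> add(String item) {
        return mutator.submit(() -> {
            items.add(item);
            return items.size();
        });
    }

    // Convenience for callers that want to block until the change is applied.
    int addAndWait(String item) {
        try {
            return add(item).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    void shutdown() {
        mutator.shutdown();
    }
}
```

Because tasks run one at a time on the same thread, the plain ArrayList needs no lock of its own.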
If you do not want to change your data structure and have only infrequent writes, you might also use one or many ReentrantReadWriteLock to synchronize access. Then many threads can read at the same time, but when a thread wants to write all reads are blocked until the write is done.
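For illustration, a read-mostly list guarded by a single ReentrantReadWriteLock might look like this. ReadMostlyList is a hypothetical name, and an ArrayList stands in for whatever structure is already in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper: many concurrent readers, one writer at a time.
final class ReadMostlyList<E> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<E> items = new ArrayList<>();

    // Readers share the read lock, so they do not block each other.
    E get(int index) {
        lock.readLock().lock();
        try {
            return items.get(index);
        } finally {
            lock.readLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return items.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it waits for readers and blocks new ones.
    void add(E item) {
        lock.writeLock().lock();
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

This only pays off when reads clearly outnumber writes. With frequent writes it behaves much like a plain lock.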
But you should check whether the used data structure is appropriate for the task, or whether another of the many java.util or java.util.concurrent classes is more appropriate. java.util.Vector is synchronized, by the way.