How to perform clone() on Sets.newSetFromMap(map) - java

The previous code avoided ConcurrentModificationException on a Vector by performing every iteration inside a synchronized block on that Vector. This performs very poorly, because multiple threads end up in the BLOCKED state waiting to acquire the lock on that Vector in different APIs.
I have decided to replace the Vector with Collections.newSetFromMap(new ConcurrentHashMap<psConference,Boolean>()); in my project.
After changing the Vector into a concurrent collection, I removed all synchronized blocks.
But the problem is that some of my code calls clone() on that Vector.
How can I do the same here, since I only have the Set interface?
Is Vector clone() deep cloning or shallow cloning?
Also, please tell me the significance of Boolean in ConcurrentHashMap<psConference,Boolean>.

But the problem is that some of my code calls clone() on that
Vector.
How can I do the same here, since I only have the Set interface?
You are working with a Set now, not a Vector. Your Set is backed by a ConcurrentHashMap, and is thus safe to iterate concurrently. Rather than cloning, I would suggest using a copy constructor.
But be aware (from the javadocs):
However, iterators are designed to be used by only one thread at a
time.
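A minimal sketch of the copy approach, assuming String elements stand in for psConference and that a shallow copy is what the old clone() calls relied on. The class and method names here are made up for illustration; the set returned by Collections.newSetFromMap has no copy constructor of its own, so the copy is built with addAll:

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentSetCopy {

    // Creates an empty concurrent set backed by a ConcurrentHashMap.
    static <T> Set<T> newConcurrentSet() {
        return Collections.newSetFromMap(new ConcurrentHashMap<T, Boolean>());
    }

    // Shallow copy: the new set holds the same element references,
    // it does not copy the elements themselves.
    static <T> Set<T> copyOf(Set<T> source) {
        Set<T> copy = newConcurrentSet();
        // If source is modified concurrently, the copy reflects a
        // weakly consistent view rather than an exact point in time.
        copy.addAll(source);
        return copy;
    }

    public static void main(String[] args) {
        Set<String> conferences = newConcurrentSet();
        conferences.add("conf-1");
        conferences.add("conf-2");

        Set<String> snapshot = copyOf(conferences);
        conferences.remove("conf-1"); // does not affect the snapshot

        System.out.println(snapshot.size());    // 2
        System.out.println(conferences.size()); // 1
    }
}
```

If the copy does not itself need to be thread-safe, new HashSet<T>(source) would do the same job with a plain copy constructor.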
That being said, you could also use a CopyOnWriteArrayList, but you have to be careful there, because writes are expensive and the Iterator does not support element changing operations.
Is Vector clone() deep cloning or shallow cloning?
Clone makes a copy of the references, thus is shallow.
Also, please tell me the significance of Boolean in
ConcurrentHashMap<psConference,Boolean>
The Boolean value is just a placeholder, since you are using a Map as a Set. If you look at the source of the Collections class you will see that Boolean.TRUE is always used when adding elements. The actual container backing the Set is the Map#keySet(). So the Boolean parameter does nothing here; it is just a placeholder.
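To see the placeholder for yourself, here is a small sketch that adds an element through the Set view and then looks at the backing map. The class and method names are invented for the example, and String stands in for psConference:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class PlaceholderDemo {

    // Adds one element through the Set view and returns the value
    // that ended up stored in the backing map for that key.
    static Boolean storedValueFor(String element) {
        Map<String, Boolean> backing = new ConcurrentHashMap<String, Boolean>();
        Set<String> set = Collections.newSetFromMap(backing);
        set.add(element);
        return backing.get(element);
    }

    public static void main(String[] args) {
        System.out.println(storedValueFor("conf-1")); // true
    }
}
```

Keeping a reference to the backing map is done here only to inspect it; the Javadoc for newSetFromMap says the map should not be accessed directly once the set has been created.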

Personally, I would prefer avoiding concurrency problems as much as possible. Could you maybe send an example of code which throws ConcurrentModificationException? There is maybe a way to re-design algorithms to avoid them.
Also, I would rather use ArrayList to replace Vector.
To answer your point (2), based on this explanation, I would say Vector clone() is probably a shallow clone. By the way, they also say that it is usually better to avoid using the clone() method.

Keep in mind that the Concurrent versions of the collections will not automagically make concurrent bugs go away.
Still, in your case your best bet would be to create concrete wrapper classes that implement the collection interfaces you expect, delegate all of the necessary methods into the wrapped collection, but know the data types used and know how to create copies of themselves.
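One way the wrapper idea could look, as a rough sketch only. The class name CopyableConcurrentSet and its copy() method are hypothetical, and a real version would probably fix the element type to psConference rather than stay generic:

```java
import java.util.AbstractSet;
import java.util.Collections;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper: delegates to a concurrent set and knows how
// to create a copy of itself.
class CopyableConcurrentSet<T> extends AbstractSet<T> {

    private final Set<T> delegate =
            Collections.newSetFromMap(new ConcurrentHashMap<T, Boolean>());

    @Override public boolean add(T element)     { return delegate.add(element); }
    @Override public boolean remove(Object o)   { return delegate.remove(o); }
    @Override public boolean contains(Object o) { return delegate.contains(o); }
    @Override public Iterator<T> iterator()     { return delegate.iterator(); }
    @Override public int size()                 { return delegate.size(); }

    // Replacement for clone(): a shallow copy of the current contents.
    public CopyableConcurrentSet<T> copy() {
        CopyableConcurrentSet<T> result = new CopyableConcurrentSet<T>();
        result.addAll(this); // addAll iterates this set and calls add()
        return result;
    }
}
```

Callers that used to invoke clone() on the Vector would call copy() instead, and the rest of the code keeps working against the Set interface.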

Related

What is a thread-safe List implementation that allows Collections.sort

I have to write a program that requires a list. This list needs to be thread-safe in its implementation (mostly to avoid ConcurrentModificationException) but ALSO needs to allow the Collections.sort() method to be applied, for API reasons.
CopyOnWriteArrayList fulfills the former, but not the latter, and other implementations I can find allow the latter but not the former.
Does Java have a list implementation that will work for me?
EDIT: An important point to note is that unfortunately my code needs to be Java 6 compatible.
I am wondering if this is actually possible on a conceptual level: for a sort operation to be consistent, I would expect that the whole list is blocked for any adds/removes while the sorting is going on.
But Collections.sort() has no idea that it would need to lock the whole list while doing its work. You give it a list, and if another thread is trying to modify the list at the same time ... good luck with that.
Or if you reverse the point of view: how should a "thread-safe" list understand that it is right now in the process of being sorted; so - some accesses (like swapping elements) are fine; but others (like adding/removing) elements are not?!
In other words: I think you can only do this: pick any of the "thread-safe" list implementations, and then put your own wrapper in place that
1. "Locks" the list for changes
2. Does the sorting work
3. "Unlocks" the list
And of course, for "2.", you are free to turn to Collections.sort().
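The three steps above can be sketched as follows, assuming the list comes from Collections.synchronizedList, whose documented lock is the returned list itself. The class and method names are invented, and the code avoids anything newer than Java 6 syntax:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LockedSort {

    // Sorts a list created by Collections.synchronizedList while holding
    // its lock, so other threads going through the wrapper must wait.
    static <T extends Comparable<? super T>> void sortLocked(List<T> syncList) {
        synchronized (syncList) {        // 1. "lock" the list
            Collections.sort(syncList);  // 2. do the sorting work
        }                                // 3. "unlock" on leaving the block
    }

    public static void main(String[] args) {
        List<Integer> numbers =
                Collections.synchronizedList(new ArrayList<Integer>());
        numbers.add(3);
        numbers.add(1);
        numbers.add(2);
        sortLocked(numbers);
        System.out.println(numbers); // [1, 2, 3]
    }
}
```

This only helps if every thread accesses the data through the synchronized wrapper, and threads that iterate still have to synchronize on the list themselves.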
Or, if you are using Java8 - you use the CopyOnWriteArrayList and its already implemented sort() method (which is kind of proving my point: you can only do proper sorting if you own the list while running the sort operation!).
Given your latest comment: of course, you could manually "backport" the Java8 version of CopyOnWriteArrayList into your environment and use that; but that won't help, as I understand that the Java6 Collections.sort() will not call the new sort() method from that class.
So, it seems that the sum of your requirements can't be satisfied, and you will have to bite the bullet and do most of that in your own code.
Well, CopyOnWriteArrayList locks the entire collection (for insertion) while sorting. No?
Looks like you are good with CopyOnWriteArrayList. Below is the snippet from this class -
public void sort(Comparator<? super E> c) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        Object[] newElements = Arrays.copyOf(elements, elements.length);
        @SuppressWarnings("unchecked") E[] es = (E[])newElements;
        Arrays.sort(es, c);
        setArray(newElements);
    } finally {
        lock.unlock();
    }
}
Hmm... since you've updated the question to say the code needs to be Java6 compatible, I'd say that you should extend the normal list and make use of https://docs.oracle.com/javase/6/docs/api/java/util/concurrent/locks/ReadWriteLock.html. With this type of lock, two threads can hold the 'read' lock simultaneously, so readers do not block each other; the write lock is exclusive, so readers and other writers wait while a thread holds it.
Btw, this technique will require your caller to know that Collections.sort(...) shouldn't be called, since you will have to expose an explicit sort() method on your list. Hmm... not sure if this was helpful.
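A rough sketch of that idea, with the caveat that it wraps an ArrayList instead of extending it and only shows three operations. The class name SortableLockedList is made up; ReentrantReadWriteLock is the standard implementation of ReadWriteLock and is available in Java 6:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical class; a real one would cover the remaining List methods.
class SortableLockedList<T extends Comparable<? super T>> {

    private final List<T> items = new ArrayList<T>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public void add(T item) {
        lock.writeLock().lock();
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public T get(int index) {
        lock.readLock().lock(); // several readers may hold this at once
        try {
            return items.get(index);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Explicit sort: callers use this instead of Collections.sort(list).
    public void sort() {
        lock.writeLock().lock(); // exclusive: readers and writers wait
        try {
            Collections.sort(items);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Because the sort happens inside the class, the caller never hands the raw list to Collections.sort(), which is the trade-off mentioned above.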

emptyList() vs emptySet(), is there any reason to choose one over the other if an instance of Collection is needed?

In the JDK, there's Collections.emptyList() and Collections.emptySet(). Both are useful in their own right. But sometimes all that is needed is an empty, immutable instance of Collection. To me, there's no reason to choose one over the other, as both implement all operations of Collection in an efficient way and with the same results. Yet each time I need such an empty collection, I ponder which one to use for a second or two.
I do not expect to gain a deeper understanding of the collections framework from an answer to this question but maybe there's a subtle reason I could use to justify choosing one over the other without thinking about it ever again.
An answer should state at least one reason for preferring one of Collections.emptyList() and Collections.emptySet() over the other in a context where they're functionally equivalent. An answer is better if the stated reason is near the top of this list:
There's a case where the type system is happier with one over the other (e.g. type inference allows shorter code with one than the other).
There is a performance difference, maybe in some special case (e.g. if the empty collection is passed as an argument to some of the collection framework's static or instance methods like Collections.sort() or Collection.removeAll()).
Choosing one over the other "makes more sense" in the general case, if you think about it.
Examples where this question arises
To give some context, here are two examples where I am in need of an empty, unmodifiable collection.
This is an example of an API that allows creating some object by optionally specifying a collection of objects that are used in the creation. The second method just calls the first one with an empty collection:
static void createObjectWithTheseThings(Collection<Thing> things) {
    ...
}
static void createObjectWithoutAnyThings() {
    createObjectWithTheseThings(Collections.emptyXXX());
}
This is an example of an Entity with state represented by an immutable collection stored in a non-final field. On initialization the field should be set to an empty collection:
class Example {
    // Initialized to an empty collection.
    private Collection<T> containedThings = Collections.emptyXXX();
    ...
}
Unfortunately I don't have an answer that will make the top of your priority list but if I were you I'd settle on
Collections.emptySet
Type inference was your first priority but I don't know if the choice can/should influence that given you were looking for an emptyCollection()
On the second priority, think about any api that takes in a collection which performs differently (accidentally/intentionally) based on the sub-interfaces of the concrete object passed in. Aren't they more likely to offer varied performance based on the concrete implementations (as with an ArrayList or LinkedList) instead? The empty set/list are not modeled on any empty data structures anyway; they are dummy implementations - hence no real difference
Based on java's modelling of these interfaces (which admittedly is not ideal), a Collection is very similar to a Set. In fact I think the methods are almost exactly the same. Logically too it looks OK with List being the specific-sub type that adds additional ordering concerns.
Now Collection and Set looking very similar(java-wise) brings up a question. If you are using a Collection type, it is clear it is not a list you want. Now the question is are you sure you don't mean a Set. If you don't, then are you using something like a Bag (surely there must be concrete instances which are not empty in the overall logic). So if you are concerned with say a Bag, then shouldn't it be up to the Bag api to provide an emptyBag() method? Just wondering. btw, I'd stick with emptySet() in the meantime :)
For the emptyXXX() choice, it really doesn't matter: both are empty, and since they are unmodifiable they always stay empty. They will be equally suited to all operations Collection offers.
Take a look at what Collections really gives you there: Special implementations (the instances are shared across calls!). All relevant operations are dummy implementations that either return a constant result or immediately throw. Even iterator() is just a dummy with no state.
It won't make any notable difference at all.
Edit: You could say that, for the special case of emptyList/Set, they are semantically and complexity-wise the same at the Collection interface level. All operations available on Collection are implemented by emptySet/List as O(1) operations. And since they both follow the contract defined by Collection, they are semantically identical too.
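A quick check of that claim at the Collection level, as a sketch with an invented class name. It only exercises a few Collection operations and does not try to measure anything:

```java
import java.util.Collection;
import java.util.Collections;

class EmptyCollectionsDemo {

    // Returns true if the collection rejects add() as unsupported.
    static boolean rejectsAdd(Collection<String> c) {
        try {
            c.add("x");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Collection<String> list = Collections.emptyList();
        Collection<String> set = Collections.emptySet();

        System.out.println(list.isEmpty() && set.isEmpty());     // true
        System.out.println(list.iterator().hasNext());           // false
        System.out.println(set.iterator().hasNext());            // false
        System.out.println(rejectsAdd(list) && rejectsAdd(set)); // true
    }
}
```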
The only situation I can imagine this making a difference is if the code that will use your Collection does something like this:
Collection<T> collection = ...
List<T> asAList;
if (collection instanceof List) {
    asAList = (List<T>) collection;
} else {
    asAList = new ArrayList<T>(collection);
}
Obviously in a case like this you would want to use emptyList(), while if the secret target type was a Set, you'd want emptySet().
Otherwise, in terms of what "makes more sense", I agree with @ac3's logic that a generic Collection is like a Bag, and an empty immutable Set and empty immutable Bag are pretty much the same thing. However, a person very used to using immutable lists might find those easier to think of.

Is CopyOnWriteArrayList enough for keeping shopping cart thread-safe in Servlet Session scope

Is a CopyOnWriteArrayList enough to use as the collection for a shopping cart? As I understand it, it is thread-safe and its iterator is guaranteed not to throw ConcurrentModificationException when another thread removes a product during iteration. For example:
...
CopyOnWriteArrayList<Product> products = (CopyOnWriteArrayList<Product>)session.getAttribute("PRODUCTS");
products.addIfAbsent(aProduct);
...
P.S. I found synchronization approaches using synchronized (session) {...}, as offered in this article, but it seems a little ugly to synchronize session access everywhere I work with the shopping cart.
You need to understand what CopyOnWriteArrayList provides.
It provides you with a snapshot and does not give you a real-time view of the backing array.
It weakens the visibility contract: you will not get a ConcurrentModificationException, but if another thread removes an element, the effect will not be visible to a thread that is already iterating. That is because on addition or removal the original array is not mutated or touched; a new one is created on every operation that changes the list.
Is CopyOnWriteArrayList list enough to use as a collection for
shopping-cart.
Depends.
If this behavior is acceptable in your scenario then you can use it, but if you want visibility guarantee you may have to use explicit locking.
I think you are good to go with CopyOnWriteArrayList in the scenario you described.
It has sufficient guarantees to work as a thread-safe implementation, including
visibility. Yes, it is true that it gives you a snapshot of the data when you iterate.
But there is always a race condition: either you remove an element before it is read, or it is read before you remove it. CopyOnWriteArrayList is a fine implementation
that can be used where reads >>> writes, which I think is the case in the shopping cart use case.
It is just that while iterating you will not see changes (write operations). You should understand that nothing is free: if you want to see the changes while traversing, you need to properly synchronize every iteration with every write operation, which will compromise performance. Trust me, you will gain nothing. Most of the concurrent data structures give a weakly consistent state on iteration (see ConcurrentSkipListSet).
So use either CopyOnWriteArrayList or ConcurrentSkipListSet and you are good to go.
I think sets are better for your use case, i.e. to avoid duplicate orders.
Is CopyOnWriteArrayList list enough to use as a collection for shopping-cart
No because it depends on what you need to synchronize. Think about what must not happen at the same time.
As I understand it is thread-safe and the iterator is guaranteed not to throw ConcurrentModificationException when during iteration another thread removes a product.
You will not get a ConcurrentModificationException because every modification you make to the list creates a copy of the underlying array. A thread that iterates uses the copy that was current when its iterator was created. But that thread can't assume that a product is still actually in the list when it sees it. It might have been removed in the most current version.
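The snapshot behaviour is easy to reproduce in a single thread. In this sketch the removal happens after the iterator is created, which plays the role of "another thread removes a product during iteration"; the class name and the String products are placeholders:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotDemo {

    // Creates an iterator, removes an element, then drains the iterator
    // and returns what it saw.
    static List<String> iterateWhileRemoving(CopyOnWriteArrayList<String> cart) {
        Iterator<String> it = cart.iterator(); // snapshot is taken here
        cart.remove("book");                   // modification after the snapshot
        List<String> seen = new ArrayList<String>();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<String> cart = new CopyOnWriteArrayList<String>();
        cart.add("book");
        cart.add("pen");
        System.out.println(iterateWhileRemoving(cart)); // [book, pen]
        System.out.println(cart);                       // [pen]
    }
}
```

The iterator still reports "book" even though the list no longer contains it, which is exactly the situation the code using the cart has to tolerate.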
Or maybe use "heavier artillery" like the following, in all places that access the shopping-cart collection:
AtomicReference<List<Product>> productListRef =
        (AtomicReference<List<Product>>) session.getAttribute("PRODUCTS");
List<Product> oldList;
List<Product> newList;
do {
    // Copy the current list, modify the copy, and retry if another
    // thread swapped in a different list in the meantime.
    oldList = productListRef.get();
    newList = new ArrayList<>(oldList);
    newList.add(aProduct);
} while (!productListRef.compareAndSet(oldList, newList));
Thanks a lot for the previous answers!

Why does the Java language forbid a program from updating a collection while iterating over it?

I know there is an inner class named "Itr" in java.util.AbstractList, with a field named "expectedModCount" and a method named "checkForComodification". When you iterate over a collection and update the collection at the same time, this method throws a ConcurrentModificationException.
I want to know why the Java language was designed like this. What is the purpose of doing it this way?
Thx!
I want to know why java language designed like this?
It's not part of the language. It's part of the collection framework.
Basically, it's relatively hard to make a very general specification about what should happen if you're iterating over a collection and it changes underneath you. While you could certainly decide on some rules for a list, what about (say) the entry set for a map? Adding or removing entries could change the internal order entirely - what would you want to happen then?
If it were allowed to change the collection, you would get a lot of problematic cases.
Say we have a list with elements 0 to 4,
and the iterator has just passed 3:
|0|1|2|3|4|
iterator^
Now we add an element at the beginning:
|5|0|1|2|3|4|
iterator^?^
What should the iterator return now?
It could return 4, since that was the next element before the change.
It could return 3, since that is now at the index the iterator was pointing at.
Depending on the list implementation, each of these adds complexity and carries a performance penalty. By forbidding modification of a collection during iteration, the designers avoid having to specify a correct behavior and the complexity attached to it.
You can iterate over a collection and modify it using Iterator (which is the standard way to do this).
See Iterating through a Collection, avoiding ConcurrentModificationException when removing in loop for more discussion around this.
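For illustration, here is a small sketch contrasting removal through the iterator with removal directly on the list, using an ArrayList of four integers. The class and method names are made up. Note that the fail-fast check is best-effort: with other list sizes or removal positions a direct removal can slip through without an exception, so the second method is a demonstration, not something to rely on:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveDemo {

    // Removes even numbers through the iterator, which keeps the
    // iterator's expected modification count in sync with the list.
    static void removeEvens(List<Integer> numbers) {
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
    }

    // Removes elements directly on the list inside a for-each loop and
    // reports whether the iterator detected the modification.
    static boolean directRemovalFails(List<Integer> numbers) {
        try {
            for (Integer n : numbers) {
                numbers.remove(n); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4));
        removeEvens(a);
        System.out.println(a); // [1, 3]

        List<Integer> b = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4));
        System.out.println(directRemovalFails(b)); // true
    }
}
```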
If a collection is modified by one thread while another reads from it, you may get what we call a race condition. Avoiding it costs some performance, but you avoid unpredictable or unwanted results (e.g. without such a check you might skip an existing element in an ArrayList, or read it twice).

Does the unmodifiable wrapper for java collections make them thread safe?

I need to make an ArrayList of ArrayLists thread safe. I also cannot have the client making changes to the collection. Will the unmodifiable wrapper make it thread safe or do I need two wrappers on the collection?
It depends. The wrapper will only prevent changes to the collection it wraps, not to the objects in the collection. If you have an ArrayList of ArrayLists, the global List as well as each of its element Lists need to be wrapped separately, and you may also have to do something for the contents of those lists. Finally, you have to make sure that the original list objects are not changed, since the wrapper only prevents changes through the wrapper reference, not to the original object.
You do NOT need the synchronized wrapper in this case.
On a related topic - I've seen several replies suggesting using synchronized collection in order to achieve thread safety.
Using a synchronized version of a collection doesn't by itself make your code "thread safe": although each operation (insert, count etc.) is protected by a mutex, there is no guarantee that two combined operations execute atomically.
For example the following code is not thread safe (even with a synchronized queue):
if (queue.Count > 0)
{
    queue.Add(...);
}
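In Java terms, one way to make such a check-then-act sequence atomic is to hold the collection's own lock around both steps. This is a sketch with invented names, using a list from Collections.synchronizedList in place of the queue, since that wrapper documents that it locks on the returned object:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CompoundActionDemo {

    // Adds the item only if the list is non-empty. Holding the list's
    // monitor makes the check and the add one indivisible step with
    // respect to other threads that go through the same wrapper.
    static boolean addIfNotEmpty(List<String> syncList, String item) {
        synchronized (syncList) {
            if (!syncList.isEmpty()) {
                return syncList.add(item);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> queue = Collections.synchronizedList(new ArrayList<String>());
        System.out.println(addIfNotEmpty(queue, "a"));      // false
        queue.add("first");
        System.out.println(addIfNotEmpty(queue, "second")); // true
        System.out.println(queue);                          // [first, second]
    }
}
```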
The unmodifiable wrapper only prevents changes to the structure of the list that it applies to. If this list contains other lists and you have threads trying to modify these nested lists, then you are not protected against concurrent modification risks.
From looking at the Collections source, it looks like Unmodifiable does not make it synchronized.
static class UnmodifiableSet<E> extends UnmodifiableCollection<E>
        implements Set<E>, Serializable;
static class UnmodifiableCollection<E> implements Collection<E>, Serializable;
the synchronized class wrappers have a mutex object in them to do the synchronized parts, so looks like you need to use both to get both. Or roll your own!
I believe that because the UnmodifiableList wrapper stores the ArrayList to a final field, any read methods on the wrapper will see the list as it was when the wrapper was constructed as long as the list isn't modified after the wrapper is created, and as long as the mutable ArrayLists inside the wrapper aren't modified (which the wrapper can't protect against).
It will be thread-safe if the unmodifiable view is safely published, and the modifiable original is never ever modified (including all objects recursively contained in the collection!) after publication of the unmodifiable view.
If you want to keep modifying the original, then you can either create a defensive copy of the object graph of your collection and return an unmodifiable view of that, or use an inherently thread-safe list to begin with, and return an unmodifiable view of that.
You cannot return an unmodifiableList(synchronizedList(theList)) if you still intend to access theList unsynchronized afterwards; if mutable state is shared between multiple threads, then all threads must synchronize on the same locks when they access that state.
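The defensive-copy option for a list of lists could look roughly like this. The class and method names are hypothetical, the elements themselves are shared rather than copied, and the sketch assumes no other thread modifies the source while the copy is being made and that the result is safely published:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DeepUnmodifiable {

    // Copies the outer list and every inner list, then wraps each level
    // in an unmodifiable view. Elements of type T are not copied.
    static <T> List<List<T>> snapshot(List<? extends List<? extends T>> source) {
        List<List<T>> outer = new ArrayList<List<T>>(source.size());
        for (List<? extends T> inner : source) {
            outer.add(Collections.unmodifiableList(new ArrayList<T>(inner)));
        }
        return Collections.unmodifiableList(outer);
    }

    public static void main(String[] args) {
        List<List<String>> source = new ArrayList<List<String>>();
        List<String> inner = new ArrayList<String>();
        inner.add("a");
        source.add(inner);

        List<List<String>> view = snapshot(source);
        inner.add("b"); // later changes to the original are not visible

        System.out.println(view.get(0).size()); // 1
    }
}
```

Since nothing retains a reference to the copied lists, clients can neither modify them through the wrappers nor see later changes to the originals.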
An immutable object is by definition thread safe (assuming no-one retains references to the original collections), so synchronization is not necessary.
Wrapping the outer ArrayList using Collections.unmodifiableList()
prevents the client from changing its contents (and thus makes it thread
safe), but the inner ArrayLists are still mutable.
Wrapping the inner ArrayLists using Collections.unmodifiableList() too
prevents the client from changing their contents (and thus makes them
thread safe), which is what you need.
Let us know if this solution causes problems (overhead, memory usage etc);
other solutions may be applicable to your problem. :)
EDIT: Of course, if the lists are modified they are NOT thread safe. I assumed no further edits were to be made.
Not sure if I understood what you are trying to do, but I'd say the answer in most cases is "No".
If you setup an ArrayList of ArrayList and both, the outer and inner lists can never be changed after creation (and during creation only one thread will have access to either inner and outer lists), they are probably thread safe by a wrapper (if both, outer and inner lists are wrapped in such a way that modifying them is impossible). All read-only operations on ArrayLists are most likely thread-safe. However, Sun does not guarantee them to be thread-safe (also not for read-only operations), so even though it might work right now, it could break in the future (if Sun creates some internal caching of data for quicker access for example).
This is necessary if:
There is still a reference to the original modifiable list.
The list will possibly be accessed through an iterator.
If you intend to read from the ArrayList by index only, you could assume this is thread-safe.
When in doubt, choose the synchronized wrapper.
