intrinsic vs mutex lock - java

Looking at the implementation of Collections.SynchronizedList, I noticed that all access to the internal list is synchronized by locking on the wrapper's final member "mutex". As all access is synchronized on the same object, why don't we omit the extra mutex object and just synchronize on the list itself? Is it only because anybody else could synchronize on that list and we might get deadlocks?
I am asking because I am considering implementing a container class that holds two lists.
The container offers e.g. .addToL1(...) and .addToL2(...). In this case the inner lists are not accessible, so it should be sufficient to synchronize on the lists intrinsically, correct?

The most robust solution is to lock on an object that the caller has no way of getting access to (ignoring reflection and Unsafe for the moment). The JDK developers have to consider the worst thing any developer could do with the library, because someone will do that, along with things they couldn't have thought of.
However, sometimes simplicity is the most important driver, especially if you know who will be using it and understand its limitations.

In this case it is specifically done because sublists created from the list have to synchronize on the parent object:
    public List<E> subList(int fromIndex, int toIndex) {
        synchronized(mutex) {
            return new SynchronizedList<E>(list.subList(fromIndex, toIndex),
                                           mutex);
        }
    }
If you create your synchronized list with Collections.synchronizedList(list), the mutex is set to this (that is, the synchronized list object itself).
The JDK also has a two-parameter Collections.synchronizedList(list, mutex), whose second parameter is used as the mutex, but that overload is package-private, so application code cannot call it.
Since in general it isn't a good idea to synchronize on publicly visible objects, I prefer to hide the mutex object from clients of the code, which for your own classes means wrapping the list and locking on a private object.
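For the two-list container from the question, a minimal sketch along these lines might look as follows. The class name TwoListContainer, the moveFromL1ToL2 method and the size accessors are hypothetical, added only for illustration; the point is that a single private lock object guards both lists, so outside code can never synchronize on it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container: two inner lists guarded by one hidden lock.
class TwoListContainer<E> {
    // Private lock object: callers cannot obtain a reference to it,
    // so no outside code can synchronize on it.
    private final Object mutex = new Object();
    private final List<E> l1 = new ArrayList<>();
    private final List<E> l2 = new ArrayList<>();

    public void addToL1(E e) {
        synchronized (mutex) { l1.add(e); }
    }

    public void addToL2(E e) {
        synchronized (mutex) { l2.add(e); }
    }

    // Compound operation spanning both lists; it is atomic with respect to
    // the other methods because they all use the same lock.
    public boolean moveFromL1ToL2(E e) {
        synchronized (mutex) {
            if (!l1.remove(e)) {
                return false;
            }
            l2.add(e);
            return true;
        }
    }

    public int sizeOfL1() { synchronized (mutex) { return l1.size(); } }
    public int sizeOfL2() { synchronized (mutex) { return l2.size(); } }
}
```

Locking each list on its own intrinsic lock would also work for the plain add methods, but an operation that touches both lists would then need both locks in a fixed order; one shared private lock avoids that question.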

Related

Non-thread-safe Attempt to Implement Put-if-absent?

There is one code snippet in the 4th chapter in Java Concurrency in Practice
    public class ListHelper<E> {
        public List<E> list =
            Collections.synchronizedList(new ArrayList<E>());
        ...
        public synchronized boolean putIfAbsent(E x) {
            boolean absent = !list.contains(x);
            if (absent)
                list.add(x);
            return absent;
        }
    }
The book says this is not thread safe: because different locks are used, putIfAbsent is not atomic relative to other operations on the List.
But I think "synchronized" prevents multiple threads from entering putIfAbsent, and if there are other methods that perform other operations on the List, the keyword synchronized should be added to those methods as well. Done this way, would it be thread safe? In what case is it "not atomic"?
putIfAbsent is not atomic relative to other operations on the List. But I think "synchronized" prevents multiple threads from entering putIfAbsent
This is true, but there is no guarantee that threads are not accessing the list in other ways. The list field is public (which is always a bad idea), which means that other threads can call methods on the list directly. To properly protect the list, you should make it private and add add(...) and other methods to your ListHelper that are also synchronized, so that you fully control all access to the list.
    // we are synchronizing the list so no reason to use Collections.synchronizedList
    private List<E> list = new ArrayList<E>();
    ...
    public synchronized boolean add(E e) {
        return list.add(e);
    }
If the list is private and all of the methods are synchronized that access the list then you can remove the Collections.synchronizedList(...) since you are synchronizing it yourself.
if there are other methods that do other operations on the List, the keyword synchronized should also be added to those methods. So following this way, should it be thread safe?
Not sure I fully parse this part of the question. But if you make the list be private and you add other methods to access the list that are all synchronized then you are correct.
Under what case "it is not atomic"?
putIfAbsent(...) is not atomic because there are multiple calls to the synchronized-list. If multiple threads are operating on the list then another thread could have called list.add(...) between the time putIfAbsent(...) called list.contains(x) and then calls list.add(x). The Collections.synchronizedList(...) protects the list against corruption by multiple threads but it cannot protect against race-conditions when there are multiple calls to list methods that could interleave with calls from other threads.
Any unsynchronized method that modifies the list may introduce the absent element after list.contains() returns false, but before the element has been added.
Picture this as two threads:
    boolean absent = !list.contains(x); // contains() returns false, so absent is true
    // -> another thread runs here: list.add(theSameElementAsX);
    if (absent) // absent is still true, but the list has been modified!
        list.add(x);
    return absent;
This interleaving could be caused by something as simple as the following method:
    public void add(E e) {
        list.add(e);
    }
If the method were synchronized, there would be no problem, since the add method wouldn't be able to run before putIfAbsent() was fully finished.
A proper correction would include making the List private, and making sure that compound operations on it are properly synchronized (i.e. on the ListHelper instance or on the list itself).
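As a sketch of the "synchronize on the list itself" variant: the one-argument Collections.synchronizedList wrapper locks on itself, so holding that same lock around contains() and add() makes the compound operation atomic with respect to every other call on the wrapper. The class name SafeListHelper and its size() method are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical corrected helper: private list, compound operation
// guarded by the same lock the synchronized wrapper uses internally.
class SafeListHelper<E> {
    private final List<E> list =
        Collections.synchronizedList(new ArrayList<E>());

    public boolean putIfAbsent(E x) {
        // Hold the wrapper's own lock across contains() and add(), so no
        // other call on the wrapper can run between the two.
        synchronized (list) {
            boolean absent = !list.contains(x);
            if (absent) {
                list.add(x);
            }
            return absent;
        }
    }

    // A single call needs no extra block: the wrapper already locks on itself.
    public boolean add(E e) {
        return list.add(e);
    }

    public int size() {
        return list.size();
    }
}
```

Several threads calling putIfAbsent with the same values should then never produce duplicates, because the check and the insert happen under one lock.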
Thread safety is not composable! Imagine a program built entirely out of "thread safe" classes. Is the program itself "thread safe?" Not necessarily. It depends on what the program does with those classes.
The synchronizedList wrapper makes each individual method of a List "thread safe". What does that mean? It means that none of those wrapped methods can corrupt the internal structure of the list when called in a multi-threaded environment.
That doesn't protect the way in which any given program uses the list. In the example code, the list appears to be used as an implementation of a set: The program doesn't allow the same object to appear in the list more than one time. There's nothing in the synchronizedList wrapper that will enforce that particular guarantee though, because that guarantee has nothing to do with the internal structure of the list. The list can be perfectly valid as a list, but not valid as a set.
That's why the putIfAbsent() method needs additional synchronization.
Collections.synchronizedList() creates a collection that synchronizes every single one of its methods on a private mutex field. For the one-argument factory used in the example, this mutex is the wrapper list itself (this); the JDK-internal two-argument factory lets the mutex be supplied instead. That's why we need an external lock to make consecutive contains() and add() calls atomic.
If the list is accessible directly, not only via ListHelper, this code is broken, because access is then guarded by different locks. To prevent that, make the list private so that it cannot be accessed directly, and wrap all necessary API with synchronization on the same mutex declared in ListHelper or on the ListHelper instance itself.

Is collection synchronizing (via Collections.synchronizedX) necessary when access methods are synchronized?

There are a lot of topics where synchronization in Java comes up. Many of them recommend calling Collections.synchronized{Collection, List, Map, Set, SortedMap, SortedSet} instead of using a plain Collection, List, etc. to get thread-safe access in multithreaded code.
Let's imagine a situation where several threads exist and all of them need to access a collection via methods that have a synchronized block in their bodies.
So then, is it necessary to use:
Collection collection = Collections.synchronizedCollection(new ArrayList<T>());
or is
Collection collection = new ArrayList<String>();
enough?
Maybe you can show me an example where the second approach instead of the first causes evidently incorrect behaviour?
On the contrary, Collections.synchronizedCollection() is generally not sufficient because many operations (like iterating, check then add, etc.) need additional, explicit synchronization.
If every access to the collection is already done through properly synchronized methods, then wrapping the collection again in a synchronized proxy is useless.
No, if your access methods are synchronized there is no need to also use a synchronized collection.
Collection collection = new ArrayList<String>();
will do just fine in that scenario.
If you have already arranged for proper synchronization of your code, you definitely do not need another layer of synchronization on the lower level of granularity.
Just make sure when you say
all of them need to access collection via methods that have synchronized block in their bodies.
that all these blocks use the same lock. It is not enough to just involve some synchronized block.
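A small sketch of what "the same lock" means in practice, assuming a hypothetical Registry class (all names here are made up for illustration): the collection is a plain ArrayList, and every method that touches it synchronizes on one dedicated lock object.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical class: a plain, unsynchronized collection that is only
// ever touched while holding one and the same lock.
class Registry {
    private final Collection<String> names = new ArrayList<>();
    private final Object lock = new Object(); // the single lock for `names`

    public void register(String name) {
        synchronized (lock) { names.add(name); }
    }

    public boolean isRegistered(String name) {
        synchronized (lock) { return names.contains(name); }
    }

    public int count() {
        synchronized (lock) { return names.size(); }
    }

    // Broken variant, shown as a comment only: it would lock on `this`,
    // a different object, and so would NOT exclude register() above.
    // public synchronized void unregister(String name) { names.remove(name); }
}
```

Wrapping names in Collections.synchronizedCollection here would only add a second, redundant lock acquisition per call.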

About unsynchronized & synchronized access in Java Collections Framework?

Can anyone explain what is unsynchronized & synchronized access in Java Collections Framework?
Synchronized vs unsynchronized access doesn't have to do with the Java Collections Framework per se.
Synchronized access means that you have some sort of locking for accessing the data. This can be introduced by using the synchronized keyword or by using some of the higher level constructs from the java.util.concurrent package.
Unsynchronized access means that you don't have any locking involved when accessing the data.
If you're using a collection in several threads, you had better make sure that you're accessing it in a synchronized way, or that the collection itself is thread safe, i.e., takes care of such locking internally.
To make sure all accesses to some collection coll are synchronized, you can either
...surround accesses with synchronized (coll) { ... }
    public void someMethod() {
        synchronized (coll) {
            // do work...
        }
    }
...or encapsulate it using Collections.synchronizedCollection
coll = Collections.synchronizedCollection(coll);
In the former approach, you need to make sure that every access to the collection is covered by synchronized. In the latter approach, you need to make sure that every reference points at the synchronized version of the collection.
As pointed out by #Fatal however, you should understand that the latter approach only transforms a thread unsafe collection into a thread safe collection. This is most often not sufficient for making sure that the class you are writing is thread safe. For an example, see #Fatals comment.
Synchronized access means it is thread-safe. So different threads can access the collection concurrently without any problems, but it is probably a little bit slower depending on what you are doing.
Unsynchronized is the opposite. Not thread-safe, but a little bit faster.
Synchronized access in the Java Collections Framework is normally achieved by wrapping with Collections.synchronizedCollection(...) etc. and accessing the collection only through this wrapper.
There are some exceptions that are already synchronized, like Hashtable and Vector.
But keep in mind:
Synchronization is done on the collection instance itself and its scope is a single method call. So consecutive calls may be interleaved with calls from another thread.
Example:
You first call the isEmpty() method and learn that the collection is not empty, and after that you want to retrieve an element from that collection. But this second method call may fail, because the collection may be empty by now due to actions of another thread between your two calls.
So even with synchronized collections you have to take care of synchronization, and it may be necessary to synchronize yourself outside the collection!
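The isEmpty()-then-retrieve case can be sketched like this. Drain, pollFirst and newSharedList are hypothetical names, and the helper assumes the list it receives was created by the one-argument Collections.synchronizedList, which locks on the wrapper itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Drain {
    // Hypothetical factory for the shared list used in this sketch.
    static List<String> newSharedList() {
        return Collections.synchronizedList(new ArrayList<String>());
    }

    // Removes and returns the first element, or null if the list is empty.
    // Assumes `list` is a Collections.synchronizedList wrapper.
    static <T> T pollFirst(List<T> list) {
        synchronized (list) {          // same lock the wrapper uses internally
            if (list.isEmpty()) {
                return null;
            }
            return list.remove(0);     // cannot fail: nobody else got in between
        }
    }
}
```

Without the outer synchronized block, another thread could remove the last element after isEmpty() returned false, and remove(0) would throw IndexOutOfBoundsException.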

How to clone a synchronized Collection?

Imagine a synchronized Collection:
Set s = Collections.synchronizedSet(new HashSet());
What's the best approach to clone this Collection?
It's preferred that the cloning doesn't need any synchronization on the original Collection, but it is required that iterating over the cloned Collection does not need any synchronization on the original Collection.
Use a copy-constructor inside a synchronized block:
    Set newSet;
    synchronized (s) {
        newSet = new HashSet(s); // preferably use generics
    }
If you need the copy to be synchronized as well, then use Collections.synchronizedSet(..) again.
As per Peter's comment - you'll need to do this in a synchronized block on the original set. The documentation of synchronizedSet is explicit about this:
It is imperative that the user manually synchronize on the returned set when iterating over it
When using synchronized sets, do understand that you will incur synchronization overhead accessing every element in the set. The Collections.synchronizedSet() merely wraps your set with a shell that forces every method to be synchronized. Probably not what you really intended. A ConcurrentSkipListSet will give you better performance in a multithreaded environment where multiple threads will be writing to the set.
The ConcurrentSkipListSet will allow you to perform the following:
Set newSet = s.clone(); // s must be declared as a ConcurrentSkipListSet; preferably use generics
It's not uncommon to use a clone of a set for snapshot processing. If that's what you are after, you might add a little code to handle the case where the item is already processed. The overhead involved with the occasional object included in more than one copy set is usually less than the consistent overhead of using Collections.synchronizedSet().
EDIT: I just noticed that ConcurrentSkipListSet is Cloneable and provides a threadsafe clone() method. I changed my answer because I really believe this is the best option, instead of losing scalability and performance to Collections.synchronizedSet().
You can avoid synchronizing the set by doing the following which avoids exposing an Iterator on the original set.
Set newSet = new HashSet(Arrays.asList(s.toArray()));
EDIT: From Collections.SynchronizedCollection:
    public Object[] toArray() {
        synchronized(mutex) {return c.toArray();}
    }
As you can see, the lock is held for the entire time the operation is performed. As such a safe copy of the data is taken. It doesn't matter if an Iterator is used internally. The array returned can be used in a thread safe manner as only the local thread has a reference to it.
NOTE: If you want to avoid these issues I suggest you use a Set from the concurrency library added in Java 5.0 in 2004. I also suggest you use generics as this can make your collections more type safe.
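One possible shape of that suggestion, as a sketch only: ConcurrentHashMap.newKeySet() (available since Java 8; CopyOnWriteArraySet is an alternative that has been there since Java 5) gives a concurrent Set whose iterators are weakly consistent, so it can be copied without holding any lock. SnapshotDemo and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SnapshotDemo {
    // Creates a concurrent set backed by a ConcurrentHashMap (Java 8+).
    static Set<String> newConcurrentSet() {
        return ConcurrentHashMap.newKeySet();
    }

    // Copies a concurrent set without locking it. The iterator used by the
    // copy constructor is weakly consistent: it never throws
    // ConcurrentModificationException, but it may or may not reflect
    // updates made while the copy is in progress.
    static <T> Set<T> snapshot(Set<T> concurrentSet) {
        return new HashSet<>(concurrentSet);
    }
}
```

The copy is an ordinary HashSet owned by the calling thread, so iterating over it needs no synchronization on the original. Unlike the synchronized-block approach, it is not guaranteed to be a point-in-time view if writers are active during the copy.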

Does a synchronized block prevent other thread access to object?

If I do something to a list inside a synchronized block, does it prevent other threads from accessing that list elsewhere?
    List<String> myList = new ArrayList<String>();
    synchronized {
        myList.add("Hello");
    }
Does this prevent other threads from iterating over myList and removing/adding values?
I'm looking to add/remove values from a list, but at the same time protect it from other threads/methods from iterating over it (as the values in the list might be invalidated)
No, it does not.
The synchronized block only prevents other threads from entering the block (more accurately, it prevents other threads from entering all blocks synchronized on the same object instance - in this case blocks synchronized on this).
You need to use the instance you want to protect in the synchronized block:
    synchronized(myList) {
        myList.add("Hello");
    }
The whole area is quite well explained in the Java tutorial:
http://download.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html
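To make the difference between the two lock objects visible, here is a small sketch using Thread.holdsLock. LockDemo and its method names are hypothetical; each method reports which monitors the calling thread holds while it modifies the list.

```java
import java.util.ArrayList;
import java.util.List;

class LockDemo {
    private final List<String> myList = new ArrayList<>();

    // Locks on `this`: excludes only code that also synchronizes on this
    // LockDemo instance, not code that synchronizes on myList.
    public synchronized boolean[] addHoldingThis() {
        myList.add("Hello");
        return new boolean[] { Thread.holdsLock(this), Thread.holdsLock(myList) };
    }

    // Locks on the list: excludes every other block synchronized on myList.
    public boolean[] addHoldingList() {
        synchronized (myList) {
            myList.add("Hello");
            return new boolean[] { Thread.holdsLock(this), Thread.holdsLock(myList) };
        }
    }
}
```

Neither variant stops a thread that touches myList without any synchronized block at all; mutual exclusion only exists between blocks that acquire the same monitor.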
Yes, but only if all other accesses to myList are protected by synchronized blocks on the same object. The code sample you posted is missing an object on which you synchronize (i.e., the object whose mutex lock you acquire). If you synchronize on different objects or fail to synchronize at all in one instance, then other threads may very well access the list concurrently. Therefore, you must ensure that all threads have to enter a synchronized block on the same object (e.g., using synchronized (myList) { ... } consistently) before accessing the list. In fact, there is already a factory method that will wrap each method of your list with synchronized methods for you: Collections.synchronizedList.
However, you can certainly use Collections.synchronizedList to wrap your list so that all of its methods are individually synchronized, but that doesn't necessarily mean that your application's invariants are maintained. Individually marking each method of the list as synchronized will ensure that the list's internal state remains consistent, but your application may wish for more, in which case you will need to write some more complex synchronization logic or see if you can take advantage of the Concurrency API (highly recommended).
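One Concurrency API class that may fit the "modify while others iterate" case from the question is CopyOnWriteArrayList: its iterators work on a snapshot of the array, so readers need no lock, at the price of copying the array on every write. Whether that trade-off suits the asker's workload is an assumption; it tends to pay off when reads far outnumber writes. CowDemo and iterateWhileAdding are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CowDemo {
    static List<String> newList() {
        return new CopyOnWriteArrayList<>();
    }

    // Iterates over the list while appending to it and returns how many
    // elements the loop visited. With CopyOnWriteArrayList the loop sees
    // only the snapshot taken when iteration started.
    static int iterateWhileAdding(List<String> list) {
        int visited = 0;
        for (String s : list) {
            // On a non-empty ArrayList this add would make the next
            // iteration step throw ConcurrentModificationException.
            list.add(s + "!");
            visited++;
        }
        return visited;
    }
}
```

Note that an iterating thread may be looking at slightly stale data, which is the opposite of blocking writers until iteration has finished; if iteration must see a frozen, current list, an explicit lock held for the whole loop is still needed.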
Here the synchronized makes sure that only one thread at a time is adding "Hello" to myList.
To be more specific about which object you synchronize on, you can use
    synchronized( myList ) //object name
    {
        //other code
    }
vinod
From my limited understanding of concurrency control in Java I would say that it is unlikely that the code above would present the behaviour you are looking for.
The synchronised block would use the lock of whatever object you are calling said code in, which would in no way stop any other code from accessing that list unless said other code was also synchronised using the same lock object.
I have no idea if this would work, or if it's in any way advised, but I think that:
    List myList = new ArrayList();
    synchronized(myList) {
        myList.add("Hello");
    }
would give the behaviour you describe, by synchronizing on the lock object of the list itself.
However, the Java documentation recommends this way to get a synchronized list:
List list = Collections.synchronizedList(new ArrayList(...));
See: http://download.oracle.com/javase/1.4.2/docs/api/java/util/ArrayList.html
