Can anyone explain what is unsynchronized & synchronized access in Java Collections Framework?
Synchronized vs. unsynchronized access doesn't have to do with the Java Collections Framework per se.
Synchronized access means that you have some sort of locking for accessing the data. This can be introduced by using the synchronized keyword or by using some of the higher level constructs from the java.util.concurrent package.
Unsynchronized access means that you don't have any locking involved when accessing the data.
If you're using a collection from several threads, you had better make sure that you're accessing it in a synchronized way, or that the collection itself is thread safe, i.e., takes care of such locking internally.
To make sure that every access to some collection coll is synchronized, you can either
...surround accesses with synchronized (coll) { ... }
public void someMethod() {
synchronized (coll) {
// do work...
}
}
...encapsulate it using Collections.synchronizedCollection
coll = Collections.synchronizedCollection(coll);
In the former approach, you need to make sure that every access to the collection is covered by synchronized. In the latter approach, you need to make sure that every reference points at the synchronized version of the collection.
As pointed out by @Fatal, however, you should understand that the latter approach only transforms a thread-unsafe collection into a thread-safe collection. This is most often not sufficient for making sure that the class you are writing is thread safe. For an example, see @Fatal's comment.
Synchronized access means it is thread-safe. So different threads can access the collection concurrently without any problems, but it is probably a little bit slower depending on what you are doing.
Unsynchronized is the opposite. Not thread-safe, but a little bit faster.
Synchronized access in the Java Collections Framework is normally achieved by wrapping the collection with Collections.synchronizedCollection(...) etc. and then accessing it only through this wrapper.
There are some exceptions already synchronized like Hashtable and Vector.
But keep in mind:
Synchronization is done on the collection instance itself and is scoped to each individual method call. So a sequence of calls may be interleaved with calls from another thread.
Example:
You first call the isEmpty() method and get the result that the collection is not empty, and after that you want to retrieve an element from that collection. But this second method call may fail, because the collection may be empty by now due to actions performed by another thread between your two calls.
So even with synchronized collections you have to care about synchronization, and it may be necessary to synchronize yourself outside the collection!
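The isEmpty()/retrieve race described above can be sketched roughly as follows. This is only a minimal illustration: the class name CheckThenAct, the items field and both method names are made up for the example. It relies on the documented fact that the wrappers returned by Collections.synchronizedList lock on the wrapper object itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckThenAct {
    // Hypothetical shared list; each single method call on it is synchronized.
    static final List<String> items = Collections.synchronizedList(new ArrayList<>());

    // Racy: another thread may empty the list between isEmpty() and remove(0).
    static String takeUnsafe() {
        if (!items.isEmpty()) {
            return items.remove(0);
        }
        return null;
    }

    // Safe: the check and the removal happen while holding the wrapper's lock.
    static String takeSafe() {
        synchronized (items) {
            return items.isEmpty() ? null : items.remove(0);
        }
    }

    public static void main(String[] args) {
        items.add("a");
        System.out.println(takeSafe());
        System.out.println(takeSafe());
    }
}
```

takeSafe() works because the explicit block uses the same lock as the wrapper's own methods, so no other thread can slip in between the two calls.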
Why are the methods in the Observable class synchronized?
public synchronized void deleteObserver(Observer o) {
obs.removeElement(o);
}
Observable is intended to be a very thread-safe class; it manipulates shared data 'atomically', meaning that only one thread can access it at a time. The synchronized keyword forces each interacting thread to access the Observable instance's data atomically.
synchronized methods enable a simple strategy for preventing thread interference and memory consistency errors: if an object is visible to more than one thread, all reads or writes to that object's variables are done through synchronized methods.
Note that some methods, such as notifyObservers(), are not synchronized. This is because they do not directly affect the instance data of the Observable.
Read this writeup if you want to learn more about thread-safety.
I can't give you a definitive answer regarding why Observable is implemented the way it is, but I can explain the effect.
While Vector is a synchronized collection, it isn't synchronized while iterating. This is similar to the wrappers returned by the Collections.synchronizedXXX methods. In order to safely iterate a Vector in a concurrent context you need external synchronization. They accomplish this by using synchronized methods. But if you look at notifyObservers you'll see that the method isn't synchronized. However, if you look at the body of notifyObservers you'll see a synchronized(this) {} block. They do it this way because only part of the method body needs to be executed while holding the lock. If you're not aware, a synchronized instance method is the same as using synchronized(this) {} for the whole method.
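To make the "only part of the method body holds the lock" idea concrete, here is a rough sketch. It is not the JDK's Observable source; SimpleObservable and its use of Consumer<String> are invented for illustration. It copies the observer list under the lock and then calls the observers without holding it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical class, loosely modelled on the pattern described above.
class SimpleObservable {
    private final List<Consumer<String>> observers = new ArrayList<>();
    private boolean changed = false;

    public synchronized void addObserver(Consumer<String> o) {
        observers.add(o);
    }

    public synchronized void setChanged() {
        changed = true;
    }

    public void notifyObservers(String event) {
        List<Consumer<String>> snapshot;
        // Only this part needs the lock: read both fields consistently.
        synchronized (this) {
            if (!changed) {
                return;
            }
            snapshot = new ArrayList<>(observers);
            changed = false;
        }
        // Callbacks run outside the lock, iterating over a private copy.
        for (Consumer<String> o : snapshot) {
            o.accept(event);
        }
    }
}
```

Because the callbacks run on a copy, observers can be added concurrently without breaking the iteration, and a slow observer does not block other threads that want the lock.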
The other effect caused by using synchronized methods is that both the obs field and the changed field are guarded by the same lock. This keeps the state between those two fields consistent in a multi-threaded environment. Why they chose the enclosing instance as the lock, I have no idea, but that's what they did.
Note that, as near as I can tell, the Observable class gives no guarantees regarding thread-safety in its documentation. This means the fact that it is thread-safe is an implementation detail.
Also note that Observable has been deprecated since Java 9.
There are a lot of topics where synchronization in Java comes up. Many of them recommend using Collections.synchronized{Collection, List, Map, Set, SortedMap, SortedSet} instead of a plain Collection, List, etc. to get thread-safe access in multithreaded code.
Let's imagine a situation where several threads exist and all of them need to access a collection via methods that have a synchronized block in their bodies.
So then, is it necessary to use:
Collection collection = Collections.synchronizedCollection(new ArrayList<T>());
or is just
Collection collection = new ArrayList<String>();
enough?
Can you show me an example where the second approach, used instead of the first, causes evidently incorrect behaviour?
To the contrary, Collections.synchronizedCollection() is generally not sufficient because many operations (like iterating, check then add, etc.) need additional, explicit synchronization.
If every access to the collection is already done through properly synchronized methods, then wrapping the collection again in a synchronized proxy is redundant.
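As an illustration of the iteration case mentioned above, here is a small sketch (the class and method names are made up). The Javadoc for Collections.synchronizedList requires the caller to synchronize manually on the returned list while iterating over it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SafeIteration {
    // Sums the lengths of all strings; assumes syncList is a synchronized wrapper.
    static int totalLength(List<String> syncList) {
        int total = 0;
        // Hold the wrapper's lock for the whole traversal, not just per call.
        synchronized (syncList) {
            for (String s : syncList) {
                total += s.length();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = Collections.synchronizedList(new ArrayList<>());
        names.add("alpha");
        names.add("beta");
        System.out.println(totalLength(names));
    }
}
```

Without the synchronized block, another thread adding or removing elements during the loop could trigger a ConcurrentModificationException or produce inconsistent results.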
No, if your access methods are synchronized there is no need to also use a synchronized collection.
Collection collection = new ArrayList<String>();
will do just fine in that scenario.
If you have already arranged for proper synchronization of your code, you definitely do not need another layer of synchronization on the lower level of granularity.
Just make sure when you say
all of them need to access collection via methods that have synchronized block in their bodies.
that all these blocks use the same lock. It is not enough to just involve some synchronized block.
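A minimal sketch of what "the same lock" can look like in practice. Registry, its lock field and its methods are hypothetical names chosen for the example. A plain ArrayList is enough here because every access goes through one private lock object.

```java
import java.util.ArrayList;
import java.util.List;

class Registry {
    // One lock guards every access to names; nothing else synchronizes on it.
    private final Object lock = new Object();
    private final List<String> names = new ArrayList<>();

    public void register(String name) {
        synchronized (lock) {
            names.add(name);
        }
    }

    // Compound check-then-add is atomic because it holds the same lock.
    public boolean registerIfAbsent(String name) {
        synchronized (lock) {
            if (names.contains(name)) {
                return false;
            }
            names.add(name);
            return true;
        }
    }

    public int size() {
        synchronized (lock) {
            return names.size();
        }
    }
}
```

If one of these methods synchronized on a different object (or on nothing), the list would no longer be protected, even though every other method still "has a synchronized block".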
I have the following class for a Router's table with synchronised methods:
public class RouterTable {
private String tableForRouter;
private Map<String,RouterTableEntry> table;
public RouterTable(String router){
tableForRouter = router;
table = new HashMap<String,RouterTableEntry>();
}
public String owner(){
return tableForRouter;
}
public synchronized void add(String network, String ipAddress, int distance){
table.put(network, new RouterTableEntry(ipAddress, distance));
}
public synchronized boolean exists(String network){
return table.containsKey(network);
}
}
Multiple threads will read and write to the HashMap. I was wondering if it would be best to remove the synchronized on the methods and just use Collections.synchronizedMap(new HashMap<String,RouterTableEntry>()). What is the most sensible way to do this in Java?
I would suggest using a ConcurrentHashMap. This is a newer data structure, introduced in Java 5 as part of java.util.concurrent. It provides thread safety and allows concurrent operations, as opposed to a synchronized map, which will do one operation at a time.
If the map is the only place where thread safety is required, then just using the ConcurrentHashMap is fine. However, if you have atomic operations involving more state variables, I would suggest using synchronized code blocks instead of synchronized methods.
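Applied to the RouterTable from the question, the ConcurrentHashMap suggestion might look roughly like this. Treat it as a sketch: RouterTableEntry is not shown in the question, so the version below is a stand-in, and the class name ConcurrentRouterTable and the addIfAbsent method are additions for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Stand-in for the question's RouterTableEntry, whose real definition is not shown.
class RouterTableEntry {
    final String ipAddress;
    final int distance;

    RouterTableEntry(String ipAddress, int distance) {
        this.ipAddress = ipAddress;
        this.distance = distance;
    }
}

class ConcurrentRouterTable {
    private final String tableForRouter;
    private final ConcurrentMap<String, RouterTableEntry> table = new ConcurrentHashMap<>();

    ConcurrentRouterTable(String router) {
        tableForRouter = router;
    }

    String owner() {
        return tableForRouter;
    }

    // No synchronized keyword needed: the map handles single operations itself.
    void add(String network, String ipAddress, int distance) {
        table.put(network, new RouterTableEntry(ipAddress, distance));
    }

    boolean exists(String network) {
        return table.containsKey(network);
    }

    // Atomic "add only if missing"; returns true if this call inserted the entry.
    boolean addIfAbsent(String network, String ipAddress, int distance) {
        return table.putIfAbsent(network, new RouterTableEntry(ipAddress, distance)) == null;
    }
}
```

Note that calling exists() and then add() from outside is still a check-then-act sequence. putIfAbsent is the map's own atomic replacement for that pattern.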
In the absence of strict requirements about happens-before relationships and point in time correctness, the sensible thing to do in modern java is usually just use a ConcurrentMap.
Otherwise, yes, using a Collections#synchronizedMap is both safer and likely more performant (because you won't enclose any tertiary code that doesn't need synchronization) than manually synchronizing everything yourself.
The best is to use a java.util.concurrent.ConcurrentHashMap, which is designed from the ground up for concurrent access (read & write).
Using synchronization like you do works, but it causes high contention and therefore suboptimal performance. A collection obtained through Collections.synchronizedMap() would do just the same (it only wraps a standard collection with synchronized methods).
ConcurrentHashMap, by contrast, uses various techniques to be thread-safe while still providing good concurrency; for example, in Java 7 and earlier it is split into (by default) 16 segments, each guarded by a distinct lock, so that up to 16 threads can write to it concurrently. Newer versions lock at an even finer granularity.
Synchronizing the map will prevent users of your class from doing meaningful synchronization.
They will have no way of knowing whether the result from exists is still valid once they get into their if statement, and will need to do external synchronization.
With the synchronized methods as you show, they could lock on your class until they are done with a block of method calls.
The other option is to do no synchronization and let the user handle that, which they need to do anyway to be safe.
Adding your own synchronization is what was wrong with Hashtable.
The current common style tends to prefer Synchronized collections over explicit synchronized qualification on the methods that access them. However, this is not set in stone, and your decision should depend on the way you use this code/will use this code in the future.
Points to consider:
(a) If your map is going to be used by code that is outside of the RouterTable, then you need to use a synchronized map.
(b) OTOH, if you are going to add some additional fields to RouterTable, and their values need to be consistent with the values in the map (in other words: you want changes to the map and to the additional fields to happen as one atomic step), then you need to use synchronized methods.
If I do something to a list inside a synchronized block, does it prevent other threads from accessing that list elsewhere?
List<String> myList = new ArrayList<String>();
synchronized {
myList.add("Hello");
}
Does this prevent other threads from iterating over myList and removing/adding values?
I'm looking to add/remove values from a list, but at the same time protect it from other threads/methods from iterating over it (as the values in the list might be invalidated)
No, it does not.
The synchronized block only prevents other threads from entering the block (more accurately, it prevents other threads from entering all blocks synchronized on the same object instance - in this case blocks synchronized on this).
You need to use the instance you want to protect in the synchronized block:
synchronized(myList) {
myList.add("Hello");
}
The whole area is quite well explained in the Java tutorial:
http://download.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html
Yes, but only if all other accesses to myList are protected by synchronized blocks on the same object. The code sample you posted is missing an object on which you synchronize (i.e., the object whose mutex lock you acquire). If you synchronize on different objects or fail to synchronize at all in one instance, then other threads may very well access the list concurrently. Therefore, you must ensure that all threads have to enter a synchronized block on the same object (e.g., using synchronized (myList) { ... } consistently) before accessing the list. In fact, there is already a factory method that will wrap each method of your list with synchronized methods for you: Collections.synchronizedList.
You can certainly use Collections.synchronizedList to wrap your list so that all of its methods are individually synchronized, but that doesn't necessarily mean that your application's invariants are maintained. Individually synchronizing each method of the list ensures that the list's internal state remains consistent, but your application may need more than that, in which case you will need to write some more complex synchronization logic or see if you can take advantage of the Concurrency API (highly recommended).
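As one possible example of what the Concurrency API offers for the "iterate while others add/remove" case: java.util.concurrent.CopyOnWriteArrayList gives each iterator a snapshot of the list, so iteration never sees concurrent modifications. The sketch below (class and method names are made up) shows this in a single thread for simplicity. Whether it fits depends on the workload, since every write copies the underlying array.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteration {
    static List<String> demo() {
        List<String> myList = new CopyOnWriteArrayList<>();
        myList.add("Hello");
        myList.add("World");
        // The loop iterates over a snapshot taken when the iterator was created,
        // so adding during iteration is allowed and does not affect the loop.
        // The same loop over an ArrayList would throw ConcurrentModificationException.
        for (String s : myList) {
            myList.add(s + "!");
        }
        return myList;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

This tends to suit lists that are read and iterated far more often than they are modified.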
Here the synchronized block makes sure that only one thread is adding "Hello" to myList at a time...
To be more specific about which object you synchronize on, you can use
synchronized( myList ) //object name
{
//other code
}
vinod
From my limited understanding of concurrency control in Java I would say that it is unlikely that the code above would present the behaviour you are looking for.
The synchronised block would use the lock of whatever object you are calling said code in, which would in no way stop any other code from accessing that list unless said other code was also synchronised using the same lock object.
I have no idea if this would work, or if it's in any way advised, but I think that:
List myList = new ArrayList();
synchronized(myList) {
myList.add("Hello");
}
would give the behaviour you describe, by synchronizing on the lock object of the list itself.
However, the Java documentation recommends this way to get a synchronized list:
List list = Collections.synchronizedList(new ArrayList(...));
See: http://download.oracle.com/javase/1.4.2/docs/api/java/util/ArrayList.html
From Sun's tutorial:
Synchronized methods enable a simple strategy for preventing thread interference and memory consistency errors: if an object is visible to more than one thread, all reads or writes to that object's variables are done through synchronized methods. (An important exception: final fields, which cannot be modified after the object is constructed, can be safely read through non-synchronized methods, once the object is constructed) This strategy is effective, but can present problems with liveness, as we'll see later in this lesson.
Q1. Does the above statement mean that if an object of a class is going to be shared among multiple threads, then all instance methods of that class (except getters of final fields) should be made synchronized, since instance methods process instance variables?
In order to understand concurrency in Java, I recommend the invaluable Java Concurrency in Practice.
In response to your specific question, although synchronizing all methods is a quick-and-dirty way to accomplish thread safety, it does not scale well at all. Consider the much maligned Vector class. Every method is synchronized, and it works terribly, because iteration is still not thread safe.
No. It means that synchronized methods are a way to achieve thread safety, but they're not the only way and, by themselves, they don't guarantee complete safety in all situations.
Not necessarily. You can synchronize (e.g. place a lock on dedicated object) part of the method where you access object's variables, for example. In other cases, you may delegate job to some inner object(s) which already handles synchronization issues.
There are lots of choices; it all depends on the algorithm you're implementing. That said, the 'synchronized' keyword is usually the simplest one.
edit
There is no comprehensive tutorial on that, each situation is unique. Learning it is like learning a foreign language: never ends :)
But there are certainly helpful resources. In particular, there is a series of interesting articles on Heinz Kabutz's website.
http://www.javaspecialists.eu/archive/Issue152.html
(see the full list on the page)
If other people have any links I'd be interested to see also. I find the whole topic to be quite confusing (and, probably, most difficult part of core java), especially since new concurrency mechanisms were introduced in java 5.
Have fun!
In the most general form yes.
Immutable objects need not be synchronized.
Also, you can use individual monitors/locks for the mutable instance variables (or groups thereof), which will help with liveness. You can also synchronize only the portions where data is changed, rather than the entire method.
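A small sketch of the "individual locks per group of variables" idea, with made-up names (Stats, hitLock, missLock). It is only valid under the assumption that the two counters are independent, i.e. no invariant ties hits and misses together.

```java
class Stats {
    // Separate locks: threads recording hits never block threads recording misses.
    private final Object hitLock = new Object();
    private final Object missLock = new Object();
    private long hits;
    private long misses;

    void recordHit() {
        synchronized (hitLock) {
            hits++;
        }
    }

    void recordMiss() {
        synchronized (missLock) {
            misses++;
        }
    }

    // Reads take the matching lock so they see the latest value.
    long hits() {
        synchronized (hitLock) {
            return hits;
        }
    }

    long misses() {
        synchronized (missLock) {
            return misses;
        }
    }
}
```

If some operation ever needs both counters to be consistent with each other, a single shared lock (or acquiring both locks in a fixed order) would be needed instead.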
synchronized methodName vs synchronized( object )
That's correct, and it is one alternative. I think it would be more efficient to synchronize access to that object only, instead of synchronizing all its methods.
While the difference may be subtle, it would be useful if you use that same object in a single thread,
i.e. (using the synchronized keyword on the method)
class SomeClass {
private int clickCount = 0;
public synchronized void click(){
clickCount++;
}
}
When a class is defined like this, only one thread at a time may invoke the click method.
What happens if this method is invoked too frequently in a single threaded app? You'll spend some extra time checking if that thread can get the object lock when it is not needed.
class Main {
public static void main( String [] args ) {
SomeClass someObject = new SomeClass();
for( int i = 0 ; i < Integer.MAX_VALUE ; i++ ) {
someObject.click();
}
}
}
In this case, the check to see if the thread can lock the object will be invoked unnecessarily Integer.MAX_VALUE ( 2 147 483 647 ) times.
So removing the synchronized keyword in this situation will run much faster.
So, how would you do that in a multithread application?
You just synchronize the object:
synchronized ( someObject ) {
someObject.click();
}
Vector vs ArrayList
As an additional note, this usage (synchronized methodName vs. synchronized( object )) is, by the way, one of the reasons why java.util.Vector is now replaced by java.util.ArrayList. Many of the Vector methods are synchronized.
Most of the time a list is used in a single-threaded app or piece of code (e.g. a list created and used while handling a single request in a JSP/servlet is only touched by one thread), and the extra synchronization of Vector only costs performance.
The same goes for Hashtable being replaced by HashMap.
In fact, getters should be synchronized too, or the fields should be made volatile. That is because when you get some value, you're probably interested in the most recent version of that value. You see, synchronized block semantics provide not only atomicity of execution (i.e. a guarantee that only one thread executes the block at a time), but also visibility. It means that when a thread enters a synchronized block it invalidates its local cache, and when it leaves the block it flushes any variables that have been modified back to main memory. volatile variables have the same visibility semantics.
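The visibility point can be sketched with a stop flag. StoppableWorker and its methods are hypothetical names. Without volatile (or synchronization) on the flag, the worker thread would not be guaranteed to ever observe the write made by the other thread.

```java
class StoppableWorker implements Runnable {
    // volatile: a write by one thread is guaranteed to become visible to readers.
    private volatile boolean running = true;

    public void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running) {
            Thread.yield(); // placeholder for real work
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker);
        t.start();
        worker.stop();   // written by the main thread...
        t.join();        // ...and seen by the worker, which then exits its loop
        System.out.println("worker stopped");
    }
}
```

volatile is enough here because the flag is only ever set, never updated based on its old value. A read-modify-write such as count++ would still need a lock or an atomic class.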
No. Even getters have to be synchronized, except when they access only final fields. The reason is that, for example, when accessing a long value, there is a tiny chance that another thread is currently writing it, and you read it when just the first 4 bytes have been written while the other 4 bytes still hold the old value.
Yes, that's correct. All methods that modify data or access data that may be modified by a different thread need to be synchronized on the same monitor.
The easy way is to mark the methods as synchronized. If these are long-running methods, you may want to synchronize only the parts that do the reading/writing. In this case you would define the monitor object yourself and, where threads need to coordinate, use wait() and notify() on it.
The simple answer is yes.
If an object of the class is going to be shared by multiple threads, you need to synchronize the getters and setters to prevent data inconsistency.
If each thread has its own separate copy of the object, then there is no need to synchronize the methods. If your instance methods are more than mere set and get, you must analyze the risk of threads waiting for a long-running getter/setter to finish.
You could use synchronized methods, synchronized blocks, concurrency tools such as Semaphore or if you really want to get down and dirty you could use Atomic References. Other options include declaring member variables as volatile and using classes like AtomicInteger instead of Integer.
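As a quick illustration of the AtomicInteger option, here is a sketch of a click counter that needs no synchronized keyword at all. ClickCounter and runDemo are invented names for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ClickCounter {
    private final AtomicInteger clicks = new AtomicInteger();

    // incrementAndGet is an atomic read-modify-write; no lock is required.
    void click() {
        clicks.incrementAndGet();
    }

    int count() {
        return clicks.get();
    }

    // Starts the given number of threads, each clicking clicksPerThread times.
    static int runDemo(int threads, int clicksPerThread) throws InterruptedException {
        ClickCounter counter = new ClickCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < clicksPerThread; j++) {
                    counter.click();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return counter.count();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(4, 10_000));
    }
}
```

With a plain int field and clickCount++ instead, some increments could be lost when threads interleave, so the total would not be reliable.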
It all depends on the situation, but there are a wide range of concurrency tools available - these are just some of them.
Synchronization can result in hold-wait deadlock where two threads each have the lock of an object, and are trying to acquire the lock of the other thread's object.
Synchronization must also be global for a class, and an easy mistake to make is to forget to synchronize a method. When a thread holds the lock for an object, other threads can still access non synchronized methods of that object.
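One common way to avoid the hold-wait deadlock described above is to always acquire the two locks in the same global order. The sketch below uses a made-up Account class with a hypothetical unique id to decide that order. It is an illustration of the idea rather than a complete design.

```java
class Account {
    final int id;          // hypothetical unique id, used only to order lock acquisition
    private long balance;

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    long balance() {
        synchronized (this) {
            return balance;
        }
    }

    static void transfer(Account from, Account to, long amount) {
        // Always lock the account with the smaller id first, whatever the
        // direction of the transfer, so two threads can never hold one lock
        // each while waiting for the other.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}
```

If transfer simply locked from and then to, a concurrent transfer(a, b) and transfer(b, a) could each grab one lock and wait forever for the other.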