Thread safe implementation for Hash Map - java

First, I'll describe what I want and then I'll elaborate on the possibilities I am considering. I don't know which is the best so I want some help.
I have a hash map on which I do read and write operations from a Servlet. Now, since this Servlet is on Tomcat, I need the hash map to be thread safe. Basically, when it is being written to, nothing else should write to it and nothing should be able to read it as well.
I have seen ConcurrentHashMap but noticed its get method is not thread-safe. Then, I have seen locks and something called synchronized.
I want to know which is the most reliable way to do it.

ConcurrentHashMap.get() is thread safe.
You can make HashMap thread safe by wrapping it with Collections.synchronizedMap().

EDIT: removed false information
In any case, the synchronized keyword is a safe bet. It blocks any threads from accessing the object while inside a synchronized block.
// Anything can modify map at this point, making it not thread safe
map.get(0);
as opposed to
// Nothing can modify map until the synchronized block is complete
synchronized(map) {
map.get(0);
}

I would like to suggest you to go with ConcurrentHashMap , the requirement that you have mentioned above ,earlier I also had the same type of requirement for our application but we were little more focused on the performance side.
I ran both ConcurrentHashMap and map returned by Colecctions.synchronizedMap(); , under various types of load and launching multiple threads at a time using JMeter and I monitored them using JProfiler .After all these tests we came to conclusion that that map returned by Colecctions.synchronizedMap() was not as efficient in terms of performance in comaprison to ConcurrentHashMap.
I have written a post also on the same about my experience with both.
Thanks

Collections.synchronizedMap(new HashMap<K, V>);
Returns a synchronized (thread-safe) map backed by the specified map. In order to guarantee serial access, it is critical that all access to the backing map is accomplished through the returned map.
It is imperative that the user manually synchronize on the returned map when iterating over any of its collection views:

This is the point of ConcurrentHashMap class. It protects your collection, when you have more than 1 thread.

Related

Java making a copy of a list thread safe?

I have multiple threads running in a Java application, and all of them need to access the same list. Only one thread, however, actually needs to insert/delete/change the list, and the others just need to access it.
In the other threads, I want to make a copy of the list for me to use whenever I need to read it, but is there any way to do this in a thread safe way? If I had a method to copy the list element by element and the list changed, that would mess it up wouldn't it?
EDIT:
The list will not be deleted from very often, so would it work if I just copied it normally and caught exceptions? If the list grew in the middle of the copy and I missed it, it wouldn't really make a difference to functionality
You can use CopyOnWriteArrayList for your purpose.
CopyOnWriteArrayList is a concurrent Collection class introduced in Java 5 Concurrency API along with its popular cousin ConcurrentHashMap in Java.
As name suggest CopyOnWriteArrayList creates copy of underlying
ArrayList with every mutation operation e.g. add or set. Normally
CopyOnWriteArrayList is very expensive because it involves costly
Array copy with every write operation but its very efficient if you
have a List where Iteration outnumber mutation e.g. you mostly need to
iterate the ArrayList and don't modify it too often.
With this collection, you shouldn't create a new instance every time. You should have only one object of this and it will work.
Hmm, so I think that what are you looking for is called CopyOnWriteArrayList.
CopyOnWriteArrayList - A thread-safe variant of ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array.
Ref: CopyOnWriteArrayList
You can use CopyOnWriteArrayList which is thread safe ,but it create new one on every update process.
Or you can use readWriteLock so when update use no one can read while multiple thread can read simultaneously .
I decided to solve this by having a separate thread that handles the thread, with BlockingQueue for the other threads to submit to if they want to write to the list, or get from the list. If they wanted to write to the list it would submit an object with the content that they wanted to write, and if they wanted to read, it would submit a Future that the thread with the list would populate
Depending on your particular usage, you might benefit from one of these:
If you don't really need random access, use ConcurrentLinkedQueue. No explicit synchronization required.
If you don't need random access for writes but only need it for reads, use ConcurrentLinkedQueue for writes and copy it to a list from time to time if changes were made to the queue (in a separate thread), give this list to "readers". Does not require explicit synchronization; gives a "weakly consistent" read view.
Since your writes come from one thread, the previous could work with 2 lists (e.g. the writing thread will copy it to the "reading view" from time to time). However, be aware that if you use an ArrayList implementation and require random access for writes then you are looking at constant copies of memory regions, not good even in the absence of excessive synchronization. This option requires synchronization for the duration of copying.
Use a map instead, ConcurrentHashMap if you don't care about ordering and want O(1) performance or ConcurrentSkipListMap if you do need ordering and are ok with O(logN) performance. No explicit synchronization required.
Use Collections.synchronizedList().
Example :
Collections.synchronizedList(new ArrayList<YourClassNameHere>())

Using ThreadLocal vs. a HashMap with Thread as a key

I would like to reuse instances of non-thread safe classes for performance reasons in a Servlet. I have two options,
use ThreadLocal where in Java takes care of doing the instance management per thread
use a static HashMap which uses Thread as the HashMap key and the instances are managed at this level
With the ThreadLocal approach there are potentials for memory leaks esp in Servlet enviornment. Because of this, I am thinking of using the 2nd option, I was wondering if anyone has experience in using this approach and any pitfalls of using the same?
Prefer the ThreadLocal approach because it is likely synchronized (or better yet, requires no synchronization) at the correct granularity and no larger.
If you roll your own solution using HashMap you'll have to acquire a lock over the HashMap every time you want to access any thread-local data. Why? Because a new thread could be created and threads can die. These are implicitly adding/removing items from a HashMap, which require synchronization on the full HashMap. You'll also have quite the time keeping object lifetimes straight because HashMap will keep alive all items it contains as long as it is referable from any thread. That is not how ThreadLocal store behaves.
The problem is not ThreadLocal itself, but the way it's being used. See here for a detailed explanation. So, your own implementation won't make a difference.

Most efficient way to make a data structure thread-safe (Java)

I have a shared Map data structure that needs to be thread-safe. Is synchronized the most efficient way to read or add to the Map?
Thanks!
Edit: The data structure is a non-updatable cache, i.e. once it fills up it does not update the cache. So lots of writes initially with some reads then it is mostly reads
"Most efficient" is relative, of course, and depends on your specific situation. But consider something like ConcurrentHashMap if you expect there to be many threads working with the map simultaneously; it's thread safe but still allows concurrent access, unlike Hashtable or Collections.synchronizedMap().
That depends on how you use it in the app.
If you're doing lots of reads and writes on it, a ConcurrentHashMap is possibly the best choice, if it's mostly reading, a common Map wrapped inside a collection using a ReadWriteLock
(since writes would not be common, you'd get faster access and locking only when writing).
Collections.synchronizedMap() is possibly the worst case, since it might just give you a wrapper with all methods synchronized, avoid it at all costs.
For your specific use case (non-updatable cache), a copy on write map will outperform both a synchronized map and ConcurrentHashMap.
See: https://labs.atlassian.com/wiki/display/CONCURRENT/CopyOnWriteMap as one example (I believe apache also has a copy on write map implementation).
synchronised methods or collections will certainly work. It's not the most efficient approach but is simple to implement and you won't notice the overhead unless you are access the structure millions of times per second.
A better idea though might be to use a ConcurrentHashMap - this was designed for concurrency from the start and should perform better in a highly concurrent situation.

arraylist that supports multithreads

I need to write an arraylist to a file.
It gets filled all the time and when it gets too big I need to start writing it.
So I thought to check when the arraylist sise is greater then 100 and then append the file and write the current rows .
But the problem is sometimes it doesnt get filled for a few minuettes and I will want to dump the data to a file.
So my second thought was to have another thread that will check if there are rows every few sec and dump it to the file.
But than I would need to manage locks between threads.
My questions are :
1. Is the multithread design ok ?
2. Is there an arraylist that supports multithread in any form ?
You can make synchronized lists using Collections.synchronizedList.
Check out this thread.
You should use
List synchronizedList = Collections.synchronizedList(list);
Instead of ArrayList I would recommend you to use proper implementation of Queue<E>. If you are only appending data to the list and then dumping it to the file removing saved items from the list, queue is a much better choice.
Some implementations are threads safe and will even allow the caller thread to block until something actually appears in the queue - which is much better approach than having a polling thread. BlockingQueue looks very promising for your case.
From your question it appears most of the time you are going to perform write operation and that too from a single thread & will be intermittently checking for the size of list.
other wise you can not use plain old ArrayList.
synchronizing the list and then using locks to access the list looks like a overkill.
rather have a if check that will check the size.
If you are going to access list in multiple threads then to avoid ConcurrentModificationException use the method suggested by #Ludevik
There are other approaches as well but for the sake of simplicity #Ludevik approach fits the bill.
Instead of arrays you can use vectors. Since vectors are thread-safe.
ArrayList is not thread-safe, but you can get a thread-safe list with Collections.synchronizedList()
You need to use Collections.synchronizedList(List) to create a thread safe list. However, you still need to synchronize operations such as add or remove and synchronize the updates to the objects held in the list.
The simple solutions are to use Collections.synchronizedList(list) or Vector. However, there is a gotcha.
The iterator() method for a synchronized list / Vector created as above is NOT synchronized. So there's nothing to stop a thread from trying to add new element to the list if you copy it using an iterator explicitly, by using for (type var : list) {...} or by using a copy constructor that relies on the iterator.
This is liable to result in concurrent modification exceptions. To avoid that problem, you will need to do your own locking.
It may be better idea to use a concurrent Queue class so that the thread that writes stuff to the file doesn't need to iterate a list.

Thread safe Hash Map?

I am writing an application which will return a HashMap to user. User will get reference to this MAP.
On the backend, I will be running some threads which will update the Map.
What I have done so far?
I have made all the backend threads so share a common channel to update the MAP. So at backend I am sure that concurrent write operation will not be an issue.
Issues I am having
If user tries to update the MAP and simultaneously MAP is being updated at backend --> Concurrent write operation problem.
If use tries to read something from MAP and simultaneously MAP is being updated at backend --> concurrent READ and WRITE Operation problem.
Untill now I have not face any such issue, but i m afraid that i may face in future. Please give sugesstions.
I am using ConcurrentHashMap<String, String>.
You are on the right track using ConcurrentHashMap. For each point:
Check out the methods putIfAbsent and replace both are threadsafe and combine checking current state of hashmap and updating it into one atomic operation.
The get method is not synchronized internally but will return the most recent value for the specified key available to it (check the ConcurrentHashMap class Javadoc for discussion).
The benefit of ConcurrentHashMap over something like Collections.synchronizedMap is the combined methods like putIfAbsent which provide traditional Map get and put logic in an internally synchronized way. Use these methods and do not try to provide your own custom synchronization over ConcurrentHashMap as it will not work. The java.util.concurrent collections are internally synchronized and other threads will not respond to attempts at synchronizing the object (e.g. synchronize(myConcurrentHashMap){} will not block other threads).
Side Note:
You might want to look into the lock free hash table implementation by Cliff Click, it's part of the Highly Scalable Java library
(Here's a Google Talk by Cliff Click about this lock free hash.)
ConcurrentHashMap was designed and implemented to avoid any issues with the scenarios you describe. You have nothing to worry about.
A hash table supporting full
concurrency of retrievals and
adjustable expected concurrency for
updates.updates.
javadoc of ConcurrentHashMap

Categories