Java synchronization in a web service

I have a Java RESTful web service hosted on Tomcat. In one of my web service methods, I load a big ArrayList of objects (about 25,000 entries) from Redis. This ArrayList is updated once every 30 minutes. There are multiple threads reading from this ArrayList all the time. When I update the ArrayList, I want to cause minimum disruption/delay, since other threads could be reading from it.
I was wondering what the best way to do this is. One option is to add the synchronized keyword to the method that updates the list. But a synchronized method has an overhead, since no threads can read while the update is going on, and the update itself could take a few hundred milliseconds since it involves reading from Redis plus deserialization.
class WebService {
    ArrayList<Entry> list = new ArrayList<Entry>();

    // need to call this every 30 mins
    synchronized void updateArrayList() {
        // read from Redis & add elements to list
    }

    void readFromList() {
        for (Entry e : list) {
            // do some processing
        }
    }
}
Update: the final solution I ended up with uses no explicit synchronization primitives (see below).

Does it have to be the same List instance getting updated? Can you build a new list every 30 minutes and replace a volatile reference?
Something along these lines:
class WebService {
    private volatile List<Entry> theList;

    void updateList() {
        List<Entry> newList = getEntriesFromRedis();
        theList = Collections.unmodifiableList(newList);
    }

    public List<Entry> getList() {
        return theList;
    }
}
The advantage of this approach is that you don't have to do any other synchronization anywhere else.
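To make the pattern concrete, here is a minimal, self-contained sketch (the SnapshotHolder name and the String element type are illustrative, not from the original post). Readers capture the volatile reference once and iterate that snapshot, so a concurrent swap never affects an in-flight read:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SnapshotHolder {
    private volatile List<String> current = Collections.emptyList();

    void update(List<String> fresh) {
        // Publish a fully built, immutable snapshot in a single volatile write.
        current = Collections.unmodifiableList(new ArrayList<>(fresh));
    }

    int countMatching(String prefix) {
        // Capture the reference once; a later update swaps the field but
        // never mutates the snapshot this method is iterating over.
        List<String> snapshot = current;
        int n = 0;
        for (String s : snapshot) {
            if (s.startsWith(prefix)) n++;
        }
        return n;
    }
}
```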

A reader-writer lock (or ReadWriteLock in Java) is what you need.
A reader-writer lock will allow concurrent access for read operations, but mutually exclusive access for write.
It will look something like
class WebService {
    final ReentrantReadWriteLock listRwLock = new ReentrantReadWriteLock();
    ArrayList<Entry> list = new ArrayList<Entry>();

    // need to call this every 30 mins
    void updateArrayList() {
        listRwLock.writeLock().lock();
        try {
            // read from Redis & add elements to list
        } finally {
            listRwLock.writeLock().unlock();
        }
    }

    void readFromList() {
        listRwLock.readLock().lock();
        try {
            for (Entry e : list) {
                // do some processing
            }
        } finally {
            listRwLock.readLock().unlock();
        }
    }
}
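Whichever locking scheme you pick, the 30-minute refresh itself can be driven by a ScheduledExecutorService. A minimal sketch (the RefreshScheduler name and parameters are illustrative; in the question's setting the Runnable would call updateArrayList() with a 30-minute period):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class RefreshScheduler {
    // Runs the given refresh task immediately, then again at a fixed interval.
    static ScheduledExecutorService start(Runnable refresh, long period, TimeUnit unit) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(refresh, 0, period, unit);
        return ses;
    }
}
```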

Here is the solution I finally ended up with:
class WebService {
    // key = time window (e.g. 10:00, 10:30, 11:00), value = list of entries for that window
    ConcurrentHashMap<String, List<Entry>> map = new ConcurrentHashMap<String, List<Entry>>();

    // a timer calls this every 10 mins
    void updateArrayList() {
        // Populate the map for the next time window with the corresponding entries,
        // so that it is ready before we start using it. Also clean up the expired
        // entries for older time windows.
    }

    void readFromList() {
        List<Entry> list = map.get(currentTimeWindow);
        for (Entry e : list) {
            // do some processing
        }
    }
}
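A runnable sketch of the same idea (the key format, method names, and parameters here are assumptions for illustration; the Redis load is stubbed with a supplied list). The update pre-publishes the next window and evicts windows no longer live, while readers only ever call get:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class WindowedCache {
    private final ConcurrentHashMap<String, List<String>> map = new ConcurrentHashMap<>();

    // Pre-populate the upcoming window and drop any window not in the live set.
    // Keys are window labels such as "10:30".
    void update(String nextWindow, List<String> entries, Set<String> liveWindows) {
        map.put(nextWindow, Collections.unmodifiableList(new ArrayList<>(entries)));
        map.keySet().removeIf(w -> !liveWindows.contains(w)); // safe on ConcurrentHashMap
    }

    List<String> read(String currentWindow) {
        return map.getOrDefault(currentWindow, Collections.emptyList());
    }
}
```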

ArrayList is not thread safe. You could use Vector instead, which is thread safe.
You can also get a thread-safe list via the Collections API, but I would recommend Vector since it already provides what you want.

    // Use the Collections.synchronizedList method
    List list = Collections.synchronizedList(new ArrayList());
    ...
    // If you want to use an iterator on the synchronized list,
    // it must be inside a synchronized block:
    synchronized (list) {
        Iterator iterator = list.iterator();
        while (iterator.hasNext()) {
            ...
            iterator.next();
            ...
        }
    }

I would recommend you to read through this:
http://beginnersbook.com/2013/12/difference-between-arraylist-and-vector-in-java/

Related

How to access an array thread safely in Java?

Are operations on arrays in Java thread safe?
If not how to make access to an array thread safe in Java for both reads and writes?
You will not get an invalid state when changing arrays using multiple threads. However if a certain thread has edited a value in the array, there is no guarantee that another thread will see the changes. Similar issues occur for non-volatile variables.
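One stdlib way to get both atomicity and visibility for per-element updates is java.util.concurrent.atomic.AtomicIntegerArray. A small illustrative sketch (the VisibleCounters name is not from the original answer):

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class VisibleCounters {
    // AtomicIntegerArray gives each element volatile read/write semantics,
    // so an update made by one thread is visible to all other threads.
    private final AtomicIntegerArray counts = new AtomicIntegerArray(4);

    int bump(int index) {
        return counts.incrementAndGet(index); // atomic read-modify-write
    }

    int get(int index) {
        return counts.get(index);
    }
}
```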
Operations on arrays in Java are not thread safe. Instead you may use an ArrayList wrapped with Collections.synchronizedList().
Suppose we are trying to populate a synchronized ArrayList of Strings. You can add items to the list like this:
    List<String> list = Collections.synchronizedList(new ArrayList<String>());

    // Adding elements to the synchronized ArrayList
    list.add("Item1");
    list.add("Item2");
    list.add("Item3");
Then access them from a synchronized block like this:

    synchronized (list) {
        Iterator<String> iterator = list.iterator();
        while (iterator.hasNext())
            System.out.println(iterator.next());
    }
Or you may use a thread safe variant of ArrayList - CopyOnWriteArrayList. A good example can be found here.
Hope it will help.
Array operations are not thread safe. You can either lock on a field (I would recommend adding a dedicated field, e.g. named LOCK):

    private final Object LOCK = new Object();

    void add() {
        synchronized (LOCK) {
            // add
        }
    }

    Object get(int i) {
        synchronized (LOCK) {
            // return element i
        }
    }

or simply use the classes in java.util.concurrent.

Multithreaded: Identifying duplicate objects

I'm trying to implement a method that finds duplicate objects in a List. The goal is to traverse the List and find the duplicate objects using multiple threads. So far I have used ExecutorService as follows.
    ExecutorService executor = Executors.newFixedThreadPool(5);
    for (int i = 0; i < jobs; i++) {
        Runnable worker = new TaskToDo(jobs);
        executor.execute(worker);
    }
    executor.shutdown();
    while (!executor.isTerminated()) {
    }
    System.out.println("Finished all threads");
In the TaskToDo class I iterate through the list. When a duplicate is detected, one of the two is removed from the List. These are the problems I faced:
When using multiple threads in the executor, the result is not as intended: some duplicate values still exist in the list. A single thread in the executor works perfectly, though. I also tried
List<String> list = Collections.synchronizedList(new LinkedList<String>()), but the same problem exists.
What is the best data structure I can use for removing duplicates with good performance?
Google gave some results suggesting concurrent structures, but it is difficult to figure out a correct approach to achieve this.
Appreciate your help. Thanks in advance... :)
Following is the code for iterating through the specified list object. Here the actual contents of the files are compared.
    for (int i = currentTemp; i < list.size() - 1; i++) {
        if (isEqual(list.get(currentTemp), list.get(i + 1))) {
            synchronized (list) {
                list.remove(i + 1);
                i--;
            }
        }
    }
With your current logic, you would have to synchronize at coarser granularity, otherwise you risk removing the wrong element.
    for (int i = currentTemp; i < list.size() - 1; i++) {
        synchronized (list) {
            if (i + 1 < list.size() && isEqual(list.get(currentTemp), list.get(i + 1))) {
                list.remove(i + 1);
                i--;
            }
        }
    }
You see, the isEqual() check must be inside the synchronized block to ensure atomicity of the equivalence check with the element removal. Assuming most of your concurrent processing benefit would come from asynchronous comparison of list elements using isEqual(), this change nullifies any benefit you sought.
Also, checking list.size() outside the synchronized block isn't good enough, because list elements can be removed by other threads. And unless you have a way of adjusting your list index down when elements are removed by other threads, your code will unknowingly skip checking some elements in the list. The other threads are shifting elements out from under the current thread's for loop.
This task would be much better implemented using an additional list to keep track of indexes that should be removed:
    private volatile Set<Integer> indexesToRemove =
        Collections.synchronizedSet(new TreeSet<Integer>(
            new Comparator<Integer>() {
                @Override public int compare(Integer i1, Integer i2) {
                    return i2.compareTo(i1); // sort descending for later element removal
                }
            }
        ));
The above should be declared at the same shared level as your list. Then the code for iterating through the list should look like this, with no synchronization required:
    int size = list.size();
    for (int i = currentTemp; i < size - 1; i++) {
        if (!indexesToRemove.contains(i + 1)) {
            if (isEqual(list.get(currentTemp), list.get(i + 1))) {
                indexesToRemove.add(i + 1);
            }
        }
    }
And finally, after you have join()ed the worker threads back to a single thread, do this to de-duplicate your list:
    for (Integer i : indexesToRemove) {
        list.remove(i.intValue());
    }
Because we used a descending-sorted TreeSet for indexesToRemove, we can simply iterate its indexes and remove each from the list.
If your algorithm acts on enough data that you might really benefit from multiple threads, you run into another issue that tends to erode any performance gain: each thread has to scan the entire list to see if the element it is working on is a duplicate, which causes CPU cache misses as the threads compete to access different parts of the list.
When the contended parts of the list share cache lines, this is known as false sharing.
Even if false sharing does not get you, you are de-duping the list in O(N^2), because for each element of the list you re-iterate the entire list.
Instead, consider using a Set to initially collect the data. If you cannot do that, test the performance of adding the list elements to a Set. That should be a very efficient way to approach this problem.
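A quick sketch of that suggestion (Strings stand in for the compared objects; the Dedup name is illustrative). Collecting into a LinkedHashSet removes duplicates in roughly O(n) while preserving first-seen order, instead of the O(n^2) scan-and-remove over a List:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

class Dedup {
    // A LinkedHashSet rejects duplicates on insertion and keeps
    // the order in which elements were first seen.
    static List<String> distinct(List<String> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }
}
```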
If you're trying to dedup a large number of files, you really ought to be using a hash-based structure. Concurrently modifying lists is dangerous, not least because indexes into the list will constantly be changing out from under you, and that's bad.
If you can use Java 8, my approach would look something like this. Let's assume you have a List<String> fileList.
    Collection<String> deduplicatedFiles = fileList.parallelStream()
        .map(FileSystems.getDefault()::getPath) // convert strings to Paths
        .collect(Collectors.toConcurrentMap(
            path -> {
                try {
                    // read the file contents and wrap them in a ByteBuffer,
                    // which is a suitable key for a hash map
                    return ByteBuffer.wrap(Files.readAllBytes(path));
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            },
            path -> path.toString(),      // in the values, convert back to string
            (first, second) -> first))    // resolve duplicates by choosing arbitrarily
        .values();
That's the entire thing: it concurrently reads all the files, hashes them (though with an unspecified hash algorithm that may not be great), deduplicates them, and spits out a list of files with distinct contents.
If you're using Java 7, then what I'd do would be something like this.
    CompletionService<Void> service = new ExecutorCompletionService<>(
        Executors.newFixedThreadPool(4));
    final ConcurrentMap<ByteBuffer, String> unique = new ConcurrentHashMap<>();
    for (final String file : fileList) {
        service.submit(new Runnable() {
            @Override public void run() {
                try {
                    ByteBuffer buffer = ByteBuffer.wrap(Files.readAllBytes(
                        FileSystems.getDefault().getPath(file)));
                    unique.putIfAbsent(buffer, file);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }, null);
    }
    for (int i = 0; i < fileList.size(); i++) {
        service.take();
    }
    Collection<String> result = unique.values();

How to update a LinkedList value stored in cache using Guava's LoadingCache

I am trying to utilize LoadingCache from the Guava library to cache a LinkedList.
LoadingCache<Integer, LinkedList<String>> cache;
I've setup a CacheLoader to handle misses, which is working fine. However there is another system that needs to submit updates to existing cache entries. Each update needs to be appended to the LinkedList and will arrive at a fairly quick rate (thousands per minute). Finally, it needs to be thread safe.
Here is a naive approach that illustrates the logic but is not thread safe:
    public void add(Integer key, String value) {
        LinkedList<String> list = cache.get(key);
        list.add(value);
        cache.put(key, list);
    }
Any advice on how to make this work? I can look at other libraries but Guava 14 is already a dependency of this codebase and would be very convenient.
The last line in
    public void add(Integer key, String value) {
        LinkedList<String> list = cache.get(key);
        list.add(value);
        cache.put(key, list);
    }
is not needed as you already modify the object obtained from the cache. Maybe all you need is
    public void add(Integer key, String value) {
        LinkedList<String> list = cache.get(key);
        synchronized (list) {
            list.add(value);
        }
    }
It depends on what eviction happens. If there's no eviction at all, then it will work. If an entry can get evicted before the updating method finishes, then you're out of luck.
Nonetheless, there's a simple solution: Using a global lock would work, but obviously inefficiently. So use a list of locks:
    private static final int CONCURRENCY_LEVEL = 64; // must be a power of two
    List<Object> locks = Lists.newArrayList(); // an array would do as well
    for (int i = 0; i < CONCURRENCY_LEVEL; ++i) locks.add(new Object());

    public void add(Integer key, String value) {
        synchronized (locks.get(hash(key))) {
            cache.get(key).add(value);
        }
    }
where hash, depending on the distribution of your keys, can be as simple as key.intValue() & (CONCURRENCY_LEVEL - 1), or something that randomizes the distribution further.
While my above list of locks should work, there's Striped.lock(int) in Guava, which makes it a bit simpler and takes care of padding (see false sharing for what it's good for) and more.
Most probably you should not use LinkedList as it's nearly always slower than ArrayList.

What is the difference in behavior between these two usages of synchronized on a list

    List<String> list = new ArrayList<String>();
    list.add("a");
    ...
    list.add("z");

    synchronized (list) {
        Iterator<String> i = list.iterator();
        while (i.hasNext()) {
            ...
        }
    }
and
    List<String> list = new ArrayList<String>();
    list.add("a");
    ...
    list.add("z");

    List<String> synchronizedList = Collections.synchronizedList(list);
    synchronized (synchronizedList) {
        Iterator<String> i = synchronizedList.iterator();
        while (i.hasNext()) {
            ...
        }
    }
Specifically, I'm not clear as to why synchronized is required in the second instance when a synchronized list provides thread-safe access to the list.
If you don't lock around the iteration, you will get a ConcurrentModificationException if another thread modifies it during the loop.
Synchronizing all of the methods doesn't prevent that in the slightest.
This (and many other things) is why Collections.synchronized* is completely useless.
You should use the classes in java.util.concurrent. (and you should think carefully about how you will guarantee you will be safe)
As a general rule of thumb:
Slapping locks around every method is not enough to make something thread-safe.
For much more information, see my blog
synchronizedList only makes each call atomic. In your case, the loop makes multiple calls, so between calls/iterations another thread can modify the list. If you use one of the concurrent collections, you don't have this problem.
To see how such a collection differs from ArrayList:
    List<String> list = new CopyOnWriteArrayList<String>();
    list.addAll(Arrays.asList("a,b,c,d,e,f,g,h,z".split(",")));
    for (String s : list) {
        System.out.print(s + " ");
        // would trigger a ConcurrentModificationException with ArrayList
        list.clear();
    }
Even though the list is cleared repeatedly, it prints the following, because that was the contents when the iterator was created.
a b c d e f g h z
The second code needs to be synchronized because of the way synchronized lists are implemented. This is explained in the javadoc:
It is imperative that the user manually synchronize on the returned list when iterating over it
The main difference between the two code snippets is the effect of the add operations:
with the synchronized list, you have a visibility guarantee: other threads will see the newly added items if they call synchronizedList.get(..) for example.
with the ArrayList, other threads might not see the newly added items immediately - they might actually not ever see them.

Getting ConcurrentModificationException while modifying a HashMap in a thread class

Hi, I am running a thread service whose job is to check the age of the items in a HashMap. When an item is older than, say, 5 seconds, I have to delete it from the HashMap. Below is the simplified code. But when the code attempts to delete an item from the HashMap, I get a java.util.ConcurrentModificationException.
I am populating the HashMap in the main() method in the original program.
Can somebody please help me out with this? PS: deleteFromTrackList() is called by different clients across a network through RMI.
    import java.util.*;

    public class NotifierThread extends Thread {
        private HashMap<Integer, ArrayList> NotificationTrackList = new HashMap<Integer, ArrayList>();

        @Override
        public void run() {
            while (true) { // this process should run continuously
                checkNotifierList(getNotificationTrackList());
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }

        public HashMap<Integer, ArrayList> getNotificationTrackList() {
            return NotificationTrackList;
        }

        public void deleteFromTrackList(Integer messageID) {
            NotificationTrackList.remove(messageID);
        }

        public synchronized void checkNotifierList(HashMap list) {
            Set entries = list.entrySet();
            for (Iterator iterator = entries.iterator(); iterator.hasNext();) {
                Map.Entry<Integer, ArrayList> entry = (Map.Entry) iterator.next();
                ArrayList messageInfo = entry.getValue();
                Integer messageID = entry.getKey();
                messageInfo = new ArrayList((ArrayList) list.get(messageID));
                Long curTime = new Date().getTime();
                Long refTime = (Long) messageInfo.get(1);
                Long timeDiff = curTime - refTime;
                if (timeDiff > 5000) {
                    // delete the entry if it's older than 5 seconds and update
                    // the internal entry list
                    deleteFromTrackList(messageID);
                }
            }
        }

        public static void main(String[] args) {
            new NotifierThread().start();
        }
    }
This is the stacktrace I am getting at the console
Exception in thread "tracker" java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(Unknown Source)
at java.util.HashMap$EntryIterator.next(Unknown Source)
at java.util.HashMap$EntryIterator.next(Unknown Source)
at NotifierThread.checkNotifierList(NotifierThread.java:32)
at NotifierThread.run(NotifierThread.java:10)
The only way to remove an entry from a map while iterating over it is to remove it using the iterator. Use
iterator.remove();
instead of
deleteFromTrackList(messageID);
Note that the same applies to all the collections (List, Set, etc.)
Also, note that your design is not thread-safe, because you let other threads access the map in an unsynchronized way.
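A simplified sketch of the expiry check rewritten around iterator.remove() (the value type here is reduced to a bare timestamp for illustration; the original stores an ArrayList per entry):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class ExpiryCheck {
    // Removing through the iterator is the only safe way to delete
    // entries from a HashMap while iterating over it.
    static int removeExpired(Map<Integer, Long> timestamps, long now, long maxAgeMillis) {
        int removed = 0;
        Iterator<Map.Entry<Integer, Long>> it = timestamps.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Long> entry = it.next();
            if (now - entry.getValue() > maxAgeMillis) {
                it.remove(); // mutates the map without invalidating the iterator
                removed++;
            }
        }
        return removed;
    }
}
```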
Your code isn't completely synchronized. Try changing
public void deleteFromTrackList(Integer messageID) {
to
public synchronized void deleteFromTrackList(Integer messageID) {
Correct. You cannot modify a Map while iterating over it without using the iterator directly. There are a couple of main options:
Create a List of elements that should be removed. Add each expired element to the List in the loop. After the loop, remove the elements in the List from the Map.
Use Guava's filter capability, e.g. Maps.filterEntries. This creates a new Map, however, which may not work for what you are trying to do.
Since you have a multi-threaded system, you may want to consider immutability as your friend. Rather than blocking threads for the duration of the entire stale-entry check loop, you could use an ImmutableMap, which would be more thread-safe with better performance.
Thanks for your answers guys... I have found the solution for my question, instead of using HashMap, I am using ConcurrentHashMap. This solved my problem. Thanks again !
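A small illustration of why ConcurrentHashMap helps here (class and method names are illustrative): its iterators are weakly consistent, so removing entries mid-iteration never throws a ConcurrentModificationException:

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentRemoval {
    // Removes even keys while iterating; safe on ConcurrentHashMap,
    // whereas a plain HashMap would throw ConcurrentModificationException.
    static int dropEven(ConcurrentHashMap<Integer, String> map) {
        int removed = 0;
        for (Integer key : map.keySet()) {
            if (key % 2 == 0) {
                map.remove(key); // safe during iteration
                removed++;
            }
        }
        return removed;
    }
}
```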
In reality you do not even need concurrent access to a HashMap to get a ConcurrentModificationException.
In fact, a single thread is quite enough.
For example:
You may create a loop based on the map's keySet().iterator(),
and, while you are inside this loop, your (single) thread decides to remove an element from the map (not a good idea while the iterator is open).
On the next call to iterator.next() you will get your ConcurrentModificationException.
So be careful with that.
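That single-threaded failure mode is easy to reproduce (an illustrative sketch; the class name is not from the original answer):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class SingleThreadCme {
    // A single thread structurally modifying a HashMap behind an open
    // iterator's back triggers ConcurrentModificationException on the
    // iterator's next call to next().
    static boolean triggers() {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 0; i < 5; i++) map.put(i, "v" + i);
        try {
            for (Integer key : map.keySet()) {
                map.remove(key); // modifies the map directly, not via the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```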
