Implementation of singleton thread-safe list - java

I'm using Spring framework. Need to have a list of objects, which should get all data from database at once. When data is changed, list will be null and next get operation should fill data from database again. Is my code correct for multi-thread environment?
@Component
@Scope("singleton")
public class MyObjectHolder {
private volatile List<MyObject> objectList = null;
public List<MyObject> getObjectList() {
if (objectList == null) {
synchronized (objectList) {
if (objectList == null) {
objectList = getFromDB();
}
}
}
return objectList;
}
synchronized
public void clearObjectList() {
objectList = null;
}
}

Short answer: no.
public class MyObjectHolder {
// List is an interface and cannot be instantiated; use a concrete class
private final List<MyObject> objectList = new ArrayList<>();
public List<MyObject> getObjectList() {
return objectList;
}
}
This is the preferred singleton pattern.
Now you need to figure out how to get the data into the list in a thread-safe way. For this Java already has some pre-made thread-safe lists in the concurrent package, which should be preferred to any synchronized implementation, as they are much faster under heavy threading.
Your problem could be solved like this:
public class MyObjectHolder {
private final CopyOnWriteArrayList<MyObject> objectList = new CopyOnWriteArrayList<>();
public List<MyObject> getObjectList() {
return objectList;
}
public boolean isEmpty() {
return objectList.isEmpty();
}
public void readDB() {
final List<MyObject> dbList = getFromDB();
// ?? objectList.clear();
objectList.addAll(dbList);
}
}
Please note the absence of any synchronized, yet the class is thread-safe. CopyOnWriteArrayList guarantees that each individual call on the list is performed atomically. So I can call isEmpty() while someone else is filling up the list. I will only get a snapshot of a moment in time and can't tell what result I will get, but it will in all cases succeed without error.
The DB result is first written into a temporary list, therefore no threading issues can happen there. Then addAll() atomically appends the content to the real list, which is again thread-safe. Be aware that if you enable the commented-out clear(), then clear() and addAll() are two separate atomic calls, so a reader may see an empty list in between.
The worst-case scenario is that Thread A is just about done writing the new data, while at the same time Thread B checks if the list contains any elements. Thread B will receive the information that the list is empty, yet a microsecond later it contains tons of data. You need to deal with this situation by either repeatedly polling or by using an observer pattern to notify the other threads.
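The "snapshot of a moment in time" behaviour can be seen in a small single-threaded sketch. The class and method names below are made up for illustration; only CopyOnWriteArrayList and its iterator semantics come from the JDK.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the original answer.
class SnapshotDemo {

    // Counts the elements an iterator sees when the list is modified
    // after the iterator was created.
    static int countSeenByOldIterator() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        Iterator<String> it = list.iterator(); // bound to the current array
        list.add("c");                         // writes go to a fresh copy
        int seen = 0;
        while (it.hasNext()) {
            it.next();
            seen++;
        }
        return seen; // the iterator still walks the old two-element array
    }
}
```

The iterator never throws ConcurrentModificationException, but it also never sees later writes, so a reader that needs fresh data has to ask the list again.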

No, your code is not thread safe. For example, you could assign objectList in one thread at time X, but set it to null (via clearObjectList()) at time X+1 because you are synchronizing on 2 different objects. The first synchronization is on objectList itself and the second synchronization is on the instance of MyObjectHolder. You should look into locks when using a shared resource instead of using synchronized, specifically something like a ReadWriteLock.
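As a rough sketch of the "one lock for everything" idea: loading and clearing below go through the same monitor, so they cannot interleave. It also avoids synchronized (objectList), which throws a NullPointerException whenever the field is null. LazyListHolder and the loader supplier (standing in for getFromDB()) are hypothetical names, not something from the question.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical holder; `loader` stands in for the question's getFromDB().
class LazyListHolder<T> {
    private final Object lock = new Object();  // the only monitor used
    private final Supplier<List<T>> loader;
    private volatile List<T> cached;           // null means "needs reload"

    LazyListHolder(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    List<T> get() {
        List<T> local = cached;                // single volatile read
        if (local == null) {
            synchronized (lock) {
                local = cached;                // re-check under the lock
                if (local == null) {
                    local = loader.get();
                    cached = local;
                }
            }
        }
        return local;                          // local copy, unaffected by a later clear()
    }

    void clear() {
        synchronized (lock) {                  // same lock as get()
            cached = null;
        }
    }
}
```

Returning the local variable instead of re-reading the field matters: in the question's version, another thread could call clearObjectList() between the check and the return, and the caller would get null.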

Related

Missing updates with locks and ConcurrentHashMap

I have a scenario where I have to maintain a Map which can be populated by multiple threads, each modifying their respective List (unique identifier/key being the thread name), and when the list size for a thread exceeds a fixed batch size, we have to persist the records to the database.
Aggregator class
private volatile ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<String, List<T>>();
private ReentrantLock lock ;
public void addAll(List<T> entityList, String threadName) {
try {
lock.lock();
List<T> instrumentList = instrumentMap.get(threadName);
if(instrumentList == null) {
instrumentList = new ArrayList<T>(batchSize);
instrumentMap.put(threadName, instrumentList);
}
if(instrumentList.size() >= batchSize -1){
instrumentList.addAll(entityList);
recordSaver.persist(instrumentList);
instrumentList.clear();
} else {
instrumentList.addAll(entityList);
}
} finally {
lock.unlock();
}
}
There is one more separate thread that runs every 2 minutes (using the same lock) to persist all the records in the Map (to make sure something is persisted every 2 minutes and the map does not get too big).
if(//Some condition) {
Thread.sleep(//2 minutes);
aggregator.getLock().lock();
List<T> instrumentList = instrumentMap.values().stream().flatMap(x->x.stream()).collect(Collectors.toList());
if(instrumentList.size() > 0) {
saver.persist(instrumentList);
instrumentMap .values().parallelStream().forEach(x -> x.clear());
aggregator.getLock().unlock();
}
}
This solution works fine in almost every scenario that we tested, except that sometimes some of the records go missing, i.e. they are not persisted at all, although they were added to the Map correctly.
My questions are:
What is the problem with this code?
Is ConcurrentHashMap not the best solution here?
Does the List that is used with the ConcurrentHashMap have an issue?
Should I use the compute method of ConcurrentHashMap here (no need I think, as ReentrantLock is already doing the same job)?
The answer provided by @Slaw in the comments did the trick. We were letting the instrumentList instance escape in a non-synchronized way, i.e. operations were happening on the list without any synchronization. Passing a copy to the downstream methods fixed it.
Following line of code is the one where this issue was happening
recordSaver.persist(instrumentList);
instrumentList.clear();
Here we are allowing the instrumentList instance to escape in a non-synchronized way, i.e. it is passed to another class (recordSaver.persist) where it is to be processed, but we also clear the list on the very next line (in the Aggregator class), and all of this happens without synchronization. The state of the list can't be predicted in the record saver... a really stupid mistake.
We fixed the issue by passing a cloned copy of instrumentList to the recordSaver.persist(...) method. This way instrumentList.clear() has no effect on the list available to recordSaver for further operations.
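For illustration, here is a stripped-down sketch of that fix. CopyingAggregator and the saver callback (standing in for recordSaver.persist) are hypothetical; the real classes were not posted in full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical aggregator; `saver` stands in for recordSaver.persist(...).
class CopyingAggregator<T> {
    private final List<T> buffer = new ArrayList<>();
    private final Consumer<List<T>> saver;
    private final int batchSize;

    CopyingAggregator(Consumer<List<T>> saver, int batchSize) {
        this.saver = saver;
        this.batchSize = batchSize;
    }

    synchronized void addAll(List<T> items) {
        buffer.addAll(items);
        if (buffer.size() >= batchSize) {
            // Hand over a copy, so clearing the buffer below cannot
            // affect a saver that keeps or processes the list later.
            saver.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

If the saver had received the buffer itself and worked on it asynchronously, the clear() on the next line would have emptied the very list it was still reading.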
I see that you are using ConcurrentHashMap's parallelStream within a lock. I am not knowledgeable about Java 8+ stream support, but a quick search shows that:
ConcurrentHashMap is a complex data structure that has had concurrency bugs in the past
Parallel streams must abide by complex and poorly documented usage restrictions
You are modifying your data within a parallel stream
Based on that information (and my gut-driven concurrency bugs detector™), I would guess that removing the call to parallelStream might improve the robustness of your code. In addition, as mentioned by @Slaw, you should use an ordinary HashMap in place of ConcurrentHashMap if all instrumentMap usage is already guarded by the lock.
Of course, since you don't post the code of recordSaver, it is possible that it too has bugs (and not necessarily concurrency-related ones). In particular, you should make sure that the code that reads records from persistent storage (the one that you are using to detect loss of records) is safe, correct, and properly synchronized with the rest of your system (preferably by using a robust, industry-standard SQL database).
It looks like this was an attempt at optimization where it was not needed. In that case, less is more and simpler is better. In the code below, only two concepts for concurrency are used: synchronized to ensure a shared list is properly updated and final to ensure all threads see the same value.
import java.util.ArrayList;
import java.util.List;
public class Aggregator<T> implements Runnable {
private final List<T> instruments = new ArrayList<>();
private final RecordSaver recordSaver;
private final int batchSize;
public Aggregator(RecordSaver recordSaver, int batchSize) {
super();
this.recordSaver = recordSaver;
this.batchSize = batchSize;
}
public synchronized void addAll(List<T> moreInstruments) {
instruments.addAll(moreInstruments);
if (instruments.size() >= batchSize) {
storeInstruments();
}
}
public synchronized void storeInstruments() {
if (instruments.size() > 0) {
// in case recordSaver works async
// recordSaver.persist(new ArrayList<T>(instruments));
// else just:
recordSaver.persist(instruments);
instruments.clear();
}
}
@Override
public void run() {
while (true) {
try { Thread.sleep(1L); } catch (Exception ignored) {
break;
}
storeInstruments();
}
}
class RecordSaver {
void persist(List<?> l) {}
}
}

Return an object from an array list in a thread safe way?

I have a class in which I am populating a map liveSocketsByDatacenter from a single background thread every 30 seconds inside updateLiveSockets() method and then I have a method getNextSocket() which will be called by multiple reader threads to get a live socket available which uses the same map to get this information.
public class SocketManager {
private static final Random random = new Random();
private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
private final AtomicReference<Map<Datacenters, List<SocketHolder>>> liveSocketsByDatacenter =
new AtomicReference<>(Collections.unmodifiableMap(new HashMap<>()));
private final ZContext ctx = new ZContext();
// Lazy Loaded Singleton Pattern
private static class Holder {
private static final SocketManager instance = new SocketManager();
}
public static SocketManager getInstance() {
return Holder.instance;
}
private SocketManager() {
connectToZMQSockets();
scheduler.scheduleAtFixedRate(new Runnable() {
public void run() {
updateLiveSockets();
}
}, 30, 30, TimeUnit.SECONDS);
}
// during startup, making a connection and populate once
private void connectToZMQSockets() {
Map<Datacenters, ImmutableList<String>> socketsByDatacenter = Utils.SERVERS;
// The map in which I put all the live sockets
Map<Datacenters, List<SocketHolder>> updatedLiveSocketsByDatacenter = new HashMap<>();
for (Map.Entry<Datacenters, ImmutableList<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> addedColoSockets = connect(entry.getKey(), entry.getValue(), ZMQ.PUSH);
updatedLiveSocketsByDatacenter.put(entry.getKey(),
Collections.unmodifiableList(addedColoSockets));
}
// Update the map content
this.liveSocketsByDatacenter.set(Collections.unmodifiableMap(updatedLiveSocketsByDatacenter));
}
private List<SocketHolder> connect(Datacenters colo, List<String> addresses, int socketType) {
List<SocketHolder> socketList = new ArrayList<>();
for (String address : addresses) {
try {
Socket client = ctx.createSocket(socketType);
// Set random identity to make tracing easier
String identity = String.format("%04X-%04X", random.nextInt(), random.nextInt());
client.setIdentity(identity.getBytes(ZMQ.CHARSET));
client.setTCPKeepAlive(1);
client.setSendTimeOut(7);
client.setLinger(0);
client.connect(address);
SocketHolder zmq = new SocketHolder(client, ctx, address, true);
socketList.add(zmq);
} catch (Exception ex) {
// log error
}
}
return socketList;
}
// this method will be called by multiple threads to get the next live socket
// is there any concurrency or thread safety issue or race condition here?
public Optional<SocketHolder> getNextSocket() {
// For the sake of consistency make sure to use the same map instance
// in the whole implementation of my method by getting my entries
// from the local variable instead of the member variable
Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter =
this.liveSocketsByDatacenter.get();
Optional<SocketHolder> liveSocket = Optional.absent();
List<Datacenters> dcs = Datacenters.getOrderedDatacenters();
for (Datacenters dc : dcs) {
liveSocket = getLiveSocket(liveSocketsByDatacenter.get(dc));
if (liveSocket.isPresent()) {
break;
}
}
return liveSocket;
}
// is there any concurrency or thread safety issue or race condition here?
private Optional<SocketHolder> getLiveSocketX(final List<SocketHolder> endpoints) {
if (!CollectionUtils.isEmpty(endpoints)) {
// The list of live sockets
List<SocketHolder> liveOnly = new ArrayList<>(endpoints.size());
for (SocketHolder obj : endpoints) {
if (obj.isLive()) {
liveOnly.add(obj);
}
}
if (!liveOnly.isEmpty()) {
// The list is not empty so we shuffle it and return the first element
Collections.shuffle(liveOnly);
return Optional.of(liveOnly.get(0));
}
}
return Optional.absent();
}
// Added the modifier synchronized to prevent concurrent modification
// it is needed because to build the new map we first need to get the
// old one so both must be done atomically to prevent consistency issues
private synchronized void updateLiveSockets() {
Map<Datacenters, ImmutableList<String>> socketsByDatacenter = Utils.SERVERS;
// Initialize my new map with the current map content
Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter =
new HashMap<>(this.liveSocketsByDatacenter.get());
for (Entry<Datacenters, ImmutableList<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> liveSockets = liveSocketsByDatacenter.get(entry.getKey());
List<SocketHolder> liveUpdatedSockets = new ArrayList<>();
for (SocketHolder liveSocket : liveSockets) { // LINE A
Socket socket = liveSocket.getSocket();
String endpoint = liveSocket.getEndpoint();
Map<byte[], byte[]> holder = populateMap();
Message message = new Message(holder, Partition.COMMAND);
boolean status = SendToSocket.getInstance().execute(message.getAdd(), holder, socket);
boolean isLive = (status) ? true : false;
// is there any problem the way I am using `SocketHolder` class?
SocketHolder zmq = new SocketHolder(socket, liveSocket.getContext(), endpoint, isLive);
liveUpdatedSockets.add(zmq);
}
liveSocketsByDatacenter.put(entry.getKey(),
Collections.unmodifiableList(liveUpdatedSockets));
}
this.liveSocketsByDatacenter.set(Collections.unmodifiableMap(liveSocketsByDatacenter));
}
}
As you can see in my class:
From a single background thread which runs every 30 seconds, I populate liveSocketsByDatacenter map with all the live sockets in updateLiveSockets() method.
And then from multiple threads, I call the getNextSocket() method to give me a live socket available which uses a liveSocketsByDatacenter map to get the required information.
I have my code working fine without any issues and wanted to see if there is any better or more efficient way to write this. I also wanted to get an opinion on thread safety issues or any race conditions if any are there, but so far I haven't seen any but I could be wrong.
I am mostly worried about updateLiveSockets() method and getLiveSocketX() method. I am iterating liveSockets which is a List of SocketHolder at LINE A and then making a new SocketHolder object and adding to another new list. Is this ok here?
Note: SocketHolder is an immutable class.
Neither of codes B or C is thread-safe.
Code B
When you are iterating over the endpoints list to copy it, nothing prevents another thread from modifying it, i.e. adding and/or removing elements.
Code C
Assuming endpoints is not null, you are doing three calls to the list object: isEmpty, size, and get. There are several problems from a concurrency perspective:
Based on the type List<SocketHolder> of the argument, there is no guarantee that these methods make internal changes to the list visible to other threads (memory visibility), let alone protect against race conditions (if the list is modified while your thread executes one of these methods).
Let's suppose that the list endpoints provides the guarantee described just before - e.g. it has been wrapped with Collections.synchronizedList(). In this case, thread safety is still missing because between each of the calls to isEmpty, size, and get, the list can be modified while your thread executes the getLiveSocketX method. This could make your code use an outdated state of the list. For instance, you could use a size returned by endpoints.size() which is no longer valid because an element has been added to or removed from the list.
Edit - after code update
In the code you provided, it seems at first sight that:
1. You are indeed not co-modifying the endpoints list we were discussing before in the method getLiveSocketX, because the method updateLiveSockets creates a new list liveUpdatedSockets which you populate from the existing liveSockets.
2. You use an AtomicReference to keep a map of Datacenters to the lists of sockets of interest. The consequence of this AtomicReference is to force memory visibility from this map down to all the lists and their elements. This means that, as a side effect, you are protected from memory inconsistencies between your "producer" and "consumer" threads (executing updateLiveSockets and getLiveSocket respectively). You are still exposed to race conditions, though - let's imagine updateLiveSockets and getLiveSocket running at the same time. Consider a socket S whose status has just switched from alive to closed. updateLiveSockets will see the status of socket S as non-alive and create a new SocketHolder accordingly. However, getLiveSocket, which is running at the exact same time, will see an outdated state of S - since it will still use the list of sockets which updateLiveSockets is re-creating.
3. The synchronized keyword used on the method updateLiveSockets does not provide you any guarantee here, because no other part of the code is also synchronized.
To summarize, I would say:
The code of getLiveSocketX as it is written is not inherently thread-safe;
However, the way you copy the lists prevents concurrent modifications; and you are benefiting from a side-effect of the AtomicReference to have the minimal guarantee on memory visibility one would expect to get consistent list of sockets in getNextSocket after those have been generated from another thread;
You are still exposed to race conditions as described in (2), but this may be fine depending on the specifications you wish to confer to the getLiveSocket and getNextSocket methods - you may accept one socket returned by the getLiveSocket to be unavailable and have a retry mechanism.
All of that being said, I would thoroughly review and refactor the code to exhibit a more readable and explicit thread-safe consumer/producer pattern. Extra care should be taken with the use of the AtomicReference and the single synchronized, which seem to me to be improperly used - although in the end the AtomicReference does help you as discussed before.
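To make the producer/consumer pattern discussed above more explicit, here is a minimal sketch of publishing an immutable snapshot through an AtomicReference. LiveEndpointRegistry and its methods are hypothetical, and plain strings stand in for SocketHolder to keep it short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical registry; endpoints are plain strings to keep the sketch small.
class LiveEndpointRegistry {
    private final AtomicReference<List<String>> live =
            new AtomicReference<>(Collections.<String>emptyList());

    // Producer side: build a fresh list, then publish it in one step.
    void publish(List<String> freshlyChecked) {
        live.set(Collections.unmodifiableList(new ArrayList<>(freshlyChecked)));
    }

    // Consumer side: read the reference once and work only on that snapshot.
    String firstOrNull() {
        List<String> snapshot = live.get();
        return snapshot.isEmpty() ? null : snapshot.get(0);
    }
}
```

A reader may still get an endpoint that died a moment ago, which is the race described above; the snapshot only guarantees that the list it is looking at does not change underneath it.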

Is it safe to loop over the same List simultaneously?

I have a list
testList= new ArrayList<String>(); // gets initialized once at the very beginning init(). This list never changes. It's static and final.
I have a static method that checks if an input value is in this List :
public static boolean isInList (String var1)
throws InterruptedException {
boolean in = false;
for (String s : testList) {
if (s.equals(var1))
{
in = true;
break;
}
}
return in;
}
I have a lot of threads that use this method concurrently and check if a certain value is in this list. It seems to work fine. However, I'm not sure if this is safe. Am I doing this correctly? Is it thread safe?
It is thread-safe as long as no thread is modifying the list while other threads are reading it.
If you are using iterators over the list, they will "fail-fast" (as in throw ConcurrentModificationException) if the list is modified under them. Other methods of accessing (i.e. get(n)) won't throw an exception but may return unexpected results.
This is all covered in great detail in the Javadoc for List and ArrayList, which you should study carefully.
ArrayList is not a thread-safe class. It may work for you now, but in general, when working with threads, you should make sure you're using thread-safe objects that will work with your threads as you expect.
You can use Collections.synchronizedList()
testList = Collections.synchronizedList(new ArrayList<String>());
As long as you can guarantee that no one is writing to the list, it's safe.
Note that even if the list is static and final, the code itself doesn't guarantee that the list is never modified. I recommend wrapping it with Collections.unmodifiableList(), because no element can be added or removed through the wrapper. Keep in mind that the wrapper is only a view: the list is truly fixed only if no other reference to the underlying list is kept.
By the way, you can rewrite your code to this:
public static boolean isInList(String var1) {
for (String s : testList) {
if (Objects.equals(s, var1)) {
return true;
}
}
return false;
}
or just
testList.contains(var1);
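Regarding Collections.unmodifiableList() mentioned above, a small sketch of what the wrapper does and does not protect against. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class illustrating view semantics.
class UnmodifiableViewDemo {

    // Returns true if a write through the read-only view is rejected.
    static boolean viewRejectsWrites() {
        List<String> view = Collections.unmodifiableList(new ArrayList<String>());
        try {
            view.add("x");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    // Returns the size seen through the view after the backing list changed.
    static int sizeSeenThroughView() {
        List<String> backing = new ArrayList<>();
        List<String> view = Collections.unmodifiableList(backing);
        backing.add("x"); // still allowed via the original reference
        return view.size();
    }
}
```

So for a list that must never change, wrap it once during initialization and do not keep the original reference around.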

Synchronized List/Map in Java if only one thread is writing to it

The first thread is filling a collection continuously with objects. A second thread needs to iterate over these objects, but it will not change the collection.
Currently I use a synchronized wrapper from Collections to make it thread-safe, but is there a faster way to do it?
Update
It's simple: The first thread (ui) continuously writes the mouse position to the ArrayList, as long as the mousebutton is pressed down. The second thread (render) draws a line based on the list.
Use java.util.concurrent.ArrayBlockingQueue, an implementation of BlockingQueue.
It is well suited to producer-consumer cases such as yours.
You can also configure access policy. Javadoc explains access policy like this:
fair - if true then queue accesses for threads blocked on insertion or removal, are processed in FIFO order; if false the access order is unspecified.
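A minimal sketch of the hand-off, assuming integers stand in for mouse positions; QueueHandoffDemo and transfer are made-up names, and the capacity of 4 is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo; integers stand in for mouse positions.
class QueueHandoffDemo {

    static List<Integer> transfer(int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4); // bounded
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                received.add(queue.take()); // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }
}
```

Note that take() removes elements from the queue, so a render thread that has to redraw the whole line would need to keep the received points in a list of its own.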
Even if you synchronize the list, it's not necessarily thread-safe while iterating over it, so make sure you synchronize on it:
synchronized(synchronizedList) {
for (Object o : synchronizedList) {
doSomething();
}
}
Edit:
Here's a very clearly written article on the matter:
http://java67.blogspot.com/2014/12/how-to-synchronize-arraylist-in-java.html
As mentioned in comments, you need explicit synchronization on this list, because iteration is not atomic:
List<?> list = // ...
Thread 1:
synchronized(list) {
list.add(o);
}
Thread 2:
synchronized(list) {
for (Object o : list) {
// do actions on object
}
}
There are 3 options which I can currently think of to handle concurrency in ArrayList:-
Using Collections.synchronizedList(list) - currently you are using it.
CopyOnWriteArrayList - behaves much like the ArrayList class, except that when the list is modified, instead of modifying the underlying array, a new array is created and the old array is discarded. Writes will be slower than with option 1.
Creating custom ArrayList class using ReentrantReadWriteLock. You can create a wrapper around ArrayList class. Use read lock when reading/iterating/looping and use write lock when adding elements in array.
For e.g:-
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
public class ReadWriteList<E> {
private final List<E> list;
private ReadWriteLock lock = new ReentrantReadWriteLock();
private final Lock r =lock.readLock();
private final Lock w =lock.writeLock();
public ReadWriteList(List<E> list){
this.list=list;
}
public boolean add(E e){
w.lock();
try{
return list.add(e);
}
finally{
w.unlock();
}
}
//Do the same for other modification methods
public E getElement(int index){
r.lock();
try{
return list.get(index);
}
finally{
r.unlock();
}
}
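public List<E> getList(){
r.lock();
try{
// return a copy: handing out the backing list would let callers bypass the lock
return new java.util.ArrayList<>(list);
}
finally{
r.unlock();
}
}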
//Do the same for other read methods
}
If you're reading far more often than writing, you can use CopyOnWriteArrayList
Rather than a List will a Set suit your needs?
If so, you can use Collections.newSetFromMap(new ConcurrentHashMap<>())
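A short sketch of that concurrent set, showing that it can be modified while it is being iterated. The demo class is hypothetical; the iterators of ConcurrentHashMap-backed sets are documented as weakly consistent and do not throw ConcurrentModificationException.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class for a ConcurrentHashMap-backed set.
class ConcurrentSetDemo {

    static int addWhileIterating() {
        Set<String> set = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
        set.add("a");
        set.add("b");
        set.add("a"); // duplicate, ignored
        for (String s : set) {
            // Modifying during iteration is allowed; the iterator may or
            // may not observe the new element.
            set.add("c");
        }
        return set.size();
    }
}
```

The trade-off is that a Set has no defined order, so it only fits if the consumer does not care in which order the elements were added.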

Updating highly read Lists/Maps in a concurrent environment

The following class acts as a simple cache that gets updated very infrequently (say e.g. twice a day) and gets read quite a lot (up to several times a second). There are two different types, a List and a Map. My question is about the new assignment after the data gets updated in the update method. What's the best (safest) way for the new data to get applied?
I should add that it isn't necessary for readers to see the absolute latest value. The requirements are just to get either the old or the new value at any given time.
public class Foo {
private ThreadPoolExecutor _executor;
private List<Object> _listObjects = new ArrayList<Object>(0);
private Map<Integer, Object> _mapObjects = new HashMap<Integer, Object>();
private Object _mutex = new Object();
private boolean _updateInProgress;
public void update() {
synchronized (_mutex) {
if (_updateInProgress) {
return;
} else {
_updateInProgress = true;
}
}
_executor.execute(new Runnable() {
@Override
public void run() {
try {
List<Object> newObjects = loadListObjectsFromDatabase();
Map<Integer, Object> newMapObjects = loadMapObjectsFromDatabase();
/*
* this is the interesting part
*/
_listObjects = newObjects;
_mapObjects = newMapObjects;
} catch (final Exception ex) {
// error handling
} finally {
synchronized (_mutex) {
_updateInProgress = false;
}
}
}
});
}
public Object getObjectById(Integer id) {
return _mapObjects.get(id);
}
public List<Object> getListObjects() {
return new ArrayList<Object>(_listObjects);
}
}
As you see, currently no ConcurrentHashMap or CopyOnWriteArrayList is used. The only synchronisation is done in the update method.
Although not necessary for my current problem, it would be also great to know the best solution for cases where it is essential for readers to always get the absolute latest value.
You could use plain synchronization unless you are reading over 10,000 times per second.
If you want concurrent access I would use one of the concurrent collections like ConcurrentHashMap or CopyOnWriteArrayList. These are simpler to use than synchronizing the collection (i.e. you don't need them for performance reasons; use them for simplicity).
BTW: A modern CPU can perform billions of operations in 0.1 seconds so several times a second is an eternity to a computer.
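As a sketch of the plain-synchronization option, assuming strings as the cached values; SynchronizedCache and its method names are invented here and the database loading is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical cache guarded by one intrinsic lock.
class SynchronizedCache {
    private List<String> items = new ArrayList<>();
    private Map<Integer, String> byId = new HashMap<>();

    // Writer: swap both collections inside one critical section.
    synchronized void replace(List<String> newItems, Map<Integer, String> newById) {
        items = new ArrayList<>(newItems);
        byId = new HashMap<>(newById);
    }

    synchronized String getById(Integer id) {
        return byId.get(id);
    }

    synchronized List<String> getItems() {
        return new ArrayList<>(items); // copy, so callers never share state
    }
}
```

Because readers and the writer use the same lock, a reader sees either the old pair or the new pair of collections, and it always sees the latest completed update.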
I am also seeing this issue and can think of multiple solutions:
Use a synchronized block in both places, one where the collection is read and one where it is written.
Keep a separate removal list and add every item to be removed to it. Remove them in the same thread that reads the list, right after reading is done. This way reading and deleting happen in sequence and no error occurs.
