Accurate data on a map accessed by many threads - java

I am trying to sort objects into five separate groups depending on a weight given to them at instantiation. To place an object into the right group, it must be compared against the objects already in the groups.
The problem I'm having is that these objects are added to the groups on separate worker threads. After an object has finished downloading a picture, it is sent to the synchronized sorting function, which compares it against all members currently in the groups.
The groups have been set up as two different maps. The first being a Hashtable, which crashes the program throwing an unknown ConcurrencyIssue. When I use a ConcurrentHashMap, the data is wrong because it doesn't remove the entry in time before the next object is compared against the ConcurrentHashmap. So this causes a logic error and yields groups that are sorted correctly only half of the time.
I need the hashmap to immediately remove the entry from the map before the next sort occurs... I thought synchronizing the function would do this but it still doesn't seem to work.
Is there a better way to sort objects against each other when they are being added to a data structure by worker threads? Thanks! I'm a little lost on this one.
private synchronized void sortingHat(Moment moment) {
    try {
        ConcurrentHashMap[] helperList = {postedOverlays, chanl_2, chanl_3, chanl_4, chanl_5};
        Moment moment1 = moment;
        // Iterate over all channels going from highest channel to lowest
        for (int i = channelCount - 1; i > 0; i--) {
            ConcurrentHashMap<String, Moment> table = helperList[i];
            Set<String> keys = table.keySet();
            boolean mOverlap = false;
            double width = getWidthbyChannel(i);
            // If there are no objects in the table, don't bother trying to compare...
            if (!table.isEmpty()) {
                // Iterate over all objects currently in the hashmap
                for (String objId : keys) {
                    Moment moment2 = table.get(objId);
                    // x-overlap
                    if ((moment2.x + width >= moment1.x - width) ||
                            (moment2.x - width <= moment1.x + width)) {
                        // y-overlap
                        if ((moment2.y + width >= moment1.y - width) ||
                                (moment2.y - width <= moment1.y + width)) {
                            // If there is overlap, only keep the moment with the greater weight.
                            if (moment1.weight >= moment2.weight) {
                                mOverlap = true;
                                table.remove(objId);
                                table.put(moment1.id, moment1);
                            }
                        }
                    }
                }
            }
            // If there is no overlap, add to the channel anyway
            if (!mOverlap) {
                table.put(moment1.id, moment1);
            }
        }
    } catch (Exception e) {
        Log.d("SortingHat", e.toString());
    }
}
The table.remove(objId) call is where the problems occur. Moment A gets sent to the sorting function and has no problems. Moment B is added, it overlaps, and it is compared against Moment A. If Moment B has less weight than Moment A, everything is fine. If Moment B weighs more and A has to be removed, then by the time Moment C gets sorted, Moment A will still be in the hashmap along with Moment B. That seems to be where the logic error is.

You are having an issue with your synchronization.
The synchronized you use locks on the "this" object. You can imagine it like this:
public synchronized void foo() { ... }
is the same as
public void foo() {
    synchronized(this) {
        ....
    }
}
This means that before entering the method, the current thread will try to acquire "this object" as a lock. Now, if you have a worker thread that also has a synchronized method of its own (for adding stuff to the table), those two methods lock on different objects, so they won't totally exclude each other. What you want is that one thread has to finish its work before the next one can start its work.
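To illustrate (class and method names here are invented, not from your code): two synchronized instance methods on different objects use different monitors and therefore do not block each other.

class Sorter {
    // locks on the Sorter instance ("this")
    public synchronized void sortingHat(Object moment) { /* compare, remove, put */ }
}

class Worker {
    // locks on the Worker instance, a different object
    public synchronized void addToTable(Object moment) { /* add stuff to the table */ }
}

// A thread inside sortingHat() and a thread inside addToTable() can run at the
// same time, because each one holds its own monitor.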
The first being a Hashtable, which crashes the program throwing an unknown ConcurrencyIssue.
This problem occurs because two threads may call something on the map at the same time. To illustrate, imagine one thread calling put(key, value) on it and another thread calling remove(key). If those calls get executed at the same time (for example on different cores), what will the resulting Hashtable be? Because no one can say for sure, a ConcurrentModificationException is thrown. Note: this is a very simplified explanation!
When I use a ConcurrentHashMap, the data is wrong because it doesn't remove the entry in time before the next object is compared against the ConcurrentHashmap
The ConcurrentHashMap is a utility for avoiding said concurrency issues; it is not a magical, multi-functional, unicorn-hunting butter knife. It synchronizes the individual method calls, which means that only one thread at a time can add to, remove from, or otherwise work on the map. It does not have the same functionality as a lock of some sort, which would give one thread exclusive access over the whole map.
There could be one thread that wants to call add and one that wants to call remove. The ConcurrentHashMap only limits those calls in that they can't happen at the same time. Which comes first? You have no power over that (in this scenario). What you want is that one thread has to finish its work before the next one can do its work.
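A rough sketch of that idea (the shared lock object and the second method are made up for illustration): every compound "compare, remove, put" sequence goes through one common lock, so a full sorting pass finishes before the next one starts.

private final Object tableLock = new Object(); // shared by every thread touching the maps

void sortingHat(Moment moment) {
    synchronized (tableLock) {
        // compare against the maps, remove and put here as one indivisible step
    }
}

// hypothetical worker-side method that also modifies the same maps
void addFromWorker(Moment moment) {
    synchronized (tableLock) {
        // any other mutation of the same maps
    }
}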
What you really need is up to you. The java.util.concurrent package brings a whole arsenal of classes you could use. For example:
You could use a lock for each map. With that, each thread (whether sorting, removing, adding or whatever) would first fetch the Lock for said map and then work on that map, like this:
public class Worker implements Runnable {

    private int idOfMap = ...;

    @Override
    public void run() {
        Lock lock = getLock(idOfMap);
        lock.lock();
        try {
            // The work goes here
            // ...
        } finally {
            lock.unlock();
        }
    }
}
The line lock.lock() ensures that no other thread is currently working on the map and modifying it; once the call returns, this thread has exclusive access over the map. No one sorts before you are finished removing the right element.
Of course, you would somehow have to hold said locks, for example in a data object. That said, you could also utilize a Semaphore, use synchronized(map) in each thread, or formulate your work on the map as Runnables and pass those to a single thread that executes all the Runnables it receives one by one. The possibilities are nearly endless. I personally would recommend starting with the lock.
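A minimal sketch of holding those locks in a data object (the ChannelLocks class and its getLock method are illustrative, not part of your code):

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// One ReentrantLock per channel map; a worker fetches the lock for "its" map by index.
public class ChannelLocks {

    private final Lock[] locks;

    public ChannelLocks(int channelCount) {
        locks = new Lock[channelCount];
        for (int i = 0; i < channelCount; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    public Lock getLock(int idOfMap) {
        return locks[idOfMap];
    }
}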

Related

Atomic multi-entry operations on ConcurrentHashMap

I need to perform a two-entry concurrent operation on a ConcurrentHashMap atomically.
I have a ConcurrentHashMap of Client, with an Integer id as key; every client has a selectedId attribute, which contains the id of another client or of itself (meaning nobody is selected).
At every clientChangedSelection(int whoChangedSelection) concurrent event, I need to check atomically if both the client and the selected client are referencing each other. If they do, they get removed and returned.
In the meantime clients can be added or removed by other threads.
The "ideal" solution would be to have a lock for every entry and lock the affected entries, every clientChangedSelection runs in it's own thread so they would wait if necessary. Of course that's not practical. On top of that, ConcurrentHashMap doesn't offer apis to manually lock buckets as far as I know. And on top of that again, I've read somewhere that the buckets' locks aren't reentrant. Not sure if that's true or why.
My "imaginative" approach makes heavy use of nested compute() methods to guarantee atomicity. If ConcurrentHashMap's locks aren't reentrant, this won't work. It loses any readability, requires "value capturing" workarounds, and performances are probably bad. But performances aren't much an issue as long as they don't affect threads working on unrelated entries. (i.e. in different buckets).
public Client[] match(int id) {
    final Client players[] = new Client[]{null, null};
    clients.computeIfPresent(id, (idA, playerA) -> {
        if (playerA.selectedId != idA) {
            clients.computeIfPresent(playerA.selectedId, (idB, playerB) -> {
                if (playerB.selectedId == idA) {
                    players[0] = playerA;
                    players[1] = playerB;
                    return null;
                } else {
                    return playerB;
                }
            });
        }
        if (players[0] == null) {
            return playerA;
        } else {
            return null;
        }
    });
    if (players[0] == null) {
        return null;
    } else {
        return players;
    }
}
The "unacceptable" approach synchronizes the entire match method. This invalidates the point of having concurrent events in the first place.
The "wrong" approach temporarily removes the two clients while working with them, and adds them back in case. This makes concurrent events using the entries fail instead of waiting, as "in use" becomes indistinguishable from "not present".
I think I'll go back to a timer which inspects the whole map in one pass every n seconds. No additional synchronization would be required, but it's less elegant.
This is, more or less, a common concurrency situation, but it's made interesting by the ConcurrentHashMap, which discourages reinventing the wheel too much.
What would your approach be? Any suggestions?
Edit 1
Synchronizing every access (thus defeating the point of using a ConcurrentHashMap) is not a viable solution either. Concurrent access must be preserved, else the problem itself wouldn't exist.
I've removed the selectedId parameter from match(), but note that doesn't really matter. The fictitious event clientChangedSelection(int whoChangedSelection) represents the concurrent event. Could happen any time in any operating thread. match() is just an example function that gets called to handle the matching. Hope I made it clearer.
Edit 2
This is the doubly-synchronized function I ended up with. idSelect() is an example of a method that requires synchronization, as it modifies client attributes. Synchronization for put() and remove() is not required in this case; what the function sees is recent enough.
There happens to be two checks: the first one is there just to get the clients to synchronize onto, the second one is there to tell if a previously executed match succeeded and removed the client, while the current one was waiting.
match() can't match the same client twice, and that was important (the atomic part).
match() can still match concurrently removed clients (removed with classic map apis, not by the same function), and that's tolerable.
public void idSelected(int id, int selectedId) {
    Client playerA = clients.get(id);
    if (playerA != null) {
        synchronized (playerA) {
            playerA.selectedId = selectedId;
        }
    }
}

public Client[] match(int id, int selectedId) {
    // determine if the players exist, in order to have something to synchronize onto
    Client playerA = clients.get(id);
    if (playerA == null) {
        return null;
    }
    Client playerB = clients.get(selectedId);
    if (playerB == null) {
        return null;
    }
    // sort the players in order to do nested synchronization safely
    if (id > selectedId) {
        final Client t = playerA;
        playerA = playerB;
        playerB = t;
    }
    // check under synchronization
    synchronized (playerA) {
        if (clients.containsKey(playerA.id)) {
            synchronized (playerB) {
                if (clients.containsKey(playerB.id)) {
                    if (playerA.selectedId == playerB.id && playerB.selectedId == playerA.id) {
                        clients.remove(id);
                        clients.remove(selectedId);
                        return new Client[]{playerA, playerB};
                    }
                }
            }
        }
    }
    return null;
}

ConcurrentSkipListMap how to make remove and add calls atomic

I have N threads that add values and one removing thread. I am thinking about the best way to synchronize adding to an existing list of values with removing that list.
I guess following case is possible:
thread 1 checked condition containsKey, and entered in else block
thread 2 removed the value
thread 1 try to add value to existing list, and get returns null
I think the only approach I can use is syncing on the map value, in our case the List, both when adding and when deleting:
private ConcurrentSkipListMap<LocalDateTime, List<Task>> tasks = new ConcurrentSkipListMap<>();

// Threads 1, 3 ... N
public void add(LocalDateTime time, Task task) {
    if (!tasks.containsKey(time)) {
        tasks.computeIfAbsent(time, k -> createValue(task));
    } else {
        // potentially should be synced
        tasks.get(time).add(task);
    }
}

private List<Task> createValue(Task val) {
    return new ArrayList<>(Arrays.asList(val));
}

// thread 2
public void remove() {
    while (true) {
        Map.Entry<LocalDateTime, List<Task>> keyVal = tasks.firstEntry();
        if (isSomeCondition(keyVal)) {
            tasks.remove(keyVal.getKey());
            for (Task t : keyVal.getValue()) {
                // do task processing
            }
        }
    }
}
Regarding the add part, you would really be inclined to use merge, but the documentation is pretty clear about it, saying that it is not guaranteed to happen atomically.
I would replace your add with merge, but under a lock:
SomeLock lock = ...;

public void add(LocalDateTime time, Task task) {
    lock.lock();
    try {
        tasks.merge(...);
    } finally {
        lock.unlock();
    }
}
And the same for the remove method. But then, if you are doing things under a lock, there is no need for a ConcurrentSkipListMap in the first place.
On the other hand, if you are OK with changing to a ConcurrentHashMap - it has a merge that is atomic, for example.
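For example, a hedged sketch of what add could look like with a ConcurrentHashMap, whose merge is documented to be atomic (the remapping function builds a new list instead of mutating the old one, since other threads may still be reading it):

private final ConcurrentHashMap<LocalDateTime, List<Task>> tasks = new ConcurrentHashMap<>();

public void add(LocalDateTime time, Task task) {
    tasks.merge(time, Collections.singletonList(task), (oldList, newList) -> {
        List<Task> combined = new ArrayList<>(oldList); // copy, never modify the stored list
        combined.addAll(newList);
        return combined;
    });
}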
It's not entirely clear what your remove() method is supposed to do. In its current form, it's an infinite loop: first, it will iterate over the head elements and remove them, until the condition is not met for the head element; then, it will repeatedly poll for that head element and re-evaluate the condition. Unless it manages to remove all elements, in which case it will bail out with an exception.
If you want to process all elements currently in the map, you may simply loop over it; the weakly consistent iterators allow you to proceed while modifying it, and you may or may not notice ongoing concurrent updates.
If you want to process the matching head elements only, you have to insert a condition to either return to the caller or put the thread to sleep (or, better, add a notification mechanism), to avoid burning the CPU with a repeatedly failing test (or even throwing when the map is empty).
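A rough sketch of the "back off instead of spinning" idea (the sleep interval is arbitrary; a real solution might rather use a condition or a blocking queue that the adding side signals):

public void remove() throws InterruptedException {
    while (true) {
        Map.Entry<LocalDateTime, List<Task>> head = tasks.firstEntry(); // null if the map is empty
        if (head == null || !isSomeCondition(head)) {
            TimeUnit.MILLISECONDS.sleep(100); // don't burn CPU re-testing a failing head
            continue;
        }
        if (tasks.remove(head.getKey(), head.getValue())) { // succeeds only if still unchanged
            for (Task t : head.getValue()) {
                // do task processing
            }
        }
    }
}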
Besides that, you can implement the operations using ConcurrentSkipListMap when you ensure that there is no interference between the functions. Assuming remove is supposed to process all current elements once, the implementation may look like
public void add(LocalDateTime time, Task task) {
    tasks.merge(time, Collections.singletonList(task),
            (l1, l2) -> Stream.concat(l1.stream(), l2.stream()).collect(Collectors.toList()));
}

public void remove() {
    for (Map.Entry<LocalDateTime, List<Task>> keyVal : tasks.entrySet()) {
        final List<Task> values = keyVal.getValue();
        if (isSomeCondition(keyVal) && tasks.remove(keyVal.getKey(), values)) {
            for (Task t : values) {
                // do task processing
            }
        }
    }
}
The key point is that the lists contained in the map are never modified. The merge(time, Collections.singletonList(task), … operation will even store an immutable list of a single task if there was no previous mapping. In case there are previous tasks, the merge function (l1,l2) -> Stream.concat(l1.stream(),l2.stream()).collect(Collectors.toList()) will create a new list rather than modifying the existing ones. This may have a performance impact when the lists become much larger, especially when the operation has to be repeated in the case of contention, but that's the price for not needing a lock or additional synchronization.
The remove operation uses the remove(key, value) method which only succeeds if the map’s value still matches the expected one. This relies on the fact that neither of our methods ever modifies the lists contained in the map, but replaces them with new list instances when merging. If remove(key, value) succeeds, the list can be processed; at this time, it is not contained in the map anymore. Note that during the evaluation of isSomeCondition(keyVal), the list is still contained in the map, therefore, isSomeCondition(keyVal) must not modify it, though, I assume that this should be the case for a testing method like isSomeCondition anyway. Of course, evaluating the list within isSomeCondition also relies on the other methods never modifying the list.

How to fix non-atomic use of get/check/put?

I have a JSONArray which I am iterating to populate my Map as shown below. My ppJsonArray will have data like this -
[693,694,695,696,697,698,699,700,701,702]
Below is my code which is having issues with thread safety as my static analysis tool complained -
Map<Integer, Integer> m = new HashMap<Integer, Integer>();
ConcurrentMap<String, Map<Integer, Integer>> partitionsToNodeMap =
        new ConcurrentHashMap<String, Map<Integer, Integer>>();
int hostNum = 2;
JSONArray ppJsonArray = j.getJSONArray("pp");
for (int i = 0; i < ppJsonArray.length(); i++) {
    m.put(Integer.parseInt(ppJsonArray.get(i).toString()), hostNum);
}

Map<Integer, Integer> tempMap = partitionsToNodeMap.get("PRIMARY");
if (tempMap != null) {
    tempMap.putAll(m);
} else {
    tempMap = m;
}
partitionsToNodeMap.put("PRIMARY", tempMap);
But when I am running static analysis tool, it is complaining as -
Non-atomic use of get/check/put on partitionsToNodeMap.put("PRIMARY", tempMap)
Which makes me think my above code is not thread safe? How can I resolve this issue?
The above code is not thread safe.
Does it need to be thread safe? (i.e., is partitionsToNodeMap used by more than one thread? Could more than one thread run this routine? Or could thread A update partitionsToNodeMap in some other routine while thread B runs this routine?)
If you answered "yes" to any of those questions, then you probably need to use some kind of synchronization.
partitionsToNodeMap is a ConcurrentHashMap. That will prevent the map structure itself from becoming corrupt if it is updated by more than one thread at one time; but the data in the map presumably aren't just random strings and integers. It probably means something to your program. The fact that the map structure itself is protected from corruption will not prevent the higher-level meaning of the map contents from becoming corrupt.
Can you provide an example how can I protect this?
Not a complete one, because thread-safety is a property of the whole program. You can't do thread-safety function-by-function.
Being thread-safe is all about protecting invariants. An invariant is an assertion about your data that must always be true. For example, if you were modeling a game of Monopoly, one invariant would say that the total amount of money in the game must always be $15,140.
If some thread in the Monopoly game processes a payment by taking X dollars away from one player, and returning it to the bank, that's a two step process, and in-between the two steps the invariant is broken. If the first thread were preempted in-between the two steps, and some other thread counted all of the money in the game, it would get the wrong total.
The main use-case for the Java synchronized keyword (or equivalently, for the java.util.concurrent.locks.ReentrantLock class) is to prevent other threads from seeing broken invariants.
Either way of locking is voluntary. To make it work, you must wrap every block of code that can temporarily break an invariant in a protected block
synchronized (bank-lock) {
    deductNDollarsFrom(N, player);
    giveNDollarsTo(N, bank);
}
AND every block of code that cares about the invariant must also be wrapped in a protected block.
synchronized (bank-lock) {
    int totalDollars = countAllMoneyInGame(...);
    if (totalDollars != 15140) {
        throw new CheatingDetectedException(...);
    }
}
Java won't let the balance transfer and the audit happen at the same time because it never allows two threads to synchronize on the same object (bank-lock, in this case) at the same time.
You will have to figure out what your invariants are. The static analyzer is telling you that the get()...put() sequence looks like a block of code that might care about an invariant. You have to figure out whether it really does or not. Is there something that some other thread could do in-between the get() and the put() that could cause things to go south? If so then both blocks of code should synchronize on the same object so that they can not both be executed at the same time.
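If it turns out that other threads can touch partitionsToNodeMap between the get() and the put(), one way to protect the compound step is a sketch like the following, where mapLock is a made-up shared lock object that every code path updating the map must also synchronize on:

synchronized (mapLock) {
    Map<Integer, Integer> tempMap = partitionsToNodeMap.get("PRIMARY");
    if (tempMap != null) {
        tempMap.putAll(m);
    } else {
        partitionsToNodeMap.put("PRIMARY", m);
    }
}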
Your static analysis tool is confused because what you're doing looks like a classic race condition.
Map<Integer, Integer> tempMap = partitionsToNodeMap.get("PRIMARY"); // GET
if (tempMap != null) {                                              // CHECK
    tempMap.putAll(m);
} else {
    tempMap = m;
}
partitionsToNodeMap.put("PRIMARY", tempMap);                        // PUT
If another thread were to call partitionsToNodeMap.put("PRIMARY", ...) after you get and assign tempMap, you would overwrite the other thread's work, among a myriad of other potential bad things. It seems like you don't have multiple threads accessing it, though, so it isn't an issue. However, it would be more clearly expressed as:
Map<Integer, Integer> primaryMap = partitionsToNodeMap.get("PRIMARY");
if (primaryMap != null) {
    primaryMap.putAll(m);
} else {
    partitionsToNodeMap.put("PRIMARY", m);
}
If you want to make the static analysis tool happy, swap out your concurrent map for a regular map. The code you've provided doesn't require a threadsafe data structure.
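If, on the other hand, the map really is shared between threads, one hedged alternative is to let the ConcurrentHashMap do the get/check/put as a single atomic step via merge(); the remapping function below builds a new inner map so the stored values are never mutated in place:

partitionsToNodeMap.merge("PRIMARY", new HashMap<>(m), (existing, incoming) -> {
    Map<Integer, Integer> combined = new HashMap<>(existing); // copy rather than mutate
    combined.putAll(incoming);
    return combined;
});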

Three types of threads attempting to access critical section

For my application I need to be sure that only one type of thread is processing in the critical section. The number of threads for a given type is not specified and could be "large". I've come up with a simple solution:
MutableInt a, b, c;
Semaphore mutex;

void enterA() {
    while (true) {
        mutex.acquire();
        if (b.intValue() == 0 && c.intValue() == 0) {
            a.increase();
            mutex.release();
            break;
        }
        mutex.release();
    }
}

void exitA() {
    mutex.acquire();
    a.decrease();
    mutex.release();
}
I'm skipping the exception handling and the B and C parts because they are just copy-paste.
It works as expected (the possibility of thread starvation is OK), but the generated load is too big: the threads are constantly checking the counters. I feel there is another solution but can't think of an example.
I can't tell if your solution is a portion of the problem but as it stands I'd recommend moving to AtomicInteger which handles all of the incrementing, etc. for you without locking.
If it is more complicated then you should consider using AtomicReference with some accumulator class and use compareAndSet(...) method to update it atomically.
For example, you could store your 3 integers in a MutableInts class and do something like the following:
final AtomicReference<MutableInts> reference =
        new AtomicReference<MutableInts>(new MutableInts(0, 0, 0));
...
MutableInts ints;
MutableInts newInts;
do {
    ints = reference.get();
    // increment the ints properly, which should generate a new MutableInts instance;
    // it should _not_ make changes to `ints` itself
    newInts = ints.mutateSomehow(...);
    // this spins in case some other thread updated it before us here
} while (!reference.compareAndSet(ints, newInts));
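A sketch of what that accumulator class might look like (the mutateSomehow step is represented by a hypothetical incrementA; the essential property is that, despite the name, every instance is immutable and each change produces a new instance, which is what makes compareAndSet meaningful):

final class MutableInts {

    final int a, b, c;

    MutableInts(int a, int b, int c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    // hypothetical example: a type-A thread may only enter while no B or C threads are inside
    MutableInts incrementA() {
        return (b == 0 && c == 0) ? new MutableInts(a + 1, b, c) : null; // null -> caller retries
    }
}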
So it seems like you are limited in the calls you can use to accomplish this. Here are some other alternatives:
Each thread updates its own data and then every so often (or maybe just at the end of processing) synchronizes with the central counters. Same locks, but taken a lot less often.
Each thread could update per-thread volatile counters, and a polling thread could read the counters and update the central information (see the sketch below). Not sure if volatile is allowed.
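A rough sketch of that second alternative (all names invented for illustration): each worker owns one cell and is its only writer, and a single polling thread reads every cell and refreshes the central total without any locks.

import java.util.List;

class Cell {
    volatile long value; // written by exactly one worker thread, read by the poller
}

class Poller implements Runnable {

    private final List<Cell> cells;
    private volatile long centralTotal; // the "central information", read by whoever needs it

    Poller(List<Cell> cells) {
        this.cells = cells;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            long sum = 0;
            for (Cell c : cells) {
                sum += c.value;
            }
            centralTotal = sum;
            try {
                Thread.sleep(50); // poll interval, chosen arbitrarily
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}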

Writing a thread safe modular counter in Java

Full disclaimer: this is not really homework, but I tagged it as such because it is mostly a self-learning exercise rather than actually "for work".
Let's say I want to write a simple thread safe modular counter in Java. That is, if the modulo M is 3, then the counter should cycle through 0, 1, 2, 0, 1, 2, … ad infinitum.
Here's one attempt:
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicModularCounter {

    private final AtomicInteger tick = new AtomicInteger();
    private final int M;

    public AtomicModularCounter(int M) {
        this.M = M;
    }

    public int next() {
        return modulo(tick.getAndIncrement(), M);
    }

    private static final int modulo(int v, int M) {
        return ((v % M) + M) % M;
    }
}
My analysis (which may be faulty) of this code is that since it uses AtomicInteger, it's quite thread safe even without any explicit synchronized method/block.
Unfortunately the "algorithm" itself doesn't quite "work", because when tick wraps around Integer.MAX_VALUE, next() may return the wrong value depending on the modulo M. That is:
System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // true
System.out.println(modulo(Integer.MAX_VALUE, 3)); // 1
System.out.println(modulo(Integer.MIN_VALUE, 3)); // 1
That is, two calls to next() will return 1, 1 when the modulo is 3 and tick wraps around.
There may also be an issue with next() getting out-of-order values, e.g.:
Thread1 calls next()
Thread2 calls next()
Thread2 completes tick.getAndIncrement(), returns x
Thread1 completes tick.getAndIncrement(), returns y = x+1 (mod M)
Here, barring the aforementioned wrapping problem, x and y are indeed the two correct values to return for these two next() calls, but depending on how the counter behavior is specified, it can be argued that they're out of order. That is, we now have (Thread1, y) and (Thread2, x), but maybe it should really be specified that (Thread1, x) and (Thread2, y) is the "proper" behavior.
So by some definition of the words, AtomicModularCounter is thread-safe, but not actually atomic.
So the questions are:
Is my analysis correct? If not, then please point out any errors.
Is my last statement above using the correct terminology? If not, what is the correct statement?
If the problems mentioned above are real, then how would you fix it?
Can you fix it without using synchronized, by harnessing the atomicity of AtomicInteger?
How would you write it such that tick itself is range-controlled by the modulo and never even gets a chance to wraps over Integer.MAX_VALUE?
We can assume M is at least an order of magnitude smaller than Integer.MAX_VALUE if necessary.
Appendix
Here's a List analogy of the out-of-order "problem".
Thread1 calls add(first)
Thread2 calls add(second)
Now, if we have the list updated successfully with two elements added, but second comes before first, which is at the end, is that "thread safe"?
If that is "thread safe", then what is it not? That is, if we specify that in the above scenario, first should always come before second, what is that concurrency property called? (I called it "atomicity" but I'm not sure if this is the correct terminology).
For what it's worth, what is the Collections.synchronizedList behavior with regards to this out-of-order aspect?
As far as I can see you just need a variation of the getAndIncrement() method
public final int getAndIncrement(int modulo) {
    for (;;) {
        int current = atomicInteger.get();
        int next = (current + 1) % modulo;
        if (atomicInteger.compareAndSet(current, next))
            return current;
    }
}
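Folded back into the counter from the question, that might look like the following sketch; tick now always stays in [0, M), so it can never wrap past Integer.MAX_VALUE:

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicModularCounter {

    private final AtomicInteger tick = new AtomicInteger();
    private final int M;

    public AtomicModularCounter(int M) {
        this.M = M;
    }

    public int next() {
        for (;;) {
            int current = tick.get();
            int next = (current + 1) % M; // the stored value never leaves [0, M)
            if (tick.compareAndSet(current, next)) {
                return current;
            }
        }
    }
}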
I would say that aside from the wrapping, it's fine. When two method calls are effectively simultaneous, you can't guarantee which will happen first.
The code is still atomic, because whichever actually happens first, they can't interfere with each other at all.
Basically if you have code which tries to rely on the order of simultaneous calls, you already have a race condition. Even if in the calling code one thread gets to the start of the next() call before the other, you can imagine it coming to the end of its time-slice before it gets into the next() call - allowing the second thread to get in there.
If the next() call had any other side effect - e.g. it printed out "Starting with thread (thread id)" and then returned the next value, then it wouldn't be atomic; you'd have an observable difference in behaviour. As it is, I think you're fine.
One thing to think about regarding wrapping: you can make the counter last an awful lot longer before wrapping if you use an AtomicLong :)
EDIT: I've just thought of a neat way of avoiding the wrapping problem in all realistic scenarios:
Define some large number M * 100000 (or whatever). This should be chosen to be large enough to not be hit too often (as it will reduce performance) but small enough that you can expect the "fixing" loop below to be effective before too many threads have added to the tick to cause it to wrap.
When you fetch the value with getAndIncrement(), check whether it's greater than this number. If it is, go into a "reduction loop" which would look something like this:
long tmp;
while ((tmp = tick.get()) > SAFETY_VALUE) {
    long newValue = tmp - SAFETY_VALUE;
    tick.compareAndSet(tmp, newValue);
}
Basically this says, "We need to get the value back into a safe range, by decrementing some multiple of the modulus" (so that it doesn't change the value mod M). It does this in a tight loop, basically working out what the new value should be, but only making a change if nothing else has changed the value in between.
It could cause a problem in pathological conditions where you had an infinite number of threads trying to increment the value, but I think it would realistically be okay.
Concerning the atomicity problem: I don't believe that it's possible for the Counter itself to provide behaviour to guarantee the semantics you're implying.
I think we have a thread doing some work
A - get some stuff (for example receive a message)
B - prepare to call Counter
C - Enter Counter <=== counter code is now in control
D - Increment
E - return from Counter <==== just about to leave counter's control
F - application continues
The mediation you're looking for concerns the "payload" identity ordering established at A.
For example, two threads each read a message - one reads X, one reads Y. You want to ensure that X gets the first counter increment and Y gets the second, even though the two threads are running simultaneously and may be scheduled arbitrarily across one or more CPUs.
Hence any ordering must be imposed across all the steps A-F, and enforced by some concurrency control outside of the Counter. For example:
pre-A - Get a lock on Counter (or other lock)
A - get some stuff (for example receive a message)
B - prepare to call Counter
C - Enter Counter <=== counter code is now in control
D - Increment
E - return from Counter <==== just about to leave counter's control
F - application continues
post- F - release lock
Now we have a guarantee at the expense of some parallelism; the threads are waiting for each other. When strict ordering is a requirement this does tend to limit concurrency; it's a common problem in messaging systems.
Concerning the List question. Thread-safety should be seen in terms of interface guarantees. There is an absolute minimum requirement: the List must be resilient in the face of simultaneous access from several threads. For example, we could imagine an unsafe list that could deadlock or leave the list mis-linked so that any iteration would loop forever. The next requirement is that we should specify the behaviour when two threads access it at the same time. There are lots of cases; here are a few:
a). Two threads attempt to add
b). One thread adds item with key "X", another attempts to delete the item with key "X"
c). One thread is iterating while a second thread is adding
Providing that the implementation has clearly defined behaviour in each case it's thread-safe. The interesting question is what behaviours are convenient.
We can simply synchronise on the list, and hence easily give well-understood behaviour for a and b. However, that comes at a cost in terms of parallelism. And I'm arguing that it has no value to do this, as you still need to synchronise at some higher level to get useful semantics. So I would have an interface spec saying "Adds happen in any order".
As for iteration - that's a hard problem, have a look at what the Java collections promise: not a lot!
This article, which discusses Java collections, may be interesting.
Atomic (as I understand it) refers to the fact that an intermediate state is not observable from outside. atomicInteger.incrementAndGet() is atomic, while return this.intField++; is not, in the sense that in the former you cannot observe a state in which the integer has been incremented but has not yet been returned.
As for thread-safety, authors of Java Concurrency in Practice provide one definition in their book:
A class is thread-safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or other coordination on the part of the calling code.
(My personal opinion follows)
Now, if we have the list updated successfully with two elements added, but second comes before first, which is at the end, is that "thread safe"?
If thread1 entered the entry set of the mutex object (in the case of Collections.synchronizedList() the list itself) before thread2, it is guaranteed that first is positioned ahead of second in the list after the update. This is because the synchronized keyword uses a fair lock: whoever sits at the head of the queue gets to do stuff first. Fair locks can be quite expensive, and you can also have unfair locks in Java (through the use of java.util.concurrent utilities). If you do that, then there is no such guarantee.
However, the Java platform is not a real-time computing platform, so you can't predict how long a piece of code requires to run. Which means, if you want first ahead of second, you need to ensure this explicitly in Java. It is impossible to ensure this by "controlling the timing" of the calls.
Now, what is thread safe or unsafe here? I think this simply depends on what needs to be done. If you just need to avoid the list being corrupted and it doesn't matter if first is first or second is first in the list, for the application to run correctly, then just avoiding the corruption is enough to establish thread-safety. If it doesn't, it is not.
So, I think thread-safety can not be defined in the absence of the particular functionality we are trying to achieve.
The famous String.hashCode() doesn't use any particular "synchronization mechanism" provided in Java, but it is still thread safe because one can safely use it in their own app without worrying about synchronization, etc.
Famous String.hashCode() trick:
int hash = 0;

int hashCode() {
    int hash = this.hash;
    if (hash == 0) {
        hash = this.hash = calcHash();
    }
    return hash;
}
