How do I get the thread index after calling newthread? - java

In JNLUA, newThread() is a void function in Java but I don't quite understand the C code behind implementing the java side of the function. Also, would someone explain why the original author would return the index/pointer?

newThread pushes the new thread onto the Lua stack. There is no need to return a value. Interfacing via a pointer value would be challenging to make safe (and poor style anyway I guess), and the index is readily available without returning it (e.g. via getTop or simply by keeping track of the stack size or by using negative indices). This is more or less the same for all parts of the API that push a value onto the stack.
Note that JNLUA's newThread differs a bit from Lua's lua_newthread; threads in JNLUA are a little less general and more intended to be used like Lua coroutines (which are, of course, also Lua threads) -- newThread takes the value from the top of the stack and uses it as the "start function" for the coroutine (which will be invoked when calling resume for the thread the first time).
I won't speculate on the author's decision to expose threads in this way; returning a LuaState (and/or exposing lua_tothread) could also be a reasonable way to implement things a bit closer to the original Lua API.
This is the guts of the implementation:
static int newthread_protected (lua_State *L) {
lua_State *T;
T = lua_newthread(L);
lua_insert(L, 1);
lua_xmove(L, T, 1);
return 1;
}
This is done in a protected call because lua_newthread can throw an exception (if it runs out of memory). The only argument (and thus the only thing initially on the stack) is the start function at index 1. lua_newthread will push the new thread object to the stack at index 2. Then lua_insert will reverse the order of the new thread object and the start function and lua_xmove transfers the start function to the new thread and then the new thread is returned (which leaves it on top of the caller's stack).
In Lua, a new thread with the start function already on the stack can sort of be treated equivalently to a yielded thread by lua_resume, so this is basically what JNLUA does -- its resume can be used to start the thread with the previously provided function.

Related

Trying to understand shared variables in java threads

I have the following code :
class thread_creation extends Thread{
int t;
thread_creation(int x){
t=x;
}
public void run() {
increment();
}
public void increment() {
for(int i =0 ; i<10 ; i++) {
t++;
System.out.println(t);
}
}
}
public class test {
public static void main(String[] args) {
int i =0;
thread_creation t1 = new thread_creation(i);
thread_creation t2 = new thread_creation(i);
t1.start();
try {
Thread.sleep(500);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
t2.start();
}
}
When I run it , I get :
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
Why I am getting this output ? According to my understanding , the variable i is a shared variable between the two threads created. So according to the code , the first thread will execute and increments i 10 times , and hence , i will be equal to 10 . The second thread will start after the first one because of the sleep statement and since i is shared , then the second thread will start will i=10 and will start incrementing it 10 times to have i = 20 , but this is not the case in the output , so why that ?
You seem to think that int t; in thread_creation is a shared variable. I'm afraid you are mistaken. Each t instance is a different variable. So the two threads are updating distinct counters.
The output you are seeing reflects that.
This is the nub of your question:
How do I pass a shared variable then ?
Actually, you can't1. Strictly a shared variable is actually a variable belonging to a shared object. You cannot pass a variable per se. Java does not allow passing of variables. This is what "Java does not support call-by-reference" really means. You can't pass or return a variable or the address of a variable in any method call. (Or in any other way.)
In Java you pass and return values: either primitives, or references to objects. The values may read from a variable by the call's parameter expression or assigned to a variable after the call's return. But you are not passing the variable. A variable and its value / contents are different things.
So the only way to implement a shared counter is to implement it as a shared counter object.
Note that "variable" and "object" mean different things, both in Java and in other programming languages. You should NOT use the two terms interchangeable. For example, when I declare this in Java:
String s = "Hello";
the s variable is not a String object. It is a variable that contains a reference to the String object. Other variables may contain references to the same String object as well. The distinction is even more stark when the objects are mutable. (String is not mutable ... in Java.)
Here are the two (IMO) best ways to implement a shared counter object.
You could create a custom Java Counter class with a count variable, a get method, and methods for incrementing, decrementing the counter. The class needs to implement various methods as thread-safe and atomic; e.g. by using synchronized methods or blocks2.
You could just use an AtomicInteger instance. That takes care of atomicity and thread-safety ... to the extent that it is possible with this kind of API.
The latter approach is simpler and likely more efficient ... unless you need to do something special each time the counter changes.
(It is conceivable that you could implement a shared counter other ways, but that is too much detail for this answer.)
1 - I realize that I just said the same thing more than 3 times. But as the Bellman says in "The Hunting of the Snark": "What I tell you three times is true."
2 - If the counter is not implemented using synchronized or an equivalent mutual exclusion mechanism with the appropriate happens before semantics, you are liable to see Heisenbugs; e.g. race conditions and memory visibility problems.
Two crucial things you're missing. Both individually explain this behaviour - you can 'fix' either one and you'll still see this, you'd have to fix both to see 1-20:
Java is pass-by-value
When you pass i, you pass a copy of it. In fact, in java, all parameters to methods are always copies. Hence, when the thread does t++, it has absolutely no effect whatsoever on your i. You can trivially test this, and you don't need to mess with threads to see it:
public static void main(String[] args) {
int i = 0;
add5(i);
System.out.println(i); // prints 0!!
}
static void add5(int i) {
i = i + 5;
}
Note that all non-primitives are references. That means: A copy of the reference is passed. It's like passing the address of a house and not the house itself. If I have an address book, and I hand you a scanned copy of a page that contains the address to my summer home, you can still drive over there and toss a brick through the window, and I'll 'see' that when I go follow my copy of the address. So, when you pass e.g. a list and the method you passed the list to runs list.add("foo"), you DO see that. You may think: AHA! That means java does not pass a copy, it passed the real list! Not so. Java passed a copy of a street address (A reference). The method I handed that copy to decided to drive over there and act - that you can see.
In other words, =, ++, that sort of thing? That is done to the copy. . is java for 'drive to the address and enter the house'. Anything you 'do' with . is visible to the caller, = and ++ and such are not.
Fixing the code to avoid the pass-by-value problem
Change your code to:
class thread_creation extends Thread {
static int t; // now its global!
public void run() {
increment();
}
public void increment() {
for(int i =0 ; i<10 ; i++) {
t++;
// System.out.println(t);
}
}
}
public class test {
public static void main(String[] args) throws Exception {
thread_creation t1 = new thread_creation();
thread_creation t2 = new thread_creation();
t1.start();
Thread.sleep(500);
t2.start();
Thread.sleep(500);
System.out.println(thread_creation.t);
}
}
Note that I remarked out the print line. I did that intentionally - see below. If you run the above code, you'd think you see 20, but depending on your hardware, the OS, the song playing on your mp3 playing app, which websites you have open, and the phase of the moon, it may be less than 20. So what's going on there? Enter the...
The evil coin.
The relevant spec here is the JMM (The Java Memory Model). This spec explains precisely what a JVM must do, and therefore, what a JVM is free not to do, especially when it comes to how memory is actually managed.
The crucial aspect is the following:
Any effects (updates to fields, such as that t field) may or may not be observable, JVM's choice. There's no guarantee that anything you do is visible to anything else... unless there exists a Happens-Before/Happens-After relationship: Any 2 statements with such a relationship have the property that the JVM guarantees that you cannot observe the lack of the update done by the HB line from the HA line.
HB/HA can be established in various ways:
The 'natural' way: Anything that is 'before' something else _and runs in the same thread has an HB/HA relationship. In other words, if you do in one thread x++; System.out.println(x); then you can't observe that the x++ hasn't happened yet. It's stated like this so that if you're not observing, you get no guarantees, which gives the JVM the freedom to optimize. For example, Given x++;y++; and that's all you do, the JVM is free to re-order that and increment y before x. Or not. There are no guarantees, a JVM can do whatever it wants.
synchronized. The moment of 'exiting' a synchronized (x) {} block has HB to the HA of another thread 'entering' the top of any synchronized block on the same object, if it enters later.
volatile - but note that with volatile it's basically impossible which one came first. But one of them did, and any interaction with a volatile field is HB relative to another thread accessing the same field later.
thread starting. thread.start() is HB relative to the first line of the run() of that thread.
thread yielding. thread.yield() is HA relative to the last line of the thread.
There are a few more exotic ways to establish HB/HA but that's pretty much it.
Crucially, in your code there is no HB/HA between any of the statements that modify or print t!
In other words, the JVM is free to run it all in such a way that the effects of various t++ statements run by one thread aren't observed by another thread.
What the.. WHY????
Because of efficiency. Your memory banks on your CPU are, relative to how fast CPUs are, oceans away from the CPU core. Fetching or writing to core memory from a CPU takes an incredibly long time - your CPU is twiddling its thumbs for a very long time while it waits for the memory controller to get the job done. It could be running hundreds of instructions in that time.
So, CPU cores do not write to memory AT ALL. Instead they work with caches: They have an on-core cache page, and the only interaction with your main memory banks (which are shared by CPU cores) is 'load in an entire cache page' and 'write an entire cache page'. That cache page is then effectively a 'local copy' that only that core can see and interact with (but can do so very very quickly, as that IS very close to the core, unlike the main memory banks), and then once the algorithm is done it can flush that page back to main memory.
The JVM needs to be free to use this. Had the JVM actually worked like you want (that anything any thread does is instantly observable by all others), then anything that any line does must first wait 500 cycles to load the relevant page, then wait another 500 cycles to write it back. All java apps would literally be 1000x slower than they could be.
This in passing also explains that actual synchronizing is really slow. Nothing java can do about that, it is a fundamental limitation of our modern multi-core CPUs.
So, evil coin?
Note that the JVM does not guarantee that the CPU must neccessarily work with this cache stuff, nor does it make any promises about when cache pages are flushed. It merely limits the guarantees so that JVMs can be efficiently written on CPUs that work like that.
That means that any read or write to any field any java code ever does can best be thought of as follows:
The JVM first flips a coin. On heads, it uses a local cached copy. On tails, it copies over the value from some other thread's cached copy instead.
The coin is evil: It is not reliably a 50/50 arrangement. It is entirely plausible that throughout developing a feature and testing it, the coin lands tails every time it is flipped. It remains flipping tails 100% of the time for the first week that you deployed it. And then just when that big potential customer comes in and you're demoing your app, the coin, being an evil, evil coin, starts flipping heads a few times and breaking your app.
The correct conclusion is that the coin will mess with you and that you cannot unit test against it. The only way to win the game is to ensure that the coin is never flipped.
You do this by never touching a field from multiple threads unless it is constant (final, or simply never changes), or if all access to it (both reads and writes) has clearly established HB/HA between all threads.
This is hard to do. That's why the vast majority of apps don't do it at all. Instead, they:
Talk between threads using a database, which has vastly more advanced synchronization primitives: Transactions.
Talk using a message bus such as RabbitMQ or similar.
Use stuff from the java.util.concurrent package such as a Latch, ForkJoin, ConcurrentMap, or AtomicInteger. These are easier to use (specifically: It is a lot harder to write code for these abstractions that is buggy but where the bug cannot be observed or tested for on the machine of the developer that wrote it, it'll only blow up much later in production. But not impossible, of course).
Let's fix it!
volatile doesn't 'fix' ++. x++; is 'read x, increment by 1, write result to x' and volatile doesn't make that atomic, so we cannot use this. We can either replace t++ with:
synchronized(thread_creation.class) {
t++;
}
Which works fine but is really slow (and you shouldn't lock on publicly visible stuff if you can help it, so make a custom object to lock on, but you get the gist hopefully), or, better, dig into that j.u.c package for something that seems useful. And so there is! AtomicInteger!
class thread_creation extends Thread {
static AtomicInteger t = new AtomicInteger();
public void run() {
increment();
}
public void increment() {
for(int i =0 ; i<10 ; i++) {
t.incrementAndGet();
}
}
}
public class test {
public static void main(String[] args) throws Exception {
thread_creation t1 = new thread_creation();
thread_creation t2 = new thread_creation();
t1.start();
Thread.sleep(500);
t2.start();
Thread.sleep(500);
System.out.println(thread_creation.t.get());
}
}
That code will print 20. Every time (unless those threads take longer than 500msec which technically could be, but is rather unlikely of course).
Why did you remark out the print statement?
That HB/HA stuff can sneak up on you: When you call code you did not write, such as System.out.println, who knows what kind of HB/HA relationships are in that code? Javadoc isn't that kind of specific, they won't tell you. Turns out that on most OSes and JVM implementations, interaction with standard out, such as System.out.println, causes synchronization; either the JVM does it, or the OS does. Thus, introducing print statements 'to test stuff' doesn't work - that makes it impossible to observe the race conditions your code does have. Similarly, involving debuggers is a great way to make that coin really go evil on you and flip juuust so that you can't tell your code is buggy.
That is why I remarked it out, because with it in, I bet on almost all hardware you end up seeing 20 eventhough the JVM doesn't guarantee it and that first version is broken. Even if on your particular machine, on this day, with this phase of the moon, it seems to reliably print 20 every single time you run it.

Do I need a lock if I change object pointer instead of update the object?

I'm using Java to write a multithreading application. One question I have is I have a list that is accessed by multiple threads, and I have one thread trying to update it. However, each time, the update thread will create a new List and then make the public shared list point to the new List, like this:
Public List<DataObject> publicDataObject = XXXX; // <- this will be accessed by multiple threads
Then I have one thread updating this List:
List<DataObject> newDataObjectList = CreateNewDataObject();
publicDataObject = newDataObjectList;
When I update the pointer of publicDataObject, do I need a lock to make it thread-safe?
Before going into the answer, let's first check if I understand the situation and the question.
Assumption as stated in problem description: Only a single thread creates new versions of the list, based on the previous value of publicDataObject, and stores that new list in publicDataObject.
Derived assumption: Other threads access DataObject elements from that list, but do not add, remove or change the order of elements.
If this assumption holds, the answer is below.
Otherwise, please make sure your question includes this in its description. This makes the answer much more complex, though, and I advise you to study the topic of concurrency more, for example by reading a book about Java concurrency.
Additional assumption:The DataObject objects themselves are thread-safe.
If this assumption does not hold, this would make the scope of the question too broad and I would suggest to study the topic of concurrency more, for example by reading a book about Java concurrency.
Answer
Given that the above assumptions are true, you do not need a lock, however, you cannot just access publicDataObject from multiple threads, using its definition in you code example. The reason is the Java Memory Model. The Java Memory Model makes no guarantees whatsoever about threads seeing changes made by other threads, unless you use special language constructs like atomic variables, locks or synchronization.
Simplified, these constructs ensure that a read in one thread that happens after a write in another, can see that written value, as long as you are using the same construct: the same atomic variable, lock or synchronisation on the same object. Locks and intrinsic locks (used by synchronisation) can also ensure exclusive access of a single thread to a block of code.
Given, again, that the above assumptions are true, you can suffice using an AtomicReference, whose get and set methods have the desired relationship:
// Definition
public AtomicReference<List<DataObject>> publicDataObject;
The reasons that a simple construct can be used that "only" guarantees visibility are:
The list that publicDataObject refers to, is always a new one (the first assumption). Therefore, other threads can never see a stale value of the list itself. Making sure that threads see the correct value of publicDataObject is therefore enough
As long as other threads don't change the list.
If in addition, only thread sets publicDataObject, there can be no race conditions in setting it, for example loosing updates that are overwritten by more recent updates before ever being read.

Java: is using synchronized(this) an advisable practice when creating a ConcurrentHashMap object?

I just finished developing a java web service server for a distributed programming course I am attending. One of the requirements was to guarantee multi-thread safety to our project hence I decided to use ConcurrentHashMap objects to store my data.
At the end of it all I am left with a question regarding this snippet of code:
public List<THost> getHList() throws ClusterUnavailable_Exception{
logger.entering(logger.getName(), "getHList");
if(hMap==null){
synchronized(this){
if(hMap==null){
hMap=createHMap();
}
}
}
if(hMap==null){
ClusterUnavailable cu = new ClusterUnavailable();
cu.setMessage("Data unavailable.");
ClusterUnavailable_Exception exc = new ClusterUnavailable_Exception("Data unavailable.", new ClusterUnavailable());
throw exc;
}
else{
List<THost> hList = new ArrayList<THost>(hMap.values());
logger.info("Returning list of hosts. Number of hosts returned = "+hList.size());
logger.exiting(logger.getName(), "getHList");
return hList;
}
}
do I have to use the synchronized statement when creating the concurrenthashmap object itself in order to guarantee that the service will not have any unpredictable behavior in a multi-threaded environment?
Don't bother. Eagerly initialize the Map, make the field final, and drop the synchronization until you have proven that it is actually necessary. The cost is minuscule and the "obviously safe and correct" solution will almost never be too slow.
You mentioned this is a class project -- focus on getting the code working. Concurrency is hard enough without inventing additional obstacles that you must then hurdle over.
The simple solution is to avoid the problem by eagerly initializing. And unless you have clear evidence (i.e. profiling) that eager initialization is a performance problem, that is also the best solution.
As to your question, the answer is that the synchronized block is necessary for correctness. Without it you can get the following sequence of events.
thread 1 calls getHList()
thread 1 sees that hMap is null and starts to create a map.
thread 2 calls getHList()
thread 2 sees that hMap is null and starts to create a map.
thread 1 finishes creating, and assigns the new map to hMap, and returns that map.
thread 2 finishes creating, and assigns the second new map to hMap, and returns that map.
In short, thread 1 and thread 2 could get different maps if they simultaneously call getHList() while hMap has its initial null value.
(In the above, I'm assuming that getHList() is a getter for hMap. However, the method as written won't compile, and its declared return type doesn't match the type of hMap ... so it is unclear what it is really intended to do.)
The below line has nothing to do with ConcurrentHashMap. Its just creating an instance of ConcurrentHashMap object.
Its just like synchronizing any object creation in JAVA.
hMap=new ConcurrentHashMap<BigInteger, THost>();
Double check locking pattern is broken before Java 1.5 (and is inefficient in Java 1.6 and later). See: http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
Consider using a Initialization-on-demand holder or a single element enum type.

approach to null all references to objects created by a Java ThreadFactory

I have a Java ThreadFactory implementation spawning runnable thread subclass objects in my Android application. This application requires that all spawned threads are addressable before a certain event fires, and that upon the firing of said event the spawned threads can be made eligible for garbage collection (e.g. reference count of 0). I was thinking that to satisfy the first requirement I would simply maintain an ArrayList of my thread objects, and that works fine. The trouble comes with the second requirement, and has led me to a series of questions regarding Java reference counting:
Question 1: Does simply storing references to the spawned threads in an ArrayList or other container increment the reference count of each thread object? What would the reference count of each thread object be if I never stored them, but rather allowed my factory to create them and let them run without any defined handle as below?
mvThreadSource.newThread(new BackgroundThread(...)).run();
Question 2: Does the below code sample do anything to the actual thread object pointed to by hTempThread other than increment and immediately decrement its reference count?
BackgroundThread hTempThread;
for(int i=0;i<mvThreadsVector.size();i++){
hTempThread = mvThreadsVector.get(i); //probably increments ref count of thread
hTempThread = null; //probably just decrements the ref count of thread back to
//previous value
}
Question 3: Assuming the answer to question 2 is 'no', what would be an effective way to store threads spawned by a ThreadFactory implementation such that their reference counts can be reduced to 0 on demand? What would be the proper syntax involved in removing these references? Would the below code sample effectively reduce all involved objects' (mvThreadsVector, tempThreads, each thread object tracked by mvThreadsVector) reference counts to 0? What exactly does clear() do to the reference counts of objects stored in an arraylist, and what does setting an array reference to null do to the reference counts of elements stored inside (if anything)?
Object[] tempThreads = mvThreadsVector.toArray();
mvThreadsVector.clear(); //possible this line is all I need...
mvThreadsVector = null;
for(int i=0;i<tempThreads.length;i++){
tempThreads[i] = null;
}
tempThreads = null;
Any help with any/all of the above questions would be very much appreciated!
If you have your thread running, until it finishes execution it will be kept in memory (or swapped) but not GC. Even if the Thread has no code of yours referencing it but is executing code, it has at least a reference internally.
You cannot even guarantee that it will increment and decrement the reference count. The compiler may optimize it by realizing that the first assignment has no effect (since it is immediately overwritten) and remove it.
The reference count will be reduced but remember that you may have loops in the reference chain, where each element will have 1 reference count but the whole loop is not accessible and a correct GC should realize that and remove it.
I think the main issue is that you shouldn't count on the reference count of threads going down to 0; as long as they are running you should assume they have references so you must also have a way of making sure the thread stops its execution.

Writing a thread safe modular counter in Java

Full disclaimer: this is not really a homework, but I tagged it as such because it is mostly a self-learning exercise rather than actually "for work".
Let's say I want to write a simple thread safe modular counter in Java. That is, if the modulo M is 3, then the counter should cycle through 0, 1, 2, 0, 1, 2, … ad infinitum.
Here's one attempt:
import java.util.concurrent.atomic.AtomicInteger;
public class AtomicModularCounter {
private final AtomicInteger tick = new AtomicInteger();
private final int M;
public AtomicModularCounter(int M) {
this.M = M;
}
public int next() {
return modulo(tick.getAndIncrement(), M);
}
private final static int modulo(int v, int M) {
return ((v % M) + M) % M;
}
}
My analysis (which may be faulty) of this code is that since it uses AtomicInteger, it's quite thread safe even without any explicit synchronized method/block.
Unfortunately the "algorithm" itself doesn't quite "work", because when tick wraps around Integer.MAX_VALUE, next() may return the wrong value depending on the modulo M. That is:
System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // true
System.out.println(modulo(Integer.MAX_VALUE, 3)); // 1
System.out.println(modulo(Integer.MIN_VALUE, 3)); // 1
That is, two calls to next() will return 1, 1 when the modulo is 3 and tick wraps around.
There may also be an issue with next() getting out-of-order values, e.g.:
Thread1 calls next()
Thread2 calls next()
Thread2 completes tick.getAndIncrement(), returns x
Thread1 completes tick.getAndIncrement(), returns y = x+1 (mod M)
Here, barring the forementioned wrapping problem, x and y are indeed the two correct values to return for these two next() calls, but depending on how the counter behavior is specified, it can be argued that they're out of order. That is, we now have (Thread1, y) and (Thread2, x), but maybe it should really be specified that (Thread1, x) and (Thread2, y) is the "proper" behavior.
So by some definition of the words, AtomicModularCounter is thread-safe, but not actually atomic.
So the questions are:
Is my analysis correct? If not, then please point out any errors.
Is my last statement above using the correct terminology? If not, what is the correct statement?
If the problems mentioned above are real, then how would you fix it?
Can you fix it without using synchronized, by harnessing the atomicity of AtomicInteger?
How would you write it such that tick itself is range-controlled by the modulo and never even gets a chance to wraps over Integer.MAX_VALUE?
We can assume M is at least an order smaller than Integer.MAX_VALUE if necessary
Appendix
Here's a List analogy of the out-of-order "problem".
Thread1 calls add(first)
Thread2 calls add(second)
Now, if we have the list updated succesfully with two elements added, but second comes before first, which is at the end, is that "thread safe"?
If that is "thread safe", then what is it not? That is, if we specify that in the above scenario, first should always come before second, what is that concurrency property called? (I called it "atomicity" but I'm not sure if this is the correct terminology).
For what it's worth, what is the Collections.synchronizedList behavior with regards to this out-of-order aspect?
As far as I can see you just need a variation of the getAndIncrement() method
public final int getAndIncrement(int modulo) {
for (;;) {
int current = atomicInteger.get();
int next = (current + 1) % modulo;
if (atomicInteger.compareAndSet(current, next))
return current;
}
}
I would say that aside from the wrapping, it's fine. When two method calls are effectively simultaneous, you can't guarantee which will happen first.
The code is still atomic, because whichever actually happens first, they can't interfere with each other at all.
Basically if you have code which tries to rely on the order of simultaneous calls, you already have a race condition. Even if in the calling code one thread gets to the start of the next() call before the other, you can imagine it coming to the end of its time-slice before it gets into the next() call - allowing the second thread to get in there.
If the next() call had any other side effect - e.g. it printed out "Starting with thread (thread id)" and then returned the next value, then it wouldn't be atomic; you'd have an observable difference in behaviour. As it is, I think you're fine.
One thing to think about regarding wrapping: you can make the counter last an awful lot longer before wrapping if you use an AtomicLong :)
EDIT: I've just thought of a neat way of avoiding the wrapping problem in all realistic scenarios:
Define some large number M * 100000 (or whatever). This should be chosen to be large enough to not be hit too often (as it will reduce performance) but small enough that you can expect the "fixing" loop below to be effective before too many threads have added to the tick to cause it to wrap.
When you fetch the value with getAndIncrement(), check whether it's greater than this number. If it is, go into a "reduction loop" which would look something like this:
long tmp;
while ((tmp = tick.get()) > SAFETY_VALUE))
{
long newValue = tmp - SAFETY_VALUE;
tick.compareAndSet(tmp, newValue);
}
Basically this says, "We need to get the value back into a safe range, by decrementing some multiple of the modulus" (so that it doesn't change the value mod M). It does this in a tight loop, basically working out what the new value should be, but only making a change if nothing else has changed the value in between.
It could cause a problem in pathological conditions where you had an infinite number of threads trying to increment the value, but I think it would realistically be okay.
Concerning the atomicity problem: I don't believe that it's possible for the Counter itself to provide behaviour to guarantee the semantics you're implying.
I think we have a thread doing some work
A - get some stuff (for example receive a message)
B - prepare to call Counter
C - Enter Counter <=== counter code is now in control
D - Increment
E - return from Counter <==== just about to leave counter's control
F - application continues
The mediation you're looking for concerns the "payload" identity ordering established at A.
For example two threads each read a message - one reads X, one reads Y. You want to ensure that X gets the first counter increment, Y gets the second, even though the two threads are running simultaneously, and may be scheduled arbitarily across 1 or more CPUs.
Hence any ordering must be imposed across all the steps A-F, and enforced by some concurrency countrol outside of the Counter. For example:
pre-A - Get a lock on Counter (or other lock)
A - get some stuff (for example receive a message)
B - prepare to call Counter
C - Enter Counter <=== counter code is now in control
D - Increment
E - return from Counter <==== just about to leave counter's control
F - application continues
post- F - release lock
Now we have a guarantee at the expense of some parallelism; the threads are waiting for each other. When strict ordering is a requirement this does tend to limit concurrency; it's a common problem in messaging systems.
Concerning the List question. Thread-safety should be seen in terms of interface guarantees. There is absolute minimum requriement: the List must be resilient in the face of simultaneous access from several threads. For example, we could imagine an unsafe list that could deadlock or leave the list mis-linked so that any iteration would loop for ever. The next requirement is that we should specify behaviour when two threads access at the same time. There's lots of cases, here's a few
a). Two threads attempt to add
b). One thread adds item with key "X", another attempts to delete the item with key "X"
C). One thread is iterating while a second thread is adding
Providing that the implementation has clearly defined behaviour in each case it's thread-safe. The interesting question is what behaviours are convenient.
We can simply synchronise on the list, and hence easily give well-understood behaviour for a and b. However that comes at a cost in terms of parallelism. And I'm arguing that it had no value to do this, as you still need to synchronise at some higher level to get useful semantics. So I would have an interface spec saying "Adds happen in any order".
As for iteration - that's a hard problem, have a look at what the Java collections promise: not a lot!
This article , which discusses Java collections may be interesting.
Atomic (as I understand) refers to the fact that an intermediate state is not observable from outside. atomicInteger.incrementAndGet() is atomic, while return this.intField++; is not, in the sense that in the former, you can not observe a state in which the integer has been incremented, but has not yet being returned.
As for thread-safety, authors of Java Concurrency in Practice provide one definition in their book:
A class is thread-safe if it behaves
correctly when accessed from multiple
threads, regardless of the scheduling
or interleaving of the execution of
those threads by the runtime
environment, and with no additional
synchronization or other coordination
on the part of the calling code.
(My personal opinion follows)
Now, if we have the list
updated succesfully with two elements
added, but second comes before first,
which is at the end, is that "thread
safe"?
If thread1 entered the entry set of the mutex object (In case of Collections.synchronizedList() the list itself) before thread2, it is guaranteed that first is positioned ahead than second in the list after the update. This is because the synchronized keyword uses fair lock. Whoever sits ahead of the queue gets to do stuff first. Fair locks can be quite expensive and you can also have unfair locks in java (through the use of java.util.concurrent utilities). If you'd do that, then there is no such guarantee.
However, the java platform is not a real time computing platform, so you can't predict how long a piece of code requires to run. Which means, if you want first ahead of second, you need to ensure this explicitly in java. It is impossible to ensure this through "controlling the timing" of the call.
Now, what is thread safe or unsafe here? I think this simply depends on what needs to be done. If you just need to avoid the list being corrupted and it doesn't matter if first is first or second is first in the list, for the application to run correctly, then just avoiding the corruption is enough to establish thread-safety. If it doesn't, it is not.
So, I think thread-safety can not be defined in the absence of the particular functionality we are trying to achieve.
The famous String.hashCode() doesn't use any particular "synchronization mechanism" provided in java, but it is still thread safe because one can safely use it in their own app. without worrying about synchronization etc.
Famous String.hashCode() trick:
int hash = 0;
int hashCode(){
int hash = this.hash;
if(hash==0){
hash = this.hash = calcHash();
}
return hash;
}

Categories