How deterministic are Java semaphores guaranteed to be?

The (Oracle) javadoc for Semaphore.release() includes:
If any threads are trying to acquire a permit, then one is selected and given the permit that was just released.
Is this a hard promise? This implies that if thread A is waiting in acquire() and thread B does this:
sem.release()
sem.acquire()
Then the release() should pass control to A and B will be blocked in acquire(). If these are the only two threads that can hold the semaphore and the doc statement is formally true, then this is a completely deterministic process: Afterward, A will have the permit and B will be blocked.
But this does not appear to be true, or at least that is how it seems to me. I haven't bothered with an SSCCE here since I am really just looking for confirmation that:
Race conditions apply: Even though thread A is waiting on the permit, when it is released it can be immediately re-acquired by thread B, leaving thread A still blocked.
These are "fair" semaphores, if that makes any difference, and I'm actually working in Kotlin.

In comments on the question Slaw pointed out something else from the documentation:
When fairness is set true, the semaphore guarantees that threads invoking any of the acquire methods are selected to obtain permits in the order in which their invocation of those methods was processed (first-in-first-out; FIFO). Note that FIFO ordering necessarily applies to specific internal points of execution within these methods. So, it is possible for one thread to invoke acquire before another, but reach the ordering point after the other, and similarly upon return from the method.
The point here is that acquire() is an interruptible function with a beginning and an end. At some point during its execution the calling thread secures a spot in the fairness queue, but when that happens relative to another thread concurrently executing the same function is still indeterminate. Call this point X and consider two threads, one of which holds the semaphore. At some point the other thread calls:
sem.acquire()
There is no guarantee that the scheduler won't sideline the thread inside acquire() before point X is reached. If the owner thread then does this (this could be, e.g., intended as some kind of synchronization checkpoint or barrier control):
sem.release()
sem.acquire()
It could simply release and re-acquire the semaphore without the permit ever going to the other thread, even if that thread has already entered acquire().
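A minimal runnable sketch of this (the class name and the sleep-free timing are assumptions for illustration) shows that even a fair semaphore does not make release() followed by acquire() a reliable handoff:
import java.util.concurrent.Semaphore;

public class UnreliableHandoff {
    // Fair semaphore; the "owner" thread takes the single permit first.
    static final Semaphore sem = new Semaphore(1, true);

    public static void main(String[] args) throws InterruptedException {
        sem.acquire(); // owner holds the permit

        Thread client = new Thread(() -> {
            try {
                sem.acquire(); // may or may not have reached point X before the owner's release/acquire
                System.out.println("client got the permit");
                sem.release();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        client.start();

        // Intended as a "checkpoint": hand the permit over, then take it back.
        sem.release();
        sem.acquire(); // can succeed immediately, leaving the client still blocked

        sem.release();
        client.join();
    }
}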
Injecting Thread.sleep() or yield() between the calls might often work, but it is not a guarantee. To create such a checkpoint with that guarantee you need two locks/semaphores for an exchange (a sketch follows the list below):
Owner thread holds semA.
Client thread can take semB and then wait on semA.
The owner can release semA and then wait on semB; if another thread is really waiting for semA (and therefore holding semB), the owner will block there, which guarantees that semA can now be acquired by the client.
When the client is done, it releases semB, then semA.
When the owner is released from waiting on semB, it can acquire semA and release semB.
If these are properly encapsulated this mechanism is rock solid.
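A minimal sketch of that exchange (class and method names are my own, and both semaphores are assumed to be fair) could look like this:
import java.util.concurrent.Semaphore;

public class Checkpoint {
    private final Semaphore semA = new Semaphore(0, true); // owner conceptually holds the main permit
    private final Semaphore semB = new Semaphore(1, true); // serializes client checkpoint requests

    // Owner side: release semA, then wait on semB. If a client holds semB and is
    // waiting for semA, the owner blocks here until the client finishes, so the
    // client is guaranteed to get semA.
    public void ownerCheckpoint() throws InterruptedException {
        semA.release();
        semB.acquire();
        semA.acquire(); // take the main permit back
        semB.release(); // allow the next client request
    }

    // Client side: take semB first, then wait for semA.
    public void clientEnter() throws InterruptedException {
        semB.acquire();
        semA.acquire();
    }

    // Client side: release semB first so the owner can proceed, then hand semA back.
    public void clientExit() {
        semB.release();
        semA.release();
    }
}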

Related

Reentrant lock condition fairness

I am confused about ReentrantLock's Condition. Here is the documentation:
Waiting threads are signalled in FIFO order.
The ordering of lock reacquisition for threads returning from waiting
methods is the same as for threads initially acquiring the lock, which
is in the default case not specified, but for fair locks favors those
threads that have been waiting the longest.
According to the second bullet, fairness brings a well-specified ordering of lock reacquisition on signalling.
But what is the meaning of the first bullet, "Waiting threads are signalled in FIFO order"? I presume that in this case signalling means just "signalling", i.e. it "unparks" the threads in FIFO order, but the actual reacquisition order on wake-up is governed by the fairness.
There is a pretty large amount of stuff tied to the cxq and wait queues internal to HotSpot which I, unfortunately, don't understand well.
QUESTION:
Does "Waiting threads are signalled in FIFO order" mean that waiting threads are unparked in the same order they were parked (even though the lock itself is unfair)?
Does fairness provide reacquisition ordering guarantees, which are necessary since there is an unpark/reacquire race in the general case?
As explained in Difference in internal storing between 'fair' and 'unfair' lock, the actual difference between “fair” and “unfair” is not the organization of the queue, but that in unfair mode, a thread trying to acquire the lock might succeed even when there are already waiting threads in the queue. Such an overtaking thread will not interact with the queue at all.
A thread calling one of the await methods on a Condition must already own the associated lock and will release it so that another thread can acquire it, fulfill the condition and invoke signal or signalAll. So the thread must enqueue itself, so that the other thread knows which thread to signal. When signal is invoked, the thread waiting the longest time for the condition is fetched from the FIFO.
The signalled thread may get unparked, but it's also possible that it hasn't parked yet. In either case, it must reacquire the lock, and this reacquisition is subject to the lock's fairness guarantee. By the time a thread calls signal it must own the lock, so the signalled thread can't succeed immediately. When the lock is released, there might be a race between multiple threads.
But the signalling in FIFO order for a condition implies that when two or more threads are waiting on the same condition and one gets signalled, it will be the longest-waiting thread and none of the others can overtake it, even for an unfair lock. Only when more than one thread is signalled, or when other threads not waiting for the condition try to acquire the lock, is the acquisition order of an unfair lock arbitrary. Also, as the linked answer mentions, tryLock() may overtake even on a fair lock.
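For illustration, here is a minimal sketch (the class, field and method names are my own) of the usage pattern being discussed: several threads awaiting one Condition of a fair ReentrantLock, where signal() wakes the longest-waiting thread, which must still reacquire the lock before await() returns:
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionDemo {
    private final ReentrantLock lock = new ReentrantLock(true); // fair lock
    private final Condition ready = lock.newCondition();
    private boolean isReady = false;

    public void awaitReady() throws InterruptedException {
        lock.lock();
        try {
            while (!isReady) {
                ready.await(); // releases the lock while waiting, reacquires it before returning
            }
        } finally {
            lock.unlock();
        }
    }

    public void signalOne() {
        lock.lock(); // signal() must be called while owning the lock
        try {
            isReady = true;
            ready.signal(); // wakes the longest-waiting thread on this condition
        } finally {
            lock.unlock();
        }
    }
}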
Reading the source code of ReentrantLock (Java 12), we can see that there is only a small difference between the fair and the non-fair ReentrantLock. The difference lies in the class that extends java.util.concurrent.locks.AbstractQueuedSynchronizer: in one case it is FairSync, in the other NonfairSync. Both are defined in ReentrantLock, and the only difference is that FairSync performs one additional check in its tryAcquire method.
Reading the code, it seems that under ideal conditions FIFO order is respected even by a non-fair ReentrantLock, but this is not guaranteed due to cancellation, time-outs and similar events. In a fair ReentrantLock, any thread, before acquiring the lock (even if it was just unparked from the queue), re-checks whether there are older waiting threads.
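The essence of that extra check can be paraphrased like this (a hedged sketch of my own, not the JDK source; it is a simplified, non-reentrant mutex built on AbstractQueuedSynchronizer):
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class FairMutexSync extends AbstractQueuedSynchronizer {
    @Override
    protected boolean tryAcquire(int unused) {
        // The "fair" part: refuse to barge if an older thread is already queued.
        if (hasQueuedPredecessors()) {
            return false;
        }
        return compareAndSetState(0, 1); // otherwise try to take the lock atomically
    }

    @Override
    protected boolean tryRelease(int unused) {
        setState(0); // simplified: no owner bookkeeping in this sketch
        return true;
    }

    void lock()   { acquire(1); }  // acquire(1) queues and parks when tryAcquire fails
    void unlock() { release(1); }
}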
I'm not sure I understand the second question, but note that a thread is unparked from the queue by the thread that releases the lock. Even if the releasing thread unparks the oldest thread in the queue, this is not enough to avoid starvation, because a third thread can request the lock concurrently and gain it before the exiting thread unparks the waiting one. In fair mode, every thread that tries to acquire the lock first checks for threads that have been waiting longer, and this guarantees FIFO order and avoids starvation.
External interruption of a waiting thread does not change the queue order.

Difference between Locks and .join() method

Let's say you have two threads, thread1 and thread2. If you call thread1.start() and thread2.start() at the same time and they both print out numbers between 1 and 5, they will both run at the same time and they will randomly print out the numbers in any order, if I am not mistaken. To prevent this, you use the .join() method to make sure that a certain thread gets executed first. If this is what the .join() method does, what is the Lock object used for?
Thread.join is used to wait for another thread to finish. The join method uses the implicit lock on the Thread object and calls wait on it. When the thread being waited for finishes it notifies the waiting thread so it can stop waiting.
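A minimal sketch of that use of join() (the class name and the result field are assumptions for illustration):
public class JoinDemo {
    static int result; // written by the worker, read by main after join()

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> result = 42);
        worker.start();
        worker.join();              // blocks until the worker has finished
        System.out.println(result); // safe to read: join() establishes a happens-before
    }
}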
Java has different ways to use locks to protect access to data. There is implicit locking that uses a lock built into every Java object (this is where the synchronized keyword comes in), and then there are explicit Lock objects. Both of them protect data from concurrent access, the difference is the explicit Locks are more flexible and powerful, while implicit locking is designed to be easier to use.
With implicit locks, for instance, I cannot fail to release the lock at the end of a synchronized method or block; the JVM makes sure that the lock gets released as the thread leaves. But programming with implicit locks can be limiting. For instance, there are no separate condition objects, so if different threads access a shared object for different things, notifying only a subset of them is not possible.
With explicit Locks you get separate condition objects and can notify only those threads waiting on a particular condition (producers might wait on one condition while consumers wait on another, see the ArrayBlockingQueue class for an example), and you can implement more involved kinds of patterns, like hand-over-hand locking. But you need to be much more careful, because the extra features introduce complications, and releasing the lock is up to you.
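A minimal sketch of that pattern (the class and field names are my own; ArrayBlockingQueue does something similar internally): one explicit lock with two conditions, so producers and consumers can be signalled separately:
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedBuffer<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();  // producers wait here
    private final Condition notEmpty = lock.newCondition(); // consumers wait here
    private final Queue<T> items = new ArrayDeque<>();
    private final int capacity;

    public BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    public void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) {
                notFull.await();
            }
            items.add(item);
            notEmpty.signal(); // wake one consumer, not every waiter
        } finally {
            lock.unlock();
        }
    }

    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            T item = items.remove();
            notFull.signal(); // wake one producer
            return item;
        } finally {
            lock.unlock();
        }
    }
}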
Locking typically prevents more than one thread from running a block of code at the same time. This is because only ONE thread at a time can acquire the lock and run the code within. If a thread wants the lock but it is already taken, then that thread goes into a wait state until the lock is released. If you have many threads waiting for the lock to be released, which one gets the lock next is INDETERMINATE (can't be predicted). This can lead to "thread starvation" where a thread is waiting for the lock, but it just never gets it because other threads always seem to get it instead. This is a very generic answer because you didn't specify a language. Some languages may differ slightly in that they might have a determinate method of deciding who gets the lock next.

How can we use notifyAll to ensure that only one thread continues after wakeup?

From Programming Language Pragmatics, by Scott
To resume a thread that is suspended on a given object, some other
thread must execute the predefined method notify from within a
synchronized statement or method that refers to the same object. Like
wait, notify has no arguments. In response to a notify call, the
language run-time system picks an arbitrary thread suspended on the
object and makes it runnable. If there are no such threads, then the
notify is a no-op. As in Mesa, it may sometimes be appropriate to
awaken all threads waiting in a given object; Java provides a built-in
notifyAll method for this purpose.
If threads are waiting for more than one condition (i.e., if their waits are embedded in dissimilar loops), there is no guarantee that
the “right” thread will awaken. To ensure that an appropriate thread
does wake up, the programmer may choose to use notifyAll instead of
notify. To ensure that only one thread continues after wakeup, the
first thread to discover that its condition has been satisfied must
modify the state of the object in such a way that other awakened
threads, when they get to run, will simply go back to sleep.
Unfortunately, since all waiting threads will end up reevaluating
their conditions every time one of them can run, this “solution” to
the multiple-condition problem can be quite expensive.
When using notifyAll, all the awakened threads will contend to reacquire the lock, but only one at a time can reacquire it, then return from wait() and then reevaluate the condition. So why does it say that "all waiting threads will end up reevaluating their conditions every time one of them can run"?
How does the thread which reacquires the lock and finds that the condition has become true "modify the state of the object in such a way that other awakened threads, when they get to run, will simply go back to sleep"?
Thanks.
So why does it say that "all waiting threads will end up reevaluating their conditions every time one of them can run"?
After it reacquires and then releases the lock, a different thread will acquire it and run. This continues until all of them have done so.
How does the thread, which reacquires the lock and rechecks that the condition become true, "modify the state of the object in such a way that other awakened threads, when they get to run, will simply go back to sleep"?
All the threads will have something like:
while (condition) {
    wait();
}
The notifyAll() caller will set condition to false before calling it. The awakened thread then exits the while loop, and before it returns and releases the lock it will do:
condition = true;
All the other threads will awaken, check the condition, stay in the while loop and call wait() (go back to sleep).
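Put together, a minimal sketch of this pattern (the class, field and method names are my own) could look like this; the flag means "keep waiting", the notifier clears it, and the first thread to see it cleared sets it back before releasing the monitor:
public class OneShotGate {
    private boolean mustWait = true;

    public synchronized void awaitTurn() throws InterruptedException {
        while (mustWait) {
            wait();          // releases the monitor while waiting
        }
        mustWait = true;     // close the gate again so other awakened threads go back to sleep
    }

    public synchronized void releaseOne() {
        mustWait = false;    // open the gate for exactly one waiter
        notifyAll();         // wake everyone; only one will find the gate open
    }
}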
Additionally, you should consider using an explicit locking mechanism, because it allows you to have multiple conditions and condition queues for a single lock, which enables you to use signal() instead of signalAll(). That gives better performance and less contention.
Condition API

What circumstances can cause a thread to be transferred from the wait queue to the blocked queue?

Under what conditions can this happen?
As far as I know
The blocked queue is a buffer between threads producing objects and threads consuming objects.
The wait queue prevents threads from competing for the same lock.
So a thread gets a lock, but is unable to be passed on to the consumer as it is now busy?
The question only makes sense under the assumption that it actually means "What circumstances can cause a thread to change from the wait state to the blocked state?"
There might be a particular scheduler implementation maintaining these threads in a dedicated queue, having to move threads from one queue to another upon these state changes and influencing the mindset of whoever originally formulated the question, but such a question shouldn't be loaded with assumed implementation details. As a side note, while a queue of runnable threads would make sense, I can't imagine a real reason to put blocked or waiting threads into a (global) queue.
If this is the original intention of the question, it should not be confused with Java classes implementing queues and having similar sounding names.
A thread is in the blocked state if it tries to enter a synchronized method or code fragment while another thread owns the object monitor. From there, the thread will transition to the runnable state if the owner releases the monitor and the blocked thread succeeds in acquiring it.
A thread is in the waiting state if it performs an explicit action that can only proceed if another thread performs an associated action: if the thread calls wait on an object, it can only proceed when another thread calls notify on the same object; if the thread calls LockSupport.park(), another thread has to call LockSupport.unpark() with that thread as argument; when it calls join on another thread, that thread must end its execution to end the wait. The waiting state may also end due to interruption or spurious wakeups.
As a special case, Java considers a thread to be in the timed_waiting state if it called one of the methods mentioned above with a timeout or if it executes Thread.sleep. This state differs from the waiting state only in that it may also end due to elapsed time.
When a thread calls wait on an object, it must own the object’s monitor, i.e. be inside a synchronized method or code block. The monitor is released upon this call and reacquired when returning. When it can’t be reacquired immediately, the thread will go from the waiting or timed_waiting state to the blocked state.
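A minimal sketch of that last transition (the class name and the sleep-based timing are assumptions for illustration): a notified thread must reacquire the monitor, and while the notifier still holds it, the waiter typically shows up as BLOCKED rather than WAITING:
public class StateDemo {
    private static final Object monitor = new Object();

    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                try {
                    monitor.wait(); // releases the monitor; state becomes WAITING
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        waiter.start();

        Thread.sleep(100); // crude way to let the waiter reach wait(); not a guarantee
        System.out.println(waiter.getState()); // typically WAITING

        synchronized (monitor) {
            monitor.notify(); // the waiter is signalled but cannot reacquire the monitor yet
            Thread.sleep(100);
            System.out.println(waiter.getState()); // typically BLOCKED while we still hold the monitor
        }
        waiter.join();
    }
}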

wait(), notify() - which thread is woken up first?

I'm trying to understand wait() and notify(). I know that when thread A goes into wait() it will be woken up by a notify() from another thread.
But what happens if threads A, B and C go into wait() in that order? Which one will be woken up by notify()? According to my experiments, thread A is woken up first. Am I right?
Does this mean that the system knows in which order the threads went into wait()?
From the documentation for notify(), emphasis mine:
Wakes up a single thread that is waiting on this object's monitor. If any threads are waiting on this object, one of them is chosen to be awakened. The choice is arbitrary and occurs at the discretion of the implementation. A thread waits on an object's monitor by calling one of the wait methods.
Some other APIs, such as Semaphore, have a concept of "fairness", where you can ensure that threads do proceed in the order in which they blocked.
Section 17.2.2, Notification, of the Java Language Specification:
There is no guarantee about which thread in the wait set is selected.
So, observed behavior is not guaranteed and should not be relied upon.
No, the VM does not know in which order the threads were put into the waiting state.
When you call notify(), one of them will go back to the alive/runnable state, and there is no way to know which one the VM will choose.
Sometimes they may run in the order they were put into the waiting state, but the specification does not guarantee that. So on a different VM you can get completely different results, or even on the same VM if you run the code multiple times.
No, there is no guarantee about the order. The javadoc of the notify method is pretty clear on this:
Wakes up a single thread that is waiting on this object's monitor. If any threads are waiting on this object, one of them is chosen to be awakened. The choice is arbitrary and occurs at the discretion of the implementation. A thread waits on an object's monitor by calling one of the wait methods.
There's no such order. Each thread has an equal opportunity to get into the runnable state. The JVM/OS sees them only as a set of waiting threads; it doesn't track any order.
As for your experiment, to reach a fair conclusion you would actually have to repeat it a huge number of times.
With threads you can expect an order (FIFO) only if you are using something like a strong (fair) Semaphore. Then the threads are put into a waiting queue and the first to arrive is the first to be served.
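A minimal sketch of that (the class name, the thread count and the sleep-based spacing are assumptions for illustration): with fairness enabled, a Semaphore hands out permits in FIFO order of arrival at its internal ordering point, unlike Object.notify(), which picks an arbitrary thread from the wait set:
import java.util.concurrent.Semaphore;

public class FairSemaphoreDemo {
    public static void main(String[] args) throws InterruptedException {
        Semaphore sem = new Semaphore(0, true); // no permits initially, fairness enabled

        for (int i = 0; i < 3; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    sem.acquire(); // threads queue up roughly in start order
                    System.out.println("thread " + id + " got a permit");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
            Thread.sleep(50); // crude spacing so the queue order matches the start order
        }

        for (int i = 0; i < 3; i++) {
            sem.release(); // with fairness, waiters are served first-in-first-out
        }
    }
}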
