CompareAndSet Operations overriding of values [duplicate]

CompareAndSet Operations overriding of values [duplicate] - java

This question already has answers here:
How does "Compare And Set" in AtomicInteger works
(2 answers)
Closed 2 years ago.
While going through the Blocking/Non Blocking Algorithms section at the link
and the below code to explain the Atomic compareAndSet operation
boolean updated = false;
while(!updated){
long prevCount = this.count.get();
updated = this.count.compareAndSet(prevCount, prevCount + 1);
}
It states that
Therefore no synchronization is necessary, and no thread suspension is
necessary. This saves the thread suspension overhead.
Does it mean that if there are 2 threads that call compareAndSet() at the same time in the above code, both of them will execute concurrently or parallelly which is in contrast to the synchronized block where one thread gets blocked if both access simultaneously? If that's the case wouldn't the values get overwritten in the above case? Which is same case when there is no synchronization?

The value of the AtomicLong is on a cacheline. On the X86 there is a feature called cacheline locking which is used for locked instructions. So when a cas is done, the cacheline is first acquired in modified/exclusive state and then locked.
If a different CPU wants to access the same cacheline, its cache coherence requests including the request-for-ownership will be ignored till the cacheline is unlocked.
So it is a very lightweight form of synchronization. If you are lucky, the other CPU has some out of order instructions it can execute while it waits for the cacheline.
This approach is called non blocking even though it could lead to other threads 'blocking' since they need to wait. The primary difference with a blocking algorithm is that it can't happen that the CPU (thread) owning the locked cacheline gets suspended while it has locked that cacheline. This is taken care of at the hardware level (so the CPU can't be interrupted between cacheline lock acquire and release). So the blocking is guaranteed to be very short instead of unbounded like with a blocking algorithm.
According to #BeeOnRope there might be some optimistic behavior as well involved but this goes beyond my knowledge level.

Related

Java why use semaphores? [duplicate]

I've heard these words related to concurrent programming, but what's the difference between lock, mutex and semaphore?

A lock allows only one thread to enter the part that's locked and the lock is not shared with any other processes.
A mutex is the same as a lock but it can be system wide (shared by multiple processes).
A semaphore does the same as a mutex but allows x number of threads to enter, this can be used for example to limit the number of cpu, io or ram intensive tasks running at the same time.
For a more detailed post about the differences between mutex and semaphore read here.
You also have read/write locks that allows either unlimited number of readers or 1 writer at any given time.
The descriptions are from a .NET perspective and might not be 100% accurate for all OS/Languages.

There are a lot of misconceptions regarding these words.
This is from a previous post (https://stackoverflow.com/a/24582076/3163691) which fits superb here:
1) Critical Section= User object used for allowing the execution of just one active thread from many others within one process. The other non selected threads (# acquiring this object) are put to sleep.
[No interprocess capability, very primitive object].
2) Mutex Semaphore (aka Mutex)= Kernel object used for allowing the execution of just one active thread from many others, among different processes. The other non selected threads (# acquiring this object) are put to sleep. This object supports thread ownership, thread termination notification, recursion (multiple 'acquire' calls from same thread) and 'priority inversion avoidance'.
[Interprocess capability, very safe to use, a kind of 'high level' synchronization object].
3) Counting Semaphore (aka Semaphore)= Kernel object used for allowing the execution of a group of active threads from many others. The other non selected threads (# acquiring this object) are put to sleep.
[Interprocess capability however not very safe to use because it lacks following 'mutex' attributes: thread termination notification, recursion?, 'priority inversion avoidance'?, etc].
4) And now, talking about 'spinlocks', first some definitions:
Critical Region= A region of memory shared by 2 or more processes.
Lock= A variable whose value allows or denies the entrance to a 'critical region'. (It could be implemented as a simple 'boolean flag').
Busy waiting= Continuosly testing of a variable until some value appears.
Finally:
Spin-lock (aka Spinlock)= A lock which uses busy waiting. (The acquiring of the lock is made by xchg or similar atomic operations).
[No thread sleeping, mostly used at kernel level only. Ineffcient for User level code].
As a last comment, I am not sure but I can bet you some big bucks that the above first 3 synchronizing objects (#1, #2 and #3) make use of this simple beast (#4) as part of their implementation.
Have a good day!.
References:
-Real-Time Concepts for Embedded Systems by Qing Li with Caroline Yao (CMP Books).
-Modern Operating Systems (3rd) by Andrew Tanenbaum (Pearson Education International).
-Programming Applications for Microsoft Windows (4th) by Jeffrey Richter (Microsoft Programming Series).
Also, you can take a look at look at:
https://stackoverflow.com/a/24586803/3163691

Most problems can be solved using (i) just locks, (ii) just semaphores, ..., or (iii) a combination of both! As you may have discovered, they're very similar: both prevent race conditions, both have acquire()/release() operations, both cause zero or more threads to become blocked/suspected...
Really, the crucial difference lies solely on how they lock and unlock.
A lock (or mutex) has two states (0 or 1). It can be either unlocked or locked. They're often used to ensure only one thread enters a critical section at a time.
A semaphore has many states (0, 1, 2, ...). It can be locked (state 0) or unlocked (states 1, 2, 3, ...). One or more semaphores are often used together to ensure that only one thread enters a critical section precisely when the number of units of some resource has/hasn't reached a particular value (either via counting down to that value or counting up to that value).
For both locks/semaphores, trying to call acquire() while the primitive is in state 0 causes the invoking thread to be suspended. For locks - attempts to acquire the lock is in state 1 are successful. For semaphores - attempts to acquire the lock in states {1, 2, 3, ...} are successful.
For locks in state state 0, if same thread that had previously called acquire(), now calls release, then the release is successful. If a different thread tried this -- it is down to the implementation/library as to what happens (usually the attempt ignored or an error is thrown). For semaphores in state 0, any thread can call release and it will be successful (regardless of which thread previous used acquire to put the semaphore in state 0).
From the preceding discussion, we can see that locks have a notion of an owner (the sole thread that can call release is the owner), whereas semaphores do not have an owner (any thread can call release on a semaphore).
What causes a lot of confusion is that, in practice they are many variations of this high-level definition.
Important variations to consider:
What should the acquire()/release() be called? -- [Varies massively]
Does your lock/semaphore use a "queue" or a "set" to remember the threads waiting?
Can your lock/semaphore be shared with threads of other processes?
Is your lock "reentrant"? -- [Usually yes].
Is your lock "blocking/non-blocking"? -- [Normally non-blocking are used as blocking locks (aka spin-locks) cause busy waiting].
How do you ensure the operations are "atomic"?
These depends on your book / lecturer / language / library / environment.
Here's a quick tour noting how some languages answer these details.
C, C++ (pthreads)
A mutex is implemented via pthread_mutex_t. By default, they can't be shared with any other processes (PTHREAD_PROCESS_PRIVATE), however mutex's have an attribute called pshared. When set, so the mutex is shared between processes (PTHREAD_PROCESS_SHARED).
A lock is the same thing as a mutex.
A semaphore is implemented via sem_t. Similar to mutexes, semaphores can be shared between threasds of many processes or kept private to the threads of one single process. This depends on the pshared argument provided to sem_init.
python (threading.py)
A lock (threading.RLock) is mostly the same as C/C++ pthread_mutex_ts. Both are both reentrant. This means they may only be unlocked by the same thread that locked it. It is the case that sem_t semaphores, threading.Semaphore semaphores and theading.Lock locks are not reentrant -- for it is the case any thread can perform unlock the lock / down the semaphore.
A mutex is the same as a lock (the term is not used often in python).
A semaphore (threading.Semaphore) is mostly the same as sem_t. Although with sem_t, a queue of thread ids is used to remember the order in which threads became blocked when attempting to lock it while it is locked. When a thread unlocks a semaphore, the first thread in the queue (if there is one) is chosen to be the new owner. The thread identifier is taken off the queue and the semaphore becomes locked again. However, with threading.Semaphore, a set is used instead of a queue, so the order in which threads became blocked is not stored -- any thread in the set may be chosen to be the next owner.
Java (java.util.concurrent)
A lock (java.util.concurrent.ReentrantLock) is mostly the same as C/C++ pthread_mutex_t's, and Python's threading.RLock in that it also implements a reentrant lock. Sharing locks between processes is harder in Java because of the JVM acting as an intermediary. If a thread tries to unlock a lock it doesn't own, an IllegalMonitorStateException is thrown.
A mutex is the same as a lock (the term is not used often in Java).
A semaphore (java.util.concurrent.Semaphore) is mostly the same as sem_t and threading.Semaphore. The constructor for Java semaphores accept a fairness boolean parameter that control whether to use a set (false) or a queue (true) for storing the waiting threads.
In theory, semaphores are often discussed, but in practice, semaphores aren't used so much. A semaphore only hold the state of one integer, so often it's rather inflexible and many are needed at once -- causing difficulty in understanding code. Also, the fact that any thread can release a semaphore is sometimes undesired. More object-oriented / higher-level synchronization primitives / abstractions such as "condition variables" and "monitors" are used instead.

Take a look at Multithreading Tutorial by John Kopplin.
In the section Synchronization Between Threads, he explain the differences among event, lock, mutex, semaphore, waitable timer
A mutex can be owned by only one thread at a time, enabling threads to
coordinate mutually exclusive access to a shared resource
Critical section objects provide synchronization similar to that
provided by mutex objects, except that critical section objects can be
used only by the threads of a single process
Another difference between a mutex and a critical section is that if
the critical section object is currently owned by another thread,
EnterCriticalSection() waits indefinitely for ownership whereas
WaitForSingleObject(), which is used with a mutex, allows you to
specify a timeout
A semaphore maintains a count between zero and some maximum value,
limiting the number of threads that are simultaneously accessing a
shared resource.

I will try to cover it with examples:
Lock: One example where you would use lock would be a shared dictionary into which items (that must have unique keys) are added.
The lock would ensure that one thread does not enter the mechanism of code that is checking for item being in dictionary while another thread (that is in the critical section) already has passed this check and is adding the item. If another thread tries to enter a locked code, it will wait (be blocked) until the object is released.
private static readonly Object obj = new Object();
lock (obj) //after object is locked no thread can come in and insert item into dictionary on a different thread right before other thread passed the check...
{
if (!sharedDict.ContainsKey(key))
{
sharedDict.Add(item);
}
}
Semaphore:
Let's say you have a pool of connections, then an single thread might reserve one element in the pool by waiting for the semaphore to get a connection. It then uses the connection and when work is done releases the connection by releasing the semaphore.
Code example that I love is one of bouncer given by #Patric - here it goes:
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
namespace TheNightclub
{
public class Program
{
public static Semaphore Bouncer { get; set; }
public static void Main(string[] args)
{
// Create the semaphore with 3 slots, where 3 are available.
Bouncer = new Semaphore(3, 3);
// Open the nightclub.
OpenNightclub();
}
public static void OpenNightclub()
{
for (int i = 1; i <= 50; i++)
{
// Let each guest enter on an own thread.
Thread thread = new Thread(new ParameterizedThreadStart(Guest));
thread.Start(i);
}
}
public static void Guest(object args)
{
// Wait to enter the nightclub (a semaphore to be released).
Console.WriteLine("Guest {0} is waiting to entering nightclub.", args);
Bouncer.WaitOne();
// Do some dancing.
Console.WriteLine("Guest {0} is doing some dancing.", args);
Thread.Sleep(500);
// Let one guest out (release one semaphore).
Console.WriteLine("Guest {0} is leaving the nightclub.", args);
Bouncer.Release(1);
}
}
}
Mutex It is pretty much Semaphore(1,1) and often used globally (application wide otherwise arguably lock is more appropriate). One would use global Mutex when deleting node from a globally accessible list (last thing you want another thread to do something while you are deleting the node). When you acquire Mutex if different thread tries to acquire the same Mutex it will be put to sleep till SAME thread that acquired the Mutex releases it.
Good example on creating global mutex is by #deepee
class SingleGlobalInstance : IDisposable
{
public bool hasHandle = false;
Mutex mutex;
private void InitMutex()
{
string appGuid = ((GuidAttribute)Assembly.GetExecutingAssembly().GetCustomAttributes(typeof(GuidAttribute), false).GetValue(0)).Value.ToString();
string mutexId = string.Format("Global\\{{{0}}}", appGuid);
mutex = new Mutex(false, mutexId);
var allowEveryoneRule = new MutexAccessRule(new SecurityIdentifier(WellKnownSidType.WorldSid, null), MutexRights.FullControl, AccessControlType.Allow);
var securitySettings = new MutexSecurity();
securitySettings.AddAccessRule(allowEveryoneRule);
mutex.SetAccessControl(securitySettings);
}
public SingleGlobalInstance(int timeOut)
{
InitMutex();
try
{
if(timeOut < 0)
hasHandle = mutex.WaitOne(Timeout.Infinite, false);
else
hasHandle = mutex.WaitOne(timeOut, false);
if (hasHandle == false)
throw new TimeoutException("Timeout waiting for exclusive access on SingleInstance");
}
catch (AbandonedMutexException)
{
hasHandle = true;
}
}
public void Dispose()
{
if (mutex != null)
{
if (hasHandle)
mutex.ReleaseMutex();
mutex.Dispose();
}
}
}
then use like:
using (new SingleGlobalInstance(1000)) //1000ms timeout on global lock
{
//Only 1 of these runs at a time
GlobalNodeList.Remove(node)
}
Hope this saves you some time.

Wikipedia has a great section on the differences between Semaphores and Mutexes:
A mutex is essentially the same thing as a binary semaphore and
sometimes uses the same basic implementation. The differences between
them are:
Mutexes have a concept of an owner, which is the process
that locked the mutex. Only the process that locked the mutex can
unlock it. In contrast, a semaphore has no concept of an owner. Any
process can unlock a semaphore.
Unlike semaphores, mutexes provide
priority inversion safety. Since the mutex knows its current owner, it
is possible to promote the priority of the owner whenever a
higher-priority task starts waiting on the mutex.
Mutexes also provide
deletion safety, where the process holding the mutex cannot be
accidentally deleted. Semaphores do not provide this.

lock, mutex, semaphore
It is a general vision. Details are depended on real language realisation
lock - thread synchronization tool. When thread get a lock it becomes a single thread which is able to execute a block of code. All others thread are blocked. Only thread which owns the lock can unlock it
mutex - mutual exclusion lock. It is a kind of lock. On some languages it is inter-process mechanism, on some languages it is a synonym of lock. For example Java uses lock in synchronised and java.util.concurrent.locks.Lock
semaphore - allows a number of threads to access a shared resource. You can find that mutex also can be implemented by semaphore. It is a standalone object which manage an access to shared resource. You can find that any thread can signal and unblock. Also it is used for signalling
[iOS lock, mutex, semaphore]

My understanding is that a mutex is only for use within a single process, but across its many threads, whereas a semaphore may be used across multiple processes, and across their corresponding sets of threads.
Also, a mutex is binary (it's either locked or unlocked), whereas a semaphore has a notion of counting, or a queue of more than one lock and unlock requests.
Could someone verify my explanation? I'm speaking in the context of Linux, specifically Red Hat Enterprise Linux (RHEL) version 6, which uses kernel 2.6.32.

Using C programming on a Linux variant as a base case for examples.
Lock:
• Usually a very simple construct binary in operation either locked or unlocked
• No concept of thread ownership, priority, sequencing etc.
• Usually a spin lock where the thread continuously checks for the locks availability.
• Usually relies on atomic operations e.g. Test-and-set, compare-and-swap, fetch-and-add etc.
• Usually requires hardware support for atomic operation.
File Locks:
• Usually used to coordinate access to a file via multiple processes.
• Multiple processes can hold the read lock however when any single process holds the write lock no other process is allowed to acquire a read or write lock.
• Example : flock, fcntl etc..
Mutex:
• Mutex function calls usually work in kernel space and result in system calls.
• It uses the concept of ownership. Only the thread that currently holds the mutex can unlock it.
• Mutex is not recursive (Exception: PTHREAD_MUTEX_RECURSIVE).
• Usually used in Association with Condition Variables and passed as arguments to e.g. pthread_cond_signal, pthread_cond_wait etc.
• Some UNIX systems allow mutex to be used by multiple processes although this may not be enforced on all systems.
Semaphore:
• This is a kernel maintained integer whose values is not allowed to fall below zero.
• It can be used to synchronize processes.
• The value of the semaphore may be set to a value greater than 1 in which case the value usually indicates the number of resources available.
• A semaphore whose value is restricted to 1 and 0 is referred to as a binary semaphore.

Supporting ownership, maximum number of processes share lock and the maximum number of allowed processes/threads in critical section are three major factors that determine the name/type of the concurrent object with general name of lock. Since the value of these factors are binary (have two states), we can summarize them in a 3*8 truth-like table.
X (Supports Ownership?): no(0) / yes(1)
Y (#sharing processes): > 1 (∞) / 1
Z (#processes/threads in CA): > 1 (∞) / 1
X Y Z Name
--- --- --- ------------------------
0 ∞ ∞ Semaphore
0 ∞ 1 Binary Semaphore
0 1 ∞ SemaphoreSlim
0 1 1 Binary SemaphoreSlim(?)
1 ∞ ∞ Recursive-Mutex(?)
1 ∞ 1 Mutex
1 1 ∞ N/A(?)
1 1 1 Lock/Monitor
Feel free to edit or expand this table, I've posted it as an ascii table to be editable:)

what is the difference between Synchronization and Mutex [duplicate]

Is there any difference between a binary semaphore and mutex or are they essentially the same?

They are NOT the same thing. They are used for different purposes!
While both types of semaphores have a full/empty state and use the same API, their usage is very different.
Mutual Exclusion Semaphores
Mutual Exclusion semaphores are used to protect shared resources (data structure, file, etc..).
A Mutex semaphore is "owned" by the task that takes it. If Task B attempts to semGive a mutex currently held by Task A, Task B's call will return an error and fail.
Mutexes always use the following sequence:
- SemTake
- Critical Section
- SemGive
Here is a simple example:
Thread A Thread B
Take Mutex
access data
... Take Mutex <== Will block
...
Give Mutex access data <== Unblocks
...
Give Mutex
Binary Semaphore
Binary Semaphore address a totally different question:
Task B is pended waiting for something to happen (a sensor being tripped for example).
Sensor Trips and an Interrupt Service Routine runs. It needs to notify a task of the trip.
Task B should run and take appropriate actions for the sensor trip. Then go back to waiting.
Task A Task B
... Take BinSemaphore <== wait for something
Do Something Noteworthy
Give BinSemaphore do something <== unblocks
Note that with a binary semaphore, it is OK for B to take the semaphore and A to give it.
Again, a binary semaphore is NOT protecting a resource from access. The act of Giving and Taking a semaphore are fundamentally decoupled.
It typically makes little sense for the same task to so a give and a take on the same binary semaphore.

A mutex can be released only by the thread that had acquired it.
A binary semaphore can be signaled by any thread (or process).
so semaphores are more suitable for some synchronization problems like producer-consumer.
On Windows, binary semaphores are more like event objects than mutexes.

The Toilet example is an enjoyable analogy:
Mutex:
Is a key to a toilet. One person can
have the key - occupy the toilet - at
the time. When finished, the person
gives (frees) the key to the next
person in the queue.
Officially: "Mutexes are typically
used to serialise access to a section
of re-entrant code that cannot be
executed concurrently by more than one
thread. A mutex object only allows one
thread into a controlled section,
forcing other threads which attempt to
gain access to that section to wait
until the first thread has exited from
that section." Ref: Symbian Developer
Library
(A mutex is really a semaphore with
value 1.)
Semaphore:
Is the number of free identical toilet
keys. Example, say we have four
toilets with identical locks and keys.
The semaphore count - the count of
keys - is set to 4 at beginning (all
four toilets are free), then the count
value is decremented as people are
coming in. If all toilets are full,
ie. there are no free keys left, the
semaphore count is 0. Now, when eq.
one person leaves the toilet,
semaphore is increased to 1 (one free
key), and given to the next person in
the queue.
Officially: "A semaphore restricts the
number of simultaneous users of a
shared resource up to a maximum
number. Threads can request access to
the resource (decrementing the
semaphore), and can signal that they
have finished using the resource
(incrementing the semaphore)." Ref:
Symbian Developer Library

Nice articles on the topic:
MUTEX VS. SEMAPHORES – PART 1: SEMAPHORES
MUTEX VS. SEMAPHORES – PART 2: THE MUTEX
MUTEX VS. SEMAPHORES – PART 3 (FINAL PART): MUTUAL EXCLUSION PROBLEMS
From part 2:
The mutex is similar to the principles
of the binary semaphore with one
significant difference: the principle
of ownership. Ownership is the simple
concept that when a task locks
(acquires) a mutex only it can unlock
(release) it. If a task tries to
unlock a mutex it hasn’t locked (thus
doesn’t own) then an error condition
is encountered and, most importantly,
the mutex is not unlocked. If the
mutual exclusion object doesn't have
ownership then, irrelevant of what it
is called, it is not a mutex.

Since none of the above answer clears the confusion, here is one which cleared my confusion.
Strictly speaking, a mutex is a locking mechanism used to
synchronize access to a resource. Only one task (can be a thread or
process based on OS abstraction) can acquire the mutex. It means there
will be ownership associated with mutex, and only the owner can
release the lock (mutex).
Semaphore is signaling mechanism (“I am done, you can carry on” kind of signal). For example, if you are listening songs (assume it as
one task) on your mobile and at the same time your friend called you,
an interrupt will be triggered upon which an interrupt service routine
(ISR) will signal the call processing task to wakeup.
Source: http://www.geeksforgeeks.org/mutex-vs-semaphore/

Their synchronization semantics are very different:
mutexes allow serialization of access to a given resource i.e. multiple threads wait for a lock, one at a time and as previously said, the thread owns the lock until it is done: only this particular thread can unlock it.
a binary semaphore is a counter with value 0 and 1: a task blocking on it until any task does a sem_post. The semaphore advertises that a resource is available, and it provides the mechanism to wait until it is signaled as being available.
As such one can see a mutex as a token passed from task to tasks and a semaphore as traffic red-light (it signals someone that it can proceed).

At a theoretical level, they are no different semantically. You can implement a mutex using semaphores or vice versa (see here for an example). In practice, the implementations are different and they offer slightly different services.
The practical difference (in terms of the system services surrounding them) is that the implementation of a mutex is aimed at being a more lightweight synchronisation mechanism. In oracle-speak, mutexes are known as latches and semaphores are known as waits.
At the lowest level, they use some sort of atomic test and set mechanism. This reads the current value of a memory location, computes some sort of conditional and writes out a value at that location in a single instruction that cannot be interrupted. This means that you can acquire a mutex and test to see if anyone else had it before you.
A typical mutex implementation has a process or thread executing the test-and-set instruction and evaluating whether anything else had set the mutex. A key point here is that there is no interaction with the scheduler, so we have no idea (and don't care) who has set the lock. Then we either give up our time slice and attempt it again when the task is re-scheduled or execute a spin-lock. A spin lock is an algorithm like:
Count down from 5000:
i. Execute the test-and-set instruction
ii. If the mutex is clear, we have acquired it in the previous instruction
so we can exit the loop
iii. When we get to zero, give up our time slice.
When we have finished executing our protected code (known as a critical section) we just set the mutex value to zero or whatever means 'clear.' If multiple tasks are attempting to acquire the mutex then the next task that happens to be scheduled after the mutex is released will get access to the resource. Typically you would use mutexes to control a synchronised resource where exclusive access is only needed for very short periods of time, normally to make an update to a shared data structure.
A semaphore is a synchronised data structure (typically using a mutex) that has a count and some system call wrappers that interact with the scheduler in a bit more depth than the mutex libraries would. Semaphores are incremented and decremented and used to block tasks until something else is ready. See Producer/Consumer Problem for a simple example of this. Semaphores are initialised to some value - a binary semaphore is just a special case where the semaphore is initialised to 1. Posting to a semaphore has the effect of waking up a waiting process.
A basic semaphore algorithm looks like:
(somewhere in the program startup)
Initialise the semaphore to its start-up value.
Acquiring a semaphore
i. (synchronised) Attempt to decrement the semaphore value
ii. If the value would be less than zero, put the task on the tail of the list of tasks waiting on the semaphore and give up the time slice.
Posting a semaphore
i. (synchronised) Increment the semaphore value
ii. If the value is greater or equal to the amount requested in the post at the front of the queue, take that task off the queue and make it runnable.
iii. Repeat (ii) for all tasks until the posted value is exhausted or there are no more tasks waiting.
In the case of a binary semaphore the main practical difference between the two is the nature of the system services surrounding the actual data structure.
EDIT: As evan has rightly pointed out, spinlocks will slow down a single processor machine. You would only use a spinlock on a multi-processor box because on a single processor the process holding the mutex will never reset it while another task is running. Spinlocks are only useful on multi-processor architectures.

Though mutex & semaphores are used as synchronization primitives ,there is a big difference between them.
In the case of mutex, only the thread that locked or acquired the mutex can unlock it.
In the case of a semaphore, a thread waiting on a semaphore can be signaled by a different thread.
Some operating system supports using mutex & semaphores between process. Typically usage is creating in shared memory.

Mutex: Suppose we have critical section thread T1 wants to access it then it follows below steps.
T1:
Lock
Use Critical Section
Unlock
Binary semaphore: It works based on signaling wait and signal.
wait(s) decrease "s" value by one usually "s" value is initialize with value "1",
signal(s) increases "s" value by one. if "s" value is 1 means no one is using critical section, when value is 0 means critical section is in use.
suppose thread T2 is using critical section then it follows below steps.
T2 :
wait(s)//initially s value is one after calling wait it's value decreased by one i.e 0
Use critical section
signal(s) // now s value is increased and it become 1
Main difference between Mutex and Binary semaphore is in Mutext if thread lock the critical section then it has to unlock critical section no other thread can unlock it, but in case of Binary semaphore if one thread locks critical section using wait(s) function then value of s become "0" and no one can access it until value of "s" become 1 but suppose some other thread calls signal(s) then value of "s" become 1 and it allows other function to use critical section.
hence in Binary semaphore thread doesn't have ownership.

On Windows, there are two differences between mutexes and binary semaphores:
A mutex can only be released by the thread which has ownership, i.e. the thread which previously called the Wait function, (or which took ownership when creating it). A semaphore can be released by any thread.
A thread can call a wait function repeatedly on a mutex without blocking. However, if you call a wait function twice on a binary semaphore without releasing the semaphore in between, the thread will block.

Myth:
Couple of article says that "binary semaphore and mutex are same" or "Semaphore with value 1 is mutex" but the basic difference is Mutex can be released only by thread that had acquired it, while you can signal semaphore from any other thread
Key Points:
•A thread can acquire more than one lock (Mutex).
•A mutex can be locked more than once only if its a recursive mutex, here lock and unlock for mutex should be same
•If a thread which had already locked a mutex, tries to lock the mutex again, it will enter into the waiting list of that mutex, which results in deadlock.
•Binary semaphore and mutex are similar but not same.
•Mutex is costly operation due to protection protocols associated with it.
•Main aim of mutex is achieve atomic access or lock on resource

Mutex are used for " Locking Mechanisms ". one process at a time can use a shared resource
whereas
Semaphores are used for " Signaling Mechanisms "
like "I am done , now can continue"

You obviously use mutex to lock a data in one thread getting accessed by another thread at the same time. Assume that you have just called lock() and in the process of accessing data. This means that you don’t expect any other thread (or another instance of the same thread-code) to access the same data locked by the same mutex. That is, if it is the same thread-code getting executed on a different thread instance, hits the lock, then the lock() should block the control flow there. This applies to a thread that uses a different thread-code, which is also accessing the same data and which is also locked by the same mutex. In this case, you are still in the process of accessing the data and you may take, say, another 15 secs to reach the mutex unlock (so that the other thread that is getting blocked in mutex lock would unblock and would allow the control to access the data). Do you at any cost allow yet another thread to just unlock the same mutex, and in turn, allow the thread that is already waiting (blocking) in the mutex lock to unblock and access the data? Hope you got what I am saying here?
As per, agreed upon universal definition!,
with “mutex” this can’t happen. No other thread can unlock the lock
in your thread
with “binary-semaphore” this can happen. Any other thread can unlock
the lock in your thread
So, if you are very particular about using binary-semaphore instead of mutex, then you should be very careful in “scoping” the locks and unlocks. I mean that every control-flow that hits every lock should hit an unlock call, also there shouldn’t be any “first unlock”, rather it should be always “first lock”.

A Mutex controls access to a single shared resource. It provides operations to acquire() access to that resource and release() it when done.
A Semaphore controls access to a shared pool of resources. It provides operations to Wait() until one of the resources in the pool becomes available, and Signal() when it is given back to the pool.
When number of resources a Semaphore protects is greater than 1, it is called a Counting Semaphore. When it controls one resource, it is called a Boolean Semaphore. A boolean semaphore is equivalent to a mutex.
Thus a Semaphore is a higher level abstraction than Mutex. A Mutex can be implemented using a Semaphore but not the other way around.

Modified question is - What's the difference between A mutex and a "binary" semaphore in "Linux"?
Ans: Following are the differences –
i) Scope – The scope of mutex is within a process address space which has created it and is used for synchronization of threads. Whereas semaphore can be used across process space and hence it can be used for interprocess synchronization.
ii) Mutex is lightweight and faster than semaphore. Futex is even faster.
iii) Mutex can be acquired by same thread successfully multiple times with condition that it should release it same number of times. Other thread trying to acquire will block. Whereas in case of semaphore if same process tries to acquire it again it blocks as it can be acquired only once.

Diff between Binary Semaphore and Mutex:
OWNERSHIP:
Semaphores can be signalled (posted) even from a non current owner. It means you can simply post from any other thread, though you are not the owner.
Semaphore is a public property in process, It can be simply posted by a non owner thread.
Please Mark this difference in BOLD letters, it mean a lot.

Mutex work on blocking critical region, But Semaphore work on count.

http://www.geeksforgeeks.org/archives/9102 discusses in details.
Mutex is locking mechanism used to synchronize access to a resource.
Semaphore is signaling mechanism.
Its up to to programmer if he/she wants to use binary semaphore in place of mutex.

Apart from the fact that mutexes have an owner, the two objects may be optimized for different usage. Mutexes are designed to be held only for a short time; violating this can cause poor performance and unfair scheduling. For example, a running thread may be permitted to acquire a mutex, even though another thread is already blocked on it. Semaphores may provide more fairness, or fairness can be forced using several condition variables.

In windows the difference is as below.
MUTEX: process which successfully executes wait has to execute a signal and vice versa. BINARY SEMAPHORES: Different processes can execute wait or signal operation on a semaphore.

While a binary semaphore may be used as a mutex, a mutex is a more specific use-case, in that only the process that locked the mutex is supposed to unlock it. This ownership constraint makes it possible to provide protection against:
Accidental release
Recursive Deadlock
Task Death Deadlock
These constraints are not always present because they degrade the speed. During the development of your code, you can enable these checks temporarily.
e.g. you can enable Error check attribute in your mutex. Error checking mutexes return EDEADLK if you try to lock the same one twice and EPERM if you unlock a mutex that isn't yours.
pthread_mutex_t mutex;
pthread_mutexattr_t attr;
pthread_mutexattr_init (&attr);
pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK_NP);
pthread_mutex_init (&mutex, &attr);
Once initialised we can place these checks in our code like this:
if(pthread_mutex_unlock(&mutex)==EPERM)
printf("Unlock failed:Mutex not owned by this thread\n");

The concept was clear to me after going over above posts. But there were some lingering questions. So, I wrote this small piece of code.
When we try to give a semaphore without taking it, it goes through. But, when you try to give a mutex without taking it, it fails. I tested this on a Windows platform. Enable USE_MUTEX to run the same code using a MUTEX.
#include <stdio.h>
#include <windows.h>
#define xUSE_MUTEX 1
#define MAX_SEM_COUNT 1
DWORD WINAPI Thread_no_1( LPVOID lpParam );
DWORD WINAPI Thread_no_2( LPVOID lpParam );
HANDLE Handle_Of_Thread_1 = 0;
HANDLE Handle_Of_Thread_2 = 0;
int Data_Of_Thread_1 = 1;
int Data_Of_Thread_2 = 2;
HANDLE ghMutex = NULL;
HANDLE ghSemaphore = NULL;
int main(void)
{
#ifdef USE_MUTEX
ghMutex = CreateMutex( NULL, FALSE, NULL);
if (ghMutex == NULL)
{
printf("CreateMutex error: %d\n", GetLastError());
return 1;
}
#else
// Create a semaphore with initial and max counts of MAX_SEM_COUNT
ghSemaphore = CreateSemaphore(NULL,MAX_SEM_COUNT,MAX_SEM_COUNT,NULL);
if (ghSemaphore == NULL)
{
printf("CreateSemaphore error: %d\n", GetLastError());
return 1;
}
#endif
// Create thread 1.
Handle_Of_Thread_1 = CreateThread( NULL, 0,Thread_no_1, &Data_Of_Thread_1, 0, NULL);
if ( Handle_Of_Thread_1 == NULL)
{
printf("Create first thread problem \n");
return 1;
}
/* sleep for 5 seconds **/
Sleep(5 * 1000);
/*Create thread 2 */
Handle_Of_Thread_2 = CreateThread( NULL, 0,Thread_no_2, &Data_Of_Thread_2, 0, NULL);
if ( Handle_Of_Thread_2 == NULL)
{
printf("Create second thread problem \n");
return 1;
}
// Sleep for 20 seconds
Sleep(20 * 1000);
printf("Out of the program \n");
return 0;
}
int my_critical_section_code(HANDLE thread_handle)
{
#ifdef USE_MUTEX
if(thread_handle == Handle_Of_Thread_1)
{
/* get the lock */
WaitForSingleObject(ghMutex, INFINITE);
printf("Thread 1 holding the mutex \n");
}
#else
/* get the semaphore */
if(thread_handle == Handle_Of_Thread_1)
{
WaitForSingleObject(ghSemaphore, INFINITE);
printf("Thread 1 holding semaphore \n");
}
#endif
if(thread_handle == Handle_Of_Thread_1)
{
/* sleep for 10 seconds */
Sleep(10 * 1000);
#ifdef USE_MUTEX
printf("Thread 1 about to release mutex \n");
#else
printf("Thread 1 about to release semaphore \n");
#endif
}
else
{
/* sleep for 3 secconds */
Sleep(3 * 1000);
}
#ifdef USE_MUTEX
/* release the lock*/
if(!ReleaseMutex(ghMutex))
{
printf("Release Mutex error in thread %d: error # %d\n", (thread_handle == Handle_Of_Thread_1 ? 1:2),GetLastError());
}
#else
if (!ReleaseSemaphore(ghSemaphore,1,NULL) )
{
printf("ReleaseSemaphore error in thread %d: error # %d\n",(thread_handle == Handle_Of_Thread_1 ? 1:2), GetLastError());
}
#endif
return 0;
}
DWORD WINAPI Thread_no_1( LPVOID lpParam )
{
my_critical_section_code(Handle_Of_Thread_1);
return 0;
}
DWORD WINAPI Thread_no_2( LPVOID lpParam )
{
my_critical_section_code(Handle_Of_Thread_2);
return 0;
}
The very fact that semaphore lets you signal "it is done using a resource", even though it never owned the resource, makes me think there is a very loose coupling between owning and signaling in the case of semaphores.

Best Solution
The only difference is
1.Mutex -> lock and unlock are under the ownership of a thread that locks the mutex.
2.Semaphore -> No ownership i.e; if one thread calls semwait(s) any other thread can call sempost(s) to remove the lock.

Mutex is used to protect the sensitive code and data, semaphore is used to synchronization.You also can have practical use with protect the sensitive code, but there might be a risk that release the protection by the other thread by operation V.So The main difference between bi-semaphore and mutex is the ownership.For instance by toilet , Mutex is like that one can enter the toilet and lock the door, no one else can enter until the man get out, bi-semaphore is like that one can enter the toilet and lock the door, but someone else could enter by asking the administrator to open the door, it's ridiculous.

I think most of the answers here were confusing especially those saying that mutex can be released only by the process that holds it but semaphore can be signaled by ay process. The above line is kind of vague in terms of semaphore. To understand we should know that there are two kinds of semaphore one is called counting semaphore and the other is called a binary semaphore. In counting semaphore handles access to n number of resources where n can be defined before the use. Each semaphore has a count variable, which keeps the count of the number of resources in use, initially, it is set to n. Each process that wishes to uses a resource performs a wait() operation on the semaphore (thereby decrementing the count). When a process releases a resource, it performs a release() operation (incrementing the count). When the count becomes 0, all the resources are being used. After that, the process waits until the count becomes more than 0. Now here is the catch only the process that holds the resource can increase the count no other process can increase the count only the processes holding a resource can increase the count and the process waiting for the semaphore again checks and when it sees the resource available it decreases the count again. So in terms of binary semaphore, only the process holding the semaphore can increase the count, and count remains zero until it stops using the semaphore and increases the count and other process gets the chance to access the semaphore.
The main difference between binary semaphore and mutex is that semaphore is a signaling mechanism and mutex is a locking mechanism, but binary semaphore seems to function like mutex that creates confusion, but both are different concepts suitable for a different kinds of work.

The answer may depend on the target OS. For example, at least one RTOS implementation I'm familiar with will allow multiple sequential "get" operations against a single OS mutex, so long as they're all from within the same thread context. The multiple gets must be replaced by an equal number of puts before another thread will be allowed to get the mutex. This differs from binary semaphores, for which only a single get is allowed at a time, regardless of thread contexts.
The idea behind this type of mutex is that you protect an object by only allowing a single context to modify the data at a time. Even if the thread gets the mutex and then calls a function that further modifies the object (and gets/puts the protector mutex around its own operations), the operations should still be safe because they're all happening under a single thread.
{
mutexGet(); // Other threads can no longer get the mutex.
// Make changes to the protected object.
// ...
objectModify(); // Also gets/puts the mutex. Only allowed from this thread context.
// Make more changes to the protected object.
// ...
mutexPut(); // Finally allows other threads to get the mutex.
}
Of course, when using this feature, you must be certain that all accesses within a single thread really are safe!
I'm not sure how common this approach is, or whether it applies outside of the systems with which I'm familiar. For an example of this kind of mutex, see the ThreadX RTOS.

Mutexes have ownership, unlike semaphores. Although any thread, within the scope of a mutex, can get an unlocked mutex and lock access to the same critical section of code,only the thread that locked a mutex should unlock it.

As many folks here have mentioned, a mutex is used to protect a critical piece of code (AKA critical section.) You will acquire the mutex (lock), enter critical section, and release mutex (unlock) all in the same thread.
While using a semaphore, you can make a thread wait on a semaphore (say thread A), until another thread (say thread B)completes whatever task, and then sets the Semaphore for thread A to stop the wait, and continue its task.

MUTEX
Until recently, the only sleeping lock in the kernel was the semaphore. Most users of semaphores instantiated a semaphore with a count of one and treated them as a mutual exclusion lock—a sleeping version of the spin-lock. Unfortunately, semaphores are rather generic and do not impose any usage constraints. This makes them useful for managing exclusive access in obscure situations, such as complicated dances between the kernel and userspace. But it also means that simpler locking is harder to do, and the lack of enforced rules makes any sort of automated debugging or constraint enforcement impossible. Seeking a simpler sleeping lock, the kernel developers introduced the mutex.Yes, as you are now accustomed to, that is a confusing name. Let’s clarify.The term “mutex” is a generic name to refer to any sleeping lock that enforces mutual exclusion, such as a semaphore with a usage count of one. In recent Linux kernels, the proper noun “mutex” is now also a specific type of sleeping lock that implements mutual exclusion.That is, a mutex is a mutex.
The simplicity and efficiency of the mutex come from the additional constraints it imposes on its users over and above what the semaphore requires. Unlike a semaphore, which implements the most basic of behaviour in accordance with Dijkstra’s original design, the mutex has a stricter, narrower use case:
n Only one task can hold the mutex at a time. That is, the usage count on a mutex is always one.
Whoever locked a mutex must unlock it. That is, you cannot lock a mutex in one
context and then unlock it in another. This means that the mutex isn’t suitable for more complicated synchronizations between kernel and user-space. Most use cases,
however, cleanly lock and unlock from the same context.
Recursive locks and unlocks are not allowed. That is, you cannot recursively acquire the same mutex, and you cannot unlock an unlocked mutex.
A process cannot exit while holding a mutex.
A mutex cannot be acquired by an interrupt handler or bottom half, even with
mutex_trylock().
A mutex can be managed only via the official API: It must be initialized via the methods described in this section and cannot be copied, hand initialized, or reinitialized.
[1] Linux Kernel Development, Third Edition Robert Love

Mutex and binary semaphore are both of the same usage, but in reality, they are different.
In case of mutex, only the thread which have locked it can unlock it. If any other thread comes to lock it, it will wait.
In case of semaphone, that's not the case. Semaphore is not tied up with a particular thread ID.

Why Non blocking Concurrency is better than blocking concurrency

I just want to know why Non Blocking concurrency is better than Blocking concurrency. In Blocking Concurrency Your thread must wait till other thread completes its execution. So thread would not consuming CPU in that case.
But if I talk about Non Blocking Concurrency, Threads do not wait to get a lock they immediately returns if certain threads is containing the lock.
For Example in ConcurrentHashMap class , inside put() method there is tryLock() in a loop. Other thread will be active and continuously trying to check if lock has been released or not because tryLock() is Non Blocking. I assume in this case, CPU is unnecessary used.
So Is it not good to suspend the thread till other thread completes its execution and wake the thread up when work is finished?

Whether or not blocking or non-blocking concurrency is better depends on how long you expect to have to wait to acquire the resource you're waiting on.
With a blocking wait (i.e. a mutex lock, in C parlance), the operating system kernel puts the waiting thread to sleep. The CPU scheduler will not allocate any time to it until after the required resource has become available. The advantage here is that, as you said, this thread won't consume any CPU resources while it is sleeping.
There is a disadvantage, however: the process of putting the thread to sleep, determining when it should be woken, and waking it up again is complex and expensive, and may negate the savings achieved by not having the thread consume CPU while waiting. In addition (and possibly because of this), the OS may choose not to wake the thread immediately once the resource becomes available, so the lock may be waited on longer than is necessary.
A non-blocking wait (also known as a spinlock) does consume CPU resource while waiting, but saves the expense of putting the thread to sleep, determining when it should be woken, and waking it. It also may be able to respond faster once the lock becomes free, as it is less at the whim of the OS in terms of when it can proceed with execution.
So, as a very general rule, you should prefer a spinlock if you expect to only wait a short time (e.g. the few CPU cycles it might take for another thread to finish with an entry in ConcurrentHashMap). For longer waits (e.g. on synchronized I/O, or a number of threads waiting on a single complex computation), a mutex (blocking wait) may be preferable.

If you consider ConcurrentHashMap as an example , considering the overhead due to multiple threads performing update operations (like put) , and block waiting for the locks to release (as you mention other thread will be active and continuously trying to check if lock has been released), is not going to be the case,always.
Compared to HashTable , Concurrency control in ConcurrentHashMap is split up. So multiple threads can acquire lock(on segments of the table).
Originally, the ConcurrentHashMap class supports a hard-wired preset concurrency level of 32. This allows a maximum of 32 put and/or remove operations to proceed concurrently(factors other than synchronization tend to be bottlenecks when more than 32 threads concurrently attempt updates.)
Also, successful retrievals (when the key is present) using get(key) and containsKey(key) usually run without locking.
So for instance, one thread might be in the process of adding an element, what cannot be done with such a locking strategy is operations like add an element only if it is not already present (ConcurrentReaderHashMap provides such facilities).
Also, the size() and isEmpty() methods require accumulations across 32 control segments, and so might be slightly slower.

Is it possible to allow more number of threads than concurrency level in ConcurrentHashMap?

Suppose we have a ConcurrentHashMap of size 16 .i.e concurrency level 16. If number of threads working on this map is greater than 16 i.e. somewhere 20 to 25 then what will happen to the extra threads whether they will be in waiting state?
If yes then how long they will be in waiting state and internally how they will communicate with other threads?

There's nothing magic about the concurrency level. Even if you have fewer than 16 threads in your example, they still can collide when attempting to access the map.
If the concurrency level is 16, then when any two threads try to put two different keys into the map, there will be one chance in 16 that the two keys are assigned to the same segment. If that's the case, then their access to the segment will have to be serialized. (i.e., one of the threads will have to be blocked until the other thread has finished.)
Of course, if two threads try to access the same key at the same time then it's guaranteed that they will access the same segment and one will be blocked, no matter what the concurrency level.
what will happen to the extra threads whether they will be in waiting state?
Nothing. That's what a synchronized statement does. It does nothing (a.k.a., it "blocks") until the lock becomes available. Then, it acquires the lock, performs the body of the statement, and releases the lock.
how long they will be in waiting state?
A thread that is blocked on a mutex lock will remain blocked until the lock is released by some other thread. In well designed code, that should only be a very short time---just long enough for the other thread to update a few fields.
internally how they will communicate with other threads?
Not sure what you mean by "internally". A thread that is blocked can't communicate with other threads. It can't do anything until it is unblocked.
Maybe you are asking how the operating system knows what to unblock, and when.
When a thread attemts to lock a mutex that already is in use by some other thread, the operating system will suspend the thread, save its state, and put some token that represents the thread into a queue that is associated with the mutex. When the owner of the thread releases the mutex, the OS picks a thread from the queue, transfers ownership of the mutex to the selected thread, restores the thread's saved state, and lets it run.
The algorithms that different operating systems use to select which thread to unblock, and the means of saving and restoring thread contexts (a.k.a., "context switch") are deeper subjects than I have room or time to discuss here.

Does Yield/Join release monitor lock? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Does thread.yield() lose the lock on object if called inside a synchronized method?
I know Thread.sleep() holds the lock, but Object.wait() releases the lock. Some say yield actually implements sleep(0). Does this mean yield will not release the lock?
Another question. Say the current thread has acquired a lock, and then called anotherThread.join(). Does the current thread release the lock?

Unless the javadoc mentions an object's monitor (such as Object.wait()), you should assume that any locks will continue to be held. So:
Does this mean yield will not release the lock?
Yes.
Does the current thread release the lock?
No.

sleep puts the thread in a wait state, yield returns the thread directly to the ready pool. (So if a thread yields it could go directly from running to the ready pool to getting picked by the scheduler again without ever waiting.) Neither one has anything to do with locking.
From the Java Language Specification:
Thread.sleep causes the currently executing thread to sleep
(temporarily cease execution) for the specified duration, subject to
the precision and accuracy of system timers and schedulers. The thread
does not lose ownership of any monitors, and resumption of execution
will depend on scheduling and the availability of processors on which
to execute the thread.
It is important to note that neither Thread.sleep nor Thread.yield
have any synchronization semantics. In particular, the compiler does
not have to flush writes cached in registers out to shared memory
before a call to Thread.sleep or Thread.yield, nor does the compiler
have to reload values cached in registers after a call to Thread.sleep
or Thread.yield.
For example, in the following (broken) code fragment, assume that
this.done is a non-volatile boolean field:
while (!this.done)
Thread.sleep(1000);
The compiler is free to read the field this.done just once, and reuse
the cached value in each execution of the loop. This would mean that
the loop would never terminate, even if another thread changed the
value of this.done.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.