Strangely enough, I haven't used Semaphore before...
Anyway I was reviewing some code using it and saw that unlike locks, a permit can be released by another thread (i.e. no ownership).
I looked into Concurrency in Action and it says (p.98):
The implementation has no actual permit objects....so a permit
acquired by one thread can be released by another
I didn't notice this detail before and looked into an OS textbook I have that said (my emphasis):
When one process modifies the semaphore value no other process
....etc
So is this a Java-specific design decision? I mean the fact that a semaphore is not owned by a thread.
Or am I misunderstanding the concept of semaphore?
Note: This is not a question of whether this is a good/bad design etc. I am just trying to be sure I understand the concept
According to Wikipedia, a semaphore does not track which objects are acquired or released, only how many. Hence "ownership" is not applicable here. Read the section "Important observations"!
So there is no ownership, and in this regard the Java semaphore does the right thing. Unix semaphores (see semop(2)) work this way too.
However, some textbooks seem to mix the terms "mutex", "lock" and "semaphore" quite liberally - you can judge the quality of those texts on your own.
EDIT:
I could not believe that Tanenbaum does not distinguish between semaphores and mutexes, so I searched for the full citation of "When one process modifies the semaphore value[...]" and came up with passages like this (not knowing whether or not they are from Tanenbaum):
[...]the modifications to S in the P and V operations are executed indivisibly:
that is when one process modifies the semaphore value, no other process can simultaneously modify that same semaphore value.[...]
Other quotes are so similar that I suspect copy&paste :-)
The point is: If your text reads the same, then you misunderstood the intention of the paragraph - it is not about "ownership", it is "only" about concurrent access. When multiple threads try to access one semaphore at exactly the same time the threads must be serialized and modification of the value (remember - there is only one value inside the semaphore for all resources) must be atomic.
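To make the "no ownership" point concrete, here is a minimal sketch (the class and method names are made up for illustration). One thread acquires the only permit and a different thread releases it, which Semaphore accepts. For contrast, the same hand-off is tried with a ReentrantLock, which does track its owner and rejects the foreign unlock.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

class SemaphoreOwnershipDemo {
    // Runs the task on a second thread and waits for it to finish.
    static void runOnOtherThread(Runnable task) {
        Thread other = new Thread(task);
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // The calling thread acquires the only permit; a different thread releases it.
    static int permitsAfterForeignRelease() {
        Semaphore semaphore = new Semaphore(1);
        semaphore.acquireUninterruptibly();
        runOnOtherThread(semaphore::release);
        return semaphore.availablePermits();
    }

    // The calling thread holds the lock; a different thread tries to unlock it.
    static boolean lockRejectsForeignUnlock() {
        ReentrantLock lock = new ReentrantLock();
        boolean[] rejected = {false};
        lock.lock();
        try {
            runOnOtherThread(() -> {
                try {
                    lock.unlock();
                } catch (IllegalMonitorStateException e) {
                    rejected[0] = true; // only the owning thread may unlock
                }
            });
        } finally {
            lock.unlock();
        }
        return rejected[0];
    }

    public static void main(String[] args) {
        System.out.println(permitsAfterForeignRelease()); // prints 1
        System.out.println(lockRejectsForeignUnlock());   // prints true
    }
}
```

The semaphore only counts permits, so it has nothing to check the releasing thread against; the lock remembers its owner and can refuse.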
All objects in Java have intrinsic locks and these locks are used for synchronization. This concept prevents objects from being manipulated by different threads at the same time, or helps control execution of specific blocks of code.
What will happen if the locks themselves get contended upon - i.e. 2 threads asking for the lock at the exact microsecond.
Who gets it, and how does it get resolved?
What will happen if the locks themselves get contended upon - i.e. 2 threads asking for the lock at the exact microsecond.
One thread will get the lock, and the other will be blocked until the first thread releases it.
(Aside: some of the other answers assert that there is no such thing as "at the same time" in Java. They are wrong!! There is such a thing! If the JVM is using two or more cores of a multi-core system, then two threads on different cores could request the same Object lock in exactly the same hardware clock cycle. Clearly, only one will get it, but that is a different issue.)
Who gets it, and how does it get resolved?
It is not specified which thread will get the lock.
It is (typically) resolved by the OS's thread scheduler ... using whatever mechanisms it uses. This aspect of the JVM's behaviour is (obviously) platform specific.
If you really, really want to figure out precisely what is going on, the source code for OpenJDK and Linux is freely available. But to be frank, you don't need to know.
When it comes to concurrency, there is no such thing as "at the same time"; Java ensures that someone is first.
If you are asking about simultaneous contended access to lock objects, that is the essence of concurrent programming - nothing to say other than "it happens by design"
If you are asking about simultaneously using an object as a lock and as a regular object, it's not a problem: It happens all the time when using non synchronized methods during a concurrent call to a synchronized method (which uses this as the lock object)
The thing handling lock requests can only handle one thing at a time; therefore, 2 threads can't ask for the lock at the same time.
Even if it is in the same microsecond, one will still be ahead of the other one (perhaps faster by a nanosecond). The one that asks first will get the lock. The one who asks second will then wait for the lock to be released.
An analogy will be ... stacking papers together... Suppose I have one hand and that hand can only hold one piece of paper. Different people(threads) are handing me a single piece of paper. If two people "offer me papers at the same time" I will handle one before the other
In reality, there is no such thing as at the same time. The phrase exists because our brains can not work at the micro...nano...pico second speeds
http://docs.oracle.com/javase/tutorial/essential/concurrency/locksync.html
Locks are implemented not only in the JVM but also at the OS and hardware level, so the mechanisms may differ. We rely on the Java API and JVM specs, and they say that one of the threads will acquire the lock and the other will block.
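As a rough illustration of what that guarantee buys you, here is a small sketch (all names are invented for the example). Two threads fight over the same intrinsic lock many times. Which thread wins any particular round is unspecified, but because only one of them can hold the lock at a time, no increment is lost.

```java
class ContendedLockDemo {
    private final Object lock = new Object();
    private int counter = 0;

    // Only one thread at a time can be inside this block; the other blocks.
    void increment() {
        synchronized (lock) {
            counter++;
        }
    }

    // Starts two threads that each increment perThread times and returns the total.
    static int run(int perThread) {
        ContendedLockDemo demo = new ContendedLockDemo();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                demo.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (demo.lock) {
            return demo.counter;
        }
    }

    public static void main(String[] args) {
        System.out.println(run(100_000)); // prints 200000
    }
}
```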
Synchronization works by providing exclusive access to an object or method, by putting the synchronized keyword before a method name. What if I want to give higher precedence to one particular access when two or more accesses to a method occur at the same time? Can we do that?
Or maybe I'm just misunderstanding the concept of synchronization in Java. Please correct me.
I have other questions as well,
Under what requirements should we make a method synchronized?
When should a method be synchronized, and when a block?
Also, if we make a method synchronized, will the class be synchronized too? I'm a little confused here.
Please Help. Thanks.
No. Sadly, Java synchronization and wait/notify appear to have been copied from the very poor example of Unix, rather than from almost anywhere else, where there would have been priority queues instead of thundering herds. When Per Brinch Hansen, author of monitors and Concurrent Pascal, saw Java, he commented 'clearly I have laboured in vain'.
There is a solution for almost everything you need in multi-threading and synchronization in the java.util.concurrent package; it does, however, require some thinking about what you want to do first. The synchronized, wait and notify constructs are the most basic tools, suitable if you have a very basic problem to solve, but realistically most advanced programs will (/should) never use those and instead rely on the tools available in the java.util.concurrent package.
The way you think about threads is slightly wrong. There is no such thing as a more important thread, there is only a more important task. This is why Java clearly distinguishes between Threads, Runnables and Callables.
Synchronization is a concept to prevent more than one thread from entering a specific part of code, which is - again - the most basic concept of avoiding threading issues. Those issues happen if more than one thread accesses some data, where at least one of those multiple threads is trying to modify that data. Think about an array that is read by Thread A, while it is written by Thread B at the same time. Eventually Thread B will write the cell that Thread A is just about to read. Now as the order of execution of threads is undefined, it is as well undefined whether Thread A will read the old value, the new value or something messed up in between.
A synchronized "lock" around this access is a very blunt way of ensuring that this will never happen. More sophisticated tools are available in the concurrent package, like CopyOnWriteArrayList, which seamlessly handles the above issue by creating a copy for the writing thread, so neither Thread A nor Thread B needs to wait. Other tools are available for other problems.
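A small sketch of the copy-on-write idea, using java.util.concurrent.CopyOnWriteArrayList (the class and method names around it are made up). To keep the outcome deterministic, a single thread plays both reader and writer here: it adds to the list while iterating over it. The iterator keeps working on the snapshot it started with, so it neither fails nor sees the new elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {
    // Returns {elements seen by the iterator, final list size}.
    static int[] iterateWhileWriting() {
        List<Integer> cells = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        int seen = 0;
        for (Integer cell : cells) {  // the iterator reads a snapshot of the array
            cells.add(cell * 10);     // each write copies the backing array
            seen++;
        }
        return new int[] {seen, cells.size()};
    }

    public static void main(String[] args) {
        int[] result = iterateWhileWriting();
        System.out.println(result[0] + " " + result[1]); // prints 3 6
    }
}
```

With a plain ArrayList the same loop would throw ConcurrentModificationException. The price is a full array copy on every write, so this structure suits data that is read far more often than it is modified.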
If you dig a bit into the available tools, you soon learn that they are highly sophisticated, and that the difficulty in using them usually lies with the programmer and not with the tools, because countless hours of thinking, improving and testing have gone into them.
Edit: to clarify a bit why the importance is on the task even though you set it on the thread:
Imagine a street with 3 lanes that narrows to 1 lane (the synchronized block), with 5 cars (threads) arriving. Let's further assume there is one person (the car scheduler) who has to decide which car gets the first row and which ones get the other rows. As there is only 1 lane, he can at best assign 1 car to the first row, and the others need to queue behind it. If all cars look the same, he will most likely assign the order more or less randomly, although a car already in front is more likely to stay in front, just because it would be too troublesome to move those cars around.
Now let's say one car has a sign on top saying "President of the USA inside", so the scheduler will most likely give that car priority in his decision. But even though the sign is on the car, the reason for his decision is not the importance of the car (thread), but the importance of the people inside (task). So the sign is nothing but a piece of information for the scheduler that this car transports more important people. Whether or not this is true, however, the scheduler can't say (at least not without inspection), so he just has to trust the sign on the car.
Now if in another scenario all 5 cars have the "President inside" sign, the scheduler doesn't have any way to decide which one goes first, and he is in the same situation again as he was with all the cars having no sign at all.
Well, in the case of synchronized, the order of access is unspecified if multiple threads are waiting for the lock. But if you need first-come, first-served behaviour, you can probably use ReentrantLock(fairness). This is what the API says:
The constructor for this class accepts an optional fairness parameter.
When set true, under contention, locks favor granting access to the
longest-waiting thread.
Otherwise, if you wish to grant access based on some other factor, then I guess it shouldn't be complicated to build one yourself. Have a class whose lock call blocks if some other thread is executing. When unlock is called, it unblocks a waiting thread chosen by whatever algorithm you wish.
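For the first-come, first-served case, a minimal sketch of a fair lock might look like this (FairCounter is a made-up class; the ReentrantLock(boolean) constructor and isFair() are real API). It only shows how the fairness flag is passed and how the lock is used; it does not implement the custom priority scheme described above.

```java
import java.util.concurrent.locks.ReentrantLock;

class FairCounter {
    // Passing true asks the lock to favour the longest-waiting thread under contention.
    private final ReentrantLock lock = new ReentrantLock(true);
    private int value;

    int incrementAndGet() {
        lock.lock();
        try {
            return ++value;
        } finally {
            lock.unlock(); // always release in finally
        }
    }

    boolean isFair() {
        return lock.isFair();
    }

    public static void main(String[] args) {
        FairCounter counter = new FairCounter();
        System.out.println(counter.isFair());          // prints true
        System.out.println(counter.incrementAndGet()); // prints 1
    }
}
```

Fair locks tend to have lower throughput than the default non-fair ones, so it is worth checking that ordering really matters before switching the flag on.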
There's no such thing as "priority" among synchronized methods/blocks or accesses to them. If some other thread is already holding the object's monitor (i.e. if another synchronized method or synchronized (this) {} block is in progress and hasn't relinquished the monitor by a call to this.wait()), all other threads will have to wait until it's done.
There are classes in the java.util.concurrent package that might be able to help you if used correctly, such as priority queues. Full guidance on how to use them correctly is probably beyond the scope of this question - you should probably read a decent tutorial to start with.
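As one possible reading of "priority queues" here, the sketch below uses java.util.concurrent.PriorityBlockingQueue to order tasks rather than threads. The Task class, its priority field and the task names are all hypothetical. Worker threads taking from such a queue would receive the most important pending task first.

```java
import java.util.concurrent.PriorityBlockingQueue;

class PrioritizedTasks {
    // Hypothetical task type: a higher priority value means "more important".
    static final class Task implements Comparable<Task> {
        final int priority;
        final String name;

        Task(int priority, String name) {
            this.priority = priority;
            this.name = name;
        }

        @Override
        public int compareTo(Task other) {
            // Reversed comparison so the highest priority comes out of the queue first.
            return Integer.compare(other.priority, this.priority);
        }
    }

    // Submits three tasks out of order and returns the order in which they are taken.
    static String drainOrder() {
        PriorityBlockingQueue<Task> queue = new PriorityBlockingQueue<>();
        queue.add(new Task(1, "routine"));
        queue.add(new Task(10, "urgent"));
        queue.add(new Task(5, "normal"));

        StringBuilder order = new StringBuilder();
        Task next;
        while ((next = queue.poll()) != null) {
            order.append(next.name).append(' ');
        }
        return order.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(drainOrder()); // prints urgent normal routine
    }
}
```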
I have a Results object which is written to by several threads concurrently. However, each thread has a specific purpose and owns certain fields, so that no data is actually modified by more than one thread. The consumer of this data will not try to read it until all of the writer threads are done writing it. Because I know this to be true, there is no synchronization on the data writes and reads.
There is a RunningState object associated with this Results object which serves to coordinate this work. All of its methods are synchronized. When a thread is done with its work on this Results object, it calls done() on the RunningState object, which does the following: decrements a counter, checks if the counter has gone to 0 (indicating that all writers are done), and if so, puts this object on a concurrent queue. That queue is consumed by a ResultsStore which reads all of the fields and stores data in the database. Before reading any data, the ResultsStore calls RunningState.finalizeResult(), which is an empty method whose sole purpose is to synchronize on the RunningState object, to ensure that writes from all of the threads are visible to the reader.
Here are my concerns:
1) I believe that this will work correctly, but I feel like I'm violating good design principles to not synchronize on the data modifications to an object that is shared by multiple threads. However, if I were to add synchronization and/or split things up so each thread only saw the data it was responsible for, it would complicate the code. Anyone who modifies this area had better understand what's going on in any case or they're likely to break something, so from a maintenance standpoint I think the simpler code with good comments explaining how it works is a better way to go.
2) The fact that I need to call this do-nothing method seems like an indication of wrong design. Is it?
Opinions appreciated.
This seems mostly right, if a bit fragile (if you change the thread-local nature of one field, for instance, you may forget to synchronize it and end up with hard-to-trace data races).
The big area of concern is in memory visibility; I don't think you've established it. The empty finalizeResult() method may be synchronized, but if the writer threads didn't also synchronize on whatever it synchronizes on (presumably this?), there's no happens-before relationship. Remember, synchronization isn't absolute -- you synchronize relative to other threads that are also synchronized on the same object. Your do-nothing method will indeed do nothing, not even ensure any memory barrier.
You somehow need to establish a happens-before relationship between each thread doing its writes, and the thread that eventually reads. One way to do this without synchronization is via a volatile variable, or an AtomicInteger (or other atomic classes).
For instance, each writer thread can invoke counter.incrementAndGet() on the object, and the reading thread can then check that counter.get() == THE_CORRECT_VALUE. There's a happens-before relationship between a volatile/atomic field being written and it being read, which gives you the needed visibility.
Your design is sound, but it can be improved if you are using a true concurrent queue since a concurrent queue from the java.util.concurrent package already guarantees a happens before relationship between the thread putting an item into the queue, and the thread taking an item out, so this precludes needing to call finalizeResult() in the taking thread (so no need for that "do nothing" method call).
From java.util.concurrent package description:
The methods of all classes in java.util.concurrent and its subpackages
extend these guarantees to higher-level synchronization. In
particular:
Actions in a thread prior to placing an object into any
concurrent collection happen-before actions subsequent to the access
or removal of that element from the collection in another thread.
The comments in another answer concerning using an AtomicInteger instead of synchronization are also wise (as using an AtomicInteger to do your thread counting will likely perform better than synchronization), just make sure to get the value of the count after the atomic decrement (e.g. decrementAndGet()) when comparing to 0 in order to avoid adding to the queue twice.
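Putting the two suggestions together, a sketch could look like the following. Results, its fields and the done helper are hypothetical stand-ins for the classes in the question, and LinkedBlockingQueue is just one possible concurrent queue. Each writer fills in its own field and then calls decrementAndGet(); only the thread that sees 0 enqueues the object, and the consumer reads the fields after take() with no extra synchronization.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class ResultsHandoff {
    // Hypothetical stand-in for the Results object: each writer owns one field.
    static final class Results {
        int fieldA;
        int fieldB;
        final AtomicInteger pendingWriters = new AtomicInteger(2);
    }

    // Called by each writer after its last write.
    static void done(Results results, BlockingQueue<Results> queue) {
        // Compare the value returned by the decrement itself, so exactly one
        // thread observes 0 and the object is enqueued only once.
        if (results.pendingWriters.decrementAndGet() == 0) {
            queue.add(results);
        }
    }

    // Runs two writers and one consumer; returns the sum the consumer reads.
    static int run() {
        BlockingQueue<Results> queue = new LinkedBlockingQueue<>();
        Results results = new Results();

        new Thread(() -> { results.fieldA = 1; done(results, queue); }).start();
        new Thread(() -> { results.fieldB = 2; done(results, queue); }).start();

        try {
            Results finished = queue.take(); // blocks until the last writer is done
            return finished.fieldA + finished.fieldB;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 3
    }
}
```

The atomic decrements chain the writers together, and the queue hand-off links the last writer to the consumer, so both plain field writes are visible after take().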
What you've described is indeed safe, but it also sounds, frankly, brittle and (as you note) maintenance could become an issue. Without sample code, it's really hard to tell what's really easiest to understand, so an already subjective question becomes frankly unanswerable. Could you ask a coworker for a code review? (Particularly one that's likely to have to deal with this pattern.) I'm going to trust you that this is indeed the simplest approach, but doing something like wrapping synchronized blocks around writes would increase safety now and in the future. That said, you obviously know your code better than I do.
I saw the below statement in Java Specifications.
Programs where threads hold (directly
or indirectly) locks on multiple
objects should use conventional
techniques for deadlock avoidance,
creating higher-level locking
primitives that don't deadlock, if
necessary.
So, what are the "conventional techniques" to follow to avoid deadlock? I'm not quite clear on this (I haven't understood it properly; an explanation is needed).
The most common technique is to acquire resources (locks) in some consistent well-defined order.
The following article by Brian Goetz might be helpful: http://www.javaworld.com/javaworld/jw-10-2001/jw-1012-deadlock.html
It's pretty old, but explains the issues well.
As a somewhat abstract suggestion, an answer to this might be "Have a plan for handling locks and stick to it".
The danger of locking is where, in short, one thread holds lock A and is trying to get lock B, while another thread holds lock B and is trying to get lock A. As noted by another answer, the classic way to avoid this is to get locks in a consistent order. However, a good discipline is to minimize the amount of work that your code does with a lock held. Any code that calls another function with a lock held is a potential problem: what if that other function tries to get another lock? What if someone else later modifies that function to get a lock? Try to form a clear pattern of what functions can be called with locks held, and what cannot, and make sure the comments in your code make this all clear.
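Here is a minimal sketch of the consistent-ordering idea, using a made-up Account class with a unique id (both are assumptions for the example). Two threads transfer in opposite directions, which is the textbook A-then-B versus B-then-A situation, but transfer always locks the account with the smaller id first, so the two threads can never hold one lock each while waiting for the other.

```java
class OrderedTransfer {
    // Hypothetical account type; the unique id defines the global lock order.
    static final class Account {
        final int id;
        int balance;

        Account(int id, int balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    static void transfer(Account from, Account to, int amount) {
        // Always lock the account with the smaller id first, whatever the direction.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Two threads transfer in opposite directions; returns the final balances.
    static int[] run() {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {a.balance, b.balance};
    }

    public static void main(String[] args) {
        int[] balances = run();
        System.out.println(balances[0] + " " + balances[1]); // prints 1000 1000
    }
}
```

If transfer locked from and then to instead, the same two threads could deadlock.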
Don't do locking! Seriously. We get immense performance (100k's of transactions at sub-millisecond latency) at my work by keeping all our business logic single threaded.
Guys, can anyone give a simple practical example of LockSupport & AbstractQueuedSynchronizer use? Example given in javadocs is quite strained.
Usage of Semaphore permits is understood by me.
Thanks for any response.
If you're talking about using a locking mechanism (or even sync barriers), just use a java.util.concurrent.locks.Lock. The obvious suggestion is to use a ReentrantLock, which delegates to a Sync. The Sync is an AQS, which in turn uses LockSupport.
It's all done under the covers for you.
Edit:
Now let's go over the practical uses of AbstractQueuedSynchronizer (AQS).
Concurrency constructs, though they can be very different in their usage, can all have the same underlying functions.
I.e. Under some condition park this thread. Under some other condition wake a thread up.
This is a very broad set of instructions but makes it obvious that most concurrency structures would need some common functionality that would be able to handle those operations for them. Enter AQS. There are five major synchronization barriers.
ReentrantLock
ReadLock
WriteLock
Semaphore
CountDownLatch
Now, all five of these structures have very different sets of rules when using them. A CountDownLatch can allow many threads to run at the same time but forces one (or more) threads to wait until at least n threads count down on said latch.
ReentrantLock forces only one thread at a time to enter a critical section and queues up all other threads to wait for it to complete.
ReadLock allows any number of reading threads into the critical section until a write lock is acquired.
The examples can go on, but the big picture here is that they all use AQS. This is because they are able to use the primitive functions that AQS offers and implement more complex functionality on top of it. AQS allows you to park, unpark and wake up threads (interruptibly if need be), but in such a way that you can support many complex functions.
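Since the question asked for a simple example, here is a sketch of a one-shot latch built on AQS, in the spirit of the BooleanLatch shown in the AbstractQueuedSynchronizer javadoc. The OneShotLatch name and its methods are made up. The subclass only decides when an acquire may succeed (state 1 means open); AQS does the queueing, parking and waking.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class OneShotLatch {
    // AQS state: 0 = closed, 1 = open.
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected int tryAcquireShared(int ignored) {
            // A non-negative result lets the caller through; negative makes AQS park it.
            return getState() == 1 ? 1 : -1;
        }

        @Override
        protected boolean tryReleaseShared(int ignored) {
            setState(1);
            return true; // tell AQS to wake up the waiting threads
        }
    }

    private final Sync sync = new Sync();

    void await() throws InterruptedException {
        sync.acquireSharedInterruptibly(1);
    }

    void open() {
        sync.releaseShared(1);
    }

    // One thread waits on the latch, the caller opens it; returns true if the waiter got through.
    static boolean demo() {
        OneShotLatch latch = new OneShotLatch();
        boolean[] passed = {false};
        Thread waiter = new Thread(() -> {
            try {
                latch.await();
                passed[0] = true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        latch.open();
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return passed[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true
    }
}
```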
They are not meant for direct use in client code; they are more for helping build new concurrent classes.
AQS is a wonderful class for building concurrency primitives – but it is complex and requires a bit of study to use it properly. I have used it for a few things like lazy initialisation and a simple fast reusable latch.
As complex as it is, I don't think AQS is particularly vague, it has excellent javadocs describing how to use it properly.
2.7 release of Disruptor uses LockSupport.parkNanos instead of Thread.sleep to reduce latency:
http://code.google.com/p/disruptor/
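For a feel of LockSupport on its own, here is a small sketch (the class and variable names are invented, and it is not how Disruptor is written). One thread parks until a flag is set, and another uses parkNanos as a short pause that, unlike Thread.sleep, does not throw InterruptedException, before waking the first with unpark.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class ParkDemo {
    // Returns true once the waiting thread has been woken and has seen the flag.
    static boolean run() {
        AtomicBoolean ready = new AtomicBoolean(false);
        AtomicBoolean observed = new AtomicBoolean(false);

        Thread waiter = new Thread(() -> {
            // park() may return spuriously, so always re-check the condition in a loop.
            while (!ready.get()) {
                LockSupport.park();
            }
            observed.set(true);
        });
        waiter.start();

        LockSupport.parkNanos(1_000_000L); // short timed pause, roughly 1 ms
        ready.set(true);                   // publish the condition first...
        LockSupport.unpark(waiter);        // ...then wake the waiter (or pre-grant its permit)

        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints true
    }
}
```

Because unpark grants a permit even if the target has not parked yet, setting the flag before calling unpark avoids a lost wake-up.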
AFAIK, AbstractQueuedSynchronizer is used to manage state transitions. The JDK extends it with Sync, an internal class of java.util.concurrent.FutureTask. The Sync class manages the states (READY, RUNNING, RAN, and CANCELLED) of FutureTask and the transitions between them.
This allows, as you may know, FutureTask to block on FutureTask.get() until the RAN state is reached, for example.
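From the caller's side, that blocking behaviour looks roughly like this (a sketch with made-up names; it says nothing about how FutureTask is implemented internally). The calling thread blocks in get() until the task has run on the other thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

class FutureTaskGetDemo {
    static int computeOnOtherThread() {
        FutureTask<Integer> task = new FutureTask<>(() -> 6 * 7);
        new Thread(task).start(); // FutureTask is a Runnable, so a thread can run it
        try {
            return task.get();    // blocks until the task has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        System.out.println(computeOnOtherThread()); // prints 42
    }
}
```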