I came across this line: "some functions are inherently thread-safe, for example memcpy()".
Wikipedia defines "thread-safe" as:
A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time.
OK. But what does inherently mean? Is it related to inheritance?
It is not related to inheritance. It is an informal expression and means something more like
"some functions are thread-safe by their nature". For example, a function which does not
touch any shared values/state is thread safe in any case, i.e. it "is inherently thread-safe".
In this context I interpret it as "without having been designed to achieve it, it still is thread-safe".
There is no direct link to the concept of inheritance, although the words are of course related. This is not inheritance in the object-oriented programming sense; it is simply a function that, by its very nature, has the property of being thread-safe.
Of course there's nothing magic about memcpy() being inherently thread-safe, either. Any function without internal state or "shared" side-effects will be so, which is why functional programming, where all functions are supposed to be "pure" and lack side-effects, lends itself so well to parallel programming.
In practice it's hard on typical computers to get "real work" done without side-effects, particularly I/O is very much defined by its side-effects. So even pure functional languages often have some non-functional corners.
Update: Of course, memcpy() is not free from side-effects; its core purpose is to manipulate memory, which, if shared between threads, certainly isn't safe. The assumption has to be that as long as the destination areas are distinct, it doesn't matter whether one or more threads run memcpy() in parallel.
Contrast this with e.g. printf(), which generates characters on a single (for the process) output stream. It has to be explicitly implemented (as required by POSIX) to be thread-safe, whereas memcpy() does not.
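To illustrate in Java terms, here is a minimal sketch (System.arraycopy playing the role of memcpy(); the CopyDemo class is purely hypothetical): as long as each thread copies into its own destination, the parallel calls cannot interfere.

import java.util.Arrays;

public class CopyDemo {
    public static void main(String[] args) throws InterruptedException {
        int[] src = {1, 2, 3, 4};

        // Each thread copies into its own destination array, so the copies
        // cannot interfere, no matter how the threads are scheduled.
        Runnable copyToOwnBuffer = () -> {
            int[] dst = new int[src.length];
            System.arraycopy(src, 0, dst, 0, src.length);
            System.out.println(Arrays.toString(dst));
        };

        Thread t1 = new Thread(copyToOwnBuffer);
        Thread t2 = new Thread(copyToOwnBuffer);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // If both threads copied into the SAME destination array, the outcome
        // would depend on timing, just as with overlapping memcpy() destinations.
    }
}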
An inherently thread-safe function is safe without any specific design decisions regarding threading; it is thread safe simply by virtue of the task it performs, as opposed to being redesigned to force thread safety. Say I write the very simple function:
int cube(int x)
{
    return x * x * x;
}
It is inherently thread safe, as it has no way of reading from or writing to shared memory. However, I could also write a function which is not thread safe, and then make it thread safe through specific design and synchronization. Say I have this function, similar to the one before:
void cubeshare()
{
    static int x = 2;
    x = x * x * x;
    printf("%d\n", x);
}
This is not thread safe: it is entirely possible for the value of x to change between each use (well, this is actually unlikely in reality as x would get cached, but let's say we are not doing any optimization).
We could, however, make this thread safe, for example with a POSIX mutex:
void cubesharesafe(pthread_mutex_t *foo)   /* foo: an initialized mutex from <pthread.h> */
{
    static int x = 2;
    pthread_mutex_lock(foo);
    x = x * x * x;
    printf("%d\n", x);
    pthread_mutex_unlock(foo);
}
This is, however, not inherently thread safe; we are forcing it to be through redesign. Real examples will often be far more complicated than this, but I hope this gives an idea at the simplest possible level. If you have any questions, please comment below.
In the case of memcpy(), only a single thread writes from a specific source to a specific destination: it is thread-safe by its initial design.
Inherently means: without needing to "tune" the base function to achieve the goal, in this case thread safety.
If multiple threads could interfere with the same "channel" at the same time, you would end up with thread-safety problems related to shared chunks of data.
Inherent means "existing in something as a permanent, essential, or characteristic attribute".
It has nothing to do with inheritance.
Some methods are thread safe by default, i.e. without any extra work to protect against or avoid multithreading problems.
Java's Vector and Hashtable are examples of classes whose methods are thread safe (they are synchronized internally).
There is nothing confusing here: some functions are simply thread safe by default.
Let non-threadsafe, mutable object X be constructed in thread A. A passes X, post construction, to thread B. B mutates X and A never accesses X again.
Will the state of X always be properly visible to B?
Is X effectively thread confined?
My reading of Java Concurrency in Practice seems to indicate that X is not properly published but I cannot cause any problems for thread B in test rigs that run millions of replications. I suspect this is just dumb luck.
For background, X represents a multitude of complex classes over which I have no control that are authored by modelers who have only a basic knowledge of Java. It is strongly preferred that X has no synchronized blocks or other concurrency mechanisms or requirements.
I am currently solving this problem by having thread A pass a thread-safe factory for X that B invokes, thus making X thread confined.
Publication only safe for final fields
The Java Memory Model doesn't guarantee that the object X will be completely published (fully constructed) to thread B.
To ensure that, you would need to make it immutable (all member fields final) or use synchronization.
Quoting JSR-133:
The semantics of final fields have been strengthened to allow for thread-safe immutability
without explicit synchronization. This may require steps such as store-store barriers at the
end of constructors in which final fields are set.
The only thing you need to avoid is letting the object's fields (or the this reference) leak out of the class before the constructor finishes.
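A minimal sketch of the contrast (the holder classes are hypothetical): without final, a reader racing with the constructor may still see the default field value; with final, the JMM guarantees it sees the constructed value.

class UnsafeHolder {
    int value;                        // non-final: a racing reader may still see the default 0
    UnsafeHolder(int v) { value = v; }
}

class SafeHolder {
    final int value;                  // final: guaranteed visible once the constructor finishes
    SafeHolder(int v) { value = v; }
}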
Testing
jcstress has in fact a sample project to demonstrate consequences of racing during publication: JMMSample_06_Finals.java
Note that some effort was needed to replicate the problem, such as using many fields.
The implementation of the JMM naturally depends on the particular JRE that you are using, and the effects of the memory barriers used also depend on the hardware.
On my hardware using Oracle JDK 8 I'm not able to reproduce unsafe publication using the sample with jcstress.
Synchronizing
There is a "happens-before" relationship between all synchronization actions. This is known as the synchronization order. Basically when you use any synchronization mechanism, you have the guarantee that actions before it will be visible after it.
As concluded in the Java Language Specification:
If a program is correctly synchronized, then all executions of the program will appear to be sequentially consistent
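Applied to the question's scenario, here is a minimal sketch (Box and SafeHandoff are hypothetical names) in which thread A hands X to thread B through a BlockingQueue; the enqueue and dequeue are synchronization actions, so everything done in the constructor is visible to B.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Box is a hypothetical stand-in for the mutable, non-thread-safe X.
class Box {
    int value;
    Box(int v) { value = v; }
}

public class SafeHandoff {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Box> handoff = new LinkedBlockingQueue<>();

        Thread a = new Thread(() -> handoff.add(new Box(42)));    // thread A publishes X
        Thread b = new Thread(() -> {
            try {
                Box x = handoff.take();       // the add happens-before this take,
                System.out.println(x.value);  // so B sees the fully constructed object
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        a.start();
        b.start();
        a.join();
        b.join();
    }
}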
In practice
In practice it's very hard to run into problems due to actions taken in a constructor not being visible by threads using the object.
A primary reason is the usage of synchronization mechanisms. You can check some of the actions that will ensure the happens-before relationship in the javadoc: Memory Visibility
Also, as I mentioned with the jcstress sample, JREs nowadays seem to be very good at ensuring consistent results even when they don't need to according to the language specification.
Is it safe to use the :volatile-mutable qualifier with deftype in a single-threaded program? This is a follow-up to this question, this one, and this one. (It's a Clojure question, but I added the "Java" tag because Java programmers are likely to have insights about it, too.)
I've found that I can get a significant performance boost in a program I'm working on by using :volatile-mutable fields in a deftype rather than atoms, but I'm worried because the docstring for deftype says:
Note well that mutable fields are extremely difficult to use
correctly, and are present only to facilitate the building of higher
level constructs, such as Clojure's reference types, in Clojure
itself. They are for experts only - if the semantics and implications
of :volatile-mutable or :unsynchronized-mutable are not immediately
apparent to you, you should not be using them.
In fact, the semantics and implications of :volatile-mutable are not immediately apparent to me.
However, chapter 6 of Clojure Programming, by Emerick, Carper, and Grand says:
"Volatile" here has the same meaning as the volatile field modifier in
Java: reads and writes are atomic and must be executed in
program order; i.e., they cannot be reordered by the JIT compiler or
by the CPU. Volatiles are thus unsurprising and thread-safe — but
uncoordinated and still entirely open to race conditions.
This seems to imply that as long as accesses to a single volatile-mutable deftype field all take place within a single thread, there is nothing special to worry about. (Nothing special, in that I still have to be careful about how I handle state if I might be using lazy sequences.) So if nothing introduces parallelism into my Clojure program, there should be no special danger to using deftype with :volatile-mutable.
Is that correct? What dangers am I not understanding?
That's correct, it's safe. You just have to be sure that your context is really single-threaded. Sometimes it's not that easy to guarantee that.
There's no risk in terms of thread-safety or atomicity when using a volatile mutable (or just mutable) field in a single-threaded context, because there's only one thread so there's no chance of two threads writing a new value to the field at the same time, or one thread writing a new value based on outdated values.
As others have pointed out in the comments you might want to simply use an :unsynchronized-mutable field to avoid the cost introduced by volatile. That cost comes from the fact that every write must be committed to main memory instead of thread local memory. See this answer for more info about this.
At the same time, you gain nothing by using volatile in a single-threaded context, because there's no chance of one thread writing a new value that will not be "seen" by another thread reading the same field.
That's what a volatile is intended for, but it's irrelevant in a single-thread context.
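For illustration, here is a minimal Java sketch (StopFlagDemo is a hypothetical name) of the cross-thread visibility that volatile buys you, and which simply has no counterpart when only one thread is involved.

public class StopFlagDemo {
    static volatile boolean stop = false;   // volatile: the write below is guaranteed to become visible

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!stop) { /* spin until the writer's update is seen */ }
            System.out.println("reader saw stop == true");
        });
        reader.start();

        Thread.sleep(100);
        stop = true;        // without volatile, the reader might never observe this write
        reader.join();
    }
}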
Also note that Clojure 1.7 introduced volatile!, intended to provide a "volatile box for managing state" as a faster alternative to atom, with a similar interface but without its compare-and-swap semantics. The only difference when using it is that you call vswap! and vreset! instead of swap! and reset!. I would use that instead of deftype with ^:volatile-mutable if I needed a volatile.
Making every object lockable looks like a design mistake:
You add extra cost for every object created, even though you'll actually use it only in a tiny fraction of the objects.
Lock usage becomes implicit: lockMap.get(key).lock() is more readable than synchronizing on arbitrary objects, e.g. synchronized (key) {...}.
Synchronized methods can cause subtle errors when users lock the object that has the synchronized methods.
You can't be sure that when passing an object to a 3rd-party API, its lock is not being used.
e.g.:
class Syncer {
    synchronized void foo() {}
}
...
Syncer s = new Syncer();
synchronized (s) {
    ...
}
// in another thread
s.foo(); // oops, blocks until the synchronized block above is released: potential deadlock
Not to mention the namespace pollution for each and every object (in C# at least the methods are static; in Java, synchronization primitives have to use await() so as not to overload wait() in Object...).
However I'm sure there is some reason for this design. What is the great benefit of intrinsic locks?
You add extra cost for every object created, even though you'll
actually use it only in a tiny fraction of the objects.
That's determined by the JVM implementation. The JVM specification says, "The association of a monitor with an object may be managed in various ways that are beyond the scope of this specification. For instance, the monitor may be allocated and deallocated at the same time as the object. Alternatively, it may be dynamically allocated at the time when a thread attempts to gain exclusive access to the object and freed at some later time when no thread remains in the monitor for the object."
I haven't looked at much JVM source code yet, but I'd be really surprised if any of the common JVMs handled this inefficiently.
Lock usage becomes implicit: lockMap.get(key).lock() is more readable than synchronizing on arbitrary objects, e.g. synchronized (key) {...}.
I completely disagree. Once you know the meaning of synchronized, it's much more readable than a chain of method calls.
Synchronized methods can cause subtle errors when users lock the object that has the synchronized methods.
That's why you need to know the meaning of synchronized. If you read about what it does, then avoiding these errors becomes fairly trivial. Rule of thumb: don't use the same lock in multiple places unless those places need to share the same lock. The same thing could be said of any language's lock/mutex strategy.
You can't be sure that when passing an object to a 3rd-party API, its lock is not being used.
Right. That's usually a good thing. If it's locked, there should be a good reason why it's locked. Other threads (third party or not) need to wait their turns.
If you synchronize on myObject with the intent of allowing other threads to use myObject at the same time, you're doing it wrong. You could just as easily synchronize the same code block using myOtherObject if that would help.
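A common way to put that into practice is to synchronize on a private lock object; here is a small sketch, reusing the Syncer name from the question, so that outside code doing synchronized (someSyncer) {...} cannot interfere with the class's own locking.

class Syncer {
    private final Object lock = new Object();

    void foo() {
        synchronized (lock) {
            // critical section, independent of the object's own monitor
        }
    }
}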
Not to mention the namespace pollution for each and every object (in C# at least the methods are static; in Java, synchronization primitives have to use await() so as not to overload wait() in Object...).
The Object class does include some convenience methods related to synchronization, namely notify(), notifyAll(), and wait(). The fact that you haven't needed to use them doesn't mean they aren't useful. You could just as easily complain about clone(), equals(), toString(), etc.
Actually each object only holds a reference to its monitor; the real monitor object is created only when you actually use synchronization, so not much memory is lost.
The alternative would be to manually add a monitor to those classes that need one; this would complicate the code considerably and would be more error-prone. Java has traded performance for productivity.
One benefit is the automatic unlock on exit from a synchronized block, even when leaving via an exception.
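A small sketch of that benefit (the Counter class is hypothetical), contrasting a synchronized method with an explicit ReentrantLock, which needs try/finally to stay exception-safe:

import java.util.concurrent.locks.ReentrantLock;

class Counter {
    private final ReentrantLock lock = new ReentrantLock();
    private int n;

    synchronized void incSynchronized() {
        n++;                        // the monitor is released automatically, even if this throws
    }

    void incWithLock() {
        lock.lock();
        try {
            n++;
        } finally {
            lock.unlock();          // must be released manually, or an exception would leak the lock
        }
    }
}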
I assume that, like with toString(), the designers thought that the benefits outweighed the costs.
Lots of decisions had to be made and a lot of the concepts were untested (Checked exceptions-ack!) but overall I'm sure it's pretty much free and more useful than an explicit "Lock" object.
Also, do you add a "Lock" object to the language or to the library? It seems like a language construct, but objects in the library very rarely (if ever?) get special treatment, and treating threading more as a library construct might have slowed things down.
We are building a web app with Scala, Play framework, and MongoDB (with ReactiveMongo as our driver). The application architecture is non-blocking end to end.
In some parts of our code, we need to access some non-thread-safe libraries such as Scala's parser combinators, Scala's reflection etc. We are currently enclosing such calls in synchronized blocks. I have two questions:
Are there any gotchas to look out for when using synchronized with future-y code?
Is it better to use locks (such as ReentrantLock) rather than synchronized, from both performance and usability standpoint?
This is an old question; see, for example, using-actors-instead-of-synchronized. In short, it would be more advisable to use actors instead of locks:
import akka.actor.{Actor, ActorLogging}

case class Greeting(who: String)

class GreetingActor extends Actor with ActorLogging {
  def receive = {
    case Greeting(who) ⇒ log.info("Hello " + who)
  }
}
Only one message will be processed at any given time, so you can put any non-thread-safe code you want in place of log.info and everything will work OK. BTW, using the ask pattern you can seamlessly integrate your actors into existing code that requires futures.
For me the main problem you will face is that any call to a synchronized or a locked section of code may block and thus paralyze the threads of the execution context. To avoid this issue, you can wrap any call to a potentially blocking method using scala.concurrent.blocking:
import scala.concurrent._
import ExecutionContext.Implicits.global

def synchronizedMethod(s: String) = synchronized { s.size }

val f = future {
  println("Hello")
  val i = blocking { // adjust the execution context behavior
    synchronizedMethod("Hello")
  }
  println(i)
  i
}
Of course, it may be better to consider alternatives like thread-local variables or wrapping invocations of the serial code inside an actor.
Finally, I suggest using synchronized instead of locks. For most applications (especially if the critical sections are large), the performance difference is not noticeable.
The examples you mention, i.e. reflection and parsing, should be reasonably immutable, and you shouldn't need to lock; but if you're going to use locks, then a synchronized block will do. I don't think there's much of a performance difference between using synchronized and Lock.
Well, I think the easiest and safest way would be (if you can do it at all) thread confinement:
i.e. each thread creates its own instance of the parser combinators etc. and then uses it.
In case you do need synchronization (which should be avoided, as under traffic it will be the killer), synchronized or ReentrantLock will give almost the same performance. It again depends on what objects need to be guarded, by what locks, etc. In a web application it is discouraged unless absolutely necessary.
I am writing a class of which quite a few instances will be created. Multiple threads will be using these instances, so the getters and setters of the fields of the class have to be concurrent. The fields are mainly floats. The thing is, I don't know what is more resource-hungry: using a synchronized section, or making the variable something like an AtomicInteger?
You should favor atomic primitives when it is possible to do so. On many architectures, atomic primitives can perform a bit better because the instructions to update them can be executed entirely in user space; I think that synchronized blocks and Locks generally need some support from the operating system kernel to work.
Note my caveat: "when it is possible to do so". You can't use atomic primitives if your classes have operations that need to atomically update more than one field at a time. For example, if a class has to modify a collection and update a counter (for example), that can't be accomplished using atomic primitives alone, so you'd have to use synchronized or some Lock.
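A minimal sketch of that caveat (the EventLog class is hypothetical): the list and the counter must be updated together, which no single atomic primitive can do, so a lock is needed.

import java.util.ArrayList;
import java.util.List;

class EventLog {
    private final List<String> events = new ArrayList<>();
    private int count;                    // must always equal events.size()

    synchronized void add(String event) { // one lock guards both fields
        events.add(event);
        count++;
    }

    synchronized int count() {
        return count;
    }
}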
The question already has an accepted answer, but as I'm not allowed to write comments yet, here we go. My answer is that it depends; if this is critical, measure. The JVM is quite good at optimizing synchronized accesses when there is no (or little) contention, making it much cheaper than if a real kernel mutex had to be used every time. Atomics basically use spin-locks, meaning that they will try to make an atomic change and, if they fail, will try again and again until they succeed. This can eat quite a bit of CPU if the resource is heavily contended by many threads.
With low contention atomics may well be the way to go, but in order to be sure try both and measure for your intended application.
I would probably start out with synchronized methods in order to keep the code simple; then measure and make the change to atomics if it makes a difference.
It is very important to construct the instances properly before they are used by multiple threads; otherwise those threads may see incomplete or wrong data from partially constructed instances. My personal preference would be to use a synchronized block.
Or you can also follow the "lazy initialization holder class idiom" outlined by Brian Goetz in his book "Java Concurrency in Practice":
@ThreadSafe
public class ResourceFactory {
    private static class ResourceHolder {
        public static Resource resource = new Resource();
    }

    public static Resource getResource() {
        return ResourceHolder.resource;
    }
}
Here the JVM defers initializing the ResourceHolder class until it is actually used. Moreover, since Resource is initialized in a static initializer, no additional synchronization is needed.
Note: Statically initialized objects require no explicit synchronization either during construction or when being referenced. But if the object is mutable, synchronization is still required by both readers and writers to make subsequent modifications visible and also to avoid data corruption.