I am seeing a lot of classes being added to Java which are not thread safe.
For example, StringBuilder is not thread safe while StringBuffer was, and StringBuilder is recommended over StringBuffer.
Also various collection classes are not thread safe.
Isn't being thread safe a good thing?
Or am I just stupid and don't yet understand what being thread safe means?
Because thread safety makes things slower, and not everything has to be multi-threaded.
Consider reading this article to find out basics about thread safety :
http://en.wikipedia.org/wiki/Thread_safety
When you are comfortable enough with threads (or not), consider reading this book; it has great reviews:
http://www.amazon.com/Java-Concurrency-Practice-Brian-Goetz/dp/0321349601
Some classes are not suitable for using across multiple threads. StringBuffer is one of them IMHO.
It is very hard to find even a contrived example of using StringBuffer in a multi-threaded way that cannot be achieved more simply in other ways.
Thread safety is not an all-or-nothing property. Ten years ago some books recommended marking all methods of a class as synchronized in order to make them thread safe. This costs some performance, but it is far from a guarantee that your overall program is thread safe. Therefore, you pay a cost for a questionable gain. That is why classes that are not thread safe are still being added to the Java library.
The "make every method synchronized" strategy is only able to provide guarantees about the consistency of one object, and it has the potential to introduce dead-locks, or to be weaker than thought (think about wait()).
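To illustrate the "consistency of one object" point, here is a minimal Java sketch. The class and method names are made up for illustration. It assumes a list wrapped with Collections.synchronizedList, whose individual methods are synchronized on the list itself. A check-then-act sequence built from two such calls is still not atomic unless the caller holds the list's lock across both.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CompoundActionSketch {
    // Each individual method call on this list is synchronized on the list itself.
    static final List<String> names = Collections.synchronizedList(new ArrayList<>());

    // Racy: another thread can add the same name between contains() and add().
    static void addIfAbsentRacy(String name) {
        if (!names.contains(name)) {
            names.add(name);
        }
    }

    // Safe: the caller holds the list's lock across the whole check-then-act.
    static void addIfAbsent(String name) {
        synchronized (names) {
            if (!names.contains(name)) {
                names.add(name);
            }
        }
    }

    // Runs 8 threads that all try to add the same name; returns the final size.
    static int runDemo() {
        names.clear();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1_000; j++) {
                    addIfAbsent("alice");
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return names.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

With addIfAbsent the list ends up with exactly one element. With addIfAbsentRacy duplicates are possible, even though every method of the list is synchronized.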
There is a performance overhead to inherently thread-safe code. If you do not need the class in a concurrent context but do need high performance, then the original, synchronized classes are not ideal.
A typical usage of StringBuilder is something like:
return new StringBuilder().append("this").append("that").toString();
all in one thread, no need to synchronize anything.
Related
Is it safe to use the :volatile-mutable qualifier with deftype in a single-threaded program? This is a follow up to this question, this one, and this one. (It's a Clojure question, but I added the "Java" tag because Java programmers are likely to have insights about it, too.)
I've found that I can get a significant performance boost in a program I'm working on by using :volatile-mutable fields in a deftype rather than atoms, but I'm worried because the docstring for deftype says:
Note well that mutable fields are extremely difficult to use
correctly, and are present only to facilitate the building of higher
level constructs, such as Clojure's reference types, in Clojure
itself. They are for experts only - if the semantics and implications
of :volatile-mutable or :unsynchronized-mutable are not immediately
apparent to you, you should not be using them.
In fact, the semantics and implications of :volatile-mutable are not immediately apparent to me.
However, chapter 6 of Clojure Programming, by Emerick, Carper, and Grand says:
"Volatile" here has the same meaning as the volatile field modifier in
Java: reads and writes are atomic and must be executed in
program order; i.e., they cannot be reordered by the JIT compiler or
by the CPU. Volatiles are thus unsurprising and thread-safe — but
uncoordinated and still entirely open to race conditions.
This seems to imply that as long as accesses to a single volatile-mutable deftype field all take place within a single thread, there is nothing special to worry about. (Nothing special, in that I still have to be careful about how I handle state if I might be using lazy sequences.) So if nothing introduces parallelism into my Clojure program, there should be no special danger to using deftype with :volatile-mutable.
Is that correct? What dangers am I not understanding?
That's correct, it's safe. You just have to be sure that your context is really single-threaded. Sometimes it's not that easy to guarantee that.
There's no risk in terms of thread-safety or atomicity when using a volatile mutable (or just mutable) field in a single-threaded context, because there's only one thread so there's no chance of two threads writing a new value to the field at the same time, or one thread writing a new value based on outdated values.
As others have pointed out in the comments you might want to simply use an :unsynchronized-mutable field to avoid the cost introduced by volatile. That cost comes from the fact that every write must be committed to main memory instead of thread local memory. See this answer for more info about this.
At the same time, you gain nothing by using volatile in a single-threaded context because there's no chance of one thread writing a new value that will not be "seen" by another thread reading the same field.
That's what a volatile is intended for, but it's irrelevant in a single-thread context.
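For readers coming from the Java side, here is a rough sketch of the same distinction. It assumes, as I understand it, that :volatile-mutable corresponds to a Java volatile field and :unsynchronized-mutable to a plain field. The class and method names are invented for the example. A field touched by one thread only can be plain; volatile matters when one thread must see another thread's write.

```java
class VolatileFlagSketch {
    // Written by one thread and read by another: volatile guarantees that
    // the reading thread sees the write.
    private static volatile boolean stopRequested = false;

    // Only ever touched by the calling thread: a plain field is enough.
    private int counter = 0;

    // Single-threaded use of a plain mutable field.
    int countTo(int n) {
        for (int i = 0; i < n; i++) {
            counter++;
        }
        return counter;
    }

    // Starts a spinning worker, asks it to stop, and reports whether it did.
    static boolean runAndStop() {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                Thread.yield(); // spin until the flag becomes visible
            }
        });
        worker.start();
        stopRequested = true;
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(new VolatileFlagSketch().countTo(3));
        System.out.println(runAndStop());
    }
}
```

Without volatile on stopRequested, the worker is not guaranteed to ever observe the write. In countTo there is no second thread, so volatile would add cost without adding any guarantee.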
Also note that Clojure 1.7 introduced volatile!, intended to provide a "volatile box for managing state" as a faster alternative to atom, with a similar interface but without its compare-and-swap semantics. The only difference when using it is that you call vswap! and vreset! instead of swap! and reset!. I would use that instead of deftype with ^:volatile-mutable if I needed a volatile.
I came across this line "some functions are inherently thread-safe, for example memcpy()"
Wikipedia defines "thread-safe" as:
A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time.
OK. But what does inherently mean? Is it related to inheritance?
It is not related to inheritance. It is an informal expression and means more like
"some functions are thread-safe by their nature". For example a function which does not
touch any shared values/state is thread safe anyway i.e. "is inherently thread-safe".
In this context I interpret it as "without having been designed to achieve it, it still is thread-safe".
There is no direct link to the concept of inheritance, although of course the words are related. This is not an example of inheritance in the object-oriented programming sense, of course. This is just a function, that from its core nature gets the property of being thread-safe.
Of course there's nothing magic about memcpy() being inherently thread-safe, either. Any function without internal state or "shared" side-effects will be so, which is why functional programming, where all functions are supposed to be "pure" and lack side-effects, lends itself so well to parallel programming.
In practice it's hard on typical computers to get "real work" done without side-effects, particularly I/O is very much defined by its side-effects. So even pure functional languages often have some non-functional corners.
Update: Of course, memcpy() is not free from side-effects; its core purpose is to manipulate memory which, if shared between threads, certainly isn't safe. The assumption has to be that as long as the destination areas are distinct, it doesn't matter if one or more threads run memcpy() in parallel.
Contrast this with e.g. printf(), which generates characters on a single (for the process) output stream. It has to be explicitly implemented (as required by POSIX) to be thread-safe, whereas memcpy() does not.
An inherently thread safe function is safe without any specific design decisions regarding threading; it is thread safe simply by virtue of the task it performs, as opposed to being redesigned to force thread safety. Say I write the very simple function:
int cube(int x)
{
    return x * x * x;
}
It is inherently thread safe, as it has no way of reading from or writing to shared memory. However I could also make a function which is not thread safe but make it thread safe through specific design and synchronization. Say I have this similar function to before:
#include <stdio.h>

void cubeshare()
{
    static int x;       /* shared by every caller of this function */
    x = x * x * x;
    printf("%d", x);    /* %d is the conversion for an int */
}
This is not thread safe: it is entirely possible for the value of x to change between each use (this is actually unlikely in reality, as x would get cached, but let's say we are not doing any optimization).
However, we could make this thread safe like this (this is pseudocode; a real mutex is more complicated):
void cubesharesafe(mutex foo)
{
    static int x;
    lockmutex(foo);     /* only one thread at a time past this point */
    x = x * x * x;
    printf("%d", x);
    unlockmutex(foo);
}
This is, however, not inherently thread safe; we are forcing it to be through redesign. Real examples will often be far more complicated than this, but I hope this gives an idea taken to the simplest possible level. If you have any questions please comment below.
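The same contrast can be sketched in Java. This is my own rough analogue, not a translation of the C above, and the class name and the total field are invented for the example. cube is inherently thread safe because it touches only its argument. addCube is thread safe only because every access to the shared field goes through the same lock.

```java
class InherentSafetySketch {
    // Inherently thread-safe: reads and writes nothing but its own argument.
    static int cube(int x) {
        return x * x * x;
    }

    // Shared mutable state, comparable to the static int in the C example.
    private static int total = 0;

    // Thread-safe by design: the class lock guards every access to total.
    static synchronized void addCube(int x) {
        total += cube(x);
    }

    static synchronized int total() {
        return total;
    }

    static synchronized void reset() {
        total = 0;
    }

    // 4 threads each add cube(2) == 8 a thousand times.
    static int runDemo() {
        reset();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1_000; j++) {
                    addCube(2);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return total();
    }

    public static void main(String[] args) {
        System.out.println(cube(3));
        System.out.println(runDemo());
    }
}
```

Dropping synchronized from addCube would leave cube just as safe as before, but the updates to total could then be lost.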
In the case of memcpy, only a single thread performs the writes from a specific source to a specific destination, so it is thread-safe by its initial design.
Inherently means: without needing to "tune" the base function to achieve the goal, in this case: thread safety.
If multiple threads could interfere with the same "channel" at the same time, you would end up with thread-safety problems related to shared chunks of data.
Inherent means existing in something as a permanent attribute.
It has nothing to do with inheritance.
Some methods are thread safe by default, in order to protect against or avoid multitasking problems.
Vector and Hashtable are some examples of classes that are inherently thread safe.
There is nothing confusing here: some functions are simply thread safe by default.
Making every object lockable looks like a design mistake:
You add extra cost for every object created, even though you'll actually use it only in a tiny fraction of the objects.
Lock usage becomes implicit; having lockMap.get(key).lock() is more readable than synchronization on arbitrary objects, e.g. synchronized (key) {...}.
Synchronized methods can cause subtle errors when users lock the object that has the synchronized methods.
You can be sure that when passing an object to a 3rd-party API, its lock is not being used.
e.g.
class Syncer {
    synchronized void foo() {}
}
...
Syncer s = new Syncer();
synchronized (s) {
    ...
}
// in another thread
s.foo(); // oops, waits for the previous section; potential deadlock
Not to mention the namespace pollution for each and every object (in C# at least the methods are static, in Java synchronization primitives have to use await, not to overload wait in Object...)
However I'm sure there is some reason for this design. What is the great benefit of intrinsic locks?
You add extra cost for every object created, even though you'll
actually use it only in a tiny fraction of the objects.
That's determined by the JVM implementation. The JVM specification says, "The association of a monitor with an object may be managed in various ways that are beyond the scope of this specification. For instance, the monitor may be allocated and deallocated at the same time as the object. Alternatively, it may be dynamically allocated at the time when a thread attempts to gain exclusive access to the object and freed at some later time when no thread remains in the monitor for the object."
I haven't looked at much JVM source code yet, but I'd be really surprised if any of the common JVMs handled this inefficiently.
Lock usage becomes implicit; having lockMap.get(key).lock() is more
readable than synchronization on arbitrary objects, e.g. synchronized
(key) {...}.
I completely disagree. Once you know the meaning of synchronized, it's much more readable than a chain of method calls.
Synchronized methods can cause subtle errors when users lock the
object that has the synchronized methods.
That's why you need to know the meaning of synchronized. If you read about what it does, then avoiding these errors becomes fairly trivial. Rule of thumb: Don't use the same lock in multiple places unless those places need to share the same lock. The same thing could be said of any language's lock/mutex strategy.
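One common way to follow that rule of thumb is to guard a class's state with a private lock object instead of synchronized methods. The sketch below is an assumption-laden illustration, not something from the question, and the class and method names are invented. External code that synchronizes on the instance no longer contends with the class's own locking.

```java
class PrivateLockSketch {
    // Not reachable from outside, so no caller can hold it by accident.
    private final Object lock = new Object();
    private int count = 0;

    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Holds the instance's own monitor while another thread calls increment().
    // Returns true if that call finished anyway.
    static boolean unaffectedByExternalLock() {
        PrivateLockSketch s = new PrivateLockSketch();
        Thread other = new Thread(s::increment);
        synchronized (s) { // external code locking the object itself
            other.start();
            try {
                other.join(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return !other.isAlive() && s.get() == 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(unaffectedByExternalLock());
    }
}
```

Had increment() been declared as a synchronized method, the other thread would have blocked until the synchronized (s) block ended.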
You can be sure that when passing an object to a 3rd-party API, its
lock is not being used.
Right. That's usually a good thing. If it's locked, there should be a good reason why it's locked. Other threads (third party or not) need to wait their turns.
If you synchronize on myObject with the intent of allowing other threads to use myObject at the same time, you're doing it wrong. You could just as easily synchronize the same code block using myOtherObject if that would help.
Not to mention the namespace pollution for each and every object (in C#
at least the methods are static, in Java synchronization primitives
have to use await, not to overload wait in Object...)
The Object class does include some convenience methods related to synchronization, namely notify(), notifyAll(), and wait(). The fact that you haven't needed to use them doesn't mean they aren't useful. You could just as easily complain about clone(), equals(), toString(), etc.
Actually you only have a reference to that monitor in each object; the real monitor object is created only when you use synchronization, so not much memory is lost.
The alternative would be to manually add a monitor to those classes that need one; this would complicate the code very much and would be more error-prone. Java has traded performance for productivity.
One benefit is automatic unlock on exit from synchronized block, even by exception.
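A small sketch of that difference, assuming java.util.concurrent.locks.ReentrantLock as the explicit alternative (the class and method names are made up). The synchronized block releases its monitor on any exit, while the explicit lock is released only because of the hand-written finally.

```java
import java.util.concurrent.locks.ReentrantLock;

class UnlockOnExitSketch {
    private static final Object monitor = new Object();
    private static final ReentrantLock lock = new ReentrantLock();

    // The monitor is released automatically when the exception leaves the block.
    static boolean monitorReleasedAfterException() {
        boolean released = false;
        try {
            synchronized (monitor) {
                throw new IllegalStateException("boom");
            }
        } catch (IllegalStateException e) {
            released = !Thread.holdsLock(monitor);
        }
        return released;
    }

    // With an explicit lock, the release has to be written out in finally.
    static boolean explicitLockReleasedAfterException() {
        boolean released = false;
        try {
            lock.lock();
            try {
                throw new IllegalStateException("boom");
            } finally {
                lock.unlock(); // forgetting this would leave the lock held
            }
        } catch (IllegalStateException e) {
            released = !lock.isHeldByCurrentThread();
        }
        return released;
    }

    public static void main(String[] args) {
        System.out.println(monitorReleasedAfterException());
        System.out.println(explicitLockReleasedAfterException());
    }
}
```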
I assume that, like toString(), the designers thought that the benefits outweighed the costs.
Lots of decisions had to be made and a lot of the concepts were untested (Checked exceptions-ack!) but overall I'm sure it's pretty much free and more useful than an explicit "Lock" object.
Also do you add a "Lock" object to the language or the library? Seems like a language construct, but objects in the library very rarely (if ever?) have special treatment, but treating threading more as a library construct might have slowed things down..
We are building a web app with Scala, Play framework, and MongoDB (with ReactiveMongo as our driver). The application architecture is non-blocking end to end.
In some parts of our code, we need to access some non-thread-safe libraries such as Scala's parser combinators, Scala's reflection etc. We are currently enclosing such calls in synchronized blocks. I have two questions:
Are there any gotchas to look out for when using synchronized with future-y code?
Is it better to use locks (such as ReentrantLock) rather than synchronized, from both performance and usability standpoint?
This is an old question; see using-actors-instead-of-synchronized, for example. In short, it would be more advisable to use actors instead of locks:
class GreetingActor extends Actor with ActorLogging {
  def receive = {
    case Greeting(who) ⇒ log.info("Hello " + who)
  }
}
Only one message will be processed at any given time, so you can put any non-thread-safe code you want in place of log.info and everything will work OK. BTW, using the ask pattern you can seamlessly integrate your actors into existing code that requires futures.
For me the main problem you will face is that any call to a synchronized or a locked section of code may block and thus paralyze the threads of the execution context. To avoid this issue, you can wrap any call to a potentially blocking method using scala.concurrent.blocking:
import scala.concurrent._
import ExecutionContext.Implicits.global

def synchronizedMethod(s: String) = synchronized { s.size }

// Future { ... } replaces the lowercase future { ... }, which newer Scala versions no longer provide
val f = Future {
  println("Hello")
  val i = blocking { // Adjust the execution context behavior
    synchronizedMethod("Hello")
  }
  println(i)
  i
}
Of course, it may be better to consider alternatives like thread-local variables or wrapping invocation to serial code inside an actor.
Finally, I suggest using synchronized instead of locks. For most applications (especially if the critical sections are huge), the performance difference is not noticeable.
The examples you mention i.e. reflection and parsing should be reasonably immutable and you shouldn't need to lock, but if you're going to use locks then a synchronized block will do. I don't think there's much of a performance difference between using synchronized vs Lock.
Well, I think the easiest and safest way would be (if you can do it at all) thread confinement,
i.e. each thread creates its own instance of the parser combinators etc. and then uses it.
And in case you need any synchronization (which should be avoided, as under traffic it will be the killer), synchronized or ReentrantLock will give almost the same performance. It again depends on which objects need to be guarded by which locks etc. In a web application, it is discouraged unless absolutely necessary.
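Thread confinement as described above might look roughly like this. The sketch is in Java rather than Scala, and it uses SimpleDateFormat (documented as not thread-safe) as a stand-in for the parser. The class name is invented.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class ThreadConfinementSketch {
    // Each thread lazily gets its own formatter, so no instance is ever shared.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            return f;
        });

    // Safe to call from any thread without locking.
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L))); // the Unix epoch, in UTC
    }
}
```

No lock is taken on the hot path. The trade-off is one instance per thread, and on a thread pool those instances live as long as the pool's threads do.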
I am new to Java, and while reading about the Java language I ran into two doubts. I have referred to many websites, but I am still not very clear.
Why is the String class immutable? I saw some examples with new File(str) that lead to a security threat, but I don't understand how String being immutable helps in this scenario.
Another doubt is why wait, notify and notifyAll should be inside a synchronized block. I know that otherwise an IllegalMonitorStateException is thrown. But I want to know the technical background: why must they be in a synchronized block, and why can't wait and notify have the same behavior outside one?
Why is the String class immutable?
The question of why strings are immutable in Java is an old one, and it's been much debated. In my book, I'd say they are immutable because they should be immutable ;). That might sound like a cop out, but let me explain.
Most simply, strings are used all over the place; if they were mutable, that would require a lot of baggage everywhere for making defensive copies, dealing with synchronization and so on. Making them immutable, and then having helpers for mutating them like StringBuilder/StringBuffer, is a much better design choice (and a common choice in several languages, not just Java).
Second, everything should be immutable, unless there is a very good reason to justify mutability. Many many problems disappear with immutable classes (esp. pertaining to concurrency). See Effective Java: "Classes should be immutable unless there's a very good reason to make them mutable. If a class cannot be made immutable, limit its mutability as much as possible."
Third, strings are used in the internals of Java, such as the class loading mechanism. Making them immutable makes internal processes simpler, and prevents some security issues. (Another example, String constants are "interned" in Java for performance reasons: http://en.wikipedia.org/wiki/String_interning, and this is, again, much more sane with an immutable type.)
All in all there were probably several reasons the designers chose to make strings immutable in Java and as a day to day programmer it helps you out (as do the utils around creating new strings, like StringBuilder).
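To make the defensive-copy and security points a bit more concrete, here is a small hypothetical sketch. The isAllowed check and the paths are invented for illustration. A mutable value can be changed after it has passed a check, while a String that passed the check stays what it was.

```java
class ImmutableArgumentSketch {
    // Hypothetical policy: only paths under /public/ are allowed.
    static boolean isAllowed(CharSequence path) {
        return path.toString().startsWith("/public/");
    }

    static String demo() {
        // Mutable: the caller can swap the contents after the check passed.
        StringBuilder mutablePath = new StringBuilder("/public/readme.txt");
        boolean mutableChecked = isAllowed(mutablePath);
        mutablePath.replace(0, mutablePath.length(), "/etc/passwd");

        // Immutable: replace() returns a new String and leaves path untouched.
        String path = "/public/readme.txt";
        boolean stringChecked = isAllowed(path);
        path.replace("/public/", "/etc/");

        return mutableChecked + " " + mutablePath + " | " + stringChecked + " " + path;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The mutable path passed the check and then became /etc/passwd. The String is still /public/readme.txt, so code that validated it does not need a defensive copy.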
Why should wait, notify and notifyAll be inside a synchronized block?
Here's some info on that one: wait(), notify() and notifyAll() inside synchronized statement.
Basically it makes no sense for a thread to "notify" or "wait" unless it already owns the object's monitor.
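A minimal sketch of the usual pattern (the class and method names are made up). The waiting thread checks its condition and calls wait() while holding the lock, and the notifying thread changes the condition and calls notifyAll() under the same lock. That way a notification cannot slip in between the check and the wait. Calling notify() without owning the monitor throws IllegalMonitorStateException.

```java
class WaitNotifySketch {
    private final Object lock = new Object();
    private boolean ready = false;

    void awaitReady() throws InterruptedException {
        synchronized (lock) {
            while (!ready) {   // condition is checked while holding the lock
                lock.wait();   // releases the lock while waiting, reacquires on wake-up
            }
        }
    }

    void markReady() {
        synchronized (lock) {
            ready = true;      // change the condition and notify under the same lock
            lock.notifyAll();
        }
    }

    // notify() on a monitor the current thread does not own is rejected.
    static boolean notifyWithoutMonitorThrows() {
        try {
            new Object().notify();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // One thread waits, the caller marks the condition; returns true if the waiter finished.
    static boolean handoffCompletes() {
        WaitNotifySketch s = new WaitNotifySketch();
        Thread waiter = new Thread(() -> {
            try {
                s.awaitReady();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        s.markReady();
        try {
            waiter.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(notifyWithoutMonitorThrows());
        System.out.println(handoffCompletes());
    }
}
```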
In general though, if you are new to Java, you might want to also look at some of the newer utils relating to concurrency in java.util.concurrent: http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html. Often you can rely on these classes and avoid hand coding synchronization, which is notoriously difficult and error prone.