String immutability and wait method in synchronized block [duplicate]

String immutability and wait method in synchronized block [duplicate] - java

This question already has answers here:
Why is the String class declared final in Java?
(16 answers)
Closed 9 years ago.
I am new to Java and while I was reading through Java language I got into two doubts. Though I referred many websites and but still I am not very clear.
Why string class is immutable ? I saw some examples with new File(str) which leads to security threat, but I don't understand how if string is immutable, it will help this scenario.
Another doubt is why wait, notify and notifyall should be inside synchronized block. I know if not it throws illegalMonitorException. But I want to know the technical background why it should be in synchronized block and why not without in synchronized block wait and notify can have same behavior.

Why string class is immutable?
The question of why strings are immutable in Java is an old one, and it's been much debated. In my book, I'd say they are immutable because they should be immutable ;). That might sound like a cop out, but let me explain.
Most simply, strings are used all over the place, if they were mutable that would require a lot of baggage everywhere for making defensive copies and dealing with synchronization and so on. Making them immutable, and then having helpers for mutating them like StringBuilder/StringBuffer is a much better design choice (and a common choice in several languages, not just Java).
Second, everything should be immutable, unless there is a very good reason to justify mutability. Many many problems disappear with immutable classes (esp. pertaining to concurrency). See Effective Java: "Classes should be immutable unless there's a very good reason to make them mutable. If a class cannot be made immutable, limit its mutability as much as possible."
Third, strings are used in the internals of Java, such as the class loading mechanism. Making them immutable makes internal processes simpler, and prevents some security issues. (Another example, String constants are "interned" in Java for performance reasons: http://en.wikipedia.org/wiki/String_interning, and this is, again, much more sane with an immutable type.)
All in all there were probably several reasons the designers chose to make strings immutable in Java and as a day to day programmer it helps you out (as do the utils around creating new strings, like StringBuilder).
Why wait, notify and notifyall should be inside synchronized block?
Here's some info on that one: wait(), notify() and notifyAll() inside synchronized statement.
Basically it makes no sense for a thread to "notify" or "wait" unless it already owns the object's monitor.
In general though, if you are new to Java, you might want to also look at some of the newer utils relating to concurrency in java.util.concurrent: http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html. Often you can rely on these classes and avoid hand coding synchronization, which is notoriously difficult and error prone.

Related

Why is Java's HashTable synchronized?

What was the driving factor or design plan in making the methods of HashTable synchronized?
This link says that HashTable is synchronized because its methods are synchronized. But, I want to know the reason "why" the methods were synchronized?
Was it just to provide some synchronization feature? A developer could explicitly handle a race condition through synchronization techniques. Why provide HashTable with this feature?

Keep in mind: these classes were created "ages" ago - when you check the javadoc for Hashtable, you find it says "since Java 1.0"; whereas HashMap says "1.2"!
Back then, Java was trying to compete with languages like C and C++; by providing unique selling points such as "built-in concurrency".
But people quickly figured that one better synchronizes containers when using them in multi-threaded environments!
So my (more of an opinion-based) answer is: at the time when this class was first designed, people assumed that the requirement "can be used by multiple threads" was more important than "gives optimal performance".
Because Java was "advertised" like: "use it to write multi-threaded write once run everywhere code". That approach fails quickly when the default container classes given to people need additional outside wrapping to actually make them "multi-threaded" ready.
During the years, the people behind Java started to understand that "more granular" solutions are required. Therefore the core collection classes are not synchronized to avoid the corresponding performance hits. Meaning: the default with collections is to go "unprotected"; so you have to put in some thoughts when your requirements is that "multi-threaded" correctness.
Same for "lists" btw: Vector is synchronized; ArrayList is not.

We cannot tell you why. Those who designed Java over two decades ago maybe can. It's not a useful question. Assuming you actually wanted to ask about java.util.Hashtable and not the fictional HashTable type, bear in mind that it's been obsolescent for nineteen years. Nineteen years! Don't use it. It (and Vector) have cruft that the replacement types, both synchronized and unsynchronized, do not carry. Use the modern (as of nineteen years ago) types.

Why Java/.NET allows every object to act as a lock? [duplicate]

Making every object lockable looks like a design mistake:
You add extra cost for every object created, even though you'll actually use it only in a tiny fraction of the objects.
Lock usage become implicit, having lockMap.get(key).lock() is more readable than synchronization on arbitrary objects, eg, synchronize (key) {...}.
Synchronized methods can cause subtle error of users locking the object with the synchronized methods
You can be sure that when passing an object to a 3rd parting API, it's lock is not being used.
eg
class Syncer {
synchronized void foo(){}
}
...
Syncer s = new Syncer();
synchronize(s) {
...
}
// in another thread
s.foo() // oops, waiting for previous section, deadlocks potential
Not to mention the namespace polution for each and every object (in C# at least the methods are static, in Java synchronization primitives have to use await, not to overload wait in Object...)
However I'm sure there is some reason for this design. What is the great benefit of intrinsic locks?

You add extra cost for every object created, even though you'll
actually use it only in a tiny fraction of the objects.
That's determined by the JVM implementation. The JVM specification says, "The association of a monitor with an object may be managed in various ways that are beyond the scope of this specification. For instance, the monitor may be allocated and deallocated at the same time as the object. Alternatively, it may be dynamically allocated at the time when a thread attempts to gain exclusive access to the object and freed at some later time when no thread remains in the monitor for the object."
I haven't looked at much JVM source code yet, but I'd be really surprised if any of the common JVMs handled this inefficiently.
Lock usage become implicit, having lockMap.get(key).lock() is more
readable than synchronization on arbitrary objects, eg, synchronize
(key) {...}.
I completely disagree. Once you know the meaning of synchronize, it's much more readable than a chain of method calls.
Synchronized methods can cause subtle error of users locking the
object with the synchronized methods
That's why you need to know the meaning of synchronize. If you read about what it does, then avoiding these errors becomes fairly trivial. Rule of thumb: Don't use the same lock in multiple places unless those places need to share the same lock. The same thing could be said of any language's lock/mutex strategy.
You can be sure that when passing an object to a 3rd parting API, it's
lock is not being used.
Right. That's usually a good thing. If it's locked, there should be a good reason why it's locked. Other threads (third party or not) need to wait their turns.
If you synchronize on myObject with the intent of allowing other threads to use myObject at the same time, you're doing it wrong. You could just as easily synchronize the same code block using myOtherObject if that would help.
Not to mention the namespace polution for each and every object (in C#
at least the methods are static, in Java synchronization primitives
have to use await, not to overload wait in Object...)
The Object class does include some convenience methods related to synchronization, namely notify(), notifyAll(), and wait(). The fact that you haven't needed to use them doesn't mean they aren't useful. You could just as easily complain about clone(), equals(), toString(), etc.

Actually you only have reference to that monitor in each object; the real monitor object is created only when you use synchronization => not so much memory is lost.
The alternative would be to add manually monitor to those classes that you need; this would complicate the code very much and would be more error-prone. Java has traded performance for productivity.

One benefit is automatic unlock on exit from synchronized block, even by exception.

I assume that like toString(), the designers thought that the benifits outweighed the costs.
Lots of decisions had to be made and a lot of the concepts were untested (Checked exceptions-ack!) but overall I'm sure it's pretty much free and more useful than an explicit "Lock" object.
Also do you add a "Lock" object to the language or the library? Seems like a language construct, but objects in the library very rarely (if ever?) have special treatment, but treating threading more as a library construct might have slowed things down..

safe publication and the advantage of being immutable vs. effectively immutable

I'm re-reading Java Concurrency In Practice, and I'm not sure I fully understand the chapter about immutability and safe publication.
What the book says is:
Immutable objects can be used safely by any thread without additional
synchronization, even when synchronization is not used to publish
them.
What I don't understand is, why would anyone (interested in making his code correct) publish some reference unsafely?
If the object is immutable, and it's published unsafely, I understand that any other thread obtaining a reference to the object would see its correct state, because of the guarantees offered by proper immutability (with final fields, etc.).
But if the publication is unsafe, another thread might still see null or the previous reference after the publication, instead of the reference to the immutable object, which seems to me like something no-one would like.
And if safe publication is used to make sure the new reference is seen by all the threads, then even if the object is just effectively immutable (no final fields, but no way to mute them), then everything is safe again. As the book says :
Safely published effectively immutable objects can be used safely by
any thread without additional synchronization.
So, why is immutability (vs. effective immutability) so important? In what case would an unsafe publication be wanted?

It is desirable to design objects that don't need synchronization for two reasons:
The users of your objects can forget to synchronize.
Even though the overhead is very little, synchronization is not free, especially if your objects are not used often and by many different threads.
Because the above reasons are very important, it is better to learn the sometimes difficult rules and as a writer, make safe objects that don't require synchronization rather than hoping all the users of your code will remember to use it correctly.
Also remember that the author is not saying the object is unsafely published, it is safely published without synchronization.
As for your second question, I just checked, and the book does not promise you that another thread will always see the reference to the updated object, just that if it does, it will see a complete object. But I can imagine that if it is published through the constructor of another (Runnable?) object, it will be sweet. That does help with explaining all cases though.
EDIT:
effectively immutable and immutable
The difference between effectively immutable and immutable is that in the first case you still need to publish the objects in a safe way. For the truly immutable objects this isn't needed. So truly immutable objects are preferred because they are easier to publish for the reasons I stated above.

So, why is immutability (vs. effective immutability) so important?
I think the main point is that truly immutable objects are harder to break later on. If you've declared a field final, then it's final, period. You would have to remove the final in order to change that field, and that should ring an alarm. But if you've initially left the final out, someone could carelessly just add some code that changes the field, and boom - you're screwed - with only some added code (possibly in a subclass), no modification to existing code.
I would also assume that explicit immutability enables the (JIT) compiler to do some optimizations that would otherwise be hard or impossible to justify. For example, when using volatile fields, the runtime must guarantee a happens-before relation with writing and reading threads. In practice this may require memory barriers, disabling out-of-order execution optimizations, etc. - that is, a performance hit. But if the object is (deeply) immutable (contains only final references to other immutable objects), the requirement can be relaxed without breaking anything: the happens-before relation needs to be guaranteed only with writing and reading the one single reference, not the whole object graph.
So, explicit immutability makes the program simpler so that it's both easier for humans to reason and maintain and easier for the computer to execute optimally. These benefits grow exponentially as the object graph grows, i.e. objects contain objects that contain objects - it's all simple if everything is immutable. When mutability is needed, localizing it to strictly defined places and keeping everything else immutable still gives lots of these benefits.

I had the exact same question as the original poster when finishing reading chapters 1-3 . I think the authors could have done a better job elaborating on this a bit more.
I think the difference lies therein that the internal state of effectively immutable objects can be observed to be in an inconsistent state when they are not safely published whereas the internal state of immutable objects can never be observed to be in an inconsistent state.
However I do think the reference to an immutable object can be observed to be out of date / stale if the reference is not safely published.

"Unsafe publication" is often appropriate in cases where having other threads see the latest value written to a field would be desirable, but having threads see an earlier value would be relatively harmless. A prime example is the cached hash value for String. The first time hashCode() is called on a String, it will compute a value and cache it. If another thread which calls hashCode() on the same string can see the value computed by the first thread, it won't have to recompute the hash value (thus saving time), but nothing bad will happen if the second thread doesn't see the hash value. It will simply end up performing a redundant-but-harmless computation which could have been avoided. Having hashCode() publish the hash value safely would have been possible, but the occasional redundant hash computations are much cheaper than the synchronization required for safe publication. Indeed, except on rather long strings, synchronization costs would probably negate any benefit from caching.
Unfortunately, I don't think the creators of Java imagined situations where code would write to a field and prefer that it should be visible to other threads, but not mind too much if it isn't, and where the reference stored to the field would in turn identify another object with a similar field. This leads to situations writing semantically-correct code is much more cumbersome and likely slower than code which would be likely to work but whose semantics would not be guaranteed. I don't know any really good remedy for that in some cases other than using some gratuitous final fields to ensure that things get properly "published".

Why Classes to java have been added which are not thread safe?

I am seeing a lot of classes being added to Java which are not thread safe.
Like StringBuilder is not thread safe while StringBuffer was and StringBuilder is recoomended over Stringbuffer.
Also various collection classes are not thread safe.
Isn't being thread safe a good thing ?
Or i am just stupid and don't yet understand the meaning of being thread safe ?

Because thread safety makes things slower, and not everything has to be multi-threaded.
Consider reading this article to find out basics about thread safety :
http://en.wikipedia.org/wiki/Thread_safety
When you comfortable enough with the threads/or not, consider reading this book, it has great reviews :
http://www.amazon.com/Java-Concurrency-Practice-Brian-Goetz/dp/0321349601

Some classes are not suitable for using across multiple threads. StringBuffer is one of them IMHO.
It is very hard to find even a contrived example of when you would use StringBuffer in a multi-threaded way that cannot be more simple achieve other ways.

Thread safety is not a all or nothing property. Ten years ago some books recommended marking all methods of a class as synchronized in order to make them thread safe. This costs some performane, but it is far from a guarantee that your overall program is thread safe. Therefore, you have costs with a questionable gain. That is, why there are still classes added to Java library which are not thread safe.
The "make every method synchronized" strategy is only able to provide guarantees about the consistency of one object, and it has the potential to introduce dead-locks, or to be weaker than thought (think about wait()).

There is a performance overhead to inherently thread-safe code. If you do not need the class in a concurrent context but need the performance to be high then these, original classes are not ideal.

A typical usage of StringBuilder is something like:
return new StringBuilder().append("this").append("that").toString()
all in one thread, no need to synchronize anything.

Why is String final? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Why is String final in Java?
I'm just wondering why java.lang.String is made final? Is it to prevent from being inherited? Why?

Yes indeed. This allows code in security managers and classloaders to work with the String type without having to worry that it's actually dealing with a malicious subclass that's specifically designed to trick it into allowing evil code through.

You should not be extending the string class. Just write your own methods in some other class that manipulate strings.
The reason is that the string class is a stable one which should not be tampered with as you may re-define some methods which would have unknown side effects on some other transactions.

Aside from security aspects that were already mentioned, I suspect performance was another important reason. For older JVMs especially final classes (where all methods are final by definition) made it much easier to inline code on-the-fly. And since String is one of most heavily used objects, which affects overall performance of many applications, this was seen as an area where improvements would have big overall effect.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.