Proof that the old Java Date API is not thread safe

I've been looking for code that demonstrates what can go wrong when you use the legacy Date or Calendar classes in a multi-threaded environment, but I can't find any good examples. I found a couple involving the DateFormatter, but nothing that uses only one of the two classes mentioned. It is always said that they're not thread safe, but there are never any code examples!
Would anyone be kind enough to provide an example? Perhaps one comparing them to the new Java 8 date classes, which are thread safe.

First of all, java.util.Date is mutable and holds state, so there is a chance that it is not thread safe.
Date keeps its state in a field (transient long fastTime) and exposes an unsynchronized getter and setter for it (getTime() and setTime(long)).
But when is setting or getting a long value not an atomic operation? The Java Language Specification allows a write to a non-volatile long or double to be treated as two separate 32-bit writes, which is what typically happens on a 32-bit JVM. A thread that reads the field at the wrong moment can then see half of one value and half of another.
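A torn 64-bit write is hard to reproduce on a current 64-bit JVM, so the sketch below illustrates a different, easier-to-trigger hazard instead: lost updates when several threads perform an unsynchronized read-modify-write on one shared Date. The class and method names (SharedDateSketch, unsafeTotalMillis, safeTotalMillis) are made up for illustration. The java.time variant keeps an immutable Instant in an AtomicReference, which is only one of several possible ways to make the update safe.

```java
import java.time.Instant;
import java.util.Date;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; the names are illustrative only.
class SharedDateSketch {
    static final int THREADS = 4;
    static final int STEPS = 50_000;

    // Each step reads the shared Date and writes it back plus 1 ms without any
    // synchronization. Two threads can read the same value, so increments may be lost.
    static long unsafeTotalMillis() {
        Date shared = new Date(0L);
        runInParallel(() -> shared.setTime(shared.getTime() + 1));
        return shared.getTime();
    }

    // Instant is immutable; updateAndGet retries until its compare-and-set
    // succeeds, so no increment is lost.
    static long safeTotalMillis() {
        AtomicReference<Instant> shared = new AtomicReference<>(Instant.EPOCH);
        runInParallel(() -> shared.updateAndGet(i -> i.plusMillis(1)));
        return shared.get().toEpochMilli();
    }

    // Runs the given step STEPS times on each of THREADS threads and waits for all of them.
    private static void runInParallel(Runnable step) {
        Thread[] workers = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < STEPS; i++) {
                    step.run();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("expected: " + (long) THREADS * STEPS);
        System.out.println("Date:     " + unsafeTotalMillis()); // may be lower; not guaranteed on every run
        System.out.println("Instant:  " + safeTotalMillis());
    }
}
```

The Date result is nondeterministic: it can match the expected total on one run and fall short on the next, which is exactly what makes this kind of bug hard to track down.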

Related

Using NON static class Methods Without reference

I'm new to Java. I know the concept of static and non static method.
I'm wondering if it's possible to use non static methods of a class without having to create a reference to it.
Like for example, for my program I'm working with Date objects, and I need to get yesterday's date in one of them. I know one possible way is like the following:
Calendar cal= Calendar.getInstance();
cal.add(Calendar.DATE,-1);
Date yesterdayDate = new Date();
yesterdayDate = cal.getTime();
Is there a way to do that without having to create the cal reference that I will be using just once in the whole program?
Something like this (I know this is by no means a correct syntax):
Date yesterdayDate = new Date();
yesterdayDate = Calendar.getInstance().add(Calendar.DATE,-1).getTime();
If Calendar followed a fluent builder pattern, e.g. if the add method performed the addition and then returned the mutated instance, you would be able to.
You're not, because Calendar#add returns void.
But don't be fooled: Calendar.getInstance() does create an instance as indicated - you're just not assigning it to a reference.
What you are referring to is known as the Builder pattern.
The Calendar class isn't built to support the builder pattern, but there are many other classes and APIs that are.
For example, DateTimeFormatterBuilder from Joda-Time.
DateTimeFormatter monthAndYear = new DateTimeFormatterBuilder()
.appendMonthOfYearText()
.appendLiteral(' ')
.appendYear(4, 4)
.toFormatter();
You can always go ahead and create your own builders. (In your example, CalendarBuilder).
But you should be aware that the Calendar class is generally regarded as evil - it's not thread safe, for one. Newer alternatives are Joda-Time and the Java 8 date/time API.
If a method's return type is an instance of a class, you can chain calls on the result and you don't need to create a named variable.
This is used in fluent interface APIs, where every method returns the "this" instance.
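For comparison, here is a minimal sketch of the "yesterday" computation with the Java 8 java.time API, whose methods return new immutable values and can therefore be chained. The class name YesterdaySketch and the choice to pass in a Clock are assumptions made for illustration and testability.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper; not taken from the question.
class YesterdaySketch {
    // LocalDate is immutable: minusDays returns a new LocalDate, so the calls chain
    // without a named intermediate variable.
    static LocalDate yesterday(Clock clock) {
        return LocalDate.now(clock).minusDays(1);
    }

    public static void main(String[] args) {
        // Uses the system clock and default time zone.
        System.out.println(yesterday(Clock.systemDefaultZone()));

        // A fixed clock makes the result reproducible.
        Clock fixed = Clock.fixed(Instant.parse("2015-03-18T12:00:00Z"), ZoneOffset.UTC);
        System.out.println(yesterday(fixed)); // prints 2015-03-17
    }
}
```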
Note:
Be careful if you call many chained methods on different objects like:
collection.get(0).getAddress().getStreet().length();
because of possible NullPointerExceptions.
On the other hand, using a fluent API should be safe, because you always call it on the "this" instance, so unless the API has some strange bug, an NPE should not occur.
The general answer is no, because classes like Calendar are stateful and therefore require an initialized instance to be able to operate. If you do:
Calendar cal = Calendar.getInstance();
cal.add(Calendar.DATE,-1);
You are first calling a factory method getInstance() to create an instance of GregorianCalendar, which is the default concrete implementation of Calendar. It is initialized to the default timezone and locale and set to the current system time. This means it is different than another instance you create some milliseconds later.
Then, calling add(...) or other manipulation methods on your instance affects the calendar state, following the programmed calendar logic. If this was not a discrete instance but global (static) state, multiple threads would interfere with each other and cause very confusing results.
The same holds for example for the SimpleDateFormat class. It is often incorrectly set up as a field and re-used for formatting dates in a multi-threaded context (e.g. the handle method of a servlet). This will now and then cause erratic behavior because SimpleDateFormat is also stateful.
So the conclusion is: you need to create isolated instances of classes like Calendar and SimpleDateFormat because they are stateful and thus not thread-safe.
Bear in mind that sometimes you can optimize a bit by declaring your instance before any iterations you are doing, and then re-setting its state instead of creating a new instance on each iteration (after all, creating an instance of Calendar is indeed a bit expensive).
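The reuse idea from the last paragraph could look roughly like the sketch below. The class and method names are hypothetical, and the UTC time zone is an assumption chosen only to keep the result independent of the machine's settings. The Calendar stays local to one method call, so it is never shared between threads.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.List;
import java.util.TimeZone;

// Hypothetical helper; the names are illustrative only.
class CalendarReuseSketch {
    // Returns, for every input date, the instant one calendar day earlier (in UTC).
    static List<Date> dayBefore(List<Date> dates) {
        // Created once, before the loop, and confined to this method call.
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        List<Date> result = new ArrayList<>(dates.size());
        for (Date date : dates) {
            cal.setTime(date);          // reset the state instead of creating a new Calendar
            cal.add(Calendar.DATE, -1);
            result.add(cal.getTime());  // getTime() returns a new Date on each call
        }
        return result;
    }
}
```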
The other answers are all correct, but I think they miss one crucial point: if you are new to Java ... don't waste your time thinking about such questions.
Don't get me wrong: it is always good to understand the programming language you are using in great depth; and it is also important to have some common sense to avoid "really stupid performance" mistakes.
But: don't worry about "performance" and spend hours reducing the number of objects your program is dealing with ... from 100 to 95. That is a complete waste of time.
If you intend to write programs that are used for a longer period of time, and by more than one person (and "good" programs tend to get there pretty fast) then it is much more important that your program is easy to read, understand, and change.
The only valid reasons to look into "performance" are:
You are in the design phase; and as mentioned before one should avoid stupid mistakes that render your end-product "unusable" because of performance issues.
You are actually confronted with "performance" issues. Then you should start profiling your application; and then, based on that analysis improve the "slow" parts.
In other words: don't try to solve "non-existing" problems. And unless your application is running in an embedded environment where every byte of memory and every CPU cycle comes at a price ... don't worry about creating those Calendar objects 10, 100 or 500 times.
It is not only because the method is not static. Understand that you cannot chain like this -- yesterdayDate = Calendar.getInstance().add(Calendar.DATE,-1).getTime(); because the add() method does not return anything. If that method returned the same Calendar object, you could chain the calls without creating a reference.
To understand how chaining works, you can try creating your own methods that return objects and then call other methods on them.
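As a small exercise along those lines, here is a sketch of a wrapper whose add method returns the wrapper itself. FluentCalendar and its methods are hypothetical and not part of the JDK. It wraps an ordinary Calendar, so it is just as mutable and just as unsuitable for sharing between threads.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical wrapper; not part of the JDK.
final class FluentCalendar {
    private final Calendar cal;

    private FluentCalendar(Calendar cal) {
        this.cal = cal;
    }

    // Starts from the current time in the default time zone.
    static FluentCalendar now() {
        return new FluentCalendar(Calendar.getInstance());
    }

    // Starts from a given instant in a given zone (handy for reproducible results).
    static FluentCalendar at(long epochMillis, TimeZone zone) {
        Calendar c = Calendar.getInstance(zone);
        c.setTimeInMillis(epochMillis);
        return new FluentCalendar(c);
    }

    // Unlike Calendar#add, which returns void, this returns the wrapper so calls can be chained.
    FluentCalendar add(int field, int amount) {
        cal.add(field, amount);
        return this;
    }

    Date toDate() {
        return cal.getTime();
    }
}
```

With such a wrapper, the line from the question becomes Date yesterdayDate = FluentCalendar.now().add(Calendar.DATE, -1).toDate();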

Why is Date better than Calendar for property typing when stuck using the legacy API?

First, please note that this question is not a duplicate of this Question:
Java Date vs Calendar. My question is much more specific. The referenced question asks "what" (or "which"), but I already know the "what" and am asking the "why".
I am on a team working on enhancements to an existing Java project for a client. This Java project uses java 6, and does not have Joda Time as a dependency. After inquiring, it looks like adding Joda Time or upgrading to Java 8 are not options.
So, when it comes to representing date/time as a field in an object, we have to use either Calendar or Date for property typing. The legacy code of this project is littered with Objects that use Calendar to represent date/time fields -- fields that we would never have cause to manipulate (as in add or subtract units of time, etc). I know that this is bad practice, as Calendar is a more complex object, while Date is simpler and would work just as well. (And granted, I know that both are fundamentally wrappers for a long of epoch millis, are mutable, and are poorly designed, but again these are our only two options.)
In other words, an object like this:
public class Reservation {
private Guest guest;
// Set only once, never used for calculations
private Calendar dateReserved;
...
}
Should be this instead:
public class Reservation {
private Guest guest;
// Set only once, never used for calculations
private Date dateReserved;
...
}
I then noticed that when adding new Objects for new features, my team was following the same convention of using Calendar instead of Date. When I brought this up, the reply was that it's better to use Calendar because it can do more and doesn't have all these deprecated methods like Date does.
I know that this reasoning is oversimplified. I also see that this answer to the broader question of usage expresses the same view, namely that Calendar should not be used for property typing. However, the answer doesn't contain much explanation as to why Calendar should not be preferred.
So I already know the "What". But I'm trying to make the case to my team, so my question is, "Why"? Why, when property typing, should Date be preferred to Calendar? What are the disadvantages of using Calendar instead of Date for property typing?
I agree with Jon Skeet's comment regarding calendar systems and time zones, and I think your premise is fundamentally flawed. Dates aren't better than Calendars. If you're never ever ever going to compare times, or never ever ever have two dates in different time zones, then sure, the smaller footprint can be nice, I guess, but at that point, just use longs and Unix timestamps. Calendars are by far the better object model, and after all, if you absolutely need it, you can get a Date object from it.
If you are stuck having to choose between Date and Calendar when property typing:
Use Calendar if either one of these is true:
You need to be able to adjust the date/time after it is initially set
(such as changing the month while leaving the day and hour the same).
You need to be aware of timezone.
Otherwise, use Date for the following reasons:
Expressing your intentions accurately. If you use Calendar, you are implying that you want a certain functionality that you don't actually intend to use (timezones, changing the day or month, etc).
Less hassle with String representations. For example, consider this class:
public class Reservation {
private Guest guest;
private Calendar dateReserved;
@Override
public String toString() {
return String.format("Reservation{guest=%s,dateReserved=\"%s\"}",
guest, dateReserved);
}
}
Now if you print out an instance of this class, you'll get something hideous:
Reservation{guest=Guest{id=17,name="John Smith"},dateReserved="java.util.GregorianCalendar[time=1426707020619,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2015,MONTH=2,WEEK_OF_YEAR=12,WEEK_OF_MONTH=3,DAY_OF_MONTH=18,DAY_OF_YEAR=77,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=3,AM_PM=1,HOUR=0,HOUR_OF_DAY=12,MINUTE=30,SECOND=20,MILLISECOND=619,ZONE_OFFSET=-28800000,DST_OFFSET=3600000]"}
Whereas if you had used Date instead, you'd get this:
Reservation{guest=Guest{id=17,name="John Smith"},dateReserved="Wed Mar 18 12:34:26 PDT 2015"}
So if you use Calendar and you want your toString() to be usable, you would need to call dateReserved.getTime() -- which means you'd need to add a null check. This goes for whether or not you end up using a DateFormat object.
Date is a smaller object, quicker to instantiate and with less overhead.
Date is practically immutable -- meaning that, apart from setTime(long), the only ways to change a Date object are deprecated methods. So, as said in point 1, expressing your intentions matters. If your date field should be immutable, don't confuse developers who will touch your code in the future by using Calendar (unless of course you need timezone awareness).
"Date" is a more intuitive name than "Calendar" for the type of a field that represents a single point in time.
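To make the toString() point above concrete, here is a sketch of the extra null handling a Calendar field needs compared with a Date field. The helper class and method names are hypothetical.

```java
import java.util.Calendar;
import java.util.Date;

// Hypothetical helpers; the names are illustrative only.
class ReservationToStringSketch {
    // A Calendar field must be converted with getTime() to print readably,
    // and calling getTime() on a null reference would throw, hence the null check.
    static String describeCalendar(Calendar dateReserved) {
        return dateReserved == null ? "null" : dateReserved.getTime().toString();
    }

    // A Date field can be handed to String.valueOf (or String.format) directly;
    // a null reference simply prints as "null".
    static String describeDate(Date dateReserved) {
        return String.valueOf(dateReserved);
    }
}
```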
Date object has fewer fields and occupies less memory than Calendar object and is also faster to instantiate.

Why it is bad, that java.util.date is mutable? [duplicate]

As most people are painfully aware of by now, the Java API for handling calendar dates (specifically the classes java.util.Date and java.util.Calendar) are a terrible mess.
Off the top of my head:
Date is mutable
Date represents a timestamp, not a date
no easy way to convert between date components (day, month, year...) and Date
Calendar is clunky to use, and tries to combine different calendar systems into one class
This post sums it up quite well, and JSR-310 also explains these problems.
Now my question is:
How did these classes make it into the Java SDK? Most of these problems seem fairly obvious (especially Date being mutable) and should have been easy to avoid. So how did it happen? Time pressure? Or are the problems obvious in retrospect only?
I realize this is not strictly a programming question, but I'd find it interesting to understand how API design could go so wrong. After all, mistakes are always a good learning opportunity (and I'm curious).
Someone put it better than I could ever say it:
Class Date represents a specific instant in time, with millisecond
precision. The design of this class is a very bad joke - a sobering
example of how even good programmers screw up. Most of the methods in
Date are now deprecated, replaced by methods in the classes below.
Class Calendar is an abstract class for converting between a Date
object and a set of integer fields such as year, month, day, and hour.
Class GregorianCalendar is the only subclass of Calendar in the JDK.
It does the Date-to-fields conversions for the calendar system in
common use. Sun licensed this overengineered junk from Taligent - a
sobering example of how average programmers screw up.
from Java Programmers FAQ, version from 07.X.1998, by Peter van der Linden - this part was removed from later versions though.
As for mutability, a lot of the early JDK classes suffer from it (Point, Rectangle, Dimension, ...). Misdirected optimizations, I've heard some say.
The idea is that you want to be able to reuse objects (o.getPosition().x += 5) rather than creating copies (o.setPosition(o.getPosition().add(5, 0))) as you have to do with immutables. This may even have been a good idea with the early VMs, while it most likely isn't with modern VMs.
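The practical cost of that mutability shows up as soon as an object hands out its internal Date. The sketch below uses hypothetical classes (LeakyEvent, GuardedEvent) to contrast a getter that leaks its state with one that makes defensive copies.

```java
import java.util.Date;

// Hypothetical class: stores and returns the caller's Date directly.
class LeakyEvent {
    private final Date start;

    LeakyEvent(Date start) {
        this.start = start;
    }

    Date getStart() {
        return start; // callers can now change this event's state via setTime
    }
}

// Hypothetical class: copies the Date on the way in and on the way out.
class GuardedEvent {
    private final Date start;

    GuardedEvent(Date start) {
        this.start = new Date(start.getTime());
    }

    Date getStart() {
        return new Date(start.getTime());
    }
}

class MutableDateDemo {
    public static void main(String[] args) {
        LeakyEvent leaky = new LeakyEvent(new Date(1000L));
        leaky.getStart().setTime(0L);
        System.out.println(leaky.getStart().getTime()); // prints 0: the event was changed from outside

        GuardedEvent guarded = new GuardedEvent(new Date(1000L));
        guarded.getStart().setTime(0L);
        System.out.println(guarded.getStart().getTime()); // prints 1000: only a copy was changed
    }
}
```

With an immutable type such as java.time.Instant, neither copy would be necessary.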
Java's early APIs are nothing more than a product of their time. Immutability only became a popular concept years after that. You say that immutability is "obvious". That might be true now but it wasn't then. Just like dependency injection is now "obvious" but it wasn't 10 years ago.
It was also at one time expensive to create Calendar objects.
They remain that way for backwards compatibility reasons. What is perhaps more unfortunate was that once the mistake was realized the old class wasn't deprecated and new date/time classes were created for all APIs going forward. This has to some degree occurred with the JDK 8 adoption of a JodaTime like API (java.time, JSR 310) but really it's too little too late.
Time itself is not easy to measure or to handle. Just look at the length of the Wikipedia article about time. And then there are different understandings of time itself: an absolute point in time (as a constant), a point in time at a certain place, a time range, the resolution of time....
I remember that when I saw java.util.Date for the first time (JDK 1.0?) I was really happy about it. The languages I knew of didn't have such a feature. I didn't have to think about time conversion etc.
I think it's a mess because everything that changes leaves a mess when you evolve from one level of understanding (XMLGregorianCalendar vs. Date) and requirements (nanoseconds, dates past 2030) to a higher level while keeping the old untouched. And java.util.Date is no exception. Just look at the I/O subsystem or the transition from AWT to Swing...
And because of that, "we should sometimes press the reset button." (who said that, btw.?)

Why would anyone need a thread safe SimpleDateFormat object?

I was looking for the usage of ThreadLocal and landed on this popular page When and how should I use a ThreadLocal variable?
The accepted, highest voted answer says
One possible (and common) use is when you have some object that is not thread-safe, but you want to avoid synchronizing access to that object (I'm looking at you, SimpleDateFormat).
And the core part of the code is
return new SimpleDateFormat("yyyyMMdd HHmm");
which won't change or be affected by concurrent execution, or would it?
Can you please highlight how this could be an issue? And why would we need a thread safe object here?
In other occurrence, I have come across a similar usage with java.security.MessageDigest;, which is also a puzzler to me. It would be great if anyone could explain the reasons behind this, with some helpful code if possible.
SimpleDateFormat extends DateFormat which has setter methods so one thread could be changing properties of the SimpleDateFormat instance while others could be using it and assuming earlier properties or, even worse, have the properties change in the middle of an execution causing internally inconsistent results.
Well, take the first line in format(Date, StringBuffer, FieldDelegate):
calendar.setTime(date);
calendar there is an instance member, so that's obviously not thread-safe. Firstly there's a data race (since setTime is not synchronized), but even more glaringly, another thread could come through and set the calendar's time to something else part-way through the function (calendar's value is accessed in subFormat, which format calls).
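Two common ways around the shared internal calendar are sketched below: one SimpleDateFormat per thread via ThreadLocal, as in the answer the question refers to, or the Java 8 DateTimeFormatter, which is immutable and thread-safe. The class name, the UTC zone and Locale.ROOT are assumptions chosen to make the output reproducible.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class; the names are illustrative only.
class FormatterPerThreadSketch {
    private static final String PATTERN = "yyyyMMdd HHmm";

    // Each thread lazily gets its own SimpleDateFormat, so the formatter's
    // internal Calendar is never touched by two threads at once.
    private static final ThreadLocal<SimpleDateFormat> LEGACY =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
                format.setTimeZone(TimeZone.getTimeZone("UTC"));
                return format;
            });

    // DateTimeFormatter is immutable, so a single shared constant is safe.
    private static final DateTimeFormatter MODERN =
            DateTimeFormatter.ofPattern(PATTERN, Locale.ROOT).withZone(ZoneOffset.UTC);

    static String formatLegacy(Date date) {
        return LEGACY.get().format(date);
    }

    static String formatModern(Instant instant) {
        return MODERN.format(instant);
    }

    public static void main(String[] args) {
        System.out.println(formatLegacy(new Date(0L)));  // prints 19700101 0000
        System.out.println(formatModern(Instant.EPOCH)); // prints 19700101 0000
    }
}
```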

safe publication and the advantage of being immutable vs. effectively immutable

I'm re-reading Java Concurrency In Practice, and I'm not sure I fully understand the chapter about immutability and safe publication.
What the book says is:
Immutable objects can be used safely by any thread without additional
synchronization, even when synchronization is not used to publish
them.
What I don't understand is, why would anyone (interested in making his code correct) publish some reference unsafely?
If the object is immutable, and it's published unsafely, I understand that any other thread obtaining a reference to the object would see its correct state, because of the guarantees offered by proper immutability (with final fields, etc.).
But if the publication is unsafe, another thread might still see null or the previous reference after the publication, instead of the reference to the immutable object, which seems to me like something no-one would like.
And if safe publication is used to make sure the new reference is seen by all the threads, then even if the object is just effectively immutable (no final fields, but no way to mutate them), then everything is safe again. As the book says:
Safely published effectively immutable objects can be used safely by
any thread without additional synchronization.
So, why is immutability (vs. effective immutability) so important? In what case would an unsafe publication be wanted?
It is desirable to design objects that don't need synchronization for two reasons:
The users of your objects can forget to synchronize.
Even though the overhead is very little, synchronization is not free, especially if your objects are not used often and by many different threads.
Because the above reasons are very important, it is better to learn the sometimes difficult rules and as a writer, make safe objects that don't require synchronization rather than hoping all the users of your code will remember to use it correctly.
Also remember that the author is not saying the object is unsafely published, it is safely published without synchronization.
As for your second question, I just checked, and the book does not promise you that another thread will always see the reference to the updated object, just that if it does, it will see a complete object. But I can imagine that if it is published through the constructor of another (Runnable?) object, it will be sweet. That does help with explaining all cases though.
EDIT:
effectively immutable and immutable
The difference between effectively immutable and immutable is that in the first case you still need to publish the objects in a safe way. For the truly immutable objects this isn't needed. So truly immutable objects are preferred because they are easier to publish for the reasons I stated above.
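A minimal sketch of that difference, using hypothetical classes: the first relies on final fields, for which the Java Memory Model guarantees that any thread seeing the reference also sees the fully initialized fields. The second has no final fields, so it needs a safe publication mechanism, here a volatile field.

```java
// Truly immutable: all fields are final and the reference does not escape during construction.
final class ImmutableRange {
    private final int low;
    private final int high;

    ImmutableRange(int low, int high) {
        this.low = low;
        this.high = high;
    }

    int low() { return low; }
    int high() { return high; }
}

// Effectively immutable: never modified after construction, but the fields are not final,
// so the final-field guarantee does not apply.
final class EffectivelyImmutableRange {
    private int low;
    private int high;

    EffectivelyImmutableRange(int low, int high) {
        this.low = low;
        this.high = high;
    }

    int low() { return low; }
    int high() { return high; }
}

// Hypothetical holder showing the two publication styles.
class PublicationSketch {
    // Plain field: a reader may see null or a stale reference, but never a
    // half-initialized ImmutableRange.
    static ImmutableRange racy;

    // volatile field: the write happens-before later reads, which is what makes
    // it safe to publish an effectively immutable object this way.
    static volatile EffectivelyImmutableRange published;

    static void publish() {
        racy = new ImmutableRange(0, 10);
        published = new EffectivelyImmutableRange(0, 10);
    }
}
```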
So, why is immutability (vs. effective immutability) so important?
I think the main point is that truly immutable objects are harder to break later on. If you've declared a field final, then it's final, period. You would have to remove the final in order to change that field, and that should ring an alarm. But if you've initially left the final out, someone could carelessly just add some code that changes the field, and boom - you're screwed - with only some added code (possibly in a subclass), no modification to existing code.
I would also assume that explicit immutability enables the (JIT) compiler to do some optimizations that would otherwise be hard or impossible to justify. For example, when using volatile fields, the runtime must guarantee a happens-before relation with writing and reading threads. In practice this may require memory barriers, disabling out-of-order execution optimizations, etc. - that is, a performance hit. But if the object is (deeply) immutable (contains only final references to other immutable objects), the requirement can be relaxed without breaking anything: the happens-before relation needs to be guaranteed only with writing and reading the one single reference, not the whole object graph.
So, explicit immutability makes the program simpler so that it's both easier for humans to reason and maintain and easier for the computer to execute optimally. These benefits grow exponentially as the object graph grows, i.e. objects contain objects that contain objects - it's all simple if everything is immutable. When mutability is needed, localizing it to strictly defined places and keeping everything else immutable still gives lots of these benefits.
I had the exact same question as the original poster when finishing reading chapters 1-3 . I think the authors could have done a better job elaborating on this a bit more.
I think the difference lies therein that the internal state of effectively immutable objects can be observed to be in an inconsistent state when they are not safely published whereas the internal state of immutable objects can never be observed to be in an inconsistent state.
However I do think the reference to an immutable object can be observed to be out of date / stale if the reference is not safely published.
"Unsafe publication" is often appropriate in cases where having other threads see the latest value written to a field would be desirable, but having threads see an earlier value would be relatively harmless. A prime example is the cached hash value for String. The first time hashCode() is called on a String, it will compute a value and cache it. If another thread which calls hashCode() on the same string can see the value computed by the first thread, it won't have to recompute the hash value (thus saving time), but nothing bad will happen if the second thread doesn't see the hash value. It will simply end up performing a redundant-but-harmless computation which could have been avoided. Having hashCode() publish the hash value safely would have been possible, but the occasional redundant hash computations are much cheaper than the synchronization required for safe publication. Indeed, except on rather long strings, synchronization costs would probably negate any benefit from caching.
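The caching pattern described above can be sketched as follows. CachedHashName is a hypothetical class written in the style of that idiom, not the actual String source. The important detail is that the field is read exactly once into a local variable.

```java
import java.util.Objects;

// Hypothetical class illustrating a benign race on a cached hash value.
final class CachedHashName {
    private final String value;

    // Deliberately neither volatile nor guarded by a lock; 0 means "not computed yet".
    private int hash;

    CachedHashName(String value) {
        this.value = Objects.requireNonNull(value);
    }

    @Override
    public int hashCode() {
        int h = hash;             // single read of the racy field
        if (h == 0) {
            h = value.hashCode(); // every thread computes the same result
            hash = h;             // racy write; a thread that misses it just recomputes
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CachedHashName
                && value.equals(((CachedHashName) other).value);
    }
}
```

This works because the computation is deterministic and writes to an int are atomic, so the worst case is a redundant computation. If the real hash happens to be 0, it is simply recomputed on every call.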
Unfortunately, I don't think the creators of Java imagined situations where code would write to a field and prefer that the write be visible to other threads, but not mind too much if it isn't, and where the reference stored to the field would in turn identify another object with a similar field. This leads to situations where writing semantically-correct code is much more cumbersome and likely slower than code which would be likely to work but whose semantics would not be guaranteed. I don't know any really good remedy for that in some cases other than using some gratuitous final fields to ensure that things get properly "published".
