Why would anyone need a thread safe SimpleDateFormat object?


I was looking for the usage of ThreadLocal and landed on this popular page When and how should I use a ThreadLocal variable?
The accepted, highest voted answer says
One possible (and common) use is when you have some object that is not thread-safe, but you want to avoid synchronizing access to that object (I'm looking at you, SimpleDateFormat).
And the core part of the code is
return new SimpleDateFormat("yyyyMMdd HHmm");
which won't change or be affected by concurrent execution, or would it?
Can you please highlight how this could be an issue? And why would we need a thread-safe object here?
In another case, I have come across a similar usage with java.security.MessageDigest, which also puzzles me. It would be great if anyone could explain the reasons behind this, with some helpful code if possible.

SimpleDateFormat extends DateFormat, which has setter methods, so one thread could be changing properties of the SimpleDateFormat instance while other threads are using it and assuming the earlier properties. Even worse, the properties could change in the middle of an execution, causing internally inconsistent results.

Well, take the first line in format(Date, StringBuffer, FieldDelegate):
calendar.setTime(date);
calendar there is an instance member, so that's obviously not thread-safe. Firstly there's a data race (since setTime is not synchronized), but even more glaringly, another thread could come through and set the calendar's time to something else part-way through the function (calendar's value is accessed in subFormat, which format calls).
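To make the ThreadLocal idea from the question concrete, here is a minimal sketch, assuming the same "yyyyMMdd HHmm" pattern. The class name PerThreadFormatter and the helper countMismatches are made up for illustration. Each thread gets its own SimpleDateFormat, so the internal calendar is never shared.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper class, not part of any library.
class PerThreadFormatter {
    // Each thread lazily creates and then reuses its own formatter instance.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyyMMdd HHmm"));

    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    // Formats a different date on each thread and counts results that differ
    // from what a private, single-threaded formatter produced for that date.
    static int countMismatches(int threads, int iterations) {
        AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            Date date = new Date(t * 86_400_000L); // a different day per thread
            String expected = new SimpleDateFormat("yyyyMMdd HHmm").format(date);
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    if (!expected.equals(format(date))) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(8, 10_000));
    }
}
```

If FORMAT were a single shared SimpleDateFormat instead, the same loop could report mismatches or throw, although a race like that is not guaranteed to show up on any given run. On Java 8 and later, java.time.format.DateTimeFormatter is immutable and thread-safe, so it can be shared without a ThreadLocal.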

Related

Proof that old Date Java API is not thread safe

I've been looking for code that demonstrates what can go wrong when you use the legacy Date or Calendar classes in a multi-threaded environment, but I can't seem to find any good examples. I found a couple involving SimpleDateFormat, but nothing that uses only Date or Calendar. It is always mentioned that they're not thread-safe, but there are never any code examples!
Would anyone be kind enough to provide an example? Perhaps comparing to the new Java 8 date classes, which are thread-safe.

First of all, java.util.Date is mutable and holds state, so there is a chance that it is not thread-safe.
Date keeps its value in a field (transient long fastTime) and exposes a getter and a setter for it, neither of which is synchronized.
But when is setting or getting a long value not an atomic operation? On a 32-bit JVM, writing to a non-volatile long field is not guaranteed to be atomic, because long and double values may be treated as two separate 32-bit values. A reader could therefore see half of an old value and half of a new one.
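The torn-write race itself is hard to reproduce on demand, so the following is only a sketch of the underlying difference, not a demonstration of the race. A Date can be changed in place by anyone holding a reference, while java.time.Instant (Java 8) is immutable and returns a new object for every "change". The class and method names are made up for illustration.

```java
import java.time.Instant;
import java.util.Date;

// Hypothetical demo class.
class MutableVsImmutable {
    // Mutates the caller's Date in place; every holder of the reference sees it.
    static long mutateShared(Date shared) {
        shared.setTime(shared.getTime() + 1000L);
        return shared.getTime();
    }

    // Cannot affect the caller's Instant; plusSeconds returns a new object.
    static Instant shift(Instant instant) {
        return instant.plusSeconds(1);
    }

    public static void main(String[] args) {
        Date date = new Date(0L);
        mutateShared(date);
        System.out.println(date.getTime());          // the original object changed

        Instant instant = Instant.EPOCH;
        Instant shifted = shift(instant);
        System.out.println(instant.toEpochMilli());  // the original is untouched
        System.out.println(shifted.toEpochMilli());
    }
}
```

Because an Instant never changes after construction, sharing it between threads needs no synchronization. A shared Date needs either synchronization or defensive copies.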

Using NON static class Methods Without reference

I'm new to Java. I know the concept of static and non static method.
I'm wondering if it's possible to use non static methods of a class without having to create a reference to it.
Like for example, for my program I'm working with Date objects, and I need to get yesterday's date in one of them. I know one possible way is like the following:
Calendar cal= Calendar.getInstance();
cal.add(Calendar.DATE,-1);
Date yesterdayDate = new Date();
yesterdayDate = cal.getTime();
Is there a way to do that without having to create the cal reference that I will be using just once in the whole program?
Something like this (I know this is by no means a correct syntax):
Date yesterdayDate = new Date();
yesterdayDate = Calendar.getInstance().add(Calendar.DATE,-1).getTime();

If Calendar followed a fluent builder pattern, where e.g. the add method performed the addition and then returned the mutated instance, you would be able to.
You can't, because Calendar#add returns void.
But don't be fooled: Calendar.getInstance() does create an instance as indicated - you're just not assigning it to a reference.

What you are referring to is the well-known Builder pattern.
The Calendar class isn't built to support the builder pattern, but there are many other classes and APIs that are.
For example, DateTimeFormatterBuilder from Joda-Time.
DateTimeFormatter monthAndYear = new DateTimeFormatterBuilder()
.appendMonthOfYearText()
.appendLiteral(' ')
.appendYear(4, 4)
.toFormatter();
You can always go ahead and create your own builders. (In your example, CalendarBuilder).
But you should be aware that the Calendar class is generally regarded as evil: it's not thread-safe, for one. Newer alternatives are Joda-Time and the Java 8 date and time API (java.time).
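As a minimal sketch of the Java 8 alternative: java.time.LocalDate is immutable and its methods return new instances, so the chained one-liner from the question works without any named intermediate. The class name Yesterday and the helper yesterdayOf are made up for illustration.

```java
import java.time.LocalDate;

// Hypothetical demo class.
class Yesterday {
    // minusDays returns a new LocalDate, which is what makes chaining possible.
    static LocalDate yesterdayOf(LocalDate today) {
        return today.minusDays(1);
    }

    public static void main(String[] args) {
        // The chained form asked about in the question, using java.time.
        LocalDate yesterday = LocalDate.now().minusDays(1);
        System.out.println(yesterday);
    }
}
```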

If a method returns an instance of some class, you can chain calls on the returned object and you don't need to create a named variable.
This is used in fluent interface APIs, where every method returns the "this" instance.
Note:
Be careful if you call many chained methods on different objects like:
collection.get(0).getAddress().getStreet().length();
because of possible NullPointerExceptions.
On the other hand, using a fluent API should be safe, because you always call it on the "this" instance. Unless the API has some strange bugs, an NPE should not occur.

The general answer is no, because classes like Calendar are stateful and therefore require an initialized instance to be able to operate. If you do:
Calendar cal = Calendar.getInstance();
cal.add(Calendar.DATE,-1);
You are first calling a factory method getInstance() to create an instance of GregorianCalendar, which is the default concrete implementation of Calendar. It is initialized to the default timezone and locale and set to the current system time. This means it is different than another instance you create some milliseconds later.
Then, calling add(...) or other manipulation methods on your instance affects the calendar state, following the programmed calendar logic. If this was not a discrete instance but global (static) state, multiple threads would interfere with each other and cause very confusing results.
The same holds for example for the SimpleDateFormat class. It is often incorrectly set up as a field and re-used for formatting dates in a multi-threaded context (e.g. the handle method of a servlet). This will now and then cause erratic behavior because SimpleDateFormat is also stateful.
So the conclusion is: you need to create isolated instances of classes like Calendar and SimpleDateFormat because they are stateful and thus not thread-safe.
Bear in mind that sometimes you can optimize a bit by declaring your instance before any iterations you are doing, and then re-setting its state instead of creating a new instance on each iteration (after all, creating an instance of Calendar is indeed a bit expensive).
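A rough sketch of that last point, assuming a made-up helper dayBefore: the Calendar is created once, stays local to the method (so no other thread can see it), and has its state reset on each iteration.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.List;

// Hypothetical demo class.
class ReusedCalendar {
    // Returns, for each timestamp, the instant one calendar day earlier.
    static List<Date> dayBefore(long[] timestamps) {
        Calendar cal = Calendar.getInstance();   // created once, local to this call
        List<Date> result = new ArrayList<>();
        for (long millis : timestamps) {
            cal.setTimeInMillis(millis);         // reset state instead of re-creating
            cal.add(Calendar.DATE, -1);
            result.add(cal.getTime());           // getTime() returns a new Date
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dayBefore(new long[] {System.currentTimeMillis()}));
    }
}
```

The instance is still isolated, because it never escapes the method. Only the number of allocations changes.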

The other answers are all correct, but I think they miss one crucial point: if you are new to Java ... don't waste your time thinking about such questions.
Don't get me wrong: it is always good to understand the programming language you are using in great depth; and it is also important to have some common sense to avoid "really stupid performance" mistakes.
But: don't worry about "performance" and spend hours reducing the number of objects your program is dealing with ... from 100 to 95. That is a complete waste of time.
If you intend to write programs that are used for a longer period of time, and by more than one person (and "good" programs tend to get there pretty fast) then it is much more important that your program is easy to read, understand, and change.
The only valid reasons to look into "performance" are:
You are in the design phase; and as mentioned before one should avoid stupid mistakes that render your end-product "unusable" because of performance issues.
You are actually confronted with "performance" issues. Then you should start profiling your application; and then, based on that analysis improve the "slow" parts.
In other words: don't try to solve "non-existing" problems. And unless your application is running in an embedded environment where every byte of memory and every CPU cycle comes at a price ... don't worry about creating those Calendar objects 10, 100 or 500 times.

It is not only because the method is not static. Understand that you cannot chain like this -- yesterdayDate = Calendar.getInstance().add(Calendar.DATE,-1).getTime(); because the add() method does not return anything. If that method returned the same calendar object, you could chain the calls without creating a reference.
Just to understand how chaining works, you can try creating your own methods that return objects and call other methods on them.
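As an exercise along those lines, here is a small sketch of a wrapper whose add method returns this, so the calls can be chained. FluentCalendar is a hypothetical class written for this answer, not something the JDK provides.

```java
import java.util.Calendar;
import java.util.Date;

// Hypothetical wrapper, for illustration only.
class FluentCalendar {
    private final Calendar calendar;

    private FluentCalendar(Calendar calendar) {
        this.calendar = calendar;
    }

    static FluentCalendar now() {
        return new FluentCalendar(Calendar.getInstance());
    }

    static FluentCalendar at(Date date) {
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(date);
        return new FluentCalendar(calendar);
    }

    // Unlike Calendar#add, this returns the wrapper so calls can be chained.
    FluentCalendar add(int field, int amount) {
        calendar.add(field, amount);
        return this;
    }

    Date getTime() {
        return calendar.getTime();
    }

    public static void main(String[] args) {
        Date yesterday = FluentCalendar.now().add(Calendar.DATE, -1).getTime();
        System.out.println(yesterday);
    }
}
```

The wrapper is just as mutable and just as thread-unsafe as the Calendar it wraps. Returning this changes the syntax, not the semantics.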

What's wrong with returning this?

At the company I work for there's a document describing good practices that we should adhere to in Java. One of them is to avoid methods that return this, like for example in:
class Properties {
public Properties add(String k, String v) {
//store (k,v) somewhere
return this;
}
}
I would have such a class so that I'm able to write:
properties.add("name", "john").add("role","swd"). ...
I've seen such idiom many times, like in StringBuilder and don't find anything wrong with it.
Their argumentation is :
... can be the source of synchronization problems or failed expectations about the states of target objects.
I can't think of a situation where this could be true, can any of you give me an example?
EDIT: The document doesn't specify anything about mutability, so I don't see the difference between chaining the calls and doing:
properties.add("name", "john");
properties.add("role", "swd");
I'll try to get in touch with the originators, but I wanted to do it with my guns loaded, that's why I posted the question.
SOLVED: I got to talk with one of the authors. His original intention was apparently to avoid releasing objects that are not yet ready, like in a Builder pattern, and he explained that if a context switch happens between calls, the object could be in an invalid state. I argued that this had nothing to do with returning this, since you could make the same mistake by calling the methods one by one, and had more to do with synchronizing the building process properly. He admitted the document could be more explicit and will revise it soon. Victory is mine/ours!

My guess is that they are against mutable state (and often rightly so). If, instead of designing a fluent interface that returns this, you return a new immutable instance of the object with the changed state, you avoid synchronization problems and "failed expectations about the states of target objects". This might explain their requirement.
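A minimal sketch of that alternative, reusing the add(k, v) shape from the question. ImmutableProperties is a hypothetical class, and copying the whole map on each call is a simplification chosen for brevity.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical immutable variant of the Properties class from the question.
final class ImmutableProperties {
    private final Map<String, String> values;

    private ImmutableProperties(Map<String, String> values) {
        this.values = values;
    }

    static ImmutableProperties empty() {
        return new ImmutableProperties(Collections.<String, String>emptyMap());
    }

    // Returns a NEW instance; the receiver is never modified.
    ImmutableProperties add(String key, String value) {
        Map<String, String> copy = new LinkedHashMap<>(values);
        copy.put(key, value);
        return new ImmutableProperties(Collections.unmodifiableMap(copy));
    }

    String get(String key) {
        return values.get(key);
    }

    int size() {
        return values.size();
    }

    public static void main(String[] args) {
        ImmutableProperties base = ImmutableProperties.empty().add("name", "john");
        ImmutableProperties extended = base.add("role", "swd");
        System.out.println(base.size() + " vs " + extended.size());
    }
}
```

Chaining reads the same as with return this, but every intermediate object is complete and can be shared safely between threads.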

The only serious basis for the practice is avoiding mutable objects; the criticism that it is "confusing" and leads to "failed expectations" is quite weak. One should never use an object without first getting familiar with its semantics, and enforcing constraints on the API just to cater for those who opt out of reading Javadoc is not a good practice at all— especially because, as you note, returning this to achieve a fluent API design is one of the standard approaches in Java, and indeed a very welcome one.

I think sometimes this approach can be really useful, for example in 'builder' pattern.
I can say that in my organization this kind of things is controlled by Sonar rules, and we don't have such a rule.
Another guess is that maybe the project was built on top of existing codebase and this is kind of legacy restriction.
So the only thing I can suggest here is to talk to the people who wrote this doc :)
Hope this helps

I think it's perfectly acceptable to use that pattern in some situations.
For example, as a Swing developer, I use GridBagLayout fairly frequently for its strengths and flexibility, but anyone who's ever used it (with its partner in crime GridBagConstraints) knows that it can be quite verbose and not very readable.
A common workaround that I've seen online (and one that I use) is to subclass GridBagConstraints (GBConstraints) that has a setter for each different property, and each setter returns this. This allows for the developer to chain the different properties on an as-needed basis.
The resultant code is about 1/4 the size, and far more readable/maintainable, even to the casual developer who might not be familiar with using GridBagConstraints.

Do people provide multiple mechanisms for doing the same thing in an API?

Is it confusing to design an API with multiple ways of achieving the same outcome? For example, I have my own Date library (which is a simple wrapper around the Java Date/Calendar classes to distinguish a year-month-day, Date, from an instant-in-time, Instant and provide mechanisms to convert between the two). I started off with one method to create an instance of a Date:
Date.valueOfYearMonthDay(int year, int month, int day);
But then I found that the resultant code using the API was not very readable. So I added:
Date.yearMonthDay(int year, int month, int day)
Date.ymd(int year, int month, int day)
Date.date(int year, int month, int day)
Then I started getting fluent:
Date.january().the(int day).in(int year);
(I find that the fluent version is really useful for making readable tests). All these methods do identical things and have accurate JavaDoc. I think I've read that a strength of perl is that each programmer can choose exactly which method he/she prefers to solve something. And a strength of Java is that there is usually only one way of doing things :-)
What are people's opinions?

I've been doing academic research for the past 10 years on different issues that have to do with API usability in Java.
I can tell you that the statement about having one way to do things in Java is fairly incorrect. There are often many ways to do the same thing in Java. And unfortunately, they are often not consistent or documented.
One problem with bloating the interface of a class with convenience methods is that you are making it more difficult to understand the class and how to use it. The more choices you have, the more complex things become.
In an analysis of some open-source libraries, I've found instances of redundant functionality, added by different individuals using different terms. Clearly a bad idea.
A greater problem is that the information carried by a name is no longer meaningful. For example, things like putLayer vs. setLayer in Swing, where one just updates the layer and the other also refreshes (guess which one?) are a problem. Similarly, getComponentAt and findComponentAt. In other words, the more ways there are to do something, the more you obfuscate everything else and reduce the "entropy" of your existing functionality.
Here is a good example. Suppose you want to replace a substring inside a string with another string in Java. You can use String.replace(CharSequence, CharSequence), which works exactly as you'd expect, literal for literal. Now suppose you wanted to do a regular expression replacement. You could use Java's Matcher and do a regular-expression-based replacement, and any maintainer would understand what you did. However, you could just write String.replaceAll(String, String), which calls the Matcher version. Many of your maintainers might not be familiar with this and might not realize the consequences, including the fact that "$" has a special meaning in the replacement string and cannot be used there unescaped. So replacing "USD" with a "$" sign would work well with replace(), but would cause crazy things with replaceAll().
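The difference can be sketched as follows. The class and method names are made up, and since the exact exception type thrown for a bare "$" has varied between Java versions, the sketch only assumes that some RuntimeException is thrown.

```java
import java.util.regex.Matcher;

// Hypothetical demo class.
class DollarReplacement {
    // Literal replacement: "$" is just a character here.
    static String literal(String text) {
        return text.replace("USD", "$");
    }

    // Regex replacement: a bare "$" starts a group reference and is rejected.
    static boolean unquotedRegexFails(String text) {
        try {
            text.replaceAll("USD", "$");
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    // Regex replacement made safe by escaping the replacement string.
    static String quotedRegex(String text) {
        return text.replaceAll("USD", Matcher.quoteReplacement("$"));
    }

    public static void main(String[] args) {
        System.out.println(literal("10 USD"));            // prints "10 $"
        System.out.println(unquotedRegexFails("10 USD")); // prints "true"
        System.out.println(quotedRegex("10 USD"));        // prints "10 $"
    }
}
```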
Perhaps the greatest problem, however, is that "doing the same thing" is rarely an issue of using the same method. In many places in Java APIs (and I am sure that also in other languages) you would find ways of doing "almost the same thing", but with differences in performance, synchronization, state changes, exception handling, etc. For instance, one call would work straight, while another would establish locks, and another will change the exception type, etc. This is a recipe for trouble.
So bottom line: Multiple ways to do the same thing are not a good idea, unless they are unambiguous and very simple and you take a lot of care to ensure consistency.

I'd echo what some others said in that convenience methods are great, but will take it a step further - all "convenience" methods should eventually call the same underlying method. The only thing that the convenience methods should do other than proxy the request is to take or return variables differently.
No calculations or processing allowed in the convenience methods. If you need to add additional functionality in one of them, go the extra mile and make it happen in the "main" / "real" one.
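A small sketch of that rule, loosely modelled on the factory methods from the question. CalendarDay, its parse method and the use of java.time.LocalDate for validation are assumptions made for the example, not the asker's actual library.

```java
import java.time.LocalDate;

// Hypothetical year-month-day value class.
final class CalendarDay {
    private final LocalDate value;

    private CalendarDay(LocalDate value) {
        this.value = value;
    }

    // The single "real" factory: all validation happens here
    // (LocalDate.of throws DateTimeException for impossible dates).
    static CalendarDay valueOfYearMonthDay(int year, int month, int day) {
        return new CalendarDay(LocalDate.of(year, month, day));
    }

    // Convenience: same arguments, shorter name, pure delegation.
    static CalendarDay ymd(int year, int month, int day) {
        return valueOfYearMonthDay(year, month, day);
    }

    // Convenience: only adapts the argument form ("yyyy-MM-dd"), then delegates.
    static CalendarDay parse(String text) {
        String[] parts = text.split("-");
        return valueOfYearMonthDay(Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    LocalDate toLocalDate() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(ymd(2024, 2, 29).toLocalDate());
        System.out.println(parse("2024-02-29").toLocalDate());
    }
}
```

Because every entry point funnels into valueOfYearMonthDay, an invalid date is rejected in the same way no matter which method the caller picked.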

It's fine to provide convenience methods. The real problem is if each entry point begins to behave in subtly different ways. That's when the API isn't convenient anymore. It's just a pain to remember which way is "right," and documentation starts saying "the recommended way is..."
If Date.yearMonthDay() began to validate the date while Date.ymd() didn't, that'd be a problem. The same goes for each one supporting different "features": Date.yearMonthDay() could take non-Gregorian dates, and Date.date() could take non-Gregorian dates only if a 4th argument is given that tells the calendar type.

First, please don't invent your own date library. It's too hard to get right. If you absolutely have nothing better to do, be sure to read -- and understand -- Calendrical Calculations. Without understanding Calendrical Calculations you run a big risk of doing things wrong in obscure corner and edge cases.
Second, multiple access to a common underlying method is typical. Lots of Java library API methods state that they are simply a "wrapper" around some other method of class.
Also, because of the Java language limitations, you often have overloaded method names as a way to provide "optional" arguments to a method.
Multiple access methods is a fine design.

If these do the exact same thing:
Date.yearMonthDay(int year, int month, int day)
Date.ymd(int year, int month, int day)
Date.date(int year, int month, int day)
I think that is bad form. When I am reading your code, I have no clue which one to use.
Things like
canvas.setClipRegion (int left, int top, int right, int bottom);
canvas.setClipRegion (Rect r);
are different in that it allows the caller to access the functionality without having to figure out how to format the data.

My personal opinion is that you should stick with one method to do something. If all 4 methods ultimately call the same method then you only need one of them. If however they do something in addition to calling that method then they should exist.
so:
// This method should not exist
Date yearMonthDay(final int year, final int month, final int day)
{
return (valueOfYearMonthDay(year, month, day));
}
Keeping the first method in addition to the fluent version would make more sense. But the yearMonthDay, ymd, and date methods should go.
Also, different languages have different goals. Just because it makes "sense" in Perl doesn't mean it makes sense in Java (or C#, or C++, or C, or Basic, or...)

I find that the fluent version is really useful for making readable tests.
This is a little bit troublesome because I worry that you might only be testing the fluent version. If the only reason methodX() exists is so you can have a readable test for methodY() then there is no reason for one of methodX() or methodY() to exist. You still need to test them in isolation. You're repeating yourself needlessly.
One of the guiding principles of TDD is that you force yourself into thinking about your API while you're writing your code. Decide which method you want clients of your API to use and get rid of the redundant ones. Users won't thank you for providing convenience methods, they'll curse you for cluttering your API with seemingly useless redundant methods.

Java Transport.send() is it thread-safe?

The method is static, but I cannot find mention of if it is thread-safe or not. I plan on hitting this method with several threads at once and I would like to avoid a synchronized block if possible.
javax.mail.Transport.send(msg);

It is usually bad design and a violation of expectations to have a static method that is not thread-safe.
The documentation indeed appears to be devoid of any mention of thread-safety, but a quick glance through the code suggests that the implementation is thread-safe by creating a thread-confined Transport instance on every call and delegating to that.
To be absolutely sure, I recommend setting aside a couple of days for a proper analysis.
