How to check if string matches date pattern using time API? - java

My program parses an input string into a LocalDate object. Most of the time the string looks like 30.03.2014, but occasionally it looks like 3/30/2014. Depending on which, I need to call DateTimeFormatter.ofPattern(String pattern) with a different pattern. Basically, I need to check whether the string matches the pattern dd.MM.yyyy or M/dd/yyyy before parsing it.
The regex approach would be something like:
LocalDate date;
if (dateString.matches("^\\d?\\d/\\d{2}/\\d{4}$")) {
date = LocalDate.parse(dateString, DateTimeFormatter.ofPattern("M/dd/yyyy"));
} else {
date = LocalDate.parse(dateString, DateTimeFormatter.ofPattern("dd.MM.yyyy"));
}
This works, but it would be nice to use the date pattern string when matching the string also.
Are there any standard ways to do this with the new Java 8 time API, without resorting to regex matching? I have looked in the docs for DateTimeFormatter but I couldn't find anything.

Okay I'm going ahead and posting it as an answer. One way is to create a class that holds the patterns.
public class Test {
public static void main(String[] args){
MyFormatter format = new MyFormatter("dd.MM.yyyy", "M/dd/yyyy");
LocalDate date = format.parse("3/30/2014"); //2014-03-30
LocalDate date2 = format.parse("30.03.2014"); //2014-03-30
}
}
class MyFormatter {
private final String[] patterns;
public MyFormatter(String... patterns){
this.patterns = patterns;
}
public LocalDate parse(String text){
for(int i = 0; i < patterns.length; i++){
try{
return LocalDate.parse(text, DateTimeFormatter.ofPattern(patterns[i]));
}catch(DateTimeParseException excep){}
}
throw new IllegalArgumentException("Not able to parse the date for all patterns given");
}
}
You could improve this as @MenoHochschild did by creating an array of DateTimeFormatter directly from the array of String you pass to the constructor.
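As a rough sketch of that improvement, the patterns can be compiled once in the constructor so that parse() does not call DateTimeFormatter.ofPattern on every attempt. The class name PrecompiledFormatter is made up for illustration; the patterns are the ones from the question.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of MyFormatter: formatters are built once, not per parse call.
final class PrecompiledFormatter {
    private final List<DateTimeFormatter> formatters = new ArrayList<>();

    PrecompiledFormatter(String... patterns) {
        for (String pattern : patterns) {
            formatters.add(DateTimeFormatter.ofPattern(pattern));
        }
    }

    LocalDate parse(String text) {
        for (DateTimeFormatter formatter : formatters) {
            try {
                return LocalDate.parse(text, formatter);
            } catch (DateTimeParseException ignored) {
                // This pattern did not match; try the next one.
            }
        }
        throw new IllegalArgumentException("No pattern matched: " + text);
    }

    public static void main(String[] args) {
        PrecompiledFormatter format = new PrecompiledFormatter("dd.MM.yyyy", "M/dd/yyyy");
        System.out.println(format.parse("3/30/2014"));
        System.out.println(format.parse("30.03.2014"));
    }
}
```

This still uses exceptions for control flow; it only removes the repeated formatter construction.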
Another way would be to use a DateTimeFormatterBuilder, appending the formats you want. There may exist other ways to do it; I didn't go deeply through the documentation :-)
DateTimeFormatter dfs = new DateTimeFormatterBuilder()
.appendOptional(DateTimeFormatter.ofPattern("yyyy-MM-dd"))
.appendOptional(DateTimeFormatter.ofPattern("dd.MM.yyyy"))
.toFormatter();
LocalDate d = LocalDate.parse("2014-05-14", dfs); //2014-05-14
LocalDate d2 = LocalDate.parse("14.05.2014", dfs); //2014-05-14

With DateTimeFormatter, optional patterns can be specified using square brackets.
Demo:
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.stream.Stream;
public class Main {
public static void main(String[] args) {
DateTimeFormatter dtf = DateTimeFormatter.ofPattern("[d.M.u][M/d/u][u-M-d]", Locale.ENGLISH);
Stream.of(
"3/30/2014",
"30.03.2014",
"2014-05-14",
"14.05.2014"
).forEach(s -> System.out.println(LocalDate.parse(s, dtf)));
}
}
Output:
2014-03-30
2014-03-30
2014-05-14
2014-05-14
Learn more about the modern date-time API* from Trail: Date Time.
* If for any reason you have to stick to Java 6 or Java 7, you can use ThreeTen-Backport, which backports most of the java.time functionality to Java 6 & 7. If you are working on an Android project and your Android API level is still not compliant with Java 8, check Java 8+ APIs available through desugaring and How to use ThreeTenABP in Android Project.

The approach of @ZouZou is a possible solution.
To avoid using exceptions for program logic as much as possible (they are also not great for performance), the following alternative might be considered:
static final String[] PATTERNS = {"dd.MM.yyyy", "M/dd/yyyy"};
static final DateTimeFormatter[] FORMATTERS = new DateTimeFormatter[PATTERNS.length];
static {
for (int i = 0; i < PATTERNS.length; i++) {
FORMATTERS[i] = DateTimeFormatter.ofPattern(PATTERNS[i]);
}
}
public static LocalDate parse(String input) {
ParsePosition pos = new ParsePosition(0);
for (int i = 0; i < FORMATTERS.length; i++) {
try {
TemporalAccessor tacc = FORMATTERS[i].parseUnresolved(input, pos);
if (pos.getErrorIndex() < 0) {
return LocalDate.from(tacc); // possibly throwing DateTimeException => validation failure
}
} catch (DateTimeException ex) { // catches also possible DateTimeParseException
// go to next pattern
}
pos.setIndex(0);
pos.setErrorIndex(-1);
}
throw new IllegalArgumentException("Input does not match any pattern: " + input);
}
More explanation about the method parseUnresolved():
This method does only the first phase of parsing, so there is no second phase containing preliminary validation or combining effort of parsed fields. However, LocalDate.from() does validate every input, so I think this is still sufficient. And the advantage is that parseUnresolved() uses the error index of ParsePosition. This is in agreement with traditional java.text.Format-behaviour.
Unfortunately, the alternative and more intuitive method DateTimeFormatter.parse() first creates a DateTimeParseException and then stores the error index in this exception. So I decided not to use this method in order to avoid creating an unnecessary exception. For me, this API detail is a questionable design decision.

Related

Custom Comparator not sorting time

I am working on a project where I compare dates and times in a custom Comparator. I concatenate the date with the time before parsing. When I debugged the issue, I realized that the time is not being taken into account when sorting. Here is the snippet of code from my Comparator.
Date dateObject1= new Date();
Date dateObject2 = new Date();
try {
dateObject1 = sdf.parse(date1 + "T" + time1);
dateObject2 = sdf.parse(date2 + "T" + time2);
} catch (Exception e) { }
if (dateObject1.compareTo(dateObject2) > 0)
return 1;
else if (dateObject1.compareTo(dateObject2) < 0)
return -1;
else
return 0;
Test cases:
1. date1 - 2019-12-13 , date2 - 2019-12-13
time1 - 08:00:00, time2 - 12:00:00
When I debugged the issue, I found that it returns 0 for the above test case. I am not sure why this happens, but I intend to return -1 so that the list is sorted in ascending order.
Please advise.
Your problem is here
} catch (Exception e) { }
You initialize your 2 dates, both of which get initialized to the current time (System.currentTimeMillis()).
Date dateObject1= new Date();
Date dateObject2 = new Date();
Your parsing then fails, but you swallow the exception so you never noticed it.
Then you try to sort two dates which are either exactly the same, or separated by a couple of milliseconds, but are certainly unrelated to the actual timestamps that you're trying to sort.
Check the exception, fix the parsing, and then it will work.
java.time and Comparator.comparing … thenComparing
I don’t know what your Java version is. The following snippet works on Java 8 and above. The most important ideas can be applied on Java 6 and 7 too.
List<MyObject> listToBeSorted = Arrays.asList(
new MyObject("2019-12-12", "11:53:50"),
new MyObject("2019-12-11", "13:07:05"),
new MyObject("2019-12-13", "05:02:16"),
new MyObject("2019-12-11", "09:54:57"),
new MyObject("2019-12-12", "05:53:52"),
new MyObject("2019-12-13", "06:56:08"),
new MyObject("2019-12-12", "02:31:55"),
new MyObject("2019-12-11", "09:28:16"),
new MyObject("2019-12-11", "20:58:55"));
Comparator<MyObject> cmpr = Comparator.comparing(MyObject::getDate)
.thenComparing(MyObject::getTime);
listToBeSorted.sort(cmpr);
listToBeSorted.forEach(System.out::println);
Output is:
MyObject [date=2019-12-11, time=09:28:16]
MyObject [date=2019-12-11, time=09:54:57]
MyObject [date=2019-12-11, time=13:07:05]
MyObject [date=2019-12-11, time=20:58:55]
MyObject [date=2019-12-12, time=02:31:55]
MyObject [date=2019-12-12, time=05:53:52]
MyObject [date=2019-12-12, time=11:53:50]
MyObject [date=2019-12-13, time=05:02:16]
MyObject [date=2019-12-13, time=06:56:08]
You will observe that the objects have been sorted by date and objects with the same date also by time. Here is the MyObject class that I used:
public class MyObject {
LocalDate date;
LocalTime time;
public MyObject(String dateString, String timeString) {
date = LocalDate.parse(dateString);
time = LocalTime.parse(timeString);
}
public LocalDate getDate() {
return date;
}
public LocalTime getTime() {
return time;
}
@Override
public String toString() {
return "MyObject [date=" + date + ", time=" + time + "]";
}
}
The two key messages are:
Don’t keep your dates and times as strings in your objects. Keep proper date and time objects. It may require parsing strings when you build your objects, but everything else gets noticeably easier.
Don’t use Date and SimpleDateFormat at all. Use classes from java.time, the modern Java date and time API. In this case LocalDate and LocalTime. The SimpleDateFormat and Date classes are poorly designed and long outdated, the former in particular notoriously troublesome. The modern API is so much nicer to work with.
The advantage of the Comparator methods comparing and thenComparing is not so much that code gets considerably shorter. The really important gain is that writing comparators in this style is much less error prone, and the code reads more naturally.
What went wrong in your code?
The problem is in the line that you posted in a comment:
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
This formatter only parses the date from each string and ignores the time. It’s one of many confusing traits of SimpleDateFormat that it is happy to parse only a part of the string and doesn’t draw our attention to the fact that some of the text is ignored — in this case the T and the entire time.
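To make the partial parsing visible, here is a small sketch that compares the two timestamps from the question's test case, once with the date-only pattern and once with a pattern that also covers the T and the time. The class name PartialParseDemo and the second pattern are my own choices for illustration, not something taken from the question.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

final class PartialParseDemo {
    // Parses both strings with the given pattern and compares the resulting dates.
    static int compare(String pattern, String first, String second) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern);
        try {
            Date d1 = sdf.parse(first);
            Date d2 = sdf.parse(second);
            return d1.compareTo(d2);
        } catch (ParseException e) {
            // Surface the failure instead of swallowing it.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String early = "2019-12-13T08:00:00";
        String late = "2019-12-13T12:00:00";
        // Date-only pattern: the T and the time are silently ignored.
        System.out.println(compare("yyyy-MM-dd", early, late));
        // Pattern covering the whole string: the time takes part in the comparison.
        System.out.println(compare("yyyy-MM-dd'T'HH:mm:ss", early, late));
    }
}
```

With the date-only pattern both strings parse to the same instant, so the comparison yields 0; with the full pattern the earlier time sorts first.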
Link
Oracle tutorial: Date Time explaining how to use java.time.
Just return the value of the comparison since that is what you return anyway.
Date dateObject1= new Date();
Date dateObject2 = new Date();
try {
dateObject1 = sdf.parse(date1 + "T" + time1);
dateObject2 = sdf.parse(date2 + "T" + time2);
} catch (Exception e) {
e.printStackTrace(); // always print these. They are there to help you.
}
return dateObject1.compareTo(dateObject2);

Why GregorianCalendar.getTimeInMillis() changes the value of the instance?

I found a very strange behavior of GregorianCalendar.getTimeInMillis(), it seems that it changes the value of the instance content. In the following code you can see that two blocks of code differ in only one commented line, where getTimeInMillis() is called. Why is the result different when I uncomment the line?
With commented call the output is
2014-10-25T22:00:00Z -> 2014-10-26T22:00:00.000+01:00
2014-10-25T22:00:00Z -> 2014-10-27T00:00:00.000+01:00
but when I uncomment the getTimeInMillis() line, both results are the same:
2014-10-25T22:00:00Z -> 2014-10-27T00:00:00.000+01:00
2014-10-25T22:00:00Z -> 2014-10-27T00:00:00.000+01:00
Code:
package com.test;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;
public class Main {
public static void main(String[] args) {
try {
XMLGregorianCalendar date1 = DatatypeFactory.newInstance()
.newXMLGregorianCalendar("2014-10-25T22:00:00Z");
XMLGregorianCalendar date2 = DatatypeFactory.newInstance()
.newXMLGregorianCalendar("2014-10-25T22:00:00Z");
int days = 1;
GregorianCalendar gregorianCalendar1 = date1.toGregorianCalendar();
// gregorianCalendar1.getTimeInMillis(); //UNCOMMENT THIS LINE TO GET A DIFFERENT RESULT
gregorianCalendar1.setTimeZone(TimeZone.getDefault());
gregorianCalendar1.add(Calendar.DAY_OF_MONTH, days);
XMLGregorianCalendar newXMLGregorianCalendar1 = DatatypeFactory
.newInstance().newXMLGregorianCalendar(gregorianCalendar1);
System.out.printf("%s -> %s\n", date1, newXMLGregorianCalendar1);
GregorianCalendar gregorianCalendar2 = date2.toGregorianCalendar();
gregorianCalendar2.getTimeInMillis();
gregorianCalendar2.setTimeZone(TimeZone.getDefault());
gregorianCalendar2.add(Calendar.DAY_OF_MONTH, days);
XMLGregorianCalendar newXMLGregorianCalendar2 = DatatypeFactory
.newInstance().newXMLGregorianCalendar(gregorianCalendar2);
System.out.printf("%s -> %s\n", date2, newXMLGregorianCalendar2);
} catch (Exception e) {
e.printStackTrace();
}
}
}
It's a time zone change. Not on December 31st in Shanghai, but manually, in your code.
Particularly, you are changing the time zone after having forced the calendar to compute its fields (based on the "old" time zone). This messes up the internal state of the calendar. Of course, this should not be the case, but is only one of the many strange behaviors exposed by the Calendar classes - and, most likely, mainly caused by their mutability.
Some of the potential difficulties are also stated in a comment in the implementation of Calendar#setTimeZone:
* Consider the sequence of calls:
* cal.setTimeZone(EST); cal.set(HOUR, 1); cal.setTimeZone(PST).
* Is cal set to 1 o'clock EST or 1 o'clock PST? Answer: PST.
You could possibly work around this by studying the source code of GregorianCalendar and trying to avoid the critical sequences of calls. But as others already have pointed out: The whole old Date/Time API is horribly broken. If you have the chance, you should consider using the new Date/Time API of Java 8 (or the Joda Time API, which is similar enough to Java 8 to make it easy to later change existing Joda-based code to Java 8 code).
Here is an example that demonstrates the difference between setting the time zone before the call to getTimeInMillis and after the call to getTimeInMillis:
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;
public class GregorianCalendarTest {
public static void main(String[] args) {
String fromSettingTimeZoneBeforeCall = createString(true);
String fromSettingTimeZoneAfterCall = createString(false);
System.out.println("Before: "+fromSettingTimeZoneBeforeCall);
System.out.println("After : "+fromSettingTimeZoneAfterCall);
}
private static String createString(boolean setTimeZoneBeforeCall)
{
try {
XMLGregorianCalendar date = DatatypeFactory.newInstance()
.newXMLGregorianCalendar("2014-10-25T22:00:00Z");
int days = 1;
GregorianCalendar gregorianCalendar = date.toGregorianCalendar();
System.out.println("After creating: "+gregorianCalendar);
if (!setTimeZoneBeforeCall)
{
gregorianCalendar.getTimeInMillis();
System.out.println("After millis : "+gregorianCalendar);
}
gregorianCalendar.setTimeZone(TimeZone.getDefault());
System.out.println("After timezone: "+gregorianCalendar);
if (setTimeZoneBeforeCall)
{
gregorianCalendar.getTimeInMillis();
System.out.println("After millis : "+gregorianCalendar);
}
gregorianCalendar.add(Calendar.DAY_OF_MONTH, days);
System.out.println("After adding : "+gregorianCalendar);
XMLGregorianCalendar newXMLGregorianCalendar = DatatypeFactory
.newInstance().newXMLGregorianCalendar(gregorianCalendar);
System.out.println("After all : "+gregorianCalendar);
return newXMLGregorianCalendar.toString();
} catch (Exception e) {
e.printStackTrace();
return null;
}
}
}
EDIT: This behavior is also described in this bug report: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=5026826
Pre-Java 8 Calendar implementations have been under a lot of criticism for "weird" behavior. I think that this is due to the following documentation:
Getting and Setting Calendar Field Values
The calendar field values can be set by calling the set methods. Any field values set in a Calendar will not be interpreted until it needs to calculate its time value (milliseconds from the Epoch) or values of the calendar fields. Calling the get, getTimeInMillis, getTime, add and roll involves such calculation.
Note that the toString() method is marked as debug-only:
Return a string representation of this calendar. This method is intended to be used only for debugging purposes, and the format of the returned string may vary between implementations. The returned string may be empty but may not be null.
Though this will probably not end up as a bug (as long as you don't use toString() in actual logic), it is better to use Joda-Time or the new Java 8 Date and Time API.
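For comparison, a minimal java.time sketch of the same operation (take the UTC instant, view it in a time zone, add one day). The question does not say which default time zone was in use; Europe/Berlin is assumed here only because it has the +01:00 offset seen in the output, and the class name AddDayDemo is made up.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class AddDayDemo {
    // Interprets the ISO-8601 instant in the given zone and adds whole days in local time.
    static ZonedDateTime plusDays(String isoInstant, String zoneId, long days) {
        return Instant.parse(isoInstant)
                .atZone(ZoneId.of(zoneId))
                .plusDays(days);
    }

    public static void main(String[] args) {
        // Assumed zone; the original program used TimeZone.getDefault().
        System.out.println(plusDays("2014-10-25T22:00:00Z", "Europe/Berlin", 1));
    }
}
```

Because the java.time types are immutable, there is no hidden internal state that depends on the order of calls: each step returns a new value.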

Converter for Joda-Time `DateTime` object to String in `org.simpleframework.xml` ("Simple XML Serialization" library)

How to build a converter for the org.simpleframework.xml library?
I am using the Simple XML Serialization library (org.simpleframework.xml package) from SimpleFramework.org.
I want Joda-Time DateTime objects to be serialized as an ISO 8601 string such as 2014-07-16T00:20:36Z. Upon re-constituting the Java object, I want a DateTime to be constructed from that string. The documentation does not really explain how to build a converter.
I know it has something to do with a Transform and a Matcher. In the MythTV-Service-API project, I found implementations of both Transform and Matcher. But I have not determined how to put it together.
You may choose between two approaches, as discussed in this similar Question, Serialization third-party classes with Simple XML (org.simpleframework.xml):
Converter
Transform
I do not know the pros and cons of each in comparison. But I do know how to implement the Transform approach. For that, read on.
Transform Approach
Three pieces are needed:
An implementation of the Transform interface.
An implementation of the Matcher interface.
A RegistryMatcher instance where the Transform implementation is mapped to the data-type it handles.
All three of these are a part of the transform package.
I suggest putting your implementations in a "converters" package in your project.
Transform Implementation
Your Transform implementation may look like this.
This implementation is simplistic. It assumes you want the output to be the default ISO 8601 string generated by a DateTime’s toString method. And it assumes every text input will be compatible with the default parser in a DateTime constructor. To handle other formats, define a bunch of DateTimeFormatter instances, calling the parseDateTime method on each one in succession until a formatter succeeds without throwing an IllegalArgumentException. Another thing to consider is time zone; you may want to force the time zone to UTC or some such.
package com.your.package.converters.simplexml;
import org.joda.time.DateTime;
import org.simpleframework.xml.transform.Transform;
import org.slf4j.LoggerFactory;
/**
*
* © 2014 Basil Bourque. This source code may be used freely forever by anyone taking full responsibility for such usage and its consequences.
*/
public class JodaTimeDateTimeTransform implements Transform<DateTime>
{
//static final org.slf4j.Logger logger = LoggerFactory.getLogger( JodaTimeDateTimeTransform.class );
@Override
public DateTime read ( String input ) throws Exception
{
DateTime dateTime = null;
try {
dateTime = new DateTime( input ); // Keeping whatever offset is included. Not forcing to UTC.
} catch ( Exception e ) {
//logger.debug( "Joda-Time DateTime Transform failed. Exception: " + e );
}
return dateTime;
}
@Override
public String write ( DateTime dateTime ) throws Exception
{
String output = dateTime.toString(); // Keeping whatever offset is included. Not forcing to UTC.
return output;
}
}
Matcher Implementation
The Matcher implementation is quick and easy.
package com.your.package.converters.simplexml;
import org.simpleframework.xml.transform.Transform;
import org.simpleframework.xml.transform.Matcher;
/**
*
* © 2014 Basil Bourque. This source code may be used freely forever by anyone taking full responsibility for such usage and its consequences.
*/
public class JodaTimeDateTimeMatcher implements Matcher
{
@Override
public Transform match ( Class classType ) throws Exception
{
// Is DateTime a superclass of (or the same class as) the classType?
if ( org.joda.time.DateTime.class.isAssignableFrom( classType ) ) {
return new JodaTimeDateTimeTransform();
}
return null;
}
}
Registry
Putting these into action means the use of a registry.
RegistryMatcher matchers = new RegistryMatcher();
matchers.bind( org.joda.time.DateTime.class , JodaTimeDateTimeTransform.class );
// You could add other data-type handlers, such as the "YearMonth" class in Joda-Time.
//matchers.bind( org.joda.time.YearMonth.class , JodaTimeYearMonthTransform.class );
Strategy strategy = new AnnotationStrategy();
Serializer serializer = new Persister( strategy , matchers );
And continue on the usual way with a serializer that understands the Joda-Time type(s).

Synchronizing access to SimpleDateFormat

The javadoc for SimpleDateFormat states that SimpleDateFormat is not synchronized.
"Date formats are not synchronized. It
is recommended to create separate
format instances for each thread. If
multiple threads access a format
concurrently, it must be synchronized
externally."
But what is the best approach to using an instance of SimpleDateFormat in a multi-threaded environment? Here are a few options I have thought of. I have used options 1 and 2 in the past, but I am curious to know whether there are better alternatives, or which of these options would offer the best performance and concurrency.
Option 1: Create local instances when required
public String formatDate(Date d) {
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
return sdf.format(d);
}
Option 2: Create an instance of SimpleDateFormat as a class variable but synchronize access to it.
private SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
public String formatDate(Date d) {
synchronized(sdf) {
return sdf.format(d);
}
}
Option 3: Create a ThreadLocal to store a different instance of SimpleDateFormat for each thread.
private ThreadLocal<SimpleDateFormat> tl = new ThreadLocal<SimpleDateFormat>();
public String formatDate(Date d) {
SimpleDateFormat sdf = tl.get();
if(sdf == null) {
sdf = new SimpleDateFormat("yyyy-MM-dd");
tl.set(sdf);
}
return sdf.format(d);
}
1. Creating SimpleDateFormat is expensive. Don't use this unless it's done seldom.
2. OK if you can live with a bit of blocking. Use if formatDate() is not used much.
3. Fastest option IF you reuse threads (thread pool). Uses more memory than 2. and has higher startup overhead.
For applications both 2. and 3. are viable options. Which is best for your case depends on your use case. Beware of premature optimization. Only do it if you believe this is an issue.
For libraries that would be used by 3rd party I'd use option 3.
The other option is Commons Lang FastDateFormat but you can only use it for date formatting and not parsing.
Unlike Joda, it can function as a drop-in replacement for formatting.
(Update: Since v3.3.2, FastDateFormat can produce a FastDateParser, which is a drop-in thread-safe replacement for SimpleDateFormat)
If you are using Java 8, you may want to use java.time.format.DateTimeFormatter:
This class is immutable and thread-safe.
e.g.:
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd");
String str = new java.util.Date().toInstant()
.atZone(ZoneId.systemDefault())
.format(formatter);
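Since the formatter is immutable and thread-safe, a single instance can be kept in a static field and shared without locking. The following is only a sketch of that usage; the class name SharedFormatterDemo and the parallel stream standing in for multiple threads are my own illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class SharedFormatterDemo {
    // One shared instance for all threads; no synchronization or ThreadLocal needed.
    private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // Formats the dates on the common fork-join pool, i.e. potentially on several threads.
    static List<String> formatAll(List<LocalDate> dates) {
        return dates.parallelStream()
                .map(FORMATTER::format)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<LocalDate> dates = Arrays.asList(
                LocalDate.of(2014, 3, 30),
                LocalDate.of(2014, 5, 14));
        System.out.println(formatAll(dates));
    }
}
```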
Commons Lang 3.x now has FastDateParser as well as FastDateFormat. It is thread safe and faster than SimpleDateFormat. It also uses the same format/parse pattern specifications as SimpleDateFormat.
Don't use SimpleDateFormat, use joda-time's DateTimeFormatter instead. It is a bit stricter in the parsing side and so isn't quite a drop in replacement for SimpleDateFormat, but joda-time is much more concurrent friendly in terms of safety and performance.
I would say, create a simple wrapper-class for SimpleDateFormat that synchronizes access to parse() and format() and can be used as a drop-in replacement. More foolproof than your option #2, less cumbersome than your option #3.
Seems like making SimpleDateFormat unsynchronized was a poor design decision on the part of the Java API designers; I doubt anyone expects format() and parse() to need to be synchronized.
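A wrapper along those lines might look like the sketch below. It is not a full drop-in replacement (it does not extend DateFormat and exposes only the two methods mentioned), and the class name SynchronizedDateFormat is hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical wrapper: every access to the shared SimpleDateFormat goes through one lock.
final class SynchronizedDateFormat {
    private final SimpleDateFormat delegate;

    SynchronizedDateFormat(String pattern) {
        this.delegate = new SimpleDateFormat(pattern);
    }

    synchronized String format(Date date) {
        return delegate.format(date);
    }

    synchronized Date parse(String text) throws ParseException {
        return delegate.parse(text);
    }

    public static void main(String[] args) throws ParseException {
        SynchronizedDateFormat sdf = new SynchronizedDateFormat("yyyy-MM-dd");
        Date parsed = sdf.parse("2014-06-15");
        System.out.println(sdf.format(parsed));
    }
}
```

The delegate never escapes the wrapper, so callers cannot bypass the lock by accident, which is the main advantage over synchronizing at each call site.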
Another option is to keep instances in a thread-safe queue:
import java.util.concurrent.ArrayBlockingQueue;
private static final int DATE_FORMAT_QUEUE_LEN = 4;
private static final String DATE_PATTERN = "yyyy-MM-dd HH:mm:ss";
private ArrayBlockingQueue<SimpleDateFormat> dateFormatQueue = new ArrayBlockingQueue<SimpleDateFormat>(DATE_FORMAT_QUEUE_LEN);
// thread-safe date time formatting
public String format(Date date) {
SimpleDateFormat fmt = dateFormatQueue.poll();
if (fmt == null) {
fmt = new SimpleDateFormat(DATE_PATTERN);
}
String text = fmt.format(date);
dateFormatQueue.offer(fmt);
return text;
}
public Date parse(String text) throws ParseException {
SimpleDateFormat fmt = dateFormatQueue.poll();
if (fmt == null) {
fmt = new SimpleDateFormat(DATE_PATTERN);
}
Date date = null;
try {
date = fmt.parse(text);
} finally {
dateFormatQueue.offer(fmt);
}
return date;
}
The size of dateFormatQueue should be something close to the estimated number of threads which can routinely call this function at the same time.
In the worst case where more threads than this number do actually use all the instances concurrently, some SimpleDateFormat instances will be created which cannot be returned to dateFormatQueue because it is full. This will not generate an error, it will just incur the penalty of creating some SimpleDateFormat which are used only once.
I just implemented this with Option 3, but made a few code changes:
ThreadLocal should usually be static
Seems cleaner to override initialValue() rather than test if (get() == null)
You may want to set locale and time zone unless you really want the default settings (defaults are very error prone with Java)
private static final ThreadLocal<SimpleDateFormat> tl = new ThreadLocal<SimpleDateFormat>() {
@Override
protected SimpleDateFormat initialValue() {
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
sdf.setTimeZone(TimeZone.getTimeZone("America/Los_Angeles"));
return sdf;
}
};
public String formatDate(Date d) {
return tl.get().format(d);
}
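On Java 8 and later, the anonymous subclass can be replaced with ThreadLocal.withInitial. This is a sketch of the same idea rather than a change in behavior; the class name ThreadLocalFormat is made up, and UTC is used here only so that the example output does not depend on the machine's time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class ThreadLocalFormat {
    // Each thread lazily gets its own SimpleDateFormat on first use.
    private static final ThreadLocal<SimpleDateFormat> TL = ThreadLocal.withInitial(() -> {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // assumed zone, for a reproducible example
        return sdf;
    });

    static String formatDate(Date d) {
        return TL.get().format(d);
    }

    public static void main(String[] args) {
        System.out.println(formatDate(new Date(0L))); // the epoch, formatted in UTC
    }
}
```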
Imagine your application has one thread. Why would you synchronize access to the SimpleDateFormat variable then?

Determine if a String is a valid date before parsing

I have this situation where I am reading about 130K records containing dates stored as String fields. Some records contain blanks (nulls), some contain strings like this: 'dd-MMM-yy' and some contain this 'dd/MM/yyyy'.
I have written a method like this:
public Date parsedate(String date){
if(date != null){
try{
1. create a SimpleDateFormat object using 'dd-MMM-yy' as the pattern
2. parse the date
3. return the parsed date
}catch(ParseException e){
try{
1. create a SimpleDateFormat object using 'dd/MM/yyyy' as the pattern
2. parse the date
3. return parsed date
}catch(ParseException e){
return null
}
}
}else{
return null
}
}
So you may have already spotted the problem. I am using the try .. catch as part of my logic. It would be better if I could determine beforehand that the String actually contains a parseable date in some format, and only then attempt to parse it.
So, is there some API or library that can help with this? I do not mind writing several different Parse classes to handle the different formats and then creating a factory to select the correct one, but how do I determine which one?
Thanks.
See Lazy Error Handling in Java for an overview of how to eliminate try/catch blocks using an Option type.
Functional Java is your friend.
In essence, what you want to do is to wrap the date parsing in a function that doesn't throw anything, but indicates in its return type whether parsing was successful or not. For example:
import fj.F; import fj.F2;
import fj.data.Option;
import java.text.SimpleDateFormat;
import java.text.ParseException;
import static fj.Function.curry;
import static fj.Option.some;
import static fj.Option.none;
...
F<String, F<String, Option<Date>>> parseDate =
curry(new F2<String, String, Option<Date>>() {
public Option<Date> f(String pattern, String s) {
try {
return some(new SimpleDateFormat(pattern).parse(s));
}
catch (ParseException e) {
return none();
}
}
});
OK, now you've a reusable date parser that doesn't throw anything, but indicates failure by returning a value of type Option.None. Here's how you use it:
import fj.data.List;
import static fj.data.Stream.stream;
import static fj.data.Option.isSome_;
....
public Option<Date> parseWithPatterns(String s, Stream<String> patterns) {
return stream(s).apply(patterns.map(parseDate)).find(isSome_());
}
That will give you the date parsed with the first pattern that matches, or a value of type Option.None, which is type-safe whereas null isn't.
If you're wondering what Stream is... it's a lazy list. This ensures that you ignore patterns after the first successful one. No need to do too much work.
Call your function like this:
for (Date d: parseWithPatterns(someString, stream("dd/MM/yyyy", "dd-MM-yyyy"))) {
// Do something with the date here.
}
Or...
Option<Date> d = parseWithPatterns(someString,
stream("dd/MM/yyyy", "dd-MM-yyyy"));
if (d.isNone()) {
// Handle the case where neither pattern matches.
}
else {
// Do something with d.some()
}
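If adding Functional Java is not an option, roughly the same shape can be sketched with the JDK's own java.util.Optional and java.util.stream.Stream (Java 8+). This swaps in different types from the ones used above; the class name OptionalDateParser and the strict (non-lenient) parsing are my assumptions.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Optional;
import java.util.stream.Stream;

final class OptionalDateParser {
    // Never throws: failure is reported as an empty Optional.
    static Optional<Date> parse(String pattern, String s) {
        try {
            SimpleDateFormat format = new SimpleDateFormat(pattern);
            format.setLenient(false); // assumption: reject out-of-range fields
            return Optional.of(format.parse(s));
        } catch (ParseException e) {
            return Optional.empty();
        }
    }

    // Streams are lazy, so patterns after the first successful one are not tried.
    static Optional<Date> parseWithPatterns(String s, String... patterns) {
        return Stream.of(patterns)
                .map(pattern -> parse(pattern, s))
                .filter(Optional::isPresent)
                .map(Optional::get)
                .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(parseWithPatterns("07/06/1999", "dd/MM/yyyy", "dd-MM-yyyy").isPresent());
        System.out.println(parseWithPatterns("not a date", "dd/MM/yyyy", "dd-MM-yyyy").isPresent());
    }
}
```

The try/catch has not disappeared; it is just confined to one small function so the calling code can work with return values only.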
Don't be too hard on yourself about using try-catch in logic: this is one of those situations where Java forces you to, so there's not a lot you can do about it.
But in this case you could instead use DateFormat.parse(String, ParsePosition).
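A sketch of how that overload can be used: parse(String, ParsePosition) returns null on failure instead of throwing, so the patterns can be tried in a plain loop. The helper name parseOrNull, the use of Locale.ENGLISH for the month names, the non-lenient setting and the check that the whole input was consumed are my own assumptions.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

final class ParsePositionDemo {
    // Tries each pattern in turn; returns null if none matches the whole input.
    static Date parseOrNull(String input, String... patterns) {
        if (input == null) {
            return null;
        }
        for (String pattern : patterns) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
            format.setLenient(false);
            ParsePosition pos = new ParsePosition(0);
            Date date = format.parse(input, pos); // null on failure, no exception
            if (date != null && pos.getIndex() == input.length()) {
                return date;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("07-Jun-99", "dd-MMM-yy", "dd/MM/yyyy") != null);
        System.out.println(parseOrNull("07/06/1999", "dd-MMM-yy", "dd/MM/yyyy") != null);
        System.out.println(parseOrNull("garbage", "dd-MMM-yy", "dd/MM/yyyy") != null);
    }
}
```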
You can take advantage of regular expressions to determine which format the string is in, and whether it matches any valid format. Something like this (not tested):
(Oops, I wrote this in C# before checking to see what language you were using.)
Regex test = new Regex(@"^(?:(?<formatA>\d{2}-[a-zA-Z]{3}-\d{2})|(?<formatB>\d{2}/\d{2}/\d{4}))$", RegexOptions.Compiled);
Match match = test.Match(yourString);
if (match.Success)
{
if (match.Groups["formatA"].Success)
{
// Use format A.
}
else if (match.Groups["formatB"].Success)
{
// Use format B.
}
...
}
If your formats are exact (June 7th 1999 would be either 07-Jun-99 or 07/06/1999, so you are sure that you have leading zeros), then you could just check the length of the string before trying to parse.
Be careful with the short month name in the first version, because Jun may not be June in another language.
But if your data is coming from one database, then I would just convert all dates to the common format (it is one-off, but then you control the data and its format).
In this limited situation, the best (and fastest) method is certainly to parse out the day, then, based on whether the next char is '/' or '-', try to parse out the rest. If at any point there is unexpected data, return null.
Assuming the patterns you gave are the only likely choices, I would look at the String passed in to see which format to apply.
public Date parseDate(final String date) {
if (date == null) {
return null;
}
SimpleDateFormat format = (date.charAt(2) == '/') ? new SimpleDateFormat("dd/MM/yyyy")
: new SimpleDateFormat("dd-MMM-yy");
try {
return format.parse(date);
} catch (ParseException e) {
// Log a complaint and include date in the complaint
}
return null;
}
As others have mentioned, if you can guarantee that you will never access the DateFormats in a multi-threaded manner, you can make class-level or static instances.
Looks like three options if you only have two, known formats:
check for the presence of - or / first and parse with the corresponding format.
check the length since "dd-MMM-yy" and "dd/MM/yyyy" are different
use precompiled regular expressions
The latter seems unnecessary.
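The length check could be sketched as follows, here with java.time rather than SimpleDateFormat (my substitution). It relies on the two formats from the question having different lengths, 9 and 10 characters, and the class name LengthDispatch is hypothetical. Note that with java.time a two-digit year such as 09 is resolved into the range 2000 to 2099.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class LengthDispatch {
    private static final DateTimeFormatter SHORT =
            DateTimeFormatter.ofPattern("dd-MMM-uu", Locale.ENGLISH); // 9 characters
    private static final DateTimeFormatter LONG =
            DateTimeFormatter.ofPattern("dd/MM/uuuu");                // 10 characters

    // Picks the formatter from the string length; malformed input still throws on parse.
    static LocalDate parse(String s) {
        switch (s.length()) {
            case 9:
                return LocalDate.parse(s, SHORT);
            case 10:
                return LocalDate.parse(s, LONG);
            default:
                throw new IllegalArgumentException("Unexpected length: " + s);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("07-Jun-09"));
        System.out.println(parse("07/06/2009"));
    }
}
```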
Use regular expressions to parse your string. Make sure that you keep both regexes pre-compiled (do not create new ones on every method call, but store them as constants), and measure whether it actually is faster than the try-catch you use.
I still find it strange that your method returns null if both versions fail rather than throwing an exception.
You could use split to determine which format to use
String[] parts = date.split("-");
df = (parts.length==3 ? format1 : format2);
That assumes they are all in one format or the other; you could improve the checking if need be.
An alternative to creating a SimpleDateFormat (or two) per iteration would be to lazily populate a ThreadLocal container for these formats. This will solve both Thread safety concerns and concerns around object creation performance.
A simple utility class I have written for my project. Hope this helps someone.
Usage examples:
DateUtils.multiParse("1-12-12");
DateUtils.multiParse("2-24-2012");
DateUtils.multiParse("3/5/2012");
DateUtils.multiParse("2/16/12");
public class DateUtils {
private static List<SimpleDateFormat> dateFormats = new ArrayList<SimpleDateFormat>();
static { // static initializer, so the list is populated before multiParse() is first called
dateFormats.add(new SimpleDateFormat("MM/dd/yy")); // must precede yyyy
dateFormats.add(new SimpleDateFormat("MM/dd/yyyy"));
dateFormats.add(new SimpleDateFormat("MM-dd-yy"));
dateFormats.add(new SimpleDateFormat("MM-dd-yyyy"));
}
private static Date tryToParse(String input, SimpleDateFormat format) {
Date date = null;
try {
date = format.parse(input);
} catch (ParseException e) {
}
return date;
}
public static Date multiParse(String input) {
Date date = null;
for (SimpleDateFormat format : dateFormats) {
date = tryToParse(input, format);
if (date != null) break;
}
return date;
}
}
On one hand I see nothing wrong with your use of try/catch for the purpose, it’s the option I would use. On the other hand there are alternatives:
Take a taste from the string before deciding how to parse it.
Use optional parts of the format pattern string.
For my demonstrations I am using java.time, the modern Java date and time API, because the Date class used in the question was always poorly designed and is now long outdated. For a date without time of day we need a java.time.LocalDate.
try-catch
Using try-catch with java.time looks like this:
DateTimeFormatter ddmmmuuFormatter = DateTimeFormatter.ofPattern("dd-MMM-uu", Locale.ENGLISH);
DateTimeFormatter ddmmuuuuFormatter = DateTimeFormatter.ofPattern("dd/MM/uuuu");
String dateString = "07-Jun-09";
LocalDate result;
try {
result = LocalDate.parse(dateString, ddmmmuuFormatter);
} catch (DateTimeParseException dtpe) {
result = LocalDate.parse(dateString, ddmmuuuuFormatter);
}
System.out.println("Date: " + result);
Output is:
Date: 2009-06-07
Suppose instead we defined the string as:
String dateString = "07/06/2009";
Then output is still the same.
Take a taste
If you prefer to avoid the try-catch construct, it’s easy to make a simple check to decide which of the formats your string conforms to. For example:
if (dateString.contains("-")) {
result = LocalDate.parse(dateString, ddmmmuuFormatter);
} else {
result = LocalDate.parse(dateString, ddmmuuuuFormatter);
}
The result is the same as before.
Use optional parts in the format pattern string
This is the option I like the least, but it’s short and presented for some measure of completeness.
DateTimeFormatter dateFormatter
= DateTimeFormatter.ofPattern("[dd-MMM-uu][dd/MM/uuuu]", Locale.ENGLISH);
LocalDate result = LocalDate.parse(dateString, dateFormatter);
The square brackets denote optional parts of the format. So Java first tries to parse using dd-MMM-uu. No matter if successful or not it then tries to parse the remainder of the string using dd/MM/uuuu. Given your two formats one of the attempts will succeed, and you have parsed the date. The result is still the same as above.
Link
Oracle tutorial: Date Time explaining how to use java.time.
