Parsing a ZonedDateTime around the Daylight saving transition fails [duplicate] - java

For improving performance of some legacy code, I am considering a replacement of java.text.SimpleDateFormat by java.time.format.DateTimeFormatter.
Among the tasks performed is parsing date/time values that had been serialized using java.util.Date.toString. With SimpleDateFormat, it was possible to turn them back into the original timestamps (neglecting fractional seconds), however I am facing problems when attempting to do the same with DateTimeFormatter.
When formatting with either, my local timezone is indicated as CET or CEST, depending on whether daylight savings time is in effect for the time to be formatted. However it appears that at parsing time, both CET and CEST are treated the same by DateTimeFormatter.
This creates a problem with the overlap occurring at the end of daylight savings time. When formatting, 02:00:00 is created twice, for times one hour apart, but with CEST and CET timezone names - which is fine. But at parsing time, that difference can't be reclaimed.
Here is an example:
long msecPerHour = 3600000L;
long cet_dst_2016 = 1477778400000L;
DateTimeFormatter formatter =
DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss zzz yyyy", Locale.ENGLISH);
ZoneId timezone = ZoneId.of("Europe/Berlin");
for (int hours = 0; hours < 6; ++hours) {
long time = cet_dst_2016 + msecPerHour * hours;
String formatted = formatter.format(Instant.ofEpochMilli(time).atZone(timezone));
long parsedTime = Instant.from(formatter.parse(formatted)).toEpochMilli();
System.out.println(formatted + ", diff: " + (parsedTime - time));
}
which results in
Sun Oct 30 00:00:00 CEST 2016, diff: 0
Sun Oct 30 01:00:00 CEST 2016, diff: 0
Sun Oct 30 02:00:00 CEST 2016, diff: 0
Sun Oct 30 02:00:00 CET 2016, diff: -3600000
Sun Oct 30 03:00:00 CET 2016, diff: 0
Sun Oct 30 04:00:00 CET 2016, diff: 0
It shows that the second occurrence of 02:00:00, in spite of the different timezone name, was treated like the first one. So the result is effectively off by one hour.
Obviously the formatted string has all information available, and SimpleDateFormat parsing in fact honored it. Is it possible to roundtrip through formatting and parsing, using DateTimeFormatter, with the given pattern?

It is possible for a specific case:
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
.appendPattern("EEE MMM dd HH:mm:ss ")
.appendText(OFFSET_SECONDS, ImmutableMap.of(2L * 60 * 60, "CEST", 1L * 60 * 60, "CET"))
.appendPattern(" yyyy")
.toFormatter(Locale.ENGLISH);
This maps the exact offset to the expected text. Where this fails is when you need to deal with more than one time-zone.
To do the job properly requires a JDK change.
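A self-contained variant of the formatter above, as a sketch only: it assumes Java 9+ so that Map.of can stand in for Guava's ImmutableMap, and the class and method names (OffsetNameFormatter, roundTrip) are made up for illustration. It parses through OffsetDateTime, since the parsed text carries an offset rather than a zone.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;
import java.util.Map;

class OffsetNameFormatter {
    // Maps the total offset in seconds to an abbreviation.
    // Only meaningful for zones that use CET/CEST with these offsets.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("EEE MMM dd HH:mm:ss ")
            .appendText(ChronoField.OFFSET_SECONDS,
                    Map.of(2L * 60 * 60, "CEST", 1L * 60 * 60, "CET"))
            .appendPattern(" yyyy")
            .toFormatter(Locale.ENGLISH);

    // Formats the instant in the given zone, parses the text back, returns epoch millis.
    static long roundTrip(long epochMilli, ZoneId zone) {
        String text = FORMATTER.format(Instant.ofEpochMilli(epochMilli).atZone(zone));
        return OffsetDateTime.parse(text, FORMATTER).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneId berlin = ZoneId.of("Europe/Berlin");
        long start = 1477778400000L; // first value from the question
        for (int hours = 0; hours < 6; hours++) {
            long time = start + 3_600_000L * hours;
            System.out.println("diff: " + (roundTrip(time, berlin) - time));
        }
    }
}
```

Because the offset itself is what gets parsed, both occurrences of 02:00:00 should map back to their own instants. Any zone whose offsets are not in the map fails at formatting time, which is the limitation mentioned above.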

It seems like a bug. I tested in Java 17 and it's still the same behaviour. I dug into the parsing logic and I can see why this happens.
One of the first things that happens is TimeZoneNameUtility.getZoneStrings(locale) is called. This gives you a 2D array of Strings
[
[
"Europe/Paris",
"Central European Standard Time", "CET",
"Central European Summer Time", "CEST",
"Central European Time", "CET"
],
// others
]
It builds a prefix tree out of them. All items in here get mapped to the 0th item - "Europe/Paris". When it's parsing, it descends the prefix tree one character at a time e.g. C... E... T..., then returns a match if there was one. Since CEST and CET map to the same thing, they're effectively just aliases of one another.
Later on that string is passed to ZoneId.of() which means the fact of whether it's summertime or not has been thrown away.
It does seem in Java 18 that there have been significant changes in this code, so maybe they're addressing that.
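The effect of ending up with only a zone ID can be illustrated without any parsing. This is a sketch, not the JDK's internal logic; the class name OverlapDemo and its members are hypothetical. It shows that the local time 02:00 on 30 October 2016 has two valid offsets in Europe/Berlin, and that resolving it from the zone alone picks the earlier (summer) one.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class OverlapDemo {
    static final ZoneId BERLIN = ZoneId.of("Europe/Berlin");
    // 02:00 on 30 October 2016 occurs twice in Berlin: first in CEST, then in CET.
    static final LocalDateTime AMBIGUOUS = LocalDateTime.of(2016, 10, 30, 2, 0);

    // With only a zone to go on, java.time resolves an overlap to the earlier offset.
    static ZonedDateTime resolvedByZoneOnly() {
        return ZonedDateTime.of(AMBIGUOUS, BERLIN);
    }

    public static void main(String[] args) {
        // Both offsets are valid for this local date-time.
        System.out.println(BERLIN.getRules().getValidOffsets(AMBIGUOUS));

        ZonedDateTime earlier = resolvedByZoneOnly();
        ZonedDateTime later = earlier.withLaterOffsetAtOverlap();
        System.out.println(earlier + " -> " + earlier.toInstant().toEpochMilli());
        System.out.println(later + " -> " + later.toInstant().toEpochMilli());
    }
}
```

The two instants are one hour apart, which matches the -3600000 diff in the question.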

The general workaround
JodaStephen, the main author of java.time, in his answer shows a workaround for the case of CET and CEST (Central European Time and Central European Summer Time). I present a workaround that I believe will work in all time zones having different abbreviations for standard time and summer time (DST).
public static ZonedDateTime parse(String text) {
ZonedDateTime result = ZonedDateTime.parse(text, FORMATTER);
if (result.format(FORMATTER).equals(text)) {
return result;
}
// By default we get the earlier offset at overlap,
// so if it didn’t work, try the later offset
result = result.withLaterOffsetAtOverlap();
if (result.format(FORMATTER).equals(text)) {
return result;
}
// As a last desperate attempt, try earlier offset explicitly
result = result.withEarlierOffsetAtOverlap();
if (result.format(FORMATTER).equals(text)) {
return result;
}
// Give up
throw new IllegalArgumentException();
}
The method could use any formatter with a time zone name or abbreviation as long as it’s supposed to give the same output from formatting as the input it parses (so optional parts are a no-no, for example). I have assumed a formatter equivalent to yours:
private static final DateTimeFormatter FORMATTER
= DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss zzz yyyy", Locale.ROOT);
Your trouble was with a millisecond value of 1 477 789 200 000, which was formatted into Sun Oct 30 02:00:00 CET 2016 and then parsed to 1 477 785 600 000 for a difference of -3 600 000 milliseconds. So let’s try my method with that one.
private static final ZoneId TIME_ZONE = ZoneId.of("Europe/Berlin");
long trouble = 1_477_789_200_000L;
String formatted = Instant.ofEpochMilli(trouble).atZone(TIME_ZONE).format(FORMATTER);
ZonedDateTime zdt = parse(formatted);
long parsedTime = zdt.toInstant().toEpochMilli();
System.out.println(formatted + ", diff: " + (parsedTime - trouble));
Output is:
Sun Oct 30 02:00:00 CET 2016, diff: 0
But don’t parse three letter time zone abbreviations
All of the above said, even with a workaround for the fall overlap, you are on shaky ground when trying to parse time zone abbreviations. Many of the most common ones are ambiguous, and you don’t know what you get from parsing. CET and CEST are common abbreviations for very many European time zones that at present share offset +01:00 during standard time and +02:00 during summer time, but historically each had its own offset, and they may go separate ways again: the European Parliament voted in 2019 to abolish seasonal clock changes, although the change has not been implemented. If it is, one time zone may use CET all year and another CEST all year. My code above does not account for that.
Instead simply take the output from ZonedDateTime.toString and parse it back using the one-arg ZonedDateTime.parse(CharSequence).
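A minimal sketch of that round trip, assuming a reasonably recent JDK (9 or later is what this relies on; earlier behaviour at the overlap has not been checked here). The class name IsoRoundTrip is hypothetical. The ISO-like text carries both the offset and the zone ID, so the two 02:00:00 values stay distinct.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class IsoRoundTrip {
    // Serializes with ZonedDateTime.toString and reads it back with the one-arg parse.
    static long roundTrip(long epochMilli, ZoneId zone) {
        String text = Instant.ofEpochMilli(epochMilli).atZone(zone).toString();
        return ZonedDateTime.parse(text).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneId berlin = ZoneId.of("Europe/Berlin");
        long[] bothTwoOClocks = {1_477_785_600_000L, 1_477_789_200_000L};
        for (long time : bothTwoOClocks) {
            String text = Instant.ofEpochMilli(time).atZone(berlin).toString();
            System.out.println(text + ", diff: " + (roundTrip(time, berlin) - time));
        }
    }
}
```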

Related

Future dates in java calendar are giving a strange behaviour

I have an application in which I create dates that a user can select for an appointment. If a user starts work at 9 and an appointment takes 2 hours, I create dates at 9, 11, 13... up to a limit, of course. Then I move to the next day and start again.
This is the code for doing this:
public List<Agenda> createListOfDates(Calendar initial, Calendar end,
int appointmentDuration, int lunchTimeDuration, int lunchTimeStart) {
List<Agenda> agendaList = new ArrayList<Agenda>();
Agenda agenda = new Agenda();
agenda.setWorkingHour(initial.getTime());
agendaList.add(agenda);
while (true) {
initial.add(Calendar.HOUR_OF_DAY, appointmentDuration);
// Logger.error("" + initial.getTime());
if (initial.getTime().after(end.getTime())) {
break;
} else if (initial.get(Calendar.HOUR_OF_DAY) == lunchTimeStart
&& initial.get(Calendar.DAY_OF_WEEK) != Calendar.SATURDAY
) {
initial.add(Calendar.HOUR_OF_DAY, lunchTimeDuration);
agenda = new Agenda();
agenda.setWorkingHour(initial.getTime());
agendaList.add(agenda);
} else {
agenda = new Agenda();
agenda.setWorkingHour(initial.getTime());
agendaList.add(agenda);
}
}
for(Agenda agendaX : agendaList){
Logger.info("" + agendaX.getWorkingHour());
}
return agendaList;
}
I am working with the "America/Sao_Paulo" timezone to create these dates. I set the variables "initial" and "end" as "America/Sao_Paulo". My system timezone is "GMT", and that is OK, because I want to save these dates in GMT in the database. When I print the dates in the last "for" loop, they are already converted from "America/Sao_Paulo" to "GMT" and print correctly. The strange thing is that from a certain date on, the time zone changes. Example output:
Sat Mar 30 12:00:00 GMT 2019
Sat Mar 30 14:00:00 GMT 2019
Sat Mar 30 16:00:00 GMT 2019
Sat Mar 30 18:00:00 GMT 2019
Mon Apr 01 13:00:00 BST 2019
Mon Apr 01 15:00:00 BST 2019
Mon Apr 01 18:00:00 BST 2019
Mon Apr 01 20:00:00 BST 2019
Mon Apr 01 22:00:00 BST 2019
While it is in GMT, it is right, but I can't understand this BST. Can it be because it's too far in the future? It always starts in April.
Your system time zone isn’t GMT, it’s Europe/London (or something similar). In March London time coincides with GMT. Not in April. That’s why.
getWorkingHour() returns an instance of Date (another poorly designed and long outdated class, but let that be a different story for now). When you append it to the empty string, Date.toString is implicitly called and builds the string using your system time zone. During standard time it prints GMT as time zone abbreviation. Summer time (DST) begins in London on the last Sunday of March, in this case March 31. So in April Date.toString on your JVM uses British Summer Time and its abbreviation, BST for printing the time.
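The switch can be checked directly from the zone rules. This is an illustrative sketch (the class name LondonOffsets and the sample instants are chosen here, not taken from the question's code): it asks Europe/London for its offset at noon UTC on the two dates that appear in the printout.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

class LondonOffsets {
    // Offset from UTC that Europe/London uses at the given moment.
    static ZoneOffset londonOffsetAt(Instant instant) {
        return ZoneId.of("Europe/London").getRules().getOffset(instant);
    }

    public static void main(String[] args) {
        Instant beforeSwitch = Instant.parse("2019-03-30T12:00:00Z");
        Instant afterSwitch = Instant.parse("2019-04-01T12:00:00Z");
        System.out.println(beforeSwitch + " -> offset " + londonOffsetAt(beforeSwitch));
        System.out.println(afterSwitch + " -> offset " + londonOffsetAt(afterSwitch));
    }
}
```

The first offset is zero (so the old classes print GMT) and the second is +01:00 (printed as BST), which is why 12:00 on March 30 and 13:00 on April 1 in the question are the same São Paulo wall-clock time.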
The good solution involves two changes:
Don’t rely on the JVM’s default time zone. It can be changed at any time from another part of your program or another program running in the same JVM, so is too fragile. Instead give explicit time zone to your date-time operations.
Skip the old date-time classes Calendar and Date and instead use java.time, the modern Java date and time API. It is so much nicer to work with and gives much clearer code, not least when it comes to conversions between time zones.
Instead of Calendar use ZonedDateTime. Depending on the capabilities of your JDBC driver, convert it to either Instant or OffsetDateTime in UTC for saving to the database.
To create a ZonedDateTime, one option is to use one of its of methods (there are several):
ZonedDateTime initial = ZonedDateTime.of(2019, 3, 10, 9, 0, 0, 0, ZoneId.of("America/Sao_Paulo"));
This creates a date-time of March 10, 2019 at 09:00 in São Paulo. To add 2 hours to it:
int appointmentDuration = 2;
ZonedDateTime current = initial.plusHours(appointmentDuration);
System.out.println(current);
Output:
2019-03-10T11:00-03:00[America/Sao_Paulo]
To convert to an Instant for your database:
Instant inst = current.toInstant();
System.out.println(inst);
Output:
2019-03-10T14:00:00Z
Instants are time zone neutral, just a point in time, but print in UTC. Some JDBC drivers accept them for UTC times. If yours doesn’t happen to, you will need to give it an OffsetDateTime instead. Convert like this:
OffsetDateTime odt = current.toOffsetDateTime().withOffsetSameInstant(ZoneOffset.UTC);
System.out.println(odt);
Output:
2019-03-10T14:00Z
Note that I give UTC explicitly rather than relying on the JVM default. So this is explicitly in UTC. You notice that the date and time agree with what was printed from the Instant.
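Putting the pieces together, the slot generation from the question might look roughly like this with java.time. It is a simplified sketch: the lunch break and the Saturday rule are left out, and AppointmentSlots and slotStarts are hypothetical names, not part of the original code.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.ArrayList;
import java.util.List;

class AppointmentSlots {
    // Start times from 'first' up to and including 'last', as zone-neutral instants.
    static List<Instant> slotStarts(ZonedDateTime first, ZonedDateTime last, Duration length) {
        List<Instant> slots = new ArrayList<>();
        for (ZonedDateTime t = first; !t.isAfter(last); t = t.plus(length)) {
            slots.add(t.toInstant());
        }
        return slots;
    }

    public static void main(String[] args) {
        ZoneId saoPaulo = ZoneId.of("America/Sao_Paulo"); // explicit zone, no JVM default
        ZonedDateTime first = ZonedDateTime.of(2019, 3, 10, 9, 0, 0, 0, saoPaulo);
        ZonedDateTime last = first.withHour(17);
        for (Instant slot : slotStarts(first, last, Duration.ofHours(2))) {
            System.out.println(slot); // Instant.toString always prints in UTC
        }
    }
}
```

Since the list holds Instant values, printing or saving them never involves the JVM's default time zone, so no BST can sneak in.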

DateFormat returning wrong hour

I am trying to parse the string "20/08/18 13:21:00:428" using the DateFormat class and a formatting pattern of "dd/MM/yy' 'HH:mm:ss:SSS". The Timezone is set to BST.
The date returned for the above is correct but the time is getting returned as 08 for the hours instead of 13 - "Mon Aug 20 08:21:00 BST 2018"
The following snippet prints the date and time just mentioned:
String toBeParsed = "20/08/18 13:21:00:428";
DateFormat format = new SimpleDateFormat("dd/MM/yy' 'HH:mm:ss:SSS");
format.setTimeZone(TimeZone.getTimeZone("BST"));
Date parsedDate = format.parse(toBeParsed);
System.out.println(parsedDate);
Is this something to do with my timezone or have I misunderstood the pattern?
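BST is Bangladesh Standard Time. The correct time zone to use is "Europe/London" if you want automatic summer time, or a fixed offset if you want British Summer Time always: "GMT+1" for TimeZone.getTimeZone (which falls back to GMT for IDs it does not recognize, such as "UTC+1"), or "UTC+1" for ZoneId.of.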
See https://docs.oracle.com/javase/8/docs/api/java/time/ZoneId.html#SHORT_IDS
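The mapping can be looked up in code. A small sketch (BstLookup and resolve are hypothetical names) that consults ZoneId.SHORT_IDS and, for comparison, prints what the legacy TimeZone class makes of "BST" and "UTC+1"; the legacy results are printed rather than asserted because they are observations, not documented guarantees relied on here.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.util.TimeZone;

class BstLookup {
    // Resolves a legacy short ID through the alias map that java.time documents.
    static ZoneId resolve(String shortId) {
        return ZoneId.of(shortId, ZoneId.SHORT_IDS);
    }

    public static void main(String[] args) {
        System.out.println("SHORT_IDS maps BST to " + ZoneId.SHORT_IDS.get("BST"));

        Instant august = Instant.parse("2018-08-20T12:00:00Z");
        System.out.println("Offset in August 2018: " + resolve("BST").getRules().getOffset(august));

        // Legacy API, for comparison with the question's code.
        System.out.println("TimeZone BST raw offset (hours): "
                + TimeZone.getTimeZone("BST").getRawOffset() / 3_600_000);
        System.out.println("TimeZone for \"UTC+1\": " + TimeZone.getTimeZone("UTC+1").getID());
    }
}
```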
java.time
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("dd/MM/uu H:mm:ss:SSS");
String toBeParsed = "20/08/18 13:21:00:428";
ZonedDateTime dateTime = LocalDateTime.parse(toBeParsed, formatter)
.atZone(ZoneId.of("Europe/London"));
System.out.println(dateTime);
Output from this snippet is:
2018-08-20T13:21:00.428+01:00[Europe/London]
What went wrong in your code?
While I always recommend against the long outdated and poorly designed classes Date, TimeZone and DateFormat, in this case they are behaving particularly confusingly. Printing a Date on a JVM with Europe/London as default time zone gives time zone as BST if the date is in the summer time part of the year:
TimeZone.setDefault(TimeZone.getTimeZone("Europe/London"));
Date oldFashionedDate = new Date();
System.out.println(oldFashionedDate);
Mon Aug 20 15:45:39 BST 2018
However, when I give time zone as BST, Bangladesh time is understood, but it comes out with the non-standard abbreviation BDT:
TimeZone.setDefault(TimeZone.getTimeZone("BST"));
System.out.println(oldFashionedDate);
Mon Aug 20 20:45:39 BDT 2018
(I have observed this behaviour on Java 8 and Java 10.)
Another lesson to learn is never to rely on three and four letter time zone abbreviations. They are ambiguous and not standardized.
BST may mean Brazil Summer Time or Brazilian Summer Time, Bangladesh Standard Time, Bougainville Standard Time or British Summer Time (note that S is sometimes for Standard, sometimes for Summer, which is typically the opposite of Standard Time).
BDT may mean Brunei Darussalam Time or British Daylight Time (another name for British Summer Time (BST)), but I wasn’t aware that Bangladesh Time was also sometimes abbreviated this way.
PS Thanks to DodgyCodeException for spotting the time zone abbreviation interpretation issue.
Link
Time Zone Abbreviations — Worldwide List

Java.util.Date not converting properly to Java.time.LocalDate

I'm trying to convert a java.util.Date to a java.time.LocalDate and my code looks like this:
Date memberBirthdayDate = club.getMembers().get(i).getDob();
System.out.println(memberBirthdayDate);
LocalDate memberBirthday = memberBirthdayDate.toInstant().atZone(ZoneId.systemDefault()).toLocalDate();
When I print out the date before and after conversion it looks like this:
Before: Wed May 21 00:00:00 GMT 94
After: 0094-05-18
It looks like it's converting backwards but I can't work out how to do it!
You can do this:
LocalDateTime memberBirthday = LocalDateTime.ofInstant(memberBirthdayDate.toInstant(),
ZoneId.systemDefault());
Date out = Date.from(memberBirthday.atZone(ZoneId.systemDefault()).toInstant());
System.out.println(out);
There seems to be a bug in your years. You gave the answer yourself in the comment:
I'd imported from a csv through Excel and it had automatically
formatted the date to remove the '19'
The conversion you made works nicely for a date in 1994:
System.out.println("Before: " + memberBirthdayDate);
LocalDate memberBirthday = memberBirthdayDate.toInstant()
.atZone(ZoneId.systemDefault())
.toLocalDate();
System.out.println("After: " + memberBirthday);
Output:
Before: Sat May 21 00:00:00 IST 1994
After: 1994-05-21
I used Europe/Dublin time to reproduce your exact result, so IST is for Irish Summer Time. Your Date seems to denote the start of day at some GMT offset, so you need to use a time zone that agrees with this offset. I expect the conversion to work as expected at least for all dates after year 1900, and likely earlier too.
However, when the year gets truncated from 1994 to 94, funny things start to happen. Dates that far back are not so well defined. LocalDate uses the proleptic Gregorian calendar, which is practical and well-defined, but doesn’t agree with dates used in real life before the introduction of the Gregorian calendar from 1582 and on. I’m not sure what Date uses. For dates back in year 94 AD we shouldn’t be surprised about May 21 coming through as May 18.
Before: Wed May 21 00:00:00 GMT 94
After: 0094-05-18
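As far as can be told, the legacy classes count such early dates in the Julian calendar (GregorianCalendar is a hybrid that switches in October 1582 by default), which would account for two of the three days; the remaining day plausibly comes from the local-mean-time offset that java.time applies to zones like Europe/London that far back. The sketch below illustrates only the calendar part, working in UTC to keep zone offsets out of the picture. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class Year94Demo {
    // Legacy Date for 21 May of year 94 at 00:00 UTC, as the old API counts it.
    static Date legacyMay21Year94() {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();                     // drop the current time the constructor filled in
        cal.set(94, Calendar.MAY, 21);   // before the 1582 cutover, so Julian rules apply
        return cal.getTime();
    }

    // Same conversion as in the question, but with an explicit UTC zone.
    static LocalDate toLocalDateUtc(Date date) {
        return date.toInstant().atZone(ZoneOffset.UTC).toLocalDate();
    }

    public static void main(String[] args) {
        System.out.println("Legacy May 21, year 94 as LocalDate: "
                + toLocalDateUtc(legacyMay21Year94()));
    }
}
```

In UTC the result lands two days earlier in the proleptic Gregorian calendar, which is the known gap between the two calendars in the first century.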
Link: Wikipedia article Gregorian calendar

Parse Date with time zone from GWT

I have read many similar SO questions on this very topic, but I am still confused about how to do it... I'll try to keep it simple:
I got this String: "Sat Jan 24 00:00:00 GMT+100 2015"
which shall not be modified in any way
My question now is: What kind of pattern shall I use to parse this String into a java.util.Date? I tried: "EEE MMM dd HH:mm:ss z yyyy" but it fails with "unparsable Date"
What I know is: if I had "Sat Jan 24 00:00:00 GMT+1:00 2015", which is the same (right?), then my pattern works. But I can't (and don't want to) modify it.
--> Is there a pattern which works out of the box, yes or no?
PS: I assume this question ends as duplicate of one of all the others, but if you vote so, please answer my bold question in addition, as I could not read it out of there with certainty
regards and thanks in advance
"Hours must be between 0 to 23 and Minutes must be between 00 to 59. For example, "GMT+10" and "GMT+0010" mean ten hours and ten minutes ahead of GMT, respectively."
That is from the Java time zone specification, http://docs.oracle.com/javase/7/docs/api/java/util/TimeZone.html, and this is what I found there about three-letter time zone IDs:
For compatibility with JDK 1.1.x, some other three-letter time zone IDs (such as "PST", "CTT", "AST") are also supported. However, their use is deprecated because the same abbreviation is often used for multiple time zones (for example, "CST" could be U.S. "Central Standard Time" and "China Standard Time"), and the Java platform can then only recognize one of them.
So I do not think there is any out-of-the-box pattern that works for you. Check the other answer.
I don't think there is a pattern that can match your timezone / offset GMT+100. You could amend the input before parsing:
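String input = "Sat Jan 24 00:00:00 GMT+100 2015";
// Insert a colon before the last two digits of the offset: GMT+100 -> GMT+1:00.
// Greedy \d{1,2} (instead of lazy \d+?) also handles four-digit offsets such as GMT+1000.
input = input.replaceAll("([+-])(\\d{1,2})(\\d{2})\\b", "$1$2:$3");
String pattern = "EEE MMM dd HH:mm:ss z yyyy";
Date date = new SimpleDateFormat(pattern, Locale.ENGLISH).parse(input);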
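A self-contained version of the normalize-then-parse idea, as a sketch. GmtOffsetParser, normalize and parse are hypothetical names, and the regex used here is a variant that also copes with four-digit offsets, which goes beyond the single input in the question.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class GmtOffsetParser {
    // Turns an offset like +100 or -1130 into +1:00 or -11:30 so that 'z' can parse it.
    static String normalize(String input) {
        return input.replaceAll("([+-])(\\d{1,2})(\\d{2})\\b", "$1$2:$3");
    }

    static Date parse(String input) throws ParseException {
        SimpleDateFormat format =
                new SimpleDateFormat("EEE MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
        return format.parse(normalize(input));
    }

    public static void main(String[] args) throws ParseException {
        String input = "Sat Jan 24 00:00:00 GMT+100 2015";
        System.out.println(normalize(input));
        // Print as an Instant to avoid depending on the JVM's default time zone.
        System.out.println(parse(input).toInstant());
    }
}
```

The original string itself is left untouched; only a normalized copy is handed to the parser.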
