I have read many similar SO questions on this very topic, but I am still confused about how to do it... I will try to keep it simple:
I got this String: "Sat Jan 24 00:00:00 GMT+100 2015"
which shall not be modified in any way
My question now is: What kind of pattern shall I use to parse this String into a java.util.Date? I tried: "EEE MMM dd HH:mm:ss z yyyy" but it fails with "unparsable Date"
What I know is: if I had "Sat Jan 24 00:00:00 GMT+1:00 2015", which is the same (right?), then my pattern would work. But I can't (and don't want to) modify it.
--> Is there a pattern which works out of the box, yes or no?
PS: I assume this question will end up as a duplicate of one of the others, but if you vote that way, please also answer my bold question, as I could not work out the answer from those with certainty.
regards and thanks in advance
"Hours must be between 0 to 23 and Minutes must be between 00 to 59. For example, "GMT+10" and "GMT+0010" mean ten hours and ten minutes ahead of GMT, respectively."
as per the Java time zone specification: http://docs.oracle.com/javase/7/docs/api/java/util/TimeZone.html. This is what I found there about three-letter time zones:
Three-letter time zone IDs
For compatibility with JDK 1.1.x, some other three-letter time zone IDs (such as "PST", "CTT", "AST") are also supported. However, their use is deprecated because the same abbreviation is often used for multiple time zones (for example, "CST" could be U.S. "Central Standard Time" and "China Standard Time"), and the Java platform can then only recognize one of them.
So I do not think there is any out-of-the-box pattern that works for you. Check the other answer.
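The two custom ID forms quoted from the TimeZone documentation can be checked directly. This is only a minimal sketch; the class and method names (OffsetIdCheck, offsetMinutes) are made up for illustration:

```java
import java.util.TimeZone;

class OffsetIdCheck {
    // Returns the raw UTC offset, in minutes, of a custom ID such as "GMT+10".
    static int offsetMinutes(String customId) {
        return TimeZone.getTimeZone(customId).getRawOffset() / 60_000;
    }

    public static void main(String[] args) {
        // Hours only: ten hours ahead of GMT
        System.out.println(offsetMinutes("GMT+10"));
        // Hours and minutes without a colon: ten minutes ahead of GMT
        System.out.println(offsetMinutes("GMT+0010"));
    }
}
```

Note that these are TimeZone IDs. For the pattern letter z, the SimpleDateFormat documentation describes GMT offsets only in the colon form (GMT Sign Hours : Minutes), which fits the observation in the question that "GMT+1:00" parses.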
I don't think there is a pattern that can match your timezone / offset GMT+100. You could amend the input before parsing:
String input = "Sat Jan 24 00:00:00 GMT+100 2015";
input = input.replaceAll("([+-])(\\d+?)(\\d{2})", "$1$2:$3");
String pattern = "EEE MMM dd HH:mm:ss z yyyy";
Date date = new SimpleDateFormat(pattern, Locale.ENGLISH).parse(input);
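For reference, here is a self-contained sketch of the same idea. It is not the exact code above: the regular expression is a slightly stricter variant of my own, anchored on "GMT" so that four-digit offsets such as GMT+1000 also get the colon in the right place, and the names GmtOffsetFix, normalize and parse are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class GmtOffsetFix {
    // Inserts a colon before the last two digits of a GMT offset,
    // e.g. "GMT+100" becomes "GMT+1:00" and "GMT+1000" becomes "GMT+10:00".
    // Offsets with only one or two digits (hours only) are left untouched.
    static String normalize(String input) {
        return input.replaceAll("GMT([+-])(\\d{1,2})(\\d{2})\\b", "GMT$1$2:$3");
    }

    // Normalizes the offset, then parses with the pattern from the question.
    static Date parse(String input) {
        SimpleDateFormat format =
                new SimpleDateFormat("EEE MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
        try {
            return format.parse(normalize(input));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + input, e);
        }
    }

    public static void main(String[] args) {
        Date date = parse("Sat Jan 24 00:00:00 GMT+100 2015");
        System.out.println(date.getTime());
    }
}
```

The original string itself is not modified; only the copy handed to the parser is.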
For improving performance of some legacy code, I am considering a replacement of java.text.SimpleDateFormat by java.time.format.DateTimeFormatter.
Among the tasks performed is parsing date/time values that had been serialized using java.util.Date.toString. With SimpleDateFormat, it was possible to turn them back into the original timestamps (neglecting fractional seconds). However, I am facing problems when attempting to do the same with DateTimeFormatter.
When formatting with either, my local timezone is indicated as CET or CEST, depending on whether daylight saving time is in effect for the time to be formatted. However, it appears that at parsing time, both CET and CEST are treated the same by DateTimeFormatter.
This creates a problem with the overlap occurring at the end of daylight saving time. When formatting, 02:00:00 is created twice, for times one hour apart, but with CEST and CET timezone names - which is fine. But at parsing time, that difference can't be recovered.
Here is an example:
long msecPerHour = 3600000L;
long cet_dst_2016 = 1477778400000L;
DateTimeFormatter formatter =
    DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss zzz yyyy", Locale.ENGLISH);
ZoneId timezone = ZoneId.of("Europe/Berlin");
for (int hours = 0; hours < 6; ++hours) {
    long time = cet_dst_2016 + msecPerHour * hours;
    String formatted = formatter.format(Instant.ofEpochMilli(time).atZone(timezone));
    long parsedTime = Instant.from(formatter.parse(formatted)).toEpochMilli();
    System.out.println(formatted + ", diff: " + (parsedTime - time));
}
which results in
Sun Oct 30 00:00:00 CEST 2016, diff: 0
Sun Oct 30 01:00:00 CEST 2016, diff: 0
Sun Oct 30 02:00:00 CEST 2016, diff: 0
Sun Oct 30 02:00:00 CET 2016, diff: -3600000
Sun Oct 30 03:00:00 CET 2016, diff: 0
Sun Oct 30 04:00:00 CET 2016, diff: 0
It shows that the second occurrence of 02:00:00, in spite of the different timezone name, was treated like the first one. So the result is effectively off by one hour.
Obviously the formatted string has all information available, and SimpleDateFormat parsing in fact honored it. Is it possible to roundtrip through formatting and parsing, using DateTimeFormatter, with the given pattern?
It is possible for a specific case:
// OFFSET_SECONDS is statically imported from java.time.temporal.ChronoField;
// ImmutableMap comes from Guava.
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
    .appendPattern("EEE MMM dd HH:mm:ss ")
    .appendText(OFFSET_SECONDS, ImmutableMap.of(2L * 60 * 60, "CEST", 1L * 60 * 60, "CET"))
    .appendPattern(" yyyy")
    .toFormatter(Locale.ENGLISH);
This maps the exact offset to the expected text. Where this fails is when you need to deal with more than one time-zone.
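As a sketch of how this could be used without Guava, here is a self-contained variant that builds the lookup with a plain HashMap and parses both occurrences of 02:00:00 from the question. The class and method names (CetCestFormatter, toEpochMilli) are made up for illustration.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class CetCestFormatter {
    static final DateTimeFormatter FORMATTER = build();

    private static DateTimeFormatter build() {
        // Map each UTC offset (in seconds) to the abbreviation expected in the text.
        Map<Long, String> names = new HashMap<>();
        names.put(2L * 60 * 60, "CEST");
        names.put(1L * 60 * 60, "CET");
        return new DateTimeFormatterBuilder()
                .appendPattern("EEE MMM dd HH:mm:ss ")
                .appendText(ChronoField.OFFSET_SECONDS, names)
                .appendPattern(" yyyy")
                .toFormatter(Locale.ENGLISH);
    }

    // Parses the text; the parsed offset (not a zone) determines the instant.
    static long toEpochMilli(String text) {
        return Instant.from(FORMATTER.parse(text)).toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(toEpochMilli("Sun Oct 30 02:00:00 CEST 2016"));
        System.out.println(toEpochMilli("Sun Oct 30 02:00:00 CET 2016"));
    }
}
```

Because the abbreviation is parsed as an offset rather than a zone, the two strings resolve to instants one hour apart.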
To do the job properly requires a JDK change.
It seems like a bug. I tested in Java 17 and it's still the same behaviour. I dug into the parsing logic and I can see why this happens.
One of the first things that happens is that TimeZoneNameUtility.getZoneStrings(locale) is called. This gives you a 2D array of Strings:
[
    [
        "Europe/Paris",
        "Central European Standard Time", "CET",
        "Central European Summer Time", "CEST",
        "Central European Time", "CET"
    ],
    // others
]
It builds a prefix tree out of them. All items in here get mapped to the 0th item - "Europe/Paris". When it's parsing, it descends the prefix tree one character at a time e.g. C... E... T..., then returns a match if there was one. Since CEST and CET map to the same thing, they're effectively just aliases of one another.
Later on that string is passed to ZoneId.of() which means the fact of whether it's summertime or not has been thrown away.
It does seem in Java 18 that there have been significant changes in this code, so maybe they're addressing that.
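To illustrate the effect described above, here is a deliberately simplified sketch. It is not the JDK's parsing code: the map NAME_TO_ZONE_ID and the class AbbreviationAliasDemo are hypothetical stand-ins for the prefix tree, showing only that once both abbreviations lead to the same zone ID, the local time alone decides the offset.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.HashMap;
import java.util.Map;

class AbbreviationAliasDemo {
    // Hypothetical stand-in for the name lookup: both names point at one zone ID.
    static final Map<String, String> NAME_TO_ZONE_ID = new HashMap<>();
    static {
        NAME_TO_ZONE_ID.put("CET", "Europe/Paris");
        NAME_TO_ZONE_ID.put("CEST", "Europe/Paris");
    }

    static ZonedDateTime resolve(LocalDateTime local, String abbreviation) {
        ZoneId zone = ZoneId.of(NAME_TO_ZONE_ID.get(abbreviation));
        // In an overlap, atZone picks the earlier offset (summer time here),
        // regardless of which abbreviation was in the text.
        return local.atZone(zone);
    }

    public static void main(String[] args) {
        LocalDateTime overlap = LocalDateTime.of(2016, 10, 30, 2, 0);
        System.out.println(resolve(overlap, "CEST"));
        System.out.println(resolve(overlap, "CET"));
    }
}
```

Both calls yield the same ZonedDateTime, which matches the one-hour error seen in the question for the CET case.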
The general workaround
JodaStephen, the main author of java.time, in his answer shows a workaround for the case of CET and CEST (Central European Time and Central European Summer Time). I present a workaround that I believe will work in all time zones having different abbreviations for standard time and summer time (DST).
public static ZonedDateTime parse(String text) {
    ZonedDateTime result = ZonedDateTime.parse(text, FORMATTER);
    if (result.format(FORMATTER).equals(text)) {
        return result;
    }
    // By default we get the earlier offset at overlap,
    // so if it didn’t work, try the later offset
    result = result.withLaterOffsetAtOverlap();
    if (result.format(FORMATTER).equals(text)) {
        return result;
    }
    // As a last desperate attempt, try earlier offset explicitly
    result = result.withEarlierOffsetAtOverlap();
    if (result.format(FORMATTER).equals(text)) {
        return result;
    }
    // Give up
    throw new IllegalArgumentException();
}
The method could use any formatter with a time zone name or abbreviation as long as it’s supposed to give the same output from formatting as the input it parses (so optional parts are a no-no, for example). I have assumed a formatter equivalent to yours:
private static final DateTimeFormatter FORMATTER
        = DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss zzz yyyy", Locale.ROOT);
Your trouble was with a millisecond value of 1 477 789 200 000, which was formatted into Sun Oct 30 02:00:00 CET 2016 and then parsed to 1 477 785 600 000 for a difference of -3 600 000 milliseconds. So let’s try my method with that one.
private static final ZoneId TIME_ZONE = ZoneId.of("Europe/Berlin");
long trouble = 1_477_789_200_000L;
String formatted = Instant.ofEpochMilli(trouble).atZone(TIME_ZONE).format(FORMATTER);
ZonedDateTime zdt = parse(formatted);
long parsedTime = zdt.toInstant().toEpochMilli();
System.out.println(formatted + ", diff: " + (parsedTime - trouble));
Output is:
Sun Oct 30 02:00:00 CET 2016, diff: 0
But don’t parse three letter time zone abbreviations
All of the above said, even with a workaround for the fall overlap, you are on shaky ground when trying to parse time zone abbreviations. Many of the most common ones are ambiguous, and you don’t know what you get from parsing. CET and CEST are common abbreviations for very many European time zones that at present share offset +01:00 during standard time and +02:00 during summer time, but that historically each had their own offset and may go separate ways again if the EU goes through with its plan to give up summer time. In that case one time zone may use CET all year and another CEST all year. My code above does not account for that.
Instead simply take the output from ZonedDateTime.toString and parse it back using the one-arg ZonedDateTime.parse(CharSequence).
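A minimal sketch of that round trip, assuming a reasonably recent JDK; the class and method names (IsoRoundTrip, roundTrip) are only for illustration. The ISO-like string carries the offset as well as the zone ID, so the two instants in the overlap stay distinct.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class IsoRoundTrip {
    // Formats the instant with ZonedDateTime.toString and parses it back
    // with the one-arg ZonedDateTime.parse.
    static long roundTrip(long epochMilli, ZoneId zone) {
        String text = Instant.ofEpochMilli(epochMilli).atZone(zone).toString();
        return ZonedDateTime.parse(text).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Europe/Berlin");
        long firstTwoOClock = 1_477_785_600_000L;  // 02:00:00 CEST
        long secondTwoOClock = 1_477_789_200_000L; // 02:00:00 CET
        System.out.println(roundTrip(firstTwoOClock, zone) - firstTwoOClock);
        System.out.println(roundTrip(secondTwoOClock, zone) - secondTwoOClock);
    }
}
```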
Is it possible to convert this date string using the java.time package
3-6-2017
to this format
"Mon Mar 6 00:00:00 EST 2017"
I created these two formatters, but which time instance should I use? I've tried LocalDate, LocalDateTime, and ZonedDateTime.
DateTimeFormatter inputFormat = DateTimeFormatter.ofPattern("M-d-uuuu");
DateTimeFormatter convertedToFormat = DateTimeFormatter.ofPattern("EEE MMM dd hh:mm:ss zzz yyyy");
I believe that you have three issues:
To accept month in either 1 or 2 digits (like 3 for March and 11 for November) you need to specify one pattern letter M, not two. Similarly for day of month. So your input format pattern string should be M-d-uuuu (or just M-d-u). Edit: You also need d instead of dd in the “converted to” pattern.
To print hour of day (from 00 through 23) you need uppercase HH. Lowercase hh is for clock hour within AM or PM from 01 through 12.
Since your input string did not contain time of day, you need to specify time of day some other way. Similar for time zone since your “converted to” format contains zzz for time zone abbreviation.
So in code I suggest:
DateTimeFormatter inputFormat = DateTimeFormatter.ofPattern("M-d-uuuu");
DateTimeFormatter convertedToFormat = DateTimeFormatter.ofPattern("EEE MMM d HH:mm:ss zzz yyyy");
String input = "3-6-2017";
ZonedDateTime startOfDay = LocalDate.parse(input, inputFormat)
        .atStartOfDay(ZoneId.of("America/New_York"));
String output = startOfDay.format(convertedToFormat);
System.out.println(output);
Output from my snippet is the desired:
Mon Mar 6 00:00:00 EST 2017
Or to answer your question a little more directly:
… which time instance should I use?
You need two of them: LocalDate for parsing your input and ZonedDateTime for formatting your output. And then a conversion between the two. The one-arg atStartOfDay method provides the conversion we need. (There is a trick for parsing directly into a ZonedDateTime using default values for time and time zone, but it’s more complicated.)
There are other time zones that will also produce EST as time zone abbreviation. Since your profile says you’re in Boston, I think that America/New_York is the one you want.
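For completeness, here is one way the trick of parsing directly into a ZonedDateTime could look. This is a sketch under my own assumptions rather than necessarily what the answer had in mind: the time of day is defaulted via parseDefaulting and the zone is supplied with withZone. The class name DirectZonedParse is hypothetical.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

class DirectZonedParse {
    // The input has no time of day and no zone, so both are supplied as defaults:
    // hour 0 (minutes and seconds then default to 0) and America/New_York.
    static final DateTimeFormatter INPUT_FORMAT = new DateTimeFormatterBuilder()
            .appendPattern("M-d-uuuu")
            .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
            .toFormatter(Locale.ROOT)
            .withZone(ZoneId.of("America/New_York"));

    static ZonedDateTime parse(String input) {
        return ZonedDateTime.parse(input, INPUT_FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(parse("3-6-2017"));
    }
}
```

One difference from atStartOfDay: defaulting the hour to 0 assumes that midnight exists on that date in that zone, whereas atStartOfDay is documented to return the first valid time of the day.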
I have the following piece of code that is throwing a DateTimeParseException:
String dateString = "Jul 20 09:32:46";
DateTimeFormatter formatter=DateTimeFormatter.ofPattern("L d HH:mm:ss");
return ZonedDateTime.parse(dateString, formatter);
According to the documentation, you will observe that Jul is the example for character L.
However, the exception message is:
java.time.format.DateTimeParseException: Text 'Jul' could not be parsed at index 0
What am I missing?
You have some issues here:
To correctly parse 'Jul' you have to use MMM instead of L (here explains why).
Your date string doesn't have a year. You can't create a ZonedDateTime without the year.
If it is a ZonedDateTime, it has to include the time zone information too, which is not in your date string. You can use a LocalDateTime if you don't want to work with time zones.
Here are some alternatives:
With timezone:
String dateString = "Jul 20 2018 09:32:46+0000";
DateTimeFormatter formatter= DateTimeFormatter.ofPattern("MMM dd y H:mm:ssZ");
return ZonedDateTime.parse(dateString, formatter);
Without timezone:
String dateString = "Jul 20 2018 09:32:46";
DateTimeFormatter formatter= DateTimeFormatter.ofPattern("MMM dd y H:mm:ss");
return LocalDateTime.parse(dateString, formatter);
The answer by Juan Carlos Mendoza is correct. I will give my suggestions as a supplement: either improve your string to include year and time zone, or build a formatter that can parse your current string without them.
Improving your string
String dateString = "Jul 20 2018 09:32:46 America/Argentina/La_Rioja";
DateTimeFormatter formatter
        = DateTimeFormatter.ofPattern("LLL d uuuu HH:mm:ss VV", Locale.ROOT);
System.out.println(ZonedDateTime.parse(dateString, formatter));
This prints
2018-07-20T09:32:46-03:00[America/Argentina/La_Rioja]
The same formatter will also parse Jul 20 2018 09:32:46 -08:30 into a ZonedDateTime of 2018-07-20T09:32:46-08:30.
First potential issue is the locale. If “Jul” is in English, give an English-speaking locale, or parsing will likely fail on computers with a language where the month of July is called something else. I recommend you always specify locale with your formatter. Even if you end up going for Locale.getDefault(). It will still tell the reader (and yourself) that you have made a conscious choice.
Next the documentation says that both M and L can give month as number/text and gives examples 7; 07; Jul; July; J. So this line is clearly relevant: “Number/Text: If the count of pattern letters is 3 or greater, use the Text rules above. Otherwise use the Number rules above.” Since “Jul” is text, you need 3 pattern letters or greater. “Less than 4 pattern letters will use the short form.” “Jul” is short, so we need exactly three letters.
The code above works with Java 9.0.4 no matter if I use MMM or LLL in the format pattern string. In jdk1.8.0_131 it works with MMM but funnily fails with LLL; this may be a bug (tested on a Mac). See Juan Carlos Mendoza’s answer for a treatment of the intended difference between M and L.
Build a formatter that works
String dateString = "Jul 20 09:32:46";
ZoneId zone = ZoneId.of("America/Argentina/La_Rioja");
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
        .appendPattern("LLL d HH:mm:ss")
        .parseDefaulting(ChronoField.YEAR, Year.now(zone).getValue())
        .toFormatter(Locale.ROOT)
        .withZone(zone);
System.out.println(ZonedDateTime.parse(dateString, formatter));
This will parse the string from your question into 2018-07-20T09:32:46-03:00[America/Argentina/La_Rioja]. Please substitute your desired default time zone if it didn’t happen to coincide with the one I picked at random. Also substitute your desired year if you don’t want the current year.
Again my Java 8 requires MMM rather than LLL.
From the Joda-Time DateTimeFormat javadoc:
Zone names: Time zone names ('z') cannot be parsed.
Therefore timezone parsing like:
System.out.println(new SimpleDateFormat("EEE MMM dd HH:mm:ss z yyyy").parse("Fri Nov 11 12:13:14 JST 2010"));
cannot be done in Joda:
DateTimeFormatter dtf = DateTimeFormat.forPattern("EEE MMM dd HH:mm:ss z yyyy");
System.out.println(dtf.parseDateTime("Fri Nov 11 12:13:14 JST 2010"));
//Exception in thread "main" java.lang.IllegalArgumentException: Invalid format: "Fri Nov 11 12:13:14 JST 2010" is malformed at "JST 2010"
//at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:673)
I think that the reason is that 'z' timezone names are conventional (not standardized) and ambiguous; i.e. they mean different things depending on your country of origin. For example, "PST" can be "Pacific Standard Time" or "Pakistan Standard Time".
If you are interested, this site has a listing of a large number of timezone names. It is not difficult to spot cases where there is ambiguity.
Probably because some time zone abbreviations are ambiguous and the parser can't know which time zone is meant.
It might of course also be one of the small, strange quirks and missing features you find after working with Joda for a while.
Abbreviated time zones are indeed ambiguous and Joda took a step further removing support for them as stated in the DateTimeZone javadoc: