I have a user input field and would like to parse the date the user enters, whatever format they use.
The user might provide the date with or without a leading zero, so I want to be able to parse an input like this
02.05.2019
and also this
2.5.2019
But as far as I can tell there is no way to make the leading zero optional: either you always have 2 digits, like 01, 03, 12 and so on, or you only have the necessary digits, like 1, 3, 12.
So apparently I have to decide whether to allow leading zeros or not, but is there seriously no way to make the leading zero optional?
Well, I tested a pattern that included a leading zero, dd.MM.uuuu, and a pattern that did not, d.M.uuuu. When I parsed the wrong input with the wrong pattern, exceptions were thrown.
Therefore my question is if there is a way to make the leading zero optional.
This is trivial when you know it. One pattern letter, for example d or M, will accept either one or two digits (or for year up to 9 digits).
DateTimeFormatter dateFormatter = DateTimeFormatter.ofPattern("d.M.u");
System.out.println(LocalDate.parse("02.05.2019", dateFormatter));
System.out.println(LocalDate.parse("3.5.2019", dateFormatter));
System.out.println(LocalDate.parse("4.05.2019", dateFormatter));
System.out.println(LocalDate.parse("06.5.2019", dateFormatter));
System.out.println(LocalDate.parse("15.12.2019", dateFormatter));
Output:
2019-05-02
2019-05-03
2019-05-04
2019-05-06
2019-12-15
I searched for this information in the documentation and didn’t find it readily. I don’t think it is well documented.
You can create a DateTimeFormatter with a custom format like this
DateTimeFormatter.ofPattern("d.M.yyyy")
Then you can parse dates if they provide 1 or 2 digits for the day and month.
String input = "02.5.2019";
LocalDate date = LocalDate.parse(input, DateTimeFormatter.ofPattern("d.M.yyyy"));
I've used LocalDate here from the new java.time package so I'm assuming that your java version is recent.
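Since the input comes straight from a user, it may also be worth catching the parse failure instead of letting it propagate. A minimal sketch, assuming the same d.M.uuuu pattern as above; the class name UserDateInput and the helper tryParse are made up for illustration:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

final class UserDateInput {
    // Single pattern letters (d, M) accept one or two digits when parsing.
    private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("d.M.uuuu");

    // Hypothetical helper: returns an empty Optional instead of throwing on bad input.
    static Optional<LocalDate> tryParse(String input) {
        try {
            return Optional.of(LocalDate.parse(input.trim(), FORMATTER));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("02.05.2019"));
        System.out.println(tryParse("2.5.2019"));
        System.out.println(tryParse("not a date"));
    }
}
```

Whether an empty Optional or an error message is the right reaction depends on your UI, so treat the return type as a placeholder.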
Your suggested date format should work - just as this test:
@Test
public void test() throws ParseException {
SimpleDateFormat f = new SimpleDateFormat("d.M.yyyy");
f.parse("7.8.2019");
f.parse("07.08.2019");
f.parse("007.008.002019");
}
By comparison, DateTimeFormatter will not accept extra leading zeros for the year, but extra leading zeros for day and month are not an issue:
@Test
public void test2() throws ParseException {
DateTimeFormatterBuilder builder = new DateTimeFormatterBuilder();
DateTimeFormatter f = builder.appendPattern("d.M.yyyy").toFormatter();
f.parse("7.8.2019");
f.parse("07.08.2019");
f.parse("007.008.2019");
}
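To see the difference on the year without a test framework, something like the following could be used. It is only a sketch; the class name LeadingZeroCheck and the helper parses are invented for the example:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

final class LeadingZeroCheck {
    private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("d.M.yyyy");

    // Returns true when the text parses to a LocalDate with the formatter above.
    static boolean parses(String text) {
        try {
            LocalDate.parse(text, FORMATTER);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("007.008.2019")); // extra zeros on day and month
        System.out.println(parses("7.8.002019"));   // extra zeros on the year
    }
}
```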
Related
I am trying to parse an incoming string which might contain time or not. Both the following dates should be accepted
"2022-03-03" and "2022-03-03 15:10:05".
The DateTimeFormatter patterns that I know of fail in one of the two cases. This is one answer I found, but I don't know whether the time part can be made optional there.
ISO_DATE_TIME.format() to LocalDateTime with optional offset
The idea is if the time part is not present I should set it to the end of the day, so the time part should be 23:59:59.
Any help is appreciated. Thanks!
Well, you could utilize a DateTimeFormatterBuilder to specify defaults for missing fields:
private static LocalDateTime parse(String str) {
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
.appendPattern("uuuu-MM-dd[ HH:mm:ss]")
.parseDefaulting(ChronoField.HOUR_OF_DAY, 23)
.parseDefaulting(ChronoField.MINUTE_OF_HOUR, 59)
.parseDefaulting(ChronoField.SECOND_OF_MINUTE, 59)
.toFormatter();
return LocalDateTime.parse(str, formatter);
}
The pattern passed to appendPattern describes the text the formatter will try to parse. Square brackets ([]) enclose an optional section: everything between them is either consumed completely or skipped entirely.
With parseDefaulting you can specify the default values for when fields are missing. In your case, if the user provides only the date, the hour-of-day, minute-of-hour and second-of-minute fields are missing, that's why it is needed to provide defaults for them.
Example
System.out.println(parse("2022-03-03"));
System.out.println(parse("2022-03-03 15:10:05"));
System.out.println(parse("2025"));
Outputs the following:
2022-03-03T23:59:59
2022-03-03T15:10:05
Exception in thread "main" java.time.format.DateTimeParseException: Text '2025' could not be parsed at index 4
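If you would rather see in your own code which of the two shapes was matched, DateTimeFormatter.parseBest is another option. This is a sketch of that alternative, not a replacement for the builder above; the class name EndOfDayParser is made up:

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

final class EndOfDayParser {
    private static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("uuuu-MM-dd[ HH:mm:ss]");

    static LocalDateTime parse(String str) {
        // parseBest tries the queries in order and returns the first type it can build.
        TemporalAccessor parsed = FORMATTER.parseBest(str, LocalDateTime::from, LocalDate::from);
        if (parsed instanceof LocalDateTime) {
            return (LocalDateTime) parsed;
        }
        // Only a date was present: fall back to the end of that day.
        return ((LocalDate) parsed).atTime(23, 59, 59);
    }
}
```

The upside is that the defaulting logic is plain Java, so a rule like "end of day" can be changed without touching the formatter.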
I need to read one csv file which has different time format in one timestamp column. It can be anything from below mentioned 5 formats. I need to match the fetched date and parse accordingly on each row.
Please suggest how to validate and parse it. Thanks in advance.
public static final String DEFAULT_DATE_FORMAT_PATTERN = "yyyy-MM-dd";
public static final String DEFAULT_DATE_TIME_FORMAT_PATTERN = "yyyy-MM-dd HH:mm:ss.SSS";
public static final String DATE_TIME_MINUTES_ONLY_FORMAT_PATTERN = "yyyy-MM-dd HH:mm";
public static final String DATE_TIME_WITHOUT_MILLIS_FORMAT_PATTERN = "yyyy-MM-dd HH:mm:ss";
Epoch in milli
What you need is a formatter with optional parts. A pattern can contain square brackets to denote an optional part, for example HH:mm[:ss]. The formatter then is required to parse HH:mm, and tries to parse the following text as :ss, or skips it if that fails. yyyy-MM-dd[ HH:mm[:ss[.SSS]]] would then be the pattern.
There is only one issue here – when you try to parse a string that matches only yyyy-MM-dd (so without time part) using LocalDateTime::parse, it will throw a DateTimeParseException with the message Unable to obtain LocalDateTime from TemporalAccessor. Apparently, at least one time part must be available to succeed.
Luckily, we can use a DateTimeFormatterBuilder to build a pattern, instructing the formatter to use some defaults if information is missing from the parsed text. Here it is:
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
.appendPattern("yyyy-MM-dd[ HH:mm[:ss[.SSS]]]")
.parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
.parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
.parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
.toFormatter();
LocalDateTime dateTime = LocalDateTime.parse(input, formatter);
Tests:
String[] inputs = {
"2020-10-22", // OK
"2020-10-22 14:55", // OK
"2020-10-22T14:55", // Fails: incorrect format
"2020-10-22 14:55:23", // OK
"2020-10-22 14:55:23.9", // Fails: incorrect fraction of second
"2020-10-22 14:55:23.91", // Fails: incorrect fraction of second
"2020-10-22 14:55:23.917", // OK
"2020-10-22 14:55:23.9174", // Fails: incorrect fraction of second
"2020-10-22 14:55:23.917428511" // Fails: incorrect fraction of second
};
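To check these outcomes yourself, a small harness along these lines should do. It is a sketch; the class name OptionalPartsDemo and the helper describe are invented, and the formatter is the one built above:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

final class OptionalPartsDemo {
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("yyyy-MM-dd[ HH:mm[:ss[.SSS]]]")
            .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
            .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
            .parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
            .toFormatter();

    // Describes the outcome of parsing one input, without throwing.
    static String describe(String input) {
        try {
            return "OK: " + LocalDateTime.parse(input, FORMATTER);
        } catch (DateTimeParseException e) {
            return "Fails: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String[] inputs = {
            "2020-10-22",
            "2020-10-22 14:55",
            "2020-10-22T14:55",
            "2020-10-22 14:55:23",
            "2020-10-22 14:55:23.9",
            "2020-10-22 14:55:23.917",
            "2020-10-22 14:55:23.9174"
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + describe(input));
        }
    }
}
```

If you need to accept anything from 1 to 9 fraction digits, replacing [.SSS] with an appendFraction call on the builder would be the place to start.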
And what about epoch in milli?
Well, this cannot be parsed directly by the DateTimeFormatter. But what's more: an epoch in milli has an implicit timezone: UTC. The other patterns lack a timezone. So an epoch is a fundamentally different piece of information. One thing you could do is assume a timezone for the inputs missing one.
However, if you nevertheless want to parse the instant, you could try to parse it as a long using Long::parseLong, and if it fails, then try to parse with the formatter. Alternatively, you could use a regular expression (like -?\d+ or something) to try to match the instant, and if it does, then parse as instant, and if it fails, then try to parse with the abovementioned formatter.
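The regular-expression variant could look roughly like this. Note the big assumption, marked in the code: the textual timestamps are treated as UTC so that both kinds of input end up as comparable LocalDateTime values. The class name TimestampColumnParser is made up:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

final class TimestampColumnParser {
    private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("yyyy-MM-dd[ HH:mm[:ss[.SSS]]]")
            .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
            .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
            .parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
            .toFormatter();

    static LocalDateTime parse(String value) {
        String text = value.trim();
        // Digits only (optionally signed): treat as epoch milliseconds.
        if (text.matches("-?\\d+")) {
            long epochMilli = Long.parseLong(text); // throws NumberFormatException if out of long range
            // ASSUMPTION: the other formats are meant as UTC too, so convert at offset zero.
            return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMilli), ZoneOffset.UTC);
        }
        return LocalDateTime.parse(text, FORMATTER);
    }
}
```

If your textual timestamps are really in some local time zone, swap ZoneOffset.UTC for that zone, or return an Instant from both branches instead.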
The brute force approach:
simply try your 4 formats, one after the other to parse the incoming string
if parsing throws an exception, try the next one
if parsing passes, well, that format just matched
Of course, if we are talking about larger tables, that is quite inefficient. Possible optimisations:
Obviously, the different patterns have subtle differences, so you could use indexOf() checks first. For example: if the value to be parsed contains no ':' char, then it can only be the first pattern.
You can look at your data manually to figure out the actual distribution of the patterns in use, then order the patterns you try by how likely each one is to occur in your data.
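The brute force loop described above can be sketched like this. The order of the patterns is a guess on my part, and the class name BruteForceParser is invented; adapt both to your data:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;
import java.util.ArrayList;
import java.util.List;

final class BruteForceParser {
    // The four textual patterns from the question, in an assumed order of likelihood.
    private static final List<DateTimeFormatter> FORMATTERS = new ArrayList<>();

    static {
        String[] patterns = {
            "yyyy-MM-dd HH:mm:ss.SSS",
            "yyyy-MM-dd HH:mm:ss",
            "yyyy-MM-dd HH:mm",
            "yyyy-MM-dd"
        };
        for (String pattern : patterns) {
            FORMATTERS.add(new DateTimeFormatterBuilder()
                    .appendPattern(pattern)
                    // Defaults only apply to fields the pattern did not parse.
                    .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
                    .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
                    .parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
                    .toFormatter());
        }
    }

    static LocalDateTime parse(String value) {
        for (DateTimeFormatter formatter : FORMATTERS) {
            try {
                return LocalDateTime.parse(value, formatter);
            } catch (DateTimeParseException e) {
                // This format did not match; try the next one.
            }
        }
        throw new IllegalArgumentException("No known format matches: " + value);
    }
}
```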
Alternatively: you could define your own regex. The only thing that makes it slightly ugly is the fact that your input uses month names, not month number. But I think it shouldn't be too hard to write up a single regex that covers all your cases.
I am receiving a timestamp in the format HHmmss followed by milliseconds and microseconds. The microseconds after the '.' are optional.
For example: "timestamp ":"152656375.489991" is 15:26:56:375.489991.
Below code is throwing exceptions:
final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
.appendPattern("HHmmssSSS")
.appendFraction(ChronoField.MICRO_OF_SECOND, 0, 6, true)
.toFormatter();
LocalTime.parse(dateTime,FORMATTER);
Can someone please help me with a DateTimeFormatter to get a LocalTime in Java?
Here is the stacktrace from the exception from the code above:
java.time.format.DateTimeParseException: Text '152656375.489991' could not be parsed: Conflict found: NanoOfSecond 375000000 differs from NanoOfSecond 489991000 while resolving MicroOfSecond
at java.base/java.time.format.DateTimeFormatter.createError(DateTimeFormatter.java:1959)
at java.base/java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1894)
at java.base/java.time.LocalTime.parse(LocalTime.java:463)
at com.ajax.so.Test.main(Test.java:31)
Caused by: java.time.DateTimeException: Conflict found: NanoOfSecond 375000000 differs from NanoOfSecond 489991000 while resolving MicroOfSecond
at java.base/java.time.format.Parsed.updateCheckConflict(Parsed.java:329)
at java.base/java.time.format.Parsed.resolveTimeFields(Parsed.java:462)
at java.base/java.time.format.Parsed.resolveFields(Parsed.java:267)
at java.base/java.time.format.Parsed.resolve(Parsed.java:253)
at java.base/java.time.format.DateTimeParseContext.toResolved(DateTimeParseContext.java:331)
at java.base/java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1994)
at java.base/java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1890)
... 3 more
There are many options, depending on the possible variations in the strings you need to parse.
1. Modify the string so you need no formatter
String timestampString = "152656375.489991";
timestampString = timestampString.replaceFirst(
"^(\\d{2})(\\d{2})(\\d{2})(\\d{3})(?:\\.(\\d*))?$", "$1:$2:$3.$4$5");
System.out.println(timestampString);
LocalTime time = LocalTime.parse(timestampString);
System.out.println(time);
Both print statements in this snippet output the same text:
15:26:56.375489991
The replaceFirst() call modifies your string into 15:26:56.375489991, the default format for LocalTime (ISO 8601) so it can be parsed without any explicit formatter. For this I am using a regular expression that may not be too readable. (…) enclose groups that I use as $1, $2, etc., in the replacement string. (?:…) denotes a non-capturing group, that is, cannot be used in the replacement string. I put a ? after it to specify that this group is optional in the original string.
This solution accepts from 1 through 6 decimals after the point and also no fractional part at all.
2. Use a simpler string modification and a formatter
I want to modify the string so I can use this formatter:
private static DateTimeFormatter fullParser
= DateTimeFormatter.ofPattern("HHmmss.[SSSSSSSSS][SSS]");
This requires the point to be after the seconds rather than after the milliseconds. So move it three places to the left:
timestampString = timestampString.replaceFirst("(\\d{3})(?:\\.|$)", ".$1");
LocalTime time = LocalTime.parse(timestampString, fullParser);
15:26:56.375489991
Again I am using a non-capturing group, this time to say that after the (captured) group of three digits must come either a dot or the end of the string.
3. The same with a more flexible parser
The formatter above specifies that there must be either 9 or 3 digits after the decimal point, which may be too rigid. If you want to accept something in between too, a builder can build a more flexible formatter:
private static DateTimeFormatter fullParser = new DateTimeFormatterBuilder()
.appendPattern("HHmmss")
.appendFraction(ChronoField.NANO_OF_SECOND, 3, 9, true)
.toFormatter();
I think that this would be my favourite approach, again depending on the exact requirements.
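Put together with the string modification from option 2, a complete version might look like this. It is a sketch under the assumption that the input always has exactly nine digits before the optional point; the class name CompactTimeParser is made up:

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

final class CompactTimeParser {
    private static final DateTimeFormatter FULL_PARSER = new DateTimeFormatterBuilder()
            .appendPattern("HHmmss")
            .appendFraction(ChronoField.NANO_OF_SECOND, 3, 9, true)
            .toFormatter();

    static LocalTime parse(String timestampString) {
        // Move the point three places to the left: 152656375.489991 -> 152656.375489991
        String moved = timestampString.replaceFirst("(\\d{3})(?:\\.|$)", ".$1");
        return LocalTime.parse(moved, FULL_PARSER);
    }

    public static void main(String[] args) {
        System.out.println(parse("152656375.489991"));
        System.out.println(parse("152656375"));
    }
}
```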
4. Parse only a part of the string
"There is no problem so big and awful that it cannot simply be run away from" (Linus in Peanuts, from memory)
If you can live without the microseconds, ignore them:
private static DateTimeFormatter partialParser
= DateTimeFormatter.ofPattern("HHmmssSSS");
To parse only the part of the string up to the point using this formatter:
TemporalAccessor parsed
= partialParser.parse(timestampString, new ParsePosition(0));
LocalTime time = LocalTime.from(parsed);
15:26:56.375
As you can see it has ignored the part from the decimal point, which I wouldn’t find too satisfactory.
What went wrong in your code?
Your 6 digits after the decimal point reach nanosecond precision. Microseconds would have been only 3 decimals after the milliseconds. Your formatter parsed 375 as milliseconds (375000000 nanoseconds) and then read .489991 as a complete fraction of a second in microseconds (489991000 nanoseconds), which is the conflict reported in the stack trace. To use appendFraction() to parse these digits you would have needed a TemporalField of nano of millisecond. The ChronoField enum offers nano of day and nano of second, but not nano of milli. TemporalField is an interface, so in theory we could develop our own nano of milli class for the purpose. I tried to develop such a class once, but gave up; I couldn’t get it to work.
Links
Wikipedia article: ISO 8601
Regular expressions in Java - Tutorial
What is the Java 8 java.time equivalent of org.joda.time.format.DateTimeFormat.shortDate()?
I've tried below way, but it fails to parse values such as "20/5/2016" or "20/5/16".
DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT)
You are correct: A Joda-Time DateTimeFormatter (which is the type you get from DateTimeFormat.shortDate()) parses more leniently than a java.time DateTimeFormatter. In the English/New Zealand locale (en-NZ) shortDate uses the format pattern d/MM/yy and parses both 20/5/2016 and 20/5/16 into 2016-05-20.
I frankly find it nasty that it interprets both two-digit and four-digit years into the same year. When the format specifies two-digit year, I would have expected four digits to be an error for stricter input validation. Accepting one-digit month when the format specifies two digits is lenient too, but maybe not so dangerous and more in line with what we might expect.
java.time too uses the format pattern d/MM/yy (tested on jdk-11.0.3). When parsing, it accepts one or two digits for day of month, but insists on two-digit month and two-digit year.
You may get the Joda-Time behaviour in java.time, but it requires you to specify the format pattern yourself:
Locale loc = Locale.forLanguageTag("en-NZ");
DateTimeFormatter dateFormatter
= DateTimeFormatter.ofPattern("d/M/[yyyy][yy]", loc);
System.out.println(LocalDate.parse("20/5/2016", dateFormatter));
System.out.println(LocalDate.parse("20/5/16", dateFormatter));
Output is:
2016-05-20
2016-05-20
If you want an advanced solution that works in other locales, I am sure that you can write a piece of code that gets the format pattern from DateTimeFormatterBuilder.getLocalizedDateTimePattern and modifies it by replacing dd with d, MM with M and any number of y with [yyyy][yy]. Then pass the modified format pattern string to DateTimeFormatter.ofPattern.
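A rough sketch of that idea follows. The replacement is deliberately naive: it assumes the localized short pattern is purely numeric and contains no quoted literal text, which I have not checked for every locale. The class name LenientShortDate and both method names are invented:

```java
import java.time.LocalDate;
import java.time.chrono.IsoChronology;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.FormatStyle;
import java.util.Locale;

final class LenientShortDate {
    // Naive rewrite; ASSUMES no quoted literals and no text months in the pattern.
    static String relax(String pattern) {
        return pattern
                .replaceAll("d+", "d")
                .replaceAll("M+", "M")
                .replaceAll("y+", "[yyyy][yy]");
    }

    // Builds a lenient formatter from the locale's short date pattern.
    static DateTimeFormatter forLocale(Locale locale) {
        String localized = DateTimeFormatterBuilder.getLocalizedDateTimePattern(
                FormatStyle.SHORT, null, IsoChronology.INSTANCE, locale);
        return DateTimeFormatter.ofPattern(relax(localized), locale);
    }

    public static void main(String[] args) {
        System.out.println(relax("d/MM/yy"));
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(relax("d/MM/yy"));
        System.out.println(LocalDate.parse("20/5/2016", formatter));
        System.out.println(LocalDate.parse("20/5/16", formatter));
    }
}
```

The pattern that forLocale gets back depends on the JDK version and its locale data, so print it for your locales before relying on it.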
Edit: I’m glad that you got something to work. In your comment you said that you used:
Stream<String> shortFormPatterns = Stream.of(
"[d][dd]/[M][MM]",
"[d][dd]-[M][MM]",
"[d][dd].[M][MM]",
"[d][dd] [M][MM]",
"[d][dd]/[M][MM]/[yyyy][yy]",
"[d][dd]-[M][MM]-[yyyy][yy]",
"[d][dd].[M][MM].[yyyy][yy]",
"[d][dd] [M][MM] [yyyy][yy]");
It covers more cases than your Joda-Time formatter. Maybe that’s good. Specifically your Joda-Time formatter insists on a slash / between the numbers and rejects either hyphen, dot or space. Also I believe that Joda-Time would object to the year being left out completely.
While you do need [yyyy][yy], you don’t need [d][dd] nor [M][MM]. Just d and M suffice since they also accept two digits (what happens in your code is that for example [d] parses either one or two digits, so [dd] is never used anyway).
If you prefer only one format pattern string, I would expect d[/][-][.][ ]M[/][-][.][ ][yyyy][yy] to work (except in the cases where the year is omitted) (I haven’t tested).
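For anyone who wants to try the single pattern string, a quick check could look like the sketch below (the class name OptionalSeparators is made up). One thing to be aware of: since each separator is optional on its own, the pattern is more permissive than the list of eight patterns, for example it does not require the two separators to be the same character.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class OptionalSeparators {
    // Each bracketed separator is tried in turn and skipped when it does not match.
    static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("d[/][-][.][ ]M[/][-][.][ ][yyyy][yy]");

    static LocalDate parse(String text) {
        return LocalDate.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("20/5/2016"));
        System.out.println(parse("20-5-16"));
        System.out.println(parse("20.05.16"));
        System.out.println(parse("20 5 2016"));
    }
}
```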
FormatStyle.SHORT gives the locale's shortest date format, for example dd/MM/yy or d/M/yy, so you need to specify a pattern yourself to get a customized format.
LocalDate date = LocalDate.now();
System.out.println(date.format(DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT))); //9/29/19
You can also use DateTimeFormatter.ISO_DATE or DateTimeFormatter.ISO_LOCAL_DATE to get the iso format like yyyy-MM-dd, and also you can see the available formats in DateTimeFormatter
System.out.println(date.format(DateTimeFormatter.ISO_DATE)); //2019-09-29
System.out.println(date.format(DateTimeFormatter.ISO_LOCAL_DATE)); //2019-09-29
If you want a custom format like yyyy/MM/dd, then use ofPattern
System.out.println(date.format(DateTimeFormatter.ofPattern("yyyy/MM/dd"))); //2019/09/29
String dateString = "20110706 1607";
DateTimeFormatter dateStringFormat = DateTimeFormat.forPattern("YYYYMMDD HHMM");
DateTime dateTime = dateStringFormat.parseDateTime(dateString);
Resulting stacktrace:
Exception in thread "main" java.lang.IllegalArgumentException: Invalid format: "201107206 1607" is malformed at " 1607"
at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:644)
at org.joda.time.convert.StringConverter.getInstantMillis(StringConverter.java:65)
at org.joda.time.base.BaseDateTime.<init>(BaseDateTime.java:171)
at org.joda.time.DateTime.<init>(DateTime.java:168)
......
Any thoughts? If I truncate the string to 20110706 with pattern "YYYYMMDD" it works, but I need the hour and minute values as well. What's odd is that I can convert a Jodatime DateTime to a String using the same pattern "YYYYMMDD HHMM" without issue
Thanks for looking
Look at your pattern - you're specifying "MM" twice. That can't possibly be right. That would be trying to parse the same field (month in this case) twice from two different bits of the text. Which would you expect to win? You want:
DateTimeFormat.forPattern("yyyyMMdd HHmm")
Look at the documentation for DateTimeFormat to see what everything means.
Note that although calling toString with that pattern will produce a string, it won't produce the string you want it to. I wouldn't be surprised if the output even included "YYYY" and "DD" due to the casing, although I can't test it right now. At the very least you'd have the month twice instead of the minutes appearing at the end.
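The same letters carry over if you ever move this code to java.time. A minimal sketch, assuming java.time is an option for you (the question itself is about Joda-Time, and the class name CompactDateTime is made up):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class CompactDateTime {
    // yyyy = year, MM = month, dd = day of month, HH = hour of day, mm = minute.
    static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("yyyyMMdd HHmm");

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        LocalDateTime dateTime = parse("20110706 1607");
        System.out.println(dateTime);
        // Formatting with the same pattern gives back the original text.
        System.out.println(FORMATTER.format(dateTime));
    }
}
```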