SimpleDateFormat vs LocalDateTime - Parsing Differences with postfixes [duplicate] - java

I'm retrofitting some old SimpleDateFormat code to use the new Java 8 DateTimeFormatter. SimpleDateFormat, and thus the old code, accepts strings with trailing text after the date, like "20130311nonsense". The DateTimeFormatter I created throws a DateTimeParseException for these strings, which is probably the right thing to do, but I'd like to maintain compatibility. Can I modify my DateTimeFormatter to accept these strings?
I'm currently creating it like this:
DateTimeFormatter.ofPattern("yyyyMMdd")

Use the parse() method that takes a ParsePosition, as that one doesn't fail when it doesn't read the entire text:
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyyMMdd");
TemporalAccessor parse = formatter.parse("20140314 some extra text", new ParsePosition(0));
System.out.println(LocalDate.from(parse));
The ParsePosition instance that you pass will also be updated with the point at which the parsing stopped, so if you need to do something with the leftover text then it will be useful to assign it to a variable prior to calling parse.
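A minimal sketch of keeping the leftover text, assuming you want both the date and whatever followed it. The class name LeadingDateParser and the Result holder are made up for illustration; only parse(CharSequence, ParsePosition) and ParsePosition.getIndex() come from the JDK.

```java
import java.text.ParsePosition;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class LeadingDateParser {
    private static final DateTimeFormatter BASIC = DateTimeFormatter.ofPattern("yyyyMMdd");

    /** Hypothetical holder: the parsed date plus the text that followed it. */
    static final class Result {
        final LocalDate date;
        final String leftover;

        Result(LocalDate date, String leftover) {
            this.date = date;
            this.leftover = leftover;
        }
    }

    static Result parseLeading(String text) {
        // Keep a reference to the position so it can be inspected after parsing.
        ParsePosition position = new ParsePosition(0);
        // Still throws DateTimeParseException if the text does not start with a valid date.
        LocalDate date = LocalDate.from(BASIC.parse(text, position));
        // getIndex() now points just past the last character the formatter consumed.
        return new Result(date, text.substring(position.getIndex()));
    }

    public static void main(String[] args) {
        Result r = parseLeading("20130311nonsense");
        System.out.println(r.date + " / '" + r.leftover + "'");
    }
}
```

Note that this only tolerates trailing text; a string that does not begin with eight valid date digits is still rejected.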

Related

Parsing PDF date using Java DateTimeFormatter

I'm trying to parse the date format used in PDFs. According to this page, the format looks as follows:
D:YYYYMMDDHHmmSSOHH'mm'
Where all components except the year are optional. I assume this means the string can be cut off at any point, since specifying, for example, a year and an hour without a month and a day seems pointless to me. It would also make parsing pretty much impossible.
As far as I can tell, Java does not support zone offsets containing single quotes. Therefore, the first step would be to get rid of those:
D:YYYYMMDDHHmmSSOHHmm
The resulting Java date pattern should then look like this:
['D:']uuuu[MM[dd[HH[mm[ss[X]]]]]]
And my overall code looks like this:
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("['D:']uuuu[MM[dd[HH[mm[ss[X]]]]]]");
TemporalAccessor temporalAccessor = formatter.parseBest("D:20020101",
ZonedDateTime::from,
LocalDateTime::from,
LocalDate::from,
Month::from,
Year::from
);
I would expect that to result in a LocalDate object but what I get is java.time.format.DateTimeParseException: Text 'D:20020101' could not be parsed at index 2.
I've played around a bit with that and found out that everything works fine with the optional literal at the beginning but as soon as I add optional date components, I get an exception.
Can anybody tell me what I'm doing wrong?
Thanks in advance!
I've found a solution:
String dateString = "D:20020101120000+01'00'";
String normalized = dateString.replace("'", "");
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("['D:']ppppy[ppM[ppd[ppH[ppm[pps[X]]]]]]");
TemporalAccessor temporalAccessor = formatter.parseBest(normalized,
OffsetDateTime::from,
LocalDateTime::from,
LocalDate::from,
YearMonth::from,
Year::from
);
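It seems that the length of the components is ambiguous, so parsing the date without any separators failed. As far as I can tell, uuuu accepts four or more digits, so it greedily reads all of 20020101 as the year and then rejects it, which would explain the error at index 2.
When a padding width is specified, the length of each component is fixed and the date can therefore be parsed.
At least that's my theory.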
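An alternative to the padding letters is to state the widths explicitly with DateTimeFormatterBuilder. This is only a sketch under that assumption, not the approach from the answer above: the class name PdfDateParser and the constant PDF_DATE are made up, and appendValue(field, width) is used so that every numeric component has a fixed width.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.Year;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;

final class PdfDateParser {
    // Each optionalStart() opens a nested optional section;
    // toFormatter() closes any sections that are still open.
    static final DateTimeFormatter PDF_DATE = new DateTimeFormatterBuilder()
            .optionalStart().appendLiteral("D:").optionalEnd()
            .appendValue(ChronoField.YEAR, 4)
            .optionalStart().appendValue(ChronoField.MONTH_OF_YEAR, 2)
            .optionalStart().appendValue(ChronoField.DAY_OF_MONTH, 2)
            .optionalStart().appendValue(ChronoField.HOUR_OF_DAY, 2)
            .optionalStart().appendValue(ChronoField.MINUTE_OF_HOUR, 2)
            .optionalStart().appendValue(ChronoField.SECOND_OF_MINUTE, 2)
            .optionalStart().appendOffset("+HHmm", "Z")
            .toFormatter();

    static TemporalAccessor parse(String raw) {
        // PDF writes offsets as +01'00'; drop the quotes before parsing.
        String normalized = raw.replace("'", "");
        // parseBest returns the first type in the list that the parsed fields can build.
        return PDF_DATE.parseBest(normalized,
                OffsetDateTime::from,
                LocalDateTime::from,
                LocalDate::from,
                YearMonth::from,
                Year::from);
    }

    public static void main(String[] args) {
        System.out.println(parse("D:20020101"));
        System.out.println(parse("D:2002"));
        System.out.println(parse("D:20020101120000+01'00'"));
    }
}
```

The order of the queries matters: the most specific type comes first, so a full timestamp is not reported as a bare Year.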

Generic date and time parsing in java 8

I was recently trying to make a generic date and time parsing method with the java 8 time API, mainly for interfacing with older code using Date.
I wanted to do something like that:
public static Date parse(String dateStr, String pattern) {
return Date.from(Instant.parse(dateStr, DateTimeFormatter.ofPattern(pattern)));
}
The problem is that with the time API, the class to use depends on the pattern. DateTimeFormatter.parse accepts any pattern, but it returns a TemporalAccessor, which is awkward to work with and to convert to a usable class.
And LocalDateTime.parse will fail if the pattern has no time information like "dd/MM/yyyy". Other classes like Instant, ZonedDateTime, etc. will all fail to parse if the pattern doesn't match the expected class.
Ideally, I'd like a way to parse leniently and return an Instant, with default values for missing fields, but I can't find a way to do that.
Any idea?
You can use DateTimeFormatterBuilder::parseDefaulting to set default values.
var now = ZonedDateTime.now();
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
.appendPattern(pattern)
.parseDefaulting(ChronoField.OFFSET_SECONDS, now.getOffset().getTotalSeconds())
.parseDefaulting(ChronoField.YEAR, now.getYear())
.parseDefaulting(ChronoField.MONTH_OF_YEAR, now.getMonthValue())
.parseDefaulting(ChronoField.DAY_OF_MONTH, now.getDayOfMonth())
.parseDefaulting(ChronoField.HOUR_OF_DAY, now.getHour())
.parseDefaulting(ChronoField.MINUTE_OF_HOUR, now.getMinute())
.parseDefaulting(ChronoField.SECOND_OF_MINUTE, now.getSecond())
.toFormatter(Locale.ROOT);
Instant dt = Instant.from(formatter.parse(dateStr));
Note that it's important to first append the pattern using appendPattern, and then set all your defaults using parseDefaulting.
Also note that I used the current time stamp to fill the defaults. So, for example, if you left out the year, it takes the current year (2022 at the time of writing). Of course, the defaults depend on your exact use case.
Examples:
At the time of writing, it's 2022-06-09T17:18:36+02:00.
System.out.println(parse("9-6", "d-M"));
System.out.println(parse("2023", "uuuu"));
System.out.println(parse("10:13", "H:m"));
System.out.println(parse("25 Dec, 16:22", "d MMM, H:mm"));
resolves to
2022-06-09T15:18:36Z
2023-06-09T15:18:36Z
2022-06-09T08:13:36Z
2022-12-25T14:22:36Z
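Put together as a method, the idea could look roughly like this. It is a sketch, not a drop-in: the class name DefaultingParser and the extra reference parameter are my own additions, the latter so that the defaults can be fixed in a test instead of depending on the current time.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Date;
import java.util.Locale;

final class DefaultingParser {
    /** Parses dateStr with pattern; fields missing from the pattern are taken from reference. */
    static Instant parseInstant(String dateStr, String pattern, ZonedDateTime reference) {
        DateTimeFormatter formatter = new DateTimeFormatterBuilder()
                .appendPattern(pattern) // pattern first, defaults afterwards
                .parseDefaulting(ChronoField.OFFSET_SECONDS, reference.getOffset().getTotalSeconds())
                .parseDefaulting(ChronoField.YEAR, reference.getYear())
                .parseDefaulting(ChronoField.MONTH_OF_YEAR, reference.getMonthValue())
                .parseDefaulting(ChronoField.DAY_OF_MONTH, reference.getDayOfMonth())
                .parseDefaulting(ChronoField.HOUR_OF_DAY, reference.getHour())
                .parseDefaulting(ChronoField.MINUTE_OF_HOUR, reference.getMinute())
                .parseDefaulting(ChronoField.SECOND_OF_MINUTE, reference.getSecond())
                .toFormatter(Locale.ROOT);
        return Instant.from(formatter.parse(dateStr));
    }

    /** Bridge to legacy code that expects java.util.Date; defaults come from the current time. */
    static Date parse(String dateStr, String pattern) {
        return Date.from(parseInstant(dateStr, pattern, ZonedDateTime.now()));
    }

    public static void main(String[] args) {
        ZonedDateTime reference = ZonedDateTime.parse("2022-06-09T17:18:36+02:00");
        System.out.println(parseInstant("9-6", "d-M", reference));
        System.out.println(parseInstant("2023", "uuuu", reference));
    }
}
```

One caveat, as far as I can tell: a pattern that uses y (year-of-era) instead of u can conflict with the YEAR default when the parsed year differs from the default, so u is the safer letter with this approach.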

Date toString without timezone conversion for MessageFormat

I'm using IBM's MessageFormat library to localize an incoming date.
The task here is to run a few checks on the date before showing it to the end user. I get a ZonedDateTime object and I need to make sure that it doesn't fall on the weekend, which I do using getDayOfWeek().
My problem happens when I try to convert my date to a string using MessageFormat. Since MessageFormat accepts only java.util.Date objects, I convert my ZonedDateTime -> Instant -> Date. Unfortunately, this results in my "Monday" becoming a "Sunday".
I noticed that this "loss" happens upon the Date conversion. This is because MessageFormat invokes Date.toString(), which uses the JVM's default timezone (in my case, PST). As a result, my UTC time gets implicitly converted to PST and I lose a day.
Any ideas how to tackle this? Is there anything else that I can pass to MessageFormat? Is there a way to use Date but not get this undesired behavior? Is there another localization library I can use?
Internally, MessageFormat uses a DateFormat object but does not allow you to set its timezone. @assylias linked a question where the answer pulls out the internal DateFormat, sets its timezone, and then uses the MessageFormat as usual, which resolves the issue.
However, I found that to be too wordy, particularly because you have to create a new MessageFormat every time (as opposed to reusing a MessageFormat whose timezone you have already set).
What I opted for was to simply use SimpleDateFormat directly.
// I have a ZonedDateTime zonedDateTime that I want to print out.
final SimpleDateFormat dateFormat = new SimpleDateFormat("EEEE, MM dd", locale);
dateFormat.setTimeZone(TimeZone.getTimeZone(zonedDateTime.getZone()));
final String formattedDateString = dateFormat.format(Date.from(zonedDateTime.toInstant()));
I then use String.format to insert my formatted date into a larger string. Hope this helps.
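Here is a small runnable version of that, next to a java.time variant that formats the ZonedDateTime directly and never goes through Date. The second method is my own substitution, not part of the approach described above, and the class and method names are made up for the example.

```java
import java.text.SimpleDateFormat;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class ZoneAwareFormatting {
    /** The approach from the answer: pin the formatter to the date's own zone. */
    static String viaSimpleDateFormat(ZonedDateTime zonedDateTime, Locale locale) {
        SimpleDateFormat dateFormat = new SimpleDateFormat("EEEE, MM dd", locale);
        dateFormat.setTimeZone(TimeZone.getTimeZone(zonedDateTime.getZone()));
        return dateFormat.format(Date.from(zonedDateTime.toInstant()));
    }

    /** Alternative (an assumption, not from the answer): skip Date entirely. */
    static String viaDateTimeFormatter(ZonedDateTime zonedDateTime, Locale locale) {
        // A ZonedDateTime carries its own zone, so the JVM default zone is not consulted.
        return DateTimeFormatter.ofPattern("EEEE, MM dd", locale).format(zonedDateTime);
    }

    public static void main(String[] args) {
        // Early Monday morning in UTC, which is still Sunday evening in PST.
        ZonedDateTime monday = ZonedDateTime.of(2022, 6, 6, 2, 0, 0, 0, ZoneId.of("UTC"));
        System.out.println(viaSimpleDateFormat(monday, Locale.ENGLISH));
        System.out.println(viaDateTimeFormatter(monday, Locale.ENGLISH));
    }
}
```

Either string can then be inserted into the larger message as plain text, so the message formatter never sees a Date.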

Using Joda Date & Time API to parse multiple formats

I'm parsing third party log files containing date/time using Joda. The date/time is in one of two different formats, depending on the age of the log files I'm parsing.
Currently I have code like this:
try {
return DateTimeFormat.forPattern("yyyy/MM/dd HH:mm:ss").parseDateTime(datePart);
} catch (IllegalArgumentException e) {
return DateTimeFormat.forPattern("E, MMM dd, yyyy HH:mm").parseDateTime(datePart);
}
This works but contravenes Joshua Bloch's advice from Effective Java 2nd Edition (Item 57: Use exceptions only for exceptional conditions). It also makes it hard to determine if an IllegalArgumentException occurs due to a screwed up date/time in a log file.
Can you suggest a nicer approach that doesn't misuse exceptions?
You can create multiple parsers and add them to the builder by using DateTimeFormatterBuilder.append method:
DateTimeParser[] parsers = {
DateTimeFormat.forPattern( "yyyy-MM-dd HH" ).getParser(),
DateTimeFormat.forPattern( "yyyy-MM-dd" ).getParser() };
DateTimeFormatter formatter = new DateTimeFormatterBuilder().append( null, parsers ).toFormatter();
DateTime date1 = formatter.parseDateTime( "2010-01-01" );
DateTime date2 = formatter.parseDateTime( "2010-01-01 01" );
Joda-Time supports this by allowing multiple parsers to be specified - DateTimeFormatterBuilder#append
Simply create your two formatters using a builder and call toParser() on each. Then use the builder to combine them using append.
Unfortunately I don't believe Joda Time has any such capabilities. It would be nice to have a "tryParseDateTime" method, but it doesn't exist.
I suggest you isolate this behaviour into your own class (one which takes a list of patterns, and will try each in turn) so that the ugliness is only in one place. If this is causing performance issues, you might want to try to use some heuristics to guess which format to try first. For example, in your case if the string starts with a digit then it's probably the first pattern.
Note that DateTimeFormatters in Joda Time are conventionally immutable - you shouldn't be creating a new one each time you want to parse a line. Create them once and reuse them.
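For anyone doing the same thing with java.time rather than Joda-Time, a rough analogue is DateTimeFormatterBuilder.appendOptional. This sketch uses java.time only, so it is not the Joda API discussed above; the class name MultiFormatParser is made up, and the two patterns are the ones from the question with u instead of y.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;

final class MultiFormatParser {
    // Built once and reused; java.time formatters are immutable and thread-safe.
    private static final DateTimeFormatter EITHER = new DateTimeFormatterBuilder()
            .appendOptional(DateTimeFormatter.ofPattern("uuuu/MM/dd HH:mm:ss"))
            .appendOptional(DateTimeFormatter.ofPattern("E, MMM dd, uuuu HH:mm"))
            .toFormatter(Locale.ENGLISH); // English day and month names are assumed

    static LocalDateTime parse(String datePart) {
        // Each optional section is tried in order; one that does not match is skipped.
        // Text that matches neither format still throws DateTimeParseException.
        return LocalDateTime.parse(datePart, EITHER);
    }

    public static void main(String[] args) {
        System.out.println(parse("2010/01/01 12:30:45"));
        System.out.println(parse("Fri, Jan 01, 2010 12:30"));
    }
}
```

With this shape an exception means the line matched neither format, which addresses the concern about telling a bad log line apart from an expected fallback.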

How to convert a date String into the right format in Java?

Can somebody please explain to me how I can convert
2009-10-27 14:36:59.580250
into
27.10.2009, 14:36 ?
The first date is available as a string and the second one should be a string as well ;) Up to now I'm not so into date conversion within Java...
Thanks in advance!
You can use java.text.SimpleDateFormat for this. First step is to parse the first string into a java.util.Date object using SimpleDateFormat based on the pattern of the first string. Next step is to format the obtained java.util.Date object into a string based on the pattern of the second string. For example:
String datestring1 = "2009-10-27 14:36:59.580250";
Date date1 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").parse(datestring1);
String datestring2 = new SimpleDateFormat("dd.MM.yyyy, HH:mm").format(date1);
System.out.println(datestring2);
Edit: I've removed the .SSSSSS part from the first pattern because it gave wrong results. In theory it should have worked with "yyyy-MM-dd HH:mm:ss.SSS" and "yyyy-MM-dd HH:mm:ss.SSSSSS" as well, but SimpleDateFormat treats S as milliseconds, so it reads 580250 as 580,250 milliseconds and adds roughly 580 seconds to the time. I consider this a shortcoming of SimpleDateFormat. Joda-Time handles the millis/micros correctly with those patterns.
You can use SimpleDateFormat. Although there's no format specification for micro-seconds (the last fragment of your input), you can make use of the fact that the parser ignores the rest of the string if it has already managed to match the configured pattern:
SimpleDateFormat parser = new SimpleDateFormat("yyyy-MM-dd HH:mm");
SimpleDateFormat formatter = new SimpleDateFormat("dd.MM.yyyy, HH:mm");
System.out.println(formatter.format(parser.parse("2009-10-27 14:36:59.580250")));
The parser will in this case simply ignore the last part ":59.580250" of the input string.
Check out SimpleDateFormat. You can use this to both parse and format. I would suggest parsing the above into a Date object using one SimpleDateFormat, and then formatting to a String using a 2nd SimpleDateFormat.
Note that SimpleDateFormat is not thread-safe, so if you're using it in a multi-threaded environment, either create new SimpleDateFormat instances rather than sharing static ones, or use the corresponding thread-safe classes in Joda.
Keep in mind when you do this that you are losing precision. Depending on your specific application, this may or may not matter.
If you already have the original date saved somewhere, this is not an issue. However, if the source date is from a transient source (e.g., streaming in from a physical sensor of some sort), it may be a good idea to persist the interim Date object (output of SimpleDateFormat#parse(String)) somewhere.
Just thought I'd point that out.
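If Java 8 or later is available, the same conversion can be sketched with java.time, which has a pattern letter for fractional seconds and so keeps the microseconds in the intermediate value. This is an alternative to the SimpleDateFormat answers above, not a correction of them; the class name TimestampReformatter is made up.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class TimestampReformatter {
    // SSSSSS is a six-digit fraction of a second, so .580250 is read as microseconds.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSSSSS");
    private static final DateTimeFormatter OUTPUT =
            DateTimeFormatter.ofPattern("dd.MM.uuuu, HH:mm");

    /** Keeps the full precision; persist this value if the source is transient. */
    static LocalDateTime parse(String input) {
        return LocalDateTime.parse(input, INPUT);
    }

    static String reformat(String input) {
        return parse(input).format(OUTPUT);
    }

    public static void main(String[] args) {
        System.out.println(reformat("2009-10-27 14:36:59.580250"));
    }
}
```

Both formatters are immutable, so they can safely be shared as static constants across threads.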
