I am having trouble writing a regex for a range of dates.
For example, given the range [2015-11-17, 2017-10-05], how can I use a regex to validate whether a date belongs to that range?
My second question: is it possible to have a generic regex that I can reuse for several date ranges, replacing only a few values in the regex with the new bounds, so that it keeps validating dates but against the new range? Thanks in advance for any help =)
Do not use Regex
As the comments state, Regex is not appropriate for a range of dates, nor any span of time. Regex is intended to be “dumb” in the sense of looking only at the syntax of the text not the semantics (the meaning).
java.time
Use the java.time framework built into Java 8 and later.
Parse your strings into LocalDate objects.
LocalDate start = LocalDate.parse( "2015-11-17" );
Compare by calling the isEqual, isBefore, and isAfter methods.
Note that we commonly use the Half-Open approach in date-time work where the beginning is inclusive while the ending is exclusive.
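The comparison described above can be sketched as follows. This is a minimal illustration, not the only way to do it; the class and method names (DateRangeCheck, inClosedRange, inHalfOpenRange) are made up for this example. The closed variant matches the inclusive range from the question, the Half-Open variant excludes the ending date.

```java
import java.time.LocalDate;

class DateRangeCheck {
    // Closed range: both endpoints count as inside, as in [2015-11-17, 2017-10-05].
    static boolean inClosedRange(LocalDate date, LocalDate start, LocalDate end) {
        return !date.isBefore(start) && !date.isAfter(end);
    }

    // Half-Open range: start is inclusive, stop is exclusive.
    static boolean inHalfOpenRange(LocalDate date, LocalDate start, LocalDate stop) {
        return !date.isBefore(start) && date.isBefore(stop);
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.parse("2015-11-17");
        LocalDate end = LocalDate.parse("2017-10-05");
        System.out.println(inClosedRange(LocalDate.parse("2016-06-01"), start, end)); // true
        System.out.println(inClosedRange(end, start, end));                           // true
        System.out.println(inHalfOpenRange(end, start, end));                         // false
    }
}
```

Writing "not before" instead of "equal or after" keeps the start inclusive without a separate isEqual call.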
These issues are covered already in many other Questions and Answers on Stack Overflow. So I have abbreviated my discussion here.
Just for completeness: You can actually use regular expressions to recognize any finite set of strings, such as a specific date range, however it would be more of an academic exercise than an actual recommended usage. However, if you happen to be programming some arcane hardware it could actually be necessary.
Assuming the input is always a valid date in the given format, the regex for your example could consist of:
2015-11-1[7-9] - 2015 November 17 to 19
2015-11-[23].* - 2015 November 20 to 30
2015-12.* - 2015 December
2016.* - all dates of 2016
Add analogously for 2017, make a disjunction using | (a|b|c|...), apply escaping of the regex implementation you use and then you have your date checker. If the input is not guaranteed to be a valid date it gets a bit more complicated but is still possible.
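As a rough sketch of what the finished disjunction could look like in Java for the inclusive range [2015-11-17, 2017-10-05]: the class name DateRangeRegex and the method inRange are invented for this example, and the input is assumed to already be a valid date in yyyy-MM-dd form.

```java
import java.util.regex.Pattern;

class DateRangeRegex {
    // One alternative per block of dates; valid yyyy-MM-dd input is assumed.
    static final Pattern IN_RANGE = Pattern.compile(
            "2015-11-(1[7-9]|[23]\\d)"   // 2015 November 17 to 30
            + "|2015-12-\\d\\d"          // 2015 December
            + "|2016-\\d\\d-\\d\\d"      // all dates of 2016
            + "|2017-0[1-9]-\\d\\d"      // 2017 January to September
            + "|2017-10-0[1-5]");        // 2017 October 1 to 5

    static boolean inRange(String isoDate) {
        // matches() requires the whole string to match one alternative.
        return IN_RANGE.matcher(isoDate).matches();
    }

    public static void main(String[] args) {
        System.out.println(inRange("2015-11-17")); // true
        System.out.println(inRange("2015-11-16")); // false
        System.out.println(inRange("2017-10-06")); // false
    }
}
```

Note that the alternatives have to be derived again by hand for every new range, so there is no simple "replace a few values" version of this pattern.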
Related
I am trying to format LocalDate variables to dd.MMM.yyyy with:
DateTimeFormatter.ofPattern("dd.MMM.yyyy")
The problem is that more than half the time I get two dots. For example, 01-01-2000 becomes 01.Jan..2000.
I know why I have this problem: it is because of the three Ms. When I use dd.MM.yyyy I get 01.01.2000 without issue. The third M is the problem.
How can I fix this?
The cause of your problem is that the abbreviations for months are locale specific:
In some locales there is a dot (period) to indicate abbreviations [1]; Locale.CANADA for example. In others there isn't; Locale.ENGLISH for example.
In the locales where a dot indicates abbreviation, you may or may not find a dot when the name of the month doesn't need abbreviating. For example, the name of the month May is only three letters, so writing May. to indicate an abbreviation would be illogical.
There are various ways to deal with this, including:
1. Don't fix it. The output with doubled dots in some cases and not others is logically correct (by a certain logic) [2], even though it looks odd.
2. My preferred way would be to use a different output format. Don't use a dot as a separator.
Using dot characters as separators is ... unconventional ... and when you combine this with abbreviated month names, you get this awkward edge case.
Sure, there are ways to deal with this, but consider that other people might then run into an equivalent problem if they need to parse your dates in their code base.
3. Hard-wire your DateTimeFormatter to an existing Locale where there are no dots in the abbreviated names.
There is a theoretical risk that they may decide to change the abbreviations in a standard Locale. But I doubt that they would, because such a change is liable to break customer code which is implicitly locale dependent ... like yours would be.
4. Create a custom Locale and use that when creating the DateTimeFormatter.
5. Use DateTimeFormatterBuilder to create the formatter. To deal with the month, use appendText(TemporalField field, Map<Long,String> textLookup) with a lookup table that contains exactly the abbreviations that you want to use.
Depending on how you "append" the other fields, your formatter can be totally or partially locale independent.
Of these, 2. and 5. are the most "correct", in my opinion. Ole's answer illustrates these options with code.
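For completeness, here is a minimal sketch of the "hard-wire an existing Locale" option from the list above. The class name FixedLocaleFormat is invented for this example, and Locale.ENGLISH is chosen only because it was named above as a locale without dots in its abbreviations; whether English month names suit your users is a separate question.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class FixedLocaleFormat {
    // The explicit Locale makes the month abbreviations independent of the JVM default.
    static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("dd.MMM.yyyy", Locale.ENGLISH);

    static String format(LocalDate date) {
        return date.format(FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(format(LocalDate.of(2000, 1, 1))); // 01.Jan.2000
    }
}
```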
[1] See this article on American English grammar: When you need periods after abbreviations.
[2] The problem would be convincing people that "looks odd but is logical" is better than "looks nice but is illogical". I don't think you would win this argument ...
Stephen C. has written an answer that covers your options really well. As a supplement, since I agree that options 2 and 5 are the most correct, I would like to spell those two out.
Option 2: Use a different format
Localized date formats for most available locales are built into Java. These are generally under-used. We can save ourselves a lot of trouble by relying on Java to know how to format dates for our audience and their locale. I am using German as an example because it’s one of those locales that consistently includes dots both between the parts of the date and for abbreviation. The following should work for your locale too even if it’s not German (if you substitute Locale.getDefault(Locale.Category.FORMAT) or your users’ locale).
private static final Locale LOCALE = Locale.GERMAN;
private static final DateTimeFormatter DATE_FORMATTER
= DateTimeFormatter.ofLocalizedDate(FormatStyle.MEDIUM)
.withLocale(LOCALE);
For demonstration I am formatting a day of each month of the current year:
LocalDate date = LocalDate.of(2021, Month.JANUARY, 16);
for (int i = 0; i < 12; i++) {
System.out.println(date.format(DATE_FORMATTER));
date = date.plusMonths(1).minusDays(1);
}
Output is:
16.01.2021
15.02.2021
14.03.2021
13.04.2021
12.05.2021
11.06.2021
10.07.2021
09.08.2021
08.09.2021
07.10.2021
06.11.2021
05.12.2021
For German locale we got numeric months here. Other locales may give other results, for example month abbreviations.
If you want a longer format that doesn’t use numeric months, specify for example FormatStyle.LONG instead of FormatStyle.MEDIUM:
private static final DateTimeFormatter DATE_FORMATTER
= DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG)
.withLocale(LOCALE);
16. Januar 2021
15. Februar 2021
14. März 2021
13. April 2021
12. Mai 2021
11. Juni 2021
10. Juli 2021
9. August 2021
8. September 2021
7. Oktober 2021
6. November 2021
5. Dezember 2021
I suggest that your users would be happy with one of the above.
Option 5: DateTimeFormatterBuilder.appendText(TemporalField, Map<Long, String>)
If your users tell you that they don’t want the localized formats above and they do want your format with month abbreviations and single dots — it’s getting longer, but the result is beautiful and everyone will be happy.
private static final DateTimeFormatter DATE_FORMATTER = new DateTimeFormatterBuilder()
.appendPattern("dd.")
.appendText(ChronoField.MONTH_OF_YEAR, getMonthAbbreviations())
.appendPattern(".uuuu")
.toFormatter(LOCALE);
private static Map<Long, String> getMonthAbbreviations() {
return Arrays.stream(Month.values())
.collect(Collectors.toMap(m -> Long.valueOf(m.getValue()),
MyClass::getDisplayNameWithoutDot));
}
private static String getDisplayNameWithoutDot(Month m) {
return m.getDisplayName(TextStyle.SHORT, LOCALE)
.replaceFirst("\\.$", "");
}
Output from the same loop as above:
16.Jan.2021
15.Feb.2021
14.März.2021
13.Apr.2021
12.Mai.2021
11.Juni.2021
10.Juli.2021
09.Aug.2021
08.Sep.2021
07.Okt.2021
06.Nov.2021
05.Dez.2021
One dot each time. The central trick is to use Java’s month abbreviation and remove the dot from it if there is one (Jan. becomes Jan) and use it as-is if there is no dot (Mai stays Mai). My getDisplayNameWithoutDot method does this. I am in turn using this method to build the map that the two-arg appendText(TemporalField, Map<Long, String>) method requires and uses for formatting.
I'm trying to format a date without a year (just day and month, e.g. 12.10).
DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT) still yields the year for me (12.10.20).
So I tried DateTimeFormatter.ofPattern("dd. MM"), but that obviously hardcodes the order and the dot, which won't make American users happy (they expect slashes and the month first).
How can I internationalize a pattern? Is there some abstract syntax for separators etc.?
Well, as Ole pointed out there is no 100% satisfying solution using java.time only. But my library Time4J has found a solution based on the data of the CLDR repository (ICU4J also gives support) using the type AnnualDate (as replacement for MonthDay):
LocalDate yourLocalDate = ...;
MonthDay md = MonthDay.from(yourLocalDate);
AnnualDate ad = AnnualDate.from(md);
ChronoFormatter<AnnualDate> usStyle =
ChronoFormatter.ofStyle(DisplayMode.SHORT, Locale.US, AnnualDate.chronology());
ChronoFormatter<AnnualDate> germanStyle =
ChronoFormatter.ofStyle(DisplayMode.SHORT, Locale.GERMANY, AnnualDate.chronology());
System.out.println("US-format: " + usStyle.format(ad)); // US-format: 12/31
System.out.println("German: " + germanStyle.format(ad)); // German: 31.12.
I don’t think that a solution can be made that gives 100 % satisfactory results for all locales. Let’s give it a shot anyway.
Locale formattingLocale = Locale.getDefault(Locale.Category.FORMAT);
String formatPattern = DateTimeFormatterBuilder.getLocalizedDateTimePattern(
FormatStyle.SHORT, null, IsoChronology.INSTANCE, formattingLocale);
// If year comes first, remove it and all punctuation and space before and after it
formatPattern = formatPattern.replaceFirst("^\\W*[yu]+\\W*", "")
// If year comes last and is preceded by a space somewhere, break at the space
// (preserve any punctuation before the space)
.replaceFirst("\\s\\W*[yu]+\\W*$", "")
// Otherwise if year comes last, remove it and all punctuation and space before and after it
.replaceFirst("\\W*[yu]+\\W*$", "");
DateTimeFormatter monthDayFormatter
= DateTimeFormatter.ofPattern(formatPattern, formattingLocale);
For comparison I am printing a date both using the normal formatter with year from your question and using my prepared formatter.
LocalDate exampleDate = LocalDate.of(2020, Month.DECEMBER, 31);
System.out.format(formattingLocale, "%-11s %s%n",
exampleDate.format(DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT)),
exampleDate.format(monthDayFormatter));
Output in French locale (Locale.FRENCH):
31/12/2020 31/12
In Locale.GERMAN:
31.12.20 31.12
Edit: My German girlfriend informs me that this is wrong. We should always write a dot after each of the two numbers because both are ordinal numbers. Meno Hochschild, the German author of the other answer, also produces 31.12. with two dots for German.
In Locale.US:
12/31/20 12/31
It might make American users happy. In Swedish (Locale.forLanguageTag("sv")):
2020-12-31 12-31
In a comment I mentioned Bulgarian (bg):
31.12.20 г. 31.12
As far as I have understood, “г.” (Cyrillic g and a dot) is an abbreviation of a word that means year, so when leaving out the year, we should probably leave this abbreviation out too. I’m in doubt whether we ought to include the dot after 12.
Finally Croatian (hr):
31. 12. 2020. 31. 12.
How the code works: We first ask DateTimeFormatterBuilder for the short date format pattern for the locale. I assume that this is the pattern that your formatter from the question is also using behind the scenes (haven’t checked). I then use different regular expressions to remove the year from different variants, see the comments in the code. Year may be represented by y or u, so I take both into account (in practice y is used). Now it’s trivial to build a new formatter from the modified pattern. For the Bulgarian: from my point of view there is a flaw in Java regular expressions: by default they don’t recognize Cyrillic letters as word characters (the documentation says the same, defining a word character as [a-zA-Z_0-9]), which is why г was removed too.
We were lucky, though, in our case it produces the result that I wanted.
If you’re happy with a 90 % solution, this would be my suggestion, and I hope you can modify it to any needs your users in some locale may have.
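To make the three replacements easier to follow, here is a small sketch that applies them to a few fixed pattern strings instead of asking the JVM for a locale's pattern. The class name YearStripper and the sample patterns are assumptions made for this illustration; the patterns your Java version returns for a given locale may differ. The last lines show the default, ASCII-only behaviour of \W next to the Pattern.UNICODE_CHARACTER_CLASS flag.

```java
import java.util.regex.Pattern;

class YearStripper {
    // Same three replacements as in the answer, applied to an arbitrary pattern string.
    static String stripYear(String datePattern) {
        return datePattern
                .replaceFirst("^\\W*[yu]+\\W*", "")      // year first
                .replaceFirst("\\s\\W*[yu]+\\W*$", "")   // year last, after a space
                .replaceFirst("\\W*[yu]+\\W*$", "");     // year last otherwise
    }

    public static void main(String[] args) {
        System.out.println(stripYear("M/d/yy"));      // M/d
        System.out.println(stripYear("yyyy-MM-dd"));  // MM-dd
        System.out.println(stripYear("dd. MM. y."));  // dd. MM.
        // \u0433 is the Cyrillic letter from the Bulgarian pattern.
        System.out.println(stripYear("d.MM.yy '\u0433'.")); // d.MM

        // By default \W treats the Cyrillic letter as a non-word character ...
        System.out.println("\u0433".matches("\\W")); // true
        // ... but not when Unicode character classes are switched on.
        System.out.println(Pattern.compile("\\W", Pattern.UNICODE_CHARACTER_CLASS)
                .matcher("\u0433").matches());       // false
    }
}
```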
Link: Documentation of Java regular expressions (regex)
I have a string like this in my Java code:
17:00
I want to subtract a constant integer from it:
public static final int MAX_DUREE_TRAVAIL_JOUR = 10;
When I do this:
Integer.parseInt("17:00") - ConstantesIntervention.MAX_DUREE_TRAVAIL_JOUR
I get this error:
java.lang.NumberFormatException: For input string: "17:00"
Thx.
What are you expecting to happen? 17:00 is not a valid string representation of an integer.
You probably want to use a SimpleDateFormat to parse the string as a Date and do the time arithmetic on that.
Alternatively, take a look at the JodaTime library which provides much better handling of dates/times.
17:00 can't be converted to an integer,
because 17:00 is not a valid integer. You should split the string, use Integer.parseInt() on the parts, and then use those integers according to your business logic.
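A minimal sketch of the split-and-parse approach, assuming the constant from the question is a number of hours and only the hour part of the string matters. The class and method names (HourSubtraction, hoursRemaining) are invented for this example.

```java
class HourSubtraction {
    // Constant from the question, assumed here to be a number of hours.
    static final int MAX_DUREE_TRAVAIL_JOUR = 10;

    static int hoursRemaining(String time) {
        // "17:00" -> ["17", "00"]; only the hour part is used in this sketch.
        String[] parts = time.split(":");
        int hours = Integer.parseInt(parts[0]);
        return hours - MAX_DUREE_TRAVAIL_JOUR;
    }

    public static void main(String[] args) {
        System.out.println(hoursRemaining("17:00")); // 7
    }
}
```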
The answer by Ajai is correct.
Some advice: When working with dates and times, work with dates and times (not strings and integers). Meaning use date-time classes.
Use a good date-time library (not the mess that is java.util.Date/Calendar).
Use Joda-Time 2.3 now.
In the future, with Java 8, consider moving to JSR 310: Date and Time API which supplants the Date/Calendar classes and is inspired by Joda-Time.
P.S. Mercer, Joda-Time even knows how to speak français. See another answer of mine today for an example.
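With the JSR 310 classes (java.time in Java 8 and later), the advice above could look roughly like this. It is only a sketch: it assumes the constant from the question means hours, and the names WorkdayStart and earliestStart are hypothetical.

```java
import java.time.LocalTime;

class WorkdayStart {
    // Constant from the question, assumed here to be a number of hours.
    static final int MAX_DUREE_TRAVAIL_JOUR = 10;

    static LocalTime earliestStart(String endOfDay) {
        // LocalTime.parse accepts ISO-style times such as "17:00".
        return LocalTime.parse(endOfDay).minusHours(MAX_DUREE_TRAVAIL_JOUR);
    }

    public static void main(String[] args) {
        System.out.println(earliestStart("17:00")); // 07:00
    }
}
```

Keeping the value as a LocalTime means the result is still a time of day rather than a bare integer.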
I have a date in string format and I want to parse it into a java.util.Date.
var date ="03/11/2013"
I am parsing it like this:
new SimpleDateFormat("MM/dd/yyyy").parse(date)
But the strange thing is that if I pass "03-08-201309 hjhkjhk" or "03-88-2013" or "43-88-201378", it does not throw an error; it parses it.
Because of this, I now have to write a regex pattern to check whether the date input is correct or not.
But why is it so?
Code :
scala> val date="03/88/201309 hjhkjhk"
date: java.lang.String = 03/88/201309 hjhkjhk
scala> new SimpleDateFormat("MM/dd/yyyy").parse(date)
res5: java.util.Date = Mon May 27 00:00:00 IST 201309
You should use DateFormat.setLenient(false):
SimpleDateFormat df = new SimpleDateFormat("MM/dd/yyyy");
df.setLenient(false);
df.parse("03/88/2013"); // Throws an exception
I'm not sure that will catch everything you want - I seem to remember that even with setLenient(false) it's more lenient than you might expect - but it should catch invalid month numbers for example.
I don't think it will catch trailing text, e.g. "03/01/2013 sjsjsj". You could potentially use the overload of parse which accepts a ParsePosition, then check the current parse index after parsing has completed:
ParsePosition position = new ParsePosition(0);
Date date = dateFormat.parse(text, position);
if (position.getIndex() != text.length()) {
// Throw an exception or whatever else you want to do
}
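Putting the two ideas together (non-lenient parsing plus the ParsePosition check) might look like the sketch below. The class and method names (StrictLegacyParser, parseStrictly, isValid) are made up for this example, and as noted above this is not guaranteed to catch every kind of bad input.

```java
import java.text.ParseException;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;

class StrictLegacyParser {
    static Date parseStrictly(String text) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat("MM/dd/yyyy");
        format.setLenient(false); // rejects out-of-range fields such as day 88
        ParsePosition position = new ParsePosition(0);
        Date date = format.parse(text, position);
        if (date == null) {
            // This overload returns null on failure instead of throwing.
            throw new ParseException("Unparseable date: " + text, position.getErrorIndex());
        }
        if (position.getIndex() != text.length()) {
            // Parsing stopped early, so there is trailing text.
            throw new ParseException("Trailing text in: " + text, position.getIndex());
        }
        return date;
    }

    static boolean isValid(String text) {
        try {
            parseStrictly(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("03/11/2013"));        // true
        System.out.println(isValid("03/88/2013"));        // false
        System.out.println(isValid("03/01/2013 sjsjsj")); // false
    }
}
```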
You should also look at the Joda Time API which may well allow for a stricter interpretation - and is a generally cleaner date/time API anyway.
Jon Skeet’s answer is correct and was a good answer when it was written in 2013.
However, the classes you use in your question, SimpleDateFormat and Date, are now long outdated, so if someone got a similar issue with them today, IMHO the best answer would be to change to using the modern Java date & time API.
I am sorry I cannot write Scala code, so you will have to live with Java. I am using
private static DateTimeFormatter parseFormatter
= DateTimeFormatter.ofPattern("MM/dd/yyyy");
The format pattern letters are the same as in your question, though the meaning is slightly different. DateTimeFormatter takes the number of pattern letters literally, as we shall see. Now we try:
System.out.println(LocalDate.parse(date, parseFormatter));
Results:
"03/11/2013" is parsed into 2013-03-11 as expected. I used the modern LocalDate class, a class that represents a date without time-of-day, exactly what we need here.
Passing "03/88/2013 hjhkjhk" gives a DateTimeParseException with the message Text '03/88/2013 hjhkjhk' could not be parsed, unparsed text found at index 10. Pretty precise, isn’t it? The modern API has methods to parse only part of a string if that is what we want, though.
"03/88/201309" gives Text '03/88/201309' could not be parsed at index 6. We asked for a 4 digit year and gave it 6 digits, which leads to the objection. Apparently it detects and reports this error before trying to interpret 88 as a day of month.
It does object to a day of month of 88 too, though: "03/88/2013" gives Text '03/88/2013' could not be parsed: Invalid value for DayOfMonth (valid values 1 - 28/31): 88. Again, please enjoy how informative the message is.
"03-08-2013" (with hyphens instead of slashes) gives Text '03-08-2013' could not be parsed at index 2, not very surprising. Index 2 is where the first hyphen is.
Jon Skeet explained that the outdated SimpleDateFormat can be lenient or non-lenient. This is true for DateTimeFormatter too, in fact it has 3 instead of 2 resolver styles, called ‘lenient’, ‘smart’ and ‘strict’. Since many programmers are not aware of this, though, I think they made a good choice of not making ‘lenient’ the default (‘smart’ is).
What if we wanted to make our formatter lenient?
private static DateTimeFormatter parseFormatter
= DateTimeFormatter.ofPattern("MM/dd/yyyy")
.withResolverStyle(ResolverStyle.LENIENT);
Now it also parses "03/88/2013", into 2013-05-27. I believe this is what the old class would also have done: counting 88 days from the beginning of March gives May 27. The other error messages are still the same. In other words it still objects to unparsed text, to a 6 digit year and to hyphens.
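The three resolver styles can be compared side by side with a sketch like the following. The class name ResolverStyleDemo and the helper tryParse are invented for this example. One deliberate deviation: the pattern uses uuuu rather than yyyy, because in strict mode a year-of-era (y) without an era cannot be resolved to a date.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

class ResolverStyleDemo {
    // Returns the parsed date as text, or "rejected" if parsing fails.
    static String tryParse(String text, ResolverStyle style) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("MM/dd/uuuu")
                .withResolverStyle(style);
        try {
            return LocalDate.parse(text, formatter).toString();
        } catch (DateTimeParseException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        // February 30 does not exist; each style handles it differently.
        System.out.println(tryParse("02/30/2013", ResolverStyle.LENIENT)); // 2013-03-02
        System.out.println(tryParse("02/30/2013", ResolverStyle.SMART));   // 2013-02-28
        System.out.println(tryParse("02/30/2013", ResolverStyle.STRICT));  // rejected
    }
}
```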
Question: Can I use the modern API with my Java version?
If using at least Java 6, you can.
In Java 8 and later the new API comes built-in.
In Java 6 and 7 get the ThreeTen Backport, the backport of the new classes (that’s ThreeTen for JSR-310, where the modern API was first defined).
On Android, use the Android edition of ThreeTen Backport. It’s called ThreeTenABP, and I think that there’s a wonderful explanation in this question: How to use ThreeTenABP in Android Project.
I want to define a pattern for the Java SimpleDateFormat to parse existing strings.
The existing dates look like this: 2011-05-02T13:40:00+02:00.
I tried with different patterns, but I got ParseExceptions. The problem seems to be the timezone format.
Printing the pattern in Java:
yyyy-MM-dd'T'HH:mm:ssZ
2012-03-14T15:40:44+0100
yyyy-MM-dd'T'HH:mm:ssz
2012-03-14T15:41:58MEZ
But how can I get
???
2011-05-02T13:40:00+02:00
I'm using Java 6, not Java 7.
If you can use Java 7 or newer, you can use the XXX pattern to get the timezone to look like +02:00:
yyyy-MM-dd'T'HH:mm:ssXXX
Otherwise you might have to manipulate the date string to remove the colon from the timezone before parsing it.
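For the Java 6 case, the string manipulation mentioned above could be sketched like this. The class and method names (OffsetColonWorkaround, parseIsoWithColonOffset) are hypothetical, and the regex assumes the offset always appears at the end of the string in the form +hh:mm or -hh:mm.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class OffsetColonWorkaround {
    static Date parseIsoWithColonOffset(String text) {
        // Turn a trailing "+02:00" into "+0200", which the Z pattern letter accepts.
        String withoutColon = text.replaceAll("([+-]\\d{2}):(\\d{2})$", "$1$2");
        try {
            return new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ").parse(withoutColon);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a valid timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date parsed = parseIsoWithColonOffset("2011-05-02T13:40:00+02:00");
        // Milliseconds since the epoch; independent of the JVM's default time zone.
        System.out.println(parsed.getTime()); // 1304336400000
    }
}
```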
I know this is a rather old question, but someone else might benefit from my hint.
You can use JodaTime. As the library documentation states:
Zone: 'Z' outputs offset without a colon, 'ZZ' outputs the offset with
a colon, 'ZZZ' or more outputs the zone id.
It works with Java 6 as well. There are more examples in this question.