I have searched throughout the site but I think I have a slightly different issue and could really do with some help before I either have heart failure or burn the computer.
I dynamically generate a list of month names (in the form June 2011, July 2011) and obviously I want this to be locale sensitive: hence I use the simple date format object as follows:
//the actual locale name is dependent on UI selection
Locale localeObject=new Locale("pl");
// intended to return full month name - in local language.
DateFormat dtFormat = new SimpleDateFormat("MMMM yyyy",localeObject);
//this bit just sets up a calendar (used for other bits, but shown here to illustrate the issue)
String systemTimeZoneName = "GMT";
TimeZone systemTimeZone=TimeZone.getTimeZone(systemTimeZoneName);
Calendar mCal = new GregorianCalendar(systemTimeZone); //"gmt" time
mCal.getTime(); //current date and time
but if I do this:
String value=dtFormat.format(mCal.getTime());
this "should" return the localized version of the month name. In Polish the word "September" is "Wrzesień" -- note the accent on the n. However, all I get back is "Wrzesie?"
What am I doing wrong?
Thanks to all. I accept now that it's a presentation issue, but how can I "read" the result from dtFormat safely? I added some comments below about using getBytes etc. That worked in other situations, but I just can't seem to get at the string result without messing it up.
-- FINAL edit, for anyone who comes across this issue
The answer was on BalusC's blog : http://balusc.blogspot.com/2009/05/unicode-how-to-get-characters-right.html#DevelopmentEnvironment
Basically, the dtFormat object was returning UTF-8, which was being automatically transformed back to the system default character set when I read it into a string,
so this code worked for me:
new String(dtFormat.format(mCal.getTime()).getBytes("UTF-8"),"ISO-8859-1");
thank you very much for the assistance
Your problem has nothing to do with SimpleDateFormat - you're just doing the wrong thing with the result.
You haven't told us what you're doing with the string afterwards - how you're displaying it in the UI - but that's the problem. You can see that it's fetching a localized string; it's only the display of the accented character which is causing a problem. You would see exactly the same thing if you had a string constant in there containing the same accented character.
I suggest you check all the encodings used throughout your app if it's a web app, or check the font you're displaying the string in if it's a console or Swing app.
If you examine the string in the debugger I'm sure you'll see it's got exactly the right characters - it's just how they're getting to the user which is the problem.
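If no debugger is at hand, the same check can be done in code. The following is only a rough sketch: the class and method names (EncodingCheck, roundTrip, codePoints) are made up for illustration, and it uses a string constant rather than the formatter output, since the point above is that a constant behaves the same way. It prints the code points of the string and shows that pushing it through ISO-8859-1 is one way to end up with the "?" from the question, while UTF-8 preserves the character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // Encode the text with the given charset and decode it again with the same one.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // List every character as U+XXXX so that no console font or encoding can hide it.
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The last character is written as an escape (U+0144, n with acute) so that
        // the encoding of this source file cannot interfere with the demonstration.
        String month = "Wrzesie\u0144";

        // The string itself holds the right character.
        System.out.println(codePoints(month));

        // ISO-8859-1 has no mapping for U+0144, so the encoder substitutes '?'.
        System.out.println(roundTrip(month, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent it, so the round trip is lossless.
        System.out.println(roundTrip(month, StandardCharsets.UTF_8).equals(month));
    }
}
```

If the code points are right but the output is not, the damage is happening somewhere between the string and the screen (response encoding, console code page, font), not in SimpleDateFormat.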
In my tests, dtFormat.format(mCal.getTime()) returns
październik 2011
DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, localeObject).format(mCal.getTime()) returns:
poniedziałek, 3 październik 2011 14:26:53 EDT
I'm using the en_GB locale, but a similar issue may also affect other en_XX locales.
Under Java 15 the following code works:
LocalDate.parse("10-Sep-17", DateTimeFormatter.ofPattern("dd-MMM-yy", Locale.UK));
Under Java 16 it gives: DateTimeParseException: Text '10-Sep-17' could not be parsed at index 3
After spending a long time in the debugger I have traced this to this commit: 8251317: Support for CLDR version 38
This commit changes the abbreviated form of September in make/data/cldr/common/main/en_GB.xml from Sep to Sept for both the context-sensitive and standalone forms. None of the other months are touched, remaining as 3 characters.
I have verified that this is indeed a genuine change between CLDR versions 37 and 38, although I'm not sure when we Brits switched to using 4 letters for our 3-letter abbreviation for September...
Now this is annoying, as it has broken my datafile processing (although I suspect I can fix it by specifying Locale.ENGLISH rather than using the default locale in my code), but I can't decide if it counts as a bug that has been introduced that breaks my reliable 3-character-month match pattern, or whether this is actually meant to be a feature.
The JavaDoc says:
Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form. ...
and later:
Number/Text: If the count of pattern letters is 3 or greater, use the Text rules above. Otherwise use the Number rules above.
My bad for never having read this carefully enough to spot that textual values are handled differently to numbers, where the number of letters in your pattern sets the width. But this leaves me wondering how you are supposed to specify a fixed number of characters when you output a month, and equally why it can't be permissive and accept the three-character form when parsing rather than throw an exception?
At the end of the day this still feels like a regression to me. My code, which has worked reliably for years parsing dates with 3-character months, now fails on all dates in September, with no warning. Am I wrong to think this feels incorrect?
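For what it's worth, here is a sketch of two possible workarounds. The first is the Locale.ENGLISH idea mentioned above. The second does not depend on locale data at all: it supplies its own month table through DateTimeFormatterBuilder.appendText, so the abbreviations are always exactly three characters for both parsing and output. The class name FixedMonthParsing and the month table are illustrative assumptions, not anything taken from the JDK or CLDR.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class FixedMonthParsing {
    // Workaround 1: pin the locale instead of relying on the default (en_GB) one.
    static final DateTimeFormatter ENGLISH =
            DateTimeFormatter.ofPattern("dd-MMM-yy", Locale.ENGLISH);

    // Workaround 2: supply the month texts explicitly, independent of locale data.
    static final DateTimeFormatter FIXED = buildFixed();

    private static DateTimeFormatter buildFixed() {
        String[] names = {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
                          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"};
        Map<Long, String> months = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            months.put((long) (i + 1), names[i]);
        }
        return new DateTimeFormatterBuilder()
                .parseCaseInsensitive()                          // also accept "sep" or "SEP"
                .appendPattern("dd-")
                .appendText(ChronoField.MONTH_OF_YEAR, months)   // our own three-letter table
                .appendPattern("-yy")
                .toFormatter(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(LocalDate.parse("10-Sep-17", ENGLISH));
        System.out.println(LocalDate.parse("10-Sep-17", FIXED));
        System.out.println(FIXED.format(LocalDate.of(2017, 9, 10)));
    }
}
```

The explicit table is more verbose, but it keeps a data-file format from changing whenever the locale data is updated.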
I wonder if it's possible to parse (or at least try to parse) any string to an SQL Date without specifying the string format? In other words, I want to write a generic method that takes a string as input and returns an SQL Date.
For instance I have:
String date1="31/12/2099";
String date2="31-12-2099";
and call parseToSqlDate(date1) and parseToSqlDate(date2), which will return SQL dates.
Short answer: No
Why: Parsing any string to a valid date is a task you as an intelligent being could not do (there is no "logical" way to determine the correct date), so you cannot "tell" a computer(program) to do that for you (see JGrice's comment, and there we still have 4-digit years).
Long answer: Maybe, if you are willing to take risks or do not need a high rate of success.
How:
Define your minimal (format) requirements of a date. E.g. "a minimal date contains 1-8 digits": 01/01/2001, 01-01-01, 01.01 (+current year), 1.1 (+current year), 1 (+current month + current year), and/or "a date contains 1-6 digits and the letters of a month name": 01-Jan-2001, and so on.
Split the input along any non-number/non-month-name characters, with a regex like [^0-9a-zA-Z] (quick thought, may hold some pitfalls)
You now have 1 to 3 (actually more if e.g. the time is included) separate numbers + 1 month name which can be aligned for year/month/day any way you like
For this "alignment", there are several possibilities:
Try a fixed format at first, if it "fits", take it, else try another (or fail)
(only if you get more than one entry at a time) guess the format by assuming all entries share the same one (e.g. any number block containing values > 12 is not a month, and any block containing values > 31 is not a day)
BUT, and this is a big one, you can expect any such method to become a major PITA at some point, because you can never fully "trust" it to guess correctly (you can never be sure that you have not missed some special format or introduced some ambiguous interpretation). I outlined some cases/formats, but definitely not all of them, so you will be refining that method very often if you actually use it.
Appendix to your comment: "May be to add another parameter and in this way to know where goes day , month and so on?" So you are willing to add a "pseudo-format-string" parameter specifying the order of day, month and year; that would make it a lot easier (as "simply" filtering out the delimiters would then be enough).
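A minimal sketch of the "try a fixed format first, else try another" approach follows. The candidate list and the class name LenientSqlDateParser are assumptions chosen for illustration; in particular the list assumes day-first input, so 01/02/2099 is read as 1 February. It is not a general solution, for exactly the reasons given above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.List;

class LenientSqlDateParser {
    // Formats are tried in this order; the first one that fits wins.
    private static final List<DateTimeFormatter> CANDIDATES = List.of(
            strict("dd/MM/uuuu"),
            strict("dd-MM-uuuu"),
            strict("dd.MM.uuuu"),
            strict("uuuu-MM-dd"));

    // STRICT resolution rejects impossible dates such as 31/02/2099 instead of adjusting them.
    private static DateTimeFormatter strict(String pattern) {
        return DateTimeFormatter.ofPattern(pattern).withResolverStyle(ResolverStyle.STRICT);
    }

    static java.sql.Date parseToSqlDate(String text) {
        for (DateTimeFormatter candidate : CANDIDATES) {
            try {
                return java.sql.Date.valueOf(LocalDate.parse(text.trim(), candidate));
            } catch (DateTimeParseException e) {
                // This format does not fit; fall through to the next candidate.
            }
        }
        throw new IllegalArgumentException("No known date format matches: " + text);
    }

    public static void main(String[] args) {
        System.out.println(parseToSqlDate("31/12/2099"));
        System.out.println(parseToSqlDate("31-12-2099"));
    }
}
```

Failing loudly when nothing matches is a deliberate choice here: a wrong guess stored in a database is harder to find later than an exception.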
I'm experimenting on a Java Program that makes use of Jasper Reports. What started out as a (supposedly) simple "arrange the dates in descending order" task for the Report became more complex when I found out that the 'dates' were in String format, and thus, were being sorted, albeit in a wrong manner. Example:
03/26/12
03/26/12
08/11/12
08/26/12
10/26/11
I can only guess that the 10/26/11 is placed on the bottom simply because of the 10 in front.
I've looked into the Jasper Report using iReport 3.0.0, and I've found the following:
The date in question (named: DTEEFFEC), under Fields, is set to String.
The textField is also set to String.
This doesn't produce any errors, it only makes it difficult, if not impossible, to arrange the 'dates' in descending order.
So I've done the following:
Left DTEEFFEC as is (String).
Changed the textField from java.lang.String to java.util.Date
Added the following to the New Field Expression:
new SimpleDateFormat("MM-dd-yyyy").parse($F{DTEEFFEC}.toString())
I found out that bit of code after some research on my problem. A lot of the responses were along the lines of "it worked", but not so for me.
Caused by: java.text.ParseException: Unparseable date: "03/26/2012"
That is what the Java program returns. I've tried tinkering with both the field and the textField (alternating between either String or Date values), but it gives me other errors entirely.
Can I have some help on this?
Thanks.
Other information: I'm using iReports 3.0.0 to modify the JRXML file, and Eclipse for the Java Program. If the Referenced Libraries under Eclipse is to be believed, I'm using JasperReports 3.5.2. The entire thing runs on Windows 7.
Look at your code:
new SimpleDateFormat("MM-dd-yyyy").parse(...)
That's clearly expecting something of the form "MM-dd-yyyy" such as "03-26-2012".
Now look at your actual data: "03/26/2012". (Apparently, even though your earlier samples were two-digit years...)
That's got slashes, not dashes. So you need to change your pattern appropriately:
new SimpleDateFormat("MM/dd/yyyy").parse(...)
Change new SimpleDateFormat("MM-dd-yyyy") to new SimpleDateFormat("MM/dd/yyyy")
so the parser can correctly parse your date input. (Keep the uppercase MM: lowercase mm means minutes, not months.)
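To see the difference outside of JasperReports, here is a small stand-alone sketch. The class PatternMismatch and its helper methods are hypothetical names used only for this illustration. It tries the dash pattern, the slash pattern, and a lowercase mm variant against the value from the error message.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

class PatternMismatch {
    // Returns the parsed date, or null when the text does not match the pattern.
    static Date parseOrNull(String pattern, String text) {
        try {
            return new SimpleDateFormat(pattern).parse(text);
        } catch (ParseException e) {
            return null;
        }
    }

    // Month of the date as 1..12, evaluated in the default time zone.
    static int monthOf(Date date) {
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(date);
        return calendar.get(Calendar.MONTH) + 1;
    }

    public static void main(String[] args) {
        String input = "03/26/2012";

        // Dashes in the pattern, slashes in the data: unparseable.
        System.out.println(parseOrNull("MM-dd-yyyy", input));

        // Matching separators and uppercase MM: the month is March.
        System.out.println(monthOf(parseOrNull("MM/dd/yyyy", input)));

        // Lowercase mm is minutes, so "03" is not read as the month at all
        // and the month silently falls back to January.
        System.out.println(monthOf(parseOrNull("mm/dd/yyyy", input)));
    }
}
```

The lowercase variant is the dangerous one, because it parses without any error and only the resulting dates are wrong.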
I've found strange behavior with the format tag library. I'm formatting a copyright message in the footer of a webpage. I'm using the following pseudo code:
<fmt:message var="copyright" key="someKey">
<fmt:param value="${year}"/>
</fmt:message>
...
<c:out value="${copyright}"/>
I'm just passing the year as an argument into the resource bundle. If you c-out the year value before passing it in:
<c:out value="${year}"/>
<%-- renders as 2012 --%>
But after the argument gets passed in, the year gets formatted as a number. The number is rendered as 2,012.
I've googled and asked around and haven't found anything besides the generic Oracle documentation (http://docs.oracle.com/javaee/5/jstl/1.1/docs/tlddocs/fmt/tld-summary.html)
Has anyone else run into this?
Thanks in advance.
I had the same issue but after playing around discovered that only number types will be formatted. If you make year a string first then it won't:
Calendar cal = Calendar.getInstance();
int currYear = cal.get(Calendar.YEAR);
String cYear = Integer.toString(currYear);
<fmt:message key="msg.parameterized"><fmt:param value="<%=currYear%>"/></fmt:message>
<fmt:message key="msg.parameterized"><fmt:param value="<%=cYear%>"/></fmt:message>
The first one will contain 2,012 and the second just 2012
It's being interpreted as a Number by MessageFormat and is hence formatted with a thousands separator, which can be a comma or a dot depending on the current locale. You can prevent it from being interpreted as a Number by adding a zero width space:
<fmt:param value="${year}&#8203;"/>
Whilst BalusC's answer is simple and superficially effective, it feels a little, well, impure to me. For a start, someone else might come along one day and wonder what on Earth that extra character is for and maybe even remove it.
Since <fmt:message /> uses Java's built-in MessageFormat class under the hood, we can just insert a formatting pattern in the ResourceBundle's message string itself.
For example, in your ResourceBundle you could have:
someKey = Copyright (c) {0,number,#} ACME Inc.
The # here can in fact be any format string as documented in the DecimalFormat class. In this case, # alone just outputs the number without any additional formatting.
As an aside, since in this particular instance you want to output a year, you could pass a java.util.Date instance as the value in <fmt:param /> and use the following in your ResourceBundle:
someKey = Copyright (c) {0,date,yyyy} ACME Inc.
In this case, any SimpleDateFormat format string can be used instead of yyyy.
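The behaviour can be reproduced with MessageFormat directly, without any JSP. This is just a sketch: the class name YearInMessage and the render helper are invented for illustration, and Locale.US is fixed so that the grouping separator is predictable.

```java
import java.text.MessageFormat;
import java.util.Locale;

class YearInMessage {
    // Formats a single-argument message in a fixed locale.
    static String render(String pattern, Object argument) {
        return new MessageFormat(pattern, Locale.US).format(new Object[] {argument});
    }

    public static void main(String[] args) {
        // A Number argument with a plain {0} placeholder gets locale-specific grouping.
        System.out.println(render("Copyright (c) {0} ACME Inc.", 2012));

        // A String argument is inserted as-is.
        System.out.println(render("Copyright (c) {0} ACME Inc.", "2012"));

        // A Number argument with an explicit DecimalFormat pattern: no grouping.
        System.out.println(render("Copyright (c) {0,number,#} ACME Inc.", 2012));
    }
}
```

With the pattern kept in the resource bundle, the page can keep passing the year as a number and the formatting decision lives next to the message text.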
I've been trying to isolate a bug in my application. I succeeded in producing the following "riddle":
SimpleDateFormat f1 = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
SimpleDateFormat f2 = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
Date d = f1.parse("2012-01-01T00:00:00+0700");
String s1 = f1.format(d); // 2011-12-31T18:00:00+0700
String s2 = f2.format(d); // 2011-12-31T18:00:00+0100
I get the values in comments when I run this code on Android API 7 (yes, really). This behavior depends on particular Java implementation.
My questions are:
Why does s1 not equal s2?
And more importantly, why is s1 incorrect? While s2 points to the proper point in time, s1 does not. There seems to be a bug in Android's SimpleDateFormat implementation.
ANSWER TO QUESTION 1: See the answer by BalusC:
[After using SimpleDateFormat#parse] any TimeZone value that has previously been set by a call to setTimeZone may need to be restored for further operations.
ANSWER TO QUESTION 2: See the answer by wrygiel (myself).
This is due to a bug in Android 2.1 (API 7).
This is mentioned in javadoc of DateFormat#parse():
Parse a date/time string according to the given parse position. For example, a time text "07/10/96 4:5 PM, PDT" will be parsed into a Date that is equivalent to Date(837039900000L).
By default, parsing is lenient: If the input is not in the form used by this object's format method but can still be parsed as a date, then the parse succeeds. Clients may insist on strict adherence to the format by calling setLenient(false).
This parsing operation uses the calendar to produce a Date. As a result, the calendar's date-time fields and the TimeZone value may have been overwritten, depending on subclass implementations. Any TimeZone value that has previously been set by a call to setTimeZone may need to be restored for further operations.
Note the last paragraph. It unfortunately doesn't explain when exactly this will occur. To fix your particular problem you need to explicitly set the desired timezone before the formatting operation.
As to the mutability of SimpleDateFormat itself, this has been known for years. You should never create and assign an instance of it as a static or class variable, but always as a method-local (thread-local) variable.
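A sketch of that advice, assuming the output should be in a zone chosen by the caller: the formatter is created inside the method, and the time zone is set explicitly after parsing and before formatting, so the result does not depend on whatever parse() may have done to the internal calendar. The class name RestoreZoneAfterParse and the method parseThenFormat are illustrative only.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class RestoreZoneAfterParse {
    static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ssZ";

    static String parseThenFormat(String text, TimeZone outputZone) {
        // A fresh, method-local instance: nothing is shared between threads or calls.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
        try {
            Date parsed = format.parse(text);
            // parse() may have replaced the formatter's time zone, so set it again explicitly.
            format.setTimeZone(outputZone);
            return format.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        TimeZone plusOne = TimeZone.getTimeZone("GMT+01:00");
        System.out.println(parseThenFormat("2012-01-01T00:00:00+0700", plusOne));
    }
}
```

Setting the zone right before format() costs one line and removes the dependency on implementation-specific behaviour of parse().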
This is due to a bug in Android 2.1 (API 7). It seems that Android programmers missed some undocumented Java behavior (which is classified as an unfixable bug itself!) in their implementation of Android 2.1.
Your question intrigued me so I went ahead and compiled your code. The result? As expected...
2011-12-31T18:00:00+0100
2011-12-31T18:00:00+0100
The two values are the same. Are you using some concurrency? Maybe the variable gets changed on another thread right before the f2.format(d) call.
I tried to compare s1 and s2 by running the same program. They come out equal for me.