Investigating a Java bug regarding String.valueOf(float) - java

In Java, is it possible that String.valueOf(float) would format a float differently depending on the operating system the code runs on, the Java version, and/or the operating system's locale?
For example, would the float 4.5 ever be formatted as "4,5" instead of "4.5"?

String.valueOf(float) calls Float.toString().
Float.toString() calls the internal sun.misc.FloatingDecimal.toJavaFormatString(float).
The resulting string will never contain a ',' because of the hard-coded '.' (ASCII 46) inside BinaryToASCIIBuffer.getChars(char[]).
You can see this by decompiling the sun.misc.FloatingDecimal class (in my case from the Java 8 JDK) or by looking at the (similar) implementation in OpenJDK.
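To illustrate, here is a small sketch (Locale.GERMANY is just an example of a locale that uses a comma as its decimal separator):

import java.util.Locale;

public class FloatFormatCheck {
    public static void main(String[] args) {
        float f = 4.5f;
        // Locale-independent: always '.' regardless of OS, JDK version, or default locale.
        System.out.println(String.valueOf(f)); // 4.5
        // Locale-sensitive formatting, by contrast, honors the decimal separator.
        System.out.println(String.format(Locale.GERMANY, "%.1f", f)); // 4,5
    }
}

So if you need a locale-aware rendering, use String.format or NumberFormat; String.valueOf(float) itself will never produce "4,5".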

Related

How to use unsupported Locale in Java 11 and numbers in String.format()

How can I use an unsupported Locale (e.g. ar-US) in Java 11 when I output a number via String.format()?
In Java 8 this worked just fine (try jdoodle, select JDK 1.8.0_66):
Locale locale = Locale.forLanguageTag("ar-US");
System.out.println(String.format(locale, "Output: %d", 120));
// Output: 120
Since Java 11 the output is in Eastern Arabic numerals (try jdoodle, use default JDK 11.0.4):
Locale locale = Locale.forLanguageTag("ar-US");
System.out.println(String.format(locale, "Output: %d", 120));
// Output: ١٢٠
It seems this problem comes from the switch of the locale data providers from JRE to CLDR (source: Localization Changes in Java 9 by @mcarth). Here is a list of supported locales: JDK 11 Supported Locales
UPDATE
I updated the question's example to ar-US, as my earlier example didn't make sense. The idea is to have a format that makes sense in the given country; in the example that would be the United States (US).
The behavior conforms to CLDR being treated as the preferred locale data provider. To confirm this, the same snippet could be executed on Java 8 with
-Djava.locale.providers=CLDR
If you step back and look at JEP 252: Use CLDR Locale Data by Default, the details follow:
The default lookup order will be CLDR, COMPAT, SPI, where COMPAT
designates the JRE's locale data in JDK 9. If a particular provider
cannot offer the requested locale data, the search will proceed to the
next provider in order.
So, in short, if you really don't want the default behavior to be that of Java 11, you can change the lookup order with the VM argument
-Djava.locale.providers=COMPAT,CLDR,SPI
What might help further is understanding more about picking the right language using CLDR!
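If changing the provider order is not an option, another possibility (a sketch; it relies on the CLDR provider honoring the Unicode "nu" numbering-system extension, so verify it on your exact JDK) is to request Latin digits explicitly in the language tag:

import java.util.Locale;

public class LatinDigitsDemo {
    public static void main(String[] args) {
        // "-u-nu-latn" requests the Latin (Western) numbering system.
        Locale locale = Locale.forLanguageTag("ar-US-u-nu-latn");
        System.out.println(String.format(locale, "Output: %d", 120));
        // Output: 120
    }
}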
I'm sure I'm missing some nuance, but the problem is with your tag, so fix that. Specifically:
ar-EN makes no sense. That's short for:
language = Arabic
country = ?? nobody knows.
EN is not a country. en is certainly a language code (for English), but the second part in a language tag is the country, and EN is not a country. (For context, there is en-GB for British English and en-US for American English.)
Thus, this is as good as ar (as in, language = Arabic, not tied to any particular country). Even if you did tie it to some country, that is mostly immaterial here; that would affect things like 'what is the first day of the week', 'which currency symbol is to be presumed', and perhaps 'should temperatures be stated in Kelvin or Fahrenheit'. It has no bearing on how to show digits, because that is all based on language.
And the language is Arabic; thus, ١٢٠ is what you get when you use ar as a language tag when printing the number 120. The problem is that you expect this to return "120", which is a bizarre wish¹, combined with the fact that Java, unfortunately, shipped with a bug for a long, long time that made it act in this bizarre fashion, thinking that rendering the number 120 in Arabic is best done as "120", which is wrong.
So, with that context, in order of preference:
Best solution
Find out why your system ends up with ar-EN and nevertheless expects '120', and fix this. Also fix ar-EN in general; EN is not a country.
More generally, an 'unsupported locale' isn't really a thing. The ar part is supported, and it's the only relevant part of the tag for rendering digits.
Alternatives
If the above is not possible, the most likely best answer is to work around it explicitly: detect the tag yourself and write code that responds with the result of formatting the number using Locale.ENGLISH instead, guaranteeing that you get Output: 120. The remaining options seem considerably worse. You could try to write a localization provider, which is a ton of work, or you could tell Java to use the JRE version of the provider, but that one is obsolete and will not be updated, so you would be kicking the can down the road and setting yourself up for a maintenance burden later.
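A minimal sketch of that workaround (the helper formatForTag is a made-up name for illustration):

import java.util.Locale;

public class NumberFormatFallback {
    // Hypothetical helper: fall back to English formatting for Arabic tags.
    static String formatForTag(String languageTag, int value) {
        Locale locale = Locale.forLanguageTag(languageTag);
        if ("ar".equals(locale.getLanguage())) {
            locale = Locale.ENGLISH; // forces "120" instead of "١٢٠"
        }
        return String.format(locale, "Output: %d", value);
    }

    public static void main(String[] args) {
        System.out.println(formatForTag("ar-US", 120)); // Output: 120
    }
}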
1.) Given that the JRE variant actually printed 120, and you're also indicating you want this, I get the nagging feeling I'm missing some political or historical context, and the expectation that ar-EN results in rendering the number 120 as "120" is not so crazy. I'd love to hear that story if you care to provide it!

Compare strings below API 18

I searched and learned that == should not be used to compare the content of String variables; equals() should be used instead.
However, Android Studio reports that equals() is only available on API 19 (Android 4.4) and up, and I targeted API 18 (my only phone runs Android 4.3).
So right now I'm doing if (var1.contains(var2)) or if (array[i].contains(var)) to compare strings, and it works, but it doesn't seem correct to me.
What would be the correct way to achieve this on API < 19?
Thanks.
Edit: for clarification (I don't know how to put inline images)
With ==
With equals()
Comparison fails with equals().
The equals method of the Object class was added in Java JDK 1.0.
That version was released on January 23, 1996. The language was called 'Oak' back then, so technically the method predates even the name Java itself (source).
In contrast, Android API 1 was released on September 23, 2008. At that time it would have been built with Java JDK 1.5 (the latest version was Java SE 5 Update 16).
So, in conclusion, equals is available on API level 18; there must be some other error.
After seeing the posted code: you are using Objects.equals(), a utility method (added in API level 19) that checks equals() in a null-safe manner.
In many cases, like yours, you don't need the extra null check because you know at least one of the objects is not null, so you can just call equals directly:
if("una".equals(hourNames[realHour]))
Your hourNames array will probably not contain null elements so you should turn it around to the more readable order:
if(hourNames[realHour].equals("una"))
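For context, Objects.equals(a, b) is essentially this null-safe wrapper (a sketch matching its documented behavior):

// Equivalent to what java.util.Objects.equals(a, b) does:
static boolean nullSafeEquals(Object a, Object b) {
    return (a == b) || (a != null && a.equals(b));
}

Since plain String.equals() has existed since JDK 1.0, it is available on every Android API level; only the Objects utility class needs API 19.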
use hourNames[realHour].equals("una")
Yeah, that is a bug in Android Studio (or maybe in IntelliJ IDEA).
Since if (var1.contains(var2)) works for you, why don't you post the exact strings? In case you don't have debugging capability, I would suggest using these to find where the strings differ:
boolean contentEquals(CharSequence cs)
public int compareTo(String anotherString)
https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#compareTo(java.lang.String)
Also, you could use the emulator environment and run this small test:
String s1 = "Test";
String s2 = "Test";
if (s1.equals(s2))
System.out.println("Equal");
else
System.out.println("Not Equal");

Java string split gives different outputs on Windows and linux

Please see the code below:
String s11 ="!country=India ";
String[] ss =s11.split("((?<=[!&|])|(?=[!&|]))");
System.out.println(ss.length);
for(String s :ss) {
System.out.println(s);
}
On Windows it gives
2
!
country=India
Whereas with Ubuntu it gives
3

!
country=India
(the extra first element is an empty string, which prints as the blank line above)
Why would that be?
This behavior is not because of the different operating systems; most likely, different versions of the JVM are being used.
This "behavior change" has caused bugs to be filed incorrectly for Java 8.
The documentation was updated for JDK 8, and the change is also discussed at length in this question: split in Java 8 removes empty strings at the start of the result array when they are produced by a zero-width match. This is why the additional empty string before the ! is not created (hence the length of 2 instead of 3).
Notice the difference in the documentation for the split() method in Java 7 and Java 8, for the Pattern class and the String class (Java 7, Java 8) respectively. See the linked question for further information.
I have also reproduced this on Java 7: sun-jdk-1.7.0_10 (ideone) and Java 8: sun-jdk-8u25 (ideone). Java 8's split will not add the extra empty string to the array, while Java 7's split will.
So it is not because the system is Linux or Windows, but rather because of the JVM version. You can double-check your JVM's version with java -version.
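As a self-contained illustration of the rule change (the expected outputs follow the Java 7 vs. Java 8 String.split documentation):

public class SplitDemo {
    public static void main(String[] args) {
        // The pattern matches at position 0 with zero width (lookahead before '!').
        String[] parts = "!country=India".split("((?<=[!&|])|(?=[!&|]))");
        // Java 7:  ["", "!", "country=India"] -> length 3
        // Java 8+: ["!", "country=India"]     -> length 2
        System.out.println(parts.length);
        for (String p : parts) {
            System.out.println("[" + p + "]"); // brackets make empty elements visible
        }
    }
}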

Default number system and character set in Java

This is a fundamental question about how Java works, so I don't have any code to support it.
I am new to Java development and want to know how the different number systems and character sets, like UTF-8 and Unicode, come together in Java.
Let's say a user creates a new string and an int with the same value.
int i = 100;
String s = "100";
The hardware of a computer understands zeros and ones, so these have to be converted to binary? (Correct me if I'm wrong.) This conversion should be done by the JVM? And to represent characters of different languages with the characters that can be typed on an (English) keyboard, encodings such as UTF-8 are used? (Corrections welcome.)
Now, how does this whole flow fit into the bigger picture of running a Java web application?
How does a string/int get converted to binary for the machine's hardware to understand?
How does it get converted to UTF-8 for a browser to understand?
And what are the default number format and character set in Java? If I'm reading the contents of a file, will they be read as binary or UTF-8?
All computers run in binary. The conversion is done by the JVM and the computer you have; you shouldn't worry about converting the code into the corresponding 1s and 0s. The browser has its own conversion code to change the universal 1s and 0s (used by all programs and computer software) into however it decides to display the given information. All languages are just a translation guide for the user to 'speak' with the computer, and vice versa. Hope this helps, though I don't think I really answered anything.
How Java represents any data type in memory is the choice of the actual JVM. In practice, the JVM will choose the format native to the processor (e.g., choosing between little and big endian for int), simply because it offers the best performance on that platform.
Basically, the JLS makes certain guarantees (such as that a byte has 8 bits and its values range from -128 to 127), and the VM just maps those to the platform as it deems suitable (the JLS was specified to match common computing technology closely, so there is usually no magic needed to guess how primitive types map to the platform).
You should never care how the VM represents data in memory; Java does not offer any legal way to access the data in a manner where you would need to know (bypassing most of the VM's logic by using sun.misc.Unsafe is not considered legal).
If you care for educational purposes, learn what binary representations the underlying platform (e.g., x86) uses and take a look at the VM. It has little to do with Java, really; it's all VM- and platform-specific.
For java.lang.String, it is the implementation of the class that defines how the string is stored internally (it went through quite some changes across major Java versions), but what a String exposes is quite narrowly defined (see the JDK javadoc for String.length() and String.charAt()).
As for how user input is translated into Java's standard types, that is actually platform-specific. The JVM selects the default encoding (e.g., String.getBytes() can return quite different results for the same string depending on the platform; that is why it is recommended to explicitly specify the desired encoding). The same goes for many other things (time zone, number format, etc.).
Charsets and formats are building blocks the program wires up to translate data from the outside world (a file, HTTP, or user input) into Java's representation of data (or vice versa). For example, a web application will use the encoding from an HTTP header to determine which Charset to use when interpreting the content (the HTTP headers themselves are defined to be US-ASCII by the spec).
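A small sketch of why the explicit encoding matters (the byte counts assume the standard UTF-8 and ISO-8859-1 mappings of 'é'):

import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    public static void main(String[] args) {
        String s = "café";
        // Platform-dependent: uses the JVM's default charset.
        byte[] defaultBytes = s.getBytes();
        // Deterministic: the same bytes on every platform.
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);        // 5 bytes ('é' takes 2)
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1); // 4 bytes ('é' takes 1)
        System.out.println(defaultBytes.length + " " + utf8.length + " " + latin1.length);
    }
}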

Why are there no binary literals in Java?

Is there any particular reason why this kind of literal is not included whereas hex and octal formats are allowed?
Java 7 includes it. Check the new features.
Example:
int binary = 0b1001_1001;
Binary literals were introduced in Java 7. See "Improved Integer Literals":
int i = 0b1001001;
The reason for not including them from day one is most likely the following: Java is a high-level language and has been quite restrictive when it comes to language constructs that are less important and low level. Java developers have had a general policy of "if in doubt, keep it out".
If you're on Java 6 or older, your best option is to do
int yourInteger = Integer.parseInt("100100101", 2);
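For completeness, a quick sketch showing the two approaches produce the same value (note that Integer.parseInt does not accept the underscores that literals allow):

public class BinaryLiteralDemo {
    public static void main(String[] args) {
        int fromString = Integer.parseInt("10011001", 2); // works on Java 6 and older
        int fromLiteral = 0b1001_1001;                    // Java 7 and later only
        System.out.println(fromString == fromLiteral);    // true (both are 153)
    }
}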
Actually, it is; in Java 7.
http://code.joejag.com/2009/new-language-features-in-java-7/
The associated bug has been open since April 2004, has low priority, and is considered a request for enhancement by Sun/Oracle.
I guess they think binary literals would make the language more complex without providing obvious benefits...
There seems to be an impression here that implementing binary literals is complex. It isn't. It would take about five minutes. Plus the test cases of course.
Java 7 does allow binary literals!
Check this:
int binVal = 0b11010;
at this link:
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
