Java String/Char charAt() Comparison - java

I have seen various comparisons that you can do with the charAt() method.
However, I can't really understand a few of them.
String str = "asdf";
str.charAt(0) == '-'; // What does it mean when it's equal to '-'?
char c = '3';
if (c < '9') // How are char variables compared with the `<` operator?
Any help would be appreciated.

// What does it mean when it's equal to '-'?
Every letter and symbol is a character. You can look at the first character of a String and check for a match.
In this case you get the first character and check whether it is the minus character. The minus sign is (char) 45.
// How are char variables compared with the < operator?
In Java, all characters are actually 16-bit unsigned numbers. Each character has a number based on its Unicode value, e.g. '9' is (char) 57. This comparison is true for any character whose code is less than the code for '9', e.g. space. In your example c is '3', which is (char) 51, so c < '9' is true.
If you compared the first character of your string instead, 'a' is (char) 97, so (char) 97 < (char) 57 is false.
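To make the numbers above concrete, here is a minimal sketch that prints the codes involved and runs both comparisons from the question. The class name CharComparisonDemo and the helper startsWithMinus are made up for illustration; they are not part of any library.

```java
class CharComparisonDemo {
    /** Hypothetical helper: true when the first character of s is a minus sign. */
    static boolean startsWithMinus(String s) {
        // Guard against empty strings, where charAt(0) would throw.
        return !s.isEmpty() && s.charAt(0) == '-';
    }

    public static void main(String[] args) {
        char c = '3';
        System.out.println((int) '-');  // 45
        System.out.println((int) c);    // 51
        System.out.println((int) '9');  // 57
        System.out.println(c < '9');    // true, because 51 < 57
        System.out.println('a' < '9');  // false, because 97 is not less than 57
        System.out.println(startsWithMinus("asdf")); // false
        System.out.println(startsWithMinus("-12"));  // true
    }
}
```

Casting a char to int is only needed for printing; the comparison operators already work on the numeric values.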

String str = "asdf";
String output = " ";
if(str.charAt(0) == '-'){
// What does it mean when it's equal to '-'?
output= "- exists in the first index of the String";
}
else {
output="- doesn't exist in the first index of the String";
}
System.out.println(output);
It checks whether that char is at index 0; it is a comparison.
As for if (c < '9'), the ASCII values of c and '9' are compared. I don't know why you would check whether the ASCII value of c is smaller than the ASCII value of '9', though.
If you want to get the ASCII value of any char, you can assign it to an int:
char character = 'c';
int ascii = character;
System.out.println(ascii);

str.charAt(0) == '-'; evaluates to a boolean, in this case false.
if (c < '9') compares the ASCII value of '3' with the ASCII value of '9' and evaluates to a boolean as well.

str.charAt(0) == '-'
This expression evaluates to true if the character at index 0 is '-' and false otherwise.
if (c < '9')
This compares the ASCII value of c with the ASCII value of '9', in this case 51 and 57 respectively.

char is a primitive type in Java, which means a char is not a complex object. As a consequence, every time you make a comparison between chars, you are directly comparing their values.
Java characters are defined according to the original Unicode specification, which gives each character a 16-bit value. These are the values that Java compares when you write something like c > '3' or str.charAt(0) == '-'.
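One plausible reason to write a range comparison such as c < '9' is to test for ASCII digits: the characters '0' to '9' have consecutive codes (48 to 57), so two comparisons bracket the whole range. The sketch below assumes that use case; the class and method names are hypothetical.

```java
class AsciiDigitCheck {
    /** True for the ten ASCII digits '0'..'9' only. */
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Converts an ASCII digit to its numeric value, e.g. '7' becomes 7. */
    static int digitValue(char c) {
        if (!isAsciiDigit(c)) {
            throw new IllegalArgumentException("Not an ASCII digit: " + c);
        }
        // Subtracting the code of '0' leaves the offset within the digit range.
        return c - '0';
    }

    public static void main(String[] args) {
        System.out.println(isAsciiDigit('3')); // true
        System.out.println(isAsciiDigit('a')); // false
        System.out.println(digitValue('7'));   // 7
    }
}
```

Character.isDigit(char) is the library alternative; it also accepts digits from other scripts, which may or may not be what you want.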

Related

Convert special characters into decimal equivalents in java

Is there a java library to convert special characters into decimal equivalent?
example:
input: "©™®"
output: "& #169; & #8482; & #174;"(space after & is only for question purpose, if typed without a space decimal equivalent is converted to special character)
Thank you !
This can be achieved simply with String.format(). Each representation is the character value as a decimal number, zero-padded to at least 4 digits (so © becomes &#0169;) and wrapped in &# and ;
The only tricky part is deciding which characters are "special". Here I've assumed not digit, not whitespace and not alpha...
StringBuilder output = new StringBuilder();
String input = "Foo bar ©™® baz";
for (char each : input.toCharArray()) {
if (Character.isAlphabetic(each) || Character.isDigit(each) || Character.isWhitespace(each)) {
output.append(each);
} else {
output.append(String.format("&#%04d;", (int) each));
}
}
System.out.println(output.toString());
You just need to fetch the integer value of the character as mentioned in How do I get the decimal value of a unicode character in Java?.
As per Oracle Java doc
char: The char data type is a single 16-bit Unicode character. It has
a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or
65,535 inclusive).
Assuming your characters fall within the character range, you can just get the decimal equivalent of each character from your string.
String text = "©™®";
char[] cArr = text.toCharArray();
for (char c : cArr)
{
int value = c; // get the decimal equivalent of the character
String result = "& #" + value; // append to some format string
System.out.println(result);
}
Output:
& #169
& #8482
& #174
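Both answers above iterate over char values, so a character outside the 16-bit range (stored as a surrogate pair) would be emitted as two separate numbers. A possible variation, sketched below, iterates over code points instead. It assumes that "special" simply means non-ASCII, which is a guess at the requirement; the class and method names are hypothetical.

```java
class NumericEntityEncoder {
    /**
     * Replaces every non-ASCII code point with a decimal numeric reference
     * and copies ASCII characters through unchanged.
     */
    static String encodeNonAscii(String input) {
        StringBuilder out = new StringBuilder();
        // codePoints() joins surrogate pairs into a single int value.
        input.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this file independent of the source encoding.
        System.out.println(encodeNonAscii("Foo \u00A9\u2122\u00AE"));
    }
}
```

Note that this sketch does not escape ASCII characters such as & or <, which real HTML output would normally need as well.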

Converting a lowercase char to uppercase without using an if statement

How can I convert a lowercase char to uppercase without using an if statement? That is, without code like this:
if (c >= 'a' && c <= 'z')
{
c = (char) (c - 32);
}
You can use this:
char uppercase = Character.toUpperCase(c);
Use Character.toUpperCase(char):
Converts the character argument to uppercase using case mapping information from the UnicodeData file.
For example, Character.toUpperCase('a') returns 'A'.
So the full code you probably want is:
c = Character.toUpperCase(c);
If you are sure that your characters are ASCII letters, you can clear the bit that makes them lowercase, since lowercase and uppercase Latin letters differ by only one bit (0x20) in the ASCII table.
You can simply do:
char upper = (char) (c & 0x5F); // the cast is needed because c & 0x5F is an int
You can also use the ternary operator. For your case, try something like this:
c = (c >= 'a' && c <= 'z') ? (char) (c - 32) : c;
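The bit-mask approach only gives sensible results for ASCII letters. The following sketch compares it with Character.toUpperCase on a letter and on a digit to show where it goes wrong; upperByMask is a hypothetical helper name.

```java
class UpperCaseDemo {
    /** Clears bit 0x20 (and 0x80); only meaningful for ASCII letters. */
    static char upperByMask(char c) {
        return (char) (c & 0x5F);
    }

    public static void main(String[] args) {
        System.out.println(upperByMask('a'));               // A
        System.out.println(upperByMask('z'));               // Z
        System.out.println(Character.toUpperCase('a'));     // A

        // A digit is silently turned into a control character by the mask:
        // '1' is 0x31, and 0x31 & 0x5F is 0x11.
        System.out.println((int) upperByMask('1'));         // 17
        System.out.println(Character.toUpperCase('1'));     // 1
    }
}
```

Unless the input is known to contain only ASCII letters, Character.toUpperCase is the safer choice.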

why char ch =4 (without '') , is not error?

I want to know why char ch = 5; (for example)
is not an error. But if I print
System.out.println(Character.isDigit(ch));
// output
false
Why is it false?
Because 5 is an integer literal that can be converted to a char. It is not the character '5' however.
A character is represented by two bytes in memory. Java converts the integer 5 to the character with code 5.
(char) 5 is the 6th character in the ASCII table, a control character. It is not the digit '5', whose hexadecimal code is 35 and not 5, and is thus not a "digit".
Try this example:
char ch = 97;
JOptionPane.showMessageDialog(null, "ch = " + ch); // requires import javax.swing.JOptionPane
The dialog shows: ch = a
There is no error even though 97 is written without quotes (' '), because 97 is the ASCII code for the character 'a'.
Likewise, if you write ch = 5, it is automatically converted to the char with ASCII value 5. That character is not a digit, which is why you get false as a result.
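The difference between the integer literal 5 and the character literal '5' can be checked directly. This is a small sketch; the describe helper is hypothetical and only formats the code and the isDigit result.

```java
class CharLiteralDemo {
    /** Formats a char as its hex code plus the result of Character.isDigit. */
    static String describe(char c) {
        return String.format("U+%04X digit=%b", (int) c, Character.isDigit(c));
    }

    public static void main(String[] args) {
        char fromInt = 5;       // the control character with code 5
        char fromLiteral = '5'; // the digit five, code 53 (hex 35)
        char letter = 97;       // same as 'a'

        System.out.println(describe(fromInt));     // U+0005 digit=false
        System.out.println(describe(fromLiteral)); // U+0035 digit=true
        System.out.println(describe(letter));      // U+0061 digit=false
        System.out.println(letter);                // a
    }
}
```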

What is that ('Z' - 'A')

In the Currency.java file there is this line:
private static final int A_TO_Z = ('Z' - 'A') + 1;
What does this mean? I haven't seen this before. What is A_TO_Z's value, and why does it use 'Z' instead of a number?
With this expression you are treating chars as ints, using character's Unicode value instead of the character itself.
'Z' - 'A' + 1
Will become
90 - 65 + 1 (=26)
'Z' is a char with an integral value of 90.
'A' is a char with an integral value of 65.
90 - 65 + 1 = 26
Nasty. 'A' is the char literal for ASCII value of A (65 in decimal). 'Z' is 90. So A_TO_Z is 26, the number of letters in the English alphabet.
Characters have numeric values according to their position in the character table. That expression exploits the fact that all the letters from A to Z have consecutive values in the underlying encoding table, so subtracting the first value from the last (+ 1) gives the length of the English alphabet. The actual numerical values are unimportant in this case, and the code is more or less self-explanatory to the reader. If the encoding spread the letters differently the expression would be incorrect, but Java chars are always UTF-16 code units, where A to Z are contiguous.
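A typical use for such a constant is sizing a table indexed by letter. The sketch below is an illustration of that idea only; it is not taken from Currency.java, and the helpers indexOf and countLetters are hypothetical.

```java
class AlphabetIndexDemo {
    // Same expression as in the question: 90 - 65 + 1 = 26.
    static final int A_TO_Z = ('Z' - 'A') + 1;

    /** Maps 'A'..'Z' to 0..25. */
    static int indexOf(char letter) {
        if (letter < 'A' || letter > 'Z') {
            throw new IllegalArgumentException("Not an uppercase ASCII letter: " + letter);
        }
        return letter - 'A';
    }

    /** Counts how often each uppercase ASCII letter occurs in text. */
    static int[] countLetters(String text) {
        int[] counts = new int[A_TO_Z];
        for (char c : text.toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(A_TO_Z);       // 26
        System.out.println(indexOf('A')); // 0
        System.out.println(indexOf('Z')); // 25
        System.out.println(countLetters("USD, EUR")[indexOf('U')]); // 2
    }
}
```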

Java Character literals value with getNumericValue()

Why do I get the same results for both upper- and lowercase literals? For instance:
char ch1 = 'A';
char ch2 = 'a';
char ch3 = 'Z';
char ch4 = 'z';
print("ch1 -- > " + Integer.toBinaryString(Character.getNumericValue(ch1)));
print("ch2 -- > " + Integer.toBinaryString(Character.getNumericValue(ch2)));
print("ch3 -- > " + Integer.toBinaryString(Character.getNumericValue(ch3)));
print("ch4 -- > " + Integer.toBinaryString(Character.getNumericValue(ch4)));
As results I get:
ch1 -- > 1010
ch2 -- > 1010
ch3 -- > 100011
ch4 -- > 100011
And don't really see the difference between 'A' and 'a'. Even if I use character literals in UTF form (\u0041 for 'A' and \u0061 for 'a') I do get the same results.
It's behaving exactly as documented:
The letters A-Z in their uppercase ('\u0041' through '\u005A'), lowercase ('\u0061' through '\u007A'), and full width variant ('\uFF21' through '\uFF3A' and '\uFF41' through '\uFF5A') forms have numeric values from 10 through 35.
Basically this means that when parsing hex (say), 0xfa == 0xFA, as you'd expect.
I'd only expect case to matter when using something like base64.
Judging from the commentary, you're actually looking for the codepoints of the characters rather than their numeric value, so I'll just isolate that into an answer. The getNumericValue() function returns what the character means as a number when interpreting its glyph; it does not return the codepoint of a character. For instance, getNumericValue('5') returns 5 as an int, not the codepoint of '5' (which is 53).
To use the codepoints, just use your variables or the char literals as they are. char is a numeric datatype. For instance, System.out.println((int)'a'); will print 97, quite simply.
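The two notions can be compared side by side. In this sketch, codePointBits is a hypothetical helper that prints the binary form of the code point, which is presumably what the question was after.

```java
class NumericValueVsCodePoint {
    /** Binary string of the char's code point (char widens to int). */
    static String codePointBits(char c) {
        return Integer.toBinaryString(c);
    }

    public static void main(String[] args) {
        // Numeric value: what the glyph means as a digit in bases up to 36.
        System.out.println(Character.getNumericValue('A')); // 10
        System.out.println(Character.getNumericValue('a')); // 10
        System.out.println(Character.getNumericValue('5')); // 5

        // Code point: the character's position in the Unicode table.
        System.out.println((int) 'A');          // 65
        System.out.println((int) 'a');          // 97
        System.out.println(codePointBits('A')); // 1000001
        System.out.println(codePointBits('a')); // 1100001
    }
}
```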
