Why does a character array accept integer values in Java?

Of course, I'm a beginner at Java; previously I learned C. Please take a look at the following code segments.
char Character;
int Number = 27;
Character = Number;
System.out.println(Character);
The above code fails to compile, with an error stating "loss of information".
Whereas the following code...
char Character = 'F';
int Number;
Number = Character;
System.out.println(Number);
The above code compiles, but the output is "70", not "F".
Also take a look at the following code...
char[] arrayCh = new char[3];
arrayCh[0] = 27;
System.out.println(arrayCh[0]);
The above code compiles; however, it also prints an unfamiliar symbol...
I know about the issues regarding ASCII values and memory: a 'char' takes 16 bits and an 'int' takes 32 bits. Therefore an integer value can't be assigned to a character variable, whereas a character value can be assigned to an integer variable as its "ASCII" value.
My question is... why does a 'char' array accept 'int' values? Could anyone explain?

A char is a 2-byte-long, unsigned integer. 27 is an integer literal that is in the range of a char, and because it is a compile-time constant, the compiler lets you assign it to a char. (A non-constant int expression would require an explicit cast.)
'F' is a character literal that represents the character F, which has the decimal value 70 in the Unicode standard. So, assigning 'F' to an integer is the same thing as assigning 70.
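The compiler's distinction between constant and non-constant expressions can be seen directly; here is a minimal sketch (variable names are illustrative):
char c1 = 27;       // OK: 27 is a compile-time constant that fits in char's range
final int k = 27;
char c2 = k;        // OK: k is a compile-time constant in range
int n = 27;
// char c3 = n;     // does not compile: possible lossy conversion from int to char
char c4 = (char) n; // OK: an explicit cast accepts the possible loss
The same rule explains the array case in the question: arrayCh[0] = 27; compiles because 27 is a constant in range, not because arrays are special.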

Related

How to output an int from a char in Java after storing it [duplicate]

I am watching a tutorial on Udemy where the instructor says that we can store an integer value in the char data type. But when I try to print the value, nothing shows up.
I tried assigning the value of char one to an integer variable and then printing the int variable. That works, but why can't I use the char to output the value?
public static void main(String[] args) {
char one = 10;
System.out.println(one);
}
If you look at the ASCII table, you will see that the value 10 represents the newline character.
This can be proved by the code below:
public static void main(String[] args) {
char one = 10;
//no newline added by print, but println adds a newline implicitly
System.out.print("Test");
System.out.print(one);
System.out.print("Test");
}
The output is:
Test
Test
Although I used System.out.print, a newline was still added in the output after the first Test. So you see, something was actually printed.
Furthermore, when you pass a char to System.out.println(), the char is converted to its String representation by invoking String.valueOf(char), since char is a primitive.
For objects, when you pass a reference to System.out.println(), the object's toString() method is called to get its String representation.
If you change the value to char one = 65, you will see the letter A printed.
In Java, char is an integral type, so values can be converted in both directions: char <-> int.
When you print an int, you get an integer number. When you print a char, you get the corresponding ASCII character. char ch = 10 is not a printable character.
char ch = 'A';
System.out.println(ch); // print 'A'
int code = ch;
System.out.println(code); // print 65 - ASCII code of 'A'
Adding to the above answers, if you want to output the int value from the variable "one", a cast would work:
char one = 10;
System.out.println((int) one);
If you take a look at the ASCII table, you can see the value 10 is LF, which is a new line. If you print this alone, it will appear to do nothing, because it just prints a new line.
However, if you modify the code a bit to print some actual characters on both sides of the LF char:
char c1 = 70;
System.out.print(c1);
char one = 10;
System.out.print(one);
char c2 = 71;
System.out.print(c2);
This will output:
F
G
On separate lines, due to the newline in between; without it, they would have printed on the same line.
Additionally you can see on that table 70 corresponds with F, and 71 with G.
Note: Java does not technically use ASCII, but rather a different encoding depending on your environment (commonly UTF-16 or ISO-8859-1); however, the characters are usually equivalent to ASCII for the range of values the ASCII table contains (a superset). For example, char c1 = 202 will print Ê for me, which is not an ASCII value.
You are misinterpreting your output and drawing the wrong conclusion.
A char is a UTF-16 code unit. UTF-16 is a character encoding for the Unicode character set. UTF-16 encodes a Unicode codepoint with one or two UTF-16 code units. Typically, if it might be two code units, you'd use String or char[] instead of char. But if your codepoint is known to take only one UTF-16 code unit, you could use char.
The code point you are using is U+000A LINE FEED (LF). It takes one UTF-16 code unit, \u000a, which is convertible from the integer value 0xa, or 10. If you inspect your output carefully, you'll "see" it. Perhaps adding output before and after would help.
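For instance, a minimal sketch of that suggestion (the bracket markers are arbitrary):
char lf = (char) 10; // U+000A LINE FEED
System.out.print("[");
System.out.print(lf);
System.out.print("]");
// output: "[", then a line break, then "]" - so the char really was printed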

Java case: why do I get a completely different int back with and without using ' '

As a Java beginner I'm playing around with a case statement at this point.
I have set: int d = '1'; And with: System.out.println("number is: " + d);, this returns 49.
Now I found out that if I set it as int d = 1;, it does return 1.
Now my question is: why does it return 51 when I set it as '3'? What difference do the ' ' make?
My code that returns 49:
int a = '1';
switch (a) {
    case '1':
        System.out.println("Good");
        break;
    case '2':
    case '3':
        System.out.println("great");
        break;
    default:
        System.out.println("invalid");
}
System.out.println("value: " + a);
'1' is the character '1', whose integer value is 49. Each character has a numeric value in the range from 0 to 2^16-1. Therefore 1 and '1' are different integer values.
With ' (single quotes) around a character, you tell the compiler that the value in between is a character (variable type char in Java). Generally, computers only understand numbers, so characters are represented as numbers in memory as well. The value of the character '1' is 49. You actually want to save the value 1 in memory and use it to calculate things, so you have to tell the compiler that you want this to be interpreted as an actual number and not as a character.
To summarize:
'1' is the character that prints the number 1
1 is the actual number 1
This means 1 and '1' have different integer values.
You can look at the integer value of the most commonly used characters in coding here: ASCII Table
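A short sketch of the difference (the '1' - '0' subtraction idiom is an addition of mine, not part of the original answer):
System.out.println('1');       // prints 1 - the character
System.out.println((int) '1'); // prints 49 - its numeric value
System.out.println('1' - '0'); // prints 1 - the digit value, computed arithmetically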
In Java, character values are stored internally as 16-bit unsigned integers. That means when you assign a char value such as 'A' to an int type, the compiler widens the char into an int, and thus the UTF-16 code unit value is stored in the int variable.
Try this one as well:
int yourInt = 33;
char ch = (char) yourInt;    // explicit narrowing cast from int to char
System.out.println(yourInt); // prints 33
System.out.println(ch);      // prints ! (the character with code 33)
'1' is interpreted as a character, whose value is 49.
Do:
int a = 1;
switch(a) {
case 1:
...
Or, if you go with your current code:
System.out.println("value: " + Integer.parseInt((char)a+'');
This happens because Java treats the char type as a 16-bit unsigned integer.
Your '1' expression is a constant expression of type char. It is stored internally as a numeric value. Usually all goes fine with this approach; for example, System.out.println('1') will print exactly what you expect.
But when you write int a = '1';, your char value is converted to an int value, because char behaves like a 16-bit integer there.
PS: if there were no implicit conversion between char and int, you would just get a compilation error.
When you declare int d = '1';, the 1 is not treated as an integer; it is a character. It is converted to 49 according to its Unicode value. Similarly, the character '3' will be converted to 51. The compiler does an implicit widening conversion from char to int.
You should try this code:
char c = '3';
int a = '1';
System.out.println(c);
System.out.println((int) c);
System.out.println(a);
You will get output as
3
51
49

Converting an int to char and then back to int doesn't always give the same result

I am trying to get a char from an int value > 0xFFFF. But instead, I always get back the same char value which, when cast to an int, prints the value 65535 (0xFFFF).
I can't understand how it generates symbols for Unicode code points > 0xFFFF.
int hex = 0x10FFFF;
char c = (char)hex;
System.out.println((int)c);
I expected the output to be 0x10FFFF. Instead, the output comes back as 65535.
This is because, while an int is 4 bytes, a char is only 2 bytes. Thus, you can't represent all values in a char that you can in an int. Using a standard unsigned integer representation, you can only represent the range of values from 0 to 2^16 - 1 == 65535 in a 2-byte value, so if you convert any number outside that range to a 2-byte value and back, you'll lose data.
int is 4 bytes; char is 2 bytes.
Your number was well within the range an int can hold, but not within the range a char can.
So when you converted that number to a char, it lost data: the narrowing cast keeps only the low 16 bits, which in this case are 0xFFFF, i.e. 65535, the maximum a char can hold.
Your number was too big to fit in a char, which is 2 bytes, but small enough to fit in an int, which is 4 bytes. 65535 is the biggest value that fits in a char, and the low 16 bits of your number happen to be exactly that, which is why you got that value. Also, if a char were big enough to hold your number, converting it back to an int would have given the decimal value of 0x10FFFF, which is 1114111.
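The truncation is easier to see if you mask the low 16 bits yourself; a minimal sketch (the second example value is arbitrary):
int hex = 0x10FFFF;
char c = (char) hex;                      // the narrowing cast keeps only the low 16 bits
System.out.println((int) c);              // 65535
System.out.println(hex & 0xFFFF);         // 65535 - the same low 16 bits
System.out.println((int) (char) 0x10041); // 65 - here the low 16 bits are 0x0041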
Unfortunately, I think you were expecting a Java char to be the same thing as a Unicode code point. They are not the same thing.
The Java char, as already expressed by other answers, can only support code points that can be represented in 16 bits, whereas Unicode needs 21 bits to support all code points.
In other words, a Java char on its own, only supports Basic Multilingual Plane characters (code points <= 0xFFFF). In Java, if you want to represent a Unicode code point that is in one of the extended planes (code points > 0xFFFF), then you need surrogate characters, or a pair of characters to do that. This is how UTF-16 works. And, internally, this is how Java strings work as well. Just for fun, run the following snippet to see how a single Unicode code point is actually represented by 2 characters if the code point is > 0xFFFF:
// Printing string length for a string with
// a single unicode code point: 0x22BED.
System.out.println("𢯭".length()); // prints 2, because it uses a surrogate pair.
If you want to safely convert an int value that represents a Unicode code point to a char (or chars to be more exact), and then convert it back to an int code point, you will have to use code like this:
public static void main(String[] args) {
int hex = 0x10FFFF;
System.out.println(Character.isSupplementaryCodePoint(hex)); // prints true because hex > 0xFFFF
char[] surrogateChars = Character.toChars(hex);
int codePointConvertedBack = Character.codePointAt(surrogateChars, 0);
System.out.println(codePointConvertedBack); // prints 1114111
}
Alternatively, instead of manipulating char arrays, you can use a String, like this:
public static void main(String[] args) {
int hex = 0x10FFFF;
System.out.println(Character.isSupplementaryCodePoint(hex)); // prints true because hex > 0xFFFF
String s = new String(new int[] {hex}, 0, 1);
int codePointConvertedBack = s.codePointAt(0);
System.out.println(codePointConvertedBack); // prints 1114111
}
For further reading: Java Character Class

Convert XSLT function (string-to-codepoints) to Java

How can I "translate" this XSLT code to Java?
<xsl:value-of select="number(string-to-codepoints(upper-case($char)) - string-to-codepoints('A'))+10"/>
I only know that: "The fn:string-to-codepoints function returns a sequence of xs:integer values representing the Unicode code points."
From the example given at http://www.xsltfunctions.com/xsl/fn_string-to-codepoints.html:
string-to-codepoints('a') = 97
I found this:
char ch = 'a';
System.out.println(String.format("\\u%04x", (int) ch));
But I get: \u0061
For a single char you can just cast it to int to get the decimal value:
System.out.println((int)ch);
For a String there's .toCharArray() to convert it to a char[] but that isn't quite the same as a "sequence of codepoints" if the String involves Unicode characters outside the BMP (i.e. above U+FFFF), which are represented in Java as a surrogate pair of two char values. To handle surrogates properly you would need to use a technique like the one described in this answer.
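If you are on Java 8 or later, String.codePoints() gives you that sequence directly; a brief sketch (the sample string is arbitrary):
// 𐐷 is U+10437, outside the BMP, stored as a surrogate pair of two chars
"a𐐷".codePoints().forEach(System.out::println); // prints 97, then 66615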
To answer the specific question you ask, you can do
number(string-to-codepoints(upper-case($char)) - string-to-codepoints('A'))+10
in Java as
char ch = // wherever you get $char from
int num = Character.toUpperCase(ch) - 'A' + 10;
since char is an integer type in Java and you can add or subtract char values like any other number.
But this will probably only give you a sensible answer when the initial ch is an ASCII letter.
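If the input may be any hexadecimal digit rather than just a letter, Character.digit is a more robust alternative (a suggestion of mine, not part of the original answer):
char ch = 'f';
int num = Character.digit(ch, 16); // 15; returns -1 for a non-hex-digit input
System.out.println(num);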
You print the value as a Unicode escape sequence; XSLT prints a decimal value.
This should work much better:
System.out.println("a".codePointAt(0));

Use print statements to find out the character for the hexadecimal code "5A"

String b = "5A";
int bConv = Integer.parseInt(b, 16);
char $2 = bConv;
When I try this I get a "possible loss of precision" error.
That's because an int takes 4 bytes by default, while a char takes 2 bytes. So by assigning an int to a char, you might lose data.
An explicit cast will remove the warning:
String b = "5A";
int bConv = Integer.parseInt(b, 16);
char $2 =(char)bConv;
Side note: In my opinion $2 is a bad name for a variable.
UPDATE
Is there a better way to represent characters using hexadecimal values?
Don't know if it can be considered better, but you can assign the hex value directly to the character:
char myHexChar = 0x5A; //0x tells the compiler that the value is in hex format
If you print that variable, you'll get Z
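Another option (not mentioned in the original answer) is a Unicode escape, which also spells the value in hex:
char myHexChar2 = '\u005A'; // the same value, written as a Unicode escape
System.out.println(myHexChar2); // prints Z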
