The Character Class in Java - java

Here is a short program that counts the letters of any given word entered by the user.
I'm trying to figure out what the following lines actually do in this program:
counts[s.charAt(i) - 'a']++; // I don't understand what the - 'a' is doing
System.out.println((char)('a' + i)); // I don't get what the 'a' + i actually does.
import java.util.Scanner;
public class Listing9_3 {
public static void main(String[] args) {
//Create a scanner
Scanner input = new Scanner (System.in);
System.out.println("Enter a word to find out the occurrences of each letter: ");
String s = input.nextLine();
//Invoke the count Letters Method to count each letter
int[] counts = countLetters(s.toLowerCase());
//Display results
for(int i = 0; i< counts.length; i++){
if(counts[i] != 0)
System.out.println((char)('a' + i) + " appears " +
counts[i] + ((counts[i] == 1)? " time" : " times"));
// I don't understand what the 'a' + i is doing
}
}
public static int[] countLetters(String s) {
int[] counts = new int [26]; // 26 letters in the alphabet
for(int i = 0; i < s.length(); i++){
if(Character.isLetter(s.charAt(i)))
counts[s.charAt(i) - 'a']++;
// I don't understand what the - 'a' is doing
}
return counts;
}
}

Characters are a kind of integer in Java; the integer is a number associated with the character on the Unicode chart. Thus, 'a' is actually the integer 97; 'b' is 98, and so on in sequence up through 'z'. So s.charAt(i) returns a character; assuming that it is a lower-case letter in the English alphabet, subtracting 'a' from it gives the result 0 for 'a', 1 for 'b', 2 for 'c', and so on.
You can see the first 4096 characters of the Unicode chart at http://en.wikibooks.org/wiki/Unicode/Character_reference/0000-0FFF (and there will be references to other pages of the chart as well). You'll see 'a' there as U+0061 (which is hex, = 97 decimal).
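As a minimal sketch of the arithmetic described above (the class name CharArithmeticDemo and the helper offsetFromA are made up for illustration, not part of the original program), the following shows that a char behaves like a number when used in subtraction:

```java
public class CharArithmeticDemo {
    // Hypothetical helper: maps a lowercase English letter to its offset from 'a'.
    public static int offsetFromA(char c) {
        return c - 'a'; // both operands are promoted to int before subtracting
    }

    public static void main(String[] args) {
        System.out.println((int) 'a');        // 97
        System.out.println(offsetFromA('a')); // 0
        System.out.println(offsetFromA('c')); // 2
        System.out.println(offsetFromA('z')); // 25
    }
}
```

The result of the subtraction is an int, which is why it can be used directly as an array index.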

Because you want your array to contain only the count of each letter from 'a' to 'z'.
So, to index each letter's count correctly within the array, you need a mapping letter -> index with 'a' -> 0, 'b' -> 1, up to 'z' -> 25.
Each character is represented by a 16-bit integer value (so from 0 to 65,535). You're only interested in the letters 'a' to 'z', which have the values 97 to 122.
How would you get the mapping?
This can be done using the trick s.charAt(i) - 'a'.
The value returned by this operation is between 0 and 25 as long as s.charAt(i) returns a character between 'a' and 'z'. Converting the user's input to lower case and checking Character.isLetter guarantees that for plain English input; note, however, that Character.isLetter also accepts letters outside this range (for example 'é'), for which the index would fall outside the array.
Hence you get the desired mapping to count the occurrences of each letter in the word.
On the other hand, (char)('a' + i) does the reverse operation. As i varies from 0 to 25, you get the letters from 'a' to 'z'. You need to cast the result of the addition to char; otherwise its numeric Unicode value would be printed.
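One caveat worth sketching: Character.isLetter also returns true for letters outside 'a' to 'z' (such as 'é'), and for those the expression s.charAt(i) - 'a' would produce an index beyond 25. A possible variant, assuming only the 26 unaccented English letters should be counted, replaces the isLetter test with an explicit range check. The class name SafeLetterCount is hypothetical.

```java
import java.util.Locale;

public class SafeLetterCount {
    // Counts only the letters 'a'..'z'; every other character is skipped.
    public static int[] countLetters(String s) {
        int[] counts = new int[26]; // one slot per letter, 'a' -> 0 ... 'z' -> 25
        String lower = s.toLowerCase(Locale.ROOT);
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            if (c >= 'a' && c <= 'z') { // explicit range check instead of Character.isLetter
                counts[c - 'a']++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent; it is ignored here.
        int[] counts = countLetters("Caf\u00e9 au lait");
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] != 0) {
                System.out.println((char) ('a' + i) + " appears " + counts[i]
                        + (counts[i] == 1 ? " time" : " times"));
            }
        }
    }
}
```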

counts[s.charAt(i) - 'a']++; // I don't understand what the - 'a' is doing
Assume s.charAt(i) is 'z'.
Then 'z' - 'a' equals 25 (subtracting the Unicode/ASCII values: 122 - 97).
So the statement becomes counts[25] = counts[25] + 1; // it just keeps track of the count of each character

Related

Decrementing lowercase letters to lowercase letters only

I want to decrement lowercase letters so that the result is always a lowercase letter. I do this by taking the ASCII value of the character and decrementing it. For example, if I decrement a by 2, the answer should be y, not a symbol or an uppercase letter.
int charValue = temps.charAt(i);
String increment = String.valueOf( (char) (charValue - (m) ));
if((charValue - m) < 65){
int diff = 65 - (charValue - m);
increment = String.valueOf((char) (91 - diff));
}else if((charValue - m) < 97 && (charValue - m) >= 91){
int diff = 97 - (charValue - m);
increment = String.valueOf((char) (123 - diff));
}
System.out.print(increment);
This is the code I have so far. The problem is that if I decrement a by 8, it shows an uppercase letter.
Example: if I input 'a' and an m value of 8, the expected output is 's', but I'm getting 'Y'.
Here charToBeChanged is the lowercase character that you want to shift, and decrementValue is the amount by which you want to shift it. In the main post you said:
if i input 'a' and m value as 8, the expected output should be 's'
So here charToBeChanged is 'a' and decrementValue is 8.
System.out.println((char) ((charToBeChanged - 'a' + 26 - decrementValue) % 26 + 'a'));
For a moment forget about the ASCII code table (or any other) and suppose that the letters are numbered sequentially from 1 (for a) to 26 (for z). The problem is now a simple matter of arithmetic modulo 26. In pseudocode, decrementing a by 2 translates into something like
Mod[(1-2),26]
which is 25, the codepoint for y in the sequential code outlined above.
My Java is laughable so I'll leave it to OP to take care of the translation between ASCII code values and sequential code values, and the implementation of a function to perform the operation.
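The modulo-26 idea above could be translated to Java roughly as follows. This is only a sketch: the class name LowercaseShift and the method name decrement are invented for illustration, and it numbers the letters from 0 ('a') to 25 ('z') rather than from 1. Math.floorMod is used because Java's % operator can return a negative result when the left operand is negative.

```java
public class LowercaseShift {
    // Shifts a lowercase letter backwards by `steps`, wrapping around from 'a' to 'z'.
    public static char decrement(char letter, int steps) {
        if (letter < 'a' || letter > 'z') {
            throw new IllegalArgumentException("Expected a letter from 'a' to 'z': " + letter);
        }
        int index = letter - 'a';                      // 0..25
        int shifted = Math.floorMod(index - steps, 26); // always 0..25, even if index - steps < 0
        return (char) ('a' + shifted);
    }

    public static void main(String[] args) {
        System.out.println(decrement('a', 2));  // y
        System.out.println(decrement('a', 8));  // s
        System.out.println(decrement('c', 30)); // y (30 steps is one full turn plus 4)
    }
}
```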
It pretty much depends on your definition of 'lowercase letter'. Is ö lowercase letter? For me, it is. Or č? I have it in my name, so definitely I consider it lowercase letter.
Therefore the program needs to define its own sequence of considered lowercase letters. Note that for the example I only included the characters a to g and x to z, but anything (including \u010D for č or \u00f6 for ö) could be included in the list.
import java.util.Arrays;
import java.util.List;
import org.junit.Assert;
import org.junit.Test;
public class DecrementChars {
List<Character> validLowercaseChars
= Arrays.asList('a', 'b', 'c', 'd', 'e', 'f', 'g', 'x','y', 'z');
boolean isLowercaseletter(char letter) {
return validLowercaseChars.contains(letter);
}
char decrement(char input, int decrement) {
if(!isLowercaseletter(input)) {
throw new IllegalArgumentException();
}
int inputIndex = validLowercaseChars.indexOf(input);
int size = validLowercaseChars.size();
int outputIndex = (size + inputIndex - decrement) % size;
return validLowercaseChars.get(outputIndex);
}
@Test(expected=IllegalArgumentException.class)
public void thatDecrementOfInvalidInputThrows() {
decrement('9', 1);
}
@Test
public void thatDecrementOfbByOneGetsa() {
Assert.assertEquals('a', decrement('b', 1));
}
@Test
public void thatDecrementOfaByTwoGetsy() {
Assert.assertEquals('y', decrement('a', 2));
}
}
You could use Character.isLowerCase() and the to...Case methods to check what case the input was and then convert the output to the same case.
char original = (char) charValue; // the input character from the question
if (Character.isLetter(original)) {
if (Character.isLowerCase(original)) {
System.out.println(increment.toLowerCase());
} else {
System.out.println(increment.toUpperCase());
}
} else {
System.out.println("Not a letter");
}
Note that I haven't tested this.
Try with this:
int charValue = 'a';
int m = 8;
String increment = null;
//if a-z
if(charValue>96 && charValue<123){
int difference = charValue - m;
if(difference < 97)
difference+=26;
increment = String.valueOf((char) difference);
}
System.out.println(increment);

Converting char "a" to number 0 using java function

So I'm learning about functions and methods, and trying to create a function that would let me replace a letter with a number, so that "a" would be 0, "b" would be 1, and so on. I don't know ASCII at all, and have only come up with a very long if/else statement, but I don't even know if I'm on the right track. I'm trying to find a way to write the function without a long conditional statement and with fewer lines of code.
This is the new code I have written with suggestions:
public class CaesarCipher {
/*
* create function that converts a letter to a number
* ex. a -> 0, b -> 1, etc...
*/
static char letterToNumber (char firstLetter){
if (firstLetter < 'a' || firstLetter > 'z') {
}
return firstLetter;
}
static int numberToLetter (int firstNumber){
if (firstNumber < 0 || firstNumber > 25){
}
return firstNumber;
}
public static void main(String[] args) {
char a = 0;
// TODO Auto-generated method stub
System.out.println (letterToNumber (a)); //suppose to compile to convert a -> the number 0
System.out.println(numberToLetter (1)); //compile to convert 1 -> the letter b
}
}
The simplest approach is just to subtract the literal 'a'... which will implicitly convert both your input letter and the 'a' to int:
public int convert(char letter) {
if (letter < 'a' || letter > 'z') {
throw new IllegalArgumentException("Only lower-case ASCII letters are valid");
}
return letter - 'a';
}
The nice thing about this solution is that it's reasonably "obviously correct" (with the assumption that the letters 'a' to 'z' are consecutive in UTF-16). You don't need to include any magic integer values.
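For completeness, here is a sketch of both directions the question asks about. The method names letterToNumber and numberToLetter are taken from the question; the class name LetterNumberCodec and the choice to throw on invalid input are assumptions made for this example.

```java
public class LetterNumberCodec {
    // 'a' -> 0, 'b' -> 1, ..., 'z' -> 25
    public static int letterToNumber(char letter) {
        if (letter < 'a' || letter > 'z') {
            throw new IllegalArgumentException("Only lower-case ASCII letters are valid");
        }
        return letter - 'a';
    }

    // 0 -> 'a', 1 -> 'b', ..., 25 -> 'z'
    public static char numberToLetter(int number) {
        if (number < 0 || number > 25) {
            throw new IllegalArgumentException("Only numbers from 0 to 25 are valid");
        }
        return (char) ('a' + number); // the cast turns the int back into a char
    }

    public static void main(String[] args) {
        System.out.println(letterToNumber('a')); // 0
        System.out.println(numberToLetter(1));   // b
    }
}
```

Note that the two methods return different types (int and char), so the compiler keeps the two representations apart.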
char letter = 'a';
int letterAscii = (int) letter;
int asciiOffsetOfA = 97;
int positionInAlphabet=letterAscii-asciiOffsetOfA;
Use this in combination with String.toCharArray() and String.toLowerCase() on your input String.
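The ASCII value of the character '0' is 48, 'a' is 97 and 'A' is 65. So to convert a small letter to the corresponding digit character you subtract 49, and for a capital letter you subtract 17. The same goes for B/b and '1', C/c and '2', etc. (this only yields digit characters for the first ten letters).
int smallChar = 'a' - 49; // equals 48, the code of the character '0'
int capitalChar = 'A' - 17; // equals 48, the code of the character '0'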

Counting every single letter in an array with ASCII table (Unicode code)

I am new at Java and I could not understand this structure:
public static int[] upperCounter(String str) {
final int NUMCHARS = 26;
int[] upperCounts = new int[NUMCHARS];
char c;
for (int i = 0; i < str.length(); i++) {
c = str.charAt(i);
if (c >= 'A' && c <= 'Z')
upperCounts[c-'A']++;
}
return upperCounts;
}
This method works, but what does list[c-'A']++; mean?
c - 'A' is taking a character in the range ['A' .. 'Z'], and subtracting 'A' to create a numerical value in the range [0 .. 25] so it can be used as an array index.
upperCounts[c - 'A']++ increments the occurrence count for the character c using its corresponding index c - 'A'.
Effectively, the loop is generating an array of character type counts.
This packs several steps into one statement; let me try to break it down:
c - 'A'
c is the character read from the string on the current loop iteration, while 'A' is a character that has a certain integer value as denoted by the ASCII table. This operation produces an integer result.
list[c - 'A']
This integer value is then used to index the int[] array, selecting the nth item in the list array, which is itself an integer.
list[c - 'A']++;
The ++ operator adds one to that value.
It means that you increment (++) the value of the element at index c - 'A' in the list array.
c is a variable holding the current letter.
'A' refers to the Unicode code point of the letter A (65 decimal). The letter B is 66 decimal, etc.
Value in a char variable can be represented as an integer. The letter A is 65 and a would be 97 (see the ASCII table for more letters).
list[c-'A']++;
This code means: take the value of c (which is between 65 and 90 because of if (c >= 'A' && c <= 'Z')) and subtract the value of 'A' (i.e. 65). The result is an index into the array list, and the element at that index is incremented.
Example: c is C:
C = 67
A = 65
C - A = 2
Therefore index 2 will be changed. Index 2 is the third element, like C is the third letter in the alphabet.
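To make the single statement easier to follow, it can be written out as separate steps. The sketch below behaves the same as the upperCounter method from the question; the class name UpperCounterSteps and the local variables index and oldCount are introduced only for illustration.

```java
public class UpperCounterSteps {
    public static int[] upperCounter(String str) {
        int[] upperCounts = new int[26];
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                int index = c - 'A';               // 'A' -> 0, 'B' -> 1, ..., 'Z' -> 25
                int oldCount = upperCounts[index]; // read the current count
                upperCounts[index] = oldCount + 1; // store it back, increased by one
            }
        }
        return upperCounts;
    }

    public static void main(String[] args) {
        int[] counts = upperCounter("ABCa C");
        System.out.println(counts[0]); // 1 (one 'A'; the lowercase 'a' is ignored)
        System.out.println(counts[2]); // 2 (two 'C's)
    }
}
```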

java ASCII conversion

Here is part of my code:
// randomly create lowercase ASCII values
int rLowercase1 = random.nextInt(122) + 97;
// convert ASCII to text
System.out.print((char)rLowercase1);
When I run my program, it displays symbols instead of lowercase letters. Is there any way that I can fix this so that it displays lowercase letters?
How about
rLowercase1 = 'a' + random.nextInt('z' - 'a' + 1);
The number of letters in the alphabet can be calculated as 'z' - 'a' + 1 = 25 + 1 = 26.
Since random.nextInt(n) returns a value in the range [0, n), with n excluded, you can get 'a' + 0 = 'a' as the minimum value and 'a' + 25 = 'z' as the maximum value.
In other words, your range of characters is from 'a' to 'z' (both included).
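For comparison, the original expression random.nextInt(122) + 97 produces values from 97 to 218, which is why symbols appeared. A small sketch of the corrected approach follows; the class name RandomLowercase and the helper randomLowercase are made up for this example.

```java
import java.util.Random;

public class RandomLowercase {
    // Returns a random character from 'a' to 'z', both included.
    public static char randomLowercase(Random random) {
        int offset = random.nextInt('z' - 'a' + 1); // 0..25
        return (char) ('a' + offset);
    }

    public static void main(String[] args) {
        Random random = new Random();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            sb.append(randomLowercase(random));
        }
        System.out.println(sb); // ten random lowercase letters
    }
}
```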
Change your code as:
int rLowercase1 = random.nextInt(26) + 97; // it will generate a-z
There are only 26 lower case letters:
int rLowercase1 = random.nextInt(26) + 97;
If you want only the 26 unaccented Latin letters, change 122 to 26:
int rLowercase1 = random.nextInt(26) + 97;
I think the meaning of this is a little clearer if written like this:
int rLowercase1 = 'a' + random.nextInt(26);

Display the number of the characters in a string

I have a Java question: I am writing a program to read a string and display the number of characters in that string. I found some example code but I don't quite understand the last part - can anyone help?
int[] count = countLetters(line.toLowerCase());
for (int i=0; i<count.length; i++)
{
if ((i + 1) % 10 == 0)
System.out.println( (char) ('a' + i)+ " " + count[i]);
else
System.out.print( (char) ('a' + i)+ " " + count[i]+ " ");
}
public static int[] countLetters(String line)
{
int[] count = new int[26];
for (int i = 0; i<line.length(); i++)
{
if (Character.isLetter(line.charAt(i)))
count[(int)(line.charAt(i) - 'a')]++;
}
return count;
}
Your last loop works as follows:
For every character, we test whether it is a letter; if so, we increment the counter for that character. This means 'a' maps to index 0, 'b' to index 1, and so on (in other words, 'a' is 'a' - 'a', which is 0; 'b' is 'b' - 'a', which is 1; ...).
This is a common way to count the number of occurrences of characters in a string.
The code you posted counts not the length of the string, but the number of occurrences of alphabet letters that occur in the lowercased string.
Character.isLetter(line.charAt(i))
retrieves the character at position i and returns true if it is a letter.
count[(int)(line.charAt(i) - 'a')]++;
increments the count at index character - 'a', which ranges from 0 to 25.
The result of the function is an array of 26 integers containing the counts per letter.
The for loop over the counts array ends the printed line after every 10th count and uses
(char) ('a' + i)
to print the letter that the counts belongs to.
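The cast matters here because 'a' + i is an int. A short sketch of the difference (the class name CastDemo and both method names are hypothetical):

```java
public class CastDemo {
    // With the cast: the sum is converted back to a char before concatenation.
    public static String label(int i, int count) {
        return (char) ('a' + i) + " " + count;
    }

    // Without the cast: the sum stays an int, so its numeric value is concatenated.
    public static String labelWithoutCast(int i, int count) {
        return ('a' + i) + " " + count;
    }

    public static void main(String[] args) {
        System.out.println(label(1, 3));            // b 3
        System.out.println(labelWithoutCast(1, 3)); // 98 3
    }
}
```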
I guess you are counting the occurrences of letters, not characters ('5' is also a character).
The last part:
for (int i = 0; i<line.length(); i++)
{
if (Character.isLetter(line.charAt(i)))
count[(int)(line.charAt(i) - 'a')]++;
}
It iterates over the input line and checks for each character whether it is a letter. If it is, it increments the count for that letter. The count is kept in an array of 26 integers (for the 26 letters in the Latin alphabet). The count for letter 'a' is kept at index 0, letter 'b' at 1, 'z' at 25. To get the index, the code subtracts the value of 'a' from the letter value (each character is not only a character/glyph, but also a numeric value). So if the letter is 'a', it subtracts the value of 'a', which gives 0; for 'b' it gives 1, and so on.
In the method countLetters, the for loop goes through all characters in the line. The if checks to make sure it's a letter, otherwise it will be ignored.
line.charAt() yields the single character at position i. The type of this is char.
Now deep inside Java, a char is just a number corresponding to a character code. Lowercase 'a' has a character code of 97, 'b' is 98 and so on. (int) forces conversion to int (here it is redundant, since subtracting two chars already yields an int). So we take the character code, let's say it's a 'b' so the code is 98, and we subtract the code for 'a', which is 97, so we get the offset 1 (from the beginning of the alphabet). For any letter from 'a' to 'z', the offset will be between 0 and 25 (inclusive).
So we use that offset as an index into the array count and use ++ to increment it. Then later the loop in the top part of the program can print out the counts.
The loop at the top is using the reverse "trick" to convert those offsets from 0 to 25 back into letters from a to z.
The 'last part', the implementation of the loop, is really hard to understand. Close to obfuscation ;) Here's a refactoring of the count method, split into two methods: a general one for all chars and a special one for just the lowercase letters:
public static int[] countAllASCII(String line) {
int[] count = new int[256];
char[] chars = line.toCharArray();
for (char c : chars) {
int index = (int) c;
if (index < 256) {
count[index]++;
}
}
return count;
}
public static int[] countLetters(String line) {
int[] countAll = countAllASCII(line);
int[] result = new int[26];
System.arraycopy(countAll, (int) 'a', result, 0, 26);
return result;
}
General idea: the countAllASCII method just counts all chars with a code below 256. Yes, the array is bigger, but at these sizes nobody cares today. The advantage: I don't have to test whether each char is a letter. The second method just copies the area of interest into a new (resulting) array and returns it.
EDIT
I'd changed my code for a less unfriendly comment as well. Thanks anyway, Bombe.
