How can we calculate frequency of characters in a string

I was looking into the solution of the problem.
static void printCharWithFreq(String str)
{
// size of the string 'str'
int n = str.length();
// 'freq[]' implemented as hash table
int[] freq = new int[SIZE];
// accumulate frequency of each character
// in 'str'
for (int i = 0; i < n; i++)
freq[str.charAt(i) - 'a']++;
// traverse 'str' from left to right
for (int i = 0; i < n; i++) {
// if frequency of character str.charAt(i)
// is not equal to 0
if (freq[str.charAt(i) - 'a'] != 0) {
// print the character along with its
// frequency
System.out.print(str.charAt(i));
System.out.print(freq[str.charAt(i) - 'a'] + " ");
// update frequency of str.charAt(i) to
// 0 so that the same character is not
// printed again
freq[str.charAt(i) - 'a'] = 0;
}
}
}
I am not able to understand how
for (int i = 0; i < n; i++)
freq[str.charAt(i) - 'a']++;
calculates the frequency of each character, and how the count ends up stored at the right position.
Can anyone please help me with it?

The lowercase ASCII letters occupy a contiguous part of the ASCII table, from code 97 ('a') to 122 ('z'). If your input consists only of lowercase ASCII letters, the expression str.charAt(i) - 'a' evaluates to a value in the range [0, 25]: a becomes 0, b becomes 1, c becomes 2, and so on. Each array slot therefore acts as a counter for one letter, and freq[...]++ increments the counter that belongs to the letter at position i.
However, this approach fails for anything other than lowercase ASCII letters. For example, the uppercase letter 'A' has the value 65, so 'A' - 'a' is 65 - 97 = -32, which is a negative array index and throws an ArrayIndexOutOfBoundsException.
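To make the index mapping concrete, here is a minimal sketch. The class name LetterIndexDemo and the method countLowercase are made up for illustration, and the range check is an added assumption so that characters outside 'a'..'z' are skipped instead of causing an exception.

```java
class LetterIndexDemo {
    // Returns 26 counters: slot 0 counts 'a', slot 1 counts 'b', ..., slot 25 counts 'z'.
    static int[] countLowercase(String str) {
        int[] freq = new int[26];
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            // Guard (an addition to the original code): ignore anything that
            // is not a lowercase ASCII letter, so the index stays within 0..25.
            if (c >= 'a' && c <= 'z') {
                freq[c - 'a']++; // 'a' - 'a' == 0, 'b' - 'a' == 1, ...
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        int[] freq = countLowercase("banana");
        System.out.println("a: " + freq['a' - 'a']);
        System.out.println("b: " + freq['b' - 'a']);
        System.out.println("n: " + freq['n' - 'a']);
    }
}
```

Without the guard, an input such as "Banana" would compute 'B' - 'a' = -31 and fail on the first character.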

It seems to me that you could rewrite your solution in a much simpler way. Unless I'm misunderstanding it, that solution is far more complex than it needs to be.
s.chars().mapToObj(c -> (char) c).collect(Collectors.groupingBy(c -> c, Collectors.counting()));
As for the frequency, a Java char is a numeric type that holds a UTF-16 code unit, and for ASCII characters its value matches the ASCII code. So you can subtract chars from each other to get the numeric distance between them. Thanks to @BackSlash for the stream implementation.
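The one-liner needs a few imports and a place to store the result. The following is a small self-contained sketch. The class and method names are hypothetical, and the TreeMap map factory is my own addition so that the printed order is predictable.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamFrequencyDemo {
    // Groups equal characters together and counts each group.
    static Map<Character, Long> frequencies(String s) {
        return s.chars()                       // IntStream of UTF-16 code units
                .mapToObj(c -> (char) c)       // box each value as a Character
                .collect(Collectors.groupingBy(
                        Function.identity(),   // the character itself is the key
                        TreeMap::new,          // sorted keys (an assumption for readable output)
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(frequencies("banana")); // prints {a=3, b=1, n=2}
    }
}
```

Unlike the array version, this works for any character, not only lowercase ASCII letters, at the cost of boxing and a map lookup per character.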

Related

How to count the remaining uppercase letter T for the letter occurrences counter?

I need some guidance from the code I have here, which counts the alphabet letter occurrences from a sentence.
public static void main(String[] args) {
int wordCount = 0;
String word = "The quick brown fox jumps over the lazy dog near the bank of the river";
for (char letter = 'a'; letter <= 'z'; letter++) {
for (int i = 0; i < word.length(); i++) {
if (word.charAt(i) == letter) {
wordCount++;
}
}
if (wordCount > 0) {
System.out.print(letter + "=" + wordCount + ", ");
wordCount = 0;
}
}
}
Output:
a=3, b=2, c=1, d=1, e=7, f=2, g=1, h=4, i=2, j=1, k=2, l=1, m=1, n=3, o=5, p=1, q=1, r=5, s=1, t=3, u=2, v=2, w=1, x=1, y=1, z=1,
My issue here is that my program doesn't seem to count the uppercase letter T at the start of my sentence, so the count for the letter t is one short: my expected output for t is 4, but the result is only 3. This code is fairly simple, but I'm a bit baffled when it comes to using loops and arrays.
Should I add another for loop that goes through the alphabet in uppercase and put those letters in an array?
Your responses and guidance would really help me with the code I am writing.
Thank you very much, everyone!

Well, a T is not a t. We humans tend to interpret them as (loosely) the same, but to a machine, they aren't.
You have three options here:
You could convert the whole string to lowercase, so T becomes t prior to processing the string. You could use String::toLowerCase for that.
You could also convert each character to lowercase using Character::toLowerCase. This will delegate the conversion to the Character class, which contains many methods to act upon a character.
if (Character.toLowerCase(word.charAt(i)) == letter) {
wordCount++;
}
Another option is to do some arithmetic to convert the uppercase characters to lowercase.
char c = word.charAt(i);
if (c >= 'A' && c <= 'Z') {
c += 32;
}
if (c == letter) {
wordCount++;
}
This works because the difference between 'A' and 'a' is 32.
Note that your code is a little inefficient regarding time-complexity. You have a loop within a loop, so you're iterating 26 times over the whole string.
Another fairly well-known approach is to create an array of 26 positions, one for each letter. For each letter, you increment the value of the array index corresponding to the letter:
// Let's agree that position 0 is for the 'a' and position 25 for the 'z'
int[] frequencies = new int[26];
for (int i = 0; i < word.length(); i++) {
char c = word.charAt(i);
// <<Convert your character to lowercase here>>
// Also check if your character is a Latin letter
int position = c - 'a'; // Now 'a' => 0 and 'z' => 25
frequencies[position]++;
}
Now, the only thing you need to do is loop over the frequencies array and print the quantities.
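Putting the pieces together, a complete version of the array approach could look like the sketch below. The names LetterFrequencyDemo and letterFrequencies are placeholders, and the choice of Character.toLowerCase plus a simple range check to fill in the two commented steps is one option among those listed above.

```java
class LetterFrequencyDemo {
    // Position 0 is for 'a' and position 25 is for 'z'.
    static int[] letterFrequencies(String text) {
        int[] frequencies = new int[26];
        for (int i = 0; i < text.length(); i++) {
            // Convert to lowercase so that 'T' and 't' share one counter.
            char c = Character.toLowerCase(text.charAt(i));
            // Skip spaces, digits and anything else outside 'a'..'z'.
            if (c >= 'a' && c <= 'z') {
                frequencies[c - 'a']++;
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        String word = "The quick brown fox jumps over the lazy dog near the bank of the river";
        int[] frequencies = letterFrequencies(word);
        // Loop over the array once and print the letters that occur.
        for (int i = 0; i < frequencies.length; i++) {
            if (frequencies[i] > 0) {
                System.out.print((char) ('a' + i) + "=" + frequencies[i] + ", ");
            }
        }
        System.out.println();
    }
}
```

This walks the string once and the 26-element array once, instead of walking the whole string 26 times.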

Create substring using loop

I am given a string from which I have to find the substrings that satisfy either of the following conditions:
All characters in the substring are the same, e.g. aa, bbb, cccc.
All characters except the middle character are the same,
e.g. aba, bbabb, etc.
I've come up with an algorithm like this:
I break the string up using two loops: the first loop fixes the starting character and the second loop traverses the rest of the string.
Then I pass each substring to vet() to see whether it contains at most two distinct characters.
If it does, I check whether the substring is a palindrome.
public static int reverse(String s)
{
String wrd="";
for(int i = s.length()-1 ;i>=0;i--)
wrd = wrd + s.charAt(i);
if(s.equals(wrd))
return 1;
else
return 0;
}
public static boolean vet(String s)
{
HashSet<Character> hs = new HashSet<>();
for(char c : s.toCharArray())
{
hs.add(c);
}
if(hs.size() <= 2)
return true;
else
return false;
}
static long substrCount(int n, String s) {
List<String> al = new ArrayList<>();
for(int i=0;i<s.length();i++)
{
for(int j=i;j<s.length();j++)
{
if(vet(s.substring(i,j+1)))
{
if(reverse(s.substring(i,j+1)) == 1)
al.add(s.substring(i,j+1));
}
}
}
return al.size();
}
This code works fine for small strings. However, if the string is big, say ten thousand characters, it fails with a time limit exceeded error.
I suspect that the nested loops in substrCount(), which break the string up and create the substrings, are what drives up the time complexity.
Please review this code and suggest a better way to break up the string. If the complexity comes from some other section, let me know.
link : https://www.hackerrank.com/challenges/special-palindrome-again/problem?h_l=interview&playlist_slugs%5B%5D=interview-preparation-kit&playlist_slugs%5B%5D=strings

You can collect counts from the left side and from the right side of the string in two separate arrays.
We collect the counts like this: if the previous char equals the current char, increase the count by 1; otherwise reset it to 1.
Example:
a a b a a c a a
1 2 1 1 2 1 1 2 // left to right
2 1 1 2 1 1 2 1 // right to left
For substrings in which all characters are equal, we just collect them during the iteration itself.
For substrings in which all characters are equal except the middle one, you can use the table above and collect the substrings as below:
Pseudocode:
if(str.charAt(i-1) == str.charAt(i+1)){ // you will add checks for boundaries
int min_count = Math.min(left[i-1],right[i+1]);
for(int j=min_count;j>=1;--j){
set.add(str.substring(i-j,i+j+1));
}
}
Update:
Below is my accepted solution.
static long substrCount(int n, String s) {
long cnt = 0;
int[] left = new int[n];
int[] right = new int[n];
int len = s.length();
for(int i=0;i<len;++i){
left[i] = 1;
if(i > 0 && s.charAt(i) == s.charAt(i-1)) left[i] += left[i-1];
}
for(int i=len-1;i>=0;--i){
right[i] = 1;
if(i < len-1 && s.charAt(i) == s.charAt(i+1)) right[i] += right[i+1];
}
for(int i=len-1;i>=0;--i){
if(i == 0 || i == len-1) cnt += right[i];
else{
if(s.charAt(i-1) == s.charAt(i+1) && s.charAt(i-1) != s.charAt(i)){
cnt += Math.min(left[i-1],right[i+1]) + 1;
}else if(s.charAt(i) == s.charAt(i+1)) cnt += right[i];
else cnt++;
}
}
return cnt;
}
Algorithm:
The algorithm is the same as explained above, with a few additions.
If the character is at a boundary, that is at index 0 or at len-1, we just add right[i], because one neighbour is missing and the character cannot be a middle character.
If the character is inside the boundaries, we check as follows:
First, if the previous character equals the next character, we also require that the previous character differs from the current one. This makes sure position i really is a distinct middle character. For a string like aaaaa with i at the middle a, the all-equal substrings are handled by the second condition instead, so they are not counted twice. In this case we add Math.min(left[i-1], right[i+1]) for the substrings centred on i, plus 1 for the single character itself.
Second, if s.charAt(i) == s.charAt(i+1), we are inside a run such as aaa, so we add right[i] to count the substrings a, aa, aaa that start at i.
Third, otherwise we do cnt++, which counts the single character.
You can make a few optimizations, such as avoiding the right array entirely, but I leave that to you as an exercise.
Time complexity: O(n), Space complexity: O(n)
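As one possible take on that exercise, the same count can be obtained from run lengths instead of the left and right arrays. This is a different formulation from the accepted code above, not the answerer's own, and the names SpecialSubstringDemo and countSpecial are made up. A run of length k contributes k*(k+1)/2 all-equal substrings, and a run of length 1 sitting between two runs of the same character contributes the smaller of the two neighbouring run lengths.

```java
class SpecialSubstringDemo {
    static long countSpecial(String s) {
        int n = s.length();
        char[] runChar = new char[n]; // character of each run
        int[] runLen = new int[n];    // length of each run
        int runs = 0;

        // Split the string into maximal runs of equal characters.
        for (int i = 0; i < n; ) {
            int j = i;
            while (j < n && s.charAt(j) == s.charAt(i)) {
                j++;
            }
            runChar[runs] = s.charAt(i);
            runLen[runs] = j - i;
            runs++;
            i = j;
        }

        long count = 0;
        // Case 1: substrings whose characters are all the same.
        for (int r = 0; r < runs; r++) {
            long len = runLen[r];
            count += len * (len + 1) / 2;
        }
        // Case 2: a single different character between two runs of the same character.
        for (int r = 1; r + 1 < runs; r++) {
            if (runLen[r] == 1 && runChar[r - 1] == runChar[r + 1]) {
                count += Math.min(runLen[r - 1], runLen[r + 1]);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countSpecial("aabaacaa")); // prints 15
    }
}
```

For the example string aabaacaa, the runs are aa, b, aa, c, aa: the first case gives 3 + 1 + 3 + 1 + 3 = 11, and the second case adds aba, aabaa, aca and aacaa, for a total of 15.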

The runtime of your current solution is O(n^4). You can reduce it to O(n^2 log n) by avoiding the per-substring count of distinct characters and by optimising the palindrome check.
To do so, you pre-calculate an array, say "counter", where every position indicates the number of different characters from the starting index up to that position.
After constructing the array, you can check in O(1) whether a substring has more than two characters by subtracting the counter values at its end and start positions. If the value is 1, there is only one character in the substring. If the value is 2, you can binary search the counter array between the substring's start and end positions to find the position of the single different character. Once you have that position, it is straightforward to check whether the substring is a palindrome.
UPDATE!
Let me explain with an example:
Suppose the string is "aaabaaa".
So, the counter array would be = [1, 1, 1, 2, 2, 2, 2];
Now, let's assume that at some point the outer loop's value is i = 1 and the inner loop's value is j = 5, so the substring is "aabaa".
The number of different characters in the substring is found with the following code:
noOfDifferentCharacter = counter[j] - counter[i-1] + 1
If noOfDifferentCharacter is 1, there is no need to check for a palindrome. If noOfDifferentCharacter is 2, as in our case, we need to check whether the substring is a palindrome. To do that, perform a binary search in the counter array from index i to j for the position where the value is greater than at the previous index. In our case that position is 3; then you just need to check whether it is the middle position of the substring. Note that the counter array is sorted.
Hope this helps. Let me know if you don't understand any step. Happy coding!

Java: Assign values to alphabet and determine value of a string

So I am trying to solve the problem in Java below. Could someone give me an idea of how to approach this? I can only think of using a bunch of confusing for-loops to split up the arr, go through the alphabet, and go through each string, and even then I am confused about strings versus chars. Any advice would be great.
--
Suppose the letter 'A' is worth 1, 'B' is worth 2, and so forth, with 'Z' worth 26. The value of a word is the sum of all the letter values in it. Given an array arr of words composed of capital letters, return the value of the word with the largest value. You may assume that arr has length at least 1.
{"AAA","BBB","CCC"} => 9
{"AAAA","B","C"} => 4
{"Z"} => 26
{"",""} => 0
--
Here is what I have tried so far but I'm lost:
public static int largestValue(String[] arr){
String alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int largest = 0;
int wordTotal=0;
for (int i = 0; i < arr.length; i++){
String[] parts = arr[i].split("");
if (wordTotal < largest){ //I don't think this is in the right place
largest = 0; }
for (int j = 0; j < alphabet.length(); j++){
for(int k = 0; k <parts.length; k++){
if ( alphabet.charAt(j) == parts[k].charAt(0) ){
wordTotal = 0;
wordTotal += alphabet.indexOf(alphabet.charAt(j))+1;
}
}
}
}
return largest;
}

I would start by breaking the problem into parts. The first step is summing one String. To calculate the sum you can iterate over the characters, test whether each character is between 'A' and 'Z' (although your requirements say the input is guaranteed to be valid), subtract 'A' (a char literal) from the character, add one so that 'A' is worth 1, and add the result to your sum. Something like,
static int sumString(final String str) {
int sum = 0;
for (char ch : str.toCharArray()) {
if (ch >= 'A' && ch <= 'Z') { // <-- validate input
sum += 1 + ch - 'A'; // <-- 'A' - 'A' == 0, 'B' - 'A' == 1, etc.
}
}
return sum;
}
Then you can iterate an array of String(s) to get the maximum sum; something like
static int maxString(String[] arr) {
int max = sumString(arr[0]);
for (int i = 1; i < arr.length; i++) {
max = Math.max(max, sumString(arr[i]));
}
return max;
}
or with Java 8+
static int maxString(String[] arr) {
return Stream.of(arr).mapToInt(x -> sumString(x)).max().getAsInt();
}
And, finally, validate the entire operation like
public static void main(String[] args) {
String[][] strings = { { "AAA", "BBB", "CCC" }, { "AAAA", "B", "C" },
{ "Z" }, { "", "" } };
for (String[] arr : strings) {
System.out.printf("%s => %d%n", Arrays.toString(arr), maxString(arr));
}
}
And I get
[AAA, BBB, CCC] => 9
[AAAA, B, C] => 4
[Z] => 26
[, ] => 0

I think it helps to take note of the two key parts here:
1: You need to be able to find the value of a single word, which is the sum of each letter
2: You need to find the value of all words, and find the largest
Since you need to go through each element (letter/character) in a string, and also each element (word) in the array, the problem really is set up for using 2 loops. I think part of the whole problem is making the for loops clear and concise, which is definitely doable. I don't want to give it away, but having a function that, given a word, returns the value of the word, will help. You could find the value of a word, see if it's the largest so far, and repeat. Also, to find the value of a word, please do not use 26 ifs (look up the ASCII table instead!). Hope this gives you a better understanding without giving it away!

StringIndexOutOfBounds in Java

I have two exact copies of code here, except one has '<' in the for loops while the other has '<='. Could someone please explain why I get the index out of bounds exception when I use '<=', but then it works fine with '<'
Error code:
for(int i = 0; i <= str.length(); i++) {
int count = 0;
char currentChar = str.charAt(i);
for(int j = 0; j <= str.length(); j++) {
if (currentChar == str.charAt(j) ) {
count++;
Working code:
for(int i = 0; i < str.length(); i++) {
int count = 0;
char currentChar = str.charAt(i);
for(int j = 0; j < str.length(); j++) {
if (currentChar == str.charAt(j) ) {
count++;
If I don't use <= how will it compare the last character in the string?

Valid String indexes in Java, just like the indexes of any array, go from zero to length minus one. So if your condition is i <= str.length(), the last iteration uses i == str.length(), which is outside the string.
Remember that a String is conceptually a sequence of chars, much like a char[], and again: the valid indexes go from 0 to length-1. This is a convention followed by many other programming languages that count from zero instead of one. The < condition still reaches the last character, because the last character is at index length-1.

Because you cannot access str.charAt(str.length()) without throwing an exception.
a < b means "a is less than b" and it is false when a equals b.
a <= b means "a is less than or equal to b" and it is true when a equals b.
To compare against the last character in the string, write some code to do so, compile and run.
boolean res = currentChar == str.charAt(str.length() - 1); // assuming str contains at least one character

str.length() returns the number of characters in the String. So "String".length() returns 6.
Now, when using indices, you start with zero. so "String".charAt(0) returns 'S'. "String".charAt(6) gives you a StringIndexOutOfBoundsException because the last character in "String" is at index 5.
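A short sketch can confirm both points: the < loop does visit the last character, and reading index length() throws. The class IndexBoundsDemo and its two helper methods are invented for this illustration.

```java
class IndexBoundsDemo {
    // Counts occurrences of target using the "<" bound, which visits
    // indexes 0 .. length()-1, including the last character.
    static int countOccurrences(String str, char target) {
        int count = 0;
        for (int j = 0; j < str.length(); j++) {
            if (str.charAt(j) == target) {
                count++;
            }
        }
        return count;
    }

    // Returns true if reading index length() throws, as the "<=" bound would.
    static boolean throwsAtLength(String str) {
        try {
            str.charAt(str.length());
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("String".length());               // 6
        System.out.println("String".charAt(5));              // g, the last character
        System.out.println(countOccurrences("String", 'g')); // 1, so the last character was compared
        System.out.println(throwsAtLength("String"));        // true
    }
}
```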

String indexes begin at 0. str.length() returns how many characters are in your string. If you have the string
"dog"
"dog".length() = 3,
'd':0, 'o':1, 'g':2.
Since your for loop initializes i to 0, the working loop goes through indexes 0-2, which is 3 values, while the non-working one goes through 0-3. There is no character at index 3, so str.charAt(3) throws a StringIndexOutOfBoundsException.

Compressing a file

I want to write a program which can take a text file and make it smaller. So far it replaces all the repeated character occurrences, and now I want to replace "ou" with "1".
I've tried with an if statement, but it doesn't seem to work.
My method is below:
public String compressIt (String input)
{
int length = input.length(); // length of input
int ix = 0; // actual index in input
char c; // actual read character
int cCounter; // occurrence counter of actual character
String ou = "ou";
StringBuilder output = // the output
new StringBuilder(length);
// loop over every character in input
while(ix < length)
{
// read character at actual index then increments the index
c = input.charAt(ix++);
// we count one occurrence of this character here
cCounter = 1;
// while not reached end of line and next character
// is the same as previously read
while(ix < length && input.charAt(ix) == c)
{
// inc index means skip this character
ix++;
// and inc character occurence counter
cCounter++;
}
if (input.charAt(ix) == 'o' && input.charAt(++ix) == 'u' && ix < length - 1)
{
output.append("1");
}
// if more than one character occurence is counted
if(cCounter > 1)
{
// print the character count
output.append(cCounter);
}
// print the actual character
output.append(c);
}
// return the full compressed output
return output.toString();
}
These are the lines of code I'm referring to.
if (input.charAt(ix) == 'o' && input.charAt(ix + 1) == 'u')
{
output.append("1");
}
What I want to do: Replace characters. I got a text-file which contains "Alice In Wonderland". When my looping through all characters sees an 'o' and an 'u' (like "You"), I want to replace the characters so it looks like: "Y1".
Regards

Most likely you are looping from ix = 0 up to the length of the string.
First of all, my guess is that you are looping up to and including string.length(). That doesn't work, because charAt is zero-indexed:
"abc" has characters at index 0, 1 and 2, but not at 3, which gives the error you describe.
Second, the line you showed uses input.charAt(ix++), which gets the char at position ix (the old value) and only afterwards updates ix to ix + 1. If you want ix to be updated before the call to charAt, you have to write input.charAt(++ix).
Third, there is a String.replace method: input.replace("abc", "def") works well for simple replacements. For more complicated replacements, consider using a regex.
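For the "ou" case specifically, both routes can be sketched as follows. The class ReplaceDemo and its methods are hypothetical and only cover the "ou" replacement, not the repeated-character counting from the question. The manual version checks ix + 1 < length before it reads the next character, which avoids the out-of-bounds access.

```java
class ReplaceDemo {
    // Simple route: let String.replace do the work.
    static String replaceOu(String input) {
        return input.replace("ou", "1");
    }

    // Manual route: same result, with an explicit bounds check.
    static String replaceOuManually(String input) {
        StringBuilder output = new StringBuilder(input.length());
        int ix = 0;
        while (ix < input.length()) {
            // Check the bound first; && short-circuits, so charAt(ix + 1)
            // is only evaluated when that index exists.
            if (ix + 1 < input.length()
                    && input.charAt(ix) == 'o'
                    && input.charAt(ix + 1) == 'u') {
                output.append('1');
                ix += 2; // skip both characters of "ou"
            } else {
                output.append(input.charAt(ix));
                ix++;
            }
        }
        return output.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceOu("You"));         // Y1
        System.out.println(replaceOuManually("You")); // Y1
    }
}
```

Note that the order of the conditions matters: in the question's if statement the bounds test comes last, so charAt has already been called by the time it is evaluated.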

The charAt method itself is not the problem. You need to change your loop condition so that it stops at length - 1, because the body reads the character at i + 1. Otherwise it fails on the last iteration, when i + 1 goes past the end of the string.
for(int i=0; i<inputString.length() - 1; i++)
{
char temp = inputString.charAt(i);
char blah = inputString.charAt(i+1);
System.out.println("temp: "+ temp);
System.out.println("blah: "+ blah);
}
This works for me!
