Accessing characters of a String beyond the range of Integer


I have a String/character sequence that is repeated infinitely. Naturally, the character indices will exceed the range of Integer and fall into the range of Long. Since the methods for accessing characters in both String and StringBuilder all require an int index, how do I access a character at, say, a long index greater than Integer.MAX_VALUE? Is there a way to override methods such as charAt(int index) so that they accept long arguments? If not, how can I access the character at such an index? I considered converting to a character array with String.toCharArray(), but then again, an array's length can only go up to Integer.MAX_VALUE. Is there a method or constructor I'm not aware of that accepts long arguments?

You should definitely not construct such a string and do measurements on it.
This is a test of how well you are able to abstract things. I will give you some code you may study. You should not copy and paste it for several reasons, including the possibility that I made a mistake.
The idea is simply to compute the information, which is possible because we have a simple repetition pattern.
class RepeatedString {
private String s;
public RepeatedString(String s) {this.s = s;}
public char charAt(long i) {
return s.charAt((int)(i % s.length()));
}
public long count(char c, long i) {
long n = 0;
// how many complete repetitions?
{
long r = i / s.length();
if (r > 0) {
// count c in s
for (int j = 0 ; j < s.length() ; j++) n += s.charAt(j) == c ? 1 : 0;
n *= r;
}
}
// how many c in the last, partial repetition
{
long l = i % s.length();
for (int j = 0 ; j < l ; j++) n += s.charAt(j) == c ? 1 : 0;
}
return n;
}
}
class Kata {
public static void main(String[] args) {
RepeatedString s = new RepeatedString("bla");
System.out.println(s.charAt(1)); // expected 'l'
System.out.println(s.charAt(6)); // expected 'b'
System.out.println(s.count('a', 19)); // expected 6
System.out.println(s.count('a', 21)); // expected 7
}
}

Related

Iterate over two strings checking to see if it matches with its pair

I am still new to Java, and I am currently working on a program that takes two strings as arguments and returns the number of mismatched pairs. My program works with the letters A, T, G and C because, in DNA, A always pairs with T and G always pairs with C. I can't quite figure out how to iterate over the strings and check that the first character in string one (T, for example) matches up with its intended pair (A). If it doesn't, it is a mismatched pair and should be added to a counter that is totaled at the end. I believe I can use something called charAt(), but I am unsure of how that works.
I also need to figure out how to take the absolute value of counter before it is added to finalCounter. The main reason is that I only care about the length difference between the two strings, rather than having to make sure that the shorter length is subtracted from the longer one.
Any help would be greatly appreciated!
''''
public class CountMismatches {
public static void main(String[] args) {
{
String seq1 = "TTCGATGGAGCTGTA";
String seq2 = "TAGCTAGCTCGGCATGA";
System.out.println(count_mismatches(seq1, seq2))
//*expected to print out 5 because there are 3 mismatched pairs and 2 that do not have a pair*
}
}
public static int count_mismatches(String seq1, String seq2) {
int mismatchCount = 0;
int counter = seq1.length() - seq2.length();
int finalCounter = mismatchCount + counter;
for(int i = 0; i < seq1.length(); i++) if (seq1.charAt(i) == seq2.charAt(i)) {
break; //checks to see if the length of seq1 and seq2 are the same
}
for(int i = 0; i < seq1.length(); i++) if (seq1.charAt(i) != seq2.charAt(i)) {
return counter; //figure out how to do absolute value for negative numbers
}
return finalCounter;
}
}
'''

Since you only want to count the positions where the pairs differ, you can iterate up to the length of the shorter string and count the positions where the characters do not pair up.
At the end, add the absolute difference between the lengths of seq1 and seq2 and return that value to the main function.
For the logic, all you need is four if conditions that check whether the character is A, T, G, or C and whether its partner is present in the other string.
public class CountMismatches {
public static void main(String[] args) {
{
String seq1 = "TTCGATGGAGCTGTA";
String seq2 = "TAGCTAGCTCGGCATGA";
System.out.println(count_mismatches(seq1, seq2));
}
}
public static int count_mismatches(String seq1, String seq2) {
int finalCounter = 0;
for (int i = 0; i < Math.min(seq1.length(), seq2.length()); i++) {
char c1 = seq1.charAt(i);
char c2 = seq2.charAt(i);
if (c1 == 'A') {
if (c2 == 'T')
continue;
else
finalCounter++;
} else if (c1 == 'T') {
if (c2 == 'A')
continue;
else
finalCounter++;
} else if (c1 == 'G') {
if (c2 == 'C')
continue;
else
finalCounter++;
} else if (c1 == 'C') {
if (c2 == 'G')
continue;
else
finalCounter++;
}
}
return finalCounter + (Math.abs(seq1.length() - seq2.length()));
}
}
and the output is as follows :
5

Make these refactorings:
To make the comparisons easy to code and understand, create a Map whose entries are the pairs (in both directions)
Iterate over the Strings up to the length of the shorter one, adding up the number of matching pairs as you go
The result is the length of the longer String minus the number of matching pairs
Like this:
public static int count_mismatches(String seq1, String seq2) {
Map<Character, Character> pairs = Map.of('A', 'T', 'T', 'A', 'G', 'C', 'C', 'G');
int count = 0;
for (int i = 0; i < Math.min(seq1.length(), seq2.length()); i++) {
if (pairs.get(seq1.charAt(i)) == seq2.charAt(i)) {
count++;
}
}
return Math.max(seq1.length(), seq2.length()) - count;
}
See live demo, which returns 5 for your sample input.

Good Evening,
Something seems off here, this snippet of code:
for(int i = 0; i < seq1.length(); i++)
if (seq1.charAt(i) == seq2.charAt(i)) {
break; //checks to see if the length of seq1 and seq2 are the same
}
Does not do what you think it does. This cycle will loop through all characters in sequence1 using i < seq1.length() and for each character that exists in seq1, it will check if said character is equal to the character with the same index in seq2.
This means that a correction is in order:
int countMismatches = 0;
for(int i = 0; i < seq1.length();i++){
switch(seq1.charAt(i)){
case 'A':
if(seq2.charAt(i) != 'T') countMismatches++;
break;
}
}
Repeat this process for the other letters, and voilà, you should be able to count your mismatches this way.
Do be careful with sequences of different lengths: as soon as you step out of bounds, you will receive an IndexOutOfBoundsException, indicating that you tried to read a character that does not exist.
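For illustration, here is a minimal sketch of the completed switch. It assumes the loop should stop at the end of the shorter sequence and that the leftover, unpaired characters count as mismatches, as the question describes. The class name SwitchMismatches is made up for this example.

```java
class SwitchMismatches {
    // Counts positions where seq1 and seq2 do not form an A-T or G-C pair,
    // plus the characters of the longer sequence that have no partner.
    static int countMismatches(String seq1, String seq2) {
        int shorter = Math.min(seq1.length(), seq2.length());
        int mismatches = 0;
        for (int i = 0; i < shorter; i++) {
            char a = seq1.charAt(i);
            char b = seq2.charAt(i);
            boolean paired;
            switch (a) {
                case 'A': paired = (b == 'T'); break;
                case 'T': paired = (b == 'A'); break;
                case 'G': paired = (b == 'C'); break;
                case 'C': paired = (b == 'G'); break;
                default:  paired = false;      break; // assumption: other letters never pair
            }
            if (!paired) {
                mismatches++;
            }
        }
        return mismatches + Math.abs(seq1.length() - seq2.length());
    }

    public static void main(String[] args) {
        // prints 5: three mismatched pairs plus two unpaired characters
        System.out.println(countMismatches("TTCGATGGAGCTGTA", "TAGCTAGCTCGGCATGA"));
    }
}
```

Bounding the loop with Math.min is what avoids the IndexOutOfBoundsException mentioned above.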

First you must find out which string is the shortest in length. Also you need to get the length difference when calculating the shortest string. After that, use that length as a terminating condition in your for loop. You can use booleans to check whether the values are present before incrementing the counter with an if statement.
The absolute value of any number can be obtained by calling the static method abs() from the Math class. Last, just add the mismatchCounts to the absolute value of the length difference in order to obtain the result.
Here is my solution.
public class App {
public static void main(String[] args) throws Exception {
String seq1 = "TTCGATGGAGCTGTA";
String seq2 = "TAGCTAGCTCGGCATGA";
System.out.println(compareStrings(seq1, seq2));
}
public static int compareStrings(String stringOne, String stringTwo) {
Character A = 'A', T = 'T', G = 'G', C = 'C';
int mismatchCount = 0;
int lowestStringLenght = 0;
int length_one = stringOne.length();
int length_two = stringTwo.length();
int lenght_difference = 0;
if (length_one < length_two) {// string one is shorter
lowestStringLenght = length_one;
lenght_difference = length_one - length_two;
} else if (length_one > length_two) {// string two is shorter
lowestStringLenght = length_two;
lenght_difference = length_two - length_one;
} else { // lengths must be equal, use either
lowestStringLenght = length_one;
lenght_difference = 0; // there is no difference because they are equal
}
}
for (int i = 0; i < lowestStringLenght; i++) {
// A matches with T
// G matches with C
// evaluate if the values A, T, G, C are present
boolean A_T_PRESENT = stringOne.charAt(i) == A && stringTwo.charAt(i) == T;
boolean G_C_PRESENT = stringOne.charAt(i) == G && stringTwo.charAt(i) == C;
boolean T_A_PRESENT = stringOne.charAt(i) == T && stringTwo.charAt(i) == A;
boolean C_G_PRESENT = stringOne.charAt(i) == C && stringTwo.charAt(i) == G;
boolean TWO_EQUAL = stringOne.charAt(i) == stringTwo.charAt(i);
// characters are equal, increase mismatch counter
if (TWO_EQUAL) {
mismatchCount++;
continue;
}
// all booleans evaluated to false, it means that the characters are not proper
// matches. Increment mismatchCount
else if (!A_T_PRESENT && !G_C_PRESENT && !T_A_PRESENT && !C_G_PRESENT) {
mismatchCount++;
continue;
} else {
continue;
}
}
// calculate the sum of the mismatches plus the abs of the length difference
lenght_difference = Math.abs(lenght_difference);
return mismatchCount + lenght_difference;
}
}

Avoid char
The char type is legacy, essentially broken. As a 16-bit value, char is physically incapable of representing most characters. The char type in your particular case would work. But using char is a bad habit generally, as such code may break when encountering any of about 75,000 characters defined in Unicode.
Code point
Use code point integer numbers instead. A code point is the number assigned to each of the over 140,000 characters defined by the Unicode Consortium.
Here we get an IntStream, a series of int values, one for each character in the input string. Then we collect these integer numbers into an array of int values.
int[] codePoints1 = seq1.codePoints().toArray() ;
int[] codePoints2 = seq2.codePoints().toArray() ;
You said the input strings may be of unequal length. So our two arrays may be jagged, of different lengths. Figure out the size of the shorter array.
int smallerSize = Math.min( codePoints1.length , codePoints2.length ) ;
Keep track of the index number of mismatched rows.
List<Integer> mismatchIndices = new ArrayList <>();
Loop the arrays based on that smaller size.
for( int i = 0 ; i < smallerSize ; i ++ )
{
    if ( isBasePairValid( codePoints1[ i ] , codePoints2[ i ] ) )
    {
        …
    } else
    {
        mismatchIndices.add( i ) ;
    }
}
Write an isBasePairValid method
Write the isBasePairValid method, taking two arguments, the code points of the two nucleobase letters.
static int A = "A".codePointAt( 0 ) ; // Annoying zero-based index counting. So first character is number zero.
static int C = "C".codePointAt( 0 ) ;
static int G = "G".codePointAt( 0 ) ;
static int T = "T".codePointAt( 0 ) ;
static boolean isBasePairValid( int first , int second )
{
    if( first == A ) return ( second == T ) ;
    else if( first == T ) return ( second == A ) ;
    else if( first == C ) return ( second == G ) ;
    else if( first == G ) return ( second == C ) ;
    else { throw new IllegalStateException( … ) ; }
}
Count the mismatches.
int countMismatches = mismatchIndices.size() ;
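Putting the pieces above together, a self-contained sketch might look like this. The class name CodePointMismatches, the helper mismatchIndices, and the exception message are my own assumptions for illustration; they are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointMismatches {
    static final int A = "A".codePointAt(0);
    static final int C = "C".codePointAt(0);
    static final int G = "G".codePointAt(0);
    static final int T = "T".codePointAt(0);

    // True when the two code points form an A-T or G-C pair (either direction).
    static boolean isBasePairValid(int first, int second) {
        if (first == A) return second == T;
        else if (first == T) return second == A;
        else if (first == C) return second == G;
        else if (first == G) return second == C;
        else throw new IllegalStateException(
                "Not a nucleobase letter: " + String.valueOf(Character.toChars(first)));
    }

    // Returns the indices (within the shorter sequence) whose pair is invalid.
    static List<Integer> mismatchIndices(String seq1, String seq2) {
        int[] codePoints1 = seq1.codePoints().toArray();
        int[] codePoints2 = seq2.codePoints().toArray();
        int smallerSize = Math.min(codePoints1.length, codePoints2.length);
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < smallerSize; i++) {
            if (!isBasePairValid(codePoints1[i], codePoints2[i])) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        // prints [0, 6, 11]
        System.out.println(mismatchIndices("TTCGATGGAGCTGTA", "TAGCTAGCTCGGCATGA"));
    }
}
```

Unpaired trailing characters are not counted here; add the difference between the two array lengths if you need them in the total.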

The numerical sum of chars T & A and G & C is fixed and unique for legal nucleobase pairs. So you just need to ensure that the corresponding bases have one of those sums.
String seq1 = "TTCGATGGAGCTGTA";
String seq2 = "TAGCTAGCTCGGCATGA";
System.out.println(count_mismatches(seq1, seq2));
prints
5
find the shorter length to iterate over
establish the fixed sums for comparison
iterate, compare each pair's sum to the expected sums, and update the count accordingly
public static int count_mismatches(String seq1, String seq2) {
int len1 = seq1.length();
int len2 = seq2.length();
int len = len1;
if (len1 > len2) {
len = len2;
}
int sumTA = 'T'+'A';
int sumGC = 'G'+'C';
int misMatchCount = Math.abs(len1-len2);
for (int i = 0; i < len; i++) {
int pair = seq1.charAt(i) + seq2.charAt(i);
if (pair != sumTA && pair != sumGC) {
misMatchCount++;
}
}
return misMatchCount;
}

Java: Adding Digits Of A String

I have the correct code, I found an answer long ago, however I still don't understand why it works.
class Main {
public static void main(String[] args) {
System.out.println(D5PSum(10));
}
private static int D5PSum(int number) {
String n = Integer.toString(number);
int sum = 0;
for (int i = 0; i < n.length(); i++) {
// (1) WHY DOES THIS NOT WORK
//sum += Integer.parseInt(number.charAt(i));
// (2) MORE IMPORTANTLY WHY DOES THIS WORK
char c = n.charAt(i);
sum += (c-'0');
// (3) WHAT IN THE WORLD IS c-'0'
}
return sum;
}
}

// (1) WHY DOES THIS NOT WORK
because Integer.parseInt(...) expects a String as its parameter, not a char
// (2) MORE IMPORTANTLY WHY DOES THIS WORK
char c = n.charAt(i);
A char is nothing other than an integer mapped to a table of symbols (the ASCII table, for example), so (c - '0') is just another valid arithmetic operation.

charAt is not a valid method of the primitive type int.
'0' is the character 0, and the character encoding set that Java uses has 0 to 9 in a consecutive block. Therefore c - '0' yields the position of c in that consecutive block which is therefore the value of the digit. (Actually this sort of thing is idiomatic C - goes right back to the 1960s).
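A small sketch makes the arithmetic visible. It assumes plain ASCII digits; the class name DigitValueDemo and the helper digitValue are invented for this example.

```java
class DigitValueDemo {
    // Returns the numeric value of an ASCII digit character '0'..'9'.
    static int digitValue(char c) {
        return c - '0'; // both operands are promoted to int before subtracting
    }

    public static void main(String[] args) {
        System.out.println((int) '0');                // 48
        System.out.println((int) '7');                // 55
        System.out.println(digitValue('7'));          // 7, i.e. 55 - 48
        System.out.println(Character.digit('7', 10)); // 7, a library alternative
    }
}
```

Character.digit does the same job and returns -1 when the character is not a digit in the given radix, which makes it a safer choice for unchecked input.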

You should first convert the char to a String. Please check the code below:
class MainClass {
public static void main(String[] args) {
System.out.println(D5PSum(11));
}
private static int D5PSum(int number) {
String n = Integer.toString(number);
System.out.println(n);
int sum = 0;
for (int i = 0; i < n.length(); i++) {
// (1) WHY DOES THIS NOT WORK
String str = String.valueOf(n.charAt(i));
sum += Integer.parseInt(str);
// (2) MORE IMPORTANTLY WHY DOES THIS WORK
// char c = n.charAt(i);
// sum += (c-'0');
// (3) WHAT IN THE WORLD IS c-'0'
}
return sum;
}
}

1
It doesn't work because Integer.parseInt takes a String and String.charAt returns a char (character). Integer.parseInt(Character.toString(n.charAt(i))) would work.
2/3
A char represents a number between 0 and 65535. Each digit character (0-9) has a number in that range, which depends on the charset. The digits are typically in a row: for example, the character 0 has the value 48, 1 has 49, and 9 has 57. So if you want to know the digit value of a char, you simply subtract the character value of 0 from your character. That is the line c-'0'.

// (1) WHY DOES THIS NOT WORK
//sum += Integer.parseInt(number.charAt(i));
number is a variable of primitive data type "int" so number.charAt(i) won't work.
// (2) MORE IMPORTANTLY WHY DOES THIS WORK
char c = n.charAt(i);
n is an instance of String and we are getting the character at i th position in the n string
sum += (c-'0');
// (3) WHAT IN THE WORLD IS c-'0'
Every character has a numeric code assigned to it; for example, '0' = 48 and '7' = 55. Here c is a variable holding the current digit character, not the letter 'c'. So if c holds '7', then c-'0' is equivalent to 55-48, which is 7.

Why convert to a string in the first place? The simplest and fastest way to solve this is without deviation to strings:
private static int D5PSum(int number) {
int v = number, sum = 0;
while (v != 0) {
sum += v % 10;
v /= 10;
}
return sum;
}

If you want the part of your code that does not work to work, then do this:
class Main {
public static void main(String[] args) {
System.out.println(D5PSum(10));
}
private static int D5PSum(int number) {
String n = Integer.toString(number);
int sum = 0;
for (int i = 0; i < n.length(); i++) {
sum += Integer.parseInt(n.charAt(i)+"");
}
return sum;
}
}

To get sum of the digits of a string str:
int sum = str.chars().map(Character::getNumericValue).sum();
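One caveat worth illustrating: Character.getNumericValue returns -1 for characters without a numeric value and 10 through 35 for the Latin letters, so the one-liner is only a digit sum when the string contains nothing but digits. The sketch below adds a filter for that case; skipping non-digits is my assumption about the desired behaviour, and the class name DigitStream is made up.

```java
class DigitStream {
    // Sums the digits of str, ignoring every character that is not a digit.
    static int digitSum(String str) {
        return str.chars()
                  .filter(Character::isDigit)        // assumption: non-digits are skipped
                  .map(Character::getNumericValue)
                  .sum();
    }

    public static void main(String[] args) {
        System.out.println(digitSum("12345")); // 15
        System.out.println(digitSum("a1b2"));  // 3
    }
}
```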

Determine if a given string is a k-palindrome

I'm trying to solve the following interview practice question:
A k-palindrome is a string which transforms into a palindrome on removing at most
k characters.
Given a string S, and an integer K, print "YES" if S is a k-palindrome;
otherwise print "NO".
Constraints:
S has at most 20,000 characters.
0 <= k <= 30
Sample Test Cases:
Input - abxa 1
Output - YES
Input - abdxa 1
Output - NO
My approach I've decided is going to be taking all possible String combinations of length s.length - k or greater, i.e. "abc" and k = 1 -> "ab" "bc" "ac" "abc" and checking if they are palindromes. I have the following code so far, but can't seem to figure out a proper way to generate all these string combinations in the general case:
public static void isKPalindrome(String s, int k) {
// Generate all string combinations and call isPalindrome on them,
// printing "YES" at first true
}
private static boolean isPalindrome(String s) {
char[] c = s.toCharArray();
int slow = 0;
int fast = 0;
Stack<Character> stack = new Stack<>();
while (fast < c.length) {
stack.push(c[slow]);
slow += 1;
fast += 2;
}
if (c.length % 2 == 1) {
stack.pop();
}
while (!stack.isEmpty()) {
if (stack.pop() != c[slow++]) {
return false;
}
}
return true;
}
Can anyone figure out a way to implement this, or perhaps demonstrate a better way?

I think there is a better way
package se.wederbrand.stackoverflow;
public class KPalindrome {
public static void main(String[] args) {
KPalindrome kPalindrome = new KPalindrome();
String s = args[0];
int k = Integer.parseInt(args[1]);
if (kPalindrome.testIt(s, k)) {
System.out.println("YES");
}
else {
System.out.println("NO");
}
}
boolean testIt(String s, int k) {
if (s.length() <= 1) {
return true;
}
while (s.charAt(0) == s.charAt(s.length()-1)) {
s = s.substring(1, s.length()-1);
if (s.length() <= 1) {
return true;
}
}
if (k == 0) {
return false;
}
// Try to remove the first or last character
return testIt(s.substring(0, s.length() - 1), k - 1) || testIt(s.substring(1, s.length()), k - 1);
}
}
Since K is at most 30, it's likely that a string can be rejected pretty quickly, without even examining its middle.
I've tested this with the two provided test cases as well as with a 20k-character string consisting of "ab" repeated 10k times and k = 30.
All tests are fast and return the correct results.

This can be solved using the edit distance dynamic programming algorithm. The edit distance DP algorithm finds the minimum number of operations required to convert a source string to a destination string. Here the operations are either insertion or deletion of a character.
The K-palindrome problem can be solved with the edit distance algorithm by checking the minimum number of operations required to convert the input string to its reverse.
Let editDistance(source,destination) be the function which takes a source string and a destination string and returns the minimum number of operations required to convert the source string to the destination string.
A string S is a K-palindrome if editDistance(S,reverse(S))<=2*K
This is because we can transform the given string S into its reverse by deleting at most K letters and then inserting the same K letters at different positions.
This will be more clear with an example.
Let S=madtam and K=1.
To convert S into reverse of S (i.e matdam) first we have to remove the character 't' at index 3 ( 0 based index) in S.
Now the intermediate string is madam. Then we have to insert the character 't' at index 2 in the intermediate string to get "matdam" which is the reverse of string s.
If you look carefully you will know that the intermediate string "madam" is the palindrome that is obtained by removing k=1 characters.
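The idea can be sketched in Java as follows. This is only an illustration under the assumptions stated above (insertions and deletions only, no substitutions); the class name KPalindromeEditDistance is made up, and the full table is kept for clarity even though two rows would be enough.

```java
class KPalindromeEditDistance {
    // Minimum number of single-character insertions and deletions
    // needed to turn source into destination.
    static int editDistance(String source, String destination) {
        int n = source.length();
        int m = destination.length();
        int[][] dp = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            for (int j = 0; j <= m; j++) {
                if (i == 0 || j == 0) {
                    dp[i][j] = i + j; // one side is empty: delete or insert everything
                } else if (source.charAt(i - 1) == destination.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    dp[i][j] = 1 + Math.min(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[n][m];
    }

    static boolean isKPalindrome(String s, int k) {
        String reversed = new StringBuilder(s).reverse().toString();
        return editDistance(s, reversed) <= 2 * k;
    }

    public static void main(String[] args) {
        System.out.println(isKPalindrome("madtam", 1)); // true
        System.out.println(isKPalindrome("abxa", 1));   // true
        System.out.println(isKPalindrome("abdxa", 1));  // false
    }
}
```

For the 20,000-character limit in the question, the full (n+1) x (n+1) table would need well over a gigabyte, so a real solution would keep only the previous and current rows.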

I found the length of the longest palindrome that can be obtained by removing characters from the string, using dynamic programming. The palindrome need not be made of consecutive characters. For example, abscba has a longest palindromic length of 5 (absba or abcba).
This can then be used as follows: whenever k >= (len - length of the longest palindrome), the result is true, otherwise false.
public static int longestPalindrome(String s){
int len = s.length();
int[][] cal = new int[len][len];
for(int i=0;i<len;i++){
cal[i][i] = 1; //considering strings of length = 1
}
for(int i=0;i<len-1;i++){
//considering strings of length = 2
if (s.charAt(i) == s.charAt(i+1)){
cal[i][i+1] = 2;
}else{
cal[i][i+1] = 1; // a single character is itself a palindrome
}
}
for(int p = len-1; p>=0; p--){
for(int q=p+2; q<len; q++){
if (s.charAt(p)==s.charAt(q)){
cal[p][q] = 2 + cal[p+1][q-1];
}else{
cal[p][q] = Math.max(cal[p+1][q], cal[p][q-1]);
}
}
}
return cal[0][len-1];
}

This is a common interview question, and I'm a little surprised that no one has mentioned dynamic programming yet. This problem exhibits optimal substructure (if a string is a k-palindrome, some substrings are also k-palindromes), and overlapping subproblems (the solution requires comparing the same substrings more than once).
This is a special case of the edit distance problem, where we check if a string s can be converted to string p by only deleting characters from either or both strings.
Let the string be s and its reverse rev. Let dp[i][j] be the number of deletions required to convert the first i characters of s to the first j characters of rev. Since deletions have to be done in both strings, if dp[n][n] <= 2 * k, then the string is a k-palindrome.
Base case: When one of the strings is empty, all characters from the other string need to be deleted in order to make them equal.
Time complexity: O(n^2).
Scala code:
def kPalindrome(s: String, k: Int): Boolean = {
val rev = s.reverse
val n = s.length
val dp = Array.ofDim[Int](n + 1, n + 1)
for (i <- 0 to n; j <- 0 to n) {
dp(i)(j) = if (i == 0 || j == 0) i + j
else if (s(i - 1) == rev(j - 1)) dp(i - 1)(j - 1)
else 1 + math.min(dp(i - 1)(j), dp(i)(j - 1))
}
dp(n)(n) <= 2 * k
}
Since we are doing bottom-up DP, an optimization is to return false if at any time i == j && dp[i][j] > 2 * k, since all subsequent i == j must be greater.

Thanks to Andreas, that algo worked like a charm. Here my implementation for anyone who's curious. Slightly different, but fundamentally your same logic:
public static boolean kPalindrome(String s, int k) {
if (s.length() <= 1) {
return true;
}
char[] c = s.toCharArray();
if (c[0] != c[c.length - 1]) {
if (k <= 0) {
return false;
} else {
char[] minusFirst = new char[c.length - 1];
System.arraycopy(c, 1, minusFirst, 0, c.length - 1);
char[] minusLast = new char[c.length - 1];
System.arraycopy(c, 0, minusLast, 0, c.length - 1);
return kPalindrome(String.valueOf(minusFirst), k - 1)
|| kPalindrome(String.valueOf(minusLast), k - 1);
}
} else {
char[] minusFirstLast = new char[c.length - 2];
System.arraycopy(c, 1, minusFirstLast, 0, c.length - 2);
return kPalindrome(String.valueOf(minusFirstLast), k);
}
}

This problem can be solved using the famous Longest Common Subsequence(LCS) method. When LCS is applied with the string and the reverse of the given string, then it gives us the longest palindromic subsequence present in the string.
Let the longest palindromic subsequence length of a given string of length string_length be palin_length. Then (string_length - palin_length) gives the number of characters required to be deleted to convert the string to a palindrome. Thus, the given string is k-palindrome if (string_length - palin_length) <= k.
Let me give some examples,
Initial String: madtam (string_length = 6)
Longest Palindromic Subsequence: madam (palin_length = 5)
Number of non-contributing characters: 1 ( string_length - palin_length)
Thus this string is k-palindromic for any k >= 1, because you may delete at most k characters (k or fewer).
Here is the code snippet:
#include<iostream>
#include<cstdio>
#include<cstring> // strlen, strcpy, memset
#include<algorithm>
using namespace std;
#define MAX 10000
int table[MAX+1][MAX+1];
int longest_common_subsequence(char *first_string, char *second_string){
int first_string_length = strlen(first_string), second_string_length = strlen(second_string);
int i, j;
memset( table, 0, sizeof(table));
for( i=1; i<=first_string_length; i++ ){
for( j=1; j<=second_string_length; j++){
if( first_string[i-1] == second_string[j-1] )
table[i][j] = table[i-1][j-1] + 1;
else
table[i][j] = max(table[i-1][j], table[i][j-1]);
}
}
return table[first_string_length][second_string_length];
}
char first_string[MAX+1], second_string[MAX+1];
int main(){
int k; // maximum number of deletions allowed
scanf("%s %d", first_string, &k);
strcpy(second_string, first_string);
reverse(second_string, second_string+strlen(second_string));
int max_palindromic_length = longest_common_subsequence(first_string, second_string);
int non_contributing_chars = strlen(first_string) - max_palindromic_length;
if( k >= non_contributing_chars)
printf("K palindromic!\n");
else
printf("Not K palindromic!\n");
return 0;
}

I designed a solution purely based on recursion -
public static boolean isKPalindrome(String str, int k) {
if(str.length() < 2) {
return true;
}
if(str.charAt(0) == str.charAt(str.length()-1)) {
return isKPalindrome(str.substring(1, str.length()-1), k);
} else{
if(k == 0) {
return false;
} else {
if(isKPalindrome(str.substring(0, str.length() - 1), k-1)) {
return true;
} else{
return isKPalindrome(str.substring(1, str.length()), k-1);
}
}
}
}
There is no while loop in above implementation as in the accepted answer.
Hope it helps somebody looking for it.

public static boolean failK(String s, int l, int r, int k) {
if (k < 0)
return false;
if (l > r)
return true;
if (s.charAt(l) != s.charAt(r)) {
return failK(s, l + 1, r, k - 1) || failK(s, l, r - 1, k - 1);
} else {
return failK(s, l + 1, r - 1, k);
}
}

Find all substrings that are palindromes

If the input is 'abba' then the possible palindromes are a, b, b, a, bb, abba.
I understand that determining if string is palindrome is easy. It would be like:
public static boolean isPalindrome(String str) {
    int len = str.length();
    for(int i=0; i<len/2; i++) {
        if(str.charAt(i)!=str.charAt(len-i-1)) {
            return false;
        }
    }
    return true;
}
But what is the efficient way of finding palindrome substrings?

This can be done in O(n), using Manacher's algorithm.
The main idea is a combination of dynamic programming and (as others have said already) computing maximum length of palindrome with center in a given letter.
What we really want to calculate is radius of the longest palindrome, not the length.
The radius is simply length/2 or (length - 1)/2 (for odd-length palindromes).
After computing palindrome radius pr at given position i we use already computed radiuses to find palindromes in range [i - pr ; i]. This lets us (because palindromes are, well, palindromes) skip further computation of radiuses for range [i ; i + pr].
While we search in range [i - pr ; i], there are four basic cases for each position i - k (where k is in 1,2,... pr):
no palindrome (radius = 0) at i - k
(this means radius = 0 at i + k, too)
inner palindrome, which means it fits in range
(this means radius at i + k is the same as at i - k)
outer palindrome, which means it doesn't fit in range
(this means radius at i + k is cut down to fit in range, i.e because i + k + radius > i + pr we reduce radius to pr - k)
sticky palindrome, which means i + k + radius = i + pr
(in that case we need to search for potentially bigger radius at i + k)
Full, detailed explanation would be rather long. What about some code samples? :)
I've found a C++ implementation of this algorithm by a Polish teacher, mgr Jerzy Wałaszek.
I've translated the comments to English, added some other comments and simplified it a bit to make the main part easier to grasp.
Take a look here.
Note: in case of problems understanding why this is O(n), try to look this way:
after finding radius (let's call it r) at some position, we need to iterate over r elements back, but as a result we can skip computation for r elements forward. Therefore, total number of iterated elements stays the same.
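The linked C++ code is not reproduced here. As a rough illustration only, this is the common separator-based formulation of Manacher's algorithm in Java, which may differ in detail from the version described above. The class and method names are my own.

```java
class Manacher {
    // For the interleaved text "|a|b|b|a|", returns p where p[i] is the radius of
    // the longest palindrome centered at i, which equals the length of the
    // corresponding palindrome in the original string.
    static int[] radii(String s) {
        int n = 2 * s.length() + 1;
        char[] t = new char[n];
        for (int i = 0; i < n; i++) {
            t[i] = (i % 2 == 0) ? '|' : s.charAt(i / 2);
        }
        int[] p = new int[n];
        int center = 0; // center of the palindrome reaching furthest to the right
        int right = 0;  // right edge of that palindrome (center + p[center])
        for (int i = 0; i < n; i++) {
            if (i < right) {
                // reuse the radius of the mirrored position, cut to fit in range
                p[i] = Math.min(right - i, p[2 * center - i]);
            }
            // try to grow beyond what is already known
            while (i - p[i] - 1 >= 0 && i + p[i] + 1 < n
                    && t[i - p[i] - 1] == t[i + p[i] + 1]) {
                p[i]++;
            }
            if (i + p[i] > right) {
                center = i;
                right = i + p[i];
            }
        }
        return p;
    }

    // Counts all palindromic substrings, equal ones at different positions included.
    static long countPalindromicSubstrings(String s) {
        long count = 0;
        for (int radius : radii(s)) {
            count += (radius + 1) / 2;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPalindromicSubstrings("abba")); // 6
    }
}
```

Each center contributes (radius + 1) / 2 palindromes, because every shorter palindrome around the same center is also a palindrome. For abba that gives a, b, b, a, bb and abba.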

Perhaps you could iterate over each potential middle character (odd-length palindromes) and each middle point between characters (even-length palindromes) and extend each until you cannot go any further (the next left and right characters don't match).
That would save a lot of computation when there are not many palindromes in the string. In such a case the cost would be O(n) for palindrome-sparse strings.
For palindrome-dense inputs it would be O(n^2), as each position cannot be extended by more than the length of the array / 2. Obviously this is even less towards the ends of the array.
public Set<String> palindromes(final String input) {
final Set<String> result = new HashSet<>();
for (int i = 0; i < input.length(); i++) {
// expanding even length palindromes:
expandPalindromes(result,input,i,i+1);
// expanding odd length palindromes:
expandPalindromes(result,input,i,i);
}
return result;
}
public void expandPalindromes(final Set<String> result, final String s, int i, int j) {
while (i >= 0 && j < s.length() && s.charAt(i) == s.charAt(j)) {
result.add(s.substring(i,j+1));
i--; j++;
}
}

So, each distinct letter is already a palindrome - so you already have N + 1 palindromes, where N is the number of distinct letters (plus empty string). You can do that in single run - O(N).
Now, for non-trivial palindromes, you can test each point of your string to be a center of potential palindrome - grow in both directions - something that Valentin Ruano suggested.
This solution will take O(N^2) since each test is O(N) and number of possible "centers" is also O(N) - the center is either a letter or space between two letters, again as in Valentin's solution.
Note, there is also an O(N) solution to your problem, based on Manacher's algorithm (the article describes "longest palindrome", but the algorithm could be used to count all of them)

I just came up with my own logic which helps to solve this problem.
Happy coding.. :-)
System.out.println("Finding all palindromes in a given string : ");
subPal("abcacbbbca");
private static void subPal(String str) {
String s1 = "";
int N = str.length(), count = 0;
Set<String> palindromeArray = new HashSet<String>();
System.out.println("Given string : " + str);
System.out.println("******** Ignoring single character as substring palindrome");
for (int i = 2; i <= N; i++) {
for (int j = 0; j <= N; j++) {
int k = i + j - 1;
if (k >= N)
continue;
s1 = str.substring(j, i + j);
if (s1.equals(new StringBuilder(s1).reverse().toString())) {
palindromeArray.add(s1);
}
}
}
System.out.println(palindromeArray);
for (String s : palindromeArray)
System.out.println(s + " - is a palindrome string.");
System.out.println("The no.of substring that are palindrome : "
+ palindromeArray.size());
}
Output:-
Finding all palindromes in a given string :
Given string : abcacbbbca
******** Ignoring single character as substring palindrome ********
[cac, acbbbca, cbbbc, bb, bcacb, bbb]
cac - is a palindrome string.
acbbbca - is a palindrome string.
cbbbc - is a palindrome string.
bb - is a palindrome string.
bcacb - is a palindrome string.
bbb - is a palindrome string.
The no.of substring that are palindrome : 6

I suggest building up from a base case and expanding until you have all of the palindromes.
There are two types of palindromes: even-length and odd-length. I haven't figured out how to handle both in the same way, so I'll break it up.
1) Add all single letters
2) With this list you have all of the starting points for your palindromes. Run both of these for each index in the string (or 1 -> length-1 because you need at least 2 length):
void findAllEvenFrom(int index){
    int i=0;
    // stop as soon as index-i or index+i+1 falls outside the string bounds
    while(index-i >= 0 && index+i+1 < str.length()) {
        if(str.charAt(index-i) != str.charAt(index+i+1))
            return; // Here we found out that this index isn't a center for palindromes of >=i size, so we can give up
        outputList.add(str.substring(index-i, index+i+2)); // substring's end index is exclusive
        i++;
    }
}
//Odd looks about the same, but with a change in the bounds.
void findAllOddFrom(int index){
    int i=0;
    // stop as soon as index-i-1 or index+i+1 falls outside the string bounds
    while(index-i-1 >= 0 && index+i+1 < str.length()) {
        if(str.charAt(index-i-1) != str.charAt(index+i+1))
            return;
        outputList.add(str.substring(index-i-1, index+i+2)); // substring's end index is exclusive
        i++;
    }
}
I'm not sure if this helps the Big-O for your runtime, but it should be much more efficient than trying each substring. Worst case would be a string of all the same letter which may be worse than the "find every substring" plan, but with most inputs it will cut out most substrings because you can stop looking at one once you realize it's not the center of a palindrome.

I tried the following code and it's working well for these cases.
It also handles individual characters.
A few of the cases which passed:
abaaa --> [aba, aaa, b, a, aa]
geek --> [g, e, ee, k]
abbaca --> [b, c, a, abba, bb, aca]
abaaba -->[aba, b, abaaba, a, baab, aa]
abababa -->[aba, babab, b, a, ababa, abababa, bab]
forgeeksskeegfor --> [f, g, e, ee, s, r, eksske, geeksskeeg,
o, eeksskee, ss, k, kssk]
Code
static Set<String> set = new HashSet<String>();
static String DIV = "|";
public static void main(String[] args) {
String str = "abababa";
String ext = getExtendedString(str);
// will check for even length palindromes
for(int i=2; i<ext.length()-1; i+=2) {
addPalindromes(i, 1, ext);
}
// will check for odd length palindromes including individual characters
for(int i=1; i<=ext.length()-2; i+=2) {
addPalindromes(i, 0, ext);
}
System.out.println(set);
}
/*
* Generates extended string, with dividers applied
* eg: input = abca
* output = |a|b|c|a|
*/
static String getExtendedString(String str) {
StringBuilder builder = new StringBuilder();
builder.append(DIV);
for(int i=0; i< str.length(); i++) {
builder.append(str.charAt(i));
builder.append(DIV);
}
String ext = builder.toString();
return ext;
}
/*
* Recursive matcher
* If match is found for palindrome ie char[mid-offset] = char[mid+ offset]
* Calculate further with offset+=2
*
*
*/
static void addPalindromes(int mid, int offset, String ext) {
// boundary checks
if(mid - offset <0 || mid + offset > ext.length()-1) {
return;
}
if (ext.charAt(mid-offset) == ext.charAt(mid+offset)) {
set.add(ext.substring(mid-offset, mid+offset+1).replace(DIV, ""));
addPalindromes(mid, offset+2, ext);
}
}
Hope it's fine.

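import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
public class PolindromeMyLogic {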
static int polindromeCount = 0;
private static HashMap<Character, List<Integer>> findCharAndOccurance(
char[] charArray) {
HashMap<Character, List<Integer>> map = new HashMap<Character, List<Integer>>();
for (int i = 0; i < charArray.length; i++) {
char c = charArray[i];
if (map.containsKey(c)) {
List<Integer> list = map.get(c);
list.add(i);
} else {
List<Integer> list = new ArrayList<Integer>();
list.add(i);
map.put(c, list);
}
}
return map;
}
private static void countPolindromeByPositions(char[] charArray,
HashMap<Character, List<Integer>> map) {
map.forEach((character, list) -> {
int n = list.size();
if (n > 1) {
for (int i = 0; i < n - 1; i++) {
for (int j = i + 1; j < n; j++) {
if (list.get(i) + 1 == list.get(j)
|| list.get(i) + 2 == list.get(j)) {
polindromeCount++;
} else {
char[] temp = new char[(list.get(j) - list.get(i))
+ 1];
int jj = 0;
for (int ii = list.get(i); ii <= list
.get(j); ii++) {
temp[jj] = charArray[ii];
jj++;
}
if (isPolindrome(temp))
polindromeCount++;
}
}
}
}
});
}
private static boolean isPolindrome(char[] charArray) {
int n = charArray.length;
char[] temp = new char[n];
int j = 0;
for (int i = (n - 1); i >= 0; i--) {
temp[j] = charArray[i];
j++;
}
if (Arrays.equals(charArray, temp))
return true;
else
return false;
}
public static void main(String[] args) {
String str = "MADAM";
char[] charArray = str.toCharArray();
countPolindromeByPositions(charArray, findCharAndOccurance(charArray));
System.out.println(polindromeCount);
}
}
Try this out. It's my own solution.

// Maintain a Set of palindromes so that we get distinct elements at the end
// Treat each char (odd length) and each gap between two chars (even length) as a middle point, and expand while the left and right chars are equal
static int palindrome(String str) {
Set<String> distinctPln = new HashSet<String>();
for (int i=0; i<str.length();i++) {
// Odd length, centered on i; the first pass adds the single char itself
for (int j=i, k=i; j>=0 && k<str.length() && str.charAt(j) == str.charAt(k); j--, k++) {
distinctPln.add(str.substring(j,k+1));
}
// Even length, centered between i and i+1; the first pass adds a string of length 2
for (int j=i, k=i+1; j>=0 && k<str.length() && str.charAt(j) == str.charAt(k); j--, k++) {
distinctPln.add(str.substring(j,k+1));
}
}
for (String pln : distinctPln) {
System.out.print(pln + ",");
}
return distinctPln.size();
}

This code finds all distinct substrings that are palindromes.
Here is the code I tried; it works fine.
import java.util.HashSet;
import java.util.Set;
public class SubstringPalindrome {
public static void main(String[] args) {
String s = "abba";
checkPalindrome(s);
}
public static int checkPalindrome(String s) {
int L = s.length();
int counter =0;
long startTime = System.currentTimeMillis();
Set<String> hs = new HashSet<String>();
// add elements to the hash set
System.out.println("Possible substrings: ");
for (int i = 0; i < L; ++i) {
for (int j = 0; j < (L - i); ++j) {
String subs = s.substring(j, i + j + 1);
counter++;
System.out.println(subs);
if(isPalindrome(subs))
hs.add(subs);
}
}
System.out.println("Total possible substrings are "+counter);
System.out.println("Total palindromic substrings are "+hs.size());
System.out.println("Possible palindromic substrings: "+hs.toString());
long endTime = System.currentTimeMillis();
System.out.println("It took " + (endTime - startTime) + " milliseconds");
return hs.size();
}
public static boolean isPalindrome(String s) {
if(s.length() == 0 || s.length() ==1)
return true;
if(s.charAt(0) == s.charAt(s.length()-1))
return isPalindrome(s.substring(1, s.length()-1));
return false;
}
}
OUTPUT:
Possible substrings:
a
b
b
a
ab
bb
ba
abb
bba
abba
Total possible substrings are 10
Total palindromic substrings are 4
Possible palindromic substrings: [bb, a, b, abba]
It took 1 milliseconds
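The recursive isPalindrome above allocates a new substring on every call. A possible alternative, sketched here under the assumption that a plain character-by-character comparison is acceptable, walks two indexes toward the middle instead. The class name TwoPointerPalindrome is hypothetical.

```java
class TwoPointerPalindrome {
    // Compares characters from both ends toward the middle; no substrings are allocated.
    static boolean isPalindrome(String s) {
        int left = 0;
        int right = s.length() - 1;
        while (left < right) {
            if (s.charAt(left) != s.charAt(right)) {
                return false;
            }
            left++;
            right--;
        }
        return true; // empty and single-char strings fall through to here
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("abba")); // true
        System.out.println(isPalindrome("abb"));  // false
    }
}
```

It can be dropped in for the recursive version without changing the rest of checkPalindrome.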

How to simplify this array method?

Here is my code to an array method:
private int _a;
public static void main(String[] args) {}
public int[] countAll(String s) {
int[] xArray = new int[27];
int[] yArray = new int[27];
_a = (int)'a';
for (int i = 0; i < xArray.length; i++) {
xArray[i] = _a;
_a = _a++;
}
for (int j = 0; j < s.length(); j++) {
s = s.toLowerCase();
char c = s.charAt(j);
int g = (int) c;
int letterindex = g - yArray[0];
if (letterindex >= 0 && letterindex <= 25) {
xArray[letterindex]++;
} else if (letterindex < 0 || letterindex > 25) {
xArray[26]++;
}
}
return xArray;
}
This code works in java but I was told that there is a simpler way. I am having a lot of trouble figuring out a simplified version of my code. Please help me.

If all you want to do is count the upper- and lower-case letters, that's a very roundabout way of doing it. What's wrong with something like:
public static int countUpper(String str)
{
int upper = 0;
for(char c : str.toCharArray())
{
if(Character.isUpperCase(c))
{
upper++;
}
}
return upper;
}
Then just the same thing with Character.isLowerCase(c) for the opposite.
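If both counts are needed, one pass over the string is enough. This is only a sketch: the class CaseCounter and the choice to return the two counts as a two-element array are assumptions made for the example, not part of the answer above.

```java
class CaseCounter {
    // Returns {upperCount, lowerCount}; digits, spaces and punctuation are ignored.
    static int[] countCases(String str) {
        int upper = 0;
        int lower = 0;
        for (char c : str.toCharArray()) {
            if (Character.isUpperCase(c)) {
                upper++;
            } else if (Character.isLowerCase(c)) {
                lower++;
            }
        }
        return new int[] {upper, lower};
    }

    public static void main(String[] args) {
        int[] counts = countCases("Hello World 42");
        System.out.println("upper=" + counts[0] + " lower=" + counts[1]); // upper=2 lower=8
    }
}
```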

public static int[] countAll(String s) {
int[] xArray = new int[27];
for (char c : s.toLowerCase().toCharArray()){
if (c >= 'a' && c <= 'z') // Character.isLetter would also accept non-ASCII letters, which index past the end of the array
xArray[c -'a']++;
else
xArray[26]++;
}
return xArray;
}

It looks like your program is trying to find the frequencies of the different letters in a string, counting the non-letters at the special index 26. In that case your code to initialize the counts is wrong: they get pre-initialized with non-zero values in the following for loop:
for (int i = 0; i < xArray.length; i++) {
xArray[i] = _a;
_a = _a++;
}
I think the method can be simply something like:
s = s.toLowerCase();
int[] histogram = new int[27];
for (char c: s.toCharArray()) {
int index = c - 'a';
if (index < 0 || index > 25) {
index = 26;
}
histogram[index]++;
}
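To try the fragment above, it can be wrapped in a method and called from a small harness. The wrapper below is a sketch: the class name LetterHistogram and the printing loop are additions for illustration, and the default-locale toLowerCase is assumed to be acceptable for the input.

```java
class LetterHistogram {
    // Slots 0..25 count 'a'..'z'; slot 26 counts everything else.
    static int[] countAll(String s) {
        int[] histogram = new int[27];
        for (char c : s.toLowerCase().toCharArray()) {
            int index = c - 'a';
            if (index < 0 || index > 25) {
                index = 26;
            }
            histogram[index]++;
        }
        return histogram;
    }

    public static void main(String[] args) {
        int[] h = countAll("Hello, World!");
        // Print only the letters that actually occur, then the catch-all slot.
        for (int i = 0; i < 26; i++) {
            if (h[i] > 0) {
                System.out.println((char) ('a' + i) + ": " + h[i]);
            }
        }
        System.out.println("other: " + h[26]);
    }
}
```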

Here are two important improvements that should be made to your code:
Add a method javadoc for countAll, so that readers don't have to trawl through 20+ lines of turgid code to reverse engineer what the method is supposed to do.
Get rid of the _a abomination. According to the most widely accepted Java coding standard, the underscore character has no place in a variable name. Besides, a is about the most useless field name I've ever come across. If it is intended to convey some meaning to the reader ... you have totally lost me.
(Oh I get it. It shouldn't be a field at all. Bzzzt!!!)
Then there is the yArray array. As far as I can tell the only place it is used is here:
int letterindex = g - yArray[0];
which is actually the same as:
int letterindex = g;
since yArray[0] is never assigned to. In short yArray is completely redundant.
And this:
if (letterindex >= 0 && letterindex <= 25) {
xArray[letterindex]++;
} else if (letterindex < 0 || letterindex > 25) {
xArray[26]++;
}
The condition in the else part is redundant. Your code will be easier to read if you just write this:
if (letterindex >= 0 && letterindex <= 25) {
xArray[letterindex]++;
} else {
xArray[26]++;
}
The two are equivalent. Do you see why?
Finally the initialization of the xArray elements looks plain wrong to me. If xArray contains counts, the elements need to start at zero. (Didn't you wonder why your code was telling you that every string contained lots of "zees"?)
"This code works in java ..."
I don't think so. Maybe it compiles. Maybe it runs without crashing. But it doesn't give correct answers!
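To see the wrong answers concretely, here is a sketch that mirrors the question's logic in a static method (the class name PostIncrementDemo and the local variable a standing in for the _a field are illustrative). Two things go wrong: a = a++ leaves a unchanged, so every slot starts at 97, and subtracting yArray[0] (always 0) leaves the raw character code, so every character lands in slot 26.

```java
class PostIncrementDemo {
    // Mirrors the question's initialization loop and index computation.
    static int[] originalCountAll(String s) {
        int[] xArray = new int[27];
        int[] yArray = new int[27];   // never written to, so yArray[0] stays 0
        int a = (int) 'a';            // 97
        for (int i = 0; i < xArray.length; i++) {
            xArray[i] = a;
            a = a++;                  // no-op: a++ yields the old value, which is assigned back
        }
        s = s.toLowerCase();
        for (int j = 0; j < s.length(); j++) {
            int letterindex = s.charAt(j) - yArray[0]; // just the char code, e.g. 104 for 'h'
            if (letterindex >= 0 && letterindex <= 25) {
                xArray[letterindex]++;
            } else {
                xArray[26]++;
            }
        }
        return xArray;
    }

    public static void main(String[] args) {
        int[] result = originalCountAll("hello");
        System.out.println(result[0]);  // 97
        System.out.println(result[26]); // 102
    }
}
```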

public static int[] countAll(String s) {
int[] count = new int[26];
for (char c : s.toLowerCase().toCharArray()) {
if ('a' <= c && c <= 'z') {
count[c - 'a']++;
}
}
return count;
}
First: your arrays were too big.
Second: why do you need two arrays at all?
Third: your code didn't seem to work. The word "hello" returned an array containing the number 97 (26 times) and the number 102.
Edit: Made it shorter.
