Find all substrings that are palindromes - java

If the input is 'abba', then the possible palindromes are a, b, b, a, bb, abba.
I understand that determining whether a string is a palindrome is easy. It would be like:
public static boolean isPalindrome(String str) {
    int len = str.length();
    // compare characters from both ends, moving towards the middle
    for (int i = 0; i < len / 2; i++) {
        if (str.charAt(i) != str.charAt(len - i - 1)) {
            return false;
        }
    }
    return true;
}
But what is an efficient way of finding all the palindromic substrings?

This can be done in O(n), using Manacher's algorithm.
The main idea is a combination of dynamic programming and (as others have already said) computing the maximum length of the palindrome centered at a given letter.
What we really want to calculate is the radius of the longest palindrome, not its length.
The radius is simply length/2 for even-length palindromes, or (length - 1)/2 for odd-length ones.
After computing the palindrome radius pr at a given position i, we use the already computed radii to find palindromes in the range [i - pr ; i]. This lets us (because palindromes are, well, palindromes) skip further computation of the radii for the range [i ; i + pr].
While we search in range [i - pr ; i], there are four basic cases for each position i - k (where k is in 1,2,... pr):
no palindrome (radius = 0) at i - k
(this means radius = 0 at i + k, too)
inner palindrome, which means it fits in range
(this means radius at i + k is the same as at i - k)
outer palindrome, which means it doesn't fit in range
(this means radius at i + k is cut down to fit in range, i.e. because i + k + radius > i + pr we reduce radius to pr - k)
sticky palindrome, which means i + k + radius = i + pr
(in that case we need to search for potentially bigger radius at i + k)
A full, detailed explanation would be rather long. What about some code samples? :)
I've found a C++ implementation of this algorithm by a Polish teacher, mgr Jerzy Wałaszek.
I've translated the comments to English, added some other comments and simplified it a bit to make the main part easier to follow.
Take a look here.
Note: if you have trouble understanding why this is O(n), try looking at it this way:
after finding the radius (let's call it r) at some position, we need to iterate over r elements back, but as a result we can skip the computation for r elements forward. Therefore, the total number of iterated elements stays the same.
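To make the radius idea above more concrete, here is a minimal Java sketch of Manacher's algorithm. It is not the C++ implementation mentioned above: the names ManacherSketch, radii and countPalindromicSubstrings are made up for illustration, and it assumes the common trick of interleaving a separator between characters so that odd and even palindromes are handled by the same loop.

```java
class ManacherSketch {
    // Works on the interleaved string "|a|b|b|a|" (hypothetical helper layout).
    // p[c] is the radius around center c, which equals the length of the
    // longest palindrome of the original string centered there.
    static int[] radii(String s) {
        int n = 2 * s.length() + 1;
        char[] t = new char[n];
        for (int i = 0; i < n; i++) {
            t[i] = (i % 2 == 0) ? '|' : s.charAt(i / 2);
        }
        int[] p = new int[n];
        int center = 0, right = 0; // right = center + p[center], the rightmost reach so far
        for (int i = 0; i < n; i++) {
            if (i < right) {
                // reuse the mirrored radius, cut down so it fits in the known range
                p[i] = Math.min(right - i, p[2 * center - i]);
            }
            // try to grow past what is already known
            while (i - p[i] - 1 >= 0 && i + p[i] + 1 < n
                    && t[i - p[i] - 1] == t[i + p[i] + 1]) {
                p[i]++;
            }
            if (i + p[i] > right) {
                center = i;
                right = i + p[i];
            }
        }
        return p;
    }

    // Every center with longest palindrome length L contributes (L + 1) / 2 palindromes.
    static long countPalindromicSubstrings(String s) {
        long total = 0;
        for (int r : radii(s)) {
            total += (r + 1) / 2;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countPalindromicSubstrings("abba"));
    }
}
```

Computing the radii is linear; listing every palindromic substring from them still costs time proportional to how many there are.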

Perhaps you could iterate over each potential middle character (odd-length palindromes) and each middle point between characters (even-length palindromes), and extend each until you cannot get any further (the next left and right characters don't match).
That would save a lot of computation when there are not many palindromes in the string. In that case the cost would be O(n) for strings with sparse palindromes.
For palindrome-dense inputs it would be O(n^2), as each position cannot be extended by more than the length of the array / 2. Obviously this is even less towards the ends of the array.
public Set<String> palindromes(final String input) {
final Set<String> result = new HashSet<>();
for (int i = 0; i < input.length(); i++) {
// expanding even length palindromes:
expandPalindromes(result,input,i,i+1);
// expanding odd length palindromes:
expandPalindromes(result,input,i,i);
}
return result;
}
public void expandPalindromes(final Set<String> result, final String s, int i, int j) {
while (i >= 0 && j < s.length() && s.charAt(i) == s.charAt(j)) {
result.add(s.substring(i,j+1));
i--; j++;
}
}

Each distinct letter is already a palindrome, so you already have N + 1 palindromes, where N is the number of distinct letters (the extra one is the empty string). You can collect those in a single pass, O(N).
Now, for non-trivial palindromes, you can test each point of your string as the center of a potential palindrome and grow in both directions, which is what Valentin Ruano suggested.
This solution will take O(N^2), since each test is O(N) and the number of possible "centers" is also O(N): the center is either a letter or the space between two letters, again as in Valentin's solution.
Note that there is also an O(N) solution to your problem, based on Manacher's algorithm (the article describes "longest palindrome", but the algorithm could be used to count all of them).

I just came up with my own logic which helps to solve this problem.
Happy coding.. :-)
System.out.println("Finding all palindromes in a given string : ");
subPal("abcacbbbca");
private static void subPal(String str) {
String s1 = "";
int N = str.length(), count = 0;
Set<String> palindromeArray = new HashSet<String>();
System.out.println("Given string : " + str);
System.out.println("******** Ignoring single character as substring palindrome");
for (int i = 2; i <= N; i++) {
for (int j = 0; j <= N; j++) {
int k = i + j - 1;
if (k >= N)
continue;
s1 = str.substring(j, i + j);
if (s1.equals(new StringBuilder(s1).reverse().toString())) {
palindromeArray.add(s1);
}
}
}
System.out.println(palindromeArray);
for (String s : palindromeArray)
System.out.println(s + " - is a palindrome string.");
System.out.println("The no.of substring that are palindrome : "
+ palindromeArray.size());
}
Output:-
Finding all palindromes in a given string :
Given string : abcacbbbca
******** Ignoring single character as substring palindrome ********
[cac, acbbbca, cbbbc, bb, bcacb, bbb]
cac - is a palindrome string.
acbbbca - is a palindrome string.
cbbbc - is a palindrome string.
bb - is a palindrome string.
bcacb - is a palindrome string.
bbb - is a palindrome string.
The no.of substring that are palindrome : 6

I suggest building up from a base case and expanding until you have all of the palindromes.
There are two types of palindromes: even-length and odd-length. I haven't figured out how to handle both in the same way, so I'll break it up.
1) Add all single letters
2) With this list you have all of the starting points for your palindromes. Run both of these for each index in the string (or 1 -> length-1, because you need a length of at least 2):
void findAllEvenFrom(int index) {
    int i = 0;
    // keep going only while index-i and index+i+1 are within string bounds
    while (index - i >= 0 && index + i + 1 < str.length()) {
        if (str.charAt(index - i) != str.charAt(index + i + 1))
            return; // this index isn't the center of any longer even palindrome, so we can give up
        outputList.add(str.substring(index - i, index + i + 2)); // end index is exclusive
        i++;
    }
}
// Odd looks about the same, but with a change in the bounds.
void findAllOddFrom(int index) {
    int i = 0;
    // keep going only while index-i-1 and index+i+1 are within string bounds
    while (index - i - 1 >= 0 && index + i + 1 < str.length()) {
        if (str.charAt(index - i - 1) != str.charAt(index + i + 1))
            return;
        outputList.add(str.substring(index - i - 1, index + i + 2)); // end index is exclusive
        i++;
    }
}
I'm not sure if this helps the Big-O for your runtime, but it should be much more efficient than trying each substring. Worst case would be a string of all the same letter which may be worse than the "find every substring" plan, but with most inputs it will cut out most substrings because you can stop looking at one once you realize it's not the center of a palindrome.

I tried the following code and it's working well for these cases.
It also handles individual characters.
A few of the cases which passed:
abaaa --> [aba, aaa, b, a, aa]
geek --> [g, e, ee, k]
abbaca --> [b, c, a, abba, bb, aca]
abaaba -->[aba, b, abaaba, a, baab, aa]
abababa -->[aba, babab, b, a, ababa, abababa, bab]
forgeeksskeegfor --> [f, g, e, ee, s, r, eksske, geeksskeeg,
o, eeksskee, ss, k, kssk]
Code
static Set<String> set = new HashSet<String>();
static String DIV = "|";
public static void main(String[] args) {
String str = "abababa";
String ext = getExtendedString(str);
// will check for even length palindromes
for(int i=2; i<ext.length()-1; i+=2) {
addPalindromes(i, 1, ext);
}
// will check for odd length palindromes including individual characters
for(int i=1; i<=ext.length()-2; i+=2) {
addPalindromes(i, 0, ext);
}
System.out.println(set);
}
/*
* Generates extended string, with dividers applied
* eg: input = abca
* output = |a|b|c|a|
*/
static String getExtendedString(String str) {
StringBuilder builder = new StringBuilder();
builder.append(DIV);
for(int i=0; i< str.length(); i++) {
builder.append(str.charAt(i));
builder.append(DIV);
}
String ext = builder.toString();
return ext;
}
/*
* Recursive matcher
* If match is found for palindrome ie char[mid-offset] = char[mid+ offset]
* Calculate further with offset+=2
*
*
*/
static void addPalindromes(int mid, int offset, String ext) {
// boundary checks
if(mid - offset <0 || mid + offset > ext.length()-1) {
return;
}
if (ext.charAt(mid-offset) == ext.charAt(mid+offset)) {
set.add(ext.substring(mid-offset, mid+offset+1).replace(DIV, ""));
addPalindromes(mid, offset+2, ext);
}
}
Hope it's fine.

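import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
public class PolindromeMyLogic {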
static int polindromeCount = 0;
private static HashMap<Character, List<Integer>> findCharAndOccurance(
char[] charArray) {
HashMap<Character, List<Integer>> map = new HashMap<Character, List<Integer>>();
for (int i = 0; i < charArray.length; i++) {
char c = charArray[i];
if (map.containsKey(c)) {
List<Integer> list = map.get(c);
list.add(i);
} else {
List<Integer> list = new ArrayList<Integer>();
list.add(i);
map.put(c, list);
}
}
return map;
}
private static void countPolindromeByPositions(char[] charArray,
HashMap<Character, List<Integer>> map) {
map.forEach((character, list) -> {
int n = list.size();
if (n > 1) {
for (int i = 0; i < n - 1; i++) {
for (int j = i + 1; j < n; j++) {
if (list.get(i) + 1 == list.get(j)
|| list.get(i) + 2 == list.get(j)) {
polindromeCount++;
} else {
char[] temp = new char[(list.get(j) - list.get(i))
+ 1];
int jj = 0;
for (int ii = list.get(i); ii <= list
.get(j); ii++) {
temp[jj] = charArray[ii];
jj++;
}
if (isPolindrome(temp))
polindromeCount++;
}
}
}
}
});
}
private static boolean isPolindrome(char[] charArray) {
int n = charArray.length;
char[] temp = new char[n];
int j = 0;
for (int i = (n - 1); i >= 0; i--) {
temp[j] = charArray[i];
j++;
}
if (Arrays.equals(charArray, temp))
return true;
else
return false;
}
public static void main(String[] args) {
String str = "MADAM";
char[] charArray = str.toCharArray();
countPolindromeByPositions(charArray, findCharAndOccurance(charArray));
System.out.println(polindromeCount);
}
}
Try this out. It's my own solution.

// Maintain a Set of palindromes so that we get distinct elements at the end
// Add each char to the set. Also treat that char as the middle point and expand through the string, checking equality of the left and right chars
static int palindrome(String str) {
Set<String> distinctPln = new HashSet<String>();
for (int i=0; i<str.length();i++) {
// Odd length: expand around the char at i (j == k == i adds the single char itself)
for (int j=i, k=i; j>=0 && k<str.length() && str.charAt(j)==str.charAt(k); j--, k++) {
distinctPln.add(str.substring(j,k+1));
}
// Even length: expand around the gap between i and i+1
for (int j=i, k=i+1; j>=0 && k<str.length() && str.charAt(j)==str.charAt(k); j--, k++) {
distinctPln.add(str.substring(j,k+1));
}
}
Iterator<String> distinctPlnItr = distinctPln.iterator();
while ( distinctPlnItr.hasNext()) {
System.out.print(distinctPlnItr.next()+ ",");
}
return distinctPln.size();
}

This code finds all distinct substrings that are palindromes.
Here is the code I tried. It is working fine.
import java.util.HashSet;
import java.util.Set;
public class SubstringPalindrome {
public static void main(String[] args) {
String s = "abba";
checkPalindrome(s);
}
public static int checkPalindrome(String s) {
int L = s.length();
int counter =0;
long startTime = System.currentTimeMillis();
Set<String> hs = new HashSet<String>();
// add elements to the hash set
System.out.println("Possible substrings: ");
for (int i = 0; i < L; ++i) {
for (int j = 0; j < (L - i); ++j) {
String subs = s.substring(j, i + j + 1);
counter++;
System.out.println(subs);
if(isPalindrome(subs))
hs.add(subs);
}
}
System.out.println("Total possible substrings are "+counter);
System.out.println("Total palindromic substrings are "+hs.size());
System.out.println("Possible palindromic substrings: "+hs.toString());
long endTime = System.currentTimeMillis();
System.out.println("It took " + (endTime - startTime) + " milliseconds");
return hs.size();
}
public static boolean isPalindrome(String s) {
if(s.length() == 0 || s.length() ==1)
return true;
if(s.charAt(0) == s.charAt(s.length()-1))
return isPalindrome(s.substring(1, s.length()-1));
return false;
}
}
OUTPUT:
Possible substrings:
a
b
b
a
ab
bb
ba
abb
bba
abba
Total possible substrings are 10
Total palindromic substrings are 4
Possible palindromic substrings: [bb, a, b, abba]
It took 1 milliseconds

Related

Pancake Sorting with a Twist (Java)

I am attempting to solve a version of a pancake sorting algorithm. In this problem I am given a string that is composed of any combination of characters A-F and has a maximum length of 6. For instance I may receive the String 'ACFE'. In this problem I am trying to use pancake sorting to fix the string to be in Alphabetical Order. So the above example would become 'ACEF'.
That is pretty simple and straightforward. Here is the catch: the characters in the input string can be Uppercase OR Lowercase. Whenever you flip characters in the string, the flipped characters switch case. So an uppercase A would become 'a'. The goal at the end is to flip the string into order and also have all of the characters in uppercase as well.
I have had no problem putting together the algorithm to solve the sorting part of the algorithm, but it is the part where I am trying to make sure that we aren't done flipping the characters until they are all uppercase that I am having trouble with and can't seem to solve.
To make things easier on myself, I have made a HashMap of Characters to Integers so that it is easier to sort the characters (we can just use an equivalent Integer value). I also break the string apart at the beginning into a char[] and put it in reverse order to make the algorithm easier for myself.
Here is the code I use to do everything:
private static final HashMap<Character, Integer> numericalEquivalent = new HashMap<>();
static {
numericalEquivalent.put('A', 6);
numericalEquivalent.put('B', 5);
numericalEquivalent.put('C', 4);
numericalEquivalent.put('D', 3);
numericalEquivalent.put('E', 2);
numericalEquivalent.put('F', 1);
numericalEquivalent.put('a', 6);
numericalEquivalent.put('b', 5);
numericalEquivalent.put('c', 4);
numericalEquivalent.put('d', 3);
numericalEquivalent.put('e', 2);
numericalEquivalent.put('f', 1);
}
private static int flip(char[] arr, int i, int numFlips) {
char temp;
int start = 0;
if (start < i) {
while (start < i) {
temp = (Character.isUpperCase(arr[start]) ? Character.toLowerCase(arr[start]) : Character.toUpperCase(arr[start]));
arr[start] = (Character.isUpperCase(arr[i]) ? Character.toLowerCase(arr[i]) : Character.toUpperCase(arr[i]));
arr[i] = temp;
start++;
i--;
}
numFlips++;
}
return numFlips;
}
private static int findMax(char[] arr, int n) {
int mi, i;
for (mi = 0, i = 0; i < n; ++i)
if (numericalEquivalent.get(arr[i]) > numericalEquivalent.get(arr[mi]))
mi = i;
return mi;
}
private static int getFlips (char[] pancakes) {
int n = pancakes.length;
int numFlips = 0;
for (int curr_size = n; curr_size > 1; --curr_size) {
int mi = findMax(pancakes, curr_size);
if (mi != curr_size - 1) {
numFlips = flip(pancakes, mi, numFlips);
if (!isSorted(pancakes))
numFlips = flip(pancakes, curr_size - 1, numFlips);
}
}
return numFlips;
}
private static boolean isSorted(char[] arr) {
for (int i = 0; i < arr.length - 1; i++) {
if (numericalEquivalent.get(arr[i]) > numericalEquivalent.get(arr[i + 1]))
return false;
}
return true;
}
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in); // needs import java.util.Scanner
while(true) {
String input = scanner.nextLine();
if (input.equals("0")) break;
else System.out.println(getFlips(new StringBuilder(input).reverse().toString().toCharArray()));
}
}
My goal is to get back the minimum number of flips that it will take to flip the characters into order. How can I modify this code, though, to make sure it accounts for characters being lowercase and the need to make sure they all end up in Uppercase?
You can change the stop condition from if (!isSorted(pancakes)) to if (!isSortedAndUppercase(pancakes)) where isSortedAndUppercase(pancakes) is defined as:
private static boolean isSortedAndUppercase(char[] arr){
return isUpperCase(arr) && isSorted(arr);
}
private static boolean isUpperCase(char[] arr) {
String s = String.valueOf(arr);
return s.equals(s.toUpperCase());
}
and stop the search only when the stop condition is met.
Consider using Breadth-First Search for this task.
Side notes:
There is no need to map chars into integers. Try the following:
char[] chars = "ABCDEF".toCharArray();
for (int i = 0; i < chars.length; i++) {
System.out.println(chars[i] +" int value: "+(int)chars[i]);
System.out.println(Character.toLowerCase(chars[i]) +" int value: "+(int)Character.toLowerCase(chars[i]));
}
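The breadth-first search suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: a flip reverses the first k characters of the string as given and toggles the case of each flipped character, the target is the sorted all-uppercase string, and PancakeBfs, flip and minFlips are made-up names, not part of the question's code.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class PancakeBfs {
    // Assumed flip semantics: reverse the first k chars and toggle their case.
    static String flip(String s, int k) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = k - 1; i >= 0; i--) {
            char c = s.charAt(i);
            sb.append(Character.isUpperCase(c)
                    ? Character.toLowerCase(c)
                    : Character.toUpperCase(c));
        }
        return sb.append(s, k, s.length()).toString();
    }

    // Returns the minimum number of flips, or -1 if the target is unreachable.
    static int minFlips(String start) {
        char[] sorted = start.toUpperCase().toCharArray();
        Arrays.sort(sorted);
        String target = new String(sorted);

        Map<String, Integer> dist = new HashMap<>(); // state -> flips needed to reach it
        Queue<String> queue = new ArrayDeque<>();
        dist.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String cur = queue.remove();
            if (cur.equals(target)) {
                return dist.get(cur);
            }
            for (int k = 1; k <= cur.length(); k++) {
                String next = flip(cur, k);
                if (!dist.containsKey(next)) {
                    dist.put(next, dist.get(cur) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(minFlips("BA"));
    }
}
```

With at most 6 characters there are at most 6! * 2^6 = 46080 states, so exploring all of them is cheap, and BFS visits states in order of flip count, so the first time the target is reached the count is minimal.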

Algorithm to create all permutations and lengths

I am looking to create an algorithm, preferably in Java. I would like to go through the following char array and create every possible permutation of every length out of it.
For example, loop and print the following:
a
aa
aaaa
aaaaa
.... keep going ....
aaaaaaaaaaaaaaaaa ....
ab
aba
abaa .............
Till I hit all possible lengths and permutations from my array.
private void method(){
char[] data = "abcdefghiABCDEFGHI0123456789".toCharArray();
// loop and print each time
}
I think it would be silly to come up with tens of for loops for this. I am guessing some form of recursion would help here, but I can't get my head around how to even start. Could I get some help with this please? Even pointing me to a starting point or a blog would help. I've been Googling and looking around, and many permutation examples exist, but they all keep to a fixed maximum length. None seems to cover multiple lengths plus permutations. Please advise. Thanks.
Another way to do it is this:
public class HelloWorld{
public static String[] method(char[] arr, int length) {
if(length == arr.length - 1) {
String[] strArr = new String[arr.length];
for(int i = 0; i < arr.length; i ++) {
strArr[i] = String.valueOf(arr[i]);
}
return strArr;
}
String[] before = method(arr, length + 1);
String[] newArr = new String[arr.length * before.length];
for(int i = 0; i < arr.length; i ++) {
for(int j = 0; j < before.length; j ++) {
if(i == 0)
System.out.println(before[j]);
newArr[i * before.length + j] = (arr[i] + before[j]);
}
}
return newArr;
}
public static void main(String []args){
String[] all = method("abcde".toCharArray(), 0);
for(int i = 0; i < all.length; i ++) {
System.out.println(all[i]);
}
}
}
However, be careful: you'll probably run out of memory, or the program will take a looooong time to run, if it finishes at all. You are trying to print about 3.437313508041091e+40 strings, which is roughly a 3 followed by 40 zeroes.
Here's the same solution in JavaScript, because there it at least starts running. It needs about 4 seconds to get to the 4-character permutations; to reach the 5-character permutations it will need about 28 times that, for 6 characters it's 4 * 28 * 28 seconds, and so on.
const method = (arr, length) => {
if(length === arr.length - 1)
return arr;
const hm = [];
const before = method(arr, length + 1);
for(let i = 0; i < arr.length; i ++) {
for(let j = 0; j < before.length; j ++) {
if(i === 0)
console.log(before[j]);
hm.push(arr[i] + before[j]);
}
}
return hm;
};
method('abcdefghiABCDEFGHI0123456789'.split(''), 0).forEach(a => console.log(a));
private void method(){
char[] data = "abcdefghiABCDEFGHI0123456789".toCharArray();
// loop and print each time
}
With your given input there are about 3.43731350804×10^40 combinations, a 41-digit number. If I remember it correctly, the maths is the geometric series
1 + x + x^2 + x^3 + x^4 + ... + x^n = (1 - x^(n+1)) / (1 - x)
in your case
28 + 28^2 + 28^3 + ... + 28^28
cause you will have
28 combinations for strings with length one
28*28 combinations for strings with length two
28*28*28 combinations for strings with length three
...
28^28 combinations for strings with length 28
It will take a while to print them all.
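As a sanity check on the count above, the sum can be computed exactly with BigInteger. This is a small sketch; StringCount and count are hypothetical names chosen for the example.

```java
import java.math.BigInteger;

class StringCount {
    // Number of strings of length 1..maxLength over an alphabet of alphabetSize symbols,
    // i.e. x + x^2 + ... + x^maxLength.
    static BigInteger count(int alphabetSize, int maxLength) {
        BigInteger x = BigInteger.valueOf(alphabetSize);
        BigInteger power = BigInteger.ONE;
        BigInteger total = BigInteger.ZERO;
        for (int k = 1; k <= maxLength; k++) {
            power = power.multiply(x); // x^k
            total = total.add(power);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(count(5, 5));   // strings of length 1..5 over 5 characters
        System.out.println(count(28, 28)); // the 28-character alphabet from the question
    }
}
```

The loop agrees with the closed form (x^(n+1) - x) / (x - 1), which is the series above with the leading 1 removed.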
One way I can think of is to use the Generex library, a Java library for generating Strings that match a given regular expression.
Generex github. Look at their page for more info.
Generex maven repo. Download the jar or add dependency.
Using Generex is straightforward if you are somewhat familiar with regex.
Example using only the first 5 chars, which will have 3905 possible combinations:
public static void main(String[] args) {
Generex generex = new Generex("[a-e]{1,5}");
System.out.println(generex.getAllMatchedStrings().size());
Iterator iterator = generex.iterator();
while (iterator.hasNext()) {
System.out.println(iterator.next());
}
}
Meaning of [a-e]{1,5}: any combination of the chars a, b, c, d, e with a min length of 1 and a max length of 5
output
a
aa
aaa
aaaa
aaaaa
aaaab
aaaac
aaaad
aaaae
aaab
aaaba
aaabb
aaabc
aaabd
aaabe
aaac
....
eeee
eeeea
eeeeb
eeeec
eeeed
eeeee
You can have a for loop that starts from 1 and ends at array.length and in each iteration call a function that prints all the permutations for that length.
public void printPermutations(char[] array, int length) {
/*
* Create all permutations with length = length and print them
*/
}
public void method() {
char[] data = "abcdefghiABCDEFGHI0123456789".toCharArray();
for(int i = 1; i <= data.length; i ++) {
printPermutations(data, i);
}
}
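One possible way to fill in the body left open above is a small recursive helper that builds each string of a fixed length one position at a time. This is only a sketch: FixedLengthStrings, stringsOfLength and build are made-up names, and it collects the strings in a list instead of printing them directly so the result is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

class FixedLengthStrings {
    // Returns every string of exactly `length` characters drawn (with repetition) from `array`.
    static List<String> stringsOfLength(char[] array, int length) {
        List<String> out = new ArrayList<>();
        build(array, new char[length], 0, out);
        return out;
    }

    // Fills buf[pos..] with every possible choice of characters.
    private static void build(char[] array, char[] buf, int pos, List<String> out) {
        if (pos == buf.length) {
            out.add(new String(buf));
            return;
        }
        for (char c : array) {
            buf[pos] = c;
            build(array, buf, pos + 1, out);
        }
    }

    public static void main(String[] args) {
        char[] data = "abc".toCharArray(); // small alphabet so the output stays readable
        for (int len = 1; len <= data.length; len++) {
            for (String s : stringsOfLength(data, len)) {
                System.out.println(s);
            }
        }
    }
}
```

For the full 28-character array the list would not fit in memory, so in that case each string would have to be printed as soon as it is built.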
I think the following recursion could solve your problem:
public static void main(String[] args) {
final String[] data = {"a", "b", "c"};
sampleWithReplacement(data, "", 1, 5);
}
private static void sampleWithReplacement(
final String[] letters,
final String prefix,
final int currentLength,
final int maxLength
) {
if (currentLength <= maxLength) {
for (String letter : letters) {
final String newPrefix = prefix + letter;
System.out.println(newPrefix);
sampleWithReplacement(letters, newPrefix, currentLength + 1, maxLength);
}
}
}
where data specifies your possible characters to sample from.
Is this what you're talking about?
public class PrintPermutations
{
public static String stream = "";
public static void printPermutations (char[] set, int count, int length)
{
if (count < length)
for (int i = 0; i < set.length; ++i)
{
stream += set[i];
System.out.println (stream);
printPermutations (set, count + 1, length);
stream = stream.substring (0, stream.length() - 1);
}
}
public static void main (String[] args)
{
char[] set = "abcdefghiABCDEFGHI0123456789".toCharArray();
printPermutations (set, 0, set.length);
}
}
Test it using a smaller string first.
On an input string 28 characters long this method is never going to end, but for smaller inputs it will generate all permutations up to length n - 1, where n is the number of characters (change the outer loop condition to i <= arr.length to include length n as well). It first prints all permutations of length 1, then all of length 2 and so on, which is different from the order in your example, but hopefully the order doesn't matter.
static void permutations(char[] arr)
{
int[] idx = new int[arr.length];
char[] perm = new char[arr.length];
Arrays.fill(perm, arr[0]);
for (int i = 1; i < arr.length; i++)
{
while (true)
{
System.out.println(new String(perm, 0, i));
int k = i - 1;
for (; k >= 0; k--)
{
idx[k] += 1;
if (idx[k] < arr.length)
{
perm[k] = arr[idx[k]];
break;
}
idx[k] = 0;
perm[k] = arr[idx[k]];
}
if (k < 0)
break;
}
}
}
Test:
permutations("abc".toCharArray());
Output:
a
b
c
aa
ab
ac
ba
bb
bc
ca
cb
cc

Pair Palindrome

I have this code to find all pairs of strings that form a palindrome when concatenated, e.g. D: { AB, DEEDBA } => AB + DEEDBA -> YES, so that pair will be returned. Another example: { NONE, XENON } => NONE + XENON => YES.
What would the running time of this be?
public static List<List<String>> pairPalindrome(List<String> D) {
List<List<String>> pairs = new LinkedList<>();
Set<String> set = new HashSet<>();
for (String s : D) {
set.add(s);
}
for (String s : D) {
String r = reverse(s);
for (int i = 0; i <= r.length(); i++) {
String prefix = r.substring(0, i);
if (set.contains(prefix)) {
String suffix = r.substring(i);
if (isPalindrom(suffix)) {
pairs.add(Arrays.asList(prefix, s)); // prefix + s is the palindrome
}
}
}
}
return pairs;
}
private static boolean isPalindrom(String s) {
int i = 0;
int j = s.length() - 1;
char[] c = s.toCharArray();
while (i < j) {
if (c[i] != c[j]) {
return false;
}
i++;
j--;
}
return true;
}
private static String reverse(String s) {
char[] c = s.toCharArray();
int i = 0;
int j = c.length - 1;
while (i < j) {
char temp = c[i];
c[i] = c[j];
c[j] = temp;
i++;
j--;
}
return new String(c);
}
I'm going to take a few guesses here as I don't have much experience with Java.
First, isPalindrom is O(N) in the size of the suffix string. The add operation on 'pairs' would probably be O(1).
Then we have the for loop, which is O(N) in the length of r. Getting a substring is, I'd think, O(M) in the size of the substring. Checking if a hash set contains a certain key would, with a perfect hash function, be (IIRC) O(1); in your case we can assume O(lgN) (possibly). So the first for loop is O(NMlgK), where K is your hash table size, N is r's length and M is the substring's length.
Finally we have the outermost for loop, which runs for each string in the string list, so that's O(N). Then we reverse each of them. So for each of these strings we have another O(N) operation inside, with the other loop being O(NMlgK). So the overall complexity is O(L(N + NMlgK)), where L is the number of strings you have. But it would reduce to O(LNMlgK). I'd like it if someone verified this or corrected my mistakes.
EDIT: Actually, substring length will at most be N, as the length of the entire string, so M is actually N. Now I'd probably say it's O(LNlgK).

Finding the Number of Times an Expression Occurs in a String Continuously and Non Continuously

I had a coding interview over the phone and was asked this question:
Given a String (for example):
"aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc"
and an expression (for example):
"a+b+c-"
where:
+: means the char before it is repeated 2 times
-: means the char before it is repeated 4 times
Find the number of times the given expression appears in the string with the operands occurring non continuously and continuously.
The above expression occurs 4 times:
1) aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc
        ^^       ^^       ^^^^
        aa       bb       cccc
2) aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc
        ^^       ^^                               ^^^^
        aa       bb                               cccc
3) aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc
        ^^                                ^^      ^^^^
        aa                                bb      cccc
4) aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc
                                       ^^ ^^      ^^^^
                                       aa bb      cccc
I had no idea how to do it. I started doing an iterative brute force method with lots of marking of indices, but halfway through I realized how messy and hard that would be to code:
import java.util.*;
public class Main {
public static int count(String expression, String input) {
int count = 0;
ArrayList<char[]> list = new ArrayList<char[]>();
// Create an ArrayList of chars to iterate through the expression and match to string
for(int i = 1; i<expression.length(); i=i+2) {
StringBuilder exp = new StringBuilder();
char curr = expression.charAt(i-1);
if(expression.charAt(i) == '+') {
exp.append(curr).append(curr);
list.add(exp.toString().toCharArray());
}
else { // character is '-'
exp.append(curr).append(curr).append(curr).append(curr);
list.add(exp.toString().toCharArray());
}
}
char[] inputArray = input.toCharArray();
int i = 0; // outside pointer
int j = 0; // inside pointer
while(i <= inputArray.length) {
while(j <= inputArray.length) {
for(int k = 0; k< list.size(); k++) {
/* loop through
* all possible combinations in array list
* with multiple loops
*/
}
j++;
}
i++;
j=i;
}
return count;
}
public static void main(String[] args) {
String expression = "a+b+c-";
String input = "aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc";
System.out.println("The expression occurs: "+count(expression, input)+" times");
}
}
After spending a lot of time doing it iteratively he mentioned recursion and I still couldn't see a clear way doing it recursively and I wasn't able to solve the question. I am trying to solve it now post-interview and am still not sure how to go about this question. How should I go about solving this problem? Is the solution obvious? I thought this was a really hard question for a coding phone interview.
Non-recursion algorithm that requires O(m) space and operates in O(n*m), where m is number of tokens in query:
@Test
public void subequences() {
String input = "aabbccaacccccbbd";
String query = "a+b+";
// here to store tokens of a query: e.g. {a, +}, {b, +}
char[][] q = new char[query.length() / 2][];
// here to store counts of subsequences ending by j-th token found so far
int[] c = new int[query.length() / 2]; // main
int[] cc = new int[query.length() / 2]; // aux
// tokenize
for (int i = 0; i < query.length(); i += 2)
q[i / 2] = new char[] {query.charAt(i), query.charAt(i + 1)};
// init
char[] sub2 = {0, 0}; // accumulator capturing last 2 chars
char[] sub4 = {0, 0, 0, 0}; // accumulator capturing last 4 chars
// main loop
for (int i = 0; i < input.length(); i++) {
shift(sub2, input.charAt(i));
shift(sub4, input.charAt(i));
boolean all2 = sub2[1] != 0 && sub2[0] == sub2[1]; // true if all sub2 chars are same
boolean all4 = sub4[3] != 0 && sub4[0] == sub4[1] // true if all sub4 chars are same
&& sub4[0] == sub4[2] && sub4[0] == sub4[3];
// iterate tokens
for (int j = 0; j < c.length; j++) {
if (all2 && q[j][1] == '+' && q[j][0] == sub2[0]) // found match for "+" token
cc[j] = j == 0 // filling up aux array
? c[j] + 1 // first token, increment counter by 1
: c[j] + c[j - 1]; // add value of preceding token counter
if (all4 && q[j][1] == '-' && q[j][0] == sub4[0]) // found match for "-" token
cc[j] = j == 0
? c[j] + 1
: c[j] + c[j - 1];
}
if (all2) sub2[1] = 0; // clear, to make "aa" occur in "aaaa" 2, not 3 times
if (all4) sub4[3] = 0;
copy(cc, c); // copy aux array to main
}
System.out.println(c[c.length - 1]);
}
// shifts array 1 char left and puts c at the end
void shift(char[] cc, char c) {
for (int i = 1; i < cc.length; i++)
cc[i - 1] = cc[i];
cc[cc.length - 1] = c;
}
// copies array contents
void copy(int[] from, int[] to) {
for (int i = 0; i < from.length; i++)
to[i] = from[i];
}
The main idea is to read chars from the input one by one, holding them in 2- and 4-char accumulators, and to check whether either accumulator matches some token of the query, remembering how many matches we have got so far for the sub-queries ending with these tokens.
The query (a+b+c-) is split into tokens (a+, b+, c-). Then we collect chars in the accumulators and check whether they match some token. If we find a match for the first token, we increment its counter by 1. If we find a match for another, j-th token, we can create as many additional subsequences matching the subquery composed of tokens [0...j] as currently exist for the subquery composed of tokens [0...j-1], because this match can be appended to every one of them.
For example, we have:
a+ : 3 (3 matches for a+)
b+ : 2 (2 matches for a+b+)
c- : 1 (1 match for a+b+c-)
when cccc arrives. Then c- counter should be increased by b+ counter value, because so far we have 2 a+b+ subsequences and cccc can be appended to both of them.
Let's call the length of the string n, and the length of the query expression (in terms of the number of "units", like a+ or b-) m.
It's not clear exactly what you mean by "continuously" and "non-continuously", but if "continuously" means that there can't be any gaps between query string units, then you can just use the KMP algorithm to find all instances in O(m+n) time.
We can solve the "non-continuous" version in O(nm) time and space with dynamic programming. Basically, what we want to compute is a function:
f(i, j) = the number of occurrences of the subquery consisting of the first i units
of the query expression, in the first j characters of the string.
So with your example, f(2, 41) = 2, since there are 2 separate occurrences of the subpattern a+b+ in the first 41 characters of your example string.
The final answer will then be f(m, n).
We can compute this recursively as follows:
f(0, j) = 1 (the empty subquery occurs exactly once)
f(i > 0, 0) = 0
f(i > 0, j > 0) = f(i, j-1) + isMatch(i, j) * f(i-1, j-len(i))
where len(i) is the length of the ith unit in the expression (always 2 or 4) and isMatch(i, j) is a function that returns 1 if the ith unit in the expression matches the text ending at position j, and 0 otherwise. For example, isMatch(2, 15) = 1 in your example, because the 2nd unit is b+ and s[14..15] = bb. This function takes just constant time to run, because it never needs to check more than 4 characters.
The above recursion will already work as-is, but we can save time by making sure that we only solve each subproblem once. Because the function f() depends only on its 2 parameters i and j, which range between 0 and m, and between 0 and n, respectively, we can just compute all n*m possible answers and store them in a table.
[EDIT: As Sasha Salauyou points out, the space requirement can in fact be reduced to O(m). We never need to access values of f(i, k) with k < j-4 (no unit is longer than 4 characters), so instead of storing all n columns of the table we can keep just the 5 most recent ones and cycle through them by always accessing column j % 5.]
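The recurrence above can be sketched in Java roughly as follows. This is only an illustration under some assumptions: the class and method names (SubsequenceCount, units, count) are made up here, every unit of the query is assumed to be a single char followed by '+' (2 repeats) or '-' (4 repeats), and the full (m+1) x (n+1) table is kept for clarity rather than the reduced-space variant.

```java
import java.util.ArrayList;
import java.util.List;

class SubsequenceCount {
    // Expands "a+b+c-" into ["aa", "bb", "cccc"].
    // Assumption: the query is a sequence of (char, '+' or '-') pairs.
    static List<String> units(String query) {
        List<String> out = new ArrayList<>();
        for (int p = 0; p + 1 < query.length(); p += 2) {
            int reps = query.charAt(p + 1) == '+' ? 2 : 4;
            StringBuilder sb = new StringBuilder();
            for (int r = 0; r < reps; r++) {
                sb.append(query.charAt(p));
            }
            out.add(sb.toString());
        }
        return out;
    }

    // f[i][j] = number of occurrences of the first i units
    // within the first j characters of s.
    static long count(String s, String query) {
        List<String> u = units(query);
        int m = u.size();
        int n = s.length();
        long[][] f = new long[m + 1][n + 1];
        for (int j = 0; j <= n; j++) {
            f[0][j] = 1; // the empty subquery occurs exactly once
        }
        for (int i = 1; i <= m; i++) {
            String unit = u.get(i - 1);
            int len = unit.length();
            for (int j = 1; j <= n; j++) {
                f[i][j] = f[i][j - 1];
                // isMatch(i, j): unit i matches the text ending at position j
                if (j >= len && s.startsWith(unit, j - len)) {
                    f[i][j] += f[i - 1][j - len];
                }
            }
        }
        return f[m][n];
    }
}
```

For example, SubsequenceCount.count("aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc", "a+b+c-") evaluates to 4 with this sketch.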
I wanted to try it for myself and figured I could then share my solution as well. The parse method obviously has issues when there is indeed a char 0 in the expression (although that would probably be the bigger issue itself), the find method will fail for an empty needles array, and I wasn't sure if ab+c- should be considered a valid pattern (I treat it as such). Note that this covers only the non-continuous part so far.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class Matcher {
public static void main(String[] args) {
String haystack = "aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc";
String[] needles = parse("a+b+c-");
System.out.println("Needles: " + Arrays.toString(needles));
System.out.println("Found: " + find(haystack, needles, 0));
needles = parse("ab+c-");
System.out.println("Needles: " + Arrays.toString(needles));
System.out.println("Found: " + find(haystack, needles, 0));
}
private static int find(String haystack, String[] needles, int i) {
String currentNeedle = needles[i];
int pos = haystack.indexOf(currentNeedle);
if (pos < 0) {
// Abort: Current needle not found
return 0;
}
// Current needle found (this also means that pos + currentNeedle.length()
// is always <= haystack.length())
String remainingHaystack = haystack.substring(pos + currentNeedle.length());
// Last needle?
if (i == needles.length - 1) {
// +1: We found one match for all needles
// Try to find more matches of current needle in remaining haystack
return 1 + find(remainingHaystack, needles, i);
}
// Try to find more matches of current needle in remaining haystack
// Try to find next needle in remaining haystack
return find(remainingHaystack, needles, i) + find(remainingHaystack, needles, i + 1);
}
private static String[] parse(String expression) {
List<String> searchTokens = new ArrayList<String>();
char lastChar = 0;
for (int i = 0; i < expression.length(); i++) {
char c = expression.charAt(i);
char[] chars;
switch (c) {
case '+':
// last char is repeated 2 times
chars = new char[2];
Arrays.fill(chars, lastChar);
searchTokens.add(String.valueOf(chars));
lastChar = 0;
break;
case '-':
// last char is repeated 4 times
chars = new char[4];
Arrays.fill(chars, lastChar);
searchTokens.add(String.valueOf(chars));
lastChar = 0;
break;
default:
if (lastChar != 0) {
searchTokens.add(String.valueOf(lastChar));
}
lastChar = c;
}
}
return searchTokens.toArray(new String[searchTokens.size()]);
}
}
Output:
Needles: [aa, bb, cccc]
Found: 4
Needles: [a, bb, cccc]
Found: 18
How about preprocessing aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc?
This becomes a1k1s1d1b1a2l1a1s1k1d1h1f1b2l1a1j1d1f1h1a1c4a1o1u1d1g1a1l1s1a2b2l1i1s1d1f1h1c4
Now find occurrences of a2, b2, c4.
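A minimal sketch of that preprocessing step, assuming plain run-length encoding is what is meant (the class and method names RunLength and encode are made up for illustration). Note that this version also encodes spaces as a run, whereas the encoded string shown above leaves them out.

```java
class RunLength {
    // Encodes each run of equal chars as the char followed by the run length,
    // e.g. "baal" -> "b1a2l1".
    static String encode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            int j = i;
            // advance j to the first position after the current run
            while (j < s.length() && s.charAt(j) == c) {
                j++;
            }
            out.append(c).append(j - i);
            i = j;
        }
        return out.toString();
    }
}
```

With this, RunLength.encode("aksdbaal") gives "a1k1s1d1b1a2l1".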
I tried it with the code below, but right now it only reports the first possible match, found depth-first.
It needs to be changed to explore all possible combinations instead of just the first.
import java.util.ArrayList;
import java.util.List;
public class Parsing {
public static void main(String[] args) {
String input = "aksdbaalaskdhfbblajdfhacccc aoudgalsaa bblisdfhcccc";
System.out.println(input);
for (int i = 0; i < input.length(); i++) {
System.out.print(i/10);
}
System.out.println();
for (int i = 0; i < input.length(); i++) {
System.out.print(i%10);
}
System.out.println();
List<String> tokenisedSearch = parseExp("a+b+c-");
System.out.println(tokenisedSearch);
parse(input, 0, tokenisedSearch, 0);
}
public static boolean parse(String input, int searchFromIndex, List<String> tokensToSeach, int currentTokenIndex) {
if(currentTokenIndex >= tokensToSeach.size())
return true;
String token = tokensToSeach.get(currentTokenIndex);
int found = input.indexOf(token, searchFromIndex);
if(found >= 0) {
System.out.println("Found at Index "+found+ " Token " +token);
return parse(input, found + token.length(), tokensToSeach, currentTokenIndex+1);
}
return false;
}
public static List<String> parseExp(String exp) {
List<String> list = new ArrayList<String>();
String runningToken = "";
for (int i = 0; i < exp.length(); i++) {
char at = exp.charAt(i);
switch (at) {
case '+' :
runningToken += runningToken;
list.add(runningToken);
runningToken = "";
break;
case '-' :
runningToken += runningToken;
runningToken += runningToken;
list.add(runningToken);
runningToken = "";
break;
default :
runningToken += at;
}
}
return list;
}
}
Recursion may be the following (pseudocode):
int search(String s, String expression) {
if expression consists of only one token t /* e. g. "a+" */ {
search for t in s
return number of occurrences
} else {
int result = 0
divide expression into first token t and rest expression
// e. g. "a+a+b-" -> t = "a+", rest = "a+b-"
search for t in s
for each occurrence {
s1 = substring of s from the end of the occurrence to the end of s
result += search(s1, rest) // search for rest of expression in rest of string
}
return result
}
}
Applying this to the entire string gives the number of non-continuous occurrences. To get continuous occurrences, you don't need recursion at all: just transform the expression into a string and search for it iteratively.
If you first convert the search string with a simple parser/compiler, so that a+ becomes aa etc., you can simply take the resulting string and run a regular expression match against your haystack. (Sorry, I'm no Java coder so I can't deliver any real code, but it is not really difficult.)
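For the continuous case, a rough Java sketch of "expand the expression, then search" could look like this. The names ContinuousSearch, expand and countContinuous are illustrative only, the expression is assumed to consist of (char, '+' or '-') pairs, and String.indexOf is used in place of a regular expression since the expanded needle is a literal string.

```java
class ContinuousSearch {
    // Expands "a+b+c-" into "aabbcccc": '+' doubles the preceding char,
    // '-' quadruples it. Assumes well-formed (char, sign) pairs.
    static String expand(String expr) {
        StringBuilder sb = new StringBuilder();
        for (int p = 0; p + 1 < expr.length(); p += 2) {
            int reps = expr.charAt(p + 1) == '+' ? 2 : 4;
            for (int r = 0; r < reps; r++) {
                sb.append(expr.charAt(p));
            }
        }
        return sb.toString();
    }

    // Counts (possibly overlapping) occurrences of the expanded expression in s.
    static int countContinuous(String s, String expr) {
        String needle = expand(expr);
        if (needle.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int pos = s.indexOf(needle); pos >= 0; pos = s.indexOf(needle, pos + 1)) {
            count++;
        }
        return count;
    }
}
```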

Determine if a given string is a k-palindrome

I'm trying to solve the following interview practice question:
A k-palindrome is a string which transforms into a palindrome on removing at most
k characters.
Given a string S, and an integer K, print "YES" if S is a k-palindrome;
otherwise print "NO".
Constraints:
S has at most 20,000 characters.
0 <= k <= 30
Sample Test Cases:
Input - abxa 1
Output - YES
Input - abdxa 1
Output - NO
My approach is to take all possible string combinations of length s.length - k or greater (e.g. for "abc" and k = 1: "ab", "bc", "ac", "abc") and check whether any of them is a palindrome. I have the following code so far, but can't seem to figure out a proper way to generate all these string combinations in the general case:
public static void isKPalindrome(String s, int k) {
// Generate all string combinations and call isPalindrome on them,
// printing "YES" at first true
}
private static boolean isPalindrome(String s) {
char[] c = s.toCharArray();
int slow = 0;
int fast = 0;
Stack<Character> stack = new Stack<>();
while (fast < c.length) {
stack.push(c[slow]);
slow += 1;
fast += 2;
}
if (c.length % 2 == 1) {
stack.pop();
}
while (!stack.isEmpty()) {
if (stack.pop() != c[slow++]) {
return false;
}
}
return true;
}
Can anyone figure out a way to implement this, or perhaps demonstrate a better way?
I think there is a better way
package se.wederbrand.stackoverflow;
public class KPalindrome {
public static void main(String[] args) {
KPalindrome kPalindrome = new KPalindrome();
String s = args[0];
int k = Integer.parseInt(args[1]);
if (kPalindrome.testIt(s, k)) {
System.out.println("YES");
}
else {
System.out.println("NO");
}
}
boolean testIt(String s, int k) {
if (s.length() <= 1) {
return true;
}
while (s.charAt(0) == s.charAt(s.length()-1)) {
s = s.substring(1, s.length()-1);
if (s.length() <= 1) {
return true;
}
}
if (k == 0) {
return false;
}
// Try to remove the first or last character
return testIt(s.substring(0, s.length() - 1), k - 1) || testIt(s.substring(1, s.length()), k - 1);
}
}
Since K is at most 30, it's likely that the string can be invalidated quickly, without even examining its middle.
I've tested this with the two provided test cases as well as with a 20k-character string consisting of "ab" repeated 10k times and k = 30.
All tests are fast and return the correct results.
This can be solved using Edit distance dynamic programming algorithm. Edit distance DP algorithm is used to find the minimum operations required to convert a source string to destination string. The operations can be either addition or deletion of characters.
The K-palindrome problem can be solved using Edit distance algorithm by checking the minimum operation required to convert the input string to its reverse.
Let editDistance(source,destination) be the function which takes source string and destination string and returns the minimum operations required to convert the source string to destination string.
A string S is K-palindrome if editDistance(S,reverse(S))<=2*K
This is because we can transform the given string S into its reverse by deleting at most K letters and then inserting the same K letters at different positions.
This will be more clear with an example.
Let S=madtam and K=1.
To convert S into reverse of S (i.e matdam) first we have to remove the character 't' at index 3 ( 0 based index) in S.
Now the intermediate string is madam. Then we have to insert the character 't' at index 2 in the intermediate string to get "matdam" which is the reverse of string s.
If you look carefully you will know that the intermediate string "madam" is the palindrome that is obtained by removing k=1 characters.
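As a rough illustration of the idea, here is a Java sketch of an edit distance restricted to insertions and deletions, applied to S and its reverse. The class and method names are assumptions made for this example, and two rolling rows are used so that memory stays linear in the string length.

```java
class EditDistanceKPalindrome {
    // Minimum number of single-character insertions and deletions
    // needed to turn a into b.
    static int editDistance(String a, String b) {
        int n = a.length();
        int m = b.length();
        int[] prev = new int[m + 1];
        int[] cur = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j; // empty prefix of a: insert the first j chars of b
        }
        for (int i = 1; i <= n; i++) {
            cur[0] = i; // empty prefix of b: delete the first i chars of a
            for (int j = 1; j <= m; j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    cur[j] = prev[j - 1];
                } else {
                    cur[j] = 1 + Math.min(prev[j], cur[j - 1]);
                }
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[m];
    }

    // S is a K-palindrome if editDistance(S, reverse(S)) <= 2*K.
    static boolean isKPalindrome(String s, int k) {
        String rev = new StringBuilder(s).reverse().toString();
        return editDistance(s, rev) <= 2 * k;
    }
}
```

For the example above, EditDistanceKPalindrome.editDistance("madtam", "matdam") is 2, so madtam is a 1-palindrome.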
I find the length of the longest palindrome that can be obtained by removing characters from the string. I have used dynamic programming here. The palindrome I consider need not be contiguous, i.e. it is a subsequence. For example, abscba has a longest palindromic subsequence of length 5 (absba).
This can then be used as follows: whenever k >= (len - length of the longest palindrome), the result is true, otherwise false.
public static int longestPalindrome(String s){
int len = s.length();
int[][] cal = new int[len][len];
for(int i=0;i<len;i++){
cal[i][i] = 1; //considering strings of length = 1
}
for(int i=0;i<len-1;i++){
//considering strings of length = 2
if (s.charAt(i) == s.charAt(i+1)){
cal[i][i+1] = 2;
}else{
cal[i][i+1] = 1; // two different chars: a single char is still a palindrome
}
}
for(int p = len-1; p>=0; p--){
for(int q=p+2; q<len; q++){
if (s.charAt(p)==s.charAt(q)){
cal[p][q] = 2 + cal[p+1][q-1];
}else{
cal[p][q] = Math.max(cal[p+1][q], cal[p][q-1]);
}
}
}
return cal[0][len-1];
}
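To tie the two parts of this answer together, here is a self-contained sketch of the final check on top of a longest-palindromic-subsequence table. The class name LpsKPalindrome and the isKPalindrome wrapper are assumptions for illustration; the table is filled in a single pass instead of three separate loops. It needs O(n^2) memory, so it is only meant for short inputs.

```java
class LpsKPalindrome {
    // Length of the longest palindromic subsequence of s.
    static int longestPalindrome(String s) {
        int len = s.length();
        if (len == 0) {
            return 0;
        }
        int[][] cal = new int[len][len];
        for (int p = len - 1; p >= 0; p--) {
            cal[p][p] = 1; // a single char is a palindrome of length 1
            for (int q = p + 1; q < len; q++) {
                if (s.charAt(p) == s.charAt(q)) {
                    // matching ends wrap the best palindrome strictly between them
                    cal[p][q] = 2 + (q - p > 1 ? cal[p + 1][q - 1] : 0);
                } else {
                    cal[p][q] = Math.max(cal[p + 1][q], cal[p][q - 1]);
                }
            }
        }
        return cal[0][len - 1];
    }

    // s is a k-palindrome if at most k chars lie outside the longest palindrome.
    static boolean isKPalindrome(String s, int k) {
        return s.length() - longestPalindrome(s) <= k;
    }
}
```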
This is a common interview question, and I'm a little surprised that no one has mentioned dynamic programming yet. This problem exhibits optimal substructure (if a string is a k-palindrome, some substrings are also k-palindromes) and overlapping subproblems (the solution requires comparing the same substrings more than once).
This is a special case of the edit distance problem, where we check if a string s can be converted to string p by only deleting characters from either or both strings.
Let the string be s and its reverse rev. Let dp[i][j] be the number of deletions required to convert the first i characters of s to the first j characters of rev. Since deletions have to be done in both strings, if dp[n][n] <= 2 * k, then the string is a k-palindrome.
Base case: When one of the strings is empty, all characters from the other string need to be deleted in order to make them equal.
Time complexity: O(n^2).
Scala code:
def kPalindrome(s: String, k: Int): Boolean = {
val rev = s.reverse
val n = s.length
val dp = Array.ofDim[Int](n + 1, n + 1)
for (i <- 0 to n; j <- 0 to n) {
dp(i)(j) = if (i == 0 || j == 0) i + j
else if (s(i - 1) == rev(j - 1)) dp(i - 1)(j - 1)
else 1 + math.min(dp(i - 1)(j), dp(i)(j - 1))
}
dp(n)(n) <= 2 * k
}
Since we are doing bottom-up DP, an optimization is to return false if at any time i == j && dp[i][j] > 2 * k, since all subsequent i == j must be greater.
Thanks to Andreas, that algo worked like a charm. Here is my implementation for anyone who's curious. It is slightly different, but fundamentally the same logic as yours:
public static boolean kPalindrome(String s, int k) {
if (s.length() <= 1) {
return true;
}
char[] c = s.toCharArray();
if (c[0] != c[c.length - 1]) {
if (k <= 0) {
return false;
} else {
char[] minusFirst = new char[c.length - 1];
System.arraycopy(c, 1, minusFirst, 0, c.length - 1);
char[] minusLast = new char[c.length - 1];
System.arraycopy(c, 0, minusLast, 0, c.length - 1);
return kPalindrome(String.valueOf(minusFirst), k - 1)
|| kPalindrome(String.valueOf(minusLast), k - 1);
}
} else {
char[] minusFirstLast = new char[c.length - 2];
System.arraycopy(c, 1, minusFirstLast, 0, c.length - 2);
return kPalindrome(String.valueOf(minusFirstLast), k);
}
}
This problem can be solved using the famous Longest Common Subsequence(LCS) method. When LCS is applied with the string and the reverse of the given string, then it gives us the longest palindromic subsequence present in the string.
Let the longest palindromic subsequence length of a given string of length string_length be palin_length. Then (string_length - palin_length) gives the number of characters required to be deleted to convert the string to a palindrome. Thus, the given string is k-palindrome if (string_length - palin_length) <= k.
Let me give some examples,
Initial String: madtam (string_length = 6)
Longest Palindromic Subsequence: madam (palin_length = 5)
Number of non-contributing characters: 1 ( string_length - palin_length)
Thus this string is k-palindromic for any k >= 1, because you may delete at most k characters (k or fewer).
Here is the code snippet:
#include<iostream>
#include<cstdio>
#include<cstring>
#include<algorithm>
using namespace std;
#define MAX 10000
int table[MAX+1][MAX+1];
int longest_common_subsequence(char *first_string, char *second_string){
int first_string_length = strlen(first_string), second_string_length = strlen(second_string);
int i, j;
memset( table, 0, sizeof(table));
for( i=1; i<=first_string_length; i++ ){
for( j=1; j<=second_string_length; j++){
if( first_string[i-1] == second_string[j-1] )
table[i][j] = table[i-1][j-1] + 1;
else
table[i][j] = max(table[i-1][j], table[i][j-1]);
}
}
return table[first_string_length][second_string_length];
}
char first_string[MAX], second_string[MAX];
int main(){
int k; // maximum number of characters that may be removed
scanf("%s %d", first_string, &k);
strcpy(second_string, first_string);
reverse(second_string, second_string+strlen(second_string));
int max_palindromic_length = longest_common_subsequence(first_string, second_string);
int non_contributing_chars = strlen(first_string) - max_palindromic_length;
if( k >= non_contributing_chars)
printf("K palindromic!\n");
else
printf("Not K palindromic!\n");
return 0;
}
I designed a solution purely based on recursion -
public static boolean isKPalindrome(String str, int k) {
if(str.length() < 2) {
return true;
}
if(str.charAt(0) == str.charAt(str.length()-1)) {
return isKPalindrome(str.substring(1, str.length()-1), k);
} else{
if(k == 0) {
return false;
} else {
if(isKPalindrome(str.substring(0, str.length() - 1), k-1)) {
return true;
} else{
return isKPalindrome(str.substring(1, str.length()), k-1);
}
}
}
}
There is no while loop in the above implementation, unlike in the accepted answer.
I hope it helps somebody looking for it.
public static boolean failK(String s, int l, int r, int k) {
if (k < 0)
return false;
if (l > r)
return true;
if (s.charAt(l) != s.charAt(r)) {
return failK(s, l + 1, r, k - 1) || failK(s, l, r - 1, k - 1);
} else {
return failK(s, l + 1, r - 1, k);
}
}
