Boolean algorithm producing true output when output should be false - java

I am trying to create an algorithm that tests whether a given string is a cover string for a list of strings. A string is a cover string if it contains the characters of every string in the list in a way that maintains the left-to-right order of each listed string. For example, for the two strings "cat" and "dog", "cadhpotg" would be a cover string, but "ctadhpog" would not be one.
I have created an algorithm, but it returns true when it should return false: the given string is a cover string for list1 and list2, but not for list3.
Any help with why this algorithm produces the wrong output would be highly appreciated.
public class StringProcessing2 {
//ArrayList created and 3 fields added.
public static ArrayList<String> stringList = new ArrayList<>();
public static String list1 = "phillip";
public static String list2 = "micky";
public static String list3 = "fad";
//Algorithm to check whether given String is a cover string.
public static boolean isCover(String coverString){
int matchedWords = 0;
stringList.add(list1);
stringList.add(list2);
stringList.add(list3);
//for-loops to iterate through each character of every word in stringList to test whether they appear in
//coverString in left to right order.
for(int i = 0; i < stringList.size(); i++){
int countLetters = 1;
for(int n = 0; n < (stringList.get(i).length())-1; n++){
if(coverString.indexOf(stringList.get(i).charAt(n)) <= (coverString.indexOf((stringList.get(i).charAt(n+1)),
coverString.indexOf((stringList.get(i).charAt(n)))))){
countLetters++;
if(countLetters == stringList.get(i).length()){
matchedWords++;
}
}
}
}
if(matchedWords == stringList.size()){
return true;
}
else
return false;
}
public static void main(String[] args) {
System.out.println(isCover("phillmickyp"));
}
}

Probably the easiest way to go about this is to break down the problem into parts. Have every function do the least possible work while still getting something done towards the overall goal.
To accomplish this, I'd recommend creating a helper method that takes two Strings and returns a boolean, checking if one String is the cover of another.
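boolean isCover(String s, String cover)
{
    // Index in cover of the most recently matched character (-1 before the first match).
    int i = -1;
    for(char c : s.toCharArray())
        // Search strictly after the previous match so that one character of
        // cover is never used twice (e.g. for the double 'l' in "phillip").
        if((i = cover.indexOf(c, i + 1)) == -1)
            return false;
    return true;
}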
Then once you have something that can correctly tell you if it's a valid cover String or not, it becomes much simpler to check if one String is a valid cover for multiple Strings
boolean isCover(List<String> strings, String cover)
{
for(String s : strings)
if(!isCover(s, cover))
return false;
return true;
}
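Putting the two helpers together, here is a minimal runnable sketch using the inputs from the question. The class name CoverStringDemo is made up for this example, and the helper searches strictly after the previous match, which is an assumption about how repeated letters should be treated.

```java
import java.util.Arrays;
import java.util.List;

class CoverStringDemo {
    // True if the characters of s appear in cover in left-to-right order.
    static boolean isCover(String s, String cover) {
        int i = -1; // index in cover of the most recently matched character
        for (char c : s.toCharArray()) {
            // Continue searching after the previous match so a character is not reused.
            i = cover.indexOf(c, i + 1);
            if (i == -1) {
                return false;
            }
        }
        return true;
    }

    // True if cover is a cover string for every string in the list.
    static boolean isCover(List<String> strings, String cover) {
        for (String s : strings) {
            if (!isCover(s, cover)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("phillip", "micky", "fad");
        System.out.println(isCover(words, "phillmickyp"));                    // false: "fad" is not covered
        System.out.println(isCover(Arrays.asList("cat", "dog"), "cadhpotg")); // true
    }
}
```

Because each word is checked independently, two words may share characters of the cover string, which matches the question's statement that "phillmickyp" covers both "phillip" and "micky".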

Related

Obtain lexicographically smallest & largest substring. My algorithm failed most of the test cases but I can't understand why. Help me figure it out

I had to do a test today for an interview and the problem was obtaining the lexicographically smallest and largest substring (in other words, sort by name).
Link - Complete the function SmallestAndLargestSubstring, which takes a string S consisting of lowercase English letters (a-z) as its argument and returns lexicographically smallest and largest substrings which start with a vowel and end with a consonant.
My algorithm passed the basic test cases but failed most of the others. It's not the most efficient code, but it was the fastest to write.
static String[] SmallestAndLargestSubstring(String s) {
ArrayList<Character> vowelList = new ArrayList<Character>();
vowelList.add('a');
vowelList.add('e');
vowelList.add('i');
vowelList.add('o');
vowelList.add('u');
ArrayList<Character> consonantList = new ArrayList<Character>();
for (char c='a'; c<='z'; c++) {
if (!vowelList.contains(c))
consonantList.add(c);
}
ArrayList<String> substringList = new ArrayList<String>();
for (int i=0; i<s.length(); i++) {
char c = s.charAt(i);
if (vowelList.contains(c)) {
String substring = "";
substring+=c;
for (int j=i+1; j<s.length(); j++) {
char c2 = s.charAt(j);
substring+=c2;
if (consonantList.contains(c2)) {
substringList.add(substring);
}
}
}
}
Collections.sort(substringList);
String[] outputAdapter = new String[2];
outputAdapter[0]=substringList.get(0);
outputAdapter[1]=substringList.get(substringList.size()-1);
return outputAdapter;
}
Anyway, I wanted to figure out where I went wrong, so I reverse engineered the test cases to find out what input was being passed in, hoping that would show me what was wrong with my algorithm.
Here's what I uncovered, and these are my answers (which are wrong according to the test cases).
Input
String s = "azizezozuzawwwwwwwwwuzzzzzzzzabbbbbbbaaaabbbboiz"
My answer
smallest = "aaaab";
largest = "uzzzzzzzzabbbbbbbaaaabbbboiz";
But for the life of me, I can't figure out where my mistake is. Here's my full list of substrings, sorted from the smallest to the largest. Link to sorted results
Been racking my brains for the last 3 hours. I'd be grateful if anyone can figure out where my mistake was.
Edit: Here are 3 more test cases. My answers match these test case answers.
string = "aba"; smallest = "ab"; largest = "ab";
string = "aab"; smallest = "aab"; largest = "ab";
string = "abababababbaaaaaaaaaaaaaaz"; smallest = "aaaaaaaaaaaaaaz"; largest = "az";
/*
This is basic code to obtain the substrings that start with a vowel and end with a consonant. It prints the first and last substrings in lexicographic (String.compareTo) order. Similarly, the first and last substrings by length can be obtained by sorting with a different comparator.
*/
public class StringSubsequencesStartVowelEndConsonant {
static List<String> subsequence = new LinkedList<>();
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
System.out.println("Enter the String:\n");
String string = in.next();
findSubstring(string);
}
private static void findSubstring(String string) {
for(int i=0;i<string.length();i++){
if(isVowel(string.charAt(i))){
for(int j=string.length()-1;j>=i;j--){
if(isConsonant(string.charAt(j))){
String subString = string.substring(i,j+1);
subsequence.add(subString);
}
}
}
}
Collections.sort(subsequence);
for(String str : subsequence){
System.out.print(str+" ");
}
System.out.println();
System.out.println(subsequence.get(0));
System.out.println(subsequence.get(subsequence.size()-1));
}
private static boolean isConsonant(char chars) {
return !(chars=='a'|| chars=='e'||chars=='i'||chars=='o'||chars=='u');
}
private static boolean isVowel(char chars) {
return (chars=='a'|| chars=='e'||chars=='i'||chars=='o'||chars=='u');
}
}

Determine if two words are anagrams

I recently took a quiz asking me to determine if elements in an array were anagrams. I completed an implementation, but upon running their tests, I only passed 1 of 5 test cases. The problem is, they wouldn't allow me to see what the tests were, so I'm really unsure about what I failed on. I've recreated my answer below, which basically multiplies the letters in a word and adds this number to an array. It then compares the numbers in one array to the numbers in the other, and prints true if they are the same. I'm basically asking what are some scenarios in which this would fail, and how would I modify this code to account for these cases?
public class anagramFinder {
public static void main (String[] args){
String[] listOne = new String[5];
listOne[0] = "hello";
listOne[1] = "lemon";
listOne[2] = "cheese";
listOne[3] = "man";
listOne[4] = "touch";
String[] listTwo = new String[5];
listTwo[0] = "olleh";
listTwo[1] = "melon";
listTwo[2] = "house";
listTwo[3] = "namer";
listTwo[4] = "tou";
isAnagram(listOne,listTwo);
}
public static void isAnagram(String[] firstWords, String[] secondWords){
int firstTotal = 1;
int secondTotal = 1;
int[] theFirstInts = new int[firstWords.length];
int[] theSecondInts = new int[secondWords.length];
for(int i = 0;i<firstWords.length;i++){
for(int j = 0;j<firstWords[i].length();j++){
firstTotal = firstTotal * firstWords[i].charAt(j);
}
theFirstInts[i] = firstTotal;
firstTotal = 1;
}
for(int i = 0;i<secondWords.length;i++){
for(int j = 0;j<secondWords[i].length();j++){
secondTotal = secondTotal * secondWords[i].charAt(j);
}
theSecondInts[i] = secondTotal;
secondTotal = 1;
}
for(int i=0;i<minimum(theFirstInts.length,theSecondInts.length);i++){
if(theFirstInts[i] == theSecondInts[i]){
System.out.println("True");
} else {
System.out.println("False");
}
}
}
public static int minimum(int number,int otherNumber){
if(number<otherNumber){
return number;
} else {
return otherNumber;
}
}
}
In my above example that I run in the main method, this prints True True False False False, which is correct
Copying my answer from a similar question.
Here's a simple fast O(n) solution without using sorting or multiple loops or hash maps. We increment the count of each character in the first array and decrement the count of each character in the second array. If the resulting counts array is full of zeros, the strings are anagrams. Can be expanded to include other characters by increasing the size of the counts array.
class AnagramsFaster{
private static boolean compare(String a, String b){
char[] aArr = a.toLowerCase().toCharArray(), bArr = b.toLowerCase().toCharArray();
if (aArr.length != bArr.length)
return false;
int[] counts = new int[26]; // An array to hold the number of occurrences of each character
for (int i = 0; i < aArr.length; i++){
counts[aArr[i]-97]++; // Increment the count of the character at i
counts[bArr[i]-97]--; // Decrement the count of the character at i
}
// If the strings are anagrams, the counts array will be full of zeros
for (int i = 0; i<26; i++)
if (counts[i] != 0)
return false;
return true;
}
public static void main(String[] args){
System.out.println(compare(args[0], args[1]));
}
}
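The idea of multiplying ASCII codes isn't bad, but it isn't reliable. Two different words can have the same product even within the range 'a' to 'z': for example, "bx" (98 * 120) and "ip" (105 * 112) both give 11760. In addition, an int product overflows for words of five or more letters, which makes further collisions more likely.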
One conventional approach would be to create a Map for counting the letters, and compare the Maps.
Another one would sort the letters and compare the sorted strings.
A third one would iterate over the letters of the first word, try to locate the letter in the second word, and reduce the second word by that letter.
I can't think of a fourth way, but I'm almost certain there is one ;-)
Later
Well, here's a fourth way: assign 26 prime numbers to 'a' to 'z' and (using BigInteger) multiply the primes according to the letters of a word. Anagrams produce identical products.
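A minimal sketch of the prime-product idea follows. The class and method names are made up for this example, and it assumes the words contain only the letters a to z. Because every integer has a unique prime factorization, two words get the same product exactly when they use the same letters the same number of times. For comparison, asciiProduct reproduces the question's plain character-code product.

```java
import java.math.BigInteger;

class PrimeProductAnagram {
    // The first 26 primes, one per letter 'a'..'z'.
    private static final int[] PRIMES = {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
        43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
    };

    // Product of the primes assigned to the letters of the word.
    static BigInteger signature(String word) {
        BigInteger product = BigInteger.ONE;
        for (char c : word.toLowerCase().toCharArray()) {
            if (c < 'a' || c > 'z') {
                throw new IllegalArgumentException("only letters a-z are supported: " + word);
            }
            product = product.multiply(BigInteger.valueOf(PRIMES[c - 'a']));
        }
        return product;
    }

    static boolean isAnagram(String a, String b) {
        return signature(a).equals(signature(b));
    }

    // The approach from the question: multiply the raw character codes.
    static int asciiProduct(String word) {
        int product = 1;
        for (int i = 0; i < word.length(); i++) {
            product *= word.charAt(i);
        }
        return product;
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("lemon", "melon"));                // true
        System.out.println(asciiProduct("bx") == asciiProduct("ip"));   // true, although not anagrams
        System.out.println(isAnagram("bx", "ip"));                      // false
    }
}
```

BigInteger avoids overflow for long words, at the cost of slower arithmetic than the counting approach above.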

Determining if two strings are a substring of a permutation of another String

So I am trying to figure out if two strings when combined together are a substring of a permutation of another string.
I have what I believe to be a working solution, but it is failing some of the JUnit test cases and I don't have access to the ones it is failing on.
Here is my code with one test case:
String a="tommarvoloriddle";
String b="lord";
String c="voldemort";
String combined= b+c; // b is already declared above, so the concatenation needs its own name
char[] w= a.toCharArray();
char[] k= combined.toCharArray();
Arrays.sort(k);
Arrays.sort(w);
pw.println(isPermuation(w,k)?"YES":"NO");
static boolean isPermuation(char[] w, char[] k)
{
boolean found=false;
for(int i=0; i<k.length; i++)
{
for(int j=i; j<w.length; j++)
{
if(k[i]==w[j])
{
j=w.length;
found=true;
}
else
found=false;
}
}
return found;
}
Any help getting this to always produce the correct answer would be awesome, and help making it more efficient would be great too.
What you have is not a working solution. However, you don't explain why you thought it might be, so it's hard to figure out what you intended. I will point out that your code updates found unconditionally for each inner loop, so isPermutation() will always return the result of the last comparison (which is certainly not what you want).
You did the right thing in sorting the two arrays in the first place -- this is a classic step which should allow you to efficiently evaluate them in one pass. But then, instead of a single pass, you use a nested loop -- what did you intend here?
A single pass implementation might be something like:
static boolean isPermutation(char[] w, char[] k) {
int k_idx=0;
for(int w_idx=0; w_idx < w.length; ++w_idx) {
if(k_idx == k.length)
return true; // all characters in k are present in w
if( w[w_idx] > k[k_idx] )
return false; // found character in k not present in w
if( w[w_idx] == k[k_idx] )
++k_idx; // character from k corresponds to character from w
}
// any remaining characters in k are not present in w
return k_idx == k.length;
}
So we are only interested in whether the two combined strings are a subset of a permutation of another string, meaning that the lengths can in fact differ. So let's say we have:
String a = "tommarvoloriddle";
String b = "lord";
String c = "voldemort";
char[] master = a.toCharArray();
char[] combined = (b + c).toCharArray();
Arrays.sort(master);
Arrays.sort(combined);
System.out.println(IsPermutation(master, combined) ? "YES" : "NO");
Then our method is:
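static boolean IsPermutation(char[] masterString, char[] combinedString)
{
    int combinedStringIndex = 0;
    int charsFound = 0;
    // Both arrays must already be sorted. Stop as soon as every character of
    // combinedString has been matched, so the index never runs past the end.
    for (int i = 0; i < masterString.length && combinedStringIndex < combinedString.length; ++i) {
        int result = Character.compare(combinedString[combinedStringIndex], masterString[i]);
        if (result == 0) {
            charsFound++;
            combinedStringIndex++;
        }
        else if (result < 0) {
            // masterString has already moved past this character, so it cannot be matched.
            return false;
        }
    }
    return (charsFound == combinedString.length);
}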
What the above method does: it walks through the sorted masterString, comparing each character with the current character of the sorted combinedString. On a match, we advance in combinedString and count the match. If the current masterString character is smaller than the current combinedString character, we simply look at the next character of masterString and see if that matches. If at any point the current character in masterString is numerically greater than the current character in combinedString, then we will never be able to match that character, so we give up. At the end, if the number of matched characters equals the total number of characters in combinedString (its length), we have established that combinedString is indeed a subset of a permutation of masterString. Hope that helps.
If the two Strings are permutations of each other, you should be able to do this:
public static boolean isPermuted(String s1, String s2) {
if (s1.length() != s2.length()) return false;
char[] chars1 = s1.toCharArray();
char[] chars2 = s2.toCharArray();
Arrays.sort(chars1);
Arrays.sort(chars2);
return Arrays.equals(chars1, chars2);
}
This means that when sorted the characters are the same, in the same number.

java string permutations and combinations lookup

I'm writing an Android word app. My code includes a method that would find all combinations of the string and the substrings of a 7 letter string with a minimum of length 3. Then compare all available combination to every word in the dictionary to find all the valid words. I'm using a recursive method. Here's the code.
// Gets all the permutations of a string.
void permuteString(String beginningString, String endingString) {
if (endingString.length() <= 1){
if((Arrays.binarySearch(mDictionary, beginningString.toLowerCase() + endingString.toLowerCase())) >= 0){
mWordSet.add(beginningString + endingString);
}
}
else
for (int i = 0; i < endingString.length(); i++) {
String newString = endingString.substring(0, i) + endingString.substring(i + 1);
permuteString(beginningString + endingString.charAt(i), newString);
}
}
// Get the combinations of the sub-strings. Minimum 3 letter combinations
void subStrings(String s){
String newString = "";
if(s.length() > 3){
for(int x = 0; x < s.length(); x++){
newString = removeCharAt(x, s);
permuteString("", newString);
subStrings(newString);
}
}
}
The above code runs fine but when I installed it on my Nexus s I realized that it runs a bit too slow. It takes a few seconds to complete. About 3 or 4 seconds which is unacceptable.
Now I've played some word games on my phone and they compute all the combinations of a string instantly which makes me believe that my algorithm is not very efficient and it can be improved. Can anyone help?
public class TrieNode {
// One child slot per letter 'a'..'z'; every slot starts out null.
TrieNode[] children = new TrieNode[26];
private ArrayList<String> words = new ArrayList<String>();
public void addWord(String word){
words.add(word);
}
public ArrayList<String> getWords(){
return words;
}
}
public class Trie {
static String myWord;
static String myLetters = "afinnrty";
static char[] myChars;
static Sort sort;
static TrieNode myNode = new TrieNode();
static TrieNode currentNode;
static int y = 0;
static ArrayList<String> availableWords = new ArrayList<String>();
public static void main(String[] args) {
readWords();
getPermutations();
}
public static void getPermutations(){
currentNode = myNode;
for(int x = 0; x < myLetters.length(); x++){
if(currentNode.children[myLetters.charAt(x) - 'a'] != null){
//availableWords.addAll(currentNode.getWords());
currentNode = currentNode.children[myLetters.charAt(x) - 'a'];
System.out.println(currentNode.getWords() + "" + myLetters.charAt(x));
}
}
//System.out.println(availableWords);
}
public static void readWords(){
try {
BufferedReader in = new BufferedReader(new FileReader("c://scrabbledictionary.txt"));
String str;
while ((str = in.readLine()) != null) {
myWord = str;
myChars = str.toCharArray();
sort = new Sort(myChars);
insert(myNode, myChars, 0);
}
in.close();
} catch (IOException e) {
}
}
public static void insert(TrieNode node, char[] myChars, int x){
if(x >= myChars.length){
node.addWord(myWord);
//System.out.println(node.getWords()+""+y);
y++;
return;
}
if(node.children[myChars[x]-'a'] == null){
insert(node.children[myChars[x]-'a'] = new TrieNode(), myChars, x=x+1);
}else{
insert(node.children[myChars[x]-'a'], myChars, x=x+1);
}
}
}
In your current approach, you're looking up every permutation of each substring. So for "abc", you need to look up "abc", "acb", "bac", "bca", "cab" and "cba". If you wanted to find all permutations of "permutations", your number of lookups is nearly 500,000,000, and that's before you've even looked at its substrings. But we can reduce this to one lookup, regardless of length, by preprocessing the dictionary.
The idea is to put each word in the dictionary into some data structure where each element contains a set of characters, and a list of all words containing (only) those characters. So for example, you could build a binary tree, which would have a node containing the (sorted) character set "abd" and the word list ["bad", "dab"]. Now, if we want to find all permutations of "dba", we sort it to give "abd" and look it up in the tree to retrieve the list.
As Baumann pointed out, tries are well suited to storing this kind of data. The beauty of the trie is that the lookup time depends only on the length of your search string - it is independent of the size of your dictionary. Since you'll be storing quite a lot of words, and most of your search strings will be tiny (the majority will be the 3-character substrings from the lowest level of your recursion), this structure is ideal.
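In this case, the paths down your trie would reflect the character sets rather than the words themselves. So if your entire dictionary was ["bad", "dab", "cab", "cable"], your lookup structure would end up with these paths, each spelling a sorted character set: root -> a -> b -> c holds ["cab"], root -> a -> b -> c -> e -> l holds ["cable"], and root -> a -> b -> d holds ["bad", "dab"].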
There's a bit of a time/space trade-off in the way you implement this. In the simplest (and fastest) approach, each Node contains just the list of words, and an array Node[26] of children. This allows you to locate the child you're after in constant time, just by looking at children[s.charAt(i)-'a'] (where s is your search string and i is your current depth in the trie).
The downside is that most of your children arrays will be mostly empty. If space is an issue, you can use a more compact representation like a linked list, dynamic array, hash table, etc. However, these come at the cost of potentially requiring several memory accesses and comparisons at each node, instead of the simple array access above. But I'd be surprised if the wasted space was more than a few megabytes over your whole dictionary, so the array-based approach is likely your best bet.
With the trie in place, your whole permutation function is replaced with one lookup, bringing the complexity down from O(N! log D) (where D is the size of your dictionary, N the size of your string) to O(N log N) (since you need to sort the characters; the lookup itself is O(N)).
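The array-based variant described above could be sketched roughly as follows. This is not the pastebin implementation; the class and method names are made up for this example, and it assumes every word consists only of the letters a to z.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AnagramTrie {
    private static final class Node {
        final Node[] children = new Node[26];          // one slot per letter, null if absent
        final List<String> words = new ArrayList<>();  // words whose sorted letters end here
    }

    private final Node root = new Node();

    // The path through the trie is the word's letters in sorted order.
    private static char[] sortedLetters(String word) {
        char[] letters = word.toLowerCase().toCharArray();
        Arrays.sort(letters);
        return letters;
    }

    void add(String word) {
        Node node = root;
        for (char c : sortedLetters(word)) {
            int slot = c - 'a';
            if (node.children[slot] == null) {
                node.children[slot] = new Node();
            }
            node = node.children[slot];
        }
        node.words.add(word);
    }

    // All dictionary words that are permutations of the given letters.
    List<String> anagramsOf(String letters) {
        Node node = root;
        for (char c : sortedLetters(letters)) {
            node = node.children[c - 'a'];
            if (node == null) {
                return Collections.emptyList();
            }
        }
        return node.words;
    }

    public static void main(String[] args) {
        AnagramTrie trie = new AnagramTrie();
        for (String w : new String[] {"bad", "dab", "cab", "cable"}) {
            trie.add(w);
        }
        System.out.println(trie.anagramsOf("dba")); // [bad, dab]
    }
}
```

Each query sorts its letters once and then follows at most one child pointer per letter, so the dictionary size does not affect the lookup time.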
EDIT: I've thrown together an (untested) implementation of this structure: http://pastebin.com/Qfu93E80
See here: How to find list of possible words from a letter matrix [Boggle Solver]
The idea behind the code in the answers is as follows:
Iterate over each word dictionary.
Iterate over each letter in the word, adding it to a string and adding the string each time to an array of prefixes.
When creating string combinations, test to see that they exist in the prefix array before branching any further.
static List<String> permutations(String a) {
List<String> result=new LinkedList<String>();
int len = a.length();
if (len<=1){
result.add(a);
}else{
for (int i=0;i<len; i++){
for (String it:permutations(a.substring(0, i)+a.substring(i+1))){
result.add(a.charAt(i)+it);
}
}
}
return result;
}
I don't think adding all permutations is necessary. You can simply encapsulate the string into a PermutationString:
public class PermutationString {
private final String innerString;
public PermutationString(String innerString) {
this.innerString = innerString;
}
@Override
public int hashCode() {
int hash = 0x00;
String s1 = this.innerString;
for(int i = 0; i < s1.length(); i++) {
hash += s1.charAt(i);
}
return hash;
}
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final PermutationString other = (PermutationString) obj;
int nChars = 26;
int[] chars = new int[nChars];
String s1 = this.innerString;
String s2 = other.innerString;
if(s1.length() != s2.length()) {
return false;
}
for(int i = 0; i < s1.length(); i++) {
chars[s1.charAt(i)-'a']++;
}
for(int i = 0; i < s2.length(); i++) {
chars[s2.charAt(i)-'a']--;
}
for(int i = 0; i < nChars; i++) {
if(chars[i] != 0x00) {
return false;
}
}
return true;
}
}
A PermutationString is a string, but where two PermutationStrings are equal if they have the same frequency of characters. Thus new PermutationString("bad").equals(new PermutationString("dab")). This also holds for the .hashCode(): if the strings are permutations of each other, they will generate the same .hashCode().
Now you can simply use a HashMap<PermutationString,ArrayList<String>> as follows:
HashMap<PermutationString,ArrayList<String>> hm = new HashMap<PermutationString,ArrayList<String>>();
String[] dictionary = new String[] {"foo","bar","oof"};
ArrayList<String> items;
for(String s : dictionary) {
PermutationString ps = new PermutationString(s);
if(hm.containsKey(ps)) {
items = hm.get(ps);
items.add(s);
} else {
items = new ArrayList<String>();
items.add(s);
hm.put(ps,items);
}
}
So now we iterate over all possible words in the dictionary, construct a PermutationString as key, and if the key already exists (that means that there is already a word with the same character frequencies), we simply add our own word to it. Otherwise, we add a new ArrayList<String> with the single word.
Now that we have filled up hm with all the words (but not that many keys), you can query:
hm.get(new PermutationString("ofo"));
This will return an ArrayList<String> with "foo" and "oof".
Testcase:
HashMap<PermutationString, ArrayList<String>> hm = new HashMap<PermutationString, ArrayList<String>>();
String[] dictionary = new String[]{"foo", "bar", "oof"};
ArrayList<String> items;
for (String s : dictionary) {
PermutationString ps = new PermutationString(s);
if (hm.containsKey(ps)) {
items = hm.get(ps);
items.add(s);
} else {
items = new ArrayList<String>();
items.add(s);
hm.put(ps, items);
}
}
Assert.assertNull(hm.get(new PermutationString("baa")));
Assert.assertNull(hm.get(new PermutationString("brr")));
Assert.assertNotNull(hm.get(new PermutationString("bar")));
Assert.assertEquals(1,hm.get(new PermutationString("bar")).size());
Assert.assertNotNull(hm.get(new PermutationString("rab")));
Assert.assertEquals(1,hm.get(new PermutationString("rab")).size());
Assert.assertNotNull(hm.get(new PermutationString("foo")));
Assert.assertEquals(2,hm.get(new PermutationString("foo")).size());
Assert.assertNotNull(hm.get(new PermutationString("ofo")));
Assert.assertEquals(2,hm.get(new PermutationString("ofo")).size());
Assert.assertNotNull(hm.get(new PermutationString("oof")));
Assert.assertEquals(2,hm.get(new PermutationString("oof")).size());
Use a Trie
Instead of testing all N! possibilities, you only follow prefix trees that lead to a result. This will significantly reduce the number of strings that you're checking against.
Well, you can extend your dictionary entries with an array letters[], where letters[i] is the number of times the i-th letter of the alphabet is used in the word. This takes some additional memory, but not much more than is used now.
Then, for each word whose permutations you want to check, count the occurrences of each letter in the same way and traverse the dictionary with a simple comparison. If, for every letter, the dictionary word uses it no more often than the word being checked, then the dictionary word can be formed as a permutation of a substring; otherwise it cannot.
Complexity: it takes O(D * maxLen) for precalculation, and O(max(N, D)) for each query.
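The letter-count comparison could be sketched like this. The names LetterCounts, countLetters and canBuild are hypothetical, and lowercase a to z input is assumed.

```java
class LetterCounts {
    // counts[i] is how many times the i-th letter of the alphabet occurs in the word.
    static int[] countLetters(String word) {
        int[] counts = new int[26];
        for (char c : word.toCharArray()) {
            counts[c - 'a']++;
        }
        return counts;
    }

    // True if the dictionary word needs no letter more often than the rack provides it.
    static boolean canBuild(int[] wordCounts, int[] rackCounts) {
        for (int i = 0; i < 26; i++) {
            if (wordCounts[i] > rackCounts[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] rack = countLetters("afinnrty");
        System.out.println(canBuild(countLetters("infantry"), rack)); // true
        System.out.println(canBuild(countLetters("fans"), rack));     // false: no 's' available
    }
}
```

In practice the counts for the dictionary words would be computed once up front, so each query only pays for the 26 comparisons per word.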

Matching the occurrence and pattern of characters of String2 in String1

I was asked this question in a phone interview for summer internship, and tried to come up with a n*m complexity solution (although it wasn't accurate too) in Java.
I have a function that takes 2 strings, suppose "common" and "cmn". It should return True based on the fact that 'c', 'm', 'n' are occurring in the same order in "common". But if the arguments were "common" and "omn", it would return False because even though they are occurring in the same order, but 'm' is also appearing after 'o' (which fails the pattern match condition)
I have worked over it using Hashmaps, and Ascii arrays, but didn't get a convincing solution yet! From what I have read till now, can it be related to Boyer-Moore, or Levenshtein Distance algorithms?
Hoping for respite at stackoverflow! :)
Edit: Some of the answers talk about reducing the word length, or creating a hashset. But per my understanding, this question cannot be done with hashsets because occurrence/repetition of each character in first string has its own significance. PASS conditions- "con", "cmn", "cm", "cn", "mn", "on", "co". FAIL conditions that may seem otherwise- "com", "omn", "mon", "om". These are FALSE/FAIL because "o" is occurring before as well as after "m". Another example- "google", "ole" would PASS, but "google", "gol" would fail because "o" is also appearing before "g"!
I think it's quite simple. Run through the pattern and for every character get the index of its last occurrence in the string. The index must always increase, otherwise return false.
So in pseudocode:
index = -1
foreach c in pattern
checkindex = string.lastIndexOf(c)
if checkindex == -1 //not found
return false
if checkindex < index
return false
if string.firstIndexOf(c) < index //characters in the wrong order
return false
index = checkindex
return true
Edit: you could further improve the code by passing index as the starting index to the lastIndexOf method. Then you wouldn't have to compare checkindex with index and the algorithm would be faster.
Updated: Fixed a bug in the algorithm. Additional condition added to consider the order of the letters in the pattern.
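A direct Java translation of the pseudocode might look like the following sketch. The class name OrderedPattern is made up, and Java's String.indexOf stands in for the pseudocode's firstIndexOf.

```java
class OrderedPattern {
    static boolean matches(String text, String pattern) {
        int index = -1; // last occurrence of the previous pattern character
        for (char c : pattern.toCharArray()) {
            int checkIndex = text.lastIndexOf(c);
            if (checkIndex == -1) {
                return false; // character not found at all
            }
            if (checkIndex < index) {
                return false; // last occurrence lies before the previous character
            }
            if (text.indexOf(c) < index) {
                return false; // character also occurs before the previous character
            }
            index = checkIndex;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("common", "cmn")); // true
        System.out.println(matches("common", "omn")); // false
        System.out.println(matches("common", "mon")); // false
    }
}
```

On the examples from the question's edit, this accepts "con", "cmn", "cm", "cn", "mn", "on" and "co" and rejects "com", "omn", "mon" and "om".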
An excellent question and couple of hours of research and I think I have found the solution. First of all let me try explaining the question in a different approach.
Requirement:
Let's consider the same example 'common' (mainString) and 'cmn' (subString). First we need to be clear that any character can repeat within the mainString and also the subString, and since it is the pattern that we are concentrating on, the index of each character plays a great role too. So we need to know:
Index of the character (least and highest)
Let's keep this on hold and go ahead and check the patterns a bit more. For the word common, we need to find whether the particular pattern cmn is present or not. The different patterns possible with common are (precedence applies):
c -> o
c -> m
c -> n
o -> m
o -> o
o -> n
m -> m
m -> o
m -> n
o -> n
At any moment of time this precedence and comparison must be valid. Since the precedence plays a huge role, we need to have the index of each unique character Instead of storing the different patterns.
Solution
First part of the solution is to create a Hash Table with the following criteria :-
Create a Hash Table with the key as each character of the mainString
Each entry for a unique key in the Hash Table will store two indices i.e lowerIndex and higherIndex
Loop through the mainString and for every new character, update a new entry of lowerIndex into the Hash with the current index of the character in mainString.
If Collision occurs, update the current index with higherIndex entry, do this until the end of String
Second and main part of pattern matching :-
Set Flag as False
Loop through the subString and for every character as the key, retrieve the details from the Hash.
Do the same for the very next character.
Just before loop increment, verify two conditions
If highestIndex(current character) > highestIndex(next character) Then
Pattern Fails, Flag <- False, Terminate Loop
// This condition is applicable for almost all the cases for pattern matching
Else If lowestIndex(current character) > lowestIndex(next character) Then
Pattern Fails, Flag <- False, Terminate Loop
// This case is explicitly for cases in which patterns like 'mon' appear
Display the Flag
N.B : Since I am not so versatile in Java, I did not submit the code. But some one can try implementing my idea
I did this question myself in an inefficient manner, but it does give an accurate result! I would appreciate it if anyone could work out an efficient code/algorithm from this!
Create a function "Check" which takes 2 strings as arguments. Check each character of string 2 in string 1. The order of appearance of each character of s2 should be verified as true in S1.
Take character 0 from string p and traverse through the string s to find its index of first occurrence.
Traverse through the filled ascii array to find any value more than the index of first occurrence.
Traverse further to find the last occurrence, and update the ascii array
Take character 1 from string p and traverse through the string s to find the index of first occurence in string s
Traverse through the filled ascii array to find any value more than the index of first occurrence. if found, return False.
Traverse further to find the last occurrence, and update the ascii array
As can be observed, this is a bruteforce method...I guess O(N^3)
public class Interview
{
    public static void main(String[] args)
    {
        if (check("google", "oge"))
            System.out.println("yes");
        else System.out.println("sorry!");
    }
    public static boolean check (String s, String p)
    {
        int[] asciiArr = new int[256];
        for(int pIndex=0; pIndex<p.length(); pIndex++) //Loop1 inside p
        {
            for(int sIndex=0; sIndex<s.length(); sIndex++) //Loop2 inside s
            {
                if(p.charAt(pIndex) == s.charAt(sIndex))
                {
                    asciiArr[s.charAt(sIndex)] = sIndex; //adding char from s to its Ascii value
                    for(int ascIndex=0; ascIndex<256; ) //Loop 3 for Ascii Array
                    {
                        if(asciiArr[ascIndex]>sIndex) //condition to check repetition
                            return false;
                        else ascIndex++;
                    }
                }
            }
        }
        return true;
    }
}
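The author asks whether a more efficient version exists. One possible reading of the check above is that, for each pair of neighbouring characters in p, every occurrence of the first in s must come before every occurrence of the second. Under that reading (an assumption, not something the answer states), one pass that records first and last positions should be enough. The class and method names below are hypothetical, and unlike the brute-force version this sketch treats a character that is missing from s as a failure.

```java
import java.util.HashMap;
import java.util.Map;

class OrderCheckSketch {
    // Hypothetical helper: true when every character of p occurs in s and, for
    // each adjacent pair of distinct characters (a, b) in p, the last
    // occurrence of a in s lies before the first occurrence of b in s.
    static boolean checkFast(String s, String p) {
        Map<Character, Integer> first = new HashMap<>();
        Map<Character, Integer> last = new HashMap<>();
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            first.putIfAbsent(ch, i); // remember the first position only
            last.put(ch, i);          // overwrite until the last position remains
        }
        for (int i = 0; i < p.length(); i++) {
            char cur = p.charAt(i);
            if (!first.containsKey(cur)) {
                return false; // assumption: a character absent from s fails the check
            }
            if (i > 0) {
                char prev = p.charAt(i - 1);
                if (prev != cur && last.get(prev) >= first.get(cur)) {
                    return false; // some occurrence of prev lies after an occurrence of cur
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(checkFast("google", "oge")); // prints false
        System.out.println(checkFast("google", "ole")); // prints true
    }
}
```

This does one pass over s and one over p, so the running time is linear in the two lengths, at the cost of two hash maps.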
Isn't it doable in O(n log n)?
Step 1, reduce the string by eliminating every character that appears again further to the right, so that only the last occurrence of each character is kept. Strictly speaking you only need to eliminate characters if they appear in the string you're checking.
/** Returns the maximal subsequence of container in which no character
 * appears to the left of another occurrence of the same character, i.e.
 * only the last occurrence of each character is kept.
 * E.g. "common" -> "cmon", and "whirlygig" -> "whrlyig".
 */
static String reduceContainer(String container) {
    // Needs java.util.Set and java.util.HashSet; any sparse set of chars would do.
    // add() returns true only the first time a character is seen.
    Set<Character> charsToRight = new HashSet<>();
    StringBuilder reduced = new StringBuilder();
    for (int i = container.length(); --i >= 0;) {
        char ch = container.charAt(i);
        if (charsToRight.add(ch)) {
            reduced.append(ch);
        }
    }
    return reduced.reverse().toString();
}
Step 2, check containment.
static boolean containsInOrder(String container, String containee) {
    int containerIdx = 0, containeeIdx = 0;
    int containerLen = container.length(), containeeLen = containee.length();
    while (containerIdx < containerLen && containeeIdx < containeeLen) {
        // Could loop over codepoints instead of code-units, but you get the point...
        if (container.charAt(containerIdx) == containee.charAt(containeeIdx)) {
            ++containeeIdx;
        }
        ++containerIdx;
    }
    return containeeIdx == containeeLen;
}
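As a rough end-to-end illustration, the two steps can be wired together as below. This is only a sketch: the class name ReduceThenCheck is hypothetical, and a java.util.HashSet is assumed for the set of characters already seen to the right.

```java
import java.util.HashSet;
import java.util.Set;

class ReduceThenCheck {
    // Step 1: keep only the last occurrence of each character.
    static String reduceContainer(String container) {
        Set<Character> seenToRight = new HashSet<>(); // assumed set type
        StringBuilder reduced = new StringBuilder();
        for (int i = container.length() - 1; i >= 0; i--) {
            char ch = container.charAt(i);
            if (seenToRight.add(ch)) { // true only the first time ch is seen
                reduced.append(ch);
            }
        }
        return reduced.reverse().toString();
    }

    // Step 2: plain subsequence test of containee against container.
    static boolean containsInOrder(String container, String containee) {
        int containerIdx = 0, containeeIdx = 0;
        while (containerIdx < container.length() && containeeIdx < containee.length()) {
            if (container.charAt(containerIdx) == containee.charAt(containeeIdx)) {
                containeeIdx++;
            }
            containerIdx++;
        }
        return containeeIdx == containee.length();
    }

    public static void main(String[] args) {
        String reduced = reduceContainer("common");
        System.out.println(reduced);                         // prints cmon
        System.out.println(containsInOrder(reduced, "cmn")); // prints true
        System.out.println(containsInOrder(reduced, "omn")); // prints false
    }
}
```

Reducing the container once and reusing the result keeps each later check to one linear scan.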
And to answer your second question: no, Levenshtein distance won't help you, since it is symmetric (swapping the arguments gives the same output), whereas the check you want is not.
import java.util.LinkedHashSet;
import java.util.Set;

public class StringPattern {
    public static void main(String[] args) {
        String inputContainer = "common";
        String inputContainees[] = { "cmn", "omn" };
        for (String containee : inputContainees)
            System.out.println(inputContainer + " " + containee + " "
                    + containsCommonCharsInOrder(inputContainer, containee));
    }
    static boolean containsCommonCharsInOrder(String container, String containee) {
        Set<Character> containerSet = new LinkedHashSet<Character>() {
            // Re-adding an existing char moves it to the end, so the set
            // ends up ordered by last occurrence
            @Override
            public boolean add(Character arg0) {
                if (this.contains(arg0))
                    this.remove(arg0);
                return super.add(arg0);
            }
        };
        addAllPrimitiveCharsToSet(containerSet, container.toCharArray());
        Set<Character> containeeSet = new LinkedHashSet<Character>();
        addAllPrimitiveCharsToSet(containeeSet, containee.toCharArray());
        // retains the common chars in order
        containerSet.retainAll(containeeSet);
        return containerSet.toString().equals(containeeSet.toString());
    }
    static void addAllPrimitiveCharsToSet(Set<Character> set, char[] arr) {
        for (char ch : arr)
            set.add(ch);
    }
}
Output:
common cmn true
common omn false
I would consider this one of the worst pieces of code I have ever written, or one of the worst code examples on Stack Overflow... but guess what: all your conditions are met!
No algorithm I found really fit the need, so I just used brute force. Test it out.
And I couldn't care less about space and time complexity here; my aim was first to try and solve it, and maybe improve it later!
public class SubString {
    public static void main(String[] args) {
        SubString ss = new SubString();
        String[] trueconditions = {"con", "cmn", "cm", "cn", "mn", "on", "co" };
        String[] falseconditions = {"com", "omn", "mon", "om"};
        System.out.println("True Conditions : ");
        for (String str : trueconditions) {
            System.out.println("SubString? : " + str + " : " + ss.test("common", str));
        }
        System.out.println("False Conditions : ");
        for (String str : falseconditions) {
            System.out.println("SubString? : " + str + " : " + ss.test("common", str));
        }
        System.out.println("SubString? : ole : " + ss.test("google", "ole"));
        System.out.println("SubString? : gol : " + ss.test("google", "gol"));
    }
    public boolean test(String original, String match) {
        char[] original_array = original.toCharArray();
        char[] match_array = match.toCharArray();
        int[] value = new int[match_array.length];
        int index = 0;
        for (int i = 0; i < match_array.length; i++) {
            for (int j = index; j < original_array.length; j++) {
                if (original_array[j] != original_array[j == 0 ? j : j-1] && contains(match.substring(0, i), original_array[j])) {
                    value[i] = 2;
                } else {
                    if (match_array[i] == original_array[j]) {
                        if (value[i] == 0) {
                            if (contains(original.substring(0, j == 0 ? j : j-1), match_array[i])) {
                                value[i] = 2;
                            } else {
                                value[i] = 1;
                            }
                        }
                        index = j + 1;
                    }
                }
            }
        }
        for (int b : value) {
            if (b != 1) {
                return false;
            }
        }
        return true;
    }
    public boolean contains(String subStr, char ch) {
        for (char c : subStr.toCharArray()) {
            if (ch == c) {
                return true;
            }
        }
        return false;
    }
}
-IvarD
I think this one is not a test of your computer science fundamentals, more what you would practically do within the Java programming environment.
You could construct a regular expression out of the second argument, e.g.
omn -> o.*m[^o]*n
... and then test the candidate string against it, using either String.matches(...) or the Pattern class.
In generic form, the construction of the regular expression should be along the following lines:
exp -> in[0] + ".*" + in[1] + for each x : 2 -> in.length - 1 { "[^" + in[x-2] + "]*" + in[x] }
for example:
demmn -> d.*e[^d]*m[^e]*m[^m]*n
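A minimal sketch of that construction in Java might look like the following. The class and method names are hypothetical. It uses Matcher.find() rather than String.matches(...), because matches(...) only succeeds when the whole candidate string matches, and a candidate such as "common" does not start with 'o'. Whether the resulting expression captures the intended rule for every input is not verified here; the sketch only reproduces the two expressions shown above.

```java
import java.util.regex.Pattern;

class RegexOrderCheck {
    // Escape anything that is not a letter or digit so it is taken literally.
    static String esc(char c) {
        return Character.isLetterOrDigit(c) ? String.valueOf(c) : "\\" + c;
    }

    // Builds in[0] + ".*" + in[1], then "[^in[x-2]]*" + in[x] for each x >= 2.
    static String buildRegex(String in) {
        StringBuilder sb = new StringBuilder();
        for (int x = 0; x < in.length(); x++) {
            if (x == 1) {
                sb.append(".*");
            } else if (x >= 2) {
                sb.append("[^").append(esc(in.charAt(x - 2))).append("]*");
            }
            sb.append(esc(in.charAt(x)));
        }
        return sb.toString();
    }

    // True when the generated expression matches somewhere inside candidate.
    static boolean test(String candidate, String in) {
        return Pattern.compile(buildRegex(in)).matcher(candidate).find();
    }

    public static void main(String[] args) {
        System.out.println(buildRegex("omn"));     // prints o.*m[^o]*n
        System.out.println(buildRegex("demmn"));   // prints d.*e[^d]*m[^e]*m[^m]*n
        System.out.println(test("common", "omn")); // prints false
        System.out.println(test("common", "cmn")); // prints true
    }
}
```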
I tried it myself in a different way. Just sharing my solution.
public class PatternMatch {
    public static boolean matchPattern(String str, String pat) {
        int slen = str.length();
        int plen = pat.length();
        int prevInd = -1, curInd;
        int count = 0;
        for (int i = 0; i < slen; i++) {
            curInd = pat.indexOf(str.charAt(i));
            if (curInd != -1) {
                if(prevInd == curInd)
                    continue;
                else if(curInd == (prevInd+1))
                    count++;
                else if(curInd == 0)
                    count = 1;
                else count = 0;
                prevInd = curInd;
            }
            if(count == plen)
                return true;
        }
        return false;
    }
    public static void main(String[] args) {
        boolean r = matchPattern("common", "on");
        System.out.println(r);
    }
}
