Count all possible decoding combinations of a given binary string in Java

Suppose we have a string of binary values in which some portions may correspond to specific letters, for example:
A = 0
B = 00
C = 001
D = 010
E = 0010
F = 0100
G = 0110
H = 0001
For example, if we assume the string "00100", we can have 5 different possibilities:
ADA
AF
CAA
CB
EA
I have to extract the exact number of combinations using Dynamic programming.
But I have difficulty in the formulation of subproblems and in the composition of the corresponding vector of solutions.
I appreciate any indications of the correct algorithm formulation.
class countString {
static int count(String a, String b, int m, int n) {
if ((m == 0 && n == 0) || n == 0)
return 1;
if (m == 0)
return 0;
if (a.charAt(m - 1) == b.charAt(n - 1))
return count(a, b, m - 1, n - 1) +
count(a, b, m - 1, n);
else
return count(a, b, m - 1, n);
}
public static void main(String[] args) {
Locale.setDefault(Locale.US);
ArrayList<String> substrings = new ArrayList<>();
substrings.add("0");
substrings.add("00");
substrings.add("001");
substrings.add("010");
substrings.add("0010");
substrings.add("0100");
substrings.add("0110");
substrings.add("0001");
if (args.length != 1) {
System.err.println("ERROR - execute with: java countString -filename- ");
System.exit(1);
}
try {
Scanner scan = new Scanner(new File(args[0])); // not important
String S = "00100";
int count = 0;
for(int i=0; i<substrings.size(); i++){
count = count + count(S,substrings.get(i),S.length(),substrings.get(i).length());
}
System.out.println(count);
} catch (FileNotFoundException e) {
System.out.println("File not found " + e);
}
}
}

In essence, dynamic programming is an enhanced brute-force approach.
As with brute force, we still need to generate all possible results. Unlike plain brute force, however, the problem is divided into smaller subproblems, and the previously computed result of each subproblem is stored and reused.
Since you are using recursion, you need to apply the technique called memoization to store and reuse the intermediate results. In this case, a HashMap is a convenient means of storing them.
But before applying memoization, it makes sense to start with a clean and simple recursive solution that works correctly, and only then enhance it with DP.
Plain Recursion
Every recursive implementation should contain two parts:
Base case - represents a simple edge case (or a set of edge cases) for which the outcome is known in advance. For this problem, there are two edge cases: when the length of the given string is 0, the result is 1 (an empty binary string "" decodes to an empty string of letters ""); and when it is impossible to decode the given binary string, the result is 0 (in the solution below this is resolved naturally while the recursive case executes).
Recursive case - the part of the solution where the recursive calls are made and where the main logic resides. In the recursive case, we need to find each "binary letter" at the beginning of the string and then call the method recursively, passing the substring (without the "letter"). The results of these recursive calls need to be accumulated in the total count that will be returned from the method.
In order to implement this logic we need only two arguments: the binary string to analyze and a list of binary letters:
public static int count(String str, List<String> letters) {
if (str.isEmpty()) { // base case - a combination was found
return 1;
}
// recursive case
int count = 0;
for (String letter: letters) {
if (str.startsWith(letter)) {
count += count(str.substring(letter.length()), letters);
}
}
return count;
}
This concise solution is already capable of producing the correct result. Now, let's turn this brute-force version into a DP-based solution, by applying the memoization.
Dynamic Programming
As mentioned earlier, a HashMap is a convenient means of storing the intermediate results because it lets us associate a count (number of combinations) with a particular string and then retrieve that number almost instantly (in O(1) expected time).
This is how it might look:
public static int count(String str, List<String> letters, Map<String, Integer> vocab) {
if (str.isEmpty()) { // base case - a combination was found
return 1;
}
if (vocab.containsKey(str)) { // result was already computed and present in the map
return vocab.get(str);
}
int count = 0;
for (String letter: letters) {
if (str.startsWith(letter)) {
count += count(str.substring(letter.length()), letters, vocab);
}
}
vocab.put(str, count); // storing the total `count` into the map
return count;
}
main()
public static void main(String[] args) {
List<String> letters = List.of("0", "00", "001", "010", "0010", "0100", "0110", "0001"); // binary letters
System.out.println(count("00100", letters, new HashMap<>())); // DP
System.out.println(count("00100", letters)); // brute-force recursion
}
Output:
5 // DP
5 // plain recursion
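The same counts can also be computed bottom-up, which maps more directly onto the "vector of solutions" the question asks about. The following is only a sketch and not part of the answer above: the class name BottomUpDecodeCount and the array name ways are illustrative, and it assumes that no letter is the empty string.

```java
import java.util.List;

class BottomUpDecodeCount {

    // ways[i] holds the number of ways to decode the suffix str.substring(i).
    static long count(String str, List<String> letters) {
        long[] ways = new long[str.length() + 1];
        ways[str.length()] = 1; // the empty suffix decodes in exactly one way

        // Fill the table from the end of the string towards the start,
        // so that ways[i + letter.length()] is already known when needed.
        for (int i = str.length() - 1; i >= 0; i--) {
            for (String letter : letters) {
                if (str.startsWith(letter, i)) {
                    ways[i] += ways[i + letter.length()];
                }
            }
        }
        return ways[0];
    }

    public static void main(String[] args) {
        List<String> letters =
                List.of("0", "00", "001", "010", "0010", "0100", "0110", "0001");
        System.out.println(count("00100", letters)); // 5, matching the question
    }
}
```

Each table entry is computed once, so the work is proportional to the string length times the number of letters, and no substrings are created along the way.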

Hope this helps.
The idea is to build every possible string from these values and check whether the input starts with it. If it does not, move on to the next index.
If you have test cases ready, you can verify it further.
I have tested it with only 2-3 values.
public int getCombo(String[] array, int startingIndex, String val, String input) {
int count = 0;
for (int i = startingIndex; i < array.length; i++) {
String matchValue = val + array[i];
if (matchValue.length() <= input.length()) {
// if value matches then count + 1
if (matchValue.equals(input)) {
count++;
System.out.println("match Found---->" + count); // omit this println; it is only for testing
return count;
} else if (input.startsWith(matchValue)) { // checking whether the input is starting with the new value
// search further combos
count += getCombo(array, 0, matchValue, input);
}
}
}
return count;
}
In main Method
String[] arr = substrings.toArray(new String[0]);
int count = 0;
for (int i = 0; i < arr.length; i++) {
System.out.println("index----?> " + i);
//adding this condition for single inputs i.e "0","010";
if(arr[i].equals(input))
count++;
else
count = count + getCombo(arr, 0, arr[i], input);
}
System.out.println("Final count : " + count);
My test results :
input : 00100
Final count 5
input : 000
Final count 3

Related

Number of ways to recreate a given string using a given list of words

Given is a String word and a String array book that contains some strings. The program should give out the number of possibilities to create word only using elements in book. An element can be used as many times as we want and the program must terminate in under 6 seconds.
For example, input:
String word = "stackoverflow";
String[] book = new String[9];
book[0] = "st";
book[1] = "ck";
book[2] = "CAG";
book[3] = "low";
book[4] = "TC";
book[5] = "rf";
book[6] = "ove";
book[7] = "a";
book[8] = "sta";
The output should be 2, since we can create "stackoverflow" in two ways:
1: "st" + "a" + "ck" + "ove" + "rf" + "low"
2: "sta" + "ck" + "ove" + "rf" + "low"
My implementation of the program only terminates in the required time if word is relatively small (<15 characters). However, as I mentioned before, the running time limit for the program is 6 seconds and it should be able to handle very large word strings (>1000 characters). Here is an example of a large input.
Here is my code:
1) the actual method:
input: a String word and a String[] book
output: the number of ways word can be written only using strings in book
public static int optimal(String word, String[] book){
int count = 0;
List<List<String>> allCombinations = allSubstrings(word);
List<String> empty = new ArrayList<>();
List<String> wordList = Arrays.asList(book);
for (int i = 0; i < allCombinations.size(); i++) {
allCombinations.get(i).retainAll(wordList);
if (!sumUp(allCombinations.get(i), word)) {
allCombinations.remove(i);
allCombinations.add(i, empty);
}
else count++;
}
return count;
}
2) allSubstrings():
input: a String input
output: A list of lists, each containing a combination of substrings that add up to input
static List<List<String>> allSubstrings(String input) {
if (input.length() == 1) return Collections.singletonList(Collections.singletonList(input));
List<List<String>> result = new ArrayList<>();
for (List<String> temp : allSubstrings(input.substring(1))) {
List<String> firstList = new ArrayList<>(temp);
firstList.set(0, input.charAt(0) + firstList.get(0));
if (input.startsWith(firstList.get(0), 0)) result.add(firstList);
List<String> l = new ArrayList<>(temp);
l.add(0, input.substring(0, 1));
if (input.startsWith(l.get(0), 0)) result.add(l);
}
return result;
}
3) sumUp():
input: A String list input and a String expected
output: true if the elements in input add up to expected
public static boolean sumUp (List<String> input, String expected) {
String x = "";
for (int i = 0; i < input.size(); i++) {
x = x + input.get(i);
}
if (expected.equals(x)) return true;
return false;
}
I've figured out what I was doing wrong in my previous answer: I wasn't using memoization, so I was redoing an awful lot of unnecessary work.
Consider a book array {"a", "aa", "aaa"}, and a target word "aaa". There are four ways to construct this target:
"a" + "a" + "a"
"aa" + "a"
"a" + "aa"
"aaa"
My previous attempt would have walked through all four separately. But instead, one can observe that:
There is 1 way to construct "a"
You can construct "aa" in 2 ways, either "a" + "a" or using "aa" directly.
You can construct "aaa" either by using "aaa" directly (1 way); or "aa" + "a" (2 ways, since there are 2 ways to construct "aa"); or "a" + "aa" (1 way).
Note that the third step here only adds a single additional string to a previously-constructed string, for which we know the number of ways it can be constructed.
This suggests that if we count the number of ways in which a prefix of word can be constructed, we can use that to trivially calculate the number of ways to construct a longer prefix by adding just one more string from book.
I defined a simple trie class, so you can quickly look up prefixes of the book words that match at any given position in word:
class TrieNode {
boolean word;
Map<Character, TrieNode> children = new HashMap<>();
void add(String s, int i) {
if (i == s.length()) {
word = true;
} else {
children.computeIfAbsent(s.charAt(i), k -> new TrieNode()).add(s, i + 1);
}
}
}
For each letter in s, this creates an instance of TrieNode, and stores the TrieNode for the subsequent characters etc.
static long method(String word, String[] book) {
// Construct a trie from all the words in book.
TrieNode t = new TrieNode();
for (String b : book) {
t.add(b, 0);
}
// Construct an array to memoize the number of ways to construct
// prefixes of a given length: result[i] is the number of ways to
// construct a prefix of length i.
long[] result = new long[word.length() + 1];
// There is only 1 way to construct a prefix of length zero.
result[0] = 1;
for (int m = 0; m < word.length(); ++m) {
if (result[m] == 0) {
// If there are no ways to construct a prefix of this length,
// then just skip it.
continue;
}
// Walk the trie, taking the branch which matches the character
// of word at position (n + m).
TrieNode tt = t;
for (int n = 0; tt != null && n + m <= word.length(); ++n) {
if (tt.word) {
// We have reached the end of a word: we can reach a prefix
// of length (n + m) from a prefix of length (m).
// Increment the number of ways to reach (n+m) by the number
// of ways to reach (m).
// (Increment, because there may be other ways).
result[n + m] += result[m];
}
if (n + m == word.length()) {
// No characters are left to follow in the trie; without this check,
// word.charAt(n + m) below would be out of bounds.
break;
}
tt = tt.children.get(word.charAt(n + m));
}
}
// The number of ways to reach a prefix of length (word.length())
// is now stored in the last element of the array.
return result[word.length()];
}
For the very long input given by OP, this gives output:
$ time java Ideone
2217093120
real 0m0.126s
user 0m0.146s
sys 0m0.036s
Quite a bit faster than the required 6 seconds - and this includes JVM startup time too.
Edit: in fact, the trie isn't necessary. You can simply replace the "Walk the trie" loop with:
for (String b : book) {
if (word.regionMatches(m, b, 0, b.length())) {
result[m + b.length()] += result[m];
}
}
and it performs slower, but still way faster than 6s:
2217093120
real 0m0.173s
user 0m0.226s
sys 0m0.033s
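Put into a complete method, the trie-free variant might look like the sketch below. The class name PrefixCountSketch and the method name countWays are placeholders rather than names from the answer, and skipping empty book entries is an added assumption.

```java
class PrefixCountSketch {

    // result[i] is the number of ways to build the prefix of length i.
    static long countWays(String word, String[] book) {
        long[] result = new long[word.length() + 1];
        result[0] = 1; // one way to build the empty prefix

        for (int m = 0; m < word.length(); m++) {
            if (result[m] == 0) {
                continue; // this prefix cannot be built, so nothing extends it
            }
            for (String b : book) {
                // regionMatches returns false when b would run past the end
                // of word, so no separate length check is needed.
                if (!b.isEmpty() && word.regionMatches(m, b, 0, b.length())) {
                    result[m + b.length()] += result[m];
                }
            }
        }
        return result[word.length()];
    }

    public static void main(String[] args) {
        String[] book = {"st", "ck", "CAG", "low", "TC", "rf", "ove", "a", "sta"};
        System.out.println(countWays("stackoverflow", book));              // 2
        System.out.println(countWays("aaa", new String[] {"a", "aa", "aaa"})); // 4
    }
}
```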
A few observations:
x = x + input.get(i);
Since you are concatenating inside a loop, using String + isn't a good idea. Use a StringBuilder, append to it within the loop, and call builder.toString() at the end. Or follow the idea from Andy: there is no need to merge strings, because you already know the target word. See below.
Then: List implies that adding/removing elements might be costly. So see if you can get rid of that part, and whether it would be possible to use maps or sets instead.
Finally: the real point would be to look into your algorithm. I would try to work "backwards". Meaning: first identify those array elements that actually occur in your target word. You can ignore all others right from the start.
Then: look at all array entries that start your search word. In your example you can notice that there are just two array elements that fit. And then work your way from there.
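As a small illustration of the StringBuilder suggestion, sumUp could be written roughly as follows. This is a sketch only: the class name SumUpSketch is made up, and the early exit when the concatenation grows too long is an addition that the question's code does not have.

```java
import java.util.List;

class SumUpSketch {

    // Returns true if the elements of input, concatenated in order, equal expected.
    static boolean sumUp(List<String> input, String expected) {
        StringBuilder sb = new StringBuilder(expected.length());
        for (String part : input) {
            sb.append(part);
            if (sb.length() > expected.length()) {
                return false; // already too long, so it cannot match
            }
        }
        return expected.contentEquals(sb);
    }

    public static void main(String[] args) {
        System.out.println(sumUp(List.of("sta", "ck"), "stack")); // true
        System.out.println(sumUp(List.of("st", "ck"), "stack"));  // false
    }
}
```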
My first observation would be that you don't actually need to build anything: you know what string you are trying to construct (e.g. stackoverflow), so all you really need to keep track of is how much of that string you have matched so far. Call this m.
Next, having matched m characters, provided m < word.length(), you need to choose a next string from book which matches the portion of word from m to m + nextString.length().
You could do this by checking each string in turn:
if (word.regionMatches(m, nextString, 0, nextString.length())) { ... }
But you can do better, by determining strings that can't match in advance: the next string you append will have the following properties:
word.charAt(m) == nextString.charAt(0) (the next characters match)
m + nextString.length() <= word.length() (adding the next string shouldn't make the constructed string longer than word)
So, you can cut down the potential words from book that you might check by constructing a map of letters to words that start with that (point 1); and if you store the words with the same starting letter in increasing length order, you can stop checking that letter as soon as the length gets too big (point 2).
You can construct a map once and reuse:
Map<Character, List<String>> prefixMap =
Arrays.asList(book).stream()
.collect(groupingBy(
s -> s.charAt(0),
collectingAndThen(
toList(),
ss -> {
ss.sort(comparingInt(String::length));
return ss;
})));
You can count the number of ways recursively, without constructing any additional objects (*):
int method(String word, String[] book) {
return method(word, 0, /* construct map as above */);
}
int method(String word, int m, Map<Character, List<String>> prefixMap) {
if (m == word.length()) {
return 1;
}
int result = 0;
for (String nextString : prefixMap.getOrDefault(word.charAt(m), emptyList())) {
if (m + nextString.length() > word.length()) {
break;
}
// Start at m+1, because you already know they match at m.
if (word.regionMatches(m + 1, nextString, 1, nextString.length()-1)) {
// This is a potential match!
// Make a recursive call.
result += method(word, m + nextString.length(), prefixMap);
}
}
return result;
}
(*) This may construct new instances of Character, because of the boxing of the word.charAt(m): cached instances are guaranteed to be used for chars in the range 0-127 only. There are ways to work around this, but they would only clutter the code.
I think you are already doing a pretty good job at optimizing your application. In addition to the answer by GhostCat here are a few suggestions of my own:
public static int optimal(String word, String[] book){
int count = 0;
List<List<String>> allCombinations = allSubstrings(word);
List<String> wordList = Arrays.asList(book);
for (int i = 0; i < allCombinations.size(); i++)
{
/*
* allCombinations.get(i).retainAll(wordList);
*
* There is no need to retrieve the list element
* twice, just set it in a local variable
*/
java.util.List<String> combination = allCombinations.get(i);
combination.retainAll(wordList);
/*
* Since we are only interested in the count here
* there is no need to remove and add list elements
*/
if (sumUp(combination, word))
{
/*allCombinations.remove(i);
allCombinations.add(i, empty);*/
count++;
}
/*else count++;*/
}
return count;
}
public static boolean sumUp (List<String> input, String expected) {
String x = "";
for (int i = 0; i < input.size(); i++) {
x = x + input.get(i);
}
// No need for if block here, just return comparison result
/*if (expected.equals(x)) return true;
return false;*/
return expected.equals(x);
}
And since you are interested in seeing the execution time of your method I would recommend implementing a benchmarking system of some sort. Here is a quick mock-up:
private static long benchmarkOptima(int cycles, String word, String[] book) {
long totalTime = 0;
for (int i = 0; i < cycles; i++)
{
long startTime = System.currentTimeMillis();
int a = optimal(word, book);
long executionTime = System.currentTimeMillis() - startTime;
totalTime += executionTime;
}
return totalTime / cycles;
}
public static void main(String[] args)
{
String word = "stackoverflow";
String[] book = new String[] {
"st", "ck", "CAG", "low", "TC",
"rf", "ove", "a", "sta"
};
int result = optimal(word, book);
final int cycles = 50;
long averageTime = benchmarkOptima(cycles, word, book);
System.out.println("Optimal result: " + result);
System.out.println("Average execution time - " + averageTime + " ms");
}
Output
2
Average execution time - 6 ms
Note: The implementation is getting stuck in the test case mentioned by #user1221, working on it.
What I could think of is a trie-based approach that uses O(sum of the lengths of the words in dict) space. The running time is not optimal.
Procedure:
Build a trie of all the words in the dictionary. This is a pre-processing step that takes O(sum of the lengths of all strings in dict).
Search the trie for the string you want to build, with a twist: start by searching for a prefix of the string. Whenever a prefix is found in the trie, restart the search from the root recursively and continue looking for more prefixes.
When we reach the end of our string, i.e. stackoverflow, we check whether we have also arrived at the end of a word in the trie. If so, we have found a valid combination of this string, and we count it while going back up the recursion.
eg:
In the above case, we use the dict as {"st", "sta", "a", "ck"}
We construct our trie ($ is the sentinel char, i.e. a char which is not in the dict):
$___s___t.___a.
|___a.
|___c___k.
the . represents that a word in the dict ends at that position.
We try to find the number of constructions of stack.
We start searching stack in the trie.
depth=0
$___s(*)___t.___a.
|___a.
|___c___k.
We see that we are at the end of one word, we start a new search with the remaining string ack from the top.
depth=0
$___s___t(*).___a.
|___a.
|___c___k.
Again we are at the end of one word in the dict. We start a new search for ck.
depth=1
$___s___t.___a.
|___a(*).
|___c___k.
depth=2
$___s___t.___a.
|___a.
|___c(*)___k.
We reach the end of stack and end of a word in the dict, hence we have 1 valid representation of stack.
depth=2
$___s___t.___a.
|___a.
|___c___k(*).
We go back to the caller of depth=2
No next char is available, we return to the caller of depth=1.
depth=1
$___s___t.___a.
|___a(*, 1).
|___c___k.
depth=0
$___s___t(*, 1).___a.
|___a.
|___c___k.
We move to next char. We see that we reached the end of one word in the dict, we launch a new search for ck in the dict.
depth=0
$___s___t.___a(*, 1).
|___a.
|___c___k.
depth=1
$___s___t.___a.
|___a.
|___c(*)___k.
We reach the end of stack and the end of a word in the dict, so this is another valid representation. We go back to the caller of depth=1.
depth=1
$___s___t.___a.
|___a.
|___c___k(*, 1).
There are no more chars to proceed, we return with the result 2.
depth=0
$___s___t.___a(*, 2).
|___a.
|___c___k.
Note: The implementation is in C++, but it shouldn't be too hard to convert to Java. It assumes that all chars are lowercase; it's trivial to extend it to both cases.
Sample code (full version):
/**
Node *base: head of the trie
Node *h : current node in the trie
string s : string to search
int idx : the current position in the string
*/
int count(Node *base, Node *h, string s, int idx) {
// step 3: found a valid combination.
if (idx == s.size()) return h->end;
int res = 0;
// step 2: we recursively start a new search.
if (h->end) {
res += count(base, base, s, idx);
}
// move ahead in the trie.
if (h->next[s[idx] - 'a'] != NULL) {
res += count(base, h->next[s[idx] - 'a'], s, idx + 1);
}
return res;
}
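For readers working in Java, a rough port of the C++ function above could look like this. It is a sketch under stated assumptions: the Node class and the build helper are hypothetical (the C++ Node type is not shown in the answer), and every dictionary word and the search string are assumed to be non-empty and to consist of lowercase a-z only.

```java
class TrieCountSketch {

    // Hypothetical node type, assumed to mirror the C++ Node.
    static final class Node {
        final Node[] next = new Node[26]; // one slot per lowercase letter
        boolean end;                      // true if a dictionary word ends here
    }

    // Builds the trie from the dictionary words (assumed lowercase, non-empty).
    static Node build(String[] dict) {
        Node root = new Node();
        for (String w : dict) {
            Node cur = root;
            for (int i = 0; i < w.length(); i++) {
                int c = w.charAt(i) - 'a';
                if (cur.next[c] == null) {
                    cur.next[c] = new Node();
                }
                cur = cur.next[c];
            }
            cur.end = true;
        }
        return root;
    }

    // base: root of the trie, h: current node, s: string to build, idx: position in s.
    static int count(Node base, Node h, String s, int idx) {
        // step 3: the string is used up; valid only if a word ends here.
        if (idx == s.length()) {
            return h.end ? 1 : 0;
        }
        int res = 0;
        // step 2: a word ends here, so restart the search from the root.
        if (h.end) {
            res += count(base, base, s, idx);
        }
        // move ahead in the trie.
        Node child = h.next[s.charAt(idx) - 'a'];
        if (child != null) {
            res += count(base, child, s, idx + 1);
        }
        return res;
    }

    public static void main(String[] args) {
        Node root = build(new String[] {"st", "sta", "a", "ck"});
        System.out.println(count(root, root, "stack", 0)); // 2, as in the walkthrough
    }
}
```

Like the C++ original, this recursion is not memoized, so it can still take exponential time on long inputs with many overlapping words.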
def cancons(target, wordbank, memo=None):
    # Use a fresh memo for every top-level call; a mutable default
    # argument would be shared between unrelated calls.
    if memo is None:
        memo = {}
    if target in memo:
        return memo[target]
    if target == '':
        return 1
    total_count = 0
    for word in wordbank:
        if target.startswith(word):
            total_count += cancons(target[len(word):], wordbank, memo)
    memo[target] = total_count
    return total_count

if __name__ == '__main__':
    word = "stackoverflow"
    book = ["st", "ck", "CAG", "low", "TC", "rf", "ove", "a", "sta"]
    print(cancons(word, book))

Why the hashset's performance is way faster than list?

This problem is from leetcode (https://leetcode.com/problems/word-ladder/)!
Given two words (beginWord and endWord), and a dictionary's word list, find the length of shortest transformation sequence from beginWord to endWord, such that:
Only one letter can be changed at a time.
Each transformed word must exist in the word list. Note that beginWord is not a transformed word.
Note:
Return 0 if there is no such transformation sequence.
All words have the same length.
All words contain only lowercase alphabetic characters.
You may assume no duplicates in the word list.
You may assume beginWord and endWord are non-empty and are not the same.
This is my code which takes 800 ms to run:
class Solution {
public int ladderLength(String beginWord, String endWord, List<String> wordList){
if(!wordList.contains(endWord))
return 0;
int ret = 1;
LinkedList<String> queue = new LinkedList<>();
Set<String> visited = new HashSet<String>();
queue.offer(beginWord);
queue.offer(null);
while(queue.size() != 1 && !queue.isEmpty()) {
String temp = queue.poll();
if(temp == null){
ret++;
queue.offer(null);
continue;
}
if(temp.equals(endWord)) {
//System.out.println("succ ret = " + ret);
return ret;
}
for(String word:wordList) {
if(diffOf(temp,word) == 1){
//System.out.println("offered " + word);
//System.out.println("ret =" + ret);
if(!visited.contains(word)){
visited.add(word);
queue.offer(word);
}
}
}
}
return 0;
}
private int diffOf(String s1, String s2) {
if(s1.length() != s2.length())
return Integer.MAX_VALUE;
int dif = 0;
for(int i=0;i < s1.length();i++) {
if(s1.charAt(i) != s2.charAt(i))
dif++;
}
return dif;
}
}
Here is another code which takes 100ms to run:
class Solution {
public int ladderLength(String beginWord, String endWord, List<String> wordList) {
Set<String> set = new HashSet<>(wordList);
if (!set.contains(endWord)) {
return 0;
}
int distance = 1;
Set<String> current = new HashSet<>();
current.add(beginWord);
while (!current.contains(endWord)) {
Set<String> next = new HashSet<>();
for (String str : current) {
for (int i = 0; i < str.length(); i++) {
char[] chars = str.toCharArray();
for (char c = 'a'; c <= 'z'; c++) {
chars[i] = c;
String s = new String(chars);
if (s.equals(endWord)) {
return distance + 1;
}
if (set.contains(s)) {
next.add(s);
set.remove(s);
}
}
}
}
distance++;
if (next.size() == 0) {
return 0;
}
current = next;
}
return 0;
}
}
I think the second code is less efficient, because it tests 26 letters for each word. Why is it so fast?
Short answer: Your breadth-first search does orders of magnitude more comparisons per 'word distance unit' (hereafter called an iteration).
You compare every candidate to every remaining word. The time complexity is T(N×n) per iteration.
They compare every candidate to artificially constructed 'next' candidates. And because they construct the candidates, they don't have to 'calculate' the distance. For simplicity, I assume both operations (constructing or checking) have the same running time. The time complexity is T(26×l×n) per iteration.
(N=word list size, n = number of candidates for this iteration, l = word length)
Of course 26×l×n is much less than N×n because the word length is small but the word list is huge.
I tried your routine on ("and","has",[List of 2M English words]) and after 30 seconds I killed it because I thought it crashed. It didn't crash, it was just slow. I turned to another word list of 50K and yours now takes 8 seconds, vs 0.04s for their implementation.
For my word list of N=51306 there are 2167 3-letter words. This means that for every word, on average, there are 3×cbrt(2167) possible candidates, which is n≈38.82.
Their expected performance: T(26×l×n) ≈ T(3027) work per iteration,
Your expected performance: T(N×n) ≈ T(1991784) work per iteration.
(assuming word list does not get shorter; but with this many words the difference is negligible)
Incidentally, your queue-based circular buffer implementation is possibly faster than their two-alternating-Sets implementation, so you could make a hybrid that's even faster.
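One possible shape for such a hybrid is sketched below: a queue-based breadth-first search that generates the 26×l candidates per word and looks them up in a HashSet. The class name LadderSketch is made up, the level-size loop replaces the null marker used in the question, and no timing claims are made for it.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

class LadderSketch {

    static int ladderLength(String beginWord, String endWord, List<String> wordList) {
        Set<String> remaining = new HashSet<>(wordList); // words not yet visited
        if (!remaining.contains(endWord)) {
            return 0;
        }
        remaining.remove(beginWord);

        Queue<String> queue = new ArrayDeque<>();
        queue.offer(beginWord);
        int distance = 1;

        while (!queue.isEmpty()) {
            int levelSize = queue.size(); // all words at the current distance
            for (int k = 0; k < levelSize; k++) {
                String current = queue.poll();
                if (current.equals(endWord)) {
                    return distance;
                }
                char[] chars = current.toCharArray();
                for (int i = 0; i < chars.length; i++) {
                    char original = chars[i];
                    for (char c = 'a'; c <= 'z'; c++) {
                        chars[i] = c;
                        String candidate = new String(chars);
                        // remove() returns true only for unvisited dictionary words
                        if (remaining.remove(candidate)) {
                            queue.offer(candidate);
                        }
                    }
                    chars[i] = original; // restore before changing the next position
                }
            }
            distance++;
        }
        return 0;
    }

    public static void main(String[] args) {
        List<String> words = List.of("hot", "dot", "dog", "lot", "log", "cog");
        System.out.println(ladderLength("hit", "cog", words)); // 5
    }
}
```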

Applying Linear and Binary Searches to Arrays

I have to create a program that takes user input (a number), searches the array for that number, and outputs the corresponding title by matching the index against the number the user entered. However, at run time nothing happens. I set breakpoints in my code and noticed a problem with the for loop (the search algorithm). Please help me and let me know what is wrong with my search algorithm. What I am trying to do is use the number that the user enters to match an index and then output the book title that is stored at that index.
private void btnFindActionPerformed(java.awt.event.ActionEvent evt) {
// TODO add your handling code here:
// declares an array
String[] listOfBooks = new String [101];
// assigns index in array to book title
listOfBooks[1] = "The Adventures of Tom Sawyer";
listOfBooks[2] = "Huckleberry Finn";
listOfBooks[4] = "The Sword in the Stone";
listOfBooks[6] = "Stuart Little";
listOfBooks[10] = "Treasure Island";
listOfBooks[12] = "Test";
listOfBooks[14] = "Alice's Adventures in Wonderland";
listOfBooks[20] = "Twenty Thousand Leagues Under the Sea";
listOfBooks[24] = "Peter Pan";
listOfBooks[26] = "Charlotte's Web";
listOfBooks[31] = "A Little Princess";
listOfBooks[32] = "Little Women";
listOfBooks[33] = "Black Beauty";
listOfBooks[35] = "The Merry Adventures of Robin Hood";
listOfBooks[40] = "Robinson Crusoe";
listOfBooks[46] = "Anne of Green Gables";
listOfBooks[50] = "Little House in the Big Woods";
listOfBooks[52] = "Swiss Family Robinson";
listOfBooks[54] = "The Lion, the Witch and the Wardrobe";
listOfBooks[54] = "Heidi";
listOfBooks[66] = "A Winkle in Time";
listOfBooks[100] = "Mary Poppins";
// gets user input
String numberInput = txtNumberInput.getText();
int number = Integer.parseInt(numberInput);
// Linear search to match index number and user input number
for(int i = 0; i < listOfBooks.length - 1; i++) {
if (listOfBooks.get(i) == number) {
txtLinearOutput.setText(listOfBooks[i]);
break;
}
}
*There is a problem with the listOfBooks.get in the if statement. Also I need to apply a binary search that would search the same array just using the binary method. Need help to apply this type of binary search.
How could I make a statement that checks if the int number is equal to an index?
Note that the following code is just an example of what I have to apply. Variables are all for example purposes:
public static Boolean binarySearch(String [ ] A, int left, int right, String V){
int middle;
if (left > right) {
return false;
}
middle = (left + right)/2;
int compare = V.compareTo(A[middle]);
if (compare == 0) {
return true;
}
if (compare < 0) {
return binarySearch(A, left, middle-1, V);
} else {
return binarySearch(A, middle + 1, right, V);
}
}
You can avoid the for loop and the condition check by using the number directly as the index, like this: txtLinearOutput.setText(listOfBooks[number]);
Replace your code
// Linear search to match index number and user input number
for(int i = 0; i < listOfBooks.length - 1; i++) {
if (listOfBooks.get(i) == number) {
txtLinearOutput.setText(listOfBooks[i]);
break;
}
with
try{
int number = Integer.parseInt(numberInput);
// the titles are stored at the index equal to the number, so valid indices are 0..100
if(number>=0 && number<listOfBooks.length){
txtLinearOutput.setText(listOfBooks[number]);
}else{
// out of range
}
}catch(NumberFormatException e){
// the input was not a valid integer
}
You are comparing if (listOfBooks.get(i) == number), which is wrong (and an array has no get() method). You should compare if (i == number), because you need to compare the element's position.
This isn't a binary search answer, just an alternative using a HashMap. Have a look at it.
HashMap<String, Integer> books = new HashMap<>();
books.put("abc", 1);
books.put("xyz", 2);
books.put("pqr", 3);
books.put("lmo", 4);
System.out.println(books.get("abc")); // prints 1
Using the built-in Arrays.binarySearch (the array must be sorted and must not contain null elements):
String[] arr = {"abc", "lmo", "prq", "xyz"};
System.out.println(Arrays.binarySearch(arr, "lmo")); // prints 1
How to compare Strings using binary search.
String[] array = new String[4];
array[0] = "abc";
array[1] = "lmo";
array[2] = "pqr";
array[3] = "xyz";
int first, last, middle;
first = 0;
last = array.length - 1;
middle = (first + last) / 2;
String key = "abc";
while (first <= last) {
if (compare(array[middle], key))
first = middle + 1;
else if (array[middle].equals(key)) {
System.out.println(key + " found at location " + (middle) + ".");
break;
} else {
last = middle - 1;
}
middle = (first + last) / 2;
}
if (first > last)
System.out.println(key + " is not found.\n");
}
private static boolean compare(String string, String key) {
// true when `string` sorts before `key` in lexicographic order;
// comparing single characters independently gives wrong answers
return string.compareTo(key) < 0;
}
Your lookup code could look something like this
try{
txtLinearOutput.setText(listOfBooks[yourNumber]);
}
catch(IndexOutOfBoundsException ie){
// prompt that number is not an index
}
catch(Exception e){
// if any other exception is caught
}
What you are doing here:
if (listOfBooks.get(i) == number) {
is comparing the content of the array with the input number, which does not make sense (and an array has no get() method).
You can use the input number directly to fetch the value stored at that index.
For example:
txtLinearOutput.setText(listOfBooks[number]);
Additionally, int number = Integer.parseInt(numberInput); should be placed inside a try-catch block to handle input that is not a valid number. You can also check that the input number is within the range of the array to avoid an ArrayIndexOutOfBoundsException:
try{
int number = Integer.parseInt(numberInput);
// use the input number directly as the index
if (number >= 0 && number < listOfBooks.length) {
txtLinearOutput.setText(listOfBooks[number]);
} else {
// Display error message
}
} catch(NumberFormatException e) {
// Handle exception and display error message
}
To use binary search, the string array needs to be sorted (and must not contain null elements). You can use the Arrays.sort() method to sort it.
For the search itself, you can use the built-in Arrays.binarySearch() method.
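If an explicit search is required for the exercise, one option is to store the book numbers in their own sorted array and search that, rather than searching the titles. The sketch below assumes this layout: the class name BookLookupSketch, the parallel arrays NUMBERS and TITLES, and the method names are all illustrative, and only the first few books from the question are included.

```java
import java.util.Arrays;

class BookLookupSketch {

    // Parallel arrays: NUMBERS[i] is the number of the book whose title is TITLES[i].
    // NUMBERS must stay sorted in ascending order for the binary search to work.
    static final int[] NUMBERS = {1, 2, 4, 6, 10};
    static final String[] TITLES = {
        "The Adventures of Tom Sawyer",
        "Huckleberry Finn",
        "The Sword in the Stone",
        "Stuart Little",
        "Treasure Island"
    };

    // Linear search: check every stored number in turn.
    static String findLinear(int number) {
        for (int i = 0; i < NUMBERS.length; i++) {
            if (NUMBERS[i] == number) {
                return TITLES[i];
            }
        }
        return null; // no book with that number
    }

    // Binary search: Arrays.binarySearch returns a negative value when not found.
    static String findBinary(int number) {
        int pos = Arrays.binarySearch(NUMBERS, number);
        return pos >= 0 ? TITLES[pos] : null;
    }

    public static void main(String[] args) {
        System.out.println(findLinear(4));  // The Sword in the Stone
        System.out.println(findBinary(10)); // Treasure Island
        System.out.println(findBinary(3));  // null
    }
}
```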

How to extract the left most common characters in a string list?

Assume I have the following list of string objects:
ABC1, ABC2, ABC_Whatever
What's the most efficient way to extract the left most common characters from this list ? So I'd get ABC in my case.
StringUtils.getCommonPrefix(String... strs) from Apache Commons Lang.
This will work for you
public static void main(String args[]) {
String commonInFirstTwo=greatestCommon("ABC1","ABC2");
String commonInLastTwo=greatestCommon("ABC2","ABC_Whatever");
System.out.println(greatestCommon(commonInFirstTwo,commonInLastTwo));
}
public static String greatestCommon(String a, String b) {
int minLength = Math.min(a.length(), b.length());
for (int i = 0; i < minLength; i++) {
if (a.charAt(i) != b.charAt(i)) {
return a.substring(0, i);
}
}
return a.substring(0, minLength);
}
You hash all the prefixes of the words in the given list and keep track of how often each one occurs. The one with the maximum occurrences is the one you want. Here is a sample implementation. It returns the most common prefix:
static String mostCommon(List<String> list) {
Map<String, Integer> word2Freq = new HashMap<String, Integer>();
String maxFreqWord = null;
int maxFreq = 0;
for (String word : list) {
for (int i = 0; i < word.length(); ++i) {
String sub = word.substring(0, i + 1);
Integer f = word2Freq.get(sub);
if (f == null) {
f = 0;
}
word2Freq.put(sub, f + 1);
if (f + 1 > maxFreq) {
if (maxFreqWord == null || maxFreqWord.length() < sub.length()) {
maxFreq = f + 1;
maxFreqWord = sub;
}
}
}
}
return maxFreqWord;
}
The above implementation may not suffice if you have more than one common substring. In that case, use the map it builds.
System.out.println(mostCommon(Arrays.asList("ABC1", "ABC2", "ABC_Whatever")));
System.out.println(mostCommon(Arrays.asList("ABCDEFG1", "ABGG2", "ABC11_Whatever")));
Returns
ABC
AB
Your problem is just a rephrasing of the standard problem of finding the longest common prefix.
If you know what the common characters are, then you could check if the other strings contain those characters by using the .contains() method.
If you're willing to use a third party library, then the following using jOOλ generates that prefix for you:
String prefix = Seq.of("ABC1", "ABC2", "ABC_Whatever").commonPrefix();
Disclaimer: I work for the company behind jOOλ
If there are N strings and the minimum length among them is M characters, then the most efficient (correct) answer will take N * M comparisons in the worst case (when all strings are the same).
outer loop - each character of the first string, one at a time
inner loop - each of the strings
test - the character of the string in the inner
loop against the character in the outer loop.
The performance can be tuned to (N-1) * M if we do not test against the first string in the inner loop.
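The loop structure described above could look roughly like this in Java. It is only a sketch: the class name CommonPrefix and the method longestCommonPrefix are illustrative names, and the inner loop starts at the second string so the first one is never compared with itself.

```java
import java.util.Arrays;
import java.util.List;

class CommonPrefix {
    // Compares position i of the first string against position i of every other string.
    static String longestCommonPrefix(List<String> strings) {
        if (strings.isEmpty()) {
            return "";
        }
        String first = strings.get(0);
        for (int i = 0; i < first.length(); i++) {          // outer loop: characters of the first string
            char c = first.charAt(i);
            for (int s = 1; s < strings.size(); s++) {      // inner loop: the remaining strings
                String other = strings.get(s);
                if (i >= other.length() || other.charAt(i) != c) {
                    return first.substring(0, i);           // first mismatch ends the prefix
                }
            }
        }
        return first; // the whole first string is a prefix of all the others
    }

    public static void main(String[] args) {
        System.out.println(longestCommonPrefix(Arrays.asList("ABC1", "ABC2", "ABC_Whatever")));
    }
}
```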

Matching the occurrence and pattern of characters of String2 in String1

I was asked this question in a phone interview for a summer internship, and tried to come up with an n*m complexity solution in Java (although it wasn't accurate either).
I have a function that takes 2 strings, say "common" and "cmn". It should return True because 'c', 'm', 'n' occur in the same order in "common". But if the arguments were "common" and "omn", it would return False: even though the characters occur in the same order, 'o' also appears after 'm' (which fails the pattern match condition).
I have worked over it using Hashmaps, and Ascii arrays, but didn't get a convincing solution yet! From what I have read till now, can it be related to Boyer-Moore, or Levenshtein Distance algorithms?
Hoping for respite at stackoverflow! :)
Edit: Some of the answers talk about reducing the word length, or creating a hashset. But per my understanding, this question cannot be done with hashsets because occurrence/repetition of each character in first string has its own significance. PASS conditions- "con", "cmn", "cm", "cn", "mn", "on", "co". FAIL conditions that may seem otherwise- "com", "omn", "mon", "om". These are FALSE/FAIL because "o" is occurring before as well as after "m". Another example- "google", "ole" would PASS, but "google", "gol" would fail because "o" is also appearing before "g"!
I think it's quite simple. Run through the pattern and for every character get the index of its last occurrence in the string. The index must always increase, otherwise return false.
So in pseudocode:
index = -1
foreach c in pattern
checkindex = string.lastIndexOf(c)
if checkindex == -1 //not found
return false
if checkindex < index
return false
if string.firstIndexOf(c) < index //characters in the wrong order
return false
index = checkindex
return true
Edit: you could further improve the code by passing index as the starting index to the lastIndexOf method. Then you wouldn't have to compare checkindex with index and the algorithm would be faster.
Updated: Fixed a bug in the algorithm. Additional condition added to consider the order of the letters in the pattern.
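A direct Java transcription of the pseudocode might look like the sketch below. The class name OrderedPattern and the method name matches are made up for illustration, and Java's String.indexOf stands in for the pseudocode's firstIndexOf.

```java
class OrderedPattern {
    // Returns true if the characters of pattern occur in string in order,
    // with no pattern character also appearing before its predecessor's last occurrence.
    static boolean matches(String string, String pattern) {
        int index = -1; // last occurrence of the previous pattern character
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            int checkIndex = string.lastIndexOf(c);
            if (checkIndex == -1) {
                return false; // character not found at all
            }
            if (checkIndex < index) {
                return false; // last occurrence lies before the previous character
            }
            if (string.indexOf(c) < index) {
                return false; // character also occurs before the previous one: wrong order
            }
            index = checkIndex;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("common", "cmn"));
        System.out.println(matches("common", "omn"));
    }
}
```

Against the examples from the question, this accepts "con", "cmn", "cm", "cn", "mn", "on" and "co" for "common" and rejects "com", "omn", "mon" and "om".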
An excellent question; after a couple of hours of research I think I have found the solution. First of all, let me try explaining the question with a different approach.
Requirement:
Let's consider the same example 'common' (mainString) and 'cmn' (subString). First we need to be clear that any character can repeat within the mainString and also the subString, and since it's the pattern that we are concentrating on, the index of the character plays a great role too. So we need to know:
Index of the character (least and highest)
Let's keep this on hold and go ahead and check the patterns a bit more. For the word common, we need to find whether the particular pattern cmn is present or not. The different patterns possible with common are (precedence applies):
c -> o
c -> m
c -> n
o -> m
o -> o
o -> n
m -> m
m -> o
m -> n
o -> n
At any moment this precedence and comparison must be valid. Since the precedence plays a huge role, we need to have the index of each unique character instead of storing the different patterns.
Solution
First part of the solution is to create a Hash Table with the following criteria :-
Create a Hash Table with the key as each character of the mainString
Each entry for a unique key in the Hash Table will store two indices i.e lowerIndex and higherIndex
Loop through the mainString and for every new character, add a new entry to the Hash with lowerIndex set to the current index of the character in mainString.
If a collision occurs, update the higherIndex entry with the current index; do this until the end of the String
Second and main part of pattern matching :-
Set Flag as True
Loop through the subString and, for every character as the key, retrieve the details from the Hash.
Do the same for the very next character.
Just before loop increment, verify two conditions
If highestIndex(current character) > highestIndex(next character) Then
Pattern Fails, Flag <- False, Terminate Loop
// This condition is applicable for almost all the cases for pattern matching
Else If lowestIndex(current character) > lowestIndex(next character) Then
Pattern Fails, Flag <- False, Terminate Loop
// This case is explicitly for cases in which patterns like 'mon' appear
Display the Flag
N.B : Since I am not so versatile in Java, I did not submit the code. But some one can try implementing my idea
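One possible Java rendering of this idea is sketched below. The names IndexRangeMatcher and matches are hypothetical, and returning false for a character that does not occur in mainString at all is an assumption, since the description does not cover that case. The sketch follows the two checks exactly as written and compares only adjacent pattern characters.

```java
import java.util.HashMap;
import java.util.Map;

class IndexRangeMatcher {
    static boolean matches(String mainString, String subString) {
        // Part 1: character -> {lowerIndex, higherIndex} within mainString.
        Map<Character, int[]> range = new HashMap<>();
        for (int i = 0; i < mainString.length(); i++) {
            char c = mainString.charAt(i);
            int[] r = range.get(c);
            if (r == null) {
                range.put(c, new int[] {i, i}); // first occurrence sets both indices
            } else {
                r[1] = i;                        // "collision": move higherIndex forward
            }
        }
        // Part 2: compare each pattern character with the very next one.
        for (int i = 0; i < subString.length(); i++) {
            int[] cur = range.get(subString.charAt(i));
            if (cur == null) {
                return false; // assumption: a character missing from mainString fails
            }
            if (i + 1 < subString.length()) {
                int[] next = range.get(subString.charAt(i + 1));
                if (next == null) {
                    return false; // same assumption for the next character
                }
                if (cur[1] > next[1]) {
                    return false; // highestIndex(current) > highestIndex(next)
                }
                if (cur[0] > next[0]) {
                    return false; // lowestIndex(current) > lowestIndex(next), e.g. "mon"
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("common", "cmn"));
        System.out.println(matches("common", "mon"));
    }
}
```

It agrees with the PASS and FAIL lists given in the question. Because only the lowest and highest indices are compared, it can accept inputs such as "abab" with "ab", which a stricter reading of the question would probably reject.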
I solved this question myself in an inefficient manner, but it does give accurate results! I would appreciate it if anyone could make an efficient code/algorithm out of this!
Create a function "Check" which takes 2 strings as arguments. Check each character of string 2 in string 1. The order of appearance of each character of s2 should be verified as true in S1.
Take character 0 from string p and traverse through the string s to find its index of first occurrence.
Traverse through the filled ascii array to find any value more than the index of first occurrence.
Traverse further to find the last occurrence, and update the ascii array
Take character 1 from string p and traverse through the string s to find the index of first occurence in string s
Traverse through the filled ascii array to find any value more than the index of first occurrence. if found, return False.
Traverse further to find the last occurrence, and update the ascii array
As can be observed, this is a bruteforce method...I guess O(N^3)
public class Interview
{
public static void main(String[] args)
{
if (check("google", "oge"))
System.out.println("yes");
else System.out.println("sorry!");
}
public static boolean check (String s, String p)
{
int[] asciiArr = new int[256];
for(int pIndex=0; pIndex<p.length(); pIndex++) //Loop1 inside p
{
for(int sIndex=0; sIndex<s.length(); sIndex++) //Loop2 inside s
{
if(p.charAt(pIndex) == s.charAt(sIndex))
{
asciiArr[s.charAt(sIndex)] = sIndex; //adding char from s to its Ascii value
for(int ascIndex=0; ascIndex<256; ) //Loop 3 for Ascii Array
{
if(asciiArr[ascIndex]>sIndex) //condition to check repetition
return false;
else ascIndex++;
}
}
}
}
return true;
}
}
Isn't it doable in O(n log n)?
Step 1, reduce the string by eliminating every character that appears again further to the right. Strictly speaking you only need to eliminate characters if they appear in the string you're checking.
/** Reduces container to the subsequence that keeps only the last (rightmost)
* occurrence of each character, dropping any character that appears again
* further to the right. E.g. "common" -> "cmon", and "whirlygig" -> "whrlyig".
*/
static String reduceContainer(String container) {
// Characters already seen while scanning from the right.
// A plain HashSet stands in for the sparse bitfield; needs java.util.HashSet and java.util.Set.
Set<Character> charsToRight = new HashSet<>();
StringBuilder reduced = new StringBuilder();
for (int i = container.length(); --i >= 0;) {
char ch = container.charAt(i);
if (charsToRight.add(ch)) { // add returns true only for a character not seen before
reduced.append(ch);
}
}
return reduced.reverse().toString();
}
Step 2, check containment.
static boolean containsInOrder(String container, String containee) {
int containerIdx = 0, containeeIdx = 0;
int containerLen = container.length(), containeeLen = containee.length();
while (containerIdx < containerLen && containeeIdx < containeeLen) {
// Could loop over codepoints instead of code-units, but you get the point...
if (container.charAt(containerIdx) == containee.charAt(containeeIdx)) {
++containeeIdx;
}
++containerIdx;
}
return containeeIdx == containeeLen;
}
And to answer your second question, no, Levenshtein distance won't help you since it has the property that if you swap the arguments the output is the same, but the algo you want does not.
public class StringPattern {
public static void main(String[] args) {
String inputContainer = "common";
String inputContainees[] = { "cmn", "omn" };
for (String containee : inputContainees)
System.out.println(inputContainer + " " + containee + " "
+ containsCommonCharsInOrder(inputContainer, containee));
}
static boolean containsCommonCharsInOrder(String container, String containee) {
Set<Character> containerSet = new LinkedHashSet<Character>() {
// To rearrange the order
@Override
public boolean add(Character arg0) {
if (this.contains(arg0))
this.remove(arg0);
return super.add(arg0);
}
};
addAllPrimitiveCharsToSet(containerSet, container.toCharArray());
Set<Character> containeeSet = new LinkedHashSet<Character>();
addAllPrimitiveCharsToSet(containeeSet, containee.toCharArray());
// retains the common chars in order
containerSet.retainAll(containeeSet);
return containerSet.toString().equals(containeeSet.toString());
}
static void addAllPrimitiveCharsToSet(Set<Character> set, char[] arr) {
for (char ch : arr)
set.add(ch);
}
}
Output:
common cmn true
common omn false
I would consider this as one of the worst pieces of code I have ever written or one of the worst code examples in stackoverflow...but guess what...all your conditions are met!
No algorithm could really fit the need, so I just used bruteforce...test it out...
And I could just care less for space and time complexity...my aim was first to try and solve it...and maybe improve it later!
public class SubString {
public static void main(String[] args) {
SubString ss = new SubString();
String[] trueconditions = {"con", "cmn", "cm", "cn", "mn", "on", "co" };
String[] falseconditions = {"com", "omn", "mon", "om"};
System.out.println("True Conditions : ");
for (String str : trueconditions) {
System.out.println("SubString? : " + str + " : " + ss.test("common", str));
}
System.out.println("False Conditions : ");
for (String str : falseconditions) {
System.out.println("SubString? : " + str + " : " + ss.test("common", str));
}
System.out.println("SubString? : ole : " + ss.test("google", "ole"));
System.out.println("SubString? : gol : " + ss.test("google", "gol"));
}
public boolean test(String original, String match) {
char[] original_array = original.toCharArray();
char[] match_array = match.toCharArray();
int[] value = new int[match_array.length];
int index = 0;
for (int i = 0; i < match_array.length; i++) {
for (int j = index; j < original_array.length; j++) {
if (original_array[j] != original_array[j == 0 ? j : j-1] && contains(match.substring(0, i), original_array[j])) {
value[i] = 2;
} else {
if (match_array[i] == original_array[j]) {
if (value[i] == 0) {
if (contains(original.substring(0, j == 0 ? j : j-1), match_array[i])) {
value[i] = 2;
} else {
value[i] = 1;
}
}
index = j + 1;
}
}
}
}
for (int b : value) {
if (b != 1) {
return false;
}
}
return true;
}
public boolean contains(String subStr, char ch) {
for (char c : subStr.toCharArray()) {
if (ch == c) {
return true;
}
}
return false;
}
}
-IvarD
I think this one is not a test of your computer science fundamentals, more what you would practically do within the Java programming environment.
You could construct a regular expression out of the second argument, i.e ...
omn -> o.*m[^o]*n
... and then test candidate string against this by either using String.matches(...) or using the Pattern class.
In generic form, the construction of the RegExp should be along the following lines.
exp -> in[0].* + for each x : 2 -> in.length { (in[x-1] +
[^in[x-2]]* + in[x]) }
for example:
demmn -> d.*e[^d]*m[^e]*m[^m]*n
I tried it myself in a different way. Just sharing my solution.
public class PatternMatch {
public static boolean matchPattern(String str, String pat) {
int slen = str.length();
int plen = pat.length();
int prevInd = -1, curInd;
int count = 0;
for (int i = 0; i < slen; i++) {
curInd = pat.indexOf(str.charAt(i));
if (curInd != -1) {
if(prevInd == curInd)
continue;
else if(curInd == (prevInd+1))
count++;
else if(curInd == 0)
count = 1;
else count = 0;
prevInd = curInd;
}
if(count == plen)
return true;
}
return false;
}
public static void main(String[] args) {
boolean r = matchPattern("common", "on");
System.out.println(r);
}
}
