How to iterate through large string with substrings incrementing by 1 position?

I have a String with an enormous number in it (thousands of chars):
String pi = "3.14159265358979323846264338327950288419716939937..."
I want to cycle through this string, grabbing 6 chars at a time, and checking if they match a given String:
String substring = "3.1415"
However, on each subsequent substring, I want to shift 1 position to the right of the chars in the original String:
substring = ".14159"
substring = "141592"
substring = "415926"
substring = "159265"
etc. etc.
What is the best way to do this? I have considered StringBuilder's methods, but converting to a String each iteration might be costly. String's method
substring(int beginIndex, int endIndex)
seems to approach what I'm trying to do, but I don't know if those indices can be incremented algorithmically.

I don't know if those indices can be incremented algorithmically.
These are parameters. They are values provided by you for each invocation of the method.
You are free to specify anything you want based on variables, constants, expressions, user input, or anything else. In this case, you can keep one or two variables, increment them, and pass them as parameters.
Here's an example using two variables that are both incremented by 1 each iteration:
class Main {
public static void main(String[] args) {
String pi = "3.14159265358979323846264338327950288419716939937...";
for(int start=0, end=6; end <= pi.length(); start++, end++) {
String substring = pi.substring(start, end);
System.out.println(substring);
}
}
}
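If each window only needs to be compared against a target, the comparison can be done in place with String.regionMatches, which avoids creating a new String per window. The following is a minimal sketch rather than part of the original answer; the class name WindowMatch and the helper findWindowMatches are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class WindowMatch {
    // Hypothetical helper: returns every start index at which `target`
    // occurs in `text`, checking one window per position.
    static List<Integer> findWindowMatches(String text, String target) {
        List<Integer> hits = new ArrayList<>();
        int width = target.length();
        for (int start = 0; start + width <= text.length(); start++) {
            // Compares text[start, start + width) with target without
            // allocating a substring.
            if (text.regionMatches(start, target, 0, width)) {
                hits.add(start);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String pi = "3.14159265358979323846264338327950288419716939937";
        // The window width is taken from the target (6 characters here).
        System.out.println(findWindowMatches(pi, "159265"));
    }
}
```

The loop condition mirrors the `end <= pi.length()` check above, so the last window ends exactly at the end of the string.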

Here's an algorithm that's efficient at matching values. It might be more efficient than using substring methods, since it short-circuits as soon as a character doesn't match the provided sequence.
public static int containsSubstring(String wholeString, String findValue) {
//Break values into arrays
char[] wholeArray = wholeString.toCharArray();
char[] findArray = findValue.toCharArray();
//Use named outer loop for easy continuation to next character place
outerLoop:
for(int i = 0; i < wholeArray.length; i++) {
//Remaining values aren't large enough to contain find values so stop looking
if(i + findArray.length > wholeArray.length) {
break;
}
//Loop through next couple digits to check for matching sequence
for(int j = 0; j < findArray.length; j++) {
//Skips to the next start position as soon as a value doesn't match
if(wholeArray[i + j] != findArray[j]) {
continue outerLoop;
}
}
return i; //Or 'true' if you just care whether it's in there, and set the method return type to boolean
}
return -1; //Or 'false'
}
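For comparison, the standard library's String.indexOf performs the same kind of search and returns the first match index or -1. The sketch below, with the illustrative names IndexOfScan and allOccurrences (not from the answer above), collects every match position by restarting the search one character after the previous hit. Treating an empty search value as "no matches" is an assumption made here to keep the loop simple.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfScan {
    // Collects every index at which `find` occurs in `whole`,
    // including overlapping occurrences.
    static List<Integer> allOccurrences(String whole, String find) {
        List<Integer> positions = new ArrayList<>();
        if (find.isEmpty()) {
            return positions; // assumption: an empty search value yields no matches
        }
        int from = whole.indexOf(find);
        while (from != -1) {
            positions.add(from);
            // Advance by one so that overlapping matches are also found.
            from = whole.indexOf(find, from + 1);
        }
        return positions;
    }

    public static void main(String[] args) {
        String pi = "3.14159265358979323846264338327950288419716939937";
        System.out.println(allOccurrences(pi, "93"));
    }
}
```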

Or, in Java 8 style:
String pi = "3.14159265358979323846264338327950288419716939937...";
IntStream.range(0, pi.length() - 5)
.mapToObj(i -> new StringBuffer(pi.substring(i, i + 6)))
.forEach(System.out::println)
;
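You can also make it parallel (note that forEach on a parallel stream does not guarantee the order of the output; use forEachOrdered if the order matters):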
String pi = "3.14159265358979323846264338327950288419716939937...";
IntStream.range(0, pi.length() - 5)
.mapToObj(i -> new StringBuffer(pi.substring(i, i + 6)))
.parallel()
.forEach(System.out::println)
;
Regarding performance, the classic for loop is still a little faster than the stream version; you should run some tests yourself:
public class Main {
static long firstTestTime;
static long withStreamTime;
static String pi = "3.141592653589793238462643383279502884197169399375105820974944592307816";
public static void main(String[] args) {
firstTest(pi);
withStreams(pi);
System.out.println("First Test: " + firstTestTime);
System.out.println("With Streams: " + withStreamTime);
}
static void withStreams(String pi) {
System.out.println("Starting stream test");
long startTime = System.currentTimeMillis();
IntStream.range(0, pi.length() - 5)
.mapToObj(i -> new StringBuffer(pi.substring(i, i + 6)))
//.parallel()
.forEach(System.out::println)
;
withStreamTime = System.currentTimeMillis() - startTime;
}
// By @that other guy
static void firstTest(String pi) {
System.out.println("Starting first test");
long startTime = System.currentTimeMillis();
for(int start=0, end=6; end <= pi.length(); start++, end++) {
String substring = pi.substring(start, end);
System.out.println(substring);
}
firstTestTime = System.currentTimeMillis() - startTime;
}
}
Try increasing the length of the pi string!
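One caveat with the test above is that both methods spend most of their time in System.out.println, so the measured difference says little about the loop itself. The following rough sketch times the two approaches without printing each window; the class WindowTiming, the synthetic 1,000,000-digit input, and the checksum trick are all assumptions made for illustration. For trustworthy numbers a benchmarking harness such as JMH would be a better choice.

```java
import java.util.stream.IntStream;

class WindowTiming {
    static final int WIDTH = 6;

    // Sums the first character of every window so the work cannot be skipped.
    static long withLoop(String s) {
        long sum = 0;
        for (int start = 0, end = WIDTH; end <= s.length(); start++, end++) {
            sum += s.substring(start, end).charAt(0);
        }
        return sum;
    }

    // Same computation expressed as an IntStream over the start indices.
    static long withStream(String s) {
        return IntStream.range(0, s.length() - WIDTH + 1)
                .mapToLong(i -> s.substring(i, i + WIDTH).charAt(0))
                .sum();
    }

    public static void main(String[] args) {
        // Assumption: a synthetic 1,000,000-digit input stands in for pi.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1_000_000; i++) {
            sb.append((char) ('0' + i % 10));
        }
        String digits = sb.toString();
        // Early rounds act as warm-up for the JIT compiler.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long a = withLoop(digits);
            long t1 = System.nanoTime();
            long b = withStream(digits);
            long t2 = System.nanoTime();
            System.out.printf("loop %d us, stream %d us (checksums %d / %d)%n",
                    (t1 - t0) / 1_000, (t2 - t1) / 1_000, a, b);
        }
    }
}
```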

Related

Count all possible decoding Combination of the given binary String in Java

Suppose we have a string of binary values in which some portions may correspond to specific letters, for example:
A = 0
B = 00
C = 001
D = 010
E = 0010
F = 0100
G = 0110
H = 0001
For example, if we assume the string "00100", we can have 5 different possibilities:
ADA
AF
CAA
CB
EA
I have to extract the exact number of combinations using Dynamic programming.
But I have difficulty in the formulation of subproblems and in the composition of the corresponding vector of solutions.
I appreciate any indications of the correct algorithm formulation.
class countString {
static int count(String a, String b, int m, int n) {
if ((m == 0 && n == 0) || n == 0)
return 1;
if (m == 0)
return 0;
if (a.charAt(m - 1) == b.charAt(n - 1))
return count(a, b, m - 1, n - 1) +
count(a, b, m - 1, n);
else
return count(a, b, m - 1, n);
}
public static void main(String[] args) {
Locale.setDefault(Locale.US);
ArrayList<String> substrings = new ArrayList<>();
substrings.add("0");
substrings.add("00");
substrings.add("001");
substrings.add("010");
substrings.add("0010");
substrings.add("0100");
substrings.add("0110");
substrings.add("0001");
if (args.length != 1) {
System.err.println("ERROR - execute with: java countString -filename- ");
System.exit(1);
}
try {
Scanner scan = new Scanner(new File(args[0])); // not important
String S = "00100";
int count = 0;
for(int i=0; i<substrings.size(); i++){
count = count + count(S,substrings.get(i),S.length(),substrings.get(i).length());
}
System.out.println(count);
} catch (FileNotFoundException e) {
System.out.println("File not found " + e);
}
}
}

In essence, Dynamic Programming is an enhanced brute-force approach.
Like in the case of brute-force, we need to generate all possible results. But contrary to a plain brute-force the problem should be divided into smaller subproblems, and previously computed result of each subproblem should be stored and reused.
Since you are using recursion, you need to apply the so-called memoization technique in order to store and reuse the intermediate results. In this case, a HashMap would be a good means of storing them.
But before applying the memoization in order to understand it better, it makes sense to start with a clean and simple recursive solution that works correctly, and only then enhance it with DP.
Plain Recursion
Every recursive implementation should contain two parts:
Base case - represents a simple edge case (or a set of edge cases) for which the outcome is known in advance. For this problem, there are two edge cases: when the length of the given string is 0, the result is 1 (an empty binary string "" decodes into an empty string of letters ""); when it's impossible to decode the given binary string, the result is 0 (in the solution below this case resolves naturally while the recursive case is being executed).
Recursive case - the part of the solution where recursive calls are made and where the main logic resides. In the recursive case, we need to find each "binary letter" at the beginning of the string and then call the method recursively, passing the substring without that letter. The results of these recursive calls need to be accumulated in the total count that will be returned from the method.
In order to implement this logic we need only two arguments: the binary string to analyze and a list of binary letters:
public static int count(String str, List<String> letters) {
if (str.isEmpty()) { // base case - a combination was found
return 1;
}
// recursive case
int count = 0;
for (String letter: letters) {
if (str.startsWith(letter)) {
count += count(str.substring(letter.length()), letters);
}
}
return count;
}
This concise solution is already capable of producing the correct result. Now, let's turn this brute-force version into a DP-based solution, by applying the memoization.
Dynamic Programming
As mentioned earlier, a HashMap is a good means of storing the intermediate results because it allows us to associate a count (the number of combinations) with a particular string and then retrieve this number almost instantly (in O(1) time on average).
Here is how it might look:
public static int count(String str, List<String> letters, Map<String, Integer> vocab) {
if (str.isEmpty()) { // base case - a combination was found
return 1;
}
if (vocab.containsKey(str)) { // result was already computed and present in the map
return vocab.get(str);
}
int count = 0;
for (String letter: letters) {
if (str.startsWith(letter)) {
count += count(str.substring(letter.length()), letters, vocab);
}
}
vocab.put(str, count); // storing the total `count` into the map
return count;
}
main()
public static void main(String[] args) {
List<String> letters = List.of("0", "00", "001", "010", "0010", "0100", "0110", "0001"); // binary letters
System.out.println(count("00100", letters, new HashMap<>())); // DP
System.out.println(count("00100", letters)); // brute-force recursion
}
Output:
5 // DP
5 // plain recursion
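The same counts can also be computed bottom-up, without recursion or a map, by filling an array from the end of the string towards the start. This is a sketch of that alternative (tabulation instead of memoization), not the answer's own code; the class name DecodeCount and the array name ways are made up for illustration, and skipping empty letters is an added assumption.

```java
import java.util.List;

class DecodeCount {
    // ways[i] holds the number of decodings of the suffix str.substring(i).
    static long count(String str, List<String> letters) {
        int n = str.length();
        long[] ways = new long[n + 1];
        ways[n] = 1; // the empty suffix has exactly one decoding
        for (int i = n - 1; i >= 0; i--) {
            for (String letter : letters) {
                // startsWith(prefix, offset) returns false when the letter
                // would run past the end of str, so no bounds check is needed.
                if (!letter.isEmpty() && str.startsWith(letter, i)) {
                    ways[i] += ways[i + letter.length()];
                }
            }
        }
        return ways[0];
    }

    public static void main(String[] args) {
        List<String> letters =
                List.of("0", "00", "001", "010", "0010", "0100", "0110", "0001");
        System.out.println(count("00100", letters)); // prints 5
    }
}
```

Each position is visited once per letter, so the running time is proportional to the string length times the total length of the letters.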

Hope this helps.
The idea is to build every possible string from these values and check whether the input starts with it. If it does not, move on to the next index.
If you have test cases ready, you can verify it further.
I have tested it with only 2-3 values.
public int getCombo(String[] array, int startingIndex, String val, String input) {
int count = 0;
for (int i = startingIndex; i < array.length; i++) {
String matchValue = val + array[i];
if (matchValue.length() <= input.length()) {
// if value matches then count + 1
if (matchValue.equals(input)) {
count++;
System.out.println("match Found---->" + count); // omit this println; it's only for testing
return count;
} else if (input.startsWith(matchValue)) { // checking whether the input is starting with the new value
// search further combos
count += getCombo(array, 0, matchValue, input);
}
}
}
return count;
}
In main Method
String[] arr = substrings.toArray(new String[0]);
int count = 0;
for (int i = 0; i < arr.length; i++) {
System.out.println("index----?> " + i);
//adding this condition for single inputs i.e "0","010";
if(arr[i].equals(input))
count++;
else
count = count + getCombo(arr, 0, arr[i], input);
}
System.out.println("Final count : " + count);
My test results :
input : 00100
Final count 5
input : 000
Final count 3

Number of ways to recreate a given string using a given list of words

Given is a String word and a String array book that contains some strings. The program should give out the number of possibilities to create word only using elements in book. An element can be used as many times as we want and the program must terminate in under 6 seconds.
For example, input:
String word = "stackoverflow";
String[] book = new String[9];
book[0] = "st";
book[1] = "ck";
book[2] = "CAG";
book[3] = "low";
book[4] = "TC";
book[5] = "rf";
book[6] = "ove";
book[7] = "a";
book[8] = "sta";
The output should be 2, since we can create "stackoverflow" in two ways:
1: "st" + "a" + "ck" + "ove" + "rf" + "low"
2: "sta" + "ck" + "ove" + "rf" + "low"
My implementation of the program only terminates in the required time if word is relatively small (<15 characters). However, as I mentioned before, the running time limit for the program is 6 seconds and it should be able to handle very large word strings (>1000 characters). Here is an example of a large input.
Here is my code:
1) the actual method:
input: a String word and a String[] book
output: the number of ways word can be written only using strings in book
public static int optimal(String word, String[] book){
int count = 0;
List<List<String>> allCombinations = allSubstrings(word);
List<String> empty = new ArrayList<>();
List<String> wordList = Arrays.asList(book);
for (int i = 0; i < allCombinations.size(); i++) {
allCombinations.get(i).retainAll(wordList);
if (!sumUp(allCombinations.get(i), word)) {
allCombinations.remove(i);
allCombinations.add(i, empty);
}
else count++;
}
return count;
}
2) allSubstrings():
input: a String input
output: A list of lists, each containing a combination of substrings that add up to input
static List<List<String>> allSubstrings(String input) {
if (input.length() == 1) return Collections.singletonList(Collections.singletonList(input));
List<List<String>> result = new ArrayList<>();
for (List<String> temp : allSubstrings(input.substring(1))) {
List<String> firstList = new ArrayList<>(temp);
firstList.set(0, input.charAt(0) + firstList.get(0));
if (input.startsWith(firstList.get(0), 0)) result.add(firstList);
List<String> l = new ArrayList<>(temp);
l.add(0, input.substring(0, 1));
if (input.startsWith(l.get(0), 0)) result.add(l);
}
return result;
}
3.) sumup():
input: A String list input and a String expected
output: true if the elements in input add up to expected
public static boolean sumUp (List<String> input, String expected) {
String x = "";
for (int i = 0; i < input.size(); i++) {
x = x + input.get(i);
}
if (expected.equals(x)) return true;
return false;
}

I've figured out what I was doing wrong in my previous answer: I wasn't using memoization, so I was redoing an awful lot of unnecessary work.
Consider a book array {"a", "aa", "aaa"}, and a target word "aaa". There are four ways to construct this target:
"a" + "a" + "a"
"aa" + "a"
"a" + "aa"
"aaa"
My previous attempt would have walked through all four separately. But instead, one can observe that:
There is 1 way to construct "a"
You can construct "aa" in 2 ways, either "a" + "a" or using "aa" directly.
You can construct "aaa" either by using "aaa" directly (1 way); or "aa" + "a" (2 ways, since there are 2 ways to construct "aa"); or "a" + "aa" (1 way).
Note that the third step here only adds a single additional string to a previously-constructed string, for which we know the number of ways it can be constructed.
This suggests that if we count the number of ways in which a prefix of word can be constructed, we can use that to trivially calculate the number of ways to construct a longer prefix by adding just one more string from book.
I defined a simple trie class, so you can quickly look up prefixes of the book words that match at any given position in word:
class TrieNode {
boolean word;
Map<Character, TrieNode> children = new HashMap<>();
void add(String s, int i) {
if (i == s.length()) {
word = true;
} else {
children.computeIfAbsent(s.charAt(i), k -> new TrieNode()).add(s, i + 1);
}
}
}
For each letter in s, this creates an instance of TrieNode, and stores the TrieNode for the subsequent characters etc.
static long method(String word, String[] book) {
// Construct a trie from all the words in book.
TrieNode t = new TrieNode();
for (String b : book) {
t.add(b, 0);
}
// Construct an array to memoize the number of ways to construct
// prefixes of a given length: result[i] is the number of ways to
// construct a prefix of length i.
long[] result = new long[word.length() + 1];
// There is only 1 way to construct a prefix of length zero.
result[0] = 1;
for (int m = 0; m < word.length(); ++m) {
if (result[m] == 0) {
// If there are no ways to construct a prefix of this length,
// then just skip it.
continue;
}
// Walk the trie, taking the branch which matches the character
// of word at position (n + m).
TrieNode tt = t;
for (int n = 0; tt != null && n + m <= word.length(); ++n) {
if (tt.word) {
// We have reached the end of a word: we can reach a prefix
// of length (n + m) from a prefix of length (m).
// Increment the number of ways to reach (n+m) by the number
// of ways to reach (m).
// (Increment, because there may be other ways).
result[n + m] += result[m];
if (n + m == word.length()) {
break;
}
}
tt = tt.children.get(word.charAt(n + m));
}
}
// The number of ways to reach a prefix of length (word.length())
// is now stored in the last element of the array.
return result[word.length()];
}
For the very long input given by OP, this gives output:
$ time java Ideone
2217093120
real 0m0.126s
user 0m0.146s
sys 0m0.036s
Quite a bit faster than the required 6 seconds - and this includes JVM startup time too.
Edit: in fact, the trie isn't necessary. You can simply replace the "Walk the trie" loop with:
for (String b : book) {
if (word.regionMatches(m, b, 0, b.length())) {
result[m + b.length()] += result[m];
}
}
It runs slower, but still far faster than 6 s:
2217093120
real 0m0.173s
user 0m0.226s
sys 0m0.033s
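Put together, the trie-free variant might look like the sketch below. It follows the prefix-counting idea described above; the class name WordWays and method name countWays are placeholders, and skipping empty book entries is an added assumption (an empty entry would otherwise add result[m] to itself).

```java
class WordWays {
    // result[i] is the number of ways to build the first i characters of word.
    static long countWays(String word, String[] book) {
        long[] result = new long[word.length() + 1];
        result[0] = 1; // one way to build the empty prefix
        for (int m = 0; m < word.length(); m++) {
            if (result[m] == 0) {
                continue; // this prefix length is unreachable
            }
            for (String b : book) {
                // regionMatches returns false if b would extend past the end
                // of word, so result[m + b.length()] is always in range.
                if (!b.isEmpty() && word.regionMatches(m, b, 0, b.length())) {
                    result[m + b.length()] += result[m];
                }
            }
        }
        return result[word.length()];
    }

    public static void main(String[] args) {
        String[] book = {"st", "ck", "CAG", "low", "TC", "rf", "ove", "a", "sta"};
        System.out.println(countWays("stackoverflow", book)); // prints 2
        System.out.println(countWays("aaa", new String[] {"a", "aa", "aaa"})); // prints 4
    }
}
```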

A few observations:
x = x + input.get(i);
Since you are looping, concatenating with String + isn't a good idea. Use a StringBuilder, append to it within the loop, and return builder.toString() at the end. Or follow the idea from Andy: there is no need to merge strings at all, because you already know the target word. See below.
Then: using a List means that adding and removing elements might be costly. See if you can get rid of that part, and whether it would be possible to use maps or sets instead.
Finally: the real point would be to look into your algorithm. I would try to work "backwards". Meaning: first identify those array elements that actually occur in your target word. You can ignore all others right from the start.
Then: look at all array entries that start your search word. In your example, just two array elements fit. Work your way forward from there.
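The first and third observations can be sketched as follows. This is only an illustration under assumptions: the class BookHelpers and the method usableEntries are hypothetical names, and the early exit inside sumUp is an addition that the original method does not have.

```java
import java.util.ArrayList;
import java.util.List;

class BookHelpers {
    // Same check as sumUp, but appends to one StringBuilder instead of
    // creating a new String on every iteration.
    static boolean sumUp(List<String> input, String expected) {
        StringBuilder sb = new StringBuilder(expected.length());
        for (String part : input) {
            sb.append(part);
            if (sb.length() > expected.length()) {
                return false; // already longer than the target
            }
        }
        return expected.contentEquals(sb);
    }

    // Keeps only the book entries that occur somewhere in word;
    // all other entries can never be part of a solution.
    static List<String> usableEntries(String word, String[] book) {
        List<String> usable = new ArrayList<>();
        for (String entry : book) {
            if (!entry.isEmpty() && word.contains(entry)) {
                usable.add(entry);
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        String[] book = {"st", "ck", "CAG", "low", "TC", "rf", "ove", "a", "sta"};
        System.out.println(usableEntries("stackoverflow", book));
    }
}
```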

My first observation would be that you don't actually need to build anything: you know what string you are trying to construct (e.g. stackoverflow), so all you really need to keep track of is how much of that string you have matched so far. Call this m.
Next, having matched m characters, provided m < word.length(), you need to choose a next string from book which matches the portion of word from m to m + nextString.length().
You could do this by checking each string in turn:
if (word.regionMatches(m, nextString, 0, nextString.length())) { ... }
But you can do better, by determining strings that can't match in advance: the next string you append will have the following properties:
word.charAt(m) == nextString.charAt(0) (the next characters match)
m + nextString.length() <= word.length() (adding the next string shouldn't make the constructed string longer than word)
So, you can cut down the potential words from book that you might check by constructing a map of letters to words that start with that (point 1); and if you store the words with the same starting letter in increasing length order, you can stop checking that letter as soon as the length gets too big (point 2).
You can construct a map once and reuse:
Map<Character, List<String>> prefixMap =
Arrays.asList(book).stream()
.collect(groupingBy(
s -> s.charAt(0),
collectingAndThen(
toList(),
ss -> {
ss.sort(comparingInt(String::length));
return ss;
})));
You can count the number of ways recursively, without constructing any additional objects (*):
int method(String word, String[] book) {
return method(word, 0, /* construct map as above */);
}
int method(String word, int m, Map<Character, List<String>> prefixMap) {
if (m == word.length()) {
return 1;
}
int result = 0;
for (String nextString : prefixMap.getOrDefault(word.charAt(m), emptyList())) {
if (m + nextString.length() > word.length()) {
break;
}
// Start at m+1, because you already know they match at m.
if (word.regionMatches(m + 1, nextString, 1, nextString.length()-1)) {
// This is a potential match!
// Make a recursive call.
result += method(word, m + nextString.length(), prefixMap);
}
}
return result;
}
(*) This may construct new instances of Character, because of the boxing of the word.charAt(m): cached instances are guaranteed to be used for chars in the range 0-127 only. There are ways to work around this, but they would only clutter the code.

I think you are already doing a pretty good job at optimizing your application. In addition to the answer by GhostCat here are a few suggestions of my own:
public static int optimal(String word, String[] book){
int count = 0;
List<List<String>> allCombinations = allSubstrings(word);
List<String> wordList = Arrays.asList(book);
for (int i = 0; i < allCombinations.size(); i++)
{
/*
* allCombinations.get(i).retainAll(wordList);
*
* There is no need to retrieve the list element
* twice, just set it in a local variable
*/
java.util.List<String> combination = allCombinations.get(i);
combination.retainAll(wordList);
/*
* Since we are only interested in the count here
* there is no need to remove and add list elements
*/
if (sumUp(combination, word))
{
/*allCombinations.remove(i);
allCombinations.add(i, empty);*/
count++;
}
/*else count++;*/
}
return count;
}
public static boolean sumUp (List<String> input, String expected) {
String x = "";
for (int i = 0; i < input.size(); i++) {
x = x + input.get(i);
}
// No need for if block here, just return comparison result
/*if (expected.equals(x)) return true;
return false;*/
return expected.equals(x);
}
And since you are interested in seeing the execution time of your method I would recommend implementing a benchmarking system of some sort. Here is a quick mock-up:
private static long benchmarkOptima(int cycles, String word, String[] book) {
long totalTime = 0;
for (int i = 0; i < cycles; i++)
{
long startTime = System.currentTimeMillis();
int a = optimal(word, book);
long executionTime = System.currentTimeMillis() - startTime;
totalTime += executionTime;
}
return totalTime / cycles;
}
public static void main(String[] args)
{
String word = "stackoverflow";
String[] book = new String[] {
"st", "ck", "CAG", "low", "TC",
"rf", "ove", "a", "sta"
};
int result = optimal(word, book);
final int cycles = 50;
long averageTime = benchmarkOptima(cycles, word, book);
System.out.println("Optimal result: " + result);
System.out.println("Average execution time - " + averageTime + " ms");
}
Output
2
Average execution time - 6 ms

Note: the implementation gets stuck on the test case mentioned by @user1221; I am working on it.
What I could think of is a Trie based approach that is O(sum of length of words in dict) space. Time is not optimal.
Procedure:
Build a Trie of all the words in the dictionary. This is a pre-processing task that will take O(sum of lengths of all strings in dict).
We try finding the string that you want to make in the trie, with a twist. We start with searching a prefix of the string. If we get a prefix in the trie, we start the search from the top recursively and continue to look for more prefixes.
When we reach the end of our string, i.e. stackoverflow, we check whether we have also arrived at the end of a word in the trie. If so, we have found a valid combination of this string. We count this while going back up the recursion.
eg:
In the above case, we use the dict as {"st", "sta", "a", "ck"}
We construct our trie ($ is the sentinel char, i.e. a char which is not in the dict):
$___s___t.___a.
|___a.
|___c___k.
the . represents that a word in the dict ends at that position.
We try to find the number of constructions of stack.
We start searching stack in the trie.
depth=0
$___s(*)___t.___a.
|___a.
|___c___k.
We see that we are at the end of one word, we start a new search with the remaining string ack from the top.
depth=0
$___s___t(*).___a.
|___a.
|___c___k.
Again we are at the end of one word in the dict. We start a new search for ck.
depth=1
$___s___t.___a.
|___a(*).
|___c___k.
depth=2
$___s___t.___a.
|___a.
|___c(*)___k.
We reach the end of stack and end of a word in the dict, hence we have 1 valid representation of stack.
depth=2
$___s___t.___a.
|___a.
|___c___k(*).
We go back to the caller of depth=2
No next char is available, we return to the caller of depth=1.
depth=1
$___s___t.___a.
|___a(*, 1).
|___c___k.
depth=0
$___s___t(*, 1).___a.
|___a.
|___c___k.
We move to next char. We see that we reached the end of one word in the dict, we launch a new search for ck in the dict.
depth=0
$___s___t.___a(*, 1).
|___a.
|___c___k.
depth=1
$___s___t.___a.
|___a.
|___c(*)___k.
We reach the end of stack and the end of a word in the dict, so this is another valid representation. We go back to the caller of depth=1.
depth=1
$___s___t.___a.
|___a.
|___c___k(*, 1).
There are no more chars to proceed, we return with the result 2.
depth=0
$___s___t.___a(*, 2).
|___a.
|___c___k.
Note: The implementation is in C++, shouldn't be too hard to convert to Java and this implementation assumes that all chars are lowercase, it's trivial to extend it to both cases.
Sample code (full version):
/**
Node *base: head of the trie
Node *h : current node in the trie
string s : string to search
int idx : the current position in the string
*/
int count(Node *base, Node *h, string s, int idx) {
// step 3: found a valid combination.
if (idx == s.size()) return h->end;
int res = 0;
// step 2: we recursively start a new search.
if (h->end) {
res += count(base, base, s, idx);
}
// move ahead in the trie.
if (h->next[s[idx] - 'a'] != NULL) {
res += count(base, h->next[s[idx] - 'a'], s, idx + 1);
}
return res;
}
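A Java port of the function above might look like the following sketch. The Node class and the build helper are assumptions, since the C++ node definition is not shown here, and the port keeps the same restriction to lowercase a-z. Like the original, it has no memoization, so it can still take exponential time on long inputs.

```java
class TrieCount {
    static final class Node {
        final Node[] next = new Node[26]; // assumption: lowercase a-z only
        boolean end;                      // true if a dictionary word ends here
    }

    // Hypothetical helper: builds the trie from the dictionary words.
    static Node build(String[] dict) {
        Node root = new Node();
        for (String w : dict) {
            if (w.isEmpty()) {
                continue; // an empty word would make the search recurse forever
            }
            Node cur = root;
            for (char c : w.toCharArray()) {
                int k = c - 'a';
                if (cur.next[k] == null) {
                    cur.next[k] = new Node();
                }
                cur = cur.next[k];
            }
            cur.end = true;
        }
        return root;
    }

    // base: root of the trie, h: current node, s: string to build, idx: position in s
    static int count(Node base, Node h, String s, int idx) {
        // step 3: the whole string is consumed; valid only if a word ends here
        if (idx == s.length()) {
            return h.end ? 1 : 0;
        }
        int res = 0;
        // step 2: a word ends here, so restart the search from the root
        if (h.end) {
            res += count(base, base, s, idx);
        }
        // move ahead in the trie
        Node child = h.next[s.charAt(idx) - 'a'];
        if (child != null) {
            res += count(base, child, s, idx + 1);
        }
        return res;
    }

    public static void main(String[] args) {
        Node root = build(new String[] {"st", "sta", "a", "ck"});
        System.out.println(count(root, root, "stack", 0)); // prints 2
    }
}
```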

def cancons(target, wordbank, memo=None):
    # Use a fresh memo per top-level call; a mutable default would be shared across calls.
    if memo is None:
        memo = {}
    if target in memo:
        return memo[target]
    if target == '':
        return 1
    total_count = 0
    for word in wordbank:
        if target.startswith(word):
            l = len(word)
            number_of_way = cancons(target[l:], wordbank, memo)
            total_count += number_of_way
    memo[target] = total_count
    return total_count


if __name__ == '__main__':
    word = "stackoverflow"
    book = ["st", "ck", "CAG", "low", "TC", "rf", "ove", "a", "sta"]
    b = cancons(word, book, memo={})
    print(b)

Replace the Indexes without Loop

My question is:
I have one String variable of 1000 characters, and another String variable that is also 1000 characters long.
1] The first variable contains data like "1110000XXXXX0001111....", and so on up to 1000 characters.
2] The second variable contains data like "1110000101010001111...", and so on up to 1000 characters.
3] I need to find the positions of X in the first variable and replace each X with the character at the same position in the second variable.
For example: 1st variable: "000XXX000X0"
2nd variable: "00011000010"
Each X should be replaced by the value at the same position in the 2nd variable.
NOTE: TO BE DONE WITHOUT A LOOP,
because a loop runs 1000 times per record, and 'X' may be anywhere among the 1000 characters of the String.
For example: 1 record means 1000 iterations;
100K records means 1000 * 100K iterations (performance fails).
So I need a solution for this.
Kindly help me out with this.
My Code is :
String sInputStr="0X11XXXXX000000000000000000000000000000000000000000000000X000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011";
String sDbStr="0111111110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011";
int iLength=sInputStr.length();
for(int i=0;i<iLength;i++){
if(sInputStr.charAt(i)=='X'){
}else{
if(i>sDbStr.length()){
break;
}else{
sChar[i] = sInputStr.charAt(i);
}
}
}//End of For
sVal=String.valueOf(sChar);
System.out.println("sVal == " +sVal);
Help Me friends

All you need is something like this
class FirstApp {
public static void main(String[] args) {
String sDbStr="0111111110000001234000000000000011";
StringBuilder sNewStr= new StringBuilder("011111111000000XXXX00000000000001112");
String findStr = "X";
int lastIndex = 0;
System.out.println("Starting");
long startTime = System.currentTimeMillis();
String result = replaceValues("X", sDbStr, sNewStr);
long endTime = System.currentTimeMillis();
System.out.println("Result");
System.out.println(result);
System.out.println(String.valueOf(endTime-startTime));
}
public static String replaceValues(String toReplace, String fromStr, StringBuilder toStr) {
int lastIndex = toStr.indexOf(toReplace);
if(lastIndex != -1){
toStr.replace(lastIndex,lastIndex+1,Character.toString(fromStr.charAt(lastIndex)));
System.out.println(toStr);
return replaceValues(toReplace, fromStr, toStr);
} else {
return toStr.toString();
}
}
}
sample result:
Starting
0111111110000001XXX00000000000001112
01111111100000012XX00000000000001112
011111111000000123X00000000000001112
011111111000000123400000000000001112
Result
011111111000000123400000000000001112
UPDATE Updated solution to ensure less execution time using stringBuilder and recursion
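Because Java does not eliminate tail calls, the recursive version makes one nested call per X, so a template with very many placeholders could exhaust the stack. An iterative variant of the same idea is sketched below; the class FillPlaceholders and method fill are hypothetical names, and stopping at the shorter of the two strings is an added assumption. It still iterates, but only jumps from one X to the next via indexOf.

```java
class FillPlaceholders {
    // Replaces each placeholder in template with the character found at the
    // same index in source.
    static String fill(String template, String source, char placeholder) {
        StringBuilder out = new StringBuilder(template);
        // Assumption: placeholders beyond the end of source are left untouched.
        int limit = Math.min(template.length(), source.length());
        int i = template.indexOf(placeholder);
        while (i != -1 && i < limit) {
            out.setCharAt(i, source.charAt(i));
            i = template.indexOf(placeholder, i + 1);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fill("000XXX000X0", "00011000010", 'X')); // prints 00011000010
    }
}
```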

If every X maps to the same single value (for example, every X should become 1), use the String.replace function (or replaceAll).
For example:
String oriString="000XXX000X0";
String replaceOne=oriString.replace('X','1');
System.out.println(replaceOne);

If I understood the problem correctly, you want to replace each X in your first array with the value at the same position in the second array. For example, with array1: 000XXX000 and array2: 100101001, array1 should finally be 000101000.
Here is a simple code snippet to achieve this:
char[] arr1 = sInputStr.toCharArray();
char[] arr2 = sDbStr.toCharArray();
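for(int i = 0; i < arr1.length; i++) // arrays expose length as a field, not size()
if(arr1[i] == 'X')
arr1[i] = arr2[i]; // assignment (=), not comparison (==)
String result = new String(arr1);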

The idea is to search for the index of the occurrence of the character say 'X' and copy all the characters from second string into the first as long as we find 'X'. Repeat the process till the last occurrence of 'X'.
public class Test
{
public static void main(String args[])
{
String s1 = "0X11XXXXX000000000000000000000000000000000000000000000000X000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011";
String s2= "0111111110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011";
char a[] = s1.toCharArray();
int i = s1.indexOf('X', 0);
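// Walk each run of 'X' in s1 and copy the character at the same position from s2.
while(i!=-1)
{
// The bounds check stops the scan if s1 ends with 'X'.
while(i < a.length && a[i] == 'X'){
a[i] = s2.charAt(i);
i++;
}
i = s1.indexOf('X',i+1);
}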
s1 = new String(a);
System.out.println("result: "+s1);
}
}

How to extract the left most common characters in a string list?

Assume I have the following list of string objects:
ABC1, ABC2, ABC_Whatever
What's the most efficient way to extract the leftmost common characters from this list? In my case I'd get ABC.

StringUtils.getCommonPrefix(String... strs) from Apache Commons Lang.

This will work for you
public static void main(String args[]) {
String commonInFirstTwo=greatestCommon("ABC1","ABC2");
String commonInLastTwo=greatestCommon("ABC2","ABC_Whatever");
System.out.println(greatestCommon(commonInFirstTwo,commonInLastTwo));
}
public static String greatestCommon(String a, String b) {
int minLength = Math.min(a.length(), b.length());
for (int i = 0; i < minLength; i++) {
if (a.charAt(i) != b.charAt(i)) {
return a.substring(0, i);
}
}
return a.substring(0, minLength);
}
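The pairwise greatestCommon approach above can be extended to a list of any length by folding the result across the elements. The following is a minimal sketch under that assumption; the class name CommonPrefixFold and the method commonPrefix are hypothetical names chosen for illustration, and an empty list is assumed to yield an empty string.

```java
import java.util.Arrays;
import java.util.List;

class CommonPrefixFold {
    // Same pairwise comparison as greatestCommon in the answer above.
    static String greatestCommon(String a, String b) {
        int minLength = Math.min(a.length(), b.length());
        for (int i = 0; i < minLength; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return a.substring(0, i);
            }
        }
        return a.substring(0, minLength);
    }

    // Hypothetical helper: folds the pairwise result across the whole list.
    // Assumption: an empty list yields "".
    static String commonPrefix(List<String> strings) {
        if (strings.isEmpty()) {
            return "";
        }
        String prefix = strings.get(0);
        // Stop early once the running prefix is empty; it cannot grow again.
        for (int i = 1; i < strings.size() && !prefix.isEmpty(); i++) {
            prefix = greatestCommon(prefix, strings.get(i));
        }
        return prefix;
    }

    public static void main(String[] args) {
        // prints "ABC"
        System.out.println(commonPrefix(Arrays.asList("ABC1", "ABC2", "ABC_Whatever")));
    }
}
```

The early exit is only an optimisation; the result is the same without it.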

Hash every prefix (substring starting at index 0) of each word in the given list and count how often each one occurs. The one with the maximum number of occurrences is the one you want. Here is a sample implementation; it returns the most common prefix:
static String mostCommon(List<String> list) {
Map<String, Integer> word2Freq = new HashMap<String, Integer>();
String maxFreqWord = null;
int maxFreq = 0;
for (String word : list) {
for (int i = 0; i < word.length(); ++i) {
String sub = word.substring(0, i + 1);
Integer f = word2Freq.get(sub);
if (f == null) {
f = 0;
}
word2Freq.put(sub, f + 1);
if (f + 1 > maxFreq) {
if (maxFreqWord == null || maxFreqWord.length() < sub.length()) {
maxFreq = f + 1;
maxFreqWord = sub;
}
}
}
}
return maxFreqWord;
}
The above implementation may not suffice if you have more than one common substring; in that case, use the map built inside it.
System.out.println(mostCommon(Arrays.asList("ABC1", "ABC2", "ABC_Whatever")));
System.out.println(mostCommon(Arrays.asList("ABCDEFG1", "ABGG2", "ABC11_Whatever")));
Returns
ABC
AB

Your problem is just a rephrasing of the standard problem of finding the longest common prefix.

If you know what the common characters are, then you could check if the other strings contain those characters by using the .contains() method.

If you're willing to use a third party library, then the following using jOOλ generates that prefix for you:
String prefix = Seq.of("ABC1", "ABC2", "ABC_Whatever").commonPrefix();
Disclaimer: I work for the company behind jOOλ

If there are N strings and the shortest of them has M characters, then the most efficient (correct) answer takes N * M comparisons in the worst case (when all strings are the same).
outer loop - each character of the first string, one at a time
inner loop - each of the strings
test - the character of the string in the inner loop against the character in the outer loop.
The performance can be tuned to (N-1) * M if we do not test against the first string in the inner loop.
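The loop structure described above could look roughly like this in Java. This is a sketch, not a tuned implementation; the class name CommonPrefixScan is a hypothetical name, and the inner loop starts at the second string so the first string is never compared with itself.

```java
class CommonPrefixScan {
    // Assumption: no arguments at all yields "".
    static String commonPrefix(String... strings) {
        if (strings.length == 0) {
            return "";
        }
        String first = strings[0];
        // Outer loop: each character of the first string, one at a time.
        for (int i = 0; i < first.length(); i++) {
            char c = first.charAt(i);
            // Inner loop: every other string (the first one is skipped).
            for (int j = 1; j < strings.length; j++) {
                // Stop at the first string that is too short or differs at position i.
                if (i >= strings[j].length() || strings[j].charAt(i) != c) {
                    return first.substring(0, i);
                }
            }
        }
        // Every character of the first string matched in all other strings.
        return first;
    }

    public static void main(String[] args) {
        // prints "ABC"
        System.out.println(commonPrefix("ABC1", "ABC2", "ABC_Whatever"));
    }
}
```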

Matching the occurrence and pattern of characters of String2 in String1

I was asked this question in a phone interview for a summer internship, and tried to come up with an n*m complexity solution in Java (although it wasn't accurate either).
I have a function that takes 2 strings, say "common" and "cmn". It should return True because 'c', 'm', 'n' occur in the same order in "common". But if the arguments were "common" and "omn", it should return False: even though the characters occur in the same order, 'o' also appears after 'm' (which fails the pattern match condition).
I have worked over it using Hashmaps, and Ascii arrays, but didn't get a convincing solution yet! From what I have read till now, can it be related to Boyer-Moore, or Levenshtein Distance algorithms?
Hoping for respite at stackoverflow! :)
Edit: Some of the answers talk about reducing the word length, or creating a hashset. But per my understanding, this question cannot be done with hashsets because occurrence/repetition of each character in first string has its own significance. PASS conditions- "con", "cmn", "cm", "cn", "mn", "on", "co". FAIL conditions that may seem otherwise- "com", "omn", "mon", "om". These are FALSE/FAIL because "o" is occurring before as well as after "m". Another example- "google", "ole" would PASS, but "google", "gol" would fail because "o" is also appearing before "g"!

I think it's quite simple. Run through the pattern and, for every character, get the index of its last occurrence in the string. The index must always increase; otherwise return false.
So in pseudocode:
index = -1
foreach c in pattern
checkindex = string.lastIndexOf(c)
if checkindex == -1 //not found
return false
if checkindex < index
return false
if string.firstIndexOf(c) < index //characters in the wrong order
return false
index = checkindex
return true
Edit: you could further improve the code by passing index as the starting index to the lastIndexOf method. Then you wouldn't have to compare checkindex with index and the algorithm would be faster.
Updated: Fixed a bug in the algorithm. Additional condition added to consider the order of the letters in the pattern.
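A direct Java translation of the pseudocode above might look like the sketch below. The class name LastIndexMatcher and the method name matchesInOrder are hypothetical; firstIndexOf in the pseudocode is assumed to correspond to String.indexOf. It has only been checked against the PASS and FAIL examples listed in the question.

```java
class LastIndexMatcher {
    static boolean matchesInOrder(String text, String pattern) {
        int index = -1; // last occurrence of the previous pattern character
        for (char c : pattern.toCharArray()) {
            int checkIndex = text.lastIndexOf(c);
            if (checkIndex == -1) {
                return false; // character not found at all
            }
            if (checkIndex < index) {
                return false; // last occurrences are not increasing
            }
            if (text.indexOf(c) < index) {
                return false; // c also occurs before the previous character's last occurrence
            }
            index = checkIndex;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matchesInOrder("common", "cmn")); // prints "true"
        System.out.println(matchesInOrder("common", "omn")); // prints "false"
    }
}
```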

An excellent question. After a couple of hours of research, I think I have found the solution. First, let me try explaining the question with a different approach.
Requirement:
Let's consider the same example 'common' (mainString) and 'cmn' (subString). First we need to be clear that any character can repeat within the mainString and also within the subString, and since it's the pattern that we are concentrating on, the index of each character plays a great role too. So we need to know:
Index of the character (least and highest)
Let's keep this on hold and check the patterns a bit more. For the word common, we need to find whether the particular pattern cmn is present or not. The different patterns possible with common are (precedence applies):
c -> o
c -> m
c -> n
o -> m
o -> o
o -> n
m -> m
m -> o
m -> n
o -> n
At any moment this precedence and comparison must be valid. Since precedence plays a huge role, we need to store the index of each unique character instead of storing the different patterns.
Solution
First part of the solution is to create a Hash Table with the following criteria :-
Create a Hash Table with the key as each character of the mainString
Each entry for a unique key in the Hash Table will store two indices i.e lowerIndex and higherIndex
Loop through the mainString and, for every new character, add a new entry to the Hash with lowerIndex set to the current index of the character in mainString.
If a collision occurs (the character is already in the Hash), update its higherIndex entry with the current index; do this until the end of the String.
Second and main part of pattern matching :-
Set Flag as True
Loop through the subString and, for every character as the key, retrieve the details from the Hash.
Do the same for the very next character.
Just before the loop increment, verify two conditions:
If highestIndex(current character) > highestIndex(next character) Then
Pattern Fails, Flag <- False, Terminate Loop
// This condition is applicable for almost all the cases for pattern matching
Else If lowestIndex(current character) > lowestIndex(next character) Then
Pattern Fails, Flag <- False, Terminate Loop
// This case is explicitly for cases in which patterns like 'mon' appear
Display the Flag
N.B.: Since I am not well versed in Java, I did not submit the code, but someone can try implementing my idea.
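One possible Java reading of the two-part idea above is sketched here. The class name IndexTableMatcher is hypothetical. Two details are assumptions not spelled out in the answer: a character that occurs only once gets the same lowest and highest index, and a subString character that is missing from the table makes the match fail. The sketch applies only the two index comparisons the answer lists and has been checked only against the examples from the question.

```java
import java.util.HashMap;
import java.util.Map;

class IndexTableMatcher {
    static boolean matches(String mainString, String subString) {
        // First part: map each character to {lowestIndex, highestIndex}.
        Map<Character, int[]> table = new HashMap<>();
        for (int i = 0; i < mainString.length(); i++) {
            char c = mainString.charAt(i);
            int[] range = table.get(c);
            if (range == null) {
                table.put(c, new int[] { i, i }); // first time this character is seen
            } else {
                range[1] = i;                     // "collision": move the highest index forward
            }
        }
        // Second part: compare every character with the one that follows it.
        for (int i = 0; i < subString.length(); i++) {
            int[] current = table.get(subString.charAt(i));
            if (current == null) {
                return false; // assumption: unknown characters fail the match
            }
            if (i + 1 == subString.length()) {
                break; // the last character has no successor to compare with
            }
            int[] next = table.get(subString.charAt(i + 1));
            if (next == null) {
                return false;
            }
            // Fail if either the highest or the lowest index goes backwards.
            if (current[1] > next[1] || current[0] > next[0]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("common", "cmn")); // prints "true"
        System.out.println(matches("common", "mon")); // prints "false"
    }
}
```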

I solved this question myself in an inefficient manner, but it does give accurate results! I would appreciate it if anyone can work out an efficient code/algorithm from this!
Create a function "Check" which takes 2 strings as arguments. Check each character of string 2 in string 1. The order of appearance of each character of s2 should be verified as true in s1.
Take character 0 from string p and traverse through the string s to find its index of first occurrence.
Traverse through the filled ascii array to find any value more than the index of first occurrence.
Traverse further to find the last occurrence, and update the ascii array
Take character 1 from string p and traverse through the string s to find the index of its first occurrence in string s.
Traverse through the filled ascii array to find any value more than the index of first occurrence. if found, return False.
Traverse further to find the last occurrence, and update the ascii array
As can be observed, this is a bruteforce method...I guess O(N^3)
public class Interview
{
public static void main(String[] args)
{
if (check("google", "oge"))
System.out.println("yes");
else System.out.println("sorry!");
}
public static boolean check (String s, String p)
{
int[] asciiArr = new int[256];
for(int pIndex=0; pIndex<p.length(); pIndex++) //Loop1 inside p
{
for(int sIndex=0; sIndex<s.length(); sIndex++) //Loop2 inside s
{
if(p.charAt(pIndex) == s.charAt(sIndex))
{
asciiArr[s.charAt(sIndex)] = sIndex; //adding char from s to its Ascii value
for(int ascIndex=0; ascIndex<256; ) //Loop 3 for Ascii Array
{
if(asciiArr[ascIndex]>sIndex) //condition to check repetition
return false;
else ascIndex++;
}
}
}
}
return true;
}
}

Isn't it doable in O(n log n)?
Step 1, reduce the string by eliminating every character that appears again further to the right, so only the last occurrence of each character is kept. Strictly speaking you only need to eliminate characters if they appear in the string you're checking.
// Requires: import java.util.HashSet; import java.util.Set;
/** Reduces container to its maximal subsequence in which no character
* appears to the left of another occurrence of the same character, i.e. only
* the last occurrence of each character is kept.
* E.g. "common" -> "cmon", and "whirlygig" -> "whrlyig".
*/
static String reduceContainer(String container) {
// The answer used a sparse bitfield here; a HashSet is swapped in so the code compiles as-is.
Set<Character> charsToRight = new HashSet<Character>();
StringBuilder reduced = new StringBuilder();
for (int i = container.length(); --i >= 0;) {
char ch = container.charAt(i);
if (charsToRight.add(ch)) { // true only the first time ch is seen from the right
reduced.append(ch);
}
}
return reduced.reverse().toString();
}
Step 2, check containment.
static boolean containsInOrder(String container, String containee) {
int containerIdx = 0, containeeIdx = 0;
int containerLen = container.length(), containeeLen = containee.length();
while (containerIdx < containerLen && containeeIdx < containeeLen) {
// Could loop over codepoints instead of code-units, but you get the point...
if (container.charAt(containerIdx) == containee.charAt(containeeIdx)) {
++containeeIdx;
}
++containerIdx;
}
return containeeIdx == containeeLen;
}
And to answer your second question, no, Levenshtein distance won't help you since it has the property that if you swap the arguments the output is the same, but the algo you want does not.

import java.util.LinkedHashSet;
import java.util.Set;
public class StringPattern {
public static void main(String[] args) {
String inputContainer = "common";
String inputContainees[] = { "cmn", "omn" };
for (String containee : inputContainees)
System.out.println(inputContainer + " " + containee + " "
+ containsCommonCharsInOrder(inputContainer, containee));
}
static boolean containsCommonCharsInOrder(String container, String containee) {
Set<Character> containerSet = new LinkedHashSet<Character>() {
// To rearrange the order
@Override
public boolean add(Character arg0) {
if (this.contains(arg0))
this.remove(arg0);
return super.add(arg0);
}
};
addAllPrimitiveCharsToSet(containerSet, container.toCharArray());
Set<Character> containeeSet = new LinkedHashSet<Character>();
addAllPrimitiveCharsToSet(containeeSet, containee.toCharArray());
// retains the common chars in order
containerSet.retainAll(containeeSet);
return containerSet.toString().equals(containeeSet.toString());
}
static void addAllPrimitiveCharsToSet(Set<Character> set, char[] arr) {
for (char ch : arr)
set.add(ch);
}
}
Output:
common cmn true
common omn false

I would consider this one of the worst pieces of code I have ever written, or one of the worst code examples on stackoverflow...but guess what...all your conditions are met!
No algorithm could really fit the need, so I just used brute force...test it out...
And I couldn't care less about space and time complexity...my aim was first to try and solve it...and maybe improve it later!
public class SubString {
public static void main(String[] args) {
SubString ss = new SubString();
String[] trueconditions = {"con", "cmn", "cm", "cn", "mn", "on", "co" };
String[] falseconditions = {"com", "omn", "mon", "om"};
System.out.println("True Conditions : ");
for (String str : trueconditions) {
System.out.println("SubString? : " + str + " : " + ss.test("common", str));
}
System.out.println("False Conditions : ");
for (String str : falseconditions) {
System.out.println("SubString? : " + str + " : " + ss.test("common", str));
}
System.out.println("SubString? : ole : " + ss.test("google", "ole"));
System.out.println("SubString? : gol : " + ss.test("google", "gol"));
}
public boolean test(String original, String match) {
char[] original_array = original.toCharArray();
char[] match_array = match.toCharArray();
int[] value = new int[match_array.length];
int index = 0;
for (int i = 0; i < match_array.length; i++) {
for (int j = index; j < original_array.length; j++) {
if (original_array[j] != original_array[j == 0 ? j : j-1] && contains(match.substring(0, i), original_array[j])) {
value[i] = 2;
} else {
if (match_array[i] == original_array[j]) {
if (value[i] == 0) {
if (contains(original.substring(0, j == 0 ? j : j-1), match_array[i])) {
value[i] = 2;
} else {
value[i] = 1;
}
}
index = j + 1;
}
}
}
}
for (int b : value) {
if (b != 1) {
return false;
}
}
return true;
}
public boolean contains(String subStr, char ch) {
for (char c : subStr.toCharArray()) {
if (ch == c) {
return true;
}
}
return false;
}
}
-IvarD

I think this one is not a test of your computer science fundamentals, more what you would practically do within the Java programming environment.
You could construct a regular expression out of the second argument, i.e.
omn -> o.*m[^o]*n
... and then test candidate string against this by either using String.matches(...) or using the Pattern class.
In generic form, the construction of the RegExp should be along the following lines.
exp -> in[0] + ".*" + in[1] + for each x : 2 -> in.length - 1 { "[^" + in[x-2] + "]*" + in[x] }
for example:
demmn -> d.*e[^d]*m[^e]*m[^m]*n
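The construction can be sketched in Java as follows. The class name OrderRegex and both method names are hypothetical. The sketch assumes the pattern contains only letters and digits (no regex metacharacters, so nothing is escaped) and, as a further assumption, returns a pattern of fewer than two characters unchanged. It uses Matcher.find() rather than String.matches(...), because matches requires the whole candidate string to match the expression. The sketch only reproduces the construction; it does not claim the resulting expression handles every case listed in the question.

```java
import java.util.regex.Pattern;

class OrderRegex {
    // Builds e.g. "omn" -> "o.*m[^o]*n" and "demmn" -> "d.*e[^d]*m[^e]*m[^m]*n".
    static String buildRegex(String in) {
        if (in.length() < 2) {
            return in; // assumption: too short for the construction to apply
        }
        StringBuilder exp = new StringBuilder();
        exp.append(in.charAt(0)).append(".*").append(in.charAt(1));
        for (int x = 2; x < in.length(); x++) {
            // Between in[x-1] and in[x], forbid the character two positions back.
            exp.append("[^").append(in.charAt(x - 2)).append("]*").append(in.charAt(x));
        }
        return exp.toString();
    }

    // True if the generated expression matches somewhere inside candidate.
    static boolean occursIn(String candidate, String in) {
        return Pattern.compile(buildRegex(in)).matcher(candidate).find();
    }

    public static void main(String[] args) {
        System.out.println(buildRegex("omn"));          // prints "o.*m[^o]*n"
        System.out.println(occursIn("common", "omn"));  // prints "false"
    }
}
```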

I tried it myself in a different way. Just sharing my solution.
public class PatternMatch {
public static boolean matchPattern(String str, String pat) {
int slen = str.length();
int plen = pat.length();
int prevInd = -1, curInd;
int count = 0;
for (int i = 0; i < slen; i++) {
curInd = pat.indexOf(str.charAt(i));
if (curInd != -1) {
if(prevInd == curInd)
continue;
else if(curInd == (prevInd+1))
count++;
else if(curInd == 0)
count = 1;
else count = 0;
prevInd = curInd;
}
if(count == plen)
return true;
}
return false;
}
public static void main(String[] args) {
boolean r = matchPattern("common", "on");
System.out.println(r);
}
}
