Given a string s, what is the fastest method to generate a set of all its unique substrings?
Example: for str = "aba" we would get substrs={"a", "b", "ab", "ba", "aba"}.
The naive algorithm would be to traverse the entire string generating substrings in length 1..n in each iteration, yielding an O(n^2) upper bound.
Is a better bound possible?
(this is technically homework, so pointers-only are welcome as well)
As other posters have said, there are potentially O(n^2) substrings for a given string, so printing them out cannot be done faster than that. However there exists an efficient representation of the set that can be constructed in linear time: the suffix tree.
There is no way to do this faster than O(n2) because there are a total of O(n2) substrings in a string, so if you have to generate them all, their number will be n(n + 1) / 2 in the worst case, hence the upper lower bound of O(n2) Ω(n2).
First one is brute force which has complexity O(N^3) which could be brought down to O(N^2 log(N))
Second One using HashSet which has Complexity O(N^2)
Third One using LCP by initially finding all the suffix of a given string which has the worst case O(N^2) and best case O(N Log(N)).
First Solution:-
import java.util.Scanner;
public class DistinctSubString {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
System.out.print("Enter The string");
String s = in.nextLine();
long startTime = System.currentTimeMillis();
int L = s.length();
int N = L * (L + 1) / 2;
String[] Comb = new String[N];
for (int i = 0, p = 0; i < L; ++i) {
for (int j = 0; j < (L - i); ++j) {
Comb[p++] = s.substring(j, i + j + 1);
}
}
/*
* for(int j=0;j<N;++j) { System.out.println(Comb[j]); }
*/
boolean[] val = new boolean[N];
for (int i = 0; i < N; ++i)
val[i] = true;
int counter = N;
int p = 0, start = 0;
for (int i = 0, j; i < L; ++i) {
p = L - i;
for (j = start; j < (start + p); ++j) {
if (val[j]) {
//System.out.println(Comb[j]);
for (int k = j + 1; k < start + p; ++k) {
if (Comb[j].equals(Comb[k])) {
counter--;
val[k] = false;
}
}
}
}
start = j;
}
System.out.println("Substrings are " + N
+ " of which unique substrings are " + counter);
long endTime = System.currentTimeMillis();
System.out.println("It took " + (endTime - startTime) + " milliseconds");
}
}
Second Solution:-
import java.util.*;
public class DistictSubstrings_usingHashTable {
public static void main(String args[]) {
// create a hash set
Scanner in = new Scanner(System.in);
System.out.print("Enter The string");
String s = in.nextLine();
int L = s.length();
long startTime = System.currentTimeMillis();
Set<String> hs = new HashSet<String>();
// add elements to the hash set
for (int i = 0; i < L; ++i) {
for (int j = 0; j < (L - i); ++j) {
hs.add(s.substring(j, i + j + 1));
}
}
System.out.println(hs.size());
long endTime = System.currentTimeMillis();
System.out.println("It took " + (endTime - startTime) + " milliseconds");
}
}
Third Solution:-
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
public class LCPsolnFroDistinctSubString {
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Enter Desired String ");
String string = br.readLine();
int length = string.length();
String[] arrayString = new String[length];
for (int i = 0; i < length; ++i) {
arrayString[i] = string.substring(length - 1 - i, length);
}
Arrays.sort(arrayString);
for (int i = 0; i < length; ++i)
System.out.println(arrayString[i]);
long num_substring = arrayString[0].length();
for (int i = 0; i < length - 1; ++i) {
int j = 0;
for (; j < arrayString[i].length(); ++j) {
if (!((arrayString[i].substring(0, j + 1)).equals((arrayString)[i + 1]
.substring(0, j + 1)))) {
break;
}
}
num_substring += arrayString[i + 1].length() - j;
}
System.out.println("unique substrings = " + num_substring);
}
}
Fourth Solution:-
public static void printAllCombinations(String soFar, String rest) {
if(rest.isEmpty()) {
System.out.println(soFar);
} else {
printAllCombinations(soFar + rest.substring(0,1), rest.substring(1));
printAllCombinations(soFar , rest.substring(1));
}
}
Test case:- printAllCombinations("", "abcd");
For big oh ... Best you could do would be O(n^2)
No need to reinvent the wheel, its not based on a strings, but on a sets, so you will have to take the concepts and apply them to your own situation.
Algorithms
Really Good White Paper from MS
In depth PowerPoint
Blog on string perms
well, since there is potentially n*(n+1)/2 different substrings (+1 for the empty substring), I doubt you can be better than O(n*2) (worst case). the easiest thing is to generate them and use some nice O(1) lookup table (such as a hashmap) for excluding duplicates right when you find them.
class SubstringsOfAString {
public static void main(String args[]) {
String string = "Hello", sub = null;
System.out.println("Substrings of \"" + string + "\" are :-");
for (int i = 0; i < string.length(); i++) {
for (int j = 1; j <= string.length() - i; j++) {
sub = string.substring(i, j + i);
System.out.println(sub);
}
}
}
}
class program
{
List<String> lst = new List<String>();
String str = "abc";
public void func()
{
subset(0, "");
lst.Sort();
lst = lst.Distinct().ToList();
foreach (String item in lst)
{
Console.WriteLine(item);
}
}
void subset(int n, String s)
{
for (int i = n; i < str.Length; i++)
{
lst.Add(s + str[i].ToString());
subset(i + 1, s + str[i].ToString());
}
}
}
This prints unique substrings.
https://ideone.com/QVWOh0
def uniq_substring(test):
lista=[]
[lista.append(test[i:i+k+1]) for i in range(len(test)) for k in
range(len(test)-i) if test[i:i+k+1] not in lista and
test[i:i+k+1][::-1] not in lista]
print lista
uniq_substring('rohit')
uniq_substring('abab')
['r', 'ro', 'roh', 'rohi', 'rohit', 'o', 'oh', 'ohi', 'ohit', 'h',
'hi', 'hit', 'i', 'it', 't']
['a', 'ab', 'aba', 'abab', 'b', 'bab']
Many answers that include 2 for loops and a .substring() call claim O(N^2) time complexity. However, it is important to note that the worst case for a .substring() call in Java (post update 6 in Java 7) is O(N). So by adding a .substring() call in your code, the order of N has increased by one.
Therefore, 2 for loops and a .substring() call within those loops equals an O(N^3) time complexity.
It can only be done in o(n^2) time as total number of unique substrings of a string would be n(n+1)/2.
Example:
string s = "abcd"
pass 0: (all the strings are of length 1)
a, b, c, d = 4 strings
pass 1: (all the strings are of length 2)
ab, bc, cd = 3 strings
pass 2: (all the strings are of length 3)
abc, bcd = 2 strings
pass 3: (all the strings are of length 4)
abcd = 1 strings
Using this analogy, we can write solution with o(n^2) time complexity and constant space complexity.
The source code is as below:
#include<stdio.h>
void print(char arr[], int start, int end)
{
int i;
for(i=start;i<=end;i++)
{
printf("%c",arr[i]);
}
printf("\n");
}
void substrings(char arr[], int n)
{
int pass,j,start,end;
int no_of_strings = n-1;
for(pass=0;pass<n;pass++)
{
start = 0;
end = start+pass;
for(j=no_of_strings;j>=0;j--)
{
print(arr,start, end);
start++;
end = start+pass;
}
no_of_strings--;
}
}
int main()
{
char str[] = "abcd";
substrings(str,4);
return 0;
}
Naive algorithm takes O(n^3) time instead of O(n^2) time.
There are O(n^2) number of substrings.
And if you put O(n^2) number of substrings, for example, set,
then set compares O(lgn) comparisons for each string to check if it alrady exists in the set or not.
Besides it takes O(n) time for string comparison.
Therefore, it takes O(n^3 lgn) time if you use set. and you can reduce it O(n^3) time if you use hashtable instead of set.
The point is it is string comparisons not number comparisons.
So one of the best algorithm let's say if you use suffix array and longest common prefix (LCP) algorithm, it reduces O(n^2) time for this problem.
Building a suffix array using O(n) time algorithm.
Time for LCP = O(n) time.
Since for each pair of strings in suffix array, do LCP so total time is O(n^2) time to find the length of distinct subtrings.
Besides if you want to print all distinct substrings, it takes O(n^2) time.
Try this code using a suffix array and longest common prefix. It can also give you the total number of unique substrings. The code might give a stack overflow in visual studio but runs fine in Eclipse C++. That's because it returns vectors for functions. Haven't tested it against extremely long strings. Will do so and report back.
// C++ program for building LCP array for given text
#include <bits/stdc++.h>
#include <vector>
#include <string>
using namespace std;
#define MAX 100000
int cum[MAX];
// Structure to store information of a suffix
struct suffix
{
int index; // To store original index
int rank[2]; // To store ranks and next rank pair
};
// A comparison function used by sort() to compare two suffixes
// Compares two pairs, returns 1 if first pair is smaller
int cmp(struct suffix a, struct suffix b)
{
return (a.rank[0] == b.rank[0])? (a.rank[1] < b.rank[1] ?1: 0):
(a.rank[0] < b.rank[0] ?1: 0);
}
// This is the main function that takes a string 'txt' of size n as an
// argument, builds and return the suffix array for the given string
vector<int> buildSuffixArray(string txt, int n)
{
// A structure to store suffixes and their indexes
struct suffix suffixes[n];
// Store suffixes and their indexes in an array of structures.
// The structure is needed to sort the suffixes alphabatically
// and maintain their old indexes while sorting
for (int i = 0; i < n; i++)
{
suffixes[i].index = i;
suffixes[i].rank[0] = txt[i] - 'a';
suffixes[i].rank[1] = ((i+1) < n)? (txt[i + 1] - 'a'): -1;
}
// Sort the suffixes using the comparison function
// defined above.
sort(suffixes, suffixes+n, cmp);
// At his point, all suffixes are sorted according to first
// 2 characters. Let us sort suffixes according to first 4
// characters, then first 8 and so on
int ind[n]; // This array is needed to get the index in suffixes[]
// from original index. This mapping is needed to get
// next suffix.
for (int k = 4; k < 2*n; k = k*2)
{
// Assigning rank and index values to first suffix
int rank = 0;
int prev_rank = suffixes[0].rank[0];
suffixes[0].rank[0] = rank;
ind[suffixes[0].index] = 0;
// Assigning rank to suffixes
for (int i = 1; i < n; i++)
{
// If first rank and next ranks are same as that of previous
// suffix in array, assign the same new rank to this suffix
if (suffixes[i].rank[0] == prev_rank &&
suffixes[i].rank[1] == suffixes[i-1].rank[1])
{
prev_rank = suffixes[i].rank[0];
suffixes[i].rank[0] = rank;
}
else // Otherwise increment rank and assign
{
prev_rank = suffixes[i].rank[0];
suffixes[i].rank[0] = ++rank;
}
ind[suffixes[i].index] = i;
}
// Assign next rank to every suffix
for (int i = 0; i < n; i++)
{
int nextindex = suffixes[i].index + k/2;
suffixes[i].rank[1] = (nextindex < n)?
suffixes[ind[nextindex]].rank[0]: -1;
}
// Sort the suffixes according to first k characters
sort(suffixes, suffixes+n, cmp);
}
// Store indexes of all sorted suffixes in the suffix array
vector<int>suffixArr;
for (int i = 0; i < n; i++)
suffixArr.push_back(suffixes[i].index);
// Return the suffix array
return suffixArr;
}
/* To construct and return LCP */
vector<int> kasai(string txt, vector<int> suffixArr)
{
int n = suffixArr.size();
// To store LCP array
vector<int> lcp(n, 0);
// An auxiliary array to store inverse of suffix array
// elements. For example if suffixArr[0] is 5, the
// invSuff[5] would store 0. This is used to get next
// suffix string from suffix array.
vector<int> invSuff(n, 0);
// Fill values in invSuff[]
for (int i=0; i < n; i++)
invSuff[suffixArr[i]] = i;
// Initialize length of previous LCP
int k = 0;
// Process all suffixes one by one starting from
// first suffix in txt[]
for (int i=0; i<n; i++)
{
/* If the current suffix is at n-1, then we don’t
have next substring to consider. So lcp is not
defined for this substring, we put zero. */
if (invSuff[i] == n-1)
{
k = 0;
continue;
}
/* j contains index of the next substring to
be considered to compare with the present
substring, i.e., next string in suffix array */
int j = suffixArr[invSuff[i]+1];
// Directly start matching from k'th index as
// at-least k-1 characters will match
while (i+k<n && j+k<n && txt[i+k]==txt[j+k])
k++;
lcp[invSuff[i]] = k; // lcp for the present suffix.
// Deleting the starting character from the string.
if (k>0)
k--;
}
// return the constructed lcp array
return lcp;
}
// Utility function to print an array
void printArr(vector<int>arr, int n)
{
for (int i = 0; i < n; i++)
cout << arr[i] << " ";
cout << endl;
}
// Driver program
int main()
{
int t;
cin >> t;
//t = 1;
while (t > 0) {
//string str = "banana";
string str;
cin >> str; // >> k;
vector<int>suffixArr = buildSuffixArray(str, str.length());
int n = suffixArr.size();
cout << "Suffix Array : \n";
printArr(suffixArr, n);
vector<int>lcp = kasai(str, suffixArr);
cout << "\nLCP Array : \n";
printArr(lcp, n);
// cum will hold number of substrings if that'a what you want (total = cum[n-1]
cum[0] = n - suffixArr[0];
// vector <pair<int,int>> substrs[n];
int count = 1;
for (int i = 1; i <= n-suffixArr[0]; i++) {
//substrs[0].push_back({suffixArr[0],i});
string sub_str = str.substr(suffixArr[0],i);
cout << count << " " << sub_str << endl;
count++;
}
for(int i = 1;i < n;i++) {
cum[i] = cum[i-1] + (n - suffixArr[i] - lcp[i - 1]);
int end = n - suffixArr[i];
int begin = lcp[i-1] + 1;
int begin_suffix = suffixArr[i];
for (int j = begin, k = 1; j <= end; j++, k++) {
//substrs[i].push_back({begin_suffix, lcp[i-1] + k});
// cout << "i push " << i << " " << begin_suffix << " " << k << endl;
string sub_str = str.substr(begin_suffix, lcp[i-1] +k);
cout << count << " " << sub_str << endl;
count++;
}
}
/*int count = 1;
cout << endl;
for(int i = 0; i < n; i++){
for (auto it = substrs[i].begin(); it != substrs[i].end(); ++it ) {
string sub_str = str.substr(it->first, it->second);
cout << count << " " << sub_str << endl;
count++;
}
}*/
t--;
}
return 0;
}
And here's a simpler algorithm:
#include <iostream>
#include <string.h>
#include <vector>
#include <string>
#include <algorithm>
#include <time.h>
using namespace std;
char txt[100000], *p[100000];
int m, n;
int cmp(const void *p, const void *q) {
int rc = memcmp(*(char **)p, *(char **)q, m);
return rc;
}
int main() {
std::cin >> txt;
int start_s = clock();
n = strlen(txt);
int k; int i;
int count = 1;
for (m = 1; m <= n; m++) {
for (k = 0; k+m <= n; k++)
p[k] = txt+k;
qsort(p, k, sizeof(p[0]), &cmp);
for (i = 0; i < k; i++) {
if (i != 0 && cmp(&p[i-1], &p[i]) == 0){
continue;
}
char cur_txt[100000];
memcpy(cur_txt, p[i],m);
cur_txt[m] = '\0';
std::cout << count << " " << cur_txt << std::endl;
count++;
}
}
cout << --count << endl;
int stop_s = clock();
float run_time = (stop_s - start_s) / double(CLOCKS_PER_SEC);
cout << endl << "distinct substrings \t\tExecution time = " << run_time << " seconds" << endl;
return 0;
}
Both algorithms listed a simply too slow for extremely long strings though. I tested the algorithms against a string of length over 47,000 and the algorithms took over 20 minutes to complete, with the first one taking 1200 seconds, and the second one taking 1360 seconds, and that's just counting the unique substrings without outputting to the terminal. So for probably strings of length up to 1000 you might get a working solution. Both solutions did compute the same total number of unique substrings though. I did test both algorithms against string lengths of 2000 and 10,000. The times were for the first algorithm: 0.33 s and 12 s; for the second algorithm it was 0.535 s and 20 s. So it looks like in general the first algorithm is faster.
Here is my code in Python. It generates all possible substrings of any given string.
def find_substring(str_in):
substrs = []
if len(str_in) <= 1:
return [str_in]
s1 = find_substring(str_in[:1])
s2 = find_substring(str_in[1:])
substrs.append(s1)
substrs.append(s2)
for s11 in s1:
substrs.append(s11)
for s21 in s2:
substrs.append("%s%s" %(s11, s21))
for s21 in s2:
substrs.append(s21)
return set(substrs)
If you pass str_ = "abcdef" to the function, it generates the following results:
a, ab, abc, abcd, abcde, abcdef, abcdf, abce, abcef, abcf, abd, abde, abdef, abdf, abe, abef, abf, ac, acd, acde, acdef, acdf, ace, acef, acf, ad, ade, adef, adf, ae, aef, af, b, bc, bcd, bcde, bcdef, bcdf, bce, bcef, bcf, bd, bde, bdef, bdf, be, bef, bf, c, cd, cde, cdef, cdf, ce, cef, cf, d, de, def, df, e, ef, f
Related
My task is to generates all possible combinations of that rows without the hidden # number sign. The input is XOXX#OO#XO and here is the example of what the output should be:
XOXXOOOOXO
XOXXOOOXXO
XOXXXOOOXO
XOXXXOOXXO
I am only allowed to solve this solution iteratively and I am not sure how to fix this and have been working on this code for a week now.
Here is my code:
import java.lang.Math;
public class help {
public static void main(String[] args) {
String str = new String("XOXX#OO#XO");
UnHide(str);
}
public static void UnHide(String str) {
//converting string to char
char[] chArr = str.toCharArray();
//finding all combinations for XO
char[] xo = new char[]{'X', 'O'};
int count = 0;
char perm = 0;
String s = "";
//finding amount of times '#' appears in string
for (int i = 0; i < str.length(); i++) {
if (chArr[i] == '#')
count++;
}
int[] combo = new int[count];
int pMax = xo.length;
while (combo[0] < pMax) {
// print the current permutation
for (int k = 0; k < count; k++) {
//print each character
//System.out.print(xo[combo[i]]);
perm = xo[combo[k]];
s = String.valueOf(perm);
char[] xoArr = s.toCharArray();
String strChar = new String(xoArr);
//substituting '#' to XO combo
for (int i = 0; i < chArr.length; i++) {
for (int j = 0; j < s.length(); j++) {
if (chArr[i] == '#') {
chArr[i] = xoArr[j];
strChar = String.copyValueOf(chArr);
i++;
}
}
i++;
if (i == chArr.length - 1) {
System.out.println(strChar);
i = 0;
}
}
}
System.out.println(); //print end of line
// increment combo
combo[count - 1]++; // increment the last index
//// if increment overflows
for (int i = count - 1; combo[i] == pMax && i > 0; i--) {
combo[i - 1]++; // increment previous index
combo[i] = 0; // set current index to zero
}
}
}
}
Since your input has 2 #'s, there are 2n = 4 permutations.
If you count from 0 to 3, and look at the numbers in binary, you get 00, 01, 10, and 11, so if you use that, inserting O for 0 and X for 1, you can do this using simple loops.
public static void unHide(String str) {
int count = 0;
for (int i = 0; i < str.length(); i++)
if (str.charAt(i) == '#')
count++;
if (count > 30)
throw new IllegalArgumentException("Too many #'s found. " + count + " > 30");
char[] buf = str.toCharArray();
for (int permutation = 0, end = 1 << count; permutation < end; permutation++) {
for (int i = buf.length - 1, bit = 0; i >= 0; i--)
if (str.charAt(i) == '#')
buf[i] = "OX".charAt(permutation >>> bit++ & 1);
System.out.println(buf);
}
}
Test
unHide("XOXX#OO#XO");
Output
XOXXOOOOXO
XOXXOOOXXO
XOXXXOOOXO
XOXXXOOXXO
You can iteratively generate all possible combinations of strings using streams as follows:
public static String[] unHide(String str) {
// an array of substrings around a 'number sign'
String[] arr = str.split("#", -1);
// an array of possible combinations
return IntStream
// iterate over array indices
.range(0, arr.length)
// append each substring with possible
// combinations, except the last one
// return Stream<String[]>
.mapToObj(i -> i < arr.length - 1 ?
new String[]{arr[i] + "O", arr[i] + "X"} :
new String[]{arr[i]})
// reduce stream of arrays to a single array
// by sequentially multiplying array pairs
.reduce((arr1, arr2) -> Arrays.stream(arr1)
.flatMap(str1 -> Arrays.stream(arr2)
.map(str2 -> str1 + str2))
.toArray(String[]::new))
.orElse(null);
}
// output to the markdown table
public static void main(String[] args) {
String[] tests = {"XOXX#OOXO", "XOXX#OO#XO", "#XOXX#OOXO#", "XO#XX#OO#XO"};
String header = String.join("</pre> | <pre>", tests);
String matrices = Arrays.stream(tests)
.map(test -> unHide(test))
.map(arr -> String.join("<br>", arr))
.collect(Collectors.joining("</pre> | <pre>"));
System.out.println("| <pre>" + header + "</pre> |");
System.out.println("|---|---|---|---|");
System.out.println("| <pre>" + matrices + "</pre> |");
}
XOXX#OOXO
XOXX#OO#XO
#XOXX#OOXO#
XO#XX#OO#XO
XOXXOOOXOXOXXXOOXO
XOXXOOOOXOXOXXOOOXXOXOXXXOOOXOXOXXXOOXXO
OXOXXOOOXOOOXOXXOOOXOXOXOXXXOOXOOOXOXXXOOXOXXXOXXOOOXOOXXOXXOOOXOXXXOXXXOOXOOXXOXXXOOXOX
XOOXXOOOOXOXOOXXOOOXXOXOOXXXOOOXOXOOXXXOOXXOXOXXXOOOOXOXOXXXOOOXXOXOXXXXOOOXOXOXXXXOOXXO
The process would probably be best to calculate the number of permutations, then loop through each to define what combination of characters to use.
For that, we'll have to divide the permutation number by some value related to the index of the character we're replacing, which will serve as the index of the character to swap it to.
public static void test(String word) {
// Should be defined in class (outside method)
String[] replaceChars = {"O", "X"};
char replCharacter = '#';
String temp;
int charIndex;
int numReplaceable = 0;
// Count the number of chars to replace
for (char c : word.toCharArray())
if (c == replCharacter)
numReplaceable++;
int totalPermutations = (int) Math.pow(replaceChars.length, numReplaceable);
// For all permutations:
for (int permNum = 0; permNum < totalPermutations; permNum++) {
temp = word;
// For each replacement character in the word:
for (int n = 0; n < numReplaceable; n++) {
// Calculate the character to swap the nth replacement char to
charIndex = permNum / (int) (Math.pow(replaceChars.length, n))
% replaceChars.length;
temp = temp.replaceFirst(
replCharacter + "", replaceChars[charIndex]);
}
System.out.println(temp);
}
}
Which can produces:
java Test "#TEST#"
OTESTO
XTESTO
OTESTX
XTESTX
This can also be used with any number of characters, just add more to replaceChars.
I found this very challenging coding problem online which I though I'd give a try.
The general idea is that given string of text T and pattern P, find the occurrences of this pattern, sum up it's corresponding value and return max and min. If you want to read the problem in more details, please refer to this.
However, below is the code I've provided, it works for a simple test case, but when running on multiple and complex test cases its pretty slow, and I'm not sure where my code needs to be optimized.
Can anyone please help where im getting the logic wrong.
public class DeterminingDNAHealth {
private DeterminingDNAHealth() {
/*
* Fixme:
* Each DNA contains number of genes
* - some of them are beneficial and increase DNA's total health
* - Each Gene has a health value
* ======
* - Total health of DNA = sum of all health values of beneficial genes
*/
}
int checking(int start, int end, String pattern) {
String[] genesChar = new String[] {
"a",
"b",
"c",
"aa",
"d",
"b"
};
String numbers = "123456";
int total = 0;
for (int i = start; i <= end; i++) {
total += KMPAlgorithm.initiateAlgorithm(pattern, genesChar[i]) * (i + 1);
}
return total;
}
public static void main(String[] args) {
String[] genesChar = new String[] {
"a",
"b",
"c",
"aa",
"d",
"b"
};
Gene[] genes = new Gene[genesChar.length];
for (int i = 0; i < 6; i++) {
genes[i] = new Gene(genesChar[i], i + 1);
}
String[] checking = "15caaab 04xyz 24bcdybc".split(" ");
DeterminingDNAHealth DNA = new DeterminingDNAHealth();
int i, mostHealthiest, mostUnhealthiest;
mostHealthiest = Integer.MIN_VALUE;
mostUnhealthiest = Integer.MAX_VALUE;
for (i = 0; i < checking.length; i++) {
int start = Character.getNumericValue(checking[i].charAt(0));
int end = Character.getNumericValue(checking[i].charAt(1));
String pattern = checking[i].substring(2, checking[i].length());
int check = DNA.checking(start, end, pattern);
if (check > mostHealthiest)
mostHealthiest = check;
else
if (check < mostUnhealthiest)
mostUnhealthiest = check;
}
System.out.println(mostHealthiest + " " + mostUnhealthiest);
// DNA.checking(1,5, "caaab");
}
}
KMPAlgorithm
public class KMPAlgorithm {
KMPAlgorithm() {}
public static int initiateAlgorithm(String text, String pattern) {
// let us generate our LPC table from the pattern
int[] partialMatchTable = partialMatchTable(pattern);
int matchedOccurrences = 0;
// initially we don't have anything matched, so 0
int partialMatchLength = 0;
// we then start to loop through the text, !note, not the pattern. The text that we are testing the pattern on
for (int i = 0; i < text.length(); i++) {
// if there is a mismatch and there's no previous match, then we've hit the base-case, hence break from while{...}
while (partialMatchLength > 0 && text.charAt(i) != pattern.charAt(partialMatchLength)) {
/*
* otherwise, based on the number of chars matched, we decrement it by 1.
* In fact, this is the unique part of this algorithm. It is this part that we plan to skip partialMatchLength
* iterations. So if our partialMatchLength was 5, then we are going to skip (5 - 1) iteration.
*/
partialMatchLength = partialMatchTable[partialMatchLength - 1];
}
// if however we have a char that matches the current text[i]
if (text.charAt(i) == pattern.charAt(partialMatchLength)) {
// then increment position, so hence we check the next char of the pattern against the next char in text
partialMatchLength++;
// we will know that we're at the end of the pattern matching, if the matched length is same as the pattern length
if (partialMatchLength == pattern.length()) {
// to get the starting index of the matched pattern in text, apply this formula (i - (partialMatchLength - 1))
// this line increments when a match string occurs multiple times;
matchedOccurrences++;
// just before when we have a full matched pattern, we want to test for multiple occurrences, so we make
// our match length incomplete, and let it run longer.
partialMatchLength = partialMatchTable[partialMatchLength - 1];
}
}
}
return matchedOccurrences;
}
private static int[] partialMatchTable(String pattern) {
/*
* TODO
* Note:
* => Proper prefix: All the characters in a string, with one or more cut off the end.
* => proper suffix: All the characters in a string, with one or more cut off the beginning.
*
* 1.) Take the pattern and construct a partial match table
*
* To construct partial match table {
* 1. Loop through the String(pattern)
* 2. Create a table of size String(pattern).length
* 3. For each character c[i], get The length of the longest proper prefix in the (sub)pattern
* that matches a proper suffix in the same (sub)pattern
* }
*/
// we will need two incremental variables
int i, j;
// an LSP table also known as “longest suffix-prefix”
int[] LSP = new int[pattern.length()];
// our initial case is that the first element is set to 0
LSP[0] = 0;
// loop through the pattern...
for (i = 1; i < pattern.length(); i++) {
// set our j as previous elements data (not the index)
j = LSP[i - 1];
// we will be comparing previous and current elements data. ei char
char current = pattern.charAt(i), previous = pattern.charAt(j);
// we will have a case when we're somewhere in loop and two chars will not match, and j is not in base case.
while (j > 0 && current != previous)
// we decrement our j
j = LSP[j - 1];
// simply put, if two characters are same, then we update our LSP to say that at that point, we hold the j's value
if (current == previous)
// increment our j
j++;
// update the table
LSP[i] = j;
}
return LSP;
}
}
Cource code credit to Github
You may try this KMP implementation. It is O(m+n), as KMP is intended to be. It should be a lot faster:
private static int[] failureFunction(char[] pattern) {
int m = pattern.length;
int[] f = new int[pattern.length];
f[0] = 0;
int i = 1;
int j = 0;
while (i < m) {
if (pattern[i] == pattern[j]) {
f[i] = j + 1;
i++;
j++;
} else if (j > 0) {
j = f[j - 1];
} else {
f[i] = 0;
i++;
}
}
return f;
}
private static int kmpMatch(char[] text, char[] pattern) {
int[] f = failureFunction(pattern);
int m = pattern.length;
int n = text.length;
int i = 0;
int j = 0;
while (i < n) {
if (pattern[j] == text[i]) {
if (j == m - 1){
return i - (m - 1);
} else {
i++;
j++;
}
} else if (j > 0) {
j = f[j - 1];
} else {
i++;
}
}
return -1;
}
I am having a lot of trouble finding an efficient solution to Problem #9 in the UCF HSPT programming competition. The whole pdf can we viewed here, and the problem is called "Chomp Chomp!".
Essentially the problem involves taking 2 "chomps" out of an array, where each chomp is a continuous chain of elements from the array and the 2 chomps have to have at least element between them that's not "chomped." Once the two "chomps" are determined, the sum of all the elements in both "chomps" has to be a multiple of the number given in the input. My solution essentially is a brute-force that goes through every possible "chomp" and I tried to improve the speed of it by storing previously calculated sums of chomps.
My code:
import java.util.Arrays;
import java.util.HashMap;
import java.util.Scanner;
public class chomp {
static long[] arr;
public static long sum(int start, int end) {
long ret = 0;
for(int i = start; i < end; i++) {
ret+=arr[i];
}
return ret;
}
public static int sumArray(int[] arr) {
int sum = 0;
for(int i = 0; i < arr.length; i++) {
sum+=arr[i];
}
return sum;
}
public static long numChomps(long[] arr, long divide) {
HashMap<String, Long> map = new HashMap<>();
int k = 1;
long numchomps = 0;
while(true) {
if (k > arr.length-2) break;
for (int i = 0; i < arr.length -2; i++) {
if ((i+k)>arr.length-2) break;
String one = i + "";
String two = (i+k) + "";
String key1 = one + " " + two;
long first = 0;
if(map.containsKey(key1)) {
//System.out.println("Key 1 found!");
first = map.get(key1).longValue();
} else {
first = sum(i, i+k);
map.put(key1, new Long(first));
}
int kk = 1;
while(true){
if (kk > arr.length-2) break;
for (int j = i+k+1; j < arr.length; j++) {
if((j+kk) > arr.length) break;
String o = j + "";
String t = (j+kk) + "";
String key2 = o + " " + t;
long last = 0;
if(map.containsKey(key2)) {
//System.out.println("Key 2 found!");
last = map.get(key2).longValue();
} else {
last = sum(j, j+kk);
map.put(key2, new Long(last));
}
if (((first+last) % divide) == 0) {
numchomps++;
}
}
kk++;
}
}
k++;
}
return numchomps;
}
public static void main(String[] args) {
// TODO Auto-generated method stub
Scanner in = new Scanner(System.in);
int n = Integer.parseInt(in.nextLine());
for(int i = 1; i <= n; i++) {
int length = in.nextInt();
long divide = in.nextLong();
in.nextLine();
arr = new long[length];
for(int j = 0; j < length; j++) {
arr[j] = (in.nextLong());
}
//System.out.println(arr);
in.nextLine();
long blah = numChomps(arr, divide);
System.out.println("Plate #"+i + ": " + blah);
}
}
}
My code gets the right answer, but seems to take too long, especially for large inputs when the size of the array is 1000 or greater. I tried to improve the speed of my algorithm my storing previous sums calculated in a HashMap, but that didn't improve the speed of my program considerably. What can I do to improve the speed so it runs under 10 seconds?
The first source of inefficiency is constant recalculation of sums. You should make an auxiliary array of partial sums long [n] partial;, then instead of calling sum(i, i + k) you may simply do partial[i + k] - partial[i].
Now the problem reduces to finding indices i<j<k<m such that
(partial[j] - partial[i] + partial[m] - partial[k]) % divide == 0
or, rearranging terms,
(partial[j] + partial[m]) % divide == (partial[i] + partial[k]) % divide
To find them you may consider an array of triples (s, i, j) where s = (partial[j] - partial[i]) % divide, stable sort it by s, and inspect equal ranges for non-overlapping "chomps".
This approach improves performance from O(n4) to O(n2 log n). Now you shall be able to improve it to O(n log n).
I have to write a program to accept a String as input, and as output I'll have to print each and every alphabetical letter, and how many times each occurred in the user input. There are some constraints:
I cannot use built-in functions and collection
The printed result should be sorted by occurrence-value.
For example, with this input:
abbbccccdddddzz
I would expect this output:
a-1,z-2,b-3,c-4,d-5
This is what I have so far:
public static void isCountChar(String s) {
char c1[] = s.toCharArray();
int c3[] = new int[26];
for (int i = 0; i < c1.length; i++) {
char c = s.charAt(i);
c3[c - 'a']++;
}
for (int j = 0; j < c3.length; j++) {
if (c3[j] != 0) {
char c = (char) (j + 'a');
System.out.println("character is:" + c + " " + "count is: " + c3[j]);
}
}
}
But I don't know how to sort.
First of all a tip for your next question: The things you've stated in your comments would fit better as an edit to your question. Try to clearly state what the current result is, and what the expected result should be.
That being said, it was an interesting problem, because of the two constraints.
First of all you weren't allowed to use libraries or collections. If this wasn't a constraint I would have suggested a HashMap with character as keys, and int as values, and then the sorting would be easy.
Second constraint was to order by value. Most people here suggested a sorting like BubbleSort which I agree with, but it wouldn't work with your current code because it would sort by alphabetic character instead of output value.
With these two constraints it is probably best to fake key-value pairing yourself by making both an keys-array and values-array, and sort them both at the same time (with something like a BubbleSort-algorithm). Here is the code:
private static final int ALPHABET_SIZE = 26;
public static void isCountChar(String s)
{
// Convert input String to char-array (and uppercase to lowercase)
char[] array = s.toLowerCase().toCharArray();
// Fill the keys-array with the alphabet
char[] keys = new char[ALPHABET_SIZE];
for (int i = 0; i < ALPHABET_SIZE; i++)
{
keys[i] = (char)('a' + i);
}
// Count how much each char occurs in the input String
int[] values = new int[ALPHABET_SIZE];
for (char c : array)
{
values[c - 'a']++;
}
// Sort both the keys and values so the indexes stay the same
bubbleSort(keys, values);
// Print the output:
for (int j = 0; j < ALPHABET_SIZE; j++)
{
if (values[j] != 0)
{
System.out.println("character is: " + keys[j] + "; count is: " + values[j]);
}
}
}
private static void bubbleSort(char[] keys, int[] values)
{
// BUBBLESORT (copied from http://www.java-examples.com/java-bubble-sort-example and modified)
int n = values.length;
for(int i = 0; i < n; i++){
for(int j = 1; j < (n - i); j++){
if(values[j-1] > values[j]){
// Swap the elements:
int tempValue = values[j - 1];
values[j - 1] = values[j];
values[j] = tempValue;
char tempKey = keys[j - 1];
keys[j - 1] = keys[j];
keys[j] = tempKey;
}
}
}
}
Example usage:
public static void main (String[] args) throws java.lang.Exception
{
isCountChar("TestString");
}
Output:
character is: e; count is: 1
character is: g; count is: 1
character is: i; count is: 1
character is: n; count is: 1
character is: r; count is: 1
character is: s; count is: 2
character is: t; count is: 3
Here is a working ideone to see the input and output.
some sort: easy if not easiest to understand an coding
You loop from the first element to the end -1 : element K
compare element K and element K+1: if element K>element K+1, invert them
continue loop
if you made one change redo that !
I'm trying to show all combinations possible without using recursion.
I was trying it with a loop but it isn't working.
Without recursion(Not Working):
import java.util.Arrays;
public class Combination {
public static void main(String[] args) {
String[] arr = {"A","B","C","D","E","F"};
String[] result = new String[3];
int i = 0, len = 3;
while(len != 0 || i <= arr.length-len)
{
result[result.length - len] = arr[i];
len--;
i++;
}
if (len == 0){
System.out.println(Arrays.toString(result));
return;
}
}
}
With recursion(Working):
import java.util.Arrays;
public class Combination {
public static void main(String[] args){
String[] arr = {"A","B","C","D","E","F"};
combinations2(arr, 3, 0, new String[3]);
}
static void combinations2(String[] arr, int len, int startPosition, String[] result){
if (len == 0){
System.out.println(Arrays.toString(result));
return;
}
for (int i = startPosition; i <= arr.length-len; i++){
result[result.length - len] = arr[i];
combinations2(arr, len-1, i+1, result);
}
}
}
What am I doing wrong?
It is not clear what is the answer you are trying to get. The question should be set with example data to specify exactly what is it that you want to get. Given the part of the program you provided. I guess what you are looking for is all possible subsets.
the following code will get you all the possible subsets. You can easily modify this program to give you all possible subsets of a specific size (Try it to make sure you understand how it works).
static void subsets(String[]arr) {
int len = arr.length;
for (int i = 0; i < (1 << len); i++) { // 1 << len means 2 ^ len there are 2 ^ n subsets for a set of size n
/*
* We will use the bits of the integer i to represent which elements are taken
* a bit of value 1 means this element is included in the current subset
* a bit of value 0 means that we do not take the element in the given position
*/
String res = "[";
for(int j = 0; j < len; j++) { //loop over the bits of i from 0 to len -1
if ( (i & (1 << j)) != 0) {
//the jth bit is 1 in the integer i as such the jth
//element is taken in the current subset
res += (arr[j]+", ");
}
}
res += "]";
int lastCommaPosition = res.lastIndexOf(',');
if (res.lastIndexOf(',') != -1) { // a comma exists
res = res.substring(0, lastCommaPosition)+ "]";//remove the extra comma and space
}
System.out.println(res);
}
}
You can modify the method above to take an integer size for example, count the number of bits of value 1 (the number of elements taken) if this count is equal to the size you print it otherwise you don't. Feel free to modify the question if this is not the answer you wanted or ask in the comments if there is something you don't understand.
Here is how you can get the exact output without recursion. I am using for loops though:
for(int i = 0; i < arr.length - 2; i ++) {
for(int j = i + 1; j < arr.length - 1; j++ ) {
for(int k = j + 1; k < arr.length ; k++ ) {
System.out.println(arr[i] + ", " + arr[j] + ", " + arr[k]);
}
}
}