I want to find the missing number in a sequence of even numbers,
e.g. {2,4,6,8,10,14}; // output should be 12
I tried:
public class MissingNumber {
public static void main(String[] args) {
int a[] = {2,4,6,8,10,14};
int sum = 0;
for (int i = 0; i<a.length; i++) {
sum = sum + a[i];
}
int sum1 = 0;
for(int j=1; j<=7; j++) {
sum1 = sum1 + j;
}
System.out.println("missing number is:"+(sum1-sum));
}
}
You can use the idea that a[i+1] must equal a[i]+2 for your number stream to be considered valid. If a[i]+2 != a[i+1] then a[i]+2 is the number which is missing from your stream of even numbers:
public static void main(String[] args) {
int[] a = {2,4,6,8,10,14}; // create Array
Arrays.sort(a); // Sort Array (Just in case)
int missing = -1; // Default value if no missing int is found
for (int i = 0; i<a.length-1; i++) { // loop through Array
if(a[i]+2 != a[i+1]) { // check if expected next number exists
missing = a[i]+2; // sets missing to the missing number
break; // stops the for loop
}
}
System.out.println("missing number is: " + missing);
}
I'm not sure why you are looking at sums, unless you are certain that your series is always missing exactly one number.
Otherwise, how about something like this:
int a[] = {2,4,6,8,10,14};
int expected = 2;
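for (int val : a) {
    while (expected < val) { // report every even number skipped before val
        System.out.println("Missing number is " + expected);
        expected = expected + 2;
    }
    expected = val + 2; // next value we expect to see
}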
You can step through the expected even numbers and check each one against the array element.
public static void main(String[] args) {
int a[] = {2, 4, 6, 8, 10, 14};
int sum = 2;
for (int i = 0; i < a.length; i++) {
if (a[i] != sum) {
System.out.println("missing number is: " + sum);
break;
}
sum = sum + 2;
}
}
I made some changes to your code. With these changes, you will get the right answer.
public class MissingNumber {
public static void main(String[] args) {
int a[] = {2,4,6,8,10,14};
int sum = 0;
for (int i = 0; i<a.length; i++) {
sum = sum + a[i];
}
int sum1 = 0;
int even = 2;
for (int j=0; j<=a.length; j++) {
sum1 = sum1 + even;
even = even + 2;
}
System.out.println("missing number is:"+(sum1-sum));
}
}
There are lots of solutions above, but all of them have O(N) complexity.
I believe that you can solve this problem in O(log N) with binary search (https://en.wikipedia.org/wiki/Binary_search_algorithm), provided the array is sorted.
Raw code:
public class MissingNumber {
public static void main(String[] args) {
int a[] = {2,4,8,10, 12, 14};
int start = 0;
int end = a.length;
int pointer = 0;
while (end - start > 1) {
if (a[pointer] == (pointer + 1) * 2) {
start = pointer;
} else {
end = pointer;
}
pointer = (start + end) / 2;
}
System.out.println("Missing element: " + (pointer + 2) * 2);
}
}
You can add some more conditions.
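One possible way to add those conditions, as a minimal sketch: it assumes the array is sorted, is meant to start at 2, and has exactly one even number missing (possibly the first one). The class and method names are illustrative and not part of the answer above.

```java
class MissingEvenBinarySearch {
    // Assumes 'a' is sorted and would be 2, 4, 6, ... with one value removed.
    // Returns the missing even number, or -1 if the first a.length values are all present.
    static int findMissing(int[] a) {
        int lo = 0;         // every index before lo holds its expected value
        int hi = a.length;  // every index from hi onward is shifted by the gap
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] == (mid + 1) * 2) {
                lo = mid + 1; // prefix up to mid is intact, so the gap is to the right
            } else {
                hi = mid;     // the gap is at mid or to the left
            }
        }
        // lo is now the first index whose value differs from (index + 1) * 2
        return lo == a.length ? -1 : (lo + 1) * 2;
    }
}
```

Because the loop narrows down to the first index that does not hold its expected value, a missing first element (for example {4, 6, 8}) is handled without a special case.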
// Missing Number Fast
public int missingIntegerFast(int[] arr) {
int sum = 0;
for (int i : arr) {
sum += i;
}
int n = arr.length + 1; // how many even numbers 2, 4, ..., 2n there are when none is missing
return n * (n + 1) - sum; // 2 + 4 + ... + 2n = n * (n + 1)
}
You can do something like below:
// create the set of initial values to use them in filter step.
Set<Integer> givenValues = Arrays.stream(a).boxed().collect(Collectors.toSet());
OptionalInt first =
// generate range by 2 and limit it to size of input array
IntStream.iterate(2, i -> i + 2).limit(a.length)
// filter out only this value that are not in givenValues
.filter(i -> !givenValues.contains(i))
// get first value
.findFirst();
// finally, check whether any value was found; if not, fall back to a default value.
System.out.println("missing number is: "+first.orElse(-1));
This may be over-engineered, but it does not depend on the order of the given input.
The question is -
The arithmetic sequence, 1487, 4817, 8147, in which each of the terms
increases by 3330, is unusual in two ways: (i) each of the three terms
are prime, and, (ii) each of the 4-digit numbers are permutations of
one another.
There are no arithmetic sequences made up of three 1-, 2-, or 3-digit
primes, exhibiting this property, but there is one other 4-digit
increasing sequence.
What 12-digit number do you form by concatenating the three terms in
this sequence?
I've written this code -
package Problems;
import java.util.ArrayList;
import java.util.LinkedList;
public class Pro49 {
private static boolean isPrime(int n){
if(n%2 == 0) return false;
for(int i = 3; i<= Math.sqrt(n); i++){
if(n%i == 0) return false;
}
return true;
}
private static boolean isPerm(int m, int n){
ArrayList<Integer> mArr = new ArrayList<>();
ArrayList<Integer> nArr = new ArrayList<>();
for(int i = 0; i<4; i++){
mArr.add(m%10);
m /= 10;
}
for(int i = 0; i<4; i++){
nArr.add(n%10);
n /= 10;
}
return mArr.containsAll(nArr);
}
public static void main(String[] args) {
LinkedList<Integer> primes = new LinkedList<>();
for(int i = 1001; i<10000; i++){
if(isPrime(i)) primes.add(i);
}
int k = 0;
boolean breaker = false;
for(int i = 0; i<primes.size() - 2; i++){
for(int j = i + 1; j<primes.size() - 1; j++){
if(isPerm(primes.get(i), primes.get(j))) {
k = primes.get(j) + (primes.get(j) - primes.get(i));
if(k<10000 && primes.contains(k) && isPerm(primes.get(i), k)) {
System.out.println(primes.get(i) + "\n" + primes.get(j) + "\n" + k);
breaker = true;
break;
}
}
if(breaker) break;
}
if(breaker) break;
}
}
}
I added the print line System.out.println(primes.get(i) + "\n" + primes.get(j) + "\n" + k); to check the numbers. I got 1049, 1499, 1949 which are wrong. (At least 1049 is wrong I guess).
Can anyone point out where my code/logic is wrong?
Any help is appreciated.
I think where your logic is going wrong is your isPerm method. You are using AbstractCollection#containsAll, which, AFAIK, only checks if the parameters are in the collection at least once.
i.e. it basically does
for(E e : collection)
if(!this.contains(e)) return false;
return true;
Therefore, for example, 4999 will be a permutation of 49 because 49 contains 4 and 9 (while it is clearly not based on your example).
The reason why your method seems to work for these values is that you are looping a fixed number of times, namely 4. For a number like 49 you will end up with {9, 4, 0, 0} instead of {9, 4}. Do something like this:
while(n != 0) {
nArr.add(n%10);
n /= 10;
}
and you will get the correct digit Lists (and see that containsAll won't work.)
Add the 4-digit restriction elsewhere (e.g. in your loop.)
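To make the problem concrete, here is a small sketch that mirrors the membership-only check from the question. The class name and the helper names `digits` and `looksLikePerm` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ContainsAllPitfall {
    // Collects the decimal digits of n, least significant digit first.
    static List<Integer> digits(int n) {
        List<Integer> out = new ArrayList<>();
        while (n != 0) {
            out.add(n % 10);
            n /= 10;
        }
        return out;
    }

    // Same idea as the question's isPerm: only asks whether each digit of n
    // occurs somewhere in m, without counting how often it occurs.
    static boolean looksLikePerm(int m, int n) {
        return digits(m).containsAll(digits(n));
    }
}
```

With this check, `looksLikePerm(1049, 1499)` is true, because 9, 9, 4 and 1 all occur in 1049, even though the two numbers are not permutations of each other. That is consistent with the 1049, 1499, 1949 output reported in the question.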
Maybe you could check the occurrences per digit.
For example:
int[] occurrencesA = new int[10], occurrencesB = new int[10];
for(; m != 0; m /= 10)
occurrencesA[m % 10]++;
for(; n != 0; n /= 10)
occurrencesB[n % 10]++;
for(int i = 0; i < 10; i++)
if(occurrencesA[i] != occurrencesB[i]) return false;
return true;
I found a possible alternative for isPerm
private static boolean isPerm(int m, int n){
ArrayList<Integer> mArr = new ArrayList<>();
ArrayList<Integer> nArr = new ArrayList<>();
final String mS = Integer.toString(m);
final String nS = Integer.toString(n);
if(mS.length() != nS.length()) return false;
for(int i = 0; i<mS.length(); i++){
mArr.add(m%10);
m /= 10;
}
for(int i = 0; i<nS.length(); i++){
nArr.add(n%10);
n /= 10;
}
return (mArr.containsAll(nArr) && nArr.containsAll(mArr));
}
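This gives me the correct answer for this problem. Note that checking containsAll in both directions still ignores how often each digit occurs (for example, 1123 and 1223 would be accepted), so it is not a general permutation test. Another alternative, which counts the digits, is posted by someone else below.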
The program I have currently takes N numbers and then a goal target. It inserts either "+" or "*" in between the numbers to try to reach the goal. If it can reach the goal, it prints out the correct operations.
However the way it finds the answer is by brute force, which is inadequate for a large set of N numbers. My current code is below:
public class Arithmetic4{
private static ArrayList<String> input = new ArrayList<String>();
private static ArrayList<String> second_line = new ArrayList<String>();
private static ArrayList<Integer> numbers = new ArrayList<Integer>();
private static ArrayList<String> operations = new ArrayList<String>();
private static ArrayList<Integer> temp_array = new ArrayList<Integer>();
public static void main(String [] args){
Scanner sc = new Scanner(System.in);
while(sc.hasNextLine()){
readInput(sc);
}
}
public static void readInput(Scanner sc){
String line = sc.nextLine();
input.add(line);
line = sc.nextLine();
second_line.add(line);
dealInput();
}
public static void dealInput(){
String numberS = input.get(0);
String[] stringNumbers = numberS.split("\\s+");
for(int i = 0; i < stringNumbers.length; i++){
String numberAsString = stringNumbers[i];
numbers.add(Integer.parseInt(numberAsString));
}
String orderString = second_line.get(0);
String[] stringWhatWay = orderString.split("\\s+");
int target = Integer.parseInt(stringWhatWay[0]);
char whatway = stringWhatWay[1].charAt(0);
long startTime = System.currentTimeMillis();
whatEquation(numbers, target, whatway);
long elapsedTime = System.currentTimeMillis() - startTime;
long elapsedMSeconds = elapsedTime / 1;
System.out.println(elapsedMSeconds);
numbers.clear();
input.clear();
second_line.clear();
}
public static void whatEquation(ArrayList<Integer> numbers, int target, char whatway){
if(whatway != 'L' && whatway != 'N'){
System.out.println("Not an option");
}
if(whatway == 'N'){
ArrayList<Integer> tempo_array = new ArrayList<Integer>(numbers);
int count = 0;
for (int y: numbers) {
count++;
}
count--;
int q = count;
calculateN(numbers, target, tempo_array, q);
}
if (whatway == 'L'){
if(numbers.size() == 1){
System.out.println("L " + numbers.get(0));
}
ArrayList<Integer> temp_array = new ArrayList<Integer>(numbers);
calculateL(numbers, target, temp_array);
}
}
public static void calculateN(ArrayList<Integer> numbers, int target, ArrayList<Integer> tempo_numbers, int q){
int sum = 0;
int value_inc = 0;
int value_add;
boolean firstRun = true;
ArrayList<Character> ops = new ArrayList<Character>();
ops.add('+');
ops.add('*');
for(int i = 0; i < Math.pow(2, q); i++){
String bin = Integer.toBinaryString(i);
while(bin.length() < q)
bin = "0" + bin;
char[] chars = bin.toCharArray();
List<Character> oList = new ArrayList<Character> ();
for(char c: chars){
oList.add(c);
}
ArrayList<Character> op_array = new ArrayList<Character>();
ArrayList<Character> temp_op_array = new ArrayList<Character>();
for (int j = 0; j < oList.size(); j++) {
if (oList.get(j) == '0') {
op_array.add(j, ops.get(0));
temp_op_array.add(j, ops.get(0));
} else if (oList.get(j) == '1') {
op_array.add(j, ops.get(1));
temp_op_array.add(j, ops.get(1));
}
}
sum = 0;
for(int p = 0; p < op_array.size(); p++){
if(op_array.get(p) == '*'){
int multiSum = numbers.get(p) * numbers.get(p+1);
numbers.remove(p);
numbers.remove(p);
numbers.add(p, multiSum);
op_array.remove(p);
p -= 1;
}
}
for(Integer n: numbers){
sum += n;
}
if(sum != target){
numbers.clear();
for (int t = 0; t < tempo_numbers.size(); t++) {
numbers.add(t, tempo_numbers.get(t));
}
}
if (sum == target){
int count_print_symbol = 0;
System.out.print("N ");
for(int g = 0; g < tempo_numbers.size(); g++){
System.out.print(tempo_numbers.get(g) + " ");
if(count_print_symbol == q){
break;
}
System.out.print(temp_op_array.get(count_print_symbol) + " ");
count_print_symbol++;
}
System.out.print("\n");
return;
}
}
System.out.println("N is Impossible");
}
public static void calculateL(ArrayList<Integer> numbers, int target, ArrayList<Integer> temp_array){
int op_count = 0;
int sum = 0;
int n = (numbers.size() -1);
boolean firstRun = true;
for (int i = 0; i < Math.pow(2, n); i++) {
String bin = Integer.toBinaryString(i);
while (bin.length() < n)
bin = "0" + bin;
char[] chars = bin.toCharArray();
char[] charArray = new char[n];
for (int j = 0; j < chars.length; j++) {
charArray[j] = chars[j] == '0' ? '+' : '*';
}
//System.out.println(charArray);
for(char c : charArray){
op_count++;
if(firstRun == true){
sum = numbers.get(0);
numbers.remove(0);
// System.out.println(sum);
}
if (!numbers.isEmpty()){
if (c == '+') {
sum += numbers.get(0);
} else if (c == '*') {
sum *= numbers.get(0);
}
numbers.remove(0);
}
firstRun = false;
//System.out.println(sum);
if(sum == target && op_count == n){
int count_print_op = 0;
System.out.print("L ");
for(int r = 0; r < temp_array.size(); r++){
System.out.print(temp_array.get(r) + " ");
if(count_print_op == n){
break;
}
System.out.print(charArray[count_print_op] + " ");
count_print_op++;
}
System.out.print("\n");
return;
}
if(op_count == n && sum != target){
firstRun = true;
sum = 0;
op_count = 0;
for(int e = 0; e < temp_array.size(); e++){
numbers.add(e, temp_array.get(e));
}
}
}
}
System.out.println("L is impossible");
}
}
Is there a faster way to reach a similar conclusion?
This problem can be solved in O(NK²) using the Dynamic Programming paradigm, where K is the maximum possible value for the goal target. This is not that good and maybe there is a faster algorithm, but it's still a lot better than the O(2^N) brute force solution.
First let's define a recurrence to solve the problem: let G be the goal value and f(i,j,k) be a function that returns:
1 if we can reach the value G-j-k using only elements from index i and onwards
0 otherwise
We are going to use j as an accumulator that holds the current total sum and k as an accumulator that holds the total product of the current chain of multiplications; the reason for this split will become clear below.
The base cases for the recurrence are:
f(N,x,y) = 1 if x+y = G (we have used every element and reached our goal)
f(N,x,y) = 0 otherwise
f(i,x,y) = 0 if i != N and x+y > G (we have exceeded the goal before using every element; this assumes all numbers are positive integers)
For other i values we can define the recurrence as:
f(i,j,k) = max( f(i+1,j+k,v[i]) , f(i+1,j,k*v[i]) )
The first function call inside max() means that we will put a "+" sign before the current index, so our current multiplication chain is broken and we have to add its total product to the current sum, so the second parameter is j+k, and since we are starting a new multiplication chain right now, its total product is exactly v[i].
The second function call inside max() means that we will put a "*" sign before the current index, so our current multiplication chain is still going on, so the second parameter remains j, and the third parameter will become k * v[i].
What we want is the value of f(0,0,0) (we haven't used any elements, and our current accumulated sums are equal to 0). f(0,0,0) equals 1 if and only if there is a solution for the problem, so the problem is solved. Now let's go back to the recurrence and fix a detail: when we run f(0,0,0), the value of k*v[i] will be 0 no matter the value of v[i], so we have to add a special check when we are computing the answer for i = 0, and the final recurrence will look like this:
f(i,j,k) = max( f(i+1,j+k,v[i]) , f(i+1,j,(i==0?v[i]:k*v[i])) )
Finally, we apply the memoization/dynamic programming paradigm to optimize the calculation of the recurrence. During the execution of the algorithm, we will keep track of every calculated state so when this state is called again by another recursive call we just return the stored value instead of computing its whole recursion tree again. Don't forget to do this or your solution is going to be as slow as a brute force solution (or even worse) due to recalculation of subproblems. If you need some resources on DP, you can start here: https://en.wikipedia.org/wiki/Dynamic_programming
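A minimal memoized sketch of the recurrence above, written under a few assumptions that are not in the original answer: all numbers are positive integers, the goal fits in an int-sized range, and states are cached in a HashMap keyed by the string "i,j,k". The class and method names are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

class PlusTimesReachability {
    private final int[] v;
    private final long goal;
    // Cache of already computed states (i, j, k) -> result.
    private final Map<String, Boolean> memo = new HashMap<>();

    private PlusTimesReachability(int[] v, long goal) {
        this.v = v;
        this.goal = goal;
    }

    // f(i, j, k): j is the sum of the finished terms, k the product of the
    // current multiplication chain, i the next index to place.
    private boolean f(int i, long j, long k) {
        if (j + k > goal) return false;          // positive inputs never shrink the value
        if (i == v.length) return j + k == goal; // every element used
        String key = i + "," + j + "," + k;
        Boolean cached = memo.get(key);
        if (cached != null) return cached;
        boolean result =
                f(i + 1, j + k, v[i])                         // '+' before v[i]
                || f(i + 1, j, i == 0 ? v[i] : k * v[i]);     // '*' before v[i]
        memo.put(key, result);
        return result;
    }

    // True if some choice of '+' and '*' (normal precedence) reaches the goal.
    static boolean reachable(int[] v, long goal) {
        return new PlusTimesReachability(v, goal).f(0, 0, 0);
    }
}
```

For example, `reachable(new int[]{1, 2, 3}, 7)` is true because 1 + 2 * 3 = 7, while a goal of 9 is not reachable with those three numbers. Recovering the actual operators would need an extra step, such as storing which branch succeeded for each state.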
Given a string s, what is the fastest method to generate a set of all its unique substrings?
Example: for str = "aba" we would get substrs={"a", "b", "ab", "ba", "aba"}.
The naive algorithm would be to traverse the entire string, generating substrings of length 1..n in each iteration, yielding an O(n^2) upper bound.
Is a better bound possible?
(this is technically homework, so pointers-only are welcome as well)
As other posters have said, there are potentially O(n^2) substrings for a given string, so printing them out cannot be done faster than that. However there exists an efficient representation of the set that can be constructed in linear time: the suffix tree.
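A full suffix tree is too long to show here. As a related sketch, the code below uses a different structure, a suffix automaton, which likewise represents every substring of the string compactly and can be built in roughly linear time (expected, given the hash maps used for transitions). It only counts the distinct substrings; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DistinctSubstringCounter {
    // Counts the distinct non-empty substrings of s with a suffix automaton.
    static long countDistinct(String s) {
        int max = 2 * Math.max(1, s.length()); // an automaton has fewer than 2n states
        int[] len = new int[max];              // longest substring length per state
        int[] link = new int[max];             // suffix link per state
        List<Map<Character, Integer>> next = new ArrayList<>();
        next.add(new HashMap<>());             // state 0 is the initial state
        link[0] = -1;
        int size = 1, last = 0;
        for (char c : s.toCharArray()) {
            int cur = size++;
            next.add(new HashMap<>());
            len[cur] = len[last] + 1;
            int p = last;
            // Add the new transition along the suffix links until one already exists.
            while (p != -1 && !next.get(p).containsKey(c)) {
                next.get(p).put(c, cur);
                p = link[p];
            }
            if (p == -1) {
                link[cur] = 0;
            } else {
                int q = next.get(p).get(c);
                if (len[p] + 1 == len[q]) {
                    link[cur] = q;
                } else {
                    // Split q: the clone keeps the shorter substrings.
                    int clone = size++;
                    next.add(new HashMap<>(next.get(q)));
                    len[clone] = len[p] + 1;
                    link[clone] = link[q];
                    while (p != -1 && next.get(p).getOrDefault(c, -1) == q) {
                        next.get(p).put(c, clone);
                        p = link[p];
                    }
                    link[q] = clone;
                    link[cur] = clone;
                }
            }
            last = cur;
        }
        // Each state v contributes len[v] - len[link[v]] distinct substrings.
        long total = 0;
        for (int v = 1; v < size; v++) total += len[v] - len[link[v]];
        return total;
    }
}
```

For the example from the question, `countDistinct("aba")` gives 5, matching {"a", "b", "ab", "ba", "aba"}. Listing the substrings themselves still costs time proportional to their total length.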
There is no way to do this faster than O(n^2) because a string has O(n^2) substrings in total: if you have to generate them all, their number will be n(n + 1) / 2 in the worst case, hence the lower bound of Ω(n^2).
The first one is brute force, which has complexity O(N^3) and could be brought down to O(N^2 log(N)).
The second one uses a HashSet and has complexity O(N^2).
The third one uses LCP, after first finding all the suffixes of the given string; it has worst case O(N^2) and best case O(N log(N)).
First Solution:-
import java.util.Scanner;
public class DistinctSubString {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
System.out.print("Enter The string");
String s = in.nextLine();
long startTime = System.currentTimeMillis();
int L = s.length();
int N = L * (L + 1) / 2;
String[] Comb = new String[N];
for (int i = 0, p = 0; i < L; ++i) {
for (int j = 0; j < (L - i); ++j) {
Comb[p++] = s.substring(j, i + j + 1);
}
}
/*
* for(int j=0;j<N;++j) { System.out.println(Comb[j]); }
*/
boolean[] val = new boolean[N];
for (int i = 0; i < N; ++i)
val[i] = true;
int counter = N;
int p = 0, start = 0;
for (int i = 0, j; i < L; ++i) {
p = L - i;
for (j = start; j < (start + p); ++j) {
if (val[j]) {
//System.out.println(Comb[j]);
for (int k = j + 1; k < start + p; ++k) {
if (Comb[j].equals(Comb[k])) {
counter--;
val[k] = false;
}
}
}
}
start = j;
}
System.out.println("Substrings are " + N
+ " of which unique substrings are " + counter);
long endTime = System.currentTimeMillis();
System.out.println("It took " + (endTime - startTime) + " milliseconds");
}
}
Second Solution:-
import java.util.*;
public class DistictSubstrings_usingHashTable {
public static void main(String args[]) {
// create a hash set
Scanner in = new Scanner(System.in);
System.out.print("Enter The string");
String s = in.nextLine();
int L = s.length();
long startTime = System.currentTimeMillis();
Set<String> hs = new HashSet<String>();
// add elements to the hash set
for (int i = 0; i < L; ++i) {
for (int j = 0; j < (L - i); ++j) {
hs.add(s.substring(j, i + j + 1));
}
}
System.out.println(hs.size());
long endTime = System.currentTimeMillis();
System.out.println("It took " + (endTime - startTime) + " milliseconds");
}
}
Third Solution:-
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
public class LCPsolnFroDistinctSubString {
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Enter Desired String ");
String string = br.readLine();
int length = string.length();
String[] arrayString = new String[length];
for (int i = 0; i < length; ++i) {
arrayString[i] = string.substring(length - 1 - i, length);
}
Arrays.sort(arrayString);
for (int i = 0; i < length; ++i)
System.out.println(arrayString[i]);
long num_substring = arrayString[0].length();
for (int i = 0; i < length - 1; ++i) {
int j = 0;
for (; j < arrayString[i].length(); ++j) {
if (!((arrayString[i].substring(0, j + 1)).equals((arrayString)[i + 1]
.substring(0, j + 1)))) {
break;
}
}
num_substring += arrayString[i + 1].length() - j;
}
System.out.println("unique substrings = " + num_substring);
}
}
Fourth Solution:-
public static void printAllCombinations(String soFar, String rest) {
if(rest.isEmpty()) {
System.out.println(soFar);
} else {
printAllCombinations(soFar + rest.substring(0,1), rest.substring(1));
printAllCombinations(soFar , rest.substring(1));
}
}
Test case:- printAllCombinations("", "abcd");
As for big O: the best you could do would be O(n^2).
No need to reinvent the wheel. These resources are based on sets rather than strings, so you will have to take the concepts and apply them to your own situation.
Algorithms
Really Good White Paper from MS
In depth PowerPoint
Blog on string perms
Well, since there are potentially n*(n+1)/2 different substrings (+1 for the empty substring), I doubt you can do better than O(n^2) (worst case). The easiest thing is to generate them and use some nice O(1) lookup table (such as a hashmap) to exclude duplicates right when you find them.
class SubstringsOfAString {
public static void main(String args[]) {
String string = "Hello", sub = null;
System.out.println("Substrings of \"" + string + "\" are :-");
for (int i = 0; i < string.length(); i++) {
for (int j = 1; j <= string.length() - i; j++) {
sub = string.substring(i, j + i);
System.out.println(sub);
}
}
}
}
class program
{
List<String> lst = new List<String>();
String str = "abc";
public void func()
{
subset(0, "");
lst.Sort();
lst = lst.Distinct().ToList();
foreach (String item in lst)
{
Console.WriteLine(item);
}
}
void subset(int n, String s)
{
for (int i = n; i < str.Length; i++)
{
lst.Add(s + str[i].ToString());
subset(i + 1, s + str[i].ToString());
}
}
}
This prints unique substrings.
https://ideone.com/QVWOh0
def uniq_substring(test):
    lista = []
    [lista.append(test[i:i+k+1]) for i in range(len(test)) for k in
     range(len(test)-i) if test[i:i+k+1] not in lista and
     test[i:i+k+1][::-1] not in lista]
    print(lista)
uniq_substring('rohit')
uniq_substring('abab')
['r', 'ro', 'roh', 'rohi', 'rohit', 'o', 'oh', 'ohi', 'ohit', 'h',
'hi', 'hit', 'i', 'it', 't']
['a', 'ab', 'aba', 'abab', 'b', 'bab']
Many answers that use two for loops and a .substring() call claim O(N^2) time complexity. However, it is important to note that the worst case for a .substring() call in Java (since Java 7 update 6) is O(N), because the characters are copied. So by adding a .substring() call inside the loops, the exponent of N increases by one.
Therefore, two for loops with a .substring() call inside them have O(N^3) time complexity.
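To put a number on that, the following sketch adds up how many characters are copied when every substring(i, j) of a string of length n is materialized once. It assumes each call copies exactly j - i characters; the class and method names are illustrative.

```java
class SubstringCopyCost {
    // Total characters copied when every substring of a length-n string is
    // created once, assuming substring(i, j) copies j - i characters.
    static long totalCharsCopied(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j <= n; j++) {
                total += j - i; // cost of one substring(i, j) call
            }
        }
        return total;
    }
}
```

The sum works out to n(n + 1)(n + 2) / 6, which grows cubically: for n = 4 it is 20 copied characters for only 10 substrings.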
It can only be done in O(n^2) time, as the total number of substrings of a string (all of them unique in the worst case) is n(n+1)/2.
Example:
string s = "abcd"
pass 0: (all the strings are of length 1)
a, b, c, d = 4 strings
pass 1: (all the strings are of length 2)
ab, bc, cd = 3 strings
pass 2: (all the strings are of length 3)
abc, bcd = 2 strings
pass 3: (all the strings are of length 4)
abcd = 1 string
Using this pattern, we can write a solution with O(n^2) time complexity and constant space complexity.
The source code is as below:
#include<stdio.h>
void print(char arr[], int start, int end)
{
int i;
for(i=start;i<=end;i++)
{
printf("%c",arr[i]);
}
printf("\n");
}
void substrings(char arr[], int n)
{
int pass,j,start,end;
int no_of_strings = n-1;
for(pass=0;pass<n;pass++)
{
start = 0;
end = start+pass;
for(j=no_of_strings;j>=0;j--)
{
print(arr,start, end);
start++;
end = start+pass;
}
no_of_strings--;
}
}
int main()
{
char str[] = "abcd";
substrings(str,4);
return 0;
}
The naive algorithm takes O(n^3) time, not O(n^2) time.
There are O(n^2) substrings.
If you put those O(n^2) substrings into, for example, a set,
then the set performs O(lg n) comparisons for each string to check whether it already exists in the set.
In addition, each string comparison takes O(n) time.
Therefore, it takes O(n^3 lg n) time if you use a set, and you can reduce that to O(n^3) time if you use a hash table instead.
The point is that these are string comparisons, not number comparisons.
One of the best approaches is to use a suffix array together with the longest common prefix (LCP) algorithm, which reduces the time for this problem to O(n^2):
Building the suffix array takes O(n) time with a linear-time algorithm.
Time for one LCP = O(n).
Since LCP is computed for each pair of adjacent strings in the suffix array, the total time to find the number of distinct substrings is O(n^2).
In addition, if you want to print all distinct substrings, that takes O(n^2) time.
Try this code using a suffix array and longest common prefix. It can also give you the total number of unique substrings. The code might give a stack overflow in visual studio but runs fine in Eclipse C++. That's because it returns vectors for functions. Haven't tested it against extremely long strings. Will do so and report back.
// C++ program for building LCP array for given text
#include <bits/stdc++.h>
#include <vector>
#include <string>
using namespace std;
#define MAX 100000
int cum[MAX];
// Structure to store information of a suffix
struct suffix
{
int index; // To store original index
int rank[2]; // To store ranks and next rank pair
};
// A comparison function used by sort() to compare two suffixes
// Compares two pairs, returns 1 if first pair is smaller
int cmp(struct suffix a, struct suffix b)
{
return (a.rank[0] == b.rank[0])? (a.rank[1] < b.rank[1] ?1: 0):
(a.rank[0] < b.rank[0] ?1: 0);
}
// This is the main function that takes a string 'txt' of size n as an
// argument, builds and return the suffix array for the given string
vector<int> buildSuffixArray(string txt, int n)
{
// A structure to store suffixes and their indexes
struct suffix suffixes[n];
// Store suffixes and their indexes in an array of structures.
// The structure is needed to sort the suffixes alphabetically
// and maintain their old indexes while sorting
for (int i = 0; i < n; i++)
{
suffixes[i].index = i;
suffixes[i].rank[0] = txt[i] - 'a';
suffixes[i].rank[1] = ((i+1) < n)? (txt[i + 1] - 'a'): -1;
}
// Sort the suffixes using the comparison function
// defined above.
sort(suffixes, suffixes+n, cmp);
// At this point, all suffixes are sorted according to first
// 2 characters. Let us sort suffixes according to first 4
// characters, then first 8 and so on
int ind[n]; // This array is needed to get the index in suffixes[]
// from original index. This mapping is needed to get
// next suffix.
for (int k = 4; k < 2*n; k = k*2)
{
// Assigning rank and index values to first suffix
int rank = 0;
int prev_rank = suffixes[0].rank[0];
suffixes[0].rank[0] = rank;
ind[suffixes[0].index] = 0;
// Assigning rank to suffixes
for (int i = 1; i < n; i++)
{
// If first rank and next ranks are same as that of previous
// suffix in array, assign the same new rank to this suffix
if (suffixes[i].rank[0] == prev_rank &&
suffixes[i].rank[1] == suffixes[i-1].rank[1])
{
prev_rank = suffixes[i].rank[0];
suffixes[i].rank[0] = rank;
}
else // Otherwise increment rank and assign
{
prev_rank = suffixes[i].rank[0];
suffixes[i].rank[0] = ++rank;
}
ind[suffixes[i].index] = i;
}
// Assign next rank to every suffix
for (int i = 0; i < n; i++)
{
int nextindex = suffixes[i].index + k/2;
suffixes[i].rank[1] = (nextindex < n)?
suffixes[ind[nextindex]].rank[0]: -1;
}
// Sort the suffixes according to first k characters
sort(suffixes, suffixes+n, cmp);
}
// Store indexes of all sorted suffixes in the suffix array
vector<int>suffixArr;
for (int i = 0; i < n; i++)
suffixArr.push_back(suffixes[i].index);
// Return the suffix array
return suffixArr;
}
/* To construct and return LCP */
vector<int> kasai(string txt, vector<int> suffixArr)
{
int n = suffixArr.size();
// To store LCP array
vector<int> lcp(n, 0);
// An auxiliary array to store inverse of suffix array
// elements. For example if suffixArr[0] is 5, the
// invSuff[5] would store 0. This is used to get next
// suffix string from suffix array.
vector<int> invSuff(n, 0);
// Fill values in invSuff[]
for (int i=0; i < n; i++)
invSuff[suffixArr[i]] = i;
// Initialize length of previous LCP
int k = 0;
// Process all suffixes one by one starting from
// first suffix in txt[]
for (int i=0; i<n; i++)
{
/* If the current suffix is at n-1, then we don’t
have next substring to consider. So lcp is not
defined for this substring, we put zero. */
if (invSuff[i] == n-1)
{
k = 0;
continue;
}
/* j contains index of the next substring to
be considered to compare with the present
substring, i.e., next string in suffix array */
int j = suffixArr[invSuff[i]+1];
// Directly start matching from k'th index as
// at-least k-1 characters will match
while (i+k<n && j+k<n && txt[i+k]==txt[j+k])
k++;
lcp[invSuff[i]] = k; // lcp for the present suffix.
// Deleting the starting character from the string.
if (k>0)
k--;
}
// return the constructed lcp array
return lcp;
}
// Utility function to print an array
void printArr(vector<int>arr, int n)
{
for (int i = 0; i < n; i++)
cout << arr[i] << " ";
cout << endl;
}
// Driver program
int main()
{
int t;
cin >> t;
//t = 1;
while (t > 0) {
//string str = "banana";
string str;
cin >> str; // >> k;
vector<int>suffixArr = buildSuffixArray(str, str.length());
int n = suffixArr.size();
cout << "Suffix Array : \n";
printArr(suffixArr, n);
vector<int>lcp = kasai(str, suffixArr);
cout << "\nLCP Array : \n";
printArr(lcp, n);
// cum will hold the number of substrings if that's what you want (total = cum[n-1])
cum[0] = n - suffixArr[0];
// vector <pair<int,int>> substrs[n];
int count = 1;
for (int i = 1; i <= n-suffixArr[0]; i++) {
//substrs[0].push_back({suffixArr[0],i});
string sub_str = str.substr(suffixArr[0],i);
cout << count << " " << sub_str << endl;
count++;
}
for(int i = 1;i < n;i++) {
cum[i] = cum[i-1] + (n - suffixArr[i] - lcp[i - 1]);
int end = n - suffixArr[i];
int begin = lcp[i-1] + 1;
int begin_suffix = suffixArr[i];
for (int j = begin, k = 1; j <= end; j++, k++) {
//substrs[i].push_back({begin_suffix, lcp[i-1] + k});
// cout << "i push " << i << " " << begin_suffix << " " << k << endl;
string sub_str = str.substr(begin_suffix, lcp[i-1] +k);
cout << count << " " << sub_str << endl;
count++;
}
}
/*int count = 1;
cout << endl;
for(int i = 0; i < n; i++){
for (auto it = substrs[i].begin(); it != substrs[i].end(); ++it ) {
string sub_str = str.substr(it->first, it->second);
cout << count << " " << sub_str << endl;
count++;
}
}*/
t--;
}
return 0;
}
And here's a simpler algorithm:
#include <iostream>
#include <string.h>
#include <vector>
#include <string>
#include <algorithm>
#include <time.h>
using namespace std;
char txt[100000], *p[100000];
int m, n;
int cmp(const void *p, const void *q) {
int rc = memcmp(*(char **)p, *(char **)q, m);
return rc;
}
int main() {
std::cin >> txt;
int start_s = clock();
n = strlen(txt);
int k; int i;
int count = 1;
for (m = 1; m <= n; m++) {
for (k = 0; k+m <= n; k++)
p[k] = txt+k;
qsort(p, k, sizeof(p[0]), &cmp);
for (i = 0; i < k; i++) {
if (i != 0 && cmp(&p[i-1], &p[i]) == 0){
continue;
}
char cur_txt[100000];
memcpy(cur_txt, p[i],m);
cur_txt[m] = '\0';
std::cout << count << " " << cur_txt << std::endl;
count++;
}
}
cout << --count << endl;
int stop_s = clock();
float run_time = (stop_s - start_s) / double(CLOCKS_PER_SEC);
cout << endl << "distinct substrings \t\tExecution time = " << run_time << " seconds" << endl;
return 0;
}
Both algorithms listed are simply too slow for extremely long strings, though. I tested them against a string of length over 47,000 and they took over 20 minutes to complete, with the first one taking 1200 seconds and the second one taking 1360 seconds, and that's just counting the unique substrings without outputting to the terminal. So you can probably expect a workable solution for strings of length up to about 1000. Both solutions did compute the same total number of unique substrings, though. I also tested both algorithms against strings of length 2000 and 10,000. The times for the first algorithm were 0.33 s and 12 s; for the second algorithm they were 0.535 s and 20 s. So it looks like the first algorithm is generally faster.
Here is my code in Python. It generates all possible subsequences of a given string (characters kept in order but not necessarily adjacent), which includes all of its substrings.
def find_substring(str_in):
    substrs = []
    if len(str_in) <= 1:
        return [str_in]
    s1 = find_substring(str_in[:1])
    s2 = find_substring(str_in[1:])
    for s11 in s1:
        substrs.append(s11)
        for s21 in s2:
            substrs.append("%s%s" % (s11, s21))
    for s21 in s2:
        substrs.append(s21)
    return set(substrs)
If you pass str_ = "abcdef" to the function, it generates the following results:
a, ab, abc, abcd, abcde, abcdef, abcdf, abce, abcef, abcf, abd, abde, abdef, abdf, abe, abef, abf, ac, acd, acde, acdef, acdf, ace, acef, acf, ad, ade, adef, adf, ae, aef, af, b, bc, bcd, bcde, bcdef, bcdf, bce, bcef, bcf, bd, bde, bdef, bdf, be, bef, bf, c, cd, cde, cdef, cdf, ce, cef, cf, d, de, def, df, e, ef, f
This program is meant to collect data on the speed of three search algorithms (linear, binary, and random). I've extensively tested each search method called and they all work. I know the error is in the for loop that uses k, but I'm not sure what it is or why it happens. I get the proper results for linear[0], binary[0], and random[0], but cannot get an answer for any other position in the arrays within the loop (which makes sense, since the other positions have not been filled yet). Outside of the loop, though, I can't print out any part of any of the arrays.
import java.util.Random;
/*class to generate testing data*/
public class GenerateData{
public static void main(String[] args) {
/*creates and runs through each power of 2 array*/
for(int i=3; i < 5; i++) {
/*calculates a power of 2 starting at 8*/
int n = (int)Math.pow(2, i);
/*creates an array of a power of 2 starting at 8*/
int [] test = new int[n];
/*generates a starting testing point for the array*/
int start = 3;
/*fills the array with sorted testing data*/
for(int j=0; j < n; j++) {
test[j] = start;
start = start + 2;
}
/*creates an array to store linear testing times*/
long [] linear = new long[10];
/*creates an array to store binary testing times*/
long [] binary = new long[10];
/*creates an array to store random testing times*/
long [] random = new long[10];
/*runs through the search algorithms ten times*/
for(int k=0; k < 10; k++) {
/*generates a random number to test no larger than the largest
value in the array*/
int queryValue = (int)Math.floor(Math.random()*(start-1));
/*tests the array in each algorithm, keeping track of start and
end time*/
long linearStartTime = System.nanoTime();
int linearTest = LinearSearch.linearSearch(queryValue, test);
/*calculates the time for linear algorithm and adds it to
an array keeping track of linear algorithm run times*/
linear[k] = System.nanoTime() - linearStartTime;
long binaryStartTime = System.nanoTime();
int binaryTest = BinarySearch.binarySearch(queryValue, test);
/*calculates the time for binary algorithm and adds it to
an array keeping track of binary algorithm run times*/
binary[k] = System.nanoTime() - binaryStartTime;
long randomStartTime = System.nanoTime();
int randomTest = RandomSearch.randomSearch(queryValue, test);
/*calculates the time for random algorithm and adds it to
an array keeping track of random algorithm run times*/
random[k] = System.nanoTime() - randomStartTime;
}
/*placeholder initial values for mins, maxes, and avgs*/
long linMin = linear[1];
long binMin = binary[1];
long randMin = random[1];
long linMax = linear[1];
long binMax = binary[1];
long randMax = random[1];
long linAvg = 0;
long binAvg = 0;
long randAvg = 0;
/*cycles through the arrays calculating the min, max, and avg*/
for(int l=0; l < 10; l++) {
/*adds each run's share of the avg (ten runs per algorithm array)*/
linAvg = linAvg + linear[l] / 10;
binAvg = binAvg + binary[l] / 10;
randAvg = randAvg + random[l] / 10;
/*calculates the min for each algorithm array*/
if(linear[l] < linMin) {
linMin = linear[l];
}
if(binary[l] < binMin) {
binMin = binary[l];
}
if(random[l] < randMin) {
randMin = random[l];
}
/*calculates the max for each algorithm array*/
if(linear[l] > linMax) {
linMax = linear[l];
}
if(binary[l] > binMax) {
binMax = binary[l];
}
if(random[l] > randMax) {
randMax = random[l];
}
}
/*prints the current power of 2, min, max, and avg
for each algorithm into a file, once the loop above has finished*/
StdOut.println("linear" + "," + n + "," + linMin + "," + linMax + "," + linAvg);
StdOut.println("binary" + "," + n + "," + binMin + "," + binMax + "," + binAvg);
StdOut.println("random" + "," + n + "," + randMin + "," + randMax + "," + randAvg);
}
}
}
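The min, max, and average bookkeeping in the class above can also be left to the standard library. The following is a minimal sketch, assuming the run times are already stored in a long[]; the class and method names are hypothetical and the sample times are made up for illustration.

```java
import java.util.Arrays;
import java.util.LongSummaryStatistics;

/* Sketch only: summarizes an array of timings as one CSV line.
   TimingSummarySketch and summarize are hypothetical names. */
final class TimingSummarySketch {
    static String summarize(String label, int n, long[] times) {
        // summaryStatistics() computes min, max, and average in a single pass
        LongSummaryStatistics stats = Arrays.stream(times).summaryStatistics();
        long avg = Math.round(stats.getAverage());
        return label + "," + n + "," + stats.getMin() + "," + stats.getMax() + "," + avg;
    }

    public static void main(String[] args) {
        long[] linear = {40, 10, 30, 20}; // made-up nanosecond timings
        System.out.println(summarize("linear", 8, linear)); // prints linear,8,10,40,25
    }
}
```

Computing the statistics after all ten runs have been recorded avoids printing partial results while the arrays are still being processed.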
The binary search I'm using. Provided for testing purposes.
/*class containing binary search algorithm*/
public class BinarySearch {
/*conducts a binary search as specified by user*/
public static int binarySearch(int queryValue, int[] list) {
int length = list.length;
/*last point of list*/
int top = length-1;
/*first point of list*/
int bottom = 0;
/*starting midpoint of list*/
int mid = (int)Math.round((top + bottom)/2);
/*binary search*/
do {
if(queryValue > list[mid]) {
bottom = mid;
mid = (int)Math.ceil((top + bottom) / 2.0);
}
else {
top = mid;
mid = (int)Math.floor((top + bottom) / 2.0);
}
if(queryValue == list[mid]) {
return mid;
}
} while (mid < top || mid > bottom);
/*returns -1 if user value not found*/
return -1;
}
}
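As far as I can trace it, the loop above does not terminate when queryValue is absent and lies strictly between two neighbouring elements: bottom and top end up one apart, and mid keeps alternating between them. With the test data 3, 5, 7, ... and a random query, about half of all queries are even numbers, so this would explain why only index 0 of the timing arrays gets filled. A minimal sketch of a search that shrinks the range on every step, using hypothetical class and method names, could look like this:

```java
/* Sketch only: binary search that always terminates.
   BinarySearchSketch is a hypothetical name, not from the original post. */
final class BinarySearchSketch {
    static int binarySearch(int queryValue, int[] list) {
        int bottom = 0;
        int top = list.length - 1;
        while (bottom <= top) {
            // written this way to avoid int overflow for very large arrays
            int mid = bottom + (top - bottom) / 2;
            if (list[mid] == queryValue) {
                return mid;
            } else if (list[mid] < queryValue) {
                bottom = mid + 1; // exclude mid, so the range always shrinks
            } else {
                top = mid - 1;    // exclude mid, so the range always shrinks
            }
        }
        /* returns -1 if the value is not found */
        return -1;
    }

    public static void main(String[] args) {
        int[] test = {3, 5, 7, 9, 11, 13, 15, 17};
        System.out.println(binarySearch(7, test)); // prints 2
        System.out.println(binarySearch(8, test)); // prints -1
    }
}
```

The key difference is that mid is excluded from the next range (mid + 1 or mid - 1), so the loop condition bottom <= top eventually fails when the value is missing.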
The linear search I'm using. Provided for testing purposes.
/*class containing linear search algorithm*/
public class LinearSearch {
public static int linearSearch(int queryValue, int[] list) {
int length = list.length;
/*conducts a linear search as specified by user*/
for(int i = 0; i < length; i++) {
if((int)queryValue == (int)list[i]) {
return i;
}
}
/*return -1 if user value not found*/
return -1;
}
}
The random search I'm using. Provided for testing purposes.
import java.util.Random;
/*class containing random search algorithm*/
public class RandomSearch {
public static int randomSearch(int queryValue, int[] list) {
/*conducts a random search as specified by user*/
/*tries up to 10,000,000 random positions searching
for user value*/
int length = list.length;
for(int i=0; i < 10000000; i++) {
/*generates a random index from 0 to length - 1*/
int randomNum = (int)Math.floor(Math.random()*(length));
if((int)queryValue == (int)list[randomNum]) {
return randomNum;
}
}
/*returns -2 if user value not found*/
return -2;
}
}
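The class above imports java.util.Random but draws its indices from Math.random(). A minimal sketch of the same idea using Random.nextInt, with the try limit and the generator passed in so that timing runs can be repeated with a fixed seed, might look like this (the class name and the extra parameters are assumptions, not part of the original post):

```java
import java.util.Random;

/* Sketch only: random search with an explicit try limit and generator.
   RandomSearchSketch, maxTries, and rng are hypothetical names. */
final class RandomSearchSketch {
    static int randomSearch(int queryValue, int[] list, int maxTries, Random rng) {
        if (list.length == 0) {
            return -2; // nothing to probe; nextInt(0) would throw
        }
        for (int i = 0; i < maxTries; i++) {
            // nextInt(bound) returns a value from 0 (inclusive) to bound (exclusive)
            int randomNum = rng.nextInt(list.length);
            if (list[randomNum] == queryValue) {
                return randomNum;
            }
        }
        /* returns -2 if the value was not found within maxTries probes */
        return -2;
    }

    public static void main(String[] args) {
        int[] test = {3, 5, 7, 9, 11, 13, 15, 17};
        Random rng = new Random(42); // fixed seed for repeatable runs
        System.out.println(randomSearch(8, test, 1000, rng)); // prints -2, since 8 is absent
    }
}
```

Note that for a value that is not in the array, the search always uses every try, so those queries will dominate the measured times.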
If this is Java, you want to use
System.out.println(....)
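rather than StdOut.println(....), unless a StdOut class is available on your classpath. StdOut is not part of the standard Java library.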