My task is to generate all possible combinations of a row by replacing each hidden # (number sign) with X or O. The input is XOXX#OO#XO, and here is an example of what the output should be:
XOXXOOOOXO
XOXXOOOXXO
XOXXXOOOXO
XOXXXOOXXO
I am only allowed to solve this iteratively. I am not sure how to fix my code and have been working on it for a week now.
Here is my code:
import java.lang.Math;
public class help {
public static void main(String[] args) {
String str = new String("XOXX#OO#XO");
UnHide(str);
}
public static void UnHide(String str) {
//converting string to char
char[] chArr = str.toCharArray();
//finding all combinations for XO
char[] xo = new char[]{'X', 'O'};
int count = 0;
char perm = 0;
String s = "";
//finding amount of times '#' appears in string
for (int i = 0; i < str.length(); i++) {
if (chArr[i] == '#')
count++;
}
int[] combo = new int[count];
int pMax = xo.length;
while (combo[0] < pMax) {
// print the current permutation
for (int k = 0; k < count; k++) {
//print each character
//System.out.print(xo[combo[i]]);
perm = xo[combo[k]];
s = String.valueOf(perm);
char[] xoArr = s.toCharArray();
String strChar = new String(xoArr);
//substituting '#' to XO combo
for (int i = 0; i < chArr.length; i++) {
for (int j = 0; j < s.length(); j++) {
if (chArr[i] == '#') {
chArr[i] = xoArr[j];
strChar = String.copyValueOf(chArr);
i++;
}
}
i++;
if (i == chArr.length - 1) {
System.out.println(strChar);
i = 0;
}
}
}
System.out.println(); //print end of line
// increment combo
combo[count - 1]++; // increment the last index
//// if increment overflows
for (int i = count - 1; combo[i] == pMax && i > 0; i--) {
combo[i - 1]++; // increment previous index
combo[i] = 0; // set current index to zero
}
}
}
}
Since your input has n = 2 #'s, there are 2^n = 2^2 = 4 permutations.
If you count from 0 to 3, and look at the numbers in binary, you get 00, 01, 10, and 11, so if you use that, inserting O for 0 and X for 1, you can do this using simple loops.
public static void unHide(String str) {
int count = 0;
for (int i = 0; i < str.length(); i++)
if (str.charAt(i) == '#')
count++;
if (count > 30)
throw new IllegalArgumentException("Too many #'s found. " + count + " > 30");
char[] buf = str.toCharArray();
for (int permutation = 0, end = 1 << count; permutation < end; permutation++) {
for (int i = buf.length - 1, bit = 0; i >= 0; i--)
if (str.charAt(i) == '#')
buf[i] = "OX".charAt(permutation >>> bit++ & 1);
System.out.println(buf);
}
}
Test
unHide("XOXX#OO#XO");
Output
XOXXOOOOXO
XOXXOOOXXO
XOXXXOOOXO
XOXXXOOXXO
You can iteratively generate all possible combinations of strings using streams as follows:
public static String[] unHide(String str) {
// an array of substrings around a 'number sign'
String[] arr = str.split("#", -1);
// an array of possible combinations
return IntStream
// iterate over array indices
.range(0, arr.length)
// append each substring with possible
// combinations, except the last one
// return Stream<String[]>
.mapToObj(i -> i < arr.length - 1 ?
new String[]{arr[i] + "O", arr[i] + "X"} :
new String[]{arr[i]})
// reduce stream of arrays to a single array
// by sequentially multiplying array pairs
.reduce((arr1, arr2) -> Arrays.stream(arr1)
.flatMap(str1 -> Arrays.stream(arr2)
.map(str2 -> str1 + str2))
.toArray(String[]::new))
.orElse(null);
}
// output to the markdown table
public static void main(String[] args) {
String[] tests = {"XOXX#OOXO", "XOXX#OO#XO", "#XOXX#OOXO#", "XO#XX#OO#XO"};
String header = String.join("</pre> | <pre>", tests);
String matrices = Arrays.stream(tests)
.map(test -> unHide(test))
.map(arr -> String.join("<br>", arr))
.collect(Collectors.joining("</pre> | <pre>"));
System.out.println("| <pre>" + header + "</pre> |");
System.out.println("|---|---|---|---|");
System.out.println("| <pre>" + matrices + "</pre> |");
}
Output (one line per input):
XOXX#OOXO: XOXXOOOXO, XOXXXOOXO
XOXX#OO#XO: XOXXOOOOXO, XOXXOOOXXO, XOXXXOOOXO, XOXXXOOXXO
#XOXX#OOXO#: OXOXXOOOXOO, OXOXXOOOXOX, OXOXXXOOXOO, OXOXXXOOXOX, XXOXXOOOXOO, XXOXXOOOXOX, XXOXXXOOXOO, XXOXXXOOXOX
XO#XX#OO#XO: XOOXXOOOOXO, XOOXXOOOXXO, XOOXXXOOOXO, XOOXXXOOXXO, XOXXXOOOOXO, XOXXXOOOXXO, XOXXXXOOOXO, XOXXXXOOXXO
The best approach is probably to calculate the number of permutations, then loop through each one to determine which combination of characters to use.
To do that, we divide the permutation number by a power of the number of replacement characters (the exponent being the index of the # we're replacing) and take the remainder modulo the number of replacement characters; the result is the index of the character to substitute.
public static void test(String word) {
// Should be defined in class (outside method)
String[] replaceChars = {"O", "X"};
char replCharacter = '#';
String temp;
int charIndex;
int numReplaceable = 0;
// Count the number of chars to replace
for (char c : word.toCharArray())
if (c == replCharacter)
numReplaceable++;
int totalPermutations = (int) Math.pow(replaceChars.length, numReplaceable);
// For all permutations:
for (int permNum = 0; permNum < totalPermutations; permNum++) {
temp = word;
// For each replacement character in the word:
for (int n = 0; n < numReplaceable; n++) {
// Calculate the character to swap the nth replacement char to
charIndex = permNum / (int) (Math.pow(replaceChars.length, n))
% replaceChars.length;
temp = temp.replaceFirst(
replCharacter + "", replaceChars[charIndex]);
}
System.out.println(temp);
}
}
Which produces:
java Test "#TEST#"
OTESTO
XTESTO
OTESTX
XTESTX
This can also be used with any number of characters, just add more to replaceChars.
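To illustrate the "any number of characters" remark, here is a minimal sketch that treats the placeholders as digits of a counter whose base is the number of replacement characters. The class name PlaceholderExpander and the expand method are hypothetical and not part of the answer above; the sketch assumes the alphabet is non-empty, and it returns a list instead of printing so the result is easy to check. It also steps the counter digit by digit rather than calling Math.pow, so the total count never has to fit in an int.

```java
import java.util.ArrayList;
import java.util.List;

final class PlaceholderExpander {
    // Hypothetical helper: returns every string obtained by replacing each
    // occurrence of placeholder in word with one character from alphabet.
    // Assumes alphabet contains at least one character.
    static List<String> expand(String word, char placeholder, char[] alphabet) {
        // Remember where the placeholders are.
        List<Integer> slots = new ArrayList<>();
        for (int i = 0; i < word.length(); i++) {
            if (word.charAt(i) == placeholder) {
                slots.add(i);
            }
        }
        List<String> result = new ArrayList<>();
        int[] digits = new int[slots.size()]; // one counter digit per placeholder
        char[] buf = word.toCharArray();
        while (true) {
            // Write the characters selected by the current counter value.
            for (int n = 0; n < digits.length; n++) {
                buf[slots.get(n)] = alphabet[digits[n]];
            }
            result.add(new String(buf));
            // Increment the counter; the first placeholder is the least
            // significant digit, matching the order printed by the answer.
            int n = 0;
            while (n < digits.length && ++digits[n] == alphabet.length) {
                digits[n++] = 0; // overflow: reset this digit and carry
            }
            if (n == digits.length) {
                break; // the carry ran past the last digit: all done
            }
        }
        return result;
    }
}
```

Calling PlaceholderExpander.expand("#TEST#", '#', new char[]{'O', 'X'}) yields OTESTO, XTESTO, OTESTX, XTESTX, and a three-character alphabet on a word with two placeholders yields nine strings.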
import java.util.Scanner;
class Motu
{
// Returns length of the longest subsequence
// of the form 0*1*0*
public static int longestSubseq(String s)
{
int n = s.length();
int[] count_1 = new int[n + 1];
count_1[0] = 0;
for (int j = 1; j <= n; j++)
{
count_1[j] = count_1[j - 1];
if (s.charAt(j - 1) != '0')
count_1[j]++;
}
// Compute result using precomputed values
int ans = 0;
for (int i = 1; i <= n; i++)
for (int j = i; j <= n; j++)
ans = Math.max(count_1[j] - count_1[i - 1] , ans);
return ans;
}
// Driver code
public static void main(String[] args)
{
@SuppressWarnings("resource")
Scanner sc=new Scanner(System.in);
String s =sc.next();
System.out.println(longestSubseq(s));
}
}
I am trying to write a program that finds the length of the longest run of consecutive 1's in a string containing 0's and 1's. But I cannot work out the logic: my program prints the total number of 1's in the string, which is not my desired output.
Sample input:- 0011100111100
output:- 4
You're almost there, but you're missing one thing: if the char is '0', reset the counter to zero.
for (int j = 1; j <= n; j++) {
if (s.charAt(j - 1) != '0')
count_1[j] = count_1[j - 1] + 1;
else
count_1[j] = 0;
}
But that can be done in a single loop: count with an int and keep track of the max.
public static int longestSubseq(String s) {
int ans = 0;
int count = 0;
for (char c : s.toCharArray()) {
if (c == '1')
count++;
else
count = 0;
ans = Math.max(ans, count);
}
return ans;
}
public static int longestSubSequence(String str, char ch) {
int res = 0;
int count = 0;
for (int i = 0; i < str.length(); i++) {
count = str.charAt(i) == ch ? count + 1 : 0;
res = Math.max(res, count);
}
return res;
}
The input string may be split by the characters that are not 1 (thus all non-1 characters are ignored and subsequences containing only 1 remain), and then the max length of the remaining parts can be found using Stream API:
public static int longestSubSequence(String str, char ch) {
return Arrays.stream(str.split("[^" + ch + "]"))
.mapToInt(String::length)
.max()
.orElse(0);
}
Similarly, a matching pattern can be created, and the max length of the group can be found:
public static int longestSubSequence(String str, char ch) {
return Pattern.compile(ch + "+")
.matcher(str)
.results()
.map(MatchResult::group)
.mapToInt(String::length)
.max()
.orElse(0);
}
Test:
System.out.println(longestSubSequence("00111011001111", '1')); // 4
It's worth mentioning that characters other than '0' and '1' may be present in the input string; only runs of the given char are counted.
As an alternative to the other answers that work with a for-loop:
You could split the sequence into groups of ones with regex. Next, just iterate over the groups and update the count, if the length of the group is bigger than the previous length.
The first group will be 111 and the next one 1111. Thus the count will first be 3 and then it will be updated to 4.
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class CountSubsequence {
public static void main(String []args){
String sequence = "0011100111100";
Pattern pattern = Pattern.compile("(1+)");
Matcher matcher = pattern.matcher(sequence);
int count = 0;
while (matcher.find()) {
int currentLength = matcher.group().length();
if (currentLength > count) count = currentLength;
}
System.out.println(count); // 4
}
}
Regex is generally slower than a plain for-loop, so you might prefer the loop if you care about performance, but that only matters if you execute it a lot.
I have a function called lengthOfLongestSubstring and its job is to find the longest substring without any repeated characters. For the most part, it works, but when it gets an input like "dvdf" it prints out 2 (rather than 3) and gives [dv, df] when it should be [d, vdf].
So, I first go through the string and see if there are any unique characters. If there are, I append it to the ans variable. (I think this is the part that needs some fixing). If there is a duplicate, I store it in the substrings linked list and reset the ans variable to the duplicate string.
Once the whole string has been traversed, I find the longest substring and return its length.
public static int lengthOfLongestSubstring(String s) {
String ans = "";
int len = 0;
LinkedList<String> substrings = new LinkedList<String>();
for (int i = 0; i < s.length(); i++) {
if (!ans.contains("" + s.charAt(i))) {
ans += s.charAt(i);
} else {
substrings.add(ans);
ans = "" + s.charAt(i);
}
}
substrings.add(ans); // add last seen substring into the linked list
for (int i = 0; i < substrings.size(); i++) {
if (substrings.get(i).length() >= len)
len = substrings.get(i).length();
}
System.out.println(Arrays.toString(substrings.toArray()));
return len;
}
Here are some test results:
//correct
lengthOfLongestSubstring("abcabcbb") -> 3 ( [abc, abc, b, b])
lengthOfLongestSubstring("pwwkew") -> 3 ([pw, wke, w]).
lengthOfLongestSubstring("ABDEFGABEF"); -> 6 ([ABDEFG, ABEF])
// wrong
System.out.println(lengthOfLongestSubstring("acadf")); -> 3, ([ac, adf]) *should be 4, with the linked list being [a, cadf]
Any suggestions to fix this? Do I have to redo all my logic?
Thanks!
Your code mistakenly assumes that when you find a repeated character, the next candidate substring starts at the repeated character. That is not true: it starts right after the original character.
Example: If string is "abcXdefXghiXjkl", there are 3 candidate substrings: "abcXdef", "defXghi", and "ghiXjkl".
As you can see, each candidate substring ends before a repeating character and starts after a repeating character (or at the beginning or end of the string).
So, when you find a repeating character, the position of the previous instance of that character is needed to determine the start of the next substring candidate.
The easiest way to handle that, is to build a Map of character to last seen position. That will also perform faster than continually performing substring searches to check for repeating character, like the question code and the other answers are doing.
Something like this:
public static int lengthOfLongestSubstring(String s) {
Map<Character, Integer> charPos = new HashMap<>();
List<String> candidates = new ArrayList<>();
int start = 0, maxLen = 0;
for (int idx = 0; idx < s.length(); idx++) {
char ch = s.charAt(idx);
Integer preIdx = charPos.get(ch);
if (preIdx != null && preIdx >= start) { // found repeat
if (idx - start > maxLen) {
candidates.clear();
maxLen = idx - start;
}
if (idx - start == maxLen)
candidates.add(s.substring(start, idx));
start = preIdx + 1;
}
charPos.put(ch, idx);
}
if (s.length() - start > maxLen)
maxLen = s.length() - start;
if (s.length() - start == maxLen)
candidates.add(s.substring(start));
System.out.print(candidates + ": ");
return maxLen;
}
The candidates list is only there for debugging purposes and is not needed; without it, the code is somewhat simpler:
public static int lengthOfLongestSubstring(String s) {
Map<Character, Integer> charPos = new HashMap<>();
int start = 0, maxLen = 0;
for (int idx = 0; idx < s.length(); idx++) {
char ch = s.charAt(idx);
Integer preIdx = charPos.get(ch);
if (preIdx != null && preIdx >= start) { // found repeat
if (idx - start > maxLen)
maxLen = idx - start;
start = preIdx + 1;
}
charPos.put(ch, idx);
}
return Math.max(maxLen, s.length() - start);
}
Test
System.out.println(lengthOfLongestSubstring(""));
System.out.println(lengthOfLongestSubstring("x"));
System.out.println(lengthOfLongestSubstring("xx"));
System.out.println(lengthOfLongestSubstring("xxx"));
System.out.println(lengthOfLongestSubstring("abcXdefXghiXjkl"));
System.out.println(lengthOfLongestSubstring("abcabcbb"));
System.out.println(lengthOfLongestSubstring("pwwkew"));
System.out.println(lengthOfLongestSubstring("ABDEFGABEF"));
Output (with candidate lists)
[]: 0
[x]: 1
[x, x]: 1
[x, x, x]: 1
[abcXdef, defXghi, ghiXjkl]: 7
[abc, bca, cab, abc]: 3
[wke, kew]: 3
[ABDEFG, BDEFGA, DEFGAB]: 6
Instead of setting ans to the current char when a character match is found
ans = "" + s.charAt(i);
You should add the current char to all the characters after the first match of the current char
ans = ans.substring(ans.indexOf(s.charAt(i)) + 1) + s.charAt(i);
The full method thus becomes
public static int lengthOfLongestSubstring(String s) {
String ans = "";
int len = 0;
LinkedList<String> substrings = new LinkedList<>();
for (int i = 0; i < s.length(); i++) {
if (!ans.contains("" + s.charAt(i))) {
ans += s.charAt(i);
} else {
substrings.add(ans);
// Only the below line changed
ans = ans.substring(ans.indexOf(s.charAt(i)) + 1) + s.charAt(i);
}
}
substrings.add(ans); // add last seen substring into the linked list
for (int i = 0; i < substrings.size(); i++) {
if (substrings.get(i).length() >= len)
len = substrings.get(i).length();
}
System.out.println(Arrays.toString(substrings.toArray()));
return len;
}
With this change, the test cases you listed pass:
//correct
lengthOfLongestSubstring("dvdf") -> 3 ( [dv, vdf])
lengthOfLongestSubstring("abcabcbb") -> 3 ([abc, bca, cab, abc, cb, b])
lengthOfLongestSubstring("pwwkew") -> 3 ([pw, wke, kew]).
lengthOfLongestSubstring("ABDEFGABEF"); -> 6 ([ABDEFG, BDEFGA, DEFGAB, FGABE, GABEF])
lengthOfLongestSubstring("acadf"); -> 4 ([ac, cadf])
Create a nested for loop to check at each index in the array.
public static int lengthOfLongestSubstring(String s) {
String ans = "";
int len = 0;
LinkedList<String> substrings = new LinkedList<String>();
int k = 0;
for (int i = 0; i < s.length(); i++) {
if(k == s.length()) {
break;
}
for(k = i; k < s.length(); k++) {
if (!ans.contains("" + s.charAt(k))) {
ans += s.charAt(k);
} else {
substrings.add(ans);
ans = "";
break;
}
}
}
substrings.add(ans); // add last seen substring into the linked list
for (int i = 0; i < substrings.size(); i++) {
if (substrings.get(i).length() >= len)
len = substrings.get(i).length();
}
System.out.println(Arrays.toString(substrings.toArray()));
return len;
}
Example:
lengthOfLongestSubstring("ABDEFGABEF"); -> 6 ([ABDEFG, BDEFGA, DEFGAB, EFGAB, FGABE, GABEF])
In this code I have marked the part that is causing a problem: a loop that prints some values. I store those values in an array and try to print them from another function. But even though the array is global, its values keep changing.
I am not able to figure out the problem. Please help me out.
import java.io.*;
import java.util.*;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
// Java program to print all permutations of a
// given string.
public class test3
{
static int[] val = new int[100] ; //array declaration as global
public static void main(String[] args)
{
System.out.println("An incremented value");
for(int i=2;i<=2;i++) {
String p="";
for(int j=0;j<=i;j++) {
for(int m=0;m<j;m++) {
p=p+"&";
}
for(int m=0;m<i-j;m++) {
p=p+"|";
}
printAllPermutations(p);
p="";
}
}
System.out.println();
for(int xy=0;xy<32;xy++)
System.out.print("["+xy+"]"+"="+val[xy]+" "); //trying to print the array
}
static void print(char[] temp) {
String a="";
System.out.println();
for (int i = 0; i < temp.length; i++)
{ System.out.print(temp[i]);
a=a+temp[i];}
System.out.print(" "+"opr:"+temp.length+" ");
final int N = temp.length+1;
/*===================CODE PROBLEM PART START=======================*/
for (int i = 0; i < (1 << N); i++) {
// System.out.println(zeroPad(Integer.toBinaryString(i), N));
String b=zeroPad(Integer.toBinaryString(i), N)+"";
// System.out.println("a: "+a+" b:"+b);
char[] arrayA = b.toCharArray();
char[] arrayB = a.toCharArray();
StringBuilder sb = new StringBuilder();
int ii = 0;
while( ii < arrayA.length && ii < arrayB.length){
sb.append(arrayA[ii]).append(arrayB[ii]);
++ii;
}
for(int j = ii; j < arrayA.length; ++j){
sb.append(arrayA[j]);
}
for(int j = ii; j < arrayB.length; ++j){
sb.append(arrayB[j]);
}
//System.out.println(sb.toString());
try {
ScriptEngineManager sem = new ScriptEngineManager();
ScriptEngine se = sem.getEngineByName("JavaScript");
String myExpression = sb.toString();
// System.out.print(se.eval(myExpression));
val[i]=(int)(se.eval(myExpression)); //inserting array value
System.out.print(val[i]); //NEED TO HAVE THESE VALUES IN THE 1-D ARRAY
// System.out.print(val[i]);
} catch (ScriptException e) {
System.out.println("Invalid Expression");
e.printStackTrace();}
}
/*===================CODE PROBLEM PART END========================*/
//
}
//unchangable = rest of the function
static int factorial(int n) {
int f = 1;
for (int i = 1; i <= n; i++)
f = f * i;
return f;
}
static int calculateTotal(char[] temp, int n) {
int f = factorial(n);
// Building HashMap to store frequencies of
// all characters.
HashMap<Character, Integer> hm =
new HashMap<Character, Integer>();
for (int i = 0; i < temp.length; i++) {
if (hm.containsKey(temp[i]))
hm.put(temp[i], hm.get(temp[i]) + 1);
else
hm.put(temp[i], 1);
}
// Traversing hashmap and finding duplicate elements.
for (Map.Entry e : hm.entrySet()) {
Integer x = (Integer)e.getValue();
if (x > 1) {
int temp5 = factorial(x);
f = f / temp5;
}
}
return f;
}
static void nextPermutation(char[] temp) {
// Start traversing from the end and
// find position 'i-1' of the first character
// which is greater than its successor.
int i;
for (i = temp.length - 1; i > 0; i--)
if (temp[i] > temp[i - 1])
break;
// Finding smallest character after 'i-1' and
// greater than temp[i-1]
int min = i;
int j, x = temp[i - 1];
for (j = i + 1; j < temp.length; j++)
if ((temp[j] < temp[min]) && (temp[j] > x))
min = j;
// Swapping the above found characters.
char temp_to_swap;
temp_to_swap = temp[i - 1];
temp[i - 1] = temp[min];
temp[min] = temp_to_swap;
// Sort all digits from position next to 'i-1'
// to end of the string.
Arrays.sort(temp, i, temp.length);
// Print the String
print(temp);
}
static void printAllPermutations(String s) {
// Sorting String
char temp[] = s.toCharArray();
Arrays.sort(temp);
// Print first permutation
print(temp);
// Finding the total permutations
int total = calculateTotal(temp, temp.length);
for (int i = 1; i < total; i++)
nextPermutation(temp);
}
static String zero(int L) {
return (L <= 0 ? "" : String.format("%0" + L + "d", 0));
}
static String zeroPad(String s, int L) {
return zero(L - s.length()) + s;
}
}
The output that I am getting is
An incremented value
|| opr:2 01111111 //WANT TO STORE THESE 32 VALUES IN 1 D ARRAY
&| opr:2 01010111 // AND PRINT THEM OUT
|& opr:2 00011111
&& opr:2 00000001
[0]=0 [1]=0 [2]=0 [3]=0 [4]=0 [5]=0 [6]=0 [7]=1 [8]=0 [9]=0 [10]=0 [11]=0 [12]=0 [13]=0 [14]=0 [15]=0 [16]=0 [17]=0 [18]=0 [19]=0 [20]=0 [21]=0 [22]=0 [23]=0 [24]=0 [25]=0 [26]=0 [27]=0 [28]=0 [29]=0 [30]=0 [31]=0
What I need is to store those 32 values in a 1-D array for further operations, but when I do, all the array values display 0 except [7]. I don't know what's going on here.
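Reference types are not bound to a local scope. Your static array is a single object shared by the whole class, so a write made in one method changes the same array that every other method sees. When an array is passed as a parameter, only the reference is copied; the copy still points to the same underlying object, not to a private copy bound to the local scope. In your code, every call to print writes val[0] through val[7] again, so each permutation overwrites the values stored by the previous one, and only the results of the last call (&&, 00000001) remain when main prints the array.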
If you want to save two different states of the array, you will have to save them yourself.
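Saving the states yourself could look like the following minimal sketch. The class SnapshotDemo, the fill method, and the seed values are hypothetical stand-ins for the print method and the evaluated expressions in the question; the only point is that Arrays.copyOf creates an independent array, so later writes to the shared buffer no longer affect what was saved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SnapshotDemo {
    // Shared buffer, playing the role of the static array in the question.
    static int[] val = new int[8];
    // One independent copy of val per call.
    static List<int[]> saved = new ArrayList<>();

    // Hypothetical stand-in for print(): it rewrites val[0..7] on every call,
    // which is why earlier results are lost unless they are copied.
    static void fill(int seed) {
        for (int i = 0; i < val.length; i++) {
            val[i] = seed * 10 + i;
        }
        saved.add(Arrays.copyOf(val, val.length)); // snapshot of the current state
    }

    public static void main(String[] args) {
        fill(1);
        fill(2);
        System.out.println(Arrays.toString(saved.get(0))); // values from the first call
        System.out.println(Arrays.toString(saved.get(1))); // values from the second call
    }
}
```

An alternative under the same assumptions is to keep one array of 32 elements and write each call's results at an offset (call number times 8) instead of always starting at index 0.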
While writing a Java string shuffler, I ran into a problem: the program gets stuck somewhere.
1. I have to pass a sentence or a word through BufferedReader.
2. I have to shuffle the word/sentence so that the first element is the first letter, then the last letter, then the 2nd letter, then the 2nd from the end, until the job is done.
2.1. If the word/sentence length is odd, the middle character has to be put at the end of the word/sentence.
3. I have to print it out.
Result should be like this:
My code:
public static void main(String[] args) {
String enteredValue = null;
int charArrayLength = 0;
System.out.println("Dāvis Naglis IRDBD11 151RDB286");
System.out.println("input string:");
try {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
enteredValue = br.readLine();
charArrayLength = enteredValue.length(); // length of array entered
char[] characters = new char[charArrayLength];
characters = enteredValue.toCharArray();
} catch (Exception e) {
e.printStackTrace();
}
char[] tempChars = new char[charArrayLength];
for (int i = 0; i <= charArrayLength - 1; i++) {
tempChars[i] = enteredValue.charAt(i);
}
}
/**
* Shuffles the char array if it's length is even
*
* @param array
*/
public static void shuffle(char[] array) {
char[] tempChars = null;
for (int j = 0; j <= array.length; j++) {
if ((array.length % 2 == 0) && (j < array.length)) { // array[j] == (array.length / 2) + 1
tempChars[j] = array[array.length - j];
} else if (array.length % 2 != 0) {
tempChars[array.length] = array[j];
} // end else if
} // end for
String shuffledSentence = new String(tempChars);
System.out.println(shuffledSentence);
}
Don't look at multiple line comments, haven't changed them since start.
Thanks in advance!
Shuffling made easy:
int len = array.length;
char[] tempArray = new char[len];
int i=0, j=0, k=len-1;
while (i<len) {
tempArray[i++] = array[j++];
if (i<len) {
tempArray[i++] = array[k--];
}
}
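The fragment above can be wrapped into a complete method. This is a minimal sketch: the class name Interleave and the String-in, String-out signature are assumptions for illustration, not something the question prescribes.

```java
final class Interleave {
    // Returns the characters in the order first, last, second,
    // second-from-last, ...; for odd lengths the middle character ends up last.
    static String shuffle(String input) {
        char[] array = input.toCharArray();
        char[] out = new char[array.length];
        int i = 0;                     // next position to fill in out
        int front = 0;                 // next unused character from the front
        int back = array.length - 1;   // next unused character from the back
        while (i < out.length) {
            out[i++] = array[front++];
            if (i < out.length) {
                out[i++] = array[back--];
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(shuffle("abcde")); // prints aebdc
        System.out.println(shuffle("abcd"));  // prints adbc
    }
}
```

Returning the result instead of printing it inside the method keeps the input handling (BufferedReader) separate from the shuffling logic.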
Wow. Your algorithm is too complex for this problem!
Try this for shuffling:
int n = array.length;
char[] resChars = new char[n];
// fill pairs: one character from the front, one from the back
for (int j = 0; j < n - 1; j += 2) {
    resChars[j] = array[j / 2];
    resChars[j + 1] = array[n - 1 - (j / 2)];
}
// odd length: the middle character goes last
if (n % 2 != 0)
    resChars[n - 1] = array[n / 2];
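You can try something like this (note that it shuffles the characters randomly, not in the front/back order described in the question):
String str = "example"; // the string to shuffle
List<Character> list = new ArrayList<>();
for (char aChar : str.toCharArray()) {
    list.add(aChar);
}
Collections.shuffle(list);
// join the characters back together; stripping "[", "]" and "," from
// list.toString() would leave the separating spaces behind
StringBuilder sb = new StringBuilder(list.size());
for (char aChar : list) {
    sb.append(aChar);
}
String result = sb.toString();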
Your program doesn't get stuck; it exits properly:
Process finished with exit code 0
There is no more work for the process to do, since you never call your static shuffle method.
In addition, as the others have already answered, your static shuffle method needs some design refactoring.
I have a string that consists of characters A,B,C and D and I am trying to calculate the length of the longest substring that has an equal amount of each one of these characters in any order.
For example ABCDB would return 4, ABCC 0 and ADDBCCBA 8.
My code currently:
public int longestSubstring(String word) {
HashMap<Integer, String> map = new HashMap<Integer, String>();
for (int i = 0; i<word.length()-3; i++) {
map.put(i, word.substring(i, i+4));
}
StringBuilder sb;
int longest = 0;
for (int i = 0; i<map.size(); i++) {
sb = new StringBuilder();
sb.append(map.get(i));
int a = 4;
while (i<map.size()-a) {
sb.append(map.get(i+a));
a+= 4;
}
String substring = sb.toString();
if (equalAmountOfCharacters(substring)) {
int length = substring.length();
if (length > longest)
longest = length;
}
}
return longest;
}
This currently works pretty well if the string length is 10^4 but I'm trying to make it 10^5. Any tips or suggestions would be appreciated.
Let's assume that cnt(c, i) is the number of occurrences of the character c in the prefix of length i.
A substring (low, high] has an equal amount of two characters a and b iff cnt(a, high) - cnt(a, low) = cnt(b, high) - cnt(b, low), or, put another way, cnt(b, high) - cnt(a, high) = cnt(b, low) - cnt(a, low). Thus, each position is described by a value of cnt(b, i) - cnt(a, i). Now we can generalize it to more than two characters: each position is described by a tuple (cnt(a_2, i) - cnt(a_1, i), ..., cnt(a_k, i) - cnt(a_1, i)), where a_1 ... a_k is the alphabet.
We can iterate over the given string and maintain the current tuple. At each step, we update the answer by checking the value of i - first_occurrence(current_tuple), where first_occurrence is a hash table that stores the first occurrence of each tuple seen so far. Do not forget to put a tuple of zeros into the hash table before iterating (it corresponds to the empty prefix).
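The tuple idea can be sketched in Java for the four-letter alphabet from the question. The class and method names are hypothetical, the tuple is encoded as a comma-separated string purely for simplicity, and the sketch assumes the input contains only the characters A to D.

```java
import java.util.HashMap;
import java.util.Map;

final class BalancedSubstring {
    // Length of the longest substring with equally many A, B, C and D.
    // Assumes word consists only of the characters 'A'..'D'.
    static int longestBalanced(String word) {
        int[] cnt = new int[4];
        // Maps each difference tuple to the length of the shortest prefix
        // that produces it.
        Map<String, Integer> firstSeen = new HashMap<>();
        firstSeen.put("0,0,0", 0); // tuple of the empty prefix
        int best = 0;
        for (int i = 0; i < word.length(); i++) {
            cnt[word.charAt(i) - 'A']++;
            // Tuple (cnt(B) - cnt(A), cnt(C) - cnt(A), cnt(D) - cnt(A))
            // for the prefix of length i + 1.
            String key = (cnt[1] - cnt[0]) + "," + (cnt[2] - cnt[0]) + "," + (cnt[3] - cnt[0]);
            Integer first = firstSeen.putIfAbsent(key, i + 1);
            if (first != null) {
                // Same tuple at two prefixes: the part in between is balanced.
                best = Math.max(best, i + 1 - first);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestBalanced("ABCDB"));    // 4
        System.out.println(longestBalanced("ABCC"));     // 0
        System.out.println(longestBalanced("ADDBCCBA")); // 8
    }
}
```

Each character is processed once with a single hash lookup, so the running time grows roughly linearly with the string length, which should be comfortable for inputs of length 10^5.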
If there were only A's and B's, then you could do something like this.
def longest_balanced(word):
    length = 0
    cumulative_difference = 0
    first_index = {0: -1}
    for index, letter in enumerate(word):
        if letter == 'A':
            cumulative_difference += 1
        elif letter == 'B':
            cumulative_difference -= 1
        else:
            raise ValueError(letter)
        if cumulative_difference in first_index:
            length = max(length, index - first_index[cumulative_difference])
        else:
            first_index[cumulative_difference] = index
    return length
Life is more complicated with all four letters, but the idea is much the same. Instead of keeping just one cumulative difference, for A's versus B's, we keep three, for A's versus B's, A's versus C's, and A's versus D's.
Well, first of all abstain from constructing any strings.
If you don't produce any (or nearly no) garbage, there's no need to collect it, which is a major plus.
Next, use a different data-structure:
I suggest 4 byte-arrays, storing the count of their respective symbol in the 4-span starting at the corresponding string-index.
That should speed it up considerably.
You can count the occurrences of the characters in word. Then, a possible solution could be:
1. If min is the minimum number of occurrences of any character in word, then min is also the maximum possible number of occurrences of each character in the substring we are looking for. In the code below, min is maxCount.
2. We iterate over decreasing values of maxCount. At every step, the string we are searching for will have length maxCount * alphabetSize. We can view this as the size of a sliding window we can slide over word.
3. We slide the window over word, counting the occurrences of the characters in the window. If the window is the substring we are searching for, we return the result. Otherwise, we keep searching.
[FIXED] The code:
private static final int ALPHABET_SIZE = 4;
public int longestSubstring(String word) {
// count
int[] count = new int[ALPHABET_SIZE];
for (int i = 0; i < word.length(); i++) {
char c = word.charAt(i);
count[c - 'A']++;
}
int maxCount = word.length();
for (int i = 0; i < count.length; i++) {
int cnt = count[i];
if (cnt < maxCount) {
maxCount = cnt;
}
}
// iterate over maxCount until found
boolean found = false;
while (maxCount > 0 && !found) {
int substringLength = maxCount * ALPHABET_SIZE;
found = findSubstring(substringLength, word, maxCount);
if (!found) {
maxCount--;
}
}
return found ? maxCount * ALPHABET_SIZE : 0;
}
private boolean findSubstring(int length, String word, int maxCount) {
int startIndex = 0;
boolean found = false;
while (startIndex + length <= word.length()) {
int[] count = new int[ALPHABET_SIZE];
for (int i = startIndex; i < startIndex + length; i++) {
char c = word.charAt(i);
int cnt = ++count[c - 'A'];
if (cnt > maxCount) {
break;
}
}
if (equalValues(count, maxCount)) {
found = true;
break;
} else {
startIndex++;
}
}
return found;
}
// Returns true if all values in c are equal to value
private boolean equalValues(int[] count, int value) {
boolean result = true;
for (int i : count) {
if (i != value) {
result = false;
break;
}
}
return result;
}
[MERGED] This is Hollis Waite's solution using cumulative counts, but taking my observations at points 1. and 2. into consideration. This may improve performance for some inputs:
private static final int ALPHABET_SIZE = 4;
public int longestSubstring(String word) {
// count
int[][] cumulativeCount = new int[ALPHABET_SIZE][];
for (int i = 0; i < ALPHABET_SIZE; i++) {
cumulativeCount[i] = new int[word.length() + 1];
}
int[] count = new int[ALPHABET_SIZE];
for (int i = 0; i < word.length(); i++) {
char c = word.charAt(i);
count[c - 'A']++;
for (int j = 0; j < ALPHABET_SIZE; j++) {
cumulativeCount[j][i + 1] = count[j];
}
}
int maxCount = word.length();
for (int i = 0; i < count.length; i++) {
int cnt = count[i];
if (cnt < maxCount) {
maxCount = cnt;
}
}
// iterate over maxCount until found
boolean found = false;
while (maxCount > 0 && !found) {
int substringLength = maxCount * ALPHABET_SIZE;
found = findSubstring(substringLength, word, maxCount, cumulativeCount);
if (!found) {
maxCount--;
}
}
return found ? maxCount * ALPHABET_SIZE : 0;
}
private boolean findSubstring(int length, String word, int maxCount, int[][] cumulativeCount) {
    // cumulativeCount[c][k] is the count of character c in the first k characters,
    // so the window [startIndex, endIndex) is the difference of two prefix counts
    for (int startIndex = 0, endIndex = length; endIndex <= word.length(); startIndex++, endIndex++) {
        boolean found = true; // reset for every window
        for (int i = 0; i < ALPHABET_SIZE; i++) {
            if (cumulativeCount[i][endIndex] - cumulativeCount[i][startIndex] != maxCount) {
                found = false;
                break;
            }
        }
        if (found) {
            return true;
        }
    }
    return false;
}
You'll probably want to cache cumulative counts of characters for each index of String -- that's where the real bottleneck is. Haven't thoroughly tested but something like the below should work.
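import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;
public class Test {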
static final int LEN = 4;
static class RandomCharSequence implements CharSequence {
private final Random mRandom = new Random();
private final int mAlphabetLen;
private final int mLen;
private final int mOffset;
RandomCharSequence(int pLen, int pOffset, int pAlphabetLen) {
mAlphabetLen = pAlphabetLen;
mLen = pLen;
mOffset = pOffset;
}
public int length() {return mLen;}
public char charAt(int pIdx) {
mRandom.setSeed(mOffset + pIdx);
return (char) (
'A' +
(mRandom.nextInt() % mAlphabetLen + mAlphabetLen) % mAlphabetLen
);
}
public CharSequence subSequence(int pStart, int pEnd) {
return new RandomCharSequence(pEnd - pStart, pStart, mAlphabetLen);
}
@Override public String toString() {
return (new StringBuilder(this)).toString();
}
}
public static void main(String[] pArgs) {
Stream.of("ABCDB", "ABCC", "ADDBCCBA", "DADDBCCBA").forEach(
pWord -> System.out.println(longestSubstring(pWord))
);
for (int i = 0; ; i++) {
final double len = Math.pow(10, i);
if (len >= Integer.MAX_VALUE) break;
System.out.println("Str len 10^" + i);
for (int alphabetLen = 1; alphabetLen <= LEN; alphabetLen++) {
final Instant start = Instant.now();
final int val = longestSubstring(
new RandomCharSequence((int) len, 0, alphabetLen)
);
System.out.println(
String.format(
" alphabet len %d; result %08d; time %s",
alphabetLen,
val,
formatMillis(ChronoUnit.MILLIS.between(start, Instant.now()))
)
);
}
}
}
static String formatMillis(long millis) {
return String.format(
"%d:%02d:%02d.%03d",
TimeUnit.MILLISECONDS.toHours(millis),
TimeUnit.MILLISECONDS.toMinutes(millis) -
TimeUnit.HOURS.toMinutes(TimeUnit.MILLISECONDS.toHours(millis)),
TimeUnit.MILLISECONDS.toSeconds(millis) -
TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(millis)),
TimeUnit.MILLISECONDS.toMillis(millis) -
TimeUnit.SECONDS.toMillis(TimeUnit.MILLISECONDS.toSeconds(millis))
);
}
static int longestSubstring(CharSequence pWord) {
// create array that stores cumulative char counts at each index of string
// idx 0 = char (A-D); idx 1 = offset
final int[][] cumulativeCnts = new int[LEN][];
for (int i = 0; i < LEN; i++) {
cumulativeCnts[i] = new int[pWord.length() + 1];
}
final int[] cumulativeCnt = new int[LEN];
for (int i = 0; i < pWord.length(); i++) {
cumulativeCnt[pWord.charAt(i) - 'A']++;
for (int j = 0; j < LEN; j++) {
cumulativeCnts[j][i + 1] = cumulativeCnt[j];
}
}
final int maxResult = Arrays.stream(cumulativeCnt).min().orElse(0) * LEN;
if (maxResult == 0) return 0;
int result = 0;
for (int initialOffset = 0; initialOffset < LEN; initialOffset++) {
for (
int start = initialOffset;
start < pWord.length() - result;
start += LEN
) {
endLoop:
for (
int end = start + result + LEN;
end <= pWord.length() && end - start <= maxResult;
end += LEN
) {
final int substrLen = end - start;
final int expectedCharCnt = substrLen / LEN;
for (int i = 0; i < LEN; i++) {
if (
cumulativeCnts[i][end] - cumulativeCnts[i][start] !=
expectedCharCnt
) {
continue endLoop;
}
}
if (substrLen > result) result = substrLen;
}
}
}
return result;
}
}
Suppose there are K possible letters in a string of length N. We could track the balance of letters seen with a vector pos of length K that is updated as follows:
If letter 1 is seen, add (K-1, -1, -1, ...)
If letter 2 is seen, add (-1, K-1, -1, ...)
If letter 3 is seen, add (-1, -1, K-1, ...)
Maintain a hash that maps pos to the first string position where pos is reached. A balanced substring occurs whenever the current pos already exists in the hash: it runs from just after the stored position hash[pos] to the current position.
The cost of maintaining the hash is O(log N) so processing the string takes O(N log N). How does this compare with solutions so far? These types of problems tend to have linear solutions but I haven't come across one yet.
Here's some code demonstrating the idea for 3 letters and a run using biased random strings. (Uniform random strings allow for solutions that are around half the string length, which is unwieldy to print).
#!/usr/bin/env python3
import random
from time import time

alphabet = "abc"
DIM = len(alphabet)

def random_string(n):
    # return a random string over choices[] of length n
    # distribution of letters is non-uniform to make matches harder to find
    choices = "aabbc"
    s = ''
    for i in range(n):
        r = random.randint(0, len(choices) - 1)
        s += choices[r]
    return s

def validate(s):
    # verify frequencies of each letter are the same
    f = [0, 0, 0]
    a2f = {alphabet[i] : i for i in range(DIM)}
    for c in s:
        f[a2f[c]] += 1
    assert f[0] == f[1] and f[1] == f[2]

def longest_balanced(s):
    """return length of longest substring of s containing equal
    populations of each letter in alphabet"""
    slen = len(s)
    p = [0 for i in range(DIM)]
    vec = {alphabet[0] : [2, -1, -1],
           alphabet[1] : [-1, 2, -1],
           alphabet[2] : [-1, -1, 2]}
    x = -1
    best = -1
    hist = {str([0, 0, 0]) : -1}
    for c in s:
        x += 1
        p = [p[i] + vec[c][i] for i in range(DIM)]
        pkey = str(p)
        if pkey not in hist:
            hist[pkey] = x
        else:
            span = x - hist[pkey]
            assert span % DIM == 0
            if span > best:
                best = span
                cand = s[hist[pkey] + 1: x + 1]
                print("best so far %d = [%d,%d]: %s" % (best,
                                                        hist[pkey] + 1,
                                                        x + 1,
                                                        cand))
                validate(cand)
    return best if best > -1 else 0

def main():
    # print(longest_balanced("aaabcabcbbcc"))
    t0 = time()
    s = random_string(1000000)
    print("generate time:", time() - t0)
    t1 = time()
    best = longest_balanced(s)
    print("best:", best)
    print("elapsed:", time() - t1)

main()
Sample run on an input of 10^6 letters with an alphabet of 3 letters:
$ ./bal.py
...
best so far 189 = [847894,848083]: aacacbcbabbbcabaabbbaabbbaaaacbcaaaccccbcbcbababaabbccccbbabbacabbbbbcaacacccbbaacbabcbccaabaccabbbbbababbacbaaaacabcbabcbccbabbccaccaabbcabaabccccaacccccbaacaaaccbbcbcabcbcacaabccbacccacca
best: 189
elapsed: 1.43609690666