Java Searching through a String for a valid character sequence

I just took a Codility test and was wondering why my solution only scored 37/100. The problem gives you a String and asks you to search through it for valid passwords.
Here are the rules:
1) A valid password starts with a capital letter and cannot contain any digits. The input is restricted to any combination of a-z, A-Z and 0-9.
2) The method you have to write is supposed to return the length of the longest valid password. For example, for the input "Aa8aaaArtd900d" the solution method should return 4. If no valid password is found, the method should return -1.
I cannot figure out where I went wrong in my solution. Any help would be greatly appreciated! Suggestions on how to better test code for something like this would also be welcome.
class Solution2 {
    public int solution(String S) {
        int first = 0;
        int last = S.length()-1;
        int longest = -1;
        for(int i = 0; i < S.length(); i++){
            if(Character.isUpperCase(S.charAt(i))){
                first = i;
                last = first;
                while(last < S.length()){
                    if(Character.isDigit(S.charAt(last))){
                        i = last;
                        break;
                    }
                    last++;
                }
                longest = Math.max(last - first, longest);
            }
        }
        return longest;
    }
}
I have added my updated solution above. Any thoughts on how to optimize it further?

Your solution is too complicated. Since you are not asked to find the longest password, only the length of the longest password, there is no reason to create or store strings with that longest password. Therefore, you do not need to use substring or an array of Strings, only int variables.
The algorithm for finding the solution is straightforward:
1) Make an int pos = 0 variable representing the current position in s
2) Make a loop that searches for the next candidate password
3) Starting at position pos, find the next uppercase letter
4) If you hit the end of line, exit
5) Starting at the position of the uppercase letter, find the next digit
6) If you hit the end of line, stop
7) Find the difference between the position of the digit (or the end of line) and the position of the uppercase letter.
8) If the difference is above max that you have previously found, replace max with the difference
9) Advance pos to the position of the last letter (or the end of line)
10) If pos is under s.length, continue the loop at step 2
11) Return max.
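The steps above could be sketched roughly as follows. This is only one possible reading of the algorithm, not the answerer's original demo; the class name LongestPasswordSketch and the method name longestPassword are made up for illustration.

```java
// Hypothetical sketch of the scan described above; names are illustrative only.
class LongestPasswordSketch {

    // Returns the length of the longest run that starts with an uppercase
    // letter and contains no digit, or -1 if there is no such run.
    static int longestPassword(String s) {
        int max = -1;
        int pos = 0;
        while (pos < s.length()) {
            // Find the next uppercase letter at or after pos.
            while (pos < s.length() && !Character.isUpperCase(s.charAt(pos))) {
                pos++;
            }
            if (pos == s.length()) {
                break; // no further candidate
            }
            int start = pos;
            // Advance to the next digit or to the end of the string.
            while (pos < s.length() && !Character.isDigit(s.charAt(pos))) {
                pos++;
            }
            max = Math.max(max, pos - start);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(longestPassword("Aa8aaaArtd900d")); // 4
        System.out.println(longestPassword("abc123"));         // -1
    }
}
```

Because pos never moves backwards, each character is visited a constant number of times, so the scan is linear in the length of the input.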

Related

Length of the Longest Common Substring without repeating characters

Given "abcabcbb", the answer is "abc", which has length 3.
Given "bbbbb", the answer is "b", with length 1.
Given "pwwkew", the answer is "wke", with length 3. Note that the answer must be a substring; "pwke" is a subsequence and not a substring.
I came up with a solution that worked in principle but failed several test cases. I then found a better solution and rewrote it to try to understand it. The solution below works flawlessly, but after about 2 hours of battling with it, I still cannot understand why one particular line of code works.
import java.util.*;
import java.math.*;
public class Solution {
    public int lengthOfLongestSubstring(String str) {
        if(str.length() == 0)
            return 0;
        HashMap<Character,Integer> map = new HashMap<>();
        int startingIndexOfLongestSubstring = 0;
        int max = 0;
        for(int i = 0; i < str.length(); i++){
            char currentChar = str.charAt(i);
            if(map.containsKey(currentChar))
                startingIndexOfLongestSubstring = Math.max(startingIndexOfLongestSubstring, map.get(currentChar) + 1);
            map.put(currentChar, i);
            max = Math.max(max, i - startingIndexOfLongestSubstring + 1);
        }//End of loop
        return max;
    }
}
The line in question is max = Math.max(max, i - startingIndexOfLongestSubstring + 1);
I don't understand why this works. We're taking the max of our previous max and the difference between our current index and the starting index of what is currently the longest substring, plus 1. I know that the code is computing the difference between our current index and startingIndexOfLongestSubstring, but I can't conceptualize WHY that gives us the intended result. Can someone please explain this step to me, particularly WHY it works?

I'm usually bad at explaining, but let me give it a shot with an example.
The string is "wcabcdeghi".
Forget the code for a minute and assume we're trying to come up with the logic ourselves.
We start from w and keep going until we reach the second c: w -> c -> a -> b -> c.
We need to stop at this point because "c" is repeating. So we need a map that records where each character was last seen, so that we can tell when one repeats. (In code: map.put(currentChar, i);)
Now that we can tell when a character repeats, we need to know the maximum length found so far. (In code: max)
There is no point in continuing to count the first 2 characters, w -> c, because the substring that includes them ("wcab") is already reflected in max. So from the next iteration onwards we only need to measure the length from a -> b -> and so on.
Let's have a variable (in code: startingIndexOfLongestSubstring) to keep track of this starting point. (It should have been named startingIndexOfNonRepetitiveCharacter, but then again I'm bad at naming as well.)
Now we keep going, but wait: we still haven't decided how to keep track of the substring that we're currently scanning (i.e., from abcd...).
Come to think of it, all I need is the position where "a" was found (which is startingIndexOfNonRepetitiveCharacter). So to know the length of the current substring, all I need to do is (in code) i - startingIndexOfLongestSubstring + 1, that is, the current character position minus the starting position, plus 1 because subtraction does not count both endpoints. Let's call this currentLength.
But what are we going to do with this count? Every time we move to a new character, we need to check whether currentLength beats our max.
So (in code) max = Math.max(max, i - startingIndexOfLongestSubstring + 1);
Now we've covered most of the statements, and according to our logic, every time we encounter a character that was already present, all we need is startingIndexOfLongestSubstring = map.get(currentChar) + 1. So why are we taking a max there?
Consider a scenario where the string is "wcabcdewghi". After the repeated c, our current window runs a -> b -> c -> d -> e -> w. At this point our logic checks whether w was present previously. Since it was (at index 0), the start would be reset to index 1, which is before the current start (index 2) and totally messes up the count. So we need to make sure that the start index never moves backwards: take the index from the map only if the previous occurrence lies at or after startingIndexOfLongestSubstring.
I hope I've covered all the lines in the code and, mainly, that the explanation was understandable.
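To make the last point concrete, here is a small sketch that runs the same loop with and without the Math.max guard on the start index. The class name WindowTraceSketch and the clampStart flag are invented for this illustration and are not part of the original solution.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: compares the loop with and without the Math.max guard.
class WindowTraceSketch {

    // clampStart == true mirrors the original solution;
    // clampStart == false lets the start index jump backwards.
    static int longest(String str, boolean clampStart) {
        Map<Character, Integer> lastSeen = new HashMap<>();
        int start = 0;
        int max = 0;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            Integer prev = lastSeen.get(c);
            if (prev != null) {
                start = clampStart ? Math.max(start, prev + 1) : prev + 1;
            }
            lastSeen.put(c, i);
            // Window [start, i] holds i - start + 1 characters.
            max = Math.max(max, i - start + 1);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(longest("wcabcdewghi", true));  // 9, for "abcdewghi"
        System.out.println(longest("wcabcdewghi", false)); // 10, wrong: window would contain two c's
    }
}
```

Without the guard, the second w pulls the start back to index 1, so the window silently re-admits the earlier c and the reported length is too large.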

Because
i - startingIndexOfLongestSubstring + 1
is the number of characters between indexes startingIndexOfLongestSubstring and i, inclusive. For example, how many characters are there between positions 2 and 3? 3 - 2 = 1, but we have 2 characters: one at position 2 and one at position 3.
I've described every action in the code:
public class Solution {
    public int lengthOfLongestSubstring(String str) {
        if(str.length() == 0)
            return 0;
        HashMap<Character,Integer> map = new HashMap<>();
        int startingIndexOfLongestSubstring = 0;
        int max = 0;
        // loop over all characters in the string
        for(int i = 0; i < str.length(); i++){
            // get the character at position i
            char currentChar = str.charAt(i);
            // if we have already met this character
            if(map.containsKey(currentChar))
                // then take the maximum of the previous 'startingIndexOfLongestSubstring' and
                // map.get(currentChar) + 1 (the last earlier occurrence of the current character, plus 1).
                // "plus 1" because counting must start from the character after that occurrence,
                // since the current character is the same
                startingIndexOfLongestSubstring = Math.max(startingIndexOfLongestSubstring, map.get(currentChar) + 1);
            // save the position of the current character in the map. If the map already has a value for it,
            // that value is overwritten (we don't need earlier positions of the character)
            map.put(currentChar, i);
            // take the maximum of 'max' (candidate return value) and the window length ending at the current character
            max = Math.max(max, i - startingIndexOfLongestSubstring + 1);
        }//End of loop
        return max;
    }
}

Recursive backtracking to create permutations of given string

I am currently working on a programming assignment where the user inputs a word,
e.g. "that",
and the program should return all valid words that can be made from the letters of the given string,
e.g. [that, hat, at].
The issue I am having is that the resulting words should be created using a recursive method that checks whether the prefix is valid.
For example, if the given word is "kevin", once the program tries the combination "kv" it should know that no words start with kv and move on to the next combination in order to save time.
Currently my code just creates ALL permutations, which takes a relatively large amount of time when the input is longer than 8 letters.
protected static String wordCreator(String prefix, String letters) {
    int length = letters.length();
    //if each character has been used, return the current permutation of the letters
    if (length == 0) {
        return prefix;
    }
    //else recursively call on itself to permute possible combinations by incrementing the letters
    else {
        for (int i = 0; i < length; i++) {
            words.add(wordCreator(prefix + letters.charAt(i), letters.substring(0, i) + letters.substring(i+1, length)));
        }
    }
    return prefix;
}
If anyone could help me figure this out, I'd much appreciate it. I am also using an AVL tree to store the dictionary words for validation, in case that is relevant.
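One way the prefix check described above might look is sketched below. It is not taken from the thread: it assumes the dictionary is kept in a sorted set, with java.util.TreeSet standing in for the asker's AVL tree, and the names PrefixPruningSketch, isPrefix, collect and findWords are hypothetical.

```java
import java.util.Collection;
import java.util.List;
import java.util.NavigableSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of recursive backtracking with prefix pruning.
class PrefixPruningSketch {

    // In a sorted set, the smallest entry >= prefix starts with prefix
    // exactly when some dictionary word starts with prefix.
    static boolean isPrefix(NavigableSet<String> dict, String prefix) {
        String candidate = dict.ceiling(prefix);
        return candidate != null && candidate.startsWith(prefix);
    }

    static void collect(String prefix, String letters,
                        NavigableSet<String> dict, Set<String> found) {
        if (!prefix.isEmpty()) {
            if (!isPrefix(dict, prefix)) {
                return; // prune: no word starts with this prefix
            }
            if (dict.contains(prefix)) {
                found.add(prefix);
            }
        }
        for (int i = 0; i < letters.length(); i++) {
            collect(prefix + letters.charAt(i),
                    letters.substring(0, i) + letters.substring(i + 1),
                    dict, found);
        }
    }

    static Set<String> findWords(String letters, Collection<String> dictionary) {
        NavigableSet<String> dict = new TreeSet<>(dictionary);
        Set<String> found = new TreeSet<>();
        collect("", letters, dict, found);
        return found;
    }

    public static void main(String[] args) {
        // Tiny made-up dictionary, for illustration only.
        System.out.println(findWords("that", List.of("that", "hat", "at", "kevin")));
        // [at, hat, that]
    }
}
```

Unlike the original method, this version records a word at every depth rather than only when all letters are used, which is what allows shorter words such as "at" to be reported.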

Finding the index of a permutation within a string

I just attempted a programming challenge, which I was not able to complete successfully. The specification is to read 2 lines of input from System.in:
1) A list of 1-100 space-separated words, all of the same length, between 1 and 10 characters.
2) A string up to a million characters in length, which contains a permutation of the above list exactly once. Return the index at which this permutation begins in the string.
For example, we may have:
dog cat rat
abcratdogcattgh
3
Where 3 is the result (as printed by System.out).
It's legal to have a duplicated word in the list:
dog cat rat cat
abccatratdogzzzzdogcatratcat
16
The code that I produced worked providing that the word that the answer begins with has not occurred previously. In the 2nd example here, my code will fail because dog has already appeared before where the answer begins at index 16.
My theory was to:
1) Find the index where each word occurs in the string
2) Extract this substring (as we have a number of known words with a known length, this is possible)
3) Check that all of the words occur in the substring
4) If they do, return the index that this substring occurs in the original string
Here is my code (it should be compilable):
import java.io.BufferedReader;
import java.io.InputStreamReader;
public class Solution {
    public static void main(String[] args) throws Exception {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String line = br.readLine();
        String[] l = line.split(" ");
        String s = br.readLine();
        int wl = l[0].length();
        int len = wl * l.length;
        int sl = s.length();
        for (String word : l) {
            int i = s.indexOf(word);
            int z = i;
            //while (i != -1) {
            int y = i + len;
            if (y <= sl) {
                String sub = s.substring(i, y);
                if (containsAllWords(l, sub)) {
                    System.out.println(s.indexOf(sub));
                    System.exit(0);
                }
            }
            //z+= wl;
            //i = s.indexOf(word, z);
            //}
        }
        System.out.println("-1");
    }
    private static boolean containsAllWords(String[] l, String s) {
        String s2 = s;
        for (String word : l) {
            s2 = s2.replaceFirst(word, "");
        }
        if (s2.equals(""))
            return true;
        return false;
    }
}
I am able to solve my issue and make it pass the 2nd example by un-commenting the while loop. However this has serious performance implications. When we have an input of 100 words at 10 characters each and a string of 1000000 characters, the time taken to complete is just awful.
Given that each case in the test bench has a maximum execution time, the addition of the while loop would cause the test to fail on the basis of not completing the execution in time.
What would be a better way to approach and solve this problem? I feel defeated.

You could concatenate the words together and search for the resulting string.
String a = "dog";
String b = "cat";
String c = a + b; // c is "dogcat"
This way you would overcome the problem of dog appearing somewhere earlier on its own.
But this wouldn't work if catdog is a valid value too.

Here is an approach (pseudo code)
stringArray keys(n) = {"cat", "dog", "rat", "roo", ...};
string bigString(1000000);
L = strlen(keys[0]); // since all are same length
int indices(n, 1000000/L); // much too big - but safe if only one word repeated over and over
for each s in keys
    f = -1
    do:
        f = find s in bigString starting at f+1 // use bigString.indexOf(s, f+1)
        write index of f to indices
    until no more found
When you are all done, you will have a series of indices (location of first letter of match). Now comes the tricky part. Since the words are all the same length, we're looking for a sequence of indices that are all spaced the same way, in the 10 different "collections". This is a little bit tedious but it should complete in a finite time. Note that it's faster to do it this way than to keep comparing strings (comparing numbers is faster than making sure a complete string is matched, obviously). I would again break it into two parts - first find "any sequence of 10 matches", then "see if this is a unique permutation".
sIndx = sort(indices(:))
dsIndx = diff(sIndx);
sequence = find {n} * 10 in dsIndx
for each s in sequence
    check if unique permutation
I hope this gets you going.

Perhaps not the best optimized version, but how about the following approach to give you some ideas:
1) Count the total length of all the words in a row.
2) Take a random word from the list and find the starting index of its first occurrence.
3) Take a substring extending the length counted above before and after that index (e.g. if the index is 15 and there are 3 words of 4 letters each, take the substring from 15-8 to 15+11).
4) Make a copy of the word list with the earlier random word removed.
5) Check the [word_length] letters appended/prepended to the match to see if they match a word on the list copy.
6) If a word matches, remove it from the copy of the list and move further.
7) If all words are found, break the loop.
8) If not all words are found, find the starting index of the next occurrence of the earlier random word and go back to 3.
Why it would help:
Which word you pick to begin with doesn't matter, since every word needs to be in the successful match anyway.
You don't have to manually loop through a lot of the characters, unless there are lots of nearly complete false matches.
As a supposed match keeps growing, you have fewer words left on the list copy to compare against.
You can also keep track of the furthest index you've reached, so you can sometimes limit the backwards length of the picked substring (as it cannot overlap with where you've already been, if the occurrences are close to each other).
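For comparison, here is a sketch of a different technique from the ones proposed above: a sliding window over word-sized chunks, repeated once for each possible alignment, with a count of how many times each word is still allowed. It is an assumption-laden illustration rather than any answerer's method, and the names PermutationIndexSketch and findPermutationIndex are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: sliding window over fixed-size chunks with word counts.
class PermutationIndexSketch {

    // Returns the smallest index where some permutation of words starts, or -1.
    // Assumes words is non-empty and all words have the same length.
    static int findPermutationIndex(String[] words, String text) {
        int wordLen = words[0].length();
        int wordCount = words.length;

        // How many times each word must appear (duplicates are allowed).
        Map<String, Integer> need = new HashMap<>();
        for (String w : words) {
            need.merge(w, 1, Integer::sum);
        }

        int best = -1;
        // Try every alignment of the chunk grid.
        for (int offset = 0; offset < wordLen; offset++) {
            Map<String, Integer> seen = new HashMap<>();
            int left = offset;   // start of the current window
            int matched = 0;     // number of words inside the window
            for (int right = offset; right + wordLen <= text.length(); right += wordLen) {
                String chunk = text.substring(right, right + wordLen);
                if (!need.containsKey(chunk)) {
                    // Not a listed word: restart the window after it.
                    seen.clear();
                    matched = 0;
                    left = right + wordLen;
                    continue;
                }
                seen.merge(chunk, 1, Integer::sum);
                matched++;
                // Too many copies of this word: shrink from the left.
                while (seen.get(chunk) > need.get(chunk)) {
                    String dropped = text.substring(left, left + wordLen);
                    seen.merge(dropped, -1, Integer::sum);
                    matched--;
                    left += wordLen;
                }
                if (matched == wordCount && (best == -1 || left < best)) {
                    best = left;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(findPermutationIndex(
                new String[] {"dog", "cat", "rat"}, "abcratdogcattgh"));                     // 3
        System.out.println(findPermutationIndex(
                new String[] {"dog", "cat", "rat", "cat"}, "abccatratdogzzzzdogcatratcat")); // 16
    }
}
```

Each alignment visits every chunk a bounded number of times, so the total work grows roughly with the text length times the word length instead of restarting a full substring check at every occurrence of a word.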

1772 of Caribbean online judge giving a time limit exceeded error. please help me find why is my algorithm taking so long

I am trying to solve problem 1772 of the Caribbean Online Judge (http://coj.uci.cu/24h/problem.xhtml?abb=1772). The problem asks whether a substring of a bigger string contains at least one palindrome inside it:
e.g. analyzing the substrings taken from the following string: "baraabarbabartaarabcde"
"bara" contains a palindrome "ara"
"abar" contains a palindrome "aba"
"babar" contains a palindrome "bab"
"taar" contains a palindrome "aa"
"abcde" does not contain any palindrome.
and so on.
I believe my approach is really fast, because I iterate over the string from the first char and from the last char at the same time, advancing towards the center and looking only for the patterns "aa" and "aba". Whenever I find a pattern like those, I can say that the given substring contains a palindrome. The problem is that the algorithm still takes a long time, and I can't spot what is wrong with it. Please help me find it; I am really lost on this one. Here is my algorithm:
    public static boolean hasPalindromeInside(String str)
    {
        int midpoint=(int) Math.ceil((float)str.length()/2.0);
        int k = str.length()-1;
        for(int i = 0; i < midpoint;i++)
        {
            char letterLeft = str.charAt(i);
            char secondLetterLeft=str.charAt(i+1);
            char letterRight = str.charAt(k);
            char secondLetterRight = str.charAt(k-1);
            if((i+2)<str.length())
            {
                char thirdLetterLeft=str.charAt(i+2);
                char thirdLetterRight=str.charAt(k-2);
                if(letterLeft == thirdLetterLeft || letterRight == thirdLetterRight)
                {
                    return true;
                }
            }
            if(letterLeft == secondLetterLeft || letterRight==secondLetterRight)
            {
                return true;
            }
            k--;
        }
        return false;
    }
}
I have removed the code that grabs the input strings and intervals of sub-strings, I am using String.substring() to get the substrings and I don't think that will be causing the problem. If you need that code please let me know.
Thanks!

I think you can solve this in O(1) time per query, given O(n) preprocessing to find the locations of all 2- and 3-character palindromes. (Any even-length palindrome has a 2-character palindrome at its centre, while any odd-length palindrome of length 3 or more has a 3-character one, so it suffices to check for 2 and 3.)
For example,
Given your string baraabarbabartaarabcde, first compute an array indicating the locations of the 2-character palindromes:
baraabarbabartaarabcde
000100000000001000000-
Then compute the cumulative sum of this array:
baraabarbabartaarabcde
000100000000001000000-
000111111111112222222-
By doing a subtraction you can immediately work out whether there are any 2-character palindromes in a query range.
Similarly for 3-character palindromes:
baraabarbabartaarabcde String
01001000110000010000-- Indicator
01112222344444455555-- Cumulative
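A minimal sketch of this prefix-sum idea follows. It assumes queries are given as a half-open range [from, to) of 0-based indexes and that only palindromes of length 2 or more count; the class name PalindromeQuerySketch and the method hasPalindrome are hypothetical.

```java
// Hypothetical sketch: O(n) preprocessing, O(1) per query.
class PalindromeQuerySketch {
    // pairPrefix[i]   = number of positions j < i with s[j] == s[j+1]
    // triplePrefix[i] = number of positions j < i with s[j] == s[j+2]
    private final int[] pairPrefix;
    private final int[] triplePrefix;

    PalindromeQuerySketch(String s) {
        int n = s.length();
        pairPrefix = new int[n + 1];
        triplePrefix = new int[n + 1];
        for (int i = 0; i < n; i++) {
            boolean pair = i + 1 < n && s.charAt(i) == s.charAt(i + 1);
            boolean triple = i + 2 < n && s.charAt(i) == s.charAt(i + 2);
            pairPrefix[i + 1] = pairPrefix[i] + (pair ? 1 : 0);
            triplePrefix[i + 1] = triplePrefix[i] + (triple ? 1 : 0);
        }
    }

    // True if s.substring(from, to) contains a palindrome of length >= 2.
    boolean hasPalindrome(int from, int to) {
        // A pair must start in [from, to - 2], a triple in [from, to - 3].
        boolean pair = to - from >= 2 && pairPrefix[to - 1] - pairPrefix[from] > 0;
        boolean triple = to - from >= 3 && triplePrefix[to - 2] - triplePrefix[from] > 0;
        return pair || triple;
    }

    public static void main(String[] args) {
        PalindromeQuerySketch q = new PalindromeQuerySketch("baraabarbabartaarabcde");
        System.out.println(q.hasPalindrome(0, 4));   // "bara"  -> true
        System.out.println(q.hasPalindrome(17, 22)); // "abcde" -> false
    }
}
```

The range bounds are shortened by one or two positions so that a palindrome straddling the end of the query range is not counted by mistake.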

Finding various char's within a String

I have a basic String variable that contains the letter x a total of three times.
I have attempted to find x within the String using charAt, and then print the char and the next two characters next to it.
I have hit a snag within my code and would appreciate any help.
Here is my code.
public class StringX{
    public static void main(String[] args){
        String ss = "xarxatxm";
        char first = ss.charAt(0);
        char last == ss.charAt(3);
        if(first == "x"){
            String findx = ss.substring(0, 2);
        }
        if(last == "x"){
            String findX = ss.substring(3, 5);
        }
        System.out.print(findx + findX);
    }
}
Also, is there a way to implement the for loop to cycle through the String looking for x also?
I just need some advice to see where my code is going wrong.

You cannot find characters using charAt - it's for getting a character once you know where it is.
Is there a way to implement the for loop to cycle through the String looking for x also?
You need to use indexOf for finding positions of characters. Pass the initial position which is the position of the last x that you found so far to get the subsequent position.
For example, the code below
String s = "xarxatxm";
int pos = -1;
while (true) {
    pos = s.indexOf('x', pos+1);
    if (pos < 0) break;
    System.out.println(pos);
}
prints 0 3 6 for the three positions of 'x' in the string.
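Building on the indexOf loop, a sketch of what the question seems to be after (each x together with the next two characters) might look like this. The class name FindXSketch and the method snippetsAfter are made up for illustration, and clamping at the end of the string is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: collect each occurrence of target plus the following characters.
class FindXSketch {

    // Returns, for every occurrence of target, the substring made of that
    // character and up to 'extra' characters after it (fewer near the end).
    static List<String> snippetsAfter(String s, char target, int extra) {
        List<String> out = new ArrayList<>();
        int pos = s.indexOf(target);
        while (pos >= 0) {
            int end = Math.min(pos + 1 + extra, s.length());
            out.add(s.substring(pos, end));
            pos = s.indexOf(target, pos + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(snippetsAfter("xarxatxm", 'x', 2)); // [xar, xat, xm]
    }
}
```

Note that the comparison uses a char literal ('x'); comparing a char with the String "x" using == does not compile in Java.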
