Google Challenge Dilemma, Insights into possible errors? - java

I am currently passing 4 of the 5 hidden test cases for this challenge and would like some input.
Quick problem description:
You are given two input strings, String chunk and String word
The string "word" has been inserted into "chunk" some number of times
The task is to find the shortest string possible when all instances of
"word" have been removed from "chunk".
Keep in mind during removal, more instances of the "word" might be
created in "chunk". "word" can also be inserted anywhere, including
between "word" instances
If there is more than one shortest possible string after removal,
return the one that is lexicographically earliest.
This is easier understood with examples:
Inputs:
(string) chunk = "lololololo"
(string) word = "lol"
Output:
(string) "looo" (since "looo" is earlier than "oolo")
Inputs:
(string) chunk = "goodgooogoogfogoood"
(string) word = "goo"
Output:
(string) "dogfood"
Right now I am iterating forwards and then backwards, removing all instances of word in each pass, and then comparing the results of the two passes.
Is there a case I am overlooking? Is it possible there is a case where you have to remove from the middle first or something along those lines?
Any insight is appreciated.

I am not sure, but I would avoid matching at the first and last characters of chunk and replace all other occurrences.
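One way to find out whether the forwards/backwards strategy misses a case is to compare it against an exhaustive search on small inputs. The following is only a sketch under stated assumptions, not a known solution to the challenge: the names WordRemoval and shortest are made up for illustration, word is assumed to be non-empty, and every removal order is tried (remembering strings already visited), so it is only practical for short chunks.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: exhaustive search over all removal orders.
class WordRemoval {
    private static String best;

    static String shortest(String chunk, String word) {
        if (word.isEmpty()) {
            return chunk; // assumption: nothing to remove for an empty word
        }
        best = chunk;
        explore(chunk, word, new HashSet<>());
        return best;
    }

    private static void explore(String s, String word, Set<String> seen) {
        if (!seen.add(s)) {
            return; // this intermediate string was already explored
        }
        int idx = s.indexOf(word);
        if (idx < 0) {
            // No occurrence left: keep s if it is shorter, or equally long
            // and lexicographically earlier than the best result so far.
            if (s.length() < best.length()
                    || (s.length() == best.length() && s.compareTo(best) < 0)) {
                best = s;
            }
            return;
        }
        // Try removing each occurrence in turn, including overlapping ones.
        while (idx >= 0) {
            explore(s.substring(0, idx) + s.substring(idx + word.length()), word, seen);
            idx = s.indexOf(word, idx + 1);
        }
    }
}
```

Running both this and the two-pass approach on random short inputs and printing any input where they disagree should show fairly quickly whether a "remove from the middle first" case exists.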

Related

Searching for a word in a String using parallelism with Java Fork/Join

Let's say I want to search the occurrence of a word in a string in a parallel way.
Say for example we have a string "Hello i am bob and my name is bob" and a word "bob".
The function needs to return 2.
Achieving this sequentially is pretty easy. We just need to use a for loop to go over our string and count whenever our word matches another word in the string.
I am trying to solve this using parallelism. I thought about splitting the string on every white space and passing the word to each thread, which then will check if it matches our searched word. However, looking for white spaces in our string is still being done sequentially. So, parallelism can not be beneficial here.
Is there any other way to achieve this?
This is not a problem for fork/join, since it is not a recursive action. The Stream API is the way to go here:
String str = "Hello i am bob and my name is bob";
long count = Arrays.stream(str.split("\\s+"))
        .parallel()
        .filter(s -> s.equals("bob"))
        .count();
System.out.println("Bob appeared " + count + " times");
You can check str.indexOf("bob") != str.lastIndexOf("bob"). If the two differ, there are at least two occurrences. You can then remove the first bob and check again: as long as indexOf and lastIndexOf still differ, remove the first one and keep searching until you are done. I'm sure there is a way to make this better.
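A simpler sequential variant of the indexOf idea is to keep searching from just past the previous match instead of removing anything. This is a minimal sketch; the class and method names are made up for illustration, and it counts non-overlapping substring matches, so it would also count "bob" inside "bobby".

```java
// Hypothetical helper: counts non-overlapping occurrences of word in text.
class OccurrenceCounter {
    static int count(String text, String word) {
        if (word.isEmpty()) {
            return 0; // assumption: an empty search word matches nothing
        }
        int count = 0;
        int from = 0;
        // indexOf returns -1 once there is no further match at or after 'from'.
        while ((from = text.indexOf(word, from)) >= 0) {
            count++;
            from += word.length(); // continue after the match just found
        }
        return count;
    }
}
```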

Why is my String array length 3 instead of 2?

I'm trying to understand regex. I wanted to make a String[] using split to show me how many letters are in a given string expression?
import java.util.*;
import java.io.*;
public class Main {
    public static String simpleSymbols(String str) {
        String result = "";
        String[] alpha = str.split("[\\+\\w\\+]");
        int alphaLength = alpha.length;
        // System.out.print(alphaLength);
        // Splitting on letters returns the pieces between them, not the letters.
        String[] charCount = str.split("[a-z]");
        int charCountLength = charCount.length;
        System.out.println(charCountLength);
        return result; // a return statement is required for the method to compile
    }
}
My input string is "+d+=3=+s+". I split the string to count the number of letters in string. The array length should be two but I'm getting three. Also, I'm trying to make a regex to check the pattern +b+, with b being any letter in the alphabet? Is that correct?
So, a few things pop out to me:
First, your regex looks correct. If you're ever worried about how your regex will perform, you can use https://regexr.com/ to check it out. Just put your regex on the top and enter your string in the bottom to see if it is matching correctly.
Second, upon close inspection, I see you're using the split function. While it is convenient for quickly splitting strings, you need to be careful as to what you are splitting on. In this case, you're removing all of the strings that you were initially looking at, which would make it impossible to find. If you print it out, you would notice that the following shows (for an input string of +d+=3=+s+):
+
+=3=+
+
This shows that you accidentally cut out what you were looking to find in the first place. Now, there are several ways of fixing this, depending on what your criteria are.
Now, if what you wanted was just to separate on all +s and it doesn't matter that you find only what is directly bounded by +s, then split works well. Just do str.split("\\+") (the + has to be escaped because it is a regex metacharacter), and this will return an array containing the following, after a leading empty string (for +d+=3=+s+):
d
=3=
s
However, you can see that this poses a few problems. First, it doesn't strip out the =3= that we don't want, and second, it does not truly give us values that are surrounded by a +_+ format, where the underscore represents the string/char you're looking for.
Seeing as you're using \w, you intend to find words that are surrounded by +s. However, if you're just looking to find one character, I would suggest using another character class like [a-z] or [a-zA-Z] to be more specific. If you want to find multiple alphabetical characters, your pattern is fine. You can also add a * (0 or more) or a + (1 or more) at the end of the pattern to dictate what exactly you're looking for.
I won't give you the answer outright, but I'll give you a clue as to what to move towards. Try using a pattern and a matcher to find the regex that you listed above and then if you find a match, make sure to store it somewhere :)
Also, for future reference, you should always start a function name with a lower case, at least in Java. Only constants and class names should start in a capital :)
I am trying to use split to count the number of letters in that string. The array length should be two, but I'm getting three.
The regex passed to split is used as the delimiter and will not appear in the results. In your case str.split("[a-z]") means using letters as delimiters to separate your input string, which produces three substrings: "(+)|d|(+=3=+)|s|(+)".
If you really want to count the number of letters using split, use str.split("[^a-z]+"), but note that the result starts with an empty string whenever the input begins with a non-letter, so the length is still off by one for this input. I would recommend using java.util.regex.Matcher.find() to find all the letters instead.
Also, I'm trying to make a regex to check the pattern +b+, with b being any letter in the alphabet? Is that correct?
Similarly, check the functions in "java.util.regex.Matcher".
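As a rough illustration of the Pattern/Matcher route, here is a sketch that counts letters and counts letters directly surrounded by + signs. The class and method names are invented for this example, and the use of lookarounds (so that a + shared by two neighbouring letters can serve both) is my own assumption about what "the pattern +b+" should mean.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper built on java.util.regex.
class LetterPatterns {
    // A single ASCII letter.
    private static final Pattern LETTER = Pattern.compile("[a-zA-Z]");
    // A letter with a '+' immediately before and after it; the lookarounds
    // do not consume the '+' characters.
    private static final Pattern SURROUNDED = Pattern.compile("(?<=\\+)[a-zA-Z](?=\\+)");

    static int countLetters(String s) {
        return countMatches(LETTER, s);
    }

    static int countSurrounded(String s) {
        return countMatches(SURROUNDED, s);
    }

    private static int countMatches(Pattern p, String s) {
        Matcher m = p.matcher(s);
        int n = 0;
        while (m.find()) { // each call to find() advances to the next match
            n++;
        }
        return n;
    }
}
```

If both counts are equal, every letter in the string is wrapped in + signs, which is one way to phrase the original check.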

What's the fastest way to only rotate certain elements in an array?

I'm writing a form of word scrambler for strings which takes all letters except for the first and last, and rotates their positions. In other words, I am supposed to look only at the second to second-last letters. How should I scramble only from the second letter to the second-last letter?
e.g. scramble "string" to "srintg"
I can call Collections.rotate() on an array of characters created by splitting the string, but that will scramble the entire word.
List<String> newWordList = Arrays.asList(word.split(""));
Collections.rotate(newWordList, -1);
String newWord = String.join("", newWordList);
I want to get the output "srintg", but instead the whole word is rotated and I get "trings".
Provided that your word is long enough for it to be sensible (at least four letters), you can make the approach you present work by rotating a sublist of your list:
Collections.rotate(newWordList.subList(1, newWordList.size() - 1), -1);
List.subList() creates a view of a portion of a List, for the exact purpose of avoiding the need to overload List methods with versions that operate on indexed sub-ranges of the elements. That's "fast" in the sense of fast to write, and it's fairly clear.
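Put together, the sublist approach might look like the sketch below. The class and method names are made up for illustration, and it assumes words shorter than four letters should be returned unchanged.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: rotates only the letters between the first and last.
class SubListRotate {
    static String rotateMiddle(String word) {
        if (word.length() < 4) {
            return word; // assumption: nothing sensible to rotate
        }
        // split("") yields one single-character string per letter.
        List<String> letters = Arrays.asList(word.split(""));
        // The sublist is a view, so rotating it changes 'letters' in place.
        Collections.rotate(letters.subList(1, letters.size() - 1), -1);
        return String.join("", letters);
    }
}
```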
If you are looking for "fast" in a performance sense, however, then splitting and joining strings seems ill-advised. Fastest is probably not something we can offer, as performance needs to be tested, but if I were looking for best performance then I would test at least these general approaches:
Work with an array form of your word:
  1. Use String.toCharArray() to obtain your word's letters in array form.
  2. Use an indexed for loop to rotate the characters in the array.
  3. Construct a new String from the modified array (using the appropriate constructor).
Use a StringBuilder to assemble the word:
  1. Create a StringBuilder with initial capacity equal to the word length.
  2. Iterate over the word's letters using a CharacterIterator, appending them to the builder in the order required. This can be done in a single pass.
  3. Obtain the result string from the builder.
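The array-based steps could be sketched roughly as follows. The names are hypothetical, the rotation is by one position to the left (matching the "string" to "srintg" example), and whether this actually outperforms the alternatives would still need to be measured.

```java
// Hypothetical helper: rotates the middle letters using a char array.
class MiddleRotator {
    static String rotateMiddle(String word) {
        if (word.length() < 4) {
            return word; // assumption: too short to rotate meaningfully
        }
        char[] letters = word.toCharArray();
        char first = letters[1]; // the letter that wraps around to the end
        // Shift positions 2 .. length-2 one place to the left.
        for (int i = 1; i < letters.length - 2; i++) {
            letters[i] = letters[i + 1];
        }
        letters[letters.length - 2] = first;
        return new String(letters);
    }
}
```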

Rearranging one string to another in Java

I am trying to find whether a part of given string A can be or can not be rearranged to given string B (Boolean output).
Since the algorithm must be at most O(n), to ease it, I used stringA.retainAll(stringB), so now I know string A and string B consist of the same set of characters and now the whole task smells like regex.
And .. reading about regex, I might be now having two problems(c).
The question is: do I risk getting O(infinity) by using regex, or is it more efficient to use the Stream API to find whether each character of string A has enough duplicates to cover each character of string B? Not to mention that regex syntax is not easy to read and build.
As of now, I can't use sorting (any sorting is at least n*log(n)) nor hashsets and the likes (as it eliminates duplicates in both strings).
Thank you.
You can use a HashMap<Character,Integer> to count the number of occurrences of each character of the first String. That would take linear time.
Then, for each Character of the second String, find if it's in the HashMap and decrement the counter (if it's still positive). This will also take linear time, and if you manage to decrement the counters for all the characters of the second String, you succeed.
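A minimal sketch of that counting approach is below. The class and method names are made up, and it reads the question as "can B be built from some of the characters of A".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: checks whether b can be formed from characters of a.
class Rearrange {
    static boolean canBuild(String a, String b) {
        Map<Character, Integer> counts = new HashMap<>();
        // First pass: count how often each character occurs in a.
        for (char ch : a.toCharArray()) {
            counts.merge(ch, 1, Integer::sum);
        }
        // Second pass: use up one occurrence for every character of b.
        for (char ch : b.toCharArray()) {
            int left = counts.getOrDefault(ch, 0);
            if (left == 0) {
                return false; // a has no remaining copy of this character
            }
            counts.put(ch, left - 1);
        }
        return true;
    }
}
```

Both passes touch each character once, so the whole check is linear in the combined length of the two strings.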

Java: Finding how many words appear in BOTH data sources?

I'm trying to figure out if there is an easy way to count the number of words that appear in small paragraph (#1) and small paragraph (#2).
Generally, I'm determining how much overlap there is in these paragraphs on a word-by-word basis. So if (#1) contains the word "happy" and (#2) contains the word "happy", that would be like a +1 value.
I know that I could use String.contains() for each word in (#1) applied to (#2), but I was wondering if there is something more efficient that I could use.
You can create two sets s1 and s2, containing all words from first and second paragraph respectively, and intersect them: s1.retainAll(s2). Sounds easy enough.
update
Works for me
Set<String> s1 = new HashSet<String>(Arrays.asList("abc xyz 123".split("\\s")));
Set<String> s2 = new HashSet<String>(Arrays.asList("xyz 000 111".split("\\s")));
s1.retainAll(s2);
System.out.println(s1.size());
Don't forget to remove the empty string from both sets (split can produce it when the text has leading or consecutive whitespace).
