String metrics algorithms in Java - java

I am solving the problem of processing strings in Java. Please help me solve the problem.
Condition of the problem: John has launched his new startup to recognize the clouds he has seen, and named it string A of length N. But suddenly he found out that Sam has also launched a cloud recognition startup and named it string B of length N.
More formally, let string A be the name of John's startup and string B the name of Sam's startup. Both strings have the same length N. For each position 1 ≤ i ≤ N of string B, you need to calculate the type of match at this position with string A.
If Bi=Ai, then in position i the match type must be equal to P (from the word plagiarism).
If Bi≠Ai, but there is another position 1≤j≤N such that Bi=Aj, then in position i the match type must be equal to S (from the word suspicious).
In other positions, the match type must be equal to I (from the word innocent).
Note:
Letters within one string can be repeated.
Each letter of string A can be used in at most one plagiarism or suspicious match.
Preference is always given to the plagiarism type.
In the case of a suspicious match, the leftmost position in string A is always preferred.
Input Format
The first line contains the string A (1≤|A|≤10^6), the startup name chosen by John.
The second line contains the string B (|B|=|A|), the name of Sam's startup.
It is guaranteed that strings A and B contain only uppercase Latin letters.
Output Format
Output a single line C (|C|=|B|), where Ci is the match type of the character Bi (1≤i≤|B|):
for type plagiarism, Ci=P;
for type suspicious, Ci=S;
for type innocent, Ci=I.
Example 1:
Input:
CLOUD
CUPID
Output:
PSIIP
Example 2:
Input:
ALICE
ELIBO
Output:
SPPII
Example 3:
Input:
ABCBCYA
ZBBACAA
Output:
IPSSPIP
Notes:
Explanation for the first test
B1=A1 and B5=A5 , so for positions 1 and 5 the answer is P.
B2≠A2 , but B2=A4, so for position 2 the answer is S.
Letters P and I do not occur in string A, so for positions 3 and 4 the answer is I.
Explanation for the second test:
B2=A2 and B3=A3, so for positions 2 and 3 the answer is P.
B1≠A1 , but B1=A5, so for position 1 the answer is S.
Letters B and O do not occur in string A, so for positions 4 and 5 the answer is I.
Explanation for the third test:
B2=A2 , B5=A5 and B7=A7 so for positions 2, 5 and 7 the answer is P.
B3≠A3, but B3=A2=A4. A2 is already used by the match B2=A2,
so the correspondence B3=A4 is chosen: for position 3 the answer is S.
B4≠A4 and B6≠A6, but B4=B6=A1=A7.
A7 is already used by the match B7=A7;
4<6, so the correspondence B4=A1 (answer S) is selected for position 4;
there are no matches left for position 6 (answer I).
The letter Z does not occur in string A, so for position 1 the answer is I.
My solution:
public class Solution {
public boolean backspaceCompare(String s, String t) {
return formBackSpaceString(s).equals(formBackSpaceString(t));
}
private String formBackSpaceString(String s) {
Stack<Character> stack = new Stack<>();
for (char c : s.toCharArray()) {
if (c == '#') {
if (!stack.isEmpty()) {
stack.pop();
}
} else {
stack.push(c);
}
}
StringBuilder sb = new StringBuilder();
while (!stack.isEmpty()) {
sb.append(stack.pop());
}
return sb.toString();
}
public static void main (String[] args){
}
}
I am bogged down in the logic of this task. I would be grateful for help at least at the pseudocode level.

The code you provided does not seem related to the problem you described.
The problem asks to apply a simple rule for chars (Ai, Bi) from the original string to construct a new string C. There are only three rules:
If Ai == Bi, Ci = "P"
Otherwise, if A contains char Bi, Ci = "S"
Otherwise, Ci = "I"
You can do that in a simple way using the Stream API:
String problem(String A, String B) {
// construct a set of all chars in A
Set<Integer> aChars = A.chars().boxed().collect(Collectors.toSet());
// apply rules for chars Ai, Bi
return IntStream.range(0, A.length())
.mapToObj(i -> A.charAt(i) == B.charAt(i) ? "P" :
aChars.contains((int) B.charAt(i)) ? "S" : "I")
.collect(Collectors.joining());
}
Or, more verbosely, without it:
String problem(String A, String B) {
Set<Character> aChars = new HashSet<>();
for (int i = 0; i < A.length(); i++) {
aChars.add(A.charAt(i));
}
StringBuilder builder = new StringBuilder();
for (int i = 0; i < A.length(); i++) {
if (A.charAt(i) == B.charAt(i)) {
builder.append("P");
} else if (aChars.contains(B.charAt(i))) {
builder.append("S");
} else {
builder.append("I");
}
}
return builder.toString();
}
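Note that the two versions above apply only the three basic rules. They do not account for the note in the problem statement that each letter of A can be used in at most one match, so for the third example they would produce IPSSPSP rather than the expected IPSSPIP. Below is a minimal sketch of one way to handle that, assuming the input contains only uppercase Latin letters as the statement guarantees. The names MatchTypes and classify are made up for this illustration. The idea is to mark the P positions first, count the letters of A that are still unused, and then spend those counts from left to right for the S matches. Since only the output string matters, the sketch does not track which exact position of A is consumed.

```java
class MatchTypes {
    // Hypothetical helper: returns the match-type string C for strings a and b.
    // Assumes a.length() == b.length() and only the letters 'A'..'Z'.
    static String classify(String a, String b) {
        int n = a.length();
        char[] c = new char[n];
        int[] remaining = new int[26]; // unused letters of A, per letter

        // Pass 1: plagiarism has priority, so mark exact matches first
        // and count every letter of A that was not consumed by a P match.
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) == b.charAt(i)) {
                c[i] = 'P';
            } else {
                remaining[a.charAt(i) - 'A']++;
            }
        }

        // Pass 2: going left to right over B, a non-P position is S
        // if an unused copy of its letter is still available in A.
        for (int i = 0; i < n; i++) {
            if (c[i] == 'P') {
                continue;
            }
            int letter = b.charAt(i) - 'A';
            if (remaining[letter] > 0) {
                remaining[letter]--;
                c[i] = 'S';
            } else {
                c[i] = 'I';
            }
        }
        return new String(c);
    }
}
```

Both passes are linear in N, which should be comfortable for strings of up to 10^6 characters.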


What happens when if statement goes true (in this code)?

There is a problem on codingbat.com where you're supposed to remove every "yak" substring from the original string. They provide a solution for it, but I can't understand what happens when the if statement is true.
public String stringYak(String str) {
String result = "";
for (int i=0; i<str.length(); i++) {
// Look for i starting a "yak" -- advance i in that case
if (i+2<str.length() && str.charAt(i)=='y' && str.charAt(i+2)=='k') {
i = i + 2;
} else { // Otherwise do the normal append
result = result + str.charAt(i);
}
}
return result;
}
It just increases i by 2, and then what? When does it append to the result string?
Link of the problem:
https://codingbat.com/prob/p126212
The provided solution checks every single character in the input string, with i as the index of the character currently being checked. When the current char is not a y, or the char at i+2 is not a k, the current char is appended and the index advances by 1 position.
Example:
yakpak
012345
i
So here in the first iteration the char at i is y and the char at i+2 is k, so we have to skip 3 chars. Keep in mind that the for loop advances i by 1 every time, so i only has to be increased by 2 more. After this iteration i is here:
yakpak
012345
   i
So now the current char is not a y, and this char gets added to the result string.
But it's even simpler in Java, as this functionality is built in with regex:
public String stringYak(String str) {
return str.replaceAll("y.k","");
}
The . matches any single character (except line terminators, by default).
If i is pointing at a y and there is a k two positions down, then it wants to skip the full y*k substring, so it adds 2 to i so that i now refers to the k. When the loop then continues, i++ skips past the k, so in effect the entire 3-letter y*k substring has been skipped.
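The skipping described above may be easier to see when the index is advanced explicitly instead of relying on the for loop's i++. The following is only a sketch of the same logic rewritten with a while loop. The class name YakDemo is invented for this example, and StringBuilder replaces the string concatenation of the original solution.

```java
class YakDemo {
    // Same behaviour as the CodingBat solution, with the index updates spelled out.
    static String stringYak(String str) {
        StringBuilder result = new StringBuilder();
        int i = 0;
        while (i < str.length()) {
            boolean startsYak = i + 2 < str.length()
                    && str.charAt(i) == 'y'
                    && str.charAt(i + 2) == 'k';
            if (startsYak) {
                // Skip all three characters of the y?k pattern; nothing is appended.
                i += 3;
            } else {
                // Normal case: keep the character and move on by one.
                result.append(str.charAt(i));
                i++;
            }
        }
        return result.toString();
    }
}
```

In the original for loop, the i += 3 here is split into i = i + 2 inside the if and the i++ of the loop header.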

A Java string exercise I came across

Look for patterns like "zip" and "zap" in the string -- length-3, starting with 'z' and ending with 'p'. Return a string where for all such words, the middle letter is gone, so "zipXzap" yields "zpXzp"
Here is a solution I got from someone:
public class Rough {
public static void main(String [] args){
StringBuffer mat = new StringBuffer("matziplzdpaztp");
for(int i = 0; i < mat.length() - 2; ++i){
if (mat.charAt(i) == 'z' & mat.charAt(i + 2) == 'p'){
mat.deleteCharAt(i + 1);
}
}
System.out.println(mat);
}
}
But why is the for loop condition (i < mat.length() - 2) and not (i < mat.length())?
Because in the loop:
if (mat.charAt(i) == 'z' & mat.charAt(i + 2) == 'p'){
// -----------------------------------^^^^^
If i were bound by i < mat.length(), then i + 2 would be out of bounds.
Because you don't have to go all the way to the end of the string, since the words you are looking for are three letters long.
The "2" is the length of the pattern minus its first character. You check every position in the string as a possible first character of the pattern, so you can skip the last positions, where the rest of the pattern would not fit. In your case the pattern "z*p" has length 3: you treat each position as the z, so you can ignore the final 2 positions, which are too close to the end to hold "*p".
mat.length() gives 14 here, and if you evaluate mat.charAt(i + 2) near the end of the string it throws java.lang.StringIndexOutOfBoundsException, because the string is indexed from 0, not from 1, and i + 2 goes past the last index. If you still want to use i < mat.length() as the loop condition, you have to guard the access yourself, for example by adding i + 2 < mat.length() to the if condition and using the short-circuit AND operator '&&' instead of '&', so that charAt(i + 2) is only evaluated when the index is valid.
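To make the bound concrete, here is a sketch of the same loop wrapped in a method (the names ZipZap and zipZap are chosen for this example and are not from the original post). The condition is re-evaluated on every iteration, so it keeps tracking the buffer's length as characters are deleted.

```java
class ZipZap {
    // Removes the middle letter of every length-3 pattern z?p.
    static String zipZap(String str) {
        StringBuilder sb = new StringBuilder(str);
        // i + 2 must be a valid index, hence i < sb.length() - 2.
        for (int i = 0; i < sb.length() - 2; i++) {
            if (sb.charAt(i) == 'z' && sb.charAt(i + 2) == 'p') {
                sb.deleteCharAt(i + 1); // the buffer shrinks by one here
            }
        }
        return sb.toString();
    }
}
```

With this bound, strings shorter than three characters never enter the loop body, so no extra length check is needed.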

Find every possible subset given a string [duplicate]

I'm trying to find every possible anagram of a string in Java. By this I mean that if I have a 4 character long word, I want all the possible 3 character long words derived from it, all the 2 character long ones and all the 1 character long ones. The most straightforward way I thought of is to use two nested for loops and iterate over the string. This is my code as of now:
private ArrayList<String> subsets(String word){
ArrayList<String> s = new ArrayList<String>();
int length = word.length();
for (int c=0; c<length; c++){
for (int i=0; i<length-c; i++){
String sub = word.substring(c, c+i+1);
System.out.println(sub);
//if (!s.contains(sub) && sub!=null)
s.add(sub);
}
}
//java.util.Collections.sort(s, new MyComparator());
//System.out.println(s.toString());
return s;
}
My problem is that it works for 3 letter words: fun yields this result (don't mind the ordering; the word is processed so that I have a string with the letters in alphabetical order):
f
fn
fnu
n
nu
u
But when I try 4 letter words, it leaves something out, as in catq gives me:
a
ac
acq
acqt
c
cq
cqt
q
qt
t
i.e., I don't see the 3 character long word act - which is the one I'm looking for when testing this method. I can't understand what the problem is, and it's most likely a logical error I'm making when creating the substrings. If anyone can help me out, please don't give me the code for it but rather the reasoning behind your solution. This is a piece of coursework and I need to come up with the code on my own.
EDIT: to clear something out, for me acq, qca, caq, aqc, cqa, qac, etc. are the same thing - To make it even clearer, what happens is that the string gets sorted in alphabetical order, so all those permutations should come up as one unique result, acq. So, I don't need all the permutations of a string, but rather, given a 4 character long string, all the 3 character long ones that I can derive from it - that means taking out one character at a time and returning that string as a result, doing that for every character in the original string.
I hope I have made my problem a bit clearer
It's working fine, you just misspelled "caqt" as "acqt" in your tests/input.
(The issue is probably that you're sorting your input. If you want substrings, you have to leave the input unsorted.)
After your edits: see "Generating all permutations of a given string". Then just sort the individual letters, and put them in a set.
Ok, as you've already devised your own solution, I'll give you my take on it. Firstly, consider how big your result list is going to be. You're essentially taking each letter in turn, and either including it or not. Two possibilities for each letter gives you 2^n total results, where n is the number of letters. This of course includes the case where you don't use any letter and end up with an empty string.
Next, if you enumerate every possibility with a 1 for 'include this letter' and a 0 for 'don't include it', taking your 'fnu' example you end up with:
000 - ''
001 - 'u'
010 - 'n'
011 - 'nu'
100 - 'f'
101 - 'fu' (no offense intended)
110 - 'fn'
111 - 'fnu'.
Clearly, these are just binary numbers, and you can derive a function that, given any number from 0 to 7 and the three-letter input, will calculate the corresponding subset.
It's fairly easy to do in Java. I don't have a Java compiler to hand, but this should be approximately correct:
public String getSubSet(String input, int index) {
    // Should check that index >= 0 and < 2^input.length() here.
    // Should also check that input.length() <= 30, so that 1 << input.length() fits in an int.
    String returnValue = "";
    for (int i = 0; i < input.length(); i++) {
        if ((index & (1 << i)) != 0) // 1 << i is the equivalent of 2^i
            returnValue += input.charAt(i);
    }
    return returnValue;
}
Then, if you need to, you can just do a loop that calls this function, like this:
for (int i = 1; i < (1 << input.length()); i++)
    getSubSet(input, i); // this doesn't do anything, but you can add it to a list, or output it as desired.
Note I started from 1 instead of 0- this is because the result at index 0 will be the empty string. Incidentally, this actually does the least significant bit first, so your output list would be 'f', 'n', 'fn', 'u', 'fu', 'nu', 'fnu', but the order didn't seem important.
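For reference, the two pieces above can be combined into one self-contained sketch. The class name BitmaskSubsets and the choice to collect the results in a List are assumptions made for this example. The 30-character limit is a restriction of this sketch, so that 1 << length still fits in a positive int.

```java
import java.util.ArrayList;
import java.util.List;

class BitmaskSubsets {
    // Returns every non-empty subset of the letters of input, keeping their order.
    static List<String> subsets(String input) {
        if (input.length() > 30) {
            throw new IllegalArgumentException("sketch supports at most 30 characters");
        }
        List<String> result = new ArrayList<>();
        // Each mask from 1 to 2^n - 1 selects one subset; mask 0 would be the empty string.
        for (int mask = 1; mask < (1 << input.length()); mask++) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < input.length(); i++) {
                // Bit i of the mask decides whether character i is included.
                if ((mask & (1 << i)) != 0) {
                    sb.append(input.charAt(i));
                }
            }
            result.add(sb.toString());
        }
        return result;
    }
}
```

Calling subsets("acqt") on the sorted word from the question would include "act", the string that the nested-loop substring approach could not produce.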
This is the method I came up with, seems like it's working
private void subsets(String word, ArrayList<String> subset){
if(word.length() == 1){
subset.add(word);
return;
}
else {
String firstChar = word.substring(0,1);
word = word.substring(1);
subsets(word, subset);
int size = subset.size();
for (int i = 0; i < size; i++){
String temp = firstChar + subset.get(i);
subset.add(temp);
}
subset.add(firstChar);
return;
}
}
First I check whether the word is longer than one character. If it is, I save the first character and make a recursive call with the rest of the string. In this way the whole string gets sliced into single characters saved on the recursion stack, until the word has length 1 and only one character remains.
When that happens, that character is added to the list and the recursion starts to unwind. Each call looks at the current size of the list (1 the first time), and a for loop adds the character saved for that call concatenated with every element already in the list. Then it adds the character on its own and returns to the previous call.
For example, with the word fun this happens:
f saved
List empty
recursive call(un)
-
u saved
List empty
recursive call(n)
-
n.length == 1
List = [n]
return
-
list.size=1
temp = u + list[0]
List = [n, un]
add the character saved in the stack on its own
List = [n, un, u]
return
-
list.size=3
temp = f + list[0]
List = [n, un, u, fn]
temp = f + list[1]
List = [n, un, u, fn, fun]
temp = f + list[2]
List = [n, un, u, fn, fun, fu]
add the character saved in the stack on its own
List = [n, un, u, fn, fun, fu, f]
return
I have been as clear as possible, I hope this clarifies what was my initial problem and how to solve it.
This is working code:
public static void main(String[] args) {
String input = "abcde";
Set<String> returnList = permutations(input);
System.out.println(returnList);
}
private static Set<String> permutations(String input) {
if (input.length() == 1) {
Set<String> a = new TreeSet<>();
a.add(input);
return a;
}
Set<String> returnSet = new TreeSet<>();
for (int i = 0; i < input.length(); i++) {
String prefix = input.substring(i, i + 1);
Set<String> permutations = permutations(input.substring(i + 1));
returnSet.add(prefix);
returnSet.addAll(permutations);
Iterator<String> it = permutations.iterator();
while (it.hasNext()) {
returnSet.add(prefix + it.next());
}
}
return returnSet;
}

How do I check if a char is a vowel?

This Java code is giving me trouble:
String word = <Uses an input>
int y = 3;
char z;
do {
z = word.charAt(y);
if (z!='a' || z!='e' || z!='i' || z!='o' || z!='u')) {
for (int i = 0; i==y; i++) {
wordT = wordT + word.charAt(i);
} break;
}
} while(true);
I want to check if the third letter of word is a non-vowel, and if it is I want it to return the non-vowel and any characters preceding it. If it is a vowel, it checks the next letter in the string, if it's also a vowel then it checks the next one until it finds a non-vowel.
Example:
word = Jaemeas then wordT must = Jaem
Example 2:
word=Jaeoimus then wordT must =Jaeoim
The problem is with my if statement, I can't figure out how to make it check all the vowels in that one line.
Clean method to check for vowels:
public static boolean isVowel(char c) {
return "AEIOUaeiou".indexOf(c) != -1;
}
Your condition is flawed. Think about the simpler version
z != 'a' || z != 'e'
If z is 'a' then the second half will be true since z is not 'e' (i.e. the whole condition is true), and if z is 'e' then the first half will be true since z is not 'a' (again, whole condition true). Of course, if z is neither 'a' nor 'e' then both parts will be true. In other words, your condition will never be false!
You likely want &&s there instead:
z != 'a' && z != 'e' && ...
Or perhaps:
"aeiou".indexOf(z) < 0
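To see the difference concretely, here is a small sketch that puts the original || condition next to the && version. The class and method names are invented for this example, and only lowercase vowels are considered, as in the question.

```java
class VowelCondition {
    // The condition from the question: true for every possible character,
    // because no character can equal all five vowels at once.
    static boolean brokenIsNonVowel(char z) {
        return z != 'a' || z != 'e' || z != 'i' || z != 'o' || z != 'u';
    }

    // The corrected condition: true only if z differs from every vowel.
    static boolean isNonVowel(char z) {
        return z != 'a' && z != 'e' && z != 'i' && z != 'o' && z != 'u';
    }

    // Equivalent lookup-based form of the corrected condition.
    static boolean isNonVowelByLookup(char z) {
        return "aeiou".indexOf(z) < 0;
    }
}
```

For the character 'a', brokenIsNonVowel returns true while the other two methods return false, which is why the if in the question always fired on the first character it looked at.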
How about an approach using regular expressions? If you use the proper pattern you can get the results from the Matcher object using groups. In the code sample below the call to m.group(1) should return you the string you're looking for as long as there's a pattern match.
String wordT = null;
Pattern patternOne = Pattern.compile("^([\\w]{2}[AEIOUaeiou]*[^AEIOUaeiou]{1}).*");
Matcher m = patternOne.matcher("Jaemeas");
if (m.matches()) {
wordT = m.group(1);
}
Just a little different approach that accomplishes the same goal.
Actually there are much more efficient ways to check it but since you've asked what is the problem with yours, I can tell that the problem is you have to change those OR operators with AND operators. With your if statement, it will always be true.
In case anyone ever comes across this and wants an easy check that can be used in many scenarios:
It doesn't matter whether the letter is UPPERCASE or lowercase, A-Z or a-z.
bool vowel = ((1 << letter) & 2130466) != 0;
This is the easiest way I could think of. I tested this in C++ on a 64-bit PC, so results may differ elsewhere. The trick relies on only the low five bits of the shift count being used when shifting a 32-bit integer: the bits worth 64 and 32 are dropped, so "<< letter" effectively shifts by a value from 1 to 26.
If you don't understand how bits work, sorry, I'm not going to go into depth here, but 1 << N is the same as 2^N, i.e. it creates a power of two.
So when we do (1 << N) & X, we check whether X contains that power of two. The vowels are encoded in the value 2130466 (bits 1, 5, 9, 15 and 21, for a, e, i, o and u). If the result is not 0, the letter is a vowel.
The same technique applies to anything you use bit flags for, as long as the values reduce to the range 0 to 31. The letters are 65-90 or 97-122, but dropping the 64 and 32 bits leaves a value from 1 to 26. Strictly speaking this is a bit mask rather than a remainder, but it gives you an idea of the process.
Keep in mind that if you have no guarantee about the incoming characters, you should first check whether the character is below 'A' or above 'u', since the result should always be false for those.
For example, the following gives a false positive: the exclamation point "!" has value 33, which produces the same bit as 'A' or 'a' would.
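Since the question is about Java, here is a hedged sketch of the same idea in Java. In Java the shift count of an int shift is defined to use only its five lowest bits, so the trick does not depend on the platform. The class name VowelMask and the explicit letter check are additions made for this example.

```java
class VowelMask {
    // Bits 1, 5, 9, 15 and 21 set: the alphabet positions of a, e, i, o, u.
    private static final int VOWEL_MASK = 2130466;

    static boolean isVowel(char c) {
        // Guard against non-letters such as '!' (33), which would otherwise
        // map to the same bit as 'A' and 'a'.
        boolean isLetter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        // c & 31 keeps the low five bits: 1 for A/a up to 26 for Z/z.
        return isLetter && ((1 << (c & 31)) & VOWEL_MASK) != 0;
    }
}
```

The explicit c & 31 only documents what the shift would do anyway, and the letter check removes the false positive mentioned above.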
For starters, you are checking if the letter is "not a" OR "not e" OR "not i" etc.
Let's say that the letter is i. Then the letter is not a, so that comparison returns true, and the entire statement is true because i != a. I think what you are looking for is to AND the comparisons together, not OR them.
Once you do this, you need to look at how to increment y and check this again. If the first time you get a vowel, you want to see if the next character is a vowel too, or not. This only checks the character at location y=3.
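String word = "Jaemeas";
String wordT = "";
int y = 3;
char z;
// Stop once y runs past the end of the word, so charAt(y) cannot throw.
while (y < word.length()) {
    z = word.charAt(y);
    if (z != 'a' && z != 'e' && z != 'i' && z != 'o' && z != 'u') {
        // Non-vowel found: copy everything up to and including index y.
        for (int i = 0; i <= y; i++) {
            wordT = wordT + word.charAt(i);
        }
        break;
    } else {
        y++;
    }
}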
here is my answer.
I have declared a char[] constant for the VOWELS, then implemented a method that checks whether a char is a vowel or not (returning a boolean value). In my main method, I am declaring a string and converting it to an array of chars, so that I can pass the index of the char array as the parameter of my isVowel method:
public class FindVowelsInString {
static final char[] VOWELS = {'a', 'e', 'i', 'o', 'u'};
public static void main(String[] args) {
String str = "hello";
char[] array = str.toCharArray();
// Check with a consonant
boolean vowelChecker = FindVowelsInString.isVowel(array[0]);
System.out.println("Is this character a vowel? " + vowelChecker);
// Check with a vowel
boolean vowelChecker2 = FindVowelsInString.isVowel(array[1]);
System.out.println("Is this character a vowel? " + vowelChecker2);
}
private static boolean isVowel(char vowel) {
boolean isVowel = false;
for (int i = 0; i < FindVowelsInString.getVowel().length; i++) {
if (FindVowelsInString.getVowel()[i] == vowel) {
isVowel = true;
}
}
return isVowel;
}
public static char[] getVowel() {
return FindVowelsInString.VOWELS;
}
}

Subsequence of a string

I have to write a program that takes string argument s and integer argument k and prints out all subsequences of s of length k. For example if I have
subSequence("abcd", 3);
the output should be
abc abd acd bcd
I would like guidance. No code, please!
Thanks in advance.
Update:
I was thinking to use this pseudocode:
Start with an empty string
Append the first letter to the string
Append the second letter
Append the third letter
Print the so-far build substring - base case
Return the second letter
Append the fourth letter
Print the substring - base case
Return the first letter
Append the third letter
Append the fourth letter
Print the substring - base case
Return third letter
Append the second letter
Append the third letter
Append the fourth letter
Print the substring - base case
Return the third letter
Return the second letter
Append the third letter
Append the fourth letter
Return third letter
Return fourth letter
Return third letter
Return second letter
Return first letter
The different indent means going deeper in the recursive calls.
(In response to Diego Sevilla):
Following your suggestion:
private String SSet = "";
private String subSequence(String s, int substr_length){
if(k == 0){
return SSet;
}
else{
for(int i = 0; i < substr_length; i++){
subString += s.charAt(i);
subSequence(s.substring(i+1), k-1);
}
}
return SSet;
}
}
As you included "recursion" as a tag, I'll try to explain the strategy for the solution. The recursive function should be a function like the one you show:
subSequence(string, substr_length)
that actually returns a Set of (sub)strings. Note how the problem can be divided into sub-problems that lend themselves to recursion. Each subSequence(string, substr_length) should:
Start with an empty substring set, that we call SSet.
Do a loop from 0 to the length of the string minus substr_length
In each loop position i, you take string[i] as the beginning character, and call recursively to subSequence(string[i+1..length], substr_length - 1) (here the .. imply an index range into the string, so you have to create the substring using these indices). That recursive call to subSequence will return all the strings of size substr_length -1. You have to prepend to all those substrings the character you selected (in this case string[i]), and add all of them to the SSet set.
Just return the constructed SSet. This one will contain all the substrings.
Of course, this process is highly optimizable (for example using dynamic programming storing all the substrings of length i), but you get the idea.
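For readers who do want to see it in code, the steps above can be sketched roughly as follows. This is only one possible reading of the strategy. The class name is invented, a LinkedHashSet is used to keep the results in insertion order, and the base case for length 0 (a set containing just the empty string) is an assumption that the description leaves implicit.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SubsequencesOfLengthK {
    // Returns all subsequences of s with exactly k characters, order preserved.
    static Set<String> subSequence(String s, int k) {
        Set<String> sSet = new LinkedHashSet<>();
        if (k == 0) {
            // Assumed base case: exactly one subsequence of length 0.
            sSet.add("");
            return sSet;
        }
        // The first character can only come from positions that leave
        // at least k - 1 characters after it.
        for (int i = 0; i <= s.length() - k; i++) {
            // All subsequences of length k - 1 from the rest of the string...
            for (String tail : subSequence(s.substring(i + 1), k - 1)) {
                // ...each prefixed with the chosen character.
                sSet.add(s.charAt(i) + tail);
            }
        }
        return sSet;
    }
}
```

For subSequence("abcd", 3) this yields abc, abd, acd and bcd, matching the expected output in the question. If k is larger than the length of s, the loop never runs and the result is empty.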
So, I see you want to implement a method subSequence(s, n), which returns a collection of all character combinations from s of length n, such that ordering is preserved.
In the spirit of your desire to not provide you with code, I assume you would prefer no pseudo-code either. So, I will explain my suggested approach in a narrative fashion, leaving the translation to procedural code as an exercise-to-the-reader(TM).
Think of this problem where you are obtaining all combinations of character positions, which could be represented as an array of bits (a.k.a. flags). So where s="abcd" and n=3 (as in your example), all combinations could be represented as follows:
1110 = abc
1101 = abd
1011 = acd
0111 = bcd
Note, that we start with a bit-field where all characters are turned "on" and then shift the "off" bit over by 1. Things get interesting in an example where n < length(s) - 1. For example, say s="abcd" and n=2. Then we have:
1100 = ab
1001 = ad
1010 = ac
0110 = bc
0101 = bd
0011 = cd
The recursion comes into play when you analyze a sub set of the bit-fields. Hence, a recursive call would reduce the size of the bit-field and "bottom-out" where you have three flags:
100
010
001
The bulk of the work is a recursive approach to find all of the bit-fields. Once you have them, the position of each bit can be used as an index into the array of characters (that is, s).
This should be sufficient to get you started on some pseudo-code!
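As an illustration of the bit-field view only, here is a sketch that enumerates the bit-fields with a plain loop instead of the recursion described above, which is a simplification on my part. It walks through all masks in descending numeric order and keeps those with exactly n bits set. The class name and the 30-character limit are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class BitFieldSubsequences {
    // Leftmost bit of the field corresponds to the first character of s.
    static List<String> subSequence(String s, int n) {
        int len = s.length();
        if (len > 30) {
            throw new IllegalArgumentException("sketch supports at most 30 characters");
        }
        List<String> result = new ArrayList<>();
        for (int mask = (1 << len) - 1; mask >= 0; mask--) {
            // Keep only bit-fields with exactly n flags turned on.
            if (Integer.bitCount(mask) != n) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < len; i++) {
                // Character i is selected by bit (len - 1 - i), counted from the right.
                if ((mask & (1 << (len - 1 - i))) != 0) {
                    sb.append(s.charAt(i));
                }
            }
            result.add(sb.toString());
        }
        return result;
    }
}
```

This visits all 2^len masks, so it is only practical for short strings. A recursive generator of just the n-bit fields, as the answer suggests, avoids that.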
The problem is precisely this:
Given an ordered set S : {C0, C1, C2, ..., Cn}, derive all ordered subsets S', where each member of S' is a member of S, and relative order of {S':Cj, S':Cj+1} is equivalent to relative order {S:Ci, S:Ci+d} where S':Cj = S:Ci and S':Cj+1 = S:Ci+d. |S|>=|S'|.
Assume/assert size of set S, |S| is >= the size of the subset, |S'|
If |S| - |S'| = d, then you know each of the subsets S' begins with the element at Si, where 0 <= i <= d.
e.g. given S:{a, b, c, d} and |S'| = 3
d = 1
S' sets begin with 'a' (S:0), and 'b' (S:1).
So we see the problem is actually to solve d lexically ordered permutations of length 3 of subsets of S.
#d=0: get l.o.permutations of length 3 for {a, b, c, d}
#d=1: get l.o.permutations of length 3 for {b, c, d}
#d=2: d > |S|-|S'|. STOP.
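// C++: checks whether s1 is a subsequence of s.
#include <string>
using std::string;

string subSeqString() {
    string s1 = "hackerrank";
    string s = "hhaacckkekraraannk";
    // k is the next index of s to scan; c counts the matched characters of s1.
    int k = 0, c = 0;
    int size = s1.size();
    for (int i = 0; i < size; i++) {
        for (int j = k; j < s.size(); j++) {
            if (s1[i] == s[j]) {
                c++;
                k++;
                break;
            }
            k++;
        }
    }
    if (c == size)
        return "YES";
    else
        return "NO";
}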
