What does this Java 8 stream code mean?

boolean isA2Z(String str) {
    return (str.chars().filter(i -> i >= 'a' && i <= 'z').distinct().count() == 26);
}
From the method name, it looks like it tests whether a String contains all the letters from a to z, but the code doesn't look right to me.
It collects the individual characters from the string and then returns the count of the stream. How can this work?

str.chars() --> returns an IntStream of the characters in the String
filter(i -> i >= 'a' && i <= 'z') --> lets only the values in ['a', 'z'] (inclusive) through to the next step
distinct() --> keeps only the distinct values
count() --> counts the items that made it this far
Functionally, it checks whether the String contains every lowercase letter in ['a', 'z'] (inclusive) at least once (ignoring corner cases here).

Here's how it works, step by step:
Convert the string to a stream of its individual characters
Exclude any characters not between 'a' and 'z' (inclusive)
Reduce the remaining characters to the distinct ones
Count the number of unique characters
Return true if the number is 26; return false otherwise
In other words, the method returns true if and only if the input string contains at least one occurrence of every lowercase letter from a to z.
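The steps above can be tried out with a small sketch. The class name PangramCheck and the sample inputs are made up for illustration; the pipeline itself is the one from the question, only split across lines.

```java
class PangramCheck {
    // Same pipeline as in the question, one step per line.
    static boolean isA2Z(String str) {
        return str.chars()                          // IntStream of the string's char values
                .filter(i -> i >= 'a' && i <= 'z')  // keep lowercase ASCII letters only
                .distinct()                         // drop repeated letters
                .count() == 26;                     // true only if all 26 letters survived
    }

    public static void main(String[] args) {
        System.out.println(isA2Z("the quick brown fox jumps over the lazy dog")); // true
        System.out.println(isA2Z("hello"));                                       // false
        // Uppercase letters are filtered out, so this is false as well.
        System.out.println(isA2Z("ABCDEFGHIJKLMNOPQRSTUVWXYZ"));                  // false
    }
}
```

Note that the check is case-sensitive: a string has to contain every letter in lowercase to pass.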

Say the string is the following:
"abbcdefghijklmnopqrstuvwxyz"
This method will return true. The method checks whether the string contains every letter from a to z, regardless of whether any of them repeat.

Related

Check if two Strings are Anagrams Without using built-in features that require Import

I need to compare 2 strings and see if the number of every character each of them contain is the same.
Requirement:
Without using any packages or imports
My idea:
I want to check if a char from charsA exists within charsB. And if it does, I'll remove it from charsB.
And if the length of charsB is 0 by the end of the for loop, then I know it is an anagram
Issue:
It might end up deleting each character that matches a particular character more than once.
For example, if input is "oellh" and "heloo" (result should be false):
checks for o: new charsB: "hel"
checks for e: new charsB: "hl"
checks for l: new charsB: "h"
checks for l: new charsB: "h"
checks for h: new charsB: ""
My code:
static boolean isAnagram(String a, String b) {
boolean isAnagram;
//case 2:
//if strings lengths dont match isAnagram is set to false
if(a.length() != b.length()){isAnagram = false;}
// will turn each string into an array or characters
String[] charsA = a.split("");
String[] charsB = b.split("");
for (int i = 0; i < a.length(); i++) {
// some code to compare the strings here.
}
return isAnagram;
}
Test cases
Input:
"hello", "olhel" // case one
"helloq", "olhel" // case two
"helloq", "olheln" // case three
Output:
true // case one
false // case two
false // case three

Generate a HashMap of frequencies of symbols for both given strings.
Then compare the two maps. If they are equal, then the strings are anagrams.
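// imports needed by this snippet
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public static boolean isAnagram(String a, String b) {
    return getFrequencies(a).equals(getFrequencies(b));
}
// maps each code point of the string to the number of times it occurs
public static Map<Integer, Long> getFrequencies(String str) {
    return str.codePoints()
        .boxed()
        .collect(Collectors.groupingBy(
            Function.identity(),
            Collectors.counting()
        ));
}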
main()
public static void main(String[] args) {
System.out.println(isAnagram("hello", "olhel"));
System.out.println(isAnagram("hello", "olhelq"));
}
Output:
true
false
Without using any packages or imports
There are different possibilities, the optimal choice would depend on the constraints on the strings received as an input.
If we expect that the input will contain only letters of the English alphabet, then we can handle this case by populating two arrays of length 26, where each element holds the frequency of a particular letter (as shown in the link provided by @Federico klez Culloca in the comments). To get the result, we compare these arrays by iterating over the indices and checking that the values at each index are equal. This algorithm runs in linear time O(n) (as does the map-based solution shown above).
If we can't expect the symbols in the given strings to fall in a particular range, then we can extract arrays of characters from both strings, sort them, and compare the two arrays character by character. This algorithm runs in linearithmic time O(n * log n) because it requires sorting. And since we are not allowed to use imports (such as the Arrays utility class), we need to implement an O(n * log n) sorting algorithm (such as quicksort or merge sort) manually.
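The first option might look like the following minimal sketch. It assumes that both strings contain only the lowercase letters 'a' to 'z'; any other character would cause an ArrayIndexOutOfBoundsException. The class name AnagramNoImports is made up for illustration.

```java
class AnagramNoImports {
    // Assumption: both strings consist solely of the lowercase letters 'a'..'z'.
    static boolean isAnagram(String a, String b) {
        if (a.length() != b.length()) {
            return false; // different lengths can never be anagrams
        }
        int[] countsA = new int[26];
        int[] countsB = new int[26];
        for (int i = 0; i < a.length(); i++) {
            countsA[a.charAt(i) - 'a']++; // index 0 is 'a', index 25 is 'z'
            countsB[b.charAt(i) - 'a']++;
        }
        // The strings are anagrams only if every letter occurs equally often.
        for (int i = 0; i < 26; i++) {
            if (countsA[i] != countsB[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("hello", "olhel"));   // true
        System.out.println(isAnagram("helloq", "olhel"));  // false
        System.out.println(isAnagram("helloq", "olheln")); // false
        System.out.println(isAnagram("oellh", "heloo"));   // false
    }
}
```

No import is needed because only arrays and java.lang.String are used.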

What does Character.isDigit return?

I found this example online. peek contains one character read from the buffered reader with readch(br). The following loop must continue as long as the character read is a digit:
while(Character.isDigit(peek)
&& !Character.isLetter(peek) && peek != '_') {
int n = (int) peek - 48;
sum = sum*10 + n;
readch(br);
}
Isn't it enough to just say Character.isDigit?

Yes, it's redundant. Character.isDigit returns true if the character type (from Character.getType) is DECIMAL_DIGIT_NUMBER, and Character.isLetter returns true if the same type is one of several categories (DECIMAL_DIGIT_NUMBER is not one of the categories listed).
getType returns a single value, so there are no characters that have multiple types according to Java. Thus, there is no character for which isDigit and isLetter both return true. Likewise, _ is CONNECTOR_PUNCTUATION (easy to see this with a quick sample Java program), which is neither a digit nor a letter.
So this is code by someone who was being overly defensive. isDigit suffices.
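The "quick sample Java program" mentioned above could look like this sketch. The class and method names are made up for illustration; it only relies on Character.getType, Character.isDigit and Character.isLetter, and it checks the char range only, not supplementary code points.

```java
class CharTypeDemo {
    // True if the character's general category is CONNECTOR_PUNCTUATION.
    static boolean isConnectorPunctuation(char c) {
        return Character.getType(c) == Character.CONNECTOR_PUNCTUATION;
    }

    // Scans every char value and reports whether any of them is classified
    // as both a digit and a letter.
    static boolean anyDigitIsAlsoLetter() {
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            if (Character.isDigit((char) c) && Character.isLetter((char) c)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isConnectorPunctuation('_')); // true
        System.out.println(Character.isDigit('_'));      // false
        System.out.println(Character.isLetter('_'));     // false
        System.out.println(anyDigitIsAlsoLetter());      // false
    }
}
```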

The sample does only half the job.
It assumes that Character::isDigit returns true only for the characters '0' to '9', but that is wrong: it also returns true for decimal digits from other scripts. This also means that the calculation (int) peek - 48 is not a reliable way to get the numeric value of the digit.
If the code should really work on any kind of digit, it needs to look like this:
final var radix = 10;
var value = 0;
while( Character.isDigit( peek) && ((value = Character.digit( peek, radix )) != -1) )
{
sum = sum * radix + value;
readch( br );
}
To cover also the hex digits from 'A' to 'F' or their lowercase equivalents, just remove the Character::isDigit check (and change radix accordingly).
Using Character::getNumericValue would also cover the Roman numerals, but it does not work properly for the calculation of sum.
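To make the difference concrete, here is a small sketch. The class and helper names are hypothetical, and the sample characters (ARABIC-INDIC DIGIT THREE, U+0663, and ROMAN NUMERAL FIFTY, U+216C) were picked for illustration; they do not come from the original example.

```java
class DigitValueDemo {
    // The calculation from the question: only valid for '0'..'9'.
    static int naiveValue(char c) {
        return (int) c - 48;
    }

    // Character.digit understands decimal digits from any script and
    // returns -1 if the character is not a digit in the given radix.
    static int safeValue(char c) {
        return Character.digit(c, 10);
    }

    public static void main(String[] args) {
        char arabicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE
        char romanFifty = '\u216C';  // ROMAN NUMERAL FIFTY

        System.out.println(Character.isDigit(arabicThree)); // true
        System.out.println(naiveValue(arabicThree));        // 1587, not a digit value
        System.out.println(safeValue(arabicThree));         // 3

        System.out.println(Character.isDigit(romanFifty));         // false
        System.out.println(Character.getNumericValue(romanFifty)); // 50
        System.out.println(safeValue(romanFifty));                 // -1

        System.out.println(Character.digit('F', 16));       // 15
    }
}
```

A value such as 50 cannot be fed into sum = sum * radix + value, which is why getNumericValue does not fit this loop.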

How to handle the time complexity for permutation of strings during anagrams search?

I have a program that determines whether two strings are anagrams or not.
It works fine for input strings shorter than 10 characters.
When I input two strings whose lengths are equal and greater than 10, the program runs but doesn't produce an answer.
My concept is that if two strings are anagrams, one string must be a permutation of the other string.
This program generates all the permutations of one string, and after that it checks whether any of them matches the other string. In this case I wanted to ignore case.
It returns false when no matching string is found or when the strings being compared are not equal in length; otherwise it returns true.
public class Anagrams {
static ArrayList<String> str = new ArrayList<>();
static boolean isAnagram(String a, String b) {
// there is no need for checking these two
// strings because their length doesn't match
if (a.length() != b.length())
return false;
Anagrams.permute(a, 0, a.length() - 1);
for (String string : Anagrams.str)
if (string.equalsIgnoreCase(b))
// returns true if there is a matching string
// for b in the permuted string list of a
return true;
// returns false if there is no matching string
// for b in the permuted string list of a
return false;
}
private static void permute(String str, int l, int r) {
if (l == r)
// adds the permuted strings to the ArrayList
Anagrams.str.add(str);
else {
for (int i = l; i <= r; i++) {
str = Anagrams.swap(str, l, i);
Anagrams.permute(str, l + 1, r);
str = Anagrams.swap(str, l, i);
}
}
}
public static String swap(String a, int i, int j) {
char temp;
char[] charArray = a.toCharArray();
temp = charArray[i];
charArray[i] = charArray[j];
charArray[j] = temp;
return String.valueOf(charArray);
}
}
1. I want to know why this program can't process larger strings
2. I want to know how to fix this problem
Can you figure it out?

To check whether two strings are anagrams, you don't actually need to generate every single permutation of the source string and match each one against the second string. What you can do instead is count the frequency of each character in the first string, and then verify that the same frequencies apply to the second string.
This solution requires one pass over each string, hence Θ(n) time complexity. In addition, you need auxiliary storage for counting characters, which is Θ(1) space for a fixed-size alphabet. These are asymptotically tight bounds.
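A minimal sketch of this counting idea follows. It uses a HashMap as the auxiliary storage and folds case with Character.toLowerCase, because the question wants to ignore case; both choices, as well as the class name AnagramByCounting, are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class AnagramByCounting {
    static boolean isAnagram(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        // First pass: count how often each (lowercased) character occurs in a.
        Map<Character, Integer> counts = new HashMap<>();
        for (int i = 0; i < a.length(); i++) {
            counts.merge(Character.toLowerCase(a.charAt(i)), 1, Integer::sum);
        }
        // Second pass: consume one count for every character of b.
        for (int i = 0; i < b.length(); i++) {
            char c = Character.toLowerCase(b.charAt(i));
            Integer remaining = counts.get(c);
            if (remaining == null) {
                return false; // b contains c more often than a does
            }
            if (remaining == 1) {
                counts.remove(c);
            } else {
                counts.put(c, remaining - 1);
            }
        }
        return counts.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("Listen", "Silent"));                     // true
        System.out.println(isAnagram("abcdefghijklmnop", "ponmlkjihgfedcba")); // true
        System.out.println(isAnagram("hello", "world"));                       // false
    }
}
```

Unlike the permutation approach, the 16-character input in main is handled immediately, since each string is read only once.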

You're doing it in a very expensive way. The time complexity here is factorial, because a string of length n has n! permutations, and factorials grow very fast. That is why it takes so long to get any output once the input is longer than 10 characters.
11 factorial = 39916800
12 factorial = 479001600
13 factorial = 6227020800
and so on...
So don't think you're not getting an output for longer inputs; you would eventually get one, provided the program doesn't run out of memory first, since it stores every permutation in a list.
If you go to something like 20-30 factorial, I think it will take years to produce any output.
Fact: 50 factorial is so big that it exceeds the number of sand grains on Earth, and computers give up when they have to enumerate that many possibilities.
That is why you are asked to include special characters in passwords: a larger character set makes the number of possible combinations so big that computers would need years to crack the password by trying every one. Encryption also depends on this weakness of computers.
So you don't have to, and should not, solve it that way (computers are not very good at it); it is overkill.
Why not take each character from one string and match it against every character of the other string? That is quadratic in the worst case.
And if you sort both strings, then you can just say
string1.equals(string2)
true means anagram
false means not anagram
and it takes linear time, apart from the time taken by the sorting.
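The factorial figures quoted in this answer can be reproduced with a short sketch. BigInteger is used here so that values beyond 20! do not overflow; the class name FactorialGrowth is made up for illustration.

```java
import java.math.BigInteger;

class FactorialGrowth {
    // Computes n! exactly; BigInteger avoids the overflow that a long hits after 20!.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Number of permutations the original program generates for a string of length n.
        for (int n : new int[] {10, 11, 12, 13, 20}) {
            System.out.println(n + "! = " + factorial(n));
        }
    }
}
```

Already at length 13 there are more than six billion permutations, which is more elements than a single ArrayList can hold.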

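You can first get arrays of characters from these strings, then sort them, and then compare the two sorted arrays. This method works with regular characters and, in the examples below, also with surrogate pairs. Because it sorts individual char values rather than code points, strings that mix supplementary characters with different high surrogates can in principle be misreported as anagrams.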
public static void main(String[] args) {
System.out.println(isAnagram("ABCD", "DCBA")); // true
System.out.println(isAnagram("𝗔𝗕𝗖𝗗", "𝗗𝗖𝗕𝗔")); // true
}
static boolean isAnagram(String a, String b) {
// invalid incoming data
if (a == null || b == null
|| a.length() != b.length())
return false;
char[] aArr = a.toCharArray();
char[] bArr = b.toCharArray();
Arrays.sort(aArr);
Arrays.sort(bArr);
return Arrays.equals(aArr, bArr);
}
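If supplementary characters matter, one possible variation is to sort code points instead of char values. The sketch below is not part of the answer above; the class name, the helper cp and the chosen code points are assumptions picked to show a case where the two approaches disagree.

```java
import java.util.Arrays;

class CodePointAnagram {
    // The approach from the answer: sort the UTF-16 char values.
    static boolean isAnagramByChars(String a, String b) {
        char[] x = a.toCharArray();
        char[] y = b.toCharArray();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }

    // Variation: sort whole code points, so surrogate pairs stay together.
    static boolean isAnagramByCodePoints(String a, String b) {
        int[] x = a.codePoints().sorted().toArray();
        int[] y = b.codePoints().sorted().toArray();
        return Arrays.equals(x, y);
    }

    // Builds a one-code-point string (works on Java 8 as well).
    static String cp(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // a is D83D DE00 D835 DDD4 in UTF-16; b is D83D DDD4 D835 DE00.
        // Same surrogates, but paired differently, so the code points differ.
        String a = cp(0x1F600) + cp(0x1D5D4);
        String b = cp(0x1F5D4) + cp(0x1D600);
        System.out.println(isAnagramByChars(a, b));      // true (false positive)
        System.out.println(isAnagramByCodePoints(a, b)); // false
        System.out.println(isAnagramByCodePoints("ABCD", "DCBA")); // true
    }
}
```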
See also: Check if one array is a subset of the other array - special case

finding the middle index of a substring when there are duplicates in the string

I was working on a Java coding problem and encountered the following issue.
Problem:
Given a string, does "xyz" appear in the middle of the string? To define middle, we'll say that the number of chars to the left and right of the "xyz" must differ by at most one
xyzMiddle("AAxyzBB") → true
xyzMiddle("AxyzBBB") → false
My Code:
public boolean xyzMiddle(String str) {
boolean result=false;
if(str.length()<3)result=false;
if(str.length()==3 && str.equals("xyz"))result=true;
for(int j=0;j<str.length()-3;j++){
if(str.substring(j,j+3).equals("xyz")){
String rightSide=str.substring(j+3,str.length());
int rightLength=rightSide.length();
String leftSide=str.substring(0,j);
int leftLength=leftSide.length();
int diff=Math.abs(rightLength-leftLength);
if(diff>=0 && diff<=1)result=true;
else result=false;
}
}
return result;
}
Output I am getting:
It passes most of the test cases but fails for certain edge cases involving more than one occurrence of "xyz" in the string.
Example:
xyzMiddle("xyzxyzAxyzBxyzxyz")
My present method is taking the "xyz" starting at the index 0. I understood the problem. I want a solution where the condition is using only string manipulation functions.
NOTE: I need to solve this using string manipulations like substrings. I am not considering using list, stringbuffer/builder etc. Would appreciate answers which can build up on my code.

There is no need to loop at all, because you only want to check if xyz is in the middle.
The string is of the form
prefix + "xyz" + suffix
The content of the prefix and suffix is irrelevant; the only thing that matters is they differ in length by at most 1.
Depending on the length of the string (and assuming it is at least 3):
Prefix and suffix must have the same length if the (string's length - the length of xyz) is even. In this case:
int prefixLen = (str.length()-3)/2;
result = str.substring(prefixLen, prefixLen+3).equals("xyz");
Otherwise, prefix and suffix differ in length by 1. In this case:
int minPrefixLen = (str.length()-3)/2;
int maxPrefixLen = minPrefixLen+1;
result = str.substring(minPrefixLen, minPrefixLen+3).equals("xyz") || str.substring(maxPrefixLen, maxPrefixLen+3).equals("xyz");
In fact, you don't even need the substring here. You can do it with str.regionMatches instead, and avoid creating the substrings, e.g. for the first case:
result = str.regionMatches(prefixLen, "xyz", 0, 3);
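Putting the two cases together, a complete method could look like the following sketch. The class name and the variable extra are made up for illustration; the logic follows the prefix/suffix reasoning above and relies on String.regionMatches returning false, rather than throwing, when the region does not fit.

```java
class XyzMiddle {
    static boolean xyzMiddle(String str) {
        int extra = str.length() - 3; // characters left over for prefix + suffix
        if (extra < 0) {
            return false;             // too short to contain "xyz" at all
        }
        int minPrefixLen = extra / 2;
        // Even case, and the shorter-prefix half of the odd case.
        if (str.regionMatches(minPrefixLen, "xyz", 0, 3)) {
            return true;
        }
        // Odd case: the prefix may also be one character longer than the suffix.
        return extra % 2 == 1 && str.regionMatches(minPrefixLen + 1, "xyz", 0, 3);
    }

    public static void main(String[] args) {
        System.out.println(xyzMiddle("AAxyzBB"));           // true
        System.out.println(xyzMiddle("AxyzBBB"));           // false
        System.out.println(xyzMiddle("xyzxyzAxyzBxyzxyz")); // true
    }
}
```

Because only one or two fixed positions are inspected, additional occurrences of "xyz" elsewhere in the string have no effect on the result.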

Super easy solution:
1. Use Apache StringUtils to split the string. Specifically, splitByWholeSeparatorPreserveAllTokens.
2. Think about the problem. Specifically, if the token is in the middle of the string, then there must be an even number of tokens returned by the split call (see step 1 above). Zero counts as an even number here.
3. If the number of tokens is even, add up the lengths of the first group (the first half of the tokens) and compare the total to that of the second group.
4. Pay attention to details: an empty token indicates an occurrence of the token itself. You can count this as zero length, count it as the length of the token, or count it as literally any number, as long as you always count it as the same number.
5. if (lengthFirstHalf == lengthSecondHalf) token is in middle.

Working from your code, I left the cases str.length() < 3 and str.length() == 3 unchanged.
Taking inspiration from @Andy's answer, I considered the pattern
prefix + "xyz" + suffix
and, while looking for matches, I also checked whether each one satisfies the middle rule as you defined it. If a match that satisfies the rule is found, the loop breaks and the method returns success; otherwise the loop continues.
public boolean xyzMiddle(String str) {
    boolean result = false;
    if (str.length() < 3)
        result = false;
    else if (str.length() == 3 && str.equals("xyz"))
        result = true;
    else {
        int k = 0;
        while (k < str.length()) {
            // find the next occurrence of "xyz" at or after index k
            k = str.indexOf("xyz", k);
            if (k == -1)
                break; // no further matches
            // check whether this match is in the middle
            int preLen = str.substring(0, k).length();
            int sufLen = str.substring(k + 3).length();
            if (Math.abs(preLen - sufLen) <= 1) {
                result = true;
                break;
            }
            k++; // step past this match and keep searching
        }
    }
    return result;
}

Subsequence of a string

I have to write a program that takes string argument s and integer argument k and prints out all subsequences of s of length k. For example if I have
subSequence("abcd", 3);
the output should be
abc abd acd bcd
I would like guidance. No code, please!
Thanks in advance.
Update:
I was thinking to use this pseudocode:
Start with an empty string
Append the first letter to the string
Append the second letter
Append the third letter
Print the so-far build substring - base case
Return the second letter
Append the fourth letter
Print the substring - base case
Return the first letter
Append the third letter
Append the fourth letter
Print the substring - base case
Return third letter
Append the second letter
Append the third letter
Append the fourth letter
Print the substring - base case
Return the third letter
Return the second letter
Append the third letter
Append the fourth letter
Return third letter
Return fourth letter
Return third letter
Return second letter
Return first letter
The different indent means going deeper in the recursive calls.
(In response to Diego Sevilla):
Following your suggestion:
private String SSet = "";
private String subSequence(String s, int substr_length){
if(k == 0){
return SSet;
}
else{
for(int i = 0; i < substr_length; i++){
subString += s.charAt(i);
subSequence(s.substring(i+1), k-1);
}
}
return SSet;
}
}

As you include "recursion" as a tag, I'll try to explain the strategy for the solution. The recursive function should be a function like the one you show:
subSequence(string, substr_length)
that actually returns a Set of (sub)-strings. Note how the problem can be divided into sub-problems that lend themselves to recursion. Each subSequence(string, substr_length) should:
Start with an empty substring set, which we call SSet.
Do a loop from 0 to the length of the string minus substr_length.
At each loop position i, take string[i] as the beginning character, and call subSequence(string[i+1..length], substr_length - 1) recursively (here the .. implies an index range into the string, so you have to create the substring using these indices). That recursive call to subSequence will return all the strings of size substr_length - 1. You have to prepend the character you selected (in this case string[i]) to all those substrings, and add all of them to the SSet set.
Just return the constructed SSet. This one will contain all the substrings.
Of course, this process can be optimized considerably (for example, using dynamic programming to store all the substrings of length i), but you get the idea.
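For readers who do want to see the strategy spelled out, it could be sketched as follows. This is one possible reading of the steps above, not the answerer's own code: it uses a List rather than a Set so that the output order is predictable, adds a base case for length 0 that the description leaves implicit, and the class name Subsequences is made up.

```java
import java.util.ArrayList;
import java.util.List;

class Subsequences {
    // Returns all subsequences of s that have exactly k characters.
    static List<String> subSequence(String s, int k) {
        List<String> result = new ArrayList<>();
        if (k == 0) {
            result.add(""); // exactly one subsequence of length 0: the empty string
            return result;
        }
        // Position i can start a subsequence only if at least k characters remain.
        for (int i = 0; i <= s.length() - k; i++) {
            // All subsequences of length k - 1 taken from the rest of the string...
            for (String tail : subSequence(s.substring(i + 1), k - 1)) {
                // ...each prefixed with the chosen starting character.
                result.add(s.charAt(i) + tail);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(subSequence("abcd", 3)); // [abc, abd, acd, bcd]
    }
}
```

With a List, a string containing repeated letters yields repeated subsequences; collecting into a Set instead would remove them.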

So, I see you want to implement a method subSequence(s, n), which returns a collection of all character combinations from s of length n such that ordering is preserved.
In the spirit of your desire to not provide you with code, I assume you would prefer no pseudo-code either. So, I will explain my suggested approach in a narrative fashion, leaving the translation to procedural code as an exercise-to-the-reader(TM).
Think of this problem where you are obtaining all combinations of character positions, which could be represented as an array of bits (a.k.a. flags). So where s="abcd" and n=3 (as in your example), all combinations could be represented as follows:
1110 = abc
1101 = abd
1011 = acd
0111 = bcd
Note, that we start with a bit-field where all characters are turned "on" and then shift the "off" bit over by 1. Things get interesting in an example where n < length(s) - 1. For example, say s="abcd" and n=2. Then we have:
1100 = ab
1001 = ad
1010 = ac
0110 = bc
0101 = bd
0011 = cd
The recursion comes into play when you analyze a sub set of the bit-fields. Hence, a recursive call would reduce the size of the bit-field and "bottom-out" where you have three flags:
100
010
001
The bulk of the work is a recursive approach to find all of the bit-fields. Once you have them, the position of each bit can be used as an index into the array of characters (that is, s).
This should be sufficient to get you started on some pseudo-code!
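As a rough illustration of the bit-field idea, the sketch below enumerates the bit-fields with a plain loop and Integer.bitCount instead of the recursion described above, which is a deliberate simplification. It assumes the string is shorter than 31 characters so that every bit-field fits in an int; the class name is made up.

```java
import java.util.ArrayList;
import java.util.List;

class BitmaskSubsequences {
    // Assumption: s.length() < 31, so each bit-field fits in an int.
    static List<String> subSequence(String s, int k) {
        int n = s.length();
        List<String> result = new ArrayList<>();
        // Walk through all n-bit fields, from all bits on down to all bits off.
        for (int mask = (1 << n) - 1; mask >= 0; mask--) {
            if (Integer.bitCount(mask) != k) {
                continue; // keep only fields with exactly k bits turned on
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < n; i++) {
                // The leftmost bit corresponds to s.charAt(0), as in 1110 = abc.
                if ((mask & (1 << (n - 1 - i))) != 0) {
                    sb.append(s.charAt(i));
                }
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(subSequence("abcd", 3)); // [abc, abd, acd, bcd]
        System.out.println(subSequence("abcd", 2)); // [ab, ac, ad, bc, bd, cd]
    }
}
```

The loop inspects all 2^n bit-fields, so it is only practical for short strings; a recursive generator that produces just the fields with k bits set avoids that cost.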

The problem is precisely this:
Given an ordered set S : {C0, C1, C2, ..., Cn}, derive all ordered subsets S', where each member of S' is a member of S, and relative order of {S':Cj, S':Cj+1} is equivalent to relative order {S:Ci, S:Ci+d} where S':Cj = S:Ci and S':Cj+1 = S:Ci+d. |S|>=|S'|.
Assume/assert size of set S, |S| is >= the size of the subset, |S'|
If |S| - |S'| = d, then you know each of the subsets S' begins with the element at S:i, where 0 <= i <= d.
e.g. given S:{a, b, c, d} and |S'| = 3
d = 1
S' sets begin with 'a' (S:0) and 'b' (S:1).
So we see the problem is actually to solve d+1 lexically ordered permutations of length 3 of subsets of S.
#d=0: get l.o.permutations of length 3 for {a, b, c, d}
#d=1: get l.o.permutations of length 3 for {b, c, d}
#d=2: d > |S|-|S'|. STOP.

string subSeqString() {
string s1="hackerrank";
string s="hhaacckkekraraannk";
int k=0,c=0;
int size=s1.size();
for(int i=0;i<size;i++)
{
for(int j=k;j<s.size();j++)
{
if(s1[i]==s[j])
{
c++;
k++;
break;
}
k++;
}
}
if(c==size)
return "YES";
else
return "NO";
}
