HashSet contains - java

I have a set of keywords and a string that contains keyword instances separated by '/'. For example, 'Food' and 'Car' are keywords, and '/food/oatmeal/fruits' and '/tyre/car/wheel' are strings. There are 5500 keywords in total. I need to flag a string as 'eligible' if it contains at least one of the 5500 keywords. One approach is to load all 5500 keywords into a HashSet, split the string into tokens, and check whether the HashSet contains each token. If I find a match, I flag that string as 'eligible'.
Performance-wise, is there a better solution?

A simplified solution for token matching could be:
import java.util.HashSet;
import java.util.StringTokenizer;
public class REPL {
private static final HashSet<String> keyWords = new HashSet<>();
public static void main(String[] args) {
keyWords.add("food");
keyWords.add("car");
String[] strings = {
"/food/oatmeal/fruits",
"/tyre/car/wheel",
"/steel/nuts/bolts",
"/cart/handle/grill"
};
for (String s : strings) {
System.out.printf("string: %-20s ", s);
if (isEligible(s)) {
System.out.println("eligible: true");
} else {
System.out.println("eligible: false");
}
}
}
private static boolean isEligible(String s) {
StringTokenizer st = new StringTokenizer(s, "/");
while (st.hasMoreTokens()) {
if (keyWords.contains(st.nextToken())) {
return true;
}
}
return false;
}
}
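On the performance question: one hash lookup per token is already cheap, since each contains call is O(1) on average and the whole check is linear in the length of the string. What follows is only a minimal sketch of a possible refinement, assuming matching should be case-insensitive (the question writes 'Food' while the paths use 'food'). The class KeywordMatcher and its method names are hypothetical and not from the original post. It scans the segments with indexOf instead of StringTokenizer and stops at the first hit. List.of needs Java 9 or later.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the original post.
class KeywordMatcher {
    private final Set<String> keywords = new HashSet<>();

    KeywordMatcher(List<String> words) {
        // Assumption: matching is case-insensitive, so keywords are stored lower-cased.
        for (String w : words) {
            keywords.add(w.toLowerCase(Locale.ROOT));
        }
    }

    // Returns true as soon as one '/'-separated segment is a known keyword.
    boolean isEligible(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        int start = 0;
        while (start <= lower.length()) {
            int end = lower.indexOf('/', start);
            if (end < 0) {
                end = lower.length();
            }
            // Skip empty segments such as the one before a leading '/'.
            if (end > start && keywords.contains(lower.substring(start, end))) {
                return true;
            }
            start = end + 1;
        }
        return false;
    }

    public static void main(String[] args) {
        KeywordMatcher m = new KeywordMatcher(List.of("Food", "Car"));
        System.out.println(m.isEligible("/food/oatmeal/fruits")); // true
        System.out.println(m.isEligible("/cart/handle/grill"));   // false
    }
}
```

Whether this beats the StringTokenizer version in practice would have to be measured; with only a few tokens per string the difference is likely small.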

Related

Regex to validate that every digit is different from each other

I have to validate strings with specific conditions using a regex statement. The condition is that every digit is different from each other. So, 123 works but not 112 or 131.
So I wrote a statement that filters a string according to this condition and prints true once the string fulfills everything. However, it seems to print "true" every time, although some strings do not meet the condition.
public class MyClass {
public static void main(String args[]) {
String[] value = {"123","951","121","355","110"};
for (String s : value){
System.out.println("\"" + s + "\"" + " -> " + validate(s));
}
}
public static boolean validate(String s){
return s.matches("([0-9])(?!\1)[0-9](?!\1)[0-9]");
}
}
@Vinz's answer is perfect, but if you insist on using regex, then you can use:
public static boolean validate(String s) {
return s.matches("(?!.*(.).*\\1)[0-9]+");
}
You don't need to use regex for that. You can simply count the number of unique characters in the String and compare it to the length like so:
public static boolean validate(String s) {
return s.chars().distinct().count() == s.length();
}
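Note that the distinct-count check accepts any string without repeated characters, including non-digit input such as "abc". A minimal sketch that checks both conditions without regex could look like this; the class name DistinctDigits is hypothetical, and rejecting the empty string is an assumption (it mirrors the [0-9]+ in the regex answer).

```java
// Hypothetical helper, not part of the original answers.
class DistinctDigits {
    // True only if s is non-empty, consists of ASCII digits, and no digit repeats.
    static boolean validate(String s) {
        if (s.isEmpty()) {
            return false; // assumption: an empty string is not valid
        }
        boolean[] seen = new boolean[10];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false; // not a digit
            }
            if (seen[c - '0']) {
                return false; // digit already occurred
            }
            seen[c - '0'] = true;
        }
        return true;
    }

    public static void main(String[] args) {
        String[] values = {"123", "951", "121", "355", "110"};
        for (String s : values) {
            // prints true, true, false, false, false for the values above
            System.out.println("\"" + s + "\" -> " + validate(s));
        }
    }
}
```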

How to check if the String object passed to the method contains at least one of the words from the list - Java

I need help with creating a method that takes an object of the String type in the input arguments and a list of objects of the String type. The list contains forbidden words. How can I check if the String object passed to the method contains at least one of the words from the list?
public class Filter {
public static void main(String[] args) {
wordsFilter("This sentence contains a forbidden word");
}
private static void wordsFilter(String sentence) {
List<String> forbiddenWords = new ArrayList<>();
forbiddenWords.add("forbiddenWord");
forbiddenWords.add("forbidden word");
for (String word : forbiddenWords) {
if (sentence.contains(word)) {
System.out.println("The content cannot be displayed");
} else {
System.out.println(sentence);
}
}
}
}
Looks like you are missing a condition to exit the loop when a forbidden word was found:
private static void wordsFilter(String sentence) {
List<String> forbiddenWords = new ArrayList<>();
forbiddenWords.add("forbiddenWord");
forbiddenWords.add("forbidden word");
boolean doesContainAnyForbiddenWords = false;
for (String word : forbiddenWords) {
if (sentence.contains(word)) {
doesContainAnyForbiddenWords = true;
break; // leave the loop
}
}
if (doesContainAnyForbiddenWords) {
System.out.println("The content cannot be displayed");
} else {
System.out.println(sentence);
}
}
You can do this easily using the Streams API
Optional<String> potential_forbidden_word =
forbiddenWords.stream().filter(word -> sentence.contains(word)).findFirst();
if(potential_forbidden_word.isPresent())
System.out.println("don't use: "+potential_forbidden_word.get());
else
System.out.println("the sentence is clean");
You can even shorten the stream with a method reference:
Optional<String> potential_forbidden_word =
forbiddenWords.stream().filter(sentence::contains).findFirst();
As @Adriaan Koster mentioned, you can simply use the terminal operation anyMatch(Predicate):
boolean contains_forbidden_word =
forbiddenWords.stream().anyMatch(sentence::contains);
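You might also want to compare case-insensitively, because "foo", "Foo", "FoO" and so on might all be forbidden. Note that equalsIgnoreCase() only compares whole strings, so for a substring check both the sentence and the words would need to be lower-cased first.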
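A minimal sketch of such a case-insensitive check, assuming that lower-casing both sides with Locale.ROOT is acceptable for the words involved. The class ForbiddenWords and the method containsForbidden are hypothetical names. List.of needs Java 9 or later.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the original answers.
class ForbiddenWords {
    // True if the sentence contains any forbidden word, ignoring case.
    static boolean containsForbidden(String sentence, List<String> forbiddenWords) {
        String lowerSentence = sentence.toLowerCase(Locale.ROOT);
        return forbiddenWords.stream()
                .map(w -> w.toLowerCase(Locale.ROOT))
                .anyMatch(lowerSentence::contains);
    }

    public static void main(String[] args) {
        List<String> forbidden = List.of("forbiddenWord", "forbidden word");
        // prints true: "Forbidden Word" matches "forbidden word" once case is ignored
        System.out.println(containsForbidden("This sentence contains a Forbidden Word", forbidden));
        // prints false
        System.out.println(containsForbidden("This sentence is clean", forbidden));
    }
}
```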

Java substring string when specific string occurs

I need help cutting a string at the point where a given substring occurs.
Example
Initial string: 123456789abcdefgh
string to substr: abcd
result : 123456789
I checked the substring method, but it accepts index positions. Do I need to search for the occurrence of the substring and then pass the index?
If you want to split the String after a given character (here "a"), the code would look like this.
You can change "a" to any character in the string:
package nl.testing.startingpoint;
public class Main {
public static void main(String args[]) {
String[] part = getSplitArray("123456789abcdefgh", "a");
System.out.println(part[0]);
System.out.println(part[1]);
}
public static String[] getSplitArray(String toSplitString, String splitChar) {
return toSplitString.split("(?<=" + splitChar + ")");
}
}
Bear in mind that toSplitString.split("(?<=" + splitChar + ")") splits after every occurrence of that character, not only the first, and that the character is interpreted as part of a regular expression.
Hope this might help:
public static void main(final String[] args)
{
searchString("123456789abcdefghabcd", "abcd");
}
public static void searchString(String inputValue, final String searchValue)
{
while (!(inputValue.indexOf(searchValue) < 0))
{
System.out.println(inputValue.substring(0, inputValue.indexOf(searchValue)));
inputValue = inputValue.substring(inputValue.indexOf(searchValue) +
searchValue.length());
}
}
Output:
123456789
efgh
Use a regular expression, like this:
static String regex = "abcd.*";
public static String remove(String string, String regex) {
// replaceAll returns the string unchanged when the regex does not match
return string.replaceAll(regex, "");
}
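The approach the question itself hints at, finding the index with indexOf and passing it to substring, can be sketched as follows. The class SubstringBefore and the method before are hypothetical names, and returning the input unchanged when the marker is absent is an assumption about the desired behavior.

```java
// Hypothetical helper, not part of the original answers.
class SubstringBefore {
    // Returns the part of input before the first occurrence of marker.
    static String before(String input, String marker) {
        int idx = input.indexOf(marker);
        // Assumption: if the marker does not occur, return the input unchanged.
        return idx < 0 ? input : input.substring(0, idx);
    }

    public static void main(String[] args) {
        System.out.println(before("123456789abcdefgh", "abcd")); // 123456789
    }
}
```

Unlike the regex variants, this treats the marker as plain text, so characters such as "." or "*" in it need no escaping.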

Regex filename with exactly 2 underscores

I need to match if filenames have exactly 2 underscores and extension 'txt'.
For example:
asdf_assss_eee.txt -> true
asdf_assss_eee_txt -> false
asdf_assss_.txt -> false
private static final String FILENAME_PATTERN = "/^[A-Za-z0-9]+_[A-Za-z0-9]+_[A- Za-z0-9]\\.txt";
This does not work.
You just need to add + after the third char class and you must remove the first forward slash.
private static final String FILENAME_PATTERN = "^[A-Za-z0-9]+_[A-Za-z0-9]+_[A-Za-z0-9]+\\.txt$";
You can use a regex like this with insensitive flag:
[a-z\d]+_[a-z\d]+_[a-z\d]+\.txt
Or with inline insensitive flag
(?i)[a-z\d]+_[a-z\d]+_[a-z\d]+\.txt
In case you want to shorten it a little, you could do:
([a-z\d]+_){2}[a-z\d]+\.txt
Update
So let's assume you want at least one character after the second underscore, before the file extension.
Regex is still not "needed" for this. You could split the String by the underscore and you should have 3 elements from the split. If the 3rd element is just ".txt" then it's not valid.
Example:
public static void main(String[] args) throws Exception {
String[] data = new String[] {
"asdf_assss_eee.txt",
"asdf_assss_eee_txt",
"asdf_assss_.txt"
};
for (String d : data) {
System.out.println(validate(d));
}
}
public static boolean validate(String str) {
if (!str.endsWith(".txt")) {
return false;
}
String[] pieces = str.split("_");
return pieces.length == 3 && !pieces[2].equalsIgnoreCase(".txt");
}
Results:
true
false
false
Old Answer
Not sure I understand why your third example is false, but this is something that can easily be done without regex.
Start with checking to see if the String ends with ".txt", then check if it contains only two underscores.
Example:
public static void main(String[] args) throws Exception {
String[] data = new String[] {
"asdf_assss_eee.txt",
"asdf_assss_eee_txt",
"asdf_assss_.txt"
};
for (String d : data) {
System.out.println(validate(d));
}
}
public static boolean validate(String str) {
if (!str.endsWith(".txt")) {
return false;
}
return str.chars().filter(c -> c == '_').count() == 2;
}
Results:
true
false
true
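Use this Pattern:
Pattern p = Pattern.compile("_[^_]+_[^_]+\\.txt");
and use .find() instead of .matches() on the Matcher. Note that find() looks for the pattern anywhere in the name, so a name with extra underscores before the match, such as a_b_c_d.txt, is also found: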
Matcher m = p.matcher(filename);
if (m.find()) {
// found
}
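If exactly two underscores are required, one option is to match the whole name rather than search inside it. The following is a minimal sketch under that assumption; the class FilenameCheck and its pattern are hypothetical and only one of several ways to express the rule.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answers.
class FilenameCheck {
    // Three non-empty segments without underscores, separated by two underscores,
    // followed by ".txt". matches() requires the whole name to fit the pattern.
    private static final Pattern TWO_UNDERSCORES =
            Pattern.compile("[^_]+_[^_]+_[^_]+\\.txt");

    static boolean validate(String filename) {
        return TWO_UNDERSCORES.matcher(filename).matches();
    }

    public static void main(String[] args) {
        System.out.println(validate("asdf_assss_eee.txt")); // true
        System.out.println(validate("asdf_assss_eee_txt")); // false
        System.out.println(validate("asdf_assss_.txt"));    // false
        System.out.println(validate("a_b_c_d.txt"));        // false
    }
}
```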

Matching multiple keywords from a line in java

I have a line from which multiple keywords are to be matched. The whole keywords should be matched.
Example,
String str = "This is an example text for matching countries like Australia India England";
if(str.contains("Australia") ||
str.contains("India") ||
str.contains("England")){
System.out.println("Matches");
}else{
System.out.println("Does not match");
}
This code works fine. But if there are too many keywords to be matched, the line grows. Is there any elegant way of writing the same code?
Thanks
You can write a regular expression like this:
Country0|Country1|Country2
Use it like this:
String str = "This is an example text like Australia India England";
if (Pattern.compile("Australia|India|England").matcher(str).find())
System.out.println("Matches");
If you would like to know which countries have matched:
public static void main(String[] args) {
String str = "This is an example text like Australia India England";
Matcher m = Pattern.compile("Australia|India|England").matcher(str);
while (m.find())
System.out.println("Matches: " + m.group());
}
Outputs:
Matches: Australia
Matches: India
Matches: England
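With many keywords, the alternation can be built from a list instead of being written by hand. Since the question asks for whole keywords, the sketch below also wraps the alternation in word boundaries (\b) and quotes each keyword with Pattern.quote so that special characters are taken literally. The class WholeWordMatcher and its methods are hypothetical names, and the keyword list is assumed to be non-empty.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answers.
class WholeWordMatcher {
    // Builds \b(?:kw1|kw2|...)\b; assumes the keyword list is not empty.
    static Pattern compile(List<String> keywords) {
        String alternation = keywords.stream()
                .map(Pattern::quote) // treat each keyword as literal text
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(?:" + alternation + ")\\b");
    }

    static boolean containsAny(String text, List<String> keywords) {
        return compile(keywords).matcher(text).find();
    }

    public static void main(String[] args) {
        List<String> countries = List.of("Australia", "India", "England");
        // prints true: "Australia" appears as a whole word
        System.out.println(containsAny("countries like Australia India England", countries));
        // prints false: "Australia" is only part of a longer word here
        System.out.println(containsAny("NAustraliaA", countries));
    }
}
```

If the same keyword list is checked against many lines, compiling the Pattern once and reusing it avoids repeated compilation.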
Put the countries in an array and use a small helper method. Using a Set would be even nicer, but building the set of countries is a bit more tedious. Something like the following, with better naming and null handling if desired:
String[] countries = {"Australia", "India", "England"};
String str = "NAustraliaA";
if (containsAny(str, countries)) {
System.out.println("Matches");
}
else {
System.out.println("Does not match");
}
public static boolean containsAny(String toCheck, String[] values) {
for (String s: values) {
if (toCheck.contains(s)) {
return true;
}
}
return false;
}
From a readability point of view, an ArrayList of the strings to be matched is elegant. A loop can then check each word and set a flag that records the result.
Something like this, in case all are to be matched (flag starts as true):
for (String checkStr : myList) {
if(!str.contains(checkStr)) {
flag=false;
break;
}
}
In case any should match (flag starts as false):
for (String checkStr : myList) {
if(str.contains(checkStr)) {
flag=true;
break;
}
}
package com.test;
public class Program {
private String str;
public Program() {
str = "This is an example text for matching countries like Australia India England";
// TODO Auto-generated constructor stub
}
public static void main(String[] args) {
Program program = new Program();
program.doWork();
}
private void doWork() {
String[] tomatch = { "Australia", "India" ,"UK"};
for(int i=0;i<tomatch.length;i++){
if (match(tomatch[i])) {
System.out.println(tomatch[i]+" Matches");
} else {
System.out.println(tomatch[i]+" Does not match");
}
}
}
private boolean match(String string) {
if (str.contains(string)) {
return true;
}
return false;
}
}
Output:
Australia Matches
India Matches
UK Does not match
