Is this an appropriate use of regex and string.matches? [closed] - java

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I'm trying to check whether a string being passed in, like "055# 444$ 285", contains any characters other than whitespace or digits, and decided regex might be useful here (I've never used it before). If the string contains any non-whitespace, non-digit characters, I want to return false.
Using regex and string.matches can I do what I want or is there another method I should use?
boolean isValid(String candidate)
{
    if(candidate.matches("[^\\d|\\s]+"))
    {
        return false;
    }
    return true;
}

Problem
Your current code checks to see if the entire string matches non-digit/non-whitespace/non-pipe (|) characters from start to finish. Change it instead to match [\d\s] from start to finish and return the result. The pipe is not needed in character sets for alternation (it'll match that character literally).
You can see how your current pattern behaves by passing it the following strings (the value shown is the result of matches; your isValid returns the opposite):
!##   # matches returns true
|||   # matches returns false
1 2   # matches returns false
Fix
You can use the following instead. It checks that the entire string matches only digits and whitespace characters:
return candidate.matches("[\\d\\s]+");
Working code sample:
class Main {
    public static void main(String[] args) {
        boolean valid = isValid("055 444 285");
        System.out.println(valid); // prints true
    }

    static boolean isValid(String candidate) {
        // True only if the string is non-empty and every character is a digit or whitespace.
        return candidate.matches("[\\d\\s]+");
    }
}

matches goes for the ENTIRE string.
You're saying:
Does the ENTIRE string fit this description:
The first character is not a digit, not a bar, and not a whitespace.
Then we're either at the end, or another non-digit, non-bar, non-whitespace char.
Keep going until the end.
Clearly, "055# 444$ 285" does NOT match that regexp. The match fails right at the first character: 0 is a digit.
What you want is: is there at least one non-digit, non-whitespace character anywhere in it? If so, the string is invalid. So you want find, not matches, and you don't need the plus. Presumably, you don't want the | either.
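A minimal sketch of the find-based approach described above. The class name DigitsAndSpaces and the constant INVALID_CHAR are made up for illustration; only Pattern, Matcher.find and the character class come from the discussion.

```java
import java.util.regex.Pattern;

final class DigitsAndSpaces {
    // One character that is neither a digit nor whitespace (no '+' and no '|').
    private static final Pattern INVALID_CHAR = Pattern.compile("[^\\d\\s]");

    static boolean isValid(String candidate) {
        // find() looks for a match anywhere in the input,
        // whereas matches() requires the whole input to match.
        return !INVALID_CHAR.matcher(candidate).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid("055 444 285"));   // true
        System.out.println(isValid("055# 444$ 285")); // false
    }
}
```

One difference from candidate.matches("[\\d\\s]+"): this version treats the empty string as valid, because it contains no offending character. Add an explicit isEmpty() check if that is not what you want.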


Parsing in Java [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a few, theoretical ideas, but I don't know the language well. We basically need to make a rudimentary lexical analyzer. I have most of the bits, but here's the context. The straight-forward question is at the end.
I need to read in a line from the console, then see if parts match up with a symbol table, which includes the keywords: "print [variable]", "load [variable]", "mem [variable]", "sqrt" and "stop", as well as mathematical symbols.
It also needs to recognise the variables on their own (such as "c = a + b" as well.)
So...it's not that hard, in theory. You'd check the first character of the string matched up with keywords or variables. If they do, keep looping through that keyword or variable to check if it's the same string, up until you hit a space.
To summarize: How do I check the characters of a read in string to compare to stuff in Java?
I recommend using regex for text pattern matching. If the text is passed on the command line, you receive it through the args array of the main method. Here's a small example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class Parser {
    public static void main(final String[] args) {
        if (args.length < 1) {
            // Error, no input given: stop before touching args[0]
            System.err.println("No input given");
            return;
        }
        String input = args[0];
        Pattern pattern = Pattern.compile("YOUR REGEX HERE");
        Matcher matcher = pattern.matcher(input);
        if (matcher.find()) {
            // Input matches the Regex pattern
            // Access to capturing groups using matcher.group(int)
            // Example: System.out.println(matcher.group(1));
        }
    }
}
For Regex you can find various explanations on the web and on SO. You can try out your patterns at regex101.
Here's an example pattern which matches "name = name + name":
(.+) = (.+) \+ (.+)
The () creates capturing groups. Using matcher.group(x) for x from 1 to 3 you can access the matched values inside the brackets, i.e. the variables.
Here's the same example online with test input: regex101.com/r/mJ9jI5/1
Fairly easy. However, you may need to make the pattern more robust: as written, .+ also accepts whitespace and special characters (for example a +) inside a variable name, which you probably want to reject.
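As a rough sketch of what "more robust" could look like, the version below restricts variable names to a letter or underscore followed by word characters and tolerates optional whitespace around the operators. The identifier rule and the names AssignmentParser and parse are assumptions for illustration, not part of the assignment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AssignmentParser {
    // Assumed identifier rule: a letter or underscore, then letters, digits or underscores.
    private static final String NAME = "([A-Za-z_]\\w*)";

    // name = name + name, with optional whitespace around '=' and '+'.
    private static final Pattern ASSIGNMENT =
            Pattern.compile(NAME + "\\s*=\\s*" + NAME + "\\s*\\+\\s*" + NAME);

    // Returns the three variable names, or null if the line does not have this shape.
    static String[] parse(String line) {
        Matcher m = ASSIGNMENT.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("c = a + b");
        System.out.println(String.join(",", parts)); // c,a,b
        System.out.println(parse("c = a b + d"));    // null
    }
}
```

Using matches() here means the whole line has to have the expected shape, so trailing garbage is rejected instead of silently ignored.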

Need help on Regex return values [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have a regex question assigned by my instructor and he wants us to make all the return values true by changing the string value in the three declared variables. This is my first time doing a regex question and I wanted a little help if that's okay. I tried www.regexpal.com but I didn't know how to use it.
Could someone shed a little light on this topic as to how I begin to solve this? Thanks
Here's the code:
public class RegexTester {
    public static void main(String[] args) {
        String regexSSN = ""; //TODO add a regex for Social Security Numbers
        String regex9000 = ""; //TODO add a regex for GGC 9000 numbers here
        String regexZipPlus4 = ""; //TODO add a regex for zip+4 zipcodes here
        System.out.println("All of the following tests should return true, "
                + "the negative tests are negated (meaning that they should "
                + "also return true)");
        System.out.println("192-192-5555".matches(regexSSN)); // the following tests should all match
        System.out.println("444-183-1212".matches(regexSSN));
        System.out.println("032-431-9375".matches(regexSSN));
        System.out.println("122-650-4343".matches(regexSSN));
        System.out.println("900012389".matches(regex9000));
        System.out.println("900112389".matches(regex9000));
        System.out.println("900012390".matches(regex9000));
        System.out.println("900050000".matches(regex9000));
        System.out.println("30043".matches(regexZipPlus4));
        System.out.println("30043-1234".matches(regexZipPlus4));
        System.out.println(); // the following codes print out true
        System.out.println(!"192-XYZ-5555".matches(regexSSN)); // the following tests should NOT match
        System.out.println(!"00-192-5555".matches(regexSSN));
        System.out.println(!"90005000".matches(regex9000)); // too short!
        System.out.println(!"900250000".matches(regex9000)); // must be 9000* or 9001*
        System.out.println(!"9002500001".matches(regex9000)); // too big
        System.out.println(!"9001FOO00".matches(regex9000)); // no alpha allowed
        System.out.println(!"30043-12345".matches(regexZipPlus4)); // too long
        System.out.println(!"300430-1234".matches(regexZipPlus4)); // too long
        System.out.println(!"30043-12".matches(regexZipPlus4)); // too short
        System.out.println(!"300FO-1234".matches(regexZipPlus4)); // no alpha allowed
        System.out.println(!"30043=1234".matches(regexZipPlus4)); // missing hyphen
    }
}
Start with reading through java.util.regex.Pattern documentation. It contains all necessary information to complete the assignment. You need to clearly understand requirements when constructing your regex pattern. You can then convert those requirements into regex.
E.g., to match a telephone number in the format XXX-XXX-XXXX, where X is any digit, you need 3 digits followed by a dash, followed by 3 digits, followed by another dash, and then 4 digits:
^\d{3}\-\d{3}\-\d{4}$
Please note that when assigning this pattern to a Java string, you need to escape the backslashes (write \\d instead of \d).
I like using RegexPlanet to test my code. Here is a link for the first problem: regexSSN (although a real SSN is 9 digits long, in your code it's 10). Click on the Go button. You will be able to enter your test cases.
Here's the solution for your first case.
String regexSSN = "^(\\d{3}\\-\\d{3}\\-\\d{4})";
Hopefully this will get you started so you can complete other two problems.
When designing regex strings, I like to begin by breaking the string into similar components. Let's take the SSN regex as an example.
Step 1: We see the format is ###-###-####, where # is a digit 0-9.
Step 2: The regex for matching a digit is either [0-9] or \d
Step 3: Now we can write it out in regex: \d\d\d-\d\d\d-\d\d\d\d, where - is just a literal dash.
Step 4: Notice the repetition? We can take care of that too with {n}, where n is the number of times we want to repeat the preceding element, so now we have \d{3}-\d{3}-\d{4}
And that's how you do the SSN regex.
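The steps above can be checked against the SSN test cases from the question. This is only a sketch for the first of the three patterns; the class name SsnRegexDemo is made up, and the other two regexes are left as the exercise intends.

```java
final class SsnRegexDemo {
    // \d{3}-\d{3}-\d{4}, with each backslash doubled for the Java string literal.
    static final String REGEX_SSN = "\\d{3}-\\d{3}-\\d{4}";

    public static void main(String[] args) {
        // String.matches requires the whole string to match, so no ^ or $ is needed.
        System.out.println("192-192-5555".matches(REGEX_SSN));  // true
        System.out.println("032-431-9375".matches(REGEX_SSN));  // true
        System.out.println(!"192-XYZ-5555".matches(REGEX_SSN)); // true (letters rejected)
        System.out.println(!"00-192-5555".matches(REGEX_SSN));  // true (first group too short)
    }
}
```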

Blank space in array with String splitter [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
My objective is to separate the numbers from the string, but in my array's first position I get a blank space. So I need help to prevent that.
String str1 = "Y=9x1+29x2";
String[] split2 = str1.split("[^-?.?0-9]+");
The blank (empty) string at the start is due to the presence of non-digit characters at the start of your input.
You can remove all non-digits at the start before splitting:
String linha = "Y=9x1+29x2";
String[] split = linha.replaceFirst("[^-.\\d]+", "").split("[^-.\\d]+");
for (String tok : split)
    System.out.println(tok);
Output:
9
1
29
2
I think your question is rather vague, but after looking at it, I'm guessing that you want to extract the numbers out of the string, where a "number" has this format: an optional minus sign, followed by an optional decimal point, followed by one or more digits. I suspect you also want to include numbers that have digits followed by a decimal point followed by more digits.
I'm guessing this is what you want, because of the ? you put in your regex. The problem is that inside square brackets, ? doesn't mean "optional", and it doesn't mean "zero or one of something". It means a question mark. The regex [^-?.?0-9] means "match one character that is not a digit, a period, a hyphen, or a question mark". A pattern in square brackets always matches one character, and you tell it what characters are OK (or, if you begin with ^, what characters are not OK). This kind of "character set" pattern never matches a sequence of characters. It just looks at one character at a time. If you put + after the pattern, it still looks at one character at a time; it just does so repeatedly.
I think what you're trying to do is to take a pattern that represents a number, and then say "look for something that doesn't look like that pattern", and you tried to do it by using [^...]. That simply will not work.
In fact, split() is the wrong tool for this job. The purpose of split is to break up a string whose delimiters match a given pattern. Using it when the strings you want to keep in the array match a given pattern doesn't work very well, unless the pattern is extremely simple. I recommend that you create a Matcher and use the find() method in a loop. find() is set up so that it can find all matching substrings of a string if you call it repeatedly. This is what you want to accomplish, so it's the right tool.
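A minimal sketch of the find() loop, assuming the number format guessed above (optional minus sign, optional integer part, optional decimal point, at least one digit). The pattern and the names NumberExtractor and extractNumbers are illustrative assumptions, not something given in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumberExtractor {
    // Assumed format: optional '-', optional integer digits,
    // optional '.', then at least one digit.
    private static final Pattern NUMBER = Pattern.compile("-?\\d*\\.?\\d+");

    static List<String> extractNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(input);
        // Each call to find() resumes after the previous match.
        while (matcher.find()) {
            numbers.add(matcher.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extractNumbers("Y=9x1+29x2")); // [9, 1, 29, 2]
    }
}
```

There is no empty first element to clean up, because only the text that matches is collected. Note that with this pattern a minus sign directly before a digit is always read as a sign, so "1-2" yields 1 and -2.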

Regex validation issue [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 7 years ago.
I need to validate an input string and prevent it from containing '=' (the equals sign), and I use the regex [=]* to catch it, but it catches other strings as well. Example input: 2c450807-4a4c-4f18-bf4f-5a100ced87a0. The regex above catches this string too.
Please help me.
Also, can anyone please explain why this regex doesn't catch the input? I need to catch the special characters listed in the regex.
final String REGEX="[.,%*$##?^<!&>'|/\\\\~\\[\\]{}+=\"-]*";
Pattern pattern = Pattern.compile(REGEX);
Matcher matcher = pattern.matcher("2c450807-4a4c-4f18-bf4f-5a100ced87a0");
if (matcher.matches()) {
    System.out.println("found");
} else {
    System.out.println("not found!");
}
This prints "not found!"
When you use a regular expression, you might want to find items depending on the number of times they appear.
If you want to match a group containing exactly n occurrences of a symbol (in your case, the equals sign =), you can do something like this:
(=){n}
e.g. if(myVar === myValue) is matched when n=3
If you want to match this symbol one or more times:
(=)+
e.g. if(myVar = myValue), if(myVar == myValue) or if(myVar === myValue)
If you want to match an item which might or might not appear:
(=)*
e.g. if(myVar < myValue)
The item does not need to be present in the expression you check; it can appear 0 to n times.
I think the problem you have is that the * quantifier allows 0 occurrences of the preceding subpattern. Thus, [=]* can match the empty string, so it finds a match in any input.
You need to use a plain
=
And then you will not match 2c450807-4a4c-4f18-bf4f-5a100ced87a0.
Also please note that = is not a special regex character, you do not need to escape it, or place into a character class to avoid escaping.
However, as it has been pointed out in another comment, if you do not have to use a "regex", just check if a string contains = with a str.contains("=").
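To make the difference concrete, here is a small sketch comparing the three checks on the UUID from the question. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class EqualsSignCheck {
    private static final Pattern STAR = Pattern.compile("[=]*"); // zero or more '='
    private static final Pattern PLAIN = Pattern.compile("=");   // exactly one '='

    static boolean starFinds(String s) {
        // Succeeds on every input, because the empty match is allowed.
        return STAR.matcher(s).find();
    }

    static boolean plainFinds(String s) {
        // Succeeds only if the input really contains '='.
        return PLAIN.matcher(s).find();
    }

    public static void main(String[] args) {
        String uuid = "2c450807-4a4c-4f18-bf4f-5a100ced87a0";
        System.out.println(starFinds(uuid));     // true  (the unwanted "catch")
        System.out.println(plainFinds(uuid));    // false
        System.out.println(plainFinds("a=b"));   // true
        System.out.println("a=b".contains("=")); // true, no regex needed
    }
}
```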

Java regex to find everything except for '.' [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 7 years ago.
I'm trying to write a regex that will find everything except for '.' - that is, a string which only contains '.' should return false and everything else true.
String regex = "(?!(^\\.$)).*";
String test = ".";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(test);
System.out.println(matcher.find());
^(?![.]$).*$
This should do it for you. You need to use anchors to make sure you don't get any partial matches.
I think this will do the job: [^.].*|[.].+ . I tried 15 different inputs on http://www.regexplanet.com/advanced/java/index.html, and it only says false when trying to find '.'
Explanation:
[^.].* - matches anything that starts with a non-dot character, followed by 0 or more of any characters,
[.].+ - matches anything that starts with a dot and is followed by one or more of any characters,
[^.].*|[.].+ - both parts combined with the 'or' operator.
Do you really need a regex for this? Just use string comparison:
if (!".".equals(test)) {
    // not equal to a dot
}
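For comparison, here is a sketch that runs the original pattern, the anchored lookahead pattern and the plain equals check side by side. The class and method names are made up; the two regexes are the ones quoted in this thread.

```java
import java.util.regex.Pattern;

final class NotJustADot {
    // Pattern from the question: the lookahead is not tied to the start of the input.
    private static final Pattern ORIGINAL = Pattern.compile("(?!(^\\.$)).*");
    // Anchored version from the first answer.
    private static final Pattern ANCHORED = Pattern.compile("^(?![.]$).*$");

    static boolean originalFinds(String s) {
        return ORIGINAL.matcher(s).find();
    }

    static boolean anchoredFinds(String s) {
        return ANCHORED.matcher(s).find();
    }

    static boolean notADot(String s) {
        return !".".equals(s);
    }

    public static void main(String[] args) {
        // find() retries at the end of the input, where the lookahead passes
        // and .* matches the empty string, so the original reports true for ".".
        System.out.println(originalFinds(".")); // true
        System.out.println(anchoredFinds(".")); // false
        System.out.println(anchoredFinds("a.")); // true
        System.out.println(notADot("."));        // false
    }
}
```

On ordinary single-line inputs the anchored regex and the equals check agree, which is a good argument for the simpler comparison.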
