This question already has answers here:
Trying to check if string contains special characters or lowercase java
(4 answers)
Closed 7 years ago.
I'm still learning regex.
I'm trying to check if the string contains ANY lowercase letters. If it does, I just want to return false.
I've read answers on here, but they don't seem to work in my program, even though they work in an online regex tester.
if (str.matches("[a-z]+")) {
    System.out.println("removed");
    return false;
}
This seems to highlight the lowercase letters in regexr but not in my program. Any help please?
If you're just looking for the correct regular expression, .*[a-z].* (suggested by @Adrian Leonhard in a comment above) is indeed correct: String#matches only succeeds when the whole string matches the pattern, so [a-z]+ is true only for strings made up entirely of lowercase letters. However, I think it's important to mention that compiling a regular expression is relatively expensive, and if this if statement is nested in a loop it might be a good idea to use the full regular expression API in java.util.regex rather than the convenience methods provided in String. To do so, first compile a Pattern object from your regex string.
Pattern p = Pattern.compile(".*[a-z].*");
This way the regex is compiled only once instead of every time String#matches(String regex) is called. Then create a Matcher for your input string.
Matcher m = p.matcher(str);
Now call Matcher#find(), which looks for a match anywhere in the input:
if(m.find()) {
//Your code here
}
However, you could also just test whether the string is unchanged by upper-casing:
str.toUpperCase().equals(str)
It's up to you. I would only use regex if absolutely necessary as it can slow down your program, and isn't very elegant in this case. At least you know how to use them properly in the future now.
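Putting the steps above together, a minimal sketch might look like the following. The class name LowercaseCheck and the method containsLowercase are made up for illustration. The sketch uses the shorter pattern [a-z], which is enough when find() is used. Note that [a-z] only covers ASCII letters, whereas the toUpperCase() comparison also reacts to non-ASCII lowercase letters.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class LowercaseCheck {
    // Compiled once and reused for every call.
    private static final Pattern LOWERCASE = Pattern.compile("[a-z]");

    /** Returns true if str contains at least one ASCII lowercase letter. */
    static boolean containsLowercase(String str) {
        // find() succeeds if the pattern matches anywhere in the input.
        return LOWERCASE.matcher(str).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLowercase("HELLO"));   // false
        System.out.println(containsLowercase("HELLo"));   // true
        // matches() requires the whole string to match the pattern:
        System.out.println("HELLo".matches("[a-z]+"));    // false
        System.out.println("HELLo".matches(".*[a-z].*")); // true
    }
}
```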
This question already has answers here:
Grammatical inference of regular expressions for given finite list of representative strings?
(2 answers)
Closed 3 years ago.
I want to convert a set of strings into a regular expression using Java.
I searched extensively but could not find a satisfying answer that resolves my issue, so I'm asking here.
First, is this possible? If yes, kindly suggest a way to do it.
Let's suppose I have a set of strings
abb
abababb
babb
aabb
bbbbabb
...
and I want to derive a regular expression for it, such as
(a+b)*abb
How can this be done?
If you have a collection of strings, and want to build a regex that matches any of those strings, you should build a regex that uses the | OR pattern.
Since the strings could contain regex special characters, they need to be quoted.
To make sure the best string matches, you need to try the longest string first. E.g. if aba and abax are both on the list, and the text to scan contains abax, we'd want to match the second string, not the first one.
So, you can do it like this:
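import java.util.Comparator;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public static String toRegex(Iterable<String> strings) {
    return StreamSupport.stream(strings.spliterator(), false)
            .sorted(Comparator.comparingInt(String::length).reversed()) // longest first
            .map(Pattern::quote) // escape regex special characters
            .collect(Collectors.joining("|"));
}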
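As a rough usage sketch of the method above (the class name ToRegexDemo and the sample strings are invented for illustration), this shows both the longest-first behaviour and the effect of quoting:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Hypothetical demo class wrapping the toRegex method from the answer.
class ToRegexDemo {
    static String toRegex(Iterable<String> strings) {
        return StreamSupport.stream(strings.spliterator(), false)
                .sorted(Comparator.comparingInt(String::length).reversed())
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("aba", "abax", "a.b");
        Pattern p = Pattern.compile(toRegex(words));

        // The longer alternative is tried first, so "abax" wins over "aba".
        Matcher m = p.matcher("xx abax yy");
        if (m.find()) {
            System.out.println(m.group()); // abax
        }

        // Because of Pattern.quote, the dot in "a.b" is literal.
        System.out.println(p.matcher("axb").find()); // false
        System.out.println(p.matcher("a.b").find()); // true
    }
}
```

This only matches the strings that were supplied; it does not generalize them into something like (a+b)*abb.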
What you are looking for is a way to infer a regular expression from a set of examples. This is a non-trivial computing problem to solve for the general case. See this post for details.
You can use the Pattern.compile method described here.
I don't believe you can.
The problem is that you want to provide only some of the total collection of valid strings and the algorithm has no way of inferring the exact complete set from the given subset. If you do provide the complete set of valid strings (and it doesn't seem like you can), then you can use David Zimmerman's answer in the comments. Or, perhaps more efficiently, just use a Set to hold the complete set of valid strings and to test candidate strings.
This question already has answers here:
Rationale for Matcher throwing IllegalStateException when no 'matching' method is called
(6 answers)
Closed 7 years ago.
Can't figure out why the pattern below isn't matching anything, including some of my simpler test examples:
// Pattern attempts to match against a String containing ?[0-9]?+
Pattern groups = Pattern.compile(".*?\\?([2-9]+)\\?.*");
Matcher m = groups.matcher("incSkl(?2?,2)");
int val = Integer.parseInt(m.group(1));
Even went super simple and tried feeding in a simple input of "?2?" to matcher. Still will error out on line 3.
Strangely the regex tester below seems to agree with me. It says both inputs should be a valid full match, without any flags needed on the Pattern
http://www.regexplanet.com/advanced/java/index.html
What's going on here? I even threw something up on CoderPad just to make sure there wasn't something 'off' with my environment, and it errors with 'no match' as well.
I realize at this point I could probably do something with find() (and that option for this use case would be the most sensible), but I've never had anything like this happen, and at this point want to know why it can't do a full match when most other regex implementations go through with no problem.
You need to call:
m.find()
or
m.matches()
before calling m.group(1); otherwise group() throws an IllegalStateException, because no match has been attempted yet.
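A minimal sketch of the fix, reusing the pattern from the question. The class name GroupAfterMatch, the method extract and the -1 "no match" sentinel are assumptions made for this example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, named only for illustration.
class GroupAfterMatch {
    private static final Pattern GROUPS = Pattern.compile(".*?\\?([2-9]+)\\?.*");

    /** Returns the number between the question marks, or -1 if the input does not match. */
    static int extract(String input) {
        Matcher m = GROUPS.matcher(input);
        // matches() (or find()) must run before group() may be called.
        if (!m.matches()) {
            return -1; // sentinel chosen for this sketch
        }
        return Integer.parseInt(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(extract("incSkl(?2?,2)")); // 2
        System.out.println(extract("?2?"));           // 2
        System.out.println(extract("no digits"));     // -1
    }
}
```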
Before y'all jump on me for posting something similar to previous questions asked, yes, there seem to be a number of regex related questions but nothing which seems to help me, or at least that I can see.
I am trying to parse strings in Java using Pattern and Matcher and am really having no joy. My regular expression matches my input string on several online regex testers, but Java simply does not match it.
My input string is:
"Big apple" title="Little Apple" type="Container" url="http://malcolm.com/testing"
The regular expression I am using to match is ".*" title="(.*)" type="Container" url="(.*)"
Essentially I want to pull out the text within the second and the fourth set of quotes. There will always be 4 sets of quotes with text within and around.
I am coding as follows:
Variable XMLSubstring contains the string above (including the quotes) and is as stated, even when I print it out.
Pattern p = Pattern.compile(".* title=\"(.*)\" type=\"Container\" url=\"(.*)\"");
Matcher m = p.matcher(XMLSubstring);
It doesn't appear to be rocket science I'm attempting but I'm pulling my hair out trying to debug the bloody thing.
Is there something wrong with my regex pattern?
Is there something wrong with the code I am using?
Am I simply a moron and should stop coding with immediate effect?
EDIT & UPDATE: I have found the problem. My string had a space at the end of it which was breaking the parser! How silly, and I think based on that, I need to accept the third suggestion of mine and give up programming. Thanks all for your assistance.
Try this,
String str="\"Big apple\" title=\"Little Apple\" type=\"Container\" url=\"http://malcolm.com/testing\"";
Pattern p=Pattern.compile(".* title=\\\".*\\\" type=\\\"Container\\\" url=\\\".*\\\"");
Matcher m=p.matcher(str);
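For completeness, here is a sketch that also pulls out the two values, using the pattern with capture groups from the question. The class name AttributeExtract and the method extract are invented for this example, and trimming the input is an assumption based on the trailing-space problem described in the question's update:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for illustration.
class AttributeExtract {
    private static final Pattern ATTRS = Pattern.compile(
            ".* title=\"(.*)\" type=\"Container\" url=\"(.*)\"");

    /** Returns {title, url}, or null if the input does not match. */
    static String[] extract(String input) {
        // trim() removes leading/trailing whitespace that would make matches() fail.
        Matcher m = ATTRS.matcher(input.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String str = "\"Big apple\" title=\"Little Apple\" type=\"Container\" "
                + "url=\"http://malcolm.com/testing\" "; // note the trailing space
        String[] result = extract(str);
        System.out.println(result[0]); // Little Apple
        System.out.println(result[1]); // http://malcolm.com/testing
    }
}
```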
I am doing string manipulations and I need more advanced functions than the original ones provided in Java.
For example, I'd like to return a substring between the (n-1)th and nth occurrence of a character in a string.
My question is, are there classes already written by users which perform this function, and many others for string manipulations? Or should I dig on stackoverflow for each particular function I need?
Check out the Apache Commons class StringUtils; it has plenty of useful methods for working with Strings.
http://commons.apache.org/lang/api-2.3/index.html?org/apache/commons/lang/StringUtils.html
Have you looked at the regular expression API? That's usually your best bet for doing complex things with strings:
http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
Along the lines of what you're looking to do, you can traverse the string against a pattern (in your case a single character) and match everything in the string up to but not including the next instance of the character as what is called a capture group.
It's been a while since I've written a regex, but if you were looking for the character A for instance, then I think you could use the regex A([^A]*) and keep matching that string. The stuff in the parenthesis is a capturing group, which I reference below. To match it, you'd use the matcher method on pattern:
http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#matcher%28java.lang.CharSequence%29
On the Matcher instance, you'd call find() in a loop; each time it returns true, group(1) gets you what is in between the parentheses. You could use a counter in your loop to make sure you stop at the (n-1)th instance of the letter.
Lastly, Pattern provides flags you can pass in to indicate things like case insensitivity, which you may need.
If I've made some mistakes here, then someone please correct me. Like I said, I don't write regexes every day, so I'm sure I'm a little bit off.
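For the specific example in the question, a plain indexOf loop may be simpler than a regex. This is only a sketch that swaps in indexOf for the regex approach described above; the class name OccurrenceSubstring, the method between, 1-based counting, and returning null when there are fewer than n occurrences are all choices made for this example:

```java
// Hypothetical helper class, named only for illustration.
class OccurrenceSubstring {
    /**
     * Returns the text strictly between the (n-1)th and nth occurrence of ch
     * (occurrences counted from 1), or null if ch occurs fewer than n times.
     */
    static String between(String s, char ch, int n) {
        if (n < 2) {
            throw new IllegalArgumentException("n must be at least 2");
        }
        int prev = -1; // index of the previous occurrence
        int cur = -1;  // index of the current occurrence
        for (int i = 0; i < n; i++) {
            prev = cur;
            cur = s.indexOf(ch, cur + 1);
            if (cur < 0) {
                return null; // fewer than n occurrences
            }
        }
        return s.substring(prev + 1, cur);
    }

    public static void main(String[] args) {
        System.out.println(between("a,b,c,d", ',', 2)); // b
        System.out.println(between("a,b,c,d", ',', 3)); // c
        System.out.println(between("a,b,c,d", ',', 4)); // null
    }
}
```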
First of all, here is a chunk of affected code:
// (somewhere above, data is initialized as a String with a value)
Pattern detailsPattern = Pattern.compile("**this is a valid regex, omitted due to length**", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
Matcher detailsMatcher = detailsPattern.matcher(data);
Log.i("Scraper", "Initialized pattern and matcher, data length "+data.length());
boolean found = detailsMatcher.find();
Log.i("Scraper", "Found? "+((found)?"yep":"nope"));
I omitted the regex inside Pattern.compile because it's very long, but I know it works with the given data set; or if it doesn't, it shouldn't break anything anyway.
The trouble is, I do get the feedback I/Scraper(23773): Initialized pattern and matcher, data length 18861 but I never see the "Found?" line, it is just stuck on the find() call.
Is this a known Android bug? I've tried it over and over and just can't get it to work. Somehow, I think something over the past few days broke this because my app was working fine before, and I have in the past couple days received several comments of the app not working so it is clearly affecting other users as well.
How can I further debug this?
Some regexes can take a very, very long time to evaluate. In particular, regexes that have lots of quantifiers can cause the regex engine to do a huge amount of backtracking to explore all of the possible ways that the input string might match. And if it is going to fail, it has to explore all of those possibilities.
(Here is an example:
regex = "a*a*a*a*a*a*b"; // 6 quantifiers
input = "aaaaaaaaaaaaaaaaaaaa"; // 20 characters
A typical regex engine will do in the region of 20^6 character comparisons before deciding that the input string does not match.)
If you showed us the regex and the string you are trying to match, we could give a better diagnosis, and possibly offer some alternatives. But if you are trying to extract information from HTML, then the best solution is to not use regexes at all. There are HTML parsers that are specifically designed to deal with real-world HTML.
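To make the backtracking example above concrete, here is a small sketch that runs that pattern next to an equivalent one with a single quantifier (a*a*a*a*a*a* accepts exactly the same strings as a*). The class name BacktrackingDemo and the timing helper are invented for this example, and the timings it prints will vary by machine and JVM, so treat them as a rough indication only:

```java
import java.util.regex.Pattern;

// Hypothetical demo class, named only for illustration.
class BacktrackingDemo {
    // Pattern from the example above: six overlapping a* quantifiers.
    static final Pattern SLOW = Pattern.compile("a*a*a*a*a*a*b");
    // Equivalent pattern with one quantifier, leaving far fewer ways to split the input.
    static final Pattern FAST = Pattern.compile("a*b");

    /** Returns the nanoseconds taken to evaluate matches() on the input. */
    static long nanosToEvaluate(Pattern p, String input) {
        long start = System.nanoTime();
        p.matcher(input).matches();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String input = "aaaaaaaaaaaaaaaaaaaa"; // 20 characters, no 'b', so both patterns fail
        System.out.println("six quantifiers: " + nanosToEvaluate(SLOW, input) + " ns");
        System.out.println("one quantifier:  " + nanosToEvaluate(FAST, input) + " ns");
    }
}
```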
How long is the string you are trying to parse?
How long and how complicated is the regex you are trying to match?
Have you tried breaking your regex down into simpler bits? Adding the bits back one after another will let you see when it breaks, and maybe why.
Write a regex such as [a-zA-Z]* and pass it as the argument to Pattern.compile(); this example allows only lowercase and uppercase letters.
Read my blogpost on android validation for more info.
I had the same issue and I solved it by replacing every wildcard . with [\s\S]. I really don't know why it worked for me, but it did. I come from the JavaScript world, where I know that expression is evaluated faster.
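One difference that may be relevant here, offered as a guess rather than a diagnosis: in Java, . does not match line terminators unless Pattern.DOTALL is set, while [\s\S] always matches any character. The sketch below only illustrates that difference; the class and field names are invented for this example, and it makes no claim about speed.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, named only for illustration.
class DotVersusAnyChar {
    static final Pattern DOT = Pattern.compile("a.b");
    static final Pattern DOT_DOTALL = Pattern.compile("a.b", Pattern.DOTALL);
    static final Pattern ANY = Pattern.compile("a[\\s\\S]b");

    public static void main(String[] args) {
        String multiline = "a\nb";
        System.out.println(DOT.matcher(multiline).matches());        // false
        System.out.println(DOT_DOTALL.matcher(multiline).matches()); // true
        System.out.println(ANY.matcher(multiline).matches());        // true
    }
}
```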