What's wrong with my regex -?\\d{2}\\.?\\d{6} - java

I'm trying to extract the latitude and longitude from a URL:
source:
...sensor=false&center=-15.842208999999999%2C-48.023084&zoom=17&size=256x256&language=en&client=google-maps-frontend&signature=hbey3U4lycTNgX48asW8MODjJLM
I'm not good at regexes, so I used this regex tester (http://regexpal.com/) and came up with the regex -?\d{2}\.?\d{6} (it is meant for Java).
According to regexpal.com, it produces this result:
-15.842208 ... -48.023084
So when I do it (in java):
for (Element element : newsHeadlines) {
    if (element.toString().contains("https://maps.google.com")) {
        List<String> lista = get_matches(element.attr("content"), "-?\\d{2}\\.?\\d{6}");
    }
}
public static List<String> get_matches(String s, String p) {
    // returns all matches of p in s for first group in regular expression
    List<String> matches = new ArrayList<String>();
    Matcher m = Pattern.compile(p).matcher(s);
    while (m.find()) {
        matches.add(m.group(1)); // <-- Exception: m.group(1) does not have any results.
    }
    return matches;
}
What's wrong with my regex?

Your method get_matches calls m.group(1), but groups are defined in a regex with parentheses and your pattern has none. So your regex needs to be like this instead:
(-?\\d{2}\\.?\\d{6})
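As a minimal sketch of this fix (the class name GroupOneDemo and the shortened input string are assumptions made for illustration, not the asker's actual code), wrapping the pattern in parentheses gives m.group(1) something to return:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupOneDemo {
    // Returns the text captured by group 1 for every match of p in s.
    public static List<String> getMatches(String s, String p) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(p).matcher(s);
        while (m.find()) {
            // Valid only because p contains a capturing group.
            matches.add(m.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        // Shortened, hypothetical version of the URL fragment from the question.
        String content = "center=-15.842208%2C-48.023084&zoom=17";
        System.out.println(getMatches(content, "(-?\\d{2}\\.?\\d{6})"));
        // prints [-15.842208, -48.023084]
    }
}
```

Without the parentheses, the same call throws IndexOutOfBoundsException because the pattern defines no group 1.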

Make only one of the two symbols optional, either - or . (not both):
-\d{2}\.?\d{6}
Equivalent java regex:
-\\d{2}\\.?\\d{6}
OR
-?\d{2}\.\d{6}
Equivalent java regex:
-?\\d{2}\\.\\d{6}
And call m.group(0) to get only the matched strings. If you want to call m.group(1), you need to enclose the pattern within parentheses.
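A short sketch of the m.group(0) route, using the variant with a mandatory dot. The class name WholeMatchDemo and the method name wholeMatches are hypothetical, and the input is the visible part of the URL from the question with the trailing parameters dropped:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WholeMatchDemo {
    // Collects the whole matched text (group 0) for every match of p in s.
    public static List<String> wholeMatches(String s, String p) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(p).matcher(s);
        while (m.find()) {
            matches.add(m.group()); // m.group() is the same as m.group(0)
        }
        return matches;
    }

    public static void main(String[] args) {
        String content =
            "sensor=false&center=-15.842208999999999%2C-48.023084&zoom=17&size=256x256";
        System.out.println(wholeMatches(content, "-?\\d{2}\\.\\d{6}"));
        // prints [-15.842208, -48.023084]
    }
}
```

With the dot left optional (-?\\d{2}\\.?\\d{6}), the same input also yields a third match, 99999999, taken from the run of nines after the first coordinate. That is one practical reason to keep the dot mandatory.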

Related

Java Pattern matcher not matching for HTTP response code [duplicate]

I have this small piece of code
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("[a-z]"))
{
System.out.println(s);
}
}
It is supposed to print
dkoe
but it prints nothing!!
Welcome to Java's misnamed .matches() method... It tries and matches ALL the input. Unfortunately, other languages have followed suit :(
If you want to see if the regex matches an input text, use a Pattern, a Matcher and the .find() method of the matcher:
Pattern p = Pattern.compile("[a-z]");
Matcher m = p.matcher(inputstring);
if (m.find())
// match
If what you want is indeed to see if an input only has lowercase letters, you can use .matches(), but you need to match one or more characters: append a + to your character class, as in [a-z]+. Or use ^[a-z]+$ and .find().
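To make the difference concrete, here is a small sketch that runs the words from the question through both approaches. The class and method names (MatchesVsFind, isAllLowercase, containsLowercase) are made up for this illustration:

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    private static final Pattern LOWERCASE = Pattern.compile("[a-z]");

    // True only if the entire string consists of one or more lowercase letters.
    public static boolean isAllLowercase(String s) {
        return s.matches("[a-z]+");
    }

    // True if at least one lowercase letter occurs anywhere in the string.
    public static boolean containsLowercase(String s) {
        return LOWERCASE.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] words = {"{apf", "hum_", "dkoe", "12f"};
        for (String s : words) {
            System.out.println(s
                    + " matches [a-z]: " + s.matches("[a-z]")
                    + ", matches [a-z]+: " + isAllLowercase(s)
                    + ", find [a-z]: " + containsLowercase(s));
        }
    }
}
```

Only dkoe passes the [a-z]+ check, while find() reports a lowercase letter in all four words, and matches("[a-z]") is false for every one of them because none is exactly one character long.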
[a-z] matches a single char between a and z. So, if your string was just "d", for example, then it would have matched and been printed out.
You need to change your regex to [a-z]+ to match one or more chars.
String.matches returns whether the whole string matches the regex, not just any substring.
Java's String.matches() tries to match the whole string.
That's different from Perl regex matching, which tries to find a matching part.
If you want to match a string with nothing but lowercase characters, use the pattern [a-z]+
If you want to match a string containing at least one lowercase character, use the pattern .*[a-z].*
I used:
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("[a-z]+"))
{
System.out.println(s);
}
}
I have faced the same problem once:
Pattern ptr = Pattern.compile("^[a-zA-Z][\\']?[a-zA-Z\\s]+$");
The above failed!
Pattern ptr = Pattern.compile("(^[a-zA-Z][\\']?[a-zA-Z\\s]+$)");
The above worked with pattern within ( and ).
Your regular expression [a-z] doesn't match dkoe since it only matches strings of length 1. Use something like [a-z]+.
A capture group () is not required for matches() to succeed; what matters is that the pattern covers the whole string. A corrected pattern looks like this:
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("(^[a-z]+$)"))
{
System.out.println(s);
}
}
You can make your pattern case insensitive by doing:
Pattern p = Pattern.compile("[a-z]+", Pattern.CASE_INSENSITIVE);
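A brief sketch of how that flag behaves with a whole-string match. The class name CaseInsensitiveDemo and the helper isAllLetters are assumptions for illustration:

```java
import java.util.regex.Pattern;

public class CaseInsensitiveDemo {
    // [a-z]+ with CASE_INSENSITIVE also accepts A-Z.
    private static final Pattern LETTERS =
            Pattern.compile("[a-z]+", Pattern.CASE_INSENSITIVE);

    // True if the whole string consists of ASCII letters in either case.
    public static boolean isAllLetters(String s) {
        return LETTERS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllLetters("DKoe")); // true
        System.out.println(isAllLetters("12f"));  // false
        // The embedded flag (?i) has the same effect with String.matches().
        System.out.println("DKoe".matches("(?i)[a-z]+")); // true
    }
}
```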

Get all matches within a string using compile and regex

I'm trying to get all matches which start with _ and end with = from a URL which looks like
?_field1=param1,param2,paramX&_field2=param1,param2,paramX
In that case I'm looking for any instance of _fieldX=
A method which I use to get it looks like
public static List<String> getAllMatches(String url, String regex) {
List<String> matches = new ArrayList<String>();
Matcher m = Pattern.compile("(?=(" + regex + "))").matcher(url);
while(m.find()) {
matches.add(m.group(1));
}
return matches;
}
called as
List<String> fieldsList = getAllMatches(url, "_.=");
but somehow it is not finding what I expected.
Any suggestions as to what I have missed?
A regex like (?=(_.=)) finds all matches, including overlapping ones, that start with _, then have exactly 1 char (other than a line break char), and then =.
You do not need overlapping matches in the context of the string you provided.
You may just use a lazy dot matching pattern, _(.*?)=. Alternatively, you may use a negated character class based regex: _([^=]+)= (it will capture into Group 1 any one or more chars other than = symbol).
Since you are passing a regex to the method, it seems you want a generic function.
If so, you may use this method:
public static List<String> getAllMatches(String url, String start, String end) {
List<String> matches = new ArrayList<String>();
Matcher m = Pattern.compile(start + "(.*?)" + end).matcher(url);
while(m.find()) {
matches.add(m.group(1));
}
return matches;
}
and call it as:
List<String> fieldsList = getAllMatches(url, "_", "=");
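A runnable sketch of the generic method, applied to the URL from the question. One addition that is not in the answer above: the delimiters are passed through Pattern.quote so that characters such as . or ? in start or end are treated literally. The names FieldNames and between are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldNames {
    // Returns the text found between each start/end delimiter pair.
    public static List<String> between(String text, String start, String end) {
        List<String> matches = new ArrayList<>();
        // Pattern.quote (an addition for safety) makes both delimiters literal.
        String regex = Pattern.quote(start) + "(.*?)" + Pattern.quote(end);
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            matches.add(m.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        String url = "?_field1=param1,param2,paramX&_field2=param1,param2,paramX";
        System.out.println(between(url, "_", "="));
        // prints [field1, field2]
    }
}
```

If the delimiters should be part of the result (for example _field1=), collect m.group() instead of m.group(1).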

How to get a whole word with just a part of it?

Example:
I have a String like this:
String query = "....COD_OP = 1 AND USER_DATA_SIGNIN = ...."
I need to get the whole word ("USER_DATA_SIGNIN") when it has the "_DATA_" part.
In Java, is it possible to use some kind of substring in reverse? In this case I don't know how to get the "USER" part.
Simple implementation, null checks are left to you:
for (String string : query.split(" ")) {
if(string.contains("_DATA_"))
{
System.out.println(string); // USER_DATA_SIGNIN
System.out.println(string.split("_")[0]); // USER
}
}
You can use the Pattern/Matcher classes, which are responsible for the regex mechanism in Java. You can create a pattern that represents a word with zero or more a-zA-Z0-9_ characters (which can be represented by the \w character class) before and/or after _DATA_, like
Pattern p = Pattern.compile("\\w*_DATA_\\w*");
Matcher m = p.matcher(text);
while(m.find()){
System.out.println(m.group());
}
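For completeness, a self-contained sketch of the same pattern. The class name DataWordFinder, the method findDataWords, and the sample query text are assumptions, since the original query is elided:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DataWordFinder {
    // Zero or more word characters on either side of the literal _DATA_.
    private static final Pattern DATA_WORD = Pattern.compile("\\w*_DATA_\\w*");

    // Returns every whole word in text that contains _DATA_.
    public static List<String> findDataWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = DATA_WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Hypothetical query; only the two column names come from the question.
        String query = "WHERE COD_OP = 1 AND USER_DATA_SIGNIN = 2";
        System.out.println(findDataWords(query));
        // prints [USER_DATA_SIGNIN]
    }
}
```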

Java and Regex, get a substring which matches

I want to match the following pattern:
[0-9]*-[0-9]*-[BL]
and apply the pattern to this string:
123-456-L-234
which should become
123-456-L.
Here's my code:
HelperRegex{
..
final static Pattern KEY = Pattern.compile("\\d*-\\d*-[BL]");
public static String matchKey(String key) {
return KEY.matcher(key).toMatchResult().group(0);
}
Junit:
@Test
public final void testMatchKey() {
Assert.assertEquals("453-04430-B", HelperRegex.matchKey("453-04430-B-1"));
}
A "No match found" exception is thrown.
I've tested my regex with "The Regex Coach" and it does not seem to be broken; it matches all the test strings.
Never mind all that complexity. You only need one line:
String match = input.replaceAll(".*?([0-9]*-[0-9]*-[BL])?.*", "$1");
This will produce a blank string if the pattern is not found.
If it were me, I would inline this and not even have a separate method.
You need to create the group you want to retrieve with () and make sure your regex matches the whole string (note that group 0 is the whole match, which with matches() is the whole string, so what you want is group 1):
String key = "453-04430-B-1";
Pattern pattern = Pattern.compile("(\\d*-\\d*-[BL]).*");
Matcher m = pattern.matcher(key);
if (m.matches())
System.out.println(m.group(1)); //prints 453-04430-B
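An alternative sketch that keeps the original pattern unchanged and uses find() instead of matches(). The exception in the question comes from calling toMatchResult().group(0) before any match attempt (find() or matches()) has been made on the Matcher. The class name KeyMatcher and the Optional return type are choices made for this illustration:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyMatcher {
    private static final Pattern KEY = Pattern.compile("\\d*-\\d*-[BL]");

    // Returns the first substring matching KEY, or empty if there is none.
    public static Optional<String> matchKey(String key) {
        Matcher m = KEY.matcher(key);
        // find() performs the match attempt; calling group() without it
        // throws IllegalStateException ("No match found").
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(matchKey("453-04430-B-1").orElse("")); // 453-04430-B
        System.out.println(matchKey("123-456-L-234").orElse("")); // 123-456-L
    }
}
```

Because find() searches for the pattern anywhere in the input, no trailing .* and no capture group are needed.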

regular expression java

I am trying to take the content between occurrences of Input, but my pattern is not doing the right thing. Please help.
Below is the pseudocode:
s="Input one Input Two Input Three";
Pattern pat = Pattern.compile("Input(.*?)");
Matcher m = pat.matcher(s);
if m.matches():
print m.group(..)
Required Output:
one
Two
Three
Use a lookahead for Input and use find in a loop, instead of matches:
Pattern pattern = Pattern.compile("Input(.*?)(?=Input|$)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
But it's better to use split here:
String[] result = s.split("Input");
// You need to ignore the first element in result, because it is empty.
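A sketch of the split approach that also handles the empty first element and the surrounding spaces. The class name SplitDemo and the method partsAfter are hypothetical, and trimming is an assumption based on the required output in the question:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class SplitDemo {
    // Splits s on a literal delimiter and keeps the non-empty, trimmed pieces.
    public static List<String> partsAfter(String s, String delimiter) {
        List<String> parts = new ArrayList<>();
        // String.split takes a regex, so the delimiter is quoted to keep it literal.
        for (String piece : s.split(Pattern.quote(delimiter))) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) { // skips the empty element before the first "Input"
                parts.add(trimmed);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String s = "Input one Input Two Input Three";
        System.out.println(partsAfter(s, "Input"));
        // prints [one, Two, Three]
    }
}
```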
This does not work, because m.matches() is true if and only if the whole string is matched by the expression. You could go two ways:
Use s.split("Input") instead; it gives you an array of the substrings between occurrences of "Input".
Use Matcher.find() and Matcher.group(int). But be aware that with find() the lazy (.*?) at the end of your current expression matches as little as possible, so group 1 will always be empty; you should change your expression.
Greetings,
Jost
import java.util.regex.*;
public class Regex {
public static void main(String[] args) {
String s="Input one Input Two Input Three";
Pattern pat = Pattern.compile("(Input) (\\w+)");
Matcher m = pat.matcher(s);
while( m.find() ) {
System.out.println( m.group(2) );
}
}
}
