getting matching regular expressions in java - java

String.split(String regex) splits the string around matches of a given regular expression and returns a String array. But I am interested in the regex matches themselves and would like them returned as a String array, instead of the strings around them.
For a trivial regex like ":" it probably wouldn't matter. But there are regexes that match, say, a date in a paragraph, and I would like to get all of these dates, which may be different each time. I checked the JDK API but couldn't find any such method. Is there a method I can make use of? Any help would be much appreciated.

Take a look at the Matcher and Pattern classes in the java.util.regex package:
http://download.oracle.com/javase/6/docs/api/java/util/regex/package-summary.html

Just use the Java regular expression API:
Pattern pat = Pattern.compile("\\d");
Matcher mat = pat.matcher("Foo99Bar66Baz");
// find() advances to the next match; group() returns the matched text
while (mat.find()) {
    System.out.println(mat.group());
}
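To get the matches back as a String array, as the question asks, the same loop can collect them into a list first. This is only a minimal sketch: the class name MatchCollector and the helper findAll are made up for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchCollector {
    // Hypothetical helper: returns every match of regex in input, in order of appearance.
    static String[] findAll(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // \\d+ matches whole runs of digits, so this prints 99,66
        System.out.println(String.join(",", findAll("\\d+", "Foo99Bar66Baz")));
    }
}
```

With the single-digit pattern \\d from the answer above, the same helper would return each digit as its own element.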

You can find simple but fairly comprehensive introductory examples at the following link:
http://www.vogella.de/articles/JavaRegularExpressions/article.html
There is also a Pattern and Matcher usage example at:
http://www.vogella.de/articles/JavaRegularExpressions/article.html#regexjava

Related

Matching a string in java regex expression

I would like to use the "matches" method of the String class.
I don't want to create a Pattern and Matcher object and use matcher.find()
to match a specific string I am working with.
Here's my code:
String string = "-12Log";
if (string.matches("-?\\d+(\\.\\d*)?[Log]")) System.out.println("dinos");
I have used different types of regexes with no success.
I have used the following:
-?\\d+(\\.\\d*)?\\[Log]
-?\\d+(\\.\\d*)?[+a-zA-Z]
-?\\d+(\\.\\d*)?+[a-zA-Z]
Please note that I don't want to break down the string into its characters. I would like to use the string as it is.
Any ideas would be appreciated.
Thanks for the help, everyone. I have already found an answer.
The regex that worked is:
-?\\d+(\\.\\d*)?.*(Log).*
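For anyone wondering why the earlier attempts failed: String.matches only succeeds when the whole string matches, and [Log] is a character class that matches a single character (L, o or g), not the word Log. A rough sketch of the difference follows; the class name WholeStringMatch and the method isLogNumber are invented for this example, and the stricter pattern is just one possible alternative to the one posted above.

```java
final class WholeStringMatch {
    // Hypothetical stricter pattern: optional sign, digits, optional fraction, then the literal word Log.
    static boolean isLogNumber(String s) {
        return s.matches("-?\\d+(\\.\\d*)?Log");
    }

    public static void main(String[] args) {
        String input = "-12Log";
        // [Log] consumes only one character, so "og" is left over and the whole-string match fails.
        System.out.println(input.matches("-?\\d+(\\.\\d*)?[Log]"));     // false
        // Log without brackets is matched as a three-character sequence.
        System.out.println(isLogNumber(input));                         // true
        // The pattern from the post also matches, but it accepts any text around Log.
        System.out.println(input.matches("-?\\d+(\\.\\d*)?.*(Log).*")); // true
    }
}
```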

Java regex to parse a particular semicolon delimited param from a URL?

I have a URL I'm expecting like:
www.somewebsite.com/misc-session/;session-id=1FSDSF2132FSADASD13213
I want to parse out
session-id=1FSDSF2132FSADASD13213
Using a regular expression in Java, what would be the best approach for this?
Using a regex testing website I've experimented with a few different ways, but I'm wondering which approach is the most fail-safe, in case the URL is actually formed like:
www.somewebsite.com/misc-session/;session-id=1FSDSF2132FSADASD13213?someExtraParam=false
or
www.somewebsite.com/misc-session/extra-path/;session-id=1FSDSF2132FSADASD13213?someExtraParam=false
I am always just looking for the value of "session-id".
EDIT:
The value of session-id is NOT limited to digits; it is guaranteed to contain a combination of both letters and digits.
What is the best approach that is the most fail safe, and protected.
Well, I think matching a word boundary on both sides will be enough.
Regex: \bsession-id=\d+\b
Note: use \\d and \\b if the regex flavor you are using needs double escaping (as in a Java string literal).
Regex101 Demo
In case session-id contains characters in the range [A-Za-z0-9], use this regex.
Regex: \bsession-id=[A-Za-z0-9]+\b
Regex101 Demo
Ideone Demo
Remember to include
import java.util.regex.Matcher;
import java.util.regex.Pattern;
Try this one:
String str = "www.somewebsite.com/misc-session/;session-id=213213213";
Pattern p = Pattern.compile("(session-id=\\d+)");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println(m.group(0));
}
Note that session-id= is always present and you are interested in the number that follows, which is matched by \d (written as \\d in a Java string). The + means one or more digits.
For more detail, see the description at Regex101.
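Since the edit to the question says the value mixes letters and digits, here is a small sketch that combines the [A-Za-z0-9]+ pattern from the first answer with a capturing group, so that only the value is returned. The class name SessionIdExtractor and the method sessionId are invented for illustration, and it assumes the value never contains characters outside [A-Za-z0-9].

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SessionIdExtractor {
    // Group 1 captures only the value after "session-id=".
    private static final Pattern SESSION_ID = Pattern.compile("\\bsession-id=([A-Za-z0-9]+)");

    // Hypothetical helper: returns the session id value, or empty if the URL has none.
    static Optional<String> sessionId(String url) {
        Matcher matcher = SESSION_ID.matcher(url);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "www.somewebsite.com/misc-session/extra-path/;session-id=1FSDSF2132FSADASD13213?someExtraParam=false";
        // The character class stops at '?', so the extra parameter is not included.
        System.out.println(sessionId(url).orElse("(none)")); // 1FSDSF2132FSADASD13213
    }
}
```

Use matcher.group(0) instead of group(1) if you want the whole session-id=... pair, as in the question.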

get the last portion of the link using java regex

I have an ArrayList of links. All links have the same format: abc.([a-z]*)/\\d{4}/
List<String> links = new ArrayList<>();
links.add("abc.com/2012/aa");
links.add("abc.com/2014/dddd");
links.add("abc.in/2012/aa");
I need to get the last portion of every link, i.e. the part after the domain name. The domain name can be anything (.com, .in, .edu etc.).
/2012/aa
/2014/dddd
/2012/aa
This is the output I want. How can I get this using regex?
Thanks
Some people, when confronted with a problem, think “I know, I'll use
regular expressions.” Now they have two problems.
(see here for background)
Why use a regex? Perhaps a simpler solution is to use String.split("/"), which gives you an array of substrings of the original string, split at each /. See this question for more info.
Note that String.split() does in fact take a regex to determine the boundaries upon which to split. However you don't need a regex in this case and a simple character specification is sufficient.
Try the regex below and use a capturing group, which is delimited by parentheses ().
\.[a-zA-Z]{2,3}(/.*)
Pattern description :
dot followed by two or three letters followed by forward slash then any characters
DEMO
Sample code:
Pattern pattern = Pattern.compile("\\.[a-zA-Z]{2,3}(/.*)");
Matcher matcher = pattern.matcher("abc.com/2012/aa");
if (matcher.find()) {
System.out.println(matcher.group(1));
}
output:
/2012/aa
Note:
You can make it more precise by using \\.[a-zA-Z]{2,3}(/\\d{4}/.*) if there are always 4 digits in the pattern.
String result = s.replaceAll("^[^/]*","");
Here s is a string from your list.
Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.
Why not just use the URI class?
output = new URI(link).getPath()
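One caveat with the URI approach: the links in the question have no scheme, so java.net.URI treats the whole string as a relative path and getPath() returns it unchanged. A possible workaround, sketched below, is to prepend a scheme before parsing. The class LinkPath, the method pathOf and the choice of "http://" are assumptions made for this example only.

```java
import java.net.URI;

final class LinkPath {
    // Hypothetical helper: adds a scheme when the link has none, then returns the path part.
    static String pathOf(String link) {
        String withScheme = link.contains("://") ? link : "http://" + link;
        // URI.create throws IllegalArgumentException if the string is not a valid URI.
        return URI.create(withScheme).getPath();
    }

    public static void main(String[] args) {
        // Without a scheme the host is not recognised, so the whole string is the path.
        System.out.println(URI.create("abc.com/2012/aa").getPath()); // abc.com/2012/aa
        System.out.println(pathOf("abc.com/2012/aa"));               // /2012/aa
    }
}
```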
Try this one and use the second capturing group
(.*?)(/.*)
Use foreach loop to iterate over list.
Use substring and indexOf('/').
FOR EXAMPLE
String s="abc.com/2014/dddd";
System.out.println(s.substring(s.indexOf('/')));
OUTPUT
/2014/dddd
Or you can go for the split method.
System.out.println(s.split("/", 2)[1]); // OUTPUT: 2014/dddd (you need to prepend the "/" yourself)
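Putting the foreach loop together with substring and indexOf('/') for the whole list could look roughly like this. The class LinkTails and the method tails are hypothetical names, and returning an empty string for a link without any slash is just one possible choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LinkTails {
    // Hypothetical helper: returns everything from the first '/' onward for each link.
    static List<String> tails(List<String> links) {
        List<String> result = new ArrayList<>();
        for (String link : links) {
            int slash = link.indexOf('/');
            // Assumption: a link with no '/' has no path part, so an empty string is used.
            result.add(slash >= 0 ? link.substring(slash) : "");
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList("abc.com/2012/aa", "abc.com/2014/dddd", "abc.in/2012/aa");
        System.out.println(tails(links)); // [/2012/aa, /2014/dddd, /2012/aa]
    }
}
```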

How can I provide an OR operator in regular expressions?

I want to match my string to one sequence or another, and it has to match at least one of them.
For AND, I learned it can be done with:
(?=one)(?=other)
Is there something like this for OR?
I am using Java, Matcher and Pattern classes.
Generally speaking about regexes, you definitely should begin your journey into Regex wonderland here: Regex tutorial
What you currently need is the | (pipe character)
To match the strings one OR other, use:
(one|other)
or if you don't want to store the matches, just simply
one|other
To be Java specific, this article is very good at explaining the subject
You will have to use your patterns this way:
//Pattern and Matcher
Pattern compiledPattern = Pattern.compile(myPatternString);
Matcher matcher = compiledPattern.matcher(myStringToMatch);
boolean isNextMatch = matcher.find(); // find the next match, if one exists
if (isNextMatch) {
    String matchedString = myStringToMatch.substring(matcher.start(), matcher.end());
}
Please note, there are many more possibilities regarding Matcher than what I displayed here...
//String functions
boolean didItMatch = myString.matches(myPatternString); // same as Pattern.matches(myPatternString, myString)
String allReplacedString = myString.replaceAll(myPatternString, replacement);
String firstReplacedString = myString.replaceFirst(myPatternString, replacement);
String[] splitParts = myString.split(myPatternString, howManyPartsAtMost);
Also, I'd highly recommend using online regex checkers such as Regexplanet (Java) or refiddle (this doesn't have Java specific checker), they make your life a lot easier!
The "or" operator is spelled |, for example one|other.
All the operators are listed in the documentation.
You can separate with a pipe thus:
Pattern.compile("regexp1|regexp2");
See here for a couple of simple examples.
Use the | character for OR
Pattern pat = Pattern.compile("exp1|exp2");
Matcher mat = pat.matcher("Input_data");
The answers are already given, use the pipe '|' operator. In addition to that, it might be useful to test your regexp in a regexp tester without having to run your application, for example:
http://www.regexplanet.com/advanced/java/index.html
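As a quick runnable illustration of the pipe operator with Pattern and Matcher, here is a minimal sketch. The class name AlternationDemo, the helper findAll and the sample sentence are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AlternationDemo {
    // Matches either the sequence "one" or the sequence "other".
    private static final Pattern ONE_OR_OTHER = Pattern.compile("one|other");

    // Hypothetical helper: collects every occurrence of either alternative.
    static List<String> findAll(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = ONE_OR_OTHER.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("one fish, other fish"));   // [one, other]
        // matches() requires the whole string to be one of the alternatives.
        System.out.println("other".matches("one|other"));      // true
        System.out.println("neither".matches("one|other"));    // false
    }
}
```

Keep in mind that | has the lowest precedence of the regex operators, so when the alternation is only part of a larger pattern it usually needs parentheses, for example gr(a|e)y.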

Java, regular expression catching multiple occurrences of pattern

This is my original String:
String response = "attributes[{\"id\":50,\"name\":super},{\"id\":55,\"name\":hello}]";
I'm trying to parse the String and extract all the id values e.g
50
55
Pattern idPattern = Pattern.compile("{\"id\":(.*),");
Matcher matcher = idPattern.matcher(response);
while(matcher.find()){
System.out.println(matcher.group(1));
}
When I try to print the value I get an exception:
java.util.regex.PatternSyntaxException: Illegal repetition
I haven't had much experience with regular expressions in the past, and I cannot find a simple solution to this online.
Appreciate any help!
Pattern.compile("\"id\":(\\d+)");
Don't unnecessarily use a greedy quantifier like * with a ., which matches any character.
If you want the digits extracted, you can use \d.
"id":(\d+)
Within a Java String,
Pattern.compile("\"id\":(\\d+)");
{ is a reserved character in regular expressions and should be escaped.
\{\"id\":(.*?),
Edit : If you're going to be working with JSON, you should consider using a dedicated JSON parser. It will make your life much easier. See Parsing JSON Object in Java
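For completeness, here is a small end-to-end sketch using the "id":(\d+) pattern from the answers above. An unescaped { is normally read as the start of a repetition count such as {2,3}, which is presumably why the original pattern failed to compile. The class IdExtractor and the method ids are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdExtractor {
    // Group 1 captures the digits that follow "id": in the input.
    private static final Pattern ID = Pattern.compile("\"id\":(\\d+)");

    // Hypothetical helper: returns every id value found, in order of appearance.
    static List<Integer> ids(String response) {
        List<Integer> result = new ArrayList<>();
        Matcher matcher = ID.matcher(response);
        while (matcher.find()) {
            result.add(Integer.valueOf(matcher.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        String response = "attributes[{\"id\":50,\"name\":super},{\"id\":55,\"name\":hello}]";
        System.out.println(ids(response)); // [50, 55]
    }
}
```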
