how to deal with a string with regex - java

For example, I have a string like this:
http://shop.vipshop.com/detail-97996-12358781.html
I want to use a regex to find 97996 and 12358781.
Java code would be appreciated. Many thanks.
This is what I tried:
String str="http://shop.vipshop.com/detail-97996-12358781.html";
String regex ="\-d{5}\-";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
System.out.println(matcher.group());
But it does not work.

Try this
String str="http://shop.vipshop.com/detail-97996-12358781.html";
String regex =".*detail-(\\d+)-(\\d+)\\.html";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if(matcher.matches()){
System.out.println(matcher.group(1) + "|" + matcher.group(2));
}

You have to invoke either Matcher#find() or Matcher#matches() before calling group(); until then no match has been attempted. In this case you need find(), because the regex matches only part of the string.
You can also use the + quantifier to match a run of digits of any length. Try this:
String regex ="\\d+";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group());
}
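As a minimal, self-contained sketch of the find() loop above, the matches can be collected into a list instead of printed. The class and method names (DigitRunExtractor, extractNumbers) are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class DigitRunExtractor {
    // Compile once and reuse: one or more consecutive digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns every run of digits in the input, in order of appearance.
    static List<String> extractNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(input);
        // Each call to find() advances to the next match, if any.
        while (matcher.find()) {
            numbers.add(matcher.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        String str = "http://shop.vipshop.com/detail-97996-12358781.html";
        System.out.println(extractNumbers(str)); // prints [97996, 12358781]
    }
}
```

This only works as intended while the rest of the URL contains no other digits; a host name such as shop2.example.com would add an extra entry.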

Two lines:
String num1 = str.replaceAll(".*-(\\d+)-.*", "$1");
String num2 = str.replaceAll(".*-(\\d+)\\..*", "$1");
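One thing to keep in mind with the replaceAll approach: when the pattern does not match, replaceAll returns the input unchanged, so a URL of a different shape comes back whole rather than as a number. A rough sketch of a guard, assuming a hypothetical helper named firstNumberOrNull:

```java
// Hypothetical helper class; the names are illustrative only.
class ReplaceAllFallback {
    private static final String FIRST = ".*-(\\d+)-.*";

    // Returns the first number, or null when the input does not have the expected shape.
    static String firstNumberOrNull(String input) {
        // String.matches tests the whole input against the regex.
        return input.matches(FIRST) ? input.replaceAll(FIRST, "$1") : null;
    }

    public static void main(String[] args) {
        String other = "http://shop.vipshop.com/index.html";
        // No match, so replaceAll hands back the original string.
        System.out.println(other.replaceAll(FIRST, "$1")); // prints http://shop.vipshop.com/index.html
        System.out.println(firstNumberOrNull(other));      // prints null
        System.out.println(firstNumberOrNull(
                "http://shop.vipshop.com/detail-97996-12358781.html")); // prints 97996
    }
}
```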

String str = "http://shop.vipshop.com/detail-97996-12358781.html";
String regex = "(?<=detail-)(\\d+)-(\\d+)(?=\\.html)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
matcher.find();
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));
Output:
97996
12358781

You used the quantifier {5}, but the second number (1235...) is 8 digits long.
If the numbers are always between 5 and 8 digits long, you can use:
"(\\d{5,8})"
Each match is captured in group 1.
But if you need to match the specific form detail-NUMBER-NUMBER.html, you can use:
"detail-(\\d+)-(\\d+)\\.html"
The numbers are captured in groups 1 and 2.

You can use this:
String str="http://shop.vipshop.com/detail-97996-12358781.html";
String regex ="[^0-9]+([0-9]+)[^0-9]+([0-9]+).+";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if(matcher.matches()){
System.out.println(matcher.group(1) + " " + matcher.group(2));
}
For more advanced tutorials, see the RegEx tutorial and the Regular Expression tutorial.

You should wrap
System.out.println(matcher.group());
in
if(matcher.find()){
}
so that your code becomes:
String str="http://shop.vipshop.com/detail-97996-12358781.html";
String regex ="\\d{5,}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if(matcher.find()){
System.out.println(matcher.group());
}

Related

Having find() find more than once in regex in java

I need find() to find more than once. For example, in the regex below it will only get "i am cool1", but I also want it to get "i am cool2" and "i am cool3". How would I do that?
Pattern pattern = Pattern.compile("i am cool([0-9]{1})", Pattern.CASE_INSENSITIVE);
String theString = "i am cool1 text i am cool2 text i am cool3 text";
Matcher matcher = pattern.matcher(theString);
matcher.find();
whatYouNeed = matcher.group(1);
You have to invoke find() for every match. You can get the whole match with group() (without index).
Pattern pattern = Pattern.compile("i am cool([0-9]{1})", Pattern.CASE_INSENSITIVE);
String theString = "i am cool1 text i am cool2 text i am cool3 text";
Matcher matcher = pattern.matcher(theString);
while (matcher.find()) {
System.out.println(matcher.group());
}
This will print
i am cool1
i am cool2
i am cool3
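If only the captured digit is needed, as in the question's matcher.group(1), the same loop can collect group(1) from each match. A small sketch; the class and method names (CoolNumbers, collect) are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class CoolNumbers {
    private static final Pattern COOL =
            Pattern.compile("i am cool([0-9])", Pattern.CASE_INSENSITIVE);

    // Collects the digit captured by group 1 for every match in the text.
    static List<String> collect(String text) {
        List<String> digits = new ArrayList<>();
        Matcher matcher = COOL.matcher(text);
        while (matcher.find()) {
            digits.add(matcher.group(1));
        }
        return digits;
    }

    public static void main(String[] args) {
        String theString = "i am cool1 text i am cool2 text i am cool3 text";
        System.out.println(collect(theString)); // prints [1, 2, 3]
    }
}
```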

How can I find an instance of a whole word in Java (Android)? for user tagging on a social media app

I'm building a social media app and I want to add the "tagging" functionality. My initial thought was to look for an instance of a whole word beginning with #. This is what I've got so far.
String text = "#martin I will come and meet you at the woods 123#martin # martin";
List<String> tokens = new ArrayList<String>();
tokens.add("#martin"); // "martin" stands for an arbitrary string
String patternString = "\\b(" + StringUtils.join(tokens, "|") + ")\\b";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
It should print out #martin but not 123#martin, # or # martin, etc.
Am I close to a solution here, and is there a better way to do this?
Try this:
String text = "#martin I will come and meet you at the woods 123#martin # martin";
String patternString = "(^|\\W)(#\\w+?)(\\W|$)";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group(2));
}
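A possible variation, offered as a sketch rather than a replacement: because the pattern above consumes the separator after each tag, two tags separated by a single space (for example "#alice #bob") yield only the first one. A negative lookbehind checks the preceding character without consuming it. The class and method names (TagFinder, findTags) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class TagFinder {
    // '#' followed by word characters, not preceded by a word character.
    private static final Pattern TAG = Pattern.compile("(?<!\\w)#\\w+");

    static List<String> findTags(String text) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = TAG.matcher(text);
        while (matcher.find()) {
            tags.add(matcher.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String text = "#martin I will come and meet you at the woods 123#martin # martin";
        System.out.println(findTags(text));          // prints [#martin]
        System.out.println(findTags("#alice #bob")); // prints [#alice, #bob]
    }
}
```

Note that \w covers only ASCII letters, digits and underscore by default, so tags containing other characters would need a wider character class.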

java regular expression issue about capture group

public void test(){
String source = "hello<a>goodA</a>boys can <a href=\"www.baidu.com\">goodB</a>\"\n"
+ " + \"this can help";
Pattern pattern = Pattern.compile("<a[\\s+.*?>|>](.*?)</a>");
Matcher matcher = pattern.matcher(source);
while (matcher.find()){
System.out.println("laozhu:" + matcher.group(1));
}
}
Output:
laozhu:goodA
laozhu:href="www.baidu.com">goodB
Why is the second match not laozhu:goodB?
Try this Regex:
<a(?: .*?)?>(\w+)<\/a>
So your Pattern should look like this:
Pattern pattern = Pattern.compile("<a(?: .*?)?>(\\w+)<\\/a>");
It matches goodA and goodB.
For a detailed description, see Regex101.
Pattern pattern = Pattern.compile("<a.*?>(.*?)</a>");
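Some background on why the original pattern misbehaves: in <a[\\s+.*?>|>] the square brackets form a character class, so that part matches exactly one character (whitespace, +, ., *, ?, | or >) rather than an alternation. For the second link it consumes the space after <a, and the lazy group then captures everything up to </a>, including the href attribute. A sketch of one way to skip the attributes, assuming the input contains an anchor with an href as the reported output suggests (the input string and the names AnchorText and linkTexts are assumptions for this example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class AnchorText {
    // "<a", optionally whitespace plus attributes (anything except '>'), then '>',
    // then the link text captured lazily up to the closing tag.
    private static final Pattern LINK = Pattern.compile("<a(?:\\s[^>]*)?>(.*?)</a>");

    static List<String> linkTexts(String source) {
        List<String> texts = new ArrayList<>();
        Matcher matcher = LINK.matcher(source);
        while (matcher.find()) {
            texts.add(matcher.group(1));
        }
        return texts;
    }

    public static void main(String[] args) {
        // Assumed input, reconstructed from the output shown in the question.
        String source = "hello<a>goodA</a>boys can <a href=\"www.baidu.com\">goodB</a> this can help";
        System.out.println(linkTexts(source)); // prints [goodA, goodB]
    }
}
```

Regular expressions are fragile on real-world HTML (nested tags, attributes containing '>'), so this is only reasonable for small, well-controlled snippets.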

Using Java regex, how can I get a group value from a string containing "( )"?

I have strings:
#Table(name = "T_MEM_MEMBER_ADDRESS1")
#Table( name = "T_MEM_MEMBER_ADDRESS2")
#Table ( name = "T_MEM_MEMBER_ADDRESS3" )
I want to write a regex that extracts the name value, such as:
T_MEM_MEMBER_ADDRESS1
T_MEM_MEMBER_ADDRESS2
T_MEM_MEMBER_ADDRESS3
I wrote:
String regexPattern="...";
Pattern pattern = Pattern.compile(regexPattern);
Matcher matcher = pattern.matcher(input);
boolean matches = matcher.matches();
if (matches){
log.debug(matcher.group(1));
}
but I cannot work out what regexPattern should be.
You can use this regex:
(?<=")(.+)(?=")
In Java:
String regexPattern="(?<=\")(.+)(?=\")";
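It uses a lookbehind and a lookahead.
Group 1 will contain what you want, provided you call matcher.find() instead of matcher.matches(), because this pattern covers only the part between the quotes and not the whole input.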
You can use this piece of code:
String input = "#Table(name = \"T_MEM_MEMBER_ADDRESS1\")";
String regexPattern=".*\"(.*)\".*";
Pattern pattern = Pattern.compile(regexPattern);
Matcher matcher = pattern.matcher(input);
boolean matches = matcher.matches();
if (matches){
System.out.println(matcher.group(1));
}
Hope it helps.
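When several annotations appear in one input, a find() loop with a pattern anchored on name = can pull out each value. This is only a sketch under the assumption that the value never contains a double quote; the class and method names (TableNames, extract) are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class TableNames {
    // name, optional spaces, '=', optional spaces, then the quoted value.
    private static final Pattern NAME = Pattern.compile("name\\s*=\\s*\"([^\"]*)\"");

    static List<String> extract(String input) {
        List<String> names = new ArrayList<>();
        Matcher matcher = NAME.matcher(input);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String input = "#Table(name = \"T_MEM_MEMBER_ADDRESS1\")\n"
                + "#Table( name = \"T_MEM_MEMBER_ADDRESS2\")\n"
                + "#Table ( name = \"T_MEM_MEMBER_ADDRESS3\" )";
        // Prints the three table names, one per line.
        for (String name : extract(input)) {
            System.out.println(name);
        }
    }
}
```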

Regex for extracting a substring from surrounding data using Java

String line = "asdasdasdasd <meta name=\"generator\" content=\"WordPress 3.5.2\" /> asdasdasdasdasd";
Pattern p = Pattern.compile("<meta name=\"generator\" content=\"WordPress\\s+([\\d.]+)\" />");
Matcher m = p.matcher(line);
if(m.matches())
System.out.println(m.group(1));
else
System.out.println("not found");
The regex I have used does not give the desired result. I want to extract the WordPress version from the supplied string.
Matcher#matches() succeeds only if the entire string matches the pattern, so you would need to build a regex for the complete string.
Alternatively, you can use Matcher#find() with just the regex for relevant part of the string:
Pattern p = Pattern.compile("content=\"WordPress\\s+([\\d.]+)\"");
Matcher m = p.matcher(line);
if(m.find())
System.out.println(m.group(1));
else
System.out.println("not found");
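Inside a character class the dot is already literal, so escaping it is optional. If the rest of the tag may vary, you can match just the part around the version number and use find():
Pattern p = Pattern.compile("WordPress\\s+([\\d.]+)");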
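To make the difference between the matching methods concrete, here is a short sketch comparing matches(), lookingAt() and find() on the line from the question. The class and method names (MatchModes, findVersion) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class MatchModes {
    private static final Pattern VERSION = Pattern.compile("WordPress\\s+([\\d.]+)");

    // Returns the version found anywhere in the line, or null if there is none.
    static String findVersion(String line) {
        Matcher m = VERSION.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "asdasdasdasd <meta name=\"generator\" content=\"WordPress 3.5.2\" /> asdasdasdasdasd";
        // matches(): the whole input must match the pattern.
        System.out.println(VERSION.matcher(line).matches());   // prints false
        // lookingAt(): the match must start at the beginning of the input.
        System.out.println(VERSION.matcher(line).lookingAt()); // prints false
        // find(): the match may occur anywhere in the input.
        System.out.println(findVersion(line));                 // prints 3.5.2
    }
}
```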
