String line = "asdasdasdasd <meta name=\"generator\" content=\"WordPress 3.5.2\" /> asdasdasdasdasd";
Pattern p = Pattern.compile("<meta name=\"generator\" content=\"WordPress\\s+([\\d.]+)\" />");
Matcher m = p.matcher(line);
if(m.matches())
System.out.println(m.group(1));
else
System.out.println("not found");
The regex I have used does not give the desired result. I want the wordpress version from the supplied string.
Matcher#matches() succeeds only if the entire input string matches the pattern, so you would need a regex that covers the complete string.
Alternatively, you can use Matcher#find() with just the regex for relevant part of the string:
Pattern p = Pattern.compile("content=\"WordPress\\s+([\\d.]+)\"");
Matcher m = p.matcher(line);
if(m.find())
System.out.println(m.group(1));
else
System.out.println("not found");
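The difference between the two methods can be seen side by side. This is a minimal sketch that assumes the same input line as in the question; the class name MatchesVsFind and the helpers viaMatches and viaFind are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Hypothetical helper: group 1 if the WHOLE input matches the regex, else null.
    static String viaMatches(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Hypothetical helper: group 1 of the first matching substring, else null.
    static String viaFind(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "asdasdasdasd <meta name=\"generator\" content=\"WordPress 3.5.2\" /> asdasdasdasdasd";
        String regex = "content=\"WordPress\\s+([\\d.]+)\"";

        System.out.println(viaMatches(line, regex));               // null
        System.out.println(viaFind(line, regex));                  // 3.5.2
        System.out.println(viaMatches(line, ".*" + regex + ".*")); // 3.5.2
    }
}
```

Padding the pattern with .* on both sides makes matches() work, but find() is usually the simpler choice when only part of the input is of interest.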
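You do not have to escape the dot inside a character class, and a + inside the brackets is a literal plus sign, so keep the quantifier outside the brackets. This accepts version numbers with any number of digits:
Pattern p = Pattern.compile("WordPress\\s+([\\d.]+)");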
I have a string like this:
font-size:36pt;color:#ffffff;background-color:#ff0000;font-family:Times New Roman;
How can I get the value of the color and the value of background-color?
color:#ffffff;
background-color:#ff0000;
I have tried the following code, but the result is not what I expected.
Pattern pattern = Pattern.compile("^.*(color:|background-color:).*;$");
The result will display:
font-size:36pt; color:#ffffff; background-color:#ff0000; font-family:Times New Roman;
If you want multiple matches in one string, don't anchor the pattern with ^ and $. With those anchors the only possible match is the whole string, so there is nothing left to match a second time.
Also, use a lazy quantifier like *?. This will stop matching as soon as it finds some string that matches the pattern after it.
This is the regex you should use:
(color:|background-color:)(.*?);
Group 1 is either color: or background-color:, group 2 is the color code.
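In Java, the regex above could be applied with a find() loop. This is a sketch only; the class name CssColors and the helper extract are made up for illustration, and the input is the string from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CssColors {
    // Hypothetical helper: maps each matched property prefix (group 1)
    // to its value (group 2), in the order the matches are found.
    static Map<String, String> extract(String style) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = Pattern.compile("(color:|background-color:)(.*?);").matcher(style);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "font-size:36pt;color:#ffffff;background-color:#ff0000;font-family:Times New Roman;";
        System.out.println(extract(s)); // {color:=#ffffff, background-color:=#ff0000}
    }
}
```

Note that this pattern would also pick up the color: part of a property such as border-color:, so it is only suitable when the input is known not to contain such properties.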
To do this you can use a lookbehind, (?<=abc). It requires the preceding text to match but does not include it in the result, so only the hex code is returned:
String s = "font-size:36pt;color:#ffffff;background-color:#ff0000;font-family:Times New Roman";
Pattern pattern = Pattern.compile("(?<=color:)#.{6}");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group());
}
Pattern pattern = Pattern.compile("color\\s*:\\s*([^;]+)\\s*;\\s*background-color\\s*:\\s*([^;]+)\\s*;");
Matcher matcher = pattern.matcher("font-size:36pt; color:#ffffff; background-color:#ff0000; font-family:Times New Roman;");
if (matcher.find()) {
System.out.println("color:" + matcher.group(1));
System.out.println("background-color:" + matcher.group(2));
}
No need to describe the whole input, only the relevant part(s) that you're looking to extract.
The regex color:(#[\\w\\d]+); does the trick for me:
String input = "font-size:36pt;color:#ffffff;background-color:#ff0000;font-family:Times New Roman;";
String regex = "color:(#[\\w\\d]+);";
Matcher m = Pattern.compile(regex).matcher(input);
while (m.find()) {
System.out.println(m.group(1));
}
Notice that m.group(1) returns the capturing group, which is the part inside the parentheses in the regex. So the regex actually matches the whole color:#ffffff; and color:#ff0000; parts, but only the color code itself is printed.
Use a CSS parser like ph-css
String input = "font-size:36pt; color:#ffffff; background-color:#ff0000; font-family:Times New Roman;";
final CSSDeclarationList cssPropertyList =
CSSReaderDeclarationList.readFromString(input, ECSSVersion.CSS30);
System.out.println(cssPropertyList.get(1).getProperty() + " , "
+ cssPropertyList.get(1).getExpressionAsCSSString());
System.out.println(cssPropertyList.get(2).getProperty() + " , "
+ cssPropertyList.get(2).getExpressionAsCSSString());
Prints:
color , #ffffff
background-color , #ff0000
You can find out more about ph-css on GitHub.
I have this string:
text=123+456+789&xxxxxxxxx&yyyyyyyyyy&zzzzzzzzzzz
I need to extract 123+456+789
What I have done so far:
String s = "text=123+456+789&xxxxxxxxx&yyyyyyyyyy&zzzzzzzzzzz";
String ps = "text=(.*)&";
Pattern p = Pattern.compile(ps);
Matcher m = p.matcher(s);
if (m.find()){
System.out.println(m.group(0));
System.out.println(m.group(1));
}
And I got all text until the last & which is: 123+456+789&xxxxxxxxx&yyyyyyyyyy while the requested output is: 123+456+789
Any suggestions how to fix it (regex is mandatory)?
Use a negated character class:
String ps = "text=([^&]*)";
The value you need will be in Group 1.
The [^&] matches any character but an ampersand.
You almost have it. You need to make your regex lazy (non-greedy), like this:
String ps = "text=(.*?)&";
here ----------------^
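The greedy pattern from the question, the lazy variant, and the negated character class can be compared on the same input. This is a minimal sketch; the class name FirstParam and the helper capture are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstParam {
    // Hypothetical helper: group 1 of the first match, or null if nothing matches.
    static String capture(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "text=123+456+789&xxxxxxxxx&yyyyyyyyyy&zzzzzzzzzzz";

        // Greedy: .* runs to the end and backtracks only to the LAST '&'.
        System.out.println(capture(s, "text=(.*)&"));   // 123+456+789&xxxxxxxxx&yyyyyyyyyy
        // Lazy: .*? stops at the FIRST '&'.
        System.out.println(capture(s, "text=(.*?)&"));  // 123+456+789
        // Negated class: consumes everything that is not '&'.
        System.out.println(capture(s, "text=([^&]*)")); // 123+456+789
    }
}
```

One practical difference: the negated class also works when text= is the last parameter and no & follows it, whereas both (.*)& and (.*?)& require a trailing &.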
Try this regex :
([0-9+]+)
Link : https://regex101.com/r/xU2zF4/1
java code :
String s = "text=123+456+789&xxxxxxxxx&yyyyyyyyyy&zzzzzzzzzzz";
String ps = "([0-9+]+)";
Pattern p = Pattern.compile(ps);
Matcher m = p.matcher(s);
if (m.find()){
System.out.println(m.group(0)); // the whole match: 123+456+789
System.out.println(m.group(1)); // group 1, here the same text: 123+456+789
}
Now I have this:
String s = "1<script type='text/javascript'>2</script>3<script type='text/javascript'>3</script>5";
Pattern pattern = Pattern.compile("<script.*</script>");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
s = s.replace(matcher.group(), "");
}
System.out.println(s);
The result is
15
But I need
135
In PHP we have the /U modifier, but what should I do in Java? I thought about something like this, but it is incorrect:
Pattern pattern = Pattern.compile("<script[^(script)]*</script>");
Try this. You need a ? after the quantifier to make the match lazy (as short as possible):
<script([^>]*)?>.*?<\/script>
See demo:
http://regex101.com/r/kO7lO2/3
Use replaceAll to replace every match of the regex below with an empty string:
<script [^>]*>[^<]*</script>
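The lazy approach can be combined with replaceAll so that no explicit find() loop is needed. This is a sketch on the string from the question; the class name ScriptStripper and the helper strip are made up for illustration.

```java
public class ScriptStripper {
    // Hypothetical helper: removes each <script ...>...</script> block separately.
    // The lazy .*? stops at the first closing tag instead of the last one.
    static String strip(String html) {
        return html.replaceAll("<script[^>]*>.*?</script>", "");
    }

    public static void main(String[] args) {
        String s = "1<script type='text/javascript'>2</script>3<script type='text/javascript'>3</script>5";
        System.out.println(strip(s));                                // 135
        // The greedy version swallows everything between the first
        // opening tag and the last closing tag.
        System.out.println(s.replaceAll("<script.*</script>", "")); // 15
    }
}
```

By default . does not match line terminators in Java, so script blocks that span several lines would need the (?s) flag (Pattern.DOTALL) at the start of the pattern.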
I was experimenting with extracting the 't' and 'f' flags from the string below.
I was surprised to see an extra character in the output. Apparently the matcher backtracked, and I don't understand why. What should the correct regex be?
System.out.println("searching...");
// "Sun:\\s Mon:\\s Tue:\\s Wed:\\s Thu:\\s Fri:\\s Sat:\\s "
Pattern p = Pattern.compile("[t|f]");
Matcher m = p.matcher("Sun:t Mon:f Tue:t Wed:t Thu:f Fri:t Sat:f ");
while (m.find()) {
System.out.println(m.group());
}
Output:
searching...
t
f
t
t
f
t
t
f
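Sat has a t in it, so there is no backtracking involved. Also, inside a character class | is a literal pipe rather than alternation, so [t|f] should be [tf]. Try ":([tf])" instead.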
Pattern p = Pattern.compile(":([tf])");
Matcher m = p.matcher("Sun:t Mon:f Tue:t Wed:t Thu:f Fri:t Sat:f ");
while (m.find()) {
System.out.println(m.group(1));
}
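Both points can be checked quickly: anchoring on the colon skips the t in Sat, and a pipe inside square brackets is matched literally. This is a small sketch; the class name DayFlags and the helper flags are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DayFlags {
    // Hypothetical helper: collects every t or f that directly follows a colon.
    static List<String> flags(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(":([tf])").matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(flags("Sun:t Mon:f Tue:t Wed:t Thu:f Fri:t Sat:f ")); // [t, f, t, t, f, t, f]
        // Inside a character class the pipe is an ordinary character.
        System.out.println(Pattern.matches("[t|f]", "|")); // true
    }
}
```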
for example:
I have a string like this:
http://shop.vipshop.com/detail-97996-12358781.html
I want to use regex to find 97996 and 12358781
java code is appreciated
Many thanks.
String str="http://shop.vipshop.com/detail-97996-12358781.html";
String regex ="\-d{5}\-";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
System.out.println(matcher.group());
but it was wrong
Try this
String str="http://shop.vipshop.com/detail-97996-12358781.html";
String regex =".*detail-(\\d+)-(\\d+)\\.html";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if(matcher.matches()){
System.out.println(matcher.group(1) + "|" + matcher.group(2));
}
You have to invoke either Matcher#find() or Matcher#matches() to actually get the matches. In this case, you would need the former one, as you are only finding a part of string matching the regex.
You can also use the + quantifier to match a run of digits of any length. Try this:
String regex ="\\d+";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group());
}
Two lines:
String num1 = str.replaceAll(".*-(\\d+)-.*", "$1");
String num2 = str.replaceAll(".*-(\\d+)\\..*", "$1");
String str = "http://shop.vipshop.com/detail-97996-12358781.html";
String regex = "(?<=detail-)(\\d+)-(\\d+)(?=\\.html)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
matcher.find();
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));
output:
97996
12358781
You used a fixed quantifier of 5, but the second number (1235...) is 8 digits long.
If the numbers are always between 5 and 8 digits long, you can use:
"(\\d{5,8})"
Each match is captured in group 1.
But if you need to match the specific form detail-NUMBER-NUMBER.html, you can use:
"detail-(\\d+)-(\\d+)\\.html"
The two numbers are captured in groups 1 and 2.
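A sketch of how the detail-NUMBER-NUMBER.html form might be used in Java, with a guard for input that does not match. The class name DetailIds, the helper extract, and the choice of returning an empty array on failure are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DetailIds {
    // The dot before "html" is escaped so that it matches only a literal dot.
    private static final Pattern DETAIL = Pattern.compile("detail-(\\d+)-(\\d+)\\.html");

    // Hypothetical helper: returns both numbers, or an empty array if the URL does not match.
    static String[] extract(String url) {
        Matcher m = DETAIL.matcher(url);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] ids = extract("http://shop.vipshop.com/detail-97996-12358781.html");
        System.out.println(ids[0] + " " + ids[1]); // 97996 12358781
    }
}
```

Checking the result of find() before calling group() avoids the IllegalStateException that group() throws when no match is available.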
you can use this:
String str="http://shop.vipshop.com/detail-97996-12358781.html";
String regex ="[^0-9]+([0-9]+)[^0-9]+([0-9]+).+";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if(matcher.matches()){
System.out.println(matcher.group(1) + " " + matcher.group(2));
}
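You should wrap
System.out.println(matcher.group());
in
if(matcher.find()){
}
because group() throws an IllegalStateException unless a successful match has been made first. Your code then becomes: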
String str="http://shop.vipshop.com/detail-97996-12358781.html";
String regex ="\\d{5,}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if(matcher.find()){
System.out.println(matcher.group());
}