I have a question; many thanks in advance for any help.
I have a URL like this:
/Hello/World/special/Case/2016/07/01/offer-015155.html
I need only the "2016/07/01/offer-015155" part, which changes dynamically each time. Could you help?
I tried "(.*?)" and "\d{4}/\d\d/\d\d/offer-\d+." but neither helped.
When I run it, it says "not found". :(
If you want to extract part of a URL, something very simple like (.+) is quite enough.
Demo:
Example Regular Expression configuration (mind "Field to Check" bit)
References:
JMeter Regular Expressions
Perl 5 Regex Cheat sheet
Using RegEx (Regular Expression Extractor) With JMeter
You could consider using a positive lookbehind combined with a positive lookahead, like this:
(?<=\/Hello\/World\/special\/Case\/).*(?=\.html)
This regex is very explicit but does the job. See this for an explanation of the regex: https://regex101.com/r/vY5mS2/2
EDIT:
Or simply use capture groups (https://regex101.com/r/vY5mS2/5)
\/Hello\/World\/special\/Case\/(.*)\.html
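For reference, here is a small Java sketch that runs both approaches (lookaround and capture group) against the sample URL, with the dot before html escaped so that it only matches a literal dot. The class and method names (UrlPartExtractor, viaLookaround, viaCaptureGroup) are made up for illustration and are not part of JMeter or any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class UrlPartExtractor {
    // Lookbehind/lookahead: the whole match is already the wanted part.
    private static final Pattern LOOKAROUND =
            Pattern.compile("(?<=/Hello/World/special/Case/).*(?=\\.html)");
    // Capture group: the wanted part is group 1.
    private static final Pattern CAPTURE =
            Pattern.compile("/Hello/World/special/Case/(.*)\\.html");

    static String viaLookaround(String url) {
        Matcher m = LOOKAROUND.matcher(url);
        return m.find() ? m.group() : null;
    }

    static String viaCaptureGroup(String url) {
        Matcher m = CAPTURE.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "/Hello/World/special/Case/2016/07/01/offer-015155.html";
        System.out.println(viaLookaround(url));   // 2016/07/01/offer-015155
        System.out.println(viaCaptureGroup(url)); // 2016/07/01/offer-015155
    }
}
```

In Java source the forward slashes do not need escaping; only the backslash of \. has to be doubled inside the string literal.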
You can try the following code also:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(\\d{4}/\\d{2}/\\d{2}/offer-\\d+)");
        String testString = "/Hello/World/special/Case/2016/07/01/offer-015155.html";
        Matcher matcher = pattern.matcher(testString);
        if (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}
Execution result: 2016/07/01/offer-015155
I have the following configuration in the urlrewrite.xml:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE urlrewrite PUBLIC "-//tuckey.org//DTD UrlRewrite 4.0//EN" "http://www.tuckey.org/res/dtds/urlrewrite4.0.dtd">
<urlrewrite use-query-string="true">
    <rule>
        <from>^(/event/showEventList)(\.{1})(\bhtm\b|\bhtml\b)(\?{0,1})([a-zA-Z0-9-_=&]{0,}+)(#{0,1})([a-zA-Z0-9-_=&]{0,}+)$</from>
        <to type="redirect" last="true">/events$4$5</to>
    </rule>
</urlrewrite>
The regex ^(/event/showEventList)(\.{1})(\bhtm\b|\bhtml\b)(\?{0,1})([a-zA-Z0-9-_=&]{0,}+)(#{0,1})([a-zA-Z0-9-_=&]{0,}+)$ has 7 groups, which are:
(/event/showEventList): matches /event/showEventList
(\.{1}): matches a single dot (.)
(\bhtm\b|\bhtml\b): matches only htm or html
(\?{0,1}): matches a question mark (?), which may occur zero or one time
([a-zA-Z0-9-_=&]{0,}+): matches the query string, zero or more characters
(#{0,1}): matches a hash sign (#), which may occur zero or one time
([a-zA-Z0-9-_=&]{0,}+): matches the fragment, zero or more characters
If I test this configuration with a test URL: /event/showEventList.html?pageNumber=1#key=val, I am expecting that the redirected URL would be /events?pageNumber=1, but I am getting /events?pageNumber=1#key=val
I have a code snippet to test it, which is:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlRewriterRegexTest {
    public static void main(String[] args) {
        String input = "/event/showEventList.html?pageNumber=1#key=val";
        String regex = "^(/event/showEventList)(\\.{1})(\\bhtm\\b|\\bhtml\\b)(\\?{0,1})([a-zA-Z0-9-_=&]{0,}+)(#{0,1})([a-zA-Z0-9-_=&]{0,}+)$";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);
        System.out.println(matcher.replaceFirst("/events$4$5"));
    }
}
It outputs to: /events?pageNumber=1.
Any pointer would be very helpful.
I'd simplify the expression a bit.
Escape slashes, as they are typically used as delimiters for the regex (\/event\/showEventList)
Remove superfluous quantifier (\.)
Shorten the html string test (htm(l)?) - careful, this messes with your capturing group numbers
Remove word boundary checks around html
Use ? instead of {0,1}
Use * instead of {0,}
Remove possessive quantifier (I don't see why you'd need it)
Ignore everything after #, you don't seem to need it in your replacement
This gives us ^(\/event\/showEventList)(\.)(htm(l)?)(\??)([a-zA-Z0-9-_=&]+)*#(.+)$ which substitutes your example to /events?pageNumber=1
To play around, see https://regexr.com/4otp7
I've simplified the expression and here is the working solution
<from>^(\/event\/showEventList\.html?)(\?[a-zA-Z0-9-_=&]*)\#.*$</from>
<to type="redirect" last="true">/events$2</to>
This will match the URL and take everything from the beginning of the query string up to the first occurrence of #.
Explanation:
Group 1: matches the URL /event/showEventList.html or /event/showEventList.htm
Group 2: matches the query string, zero or more characters, up to the first occurrence of #
Group 2 is the string you want to use for the redirect; anything from # onward, including the # itself, is ignored.
I am answering my own question, so that in future if someone else stumbles upon the same problem, this answer could help him.
This has nothing to do with the UrlRewriteFilter framework. By enabling the debug log for the framework, I saw that the URL it receives before applying the defined rules doesn't contain the URL hash (#). From other SO answers and by analyzing the browser's network traffic, I saw that the browser does not send the URL fragment to the server, so it is not available in the HttpServletRequest. This is why the regular expressions did not work.
Since this hash is available in the client browser and thanks to HTML5 History API I am able to solve the problem using JavaScript:
<script type="text/javascript">
    window.addEventListener('DOMContentLoaded', (event) => {
        const url = new URL(window.location);
        url.hash = '';
        history.replaceState(null, document.title, url);
    });
</script>
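As a sanity check on the explanation above, the rule's regex can be run against the URL both with and without the fragment. This is a minimal sketch using plain java.util.regex, not UrlRewriteFilter itself; the class name FragmentCheck and the method rewrite are hypothetical. Both inputs should give the same target, which suggests the pattern was fine and the fragment was simply never part of what the server matched.

```java
import java.util.regex.Pattern;

// Hypothetical check; this does not call UrlRewriteFilter.
class FragmentCheck {
    // Same expression as in the <from> element of the rule in the question.
    private static final Pattern RULE = Pattern.compile(
            "^(/event/showEventList)(\\.{1})(\\bhtm\\b|\\bhtml\\b)(\\?{0,1})"
            + "([a-zA-Z0-9-_=&]{0,}+)(#{0,1})([a-zA-Z0-9-_=&]{0,}+)$");

    static String rewrite(String url) {
        // $4 is the optional "?", $5 is the query string.
        return RULE.matcher(url).replaceFirst("/events$4$5");
    }

    public static void main(String[] args) {
        // The URL as typed in the browser, fragment included.
        System.out.println(rewrite("/event/showEventList.html?pageNumber=1#key=val"));
        // The URL as the server sees it according to the debug log: no fragment.
        System.out.println(rewrite("/event/showEventList.html?pageNumber=1"));
    }
}
```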
I am trying to capture host address from string with regex. My code looks like the following:
private static final Pattern OBTAIN_HOST_PATTERN = Pattern.compile("Host:\\s?(.*)");

public static String getHostAddress(String line) {
    Matcher m = OBTAIN_HOST_PATTERN.matcher(line);
    if (m.matches()) {
        return OBTAIN_HOST_PATTERN.matcher(line).group(1);
    }
    return "Pattern does not match.";
}
Then I call getHostAddress("Host: abc"); and it gives me java.lang.IllegalStateException: No match found, which means the pattern matches but the group capturing does not work. Could you please help me find out why this happens and what I am missing? Thanks in advance :)
Edit: I resolved the issue. It was because I was getting the matcher twice (or at least I think that was the reason), but can someone explain why this happens?
The statement
return OBTAIN_HOST_PATTERN.matcher(line).group(1);
creates a new Matcher and calls neither matches() nor find() on it, so that Matcher has no match state to read groups from. Since the if statement has already found a match on m, you can just do
return m.group(1);
You could do even better by naming your group, so you don't have to keep track of group indexes when looking for the one you want. It can be done like this:
"Host:\\s?(?<mygroupname>.*)"
and then
m.group("mygroupname")
A bit of doc about it : https://blogs.oracle.com/xuemingshen/entry/named_capturing_group_in_jdk7
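Putting both suggestions together, a minimal sketch could look like the following. The class name HostParser, the group name host and the helper freshMatcherThrows are illustrative choices, not names from the question. The second method reproduces the original exception to show that match state lives in each Matcher instance, not in the Pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class wrapping the corrected method.
class HostParser {
    // Named groups require Java 7 or later; "host" is an illustrative name.
    private static final Pattern OBTAIN_HOST_PATTERN =
            Pattern.compile("Host:\\s?(?<host>.*)");

    static String getHostAddress(String line) {
        Matcher m = OBTAIN_HOST_PATTERN.matcher(line);
        // matches() stores the match state inside this Matcher instance.
        if (m.matches()) {
            return m.group("host");
        }
        return "Pattern does not match.";
    }

    // Reproduces the original problem: group() on a Matcher that never ran a match.
    static boolean freshMatcherThrows(String line) {
        try {
            OBTAIN_HOST_PATTERN.matcher(line).group(1);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(getHostAddress("Host: abc"));     // abc
        System.out.println(freshMatcherThrows("Host: abc")); // true
    }
}
```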
I have the following two payloads, for which I am trying to write a Java regex:
Input Payload 1
ISA*00* *00* *ZZ*EXDO *ZZ*047336389 *150327*1007*U*00401*900063730*0*P*>~
GS*QM*EXDO*047336389*20150327*1007*900063730*X*004010~
ST*214*900063730~
B10*326GENT15173**EXDO~
L11*019*TN~
Input Payload 2
ISA*00* *00* *02*HJBT *01*047336389 *140103*1751*U*00401*000012003*0*P*>\
GS*QM*HJBT*047336389*20140103*1751*12003*X*004010\
ST*214*0001\
B10*117094*B065199*HJBT\
N1*SH*INTEVA PRODUCTS LLC-\
I have the following regex:
.*(ST\*214|ST\*210).*
I evaluated the regex at http://www.regexplanet.com/advanced/java/index.html
matches() is NO for the first payload and YES for the second. I am looking for an updated regex that works for both payloads.
My purpose is to validate the payload, just as the String contains method would with the following approach:
payload.toString().contains('ST*214') || payload.toString().contains('ST*210').
I want to use regex instead of string.contains here.
"(?s).*(ST\\*214|ST\\*210).*"
In Java, you need to enable DOTALL mode so that . also matches line terminators. This can be done by including the (?s) modifier. Without it, your pattern could match only within the single line ST*214*900063730~ of the first payload, while matches() requires the whole input to match.
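A small sketch of the difference, assuming the payload segments are separated by real line feeds. The payload below is a shortened stand-in for the first input, and the class name DotallCheck is made up for illustration.

```java
// Hypothetical demo class; the payload is a shortened stand-in for payload 1.
class DotallCheck {
    static final String WITHOUT_DOTALL = ".*(ST\\*214|ST\\*210).*";
    static final String WITH_DOTALL = "(?s).*(ST\\*214|ST\\*210).*";

    public static void main(String[] args) {
        String payload = "GS*QM*EXDO*047336389~\nST*214*900063730~\nB10*326GENT15173**EXDO~";
        // Without (?s), '.' stops at '\n', so the whole input cannot match.
        System.out.println(payload.matches(WITHOUT_DOTALL)); // false
        // With (?s), '.' also matches line terminators.
        System.out.println(payload.matches(WITH_DOTALL));    // true
    }
}
```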
Use the following regexp:
".*(ST\*214|ST\*210).*"
I have tested your two strings with the following code:
public class RegTest {
    public static void main (String[] args) {
        String test1 = "ISA*00* 00 ZZEXDO *ZZ*047336389*150414*1108*U*00401*979863647*0*P*>~ GSQMEXDO*047336389*20150414*1108*979863647*X*004010~ ST*214*979863647~ B10*186143**EXDO~";
        String test2 = "ISA*00* 00 *02*HJBT *01*047336389*140103*1751*U*00401*000012003*0*P*>\\GSQMHJBT*047336389*20140103*1751*12003*X*004010\\ST*214*0001\\B10*117094*B065199*HJBT\\N1*SH*INTEVA PRODUCTS LLC-\\";
        if (test1.matches(".*(ST\\*214|ST\\*210).*")) {
            System.out.println("String1 matches");
        }
        if (test2.matches(".*(ST\\*214|ST\\*210).*")) {
            System.out.println("String2 matches");
        }
    }
}
Just a small fix: the regexp shown above lost two '\' characters. Use the regexp from the code instead.
I think you are trying to match the literal '*' character, so you should escape it with a backslash:
.*(ST\*214|ST\*210).*
or
.*ST\*(214|210).*
or
.*ST\*21(4|0).*
or
.*ST\*21[40].*
Are the line feeds part of your payload, or just formatting?
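If the goal is only to mimic String.contains, one option is to use Matcher.find() instead of matches(), since find() searches anywhere in the input and is not affected by line feeds around the segment. A minimal sketch follows; the class and method names (ContainsLikeCheck, hasStSegment) are hypothetical and the payloads are shortened stand-ins.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class ContainsLikeCheck {
    // No leading or trailing .* needed: find() searches anywhere in the input.
    private static final Pattern ST_SEGMENT = Pattern.compile("ST\\*21[04]");

    static boolean hasStSegment(CharSequence payload) {
        return ST_SEGMENT.matcher(payload).find();
    }

    public static void main(String[] args) {
        System.out.println(hasStSegment("GS*QM*EXDO~\nST*214*900063730~\nB10*326~")); // true
        System.out.println(hasStSegment("GS*QM*HJBT\\\nST*210*0001\\"));              // true
        System.out.println(hasStSegment("GS*QM*HJBT~\nST*997*0001~"));                // false
    }
}
```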
I have the following code in Java:
Pattern fieldsPattern = Pattern.compile("(\"([^\"]+)\")|"
        + "(" + this.field_tag + "([0-9a-zA-Z_]+))");
Matcher fieldsMatcher = fieldsPattern.matcher(field);
while (fieldsMatcher.find()) {
    //...
}
This code should capture expressions like "expression" and :expression (field_tag is just ":"). The problem occurs when I try to capture an expression like "10.1" or "10,1". It doesn't work.
But the expressions:
"10-1",
"10+1"
work as expected.
I also tried this regexp on regexpal.com, a site that uses the JavaScript implementation of RegExp. There, expressions like "10.1" and "10,1" work fine.
Is there any difference between Java and JavaScript in capturing dots? What am I doing wrong?
This works for me
Pattern fieldsPattern = Pattern.compile("(\"[^\"]+\")");
String field = " aa \"10\" \"10.1\" and \"10,1\"";
Matcher fieldsMatcher = fieldsPattern.matcher(field);
while (fieldsMatcher.find()) {
    System.out.println(fieldsMatcher.group());
}
prints
"10"
"10.1"
"10,1"
The second set of brackets in the regex appear to be redundant, but are harmless.
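For completeness, here is a sketch of the full two-alternative pattern from the question, assuming the field tag is ":" as stated there. The names FieldExtractor and extract are made up for illustration. Inside the negated class [^"], a dot or a comma is matched like any other non-quote character, so "10.1" and "10,1" should be captured the same way as "10-1".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class FieldExtractor {
    // Assumption: the field tag is ":"; Pattern.quote guards against regex metacharacters in the tag.
    private static final String FIELD_TAG = ":";
    private static final Pattern FIELDS = Pattern.compile(
            "(\"([^\"]+)\")|(" + Pattern.quote(FIELD_TAG) + "([0-9a-zA-Z_]+))");

    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = FIELDS.matcher(input);
        while (m.find()) {
            // Group 2 holds quoted content, group 4 holds a tagged name.
            result.add(m.group(2) != null ? m.group(2) : m.group(4));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extract("\"10.1\" :price \"10,1\" \"10-1\""));
        // [10.1, price, 10,1, 10-1]
    }
}
```

If the real field tag can contain characters that are special in a regex, quoting it as above avoids surprises; with a plain ":" the quoting makes no difference.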
We're using GWT 2.03 along with SmartGWT 2.2. I'm trying to match a regex like below in client side code.
Pattern pattern = Pattern.compile("\\\"(/\\d+){4}\\\"");
String testString1 = "[ \"/2/4/5/6/8\", \"/2/4/5/6\"]";
String testString2 = "[ ]";
Matcher matcher = pattern.matcher(testString1);
boolean result = false;
while (matcher.find()) {
    System.out.println(matcher.group());
}
It appears that Pattern and Matcher classes are NOT compiled to Javascript by the GWTC compiler and hence this application did NOT load. What is the equivalent GWT client code so that I can find regex matches within a String ?
How have you been able to match regexes within a String in client-side GWT ?
Thank you,
Just use the String class to do it!
Like this:
String text = "google.com";
if (text.matches("(\\w+\\.){1,2}[a-zA-Z]{2,4}"))
    System.out.println("match");
else
    System.out.println("no match");
It works fine like this, without having to import or upgrade or whatever.
Just change the text and regex to your liking.
Greetings, Glenn
Consider upgrading to GWT 2.1 and using RegExp.
Use GWT JSNI to call native Javascript regexp:
public native String jsRegExp(String str, String regex)
/*-{
    // Example: remove the first match of the pattern, using a native RegExp
    return str.replace(new RegExp(regex), "");
}-*/;
Perhaps you could download the RegExp files from GWT 2.1 and add them to your project?
http://code.google.com/p/google-web-toolkit/source/browse/trunk/user/src/com/google/gwt/regexp/
Download GWT 2.1 incl source, add that directory somewhere in your project, then add the reference to the "RegExp.gwt.xml" using the <inherits> tag from your GWT XML.
I'm not sure if that would work, but it'd be worth a shot. Maybe it references something else GWT 2.1 specific which you don't have, but I've just checked out the code a bit and I don't think it does.
GWT 2.1 now has a RegExp class that might solve your problem:
// Compile and use regular expression
RegExp regExp = RegExp.compile(patternStr);
MatchResult matcher = regExp.exec(inputStr);
// exec() returns null when there is no match, so a separate test() call is not needed
boolean matchFound = matcher != null;
if (matchFound) {
    Window.alert("Match found");
    // Get all groups for this match (getGroupCount() includes group 0)
    for (int i = 0; i < matcher.getGroupCount(); i++) {
        String groupStr = matcher.getGroup(i);
        System.out.println(groupStr);
    }
} else {
    Window.alert("Match not found");
}