Translate XML Schema pattern to Java regular expression - java

How can I translate the XML Schema pattern "[0-9]+-([0-9]|K)" into a Java regular expression?

Here is the pattern with a snippet on how to use it.
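\\ is there to escape the \ in a Java string literal. \d represents [0-9]. The - does not need to be escaped outside a character class. Note that | is a literal character inside [...], so the alternation ([0-9]|K) becomes the class [\\dK].
Pattern p = Pattern.compile("\\d+-[\\dK]"); // The string is the pattern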
Matcher m = p.matcher(whatYouWantToMatch);
boolean b = m.matches();

This particular pattern is already valid Java regex syntax, so it can be used unchanged:
String s = "test cases here";
s.matches("[0-9]+-([0-9]|K)"); // works OK.
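Both answers rely on the fact that an XML Schema pattern has to match the whole value, which is also what matches() checks in Java. A minimal sketch of that, where the class name, method name, and sample values are made up for illustration:

```java
import java.util.regex.Pattern;

final class XsdPatternDemo {
    // The XML Schema pattern, copied unchanged into a Java string literal.
    static final Pattern PATTERN = Pattern.compile("[0-9]+-([0-9]|K)");

    // matches() requires the entire input to match, like an XSD pattern facet.
    static boolean isValid(String value) {
        return PATTERN.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("12345678-K"));  // true
        System.out.println(isValid("12345678-5"));  // true
        System.out.println(isValid("12345678-|"));  // false
        System.out.println(isValid("x12345678-5")); // false
    }
}
```

Compiling the pattern once and reusing it avoids recompiling on every call, which is what String.matches does internally.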

Related

Substring between lines using Regular Expression Java

Hi, I have the following string:
abc test ...
interface
somedata ...
xxx ...
!
sdfff as ##
example
yyy sdd ## .
!
I want to find the content between a line containing the word "interface" or "example" and a line containing only "!".
The required output would be something like this:
String[] output= {"somedata ...\nxxx ...\n","yyy sdd ## .\n"} ;
I can do this manually using substring and iteration, but I want to achieve it with a regular expression.
Is it possible?
This is what I have tried:
String sample="abc\ninterface\nsomedata\nxxx\n!\nsdfff\ninterface\nyyy\n!\n";
Pattern pattern = Pattern.compile("(?m)\ninterface(.*?)\n!\n");
Matcher m =pattern.matcher(sample);
while (m.find()) {
System.out.println(m.group());
}
Am I right? Please suggest the right way of doing it.
Edit:
A small change: I want to find the content between a line "interface" or "example" and a line "!".
Can we achieve this with a regex too?
You could use the (?s) DOTALL modifier, so that . also matches line breaks.
String sample="abc\ninterface\nsomedata\nxxx\n!\nsdfff\ninterface\nyyy\n!\n";
Pattern pattern = Pattern.compile("(?s)(?<=\\ninterface\\n).*?(?=\\n!\\n)");
Matcher m =pattern.matcher(sample);
while (m.find()) {
System.out.println(m.group());
}
Output:
somedata
xxx
yyy
Note that the input in your example is different.
(?<=\\ninterface\\n) Asserts that the match must be preceded by the characters which are matched by the pattern present inside the positive lookbehind.
(?=\\n!\\n) Asserts that the match must be followed by the characters which are matched by the pattern present inside the positive lookahead.
Update:
Pattern pattern = Pattern.compile("(?s)(?<=\\n(?:example|interface)\\n).*?(?=\\n!\\n)");
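As an alternative to the lookarounds, the block can be captured in a group, which also keeps the trailing line break the question asked for. This is only a sketch: it assumes the marker lines contain nothing but "interface", "example", or "!", and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BlockExtractor {
    // (?m): ^ and $ match at line boundaries; (?s): . also matches line breaks.
    private static final Pattern BLOCK =
            Pattern.compile("(?ms)^(?:interface|example)\\n(.*?)^!$");

    // Returns the text between each marker line and the next "!" line.
    static List<String> extract(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            blocks.add(m.group(1));
        }
        return blocks;
    }

    public static void main(String[] args) {
        String sample = "abc test ...\ninterface\nsomedata ...\nxxx ...\n!\n"
                + "sdfff as ##\nexample\nyyy sdd ## .\n!\n";
        for (String block : extract(sample)) {
            System.out.print(block);
            System.out.println("--");
        }
    }
}
```

Because the group is reluctant (.*?), each block stops at the first "!" line after its marker rather than the last one in the input.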

Regex match for string literal including escape sequence

This works just fine for normal string literal ("hello").
"([^"]*)"
But I also want my regex to match a literal such as "hell\"o".
This is what I have been able to come up with, but it doesn't work.
("(?=(\\")*)[^"]*")
here I have tried to look ahead for <\">.
How about
Pattern.compile("\"((\\\\\"|[^\"])*)\"")
In the Java string literal:
\" matches a literal "
\\\\ matches a literal \
\\\\\" matches the two characters \"
or
Pattern.compile("\"((?:\\\\\"|[^\"])*)\"")//
if you don't want to add more capturing groups.
This regex accepts \" or any character other than " between the quotation marks.
Demo:
String input = "ab \"cd\" ef \"gh \\\"ij\"";
Matcher m = Pattern.compile("\"((?:\\\\\"|[^\"])*)\"").matcher(input);
while (m.find())
System.out.println(m.group(1));
Output:
cd
gh \"ij
Use this method:
"((?:[^"\\\\]*|\\\\.)*)"
Here [^"\\\\]* matches neither " nor \, and the other alternative, \\\\., matches a backslash followed by any character, so any escape sequence is accepted.
Try with this one:
Pattern pattern = Pattern.compile("\"((?:\\\\\"|[^\"])*)\"");
\\\\\" to match \" or,
[^\"] to match anything but "
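The "backslash followed by any character" idea can be sketched as below. This is one possible variant rather than any answerer's exact code: it drops the inner * so each step consumes exactly one character or one escape pair, and the class name and sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StringLiteralFinder {
    // Regex: "((?:[^"\\]|\\.)*)"
    // Inside the quotes: either a char that is not " or \, or a \ plus any char.
    private static final Pattern LITERAL =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    // Returns the contents of every double-quoted literal, escapes left as-is.
    static List<String> find(String input) {
        List<String> contents = new ArrayList<>();
        Matcher m = LITERAL.matcher(input);
        while (m.find()) {
            contents.add(m.group(1));
        }
        return contents;
    }

    public static void main(String[] args) {
        // Actual text: ab "cd" ef "gh \"ij" kl "back\\"
        String input = "ab \"cd\" ef \"gh \\\"ij\" kl \"back\\\\\"";
        for (String s : find(input)) {
            System.out.println(s);
        }
    }
}
```

Treating every backslash as the start of a two-character escape means a literal ending in an escaped backslash (\\) is closed by the quote that follows it, instead of that quote being read as \".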

Parsing Email Address to fetch domain and comparing it

I have a requirement where a request containing a field comes in to my REST web service.
In my webservice, I have to check for this field and if the validation for this passes, then I send the request to a third party service.
Validation Required:
The message_from field contains an email address as a string. I have to check whether the domain name (everything after #) is roin.com.
For ex: abc#roin.com passes, john_mandoza#roin.com passes, john_manodza#google.com fails...
Can I use pattern matchers or anything else to do this validation?
I have used string parsing to capture everything after (#) and then did an equalsIgnoreCase to compare it with roin.com
This string parsing approach works, but is there any better way to do this?
You can try this pattern: \\S+?#roin\\.com
\\S+ matches one or more non-whitespace characters.
The ? after \\S+ makes the match reluctant: it matches the fewest characters needed to satisfy the pattern.
\\. matches a literal .
Since . is a special character in regex, it has to be escaped to match it literally.
So, here's the code:
String str = "abc#roin.com";
Pattern pattern = Pattern.compile("\\S+?#roin\\.com");
Matcher matcher = pattern.matcher(str);
if (matcher.matches()) {
System.out.println("Matches"); // Prints this for this email
}
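The question compares the domain with equalsIgnoreCase, while the pattern above is case-sensitive. A small sketch of both approaches with case-insensitive matching follows; the class and method names are made up, and it keeps # as the separator only because the addresses in this thread are written that way.

```java
import java.util.regex.Pattern;

final class DomainCheck {
    // Same pattern as above, but ignoring case (an added assumption).
    private static final Pattern ROIN =
            Pattern.compile("\\S+#roin\\.com", Pattern.CASE_INSENSITIVE);

    // Regex variant: the whole address must match.
    static boolean matchesRoin(String address) {
        return ROIN.matcher(address).matches();
    }

    // String variant: compare everything after the last separator.
    static boolean hasDomain(String address, String expectedDomain) {
        int sep = address.lastIndexOf('#');
        if (sep <= 0 || sep == address.length() - 1) {
            return false; // no local part or no domain part
        }
        return address.substring(sep + 1).equalsIgnoreCase(expectedDomain);
    }

    public static void main(String[] args) {
        System.out.println(matchesRoin("abc#roin.com"));                       // true
        System.out.println(hasDomain("john_mandoza#ROIN.com", "roin.com"));    // true
        System.out.println(hasDomain("john_manodza#google.com", "roin.com")); // false
    }
}
```

For a single fixed domain the string variant is arguably easier to read; the regex becomes more useful once the local part also has to be validated.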

How to extract CSS color using regex?

I have a CSS style that I need to extract the color from using a Java regex.
eg
color:#000;
I need to extract the thing after : to ;. Can anyone give an example?
I'm not sure how to apply it to Java, but one regex to do this would be:
^color:\s*(#[0-9a-f]+);?$
To just extract from : up to ; do something like:
Pattern pattern = Pattern.compile("[^:]*:(.*);");
Matcher matcher = pattern.matcher(text);
if (matcher.matches()) {
String value = matcher.group(1);
System.out.println("'" + value+ "'"); // do something with value
}
[^:]* - any number of chars that are not ':'
: - one ':'
(...) - a capturing group
.* - any number of any character
; - the terminating ';'
Use color:(.*); to accept only values for 'color'.
/(?<=:).+(?=;)/
That will do it for you
Not sure how you implement regex in Java though.
www.regexr.com can help you test out your regex in real time.
The expression
":(#.+);"
should do it
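Putting the suggestions together in Java, one possible sketch is shown below. The lookbehind that skips properties such as background-color, the 3 to 8 hex digit range, and the class and method names are all assumptions added for illustration, not part of the answers above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CssColor {
    // (?<![\w-]) keeps "background-color" from matching as "color".
    // The group captures # followed by 3 to 8 hex digits (a loose check).
    private static final Pattern COLOR =
            Pattern.compile("(?<![\\w-])color\\s*:\\s*(#[0-9a-fA-F]{3,8})\\s*;?");

    // Returns the first hex color assigned to the color property, if any.
    static Optional<String> extract(String css) {
        Matcher m = COLOR.matcher(css);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("color:#000;"));                          // Optional[#000]
        System.out.println(extract("background-color:#fff; color: #1A2b3C;")); // Optional[#1A2b3C]
        System.out.println(extract("color:red;"));                           // Optional.empty
    }
}
```

find() is used instead of matches() so the declaration can appear anywhere inside a longer style string.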

How to split this string using Java Regular Expressions

I want to split the string
String fields = "name[Employee Name], employeeno[Employee No], dob[Date of Birth], joindate[Date of Joining]";
to
name
employeeno
dob
joindate
I wrote the following Java code for this, but it prints only name; the other matches are not printed.
String fields = "name[Employee Name], employeeno[Employee No], dob[Date of Birth], joindate[Date of Joining]";
Pattern pattern = Pattern.compile("\\[.+\\]+?,?\\s*" );
String[] split = pattern.split(fields);
for (String string : split) {
System.out.println(string);
}
What am I doing wrong here?
Thank you
This part:
\\[.+\\]
matches the first [, the .+ then gobbles up the entire string (if no line breaks are in the string) and then the \\] will match the last ].
You need to make the .+ reluctant by placing a ? after it:
Pattern pattern = Pattern.compile("\\[.+?\\]+?,?\\s*");
And shouldn't \\]+? just be \\] ?
The error is that you are matching greedily. You can change it to a non-greedy match:
Pattern.compile("\\[.+?\\],?\\s*")
The ? right after .+ is what makes the quantifier reluctant.
There's an online regular expression tester at http://gskinner.com/RegExr/?2sa45 that will help you a lot when you try to understand regular expressions and how they are applied to a given input.
Would it be better to use a negated character class to match the contents of the square brackets? \[(\w+\s)+\w+[^\]]\]
You could also look at an example of how a negated character class works internally (without backtracking).
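A simpler form of the negated character class idea is \[[^\]]*\], which matches an opening bracket, any run of characters that are not ], and the closing bracket. The sketch below uses it as the split delimiter; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class FieldNames {
    // Delimiter: a [...] block, an optional comma, and any following whitespace.
    private static final Pattern DELIMITER =
            Pattern.compile("\\[[^\\]]*\\],?\\s*");

    // Splits on the bracketed labels, leaving only the field names.
    static List<String> parse(String fields) {
        return Arrays.asList(DELIMITER.split(fields));
    }

    public static void main(String[] args) {
        String fields = "name[Employee Name], employeeno[Employee No], "
                + "dob[Date of Birth], joindate[Date of Joining]";
        for (String name : parse(fields)) {
            System.out.println(name);
        }
    }
}
```

Since [^\]]* can never run past a closing bracket, the greedy quantifier is safe here and no reluctant ? is needed.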
