Find words matching a specific REGEX within a sentence using JAVA


I am trying to generate a dynamic message that can be used for processing using Java and Regular Expressions. My incoming value can be just "$bdate$" or be embedded within a sentence like "Your Birthdate : $bdate$". I want to replace these $aaa$ values dynamically at run time and am not able to isolate the regex matched values within a sentence. Here is what I have so far....
package com.test;

import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TestRegex {
    public static String REGEX = "\\$((?:[a-zA-Z0-9_ ]*))\\$";
    public static String testString = "Summary : $summary$"
            + "Age : $age$"
            + "Location : $location$";

    public static void main(String[] args) {
        System.out.println("Matcher : " + Pattern.matches(REGEX, "$ABX_ 11$"));
        String[] splitStrings = testString.split("\\W+"); // also tried "\\b+"
        List<String> stringList = Arrays.asList(splitStrings);
        for (String test : stringList) {
            System.out.println("Split Word : " + test);
        }
    }
}
The output is below - it misses the preceding and succeeding $ symbols:
Matcher : true
Split Word : Summary
Split Word : summary
Split Word : Age
Split Word : age
Split Word : Location
Split Word : location
I know I am very close, but I cannot figure out the issue. Can anyone please help?

You can use the following:
String pattern = "\\w+|\\$\\w+\\$";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(testString);
while (m.find()) {
    System.out.println("Found value: " + m.group(0));
}

Just to extend @Karthik's answer and complete the thread: the code snippet below looks only for the $...$ tokens within the sentence and prints each one, which might make it easier to replace them dynamically at run time.
package com.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestRegex {
    public static String testString = "Summary : $summary$"
            + "Age : $age$"
            + "Location : $location$";

    public static void main(String[] args) {
        String pattern = "\\$\\w+\\$";
        Pattern r = Pattern.compile(pattern);
        Matcher m = r.matcher(testString);
        while (m.find()) {
            System.out.println("Found value: " + m.group(0));
        }
    }
}
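
Since the original goal was to replace the $name$ tokens at run time, here is a minimal sketch of one way to do that with Matcher.appendReplacement. The class name PlaceholderFill, the method fill, and the lookup map are assumptions made for illustration and are not part of the question; tokens without a value in the map are left unchanged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderFill {
    // Same pattern as above, with a capture group around the name between the $ signs.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$(\\w+)\\$");

    // Replaces each $name$ with values.get(name); unknown names stay as they are.
    public static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps '$' and '\' in the replacement text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("bdate", "1990-01-01"); // hypothetical sample value
        System.out.println(fill("Your Birthdate : $bdate$", values));
        System.out.println(fill("$bdate$ and $unknown$", values));
    }
}
```

Note that \w+ does not allow spaces inside a token, unlike the character class in the question; widen the class if such names are needed.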

Related

Regex to grab validate email address in complete XML string or normal string

I need to extract the email address from a large XML string or from a normal string.
I have been working on a regex for this, and so far the regex below works correctly for a normal string:
Regex : ^[\\w!#$%&'*+/=?`{|}~^-]+(?:\\.[\\w!#$%&'*+/=?`{|}~^-]+)*@(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{1,6}$
Text : paris@france.c
But when the same text is enclosed in an XML tag, it fails to match:
<email>paris@france.c</email>
I am trying to amend this regex so that it works in both scenarios.

You have put ^ at the beginning, which means "start of the string", and $ at the end, which means "end of the string". Now, look at your string:
<email>paris@france.c</email>
Does it start and end with an email address?
I have removed the anchors and also escaped the - in your regex. You can check the following auto-generated Java code with the updated regex.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Example {
    public static void main(String[] args) {
        // No ^ or $ anchors, so the address can be found anywhere in the input.
        final String regex = "[\\w!#$%&'*+/=?`\\{|\\}~^\\-]+(?:\\.[\\w!#$%&'*+/=?`\\{|\\}~^\\-]+)*@(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{1,6}";
        final String string = "paris@france.c\n"
                + "<email>paris@france.c</email>";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
Output:
Full match: paris@france.c
Full match: paris@france.c

Regex to split a string using java

I am trying to parse a string because I need to pass the result to the UI as a map.
Here is my input string :
"2020-02-01T00:00:00Z",1,
"2020-04-01T00:00:00Z",4,
"2020-05-01T00:00:00Z",2,
"2020-06-01T00:00:00Z",31,
"2020-07-01T00:00:00Z",60,
"2020-08-01T00:00:00Z",19,
"2020-09-01T00:00:00Z",10,
"2020-10-01T00:00:00Z",33,
"2020-11-01T00:00:00Z",280,
"2020-12-01T00:00:00Z",61,
"2021-01-01T00:00:00Z",122,
"2021-12-01T00:00:00Z",1
I need to split the string like this :
"2020-02-01T00:00:00Z",1 : split[0]
"2020-04-01T00:00:00Z",4 : split[1]
The issue is that I can't simply split on "," because the comma appears twice in each line.
I need a regex that gives "2020-02-01T00:00:00Z",1 as one token to process further.
I am new to regex. Can someone please provide a regular expression for this?

If you want the pairs of date-time and ID, you can use the regex, (\"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\",\d+)(?=,|$) to get the match results.
The pattern, (?=,|$) is the lookahead assertion for comma or end of the line.
Demo:
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        String s = "\"2020-02-01T00:00:00Z\",1,\n"
                + " \"2020-04-01T00:00:00Z\",4,\n"
                + " \"2020-05-01T00:00:00Z\",2,\n"
                + " \"2020-06-01T00:00:00Z\",31,\n"
                + " \"2020-07-01T00:00:00Z\",60,\n"
                + " \"2020-08-01T00:00:00Z\",19,\n"
                + " \"2020-09-01T00:00:00Z\",10,\n"
                + " \"2020-10-01T00:00:00Z\",33,\n"
                + " \"2020-11-01T00:00:00Z\",280,\n"
                + " \"2020-12-01T00:00:00Z\",61,\n"
                + " \"2021-01-01T00:00:00Z\",122,\n"
                + " \"2021-12-01T00:00:00Z\",1";
        // Matcher.results() requires Java 9 or later.
        List<String> list = Pattern.compile("(\\\"\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z\\\",\\d+)(?=,|$)")
                .matcher(s)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
        list.stream()
                .forEach(p -> System.out.println(p));
    }
}
Output:
"2020-02-01T00:00:00Z",1
"2020-04-01T00:00:00Z",4
"2020-05-01T00:00:00Z",2
"2020-06-01T00:00:00Z",31
"2020-07-01T00:00:00Z",60
"2020-08-01T00:00:00Z",19
"2020-09-01T00:00:00Z",10
"2020-10-01T00:00:00Z",33
"2020-11-01T00:00:00Z",280
"2020-12-01T00:00:00Z",61
"2021-01-01T00:00:00Z",122
"2021-12-01T00:00:00Z",1

Why can't you just split on , and ignore the last value?

Here's your pattern:
final Pattern pattern = Pattern.compile("(\\S+),(\\d+)");
final Matcher matcher = pattern.matcher("Input....");
Here's how to use it:
while (matcher.find()) {
    final String date = matcher.group(1);
    final String number = matcher.group(2);
}
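
Because the question asks for a split and for a map to pass to the UI, here is a rough sketch of another option: split only on the commas that are followed by a quote, then break each token at its remaining comma. The class name SplitPairs and the method parse are made up for this example, and it assumes the input has no trailing comma after the last number.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitPairs {
    // Splits on commas that are followed (after optional whitespace) by a quote,
    // so the comma inside each "date",count pair is left alone.
    public static Map<String, Integer> parse(String input) {
        Map<String, Integer> result = new LinkedHashMap<>(); // keeps input order
        for (String token : input.trim().split(",\\s*(?=\")")) {
            int comma = token.lastIndexOf(',');
            String date = token.substring(0, comma).replace("\"", "");
            int count = Integer.parseInt(token.substring(comma + 1).trim());
            result.put(date, count);
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "\"2020-02-01T00:00:00Z\",1,\n"
                + "\"2020-04-01T00:00:00Z\",4,\n"
                + "\"2021-12-01T00:00:00Z\",1";
        parse(s).forEach((date, count) -> System.out.println(date + " -> " + count));
    }
}
```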

How to avoid replacing specific words in a text in java

I have a method like this :
for(String abc:abcs){
xyz = abc.replaceAll(abc+"\\(", "_"+abc+"\\(");
}
How can I avoid replacing the occurrences that have specific prefixes in Java?
I tried this :
String data = "Today, abc.xyz is object oriented language";
String regex = "(?<!abc.)xyz";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(data);
System.out.println(matcher.find());

Does this work for you?
package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String prefix = "abc";
        String replaceWith = " text";
        String input = "This xyz is an example xyz to show how you can replace certain values of the xyz.\n"
                + "The xyz can contain any arbitrary xyz, for example abc.xyz.";
        Pattern pattern = Pattern.compile("[^" + prefix + "].xyz");
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            input = input.replace(m.group().substring(1), replaceWith);
        }
        System.out.println(input);
    }
}
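
The negative lookbehind that the question already tries can also do the replacement directly. The following is only a sketch under a few assumptions: the prefix is always followed by a literal dot, the word should match as a whole word, and the names SkipPrefixed and replaceUnprefixed are invented for the example. Pattern.quote makes the dot in "abc." literal, which the unescaped "." in the question's regex is not.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SkipPrefixed {
    // Replaces every whole-word occurrence of word unless it is directly
    // preceded by the literal text prefix + ".".
    public static String replaceUnprefixed(String input, String word, String prefix, String replacement) {
        Pattern p = Pattern.compile(
                "(?<!" + Pattern.quote(prefix + ".") + ")\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(input).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String input = "The xyz can contain any arbitrary xyz, for example abc.xyz.";
        System.out.println(replaceUnprefixed(input, "xyz", "abc", "text"));
    }
}
```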

Regular Expression : No match found

I just started learning about regular expressions. I am trying to get the attribute values within "mytag" tags and used the following code, but it throws a "No match found" exception.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class dummy {
    public static void testRegEx()
    {
        // String pattern_termName = "(?i)\\[.*\\]()\\[.*\\]";
        Pattern patternTag;
        Matcher matcherTag;
        String mypattern= "\\[mytag attr1="(.*?)" attr2="(.*?)" attr3="(.*?)"](.+?)\\[/mytag]";
        String term="[mytag attr1=\"20258044753052856\" attr2=\"A security \" attr3=\"cvvc\" ]TagTitle[/mytag]";
        patternTag = Pattern.compile(mypattern);
        matcherTag = patternTag.matcher(term);
        System.out.println(matcherTag.group(1)+"*********"+matcherTag.group(2)+"$$$$$$$$$$$$");
    }

    public static void main(String args[])
    {
        testRegEx();
    }
}
I have used \" in place of " but it still shows me same exception.

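You forgot to call find() on the matcher before calling group(), and you also need to use \" instead of " inside the pattern string. The find method scans the input sequence looking for the next subsequence that matches the pattern. The pattern below also allows optional whitespace (\\s*) before the closing ], because the input has a space there.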
Pattern patternTag;
Matcher matcherTag;
String mypattern = "\\[mytag attr1=\"(.*?)\" attr2=\"(.*?)\" attr3=\"(.*?)\"\\s*](.+?)\\[/mytag]";
String term = "[mytag attr1=\"20258044753052856\" attr2=\"A security \" attr3=\"cvvc\" ]TagTitle[/mytag]";
patternTag = Pattern.compile(mypattern);
matcherTag = patternTag.matcher(term);
while (matcherTag.find()) {
    System.out.println(matcherTag.group(1) + "*********" + matcherTag.group(2) + "$$$$$$$$$$$$");
}
Output:
20258044753052856*********A security $$$$$$$$$$$$
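
To illustrate why the original code failed, here is a small sketch showing that Matcher.group throws IllegalStateException when no match has been attempted yet, and that the same call works once find() has returned true. The class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNeedsMatch {
    // group() is only valid after find(), matches() or lookingAt() has returned true.
    public static String firstDigits(String input) {
        Matcher m = Pattern.compile("(\\d+)").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Returns true if calling group() on a fresh matcher throws.
    public static boolean groupWithoutFindThrows(String input) {
        Matcher m = Pattern.compile("(\\d+)").matcher(input);
        try {
            m.group(1); // no match has been attempted on this matcher
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(groupWithoutFindThrows("attr1=\"42\""));
        System.out.println(firstDigits("attr1=\"42\""));
    }
}
```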

Your pattern is missing \\s+ or \\s* in the places where the input contains whitespace.
Code:
final String pattern = "\\[\\s*mytag\\s+attr1\\s*=\\s*\"(.*?)\"\\s+attr2\\s*=\\s*\"(.*?)\"\\s+attr3\\s*=\\s*\"(.*?)\"\\s*\\](.+?)\\[/mytag\\]";
final String input = "[mytag attr1=\"20258044753052856\" attr2=\"A security \" attr3=\"cvvc\" ]TagTitle[/mytag]";
final Pattern p = Pattern.compile( pattern );
final Matcher m = p.matcher( input );
if (m.matches()) {
    System.out.println(
            m.group(1) + '\t' + m.group(2) + '\t' + m.group(3) + '\t' + m.group(4));
}
Output:
20258044753052856 A security cvvc TagTitle

Java regex get exact token value

I have a string like the one below and want to get the value cn=ADMIN, but I don't know an efficient way to do this with a regex.
group:192.168.133.205:387/cn=ADMIN,cn=groups,dc=mi,dc=com,dc=usa

well ... like this?
package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSample {
    public static void main(String[] args) {
        String str = "group:192.168.133.205:387/cn=ADMIN,cn=groups,dc=mi,dc=com,dc=usa";
        Pattern pattern = Pattern.compile("^.*/(.*)$");
        Matcher matcher = pattern.matcher(str);
        if (matcher.matches()) {
            String right = matcher.group(1);
            String[] parts = right.split(",");
            for (String part : parts) {
                System.err.println("part: " + part);
            }
        }
    }
}
Output is:
part: cn=ADMIN
part: cn=groups
part: dc=mi
part: dc=com
part: dc=usa

String bubba = "group:192.168.133.205:387/cn=ADMIN,cn=groups,dc=mi,dc=com,dc=usa";
String target = "cn=ADMIN";
for (String current : bubba.split("[/,]")) {
    if (current.equals(target)) {
        System.out.println("Got it");
    }
}

A regex pattern for this:
cn=([a-zA-Z0-9]+?),
The name will be in group 1 of the matcher. You can extend the character class if the value may contain spaces or other characters.
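
A minimal sketch of how that pattern could be applied, assuming only the first cn value is wanted; the class name FirstCn and the method firstCn are hypothetical. Because the pattern requires a trailing comma, a cn value at the very end of the string would not be matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstCn {
    private static final Pattern CN = Pattern.compile("cn=([a-zA-Z0-9]+?),");

    // Returns the value of the first cn=... component, or null if there is none.
    public static String firstCn(String dn) {
        Matcher m = CN.matcher(dn);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String str = "group:192.168.133.205:387/cn=ADMIN,cn=groups,dc=mi,dc=com,dc=usa";
        System.out.println(firstCn(str)); // prints ADMIN
    }
}
```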
