How do I extract substring from this line using RegEx - java

I have the following String line:
dn: cn=Customer Management,ou=groups,dc=digitalglobe,dc=com
I want to extract just this from the line above: Customer Management
I've tried the following regex, but it doesn't quite do what I want:
^dn: cn=(.*?),
Here is the java code snippet that tests the above expression:
Pattern pattern = Pattern.compile("^dn: cn=(.*?),");
String mydata = "dn: cn=Delivery Admin,ou=groups,dc=digitalglobe,dc=com";
Matcher matcher = pattern.matcher(mydata);
if(matcher.matches()) {
System.out.println(matcher.group(1));
} else {
System.out.println("No match found!");
}
The output is "No match found"... :(

Your regex is fine, but matches() attempts to match the regex against the entire string. Use find() instead, which looks for a match anywhere in the string.
if(matcher.find()) {
System.out.println(matcher.group(1));
} else {
System.out.println("No match found!");
}
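To make the difference concrete, here is a small sketch that runs the pattern and input from the question through matches(), lookingAt(), and find(). The class name MatchVersusFind is made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchVersusFind {
    // Same pattern and input as in the question.
    static final Pattern CN = Pattern.compile("^dn: cn=(.*?),");
    static final String LINE = "dn: cn=Delivery Admin,ou=groups,dc=digitalglobe,dc=com";

    public static void main(String[] args) {
        // matches(): the pattern must consume the whole input, so this is false here.
        System.out.println("matches:   " + CN.matcher(LINE).matches());

        // lookingAt(): the pattern must match at the start, but may stop before the end.
        System.out.println("lookingAt: " + CN.matcher(LINE).lookingAt());

        // find(): scans for the next match anywhere in the input.
        Matcher m = CN.matcher(LINE);
        if (m.find()) {
            System.out.println("group(1):  " + m.group(1));
        }
    }
}
```

Since the pattern already starts with ^, find() and lookingAt() behave the same for this input; find() is the more general choice when the text of interest can appear anywhere.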

Your problem is that matches() requires the pattern to match the whole input. Try adding a wildcard to the end of the pattern.
Pattern pattern = Pattern.compile("^dn: cn=(.*?),.*");
String mydata = "dn: cn=Delivery Admin,ou=groups,dc=digitalglobe,dc=com";
Matcher matcher = pattern.matcher(mydata);
if(matcher.matches()) {
System.out.println(matcher.group(1));
} else {
System.out.println("No match found!");
}

Please use the code below.
Note: you have to use find() instead of matches().
public static void main(String[] args)
{
Pattern pattern = Pattern.compile("^dn: cn=(.*?),");
String mydata = "dn: cn=Delivery Admin,ou=groups,dc=digitalglobe,dc=com";
Matcher matcher = pattern.matcher(mydata);
if(matcher.find()) {
System.out.println(matcher.group(1));
} else {
System.out.println("No match found!");
}
}
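If the extraction is needed in more than one place, it may be worth wrapping it in a helper. The sketch below is one possible shape, not the only one: the names CnExtractor and extractCn are hypothetical, and it assumes the cn value never contains an escaped comma.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CnExtractor {
    // [^,]+ stops at the first comma without relying on a lazy quantifier.
    private static final Pattern CN = Pattern.compile("^dn: cn=([^,]+),");

    // Returns the cn value, or an empty Optional when the line has another shape.
    static Optional<String> extractCn(String line) {
        Matcher m = CN.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractCn("dn: cn=Customer Management,ou=groups,dc=digitalglobe,dc=com"));
        System.out.println(extractCn("ou=groups,dc=digitalglobe,dc=com"));
    }
}
```

Compiling the pattern once in a static field avoids recompiling it on every call, and returning an Optional forces the caller to handle the "no match" case explicitly.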

Related

How to extract string using regex with space before or after

In the following examples, I want to extract "Mywebsite.xx". How do I do it?
Search Mywebsite.de ----> Mywebsite.de
Mywebsite.de durchsuchen ----> Mywebsite.de
Search Mywebsite.co.uk ----> Mywebsite.co.uk
Mywebsite.co.uk something ----> Mywebsite.co.uk
I tried this but it's not working:
String mydata2 = "Mywebsite.de durchsuchen";
Matcher matcher = Pattern.compile("Mywebsite(.*?)").matcher(mydata2);
if (matcher.find())
{
System.out.println(matcher.group(1));
}
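You can use Mywebsite\.([a-z]+(?:\.[a-z]+)?), which accepts one or two domain labels after the dot. In a Java string literal the backslashes must be doubled, and group(1) may only be read after a successful find():
public static void extractDomain(String domain) {
    // Backslashes are doubled because this is a Java string literal.
    Pattern domainPattern = Pattern.compile("Mywebsite\\.([a-z]+(?:\\.[a-z]+)?)");
    Matcher match = domainPattern.matcher(domain);
    // Calling group(1) without a successful find() throws IllegalStateException.
    if (match.find()) {
        System.out.println("Mywebsite." + match.group(1));
    }
}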
You can try this pattern match for the input array of possible strings. The first four strings will match.
String patternStr = "(\\s|^)mywebsite([.][a-z][a-z]){1,2}(\\s|$)";
Pattern pattern = Pattern.compile(patternStr, Pattern.CASE_INSENSITIVE);
String [] stringsToMatch = {
"Mywebsite.co.uk xyz",
"abc Mywebsite.co.uk",
"abc Mywebsite.co.uk xyz",
"Mywebsite.co.uk",
"Mywebsite.co.uk.us",
"Mywebsite"
};
for (String str : stringsToMatch) {
Matcher matcher = pattern.matcher(str);
System.out.println(str);
if (matcher.find()) {
System.out.println(" " + str.substring(matcher.start(), matcher.end()));
}
else {
System.out.println(" No match");
}
}
To find the domain name in a string, you can use a regex like:
(?:http[s]?:\/\/)?(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+
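This method collects every match in the input into a list. Note that the pattern is permissive: it matches any run of URL-safe characters, so plain words such as "Search" are returned as well.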
public static List<String> extractDomainNames(String input) {
List<String> domainNames = new ArrayList<>();
String domainNamePattern = "(?:http[s]?://)?(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+";
Pattern pattern = Pattern.compile(domainNamePattern);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
domainNames.add(matcher.group());
}
return domainNames;
}
You could try this regex: Mywebsite\.[^\s]+
String input = "Mywebsite.de durchsuchen";
Pattern regexPattern = Pattern.compile("Mywebsite\\.[^\\s]+"); // backslashes doubled for the Java string literal
Matcher regexMatcher = regexPattern.matcher(input);
while (regexMatcher.find()) {
System.out.println(regexMatcher.group());
}
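As a quick check, the sketch below runs a pattern of this kind over the four sample inputs from the question. The class and method names are made up for illustration, and \S+ is used as a shorter equivalent of [^\s]+. It assumes the site name is always followed by whitespace or the end of the string; trailing punctuation would be included in the match.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SiteNameExtractor {
    // "Mywebsite", a literal dot, then everything up to the next whitespace.
    private static final Pattern SITE = Pattern.compile("Mywebsite\\.\\S+");

    // Returns the first site name found in the text, if any.
    static Optional<String> extractSite(String text) {
        Matcher m = SITE.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Search Mywebsite.de",
            "Mywebsite.de durchsuchen",
            "Search Mywebsite.co.uk",
            "Mywebsite.co.uk something"
        };
        for (String s : samples) {
            System.out.println(s + " ----> " + extractSite(s).orElse("(no match)"));
        }
    }
}
```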

Parsing a String using Java regex

I have a Java string in the following format:
externalCustomerID: { \"custToken\": \"xyz\" }
I want to extract the value xyz from the string above.
Can anyone suggest a regex for that in Java?
Try this:
Pattern pattern = Pattern.compile("(\\w+: \\{ \"\\w+\": \")(\\w+)");
Matcher matcher = pattern.matcher("externalCustomerID: { \"custToken\": \"xyz\" }");
if (matcher.find()) {
System.out.println(matcher.group(2));
}
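A narrower alternative is to anchor on the key name and capture only the value, so a single group is enough. The sketch below assumes the backslashes shown in the question are Java string escapes (the actual text contains plain double quotes) and that the key is always custToken. TokenExtractor and extractToken are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenExtractor {
    // Matches "custToken": "<value>" and captures the text between the value's quotes.
    private static final Pattern TOKEN =
            Pattern.compile("\"custToken\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the token value, or an empty Optional if the key is missing.
    static Optional<String> extractToken(String text) {
        Matcher m = TOKEN.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String input = "externalCustomerID: { \"custToken\": \"xyz\" }";
        System.out.println(extractToken(input).orElse("(no token)"));
    }
}
```

If the part in braces is real JSON and can grow more complex, a JSON parser is likely to be more robust than a regex.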

Using regex in java to get a word from a string [duplicate]

This question already has answers here:
Difference between matches() and find() in Java Regex
I need to find the word "best" in a string using regex but it's throwing a "no match found" error. What am I doing wrong?
Pattern pattern = Pattern.compile("(best)");
String theString = "the best of";
Matcher matcher = pattern.matcher(theString);
matcher.matches();
String whatYouNeed = matcher.group(1);
Log.d(String.valueOf(LOG), whatYouNeed);
You want to find "best" inside "the best of", so use find() rather than matches(), which only succeeds when the entire string matches the pattern. Sample code:
Pattern pattern = Pattern.compile("best");
String theString = "the best of";
Matcher matcher = pattern.matcher(theString);
if(matcher.find()) {
System.out.println("found");
}else {
System.out.println("not found");
}
Use find(), not matches()!
public static void main(String[] args){
Pattern pattern = Pattern.compile("(best)");
String theString = "the best of";
Matcher matcher = pattern.matcher(theString);
if(matcher.find())
System.out.println("Hi!");
}
What I think you want is this.
String theString = "the best of";
String regex = "(best)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(theString);
while (m.find()) {
String result = m.group(1);
System.out.println("found: " + result);
}
outputs:
found: best
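The "no match found" error in the question comes from calling group(1) after matches() has returned false: the matcher has no successful match to read from, so it throws an IllegalStateException. The sketch below reproduces that and shows a guarded version; GroupWithoutMatch and findWord are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupWithoutMatch {
    // Returns the captured word, or null when the pattern is not found.
    static String findWord(String text) {
        Matcher m = Pattern.compile("(best)").matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("(best)").matcher("the best of");
        boolean whole = m.matches(); // false: "the best of" is not exactly "best"
        try {
            m.group(1); // no successful match to read from
        } catch (IllegalStateException e) {
            System.out.println("matches() = " + whole + ", group(1) threw: " + e.getMessage());
        }
        System.out.println("find() result: " + findWord("the best of"));
    }
}
```

Checking the boolean returned by find() or matches() before calling group() avoids the exception entirely.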

Matcher. How to get index of found group?

I have a sentence and I want to count the words, semiPunctuation and endPunctuation in it.
Calling m.group() returns the matched string, but how do I know which group matched?
I can test each group for null, but that does not feel clean.
String input = "Some text! Some example text.";
int wordCount=0;
int semiPunctuation=0;
int endPunctuation=0;
Pattern pattern = Pattern.compile( "([\\w]+) | ([,;:\\-\"\']) | ([!\\?\\.]+)" );
Matcher m = pattern.matcher(input);
while (m.find()) {
// need more correct method
if(m.group(1)!=null) wordCount++;
if(m.group(2)!=null) semiPunctuation++;
if(m.group(3)!=null) endPunctuation++;
}
You could use named groups to capture the expressions
Pattern pattern = Pattern.compile( "(?<words>\\w+)|(?<semi>[,;:\\-\"'])|(?<end>[!?.])" );
Matcher m = pattern.matcher(input);
while (m.find()) {
if (m.group("words") != null) {
wordCount++;
}
...
}
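For completeness, here is a runnable sketch of the named-group approach with all three counters filled in. The class name TokenCounter and the int[] return shape are choices made for this example, not part of the answer above.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenCounter {
    private static final Pattern TOKENS = Pattern.compile(
            "(?<words>\\w+)|(?<semi>[,;:\\-\"'])|(?<end>[!?.])");

    // Returns {wordCount, semiPunctuation, endPunctuation}.
    static int[] count(String input) {
        int[] counts = new int[3];
        Matcher m = TOKENS.matcher(input);
        while (m.find()) {
            // Exactly one alternative took part in each match; the others are null.
            if (m.group("words") != null) {
                counts[0]++;
            } else if (m.group("semi") != null) {
                counts[1]++;
            } else {
                counts[2]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(count("Some text! Some example text.")));
    }
}
```

Because the alternatives are mutually exclusive, an else-if chain is enough. The null check is still how you tell which group matched; named groups mainly make that check easier to read than numeric indexes.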

Extract substring from string - regex

I have a String say:
<encoded:2,Message request>
Now I want to extract 2 and Message request from the line above.
private final String pString = "<encoded:[0-9]+,.*>";
private final Pattern pattern = Pattern.compile(pString);
private void parseAndDisplay(String line) {
Matcher matcher = pattern.matcher(line);
if (matcher.matches()) {
while(matcher.find()) {
String s = matcher.group();
System.out.println("=====>"+s);
}
}
}
This doesn't retrieve it. What is wrong with it?
You have to define groups in your regex:
"<encoded:([0-9]+),(.*?)>"
or
"<encoded:(\\d+),([^>]*)"
Try:
String s = "<encoded:2,Message request>";
String s1 = s.replaceAll("<encoded:(\\d+?),.*", "$1");
String s2 = s.replaceAll("<encoded:\\d+?,(.*)>", "$1");
Try:
"<encoded:([0-9]+),([^>]*)"
Also, as suggested in other comments, use group(1) and group(2)
Try this out :
Matcher matcher = Pattern.compile("<encoded:(\\d+)\\,([\\w\\s]+)",Pattern.CASE_INSENSITIVE).matcher("<encoded:2,Message request>");
while (matcher.find()) {
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));
}
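Putting the suggestions together, a sketch of a parser for this line format might look like the following. EncodedLineParser and parse are hypothetical names, and the String[] return value is just one convenient shape. Note also that the question's code calls find() after a successful matches(); as far as I can tell, find() then continues from the end of that match and finds nothing, so a single matches() call is used here.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EncodedLineParser {
    // Group 1: the numeric code. Group 2: everything up to the closing '>'.
    private static final Pattern ENCODED = Pattern.compile("<encoded:(\\d+),([^>]*)>");

    // Returns {code, message}, or an empty Optional if the line does not match.
    static Optional<String[]> parse(String line) {
        Matcher m = ENCODED.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2) });
    }

    public static void main(String[] args) {
        parse("<encoded:2,Message request>").ifPresent(parts ->
                System.out.println("code=" + parts[0] + ", message=" + parts[1]));
    }
}
```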
