Regular expression in Java? - java

I have the string below:
String line = "put retur#ERns between #errf #fgrf#re paragraphs #fg^%tg2#785Ty*";
How can I get the values below with a regex:
#ERns
#errf
#fgrf
#re
#fg^%tg2
#785Ty*
My code:
String pattern = "^#\\S+";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
while (m.find()) {
Log.i("log", m.group());
}

You can use this regex instead:
#[^#\s]*
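The negated character class [^#\s] matches any character that is neither # nor whitespace. The ^ anchor in your pattern only matches at the start of the input, so it never finds tokens in the middle of the line, and \S+ would also run through an adjacent # as in #fgrf#re.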
In Java use:
final String pattern = "#[^#\\s]*";
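As a minimal sketch of how that pattern can be used end to end, the class below collects every match into a list. The class name HashTokenDemo and the helper extractTokens are illustrative names, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashTokenDemo {
    // '#' followed by any run of characters that are neither '#' nor whitespace.
    private static final Pattern TOKEN = Pattern.compile("#[^#\\s]*");

    // Hypothetical helper: returns every token in order of appearance.
    static List<String> extractTokens(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "put retur#ERns between #errf #fgrf#re paragraphs #fg^%tg2#785Ty*";
        // Prints the six tokens listed in the question, one per line.
        for (String token : extractTokens(line)) {
            System.out.println(token);
        }
    }
}
```

Because the quantifier is *, a lone # also counts as a match; use + instead if empty tokens should be skipped.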

Related

Regex Pattern to Split Word in A string Using An Identifier

I would like to split the following string by commas using a DOTALL regex pattern that will accept letters, numbers, whitespace and special characters such as underscores and asterisks, i.e. #input("Test_1, Test_TWO , TEST_THIRTY_3*"), so the output would look like:
"Test_1",
"Test_TWO",
"TEST_THIRTY_3*"
public static void main(String args[])
{
String line = "#input(\"Test_1,Test_TWO , TEST_THIRTY_3*\"\\)";
String pattern = "#input(\"(.*?)\".*";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(line);
while (m.find()) {
System.out.println("Found word: " + m.group(1) );
}
First, you have to escape the ( as \(, so your regex should look like this: #input\(\"(.*?)\".*. Second, you can use \s*,\s* to split the result, like this:
String line = "#input(\"Test_1,Test_TWO , TEST_THIRTY_3*\"\\)";
String pattern = "#input\\(\"(.*?)\".*";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(line);
while (m.find()) {
System.out.println(Arrays.toString(m.group(1).split("\\s*,\\s*")));
//----------------------------------------------------^^^^^^^^
}
outputs
[Test_1, Test_TWO, TEST_THIRTY_3*]
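If you do not have to stick to regex, you can use plain String methods instead. Note that this splits the whole line, so the first and last pieces still carry the surrounding #input(" and closing characters, and the pieces are not trimmed.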
List<String> output = Arrays.asList(line.split(","));
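To tie the two steps together (capture the quoted argument list, then split it on commas), here is a small self-contained sketch. The class name InputArgsDemo and the helper parseArgs are hypothetical, and the sample input is the one from the question's prose rather than the escaped variant in the code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InputArgsDemo {
    // "\\(" is a literal parenthesis; group 1 is the text between the quotes.
    private static final Pattern INPUT =
            Pattern.compile("#input\\(\"(.*?)\"", Pattern.DOTALL);

    // Hypothetical helper: returns the comma-separated items, trimmed.
    static List<String> parseArgs(String line) {
        Matcher m = INPUT.matcher(line);
        if (!m.find()) {
            return Collections.emptyList();
        }
        // trim() drops outer whitespace; the split pattern absorbs spaces around commas.
        return Arrays.asList(m.group(1).trim().split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        String line = "#input(\"Test_1, Test_TWO , TEST_THIRTY_3*\")";
        System.out.println(parseArgs(line)); // [Test_1, Test_TWO, TEST_THIRTY_3*]
    }
}
```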

How to find match for exact word using pattern matcher in java

I have shared my sample code here. I am trying to find the word "engine" in different strings, and I used a word boundary to match it.
It also matches the word when it is prefixed with #, as in #engine.
It should only match the exact word.
private void checkMatch() {
String source1 = "search engines has ";
String source2 = "search engine exact word";
String source3 = "enginecheck";
String source4 = "has hashtag #engine";
String key = "engine";
System.out.println(isContain(source1, key));
System.out.println(isContain(source2, key));
System.out.println(isContain(source3, key));
System.out.println(isContain(source4, key));
}
private boolean isContain(String source, String subItem) {
String pattern = "\\b" + subItem + "\\b";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(source);
return m.find();
}
Expected output:
false
true
false
false
Actual output:
false
true
false
true
For this case, you have to use regex OR (alternation) instead of a word boundary. \\b matches between a word character and a non-word character (or vice versa), so your regex finds a match in #engine, since # is a non-word character.
private boolean isContain(String source, String subItem) {
String pattern = "(?m)(^|\\s)" + subItem + "(\\s|$)";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(source);
return m.find();
}
or
String pattern = "(?<!\\S)" + subItem + "(?!\\S)";
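A runnable sketch of the lookaround variant, checked against the four strings from the question. The class name ExactWordDemo and the method containsWord are illustrative; wrapping the search text in Pattern.quote is an addition not present in the answer, made so that a key containing regex metacharacters is still matched literally.

```java
import java.util.regex.Pattern;

public class ExactWordDemo {
    // True when 'word' occurs delimited by whitespace or the string edges.
    static boolean containsWord(String source, String word) {
        // (?<!\S): not preceded by a non-whitespace char; (?!\S): not followed by one.
        Pattern p = Pattern.compile("(?<!\\S)" + Pattern.quote(word) + "(?!\\S)");
        return p.matcher(source).find();
    }

    public static void main(String[] args) {
        String key = "engine";
        System.out.println(containsWord("search engines has ", key));       // false
        System.out.println(containsWord("search engine exact word", key));  // true
        System.out.println(containsWord("enginecheck", key));               // false
        System.out.println(containsWord("has hashtag #engine", key));       // false
    }
}
```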
Change your pattern as below.
String pattern = "\\s" + subItem + "\\b";
If you are looking for a literal text enclosed with spaces or start/end of the string, you can split the string with a mere whitespace pattern like \s+ and check if any of the chunks equals the search text.
Java demo:
String s = "Can't start the #engine here, but this engine works";
String searchText = "engine";
boolean found = Arrays.stream(s.split("\\s+"))
.anyMatch(word -> word.equals(searchText));
System.out.println(found); // => true
Change the regexp to
String pattern = "\\s"+subItem + "\\s";
Here \s is the predefined whitespace character class: [ \t\n\x0B\f\r]
For more information, see the java.util.regex.Pattern Javadoc.
Also if you want to support strings like these:
"has hashtag engine"
"engine"
you can improve it by adding the start and end anchors (^ and $),
using this pattern:
String pattern = "(^|\\s)"+subItem + "(\\s|$)";

regex for first character before space

I am trying to extract "d320" from the string below using a regex in Java, with the following code:
n-us; micromax d320 build/kot49h)
String m = "n-us; micromax d320 build/kot49h) ";
String pattern = "micromax (.*)(\\d\\D)(.*) ";
Pattern r = Pattern.compile(pattern);
Matcher m1 = r.matcher(m);
if (m1.find()) {
System.out.println(m1.group(1));
}
But it gives me "d320 build/kot4" as the output, and I want only d320.
Try to use micromax\\s(.*?)\\s like this:
String m = "n-us; micromax d320 build/kot49h) ";
String pattern = "micromax\\s(.*?)\\s";
Pattern r = Pattern.compile(pattern);
Matcher m1 = r.matcher(m);
if (m1.find()) {
System.out.println(m1.group(1));
}
Output:
d320
It's not clear whether you want the word after "micromax", or the word that starts with a letter and has all digits afterward, so here are both solutions:
To extract the word following "micromax":
String code = m.replaceAll(".*micromax\\s+(\\w+)?.*", "$1");
To extract the word that looks like "x9999":
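String code = m.replaceAll(".*?\\b([a-z]\\d+)\\b.*", "$1");
Note that \b must be written as \\b in a Java string literal (a bare \b is a backspace character), and the group here is not optional, because an optional group after a reluctant .*? would match empty at the very start. If the pattern does not match at all, replaceAll returns the input string unchanged.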
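If you would rather get an explicit "no match" signal than inspect the result of replaceAll, a Matcher-based version is one option. This is only a sketch: the class name ModelExtractDemo, the method modelAfterBrand and the use of Optional are choices made for the example, and it assumes the wanted value is the first whitespace-delimited token after "micromax".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ModelExtractDemo {
    // Captures the run of non-whitespace characters that follows "micromax".
    private static final Pattern AFTER_BRAND = Pattern.compile("micromax\\s+(\\S+)");

    // Hypothetical helper: empty Optional when the brand is not present.
    static Optional<String> modelAfterBrand(String userAgent) {
        Matcher m = AFTER_BRAND.matcher(userAgent);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String ua = "n-us; micromax d320 build/kot49h) ";
        System.out.println(modelAfterBrand(ua).orElse("(none)"));               // d320
        System.out.println(modelAfterBrand("n-us; other x1").orElse("(none)")); // (none)
    }
}
```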

Find and Replace a pattern of string in java

I use regex and string replaceFirst to replace the patterns as below.
String xml = "<param>otpcode=1234567</param><param>password=abc123</param>";
if(xml.contains("otpcode")){
Pattern regex = Pattern.compile("<param>otpcode=(.*)</param>");
Matcher matcher = regex.matcher(xml);
if (matcher.find()) {
xml = xml.replaceFirst("<param>otpcode=" + matcher.group(1)+ "</param>","<param>otpcode=xxxx</param>");
}
}
System.out.println(xml);
if (xml.contains("password")) {
Pattern regex = Pattern.compile("<param>password=(.*)</param>");
Matcher matcher = regex.matcher(xml);
if (matcher.find()) {
xml = xml.replaceFirst("<param>password=" + matcher.group(1)+ "</param>","<param>password=xxxx</param>");
}
}
System.out.println(xml);
Desired output:
<param>otpcode=xxxx</param><param>password=abc123</param>
<param>otpcode=xxxx</param><param>password=xxxx</param>
Actual output (the entire string is replaced in one shot in the first if block):
<param>otpcode=xxxx</param>
<param>otpcode=xxxx</param>
You need a non-greedy (reluctant) quantifier:
<param>otpcode=(.*?)</param>
<param>password=(.*?)</param>
The greedy (.*) runs to the last </param> in the string, whereas (.*?) stops at the first one.
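Here is one way the reluctant pattern could be applied to the sample XML. The class name MaskParamsDemo and the helper mask are made up for this sketch, and replacing directly through the Matcher (instead of find followed by String.replaceFirst) is a simplification that is not in the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MaskParamsDemo {
    // Hypothetical helper: masks the value of the first <param>name=...</param>.
    static String mask(String xml, String name) {
        // .*? is reluctant, so it stops at the first </param> after the name.
        Pattern p = Pattern.compile("<param>" + Pattern.quote(name) + "=.*?</param>");
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        String replacement = Matcher.quoteReplacement("<param>" + name + "=xxxx</param>");
        return p.matcher(xml).replaceFirst(replacement);
    }

    public static void main(String[] args) {
        String xml = "<param>otpcode=1234567</param><param>password=abc123</param>";
        xml = mask(xml, "otpcode");
        System.out.println(xml); // <param>otpcode=xxxx</param><param>password=abc123</param>
        xml = mask(xml, "password");
        System.out.println(xml); // <param>otpcode=xxxx</param><param>password=xxxx</param>
    }
}
```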

regex extract string between two characters

I would like to extract the strings between the following characters in the given string using regex in Java:
/*
1) Between \" and \" ===> 12222222222
2) Between :+ and # ===> 12222222222
3) Between # and > ===> 192.168.140.1
*/
String remoteUriStr = "\"+12222222222\" <sip:+12222222222#192.168.140.1>";
String regex1 = "\"(.+?)\"";
String regex2 = ":+(.+?)#";
String regex3 = "#(.+?)>";
Pattern p = Pattern.compile(regex1);
Matcher matcher = p.matcher(remoteUriStr);
if (matcher.matches()) {
title = matcher.group(1);
}
I am using the code snippet above, but it is not able to extract the strings that I want. Am I doing anything wrong? I am quite new to regex.
The matches() method attempts to match the regular expression against the entire string. If you want to match a part of the string, you want the find() method:
if (matcher.find())
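The difference between matches() and find() can be seen with the three patterns from the question. In this sketch the class name SipUriDemo and the helper firstGroup are illustrative, and the + in the second pattern is escaped as \\+, which is an adjustment: unescaped, :+ means "one or more colons" and the captured text would still include the plus sign.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SipUriDemo {
    // Returns group 1 of the first partial match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String remoteUriStr = "\"+12222222222\" <sip:+12222222222#192.168.140.1>";

        // matches() must consume the whole input, so a partial pattern fails.
        System.out.println(Pattern.compile("#(.+?)>").matcher(remoteUriStr).matches()); // false

        // find() looks for the pattern anywhere in the input.
        System.out.println(firstGroup("\"(.+?)\"", remoteUriStr));  // +12222222222
        System.out.println(firstGroup(":\\+(.+?)#", remoteUriStr)); // 12222222222
        System.out.println(firstGroup("#(.+?)>", remoteUriStr));    // 192.168.140.1
    }
}
```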
You could, however, build a single regular expression to match all three parts at once:
String regex = "\"(.+?)\" \\<sip:\\+(.+?)#(.+?)\\>";
Pattern p = Pattern.compile(regex);
Matcher matcher = p.matcher(remoteUriStr);
if (matcher.matches()) {
title = matcher.group(1);
part2 = matcher.group(2);
ip = matcher.group(3);
}
Demo: http://ideone.com/8t2EC
If your input always looks like that and you always want the same parts from it, you can put that in a single regex (with multiple capturing groups):
"([^"]+)" <sip:([^#]+)#([^>]+)>
So you can then use
Pattern p = Pattern.compile("\"([^\"]+)\" <sip:([^#]+)#([^>]+)>");
Matcher m = p.matcher(remoteUriStr);
if (m.find()) {
String s1 = m.group(1);
String s2 = m.group(2);
String s3 = m.group(3);
}
