Matcher. How to get index of found group? - java

I have a sentence and I want to count the words, semi-punctuation and end punctuation in it.
Calling "m.group()" returns the matched String, but how do I know which group was found?
I can check each group for null, but that does not feel like a good approach.
String input = "Some text! Some example text.";
int wordCount=0;
int semiPunctuation=0;
int endPunctuation=0;
Pattern pattern = Pattern.compile( "([\\w]+) | ([,;:\\-\"\']) | ([!\\?\\.]+)" );
Matcher m = pattern.matcher(input);
while (m.find()) {
// need more correct method
if(m.group(1)!=null) wordCount++;
if(m.group(2)!=null) semiPunctuation++;
if(m.group(3)!=null) endPunctuation++;
}

You could use named groups to capture the expressions. Note that the spaces around | in your pattern are matched literally, so they are removed here:
Pattern pattern = Pattern.compile( "(?<words>\\w+)|(?<semi>[,;:\\-\"'])|(?<end>[!?.])" );
Matcher m = pattern.matcher(input);
while (m.find()) {
if (m.group("words") != null) {
wordCount++;
}
...
}
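If what you want is the numeric index of the group that matched, one option is to loop over the groups and ask the Matcher where each one started. This is a minimal sketch, not part of the answer above: the class and method names (GroupIndexDemo, matchedGroup, count) are made up for illustration, and it relies on Matcher.start(int) returning -1 for a group that did not take part in the match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupIndexDemo {
    // Returns the 1-based index of the first capturing group that took part
    // in the current match, or 0 if no group did.
    static int matchedGroup(Matcher m) {
        for (int i = 1; i <= m.groupCount(); i++) {
            if (m.start(i) != -1) {
                return i;
            }
        }
        return 0;
    }

    // counts[1] = words, counts[2] = semi-punctuation, counts[3] = end punctuation
    static int[] count(String input) {
        Pattern pattern = Pattern.compile("(\\w+)|([,;:\\-\"'])|([!?.])");
        Matcher m = pattern.matcher(input);
        int[] counts = new int[m.groupCount() + 1];
        while (m.find()) {
            counts[matchedGroup(m)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = count("Some text! Some example text.");
        // prints "5 words, 0 semi, 2 end"
        System.out.println(c[1] + " words, " + c[2] + " semi, " + c[3] + " end");
    }
}
```

Because the alternatives are mutually exclusive, exactly one group participates in each match, so the index can be used directly as an array slot.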

Related

Regex Pattern to Split Word in A string Using An Identifier

I would like to split the following string by commas using a DOTALL regex pattern that accepts letters, numbers, whitespace and special characters such as underscores and asterisks, i.e. #input("Test_1, Test_TWO , TEST_THIRTY_3*"), so that the output looks like:
"Test_1",
"Test_TWO",
"TEST_THIRTY_3*"
public static void main(String args[])
{
String line = "#input(\"Test_1,Test_TWO , TEST_THIRTY_3*\"\\)";
String pattern = "#input(\"(.*?)\".*";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(line);
while (m.find()) {
System.out.println("Found word: " + m.group(1) );
}
}
First, you have to escape the ( as \(, so your regex should look like this: #input\(\"(.*?)\".*. Second, you can use \s*,\s* to split the captured group, like this:
String line = "#input(\"Test_1,Test_TWO , TEST_THIRTY_3*\"\\)";
String pattern = "#input\\(\"(.*?)\".*";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(line);
while (m.find()) {
System.out.println(Arrays.toString(m.group(1).split("\\s*,\\s*")));
//----------------------------------------------------^^^^^^^^
}
outputs
[Test_1, Test_TWO, TEST_THIRTY_3*]
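The two steps above (capture the quoted part, then split it) can be wrapped in a small helper. This is only a sketch: the class name InputSplitter and the method extract are invented here, and the pattern drops the trailing .* because only group 1 is needed.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InputSplitter {
    // "\\(" escapes the literal parenthesis; (.*?) lazily captures the quoted text.
    private static final Pattern INPUT =
            Pattern.compile("#input\\(\"(.*?)\"", Pattern.DOTALL);

    // Returns the comma-separated items inside #input("..."), or an empty
    // array when the line does not contain such a call.
    static String[] extract(String line) {
        Matcher m = INPUT.matcher(line);
        if (!m.find()) {
            return new String[0];
        }
        // trim() removes outer whitespace; the split regex removes it around commas.
        return m.group(1).trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String line = "#input(\"Test_1, Test_TWO , TEST_THIRTY_3*\")";
        // prints [Test_1, Test_TWO, TEST_THIRTY_3*]
        System.out.println(Arrays.toString(extract(line)));
    }
}
```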
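If you do not have to stick to regex, you can use plain string methods. Note that this splits the whole line, so the #input(" wrapper and the whitespace around the commas stay in the pieces unless you strip them first: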
List<String> output = Arrays.asList(line.split(","));

How can I find a digit after string with pattern?

I'm trying to get the last return code from an SSH shell in Linux.
I'm using the command echo $? to get it.
I've written the following code, but it's not working:
int last_len = 0;
Pattern p = Pattern.compile("echo $?\r\n[0-9]");
while(in.available() > 0 ) {
last_len = in.read(buffer);
String str = new String(buffer, 0, last_len);
Matcher m = p.matcher(str);
if(m.find()) {
return Integer.parseInt(m.group().substring(9));
}
}
What am I doing wrong?
You need to escape $ and ? in the regex in order to match them literally, since both are special characters in regex.
Pattern p = Pattern.compile("echo \\$\\?\\r?\\n([0-9])");
Matcher m = p.matcher(str);
if(m.find()) {
System.out.println(m.group(1));
}
or
Pattern p = Pattern.compile("echo\\s+\\$\\?[\\r\\n]+([0-9])");
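An alternative to escaping each character by hand is Pattern.quote, which turns a whole string into a literal. The sketch below is an illustration under a few assumptions of mine: the class name ExitCodeParser is invented, -1 is used as an arbitrary "not found" value, and the digit group is widened to [0-9]+ because exit codes can have more than one digit (0 to 255).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExitCodeParser {
    // Pattern.quote makes "$" and "?" literal; the line break may be \r\n or \n.
    private static final Pattern EXIT_CODE =
            Pattern.compile(Pattern.quote("echo $?") + "\\r?\\n([0-9]+)");

    // Returns the exit code echoed after "echo $?", or -1 if none is found.
    static int parse(String shellOutput) {
        Matcher m = EXIT_CODE.matcher(shellOutput);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        System.out.println(parse("ls /nope\r\necho $?\r\n127\r\n")); // prints 127
    }
}
```

Using the capturing group also avoids the substring(9) arithmetic from the question, which silently depends on the exact length of the echoed command.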

How to match regex in java

I want to match a pattern like this: GJ-16-RS-1234. I have tried the following patterns, but they are not working.
My regex patterns are:
String str_tempPattern = "(^[A-Z]{2})\\-([0-9]{2})\\-([A-Z]{1,2})\\-([0-9]{1,4}$)";
String str_tempPattern = "(^[A-Z]{2})-([0-9]{1,2})-([A-Z]{1,2})-([0-9]{1,4})$";
String str_tempPattern = "^[A-Z]{2}\\-[0-9]{1,2}\\-[A-Z]{1,2}\\-[0-9]{1,4}$";
I am using a text watcher and check for changes in afterTextChanged():
Pattern p = Pattern.compile(str_tempPattern, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Matcher m = p.matcher(s);
if (m.find()){
}
Just set the condition using the matches method. It requires the entire string to match, so the ^ and $ anchors are unnecessary:
if (string.matches("[A-Z]{2}\\-[0-9]{1,2}\\-[A-Z]{1,2}\\-[0-9]{1,4}"))
{
// Yes it matches
}
else
{
// No it won't
}
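If the check runs on every text change, it may be worth compiling the pattern once and reusing it. A minimal sketch, assuming a hypothetical helper class PlateValidator (not from the question); it is case-sensitive, so add Pattern.CASE_INSENSITIVE if lower-case input should pass.

```java
import java.util.regex.Pattern;

class PlateValidator {
    // Compiled once; the hyphen needs no escaping outside a character class.
    private static final Pattern PLATE =
            Pattern.compile("[A-Z]{2}-[0-9]{1,2}-[A-Z]{1,2}-[0-9]{1,4}");

    // matches() succeeds only if the whole input fits the pattern,
    // unlike find(), which accepts a match anywhere inside the input.
    static boolean isValid(CharSequence s) {
        return PLATE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("GJ-16-RS-1234"));  // prints true
        System.out.println(isValid("GJ-16-RS-12345")); // prints false
    }
}
```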

How can I get particular substrings from a string?

I have a string like
{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}
{Imapact}{assest}{custom_23_multiplication}.
From this I want only those substrings which contain "custom".
For example, from the above string I want only
{Custom_21_addtion}{custom_22_substration}{custom_23_multiplication}.
How can I get this?
You can use a regular expression, looking from {custom to }. It will look like this:
Pattern pattern = Pattern.compile("\\{custom.*?\\}", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(inputString);
while (matcher.find()) {
System.out.print(matcher.group());
}
The .* after custom means 0 or more characters after the word "custom", and the question mark makes the quantifier lazy, so it takes as few characters as possible and stops at the next } it can find.
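The same idea can also be written with a negated character class instead of a lazy quantifier, and the matches can be collected rather than printed. This is a sketch with invented names (CustomTags, find); [^}]* means "any run of characters that are not }", which stops at the closing brace just like .*? does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomTags {
    private static final Pattern CUSTOM =
            Pattern.compile("\\{custom[^}]*\\}", Pattern.CASE_INSENSITIVE);

    // Returns every {custom...} tag in the input, in order of appearance.
    static List<String> find(String input) {
        List<String> tags = new ArrayList<>();
        Matcher m = CUSTOM.matcher(input);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String s = "{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}"
                + "{Imapact}{assest}{custom_23_multiplication}";
        // prints {Custom_21_addtion}{custom_22_substration}{custom_23_multiplication}
        System.out.println(String.join("", find(s)));
    }
}
```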
If you want an alternative solution without regex:
String a = "{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}{Imapact}{assest}{custom_23_multiplication}";
String[] b = a.split("}");
StringBuilder result = new StringBuilder();
for(String c : b) {
// if you want case sensitivity, drop the toLowerCase()
if(c.toLowerCase().contains("custom"))
result.append(c).append("}");
}
System.out.println(result.toString());
You can do something like this:
StringTokenizer st = new StringTokenizer(yourString, "{");
List<String> llista = new ArrayList<String>();
// "_" is a word character, so requiring \W after "custom" would reject "custom_21_..."
Pattern pattern = Pattern.compile("(\\W|^)custom", Pattern.CASE_INSENSITIVE);
while(st.hasMoreTokens()) {
// nextToken() returns a String; nextElement() returns Object
String string = st.nextToken();
Matcher matcher = pattern.matcher(string);
if(matcher.find()){
llista.add(string);
}
}
Another solution:
String inputString = "{Action}{RequestId}{Custom}{Custom_21_addtion}{custom_22_substration}{Imapact}{assest}" ;
String strTokens[] = inputString.split("\\}");
// compile the pattern once, outside the loop
Pattern pattern = Pattern.compile( "custom", Pattern.CASE_INSENSITIVE);
for(String str: strTokens){
// match against the current token, not the whole input
Matcher matcher = pattern.matcher(str);
if (matcher.find()) {
System.out.println("Tag Name:" + str.replace("{",""));
}
}

What's wrong with this replacement code?

I need to replace {word} with a regex named group, (?<word>\w+), so that I can later match expressions such as /{name}/{age}... This code doesn't work!
String p = "/{name}/{id}";
p = p.replaceAll("\\{(\\w+)\\}", "(?<$1>\\\\\\\\w+)");
Pattern URL_PATTERN = Pattern.compile(p);
CharSequence cs = "/lucas/3";
Matcher m = URL_PATTERN.matcher(cs);
if(m.matches()){
for(int i=1;i<m.groupCount();++i){
System.out.println(m.group("name"));
}
}
Result: nothing :(
But when I take the result of the replacement, /(?<name>\w+)/(?<id>\w+), and put it in Pattern.compile(), it works:
String p = "/{name}/{id}";
p = p.replaceAll("\\{(\\w+)\\}", "(?<$1>\\\\\\\\w+)");
Pattern URL_PATTERN = Pattern.compile("/(?<name>\\w+)/(?<id>\\w+)");
System.out.println(p);
CharSequence cs = "/lucas/3";
Matcher m = URL_PATTERN.matcher(cs);
if(m.matches()){
for(int i=1;i<m.groupCount();++i){
System.out.println(m.group("name"));
}
}
Result: "lucas"
What's wrong?
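You used too many \ in your replacement. replaceAll treats a backslash in the replacement string as an escape character, so your eight backslashes (four at runtime) produce two literal backslashes, and the generated regex contains \\w+, which matches a literal backslash followed by one or more w characters. Try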
p = p.replaceAll("\\{(\\w+)\\}", "(?<$1>\\\\\\w+)");
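To see the escaping end to end, here is a small sketch that converts a template into a Pattern and matches a URL against it. The class name UrlTemplate and its compile method are hypothetical. It uses four backslashes in the Java source, which is two at runtime and one literal backslash after replaceAll processes the replacement. It also assumes every placeholder is a valid group name (letters and digits, starting with a letter); a placeholder such as {user_id} would make Pattern.compile fail.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlTemplate {
    // Turns "/{name}/{id}" into the regex "/(?<name>\w+)/(?<id>\w+)".
    static Pattern compile(String template) {
        // $1 is the placeholder name; "\\\\w+" emits a literal \w+ into the regex.
        String regex = template.replaceAll("\\{(\\w+)\\}", "(?<$1>\\\\w+)");
        return Pattern.compile(regex);
    }

    public static void main(String[] args) {
        Matcher m = compile("/{name}/{id}").matcher("/lucas/3");
        if (m.matches()) {
            // prints "lucas 3"
            System.out.println(m.group("name") + " " + m.group("id"));
        }
    }
}
```

Note that the loop in the question uses i < m.groupCount(), which skips the last group; since the groups are named, reading them by name as above sidesteps the index loop entirely.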
