Regular expression to find substring in text

I have a text file that contains some strings I want to extract with a Java regex.
Those strings have the format:
$numbers,numbers,numbers....,numbers##
(they start with $, followed by groups of numbers separated by commas, and end with ##)
Here is my pattern:
Pattern pattern = Pattern.compile("$*##");
Matcher matcher = pattern.matcher(text);
if (matcher.find())
{
}
It turns out that nothing matches my pattern.
Can anyone tell me what's wrong with it?

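In a regex, $ is an anchor and * is a quantifier rather than a wildcard, so "$*##" does not describe the format you want. Escape the $ and spell out the comma-separated digits:
Pattern pattern = Pattern.compile("\\$\\d+(,\\d+)*##");
Thanks to @Pshemo for his valuable input in reaching this solution.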
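
For completeness, here is a minimal sketch of pulling every such string out of a larger text. It assumes a single leading $ as described in the question and uses no end-of-input anchor, so that several records can be found in one pass. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarRecordFinder {
    // "\\$" is a literal dollar sign, "\\d+(,\\d+)*" is one or more
    // comma-separated digit groups, and "##" is the literal terminator.
    private static final Pattern RECORD = Pattern.compile("\\$\\d+(,\\d+)*##");

    // Returns every "$n,n,...,n##" record found in the text, in order.
    static List<String> findRecords(String text) {
        List<String> records = new ArrayList<>();
        Matcher matcher = RECORD.matcher(text);
        while (matcher.find()) {
            records.add(matcher.group());
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "header $1,22,333## noise $7## tail";
        System.out.println(findRecords(text)); // prints [$1,22,333##, $7##]
    }
}
```

Using while instead of if around matcher.find() is what lets the loop visit each record rather than only the first one.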

Related

What is the easiest way to filter a changing number in a string?

Can someone tell me the easiest way to extract the number '20' from the following string?
Level I (10/20)
Note: the numbers in the brackets and the numeral after 'Level' change, and they can contain more characters than in this example.
It would be awesome if there were a way to use a regex and extract a specific part of the match.

I'm not the best with regex, but here's a working solution for your example:
String s = "Level I (10/20)";
Pattern pattern = Pattern.compile("\\(\\d+/(\\d+)\\)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output:
20

How about this one? With the MULTILINE flag it works for multi-line input too (\p{Blank} is Java's spelling of the POSIX [[:blank:]] class):
^Level\p{Blank}.+\(\d*/(\d*)\)
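
A minimal sketch of running this kind of pattern over several lines in Java. It assumes \p{Blank} in place of the POSIX bracket class, which java.util.regex does not support, and Pattern.MULTILINE so that ^ matches at the start of each line. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LevelMaxExtractor {
    // MULTILINE makes ^ match at the start of every line, not just the input.
    // \p{Blank} matches a single space or tab.
    private static final Pattern LEVEL =
            Pattern.compile("^Level\\p{Blank}.+\\(\\d*/(\\d*)\\)", Pattern.MULTILINE);

    // Collects the number after the slash from each "Level ... (a/b)" line.
    static List<String> maxValues(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = LEVEL.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Level I (10/20)\nLevel II (3/150)";
        System.out.println(maxValues(text)); // prints [20, 150]
    }
}
```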

Java Regex finding a substring without using space character in the pattern string?

I have a string with the value AB-CD>AY-ZV (FG). Out of this, I would like to get only the value AB-CD>AY-ZV using regex.
In this, I do not want to use space \s as part of my pattern.
Any help is appreciated.

You could use the below regex to match the first part.
String s = "AB-CD>AY-ZV (FG)";
Matcher m = Pattern.compile("^\\S+").matcher(s);
if (m.find())
{
    System.out.println(m.group());
}
Here \\S+ matches one or more non-space characters and ^ asserts that we are at the start.

Java Regular expressions for filename

I want to check the filenames sent to me against two patterns.
The first regular expression is ~*~, which should match names like ~263~. I put this in online regular expression testers and it matches. The code doesn't work, though; it reports no match.
List<FTPFile> ret = new ArrayList<FTPFile>();
Pattern pattern = Pattern.compile("~*~");
Matcher matcher;
for (FTPFile file : files)
{
    matcher = pattern.matcher(file.getName());
    if (matcher.matches())
    {
        ret.add(file);
    }
}
return ret;
The second pattern I need is ##*, which should match strings like abc#ere#sss.
Please tell me the proper patterns in Java for this.

You need to define your pattern like this:
Pattern pattern = Pattern.compile("~.*~");
In your regex ~*~, the ~* repeats the first ~ zero or more times, so it cannot match the digits that follow the first ~. Because the matches method tries to match the whole input string, the match fails. Adding .* in between lets the pattern match strings like ~66~ or ~kjk~. To match only strings that have digits between the tildes, use ~\d+~ (written "~\\d+~" in a Java string literal).
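
To see the difference in practice, here is a small sketch that runs the three patterns through matches(). The class name and helper are hypothetical; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

class TildeNameCheck {
    // True only if the regex matches the entire name, as Matcher.matches() requires.
    static boolean matchesWhole(String regex, String name) {
        return Pattern.compile(regex).matcher(name).matches();
    }

    public static void main(String[] args) {
        // "~*~" means "zero or more tildes, then a tilde": digits are never allowed.
        System.out.println(matchesWhole("~*~", "~263~"));     // false
        System.out.println(matchesWhole("~*~", "~~~"));       // true
        // "~.*~" allows any characters between the two tildes.
        System.out.println(matchesWhole("~.*~", "~263~"));    // true
        System.out.println(matchesWhole("~.*~", "~kjk~"));    // true
        // "~\\d+~" allows only digits between the two tildes.
        System.out.println(matchesWhole("~\\d+~", "~263~"));  // true
        System.out.println(matchesWhole("~\\d+~", "~kjk~"));  // false
    }
}
```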

Try the regex:
\~.*\~
instead of:
~*~
Example:
Pattern pattern = Pattern.compile("\\~.*\\~");
(The backslashes before ~ are optional, since ~ has no special meaning in a regex.)

Beginning index of every word

I want to get the beginning index of every word in a string. A word is defined as any sequence of non-whitespace characters.
String test = "this that and that";
Matcher matcher = Pattern.compile("\\s+[WHAT TO WRITE HERE]\\s+").matcher(test);
while (matcher.find()) {
    System.out.println(matcher.start());
}
What should I write in the regular expression? For example, the output should be 0,5,10,14.
There can be multiple whitespace characters between words.

"A word is defined as any sequence of non-whitespace characters."
And there is a character class for that: \S.
Your regex should therefore be:
private static final Pattern PATTERN = Pattern.compile("\\S+");
Note, however, that your definition of "word" is rather broad; this will also include punctuation etc.
As to your loop, it is correct: after a successful find(), the Matcher's .start() method returns the index at which the match started.
Taking your code and modifying it a little, this gives:
String test = "this that and that";
Matcher matcher = PATTERN.matcher(test);
while (matcher.find()) {
    System.out.println(matcher.start());
}

I would use this regex:
...
Matcher matcher = Pattern.compile("[^\\s]+").matcher(test);
...

I would use :
[A-Za-z0-9]+
It will find only alphanumeric words.
I think "\S+" will be problematic with punctuation marks and unusual characters.
You can even drop the numeric ("0-9") part if you want.

@fge already gave the best answer, but since I can't reply to his comment: @Ian McGrath, you were asking what you could have written. Other solutions exist; this is what I came up with, and it seemed to work as well.
Matcher matcher = Pattern.compile("\\w+?(\\s+|$)").matcher(test);
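
The patterns suggested in these answers agree on the question's sample string but diverge once punctuation appears. A small sketch that compares them; the class name, helper, and the second test string are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordStartDemo {
    // Returns the start index of every match of the regex in the text.
    static List<Integer> starts(String regex, String text) {
        List<Integer> indexes = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            indexes.add(matcher.start());
        }
        return indexes;
    }

    public static void main(String[] args) {
        String plain = "this that and that";
        // All three patterns give the same result on the sample input.
        System.out.println(starts("\\S+", plain));             // [0, 5, 10, 14]
        System.out.println(starts("[A-Za-z0-9]+", plain));     // [0, 5, 10, 14]
        System.out.println(starts("\\w+?(\\s+|$)", plain));    // [0, 5, 10, 14]

        String punctuated = "hi, you - ok";
        // \S+ counts the lone "-" as a word.
        System.out.println(starts("\\S+", punctuated));          // [0, 4, 8, 10]
        // The alphanumeric class skips punctuation entirely.
        System.out.println(starts("[A-Za-z0-9]+", punctuated));  // [0, 4, 10]
        // This one needs whitespace or end of input right after the word,
        // so "hi" (followed by a comma) is missed.
        System.out.println(starts("\\w+?(\\s+|$)", punctuated)); // [4, 10]
    }
}
```

Which result is "right" depends on how a word is defined; under the question's definition (anything that is not whitespace), \S+ is the direct translation.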

%etd(msg01) regular expression?

I am trying to write a regular expression for String like %etd(msg01).
String string = "My name is %etd(msg01) and %etd(msg02)";
Pattern pattern = Pattern.compile("%etd(.+)");
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group());
}
It prints %etd(msg01) and %etd(msg02) as a single match. However, I want it to print %etd(msg01) and %etd(msg02) separately. In other words, I am looking for a non-greedy match.
How should the regular expression be changed to make it non-greedy in this situation?

You should use this regex:
Pattern pattern = Pattern.compile("%etd\\([^)]+\\)");

Place a question mark after .* or .+ to make it non-greedy. This should work for you:
Pattern pattern = Pattern.compile("%etd\\(.+?\\)");
Double backslashes are also necessary in front of the opening and closing parentheses, because parentheses carry a special meaning in regular expressions.
Another option, if you are sure that your names don't contain an opening parenthesis after the first one, is:
Pattern pattern = Pattern.compile("%etd\\([^(]+\\)");
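
To compare the greedy pattern from the question with the two fixes, here is a minimal sketch. The class name and helper method are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EtdPlaceholderDemo {
    // Returns the full text of every match of the regex in the input.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String s = "My name is %etd(msg01) and %etd(msg02)";
        // Greedy, with unescaped parentheses acting as a group: one long match.
        System.out.println(findAll("%etd(.+)", s));
        // prints [%etd(msg01) and %etd(msg02)]

        // Lazy quantifier: stops at the first closing parenthesis.
        System.out.println(findAll("%etd\\(.+?\\)", s));
        // prints [%etd(msg01), %etd(msg02)]

        // Negated character class: cannot run past a closing parenthesis.
        System.out.println(findAll("%etd\\([^)]+\\)", s));
        // prints [%etd(msg01), %etd(msg02)]
    }
}
```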
