How to match a set of strings in regexp - java

How do I combine ch..+ and ch..- in regexp effectively without having to scan separately?
And are we using matcher in the pattern?
My output looks like this:
ch01+
ch01-
ch02+
ch02-
...

How do I combine ch..+ and ch..- in regexp effectively without having to scan separately?
Use | (pipe) for alternation:
ch..(\+|-)
And are we using matcher in the pattern?
Depends on how you're using the regexp and the pattern. To get a concrete answer, you'll have to show some actual code, or ask a much more specific question.
N.B. If you want to restrict the two characters after ch to 0-9, you can use \d, which is a shorthand character class for [0-9]:
ch\d{2}(\+|-)

You can use a character class containing just "+" and "-" like so "[+-]".
Pattern p = Pattern.compile("ch..[+-]");
Matcher m = p.matcher("ch01+");
if (m.find()) {
    // found it...
}
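As a minimal sketch of how the character-class approach could be applied to a whole block of output in one pass, the example below loops over Matcher.find(). The class and method names (ChannelScan, findChannels) are made up for illustration, and the \d{2} restriction assumes the two characters after ch are always digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChannelScan {
    // "ch", two digits, then a literal + or - (no escaping needed inside [...])
    static final Pattern CHANNEL = Pattern.compile("ch\\d{2}[+-]");

    // Collects every channel token found anywhere in the text, in order.
    static List<String> findChannels(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = CHANNEL.matcher(text);
        while (m.find()) {          // each call advances to the next match
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [ch01+, ch01-, ch02+, ch02-]
        System.out.println(findChannels("ch01+\nch01-\nch02+\nch02-"));
    }
}
```

Both the + and the - variants are picked up by the same scan, so no second pass is needed.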

Related

Java Regular expressions for filename

I want to check the filenames sent to me against two patterns.
The first regular expression is ~*~, which should match names like ~263~. I put this in online regular expression testers and it matches. The code doesn't work, though; it reports no match.
List<FTPFile> ret = new ArrayList<FTPFile>();
Pattern pattern = Pattern.compile("~*~");
Matcher matcher;
for (FTPFile file : files)
{
matcher = pattern.matcher(file.getName());
if(matcher.matches())
{
ret.add(file);
}
}
return ret;
Also, the second pattern I need is ##*, which should match strings like abc#ere#sss.
Please tell me the proper patterns in Java for this.
You need to define your pattern like this:
Pattern pattern = Pattern.compile("~.*~");
In your regex ~*~, the ~* repeats the first ~ zero or more times, so it never matches the number following the first ~. Because the matches method tries to match the whole input string, the match fails. You need to add .* in between to match strings like ~66~ or ~kjk~. To match only strings that have digits between the ~ characters, use ~\d+~ (written "~\\d+~" in a Java string literal).
Try the regex:
\~.*\~
instead of:
~*~
Example:
Pattern pattern = Pattern.compile("\\~.*\\~");
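The difference between matches() and find() probably explains why an online tester reported a match while the Java code did not. The sketch below compares the two calls for the patterns discussed here; the class and helper names (TildeNames, wholeNameMatches, containsMatch) are invented for this example.

```java
import java.util.regex.Pattern;

class TildeNames {
    // The pattern from the question: zero or more "~" followed by one "~"
    static final Pattern ORIGINAL = Pattern.compile("~*~");
    // "~", any characters, "~"
    static final Pattern ANY_BETWEEN = Pattern.compile("~.*~");
    // "~", one or more digits, "~"
    static final Pattern DIGITS_BETWEEN = Pattern.compile("~\\d+~");

    // matches() succeeds only if the pattern consumes the entire name.
    static boolean wholeNameMatches(Pattern p, String name) {
        return p.matcher(name).matches();
    }

    // find() succeeds if the pattern matches anywhere inside the name.
    static boolean containsMatch(Pattern p, String name) {
        return p.matcher(name).find();
    }

    public static void main(String[] args) {
        System.out.println(containsMatch(ORIGINAL, "~263~"));          // true
        System.out.println(wholeNameMatches(ORIGINAL, "~263~"));       // false
        System.out.println(wholeNameMatches(ANY_BETWEEN, "~263~"));    // true
        System.out.println(wholeNameMatches(DIGITS_BETWEEN, "~kjk~")); // false
    }
}
```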

Match characters via regex in a string multiple times

I'm trying to replace all occurrences in a string with a regex using a Pattern object, but it only replaces the odd occurrences:
final Pattern p = Pattern.compile("(^|\\W|\\\\N)(i)(\\W|$)", Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE);
System.out.println(p.matcher("i-i-i").replaceAll("$1I$3"));
This returns me:
I-i-I
But I need to match the i in the middle as well, and somehow it doesn't catch that. I also tried a simplified regex, (^|-)(I)($|-), on i-i-i-i-i-i, which returned I-i-I-i-I-i.
I guess it is because every other dash was already consumed by the previous match, so it can't be matched a second time for the next i. Is it possible to allow that?
It seems that your problem is that you are trying to use the same - character in several matches. In that case you should probably use the look-around mechanism. For example, you can change your
(^|-)(I)($|-)
pattern to
(^|-)(I)(?=($|-))
and use $1I as the replacement. This way the regex only checks that $ or - follows the I, without including it in the match, so
final Pattern p = Pattern.compile("(^|-)(I)(?=($|-))",
Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE);
System.out.println(p.matcher("i-i-i-i-i-i").replaceAll("$1I"));
prints
I-I-I-I-I-I
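To make the difference concrete, here is a small sketch that runs the consuming pattern and the lookahead pattern side by side. The class and method names (OverlapReplace, consuming, lookahead) are illustrative only.

```java
import java.util.regex.Pattern;

class OverlapReplace {
    // The trailing "-" is part of the match, so the next "i" has no
    // unconsumed "-" in front of it and is skipped.
    static String consuming(String s) {
        return Pattern.compile("(^|-)(i)($|-)", Pattern.CASE_INSENSITIVE)
                .matcher(s).replaceAll("$1I$3");
    }

    // The lookahead only checks for "-" or end of input without consuming it,
    // so the same "-" can start the next match.
    static String lookahead(String s) {
        return Pattern.compile("(^|-)(i)(?=$|-)", Pattern.CASE_INSENSITIVE)
                .matcher(s).replaceAll("$1I");
    }

    public static void main(String[] args) {
        System.out.println(consuming("i-i-i")); // I-i-I
        System.out.println(lookahead("i-i-i")); // I-I-I
    }
}
```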

java pattern filter

I need to create a Java pattern to filter data like 13.6Gb, 12MB, 15.5Kb.
I use this code:
Pattern p = Pattern.compile("(\\d+)(\\w+)");
Matcher m = p.matcher(content);
String num_letter = m.group(1);
String union = m.group(2);
but it can't detect decimal numbers. How do I modify this pattern?
Try adding an optional group for the decimal part:
Pattern.compile("(\\d+(?:[.]\\d+)?)(\\w+)");
Note the use of non-capturing group for the decimal part.
Here is a variation on the optional decimal match:
Pattern.compile("(\\d+\\.?\\d+?)+(\\w+)");
If you are using Eclipse, I'd suggest a tool like http://myregexp.com/eclipsePlugin.html - it makes this waaaaay easy.
Eyeballing yours, I would say something like (\\d+(\\.?(\\d+))?) then you could see how many match groups you have before pulling the ones out you want. Alternatively, using named capture groups would be more readable.
-Ryan
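For reference, a minimal sketch of the optional-decimal approach applied to the sample data. It assumes the unit is made of letters only (hence [A-Za-z]+ rather than \w+, which would also accept digits), and the names SizeParser and parse are invented for the example. Note that find() has to succeed before group() may be called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SizeParser {
    // Group 1: integer part with an optional ".digits" suffix.
    // Group 2: the unit, assumed here to consist of letters only.
    static final Pattern SIZE =
            Pattern.compile("(\\d+(?:\\.\\d+)?)\\s*([A-Za-z]+)");

    // Returns each size found in the content as "<number> <unit>".
    static List<String> parse(String content) {
        List<String> sizes = new ArrayList<>();
        Matcher m = SIZE.matcher(content);
        while (m.find()) {                       // group() is only valid after a successful find()
            sizes.add(m.group(1) + " " + m.group(2));
        }
        return sizes;
    }

    public static void main(String[] args) {
        // prints [13.6 Gb, 12 MB, 15.5 Kb]
        System.out.println(parse("13.6Gb, 12MB,15.5Kb"));
    }
}
```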

Need regex to match the given string

I need a regex to match a particular string, say 1.4.5, in the string below. My string will be like
absdfsdfsdfc1.4.5kdecsdfsdff
I have a regex which is giving [c1.4.5k] as an output. But I want to match only 1.4.5. I have tried this pattern:
[^\\W](\\d\\.\\d\\.\\d)[^\\d]
But no luck. I am using Java.
Please let me know the pattern.
If I read your expression [^\\W](\\d\\.\\d\\.\\d)[^\\d] correctly, you want a word character before the version and no digit after it. Is that correct?
For that you can use lookbehind and lookahead assertions. These assertions only check their condition without consuming any characters, so the surrounding text is not included in the result.
(?<=\\w)(\\d\\.\\d\\.\\d)(?!\\d)
Because of that, you can remove the capturing group. You are also repeating yourself in the pattern; you can simplify that, too:
(?<=\\w)\\d(?:\\.\\d){2}(?!\\d)
That would be my pattern. (The ?: marks a non-capturing group.)
Your requirements are vague. Do you need to match a series of exactly 3 numbers with exactly two dots?
[0-9]+\.[0-9]+\.[0-9]+
Which could be written as
([0-9]+\.){2}[0-9]+
Do you need to match any number of numbers, separated by dots?
([0-9]+\.)+[0-9]+
Use look ahead and look behind.
(?<=c)[\d\.]+(?=k)
Where c is the character that would be immediately before the 1.4.5 and k is the character immediately after 1.4.5. You can replace c and k with any regular expression that would suit your purposes
I think this one should do it : ([0-9]+\\.?)+
Regular Expression
((?<!\d)\d(?:\.\d(?!\d))+)
As a Java string:
"((?<!\\d)\\d(?:\\.\\d(?!\\d))+)"
String str = "absdfsdfsdfc1.4.5kdec456456.567sdfsdff22.33.55ffkidhfuh122.33.44";
// Three single digits separated by literal dots
String regex = "[0-9]\\.[0-9]\\.[0-9]";
Matcher matcher = Pattern.compile(regex).matcher(str);
if (matcher.find()) {
    String version = matcher.group(0);
    System.out.println(version);
} else {
    System.out.println("no match found");
}
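The lookaround answers can be compared against a plain digit pattern with a short sketch like the one below. The second test string ("v11.4.56 and c1.4.5k") is made up to show where the two differ, and the names VersionFinder, LOOSE, STRICT and findAll are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VersionFinder {
    // Digit, dot, digit, dot, digit, with no check on the neighbours.
    static final Pattern LOOSE = Pattern.compile("\\d\\.\\d\\.\\d");
    // Same shape, but no digit may come directly before or after it.
    static final Pattern STRICT =
            Pattern.compile("(?<!\\d)\\d(?:\\.\\d){2}(?!\\d)");

    // Returns every match of the pattern in the input, in order.
    static List<String> findAll(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(findAll(STRICT, "absdfsdfsdfc1.4.5kdecsdfsdff")); // [1.4.5]
        // LOOSE also picks up the 1.4.5 buried inside 11.4.56
        System.out.println(findAll(LOOSE, "v11.4.56 and c1.4.5k"));  // [1.4.5, 1.4.5]
        System.out.println(findAll(STRICT, "v11.4.56 and c1.4.5k")); // [1.4.5]
    }
}
```

Because the lookarounds consume nothing, the surrounding c and k never end up in the result.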

Split a string in java using regex

I am trying to split a string using a regex "A.*B", which works just fine to retrieve strings between 'A' and 'B'. But the dot '.' doesn't include new line characters \n,\r.
Can you please guide me on how to achieve this?
Thanks
Thanks all. Pattern.DOTALL worked like a charm.
I had another question related to this. What should be done if I need to extract all the strings between 'A' and 'B' (which basically match the above regex).
I tried using find() and group() of matcher class, but with the pattern below it seems to return the whole string.
Pattern p = Pattern.compile("A.*B",Pattern.DOTALL);
Use a java.util.regex.Pattern with the DOTALL flag (MULTILINE only changes where ^ and $ match; it does not affect the dot):
import java.util.regex.Pattern;
Pattern pattern = Pattern.compile("A.*B", Pattern.DOTALL);
pattern.split(string);
Compile the regex with this option: Pattern regex = Pattern.compile("A.*B",Pattern.DOTALL)
Try "A[\\s\\S]*B", where [\\s\\S] matches any character, including line terminators.
Or you may specify the DOTALL switch so that "." will include even line terminators. Take a look at the documentation of the Pattern class.
Have a look at java.util.regex.Pattern.compile(String regex, int flags), esp. the DOTALL flag
I assume you use the Pattern, Matcher classes for this.
Have you tried providing DOTALL to your Pattern.compile() method?
Pattern.compile(regex, Pattern.DOTALL)
'.' = Any character (may or may not match line terminators)
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html#lt
Try changing your regex to "A(.|\\s)*B". This means A, followed by any character (.) or any whitespace character (\s) any number of times, followed by B (the doubly escaped \\s is needed in Java source code).
Reference for regular expressions (constructs, special characters, etc.) in Java: http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
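For the follow-up about extracting every string between 'A' and 'B', one possible sketch combines Pattern.DOTALL with a reluctant quantifier (.*?), since the greedy .* runs from the first A to the last B and so returns the whole string. The class and method names (BetweenMarkers, extract) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenMarkers {
    // DOTALL lets "." match line terminators; ".*?" stops at the nearest B.
    static final Pattern BETWEEN = Pattern.compile("A(.*?)B", Pattern.DOTALL);

    // Returns the text captured between each A ... B pair, in order.
    static List<String> extract(String text) {
        List<String> parts = new ArrayList<>();
        Matcher m = BETWEEN.matcher(text);
        while (m.find()) {
            parts.add(m.group(1));
        }
        return parts;
    }

    public static void main(String[] args) {
        String text = "A1\nB xx A2B";
        // Two captures: "1\n" and "2"
        System.out.println(extract(text).size()); // 2
        // Without DOTALL the dot stops at the newline, so nothing is found
        System.out.println(Pattern.compile("A.*B").matcher("A1\nB").find()); // false
    }
}
```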
