Match characters via regex in a string multiple times - java

I'm trying to replace all occurrences in a string with a regex using a Pattern object, but it only replaces the odd occurrences:
final Pattern p = Pattern.compile("(^|\\W|\\\\N)(recursive)(\\W|$)", Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE);
System.out.println(p.matcher("i-i-i").replaceAll("$1I$3"));
This returns:
I-i-I
But I also need to match the I in the middle, and somehow it isn't caught. I also tried a simplified regex (^|-)(I)($|-) on i-i-i-i-i-i, which returned I-i-I-i-I-i.
I guess this is because the odd dashes (at 4x+1) were already matched, so they can't be matched a second time for the even i's. Is it possible to allow that?

It seems your problem is that you are trying to use the same - character in several matches. A match consumes the characters it covers, so the next match can only start after them. In that case you should probably use the look-around mechanism. For example, you can change your
(^|-)(I)($|-)
pattern to
(^|-)(I)(?=($|-))
and use $1I as the replacement. This way the regex only checks that $ or - follows the I, but does not include it in the match, so
final Pattern p = Pattern.compile("(^|-)(I)(?=($|-))",
Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE);
System.out.println(p.matcher("i-i-i-i-i-i").replaceAll("$1I"));
prints
I-I-I-I-I-I
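The difference between the consuming pattern and a look-around pattern can be sketched side by side. This is a minimal illustration, not the answerer's exact code: the class name OverlapDemo and its method names are made up for the example, and the second pattern goes one step further than the answer by also turning the leading (^|-) into a lookbehind, so the replacement is just I.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class OverlapDemo {

    // Consuming version from the question: the trailing "-" becomes part of
    // the match, so the search for the next match resumes after it.
    static String consuming(String s) {
        return Pattern.compile("(^|-)(i)($|-)", Pattern.CASE_INSENSITIVE)
                .matcher(s)
                .replaceAll("$1I$3");
    }

    // Look-around version: the separators are only checked, never consumed,
    // so neighbouring matches can share the same "-".
    static String lookaround(String s) {
        return Pattern.compile("(?<=^|-)i(?=$|-)", Pattern.CASE_INSENSITIVE)
                .matcher(s)
                .replaceAll("I");
    }

    public static void main(String[] args) {
        System.out.println(consuming("i-i-i-i-i-i"));  // I-i-I-i-I-i
        System.out.println(lookaround("i-i-i-i-i-i")); // I-I-I-I-I-I
    }
}
```

Because nothing but the i itself is consumed, the replacement string no longer needs any group references.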

Related

Java Regex not working as desired

I created this regex, but somehow it only matches the first part of the pattern, not the last part. What is going on?
Here's the code:
String m = "-2√3254i/18.5";
String regex = "-?\\d+(\\.\\d*)?\\√\\d+(\\.\\d*)?i\\/\\d+(\\.\\d*)?";
I have tried many different ways, such as:
-?\\d+(\\.\\d*)?\\√\\d+(\\.\\d*)?i+\\/+\\d+(\\.\\d*)?
-?\\d+(\\.\\d*)?\\√\\d+(\\.\\d*)?i/\\d+(\\.\\d*)?
-?\\d+(\\.\\d*)?\\√\\d+(\\.\\d*)?i\\(\\/\\d+(\\.\\d*))?
None of them work.
The output is always
-2√3254
Any suggestions?
Thank you.
Okay, so my regex is actually composed of many regexes:
String regex = "a regex | another regex | -?\\d+(\\.\\d*)?\\√\\d+(\\.\\d*)?i"
+ "|another regex | -?\\d+(\\.\\d*)?\\√\\d+(\\.\\d*)?i\\/\\d+(\\.\\d*)?";
The problem lies in the two regexes shown. The matcher tries the first of them first, and it succeeds on a prefix of the input, but I intended the second one to match my String m = "-2√32454i/18.5".
It seems the matcher stops at the first alternative that matches: alternatives separated by | are tried from left to right, and the first one that succeeds wins, even if a later one would match more of the input.
All I had to do was rearrange the order of the regexes that make up my regex, so that the longer alternative comes before the shorter one.
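The ordering effect can be reproduced with just the two alternatives from the question. This is a minimal sketch: the class name AlternationOrderDemo, the constants SHORT and LONG, and the helper firstMatch are made up for the illustration, and the other "a regex | another regex" parts are left out. The √ character is written as \u221A only to avoid source-encoding problems.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class AlternationOrderDemo {

    // \u221A is the square-root sign used in the question.
    static final String SHORT = "-?\\d+(\\.\\d*)?\u221A\\d+(\\.\\d*)?i";
    static final String LONG  = "-?\\d+(\\.\\d*)?\u221A\\d+(\\.\\d*)?i/\\d+(\\.\\d*)?";

    // Returns the text of the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "-2\u221A3254i/18.5";
        // Short alternative first: it succeeds, so the long one is never tried.
        System.out.println(firstMatch(SHORT + "|" + LONG, input)); // -2√3254i
        // Long alternative first: the whole expression is matched.
        System.out.println(firstMatch(LONG + "|" + SHORT, input)); // -2√3254i/18.5
    }
}
```

An alternative to reordering, when the whole input must be covered, is to call matches() instead of find(), since matches() backtracks into later alternatives until one spans the entire string.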

Why is the regex not working properly?

I have some strings like
s3://my-source-bucket/molomics/molecules35455720556210282.csv or,
s3://my-source-bucket/molecules10282.csv
s3://my-source-bucket/molename
Criteria:
1. the `s3://` portion is fixed
2. the bucket name consists of letters, numbers, dashes (-) and dots (.), e.g.
my-source-bucket, and is followed by /
3. Number 2 repeats one or more times
4. There is no / at the end
I would like to match them using a regex. I use the small program below to get the matches:
public static void findMatchUsingRegex(String input) {
String REGEX = "(w+://)([0-9A-Za-z-]+/)([0-9A-Za-z-/]+)([0-9A-Za-z-.]+)?";
Pattern p = Pattern.compile(REGEX);
Matcher m = p.matcher(input); // get a matcher object
while(m.find()) {
count++;
System.out.println("Match number "+count);
System.out.println("start(): "+m.start());
System.out.println("end(): "+m.end());
}
}
In the online editor, I find the matches. However, this doesn't return anything as expected when I actually run the program. How should I change the regex so that it works properly, and perhaps works better?
Some points in order
Criterion #1 states that s3:// is fixed, so you can use that explicitly.
You need to escape special regex characters like ., -, and /. Because you're writing the regex as a Java string, you'll need to use two backslashes: \\. to match the literal ..
It looks like you can simplify your pattern quite a bit.
I don't know exactly what findMatchUsingRegex is supposed to do, but make sure you want Matcher.find (which looks for the pattern anywhere in the input) rather than Matcher.matches (which requires the whole input to match).
A solution
s3:\/(\/[0-9A-Za-z\-\.]+)+
Note how the \/ comes first, so the string must end with a number, letter, ., or -. In Java, you'll need to write this as:
s3:\\/(\\/[0-9A-Za-z\\-\\.]+)+
(Technically, you don't need to escape - and . here. But that's probably good practice because they're special characters.)
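As a rough check of the suggested pattern against the sample inputs, here is a minimal sketch. The class name S3PathDemo and the method isValid are made up for the illustration, and it assumes whole-string validation is what is wanted, so it uses matches() rather than find().

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class S3PathDemo {

    // The pattern from the answer, written as a Java string literal.
    private static final Pattern S3_PATH =
            Pattern.compile("s3:\\/(\\/[0-9A-Za-z\\-\\.]+)+");

    // matches() succeeds only if the entire input fits the pattern.
    static boolean isValid(String input) {
        return S3_PATH.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("s3://my-source-bucket/molomics/molecules35455720556210282.csv")); // true
        System.out.println(isValid("s3://my-source-bucket/molecules10282.csv")); // true
        System.out.println(isValid("s3://my-source-bucket/molename"));           // true
        System.out.println(isValid("s3://my-source-bucket/molename/"));          // false: trailing slash
    }
}
```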

Java Regex Match

Given such Regex code:
Matcher m = Pattern.compile("c:.*?(|t:){1}.*?").matcher(string);
I only want to match something like c:somesubstring|t:somesubstring. However, it also matches something like this:
c:somesubstring
and
c:somesubstring|a:somesubtring
How can this be? I use (|t:){1} to guarantee that the pattern |t: occurs, and occurs only once. It would be helpful to know what's wrong with my regex, and to get a regex that matches only c:somesubstring|t:somesubstring.
| is a special metacharacter in regex that acts like a logical OR operator, used to combine two regexes. Unescaped, (|t:) means "either nothing or t:", which is why the group can match the empty string. You need to escape the | symbol so that it matches a literal |.
Matcher m = Pattern.compile("c:.*?(\\|t:){1}.*?").matcher(string);
Or, much shorter:
Matcher m = Pattern.compile("c:.*?\\|t:.*?").matcher(string);
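The effect of the escape can be checked with a small sketch. The class name PipeEscapeDemo and its method names are made up for the illustration, and whole-string matching via matches() is an assumption, since the question does not show how the Matcher is used. The third pattern is an extra assumption on top of the answer: it enforces "only once" by forbidding further | characters, which only works if the substrings themselves never contain |.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class PipeEscapeDemo {

    // Original pattern: (|t:) is "empty OR t:", so the group may match nothing.
    static boolean unescaped(String s) {
        return Pattern.compile("c:.*?(|t:){1}.*?").matcher(s).matches();
    }

    // Escaped pipe: a literal "|t:" must be present.
    static boolean escaped(String s) {
        return Pattern.compile("c:.*?\\|t:.*?").matcher(s).matches();
    }

    // Assumption: substrings contain no '|', so "|t:" can occur exactly once.
    static boolean exactlyOnce(String s) {
        return Pattern.compile("c:[^|]*\\|t:[^|]*").matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(unescaped("c:somesubstring"));                 // true
        System.out.println(escaped("c:somesubstring"));                   // false
        System.out.println(escaped("c:somesubstring|a:somesubtring"));    // false
        System.out.println(escaped("c:somesubstring|t:somesubstring"));   // true
        System.out.println(exactlyOnce("c:a|t:b|t:c"));                   // false
    }
}
```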

Java Regular expressions for filename

I want to check the filenames sent to me against two patterns.
The first regular expression is ~*~, which should match names like ~263~. I put this into online regular expression testers and it matches. The code doesn't work, though; it reports no match.
List<FTPFile> ret = new ArrayList<FTPFile>();
Pattern pattern = Pattern.compile("~*~");
Matcher matcher;
for (FTPFile file : files)
{
matcher = pattern.matcher(file.getName());
if(matcher.matches())
{
ret.add(file);
}
}
return ret;
Also, the second pattern I need is ##*, which should match strings like abc#ere#sss.
Please tell me the proper patterns in Java for this.
You need to define your pattern like this:
Pattern pattern = Pattern.compile("~.*~");
The ~* in your regex ~*~ repeats the first ~ zero or more times, so it won't match the number following the first ~. Because the matches method tries to match the whole input string, the match fails. You need to add .* in between to match strings like ~66~ or ~kjk~. To match only strings that have digits between the ~ characters, use ~\d+~ (written "~\\d+~" as a Java string).
Try the regex
\~.*\~
instead of
~*~
Example:
Pattern pattern = Pattern.compile("\\~.*\\~");
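The difference between the patterns, and a likely reason the online tester reported a match, can be sketched as follows. The class name TildeDemo and its helper names are made up for the illustration. It assumes the tester searched for the pattern anywhere in the text, which is what find() does, whereas the question's code uses matches().

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class TildeDemo {

    // Whole-string match, like matcher.matches() in the question.
    static boolean matchesWhole(String regex, String name) {
        return Pattern.matches(regex, name);
    }

    // Substring search, which is how many online testers behave.
    static boolean foundAnywhere(String regex, String name) {
        return Pattern.compile(regex).matcher(name).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("~*~", "~263~"));     // false: digits cannot be consumed
        System.out.println(foundAnywhere("~*~", "~263~"));    // true: a lone "~" is enough
        System.out.println(matchesWhole("~.*~", "~263~"));    // true
        System.out.println(matchesWhole("~\\d+~", "~263~"));  // true
        System.out.println(matchesWhole("~\\d+~", "~kjk~"));  // false: not digits
    }
}
```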

How to match a set of string in regexp

How do I combine ch..+ and ch..- in regexp effectively without having to scan separately?
And are we using matcher in the pattern?
My output code is like this:
ch01+
ch01-
ch02+
ch02-
...
How do I combine ch..+ and ch..- in regexp effectively without having to scan separately?
Use | (pipe) for alternation:
ch..(\+|-)
And are we using matcher in the pattern?
Depends on how you're using the regexp and the pattern. To get a concrete answer, you'll have to show some actual code, or ask a much more specific question.
N.B. If you want to restrict the two characters after ch to 0-9, you can use \d, which is a shorthand character class for [0-9]:
ch\d{2}(\+|-)
You can use a character class containing just "+" and "-" like so "[+-]".
Pattern p = Pattern.compile("ch..[+-]");
Matcher m = p.matcher("ch01+");
if (m.find()) {
// found it...
}
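To scan a longer piece of text for every ch..+ or ch..- entry in a single pass, the character-class pattern can be combined with a find() loop. This is a minimal sketch: the class name ChannelDemo, the method findAll and the sample input are made up for the illustration, and it assumes the two characters after ch are digits, as in the sample output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class ChannelDemo {

    // "ch", two digits, then either '+' or '-'.
    private static final Pattern CHANNEL = Pattern.compile("ch\\d{2}[+-]");

    // Collects every match in the text, in order of appearance.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = CHANNEL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [ch01+, ch01-, ch02+, ch02-]
        System.out.println(findAll("ch01+ ch01- ch02+ chxx* ch02-"));
    }
}
```

Inside a character class, + needs no escaping, and - is literal when it comes last, which is why [+-] is shorter than (\+|-).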
