How do I get the value between () in a Java string just before the dot? - java

I have to rename the file abc(1).jpg to abc(2).jpg. Here is the code:
String example = "my attachements with some name (56).jpg";
Matcher m = Pattern.compile("\\((\\d{1,}).\\)").matcher(example);
int a = 0;
while(m.find()) {
a=Integer.parseInt(m.group(1));
String p = example.replace(String.valueOf(a), String.valueOf(a+1));
}
It works fine for the given use case, but it fails for abc(ab)(1)(ab).jpg: that name is changed to abc(ab)(2)(ab).jpg, which is not what I want. How can I verify that the numeric bracket is immediately before the dot?

You may use
String example = "my attachements with some name (56).jpg";
Matcher m = Pattern.compile("(?<=\\()\\d+(?=\\)\\.)").matcher(example);
example = m.replaceAll(r -> String.valueOf(Integer.parseInt(r.group()) + 1)); // Matcher.replaceAll(Function) requires Java 9+
System.out.println( example );
// => my attachements with some name (57).jpg
The regex used is
(?<=\()\d+(?=\)\.)
It matches
(?<=\() - a location immediately preceded with (
\d+ - then consumes 1+ digits
(?=\)\.) - immediately followed with ). char sequence.
If you need the regex to match only before the last dot in the string (most likely the extension delimiter), replace (?=\)\.) with (?=\)\.[^.]*$).
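A minimal sketch of the last-dot variant, assuming Java 9+ (for Matcher.replaceAll with a function). The class name CounterBump and the method name bump are made up for illustration and are not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CounterBump {
    // Digits preceded by "(" and followed by ")." where that dot is the last dot in the name.
    private static final Pattern COUNTER =
            Pattern.compile("(?<=\\()\\d+(?=\\)\\.[^.]*$)");

    // Hypothetical helper: increments the counter that sits right before the extension.
    static String bump(String fileName) {
        Matcher m = COUNTER.matcher(fileName);
        // r is the MatchResult of the current match; only the digits are replaced.
        return m.replaceAll(r -> String.valueOf(Integer.parseInt(r.group()) + 1));
    }

    public static void main(String[] args) {
        System.out.println(bump("abc(1).jpg"));         // abc(2).jpg
        System.out.println(bump("abc(ab)(1)(ab).jpg")); // unchanged: (1) is not right before the dot
    }
}
```

Names without a "(digits)." part directly before the extension are returned unchanged.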

You can use a lookahead regex for this:
"\\((\\d+)\\)(?=\\.)"
(?=\.) is a lookahead condition that asserts presence of dot right after closing )
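This answer gives only the pattern, so here is one possible way to use it, sketched with Matcher.appendReplacement and appendTail (which also work on Java 8). The class and method names are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadBump {
    // "(digits)" captured in group 1, only when a dot follows the closing parenthesis.
    private static final Pattern NUM_BEFORE_DOT = Pattern.compile("\\((\\d+)\\)(?=\\.)");

    // Hypothetical helper: rebuilds the string, replacing each match with "(n+1)".
    static String bump(String name) {
        Matcher m = NUM_BEFORE_DOT.matcher(name);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int next = Integer.parseInt(m.group(1)) + 1;
            // The match includes the parentheses, so they are written back explicitly.
            m.appendReplacement(out, "(" + next + ")");
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(bump("abc(1).jpg"));         // abc(2).jpg
        System.out.println(bump("abc(ab)(1)(ab).jpg")); // unchanged
    }
}
```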

Related

How to replace a part of email address using regex?

I want to replace part of an email address using a regex. How can I do it?
For example, given the email address test.email+alex#gmail.com, I want to remove everything from + up to (but not including) #, so that the final string is test.email#gmail.com.
I tried the following:
str.replaceAll("[^+[a-z]]","");
You can try this:
\+[^#]*
Explanation:
\+ matches + where \ is the escape character
[^#]* matches anything until it reaches #, where * means zero or more
The code is given below:
final String string = "test.email+alex#gmail.com";
final Pattern pattern = Pattern.compile("\\+[^#]*");
final Matcher matcher = pattern.matcher(string);
final String result = matcher.replaceAll("");
If you want to match either a dot or a plus sign till an #, you could use a positive lookahead to assert an # on the right for both cases and list each option using an alternation.
(?:\.|\+[^#]*)(?=.*#)
Explanation
(?: Non capture group
\. Match a dot
| Or
\+[^#]* Match + and 0+ times any char except #
) Close group
(?=.*#) Positive lookahead, assert an # to the right
In Java
str.replaceAll("(?:\\.|\\+[^#]*)(?=.*#)","")
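To compare the two patterns from the answers above side by side, here is a small sketch. It keeps the # separator exactly as written in the question; the class and method names are invented for the example.

```java
class PlusTagDemo {
    // First answer: removes "+tag" up to, but not including, the '#' separator.
    static String stripPlusTag(String address) {
        return address.replaceAll("\\+[^#]*", "");
    }

    // Second answer: additionally removes every dot that still has a '#' to its right.
    static String stripDotsAndPlusTag(String address) {
        return address.replaceAll("(?:\\.|\\+[^#]*)(?=.*#)", "");
    }

    public static void main(String[] args) {
        String address = "test.email+alex#gmail.com";
        System.out.println(stripPlusTag(address));        // test.email#gmail.com
        System.out.println(stripDotsAndPlusTag(address)); // testemail#gmail.com
    }
}
```

Note that the second pattern also strips the dot in the local part, which goes beyond what the question asked for; the dot in gmail.com survives because no # follows it.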

find substring using match regex

How can I use a regex to find a substring in another string? Here are two strings:
String a= "?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget> ?disease .";
String b = "?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeightAverage> ?weight . ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget> ?disease";
I want to match only
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget>
Since this is not quite HTML or XML, a parser will not help here, so you can try a regex. It seems that you want to find text of the form
?drug <someData> ?disease
To describe such text with a regex, you need to escape ? (it is a regex special character, the optional quantifier meaning zero or one occurrence) by placing \ before it (which in a Java String literal is written as "\\").
The <someData> part can be written as <[^>]+>, which means
<,
one or more non > after it,
and finally >
So regex to match ?drug <someData> ?disease can be written as
"\\?drug <[^>]+> \\?disease"
But since we are interested only in the <[^>]+> part representing <someData>, we need the regex to capture it in a group. If we surround part of a regex with parentheses, the text matched by that part is stored in a numbered group, and we can retrieve it from the matcher afterwards. The final regex can look like
"\\?drug (<[^>]+>) \\?disease"
         ^^^^^^^^^---first group,
and can be used like
String a = "?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget> ?disease .";
String b = "?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeightAverage> ?weight . ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget> ?disease";
Pattern p = Pattern.compile("\\?drug (<[^>]+>) \\?disease");
Matcher m = p.matcher(a);
while (m.find()) {
System.out.println(m.group(1));
}
System.out.println("-----------");
m = p.matcher(b);
while (m.find()) {
System.out.println(m.group(1));
}
which will produce as output
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget>
-----------
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget>
There's no need to use a regex here; just do this:
String substr = "<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget>";
System.out.println(b.contains(substr)); // prints true
System.out.println(a.contains(substr)); // prints true

Remove occurrences of a given character sequence at the beginning of a string using Java Regex

I have a string that begins with one or more occurrences of the sequence "Re:". This "Re:" can be of any combinations, for ex. Re<any number of spaces>:, re:, re<any number of spaces>:, RE:, RE<any number of spaces>:, etc.
Sample sequence of string : Re: Re : Re : re : RE: This is a Re: sample string.
I want to define a java regular expression that will identify and strip off all occurrences of Re:, but only the ones at the beginning of the string and not the ones occurring within the string.
So the output should look like This is a Re: sample string.
Here is what I have tried:
String REGEX = "^(Re*\\p{Z}*:?|re*\\p{Z}*:?|\\p{Z}Re*\\p{Z}*:?)";
String INPUT = title;
String REPLACE = "";
Pattern p = Pattern.compile(REGEX);
Matcher m = p.matcher(INPUT);
while(m.find()){
m.appendReplacement(sb,REPLACE);
}
m.appendTail(sb);
I am using \p{Z} to match whitespace (I found this somewhere on this forum, as I read that Java regex does not recognize \s).
The problem with this code is that the search stops after the first match and exits the while loop.
Try something like this replace statement:
yourString = yourString.replaceAll("(?i)^(\\s*re\\s*:\\s*)+", "");
Explanation of the regex:
(?i) make it case insensitive
^ anchor to start of string
( start a group (this is the "re:")
\\s* any amount of optional whitespace
re "re"
\\s* optional whitespace
: ":"
\\s* optional whitespace
) end the group (the "re:" string)
+ one or more times
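As a quick check of the pattern explained above, here is a sketch that runs it on the sample string from the question. The class and method names are made up for the example.

```java
class ReplyPrefixDemo {
    // Strips any run of "re:" prefixes (any case, optional whitespace) at the start only.
    static String stripReplyPrefixes(String subject) {
        return subject.replaceAll("(?i)^(\\s*re\\s*:\\s*)+", "");
    }

    public static void main(String[] args) {
        String title = "Re: Re : Re : re : RE: This is a Re: sample string.";
        System.out.println(stripReplyPrefixes(title)); // This is a Re: sample string.
        // "Regarding" starts with "Re" but has no colon right after it, so nothing is removed.
        System.out.println(stripReplyPrefixes("Regarding: budget"));
    }
}
```

Because the whole group is repeated with + and anchored with ^, a single replaceAll call removes all leading prefixes and leaves the "Re:" inside the sentence alone.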
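In your regex:
String regex = "^(Re*\\p{Z}*:?|re*\\p{Z}*:?|\\p{Z}Re*\\p{Z}*:?)";
Re* means an R followed by zero or more e characters, and the colon is optional, so
it matches strings like:
" Reee :" (a space, R, several e characters, a space and a colon) or
"R   " (an R followed only by spaces),
which makes no sense for what you are trying to do.
You'd better use a regex like the following: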
yourString.replaceAll("(?i)^(\\s*re\\s*:\\s*)+", "");
Or, to make @Doorknob happy, here's another way to achieve this, using a Matcher:
Pattern p = Pattern.compile("(?i)^(\\s*re\\s*:\\s*)+");
Matcher m = p.matcher(yourString);
if (m.find())
yourString = m.replaceAll("");
(which, as the documentation says, is exactly the same thing as yourString.replaceAll())
(I had the same regex as @Doorknob, but thanks to @jlordo for the replaceAll and @Doorknob for thinking about the (?i) case insensitivity part ;-) )

How to create a java regular expression pattern that would match a string only at certain positon?

I would like to create a regular expression pattern that matches only if the pattern string is not followed by any other text in the input string. Here is what I tried:
Pattern p = Pattern.compile("google.com");//I want to know the right format
String input1 = "mail.google.com";
String input2 = "mail.google.com.co.uk";
Matcher m1 = p.matcher(input1);
Matcher m2 = p.matcher(input2);
boolean found1 = m1.find();
boolean found2 = m2.find();//This should be false because "google.com" is followed by ".co.uk" in input2 string
Any help would be appreciated!
Your pattern should be google\.com$. The $ character matches the end of a line. Read about regex boundary matchers for details.
Here is how to match and get the non-matching part as well.
^(.*)google\.com$
^ - match beginning of string
(.*) - capture everything in a group up to the next match
google - matches google literal
\. - matches a literal . (the dot has to be escaped with \)
com - matches com literal
$ - matches end of string
Note: In Java the \ in the String literal has to be escaped as well! ^(.*)google\\.com$
You should use google\.com$. $ character matches the end of a line.
Pattern p = Pattern.compile("google\\.com$");//I want to know the right format
String input2 = "mail.google.com.co.uk";
Matcher m2 = p.matcher(input2);
boolean found2 = m2.find();
System.out.println(found2);
Output = false
Pattern p = Pattern.compile("google\\.com$");
The dollar sign means it has to occur at the end of the line/string being tested. Note too that your dot will match any character, so if you want it to match a dot only, you need to escape it.
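The two points made in the answers above (anchoring with $ and escaping the dot) can be seen together in a short sketch. The class and method names are hypothetical, and mail.googleXcom is an invented input used only to show what the unescaped dot lets through.

```java
import java.util.regex.Pattern;

class SuffixMatchDemo {
    private static final Pattern ESCAPED = Pattern.compile("google\\.com$");
    private static final Pattern UNESCAPED = Pattern.compile("google.com$");

    // True only when the input ends with the literal text "google.com".
    static boolean endsWithGoogleCom(String host) {
        return ESCAPED.matcher(host).find();
    }

    // Same check with an unescaped dot, which accepts any character in that position.
    static boolean looseEndsWithGoogleCom(String host) {
        return UNESCAPED.matcher(host).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithGoogleCom("mail.google.com"));       // true
        System.out.println(endsWithGoogleCom("mail.google.com.co.uk")); // false
        System.out.println(looseEndsWithGoogleCom("mail.googleXcom"));  // true
        System.out.println(endsWithGoogleCom("mail.googleXcom"));       // false
    }
}
```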

Using a special email regular expression

I have some emails in the form:
staticN123#sub1.mydomain.com
staticN456#sub2.mydomain.com
staticN789#sub3-sub.mydomain.com
The dynamic parts are the number after the N (or M or F) character and the subdomain between the # and mydomain.com.
I want to make a regular expression that matches this form in a string, and if it's a match, get the number after the N character.
staticN([0-9]+)#.+\.mydomain\.com
Instead of [0-9]+ you can also use \d+, which is the same.
The .+ after the # could match too much; you may want to replace it with [^\.]+ to exclude sub.sub domains.
update:
^staticN(\d+)#[a-z0-9_-]+\.mydomain\.com$
Adding ^ and $ anchors the match to the start and end of the search string, which avoids a false match on e.g. somthingwrong_staticN123#sub.mydomain.com.xyz
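A small sketch of how the anchored pattern behaves on the addresses from the question. The class name, method name and the use of OptionalInt are choices made for this example, not something the answer prescribes.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StaticAddressDemo {
    private static final Pattern ADDRESS =
            Pattern.compile("^staticN(\\d+)#[a-z0-9_-]+\\.mydomain\\.com$");

    // Returns the number after "staticN", or an empty result when the address does not match.
    static OptionalInt extractNumber(String address) {
        Matcher m = ADDRESS.matcher(address);
        return m.matches()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractNumber("staticN123#sub1.mydomain.com"));     // OptionalInt[123]
        System.out.println(extractNumber("staticN789#sub3-sub.mydomain.com")); // OptionalInt[789]
        // Rejected: extra text before "static" and after ".com".
        System.out.println(extractNumber("somthingwrong_staticN123#sub.mydomain.com.xyz"));
    }
}
```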
--
applying changes discussed in comments below:
^(?:.+<)?static[NMF](\d+)#[a-z0-9_-]+\.mydomain\.com>?$
code example to answer the question in one of the comments:
// input
String str = "reply <staticN123#sub1.mydomain.com";
// example 1
String nr0 = str.replaceAll( "^(?:.+<)?static[NMF](\\d+)#[a-z0-9_-]+\\.mydomain\\.com>?$", "$1" );
System.out.println( nr0 );
// example 2 (precompile regex is faster if it's used more than once afterwards)
Pattern p = Pattern.compile( "^(?:.+<)?static[NMF](\\d+)#[a-z0-9_-]+\\.mydomain\\.com>?$" );
Matcher m = p.matcher( str );
if ( m.matches() ) { // m.group is only available after a successful m.matches()
String nr1 = m.group( 1 );
System.out.println( nr1 );
}
