Java regex for replacement with delimiter as : or $

I have jobs that follow non-standard naming conventions; some examples are below:
==========================================================
Job Name                      | New Name
----------------------------- ----------------------------
JOB:/Level0_APP1_12345_0/       JOB
JOB:Level1_DBASW_t323dk23_p1    JOB
JOB$SAV:                        JOB
backup:SYNC1                    backup
QUERY:logs                      QUERY
QUERY$maps:                     QUERY
QUERY:                          QUERY
FS1:\                           FS1:\     -- No change in name
PS:\MXMI                        PS:\MXMI  -- No change in name
==========================================================
The delimiter is either a colon (:) or a dollar sign ($), whichever comes first. Also, the regex should not change jobs that have (:\) in the name, as shown in the last 2 examples.
I've used the below, but without success
Regex:
(:|\$[a-zA-Z\/0-9]+)|(\$[a-zA-Z\/0-9]+)|(:$)
(.*)((\:|\$)([a-zA-Z\/0-9]+|$))
(.*)((\:|\$)(.*|$))
Substitution -> $1

I would use a simple regex here:
^(.*?)(?::(?!\\)|\$).*
It matches:
^ - start of string
(.*?) - capture into Group 1 as few symbols (other than a newline) as possible before the first...
(?::(?!\\)|\$) - either a : that is not followed by \ (the :(?!\\) alternative) or a literal $ (the \$ alternative)
.* - match the rest of the line
See IDEONE demo:
List<String> strs = Arrays.asList("JOB:/Level0_APP1_12345_0/", "JOB:Level1_DBASW_t323dk23_p1",
"JOB$SAV:", "backup:SYNC1","QUERY:logs","QUERY$maps:","QUERY:","FS1:\\","PS:\\MXMI");
for (String str : strs)
System.out.println(str.replaceAll("^(.*?)(?::(?!\\\\)|\\$).*", "$1"));
Output:
JOB
JOB
JOB
backup
QUERY
QUERY
QUERY
FS1:\
PS:\MXMI

Try this:
^(\w+(?=:\\.+):\\.+|[^:$]+)
The first capturing group ($1) is what you are looking for
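A minimal sketch of reading group 1 with a Matcher, since this pattern is meant to be matched rather than used in a replacement. The class and method names (JobNameExtractor, newName) are made up for illustration. One deviation is labeled in the code: as written above, `:\\.+` needs at least one character after `:\`, so a bare `FS1:\` would come back as `FS1`; the sketch assumes `.*` instead so that name stays unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JobNameExtractor {
    // Assumption: ".*" replaces the ".+" of the pattern above so that "FS1:\" is kept whole.
    private static final Pattern NAME =
            Pattern.compile("^(\\w+(?=:\\\\.*):\\\\.*|[^:$]+)");

    // Hypothetical helper: returns group 1, or the input unchanged when nothing matches.
    static String newName(String jobName) {
        Matcher m = NAME.matcher(jobName);
        return m.find() ? m.group(1) : jobName;
    }

    public static void main(String[] args) {
        String[] jobs = {
            "JOB:/Level0_APP1_12345_0/", "JOB$SAV:", "backup:SYNC1",
            "QUERY:", "FS1:\\", "PS:\\MXMI"
        };
        for (String job : jobs) {
            System.out.println(job + " -> " + newName(job));
        }
    }
}
```

The first alternative only succeeds when the colon is immediately followed by a backslash; every other name falls through to `[^:$]+`, which stops at the first `:` or `$`.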

Related

Regular expression: Replace everything before first occurrence

I have the following regular expression that I'm using to remove the dev. part of my URL.
String domain = "dev.mydomain.com";
System.out.println(domain.replaceAll(".*\\.(?=.*\\.)", ""));
This outputs mydomain.com, but it gives me issues when the domains are in the vein of dev.mydomain.com.pe or dev.mydomain.com.uk: in those cases I get only the .com.pe and .com.uk parts.
Is there a modifier I can use on my regex to make sure it only takes what is before the first . (dot included)?
Desired output:
dev.mydomain.com -> mydomain.com
stage.mydomain.com.pe -> mydomain.com.pe
test.mydomain.com.uk -> mydomain.com.uk
You may use
^[^.]+\.(?=.*\.)
See the regex demo.
Details
^ - start of string
[^.]+ - 1 or more chars other than dots
\. - a dot
(?=.*\.) - followed by any 0 or more chars other than line break chars, as many as possible, and then a dot (.)
Java usage example:
String result = domain.replaceFirst("^[^.]+\\.(?=.*\\.)", "");
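As a small runnable sketch of the replaceFirst call above (the class and method names are illustrative only), the helper below strips the first label and shows that a two-label domain is left alone, because the lookahead requires one more dot after the removed prefix.

```java
class SubdomainStripper {
    // Removes the first label and its dot, but only when another dot follows.
    static String stripFirstLabel(String domain) {
        return domain.replaceFirst("^[^.]+\\.(?=.*\\.)", "");
    }

    public static void main(String[] args) {
        String[] domains = {
            "dev.mydomain.com", "stage.mydomain.com.pe",
            "test.mydomain.com.uk", "mydomain.com"
        };
        for (String d : domains) {
            System.out.println(d + " -> " + stripFirstLabel(d));
        }
    }
}
```

Note that the pattern only counts dots: an input such as mydomain.com.pe with no environment prefix would also lose its first label.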
The following regex will work for you. It finds the first part (if it exists), captures the rest of the string as the 2nd group, and replaces the whole string with that group. .*? is a non-greedy match that stops at the first dot character.
(.*?\.)?(.*\..*)
Regex Demo
sample code:
String domain = "dev.mydomain.com";
System.out.println(domain.replaceAll("(.*?\\.)?(.*\\..*)", "$2"));
domain = "stage.mydomain.com.pe";
System.out.println(domain.replaceAll("(.*?\\.)?(.*\\..*)", "$2"));
domain = "test.mydomain.com.uk";
System.out.println(domain.replaceAll("(.*?\\.)?(.*\\..*)", "$2"));
domain = "mydomain.com";
System.out.println(domain.replaceAll("(.*?\\.)?(.*\\..*)", "$2"));
output:
mydomain.com
mydomain.com.pe
mydomain.com.uk
mydomain.com

Replacing quotes in a Java String only on specific places

We have a String as below.
\config\test\[name="sample"]\identifier["2"]\age["3"]
I need to remove the quotes surrounding the numbers. For example, the above string after replacement should look like below.
\config\test\[name="sample"]\identifier[2]\age[3]
Currently I'm trying with the regex as below
String.replaceAll("\"\\\\d\"", "");
This is replacing the numbers also. Please help to find out a regex for this.
You can use replaceAll with the regex "(\d+)" and replace each match with the content of the capturing group (\d+), referenced as $1:
String str = "\\config\\test\\[name=\"sample\"]\\identifier[\"2\"]\\age[\"3\"]";
str = str.replaceAll("\"(\\d+)\"", "$1");
//----------------------^____^------^^
Output
\config\test\[name="sample"]\identifier[2]\age[3]
regex demo
Take a look at capturing groups.
We can try doing a blanket replacement of the following pattern:
\["(\d+)"\]
And replacing it with this:
\[$1\]
Note that we specifically target quoted numbers only appearing in square brackets. This minimizes the risk of accidentally doing an unintended replacement.
Code:
String input = "\\config\\test\\[name=\"sample\"]\\identifier[\"2\"]\\age[\"3\"]";
input = input.replaceAll("\\[\"(\\d+)\"\\]", "[$1]");
System.out.println(input);
Output:
\config\test\[name="sample"]\identifier[2]\age[3]
Demo here:
Rextester
You can use:
(?:"(?=\d)|(?<=\d)")
and replace it with nothing (the empty string "")
fast test:
echo '\config\test\[name="sample"]\identifier["2"]\age["3"]' | perl -lpe 's/(?:"(?=\d)|(?<=\d)")//g'
the output:
\config\test\[name="sample"]\identifier[2]\age[3]
test2:
echo 'identifier["123"]\age["456"]' | perl -lpe 's/(?:"(?=\d)|(?<=\d)")//g'
the output:
identifier[123]\age[456]
NOTE
If there is only a single double quote (") on each side of the number, this works fine; otherwise, add the quantifier + to the " at both the beginning and the end.
test3:
echo '"""""1234234"""""' | perl -lpe 's/(?:"+(?=\d)|(?<=\d)"+)//g'
the output:
1234234
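Since the tests above use Perl, here is a hedged Java sketch of the same lookaround idea, using the `+` variant from test3. The class and method names are invented for illustration; only String.replaceAll is used.

```java
class QuoteStripper {
    // Drops runs of double quotes that are directly followed by a digit,
    // or directly preceded by a digit. The digits themselves are never consumed.
    static String stripQuotesAroundDigits(String s) {
        return s.replaceAll("\"+(?=\\d)|(?<=\\d)\"+", "");
    }

    public static void main(String[] args) {
        String input = "\\config\\test\\[name=\"sample\"]\\identifier[\"2\"]\\age[\"3\"]";
        System.out.println(stripQuotesAroundDigits(input));
        System.out.println(stripQuotesAroundDigits("\"\"\"\"\"1234234\"\"\"\"\""));
    }
}
```

Because each quote is tested on its own, a value such as name="2nd" would lose only its opening quote; the bracket-anchored pattern shown earlier avoids that case.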

Modification of java regex

I was using the below regex to substitute file names
Regex -> .*\/([A-Z0-9_]{1,9})_(O).*.cmd
Substitution -> $1
The file names were like below:
File Name                         | Substituted Name
---------------------------------- ------------------
/V3/OGM_REC_Offline_Level0_4D.cmd   OGM_REC
/V2/PIE_PROD_Online_Level1_6D.cmd   PIE_PROD
/V3/BR2_OnDemand.cmd                BR2
/opt/STING_Online_Inc0_1W.cmd       STING
Then the files changed and I modified the regex
Regex -> .*\/([A-Z0-9_]{1,9})(_O|Full).*.cmd
Substitution -> $1
Additional new file names
File Name             | Substituted Name
---------------------- ------------------
/opt/RSU10Full.cmd      RSU10
/V4/REZ40_1Full.cmd     REZ40_1
Now it seems new files are arriving with the name formats below:
/app/OMGIT_FullOnDemand_4W.cmd
/admin/FOC_STG_Full_6D.cmd
I've modified the regex again, but without success:
Regex -> .*\/([A-Z0-9_]{1,9})(_O|Full|_Full).*.cmd
Substitution -> $1
I suggest using a version with a lazy limiting quantifier {1,9}? and optional _:
.*/([A-Z0-9_]{1,9}?)(_O|_?Full).*[.]cmd
This way, [A-Z0-9_]{1,9}? matches as few characters as possible while still producing a valid capture, and the _?Full part allows for the optional underscore.
See the regex demo
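A short sketch of how the lazy pattern above could be applied in Java, run against the file names from the question. CmdNameExtractor and shortName are illustrative names, and falling back to the original path when nothing matches is an assumption of this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CmdNameExtractor {
    // {1,9}? is lazy: group 1 grows one character at a time until "_O" or "_?Full" can follow.
    private static final Pattern CMD =
            Pattern.compile(".*/([A-Z0-9_]{1,9}?)(_O|_?Full).*[.]cmd");

    // Hypothetical helper: returns group 1, or the path unchanged when it does not match.
    static String shortName(String path) {
        Matcher m = CMD.matcher(path);
        return m.matches() ? m.group(1) : path;
    }

    public static void main(String[] args) {
        String[] paths = {
            "/V3/OGM_REC_Offline_Level0_4D.cmd", "/V3/BR2_OnDemand.cmd",
            "/opt/RSU10Full.cmd", "/V4/REZ40_1Full.cmd",
            "/app/OMGIT_FullOnDemand_4W.cmd", "/admin/FOC_STG_Full_6D.cmd"
        };
        for (String p : paths) {
            System.out.println(p + " -> " + shortName(p));
        }
    }
}
```

With the greedy {1,9}, group 1 can swallow the underscore before Full (for example OMGIT_), which is why the lazy form is used here.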
I've noticed that the unnecessary tail always starts with an optional _, followed by an uppercase letter and then a lowercase letter.
So, a universal solution is:
.*\/([^a-z]*?)[_]?[A-Z][a-z].*
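To check that observation against the file names in the question, here is a hedged Java sketch; TailTrimmer and shortName are made-up names, and returning the input unchanged on a non-match is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TailTrimmer {
    // Group 1 lazily takes non-lowercase characters until an optional "_"
    // followed by an uppercase and then a lowercase letter appears.
    private static final Pattern TAIL =
            Pattern.compile(".*\\/([^a-z]*?)[_]?[A-Z][a-z].*");

    // Hypothetical helper: returns group 1, or the path unchanged when it does not match.
    static String shortName(String path) {
        Matcher m = TAIL.matcher(path);
        return m.matches() ? m.group(1) : path;
    }

    public static void main(String[] args) {
        String[] paths = {
            "/V2/PIE_PROD_Online_Level1_6D.cmd", "/opt/STING_Online_Inc0_1W.cmd",
            "/V4/REZ40_1Full.cmd", "/app/OMGIT_FullOnDemand_4W.cmd"
        };
        for (String p : paths) {
            System.out.println(p + " -> " + shortName(p));
        }
    }
}
```

This version does not depend on the literal words Online, OnDemand, or Full, so it relies entirely on the kept prefix containing no lowercase letters.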

Get String in between either single quotes or empty space

I would like a regular expression that gives me the class loader name when it is delimited either by single quotes or by whitespace, but not by a mixture of both.
Some examples:
2014-05-21 22:05:13.685 TRACE [Core]
sun.misc.Launcher$AppClassLoader#62c8aeb3 searching for resource
'java/util/LoggerFactory.class'.
expected output sun.misc.Launcher$AppClassLoader#62c8aeb3
2014-05-21 22:05:13.685 TRACE [Core] Class
'org.jboss.weld.metadata.TypeStore' not found in classloader
'org.jboss.modules.ModuleClassLoader#4ebded0b'.
expected output org.jboss.modules.ModuleClassLoader#4ebded0b
2014-05-21 22:04:34.591 INFO [Core] Started plugin
org.zeroturnaround.javarebel.integration.IntegrationPlugin from
/Users/endragor/Downloads/jrebel/jrebel.jar in
sun.misc.Launcher$AppClassLoader#62c8aeb3
expected output sun.misc.Launcher$AppClassLoader#62c8aeb3
Note that in the last example the line ends with a newline character, i.e. nothing follows the class loader name.
This is what I have tried ".*[\\s|'](.*ClassLoader.*[^']*)['|\\s].*". But it doesn't work. For the first example it gives below rather than sun.misc.Launcher$AppClassLoader#62c8aeb3:
sun.misc.Launcher$AppClassLoader#62c8aeb3 searching for resource
'java/util/LoggerFactory.class
Also, my regex does not handle the case where the class loader string is at the end of the line, i.e. example 3 above. What can I do so that either ' or \\s is treated as the delimiter, but not a mixture of both?
Try this one :
String extractedValue=yourString.replaceAll("(.*)([ '])(.*ClassLoader.*?)(\\2)(.*)", "$3");
Whenever we want to extract a string between a predefined set of delimiters, where the first and last delimiter must have the same value, we can use the backreference feature.
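A minimal sketch of that backreference pattern used through a Matcher instead of replaceAll, so that a non-matching line is reported rather than returned unchanged. The class name, method name, and the Optional return type are choices made for this illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimitedClassLoaderName {
    // Group 2 captures the opening delimiter (space or single quote);
    // \2 then requires the very same character as the closing delimiter.
    private static final Pattern P =
            Pattern.compile("(.*)([ '])(.*ClassLoader.*?)(\\2)(.*)");

    static Optional<String> extract(String line) {
        Matcher m = P.matcher(line);
        return m.find() ? Optional.of(m.group(3)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "2014-05-21 22:05:13.685 TRACE [Core] "
                + "sun.misc.Launcher$AppClassLoader#62c8aeb3 searching for resource "
                + "'java/util/LoggerFactory.class'.";
        System.out.println(extract(line).orElse("(no match)"));
    }
}
```

Because the pattern insists on a closing delimiter, a name at the very end of a line (example 3 in the question) yields an empty result in this sketch.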
this regex should do without grouping:
[^\s']*ClassLoader[^\s']*
in java it should be:
[^\\s']*ClassLoader[^\\s']*
you don't need the pipe | in [..], in regex [abcd] means a or b or c or d
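Update, adding Java code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClassLoaderNameDemo {
    public static void main(String[] args) {
        // Match a run of non-space, non-quote characters that contains "ClassLoader"
        Pattern p = Pattern.compile("[^\\s']*ClassLoader[^\\s']*");
        Matcher m = p.matcher("2014-05-21 22:05:13.685 TRACE [Core] sun.misc.Launcher$AppClassLoader#62c8aeb3 searching for resource 'java/util/LoggerFactory.class'.");
        if (m.find()) {
            System.out.println(m.group());
        }
    }
}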
output:
sun.misc.Launcher$AppClassLoader#62c8aeb3
As the OP has changed the original post, here is the updated answer.
Simply use the pattern below, which checks for the default toString() representation of the Object class.
[\w\.$]+ClassLoader#[a-z0-9]+
Pattern explanation:
\w A word character: [a-zA-Z_0-9]
X+ X, one or more times
[abc] a, b, or c (simple class)
Here is the DEMO
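As a hedged illustration, the pattern above can be applied with Matcher.find() to all three log lines from the question, including the one where the name ends the line. ClassLoaderIdFinder and firstId are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClassLoaderIdFinder {
    // Word characters, dots and '$', then the literal "ClassLoader#" and a lowercase alphanumeric id.
    private static final Pattern ID =
            Pattern.compile("[\\w\\.$]+ClassLoader#[a-z0-9]+");

    static Optional<String> firstId(String line) {
        Matcher m = ID.matcher(line);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "2014-05-21 22:04:34.591 INFO [Core] Started plugin "
                + "org.zeroturnaround.javarebel.integration.IntegrationPlugin from "
                + "/Users/endragor/Downloads/jrebel/jrebel.jar in "
                + "sun.misc.Launcher$AppClassLoader#62c8aeb3";
        System.out.println(firstId(line).orElse("(no match)"));
    }
}
```

Since neither a quote nor whitespace belongs to the character classes, the surrounding delimiters never end up in the match, whichever kind they are.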

java regexp for reluctant matching

I need to find an expression for the following problem:
String given = "{ \"questionID\" :\"4\", \"question\":\"What is your favourite hobby?\",\"answer\" :\"answer 4\"},{ \"questionID\" :\"5\", \"question\" :\"What was the name of the first company you worked at?\",\"answer\" :\"answer 5\"}";
What I want to get: "{ \"questionID\" :\"4\", \"question\":\"What is your favourite hobby?\",\"answer\" :\"*******\"},{ \"questionID\" :\"5\", \"question\" :\"What was the name of the first company you worked at?\",\"answer\" :\"******\"}";
What I am trying:
String regex = "(.*answer\"\\s:\"){1}(.*)(\"[\\s}]?)";
String rep = "$1*****$3";
System.out.println(given.replaceAll(regex, rep));
What I am getting:
"{ \"questionID\" :\"4\", \"question\":\"What is your favourite hobby?\",\"answer\" :\"answer 4\"},{ \"questionID\" :\"5\", \"question\" :\"What was the name of the first company you worked at?\",\"answer\" :\"******\"}";
Because of the greedy behaviour, the first group catches both "answer" parts, whereas I want it to stop after finding enough, perform replacement, and then keep looking further.
The pattern
("answer"\s*:\s*")(.*?)(")
Seems to do what you want. Here's the escaped version for Java:
(\"answer\"\\s*:\\s*\")(.*?)(\")
The key here is to use (.*?) to match the answer and not (.*). The latter matches as many characters as possible, the former will stop as soon as possible.
The above pattern won't work if there are escaped double quotes (\") in the answer. Here's a more complex version that will allow them:
("answer"\s*:\s*")((.*?)[^\\])?(")
You'll have to use $4 instead of $3 in the replacement pattern.
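Both patterns from this answer can be tried out with String.replaceAll, as in the sketch below. The class and method names are illustrative, and the fixed mask of five asterisks follows the replacement string used in the question.

```java
class AnswerMasker {
    // Reluctant (.*?) stops at the first closing quote, so each "answer" value is masked separately.
    static String mask(String json) {
        return json.replaceAll("(\"answer\"\\s*:\\s*\")(.*?)(\")", "$1*****$3");
    }

    // Variant that treats a quote preceded by a backslash as part of the value;
    // the closing quote is group 4 here because of the extra nested group.
    static String maskAllowingEscapedQuotes(String json) {
        return json.replaceAll("(\"answer\"\\s*:\\s*\")((.*?)[^\\\\])?(\")", "$1*****$4");
    }

    public static void main(String[] args) {
        String given = "{ \"questionID\" :\"4\", \"answer\" :\"answer 4\"},"
                + "{ \"questionID\" :\"5\", \"answer\" :\"answer 5\"}";
        System.out.println(mask(given));
        System.out.println(maskAllowingEscapedQuotes("{\"answer\" :\"say \\\"hi\\\"\"}"));
    }
}
```

Dropping the leading .* from the original attempt matters as much as the reluctant quantifier: replaceAll can then find each "answer" occurrence on its own instead of one match spanning the whole string.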
The following regex works for me :
regex = "(?<=answer\"\\s:\")(answer.*?)(?=\"})";
rep = "*****";
given = given.replaceAll(regex, rep);
The \ and " might be incorrectly escaped since I tested without java.
http://regexr.com?303mm
