There is a piece of code that removes C/O, D/O, S/O, or W/O as shown below:
if (temp.contains(",,"))
{
temp=temp.replace ("C/O,,","");
temp=temp.replace ("S/O,,","");
temp=temp.replace ("D/O,,","");
temp=temp.replace ("W/O,,","");
}
But I want to do this with a regex, so that it automatically removes C, S, D, or W (with its /O) whenever it is followed by the character sequence ",,". I can't work out which regex to use.
Please help.
You mean this?
temp=temp.replaceAll("[SDWC]/O,,","");
For case-insensitive match,
temp=temp.replaceAll("(?i)[SDWC]/O,,","");
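A minimal sketch of the case-insensitive version in use. The class name, method name, and sample strings are made up for illustration. Note that the contains(",,") guard from the question is not needed, because replaceAll simply returns the input unchanged when nothing matches.

```java
public class RelationPrefixCleaner {
    // Removes "C/O,,", "S/O,,", "D/O,,", and "W/O,," in any letter case.
    static String stripEmptyRelation(String temp) {
        return temp.replaceAll("(?i)[SDWC]/O,,", "");
    }

    public static void main(String[] args) {
        // Hypothetical inputs, not taken from the question.
        System.out.println(stripEmptyRelation("John s/o,,Main Street")); // John Main Street
        System.out.println(stripEmptyRelation("Mary W/O,,Park Road"));   // Mary Park Road
        // No ",," directly after "C/O", so nothing is removed.
        System.out.println(stripEmptyRelation("Ann C/O Bob,,Hill Lane"));
    }
}
```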
We have a String as below.
\config\test\[name="sample"]\identifier["2"]\age["3"]
I need to remove the quotes surrounding the numbers. For example, the above string after replacement should look like below.
\config\test\[name="sample"]\identifier[2]\age[3]
Currently I'm trying with the regex as below
String.replaceAll("\"\\\\d\"", "");
This is replacing the numbers also. Please help to find out a regex for this.
You can use replaceAll with the regex \"(\d+)\" and replace each match with the contents of the capturing group (\d+), which is written as $1 in the replacement:
String str = "\\config\\test\\[name=\"sample\"]\\identifier[\"2\"]\\age[\"3\"]";
str = str.replaceAll("\"(\\d+)\"", "$1");
//----------------------^____^------^^
Output
\config\test\[name="sample"]\identifier[2]\age[3]
For background, read up on capturing groups.
We can try doing a blanket replacement of the following pattern:
\["(\d+)"\]
And replacing it with this:
\[$1\]
Note that we specifically target quoted numbers only appearing in square brackets. This minimizes the risk of accidentally doing an unintended replacement.
Code:
String input = "\\config\\test\\[name=\"sample\"]\\identifier[\"2\"]\\age[\"3\"]";
input = input.replaceAll("\\[\"(\\d+)\"\\]", "[$1]");
System.out.println(input);
Output:
\config\test\[name="sample"]\identifier[2]\age[3]
Demo here:
Rextester
You can use:
(?:"(?=\d)|(?<=\d)")
and replace it with the empty string ("").
Quick test:
echo '\config\test\[name="sample"]\identifier["2"]\age["3"]' | perl -lpe 's/(?:"(?=\d)|(?<=\d)")//g'
the output:
\config\test\[name="sample"]\identifier[2]\age[3]
test2:
echo 'identifier["123"]\age["456"]' | perl -lpe 's/(?:"(?=\d)|(?<=\d)")//g'
the output:
identifier[123]\age[456]
NOTE
This works when each number has exactly one double quote on each side. If there can be several, add the + quantifier to both the leading and the trailing ".
test3:
echo '"""""1234234"""""' | perl -lpe 's/(?:"+(?=\d)|(?<=\d)"+)//g'
the output:
1234234
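Since the question is about Java, here is a rough Java equivalent of the Perl one-liners above, using the variant with the + quantifier. The class and method names are made up for illustration. Be aware that this pattern looks only at the character next to the quote, so a quoted value that merely starts or ends with a digit (something like "3d") would lose one of its quotes.

```java
public class QuoteStripper {
    // Removes a run of double quotes that sits directly before a digit
    // (lookahead) or directly after a digit (lookbehind).
    static String stripQuotesAroundDigits(String input) {
        return input.replaceAll("\"+(?=\\d)|(?<=\\d)\"+", "");
    }

    public static void main(String[] args) {
        String path = "\\config\\test\\[name=\"sample\"]\\identifier[\"2\"]\\age[\"3\"]";
        // \config\test\[name="sample"]\identifier[2]\age[3]
        System.out.println(stripQuotesAroundDigits(path));
        // 1234234
        System.out.println(stripQuotesAroundDigits("\"\"\"\"\"1234234\"\"\"\"\""));
    }
}
```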
Given the string
Content ID [9283745997] Content ID [9283005997] There can be text in between Content ID [9283745953] Content ID [9283741197] Content ID [928374500] There can be valid text here which should not be removed.
I want to remove every occurrence of the text Content ID followed by a bracketed number such as [9283745997]; any digits can appear between the square brackets. Eventually I want the result string to be
There can be text in between There can be valid text here which should not be removed.
Could anyone please provide a regex that matches this recurring text, given that the numerals within the square brackets are different each time?
I appreciate your help!
My solution to this was:
Pattern p = Pattern.compile("(Content ID \\[\\d*\\] )");
Matcher m = p.matcher(str);
StringBuffer sb = new StringBuffer();
while(m.find()) {
m.appendReplacement(sb, "");
}
m.appendTail(sb);
System.out.println(sb);
So basically you are trying to remove each occurrence of Content ID [one or more digits].
To do this you can use the replaceAll("regex","replacement") method of the String class. As the replacement you can use the empty String "".
The only remaining question is which regex to use.
to match Content ID just write it literally as "Content ID "
to match [ or ] you have to put \ before each of them, because they are regex metacharacters and need to be escaped (in a Java string literal you write \ as "\\")
to represent one digit (a character in the range 0-9) regex uses \d (again, in Java you write \ as "\\", which gives "\\d")
to say "one or more of the previously described element" just add + after that element. For example, to match one or more letters a you can write a+.
Now you should be able to build the correct regex. If you have any questions, feel free to ask them in the comments.
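For anyone who wants to check their result, the pieces described above could be assembled roughly like this. The class, method, and constant names are made up for illustration, and the optional trailing space is an extra assumption (not one of the steps above) so that the remaining text does not end up with doubled spaces.

```java
public class ContentIdRemover {
    // Each piece of the regex, written as a Java string literal.
    private static final String LITERAL = "Content ID "; // plain text, matched as-is
    private static final String OPEN = "\\[";            // escaped [
    private static final String DIGITS = "\\d+";         // one or more digits
    private static final String CLOSE = "\\]";           // escaped ]
    private static final String OPTIONAL_SPACE = " ?";   // assumption: swallow one trailing space

    static final String REGEX = LITERAL + OPEN + DIGITS + CLOSE + OPTIONAL_SPACE;

    static String removeContentIds(String text) {
        return text.replaceAll(REGEX, "");
    }

    public static void main(String[] args) {
        String str = "Content ID [9283745997] Content ID [9283005997] There can be text in between "
                + "Content ID [9283745953] Content ID [9283741197] Content ID [928374500] "
                + "There can be valid text here which should not be removed.";
        // There can be text in between There can be valid text here which should not be removed.
        System.out.println(removeContentIds(str));
    }
}
```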
Try this one:
(Content ID \[[0-9]+\])
You can test it here: http://regexpal.com/
I would use the regex
Content ID \[\d+\] ?
Implement it like this:
str.replaceAll("Content ID \\[\\d+\\] ?", "");
You can find an explanation and demonstration here: http://regex101.com/r/qD5rJ6
I have a String with single quote. I want to replace the single quote with 2 single quotes.
I tried using
String s="Kathleen D'Souza";
s.replaceAll("'","''");
s.replaceAll("\'","\'\'");
s.replace("'","''");
s.replace("\'","\'\'");
But the single quote is not getting replaced with 2 single quotes.
Reassign the result to s:
String s="Kathleen D'Souza";
s = s.replaceAll("'","''");
Please try
s= "test ' test";
`s.replaceAll("'","\"");` => test " test
`s.replaceAll("'","''");` => test '' test
Strings are immutable. Assign the result of replaceAll to your String:
s = s.replaceAll("'","''");
String s="Kathleen D'Souza";
s= s.replace("'", "''");
Try String#replace(). It replaces every occurrence of a single ' with ''.
Note, with the given solutions successive single quotes will be doubled, so Kathleen D''Souza turns into Kathleen D''''Souza. (I've seen users outsmart themselves like this.) If that is something you are concerned about, you can match successive single quotes with:
s = s.replaceAll("''*","''");
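A small sketch that shows the difference when the input has already been escaped once. The class and method names are made up for illustration, and '+ is used as an equivalent spelling of ''*. Keep in mind that collapsing runs of quotes is only safe if the original data never legitimately contains two adjacent single quotes.

```java
public class QuoteDoubler {
    // Doubles every single quote, including ones that are already doubled.
    static String doubleNaive(String s) {
        return s.replace("'", "''");
    }

    // Replaces each run of one or more single quotes with exactly two.
    static String doubleCollapsed(String s) {
        return s.replaceAll("'+", "''");
    }

    public static void main(String[] args) {
        String once = doubleNaive("Kathleen D'Souza");
        System.out.println(once);                  // Kathleen D''Souza
        System.out.println(doubleNaive(once));     // Kathleen D''''Souza
        System.out.println(doubleCollapsed(once)); // Kathleen D''Souza
    }
}
```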
I need to split up a string on multiple delimiters, some of which consist of several characters, as shown below:
word1:word2|word3||word4|word5|||word6|word7
I need to tokenize the above string on ':', '|', '||', and '|||'.
Is this possible with StringTokenizer, or else what is the code to tokenize it using a regular expression split? Note that I also need the delimiters themselves in the resulting array.
You can use StringUtils from Apache Commons Lang.
See its Javadocs for details.
It has the following methods -
Substring/Left/Right/Mid - null-safe substring extractions
SubstringBefore/SubstringAfter/SubstringBetween - substring extraction relative to other strings
This is possible with StringTokenizer, but it has to be a multi-step process.
Obviously, you can split the String like this:
line.split ("[:|]+")
res113: Array[java.lang.String] = Array(word1, word2, word3, word4, word5, word6, word7)
But what were the delimiters? Well - obviously the opposite:
line.split ("[^:|]+")
res114: Array[java.lang.String] = Array("", :, |, ||, |, |||, |)
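The two splits above come from a Scala REPL. If you want words and delimiters interleaved in one list in plain Java, one possible approach (a swapped-in technique, not the split calls above) is to scan with a Matcher for either a run of delimiter characters or a run of anything else. The class and method names are made up for illustration, and the sketch assumes that an adjacent mix such as ":|" may be treated as a single delimiter run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterTokenizer {
    // Either a run of delimiter characters or a run of non-delimiter characters.
    private static final Pattern TOKEN = Pattern.compile("[:|]+|[^:|]+");

    // Returns words and delimiters in the order they appear in the line.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // [word1, :, word2, |, word3, ||, word4, |, word5, |||, word6, |, word7]
        System.out.println(tokenize("word1:word2|word3||word4|word5|||word6|word7"));
    }
}
```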
I don't know whether any API is available for this. You can solve it as below.
The steps should be:
1. Take the String.
2. Define the regexes to be replaced (you should know them in advance).
3. Loop over all the expressions.
4. Replace every match with a space.
5. Now you can use StringTokenizer.
String str="word1:word2|word3||word4|word5|||word6|word7";
String[] tokens={"[:]","[|]{3}","[|]{2}","[|]"};
for (int i = 0; i < tokens.length; i++) {
str=str.replaceAll(tokens[i], " ");
System.out.println(str);
}
I want to split the string
String fields = "name[Employee Name], employeeno[Employee No], dob[Date of Birth], joindate[Date of Joining]";
to
name
employeeno
dob
joindate
I wrote the following Java code for this, but it prints only name; the other matches are not printed.
String fields = "name[Employee Name], employeeno[Employee No], dob[Date of Birth], joindate[Date of Joining]";
Pattern pattern = Pattern.compile("\\[.+\\]+?,?\\s*" );
String[] split = pattern.split(fields);
for (String string : split) {
System.out.println(string);
}
What am I doing wrong here?
Thank you
This part:
\\[.+\\]
matches the first [, the .+ then gobbles up the entire string (if no line breaks are in the string) and then the \\] will match the last ].
You need to make the .+ reluctant by placing a ? after it:
Pattern pattern = Pattern.compile("\\[.+?\\]+?,?\\s*");
And shouldn't \\]+? just be \\] ?
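To see the difference side by side, here is a small sketch that runs the question's greedy pattern and the reluctant pattern on the same input. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class FieldNameSplitter {
    // The pattern from the question: .+ is greedy, so one match runs
    // from the first [ to the last ] in the string.
    static String[] splitGreedy(String fields) {
        return Pattern.compile("\\[.+\\]+?,?\\s*").split(fields);
    }

    // Same pattern with a reluctant .+? so each match stops at the nearest ].
    static String[] splitReluctant(String fields) {
        return Pattern.compile("\\[.+?\\]+?,?\\s*").split(fields);
    }

    public static void main(String[] args) {
        String fields = "name[Employee Name], employeeno[Employee No], "
                + "dob[Date of Birth], joindate[Date of Joining]";
        System.out.println(Arrays.toString(splitGreedy(fields)));    // [name]
        System.out.println(Arrays.toString(splitReluctant(fields))); // [name, employeeno, dob, joindate]
    }
}
```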
The error is that you are matching greedily. You can change it to a non-greedy match:
Pattern.compile("\\[.+?\\],?\\s*")
^
There's an online regular expression tester at http://gskinner.com/RegExr/?2sa45 that will help you a lot when you try to understand regular expressions and how they are applied to a given input.
Would it be better to use a negated character class to match the contents of the square brackets? \[(\w+\s)+\w+[^\]]\]
There is also a good example of how a negated character class works internally (without backtracking).
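A minimal sketch of the negated character class idea, using a simpler pattern than the one suggested above: [^\]]* matches everything up to the next ], so the match cannot run past the closing bracket. The class and method names are made up for illustration, and the sketch assumes the bracketed text never contains a ] itself.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class NegatedClassSplitter {
    // "[" + any run of characters that are not "]" + "]",
    // then an optional comma and optional whitespace.
    private static final Pattern BRACKETED = Pattern.compile("\\[[^\\]]*\\],?\\s*");

    static String[] fieldNames(String fields) {
        return BRACKETED.split(fields);
    }

    public static void main(String[] args) {
        String fields = "name[Employee Name], employeeno[Employee No], "
                + "dob[Date of Birth], joindate[Date of Joining]";
        // [name, employeeno, dob, joindate]
        System.out.println(Arrays.toString(fieldNames(fields)));
    }
}
```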