How to find and replace a substring? - java

For example, I have a string in which I must find and replace multiple substrings. Each of them starts with #, contains 6 symbols, ends with ' and should not contain ). What do you think would be the best way of achieving that?
Thanks!
Edit:
Just one more thing I forgot: to make the replacement I need the matched substring, i.e. it gets replaced by a string generated from the substring being replaced.

yourNewText=yourOldText.replaceAll("#[^)]{6}'", "");
Or programmatically:
Matcher matcher = Pattern.compile("#[^)]{6}'").matcher(yourOldText);
StringBuffer sb = new StringBuffer();
while(matcher.find()){
matcher.appendReplacement(sb,
// implement your custom logic here, matcher.group() is the found String
someReplacement(matcher.group()));
}
matcher.appendTail(sb);
String yourNewString = sb.toString();
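A minimal, self-contained sketch of the loop above. The helper someReplacement is hypothetical (here it just upper-cases the six matched characters and wraps them in brackets), and the class and method names are made up for illustration. Matcher.quoteReplacement is added on the assumption that the generated text might contain $ or \, which appendReplacement would otherwise treat as group references or escapes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashTokenReplacer {
    // '#', then exactly six characters that are not ')', then a single quote
    private static final Pattern TOKEN = Pattern.compile("#([^)]{6})'");

    // Hypothetical stand-in for the real replacement logic
    static String someReplacement(String sixChars) {
        return "[" + sixChars.toUpperCase() + "]";
    }

    static String replaceTokens(String input) {
        Matcher matcher = TOKEN.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            // group(1) is the six characters between '#' and the quote
            String replacement = someReplacement(matcher.group(1));
            // quoteReplacement keeps '$' and '\' in the result literal
            matcher.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceTokens("color: #ff00aa' and #123abc' end"));
    }
}
```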

Assuming you just know the substrings are formatted like you explained above, but not exactly which 6 characters, try the following:
String result = input.replaceAll("#[^\\)]{6}'", "replacement"); //pattern to replace is #+6 characters not being ) + '

You must use replaceAll with the right regular expression:
myString.replaceAll("#[^)]{6}'", "something")
If you need to replace with an extract of the matched string, use a capture group, like this:
myString.replaceAll("#([^)]{6})'", "blah $1 blah")
The $1 in the second string refers to whatever the first parenthesized group in the first string matched.

this might not be the best way to do it but...
youstring = youstring.replace("#something'", "new stringx");
youstring = youstring.replace("#something2'", "new stringy");
youstring = youstring.replace("#something3'", "new stringz");
//edited after reading comments, thanks

Related

Need Regex to replace all characters between first set of parenthesis from a string

I've been able to generate a regex to pull everything that is between parenthesis in a string, but I'm unclear on how to make it only happen once and only with the first set. In JAVA:
My current pattern = "\\(([^)]+)\\)"
Any help would be greatly appreciated.
Use replaceFirst instead of replaceAll.
Or, if you must use replaceAll, let it consume the rest of your string and put it back again, like
replaceAll("yourRegex(.*)","yourReplacement$1");
where $1 represents the match from the first group (.*).
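Both suggestions can be sketched as follows. This assumes the goal is to swap the text inside the first pair of parentheses and keep the parentheses themselves; the class and method names are illustrative only. The (?s) flag is an addition so that (.*) also consumes line breaks, otherwise a later line could still be matched.

```java
import java.util.regex.Matcher;

class FirstParenReplacer {
    // Replaces only the text inside the first (...) pair, keeping the parentheses
    static String viaReplaceFirst(String input, String newContent) {
        return input.replaceFirst("\\([^)]+\\)",
                Matcher.quoteReplacement("(" + newContent + ")"));
    }

    // Same result with replaceAll: the trailing (.*) swallows the rest of the
    // input, so no second match can occur; $1 puts that rest back unchanged
    static String viaReplaceAll(String input, String newContent) {
        return input.replaceAll("(?s)\\([^)]+\\)(.*)",
                Matcher.quoteReplacement("(" + newContent + ")") + "$1");
    }

    public static void main(String[] args) {
        System.out.println(viaReplaceFirst("a(b)c(d)e", "X"));
        System.out.println(viaReplaceAll("a(b)c(d)e", "X"));
    }
}
```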
try:
String x= "Hie(Java)";
Matcher m = Pattern.compile("\\((.*?)\\)").matcher(x);
while(m.find()) {
System.out.println(m.group(1));
}
or
String str = "Hie(Java)";
String answer = str.substring(str.indexOf("(")+1,str.indexOf(")"));
To stop at the last closing parenthesis instead, change it to:
String answer = str.substring(str.indexOf("(")+1,str.lastIndexOf(")"));

split string based on text qualifier regex java

I want to split a string based on text qualifier for example
"1","10411721","MikeTison","08/11/2009","21/11/2009","2800.00","002934538","051","New York","10411720-002",".\Images\b.jpg",".\RTF\b.rtf"
Qualifier = "
Splitter = ,
I want to split the string on the splitter , but if the splitter appears inside the qualifier " then ignore it and return the string including the splitter.
Regular expression i am using is (?:|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)
but this regular expression only returns commas. Please help me with this, as I am new to regular expressions.
Please note that if the string contains newline characters, i.e. \r\n, they should be ignored.
"1","10411","Muis","a","21/11/2009","2800.06","0029683778","03005136851","Awan","10411720-001",".\Images\a.jpg",".\RTF\a.rtf"
"2","08/10/2009","07:32","Call","On-Net","030092343242342376543","Monk","00:00","1.500","0.000","10.000","0.200"
"2","08/10/2009","02:50","Call","Off-Net","030092343242342376543","Une","08:00","1.500","2.000","20.000","3.500"
"2","09/10/2009","03:55","SMS","On-Net","030092343242342376543","Mink","00:00","1.500","0.000","5.000","100.500"
"2","09/10/2009","12:30","Call","Off-Net","030092343242342376543","Zog","01:01","3.500","3.000","70.000","6.500"
"2","09/10/2009","09:11","Call","On-Net","030092343242342376543","Monk","02:30","2.00","2.000","90.000","4.000"
Probably the easiest solution is not to search for the places to split, but to find the elements you want to return. In your case these elements
start with "
end with "
have no " inside.
So you can try something like
String data = "\"1\",\"10411721\",\"MikeTison\",\"08/11/2009\",\"21/11/2009\",\"2800.00\",\"002934538\",\"051\",\"New York\",\"10411720-002\",\".\\Images\\b.jpg\",\".\\RTF\\b.rtf\"";
Pattern p = Pattern.compile("\"([^\"]+)\"");
Matcher m = p.matcher(data);
while(m.find()){
System.out.println(m.group(1));
}
Output:
1
10411721
MikeTison
08/11/2009
21/11/2009
2800.00
002934538
051
New York
10411720-002
.\Images\b.jpg
.\RTF\b.rtf
You can split using this regex:
String[] arr = input.split( "(?=(([^\"]*\"){2})*[^\"]*$),+" );
This regex splits on commas that are outside double quotes, using a lookahead to make sure an even number of quotes follows the comma.
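To see the lookahead at work, here is a small sketch with a field that contains a comma inside the quotes. It uses a variant of the same idea (comma first, then the even-quotes lookahead) and strips the surrounding quotes afterwards; the class name, the method name and the sample line are assumptions for illustration, not part of the question's data.

```java
import java.util.Arrays;

class QuotedCsvSplit {
    // A comma is a separator only if an even number of quotes follows it
    private static final String COMMA_OUTSIDE_QUOTES =
            ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static String[] splitFields(String line) {
        // limit -1 keeps trailing empty fields
        String[] fields = line.split(COMMA_OUTSIDE_QUOTES, -1);
        for (int i = 0; i < fields.length; i++) {
            String f = fields[i];
            // drop the surrounding qualifier characters, if present
            if (f.length() >= 2 && f.startsWith("\"") && f.endsWith("\"")) {
                fields[i] = f.substring(1, f.length() - 1);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"1\",\"New York, NY\",\"2800.00\"";
        System.out.println(Arrays.toString(splitFields(line)));
    }
}
```

The lookahead rescans the rest of the line for every comma, so for very long lines a dedicated CSV parser is likely the better choice.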
Remove the first and the last character of the whole string. Then split with ","
String test = "\"1\",\"10411721\",\"MikeTison\",\"08/11/2009\",\"21/11/2009\",\"2800.00\",\"002934538\",\"051\",\"New York\",\"10411720-002\",\".\\Images\\b.jpg\",\".\\RTF\\b.rtf\"";
if (test.length() > 0)
test = test.substring(1, test.length()-1);
System.out.println(Arrays.toString(test.split("\",\"")));
This works even if the string contains newline characters. Try it out:
String str="\"1\",\"10411721\",\"MikeTison\",\"08/11/2009\",\"21/11/2009\",\"2800.00\",\"002934538\",\"051\",\"New York\",\"10411720-002\",\".\\Images\\b.jpg\",\".\\RTF\\b.rtf\"";
System.out.println(Arrays.toString(str.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)")));

Remove all non-word char except if & or ' pattern

I am trying to clean a string of all non-word characters, except when they are part of &amp;, i.e. the pattern might be like &[\w]+;
For example:
abc; => abc
abc &amp; => abc &amp;
abc& => abc
If I use string.replaceAll("\W","") it also removes ; and & from the second example, which I don't want.
Could a negative look-ahead give a quick regex solution to this problem?
First of all, I really like the question. Now, what you want is hard to do with lookarounds alone in a single replaceAll, because that would need a negative look-behind of variable length, which is not allowed. If it were allowed, this would not be that difficult.
Since that route is closed, you can use a little hack: first replace the final semicolon of your entity reference with some character sequence that you are sure won't appear in the rest of the string, like XXX. I know this is not strictly correct, but it can't really be helped.
So, here's what you can try:
String str = "a;b&c &amp;";
str = str.replaceAll("(&\\w+);", "$1XXX")
.replaceAll("&(?!\\w+?XXX)|[^\\w&]", "")
.replaceAll("(&\\w+)XXX", "$1;");
System.out.println(str);
Explanation:
The first replaceAll replaces a pattern like &amp; with &ampXXX, i.e. the final ; is swapped for the marker sequence.
The second replaceAll removes any & not followed by \\w+XXX, and any character that is neither a word character nor &. This removes all the &'s that are not part of an &amp;-style pattern, plus every other non-word character.
The third replaceAll turns XXX back into ;, restoring &amp; from &ampXXX.
To make it easier to understand, you can instead use the Pattern and Matcher classes; I would always prefer them whenever the replacement criteria are complex.
String str = "a;b&c &amp;";
Pattern pattern = Pattern.compile("&\\w+;|[^\\w]");
Matcher matcher = pattern.matcher(str);
StringBuilder sb = new StringBuilder(); // appendReplacement(StringBuilder, ...) needs Java 9+; use StringBuffer on older versions
while (matcher.find()) {
String match = matcher.group();
if (!match.matches("&\\w+;")) {
matcher.appendReplacement(sb, "");
} else {
matcher.appendReplacement(sb, match);
}
}
matcher.appendTail(sb);
System.out.println(sb.toString());
This one is similar to @Eric's code, but generalizes it. That one only works for &amp;, and only once it is fixed to avoid the NullPointerException it throws.
I'm not sure you can do this using a simple String.replaceAll. You should probably use a Pattern and Matcher to loop through the matches, effectively doing a manual search and replace. Something like the following code should do the trick.
public String replaceString(String origString) {
Pattern pattern = Pattern.compile("&(\\w+);|[^\\w]");
Matcher matcher = pattern.matcher(origString);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
// keep only &amp;; group(1) is null when [^\w] matched, so compare null-safely
if ("amp".equals(matcher.group(1))) {
matcher.appendReplacement(sb, matcher.group());
} else {
matcher.appendReplacement(sb, "");
}
}
matcher.appendTail(sb);
return sb.toString();
}
I would suggest you use a negative lookahead like this:
string.replace(/&(?!\w+;)/ig, '');
This replaces every & that is not followed by word characters ending with a semicolon.
EDIT (Java):
string.replaceAll("&(?!\\w+;)", "");
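Another possible approach, sketched here as an alternative to the lookahead: match either a whole entity or a single non-word character, capture only the entity, and replace with $1. This relies on Java appending nothing for a group that did not take part in the match, so entities are copied back and everything else is dropped. The class and method names are illustrative.

```java
class EntityPreservingCleaner {
    // Group 1 captures a whole entity such as &amp;
    // The \W alternative matches any other non-word character, including a lone '&'
    static String clean(String input) {
        // "$1" re-inserts the entity when group 1 matched and is empty otherwise
        return input.replaceAll("(&\\w+;)|\\W", "$1");
    }

    public static void main(String[] args) {
        System.out.println(clean("a;b&c &amp;"));
    }
}
```

Note that spaces are non-word characters too, so they are removed as well; if they should survive, the \W part would need to exclude them.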

regex pattern - extract a string only if separated by a hyphen

I've looked at other questions, but they didn't lead me to an answer.
I've got this code:
Pattern p = Pattern.compile("exp_(\\d{1}-\\d)-(\\d+)");
The string I want to be matched is: exp_5-22-718
I would like to extract 5-22 and 718. I'm not too sure why it's not working. What am I missing? Many thanks.
Try this one:
Pattern p = Pattern.compile("exp_(\\d-\\d+)-(\\d+)");
In your original pattern you specified that the second number should contain exactly one digit, so I used \d+ to match as many digits as possible.
I also removed {1} from the first number, as it adds nothing to the regexp.
If the string is always prefixed with exp_ I wouldn't use a regular expression.
I would:
replaceFirst() exp_
split() the resulting string on -
Note: This answer is based on assumptions. I offer it as a more robust option if you have multiple hyphens. However, if you need to validate the format of the digits, a regular expression may be better.
In your regexp you missed the required quantifier for the second \\d. That quantifier is + or {2}.
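A rough sketch of the no-regex idea, under one extra assumption: since the question wants 5-22 and 718 rather than three separate numbers, the string is cut at the last hyphen instead of at every hyphen. The class and method names are hypothetical.

```java
class ExpParser {
    // Returns { "5-22", "718" } for "exp_5-22-718"
    static String[] parse(String s) {
        // strip the fixed prefix, if present
        String rest = s.replaceFirst("^exp_", "");
        int lastHyphen = rest.lastIndexOf('-');
        if (lastHyphen < 0) {
            // no hyphen at all: return the remainder as a single element
            return new String[] { rest };
        }
        return new String[] {
            rest.substring(0, lastHyphen),
            rest.substring(lastHyphen + 1)
        };
    }

    public static void main(String[] args) {
        String[] parts = parse("exp_5-22-718");
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```

This does not check that the parts are digits; for validation the regular expression remains the better fit.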
String yourString = "exp_5-22-718";
Matcher matcher = Pattern.compile("exp_(\\d-\\d+)-(\\d+)").matcher(yourString);
if (matcher.find()) {
System.out.println(matcher.group(1)); //prints 5-22
System.out.println(matcher.group(2)); //prints 718
}
You can use the string.split methods to do this. Check the following code.
I assume that your strings start with "exp_".
String str = "exp_5-22-718";
if (str.contains("-")){
String newStr = str.substring(4, str.length());
String[] strings = newStr.split("-");
for (String string : strings) {
System.out.println(string);
}
}

parsing string with substring

What is the best way to parse a string?
Example:
SomeName_Some1_Name2_SomeName3
I want to get SomeName out. What is the best way to do that? With substring and calculating positions, or is there a better way?
You can match the pattern SomeName to extract it:
String str= "SomeName_Some1_Name2_SomeName3";
Pattern ptrn= Pattern.compile("SomeName");
Matcher matcher = ptrn.matcher(str);
while(matcher.find()){
System.out.println(matcher.group());
}
Split it by underscore _ using method split()
Get index # 0 from returning array from previous step
If you know the delimiter then you can just try this:
System.out.println("SomeName_Some1_Name2_SomeName3".split("_")[0]);
See also: Javadoc of String.split()
Depends on your configuration and whether you're interested in the other fields.
In that case, go for splitting the string using the _ separator.
In case you just want a part of the string, I'd go for substring in combination with indexOf('_').
In case you want all occurrences, you could also find every occurrence of 'SomeName' in your text.
Use regex and Pattern Matcher API to get SomeName.
Here you go:
String str = "SomeName_Some1_Name2_SomeName3";
String newStr = str.substring(0, str.indexOf("_"));
System.out.println(newStr);
Output:
SomeName
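One caveat with the substring approach: indexOf returns -1 when the delimiter is missing, and substring(0, -1) then throws StringIndexOutOfBoundsException. A small guarded sketch follows; the helper name firstToken and the choice to return the whole string when there is no delimiter are assumptions.

```java
class FirstToken {
    // Returns everything before the first delimiter, or the whole string
    // when the delimiter does not occur
    static String firstToken(String s, char delimiter) {
        int idx = s.indexOf(delimiter);
        return idx < 0 ? s : s.substring(0, idx);
    }

    public static void main(String[] args) {
        System.out.println(firstToken("SomeName_Some1_Name2_SomeName3", '_'));
        System.out.println(firstToken("NoDelimiterHere", '_'));
    }
}
```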
String your_String = "SomeName_Some1_Name2_SomeName3";
your_String = your_String.split("_")[0];
Log.v("log","your string "+ your_String);
String str = "SomeName_Some1_Name2_SomeName3";
String output = str.split ( "_" ) [ 0 ];
You will get SomeName as your output.
