Replace the opening and closing string, while keeping the enclosed value? - java

String value="==Hello==";
For the above string, I have to replace the "==" tags as <Heading>Hello</Heading>. I have tried doing it like this:
value = value.replaceAll("(?s)\\=\\=.","<heading>");
value = value.replaceAll(".\\=\\=(?s)","</heading>");
However, my original dataset is huge, with lots of strings like this to be replaced. Can the above be performed in a single statement, giving preference to performance?
The regex should not affect strings of form, ===<value>===, where value is any string of characters[a-z,A-Z].

To avoid iterating over the string several times (first to replace ===abc=== and then ==def==), we can iterate over it once and, thanks to Matcher#appendReplacement and Matcher#appendTail, decide dynamically how to replace each match based on the number of = characters.
A regex that finds both cases could look like (={2,3})([a-z]+)\1, but to make it more readable let's use named groups (?<name>subregex), and instead of [a-z]+ use the more general [^=]*.
This will give us
Pattern p = Pattern.compile("(?<eqAmount>={2,3})(?<value>[^=]*)\\k<eqAmount>");
The group named eqAmount will hold == or ===. \\k<eqAmount> is a backreference to that group, so the regex expects the closing delimiter to be the same == or === that eqAmount already holds.
Now we need some mapping between == or === and replacements. To hold such mapping we can use
Map<String,String> replacements = new HashMap<>();
replacements.put("===", "<subheading>${value}</subheading>");
replacements.put("==", "<heading>${value}</heading>");
${value} is a reference to the capturing group named value, here (?<value>[^=]*), so it will hold the text between the two == or === delimiters.
Now let's see how it works:
String input = "===foo=== ==bar== ==baz== ===bam===";
Map<String, String> replacements = new HashMap<>();
replacements.put("===", "<subheading>${value}</subheading>");
replacements.put("==", "<heading>${value}</heading>");
Pattern p = Pattern.compile("(?<eqAmount>={2,3})(?<value>[^=]*)\\k<eqAmount>");
StringBuffer sb = new StringBuffer();
Matcher m = p.matcher(input);
while (m.find()) {
    m.appendReplacement(sb, replacements.get(m.group("eqAmount")));
}
m.appendTail(sb);
String result = sb.toString();
System.out.println(result);
Output: <subheading>foo</subheading> <heading>bar</heading> <heading>baz</heading> <subheading>bam</subheading>
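On Java 9 or later, the same single pass can probably be written without an explicit loop, using the Matcher.replaceAll overload that takes a function. This is only a sketch under that version assumption: the names HeadingReplacer and replaceHeadings are made up for illustration, and numbered groups are used instead of named ones because the lambda receives a MatchResult.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeadingReplacer {
    // Group 1: the opening == or ===; group 2: the enclosed text; \1: the same delimiter again.
    private static final Pattern HEADING = Pattern.compile("(={2,3})([^=]*)\\1");

    static String replaceHeadings(String input) {
        return HEADING.matcher(input).replaceAll(mr -> {
            // Pick the tag from the number of '=' characters that were matched.
            String tag = mr.group(1).length() == 3 ? "subheading" : "heading";
            // quoteReplacement keeps '$' and '\' in the enclosed text literal.
            return Matcher.quoteReplacement("<" + tag + ">" + mr.group(2) + "</" + tag + ">");
        });
    }

    public static void main(String[] args) {
        System.out.println(replaceHeadings("===foo=== ==bar== ==baz== ===bam==="));
    }
}
```

The pattern is compiled once and kept in a static field, which matters when the method is called for many strings.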

Try this:
public static void main(final String[] args) {
    String value = "===Hello===";
    value = value.replaceAll("===([^=]+)===", "<Heading>$1</Heading>");
    System.out.println(value);
}
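If only the ==value== form should be rewritten and ===value=== must stay untouched, as the question asks, one option is to add lookarounds that reject any == belonging to a longer run of = characters. The following is a minimal sketch of that idea, not a tuned solution; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class DoubleEqualsOnly {
    // (?<!=) : no '=' directly before the opening ==
    // (?!=)  : no '=' directly after the closing ==
    private static final Pattern HEADING = Pattern.compile("(?<!=)==([^=]+)==(?!=)");

    static String replaceHeadings(String value) {
        // $1 refers to the text captured between the two == delimiters.
        return HEADING.matcher(value).replaceAll("<Heading>$1</Heading>");
    }

    public static void main(String[] args) {
        System.out.println(replaceHeadings("==Hello== ===Sub==="));
    }
}
```

Because the Pattern is compiled once, the per-string cost on a large dataset is limited to the matching itself.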

Related

Replace a String between two Strings and Boundary words also [duplicate]

I want to replace the variable text between two boundary words, and replace the boundary words themselves as well.
This is similar to the question "Replace a String between two Strings"; however, I want to replace everything from &firstString to &endString with newText.
Input:
abcd&firstString={variableText}&endStringxyz
Output:
abcdnewTextxyz
I could just do two str.replaceAll(&firstString) and str.replaceAll(&secondString).
However, is it possible to do in one line of code changing maybe this code solution?
String newstr = str.replaceAll("(&firstString=)[^&]*(&endString=)", "$1foo$2");
String replacement = "newText";
String text = "abcd&firstString={currentText}&secondStringxyz";
String result = text.replaceAll("&firstString=\\{.*?\\}&secondString",replacement);
System.out.println(result);
prints
abcdnewTextxyz
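When the boundary strings or the replacement are not fixed literals, it may be safer to quote them so that regex metacharacters and $ signs are treated literally. The helper below is a sketch of that approach; BoundaryReplacer and replaceBetween are illustrative names, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryReplacer {
    static String replaceBetween(String text, String start, String end, String replacement) {
        // Pattern.quote makes both boundaries literal; .*? stops at the first end boundary.
        Pattern p = Pattern.compile(Pattern.quote(start) + ".*?" + Pattern.quote(end), Pattern.DOTALL);
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return p.matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String text = "abcd&firstString={variableText}&endStringxyz";
        System.out.println(replaceBetween(text, "&firstString", "&endString", "newText"));
    }
}
```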
The question is really confusing, but I think I have understood it based on the other answer.
You want to replace &firstString={XYZ}&endString with the value of a String variable named XYZ?
First of all, in Java you cannot reference a variable through a String that holds the variable's name (Java is statically typed). To work around this, you can use a HashMap that stores the 'variable' names as keys and the replacement Strings as values.
So, using your example we have:
String in = "abcd&firstString={variableText}&endStringxyz";
and you want to replace variableText with "newText".
So, we can do this:
Map<String, String> replacementMap = new HashMap<>(){{
put("variableText", "newText"); // replace "variableText" with "newText"
put("foo", "bar"); // replace "foo" with "bar"
}};
We can then declare a utility method for the replacement (slightly modified from code found elsewhere):
private static final Pattern replaceTextPattern = Pattern.compile("&firstString=\\{(.*?)\\}&endString");

public static String replaceText(String input, Map<String, String> replacementMap) {
    // appendReplacement(StringBuilder, ...) needs Java 9+; use StringBuffer on older versions
    StringBuilder sb = new StringBuilder();
    Matcher matcher = replaceTextPattern.matcher(input);
    while (matcher.find()) {
        String key = matcher.group(1); // the text between { and }
        // keep the original text if the key is unknown, instead of failing on a null replacement
        String replacement = replacementMap.getOrDefault(key, matcher.group());
        // quoteReplacement keeps '$' and '\' in the replacement literal
        matcher.appendReplacement(sb, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(sb);
    return sb.toString();
}
And finally call the method:
System.out.println(replaceText(in, replacementMap)); // abcdnewTextxyz
It even works with multiple variables:
System.out.println(replaceText("abcd&firstString={variableText}&endStringxyz&firstString={foo}&endStringdef", replacementMap)); // abcdnewTextxyzbardef

Pattern for URL (key=value&key=value) Java regex

I want to know how to write the correct regex pattern for my URL so that it matches this:
key=value
Pairs of key=value are separated by "&".
The key should be removed if its value is empty or null.
Thanks
If you want to remove empty parameters from your query string, you can use the regex \w+=[^&]+ to match only key-value pairs whose value part is non-empty. For example, if you have the following string,
key1=value1&key2=value2&key3=&key4=value4
then match only the key-value pairs using the above regex and filter out the rest. This Java code should help you:
String s = "key1=value1&key2=value2&key3=&key4=value4";
Pattern p = Pattern.compile("\\w+=[^&]+");
Matcher m = p.matcher(s);
StringBuilder sb = new StringBuilder();
while (m.find()) {
    sb.append(m.group()).append("&");
}
System.out.println(sb.substring(0,sb.length()-1));
This prints the following, with key3 removed because its value was empty:
key1=value1&key2=value2&key4=value4
Using Java 8 streams, you can achieve the same with this one-liner:
String s = "key1=value1&key2=value2&key3=&key4=value4";
String cleaned = Arrays.stream(s.split("&")).filter(x -> Pattern.matches("\\w+=[^&]+", x)).collect(Collectors.joining("&"));
System.out.println(cleaned);
Prints,
key1=value1&key2=value2&key4=value4
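The loop-based version above trims the trailing & with substring, which fails when no pair matches at all. A possible variation, sketched here with a hypothetical QueryCleaner class, lets StringJoiner place the separators so that an empty result is handled too.

```java
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QueryCleaner {
    // A key followed by '=' and at least one character that is not '&'.
    private static final Pattern NON_EMPTY_PAIR = Pattern.compile("\\w+=[^&]+");

    static String removeEmptyParams(String query) {
        StringJoiner joined = new StringJoiner("&");
        Matcher m = NON_EMPTY_PAIR.matcher(query);
        while (m.find()) {
            joined.add(m.group()); // the joiner inserts '&' only between pairs
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeEmptyParams("key1=value1&key2=value2&key3=&key4=value4"));
    }
}
```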

Android Java, replace each letter in a text [duplicate]

I have some strings with equations in the following format ((a+b)/(c+(d*e))).
I also have a text file that contains the names of each variable, e.g.:
a velocity
b distance
c time
etc...
What would be the best way for me to write code so that it plugs in velocity everywhere a occurs, and distance for b, and so on?
Don't use String#replaceAll in this case if there is even a slight chance that a replacement you insert contains a substring that you will want to replace later. For example, "distance" contains a, so if you later replace a with "velocity" you will end up with "distvelocitynce".
The same problem arises if you want to replace A with B and B with A. For this kind of text manipulation you can use appendReplacement and appendTail from the Matcher class. Here is an example:
String input = "((a+b)/(c+(d*e)))";
Map<String, String> replacementsMap = new HashMap<>();
replacementsMap.put("a", "velocity");
replacementsMap.put("b", "distance");
replacementsMap.put("c", "time");
StringBuffer sb = new StringBuffer();
Pattern p = Pattern.compile("\\b(a|b|c)\\b");
Matcher m = p.matcher(input);
while (m.find())
    m.appendReplacement(sb, replacementsMap.get(m.group()));
m.appendTail(sb);
System.out.println(sb);
Output:
((velocity+distance)/(time+(d*e)))
This code will try to find each occurrence of a, b or c that isn't part of a longer word (it has no word character directly before or after it; this is done with \b, which represents a word boundary). appendReplacement is a method that appends to the StringBuffer the text since the last match (or from the beginning, for the first match), replacing the found match with the new word (I get the replacement from the Map). appendTail appends to the StringBuffer the text after the last match.
Also, to make this code more dynamic, the regex should be generated automatically from the keys used in the Map. You can use this code to do it:
StringBuilder regexBuilder = new StringBuilder("\\b(");
for (String word : replacementsMap.keySet())
    regexBuilder.append(Pattern.quote(word)).append('|');
regexBuilder.deleteCharAt(regexBuilder.length()-1);//lets remove last "|"
regexBuilder.append(")\\b");
String regex = regexBuilder.toString();
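Putting the two pieces together, the generated regex and the appendReplacement loop could be wrapped in one helper along these lines. This is a sketch only; WholeWordReplacer and replaceWords are illustrative names, and the early return for an empty map is an added assumption so that no empty alternation is built.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WholeWordReplacer {
    static String replaceWords(String input, Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            return input; // nothing to replace
        }
        // Build \b(key1|key2|...)\b with every key quoted as a literal.
        String alternation = replacements.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile("\\b(" + alternation + ")\\b").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Each match is replaced exactly once, so inserted text is never re-scanned.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacements.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> names = new HashMap<>();
        names.put("a", "velocity");
        names.put("b", "distance");
        names.put("c", "time");
        System.out.println(replaceWords("((a+b)/(c+(d*e)))", names));
    }
}
```

Because every match is replaced in a single pass, swapping A with B and B with A also works as expected.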
I'd make a HashMap mapping the variable names to the descriptions, then iterate through all the characters in the string and replace each occurrence of a recognised key with its mapping.
I would use a StringBuilder to build up the new string.
Using a hashmap and iterating over the string as A Boschman suggested is one good solution.
Another solution would be to do what others have suggested and do a .replaceAll(); however, you would want to use a regular expression to specify that only the words matching the whole variable name and not a substring are replaced. A regex using word boundary '\b' matching will provide this solution.
String variable = "a";
String newVariable = "velocity";
str = str.replaceAll("\\b" + variable + "\\b", newVariable); // Strings are immutable, so keep the result
See http://docs.oracle.com/javase/tutorial/essential/regex/bounds.html
For string str, use the replaceAll() function:
str = str.toUpperCase(); //Prevent substitutions of characters in the middle of a word
str = str.replaceAll("A", "velocity");
str = str.replaceAll("B", "distance");
//etc.

Split a string based on pattern and merge it back

I need to split a string based on a pattern and then merge part of it back together.
For example, below are the actual and expected strings.
String actualstr="abc.def.ghi.jkl.mno";
String expectedstr="abc.mno";
When I use the code below, I can store the parts in an array and iterate over it to rebuild the string. Is there a simpler and more efficient way to do this?
String[] splited = actualstr.split("[\\.\\.\\.\\.\\.\\s]+");
Though I can access the parts by index, is there any other way to do this easily? Please advise.
You do not understand how regexes work.
Here is your regex without the escapes: [\.\.\.\.\.\s]+
You have a character class ([]). Which means there is no reason to have more than one . in it. You also don't need to escape .s in a char class.
Here is an equivalent regex to your regex: [.\s]+. As a Java String that's: "[.\\s]+".
You can do .split("regex") on your string to get an array. It's very simple to get a solution from that point.
I would use a replaceAll in this case
String actualstr="abc.def.ghi.jkl.mno";
String str = actualstr.replaceAll("\\..*\\.", ".");
This will replace everything from the first . to the last . (inclusive) with a single .
You could also use split
String[] parts = actualstr.split("\\.");
String str = parts[0] + "." + parts[parts.length - 1]; // first and last word
public static String merge(String string, String delimiter, int... partnumbers)
{
    String[] parts = string.split(delimiter);
    String result = "";
    for ( int x = 0 ; x < partnumbers.length ; x ++ )
    {
        // strip the regex escaping from the delimiter before re-inserting it
        result += result.length() > 0 ? delimiter.replaceAll("\\\\","") : "";
        result += parts[partnumbers[x]];
    }
    return result;
}
and then use it like:
merge("abc.def.ghi.jkl.mno", "\\.", 0, 4);
I would do it this way
Pattern pattern = Pattern.compile("(\\w*\\.).*\\.(\\w*)");
Matcher matcher = pattern.matcher("abc.def.ghi.jkl.mno");
if (matcher.matches()) {
System.out.println(matcher.group(1) + matcher.group(2));
}
If you can cache the result of
Pattern.compile("(\\w*\\.).*\\.(\\w*)")
and reuse "pattern" every time, this code will be very efficient, as pattern compilation is the most expensive step. The java.lang.String.split() method that other answers suggest calls Pattern.compile() internally on each invocation unless the regex is a single literal character (OpenJDK also has a fast path for a single escaped character such as "\\."). For longer patterns this means the expensive compilation is repeated on every call. See "java.util.regex - importance of Pattern.compile()?". So it is much better to compile the Pattern once, cache it, and reuse it.
matcher.group(1) refers to the first group of (), which is "(\w*\.)".
matcher.group(2) refers to the second one, which is "(\w*)".
We don't use it here, but note that group(0) is the match for the whole regex.
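Caching the compiled pattern usually means storing it in a static final field. A minimal sketch follows; the class name FirstAndLast and the choice to return the input unchanged when it does not match are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstAndLast {
    // Compiled once when the class is loaded, then reused by every call.
    private static final Pattern FIRST_AND_LAST = Pattern.compile("(\\w*\\.).*\\.(\\w*)");

    static String firstAndLast(String input) {
        Matcher m = FIRST_AND_LAST.matcher(input);
        // group(1) is the first word plus its dot, group(2) is the last word.
        return m.matches() ? m.group(1) + m.group(2) : input;
    }

    public static void main(String[] args) {
        System.out.println(firstAndLast("abc.def.ghi.jkl.mno"));
    }
}
```

Pattern objects are immutable and safe to share between threads; only the Matcher created per call holds state.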

Regex for extraction of a key value pair

I have a text file. Sample content of that particular text file is like
root(ROOT-0, good-4)nn(management-2, company-1)nsubj(good-4, management-2)
Now I need to separate this and store it in an ArrayList. For that I wrote the following code:
public class subject {
    public void getsub(String f){
        ArrayList <String>ar=new ArrayList<String>();
        String a="[a-z]([a-z]-[0-9],[a-z]-[0-9])";
        Pattern pattern=Pattern.compile(a);
        Matcher matcher=pattern.matcher(f);
        while(matcher.find()){
            if(matcher.find()){
                ar.add(matcher.group(0));
            }
        }
        System.out.println(ar.size());
        for(int i=0;i<ar.size();i++){
            System.out.println(ar.get(i));
        }
    }
}
But the ArrayList is not getting populated. Why is that?
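You are using unescaped parentheses in your Pattern.
Unescaped parentheses define a group within your Pattern, for later back-references.
However, here you are trying to match literal parentheses, so they need to be escaped as such: \\( and \\). Note also that calling matcher.find() both in the while condition and in the if skips every other match, and that [a-z] and [0-9] each match only a single character.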
For a rough solution, try this:
String text = "root(ROOT-0, good-4)nn(management-2, company-1)nsubj(good-4, management-2)";
List<String> myPairs = new ArrayList<String>();
Pattern p = Pattern.compile(".+?\\(.+?,.+?\\)");
Matcher m = p.matcher(text);
while (m.find()) {
myPairs.add(m.group());
}
System.out.println(myPairs);
Output:
[root(ROOT-0, good-4), nn(management-2, company-1), nsubj(good-4, management-2)]
Final note: for an improved solution, I would try and use groups to distinguish between the first part of your Pattern and the actual pair in the parenthesis, so to build a Map<String, ArrayList<String>> as a data object in this case - but this is out of scope for this answer.
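One way the group-based idea from the final note might look is sketched below. The class name DependencyParser, the exact pattern, and the decision to append the arguments of a repeated relation name to the same list are all assumptions for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DependencyParser {
    // Group 1: relation name; group 2: first argument; group 3: second argument.
    private static final Pattern RELATION =
            Pattern.compile("(\\w+)\\(([^,()]+),\\s*([^()]+)\\)");

    static Map<String, List<String>> parse(String text) {
        // LinkedHashMap keeps the relations in the order they appear in the text.
        Map<String, List<String>> result = new LinkedHashMap<>();
        Matcher m = RELATION.matcher(text);
        while (m.find()) {
            List<String> args = result.computeIfAbsent(m.group(1), k -> new ArrayList<>());
            args.add(m.group(2).trim());
            args.add(m.group(3).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "root(ROOT-0, good-4)nn(management-2, company-1)nsubj(good-4, management-2)";
        System.out.println(parse(text));
    }
}
```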
