Java regex match with error due to $ - java

I have the following regex snippet to parse the URL of an a href, as follows:
(?<=href=)[^\"']+(?=(\"|'))?>
What I'm trying to do is replace the URL in the following snippet with data I populate at runtime:
<a href=$tracking_url$&langding_url=google.com>
<img src="irreleavnt" />
</a>
When I try replaceAll() as follows, it fails:
String fragment = "<a href=$click_tracking_url$&landing_url=google.com><img src=\"10.gif\" /></a>";
String processedFragment = fragment.replaceAll(AHREF_REGEX, ahrefurl);
The error is:
java.lang.IllegalArgumentException: Illegal group reference
at java.util.regex.Matcher.appendReplacement(Matcher.java:724)
at java.util.regex.Matcher.replaceAll(Matcher.java:824)
at java.lang.String.replaceAll(String.java:1572)
How can I fix the regex to match <a href=$click_tracking_url$ ? How can I escape $ in the regex?

Well, I tried your regexp and didn't get the error. However, the regexp didn't replace just $click_tracking_url$ but the whole text up to the end of the element.
In case you need to dynamically replace placeholders, try something like this:
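Map<String, String> placeholders = new HashMap<String, String>();
// init your placeholders somehow...
placeholders.put("click_tracking_url", "something");
String fragment = "<a href=$click_tracking_url$&landing_url=google.com><img src=\"10.gif\" /></a>";
Matcher m = Pattern.compile("\\$(\\w+)\\$").matcher(fragment);
StringBuffer processedFragment = new StringBuffer();
while (m.find()) {
    String value = placeholders.get(m.group(1));
    // quoteReplacement keeps '$' and '\' in the value from being read as group references;
    // unknown placeholders are left unchanged
    m.appendReplacement(processedFragment, Matcher.quoteReplacement(value != null ? value : m.group()));
}
m.appendTail(processedFragment);
System.out.println(processedFragment);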

Here, give this a shot. Replace this...
(?<=href=)(['"]?)\\$([^$>]+)\\$
...with this:
$1$url
And don't forget to escape your backslashes.
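As a rough sketch of how that suggestion might look in Java: the "Illegal group reference" in the question is what replaceAll() throws when the replacement string contains a $ that is not followed by a group number, so the URL is passed through Matcher.quoteReplacement() first. Here $url from the answer is read as "the URL you want to insert"; the class name, method name and sample URL below are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HrefPlaceholderDemo {
    // Matches $name$ directly after href=, keeping an optional opening quote in group 1.
    private static final Pattern HREF_PLACEHOLDER =
            Pattern.compile("(?<=href=)(['\"]?)\\$([^$>]+)\\$");

    // Hypothetical helper: swaps the $...$ placeholder for the given URL.
    static String fillHref(String fragment, String url) {
        // quoteReplacement escapes '$' and '\' so the URL is inserted literally;
        // "$1" is left unquoted on purpose so it still refers to group 1.
        String replacement = "$1" + Matcher.quoteReplacement(url);
        return HREF_PLACEHOLDER.matcher(fragment).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String fragment =
                "<a href=$click_tracking_url$&landing_url=google.com><img src=\"10.gif\" /></a>";
        // The sample URL contains a '$' to show that it no longer triggers the exception.
        System.out.println(fillHref(fragment, "http://example.com/t?price=$5"));
    }
}
```

Without the quoteReplacement() call, a URL containing $ would be parsed as a group reference and fail in the same way as in the question.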

If I understand your question correctly, you're trying to replace patterns in the form of $someIdentifier$ with some value in your application you're using someIdentifier to dereference.
It seems you would want to use the pattern \$([^\$]+)\$ and find each occurrence in the string, grab the value of group one (1), look up the value and then replace all occurrences of that specific sequence with the value you looked up.
Map<String, String> tokenMap = new HashMap<String, String>();
tokenMap.put("string", "VALUE"); // example token value
String someString = "some$string$withatoken";
Pattern tokenPattern = Pattern.compile("\\$([^\\$]+)\\$");
Matcher tokenMatcher = tokenPattern.matcher(someString);
// use find(), not matches(): matches() compares the entire string against the pattern,
// basically enforcing ^{pattern}$.
while (tokenMatcher.find()) {
    String tokenName = tokenMatcher.group(1);
    String tokenValue = tokenMap.get(tokenName); // lookup value of token name.
    // replace(), not replaceAll(): both arguments are taken literally, so a '$' in the value is safe.
    someString = someString.replace("$" + tokenName + "$", tokenValue);
    // resets the target of the matcher with the new value of someString.
    tokenMatcher.reset(someString);
}
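On Java 9 or later, the same lookup-and-replace idea can probably be written in a single pass with Matcher.replaceAll(Function). This is only a sketch: the class and method names are invented here, and leaving unknown tokens untouched is an assumption about the desired behaviour, not something the question specifies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenReplacer {
    private static final Pattern TOKEN = Pattern.compile("\\$([^$]+)\\$");

    // Hypothetical helper: replaces every $name$ with tokenMap.get("name").
    static String replaceTokens(String input, Map<String, String> tokenMap) {
        return TOKEN.matcher(input).replaceAll(match -> {
            String value = tokenMap.get(match.group(1));
            // Unknown tokens are put back as they were; the result is quoted
            // because the returned string is still parsed for group references.
            return Matcher.quoteReplacement(value != null ? value : match.group());
        });
    }

    public static void main(String[] args) {
        Map<String, String> tokenMap = new HashMap<>();
        tokenMap.put("string", "-VALUE-");
        System.out.println(replaceTokens("some$string$withatoken", tokenMap));
    }
}
```

Because the matcher never rescans text it has already replaced, a value that itself contains $...$ is not expanded a second time.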

Related

Pattern for url (key=value&key=value) Java regex

I want to know how to write the correct regex pattern for my URL to match this:
key=value
Two key=value pairs are separated by «&».
Remove the key if the value is empty or null.
Thanks
If you want to remove empty parameters from your query string, you can use the regex \w+=[^&]+ to match only key-value pairs whose value part is non-empty. For example, if you have the following string,
key1=value1&key2=value2&key3=&key4=value4
Then match only the key-value pairs using the above regex and filter out the rest. This Java code should help you,
String s = "key1=value1&key2=value2&key3=&key4=value4";
Pattern p = Pattern.compile("\\w+=[^&]+");
Matcher m = p.matcher(s);
StringBuilder sb = new StringBuilder();
while(m.find()) {
sb.append(m.group()).append("&");
}
System.out.println(sb.substring(0,sb.length()-1));
This prints the following, with key3 removed because its value was empty,
key1=value1&key2=value2&key4=value4
Using Java 8 streams, you can achieve the same with this one-liner,
String s = "key1=value1&key2=value2&key3=&key4=value4";
String cleaned = Arrays.stream(s.split("&")).filter(x -> Pattern.matches("\\w+=[^&]+", x)).collect(Collectors.joining("&"));
System.out.println(cleaned);
Prints,
key1=value1&key2=value2&key4=value4
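The question also mentions values that are "null". If that means the literal text null in the query string (an assumption, since the question does not say), a small filter on the split pairs might be easier to read than a larger regex. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class QueryCleaner {
    // Hypothetical helper: keeps only pairs that have a key and a non-empty, non-"null" value.
    static String dropEmptyParams(String query) {
        return Arrays.stream(query.split("&"))
                .filter(pair -> {
                    int eq = pair.indexOf('=');
                    if (eq <= 0) {
                        return false; // no '=' at all, or nothing before it
                    }
                    String value = pair.substring(eq + 1);
                    return !value.isEmpty() && !value.equals("null");
                })
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        String s = "key1=value1&key2=value2&key3=&key4=null&key5=value5";
        System.out.println(dropEmptyParams(s));
    }
}
```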

Java Regular Expression for finding specific string

I have a file with a long string, and I would like to split it at a specific item, i.e.
String line = "{{[Metadata{\"this, is my first, string\"}]},{[Metadata{\"this, is my second, string\"}]},{[Metadata{\"this, is my third string\"}]}}";
String[] tab = line.split("(?=\\bMetadata\\b)");
So now when I iterate over tab I get lines starting with the word "Metadata", but I would like lines starting with:
"{[Metadata"
I've tried something like:
String[] tab = line.split("(?=\\b{[Metadata\\b)");
but it doesn't work.
Can anyone help me with how to do that, please?
You may use
(?=\{\[Metadata\b)
See a demo on regex101.com.
Note that the backslashes need to be escaped in Java so that it becomes
(?=\\{\\[Metadata\\b)
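For reference, here is a small runnable sketch of that split applied to the string from the question. The class and method names are illustrative only.

```java
import java.util.regex.Pattern;

class MetadataSplitDemo {
    // Zero-width lookahead: split right before each literal "{[Metadata".
    private static final Pattern BEFORE_ENTRY = Pattern.compile("(?=\\{\\[Metadata\\b)");

    static String[] splitEntries(String line) {
        return BEFORE_ENTRY.split(line);
    }

    public static void main(String[] args) {
        String line = "{{[Metadata{\"this, is my first, string\"}]},"
                + "{[Metadata{\"this, is my second, string\"}]},"
                + "{[Metadata{\"this, is my third string\"}]}}";
        for (String part : splitEntries(line)) {
            System.out.println(part);
        }
    }
}
```

Note that the first element is the lone outer "{" that precedes the first entry, and each entry keeps whatever follows it (such as the trailing "},"), so some extra trimming may be needed depending on what you do with the pieces.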
Here is a solution using a formal pattern matcher. We can try matching your content using the following regex:
(?<=Metadata\\{\")[^\"]+
This uses a lookbehind to check for the Metadata marker, ending with a double quote. Then, it matches any content up to the closing double quote.
String line = "{{[Metadata{\"this, is my first, string\"}]},{[Metadata{\"this, is my second, string\"}]},{[Metadata{\"this, is my third string\"}]}}";
String pattern = "(?<=Metadata\\{\")[^\"]+";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
while (m.find()) {
    System.out.println(m.group(0));
}
This prints:
this, is my first, string
this, is my second, string
this, is my third string

Get a specific word out of a String with regex

I have a String
String string = "-minY:50 -maxY:100 -minVein:8 -maxVein:10 -meta:0 perChunk:5;";
I want to get the -meta:0 out of it with a regex (that is, replace everything except -meta:0). I wrote a regex that deletes -meta:0, but I can't make it delete everything except -meta:0.
I tried some other regexes, but they discarded the whole line whenever it contained -meta:[0-9], and as you can see, everything is on one line.
This is how I have been deleting -meta:0 from the String:
String meta = string.replaceAll("( -meta:[0-9])", "");
System.out.println(meta);
I just want to reverse that and delete everything except -meta:[0-9].
I couldn't find anything about my issue, because every solution I found dropped the whole line once it matched the word, so sorry if there is already something similar to this.
You should capture your match in a capturing group and use its reference in the replacement:
String meta = string.replaceAll("^.*(-meta:\\d+).*$", "$1");
System.out.println(meta);
//=> "-meta:0"
As I understand your requirement, you want to:
a) extract meta* from the string
b) replace everything else with ""
You could do something like:
String string = "-minY:50 -maxY:100 -minVein:8 -maxVein:10 -meta:0 perChunk:5;";
Pattern p = Pattern.compile(".*(-meta:[0-9]).*");
Matcher m = p.matcher(string);
if (m.find()) {
    // group(1) already holds the -meta:N text, so a second replaceAll
    // (which would treat the whole input as a regex) is not needed
    string = m.group(1);
    System.out.println("After keeping only meta* : " + string);
}
This code finds -meta:[0-9], keeps it in group 1, and discards the rest of the string.
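If the goal is only to pull out the -meta:N part, a plain find() on just that token may be enough, with no need to match the surrounding text. This is a minimal sketch with made-up class and method names; returning an Optional for the "not found" case is a design choice of the sketch, not something from the answers.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetaExtractor {
    private static final Pattern META = Pattern.compile("-meta:\\d+");

    // Hypothetical helper: returns the first "-meta:<digits>" token, if any.
    static Optional<String> extractMeta(String input) {
        Matcher m = META.matcher(input);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String string = "-minY:50 -maxY:100 -minVein:8 -maxVein:10 -meta:0 perChunk:5;";
        System.out.println(extractMeta(string).orElse("no meta found"));
    }
}
```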

replacing <img> by <img></img> in a string

I want to replace every <img> tag in a string with a closed <img></img> pair. The string is actually an HTML document in which the img tags are generated by me and always look like this:
<img src="some_source.jpg" style="some style attributes and values">
The src is user input, so it can be anything.
I wrote a regular expression. I'm not sure it is correct because it's my first time using one, but it worked when I tested it. The problem is that I don't know how to keep the content of the src.
/<img\ssrc=".+?"\sstyle=".+?">/g
I am having difficulty replacing the tags in the string, and all I have so far is this:
Pattern p = Pattern.compile("/<img\\ssrc=\".+?\"\\sstyle=\".+?\">/g");
Matcher m = p.matcher(str);
List<String> imgStrArr = new ArrayList<String>();
while (m.find()) {
imgStrArr.add(m.group(0));
}
Matcher m2 = p.matcher(str);
You can use the following regex to match:
(<img[^>]+>)
And replace with $1</img>
Code:
str = str.replaceAll("(<img[^>]+>)", "$1</img>");
Edit: Considering @MarcusMüller's advice you can do the following:
Regex: (<img[^>]+)>
Replace with $1/>
Code:
str = str.replaceAll("(<img[^>]+)>", "$1/>");
You don't have to use the Pattern and Matcher classes; you can use the regular String.replaceAll method like this:
str = str.replaceAll("(<img.*?>)", "$1</img>");
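One thing the replacements above do not guard against is running twice over the same document, which would append a second </img>. A possible variation, sketched here with invented names, adds a negative lookahead so that tags already followed by </img> are skipped. It still assumes that no attribute value contains a > character, which may not hold for arbitrary user input.

```java
import java.util.regex.Pattern;

class ImgCloser {
    // An <img ...> tag that is not already followed by </img>.
    private static final Pattern OPEN_IMG =
            Pattern.compile("(<img\\b[^>]*>)(?!\\s*</img>)");

    // Hypothetical helper: appends </img> after each unclosed <img ...> tag.
    static String closeImgTags(String html) {
        return OPEN_IMG.matcher(html).replaceAll("$1</img>");
    }

    public static void main(String[] args) {
        String str = "<p><img src=\"some_source.jpg\" style=\"width:10px\"></p>";
        String once = closeImgTags(str);
        System.out.println(once);
        // Applying it again leaves the string unchanged.
        System.out.println(closeImgTags(once).equals(once));
    }
}
```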

Java regex to match a pattern

I am new to Java. Please help me with a Java regex to match a pattern and retrieve the value.
I need to match the patterns below:
\# someproperty=somevalue // this is a new property
\#someproperty=somevalue // this is a new property
I have to match the above patterns (which may contain spaces) and I need to retrieve "someproperty" and "somevalue".
I tried the pattern below, but it only matches someproperty=somevalue, without the "#" at the beginning. Please help me out.
Pattern propertyKeyPattern = Pattern.compile("^\\s*(\\S+?)\\s*=.*?");
If you want to match the whole string and find patterns such as "\# someproperty =some value", try the regular expression
^\\#\s*(\S+?)\s*=(.*)$
As a Java string, it is
"^\\\\#\\s*(\\S+?)\\s*=(.*)$"
The match result for string \# someproperty = a some value is
matches() = Yes
find() = Yes
group(0) = \# someproperty = a some value
group(1) = someproperty
group(2) = a some value
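String a = yourString.replaceAll("[^\\w\\s]", " ");
This replaces every character that is neither a word character nor whitespace with a space, so "someproperty" and "somevalue" end up separated by whitespace and you can then split and check them. For more help, please state your question more clearly.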
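Putting the pieces together, a sketch that returns both the key and the value from the question's sample lines could look like the following. It makes two assumptions that the question does not confirm: the backslash before # is optional (it may just be an escaping artifact), and the value contains no spaces, so the trailing "// ..." comment is not captured. The class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentedPropertyParser {
    // Optional backslash, '#', optional spaces, key, '=', then the value up to the next whitespace.
    private static final Pattern PROPERTY =
            Pattern.compile("^\\s*\\\\?#\\s*(\\S+?)\\s*=\\s*(\\S+)");

    // Hypothetical helper: returns {key, value}, or null when the line is not a commented-out property.
    static String[] parse(String line) {
        Matcher m = PROPERTY.matcher(line);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] kv = parse("\\# someproperty=somevalue // this is a new property");
        System.out.println(kv[0] + " -> " + kv[1]);
    }
}
```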
