I want to write a regex so I can do a search/replace over a JSON file with many objects. Every object has a key named "resource" containing a URL.
Take a look at these examples:
"resource":"/designs/123/image.jpg"
"resource":"/designs/221/elephant.gif"
"resource":"/designs/icon.png"
I want to make a regex to replace the whole url with a string like this: localhost:8080/filepath.
This way, the result would be:
"resource":"localhost:8080/designs/123/image.jpg"
"resource":"localhost:8080/designs/221/elephant.gif"
"resource":"localhost:8080/designs/icon.png"
I'm just starting with regular expressions and I'm completely lost. I was thinking that one valid idea would be to write something starting with this pattern "resource":"
How could I write the regular expression?
The easiest method is probably just to replace "resource":"/ with "resource":"localhost:8080/. You don't even need a regex for this (but if you do you just have to escape some stuff).
With vim this would be
:%s/"resource":"\(.*\)"/"resource":"localhost:8080\1"
This should be easily transferable to Java.
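For reference, a minimal Java sketch of the same replacement might look like the following. The class name ResourcePrefixer and the method prefixResources are made up for illustration, and the sketch assumes that every value to be rewritten starts with a slash and contains no escaped quotes.

```java
import java.util.regex.Matcher;

final class ResourcePrefixer {
    // Hypothetical helper: inserts the host in front of every "resource" value
    // that starts with '/'. Assumes the value contains no escaped quotes.
    static String prefixResources(String json, String host) {
        // Group 1 is the key plus the opening quote, group 2 is the path and closing quote.
        String regex = "(\"resource\":\")(/[^\"]*\")";
        // quoteReplacement keeps any '$' or '\' in the host from being read as a group reference.
        return json.replaceAll(regex, "$1" + Matcher.quoteReplacement(host) + "$2");
    }

    public static void main(String[] args) {
        String json = "{\"resource\":\"/designs/123/image.jpg\"}";
        System.out.println(prefixResources(json, "localhost:8080"));
    }
}
```

Values that already start with something other than a slash are left alone, so running the replacement twice does not add the prefix twice.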
Now I am trying to match some patterns from a String containing elasticsearch's structured bulk requests. Here is an example:
index {[event_20191209][event][null], source[{"haha":"haha","jaja":"jaja"}]}, update {[event_20191209][event][xxx], doc_as_upsert[false], doc[index {[null][_doc][null], source[{"haha":"haha","jaja":"jaja"}]}], scripted_upsert[false], detect_noop[true]}, delete {[event_20191208][_doc][sjdos]}, update {[event_20191209][event][yyy], doc_as_upsert[false], upsert[index {[null][_doc][null], source[{"haha":"haha","jaja":"jaja"}]}], scripted_upsert[false], detect_noop[true]}
My goal is to match every separate request out of the bulk requests string, i.e to get strings like:
index {[event_20191209][event][null], source[{"haha":"haha","jaja":"jaja"}]},
update {[event_20191209][event][xxx], doc_as_upsert[false], doc[index {[null][_doc][null], source[{"haha":"haha","jaja":"jaja"}]}], scripted_upsert[false], detect_noop[true]},
delete {[event_20191208][_doc][sjdos]},
update {[event_20191209][event][yyy], doc_as_upsert[false], upsert[index {[null][_doc][null], source[{"haha":"haha","jaja":"jaja"}]}], scripted_upsert[false], detect_noop[true]}
And my pattern expression is [a-z]+\s\{.+?\}[,\w\t\r\n]+? which works fine on a Javascript based regular expression online tester like below:
However, when I copied this pattern expression to my Java code, the output was not what I expected. It was like this:
So I realized there exists some differences between Javascript and Java regular expression engine, but I cannot figure out how to update my expression so that it could work well in Java after so much coding and googling.
I would be very grateful if someone could give me a hint.
After a short nap, I had an epiphany. I was a fool in the morning....
The workaround is easy to implement: Elasticsearch already overrides toString() for us.
At first glance, I wouldn't suggest using regex right away. It looks like those lines follow some kind of pattern that you could parse and split up first.
After that, if you're talking about regex, I'd try:
Taking a look at the Java regex format: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
How about using an online Java regex tool instead?
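The "parse and split up first" idea could be sketched without any regex by walking the string and cutting at commas that sit outside all braces and brackets. This is only a sketch: BulkRequestSplitter and splitTopLevel are hypothetical names, and it assumes that braces and brackets inside the quoted source values are balanced.

```java
import java.util.ArrayList;
import java.util.List;

final class BulkRequestSplitter {
    // Hypothetical helper: splits on commas at nesting depth 0.
    // Assumes '{', '[', '}' and ']' inside quoted values are balanced.
    static List<String> splitTopLevel(String bulk) {
        List<String> parts = new ArrayList<>();
        int depth = 0;   // current nesting depth of {...} and [...]
        int start = 0;   // start index of the request being collected
        for (int i = 0; i < bulk.length(); i++) {
            char c = bulk.charAt(i);
            if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
            } else if (c == ',' && depth == 0) {
                // A top-level comma separates two requests.
                parts.add(bulk.substring(start, i).trim());
                start = i + 1;
            }
        }
        String tail = bulk.substring(start).trim();
        if (!tail.isEmpty()) {
            parts.add(tail);
        }
        return parts;
    }

    public static void main(String[] args) {
        String bulk = "index {[event_20191209][event][null], source[{\"haha\":\"haha\"}]}, "
                + "delete {[event_20191208][_doc][sjdos]}";
        for (String request : splitTopLevel(bulk)) {
            System.out.println(request);
        }
    }
}
```

Unlike the expected output in the question, the separating commas are dropped here rather than kept at the end of each request.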
I am currently working on a project and trying to work out which String method would be most appropriate, or how to approach this at all. I want to prepend a string to each occurrence of a specific string. For example, I am extracting HTML, and for each /img/image1.png I find I want to prepend a URL to it.
However, the string I pulled also contains images that already have a full URL, for example www.anylink.com/img/image2.png, and these do not need anything prepended. I looked at the replaceAll() method, but I am not sure whether it allows prepending in the replacement, and I am also not sure whether I need a regex to find only the instances where /img/ appears without a URL, since I only want to change locally hosted images. I am looking for suggestions, as I am not sure how to begin this code after my research.
Thank you.
I think that the method replaceAll() in String is enough for what you need.
You just need to write the correct regular expression.
If you write some examples, I can suggest the regex.
For example something like:
System.out.println("<div><img src=\"/test/this.png\" /></div>".replaceAll("src=\"(/[^\"]*)\"", "src=\"www.google.com$1\""));
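Applied to the /img/ case from the question, a sketch could look like this. ImgSrcPrefixer, prefixLocalImages and the base URL www.example.com are placeholders, and the sketch assumes the image paths appear in double-quoted src attributes.

```java
import java.util.regex.Matcher;

final class ImgSrcPrefixer {
    // Hypothetical helper: prepends baseUrl to src values that start with /img/.
    // Assumes src attributes use double quotes.
    static String prefixLocalImages(String html, String baseUrl) {
        // The pattern requires /img/ directly after the opening quote, so values
        // such as www.anylink.com/img/image2.png do not match and stay unchanged.
        String regex = "src=\"(/img/[^\"]*)\"";
        return html.replaceAll(regex, "src=\"" + Matcher.quoteReplacement(baseUrl) + "$1\"");
    }

    public static void main(String[] args) {
        String html = "<img src=\"/img/image1.png\"><img src=\"www.anylink.com/img/image2.png\">";
        System.out.println(prefixLocalImages(html, "www.example.com"));
    }
}
```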
My HTTP Request responds with combination of string and JSON, something like this:
null{"username:name","email:email"}
I need only the JSON part.
I first tried parsing the response directly as a JSON object, which of course failed. I then tried splitting it with serverResponse.split("{"), but Android rejects this because "{" on its own is not a valid regex pattern. Any suggestions on how I can achieve this?
String.split uses regular expressions, and since '{' is a special character in regular expressions, you should escape it like this: serverResponse.split("\\{").
It would be better to change the server side, but you can also just use split. The only thing you need to do is escape your {.
String json = "{" + serverResponse.split("\\{", 2)[1]; // split drops the delimiter, so put the "{" back
It is a bad idea and bad practice to split JSON. If the server side changes one day, the split may pick the wrong part of your JSON object.
I recommend that you PARSE it, even if it is simple and small.
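Before parsing, the JSON part still has to be separated from the leading text. A small sketch that does this with indexOf instead of split is shown below; JsonPartExtractor and extractJsonPart are hypothetical names, and the sketch assumes the response contains exactly one JSON object with nothing but junk before or after it.

```java
final class JsonPartExtractor {
    // Hypothetical helper: returns the text from the first '{' to the last '}'.
    // Assumes the response holds a single JSON object.
    static String extractJsonPart(String response) {
        int start = response.indexOf('{');
        int end = response.lastIndexOf('}');
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("no JSON object found in response");
        }
        return response.substring(start, end + 1);
    }

    public static void main(String[] args) {
        String serverResponse = "null{\"username\":\"name\",\"email\":\"email\"}";
        System.out.println(extractJsonPart(serverResponse));
    }
}
```

The returned string can then be handed to a JSON parser (on Android, for example, the org.json.JSONObject constructor that takes a String), which will reject malformed input instead of silently picking the wrong piece.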
I am working on a plugin. I will parse HTML files. I have a naming convention like that:
<!--$include="a.html" -->
or
<!--$include="a.html"-->
Both forms should be treated the same.
Following this pattern (similar to server-side includes), I want to search an HTML file.
The question is:
Find that pattern and get the value (a.html in my example; it varies).
It should work like:
while (!finishedWholeFile) {
    fileName = findPatternFunc(htmlFile)
    replaceFunc(fileName, something)
}
PS: I don't know whether using a regex in Java or implementing it differently (for example with .indexOf()) is better. If regex performs well in this situation, I want to use it.
Any ideas?
You mean like this?
<!--\$include=\"(?<htmlName>[a-z_-]*)\.html\"\s?-->
Read a file into a string then
str = str.replaceAll("(?<=<!--\\$include=\")[^\"]+(?=\" ?-->)", something);
will replace the filenames with the string something, then the string can be written back to the file.
(Note: this replaces any text inside the double quotes, not just valid filenames.)
If you only want to replace filenames with the .html extension, swap the [^\"]+ for [^.\"]+\\.html.
Using regex for this task is fine performance wise, but see e.g.
How to use regular expressions to parse HTML in Java? and Java Regex performance etc.
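If the goal is to collect the file names rather than replace them, the find loop from the question can be sketched with Pattern and Matcher. IncludeFinder and findIncludes are hypothetical names, and the pattern assumes the include comment has at most one whitespace character before the closing -->.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IncludeFinder {
    // Group 1 captures whatever sits between the double quotes.
    private static final Pattern INCLUDE =
            Pattern.compile("<!--\\$include=\"([^\"]+)\"\\s?-->");

    // Hypothetical helper: returns every included file name in document order.
    static List<String> findIncludes(String html) {
        List<String> names = new ArrayList<>();
        Matcher matcher = INCLUDE.matcher(html);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String html = "<p><!--$include=\"a.html\" --></p><!--$include=\"b.html\"-->";
        System.out.println(findIncludes(html));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex for every file.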
I have used that pattern:
"<!--\\$include=\"(.+)(.)(html|htm)\"-->"
I have a Java string which looks like this, it is actually an XML tag:
"article-idref="527710" group="no" height="267" href="pc011018.pct" id="pc011018" idref="169419" print-rights="yes" product="wborc" rights="licensed" type="photo" width="322" "
Now I want to remove the article-idref="527710" segment using a regular expression. I came up with the following:
trimedString.replaceAll("\\article-idref=.*?\"","");
but it doesn't seem to work. Could anybody tell me where I went wrong in my regular expression? I need this to be represented as a String in my Java class, so HTMLParser probably won't help me much here.
Thanks in advance!
Try this:
trimedString.replaceAll("article-idref=\"[^\"]*\" *","");
I corrected the regular expression by adding quotes and a word boundary (to prevent false matches). Also, in case you didn't, remember to reassign to your string after the replacement:
trimmedString = trimmedString.replaceAll("\\barticle-idref=\".*?\"", "");
See it working at ideone.
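As a runnable variation on the same idea, the sketch below removes an attribute by name. It is not the exact pattern from this answer: AttributeRemover and removeAttribute are hypothetical names, and a lookbehind is used instead of \b because \b also matches between the hyphen and idref inside article-idref. It assumes attribute values are double-quoted and contain no embedded double quotes.

```java
import java.util.regex.Pattern;

final class AttributeRemover {
    // Hypothetical helper: removes name="value" plus any whitespace that follows it.
    // Assumes double-quoted values without embedded double quotes.
    static String removeAttribute(String attributes, String name) {
        // The lookbehind rejects matches preceded by a word character or '-',
        // so removing "idref" does not touch "article-idref".
        String regex = "(?<![\\w-])" + Pattern.quote(name) + "=\"[^\"]*\"\\s*";
        return attributes.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String attributes = "article-idref=\"527710\" group=\"no\" idref=\"169419\"";
        // Strings are immutable, so the result must be assigned.
        String trimmed = removeAttribute(attributes, "article-idref");
        System.out.println(trimmed);
    }
}
```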
Also since this is from an XML document it might be better to use an XML parser to extract the correct attributes instead of a regular expression. This is because XML is quite a complex data format to parse correctly. The example in your question is simple enough. However a regular expression could break on a more complex case, such as a document that includes XML comments. This could be an issue if you are reading data from an untrusted source.
If you are sure the article-idref is always at the beginning, try this:
// removes everything from the beginning up to and including the first whitespace
trimedString = trimedString.replaceFirst("^\\S+\\s+", "");
Be sure to assign the result to trimedString again, since replaceFirst does not modify the string itself but returns another string.