Get a particular string using regex in Java

How can I extract only the link from a string, using regex or any other approach?
Here is the Java code:
String aas = "window.open("+"\""+"http://www.example.com/jscript/jex5.htm"+"\""+")"+"\n"+"window.open("+"\""+"http://www.example.com/jscript/jex5.htm"+"\""+")";
How do I get the link http://www.example.com/jscript/jex5.htm?
Thanks in advance.

The regex
(?<=window\.open\(")[^"]*(?="\))
matches the link in the string you have given. Escaped as a Java string literal, it reads
"(?<=window\\.open\\(\")[^\"]*(?=\"\\))"

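This prints the first URL in the string that starts with "http://". Because matches() must consume the whole input, the pattern expects exactly one line break in the string:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
    public static void main(String[] args) {
        String javascriptString = "window.open(" + "\"" + "http://www.example.com/jscript/jex5.htm" + "\"" + ")" + "\n" + "window.open(" + "\""
                + "http://www.example.com/jscript/jex5.htm" + "\"" + ")";
        // Group 1 captures everything from "http://" up to the last double quote on the first line.
        Pattern pattern = Pattern.compile(".*(http://.*)\".*\n.*");
        Matcher m = pattern.matcher(javascriptString);
        // matches() succeeds only if the pattern covers the entire input.
        if (m.matches()) {
            System.out.println(m.group(1));
        }
    }
}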
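If the string may contain any number of window.open(...) calls, a find() loop with a capturing group is one alternative. The following is a minimal sketch, assuming every link is the only double-quoted argument of window.open; the class and method names (LinkExtractor, extractLinks) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkExtractor {
    // Assumption: each link is the sole double-quoted argument of window.open(...).
    private static final Pattern WINDOW_OPEN =
            Pattern.compile("window\\.open\\(\"([^\"]*)\"\\)");

    // Returns every quoted argument found, in order of appearance.
    static List<String> extractLinks(String script) {
        List<String> links = new ArrayList<>();
        Matcher m = WINDOW_OPEN.matcher(script);
        while (m.find()) {
            links.add(m.group(1)); // group 1 is the text between the quotes
        }
        return links;
    }

    public static void main(String[] args) {
        String script = "window.open(\"http://www.example.com/jscript/jex5.htm\")\n"
                + "window.open(\"http://www.example.com/jscript/jex5.htm\")";
        extractLinks(script).forEach(System.out::println);
    }
}
```

Unlike matches(), find() does not need the pattern to cover the whole input, so the number of lines no longer matters.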

Regex to split a string using Java

I am trying to parse a string because I need to pass the result to the UI as a map.
Here is my input string:
"2020-02-01T00:00:00Z",1,
"2020-04-01T00:00:00Z",4,
"2020-05-01T00:00:00Z",2,
"2020-06-01T00:00:00Z",31,
"2020-07-01T00:00:00Z",60,
"2020-08-01T00:00:00Z",19,
"2020-09-01T00:00:00Z",10,
"2020-10-01T00:00:00Z",33,
"2020-11-01T00:00:00Z",280,
"2020-12-01T00:00:00Z",61,
"2021-01-01T00:00:00Z",122,
"2021-12-01T00:00:00Z",1
I need to split the string like this:
"2020-02-01T00:00:00Z",1 : split[0]
"2020-04-01T00:00:00Z",4 : split[1]
The issue is that I can't split on "," because the comma appears twice in each record.
I need a regex that gives 2020-02-01T00:00:00Z,1 as one token to process further.
I am new to regex. Can someone please provide a regular expression for this?
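If you want the pairs of date-time and ID, you can use the regex (\"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\",\d+)(?=,|$) to get the match results.
The pattern (?=,|$) is a lookahead assertion for a comma or the end of the input ($ matches at the end of every line only when Pattern.MULTILINE is set).
Demo (Matcher.results() requires Java 9 or later):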
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
public class Main {
    public static void main(String[] args) {
        String s = "\"2020-02-01T00:00:00Z\",1,\n"
                + " \"2020-04-01T00:00:00Z\",4,\n"
                + " \"2020-05-01T00:00:00Z\",2,\n"
                + " \"2020-06-01T00:00:00Z\",31,\n"
                + " \"2020-07-01T00:00:00Z\",60,\n"
                + " \"2020-08-01T00:00:00Z\",19,\n"
                + " \"2020-09-01T00:00:00Z\",10,\n"
                + " \"2020-10-01T00:00:00Z\",33,\n"
                + " \"2020-11-01T00:00:00Z\",280,\n"
                + " \"2020-12-01T00:00:00Z\",61,\n"
                + " \"2021-01-01T00:00:00Z\",122,\n"
                + " \"2021-12-01T00:00:00Z\",1";
        List<String> list = Pattern.compile("(\\\"\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z\\\",\\d+)(?=,|$)")
                .matcher(s)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
        list.forEach(System.out::println);
    }
}
Output:
"2020-02-01T00:00:00Z",1
"2020-04-01T00:00:00Z",4
"2020-05-01T00:00:00Z",2
"2020-06-01T00:00:00Z",31
"2020-07-01T00:00:00Z",60
"2020-08-01T00:00:00Z",19
"2020-09-01T00:00:00Z",10
"2020-10-01T00:00:00Z",33
"2020-11-01T00:00:00Z",280
"2020-12-01T00:00:00Z",61
"2021-01-01T00:00:00Z",122
"2021-12-01T00:00:00Z",1
Why can't you just split on , and ignore the last value?
Here's your pattern:
final Pattern pattern = Pattern.compile("(\\S+),(\\d+)");
final Matcher matcher = pattern.matcher("Input....");
Here's how to use it:
while (matcher.find()) {
final String date = matcher.group(1);
final String number = matcher.group(2);
}
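Since the question mentions passing a map to the UI, the two capturing groups can be put straight into a map. This is a minimal sketch, assuming the date-time strings are unique keys and the numbers fit in an int; DateCountParser and parse are hypothetical names, and the pattern is a slightly stricter variant that keeps the quotes out of group 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateCountParser {
    // Group 1: the text between the quotes; group 2: the digits after the comma.
    private static final Pattern RECORD = Pattern.compile("\"([^\"]+)\",(\\d+)");

    // LinkedHashMap keeps the records in input order.
    static Map<String, Integer> parse(String input) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = RECORD.matcher(input);
        while (m.find()) {
            counts.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return counts;
    }

    public static void main(String[] args) {
        String input = "\"2020-02-01T00:00:00Z\",1,\n"
                + "\"2020-04-01T00:00:00Z\",4,\n"
                + "\"2021-12-01T00:00:00Z\",1";
        parse(input).forEach((date, count) -> System.out.println(date + " -> " + count));
    }
}
```

If the same date-time can appear more than once, a later record would overwrite an earlier one in this sketch.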

'return' character and new line use with regex in Java 8

I'm facing strange behaviour in Java 8 when using (\r?\n) inside a regex to parse a text file, with the Eclipse IDE running under Java 8.
See the regex101 test demo: https://regex101.com/r/QHSsfQ/4
The regex works fine with Eclipse under Java 7,
but with the IDE running under Java 8 it doesn't work (see the code below).
Can someone help me solve this?
String REGEX =
"\\s+NAME.*" + "\\r?\\n"
+ "INFO-\\d{1,2}\\s+(?<name>[$\\w]+).*" + "\\r?\\n"
+ ".*" + "\\r?\\n"
+ ".*VERAT2.*" + "\\r?\\n"
+ "\\s+\\w+\\s+(?<verat2>\\w+).*"
.......
.......
Matcher matcher = Pattern.compile( REGEX ).matcher( data );
if( matcher.find() )
{
System.out.println("LEVELINFO=DATA=" + matcher.group("name") + " &&NAME=" + matcher.group("name") +" &&VERAT2="+ matcher.group("verat2")+"\n");
}
}
sc.close();
The sample text file looks like this:
DATA NAME MAC1
INFO-0 EQUIP Q10
VL VER VERAT2
V22 V22
Thanks.
Alternative regex:
String regexName = "^DATA\\s+NAME\\s+.*?^\\S+\\s+(?<name>\\S+)";
String regexVerat2 = "\\s+VER\\s+VERAT2\\s+.*?^\\s+\\S+\\s+(?<verat2>\\S+)";
String regex = String.format("%s.*?%s", regexName, regexVerat2);
Matcher matcher = Pattern.compile(regex, Pattern.MULTILINE|Pattern.DOTALL).matcher(input);
Regex in context:
public static void main(String[] args) {
String input =
"DATA NAME MAC1 MAC2\n"
+ "INFO-0 EQUIP Q10 Q13\n"
+ " \n"
+ " VL VER VERAT2 MAP\n"
+ " V22 V22 SELF100\n"
+ " \n"
+ " CMD1 CMD2 CMD3 CMD4 CMD4 \n"
+ " NO 44 FAL BYTE\n";
String regexName = "^DATA\\s+NAME\\s+.*?^\\S+\\s+(?<name>\\S+)";
String regexVerat2 = "\\s+VER\\s+VERAT2\\s+.*?^\\s+\\S+\\s+(?<verat2>\\S+)";
String regex = String.format("%s.*?%s", regexName, regexVerat2);
Matcher matcher = Pattern.compile(regex, Pattern.MULTILINE|Pattern.DOTALL).matcher(input);
while(matcher.find()) {
System.out.println("Name: " + matcher.group("name"));
System.out.println("Verat2 : " + matcher.group("verat2"));
}
}
Output:
Name: EQUIP
Verat2 : V22
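If the difference between the two environments comes down to line endings (an assumption; the question does not confirm the cause), the linebreak matcher \R, available since Java 8, can stand in for \r?\n. The sketch below keeps the line-by-line shape of the question's regex; the sample data and the names LineBreakDemo and parse are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineBreakDemo {
    // \R matches \r\n, \n, \r and the other Unicode line breaks.
    private static final Pattern BLOCK = Pattern.compile(
            "NAME.*\\R"
            + "INFO-\\d{1,2}\\s+(?<name>[$\\w]+).*\\R"
            + ".*\\R"
            + ".*VERAT2.*\\R"
            + "\\s*\\w+\\s+(?<verat2>\\w+)");

    // Returns {name, verat2}, or null when the block is not found.
    static String[] parse(String data) {
        Matcher m = BLOCK.matcher(data);
        return m.find() ? new String[] {m.group("name"), m.group("verat2")} : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample with a blank third line, once per line-ending style.
        String unix = "DATA NAME MAC1\nINFO-0 EQUIP Q10\n\nVL VER VERAT2\nV22 V22\n";
        String windows = unix.replace("\n", "\r\n");
        System.out.println(String.join(" ", parse(unix)));
        System.out.println(String.join(" ", parse(windows)));
    }
}
```

Both inputs should yield the same two values, because \R does not depend on which line terminator the file uses.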

Find substring from a complex string using regex

I have a String containing a large script, as follows:
String script = "node {
stage(someString) {
try {
**parameters= [
[someString],
[someString],
[someString],
[someString],
[someString],
[someString],
[someString],
]**
//some more script
}
}";
I want to extract the parameters variable, which contains an array of arrays.
I tried the following pattern, but it didn't work:
Pattern pattern = Pattern.compile("parameters= [(.*?)]");
How do I extract the parameters variable from the script String using regex?
Thanks in advance!
You may try using:
parameters=\s*\[(.*)]
Explanation of the above regex:
parameters= - Matches parameters= literally.
\s* - Matches a white-space character zero or more times.
\[ - Matches [ literally.
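(.*)] - a capturing group that greedily captures everything up to the last ] it can reach; with Pattern.DOTALL, as in the implementation below, that is the last ] in the whole input.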
Sample Implementation in java:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Main
{
private static final Pattern pattern = Pattern.compile("parameters=\\s*\\[(.*)]", Pattern.DOTALL);
public static void main(String[] args) {
String string = "node {\n"
+ " stage(someString) {\n"
+ " try {\n"
+ " **parameters= [\n"
+ " [someString],\n"
+ " [someString],\n"
+ " [someString],\n"
+ " [someString],\n"
+ " [someString],\n"
+ " [someString],\n"
+ " [someString],\n"
+ " ]**\n"
+ " //some more script";
StringBuilder sb = new StringBuilder();
Matcher matcher = pattern.matcher(string);
while(matcher.find()){
// Replaced all the unwanted spaces and commas. You can address that accordingly.
sb.append(matcher.group(1).replaceAll("[\\s,]+", " "));
}
System.out.println(sb.toString());
}
}
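Because (.*) is greedy, the regex above would capture too much if the rest of the script contained another ]. One way around that is to use the regex only to find the opening bracket and then count bracket depth by hand. This is a rough sketch under the assumption that no [ or ] appears inside string literals or comments in the script; ParametersExtractor and extract are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParametersExtractor {
    // Locates the opening bracket of the parameters array.
    private static final Pattern START = Pattern.compile("parameters=\\s*\\[");

    // Returns the text between the outer brackets, or null if none is found.
    static String extract(String script) {
        Matcher m = START.matcher(script);
        if (!m.find()) {
            return null;
        }
        int begin = m.end(); // first character after the opening '['
        int depth = 1;
        for (int i = begin; i < script.length(); i++) {
            char c = script.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']' && --depth == 0) {
                return script.substring(begin, i); // matching close bracket reached
            }
        }
        return null; // brackets are unbalanced
    }

    public static void main(String[] args) {
        String script = "try {\n parameters= [\n [someString],\n [someString],\n ]\n list[0] = 1\n}";
        System.out.println(extract(script));
    }
}
```

The scan stops at the bracket that closes the outer array, so a later list[0] in the script is left alone.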

How to replace all domains with pattern in a XML string in Java?

I have an XML output like this (the <xml> element and the xlink:href attribute are just placeholders, so you cannot rely on them when creating the regex pattern):
<xml>http://localhost:8080/def/abc/xyx</xml>
<element xlink:href="http://localhostABCDEF/def/ABC/XYZ">Some Text</element>
...
What I want to do is use a Java regex to replace the domain pattern (I don't know the existing domains in advance):
http(s)?://.*/def/.*
with an input domain (e.g. http://google.com/def), so that the result will be:
<xml>http://google.com/def/abc/xyx</xml>
<element xlink:href="http://google.com/def/ABC/XYZ">Some Text</element>
...
How can I do it? I think a Java regex or String.replaceAll could do it (though the latter does not seem possible).
Regex: http[s]?:\/{2}.+\/def
Substitution: http://google.com/def
Details:
? Matches between zero and one times
[] Matches a single character present in the list
. Matches any character except a line terminator
+ Matches between one and unlimited times
Java code:
String domain = "http://google.com/def";
String html = "<xml>http://localhost:8080/def/abc/xyx</xml>\r\n<element xlink:href=\"http://localhostABCDEF/def/ABC/XYZ\">Some Text</element>";
html = html.replaceAll("http[s]?:\\/{2}.+\\/def", domain);
System.out.print(html);
Output:
<xml>http://google.com/def/abc/xyx</xml>
<element xlink:href="http://google.com/def/ABC/XYZ">Some Text</element>
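Note that .+ is greedy, so the regex above relies on each URL sitting on its own line. A variant that restricts the host part to characters that cannot end a URL host also copes with several URLs on one line. This is a sketch, assuming /def follows the host directly and is itself followed by a slash; DomainReplacer and replaceDomain are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DomainReplacer {
    // The host part may not contain '/', whitespace, quotes or angle brackets,
    // so a match cannot run past the end of a single URL.
    private static final Pattern DEF_PREFIX =
            Pattern.compile("https?://[^/\\s\"<>]+/def(?=/)");

    static String replaceDomain(String xml, String newPrefix) {
        // quoteReplacement keeps '$' and '\' in the new prefix literal.
        return DEF_PREFIX.matcher(xml).replaceAll(Matcher.quoteReplacement(newPrefix));
    }

    public static void main(String[] args) {
        String xml = "<xml>http://localhost:8080/def/abc/xyx</xml>\r\n"
                + "<element xlink:href=\"http://localhostABCDEF/def/ABC/XYZ\">Some Text</element>";
        System.out.print(replaceDomain(xml, "http://google.com/def"));
    }
}
```

URLs that have extra path segments between the host and /def are not touched by this pattern.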
Actually, this can be done with a regex, and it is simpler than parsing the XML document. Here is the answer:
String text = "<epsg:CommonMetaData>\n"
+ " <epsg:type>geographic 2D</epsg:type>\n"
+ " <epsg:informationSource>EPSG. See 3D CRS for original information source.</epsg:informationSource>\n"
+ " <epsg:revisionDate>2007-08-27</epsg:revisionDate>\n"
+ " <epsg:changes>\n"
+ " <epsg:changeID xlink:href=\"http://www.opengis.net/def/change-request/EPSG/0/2002.151\"/>\n"
+ " <epsg:changeID xlink:href=\"http://www.opengis.net/def/change-request/EPSG/0/2003.370\"/>\n"
+ " <epsg:changeID xlink:href=\"http://www.opengis.net/def/change-request/EPSG/0/2006.810\"/>\n"
+ " <epsg:changeID xlink:href=\"http://www.opengis.net/def/change-request/EPSG/0/2007.079\"/>\n"
+ " </epsg:changes>\n"
+ " <epsg:show>true</epsg:show>\n"
+ " <epsg:isDeprecated>false</epsg:isDeprecated>\n"
+ " </epsg:CommonMetaData>\n"
+ " </gml:metaDataProperty>\n"
+ " <gml:metaDataProperty>\n"
+ " <epsg:CRSMetaData>\n"
+ " <epsg:projectionConversion xlink:href=\"http://www.opengis.net/def/coordinateOperation/EPSG/0/15593\"/>\n"
+ " <epsg:sourceGeographicCRS xlink:href=\"http://www.opengis.net/def/crs/EPSG/0/4979\"/>\n"
+ " </epsg:CRSMetaData>\n"
+ " </gml:metaDataProperty>"
+ "<gml:identifier codeSpace=\"OGP\">http://www.opengis.net/def/area/EPSG/0/1262</gml:identifier>";
String patternString1 = "(http(s)?://.*/def/.*)";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
String prefixDomain = "http://localhost:8080/def";
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
    // Keep everything after the first "def" and prepend the new domain.
    String url = prefixDomain + matcher.group(1).split("def", 2)[1];
    // quoteReplacement stops '$' or '\' in the URL from being read as group references.
    matcher.appendReplacement(sb, Matcher.quoteReplacement(url));
    System.out.println(url);
}
matcher.appendTail(sb);
System.out.println(sb.toString());
which returns output https://www.diffchecker.com/CyJ8fY8p

Regular Expression issue, deleting whole lines

I have been trying for the last couple of hours to create a regular expression that deletes lines of text that start with particular wording, after extracting a rating.
Below is what I'm trying to delete. I'm also trying to pull the rating out of the paragraph (it's either pass or fail).
Review Master: text here
1111111111 text here
Rating: Fail text here
Review Master Page text here
I am trying to delete all lines that start with the following patterns.
So far I have:
^Review Master:
^[0-9]{10}
^Rating:
^Review Master Page
Again, I am struggling with the replacement (deletion) and with finding only the rating.
If you want to find those exact lines in your file, then this pattern (written as a Java string literal) will work:
"Review Master:\n\\d++\nRating:\\s*+(\\w++)\nReview Master Page"
Here is an example using your input as a test string:
public static void main(String[] args) throws Exception {
final String in = "Review Master:\n"
+ "1111111111\n"
+ "Rating: Fail\n"
+ "Review Master Page";
final Matcher m = Pattern.compile(""
+ "Review Master:\n"
+ "\\d++\n"
+ "Rating:\\s*+(\\w++)\n"
+ "Review Master Page").matcher(in);
while(m.find()) {
System.out.println(m.group(1));
}
}
Output:
Fail
If you want to delete those lines, then you need to replace the pattern in the file contents, which you have as a String:
public static void main(String[] args) throws Exception {
final String in = "Some other text\n"
+ "Review Master:\n"
+ "1111111111\n"
+ "Rating: Fail\n"
+ "Review Master Page\n"
+ "Some final text";
final Matcher m = Pattern.compile(""
+ "\n?"
+ "Review Master:\n"
+ "\\d++\n"
+ "Rating:\\s*+(\\w++)\n"
+ "Review Master Page").matcher(in);
final StringBuffer output = new StringBuffer();
while (m.find()) {
System.out.println(m.group(1));
m.appendReplacement(output, "");
}
m.appendTail(output);
System.out.println("Result: \"" + output.toString() + "\"");
}
Output:
Fail
Result: "Some other text
Some final text"
That is, we use the Matcher to pull the pass/fail value from the input and also build the output, replacing the matched block of text with nothing.
You have not made clear which parts of the patterns are variable.
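If the text after each prefix varies, as the "text here" in the question's sample suggests, a line-based pattern with Pattern.MULTILINE may fit better than matching the exact block. This is a minimal sketch, assuming the four prefixes from the question always sit at the start of a line; ReviewCleaner, findRating and removeLines are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReviewCleaner {
    // With MULTILINE, ^ matches at the start of every line.
    private static final Pattern RATING =
            Pattern.compile("^Rating:\\s*(\\w+)", Pattern.MULTILINE);
    // Any line starting with one of the four prefixes, plus its line break if present.
    private static final Pattern UNWANTED = Pattern.compile(
            "^(?:Review Master:|[0-9]{10}|Rating:|Review Master Page).*\\R?",
            Pattern.MULTILINE);

    // Returns the first word after "Rating:", or null if there is no such line.
    static String findRating(String text) {
        Matcher m = RATING.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Deletes every line that starts with one of the prefixes.
    static String removeLines(String text) {
        return UNWANTED.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String in = "Some other text\n"
                + "Review Master: text here\n"
                + "1111111111 text here\n"
                + "Rating: Fail text here\n"
                + "Review Master Page text here\n"
                + "Some final text";
        System.out.println(findRating(in));
        System.out.println(removeLines(in));
    }
}
```

The rating is read before the lines are removed, so the two steps can run independently on the same input.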
