I am trying to parse a string because I need to pass the result to the UI as a map.
Here is my input string:
"2020-02-01T00:00:00Z",1,
"2020-04-01T00:00:00Z",4,
"2020-05-01T00:00:00Z",2,
"2020-06-01T00:00:00Z",31,
"2020-07-01T00:00:00Z",60,
"2020-08-01T00:00:00Z",19,
"2020-09-01T00:00:00Z",10,
"2020-10-01T00:00:00Z",33,
"2020-11-01T00:00:00Z",280,
"2020-12-01T00:00:00Z",61,
"2021-01-01T00:00:00Z",122,
"2021-12-01T00:00:00Z",1
I need to split the string like this :
"2020-02-01T00:00:00Z",1 : split[0]
"2020-04-01T00:00:00Z",4 : split[1]
The issue is that I can't simply split on "," because the comma occurs twice in each record.
I need a regex that gives 2020-02-01T00:00:00Z,1 as one token to process further.
I am new to regex. Can someone please provide a regular expression for this?
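If you want the pairs of date-time and number, you can use the regex (\"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\",\d+)(?=,|$) and collect the match results.
The (?=,|$) part is a lookahead assertion: the pair must be followed by a comma or by the end of the input (without the MULTILINE flag, $ does not match at the end of every line).
Demo (Matcher.results() requires Java 9 or later):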
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
public class Main {
public static void main(String[] args) {
String s = "\"2020-02-01T00:00:00Z\",1,\n"
+ " \"2020-04-01T00:00:00Z\",4,\n"
+ " \"2020-05-01T00:00:00Z\",2,\n"
+ " \"2020-06-01T00:00:00Z\",31,\n"
+ " \"2020-07-01T00:00:00Z\",60,\n"
+ " \"2020-08-01T00:00:00Z\",19,\n"
+ " \"2020-09-01T00:00:00Z\",10,\n"
+ " \"2020-10-01T00:00:00Z\",33,\n"
+ " \"2020-11-01T00:00:00Z\",280,\n"
+ " \"2020-12-01T00:00:00Z\",61,\n"
+ " \"2021-01-01T00:00:00Z\",122,\n"
+ " \"2021-12-01T00:00:00Z\",1";
List<String> list = Pattern.compile("(\\\"\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z\\\",\\d+)(?=,|$)")
.matcher(s)
.results()
.map(MatchResult::group)
.collect(Collectors.toList());
list.stream()
.forEach(p -> System.out.println(p));
}
}
Output:
"2020-02-01T00:00:00Z",1
"2020-04-01T00:00:00Z",4
"2020-05-01T00:00:00Z",2
"2020-06-01T00:00:00Z",31
"2020-07-01T00:00:00Z",60
"2020-08-01T00:00:00Z",19
"2020-09-01T00:00:00Z",10
"2020-10-01T00:00:00Z",33
"2020-11-01T00:00:00Z",280
"2020-12-01T00:00:00Z",61
"2021-01-01T00:00:00Z",122
"2021-12-01T00:00:00Z",1
Why can't you just split on , and ignore the last value?
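Splitting can also work directly if a comma counts as a separator only when the opening quote of the next record follows it. A minimal sketch, assuming every record starts with a double-quoted date-time; the class name SplitDemo and the helper splitRecords are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;

public class SplitDemo {
    // Splits on a comma (plus any whitespace after it) only when the next
    // character is a double quote, i.e. at the boundary between two records.
    static List<String> splitRecords(String s) {
        return Arrays.asList(s.trim().split(",\\s*(?=\")"));
    }

    public static void main(String[] args) {
        String s = "\"2020-02-01T00:00:00Z\",1,\n"
                + " \"2020-04-01T00:00:00Z\",4,\n"
                + " \"2021-12-01T00:00:00Z\",1";
        List<String> parts = splitRecords(s);
        System.out.println(parts.get(0)); // "2020-02-01T00:00:00Z",1
        System.out.println(parts.get(1)); // "2020-04-01T00:00:00Z",4
    }
}
```

The comma between the date-time and the number is never followed by a quote, so it stays inside the token.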
Here's your pattern:
final Pattern pattern = Pattern.compile("(\\S+),(\\d+)");
final Matcher matcher = pattern.matcher("Input....");
Here's how to use it:
while (matcher.find()) {
final String date = matcher.group(1);
final String number = matcher.group(2);
}
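Since the question mentions passing a map to the UI, the find() loop above can fill one directly. This is only a sketch under a few assumptions: the pattern is a variation that captures the date-time without its quotes, the numbers fit in an int, and the names PairsToMap and toMap are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PairsToMap {
    // Group 1: text between the double quotes, group 2: the number after the comma.
    private static final Pattern PAIR = Pattern.compile("\"([^\"]+)\",(\\d+)");

    static Map<String, Integer> toMap(String input) {
        // LinkedHashMap keeps the records in the order they appear in the input.
        Map<String, Integer> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            result.put(matcher.group(1), Integer.parseInt(matcher.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "\"2020-02-01T00:00:00Z\",1,\n \"2020-04-01T00:00:00Z\",4";
        System.out.println(toMap(input)); // {2020-02-01T00:00:00Z=1, 2020-04-01T00:00:00Z=4}
    }
}
```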
I'm facing strange behaviour in Java 8 when using (\r?\n) inside a regex to parse a text file, with the Eclipse IDE running under Java 8.
See the regex101 test demo: https://regex101.com/r/QHSsfQ/4
The regex works fine with Eclipse on Java 7,
but with the IDE running on Java 8 it doesn't work (see the code below).
Can someone help me solve this?
String REGEX =
"\\s+NAME.*" + "\\r?\\n"
+ "INFO-\\d{1,2}\\s+(?<name>[$\\w]+).*" + "\\r?\\n"
+ ".*" + "\\r?\\n"
+ ".*VERAT2.*" + "\\r?\\n"
+ "\\s+\\w+\\s+(?<verat2>\\w+).*"
.......
.......
Matcher matcher = Pattern.compile( REGEX ).matcher( data );
if( matcher.find() )
{
System.out.println("LEVELINFO=DATA=" + matcher.group("name") + " &&NAME=" + matcher.group("name") +" &&VERAT2="+ matcher.group("verat2")+"\n");
}
}
sc.close();
The sample text file looks like this:
DATA NAME MAC1
INFO-0 EQUIP Q10
VL VER VERAT2
V22 V22
thanks
Alternative regex:
String regexName = "^DATA\\s+NAME\\s+.*?^\\S+\\s+(?<name>\\S+)";
String regexVerat2 = "\\s+VER\\s+VERAT2\\s+.*?^\\s+\\S+\\s+(?<verat2>\\S+)";
String regex = String.format("%s.*?%s", regexName, regexVerat2);
Matcher matcher = Pattern.compile(regex, Pattern.MULTILINE|Pattern.DOTALL).matcher(input);
Regex in context:
public static void main(String[] args) {
String input =
"DATA NAME MAC1 MAC2\n"
+ "INFO-0 EQUIP Q10 Q13\n"
+ " \n"
+ " VL VER VERAT2 MAP\n"
+ " V22 V22 SELF100\n"
+ " \n"
+ " CMD1 CMD2 CMD3 CMD4 CMD4 \n"
+ " NO 44 FAL BYTE\n";
String regexName = "^DATA\\s+NAME\\s+.*?^\\S+\\s+(?<name>\\S+)";
String regexVerat2 = "\\s+VER\\s+VERAT2\\s+.*?^\\s+\\S+\\s+(?<verat2>\\S+)";
String regex = String.format("%s.*?%s", regexName, regexVerat2);
Matcher matcher = Pattern.compile(regex, Pattern.MULTILINE|Pattern.DOTALL).matcher(input);
while(matcher.find()) {
System.out.println("Name: " + matcher.group("name"));
System.out.println("Verat2 : " + matcher.group("verat2"));
}
}
Output:
Name: EQUIP
Verat2 : V22
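If line endings turn out to be part of the problem (an assumption, the question does not confirm it), Java 8 and later also offer the linebreak matcher \R, which covers \r\n, \n and \r in one token. A small sketch using only the first two lines of the sample; the class LineBreakDemo and the method extractName are illustrative names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineBreakDemo {
    // \R (Java 8+) matches \r\n as well as a lone \n or \r.
    private static final Pattern HEADER = Pattern.compile(
            "NAME.*\\R" + "INFO-\\d{1,2}\\s+(?<name>[$\\w]+)");

    static String extractName(String data) {
        Matcher m = HEADER.matcher(data);
        // Returns null when the two header lines are not found.
        return m.find() ? m.group("name") : null;
    }

    public static void main(String[] args) {
        System.out.println(extractName("DATA NAME MAC1\r\nINFO-0 EQUIP Q10\r\n")); // EQUIP
        System.out.println(extractName("DATA NAME MAC1\nINFO-0 EQUIP Q10\n"));     // EQUIP
    }
}
```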
I have XML output like this (the <xml> element and the xlink:href attribute are just placeholders, so the regex pattern cannot rely on them):
<xml>http://localhost:8080/def/abc/xyx</xml>
<element xlink:href="http://localhostABCDEF/def/ABC/XYZ">Some Text</element>
...
What I want to do is use a Java regex to replace the domain part (I don't know the existing domains in advance) matched by:
http(s)?://.*/def/.*
with an input domain (e.g. http://google.com/def), so that the result is:
<xml>http://google.com/def/abc/xyx</xml>
<element xlink:href="http://google.com/def/ABC/XYZ">Some Text</element>
...
How can I do it? I think a Java regex can do this, or String.replaceAll (although the latter does not seem to be possible).
Regex: http[s]?:\/{2}.+\/def Substitution: http://google.com/def
Details:
? Matches between zero and one times
[] Match a single character present in the list
. Matches any character
+ Matches between one and unlimited times
Java code:
String domain = "http://google.com/def";
String html = "<xml>http://localhost:8080/def/abc/xyx</xml>\r\n<element xlink:href=\"http://localhostABCDEF/def/ABC/XYZ\">Some Text</element>";
html = html.replaceAll("http[s]?:\\/{2}.+\\/def", domain);
System.out.print(html);
Output:
<xml>http://google.com/def/abc/xyx</xml>
<element xlink:href="http://google.com/def/ABC/XYZ">Some Text</element>
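One caveat: .+ is greedy, so when two URLs share a line it matches from the first http to the last /def on that line. A stricter variant can limit the host part to characters that cannot end it. The sketch below assumes the host is followed directly by /def/ as in the examples; DomainReplaceDemo and replaceDomain are made-up names:

```java
import java.util.regex.Matcher;

public class DomainReplaceDemo {
    static String replaceDomain(String text, String newPrefix) {
        // [^/\s"<>]+ stops at the first slash, whitespace, quote or angle bracket,
        // so each URL is matched on its own even when several share a line.
        // quoteReplacement keeps '$' and '\' in newPrefix literal.
        return text.replaceAll(
                "https?://[^/\\s\"<>]+/def(?=/)",
                Matcher.quoteReplacement(newPrefix));
    }

    public static void main(String[] args) {
        String line = "<a>http://localhost:8080/def/abc</a><b>https://localhostABCDEF/def/XYZ</b>";
        System.out.println(replaceDomain(line, "http://google.com/def"));
        // <a>http://google.com/def/abc</a><b>http://google.com/def/XYZ</b>
    }
}
```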
Actually, this can be done with a regex, and it is simpler than parsing the XML document. Here is the answer:
String text = "<epsg:CommonMetaData>\n"
+ " <epsg:type>geographic 2D</epsg:type>\n"
+ " <epsg:informationSource>EPSG. See 3D CRS for original information source.</epsg:informationSource>\n"
+ " <epsg:revisionDate>2007-08-27</epsg:revisionDate>\n"
+ " <epsg:changes>\n"
+ " <epsg:changeID xlink:href=\"http://www.opengis.net/def/change-request/EPSG/0/2002.151\"/>\n"
+ " <epsg:changeID xlink:href=\"http://www.opengis.net/def/change-request/EPSG/0/2003.370\"/>\n"
+ " <epsg:changeID xlink:href=\"http://www.opengis.net/def/change-request/EPSG/0/2006.810\"/>\n"
+ " <epsg:changeID xlink:href=\"http://www.opengis.net/def/change-request/EPSG/0/2007.079\"/>\n"
+ " </epsg:changes>\n"
+ " <epsg:show>true</epsg:show>\n"
+ " <epsg:isDeprecated>false</epsg:isDeprecated>\n"
+ " </epsg:CommonMetaData>\n"
+ " </gml:metaDataProperty>\n"
+ " <gml:metaDataProperty>\n"
+ " <epsg:CRSMetaData>\n"
+ " <epsg:projectionConversion xlink:href=\"http://www.opengis.net/def/coordinateOperation/EPSG/0/15593\"/>\n"
+ " <epsg:sourceGeographicCRS xlink:href=\"http://www.opengis.net/def/crs/EPSG/0/4979\"/>\n"
+ " </epsg:CRSMetaData>\n"
+ " </gml:metaDataProperty>"
+ "<gml:identifier codeSpace=\"OGP\">http://www.opengis.net/def/area/EPSG/0/1262</gml:identifier>";
String patternString1 = "(http(s)?://.*/def/.*)";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
String prefixDomain = "http://localhost:8080/def";
StringBuffer sb = new StringBuffer();
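while (matcher.find()) {
// Keep everything after the first "def"; the limit of 2 stops a later "def" from truncating the URL
String url = prefixDomain + matcher.group(1).split("def", 2)[1];
// quoteReplacement keeps any '$' or '\' in the URL from being read as a group reference
matcher.appendReplacement(sb, Matcher.quoteReplacement(url));
System.out.println(url);
}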
matcher.appendTail(sb);
System.out.println(sb.toString());
The resulting output can be seen at https://www.diffchecker.com/CyJ8fY8p
The regex below works fine in most regex tools. However, it's not working in my Java code. Can anyone please advise?
String text="CHANGE FEE/ADD COLLECT DATA "+
"1.1 COLOR/RED TOMATO "+
"CF USD10.00 "+
" "+
"2.2 COLOR/DARK BLUE PLUM "+
"CF USD11.00 "+
" ";
String patterString = "([0-9]{1,3}\\.[0-9]{1,3})\\s.+\\s*CF\\s+[a-zA-Z]{1,5}([0-9]{1,10}.[0-9]{2})";
Pattern pattern = Pattern.compile(patterString);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println("found: " + matcher.group(1) +">>>"+ matcher.group(2));
}
actual output:
found: 1.1>>>11.00
expected output:
found: 1.1>>>10.00
found: 2.2>>>11.00
Your regex needs to be (with the decimal point escaped as well, so that it only matches a literal dot):
String patterString = "([0-9]{1,3}\\.[0-9]{1,3}).*?CF\\s+[a-zA-Z]{1,5}([0-9]{1,10}\\.[0-9]{2})";
Which yields:
found: 1.1>>>10.00
found: 2.2>>>11.00
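find() does not switch on MULTILINE or any other flag. The string literals are concatenated without line breaks, so the whole text is a single line, and the greedy .+ in \\s.+\\s* runs to the end of the input and backtracks only as far as the last CF. Replacing that portion with the reluctant .*? stops at the first CF instead ;-) In a regex tool the text was probably pasted on separate lines, where . stops at each line break, which would explain why it appeared to work there.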
Edit, sample source:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexFind {
public static void main(String[] args)
{
String text="CHANGE FEE/ADD COLLECT DATA "+
"1.1 COLOR/RED TOMATO "+
"CF USD10.00 "+
" "+
"2.2 COLOR/DARK BLUE PLUM "+
"CF USD11.00 "+
" ";
//String patterString = "([0-9]{1,3}\\.[0-9]{1,3})\\s.+\\s*CF\\s+[a-zA-Z]{1,5}([0-9]{1,10}.[0-9]{2})";
String patterString = "([0-9]{1,3}\\.[0-9]{1,3}).*?CF\\s+[a-zA-Z]{1,5}([0-9]{1,10}\\.[0-9]{2})";
Pattern pattern = Pattern.compile(patterString);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println("found: " + matcher.group(1) +">>>"+ matcher.group(2));
}
}
}
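To see the effect of greediness and of line breaks side by side, here is a small comparison. It is a sketch only: the shortened sample strings and the names GreedyVsLazy and findAll are illustrative, and the multi-line variant is a guess at what the text looked like in the regex tools:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    static final String GREEDY =
            "([0-9]{1,3}\\.[0-9]{1,3})\\s.+\\s*CF\\s+[a-zA-Z]{1,5}([0-9]{1,10}\\.[0-9]{2})";
    static final String LAZY =
            "([0-9]{1,3}\\.[0-9]{1,3}).*?CF\\s+[a-zA-Z]{1,5}([0-9]{1,10}\\.[0-9]{2})";

    // Collects "group1>>>group2" for every match of regex in text.
    static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group(1) + ">>>" + m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String oneLine = "1.1 COLOR/RED TOMATO CF USD10.00 2.2 COLOR/DARK BLUE PLUM CF USD11.00 ";
        String multiLine = "1.1 COLOR/RED TOMATO\nCF USD10.00\n\n2.2 COLOR/DARK BLUE PLUM\nCF USD11.00\n";

        System.out.println(findAll(GREEDY, oneLine));   // [1.1>>>11.00]
        System.out.println(findAll(GREEDY, multiLine)); // [1.1>>>10.00, 2.2>>>11.00]
        System.out.println(findAll(LAZY, oneLine));     // [1.1>>>10.00, 2.2>>>11.00]
    }
}
```

The greedy pattern only gives the expected result when a line break stops .+ before the next record; the reluctant pattern does not depend on that.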
How can I extract only the link from a string, using a regex or any other approach?
Here is the Java code:
String aas = "window.open("+"\""+"http://www.example.com/jscript/jex5.htm"+"\""+")"+"\n"+"window.open("+"\""+"http://www.example.com/jscript/jex5.htm"+"\""+")";
How do I get the link http://www.example.com/jscript/jex5.htm?
Thanks in advance.
The regex
(?<=window\.open\(")[^"]*(?="\))
matches the link in the string you have given. Properly escaped as a Java string literal, it reads
"(?<=window\\.open\\(\")[^\"]*(?=\"\\))"
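This will print the URL on the first line of the string, provided it starts with "http://". Because matches() has to cover the entire input, the pattern only fits a string with exactly two lines: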
public static void main(String[] args) throws Exception {
String javascriptString = "window.open(" + "\"" + "http://www.example.com/jscript/jex5.htm" + "\"" + ")" + "\n" + "window.open(" + "\""
+ "http://www.example.com/jscript/jex5.htm" + "\"" + ")";
Pattern pattern = Pattern.compile(".*(http://.*)\".*\n.*");
Matcher m = pattern.matcher(javascriptString);
if (m.matches()) {
System.out.println(m.group(1));
}
}
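If the string can hold any number of window.open(...) calls, a find() loop with a capturing group collects every link. A minimal sketch, assuming each link is a double-quoted argument of window.open; LinkExtractor and extractLinks are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {
    // Group 1 is whatever sits between the double quotes of window.open("...").
    private static final Pattern OPEN_CALL =
            Pattern.compile("window\\.open\\(\"([^\"]*)\"\\)");

    static List<String> extractLinks(String javascript) {
        List<String> links = new ArrayList<>();
        Matcher m = OPEN_CALL.matcher(javascript);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        String aas = "window.open(\"http://www.example.com/jscript/jex5.htm\")\n"
                + "window.open(\"http://www.example.com/jscript/jex5.htm\")";
        for (String link : extractLinks(aas)) {
            System.out.println(link); // http://www.example.com/jscript/jex5.htm (printed twice)
        }
    }
}
```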