Java pattern to check text line

I have text lines that I read on Android, and I want to check whether a line is acceptable; if it is, the code will run.
Here are my lines:
[al:Vol30]
[offset:0]
[00:37.00]3
[00:38.00]2
[00:39.00]1
[00:40.00]0/
So I want to check whether a line has a pattern like this: [00:37.00]3
I created a pattern with this code:
String pattern = "^[d{2}:d{2}.d{2}].";
....
//check line
if (str.matches(pattern)) {
// do something
}
However, this pattern is not correct, so every line fails. Can someone suggest a fix?

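Try this. The literal square brackets have to be escaped, and in a Java string literal each regex backslash is written twice: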
String pattern = "^\\[\\d{2}:\\d{2}.\\d{2}\\].";
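A minimal sketch of how that check could be wrapped up, under two assumptions that go beyond the answer: the dot between seconds and hundredths is escaped so that it only matches a literal ".", and the trailing "." is widened to ".*" so that lines with more than one character of text (such as [00:40.00]0/) are accepted too. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class LyricLineSketch {
    // [mm:ss.xx] followed by any text; brackets and the dot are escaped.
    private static final Pattern TIMED_LINE =
            Pattern.compile("^\\[\\d{2}:\\d{2}\\.\\d{2}\\].*");

    // matches() requires the whole line to fit the pattern.
    static boolean isTimedLine(String line) {
        return TIMED_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] lines = {"[al:Vol30]", "[offset:0]", "[00:37.00]3", "[00:40.00]0/"};
        for (String line : lines) {
            System.out.println(line + " -> " + isTimedLine(line));
        }
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex for every line, which is what String.matches() does internally.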

Related

Cannot match string with regex pattern when such string is done of multiple lines

I have a string like the following:
SYBASE_OCS=OCS-12_5
SYBASE=/opt/sybase/oc12.5.1-EBF12850
//there is a newline here as well
I am trying to match the part coming after SYBASE=, meaning I'm trying to match /opt/sybase/oc12.5.1-EBF12850.
To do that, I've written the following code:
String key = "SYBASE";
Pattern extractorPattern = Pattern.compile("^" + key + "=(.+)$");
Matcher matcher = extractorPattern.matcher(variableDefinition);
if (matcher.find()) {
return matcher.group(1);
}
The problem I'm having is that this string on 2 lines is not matched by my regex, even if the same regex seems to work fine on regex 101.
State of my tests:
If I don't have multiple lines (e.g. if I only had SYBASE=... followed by the new line), it would match
If I evaluate the expression extractorPattern.matcher("SYBASE_OCS=OCS-12_5\\nSYBASE=/opt/sybase/oc12.5.1-EBF12850\\n") (note the double backslash in front of the new line), it would match.
I have tried to use variableDefinition.replace("\n", "\\n") to what I give to the matcher(), but it doesn't match.
It seems something simple but I can't get out of it. Can anyone please help?
Note: the string in that format is returned by a shell command, I can't really change the way it gets returned.

By default, the anchors ^ and $ anchor the match to the start and end of the entire input.
In your case you want to match the start and end of a line within the input string. To do this you need to change the behavior of these anchors, which is what the multiline flag does.
Either by specifying it as an argument to Pattern.compile:
Pattern.compile("regex", Pattern.MULTILINE)
Or by using the embedded flag expression: (?m):
Pattern.compile("(?m)^" + key + "=(.+)$");
The reason it seemed to work on regex101.com is that the site enables both the global and the multiline flag by default.
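As a rough, self-contained sketch of the fix applied to the question's input: the class and method names below are hypothetical, and Pattern.quote is an extra precaution (not part of the answer) in case the key ever contains regex metacharacters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EnvValueSketch {
    // Returns the text after "key=" on whichever line starts with it, or null.
    static String valueOf(String key, String variableDefinition) {
        // MULTILINE makes ^ and $ match at every line boundary, not just at the ends of the input.
        Pattern p = Pattern.compile("^" + Pattern.quote(key) + "=(.+)$", Pattern.MULTILINE);
        Matcher m = p.matcher(variableDefinition);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String shellOutput = "SYBASE_OCS=OCS-12_5\nSYBASE=/opt/sybase/oc12.5.1-EBF12850\n";
        System.out.println(valueOf("SYBASE", shellOutput));
    }
}
```

Without the flag, the same pattern only has a chance when the SYBASE= line is the first line of the input, which matches the behavior described in the question.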

Regex expression for multiple patterns in 1 line

I am scraping information from a log, and I need 3 elements from it. An added difficulty is that I am parsing the log via readLine() in my Java program, i.e. one line at a time. (If there is a way to read multiple lines when parsing, let me know :) ) NOTE: I have no control over the log output format.
There are 2 possibilities of what I must extract. Either the log is nice and gives the following
NICE FORMAT
.text.rank 0x0000000000400b8f 0x351 is_x86.o
where I must grab .text.rank , 0x0000000000400b8f , and 0x351
Now the not so nice case: if the name is too long, it bumps everything else to the next line, as shown below. The only thing left after the first element is one blank space followed by a newline (\n), which gets clobbered by readLine() anyway.
EVIL FORMAT : Note each line is in a separate arraylist entry.
.text.__sfmoreglue
0x0000000000401d00 0x55 /mnt/drv2homelibc_popcorn.a(lib_a-findfp.o)
Therefore what the regex actually sees is:
.text.__sfmoreglue
CORNER CASE FORMAT that also occurs within the log but I DO NOT want
*(.text.unlikely)
Finally below is my Pattern line I am currently using for the first line and pline2 is what is used on the next line when group 2 of the first line is empty.
UPDATE: The pattern below works for the NICE FORMAT and EVIL FORMAT But now pattern pline2 has no matches, even though on regex101.com it is correct. Link: https://regex101.com/r/vS7vZ3/9
UPDATE2: I fixed it, I forgot to add m2.find() once I compiled the second line with Pattern pline2. Corrected code is below.
Pattern p = Pattern.compile("^[ \\s](\\.[tex]*\\.[\\._\\-\\#a-zA-Z0-9]*)\\s*([x0-9a-f]*)[ \\s]*([x0-9a-f]*).*");
Pattern pline2 = Pattern.compile("^\\s*([x0-9a-f]*)[ \\s]*([x0-9a-f]*)\\s*[\\w\\(\\)\\.\\-]*");
To give a little background: I first match the name .text.whatever into m.group(1), followed by the address 0x000012345 in m.group(2), and finally the size 0xa48 in m.group(3). This all assumes the log is in the NICE format. If it is in the EVIL format, I see that group(2) is empty, so I read the next line of the log into a temp buffer and apply the second pattern, pline2, to that new line.
Can someone help me with the regex?
Is there a way I can make sure my current line (or even better, just the second grouping) is either the NICE FORMAT or is empty?
As requested my java code:
//1st line pattern
Pattern p = Pattern.compile("^[ \\s](\\.[tex]*\\.[\\._\\-\\#a-zA-Z0-9]*)\\s*([x0-9a-f]*)[ \\s]*([x0-9a-f]*).*");
//conditional 2nd line pattern
Pattern pline2 = Pattern.compile("^\\s*([x0-9a-f]*)[ \\s]*([x0-9a-f]*)\\s*[\\w\\(\\)\\.\\-]*");
while((temp = br1.readLine()) != null){
Matcher m = p.matcher(temp);
while(m.find()){
System.out.println("What regex finds: m1:"+m.group(1)+"# m2:"+m.group(2)+"# m3:"+m.group(3));
if(!m.group(1).isEmpty() && m.group(2).isEmpty() && m.group(3).isEmpty()){
//means we probably hit a long symbol name and important stuff is on the next line
//save the name at least
name = m.group(1);
//read and utilize the next line
if((temp = br1.readLine()) == null){
return;
}
System.out.println("EVILline2:"+temp); //sanity check the input
System.out.println(pline2.toString()); //sanity check the regex
Matcher m2= pline2.matcher(temp);
while(m2.find()){
System.out.println("regex line2 finds: m1:"+m2.group(1));//+"# m2:"+m2.group(2));
if(m2.group(2).isEmpty()){
size = 0;
}else{
size = Long.parseLong(m2.group(2).replaceFirst("0x", ""),16);
}
addr = Long.parseLong(m2.group(1).replaceFirst("0x", ""),16);
System.out.println("#########LONG NAME: "+name+" addr:"+addr+" size:"+size);
}
}//end if
else{ // assume in NICE FORMAT
//do nice format stuff.
}//end else
}//end while
}//end outerwhile
An Aside, The output I currently get:
line: .text.c_print_results
What regex finds: m1:.text.c_print_results# m2:# m3:
EVIL FORMATline2: 0x00000000004001e6 0x231 c_print_results_x86.o
^\s*([x0-9a-f]*)[ \s]*([x0-9a-f]*)\s*[\w\(\)\.\-]*
Exception in thread "main" java.lang.IllegalStateException: No match found
at java.util.regex.Matcher.group(Matcher.java:536)
at java.util.regex.Matcher.group(Matcher.java:496)
at regexTest.regex.grabSymbolsInRange(regex.java:143)
at regexTest.regex.main(regex.java:489)

You have a few issues with your pattern.
The first is the separation between the first and second groups (that's why group 2 comes back empty).
You have 4 groups and you need 3.
After capturing your 3 values you can stop matching, so the pattern after the last group isn't necessary.
On regex101 you need the global modifier (g) so it returns all matches. Java has no such flag; calling Matcher.find() in a loop does the same job, so do not append /g to the pattern string, where it would be treated as literal text.
So, instead of your posted regex, you can try:
(\\.[tex]*\\.[\\._\\-\\#a-zA-Z0-9]*)\\s*([x0-9a-f]*)[ \\s]+([x0-9a-f]*)
Tested on Regex101.com:
https://regex101.com/r/lM4bQ9/1
Other than that, a few suggestions:
If you know your text is going to start with text, just put it in the pattern. Don't use [tex]*, which matches any run of the letters t, e and x and makes the engine do extra work.
[ \s] is the same as \s.
[\._\-\#a-zA-Z0-9]*, from what I understood, is basically everything but whitespace, so why not just use [^\s]*
With these in mind, I would suggest this pattern instead:
(\\.text\\.[^\\s]*)\\s*([x0-9a-f]*)\\s+([x0-9a-f]*)
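One possible way to handle the NICE and EVIL formats together is sketched below. It is not the asker's or the answerer's code: it assumes the lines are already in a list, uses three stricter patterns of its own (a full line, a name-only line, and an address/size continuation line), and all class, field and method names are invented for the example. Note that Java has no g modifier; the patterns are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkerMapSketch {
    // Name, address and size on one line (NICE format).
    private static final Pattern NICE =
            Pattern.compile("^\\s*(\\.text\\.\\S+)\\s+0x([0-9a-f]+)\\s+0x([0-9a-f]+)");
    // Only a name on the line (first half of the EVIL format).
    private static final Pattern NAME_ONLY = Pattern.compile("^\\s*(\\.text\\.\\S+)\\s*$");
    // Address and size at the start of the following line.
    private static final Pattern ADDR_SIZE =
            Pattern.compile("^\\s*0x([0-9a-f]+)\\s+0x([0-9a-f]+)");

    static List<String> parse(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            Matcher nice = NICE.matcher(lines.get(i));
            if (nice.find()) {
                out.add(describe(nice.group(1), nice.group(2), nice.group(3)));
                continue;
            }
            Matcher nameOnly = NAME_ONLY.matcher(lines.get(i));
            if (nameOnly.find() && i + 1 < lines.size()) {
                Matcher rest = ADDR_SIZE.matcher(lines.get(i + 1));
                if (rest.find()) {
                    out.add(describe(nameOnly.group(1), rest.group(1), rest.group(2)));
                    i++; // the continuation line has been consumed
                }
            }
            // Anything else, e.g. *(.text.unlikely), is skipped.
        }
        return out;
    }

    // Hex digits arrive without the 0x prefix because the prefix sits outside the groups.
    private static String describe(String name, String addrHex, String sizeHex) {
        long addr = Long.parseLong(addrHex, 16);
        long size = Long.parseLong(sizeHex, 16);
        return name + " addr=" + addr + " size=" + size;
    }
}
```

Always calling find() (or matches()) before group() is what avoids the IllegalStateException shown in the question's output.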

How to chomp previous line regex?

I have the following input:
-- input --
Keep this
Chomp this
ChompHere:
Anything below gets chomped
And I need the output to look like:
-- output (expected) --
Keep this
Right now I get the following based on the code below:
-- output (actual) --
Keep this
Chomp this
ASK: How can I delete the previous line of a regex match (Chomp this):
public void chompPreviousLine() {
String text = "Keep this\n"
+ "Chomp this\nChompHere:\nAnything below gets chomped";
Pattern CHOMP= Pattern.compile("^(ChompHere:(.*))$", Pattern.MULTILINE | Pattern.DOTALL);
Matcher m = CHOMP.matcher(text);
if (m.find()) {
// chomp everything below and one line above!
text = m.replaceAll("");
// but....??? how to delete the previous line ???
text = text.replaceAll("[\n]+$", ""); // delete any remaining \n
System.out.println(text);
}
}

You can modify the regex so that it also gets the previous line:
Pattern CHOMP= Pattern.compile("[^\n]+\nChompHere:(.*)", Pattern.MULTILINE | Pattern.DOTALL);
[^\n]+\n will match any consecutive character that is not an end-of-line character then the end-of-line itself. Since it is before ChompHere in the regex, it will match the complete line before ChompHere.
I have removed the parentheses since you don't really use groups in your algorithm; you are replacing the whole matched text anyway.
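Put together, the modified regex might be used like this. This is only a sketch with a hypothetical class name; MULTILINE is dropped here because the pattern no longer contains ^ or $, while DOTALL is kept so that .* swallows everything below the marker, line breaks included.

```java
import java.util.regex.Pattern;

class ChompRegexSketch {
    // One full line, its line break, the marker, and everything after it.
    private static final Pattern CHOMP =
            Pattern.compile("[^\n]+\nChompHere:.*", Pattern.DOTALL);

    static String chomp(String text) {
        String cut = CHOMP.matcher(text).replaceAll("");
        // Remove the line break left behind by the last kept line.
        return cut.replaceAll("\n+$", "");
    }

    public static void main(String[] args) {
        String text = "Keep this\n"
                + "Chomp this\nChompHere:\nAnything below gets chomped";
        System.out.println(chomp(text)); // prints "Keep this"
    }
}
```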

You could use a positive look-ahead:
^(.*)(?=ChompHere:)
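The line break sits between the previous line and ChompHere:, so it has to be part of either the match or the look-ahead, depending on whether you want it removed; for example ^(.*)(?=\nChompHere:) with the MULTILINE flag leaves the line break in place.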
But would a simple parser not be easier for this?
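For comparison, a line-based approach without any regex could look roughly like the following. It is a sketch under the assumption that lines are separated by \n and that the marker appears at the start of its line; the class and method names are made up.

```java
import java.util.Arrays;

class ChompLinesSketch {
    // Keeps every line before the one that precedes the marker line.
    static String chomp(String text, String marker) {
        String[] lines = text.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].startsWith(marker)) {
                int keep = Math.max(0, i - 1); // also drop the line just above the marker
                return String.join("\n", Arrays.copyOfRange(lines, 0, keep));
            }
        }
        return text; // no marker found: leave the text untouched
    }

    public static void main(String[] args) {
        String text = "Keep this\nChomp this\nChompHere:\nAnything below gets chomped";
        System.out.println(chomp(text, "ChompHere:")); // prints "Keep this"
    }
}
```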

Use RegEx in Java to extract parameters in between parentheses

I'm writing a utility to extract the names of header files from JSPs. I have no problem reading the JSPs line by line and finding the lines I need. I am having a problem extracting the specific text needed using regex. After looking at many similar questions I'm hitting a brick wall.
An example of the String I'll be matching from within is:
<jsp:include page="<%=Pages.getString(\"MY_HEADER\")%>" flush="true"></jsp:include>
All I need is MY_HEADER for this example. Any time I have this tag:
<%=Pages.getString
I need what comes between this:
<%=Pages.getString(\" and this: )%>
Here is what I have currently (which is not working, I might add) :
String currentLine;
while ((currentLine = fileReader.readLine()) != null)
{
Pattern pattern = Pattern.compile("<%=Pages\\.getString\\(\\\\\"([^\\\\]*)");
Matcher matcher = pattern.matcher(currentLine);
while(matcher.find()) {
System.out.println(matcher.group(1).toString());
}}
I need to be able to use the Java RegEx API and regex to extract those header names.
Any help on this issue is greatly appreciated. Thanks!
EDIT:
Resolved this issue, thankfully. The tricky part was that, after being given the right regex, I had to take into account that the String I was feeding to the regex was always going to contain two \ characters ( (\"MY_HEADER\") ) that needed to be escaped in the pattern.
Here is what worked (thanks to the help ;-)):
Pattern pattern = Pattern.compile("<%=Pages\\.getString\\(\\\\\"([^\\\\\"]*)");

This should do the trick:
<%=Pages\\.getString\\(\\\\\"([^\\\\]*)
Yeah that's a scary number of back slashes. matcher.group(1) should return MY_HEADER. It starts at the \" and matches everything until the next \ (which I assume here will be at \")%>.)
Of course, if your target text contains a backslash (\), this will not work. But you didn't give an indication that you'd ever be looking for something like <%=Pages.getString(\"Fun!\Yay!\")%> -- where this regex would only return Fun! and ignore the rest.
EDIT
The reason your test case was failing is because you were using this test string:
String currentLine = "<%=Pages.getString(\"MY_HEADER\")%>";
This is the equivalent of reading it in from a file and seeing:
<%=Pages.getString("MY_HEADER")%>
Note the lack of any \. You need to use this instead:
String currentLine = "<%=Pages.getString(\\\"MY_HEADER\\\")%>";
Which is the equivalent of what you want.
This is test code that works:
String currentLine = "<%=Pages.getString(\\\"MY_HEADER\\\")%>";
Pattern pattern = Pattern.compile("<%=Pages\\.getString\\(\\\\\"([^\\\\]*)");
Matcher matcher = pattern.matcher(currentLine);
while(matcher.find()) {
System.out.println(matcher.group(1).toString());
}
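If the number of backslashes becomes hard to follow, one option is to let Pattern.quote handle the literal prefix, so that only the capturing part is written as a regex. This is an alternative sketch, not the pattern from the answer; the class and method names are hypothetical, and the character class also excludes the double quote, as in the asker's final version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderNameSketch {
    // The prefix <%=Pages.getString(\" is quoted literally; the group then
    // captures everything up to the next backslash or double quote.
    private static final Pattern HEADER =
            Pattern.compile(Pattern.quote("<%=Pages.getString(\\\"") + "([^\\\\\"]*)");

    static List<String> headerNames(String line) {
        List<String> names = new ArrayList<>();
        Matcher m = HEADER.matcher(line);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // The Java literal below stands for a file line containing real backslashes.
        String line = "<jsp:include page=\"<%=Pages.getString(\\\"MY_HEADER\\\")%>\" "
                + "flush=\"true\"></jsp:include>";
        System.out.println(headerNames(line)); // prints [MY_HEADER]
    }
}
```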

How to use \\p{Punct} in java to check for beginning of text line with: {"

Given a String that begins with the symbols {" and ends with "}. There are other punctuation marks in between as well, like , ' or "" etc. How can I use the Java regex utilities to find out whether the given String starts with {"? I am trying to return the boolean value by using:
Pattern.matches(begin, string)
where
begin = "[\\p{Punct}&&[{]]"
and
string = {"name":"Aman"},{"surname":"Gupta"}.
(Please suggest a regex option rather than a JSON parser.) I want to do it using regex only. Please suggest a way to achieve this.

You should try something like this:
Pattern p = Pattern.compile("\\{.*?\\}");
Matcher m = p.matcher(/*your string here*/);
while (m.find()){
String substringInBraces = m.group();
/*do something with your substring*/
}
This will give you each substring between an opening curly brace and the nearest closing one, braces included.

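Pattern.compile("^\\{\"").matcher(string).find()
The { has to be escaped, because an unescaped { makes Pattern.compile throw a PatternSyntaxException. I don't know why you insist on using \\p{Punct}; it's totally unnecessary here.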
Note that Pattern.matches() wants to match the entire string, so it is not useful when you only want to match something at the start of a string.
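The difference between find() and Pattern.matches() can be seen in a small sketch like this one (the class and method names are invented for illustration):

```java
import java.util.regex.Pattern;

class StartsWithBraceSketch {
    // ^ anchors at the start; \{ is an escaped literal brace, followed by a double quote.
    private static final Pattern STARTS = Pattern.compile("^\\{\"");

    // find() only needs the pattern to occur; the ^ anchor pins it to the start.
    static boolean startsWithBraceQuote(String s) {
        return STARTS.matcher(s).find();
    }

    public static void main(String[] args) {
        String s = "{\"name\":\"Aman\"},{\"surname\":\"Gupta\"}";
        System.out.println(startsWithBraceQuote(s));        // true
        // matches() must consume the entire string, so the same prefix pattern fails.
        System.out.println(Pattern.matches("\\{\"", s));    // false
    }
}
```

For a fixed prefix like this, s.startsWith("{\"") gives the same answer without any regex.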
