Parsing string that contains user entered free text [closed] - java

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 7 years ago.
I'm trying to parse strings of the following format in Java:
Number-Action-Msg, Number-Action-Msg, Number-Action-Msg, Number-Action-Msg, ...
For example
"512-WARN-Cannot update the name.,615-PREVENT-The app is currently down, please try again later.,736-PREVENT-Testing,"
I would like to get an array with the following entries:
512-WARN-Cannot update the name.
615-PREVENT-The app is currently down, please try again later.
736-PREVENT-Testing
The problem is that the message is user entered, so I can't rely on just the commas to split up the String. The actions will always be WARN or PREVENT. What's the best way to accomplish this parsing? Thanks!

Instead of splitting by comma you can use this lookahead based regex for matching:
(\d+-(?:WARN|PREVENT).*?)(?=,\d+-(?:WARN|PREVENT)|,$)
(?=,\d+-(?:WARN|PREVENT)|,$) is a positive lookahead asserting that what follows is either a comma, digits, a hyphen and WARN or PREVENT (the start of the next entry), or a comma at the end of the line.
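A minimal sketch of how this pattern might be applied with Matcher.find(). The class and method names (EntryParser, parse) are assumptions made for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
class EntryParser {
    // The lookahead-based pattern from the answer above.
    private static final Pattern ENTRY = Pattern.compile(
            "(\\d+-(?:WARN|PREVENT).*?)(?=,\\d+-(?:WARN|PREVENT)|,$)");

    // Collects every Number-Action-Msg entry found in the input.
    static List<String> parse(String input) {
        List<String> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            entries.add(m.group(1));
        }
        return entries;
    }

    public static void main(String[] args) {
        String input = "512-WARN-Cannot update the name.,"
                + "615-PREVENT-The app is currently down, please try again later.,"
                + "736-PREVENT-Testing,";
        for (String entry : parse(input)) {
            System.out.println(entry);
        }
    }
}
```

A message that itself contains a comma followed by digits and -WARN or -PREVENT would still be split at that point, so this only works as long as user text does not imitate the entry prefix.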

Seems quite simple:
Regular expression:
WARN|PREVENT
In Java:
import java.util.Arrays;
String string = "512-WARN-Cannot update the name.,615-PREVENT-The app is currently down, please try again later.,736-PREVENT-Testing,";
String regex = "WARN|PREVENT";
System.out.println(Arrays.toString(string.split(regex)));
Will output:
[512-, -Cannot update the name.,615-, -The app is currently down, please try again later.,736-, -Testing,]
Of course, you may want to adjust the regex to include the hyphens, for example:
String regex = "-WARN-|-PREVENT-";

Related

Java regex pattern too long? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 5 years ago.
I have this regex which is a bit longer than usual. I try to capture some values in a text document.
\\n*.*(k\\s=\\s\\d)(.|\\n)*?estimate\\s.*\\n*\\s*((\\d+|<)\\.\\d+)\\s*((\\d+|<)\\.\\d+)\\s*((\\d+|<)\\.\\d+)\\s*((\\d+|<)\\.\\d+)\\s*((\\d+|<)\\.\\d+)\\s*((\\d+|<)\\.\\d+)\\s+
It works perfectly fine on regexr.com, but in Java only this part works:
\\n*.*(k\\s=\\s\\d)(.|\\n)*?estimat
as soon as I add the missing 'e' it stops working.
For now I am ignoring that some groups are filled wrongly.
What goes wrong?
The (.|\\n)*? makes the regex engine perform too many redundant backtracking steps. Replace every such part of your pattern with (?s:.*?), a modifier group that matches zero or more characters including line breaks. Since it contains no alternation, there is no redundant backtracking.
Note that in JavaScript (the flavor you are testing with at regexr.com), (.|\n)*? should be replaced with [^]*? or [\s\S]*? instead, as its regex engine does not support inline modifiers.
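A small sketch of the (?s:.*?) replacement in Java. The pattern is a shortened variant of the one in the question, and the sample text, class name and method name are assumptions made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the sample text and names are assumptions.
class DotAllDemo {
    // Shortened variant of the question's pattern: (.|\n)*? is replaced
    // by (?s:.*?), which lets '.' match line breaks inside that group only.
    private static final Pattern P = Pattern.compile(
            "(k\\s=\\s\\d)(?s:.*?)estimate\\s.*\\n\\s*((?:\\d+|<)\\.\\d+)");

    // Returns {k expression, first value after the estimate line},
    // or null when the text does not match.
    static String[] firstEstimate(String text) {
        Matcher m = P.matcher(text);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String text = "header\nk = 3\nline a\nline b\nestimate values\n  0.12 <.05\n";
        String[] result = firstEstimate(text);
        System.out.println(result[0] + " -> " + result[1]);
    }
}
```

Outside the (?s:...) group the dot keeps its default meaning, so the .* after estimate still stops at the end of that line.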

Extract version from string using java regex [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 5 years ago.
I cannot seem to figure out the proper regex (Java style) to extract just the version part from these strings:
EXPECTATION
spring-aop-4.2.5.RELEASE.jar -> 4.2.5
rumi-1.js -> 1
BouncyCastle-Net-12-1.dll -> 12-1
With the following Java regex I keep getting a trailing period in the result:
\\b\\d[\\d|\\.|\\-]*\\b
Can anyone suggest a better regex?
FAULTY_RESULT
spring-aop-4.2.5.RELEASE.jar -> 4.2.5.
rumi-1.js -> 1.
BouncyCastle-Net-12-1.dll -> 12-1.
Digit(s), followed by zero or more lots of a dot/dash and digit(s), not preceded by a word character:
(?<!\w)\d+([.-]\d+)*
In java, it can be done in one line:
String version = packageName.replaceAll(".*?((?<!\\w)\\d+([.-]\\d+)*).*", "$1");
Here, the target term is captured while the regex matches the entire input and the replacement term returns the target (via the captured group).
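Note that replaceAll returns the input unchanged when the pattern does not match. A sketch of a Matcher.find() variant that makes the no-match case explicit is shown below; the class and method names are hypothetical, while the pattern is the one from this answer.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
class VersionExtractor {
    // Digits, then zero or more groups of a dot or dash plus digits,
    // not preceded by a word character.
    private static final Pattern VERSION =
            Pattern.compile("(?<!\\w)\\d+(?:[.-]\\d+)*");

    // Returns the first version-like substring, or empty if there is none.
    static Optional<String> extract(String fileName) {
        Matcher m = VERSION.matcher(fileName);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] names = {
            "spring-aop-4.2.5.RELEASE.jar",
            "rumi-1.js",
            "BouncyCastle-Net-12-1.dll"
        };
        for (String name : names) {
            System.out.println(name + " -> " + extract(name).orElse("(none)"));
        }
    }
}
```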
This expression can probably be improved, but it is my first attempt. Note that you need to escape it for use in a Java string.
(\d+(\.?|-?|\d)+\d|\d)
regards.
Try something like this:
\b\d[\d.\-]*\d\b
Inside a character class, | is a literal character, so it is not needed as a separator.
The backslashes only need to be doubled in the Java string constant:
"\\b\\d[\\d.\\-]*\\d\\b"
However, this will not match a single-digit version.
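One possible way to cover single-digit versions as well is to make everything after the first digit optional. The sketch below is an adaptation and not part of the original answer; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
class BoundaryVersion {
    // A digit, optionally followed by digits, dots or dashes that end in a
    // digit, delimited by word boundaries on both sides.
    private static final Pattern VERSION =
            Pattern.compile("\\b\\d(?:[\\d.-]*\\d)?\\b");

    // Returns the first match, or null when nothing matches.
    static String extract(String fileName) {
        Matcher m = VERSION.matcher(fileName);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("spring-aop-4.2.5.RELEASE.jar"));
        System.out.println(extract("rumi-1.js"));
        System.out.println(extract("BouncyCastle-Net-12-1.dll"));
    }
}
```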

Regex that removes everything but the number [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
I am trying to use Java's String.replaceAll() or replaceFirst() method to edit data read from a PDF document. A line of data that could be returned is:
21/1**E (6-11) 4479 77000327633 (U)
I want to store only the 77000327633 in a variable to work with, and I am looking for a regex that will capture ONLY this 11-digit number. I've tried searching around for a regex, but nothing seems to give me my desired outcome.
It could be done like this:
import java.util.regex.Pattern;
String value = "21/1**E (6-11) 4479 77000327633 (U)";
Pattern pattern = Pattern.compile(".* (\\d{11}) .*");
System.out.println(pattern.matcher(value).replaceAll("$1"));
Output:
77000327633
NB: This assumes that your number has exactly 11 digits and that there is a space before and after it.
NB2: It is not meant to be perfect; it only shows the idea, which is to define a pattern that matches the whole line with a capturing group and to replace everything with the content of that group.
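As an alternative to replacing the whole line, the number can also be pulled out directly with Matcher.find(). This is only a sketch: it assumes the number is delimited by word boundaries rather than spaces, and the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
class ElevenDigitFinder {
    // Exactly 11 digits with a word boundary on each side, so longer
    // digit runs are not partially matched.
    private static final Pattern NUMBER = Pattern.compile("\\b\\d{11}\\b");

    // Returns the first 11-digit number in the line, or null if none exists.
    static String find(String line) {
        Matcher m = NUMBER.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(find("21/1**E (6-11) 4479 77000327633 (U)"));
    }
}
```

Unlike the replaceAll approach, a line without an 11-digit number yields null here instead of the unchanged input.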
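Another option: (.*)[ ]([0-9]+)[ ](.*)
You can access your value using $2. Note that the quantifier must be inside the group; with ([0-9])* the group would capture only the last digit.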

How do I extract the following patterns in java [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I have a string in the following format:
String s = " some text....
[[Category:Anarchism| ]]
[[Category:Political culture]]
[[Category:Political ideologies]]
[[Category:Far-left politics]]
... some more text"
I want to extract all the categories from this text: [Anarchism, Political culture, ..., Far-left politics]
Also, is there a good tutorial where I can learn about regex pattern matching?
Thanks
You can use the following regex to get categories:
\[\[Category:(.+)\]\]
You can then read the category values from the capturing group.
Remember to double the backslashes when you write the pattern as a Java string literal:
\\[\\[Category:(.+)\\]\\]
Assuming you don't want to select the word "Category" itself, the regex would be:
(?<=Category:).*?(?=])
I'll break this down a bit for you.
The first group, (?<=Category:), is a lookbehind: it requires Category: immediately before the match, without actually selecting it.
Next, .*? matches zero or more characters (other than a newline), but as few as possible, stopping as soon as the next part is matched.
The final group, (?=]), is a lookahead: it requires a ] immediately after the match, without actually selecting it.
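A sketch of how the categories might be collected into a list in Java. It uses the capturing-group pattern from the first answer with a lazy quantifier; stripping everything from a | onward (as in "Anarchism| ") is an assumption about the desired output, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
class CategoryExtractor {
    // Captures the text between "[[Category:" and the next "]]".
    private static final Pattern CATEGORY =
            Pattern.compile("\\[\\[Category:(.+?)\\]\\]");

    static List<String> extract(String text) {
        List<String> categories = new ArrayList<>();
        Matcher m = CATEGORY.matcher(text);
        while (m.find()) {
            String raw = m.group(1);
            // Assumption: anything after a '|' is not part of the name.
            int pipe = raw.indexOf('|');
            String name = (pipe >= 0 ? raw.substring(0, pipe) : raw).trim();
            categories.add(name);
        }
        return categories;
    }

    public static void main(String[] args) {
        String s = " some text....\n"
                + "[[Category:Anarchism| ]]\n"
                + "[[Category:Political culture]]\n"
                + "[[Category:Political ideologies]]\n"
                + "[[Category:Far-left politics]]\n"
                + "... some more text";
        System.out.println(extract(s));
    }
}
```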

Regular expression based detection [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
I need to identify a pattern in given text (string), and I'm looking for a regex for the same. Using a Regex is preferable due to the framework I'm working in.
For instance, consider the text --
Problem:
<<< empty line(s) >>>>
Reason:
here goes some multi-line reasoning...
...
...
As you can see, there is no text (only empty lines) after "Problem:" and before "Reason:".
I need to be able to identify this pattern from the text given to me, using a regex.
Any help is much appreciated.
Thanks!
The simplest regex would be
Pattern regex = Pattern.compile("Problem:\\s+Reason:");
which finds the text Problem:, followed by one or more whitespace characters, followed by the text Reason:.
If you want to make sure that there are at least two linebreaks between the two texts, with nothing but spaces or tabs on the lines in between, you could also do
Pattern regex = Pattern.compile("Problem:(?:[ \t]*\r?\n){2,}Reason:");
but that's probably not necessary.
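A minimal sketch of how the simple pattern could be used to detect the situation; the class name, method name and sample texts are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
class EmptyProblemDetector {
    // "Problem:" followed only by whitespace (including line breaks)
    // up to "Reason:".
    private static final Pattern EMPTY_PROBLEM =
            Pattern.compile("Problem:\\s+Reason:");

    // True when nothing but whitespace separates the two labels.
    static boolean hasEmptyProblem(String text) {
        return EMPTY_PROBLEM.matcher(text).find();
    }

    public static void main(String[] args) {
        String empty = "Problem:\n\n\nReason:\nhere goes some reasoning...";
        String filled = "Problem:\ndisk is full\nReason:\nlogs were not rotated";
        System.out.println(hasEmptyProblem(empty));
        System.out.println(hasEmptyProblem(filled));
    }
}
```

Because \s+ also matches a single space, "Problem: Reason:" on one line counts as empty too.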
