Regular expression get the third element from a string - java

Hello, I'm having trouble getting the third element (F604080) of this string:
<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>
I have tried these regular expressions and some variations, but I can't manage to get
F604080.
(?<=\w+_)\w+(?=\<)
(?<=\w+_\w+_)\w+(?=\<)
....
Any help will be appreciated.
Thanks.

You don't need look behind or look ahead, instead just use this simple regex,
.*_(\w+)
and capture group 1.
Java code:
public static void main(String[] args) {
String s = "<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>";
Pattern p = Pattern.compile(".*_(\\w+)");
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println(m.group(1));
} else {
System.out.println("Didn't match");
}
}
Prints this like you wanted.
F604080

Using regex you can use something like >\w+_\w+_(\w+)<\/
String str = "<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>";
String code = null;
Matcher m = Pattern.compile(">\\w+_\\w+_(\\w+)</").matcher(str);
if (m.find()) {
code = m.group(1);
}
Simply use substring() operation
String code = str.substring(str.lastIndexOf('_') + 1, str.lastIndexOf('<'));
If you later need to parse XML with more elements, you could use something like the Java DOM parser, but it is not the best option here since you have only one element.
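For completeness, here is a minimal sketch of the DOM route mentioned above, using the JDK's built-in javax.xml.parsers API. The class and method names (DomThirdElement, lastPart) are made up for illustration, and the sketch assumes the input is a single well-formed element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomThirdElement {
    // Parses the XML fragment and returns the text after the last underscore.
    static String lastPart(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Text content of the root element, e.g. "AX02_APF604_F604080"
            String text = doc.getDocumentElement().getTextContent();
            return text.substring(text.lastIndexOf('_') + 1);
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause
            throw new IllegalArgumentException("Not well-formed XML: " + xml, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lastPart("<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>"));
    }
}
```

For a single element this is heavier than substring(), but it keeps working if the element later gains attributes or sits inside a larger document.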

Can you just parse the string using "_" as separator and take the 3rd element ?
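A rough sketch of that idea, assuming the input is always one simple element with no nested tags. The names SplitThirdElement and thirdElement are hypothetical. Note that the tags have to be stripped first, otherwise the third piece would still carry the closing tag.

```java
class SplitThirdElement {
    // Returns the third "_"-separated piece of the element text, or null if there are fewer than three.
    static String thirdElement(String xml) {
        // Keep only the text between the opening and closing tags
        String text = xml.substring(xml.indexOf('>') + 1, xml.lastIndexOf('<'));
        String[] parts = text.split("_");
        return parts.length >= 3 ? parts[2] : null;
    }

    public static void main(String[] args) {
        System.out.println(thirdElement("<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>"));
    }
}
```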

Both of your regular expressions seem to match the given string.
Anyway, you could be a little more specific with this one:
^(?:<\w+>)(?:\w+)_(?:\w+)_(\w+)(?:<\/\w+>)$
Make sure the input is exactly the string you think it is, with no additional text after it.
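As a sketch of how the anchored pattern could be used from Java (the class and method names here are invented for the example), Matcher.matches() requires the whole input to fit, so any extra text makes it fail:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoredMatch {
    private static final Pattern P =
            Pattern.compile("^(?:<\\w+>)(?:\\w+)_(?:\\w+)_(\\w+)(?:</\\w+>)$");

    // Returns capture group 1, or null when the input is not exactly one matching element.
    static String extract(String s) {
        Matcher m = P.matcher(s);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>";
        System.out.println(extract(s));
        System.out.println(extract(s + " extra")); // trailing text, so no match
    }
}
```

Keep in mind that \w also matches the underscore, so with more than three segments the groups may not split where you expect.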

Related

How get numeric between two character with regex java?

I have the string as follows :
SUB8&20.000,-&succes&09/12/18SUB12&100.000,-&failed&07/12/18SUB16&40.000,-&succes&09/12/18
I want to get the strings "8&20.000" and "16&40.000", i.e. the text between SUB and ,-&succes
I only want the succes records. How can I get these strings using a Java regex?
Use this regex,
SUB([^,]*),-&succes
Java code,
public static void main(String[] args) {
String s = "SUB8&20.000,-&succes&09/12/18SUB12&100.000,-&failed&07/12/18SUB16&40.000,-&succes&09/12/18";
Pattern p = Pattern.compile("SUB([^,]*),-&succes");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1));
}
}
Prints,
8&20.000
16&40.000
You could use the pattern SUB[^S]+&succes[^S]+ (spelled "succes", as in your data) and choose the one you want after that.
The two matches would be SUB8&20.000,-&succes&09/12/18 and SUB16&40.000,-&succes&09/12/18.
Once you have chosen, you can strip away the unwanted parts with [0-9]+&[0-9.]+.
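A possible sketch of this two-step approach, assuming the record pattern spells "succes" the way the sample data does. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoStepExtract {
    // Step 1: find whole "succes" records. Step 2: pull the number&amount part out of each.
    static List<String> successValues(String s) {
        Pattern record = Pattern.compile("SUB[^S]+&succes[^S]+");
        Pattern value = Pattern.compile("[0-9]+&[0-9.]+");
        List<String> result = new ArrayList<>();
        Matcher r = record.matcher(s);
        while (r.find()) {
            Matcher v = value.matcher(r.group());
            if (v.find()) {
                result.add(v.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "SUB8&20.000,-&succes&09/12/18SUB12&100.000,-&failed&07/12/18SUB16&40.000,-&succes&09/12/18";
        System.out.println(successValues(s));
    }
}
```

This relies on the data containing no uppercase S other than the one in SUB, since [^S]+ is what separates the records.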
I don't know whether I am answering your question properly, but this regex will give you the exact strings you are looking for.
(?<=SUB)([^,]*)(?=,-&succes)
https://regex101.com/r/RLFXNf/1
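In Java, the lookaround version could be used roughly like this (a sketch; the class name LookaroundExtract is invented). Because the lookbehind and lookahead consume nothing, the whole match is already the wanted text and no capture group is needed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundExtract {
    // Collects every run of non-comma characters that follows "SUB" and precedes ",-&succes".
    static List<String> find(String s) {
        Matcher m = Pattern.compile("(?<=SUB)([^,]*)(?=,-&succes)").matcher(s);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "SUB8&20.000,-&succes&09/12/18SUB12&100.000,-&failed&07/12/18SUB16&40.000,-&succes&09/12/18";
        System.out.println(find(s));
    }
}
```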

Remove parenthesis from String using java regex

I want to remove parentheses using a Java regular expression, but I get the error "No group 1". Please see my code below and help me.
public String find_parenthesis(String Expr){
String s;
String ss;
Pattern p = Pattern.compile("\\(.+?\\)");
Matcher m = p.matcher(Expr);
if(m.find()){
s = m.group(1);
ss = "("+s+")";
Expr = Expr.replaceAll(ss, s);
return find_parenthesis(Expr);
}
else
return Expr;
}
And this is my main method:
public static void main(String args[]){
Calculator c1 = new Calculator();
String s = "(4+5)+6";
System.out.println(s);
s = c1.find_parenthesis(s);
System.out.println(s);
}
The simplest method is to just remove all parentheses from the string, regardless of whether they are balanced or not.
String replaced = "(4+5)+6".replaceAll("[()]", "");
Correctly handling the balancing requires parsing (or truly ugly REs that only match to a limited depth, or “cleverness” with repeated regular expression substitutions). For most cases, such complexity is overkill; the simplest thing that could possibly work is good enough.
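If you do want only balanced pairs removed, the "repeated substitutions" trick mentioned above might look like the following sketch. The class name and the loop are my own illustration, not a standard API: each pass unwraps the innermost pairs, and the loop stops once nothing changes.

```java
import java.util.regex.Pattern;

class StripBalancedParens {
    // Matches a pair of parentheses with no other parentheses inside it.
    private static final Pattern INNERMOST = Pattern.compile("\\(([^()]*)\\)");

    // Repeatedly unwraps innermost pairs; unbalanced parentheses are left in place.
    static String strip(String expr) {
        String previous;
        do {
            previous = expr;
            expr = INNERMOST.matcher(expr).replaceAll("$1");
        } while (!expr.equals(previous));
        return expr;
    }

    public static void main(String[] args) {
        System.out.println(strip("(4+5)+6"));
        System.out.println(strip("((4+5)*2)+6"));
        System.out.println(strip("(4+5"));
    }
}
```

Note that dropping parentheses can change what the arithmetic expression means, so this is only a text transformation, not an evaluator.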
What you want is this: s = s.replaceAll("[()]","");
For more on regex, visit regex tutorial.
You're getting the error because your regex doesn't have any groups, but I suggest you use this much simpler, one-line approach:
expr = expr.replaceAll("\\((.+?)\\)", "$1");
You can't do this with a regex at all. It won't remove the matching parentheses, just the first left and the first right, and then you won't be able to get the correct result from the expression. You need a parser for expressions. Have a look around for recursive descent expression parsers, the Dijkstra shunting-yard algorithm, etc.
The regular expression below defines a character class consisting of any whitespace character (\s, written as \\s because we're passing it in a Java String), a dash (escaped because a dash has a special meaning inside character classes), and parentheses. Try this working code:
phoneNumber.replaceAll("[\\s\\-()]", "");
I know I'm very late here, but just in case you're still looking for a better answer: if you want to remove both opening and closing parentheses from a string, you can use a very simple method like this:
String s = "(4+5)+6";
s=s.replaceAll("\\(", "").replaceAll("\\)","");
If you are using this:
s=s.replaceAll("()", "");
the regex () is an empty capturing group: it matches the empty string at every position rather than a literal pair of parentheses, so the replacement removes nothing. Instead, escape each parenthesis and remove them separately.
To explain in detail, consider the below code:
String s = "(4+5)+6";
String s1=s.replaceAll("\\(", "").replaceAll("\\)","");
System.out.println(s1);
String s2 = s.replaceAll("()", "");
System.out.println(s2);
The output for this code will be:
4+5+6
(4+5)+6
Also, use replaceAll only if you are in need of a regex. In other cases, replace works just fine. See below:
String s = "(4+5)+6";
String s1=s.replace("(", "").replace(")","");
Output:
4+5+6
Hope this helps!

Regex and String Manipulation Techniques

Given that an input String may be specified as follows:
read(xpath('...')) or
xpath('...') or
...
Where ... just holds some xpath expression, for example, /comment/text
All I really want is the xpath expression. What would be an efficient way to extract this value in general, given the three possible valid patterns that could be specified?
Also, I am implementing this in Java.
Here is the basic example, it matches xpath part in both of your string examples:
import java.util.regex.*;
class Untitled {
public static void main(String[] args) {
String input = "read(xpath('...'))";
String result = null;
Pattern regex = Pattern.compile("xpath\\(\'(.*?)\'\\)");
Matcher matcher = regex.matcher(input);
if (matcher.find()) {
result = matcher.group(1);
}
System.out.println(result);
}
}
It would have helped if you had posted some code, but you can use String.split. Please refer to http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#split%28java.lang.String%29
If you are sure the input contains xpath('...'), then you can search for xpath(' as a prefix, strip everything up to it, and gather the data that follows until you hit another ' (apostrophe).
I hope this gives you an idea.
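Here is one way that idea could be sketched without any regex. The class and method names are hypothetical, and the sketch assumes the xpath expression itself contains no apostrophe; an input without the xpath(' prefix is treated as the bare expression (the third form).

```java
class XPathExtract {
    // Returns the text between xpath(' and the next apostrophe, or the input itself if there is no prefix.
    static String extract(String input) {
        String prefix = "xpath('";
        int start = input.indexOf(prefix);
        if (start < 0) {
            return input; // bare expression, nothing to strip
        }
        start += prefix.length();
        int end = input.indexOf('\'', start);
        // If the closing apostrophe is missing, take everything after the prefix
        return end < 0 ? input.substring(start) : input.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(extract("read(xpath('/comment/text'))"));
        System.out.println(extract("xpath('/comment/text')"));
        System.out.println(extract("/comment/text"));
    }
}
```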

Parse a string in Java

I have strings formatted similar to the one below in a Java program. I need to get the number out.
Host is up (0.0020s latency).
I need the number between the '(' and the 's' characters. E.g., I would need the 0.0020 in this example.
If you are sure it will always be the first number, you could use the regular expression \d+\.\d+ (but note that the backslashes need to be escaped in Java string literals).
Try this code:
String input = "Host is up (0.0020s latency).";
Pattern pattern = Pattern.compile("\\d+\\.\\d+");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
System.out.println(matcher.group());
}
See it working online: ideone
You could also include some of the surrounding characters in the regular expression to reduce the risk of matching the wrong number. To do exactly as you requested in the question (i.e. matching between ( and s) use this regular expression:
\((\d+\.\d+)s
See it working online: ideone
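In case the link is unavailable, a minimal sketch of the stricter pattern in Java could look like this (the class and method names are made up for the example). The number is in capture group 1, since the parenthesis and the s are part of the match but not of the group:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LatencyExtract {
    // Returns the decimal number between "(" and "s", or null if the line has no such part.
    static String latency(String line) {
        Matcher m = Pattern.compile("\\((\\d+\\.\\d+)s").matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(latency("Host is up (0.0020s latency)."));
    }
}
```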
Sounds like a case for regular expressions.
You'll want to match for the decimal figure and then parse that match:
Float matchedValue;
Pattern pattern = Pattern.compile("\\d*\\.\\d+");
Matcher matcher = pattern.matcher(yourString);
boolean isfound = matcher.find();
if (isfound) {
matchedValue = Float.valueOf(matcher.group(0));
}
It depends on how "similar" you mean. You could potentially use a regular expression:
import java.math.BigDecimal;
import java.util.regex.*;
public class Test {
public static void main(String args[]) throws Exception {
Pattern pattern = Pattern.compile("[^(]*\\(([0-9]*\\.[0-9]*)s");
String text = "Host is up (0.0020s latency).";
Matcher match = pattern.matcher(text);
if (match.lookingAt())
{
String group = match.group(1);
System.out.println("Before parsing: " + group);
BigDecimal value = new BigDecimal(group);
System.out.println("Parsed: " + value);
}
else
{
System.out.println("No match");
}
}
}
Quite how specific you want to make your pattern is up to you, of course. This only checks for digits, a dot, then digits after an opening bracket and before an s. You may need to refine it to make the dot optional etc.
This is a great site for building regular expressions from simple to very complex. You choose the language and boom.
http://txt2re.com/
Here's a way without regex
String str = "Host is up (0.0020s latency).";
str = str.substring(str.indexOf('(')+1, str.indexOf("s l"));
System.out.println(str);
Of course, using regular expressions is the best solution in this case, but in many simple cases you can also use something like:
String value = myString.substring(myString.indexOf("(") + 1, myString.lastIndexOf("s"));
double numericValue = Double.parseDouble(value);
This is not recommended, because the text in myString can change.

Parsing text from the end (using regular expressions)

I have a seemingly simple problem, though I am unable to get my head around it.
Let's say I have the following string: 'abcabcabcabc' and I want to get the last occurrence of 'ab'. Is there a way I can do this without looping through all the other 'ab's from the beginning of the string?
I read about anchoring the end of the string and then parsing the string with the required regular expression. I am unsure how to do this in Java (is it supported?).
Update: I guess I have caused a lot of confusion with my (over)simplified example. Let me try another one. Say I have a string like this: '12/08/2008 some_text 21/10/2008 some_more_text 15/12/2008 and_finally_some_more'. Here, I want the last date, and hence I need to use regular expressions. I hope this is a better example.
Thanks,
Anirudh
Firstly, thanks for all the answers.
Here is what I tried, and it worked for me:
Pattern pattern = Pattern.compile("(ab)(?!.*ab)");
Matcher matcher = pattern.matcher("abcabcabcd");
if(matcher.find()) {
System.out.println(matcher.start() + ", " + matcher.end());
}
This displays the following:
6, 8
So, to generalize - <reg_ex>(?!.*<reg_ex>) should solve this problem where '?!' signifies that the string following it should not be present after the string that precedes '?!'.
Update: This page provides more information on 'not followed by' using regex.
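Applied to the date example, the same "not followed by" idea might be sketched like this. The class name is invented, and the loose \d{2}/\d{2}/\d{4} date pattern is an assumption about the format. Since . does not match line terminators by default, this assumes single-line input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastDateLookahead {
    // A date that is not followed, anywhere later on the line, by another date.
    private static final Pattern LAST_DATE =
            Pattern.compile("\\d{2}/\\d{2}/\\d{4}(?!.*\\d{2}/\\d{2}/\\d{4})");

    static String lastDate(String s) {
        Matcher m = LAST_DATE.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(lastDate(
                "12/08/2008 some_text 21/10/2008 some_more_text 15/12/2008 and_finally_some_more"));
    }
}
```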
This will give you the last date in group 1 of the match object.
.*(\d{2}/\d{2}/\d{4})
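A short Java sketch of this pattern (the class and method names are hypothetical). The greedy .* first swallows the whole string and then backtracks only as far as needed, which is why group 1 ends up holding the last date rather than the first:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastDateGreedy {
    // Returns the last dd/dd/dddd date on the line, or null if there is none.
    static String lastDate(String s) {
        Matcher m = Pattern.compile(".*(\\d{2}/\\d{2}/\\d{4})").matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(lastDate(
                "12/08/2008 some_text 21/10/2008 some_more_text 15/12/2008 and_finally_some_more"));
    }
}
```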
Pattern p = Pattern.compile("ab.*?$");
Matcher m = p.matcher("abcabcabcabc");
boolean b = m.matches();
I do not understand what you are trying to do. Why only the last if they are all the same? Why a regular expression and why not int pos = s.lastIndexOf(String str) ?
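For the literal 'ab' case, the lastIndexOf suggestion could be sketched as follows (the class name is made up). It searches backwards from the end, so no regex is involved:

```java
class LastIndexDemo {
    // Index of the last occurrence of needle in s, or -1 if it does not occur.
    static int lastIndex(String s, String needle) {
        return s.lastIndexOf(needle);
    }

    public static void main(String[] args) {
        System.out.println(lastIndex("abcabcabcabc", "ab"));
    }
}
```

This only works for fixed strings; for something like "the last date", a pattern is still needed.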
For the date example, you could do this with the Pattern API and not in the regex itself. The basic idea is to get all the matches, then return the last one.
public static void main(String[] args) {
// this may be over-kill, you can replace with a much simpler but more lenient version
final String dateRegex = "\\b(0?[1-9]|[12][0-9]|3[01])[- /.](0?[1-9]|1[012])[- /.](19|20)?[0-9]{2}\\b";
final String sample = "12/08/2008 some_text 21/10/2008 some_more_text 15/12/2008 and_finally_some_more";
List<String> allMatches = getAllMatches(dateRegex, sample);
System.out.println(allMatches.get(allMatches.size() - 1));
}
private static List<String> getAllMatches(final String regex, final String input) {
final Matcher matcher = Pattern.compile(regex).matcher(input);
return new ArrayList<String>() {{
while (matcher.find())
add(input.substring(matcher.start(), matcher.end()));
}};
}
