How to get the numbers between two strings with regex in Java? - java

I have the following string:
SUB8&20.000,-&succes&09/12/18SUB12&100.000,-&failed&07/12/18SUB16&40.000,-&succes&09/12/18
I want to get the strings "8&20.000" and "16&40.000", which sit between SUB and ,-&succes.
In other words, I want only the succes records. How can I get these strings using Java regex?

Use this regex,
SUB([^,]*),-&succes
Java code,
public static void main(String[] args) {
String s = "SUB8&20.000,-&succes&09/12/18SUB12&100.000,-&failed&07/12/18SUB16&40.000,-&succes&09/12/18";
Pattern p = Pattern.compile("SUB([^,]*),-&succes");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1));
}
}
Prints,
8&20.000
16&40.000

You could use the pattern SUB[^S]+&succes[^S]+ and choose the one you want after that.
The two matches would be SUB8&20.000,-&succes&09/12/18 and SUB16&40.000,-&succes&09/12/18.
Once you have chosen, you can strip away the unwanted stuff with [0-9]+&[0-9.]+.
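A minimal sketch of this two-step approach, assuming the status is spelled succes exactly as in the sample string. The class and method names (TwoStepExtract, successValues) are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoStepExtract {
    // Step 1: a whole record that contains "&succes" (a record ends before the next 'S').
    private static final Pattern RECORD = Pattern.compile("SUB[^S]+&succes[^S]+");
    // Step 2: the "number&amount" part inside one record.
    private static final Pattern VALUE = Pattern.compile("[0-9]+&[0-9.]+");

    static List<String> successValues(String input) {
        List<String> result = new ArrayList<>();
        Matcher record = RECORD.matcher(input);
        while (record.find()) {
            // Search only inside the record found in step 1.
            Matcher value = VALUE.matcher(record.group());
            if (value.find()) {
                result.add(value.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "SUB8&20.000,-&succes&09/12/18SUB12&100.000,-&failed&07/12/18SUB16&40.000,-&succes&09/12/18";
        System.out.println(successValues(s)); // [8&20.000, 16&40.000]
    }
}
```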

I don't know if I am answering your question properly, but this regex will give the exact strings that you are looking for.
(?<=SUB)([^,]*)(?=,-&succes)
https://regex101.com/r/RLFXNf/1
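In Java, the lookaround version could be used roughly like this. Because the lookbehind and lookahead consume nothing, the whole match is already the wanted text. LookaroundExtract and extract are hypothetical names used only for this sketch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundExtract {
    // (?<=SUB) and (?=,-&succes) only assert context; they are not part of the match.
    private static final Pattern P = Pattern.compile("(?<=SUB)([^,]*)(?=,-&succes)");

    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = P.matcher(input);
        while (m.find()) {
            result.add(m.group()); // same text as m.group(1) here
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "SUB8&20.000,-&succes&09/12/18SUB12&100.000,-&failed&07/12/18SUB16&40.000,-&succes&09/12/18";
        for (String value : extract(s)) {
            System.out.println(value);
        }
    }
}
```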

Related

Regular expression get the third element from a string

Hello, I'm having trouble getting the third element of a string (F604080):
<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>
I have tried these regular expressions and variations, but I can't manage to get F604080.
(?<=\w+_)\w+(?=\<)
(?<=\w+_\w+_)\w+(?=\<)
....
Any help will be appreciated.
Thanks.
You don't need look behind or look ahead, instead just use this simple regex,
.*_(\w+)
and capture group 1.
Java code,
public static void main(String[] args) {
String s = "<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>";
Pattern p = Pattern.compile(".*_(\\w+)");
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println(m.group(1));
} else {
System.out.println("Didn't match");
}
}
This prints what you wanted:
F604080
Using regex you can use something like >\w+_\w+_(\w+)<\/
String str = "<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>";
String code = null;
Matcher m = Pattern.compile(">\\w+_\\w+_(\\w+)</").matcher(str);
if (m.find()) {
code = m.group(1);
}
Or simply use the substring() operation:
String code = str.substring(str.lastIndexOf('_') + 1, str.lastIndexOf('<'));
If you later need to parse XML with more elements, you may use something like the Java DOM XML parser, but that is not the best option here as you have only one element.
Can you just split the string using "_" as the separator and take the 3rd element?
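A rough sketch of that idea, assuming the input is always a single element whose text has at least three "_"-separated parts. ThirdElement and thirdElement are illustrative names, not part of any library:

```java
class ThirdElement {
    static String thirdElement(String xml) {
        // Keep only the text between the opening tag's '>' and the closing "</".
        String text = xml.substring(xml.indexOf('>') + 1, xml.lastIndexOf("</"));
        String[] parts = text.split("_");
        // Return null instead of throwing when there are fewer than three parts.
        return parts.length >= 3 ? parts[2] : null;
    }

    public static void main(String[] args) {
        String s = "<sourceDocumentId>AX02_APF604_F604080</sourceDocumentId>";
        System.out.println(thirdElement(s)); // F604080
    }
}
```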
Both of your regular expressions seem to match the given string.
Anyway you could be a little bit more specific with this one:
^(?:<\w+>)(?:\w+)_(?:\w+)_(\w+)(?:<\/\w+>)$
Be sure that the input is the string you think it is and no additional text is given after that.

Regex lookahead to separate a string into tokens

I currently have the following code which allows me to find matches from a String.
I need to be able to find all words similar to 64x and split them up into tokens, so I'll get 64 and x as the output.
I have looked at regex lookahead and it does not solve the issue. Is there a way to do this without creating a new ArrayList to store matches similar to 64x and then splitting them up?
String input = "Hello world 65x";
ArrayList<String> userInput = new ArrayList<>();
Matcher isMatch = Pattern.compile("[0-9]*+[a-zA-Z]")
.matcher(input);
while (isMatch.find()) {
userInput.add(isMatch.group());
}
You can try the following regular expression:
\b(\p{Digit}+)(\p{Alpha})\b
Additionally, if you plan to use the regular expression very often, it is recommended to store the compiled Pattern in a constant to avoid recompiling it each time, e.g.:
private static final Pattern REGEX_PATTERN =
Pattern.compile("\\b(\\p{Digit}+)(\\p{Alpha})\\b");
public static void main(String[] args) {
String input = "Hello world 65x";
Matcher matcher = REGEX_PATTERN.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));
}
}
Output:
65
x
No need of lookaheads, you can use nested captured groups:
Matcher isMatch = Pattern.compile("\\b([0-9]+)([a-zA-Z])\\b").matcher(input);
Group #1 will contain 65 and group #2 will contain x.
Better to add \\b (word boundary) on either side to avoid matching abc56xyz
You just need to use Matcher.group(int). This lets you extract the pieces of the matched text that were captured by capturing groups. A regex that contains capturing groups is \\b([0-9]+)([a-zA-Z])\\b (as given by anubhava).
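If the goal is to get every token in one pass without an intermediate list, one possible alternative (not taken from the answers above) is String.split with lookarounds: split on whitespace, or on the zero-width position between a digit and a letter. A sketch, with TokenSplit and tokens as hypothetical names:

```java
import java.util.Arrays;

class TokenSplit {
    static String[] tokens(String input) {
        // \s+ splits on whitespace; (?<=\d)(?=[a-zA-Z]) is an empty position
        // that has a digit on its left and a letter on its right.
        return input.split("\\s+|(?<=\\d)(?=[a-zA-Z])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("Hello world 65x"))); // [Hello, world, 65, x]
    }
}
```

Note that, unlike the \\b version, this also splits inside a word such as abc56xyz (into abc56 and xyz), so it only fits if that is acceptable.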

Use regex to replace a specific pattern

From a given string, I am trying to replace a pattern such as "sometext.othertext.lasttext" with "lasttext". Is this possible in Java with a regex replace? If yes, how? Thanks in advance.
I tried
"hellow.world".replaceAll("(.*)\\.(.*)", "$2")
which results in world. But I want to replace any such arbitrary sequence. For instance, com.google.code should be replaced with code and com.facebook should be replaced with facebook.
Just to add, a test input is:
if (com.google.code) then
and the test output should be:
if (code) then
Thanks.
I believe this is what you are looking for, if you're trying to avoid String methods. It can be made more succinct, but I'm hoping this will give you a better understanding.
As others suggested, String methods are cleaner.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Split {
public static void main (String[] args) {
String inputString = "if (com.google.code) then";
Pattern p = Pattern.compile("((?<=\\()[^)]*(?=\\)))"); // Find text within parentheses
Pattern p2 = Pattern.compile("(\\w+)(\\))"); // Find last portion of text between . and )
Matcher m = p.matcher(inputString);
Matcher m2 = p2.matcher(inputString);
String in2 = "";
if (m2.find())
in2 = m2.group(1); // else ... error checking
inputString = m.replaceAll(in2); // replace the text inside the parentheses with the last portion
System.out.println(inputString); // do whatcha need to do
}
}
If the parentheses aren't the concern, use this.
class Split {
public static void main (String[] args) {
String in = "if (com.google.code) then";
Pattern p = Pattern.compile("(\\w+)(\\))");
Matcher m = p.matcher(in);
if(m.find())
in = m.group(1);
System.out.println(in); // or whatever
}
}
Use:
str.replaceAll(".*\\.(\\w+)$", "$1")
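For the test input from the question, where the dotted name sits in the middle of the string, a pattern without the $ anchor may be easier to apply. The following is only a sketch under the assumption that every run of word characters joined by dots should collapse to its last part; ReplaceQualified and lastSegment are made-up names:

```java
class ReplaceQualified {
    static String lastSegment(String input) {
        // (?:\w+\.)+ consumes one or more "name." prefixes,
        // (\w+) captures the final part, which replaces the whole run.
        return input.replaceAll("(?:\\w+\\.)+(\\w+)", "$1");
    }

    public static void main(String[] args) {
        System.out.println(lastSegment("if (com.google.code) then")); // if (code) then
        System.out.println(lastSegment("com.facebook"));              // facebook
    }
}
```

Since \w also matches digits, a decimal number such as 3.14 would be shortened to 14, so this assumes the input contains no such tokens.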

regex - matching between a literal string and a quotation mark

I'm terrible at Regex and would greatly appreciate any help with this issue, which I think will be newb stuff for anyone familiar.
I'm getting a response like this from a REST call
{"responseData":{"translatedText":"Ciao mondo"},"responseDetails":"","responseStatus":200,"matches":[{"id":"424913311","segment":"Hello World","translation":"Ciao mondo","quality":"74","reference":"","usage-count":50,"subject":"All","created-by":"","last-updated-by":null,"create-date":"2011-12-29 19:14:22","last-update-date":"2011-12-29 19:14:22","match":1},{"id":"0","segment":"Hello World","translation":"Ciao a tutti","quality":"70","reference":"Machine Translation provided by Google, Microsoft, Worldlingo or the MyMemory customized engine.","usage-count":1,"subject":"All","created-by":"MT!","last-updated-by":null,"create-date":"2012-05-14","last-update-date":"2012-05-14","match":0.85}]}
All I need is the 'Ciao mondo' in between those quotations. I was hoping with Java's Split feature I could do this but unfortunately it doesn't allow two separate delimiters as then I could have specified the text before the translation.
To simplify, what I'm stuck on is the regex to gather whatever is in between translatedText":" and the next "
I'd be very grateful for any help
You can use \"translatedText\":\"([^\"]*)\" expression to capture the match.
The expression meaning is as follows: find quoted translatedText followed by a colon and an opening quote. Then match every character before the following quote, and capture the result in a capturing group.
String s = " {\"responseData\":{\"translatedText\":\"Ciao mondo\"},\"responseDetails\":\"\",\"responseStatus\":200,\"matches\":[{\"id\":\"424913311\",\"segment\":\"Hello World\",\"translation\":\"Ciao mondo\",\"quality\":\"74\",\"reference\":\"\",\"usage-count\":50,\"subject\":\"All\",\"created-by\":\"\",\"last-updated-by\":null,\"create-date\":\"2011-12-29 19:14:22\",\"last-update-date\":\"2011-12-29 19:14:22\",\"match\":1},{\"id\":\"0\",\"segment\":\"Hello World\",\"translation\":\"Ciao a tutti\",\"quality\":\"70\",\"reference\":\"Machine Translation provided by Google, Microsoft, Worldlingo or the MyMemory customized engine.\",\"usage-count\":1,\"subject\":\"All\",\"created-by\":\"MT!\",\"last-updated-by\":null,\"create-date\":\"2012-05-14\",\"last-update-date\":\"2012-05-14\",\"match\":0.85}]}";
System.out.println(s);
Pattern p = Pattern.compile("\"translatedText\":\"([^\"]*)\"");
Matcher m = p.matcher(s);
if (!m.find()) return;
System.out.println(m.group(1));
This fragment prints Ciao mondo.
Use lookahead and lookbehind to gather all the strings inside quotation marks:
(?<=[,.{}:]\").*?(?=\")
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test
{
public static void main(String[] args)
{
Scanner scanner = new Scanner(System.in);
String in = scanner.nextLine();
Matcher matcher = Pattern.compile("(?<=[,.{}:]\\\").*?(?=\\\")").matcher(in);
while(matcher.find())
System.out.println(matcher.group());
}
}
Try this regular expression -
^.*translatedText":"([^"]*)"},"responseDetails".*$
The matching group will contain the text Ciao mondo.
This assumes that translatedText and responseDetails will always occur in the positions specified in your sample.
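The capture-group approach can be wrapped in a small helper. The sketch below is an assumption-laden variant rather than any of the answers verbatim: it tolerates whitespace around the colon and backslash-escaped quotes inside the value, and it returns the raw text without decoding escapes. TranslatedText and extract are hypothetical names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TranslatedText {
    // The value is a run of: any character except quote and backslash,
    // or a backslash followed by any single character.
    private static final Pattern P =
            Pattern.compile("\"translatedText\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    static String extract(String json) {
        Matcher m = P.matcher(json);
        return m.find() ? m.group(1) : null; // null when the key is missing
    }

    public static void main(String[] args) {
        String s = "{\"responseData\":{\"translatedText\":\"Ciao mondo\"},\"responseDetails\":\"\"}";
        System.out.println(extract(s)); // Ciao mondo
    }
}
```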

Parsing text from the end (using regular expressions)

I have a seemingly simple problem, though I am unable to get my head around it.
Let's say I have the following string: 'abcabcabcabc' and I want to get the last occurrence of 'ab'. Is there a way I can do this without looping through all the other 'ab's from the beginning of the string?
I read about anchoring the end of the string and then parsing the string with the required regular expression. I am unsure how to do this in Java (is it supported?).
Update: I guess I have caused a lot of confusion with my (over)simplified example. Let me try another one. Say I have a string like this: '12/08/2008 some_text 21/10/2008 some_more_text 15/12/2008 and_finally_some_more'. Here, I want the last date, and hence I need to use regular expressions. I hope this is a better example.
Thanks,
Anirudh
Firstly, thanks for all the answers.
Here is what I tried and this worked for me:
Pattern pattern = Pattern.compile("(ab)(?!.*ab)");
Matcher matcher = pattern.matcher("abcabcabcd");
if(matcher.find()) {
System.out.println(matcher.start() + ", " + matcher.end());
}
This displays the following:
6, 8
So, to generalize: <reg_ex>(?!.*<reg_ex>) should solve this problem, where '(?!' starts a check that the pattern following it does not occur anywhere after the text matched by the part preceding it.
Update: This 'not followed by' construct is called a negative lookahead.
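Applied to the date example from the question, the same idea might look like the sketch below. It assumes dates are always written as dd/mm/yyyy and that the input is a single line (the dot does not match line breaks by default). LastDate and lastDate are illustrative names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastDate {
    private static final String DATE = "\\d{2}/\\d{2}/\\d{4}";
    // A date that is not followed, anywhere later in the line, by another date.
    private static final Pattern LAST = Pattern.compile(DATE + "(?!.*" + DATE + ")");

    static String lastDate(String input) {
        Matcher m = LAST.matcher(input);
        return m.find() ? m.group() : null; // null when there is no date at all
    }

    public static void main(String[] args) {
        String s = "12/08/2008 some_text 21/10/2008 some_more_text 15/12/2008 and_finally_some_more";
        System.out.println(lastDate(s)); // 15/12/2008
    }
}
```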
This will give you the last date in group 1 of the match object.
.*(\d{2}/\d{2}/\d{4})
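A short sketch of how this greedy pattern could be used from Java. The leading .* first grabs as much as possible and then backtracks, so group 1 ends up on the rightmost date. GreedyLastDate and lastDate are names invented for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyLastDate {
    private static final Pattern P = Pattern.compile(".*(\\d{2}/\\d{2}/\\d{4})");

    static String lastDate(String input) {
        Matcher m = P.matcher(input);
        // find() rather than matches(), because text may follow the last date.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "12/08/2008 some_text 21/10/2008 some_more_text 15/12/2008 and_finally_some_more";
        System.out.println(lastDate(s)); // 15/12/2008
    }
}
```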
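Pattern p = Pattern.compile(".*(ab)"); // the greedy .* pushes the group to the last "ab"
Matcher m = p.matcher("abcabcabcabc");
if (m.find()) {
System.out.println(m.start(1)); // start index of the last "ab"
}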
I do not understand what you are trying to do. Why only the last if they are all the same? Why a regular expression and why not int pos = s.lastIndexOf(String str) ?
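For the literal 'ab' case, the lastIndexOf suggestion needs no regex at all. A minimal sketch (LastIndexDemo and lastAb are hypothetical names):

```java
class LastIndexDemo {
    static int lastAb(String s) {
        // Index of the last occurrence of "ab", or -1 if there is none.
        return s.lastIndexOf("ab");
    }

    public static void main(String[] args) {
        System.out.println(lastAb("abcabcabcabc")); // 9
        System.out.println(lastAb("abcabcabcd"));   // 6
    }
}
```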
For the date example, you could do this with the Pattern API and not in the regex itself. The basic idea is to get all the matches, then return the last one.
public static void main(String[] args) {
// this may be over-kill, you can replace with a much simpler but more lenient version
final String dateRegex = "\\b(0?[1-9]|[12][0-9]|3[01])[- /.](0?[1-9]|1[012])[- /.](19|20)?[0-9]{2}\\b";
final String sample = "12/08/2008 some_text 21/10/2008 some_more_text 15/12/2008 and_finally_some_more";
List<String> allMatches = getAllMatches(dateRegex, sample);
System.out.println(allMatches.get(allMatches.size() - 1));
}
private static List<String> getAllMatches(final String regex, final String input) {
final Matcher matcher = Pattern.compile(regex).matcher(input);
return new ArrayList<String>() {{
while (matcher.find())
add(input.substring(matcher.start(), matcher.end()));
}};
}
