RegEx for capturing digits from a string - java

I have this String:
String filename = "20190516.BBARC.GLIND.statistics.xml";
How can I get the first part of the String (the numbers) without using substrings?

Here, we might just want to collect our digits using a capturing group, and if we wish, we could later add more boundaries, maybe with an expression as simple as:
([0-9]+)
For instance, if our desired digits are at the start of our inputs, we might want to add a start char as a left boundary:
^([0-9]+)
Or if our digits are always followed by a ., we can bound it with that:
^([0-9]+)\.
We can also add an uppercase letter after that to strengthen our right boundary, and continue this process as far as necessary:
^([0-9]+)\.[A-Z]
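As a minimal sketch of the progression above, the following tries each of the four expressions against the sample filename and reads capture group 1 after Matcher.find(). The class and method names (LeadingDigits, firstGroup) are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeadingDigits {
    // Returns capture group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String filename = "20190516.BBARC.GLIND.statistics.xml";
        String[] patterns = {
            "([0-9]+)",          // digits anywhere in the input
            "^([0-9]+)",         // digits at the start of the input
            "^([0-9]+)\\.",      // ... followed by a literal dot
            "^([0-9]+)\\.[A-Z]"  // ... and then an uppercase letter
        };
        for (String p : patterns) {
            System.out.println(p + " -> " + firstGroup(p, filename));
        }
    }
}
```

Each added boundary leaves the captured digits unchanged for this input. The stricter patterns only differ in which other inputs they reject.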
RegEx
If this expression isn't what you need, it can be modified and tested on regex101.com.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
    public static void main(String[] args) {
        // Group 1 captures the digits; group 2 consumes the rest so it is dropped on replacement
        final String regex = "([0-9]+)(.*)";
        final String string = "20190516.BBARC.GLIND.statistics.xml";
        // Java replacement strings reference groups as $1, not \1
        final String subst = "$1";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        // The substituted value will be contained in the result variable
        final String result = matcher.replaceAll(subst);
        System.out.println("Substitution result: " + result);
    }
}
Demo
const regex = /([0-9]+)(.*)/gm;
const str = `20190516.BBARC.GLIND.statistics.xml`;
const subst = `$1`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);

To extract one or more parts of a string using a regex, I prefer to define groups.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class B {
    public static void main(String[] args) {
        String in = "20190516.BBARC.GLIND.statistics.xml";
        // Group 1 captures the leading run of word characters; .* consumes the rest
        Pattern p = Pattern.compile("(\\w+).*");
        Matcher m = p.matcher(in);
        if (m.matches()) {
            System.out.println(m.group(1));
        } else {
            System.out.println("no match");
        }
    }
}

Related

Regex to find all dollar signs and parentheses and commas

I want a regex to remove all instances of dollar signs, commas, and opening and closing parentheses so that the String can be parsed to a Double.
Examples are:
($108.34)
$39.60
1,388.80
The code:
@Parsed
@Replace(expression = "", replacement = "")
public Double extdPrice;
This may help: we delete every character in this list: , $ ( )
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Example {
public static void main(String[] args) {
final String regex = "[(),$]";
final String string = "($108.34)\n"
+ "$39.60\n"
+ "1,388.80";
final String subst = "";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
// The substituted value will be contained in the result variable
final String result = matcher.replaceAll(subst);
System.out.println("Substitution result: " + result);
}
}
\d{1,3}(\,\d\d\d)*(\.\d+)?
matches every number in your examples, but it cannot match a number without comma separators, such as 123456.
The result was:
108.34
39.60
1,388.80
so you still need to remove the comma.
Regex Expression = [^0-9\\.]
is what you are looking for. It matches anything other than the digits 0-9 and the . character,
so this regex removes all extra characters such as ( , $ and USD.
Example: System.out.println("($123.89)".replaceAll("[^0-9\\.]", "")); outputs 123.89
Test output:
($108.34) => 108.34
$39.60 => 39.60
1,388.80 => 1388.80
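Since the goal in the question is a value that can be parsed to a Double, here is a small sketch that combines the replaceAll call above with Double.parseDouble. The class and method names (PriceCleaner, parsePrice) are hypothetical, chosen only for this example.

```java
class PriceCleaner {
    // Strips everything except digits and the decimal point, then parses the rest.
    // Throws NumberFormatException if nothing parseable remains.
    static double parsePrice(String raw) {
        String cleaned = raw.replaceAll("[^0-9.]", "");
        return Double.parseDouble(cleaned);
    }

    public static void main(String[] args) {
        String[] samples = {"($108.34)", "$39.60", "1,388.80"};
        for (String s : samples) {
            System.out.println(s + " => " + parsePrice(s));
        }
    }
}
```

Note that this sketch discards any sign information: parentheses and minus signs are removed along with the other characters, so the caller has to handle negative amounts separately if they matter.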

Replace all occurrences matching given patterns

Given the following string:
String value = "/cds/horse/schema1.0.0/day=12321/provider=samsung/run_key=32ee/group_key=222/end_date=2020-04-20/run_key_default=32sas1/somethingElse=else"
I need to replace the values of run_key and run_key_default with %. For example, for the above string the output will be:
"/cds/horse/schema1.0.0/day=12321/provider=samsung/run_key=%/group_key=222/end_date=2020-04-20/run_key_default=%/somethingElse=else"
I would like to avoid mistakenly modifying other values, so in my opinion the best solution is to combine the replaceAll method with a regex:
String output = value.replaceAll("\run_key=[*]\", "%").replaceAll("\run_key_default=[*]\", "%")
I'm not sure how I should construct the regex for it.
Feel free to post a better solution than the one I provided, if you know one.
You may use this regex for search:
(/run_key(?:_default)?=)[^/]*
and for replacement use:
"$1%"
RegEx Demo
Java Code:
String output = value.replaceAll("(/run_key(?:_default)?=)[^/]*", "$1%");
RegEx Details:
(: Start capture group #1
/run_key: Match literal text /run_key
(?:_default)?: Match _default optionally
=: Match a literal =
): End capture group #1
[^/]*: Match 0 or more characters that are not /
"$1%" is replacement that puts our 1st capture group back followed by a literal %
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Example {
    public static void main(String[] args) {
        final String regex = "(run_key_default|run_key)=\\w*"; //regex
        final String string = "/cds/horse/schema1.0.0/day=12321/provider=samsung/run_key=32ee/group_key=222/end_date=2020-04-20/run_key_default=32sas1/somethingElse=else";
        final String subst = "$1=%"; //group1 as it is while remaining part with %
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);
        final String result = matcher.replaceAll(subst);
        System.out.println("Substitution result: " + result);
    }
}
output
Substitution result:
/cds/horse/schema1.0.0/day=12321/provider=samsung/run_key=%/group_key=222/end_date=2020-04-20/run_key_default=%/somethingElse=else
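The two patterns above produce the same result for the sample string, but they differ on keys that merely end in run_key. The sketch below compares them on a made-up input containing a hypothetical my_run_key segment. The class and method names are also invented for the example.

```java
class RunKeyMasker {
    // Anchored on the leading "/", so only whole segment names are matched.
    static String maskAnchored(String value) {
        return value.replaceAll("(/run_key(?:_default)?=)[^/]*", "$1%");
    }

    // Not anchored: also matches "run_key=" inside a longer key name.
    static String maskUnanchored(String value) {
        return value.replaceAll("(run_key_default|run_key)=\\w*", "$1=%");
    }

    public static void main(String[] args) {
        // Hypothetical input: my_run_key is not one of the keys we want to mask.
        String value = "/a/my_run_key=keep/run_key=x1";
        System.out.println(maskAnchored(value));   // /a/my_run_key=keep/run_key=%
        System.out.println(maskUnanchored(value)); // /a/my_run_key=%/run_key=%
    }
}
```

If avoiding accidental modification of other values is the priority, the anchored form looks like the safer choice, assuming every key is preceded by a / as in the sample string.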

How to match a pattern in String where string to match is "operands":["10000"]

I have a long JSON string. This is just part of it:
"attributeName":"Loc ID"},"operands":["10000"]}],"frequency":{"type":"
I just want to match the pattern "operands":["10000"] in the given string.
I have already tried using
string.replace("\"operands\":[\"10000\"]","\"operands\":[\"20000\"]")
Even tried regex "\"operands\":[\"\\d+\"]"
I am using JAVA to get the desired result.
Maybe, this expression,
"operands"\\s*:\\s*\\[\\s*"(\\d*)"\\s*\\]
and a replacement of,
"operands":["20000"]
might work just OK.
If there is no additional space,
\"operands\":\\[\"(\\d*)\"\\]
might work just fine.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class re{
public static void main(String[] args){
final String regex = "\"operands\"\\s*:\\s*\\[\\s*\"\\s*(\\d*)\\s*\"\\s*\\]";
final String string = "\"attributeName\":\"Loc ID\"},\"operands\":[\"10000\"]}],\"frequency\":{\"type\":\"\n"
+ "\"attributeName\":\"Loc ID\"},\"operands\":[ \" 10000 \" ]}],\"frequency\":{\"type\":\"";
final String subst = "\"operands\":[\"20000\"]";
final Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
final Matcher matcher = pattern.matcher(string);
final String result = matcher.replaceAll(subst);
System.out.println(result);
}
}
Output
"attributeName":"Loc ID"},"operands":["20000"]}],"frequency":{"type":"
"attributeName":"Loc ID"},"operands":["20000"]}],"frequency":{"type":"
If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com, where you can also see how it matches against some sample inputs.
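When the text to match is a fixed literal, as in the question, one alternative to escaping [ and ] by hand is to let Pattern.quote build the pattern and Matcher.quoteReplacement protect the replacement. The sketch below assumes there is no extra whitespace in the JSON fragment. The class and method names (OperandsReplace, replaceOperand) are hypothetical. Keep in mind that Java strings are immutable, so the returned value must be used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperandsReplace {
    // Replaces "operands":["oldValue"] with "operands":["newValue"], treating both as literals.
    static String replaceOperand(String json, String oldValue, String newValue) {
        String literal = "\"operands\":[\"" + oldValue + "\"]";
        String replacement = "\"operands\":[\"" + newValue + "\"]";
        // Pattern.quote turns the literal into a regex that matches it verbatim,
        // so the square brackets are not read as a character class.
        return json.replaceAll(Pattern.quote(literal), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String json = "\"attributeName\":\"Loc ID\"},\"operands\":[\"10000\"]}],\"frequency\":{\"type\":\"";
        // The original string is unchanged; the result is a new string.
        String updated = replaceOperand(json, "10000", "20000");
        System.out.println(updated);
    }
}
```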

Regex in java to match a string beginning with html followed by something

Can someone help me with a regex to match a string that starts as follows?
The string can begin with any HTML tag, e.g. <span> or <p>, so basically I want a regex that checks whether a string begins with any opening HTML tag <> followed by [apple videoID=
E.g.:
<span>[apple videoID=
Here's what I've tried :
static String pattern = "^<[^>]+>[apple videoID=";
static Pattern pattern1 = Pattern.compile(pattern);
What is wrong in the above?
You have a typo in the following line.
static String pattern = "^<[^>]+>[apple videoID=";
This string is not a valid regular expression because you have an unclosed [ right before the word apple, hence the "Unclosed character class" PatternSyntaxException. You either meant to type
static String pattern = "^<[^>]+><apple videoID=";
assuming that apple is an html tag, or
static String pattern = "^<[^>]+>\\[apple videoID=";
if you really did want the [ in front of apple. This is because [ is a special character in regular expressions and must be escaped with a \ which is a special character in Java strings and must be escaped with a \. Therefore \\[.
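To make the difference concrete, this short sketch compiles both the unescaped and the escaped version of the pattern and then tests the escaped one against a sample input. The class and method names (BracketEscape, compiles) and the sample input with 123] appended are assumptions for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class BracketEscape {
    // Returns true if the regex compiles, false if it is syntactically invalid.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String unescaped = "^<[^>]+>[apple videoID=";   // "[" opens a character class that never closes
        String escaped = "^<[^>]+>\\[apple videoID=";   // "\\[" in Java source is the regex \[

        System.out.println("unescaped compiles: " + compiles(unescaped));
        System.out.println("escaped compiles:   " + compiles(escaped));

        // find() is used because the pattern only describes the beginning of the string.
        boolean found = Pattern.compile(escaped).matcher("<span>[apple videoID=123]").find();
        System.out.println("escaped matches sample: " + found);
    }
}
```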
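As simple as this (note that a dot inside a character class, [.], matches only a literal dot, so a plain . is used here for "any character"):
<.+><apple videoID=.*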
Try this pattern:
"^<[A-Za-z]+>\\[apple videoID="
This pattern matches a string that begins with a tag such as <span> followed by [apple videoID=
Hope this helps!
Here is the solution
Pattern.CASE_INSENSITIVE lets the pattern match in either upper case or lower case.
Tested and Executed.
package sireesh.yarlagadda;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// The class must not be named Pattern, or it would clash with the imported java.util.regex.Pattern
public class PatternDemo {
    public static void main(String[] args) {
        String text = "<span><apple videoID=";
        String patternString = "<[a-zA-Z]*>\\<apple videoID=";
        Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        // lookingAt() matches a prefix of the input; matches() requires the whole input to match
        System.out.println("lookingAt = " + matcher.lookingAt());
        System.out.println("matches = " + matcher.matches());
    }
}

Pattern.COMMENTS always causing Matcher.find to fail

The following code matches the two expressions and prints success.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id : [0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
However, I want white space to not matter, so the following should also print success.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id:[0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex, Pattern.COMMENTS);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
The Pattern.COMMENTS flag is supposed to permit white space, but it causes Failure to be printed. It even causes Failure to be printed if the strings are exactly equivalent including white space, like in the first example. For example,
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id : [0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex, Pattern.COMMENTS);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
Prints Failure.
Why is this happening and how do I make the Pattern ignore white space?
You have misunderstood the flag. Pattern.COMMENTS allows you to put additional whitespace into your regex to improve its readability, but this whitespace will NOT be matched in the string.
It does not make whitespace in your input string match automatically without being defined in the regex.
Example
With Pattern.COMMENTS you can put whitespace in your regex like this
String regex = "\\{ user_id: [0-9]+ \\}";
to improve readability, but then it will not match the string
String string = "{user_id : 0}";
because you haven't defined the whitespace that appears in the string. So if you want to use Pattern.COMMENTS, you need to treat the whitespace you want to match specially: either you escape it
String regex = "\\{ user_id\\ :\\ [0-9]+ \\}";
or you use the whitespace class
String regex = "\\{ user_id \\s:\\s [0-9]+ \\}";
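The two variants above each require exactly one whitespace character on either side of the colon. If, as the question asks, whitespace should not matter at all, one possible sketch is to use \s* at every position where optional whitespace is allowed, while still using Pattern.COMMENTS for readability. The class and method names (CommentsFlagDemo, matches) are made up for this example.

```java
import java.util.regex.Pattern;

class CommentsFlagDemo {
    // With COMMENTS, the literal spaces in the pattern are ignored by the compiler;
    // only the explicit \s* tokens match whitespace in the input.
    static final Pattern USER_ID = Pattern.compile(
            "\\{ \\s* user_id \\s* : \\s* [0-9]+ \\s* \\}", Pattern.COMMENTS);

    static boolean matches(String input) {
        return USER_ID.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {"{user_id : 0}", "{user_id:0}", "{ user_id :  42 }", "{user_id : }"};
        for (String s : samples) {
            System.out.println(s + " -> " + (matches(s) ? "Success." : "Failure."));
        }
    }
}
```

The last sample is expected to fail because it has no digits, which shows that only the whitespace handling was relaxed.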
