Regex to mask multiple phone numbers (~) separated except last 4 digiits - java

I am trying to find a regex which masks phone numbers except last 4 digits.
example: phone=9988998888~7654321908~6789054321
Desired output : phone=******8888~******1908~*****4321
I tried below regex but it is masking only starting number
phone=******8888~7654321908~6789054321
^(phone)=(\d(?=\d{4}))*

Use replaceAll​(Function<MatchResult,​String> replacer) to replace each digit in MatchResult with "*".
public class PhoneNumberMask {
public static void main(String[] args) {
String target = "phone=9988998888~7654321908~6789054321";
Pattern pattern = Pattern.compile("(\\d+(?=\\d{4}))");
Matcher matcher = pattern.matcher(target);
String result = matcher.replaceAll((matchResult) -> matchResult.group(1).replaceAll("\\d", "*"));
System.out.println(result);
}
}

You could use:
\d(?=\d{4})
See this online demo
\d - Any single digit.
(?=\d{4}) - Positive lookahead for 4 digits.
Replace with *.
See a Java demo

Assuming you only want to mask all numbers in a string that starts with phone= separated with ~, you can use a plain regex solution without a lambda in the replacement with
String masked = text.replaceAll("(\\G(?!^)(?:\\d{4}~)?|^phone=)\\d(?=\\d{4})", "$1*");
See the regex demo. Details:
(\G(?!^)(?:\d{4}~)?|^phone=) - Group 1: end of the previous successful match and then an optional sequence of four digits and a ~ or start of string and phone=
\d - a digit
(?=\d{4}) - followed with any four digits.

Related

Use regex to get 2 specific groups of substring

String s = #Section250342,Main,First/HS/12345/Jack/M,2000 10.00,
#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,
#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,
#Section251234,Main,First/HS/12345/Jack/M,2000 11.00
Wherever there is the word /Jack/M in the3 string, I want to pull the section numbers(250342,251234) and the values(10.00,11.00) associated with it using regex each time.
I tried something like this https://regex101.com/r/4te0Lg/1 but it is still messed.
.Section(\d+(?:\.\d+)?).*/Jack/M
If the only parts of each section that change are the section number, the name of the person and the last value (like in your example) then you can make a pattern very easily by using one of the sections where Jack appears and replacing the numbers you want by capturing groups.
Example:
#Section250342,Main,First/HS/12345/Jack/M,2000 10.00
becomes,
#Section(\d+),Main,First/HS/12345/Jack/M,2000 (\d+.\d{2})
If the section substring keeps the format but the other parts of it may change then just replace the rest like this:
#Section(\d+),\w+,(?:\w+/)*Jack/M,\d+ (\d+.\d{2})
I'm assuming that "Main" is a class, "First/HS/..." is a path and that the last value always has 2 and only 2 decimal places.
\d - A digit: [0-9]
\w - A word character: [a-zA-Z_0-9]
+ - one or more times
* - zero or more times
{2} - exactly 2 times
() - a capturing group
(?:) - a non-capturing group
For reference see: https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/regex/Pattern.html
Simple Java example on how to get the values from the capturing groups using java.util.regex.Pattern and java.util.regex.Matcher
import java.util.regex.*;
public class GetMatch {
public static void main(String[] args) {
String s = "#Section250342,Main,First/HS/12345/Jack/M,2000 10.00,#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,#Section251234,Main,First/HS/12345/Jack/M,2000 11.00";
Pattern p = Pattern.compile("#Section(\\d+),\\w+,(?:\\w+/)*Jack/M,\\d+ (\\d+.\\d{2})");
Matcher m;
String[] tokens = s.split(",(?=#)"); //split the sections into different strings
for(String t : tokens) //checks every string that we got with the split
{
m = p.matcher(t);
if(m.matches()) //if the string matches the pattern then print the capturing groups
System.out.printf("Section: %s, Value: %s\n", m.group(1), m.group(2));
}
}
}
You could use 2 capture groups, and use a tempered greedy token approach to not cross #Section followed by a digit.
#Section(\d+)(?:(?!#Section\d).)*\bJack/M,\d+\h+(\d+(?:\.\d+)?)\b
Explanation
#Section(\d+) Match #Section and capture 1+ digits in group 1
(?:(?!#Section\d).)* Match any character if not directly followed by #Section and a digit
\bJack/M, Match the word Jack and /M,
\d+\h+ Match 1+ digits and 1+ spaces
(\d+(?:\.\d+)?) Capture group 2, match 1+ digits and an optional decimal part
\b A word boundary
Regex demo
In Java:
String regex = "#Section(\\d+)(?:(?!#Section\\d).)*\\bJack/M,\\d+\\h+(\\d+(?:\\.\\d+)?)\\b";

Why does this regex fails to check accurately?

I have the following regex method which does the matches in 3 stages for a given string. But for some reason the Regex fails to check some of the things. As per whatever knowledge I have gained by working they seem to be correct. Can someone please correct me what am I doing wrong here?
I have the following code:
public class App {
public static void main(String[] args) {
String identifier = "urn:abc:de:xyz:234567.1890123";
if (identifier.matches("^urn:abc:de:xyz:.*")) {
System.out.println("Match ONE");
if (identifier.matches("^urn:abc:de:xyz:[0-9]{6,12}.[0-9]{1,7}.*")) {
System.out.println("Match TWO");
if (identifier.matches("^urn:abc:de:xyz:[0-9]{6,12}.[a-zA-Z0-9.-_]{1,20}$")) {
System.out.println("Match Three");
}
}
}
}
}
Ideally, this code should generate the output
Match ONE
Match TWO
Match Three
Only when the identifier = "urn:abc:de:xyz:234567.1890123.abd12" but it provides the same output event if the identifier does not match the regex such as for the following inputs:
"urn:abc:de:xyz:234567.1890123"
"urn:abc:de:xyz:234567.1890ANC"
"urn:abc:de:xyz:234567.1890123"
"urn:abc:de:xyz:234567.1890ACB.123"
I am not understanding why is it allowing the Alphanumeric characters after the . and also it does not care about the characters after the second ..
I would like my Regex to check that the string has the following format:
String starts with urn:abc:de:xyz:
Then it has the numbers [0-9] which range from 6 to 12 (234567).
Then it has the decimal point .
Then it has the numbers [0-9] which range from 1 to 7 (1890123)
Then it has the decimal point ..
Finally it has the alphanumeric character and spcial character which range from 1 to 20 (ABC123.-_12).
This is an valid string for my regex: urn:abc:de:xyz:234567.1890123.ABC123.-_12
This is an invalid string for my regex as it misses the elements from point 6:
urn:abc:de:xyz:234567.1890123
This is also an invalid string for my regex as it misses the elements from point 4 (it has ABC instead of decimal numbers).
urn:abc:de:xyz:234567.1890ABC.ABC123.-_12
This part of the regex:
[0-9]{6,12}.[0-9]{1,7} matches 6 to 12 digits followed by any character followed by 1 to 7 digits
To match a dot, it needs to be escaped. Try this:
^urn:abc:de:xyz:[0-9]{6,12}\.[0-9]{1,7}\.[a-zA-Z0-9\-_]{1,20}$
This will match with any number of dot alphanum at the end of the string as your examples:
^urn:abc:de:xyz:\d{6,12}\.\d{1,7}(?:\.[\w-]{1,20})+$
Demo & explanation

What regex should I use to check a string only has numbers and 2 special characters ( - and , ) in Java?

Scenario: I want to check whether string contains only numbers and 2 predefined special characters, a dash and a comma.
My string contains numbers (0 to 9) and 2 special characters: a dash (-) defines a range and a comma (,) defines a sequence.
Tried attempt :
Tried following regex [0-9+-,]+, but not working as expected.
Possible inputs :
1-5
1,5
1-5,6
1,3,5-10
1-5,6-10
1,3,5-7,8,10
The regex should not accept these types of strings:
-----
1--4
,1,5
5,6,
5,4,-
5,6-
-5,6
Please can any one help me to create regex for above scenario?
You may use
^\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*$
See the regex demo
Regex details:
^ - start of string
\d+ - 1 or more digits
(?:-\d+)? - an optional sequence of - and 1+ digits
(?:,\d+(?:-\d+)?)* - zero or more seuqences of:
, - a comma
\d+(?:-\d+)? - same pattern as described above
$ - end of string.
Change your regex [0-9+-,]+ to [0-9,-]+
final String patternStr = "[0-9,-]+";
final Pattern p = Pattern.compile(patternStr);
String data = "1,3,5-7,8,10";
final Matcher m = p.matcher(data);
if (m.matches()) {
System.out.println("SUCCESS");
}else{
System.out.println("ERROR");
}

How to get all integers before hyphen from java String

I want to parse through hyphen, the answer should be 0 0 1 (integer), what could be the best way to parse in java
public static String str ="[0-S1|0-S2|1-S3, 1-S1|0-S2|0-S3, 0-S1|1-S2|0-S3]";
Please help me out.
Use the below regex with Pattern and matcher classes.
Pattern.compile("\\d+(?=-)");
\\d+ - Matches one or more digits. + repeats the previous token \\d (which matches a digit character) one or more times.
(?=-) - Only if it's followed by an hyphen. (?=-) Called positive lookahead assertion which asserts that the match must be followed by an - symbol.
String str ="[0-S1|0-S2|1-S3, 1-S1|0-S2|0-S3, 0-S1|1-S2|0-S3]";
Matcher m = Pattern.compile("\\d+(?=-)").matcher(str);
while(m.find())
{
System.out.println(m.group());
}
one lazy way: if you already know the pattern of the string, use substring and indexof to locate your word.
String str ="[0-S1|0-S2|1-S3, 1-S1|0-S2|0-S3, 0-S1|1-S2|0-S3]";
integer int1 = Integer.parseInt(str.substring(str.indexOf("["),str.indexOf("-S1")));
and so on.

Regex to get first number in string with other characters

I'm new to regular expressions, and was wondering how I could get only the first number in a string like 100 2011-10-20 14:28:55. In this case, I'd want it to return 100, but the number could also be shorter or longer.
I was thinking about something like [0-9]+, but it takes every single number separately (100,2001,10,...)
Thank you.
/^[^\d]*(\d+)/
This will start at the beginning, skip any non-digits, and match the first sequence of digits it finds
EDIT:
this Regex will match the first group of numbers, but, as pointed out in other answers, parseInt is a better solution if you know the number is at the beginning of the string
Try this to match for first number in string (which can be not at the beginning of the string):
String s = "2011-10-20 525 14:28:55 10";
Pattern p = Pattern.compile("(^|\\s)([0-9]+)($|\\s)");
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println(m.group(2));
}
Just
([0-9]+) .*
If you always have the space after the first number, this will work
Assuming there's always a space between the first two numbers, then
preg_match('/^(\d+)/', $number_string, $matches);
$number = $matches[1]; // 100
But for something like this, you'd be better off using simple string operations:
$space_pos = strpos($number_string, ' ');
$number = substr($number_string, 0, $space_pos);
Regexs are computationally expensive, and should be avoided if possible.
the below code would do the trick.
Integer num = Integer.parseInt("100 2011-10-20 14:28:55");
[0-9] means the numbers 0-9 can be used the + means 1 or more times. if you use [0-9]{3} will get you 3 numbers
Try ^(?'num'[0-9]+).*$ which forces it to start at the beginning, read a number, store it to 'num' and consume the remainder without binding.
This string extension works perfectly, even when string not starts with number.
return 1234 in each case - "1234asdfwewf", "%sdfsr1234" "## # 1234"
public static string GetFirstNumber(this string source)
{
if (string.IsNullOrEmpty(source) == false)
{
// take non digits from string start
string notNumber = new string(source.TakeWhile(c => Char.IsDigit(c) == false).ToArray());
if (string.IsNullOrEmpty(notNumber) == false)
{
//replace non digit chars from string start
source = source.Replace(notNumber, string.Empty);
}
//take digits from string start
source = new string(source.TakeWhile(char.IsDigit).ToArray());
}
return source;
}
NOTE: In Java, when you define the patterns as string literals, do not forget to use double backslashes to define a regex escaping backslash (\. = "\\.").
To get the number that appears at the start or beginning of a string you may consider using
^[0-9]*\.?[0-9]+ # Float or integer, leading digit may be missing (e.g, .35)
^-?[0-9]*\.?[0-9]+ # Optional - before number (e.g. -.55, -100)
^[-+]?[0-9]*\.?[0-9]+ # Optional + or - before number (e.g. -3.5, +30)
See this regex demo.
If you want to also match numbers with scientific notation at the start of the string, use
^[0-9]*\.?[0-9]+([eE][+-]?[0-9]+)? # Just number
^-?[0-9]*\.?[0-9]+([eE][+-]?[0-9]+)? # Number with an optional -
^[-+]?[0-9]*\.?[0-9]+([eE][+-]?[0-9]+)? # Number with an optional - or +
See this regex demo.
To make sure there is no other digit on the right, add a \b word boundary, or a (?!\d)
or (?!\.?\d) negative lookahead that will fail the match if there is any digit (or . and a digit) on the right.
public static void main(String []args){
Scanner s=new Scanner(System.in);
String str=s.nextLine();
Pattern p=Pattern.compile("[0-9]+");
Matcher m=p.matcher(str);
while(m.find()){
System.out.println(m.group()+" ");
}
\d+
\d stands for any decimal while + extends it to any other decimal coming directly after, until there is a non number character like a space or letter

Categories