How to pull the first four consecutive digits in an alphanumeric string? - java

I have a string that contains alphanumeric characters; it is the serial number of a product.
I need a way to pull the first four consecutive digits from that string. They represent the manufacture date of the product in YYMM format.
Example string: USA43XY121100004.
1211 is what I would need.
Thanks

You can use regular expressions and find the first group of 4 digits:
// Requires java.util.regex.Pattern and java.util.regex.Matcher.
Pattern p = Pattern.compile("([0-9]{4})");
Matcher m = p.matcher("USA43XY121100004");
if (m.find()) {
    System.out.println(m.group(1)); // prints 1211
}
As suggested in the comments, a version without group capturing in the regex:
Pattern p = Pattern.compile("[0-9]{4}");
Matcher m = p.matcher("USA43XY121100004");
if (m.find()) {
    System.out.println(m.group()); // prints 1211
}
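The snippets above can be wrapped into a small reusable helper. The following is only a minimal sketch, not the answerer's code: the class name SerialDate, both method names, and the assumption that the two-digit year YY means 20YY are illustrative choices that do not come from the question.

```java
import java.time.YearMonth;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; all names here are illustrative.
final class SerialDate {
    // Four consecutive digits, the same pattern as in the answer above.
    private static final Pattern FOUR_DIGITS = Pattern.compile("[0-9]{4}");

    // Returns the first run of four consecutive digits, or empty if there is none.
    static Optional<String> firstFourDigits(String serial) {
        Matcher m = FOUR_DIGITS.matcher(serial);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // Interprets a YYMM string as a year-month. Assumption: YY means 20YY.
    static YearMonth toYearMonth(String yymm) {
        int year = 2000 + Integer.parseInt(yymm.substring(0, 2));
        int month = Integer.parseInt(yymm.substring(2, 4));
        return YearMonth.of(year, month); // throws DateTimeException if month is not 1-12
    }

    public static void main(String[] args) {
        firstFourDigits("USA43XY121100004")
                .map(SerialDate::toYearMonth)
                .ifPresent(System.out::println); // prints 2012-11
    }
}
```

Returning an Optional makes the "no four-digit run" case explicit instead of relying on the caller to check find() first.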

Related

Extract mobile number from string using regex

I want to extract a mobile number from a string.
Example string is "Hi, Your Mobile no. is: 9876499321."
Now I want to extract "9876499321" from the string. My main string can contain +919876499321, 919876499321, or 09876499321 along with other words. How can I achieve this?
Rules I want:
First of all, remove all "-".
Then extract a number that is 10 to 14 digits long (inclusive).
I have tried this:
String myregex = "^\\d{10}$";
Pattern pattern = Pattern.compile(myregex);
Matcher matcher = pattern.matcher(inputStr);
while (matcher.find()) {
System.out.println(matcher.group());
}
I am not able to find any match.
You may remove all hyphens before passing the string to pattern.matcher and then match standalone numbers of 10 to 14 digits:
String inputStr = "Hi, Your Mobile no. is: 9876499321. Also, +919876499321 or 919876499321 or 09-876499321.";
String myregex = "(?<!\\d)\\d{10,14}(?!\\d)";
// Or: String myregex = "\\b\\d{10,14}\\b";
Pattern pattern = Pattern.compile(myregex);
// Remove hyphens before matching.
Matcher matcher = pattern.matcher(inputStr.replace("-", ""));
while (matcher.find()) {
    System.out.println(matcher.group());
}
Output:
9876499321
919876499321
919876499321
09876499321
The (?<!\d)\d{10,14}(?!\d) pattern matches a run of 10 to 14 digits only when that run is not immediately preceded or followed by another digit.
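The two patterns offered in this answer are not interchangeable in every input. The sketch below illustrates the difference under one assumption of mine: that the digits may sit directly next to letters. The class name DigitRunDemo and the sample strings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample inputs are illustrative.
final class DigitRunDemo {
    static final String LOOKAROUND = "(?<!\\d)\\d{10,14}(?!\\d)";
    static final String WORD_BOUNDARY = "\\b\\d{10,14}\\b";

    // Strips hyphens, then collects every match of the given regex.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input.replace("-", ""));
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String glued = "ref9876499321x"; // digits touch letters on both sides
        // Lookarounds only forbid neighbouring digits, so this matches.
        System.out.println(findAll(LOOKAROUND, glued));    // prints [9876499321]
        // \b needs a word/non-word transition; letter-to-digit is not one.
        System.out.println(findAll(WORD_BOUNDARY, glued)); // prints []
        // A 15-digit run is rejected by the lookaround version.
        System.out.println(findAll(LOOKAROUND, "123456789012345")); // prints []
    }
}
```

If numbers can be glued to letters or underscores, the lookaround version is the safer choice; if they are always separated by spaces or punctuation, both behave the same.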
If you always want the last 10 digits of a run of 10 or more digits, you can use the following:
String myregex = "^.*(\\d{10})([^\\d].*|$)";
The ten digits are captured in group 1, so read them with matcher.group(1) instead of matcher.group().
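A minimal sketch of how this pattern could be used follows. Note that matcher.group() and matcher.group(0) both return the whole match, while the ten digits themselves sit in capture group 1. The class name LastTenDigits and the method name lastTen are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative.
final class LastTenDigits {
    // Greedy .* pushes the capture as far right as possible, so group 1
    // ends up holding the final ten digits of the last long digit run.
    private static final Pattern P = Pattern.compile("^.*(\\d{10})([^\\d].*|$)");

    static Optional<String> lastTen(String input) {
        Matcher m = P.matcher(input.replace("-", ""));
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(lastTen("Call +919876499321 now")); // prints Optional[9876499321]
        System.out.println(lastTen("09-876499321"));           // prints Optional[9876499321]
        System.out.println(lastTen("no digits here"));         // prints Optional.empty
    }
}
```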

Regex for find data

I have used the regex (?:#\d{7}) to extract only the 7 digits after '#'.
For example, given a string like "#1234567890", the above pattern gives me the 7 digits after '#'.
Now the problem is that I have a string like "Referenc number #1234567890",
where "Referenc number #" is fixed.
I am looking for a regex that returns the number 1234567 from the above string.
I have a file that contains the above string along with other data.
You can try something like this:
String ref_no = "Referenc number #123456789";
Pattern p = Pattern.compile("Referenc number #([0-9]{7})");
Matcher m = p.matcher(ref_no);
while (m.find()) {
    System.out.println(m.group(1)); // prints 1234567
}
The ?: makes a group non-capturing, so if you put such a group around the hash sign on its own, the hash is still used for matching but is excluded from the captured group:
(?:#)(\d{7})
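To make the effect concrete, here is a small sketch (class and method names are hypothetical). A non-capturing group still consumes the '#', so the hash remains part of the whole match returned by group(), and only group(1) is free of it. The lookbehind variant shown last is an alternative I am adding for comparison; it is not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative.
final class RefNumberDemo {
    static final String INPUT = "Referenc number #1234567890";

    // Whole match of the first hit, or null if nothing matches.
    static String wholeMatch(String regex) {
        Matcher m = Pattern.compile(regex).matcher(INPUT);
        return m.find() ? m.group() : null;
    }

    // First capture group of the first hit, or null if nothing matches.
    static String firstGroup(String regex) {
        Matcher m = Pattern.compile(regex).matcher(INPUT);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("(?:#)(\\d{7})")); // prints #1234567
        System.out.println(firstGroup("(?:#)(\\d{7})")); // prints 1234567
        // Lookbehind asserts the '#' without consuming it (added for comparison).
        System.out.println(wholeMatch("(?<=#)\\d{7}"));  // prints 1234567
    }
}
```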
If the String always starts with Referenc number # you could just use the following code:
String text = "Referenc number #1234567890";
Pattern pattern = Pattern.compile("\\d{7}");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println(matcher.group()); // prints 1234567
}

Extract numbers from a url's string group using regex in java

I have a url which has this format:
https://address.com/somestring/somestring-2/c100.200.3.4/somestrigx3/somestring.4
I want to obtain the numbers from c100.200.3.4, which are delimited by c, /, and dots. So in the end I want to have 100, 200, 3, 4.
I was wondering if there is a way to build a regex pattern for this instead of the classic string search and compute.
This is possible with a single regex, plus a bit of code.
String s = "https://address.com/somestring/somestring-2/c100.200.3.4/somestrigx3/somestring.4";
Pattern pattern = Pattern.compile("(?<=/c)(\\d+)|(?!^)\\G\\.(\\d+)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    // Exactly one of the two groups participates in each match.
    if (matcher.group(1) != null) {
        System.out.println(matcher.group(1));
    }
    if (matcher.group(2) != null) {
        System.out.println(matcher.group(2));
    }
}
The regex (?<=/c)(\d+)|(?!^)\G\.(\d+) contains two alternatives: (?<=/c)(\d+) matches and captures into Group 1 any sequence of digits after /c, and the (?!^)\G\.(\d+) matches consecutive sequences of a literal . and digits (capturing the latter into Group 2) after the successful previous match (due to (?!^)\G). Since either group can be non-initialized, we have to check it for null.
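As a sketch of how the \G-based pattern could feed a collection rather than printing, the following helper gathers the numbers into a list. The class name UrlNumbers and the method name extract are hypothetical; the regex itself is the one from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative.
final class UrlNumbers {
    private static final Pattern P =
            Pattern.compile("(?<=/c)(\\d+)|(?!^)\\G\\.(\\d+)");

    static List<Integer> extract(String url) {
        List<Integer> out = new ArrayList<>();
        Matcher m = P.matcher(url);
        while (m.find()) {
            // Group 1 is set for the first number, group 2 for each ".number" after it.
            String digits = m.group(1) != null ? m.group(1) : m.group(2);
            out.add(Integer.parseInt(digits));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "https://address.com/somestring/somestring-2/c100.200.3.4/somestrigx3/somestring.4";
        System.out.println(extract(s)); // prints [100, 200, 3, 4]
    }
}
```

The trailing ".4" in "somestring.4" is not picked up because \G only matches where the previous match ended.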
UPDATE
Since, as it turns out, the number of digit groups is fixed (4), you can use a simpler regex with capturing groups:
String s = "https://address.com/somestring/somestring-2/c100.200.3.4/somestrigx3/somestring.4";
Pattern pattern = Pattern.compile("(?<=/c)(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1)); // 100
    System.out.println(matcher.group(2)); // 200
    System.out.println(matcher.group(3)); // 3
    System.out.println(matcher.group(4)); // 4
}
String[] splits = input_url.replaceAll(".*?/c([0-9.]+)/.*", "$1").split("[.]");
This first captures the text between /c and the following / as group 1 and replaces the whole string with that group ($1). It then splits the result on dots.
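One caveat with the replaceAll approach is that replaceAll returns the input unchanged when the pattern does not match, so the split would then run on the full URL. The sketch below adds a guard for that case. The class name UrlSplit, the method name parts, and the choice to return an empty array on no match are my assumptions, not part of the answer.

```java
import java.util.Arrays;

// Hypothetical helper; names and the empty-array convention are illustrative.
final class UrlSplit {
    static String[] parts(String url) {
        String captured = url.replaceAll(".*?/c([0-9.]+)/.*", "$1");
        // If nothing matched, replaceAll hands back the original string.
        if (captured.equals(url)) {
            return new String[0];
        }
        return captured.split("[.]");
    }

    public static void main(String[] args) {
        String s = "https://address.com/somestring/somestring-2/c100.200.3.4/somestrigx3/somestring.4";
        System.out.println(Arrays.toString(parts(s))); // prints [100, 200, 3, 4]
        System.out.println(parts("https://address.com/nothing").length); // prints 0
    }
}
```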

Java Regex group matches spaces

I have this regex and my output seems to be matching each single space but the capturing group is only alpha chars. I must be missing something.
String regexstring = new String("1234567 Mike Peloso ");
Pattern pattern = Pattern.compile("[A-Za-z]*");
Matcher matcher = pattern.matcher(regexstring);
while(matcher.find())
{
System.out.println(Integer.toString(matcher.start()));
String someNumberStr = matcher.group();
System.out.println(someNumberStr);
}
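There is no capturing group in your pattern. The problem is the * quantifier: it matches the preceding element zero or more times, so the pattern also succeeds with an empty match at every position that is not a letter, and each of those empty matches gets printed. Use the + quantifier (one or more times) instead: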
Pattern pattern = Pattern.compile("[A-Za-z]+");
And then print the match result:
while (matcher.find()) {
    System.out.println(matcher.start());
    System.out.println(matcher.group());
}
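The difference between the two quantifiers can be seen side by side in the sketch below, which collects the matches of each pattern on the string from the question. The class name StarVsPlus and the helper method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative.
final class StarVsPlus {
    // Collects every match of the regex, including empty ones.
    static List<String> matches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "1234567 Mike Peloso ";
        // With *, the list contains "Mike" and "Peloso" plus many empty strings.
        System.out.println(matches("[A-Za-z]*", s));
        // With +, only the non-empty letter runs remain.
        System.out.println(matches("[A-Za-z]+", s)); // prints [Mike, Peloso]
    }
}
```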

Retrieve numbers separated by '-'

Let's say I have a large amount of (random) text. Within this text there is a phone number consisting of three digits, a dash, another three digits, a dash, and four digits, for example XXX-XXX-XXXX. What would be the regex for retrieving this number from the text? I tried using:
Matcher matcher = pattern.matcher(previousText);
Pattern pattern2 = Pattern.compile(".*(\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d).*")
Matcher matcher2 = pattern2.matcher(currentText);
Now, I thought it would work, but it doesn't. Please help.
The regex: \d{3}-\d{3}-\d{4}
Pattern pattern = Pattern.compile(".*(\\d{3}-\\d{3}-\\d{4}).*");
Matcher matcher = pattern.matcher(text);
if (matcher.find()) {
    // Group 1 holds the phone number itself.
    String number = matcher.group(1);
    System.out.println(number);
}
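Because find() already scans the text, the surrounding .* parts are not required. The sketch below is a variation under my own assumptions: it drops the .* wrappers, collects every phone number rather than one, and adds lookarounds so that a number embedded in a longer digit run is skipped. The class name PhoneFinder and the method name findAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names and the lookaround guards are illustrative additions.
final class PhoneFinder {
    private static final Pattern PHONE =
            Pattern.compile("(?<!\\d)\\d{3}-\\d{3}-\\d{4}(?!\\d)");

    static List<String> findAll(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = PHONE.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(findAll("Call 555-123-4567 or 555-987-6543."));
        // prints [555-123-4567, 555-987-6543]
        System.out.println(findAll("id 1555-123-45678")); // prints []
    }
}
```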
