Retrieving data using java matcher and pattern


I am trying to extract an alphanumeric number from a string, e.g. UN345690. I am using the code below.
Pattern pattern = Pattern.compile("\\d+");
What code should I use to get the desired result?

"I am trying to get an alphanumeric number"
Since you already specified the regex \d+, I assume you want to extract the ordinary number 345690 from UN345690. An alphanumeric number would also include letters, but it is unclear which letters would be allowed in that case and how you would differentiate between a regular word and an alphanumeric "number".
Use a Matcher to search for your pattern in a string, then retrieve the first match.
Pattern pattern = Pattern.compile("\\d+");
Matcher matcher = pattern.matcher("UN345690");
if (matcher.find()) {
    String number = matcher.group();
    // do something with the extracted number
}
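If the letters are meant to be part of the result, the following is a rough sketch under one assumption that the question does not confirm: an identifier is one or more uppercase ASCII letters followed by one or more digits. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AlnumIdExtractor {
    // Assumed format (not confirmed by the question): uppercase ASCII
    // letters followed by digits, delimited by word boundaries.
    private static final Pattern ID = Pattern.compile("\\b[A-Z]+\\d+\\b");

    /** Returns every identifier found in the text, in order of appearance. */
    static List<String> extractIds(String text) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = ID.matcher(text);
        while (matcher.find()) {      // find() advances to the next match
            ids.add(matcher.group()); // group() is the whole matched text
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(extractIds("Shipment UN345690 was replaced by UN12."));
        // prints [UN345690, UN12]
    }
}
```

Looping with while instead of a single if collects all matches rather than only the first one.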

Related

Regex Pattern In Between Two Exact Strings

Hi I have this working code to detect a valid UUID pattern.
String pattern = "\\b[a-f0-9]{8}\\b-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-\\b[a-f0-9]{12}\\b";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(fileName);
This correctly detects strings like:
0101a8ef-db10-405a-a1d2-6bebdeb17625
I would like to add two exact strings on each side of this pattern like so:
FOLDER/0101a8ef-db10-405a-a1d2-6bebdeb17625.txt
Here is the code I am trying, but it is not working:
String pattern = "FOLDER/\\b//[a-f0-9]{8}\\b-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-\\b[a-f0-9]{12}\\b.txt";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(uri.toString());

Your pattern contains a // that does not appear in the example string. Besides that, you can omit the word boundaries between a character from [a-f0-9] and a -, because a word boundary is already implied at that position.
Remember to escape the dot so that it matches literally. You can also add word boundaries at the start and at the end to prevent partial matches.
You could update the pattern to
\\bFOLDER/[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}\\.txt\\b
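Here is a small sketch of how that pattern could be used from Java. The capturing group around the UUID and the class and method names are my own additions for illustration; they are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FolderUuidMatcher {
    // Same pattern as above, with a capturing group added around the UUID.
    private static final Pattern FOLDER_UUID_TXT = Pattern.compile(
        "\\bFOLDER/([a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12})\\.txt\\b");

    /** Returns the UUID embedded in the path, or null if the path does not match. */
    static String extractUuid(String path) {
        Matcher m = FOLDER_UUID_TXT.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractUuid("FOLDER/0101a8ef-db10-405a-a1d2-6bebdeb17625.txt"));
        // prints 0101a8ef-db10-405a-a1d2-6bebdeb17625
        System.out.println(extractUuid("FOLDER/0101a8ef-db10-405a-a1d2-6bebdeb17625xtxt"));
        // prints null, because the escaped dot only matches a literal '.'
    }
}
```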

Java regex validate a string consisting of repeated sequences of comma separated digits enclosed in parentheses

I have an input containing repeated pattern like this:
(3,5)(6,7)(8,9).....
I am working on a regex to validate the above string pattern.
I tried:
Pattern.compile("\\((\\d+),(\\d+)\\)")

If you have a specific pattern that may repeat one or more times in a string, and you want to make sure your string only consists of the repeated occurrences of the same pattern, you may use
^(?:YOUR_PATTERN)+$
\A(?:YOUR_PATTERN)+\z
Note that $ also matches before a final newline character, which is why the \z anchor is preferred when you need to match the very end of the string.
If you allow an empty string use the * quantifier instead of +:
^(?:YOUR_PATTERN)*$
\A(?:YOUR_PATTERN)*\z
In this case, the YOUR_PATTERN is \(\d+,\d+\), thus, the repeated sequence validating pattern will be \A(?:\(\d+,\d+\))+\z.
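To make the difference between $ and \z concrete, here is a small sketch that runs both anchored variants with Matcher#find() on an input ending in a newline. The class and method names are only illustrative.

```java
import java.util.regex.Pattern;

final class EndAnchorDemo {
    private static final Pattern WITH_DOLLAR =
        Pattern.compile("^(?:\\(\\d+,\\d+\\))+$");
    private static final Pattern WITH_SLASH_Z =
        Pattern.compile("\\A(?:\\(\\d+,\\d+\\))+\\z");

    static boolean findsWithDollar(String s) {
        return WITH_DOLLAR.matcher(s).find();
    }

    static boolean findsWithSlashZ(String s) {
        return WITH_SLASH_Z.matcher(s).find();
    }

    public static void main(String[] args) {
        String clean = "(3,5)(6,7)";
        String trailingNewline = "(3,5)(6,7)\n";

        System.out.println(findsWithDollar(clean));            // prints true
        System.out.println(findsWithSlashZ(clean));            // prints true
        // $ is satisfied just before the final line terminator...
        System.out.println(findsWithDollar(trailingNewline));  // prints true
        // ...while \z insists on the very end of the input.
        System.out.println(findsWithSlashZ(trailingNewline));  // prints false
    }
}
```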
In Java, you may omit the ^/$ and \A/\z anchors when validating a string with .matches() method since it requires a full string match:
boolean isValid = text.matches("(?:\\(\\d+,\\d+\\))+");
Or, declare the Pattern class instance first, then create a matcher with the string as input and run Matcher#matches():
Pattern p = Pattern.compile("(?:\\(\\d+,\\d+\\))+");
Matcher m = p.matcher(text);
if (m.matches()) {
    // The text is valid!
}
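If you also need the numbers after validating, one possible approach is to validate the whole string with matches() first and then walk over the individual pairs with find(), reusing the capturing pattern from the question. This is only a sketch; the class name, method name and the choice of int[] as the result type are assumptions of mine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PairListParser {
    // Validates the whole input: one or more "(digits,digits)" groups.
    private static final Pattern WHOLE = Pattern.compile("(?:\\(\\d+,\\d+\\))+");
    // Captures the two numbers of a single pair (the pattern from the question).
    private static final Pattern PAIR = Pattern.compile("\\((\\d+),(\\d+)\\)");

    /**
     * Returns the pairs as int[2] arrays.
     * Throws IllegalArgumentException for malformed input; Integer.parseInt
     * throws NumberFormatException (a subclass) if a number exceeds the int range.
     */
    static List<int[]> parse(String text) {
        if (!WHOLE.matcher(text).matches()) {
            throw new IllegalArgumentException("Not a sequence of (n,m) pairs: " + text);
        }
        List<int[]> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(text);
        while (m.find()) {
            pairs.add(new int[] {
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2))
            });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (int[] pair : parse("(3,5)(6,7)(8,9)")) {
            System.out.println(pair[0] + " and " + pair[1]);
        }
    }
}
```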

Regex: How to detect a pattern if an undesired sub pattern comes before the pattern?

I'm new to regex and I'm trying to use Java to detect a sequence of either: lowercase, uppercase, or digits, but not JUST digits separated by periods.
Restriction: No consecutive periods.
The sample String I have is: ###951.324.1###foo1.bar2.123proccess.this.subString
I currently have the following regex: ((\p{Alnum})+\.)+(\p{Alnum})+
I'm trying to have the pattern recognize foo1.bar2.123proccess.this.subString but my regex gives me 951.324.1 since it's a sub-pattern of the pattern I defined.
How would I go about detecting the substring foo1.bar2.123proccess.this.subString?
I would imagine the general nature would be: The entire returned String should have at least 1 lowercase or uppercase char, but I'm hopelessly confused on how I would detect that in the String.

[a-zA-Z\d.]*[a-zA-Z][a-zA-Z\d.]*
This can be split into 3 parts:
[a-zA-Z\d.]* // optional sequence of letters/numbers/dots
[a-zA-Z] // MUST have a letter
[a-zA-Z\d.]* // optional sequence of letters/numbers/dots
Basically, "sandwiching" things that are required in optional things.
Try it here: https://regex101.com/r/VT4t2x/1
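For reference, a minimal sketch of how this pattern could be applied in Java to the sample string from the question. The class and method names are hypothetical. Keep in mind that the character class admits a dot anywhere, so this pattern on its own does not reject consecutive periods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LetterRequiredFinder {
    // Optional letters/digits/dots, one mandatory letter, optional rest.
    private static final Pattern AT_LEAST_ONE_LETTER =
        Pattern.compile("[a-zA-Z\\d.]*[a-zA-Z][a-zA-Z\\d.]*");

    /** Returns the first run that contains at least one letter, or null. */
    static String firstMatch(String text) {
        Matcher m = AT_LEAST_ONE_LETTER.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String sample = "###951.324.1###foo1.bar2.123proccess.this.subString";
        System.out.println(firstMatch(sample));
        // prints foo1.bar2.123proccess.this.subString

        // Caveat: runs with consecutive dots are accepted as well.
        System.out.println(firstMatch("a..b"));
        // prints a..b
    }
}
```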

You may use
String rx = "\\d+(?:\\.\\d+)+|(\\p{Alnum}+(?:\\.\\p{Alnum}+)+)";
See the regex demo (pattern adjusted since regex101 does not support Java POSIX character class syntax)
The point is to match and skip dot-separated digit chunks, and only match and capture what you need. See Java demo:
String s = "###951.324.1###abc.123";
String rx = "\\d+(?:\\.\\d+)+|(\\p{Alnum}+(?:\\.\\p{Alnum}+)+)";
Pattern pattern = Pattern.compile(rx);
Matcher matcher = pattern.matcher(s);
while (matcher.find()){
    if (matcher.group(1) != null) {
        System.out.println(matcher.group(1));
    }
} // => abc.123

Use regex in Java to extract specific parts of a String

In the following string I want to extract the IDs that come after the {\"company_id\": part. The first in this case is 4100, and there are two more farther along, 4045 and 2979. All of these IDs are 4 digits long. Sorry for including such a long string. The reason I want to use a regex and not some sort of JSON parser is that the JSON string is malformed.
String company = "[{\"company_id\":4100,\"data\":{\"drm_user_id\":572901936637129135,\"direct_status_id\":0,\"direct_optin_date\":0,\"direct_first_optin_date\":0,\"direct_last_optin_date\":0,\"direct_optout_date\":0,\"direct_last_form_date\":0,\"direct_last_form_id\":0,\"direct_last_promo_id\":0,\"anon_status_id\":600,\"anon_optin_date\":1446132360498,\"anon_first_optin_date\":1446132360498,\"anon_last_optin_date\":1446132360498,\"anon_optout_date\":0,\"anon_last_form_date\":1446132360498,\"anon_last_form_id\":101,\"anon_last_promo_id\":1002003,\"last_registration_date\":1446132360498,\"mp_status_id\":600,\"mp_control_state\":-1,\"mp_match_date\":0,\"mp_vs_version\":0,\"mp_initial_value_segment\":0,\"mp_id\":0,\"conversion_last_form_date\":0,\"conversion_last_form_id\":0,\"conversion_last_promo_id\":-1,\"last_message_date\":1446132368928,\"cg_version\":0,\"cg_version_date\":0,\"num_anon_messages_global\":0,\"num_anon_messages_global_date\":0,\"reg_creator_id\":576,\"reg_form_id\":101,\"reg_method_id\":1,\"reg_creator_type_id\":1},\"personal_data\":{\"version\":0,\"personal_data\":\"{}\",\"mdc_data\":{\"version\":0},\"custom_data\":\"{}\"},\"category_data\":{},\"campaignImpressions\":{},\"journeyStartDate\":0},{\"company_id\":4045,\"data\":{\"drm_user_id\":572901936637129135,\"direct_status_id\":0,\"direct_optin_date\":0,\"direct_first_optin_date\":0,\"direct_last_optin_date\":0,\"direct_optout_date\":0,\"direct_last_form_date\":0,\"direct_last_form_id\":0,\"direct_last_promo_id\":0,\"anon_status_id\":600,\"anon_optin_date\":1446132360498,\"anon_first_optin_date\":1446132360498,\"anon_last_optin_date\":1446132360498,\"anon_optout_date\":0,\"anon_last_form_date\":1446132360498,\"anon_last_form_id\":101,\"anon_last_promo_id\":1002003,\"last_registration_date\":1446132360498,\"mp_status_id\":600,\"mp_control_state\":-1,\"mp_match_date\":0,\"mp_vs_version\":0,\"mp_initial_value_segment\":0,\"mp_id\":0,\"conversion_last_form_date\":0,\"conversion_last_form_id\":0,\"conver
sion_last_promo_id\":-1,\"last_message_date\":1446132368928,\"cg_version\":0,\"cg_version_date\":0,\"num_anon_messages_global\":0,\"num_anon_messages_global_date\":0,\"reg_creator_id\":576,\"reg_form_id\":101,\"reg_method_id\":1,\"reg_creator_type_id\":1},\"personal_data\":{\"version\":0,\"personal_data\":\"{}\",\"mdc_data\":{\"version\":0},\"custom_data\":\"{}\"},\"category_data\":{},\"campaignImpressions\":{},\"journeyStartDate\":0},{\"company_id\":2979,\"data\":{\"drm_user_id\":572901936637129135,\"direct_status_id\":0,\"direct_optin_date\":0,\"direct_first_optin_date\":0,\"direct_last_optin_date\":0,\"direct_optout_date\":0,\"direct_last_form_date\":0,\"direct_last_form_id\":0,\"direct_last_promo_id\":0,\"anon_status_id\":600,\"anon_optin_date\":1446132360498,\"anon_first_optin_date\":1446132360498,\"anon_last_optin_date\":1446132360498,\"anon_optout_date\":0,\"anon_last_form_date\":1446132360498,\"anon_last_form_id\":101,\"anon_last_promo_id\":1002003,\"last_registration_date\":1446132360498,\"mp_status_id\":600,\"mp_control_state\":-1,\"mp_match_date\":0,\"mp_vs_version\":0,\"mp_initial_value_segment\":0,\"mp_id\":0,\"conversion_last_form_date\":0,\"conversion_last_form_id\":0,\"conversion_last_promo_id\":-1,\"last_message_date\":1446132368928,\"cg_version\":0,\"cg_version_date\":0,\"num_anon_messages_global\":0,\"num_anon_messages_global_date\":0,\"reg_creator_id\":576,\"reg_form_id\":101,\"reg_method_id\":1,\"reg_creator_type_id\":1},\"personal_data\":{\"version\":0,\"personal_data\":\"{}\",\"mdc_data\":{\"version\":0},\"custom_data\":\"{}\"},\"category_data\":{},\"campaignImpressions\":{},\"journeyStartDate\":0}]";
This is what I have so far:
Pattern pattern = Pattern.compile("company_id\\\\\":(\\d{4})");
Matcher matcher = pattern.matcher(company);
while(matcher.find()){
    System.out.println(matcher.group(1)+"\n");
}
However, this does not work, and I am not sure how to check that the number comes after this specific {\"company_id\": part.

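The string itself contains no backslashes: in Java source, \" is just an escaped double quote. Your pattern \\\\\" looks for a literal backslash followed by a quote, which never occurs in the string. A single backslash before the quote is enough, since \" matches a double quote: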
Pattern pattern = Pattern.compile("\"company_id\":(\\d{4})");
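Put together, it could look like the sketch below. The sample string is a heavily shortened stand-in for the one in the question, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompanyIdExtractor {
    // Regex seen by the engine: "company_id":(\d{4})
    private static final Pattern COMPANY_ID =
        Pattern.compile("\"company_id\":(\\d{4})");

    /** Returns the four-digit ids that follow "company_id": in the input. */
    static List<String> extract(String json) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = COMPANY_ID.matcher(json);
        while (matcher.find()) {
            ids.add(matcher.group(1)); // group 1 holds only the digits
        }
        return ids;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the long string from the question.
        String company = "[{\"company_id\":4100,\"data\":{}},"
            + "{\"company_id\":4045,\"data\":{}},"
            + "{\"company_id\":2979,\"data\":{}}]";
        System.out.println(extract(company));
        // prints [4100, 4045, 2979]
    }
}
```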

Regular expression to find substring in text

I have a text file containing some strings that I want to extract with a Java regex.
The strings have this format:
$numbers,numbers,numbers....,numbers##
(each starts with $, followed by groups of digits separated by commas, and ends with ##)
Here is my pattern.
Pattern pattern = Pattern.compile("$*##");
Matcher matcher = pattern.matcher(text);
if (matcher.find())
{
}
It turns out that nothing matches my pattern.
Can anyone tell me what's wrong with it?

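In a regex, an unescaped $ is an end-of-input anchor rather than a literal dollar sign, and * repeats the preceding token rather than acting as a wildcard. Escape the $ and describe the digits and commas explicitly:
Pattern pattern = Pattern.compile("\\$\\d+(,\\d+)*##$");
The trailing $ anchors the match to the end of the input; drop it if the string can appear in the middle of the text.
Thanks to @Pshemo for his valuable input in reaching the solution.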
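As a rough sketch of pulling every such string out of a larger text, assuming each record starts with a single $ as the question describes and may occur anywhere in the text (so no end anchor is used). The capturing group and the class and method names are my own additions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DollarRecordExtractor {
    // Literal '$', comma-separated digit groups (captured), then "##".
    private static final Pattern RECORD =
        Pattern.compile("\\$(\\d+(?:,\\d+)*)##");

    /** Returns one list of number strings per record found in the text. */
    static List<List<String>> extract(String text) {
        List<List<String>> records = new ArrayList<>();
        Matcher matcher = RECORD.matcher(text);
        while (matcher.find()) {
            // group(1) is e.g. "1,22,333"; split it into individual numbers
            records.add(Arrays.asList(matcher.group(1).split(",")));
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(extract("foo $1,22,333## bar $7## baz"));
        // prints [[1, 22, 333], [7]]
    }
}
```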
