How get a whole word with just a part of it? - java

Example:
I have a String like this:
String query = "....COD_OP = 1 AND USER_DATA_SIGNIN = ...."
I need to get the whole word ("USER_DATA_SIGNIN") when it have the "_DATA_" part.
In java is possible use some kind of substring inversely ? In this case I don't know how to get the "USER" part.

Simple imlementation, null checks are left to you:
for (String string : query.split(" ")) {
if(string.contains("_DATA_"))
{
System.out.println(string); // USER_DATA_SIGNIN
System.out.println(string.split("_")[0]); // USER
}
}

You can use Pattern/Matcher classes which are responsible to regex mechanism in Java. You can create pattern which will represent word which have a-zA-Z0-9_ characters (which can be represented by \w character class) before and/or after it like
Pattern p = Pattern.compile("\\w*_DATA_\\w*");
Matcher m = p.matcher(text);
while(m.find()){
System.out.println(m.group());
}

Related

Java Pattern matcher not matching for HTTP response code [duplicate]

I have this small piece of code
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("[a-z]"))
{
System.out.println(s);
}
}
Supposed to print
dkoe
but it prints nothing!!
Welcome to Java's misnamed .matches() method... It tries and matches ALL the input. Unfortunately, other languages have followed suit :(
If you want to see if the regex matches an input text, use a Pattern, a Matcher and the .find() method of the matcher:
Pattern p = Pattern.compile("[a-z]");
Matcher m = p.matcher(inputstring);
if (m.find())
// match
If what you want is indeed to see if an input only has lowercase letters, you can use .matches(), but you need to match one or more characters: append a + to your character class, as in [a-z]+. Or use ^[a-z]+$ and .find().
[a-z] matches a single char between a and z. So, if your string was just "d", for example, then it would have matched and been printed out.
You need to change your regex to [a-z]+ to match one or more chars.
String.matches returns whether the whole string matches the regex, not just any substring.
java's implementation of regexes try to match the whole string
that's different from perl regexes, which try to find a matching part
if you want to find a string with nothing but lower case characters, use the pattern [a-z]+
if you want to find a string containing at least one lower case character, use the pattern .*[a-z].*
Used
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("[a-z]+"))
{
System.out.println(s);
}
}
I have faced the same problem once:
Pattern ptr = Pattern.compile("^[a-zA-Z][\\']?[a-zA-Z\\s]+$");
The above failed!
Pattern ptr = Pattern.compile("(^[a-zA-Z][\\']?[a-zA-Z\\s]+$)");
The above worked with pattern within ( and ).
Your regular expression [a-z] doesn't match dkoe since it only matches Strings of lenght 1. Use something like [a-z]+.
you must put at least a capture () in the pattern to match, and correct pattern like this:
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("(^[a-z]+$)"))
{
System.out.println(s);
}
}
You can make your pattern case insensitive by doing:
Pattern p = Pattern.compile("[a-z]+", Pattern.CASE_INSENSITIVE);

Check if a String satisfies a regex

I have a List of String and I want to filter out the String that doesn't match a regex pattern
Input List = Orthopedic,Orthopedic/Ortho,Length(in.)
My code
for(String s : keyList){
Pattern p = Pattern.compile("[a-zA-Z0-9-_]");
Matcher m = p.matcher(s);
if (!m.find()){
System.out.println(s);
}
}
I expect the 2nd and 3rd string to be printed as they do not match the regex. But it is not printing anything
Explanation
You are not matching the entire input. Instead, you are trying to find the next matching part in the input. From Matcher#finds documentation:
Attempts to find the next subsequence of the input sequence that matches the pattern.
So your code will match an input if at least one character is one of a-zA-Z0-9-_.
Solution
If you want to match the whole region you should use Matcher#matches (documentation):
Attempts to match the entire region against the pattern.
And you probably want to adjust your pattern to allow multiple characters, for example by a pattern like
[a-zA-Z0-9-_]+
The + allows 1 to infinite many repetitions of the pattern (? is 0 to 1 and * is 0 to infinite).
Notes
You have an extra - at the end of your pattern. You probably want to remove that. Or, if you intended to match the character litteraly, you need to escape it:
[a-zA-Z0-9\\-_]+
You can test your regex on sites like regex101.com, here's your pattern: regex101.com/r/xvT8V0/1.
Note that there is also String#matches (documentation). So you could write more compact code by just using s.matches("[a-zA-Z0-9_]+").
Also note that you can shortcut character sets like [a-zA-Z0-9_] by using predefined sets. The set \w (word character) matches exactly your desired pattern.
Since the pattern and also the matcher don't change, you might want to move them outside of the loop to slightly increase performance.
Code
All in all your code might then look like:
Pattern p = Pattern.compile("[a-zA-Z0-9_]+");
Matcher m = p.matcher(s);
for (String s : keyList) {
if (!m.matches()) {
System.out.println(s);
}
}
Or compact:
for (String s : keyList) {
if (!s.matches("\\w")) {
System.out.println(s);
}
}
Using streams:
keyList.stream()
.filter(s -> !s.matches("\\w"))
.forEach(System.out::println);
You shouldn't construct a Pattern in a loop, you currently only match a single character, and you can use !String.matches(String) and a filter() operation. Like,
List<String> keyList = Arrays.asList("Orthopedic", "Orthopedic/Ortho", "Length(in.)");
keyList.stream().filter(x -> !x.matches("[a-zA-Z0-9-_]+"))
.forEachOrdered(System.out::println);
Outputs (as requested)
Orthopedic/Ortho
Length(in.)
Or, using the Pattern, like
List<String> keyList = Arrays.asList("Orthopedic", "Orthopedic/Ortho", "Length(in.)");
Pattern p = Pattern.compile("[a-zA-Z0-9-_]+");
keyList.stream().filter(x -> !p.matcher(x).matches()).forEachOrdered(System.out::println);
There are two problems:
1) the regular expression is wrong, it matches just one character.
2) you need to use m.matches() instead of m.find().
You can use matches instead of find:
//Added the + at the end and removed the extra -
Pattern p = Pattern.compile("[a-zA-Z0-9_]+");
for(String s : keyList){
Matcher m = p.matcher(s);
if (!m.matches()){
System.out.println(s);
}
}
Also note that the point of compiling a pattern is to reuse it, so put it outside the loop. Otherwise you may as well use:
for(String s : keyList){
if (!s.matches("[a-zA-Z0-9_]+")){
System.out.println(s);
}
}

Java how to check multiple regex patterns against an input?

(If I'm taking the complete wrong direction let me know if there is a better way I should be approaching this)
I have a Java program that will have multiple patterns that I want to compare against an input. If one of the patterns matches then I want to save that value in a String. I can get it to work with a single pattern but I'd like to be able to check against many.
Right now I have this to check if an input matches one pattern:
Pattern pattern = Pattern.compile("TST\\w{1,}");
Matcher match = pattern.matcher(input);
String ID = match.find()?match.group():null;
So, if the input was TST1234 or abcTST1234 then ID = "TST1234"
I want to have multiple patterns like:
Pattern pattern = Pattern.compile("TST\\w{1,}");
Pattern pattern = Pattern.compile("TWT\\w{1,}");
...
and then to a collection and then check each one against the input:
List<Pattern> rxs = new ArrayList<Pattern>();
rxs.add(pattern);
rxs.add(pattern2);
String ID = null;
for (Pattern rx : rxs) {
if (rx.matcher(requestEnt).matches()){
ID = //???
}
}
I'm not sure how to set ID to what I want. I've tried
ID = rx.matcher(requestEnt).group();
and
ID = rx.matcher(requestEnt).find()?rx.matcher(requestEnt).group():null;
Not really sure how to make this work or where to go from here though. Any help or suggestions are appreciated. Thanks.
EDIT: Yes the patterns will change over time. So The patten list will grow.
I just need to get the string of the match...ie if the input is abcTWT123 it will first check against "TST\w{1,}", then move on to "TWT\w{1,}" and since that matches the ID String will be set to "TWT123".
To collect the matched string in the result you may need to create a group in your regexp if you are matching less than the entire string:
List<Pattern> patterns = new ArrayList<>();
patterns.add(Pattern.compile("(TST\\w+)");
...
Optional<String> result = Optional.empty();
for (Pattern pattern: patterns) {
Matcher matcher = pattern.match();
if (matcher.matches()) {
result = Optional.of(matcher.group(1));
break;
}
}
Or, if you are familiar with streams:
Optional<String> result = patterns.stream()
.map(Pattern::match).filter(Matcher::matches)
.map(m -> m.group(1)).findFirst();
The alternative is to use find (as in #Raffaele's answer) that implicitly creates a group.
Another alternative you may want to consider is to put all your matches into a single pattern.
Pattern pattern = Pattern.compile("(TST\\w+|TWT\\w+|...");
Then you can match and group in a single operation. However this might might it harder to change the matches over time.
Group 1 is the first matched group (i.e. the match inside the first set of parentheses). Group 0 is the entire match. So if you want the entire match (I wasn't sure from your question) then you could perhaps use group 0.
Use an alternation | (a regex OR):
Pattern pattern = Pattern.compile("TST\\w+|TWT\\w+|etc");
Then just check the pattern once.
Note also that {1,} can be replaced with +.
Maybe you just need to end the loop when the first pattern matches:
// TST\\w{1,}
// TWT\\w{1,}
private List<Pattern> patterns;
public String findIdOrNull(String input) {
for (Pattern p : patterns) {
Matcher m = p.matcher(input);
// First match. If the whole string must match use .matches()
if (m.find()) {
return m.group(0);
}
}
return null; // Or throw an Exception if this should never happen
}
If your patterns are all going to be simple prefixes like your examples TST and TWT you can define all of those at once, and user regex alternation | so you won't need to loop over the patterns.
An example:
String prefixes = "TWT|TST|WHW";
String regex = "(" + prefixes + ")\\w+";
Pattern pattern = Pattern.compile(regex);
String input = "abcTST123";
Matcher match = pattern.matcher(input);
String ID = match.find() ? match.group() : null;
// given this, ID will come out as "TST123"
Now prefixes could be read in from a java .properties file, or a simple text file; or passed as a parameter to the method that does this.
You could also define the prefixes as a comma-separated list or one-per-line in a file then process that to turn them into one|two|three|etc before passing it on.
You may be looping over several inputs, and then you would want to create the regex and pattern variables only once, creating only the Matcher for each separate input.

Regex to get value between two colon excluding the colons

I have a string like this:
something:POST:/some/path
Now I want to take the POST alone from the string. I did this by using this regex
:([a-zA-Z]+):
But this gives me a value along with colons. ie I get this:
:POST:
but I need this
POST
My code to match the same and replace it is as follows:
String ss = "something:POST:/some/path/";
Pattern pattern = Pattern.compile(":([a-zA-Z]+):");
Matcher matcher = pattern.matcher(ss);
if (matcher.find()) {
System.out.println(matcher.group());
ss = ss.replaceFirst(":([a-zA-Z]+):", "*");
}
System.out.println(ss);
EDIT:
I've decided to use the lookahead/lookbehind regex since I did not want to use replace with colons such as :*:. This is my final solution.
String s = "something:POST:/some/path/";
String regex = "(?<=:)[a-zA-Z]+(?=:)";
Matcher matcher = Pattern.compile(regex).matcher(s);
if (matcher.find()) {
s = s.replaceFirst(matcher.group(), "*");
System.out.println("replaced: " + s);
}
else {
System.out.println("not replaced: " + s);
}
There are two approaches:
Keep your Java code, and use lookahead/lookbehind (?<=:)[a-zA-Z]+(?=:), or
Change your Java code to replace the result with ":*:"
Note: You may want to define a String constant for your regex, since you use it in different calls.
As pointed out, the reqex captured group can be used to replace.
The following code did it:
String ss = "something:POST:/some/path/";
Pattern pattern = Pattern.compile(":([a-zA-Z]+):");
Matcher matcher = pattern.matcher(ss);
if (matcher.find()) {
ss = ss.replaceFirst(matcher.group(1), "*");
}
System.out.println(ss);
UPDATE
Looking at your update, you just need ReplaceFirst only:
String result = s.replaceFirst(":[a-zA-Z]+:", ":*:");
See the Java demo
When you use (?<=:)[a-zA-Z]+(?=:), the regex engine checks each location inside the string for a * before it, and once found, tries to match 1+ ASCII letters and then assert that there is a : after them. With :[A-Za-z]+:, the checking only starts after a regex engine found : character. Then, after matching :POST:, the replacement pattern replaces the whole match. It is totlally OK to hardcode colons in the replacement pattern since they are hardcoded in the regex pattern.
Original answer
You just need to access Group 1:
if (matcher.find()) {
System.out.println(matcher.group(1));
}
See Java demo
Your :([a-zA-Z]+): regex contains a capturing group (see (....) subpattern). These groups are numbered automatically: the first one has an index of 1, the second has the index of 2, etc.
To replace it, use Matcher#appendReplacement():
String s = "something:POST:/some/path/";
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile(":([a-zA-Z]+):").matcher(s);
while (m.find()) {
m.appendReplacement(result, ":*:");
}
m.appendTail(result);
System.out.println(result.toString());
See another demo
This is your solution:
regex = (:)([a-zA-Z]+)(:)
And code is:
String ss = "something:POST:/some/path/";
ss = ss.replaceFirst("(:)([a-zA-Z]+)(:)", "$1*$3");
ss now contains:
something:*:/some/path/
Which I believe is what you are looking for...

Regular expression match a-alphanumeric&b-digits&c-digits

I have query about java regular expressions. Actually, I am new to regular expressions.
So I need help to form a regex for the statement below:
Statement: a-alphanumeric&b-digits&c-digits
Possible matching Examples: 1) a-90485jlkerj&b-34534534&c-643546
2) A-RT7456ffgt&B-86763454&C-684241
Use case: First of all I have to validate input string against the regular expression. If the input string matches then I have to extract a value, b value and c value like
90485jlkerj, 34534534 and 643546 respectively.
Could someone please share how I can achieve this in the best possible way?
I really appreciate your help on this.
you can use this pattern :
^(?i)a-([0-9a-z]++)&b-([0-9]++)&c-([0-9]++)$
In the case what you try to match is not the whole string, just remove the anchors:
(?i)a-([0-9a-z]++)&b-([0-9]++)&c-([0-9]++)
explanations:
(?i) make the pattern case-insensitive
[0-9]++ digit one or more times (possessive)
[0-9a-z]++ the same with letters
^ anchor for the string start
$ anchor for the string end
Parenthesis in the two patterns are capture groups (to catch what you want)
Given a string with the format a-XXX&b-XXX&c-XXX, you can extract all XXX parts in one simple line:
String[] parts = str.replaceAll("[abc]-", "").split("&");
parts will be an array with 3 elements, being the target strings you want.
The simplest regex that matches your string is:
^(?i)a-([\\da-z]+)&b-(\\d+)&c-(\\d+)
With your target strings in groups 1, 2 and 3, but you need lot of code around that to get you the strings, which as shown above is not necessary.
Following code will help you:
String[] texts = new String[]{"a-90485jlkerj&b-34534534&c-643546", "A-RT7456ffgt&B-86763454&C-684241"};
Pattern full = Pattern.compile("^(?i)a-([\\da-z]+)&b-(\\d+)&c-(\\d+)");
Pattern patternA = Pattern.compile("(?i)([\\da-z]+)&[bc]");
Pattern patternB = Pattern.compile("(\\d+)");
for (String text : texts) {
if (full.matcher(text).matches()) {
for (String part : text.split("-")) {
Matcher m = patternA.matcher(part);
if (m.matches()) {
System.out.println(part.substring(m.start(), m.end()).split("&")[0]);
}
m = patternB.matcher(part);
if (m.matches()) {
System.out.println(part.substring(m.start(), m.end()));
}
}
}
}

Categories