Regex matching entire string

Regex matching entire string - java

How can I make my string only pass a test if every character in the string is in the regex?
Here is what I have so far:
String w = theApplet.Word.getText().toLowerCase();
if (w.matches(".*[a-z-_]+.*")) {
    theApplet.words.add(w);
    theApplet.str.setText("The word: " + w + " has been added to the list");
}
However, the string is valid even if it contains invalid characters, as long as it contains at least 1 of the characters in the regex.

.* means "match any character zero or more times".
[a-z-_]+ means "match a lowercase letter (a-z), dash (-) or underscore (_) one or more times".
So the .* on either side can consume any other characters, and the regex returns true as soon as the string contains at least one lowercase letter, dash or underscore.
Simply remove the .*'s: String.matches() must match the entire string, so [a-z-_]+ on its own forces every character to be a lowercase letter, dash or underscore.
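A minimal sketch of the fix, assuming a hypothetical helper named isValidWord (not part of the original applet). The character class is written as [a-z_-] with the dash last, which is equivalent to the question's [a-z-_] but easier to read:

```java
import java.util.regex.Pattern;

public class WholeStringMatch {
    // Same characters as the question's [a-z-_], with the literal dash moved to the end.
    private static final Pattern WORD = Pattern.compile("[a-z_-]+");

    // Hypothetical helper: true only if every character is a-z, '_' or '-'.
    static boolean isValidWord(String w) {
        // matches() succeeds only when the pattern covers the whole input.
        return WORD.matcher(w).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidWord("hello_world"));              // true
        System.out.println(isValidWord("hello world!"));             // false
        // The original pattern accepts the same invalid input:
        System.out.println("hello world!".matches(".*[a-z-_]+.*"));  // true
    }
}
```

Because the quantifier is +, an empty string is rejected as well.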

Related

How can I remove the last character in a String that has another regex computation in Java?

I want to remove the following characters from a String: <, >, t, d, / and also remove the last remaining character after that first removal. I want to do this in a single statement.
Regex for removing <, >, t, d, /:
String codCIM = element.toString().replaceAll("[<>,t,d,//]", ""); // works fine
Regex for removing <, >, t, d, / AND the last character:
String codCIM = element.toString().replaceAll("[<>,t,d,//].$", ""); // doesn't work
Ex: "dtt>W43451005/dttt>" should become W4345100.
But I can only achieve: W43451005

Try using this regex: [<>td/]|.(?=[<>td/]*$)
[<>td/] matches the target characters;
.(?=[<>td/]*$) matches the character before the ending target character sequence, which is basically the last character after removing all target characters;

First, the expected output shows that a sequence consisting of d, t, <, >, / should be removed, and then the character before this match at the end of the string has to be removed.
This can be achieved with the following regexp:
System.out.println("dtt>W43451005/dttt>".replaceAll("([<>td/]|.[<>td/]*$)", ""));
Output
W4345100
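Both suggested patterns can be compared side by side. The sketch below wraps them in two hypothetical helper methods (the names are illustrative only) and runs them on the sample input from the question:

```java
public class RemoveTagsAndLast {
    // Variant with a lookahead: the trailing target characters are only checked,
    // and are then removed one by one by the first alternative.
    static String withLookahead(String s) {
        return s.replaceAll("[<>td/]|.(?=[<>td/]*$)", "");
    }

    // Variant without a lookahead: the last "real" character is consumed
    // together with the trailing target characters in a single match.
    static String withoutLookahead(String s) {
        return s.replaceAll("[<>td/]|.[<>td/]*$", "");
    }

    public static void main(String[] args) {
        String input = "dtt>W43451005/dttt>";
        System.out.println(withLookahead(input));    // W4345100
        System.out.println(withoutLookahead(input)); // W4345100
    }
}
```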

Matching whole words with special characters with a dynamically built pattern

I need to match an exact substring in a string in Java. I've tried with
String pattern = "\\b"+subItem+"\\b";
But it doesn't work if my substring contains non alphanumerical characters.
I want this to work exactly as the "Match whole word only" function in Notepad++.
Could you help?

I suggest either unambiguous word boundaries (that match a string only if the search pattern is not enclosed with letters, digits or underscores):
String pattern = "(?<!\\w)"+Pattern.quote(subItem)+"(?!\\w)";
where (?<!\w) matches a location not preceded by a word char and (?!\w) fails if there is a word char immediately after the current position, or a variation that takes into account leading/trailing special chars of the potential match:
String pattern = "(?:\\B(?!\\w)|\\b(?=\\w))" + Pattern.quote(subItem) + "(?:(?<=\\w)\\b|(?<!\\w)\\B)";
Details:
(?:\B(?!\w)|\b(?=\w)) - either a non-word boundary if the next char is not a word char, or a word boundary if the next char is a word char
Data\[3\] - this is a quoted subItem
(?:(?<=\w)\b|(?<!\w)\B) - either a word boundary if the preceding char is a word char, or a non-word boundary if the preceding char is not a word char.
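As a rough illustration of the first suggestion, the sketch below uses a hypothetical helper named containsWholeWord (not from the original post) and the Data[3] item mentioned above:

```java
import java.util.regex.Pattern;

public class WholeWordMatch {
    // Hypothetical helper: true if subItem occurs in text without a word
    // character (letter, digit or underscore) directly before or after it.
    static boolean containsWholeWord(String text, String subItem) {
        // Pattern.quote makes characters such as [ and ] match literally.
        String pattern = "(?<!\\w)" + Pattern.quote(subItem) + "(?!\\w)";
        return Pattern.compile(pattern).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWholeWord("x = Data[3] + 1", "Data[3]"));   // true
        System.out.println(containsWholeWord("x = myData[3] + 1", "Data[3]")); // false
        System.out.println(containsWholeWord("Data[3]x", "Data[3]"));          // false
    }
}
```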

How to match strings with regex to know if the first two words have their first letter capitalised

I have a String " Karren Warren, this is a very good product " and I want to use a regex that returns true whenever the first letter of the first word is capitalized and the first letter of the second word is capitalized. That is, the two capitalized words have to be consecutive.
So in the example above, the regex would return true because K and W are both capitalized. Conversely, it would return false when the text is Karren warren, Kindly check this out
I used the pattern ([A-Z]\w+\s){2} but it keeps returning false.

You could use this regex:
^\s*[A-Z][^\s]+\s+[A-Z]
Demo: https://regex101.com/r/B8XDKg/1/

To match all uppercase letters that have a lowercase variant, you could use \p{Lu}. If you don't want to cross newlines, you can use \h to match horizontal whitespace chars, as \s could also match a newline.
^\p{Lu}\S+\h+\p{Lu}
In Java with the doubled backslashes
String regex = "^\\p{Lu}\\S+\\h+\\p{Lu}";
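The two answers can be combined. The sketch below is one possible combination, not either author's exact pattern: it keeps the optional leading whitespace (\s*) from the first answer and uses \p{Lu} and \h from the second. The helper name startsWithTwoCapitalizedWords is hypothetical, and find() is used because the pattern only describes the start of the string:

```java
import java.util.regex.Pattern;

public class CapitalizedWords {
    // Optional leading whitespace, an uppercase letter, the rest of the first word,
    // horizontal whitespace, then an uppercase letter starting the second word.
    private static final Pattern TWO_CAPS =
            Pattern.compile("^\\s*\\p{Lu}\\S+\\h+\\p{Lu}");

    static boolean startsWithTwoCapitalizedWords(String text) {
        // find() rather than matches(): the rest of the sentence is not described.
        return TWO_CAPS.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(startsWithTwoCapitalizedWords(
                " Karren Warren, this is a very good product "));  // true
        System.out.println(startsWithTwoCapitalizedWords(
                "Karren warren, Kindly check this out"));          // false
    }
}
```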

regex capture includes too much

I have a string from which I would like to capture everything from a colon (inclusive) up to, but excluding, the next whitespace or parenthesis.
Why does the following regex include the parenthesis in the match?
:(.*?)[\(\)\s] or also :(.+?)[\)\s] (non-greedy) does not work.
Example input: WHERE t.operator_id = :operatorID AND (t.merchant_id = :merchantID) AND t.readerApplication_id = :readerApplicationID AND t.accountType in :accountTypes
Should extract :operatorID, :merchantID, :readerApplicationID, :accountTypes.
But my regexes extract :merchantID) for the second match.
What is wrong and why?
Even if I use a more exact condition in the capture, it does not work: :([a-zA-z0-9_]+?)[\)\(\s]

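Put your condition "followed by whitespace or a parenthesis" in a lookahead, so that it is checked but not included in the match. Right now you are explicitly matching the parenthesis with [\(\)\s]:
:(.+?)(?=[\s\(\)])
https://regex101.com/r/im8KWF/1/
Note that this lookahead requires a following character, so a parameter at the very end of the input (such as :accountTypes in your example) is matched only if you also allow the end of the string: :(.+?)(?=[\s\(\)]|$)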
Or, use the built-in \b "word boundary", which is also a "zero-width" assertion meaning the same thing*:
:(.+?)\b
https://regex101.com/r/FnnzGM/3/
*Definition of word boundary from regular-expressions.info:
There are three different positions that qualify as word boundaries:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
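A short sketch of the lookahead approach in Java. The class and method names are hypothetical, and the |$ alternative is an addition (not part of the pattern linked above) so that a parameter at the very end of the input is also found:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedParams {
    // A colon, then as few characters as possible, up to (but not including)
    // whitespace, a parenthesis, or the end of the input.
    private static final Pattern PARAM = Pattern.compile(":.+?(?=[\\s()]|$)");

    static List<String> extract(String sql) {
        List<String> names = new ArrayList<>();
        Matcher m = PARAM.matcher(sql);
        while (m.find()) {
            names.add(m.group()); // the whole match, colon included
        }
        return names;
    }

    public static void main(String[] args) {
        String sql = "WHERE t.operator_id = :operatorID AND (t.merchant_id = :merchantID) "
                + "AND t.readerApplication_id = :readerApplicationID AND t.accountType in :accountTypes";
        System.out.println(extract(sql));
        // [:operatorID, :merchantID, :readerApplicationID, :accountTypes]
    }
}
```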

Java regular expression to check if there is at least one letter

I've found a lot of variations on this subject on both SO and the web, but most (if not all) ask for at least one letter and one digit. I only need at least one letter.
I've tried, but I haven't got it right. What I need is for the String to contain only letters, or letters and numbers (in any order); dashes and spaces are allowed, but not at the beginning or the end of the string. Here is how it looks right now:
import java.util.regex.Pattern;

public class Main {
    protected static final String PATTERN = "[\u00C0-\u017Fa-zA-Z0-9']+([- ][\u00C0-\u017Fa-zA-Z0-9']+)*";

    public static void main(String[] args) {
        String name;
        //name = "Street"; // allowed
        //name = "Some-Street"; // allowed
        //name = "Street "; // not allowed
        //name = " Street"; // not allowed
        //name = "Street-"; // not allowed
        //name = "-Street"; // not allowed
        //name = "Street"; // allowed
        //name = "1 st street"; // allowed
        //name = "street 5"; // allowed
        name = "111"; // NOT allowed
        // Pattern.matches requires the whole string to match PATTERN
        if (!Pattern.matches(PATTERN, name)) {
            System.out.println("ERROR!");
        } else {
            System.out.println("OK!");
        }
    }
}
How do I add a check that there is at least one letter?
It doesn't matter whether it is at the beginning or the end, or whether there is a space or dash between it and the numbers. There just has to be at least one letter.

You can use this regex for your problem:
^(?=.*\pL)[\pL\pN]+(?:[ -]+[\pL\pN]+)*$
For Java use:
final String regex = "^(?=.*\\pL)[\\pL\\pN]+(?:[ -]+[\\pL\\pN]+)*$";
RegEx Breakup:
^: Start
(?=.*\pL): Using a lookahead make sure we have at least one unicode letter somewhere
[\pL\pN]+: Match one or more unicode letter or unicode digit
(?:: Non-capturing group start
[ -]+: Match one or more space or hyphen
[\pL\pN]+: Match one or more unicode letter or unicode digit
)*: Non-capturing group end. * means zero or more of this group.
$: End
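The breakdown above can be checked against the question's sample names. This is a minimal sketch; the class and method names are hypothetical:

```java
import java.util.regex.Pattern;

public class StreetNameCheck {
    private static final Pattern NAME =
            Pattern.compile("^(?=.*\\pL)[\\pL\\pN]+(?:[ -]+[\\pL\\pN]+)*$");

    // Hypothetical helper: true if the name satisfies the rules from the question.
    static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Street", "Some-Street", "Street ", " Street",
                            "Street-", "-Street", "1 st street", "street 5", "111"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + (isValid(s) ? "OK!" : "ERROR!"));
        }
    }
}
```

Only Street, Some-Street, 1 st street and street 5 print OK!; the other samples print ERROR!.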

If I understand correctly, and according to what you've presented, you have the following conditions:
At least 1 letter
Can contain digits (but only if the previous condition is met)
Dashes and spaces are allowed only if they are not at the beginning or end of the string
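Based on these conditions, the following regex will work (it uses the POSIX bracket expression [[:alnum:]], which Java does not support; in Java write \p{Alnum} instead):
^(?![ -]|\d+$)[[:alnum:] -]+(?<![ -])$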
This regex works as follows:
Ensure the string doesn't begin with hyphen - or space
Ensure the string isn't composed of only digits
Ensure the string consists of one or more alphanumeric characters, spaces or hyphens
Ensure the string doesn't end with hyphen - or space
This will give you the following matches
Street
Some-Street
Street
1 st street
street 5
The regex will fail to match the following strings (as per your examples)
Street (with a trailing space)
Street (with a leading space)
Street-
-Street
111
Edit
Negative lookbehinds are unsupported or restricted in some regex engines (Java itself does support bounded lookbehinds such as (?<![ -])).
Below is an adapted version of my previous regex that uses a negative lookahead instead of a negative lookbehind to ensure that the string doesn't end with a hyphen or space.
^(?![ -]|\d+$)(?:(?![ -]$)[\pL\pN -])+$
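Both variants from this answer can be tried in Java. The sketch below assumes \p{Alnum} as the Java replacement for [[:alnum:]] in the first pattern (ASCII letters and digits only, by default), and uses hypothetical method names:

```java
import java.util.regex.Pattern;

public class StreetNameVariants {
    // Lookbehind variant, with [[:alnum:]] written as \p{Alnum} for Java.
    private static final Pattern WITH_LOOKBEHIND =
            Pattern.compile("^(?![ -]|\\d+$)[\\p{Alnum} -]+(?<![ -])$");

    // Lookahead-only variant, as given above (Unicode letters and digits).
    private static final Pattern LOOKAHEAD_ONLY =
            Pattern.compile("^(?![ -]|\\d+$)(?:(?![ -]$)[\\pL\\pN -])+$");

    static boolean withLookbehind(String name) {
        return WITH_LOOKBEHIND.matcher(name).matches();
    }

    static boolean withLookaheadOnly(String name) {
        return LOOKAHEAD_ONLY.matcher(name).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Some-Street", "street 5", "Street-", " Street", "111"}) {
            System.out.println("\"" + s + "\": " + withLookbehind(s) + " / " + withLookaheadOnly(s));
        }
    }
}
```

On these five samples the two variants agree: the first two names are accepted and the last three are rejected.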

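The following regex does the job (POSIX bracket syntax, as used on regex101; in Java write \p{Alpha} and \p{Alnum} instead of [[:alpha:]] and [[:alnum:]]):
(?=.*[[:alpha:]])[[:alnum:]]{1}[[:alnum:] -]*[[:alnum:]]{1}
(?=.*[[:alpha:]]) guarantees that an alphabetic character [A-Za-z] exists somewhere in the string.
[[:alnum:]]{1} guarantees that the string starts with an alphanumeric character [A-Za-z0-9].
[[:alnum:] -]* allows alphanumeric characters, spaces and dashes in the middle.
[[:alnum:]]{1} guarantees that the string ends with an alphanumeric character [A-Za-z0-9].
Note that the pattern therefore needs at least two characters, and that the {1} quantifiers are redundant.
To see it live: https://regex101.com/r/V0lesF/1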
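A possible Java translation of this pattern, assuming \p{Alpha} and \p{Alnum} as stand-ins for the POSIX bracket expressions and dropping the {1} quantifiers. The class and method names are hypothetical; matches() supplies the anchoring that the pattern itself lacks:

```java
import java.util.regex.Pattern;

public class StreetNamePosixStyle {
    private static final Pattern NAME =
            Pattern.compile("(?=.*\\p{Alpha})\\p{Alnum}[\\p{Alnum} -]*\\p{Alnum}");

    static boolean isValid(String name) {
        // matches() requires the whole string to fit the pattern.
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("1 st street")); // true
        System.out.println(isValid("111"));         // false: no letter
        System.out.println(isValid("a"));           // false: needs two characters
    }
}
```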
