Java Regex : 4 Letters followed by 2 Integers - java

Regex beginner here.
I have already looked at several related questions, but none of them answers mine.
I have a simple regex to check if a string contains 4 letters followed by 2 digits.
[A-Za-z]{4}[0-9]{2}
But when I use it, it doesn't match. Here is the method I use, with an example of input and output:
Input in a JPasswordField
Mypass85
Output
false
Method
public static boolean checkPass(char[] ca) {
    String s = new String(ca);
    System.out.println(s); // Prints: Mypass85
    Pattern p = Pattern.compile("[A-Za-z]{4}[0-9]{2}");
    return p.matcher(s).matches();
}

Matcher#matches attempts to match the entire input. Use Matcher#find instead, which looks for a match anywhere in the input:
public static boolean checkPass(String s) {
    System.out.println(s); // Prints: Mypass85
    Pattern p = Pattern.compile("[A-Za-z]{4}[0-9]{2}");
    return p.matcher(s).find();
}

Promoting a comment to an answer.
It doesn't match because "Mypass85" is 6 letters followed by 2 numbers, but your pattern expects exactly 4 letters followed by 2 numbers.
You can either pass something like "Pass85" to match your existing pattern, or you can get "Mypass85" to match by changing the {4} to {6} or to {4,} (4 or more).
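To see the two answers side by side, here is a minimal sketch. The class and method names (PassCheckDemo, fullMatch, partialMatch) are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

class PassCheckDemo {
    // Exactly four letters followed by two digits, as in the question.
    static final Pattern EXACT = Pattern.compile("[A-Za-z]{4}[0-9]{2}");
    // Four or more letters followed by two digits.
    static final Pattern AT_LEAST_FOUR = Pattern.compile("[A-Za-z]{4,}[0-9]{2}");

    // matches(): the whole input has to fit the pattern.
    static boolean fullMatch(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    // find(): any substring of the input may fit the pattern.
    static boolean partialMatch(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch(EXACT, "Mypass85"));         // false
        System.out.println(partialMatch(EXACT, "Mypass85"));      // true (finds "pass85")
        System.out.println(fullMatch(EXACT, "Pass85"));           // true
        System.out.println(fullMatch(AT_LEAST_FOUR, "Mypass85")); // true
    }
}
```

Note that find() also accepts inputs with arbitrary text around the match, which may or may not be what you want for a password check.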

Related

Why does this regex fail to check accurately?

I have the following method, which matches a given string against a regex in 3 stages. For some reason the regex fails to check some things, although as far as I know the patterns look correct. Can someone point out what I am doing wrong?
I have the following code:
public class App {
    public static void main(String[] args) {
        String identifier = "urn:abc:de:xyz:234567.1890123";
        if (identifier.matches("^urn:abc:de:xyz:.*")) {
            System.out.println("Match ONE");
            if (identifier.matches("^urn:abc:de:xyz:[0-9]{6,12}.[0-9]{1,7}.*")) {
                System.out.println("Match TWO");
                if (identifier.matches("^urn:abc:de:xyz:[0-9]{6,12}.[a-zA-Z0-9.-_]{1,20}$")) {
                    System.out.println("Match Three");
                }
            }
        }
    }
}
Ideally, this code should generate the output
Match ONE
Match TWO
Match Three
only when identifier = "urn:abc:de:xyz:234567.1890123.abd12", but it produces the same output even if the identifier does not have the intended format, as with the following inputs:
"urn:abc:de:xyz:234567.1890123"
"urn:abc:de:xyz:234567.1890ANC"
"urn:abc:de:xyz:234567.1890123"
"urn:abc:de:xyz:234567.1890ACB.123"
I don't understand why it allows alphanumeric characters after the first . and why it does not care about the characters after the second . either.
I would like my regex to check that the string has the following format:
1. The string starts with urn:abc:de:xyz:
2. Then come 6 to 12 digits [0-9] (234567).
3. Then comes a decimal point .
4. Then come 1 to 7 digits [0-9] (1890123).
5. Then comes another decimal point .
6. Finally come 1 to 20 alphanumeric or special characters (ABC123.-_12).
This is a valid string for my regex: urn:abc:de:xyz:234567.1890123.ABC123.-_12
This is an invalid string for my regex, as it is missing the elements from point 6:
urn:abc:de:xyz:234567.1890123
This is also an invalid string for my regex, as it violates point 4 (it has ABC instead of decimal digits):
urn:abc:de:xyz:234567.1890ABC.ABC123.-_12
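This part of the regex:
[0-9]{6,12}.[0-9]{1,7} matches 6 to 12 digits, followed by any character, followed by 1 to 7 digits, because an unescaped . matches any character.
In addition, inside the character class of your third pattern, .-_ is a range from '.' to '_', which covers digits, uppercase letters and several punctuation characters. Put the - last (or escape it) to make it literal.
To match a literal dot, it needs to be escaped (written as \\. inside a Java string literal). Try this:
^urn:abc:de:xyz:[0-9]{6,12}\.[0-9]{1,7}\.[a-zA-Z0-9._-]{1,20}$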
This variant accepts one or more dot-separated groups of word characters and hyphens at the end of the string, as in your examples:
^urn:abc:de:xyz:\d{6,12}\.\d{1,7}(?:\.[\w-]{1,20})+$
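A quick way to check an escaped pattern against the strings from the question is a small test harness. This is only a sketch: the class name UrnCheckDemo and the method isValid are made up, and the final character class assumes that the last segment may contain letters, digits, '.', '-' and '_', as described in point 6 of the question.

```java
import java.util.regex.Pattern;

class UrnCheckDemo {
    // "\\." in Java source is the regex \. (a literal dot).
    // The '-' is placed last in the class so it is literal, not a range.
    static final Pattern URN = Pattern.compile(
        "^urn:abc:de:xyz:[0-9]{6,12}\\.[0-9]{1,7}\\.[a-zA-Z0-9._-]{1,20}$");

    static boolean isValid(String identifier) {
        return URN.matcher(identifier).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("urn:abc:de:xyz:234567.1890123.ABC123.-_12")); // true
        System.out.println(isValid("urn:abc:de:xyz:234567.1890123"));             // false
        System.out.println(isValid("urn:abc:de:xyz:234567.1890ABC.ABC123.-_12")); // false
    }
}
```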

How to split a string into only positive and negative integers?

I'm writing a program to do different calculations with vector functions, but the program I have as of now delimits the negative digits. I've tried using different delimiters but I can't seem to get the right one.
Does anyone know how to keep the positive and negative digits when splitting a string? Also, is there a way to keep any decimal values? .45 would return 45 and .29 would return 29
This is the code:
ArrayList<Integer> list = new ArrayList<Integer>();
String twoVectors = "a=<-1,2,-3> b=<4,-5,6>"; // vectors a and b
String[] tokens = twoVectors.split("\\D");
for (String s : tokens)
    if (!s.equals(""))
        list.add(Integer.parseInt(s));
System.out.println(Arrays.toString(list.toArray()));
When I run the program I get [1, 2, 3, 4, 5, 6] instead of [-1, 2, -3, 4, -5, 6]. All the functions I have work perfectly fine, but they don't work with negative values.
Any help would be appreciated.
You can use
String[] tokens = twoVectors.split("[^\\d-]+");
[^\\d-]+ : match one or more characters that are neither digits nor -
[] : character class, matches any one of the characters listed inside
^ : as the first character inside [], negates the class (match anything not listed)
\\d- : digits 0-9 and the - character
String twoVectors = "a=<-1,2,-3> b=<4,-5,6>";
ArrayList<Integer> list = new ArrayList<Integer>();
String[] tokens = twoVectors.split("[^\\d-]+");
for (String s : tokens)
    if (!s.equals(""))
        list.add(Integer.parseInt(s));
System.out.println(Arrays.toString(list.toArray()));
Output :
[-1, 2, -3, 4, -5, 6]
Or
you can use Pattern along with Matcher to find all the desired values, i.e. signed or unsigned numbers, with the regex -?\\d+
Update: For double values, you can use [^\\d.-]+ and make sure to use Double instead of Integer, along with Double.parseDouble.
With Pattern and Matcher, use -?\\d*\\.?\\d+
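The Pattern and Matcher variant for integers could look roughly like this. It is a minimal sketch; the class name SignedIntsDemo and the method extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SignedIntsDemo {
    // Collects every optionally signed run of digits found in the input.
    static List<Integer> extract(String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = Pattern.compile("-?\\d+").matcher(input);
        while (m.find()) {
            result.add(Integer.parseInt(m.group()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extract("a=<-1,2,-3> b=<4,-5,6>")); // [-1, 2, -3, 4, -5, 6]
    }
}
```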
Use [^\\d-] inside your split method, i.e. twoVectors.split("[^\\d-]")
Why [^\\d-]:
^ : at the start of a character class, negates it, so the class matches any character not listed
\d : any digit [0-9]
- : a literal '-', so minus signs are not treated as separators
The regex you currently have splits the string on anything that is not a digit, so every non-digit character is treated as a separator. If you add the - sign to this pattern, only characters that are neither a digit nor a - sign are treated as separators. This works in some cases, but fails if the input has a - or . without a number after it.
What you need to do is specify the number format in a regex (like -?\d*\.?\d+) and then find all matches of this pattern. You will also need to change the numbers to Double so that you can parse decimal values.
String twoVectors = "a=<-1,.2,-3> b=<4,-5,6>";
ArrayList<Double> numbers = new ArrayList<Double>();
Matcher matcher = Pattern.compile("-?\\d*\\.?\\d+").matcher(twoVectors);
while (matcher.find()) {
    numbers.add(Double.parseDouble(matcher.group()));
}
System.out.println(Arrays.toString(numbers.toArray()));
Output
[-1.0, 0.2, -3.0, 4.0, -5.0, 6.0]
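The failure of the split approach on a stray minus sign can be reproduced with a small sketch. The input "a=<-,2>" is a made-up malformed example, and the names StrayMinusDemo, splitParses and findInts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StrayMinusDemo {
    // Split-based parsing: returns false if any token is not a valid integer.
    static boolean splitParses(String input) {
        try {
            for (String token : input.split("[^\\d-]+")) {
                if (!token.isEmpty()) {
                    Integer.parseInt(token);
                }
            }
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Match-based parsing: only well-formed integers are ever extracted.
    static List<Integer> findInts(String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = Pattern.compile("-?\\d+").matcher(input);
        while (m.find()) {
            result.add(Integer.parseInt(m.group()));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a=<-,2>"; // hypothetical input with a lone '-'
        System.out.println(splitParses(input)); // false: the token "-" is not a number
        System.out.println(findInts(input));    // [2]
    }
}
```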
A 1-line solution:
List<Integer> numbers = Arrays
    .stream(twoVectors.replaceAll("^[^\\d-]+", "").split("[^\\d-]+"))
    .map(Integer::valueOf)
    .collect(Collectors.toList());
The initial replaceAll removes the leading non-target characters (otherwise the split would return an empty string as the first element).

Trying to understand this Regex code [duplicate]

I have the following code. As far as I can see, the program should print 0123445. Instead, it prints 01234456.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Regex2 {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\d*");
        Matcher m = p.matcher("ab34ef");
        boolean b = false;
        while (b = m.find()) {
            System.out.print(m.start() + m.group());
        }
        System.out.println();
    }
}
I think the following should happen-
Since the search pattern is for a \d*,
It finds a hit at position 0, but since the hit is not a digit, it just prints 0
It finds a hit at position 1, but again, not a digit, so it just prints 1
Finds a hit at position 2 and since we are looking for \d*, the hit is 34, and so it prints 234.
Moves to position 4, finds a hit, but since hit is not a digit, it just prints 4.
Moves to position 5, finds a hit, but since hit is not a digit, it just prints 5.
At this point, as far as I can see, it should be done. But for some reason, the program also returns a 6.
I would much appreciate it if someone could explain.
The \d* matches zero(!) or more digits. That's why it returns an empty string as a match at positions 0 and 1; it then matches 34 at position 2 and an empty string again at positions 4 and 5. At that point what is left to match against is an empty string. This empty string also matches \d* (because an empty string contains zero digits), which is why there is another match at position 6.
To contrast this try using \d+ (which matches one or more digits) as the pattern and see what happens then.
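The contrast is easy to check with a small helper that lists every match together with its start index. This is a sketch only; the class name EmptyMatchDemo and the method matches are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Returns "start:group" for every match of regex in input.
    static List<String> matches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + ":" + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // \d* also matches the empty string, including at the very end (index 6).
        System.out.println(matches("\\d*", "ab34ef")); // [0:, 1:, 2:34, 4:, 5:, 6:]
        // \d+ needs at least one digit, so only "34" is reported.
        System.out.println(matches("\\d+", "ab34ef")); // [2:34]
    }
}
```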

Understanding regular expression output [duplicate]

I need help to understand the output of the code below. I am unable to figure out the output for System.out.print(m.start() + m.group());. Please can someone explain it to me?
import java.util.regex.*;
class Regex2 {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\d*");
        Matcher m = p.matcher("ab34ef");
        boolean b = false;
        while (b = m.find()) {
            System.out.println(m.start() + m.group());
        }
    }
}
Output is:
0
1
234
4
5
6
Note that if I put System.out.println(m.start() );, output is:
0
1
2
4
5
6
Because you have included a * character, your pattern will match empty strings as well. When I change your code as I suggested in the comments, I get the following output:
0 ()
1 ()
2 (34)
4 ()
5 ()
6 ()
So you get an empty match at almost every position in the string, with the exception of 34, which matches the run of digits. Use \\d+ if you want to match digits without also matching empty strings.
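The modified code is not shown in this answer. The following sketch is only a guess at it: it separates the start index from the matched text so that empty matches become visible, and it reproduces the output listed above. The names LabelledMatchesDemo and describe are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LabelledMatchesDemo {
    // Formats each match as "start (group)" so an empty group shows up as "()".
    static List<String> describe(String regex, String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            lines.add(m.start() + " (" + m.group() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe("\\d*", "ab34ef")) {
            System.out.println(line);
        }
    }
}
```

In the original code, m.start() + m.group() is string concatenation of an int and a String, which is why the match at index 2 shows up as 234.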
You used this regex - \d* - which basically means zero or more digits. Mind the zero!
So this pattern will match any group of digits, e.g. 34 plus any other position in the string, where the matched sequence will be the empty string.
So, you will have 6 matches, starting at indices 0,1,2,4,5,6. For match starting at index 2, the matched sequence is 34, while for the remaining ones, the match will be the empty string.
If you want to find only digits, you might want to use this pattern: \d+
\d* - matches zero or more digits in the expression.
The expression ab34ef has the corresponding indices 012345.
At index 0 there is no digit, so start() prints 0 and group() prints nothing; likewise at index 1, which prints 1 and nothing. At index 2 we find a match, so it prints 2 and 34. Next it prints 4 and nothing, and so on.
Another example:
Pattern pattern = Pattern.compile("\\d\\d");
Matcher matcher = pattern.matcher("123ddc2ab23");
while (matcher.find()) {
    System.out.println("start:" + matcher.start() + " end:" + matcher.end() + " group:" + matcher.group() + ";");
}
which prints:
start:0 end:2 group:12;
start:9 end:11 group:23;
You will find more information in the tutorial

Splitting string expression into tokens

My input is like
String str = "-1.33E+4-helloeeee+4+(5*2(10/2)5*10)/2";
I want the output to be:
1.33E+4
helloeeee
4
5
2
10
2
5
10
2
But I am getting the output as
1.33, 4, helloeeee, 4, 5, 2, 10, 2, 5, 10, 2
I want the exponent value to stay intact after splitting, i.e. "1.33E+4" as a single token.
Here is my code:
String str = "-1.33E+4-helloeeee+4+(5*2(10/2)5*10)/2";
List<String> tokensOfExpression = new ArrayList<String>();
String[] tokens = str.split("[(?!E)+*\\-/()]+");
for (String token : tokens) {
    System.out.println(token);
    tokensOfExpression.add(token);
}
if (tokensOfExpression.get(0).equals("")) {
    tokensOfExpression.remove(0);
}
I would first replace the E+ with a symbol that is not ambiguous, such as
str = str.replace("E+", "SCINOT");
(replace treats its argument literally; with replaceAll the + would have to be escaped as "E\\+".) You can then parse with StringTokenizer, substituting E+ back for the SCINOT symbol when you need to evaluate the number represented in scientific notation.
You can't do that with a single regular expression, because of the ambiguities introduced by FP constants in scientific notation, and in any case you need to know which token is which without having to re-scan them. You've also mis-stated your requirement, as you certainly need the binary operators in the output as well. You need to write both a scanner and a parser. Have a look for 'recursive descent expression parser' and 'Dijkstra shunting-yard algorithm'.
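As a rough idea of what the scanner half could look like, here is a minimal sketch that labels each token as it is found, so nothing has to be re-scanned later. It is not a full solution: the class name ScannerSketch, the NUM/ID/OP labels and the exact number syntax are assumptions, unrecognized characters are silently skipped, and a leading minus is reported as a separate operator for the parser to handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScannerSketch {
    // One alternative per token kind; numbers come first so "1.33E+4" stays whole.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<num>(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?)"
        + "|(?<id>[A-Za-z_]\\w*)"
        + "|(?<op>[-+*/()])");

    // Returns tokens prefixed with their kind, e.g. "NUM:1.33E+4".
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group("num") != null) {
                tokens.add("NUM:" + m.group());
            } else if (m.group("id") != null) {
                tokens.add("ID:" + m.group());
            } else {
                tokens.add("OP:" + m.group());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-1.33E+4-helloeeee+4"));
        // [OP:-, NUM:1.33E+4, OP:-, ID:helloeeee, OP:+, NUM:4]
    }
}
```

The token list can then be fed to a recursive descent parser or the shunting-yard algorithm mentioned above.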
Try this (the first element of the result is an empty string, because the input starts with a separator):
String[] tokens = str.split("(?<!E)[*\\-/()+]+");
It's easier to achieve the result with Matcher
String str = "-1.33E+4-helloeeee+4+(5*2(10/2)5*10)/2";
Matcher m = Pattern.compile("\\d+\\.\\d*E[+-]?\\d+|\\w+").matcher(str);
while (m.find()) {
    System.out.println(m.group());
}
prints
1.33E+4
helloeeee
4
5
2
10
2
5
10
2
Note that it needs some testing with different floating-point expressions, but it is easily adjustable.
