How to validate file path? - java

Edit 1
I am stuck on a little problem, and the guy I normally turn to is, believe it or not, in Australia on his honeymoon. How inconsiderate is that?
I am trying to write a method that returns a boolean telling me whether a file path is valid or not.
The problem is that it always returns false.
I have tried the following sample data
c:\Lingerie\
c:\\Lingerie\\
c:\\
c:\
Edit
This is the input screen that I have developed so far. I have already thought of having extra white spaces, so I already popped in the trim command.
They all have returned false.
Here is the method that I'm using and the code that calls it.
dbFilePath = (text.getText()).trim();
bool03 = busLog.isFilePath(dbFilePath);
System.out.println("The result is " + bool03);
And the method that is called is
public boolean isFilePath(String filePath) {
return discreetLog.isFilePathMatched(filePath);
}
And which calls
public boolean isFilePathMatched(String myFilePath){
String regularExpression = "([a-zA-Z]:)?(\\\\[a-zA-Z0-9_.-]+)+\\\\?";
Pattern pattern = Pattern.compile(regularExpression);
return Pattern.matches(regularExpression, myFilePath);
}
I don't know if it is my code or an input error.

The first path of your sample is actually matching the regex correctly.
To match the second and third, you have to allow double backslashes. Just add \\\\? to the right of each existing backslash in your regex.
To match the third and fourth, you need to make the second group optional. Currently it is used with the operator +. Use * instead.
Therefore if you replace your regex with this one:
String regularExpression = "([a-zA-Z]:)?(\\\\\\\\?[a-zA-Z0-9_.-]+)*\\\\?\\\\?";
it will match all four of your sample paths. Be aware that it is quite permissive: every part is now optional, so it also accepts an empty string.
If you still cannot match your sample data, then there is a problem somewhere else in your code. You can try posting an SSCCE.
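A quick way to check both patterns against the sample data is a small harness like the one below. It is only a sketch: the class and method names are made up for illustration, and the inputs are written as Java string literals, so every literal backslash is doubled.

```java
import java.util.regex.Pattern;

public class FilePathRegexDemo {
    // Pattern from the question: each path segment needs exactly one leading backslash.
    static final Pattern ORIGINAL =
            Pattern.compile("([a-zA-Z]:)?(\\\\[a-zA-Z0-9_.-]+)+\\\\?");

    // Relaxed pattern from the answer: one or two backslashes, segments optional.
    static final Pattern RELAXED =
            Pattern.compile("([a-zA-Z]:)?(\\\\\\\\?[a-zA-Z0-9_.-]+)*\\\\?\\\\?");

    static boolean matchesOriginal(String path) {
        return ORIGINAL.matcher(path).matches();
    }

    static boolean matchesRelaxed(String path) {
        return RELAXED.matcher(path).matches();
    }

    public static void main(String[] args) {
        // The four sample inputs from the question, as Java literals.
        String[] samples = {
            "c:\\Lingerie\\",
            "c:\\\\Lingerie\\\\",
            "c:\\\\",
            "c:\\"
        };
        for (String s : samples) {
            System.out.println(s + "  original=" + matchesOriginal(s)
                    + "  relaxed=" + matchesRelaxed(s));
        }
    }
}
```

If the first sample still comes back false in the real application, the string reaching the method is probably not what was typed, which points at the calling code rather than the regex.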

Regex code not collecting multiple lines of matching pattern

I'm new to using regex and I was hoping that someone could help me with this.
I have this regex code which is supposed to identify tab groups in a tablature file. It works on regex testing websites such as regexr.com, regextester.com, and extendsclass.com/regex-tester, but when I code it in java using the example text shown below, I am given each individual line as its own separate group, instead of 4 groups containing all the text which are separated only by one newline.
I have read through the Stack Overflow thread "Regular expression works on regex101.com, but not on prod" and have been careful to avoid string literal problems and multiline problems. I've also tried the regex with other engines on regex101 and it worked, but it still does not work in my Java code shown below.
I tried enabling the multiline flag but it still doesn't work. I thought it was a problem with my code, but then I got the same wrong output on other regex tester websites: myregexp.com and freeformatter.com/java-regex-tester
Here is the original regex. It is long, so it might be easier to use the simplified regex further down, as they both have the same problem I was talking about:
RealRegexCode = (^|[\n\r])(((?<=^|[\n\r])[^\S\n\r]*\|*[^\S\n\r]*((E|A|D|G|B|e|a|d|g|b)[^\S\n\r]*\|*(?=(([^\S\n\r]*-[ -]*(?=\|))|([ -]*((\(?[a-zB-Z0-9]+\)?)+[^\S\n\r]*-[ -]*)+((\(?[a-zB-Z0-9]+\)?)+){0,1}[^\S\n\r]*))[|\r\n]|$)))((([^\S\n\r]*-[ -]*(?=\|))|([ -]*((\(?[a-zB-Z0-9]+\)?)+[^\S\n\r]*-[ -]*)+((\(?[a-zB-Z0-9]+\)?)+){0,1}[^\S\n\r]*))\|)+(((?<=\|)[^\S\n\r]*((E|A|D|G|B|e|a|d|g|b)[^\S\n\r]*\|*(?=(([^\S\n\r]*-[ -]*(?=\|))|([ -]*((\(?[a-zB-Z0-9]+\)?)+[^\S\n\r]*-[ -]*)+((\(?[a-zB-Z0-9]+\)?)+){0,1}[^\S\n\r]*))[|\r\n]|$)))((([^\S\n\r]*-[ -]*(?=\|))|([ -]*((\(?[a-zB-Z0-9]+\)?)+[^\S\n\r]*-[ -]*)+((\(?[a-zB-Z0-9]+\)?)+){0,1}[^\S\n\r]*))\|)+)*(\n|\r|$))+
Here is a simplified regex code that displays the same problem, provided for the sake of debugging
SimplifiedRegexCode = (^|[\n\r])([^\n\r]+(\n|\r|$))+
here is the code that finds the matches using the regex pattern:
public static void main(String[] args){
String filePath = "C:\\Users\\stani\\IdeaProjects\\project\\src\\testing files\\guitar - a thousand matches by passenger.txt";
Path path = Path.of(filePath);
List<String> stuff = new ArrayList<>();
try {
String rootStr = Files.readString(path);
Pattern pattern = Pattern.compile("(^|[\\n\\r])([^\\n\\r]+(\\n|\\r|$))+");
Matcher ptrnMatcher = pattern.matcher(rootStr);
while (ptrnMatcher.find()) {
stuff.add(ptrnMatcher.group());
}
}catch (Exception e) {
e.printStackTrace();
}
System.out.println(new Patterns().MeasureGroupCollection);
for (String s:stuff)
System.out.println(s);
}
And here is the text I was testing it with. It might help to copy and paste this in a text editor as stack overflow might distort how the text looks:
e|---------------------------------|------------------------------------|
e|------------------------------------------------------------------|
B|-----1--------(1)----1-----------|-------1---------------1----------1-|
B|-----1--------(1)----0---------0-----1---------1-----3--------(3)-|
G|-----------0------------0--------|-------------0----------------0-----|
G|-----------0---------------0---------------0---------------0------|
D|-----0h2-----2-------2-----------|-------2-------2-------0--------0---|
D|-----2-------2-------2-------2-------2-------2-------0-------0----|
A|-3-------3-------3-------3-------|------------------------------------|
A|-0-------0--------------------------------------------------------|
E|-----------------------------0---|---1-------1-------3-------3--------|
E|-----------------0-------0--------1------1-------3-------3--------|
e|-------------------------------------------------------------------|
B|-----1---------1-----1---------1-----3---------3-------1---------1-|
G|-----------0---------------0---------------0-----------------0-----|
D|-----3-------2-------2-------2-------0-------0---------2-------2---|
A|-----------------3-------3-------------------------3-------3-------|
E|-1-------1-----------------------3-------3-------------------------|
It should identify four different groups from the text. However, in Java and in the two testers I mentioned above, it recognizes each line as its own separate group (i.e. 12 groups).
I couldn't help but respond to this as I am familiar with both regex and guitar haha.
For your short regex, please see the following regex on regex101.com:
https://regex101.com/r/NqGhoh/1/
The multiline modifier is required.
The main problem with this is that you are handling newlines on the front and back of the expression. I have modified the expression in a couple ways:
Made the regex match newlines only on the end, always looking for a ^ at the beginning.
Matched the carriage return/newline combination as \r?\n, since a carriage return, when it is used, should always be followed by a newline.
Used non-capturing groups to reduce overhead and complexity when looking at matches. This is the ?: just inside the parentheses. It means the group won't be captured in the result, just used for grouping.
I started testing your longer regex and may update that as well, though it sounds like you already know what to do with the shorter one corrected.
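The exact expression behind the regex101 link is not reproduced here, so the following is only a sketch of the three changes described above, with an assumed pattern and made-up class and method names. It also assumes the tab file uses Windows line endings (\r\n), which is one way the simplified regex from the question can end up producing one match per line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TabGroupDemo {
    // Assumed reconstruction: one or more consecutive non-empty lines.
    // Each line starts at ^ (MULTILINE) and ends with \r?\n or the end of input.
    static final Pattern GROUP =
            Pattern.compile("(?:^[^\\r\\n]+(?:\\r?\\n|$))+", Pattern.MULTILINE);

    // The simplified pattern from the question, unchanged.
    static final Pattern SIMPLIFIED =
            Pattern.compile("(^|[\\n\\r])([^\\n\\r]+(\\n|\\r|$))+");

    static List<String> findGroups(String text) {
        List<String> groups = new ArrayList<>();
        Matcher m = GROUP.matcher(text);
        while (m.find()) {
            groups.add(m.group().trim()); // drop the trailing line break
        }
        return groups;
    }

    static int countSimplifiedMatches(String text) {
        Matcher m = SIMPLIFIED.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Two short tab groups separated by one blank line, with \r\n line endings.
        String text = "e|--|\r\nB|--|\r\n\r\ne|--|\r\nB|--|";
        System.out.println("groups found: " + findGroups(text).size());
        System.out.println("simplified pattern matches: " + countSimplifiedMatches(text));
    }
}
```

With \r\n input the simplified pattern consumes the \r as the end of a line and then cannot start the next line at the leftover \n, so every line becomes its own match. Treating \r?\n as a single line break avoids that.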

String.contains always appears false

public final void nameErrorLoop () {
while (error) {
System.out.println("Enter the employee's name.");
setName(kb.next());
if (name.contains("[A-Za-z]")) {
error = false;
}
else {
System.out.println("The name can only contain letters.");
}
}
error = true;
}
Despite a similar setup working in a different method, this method becomes stuck in a constant loop because the if statement is always false. I've tried .matches(), and nothing that I found on the Internet has helped so far. Any ideas why? Thanks for your help in advance.
Edit: I just noticed as I was finishing the project, that trying to print 'name' later only shows the first name, and the last is never printed. Is there any way I can get the 'name' string to include both?
String.contains doesn't use regular expressions - it just checks whether one string contains another, in the way that "foobar" contains "oob".
It sounds like you want to check that name only contains letters, in which case you should be checking something like:
if (name.matches("^[A-Za-z]+$"))
The + (instead of *) will check that it's non-empty; the ^ and $ will check that there's nothing other than the letters.
If you expect it to be a full name, however, you may well want to allow spaces, hyphens and apostrophes:
if (name.matches("^[-' A-Za-z]+$"))
Also consider accented characters - and punctuation from other languages.
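To make the difference concrete, here is a minimal sketch comparing contains() with the two matches() checks suggested above. The class and method names are invented for the example. Since matches() always tests the whole string, the ^ and $ anchors are optional and are left out here.

```java
public class NameCheckDemo {
    // True only if the whole string is one or more ASCII letters.
    static boolean isLettersOnly(String name) {
        return name.matches("[A-Za-z]+");
    }

    // Also allows spaces, hyphens and apostrophes, as suggested for full names.
    static boolean isFullName(String name) {
        return name.matches("[-' A-Za-z]+");
    }

    public static void main(String[] args) {
        // contains() looks for the literal text "[A-Za-z]", which a name never includes.
        System.out.println("Alice".contains("[A-Za-z]"));
        System.out.println(isLettersOnly("Alice"));
        System.out.println(isLettersOnly("Alice Smith"));
        System.out.println(isFullName("Anne-Marie O'Neil"));
    }
}
```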
Easy. .contains() is not what you think. It does literal substring matching, not regex matching.
"anything".contains("something that's not a regular expression");
Either use this
Pattern p2=Pattern.compile("[A-Za-z]+");//call only once
p2.matcher(txt).find();//call in loop
or this:
for(char ch: "something".toCharArray()){
if(Character.isAlphabetic(ch)){
}
}
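One caveat with the two snippets above, shown as a hedged sketch with made-up names: find() only reports that the input contains some run of letters, while a "letters only" rule needs matches() or a loop that rejects the first non-letter. Character.isAlphabetic also accepts non-ASCII letters, which may or may not be what you want.

```java
import java.util.regex.Pattern;

public class LetterCheckDemo {
    // Compiled once and reused, as the answer recommends.
    private static final Pattern LETTERS = Pattern.compile("[A-Za-z]+");

    // True if the input contains at least one ASCII letter somewhere.
    static boolean hasLetters(String s) {
        return LETTERS.matcher(s).find();
    }

    // True only if the entire input consists of ASCII letters.
    static boolean onlyLetters(String s) {
        return LETTERS.matcher(s).matches();
    }

    // Loop variant: rejects empty input and any non-alphabetic character.
    static boolean allAlphabetic(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (char ch : s.toCharArray()) {
            if (!Character.isAlphabetic(ch)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasLetters("Bob42") + " " + onlyLetters("Bob42")
                + " " + allAlphabetic("Bob42"));
    }
}
```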

How to Check a String in Java with an Zero Character RegEx?

The following piece of code checks for a variable portion, /en(^$|.*), which is either empty or any characters. So the expression should match /en as well as /en/bla, /en/blue, etc.
But the expression doesn't work when checking for just /en.
"/en".matches("/en(^$|.*)")
Is there a way to make this empty regex check (^$) perform with java?
edit
I mean: Is there a way to make this piece of code return true?
What you're currently doing is checking whether en is followed by the start of string then the end of string (which doesn't make sense, since the start of string needs to be first) or anything else. This should work:
"/en".matches("/en(|.*)")
Or just using ? (optional):
"/en".matches("/en(.*)?")
But it's rather pointless, since * is zero or more (so a blank string will match for .*), just this should do it:
"/en".matches("/en.*")
EDIT:
Your code was already returning true, but it was not matching the ^$ part, but rather .* (similar to the above).
I should point out that you may as well use startsWith, unless your real data is more complex:
"/en".startsWith("/en")
Is there a way to make this piece of code return true?
"/en".matches("/en(^$|.*)")
That code does return true. Just try it!
However, your pattern is unnecessarily complex. Try:
"/en".matches("/en.*")
This will match /en followed by anything (including nothing).
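The claims in both answers are easy to verify with a few lines. This is just a sketch with a hypothetical class name; it runs every pattern variant mentioned above against the same inputs.

```java
public class EnPrefixDemo {
    // All pattern variants discussed in the answers.
    static final String[] PATTERNS = {
        "/en(^$|.*)", // original from the question
        "/en(|.*)",   // empty alternative instead of ^$
        "/en(.*)?",   // optional group
        "/en.*"       // simplest form
    };

    // True if every variant matches the whole input.
    static boolean allMatch(String input) {
        for (String p : PATTERNS) {
            if (!input.matches(p)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allMatch("/en"));
        System.out.println(allMatch("/en/bla"));
        System.out.println("/en/blue".startsWith("/en"));
    }
}
```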

Java if else on partial match

I've moved into Selenium WebDriver, and still finding the most confusing examples.
I need to be able to read a string (this part works) and then run a conditional that checks whether specific text is present.
For the sake of this example:
String oneoff = "Jeff is old";
I need to match on Jeff, see code below, as long as Jeff exists in the string, I want to return true. If Jeff doesn't exist, then I will check for oh say 50-75 other names. However the string may contain their name and additional text that cannot be controlled. so I have to do a partial match.
Question 1. am I screwed and will have to build each regex expression in that crazy format that I have been seeing, or am I missing something obvious?
Question 2. Will someone for my sanity please show me the proper way to match on Jeff, with the possibility of text being before and after the name Jeff.
Thank you!
String oneoff = driver.findElement(By.id("id_one_off_byline"))
.getAttribute("value");
System.out.println("One Off is:" + oneoff);
if (oneoff.matches("Jeff")) {
System.out.println("It is Jeff");
} else {
System.out.println("it is not jeff");
}
This is just the functional part of the code,
as Jeff exists in the string, I want to return true
Then you probably should test it with
if (oneoff.contains("Jeff"))
since matches() takes a regex that has to match the entire string, so if (oneoff.matches("Jeff")) would return true only if oneoff is exactly "Jeff".
You do not need to use matches() for the code you have supplied. Use oneoff.equals("String") for exact string comparison; matches() is meant for regular expressions. You could also use oneoff.contains("String") if you want to return true even when the string only appears as a substring of the target string.
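Side by side, the options mentioned so far behave like this. The sketch uses the sample string from the question and a hypothetical class name.

```java
public class PartialMatchDemo {
    // Plain substring test, which is all a partial match needs.
    static boolean mentions(String text, String name) {
        return text.contains(name);
    }

    public static void main(String[] args) {
        String oneoff = "Jeff is old";
        System.out.println(oneoff.equals("Jeff"));        // whole string must be equal
        System.out.println(oneoff.matches("Jeff"));       // regex must match whole string
        System.out.println(oneoff.matches(".*Jeff.*"));   // regex with text allowed around it
        System.out.println(mentions(oneoff, "Jeff"));     // substring check
    }
}
```

So there is no need to build a regex per name: contains() already does the partial match.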
if (oneoff.contains("Jeff")) {
System.out.println("It is Jeff");
} else if (!oneoff.contains("Jeff")) {
System.out.println("it is not jeff");
}
I think you should change your code to look like this. Note that contains() is case-sensitive, so it will not recognize variants such as "JEff" or "JEFF"; if those can occur, normalize the case before comparing.
I hope it works. I ran into the same problem once and this is how I got around it.
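For the 50 to 75 names mentioned in the question, one option is to keep the names in a list and loop over it. The sketch below is an assumption about how that could look, not code from the thread: the class, the method and the sample names are all invented, and the comparison is made case-insensitive by lower-casing both sides.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class NameLookupDemo {
    // Returns the first name from the list that occurs anywhere in the text,
    // ignoring case, or an empty Optional if none does.
    static Optional<String> firstNameIn(String text, List<String> names) {
        String haystack = text.toLowerCase(Locale.ROOT);
        for (String name : names) {
            if (haystack.contains(name.toLowerCase(Locale.ROOT))) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Anna", "Jeff", "Maria"); // hypothetical list
        System.out.println(firstNameIn("Story by JEFF and others", names));
        System.out.println(firstNameIn("No byline here", names));
    }
}
```

Keep in mind that this is a substring check, so "Jeff" is also found inside "Jeffrey". If that matters, a word-boundary regex would be the next step.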

Trouble implementing partial matches with regular expression on Android

I am creating a regular expression to evaluate if an IP address is a valid multicast address. This validation is occurring in real time while you type (if you type an invalid / out of range character it is not accepted) so I cannot simply evaluate the end result against the regex. The problem I am having with it is that it allows for a double period after each group of numbers (224.. , 224.0.., 224.0.0.. all show as valid).
The code below is a static representation of what's happening. Somehow 224.. is showing as a legal value. I've tested this regex online (non-Java-escaped: ^2(2[4-9]|3\d)(.(25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d)){3}$ ) and it works perfectly and does not accept the invalid input I'm describing.
Pattern p = Pattern.compile("^2(2[4-9]|3\\d)(\\.(25[0-5]|2[0-4]\\d|[0-1]?\\d?\\d)){3}$");
Matcher m = p.matcher("224..");
if (!m.matches() && !m.hitEnd()) {
System.out.println("Invalid");
} else {
System.out.println("Valid");
}
It seems that the method m.hitEnd() is evaluating to true whenever I input 224.. which does not make sense to me.
If someone could please look this over and make sure I'm not making any obvious mistake and maybe explain why hitEnd() is returning true in this case I'd appreciate it.
Thanks everyone.
After doing some evaluating myself (after discovering this was on Android), I realized that the same code responds differently on Dalvik than it does on a regular JVM.
The code is:
Pattern p = Pattern.compile("^2(2[4-9]|3\\d)(\\.(25[0-5]|2[0-4]\\d|[0-1]?\\d?\\d)){3}$");
Matcher m = p.matcher("224..");
if (!m.matches() && !m.hitEnd()) {
System.out.println("Invalid");
} else {
System.out.println("Valid");
}
This code (albeit modified a bit), prints Valid on Android and Invalid on the JVM.
I do not know how you tested your regex, but it does not look correct according to your description.
Your regex requires all 4 sections of digits. There is no chance it will match 224..
Only [0-1] and \d are marked with a question mark and are therefore optional.
So, leaving aside the details of which specific digits are permitted, I'd suggest something like this:
^\\d{1,3}\\.(\\d{0,3}\\.)?(\\d{0,3}\\.)?(\\d{0,3}\\.)?$
And you do not have to use hitEnd(): the $ at the end is enough. And do not use matches(). Use find() instead. matches() is like find() but behaves as if ^ and $ were added automatically.
I just tested out your code and m.hitEnd() evaluates to false for me, and I am receiving invalid...
So I'm not really sure what the problem here is?
I reported bug 20625 in Dalvik. In the interim, you don't need to use hitEnd(), having the $ suffix should be sufficient.
public void testHitEnd() {
String text = "b";
String pattern = "^aa$";
Matcher matcher = Pattern.compile(pattern).matcher(text);
assertFalse(matcher.matches());
assertFalse(matcher.hitEnd());
}
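For reference, this is roughly how the matches()/hitEnd() combination from the question behaves on a standard JVM. It is a sketch with made-up class and method names, and the results apply to the desktop JVM only; as reported above, Dalvik gave a different answer for 224.. at the time.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MulticastPrefixDemo {
    // Pattern from the question, unchanged.
    static final Pattern MULTICAST = Pattern.compile(
            "^2(2[4-9]|3\\d)(\\.(25[0-5]|2[0-4]\\d|[0-1]?\\d?\\d)){3}$");

    // Accepts input that is either a complete multicast address or a prefix
    // that could still become one if more characters were typed.
    static boolean isAcceptableSoFar(String input) {
        Matcher m = MULTICAST.matcher(input);
        // hitEnd() is only meaningful after a match attempt, so matches() runs first.
        return m.matches() || m.hitEnd();
    }

    public static void main(String[] args) {
        String[] inputs = {"224.0.0.1", "224.", "224..", "9"};
        for (String s : inputs) {
            System.out.println(s + " -> " + (isAcceptableSoFar(s) ? "Valid" : "Invalid"));
        }
    }
}
```

The idea is that hitEnd() is true when the matcher ran out of input while it still needed more, which is what makes it usable for validating text as it is typed.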
