Java validation annotation @Pattern with regexp of \\\\ doesn't work properly - java

I need to validate one of the request fields, and I use the @Pattern annotation for this.
@Size(min = 3, max = 1000)
@Pattern(regexp = "[0-9A-Za-z\\\\/]+")
private String name;
That is, the field must accept Latin letters, digits, slashes and backslashes. With this pattern, when I try to enter something like name/\ (which looks like "name/\\" in Postman), I get an error:
Unexpected internal error near index 6\r\nname/\\
UPDATE:
I tried the pattern [////]+ and it works! But when I added the alphabet again, the problem came back. I noticed that the stack trace looks different from the one produced by validation. The error occurs on the following line:
Pattern pattern = Pattern.compile(name, Pattern.CASE_INSENSITIVE);
I use this to check whether the same name already exists in MongoDB.
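One likely reading of that second stack trace: the raw user input is handed to Pattern.compile, so the trailing backslash in name/\ is parsed as an incomplete escape. A minimal sketch, assuming the goal is a case-insensitive literal match of the name, quotes the input with Pattern.quote first. The class and method names here are hypothetical and not taken from the question.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class NameLookup {
    // Builds a case-insensitive pattern that treats the input as literal text,
    // so regex metacharacters such as '\' lose their special meaning.
    public static Pattern literalPattern(String name) {
        return Pattern.compile(Pattern.quote(name), Pattern.CASE_INSENSITIVE);
    }

    // Returns true if the raw input cannot be compiled as a regex on its own.
    public static boolean failsAsRawRegex(String name) {
        try {
            Pattern.compile(name, Pattern.CASE_INSENSITIVE);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String name = "name/\\"; // the six characters n a m e / \
        System.out.println("raw input fails to compile: " + failsAsRawRegex(name));
        System.out.println("quoted input matches: "
                + literalPattern(name).matcher("NAME/\\").matches());
    }
}
```

If the pattern is then passed to a MongoDB query, the same quoted string can be used; whether the driver accepts \Q...\E quoting should be checked against the driver in use.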

Related

Are quantifiers required in all regex expressions?

I am building a Spring Boot application where input parameters are validated using javax.validation.*. I want to check that my input parameters contain only letters, digits and hyphens.
public class Foo {
@NotNull @NotEmpty
@Pattern(regexp = "^[a-zA-Z0-9-]")
private String subscriberType;
@NotNull @NotEmpty
@Size(min = 32, max = 43)
@Pattern(regexp = "^[a-zA-Z0-9-]")
private String caseId;
......
I am using the regex below.
@Pattern(regexp = "^[a-zA-Z]")
If I use the above and send the following input parameters,
{
"subscriberType":"prepaid",
"caseId":"5408899645efabf60844de9077372571"
}
validation fails:
Resolving exception from handler [public org.springframework.http.ResponseEntity<java.lang.Object> my.org.presentation.NumberRecycleController.numberRecycle(java.util.Optional<java.lang.String>,my.org.domain.request.NumberRecycleReqDto)
throws java.lang.Exception]: org.springframework.web.bind.MethodArgumentNotValidException: Validation failed for argument at index 1 in method:
public org.springframework.http.ResponseEntity<java.lang.Object> my.org.presentation.NumberRecycleController.numberRecycle(java.util.Optional<java.lang.String>,my.org.domain.request.NumberRecycleReqDto)
throws java.lang.Exception, with 2 error(s): [Field error in object 'numberRecycleReqDto' on field 'subscriberType': rejected value [prepaid]; codes [Pattern.numberRecycleReqDto.subscriberType,
Pattern.subscriberType,Pattern.java.lang.String,Pattern]; arguments [org.springframework.context.support.DefaultMessageSourceResolvable:
codes [numberRecycleReqDto.subscriberType,subscriberType]; arguments []; default message [subscriberType],[Ljavax.validation.constraints.Pattern$Flag;@5dcb2ea8,org.springframework.validation.beanvalidation.SpringValidatorAdapter$ResolvableAttribute@3de9c7b1];
default message [must match "^[a-zA-Z0-9]"]] [Field error in object
'numberRecycleReqDto' on field 'caseId': rejected value [35408899645efabf60844de907737257]; codes [Pattern.numberRecycleReqDto.caseId,Pattern.caseId,Pattern.java.lang.String,Pattern];
arguments [org.springframework.context.support.DefaultMessageSourceResolvable: codes [numberRecycleReqDto.caseId,caseId]; arguments [];
default message [caseId],[Ljavax.validation.constraints.Pattern$Flag;@5dcb2ea8,org.springframework.validation.beanvalidation.SpringValidatorAdapter$ResolvableAttribute@61637994]; default message [must match "^[a-zA-Z0-9-]"]]
I have gone through some similar questions and found a solution. Validation succeeds if I write my regex as below.
@Pattern(regexp = "^[a-zA-Z0-9-]+")
or
@Pattern(regexp = "^[a-zA-Z0-9-]{1,}")
Can you please explain what actually happens here? I know that a quantifier specifies how many times the preceding element must match; in my case, one or more characters from the given class.
Is a quantifier always required? Why does my initial regex pattern fail?
The pattern must match the whole string. A character class matches only one character. If the string may contain more than one character, you need the quantifier.
Btw. the ^ at the beginning of the regular expression is redundant. The pattern always must match the whole string.
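The whole-string behaviour described above can be reproduced with plain java.util.regex. This is only a sketch that assumes the validator behaves like Matcher.matches(), as the answer states; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

public class WholeStringMatch {
    // Mimics whole-string validation: the regex must consume the entire value.
    public static boolean isValid(String regexp, String value) {
        return Pattern.compile(regexp).matcher(value).matches();
    }

    public static void main(String[] args) {
        // A bare character class consumes exactly one character.
        System.out.println(isValid("^[a-zA-Z0-9-]", "prepaid"));
        // With a quantifier the class may repeat and cover the whole value.
        System.out.println(isValid("^[a-zA-Z0-9-]+", "prepaid"));
        // The leading ^ changes nothing when the whole string must match anyway.
        System.out.println(isValid("[a-zA-Z0-9-]+", "5408899645efabf60844de9077372571"));
    }
}
```

Without the quantifier, only a one-character value such as "p" would pass.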

Regex not required [duplicate]

I am trying to write a simple regex in Java with a pattern like this:
@Pattern(regexp = "[a-zA-Z]{2}[0-9]{1}[2-8]{1}", message = "The format is invalid")
but the message is still displayed when the field is empty.
I want to show this message only when the field is not empty (that is, the field should be optional).
Thank you.
Try using the following regex, which matches both your expected string and empty string:
[a-zA-Z]{2}[0-9]{1}[2-8]{1}|^$
Java code:
@Pattern(regexp = "[a-zA-Z]{2}[0-9]{1}[2-8]{1}|^$", message = "The format is invalid")
You could make your whole pattern optional using a non-capturing group (?:...)? to match either an empty string or the whole pattern.
Note that you can omit the {1} part.
^(?:[a-zA-Z]{2}[0-9][2-8])?$
Regex demo
@Pattern(regexp = "^(?:[a-zA-Z]{2}[0-9][2-8])?$", message = "The format is invalid")
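Both suggested patterns can be checked outside the validator with plain java.util.regex. The sketch below assumes whole-string matching, and the class name and sample inputs are invented for illustration.

```java
import java.util.regex.Pattern;

public class OptionalFormat {
    // Variant 1: the original pattern, or an empty string as an alternative.
    static final Pattern WITH_ALTERNATION =
            Pattern.compile("[a-zA-Z]{2}[0-9]{1}[2-8]{1}|^$");

    // Variant 2: the whole pattern wrapped in an optional non-capturing group.
    static final Pattern WITH_OPTIONAL_GROUP =
            Pattern.compile("^(?:[a-zA-Z]{2}[0-9][2-8])?$");

    public static boolean acceptedByBoth(String value) {
        return WITH_ALTERNATION.matcher(value).matches()
                && WITH_OPTIONAL_GROUP.matcher(value).matches();
    }

    public static boolean rejectedByBoth(String value) {
        return !WITH_ALTERNATION.matcher(value).matches()
                && !WITH_OPTIONAL_GROUP.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(acceptedByBoth(""));     // empty input is allowed
        System.out.println(acceptedByBoth("ab12")); // two letters, a digit, a digit in 2-8
        System.out.println(rejectedByBoth("ab19")); // last digit outside 2-8
    }
}
```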

Regex pattern error on API 21(android 5) and below

On Android 5 and below, my regex pattern throws an error at runtime:
java.util.regex.PatternSyntaxException: Syntax error in regexp pattern near index 4:
(?<g1>(http|ftp)(s)?://)?(?<g2>[\w-:#])+(?<TLD>\.[\w\-]+)+(:\d+)?((|\?)([\w\-._~:/?#\[\]#!$&'()*+,;=.%])*)*
Here is a code sample:
val urlRegex = "(?<g1>(http|ftp)(s)?://)?(?<g2>[\\w-:#])+(?<TLD>\\.[\\w\\-]+)+(:\\d+)?((|\\?)([\\w\\-._~:/?#\\[\\]#!$&'()*+,;=.%])*)*"
val sampleUrl = "https://www.google.com"
val urlMatchers = Pattern.compile(urlRegex).matcher(sampleUrl)
assert(urlMatchers.find())
This pattern works fine on all API levels above 21.
It seems the earlier versions do not support named groups. As per this source, the named groups were introduced in Kotlin 1.2. Remove them if you do not need those submatches and only use the regex for validation.
Your regex is very inefficient as it contains a lot of nested quantified groups. See a "cleaner" version of it below.
Also, it seems you want to check if there is a regex match inside your input string. Use Regex#containsMatchIn():
val urlRegex = "(?:(?:http|ftp)s?://)?[\\w:#.-]+\\.[\\w-]+(?::\\d+)?\\??[\\w.~:/?#\\[\\]#!$&'()*+,;=.%-]*"
val sampleUrl = "https://www.google.com"
val urlMatchers = Regex(urlRegex).containsMatchIn(sampleUrl)
println(urlMatchers) // => true
See the Kotlin demo and the regex demo.
If you need to check the whole string match use matches:
Regex(urlRegex).matches(sampleUrl)
See another Kotlin demo.
Note that to define a regex, you need to use the Regex class constructor.
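For readers working from Java rather than Kotlin, the same cleaned-up pattern (without named groups) can be tried with java.util.regex, where find() looks for a match anywhere in the input and matches() requires the whole string to match. This is a sketch only; the class and method names are hypothetical, and the pattern is copied from the answer above without further tuning.

```java
import java.util.regex.Pattern;

public class UrlCheck {
    // The answer's simplified pattern; it contains no named groups.
    static final Pattern URL = Pattern.compile(
            "(?:(?:http|ftp)s?://)?[\\w:#.-]+\\.[\\w-]+(?::\\d+)?\\??[\\w.~:/?#\\[\\]#!$&'()*+,;=.%-]*");

    // True if a URL-like substring occurs anywhere in the input.
    public static boolean containsUrl(String input) {
        return URL.matcher(input).find();
    }

    // True only if the entire input is URL-like.
    public static boolean isUrl(String input) {
        return URL.matcher(input).matches();
    }

    public static void main(String[] args) {
        String sampleUrl = "https://www.google.com";
        System.out.println(containsUrl(sampleUrl));
        System.out.println(isUrl(sampleUrl));
        System.out.println(isUrl("see " + sampleUrl + " now"));
    }
}
```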

Spring Boot - custom validation message with annotation parameters

I want to create custom validation messages with Spring Boot/MVC validation.
I found that the following setup works:
public class RegisterCredentials {
@NotEmpty
@NotNull
@Size(min=3, max=15)
private String username;
...
}
messages.properties:
NotEmpty.registerCredentials.username = Username field cannot be empty.
NotNull.registerCredentials.username = Username field cannot be empty.
Size.registerCredentials.username = Username length must be between 3 and 15 characters.
After that I wanted to replace the fixed values min=3 and max=15 with the annotation attributes. I tried the following, but it didn't work (although it works inside the @Size(message="..") annotation):
Size.registerCredentials.username = Username length must be between {min} and {max} characters.
Then I found that the following code works:
Size.registerCredentials.username = Username length must be between {1} and {2} characters.
Well, almost, because it produces the following message:
Username length must be between 15 and 3 characters.
Swapping {1} and {2} solves the problem, but it makes the properties file confusing to read later.
Is there a clean way to solve this?
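A possible explanation for the order, as far as I know: Spring's SpringValidatorAdapter passes the field name as {0}, followed by the annotation attribute values sorted alphabetically by attribute name, so max lands in {1} and min in {2}. The sketch below only imitates that ordering with java.text.MessageFormat; the argument list is an assumption made for illustration and is not output captured from Spring.

```java
import java.text.MessageFormat;
import java.util.Locale;

public class SizeMessageDemo {
    // Renders a properties-style template with positional arguments.
    public static String render(String template, Object... args) {
        return new MessageFormat(template, Locale.ROOT).format(args);
    }

    public static void main(String[] args) {
        // Assumed argument order: {0} = field name, {1} = max, {2} = min.
        Object[] arguments = {"username", 15, 3};

        System.out.println(render(
                "Username length must be between {1} and {2} characters.", arguments));
        System.out.println(render(
                "Username length must be between {2} and {1} characters.", arguments));
    }
}
```

Under that assumption, {2} and {1} are simply min and max, and a short comment in messages.properties noting the alphabetical order may be enough to keep the file readable.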

Lucene wildcard matching fails on chemical notations(?)

Using Hibernate Search annotations (mostly just @Field(index = Index.TOKENIZED)) I've indexed a number of fields related to a persisted class of mine called Compound. I've set up text search over all the indexed fields, using the MultiFieldQueryParser, which has so far worked fine.
Among the fields indexed and searchable is a field called compoundName, with sample values:
3-Hydroxyflavone
6,4'-Dihydroxyflavone
When I search for either of these values in full the related Compound instances are returned. However problems occur when I use the partial name and introduce wildcards:
searching for 3-Hydroxyflav* still gives the correct hit, but
searching for 6,4'-Dihydroxyflav* fails to find anything.
Now as I'm quite new to Lucene / Hibernate-search, I'm not quite sure where to look at this point.. I think it might have something to do with the ' present in the second query, but I don't know how to proceed.. Should I look into Tokenizers / Analyzers / QueryParsers or something else entirely?
Or can anyone tell me how I can get the second wildcard search to match, preferably without breaking the MultiField-search behavior?
I'm using Hibernate-Search 3.1.0.GA & Lucene-core 2.9.3.
Some relevant code bits to illustrate my current approach:
Relevant parts of the indexed Compound class:
@Entity
@Indexed
@Data
@EqualsAndHashCode(callSuper = false, of = { "inchikey" })
public class Compound extends DomainObject {
@NaturalId
@NotEmpty
@Length(max = 30)
@Field(index = Index.TOKENIZED)
private String inchikey;
@ManyToOne
@IndexedEmbedded
private ChemicalClass chemicalClass;
@Field(index = Index.TOKENIZED)
private String commonName;
...
}
How I currently search over the indexed fields:
String[] searchfields = Compound.getSearchfields();
MultiFieldQueryParser parser =
new MultiFieldQueryParser(Version.LUCENE_29, searchfields, new StandardAnalyzer(Version.LUCENE_29));
FullTextSession fullTextSession = Search.getFullTextSession(getSession());
FullTextQuery fullTextQuery =
fullTextSession.createFullTextQuery(parser.parse("searchterms"), Compound.class);
List<Compound> hits = fullTextQuery.list();
Use WhitespaceAnalyzer instead of StandardAnalyzer. It splits only at whitespace, not at commas, hyphens, etc. (It does not lowercase tokens, though, so you will need to build your own chain of whitespace tokenizer + lowercase filter, assuming you want your search to be case-insensitive.) If you need different behavior for different fields, you can use a PerFieldAnalyzerWrapper.
You can't just set it to un-tokenized, because that will interpret your entire body of text as one token.
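To see why whitespace splitting plus lowercasing helps with prefix searches, here is a plain-Java sketch that does not use Lucene at all. It only imitates what a whitespace tokenizer followed by a lowercase filter would produce, and the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WhitespaceLowercaseSketch {
    // Splits on whitespace only and lowercases each token; punctuation is kept.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Imitates a prefix query: true if any token starts with the lowercased prefix.
    public static boolean prefixMatches(String text, String prefix) {
        String lowered = prefix.toLowerCase(Locale.ROOT);
        for (String token : tokenize(text)) {
            if (token.startsWith(lowered)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("6,4'-Dihydroxyflavone"));
        System.out.println(prefixMatches("6,4'-Dihydroxyflavone", "6,4'-Dihydroxyflav"));
    }
}
```

Because the name stays a single token, a prefix such as 6,4'-Dihydroxyflav has something to match against; once an analyzer splits the name at the comma or apostrophe, no single token starts with that prefix.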
I think your problem is a combination of analyzer and query language problems. It is hard to say what exactly causes the problem. To find this out I recommend you inspect your index using the Lucene index tool Luke.
Since you are not using a custom analyzer in your Hibernate Search configuration, the default, StandardAnalyzer, is used. This is consistent with the fact that you use StandardAnalyzer in the constructor of MultiFieldQueryParser (always use the same analyzer for indexing and searching!). What I am not so sure of is how "6,4'-Dihydroxyflavone" gets tokenized by StandardAnalyzer. That is the first thing you have to find out. For example, the javadoc says:
Splits words at hyphens, unless
there's a number in the token, in
which case the whole token is
interpreted as a product number and is
not split.
It might be that you need to write your own analyzer which tokenizes your chemical names the way you need it for your use cases.
Next, the query parser. Make sure you understand the Lucene query syntax. Some characters have special meaning, for example '-'. It could be that your query is parsed the wrong way.
Either way, the first step is to find out how your chemical names get tokenized. Hope that helps.
I wrote my own analyzer:
import java.util.Set;
import java.util.regex.Pattern;
import org.apache.lucene.index.memory.PatternAnalyzer;
import org.apache.lucene.util.Version;
public class ChemicalNameAnalyzer extends PatternAnalyzer {
private static Version version = Version.LUCENE_29;
private static Pattern pattern = compilePattern();
private static boolean toLowerCase = true;
private static Set stopWords = null;
public ChemicalNameAnalyzer(){
super(version, pattern, toLowerCase, stopWords);
}
public static Pattern compilePattern() {
StringBuilder sb = new StringBuilder();
sb.append("(-{0,1}\\(-{0,1})");//Matches an optional dash followed by an opening round bracket followed by an optional dash
sb.append("|");//"OR" (regex alternation)
sb.append("(-{0,1}\\)-{0,1})");
sb.append("|");//"OR" (regex alternation)
sb.append("((?<=([a-zA-Z]{2,}))-(?=([^a-zA-Z])))");//Matches a dash ("-") preceded by two or more letters and succeeded by a non-letter
return Pattern.compile(sb.toString());
}
}
