Ant path style patterns - java

What are the rules for Ant path style patterns?
The Ant site itself is surprisingly uninformative.

Ant-style path patterns matching in spring-framework:
The mapping matches URLs using the following rules:
? matches one character
* matches zero or more characters
** matches zero or more 'directories' in a path
{spring:[a-z]+} matches the regexp [a-z]+ as a path variable named "spring"
Some examples:
com/t?st.jsp - matches com/test.jsp but also com/tast.jsp or com/txst.jsp
com/*.jsp - matches all .jsp files in the com directory
com/**/test.jsp - matches all test.jsp files underneath the com path
org/springframework/**/*.jsp - matches all .jsp files underneath the org/springframework path
org/**/servlet/bla.jsp - matches org/springframework/servlet/bla.jsp but also org/springframework/testing/servlet/bla.jsp and org/servlet/bla.jsp
com/{filename:\\w+}.jsp will match com/test.jsp and assign the value test to the filename variable
http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/util/AntPathMatcher.html
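To make the wildcard rules above concrete, here is a minimal sketch of a segment-based matcher in plain Java. It is not Spring's AntPathMatcher implementation: the class and method names are made up for illustration, it assumes "/" as the separator, and it does not handle {name:regex} path variables.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; AntStyleSketch is a hypothetical name, not a Spring class.
class AntStyleSketch {

    // Splits pattern and path on "/" and matches them segment by segment.
    static boolean matches(String pattern, String path) {
        return matchSegments(pattern.split("/"), 0, path.split("/"), 0);
    }

    private static boolean matchSegments(String[] pat, int pi, String[] seg, int si) {
        if (pi == pat.length) {
            return si == seg.length; // pattern used up: path must be used up too
        }
        if (pat[pi].equals("**")) {
            // "**" may swallow zero or more whole path segments
            for (int k = si; k <= seg.length; k++) {
                if (matchSegments(pat, pi + 1, seg, k)) {
                    return true;
                }
            }
            return false;
        }
        if (si == seg.length) {
            return false; // pattern segments left, but no path segments
        }
        return matchSegment(pat[pi], seg[si])
                && matchSegments(pat, pi + 1, seg, si + 1);
    }

    // Within one segment: "?" is exactly one character, "*" is zero or more.
    private static boolean matchSegment(String pat, String seg) {
        StringBuilder regex = new StringBuilder();
        for (char c : pat.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), seg);
    }

    public static void main(String[] args) {
        System.out.println(matches("com/t?st.jsp", "com/tast.jsp"));              // true
        System.out.println(matches("com/*.jsp", "com/sub/test.jsp"));             // false
        System.out.println(matches("org/**/servlet/bla.jsp", "org/servlet/bla.jsp")); // true
    }
}
```

In a real Spring application you would call AntPathMatcher directly rather than reimplementing the rules; the sketch only shows why "*" stops at a directory boundary while "**" does not.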

I suppose you mean how to use path patterns.
If the question is whether to use slashes or backslashes: either will be translated to the path separator of the platform in use at execution time.

Most upvoted answer by #user11153 using tables for a more readable format.
The mapping matches URLs using the following rules:
+-----------------+---------------------------------------------------------+
| Wildcard        | Description                                             |
+-----------------+---------------------------------------------------------+
| ?               | Matches exactly one character.                          |
| *               | Matches zero or more characters.                        |
| **              | Matches zero or more 'directories' in a path.           |
| {spring:[a-z]+} | Matches regExp [a-z]+ as a path variable named "spring" |
+-----------------+---------------------------------------------------------+
Some examples:
+------------------------------+--------------------------------------------------------+
| Example                      | Matches:                                               |
+------------------------------+--------------------------------------------------------+
| com/t?st.jsp                 | com/test.jsp but also com/tast.jsp or com/txst.jsp     |
| com/*.jsp                    | All .jsp files in the com directory                    |
| com/**/test.jsp              | All test.jsp files underneath the com path             |
| org/springframework/**/*.jsp | All .jsp files underneath the org/springframework path |
| org/**/servlet/bla.jsp       | org/springframework/servlet/bla.jsp                    |
|                              | also org/springframework/testing/servlet/bla.jsp       |
|                              | also org/servlet/bla.jsp                               |
| com/{filename:\\w+}.jsp      | com/test.jsp & assign value test to filename variable  |
+------------------------------+--------------------------------------------------------+

ANT Style Pattern Matcher
Wildcards
The utility uses three different wildcards.
+----------+-----------------------------------+
| Wildcard | Description                       |
+----------+-----------------------------------+
| *        | Matches zero or more characters.  |
| ?        | Matches exactly one character.    |
| **       | Matches zero or more directories. |
+----------+-----------------------------------+

As #user11153 mentioned, Spring's AntPathMatcher implements and documents the basics of Ant-style path pattern matching.
In addition, Java 7's NIO APIs added some built-in support for basic pattern matching via FileSystem.getPathMatcher.
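As a rough illustration of the FileSystem.getPathMatcher route, the sketch below uses the JDK's "glob:" syntax. The class and helper names are hypothetical. Note that glob syntax is similar to Ant style but not identical: as far as I can tell, a glob such as com/**/test.jsp requires at least one intermediate directory, whereas the Ant-style rule lets ** match zero directories.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Illustrative sketch only; GlobMatchSketch and globMatches are hypothetical names.
class GlobMatchSketch {

    // Builds a glob matcher on the default file system and tests one path.
    static boolean globMatches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // "*" stays within one directory level
        System.out.println(globMatches("com/*.jsp", "com/test.jsp"));
        System.out.println(globMatches("com/*.jsp", "com/sub/test.jsp"));
        // "**" crosses directory boundaries
        System.out.println(globMatches("com/**/test.jsp", "com/a/b/test.jsp"));
    }
}
```

The matcher works on Path objects only; no file needs to exist for the match to be evaluated.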

Related

Get Predicates with Prefix from an RDF

When extracting the subject, property, and object from an RDF file, I want to replace the IRI of the predicate with the prefix keyword it corresponds to. For example, a general SPARQL query returns these results:
| <http://extbi.dk/resource/727> | <http://extbi.dk/p/population> | "21,749"
| <http://extbi.dk/resource/727> | <http://extbi.dk/p/region> | "Central"
| <http://extbi.dk/resource/727> | <http://extbi.dk/p/id> | "727"
What I want to do is: If the prefix keyword for http://extbi.dk/p/ is schema, then my desired result is:
| <http://extbi.dk/resource/727> | <schema:population> | "21,749"
| <http://extbi.dk/resource/727> | <schema:region> | "Central"
| <http://extbi.dk/resource/727> | <schema:id> | "727"
I am using Apache Jena.
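Prefixes are handled in Jena using PrefixMapping objects.
This example should return the QName, or null if no matching prefix exists:
Node n; // the predicate node, obtained from the query result
PrefixMapping prefixes = PrefixMapping.Factory.create();
prefixes.setNsPrefix("schema", "http://extbi.dk/p/"); // register the prefix first
String qname = prefixes.qnameFor(n.getURI());
shortForm(String uri) can also be used to shorten a URI; it returns the original URI unchanged if no prefix applies.
See the PrefixMapping Javadoc for details.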

Suggest framework for external rule storage

There is a situation:
I've got 2 .xlsx files:
1. With business data
for example:
-----------------------------------------
| Column_A | Column_B| Column_C | Result |
-----------------------------------------
| test | 562.03 | test2 | |
------------------------------------------
2. With business rules
for example:
-------------------------------------------------------------------------
| Column_A | Column_B | Column_C | Result |
-------------------------------------------------------------------------
| EQUALS:test | GREATER:100 | EQUALS:test2 & NOTEQUALS:test | A |
--------------------------------------------------------------------------
| EQUALS:test11 | GREATER:500 | EQUALS:test11 & NOTEQUALS:test | B |
--------------------------------------------------------------------------
Each cell contains a condition.
One row contains a list of these conditions and composes one rule.
All rules will be processed iteratively, although I think it would be better to construct some kind of 'decision tree' or 'classification flow-chart'.
So my task is to store the functionality of these conditions (methods like EQUALS, GREATER, NOTEQUALS) in an external file or some other resource, so that it can be changed without recompiling to Java bytecode. I want a dynamic solution, not one hard-coded in Java methods.
I found Drools (http://drools.jboss.org/) as a way to handle such cases. But maybe there are other frameworks that can deal with this kind of problem?
JavaScript, dynamic SQL, and database-based solutions are not suitable.

Sub string detection performance?

I need to match a substring, and I wonder which of these is faster when matching with a regex:
if ( str.matches(".*hello.*") ) {
...
}
Pattern p = Pattern.compile( ".*hello.*" );
Matcher m = p.matcher( str );
if ( m.find() ) {
...
}
And if I don't need a regex, should I use contains()?
if ( str.contains("hello") ) {
...
}
Thanks.
Although matches() and using a Matcher are identical (matches() uses a Matcher in its implementation), using a Matcher can be faster if you cache and reuse the compiled Pattern. I did some rough testing and it improved performance (in my case) by 400%. The improvement depends on the regex, but there will always be some improvement.
Although I haven't tested it, I would expect contains() to outperform any regex approach, because the algorithm is far simpler and you don't need regex for this situation.
Here are the results of 6 ways to test for a String containing a substring, with the target ("http") located at various places within a standard 60 character input:
|------------------------------------------------------------|
| Code tested with "http" in the input | µsec | µsec | µsec |
| at the following positions: | start| mid|absent|
|------------------------------------------------------------|
| input.startsWith("http") | 6 | 6 | 6 |
|------------------------------------------------------------|
| input.contains("http") | 2 | 22 | 49 |
|------------------------------------------------------------|
| Pattern p = Pattern.compile("^http.*")| | | |
| p.matcher(input).find() | 90 | 88 | 86 |
|------------------------------------------------------------|
| Pattern p = Pattern.compile("http.*") | | | |
| p.matcher(input).find() | 84 | 145 | 181 |
|------------------------------------------------------------|
| input.matches("^http.*") | 745 | 346 | 340 |
|------------------------------------------------------------|
| input.matches("http.*") | 1663 | 1229 | 1034 |
|------------------------------------------------------------|
The two-line options are where a static pattern was compiled then reused.
They are more or less equivalent if you use m.matches() in the second code snippet. The String.matches() documentation specifies this:
An invocation of this method of the form str.matches(regex) yields exactly the same result as the expression Pattern.matches(regex, str)
this in turn specifies:
An invocation of this convenience method of the form
Pattern.matches(regex, input);
behaves in exactly the same way as the expression
Pattern.compile(regex).matcher(input).matches()
If a pattern is to be used multiple times, compiling it once and
reusing it will be more efficient than invoking this method each time.
So calling String.matches(String) in itself will not bring performance benefits, but storing a pattern (e.g. as a constant) and reusing it does.
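A minimal sketch of the three approaches discussed above, with the pattern stored as a constant so it is compiled only once. The class and method names are illustrative, and no timing claims are attached to this code.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; SubstringCheckSketch is a hypothetical name.
class SubstringCheckSketch {

    // Compiled once and reused on every call.
    private static final Pattern HELLO = Pattern.compile("hello");

    // Plain substring search, no regex involved.
    static boolean viaContains(String s) {
        return s.contains("hello");
    }

    // Reuses the cached pattern; find() looks for the pattern anywhere in s.
    static boolean viaCachedPattern(String s) {
        return HELLO.matcher(s).find();
    }

    // Recompiles the regex on every call; matches() must cover the whole string,
    // hence the surrounding ".*".
    static boolean viaStringMatches(String s) {
        return s.matches(".*hello.*");
    }

    public static void main(String[] args) {
        String input = "say hello to the world";
        System.out.println(viaContains(input));      // true
        System.out.println(viaCachedPattern(input)); // true
        System.out.println(viaStringMatches(input)); // true
    }
}
```

One caveat: by default "." does not match line terminators, so the ".*hello.*" form can disagree with the other two on multi-line input.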
If you use find then matches could be more efficient if the terms don't match early, as find may keep looking. But find and matches don't perform the same function, so comparison of performance is moot.

antlr 4.2.2 output to console warning (157)

I downloaded the latest release of ANTLR, 4.2.2 (antlr-4.2.2-complete.jar).
When I use it to generate parsers for the grammar file Java.g4, it prints warnings like:
"Java.g4:525:16: rule 'expression' contains an 'assoc' terminal option in an unrecognized location"
The files were generated but didn't compile.
The previous version works fine.
What's wrong?
The <assoc> option should now be moved to the left of the leading "expression" in the alternative.
It must always be placed immediately to the right of the enclosing |:
See: https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Left-recursive+rules
...
| expression '&&' expression
| expression '||' expression
| expression '?' expression ':' expression
|<assoc=right> expression
( '='
| '+='
| '-='
| '*='
| '/='
| '&='
| '|='
| '^='
| '>>='
| '>>>='
| '<<='
| '%='
)
expression

Regex expression to capture hyphenated word between lines, and non hyphenated words

I am trying to write a regular expression, in Java, that matches words and hyphenated words. So far I have:
Pattern p1 = Pattern.compile("\\w+(?:-\\w+)",Pattern.CASE_INSENSITIVE);
Pattern p2 = Pattern.compile("[a-zA-Z0-9]+",Pattern.CASE_INSENSITIVE);
Pattern p3 = Pattern.compile("(?<=\\s)[\\w]+-$",Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
This is my test case:
Programs
Dsfasdf. Programs Programs Dsfasdf. Dsfasdf. as is wow woah! woah. woah? okay.
he said, "hi." aasdfa. wsdfalsdjf. go-to go-
to
asdfasdf.. , : ; " ' ( ) ? ! - / \ # # $ % & ^ ~ ` * [ ] { } + _ 123
Any help would be awesome.
My expected result would be to match all the words, i.e.
Programs Dsfasdf Programs Programs Dsfasdf Dsfasdf
as is wow woah woah woah okay he said hi aasdfa
wsdfalsdjf go-to go-to asdfasdf
The part I'm struggling with is matching words that are split across lines as one word,
i.e.
go-
to
\p{L}+(?:-\n?\p{L}+)*
\ /^\ /^\ /\ /^^^
\ / | | | | \ / |||
| | | | | | ||`- Previous can repeat 0 or more times (group of literal '-', optional new-line and one or more of any letter (upper/lower case))
| | | | | | |`-- End first non-capture group
| | | | | | `--- Match one or more of previous (any letter, upper/lower case)
| | | | | `------ Match any letter (upper/lower case)
| | | | `---------- Match a single new-line (optional because of `?`)
| | | `------------ Literal '-'
| | `-------------- Start first non-capture group
| `---------------- Match one or more of previous (any letter between A-Z (upper/lower case))
`------------------- Match any letter (upper/lower case)
Is this OK?
I would go with this regex:
\p{L}+(?:\-\p{L}+)*
This regex also matches words such as "fiancé" and "À-la-carte", and other words containing non-ASCII letters. \p{L} matches a single code point in the category "letter".
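As a sketch of how the first pattern above (the one with the optional new-line) could be used from Java, the code below collects every match and removes the embedded line break so that a word split as "go-" / "to" comes out as "go-to". The class and method names are hypothetical, and it assumes lines end with "\n" only (not "\r\n").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; HyphenatedWordsSketch is a hypothetical name.
class HyphenatedWordsSketch {

    // Letters, optionally followed by groups of '-', an optional new-line, and more letters.
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:-\\n?\\p{L}+)*");

    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            // Drop the line break inside a word that was split across lines.
            result.add(m.group().replace("\n", ""));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "he said, \"hi.\" go-to go-\nto 123";
        System.out.println(words(text)); // [he, said, hi, go-to, go-to]
    }
}
```

Digits and punctuation are skipped because \p{L} only matches letters, which is why "123" does not appear in the result.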
