How to write regex to match a key in yaml file? - java

I have a YAML file that looks like the one below. By default, Sonar provides sonar-yaml-plugin with some templates that accept a regex as input to verify whether a particular key is present in a .yml file.
I want a regex that matches the entire key logging:file
server:
  port: 8989
logging:
  file: ./sample1.txt
  path: ./log
I have tried using (logging)(?s:.*?)(file), but it does not validate when I use it in the Sonar plugin.

I'm not sure exactly what you want to match, but this regex might help, or at least help you design the expression you need:
^(logging:|\s+file:)(.+)
This expression is anchored at the start of the line with ^.
Your two keys are joined with an OR (|).
Then .+ matches everything that follows the key.
You can also add further boundaries to it; however, if you could add some real samples to your question, it would be easier to answer.
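To check that file really sits under logging, the idea above can be tightened a little. The following is a minimal Java sketch, not the Sonar plugin's API: the class name YamlKeyCheck, the method loggingFile, and the assumption that nested keys are indented with spaces or tabs are all illustrative. The question does not say whether sonar-yaml-plugin applies the pattern to the whole file or line by line, so the expression may need adapting there.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class YamlKeyCheck {
    // Hypothetical pattern: a top-level "logging:" line, then any number of
    // indented lines, then an indented "file:" entry whose value is captured.
    private static final Pattern LOGGING_FILE = Pattern.compile(
            "(?m)^logging:[ \\t]*\\R(?:[ \\t]+.*\\R)*?[ \\t]+file:[ \\t]*(.+)$");

    // Returns the value of logging.file, or null when the key is absent.
    static String loggingFile(String yaml) {
        Matcher m = LOGGING_FILE.matcher(yaml);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String yaml = "server:\n  port: 8989\nlogging:\n  file: ./sample1.txt\n  path: ./log\n";
        System.out.println(loggingFile(yaml));
    }
}
```

Because the pattern requires the indented file: line to follow a logging: line, a file: key under some other parent is not reported.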

Related

Regex to match conan dependency from conanfile.txt

I am trying to create a regex in Java to match and extract the name, version, channel and owner of each dependency, but I haven't been able to write one that covers all the possible scenarios.
The structure is something like name/version#owner/channel, where the version might have a semver structure and the owner and channel are optional.
Currently, I have:
^(?<name>[\d\w][\d\w\+\.-]+)\/(?<version>[\d\w][\d\w\.-]+)(#(?<owner>\w+))?(\/(?<channel>.+))?$
but it fails for boost_atomic/1.59.0+4#owner/release, since the +4 is not matched; I need the value before it, i.e. 1.59.0.
Some other scenarios that need to be valid and are valid for the regex above are:
Poco/1.9.0#pocoproject/stable
zlib/1.2.11#conan/stable
freetype/2.10.1/stable
openssl/1.0.2g/stable
openssl/1.0.2g
openssl/1.0.2g#owner
Also, there might be some dependencies with comments:
zlib/1.2.11#conan/stable # comment
In that case I would need to discard the comment and extract only the relevant information with the regex.
I am not sure if my current regex is good, but from what I've tested, only some scenarios are missing.
You can simplify your regex: instead of listing many characters in a character set and escaping them, use something like [^\/] to capture anything except /, since you want to capture everything preceding a slash.
I've made some modifications, and the updated regex that should work for you is the following:
^(?<name>[^\/]+)\/(?<version>[^\/#\s]+)(#(?<owner>\w+))?(\/(?<channel>\S+))?(?:\s*#\s*(?<comment>.+))?$
I've added another named group for comment as you mentioned that can also be present. Let me know if this works for you.
Edit: If channel contains text like release:132434, and the colon and everything after it should not be part of channel, you can use the updated regex below:
^(?<name>[^\/]+)\/(?<version>[^\/#\s]+)(?:#(?<owner>\w+))?(?:\/(?<channel>[^:\s]+)\S*)?(?:\s*#\s*(?<comment>.+))?\s*$
Updated Demo
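As a rough illustration of how the updated regex could be used from Java, here is a minimal sketch. The class name ConanReference and the method parse are hypothetical. It also adds one thing the regex above does not do: an extra optional group after version that swallows a + suffix, so that 1.59.0+4 yields 1.59.0 as the question asks. That addition is an assumption about the desired behaviour, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConanReference {
    // Same shape as the updated regex, written as a Java string literal.
    // Assumption: anything from '+' onward in the version is dropped by the
    // optional non-capturing group that follows the version group.
    private static final Pattern REF = Pattern.compile(
            "^(?<name>[^/]+)/(?<version>[^/#\\s+]+)(?:\\+[^/#\\s]*)?"
            + "(?:#(?<owner>\\w+))?(?:/(?<channel>[^:\\s]+)\\S*)?"
            + "(?:\\s*#\\s*(?<comment>.+))?\\s*$");

    // Returns {name, version, owner, channel}, or null if the line does not match.
    // owner and channel are null when they are absent from the line.
    static String[] parse(String line) {
        Matcher m = REF.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            m.group("name"), m.group("version"), m.group("owner"), m.group("channel")
        };
    }

    public static void main(String[] args) {
        String[] parts = parse("zlib/1.2.11#conan/stable # comment");
        System.out.println(String.join(" | ", parts));
    }
}
```

Named groups keep the extraction readable; a group that did not take part in the match comes back as null, which is how the optional owner and channel can be detected.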

Java Properties File to Use / Forward Slash in Key

I have a properties file, and I need to use the / forward slash in some of my keys.
e.g.
app.module/hdr.key1=value 1
app.module/hdr.key2=value 2
I have no choice but to do it this way. Is this achievable, and if so, how?
Thanks.
The use of forward slashes will not cause a problem. To understand why, I suggest you read a critique of the syntax used in Java properties that I wrote. In essence, what you need to know is the following:
Leaving aside edge cases (comment lines, blank lines and escape sequences), the syntax of a name=value pair permits almost any character (including forward-slashes) in the name.
The = can actually be any of the following: (1) = (optionally preceded and/or followed by whitespace); (2) : (optionally preceded and/or followed by whitespace); or (3) just whitespace. So yes, name=value is equivalent to name:value and also to name value.
All escape sequences begin with the backslash character. For details of the escape sequences, I suggest you do a Google search for java.util.Properties to find online documentation for that class, and look at the long description of the load(InputStream) method.
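The points above can be checked quickly with java.util.Properties itself. This is a small sketch; the class name SlashKeys, the helper load, and the third key (hdr.key3) are made up for illustration, and the text is loaded from a string rather than a real file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class SlashKeys {
    // Parses properties text held in memory; a file would be loaded the same
    // way through Properties.load with a Reader or InputStream.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The three lines use the three separator forms: '=', ':' and whitespace.
        Properties props = load(
                "app.module/hdr.key1=value 1\n"
                + "app.module/hdr.key2:value 2\n"
                + "app.module/hdr.key3 value 3\n");
        System.out.println(props.getProperty("app.module/hdr.key1"));
        System.out.println(props.getProperty("app.module/hdr.key2"));
        System.out.println(props.getProperty("app.module/hdr.key3"));
    }
}
```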

Regex to check if file does not have an extension

I want to process files based on file extension. I need to process two kinds of file: one with the .nc extension and one that does not have any extension.
The file name could be anything; it doesn't matter.
For the .nc extension I have the regex .*.nc, but I need a combined regex. I googled but was unable to find anything. Could anyone help me with a regex that matches these two conditions?
You can use this pattern (?(?=.*\.)^.*\.nc$|^.*$)
This is a conditional with a positive lookahead, which checks whether the string contains a dot (with the pattern (?=.*\.)). If it does, the string must match with the .nc extension (^.*\.nc$); if not, the whole string is matched (^.*$).
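You can use the regex (\w+\.nc\b|\b\w+\b[^.]). It would capture anything like abc.nc and abc but not abc.rc, so it only captures names with the required extension or with no extension. Note that the second alternative needs a non-dot character, such as a line break, after the name.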
I think this would also do just fine for your case.
^([^\s]+\.nc|[^\s.]+)$
^ and $ assert position at the start and end of the line respectively; in between, the pattern matches either a run of non-whitespace characters ending in .nc or a run of non-whitespace characters containing no dot (that is, no extension).
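If the check is done in Java, note that java.util.regex does not support the conditional construct (?(...)...|...), so the plain alternation is the form that can be used there. A minimal sketch follows; the class name ExtensionFilter and the method accepts are hypothetical, and the dot before nc is escaped so that it only matches a literal dot.

```java
import java.util.regex.Pattern;

class ExtensionFilter {
    // Either a name ending in ".nc", or a name with no dot at all.
    private static final Pattern NC_OR_NO_EXT =
            Pattern.compile("^(\\S+\\.nc|[^\\s.]+)$");

    static boolean accepts(String fileName) {
        return NC_OR_NO_EXT.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"data.nc", "README", "abc.rc"}) {
            System.out.println(name + " -> " + accepts(name));
        }
    }
}
```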

Java regex skip match

I need to capture all # characters in the text except those that are surrounded by #[ ... #].
I wrote the PCRE version (online example) which works great but Java doesn't support (*SKIP)(*FAIL).
#\[.*#\](*SKIP)(*FAIL)|#
Is there a Java equivalent of this regex? Thanks.
This uses a little trick to match the #s you don't want first and then match the rest in a capture group:
#\[.*?#\]|(#+)
https://regex101.com/r/sU1kR2/1
You will need to extract the first capture group to get the desired #s.
If you want to capture each individual # not part of or in the custom brackets, you can drop the + from the capture group as follows:
#\[.*?#\]|(#)
Also, if you can have text like ##[text]#, then you might need a lookaround as follows:
#\[.*?#\]|(#(?!\[))
If you can use \K (which java.util.regex does not support), it is even simpler with the following, because then you don't have to worry about capture groups:
#\[.*?#\]\K|#
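In Java, the capture-group variant can be applied by looping over the matches and keeping only those where group 1 took part. A minimal sketch, with a hypothetical class name HashFinder and method freeHashes, using the single-# form of the pattern:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashFinder {
    // First alternative consumes a bracketed #[ ... #] block without capturing;
    // the second captures a lone '#' in group 1.
    private static final Pattern HASH = Pattern.compile("#\\[.*?#\\]|(#)");

    // Returns the indices of every '#' that is not part of a #[ ... #] block.
    static List<Integer> freeHashes(String text) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = HASH.matcher(text);
        while (m.find()) {
            if (m.group(1) != null) {
                positions.add(m.start(1));
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(freeHashes("a # b #[ x # y #] c #"));
    }
}
```

Matches of the first alternative are simply skipped, which plays the role that (*SKIP)(*FAIL) has in the PCRE version.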

Replace the property value using RegEx

I have a config.properties file containing properties in Java Properties format.
I need to replace the value of a property with a known name with a new value. The comments and formatting of the file should be preserved.
My current approach is to use RegEx to match the property name and then replace its value. However, Java Properties supports multi-line values, which I have a hard time matching.
Here is an example. Suppose config.properties contains the following text:
# A property
A = 1\
2
# B property
B = 2
I would like to replace the value of property A with "3". The end result should be:
# A property
A = 3
# B property
B = 2
My current RegEx (?s)(A[\\s]*=[\\s]*)(.*) does not work correctly.
Please suggest a RegEx or a different way of doing this.
Thanks!
Try this:
String regex = "(?m)^(A\\s*+=\\s*+)"
+ "(?:[^\r\n\\\\]++|\\\\(?:\r?\n|\r|.))*+$";
I left the first part as you wrote it so I could concentrate on matching the value; the rules governing the key and separator are actually much more complicated than that.
The value consists of zero or more of any character except carriage return, linefeed or backslash, or a backslash followed by a line separator or any single non-line-separator character. The line separator can be any of the three most common forms: DOS/Windows (\r\n), Unix/Linux/OSX (\n) or pre-OSX Mac (\r).
Note that the regex is in multiline mode so the line anchors will work, but not singleline (DOTALL) mode. I also used possessive quantifiers throughout because I know backtracking will never be useful.
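To show how this regex could drive the actual replacement, here is a minimal sketch. The class name PropertyValueReplacer and the method replaceValue are hypothetical, and building the pattern from an arbitrary key with Pattern.quote is an addition for illustration; as noted above, the real rules for keys and separators are more involved than this.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PropertyValueReplacer {
    // Replaces the (possibly multi-line) value of the first "key = value" entry.
    // Only the "=" separator is handled, as in the regex above.
    static String replaceValue(String text, String key, String newValue) {
        Pattern p = Pattern.compile(
                "(?m)^(" + Pattern.quote(key) + "\\s*+=\\s*+)"
                + "(?:[^\\r\\n\\\\]++|\\\\(?:\\r?\\n|\\r|.))*+$");
        Matcher m = p.matcher(text);
        if (!m.find()) {
            return text; // key not present: leave the text untouched
        }
        // Keep everything before the match, the "key = " prefix (group 1),
        // then the new value, then everything after the old value.
        return text.substring(0, m.start()) + m.group(1) + newValue
                + text.substring(m.end());
    }

    public static void main(String[] args) {
        String text = "# A property\nA = 1\\\n2\n# B property\nB = 2\n";
        System.out.print(replaceValue(text, "A", "3"));
    }
}
```

Splicing the string by match offsets avoids having to escape $ or \ in the new value, which a replacement string passed to replaceFirst would require.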
You have tools in Java to load, read, modify and save properties files.
Personally I like Jakarta Commons Configuration.
I agree with streetpc on using Jakarta Commons Configuration.
However, to focus on your regex: the problem is that most regex engines work on a line-by-line basis by default (for example, . does not match line terminators unless single-line mode is switched on).
For example, in the (quite old) Perl5Util class (see http://jakarta.apache.org/oro/api/org/apache/oro/text/perl/Perl5Util.html) you can read that patterns have the following syntax:
[m]/pattern/[i][m][s][x]
The m prefix is optional, and the meanings of the optional trailing options are:
i case insensitive match
m treat the input as consisting of multiple lines
s treat the input as consisting of a single line
x enable extended expression syntax incorporating whitespace and comments
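In java.util.regex the same four switches exist as Pattern.CASE_INSENSITIVE, Pattern.MULTILINE, Pattern.DOTALL and Pattern.COMMENTS (or inline as (?i), (?m), (?s) and (?x)). The sketch below, with a hypothetical class FlagDemo and made-up helper names, only illustrates how the m and s behaviours differ:

```java
import java.util.regex.Pattern;

class FlagDemo {
    // With MULTILINE, ^ also matches right after a line terminator.
    static boolean findsLineStartingWithB(String text, boolean multiline) {
        int flags = multiline ? Pattern.MULTILINE : 0;
        return Pattern.compile("^B", flags).matcher(text).find();
    }

    // With DOTALL, '.' also matches line terminators, so A.*B can span lines.
    static boolean findsAThenB(String text, boolean dotall) {
        int flags = dotall ? Pattern.DOTALL : 0;
        return Pattern.compile("A.*B", flags).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "A = 1\nB = 2\n";
        System.out.println(findsLineStartingWithB(text, true));
        System.out.println(findsLineStartingWithB(text, false));
        System.out.println(findsAThenB(text, true));
        System.out.println(findsAThenB(text, false));
    }
}
```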
