Invalid logback pattern - java

I was using this working pattern (logback.groovy):
{'((?:password(=|:|>))|(?:secret(=|:))|(?:salt(=|:)))','\$1*******\$3'}
to mask sensitive data. One day I needed to surround it with double quotes, like
was: password=smth
became: "password"="smth"
So I changed the regexp to this (I just added \" before and after the keywords, and I also tried \\"):
{'(\"?(?:password\"?(=|:|>))|(?:secret\"?(=|:))|(?:salt\"?(=|:)))','\$1*******\$3'}
But I get this error on app startup:
Failed to parse pattern
Unexpected character ('?' (code 63)): was expecting comma to separate Object entries
Can someone please explain what I am doing wrong?

In case anyone is wondering, here is the correct version:
{'(\\\"?(?:password\\\"?(=|:|>))|(?:secret\\\"?(=|:))|(?:salt\\\"?(=|:)))','\$1*******\$3'}
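One way to separate regex mistakes from escaping mistakes is to try the expression in plain Java first, where only one layer of escaping (the Java string literal) is involved. The sketch below is only an illustration: the class name MaskRegexCheck, the method firstKey and the sample inputs are made up, and Logback is not involved at all.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Logback: exercises the regex on its own.
class MaskRegexCheck {
    // Same expression as in the question, with only Java-level escaping of the quote.
    static final Pattern KEYS = Pattern.compile(
        "(\"?(?:password\"?(=|:|>))|(?:secret\"?(=|:))|(?:salt\"?(=|:)))");

    // Returns the first key-plus-separator the regex finds, or null if there is none.
    static String firstKey(String line) {
        Matcher m = KEYS.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstKey("password=smth"));
        System.out.println(firstKey("\"password\"=\"smth\""));
    }
}
```

If the expression behaves as expected here, the remaining problem is most likely the extra escaping required by the configuration layers, which is what the working version with \\\" suggests.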

Unable to capture next line character in Java

I need to parse a Python file that contains multiple SQL queries and get the start and end positions of each query, so that I can extract only the query part, using Java.
I am using the .contains function to check for sql(''' as the opening of a query, and ''') as the closing. But in some cases ''') appears in the middle of the query, when a variable is involved, and that should not be detected as the end of the query.
Something like this :
spark.sql(''' SELECT .......
FROM.....
WHERE xxx IN ('''+ Variable +''')
''')
Here the second-to-last line is also detected as the end of the query if I use line.contains(" ''') "), which is wrong.
All I can think of is to check for the newline character at the end of the query, since each query is separated by two empty lines. So I tried line.contains(" ''')\n") and line.contains(" ''')\r\n"), but neither works for me.
Kindly let me know of any other way to do this.
Note that I do not have the privilege to change the query file.
Thanks
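I believe a simple contains won't solve this problem. If you read the file line by line (for example with BufferedReader.readLine()), the line terminator is stripped from each line, so a contains check that includes \n can never match.
You will have to run a Pattern over the whole text if you want to match \n.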
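import java.util.regex.Pattern;

// The whole query text, including the ''') that appears in the middle of it.
String query = "spark.sql(''' SELECT .......\n" +
"FROM..... \n" +
"WHERE xxx IN ('''+ Variable +''')\n" +
"''')";
// The dot in "spark.sql" is escaped so it matches only a literal dot.
Pattern pattern = Pattern.compile("^spark\\.sql\\('''(.*)'''\\)$", Pattern.DOTALL);
System.out.println(pattern.matcher(query).find());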
Output:
true
Pattern.DOTALL tells Java to allow the dot to match newline characters, too.
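To get the start and end positions of several queries in one file, the same idea can be extended with Matcher.start and Matcher.end. This is only a sketch under two assumptions that are not confirmed by the question: the file is read into a single string rather than line by line, and the closing ''') always stands alone on its own line, as in the example above. The class QuerySpans and its find method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the character span of each query body in a file's text.
class QuerySpans {
    // Assumption: the closing ''') stands alone on its own line.
    // DOTALL lets .*? cross line breaks; MULTILINE makes ^ and $ match at line boundaries.
    static final Pattern QUERY = Pattern.compile(
        "spark\\.sql\\('''(.*?)^[ \\t]*'''\\)[ \\t]*$",
        Pattern.DOTALL | Pattern.MULTILINE);

    // Returns {start, end} offsets of each query body (the text between the delimiters).
    static List<int[]> find(String source) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = QUERY.matcher(source);
        while (m.find()) {
            spans.add(new int[] {m.start(1), m.end(1)});
        }
        return spans;
    }

    public static void main(String[] args) {
        String source = "spark.sql(''' SELECT a\n"
            + "FROM t\n"
            + "WHERE x IN ('''+ v +''')\n"
            + "''')\n\n\n"
            + "spark.sql(''' SELECT b\n"
            + "''')\n";
        for (int[] span : find(source)) {
            System.out.println(source.substring(span[0], span[1]));
        }
    }
}
```

Because the closing delimiter is only accepted at the start of a line, the ''') that ends the WHERE line is treated as part of the query body.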

Filter Special Characters in Spring / Java

I'm using jsoup to get all text from websites.
Document doc = Jsoup.connect("URL").get();
String allText = doc.text().toLowerCase();
Then I'm using Hibernate to persist the object that holds all text to a MySQL DB:
...
@Column(name="all_text")
@Lob
private String allText = null;
...
Everything is good so far. Only that sometimes I get a MySQL error when I try to save the object with allText:
java.sql.SQLException: Incorrect string value: '\xF0\x9F\x98\x8A s...' for column 'all_text' at row 1
I already looked this up and it's an encoding error, probably caused by special characters on those websites. I found a way to fix this by changing the encoding in the DB.
But my actual question is: what's the best way to filter and remove the special characters from the allText string and not persist them at all?
EDIT: To clarify, by special characters I mean Emoticons and all that stuff. Definitely anything that doesn't fit into UTF-8 encoding. I'm not concerned about ~ ^ etc...
Thanks in advance!
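Just use a regex:
allText = allText.replaceAll("\\p{C}", "");
Strings are immutable, so assign the result back. String.replaceAll needs no import; java.util.regex.Pattern is only needed if you compile the pattern yourself.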
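The byte sequence in the error message, \xF0\x9F\x98\x8A, is a four-byte UTF-8 character, and MySQL's legacy utf8 charset stores at most three bytes per character. As far as I can tell, \p{C} covers control, format and unassigned characters, while most emoji are classified as symbols, so it may leave them in place. A minimal sketch of an alternative is to drop every code point above U+FFFF; the class SupplementaryFilter and the method stripSupplementary are hypothetical names, and this assumes that losing all supplementary characters (not only emoji) is acceptable.

```java
// Hypothetical helper: drops every code point above U+FFFF
// (emoji and other supplementary characters).
class SupplementaryFilter {
    static String stripSupplementary(String text) {
        StringBuilder kept = new StringBuilder(text.length());
        text.codePoints()
            // Supplementary code points are exactly those that need four bytes in UTF-8.
            .filter(cp -> !Character.isSupplementaryCodePoint(cp))
            .forEach(kept::appendCodePoint);
        return kept.toString();
    }

    public static void main(String[] args) {
        // "\uD83D\uDE0A" is U+1F60A, whose UTF-8 bytes are F0 9F 98 8A, as in the error message.
        System.out.println(stripSupplementary("nice \uD83D\uDE0A site"));
    }
}
```

Call it before persisting, for example on the value returned by doc.text().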

Change log entry pattern dynamically on some condition

In my Java app, Logback is used as the logging framework. The appenders are configured with the following pattern (simplified):
[CORR=%X{CORR}] [MSG=%msg]%n
As one can see, CORR value is taken from MDC. Log entry example:
[CORR=12342314] [MSG=Some message]
There are cases when the attribute is not stored in MDC, so log entry looks like:
[CORR=] [MSG=Some message]
But should be:
[MSG=Some message]
Is there any way to totally get rid of this [CORR=] part of pattern if the corresponding value is absent in MDC without creating custom LayoutBase implementations?
I'm trying to configure an evaluator:
<evaluator name="DISPLAY_CORR_EVAL">
<expression>((String) mdc.get("CORR")) != null</expression>
</evaluator>
but I have no idea how to use it in my case.
The problem was solved with the help of Logback's replace(p){r, t} conversion word:
Replaces occurrences of 'r', a regex, with its replacement 't' in the
string produced by the sub-pattern 'p'. For example,
"%replace(%msg){'\s', ''}" will remove all spaces contained in the
event message.
The pattern 'p' can be arbitrarily complex and in particular can
contain multiple conversion keywords. For instance, "%replace(%logger
%msg){'.', '/'}" will replace all dots in the logger or the message
of the event with a forward slash.
My pattern now looks as follows:
%replace([CORR=%X{CORR}]){'\[CORR=\]', ''}[MSG=%msg]%n
When CORR is empty, the sub-pattern renders as [CORR=], which matches the regex r and is replaced by an empty string.
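The effect of the regex can be checked in plain Java. This sketch assumes that %replace behaves like String.replaceAll applied to the text rendered by the sub-pattern; the class EmptyCorrDemo and the method dropEmptyCorr are hypothetical names, and Logback itself is not involved.

```java
// Hypothetical illustration of the replacement only; Logback is not involved.
class EmptyCorrDemo {
    static String dropEmptyCorr(String rendered) {
        // Same regex as in the pattern: a literal "[CORR=]" with nothing between '=' and ']'.
        return rendered.replaceAll("\\[CORR=\\]", "");
    }

    public static void main(String[] args) {
        System.out.println(dropEmptyCorr("[CORR=]") + "[MSG=Some message]");
        System.out.println(dropEmptyCorr("[CORR=12342314]") + "[MSG=Some message]");
    }
}
```

A non-empty value such as [CORR=12342314] does not match the regex, so it passes through unchanged.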

What is the MySQL SQL REGEX for this regex

Regular regex:
foo(\((\d{1}|\d{2}|\d{3})\))?
This regex works in Java:
foo(\\((\\d{1}|\\d{2}|\\d{3})\\))?
Examples:
fooa //no match
foo(1)a //no match
foo(a) //no match
foo(1) //match
foo(999) //match
foo //match
MySQL 5.5 documentation (https://dev.mysql.com/doc/refman/5.5/en/regexp.html) says
Note:
Because MySQL uses the C escape syntax in strings (for example, “\n” to
represent the newline character), you must double any “\” that you use
in your REGEXP strings.
As a test, I tried running the following on MySQL 5.x:
select 'foo' REGEXP 'foo(\\((\\d{1}|\\d{2}|\\d{3})\\))?'
Here is the error message I get:
Error: You have an error in your SQL syntax; check the manual
that corresponds to your MySQL server version for the right syntax to
use near ''foo(\\([(]\\d{1}' at line 1
I looked at Adapting a Regex to work with MySQL and tried the suggestion of replacing \d{1} etc.. with [0-9] which gave me:
select 'foo' REGEXP 'foo(\\(([0-9]|[0-9]|[0-9])\\))?'
But I am still getting the MySQL error.
I don't have a MySQL console immediately available to verify, but this should work:
'foo(\\([[:digit:]]{1,3}\\))?'
Your other regexes have capture groups around both foo(123) and foo(123). It doesn't look like you want the capture groups in MySQL (does it even support them?), which would lead to MySQL choking.
Popping in because I ran into this and found the problem and the solution.
In SQuirreL, go to Global Preferences -> MySQL tab. Under "Use Custom Query Tokenizer" there is a "Procedure/Function Separator". If that is "|", change it to something else (like "/"). This is what's causing SQuirreL to fail parsing the REGEX.
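For comparison, the expected results listed in the question can be reproduced in Java with a whole-string match. This is a sketch, not the asker's code: the class FooMatcher is a hypothetical name, and \d{1,3} is used as a shorter spelling of \d{1}|\d{2}|\d{3}. As far as I know, MySQL's REGEXP matches anywhere in the string, so the MySQL version would need ^ and $ anchors to give the same results.

```java
import java.util.regex.Pattern;

// Hypothetical helper reproducing the match / no-match table from the question.
class FooMatcher {
    // "foo", optionally followed by one to three digits in parentheses.
    static final Pattern FOO = Pattern.compile("foo(\\((\\d{1,3})\\))?");

    static boolean matchesFoo(String s) {
        // matches() requires the whole input to match, unlike find().
        return FOO.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"fooa", "foo(1)a", "foo(a)", "foo(1)", "foo(999)", "foo"};
        for (String s : samples) {
            System.out.println(s + " -> " + matchesFoo(s));
        }
    }
}
```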

Regex backreference when string section excluded

I have a regular expression that I am trying to use to rewrite an incoming REST URL, and I am getting stuck on one use case where one section of the URL is omitted.
Here is the regex I'm currently using:
^(/[^/]+/(?:books))/([^/]+?)(?:/(?:(?!page).+?))?(?:/page/(\\d+))?$
As an example, I'm using "$1 - $2 - $3" as the parts to use when writing the new URL.
Here are the examples that are working correctly...
"/mySite/books/topic1/page/2" results in "/mySite/books - topic1 - 2"
"/mySite/books/topic1/subtopic1/page/2" results in "/mySite/books - topic1 - 2"
All the above work as intended. The problem is when the URL excludes the "topic1" part of the URL then the results are not what I need. Example:
"/mySite/books/page/2" results in "/mySite/books - page - "
What I need is the $2 to be blank, because there is no topic, and the page number still as $3. What I need as output...
"/mySite/books/page/2" results in "/mySite/books - - 2"
What can I change in my regex to satisfy that scenario without disrupting the existing ones that work correctly? This is being done in Java.
You might try to use regex pattern
^(/[^/]+/books)/(?:(?!page/)([^/]+)/)?page/(\\d+)$
It should suffice to make the optional middle group (the one that skips subtopics) ungreedy by turning its ? into ??. The engine will then first try to find a match without using it (trying only /page/\\d+ instead), and only if that fails will it try to include that group:
^(/[^/]+/(?:books))/([^/]+?)(?:/(?:(?!page).+?))??(?:/page/(\\d+))?$
Appending ? to any quantifier (+, *, ? or {..}) makes it ungreedy.
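In the original expression the topic group ([^/]+?) is mandatory, so for "/mySite/books/page/2" it has to consume "page". A sketch of one way to cover all three URLs from the question is to make the topic segment optional and to guard every topic and subtopic segment with a (?!page/) lookahead. The pattern below is my own variation, not one taken from the answers, and the class BookUrlParts and the method describe are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits a books URL into base path, topic and page number.
class BookUrlParts {
    // Group 1: "/<site>/books". Group 2: optional topic. Group 3: optional page number.
    // Each topic or subtopic segment is guarded by (?!page/) so "page" is never taken as a topic.
    static final Pattern URL = Pattern.compile(
        "^(/[^/]+/books)(?:/(?!page/)([^/]+))?(?:/(?!page/)[^/]+)*(?:/page/(\\d+))?$");

    // Returns "base - topic - page", with empty strings for missing parts, or null if no match.
    static String describe(String url) {
        Matcher m = URL.matcher(url);
        if (!m.matches()) {
            return null;
        }
        String topic = m.group(2) == null ? "" : m.group(2);
        String page = m.group(3) == null ? "" : m.group(3);
        return m.group(1) + " - " + topic + " - " + page;
    }

    public static void main(String[] args) {
        System.out.println(describe("/mySite/books/topic1/page/2"));
        System.out.println(describe("/mySite/books/topic1/subtopic1/page/2"));
        System.out.println(describe("/mySite/books/page/2"));
    }
}
```

Groups that did not participate in the match are returned as null by Matcher.group, which is why the sketch maps them to empty strings explicitly.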
