Regex in java to extract specific pattern - java

I want to match the pattern (including the square brackets, equals, quotes)
[fixedtext="sometext"]
What would be a correct regex expression?
Anything can occur inside quotes. 'fixedtext' is fixed.

Your basic solution (although I'd be skeptical of this, per the comments) is essentially:
"\\[fixedtext=\\\"(.*)\\\"\\]"
which resolves to:
"\[fixedtext=\"(.*)\"\]"
Simple escaping of [] and quotes. The (.*) says capture everything in quotes as a capture group (matcher.group(1)).
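Be careful with the quantifier, though. Because (.*) is greedy, it runs to the last "] in the input: for '[fixedtext="abc\"]def"]' you get abc\"]def, but two tags on one line are swallowed as a single match. The reluctant form (.*?) stops at the first "] instead, so for that same string you'd get an answer of abc\ instead of abc\"]def.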
If you know the ending bracket ends the line, then use:
"\\[fixedtext=\\\"(.*)\\\"\\]$"
(add the $ at the end to mark end of line) and that should be fairly reliable.
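To make the effect of the quantifier concrete, here is a minimal sketch that compares a greedy and a reluctant capture on the same input. The class and method names (FixedTextDemo, firstCapture) are made up for illustration; only the pattern itself comes from the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixedTextDemo {
    // Greedy: (.*) runs to the last "] in the input.
    static final Pattern GREEDY = Pattern.compile("\\[fixedtext=\"(.*)\"\\]");
    // Reluctant: (.*?) stops at the first "] it reaches.
    static final Pattern RELUCTANT = Pattern.compile("\\[fixedtext=\"(.*?)\"\\]");

    // Returns group 1 of the first match, or null when nothing matches.
    static String firstCapture(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String one = "[fixedtext=\"sometext\"]";
        String two = "[fixedtext=\"a\"] [fixedtext=\"b\"]";
        System.out.println(firstCapture(GREEDY, one));    // sometext
        System.out.println(firstCapture(GREEDY, two));    // a"] [fixedtext="b
        System.out.println(firstCapture(RELUCTANT, two)); // a
    }
}
```

Which form is appropriate depends on whether the quoted text may itself contain "] and whether several tags can share a line; neither handles both cases.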

My suggestion is using named-capturing groups.
You can find more details here:
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
Here's an example for your input:
String input = "[fixedtext=\"sometext\"]";
Pattern pattern = Pattern.compile("\\[(?<field>.*)=\"(?<value>.*)\"]");
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
System.out.println(matcher.group("field"));
System.out.println(matcher.group("value"));
} else {
System.err.println(input + " doesn't match " + pattern);
}

Related

Setting up regex in Pattern object for java string

I am trying to get all matches in a Java string. The matches must be bases and powers in a math equation. As you know, bases and powers can be negative and can be decimals as well. I have the pattern, regex and matcher set up. It looks something like this, but it is not giving me what I expect. I was guided by this StackOverflow post: Regex to find integer or decimal from a string in java in a single group?
I am really only interested in capturing powers that have negative exponents, both integer and non-integer.
Well here is my code:
String ss = "2.5(4x+3)-2.548^-3.654=-14^-2.545";
String regex = "(\\d^+(?:\\d+)?)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(ss);
while(m.find()){
System.out.println("Wiwi the cat: "+m.group(1));
}
The output of this code is nothing. Any ideas or suggestions would be great.
thanks
Your basic problem is this character: ^. It means "start of input", so your regex can't match anything.
You must escape it \^ so it becomes a literal.
I also fixed the rest of your regex:
String regex = "-?\\d+(\\.\\d*)?(\\^-?\\d+(\\.\\d*)?)?";
And use
m.group(); // the whole match
EDIT: replaced [0123456789] with (\\d)
EDIT EDIT: added context for OP
Answer
Here's a pattern that should match what you need:
-?(\\d)+(\\.(\\d)+)?\\^-?(\\d)+(\\.(\\d)+)?
Explanation
-? - zero or one minus symbols
(\\d)+ - one or more digits
(\\.(\\d)+)? - (optional) a decimal point, followed by one or more digits
\\^ - one caret symbol
Using this on your input with String.replaceAll(pattern, "[FOUND]") produced:
"2.5(4x+3)[FOUND]=[FOUND]"
In the context of your question, simply replace your regex with the one I posted, and use m.group() instead of m.group(1).
String ss = "2.5(4x+3)-2.548^-3.654=-14^-2.545";
String regex = "-?(\\d)+(\\.(\\d)+)?\\^-?(\\d)+(\\.(\\d)+)?";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(ss);
while(m.find()) {
System.out.println("Wiwi the cat: " + m.group());
}
Best of luck!
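Since the question is only interested in powers with negative exponents, one possible variation is to make the minus sign on the exponent mandatory and collect the matches into a list. This is a sketch, not part of the answers above; the class name NegativePowers and the method find are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegativePowers {
    // Base: optional minus, digits, optional fraction.
    // Then a literal caret, then an exponent that MUST start with a minus.
    static final Pattern NEGATIVE_POWER =
            Pattern.compile("-?\\d+(?:\\.\\d+)?\\^-\\d+(?:\\.\\d+)?");

    // Returns every base^-exponent occurrence, in order of appearance.
    static List<String> find(String expression) {
        List<String> found = new ArrayList<>();
        Matcher m = NEGATIVE_POWER.matcher(expression);
        while (m.find()) {
            found.add(m.group()); // group() is the whole match
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(find("2.5(4x+3)-2.548^-3.654=-14^-2.545"));
        // [-2.548^-3.654, -14^-2.545]
    }
}
```

The non-capturing groups (?:...) keep the pattern free of numbered groups, so m.group() is the only thing the caller needs.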

Split string between words and quotation marks

I currently have this string:
"display_name":"test","game":"test123"
and I want to split the string so I can get the value test. I have looked all over the internet and tried some things, but I couldn't get it to work.
I found that splitting on quotation marks could be done with this regex: \"([^\"]*)\". So I tried this regex: display_name:\":\"([^\"]*)\"game\", but it returned null. I hope someone can explain to me why my regex didn't work and how it should be done.
You forgot to include the comma before "game", and you also need to remove the extra colon after display_name:
display_name\":\"([^\"]*)\",\"game\"
or
\"display_name\":\"([^\"]*)\",\"game\"
Now, print the group index 1.
Matcher m = Pattern.compile("\"display_name\":\"([^\"]*)\",\"game\"").matcher(str);
while(m.find())
{
System.out.println(m.group(1));
}
I think you could do it more simply, like this:
/(\w)+/g
This little regex matches every run of word characters, so it returns display_name, test, game and test123 in turn.
Your Java code should be something like:
Pattern pattern = Pattern.compile("(\\w)+");
Matcher matcher = pattern.matcher(yourText);
while (matcher.find()) {
System.out.println("Result: " + matcher.group());
}
I also want to note, as @AbishekManoharan pointed out, that your input looks like JSON.
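If only the display_name value is needed, a sketch like the following avoids depending on which key comes next. The class and method names (DisplayNameDemo, displayName) are made up for illustration, and the pattern assumes the value contains no escaped quotes; for real JSON a proper parser would be more robust.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DisplayNameDemo {
    // Matches "display_name":"<value>" and captures <value>.
    // Assumption: the value contains no escaped quotes.
    static final Pattern DISPLAY_NAME =
            Pattern.compile("\"display_name\":\"([^\"]*)\"");

    // Returns the captured value, or null when the key is absent.
    static String displayName(String text) {
        Matcher m = DISPLAY_NAME.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "\"display_name\":\"test\",\"game\":\"test123\"";
        System.out.println(displayName(input)); // test
    }
}
```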

Java Regular expressions for filename

I want to check the filenames sent to me against two patterns.
The first regular expression is ~*~, which should match names like ~263~. I put this into online regular expression testers and it matches. The code doesn't work, though: it says there is no match.
List<FTPFile> ret = new ArrayList<FTPFile>();
Pattern pattern = Pattern.compile("~*~");
Matcher matcher;
for (FTPFile file : files)
{
matcher = pattern.matcher(file.getName());
if(matcher.matches())
{
ret.add(file);
}
}
return ret;
Also the second pattern I need is ##* which should match strings like abc#ere#sss
Please tell me the proper patterns in java for this.
You need to define your pattern like,
Pattern pattern = Pattern.compile("~.*~");
~* in your regex ~*~ repeats the first ~ zero or more times, so it cannot match the number that follows the first ~. Because the matches method tries to match the whole input string, the match fails. You need to add .* in between to match strings like ~66~ or ~kjk~. To match only strings that have digits between the tildes, use ~\d+~ (written "~\\d+~" as a Java string literal).
Try Regex:
\~.*\~
Instead:
~*~
Example:
Pattern pattern = Pattern.compile("\\~.*\\~");
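The difference between the online tester and the Java code most likely comes down to matches() versus find(): matches() requires the whole string to fit the pattern, while find() is satisfied by any substring. A small sketch, with made-up class and helper names, shows both behaviors:

```java
import java.util.regex.Pattern;

class TildeDemo {
    // True only when the ENTIRE name fits the regex.
    static boolean wholeNameMatches(String regex, String name) {
        return Pattern.compile(regex).matcher(name).matches();
    }

    // True when ANY part of the name fits the regex.
    static boolean containsMatch(String regex, String name) {
        return Pattern.compile(regex).matcher(name).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeNameMatches("~*~", "~263~"));     // false
        System.out.println(containsMatch("~*~", "~263~"));        // true
        System.out.println(wholeNameMatches("~.*~", "~263~"));    // true
        System.out.println(wholeNameMatches("~\\d+~", "~263~"));  // true
        System.out.println(wholeNameMatches("~\\d+~", "~kjk~"));  // false
    }
}
```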

java easy Regular expression

I have strings like "xxxxx?434334", "xxx?411112", "xxxxxxxxx?11113" and so on.
How do I take the right substring to retrieve "xxxxx" (everything that comes before the '?' character)?
return s.substring(0, s.indexOf('?'));
No need for a regex for that.
If you have a problem, use a regex. Now you have two problems.
str = str.replaceAll("[?].*", "");
In other words, "remove everything after, and including, the question mark character". The ? has to be enclosed in square brackets because otherwise it has a special meaning.
I agree with the other answers that you should avoid a regex where a plain string method will do, but if you did want to use one for this scenario you could use the following:
Pattern regex = Pattern.compile("([^\\?]*)\\?{1}");
Matcher m = regex.matcher(str);
if (m.find()) {
result = m.group(1);
}
where str is your input string.
EDIT:
Description of the regex: match any run of characters that are not a "?" and that is followed by a single "?".
The pattern .*(?=\?) (written ".*(?=\\?)" as a Java string literal) should work as well. ?= is a positive lookahead, which means the pattern matches everything that comes before a question mark, but not the question mark itself.
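One detail worth guarding against: when the string contains no '?', indexOf returns -1 and substring(0, -1) throws StringIndexOutOfBoundsException. The sketch below, with made-up names, falls back to returning the whole string in that case (an assumption about the desired behavior) and shows that the replaceAll approach gives the same results:

```java
class PrefixDemo {
    // Plain string version: everything before the first '?'.
    // Assumption: a string without '?' is returned unchanged.
    static String beforeQuestionMark(String s) {
        int i = s.indexOf('?');
        return i >= 0 ? s.substring(0, i) : s;
    }

    // Regex version: delete the first '?' and everything after it.
    static String beforeQuestionMarkRegex(String s) {
        return s.replaceAll("[?].*", "");
    }

    public static void main(String[] args) {
        System.out.println(beforeQuestionMark("xxxxx?434334"));      // xxxxx
        System.out.println(beforeQuestionMarkRegex("xxx?411112"));   // xxx
        System.out.println(beforeQuestionMark("no-question-mark"));  // no-question-mark
    }
}
```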

Strip all reluctant curly braces using regex

Note: This is a Java-only question (i.e. no Javascript, sed, Perl, etc.)
I need to filter out all the "reluctant" curly braces ({}) in a long string of text.
(by "reluctant" I mean as in reluctant quantifier).
I have been able to come up with the following regex which correctly finds and lists all such occurrences:
Pattern pattern = Pattern.compile("(\\{)(.*?)(\\})", Pattern.DOTALL);
Matcher matcher = pattern.matcher(originalString);
while (matcher.find()) {
Log.d("WITHIN_BRACES", matcher.group(2));
}
My problem now is how to replace every found matcher.group(0) with the corresponding matcher.group(2).
Intuitively I tried:
while (matcher.find()) {
String noBraces = matcher.replaceAll(matcher.group(2));
}
But that replaced all found matcher.group(0) with only the first matcher.group(2), which is of course not what I want.
Is there an expression or a method in Java's regex to perform this "corresponding replaceAll" that I need?
ANSWER: Thanks to the tip below, I have been able to come up with 2 fixes that did the trick:
if (matcher.find()) {
String noBraces = matcher.replaceAll("$2");
}
Fix #1: Use "$2" instead of matcher.group(2)
Fix #2: Use if instead of while.
Works now like a charm.
You can use the special backreference syntax:
String noBraces = matcher.replaceAll("$2");
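For completeness, here is a self-contained sketch of the backreference approach. It uses a single capturing group, so the replacement is "$1" rather than "$2"; the class and method names are made up for illustration. Note that replaceAll resets the matcher and scans the whole input itself, so no surrounding find() loop is needed.

```java
import java.util.regex.Pattern;

class BraceStripper {
    // Reluctant .*? keeps each match to the nearest closing brace;
    // DOTALL lets the braces span line breaks.
    static final Pattern BRACES = Pattern.compile("\\{(.*?)\\}", Pattern.DOTALL);

    // Replaces every {text} with text; "$1" refers to each match's own group 1.
    static String stripBraces(String text) {
        return BRACES.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(stripBraces("a {b} c {d}")); // a b c d
    }
}
```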
