Java regex match with variable parts

I want to match any string that has the pattern
{"id":"362237-
any number of characters followed by
"http//:www.abc.com"
any number of characters followed by
"id":"364121-
any number of characters followed by
"http://www.efg.com"
I want to match above pattern to the string below.
[{"id":"362237-13","http//:www.abc.com"},{"id":"364075-13","http://www.xyz.com"},{"id":"364121-13","http://www.efg.com"}]
Code:
String pttrn=".*{\"id\":\"362237-.*\"http//:www.abc.com\".*\"id\":\"364121-.*\"http://www.efg.com\".*";
String mtchr="[{\"id\":\"362237-13\",\"http//:www.abc.com\"},{\"id\":\"364075-13\",\"http://www.xyz.com\"},{\"id\":\"364121-13\",\"http://www.efg.com\"}]";
boolean b = Pattern.matches(pttrn, mtchr);
System.out.println("b is !!" + b);
I was expecting b to be true but it returns false. I have got the regex wrong.
Please let me know how to fix it.
Thanks

You need to escape the curly brace for the regex engine with a backslash, because an unescaped { starts a repetition quantifier such as {2,3}. You also need to escape that backslash for the Java string literal with a second backslash, giving \\{.
String pttrn=".*\\{\"id\":\"362237-.*\"http//:www.abc.com\".*\"id\":\"364121-.*\"http://www.efg.com\".*";
String mtchr="[{\"id\":\"362237-13\",\"http//:www.abc.com\"},{\"id\":\"364075-13\",\"http://www.xyz.com\"},{\"id\":\"364121-13\",\"http://www.efg.com\"}]";
boolean b = Pattern.matches(pttrn, mtchr);
System.out.println("b is !!" + b);
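If the fixed fragments are meant to be matched literally, hand-escaping every special character can be avoided. The following is a minimal sketch, not the answerer's code: the class and method names (BraceEscapeDemo, buildPattern, matchesInOrder) are made up for illustration, and it assumes each fragment should appear in the given order with anything in between. It relies on Pattern.quote() to escape the brace and the dots.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class BraceEscapeDemo {

    // Joins the literal fragments with ".*" and quotes each one,
    // so characters like '{' and '.' lose their regex meaning.
    static Pattern buildPattern(String... literals) {
        StringBuilder sb = new StringBuilder(".*");
        for (String literal : literals) {
            sb.append(Pattern.quote(literal)).append(".*");
        }
        return Pattern.compile(sb.toString());
    }

    // True if the whole input contains all fragments in the given order.
    static boolean matchesInOrder(String input, String... literals) {
        return buildPattern(literals).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "[{\"id\":\"362237-13\",\"http//:www.abc.com\"},"
                + "{\"id\":\"364075-13\",\"http://www.xyz.com\"},"
                + "{\"id\":\"364121-13\",\"http://www.efg.com\"}]";
        boolean b = matchesInOrder(input,
                "{\"id\":\"362237-",
                "\"http//:www.abc.com\"",
                "\"id\":\"364121-",
                "\"http://www.efg.com\"");
        System.out.println("b is " + b); // prints: b is true
    }
}
```

Quoting the fragments also makes the dots in the URLs match only a literal dot, which the hand-written pattern does not do.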

Related

Regex to identify strings containing a particular symbol?

I have set of inputs ++++,----,+-+-.Out of these inputs I want the string containing only + symbols.
If you want to see if a String contains nothing but + characters, write a loop to check it:
private static boolean containsOnly(String input, char ch) {
if (input.isEmpty())
return false;
for (int i = 0; i < input.length(); i++)
if (input.charAt(i) != ch)
return false;
return true;
}
Then call it to check:
System.out.println(containsOnly("++++", '+')); // prints: true
System.out.println(containsOnly("----", '+')); // prints: false
System.out.println(containsOnly("+-+-", '+')); // prints: false
UPDATE
If you must do it using regex (worse performance), then you can do any of these:
// escape special character '+'
input.matches("\\++")
// '+' not special in a character class
input.matches("[+]+")
// if "+" is dynamic value at runtime, use quote() to escape for you,
// then use a repeating non-capturing group around that
input.matches("(?:" + Pattern.quote("+") + ")+")
Replace final + with * in each of these, if an empty string should return true.
The regular expression for checking if a string is composed of only one repeated symbol is
^(.)\1*$
If you only want lines composed by '+', then it's
^\++$, or ^\+\+*$ if your regex implementation does not support + (meaning "one or more").
For a sequence of the same symbol, use
(.)\1+
as the regular expression. For example, this will match +++, and --- but not +--.
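The two patterns discussed here can be compared side by side in Java. This is only a sketch: the class and method names are invented for the example, and it uses String.matches(), which already anchors the pattern to the whole input, so ^ and $ are left out.

```java
// Hypothetical demo class, for illustration only.
class RepeatedSymbolDemo {

    // True if the string consists of one or more '+' characters and nothing else.
    static boolean isOnlyPlus(String s) {
        return s.matches("\\++");
    }

    // True if the string is a single character repeated one or more times.
    // The group (.) captures the first character; \1* repeats that same character.
    static boolean isSingleRepeatedSymbol(String s) {
        return s.matches("(.)\\1*");
    }

    public static void main(String[] args) {
        for (String input : new String[] {"++++", "----", "+-+-"}) {
            System.out.println(input + " onlyPlus=" + isOnlyPlus(input)
                    + " repeated=" + isSingleRepeatedSymbol(input));
        }
        // prints:
        // ++++ onlyPlus=true repeated=true
        // ---- onlyPlus=false repeated=true
        // +-+- onlyPlus=false repeated=false
    }
}
```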
Regex pattern: ^[^\+]*?\+[^\+]*$
This will only permit one plus sign per string.
Explanation:
^ #From start of string
[^\+]* #Match 0 or more non plus characters
\+ #Match 1 plus character
[^\+]* #Match 0 or more non plus characters
$ #End of string
Edit: I just read the comments under the question. I didn't copy the regex posted there; we simply arrived at the same thing independently.
Also, when using matches(), the ^ and $ anchors are unnecessary, because matches() already requires the whole input to match:
input.matches("[^\\+]*?\\+[^\\+]*")
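To make the "exactly one plus sign" behaviour concrete, here is a small sketch. The class and method names are hypothetical; inside a character class the + needs no escaping, so the pattern is written slightly more simply than above.

```java
// Hypothetical demo class, for illustration only.
class SinglePlusDemo {

    // True if the input contains exactly one '+' character,
    // with any number of non-plus characters before and after it.
    static boolean hasExactlyOnePlus(String input) {
        return input.matches("[^+]*\\+[^+]*");
    }

    public static void main(String[] args) {
        System.out.println(hasExactlyOnePlus("a+b"));  // prints: true
        System.out.println(hasExactlyOnePlus("++++")); // prints: false
        System.out.println(hasExactlyOnePlus("abc"));  // prints: false
    }
}
```

Note that this answers a different question from "contains only + symbols": it rejects ++++ and accepts a+b.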

java regex replaceAll with negated groups

I'm trying to use the String.replaceAll() method with regex to only keep letter characters and ['-_]. I'm trying to do this by replacing every character that is neither a letter nor one of the characters above by an empty string.
So far I have tried something like this (in different variations) which correctly keeps letters but replaces the special characters I want to keep:
current = current.replaceAll("(?=\\P{L})(?=[^\\'-_])", "");
Make it simpler:
current = current.replaceAll("[^a-zA-Z'_-]", "");
Explanation:
The character class matches any character that is not a-z, A-Z, ', _ or -, and replaceAll() replaces each matched character with an empty string.
Tested input : "a_zE'R-z4r#m"
Output : a_zE'R-zrm
You don't need lookahead, just use negated regex:
current = current.replaceAll("[^\\p{L}'_-]+", "");
[^\\p{L}'_-] will match anything that is not a letter (unicode) or single quote or underscore or hyphen.
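The difference between the ASCII-only class and the \p{L} class only shows up on non-ASCII letters. The sketch below compares the two; the class and method names are invented for the example, and the second input (with an accented letter written as \u00e9) is an assumed test value, not from the question.

```java
// Hypothetical demo class, for illustration only.
class KeepLettersDemo {

    // Keeps ASCII letters plus ' _ and -; everything else is removed.
    static String keepAsciiLetters(String s) {
        return s.replaceAll("[^a-zA-Z'_-]", "");
    }

    // Keeps any Unicode letter plus ' _ and -; everything else is removed.
    static String keepUnicodeLetters(String s) {
        return s.replaceAll("[^\\p{L}'_-]+", "");
    }

    public static void main(String[] args) {
        String plain = "a_zE'R-z4r#m";
        System.out.println(keepAsciiLetters(plain));   // prints: a_zE'R-zrm
        System.out.println(keepUnicodeLetters(plain)); // prints: a_zE'R-zrm

        String accented = "caf\u00e9-42"; // "café-42"
        System.out.println(keepAsciiLetters(accented).equals("caf-"));        // prints: true
        System.out.println(keepUnicodeLetters(accented).equals("caf\u00e9-")); // prints: true
    }
}
```

In both classes the hyphen is placed last so that it is read as a literal character rather than as a range.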
Your regex is too complicated. Just specify the characters you want to keep, and use ^ to negate, so [^a-zA-Z'_-] means "anything but these". (Avoid \w here: it also matches digits, which you want removed.)
public class Replacer {
public static void main(String[] args) {
// prints: with-chars
System.out.println("with 1234 &*()) -/.,>>?chars".replaceAll("[^a-zA-Z'_-]", ""));
}
}
You can try this:
String str = "Se#rbi323a'and_Eur$ope#-t42he-[A%merica]";
str = str.replaceAll("[\\d+\\p{Punct}&&[^-'_\\[\\]]]+", "");
System.out.println("str = " + str);
This prints:
str = Serbia'and_Europe-the-[America]

Validate string a+b

I would like to check whether a given string has the form a + b:
If input = a + b, true
If input = a +, false
If input = + b, false
where a and b can be any string of characters.
I can think of a couple of ways:
Use a regex to match a "+" the characters before and after it.
Use String.indexOf("+") to find a "+" character and test the value of the index to see if it is at the start or end of the string.
(Don't forget the cases where a or b could contain a "+" character; i.e. multiple "+" characters in the string.)
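The indexOf approach could look roughly like the sketch below. It makes assumptions the question does not settle: the class and method names are hypothetical, the string is split at the first "+", surrounding whitespace is ignored, and any further "+" characters are treated as part of b.

```java
// Hypothetical validator, for illustration only.
class PlusFormValidator {

    // True if the input has a non-blank operand on each side of the first '+'.
    static boolean isValid(String input) {
        int idx = input.indexOf('+');
        if (idx < 0) {
            return false; // no '+' at all
        }
        String a = input.substring(0, idx).trim();  // text before the first '+'
        String b = input.substring(idx + 1).trim(); // text after it (may contain more '+')
        return !a.isEmpty() && !b.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isValid("a + b")); // prints: true
        System.out.println(isValid("a +"));   // prints: false
        System.out.println(isValid("+ b"));   // prints: false
    }
}
```

If a leading "+" inside a should be allowed, lastIndexOf or a different splitting rule would be needed; that choice depends on what a and b are permitted to contain.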
You can use a regular expression (regex) to test the string. In Java you can use the Pattern and Matcher classes, or String.matches(), to test whether a string matches a given regex. The regex you want to use is:
String regex = ".+ \\+ .+";
This regex accepts a string of the form "[characters] + [characters]", with at least one character on each side of " + ". (With .* instead of .+, a bare " + " would also be accepted.)

Java Pattern / Matcher not finding word break

I am having trouble with Java Pattern and Matcher. I've included a very simplified example of what I'm trying to do.
I had expected the pattern ".\b" to find the last character of the first word (or "4" in the example), but as I step through the code, m.find() always returns false. What am I missing here?
Why does the following Java code always print out "Not Found"?
Pattern p = Pattern.compile(".\b");
Matcher m = p.matcher("102939384 is a word");
int ixEndWord = 0;
if (m.find()) {
ixEndWord = m.end();
System.out.println("Found: " + ixEndWord);
} else {
System.out.println("Not Found");
}
You need to escape the backslash in the string literal: ".\\b"
In a Java String literal the backslash itself has to be escaped, so "\\" produces the single character '\'.
The String literal ".\\b" therefore becomes the text .\b, which is what the Pattern receives.
To expand on AntonH's comment: whenever you want a "\" to appear in a regular expression, you have to escape it so that it survives in the string you pass in.
As written, ".\b" is a dot followed by the backspace character that \b denotes in a Java string literal, whereas ".\\b" is the regex .\b.
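The difference is easy to see by running both literals against the string from the question. This is a small sketch with an invented class and helper name; it returns -1 when nothing is found.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class WordBoundaryDemo {

    // Returns the end index of the first match, or -1 if the regex is not found.
    static int endOfFirstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.end() : -1;
    }

    public static void main(String[] args) {
        String input = "102939384 is a word";

        // ".\b" is two characters: a dot and a backspace (U+0008).
        System.out.println(".\b".length());                 // prints: 2
        System.out.println(endOfFirstMatch(".\b", input));  // prints: -1

        // ".\\b" is three characters: dot, backslash, 'b' -> regex word boundary.
        System.out.println(".\\b".length());                // prints: 3
        System.out.println(endOfFirstMatch(".\\b", input)); // prints: 9
    }
}
```

The match ends at index 9 because the "4" at index 8 is the first character that is followed by a word boundary.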

Match only first and last character of a string

I had a look at other stackoverflow questions and couldn't find one that asked the same question, so here it is:
How do you match the first and last characters of a string (can be multi-line or empty).
So for example:
String = "this is a simple sentence"
Note that the string includes the beginning and ending quotation marks.
How do I match the first and last characters when the string begins and ends with a quotation mark (")?
I tried:
^"|$" and \A"\Z"
but these do not produce the desired result.
Thanks for your help in advance :)
Is this what you are looking for?
String input = "\"this is a simple sentence\"";
String result = input.replaceFirst("(?s)^\"(.*)\"$", " $1 ");
This will replace the first and last character of the input string with spaces if it starts and ends with ". It will also work across multiple lines since the DOTALL flag is specified by (?s).
The regex that matches the whole input is ".*". In Java, it looks like this:
String regex = "\".*\"";
System.out.println("\"this is a simple sentence\"".matches(regex)); // true
System.out.println("this is a simple sentence".matches(regex)); // false
System.out.println("this is a simple sentence\"".matches(regex)); // false
If you want to remove the quotes, use this:
String input = "\"this is a simple sentence\"";
input = input.replaceAll("(^\"|\"$)", ""); // this is a simple sentence (without any quotes)
If you want this to work over multiple lines, use this:
String input = "\"this is a simple sentence\"\n\"and another sentence\"";
System.out.println(input + "\n");
input = input.replaceAll("(?m)(^\"|\"$)", "");
System.out.println(input);
which produces output:
"this is a simple sentence"
"and another sentence"
this is a simple sentence
and another sentence
Explanation of regex (?m)(^"|"$):
(?m) means "Caret and dollar match after and before newlines for the remainder of the regular expression"
(^"|"$) means ^" OR "$, which means "start of line then a double quote" OR "double quote then end of line"
Why not use the simple logic of getting the first and last characters based on charAt method of String? Place a few checks for empty/incomplete strings and you should be done.
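A regex-free version along those lines might look like the following sketch. The class and method names are hypothetical, and it assumes a string counts as quoted only if it has at least two characters, so a lone " is not treated as both the opening and closing quote.

```java
// Hypothetical demo class, for illustration only.
class QuoteCheckDemo {

    // True if the string starts and ends with a double quote.
    // The length check covers empty and single-character strings.
    static boolean isQuoted(String s) {
        return s.length() >= 2
                && s.charAt(0) == '"'
                && s.charAt(s.length() - 1) == '"';
    }

    // Removes the surrounding quotes if present; otherwise returns the input unchanged.
    static String stripQuotes(String s) {
        return isQuoted(s) ? s.substring(1, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        System.out.println(isQuoted("\"this is a simple sentence\""));    // prints: true
        System.out.println(isQuoted("this is a simple sentence\""));      // prints: false
        System.out.println(stripQuotes("\"this is a simple sentence\"")); // prints: this is a simple sentence
    }
}
```

Because it only looks at the first and last characters, this works unchanged for multi-line input.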
String regexp = "(?s)\".*\"";
String data = "\"This is some\n\ndata\"";
Matcher m = Pattern.compile(regexp).matcher(data);
if (m.find()) {
System.out.println("Match starts at " + m.start() + " and ends at " + m.end());
}
