Regex find() not true; detect duplicate characters in string - java

I'm trying to detect if there are any duplicate characters in a string using regex. When I test the pattern and input in an online regex tester, it says find() should be true. But it doesn't work in my program.
I'm using info from: regex to match a word with unique (non-repeating) characters.
What's happening? Am I using regex correctly in Java?
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
Pattern pat = Pattern.compile("(.).*\1");
String s = "1112";
Matcher m = pat.matcher(s);
if (m.find()) System.out.println("Matches");
else System.out.println("No Match");
}
}

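The backslash in the backreference needs to be escaped. In a Java string literal, "\1" is an octal escape for the character U+0001, so the regex engine never sees a backreference. Write \\1 instead: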
Pattern pattern = Pattern.compile("(.).*\\1");
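A minimal sketch comparing the two literals (the class and method names are made up for illustration); the unescaped pattern is kept only to show that it finds nothing:

```java
import java.util.regex.Pattern;

class DuplicateCharCheck {
    // "\\1" in source becomes \1 in the regex: a backreference to group 1.
    private static final Pattern ESCAPED = Pattern.compile("(.).*\\1");
    // "\1" in source is the single character U+0001, not a backreference.
    private static final Pattern UNESCAPED = Pattern.compile("(.).*\1");

    // True if some character occurs at least twice in s.
    static boolean hasDuplicate(String s) {
        return ESCAPED.matcher(s).find();
    }

    // Same check with the unescaped literal, for comparison only.
    static boolean unescapedFinds(String s) {
        return UNESCAPED.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate("1112"));   // true
        System.out.println(hasDuplicate("1234"));   // false
        System.out.println(unescapedFinds("1112")); // false
    }
}
```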

Related

How to build the regex to find particular word [duplicate]

I need to find and print a particular word in a String. What regex would you recommend to find "9.1.1_offline" in the following String:
EGA_SAMPLE_APP-iOS-master-<Any word>-200710140849862
Other examples are:
EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862
EGA_SAMPLE_APP-iOS-master-10.2.3_offline-200710140849862
Use the regex, \\d+\\.\\d+\\.\\d+\\_(offline|online)
Demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
// Test strings
String[] arr = { "EGA_SAMPLE_APP-iOS-master-9.1.1_offline-200710140849862",
"EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862",
"EGA_SAMPLE_APP-iOS-master-10.2.3_offline-200710140849862" };
Pattern pattern = Pattern.compile("\\d+\\.\\d+\\.\\d+\\_(offline|online)");
// Print the matching string
for (String s : arr) {
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
}
Output:
9.1.1_offline
9.2.3_online
10.2.3_offline
Explanation of the regex:
\\d+ specifies one or more digits
\\. specifies a .
\\_ specifies a _
(offline|online) specifies offline or online.
[Update]
Based on the edited question i.e. find anything between EGA_SAMPLE_APP-iOS-master- and -An_integer_number: Use the regex, EGA_SAMPLE_APP-iOS-master-(.*)-\\d+
Demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
// Test strings
String[] arr = { "EGA_SAMPLE_APP-iOS-master-9.1.1_offline-200710140849862",
"EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862",
"EGA_SAMPLE_APP-iOS-master-10.2.3_offline-200710140849862",
"EGA_SAMPLE_APP-iOS-master-anything here-200710140849862" };
// Define regex pattern
Pattern pattern = Pattern.compile("EGA_SAMPLE_APP-iOS-master-(.*)-\\d+");
// Print the matching string
for (String s : arr) {
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
}
}
}
Output:
9.1.1_offline
9.2.3_online
10.2.3_offline
anything here
Explanation of the regex:
.* specifies anything, and the parentheses around it specify a capturing group, which I've read with group(1) in the code.
I can suggest the following one line option using String#replaceAll:
String input = "EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862";
String target = input.replaceAll(".*\\b(\\d+\\.\\d+\\.\\d+_(?:online|offline))\\b.*", "$1");
System.out.println(target);
This prints:
9.2.3_online
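One caveat with the replaceAll approach: when the pattern does not match, replaceAll returns the input unchanged rather than signalling a failure. A minimal sketch of a stricter variant that uses find() and returns an Optional (the class and method names are hypothetical, not from the answers above):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VersionTag {
    // Same sub-pattern as in the replaceAll answer, without the surrounding .*
    private static final Pattern TAG =
            Pattern.compile("\\b\\d+\\.\\d+\\.\\d+_(?:online|offline)\\b");

    // Returns the first version tag found, or an empty Optional if there is none.
    static Optional<String> extract(String input) {
        Matcher m = TAG.matcher(input);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862"));
        System.out.println(extract("EGA_SAMPLE_APP-iOS-master-anything here-200710140849862"));
    }
}
```

The caller can then decide what to do when nothing is found, instead of receiving the whole input back.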

Java regex can't work if have \n character

I have a project that detects whether an editor has written HTML entities, but when the text contains \n it doesn't work. Why?
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
public static void main(String[] args) {
String text = "asdasdas <h1>Test</h1></div>";
String regex = ".*<[^&lt]+>.*";
Pattern pattern = Pattern.compile(regex);
Matcher m = pattern.matcher(text);
System.out.println(m.matches());
}
}
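By default, . does not match line terminators such as \n, so .* stops at the newline and matches() fails. If you want . to match \n as well, compile the pattern like this: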
Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
With Pattern.DOTALL, . matches any character, including line terminators.
You can also use Pattern.MULTILINE, which makes ^ and $ match at the start and end of each line instead of only at the start and end of the whole input. MULTILINE on its own does not change what . matches.
The Oracle documentation for java.util.regex.Pattern describes these flags in more detail.
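A minimal sketch of the difference, using the regex from the question unchanged and a sample text with a \n added (the class and method names are made up for illustration):

```java
import java.util.regex.Pattern;

class DotAllCheck {
    // Regex copied unchanged from the question.
    static final String REGEX = ".*<[^&lt]+>.*";

    // Default mode: . does not match \n, so matches() cannot span two lines.
    static boolean matchesDefault(String text) {
        return Pattern.compile(REGEX).matcher(text).matches();
    }

    // DOTALL mode: . matches \n as well.
    static boolean matchesDotAll(String text) {
        return Pattern.compile(REGEX, Pattern.DOTALL).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "asdasdas\n<h1>Test</h1></div>";
        System.out.println(matchesDefault(text)); // false
        System.out.println(matchesDotAll(text));  // true
    }
}
```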

why does this code (extraction of host-name from a URL with regular expression) fail

I'm trying to match a host-name from a url with regex and groups.
I wrote this test in order to simulate the acceptable inputs.
Why does this code fail?
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
public static void main(String[] args)
{
Pattern HostnamePattern = Pattern.compile("^https?://([^/]+)/?", Pattern.CASE_INSENSITIVE);
String[] inputs = new String[]{
"http://stackoverflow.com",
"http://stackoverflow.com/",
"http://stackoverflow.com/path",
"http://stackoverflow.com/path/path2",
"http://stackoverflow.com/path/path2/",
"http://stackoverflow.com/path/path2/?qs1=1",
"https://stackoverflow.com/path",
"https://stackoverflow.com/path/path2",
"https://stackoverflow.com/path/path2/",
"https://stackoverflow.com/path/path2/?qs1=1",
};
for(String input : inputs)
{
Matcher matcher = HostnamePattern.matcher(input);
if(!matcher.matches() || !"stackoverflow.com".equals(matcher.group(1)))
{
throw new Error(input+" fails!");
}
}
}
}
That is because Matcher#matches requires the regex ^https?://([^/]+)/? to match the entire input, and your regex does not account for anything after the optional slash.
You need to use:
matcher.find()
Otherwise your regex will match only the first two input strings: http://stackoverflow.com and http://stackoverflow.com/
Take a look at "http://stackoverflow.com/path". How should your pattern match it? Nothing in the pattern accounts for the trailing path.
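A minimal sketch of the fix with the pattern from the question (the class and method names are hypothetical). find() only needs the pattern to match somewhere, and the ^ anchor still pins it to the start of the input:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HostnameExtractor {
    private static final Pattern HOSTNAME =
            Pattern.compile("^https?://([^/]+)/?", Pattern.CASE_INSENSITIVE);

    // Returns the host part, or null if the input does not start with http:// or https://.
    static String hostOf(String url) {
        Matcher m = HOSTNAME.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    // For comparison: matches() demands that the pattern cover the whole input.
    static boolean matchesWhole(String url) {
        return HOSTNAME.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(hostOf("https://stackoverflow.com/path/path2/?qs1=1")); // stackoverflow.com
        System.out.println(matchesWhole("http://stackoverflow.com/"));             // true
        System.out.println(matchesWhole("http://stackoverflow.com/path"));         // false
    }
}
```

Matcher#lookingAt would also work here, since it matches from the start of the input without requiring the match to reach the end.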

Java RegExp can't get the result after evaluating pattern

Hi, I have been trying to learn regular expressions using Java. I am still at the beginning, and I wanted to start with a little program that is given a string and outputs it split into syllables. This is what I have so far:
String mama = "mama";
Pattern vcv = Pattern.compile("([aeiou][bcdfghjklmnpqrstvwxyz][aeiou])");
Matcher matcher = vcv.matcher(mama);
if (matcher.find()) {
// the result of this should be ma - ma
}
What I am trying to do is create a pattern that checks the letters of the given word and, wherever it finds a vowel/consonant/vowel sequence, inserts a "-" like this: v-cv. How can I achieve this?
In the following example I match the first vowel and use a positive lookahead for the next consonant-vowel group. Because the lookahead does not consume those characters, I can split again if there is a vcvcv group.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
new Test().run();
}
private void run() {
String mama = "mama";
Pattern vcv =
Pattern.compile("([aeiou])(?=[bcdfghjklmnpqrstvwxyz][aeiou])");
Matcher matcher = vcv.matcher(mama);
System.out.println(matcher.replaceAll("$1-"));
String mamama = "mamama";
matcher = vcv.matcher(mamama);
System.out.println(matcher.replaceAll("$1-"));
}
}
Output:
ma-ma
ma-ma-ma
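Try
String result = mama.replaceAll("([aeiou])([....][aeiou])", "$1-$2");
Here [....] stands for the consonant class from the question. replaceAll takes a regular expression, and in Java the replacement string refers to groups as $1 and $2, not \1 and \2. Strings are immutable, so use the returned value.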
If you test it with matches(), your pattern can only succeed when the String starts with a vowel, because matches() applies the pattern to the whole input. To allow arbitrary text before the group, use
Pattern vcv = Pattern.compile (".*([aeiou][bcdfghjklmnpqrstvwxyz][aeiou])");
If you want to ignore the end too:
Pattern vcv = Pattern.compile (".*([aeiou][bcdfghjklmnpqrstvwxyz][aeiou]).*");

Pattern.COMMENTS always causing Matcher.find to fail

The following code matches the two expressions and prints success.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id : [0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
However, I want white space to not matter, so the following should also print success.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id:[0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex, Pattern.COMMENTS);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
The Pattern.COMMENTS flag is supposed to permit white space, but it causes Failure to be printed. It even causes Failure to be printed if the strings are exactly equivalent including white space, like in the first example. For example,
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id : [0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex, Pattern.COMMENTS);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
Prints Failure.
Why is this happening and how do I make the Pattern ignore white space?
There is a misunderstanding here. Pattern.COMMENTS allows you to put additional whitespace into your regex to improve its readability, but that whitespace will NOT be matched against the string.
It does not make whitespace in the input string match automatically when the regex does not define it.
Example
With Pattern.COMMENTS you can put whitespace in your regex like this
String regex = "\\{ user_id: [0-9]+ \\}";
to improve readability, but then it will not match the string
String string = "{user_id : 0}";
because the regex no longer defines the whitespace that appears in the string. So if you want to use Pattern.COMMENTS, you need to treat the whitespace you want to match specially. Either escape it
String regex = "\\{ user_id\\ :\\ [0-9]+ \\}";
or you use the whitespace class
String regex = "\\{ user_id \\s:\\s [0-9]+ \\}";
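If the goal is for whitespace in the input not to matter at all, one option is to put \s* at every position where whitespace may appear. The following is a minimal sketch under that assumption; the class and method names are made up, and Pattern.COMMENTS is used only so the regex itself can be spaced out for readability:

```java
import java.util.regex.Pattern;

class UserIdPattern {
    // \s* allows zero or more whitespace characters at each position.
    // The literal spaces inside the regex are ignored because of Pattern.COMMENTS.
    private static final Pattern USER_ID = Pattern.compile(
            "\\{ \\s* user_id \\s* : \\s* [0-9]+ \\s* \\}", Pattern.COMMENTS);

    // True if the whole string is a user_id entry, with any amount of whitespace.
    static boolean matches(String s) {
        return USER_ID.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("{user_id : 0}"));      // true
        System.out.println(matches("{user_id:0}"));        // true
        System.out.println(matches("{ user_id :  42 }"));  // true
        System.out.println(matches("{user id:0}"));        // false
    }
}
```

The same regex works without Pattern.COMMENTS if the readability spaces are removed: "\\{\\s*user_id\\s*:\\s*[0-9]+\\s*\\}".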
