regex matcher check in if logic not working - java

Hi, you can see my code below. I have some strings country, Rank and Grank in my code. Initially they are null, but if the regex matches, the value should change. Even when the regex matches, the value does not change; it is always null. If I remove all the if statements and assign the string directly it works fine, but when no match is found it throws an exception. Please let me know how I can check this in the if logic.
System.err.println(content);
Pattern c = Pattern.compile("NAME=\"(.*)\" RANK");
Pattern r = Pattern.compile("\" RANK=\"(.*)\"");
Pattern gr = Pattern.compile("\" TEXT=\"(.*)\" SOURCE");
Matcher co = c.matcher(content);
Matcher ra = r.matcher(content);
Matcher gra = gr.matcher(content);
co.find();
ra.find();
gra.find();
String country = null;
String Rank = null;
String Grank = null;
if (co.matches()) {
    country = co.group(1);
}
if (ra.matches()) {
    Rank = ra.group(1);
}
if (gra.matches()) {
    Grank = gra.group(1);
}

You have to escape a single \ - use double \\ then it should work.
Tried this?
while (co.find()) {
    System.out.print("Start index: " + co.start());
    System.out.print(" End index: " + co.end() + " ");
    System.out.println(co.group());
}

Personally, I can't make your program work with or without the if statements, so it's not a logic problem; the pattern simply doesn't match the string for me.
So I changed it to get something working; maybe you can use it :)
String content = "NAME=\"salut\" RANK=\"pouet\" TEXT=\"text\" SOURCE";
System.out.println(content);
System.out.println(content.replaceAll(("NAME=\"(.*)\"\\sRANK=\"(.*)\"\\sTEXT=\"(.*)\" SOURCE"), "$1---$2---$3"));
Output
NAME="salut" RANK="pouet" TEXT="text" SOURCE
salut---pouet---text
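A likely cause of the behaviour in the question: matches() only succeeds when the entire input matches the pattern, whereas find() looks for the pattern anywhere in the input. Calling find() and then testing matches() therefore leaves the variables null. Below is a minimal sketch of guarding group(1) with find() instead. The class name, the helper firstGroup and the sample content string are assumptions for illustration (the question does not show the real input), and the reluctant (.*?) is swapped in so that each group stops at the next quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsMatches {
    // Returns capture group 1 of the first match, or null when the pattern is not found.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // find() searches anywhere in the input; matches() would require
        // the whole input to match the pattern.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical input; the question does not show the real content.
        String content = "NAME=\"India\" RANK=\"12\" TEXT=\"3\" SOURCE";
        String country = firstGroup("NAME=\"(.*?)\" RANK", content);
        String rank = firstGroup("\" RANK=\"(.*?)\"", content);
        String grank = firstGroup("\" TEXT=\"(.*?)\" SOURCE", content);
        System.out.println(country + " / " + rank + " / " + grank);
    }
}
```

Because the helper returns null when nothing is found, no exception is thrown for inputs that lack one of the attributes.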

Related

String Manipulation in java 1.6

String can be like below. Using java1.6
String example = "<number>;<name-value>;<name-value>";
String abc = "+17005554141;qwq=1234;ddd=ewew;otg=383";
String abc = "+17005554141;qwq=123454";
String abc = "+17005554141";
I want to remove qwq=<value> from the string if it is present. The key qwq is fixed, but its value can vary, for example 1234 or 12345.
Expected result:
String abc = "+17005554141;ddd=ewew;otg=383";
String abc = "+17005554141"; // removed ;qwq=123454
String abc = "+17005554141";
I tried
abc = abc.replaceAll(";qwq=.*;", "");
but it does not work.
I came up with this: qwq=\d*\;? and it works. It matches zero or more digits after qwq=. The trailing ; is optional, since your examples show that it is not always present after the number.
I know the question is not about JavaScript, but here's an example where you can see the regex working:
const regex = /qwq=\d*\;?/g;
var items = ["+17005554141;qwq=123454",
             "+17005554141",
             "+17005554141;qwq=1234;ddd=ewew;otg=383"];
for (let i = 0; i < items.length; i++) {
    console.log("Item before replace: " + items[i]);
    console.log("Item after replace: " + items[i].replace(regex, "") + "\n\n");
}
You can remove that kind of substring with a regex like this:
String example = "+17005554141;qwq=1234;ddd=ewew;otg=383";
System.out.println("Before: " + example);
System.out.println("After: " + example.replaceAll("qwq=\\d+;?", ""));
This gives the following output:
Before: +17005554141;qwq=1234;ddd=ewew;otg=383
After: +17005554141;ddd=ewew;otg=383
.* matches any run of characters, not just digits. Use something that matches only a run of digits:
abc.replaceAll(";qwq=\\d+", "")
Here \\d+ matches any number (one or more digits).
please try
abc = abc.replaceAll("qwq=[0-9]*;", "");
If convenience is not a priority, you can achieve this with plain String operations (indexOf, replace and substring). This is probably the most old-fashioned way to do it:
private static String replaceQWQ(String target)
{
    if (target.indexOf("qwq=") != -1) {
        if (target.indexOf(';', target.indexOf("qwq=")) != -1) {
            String replace =
                target.substring(target.indexOf("qwq="), target.indexOf(';', target.indexOf("qwq=")) + 1);
            target = target.replace(replace, "");
        } else {
            target = target.substring(0, target.indexOf("qwq=") - 1);
        }
    }
    return target;
}
Small test:
String abc = "+17005554141;qwq=1234;ddd=ewew;otg=383";
String def = "+17005554141;qwq=1234";
System.out.println(replaceQWQ(abc));
System.out.println(replaceQWQ(def));
outputs:
+17005554141;ddd=ewew;otg=383
+17005554141
Another one:
abc.replaceAll(";qwq=[^;]*;", ";");
You can use groups in the replaceAll method.
Here is an example:
abc.replaceAll("(.*;)(qwq=\\d*;)(.*)", "$1$3");
You can find more about groups at: http://www.vogella.com/tutorials/JavaRegularExpressions/article.html
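The three inputs from the question can be handled by one pattern that removes the leading semicolon together with the pair. This is only a sketch: the class and method names are hypothetical, and [^;]* (anything up to the next semicolon) is an assumption about what a value may contain.

```java
class RemoveQwq {
    // Removes a ";qwq=<value>" pair wherever it appears after the number.
    static String removeQwq(String input) {
        // [^;]* stops at the next ';' (or at the end of input), so later pairs are kept.
        return input.replaceAll(";qwq=[^;]*", "");
    }

    public static void main(String[] args) {
        System.out.println(removeQwq("+17005554141;qwq=1234;ddd=ewew;otg=383"));
        System.out.println(removeQwq("+17005554141;qwq=123454"));
        System.out.println(removeQwq("+17005554141"));
    }
}
```

Matching the semicolon in front of the pair rather than behind it avoids leaving a dangling separator when qwq is the last entry.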

Two separate patterns and matchers (java)

I'm working on a simple bot for Discord. The first pattern (reading) works fine and I get the results I'm looking for, but the second one doesn't seem to work and I can't figure out why.
Any help would be appreciated
public void onMessageReceived(MessageReceivedEvent event) {
    if (event.getMessage().getContent().startsWith("!")) {
        String output, newUrl;
        String word, strippedWord;
        String url = "http://jisho.org/api/v1/search/words?keyword=";
        Pattern reading;
        Matcher matcher;
        word = event.getMessage().getContent();
        strippedWord = word.replace("!", "");
        newUrl = url + strippedWord;
        //Output contains the raw text from jisho
        output = getUrlContents(newUrl);
        //Searching through the raw text to pull out the first "reading: "
        reading = Pattern.compile("\"reading\":\"(.*?)\"");
        matcher = reading.matcher(output);
        //Searching through the raw text to pull out the first "english_definitions: "
        Pattern def = Pattern.compile("\"english_definitions\":[\"(.*?)]");
        Matcher matcher2 = def.matcher(output);
        event.getTextChannel().sendMessage(matcher2.toString());
        if (matcher.find() && matcher2.find()) {
            event.getTextChannel().sendMessage("Reading: "+matcher.group(1)).queue();
            event.getTextChannel().sendMessage("Definition: "+matcher2.group(1)).queue();
        }
        else {
            event.getTextChannel().sendMessage("Word not found").queue();
        }
    }
}
You have to escape the [ character as \\[ (once for the Java String and once for the regex). You also forgot the closing \".
The correct pattern looks like this:
Pattern def = Pattern.compile("\"english_definitions\":\\[\"(.*?)\"]");
In the output, you might want to re-add the \" at the start and end:
event.getTextChannel().sendMessage("Definition: \""+matcher2.group(1) + "\"").queue();
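One thing to watch with the corrected pattern: when the array holds several definitions, (.*?)\"] keeps going until the quote just before the closing bracket, so the group contains all of them. The sketch below uses a variation, [^\"]*, to capture only the first definition. The class name, method name and the JSON fragment are hypothetical; the fragment is not real API output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstDefinition {
    // \\[ matches a literal '['; [^\"]* captures everything up to the next double quote.
    private static final Pattern DEF =
            Pattern.compile("\"english_definitions\":\\[\"([^\"]*)\"");

    // Returns the first definition, or null when the key is not present.
    static String firstDefinition(String json) {
        Matcher m = DEF.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical response fragment for illustration only.
        String json = "{\"reading\":\"neko\",\"english_definitions\":[\"cat\",\"feline\"]}";
        System.out.println(firstDefinition(json));
    }
}
```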

complex regular expression in Java

I have a rather complex (to me it seems rather complex) problem that I'm using regular expressions in Java for:
I can get any text string that must be of the format:
M:<some text>:D:<either a url or string>:C:<some more text>:Q:<a number>
I started with a regular expression for extracting the text between the M:/:D:/:C:/:Q: as:
String pattern2 = "(M:|:D:|:C:|:Q:.*?)([a-zA-Z_\\.0-9]+)";
And that works fine if the <either a url or string> is just an alphanumeric string. But it all falls apart when the embedded string is a url of the format:
tcp://someurl.something:port
Can anyone help me adjust the above regular expression so that the text after :D: can be either a URL or an alphanumeric string?
Here's an example:
public static void main(String[] args) {
    String name = "M:myString1:D:tcp://someurl.com:8989:C:myString2:Q:1";
    boolean matchFound = false;
    ArrayList<String> values = new ArrayList<>();
    String pattern2 = "(M:|:D:|:C:|:Q:.*?)([a-zA-Z_\\.0-9]+)";
    Matcher m3 = Pattern.compile(pattern2).matcher(name);
    while (m3.find()) {
        matchFound = true;
        String m = m3.group(2);
        System.out.println("regex found match: " + m);
        values.add(m);
    }
}
In the above example, my results would be:
myString1
tcp://someurl.com:8989
myString2
1
Note that the strings can be of variable length and alphanumeric, while also allowing some other characters (such as the URL format with :// and/or the . and - characters).
You mention that the format is constant:
M:<some text>:D:<either a url or string>:C:<some more text>:Q:<a number>
Capture groups can do this for you with the pattern:
"M:(.*):D:(.*):C:(.*):Q:(.*)"
Or you can do a String.split() with a pattern of "M:|:D:|:C:|:Q:". However, the split will return an empty element at the first index. Everything else will follow.
public static void main(String[] args) throws Exception {
    System.out.println("Regex: ");
    String data = "M:<some text>:D:tcp://someurl.something:port:C:<some more text>:Q:<a number>";
    Matcher matcher = Pattern.compile("M:(.*):D:(.*):C:(.*):Q:(.*)").matcher(data);
    if (matcher.matches()) {
        for (int i = 1; i <= matcher.groupCount(); i++) {
            System.out.println(matcher.group(i));
        }
    }
    System.out.println();
    System.out.println("String.split(): ");
    String[] pieces = data.split("M:|:D:|:C:|:Q:");
    for (String piece : pieces) {
        System.out.println(piece);
    }
}
Results:
Regex:
<some text>
tcp://someurl.something:port
<some more text>
<a number>
String.split():
<some text>
tcp://someurl.something:port
<some more text>
<a number>
To extract the URL/text part you don't need the regular expression. Use
int startPos = input.indexOf(":D:")+":D:".length();
int endPos = input.indexOf(":C:", startPos);
String urlOrText = input.substring(startPos, endPos);
Assuming you need to do some validation along with the parsing, break the regex into different parts like this:
String m_regex = "[\\w.]+"; // in Java, a . inside [] is just a plain dot
String url_regex = "."; // placeholder: there are plenty of URL regexes online, pick your favorite
String d_regex = "(?:" + url_regex + "|\\p{Alnum}+)"; // a URL or a sequence of alphanumeric characters
String c_regex = "[\\w.]+"; // I'm assuming you want this to be a bit more restrictive; not sure
String q_regex = "\\d+"; // what sort of number exactly? assuming any string of digits here
// Each named group needs a unique name; reusing one name makes Pattern.compile throw.
String regex = "M:(?<M>" + m_regex + "):"
        + "D:(?<D>" + d_regex + "):"
        + "C:(?<C>" + c_regex + "):"
        + "Q:(?<Q>" + q_regex + ")";
Pattern p = Pattern.compile(regex);
Might be a good idea to keep the pattern as a static field somewhere and compile it in a static block so that the temporary regex strings don't overcrowd some class with basically useless fields.
Then you can retrieve each part by its name:
Matcher m = p.matcher( input );
if (m.matches()) {
    String m_part = m.group( "M" );
    ...
    String q_part = m.group( "Q" );
}
You can go a step further by making a RegexGroup interface whose implementing objects each represent one part of the regex, with a name and the actual regex. You would lose the simplicity, though, which makes the code harder to understand at a quick glance. (I wouldn't do this; I'm just pointing out that it's possible and has its own benefits.)
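The named-group approach can be put together into a small runnable sketch. Named groups need Java 7 or later. The class name, the parse method and the choice of a reluctant .+? for the :D: part (instead of a real URL regex) are assumptions made for illustration; the sample input is the one from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MessageParser {
    // The D part uses a reluctant .+? so it may contain ':' and '/' and still
    // stop at the ":C:" marker; this stands in for a proper URL regex.
    private static final Pattern FORMAT = Pattern.compile(
            "M:(?<M>[\\w.]+):D:(?<D>.+?):C:(?<C>[\\w.]+):Q:(?<Q>\\d+)");

    // Returns the four parts keyed by marker letter, or null if the input is malformed.
    static Map<String, String> parse(String input) {
        Matcher m = FORMAT.matcher(input);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> parts = new LinkedHashMap<>();
        for (String name : new String[] {"M", "D", "C", "Q"}) {
            parts.put(name, m.group(name));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("M:myString1:D:tcp://someurl.com:8989:C:myString2:Q:1"));
    }
}
```

Keeping the compiled pattern in a static final field follows the suggestion above about not recompiling it on every call.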

How to get a part data from a filename in Java?

How can I get part of the data from this string:
csvFile = "c:/Users//PHV/01Surname local.csv"
I need to extract Surname from the above string.
UPD
What do you think about this?
File f = new File(csvFile);
String[] parts = f.getName().split(" ");
String strParts = new String(parts[0]);
String finFileName = strParts.substring(2, strParts.length());
You need a regular expression. Something like:
Pattern p = Pattern.compile("^.*/[0-9]+([a-zA-Z]+) .*"); // group 1: the letters after the leading digits
Matcher m = p.matcher(csvFile);
String surname;
if (m.matches()) {
    surname = m.group(1);
} else {
    System.out.println("filename seems malformed: " + csvFile);
}
UPDATE: Here is a tutorial on regular expressions. I'm not sure it is the best one, but it should work for you: http://docs.oracle.com/javase/tutorial/essential/regex/
I'm not sure I understand your question, but I assume you want to extract "Surname". If that's correct, please try this:
String surname = csvFile.substring(csvFile.lastIndexOf("/") + 3, csvFile.lastIndexOf(" "));
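For comparison, here is a minimal runnable sketch of the regex route that returns null instead of leaving the variable unassigned when the name does not fit. It assumes the file name always has the shape digits, letters, a space, then the rest, which is inferred from the single example in the question; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SurnameExtractor {
    // After the last '/': leading digits, the letters to capture, a space, then the rest.
    private static final Pattern NAME = Pattern.compile(".*/\\d+([A-Za-z]+) [^/]*");

    // Returns the captured letters, or null when the path does not have the assumed shape.
    static String extractSurname(String path) {
        Matcher m = NAME.matcher(path);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractSurname("c:/Users//PHV/01Surname local.csv"));
    }
}
```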

Regular expression on a string

I have a String like the one below:
String phone = "(123) 456-7890";
Now I would like my program to verify that my input follows the same pattern as the string 'phone'.
I did the following:
if(phone.contains("([0-9][0-9][0-9]) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]")) {
    //display pass
}
else {
    //display fail
}
It didn't work. I tried other combinations too. Nothing worked.
Question :
1. How can I achieve this without using 'Pattern' like above?
2. How to do this with pattern. I tried with pattern as below
Pattern pattern = Pattern.compile("(\d+)");
Matcher match = pattern.matcher(phone);
if (match.find()) {
//Displaypass
}
String#matches checks if a string matches a pattern:
if (phone.matches("\\(\\d{3}\\) \\d{3}-\\d{4}")) {
//Displaypass
}
The pattern is a regular expression. Therefore I had to escape the round brackets, as they have a special meaning in regex (they denote capturing groups).
contains() only checks if a string contains the substring passed to it.
I'm not going to dive too deeply into regex syntax, but there definitely is something off with your regex.
"([0-9][0-9][0-9]) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]"
It contains ( and ), and those have a special meaning. Escape them:
"\([0-9][0-9][0-9]\) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]"
You'll also have to escape each \ for the Java string literal, which gives the final version:
"\\([0-9][0-9][0-9]\\) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]"
You can write it like this:
Pattern pattern = Pattern.compile("\\(\\d{3}\\) \\d{3}-\\d{4}");
Matcher matcher = pattern.matcher(sPhoneNumber);
if (matcher.matches()) {
    System.out.println("Phone Number Valid");
}
For more information you can visit this article.
It appears that your problem is that you didn't escape the parentheses, so your Regex is failing. Try this:
\([0-9][0-9][0-9]\) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]
This works:
String PHONE_REGEX = "[(]\\b[0-9]{3}\\b[)][ ]\\b[0-9]{3}\\b[-]\\b[0-9]{4}\\b";
String phone1 = "(1234) 891-6762";
Boolean b = phone1.matches(PHONE_REGEX);
System.out.println("is phone: " + phone1 + " :Valid = " + b);
String phone2 = "(143) 456-7890";
b = phone2.matches(PHONE_REGEX);
System.out.println("is phone: " + phone2 + " :Valid = " + b);
Output:
is phone: (1234) 891-6762 :Valid = false
is phone: (143) 456-7890 :Valid = true
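To make the difference between contains() and a regex match concrete, here is a small sketch. It reuses the pattern from the answers above; the class name, the isValidPhone helper and the decision to keep the Pattern in a static field are illustrative choices, not something the thread prescribes.

```java
import java.util.regex.Pattern;

class PhoneCheck {
    // Compiled once and reused; \\( and \\) are literal parentheses.
    private static final Pattern PHONE = Pattern.compile("\\(\\d{3}\\) \\d{3}-\\d{4}");

    // True only when the whole string has the form (ddd) ddd-dddd.
    static boolean isValidPhone(String s) {
        return PHONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String phone = "(123) 456-7890";
        // contains() treats its argument as plain text, not as a pattern.
        System.out.println(phone.contains("(123)"));
        System.out.println(phone.contains("\\(\\d{3}\\)"));
        System.out.println(isValidPhone(phone));
        System.out.println(isValidPhone("(1234) 891-6762"));
    }
}
```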
