Parsing String in Java using a Pattern


I am trying to parse out 3 pieces of information from a String.
Here is my code:
String text = "H:7 E:7 P:10";
String pattern = "[HEP]:";
Pattern p = Pattern.compile(pattern);
String[] attr = p.split(text);
I would like it to return:
attr[0] = "7"
attr[1] = "7"
attr[2] = "10"
But all I am getting is:
attr[0] = ""
attr[1] = "7 "
attr[2] = "7 "
attr[3] = "10"
Any suggestions?

A not-so-elegant solution I just devised:
String text = "H:7 E:7 P:10";
String pattern = "[HEP]:";
text = text.replaceAll(pattern, "");
String[] attr = text.split(" ");

From the javadoc, http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#split(java.lang.CharSequence) :
The array returned by this method contains each substring of the input
sequence that is terminated by another subsequence that matches this
pattern or is terminated by the end of the input sequence.
You get the empty string first because the pattern matches at the very beginning of the input, so the substring before that first match is empty.
If I try your code with String text = "A H:7 E:7 P:10", I do indeed get:
A 7 7 10
Hope it helps.
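To make the behaviour described above concrete, here is a minimal sketch. The class name SplitLeadingEmpty and the helper values are hypothetical and not part of the question. It prints the raw result of Pattern.split and then a cleaned-up version that trims each piece and drops empty strings.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitLeadingEmpty {
    private static final Pattern LABEL = Pattern.compile("[HEP]:");

    // Splits on the "H:", "E:" and "P:" labels, trims each piece and
    // drops the empty string produced by the match at index 0.
    static String[] values(String text) {
        return Arrays.stream(LABEL.split(text))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String text = "H:7 E:7 P:10";
        // Raw split: leading empty string plus trailing spaces on two items.
        System.out.println(Arrays.toString(LABEL.split(text))); // [, 7 , 7 , 10]
        // Cleaned result.
        System.out.println(Arrays.toString(values(text)));      // [7, 7, 10]
    }
}
```

Filtering after the split keeps the original pattern untouched; whether that is preferable to matching the digits directly depends on how regular the input is.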

I would write a full regular expression like the following:
Pattern pattern = Pattern.compile("H:(\\d+)\\sE:(\\d+)\\sP:(\\d+)");
Matcher matcher = pattern.matcher("H:7 E:7 P:10");
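if (!matcher.matches()) {
// The input does not have the expected shape; calling group() below would throw IllegalStateException
throw new IllegalArgumentException("Unexpected input format");
}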
String hValue = matcher.group(1);
String eValue = matcher.group(2);
String pValue = matcher.group(3);

Based on your comment, I take it that you only want to get the numbers from that string (in a particular order?).
So I would recommend something like this:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("H:7 E:7 P:10");
while(m.find()) {
System.out.println(m.group());
}
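If the labels matter as well as the numbers, the same find() loop can capture both. The following is only a sketch under that assumption: the class LabelledValues, the method parse and the choice of a LinkedHashMap (to keep the order of appearance) are illustrative, not something the question asked for.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LabelledValues {
    // Group 1 is the label (H, E or P), group 2 the digits after the colon.
    private static final Pattern PAIR = Pattern.compile("([HEP]):(\\d+)");

    static Map<String, Integer> parse(String text) {
        Map<String, Integer> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(text);
        while (m.find()) {
            result.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("H:7 E:7 P:10")); // {H=7, E=7, P=10}
    }
}
```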

Related

Match everything after and before something regex Java

Here is my code:
String stringToSearch = "https://example.com/excludethis123456/moretext";
Pattern p = Pattern.compile("(?<=.com\\/excludethis).*\\/"); //search for this pattern
Matcher m = p.matcher(stringToSearch); //match pattern in StringToSearch
String store= "";
// print match and store match in String Store
if (m.find())
{
String theGroup = m.group(0);
System.out.format("'%s'\n", theGroup);
store = theGroup;
}
//repeat the process
Pattern p1 = Pattern.compile("(.*)[^\\/]");
Matcher m1 = p1.matcher(store);
if (m1.find())
{
String theGroup = m1.group(0);
System.out.format("'%s'\n", theGroup);
}
I want to match everything that comes after excludethis and before the next /.
With the "(?<=.com\\/excludethis).*\\/" regex I match 123456/ and store that in String store. After that, with "(.*)[^\\/]", I exclude the / and get 123456.
Can I do this in one line, i.e. combine these two regexes? I can't figure out how to combine them.

Just as you used a positive lookbehind, you can use a positive lookahead and change your regex to this:
(?<=.com/excludethis).*(?=/)
Also, in Java you don't need to escape /.
Your modified code:
String stringToSearch = "https://example.com/excludethis123456/moretext";
Pattern p = Pattern.compile("(?<=.com/excludethis).*(?=/)"); // search for this pattern
Matcher m = p.matcher(stringToSearch); // match pattern in StringToSearch
String store = "";
// print match and store match in String Store
if (m.find()) {
String theGroup = m.group(0);
System.out.format("'%s'\n", theGroup);
store = theGroup;
}
System.out.println("Store: " + store);
Prints:
'123456'
Store: 123456
This is the value you wanted to capture.
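One caveat with the lookaround pattern above: .* is greedy, so if the URL contains more than one / after the value, the match runs up to the last one. A hedged variant is sketched below; it uses [^/]+ instead of .* and escapes the dot in .com. The class name BetweenMarkers and the second test URL are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenMarkers {
    // [^/]+ stops at the first slash; \\. matches a literal dot.
    private static final Pattern VALUE =
            Pattern.compile("(?<=\\.com/excludethis)[^/]+(?=/)");

    static Optional<String> extract(String url) {
        Matcher m = VALUE.matcher(url);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String twoSlashes = "https://example.com/excludethis123456/more/text";

        // The greedy version runs up to the last slash.
        Matcher greedy = Pattern.compile("(?<=.com/excludethis).*(?=/)").matcher(twoSlashes);
        if (greedy.find()) {
            System.out.println(greedy.group()); // 123456/more
        }

        System.out.println(extract(twoSlashes).orElse("(no match)")); // 123456
    }
}
```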

This may be useful for you :)
String stringToSearch = "https://example.com/excludethis123456/moretext";
Pattern pattern = Pattern.compile("excludethis([\\d\\D]+?)/");
Matcher matcher = pattern.matcher(stringToSearch);
if (matcher.find()) {
String result = matcher.group(1);
System.out.println(result);
}

If you don't want to use regex, you could just try with String::substring*
String stringToSearch = "https://example.com/excludethis123456/moretext";
String exclusion = "excludethis";
System.out.println(stringToSearch.substring(stringToSearch.indexOf(exclusion)).substring(exclusion.length(), stringToSearch.substring(stringToSearch.indexOf(exclusion)).indexOf("/")));
Output:
123456
* Definitely don't actually use this

How can I replace this?

How can I replace this
String str = "KMMH12DE1433";
String pattern = "^[a-z]{2}([0-9]{2})[a-z]{1,2}([0-9]{4})$";
String str2 = str.replaceAll(pattern, "repl");
Log.e("Founded_words2",str2);
What I got: KMMH12DE1433
What I want: MH12DE1433

Try it like this using a proper java.util.regex.Pattern and a java.util.regex.Matcher:
String str = "KMMH12DE1433";
//Make the pattern, case-insensitive using (?i)
Pattern pattern = Pattern.compile("(?i)[a-z]{2}([0-9]{2})[a-z]{1,2}([0-9]{4})");
//Create the Matcher
Matcher m = pattern.matcher(str);
//Check if we find anything
if(m.find()) {
//Use what you found - with proper capturing groups you
//gain access to parts of your pattern as needed
System.out.println("Found this: " + m.group());
}

If you just want to remove the first two characters and if the first two characters will always be uppercase letters:
String str = "KMMH12DE1433";
String pattern = "^[A-Z]{2}";
String str2 = str.replaceAll(pattern, "");
Log.e("Output string: ", str2);
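For reference, a small plain-Java sketch of why the original call returned the string unchanged and what the prefix-stripping pattern does. It uses System.out instead of Android's Log, and the class and method names are hypothetical.

```java
class StripPrefix {
    // Removes exactly two leading uppercase ASCII letters, if present.
    static String stripTwoLetterPrefix(String s) {
        return s.replaceFirst("^[A-Z]{2}", "");
    }

    public static void main(String[] args) {
        String str = "KMMH12DE1433";

        // The question's pattern is lowercase-only and anchored at both ends,
        // so it cannot match this input and replaceAll returns it unchanged.
        String unchanged = str.replaceAll("^[a-z]{2}([0-9]{2})[a-z]{1,2}([0-9]{4})$", "repl");
        System.out.println(unchanged);                 // KMMH12DE1433

        System.out.println(stripTwoLetterPrefix(str)); // MH12DE1433
    }
}
```

Because the pattern is anchored with ^, replaceFirst and replaceAll behave the same here; replaceFirst just states the intent more directly.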

Try this:
String a = "KMMH12DE1433";
String pattern = "^[A-Z]{2}";
String rs = a.replaceAll(pattern,"");

Please change it like this:
String ans = str.substring(2); // drops the first two characters

How to extract id from url ? Google sheet

I have the following URLs.
https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258
https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258
https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
For each URL, I need to extract the sheet ID (for example 1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY) into a Java String.
I am thinking of using split, but it doesn't work for all test cases:
String string = "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258";
String[] parts = string.split("/");
String res = parts[parts.length-2];
Log.d("hello res",res );
How can I make this work?

You can use the regex \/d\/(.*?)(\/|$) to solve your problem. If you look closely, the ID always sits between d/ and either the next / or the end of the line, so you can capture everything in between:
String[] urls = new String[]{
"https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258",
"https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258",
"https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY"
};
String regex = "\\/d\\/(.*?)(\\/|$)";
Pattern pattern = Pattern.compile(regex);
for (String url : urls) {
Matcher matcher = pattern.matcher(url);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
}
Outputs
1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY
1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
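The same idea can be wrapped in a small helper that makes the no-match case explicit. This is a sketch that assumes the ID is always the path segment directly after /spreadsheets/d/; the class SheetId and the method from are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SheetId {
    // Assumption: the ID is the path segment right after "/spreadsheets/d/".
    // [^/?#]+ stops at the next slash, query string or fragment.
    private static final Pattern ID = Pattern.compile("/spreadsheets/d/([^/?#]+)");

    static Optional<String> from(String url) {
        Matcher m = ID.matcher(url);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "https://docs.google.com/a/example.com/spreadsheets/d/"
                + "1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258";
        System.out.println(from(url).orElse("(no id found)"));
    }
}
```

Anchoring on the longer /spreadsheets/d/ prefix rather than /d/ alone reduces the chance of matching an unrelated /d/ segment elsewhere in the URL.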

It looks like the ID you are looking for always follows "/spreadsheets/d/". If that is the case, you can update your code like this:
String string = "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258";
String[] parts = string.split("spreadsheets/d/");
String result;
if(parts[1].contains("/")){
String[] parts2 = parts[1].split("/");
result = parts2[0];
}
else{
result=parts[1];
}
System.out.println("hello "+ result);

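Using regex
Pattern pattern = Pattern.compile("(?<=/d/)[^/]*");
Matcher matcher = pattern.matcher(url);
// The pattern has no capturing group, so read the whole match with group() after find()
if (matcher.find()) {
System.out.println(matcher.group());
}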
Using Java
String result = url.substring(url.indexOf("/d/") + 3);
int slash = result.indexOf("/");
result = slash == -1 ? result
: result.substring(0, slash);
System.out.println(result);

Google uses fixed-length IDs. In your case they are 44 characters long, and the characters Google uses are alphanumerics, - and _, so you can use this regex (shown here in Python):
import re
regex = r"[\w-]{44}"
match = re.search(regex, url)

regex for first character before space

I am trying to extract "d320" from the string below using a regex in Java, with the following code:
n-us; micromax d320 build/kot49h)
String m = "n-us; micromax d320 build/kot49h) ";
String pattern = "micromax (.*)(\\d\\D)(.*) ";
Pattern r = Pattern.compile(pattern);
Matcher m1 = r.matcher(m);
if (m1.find()) {
System.out.println(m1.group(1));
}
But it gives me the output "d320 build/kot4"; I want only d320.

Try to use micromax\\s(.*?)\\s like this:
String m = "n-us; micromax d320 build/kot49h) ";
String pattern = "micromax\\s(.*?)\\s";
Pattern r = Pattern.compile(pattern);
Matcher m1 = r.matcher(m);
if (m1.find()) {
System.out.println(m1.group(1));
}
Output:
d320

It's not clear whether you want the word after "micromax", or the word that starts with a letter and has all digits afterward, so here are both solutions:
To extract the word following "micromax":
String code = m.replaceAll(".*micromax\\s+(\\w+)?.*", "$1");
To extract the word that looks like "x9999":
String code = m.replaceAll(".*?\\b([a-z]\\d+)\\b.*", "$1");
Note that replaceAll returns the input unchanged when the pattern does not match at all, so check for that case if the input may not contain the word.
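A Matcher-based variant makes the no-match case explicit instead of relying on what replaceAll returns. This is a sketch; the class ModelAfterBrand and the method model are hypothetical names, and it assumes the wanted token is the run of word characters right after "micromax".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ModelAfterBrand {
    // Group 1 is the word that follows "micromax" and some whitespace.
    private static final Pattern MODEL = Pattern.compile("micromax\\s+(\\w+)");

    static Optional<String> model(String userAgent) {
        Matcher m = MODEL.matcher(userAgent);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(model("n-us; micromax d320 build/kot49h) ")); // Optional[d320]
        System.out.println(model("n-us; something else"));              // Optional.empty
    }
}
```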

regex extract string between two characters

I would like to extract the strings between the following characters in the given string using regex in Java:
/*
1) Between \" and \" ===> 12222222222
2) Between :+ and # ===> 12222222222
3) Between # and > ===> 192.168.140.1
*/
String remoteUriStr = "\"+12222222222\" <sip:+12222222222#192.168.140.1>";
String regex1 = "\"(.+?)\"";
String regex2 = ":+(.+?)#";
String regex3 = "#(.+?)>";
Pattern p = Pattern.compile(regex1);
Matcher matcher = p.matcher(remoteUri);
if (matcher.matches()) {
title = matcher.group(1);
}
I am using the code snippet above, but it's not able to extract the strings that I want. Am I doing anything wrong? I am quite new to regex.

The matches() method attempts to match the regular expression against the entire string. If you want to match a part of the string, you want the find() method:
if (matcher.find())
You could, however, build a single regular expression to match all three parts at once:
String regex = "\"(.+?)\" \\<sip:\\+(.+?)#(.+?)\\>";
Pattern p = Pattern.compile(regex);
Matcher matcher = p.matcher(remoteUriStr);
if (matcher.matches()) {
title = matcher.group(1);
part2 = matcher.group(2);
ip = matcher.group(3);
}
Demo: http://ideone.com/8t2EC
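The difference between matches() and find() can be checked on the question's input with a short sketch. The class and helper names are hypothetical. The second pattern escapes the + so that it matches a literal plus sign, whereas the question's ":+" means "one or more colons".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // True only if the regex covers the entire input.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Group 1 of the first occurrence anywhere in the input, or null.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "\"+12222222222\" <sip:+12222222222#192.168.140.1>";
        System.out.println(wholeMatch("\"(.+?)\"", s));  // false
        System.out.println(firstGroup("\"(.+?)\"", s));  // +12222222222
        System.out.println(firstGroup(":\\+(.+?)#", s)); // 12222222222
        System.out.println(firstGroup("#(.+?)>", s));    // 192.168.140.1
    }
}
```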

If your input always looks like that and you always want the same parts from it you can put that in a single regex (with multiple capturing groups):
"([^"]+)" <sip:([^#]+)#([^>]+)>
So you can then use
Pattern p = Pattern.compile("\"([^\"]+)\" <sip:([^#]+)#([^>]+)>");
Matcher m = p.matcher(remoteUriStr);
if (m.find()) {
String s1 = m.group(1);
String s2 = m.group(2);
String s3 = m.group(3);
}
