parsing string with substring - java

What is the best way to parse a string?
Example:
SomeName_Some1_Name2_SomeName3
I want to extract SomeName. What is the best way to do it? With substring and calculating positions, or is there a better way?

You can match the pattern SomeName to extract it:
String str= "SomeName_Some1_Name2_SomeName3";
Pattern ptrn= Pattern.compile("SomeName");
Matcher matcher = ptrn.matcher(str);
while(matcher.find()){
System.out.println(matcher.group());
}

Split it by underscore _ using method split()
Get index # 0 from returning array from previous step

If you know the delimiter then you can just try this:
System.out.println("SomeName_Some1_Name2_SomeName3".split("_")[0]);
See also: Javadoc of String.split()

It depends on your situation and whether you're interested in the other fields.
If you are, split the string on the _ separator.
If you just want one part of the string, I'd use substring in combination with indexOf('_').
If you want all occurrences, you could also find all occurrences of 'SomeName' in your text.

Use regex and Pattern Matcher API to get SomeName.

Here you go:
String str = "SomeName_Some1_Name2_SomeName3";
String newStr = str.substring(0, str.indexOf("_"));
System.out.println(newStr);
Output:
SomeName
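One caveat with the substring approach: indexOf returns -1 when the delimiter is missing, and substring(0, -1) then throws StringIndexOutOfBoundsException. A minimal guarded sketch follows; the class and method names (FirstField, firstField) are made up for illustration, and returning the whole input when no delimiter is found is an assumption about what you want in that case.

```java
final class FirstField {
    // Returns the text before the first occurrence of the delimiter,
    // or the whole input when the delimiter does not occur (assumed behaviour).
    static String firstField(String input, char delimiter) {
        int end = input.indexOf(delimiter);
        return end < 0 ? input : input.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(firstField("SomeName_Some1_Name2_SomeName3", '_')); // SomeName
        System.out.println(firstField("NoDelimiterHere", '_'));                // NoDelimiterHere
    }
}
```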

String your_String = "SomeName_Some1_Name2_SomeName3";
your_String = your_String.split("_")[0];
Log.v("log","your string "+ your_String);

String str = "SomeName_Some1_Name2_SomeName3";
String output = str.split ( "_" ) [ 0 ];
you will get your output as SomeName.

Related

Need Regex to replace all characters between first set of parenthesis from a string

I've been able to generate a regex to pull everything that is between parenthesis in a string, but I'm unclear on how to make it only happen once and only with the first set. In JAVA:
My current pattern = "\\(([^)]+)\\)"
Any help would be greatly appreciated.

Use replaceFirst instead of replaceAll
OR if you must use replaceAll let it consume rest of your string and put it back again like
replaceAll("yourRegex(.*)","yourReplacement$1");
where $1 represents match from first group (.*).
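As a rough sketch of the replaceFirst suggestion, using the pattern from the question: the class and method names are hypothetical, and the sketch assumes the parentheses themselves should be replaced together with their content. Matcher.quoteReplacement keeps any $ or \ in the replacement text literal.

```java
import java.util.regex.Matcher;

final class FirstParens {
    // Replaces only the first "(...)" group, parentheses included.
    // Later groups are left untouched because replaceFirst stops after one match.
    static String replaceFirstParens(String input, String replacement) {
        return input.replaceFirst("\\(([^)]+)\\)", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceFirstParens("a (b) c (d)", "[x]")); // a [x] c (d)
    }
}
```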

try:
String x= "Hie(Java)";
Matcher m = Pattern.compile("\\((.*?)\\)").matcher(x);
while(m.find()) {
System.out.println(m.group(1));
}
or
String str = "Hie(Java)";
String answer = str.substring(str.indexOf("(")+1,str.indexOf(")"));
for last index:
update with
String answer = str.substring(str.indexOf("(")+1,str.lastIndexOf(")"));

How to fix the following RegEx?

I've the following piece of code. I use the Case-insensitve pattern modifier so it will find any occurrence, but what I want is the replacement to be exactly the chars that matched the pattern, keeping the case. How could I fix this?
String str = "Ten tender tEens";
String substr = "te";
str = str.replaceAll("(?i)"+substr, "("+substr+")");
System.out.println( str );
Desired output:
(Te)n (te)nder (tE)ens
Received output:
(te)n (te)nder (te)ens

replaceAll() works the same way as matcher(string).replaceAll(replacement).
To see how it works, you can break the code into steps, capturing the match in a group and referring to it as $1 in the replacement:
String str = "Ten tender tEens";
Pattern pattern = Pattern.compile("(?i)(te)");
Matcher matcher = pattern.matcher(str);
System.out.println(matcher.replaceAll("($1)"));
Combining these steps, you can use the following (it does the same):
String substr = "te";
str = str.replaceAll("(?i)("+substr+")", "($1)");

You have to use
str = str.replaceAll("(?i)("+substr+")", "($1)");
This creates a capturing group and puts the matched text, with its original case, back via $1.

You need to use capturing group.
str = str.replaceAll("(?i)("+substr+")", "($1)");
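The answers above concatenate substr straight into the regex, which works for "te" but could misbehave if the search text contained regex metacharacters such as . or (. A hedged variant, assuming the search text should always be treated literally, wraps it in Pattern.quote; the class and method names here are illustrative only.

```java
import java.util.regex.Pattern;

final class WrapMatches {
    // Wraps every case-insensitive occurrence of the literal text in parentheses,
    // keeping the case of the text that actually matched.
    static String wrap(String input, String literal) {
        String regex = "(?i)(" + Pattern.quote(literal) + ")";
        return input.replaceAll(regex, "($1)");
    }

    public static void main(String[] args) {
        System.out.println(wrap("Ten tender tEens", "te")); // (Te)n (te)nder (tE)ens
    }
}
```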

String split for specific element

I need to split the following string only the data between the "CHAR" tabs:
Input:
<MSG><KEY>name.extObject</KEY><PARAM><CHAR>Number</CHAR><CHAR>7015:188188</CHAR></PARAM></MSG>
Expected output: Number 7015:188188
I am looking for something efficient.
Any recommendation ?
Thanks

It is good practice to avoid parsing XML/HTML with regex. Use a proper XML parser instead. I like jsoup, so here is an example of how it can be done with this library:
String data = "<MSG><KEY>name.extObject</KEY><PARAM><CHAR>Number</CHAR><CHAR>7015:188188</CHAR></PARAM></MSG>";
Document doc = Jsoup.parse(data, "", Parser.xmlParser());
String charText = doc.select("CHAR").text();
System.out.println(charText);
Output: Number 7015:188188

I think you meant to capture the content between the tags rather than split the string.
It's well known that you should NOT use a regex to parse xhtml since you can get w͈̦̝͉̬͔͕͡ͅe̴͏̰̜͖̗̤̙̖̕i̧̩̭̳̱̖̦͠ͅŗ̴̼̺̻͕̀d̶̩̖̦̖̲̣̺̫͘ ̡͇̥̩͓c͕̻̫͉̞͝ͅo̯̗͜͜͝ṇ̠͘t̛̬̮̞̥͕̙̞e̷̸̗̼͟ͅn̡͎̖̜̱͟͢t̨̙̫̻̱̺͈̗͝. Although, if you still want a regex you can use a regex like this:
<CHAR>(.*?)<\/CHAR>
And you can have this java code:
String line = "<MSG><KEY>name.extObject</KEY><PARAM><CHAR>Number</CHAR><CHAR>7015:188188</CHAR></PARAM></MSG>";
Pattern pattern = Pattern.compile("<CHAR>(.*?)<\\/CHAR>");
Matcher matcher = pattern.matcher(line);
String result = "";
while (matcher.find()) {
result += matcher.group(1) + " ";
}
System.out.println(result); //Prints: Number 7015:188188
Update: as Pshemo pointed in his comment:
/ is not special character in Java regex engine. You don't have to escape it
So, you can use:
Pattern pattern = Pattern.compile("<CHAR>(.*?)</CHAR>");
Btw, I really like Pshemo's answer; it's a nice approach to solving this without regex.

In case you know the tag value is always some digit, then an optional colon with digits, and it is the only <CHAR> tag that has such a numeric value, you may want to use this regex:
(?<=<CHAR>)\d+(?::\d+)?(?=<\/CHAR>)
Java string:
String pattern = "(?<=<CHAR>)\\d+(?::\\d+)?(?=</CHAR>)";
Sample code:
String str = "<MSG><KEY>name.extObject</KEY><PARAM><CHAR>Number</CHAR><CHAR>7015:188188</CHAR></PARAM></MSG>";
Pattern ptrn = Pattern.compile("(?<=<CHAR>)\\d+(?::\\d+)?(?=</CHAR>)");
Matcher matcher = ptrn.matcher(str);
if (matcher.find()) {
System.out.println(matcher.group(0));
}
Output:
7015:188188

String s = inputString;
String result="";
while(s.indexOf("<CHAR>") != -1)
{
result += s.substring(s.indexOf("<CHAR>") + "<CHAR>".length(), s.indexOf("</CHAR>")) + " ";
s = s.substring(s.indexOf("</CHAR>") + "</CHAR>".length());
}
//result is now the desired output

The regex for that is: <CHAR>(.*?)</CHAR>
However, it is better to use an XML parser for that.
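If you would rather not add a third-party library, the XML parser that ships with the JDK can do the same job. This is only a sketch under the assumption that the input is well-formed XML; the class and method names are invented for the example, and malformed input is reported as an IllegalArgumentException.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CharTags {
    // Collects the text of every <CHAR> element, joined by single spaces.
    static String charText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("CHAR");
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return String.join(" ", values);
        } catch (Exception e) {
            // Parser configuration, I/O and malformed-XML errors all end up here.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String data = "<MSG><KEY>name.extObject</KEY><PARAM><CHAR>Number</CHAR>"
                + "<CHAR>7015:188188</CHAR></PARAM></MSG>";
        System.out.println(charText(data)); // Number 7015:188188
    }
}
```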

regex pattern - extract a string only if separated by a hyphen

I've looked at other questions, but they didn't lead me to an answer.
I've got this code:
Pattern p = Pattern.compile("exp_(\\d{1}-\\d)-(\\d+)");
The string I want to be matched is: exp_5-22-718
I would like to extract 5-22 and 718. I'm not sure why it's not working. What am I missing? Many thanks

Try this one:
Pattern p = Pattern.compile("exp_(\\d-\\d+)-(\\d+)");
In your original pattern you specified that the second number should contain exactly one digit, so I used \d+ to match as many digits as possible.
I also removed {1} from the first number, as it adds nothing to the regexp.

If the string is always prefixed with exp_ I wouldn't use a regular expression.
I would:
replaceFirst() exp_
split() the resulting string on -
Note: This answer is based on that assumption. I offer it as a more robust alternative if you have multiple hyphens. However, if you need to validate the format of the digits, then a regular expression may be better.
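The two steps above might look roughly like this. The class and method names are made up for the sketch. Note that splitting on - yields three pieces (5, 22 and 718), so the first two would have to be rejoined if you need 5-22 as a single value.

```java
import java.util.Arrays;

final class ExpFields {
    // Strips a leading "exp_" (if present) and splits the rest on hyphens.
    static String[] fields(String input) {
        return input.replaceFirst("^exp_", "").split("-");
    }

    public static void main(String[] args) {
        String[] parts = fields("exp_5-22-718");
        System.out.println(Arrays.toString(parts));       // [5, 22, 718]
        System.out.println(parts[0] + "-" + parts[1]);    // 5-22
    }
}
```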

In your regexp you missed the required quantifier for the second \\d. That quantifier is + or {2}.
String yourString = "exp_5-22-718";
Matcher matcher = Pattern.compile("exp_(\\d-\\d+)-(\\d+)").matcher(yourString);
if (matcher.find()) {
System.out.println(matcher.group(1)); //prints 5-22
System.out.println(matcher.group(2)); //prints 718
}

You can use the String.split method to do this. Check the following code.
I assume that your string starts with "exp_".
String str = "exp_5-22-718";
if (str.contains("-")){
String newStr = str.substring(4, str.length());
String[] strings = newStr.split("-");
for (String string : strings) {
System.out.println(string);
}
}

How to find and replace a substring?

For example, I have a string in which I must find and replace multiple substrings, all of which start with #, contain 6 symbols, end with ' and should not contain ) ... What do you think would be the best way of achieving that?
Thanks!
Edit:
Just one more thing I forgot: to make the replacement I need that substring, i.e. it gets replaced by a string generated from the substring being replaced.

yourNewText=yourOldText.replaceAll("#[^)]{6}'", "");
Or programmatically:
Matcher matcher = Pattern.compile("#[^)]{6}'").matcher(yourOldText);
StringBuffer sb = new StringBuffer();
while(matcher.find()){
matcher.appendReplacement(sb,
// implement your custom logic here, matcher.group() is the found String
someReplacement(matcher.group()));
}
matcher.appendTail(sb);
String yourNewString = sb.toString();

Assuming you just know the substrings are formatted like you explained above, but not exactly which 6 characters, try the following:
String result = input.replaceAll("#[^\\)]{6}'", "replacement"); //pattern to replace is #+6 characters not being ) + '

You must use replaceAll with the right regular expression:
myString.replaceAll("#[^)]{6}'", "something")
If you need to replace with an extract of the matched string, use a capturing group, like this:
myString.replaceAll("#([^)]{6})'", "blah $1 blah")
The $1 in the second String refers to the first parenthesized expression in the first String.
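When the replacement has to be computed from the matched text rather than just rearranged, and assuming Java 9 or later is available, Matcher.replaceAll also accepts a function. The sketch below is only illustrative: the class name, the method name and the upper-casing rule are placeholders for whatever your real logic is.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ComputedReplacement {
    // '#', six characters that are not ')', then a single quote.
    private static final Pattern TOKEN = Pattern.compile("#([^)]{6})'");

    // Replaces each token with a string derived from its six inner characters.
    // Upper-casing inside angle brackets is a stand-in for real logic.
    static String replaceTokens(String input) {
        return TOKEN.matcher(input).replaceAll(m ->
                Matcher.quoteReplacement("<" + m.group(1).toUpperCase(Locale.ROOT) + ">"));
    }

    public static void main(String[] args) {
        System.out.println(replaceTokens("x #abcdef' y")); // x <ABCDEF> y
    }
}
```

Matcher.quoteReplacement is there because the string returned by the function is still interpreted for $ and \ references.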

this might not be the best way to do it but...
youstring = youstring.replace("#something'", "new stringx");
youstring = youstring.replace("#something2'", "new stringy");
youstring = youstring.replace("#something3'", "new stringz");
//edited after reading comments, thanks
