Using regex for doing string operation - java

I have a string:
String s = "my name is ${name}. My roll no is ${rollno} ";
I want to fill in the name and rollno placeholders using a method:
public void name(String name, String roll)
{
String updated = s.replace("${name}", name).replace("${rollno}", roll);
}
Can we achieve the same by some other means, such as a regex that replaces whatever follows the first "$", and similarly for the other placeholder?

You can use either Matcher#appendReplacement or Matcher#replaceAll (with Java 9+):
A more generic version:
String s = "my name is ${name}. My roll no is ${rollno} ";
Matcher m = Pattern.compile("\\$\\{([^{}]+)\\}").matcher(s);
Map<String, String> replacements = new HashMap<>();
replacements.put("name", "John");
replacements.put("rollno", "123");
StringBuffer replacedLine = new StringBuffer();
while (m.find()) {
    String value = replacements.get(m.group(1));
    // Unknown keys keep the original placeholder. quoteReplacement stops
    // "$" and "\" in the replacement from being read as group references.
    m.appendReplacement(replacedLine,
            Matcher.quoteReplacement(value != null ? value : m.group()));
}
m.appendTail(replacedLine);
System.out.println(replacedLine.toString());
// => my name is John. My roll no is 123
Java 9+ solution:
Matcher m2 = Pattern.compile("\\$\\{([^{}]+)\\}").matcher(s);
// The lambda's result is also treated as a replacement string, so quote it.
String result = m2.replaceAll(x -> Matcher.quoteReplacement(
        replacements.get(x.group(1)) != null ? replacements.get(x.group(1)) : x.group()));
System.out.println(result);
// => my name is John. My roll no is 123
The regex is \$\{([^{}]+)\}:
\$\{ - a ${ char sequence
([^{}]+) - Group 1 (m.group(1)): any one or more chars other than { and }
\} - a } char.
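One detail worth keeping in mind for the replacement step: appendReplacement and replaceAll give $ and \ a special meaning inside the replacement string, so a value such as J$hn, or an unmatched placeholder like ${rollno} passed back as the replacement, throws IllegalArgumentException unless it is wrapped in Matcher.quoteReplacement. The following is a minimal sketch of a reusable helper built on the same regex; the class name TemplateDemo and the method fill are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateDemo {
    // Same pattern as above: "${", one or more chars other than braces, "}".
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^{}]+)\\}");

    // Hypothetical helper: replaces ${key} with values.get(key);
    // placeholders with no value are left untouched.
    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Quote so that "$" and "\" are inserted literally.
            m.appendReplacement(out,
                    Matcher.quoteReplacement(value != null ? value : m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("name", "J$hn");
        System.out.println(fill("my name is ${name}. My roll no is ${rollno}", values));
    }
}
```

StringBuffer is used so the sketch also compiles on Java 8; from Java 9 on, appendReplacement accepts a StringBuilder as well.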

Related

How to get value of optional parameters from a url with regex in java

I have some URIs from which I want to extract parameters if they exist, and I came up with this code. Can someone point out how to fix the regex?
cityId and countryId work as expected, but I can't get only the numbers after '-a-'.
Regex
// "/city/berlin-a-10284?cityId=123456&countryId=4545"
// "/city/berlin-a-10284"
// "/city/berlin-a-10284?cityId=123456"
// "/city/berlin-a-10284?countryId=4545"
private String ValueExtractor(String url, String searchWord) {
String regex = "(?<=" + searchWord + ").*?(?=&|$)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(url);
return matcher.find() ? matcher.group() : "";
}
String productId = "";
String cityId = "";
String countryId = "";
if (url.contains("-p-")) {
productId = ValueExtractor(url, "-a-");
}
if (url.contains("cityId")) {
cityId = ValueExtractor(url, "cityId=");
}
if (url.contains("countryId")) {
countryId = ValueExtractor(url, "countryId=");
}
Expected results:
"/city/berlin-a-10284?cityId=123456&countryId=4545"
productId:10284
cityId: 123456
countryId: 4545
"/city/berlin-a-10284"
productId:10284
"/city/berlin-a-10284?cityId=123456"
productId:10284
cityId: 123456
"/city/berlin-a-10284?countryId=4545"
productId:10284
countryId: 4545
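Regarding "Can't get only the numbers after '-a-'":
You can use the regex (?<=-a-)\d+(?=[?&]|$) to retrieve this number.
(?<=-a-)\d+ matches one or more digits preceded by -a-.
(?=[?&]|$) is a positive lookahead for ?, & or the end of the input.
Note also that the code in the question checks url.contains("-p-") before searching for "-a-", so that branch never runs for these URLs; the check should use "-a-" too.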
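As a minimal sketch of how that regex could be plugged in, the snippet below wraps it in a small helper; the class name ProductIdDemo and the method productId are hypothetical, and the empty-string fallback simply mirrors the ValueExtractor method from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ProductIdDemo {
    // Digits preceded by "-a-" and followed by '?', '&' or the end of the input.
    private static final Pattern PRODUCT_ID = Pattern.compile("(?<=-a-)\\d+(?=[?&]|$)");

    // Hypothetical helper: returns the id, or "" when the URL has none.
    static String productId(String url) {
        Matcher m = PRODUCT_ID.matcher(url);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(productId("/city/berlin-a-10284?cityId=123456&countryId=4545"));
        System.out.println(productId("/city/berlin-a-10284"));
    }
}
```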

Get the value after a string and comma and ends if there is character '|'

Suppose I have a string variable:
String word = "wordA";
and another string variable:
String fullText= "wordA,A A|wordB,B B|wordC,C C|wordD,D D";
Is it possible to get the value that follows the word and the comma and ends at the next | ?
Example
If word equals "wordA", I should get only "A A", because in fullText the text right after wordA and the comma is 'A A', and it ends at |.
If word equals "wordD", the result should be "D D", again based on fullText.
How can I get this result in Java?
You can use a simple regular expression. Like this:
String text = fullText.replaceAll(".*" + word + ",([^\\|]+).*", "$1");
Alternatively:
Matcher matcher = Pattern.compile(word + ",([^\\|]+)").matcher(fullText);
matcher.find();
matcher.group(1); // "A A" for word = wordA
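Both snippets above splice word straight into the pattern, so a word containing regex metacharacters (a dot, for instance) would be interpreted as regex syntax, and a word such as "ordA" would also match inside "wordA". A hedged variant, assuming that is not wanted, quotes the word with Pattern.quote and anchors it to the start of the text or to a preceding |; the names QuotedLookupDemo and valueFor are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedLookupDemo {
    // Hypothetical helper: value after "word," up to the next '|', or null if absent.
    static String valueFor(String fullText, String word) {
        // Pattern.quote makes every character of word match literally.
        Pattern p = Pattern.compile("(?:^|\\|)" + Pattern.quote(word) + ",([^|]*)");
        Matcher m = p.matcher(fullText);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String fullText = "wordA,A A|wordB,B B|wordC,C C|wordD,D D";
        System.out.println(valueFor(fullText, "wordA"));
        System.out.println(valueFor(fullText, "wordD"));
    }
}
```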
If you are using Java 8, you can use a stream like so:
String result = Arrays.stream(fullText.split("\\|")) // split on |
.filter(s -> s.startsWith(word + ",")) // keep entries that start with word + ','
.findFirst() // take the first such entry
.map(a -> a.substring(word.length() + 1)) // get everything after word + ','
.orElse(null); // or else null or any default value
How about this:
public static String search(String fullText, String key) {
Pattern re = Pattern.compile("(?:^|\\|)" + key + ",([^|]*)(?:$|\\|)");
Matcher matcher = re.matcher(fullText);
if (matcher.find()) {
return matcher.group(1);
}
return null;
}
Example:
String fullText= "wordA,A A|wordB,B B|wordC,C C|wordD,D D";
System.out.println(search(fullText, "wordA"));
System.out.println(search(fullText, "wordB"));
System.out.println(search(fullText, "wordC"));
System.out.println(search(fullText, "wordD"));
Output:
A A
B B
C C
D D
UPDATE: To avoid recompiling the regex at each search:
private static final Pattern RE = Pattern.compile("(?:^|\\|)([^,]*),([^|]*)(?:$|(?=\\|))");
public static String search(String fullText, String key) {
Matcher matcher = RE.matcher(fullText);
while (matcher.find()) {
if (matcher.group(1).equals(key)) {
return matcher.group(2);
}
}
return null;
}
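If many lookups run against the same fullText, another option (a different technique from the regex above, named here only as an alternative) is to split the text once into a map and then look keys up directly. This is a rough sketch under the assumption that keys never contain a comma or |; PairMapDemo and toMap are made-up names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PairMapDemo {
    // Hypothetical helper: turns "k1,v1|k2,v2" into an insertion-ordered map.
    static Map<String, String> toMap(String fullText) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String entry : fullText.split("\\|")) {
            int comma = entry.indexOf(',');
            if (comma >= 0) { // skip entries without a comma
                map.put(entry.substring(0, comma), entry.substring(comma + 1));
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = toMap("wordA,A A|wordB,B B|wordC,C C|wordD,D D");
        System.out.println(map.get("wordA"));
        System.out.println(map.get("wordD"));
    }
}
```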

How to get exact match keyword from the given string using java?

I'm trying to match the exact keyword AdvanceJava against the given inputText string, but both the if and the else branch execute; I want only the AdvanceJava keyword to be matched.
String inputText = ("iwanttoknowrelatedtoAdvancejava").toLowerCase().replaceAll("\\s", "");
String match = "java";
List keywordsList = new ArrayList<>();//where keywordsList{advance,core,programming} -> keywordlist fetch
// from database
Enumeration e = Collections.enumeration(keywordsList);
int size = keywordsList.size();
while (e.hasMoreElements()) {
for (int i = 0; i < size; i++) {
String s1 = (String) keywordsList.get(i);
if (inputText.contains(s1) && inputText.contains(match)) {
System.out.println("Yes we providing " + s1);
} else if (!inputText.contains(s1) && inputText.contains(match)) {
System.out.println("Yes we are working on java");
}
}
break;
}
Thanks
You can do this with the Pattern and Matcher classes:
Pattern p = Pattern.compile("java");
Matcher m = p.matcher("Print this");
m.find();
If you want to find multiple matches in a line, you can call find() and group() repeatedly to extract them all.
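A short sketch of that find()/group() loop, assuming case-insensitive matching is wanted; the class name FindAllDemo and the method findAll are illustrative and not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindAllDemo {
    // Hypothetical helper: collects every match of regex in text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text);
        while (m.find()) {      // each call advances to the next match
            hits.add(m.group()); // the text of the current match
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findAll("java", "Java, core java, JAVA"));
    }
}
```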
Here's how you can achieve what you seek using pattern matching.
In the first example I have taken your input text as it is. This only improves your algorithm which has O(n^2) performance.
String inputText = ("iwanttoknowrelatedtoAdvancejava").toLowerCase().replaceAll("\\s", "");
String match = "java";
List<String> keywordsList = Arrays.asList("advance", "core", "programming");
for (String keyword : keywordsList) {
Pattern p = Pattern.compile(keyword.concat(match));
Matcher m = p.matcher(inputText);
//System.out.println(m.find());
if (m.find()) {
System.out.println("Yes we are providing " + keyword.concat(match));
}
}
But we can improve this into a better implementation. Here's a more generic version of the above. This code doesn't manipulate the input text before matching; instead it uses a more general regular expression that tolerates whitespace between the words and matches case-insensitively.
String inputText = "i want to know related to Advance java";
String match = "java";
List<String> keywordsList = Arrays.asList("advance", "core", "programming");
for (String keyword : keywordsList) {
Pattern p = Pattern.compile(MessageFormat.format("(?i)({0}\\s*{1})", keyword, match));
Pattern p1 = Pattern.compile(MessageFormat.format("(?i)({0})", match));
Matcher m = p.matcher(inputText);
Matcher m1 = p1.matcher(inputText);
//System.out.println(m.find());
if(m.find()) {
System.out.println("Yes we are providing " + keyword.concat(match));
} else if(m1.find()) {
System.out.println("Yes we are working with " + match);
}
}
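The loop above prints once per keyword, so with three keywords in the list both branches can fire during a single run. A minimal sketch that decides only once, assuming the intent is "report the first specific keyword, otherwise fall back to the generic message", could look like this; KeywordMatchDemo and describe are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class KeywordMatchDemo {
    // Hypothetical helper: returns one message per input, or null if match is absent.
    static String describe(String inputText, List<String> keywords, String match) {
        for (String keyword : keywords) {
            Pattern specific = Pattern.compile(
                    Pattern.quote(keyword) + "\\s*" + Pattern.quote(match),
                    Pattern.CASE_INSENSITIVE);
            if (specific.matcher(inputText).find()) {
                return "Yes we are providing " + keyword + match;
            }
        }
        // No specific keyword matched: fall back to the generic check, once.
        Pattern generic = Pattern.compile(Pattern.quote(match), Pattern.CASE_INSENSITIVE);
        return generic.matcher(inputText).find() ? "Yes we are working with " + match : null;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("advance", "core", "programming");
        System.out.println(describe("i want to know related to Advance java", keywords, "java"));
        System.out.println(describe("tell me about java", keywords, "java"));
    }
}
```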
@sithum - Thanks, but it executes both branches of the if/else in the output. Please refer to the screenshot I attached here.
I applied the following logic and it works fine. Please refer to it. Thanks.
String inputText = ("iwanttoknowrelatedtoAdvancejava").toLowerCase().replaceAll("\\s", "");
String match = "java";
List<String> keywordsList = session.createSQLQuery("SELECT QUESTIONARIES_RAISED FROM QUERIES").list(); // Fetch values from database (advance,core,programming)
String uniqueKeyword=null;
String commonKeyword= null;
int size =keywordsList.size();
for(int i=0;i<size;i++){
String s1 = (String) keywordsList.get(i);//get values one by one from list
if(inputText.contains(match)){
if(inputText.contains(s1) && inputText.contains(match)){
Queries q1 = new Queries();
q1.setQuestionariesRaised(s1); //set matched keyword to getter setter method
keywordsList1=session.createQuery("from Queries sentence where questionariesRaised='"+q1.getQuestionariesRaised()+"'").list(); // based on matched keyword fetch according to matched keyword sentence which stored in database
for(Queries ob : keywordsList1){
uniqueKeyword= ob.getSentence().toString();// Store fetched sentence to on string variable
}
break;
}else {
commonKeyword= "java only";
}
}
}
if(uniqueKeyword!= null){
System.out.println("Yes we providing......................" + uniqueKeyword);
}else if(commonKeyword!= null){
System.out.println("Yes we providing " + commonKeyword);
}else{
}

Need to extract data from CSV file

My file contains the data below; every value is a string.
Input
"abcd","12345","success,1234,out",,"hai"
The output should be like below
Column 1: "abcd"
Column 2: "12345"
Column 3: "success,1234,out"
Column 4: null
Column 5: "hai"
We need to use the comma as the delimiter; the null value comes without double quotes.
Could you please help me find a regular expression to parse this data?
You could try a tool like CSVReader from OpenCsv https://sourceforge.net/projects/opencsv/
You can even configure a CSVParser (used by the reader) to output null on several conditions. From the doc :
/**
* Denotes what field contents will cause the parser to return null: EMPTY_SEPARATORS, EMPTY_QUOTES, BOTH, NEITHER (default)
*/
public static final CSVReaderNullFieldIndicator DEFAULT_NULL_FIELD_INDICATOR = NEITHER;
You can use this Regular Expression
"([^"]*)"
DEMO: https://regex101.com/r/WpgU9W/1
Match 1
Group 1. 1-5 `abcd`
Match 2
Group 1. 8-13 `12345`
Match 3
Group 1. 16-32 `success,1234,out`
Match 4
Group 1. 36-39 `hai`
Using the ("[^"]+")|(?<=,)(,) regex you can find either quoted strings ("[^"]+"), which should be kept as is, or commas preceded by commas, which denote null field values. All you need to do then is iterate through the matches, check which of the two capture groups is defined, and output accordingly:
String input = "\"abcd\",\"12345\",\"success,1234,out\",,\"hai\"";
Pattern pattern = Pattern.compile("(\"[^\"]+\")|(?<=,)(,)");
Matcher matcher = pattern.matcher(input);
int col = 1;
while (matcher.find()) {
if (matcher.group(1) != null) {
System.out.println("Column " + col + ": " + matcher.group(1));
col++;
} else if (matcher.group(2) != null) {
System.out.println("Column " + col + ": null");
col++;
}
}
Demo: https://ideone.com/QmCzPE
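The same loop can also collect the columns into a list instead of printing them. The sketch below reuses the regex from this answer unchanged and strips the surrounding quotes; CsvLineDemo and parseLine are made-up names, and the usual limits of this regex still apply (no escaped quotes, no empty "" fields, and no empty field at the very start or end of the line).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CsvLineDemo {
    // Group 1: a quoted field. Group 2: a comma directly after a comma (null field).
    private static final Pattern FIELD = Pattern.compile("(\"[^\"]+\")|(?<=,)(,)");

    // Hypothetical helper: returns unquoted column values, with null for empty fields.
    static List<String> parseLine(String line) {
        List<String> columns = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            String quoted = m.group(1);
            // When group 1 is absent, group 2 matched, so the column is null.
            columns.add(quoted == null ? null : quoted.substring(1, quoted.length() - 1));
        }
        return columns;
    }

    public static void main(String[] args) {
        String input = "\"abcd\",\"12345\",\"success,1234,out\",,\"hai\"";
        System.out.println(parseLine(input));
    }
}
```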
Step #1:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "(,,)";
final String string = "\"abcd\",\"12345\",\"success,1234,out\",,\"hai\"\n"
+ "\"abcd\",\"12345\",\"success,1234,out\",\"null\",\"hai\"";
final String subst = ",\"null\",";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
// The substituted value will be contained in the result variable
final String result = matcher.replaceAll(subst);
System.out.println("Substitution result: " + result);
Original Text:
"abcd","12345","success,1234,out",,"hai"
Transformation: (with null)
"abcd","12345","success,1234,out","null","hai"
Step #2: (use REGEXP)
"([^"]*)"
Result:
abcd
12345
success,1234,out
null
hai
Credits:
Emmanuel Guiton [https://stackoverflow.com/users/7226842/emmanuel-guiton] REGEXP
You can also use the replaceAll function:
final String input = "\"abcd\",\"12345\",\"success,1234,out\",,\"hai\"";
System.out.println(input);
String[] strings = input
.replaceAll(",,", ",\"\",")
.replaceAll(",,", ",\"\",") // in case there is more than one null in a row
.replaceAll("\",\"", "\";\"")
.replaceAll("\"\"", "")
.split(";");
for (String string : strings) {
String output = string;
if (output.isEmpty()) {
output = null;
}
System.out.println(output);
}

complex regular expression in Java

I have a rather complex (to me it seems rather complex) problem that I'm using regular expressions in Java for:
I can get any text string that must be of the format:
M:<some text>:D:<either a url or string>:C:<some more text>:Q:<a number>
I started with a regular expression for extracting the text between the M:/:D:/:C:/:Q: as:
String pattern2 = "(M:|:D:|:C:|:Q:.*?)([a-zA-Z_\\.0-9]+)";
And that works fine if the <either a url or string> is just an alphanumeric string. But it all falls apart when the embedded string is a url of the format:
tcp://someurl.something:port
Can anyone help me adjust the above reg exp to extract the text after :D: to be either a url or a alpha-numeric string?
Here's an example:
public static void main(String[] args) {
String name = "M:myString1:D:tcp://someurl.com:8989:C:myString2:Q:1";
boolean matchFound = false;
ArrayList<String> values = new ArrayList<>();
String pattern2 = "(M:|:D:|:C:|:Q:.*?)([a-zA-Z_\\.0-9]+)";
Matcher m3 = Pattern.compile(pattern2).matcher(name);
while (m3.find()) {
matchFound = true;
String m = m3.group(2);
System.out.println("regex found match: " + m);
values.add(m);
}
}
In the above example, my results would be:
myString1
tcp://someurl.com:8989
myString2
1
Note that the strings can be of variable length and are alphanumeric, but some other characters are allowed (such as the :// and . or - characters of the URL format).
You mention that the format is constant:
M:<some text>:D:<either a url or string>:C:<some more text>:Q:<a number>
Capture groups can do this for you with the pattern:
"M:(.*):D:(.*):C:(.*):Q:(.*)"
Or you can do a String.split() with a pattern of "M:|:D:|:C:|:Q:". However, the split will return an empty element at the first index. Everything else will follow.
public static void main(String[] args) throws Exception {
System.out.println("Regex: ");
String data = "M:<some text>:D:tcp://someurl.something:port:C:<some more text>:Q:<a number>";
Matcher matcher = Pattern.compile("M:(.*):D:(.*):C:(.*):Q:(.*)").matcher(data);
if (matcher.matches()) {
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println(matcher.group(i));
}
}
System.out.println();
System.out.println("String.split(): ");
String[] pieces = data.split("M:|:D:|:C:|:Q:");
for (String piece : pieces) {
System.out.println(piece);
}
}
Results:
Regex:
<some text>
tcp://someurl.something:port
<some more text>
<a number>
String.split():
<some text>
tcp://someurl.something:port
<some more text>
<a number>
To extract the URL/text part you don't need the regular expression. Use
int startPos = input.indexOf(":D:")+":D:".length();
int endPos = input.indexOf(":C:", startPos);
String urlOrText = input.substring(startPos, endPos);
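One caveat with the indexOf approach: indexOf returns -1 when a marker is missing, and the arithmetic above would then quietly produce a wrong start position or throw from substring. A small guarded sketch, with the hypothetical names MarkerSliceDemo and between, returns null in that case instead.

```java
final class MarkerSliceDemo {
    // Hypothetical helper: text between startMarker and the next endMarker, or null.
    static String between(String input, String startMarker, String endMarker) {
        int start = input.indexOf(startMarker);
        if (start < 0) {
            return null; // start marker not present
        }
        start += startMarker.length();
        int end = input.indexOf(endMarker, start);
        return end < 0 ? null : input.substring(start, end);
    }

    public static void main(String[] args) {
        String input = "M:myString1:D:tcp://someurl.com:8989:C:myString2:Q:1";
        System.out.println(between(input, ":D:", ":C:"));
    }
}
```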
Assuming you need to do some validation along with the parsing, break the regex into separate parts like this:
String m_regex = "[\\w.]+"; // in Java, a . inside [] is just a literal dot
String url_regex = "."; // placeholder: there are plenty of URL regexes online, pick your favorite
String d_regex = "(?:" + url_regex + "|\\p{Alnum}+)"; // a URL or a sequence of alphanumeric characters
String c_regex = "[\\w.]+"; // I'm assuming you want this to be a bit more restrictive; not sure
String q_regex = "\\d+"; // what sort of number exactly? assuming any string of digits here
// Each named group needs a unique name; reusing one throws PatternSyntaxException.
String regex = "M:(?<M>" + m_regex + "):"
+ "D:(?<D>" + d_regex + "):"
+ "C:(?<C>" + c_regex + "):"
+ "Q:(?<Q>" + q_regex + ")";
Pattern p = Pattern.compile(regex);
It might be a good idea to keep the pattern in a static field and compile it in a static block, so that the temporary regex strings don't clutter a class with fields that are otherwise useless.
Then you can retrieve each part by its name:
Matcher m = p.matcher( input );
if (m.matches()) {
String m_part = m.group( "M" );
...
String q_part = m.group( "Q" );
}
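Put together, the named-group approach can be sketched as a runnable example. The URL sub-pattern here is a deliberately loose stand-in (scheme://host with an optional :port) chosen only for this illustration, not a real URL validator, and NamedGroupDemo and parse are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedGroupDemo {
    // Assumption: a loose URL stand-in, e.g. tcp://someurl.com:8989
    private static final String URL = "\\w+://[\\w.-]+(?::\\d+)?";

    private static final Pattern FORMAT = Pattern.compile(
            "M:(?<M>[\\w.]+):"
            + "D:(?<D>" + URL + "|\\p{Alnum}+):"
            + "C:(?<C>[\\w.]+):"
            + "Q:(?<Q>\\d+)");

    // Hypothetical helper: returns the four parts by name, or null if input is malformed.
    static Map<String, String> parse(String input) {
        Matcher m = FORMAT.matcher(input);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> parts = new LinkedHashMap<>();
        for (String name : new String[] {"M", "D", "C", "Q"}) {
            parts.put(name, m.group(name));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("M:myString1:D:tcp://someurl.com:8989:C:myString2:Q:1"));
    }
}
```

Because matches() must consume the whole input, a string that does not follow the M/D/C/Q layout yields null rather than a partial result.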
You can go a step further and introduce a RegexGroup interface, where each implementing object represents one part of the regex and carries both its name and its pattern. You would lose the simplicity, though, and the code becomes harder to understand at a quick glance. (I wouldn't do this; I'm just pointing out that it's possible and has its own benefits.)
