I have a string that looks like this: String s = "date1, calculatedDate(currentDate, 35), false";.
I need to extract all parameters of the verify function, so the expected result is:
elem[0] = date1
elem[1] = calculatedDate(currentDate, 35)
elem[2] = false
If I use the split function on the , character, I get this result instead:
elem[0] = date1
elem[1] = calculatedDate(currentDate
elem[2] = 35)
elem[3] = false
Moreover, the method has to be generic, because some functions have 2 parameters and others have 7.
Do you have a solution that could help me with that?
You could use StringTokenizer to parse your arguments inside the parentheses:
final static String DELIMITER = ",";
final static String PARENTHESES_START = "(";
final static String PARENTHESES_END = ")";
public static List<String> parseArguments(String text) {
List<String> arguments = new ArrayList<>();
StringBuilder argParsed = new StringBuilder();
StringTokenizer st = new StringTokenizer(text, DELIMITER);
while (st.hasMoreElements()) {
// default: add next token
String token = st.nextToken();
System.out.println("Token: " + token);
argParsed.append(token);
// if token contains '(' we have
// an expression or nested call as argument
if (token.contains(PARENTHESES_START)) {
System.out.println("Nested expression with ( starting: " + token);
// reconstruct to string-builder until ')'
while(st.hasMoreElements() && !token.contains(PARENTHESES_END)) {
// add eliminated/tokenized delimiter
argParsed.append(DELIMITER);
// default: add next token
token=st.nextToken();
System.out.println("Token inside nested expression: " + token);
argParsed.append(token);
}
System.out.println("Nested expression with ) ending: " + token);
}
// add complete argument and start fresh
arguments.add(argParsed.toString());
argParsed.setLength(0);
}
return arguments;
}
It can even parse the following input: date1, calculatedDate(currentDate, 35), false, (a+b), x.toString()
It successfully finds all 5 arguments, including complex ones:
(nested) function-calls like calculatedDate(currentDate, 35)
expressions like (a+b)
method-calls on objects like x.toString()
Run this demo on IDEone.
Read more and extend
There might be more complex texts or grammars to handle (in the future).
Then, if neither regex capturing, nor string splitting, nor tokenizing can solve the problem, consider using or generating a PEG or CFG parser. See the discussion about Regular Expression Vs. String Parsing.
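Before reaching for a full parser, a simple depth counter is often enough for this kind of input. The following is only a sketch and not part of the answer above: it assumes the parentheses are balanced and that no comma or parenthesis appears inside a string literal, and the names ArgumentSplitter and splitTopLevel are made up for illustration. It splits only on commas seen at nesting depth 0, so a nested call such as f(g(a, b), c) stays in one piece:

```java
import java.util.ArrayList;
import java.util.List;

class ArgumentSplitter {

    // Splits on commas that are not enclosed in parentheses.
    static List<String> splitTopLevel(String text) {
        List<String> arguments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // number of currently open parentheses

        for (char c : text.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            }
            if (c == ',' && depth == 0) {
                // top-level comma: the current argument is complete
                arguments.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        arguments.add(current.toString().trim()); // last argument
        return arguments;
    }

    public static void main(String[] args) {
        String s = "date1, calculatedDate(currentDate, 35), false";
        for (String argument : splitTopLevel(s)) {
            System.out.println(argument);
        }
    }
}
```

Unlike the tokenizer version, this one trims the whitespace around each argument and does not depend on how many levels of nesting an argument contains.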
Try this:
String s = "verify(date1, calculatedDate(currentDate, 35), false)";
Pattern p = Pattern.compile("(?<=verify\\()(\\w+)(,\\s)(.*)(,\\s)((?<=,\\s)\\w+)(?=\\))");
Matcher m = p.matcher(s);
while(m.find()) {
System.out.println(m.group(1) + "\n" + m.group(3) + "\n" + m.group(5));
}
Update for s = "date1, calculatedDate(currentDate, 35), false":
String s = "date1, calculatedDate(currentDate, 35), false";
Pattern p = Pattern.compile("(\\w+)(,\\s)(.*)(,\\s)((?<=,\\s)\\w+)");
Matcher m = p.matcher(s);
while(m.find()) {
System.out.println(m.group(1) + "\n" + m.group(3) + "\n" + m.group(5));
}
Output:
date1
calculatedDate(currentDate, 35)
false
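About the regex:
(\\w+) one or more (+) word characters: the first argument
(,\\s) a comma followed by a whitespace character
(.*) any characters; because it is greedy, it takes everything up to the last ", " and therefore captures the middle argument
(,\\s) a comma followed by a whitespace character
((?<=,\\s)\\w+) (?<=...) is a positive lookbehind; it checks that ", " precedes the word without including it, which captures false
Note that this pattern only works for exactly three arguments whose first and last are plain words.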
I'm trying to write a regex that allows only a number, then ",", then another number, or several such pairs separated by ";", like
57,1000
57,1000;6393,1000
So far I have this: Pattern.compile("\\b[0-9;,]{1,5}?\\d+;([0-9]{1,5},?)+").matcher("57,1000").find();
It works for 57,1000;6393,1000, but it also allows letters and does not work for 57,1000.
Try the regex "(\d+,\d+(;\d+,\d+)?)":
@Test
void regex() {
Pattern p = Pattern.compile("(\\d+,\\d+)(;\\d+,\\d+)?");
Assertions.assertTrue(p.matcher("57,1000").matches());
Assertions.assertTrue(p.matcher("57,1000;6393,1000").matches());
}
How about this? Just look for two numbers separated by a comma and capture them.
String[] data = {"57,1000",
"57,1000;6393,1000"};
Pattern p = Pattern.compile("(\\d+),(\\d+)");
for (String str : data) {
System.out.println("For String : " + str);
Matcher m = p.matcher(str);
while (m.find()) {
System.out.println(m.group(1) + " " + m.group(2));
}
System.out.println();
}
prints
For String : 57,1000
57 1000
For String : 57,1000;6393,1000
57 1000
6393 1000
If you just want to validate the strings, you can do the following. The regex matches a single pair, optionally followed by a second pair preceded by a semicolon.
String regex = "(\\d+,\\d+)(;(\\d+,\\d+))?";
for (String str : data) {
System.out.println("Testing String " + str + " : " +str.matches(regex));
}
prints
Testing String 57,1000 : true
Testing String 57,1000;6393,1000 : true
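If the input may contain more than two pairs (an assumption; the question only shows one or two), the optional group can be turned into a repeated one. A minimal sketch, with PairValidator and isValid as illustrative names:

```java
import java.util.regex.Pattern;

class PairValidator {

    // One "number,number" pair, followed by zero or more ";number,number" pairs.
    private static final Pattern PAIRS = Pattern.compile("\\d+,\\d+(?:;\\d+,\\d+)*");

    static boolean isValid(String input) {
        // matches() requires the whole input to fit the pattern,
        // so letters or a trailing ';' are rejected.
        return PAIRS.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("57,1000"));
        System.out.println(isValid("57,1000;6393,1000;1,2"));
        System.out.println(isValid("57,abc"));
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call, which String#matches would do.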
I have a string:
String s = "my name is ${name}. My roll no is ${rollno} ";
I want to use string operations to fill in the name and rollno using a method.
public void name(String name, String roll)
{
String updated = s.replace("${name}", name).replace("${rollno}", roll);
}
Can we achieve the same by other means, for example with a regex that replaces what follows the first "$", and similarly for the second?
You can use either Matcher#appendReplacement or Matcher#replaceAll (with Java 9+):
A more generic version:
String s="my name is ${name}. My roll no is ${rollno} ";
Matcher m = Pattern.compile("\\$\\{([^{}]+)\\}").matcher(s);
Map<String,String> replacements = new HashMap<>();
replacements.put("name","John");
replacements.put("rollno","123");
StringBuffer replacedLine = new StringBuffer();
while (m.find()) {
// quoteReplacement keeps '$' and '\' from being read as group references
if (replacements.get(m.group(1)) != null)
m.appendReplacement(replacedLine, Matcher.quoteReplacement(replacements.get(m.group(1))));
else
m.appendReplacement(replacedLine, Matcher.quoteReplacement(m.group()));
}
m.appendTail(replacedLine);
System.out.println(replacedLine.toString());
// => my name is John. My roll no is 123
Java 9+ solution:
Matcher m2 = Pattern.compile("\\$\\{([^{}]+)\\}").matcher(s);
String result = m2.replaceAll(x ->
Matcher.quoteReplacement(replacements.get(x.group(1)) != null ? replacements.get(x.group(1)) : x.group()));
System.out.println( result );
// => my name is John. My roll no is 123
See the Java demo.
The regex is \$\{([^{}]+)\}:
\$\{ - a ${ char sequence
([^{}]+) - Group 1 (m.group(1)): any one or more chars other than { and }
\} - a } char.
See the regex demo.
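The same regex can be wrapped in a small reusable helper. This is a sketch rather than the answerer's code; TemplateFiller and fill are hypothetical names. Replacement strings passed to appendReplacement treat $ and \ specially, so the helper runs every value through Matcher.quoteReplacement, and placeholders without a value in the map are left as they are:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateFiller {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^{}]+)\\}");

    // Replaces every ${key} that has a value in the map; unknown keys stay untouched.
    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // fall back to the original "${key}" text when the key is unknown
            String value = values.getOrDefault(m.group(1), m.group());
            // quote the value so '$' and '\' are inserted literally
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("name", "John");
        values.put("rollno", "123");
        System.out.println(fill("my name is ${name}. My roll no is ${rollno} ", values));
    }
}
```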
Suppose I have a string variable:
String word = "wordA";
and another string variable:
String fullText= "wordA,A A|wordB,B B|wordC,C C|wordD,D D";
Is it possible to get the value that comes after the comma and ends at the |?
Example:
If word equals "wordA", I should get only "A A", because in fullText the text right after wordA and the comma is "A A", ending at the |.
If word equals "wordD", the result is "D D", based on the variable fullText.
How can I get this result in Java?
You can use a simple regular expression. Like this:
String text = fullText.replaceAll(".*" + word + ",([^\\|]+).*", "$1");
Alternatively:
Matcher matcher = Pattern.compile(word + ",([^\\|]+)").matcher(fullText);
matcher.find();
matcher.group(1); // "A A" for word = wordA
If you are using Java 8, you can use a stream like so:
String result = Arrays.stream(fullText.split("\\|")) // split on |
.filter(s -> s.startsWith(word + ",")) // keep entries that start with word + ','
.findFirst() // take the first match
.map(a -> a.substring(word.length() + 1)) // get everything after word + ','
.orElse(null); // or else null or any default value
How about this:
public static String search(String fullText, String key) {
Pattern re = Pattern.compile("(?:^|\\|)" + key + ",([^|]*)(?:$|\\|)");
Matcher matcher = re.matcher(fullText);
if (matcher.find()) {
return matcher.group(1);
}
return null;
}
Example:
String fullText= "wordA,A A|wordB,B B|wordC,C C|wordD,D D";
System.out.println(search(fullText, "wordA"));
System.out.println(search(fullText, "wordB"));
System.out.println(search(fullText, "wordC"));
System.out.println(search(fullText, "wordD"));
Output:
A A
B B
C C
D D
UPDATE: To avoid recompiling the regex at each search:
private static final Pattern RE = Pattern.compile("(?:^|\\|)([^,]*),([^|]*)(?:$|(?=\\|))");
public static String search(String fullText, String key) {
Matcher matcher = RE.matcher(fullText);
while (matcher.find()) {
if (matcher.group(1).equals(key)) {
return matcher.group(2);
}
}
return null;
}
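If several words have to be looked up in the same text, it may be simpler to parse the text once into a map and skip the regex altogether. A minimal sketch, assuming entries are separated by | and that the key ends at the first comma; PairLookup and parse are illustrative names:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PairLookup {

    // Parses "key,value|key,value|..." into a map; entries without a comma are skipped.
    static Map<String, String> parse(String fullText) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : fullText.split("\\|")) {
            // limit 2: split at the first comma only, the value may contain commas
            String[] parts = entry.split(",", 2);
            if (parts.length == 2) {
                // keep the first value if a key is repeated
                result.putIfAbsent(parts[0], parts[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String fullText = "wordA,A A|wordB,B B|wordC,C C|wordD,D D";
        Map<String, String> values = parse(fullText);
        System.out.println(values.get("wordA"));
        System.out.println(values.get("wordD"));
    }
}
```

A key that is not present simply yields null from get, which matches the behaviour of the search methods above.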
How to tokenize a String like a lexer in Java?
Please refer to the question above. I have never used Java regex. How can I build a new string in which every matched symbol (such as '(' ')' '.' '<' '>' '"') is separated by a single space? For example, before the regex:
String c = "List<String> uncleanList = Arrays.asList(input1.split(\"x\"));";
I want the resulting string to look like this:
String r = " List < String > uncleanList = Arrays . asList ( input1 . split ( \" x \" ) ) ; ";
Referring to the code that you linked to, matcher.group() will give you a single token. Simply use a StringBuilder to append each token followed by a space, which gives you a new string in which the tokens are space-separated.
String c = "List<String> uncleanList = Arrays.asList(input1.split(\"x\"));" ;
Pattern pattern = Pattern.compile("\\w+|[+-]?[0-9\\._Ee]+|\\S");
Matcher matcher = pattern.matcher(c);
StringBuilder sb = new StringBuilder();
while (matcher.find()) {
String token = matcher.group();
sb.append(token).append(" ");
}
String r = sb.toString();
System.out.println(r);
String c = "List<String> uncleanList = Arrays.asList(input1.split('x'));";
// Pad each symbol exactly once. Calling c.replace(symbol, ...) inside a find() loop
// would pad a repeated symbol (such as '.' or ')') once per occurrence.
c = c.replaceAll("[<>\".()]", " $0 ");
In fact, if you look closely, you only need to separate the characters that are neither letters nor spaces: ((?![a-zA-Z]|\ ).)
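The observation above can be sketched with two replaceAll calls. This version makes one extra assumption: digits and underscores count as part of a word too, so that an identifier like input1 stays in one piece. SymbolSpacer and spaceOut are illustrative names:

```java
class SymbolSpacer {

    static String spaceOut(String code) {
        return code
                .replaceAll("[^\\w\\s]", " $0 ") // pad every symbol with a space on both sides
                .replaceAll("\\s+", " ")          // collapse runs of whitespace into one space
                .trim();                          // drop the leading and trailing space
    }

    public static void main(String[] args) {
        String c = "List<String> uncleanList = Arrays.asList(input1.split(\"x\"));";
        System.out.println(spaceOut(c));
    }
}
```

In the replacement string, $0 refers to the whole match, so each symbol is re-inserted between the two spaces.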
I need to split the string below using a regex, but my attempt also splits the data inside the brackets.
Input
T(i-1).XX_1 + XY_8 + T(i-1).YY_2 * ZY_14
Expected Output
T(i-1).XX_1 , XY_8 , T(i-1).YY_2 , ZY_14
It should not split the data inside "(" and ")".
I tried the code below, but it also splits the data inside "(" and ")":
String[] result = expr.split("[+*/]");
Any pointers on how to fix this?
I am new to regex.
Input
(T(i-1).XX_1 + XY_8) + T(i-1).YY_2 * (ZY_14 + ZY_14)
Output
T(i-1).XX_1 , XY_8 , T(i-1).YY_2 , ZY_14 , ZY_14
A T(i-1) on its own should be ignored.
It does not work for the expression below:
XY_98 + XY_99 +XY_100
String lineExprVal = lineExpr.replaceAll("\\s+","");
String[] result = lineExprVal.split("[+*/-] (?!(^))");
You can split on everything outside your parentheses like this:
String str = "T(i-1).XX_1 + XY_8 + T(i-1).YY_2 * ZY_14";
String result[] = str.split("[+*/-](?![^()]*\\))");
// [+*/-] is the list of your delimiters; the lookahead (?![^()]*\\)) rejects a
// delimiter that is followed by a ')' with no '(' in between
System.out.println(Arrays.toString(result));
This will print:
[T(i-1).XX_1 ,  XY_8 ,  T(i-1).YY_2 ,  ZY_14]
The idea is simple: split on your delimiters only when they are not inside parentheses.
You can check this here ideone and you can check your regex here Regex demo
EDIT
In your second case you have to use this regex:
String str = "(T(i-1).XX_1+XY_8)+T(i-1).YY_2*(ZY_14+ZY_14)";
String result[] = str.split("[+*+\\/-](?![^()]*(?:\\([^()]*\\))?\\))");
System.out.println(Arrays.toString(result));
This will give you:
[(T(i-1).XX_1+XY_8), T(i-1).YY_2, (ZY_14+ZY_14)]
 ^----Group 1-----^  ^-Group 2-^  ^--Group 3--^
You can find the Regex Demo. I took the idea for this solution from this post: Regex to match only comma's but not inside multiple parentheses.
Hope this can help you.
Splitting your second mathematical expression with split is really hard, if not impossible, so use a Pattern and collect the matches instead. For your expression, you need this regex:
(\w+\([\w-*+\/]+\).\w+)|((?:(\w+\(.*?\))))|(\w+)
Here is a regex demo that explains it in more detail.
To get the result, you need to loop through the matches:
public static void main(String[] args) {
String input = "(T(i-1).XX_1 + XY_8) + X + T(i-1).YY_2 * (ZY_14 + ZY_14) + T(i-1)";
Pattern pattern = Pattern.compile("(\\w+\\([\\w-*+\\/]+\\).\\w+)|((?:(\\w+\\(.*?\\))))|(\\w+)");
Matcher matcher = pattern.matcher(input);
List<String> reslt = new ArrayList<>();
while (matcher.find()) { // loop through the matches
if (matcher.group(1) != null) {
reslt.add(matcher.group(1));
}
// In your case you have to skip these two groups
// if (matcher.group(2) != null) {
// reslt.add(matcher.group(2));
// }
// if (matcher.group(3) != null) {
// reslt.add(matcher.group(3));
// }
if (matcher.group(4) != null) {
reslt.add(matcher.group(4));
}
}
reslt.forEach(System.out::println);
}
This will give you:
T(i-1).XX_1
XY_8
X
T(i-1).YY_2
ZY_14
ZY_14
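A regex cannot track arbitrary nesting, so for deeper expressions a small hand-written scanner may be easier to maintain. The sketch below is an alternative to the pattern above, not a translation of it. It relies on one heuristic that is purely an assumption: a "(" that directly follows an identifier character opens a call such as T(i-1) and is kept, while any other "(" is a grouping parenthesis and is dropped. OperandScanner and operands are illustrative names:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class OperandScanner {

    static List<String> operands(String expr) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Deque<Boolean> parens = new ArrayDeque<>(); // true = call-style "(", false = grouping "("
        int callDepth = 0;                           // > 0 while inside something like T(...)

        for (char c : expr.toCharArray()) {
            if (c == '(') {
                // heuristic: "(" right after an identifier character starts a call
                boolean isCall = current.length() > 0
                        && Character.isJavaIdentifierPart(current.charAt(current.length() - 1));
                parens.push(isCall);
                if (isCall) callDepth++;
                if (callDepth > 0) current.append(c); // grouping parens outside calls are dropped
            } else if (c == ')') {
                boolean isCall = !parens.isEmpty() && parens.pop();
                if (callDepth > 0) current.append(c);
                if (isCall) callDepth--;
            } else if (callDepth == 0 && "+-*/".indexOf(c) >= 0) {
                flush(current, result); // operator outside any call: the operand is finished
            } else if (callDepth > 0 || !Character.isWhitespace(c)) {
                current.append(c);      // whitespace is skipped outside calls
            }
        }
        flush(current, result);
        return result;
    }

    private static void flush(StringBuilder current, List<String> result) {
        if (current.length() > 0) {
            result.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String input = "(T(i-1).XX_1 + XY_8) + T(i-1).YY_2 * (ZY_14 + ZY_14)";
        operands(input).forEach(System.out::println);
    }
}
```

Unlike the regex version, this scanner keeps a bare T(i-1) as an operand; if it should be ignored, filter such entries out of the returned list.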