Java regex insert single character between two patterns - java

Let's say I have for example a string...
2*Math.sqrt(5265+Math.sin(53*2*Math.exp(3+(5+5)*3
I'd like to get the following string by using regex... 2*Math.sqrt(5265)+Math.sin(53)*2*Math.exp(3)+(5+5)*3
So if I am right, I just need to insert ) into the string, right between a number of unknown length ([0-9]+) and an operator, which can be one of * - / +.
Furthermore, I'd like to know if (how?) it is possible to get an extended version supporting both Math.pow and above examples. At the moment, I'm using only single-argument Math.* methods, which are OK with above not-implemented-yet solution. But what if I'd like to use Math.pow? So let's just say, that input string is for example...
2*Math.sqrt(5265+Math.sin(53*2*Math.exp(3+(5+5)*3*24Math.pow(3*5*Math.sin(5
And the output I wish looks like... 2*Math.sqrt(5265)+Math.sin(53)*2*Math.exp(3)+(5+5)*3*Math.pow(24,3)*5*Math.sin(5)
(I am asking, because then I'd like to pass edited strings to ScriptEngine eval.)
Edit
Yes, sorry, I haven't made myself clear. I actually don't care whether I get *2* or *(2)* because ScriptEngine's eval method takes both, and at the moment I can't imagine any problems with these forms. Second, I'd like to get a general solution that works with every operator * - / +; the strings mentioned above were just examples. I'll try to be more specific. Ignoring the second Math.pow part for now, the main string, let's call it A, basically consists of one or more strings of the form QWMath.[a-z](XWY, where Q is [0-9], W is one of * - / + and X is again [0-9]. QW and Y are both optional, where Y is either another A string or [0-9], which is connected to the previous string with a W operator.
But it seems, that femtoRgon's solution replacing (Math\.[a-z]+\(\d+) with $1) is OK.
For the Math.pow part: femtoRgon's solution replacing (\d+)(Math.pow\()(\d+) with $2$1$3) is fine too, except for the missing , in $2$1$3). It should be $2$1,$3). Second, with the replaceAll method I had to add additional backslashes: (\\d+)(Math.pow\\()(\\d+), same with (Math\.[a-z]+\(\d+)...
So in the end it seems quite sufficient to do something like this...
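String evalString = "2*Math.sqrt(5265+Math.sin(53*2*Math.exp(3+(5+5)*3*2554Math.pow(451";
// Moves the number in front of Math.pow into the call as its first argument.
String firstPattern = "(\\d+)(Math\\.pow\\()(\\d+)";
// The (?!pow\\() lookahead skips Math.pow, which the first replacement has already closed.
String secondPattern = "(Math\\.(?!pow\\()[a-z]+\\(\\d+)";
String tempString = evalString.replaceAll(firstPattern, "$2$1,$3)");
// "$1)" appends the closing parenthesis after each single-argument call.
System.out.println(tempString.replaceAll(secondPattern, "$1)"));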
...maybe it can be much shorter. Of course it's not the exact solution, but the rest is just a cosmetic detail now.

By your explanation, your first case sounds simple enough. Simply replace:
(\(\d*)
with
$1)
However, this will not produce your desired output, but instead will give you:
2*Math.sqrt(5265)+Math.sin(53)*2*Math.exp(3)+(5)+5)*3
What is the logic that would prevent you transforming (5+5) in that way?
Perhaps you only want to perform this on functions beginning with Math.? In that case, you could try replacing:
(Math\.[a-z]+\(\d+)
with
$1)
Don't run it twice though. There is nothing stopping this from changing Math.pow(3) into Math.pow(3))
To transform 24Math.pow(3 into Math.pow(24,3), you could replace:
(\d+)(Math.pow\()(\d+)
with
$2$1$3)
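The two replacements above can be combined into one helper. The following is only a sketch, not the answerer's code: the class name MathCallCloser, the method name close and the negative lookahead are assumptions added here. The lookahead (?![\d),]) refuses to touch a call that is already followed by ) or by a comma, which is one possible way around the "don't run it twice" caveat, and it also keeps the second replacement from closing Math.pow after its first argument. The pow replacement uses the corrected $2$1,$3) form with the comma.

```java
import java.util.regex.Pattern;

final class MathCallCloser {
    // Moves the number written in front of Math.pow into the call: 24Math.pow(3 -> Math.pow(24,3)
    private static final Pattern POW = Pattern.compile("(\\d+)(Math\\.pow\\()(\\d+)");

    // A Math.* call followed by digits. The negative lookahead rejects a match that is
    // followed by another digit, a ')' or a ',', so closed calls and pow stay untouched.
    private static final Pattern OPEN_CALL = Pattern.compile("(Math\\.[a-z]+\\(\\d+)(?![\\d),])");

    static String close(String expression) {
        String withPow = POW.matcher(expression).replaceAll("$2$1,$3)");
        return OPEN_CALL.matcher(withPow).replaceAll("$1)");
    }

    public static void main(String[] args) {
        String input = "2*Math.sqrt(5265+Math.sin(53*2*Math.exp(3+(5+5)*3*24Math.pow(3*5*Math.sin(5";
        String once = close(input);
        System.out.println(once);
        // Running the helper on its own output should change nothing.
        System.out.println(once.equals(close(once)));
    }
}
```

Like the original patterns, this only handles arguments that start with a plain run of digits; nested or negative arguments would need a real parser.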

Related

Regex to match a fixed sub string in a String

I am trying to write a regular expression to verify the presence of a specific number in a fixed position in a String.
String: 109300300330066611111111100000000017000656052086116020170111Name 1
Number to find: 111111111 (Starting from position 17)
I have written the following regular expression:
^.{16}(?<Ones>111111111)(.*)
My understanding is:
Let first 16 characters be whatever they are
Use the Named Capturing Group to grab the specific word
Let the rest of the characters be whatever they are
I am new to regex, is there any issue with the above approach?
Can it be done in other/better way?
I am using Java 8.
Without more details of why you're doing what you're doing, there's just one possible improvement I can see. You repeated any character 16 times at the beginning of the string rather than writing out 16 .s, which is nice and readable, but then, it would be nice to do the same for the repeated 1s:
^.{16}(?<Ones>1{9})(.*)
Otherwise, the string of 1s is hard to understand without the coder manually counting how many there are in the regex.
If you want to hard-code the ones, you know the starting position, and you just want to know whether they are there, using a regex seems unnecessary. You can use this:
String s = "109300300330066611111111100000000017000656052086116020170111Name 1";
if (s.indexOf("111111111", 16) == 16) doSomething();
Another possible solution without regex:
if (s.substring(16, 25).equals("111111111")) doSomething();
Otherwise your regex looks good.
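For comparison, here is a minimal sketch that puts the regex approach and a non-regex check side by side. The class and method names are made up for illustration. It uses String.startsWith(String, int), which is not mentioned above but avoids the exception that substring(16, 25) throws on strings shorter than 25 characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FixedPositionCheck {
    // Skip 16 arbitrary characters, then require nine 1s (positions 17 to 25, counting from 1).
    private static final Pattern ONES_AT_17 = Pattern.compile("^.{16}(?<Ones>1{9})(.*)");

    static boolean viaRegex(String s) {
        Matcher m = ONES_AT_17.matcher(s);
        // lookingAt() anchors at the start; the named group holds the nine 1s on success.
        return m.lookingAt() && m.group("Ones").length() == 9;
    }

    static boolean viaStartsWith(String s) {
        // Offset 16 is the zero-based index of position 17; too-short input simply yields false.
        return s.startsWith("111111111", 16);
    }

    public static void main(String[] args) {
        String s = "109300300330066611111111100000000017000656052086116020170111Name 1";
        System.out.println(viaRegex(s));
        System.out.println(viaStartsWith(s));
    }
}
```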

String.split() returns an array with an additional empty value

I'm working on a piece of code where I've to split a string into individual parts. The basic logic flow of my code is, the numbers below on the LHS, i.e 1, 2 and 3 are ids of an object. Once I split them, I'd use these ids, get the respective value and replace the ids in the below String with its respective values. The string that I have is as follow -
String str = "(1+2+3)>100";
I've used the following code for splitting the string -
String[] arraySplit = str.split("\\>|\\<|\\=");
String[] finalArray = arraySplit[0].split("\\(|\\)|\\+|\\-|\\*");
Now the arrays that I get are as such -
arraySplit = [(1+2+3), 100];
finalArray = [, 1, 2, 3];
So, after the string is split, I'd replace the string with the values, i.e the string would now be, (20+45+50)>100 where 20, 45 and 50 are the respective values. (this string would then be used in SpEL to evaluate the formula)
I'm almost there, just that I'm getting an empty element at the first position. Is there a way to not get the empty element in the second array, i.e finalArray? Doing some research on this, I'm guessing it is splitting the string (1+2+3) and taking an empty element as a part of the string.
If this is the thing, then is there any other method apart from String.split() that would give me the same result?
Edit -
Here, (1+2+3)>100 is just an example. The round braces are part of a formula, and the string could also be as ((1+2+3)*(5-2))>100.
Edit 2 -
After splitting this String and doing some code over it, I'm going to use this string in SpEL. So if there's a better solution by directly using SpEL then also it would be great.
Also, currently I'm using the syntax of the formula as such - (1+2+3) * 4>100 but if there's a way out by changing the formula syntax a bit then that would also be helpful, e.g replacing the formula by - ({#1}+{#2}+{#3}) * {#4}>100, in this case I'd get the variable using {# as the variable and get the numbers.
I hope this part is clear.
Edit 3 -
Just in case, SpEL is also there in my project although I don't have much idea on it, so if there's a better solution using SpEL then it's more than welcome. The basic logic of the question is written at the start of the question.
If you take a look at the documentation of split(String regex, int limit) (emphasis is mine):
When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array.
Note that a limit of 0, which is what split(String regex) already uses, only affects trailing empty strings:
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
So the leading empty string has to be skipped or filtered out separately.
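Because the leading empty string survives a limit of 0, one alternative is to skip the second split and collect the ids with a Matcher. This is a sketch under the assumption that every id is a plain run of digits on the left-hand side of the comparison; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdExtractor {
    private static final Pattern ID = Pattern.compile("\\d+");

    // Returns the ids found left of the first comparison operator, in order of appearance.
    static List<String> ids(String formula) {
        String lhs = formula.split("[<>=]")[0];
        List<String> ids = new ArrayList<>();
        Matcher m = ID.matcher(lhs);
        while (m.find()) {
            ids.add(m.group()); // only non-empty digit runs are ever added
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(ids("(1+2+3)>100"));
        System.out.println(ids("((1+2+3)*(5-2))>100"));
    }
}
```

Matching the tokens you want, rather than splitting on the ones you don't, also copes with nested parentheses without producing empty elements.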
If you keep things really simple, you may be able to get away with using a combination of regular expressions and string operations like split and replace.
However, it looks to me like you'd be better off writing a simple parser using ANTLR.
Take a look at Parsing an arithmetic expression and building a tree from it in Java and https://theantlrguy.atlassian.net/wiki/display/ANTLR3/Five+minute+introduction+to+ANTLR+3
Edit: I haven't used ANTLR in a while - it's now up to version 4, and there may be some significant differences, so make sure that you check the documentation for that version.
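For the simple case, the "regular expressions plus string operations" route might look like the sketch below, which swaps each id on the left-hand side for its value in a single pass. The class name, the method name and the Map of values are assumptions for illustration; ids missing from the map are left as they are.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdSubstitution {
    private static final Pattern ID = Pattern.compile("\\d+");
    private static final Pattern COMPARISON = Pattern.compile("[<>=]");

    // Replaces ids left of the first comparison operator; the right-hand side is copied unchanged.
    static String substitute(String formula, Map<String, String> valuesById) {
        Matcher cmp = COMPARISON.matcher(formula);
        int end = cmp.find() ? cmp.start() : formula.length();
        String lhs = formula.substring(0, end);
        String rhs = formula.substring(end);

        Matcher m = ID.matcher(lhs);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = valuesById.get(m.group());
            // quoteReplacement keeps '$' and '\' in a value from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value != null ? value : m.group()));
        }
        m.appendTail(out);
        return out.append(rhs).toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("1", "20");
        values.put("2", "45");
        values.put("3", "50");
        System.out.println(substitute("(1+2+3)>100", values));
    }
}
```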

Regex to match if string *only* contains *all* characters from a character set, plus an optional one

I ran into a wee problem with Java regex. (I must say in advance, I'm not very experienced in either Java or regex.)
I have a string, and a set of three characters. I want to find out if the string is built from only these characters. Additionally (just to make it even more complicated), two of the characters must be in the string, while the third one is optional.
I do have a solution, my question is rather if anyone can offer anything better/nicer/more elegant, because this makes me cry blood when I look at it...
The set-up
The mandatory characters are: | (pipe) and - (dash).
The string in question should be built from a combination of these. They can be in any order, but both have to be in it.
The optional character is: : (colon).
The string can contain colons, but it does not have to. This is the only other character allowed, apart from the above two.
Any other characters are forbidden.
Expected results
Following strings should work/not work:
"------" = false
"||||" = false
"---|---" = true
"|||-|||" = true
"--|-|--|---|||-" = true
...and...
"----:|--|:::|---::|" = true
":::------:::---:---" = false
"|||:|:::::|" = false
"--:::---|:|---G---n" = false
...etc.
The "ugly" solution
Now, I have a solution that seems to work, based on this stackoverflow answer. The reason I'd like a better one will become obvious when you've recovered from seeing this:
if (string.matches("^[(?\\:)?\\|\\-]*(([\\|\\-][(?:\\:)?])|([(?:\\:)?][\\|\\-]))[(?\\:)?\\|\\-]*$") || string.matches("^[(?\\|)?\\-]*(([\\-][(?:\\|)?])|([(?:\\|)?][\\-]))[(?\\|)?\\-]*$")) {
//do funny stuff with a meaningless string
} else {
//don't do funny stuff with a meaningless string
}
Breaking it down
The first regex
"^[(?\\:)?\\|\\-]*(([\\|\\-][(?:\\:)?])|([(?:\\:)?][\\|\\-]))[(?\\:)?\\|\\-]*$"
checks for all three characters
The next one
"^[(?\\|)?\\-]*(([\\-][(?:\\|)?])|([(?:\\|)?][\\-]))[(?\\|)?\\-]*$"
checks for the two mandatory ones only.
...Yea, I know...
But believe me I tried. Nothing else gave the desired result, but allowed through strings without the mandatory characters, etc.
The question is...
Does anyone know how to do it a simpler / more elegant way?
Bonus question: There is one thing I don't quite get in the regexes above (more than one, but this one bugs me the most):
As far as I understand(?) regular expressions, (?\\|)? should mean that the character | is either contained or not (unless I'm very much mistaken), still in the above setup it seems to enforce that character. This of course suits my purpose, but I cannot understand why it works that way.
So if anyone can explain what I'm missing there, that'd be really great. Besides, I suspect this holds the key to a simpler solution (checking for both mandatory and optional characters in one regex would be ideal).
Thank you all for reading (and suffering ) through my question, and even bigger thanks for those who reply. :)
PS
I did try stuff like ^[\\|\\-(?:\\:)?)]$, but that would not enforce all mandatory characters.
Use a lookahead based regex.
^(?=.*\\|)(?=.*-)[-:|]+$
or
^(?=.*\\|)[-:|]*-[-:|]*$
or
^[-:|]*(?:-:*\\||\\|:*-)[-:|]*$
(?=.*\\|) expects at least one pipe.
(?=.*-) expects at least one hyphen.
[-:|]+ any char from the list one or more times.
$ End of the line.
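As a quick check, the first lookahead pattern can be run against the strings listed in the question. The class and method names below are illustrative only, not part of any answer.

```java
import java.util.regex.Pattern;

final class PipeDashValidator {
    // Two lookaheads require at least one '|' and one '-'; the class allows only '-', ':' and '|'.
    private static final Pattern VALID = Pattern.compile("^(?=.*\\|)(?=.*-)[-:|]+$");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "------", "||||", "---|---", "|||-|||", "--|-|--|---|||-",
            "----:|--|:::|---::|", ":::------:::---:---", "|||:|:::::|", "--:::---|:|---G---n"
        };
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" = " + isValid(sample));
        }
    }
}
```

Compiling the pattern once and reusing it avoids the recompilation that String.matches performs on every call.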
Here is a simple answer:
(?=.*\|.*-|.*-.*\|)^([-|:]+)$
This says that the string needs to have a '-' followed by '|', or a '|' followed by a '-', via the look-ahead. Then the string only matches the allowed characters.
Demo: http://fiddle.re/1hnu96
Here is one without lookahead or lookbehind.
^(?:[-:|]*\\|[-:|]*-[-:|]*|[-:|]*-[-:|]*\\|[-:|]*)$
This doesn't scale, so Avinash's solution is to be preferred if your regex engine supports lookahead.

Splitting strings like push1234 in java

Please bear with me, as I have a long question. I have some Java code that uses an ArrayList to implement a stack. I need to be able to enter the command "push" to add stuff to the stack. However, my problem is that it has to be in the format pushSTUFF.
The "STUFF" can be anything: upper case, lower case, string, int, etc. The way I've been trying to implement this is with the String split method, where push is the delimiter. Then the command is passed to a switch case.
I quickly realized that the delimiter gets discarded by the split, at least as far as I can tell, and that the switch case is getting pushSTUFF, not push, as the case input.
In contemplating this problem I came up with a couple of ways I could do this. I just don't know if they are possible or how to do them.
So,
1. Is there a way to split a string like pushSTUFF and keep both parts (the push and the STUFF)?
2. Is there a way to split, from a string, something of unknown length or contents (since I don't know what the user will input, the STUFF is unknown)?
3. Is there a way to tell the switch case to look for the pushSTUFF as opposed to just push (again because STUFF is unknown)?
Are any of these even possible to do? If so what would you recommend?
I'm sure there are better ways but as I'm still learning java these seemed like the best for right now. Also I didn't post any code because I didn't feel it was necessary to the question. I will post some if you need it though. Just ask and I will be happy to oblige.
(tl;dr) Is it possible to do any of 1, 2, or 3 above and if so how?
Thanks in advance.
Instead of splitting the strings, you can use regular expressions with groups and iterate over the matching parts of it (as you saw, the split character(s) get discarded).
For #1, you could do something like (pseudocode):
regex = (push)(.*)
command = group(1)
stuff = group(2)
That should also cover #2 since it will match all characters after the push.
I'm not entirely sure what you're asking in #3.
There is a regex tutorial here if you're not familiar with java regular expressions.
You can also take a look at the StringTokenizer, which has an option to keep delimiters.
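In Java, the group-based idea could be sketched as follows. The class name StackCommands, the execute method and the use of a List as the stack are assumptions for illustration. Group 1 carries the command word for the switch (question #3) and group 2 carries whatever follows it (questions #1 and #2).

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StackCommands {
    // Group 1 is the command word, group 2 is everything after it (may be empty).
    private static final Pattern PUSH = Pattern.compile("(push)(.*)", Pattern.DOTALL);

    // Applies one input line to the stack; returns false for unrecognised input.
    static boolean execute(String input, List<String> stack) {
        Matcher m = PUSH.matcher(input);
        if (!m.matches()) {
            return false;
        }
        String command = m.group(1);
        String argument = m.group(2);
        switch (command) {
            case "push":
                stack.add(argument); // the end of the list acts as the top of the stack
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        List<String> stack = new java.util.ArrayList<>();
        execute("push1234", stack);
        execute("pushSTUFF", stack);
        System.out.println(stack);
    }
}
```

Further commands would need their own alternatives in the pattern, for example (push|pop)(.*), plus matching case labels.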
If the format will always be pushSOMETHING, why aren't you using String.substring()?
You can do:
String something = "pushSTUFF".substring(4);
This way, you will always get whatever comes after push.
I really don't understand what you are trying to achieve without seeing the actual code, but your problem seems simple enough to be solved this way.
Use .indexOf and find push:
public class SplitString {
    public static void main(String[] args) {
        String toSplit = "push1234";
        // Locate the command; indexOf returns -1 if "push" is absent.
        int ind = toSplit.indexOf("push");
        if (ind >= 0) {
            String part1 = toSplit.substring(ind, ind + 4); // the command itself
            String part2 = toSplit.substring(ind + 4);      // everything after it
            System.out.println(part1 + " " + part2);
        }
    }
}
You can search for any uppercase letter and use String.substring(...).
See: Find if first character in a string is upper case, Java

Java Regex Engine Crashing

Regex Pattern - ([^=](\\s*[\\w-.]*)*$)
Test String - paginationInput.entriesPerPage=5
Java Regex Engine Crashing / Taking Ages (> 2mins) finding a match. This is not the case for the following test inputs:
paginationInput=5
paginationInput.entries=5
My requirement is to get hold of the String on the right-hand side of = and replace it with something. The above pattern is doing it fine except for the input mentioned above.
I want to understand why the error and how can I optimize the Regex for my requirement so as to avoid other peculiar cases.
You can use a look behind to make sure your string starts at the character after the =:
(?<=\\=)([\\s\\w\\-.]*)$
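As for why it is crashing, it's the second * around the group. With one * inside the group and another around it, the engine can divide the same run of characters between repetitions in exponentially many ways, and it tries all of them once the match fails at the =. I'm not sure why you need that, since it sounds like you are asking for:
A single character, anything but equals
Then 0 or more repeats of the following group:
    Any amount of white space
    Then any amount of word characters, dash, or dot
End of string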
Anyway, take out that *, and it doesn't spin forever anymore, but I'd still go for the more specific regex using the look behind.
Also, I don't know how you are using this, but why did you have the $ in there? Then you can only match the last one in the string (if you have more than one). It seems like you'd be better off with a look-ahead to the new line or the end: (?=\\n|$)
[Edit]: Update per comment below.
Try this:
=\\s*(.*)$
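To tie this back to the stated requirement (replace whatever is on the right-hand side of =), here is a minimal sketch based on the lookbehind pattern. The class and method names are hypothetical, and the capturing group is dropped because only the matched text is replaced.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ValueReplacer {
    // The lookbehind keeps '=' out of the match, so only the value up to the end is replaced.
    // There is a single quantifier and no nested repetition, so backtracking stays cheap.
    private static final Pattern VALUE = Pattern.compile("(?<==)[\\s\\w\\-.]*$");

    static String replaceValue(String line, String newValue) {
        // quoteReplacement treats '$' and '\' in the new value as literal text.
        return VALUE.matcher(line).replaceFirst(Matcher.quoteReplacement(newValue));
    }

    public static void main(String[] args) {
        System.out.println(replaceValue("paginationInput.entriesPerPage=5", "10"));
        System.out.println(replaceValue("paginationInput=5", "25"));
    }
}
```

A line without = is returned unchanged, since the lookbehind never succeeds.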
