Splitting a string keeping some delimiters but removing another - java

Basically, I would like to split a string into an array, using both spaces and operators as delimiters, but keep the operators while removing the spaces.
For example, 3 52 9+- 2 3 * /
should become [3][52][9][+][-][2][3][*][/]

When splitting, you want to consume the delimiters that are whitespace and keep the delimiters that are arithmetic symbols. To do that, first make sure every operator (or run of operators) is preceded by a space, then split either on a whitespace character (\\s, which is consumed and so removed from the result) or at the zero-width position right after an operator (a lookbehind, which consumes nothing).
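String input = "3 52 9+- 2 3 * /";
// Insert a space before operators, then collapse runs of whitespace.
input = input.replaceAll("([\\+\\-*/])(.)", " $1$2")
             .replaceAll("\\s+", " ");
// Whitespace is tried first so that it is consumed; with the lookbehind first,
// the zero-width match after "-" or "*" would leave the next space in the following token.
String[] parts = input.split("\\s|(?<=[+\\-*/])");
System.out.println(Arrays.toString(parts));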
Output:
[3, 52, 9, +, -, 2, 3, *, /]

import java.util.ArrayList;
import java.util.List;

public class Test {
    public static void main(String[] args) {
        String input = "3 52 9+- 2 3 * /";
        // Surround every operator with spaces, then collapse runs of whitespace.
        input = input.replaceAll("([\\+\\-*/])", " $1 ").replaceAll("\\s+", " ");
        // Split (zero-width) after every operator or space.
        String[] parts = input.split("(?<=[+\\-*/ ])");
        List<String> finalList = new ArrayList<String>();
        for (String part : parts) {
            // The lookbehind keeps the space at the end of each part, so trim it off.
            String token = part.trim();
            if (!token.isEmpty()) {
                finalList.add(token);
            }
        }
        System.out.println(finalList);
    }
}
Output
[3, 52, 9, +, -, 2, 3, *, /]

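Try this regex:
([\-]?[^\s\+\-\*\/]+|[\+\-\*\/])
It matches either:
[\-]? an optional minus sign (signed or unsigned operand), followed by
[^\s\+\-\*\/]+ one or more characters that are neither whitespace nor one of + - * /
or [\+\-\*\/] a single operator, one of + - * /
Collect the matches of this regex instead of splitting on it.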
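As a rough sketch of how that regex could be applied in Java, the snippet below collects every match with a find() loop. The class and method names (OperatorTokens, tokenize) are invented for this example and are not part of the answer. One consequence of the optional leading minus is worth knowing: in input such as 3-2 the minus attaches to the operand, so the tokens are 3 and -2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorTokens {
    // The regex from the answer, written as a Java string literal
    // (every backslash is doubled).
    private static final Pattern TOKEN =
            Pattern.compile("([\\-]?[^\\s\\+\\-\\*\\/]+|[\\+\\-\\*\\/])");

    // Hypothetical helper: returns every operand or operator found in the input.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("3 52 9+- 2 3 * /")); // [3, 52, 9, +, -, 2, 3, *, /]
        System.out.println(tokenize("3-2"));              // [3, -2]
    }
}
```

If a minus between two operands should always be a separate token, drop the [\-]? prefix from the pattern.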

Related

How to split this String using Regex Java

As the title says, I want to split a string in a specific way, but I don't know what to put inside String.split("regex")
Say we have:
String s = "sum := A+B*( 99.1 +.44 444 1234+++)-sum/232.123459"
Now I want to split it to put it this way:
String[] splitted = ["sum"; ":="; "A"; "+"; "B"; "*"; "("; "99.1"; "+"; ".44"; "444"; "1234"; "+"; "+"; "+"; ")"; "-"; "sum"; "/"; "232.123459"]
So, basically I want to split by space, words, the math operators, numbers, the parenthesis, the letters and the number ".44" has to remain this way.
Can you help me?
Thanks in advance
Don't use split(). Use a find() loop.
String regex = "[0-9]+\\.?[0-9]*" + // Match number (e.g. 999 or 999.99)
"|\\.[0-9]+" + // Match number (e.g. .999)
"|[a-zA-Z]\\w*" + // Match identifier
"|:=" + // Match complex operator
"|\\S"; // Match other single non-space, incl. operators: +, -, *, /, (, )
Test
String s = "sum := A+B*( 99.1 +.44 444 1234+++)-sum/232.123459";
String[] splitted = Pattern.compile(regex).matcher(s).results()
.map(MatchResult::group).toArray(String[]::new);
System.out.println(Arrays.toString(splitted));
Output
[sum, :=, A, +, B, *, (, 99.1, +, .44, 444, 1234, +, +, +, ), -, sum, /, 232.123459]
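Matcher.results() is only available from Java 9 onward. As a sketch for older versions, the same tokens can be collected with an explicit find() loop. The class and method names below (FindLoopTokenizer, tokenize) are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopTokenizer {
    private static final Pattern TOKEN = Pattern.compile(
            "[0-9]+\\.?[0-9]*" +  // number such as 999 or 999.99
            "|\\.[0-9]+" +        // number such as .999
            "|[a-zA-Z]\\w*" +     // identifier
            "|:=" +               // two-character operator
            "|\\S");              // any other single non-space character

    // Hypothetical helper: collects each match in order of appearance.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String s = "sum := A+B*( 99.1 +.44 444 1234+++)-sum/232.123459";
        System.out.println(tokenize(s));
    }
}
```

The order of the alternatives matters: the number and := branches come before \\S so that multi-character tokens are not broken into single characters.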

Want to string split specifically in Java

I want to string split the following String
String ToSplit = "(2*(sqrt 9))/5";
into the following array of String:
String[] Splitted = {"(", "2", "*", "(", "sqrt", "9", ")", ")", "/", "5"};
As you can see, the string ToSplit has almost no spaces, and I am having a hard time separating the word "sqrt" from the rest of the elements because it is a full word. I am doing:
String[] Splitted = ToSplit.split("");
and the word "sqrt" is split into {"s", "q", "r", "t"} (obviously), but I want it kept as a whole word so that the string is split as shown above.
How can I separate the word "sqrt" (as one element) from the others?
Thanks in advance.
Here is a working solution which splits on lookarounds. See below the code for an explanation.
String input = "(2*(sqrt 9))/5";
String[] parts = input.split("(?<=[^\\w\\s])(?=\\w)|(?<=\\w)(?=[^\\w\\s])|(?<=[^\\w\\s])(?=[^\\w\\s])|\\s+");
for (String part : parts) {
System.out.println(part);
}
(
2
*
(
sqrt
9
)
)
/
5
There are four terms in the regex alternation, and here is what each one does:
(?<=[^\\w\\s])(?=\\w)
split if what precedes is neither a word character nor whitespace, AND
what follows is a word character
e.g. split (2 into ( and 2
(?<=\\w)(?=[^\\w\\s])
split if what precedes is a word character AND
what follows is neither a word character nor whitespace
e.g. split 9) into 9 and )
(?<=[^\\w\\s])(?=[^\\w\\s])
split between two non word/whitespace characters
e.g. split )/ into ) and /
\\s+
finally, also split and consume any amount of whitespace
as a separator between terms
e.g. sqrt 9 becomes sqrt and 9
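A different technique that gives the same tokens for this input is to match the tokens instead of splitting between them: a token is either a run of word characters or a single character that is neither a word character nor whitespace. This is a sketch of that alternative, not the lookaround split described above, and the names WordOrSymbolTokens and tokenize are invented for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordOrSymbolTokens {
    // \w+ matches a whole word or number; [^\w\s] matches one symbol.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    // Hypothetical helper: whitespace is never matched, so it is skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("(2*(sqrt 9))/5"));
        // [(, 2, *, (, sqrt, 9, ), ), /, 5]
    }
}
```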

Use closing parenthesis() as delimiters in multiple Regular Expressions

How can I add the parentheses ( and ) to this group of delimiters?
Group one:
(?<=[-+*/%])
Group two:
(?=[-+*/%])
My code is :
String str = " (31 + 4) -6+ 1 % 7";
String [] result = str.split("(?<=[-+*/%])|(?=[-+*/%]) " ) ;
System.out.println(Arrays.toString(result));
I want the parentheses ( and ) to act like the other characters in the brackets [],
with the look-behind assertion
(?<=...)
and the look-ahead assertion
(?=...)
the output is :
[ (31, +, 4), -, 6, +, 1, %, 7]
the output needed :
[ ( , 31, +, 4 , ) , -, 6, +, 1, %, 7]
Adding the round brackets to both of your character classes gives you what you need. To remove the space-only elements, you can post-process the result with .removeAll(Collections.singleton(" ")) run against a list of strings. Elements that contain more than a space, such as "31 ", keep their surrounding whitespace, so call trim() on them if you need clean tokens.
Thus, you can use the following code:
String s = " (31 + 4) -6+ 1 % 7";
String[] sp = s.split("(?<=[-+*/%()])|(?=[-+*/%()])\\s*");
List<String> list = new ArrayList<String>(Arrays.asList(sp));
list.removeAll(Collections.singleton(" "));
System.out.println(list);
Another way mentioned by anubhava, with [0-9]+|[-+*/%()] regex that matches sequences of digits or the operators:
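String s = " (31 + 4) -6+ 1 % 7";
Pattern pattern = Pattern.compile("[0-9]+|[-+*/%()]");
Matcher matcher = pattern.matcher(s);
List<String> res = new ArrayList<>();
while (matcher.find()) {
    // Each match is either a run of digits or a single operator or parenthesis.
    res.add(matcher.group(0));
}
System.out.println(res); // [(, 31, +, 4, ), -, 6, +, 1, %, 7]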

How to parse string with Java?

I am trying to make a simple calculator application that would take a string like this
5 + 4 + 3 - 2 - 10 + 15
I need Java to parse this string into an array
{5, +4, +3, -2, -10, +15}
Assume the user may enter 0 or more spaces between each number and each operator
I'm new to Java so I'm not entirely sure how to accomplish this.
You can use Integer.parseInt to get the values, and you can split the string with the methods of the String class. A regex could also work, but I don't know how to write those.
Take a look at String.split():
String str = "1 + 2";
System.out.println(java.util.Arrays.toString(str.split(" ")));
[1, +, 2]
Note that split uses regular expressions, so you would have to quote the character to split by "." or similar characters with special meanings. Also, multiple spaces in a row will create empty strings in the parse array which you would need to skip.
This solves the simple example. For more rigorous parsing of true expressions you would want to create a grammar and use something like Antlr.
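The two pitfalls mentioned above (repeated spaces and regex metacharacters) can be sketched as follows. The helper names splitOnWhitespace and splitOnLiteral are assumptions made for this example. Note that this still relies on at least one space between tokens, so input such as 1+2 is not separated.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Trim first so leading whitespace does not produce an empty first element,
    // then split on runs of whitespace instead of on a single space.
    static String[] splitOnWhitespace(String input) {
        return input.trim().split("\\s+");
    }

    // Pattern.quote makes the delimiter literal, so "." means a dot
    // rather than "any character".
    static String[] splitOnLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString("1  +  2".split(" ")));          // [1, , +, , 2]
        System.out.println(Arrays.toString(splitOnWhitespace("1  +  2")));  // [1, +, 2]
        System.out.println("a.b.c".split(".").length);                      // 0
        System.out.println(Arrays.toString(splitOnLiteral("a.b.c", ".")));  // [a, b, c]
    }
}
```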
Let str be your line buffer.
Use a java.util.regex.Matcher to find every match of the pattern ([-+]?[ \t]*[0-9]+).
Accumulate all matches into String[] tokens.
Then, for each token in tokens, join the sign and the number by dropping the whitespace between them:
String[] s = tokens[i].split("[ \t]+");
if (s.length > 1)
    tokens[i] = s[0] + s[1];
else
    tokens[i] = s[0];
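The steps above might look like this in Java. It is only a sketch: the class and method names (SignedTerms, parse) are made up, and the whitespace inside each match is removed with replaceAll rather than with the split-and-join shown in the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SignedTerms {
    // Optional sign, optional spaces or tabs, then one or more digits.
    private static final Pattern TERM = Pattern.compile("[-+]?[ \\t]*[0-9]+");

    // Hypothetical helper: returns each number together with its sign.
    static List<String> parse(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TERM.matcher(line);
        while (m.find()) {
            // Drop the whitespace between sign and digits, e.g. "+ 4" becomes "+4".
            tokens.add(m.group().replaceAll("[ \\t]+", ""));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(parse("5 + 4 + 3 - 2 - 10 + 15"));
        // [5, +4, +3, -2, -10, +15]
    }
}
```

Because the whitespace in the pattern is optional, input without any spaces, such as 5+4-2, is handled the same way.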
You can use positive lookbehind:
String s = "5 + 4 + 3 - 2 - 10 + 15";
Pattern p = Pattern.compile("(?<=[0-9]) +");
String[] result = p.split(s);
for(String ss : result)
System.out.println(ss.replaceAll(" ", ""));
String cal = "5 + 4 + 3 - 2 - 10 + 15";
// Matches a '+' or '-', optional whitespace, then a number.
Pattern pat = Pattern.compile("[-+]\\s*\\d+");
Matcher mat = pat.matcher(cal);
List<String> ops = new ArrayList<String>();
while (mat.find()) {
    // Store the term without its inner whitespace, e.g. "+ 4" becomes "+4".
    ops.add(mat.group().replaceAll("\\s+", ""));
}
// The first number has no sign, so take it separately and put it at the front.
// This assumes that a space follows the first number.
ops.add(0, cal.substring(0, cal.indexOf(" ")));
System.out.println(ops);
//[5, +4, +3, -2, -10, +15]
Based off the input of some of the answers here, I found this to be the best solution
// input
String s = "5 + 4 + 3 - 2 - 10 + 15";
ArrayList<Integer> numbers = new ArrayList<Integer>();
// remove whitespace
s = s.replaceAll("\\s+", "");
// parse string
Pattern pattern = Pattern.compile("[-]?\\d+");
Matcher matcher = pattern.matcher(s);
// add numbers to array
while (matcher.find()) {
numbers.add(Integer.parseInt(matcher.group()));
}
// numbers
// {5, 4, 3, -2, -10, 15}

How can I create a Java regular expression for a comma-separated list

How can I create a Java regular expression for a comma-separated list such as the following?
(3)
(3,6)
(3 , 6 )
I tried, but it does not match anything:
Pattern.compile("\\(\\S[,]+\\)")
and how can I get the value "3" or "3"and "6" in my code from the Matcher class?
It's not clear to me exactly what your input looks like, but I doubt the pattern you're using is what you want. Your pattern will match a literal (, followed by a single non-whitespace character, followed by one or more commas, followed by a literal ).
If you want to match a number, optionally followed by a comma and another number, all surrounded by parentheses and with optional whitespace around each part, you could try this pattern:
"\\(\\s*(\\d+)\\s*(?:,\\s*(\\d+))?\\s*\\)"
That should match (3), ( 3 ), ( 3, 6), (3 , 6 ), etc. but not (a) or (3, a).
You can retrieve the matched numbers using Matcher.group; the first number will be group 1, and the second (if any) will be group 2. Group 2 is null when there is no second number.
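As a sketch of the retrieval step, the snippet below uses a pattern of this shape (whitespace allowed around the comma, second number captured without the comma) and reads the numbers back with Matcher.group. The class and method names (ParenNumbers, extract) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenNumbers {
    // Group 1 is the first number; group 2 is the optional second number.
    private static final Pattern LIST =
            Pattern.compile("\\(\\s*(\\d+)\\s*(?:,\\s*(\\d+))?\\s*\\)");

    // Hypothetical helper: returns the captured numbers,
    // or an empty list when the whole input does not match.
    static List<String> extract(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = LIST.matcher(input);
        if (m.matches()) {
            numbers.add(m.group(1));
            if (m.group(2) != null) { // null when the optional part did not take part in the match
                numbers.add(m.group(2));
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extract("(3)"));      // [3]
        System.out.println(extract("(3,6)"));    // [3, 6]
        System.out.println(extract("(3 , 6 )")); // [3, 6]
        System.out.println(extract("(a)"));      // []
    }
}
```

matches() requires the whole input to fit the pattern; use find() instead if the parenthesized list is embedded in longer text.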
Validation regex
You can try this meta-regex approach for clarity:
String pattern =
"< part (?: , part )* >"
.replace("<", "\\(")
.replace(">", "\\)")
.replace(" ", "\\s*")
.replace("part", "[^\\s*(,)]++");
System.out.println(pattern);
/*** this is the pattern
\(\s*[^\s*(,)]++\s*(?:\s*,\s*[^\s*(,)]++\s*)*\s*\)
****/
The part pattern is [^\s*(,)]+, i.e. one or more of anything but whitespace, asterisks, parentheses and commas (inside a character class, * is a literal asterisk, not a quantifier). This construct is called a negated character class. [aeiou] matches any of the 5 vowel letters; [^aeiou] matches any character except those (which includes consonants but also digits, symbols and whitespace).
The + repetition is also made possessive to ++ for optimization. The (?:...) construct is a non-capturing group, also for optimization.
References
regular-expressions.info/Character Class, Possessive Quantifier, Non-capturing Group
java.util.regex.Pattern
Testing and splitting
We can then test the pattern as follows:
String[] tests = {
"(1,3,6)",
"(x,y!,a+b=c)",
"( 1, 3 , 6)",
"(1,3,6,)",
"(())",
"(,)",
"()",
"(oh, my, god)",
"(oh,,my,,god)",
"([],<>)",
"( !! , ?? , ++ )",
};
for (String test : tests) {
if (test.matches(pattern)) {
String[] parts = test
.replaceAll("^\\(\\s*|\\s*\\)$", "")
.split("\\s*,\\s*");
System.out.printf("%s = %s%n",
test,
java.util.Arrays.toString(parts)
);
} else {
System.out.println(test + " no match");
}
}
This prints:
(1,3,6) = [1, 3, 6]
(x,y!,a+b=c) = [x, y!, a+b=c]
( 1, 3 , 6) = [1, 3, 6]
(1,3,6,) no match
(()) no match
(,) no match
() no match
(oh, my, god) = [oh, my, god]
(oh,,my,,god) no match
([],<>) = [[], <>]
( !! , ?? , ++ ) = [!!, ??, ++]
This uses String.split to get a String[] of all the parts after trimming the brackets out.
