How to remove everything apart from math signs java regex - java

I have a string which is a maths sum, e.g. 3+5*5/2.
I want to get one string array that contains the numbers and another array that contains the operations.
I have managed to get the numbers by themselves, but I can't get the operations.
This is what I have so far:
String extractingIntegers = "4+5*9/8-6";
String[] operationsInStringformat = extractingIntegers.split("[^0-9]");
String[] numbersInStringformat = extractingIntegers.split("\\D");
The \\D works but not the [^0-9]

The opposite of \D is \d
String extractingIntegers = "4+5*9/8-6";
String[] operationsInStringformat = extractingIntegers.split("\\d");
String[] numbersInStringformat = extractingIntegers.split("\\D");

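Just remove the ^ from the character class in your operations line:
String[] operationsInStringformat = extractingIntegers.split("[0-9]+");
The + is included to handle numbers with more than one digit. Note that the result starts with an empty string, because the input begins with a digit.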
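Putting the two splits together, here is a minimal sketch. The class and method names are made up for illustration, and it assumes the expression contains only non-negative integers and single-character operators. The stream filter drops the empty leading element that appears because the input starts with a digit.

```java
import java.util.Arrays;

class SplitSum {
    // Numbers: split on runs of non-digit characters.
    static String[] numbers(String expr) {
        return expr.split("\\D+");
    }

    // Operators: split on runs of digits, then drop empty strings
    // (the input starts with a digit, so element 0 would be "").
    static String[] operators(String expr) {
        return Arrays.stream(expr.split("\\d+"))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String expr = "4+5*9/8-6";
        System.out.println(Arrays.toString(numbers(expr)));   // [4, 5, 9, 8, 6]
        System.out.println(Arrays.toString(operators(expr))); // [+, *, /, -]
    }
}
```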

Related

split string into array and add the delimiter to the array

I have a String and I need to split it into an array.
My String is, for example, "-2x+3".
I split it with this code:
public static String[] splitAnswer(String answerInput){
answerInput = answerInput.trim();
String[] token = answerInput.split("[\\+\\-\\*\\\\\\/]");
return token;
}
but I need the minus sign kept with 2x, i.e. (-2x), so my output array should be {"-2x","3"}.
The important thing is that the minus stays with the number after it.
You can use following regex:
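String[] token = answerInput.split("[+*/]|(?=-)");
So, this splits on all the operators except -. For the - operator, it splits on the empty string before the -, so the sign stays with the term that follows. BTW, you don't need to escape +, * or / inside a character class. (On Java 8 and later, the zero-width match at the very start of the string does not produce an empty leading element.)
For -2x+3, the split positions are:
|-2x+3   ( `|` is the empty string)
^   ^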
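As a quick check of the lookahead split, here is a small sketch. The class name is hypothetical, and the expected output assumes Java 8 or later, where a zero-width match at the start of the input yields no empty leading element.

```java
import java.util.Arrays;

class SignedTermSplit {
    // Split on +, * and / (these are consumed), and at the empty position
    // before each '-', which keeps the minus sign with the following term.
    static String[] splitAnswer(String answerInput) {
        return answerInput.trim().split("[+*/]|(?=-)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAnswer("-2x+3"))); // [-2x, 3]
        System.out.println(Arrays.toString(splitAnswer("4x-2")));  // [4x, -2]
    }
}
```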

(Java) Substrings & Reading data from two files using hashmap

If I had a .txt file called animals that had fishfroggoat etc. in it, and another file called owners that had something like:
fish:jane
frog:mark
goat:joe
how could I go about pairing the pets to their owners? I'm fairly sure a HashMap would be good here, but I'm stuck. I put the animal text into a string, but I don't know how to break it up into 4 characters properly.
Any help would be great.
Sorry I didn't add any code, but thanks to you guys' help (especially Ted Hopps) I worked it out and, more importantly, understood it. :-)
There are various approaches. The most direct is to split it using the substring method:
String animals = "fishfroggoat";
String fish = animals.substring(0, 4);
String frog = animals.substring(4, 8);
String goat = animals.substring(8); // or (8, 12)
If you have an arbitrarily long list of 4-character animals, you can do this:
String animals = "fishfroggoatbear";
int n = animals.length() / 4;
String[] animalArray = new String[n];
for (int i = 0; i < n; ++i) {
animalArray[i] = animals.substring(4*i, 4*i + 4);
}
You can split the pet/owner strings using split:
String rawData = "fish:jane";
String[] data = rawData.split(":");
String pet = data[0];
String owner = data[1];
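The pieces above can be combined into a HashMap lookup. This is only a sketch under assumptions: the file contents are stood in for by a string and an array (reading the files is left out), every animal name is exactly 4 characters, and the class and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PetOwners {
    // Cut a string into consecutive fixed-size pieces (the last may be shorter).
    static String[] chunk(String s, int size) {
        int n = (s.length() + size - 1) / size;
        String[] out = new String[n];
        for (int i = 0; i < n; ++i) {
            out[i] = s.substring(size * i, Math.min(size * i + size, s.length()));
        }
        return out;
    }

    // Build a pet -> owner map from lines of the form "pet:owner".
    static Map<String, String> parseOwners(String[] lines) {
        Map<String, String> owners = new HashMap<>();
        for (String line : lines) {
            String[] data = line.trim().split(":");
            if (data.length == 2) { // skip malformed lines
                owners.put(data[0], data[1]);
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        // Stand-ins for the contents of the animals and owners files.
        String animals = "fishfroggoat";
        String[] ownerLines = {"fish:jane", "frog:mark", "goat:joe"};

        Map<String, String> owners = parseOwners(ownerLines);
        for (String pet : chunk(animals, 4)) {
            System.out.println(pet + " -> " + owners.get(pet));
        }
    }
}
```

Looking each chunk up in the map prints one pet/owner pair per line; a pet with no entry in the owners data would print null.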
Use String.split as shown below.
String msg = "fish:jane";
String[] parts = msg.split(":"); // parts[0] is "fish", parts[1] is "jane"
This gives an array of the pieces separated by ":".
This is how you split a string into 4-character chunks in just one line:
String[] animals = input.split("(?<=\\G....)");
This may seem like black magic, so I'll try to demystify it. Welcome to the dark art of regular expressions...
The String.split() method splits the string on every match to the specified regex. So let's look at the regex:
(?<=\\G....)
The construct (?<=regex) is a "positive look behind" for the regex, meaning that the characters preceding the point in the input between characters (because a look behind is zero-width) must match the regex.
The regex \G (coded as \\G in a Java String constant) means "end of previous match", but it also initially matches the start of input.
The regex .... matches any 4 characters.
Thus, when expressed in English, the regex (?<=\\G....) means "after every 4 characters".
If anyone is interested, removing \G and splitting on (?<=....) causes it to split after every character from the 4th onward, because it just means "preceded by 4 characters". You need the \G to require 4 new characters each time.
Here's some test code:
public static void main(String[] args) throws Exception {
String input = "fishfroggoatbear";
String[] animals = input.split("(?<=\\G....)");
System.out.println(Arrays.toString(animals));
}
Output:
[fish, frog, goat, bear]
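To see what \G contributes, here is a small sketch that runs both patterns on the same input. The class and method names are hypothetical, and the behaviour shown is that of the standard JDK regex engine.

```java
import java.util.Arrays;

class ChunkWithG {
    // With \G: each split point must be exactly 4 characters after the previous one.
    static String[] everyFour(String input) {
        return input.split("(?<=\\G....)");
    }

    // Without \G: every position preceded by at least 4 characters is a split point.
    static String[] afterFourth(String input) {
        return input.split("(?<=....)");
    }

    public static void main(String[] args) {
        String input = "fishfroggoatbear";
        System.out.println(Arrays.toString(everyFour(input)));   // [fish, frog, goat, bear]
        // First piece is "fish"; the rest of the input falls apart into single characters.
        System.out.println(Arrays.toString(afterFourth(input)));
    }
}
```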

Java String Regex Divide - Always the Same Pattern

I never understood how to properly write a regex to divide my Strings.
I have Strings of this type: String example = "on[?a, ?b, ?c]";
Sometimes I have this: String example2 = "not clear[?c]";
For the first example I would like to divide it into this:
[on, a, b, c]
or
String name = "on";
String [] vars = [a,b,c];
And for the second example I would like to divide into this type:
[not clear, c]
or
String name = "not clear";
String [] vars = [c];
Thanks a lot in advance guys ;)
If you know the character set of your identifiers, you can simply do a split on all of the text that isn't in that set. For example, if your identifiers only consist of word characters ([a-zA-Z_0-9]) you can use:
String[] parts = "on[?a, ?b, ?c]".split("[\\W]+");
String name = parts[0];
String[] vars = Arrays.copyOfRange(parts, 1, parts.length);
If your identifiers only have A-Z (upper and lower) you could replace \\W above with ^A-Za-z.
I feel that this is more elegant than using a complex regular expression.
Edit: I realize that this will have issues with your second example "not clear". If you have no option of using something like an underscore instead of a space there, you could do one split on [? (or substring) to get the "name", and another split on the remainder, like so:
String s = "not clear[?a, ?b, ?c]";
String[] parts = s.split("\\[\\?"); //need the '?' so we don't get an extra empty array element in the next split
String name = parts[0];
String[] vars = parts[1].split("[\\W]+");
This comes close, but the problem is the third remembered group is actually repeated so it only captures the last match.
(.*?)\[(?:\s*(?:\?(.*?)(?:\s*,\s*\?(.*?))*)\s*)?]
For example, the first one you list, on[?a, ?b, ?c], would give group 1 as on, 2 as a, and 3 as c. If you are using Perl, you could use the g flag to apply a regex to a line multiple times, like this:
my @tokens;
while ( $line =~ /\s*(.*?)\s*[\[,\]]/g ) {
push( @tokens, $1 );
}
Note, I did not actually test the Perl code; it is just off the top of my head. It should give you the idea, though.
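In Java, the closest equivalent of repeatedly applying a regex is calling Matcher.find() in a loop, which avoids the repeated-group limitation. The following is a sketch, not the answerer's method: it assumes every variable is a run of word characters prefixed by ?, that the name is everything before the first [, and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameAndVars {
    // Assumed shape of a variable: '?' followed by word characters.
    private static final Pattern VAR = Pattern.compile("\\?(\\w+)");

    // Everything before the first '[' (or the whole string if there is none).
    static String name(String s) {
        int bracket = s.indexOf('[');
        return (bracket < 0 ? s : s.substring(0, bracket)).trim();
    }

    // Collect group 1 of every match, one find() call per variable.
    static List<String> vars(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = VAR.matcher(s);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String example = "on[?a, ?b, ?c]";
        String example2 = "not clear[?c]";
        System.out.println(name(example) + " " + vars(example));   // on [a, b, c]
        System.out.println(name(example2) + " " + vars(example2)); // not clear [c]
    }
}
```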
String[] parts = example.split("[^\\w ]");
List<String> x = new ArrayList<String>();
for (int i = 0; i < parts.length; i++) {
if (!"".equals(parts[i]) && !" ".equals(parts[i])) {
x.add(parts[i]);
}
}
This will work as long as you don't have more than one space separating your non-space characters. There's probably a cleverer way of filtering out the empty and " " strings.

Java regex, delete content to the left of comma

I got a string with a bunch of numbers separated by "," in the following form :
1.2223232323232323,74.00
I want them in a String[], but I only need the number to the right of the comma (74.00). The list has about 10,000 different lines like the one above. Right now I'm using String.split(","), which gives me:
System.out.println(String[1]) =
1.2223232323232323
74.00
Why does it not split into two different indexes? I thought the split should give this:
System.out.println(String[1]) = 1.2223232323232323
System.out.println(String[2]) = 74.00
Instead, String[] array = string.split(",") produces one element with both values separated by a newline.
And I only need 74.00. I assume I need to use a regex, which is kind of Greek to me. Could someone help me out? :)
If it's in a file:
Scanner sc = new Scanner(new File("..."));
sc.useDelimiter("(\r?\n)?.*?,");
while (sc.hasNext())
System.out.println(sc.next());
If it's all one giant string, separated by new-lines:
String oneGiantString = "1.22,74.00\n1.22,74.00\n1.22,74.00";
Scanner sc = new Scanner(oneGiantString);
sc.useDelimiter("(\r?\n)?.*?,");
while (sc.hasNext())
System.out.println(sc.next());
If it's just a single string for each:
String line = "1.2223232323232323,74.00";
System.out.println(line.replaceFirst(".*?,", ""));
Regex explanation:
(\r?\n)? means an optional new-line character.
. means a wildcard.
.*? means 0 or more wildcards (*? as opposed to just * means non-greedy matching, but this probably doesn't mean much to you).
, means, well, ..., a comma.
Reference.
split for file or single string:
String line = "1.2223232323232323,74.00";
String value = line.split(",")[1];
split for one giant string (also needs regex) (but I'd prefer Scanner, it doesn't need all that memory):
String line = "1.22,74.00\n1.22,74.00\n1.22,74.00";
String[] array = line.split("(\r?\n)?.*?,");
for (int i = 1; i < array.length; i++) // the first element is empty
System.out.println(array[i]);
Just try with:
String[] parts = "1.2223232323232323,74.00".split(",");
String value = parts[1]; // your 74.00
String[] strings = "1.2223232323232323,74.00".split(",");
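For the many-lines case, a regex-free alternative is to go line by line and take everything after the comma. This is a sketch under assumptions: the text is already in memory as one string, each line contains at most one comma of interest, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

class RightOfComma {
    // For each line, keep only the text after the first comma.
    static List<String> rightValues(String text) {
        List<String> out = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            int comma = line.indexOf(',');
            if (comma >= 0) { // ignore lines without a comma
                out.add(line.substring(comma + 1).trim());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String oneGiantString = "1.22,74.00\n1.22,74.00\n1.22,74.00";
        System.out.println(rightValues(oneGiantString)); // [74.00, 74.00, 74.00]
    }
}
```

Calling toArray(new String[0]) on the returned list gives the String[] the question asks for.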

Use String.split() with multiple delimiters

I need to split a string based on the delimiters - and . (dash and dot). Below is my desired output.
AA.BB-CC-DD.zip ->
AA
BB
CC
DD
zip
but my following code does not work.
private void getId(String pdfName){
String[]tokens = pdfName.split("-\\.");
}
I think you need to include the regex OR operator:
String[]tokens = pdfName.split("-|\\.");
What you have will match:
[DASH followed by DOT together] -.
not
[DASH or DOT any of them] - or .
Try this regex "[-.]+". The + after treats consecutive delimiter chars as one. Remove plus if you do not want this.
You can use the regex "\W". This matches any non-word character. The required line would be:
String[] tokens=pdfName.split("\\W");
The string you give split is the string form of a regular expression, so:
private void getId(String pdfName){
String[]tokens = pdfName.split("[\\-.]");
}
That means to split on any character in the [] (we have to escape - with a backslash because it's special inside []; and of course we have to escape the backslash because this is a string). (Conversely, . is normally special but isn't special inside [].)
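To make the difference between the patterns concrete, here is a small sketch comparing them on the file name from the question. The class and method names are hypothetical.

```java
import java.util.Arrays;

class DashOrDot {
    // Thin wrapper so the three patterns can be compared side by side.
    static String[] tokens(String name, String regex) {
        return name.split(regex);
    }

    public static void main(String[] args) {
        String pdfName = "AA.BB-CC-DD.zip";
        // "-\\." only matches a dash immediately followed by a dot: no match here.
        System.out.println(Arrays.toString(tokens(pdfName, "-\\.")));  // [AA.BB-CC-DD.zip]
        // Alternation and a character class both mean "dash or dot".
        System.out.println(Arrays.toString(tokens(pdfName, "-|\\."))); // [AA, BB, CC, DD, zip]
        System.out.println(Arrays.toString(tokens(pdfName, "[-.]")));  // [AA, BB, CC, DD, zip]
    }
}
```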
Using Guava you could do this:
Iterable<String> tokens = Splitter.on(CharMatcher.anyOf("-.")).split(pdfName);
For two character sequences as delimiters, "AND" and "OR", this should work. Anchor them with word boundaries so that the "OR" inside "YORK" is not treated as a delimiter, and don't forget to trim the results.
String text ="ISTANBUL AND NEW YORK AND PARIS OR TOKYO AND MOSCOW";
String[] cities = text.split("\\b(?:AND|OR)\\b");
Result : cities = {"ISTANBUL ", " NEW YORK ", " PARIS ", " TOKYO ", " MOSCOW"}
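A variant worth considering is to make the surrounding whitespace part of the delimiter, so that no trimming is needed afterwards and an "OR" inside a word such as YORK is never matched. This is a sketch with an invented class name, assuming the keywords are always separated from the city names by whitespace.

```java
import java.util.Arrays;

class CitySplit {
    // The delimiter is whitespace + keyword + whitespace, consumed as a whole.
    static String[] cities(String text) {
        return text.trim().split("\\s+(?:AND|OR)\\s+");
    }

    public static void main(String[] args) {
        String text = "ISTANBUL AND NEW YORK AND PARIS OR TOKYO AND MOSCOW";
        System.out.println(Arrays.toString(cities(text)));
        // [ISTANBUL, NEW YORK, PARIS, TOKYO, MOSCOW]
    }
}
```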
pdfName.split("[.-]+");
[.-] -> any one of the . or - can be used as delimiter
+ sign signifies that if the aforementioned delimiters occur consecutively we should treat it as one.
I'd use Apache Commons:
import org.apache.commons.lang3.StringUtils;
private void getId(String pdfName){
String[] tokens = StringUtils.split(pdfName, "-.");
}
It'll split on any of the specified separators, as opposed to StringUtils.splitByWholeSeparator(str, separator) which uses the complete string as a separator
String[] token=s.split("[.-]");
It's better to use something like this:
s.split("[\\s\\-\\.\\'\\?\\,\\_\\#]+");
I have added a few other characters as a sample. Escaping every character is the safest approach, because you then don't have to remember how characters such as . and ' are treated.
Try this code (JavaScript rather than Java):
var string = 'AA.BB-CC-DD.zip';
var array = string.split(/[-.]/);
You may also specify a regular expression as the argument to the split() method. See the example below:
private void getId(String pdfName){
String[]tokens = pdfName.split("-|\\.");
}
s.trim().split("[\\W]+")
should work.
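Note that split does not accept varargs, so you cannot pass multiple delimiters as separate parameters; the only overloads are split(String regex) and split(String regex, int limit). A call like pdfName.split("-", ".") does not compile. Combine the delimiters into one regex instead:
String[]tokens = pdfName.split("[-.]");
You can put as many delimiter characters as you want inside the character class.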
If you know the string will always be in the same format, first split the string on the first . and store the string at index 0 in a variable. Then split the string at index 1 on - and store indexes 0, 1 and 2. Finally, split index 2 of the previous array on . and you will have obtained all of the relevant fields.
Refer to the following snippet:
String[] tmp = pdfName.split("\\.", 2); // escape the dot (a bare "." matches any character); limit 2 splits only on the first dot
String val1 = tmp[0]; // "AA"
tmp = tmp[1].split("-"); // {"BB", "CC", "DD.zip"}
String val2 = tmp[0]; // "BB"
...
