Split a string with multiple unique delimiters

I have a string in this format:
x/1.25 o/33.3 description for item A
Now I want to split it so it becomes:
1.25, 33.3, description for item A
So far I have used .split("x/|o/"), which works for this case. However, it breaks if the user puts x/ or o/ in the description of item A, as in description o/ item A.
Is there a better regex that makes use of the order of the parameters and delimiters in the string format above? Thanks in advance.

One possible way is to do
String[] fields = input.split("\\s+",3);
which splits the input into at most 3 whitespace-delimited fields and leaves the description intact as the third. Then apply your regex only to the first two fields and reassemble the result into the output.
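A minimal sketch of that idea, assuming the input always has the shape x/<value> o/<value> <description>. The class and method names are made up for illustration, and stripping the prefixes with replaceFirst is one possible reading of "apply your regex to the first two":

```java
import java.util.Arrays;

class SplitFields {
    // Splits "x/1.25 o/33.3 description ..." into [1.25, 33.3, description ...].
    // Assumes at least three whitespace-separated fields are present.
    static String[] parse(String input) {
        // Limit 3: at most three fields, so the description is never split further.
        String[] fields = input.trim().split("\\s+", 3);
        // Strip the markers from the first two fields only.
        fields[0] = fields[0].replaceFirst("^x/", "");
        fields[1] = fields[1].replaceFirst("^o/", "");
        return fields;
    }

    public static void main(String[] args) {
        // The o/ inside the description is left alone.
        System.out.println(Arrays.toString(parse("x/1.25 o/33.3 description o/ item A")));
        // prints [1.25, 33.3, description o/ item A]
    }
}
```

Because only the first two fields are touched, an x/ or o/ typed into the description cannot affect the result.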

Related

Java remove dynamic substring from string

I need to remove a dynamic substring from a string. There are a few similar topics on this, but none of them helped me. I have a string, e.g.:
product test1="001" test2="abc" test3="123xzy"
and I need this output:
product test1="001" test3="123xzy"
That is, I need to remove test2="abc". test2 is a unique element and can appear anywhere in the original string. "abc" is a dynamic value and can have various lengths. What is the fastest and most elegant solution to this problem? Thanks.

You can use a regular expression:
String input = "product test1=\"001\" test2=\"abc\" test3=\"123xzy\"";
String result = input.replaceAll("test2=\".*?\"\\s*", "");
In substance: find a substring like test2="xxxxxx", optionally followed by some spaces (\\s*), and replace it with nothing.

String.split() returns an array with an additional empty value

I'm working on a piece of code where I have to split a string into individual parts. The basic logic is this: the numbers on the LHS below, i.e. 1, 2 and 3, are ids of objects. Once I have split them out, I use these ids to look up the respective values and replace the ids in the string with those values. The string I have is as follows:
String str = "(1+2+3)>100";
I've used the following code for splitting the string -
String[] arraySplit = str.split("\\>|\\<|\\=");
String[] finalArray = arraySplit[0].split("\\(|\\)|\\+|\\-|\\*");
Now the arrays that I get are as such -
arraySplit = [(1+2+3), 100];
finalArray = [, 1, 2, 3];
So, after the string is split, I'd replace the ids with the values, i.e. the string would become (20+45+50)>100, where 20, 45 and 50 are the respective values. (This string would then be used in SpEL to evaluate the formula.)
I'm almost there, except that I'm getting an empty element at the first position. Is there a way to avoid the empty element in the second array, i.e. finalArray? From some research, I'm guessing it is splitting the string (1+2+3) and treating an empty element as part of the string.
If that is the case, is there any method other than String.split() that would give me the same result?
Edit -
Here, (1+2+3)>100 is just an example. The round braces are part of a formula, and the string could also be ((1+2+3)*(5-2))>100.
Edit 2 -
After splitting this string and processing it, I'm going to use it in SpEL. So if there's a better solution that uses SpEL directly, that would be great too.
Also, currently the formula syntax is (1+2+3) * 4>100, but if changing the syntax a bit helps, that would also be fine, e.g. writing the formula as ({#1}+{#2}+{#3}) * {#4}>100. In that case I'd use {# as the variable marker and extract the numbers.
I hope this part is clear.
Edit 3 -
Just in case: SpEL is already in my project, although I don't know much about it, so a better solution using SpEL is more than welcome. The basic logic of the question is described at the start of the question.

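If you take a look at the documentation of split(String regex, int limit) (emphasis is mine):
When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array.
Here the pattern matches the opening ( at index 0, which is why finalArray starts with an empty string. Passing 0 as the limit parameter does not change that, because it only affects trailing empty strings (and split(regex) already uses a limit of 0):
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
So the leading empty element has to be skipped or filtered out after the split.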
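One way to avoid empty elements altogether is to match the ids instead of splitting on the operators. The sketch below assumes the ids are plain runs of digits, as in the question's examples. The class and method names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FormulaIds {
    private static final Pattern ID = Pattern.compile("\\d+");

    // Collects every run of digits in the given text, in order of appearance.
    static List<String> extractIds(String formulaSide) {
        List<String> ids = new ArrayList<>();
        Matcher m = ID.matcher(formulaSide);
        while (m.find()) {
            ids.add(m.group());
        }
        return ids;
    }

    public static void main(String[] args) {
        String str = "((1+2+3)*(5-2))>100";
        // Keep only the left-hand side so the threshold 100 is not treated as an id.
        String lhs = str.split("[><=]")[0];
        System.out.println(extractIds(lhs)); // prints [1, 2, 3, 5, 2]
    }
}
```

Since only matches are collected, brackets and operators at the start or end of the formula never produce empty strings.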

If you keep things really simple, you may be able to get away with using a combination of regular expressions and string operations like split and replace.
However, it looks to me like you'd be better off writing a simple parser using ANTLR.
Take a look at Parsing an arithmetic expression and building a tree from it in Java and https://theantlrguy.atlassian.net/wiki/display/ANTLR3/Five+minute+introduction+to+ANTLR+3
Edit: I haven't used ANTLR in a while - it's now up to version 4, and there may be some significant differences, so make sure that you check the documentation for that version.

Most efficient way to get the substring after a specific other substring

If I have a string that looks something like this:
String text = "id=2009,name=Susie,city=Berlin,phone=0723178,birthday=1991-12-07";
I only want to have the info name and phone. I know how to parse the entire String, but in my specific case it is important to only get those two "fields".
So what is the best/most efficient way to have my search method do the following:
search for the substring "name=" and return the substring after it ("Susie") until it reaches the next comma
My approach would have been to:
get the last index of "name=" first
use this index then as the new start for my parsing method
Are there any other suggestions for doing this more efficiently and with more concise code? Thank you for any input.

You can use the following regex to capture the value after phone or name, then read the first group from the match object:
(?:phone|name)=([^,]+)
This would also match inside a longer key that merely ends in phone or name (for example, a hypothetical nickname=). To be more robust, require the start of the string or a comma before the key:
(?:^|,)(?:phone|name)=([^,]+)
Read more about regular expressions at http://www.regular-expressions.info/
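For reference, a sketch of how the anchored pattern could be applied in Java with Pattern and Matcher. One change from the regex above: the key is placed in a capturing group, so it can be stored next to its value. The class and method names are illustrative only:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldExtractor {
    // Group 1 is the key (phone or name), group 2 the value up to the next comma.
    private static final Pattern FIELD =
            Pattern.compile("(?:^|,)(phone|name)=([^,]+)");

    static Map<String, String> extract(String text) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(text);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "id=2009,name=Susie,city=Berlin,phone=0723178,birthday=1991-12-07";
        System.out.println(extract(text)); // prints {name=Susie, phone=0723178}
    }
}
```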

Regex might be more efficient, but for readability, I <3 Guava
String text = "id=2009,name=Susie,city=Berlin,phone=0723178,birthday=1991-12-07";
final Map<String, String> infoMap = Splitter.on(",")
.omitEmptyStrings()
.trimResults()
.withKeyValueSeparator("=")
.split(text);
System.out.println(infoMap.get("name"));
System.out.println(infoMap.get("birthday"));

Best delimiter to separate multiple regex

I need to put multiple regular expressions into a single string and then parse it back into separate regexes, like below:
regex1<!>regex2<!>regex3
The problem is that I am not sure which delimiter would be best to use in place of the <!> shown in the example, so that I can safely split the string when parsing it back.
The constraints are that I cannot spread the string over multiple lines or use an XML or JSON string, because this string of expressions should be easily configurable.
Looking forward to any suggestions.
Edited:
Q: Why does it have to be a single string?
A: The system has a configuration manager that loads its config from a properties file, and the properties contain lines like
com.some.package.Class1.Field1: value
com.some.package.Class1.Expressions: exp1<!>exp2<!>exp3
There is no way to write the value on multiple lines in the properties file. That's why.

The best way would be to use an invalid regex as the delimiter, such as **. If it were used in a normal regex it would not work and would throw an exception (note: ++ is valid).
regex1+"**"+regex2
Now you can split it with this regex:
(?<!\\\\)[*][*](?![*])
(?<!\\\\) checks that the * is not escaped
(?![*]) avoids a wrong match in a pattern like "A*"+"**"+"n+"
The following is a list of invalid regex fragments:
[+
(+
[*
(*
[?
*+
** (delimiter would be (?<!\\\\)[*][*](?![*]))
??(delimiter would be (?<!\\\\)[?][?](?![?]))
When splitting, you need to check that the delimiter is not escaped:
(?<!\\\\)delimiter
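A small sketch of splitting on the ** delimiter with the pattern given above. The class name and the sample expressions are made up for illustration, and it assumes none of the individual expressions ends in a backslash:

```java
import java.util.Arrays;

class RegexListSplitter {
    // As a regex: (?<!\\)[*][*](?![*])
    // An unescaped "**" that is not followed by another "*".
    private static final String DELIMITER = "(?<!\\\\)[*][*](?![*])";

    static String[] splitExpressions(String joined) {
        return joined.split(DELIMITER);
    }

    public static void main(String[] args) {
        // "A*" followed by the delimiter yields "A***"; the lookahead makes the
        // split consume the last two stars, so "A*" survives intact.
        String joined = "A*" + "**" + "n+" + "**" + "\\d{3}";
        System.out.println(Arrays.toString(splitExpressions(joined)));
        // prints [A*, n+, \d{3}]
    }
}
```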

The best delimiter depends on your requirements. As a general practice, use a sequence of special characters that is very unlikely to occur in the expressions themselves, such as:
$$**##$$
#$%&&%$#

I think this might be helpful.
First replace the tag content with a single special character, then split. Note that this assumes the expressions themselves contain neither a <...> sequence nor the - character:
String inputString="regex1<!>regex2<!>regex3";
String noHTMLString = inputString.replaceAll("\\<.*?>","-");
String[] splitString1 = (noHTMLString.split("[-]+"));
for (String string : splitString1) {
System.out.println(string);
}

Split UK postcode into two main parts using java

This regular expression for validating postcodes works perfectly:
^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) {0,1}[0-9][A-Za-z]{2})$
but I want to split a postcode to retrieve its individual parts using Java.
How can this be done in Java?

Here are the official regexes for matching UK postcodes:
http://interim.cabinetoffice.gov.uk/media/291370/bs7666-v2-0-xsd-PostCodeType.htm
If you want to split a found postcode into its two parts, isn't it simply a question of splitting on whitespace? A UK postcode's two parts are just separated by a space, right? In Java this would be:
String[] fields = postcode.split("\\s");
where postcode is a validated postcode and fields[] will be an array of length 2 containing the first and second parts.
Edit: If this is to validate user input, and you want to validate the first part, your regex would be:
Pattern firstPart = Pattern.compile("[A-Z]{1,2}[0-9R][0-9A-Z]?");
To validate the second part it is:
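// Java does not support the [A-Z-[CIKMOV]] subtraction syntax; use intersection with a negated class instead
Pattern secondPart = Pattern.compile("[0-9][A-Z&&[^CIKMOV]]{2}");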
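Putting the split and the two part patterns together might look like the sketch below. It assumes the postcode is entered with whitespace between the two parts, and it writes the letter exclusion with Java's character-class intersection (&&[^CIKMOV]). The class and method names are illustrative:

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Pattern;

class PostcodeParts {
    private static final Pattern FIRST_PART =
            Pattern.compile("[A-Z]{1,2}[0-9R][0-9A-Z]?");
    // A digit followed by two letters, excluding C, I, K, M, O and V.
    private static final Pattern SECOND_PART =
            Pattern.compile("[0-9][A-Z&&[^CIKMOV]]{2}");

    // Returns {first part, second part}, or empty if either part fails its pattern.
    static Optional<String[]> split(String postcode) {
        String[] fields = postcode.trim().toUpperCase(Locale.ROOT).split("\\s+");
        if (fields.length != 2
                || !FIRST_PART.matcher(fields[0]).matches()
                || !SECOND_PART.matcher(fields[1]).matches()) {
            return Optional.empty();
        }
        return Optional.of(fields);
    }

    public static void main(String[] args) {
        split("sw1a 1aa").ifPresent(p -> System.out.println(p[0] + " | " + p[1]));
        // prints SW1A | 1AA
        System.out.println(split("SW1A 1AC").isPresent()); // prints false (C is excluded)
    }
}
```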

I realise that it's rather a long time since this question was asked, but I had the same requirement and thought that I'd post my solution in case it helps someone out there :
const string matchString = @"^(?<Primary>([A-Z]{1,2}[0-9]{1,2}[A-Z]?))(?<Secondary>([0-9]{1}[A-Z]{2}))$";
var regEx = new Regex(matchString);
var match = regEx.Match(Postcode);
var postcodePrimary = match.Groups["Primary"];
var postcodeSecondary = match.Groups["Secondary"];
This doesn't validate the postcode, but it does split it into 2 parts if no space has been entered between them.

You can use Google's recently open-sourced library for this: http://code.google.com/p/libphonenumber/
