Split String 2 times but with different splits ";" and "." - java

Original String: "12312123;www.qwerty.com"
With this Model.getList().get(0).split(";")[1]
I get: "www.qwerty.com"
I tried doing this: Model.getList().get(0).split(";")[1].split(".")[1]
But it didn't work; I get an exception. How can I solve this?
I want only "qwerty"

Try this, to achieve "qwerty":
Model.getList().get(0).split(";")[1].split("\\.")[1]
You need to escape the dot.

Try to use split(";|\\.") like this:
for (String string : "12312123;www.qwerty.com".split(";|\\.")) {
System.out.println(string);
}
Output:
12312123
www
qwerty
com

You can split a string which has multiple delimiters. Example below:
String abc = "11;xyz.test.com";
String[] tokens = abc.split(";|\\.");
System.out.println(tokens[tokens.length-2]);

Indexing element 1 doesn't work here: it throws an ArrayIndexOutOfBoundsException.
This is because splitting on "." doesn't work the way you want it to. You need to escape the period by writing "\\." instead, because in a regular expression "." means something completely different.

You'd need to escape the ., i.e. "\\.". Period is a special character in regular expressions, meaning "any character".
What your current split means is "split on any character"; this means that it splits the string into a number of empty strings, since there is nothing between consecutive occurrences of "any character".
There is a subtle gotcha in the behaviour of the String.split method, which is that it discards trailing empty strings from the token array (unless you pass a negative number as the second parameter).
Since your entire token array consists of empty strings, all of these are discarded, so the result of the split is a zero-length array, hence the exception when you try to access one of its elements.
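The behaviour described above can be checked with a small sketch. The class and method names (DotSplitDemo and so on) are made up for illustration; only String.split itself comes from the discussion.

```java
import java.util.Arrays;

class DotSplitDemo {
    // Unescaped dot: every character is a delimiter, so only empty strings
    // are produced, and all of them are dropped as trailing empties.
    static String[] splitUnescaped(String s) {
        return s.split(".");
    }

    // A negative limit keeps the trailing empty strings.
    static String[] splitUnescapedKeepEmpties(String s) {
        return s.split(".", -1);
    }

    // Escaped dot: splits on a literal '.' only.
    static String[] splitEscaped(String s) {
        return s.split("\\.");
    }

    public static void main(String[] args) {
        String host = "www.qwerty.com";
        System.out.println(splitUnescaped(host).length);            // zero-length array
        System.out.println(splitUnescapedKeepEmpties(host).length); // one more than the input length
        System.out.println(Arrays.toString(splitEscaped(host)));
    }
}
```

With the escaped pattern, index 1 of the result is the "qwerty" part the question asks for.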

Don't use split, use a regular expression (directly). It's safer, and faster.
String input = "12312123;www.qwerty.com";
String regex = "([^.;]+)\\.[^.;]+$";
Matcher m = Pattern.compile(regex).matcher(input);
if (m.find()) {
System.out.println(m.group(1)); // prints: qwerty
}

Related

Java - Splitting String [duplicate]

I am wondering if I am going about splitting a string on a . the right way? My code is:
String[] fn = filename.split(".");
return fn[0];
I only need the first part of the string, that's why I return the first item. I ask because I noticed in the API that . means any character, so now I'm stuck.
split() accepts a regular expression, so you need to escape . to not consider it as a regex meta character. Here's an example :
String[] fn = filename.split("\\.");
return fn[0];
I see only solutions here but no full explanation of the problem so I decided to post this answer
Problem
You need to know a few things about text.split(delim). The split method:
accepts as its argument a regular expression (regex) that describes the delimiter on which we want to split,
if delim occurs at the end of text, as in a,b,c,, (where the delimiter is ,), split first creates an array like ["a" "b" "c" "" ""], but since in most cases we don't need these trailing empty strings, it removes them automatically. So it creates another array without the trailing empty strings and returns it.
You also need to know that the dot . is a special character in regex. It represents any character (except line separators, but this can be changed with the Pattern.DOTALL flag).
So for a string like "abc", if we split on ".", the split method will
create an array like ["" "" "" ""],
but since this array contains only empty strings and they are all trailing, they will be removed (as described in the second point above),
which means we get an empty array [] as the result (no elements, not even an empty string), so we can't use fn[0] because there is no index 0.
Solution
To solve this problem you simply need a regex that represents a literal dot. To get one we need to escape the . character. There are a few ways to do it, but the simplest is probably using \ (which in a String literal needs to be written as "\\", because \ is also special there and requires another \ to escape it).
So solution to your problem may look like
String[] fn = filename.split("\\.");
Bonus
You can also use other ways to escape that dot like
using character class split("[.]")
wrapping it in quote split("\\Q.\\E")
using proper Pattern instance with Pattern.LITERAL flag
or simply use split(Pattern.quote(".")) and let regex do escaping for you.
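As a quick check that these escaping options are interchangeable, here is a minimal sketch that runs all of them against the same input. The class name EscapeDotDemo, the helper allVariants, and the sample file name are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class EscapeDotDemo {
    // Returns the result of splitting s on a literal dot, once per escaping style.
    static String[][] allVariants(String s) {
        return new String[][] {
            s.split("\\."),                                 // backslash escape
            s.split("[.]"),                                 // character class
            s.split("\\Q.\\E"),                             // \Q...\E quoting
            Pattern.compile(".", Pattern.LITERAL).split(s), // LITERAL flag
            s.split(Pattern.quote("."))                     // Pattern.quote
        };
    }

    public static void main(String[] args) {
        for (String[] tokens : allVariants("report.final.txt")) {
            System.out.println(Arrays.toString(tokens));
        }
    }
}
```

Every variant should yield the same three tokens, so the choice is mostly about readability.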
Split uses regular expressions, where '.' is a special character meaning anything. You need to escape it if you actually want it to match the '.' character:
String[] fn = filename.split("\\.");
(one '\' to escape the '.' in the regular expression, and the other to escape the first one in the Java string)
Also I wouldn't suggest returning fn[0] since if you have a file named something.blabla.txt, which is a valid name you won't be returning the actual file name. Instead I think it's better if you use:
int idx = filename.lastIndexOf('.');
return filename.substring(0, idx);
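One thing to watch with the lastIndexOf approach: if the name contains no dot at all, lastIndexOf returns -1 and substring(0, -1) throws. A minimal sketch with a guard follows; the class BaseNameDemo and the method stripExtension are hypothetical names, not part of any library.

```java
class BaseNameDemo {
    // Returns everything before the last dot, or the whole name if there is no dot.
    static String stripExtension(String filename) {
        int idx = filename.lastIndexOf('.');
        // Without this guard, substring(0, -1) would throw
        // StringIndexOutOfBoundsException for names such as "README".
        return idx < 0 ? filename : filename.substring(0, idx);
    }

    public static void main(String[] args) {
        System.out.println(stripExtension("something.blabla.txt"));
        System.out.println(stripExtension("README"));
    }
}
```

Note that under this sketch a name starting with a dot, such as ".gitignore", comes back as an empty string; whether that is the desired result depends on your use case.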
the String#split(String) method uses regular expressions.
In regular expressions, the "." character means "any character".
You can avoid this behavior by either escaping the "."
filename.split("\\.");
or telling the split method to split at a character class:
filename.split("[.]");
Character classes are collections of characters. You could write
filename.split("[-.;ld7]");
and filename would be split at every "-", ".", ";", "l", "d" or "7". Inside character classes, the "." is not a special character ("metacharacter").
As DOT( . ) is considered as a special character and split method of String expects a regular expression you need to do like this -
String[] fn = filename.split("\\.");
return fn[0];
In java the special characters need to be escaped with a "\" but since "\" is also a special character in Java, you need to escape it again with another "\" !
String str="1.2.3";
String[] cats = str.split(Pattern.quote("."));
Wouldn't it be more efficient to use
filename.substring(0, filename.indexOf("."))
if you only want what's up to the first dot?
Usually it's NOT a good idea to escape it by hand. There is a method in the Pattern class for this task:
java.util.regex.Pattern
static String quote(String s)
The split method takes a regex as its argument... Simply change "." to "\\."
The solution that worked for me is the following
String[] fn = filename.split("[.]");
Note: Further care should be taken with this snippet, even after the dot is escaped!
If filename is just the string ".", then fn will still end up to be of 0 length and fn[0] will still throw an exception!
This is because, if the pattern matches at least once, split discards all trailing empty strings (thus also the one before the dot!) from the array, leaving an empty array to be returned.
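A small sketch of that edge case, with a length check before indexing. The class FirstTokenDemo and the method firstToken are illustrative names, and returning an empty string for the degenerate case is just one possible choice.

```java
class FirstTokenDemo {
    // Returns the part before the first dot, or "" when split yields no tokens.
    static String firstToken(String filename) {
        String[] fn = filename.split("\\.");
        // For input "." the array is empty, so fn[0] would throw.
        return fn.length > 0 ? fn[0] : "";
    }

    public static void main(String[] args) {
        System.out.println(".".split("\\.").length);       // empty array for "."
        System.out.println("[" + firstToken(".") + "]");
        System.out.println("[" + firstToken("a.b") + "]");
    }
}
```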
Using ApacheCommons it's simplest:
File file = ...
FilenameUtils.getBaseName(file.getName());
Note, it also extracts a filename from full path.
split takes a regex as its argument. So you should pass "\\." instead of "." because "." is a metacharacter in regex.

String split method returning first element as empty using regex

I'm trying to get the digits from the expression [1..1] using Java's split method. I'm using the regex ^\\[|\\.{2}|\\]$ inside split, but it returns a String array whose first value is empty, with "1" at indices 1 and 2. Could anyone tell me what I'm doing wrong in this regex, so that I get only the digits in the returned array?
You should use matching. Change your expression to:
`^\[(.*?)\.\.(.*)\]$`
And get your results from the two captured groups.
As for why split acts this way, it's simple: you asked it to split on the [ character, but there's still an "empty string" between the start of the string and the first [ character.
Your regex matches [ and .. and ]. Thus it will split at each of these occurrences.
You should not use a split but match each number in your string using regex.
You've set it up such that [, ] and .. are delimiters. Split will return an empty first index because the first character in your string [1..1] is a delimiter. I would strip delimiters from the front and end of your string, as suggested here.
So, something like
input.replaceFirst("^\\[", "").split("^\\[|\\.{2}|\\]$");
Or, use regex and regex groups (such as the other answers in this question) more directly rather than through split.
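To see the leading empty string and the effect of stripping the opening bracket first, here is a minimal sketch. The class RangeSplitDemo and its method names are made up; the delimiter regex is the one from the question.

```java
import java.util.Arrays;

class RangeSplitDemo {
    // The delimiter regex from the question.
    static final String DELIMS = "^\\[|\\.{2}|\\]$";

    // Splitting directly: the leading '[' is a delimiter, so the first token is "".
    static String[] naive(String input) {
        return input.split(DELIMS);
    }

    // Removing the leading '[' first means nothing precedes the first real token.
    static String[] stripped(String input) {
        return input.replaceFirst("^\\[", "").split(DELIMS);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(naive("[1..1]")));
        System.out.println(Arrays.toString(stripped("[1..1]")));
    }
}
```

The trailing ] does not produce an extra token because split drops trailing empty strings by default.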
Why not use a regex to capture the numbers? This will be more effective and less error-prone. In that case the regex looks like:
^\[(\d+)\.{2}(\d+)\]$
And you can capture them with:
Pattern pattern = Pattern.compile("^\\[(\\d+)\\.{2}(\\d+)\\]$");
Matcher matcher = pattern.matcher(text);
if(matcher.find()) { //we've found a match
int range_from = Integer.parseInt(matcher.group(1));
int range_to = Integer.parseInt(matcher.group(2));
}
with range_from and range_to being the integers you can now work with.
The advantage is that the pattern will fail on strings that make little sense, like ..3[4, etc.
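For completeness, a self-contained sketch of the capture-group approach. The class RangeParseDemo, the method parseRange, and the choice to return null for malformed input are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeParseDemo {
    private static final Pattern RANGE = Pattern.compile("^\\[(\\d+)\\.{2}(\\d+)\\]$");

    // Returns {from, to} for input like "[1..1]", or null if the input is malformed.
    // Very long digit runs would still make Integer.parseInt throw NumberFormatException.
    static int[] parseRange(String text) {
        Matcher matcher = RANGE.matcher(text);
        if (!matcher.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(matcher.group(1)),
            Integer.parseInt(matcher.group(2))
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseRange("[10..25]")));
        System.out.println(parseRange("..3[4")); // malformed input gives null
    }
}
```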

Java replace New Lines, Commas, and Spaces at end of String

I am using
mString.replaceAll("[\n,\\s]$", "");
It's not working. What is the correct way to remove newlines, commas, or spaces from the end of a string if they can appear in any order?
Try this
mString = mString.replaceAll("[\n,\\s]+$", "");
There are two reasons your attempt
mString.replaceAll("[\n,\\s]$", "");
doesn't work. First of all, replaceAll does not modify the String instance, because Strings are immutable. It returns the modified string as the result of the method. But the above statement discards the result. So you at least need
mString = mString.replaceAll(...);
The second reason is that the replacement method looks for the pattern in order. If it started over at the beginning of the string after each replacement, then your expression would replace a newline, comma, or whitespace at the end of the string, then it would keep doing it until there were no more such characters at the end. But it doesn't do things this way (and if it did, it would be way too easy to write replaceAll expressions that looped infinitely). replaceAll works like this: It searches for the pattern, and if it finds it, it copies all characters before the pattern to the result. Then, it copies the replacement string to the result. Then, it resets the matcher to the character after the match. In your case, since the pattern match goes to the end of the input (because of the $), the character after the match will be the end of the string, and there can be no more matches. Thus, the matcher would only be able to replace one character. That's why you need to add + to the pattern, as in the other correct answers, like Anubhava's:
mString = mString.replaceAll("[,\\s]+$", "");
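Both points can be seen in a small sketch. The class TrimTailDemo, its method names, and the sample input are invented for illustration.

```java
class TrimTailDemo {
    // Original pattern: matches a single character anchored at the end.
    static String withoutPlus(String s) {
        return s.replaceAll("[\n,\\s]$", "");
    }

    // With the + quantifier: matches the whole run of trailing commas and whitespace.
    static String withPlus(String s) {
        return s.replaceAll("[,\\s]+$", "");
    }

    public static void main(String[] args) {
        String mString = "abc,\n ,";
        mString.replaceAll("[,\\s]+$", "");               // result discarded, mString unchanged
        System.out.println(mString.length());             // still 7
        System.out.println(withoutPlus(mString).length()); // only the final comma removed
        System.out.println(withPlus(mString));             // abc
    }
}
```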
You can just take out \n since \s includes new lines also. You also need to add + quantifier to make it match more than 1 occurrence of whitespace or comma at end.
mString = mString.replaceAll("[,\\s]+$", "");
Try mString = mString.replaceAll("(\\n|,|\\s)+$", "");

Cutting String java

I need to cut certain strings for an algorithm I am making. I am using substring() but it gets too complicated with it and actually doesn't work correctly. I found this topic how to cut string with two regular expression "_" and "."
and decided to try with split() but it always gives me
java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 0
+
^
So this is the code I have:
String[] result = "234*(4-5)+56".split("+");
/*for(int i=0; i<result.length; i++)
{
System.out.println(result[i]);
}*/
Arrays.toString(result);
Any ideas why I get this irritating exception ?
P.S. If I fix this I will post you the algorithm for cutting and then the algorithm for the whole calculator (because I am building a calculator). It is gonna be a really badass calculator, I promise :P
+ has a special meaning in regex. To have it treated as a normal character, you should escape it with a backslash.
String[] result = "234*(4-5)+56".split("\\+");
Below are the metacharacters in regex. To treat any of them as a normal character, you should escape it with a backslash.
<([{\^-=$!|]})?*+.>
refer here about how characters work in regex.
The plus + symbol has a special meaning in regular expressions, which is how split parses its parameter. You'll need to regex-escape the plus character.
.split("\\+");
You should split your string like this: -
String[] result = "234*(4-5)+56".split("[+]");
Since String.split takes a regex as the delimiter, and + is a meta-character in regex (meaning one or more repetitions of the preceding element), it's an error to use it bare.
You can put it in a character class to match a literal +, because most meta-characters lose their special meaning inside a character class. The main exceptions are the hyphen (-), which denotes a range, a leading ^, which negates the class, and ] and \.
+ is a regex quantifier (meaning one or more of) so needs to be escaped in the split method:
String[] result = "234*(4-5)+56".split("\\+");
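Putting the answers together, here is a minimal sketch showing the failing call and three equivalent fixes. The class PlusSplitDemo and its method names are illustrative only.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class PlusSplitDemo {
    // Reports whether a bare "+" is rejected as a regex.
    static boolean barePlusThrows(String s) {
        try {
            s.split("+");
            return false;
        } catch (PatternSyntaxException e) {
            return true; // "Dangling meta character '+'"
        }
    }

    // Three ways to split on a literal plus sign.
    static String[][] fixes(String s) {
        return new String[][] {
            s.split("\\+"),
            s.split("[+]"),
            s.split(Pattern.quote("+"))
        };
    }

    public static void main(String[] args) {
        String expr = "234*(4-5)+56";
        System.out.println(barePlusThrows(expr));
        for (String[] tokens : fixes(expr)) {
            System.out.println(Arrays.toString(tokens));
        }
    }
}
```

Note that Arrays.toString only builds the string; it has to be passed to System.out.println for anything to appear.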
