Split String in java by specified pattern

How can I split this String in Java so that I get the text between each pair of parentheses in a String array?
String GivenString = "(1,2,3,4,#) (a,s,3,4,5) (22,324,#$%) (123,3def,f34rf,4fe) (32)";
String[] array = GivenString.split("");
Output must be:
array[0] = "1,2,3,4,#"
array[1] = "a,s,3,4,5"
array[2] = "22,324,#$%"
array[3] = "123,3def,f34rf,4fe"
array[4] = "32"

You can try to use:
Matcher mtc = Pattern.compile("\\((.*?)\\)").matcher(yourString);
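The line above only creates the matcher; the groups still have to be pulled out by calling find() repeatedly. A minimal sketch of how that might look, assuming input shaped like the string in the question (the class name ParenGroups and the method extract are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenGroups {
    // Matches one parenthesized group; the reluctant .*? stops at the first ')'.
    private static final Pattern GROUP = Pattern.compile("\\((.*?)\\)");

    // Hypothetical helper: returns the text inside each pair of parentheses, in order.
    static String[] extract(String input) {
        List<String> found = new ArrayList<>();
        Matcher mtc = GROUP.matcher(input);
        while (mtc.find()) {
            found.add(mtc.group(1)); // group 1 is the text between the parentheses
        }
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String given = "(1,2,3,4,#) (a,s,3,4,5) (22,324,#$%) (123,3def,f34rf,4fe) (32)";
        for (String part : extract(given)) {
            System.out.println(part);
        }
    }
}
```

Because only the matched groups are collected, no empty leading element shows up in the result.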

The best solution is the answer by Rahul Tripathi, but your question said "How to split", so if you must use split() (e.g. this is an assignment), then this regex will do:
^\s*\(|\)\s*\(|\)\s*$
It says:
Match the open-parenthesis at the beginning
Match close-parenthesis followed by open-parenthesis
Match the close-parenthesis at the end
All 3 allowing whitespace.
As a Java regex, that would mean:
str.split("^\\s*\\(|\\)\\s*\\(|\\)\\s*$")
See regex101 for demo.
The problem with using split() is that the leading open-parenthesis causes a split before the first value, resulting in an empty value at the beginning:
array[0] = ""
array[1] = "1,2,3,4,#"
array[2] = "a,s,3,4,5"
array[3] = "22,324,#$%"
array[4] = "123,3def,f34rf,4fe"
array[5] = "32"
That is why Rahul's answer is better: it never produces such an empty value.
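If split() really is required, the empty first element can be dropped afterwards. A small sketch under that assumption (the class and method names are invented for this example):

```java
import java.util.Arrays;

class SplitGroups {
    // Splits on the opening "(", on each ")...(" boundary, and on the closing ")".
    static String[] splitGroups(String str) {
        String[] parts = str.split("^\\s*\\(|\\)\\s*\\(|\\)\\s*$");
        // The leading "(" produces an empty first element; skip it when present.
        if (parts.length > 0 && parts[0].isEmpty()) {
            return Arrays.copyOfRange(parts, 1, parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        String given = "(1,2,3,4,#) (a,s,3,4,5) (22,324,#$%) (123,3def,f34rf,4fe) (32)";
        System.out.println(Arrays.toString(splitGroups(given)));
    }
}
```

The trailing ")" does not need the same treatment, because split() already discards trailing empty strings.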

Usually split() is the easiest way to break a string into an array of substrings when the pieces are separated by a single delimiter character.
The main problem here is that you need the text between two different characters. The easiest way around that is to go through the string and get rid of every instance of '('. This leaves the string looking like
String = "1,2,3,4,#) a,s,3,4,5) 22,324,#$%) 123,3def,f34rf,4fe) 32)"
And this is perfect, as you can split by the char ')' and not worry about the other parenthesis interfering with the split. I suggest using replace(target, replacement), which replaces every instance of the first parameter with the second (we can use "" to delete it).
Here is some example code that may work:
String a = "(1,2,3,4,#) (a,s,3,4,5) (22,324,#$%) (123,3def,f34rf,4fe) (32)";
a = a.replace("(", "");
// a is now equal to 1,2,3,4,#) a,s,3,4,5) 22,324,#$%) 123,3def,f34rf,4fe) 32)
String[] parts = a.split("\\)\\s*"); // \\s* also consumes the space after each ')'
System.out.println(parts[0]); // this will print 1,2,3,4,#
Splitting on ")" alone would leave a leading space on every part after the first, which is why the pattern also consumes the whitespace that follows each ')'.
You can then loop through parts[] and it should have all of the required parts for you!

Related

How can I remove whitespaces around the first occurrence of specific char?

How can I remove the whitespaces before and after a specific char? I want also to remove the whitespaces only around the first occurrence of the specific char. In the examples below, I want to remove the whitespaces before and after the first occurrence of =.
For example for those strings:
something = is equal to = something
something = is equal to = something
something =is equal to = something
I need to have this result:
something=is equal to = something
Is there any regular expression that I can use or should I check for the index of the first occurrence of the char =?

private String removeLeadingAndTrailingWhitespaceOfFirstEqualsSign(String s1) {
return s1.replaceFirst("\\s*=\\s*", "=");
}
Notice this matches all whitespace including tabs and new lines, not just space.

You can use the regular expression \w*\s*=\s* to get all matches. From there call trim on the first index in the array of matches.
Regex demo.

Yes - you can create a regex that matches optional whitespace followed by your pattern followed by optional whitespace, and then replace the first instance.
public static String replaceFirst(final String toMatch, final String forIP) {
// string you want to match before and after
final String quoted = Pattern.quote(toMatch);
final Pattern patt = Pattern.compile("\\s*" + quoted + "\\s*");
final Matcher match = patt.matcher(forIP);
return match.replaceFirst(toMatch);
}
For your inputs this gives the expected result, assuming toMatch is =. It also works with arbitrary longer strings. For example, passing "is equal to" instead gives
something =is equal to= something
For the simple case you can skip the quoting; for an arbitrary case it helps (although, as many contributors have pointed out before, Pattern.quote isn't right for every case).
The simple case thus becomes
return forIP.replaceFirst("\\s*" + toMatch + "\\s*", toMatch);
OR
return forIP.replaceFirst("\\s*=\\s*", "=");
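One detail the versions above leave open: the replacement string passed to replaceFirst is itself interpreted, so a '$' or '\' in toMatch would be read as a group reference or escape. A hedged sketch that guards against this with Matcher.quoteReplacement (the class and method names are placeholders, not from the answer):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrimAroundFirst {
    // Removes whitespace around the first occurrence of toMatch in input.
    static String trimAroundFirst(String toMatch, String input) {
        // Pattern.quote makes toMatch literal on the pattern side.
        Pattern patt = Pattern.compile("\\s*" + Pattern.quote(toMatch) + "\\s*");
        // quoteReplacement makes toMatch literal on the replacement side.
        return patt.matcher(input).replaceFirst(Matcher.quoteReplacement(toMatch));
    }

    public static void main(String[] args) {
        System.out.println(trimAroundFirst("=", "something = is equal to = something"));
        System.out.println(trimAroundFirst("$", "price $ 10 $ 20"));
    }
}
```

Only the first occurrence is touched; later occurrences keep their surrounding whitespace.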

Use substring to retain middle value in string

String s = "jstm - John Stuart Mill";
String aFriendlyAssigneeName = s.substring(s.lastIndexOf('-')+1);
I'm currently able to remove jstm - from jstm - John Stuart Mill, but I'm not sure how to now remove everything after John.
All data will be in the format initials - First Middle Last. Basically I just want to strip everything except First.
How can I accomplish this? Perhaps by removing everything after the third white space...

I'd just use this, should be fast enough, and quite short:
String aFriendlyAssigneeName = s.split(" ")[2];
(Splits the string at the spaces in it, and takes the third member of the array, which should be the first name if they're all in that format.)

This should work:
String s = "jstm - John Stuart Mill";
String aFriendlyAssigneeName = s.substring(s.lastIndexOf('-') + 1).trim();
aFriendlyAssigneeName = aFriendlyAssigneeName.substring(0, aFriendlyAssigneeName.indexOf(' '));
After you have removed the initials (and trimmed the leading space), the first name ends at the first blank.

You are looking for the following method -
s.substring(startIndex, endIndex);
This takes a begin and an end index, which makes it easy to get the middle of any String.
You can then find the end index with a bit of (I dare say) magic...
endIndex = s.indexOf(" ", fromIndex)
Where fromIndex is
s.lastIndexOf('-') + 2
(the + 2 skips the hyphen and the space that follows it, so the search starts at the first letter of the name).
Alternatively
If substring is no "hard" requirement, try using
String[] words = s.split(" ");
This will return an array of all values separated by the space.
You can then just select the index of the word (in this case words[2]).
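Put together, the substring variant could look roughly like this. It is a sketch that assumes the "initials - First Middle Last" format from the question; the class and method names are illustrative only:

```java
class FirstName {
    // Assumes the input looks like "initials - First Middle Last".
    static String firstName(String s) {
        int start = s.lastIndexOf('-') + 2; // skip the hyphen and the space after it
        int end = s.indexOf(' ', start);    // first space after the first name
        // If there is no further space, the first name runs to the end of the string.
        return end < 0 ? s.substring(start) : s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(firstName("jstm - John Stuart Mill"));
    }
}
```

Note that lastIndexOf('-') would pick the wrong position if a name itself contained a hyphen; indexOf('-') may be the safer choice for such data.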

Why don't you take the substring up to the first space in the string you obtained after removing the initials (trimmed, so that it does not start with a space)?
aFriendlyAssigneeName = aFriendlyAssigneeName.trim();
aFriendlyAssigneeName = aFriendlyAssigneeName.substring(0, aFriendlyAssigneeName.indexOf(' '));

In my opinion this is a job for a regex: .* - (\w+)? .*
final String value = "jstm - John Stuart Mill";
final Matcher matcher = Pattern.compile(".* - (\\w+)? .*").matcher(value);
matcher.matches();
System.out.println(matcher.group(1));
In my opinion using a regex vs substring:
Pros:
Clearer about what you expect as input and what you intend to capture.
Easily modified/extended if input changes or you want to capture some other part.
Cons:
Regexes can look more cryptic to someone that's not used to them.

the best way for character replacement in String in java

I want to go through a string and, for each character, either replace it with another character or keep it. Because the string is long, the running time of this task is important. Which of these approaches is best, or is there a better idea?
In all of them I append the result to a StringBuilder.
Check every character with a for loop and charAt.
Use a switch inside the same kind of loop.
Use replaceAll twice.
And if one of the first two methods is better, is there any way to check a character against a group of characters, like:
if (st.charAt(i)=='a'..'z') ....
Edit:
Please tell me which way takes the least time, and why. I already know all of the ways you have mentioned!

If you want to replace a single character (or a single sequence), use replace(), as other answers have suggested.
If you want to replace several characters (e.g., 'a', 'b', and 'c') with a single substitute character or character sequence (e.g., "X"), you should use a regular expression replace:
String result = original.replaceAll("[abc]", "X");
If you want to replace several characters, each with a different replacement (e.g., 'a' with 'A', 'b' with 'B'), then looping through the string yourself and building the result in a StringBuilder will probably be the most efficient. This is because, as you point out in your question, you will be going through the string only once.
StringBuilder sb = new StringBuilder();
String targets = "abc";
String replacements = "ABC";
for (int i = 0; i < original.length(); ++i) {
char c = original.charAt(i);
int loc = targets.indexOf(c);
sb.append(loc >= 0 ? replacements.charAt(loc) : c);
}
String result = sb.toString();

Check the documentation and find some good methods:
char from = 'a';
char to = 'b';
str = str.replace(from, to);

String replaceSample = "This String replace Example shows how to replace one char from String";
String newString = replaceSample.replace('r', 't');
Output: This Stting teplace Example shows how to teplace one chat ftom Stting
Also, you could use contains:
str1.toLowerCase().contains(str2.toLowerCase())
To check if the substring str2 exists in str1
Edit:
I just read that the String comes from a file. You can use a regex for this. That would be the best method.
http://docs.oracle.com/javase/tutorial/essential/regex/literals.html

This is your comment:
I want to replace all of the uppercases to lower cases and replace all of the characters except a-z with space.
You can do it like this:
str = str.toLowerCase().replaceAll("[^a-z]", " ");
Your requirement should be part of the question, not in comment #7 under a posted answer...
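On the timing part of the question: a hand-written loop visits each character once and avoids the regex engine, so it is a plausible candidate for long strings, though only a measurement on the real data would settle it. The sketch below also shows how a range check such as 'a'..'z' can be written in Java. It assumes ASCII input; for non-ASCII text its result can differ from the toLowerCase().replaceAll(...) one-liner. The class and method names are made up for the example.

```java
class Normalize {
    // Lowercases ASCII letters and replaces every other character with a space.
    static String normalize(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c >= 'a' && c <= 'z') {
                sb.append(c);                       // already a lowercase letter
            } else if (c >= 'A' && c <= 'Z') {
                sb.append((char) (c - 'A' + 'a'));  // ASCII uppercase to lowercase
            } else {
                sb.append(' ');                     // anything else becomes a space
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("Ab-C9z"));
    }
}
```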

You should look into regex for Java. You can match an entire set of characters. Strings have several methods that you may find useful here: replace, replaceAll, and matches.
You can match the set of ASCII letters, for instance, using [a-zA-Z] (or [a-zA-Z0-9] for alphanumerics), which may be what you're looking for.

Java String Regex Divide - Always the Same Pattern

I never understood how to write a proper regex to split my Strings.
I have Strings of this type: String example = "on[?a, ?b, ?c]";
Sometimes I have this: String example2 = "not clear[?c]";
For the first example I would like to split it into this:
[on, a, b, c]
or
String name = "on";
String [] vars = [a,b,c];
And for the second example I would like to divide into this type:
[not clear, c]
or
String name = "not clear";
String [] vars = [c];
Thanks a lot in advance, guys ;)

If you know the character set of your identifiers, you can simply do a split on all of the text that isn't in that set. For example, if your identifiers only consist of word characters ([a-zA-Z_0-9]) you can use:
String[] parts = "on[?a, ?b, ?c]".split("[\\W]+");
String name = parts[0];
String[] vars = Arrays.copyOfRange(parts, 1, parts.length);
If your identifiers only have A-Z (upper and lower) you could replace \\W above with ^A-Za-z.
I feel that this is more elegant than using a complex regular expression.
Edit: I realize that this will have issues with your second example "not clear". If you have no option of using something like an underscore instead of a space there, you could do one split on [? (or substring) to get the "name", and another split on the remainder, like so:
String s = "not clear[?a, ?b, ?c]";
String[] parts = s.split("\\[\\?"); //need the '?' so we don't get an extra empty array element in the next split
String name = parts[0];
String[] vars = parts[1].split("[\\W]+");
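The two-step split can be wrapped so that both of the question's examples go through the same code. This is only a sketch: it assumes every input has the form name[?v1, ?v2, ...], and the class and method names are invented here.

```java
import java.util.Arrays;

class NameAndVars {
    // Everything before the first "[?" is treated as the name.
    static String name(String s) {
        return s.split("\\[\\?", 2)[0];
    }

    // The remainder is split on runs of non-word characters to get the variables.
    static String[] vars(String s) {
        String[] parts = s.split("\\[\\?", 2);
        return parts.length < 2 ? new String[0] : parts[1].split("\\W+");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"on[?a, ?b, ?c]", "not clear[?c]"}) {
            System.out.println(name(s) + " -> " + Arrays.toString(vars(s)));
        }
    }
}
```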

This comes close, but the problem is that the third capture group is repeated, so it only captures the last match.
(.*?)\[(?:\s*(?:\?(.*?)(?:\s*,\s*\?(.*?))*)\s*)?]
For example, the first one you list, on[?a, ?b, ?c], would give group 1 as on, 2 as a, and 3 as c. If you are using Perl, you could use the g flag to apply a regex to a line multiple times, like this:
my @tokens;
while ( $line =~ /\s*(.*?)\s*[[,\]]/g ) {
push( @tokens, $1 );
}
Note: I did not actually test the Perl code; it is just off the top of my head. It should give you the idea, though.

String[] parts = example.split("[^\\w ]");
List<String> x = new ArrayList<String>();
for (int i = 0; i < parts.length; i++) {
if (!"".equals(parts[i]) && !" ".equals(parts[i])) {
x.add(parts[i]);
}
}
This will work as long as you don't have more than one space separating your non-space characters. There's probably a cleverer way of filtering out the empty and " " strings.

Help in writing a Regular expression for a string

Hi, please help me write a regular expression for the following requirement.
I have strings such as
String vStr = "Every 1 nature(s) - Universe: (Air,Earth,Water sea,Fire)";
String sStr = "Every 1 form(s) - Earth: (Air,Fire) ";
From these strings, after applying the regex, I need to get the values "Air,Earth,Water sea,Fire" and "Air,Fire",
that is, afterwards
String vStrRegex ="Air,Earth,Water sea,Fire";
String sStrRegex ="Air,Fire";
In every input string the values follow a ":" and are always inside parentheses.
Thanks

The regular expression would be something like this:
: \((.*?)\)
Spelt out:
Pattern p = Pattern.compile(": \\((.*?)\\)");
Matcher m = p.matcher(vStr);
// ...
String result = m.group(1);
This will capture the content of the parentheses as the first capture group.
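Note that group(1) can only be read after a successful find() (or matches()); otherwise it throws an IllegalStateException. A minimal runnable sketch of the elided part, with a class name and a null-on-no-match convention that are assumptions of this example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterColon {
    private static final Pattern P = Pattern.compile(": \\((.*?)\\)");

    // Returns the parenthesized text after ": ", or null when nothing matches.
    static String extract(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("Every 1 nature(s) - Universe: (Air,Earth,Water sea,Fire)"));
        System.out.println(extract("Every 1 form(s) - Earth: (Air,Fire) "));
    }
}
```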

Try the following:
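\(([^()]*)\)\s*$
The ending $ is important, otherwise you'll accidentally match the "(s)". Excluding parentheses inside the group matters too: with a greedy .* the match could start at the "(s)" and run through to the final ")".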

If you have each string separately, try this expression: \(([^\(]*)\)\s*$
This would get you the content of the last pair of brackets, as group 1.
If the strings are concatenated by : try to split them first.

Ask yourself if you really need a regex. Does the text you need always appear within the last pair of parentheses? If so, you can keep it simple and use substring instead:
String vStr = "Every 1 nature(s) - Universe: (Air,Earth,Water sea,Fire)";
int lastOpeningParens = vStr.lastIndexOf('(');
int lastClosingParens = vStr.lastIndexOf(')');
String text = vStr.substring(lastOpeningParens + 1, lastClosingParens);
This is much more readable than a regex.

I assume that there are only whitespace characters between : and the opening bracket (:
Pattern regex = Pattern.compile(":\\s+\\((.+)\\)");
You'll find your results in capturing group 1.

Try this regex:
.*\((.*)\)
$1 will contain the required string
