Split a string in java into two parts


I want to split a string based on a substring, and get the first part. Example below.
Input:
body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]/div/div/div/div[2]/div[2]
Output (split at [12]):
body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]
I wrote this code:
String path1 = "body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]/div/div/div/div[2]/div[2]";
String result;
if(path1.contains("[12]")){
System.out.println("yes");
result = path1.split("[12]")[0];
System.out.println(result);
}
but I got this result:
body/div[

String result = path1.substring(0, path1.indexOf("li[12]") + 6);

The split method accepts a regular expression. The regular expression [12] is a character class that matches a single character, either 1 or 2, so the string is split at every 1 or 2. A better solution is to search for the literal [12] directly:
int indexOf12 = path1.indexOf("[12]");
if(indexOf12 != -1)
{
System.out.println("yes");
String result = path1.substring(0, indexOf12 + 4);
System.out.println(result);
}
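The same idea can be wrapped in a small helper that derives the offset from the marker's length instead of hard-coding 4 or 6. This is only a sketch: the names PrefixThrough and upToAndIncluding are made up for illustration, and returning the input unchanged when the marker is absent is an assumption about the desired behaviour.

```java
class PrefixThrough {
    // Returns the part of s up to and including the first occurrence of marker.
    // Assumption: if marker does not occur, the input is returned unchanged.
    static String upToAndIncluding(String s, String marker) {
        int idx = s.indexOf(marker);
        return idx == -1 ? s : s.substring(0, idx + marker.length());
    }

    public static void main(String[] args) {
        String path1 = "body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]/div/div/div/div[2]/div[2]";
        System.out.println(upToAndIncluding(path1, "[12]"));
    }
}
```

Because indexOf treats its argument as plain text, no regex escaping is needed here.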

The [ character has a special meaning in a regex, so you should escape it with \\.
So replace
result = path1.split("[12]")[0];
with
result = path1.split("\\[12]")[0];
Output:
yes
body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li
Note that split drops the delimiter, so [12] itself is missing from this result.

The result must include [12], so add 6 (the length of li[12]) to the index:
String result = path1.substring(0, path1.indexOf("li[12]")+6);

This will solve your problem. The thing is that split takes a regex, not a plain string, so the brackets have to be escaped.
String path1 = "body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]/div/div/div/div[2]/div[2]";
String result;
if(path1.contains("[12]")){
System.out.println("yes");
result = path1.split("\\[12\\]")[0];
System.out.println(result+"[12]");
}
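If escaping by hand feels error-prone, Pattern.quote turns any literal into a regex that matches it exactly. Below is a hedged sketch of that variant. The class and method names are hypothetical, and the limit of 2 is my addition so that only the first occurrence splits the string.

```java
import java.util.regex.Pattern;

class QuoteSplit {
    // Splits s at the first literal occurrence of marker and puts the marker
    // back onto the first part. Assumption: input without the marker is
    // returned unchanged.
    static String firstPartWithMarker(String s, String marker) {
        if (!s.contains(marker)) {
            return s;
        }
        String[] parts = s.split(Pattern.quote(marker), 2);
        return parts[0] + marker;
    }

    public static void main(String[] args) {
        String path1 = "body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]/div/div/div/div[2]/div[2]";
        System.out.println(firstPartWithMarker(path1, "[12]"));
    }
}
```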

Here's a regex-specific approach:
Matcher m = Pattern.compile("(.*\\[12\\])")
.matcher("body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]/div/div/div/div[2]/div[2]");
Output
body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]
Code
import java.util.regex.*;
import java.util.*;
public class HelloWorld {
public static void main(String[] args) {
List < String > allMatches = new ArrayList < String > ();
Matcher m = Pattern.compile("(.*\\[12\\])")
.matcher("body/div[2]/div[3]/div/div[1]/div/div[2]/div[2]/ul/li[12]/div/div/div/div[2]/div[2]");
while (m.find())
allMatches.add(m.group(1));
for (String match: allMatches)
System.out.println(match);
}
}
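One thing to keep in mind with the pattern above: .* is greedy, so if [12] occurs more than once the match extends to the last occurrence. If the first occurrence is wanted, a reluctant .*? anchored at the start is one option. A minimal sketch, with hypothetical names and with null for "not found" as an assumption:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstMarkerRegex {
    // ^ anchors at the start; .*? expands as little as possible,
    // so the group ends at the first "[12]".
    private static final Pattern UP_TO_FIRST = Pattern.compile("^(.*?\\[12\\])");

    // Returns the prefix through the first "[12]", or null if it is absent
    // (returning null is an assumption made for this sketch).
    static String upToFirst12(String s) {
        Matcher m = UP_TO_FIRST.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(upToFirst12("a[12]b[12]c")); // prints a[12]
    }
}
```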

Related

Remove all the leading zero from the number part of a string

I am trying to remove all the leading zeros from the number parts of a string. I have come up with the code below. It works for the first example, but when I add a '0' at the beginning it does not give the proper output. Does anybody know how to achieve this? Thanks in advance.
input: (2016)abc00701def00019z -> output: (2016)abc701def19z -> result: correct
input: 0(2016)abc00701def00019z -> output: (2016)abc71def19z -> result: wrong -> expected output: (2016)abc701def19z
EDIT: The string can contain letters outside the English alphabet.
String localReference = "(2016)abc00701def00019z";
String localReference1 = localReference.replaceAll("[^0-9]+", " ");
List<String> lists = Arrays.asList(localReference1.trim().split(" "));
System.out.println(lists.toString());
String[] replacedString = new String[5];
String[] searchedString = new String[5];
int counter = 0;
for (String list : lists) {
String s = CharMatcher.is('0').trimLeadingFrom(list);
replacedString[counter] = s;
searchedString[counter++] = list;
System.out.println(String.format("Search: %s, replace: %s", list,s));
}
System.out.println(StringUtils.replaceEach(localReference, searchedString, replacedString));

str.replaceAll("(^|[^0-9])0+", "$1");
This removes any run of zeroes that follows a non-digit character or starts the string.
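As a quick check, here is that one-liner wrapped in a method and applied to both inputs from the question. The class and method names are made up for this sketch.

```java
class LeadingZeros {
    // (^|[^0-9]) captures the start of the string or one non-digit character;
    // 0+ is the run of zeros that follows. Replacing with $1 keeps the
    // captured character and drops the zeros.
    static String strip(String s) {
        return s.replaceAll("(^|[^0-9])0+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(strip("(2016)abc00701def00019z"));  // (2016)abc701def19z
        System.out.println(strip("0(2016)abc00701def00019z")); // (2016)abc701def19z
    }
}
```

Since the pattern only distinguishes digits from non-digits, it does not depend on which alphabet the letters come from. One caveat: a number made up only of zeros disappears entirely, so "a000b" becomes "ab".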

I solved this with regex, and it produces the required output for the two test cases you gave. $1 and $2 in the code below refer to the parenthesized groups in the preceding regex.
The code:
public class Demo {
public static void main(String[] args) {
String str = "0(2016)abc00701def00019z";
/*Below line replaces all 0's which come after any a-z or A-Z and which have any number after them from 1-9. */
str = str.replaceAll("([a-zA-Z]+)0+([1-9]+)", "$1$2");
//Below line only replace the 0's coming in the start of the string
str = str.replaceAll("^0+","");
System.out.println(str);
}
}

Java has \P{Alpha}+, which matches a run of non-alphabetic characters; within each such run you can then remove the leading zeros.
String stringToSearch = "0(2016)abc00701def00019z";
Pattern p1 = Pattern.compile("\\P{Alpha}+");
Matcher m = p1.matcher(stringToSearch);
StringBuffer sb = new StringBuffer();
while(m.find()){
m.appendReplacement(sb,m.group().replaceAll("\\b0+",""));
}
m.appendTail(sb);
System.out.println(sb.toString());
output:
(2016)abc701def19z

How to extract a number from a string in a particular format?

I have strings like the ones shown below. From each of them I need to extract the number 123. It can be at any position, but there will be only one number in a string and it will always be in the same format, _number_.
text_data_123
text_data_123_abc_count
text_data_123_abc_pqr_count
text_tery_qwer_data_123
text_tery_qwer_data_123_count
text_tery_qwer_data_123_abc_pqr_count
Below is the code:
String value = "text_data_123_abc_count";
// this below code will not work as index 2 is not a number in some of the above example
int textId = Integer.parseInt(value.split("_")[2]);
What is the best way to do this?

With a little guava magic:
String value = "text_data_123_abc_count";
Integer id = Ints.tryParse(CharMatcher.inRange('0', '9').retainFrom(value));
see also CharMatcher doc

\\d+
Using this regex with Matcher.find() should do it for you.
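A minimal sketch of what that could look like. The class name, method name, and the -1 fallback are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstNumber {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits parsed as an int.
    // Assumption: -1 signals that the string contains no digits.
    static int extract(String value) {
        Matcher m = DIGITS.matcher(value);
        return m.find() ? Integer.parseInt(m.group()) : -1;
    }

    public static void main(String[] args) {
        System.out.println(extract("text_data_123_abc_count")); // 123
        System.out.println(extract("text_tery_qwer_data_123")); // 123
    }
}
```

find() scans for the first match anywhere in the input, so the position of the number does not matter.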

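Use lookaround assertions: a lookbehind for the preceding underscore and a lookahead for the following underscore or the end of the string (needed for inputs such as text_data_123).
Matcher m = Pattern.compile("(?<=_)\\d+(?=_|$)").matcher(s);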
while(m.find())
{
System.out.println(m.group());
}

You can use replaceAll to remove all non-digits to leave only one number (since you say there will be only 1 number in the input string):
String s = "text_data_123_abc_count".replaceAll("[^0-9]", "");
Instead of [^0-9] you can use \D (which also means non-digit):
String s = "text_data_123_abc_count".replaceAll("\\D", "");
Given current requirements and restrictions, the replaceAll solution seems the most convenient (no need to use Matcher directly).

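You can split the string into its parts and compare each part with its uppercase form. A part that is equal to its uppercase form contains no lowercase letters (this assumes the text parts are lowercase), so you can parse it as the number and save it: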
public class Main {
public static void main(String[] args) {
String txt = "text_tery_qwer_data_123_abc_pqr_count";
String[] words = txt.split("_");
int num = 0;
for (String t : words) {
if (t.equals(t.toUpperCase())) // compare contents with equals, not references with ==
num = Integer.parseInt(t);
}
System.out.println(num);
}
}

Get string within double quotes along with rest of the string

I have a case where I need to extract the string within double quotes in one var and the rest of the string in another var.
Two possibilities:
String: "Franklin B" Benjamin
Result:
var1 = Franklin B
var2 = Benjamin
String: Benjamin "Franklin B"
Result:
var1 = Benjamin
var2 = Franklin B
Regex/Without regex; I am open to any method.

Give this a try...
Basically you remove any leading delimiter in the string before you perform the split. This way you don't have to worry about a leading empty element.
public static void main(String[] args) {
String testString = "\"Franklin B\" Benjamin";
String testString2 = "Benjamin \"Franklin B\"";
displaySplitResults(mySplit(testString, "\""));
displaySplitResults(mySplit(testString2, "\""));
}
private static String[] mySplit(final String input, final String delim)
{
return input.replaceFirst("^" + delim, "").split(delim);
}
private static void displaySplitResults(String[] splitResults) {
if (splitResults.length == 2) {
String var1 = splitResults[0].trim();
String var2 = splitResults[1].trim();
System.out.println(var1);
System.out.println(var2);
}
}
Results:
Franklin B
Benjamin
Benjamin
Franklin B

A simple non-regex way to do it:
public static String[] split(String input) {
if (input.charAt(0) == '"') {
return input.substring(1).split("\" ");
} else {
return input.substring(0, input.length() - 1).split(" \"");
}
}
First check whether the first character is ". Then remove the quote from either beginning or the end and simply split it.
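A variant of the same non-regex idea that locates the two quote characters with indexOf, so it does not depend on exactly one space next to the quotes. This is a sketch under the assumption that the input contains exactly one quoted section; the class and method names are hypothetical.

```java
class QuotedSplit {
    // Returns {var1, var2} in order of appearance.
    // Assumption: the input contains exactly one pair of double quotes.
    static String[] split(String input) {
        int open = input.indexOf('"');
        int close = input.indexOf('"', open + 1);
        if (open == -1 || close == -1) {
            throw new IllegalArgumentException("expected one quoted section");
        }
        String quoted = input.substring(open + 1, close);
        String before = input.substring(0, open).trim();
        String after = input.substring(close + 1).trim();
        // If nothing precedes the quotes, the quoted text comes first.
        return before.isEmpty()
                ? new String[] {quoted, after}
                : new String[] {before, quoted};
    }

    public static void main(String[] args) {
        String[] r = split("\"Franklin B\" Benjamin");
        System.out.println(r[0]); // Franklin B
        System.out.println(r[1]); // Benjamin
    }
}
```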

The following will get you a List with the values you want:
private List<String> getValues(String input) {
List<String> matchList = new ArrayList<>();
Pattern regex = Pattern.compile("[^\\s\"']+|\"[^\"]*\"|'[^']*'");
Matcher regexMatcher = regex.matcher(input);
while (regexMatcher.find()) {
matchList.add(regexMatcher.group());
}
return matchList;
}
Taken from Regex for splitting a string using space when not surrounded by single or double quotes

@Shar1er80 Nice piece of work without regex. Worked great.
I also tried with regex (this snippet is C#, not Java):
//Using regex to get values separated by whitespace but keeping values with double quotes
RegexOptions options = RegexOptions.None;
Regex regex = new Regex( @"((""((?<token>.*?)(?<!\\)"")|(?<token>[\w]+))(\s)*)", options );
string input = @" Here is ""my string"" it has "" six matches"" ";
var result = (from Match m in regex.Matches( input )
where m.Groups[ "token" ].Success
select m.Groups[ "token" ].Value).ToList();
Gave me exact result.

Split string without losing split character

I want to split a string in Java. The normal split function splits the string but loses the split characters:
String str = "123{456]789[012*";
I want to split the string at the characters {, [, ] and * without losing them. I mean I want results like this:
part 1 = 123{
part 2 = 456]
part 3 = 789[
part 4 = 012*
Normally the split function splits like this:
part 1 = 123
part 2 = 456
part 3 = 789
part 4 = 012
Is it possible?

You can use zero-width lookahead/behind expressions to define a regular expression that matches the zero-length string between one of your target characters and anything that is not one of your target characters:
(?<=[{\[\]*])(?=[^{\[\]*])
Pass this expression to String.split:
String[] parts = "123{456]789[012*".split("(?<=[{\\[\\]*])(?=[^{\\[\\]*])");
If you have a block of consecutive delimiter characters this will split once at the end of the whole block, i.e. the string "123{456][789[012*" would split into four blocks "123{", "456][", "789[", "012*". If you used just the first part (the look-behind)
(?<=[{\[\]*])
then you would get five parts "123{", "456]", "[", "789[", "012*"
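Both variants can be tried side by side with a short sketch like the following. The class and method names are made up for illustration; the regexes are the ones given above.

```java
import java.util.Arrays;

class KeepDelimiters {
    // Splits after a whole run of delimiters, just before the next non-delimiter.
    static String[] splitAfterRun(String s) {
        return s.split("(?<=[{\\[\\]*])(?=[^{\\[\\]*])");
    }

    // Splits after every single delimiter (look-behind only).
    static String[] splitAfterEach(String s) {
        return s.split("(?<=[{\\[\\]*])");
    }

    public static void main(String[] args) {
        String input = "123{456][789[012*";
        System.out.println(Arrays.toString(splitAfterRun(input)));  // [123{, 456][, 789[, 012*]
        System.out.println(Arrays.toString(splitAfterEach(input))); // [123{, 456], [, 789[, 012*]
    }
}
```

Both patterns match only zero-length positions, which is why no characters are lost; split also discards the trailing empty string produced by the match after the final *.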

Using a positive lookbehind:
(?<={|\[|\]|\*)
String str = "123{456]789[012*";
String parts[] = str.split("(?<=\\{|\\[|\\]|\\*)");
System.out.println(Arrays.toString(parts));
Output:
[123{, 456], 789[, 012*]

I think you're looking for something like
String str = "123{456]789[012*";
String[] parts = new String[] {
str.substring(0,4), str.substring(4,8), str.substring(8,12),
str.substring(12)
};
System.out.println(Arrays.toString(parts));
Output is
[123{, 456], 789[, 012*]

You can use a Pattern and a Matcher to find each splitting character and the index just after it.
public static List<String> split(String string, String splitRegex) {
List<String> result = new ArrayList<String>();
Pattern p = Pattern.compile(splitRegex);
Matcher m = p.matcher(string);
int index = 0;
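while (index < string.length()) {
if (m.find()) {
// m.end() is the index just past the delimiter, so the delimiter stays in this part
int splitIndex = m.end();
result.add(string.substring(index, splitIndex));
index = splitIndex;
} else {
// no more delimiters: add the remainder and stop, otherwise the loop never ends
result.add(string.substring(index));
break;
}
}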
return result;
}
Example code:
public static void main(String[] args) {
System.out.println(split("123{456]789[012*","\\{|\\]|\\[|\\*"));
}
Output:
[123{, 456], 789[, 012*]

How to return the first chunk of either numerics or letters from a string?

For example, if I had (-> means return):
aBc123afa5 -> aBc
168dgFF9g -> 168
1GGGGG -> 1
How can I do this in Java? I assume it's something regex related but I'm not great with regex and so not too sure how to implement it (I could with some thought but I have a feeling it would be 5-10 lines long, and I think this could be done in a one-liner).
Thanks

String myString = "aBc123afa5";
String extracted = myString.replaceAll("^([A-Za-z]+|\\d+).*$", "$1");
To use Matcher.group() and reuse a Pattern for efficiency:
// Class
private static final Pattern pattern = Pattern.compile("^([A-Za-z]+|\\d+).*$");
// Your method
{
String myString = "aBc123afa5";
Matcher matcher = pattern.matcher(myString);
if(matcher.matches())
System.out.println(matcher.group(1));
}
Note: /^([A-Za-z]+|\d+).*$/ and /^([A-Za-z]+|\d+)/ both work with similar efficiency. You can compare the matcher debug logs on regex101 to confirm this.

Without using regex, you can do this:
String string = "168dgFF9g";
String chunk = "" + string.charAt(0);
boolean searchDigit = Character.isDigit(string.charAt(0));
for (int i = 1; i < string.length(); i++) {
boolean isDigit = Character.isDigit(string.charAt(i));
if (isDigit == searchDigit) {
chunk += string.charAt(i);
} else {
break;
}
}
System.out.println(chunk);

public static String prefix(String s) {
return s.replaceFirst("^(\\d+|\\pL+|).*$", "$1");
}
where
\\d = digit
\\pL = letter
postfix + = one or more
| = or
^ = begin of string
$ = end of string
$1 = first group `( ... )`
An empty alternative (last |) ensures that (...) is always matched, and always a replace happens. Otherwise the original string would be returned.
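An alternative that avoids the empty-alternative trick is Matcher.lookingAt(), which matches only at the start of the input without requiring the whole string to match. A minimal sketch, with hypothetical names, returning an empty string when the input starts with neither a digit nor a letter:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeadingChunk {
    // A run of digits or a run of letters (\p{L} covers any Unicode letter).
    private static final Pattern CHUNK = Pattern.compile("\\d+|\\p{L}+");

    // Returns the leading run of digits or letters, or "" if there is none.
    static String of(String s) {
        Matcher m = CHUNK.matcher(s);
        return m.lookingAt() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(of("aBc123afa5")); // aBc
        System.out.println(of("168dgFF9g"));  // 168
        System.out.println(of("1GGGGG"));     // 1
    }
}
```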
