Using split method in Java to separate different inputs

I am using the split method in Java to split "Smith, John (111) 123-4567" into "John", "Smith", and "111". I need to get rid of the comma and the parentheses. This is what I have so far, but it doesn't split the string.
// split data into tokens separated by spaces
tokens = data.split(" , \\s ( ) ");
first = tokens[1];
last = tokens[0];
area = tokens[2];
// display the tokens one per line
for(int k = 0; k < tokens.length; k++) {
System.out.print(tokens[1] + " " + tokens[0] + " " + tokens[2]);
}

Can also be solved by using a regular expression to parse the input:
String inputString = "Smith, John (111) 123-4567";
String regexPattern = "(?<lastName>.*), (?<firstName>.*) \\((?<cityCode>\\d+)\\).*";
Pattern pattern = Pattern.compile(regexPattern);
Matcher matcher = pattern.matcher(inputString);
if (matcher.matches()) {
    System.out.printf("%s %s %s", matcher.group("firstName"),
            matcher.group("lastName"),
            matcher.group("cityCode"));
}
Output: John Smith 111

It looks like the pattern passed to String.split() is not doing what you expect.
split() treats its whole argument as a single regular expression, so " , \\s ( ) " only matches a comma surrounded by one specific sequence of spaces, with the parentheses acting as a regex group rather than as literal characters. That sequence is not present anywhere in the operand string, so nothing is split.
I am not able to test your code in a Java runtime, but I think you need to break the work into individual split operations, something like:
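String data = "Last, First (111) 123-4567";
String[] tokens = data.split(",");
// tokens now holds two strings:
// "Last" and " First (111) 123-4567" (note the leading space)
String last = tokens[0];
// trim first, otherwise the leading space produces an empty first token
tokens = tokens[1].trim().split(" ");
// tokens now holds three strings:
// "First", "(111)", and "123-4567"
String first = tokens[0];
// strip the parentheses from "(111)"
String area = tokens[1].replaceAll("[()]", "");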
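For comparison, the whole job can also be done with a single split by putting every delimiter into a regex character class. This is only a sketch: the class name NameSplitDemo and the helper tokenize are made up for illustration, and it assumes the input always has the shape "Last, First (area) number".

```java
import java.util.Arrays;

class NameSplitDemo {
    // Splits on any run of commas, whitespace, or parentheses.
    // Inside [...] the parentheses are literal characters, not a group.
    static String[] tokenize(String data) {
        return data.split("[,\\s()]+");
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("Smith, John (111) 123-4567");
        String last = tokens[0];
        String first = tokens[1];
        String area = tokens[2];
        System.out.println(first + " " + last + " " + area); // John Smith 111
        System.out.println(Arrays.toString(tokens));
    }
}
```

The trailing + in the pattern matters: without it, ", " would count as two separate delimiters and leave an empty string between them.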

Related

Regex pattern to convert comma separated String

Changing string with comma separated values to numbered new-line values
For example:
Input: a,b,c
Output:
1.a
2.b
3.c
Finding it hard to change it using regex pattern, instead of converting string to string array and looping through.
I'm not really sure that it's possible to achieve this with only a regex, without any kind of loop. As for me, splitting the string into an array and iterating over it is the most straightforward solution:
String value = "a,b,c";
String[] values = value.split(",");
String result = "";
for (int i=1; i<=values.length; i++) {
result += i + "." + values[i-1] + "\n";
}
Sure, it's possible to do this without splitting or arrays, but the solution is a little awkward, like:
String value = "a,b,c";
// (?m) makes ^ match at the start of every line
Pattern pattern = Pattern.compile("(?m)^\\w+");
Matcher matcher = pattern.matcher(value.replace(",", "\n"));
StringBuffer s = new StringBuffer();
int i = 0;
while (matcher.find()) {
    matcher.appendReplacement(s, ++i + "." + matcher.group());
}
// copy whatever follows the last match
matcher.appendTail(s);
System.out.println(s.toString());
Here each , is replaced with a \n newline, and then we look for a run of word characters at the start of every line with (?m)^\\w+. For every match we prepend the line number, and appendTail copies anything that follows the last match. But even here we have to use a loop to set the line number, and the logic is not as clear as in the first version.
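If the goal is mainly to avoid a hand-written loop, a stream over the array indexes is another option. This is a sketch only, assuming Java 8 or later; the class name NumberedLinesDemo and the method number are hypothetical names chosen for the example.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class NumberedLinesDemo {
    // Turns "a,b,c" into "1.a\n2.b\n3.c".
    static String number(String csv) {
        String[] values = csv.split(",");
        return IntStream.range(0, values.length)
                .mapToObj(i -> (i + 1) + "." + values[i])
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(number("a,b,c"));
    }
}
```

The iteration is still there, just hidden inside the stream, so this changes the style rather than the underlying approach.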

Parsing a string with [3:0] substring in it

I want to store two numbers from a string into two distinct variables - for example, var1 = 3 and var2 = 0 from "[3:0]". I have the following code snippet:
String myStr = "[3:0]";
if (myStr.trim().matches("\\[(\\d+)\\]")) {
// Do something.
// If it enter the here, here I want to store 3 and 0 in different variables or an array
}
Is it possible doing this with split and regular expressions?
Don't call trim(). Enhance your regex instead.
Your regex is missing the pattern for : and the second number, and you don't need to escape the ].
To capture the matched numbers, you need the Matcher:
String myStr = " [3:0] ";
Matcher m = Pattern.compile("\\s*\\[(\\d+):(\\d+)]\\s*").matcher(myStr);
if (m.matches())
System.out.println(m.group(1) + ", " + m.group(2));
Output
3, 0
You can use replaceAll and split:
String myStr = "[3:0]";
String trimmed = myStr.trim();
if (trimmed.matches("\\[\\d+:\\d+\\]")) {
    // strip the brackets, then split on the colon
    String[] numbers = trimmed.replaceAll("[\\[\\]]", "").split(":");
}
Moreover, the regex to match the string should be \\[\\d+:\\d+\\]. If you want to avoid trim, you can add \\s* at the start and end to match the spaces. But trim is not bad.
EDIT
As suggested by Andreas in comments,
String myStr = "[3:0]";
String regExp = "\\[(\\d+):(\\d+)\\]";
Pattern pattern = Pattern.compile(regExp);
Matcher matcher = pattern.matcher(myStr.trim());
if(matcher.find()) {
int a = Integer.parseInt(matcher.group(1));
int b = Integer.parseInt(matcher.group(2));
System.out.println(a + " : " + b);
}
OUTPUT
3 : 0
Without any regular expressions you could do this:
// this will remove the braces [ and ] and just leave "3:0"
String numberString = myStr.trim().replace("[", "").replace("]", "");
// this will split the string in everything before the : and everything after the : (so two values as an array)
String[] numbers = numberString.split(":");
// get the first value and parse it as a number "3" will become a simple 3
int firstNumber = Integer.parseInt(numbers[0]) ;
// get the second value and parse it from "0" to a plain 0
int secondNumber = Integer.parseInt(numbers[1]);
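Be careful when parsing numbers: Integer.parseInt throws a NumberFormatException if a piece is not a plain integer, so what is safe depends on your input string and what other forms it might take (e.g. "3:12" and "3:02" are fine, but "3: 2" or "3:x" will throw).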
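Which inputs actually trip up Integer.parseInt is easy to check. The sketch below tries a few pieces that split(":") could plausibly produce; the class name ParseIntDemo and the helper parses are made up for illustration.

```java
class ParseIntDemo {
    // Returns true if Integer.parseInt accepts the text, false if it throws.
    static boolean parses(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "12" and "02" are accepted (leading zeros are fine);
        // " 2", "" and "a" are rejected with a NumberFormatException.
        for (String s : new String[] {"12", "02", " 2", "", "a"}) {
            System.out.println("\"" + s + "\" -> " + parses(s));
        }
    }
}
```

Since parseInt does not ignore surrounding whitespace, trimming each piece before parsing is a cheap safeguard if the input may contain stray spaces.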
In case you don't need to validate the input and simply want to get the numbers from it, you can find indexOf(":") and take the substrings you are interested in, which are:
from just after [ (which is at position 0) up to :
and from just after : up to ] (which is at position length of string - 1)
Your code can look like
String text = "[3:0]";
int colonIndex = text.indexOf(':');
String first = text.substring(1, colonIndex);
String second = text.substring(colonIndex + 1, text.length() - 1);

Best way to select certain parts of data in a string that changes in size

I'm looking for a good method of parsing certain parts of information in a string that changes in size.
For example, the string might be
"id:1234 alert:a-b up:12.3 down:12.3"
and I need to pick out the values for id, alert, up and down. My initial thought was substring, but then I realised that the length of the string can change, for example
"id:123456 alert:a-b-c-d up:12.345 down:12.345"
So using substring each time to look at say characters 3 to 7 may not work each time because it would not capture all of the data needed.
What would be a smart way of selecting each value that is needed? Hopefully I've explained this well as I normally tend to confuse people with my bad explanations. I am programming in Java.
You could simply use String.split(), first to tokenize the whitespace and then to tokenize on your key/value separator (colon in this case):
String line = "id:1234 alert:a-b up:12.3 down:12.3";
// first split the line by whitespace
String[] keyValues = line.split("\\s+");
for (String keyValueString : keyValues) {
String[] keyValue = keyValueString.split(":");
// TODO might want to check for bad data, that we have 2 values
System.out.println(String.format("Key: %-10s Value: %-10s", keyValue[0], keyValue[1]));
}
Result:
Key: id         Value: 1234
Key: alert      Value: a-b
Key: up         Value: 12.3
Key: down       Value: 12.3
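The TODO in the loop above about bad data could be handled along these lines. This is a minimal sketch, not part of the original answer: the class name KeyValueDemo, the method parse, and the choice to silently skip malformed tokens are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KeyValueDemo {
    // Collects key:value tokens into a map, keeping their original order.
    static Map<String, String> parse(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : line.trim().split("\\s+")) {
            // limit 2: only the first colon separates key from value
            String[] keyValue = pair.split(":", 2);
            if (keyValue.length != 2 || keyValue[0].isEmpty()) {
                continue; // skip tokens without a usable key
            }
            result.put(keyValue[0], keyValue[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values =
                parse("id:123456 alert:a-b-c-d up:12.345 down:12.345");
        System.out.println(values.get("id"));
        System.out.println(values.get("alert"));
        System.out.println(values);
    }
}
```

Looking values up by key means the code no longer cares how long each value is or in which order the fields appear.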
A basic solution based on regular expressions might look like this:
String input = "id:1234 alert:a-b up:12.3 down:12.3";
Matcher matcher = Pattern.compile("(\\S+):(\\S+)").matcher(input);
while (matcher.find()) {
System.out.println(matcher.group(1) + " = " + matcher.group(2));
}
This assumes you are looking for one or more non-whitespace characters, then a colon, then one or more non-whitespace characters.
Output:
id = 1234
alert = a-b
up = 12.3
down = 12.3
You can use the method .split() from the String class.
Check this out:
String line = "id:1234 alert:a-b up:12.3 down:12.3";
String[] splittedLine = line.split(" ");
for (int i = 0; i < splittedLine.length; i++) {
    System.out.println(splittedLine[i]);
}
What you are doing here is splitting your string line at every space character it finds.
This is the result:
id:1234
alert:a-b
up:12.3
down:12.3

Getting next two words from a given word in string with words containing non alphanumeric characters as well

I have a String as below:
String str = "This is something Total Toys (RED) 300,000.00 (49,999.00) This is something";
Input from user would be a keyword String viz. Total Toys (RED)
I can get the index of the keyword using str.indexOf(keyword);
I can also get the start of the next word by adding length of keyword String to above index.
However, how can I get the next two tokens after the keyword in given String which are the values I want?
if(str.contains(keyWord)){
String Value1 = // what should come here such that value1 is 300,000.00 which is first token after keyword string?
String Value2 = // what should come here such that value2 is (49,999.00) which is second token after keyword string?
}
Context : Read a PDF using PDFBox. The keyword above is the header in first column of a table in the PDF and the next two tokens I want to read are the values in the next two columns on the same row in this table.
You can use regular expressions to do this. This will work for all instances of the keyword that are followed by two tokens; if the keyword is not followed by two tokens, it won't match. However, this is easily adaptable, so please say if you want to match cases where 0 or 1 tokens follow the keyword.
String regex = "(?i)%s\\s+([\\S]+)\\s+([\\S]+)";
Matcher m = Pattern.compile(String.format(regex, Pattern.quote(keyword))).matcher(str);
while (m.find())
{
System.out.println(m.group(1));
System.out.println(m.group(2));
}
In your example, %s in the regex would be replaced by "Total Toys", giving:
300,000.00 49,999.00
(?i) means case-insensitive
\\s means whitespace
\\S means non-whitespace
[...] is a character class
+ means 1 or more
(...) is a capturing group
EDIT: If you want to use a keyword that contains characters with special meaning in regular expressions, you need to use Pattern.quote(). For example, in a regex, ( and ) are special characters, so a keyword containing them will produce an incorrect regex. Pattern.quote() wraps the keyword in \\Q and \\E so that every character in it is matched literally, which has the same effect as escaping them by hand as \\( and \\).
If you want three groups, use this:
String regex = "%s\\s+([\\S]+)\\s+([\\S]+)(?:\\s+([\\S]+))?";
NB: If only two groups follow, group(3) will be null.
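To see the effect of Pattern.quote() on the keyword from the question, which does contain parentheses, here is a small self-contained sketch. The class name NextTokensDemo and the helper nextTwo are hypothetical, and returning null when nothing matches is just one possible convention.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NextTokensDemo {
    // Returns the two whitespace-delimited tokens that follow the keyword,
    // or null if the keyword is not followed by two tokens.
    static String[] nextTwo(String str, String keyword) {
        Pattern p = Pattern.compile(
                Pattern.quote(keyword) + "\\s+(\\S+)\\s+(\\S+)");
        Matcher m = p.matcher(str);
        return m.find() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) {
        String str = "This is something Total Toys (RED) 300,000.00 "
                + "(49,999.00) This is something";
        String[] values = nextTwo(str, "Total Toys (RED)");
        System.out.println(values[0]); // 300,000.00
        System.out.println(values[1]); // (49,999.00)
    }
}
```

Without the quoting, (RED) would be read as a capturing group matching the text RED, so the keyword with its literal parentheses would never be found.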
Something like this:
String Value1 = null;
String Value2 = null;
String remainingPart = str.substring(str.indexOf(keyWord) + keyWord.length());
// StringTokenizer splits on whitespace by default
StringTokenizer st = new StringTokenizer(remainingPart);
if (st.hasMoreTokens()) {
    Value1 = st.nextToken();
}
if (st.hasMoreTokens()) {
    Value2 = st.nextToken();
}
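Try this,
String str = "This is something Total Toys 300,000.00 49,999.00 This is something";
if (str.contains(keyWord)) {
    // quote the keyword so characters such as ( and ) are matched literally
    String splitLine = str.split(Pattern.quote(keyWord))[1];
    // trim so the first token is not an empty string
    String[] tokens = splitLine.trim().split(" ");
    String Value1 = tokens[0];
    String Value2 = tokens[1];
}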
Here is something that works given what you have provided:
public static void main(String[] args)
{
String search = "Total Toys";
String str = "This is something Total Toys 300,000.00 49,999.00 This is something";
int index = str.indexOf(search);
index += search.length();
String[] tokens = str.substring(index, str.length()).trim().split(" ");
String val1 = tokens[0];
String val2 = tokens[1];
System.out.println("Val1: " + val1 + ", Val2: " + val2);
}
Output:
Val1: 300,000.00, Val2: 49,999.00

explode function in java as in PHP

I have a variable in Java that looks like this: I+am+good+boy. I want to separate it on the + sign. In PHP I can use explode, which is very handy; is there an equivalent function in Java? I saw the split() function definition, but that was not helpful, as it takes a regular expression.
Any help
Thanks
Use String.split() as the equivalent of explode.
An example of use:
Explode:
String[] explode = "I+am+a+good+boy".split("\\+");
And you can reverse this like so (or "implode" it):
// StringUtils is from Apache Commons Lang; on Java 8+, String.join(" ", explode) does the same
String implode = StringUtils.join(explode, " ");
You have two options, as far as I know:
String text = "I+am+good+boy";
System.out.println("Using Tokenizer : ");
StringTokenizer tokenizer = new StringTokenizer(text, "+");
while (tokenizer.hasMoreTokens()) {
String token = tokenizer.nextToken();
System.out.println(" Token = " + token);
}
System.out.println("\n Using Split :");
String [] array = text.split("\\+");
for (int i = 0; i < array.length; i++) {
System.out.println(array[i]);
}
You can try like this
String str = "I+am+a+good+boy";
String[] array = str.split("\\+");
you will get "I", "am", "a", "good", "boy" strings in the array. And you can access them as
String firstElem = array[0];
In the firstElem string you will get "I" as result.
The \\ is placed before + because split() takes a regular expression (regex) as its argument, and + has a special meaning in a regex: it means one or more repetitions of whatever precedes it.
So, if you want a literal + sign, you have to escape it with \\.
Just use split and escape the regex - either by hand or using the Pattern.quote() method.
String str = "I+am+a+good+boy";
String[] pieces = str.split(Pattern.quote("+"));
Now you can use pieces[0], pieces[1] and so on.
More Info: http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29
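Putting the pieces together, a rough Java counterpart to PHP's explode and implode could look like the sketch below. The class ExplodeDemo and its two helpers are made-up names for illustration; it assumes Java 8 or later for String.join, and the limit of -1 is a choice made here so that trailing empty strings are kept.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ExplodeDemo {
    // Splits text on a literal delimiter; Pattern.quote disables regex
    // metacharacters, and limit -1 keeps trailing empty strings.
    static String[] explode(String delimiter, String text) {
        return text.split(Pattern.quote(delimiter), -1);
    }

    // Joins the pieces back together with the given glue.
    static String implode(String glue, String[] pieces) {
        return String.join(glue, pieces);
    }

    public static void main(String[] args) {
        String[] pieces = explode("+", "I+am+good+boy");
        System.out.println(implode(" ", pieces)); // I am good boy

        // An unescaped "+" is not a valid regex on its own.
        try {
            "I+am+good+boy".split("+");
        } catch (PatternSyntaxException e) {
            System.out.println("Unescaped + is rejected: " + e.getDescription());
        }
    }
}
```

Quoting the delimiter inside the helper means callers can pass any literal string without having to think about regex escaping.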
