Parsing a string with [3:0] substring in it

I want to store two numbers from a string into two distinct variables - for example, var1 = 3 and var2 = 0 from "[3:0]". I have the following code snippet:
String myStr = "[3:0]";
if (myStr.trim().matches("\\[(\\d+)\\]")) {
// Do something.
// If it enters here, I want to store 3 and 0 in different variables or an array
}
Is it possible to do this with split and regular expressions?

Don't call trim(). Enhance your regex instead.
Your regex is missing the pattern for : and the second number, and you don't need to escape the ].
To capture the matched numbers, you need the Matcher:
String myStr = " [3:0] ";
Matcher m = Pattern.compile("\\s*\\[(\\d+):(\\d+)]\\s*").matcher(myStr);
if (m.matches())
System.out.println(m.group(1) + ", " + m.group(2));
Output
3, 0
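As a minimal sketch, the same pattern can be wrapped in a helper that hands back the two numbers as ints. The class name BracketPairParser and the method parsePair are made up for illustration, and returning null for input that does not match is only one possible choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketPairParser {
    // Optional surrounding whitespace, then "[digits:digits]".
    private static final Pattern PAIR = Pattern.compile("\\s*\\[(\\d+):(\\d+)]\\s*");

    // Returns {first, second}, or null when the input does not have the expected shape.
    // Integer.parseInt can still throw NumberFormatException for numbers too large for an int.
    static int[] parsePair(String input) {
        Matcher m = PAIR.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        int[] pair = parsePair(" [3:0] ");
        System.out.println(pair[0] + ", " + pair[1]); // prints "3, 0"
    }
}
```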

You can use replaceAll and split
String myStr = "[3:0]";
if(myStr.trim().matches("\\[\\d+:\\d+\\]")) {
String[] numbers = myStr.replaceAll("[\\[\\]]","").split(":");
}
Moreover, your regex to match the string should be \\[\\d+:\\d+\\]. If you want to avoid trim(), you can add \\s* at the start and end to match optional whitespace. But trim() is not bad.
EDIT
As suggested by Andreas in comments,
String myStr = "[3:0]";
String regExp = "\\[(\\d+):(\\d+)\\]";
Pattern pattern = Pattern.compile(regExp);
Matcher matcher = pattern.matcher(myStr.trim());
if(matcher.find()) {
int a = Integer.parseInt(matcher.group(1));
int b = Integer.parseInt(matcher.group(2));
System.out.println(a + " : " + b);
}
OUTPUT
3 : 0

Without any regular expressions you could do this:
// this will remove the square brackets [ and ] and just leave "3:0"
String numberString = myStr.trim().replace("[", "").replace("]", "");
// this will split the string in everything before the : and everything after the : (so two values as an array)
String[] numbers = numberString.split(":");
// get the first value and parse it as a number "3" will become a simple 3
int firstNumber = Integer.parseInt(numbers[0]) ;
// get the second value and parse it from "0" to a plain 0
int secondNumber = Integer.parseInt(numbers[1]);
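Be careful when parsing numbers, depending on your input string and what other possibilities there might be: Integer.parseInt accepts leading zeros (so "3:02" gives 3 and 2), but it throws a NumberFormatException if a part is empty or contains anything other than digits and an optional sign (e.g. "3: 2" or "3:a").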

If you don't need to validate the input and simply want to get the numbers out of it, you can find indexOf(":") and take the substrings you are interested in, which are:
from just after [ (which is at position 0) up to :
and from just after : up to ] (which is at position length of string - 1)
Your code can look like
String text = "[3:0]";
int colonIndex = text.indexOf(':');
String first = text.substring(1, colonIndex);
String second = text.substring(colonIndex + 1, text.length() - 1);
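The snippet above stops at two strings; a small sketch that also converts them to ints could look like the following. The names IndexOfPair and parse are hypothetical, and the code assumes, as this answer does, that the input is already known to look like "[a:b]" with no surrounding whitespace.

```java
class IndexOfPair {
    // No validation: assumes text starts with '[', ends with ']' and contains one ':'.
    static int[] parse(String text) {
        int colonIndex = text.indexOf(':');
        // Everything between '[' and ':'.
        int first = Integer.parseInt(text.substring(1, colonIndex));
        // Everything between ':' and the closing ']'.
        int second = Integer.parseInt(text.substring(colonIndex + 1, text.length() - 1));
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        int[] pair = parse("[3:0]");
        System.out.println(pair[0] + " and " + pair[1]); // prints "3 and 0"
    }
}
```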

Related

(hello-> h3o) How to replace in a String the middle letters for the number of letters replaced

I need to build a method which receives a String, e.g. "elephant-rides are really fun!", and returns another similar String; in this example the return value should be "e6t-r3s are r4y fun!" (because e-lephan-t has 6 middle letters, r-ide-s has 3 middle letters, and so on).
To get that result I need to replace the middle letters of each word with the number of letters replaced, leaving unchanged everything which isn't a letter as well as the first and last letter of every word.
So far I've tried using a regex to split the received string into words and saving these words in an array of strings. I also have an int array in which I save the number of middle letters, but I don't know how to join both arrays and the symbols into the correct String to return.
String string="elephant-rides are really fun!";
String[] parts = string.split("[^a-zA-Z]");
int[] sizes = new int[parts.length];
int index=0;
for(String aux: parts)
{
sizes[index]= aux.length()-2;
System.out.println( sizes[index]);
index++;
}

You may use
String text = "elephant-rides are really fun!";
Pattern r = Pattern.compile("(?U)(\\w)(\\w{2,})(\\w)");
Matcher m = r.matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, m.group(1) + m.group(2).length() + m.group(3));
}
m.appendTail(sb); // append the rest of the contents
System.out.println(sb);
// => e6t-r3s are r4y fun!
Here, (?U)(\\w)(\\w{2,})(\\w) matches any Unicode word char capturing it into Group 1, then captures any 2 or more word chars into Group 2 and then captures a single word char into Group 3, and inside the .appendReplacement method, the second group contents are "converted" into its length.
Java 9+:
String text = "elephant-rides are really fun!";
Pattern r = Pattern.compile("(?U)(\\w)(\\w{2,})(\\w)");
Matcher m = r.matcher(text);
String result = m.replaceAll(x -> x.group(1) + x.group(2).length() + x.group(3));
System.out.println( result );
// => e6t-r3s are r4y fun!
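Note that \w also matches digits and the underscore. If only letters should count as part of a word, a variant using \p{L} is one option. The sketch below assumes Java 9+ (for the lambda form of replaceAll); the class name MiddleLetterCounter and the method abbreviate are invented for the example.

```java
import java.util.regex.Pattern;

class MiddleLetterCounter {
    // First letter, two or more middle letters, last letter.
    private static final Pattern WORD = Pattern.compile("(\\p{L})(\\p{L}{2,})(\\p{L})");

    static String abbreviate(String text) {
        // Replace each matched word with first letter + middle length + last letter.
        return WORD.matcher(text)
                .replaceAll(m -> m.group(1) + m.group(2).length() + m.group(3));
    }

    public static void main(String[] args) {
        System.out.println(abbreviate("elephant-rides are really fun!")); // e6t-r3s are r4y fun!
        System.out.println(abbreviate("hello_world"));                    // h3o_w3d
    }
}
```

With the \w version, hello_world would be treated as a single word, because the underscore is a word character.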

For the instructions you gave us, this would be sufficient:
String [] result = string.split("[\\s-]");
for (int i=0; i<result.length; i++){
result[i] = "" + result[i].charAt(0) + ((result[i].length())-2) + result[i].charAt(result[i].length()-1);
}
With your input, it creates the array [ "e6t", "r3s", "a1e", "r4y", "f2!" ]
It also runs on one- or two-letter words, but it gives results such as:
Input: I am a small; Output: [ "I-1I", "a0m", "a-1a", "s3l" ]
Again, for the instructions you gave us this would be legal.
Hope I helped!

Regex pattern to convert comma separated String

Changing string with comma separated values to numbered new-line values
For example:
Input: a,b,c
Output:
1.a
2.b
3.c
I'm finding it hard to do this with a regex pattern instead of converting the string to a string array and looping through it.

I'm not really sure that it's possible to achieve this with only a regex and without any kind of loop. As for me, the solution of splitting the string into an array and iterating over it is the most straightforward:
String value = "a,b,c";
String[] values = value.split(",");
String result = "";
for (int i=1; i<=values.length; i++) {
result += i + "." + values[i-1] + "\n";
}
Sure, it's possible to do it without splitting or any arrays, but the solution is a little awkward, like:
String value = "a,b,c";
// (?m) makes ^ match at the start of every line
Pattern pattern = Pattern.compile("(?m)^\\w+");
Matcher matcher = pattern.matcher(value.replace(",", "\n"));
StringBuffer s = new StringBuffer();
int i = 0;
while (matcher.find()) {
matcher.appendReplacement(s, ++i + "." + matcher.group());
}
matcher.appendTail(s); // keep any text after the last match
System.out.println(s.toString());
Here each , is replaced with a \n newline, and then we look for a group of word characters at the start of every line with (?m)^\\w+. When a group is found, we prepend the line number to it. But even here we have to use a loop to set the line number, and this logic is not as clear as the first one.
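If the goal is mainly to avoid writing the loop by hand, a stream over the indexes is another way to express the split-and-number idea. This is only a sketch assuming Java 8+; NumberedLines and number are hypothetical names, and unlike the first loop it does not append a trailing newline.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class NumberedLines {
    static String number(String csv) {
        String[] values = csv.split(",");
        // Pair each value with its 1-based position and join the lines with newlines.
        return IntStream.range(0, values.length)
                .mapToObj(i -> (i + 1) + "." + values[i])
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(number("a,b,c"));
    }
}
```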

Getting next two words from a given word in string with words containing non alphanumeric characters as well

I have a String as below:
String str = "This is something Total Toys (RED) 300,000.00 (49,999.00) This is something";
Input from user would be a keyword String viz. Total Toys (RED)
I can get the index of the keyword using str.indexOf(keyword);
I can also get the start of the next word by adding length of keyword String to above index.
However, how can I get the next two tokens after the keyword in given String which are the values I want?
if(str.contains(keyWord)){
String Value1 = // what should come here such that value1 is 300,000.00 which is first token after keyword string?
String Value2 = // what should come here such that value2 is (49,999.00) which is second token after keyword string?
}
Context : Read a PDF using PDFBox. The keyword above is the header in first column of a table in the PDF and the next two tokens I want to read are the values in the next two columns on the same row in this table.

You can use regular expressions to do this. This will work for all instances of the keyword that are followed by two tokens; if the keyword is not followed by two tokens, it won't match. However, this is easily adaptable, so please state if you want to match in cases where 0 or 1 tokens follow the keyword.
String regex = "(?i)%s\\s+([\\S]+)\\s+([\\S]+)";
Matcher m = Pattern.compile(String.format(regex, Pattern.quote(keyword))).matcher(str);
while (m.find())
{
System.out.println(m.group(1));
System.out.println(m.group(2));
}
In your example, %s in regex would be replaced by "Total Toys", giving:
300,000.00 49,999.00
(?i) means case-insensitive
\\s means whitespace
\\S means non-whitespace
[...] is a character class
+ means 1 or more
(...) is a capturing group
EDIT: If you want to use a keyword containing characters that are special in regular expressions, you need to use Pattern.quote(). For example, ( and ) are special characters in regex, so a keyword containing them would produce an incorrect regex. Pattern.quote() wraps the keyword in \Q...\E so that every character in it is matched literally.
If you want three groups, use this:
String regex = "%s\\s+([\\S]+)\\s+([\\S]+)(?:\\s+([\\S]+))?";
NB: If only two groups follow, group(3) will be null.
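Putting the pieces together for the keyword from the question, which contains parentheses, a self-contained sketch might look like this. KeywordTokens and nextTwoTokens are illustrative names, and returning null when there is no match is an assumption made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeywordTokens {
    // Returns the two whitespace-separated tokens after the first occurrence
    // of keyword, or null if the keyword is not followed by two tokens.
    static String[] nextTwoTokens(String text, String keyword) {
        // Pattern.quote makes characters such as ( and ) match literally.
        Pattern p = Pattern.compile("(?i)" + Pattern.quote(keyword) + "\\s+(\\S+)\\s+(\\S+)");
        Matcher m = p.matcher(text);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String str = "This is something Total Toys (RED) 300,000.00 (49,999.00) This is something";
        String[] values = nextTwoTokens(str, "Total Toys (RED)");
        System.out.println(values[0]); // 300,000.00
        System.out.println(values[1]); // (49,999.00)
    }
}
```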

Something like this:
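String Value1 = null;
String Value2 = null;
// everything after the keyword
String remainingPart = str.substring(str.indexOf(keyWord) + keyWord.length());
// the default delimiters are whitespace characters
StringTokenizer st = new StringTokenizer(remainingPart);
if (st.hasMoreTokens()) {
Value1 = st.nextToken();
}
if (st.hasMoreTokens()) {
Value2 = st.nextToken();
}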

Try this,
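String str = "This is something Total Toys 300,000.00 49,999.00 This is something";
String keyWord = "Total Toys";
if (str.contains(keyWord)) {
// split() takes a regex, so quote the keyword in case it contains characters such as ( or )
String splitLine = str.split(Pattern.quote(keyWord))[1];
// splitLine starts with a space, so tokens[0] is the empty string
String[] tokens = splitLine.split(" ");
String Value1 = tokens[1];
String Value2 = tokens[2];
}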

Here is something that works given what you have provided:
public static void main(String[] args)
{
String search = "Total Toys";
String str = "This is something Total Toys 300,000.00 49,999.00 This is something";
int index = str.indexOf(search);
index += search.length();
String[] tokens = str.substring(index, str.length()).trim().split(" ");
String val1 = tokens[0];
String val2 = tokens[1];
System.out.println("Val1: " + val1 + ", Val2: " + val2);
}
Output:
Val1: 300,000.00, Val2: 49,999.00

I need to get a substring from a java string Tokenizer

I need to get a substring from a java string tokenizer.
My input string is = Pizza-1*Nutella-20*Chicken-65*
StringTokenizer productsTokenizer = new StringTokenizer("Pizza-1*Nutella-20*Chicken-65*", "*");
do
{
try
{
int pos = productsTokenizer .nextToken().indexOf("-");
String product = productsTokenizer .nextToken().substring(0, pos+1);
String count= productsTokenizer .nextToken().substring(pos, pos+1);
System.out.println(product + " " + count);
}
catch(Exception e)
{
}
}
while(productsTokenizer .hasMoreTokens());
My output must be:
Pizza 1
Nutella 20
Chicken 65
I need the product value and the count value in separate variables to insert that values in the Data Base.
I hope you can help me.

You could use String.split() as
String[] products = "Pizza-1*Nutella-20*Chicken-65*".split("\\*");
for (String product : products) {
String[] prodNameCount = product.split("\\-");
System.out.println(prodNameCount[0] + " " + prodNameCount[1]);
}
Output
Pizza 1
Nutella 20
Chicken 65

You invoke the nextToken() method 3 times. That will get you 3 different tokens
int pos = productsTokenizer .nextToken().indexOf("-");
String product = productsTokenizer .nextToken().substring(0, pos+1);
String count= productsTokenizer .nextToken().substring(pos, pos+1);
Instead you should do something like:
String token = productsTokenizer .nextToken();
int pos = token.indexOf("-");
String product = token.substring(...);
String count= token.substring(...);
I'll let you figure out the proper indexes for the substring() method.
Also instead of using a do/while structure it is better to just use a while loop:
while(productsTokenizer .hasMoreTokens())
{
// add your code here
}
That is, don't assume there is a token.
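For reference, one way the pieces above could fit together is sketched below. ProductTokens and parse are made-up names, and the code assumes every token contains a hyphen; a token without one would make substring() throw.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ProductTokens {
    static List<String> parse(String input) {
        List<String> lines = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, "*");
        while (tokenizer.hasMoreTokens()) {
            // Read the token once and reuse it for both substrings.
            String token = tokenizer.nextToken();
            int pos = token.indexOf('-');
            String product = token.substring(0, pos); // text before the hyphen
            String count = token.substring(pos + 1);  // text after the hyphen
            lines.add(product + " " + count);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : parse("Pizza-1*Nutella-20*Chicken-65*")) {
            System.out.println(line);
        }
    }
}
```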

An alternative answer you may want to use if your input grows:
String products = "Pizza-1*Nutella-20*Chicken-65*";
// find all strings that match START or '*' followed by the name (matched),
// a hyphen and then a positive number (not starting with 0)
Pattern p = Pattern.compile("(?:^|[*])(\\w+)-([1-9]\\d*)");
Matcher finder = p.matcher(products);
while (finder.find()) {
// possibly check if the new match directly follows the previous one
String product = finder.group(1);
int count = Integer.parseInt(finder.group(2));
System.out.printf("Product: %s , count %d%n", product, count);
}

Some people dislike regex, but this is a good application for it. All you need to use is "(\\w+)-(\\d+)\\*" as your pattern. Here's a toy example:
String template = "Pizza-1*Nutella-20*Chicken-65*";
String pattern = "(\\w+)-(\\d+)\\*";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(template);
while(m.find())
{
System.out.println(m.group(1) + " " + m.group(2));
}
To explain this a bit more, "(\\w+)-(\\d+)\\*" looks for a (\\w+), which is any run of at least 1 character from [A-Za-z0-9_], followed by a -, followed by a number \\d+, where the + means at least one character in length, followed by a *, which must be escaped. The parentheses capture what's inside of them. There are two sets of capturing parentheses in this regex, so we reference them by group(1) and group(2) as seen in the while loop, which prints:
Pizza 1
Nutella 20
Chicken 65

Java: Splitting string variable into a string and integer variables?

I have looked everywhere but can't seem to find a solution. Is it possible to separate a string variable such as "A1" into a string "A" and integer 1 variables?

Scan the string from index 0. While you see non-digit characters, keep appending them to a StringBuffer; as soon as you see a digit, add the content of the StringBuffer to a List<String> strings. Do the same for runs of digits, collecting them in a List<String> numbers.
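A rough sketch of that scan follows. It uses a StringBuilder rather than a StringBuffer, and the class name RunSplitter and the method split are assumptions for illustration, not something from the original post.

```java
import java.util.ArrayList;
import java.util.List;

class RunSplitter {
    // Appends each run of non-digits to strings and each run of digits to numbers.
    static void split(String input, List<String> strings, List<String> numbers) {
        StringBuilder run = new StringBuilder();
        boolean runIsDigits = false;
        for (char c : input.toCharArray()) {
            boolean isDigit = Character.isDigit(c);
            // The character kind changed: store the finished run and start a new one.
            if (run.length() > 0 && isDigit != runIsDigits) {
                (runIsDigits ? numbers : strings).add(run.toString());
                run.setLength(0);
            }
            run.append(c);
            runIsDigits = isDigit;
        }
        // Store the last run, if any.
        if (run.length() > 0) {
            (runIsDigits ? numbers : strings).add(run.toString());
        }
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<String> numbers = new ArrayList<>();
        split("A1", strings, numbers);
        System.out.println(strings + " " + numbers); // [A] [1]
    }
}
```

Each entry of numbers can then be passed to Integer.parseInt to get the integer value.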

If the string length is strictly one letter and the number is only 1 digit, you can use String.split(""), however for a more generic solution you can use regex
Sample Code:
Matcher matcher = Pattern.compile("([a-zA-Z]+)(\\d+)").matcher("variable1121");
if (matcher.matches()) {
System.out.println(matcher.group(1) + " , " + matcher.group(2));
}
Output:
variable , 1121

If you have a string variable such as A1 or ABC123, you can try:
String input = "A1";
// split at the position between a letter and a digit
String[] array = input.split("(?<=[a-zA-Z])(?=\\d)");
String str = array[0];
int integer = Integer.parseInt(array[1]);
