Java regex split double from string - java

I am having a problem splitting something like the following string:
43.80USD
What I want is to be able to split the expression into an array that has "43.80" as the first element and "USD" as the second. So the result would be something like:
["43.80", "USD"]
I am sure there is some way to do this with regex, but I am not proficient enough with it to figure it out on my own. Any help would be much appreciated.

If the format of your string is fixed you can split it as follows
String[] currency = "48.50USD".split("(?<=\\d)(?=[a-zA-Z])");
System.out.println("Amount='"+currency[0]+"'; Denomination='"+currency[1]+"'");
// prints: Amount='48.50'; Denomination='USD'
The regex above uses a positive lookbehind (?<=...) and a positive lookahead (?=...) to find a zero-length separator that is preceded by a digit and followed by a letter.
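As a minimal sketch of that lookaround split, the class below wraps it in a helper (the names AmountSplitSketch and splitAmount are made up for illustration). It also shows why it is worth checking the array length before reading index 1: when the input has no digit-to-letter boundary, split returns a single element.

```java
import java.util.Arrays;

final class AmountSplitSketch {
    // Splits at the zero-width position between a digit and an ASCII letter.
    static String[] splitAmount(String text) {
        return text.split("(?<=\\d)(?=[a-zA-Z])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAmount("43.80USD"))); // [43.80, USD]
        // No digit/letter boundary: the input comes back as the only element.
        System.out.println(Arrays.toString(splitAmount("USD")));      // [USD]
    }
}
```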

If your data really looks like "43.80USD" then you can use
"43.80USD".split("(?i)(?=[a-z])",2)
(?=[a-z]) splits before any character in a-z.
(?i) makes the regex case-insensitive, so it also works for uSd.
The second argument is the maximum size of the result array. Since you want ["43.80", "USD"] rather than ["43.80", "U", "S", "D"], it has to be 2.
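To see what the limit argument changes, here is a small sketch (SplitLimitSketch and splitBeforeLetters are illustrative names, not part of the answer) that runs the same pattern with a limit of 0, which is what the one-argument split uses, and with a limit of 2.

```java
import java.util.Arrays;

final class SplitLimitSketch {
    // Splits before every ASCII letter, case-insensitively, with the given limit.
    static String[] splitBeforeLetters(String text, int limit) {
        return text.split("(?i)(?=[a-z])", limit);
    }

    public static void main(String[] args) {
        // Limit 0: split as often as possible.
        System.out.println(Arrays.toString(splitBeforeLetters("43.80uSd", 0))); // [43.80, u, S, d]
        // Limit 2: at most one split, so the currency code stays in one piece.
        System.out.println(Arrays.toString(splitBeforeLetters("43.80uSd", 2))); // [43.80, uSd]
    }
}
```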

This regex works: (\d*\.\d*)([a-zA-Z]*). Group 1 will be the amount, including the decimal point. Group 2 will be USD or whatever other currency name follows. Note that this regex only requires a decimal point; everything else is optional. So it also matches "45123.15542ABCDEFG": group 1 will be 45123.15542 and group 2 will be ABCDEFG. If you want stricter requirements, tell me what they are and I'll put them in. Otherwise your code will look something like:
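import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Note the double \\: the backslash has to be escaped inside a Java string literal.
Pattern p = Pattern.compile("(\\d*\\.\\d*)([a-zA-Z]*)");
Matcher m = p.matcher("43.80USD");
String amount, type;
if (m.matches()) { // matches() is a method and requires the whole input to match
    amount = m.group(1); // "43.80"
    type = m.group(2);   // "USD"
}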
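If stricter requirements are wanted, one possible tightening is sketched below. It assumes, purely for illustration, that an amount always has digits on both sides of the decimal point and that the currency code is exactly three ASCII letters; neither rule comes from the question, and the names MoneyParseSketch and parse are made up.

```java
import java.math.BigDecimal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MoneyParseSketch {
    // Assumed format: digits, a dot, digits, then exactly three ASCII letters.
    private static final Pattern MONEY = Pattern.compile("(\\d+\\.\\d+)([a-zA-Z]{3})");

    // Returns {amount, code}, or null when the text does not fit the assumed format.
    static String[] parse(String text) {
        Matcher m = MONEY.matcher(text);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("43.80USD");
        BigDecimal amount = new BigDecimal(parts[0]); // exact decimal, no double rounding
        System.out.println(amount + " " + parts[1]);  // 43.80 USD
        System.out.println(parse(".USD") == null);    // true
    }
}
```

Parsing the amount with BigDecimal rather than double is a design choice for money values, since it keeps the digits exactly as written.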

Related

Java regex to replace numeric values with quotes numeric values

Can someone please help with a regex for wrapping all integers and doubles within a given String in single quotes:
Key1=a,Key2=2,Key3=999.6,Key4=8888,Key5=true
with this :
Key1=a,Key2='2',Key3='999.6',Key4='8888',Key5=true
I would like to use regex capturing groups to find every numeric string that follows an = and wrap it in single quotes.
Use a lookaround at each end:
String quoted = str.replaceAll("(?<==)\\d+(\\.\\d+)?(?=,|$)", "'$0'");
The entire match, which is group 0, is the number, and the replacement '$0' puts quotes around it.
The match starts with a look behind for an equals and ends with a look ahead for a comma or end of input.
You may try a regex replace all approach here:
String input = "Key1=a,Key2=2,Key3=999.6,Key4=8888,Key5=true";
String output = input.replaceAll("([^,]+)=(\\d+(?:\\.\\d+)?)", "$1='$2'");
System.out.println(output); // Key1=a,Key2='2',Key3='999.6',Key4='8888',Key5=true
Here is an explanation of the regex pattern used:
([^,]+) match and capture the key in $1
= match =
(\\d+(?:\\.\\d+)?) match and capture an integer or float number in $2
Then, we replace with $1='$2', quoting the number value.
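The two replacements can be compared side by side. The sketch below (class and method names are illustrative) runs both on the sample input and on one extra value, Key6=12abc, which is an assumed edge case and not part of the question: the lookaround version leaves it alone because its lookahead demands a comma or the end of input, while the group version quotes the leading digits.

```java
final class QuoteNumbersSketch {
    // Lookbehind for '=', lookahead for ',' or end of input; $0 is the whole match.
    static String quoteWithLookarounds(String s) {
        return s.replaceAll("(?<==)\\d+(\\.\\d+)?(?=,|$)", "'$0'");
    }

    // Captures the key in $1 and the number in $2, then rebuilds the pair.
    static String quoteWithGroups(String s) {
        return s.replaceAll("([^,]+)=(\\d+(?:\\.\\d+)?)", "$1='$2'");
    }

    public static void main(String[] args) {
        String input = "Key1=a,Key2=2,Key3=999.6,Key4=8888,Key5=true";
        // Both print: Key1=a,Key2='2',Key3='999.6',Key4='8888',Key5=true
        System.out.println(quoteWithLookarounds(input));
        System.out.println(quoteWithGroups(input));

        // A value that merely starts with digits is treated differently.
        System.out.println(quoteWithLookarounds("Key6=12abc")); // Key6=12abc
        System.out.println(quoteWithGroups("Key6=12abc"));      // Key6='12'abc
    }
}
```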

Extract substring from end till first alphabet in java

I have a string of format: A-2-Q4567
More examples: AB-456-T12, A24-5-M12345, etc.
I want to extract the last numerical values out of these strings, which are: 4567, 12, 12345 respectively (which is the numerical value of the substring from the end till first non-numeric character is encountered)
I can split the string, take the last element of the resulting array, and then do a parseInt after removing the non-numerical characters from it.
But is there a more elegant way of doing this?
You can use this regex: (\d+$). It returns the last sequence of digits in the string.
EDIT - some explanation:
The \d means any digit.
The + means one or more of the previous symbols. Since the previous symbol is a digit, then \d+ means "one or more digits".
The $ means the end of the string, so \d+$ is the last sequence of digits in the string.
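Put together, the pattern can be used roughly like this. It is only a sketch: TrailingNumberSketch and trailingNumber are illustrative names, and it assumes the trailing digits fit into an int.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TrailingNumberSketch {
    private static final Pattern TRAILING_DIGITS = Pattern.compile("\\d+$");

    // Returns the number at the end of the text, or empty if the text does not end in digits.
    static OptionalInt trailingNumber(String text) {
        Matcher m = TRAILING_DIGITS.matcher(text);
        // find() searches anywhere in the input; the $ pins the match to the end.
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group())) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(trailingNumber("A-2-Q4567"));    // OptionalInt[4567]
        System.out.println(trailingNumber("A24-5-M12345")); // OptionalInt[12345]
        System.out.println(trailingNumber("AB-456-T"));     // OptionalInt.empty
    }
}
```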
You can do this:
String getLastNumeric(String input)
{
    String str = "";
    char c;
    for (int i = input.length() - 1; i >= 0 && Character.isDigit(c = input.charAt(i)); i--)
        str = c + str;
    return str;
}
The regex solutions might be more elegant, but performance-wise I think the above is better, because a regex match can be more expensive than a simple for loop with a simple condition to evaluate.
Of course, the regex is more flexible. What if your requirements change and a dash "-" must now precede the numbers? With a regex it should be just a matter of changing one expression.
Here is the regex version, but remember: if you're sure your requirements won't change, I think the above solution is easier on the CPU:
Matcher matcher= Pattern.compile("(\\d+$)").matcher(input);
if(matcher.find())
return matcher.group();
return "";

Match regex but only replace the first section - Java

I'm trying to take a phone number which can be in the format either +44 or +4 followed by any number of digits or hyphens, and replace the +44 or +4 with +44 or +4 followed by a space.
I believe I need a look around to match the full number but only replace the initial prefix, what I'm trying atm is
^[+]\d[0-9](?:([0-9]+))?
This matches the number (without hyphens). However, I thought the lookahead would only match the number without capturing the extra digits, yet it seems to capture the whole thing.
Can anyone point me in the right direction as to what I've done wrong?
EDIT:
To be clearer my Java code is
Pattern pattern = Pattern.compile("^[+]\\d[0-9](?:([0-9]+))?");
if(pattern.matcher("+441234567890").matches())
String num = pattern.matcher(title).replaceFirst("$0 $1");
Thanks.
If you want to match the whole number but replace only part of it, you should not use a lookahead but plain grouping, as in:
(^\+\d\d)([\d-]+)?
The prefix will be in group 1 and the rest of the number in group 2, so to add a space between these parts, just use group 1 + space + group 2.
In your example it should look like this:
Pattern pattern = Pattern.compile("(^\\+\\d\\d)([\\d-]+)?");
if(pattern.matcher("+441234567890").matches()) {
num = pattern.matcher(title).replaceFirst("$1 $2");
}
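However, this regex always captures two digits in the prefix. If you want to match +44 or +4, you should use:
(^\+(44|4))([\d-]+)?
Note that the inner (44|4) is now group 2, so the rest of the number moves to group 3 and the replacement becomes "$1 $3". If you have more possible prefixes, you need to extend this regex as well.
Your regex didn't work as you expected because (?:([0-9]+))? is a non-capturing group, not a lookahead. It still consumes the digits, so they are part of the whole match, which is what $0 refers to. The inner ([0-9]+) is an ordinary capturing group, so $1 holds the remaining digits. With "$0 $1" you therefore get the whole number followed by a space and the trailing digits again, e.g. "+441234567890 1234567890".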
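A compact way to try this out is sketched below. It is a variation on the answer's pattern, not the answer's exact code: the inner alternative is made non-capturing with (?:...) so the rest of the number stays in group 2, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class PhonePrefixSketch {
    // Group 1: "+44" or "+4" (the inner alternative does not capture).
    // Group 2: the following digits and hyphens, possibly empty.
    private static final Pattern NUMBER = Pattern.compile("^(\\+(?:44|4))([\\d-]*)");

    static String spaceAfterPrefix(String number) {
        // Inputs that do not start with +44 or +4 are returned unchanged.
        return NUMBER.matcher(number).replaceFirst("$1 $2");
    }

    public static void main(String[] args) {
        System.out.println(spaceAfterPrefix("+441234567890")); // +44 1234567890
        System.out.println(spaceAfterPrefix("+4-123-456"));    // +4 -123-456
        System.out.println(spaceAfterPrefix("12345"));         // 12345
    }
}
```

Listing 44 before 4 in the alternation matters: the regex engine tries the alternatives from left to right, so the longer prefix has to come first.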

regex to match a recurring pattern

I am trying to write a regex for java that will match the following string:
number,number,number (it could be this simple or it could have a variable number of numbers, but each number has to have a comma after it there will not be any white space though)
here was my attempt:
[[0-9],[0-9]]+
but it seems to match anything with a number in it
You could try something along the lines of ([0-9]+,)*[0-9]+
This will match:
Only one number, e.g.: 7
Two numbers, e.g.: 7,52
Three numbers, e.g.: 7,52,999
etc.
This will not match:
Things with spaces, e.g.: 7, 52
A list ending with a comma, e.g.: 7,52,
Many other things out of the scope of this problem.
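The cases listed above can be checked with a few lines of Java. This is a minimal sketch with made-up names (NumberListSketch, isNumberList); it relies on matches(), which only succeeds when the whole input fits the pattern.

```java
import java.util.regex.Pattern;

final class NumberListSketch {
    private static final Pattern NUMBER_LIST = Pattern.compile("([0-9]+,)*[0-9]+");

    static boolean isNumberList(String text) {
        // matches() requires the entire input to fit, so no ^ or $ is needed here.
        return NUMBER_LIST.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNumberList("7"));        // true
        System.out.println(isNumberList("7,52,999")); // true
        System.out.println(isNumberList("7, 52"));    // false (space)
        System.out.println(isNumberList("7,52,"));    // false (trailing comma)
    }
}
```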
I think this would work:
\d+,(\d+,)+
Note that, as you wanted, this only matches numbers that are each followed by a comma, and it requires at least two of them.
I guess you are starting with a String. Why don't you just use String.split(",") ?
^ means the start of a string and $ means the end. If you don't use those, you could match something in the middle (b matches inside "abc").
The + applies to the element before it. b is an element, [0-9] is an element, and so are groups (things wrapped in parentheses).
So, the regex you want matches:
The start of the string ^
a digit [0-9]
one or more commas, each followed by a digit (,[0-9])+
the end of the string $
or, ^[0-9](,[0-9])+$
Since [0-9] matches a single digit, use [0-9]+ in both places if the numbers can have several digits: ^[0-9]+(,[0-9]+)+$
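The effect of the anchors is easy to see with find(), which looks for a match anywhere in the input. The sketch below (AnchorSketch and its method names are illustrative) runs the same expression with and without ^ and $. It also shows that [0-9] stands for exactly one digit, so an input such as 10,20 fails the anchored version.

```java
import java.util.regex.Pattern;

final class AnchorSketch {
    private static final Pattern UNANCHORED = Pattern.compile("[0-9](,[0-9])+");
    private static final Pattern ANCHORED = Pattern.compile("^[0-9](,[0-9])+$");

    static boolean foundAnywhere(String text) {
        return UNANCHORED.matcher(text).find();
    }

    static boolean foundAnchored(String text) {
        return ANCHORED.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(foundAnywhere("x1,2,3y")); // true: matches 1,2,3 in the middle
        System.out.println(foundAnchored("x1,2,3y")); // false
        System.out.println(foundAnchored("1,2,3"));   // true
        System.out.println(foundAnchored("10,20"));   // false: [0-9] is a single digit
    }
}
```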
Try the regex [\d,]*, written as the string literal "[\\d,]*", for example:
Pattern p4 = Pattern.compile("[\\d,]*");
Matcher m4 = p4.matcher("12,1212,1212ad,v");
System.out.println(m4.find()); //prints true
System.out.println(m4.group());//prints 12,1212,1212
If you want to require at least one comma and two numbers, e.g. 12,1212, then you may want to use the regex (\d+,)+\d+, written as the string literal "(\\d+,)+\\d+". This regex matches a region made of one or more numbers, each followed by a comma, and then a final number.

Extracting two numbers from a string

I have a String like the following one:
"some value is 25 but must not be bigger then 12"
I want to extract the two numbers from the string.
The numbers are integers.
There might be no text before the first number and some text after the second number.
I tried to do it with a regexp and groups, but failed miserably:
public MessageParser(String message) {
    Pattern stringWith2Numbers = Pattern.compile(".*(\\d?).*(\\d?).*");
    Matcher matcher = stringWith2Numbers.matcher(message);
    if (!matcher.matches()) {
        couldParse = false;
        firstNumber = 0;
        secondNumber = 0;
    } else {
        final String firstNumberString = matcher.group(1);
        firstNumber = Integer.valueOf(firstNumberString);
        final String secondNumberString = matcher.group(2);
        secondNumber = Integer.valueOf(secondNumberString);
        couldParse = true;
    }
}
Any help is appreciated.
Your pattern should look more like:
Pattern stringWith2Numbers = Pattern.compile("\\D*(\\d+)\\D+(\\d+)\\D*");
You need to accept \\d+ because it can be one or more digits.
Your ".*" patterns are being greedy, as is their wont, and are gobbling up as much as they can -- which is going to be the entire string. So that first ".*" is matching the entire string, rendering the rest moot. Also, your "\\d?" clauses indicate a single digit which happens to be optional, neither of which is what you want.
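To make the greedy behaviour visible, here is a small sketch (the class and helper names are made up) that runs the original pattern against the sample message and inspects the two groups. Both come back as empty strings, which is why Integer.valueOf would then throw a NumberFormatException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyGroupsSketch {
    // Returns groups 1 and 2 of a full match, or null when the text does not match.
    static String[] groups(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String message = "some value is 25 but must not be bigger then 12";
        String[] g = groups(".*(\\d?).*(\\d?).*", message);
        // The first .* swallows the whole message, so both optional groups match nothing.
        System.out.println(g[0].isEmpty() + " " + g[1].isEmpty()); // true true
    }
}
```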
This is probably more in line with what you're shooting for:
Pattern stringWith2Numbers = Pattern.compile(".*?(\\d+).*?(\\d+).*?");
Of course, since you don't really care about the stuff before or after the numbers, why bother with them?
Pattern stringWith2Numbers = Pattern.compile("(\\d+).*?(\\d+)");
That ought to do the trick.
Edit: Taking time out from writing butt-kickingly awesome comics, Alan Moore pointed out some problems with my solution in the comments. For starters, if you have only a single multi-digit number in the string, my solution gets it wrong. Applying it to "This 123 is a bad string" would cause it to return "12" and "3" when it ought to simply fail. A better regex would stipulate that there MUST be at least one non-digit character separating the two numbers, like so:
Pattern stringWith2Numbers = Pattern.compile("(\\d+)\\D+(\\d+)");
Also, matches() applies the pattern to the entire string, essentially bracketing it in ^ and $; find() would do the trick, but that's not what the OP was using. So sticking with matches(), we'd need to bring back in those "useless" clauses in front of and after the two numbers. (Though having them explicitly match non-digits instead of the wildcard is better form.) So it would look like:
Pattern stringWith2Numbers = Pattern.compile("\\D*(\\d+)\\D+(\\d+)\\D*");
... which, it must be noted, is damn near identical to jjnguy's answer.
Your regex matches, but everything gets eaten up by your first .* and the rest matches the empty string.
Change your regex to "\\D*(\\d+)\\D+(\\d+)\\D*".
This should be read as: any number of non-digits, then at least one digit, followed by at least one character that isn't a digit, followed by at least one digit, and finally any number of non-digits.
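As a self-contained sketch of that pattern in use (TwoNumbersSketch and parse are illustrative names, and the numbers are assumed to fit into an int), the helper below returns both values or null when the message does not contain exactly two runs of digits.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TwoNumbersSketch {
    private static final Pattern TWO_NUMBERS = Pattern.compile("\\D*(\\d+)\\D+(\\d+)\\D*");

    // Returns {first, second}, or null when the message does not hold exactly two digit runs.
    static int[] parse(String message) {
        Matcher m = TWO_NUMBERS.matcher(message);
        if (!m.matches()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("some value is 25 but must not be bigger then 12"))); // [25, 12]
        // A single multi-digit number is not split into "12" and "3"; the match fails.
        System.out.println(parse("This 123 is a bad string") == null); // true
    }
}
```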
