Parse a string in Java - java

I have strings formatted similar to the one below in a Java program. I need to get the number out.
Host is up (0.0020s latency).
I need the number between the '(' and the 's' characters. E.g., I would need the 0.0020 in this example.

If you are sure it will always be the first number you could use the regular expresion \d+\.\d+ (but note that the backslashes need to be escaped in Java string literals).
Try this code:
String input = "Host is up (0.0020s latency).";
Pattern pattern = Pattern.compile("\\d+\\.\\d+");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
System.out.println(matcher.group());
}
See it working online: ideone
You could also include some of the surrounding characters in the regular expression to reduce the risk of matching the wrong number. To do exactly as you requested in the question (i.e. matching between ( and s) use this regular expression:
\((\d+\.\d+)s
See it working online: ideone

Sounds like a case for regular expressions.
You'll want to match for the decimal figure and then parse that match:
Float matchedValue;
Pattern pattern = Pattern.compile("\\d*\\.\\d+");
Matcher matcher = pattern.matcher(yourString);
boolean isfound = matcher.find();
if (isfound) {
matchedValue = Float.valueOf(matcher.group(0));
}

It depends on how "similar" you mean. You could potentially use a regular expression:
import java.math.BigDecimal;
import java.util.regex.*;
public class Test {
public static void main(String args[]) throws Exception {
Pattern pattern = Pattern.compile("[^(]*\\(([0-9]*\\.[0-9]*)s");
String text = "Host is up (0.0020s latency).";
Matcher match = pattern.matcher(text);
if (match.lookingAt())
{
String group = match.group(1);
System.out.println("Before parsing: " + group);
BigDecimal value = new BigDecimal(group);
System.out.println("Parsed: " + value);
}
else
{
System.out.println("No match");
}
}
}
Quite how specific you want to make your pattern is up to you, of course. This only checks for digits, a dot, then digits after an opening bracket and before an s. You may need to refine it to make the dot optional etc.

This is a great site for building regular expressions from simple to very complex. You choose the language and boom.
http://txt2re.com/

Here's a way without regex
String str = "Host is up (0.0020s latency).";
str = str.substring(str.indexOf('(')+1, str.indexOf("s l"));
System.out.println(str);

Of course using regular expressions in this case is best solution but in many simple cases you can use also something like :
String value = myString.subString(myString.indexOf("("), myString.lastIndexOf("s"))
double numericValue = Double.parseDouble(value);
This is not recomended because text in myString can changes.

Related

Java Regular Expression to check for fixed length and more

I am not even sure if regular expressions are the best way to do this. Here is the requirement on a string:
To check length is 13 characters
First and Last 2 characters are always characters only.
Characters from 3 - 11 are numeric.
Please suggest whether regular expression is the best way to do it and what the regular expression would like to check such a thing?
Regards
Akhil
Use e.g.
"^[a-z]{2}[0-9]{9}[a-z]{2}$"
The square brackets say what is allowed, 'a-z' means small alphabetics between a and z. The curly says how many must be there. ^ means no characters before this, and $ means no characters after.
Usage:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class MatcherExample {
public static void main(String[] args) {
String text = "aa123456789bb";
String patternString = "^[a-z]{2}[0-9]{9}[a-z]{2}$";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
boolean matches = matcher.matches();
System.out.println("Matches: " + matches);
}
}

Regular expression -- ignore part of a matched string

I have text that I want to fined some thing like this
"Name DAVID"
I want to match "DAVID" in this, larger, text.
I tried use a regular expression like this:
(Name(.*))
and also
(?:Name(.*))
but this also matched "Name," and I only want to match "David".
Just drop the extra parens:
"Name (.*)"
Even that is probably excessive, you probably want something more like:
"Name (\w*)"
to catch exactly the characters that you want.
You need to use a Matcher. This snippet of code worked for me
public static void main(String[] asdf) {
String text = "NAME David";
Pattern p = Pattern.compile("NAME (.+)");
Matcher m = p.matcher(text);
if (m.matches()){
System.out.println(m.group(1));
}
}
Note that m.matches() is mandatory or m.group(1) will throw java.lang.IllegalStateException: No match found
String's matches()
public static void main(String[] args) {
String regex = "^Name(.+)$";
System.out.println("Name".matches(regex));
System.out.println("Name MIKE".matches(regex));
System.out.println("Name DAVID".matches(regex));
}
For some reason, it doesn't look like your question had yet been answered with a correct regex for what you requested.
Here is what you are looking for:
(?<=Name )DAVID
This only matches DAVID, in the proper context (see demo).
You probably know this, but this is a way to test a string with this regex:
Pattern regex = Pattern.compile("(?<=Name )DAVID");
Matcher regexMatcher = regex.matcher(subjectString);
foundMatch = regexMatcher.find();
Explain Regex
(?<= # look behind to see if there is:
Name # 'Name '
) # end of look-behind
DAVID # 'DAVID'

Find a number of a given number of digits between given separators

What regex/pattern can I use to find the following pattern in a string?
#nnnn:
nnnn can be any 4-digit long number as long as it is sorrounded by a hashtag and a colon.
I have tried the code below:
String string = "#8226:";
if(string.matches( ".*\\d:.*" )) {
System.out.println( "Yes" );
}
It DOES work, but it matches other strings like below:
"This is a string 1234: Hahaha!" // Outputs "Yes"
"Hello 1834: World!!!" // Outputs "Yes"
I want it to only match the pattern at the top of the question.
Can anybody tell me where did I go wrong?
It can be done with Regular Expression
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FindPattern {
public static void main(String[] args) {
Pattern pattern = Pattern.compile("#[0-9]{4}:");
String text = "#1233:#3433:abc#3993: #a343:___#8888:ki";
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
output is:
#1233:
#3433:
#3993:
#8888:
You have already a pattern: #nnnn:. The only problem is that this is not a java compatible regular expression. Let's convert.
# and : are valid character literals, so let these untouched.
As you probably know (according to your solution), a number is denoted with the \d sequence (note, there are some alternatives, e. g. [0-9], \p{Digit}). Just replace all ns with \d:
#\d\d\d\d:
There are four equal subpatterns here, so we can shorten it with a fixed quantifier:
#\d{4}:
You can now write string.matches("#\\d{4}:"). Note that this is slow because compiles the given regex pattern every time. If this code is called frequently, I would consider using a precompiled Pattern like:
Pattern HASH_NUMBER_COLON_PATTERN = Pattern.compile("#\\d{4}:");
// ...
if (HASH_NUMBER_COLON_PATTERN.matcher(yourString).matches()) {
// ...
}
Even better to use some regular expression builder library, such as regex-builder, JavaVerbalExpressions or RegexBee. These tools can make your intention very clear. RegexBee example:
Pattern HASH_NUMBER_COLON_PATTERN = Bee
.then(Bee.fixedChar('#'))
.then(Bee.intBetween(1000, 9999))
.then(Bee.fixedChar(':'))
.toPattern()

java regular expression

Can anyone please help me do the following in a java regular expression?
I need to read 3 characters from the 5th position from a given String ignoring whatever is found before and after.
Example : testXXXtest
Expected result : XXX
You don't need regex at all.
Just use substring: yourString.substring(4,7)
Since you do need to use regex, you can do it like this:
Pattern pattern = Pattern.compile(".{4}(.{3}).*");
Matcher matcher = pattern.matcher("testXXXtest");
matcher.matches();
String whatYouNeed = matcher.group(1);
What does it mean, step by step:
.{4} - any four characters
( - start capturing group, i.e. what you need
.{3} - any three characters
) - end capturing group, you got it now
.* followed by 0 or more arbitrary characters.
matcher.group(1) - get the 1st (only) capturing group.
You should be able to use the substring() method to accomplish this:
string example = "testXXXtest";
string result = example.substring(4,7);
This might help: Groups and capturing in java.util.regex.Pattern.
Here is an example:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Example {
public static void main(String[] args) {
String text = "This is a testWithSomeDataInBetweentest.";
Pattern p = Pattern.compile("test([A-Za-z0-9]*)test");
Matcher m = p.matcher(text);
if (m.find()) {
System.out.println("Matched: " + m.group(1));
} else {
System.out.println("No match.");
}
}
}
This prints:
Matched: WithSomeDataInBetween
If you don't want to match the entire pattern rather to the input string (rather than to seek a substring that would match), you can use matches() instead of find(). You can continue searching for more matching substrings with subsequent calls with find().
Also, your question did not specify what are admissible characters and length of the string between two "test" strings. I assumed any length is OK including zero and that we seek a substring composed of small and capital letters as well as digits.
You can use substring for this, you don't need a regex.
yourString.substring(4,7);
I'm sure you could use a regex too, but why if you don't need it. Of course you should protect this code against null and strings that are too short.
Use the String.replaceAll() Class Method
If you don't need to be performance optimized, you can try the String.replaceAll() class method for a cleaner option:
String sDataLine = "testXXXtest";
String sWhatYouNeed = sDataLine.replaceAll( ".{4}(.{3}).*", "$1" );
References
https://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html
http://www.vogella.com/tutorials/JavaRegularExpressions/article.html#using-regular-expressions-with-string-methods

How do I make a regex match for measurement units?

I'm building a small Java library which has to match units in strings. For example, if I have "300000000 m/s^2", I want it to match against "m" and "s^2".
So far, I have tried most imaginable (by me) configurations resembling (I hope it's a good start)
"[[a-zA-Z]+[\\^[\\-]?[0-9]+]?]+"
To clarify, I need something that will match letters[^[-]numbers] (where [ ] denotes non obligatory parts). That means: letters, possibly followed by an exponent which is possibly negative.
I have studied regex a little bit, but I'm really not fluent, so any help will be greatly appreciated!
Thank you very much,
EDIT:
I have just tried the first 3 replies
String regex1 = "([a-zA-Z]+)(?:\\^(-?\\d+))?";
String regex2 = "[a-zA-Z]+(\\^-?[0-9]+)?";
String regex3 = "[a-zA-Z]+(?:\\^-?[0-9]+)?";
and it doesn't work... I know the code which tests the patterns work, because if I try something simple, like matching "[0-9]+" in "12345", it will match the whole string. So, I don't get what's still wrong. I'm trying with changing my brackets for parenthesis where needed at the moment...
CODE USED TO TEST:
public static void main(String[] args) {
String input = "30000 m/s^2";
// String input = "35345";
String regex1 = "([a-zA-Z]+)(?:\\^(-?\\d+))?";
String regex2 = "[a-zA-Z]+(\\^-?[0-9]+)?";
String regex3 = "[a-zA-Z]+(?:\\^-?[0-9]+)?";
String regex10 = "[0-9]+";
String regex = "([a-zA-Z]+)(?:\\^\\-?[0-9]+)?";
Pattern pattern = Pattern.compile(regex3);
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
System.out.println("MATCHES");
do {
int start = matcher.start();
int end = matcher.end();
// System.out.println(start + " " + end);
System.out.println(input.substring(start, end));
} while (matcher.find());
}
}
([a-zA-Z]+)(?:\^(-?\d+))?
You don't need to use the character class [...] if you're matching a single character. (...) here is a capturing bracket for you to extract the unit and exponent later. (?:...) is non-capturing grouping.
You're mixing the use of square brackets to denote character classes and curly brackets to group. Try this instead:
[a-zA-Z]+(\^-?[0-9]+)?
In many regular expression dialects you can use \d to mean any digit instead of [0-9].
Try
"[a-zA-Z]+(?:\\^-?[0-9]+)?"

Categories