I am not even sure if regular expressions are the best way to do this. Here is the requirement on a string:
The length must be 13 characters.
The first 2 and last 2 characters are always letters.
Characters 3 to 11 are numeric.
Please suggest whether a regular expression is the best way to do this, and what the regular expression would look like to check such a thing.
Regards
Akhil
Use e.g.
"^[a-z]{2}[0-9]{9}[a-z]{2}$"
The square brackets say what is allowed; 'a-z' means lowercase letters from a to z. The curly braces say how many must be there. ^ means no characters before this, and $ means no characters after. The usage below passes Pattern.CASE_INSENSITIVE, so uppercase letters are accepted as well.
Usage:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class MatcherExample {
public static void main(String[] args) {
String text = "aa123456789bb";
String patternString = "^[a-z]{2}[0-9]{9}[a-z]{2}$";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
boolean matches = matcher.matches();
System.out.println("Matches: " + matches);
}
}
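To see what the flag changes, here is a small sketch that compiles the same pattern with and without Pattern.CASE_INSENSITIVE. The class and method names (CodeFormat, isLowerOnly, isAnyCase) are made up for illustration and are not from the question.

```java
import java.util.regex.Pattern;

class CodeFormat {
    // Same regex as above, compiled once with and once without the flag.
    static final Pattern LOWER_ONLY =
            Pattern.compile("^[a-z]{2}[0-9]{9}[a-z]{2}$");
    static final Pattern ANY_CASE =
            Pattern.compile("^[a-z]{2}[0-9]{9}[a-z]{2}$", Pattern.CASE_INSENSITIVE);

    // Hypothetical helpers: true when the whole string has the 2+9+2 layout.
    static boolean isLowerOnly(String s) {
        return LOWER_ONLY.matcher(s).matches();
    }

    static boolean isAnyCase(String s) {
        return ANY_CASE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLowerOnly("aa123456789bb")); // lowercase letters, 13 chars
        System.out.println(isLowerOnly("AA123456789BB")); // uppercase, no flag
        System.out.println(isAnyCase("AA123456789BB"));   // uppercase, with flag
        System.out.println(isAnyCase("aa12345678bb"));    // only 12 chars
    }
}
```

Compiling the pattern once into a constant also avoids recompiling it on every check.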
I am using a pattern that contains a backslash (\) written with a single escape sequence, but it is not working at all. I am getting "no match" as the result.
package com.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TestClassRegex {
private static final String VALIDATION = "^[0-9\\-]+$";
public static void main(String[] args) {
String line = "1234\56";
Pattern r = Pattern.compile(VALIDATION);
Matcher m = r.matcher(line);
if (m.matches()) {
System.out.println("match");
}
else {
System.out.println("no match !!");
}
}
}
How can I write a pattern which recognizes a backslash literally?
I have actually seen another post:
Java regular expression value.split("\\."), "the back slash dot" divides by character?
which doesn't answer my question completely, so I need some pointers here.
"1234\56" will not produce the string 1234\56 but instead "1234."
Why?
In a Java string literal, a \ followed by octal digits is an octal escape for a character. Here \56 is read as octal 56, which is character number 46 in the ASCII table, the period (.).
That's exactly the reason why you're not getting a match here.
Solution
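First of all, change your regex literal to "^[0-9\\\\-]+$". In a Java string literal every \ must itself be escaped, so "\\\\" reaches the regex engine as \\, which matches one literal backslash. Your original literal, "^[0-9\\-]+$", reaches the engine as ^[0-9\-]+$ and only allows digits and hyphens.
Your input literal needs to be written as "1234\\56" for the same reason.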
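A minimal sketch of both points, assuming the goal is to accept digits, hyphens and literal backslashes. The class name BackslashRegexDemo and the helper isValid are made up for this example.

```java
import java.util.regex.Pattern;

class BackslashRegexDemo {
    // The Java literal "^[0-9\\\\-]+$" is the regex ^[0-9\\-]+$ :
    // a character class of digits, a literal backslash, and a hyphen.
    static final Pattern VALIDATION = Pattern.compile("^[0-9\\\\-]+$");

    // Hypothetical helper: true when the whole line matches the pattern.
    static boolean isValid(String line) {
        return VALIDATION.matcher(line).matches();
    }

    public static void main(String[] args) {
        String octal = "1234\56";    // \56 is an octal escape, so this is 1234.
        String escaped = "1234\\56"; // this is the 7-character string 1234\56
        System.out.println(octal + " -> " + isValid(octal));
        System.out.println(escaped + " -> " + isValid(escaped));
    }
}
```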
I'm looking for the regex pattern, not the Java code, to match the last word in an English (or European language) sentence. If the last word is, in this case, "hi" then I want to match "hi" and not "hi."
The regex (\w+)\.$ will match "hi.", whereas the output should be just "hi". What's the correct regex?
thufir#dur:~/NetBeansProjects/regex$
thufir#dur:~/NetBeansProjects/regex$ java -jar dist/regex.jar
trying
a b cd efg hi
matches:
hi
trying
a b cd efg hi.
matches:
thufir#dur:~/NetBeansProjects/regex$
code:
package regex;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
String matchesLastWordFine = "a b cd efg hi";
lastWord(matchesLastWordFine);
String noMatchFound = matchesLastWordFine + ".";
lastWord(noMatchFound);
}
private static void lastWord(String sentence) {
System.out.println("\n\ntrying\n" + sentence + "\nmatches:");
Pattern pattern = Pattern.compile("(\\w+)$");
Matcher matcher = pattern.matcher(sentence);
String match = null;
while (matcher.find()) {
match = matcher.group();
System.out.println(match);
}
}
}
My code is in Java, but that's neither here nor there. I'm strictly looking for the regex, not the Java code. (Yes, I know it's possible to strip out the last character with Java.)
What regex should I put in the pattern?
You can use a lookahead assertion. For example, to match the sentence without the period:
[\w\s]+(?=\.)
and, for just the last word (the word before "."):
[\w]+(?=\.)
If you need to have the whole match be the last word you can use lookahead.
\w+(?=(\.))
This matches a set of word characters that are followed by a period, without matching the period.
If you want the last word in the line, regardless of whether the line ends at the end of a sentence or not, you can use:
\w+(?=(\.?$))
Or if you want to also include ,!;: etc then
\w+(?=(\p{Punct}?$))
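Here is a small Java sketch of that last pattern, in case it helps to see it run (the question only asked for the regex, so treat the wrapper as illustration). The class and method names are made up, and the inner capturing group is dropped because only the whole match is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastWordLookahead {
    // \w+ followed by optional punctuation and the end of the input.
    // The lookahead checks the punctuation without making it part of the match.
    static final Pattern LAST_WORD = Pattern.compile("\\w+(?=\\p{Punct}?$)");

    // Hypothetical helper: the last word, or null when there is none.
    static String lastWord(String sentence) {
        Matcher m = LAST_WORD.matcher(sentence);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(lastWord("a b cd efg hi"));
        System.out.println(lastWord("a b cd efg hi."));
        System.out.println(lastWord("Is it over there?"));
    }
}
```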
You can use matcher.group(1) to get the content of the first capturing group ((\w+) in your case). To say a little more, matcher.group(0) returns the full match. So your regex is almost correct. One improvement relates to your use of $, which matches the end of the line. Use it only if your sentence fills the line exactly!
With the regular expression (\w+)\p{Punct} you get a group count of 1, which means matcher.group(0) holds the match with the punctuation and matcher.group(1) holds it without the punctuation.
To write the regular expression in Java, use: "(\\w+)\\p{Punct}"
To test your regular expressions online with Java (and actually a lot of other languages) see RegexPlanet
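As a quick illustration of group(0) versus group(1) with that expression, here is a sketch. The class name and the firstMatch helper are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupDemo {
    static final Pattern WORD_THEN_PUNCT = Pattern.compile("(\\w+)\\p{Punct}");

    // Hypothetical helper: {full match, first group} of the first hit, or null.
    static String[] firstMatch(String sentence) {
        Matcher m = WORD_THEN_PUNCT.matcher(sentence);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] groups = firstMatch("a b cd efg hi.");
        System.out.println("group(0): " + groups[0]); // word plus punctuation
        System.out.println("group(1): " + groups[1]); // word only
    }
}
```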
By using the $ operator you will only get a match at the end of a line. So if you have multiple sentences on one line you will not get a match in the middle one.
So you should just use:
(\w+)\.
the capture group will give the correct match.
You can see an example here
I don't understand why really, but this works:
package regex;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
String matchesLastWordFine = "a b cd efg hi";
lastWord(matchesLastWordFine);
String noMatchFound = matchesLastWordFine + ".";
lastWord(noMatchFound);
}
private static void lastWord(String sentence) {
System.out.println("\n\ntrying\n" + sentence + "\nmatches:");
Pattern pattern = Pattern.compile("(\\w+)"); //(\w+)\.
Matcher matcher = pattern.matcher(sentence);
String match = null;
while (matcher.find()) {
match = matcher.group();
}
System.out.println(match);
}
}
I guess regex \w+ will match all the words (doh). Then the last word is what I was after. Too simple, really, I was trying to exclude punctuation, but I guess regex does that automagically for you..?
What regex/pattern can I use to find the following pattern in a string?
#nnnn:
nnnn can be any 4-digit number, as long as it is surrounded by a hash sign and a colon.
I have tried the code below:
String string = "#8226:";
if(string.matches( ".*\\d:.*" )) {
System.out.println( "Yes" );
}
It DOES work, but it matches other strings like below:
"This is a string 1234: Hahaha!" // Outputs "Yes"
"Hello 1834: World!!!" // Outputs "Yes"
I want it to only match the pattern at the top of the question.
Can anybody tell me where I went wrong?
It can be done with Regular Expression
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FindPattern {
public static void main(String[] args) {
Pattern pattern = Pattern.compile("#[0-9]{4}:");
String text = "#1233:#3433:abc#3993: #a343:___#8888:ki";
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
output is:
#1233:
#3433:
#3993:
#8888:
You already have a pattern: #nnnn:. The only problem is that it is not a Java-compatible regular expression. Let's convert it.
# and : are valid character literals, so leave these untouched.
As you probably know (judging by your attempt), a digit is denoted by the \d sequence (note that there are alternatives, e.g. [0-9] or \p{Digit}). Just replace each n with \d:
#\d\d\d\d:
There are four equal subpatterns here, so we can shorten it with a fixed quantifier:
#\d{4}:
You can now write string.matches("#\\d{4}:"). Note that this is slow because it compiles the given regex pattern on every call. If this code is called frequently, I would consider using a precompiled Pattern like:
Pattern HASH_NUMBER_COLON_PATTERN = Pattern.compile("#\\d{4}:");
// ...
if (HASH_NUMBER_COLON_PATTERN.matcher(yourString).matches()) {
// ...
}
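For what it's worth, here is a runnable sketch of the precompiled variant that also shows the difference between matches() (the whole string must be #nnnn:) and find() (the pattern may occur anywhere in the string). The class name and the two helper methods are assumptions made for the example.

```java
import java.util.regex.Pattern;

class HashNumberColon {
    // Compiled once and reused for every check.
    private static final Pattern HASH_NUMBER_COLON_PATTERN =
            Pattern.compile("#\\d{4}:");

    // Hypothetical helper: the whole string must be exactly #nnnn:
    static boolean isExactly(String s) {
        return HASH_NUMBER_COLON_PATTERN.matcher(s).matches();
    }

    // Hypothetical helper: #nnnn: may appear anywhere in the string.
    static boolean contains(String s) {
        return HASH_NUMBER_COLON_PATTERN.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isExactly("#8226:"));
        System.out.println(contains("Hello 1834: World!!!"));
        System.out.println(contains("abc#3993: and more"));
    }
}
```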
It may be even better to use a regular expression builder library, such as regex-builder, JavaVerbalExpressions or RegexBee. These tools can make your intention very clear. RegexBee example:
Pattern HASH_NUMBER_COLON_PATTERN = Bee
.then(Bee.fixedChar('#'))
.then(Bee.intBetween(1000, 9999))
.then(Bee.fixedChar(':'))
.toPattern();
Can anyone please help me do the following in a java regular expression?
I need to read 3 characters from the 5th position from a given String ignoring whatever is found before and after.
Example : testXXXtest
Expected result : XXX
You don't need regex at all.
Just use substring: yourString.substring(4,7)
Since you do need to use regex, you can do it like this:
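Pattern pattern = Pattern.compile(".{4}(.{3}).*");
Matcher matcher = pattern.matcher("testXXXtest");
String whatYouNeed = null;
// group(1) throws IllegalStateException unless a match succeeded first
if (matcher.matches()) {
    whatYouNeed = matcher.group(1); // "XXX"
}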
What does it mean, step by step:
.{4} - any four characters
( - start capturing group, i.e. what you need
.{3} - any three characters
) - end capturing group, you got it now
.* - followed by 0 or more arbitrary characters
matcher.group(1) - get the 1st (only) capturing group.
You should be able to use the substring() method to accomplish this:
String example = "testXXXtest";
String result = example.substring(4, 7); // "XXX"
This might help: Groups and capturing in java.util.regex.Pattern.
Here is an example:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Example {
public static void main(String[] args) {
String text = "This is a testWithSomeDataInBetweentest.";
Pattern p = Pattern.compile("test([A-Za-z0-9]*)test");
Matcher m = p.matcher(text);
if (m.find()) {
System.out.println("Matched: " + m.group(1));
} else {
System.out.println("No match.");
}
}
}
This prints:
Matched: WithSomeDataInBetween
If you want to match the entire input string against the pattern (rather than search for a substring that matches), you can use matches() instead of find(). You can continue searching for more matching substrings with subsequent calls to find().
Also, your question did not specify what are admissible characters and length of the string between two "test" strings. I assumed any length is OK including zero and that we seek a substring composed of small and capital letters as well as digits.
You can use substring for this, you don't need a regex.
yourString.substring(4,7);
I'm sure you could use a regex too, but why, if you don't need it? Of course, you should protect this code against null and against strings that are too short.
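One possible way to add that protection is sketched below. Returning null for bad input is just one choice made for this example (throwing an exception or returning an Optional would work too), and the class and method names are invented.

```java
class SafeSubstring {
    // Returns the 3 characters starting at the 5th position (indices 4 to 6),
    // or null when the input is null or shorter than 7 characters.
    static String fifthToSeventh(String s) {
        if (s == null || s.length() < 7) {
            return null;
        }
        return s.substring(4, 7);
    }

    public static void main(String[] args) {
        System.out.println(fifthToSeventh("testXXXtest"));
        System.out.println(fifthToSeventh("short"));
    }
}
```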
Use the String.replaceAll() Method
If performance is not critical, you can try the String.replaceAll() instance method for a more compact option:
String sDataLine = "testXXXtest";
String sWhatYouNeed = sDataLine.replaceAll( ".{4}(.{3}).*", "$1" );
References
https://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html
http://www.vogella.com/tutorials/JavaRegularExpressions/article.html#using-regular-expressions-with-string-methods
I have strings formatted similar to the one below in a Java program. I need to get the number out.
Host is up (0.0020s latency).
I need the number between the '(' and the 's' characters. E.g., I would need the 0.0020 in this example.
If you are sure it will always be the first number, you could use the regular expression \d+\.\d+ (but note that the backslashes need to be escaped in Java string literals).
Try this code:
String input = "Host is up (0.0020s latency).";
Pattern pattern = Pattern.compile("\\d+\\.\\d+");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
System.out.println(matcher.group());
}
See it working online: ideone
You could also include some of the surrounding characters in the regular expression to reduce the risk of matching the wrong number. To do exactly as you requested in the question (i.e. matching between ( and s) use this regular expression:
\((\d+\.\d+)s
See it working online: ideone
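In Java, that second expression could be used roughly as follows. This is a sketch: the class name LatencyParser and the extract helper are made up, and the number is read from capturing group 1 so the surrounding ( and s are not included.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LatencyParser {
    // Regex \((\d+\.\d+)s written as a Java string literal.
    static final Pattern LATENCY = Pattern.compile("\\((\\d+\\.\\d+)s");

    // Hypothetical helper: the number between '(' and 's', or null if absent.
    static String extract(String line) {
        Matcher m = LATENCY.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String value = extract("Host is up (0.0020s latency).");
        System.out.println(value);
        System.out.println(Double.parseDouble(value));
    }
}
```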
Sounds like a case for regular expressions.
You'll want to match for the decimal figure and then parse that match:
Float matchedValue = null; // stays null when no number is found
Pattern pattern = Pattern.compile("\\d*\\.\\d+");
Matcher matcher = pattern.matcher(yourString);
boolean isfound = matcher.find();
if (isfound) {
matchedValue = Float.valueOf(matcher.group(0));
}
It depends on how "similar" you mean. You could potentially use a regular expression:
import java.math.BigDecimal;
import java.util.regex.*;
public class Test {
public static void main(String args[]) throws Exception {
Pattern pattern = Pattern.compile("[^(]*\\(([0-9]*\\.[0-9]*)s");
String text = "Host is up (0.0020s latency).";
Matcher match = pattern.matcher(text);
if (match.lookingAt())
{
String group = match.group(1);
System.out.println("Before parsing: " + group);
BigDecimal value = new BigDecimal(group);
System.out.println("Parsed: " + value);
}
else
{
System.out.println("No match");
}
}
}
Quite how specific you want to make your pattern is up to you, of course. This only checks for digits, a dot, then digits after an opening bracket and before an s. You may need to refine it to make the dot optional etc.
This is a great site for building regular expressions from simple to very complex. You choose the language and boom.
http://txt2re.com/
Here's a way without regex
String str = "Host is up (0.0020s latency).";
str = str.substring(str.indexOf('(')+1, str.indexOf("s l"));
System.out.println(str);
Of course, using regular expressions is the best solution in this case, but in many simple cases you can also use something like:
String value = myString.substring(myString.indexOf("(") + 1, myString.lastIndexOf("s"));
double numericValue = Double.parseDouble(value);
This is not recommended, because the text in myString can change.