How to check whether a string starts with a number? - java

I have a string that contains alphanumeric characters.
I need to check whether the string starts with a number.
Thanks,

See the isDigit(char ch) method:
https://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Character.html
and pass it the first character of the String, obtained with the String.charAt() method:
Character.isDigit(myString.charAt(0));
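One caveat: charAt(0) throws a StringIndexOutOfBoundsException on an empty string. A minimal sketch of a guarded version follows; the class and method names (DigitPrefix, startsWithDigit) are made up for illustration and are not part of any library.

```java
// Hypothetical helper; the names are illustrative only.
class DigitPrefix {
    // True if s is non-null, non-empty, and its first char is a digit.
    static boolean startsWithDigit(String s) {
        return s != null && !s.isEmpty() && Character.isDigit(s.charAt(0));
    }

    public static void main(String[] args) {
        System.out.println(startsWithDigit("9Hello")); // true
        System.out.println(startsWithDigit("Hello9")); // false
        System.out.println(startsWithDigit(""));       // false, no exception
    }
}
```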

Sorry I didn't see your Java tag, was reading question only. I'll leave my other answers here anyway since I've typed them out.
Java:
String myString = "9Hello World!";
if ( Character.isDigit(myString.charAt(0)) )
{
System.out.println("String begins with a digit");
}
C++:
string myString = "2Hello World!";
if (isdigit( myString[0]) )
{
printf("String begins with a digit");
}
Regular expression:
\b[0-9]
Some proof my regex works: Unless my test data is wrong?

I think you ought to use a regex:
import java.util.regex.*;
public class Test {
public static void main(String[] args) {
String neg = "-123abc";
String pos = "123abc";
String non = "abc123";
/* I'm not sure if this regex is too verbose, but it should be
* clear. It checks that the string starts with either a series
* of one or more digits... OR a negative sign followed by 1 or
* more digits. Anything can follow the digits. Update as you need
* for things that should not follow the digits or for floating
* point numbers.
*/
Pattern pattern = Pattern.compile("^(\\d+.*|-\\d+.*)");
Matcher matcher = pattern.matcher(neg);
if(matcher.matches()) {
System.out.println("matches negative number");
}
matcher = pattern.matcher(pos);
if (matcher.matches()) {
System.out.println("positive matches");
}
matcher = pattern.matcher(non);
if (!matcher.matches()) {
System.out.println("letters don't match :-)!!!");
}
}
}
You may want to adjust this to accept floating point numbers, but this will work for negatives. Other answers won't work for negatives because they only check the first character! Be more specific about your needs and I can help you adjust this approach.
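If only the prefix matters, Matcher.lookingAt() may be a simpler fit than matches() with a trailing .*: it anchors at the start of the input but does not require the whole string to match. A sketch, assuming that "number" means an optional minus sign, digits, and an optional fractional part (that definition and the names NumericPrefix and startsWithNumber are my own assumptions):

```java
import java.util.regex.Pattern;

class NumericPrefix {
    // Assumed definition of "number": optional '-', digits, optional ".digits".
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(\\.\\d+)?");

    // lookingAt() matches from the start of the input, but not necessarily to its end.
    static boolean startsWithNumber(String s) {
        return NUMBER.matcher(s).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(startsWithNumber("-123abc")); // true
        System.out.println(startsWithNumber("123abc"));  // true
        System.out.println(startsWithNumber("abc123"));  // false
    }
}
```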

This should work:
String s = "123foo";
Character.isDigit(s.charAt(0));

System.out.println(Character.isDigit(mystring.charAt(0)));
EDIT: I searched the Java docs, looked at the methods on the String class that can get the first character, and looked at the methods on the Character class to see whether one of them checks for a digit.
I think you could have done the same before asking.
EDIT 2: What I mean is: try things, read and search, and if you still can't find anything, ask.
I made a mistake when I first posted this: isDigit is a static method on the Character class.

Use a regex like ^\d

Related

How to check if a string contains only digits in Java

In Java, the String class has a method called matches. How can I use this method with a regular expression to check whether my string contains only digits? I tried the examples below, but both of them returned false.
String regex = "[0-9]";
String data = "23343453";
System.out.println(data.matches(regex));
String regex = "^[0-9]";
String data = "23343453";
System.out.println(data.matches(regex));
Try
String regex = "[0-9]+";
or
String regex = "\\d+";
As per Java regular expressions, the + means "one or more times" and \d means "a digit".
Note: the "double backslash" is an escape sequence to get a single backslash - therefore, \\d in a java String gives you the actual result: \d
References:
Java Regular Expressions
Java Character Escape Sequences
Edit: due to some confusion in other answers, I am writing a test case and will explain some more things in detail.
Firstly, if you are in doubt about the correctness of this solution (or others), please run this test case:
String regex = "\\d+";
// positive test cases, should all be "true"
System.out.println("1".matches(regex));
System.out.println("12345".matches(regex));
System.out.println("123456789".matches(regex));
// negative test cases, should all be "false"
System.out.println("".matches(regex));
System.out.println("foo".matches(regex));
System.out.println("aa123bb".matches(regex));
Question 1:
Isn't it necessary to add ^ and $ to the regex, so it won't match "aa123bb" ?
No. In Java, the matches method (which was specified in the question) matches a complete string, not fragments. In other words, it is not necessary to use ^\\d+$ (even though it is also correct). Please see the last negative test case.
Please note that if you use an online "regex checker" then this may behave differently. To match fragments of a string in Java, you can use the find method instead, described in detail here:
Difference between matches() and find() in Java Regex
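To make the difference concrete, here is a small sketch using the same \\d+ pattern; the class and method names are mine and purely illustrative.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // matches(): the whole input must match the pattern.
    static boolean wholeMatch(String s) {
        return DIGITS.matcher(s).matches();
    }

    // find(): some fragment of the input must match the pattern.
    static boolean anywhere(String s) {
        return DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("aa123bb")); // false
        System.out.println(anywhere("aa123bb"));   // true
    }
}
```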
Question 2:
Won't this regex also match the empty string, ""?
No. A regex \\d* would match the empty string, but \\d+ does not. The star * means zero or more, whereas the plus + means one or more. Please see the first negative test case.
Question 3:
Isn't it faster to compile a regex Pattern?
Yes. It is indeed faster to compile a regex Pattern once, rather than on every invocation of matches, and so if performance implications are important then a Pattern can be compiled and used like this:
Pattern pattern = Pattern.compile(regex);
System.out.println(pattern.matcher("1").matches());
System.out.println(pattern.matcher("12345").matches());
System.out.println(pattern.matcher("123456789").matches());
You can also use NumberUtils.isNumber(String str) from Apache Commons Lang.
Using regular expressions is costly in terms of performance. Trying to parse the string as a long value is inefficient and unreliable, and may not be what you need.
What I suggest is to simply check whether each character is a digit, which can be done efficiently using Java 8 lambda expressions:
boolean isNumeric = someString.chars().allMatch(x -> Character.isDigit(x));
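Two caveats, as far as I can tell: allMatch returns true for an empty stream, so the one-liner reports "" as numeric, and Character.isDigit also accepts non-ASCII digits such as the Arabic-Indic ones. A sketch of a stricter variant, assuming only ASCII 0-9 should count (the names DigitCheck and isAsciiDigits are illustrative):

```java
class DigitCheck {
    // True only for non-empty strings made up entirely of ASCII digits 0-9.
    static boolean isAsciiDigits(String s) {
        return s != null && !s.isEmpty()
                && s.chars().allMatch(c -> c >= '0' && c <= '9');
    }

    public static void main(String[] args) {
        System.out.println(isAsciiDigits("23343453")); // true
        System.out.println(isAsciiDigits(""));         // false
        System.out.println(isAsciiDigits("12a"));      // false
    }
}
```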
One more solution that hasn't been posted yet:
String regex = "\\p{Digit}+"; // uses POSIX character class
You must allow for more than one digit (the + quantifier), as in:
String regex = "[0-9]+";
String data = "23343453";
System.out.println(data.matches(regex));
Long.parseLong(data)
and catch the NumberFormatException; this also handles a leading minus sign.
Although the number of digits is limited by the range of long, this actually produces a usable value, which is, I would imagine, the most common use case.
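Spelled out, that could look something like the sketch below (ParseCheck and isLong are names I picked for illustration):

```java
class ParseCheck {
    // True if s parses as a long: optional sign, digits, within the range of long.
    static boolean isLong(String s) {
        try {
            Long.parseLong(s);
            return true;
        } catch (NumberFormatException e) {
            // Thrown for null, empty, non-numeric, or out-of-range input.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLong("-42"));                  // true
        System.out.println(isLong("12a"));                  // false
        System.out.println(isLong("99999999999999999999")); // false, exceeds long
    }
}
```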
We can use either Pattern.compile("[0-9]+(\\.[0-9]+)?") or Pattern.compile("\\d+(\\.\\d+)?"). They have the same meaning.
The pattern [0-9] means a digit, the same as '\d'.
'+' means one or more occurrences.
'\.' is a literal decimal point (an unescaped '.' would match any character), and the optional group '(...)?' lets the pattern accept both integers and floats.
Try the following code:
import java.util.regex.Pattern;
public class PatternSample {
public boolean containNumbersOnly(String source){
boolean result = false;
Pattern pattern = Pattern.compile("[0-9]+(\\.[0-9]+)?"); // matches both floats and integers
pattern = Pattern.compile("\\d+(\\.\\d+)?"); // equivalent pattern, written with \d
result = pattern.matcher(source).matches();
if(result){
System.out.println("\"" + source + "\"" + " is a number");
}else
System.out.println("\"" + source + "\"" + " is a String");
return result;
}
public static void main(String[] args){
PatternSample obj = new PatternSample();
obj.containNumbersOnly("123456.a");
obj.containNumbersOnly("123456 ");
obj.containNumbersOnly("123456");
obj.containNumbersOnly("0123456.0");
obj.containNumbersOnly("0123456a.0");
}
}
Output:
"123456.a" is a String
"123456 " is a String
"123456" is a number
"0123456.0" is a number
"0123456a.0" is a String
According to Oracle's Java Documentation:
private static final Pattern NUMBER_PATTERN = Pattern.compile(
"[\\x00-\\x20]*[+-]?(NaN|Infinity|((((\\p{Digit}+)(\\.)?((\\p{Digit}+)?)" +
"([eE][+-]?(\\p{Digit}+))?)|(\\.((\\p{Digit}+))([eE][+-]?(\\p{Digit}+))?)|" +
"(((0[xX](\\p{XDigit}+)(\\.)?)|(0[xX](\\p{XDigit}+)?(\\.)(\\p{XDigit}+)))" +
"[pP][+-]?(\\p{Digit}+)))[fFdD]?))[\\x00-\\x20]*");
boolean isNumber(String s){
return NUMBER_PATTERN.matcher(s).matches();
}
Refer to org.apache.commons.lang3.StringUtils
public static boolean isNumeric(CharSequence cs) {
if (cs == null || cs.length() == 0) {
return false;
} else {
int sz = cs.length();
for(int i = 0; i < sz; ++i) {
if (!Character.isDigit(cs.charAt(i))) {
return false;
}
}
return true;
}
}
In Java, the String class has a method called matches(). With this method you can validate your string against a regular expression.
String regex = "^[\\d]{4}$";
String value = "1234";
System.out.println(value.matches(regex));
The explanation of the above regular expression is:
^ - Anchors the match at the start of the input.
[] - A character class; inside it you list the characters you want to allow.
\\d - Only allows digits. You can use '\\d' or 0-9 inside the brackets; both are the same.
{4} - Requires exactly 4 digits. You can change the number according to your needs.
$ - Anchors the match at the end of the input.
Note: You can remove the {4} and specify + which means one or more times, or * which means zero or more times, or ? which means once or not at all.
For more reference please go through this website: https://www.rexegg.com/regex-quickstart.html
Official regex way
I would use this regex for integers:
^(0|-?[1-9]\d*)$
This will also work in other programming languages because it's more specific and doesn't make any assumptions about how different programming languages may interpret or handle regex.
Also works in Java
\\d+
Questions regarding ^ and $
As @vikingsteve has pointed out, in Java the matches method matches a complete string, not parts of a string. In other words, it is unnecessary to use ^\d+$ (even though it is the official way of regex).
Online regex checkers usually search for a match anywhere in the input, so they will behave differently from Java's matches method.
Try this part of code:
void containsOnlyNumbers(String str)
{
try {
Integer num = Integer.valueOf(str);
System.out.println("is a number");
} catch (NumberFormatException e) {
// TODO: handle exception
System.out.println("is not a number");
}
}

java regular expression not working

I am trying to take input from the user and search for matches of that input.
For example, if the user types: A*B*C*
I want to find all words that start with A and contain B and C.
I tried this code and it's not working (the output is false):
public static void main(String[] args)
{
String envVarRegExp = "^A[^\r\n]B[^\r\n]C[^\r\n]";
Pattern pattern = Pattern.compile(envVarRegExp);
Matcher matcher = pattern.matcher("AmBmkdCkk");
System.out.println(matcher.find());
}
Thanks.
You don't really need Regex here. Simple String class methods will work: -
String str = "AfasdBasdfCa";
if (str.startsWith("A") && str.contains("B") && str.contains("C")) {
System.out.println("true");
}
Note that this will not ensure that your B and C are in a specific order, which I assume you don't need as you have not mentioned anything about that.
If you want them to be in some order (like B coming before C), then use this regex: -
if (str.matches("^A.*B.*C.*$")) {
System.out.println("true");
}
Note that . will match any character except a newline. So you can use it instead of [^\r\n]; it's clearer. And you need to use the quantifier * because you need to match any number of characters before B or C is found.
Also, String.matches matches the complete string, and hence the anchors at the ends.
I think you should use the * quantifier in your regex like this (for 0 or more characters between A and B, and then between B and C):
String envVarRegExp = "^A[^\r\n]*B[^\r\n]*C";
EDIT: It appears that you're working with input from your user, where the user can use an asterisk * in the input. If that is the case, consider this:
String envVarRegExp = userInput.replace("*", ".*?");
Where userInput is a String like this:
String userInput = "a*b*c*d*e";
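One thing to watch with a plain replace is that any other regex metacharacters the user types (dots, brackets, and so on) stay live. A rough sketch that quotes the literal pieces and only treats * as a wildcard; the class name Wildcard and its methods are hypothetical, and I'm assuming the whole candidate string should match:

```java
import java.util.regex.Pattern;

class Wildcard {
    // Builds a regex from user input in which '*' means "any run of characters".
    static Pattern toPattern(String userInput) {
        StringBuilder regex = new StringBuilder();
        // Limit -1 keeps trailing empty pieces, so a trailing '*' is not lost.
        String[] parts = userInput.split("\\*", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            // Quote each literal piece so characters like '.' are not special.
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String userInput, String candidate) {
        return toPattern(userInput).matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("A*B*C*", "AmBmkdCkk")); // true
        System.out.println(matches("a.c*", "abcd"));        // false, '.' is literal
    }
}
```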
You need to add quantifiers to your character classes:
String envVarRegExp = "^A[^\r\n]*B[^\r\n]*C[^\r\n]*$";

Find a number of a given number of digits between given separators

What regex/pattern can I use to find the following pattern in a string?
#nnnn:
nnnn can be any 4-digit number, as long as it is surrounded by a hash sign and a colon.
I have tried the code below:
String string = "#8226:";
if(string.matches( ".*\\d:.*" )) {
System.out.println( "Yes" );
}
It DOES work, but it matches other strings like below:
"This is a string 1234: Hahaha!" // Outputs "Yes"
"Hello 1834: World!!!" // Outputs "Yes"
I want it to only match the pattern at the top of the question.
Can anybody tell me where I went wrong?
It can be done with a regular expression:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FindPattern {
public static void main(String[] args) {
Pattern pattern = Pattern.compile("#[0-9]{4}:");
String text = "#1233:#3433:abc#3993: #a343:___#8888:ki";
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
output is:
#1233:
#3433:
#3993:
#8888:
You already have a pattern: #nnnn:. The only problem is that it is not a Java-compatible regular expression. Let's convert it.
# and : are valid character literals, so leave these untouched.
As you probably know (judging by your solution), a digit is denoted by the \d sequence (note that there are some alternatives, e.g. [0-9], \p{Digit}). Just replace all ns with \d:
#\d\d\d\d:
There are four equal subpatterns here, so we can shorten it with a fixed quantifier:
#\d{4}:
You can now write string.matches("#\\d{4}:"). Note that this is slow because it compiles the given regex pattern on every call. If this code is called frequently, I would consider using a precompiled Pattern like:
Pattern HASH_NUMBER_COLON_PATTERN = Pattern.compile("#\\d{4}:");
// ...
if (HASH_NUMBER_COLON_PATTERN.matcher(yourString).matches()) {
// ...
}
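If you also need the number itself, or the marker can appear inside longer text, a capturing group together with find() is one way to go. A sketch only; the class and method names (HashNumbers, extract) are invented for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashNumbers {
    // Group 1 captures the four digits between '#' and ':'.
    private static final Pattern HASH_NUMBER = Pattern.compile("#(\\d{4}):");

    // Collects every 4-digit number that appears as #nnnn: anywhere in the text.
    static List<Integer> extract(String text) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = HASH_NUMBER.matcher(text);
        while (m.find()) {
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extract("#8226:"));               // [8226]
        System.out.println(extract("Hello 1834: World!!!")); // []
    }
}
```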
It is even better to use a regular expression builder library, such as regex-builder, JavaVerbalExpressions or RegexBee. These tools can make your intention very clear. RegexBee example:
Pattern HASH_NUMBER_COLON_PATTERN = Bee
.then(Bee.fixedChar('#'))
.then(Bee.intBetween(1000, 9999))
.then(Bee.fixedChar(':'))
.toPattern();

How to replace last dot in a string using a regular expression?

I'm trying to replace the last dot in a String using a regular expression.
Let's say I have the following String:
String string = "hello.world.how.are.you!";
I want to replace the last dot with an exclamation mark such that the result is:
"hello.world.how.are!you!"
I have tried various expressions using the method String.replaceAll(String, String) without any luck.
One way would be:
string = string.replaceAll("^(.*)\\.(.*)$","$1!$2");
Alternatively you can use negative lookahead as:
string = string.replaceAll("\\.(?!.*\\.)","!");
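A quick way to convince yourself that the two expressions agree is to run both on a few inputs. The wrapper below is only a sketch, and the names LastDot, viaGroups and viaLookahead are mine:

```java
class LastDot {
    // Greedy (.*) pushes the matched dot as far right as possible.
    static String viaGroups(String s) {
        return s.replaceAll("^(.*)\\.(.*)$", "$1!$2");
    }

    // Matches a dot only if no other dot follows it.
    static String viaLookahead(String s) {
        return s.replaceAll("\\.(?!.*\\.)", "!");
    }

    public static void main(String[] args) {
        String input = "hello.world.how.are.you!";
        System.out.println(viaGroups(input));    // hello.world.how.are!you!
        System.out.println(viaLookahead(input)); // hello.world.how.are!you!
    }
}
```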
Although you can use a regex, it's sometimes best to step back and just do it the old-fashioned way. I've always been of the belief that, if you can't think of a regex to do it in about two minutes, it's probably not suited to a regex solution.
No doubt you'll get some wonderful regex answers here. Some of them may even be readable :-)
You can use lastIndexOf to get the last occurrence and substring to build a new string. This complete program shows how:
public class testprog {
public static String morph (String s) {
int pos = s.lastIndexOf(".");
if (pos >= 0)
return s.substring(0,pos) + "!" + s.substring(pos+1);
return s;
}
public static void main(String args[]) {
System.out.println (morph("hello.world.how.are.you!"));
System.out.println (morph("no dots in here"));
System.out.println (morph(". first"));
System.out.println (morph("last ."));
}
}
The output is:
hello.world.how.are!you!
no dots in here
! first
last !
The regex you need is \\.(?=[^.]*$). The ?= is a lookahead assertion:
"hello.world.how.are.you!".replaceAll("\\.(?=[^.]*$)", "!")
Try this:
string = string.replaceAll("[.](?=[^.]*$)", "!");

Regular Expression problem in Java

I am trying to create a regular expression for the replaceAll method in Java. The test string is abXYabcXYZ and the pattern is abc. I want to replace any symbol except the pattern with +. For example the string abXYabcXYZ and pattern [^(abc)] should return ++++abc+++, but in my case it returns ab++abc+++.
public static String plusOut(String str, String pattern) {
pattern= "[^("+pattern+")]" + "".toLowerCase();
return str.toLowerCase().replaceAll(pattern, "+");
}
public static void main(String[] args) {
String text = "abXYabcXYZ";
String pattern = "abc";
System.out.println(plusOut(text, pattern));
}
When I try to replace the pattern with + there is no problem - abXYabcXYZ with pattern (abc) returns abxy+xyz. Pattern (^(abc)) returns the string without replacement.
Is there any other way to write NOT(regex) or group symbols as a word?
What you are trying to achieve is pretty tough with regular expressions, since there is no way to express “replace strings not matching a pattern”. You will have to use a “positive” pattern, telling what to match instead of what not to match.
Furthermore, you want to replace every character with a replacement character, so you have to make sure that your pattern matches exactly one character. Otherwise, you will replace whole strings with a single character, returning a shorter string.
For your toy example, you can use negative lookaheads and lookbehinds to achieve the task, but this may be more difficult for real-world examples with longer or more complex strings, since you will have to consider each character of your string separately, along with its context.
Here is the pattern for “not ‘abc’”:
[^abc]|a(?!bc)|(?<!a)b|b(?!c)|(?<!ab)c
It consists of five sub-patterns, connected with “or” (|), each matching exactly one character:
[^abc] matches every character except a, b or c
a(?!bc) matches a if it is not followed by bc
(?<!a)b matches b if it is not preceded with a
b(?!c) matches b if it is not followed by c
(?<!ab)c matches c if it is not preceded with ab
The idea is to match every character that is not in your target word abc, plus every word character that, according to the context, is not part of your word. The context can be examined using negative lookaheads (?!...) and lookbehinds (?<!...).
You can imagine that this technique will fail once you have a target word containing one character more than once, like example. It is pretty hard to express “match e if it is not followed by x and not preceded by l”.
Especially for dynamic patterns, it is by far easier to do a positive search and then replace every character that did not match in a second pass, as others have suggested here.
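For what it's worth, the five-part pattern above can be dropped straight into replaceAll. A quick sketch, hard-coded for the word abc exactly as in this answer (the class name PlusOutLookaround is just illustrative):

```java
class PlusOutLookaround {
    // Matches any single character that is not part of an occurrence of "abc".
    private static final String NOT_ABC =
            "[^abc]|a(?!bc)|(?<!a)b|b(?!c)|(?<!ab)c";

    // Lower-cases the input (as the question does) and pluses out everything but "abc".
    static String plusOut(String str) {
        return str.toLowerCase().replaceAll(NOT_ABC, "+");
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abXYabcXYZ")); // ++++abc+++
    }
}
```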
[^ ... ] will match one character that is not any of ...
So your pattern "[^(abc)]" is saying "match one character that is not a, b, c or the left or right bracket"; and indeed that is what happens in your test.
It is hard to say "replace all characters that are not part of the string 'abc'" in a single trivial regular expression. What you might do instead to achieve what you want could be some nasty thing like
while the input string still contains "abc"
find the next occurrence of "abc"
append to the output a string containing as many "+"s as there are characters before the "abc"
append "abc" to the output string
skip, in the input string, to a position just after the "abc" found
append to the output a string containing as many "+"s as there are characters left in the input
or possibly if the input alphabet is restricted you could use regular expressions to do something like
replace all occurrences of "abc" with a single character that does not occur anywhere in the existing string
replace all other characters with "+"
replace all occurrences of the target character with "abc"
which will be more readable but may not perform as well
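The second outline (swap in a placeholder, plus out everything else, swap back) might look roughly like this. It is only a sketch: it assumes the NUL character never occurs in the input and that the target word is non-empty, and the class name is made up.

```java
class PlaceholderPlusOut {
    // Assumption: '\0' (NUL) never appears in the input string.
    private static final String PLACEHOLDER = "\0";

    static String plusOut(String str, String word) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        // 1. Collapse every occurrence of the word into the placeholder.
        String marked = str.replace(word, PLACEHOLDER);
        // 2. Turn every remaining character into '+'.
        String plussed = marked.replaceAll("[^\\x00]", "+");
        // 3. Expand the placeholder back into the word.
        return plussed.replace(PLACEHOLDER, word);
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abxyabcxyz", "abc")); // ++++abc+++
    }
}
```

Note that String.replace works on literal text, so the word needs no regex escaping here.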
Negating regexps is usually troublesome. I think you might want to use negative lookahead. Something like this might work:
String pattern = "(?<!ab).(?!abc)";
I didn't test it, so it may not really work for degenerate cases. And the performance might be horrible too. It is probably better to use a multistep algorithm.
Edit: No I think this won't work for every case. You will probably spend more time debugging a regexp like this than doing it algorithmically with some extra code.
Try to solve it without regular expressions:
String out = "";
int i;
for(i=0; i<text.length() - pattern.length() + 1; ) {
if (text.substring(i, i + pattern.length()).equals(pattern)) {
out += pattern;
i += pattern.length();
}
else {
out += "+";
i++;
}
}
for(; i<text.length(); i++) {
out += "+";
}
Rather than a single replaceAll, you could always try something like:
@Test
public void testString() {
final String in = "abXYabcXYabcHIH";
final String expected = "xxxxabcxxabcxxx";
String result = replaceUnwanted(in);
assertEquals(expected, result);
}
private String replaceUnwanted(final String in) {
final Pattern p = Pattern.compile("(.*?)(abc)([^a]*)");
final Matcher m = p.matcher(in);
final StringBuilder out = new StringBuilder();
while (m.find()) {
out.append(m.group(1).replaceAll(".", "x"));
out.append(m.group(2));
out.append(m.group(3).replaceAll(".", "x"));
}
return out.toString();
}
Instead of using replaceAll(...), I'd go for a Pattern/Matcher approach:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static String plusOut(String str, String pattern) {
StringBuilder builder = new StringBuilder();
String regex = String.format("((?:(?!%s).)++)|%s", pattern, pattern);
Matcher m = Pattern.compile(regex).matcher(str.toLowerCase());
while(m.find()) {
builder.append(m.group(1) == null ? pattern : m.group().replaceAll(".", "+"));
}
return builder.toString();
}
public static void main(String[] args) {
String text = "abXYabcXYZ";
String pattern = "abc";
System.out.println(plusOut(text, pattern));
}
}
Note that you'll need to use Pattern.quote(...) if your String pattern contains regex meta-characters.
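As a sketch of what that could look like, here is the same approach with the word passed through Pattern.quote before it is spliced into the regex. The class name QuotedPlusOut is illustrative, lower-casing is left out to keep it short, and a non-empty word is assumed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedPlusOut {
    static String plusOut(String str, String word) {
        // Quote the word so characters like '.' or '(' are matched literally.
        String quoted = Pattern.quote(word);
        // Group 1: a run of characters at which the word does not start; otherwise the word itself.
        String regex = "((?:(?!" + quoted + ").)++)|" + quoted;
        Matcher m = Pattern.compile(regex).matcher(str);
        StringBuilder builder = new StringBuilder();
        while (m.find()) {
            builder.append(m.group(1) == null ? word : m.group().replaceAll(".", "+"));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abcxa.cy", "a.c")); // ++++a.c+
        System.out.println(plusOut("f(x)", "("));       // +(++
    }
}
```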
Edit: I didn't see a Pattern/Matcher approach was already suggested by toolkit (although slightly different)...
