Java better way to code this simple parameter intake? - java

"Employee identification number (a string) in the format XXX-L, where each X is a digit within the range 0-9 and the L is a letter within the range ‘A’-‘M’ (both lowercase and uppercase letters are acceptable)"
The above is a field that will be an argument for the constructor. Right now, I'm planning to check that the first 3 characters of the string are digits between 0 and 9, that the fourth character is a dash, and that the fifth character is a letter between A and M, all using if-else statements. Is there a better way of doing this, for example so that the input doesn't have to be so exact and the program is able to fix it by itself? Thank you.
I coded it and tried regex expression tools:
import java.util.regex.*;
public class Employee {
private String eName;
private String IDNumber;
public Employee(String name, String number) {
String regex = "[0-9][0-9][0-9][\\-][a-mA-M]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(number);
this.eName = name;
if(matcher.matches()) {
this.IDNumber = number;
} else{
this.IDNumber = "999-M";
}
}
public String getNumber() {
System.out.println(IDNumber);
return IDNumber;
}
public static void main(String[] args) {
Employee e = new Employee("John", "123-f");
e.getNumber();
Employee c = new Employee("Jane","25z");
c.getNumber();
}
}
I haven't thoroughly tested it, but it works. Looking at other people's regular expressions, though, mine seems very newbish. I was wondering if someone could help me construct a shorter or better regular expression.

^\\d{3}-[a-mA-M]$
This should be an improvement I think.
^ means the start of the text
\d means any digit (in a Java string literal the backslash itself needs to be escaped, hence \\d)
{3} means previous match 3 times
the hyphen is a literal, as long as it isn't in square brackets
[a-mA-M] means any upper or lower case letter as you knew
$ means the end of the text
I tested it on regexpal.com.
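As a side note, matches() already requires the whole input to match, so the ^ and $ anchors are optional when the pattern is used that way. Here is a minimal sketch of the validation on its own; the class name EmployeeIds and the helper isValidId are made up for illustration and are not part of the question's code:

```java
import java.util.regex.Pattern;

class EmployeeIds {
    // Hypothetical helper class. The pattern is compiled once and reused.
    // No ^ or $ here: Matcher.matches() only succeeds on the entire input.
    private static final Pattern ID_FORMAT = Pattern.compile("\\d{3}-[a-mA-M]");

    static boolean isValidId(String id) {
        return id != null && ID_FORMAT.matcher(id).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidId("123-f")); // true
        System.out.println(isValidId("25z"));   // false
        System.out.println(isValidId("123-n")); // false, n is outside A-M
    }
}
```

The constructor could call such a helper and decide what to do with an invalid value (fall back to a default, as in the question, or throw an exception).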

Related

How to extract a number from a string in a particular format?

I have a String like the ones shown below. From each of these strings I need to extract the number 123. It can be at any position, as shown below, but there will be only one number in a string and it will always be in the same format, _number_.
text_data_123
text_data_123_abc_count
text_data_123_abc_pqr_count
text_tery_qwer_data_123
text_tery_qwer_data_123_count
text_tery_qwer_data_123_abc_pqr_count
Below is the code:
String value = "text_data_123_abc_count";
// the code below will not work because index 2 is not a number in some of the examples above
int textId = Integer.parseInt(value.split("_")[2]);
What is the best way to do this?
With a little guava magic:
String value = "text_data_123_abc_count";
Integer id = Ints.tryParse(CharMatcher.inRange('0', '9').retainFrom(value));
see also CharMatcher doc
\\d+
this regex with find should do it for you.
Use lookbehind and lookahead assertions.
// (?=_|$) also accepts a number at the very end of the string, e.g. text_data_123
Matcher m = Pattern.compile("(?<=_)\\d+(?=_|$)").matcher(value);
while(m.find())
{
System.out.println(m.group());
}
You can use replaceAll to remove all non-digits to leave only one number (since you say there will be only 1 number in the input string):
String s = "text_data_123_abc_count".replaceAll("[^0-9]", "");
See IDEONE demo
Instead of [^0-9] you can use \D (which also means non-digit):
String s = "text_data_123_abc_count".replaceAll("\\D", "");
Given current requirements and restrictions, the replaceAll solution seems the most convenient (no need to use Matcher directly).
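To check that claim against every sample from the question, here is a small sketch. The class name ExtractId and the method extractId are invented for this example:

```java
class ExtractId {
    // Hypothetical helper: strips every non-digit and parses what is left.
    // This relies on the input containing exactly one number; an input such as
    // "a1_b2" would yield 12, and an input without digits would throw
    // NumberFormatException.
    static int extractId(String value) {
        return Integer.parseInt(value.replaceAll("\\D", ""));
    }

    public static void main(String[] args) {
        String[] samples = {
            "text_data_123",
            "text_data_123_abc_count",
            "text_data_123_abc_pqr_count",
            "text_tery_qwer_data_123",
            "text_tery_qwer_data_123_count",
            "text_tery_qwer_data_123_abc_pqr_count"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + extractId(s)); // each line ends in 123
        }
    }
}
```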
You can split the string into its parts and compare each part with its uppercase version; if they are equal, the part contains no lowercase letters, so you can parse it as a number and save it:
public class Main {
public static void main(String[] args) {
String txt = "text_tery_qwer_data_123_abc_pqr_count";
String[] words = txt.split("_");
int num = 0;
for (String t : words) {
if(t.equals(t.toUpperCase())) // compare contents, not references
num = Integer.parseInt(t);
}
System.out.println(num);
}
}

Get a substring of a string made of xCharsxInts

I have a list of constants:
public static final String INSTANCE_PREFIX = "in";
public static final String INDICATOR_PREFIX = "i";
public static final String MODEL_PREFIX = "m";
...
They have variable lengths and are put in front of a number; the result is a variable's id. For example, it could be in30 or i2 or m4353. I am trying to make the method as abstract as possible to account for x letters followed by x numbers. The letters are always going to be some prefix that is inside of my Constants.java, so I know that much, but the method won't know which combination it's working with.
I just want the number attached to the end. For example, I want to pass in the m4353 from above and just get back the 4353. Whether it uses the constants file or not is not relevant, but I include them as they may be useful for some approach.
It seems to me like you don't care about the prefixes at all, so I have ignored them in this answer. If you do care about the prefixes, please scroll down to the second half of this answer:
This code uses regular expressions to extract the trailing numbers at the end of a string.
() represents a capturing group (used by m.group(1));
[0-9]+ represents a String of digits of at least 1 in length
$ represents the end of the string, guaranteeing the numbers are only the ones at the end.
Here is the code:
private static final Pattern p = Pattern.compile("([0-9]+)$");
public static int extractNumber(String value) {
Matcher m = p.matcher(value);
if(m.find()) {
return Integer.parseInt(m.group(1));
} else {
return Integer.MIN_VALUE; // error code
}
}
Demo.
If you want to capture the prefix, you could use Pattern.compile("^([a-z]+)([0-9]+)$") instead.
Note that the numbers are now the second group, so they would be captured in m.group(2), and the prefix would be captured in m.group(1).
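A short sketch of the two-group variant, assuming lowercase prefixes as in the constants from the question. The class PrefixedId and the method split are hypothetical names chosen for this example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefixedId {
    private static final Pattern ID = Pattern.compile("^([a-z]+)([0-9]+)$");

    // Hypothetical helper: returns {prefix, digits}, or null if the id
    // does not have the letters-then-digits shape.
    static String[] split(String id) {
        Matcher m = ID.matcher(id);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("m4353");
        System.out.println(parts[0]);                   // m
        System.out.println(Integer.parseInt(parts[1])); // 4353
    }
}
```

The returned prefix could then be compared against the constants (INSTANCE_PREFIX and so on) if the caller needs to know which kind of id it received.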
Try the String replaceAll method
For example:
String x = "prefix1111111";
x = x.replaceAll("\\D", "");
int justNum = Integer.parseInt(x);
where "\\D" is any non-digit character. So it deletes all non-digits in your string.
Note that you might want to use Long.parseLong or Double.parseDouble (and the associated primitive types) instead if your numbers can be longer than 9 digits, as Java ints can only hold values up to 2147483647.

java regex match any integer or double then replace non number/decimal characters

I am trying to match a string to any integer or double then, if it does not match, I want to remove all invalid characters to make the string a valid integer or double (or empty string). So far, this is what I have but it will print 15- which is not valid
String anchorGuyField = "15-";
if(!anchorGuyField.matches("-?\\d+(.\\d+)?")){ //match integer or double
anchorGuyField = anchorGuyField.replaceAll("[^-?\\d+(.\\d+)?]", ""); //attempt to replace invalid chars... failing here
}
You can use Pattern and Matcher to check whether the string is suitable for conversion to an int or a double:
import java.util.regex.*;
public class Match{
public static void main(String[] args){
String anchorGuyField = "asdasda-15.56757-asdasd";
if(!anchorGuyField.matches("(-?\\d+(\\.\\d+)?)")){ //match integer or double
Pattern pattern = Pattern.compile("(-?\\d+(\\.\\d+)?)");
Matcher matcher = pattern.matcher(anchorGuyField);
if(matcher.find()){
anchorGuyField = anchorGuyField.substring(matcher.start(),matcher.end());
}
}
System.out.println(anchorGuyField);
}
}
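With:
anchorGuyField = anchorGuyField.replaceAll("[^-?\\d+(.\\d+)?]", "");
the square brackets turn the whole expression into a negated character class rather than a negated pattern. It removes every character that is not a digit or one of - ? + ( . ), so for 15- nothing is removed and the trailing - stays.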
A negated character class matches any single character that is not listed in it. 15- only contains digits and a minus sign, all of which are listed, hence nothing matches the second regex and nothing is replaced. Maybe you could use something other than a regex to filter out characters.
Check whether the first character is either a minus sign or a digit, and remove it if it is not; then remove all non-numeric characters from the rest.
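One way to read that suggestion is the character-by-character filter below. It is only a sketch: the class NumberSanitizer and the method sanitize are made-up names, and keeping the first decimal point is an assumption added here so that doubles survive the cleanup.

```java
class NumberSanitizer {
    // Hypothetical helper: keeps a minus sign only in first position,
    // keeps ASCII digits, and keeps only the first decimal point.
    static String sanitize(String input) {
        StringBuilder out = new StringBuilder();
        boolean seenDot = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '-' && i == 0) {
                out.append(c);
            } else if (c >= '0' && c <= '9') {
                out.append(c);
            } else if (c == '.' && !seenDot) {
                seenDot = true;
                out.append(c);
            }
            // every other character is dropped
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("15-"));      // 15
        System.out.println(sanitize("-1a5.5.0")); // -15.50
    }
}
```

Leftovers such as "-" or "15." are still possible with this filter, so a final matches("-?\\d+(\\.\\d+)?") check on the result is still advisable.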

Regular Expression problem in Java

I am trying to create a regular expression for the replaceAll method in Java. The test string is abXYabcXYZ and the pattern is abc. I want to replace any symbol except the pattern with +. For example the string abXYabcXYZ and pattern [^(abc)] should return ++++abc+++, but in my case it returns ab++abc+++.
public static String plusOut(String str, String pattern) {
pattern= "[^("+pattern+")]" + "".toLowerCase();
return str.toLowerCase().replaceAll(pattern, "+");
}
public static void main(String[] args) {
String text = "abXYabcXYZ";
String pattern = "abc";
System.out.println(plusOut(text, pattern));
}
When I try to replace the pattern with + there is no problem - abXYabcXYZ with pattern (abc) returns abxy+xyz. Pattern (^(abc)) returns the string without replacement.
Is there any other way to write NOT(regex) or group symbols as a word?
What you are trying to achieve is pretty tough with regular expressions, since there is no way to express “replace strings not matching a pattern”. You will have to use a “positive” pattern, telling what to match instead of what not to match.
Furthermore, you want to replace every character with a replacement character, so you have to make sure that your pattern matches exactly one character. Otherwise, you will replace whole strings with a single character, returning a shorter string.
For your toy example, you can use negative lookaheads and lookbehinds to achieve the task, but this may be more difficult for real-world examples with longer or more complex strings, since you will have to consider each character of your string separately, along with its context.
Here is the pattern for “not ‘abc’”:
[^abc]|a(?!bc)|(?<!a)b|b(?!c)|(?<!ab)c
It consists of five sub-patterns, connected with “or” (|), each matching exactly one character:
[^abc] matches every character except a, b or c
a(?!bc) matches a if it is not followed by bc
(?<!a)b matches b if it is not preceded with a
b(?!c) matches b if it is not followed by c
(?<!ab)c matches c if it is not preceded with ab
The idea is to match every character that is not in your target word abc, plus every word character that, according to the context, is not part of your word. The context can be examined using negative lookaheads (?!...) and lookbehinds (?<!...).
You can imagine that this technique will fail once you have a target word containing one character more than once, like example. It is pretty hard to express “match e if it is not followed by x and not preceded by l”.
Especially for dynamic patterns, it is by far easier to do a positive search and then replace every character that did not match in a second pass, as others have suggested here.
[^ ... ] will match one character that is not any of ...
So your pattern "[^(abc)]" is saying "match one character that is not a, b, c or the left or right bracket"; and indeed that is what happens in your test.
It is hard to say "replace all characters that are not part of the string 'abc'" in a single trivial regular expression. What you might do instead to achieve what you want could be some nasty thing like
while the input string still contains "abc"
find the next occurrence of "abc"
append to the output a string containing as many "+"s as there are characters before the "abc"
append "abc" to the output string
skip, in the input string, to a position just after the "abc" found
append to the output a string containing as many "+"s as there are characters left in the input
or possibly if the input alphabet is restricted you could use regular expressions to do something like
replace all occurrences of "abc" with a single character that does not occur anywhere in the existing string
replace all other characters with "+"
replace all occurrences of the target character with "abc"
which will be more readable but may not perform as well.
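The first of those two outlines (find each occurrence, pad with "+" in between) could look roughly like this with String.indexOf. The class name PlusOutLoop is invented for the example, and the input is lower-cased up front as in the question:

```java
class PlusOutLoop {
    // Sketch of the "find next occurrence, pad, copy, skip" steps.
    static String plusOut(String str, String word) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        StringBuilder out = new StringBuilder();
        int pos = 0; // start of the part of str not yet processed
        int hit;
        while ((hit = str.indexOf(word, pos)) >= 0) {
            for (int i = pos; i < hit; i++) {
                out.append('+');         // characters before the occurrence
            }
            out.append(word);            // the occurrence itself
            pos = hit + word.length();   // skip past it
        }
        for (int i = pos; i < str.length(); i++) {
            out.append('+');             // whatever is left after the last occurrence
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abXYabcXYZ".toLowerCase(), "abc")); // ++++abc+++
    }
}
```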
Negating regexps is usually troublesome. I think you might want to use negative lookahead. Something like this might work:
String pattern = "(?<!ab).(?!abc)";
I didn't test it, so it may not really work for degenerate cases. And the performance might be horrible too. It is probably better to use a multistep algorithm.
Edit: No I think this won't work for every case. You will probably spend more time debugging a regexp like this than doing it algorithmically with some extra code.
Try to solve it without regular expressions:
String out = "";
int i;
for(i=0; i<text.length() - pattern.length() + 1; ) {
if (text.substring(i, i + pattern.length()).equals(pattern)) {
out += pattern;
i += pattern.length();
}
else {
out += "+";
i++;
}
}
for(; i<text.length(); i++) {
out += "+";
}
Rather than a single replaceAll, you could always try something like:
@Test
public void testString() {
final String in = "abXYabcXYabcHIH";
final String expected = "xxxxabcxxabcxxx";
String result = replaceUnwanted(in);
assertEquals(expected, result);
}
private String replaceUnwanted(final String in) {
final Pattern p = Pattern.compile("(.*?)(abc)([^a]*)");
final Matcher m = p.matcher(in);
final StringBuilder out = new StringBuilder();
while (m.find()) {
out.append(m.group(1).replaceAll(".", "x"));
out.append(m.group(2));
out.append(m.group(3).replaceAll(".", "x"));
}
return out.toString();
}
Instead of using replaceAll(...), I'd go for a Pattern/Matcher approach:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static String plusOut(String str, String pattern) {
StringBuilder builder = new StringBuilder();
String regex = String.format("((?:(?!%s).)++)|%s", pattern, pattern);
Matcher m = Pattern.compile(regex).matcher(str.toLowerCase());
while(m.find()) {
builder.append(m.group(1) == null ? pattern : m.group().replaceAll(".", "+"));
}
return builder.toString();
}
public static void main(String[] args) {
String text = "abXYabcXYZ";
String pattern = "abc";
System.out.println(plusOut(text, pattern));
}
}
Note that you'll need to use Pattern.quote(...) if your String pattern contains regex meta-characters.
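To illustrate that note, here is the same approach with the search word passed through Pattern.quote. The class name QuotedPlusOut is hypothetical, and the caller is assumed to have lower-cased the input already:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedPlusOut {
    static String plusOut(String str, String word) {
        // Pattern.quote wraps the word so that characters like '.' are literal.
        String quoted = Pattern.quote(word);
        String regex = String.format("((?:(?!%s).)++)|%s", quoted, quoted);
        Matcher m = Pattern.compile(regex).matcher(str);
        StringBuilder builder = new StringBuilder();
        while (m.find()) {
            // group(1) is null when the second alternative (the word) matched
            builder.append(m.group(1) == null ? word : m.group().replaceAll(".", "+"));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Without quoting, "a.c" would also match "abc" and leave it untouched.
        System.out.println(plusOut("abcxa.c", "a.c")); // ++++a.c
    }
}
```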
Edit: I didn't see a Pattern/Matcher approach was already suggested by toolkit (although slightly different)...

How to check a string starts with numeric number?

I have a string which contains alphanumeric characters.
I need to check whether the string starts with a number.
Thanks,
See the isDigit(char ch) method:
https://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Character.html
and pass it the first character of the String, obtained with the String.charAt() method.
Character.isDigit(myString.charAt(0));
Sorry I didn't see your Java tag, was reading question only. I'll leave my other answers here anyway since I've typed them out.
Java:
String myString = "9Hello World!";
if ( Character.isDigit(myString.charAt(0)) )
{
System.out.println("String begins with a digit");
}
C++:
string myString = "2Hello World!";
if (isdigit( myString[0]) )
{
printf("String begins with a digit");
}
Regular expression:
\b[0-9]
Some proof my regex works: Unless my test data is wrong?
I think you ought to use a regex:
import java.util.regex.*;
public class Test {
public static void main(String[] args) {
String neg = "-123abc";
String pos = "123abc";
String non = "abc123";
/* I'm not sure if this regex is too verbose, but it should be
* clear. It checks that the string starts with either a series
* of one or more digits... OR a negative sign followed by 1 or
* more digits. Anything can follow the digits. Update as you need
* for things that should not follow the digits or for floating
* point numbers.
*/
Pattern pattern = Pattern.compile("^(\\d+.*|-\\d+.*)");
Matcher matcher = pattern.matcher(neg);
if(matcher.matches()) {
System.out.println("matches negative number");
}
matcher = pattern.matcher(pos);
if (matcher.matches()) {
System.out.println("positive matches");
}
matcher = pattern.matcher(non);
if (!matcher.matches()) {
System.out.println("letters don't match :-)!!!");
}
}
}
You may want to adjust this to accept floating point numbers, but this will work for negatives. Other answers won't work for negatives because they only check the first character! Be more specific about your needs and I can help you adjust this approach.
This should work:
String s = "123foo";
Character.isDigit(s.charAt(0));
System.out.println(Character.isDigit(s.charAt(0)));
EDIT: I searched for java docs, looked at methods on string class which can get me 1st character & looked at methods on Character class to see if it has any method to check such a thing.
I think, you could do the same before asking it.
EDIT 2: What I mean is, try to do things, read/find & if you can't find anything - ask.
I made a mistake when posting it for the first time. isDigit is a static method on Character class.
Use a regex like ^\d
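In Java, a pattern like this needs Matcher.find() rather than String.matches(), because matches() only succeeds when the entire string matches. A minimal sketch, with the class name StartsWithDigit made up for the example:

```java
import java.util.regex.Pattern;

class StartsWithDigit {
    private static final Pattern LEADING_DIGIT = Pattern.compile("^\\d");

    static boolean startsWithDigit(String s) {
        // find() looks for the pattern anywhere; ^ pins it to the start.
        return LEADING_DIGIT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(startsWithDigit("9Hello World!")); // true
        System.out.println(startsWithDigit("abc123"));        // false
        System.out.println("9Hello".matches("^\\d"));         // false, whole string must match
        System.out.println("9Hello".matches("\\d.*"));        // true
    }
}
```

Unlike charAt(0), this also returns false for an empty string instead of throwing an exception.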
