regex for comma separated values - java

I am new to writing regular expressions, so please help.
I want to match this pattern (in Java):
"ABC",010,00,"123",0,"time","time",01,00, 10, 10,88,217," ",," "
The data I get will always be in the above format, with 16 values; the format never changes.
I am not looking to parse it, since Java's split can already do that.
I will have large chunks of this data, so I want to take the first 16 data points and match them against this pattern to check that I received them correctly, and ignore them otherwise.
so far I have only tried this regex:
^(\".\"),.,(\".\"),.,(\".\"),(\".\"),.,.,.,.,.,.,(\".\"),.,(\".\")$
I am still in the process of building it.
I just need to match the pattern against a given pool: I take the first 16 data points and check whether they match this pattern, and ignore them otherwise.
Thanks!!

This should do the trick. Keep in mind that it doesn't care what order the data points occur in (i.e. they could all be strings or all numbers).
(\s?("[\w\s]*"|\d*)\s?(,|$)){16}
You can try it out here.
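One caveat worth checking with the pattern above: each repetition of the group can also match the empty string at the end of the input, so a line with fewer than 16 values may still pass. A minimal sketch of a stricter variant follows, assuming the same field rule as the answer (a quoted run of word or space characters, or digits, or nothing). It requires 15 comma-terminated fields followed by one final field. The names CsvLineCheck and isValid are hypothetical and not from the original post.

```java
import java.util.regex.Pattern;

// Hypothetical helper; class and method names are illustrative only.
class CsvLineCheck {
    // One field: optional space, a quoted run of word/space characters
    // or a (possibly empty) run of digits, then optional space.
    private static final String FIELD = "\\s?(?:\"[\\w\\s]*\"|\\d*)\\s?";

    // Exactly 16 fields: 15 fields each followed by a comma, then a last field.
    private static final Pattern LINE =
            Pattern.compile("(?:" + FIELD + ",){15}" + FIELD);

    static boolean isValid(String line) {
        // matches() anchors the pattern to the whole input.
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String sample = "\"ABC\",010,00,\"123\",0,\"time\",\"time\",01,00, 10, 10,88,217,\" \",,\" \"";
        System.out.println(isValid(sample)); // true: 16 fields
        System.out.println(isValid("1,2"));  // false: only 2 fields
    }
}
```

Because no field can contain a comma, a line with 17 values is rejected as well.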

The code below validates comma-separated strings, decimals, and integers.
public static void commaSeparatedStrings() {
String value = "'It\\'s my world', 'Hello World', 'What\\'s up', 'It\\'s just what I expected.'";
if (value.matches("'([^\'\\\\]*(?:\\\\.[^\'\\\\])*)[\\w\\s,\\.]+'(((,)|(,\\s))'([^\'\\\\]*(?:\\\\.[^\'\\\\])*)[\\w\\s,\\.]+')*")) {
System.out.println("Valid...");
} else {
System.out.println("Invalid...");
}
}
/**
*
*/
public static void commaSeparatedDecimals() {
String value = "-111.00, 22111.00, -1.00";
// "\\d+([,]|[,\\s]\\d+)*"
if (value.matches(
"^([-]?)\\d+\\.\\d{1,10}?(((,)|(,\\s))([-]?)\\d+\\.\\d{1,10}?)*")) {
System.out.println("Valid...");
} else {
System.out.println("Invalid...");
}
}
/**
*
*/
public static void commaSeparatedNumbers() {
String value = "-11, 22, -31";
if (value.matches("^([-]?)\\d+(((,)|(,\\s))([-]?)\\d+)*")) {
System.out.println("Valid...");
} else {
System.out.println("Invalid...");
}
}

Related

Regex for a line with doubles

I am fairly new to programming and regex is very confusing. I am trying to identify a data line that consists of 3 doubles with spaces in between for example:
500.00 56.48 500.00
I have tried this:
data.matches("^[0-9]+\\.[0-9]+\\s[0-9]+\\.[0-9]+\\s[0-9]+\\.[0-9]+$")
But this doesn't recognize the line. What am I doing wrong?
Don't do it the way you have tried.
Although the regex pattern you have used works for the numbers you have used, it will fail for a wide range of numbers e.g. .5 or 5.6E2 which are also double numbers.
Given below is the demo with your data and pattern:
public class Main {
public static void main(String[] args) {
String data = "500.00 56.48 500.00";
System.out.println(data.matches("^[0-9]+\\.[0-9]+\\s[0-9]+\\.[0-9]+\\s[0-9]+\\.[0-9]+$"));
}
}
Output:
true
However, it will fail to give you the expected result in the following case:
public class Main {
public static void main(String[] args) {
String data = ".5 5.6E2 500.00";
System.out.println(data.matches("^[0-9]+\\.[0-9]+\\s[0-9]+\\.[0-9]+\\s[0-9]+\\.[0-9]+$"));
}
}
Output:
false
Even though .5 and 5.6E2 are valid double numbers, your pattern failed to recognize them.
The recommended way:
You should split the data line on whitespace and try to parse each number using Double#parseDouble e.g.
public class Main {
public static void main(String[] args) {
String data = "500.00 56.48 500.00";
System.out.println(matches(data));
data = ".5 5.6E2 500.00";
System.out.println(matches(data));
data = ".5 500.00";
System.out.println(matches(data));
data = ".5 abc 500.00";
System.out.println(matches(data));
}
static boolean matches(String data) {
String[] nums = data.split("\\s+");
boolean match = true;
if (nums.length == 3) {
for (String num : nums) {
try {
Double.parseDouble(num);
} catch (NumberFormatException e) {
match = false;
break;
}
}
} else {
match = false;
}
return match;
}
}
Output:
true
true
false
false
Improve your regex by observing a few things:
[0-9] is the same as \d
you're looking for the same pattern, thrice
So, let's do that:
three times:
one or more numbers, optionally followed by
a period and then one or more numbers, optionally followed by
white space
Which means:
(...){3} where ... is:
\d+, optionally followed by
(\.\d+)? (i.e. zero-or-once), optionally followed by
\s* (zero-or-more)
Putting that all together, and remembering to use proper string escaping:
data.matches("^(\\d+(\\.\\d+)?\\s*){3}$")
You can see this working over on https://regex101.com/r/PGxAm9/1, and keeping regex101 bookmarked for future debugging is highly recommended.
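One thing to watch with the pattern above: because the whitespace is optional (\s*), a single run of digits such as 123 can be split into three "numbers" and still match. Here is a minimal sketch of a stricter variant, assuming at least one whitespace character must separate the values. The names ThreeNumbers and isThreeNumbers are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class ThreeNumbers {
    // Digits with an optional fractional part, e.g. 500 or 56.48.
    private static final String NUM = "\\d+(?:\\.\\d+)?";

    // One number, then exactly two more, each preceded by mandatory whitespace.
    private static final Pattern LINE =
            Pattern.compile(NUM + "(?:\\s+" + NUM + "){2}");

    static boolean isThreeNumbers(String data) {
        return LINE.matcher(data).matches();
    }

    public static void main(String[] args) {
        System.out.println(isThreeNumbers("500.00 56.48 500.00")); // true
        System.out.println(isThreeNumbers("123"));                 // false
        // The looser pattern accepts "123" as the three numbers 1, 2 and 3:
        System.out.println("123".matches("^(\\d+(\\.\\d+)?\\s*){3}$")); // true
    }
}
```

Like the pattern it tightens, this sketch still rejects forms such as .5 or 5.6E2; it only fixes the separator handling.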

Validating parts of a string input to make sure letters or numbers are input for certain indexes

Stupid title aside, I'm having trouble validating that the value "MBA222" is rejected when entered, while values like "MB2222" are accepted. That is to say, I'm unsure how to ensure that the first two characters of the String are letters and the next four are digits. Am I perhaps using the wrong method?
public static String getValidMembership(String aMember){
while(isValidMembership(aMember) == false){
aMember = JOptionPane.showInputDialog("Please enter Membership Number");
}
return aMember;
}
private static boolean isValidMembership(String aMember){
boolean result = false;
//TODO add your code here
try{
if(!aMember.substring(0,1).contains("[a-zA-Z]+") &&
!aMember.substring(2,5).contains("[0-9]+")&&
aMember.length() != 6){
result = false;
}
else{
result = true;
}
}catch(Exception e){
result = false;
}
return result;
}
All you need is a single regex:
public static boolean isValidMembership(String aMember) {
return Pattern.matches("^[a-zA-Z]{2}\\d{4}$", aMember);
}
I would recommend reading the Javadoc for the Pattern class which provides a lot of details of how to use regexes.
The String's method contains(...) doesn't work with RegEx, you should use matches(...) for that.
To check if a String matches your criteria, you can use a statement like this:
s.matches("[A-Za-z]{2}\\d{4}")
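To make the difference concrete, here is a small sketch contrasting the two methods. The class name ContainsVsMatches is hypothetical, and the null check is an added assumption rather than something the question asked for.

```java
// Hypothetical demo class; the name is illustrative only.
class ContainsVsMatches {
    static boolean isValidMembership(String s) {
        // matches() treats its argument as a regex and must match the whole string.
        // The null check is an assumption added for safety.
        return s != null && s.matches("[A-Za-z]{2}\\d{4}");
    }

    public static void main(String[] args) {
        // contains() searches for the literal text "[a-zA-Z]+", which is not in "MB2222".
        System.out.println("MB2222".contains("[a-zA-Z]+")); // false
        // contains() only finds plain substrings.
        System.out.println("MB2222".contains("B2"));        // true
        System.out.println(isValidMembership("MB2222"));    // true
        System.out.println(isValidMembership("MBA222"));    // false
    }
}
```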
Here is an elegant and lean implementation of your isValidMembership() method:
private static boolean isValidMembership(String aMember){
if (aMember.replaceFirst("([a-zA-Z]{2}[0-9]{4})", "").length() != 0) {
return false;
}
return true;
}
I removed your try-catch block because it doesn't seem necessary.
This solution uses the following regex, which you can also test by following the link given below:
[a-zA-Z]{2}[0-9]{4}
Regex101
Note that this regex will match two letters followed by four numbers, but the Java code imposes the additional constraint that the length of the string must be 6.

Determine if a JTextField contains an integer

I'm making a word guessing game. The JTextField can't be empty, can't be longer or shorter than 5 characters, and cannot contain numbers. How do I do the last part?
@Override
public void actionPerformed(ActionEvent e) {
if (text.getText().isEmpty()) {
showErrorMessage("You have to type something");
} else if (text.getText().length() != 5) {
showErrorMessage("Currently only supporting 5-letter words");
} else if (contains integer){
showErrorMessage("Words don't contain numbers!");
} else {
r.setGuess(text.getText());
}
}
Rather than explicitly checking for numbers, I would suggest whitelisting characters which are allowed. Would letters with accents be allowed, for example? Both upper and lower case? Punctuation?
Once you've worked out your requirements, I suggest you express them as a regular expression - and then check whether the text matches the regular expression.
For example:
private static final Pattern VALID_WORD = Pattern.compile("^[A-Za-z]*$");
...
if (!VALID_WORD.matcher(text.getText()).matches()) {
// Show some appropriate error message
}
Note that I haven't included length checks in there, as you're already covering those - and it may well be worth doing so in order to give more specific error messages.
You might also want to consider preventing your text box from accepting the invalid characters to start with - just reject the key presses, rather than waiting for a submission event. (You could also change the case consistently at the same time, for example.) As noted in comments, JFormattedTextField is probably a good match - see the tutorial to get started.
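As one possible way to reject invalid characters as they are typed, here is a sketch using a javax.swing.text.DocumentFilter. This is a different technique from the JFormattedTextField mentioned above, swapped in because it is compact. The class name LettersOnlyFilter and the A-Z/a-z whitelist are assumptions; adjust the whitelist to your own requirements.

```java
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.DocumentFilter;
import javax.swing.text.PlainDocument;

// Hypothetical filter: drops any edit that contains a character outside A-Z / a-z.
class LettersOnlyFilter extends DocumentFilter {
    private static boolean allowed(String text) {
        // null means "no inserted text" (a pure removal), which is always fine.
        return text == null || text.matches("[A-Za-z]*");
    }

    @Override
    public void insertString(FilterBypass fb, int offset, String text, AttributeSet attr)
            throws BadLocationException {
        if (allowed(text)) {
            super.insertString(fb, offset, text, attr);
        }
    }

    @Override
    public void replace(FilterBypass fb, int offset, int length, String text, AttributeSet attrs)
            throws BadLocationException {
        if (allowed(text)) {
            super.replace(fb, offset, length, text, attrs);
        }
    }

    // Exercises the filter on a bare document, without opening a window.
    static String demo() {
        try {
            PlainDocument doc = new PlainDocument();
            doc.setDocumentFilter(new LettersOnlyFilter());
            doc.insertString(0, "hello", null); // accepted
            doc.insertString(5, "123", null);   // rejected by the filter
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // hello
    }
}
```

On a real field, the filter would be installed with ((AbstractDocument) text.getDocument()).setDocumentFilter(new LettersOnlyFilter()), where text is the JTextField from the question.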
Create a method that checks whether the JTextField contains a digit, like this:
private boolean isContainInt(JTextField field) {
Pattern pt = Pattern.compile("\\d+");
Matcher mt = pt.matcher(field.getText());
return mt.find();
}
if (!text.getText().matches("[a-zA-Z]*")) {
// something wrong
}

How can I match a char at a specific index in a string?

I have a simple string that I'm trying to determine if a specific index results in a specific char.
I'm trying to do it like this (but I get compilation errors):
if(myString.charAt(7).equals("/")) {
// do stuff
} else {
// do other stuff
}
Error:
Type mismatch: cannot convert from char to boolean
myString.charAt(7).equals("/")
should be the following, because charAt() returns a char:
myString.charAt(7) == '/'
if(myString.charAt(7)== '/') {
// do stuff
} else {
// do other stuff
}
Putting a character in double quotes makes it a String; single quotes make it a char. You compare chars with ==, whereas you compare objects with the equals method.
There are a couple of solutions on this page that give you what you were probably trying to do, which is compare a single character to another single character. I won't go over that because they've done it excellently.
But you can still use a String if you like, and prepare for the future. (Perhaps "/" changes to "//"?) you can do this:
if(myString.substring(7,8).equals("/")) {
// stuff
}
Then down the road you might end up with something like this:
public static final String SEPARATOR_STRING = "//";
public static final int SEPARATOR_START = 7;
public static final int SEPARATOR_END = SEPARATOR_START + SEPARATOR_STRING.length();
// later
if(myString.substring(SEPARATOR_START, SEPARATOR_END).equals(SEPARATOR_STRING)) {
// stuff
}
charAt() returns char, not object, so you need to compare it that way:
if(myString.charAt(7)== '/') {
...
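Note the single quotes around /.
if(myString.substring(7,8).equals("/"))
or
if(myString.charAt(7)=='/')
or
if(myString.indexOf("/")==7) can be used. Note that indexOf returns the position of the first "/", so this last form only works when there is no earlier "/" in the string.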
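The variants above differ in how they behave on short strings and on strings with more than one "/". Here is a small sketch comparing them. The class CharAtIndex and the helper hasCharAt are hypothetical names, and the sample string is made up.

```java
// Hypothetical demo class; names and sample data are illustrative only.
class CharAtIndex {
    // True when index is in range and the character at that index equals expected.
    static boolean hasCharAt(String s, int index, char expected) {
        return index >= 0 && index < s.length() && s.charAt(index) == expected;
    }

    public static void main(String[] args) {
        String path = "a/bcdef/gh"; // slashes at index 1 and index 7

        System.out.println(hasCharAt(path, 7, '/'));    // true
        System.out.println(path.indexOf("/") == 7);     // false: the first slash is at index 1
        System.out.println(path.startsWith("/", 7));    // true: startsWith with an offset
        // charAt(7) alone would throw on a string this short; the guarded helper does not.
        System.out.println(hasCharAt("short", 7, '/')); // false
    }
}
```

String.startsWith(prefix, offset) is handy here because it returns false, rather than throwing, when the offset is past the end of the string.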

Need help parsing strings in Java

I am reading in a csv file in Java and, depending on the format of the string on a given line, I have to do something different with it. The three different formats contained in the csv file are (using random numbers):
833
"79, 869"
"56-57, 568"
If it is just a single number (833), I want to add it to my ArrayList. If it is two numbers separated by a comma and surrounded by quotation marks ("79, 869"), I want to parse out the first of the two numbers (79) and add it to the ArrayList. If it is three numbers surrounded by quotation marks, where the first two numbers are separated by a dash and the third by a comma ("56-57, 568"), then I want to parse out the third number (568) and add it to the ArrayList.
I am having trouble using str.contains() to determine if the string on a given line contains a dash or not. Can anyone offer me some help? Here is what I have so far:
private static void getFile(String filePath) throws java.io.IOException {
BufferedReader reader = new BufferedReader(new FileReader(filePath));
String str;
while ((str = reader.readLine()) != null) {
if(str.endsWith("\"")){
if (str.contains(charDash)){
System.out.println(str);
}
}
}
}
Thanks!
I recommend using the version of indexOf that actually takes a char rather than a string, since this method is much faster. (It is a simple loop, without a nested loop.)
I.e.
if (str.indexOf('-')!=-1) {
System.out.println(str);
}
(Note the single quotes, so this is a char, rather than a string.)
But then you have to split the line and parse the individual values. At present, you are testing if the whole line ends with a quote, which is probably not what you want.
The following code works for me (note: I wrote it with no optimization in mind - it's just for testing purposes):
public static void main(String args[]) {
ArrayList<String> numbers = GetNumbers();
}
private static ArrayList<String> GetNumbers() {
String str1 = "833";
String str2 = "79, 869";
String str3 = "56-57, 568";
ArrayList<String> lines = new ArrayList<String>();
lines.add(str1);
lines.add(str2);
lines.add(str3);
ArrayList<String> numbers = new ArrayList<String>();
for (Iterator<String> s = lines.iterator(); s.hasNext();) {
String thisString = s.next();
if (thisString.contains("-")) {
numbers.add(thisString.substring(thisString.indexOf(",") + 2));
} else if (thisString.contains(",")) {
numbers.add(thisString.substring(0, thisString.indexOf(",")));
} else {
numbers.add(thisString);
}
}
return numbers;
}
Output:
833
79
568
Although it gets a lot of hate these days, I still really like the StringTokenizer for this kind of stuff. You can set it up to return the delimiters as tokens and, at least to me, it makes the processing trivial without interacting with regexes.
You'd have to create it using ",- as your delimiters, then just kick it off in a loop.
st=new StringTokenizer(line, "\",-", true);
Then you set up a loop:
while(st.hasMoreTokens()) {
String token=st.nextToken();
Each case becomes its own little part of the loop:
// Use punctuation to set flags that tell you how to interpret the numbers.
if(token.equals("\"")) {
isQuoted = !isQuoted;
} else if(token.equals(",")) {
...
} else if(...) {
...
} else { // The punctuation has been dealt with, must be a number group
// Apply flags to determine how to parse this number.
}
I realize that StringTokenizer is outdated now, but I'm not really sure why. Parsing regular expressions can't be faster and the syntax is--well split is a pretty sweet syntax I gotta admit.
I guess if you and everyone you work with is really comfortable with Regular Expressions you could replace that with split and just iterate over the resultant array but I'm not sure how to get split to return the punctuation--probably that "+" thing from other answers but I never trust that some character I'm passing to a regular expression won't do something utterly unexpected.
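To see what the tokenizer actually hands back with delimiters returned, here is a small runnable sketch. The class name TokenDemo and the helper tokens are hypothetical; the delimiter set is the one used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; names are illustrative only.
class TokenDemo {
    // Splits a line on quote, comma and dash, keeping each delimiter as its own token.
    static List<String> tokens(String line) {
        StringTokenizer st = new StringTokenizer(line, "\",-", true);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Print each token in angle brackets so that whitespace is visible.
        for (String token : tokens("\"56-57, 568\"")) {
            System.out.println("<" + token + ">");
        }
    }
}
```

For the line "56-57, 568" (including its quotes) this yields seven tokens: a quote, 56, a dash, 57, a comma, 568 with a leading space, and a closing quote. The space stays attached to 568 because it is not in the delimiter set, so trim the token before parsing it as a number.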
will
if (str.indexOf(charDash.toString()) > -1){
System.out.println(str);
}
do the trick?
Incidentally, this is no slower than contains, because contains is itself implemented in terms of indexOf.
Will this work?
if(str.contains("-")) {
System.out.println(str);
}
I wonder if the charDash variable is not what you are expecting it to be.
I think three regexes would be your best bet - because with a match, you also get the bit you're interested in. I suck at regex, but something along the lines of:
.*\-.*, (.+)
.*, (.+)
and
(.+)
ought to do the trick (in order, because the final pattern matches anything including the first two).
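Here is a sketch of how the three-regex idea could look in Java. It assumes the lines keep their surrounding quotation marks and that every number is a plain run of digits. The patterns are tightened versions of the ones above, and the names LineNumberExtractor and extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names and exact patterns are illustrative assumptions.
class LineNumberExtractor {
    // Tried in order; group 1 captures the number to keep.
    private static final Pattern[] PATTERNS = {
        Pattern.compile("\"?\\d+-\\d+,\\s*(\\d+)\"?"), // "56-57, 568" -> 568
        Pattern.compile("\"?(\\d+),\\s*\\d+\"?"),      // "79, 869"    -> 79
        Pattern.compile("(\\d+)")                      // 833          -> 833
    };

    // Returns the extracted number as text, or null if no format matches.
    static String extract(String line) {
        String trimmed = line.trim();
        for (Pattern p : PATTERNS) {
            Matcher m = p.matcher(trimmed);
            if (m.matches()) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extract("833"));            // 833
        System.out.println(extract("\"79, 869\""));    // 79
        System.out.println(extract("\"56-57, 568\"")); // 568
    }
}
```

Because these patterns are anchored by matches() and only accept digits, the order matters less than with the looser .* versions, but trying the most specific format first keeps the intent clear.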
