Java Repeat regular expression - java

I have the following regex, which should match IDs in brackets, e.g.:
[swpf_02-7679, swpf_02-7622, ...]
Pattern p = Pattern.compile("[\\[\\s]*?[a-z]{1,8}[0-9]*?_[0-9]{2,}\\-[0-9]+[\\s]*?\\]");
The goal now is to combine this pattern with a "split" at "," so that it matches the whole string [swpf_02-7679, swpf_02-7622] and not only a single-element list like [swpf_02-7679], which is all the regex above handles.
Can someone give me a hint?

Just remove the [ and ] from the string then split at the ,

The easiest way to do what you want, I think, is to remove the '[' and ']' at the front and back (use String.substring()), then split on the comma with String.split() and apply the regex to each individual string returned (adjusting the regex to remove the brackets, of course).
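A minimal sketch of that approach, assuming the input is a single bracketed list. The class name BracketedIdList and the method parse are made up for illustration, and the pattern is the question's regex with the bracket and whitespace parts removed so that it validates one ID at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class BracketedIdList {
    // The question's pattern without the bracket/whitespace parts (assumption:
    // each element must match this shape exactly, e.g. "swpf_02-7679").
    private static final Pattern ID =
            Pattern.compile("[a-z]{1,8}[0-9]*_[0-9]{2,}-[0-9]+");

    static List<String> parse(String input) {
        String trimmed = input.trim();
        if (trimmed.length() < 2 || !trimmed.startsWith("[") || !trimmed.endsWith("]")) {
            throw new IllegalArgumentException("Expected a list in square brackets: " + input);
        }
        // Drop the surrounding brackets, then split on the commas.
        String inner = trimmed.substring(1, trimmed.length() - 1);
        List<String> ids = new ArrayList<>();
        for (String part : inner.split(",")) {
            String id = part.trim();
            // Keep only the elements that look like a valid ID.
            if (ID.matcher(id).matches()) {
                ids.add(id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(parse("[swpf_02-7679, swpf_02-7622]"));
        // prints: [swpf_02-7679, swpf_02-7622]
    }
}
```

Elements that do not match are silently skipped here; throwing an exception instead would be an equally reasonable choice.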

OK, assuming the IDs you want look like "swpf_02-7622", split on the comma and loop through the pieces, trimming as you go. Something like:
    List<String> cleanIds = new ArrayList<String>();
    for (String id : ids.split(","))
        cleanIds.add(id.trim());
If you want rid of the "swpf_" bits, then id.substring(5).
Finally, to get rid of the square brackets, check for them with id.startsWith("[") and id.endsWith("]") and cut them off with substring.

Why don't you use the Java StringTokenizer class and then just use the regex on the tokens you get out of this? You can post-process them to include the brackets you need or modify the regex slightly.

As @was and @garyh already mentioned, the simplest way is to remove the [], then split your list using String.split("\\s*,\\s*"), then match each member against your pattern.
You can also match your string multiple times, using the end position of the previous iteration as the start position of the next:
    Pattern p = .... // your pattern in capturing brackets ()
    Matcher m = p.matcher(str);
    for (int start = 0; m.find(start); start = m.end()) {
        String element = m.group(1);
        // do what you need with the element.
    }

If you simply want to extract all the codes in your list, you could use this regular expression:
[^,\s\[\]]+
Getting all the matches from the following string:
[swpf_02-7679, swpf_02-762342, swpf_02-7633 , swpf_02-723422]
Would give you the following results:
swpf_02-7679
swpf_02-762342
swpf_02-7633
swpf_02-723422
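In Java, getting all the matches means looping over Matcher.find(). The following is a sketch only: the class name CodeExtractor and the method extract are illustrative, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CodeExtractor {
    // One or more characters that are not a comma, whitespace, or a square bracket.
    private static final Pattern CODE = Pattern.compile("[^,\\s\\[\\]]+");

    static List<String> extract(String input) {
        List<String> codes = new ArrayList<>();
        Matcher m = CODE.matcher(input);
        // Each successful find() advances to the next non-overlapping match.
        while (m.find()) {
            codes.add(m.group());
        }
        return codes;
    }

    public static void main(String[] args) {
        String list = "[swpf_02-7679, swpf_02-762342, swpf_02-7633 , swpf_02-723422]";
        for (String code : extract(list)) {
            System.out.println(code);
        }
    }
}
```

Note that this pattern accepts any run of non-separator characters, so it extracts the elements without checking that they look like valid IDs.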

Related

Split String 2 times but with different splits ";" and "."

Original String: "12312123;www.qwerty.com"
With this Model.getList().get(0).split(";")[1]
I get: "www.qwerty.com"
I tried doing this: Model.getList().get(0).split(";")[1].split(".")[1]
But it didn't work; I get an exception. How can I solve this?
I want only "qwerty"
Try this, to achieve "qwerty":
Model.getList().get(0).split(";")[1].split("\\.")[1]
You need to escape the dot symbol.
Try to use split(";|\\.") like this:
    for (String string : "12312123;www.qwerty.com".split(";|\\.")) {
        System.out.println(string);
    }
Output:
12312123
www
qwerty
com
You can split a string which has multiple delimiters. Example below:
String abc = "11;xyz.test.com";
String[] tokens = abc.split(";|\\.");
System.out.println(tokens[tokens.length-2]);
The array index 1 part doesn't make sense here. It will throw an ArrayIndexOutOfBoundsException.
This is because splitting on "." doesn't work the way you want it to. You need to escape the period by writing "\\." instead, because in a regular expression "." means something completely different.
You'd need to escape the ., i.e. "\\.". Period is a special character in regular expressions, meaning "any character".
What your current split means is "split on any character"; this means that it splits the string into a number of empty strings, since there is nothing between consecutive occurrences of "any character".
There is a subtle gotcha in the behaviour of the String.split method, which is that it discards trailing empty strings from the token array (unless you pass a negative number as the second parameter).
Since your entire token array consists of empty strings, all of these are discarded, so the result of the split is a zero-length array; hence the exception when you try to access one of its elements.
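The behaviour described above can be checked with a few lines. This is only a sketch; the class name SplitOnDot and the helper secondLabel are illustrative, and the helper assumes the input contains at least two dots' worth of labels.

```java
import java.util.Arrays;

public class SplitOnDot {
    // Returns the part between the first and second dot, e.g. "qwerty".
    static String secondLabel(String host) {
        String[] parts = host.split("\\."); // escaped: splits on a literal dot
        return parts[1];
    }

    public static void main(String[] args) {
        String host = "www.qwerty.com";

        // Unescaped: every character is a delimiter, so only empty strings
        // remain, and split() drops all of them as trailing empty strings.
        System.out.println(host.split(".").length);              // prints: 0

        // A negative limit keeps the empty strings.
        System.out.println(host.split(".", -1).length);          // prints: 15

        // Escaped: the expected three tokens.
        System.out.println(Arrays.toString(host.split("\\."))); // prints: [www, qwerty, com]
        System.out.println(secondLabel(host));                   // prints: qwerty
    }
}
```

Pattern.quote(".") is another way to get a literal dot without writing the escape by hand.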
Don't use split, use a regular expression (directly). It's safer, and faster.
    String input = "12312123;www.qwerty.com";
    String regex = "([^.;]+)\\.[^.;]+$";
    Matcher m = Pattern.compile(regex).matcher(input);
    if (m.find()) {
        System.out.println(m.group(1)); // prints: qwerty
    }

String split method returning first element as empty using regex

I'm trying to get the digits from the expression [1..1] using Java's split method. I'm using the regex ^\\[|\\.{2}|\\]$ inside split, but split returns a String array whose first value is empty, with "1" at indexes 1 and 2. Could anyone please tell me what I'm doing wrong in this regex, so that I get only the digits in the String array returned by split?
You should use matching. Change your expression to:
`^\[(.*?)\.\.(.*)\]$`
And get your results from the two captured groups.
As for why split acts this way, it's simple: you asked it to split on the [ character, but there's still an "empty string" between the start of the string and the first [ character.
Your regex matches [ and .. and ], so the string is split at each of these occurrences.
You should not use a split but match each number in your string using regex.
You've set it up such that [, ] and .. are delimiters. Split will return an empty first index because the first character in your string [1..1] is a delimiter. I would strip delimiters from the front and end of your string, as suggested here.
So, something like
input.replaceFirst("^\\[", "").split("^\\[|\\.{2}|\\]$");
Or, use regex and regex groups (such as the other answers in this question) more directly rather than through split.
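A short sketch of the difference, assuming the input always has the shape [a..b]. The class name RangeSplit and the method bounds are illustrative names, not taken from the answers.

```java
import java.util.Arrays;

public class RangeSplit {
    // Strip the enclosing brackets first, then split on the two dots.
    static String[] bounds(String range) {
        String inner = range.replaceFirst("^\\[", "").replaceFirst("\\]$", "");
        return inner.split("\\.{2}");
    }

    public static void main(String[] args) {
        // The leading '[' is a delimiter, so split() reports the empty string before it.
        System.out.println(Arrays.toString("[1..1]".split("^\\[|\\.{2}|\\]$")));
        // prints: [, 1, 1]

        // With the brackets removed beforehand, only the numbers remain.
        System.out.println(Arrays.toString(bounds("[1..1]")));
        // prints: [1, 1]
    }
}
```

The trailing ] does not produce an empty element in the first call because split() discards trailing empty strings by default.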
Why not use a regex to capture the numbers? This will be more effective and less error-prone. In that case the regex looks like:
^\[(\d+)\.{2}(\d+)\]$
And you can capture them with:
    Pattern pattern = Pattern.compile("^\\[(\\d+)\\.{2}(\\d+)\\]$");
    Matcher matcher = pattern.matcher(text);
    if (matcher.find()) { // we've found a match
        int range_from = Integer.parseInt(matcher.group(1));
        int range_to = Integer.parseInt(matcher.group(2));
    }
with range_from and range_to being the integers you can now work with.
The advantage is that the pattern will fail on strings that don't make much sense, like ..3[4, etc.

How can I push regex matches to array in java?

I've currently got a string, of which I want to use certain parts. With these parts I want to do various things, like pushing them to an array or showing them in a text area.
First I tried the split method. It deletes my regex matches and prints the other parts of the string. I want to delete the other parts and print the regex matches.
How can I do this?
For example:
There are lot of youtube links like this
https://www.youtube.com/watch?v=qJuoXM7G322&list=PLRfAW_jVDn06M7qxHIwlowgLY3Io1pG6z&index=7
I want to take only simple video link with this expression
"https:\\/\\/www.youtube.com\\/watch\\?v=.{11}"
when I use this code :
    String ytLink = linkArea.getText();
    String regexp = "https:\\/\\/www.youtube.com\\/watch\\?v=.{11}";
    String[] tokenVal;
    tokenVal = ytLink.split(regexp);
    System.out.println("Count of Links : "+tokenVal.length);
    for (String t : tokenVal) {
        System.out.println(t);
    }
It prints
"&list=PLRfAW_jVDn06M7qxHIwlowgLY3Io1pG6z&index=7"
I want the output to be like this:
"https://www.youtube.com/watch?v=SATL2mTfZO0"
You are splitting the string with that regular expression, which is not the correct tool for the job.
It is dividing your example string into:
"" // The bit before the separator.
"https://www.youtube.com/watch?v=qJuoXM7G322" // The separator
"&list=PLRfAW_jVDn06M7qxHIwlowgLY3Io1pG6z&index=7" // The bit after the separator
but then discarding the separator, so you'd get back a 2-element array containing:
"" // The bit before the separator.
"&list=PLRfAW_jVDn06M7qxHIwlowgLY3Io1pG6z&index=7" // The bit after the separator
If you want to get the thing that matches the regex, you'd need to use Pattern and Matcher:
    Pattern pattern = Pattern.compile("https:\\/\\/www.youtube.com\\/watch\\?v=.{11}");
    Matcher matcher = pattern.matcher(ytLink);
    if (matcher.find()) {
        System.out.println(matcher.group());
    }
(I don't entirely trust your escaped backslashes in your regular expression; however the pattern is not really important to the principle)
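To collect every matching link rather than only the first, the find() call can be looped and each match added to a list. This is a sketch under a few assumptions: the class name YoutubeLinks and the method findLinks are made up for illustration, the dots in the host name are escaped so they match literally, and the \\/ escapes are dropped because / needs no escaping in a Java regex. The .{11} part is kept from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YoutubeLinks {
    // Base watch URL followed by exactly 11 characters, as in the question.
    private static final Pattern LINK =
            Pattern.compile("https://www\\.youtube\\.com/watch\\?v=.{11}");

    static List<String> findLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK.matcher(text);
        // find() moves on to the next match on every call.
        while (matcher.find()) {
            links.add(matcher.group());
        }
        return links;
    }

    public static void main(String[] args) {
        String text = "https://www.youtube.com/watch?v=qJuoXM7G322"
                + "&list=PLRfAW_jVDn06M7qxHIwlowgLY3Io1pG6z&index=7";
        System.out.println(findLinks(text));
        // prints: [https://www.youtube.com/watch?v=qJuoXM7G322]
    }
}
```

If an array is needed instead of a list, links.toArray(new String[0]) converts the result.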
You can negate your regex using a negative lookahead: (?!pattern)
See also : How to negate the whole regex?

How get only characters between `(` and `)` using regular expression?

I am having a scenario like this:
A= add(1a,2b,3c,4d). Now I want only the values inside the brackets. How can I do that? Can anyone help me?
I tried using this:
replaceAll("\\d",""); It removed all the digits, but I want to get the characters inside the brackets, with the commas.
For example: a,b,c,d
([0-9()]|.*(?=\())
[0-9()] This will match the digits and brackets
.*(?=\() This will match anything before the opening bracket
The | in the middle acts like an OR e.g. match (THIS | THIS)
In your case with a replace this A= add(1a,2b,3c,4d) will become a,b,c,d
Test here and choose the replace tab at the top http://gskinner.com/RegExr/
You can use this Regex:
\\((.*?)\\)
If you want to extract the chars without the ,, you can use String#split.
Your solution doesn't work because you are doing something irrelevant. You are replacing all digits with "", meaning that you're removing them.
Another solution:
String myStr = str.split("\\(|\\)")[1];
If your string is A= add(1a,2b,3c,4d), after the regex, you'll get 1a,2b,3c,4d. If you don't want the ints, use replaceAll.
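The pattern you require is: "\\((.*)\\)".
The code below demonstrates how to use this pattern to find the items enclosed in brackets in an input string (note that the greedy .* runs from the first ( to the last ), so several bracketed groups in one string come back as a single match):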
    String example = "A= add(a,b,c,d)";
    Pattern pattern = Pattern.compile("\\((.*)\\)");
    Matcher matcher = pattern.matcher(example);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
Prints:
a,b,c,d
\(([^\)]*)\) will give back in the first group "a,b,c,d".
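Putting the pieces from the answers together for the original input, which still contains digits: capture what is between the brackets, then remove the digits from the captured group only. This is a sketch; the class name ParenContents and the method lettersInside are illustrative, and returning an empty string when there are no brackets is an arbitrary choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParenContents {
    // Group 1 captures everything between "(" and the next ")".
    private static final Pattern INSIDE = Pattern.compile("\\(([^)]*)\\)");

    static String lettersInside(String s) {
        Matcher m = INSIDE.matcher(s);
        if (!m.find()) {
            return ""; // no bracketed part found
        }
        // Strip the digits from the captured text, keeping letters and commas.
        return m.group(1).replaceAll("\\d", "");
    }

    public static void main(String[] args) {
        System.out.println(lettersInside("A= add(1a,2b,3c,4d)")); // prints: a,b,c,d
    }
}
```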

split string on comma, but not commas inside parenthesis

I have a task involving Java IO: I need to read a file line by line and save the data into a database, but I've run into a problem.
"desk","12","15","(small,median,large)"
I want to use string.split(",") to split on "," and save the data into each column. But I found that the data inside the () has also been split. I do not want to split (small,median,large); I want to keep it intact. How can I do that? I know I can use a regular expression, but I really do not know how to do it.
You could solve this by using Pattern and Matcher. Any solution using split would just seem like a nasty workaround. Here's an example:
    public static void main(String[] args){
        String s = "\"desk\",\"12\",\"15\",\"(small,median,large)\"";
        Pattern p = Pattern.compile("\".+?\"");
        Matcher m = p.matcher(s);
        List<String> matches = new ArrayList<String>();
        while (m.find()){
            matches.add(m.group());
        }
        System.out.println(matches);
    }
or, if Java must be:), you can split by "\\s*\"\\s*,\\s*\"" and add afterwards the " if necessary to the beginning of the first field and to the end of the second.
I put \s because I see that you also have blank separators: 15",blank"(small
(\(.+?\)|\w+)
The regex above yields the matches below, which makes for a more flexible solution than some of the others posted. The code for applying a regular expression is in another answer on this page; just use this regular expression instead.
desk
12
15
(small,median,large)
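For completeness, a sketch of how that regular expression could be applied with Pattern and Matcher. The class name QuotedFields and the method fields are illustrative names; the pattern itself is the one given above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedFields {
    // Either a parenthesised group (kept whole) or a run of word characters.
    private static final Pattern FIELD = Pattern.compile("\\(.+?\\)|\\w+");

    static List<String> fields(String line) {
        List<String> result = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "\"desk\",\"12\",\"15\",\"(small,median,large)\"";
        for (String field : fields(line)) {
            System.out.println(field);
        }
    }
}
```

Because \w+ matches only letters, digits, and underscores, a field containing spaces or punctuation outside brackets would be broken into several matches, so this fits data shaped like the sample line rather than arbitrary CSV.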
