What pattern would I use to split the following type of string:
"NumStringNumString..."
For example "3X12Y5Z" into a String array of "3X","12Y", and "5Z"
Note: if necessary assume that the string is only one character as the original problem stated. I would still prefer the more general solution though.
I thought that the pattern "^(\d+\w+)" would work, but it doesn't cut it.
^ anchors the match to the beginning of the string, whereas you want to find every occurrence of the pattern.
if necessary assume that the string is only one character
I'll also assume uppercase characters only
Pattern p = Pattern.compile("[0-9]+[A-Z]");
Matcher m = p.matcher("3X12Y5Z");
while (m.find()) {
    System.out.println(m.group());
}
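For the more general case the question asks about (a number followed by more than one letter), a minimal sketch could widen the letter part to `[A-Za-z]+`. The class and method names are illustrative, and restricting the text part to ASCII letters is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumStringTokens {
    // One or more digits followed by one or more ASCII letters (assumed alphabet).
    private static final Pattern TOKEN = Pattern.compile("[0-9]+[A-Za-z]+");

    // Collects every "number + letters" token found in the input, in order.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("3X12Y5Z"));    // [3X, 12Y, 5Z]
        System.out.println(tokens("3Xy12Yab5Z")); // [3Xy, 12Yab, 5Z]
    }
}
```

Because there is no `^` anchor, `find()` keeps advancing through the string and returns one token per call.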
I have been trying to solve this problem. I have a string which has a pattern. Eg.
CW1234 has been despatched to CW334545
i.e. the String can contain patterns starting with CW followed by any number of digits (at most 16).
I want to replace all these patterns with an empty string, so that the string will look like
has been despatched to
I have tried the following, but it removes only CW and the first digit that follows it. I'm pretty new to Java. Any insights would be of great help.
if(Pattern.matches(".*[C][W][0-9].*", str1)) {
Matcher m = Pattern.compile(".*[C][W][0-9].*").matcher(str1);
while(m.find()) {
str1 = str1.replaceAll("[C][W][0-9]", "");
}
}
System.out.println(str1);
You need a {n,m} quantifier on the digits to enforce the maximum number of digits. Also, for replacement you don't need to check beforehand whether the pattern is present: replaceAll replaces only where the pattern matches, and otherwise leaves the string as it is.
So, remove all those Pattern and Matcher part, and change your regex to:
str1 = str1.replaceAll("CW\\d{0,16}", "");
If you want at least 1 digit, then make it {1,16}. No need to put C and W in different character classes. A character class with single character is as good as that character itself (given that it's not a special character). Also, you can use \\d instead of [0-9].
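As a rough illustration of the `{1,16}` variant, the sketch below wraps the call in a helper (the class and method names are made up for the example) and runs it on the string from the question.

```java
class StripCwCodes {
    // Removes "CW" followed by 1 to 16 digits; everything else is left untouched.
    static String strip(String s) {
        return s.replaceAll("CW\\d{1,16}", "");
    }

    public static void main(String[] args) {
        // Surrounding spaces are kept, so the result is " has been despatched to "
        System.out.println("[" + strip("CW1234 has been despatched to CW334545") + "]");
        // No match: the string comes back unchanged.
        System.out.println("[" + strip("nothing to remove") + "]");
    }
}
```

Note that the upper bound limits how many digits are consumed, not which tokens qualify: with 17 digits after CW, the first 16 are removed and the last digit stays behind.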
You're needlessly constructing the pattern and matching the string several times.
str1 = str1.replaceAll("CW\\d+", "");
This is sufficient. All other code is redundant.
You can also opt to do the replace by hand if performance is a problem.
Your replaceAll is missing a +:
str1 = str1.replaceAll("[C][W][0-9]+", "");
The + will make the regex match any number of digits directly following CW.
Your regex is wrong. Try with:
String str1 = "CW1234";
str1 = str1.replaceAll("\\bCW\\d{0,16}\\b","");
if "CW1234" is a single token within a string, or with
String str1 = "CW1234";
str1 = str1.replaceAll("^CW\\d{0,16}$","");
if "CW1234" is the whole string.
String.replaceAll("CW[0-9\\s]*", "") does what you need, and it also removes the space at the end of the number.
On another note, the whole point of Pattern.compile() is that you compile the required expression once in the application, and then use the matcher to find occurrences. So I think your usage is inappropriate (rather than incorrect).
Pattern pattern = Pattern.compile("CW[0-9\\s]*");
should occur only once in the code, and the compiled pattern is then reused as
Matcher matcher = pattern.matcher(stringToMatch);
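One common way to arrange this is a `static final` constant, sketched below. The class, constant, and method names are illustrative only; the regex is the one from this answer.

```java
import java.util.regex.Pattern;

class CwCleaner {
    // Compiled once when the class is loaded. Pattern instances are immutable
    // and safe to share between threads; Matcher instances are not.
    private static final Pattern CW_CODE = Pattern.compile("CW[0-9\\s]*");

    // Creates a fresh Matcher per call and removes every match.
    static String clean(String s) {
        return CW_CODE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // The space after CW1234 is consumed by \s, the one before CW334545 is not.
        System.out.println("[" + clean("CW1234 has been despatched to CW334545") + "]");
        // [has been despatched to ]
    }
}
```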
I want to check a string that matches the format "=number", ex "=5455".
As long as the first character is "=" and the rest consists only of digits [0-9] (a dot is not allowed), a "correct" message should pop up.
if(str.matches("^[=][0-9]+")){
Window.alert("correct");
}
So, is this ^[=][0-9]+ the correct one?
If it is not correct, can you provide a correct solution?
If it is correct, can you suggest a better one?
I'm no big regex expert and more knowledgeable people than me might correct this answer, but:
I don't think there's a point in using [=] rather than simply = - the [...] block is used to declare multiple choices, why declare a multiple choice of one character?
I don't think you need to use ^ (if your input string contains any character before =, it won't match anyway). I'm unsure as to whether its presence makes your regex faster, slower or has no effect.
In conclusion, I'd use =[0-9]+
That should be correct: it looks for an = sign anchored at the beginning, followed by one or more digits in the range 0-9.
Your regex will work, even though it can be simplified:
.matches() does not search for a match inside the input; it succeeds only if the entire input matches the regex, so the beginning-of-input anchor is not needed;
you don't need the character class around the =.
Therefore:
if (str.matches("=[0-9]+")) { ... }
If you want to match a string which only begins with that regex, you have to use a Pattern, a Matcher and .find():
final Pattern p = Pattern.compile("^=[0-9]+");
final Matcher m = p.matcher(str);
if (m.find()) { ... }
And finally, Matcher also has .lookingAt() which anchors the regex only at the beginning of the input.
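The difference between the three Matcher methods can be seen side by side in a small sketch. The helper names are invented for the example; the regex is the simplified one from this answer.

```java
import java.util.regex.Pattern;

class AnchoringDemo {
    private static final Pattern EQ_NUM = Pattern.compile("=[0-9]+");

    // True only if the whole input is "=digits".
    static boolean whole(String s) { return EQ_NUM.matcher(s).matches(); }

    // True if the input starts with "=digits"; trailing text is allowed.
    static boolean prefix(String s) { return EQ_NUM.matcher(s).lookingAt(); }

    // True if "=digits" occurs anywhere in the input.
    static boolean anywhere(String s) { return EQ_NUM.matcher(s).find(); }

    public static void main(String[] args) {
        for (String s : new String[] {"=5455", "=5455abc", "x=5455", "=5.5"}) {
            System.out.println(s + " matches=" + whole(s)
                    + " lookingAt=" + prefix(s) + " find=" + anywhere(s));
        }
    }
}
```

For the original requirement (nothing but = and digits, no dot), `matches()` is the one that rejects "=5.5" and "=5455abc".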
I have the following string:
http://xxx/Content/SiteFiles/30/32531a5d-b0b1-4a8b-9029-b48f0eb40a34/05%20%20LEISURE.mp3?&mydownloads=true
How can I extract the part after 30/? In this case, it's 32531a5d-b0b1-4a8b-9029-b48f0eb40a34. I have other strings that share the same part up to 30/, and after that each string has a different id, running up to the next /, which is what I want.
You can do like this:
String s = "http://xxx/Content/SiteFiles/30/32531a5d-b0b1-4a8b-9029-b48f0eb40a34/05%20%20LEISURE.mp3?&mydownloads=true";
System.out.println(s.substring(s.indexOf("30/")+3, s.length()));
The split function of the String class won't help you in this case, because it discards the delimiter and that's not what we want here. You need to make a pattern that looks behind. The look-behind syntax is:
(?<=X)Y
Which identifies any Y that is preceded by an X.
So in your case you need this pattern:
(?<=30/).*
Compile the pattern, match it against your input, find the match, and retrieve it:
String input = "http://xxx/Content/SiteFiles/30/32531a5d-b0b1-4a8b-9029-b48f0eb40a34/05%20%20LEISURE.mp3?&mydownloads=true";
Matcher matcher = Pattern.compile("(?<=30/).*").matcher(input);
matcher.find();
System.out.println(matcher.group());
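Since `.*` runs to the end of the line, the match above also includes the file name and query string. If only the id up to the next `/` is wanted, one possible variation is to replace `.*` with `[^/]+`. This is a sketch: the class and method names are invented, and the leading slash in `/30/` is an added assumption to avoid matching a path segment that merely ends in 30.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IdAfter30 {
    // Look-behind for "/30/", then everything up to (not including) the next slash.
    private static final Pattern ID = Pattern.compile("(?<=/30/)[^/]+");

    // Returns the path segment directly after "/30/", or empty if there is none.
    static Optional<String> extract(String url) {
        Matcher m = ID.matcher(url);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "http://xxx/Content/SiteFiles/30/32531a5d-b0b1-4a8b-9029-b48f0eb40a34/05%20%20LEISURE.mp3?&mydownloads=true";
        System.out.println(extract(url).orElse("no id found"));
        // 32531a5d-b0b1-4a8b-9029-b48f0eb40a34
    }
}
```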
Just for this one, or do you want a generic way to do it ?
String[] out = mystring.split("/");
return out[out.length - 2];
I think the / is definitely the delimiter you are searching for.
I can't see the problem you are talking about Alex
EDIT : Ok, Python got me with indexes.
Regular expression is the answer I think. However, how the expression is written depends on the data (url) format you want to process. Like this one:
Pattern pat = Pattern.compile("/Content/SiteFiles/30/([a-z0-9\\-]+)/.*");
Matcher m = pat.matcher("http://xxx/Content/SiteFiles/30/32531a5d-b0b1-4a8b-9029-b48f0eb40a34/05%20%20LEISURE.mp3?&mydownloads=true");
if (m.find()) {
System.out.println(m.group(1));
}
I want to use a regex in Java to split a string of digits.
I tested the regex in an online regex tester and it works there,
but in Java the result is wrong.
Pattern pattern = Pattern.compile("[\\\\d]{1,4}");
String[] results = pattern.split("123456");
// I expect 2 results ["1234","56"]
// Actual results is ["123456"]
Am I missing anything?
I know this question is boring, but I want to solve this problem.
Answer
Pattern pattern = Pattern.compile("[\\d]{1,4}");
String[] results = pattern.split("123456");
// Results length is 0
System.out.println(results.length);
is not working. I have tried it; it returns nothing in the results.
Please try it before answering.
Sincerely thank the people who helped me.
Solution:
Pattern pattern = Pattern.compile("([\\d]{1,4})");
Matcher matcher = pattern.matcher("123456");
List<String> results = new ArrayList<String>();
while (matcher.find()) {
results.add(matcher.group(1));
}
Output 2 results ["1234","56"]
Pattern pattern = Pattern.compile("[\\\\d]{1,4}")
Too many backslashes; try "[\\d]{1,4}" instead. In a Java string literal you only have to escape the backslash once, so the backslash in front of the d becomes \\. The string you wrote, "[\\\\d]{1,4}", is actually the regex [\\d]{1,4}: a literal backslash or a literal d, one to four times.
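The effect of the extra backslashes can be checked directly. The sketch below uses made-up constant and method names and prints each string as the regex engine sees it.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // Java literal with four backslashes: the regex engine receives [\\d]{1,4},
    // i.e. a character class containing a literal backslash and the letter d.
    static final String TOO_MANY = "[\\\\d]{1,4}";

    // Java literal with two backslashes: the regex engine receives [\d]{1,4},
    // i.e. a character class containing any digit.
    static final String CORRECT = "[\\d]{1,4}";

    // True if the entire input matches the given regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(TOO_MANY);                     // [\\d]{1,4}
        System.out.println(CORRECT);                      // [\d]{1,4}
        System.out.println(fullMatch(TOO_MANY, "1234"));  // false
        System.out.println(fullMatch(TOO_MANY, "d\\d"));  // true
        System.out.println(fullMatch(CORRECT, "1234"));   // true
    }
}
```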
When Java decided to add regular expressions to the standard library, they should have also added a regular expression literal syntax instead of shoe-horning it over Strings (with the unreadable extra escaping and no compile-time syntax checking).
You can't do it in one method call, because you can't specify a capturing group for the split, which would be needed to break up into four char chunks.
It's not "elegant", but you must first insert a character to split on, then split:
String[] results = "123456".replaceAll("....", "$0,").split(",");
Here's the output:
System.out.println(Arrays.toString(results)); // prints [1234, 56]
Note that you don't need to use Pattern etc because String has a split-by-regex method, leading to a one-line solution.
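One caveat with the insert-then-split trick is that it assumes the input never contains the inserted character: "12,456" would come back as [12, 4, 56]. If that matters, a plain loop over substring avoids regex altogether. This is an alternative sketch, not part of the answer above, and the names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class Chunker {
    // Splits s into consecutive pieces of at most `size` characters.
    static List<String> chunks(String s, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i += size) {
            // The last piece may be shorter than size.
            out.add(s.substring(i, Math.min(s.length(), i + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(chunks("123456", 4)); // [1234, 56]
        System.out.println(chunks("12,456", 4)); // [12,4, 56]
    }
}
```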
Question closed because I misunderstood the situation. To show my stupidity though, I'll not remove what I wrote.
I'd like to encode a piece of string into Pattern, and get the string back.
I tried:
String s = buff.readLine();
Pattern p = new Pattern(s);
and use the following to retrieve my string
System.out.println(p.toString());
But it didn't work; the output is just "package name#(some random things)"... I also tried Pattern p = Pattern.compile(s);
but I got an error from the compiler.
Well I just tried this:
Pattern p = Pattern.compile("Hello");
System.out.println( p.toString() );
And it worked, printing out 'Hello'.
Are you importing the java.util.regex.Pattern class?
The javadoc for Pattern#toString() seems to indicate that the source of the complete regex is only returned since java 1.5. However, Pattern#pattern() does not have a since tag, so it is presumably available since the class was introduced (java 1.4). Try System.out.println(p.pattern());
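A quick way to check this is to compile a string and read it back through both methods. The class and method names below are illustrative.

```java
import java.util.regex.Pattern;

class PatternSourceDemo {
    // Compiles the regex and returns the source string it was compiled from.
    static String roundTrip(String regex) {
        return Pattern.compile(regex).pattern();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("Hello");
        System.out.println(p.pattern());  // Hello
        System.out.println(p.toString()); // Hello (on Java 1.5 and later)
    }
}
```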
You're using a regex Pattern object to store and retrieve a String. This makes no sense. A Pattern is not used for storing Strings. A Pattern is used for searching other strings. It's a regular expression engine. Let me give you an example of the use of a Pattern.
We really have 2 objects when using Regular Expressions in Java. Pattern, and Matcher.
Pattern = A Regular Expression.
Matcher = All of the Matches found when we apply the Pattern to a String.
Let me give you an example of Pattern and Matcher: we'll search for four digits separated by a colon, as in a time, e.g. 12:42
long timeL;
Pattern pattern = Pattern.compile(".*([1234567890]{2}:[1234567890]{2}).*");
Matcher matcher = pattern.matcher("Match me! 12:42 Match me!");
if (matcher.matches()) {
String timeStr = matcher.group(1);
System.out.println("Just the time: "+timeStr);
System.out.println("The entire String: "+matcher.group(0));
String[] timeParts = timeStr.split("[:]");
int hours = Integer.parseInt(timeParts[0]);
int minutes = Integer.parseInt(timeParts[1]);
timeL = (hours*60*60*1000) + (minutes*60*1000);
System.out.println(timeL);
}
After we've applied the Pattern to the String, and gotten a Matcher, we ask if the Matcher actually has a Match or not. You'll notice that we then request group 1, which is the match in the parentheses in: .*([1234567890]{2}:[1234567890]{2}).*
group 0 would be the entire match, and would result in returning the String given.
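A variation on the example, sketched under the assumption that only the time itself is of interest: with `find()` instead of `matches()`, the surrounding `.*` is unnecessary, group 0 becomes just the time, and hours and minutes can each get their own group. The names are illustrative, and returning -1 for "no time found" is a choice made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TimeGroups {
    // Group 1 = hours, group 2 = minutes; group 0 is the whole "hh:mm" match.
    private static final Pattern TIME = Pattern.compile("(\\d{2}):(\\d{2})");

    // Returns the first hh:mm found in text as milliseconds, or -1 if none.
    static long toMillis(String text) {
        Matcher m = TIME.matcher(text);
        if (!m.find()) {
            return -1L;
        }
        int hours = Integer.parseInt(m.group(1));
        int minutes = Integer.parseInt(m.group(2));
        return (hours * 60L + minutes) * 60L * 1000L;
    }

    public static void main(String[] args) {
        System.out.println(toMillis("Match me! 12:42 Match me!")); // 45720000
        System.out.println(toMillis("no time here"));              // -1
    }
}
```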
So, I hope you understand why it's extremely weird to be using a Pattern to store a String.