regex doesn't find last word in my string - java

I have a regular expression [a-z]\d to unpack text which is compressed by a simple rule:
hellowoooorld -> hel2owo4rld
Now I have to unpack my text, and it doesn't work correctly: it can't find the last word in my String.
It always seems to skip gu4ys.
StringBuilder text = new StringBuilder("Hel2o peo7ple it is ou6r wo3rld gu4ys");
Pattern pattern = Pattern.compile("[a-z]\\d");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group());
int startWord = matcher.start();
int numLetters = Integer.parseInt(text.substring(startWord + 1, startWord + 2));
text.deleteCharAt(startWord + 1);
for (int i = 0; i < numLetters - 1; ++i) {
text.insert(startWord + 1, text.charAt(startWord));
}
}
System.out.println(text);
Result is : Hello peooooooople it is ouuuuuur wooorld gu4ys
I expect this : Hello peooooooople it is ouuuuuur wooorld guuuuys
I can't understand why it doesn't work; it all seems simple.

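It seems that Java's Matcher records the length of the input when it is created (or reset) and does not search past that position. Because you insert into the StringBuilder while matching, the text grows, and by the time the matcher reaches "gu4ys" that word has shifted beyond the recorded end, so it is never examined.
A quick, though slow, fix is to re-create the matcher after every modification: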
StringBuilder text = new StringBuilder("Hel2o peo7ple it is ou6r wo3rld gu4ys");
Pattern pattern = Pattern.compile("[a-z]\\d");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
int startWord = matcher.start();
int numLetters = Integer.parseInt(text.substring(startWord + 1, startWord + 2));
text.deleteCharAt(startWord + 1);
for (int i = 0; i < numLetters - 1; ++i) {
text.insert(startWord + 1, text.charAt(startWord));
}
matcher = pattern.matcher(text);
}
System.out.println(text);
A faster approach would find the numbers, calculate the string length and then manually construct the string using the found numbers.
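One way to sketch that faster approach is to leave the input untouched and build the result in a separate StringBuilder, so the matcher never sees a changing string. The class name Unpack and the method unpack are made up for this illustration, and the sketch assumes that a letter followed by 0 should produce no letters at all (the original loop would keep one).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Unpack {
    // Expands every "<letter><digit>" pair into the letter repeated <digit> times.
    // Hypothetical helper, not part of the original question or answer.
    static String unpack(String packed) {
        Matcher matcher = Pattern.compile("([a-z])(\\d)").matcher(packed);
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the previous match in the unmodified input
        while (matcher.find()) {
            out.append(packed, last, matcher.start()); // copy text between matches
            char letter = matcher.group(1).charAt(0);
            int count = Integer.parseInt(matcher.group(2));
            for (int i = 0; i < count; i++) {
                out.append(letter); // assumption: a count of 0 emits nothing
            }
            last = matcher.end();
        }
        out.append(packed, last, packed.length()); // copy the tail after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Hello peooooooople it is ouuuuuur wooorld guuuuys
        System.out.println(unpack("Hel2o peo7ple it is ou6r wo3rld gu4ys"));
    }
}
```

Because the input string is never modified, a single pass is enough and the matcher does not need to be re-created.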

The issue is probably that the matcher is only finding the pattern [a-z]\d, which matches a single letter followed by a digit, but it is not finding the last word "gu4ys" because it doesn't match that pattern.
To fix this, you can modify the regular expression to include an optional group that matches any remaining letters at the end of the text.
Try this regex and please let me know if it worked :)
"[a-z]\d|[a-z]+"

Related

find overlapping regex pattern

I'm using regex to find a pattern
I need to find all matches in this way :
input :"word1_word2_word3_..."
result: "word1_word2", "word2_word3", "word3_word4" ...
It can be done using (?=) positive lookahead.
Regex: (?=(?:_|^)([^_]+_[^_]+))
Java code:
String text = "word1_word2_word3_word4_word5_word6_word7";
String regex = "(?=(?:_|^)([^_]+_[^_]+))";
Matcher matcher = Pattern.compile(regex).matcher(text);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output:
word1_word2
word2_word3
word3_word4
...
You can do it without regex, using split:
String input = "word1_word2_word3_word4";
String[] words = input.split("_");
List<String> outputs = new LinkedList<>();
for (int i = 0; i < words.length - 1; i++) {
String first = words[i];
String second = words[i + 1];
outputs.add(first + "_" + second);
}
for (String output : outputs) {
System.out.println(output);
}

How to use regex to split a string containing numbers and letters in java

My task is to split a string, which starts with numbers and contains numbers and letters, into two substrings. The first one consists of all the digits before the first letter. The second one is the remaining part, and it shouldn't be split even if it contains numbers.
For example, a string "123abc34de" should be split as: "123" and "abc34de".
I know how to write a regular expression for such a string, and it might look like this:
[0-9]{1,}[a-zA-Z]{1,}[a-zA-Z0-9]{0,}
I have tried multiple times but still don't know how to apply a regex in the String.split() method, and there seem to be very few online materials about this. Thanks for any help.
You can do it this way:
final String regex = "([0-9]{1,})([a-zA-Z]{1,}[a-zA-Z0-9]{0,})";
final String string = "123ahaha1234";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
matcher.group(1) contains the first part and matcher.group(2) contains the second.
You can add these values to a list or an array.
You can use a pretty simple pattern: "^(\\d+)(\\w+)". It captures the digits at the start, and then, once a letter appears, the remaining word characters.
String string = "123abc34de";
Matcher matcher = Pattern.compile("^(\\d+)(\\w+)").matcher(string);
String firstpart = "";
String secondPart = "";
if (matcher.find()) {
firstpart = matcher.group(1);
secondPart = matcher.group(2);
}
System.out.println(firstpart + " - " + secondPart); // 123 - abc34de
This is not the most elegant way, but you will get the result:
public static void main(String[] args) {
    String example = "1234abc123";
    // Walk forward while the characters are digits; index ends up at the
    // first non-digit (or at example.length() if the string is all digits).
    int index = 0;
    while (index < example.length() && Character.isDigit(example.charAt(index))) {
        index++;
    }
    String firstHalf = example.substring(0, index);
    String secondHalf = example.substring(index);
    System.out.println(firstHalf);
    System.out.println(secondHalf);
}
The output will be 1234 and, on the next line, abc123.
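Since the question asks specifically about String.split(), here is a minimal sketch of how that could look. It splits at the zero-width position between a digit and a letter and passes a limit of 2 so that only the first such position is used. The class and method names are invented for the example, and it assumes the input really does start with digits.

```java
import java.util.Arrays;

public class SplitAtFirstLetter {
    // Splits at the first boundary that has a digit on the left and a letter on the right.
    // The limit of 2 stops split() after one cut, so later digit/letter boundaries are ignored.
    static String[] splitDigitsFromRest(String s) {
        return s.split("(?<=[0-9])(?=[a-zA-Z])", 2);
    }

    public static void main(String[] args) {
        // prints: [123, abc34de]
        System.out.println(Arrays.toString(splitDigitsFromRest("123abc34de")));
    }
}
```

If the string does not start with digits, the first digit/letter boundary may lie further inside the string, so the leading part should be validated separately in that case.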

Need a regexp to extract a Sub String of a String

I have a string that looks like this:
abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml
I want to get the content after n occurrences and before m occurrences of the character /.
For instance, from the string above, I want:
mn/src/main
Please suggest some solution for this.
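The regex you require is this:
(?:.*?\/){7}(.*?)(.*)(?:\/.*?){2}$
A generic form:
(?:.*?\/){n}(.*?)(.*)(?:\/.*?){m}$
Replace n with the number of leading segments to skip (7 here) and m with the number of trailing segments to drop (2 here). The wanted text ends up in capture group 2; group 1 is always empty.
Demo here: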
http://regex101.com/r/bW2yF3
Use split().
String path = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
String[] tokens = path.split("/");
Now just print the tokens from index n (inclusive) to m (exclusive); for the example above, n = 7 and m = 10:
for (int i = n; i < m; i++) {
    System.out.print(tokens[i] + (i != m - 1 ? "/" : ""));
}
If you must use regex:
String s = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
int n = 7;
int m = 10;
Pattern p = Pattern.compile("(?:[^/]*/){" + n + "}((?:[^/]*/){" + (m - n - 1) + "}[^/]*)/.*");
Matcher matcher = p.matcher(s);
if (matcher.matches()) {
System.out.println(matcher.group(1));
}

How many times one string contains another [duplicate]

As in the subject: how can I check how many times one string contains another one?
Example:
s1 "babab"
s2 "bab"
Result : 2
If I use Matcher, it only recognizes the first occurrence (the overlapping one is skipped):
String s1 = JOptionPane.showInputDialog(" ");
String s2 = JOptionPane.showInputDialog(" ");
Pattern p = Pattern.compile(s2);
Matcher m = p.matcher(s1);
int counter = 0;
while(m.find()){
System.out.println(m.group());
counter++;
}
System.out.println(counter);
I can do it as shown below, but I would like to use Java library classes like Scanner, StringTokenizer, Matcher, etc.:
String s1 = JOptionPane.showInputDialog(" ");
String s2 = JOptionPane.showInputDialog(" ");
String pom;
int count = 0;
for(int i = 0 ; i< s1.length() ; i++){
if(s1.charAt(i) == s2.charAt(0)){
if(i + s2.length() <= s1.length()){
pom = s1.substring(i,i+s2.length());
if(pom.equals(s2)){
count++;
}
}
}
}
System.out.println(count);
One liner solution for the lulz
longStr is the input string. findStr is the string to search for. No assumption, except that longStr and findStr must not be null and findStr must have at least 1 character.
longStr.length() - longStr.replaceAll(Pattern.quote(findStr.substring(0,1)) + "(?=" + Pattern.quote(findStr.substring(1)) + ")", "").length()
Since two matches are considered different as long as they start at different indices, and overlapping can happen, we need a way to differentiate between the matches and allow the matched parts to overlap.
The trick is to consume only the first character of the search string, and use a look-ahead to assert the rest of the search string. This allows the overlapping portion to be rematched, and by removing the first character of each match, we can count the number of matches.
I think this might work if you know the word you are looking for in the string; you might need to edit the regex pattern, though.
String string = "hellohellohellohellohellohello";
Pattern pattern = Pattern.compile("hello");
Matcher matcher = pattern.matcher(string);
int count = 0;
while (matcher.find()) count++;
The class Matcher has two methods, start() and end(), which return the start index and the end index of the last match. Further, find has an overload, find(int start), which resets the matcher and starts searching at the given index.
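As a rough sketch of how start() and find(int) can be combined to count overlapping occurrences: after each match, the search is restarted one character past the start of that match rather than at its end. The names OverlapCount and countOverlapping are invented for this example, and Pattern.quote is used on the assumption that the search string should be treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlapCount {
    // Counts occurrences of search in text, including overlapping ones.
    static int countOverlapping(String text, String search) {
        if (search.isEmpty()) {
            return 0; // assumption: an empty search string counts as zero occurrences
        }
        Matcher m = Pattern.compile(Pattern.quote(search)).matcher(text);
        int count = 0;
        int from = 0;
        // find(int) resets the matcher and searches from the given index
        while (m.find(from)) {
            count++;
            from = m.start() + 1; // step one char past the match start, not past its end
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOverlapping("babab", "bab")); // prints 2
    }
}
```

Because every match is non-empty, m.start() + 1 never exceeds the length of the text, so find(int) does not throw IndexOutOfBoundsException here.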
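You can do it like this:
private int counterString(String s, String search) {
    if (search.isEmpty()) {
        return 0; // an empty search string would otherwise loop forever
    }
    int times = 0;
    int index = s.indexOf(search, 0);
    // indexOf returns -1 when there is no further match; index 0 is a valid match
    while (index >= 0) {
        ++times;
        index = s.indexOf(search, index + 1);
    }
    return times;
}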
A quick brute-force solution:
String someString = "bababab";
String toLookFor = "bab";
int count = 0;
for (int i = 0; i < someString.length(); i++) {
if (someString.length() - i >= toLookFor.length()) {
if (someString.substring(i, i + toLookFor.length()).equals(toLookFor) && !"".equals(toLookFor)) {
count++;
}
}
}
System.out.println(count);
This prints out 3. Please note I assume that none of the Strings is null.

Issue in writing regular expression

How do I read a string from a given position (for example, 5) to the end of the string in Java?
QRegExp StripType(re, Qt::CaseInsensitive);
int p = StripType.indexIn(line, 0);
int len = StripType.matchedLength();
String tmp += line.mid(len);
How do I convert this Qt code into Java?
Here re is a regular expression. I want to convert the code above into Java, and I have tried:
String s =pattern.toString();
int pos = s.indexOf(line);
Matcher matcher = Pattern.compile(re).matcher(line);
if (matcher.find()) {
System.out.println(matcher.group());
} else {
System.out.println("String contains no character other than that");
}
len = matcher.start();
But it's not working correctly.
Thanks in advance.
To begin with, you should add the Pattern.CASE_INSENSITIVE flag:
Matcher matcher = Pattern.compile(re, Pattern.CASE_INSENSITIVE).matcher(line);
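Building on that, a possible Java counterpart of the Qt snippet could look like the sketch below. It mirrors the three Qt calls: matcher.start() plays the role of indexIn, matcher.end() - matcher.start() the role of matchedLength, and line.substring(len) the role of line.mid(len). The class and method names are invented, the behaviour when nothing matches (return the line unchanged) is an assumption, and QRegExp syntax is not identical to java.util.regex, so the pattern in re may need adjusting.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StripTypeSketch {
    // Hypothetical translation of: indexIn(line, 0), matchedLength(), line.mid(len)
    static String stripType(String re, String line) {
        Matcher matcher = Pattern.compile(re, Pattern.CASE_INSENSITIVE).matcher(line);
        if (!matcher.find()) {
            return line; // assumption: leave the line untouched when there is no match
        }
        int p = matcher.start();                   // position of the first match (unused, as in the Qt code)
        int len = matcher.end() - matcher.start(); // length of the matched text
        // Like the Qt code, this cuts at offset len, which equals the end of
        // the match only when the match starts at position 0.
        return line.substring(len);
    }

    public static void main(String[] args) {
        System.out.println(stripType("int\\s+", "INT value")); // prints value
    }
}
```

For the simpler case of reading from a fixed position to the end of the string, line.substring(5) is enough, provided the string has at least 5 characters.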
