find overlapping regex pattern - java

I'm using a regex to find a pattern, and I need all overlapping matches, like this:
input: "word1_word2_word3_..."
result: "word1_word2", "word2_word3", "word3_word4", ...

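It can be done using a positive lookahead, (?=...). A lookahead matches without consuming characters, so the next search can start inside the text that was just captured, which is what allows the overlap.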
Regex: (?=(?:_|^)([^_]+_[^_]+))
Java code:
String text = "word1_word2_word3_word4_word5_word6_word7";
String regex = "(?=(?:_|^)([^_]+_[^_]+))";
Matcher matcher = Pattern.compile(regex).matcher(text);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output:
word1_word2
word2_word3
word3_word4
...
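To see why the lookahead matters, here is a minimal sketch that runs a plain consuming pattern and the lookahead pattern over the same input. The class name OverlapDemo and the helper findAll are illustrative assumptions, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlapDemo {
    // Collects the given capture group of every match (group 0 is the whole match).
    static List<String> findAll(String regex, int group, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "word1_word2_word3_word4";
        // Consuming pattern: each match swallows both words, so pairs cannot overlap.
        System.out.println(findAll("[^_]+_[^_]+", 0, text));
        // prints [word1_word2, word3_word4]

        // Lookahead: the match itself is empty, so the next search can reuse the second word.
        System.out.println(findAll("(?=(?:_|^)([^_]+_[^_]+))", 1, text));
        // prints [word1_word2, word2_word3, word3_word4]
    }
}
```

The consuming pattern skips word2_word3 because word2 was already used up by the first match; the lookahead version captures inside group 1 while the overall match stays zero-width.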

You can do it without regex, using split:
String input = "word1_word2_word3_word4";
String[] words = input.split("_");
List<String> outputs = new LinkedList<>();
for (int i = 0; i < words.length - 1; i++) {
String first = words[i];
String second = words[i + 1];
outputs.add(first + "_" + second);
}
for (String output : outputs) {
System.out.println(output);
}

Related

How to use regex to split a string containing numbers and letters in java

My task is to split a string that starts with digits and contains both digits and letters into two substrings. The first consists of all the digits before the first letter. The second is the remaining part, which shouldn't be split further even if it contains digits.
For example, a string "123abc34de" should be split as: "123" and "abc34de".
I know how to write a regular expression for such a string, and it might look like this:
[0-9]{1,}[a-zA-Z]{1,}[a-zA-Z0-9]{0,}
I have tried multiple times but still don't know how to apply the regex in the String.split() method, and there seems to be very little material online about this. Thanks for any help.
You can do it this way:
final String regex = "([0-9]{1,})([a-zA-Z]{1,}[a-zA-Z0-9]{0,})";
final String string = "123ahaha1234";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
matcher.group(1) contains the first part and matcher.group(2) contains the second.
You can add these values to a list or array.
You can use a pretty simple pattern, "^(\\d+)(\\w+)", which captures the digits at the start and then, once letters appear, takes the remaining word characters:
String string = "123abc34de";
Matcher matcher = Pattern.compile("^(\\d+)(\\w+)").matcher(string);
String firstpart = "";
String secondPart = "";
if (matcher.find()) {
firstpart = matcher.group(1);
secondPart = matcher.group(2);
}
System.out.println(firstpart + " - " + secondPart); // 123 - abc34de
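Since the question asks specifically about String.split(), here is a hedged sketch of one way to do it: split at the zero-width position between a digit and a letter, with a limit of 2 so the remainder stays in one piece. The class and method names are illustrative assumptions.

```java
import java.util.Arrays;

public class SplitLeadingDigits {
    // Splits at the first boundary that has a digit on its left and a letter on its right.
    // The limit of 2 stops after the first split, so later digit/letter boundaries are ignored.
    static String[] splitLeadingDigits(String s) {
        return s.split("(?<=[0-9])(?=[a-zA-Z])", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLeadingDigits("123abc34de")));
        // prints [123, abc34de]
    }
}
```

Unlike the Matcher-based answers, this does not check that the input actually starts with digits; it simply splits at the first digit-to-letter boundary it finds, so validate the input first if that matters.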
This is not the idiomatic way, but it gives the result without regex:
public static void main(String[] args) {
    String example = "1234abc123";
    // Advance until the first character that is not a digit.
    int index = 0;
    while (index < example.length() && Character.isDigit(example.charAt(index))) {
        index++;
    }
    // Everything before index is the numeric prefix; the rest is kept whole.
    String firstHalf = example.substring(0, index);
    String secondHalf = example.substring(index);
    System.out.println(firstHalf);
    System.out.println(secondHalf);
}
The output will be 1234 and, on the next line, abc123.

Need a regexp to extract a Sub String of a String

I have a string that looks like this:
abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml
I want to get the content after n occurrences and before m occurrences of the character /.
For instance, from the string above, I want:
mn/src/main
Please suggest some solution for this.
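The regex you need is this:
(?:.*?\/){7}(.*?)(.*)(?:\/.*?){2}$
A generic regex:
(?:.*?\/){n}(.*?)(.*)(?:\/.*?){m}$
Here n is the number of leading segments to skip and m is the number of trailing segments to drop; 7 and 2 give the result above. The wanted text ends up in the second capture group, because the lazy first group matches the empty string.
Demo here: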
http://regex101.com/r/bW2yF3
Use split():
String path = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
String[] tokens = path.split("/");
Now just print the tokens from index n (inclusive) to m (exclusive), for example n = 7 and m = 10:
for (int i = n; i < m; i++) {
    System.out.print(tokens[i] + (i != m - 1 ? "/" : ""));
}
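The split approach can also be wrapped in a small helper that returns the result instead of printing it. This is only a sketch: the class name PathSlice and the method slice are assumptions, and it does no bounds checking on n and m.

```java
import java.util.Arrays;

public class PathSlice {
    // Returns the segments from index n (inclusive) to m (exclusive), rejoined with "/".
    static String slice(String path, int n, int m) {
        String[] tokens = path.split("/");
        return String.join("/", Arrays.copyOfRange(tokens, n, m));
    }

    public static void main(String[] args) {
        String path = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
        System.out.println(slice(path, 7, 10)); // prints mn/src/main
    }
}
```

Arrays.copyOfRange pads with null when m is past the end of the array, so a real implementation would probably want to validate both indices first.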
If you must use regex:
String s = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
int n = 7;
int m = 10;
Pattern p = Pattern.compile("(?:[^/]*/){" + n + "}((?:[^/]*/){" + (m - n - 1) + "}[^/]*)/.*");
Matcher matcher = p.matcher(s);
if (matcher.matches()) {
System.out.println(matcher.group(1));
}

Java, How to split String with shifting

How can I split a string into 2-character pieces, shifting by one character each time?
For example:
My string is: todayiscold
My target is: "to","od","da","ay","yi","is","sc","co","ol","ld"
But with this code:
Arrays.toString("todayiscold".split("(?<=\\G.{2})"));
I get: "to","da","yi","co","ld"
Can anybody help?
Try this:
String e = "example";
for (int i = 0; i < e.length() - 1; i++) {
System.out.println(e.substring(i, i+2));
}
Use a loop:
String test = "abcdefgh";
List<String> list = new ArrayList<String>();
for(int i = 0; i < test.length() - 1; i++)
{
list.add(test.substring(i, i + 2));
}
The following regex-based code should work:
String str = "todayiscold";
Pattern p = Pattern.compile("(?<=\\G..)");
Matcher m = p.matcher(str);
int start = 0;
List<String> matches = new ArrayList<String>();
while (m.find(start)) {
matches.add(str.substring(m.end()-2, m.end()));
start = m.end()-1;
}
System.out.println("Matches => " + matches);
The trick is to pass end()-1 from the last match to find(), so each search restarts one character before the previous match ended.
Output:
Matches => [to, od, da, ay, yi, is, sc, co, ol, ld]
You can't use split in this case, because all split does is find the places to split at and break your string there, so you can't make the same character appear in two parts.
Instead, you can use the Pattern/Matcher mechanism, like this:
String test = "todayiscold";
List<String> list = new ArrayList<String>();
Pattern p = Pattern.compile("(?=(..))");
Matcher m = p.matcher(test);
while(m.find())
list.add(m.group(1));
or, even better, iterate over your string's characters and create substrings, as in D-Rock's answer.
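The character-iteration idea generalizes to any window width. A minimal sketch, assuming a positive width; the names SlidingWindow and windows are illustrative, not taken from any of the answers:

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingWindow {
    // Returns every substring of the given width, moving one character at a time.
    static List<String> windows(String s, int width) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + width <= s.length(); i++) {
            out.add(s.substring(i, i + width));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(windows("todayiscold", 2));
        // prints [to, od, da, ay, yi, is, sc, co, ol, ld]
    }
}
```

A string shorter than the width simply yields an empty list, since the loop condition fails on the first check.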

How to get substring based on special characters in android?

Actually this is a very simple question, I tried a lot but I am unable to get the exact solution. I have a string like:
String mystring = "one<1234567>,two<98765432>,three<878897656>";
Here I want the data which is inside "<" and ">". Can anyone help me with this?
I would use regex
String str = "one<1234567>,two<98765432>,three<878897656>";
Matcher m = Pattern.compile("<(.+?)>").matcher(str);
while(m.find()) {
String v = m.group(1);
}
Try
String mystring = "one<1234567>,two<98765432>,three<878897656>";
String[] result = mystring.split(",");
for (String s : result) {
s = s.substring(s.indexOf("<")+1);
s = s.substring(0, s.indexOf(">"));
System.out.println(s);
}
Print result :
1234567
98765432
878897656
You can use a regex like <(.*?)> :
String mystring = "one<1234567>,two<98765432>,three<878897656>";
Pattern pattern = Pattern.compile("<(.*?)>");
Matcher matcher = pattern.matcher(mystring);
while (matcher.find())
{
System.out.println(matcher.group(1));
}
Try this:
String mystring = "one<1234567>,two<98765432>,three<878897656>";
String[] a = mystring.split(",");
for (int i = 0; i < a.length; i++) {
    // Start one past "<" so the bracket itself is not included.
    String substr = a[i].substring(a[i].indexOf("<") + 1, a[i].indexOf(">"));
    System.out.println(substr);
}
Try this if the values inside the brackets are always numeric and the text outside them is always lowercase letters:
String[] strings = mystring.replaceAll("[a-z<>]", "").split(",");
for (String string : strings) {
    System.out.println(string);
}
I found another solution using the StringTokenizer class.
You can use it like this:
StringTokenizer tokens = new StringTokenizer(KEY_SUBFOLDERNAME, ".");
String first_string = tokens.nextToken();
File_Ext = tokens.nextToken();
System.out.println("First_string : "+first_string);
System.out.println("File_Ext : "+File_Ext);

Text split after a specified length but don't break words using grails

I have a long string that I need to parse into an array of strings that do not exceed 50 characters in length. The tricky part of this for me is making sure that the regex finds the last whitespace before 50 characters to make a clean break between strings since I don't want words cut off.
public List<String> splitInfoText(String msg) {
int MAX_WIDTH = 50;
def line = []
String[] words;
msg = msg.trim();
words = msg.split(" ");
StringBuffer s = new StringBuffer();
words.each {
word -> s.append(word + " ");
if (s.length() > MAX_WIDTH) {
s.replace(s.length() - word.length()-1, s.length(), " ");
line << s.toString().trim();
s = new StringBuffer(word + " ");
}
}
if (s.length() > 0)
line << s.toString().trim();
return line;
}
Try this:
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile(".{1,50}(?:\\s|$)", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
matchList.add(regexMatcher.group());
}
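For reference, the same regex can be put into a self-contained helper with a configurable width. This is a sketch under assumptions: the names WordWrap and wrap are hypothetical, and the trim() call is an addition that drops the whitespace each match ends on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordWrap {
    // Breaks text into lines of at most maxWidth characters, splitting only at whitespace.
    static List<String> wrap(String text, int maxWidth) {
        Pattern regex = Pattern.compile(".{1," + maxWidth + "}(?:\\s|$)", Pattern.DOTALL);
        List<String> lines = new ArrayList<>();
        Matcher m = regex.matcher(text);
        while (m.find()) {
            // The match includes the whitespace it broke on, so trim it off.
            lines.add(m.group().trim());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(wrap("aaa bbb ccc", 7)); // prints [aaa bbb, ccc]
    }
}
```

One limitation to be aware of: a single word longer than maxWidth cannot match as a whole, so the matcher skips its leading characters. Inputs with very long words need separate handling.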
I believe a Groovier version of Tim's answer is:
List matchList = ( subjectString =~ /(?s)(.{1,50})(?:\s|$)/ ).collect { it[ 1 ] }
