Need a regexp to extract a Sub String of a String - java

I have a string that looks like this:
abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml
I want to get the content after n occurrences and before m occurrences of the character /.
For instance, from the string above, I want:
mn/src/main
Please suggest a solution for this.

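The regex you need is:
(?:.*?\/){7}(.*?)(.*)(?:\/.*?){2}$
A generic version:
(?:.*?\/){n}(.*?)(.*)(?:\/.*?){m}$
Replace n and m with the counts you need (7 and 2 in this example). Here n is the number of slashes to skip from the start and m is the number of slashes to drop from the end; the wanted text ends up in the second capture group.
Demo here:
http://regex101.com/r/bW2yF3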
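As a rough sketch of how the generic regex above could be used from Java: the class and method names below are made up for illustration, and the pattern is slightly simplified (the empty lazy group is dropped, so the result is in group 1). It assumes n counts slashes skipped from the start and m counts slashes dropped from the end.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SlashSlice {
    // Hypothetical helper: returns the text after the first n slashes
    // and before the last m slashes, or null if the path has too few slashes.
    static String between(String path, int n, int m) {
        Pattern p = Pattern.compile("(?:.*?/){" + n + "}(.*)(?:/.*?){" + m + "}$");
        Matcher matcher = p.matcher(path);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String path = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
        System.out.println(between(path, 7, 2)); // mn/src/main
    }
}
```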

Use split().
String path = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
String[] tokens = path.split("/");
Now just print it:
// n is the index of the first token to keep, m is one past the last (7 and 10 for the example)
for (int i = n; i < m; i++) {
    System.out.print(tokens[i] + (i != m - 1 ? "/" : ""));
}
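If you want the result as a String rather than printed output, the same split idea can be sketched with String.join and Arrays.copyOfRange. The class and method names are illustrative only, and the sketch assumes from and to are valid token indexes (copyOfRange pads with null when to runs past the end of the array).

```java
import java.util.Arrays;

final class PathSlice {
    // Hypothetical helper: joins tokens[from] .. tokens[to - 1] back together with "/".
    static String slice(String path, int from, int to) {
        String[] tokens = path.split("/");
        return String.join("/", Arrays.copyOfRange(tokens, from, to));
    }

    public static void main(String[] args) {
        String path = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
        System.out.println(slice(path, 7, 10)); // mn/src/main
    }
}
```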

If you must use regex:
String s = "abc/axs/abc/def/gh/ij/kl/mn/src/main/resources/xx.xml";
int n = 7;
int m = 10;
Pattern p = Pattern.compile("(?:[^/]*/){" + n + "}((?:[^/]*/){" + (m - n - 1) + "}[^/]*)/.*");
Matcher matcher = p.matcher(s);
if (matcher.matches()) {
    System.out.println(matcher.group(1));
}

Related

regex doesn't find last word in my string

I have a regular expression [a-z]\d to unpack text which is compressed by a simple rule:
hellowoooorld -> hel2owo4rld
Now I have to unpack my text, and it doesn't work correctly: it can't find the last word in my String.
It always seems to skip gu4ys.
StringBuilder text = new StringBuilder("Hel2o peo7ple it is ou6r wo3rld gu4ys");
Pattern pattern = Pattern.compile("[a-z]\\d");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println(matcher.group());
    int startWord = matcher.start();
    int numLetters = Integer.parseInt(text.substring(startWord + 1, startWord + 2));
    text.deleteCharAt(startWord + 1);
    for (int i = 0; i < numLetters - 1; ++i) {
        text.insert(startWord + 1, text.charAt(startWord));
    }
}
System.out.println(text);
The result is: Hello peooooooople it is ouuuuuur wooorld gu4ys
I expect this: Hello peooooooople it is ouuuuuur wooorld guuuuys
I can't understand why it doesn't work; it all seems simple.
It seems that Java's Matcher records the length of your input when it is initialized and doesn't search past that point. You are inserting into the StringBuilder, which makes it longer, so the tail of the text (where gu4ys ends up) is never examined.
A quick, though slow, fix is to re-initialize the matcher after every change.
StringBuilder text = new StringBuilder("Hel2o peo7ple it is ou6r wo3rld gu4ys");
Pattern pattern = Pattern.compile("[a-z]\\d");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    int startWord = matcher.start();
    int numLetters = Integer.parseInt(text.substring(startWord + 1, startWord + 2));
    text.deleteCharAt(startWord + 1);
    for (int i = 0; i < numLetters - 1; ++i) {
        text.insert(startWord + 1, text.charAt(startWord));
    }
    // Create a fresh matcher so it sees the new, longer text.
    matcher = pattern.matcher(text);
}
System.out.println(text);
A faster approach would find the numbers, calculate the string length and then manually construct the string using the found numbers.
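One possible sketch of such a single-pass approach: instead of editing the text in place, it scans the packed input once and appends to a fresh StringBuilder, so the matcher's input never changes. It skips the length pre-computation mentioned above for brevity. The class and method names are made up, and it assumes single-digit counts, as in the question's [a-z]\d pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Unpacker {
    // A lowercase letter followed by a single digit, e.g. "o4".
    private static final Pattern RUN = Pattern.compile("([a-z])(\\d)");

    // Hypothetical helper: expands every letter+digit pair into that many copies of the letter.
    static String unpack(String packed) {
        Matcher matcher = RUN.matcher(packed);
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the previous match in the packed input
        while (matcher.find()) {
            // Copy the unchanged text between the previous match and this one.
            out.append(packed, last, matcher.start());
            int count = matcher.group(2).charAt(0) - '0';
            for (int i = 0; i < count; i++) {
                out.append(matcher.group(1));
            }
            last = matcher.end();
        }
        // Copy whatever follows the final match.
        out.append(packed, last, packed.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unpack("hel2owo4rld")); // hellowoooorld
    }
}
```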
The issue is probably that the matcher is only finding the pattern [a-z]\d, which matches a single letter followed by a digit, but it is not finding the last word "gu4ys" because it doesn't match that pattern.
To fix this, you can modify the regular expression to include an optional group that matches any remaining letters at the end of the text.
Try this regex and please let me know if it worked :)
"[a-z]\d|[a-z]+"

find overlapping regex pattern

I'm using a regex to find a pattern.
I need to find all matches in this way:
input: "word1_word2_word3_..."
result: "word1_word2", "word2_word3", "word3_word4", ...
It can be done using (?=) positive lookahead.
Regex: (?=(?:_|^)([^_]+_[^_]+))
Java code:
String text = "word1_word2_word3_word4_word5_word6_word7";
String regex = "(?=(?:_|^)([^_]+_[^_]+))";
Matcher matcher = Pattern.compile(regex).matcher(text);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output:
word1_word2
word2_word3
word3_word4
...
You can do it without regex, using split:
String input = "word1_word2_word3_word4";
String[] words = input.split("_");
List<String> outputs = new LinkedList<>();
for (int i = 0; i < words.length - 1; i++) {
    String first = words[i];
    String second = words[i + 1];
    outputs.add(first + "_" + second);
}
for (String output : outputs) {
    System.out.println(output);
}

How to use regex to split a string containing numbers and letters in java

My task is to split a string, which starts with numbers and contains numbers and letters, into two substrings. The first one consists of all the numbers before the first letter. The second one is the remaining part, and shouldn't be split even if it contains numbers.
For example, the string "123abc34de" should be split into "123" and "abc34de".
I know how to write a regular expression for such a string, and it might look like this:
[0-9]{1,}[a-zA-Z]{1,}[a-zA-Z0-9]{0,}
I have tried multiple times but still don't know how to apply a regex in the String.split() method, and there seem to be very few online materials about this. Thanks for any help.
You can do it this way:
final String regex = "([0-9]{1,})([a-zA-Z]{1,}[a-zA-Z0-9]{0,})";
final String string = "123ahaha1234";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.println("Group " + i + ": " + matcher.group(i));
    }
}
matcher.group(1) contains the first part and matcher.group(2) contains the second.
You can add these values to a list or an array.
You can use a pretty simple pattern, "^(\\d+)(\\w+)", which captures the leading digits and then, from the first letter on, takes the remaining word characters:
String string = "123abc34de";
Matcher matcher = Pattern.compile("^(\\d+)(\\w+)").matcher(string);
String firstpart = "";
String secondPart = "";
if (matcher.find()) {
    firstpart = matcher.group(1);
    secondPart = matcher.group(2);
}
System.out.println(firstpart + " - " + secondPart); // 123 - abc34de
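Since the question asks about String.split() specifically, here is a minimal sketch of one way it might be done: split on a zero-width position that has a digit behind it and a letter ahead of it, with a limit of 2 so that only the first such position is used. The class and method names are illustrative, and the sketch assumes the input really does start with digits.

```java
import java.util.Arrays;

final class DigitPrefixSplit {
    // Hypothetical helper: splits once, at the first digit-to-letter boundary.
    static String[] splitAtFirstLetter(String input) {
        // Lookbehind for a digit, lookahead for a letter; the limit of 2 stops after one split.
        return input.split("(?<=[0-9])(?=[a-zA-Z])", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAtFirstLetter("123abc34de"))); // [123, abc34de]
    }
}
```

Without the limit argument, the string would also be split at the later boundary between "34" and "de".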
This is not the most elegant way, but you will get the result:
public static void main(String[] args) {
    String example = "1234abc123";
    // Walk forward while the characters are digits; split ends up
    // at the index of the first non-digit character.
    int split = 0;
    while (split < example.length() && Character.isDigit(example.charAt(split))) {
        split++;
    }
    String firstHalf = example.substring(0, split);
    String secondHalf = example.substring(split);
    System.out.println(firstHalf);
    System.out.println(secondHalf);
}
The output will be 1234 and, on the next line, abc123.

Java, How to split String with shifting

How can I split a string into 2-character pieces with shifting?
For example:
My string is: todayiscold
My target is: "to","od","da","ay","yi","is","sc","co","ol","ld"
but with this code:
Arrays.toString("todayiscold".split("(?<=\\G.{2})"));
I get: "to","da","yi","co","ld"
Can anybody help?
Try this:
String e = "example";
for (int i = 0; i < e.length() - 1; i++) {
    System.out.println(e.substring(i, i + 2));
}
Use a loop:
String test = "abcdefgh";
List<String> list = new ArrayList<String>();
for (int i = 0; i < test.length() - 1; i++) {
    list.add(test.substring(i, i + 2));
}
Following regex based code should work:
String str = "todayiscold";
Pattern p = Pattern.compile("(?<=\\G..)");
Matcher m = p.matcher(str);
int start = 0;
List<String> matches = new ArrayList<String>();
while (m.find(start)) {
    matches.add(str.substring(m.end() - 2, m.end()));
    start = m.end() - 1;
}
System.out.println("Matches => " + matches);
The trick is to pass end()-1 from the last match to the find() method, so each search starts one character further on.
Output:
Matches => [to, od, da, ay, yi, is, sc, co, ol, ld]
You can't use split in this case, because all split does is find the places to split at and break your string there, so you can't make the same character appear in two parts.
Instead you can use the Pattern/Matcher mechanism, like this:
String test = "todayiscold";
List<String> list = new ArrayList<String>();
Pattern p = Pattern.compile("(?=(..))");
Matcher m = p.matcher(test);
while (m.find()) {
    list.add(m.group(1));
}
or, even better, iterate over your String's characters and create substrings as in D-Rock's answer.

How to get with JAVA a specific value for one substring from string?

I have ONE string field which is in format:
"TransactionID=30000001197169 ExecutionStatus=6
additionalCurrency=KMK
pin= 0000"
So the fields are not separated by a ; or a , and they are not even separated by a single blank space.
I want to get the value of ExecutionStatus and put it in some field.
How can I achieve this?
Thanks for your help.
This works, but I am not sure it is optimal. It just solves your problem.
String s = "TransactionID=30000001197169ExecutionStatus=6additionalCurrency=KMKpin=0000";
if (s != null && s.contains("ExecutionStatus=")) {
    String[] s1 = s.split("ExecutionStatus=");
    if (s1 != null && s1.length > 1) {
        String line = s1[1];
        String pattern = "[0-9]+";
        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);
        // Now create matcher object.
        Matcher m = r.matcher(line);
        if (m.find()) {
            System.out.println("Match");
            System.out.println("Found value: " + m.group(0));
        } else {
            System.out.println("NO MATCH");
        }
    }
}
In your example they are indeed separated by blanks, but the following should work without blanks, too. It assumes your String is stored in a String variable named arguments.
String executionStatus = null;
String[] anArray = arguments.split("=");
// Stop one short of the end so that anArray[i + 1] always exists.
for (int i = 0; i < anArray.length - 1; i++) {
    if (anArray[i].contains("ExecutionStatus")) {
        // The value is followed by the next key, so strip that key and trim.
        executionStatus = anArray[i + 1].replace("additionalCurrency", "");
        executionStatus = executionStatus.trim();
    }
}
Check whether the string contains() ExecutionStatus=
If yes, split the string on ExecutionStatus=
Now take the second string from the array, find the first occurrence of a non-digit char, and use substring()
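The three steps above could be sketched roughly as follows. The class and method names are hypothetical, and the sketch assumes the digits of the status follow the equals sign directly, with no whitespace in between.

```java
final class StatusExtractor {
    // Hypothetical helper: returns the digits right after "ExecutionStatus=",
    // or null if the key is not present.
    static String executionStatus(String s) {
        String key = "ExecutionStatus=";
        if (s == null || !s.contains(key)) {
            return null;
        }
        // The limit of 2 keeps everything after the first key in one piece.
        String tail = s.split(key, 2)[1];
        // Advance to the first non-digit character.
        int end = 0;
        while (end < tail.length() && Character.isDigit(tail.charAt(end))) {
            end++;
        }
        return tail.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "TransactionID=30000001197169ExecutionStatus=6additionalCurrency=KMKpin=0000";
        System.out.println(executionStatus(s)); // 6
    }
}
```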
Assuming all that white space is present in your string, this works.
String str = "\"TransactionID=30000001197169 ExecutionStatus=6\n" +
" additionalCurrency=\"KMK\"\n" +
" pin= \"0000\"\"";
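String key = "ExecutionStatus=";
int keyIndex = str.indexOf(key);
int status = 0;
// Check the raw index: once the key length is added, a missing key (-1) would no longer be negative.
if (keyIndex >= 0) {
    int start = keyIndex + key.length();
    String strStatus = str.substring(start, str.indexOf("additionalCurrency=") - 1);
    try {
        status = Integer.parseInt(strStatus.trim());
    } catch (NumberFormatException e) {
        // Not a number: leave status at its default of 0.
    }
}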
At the risk of attracting "... and now you have two problems!" comments, this is probably easiest done with regexes (str is the String defined above):
Pattern p = Pattern.compile("ExecutionStatus\\s*=\\s*(\\d+)"); // Whitespace allowed around the equals sign for safety; the capturing group holds the digits of the status
Matcher m = p.matcher(str);
String status = m.find() ? m.group(1) : null;
