String split with regex - java

I want to split a string according to a regex in Java. Suppose I have the string "R12T12W5P12T5L3"
and I want to end up with something like this: myStr[0]="R12T12", myStr[1]="W5P12", myStr[2]="T5L3". Each piece should match a letter, then a number, then another letter, then another number.
How can I do that?

String s="R12T12W5P12T5L3";
String regex = "([A-Z]\\d+){2}";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(s);
while(m.find()){
System.out.println(m.group(0));
}
This will print:
R12T12
W5P12
T5L3
You can add the matches to a list and convert it to an array at the end.
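Collecting the matches into a list and converting at the end could look roughly like this. It is only a sketch: the class name PairSplitter and the method splitPairs are made up for illustration, and the pattern uses a non-capturing group because the inner group is never read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are not from the original answer.
class PairSplitter {
    // One piece = two consecutive "letter followed by digits" tokens, e.g. "R12T12".
    private static final Pattern PAIR = Pattern.compile("(?:[A-Z]\\d+){2}");

    static String[] splitPairs(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            parts.add(m.group()); // the whole match
        }
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] myStr = splitPairs("R12T12W5P12T5L3");
        for (int i = 0; i < myStr.length; i++) {
            System.out.println("myStr[" + i + "]=" + myStr[i]);
        }
    }
}
```

Any trailing token that does not complete a pair (for example a lone "L3") is simply not matched and is dropped.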

The whole thing, from the regex to building the output string, in JavaScript:
var str = "R12T12W5P12T5L3";
var result = str.split(/(?=[^\d]){2}/).map(function(v,i,a){
return i%2 ? a[i-1]+v+'",' : 'myStr['+(i/2)+']="'
}).join('').slice(0,-1);
Result:
myStr[0]="R12T12",myStr[1]="W5P12",myStr[2]="T5L3"

Related

How to replace character in the string using regex in java?

I want to replace every x that sits at the end of a line or string and follows any letter other than a, i, u, e, o with nya.
Expected input and output:
Input: bapakx
Output: bapaknya
I've tried this one:
String myString = "bapakx";
String regex = "[^aiueo]x(\\s|$)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(myString);
if(m.find()){
myString = m.replaceAll("nya");
}
But the output is bapanya instead of bapaknya: the k character is replaced as well. How can I solve this?
To keep the consonant, use a zero-width lookbehind in your regex:
String regex = "(?<=[^aiueo])x(?=\\s|$)";
Here (?<=[^aiueo]) only asserts that a non-vowel character precedes x; it does not consume that character, so it is not replaced.
Alternatively, you can use capture groups:
String regex = "([^aiueo])x(\\s|$)";
and use it as:
// $1 restores the consonant, $2 restores the whitespace (if any) matched after x
myString = m.replaceAll("$1nya$2");
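Both variants can be compared side by side with String.replaceAll. This is a sketch: the class and method names are invented, and the group variant puts the captured trailing whitespace back with $2 so that a space after the x is not lost.

```java
// Hypothetical helper; the names are not from the original answers.
class NyaSuffix {
    // Lookaround variant: only the x itself is matched and replaced.
    static String withLookaround(String text) {
        return text.replaceAll("(?<=[^aiueo])x(?=\\s|$)", "nya");
    }

    // Capture-group variant: the neighbours are matched, then restored via $1 and $2.
    static String withGroups(String text) {
        return text.replaceAll("([^aiueo])x(\\s|$)", "$1nya$2");
    }

    public static void main(String[] args) {
        System.out.println(withLookaround("bapakx"));
        System.out.println(withGroups("bapakx besar"));
    }
}
```

Note that [^aiueo] matches any character that is not one of those five vowels, including digits and punctuation, so tighten the class if only letters should qualify.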

Get Substring from a String in Java

I have the following text:
...,Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY:...,
Now I want to extract the date that follows NOT IN CHARGE SINCE:, up to the next comma.
So I need only 03.2009 as the result.
How can I do that?
String substr = "not in charge since:";
String before = s.substring(0, s.indexOf(substr));
String after = s.substring(s.indexOf(substr),s.lastIndexOf(","));
EDIT
for (String s : split) {
s = s.toLowerCase();
if (s.contains("ex peps")) {
String substr = "not in charge since:";
String before = s.substring(0, s.indexOf(substr));
String after = s.substring(s.indexOf(substr), s.lastIndexOf(","));
System.out.println(before);
System.out.println(after);
System.out.println("PEP!!!");
} else {
System.out.println("Line ok");
}
}
But that is not the result I want.
You can use a Pattern, for example:
String str = "Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY";
Pattern p = Pattern.compile("\\d{2}\\.\\d{4}");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println(m.group());
}
Output
03.2009
Note: if you want to get every such date in your string, use while instead of if.
Edit
Or you can use :
String str = "Niedersachsen,NOT IN CHARGE SINCE: 03.03.2009, CATEGORY";
Pattern p = Pattern.compile("SINCE:(.*?)\\,");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println(m.group(1).trim());
}
You can use the : to separate the string s:
String substr = "NOT IN CHARGE SINCE:";
String before = s.substring(0, s.indexOf(substr)+1);
// stop at the first comma after the colon, not at the last comma in the line
String after = s.substring(s.indexOf(':')+1, s.indexOf(',', s.indexOf(':')));
Of course, regular expressions give you more ways to do searching/matching, but assuming that the ":" is the key thing you are looking for (and it shows up exactly once in that position) then:
s.substring(s.indexOf(':') + 1, s.indexOf(',', s.indexOf(':'))).trim();
is the "most simple" and "least overhead" way of fetching that substring.
Hint: as you are searching for a single character, use a character as search pattern; not a string!
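A slightly more defensive version of the indexOf approach anchors on the whole label instead of the first colon and stops at the next comma. This is a sketch under those assumptions; the class DateAfterLabel and the method extract are invented names, and returning null for a missing label is an arbitrary choice.

```java
// Hypothetical helper; the names are not from the original answers.
class DateAfterLabel {
    private static final String LABEL = "NOT IN CHARGE SINCE:";

    // Returns the text between the label and the next comma, or null if the label is absent.
    static String extract(String line) {
        int labelStart = line.indexOf(LABEL);
        if (labelStart < 0) {
            return null;
        }
        int start = labelStart + LABEL.length();
        int end = line.indexOf(',', start); // first comma after the label
        if (end < 0) {
            end = line.length(); // no comma: take the rest of the line
        }
        return line.substring(start, end).trim();
    }

    public static void main(String[] args) {
        System.out.println(extract("...,Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY:...,"));
    }
}
```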
If you have a more generic use case and know the structure of the text well, you might benefit from regular expressions:
Pattern pattern = Pattern.compile("NOT IN CHARGE SINCE: ([0-9.]*),");
Matcher matcher = pattern.matcher(line);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
A more generic way to solve the problem is to use a regex that matches every group between : and , :
Pattern pattern = Pattern.compile("(?<=:)(.*?)(?=,)");
Matcher m = pattern.matcher(str);
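To actually read the values, the matcher still has to be driven with find(). A minimal sketch, with invented class and method names, that collects every value between a colon and the following comma and trims the surrounding spaces:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are not from the original answer.
class BetweenColonAndComma {
    // Lookbehind for ':' and lookahead for ',' so neither delimiter is part of the match.
    private static final Pattern VALUE = Pattern.compile("(?<=:)(.*?)(?=,)");

    static List<String> values(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = VALUE.matcher(text);
        while (m.find()) {
            out.add(m.group(1).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(values("Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY"));
    }
}
```

Because this matches after every colon, it is only suitable when all "key: value," pairs in the line are wanted.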
You have to create a pattern for it. Try this as a simple regex starting point, and feel free to improvise on it:
String s = "...,Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY:....,";
Pattern pattern = Pattern.compile(".*NOT IN CHARGE SINCE: ([\\d\\.]*).*");
Matcher matcher = pattern.matcher(s);
if (matcher.find())
{
System.out.println(matcher.group(1));
}
That should capture whatever run of digits and dots follows the label as the date.

Finding Upper Case in String Array and extracting it out

I have an array input like this which is an email id in reverse order along with some data:
MOC.OOHAY#ABC.PQRqwertySDdd
MOC.OOHAY#AB.JKLasDDbfn
MOC.OOHAY#XZ.JKGposDDbfn
I want my output to come as
MOC.OOHAY#ABC.PQR
MOC.OOHAY#AB.JKL
MOC.OOHAY#XZ.JKG
How should I filter the string since there is no pattern?
There is a pattern, and that is any upper case character which is followed either by another upper case letter, a period or else the # character.
Translated, this would become something like this:
String[] input = new String[]{"MOC.OOHAY#ABC.PQRqwertySDdd","MOC.OOHAY#AB.JKLasDDbfn" , "MOC.OOHAY#XZ.JKGposDDbfn"};
Pattern p = Pattern.compile("([A-Z.]+#[A-Z.]+)");
for(String string : input)
{
Matcher matcher = p.matcher(string);
if(matcher.find())
System.out.println(matcher.group(1));
}
Yields:
MOC.OOHAY#ABC.PQR
MOC.OOHAY#AB.JKL
MOC.OOHAY#XZ.JKG
Why do you think there is no pattern?
You clearly want to get the string till you find a lowercase letter.
You can use the regex (^[^a-z]+) to match it and extract.
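Applied in Java, that regex could be used as follows. This is a sketch; the class UpperPrefix and the method prefix are invented names, and returning an empty string when the input starts with a lowercase letter is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are not from the original answer.
class UpperPrefix {
    // Everything from the start of the string up to the first lowercase ASCII letter.
    private static final Pattern PREFIX = Pattern.compile("^[^a-z]+");

    static String prefix(String s) {
        Matcher m = PREFIX.matcher(s);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String[] input = {"MOC.OOHAY#ABC.PQRqwertySDdd", "MOC.OOHAY#AB.JKLasDDbfn", "MOC.OOHAY#XZ.JKGposDDbfn"};
        for (String s : input) {
            System.out.println(prefix(s));
        }
    }
}
```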
Simply split on [a-z], with limit 2:
String s1 = "MOC.OOHAY#ABC.PQRqwertySDdd";
String s2 = "MOC.OOHAY#AB.JKLasDDbfn";
String s3 = "MOC.OOHAY#XZ.JKGposDDbfn";
System.out.println(s1.split("[a-z]", 2)[0]);
System.out.println(s2.split("[a-z]", 2)[0]);
System.out.println(s3.split("[a-z]", 2)[0]);
You can do it like this:
String arr[] = { "MOC.OOHAY#ABC.PQRqwertySDdd", "MOC.OOHAY#AB.JKLasDDbfn", "MOC.OOHAY#XZ.JKGposDDbfn" };
for (String test : arr) {
Pattern p = Pattern.compile("[A-Z]*\\.[A-Z]*#[A-Z]*\\.[A-Z.]*");
Matcher m = p.matcher(test);
if (m.find()) {
System.out.println(m.group());
}
}

Extract a substring by using regex doesn't work

I have a string like this:
String source = "https://1116.netrk.net/conv1?prid=478&orderid=[6aa3482b-519b-4127-abee-debcd6e39e96]";
I want to extract orderid which is inside [ ]. I wrote this method:
public String extractOrderId(String source)
{
Pattern p = Pattern.compile("[(.*?)]");
Matcher m = p.matcher(source);
if (m.find())
return m.group(1);
else
return "";
}
But I always get this exception:
java.lang.IndexOutOfBoundsException: No group 1
Any idea what's wrong? Thanks
You need to escape the brackets:
Pattern p = Pattern.compile("\\[(.*?)\\]");
Aside from using a regex you could use the URL class to extract the query:
URL url = new URL("https://1116.netrk.net/conv1?prid=478&orderid=[6aa3482b-519b-4127-abee-debcd6e39e96]");
String query = url.getQuery();
At that point the value of query is prid=478&orderid=[6aa3482b-519b-4127-abee-debcd6e39e96]. From there you can easily extract the orderid by a combination of indexOf and substring.
String searchFor = "orderid=[";
int fromIndex = query.indexOf(searchFor);
int toIndex = query.indexOf("]", fromIndex);
//6aa3482b-519b-4127-abee-debcd6e39e96
String orderId = query.substring(fromIndex+searchFor.length(), toIndex);
Your regex is missing the \\ escape before the opening bracket:
Pattern p = Pattern.compile("\\[(.*?)]");
You need to escape [ and ]. Try the following:
Pattern p = Pattern.compile("\\[(.*)\\]");
You have used [(.*?)]. In a regex, square brackets define a character class, so this pattern matches a single character out of (, ., *, ? and ), and it contains no capturing group; that is why group(1) does not exist.
To match literal [ and ] characters, you need to escape them in the string passed to Pattern.compile.
The following matches what you want:
Pattern p = Pattern.compile("\\[(.*?)\\]");
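The difference between the two patterns shows up directly in Matcher.groupCount(). A small sketch with invented names (BracketDemo, groupCount, extractOrderId) that reports the number of capturing groups and then extracts the id with the escaped pattern:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are not from the original answers.
class BracketDemo {
    // Number of capturing groups the compiled pattern contains.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    static String extractOrderId(String source) {
        Matcher m = Pattern.compile("\\[(.*?)\\]").matcher(source);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        // Unescaped: a character class, so there are no groups at all.
        System.out.println(groupCount("[(.*?)]"));
        // Escaped: literal brackets around one capturing group.
        System.out.println(groupCount("\\[(.*?)\\]"));
        System.out.println(extractOrderId(
            "https://1116.netrk.net/conv1?prid=478&orderid=[6aa3482b-519b-4127-abee-debcd6e39e96]"));
    }
}
```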

Java Regex: how to capture multiple matches in the same line

I am trying to match a regex pattern in Java, and I have two questions:
1. Inside the pattern I'm looking for there is a known beginning, followed by an unknown string that I want to capture up to the first occurrence of an &.
2. There are multiple occurrences of this pattern in the line, and I would like to get each occurrence separately.
For example I have this input line:
1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate%7C120HZ&sName=View+All&subCatView=true 0 2819357575609397706
And I am interested in these strings:
Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.
Screen+Refresh+Rate%7C120HZ
Assuming the known beginning is filter=**, the regular expression pattern (?:filter=\\*\\*)(.*?)(?:&) should get you what you need. Use Matcher.find() to get all occurrences of the pattern in a given string. Using the test string you provided, the following:
final Pattern p = Pattern.compile("(?:filter=\\*\\*)(.*?)(?:&)");
final Matcher m = p.matcher(testString);
int cnt = 0;
while (m.find()) {
System.out.println(++cnt + ": G1: " + m.group(1));
}
Will output:
1: G1: Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.
2: G1: Screen+Refresh+Rate%7C120HZ**
If I know that I might need other query parameters in the future, I think it is more prudent to decode and parse the URL.
String url = URLDecoder.decode("http://www.gold.com/shc/s/c_10153_12605_" +
"Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate" +
"%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All&viewItems=25&subCatView=true"
,"utf-8");
Pattern amp = Pattern.compile("&");
Pattern eq = Pattern.compile("=");
Map<String, String> params = new HashMap<String, String>();
String queryString = url.substring(url.indexOf('?') + 1);
for(String param : amp.split(queryString)) {
String[] pair = eq.split(param);
params.put(pair[0], pair[1]);
}
for(Entry<String, String> param : params.entrySet()) {
System.out.format("%s = %s\n", param.getKey(), param.getValue());
}
Output
subCatView = true
viewItems = 25
sName = View All
filter = Screen Refresh Rate|120HZ^Screen Size|37 in. to 42 in.
In your example there is sometimes a "**" at the end, before the "&". But basically (assuming "filter=" is the start pattern you are looking for) you want something like:
"filter=([^&]+)&"
Using the regular expression (?<=filter=\*{0,2})(?!\*)[^&]*[^&*]+ in Java (the (?!\*) stops a match from starting on a leading *):
Pattern p = Pattern.compile("(?<=filter=\\*{0,2})(?!\\*)[^&]*[^&*]+");
String s = "1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All**&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ**&sName=View+All&subCatView=true 0 2819357575609397706";
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group());
}
EDIT:
Added [^&*]+ to the end of the regex to prevent the ** from being included in the second match.
EDIT2:
Changed regular expression to use lookbehind.
The regex you're looking for is
Screen\+Refresh\+Rate[^&]*
You could use Matcher.find() to find all matches.
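A find() loop over that regex might look like the sketch below. The class and method names are invented, and the sample line in main is the question's input reduced to its two URLs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are not from the original answer.
class FilterValues {
    // Literal "Screen+Refresh+Rate" followed by everything up to the next '&'.
    private static final Pattern FILTER = Pattern.compile("Screen\\+Refresh\\+Rate[^&]*");

    static List<String> findAll(String line) {
        List<String> matches = new ArrayList<>();
        Matcher m = FILTER.matcher(line);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String line = "http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions"
            + "?filter=Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All "
            + "http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions"
            + "?filter=Screen+Refresh+Rate%7C120HZ&sName=View+All";
        for (String match : findAll(line)) {
            System.out.println(match);
        }
    }
}
```

This only works while every filter value starts with Screen+Refresh+Rate and is followed by an &; a pattern anchored on filter= is the more general choice.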
Are you looking for the string that follows "filter=", ignoring any leading "*", and ends at the first "&"?
You can try the following:
String str = "1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All**&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ**&sName=View+All&subCatView=true 0 2819357575609397706";
Pattern p = Pattern.compile("filter=(?:\\**)([^&]+?)(?:\\**)&");
Matcher matcher = p.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
