Finding patterns using Regex and replacing them with processed data - java

There are input strings of the format
${ENC}:107ec5141234742beec5cb5b1917e2e6:{ENC}$${ENC}:d0b2ddf0b9e7b397558c20c6232‌​37c4f:{ENC}$${ENC}:85d6f3cd7dcc5c67cad68ae45a0d5afc:{ENC}$${ENC}:5c0dfb55a843f830‌​024df0d74993b668:{ENC}$
As you can see, the data ( in bold), are prefixed with ${ENC}: and suffixed with :{ENC}$. And i want to replace all the Strings in between them with processed data.
I am using the Regular Expression:
\$\{ENC\}\:(.*?)\:\{ENC\}\$
which after escaping for java:
\\$\\{ENC\\}\\:(.*?)\\:\\{ENC\\}\\$
to find the matches and replace the Strings.
My code sample is below:
String THE_REGEX = "\\$\\{ENC\\}\\:(.*?)\\:\\{ENC\\}\\$";
Pattern THE_PATTERN = Pattern.compile(THE_REGEX);
public static boolean isProcessingRequired(String data){
if(data == null){
return false;
}
return data.matches(THE_REGEX);
}
public String getProcessedString(String dataString){
Matcher matcher = THE_PATTERN.matcher(dataString);
if(matcher.matches()){
String processedData = null;
String dataItem = matcher.group(1);
String procItem = doSomeProcessing(dataItem);
processedData = dataString.replaceAll("\\$\\{ENC\\}:" + encData + ":\\{ENC\\}\\$", procItem);
if(isProcessingRequired(processedData)){
processedData = getProcessedString(processedData);
}
return processedData;
} else {
return dataString;
}
}
public String doSomeProcessing(String str){
// do some processing on the string
// for now:
str = "DONE PROCESSING!!"
return str;
}
But at matcher.group(1), I'm getting
107ec5141234742beec5cb5b1917e2e6:ENC}$${ENC}:d0b2ddf0b9e7b397558c20c623237c4f:{ENC}$${ENC}:85d6f3cd7dcc5c67cad68ae45a0d5afc:{ENC}$${ENC}:5c0dfb55a843f830024df0d74993b668
instead of
107ec5141234742beec5cb5b1917e2e6
which I was expecting.
I'm using the ? in regex to avoid this problem.
And when I tried it at the www.regexe.com, regex appears to be fine
What am I doing wrong here?

The problem is that you are using Matcher.matches() instead of Matcher.find().
From the javadoc:
public boolean matches()
Attempts to match the entire region against the pattern.
public boolean find()
Attempts to find the next subsequence of the input sequence that matches the pattern.
Here is a simple code expliciting the difference :
Matcher matcher = Pattern.compile("\\Q${ENC}\\E(.*?)\\Q{ENC}$\\E").matcher("${ENC}1{ENC}$${ENC}2{ENC}$");
if (matcher.matches()) {
System.out.println(matcher.group(1)); // Will print "1{ENC}$${ENC}2"
}
matcher.reset();
if (matcher.find()) {
System.out.println(matcher.group(1)); // Will print "1"
}

Related

How do I extract substring from this line using RegEx

I have the following String line:
dn: cn=Customer Management,ou=groups,dc=digitalglobe,dc=com
I want to extract just this from the line above: Customer Management
I've tried the following RegEx expression but it does quite do what I want:
^dn: cn=(.*?),
Here is the java code snippet that tests the above expression:
Pattern pattern = Pattern.compile("^dn: cn=(.*?),");
String mydata = "dn: cn=Delivery Admin,ou=groups,dc=digitalglobe,dc=com";
Matcher matcher = pattern.matcher(mydata);
if(matcher.matches()) {
System.out.println(matcher.group(1));
} else {
System.out.println("No match found!");
}
The output is "No match found"... :(
Your regex should work properly, but matches attempts to match the regex to the entire string. Instead, use the find method which will look for a match at any point in the string.
if(matcher.find()) {
System.out.println(matcher.group(1));
} else {
System.out.println("No match found!");
}
Your problem is that the matcher want to match the whole input. Try adding a wildcard to the end of the pattern.
Pattern pattern = Pattern.compile("^dn: cn=(.*?),.*");
String mydata = "dn: cn=Delivery Admin,ou=groups,dc=digitalglobe,dc=com";
Matcher matcher = pattern.matcher(mydata);
if(matcher.matches()) {
System.out.println(matcher.group(1));
} else {
System.out.println("No match found!");
}
Please use below code:
#NOTE: instead of using matches you have to use find
public static void main(String[] args)
{
Pattern pattern = Pattern.compile("^dn: cn=(.*?),");
String mydata = "dn: cn=Delivery Admin,ou=groups,dc=digitalglobe,dc=com";
Matcher matcher = pattern.matcher(mydata);
if(matcher.find()) {
System.out.println(matcher.group(1));
} else {
System.out.println("No match found!");
}
}

check for string after a string

I'm writing a logic to check if there is any number following a specific keyword ("top" in the case).
String test = "top 10 results";
String test = "show top 10 results";
I used the pattern match find to acheive that and its working.
public static boolean hasNumberafterTOP(String query)
{
boolean hasNumberafterTOP = false;
Pattern pat = Pattern.compile("top\\s+([0-9]+)");
Matcher matcher = pat.matcher(query);
if(matcher.find())
{
hasNumberafterTOP = true;
numberAfterTOP=matcher.group(1);
}
return hasNumberafterTOP;
}
Is there any way I can extend this check to include these words too. five,ten,twenty,thirty etc.?
Try this code:
String keyword = "top";
Pattern p= Pattern.compile("\\d*\\s+"+Pattern.quote(keyword));
Matcher match = p.matcher(query");
match .find();
String inputInt = makeMatch.group();
System.out.println(inputInt);

How to get the video id from a YouTube URL with regex in Java

Porting from javascript answer to Java version
JavaScript REGEX: How do I get the YouTube video id from a URL?
Figured this question wasn't here for Java, in case you need to do it in Java here's a way
public static String extractYTId(String ytUrl) {
String vId = null;
Pattern pattern = Pattern.compile(
"^https?://.*(?:youtu.be/|v/|u/\\w/|embed/|watch?v=)([^#&?]*).*$",
Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(ytUrl);
if (matcher.matches()){
vId = matcher.group(1);
}
return vId;
}
Works for URLs like (also for https://...)
http://www.youtube.com/watch?v=0zM4nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/SomeUser#p/a/u/1/QDK8U-VIH_o
http://www.youtube.com/v/0zM4nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM4nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg
http://youtu.be/0zM4nApSvMg
Previous answers don't work for me.. It's correct method. It works with www.youtube.com and youtu.be links.
private String getYouTubeId (String youTubeUrl) {
String pattern = "(?<=youtu.be/|watch\\?v=|/videos/|embed\\/)[^#\\&\\?]*";
Pattern compiledPattern = Pattern.compile(pattern);
Matcher matcher = compiledPattern.matcher(youTubeUrl);
if(matcher.find()){
return matcher.group();
} else {
return "error";
}
}
Update 21 Oct 2022
As this post getting attention from people, I am going to update to serve more cases, thanks to #Nikhil Seban for your new links, the Java code is remained thanks to guys provide code #Leap Bun #Pankaj
This regex also use match group(1), please take a look on the bottom
Regex Online Here
http(?:s)?:\/\/(?:m.)?(?:www\.)?youtu(?:\.be\/|(?:be-nocookie|be)\.com\/(?:watch|[\w]+\?(?:feature=[\w]+.[\w]+\&)?v=|v\/|e\/|embed\/|user\/(?:[\w#]+\/)+))([^&#?\n]+)
http://www.youtube.com/watch?v=0zM4nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/SomeUser#p/a/u/1/QDK8U-VIH_o
http://www.youtube.com/v/0zM4nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM4nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg
http://youtu.be/0zM4nApSvMg
https://www.youtube.com/watch?v=nBuae6ilH24
https://www.youtube.com/watch?v=pJegNopBLL8
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
https://www.youtube.com/watch?v=0zM4nApSvMg&feature=youtu.be
https://www.youtube.com/watch?v=_5BTo2oZ0SE
https://m.youtube.com/watch?feature=youtu.be&v=_5BTo2oZ0SE
https://m.youtube.com/watch?v=_5BTo2oZ0SE
https://www.youtube.com/watch?feature=youtu.be&v=_5BTo2oZ0SE&app=desktop
https://www.youtube.com/watch?v=nBuae6ilH24
https://www.youtube.com/watch?v=eLlxrBmD3H4
https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk
https://www.youtube.com/watch?v=wz4MLJBdSpw&t=67s
http://www.youtube.com/watch?v=dQw4w9WgXcQ&a=GxdCwVVULXctT2lYDEPllDR0LRTutYfW
http://www.youtube.com/embed/dQw4w9WgXcQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
https://www.youtube.com/watch?v=EL-UCUAt8DQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
http://youtu.be/dQw4w9WgXcQ
http://www.youtube.com/v/dQw4w9WgXcQ
http://www.youtube.com/e/dQw4w9WgXcQ
http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ
https://www.youtube-nocookie.com/embed/xHkq1edcbk4?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/SomeUser#p/a/u/1/QDK8U-VIH_o
http://www.youtube.com/v/0zM4nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM4nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg
http://youtu.be/0zM4nApSvMg
Old post
This below code you can also use either review what I have updated or develop new pattern
I know this topic is old, but I get more cases to get Youtube id, this is my regex
http(?:s)?:\/\/(?:m.)?(?:www\.)?youtu(?:\.be\/|be\.com\/(?:watch\?(?:feature=youtu.be\&)?v=|v\/|embed\/|user\/(?:[\w#]+\/)+))([^&#?\n]+)
This will work for
http://www.youtube.com/watch?v=0zM4nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/SomeUser#p/a/u/1/QDK8U-VIH_o
http://www.youtube.com/v/0zM4nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM4nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg
http://youtu.be/0zM4nApSvMg
https://www.youtube.com/watch?v=nBuae6ilH24
https://www.youtube.com/watch?v=pJegNopBLL8
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
https://www.youtube.com/watch?v=0zM4nApSvMg&feature=youtu.be
https://www.youtube.com/watch?v=_5BTo2oZ0SE
https://m.youtube.com/watch?feature=youtu.be&v=_5BTo2oZ0SE
https://m.youtube.com/watch?v=_5BTo2oZ0SE
https://www.youtube.com/watch?feature=youtu.be&v=_5BTo2oZ0SE&app=desktop
https://www.youtube.com/watch?v=nBuae6ilH24
https://www.youtube.com/watch?v=eLlxrBmD3H4
Note: Using regex match group(1) to get ID
public static String getVideoId(#NonNull String videoUrl) {
String videoId = "";
String regex = PLEASE_COPY_ABOVE_REGEX_PATTERN
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(url);
if(matcher.find()){
videoId = matcher.group(1);
}
return videoId;
}
To me, for the sample links you posted, this regex looks more precise (capturing the ID to Group 1):
https?://(?:www\.)?youtu(?:\.be/|be\.com/(?:watch\?v=|v/|embed/|user/(?:[\w#]+/)+))([^&#?\n]+)
On the demo, see the capture groups in the bottom right pane. However, for the second match, not entirely sure that this is a correct ID as it looks different from the other samples—to be tweaked if needed.
In code, something like:
Pattern regex = Pattern.compile("http://(?:www\\.)?youtu(?:\\.be/|be\\.com/(?:watch\\?v=|v/|embed/|user/(?:[\\w#]+/)+))([^&#?\n]+)");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
VideoID = regexMatcher.group(1);
}
private fun extractVideoId(ytUrl: String?): String? {
var videoId: String? = null
val regex =
"^((?:https?:)?//)?((?:www|m)\\.)?((?:youtube\\.com|youtu.be|youtube-nocookie.com))(/(?:[\\w\\-]+\\?v=|feature=|watch\\?|e/|embed/|v/)?)([\\w\\-]+)(\\S+)?\$"
val pattern: Pattern = Pattern.compile(
regex ,
Pattern.CASE_INSENSITIVE
)
val matcher: Matcher = pattern.matcher(ytUrl)
if (matcher.matches()) {
videoId = matcher.group(5)
}
return videoId
}
The above function with regex can extract video Id from
https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk
https://www.youtube.com/watch?v=wz4MLJBdSpw&t=67s
http://www.youtube.com/watch?v=dQw4w9WgXcQ&a=GxdCwVVULXctT2lYDEPllDR0LRTutYfW
http://www.youtube.com/embed/dQw4w9WgXcQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
https://www.youtube.com/watch?v=EL-UCUAt8DQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
http://youtu.be/dQw4w9WgXcQ
http://www.youtube.com/v/dQw4w9WgXcQ
http://www.youtube.com/e/dQw4w9WgXcQ
http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ
Tried the other ones but failed in my case - adjusted the regex to fit for my urls
public static String extractYoutubeVideoId(String ytUrl) {
String vId = null;
String pattern = "(?<=watch\\?v=|/videos/|embed\\/)[^#\\&\\?]*";
Pattern compiledPattern = Pattern.compile(pattern);
Matcher matcher = compiledPattern.matcher(ytUrl);
if(matcher.find()){
vId= matcher.group();
}
return vId;
}
This one works for below url's
http://www.youtube.com/embed/Woq5iX9XQhA?html5=1
http://www.youtube.com/watch?v=384IUU43bfQ
http://gdata.youtube.com/feeds/api/videos/xTmi7zzUa-M&whatever
https://www.youtube.com/watch?v=C7RVaSEMXNk
Hope it will help some one. Thanks
Using regex from Tấn Nguyên:
public String getVideoIdFromYoutubeUrl(String url){
String videoId = null;
String regex = "http(?:s)?:\\/\\/(?:m.)?(?:www\\.)?youtu(?:\\.be\\/|be\\.com\\/(?:watch\\?(?:feature=youtu.be\\&)?v=|v\\/|embed\\/|user\\/(?:[\\w#]+\\/)+))([^&#?\\n]+)";
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(url);
if(matcher.find()){
videoId = matcher.group(1);
}
return videoId;
}
Credit to Tấn Nguyên.
This will work for me and simple..
public static String getVideoId(#NonNull String videoUrl) {
String reg = "(?:youtube(?:-nocookie)?\\.com\\/(?:[^\\/\\n\\s]+\\/\\S+\\/|(?:v|e(?:mbed)?)\\/|\\S*?[?&]v=)|youtu\\.be\\/)([a-zA-Z0-9_-]{11})";
Pattern pattern = Pattern.compile(reg, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(videoUrl);
if (matcher.find())
return matcher.group(1);
return null;
}
Just pass youtube link in below function , it will detect all type of youtube url
public static String getYoutubeID(String youtubeUrl) {
if (TextUtils.isEmpty(youtubeUrl)) {
return "";
}
String video_id = "";
String expression = "^.*((youtu.be" + "\\/)" + "|(v\\/)|(\\/u\\/w\\/)|(embed\\/)|(watch\\?))\\??v?=?([^#\\&\\?]*).*"; // var regExp = /^.*((youtu.be\/)|(v\/)|(\/u\/\w\/)|(embed\/)|(watch\?))\??v?=?([^#\&\?]*).*/;
CharSequence input = youtubeUrl;
Pattern pattern = Pattern.compile(expression, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
String groupIndex1 = matcher.group(7);
if (groupIndex1 != null && groupIndex1.length() == 11)
video_id = groupIndex1;
}
if (TextUtils.isEmpty(video_id)) {
if (youtubeUrl.contains("youtu.be/") ) {
String spl = youtubeUrl.split("youtu.be/")[1];
if (spl.contains("\\?")) {
video_id = spl.split("\\?")[0];
}else {
video_id =spl;
}
}
}
return video_id;
}
Combining a couple of methods, to cover as many formats as possible, is recommended.
String ID1 = getYoutubeID1(url); regex 1
String ID2 = getYoutubeID2(url); regex 2
String ID3 = getYoutubeID3(url); regex 3
then, using an if/switch statement to choose a value that isn't null and is valid.
I actully try all the above but some times work and some times fail
any way I found this may it help -- in this site
private final static String expression = "(?<=watch\\?v=|/videos/|embed\\/|youtu.be\\/|\\/v\\/|\\/e\\/|watch\\?v%3D|watch\\?feature=player_embedded&v=|%2Fvideos%2F|embed%\u200C\u200B2F|youtu.be%2F|%2Fv%2F)[^#\\&\\?\\n]*";
public static String getVideoId(String videoUrl) {
if (videoUrl == null || videoUrl.trim().length() <= 0){
return null;
}
Pattern pattern = Pattern.compile(expression);
Matcher matcher = pattern.matcher(videoUrl);
try {
if (matcher.find())
return matcher.group();
} catch (ArrayIndexOutOfBoundsException ex) {
ex.printStackTrace();
}
return null;
}
he said
This will work on two kinks of youtube video urls either direct or shared URL.
Direct url: https://www.youtube.com/watch?v=XXXXXXXXXXX
Shared url: https://youtu.be/XXXXXXXXXXX
This is very similar to the link posted in your question.
String url = "https://www.youtube.com/watch?v=wZZ7oFKsKzY";
if(url.indexOf("v=") == -1) //there's no video id
return "";
String id = url.split("v=")[1]; //everything before the first 'v=' will be the first element, i.e. [0]
int end_of_id = id_to_end.indexOf("&"); //if there are other parameters in the url, get only the id's value
if(end_index != -1)
id = url.substring(0, end_of_id);
return id;
Without regex :
String url - "https://www.youtube.com/watch?v=BYrumLBuzHk"
String video_id = url.replace("https://www.youtube.com/watch?v=", "")
So now you have a video ID in string video_id.

Extract content after "=" and before "&", Regex expression in java

guys, I wanna extract the content in a string, the content is before "&" and after the "=", like this example:
asdfaf=afl10109&adsfjkl
I want to extract "afl10109" out of the string, can anyone teach me how to do this, I am very new to regex expression...
Use replaceAll() to replace the whole input with just what you want:
String target = str.replaceAll(".*=(.*)&.*", "$1");
The target is captured in a group (group number 1), which is then referenced in the replacement string.
try
public static void main(String args[]) {
String input="asdfaf=afl10109&adsfjkl";
Pattern pattern = Pattern.compile("=[^&]*&");
Matcher m = pattern.matcher(input);
while (m.find()) {
String str = m.group();
System.out.println( str.substring(1,str.length()-1));
}
}
This is not regex but you can also use split()
String str = "asdfaf=afl10109&adsfjkl";
System.out.println(str.split("=")[1].split("&")[0]);
Output:
afl10109
Using good old String#substring()
String str = "foo=bar&baz";
int begin = str.indexOf('=');
if (begin != -1) {
int end = str.indexOf('&', begin);
if (end != -1) {
System.out.println(str.substring(begin+1, end)); // bar
}
}

Java Regex Replace with Capturing Group

Is there any way to replace a regexp with modified content of capture group?
Example:
Pattern regex = Pattern.compile("(\\d{1,2})");
Matcher regexMatcher = regex.matcher(text);
resultString = regexMatcher.replaceAll("$1"); // *3 ??
And I'd like to replace all occurrence with $1 multiplied by 3.
edit:
Looks like, something's wrong :(
If I use
Pattern regex = Pattern.compile("(\\d{1,2})");
Matcher regexMatcher = regex.matcher("12 54 1 65");
try {
String resultString = regexMatcher.replaceAll(regexMatcher.group(1));
} catch (Exception e) {
e.printStackTrace();
}
It throws an IllegalStateException: No match found
But
Pattern regex = Pattern.compile("(\\d{1,2})");
Matcher regexMatcher = regex.matcher("12 54 1 65");
try {
String resultString = regexMatcher.replaceAll("$1");
} catch (Exception e) {
e.printStackTrace();
}
works fine, but I can't change the $1 :(
edit:
Now, it's working :)
How about:
if (regexMatcher.find()) {
resultString = regexMatcher.replaceAll(
String.valueOf(3 * Integer.parseInt(regexMatcher.group(1))));
}
To get the first match, use #find(). After that, you can use #group(1) to refer to this first match, and replace all matches by the first maches value multiplied by 3.
And in case you want to replace each match with that match's value multiplied by 3:
Pattern p = Pattern.compile("(\\d{1,2})");
Matcher m = p.matcher("12 54 1 65");
StringBuffer s = new StringBuffer();
while (m.find())
m.appendReplacement(s, String.valueOf(3 * Integer.parseInt(m.group(1))));
System.out.println(s.toString());
You may want to look through Matcher's documentation, where this and a lot more stuff is covered in detail.
earl's answer gives you the solution, but I thought I'd add what the problem is that's causing your IllegalStateException. You're calling group(1) without having first called a matching operation (such as find()). This isn't needed if you're just using $1 since the replaceAll() is the matching operation.
Java 9 offers a Matcher.replaceAll() that accepts a replacement function:
resultString = regexMatcher.replaceAll(
m -> String.valueOf(Integer.parseInt(m.group()) * 3));
Source: java-implementation-of-rubys-gsub
Usage:
// Rewrite an ancient unit of length in SI units.
String result = new Rewriter("([0-9]+(\\.[0-9]+)?)[- ]?(inch(es)?)") {
public String replacement() {
float inches = Float.parseFloat(group(1));
return Float.toString(2.54f * inches) + " cm";
}
}.rewrite("a 17 inch display");
System.out.println(result);
// The "Searching and Replacing with Non-Constant Values Using a
// Regular Expression" example from the Java Almanac.
result = new Rewriter("([a-zA-Z]+[0-9]+)") {
public String replacement() {
return group(1).toUpperCase();
}
}.rewrite("ab12 cd efg34");
System.out.println(result);
Implementation (redesigned):
import static java.lang.String.format;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public abstract class Rewriter {
private Pattern pattern;
private Matcher matcher;
public Rewriter(String regularExpression) {
this.pattern = Pattern.compile(regularExpression);
}
public String group(int i) {
return matcher.group(i);
}
public abstract String replacement() throws Exception;
public String rewrite(CharSequence original) {
return rewrite(original, new StringBuffer(original.length())).toString();
}
public StringBuffer rewrite(CharSequence original, StringBuffer destination) {
try {
this.matcher = pattern.matcher(original);
while (matcher.find()) {
matcher.appendReplacement(destination, "");
destination.append(replacement());
}
matcher.appendTail(destination);
return destination;
} catch (Exception e) {
throw new RuntimeException("Cannot rewrite " + toString(), e);
}
}
#Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append(pattern.pattern());
for (int i = 0; i <= matcher.groupCount(); i++)
sb.append(format("\n\t(%s) - %s", i, group(i)));
return sb.toString();
}
}

Categories