Why does this regex fail?

Why does this regex fail? - java

I have a PCL file and open it with Notepad ++ to view the source code (with PCL Viewer I see the final results but I need to view the source also).
Please see Lab Number and the rest of the characters. I am able to extract Lab Number and its code with this regex:
private static String PATTERN_LABNUMBER = "Lab Number[\\W\\D]*(\\d*)";
and it gives me:
0092616281
I now want to extract Date Reported and I use this regex (after a lot of other tries):
private static String PATTERN_DATE_REPORTED =
"Date Reported[\\W\\D]*(\\d\\d/\\d\\d/\\d\\d\\d\\d \\d\\d:\\d\\d)";
but it does NOT find it in the PCL file.
I've also tried with:
private static String PATTERN_DATE_REPORTED =
"Date Reported[\\W\\D]*([0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2})";
but the same not found result...
Do you see where I am missing something in this last regex?
Thanks a lot!
UPDATE:
I use this java code to extract Lab number and Date Reported:
public String extractWithRegEx(String regextype, String input) {
String matchedString = null;
if (regextype != null && input != null) {
Matcher matcher = Pattern.compile(regextype).matcher(input);
if (matcher.find()) {
System.out.println("Matcher found for regextype "+regextype);
matchedString = matcher.group(0);
if (matcher.groupCount() > 0) {
matchedString = matcher.group(1);
}
}
}
return matchedString;
}

Here is the code to accomplish what you want..
Pattern pattern = Pattern.compile("Date Reported.*(\\d{2}/\\d{4} \\d{2}:\\d{2})$", Pattern.MULTILINE);
String st = "date dfdsfsd fgfd gdfgfdgdf gdfgdfg gdfgdf 3232/22/2010 23:34\n"+
"dsadsadasDate Reported gdfgfd gdfgfdgdf gdfgdfg gdfgdf 3232/22/2010 23:34";
Matcher matcher = pattern.matcher(st);
while (matcher.find()) {
System.out.println(matcher.group(1));
}

Related

method to take string inside curly braces using split or tokenizer

String s = "author= {insert text here},";
Trying to get the inside of the string, ive looked around but couldn't find a resolution with just split or tokenizer...
so far im doing this
arraySplitBracket = s.trim().split("\\{", 0);
which gives me insert text here},
at array[1] but id like a way to not have } attached
also tried
StringTokenizer st = new StringTokenizer(s, "\\{,\\},");
But it gave me author= as output.

public static void main(String[] args) {
String input="{a c df sdf TDUS^&%^7 }";
String regEx="(.*[{]{1})(.*)([}]{1})";
Matcher matcher = Pattern.compile(regEx).matcher(input);
if(matcher.matches()) {
System.out.println(matcher.group(2));
}
}

You can use \\{([^}]*)\\} Regex to get string between curly braces.
Code Snap :
String str = "{insert text here}";
Pattern p = Pattern.compile("\\{([^}]*)\\}");
Matcher m = p.matcher(str);
while (m.find()) {
System.out.println(m.group(1));
}
Output :
insert text here

String s = "auther ={some text here},";
s = s.substring(s.indexOf("{") + 1); //some text here},
s = s.substring(0, s.indexOf("}"));//some text here
System.out.println(s);

How about taking a substring by excluding the character at arraySplitBracket.length()-1
Something like
arraySplitBracket[1] = arraySplitBracket[1].substring(0,arraySplitBracket.length()-1);
Or use String Class's replaceAll function to replace } ?

Unable to find pattern in Java

I have been trying to use pattern matcher to find the specific pattern and I have created the regex pattern through this website and it shows that the pattern is found in the text file I wanted to read.
Extra info.: This code works like this : Start reading the textfile,
when meet >D10, enter another loop and get the information until the
next >D10 is found. Loop this process until EOF.
My sample text file:
D14*
Y7620D03*
X247390Y66680D03*
X251540Y160150D03*
G01Y136780*
G03X-374970Y133680I3100J0*
D17*
Y7620D03*
X247390Y66680D03*
X251540Y160150D03*
G01Y136780*
G03X-374970Y133680I3100J0*
My pattern code in java:
private final Pattern PinNamePattern = compile("(D[1-9][0-9])\\*");
private final Pattern LocationXYPattern = compile("^(G0[1-3])?(X|Y)(-?[\\d]+)(D0[1-3])?\\*",Pattern.MULTILINE);
private final Pattern LocationXYIJPattern = compile("^(G0[1-3])?X(-?[\\d]+)?Y(-?[\\d]+)?I?(-?[\\d]+)?J?(-?[\\d]+)?(D0[1-3])?\\*",Pattern.MULTILINE);
My code in java:
while ((line = br.readLine()) != null) {
Matcher pinNameMatcher = PinNamePattern.matcher(line);
//If found Aperture Name
if (pinNameMatcher.find()) {
currentApperture = pinNameMatcher.group(1);
System.out.println(currentApperture);
pinNameMatcher.reset();
//Start matching Location X Y I J
//Will keep looping as long as next aperture name not found
//Second While loop
while (!(pinNameMatcher.find()) ) {
line = br.readLine();
Matcher locXYMatcher = LocationXYPattern.matcher(line);
Matcher locXYIJMatcher = LocationXYIJPattern.matcher(line);
LineNumber++;
if (locXYMatcher.find()) {
System.out.println("XY FOUND");
if (locXYIJMatcher.find()) {
System.out.println("XYIJ FOUND");
}
}
However, when I'm using java to read, the pattern just simply cannot be found. Is there anything I missed out or am I doing it wrong? I have tried removing the "^" and MULTILINE flag but the pattern is still not found.

Your regex looks and works fine, it's possible you aren't searching it properly.
String s = "G03X-374970Y133680I3100J0*";
Pattern pattern = Pattern.compile("^(G0[1-3])?X(-?[\\d]+)?Y(-?[\\d]+)?I?(-?[\\d]+)?J?(-?[\\d]+)?(D0[1-3])?\\*");
Matcher m = pattern.matcher(s);
while (m.find()) {
String s = m.group(0);
System.out.println(s); // prints G03X-374970Y133680I3100J0*
}
In your updated code, you are looking for the second and third pattern only when the first pattern matches, which is probably not what you want. Try using this as a foundation and improving upon it:
while ((line = br.readLine()) != null) {
Matcher pinNameMatcher = PinNamePattern.matcher(line);
if (pinNameMatcher.find()) {
currentApperture = pinNameMatcher.group(0);
System.out.println(currentApperture);
}
Matcher locXYMatcher = LocationXYPattern.matcher(line);
if (locXYMatcher.find()) {
System.out.println(locXYMatcher.group(0));
}
Matcher locXYIJMatcher = LocationXYIJPattern.matcher(line);
if (locXYMatcher.find()) {
System.out.println(locXYIJMatcher.group(0));
}
}

check for string after a string

I'm writing a logic to check if there is any number following a specific keyword ("top" in the case).
String test = "top 10 results";
String test = "show top 10 results";
I used the pattern match find to acheive that and its working.
public static boolean hasNumberafterTOP(String query)
{
boolean hasNumberafterTOP = false;
Pattern pat = Pattern.compile("top\\s+([0-9]+)");
Matcher matcher = pat.matcher(query);
if(matcher.find())
{
hasNumberafterTOP = true;
numberAfterTOP=matcher.group(1);
}
return hasNumberafterTOP;
}
Is there any way I can extend this check to include these words too. five,ten,twenty,thirty etc.?

Try this code:
String keyword = "top";
Pattern p= Pattern.compile("\\d*\\s+"+Pattern.quote(keyword));
Matcher match = p.matcher(query");
match .find();
String inputInt = makeMatch.group();
System.out.println(inputInt);

How to remove text between <script></script> tags

I want to remove the content between <script></script>tags. I'm manually checking for the pattern and iterating using while loop. But, I'm getting StringOutOfBoundException at this line:
String script = source.substring(startIndex,endIndex-startIndex);
Below is the complete method:
public static String getHtmlWithoutScript(String source) {
String START_PATTERN = "<script>";
String END_PATTERN = " </script>";
while (source.contains(START_PATTERN)) {
int startIndex=source.lastIndexOf(START_PATTERN);
int endIndex=source.indexOf(END_PATTERN,startIndex);
String script=source.substring(startIndex,endIndex);
source.replace(script,"");
}
return source;
}
Am I doing anything wrong here? And I'm getting endIndex=-1. Can anyone help me to identify, why my code is breaking.

String text = "<script>This is dummy text to remove </script> dont remove this";
StringBuilder sb = new StringBuilder(text);
String startTag = "<script>";
String endTag = "</script>";
//removing the text between script
sb.replace(text.indexOf(startTag) + startTag.length(), text.indexOf(endTag), "");
System.out.println(sb.toString());
If you want to remove the script tags too add the following line :
sb.toString().replace(startTag, "").replace(endTag, "")
UPDATE :
If you dont want to use StringBuilder you can do this:
String text = "<script>This is dummy text to remove </script> dont remove this";
String startTag = "<script>";
String endTag = "</script>";
//removing the text between script
String textToRemove = text.substring(text.indexOf(startTag) + startTag.length(), text.indexOf(endTag));
text = text.replace(textToRemove, "");
System.out.println(text);

You can use a regex to remove the script tag content:
public String removeScriptContent(String html) {
if(html != null) {
String re = "<script>(.*)</script>";
Pattern pattern = Pattern.compile(re);
Matcher matcher = pattern.matcher(html);
if (matcher.find()) {
return html.replace(matcher.group(1), "");
}
}
return null;
}
You have to add this two imports:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

I know I'm probably late to the party. But I would like to give you a regex (really tested solution).
What you have to note here is that when it comes to regular expressions, their engines are greedy by default. So a search string such as <script>(.*)</script> will match the entire string starting from <script> up until the end of the line, or end of the file depending on the regexp options used. This is due to the fact that the search engine uses greedy matching by default.
Now in order to perform the match that you want to in an accurate manner... you could use "lazy" searching.
Search with Lazy loading
<script>(.*?)<\/script>
Now with that, you will get accurate results.
You can read more about about Regexp Lazy & Greedy in this answer.

This worked for me:
private static String removeScriptTags(String message) {
String scriptRegex = "<(/)?[ ]*script[^>]*>";
Pattern pattern2 = Pattern.compile(scriptRegex);
if(message != null) {
Matcher matcher2 = pattern2.matcher(message);
StringBuffer str = new StringBuffer(message.length());
while(matcher2.find()) {
matcher2.appendReplacement(str, Matcher.quoteReplacement(" "));
}
matcher2.appendTail(str);
message = str.toString();
}
return message;
}
Credit goes to nealvs: https://nealvs.wordpress.com/2010/06/01/removing-tags-from-a-string-in-java/

How to get the video id from a YouTube URL with regex in Java

Porting from javascript answer to Java version
JavaScript REGEX: How do I get the YouTube video id from a URL?

Figured this question wasn't here for Java, in case you need to do it in Java here's a way
public static String extractYTId(String ytUrl) {
String vId = null;
Pattern pattern = Pattern.compile(
"^https?://.*(?:youtu.be/|v/|u/\\w/|embed/|watch?v=)([^#&?]*).*$",
Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(ytUrl);
if (matcher.matches()){
vId = matcher.group(1);
}
return vId;
}
Works for URLs like (also for https://...)
http://www.youtube.com/watch?v=0zM4nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/SomeUser#p/a/u/1/QDK8U-VIH_o
http://www.youtube.com/v/0zM4nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM4nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg
http://youtu.be/0zM4nApSvMg

Previous answers don't work for me.. It's correct method. It works with www.youtube.com and youtu.be links.
private String getYouTubeId (String youTubeUrl) {
String pattern = "(?<=youtu.be/|watch\\?v=|/videos/|embed\\/)[^#\\&\\?]*";
Pattern compiledPattern = Pattern.compile(pattern);
Matcher matcher = compiledPattern.matcher(youTubeUrl);
if(matcher.find()){
return matcher.group();
} else {
return "error";
}
}

Update 21 Oct 2022
As this post getting attention from people, I am going to update to serve more cases, thanks to #Nikhil Seban for your new links, the Java code is remained thanks to guys provide code #Leap Bun #Pankaj
This regex also use match group(1), please take a look on the bottom
Regex Online Here
http(?:s)?:\/\/(?:m.)?(?:www\.)?youtu(?:\.be\/|(?:be-nocookie|be)\.com\/(?:watch|[\w]+\?(?:feature=[\w]+.[\w]+\&)?v=|v\/|e\/|embed\/|user\/(?:[\w#]+\/)+))([^&#?\n]+)
http://www.youtube.com/watch?v=0zM4nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/SomeUser#p/a/u/1/QDK8U-VIH_o
http://www.youtube.com/v/0zM4nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM4nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg
http://youtu.be/0zM4nApSvMg
https://www.youtube.com/watch?v=nBuae6ilH24
https://www.youtube.com/watch?v=pJegNopBLL8
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
https://www.youtube.com/watch?v=0zM4nApSvMg&feature=youtu.be
https://www.youtube.com/watch?v=_5BTo2oZ0SE
https://m.youtube.com/watch?feature=youtu.be&v=_5BTo2oZ0SE
https://m.youtube.com/watch?v=_5BTo2oZ0SE
https://www.youtube.com/watch?feature=youtu.be&v=_5BTo2oZ0SE&app=desktop
https://www.youtube.com/watch?v=nBuae6ilH24
https://www.youtube.com/watch?v=eLlxrBmD3H4
https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk
https://www.youtube.com/watch?v=wz4MLJBdSpw&t=67s
http://www.youtube.com/watch?v=dQw4w9WgXcQ&a=GxdCwVVULXctT2lYDEPllDR0LRTutYfW
http://www.youtube.com/embed/dQw4w9WgXcQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
https://www.youtube.com/watch?v=EL-UCUAt8DQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
http://youtu.be/dQw4w9WgXcQ
http://www.youtube.com/v/dQw4w9WgXcQ
http://www.youtube.com/e/dQw4w9WgXcQ
http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ
https://www.youtube-nocookie.com/embed/xHkq1edcbk4?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/SomeUser#p/a/u/1/QDK8U-VIH_o
http://www.youtube.com/v/0zM4nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM4nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg
http://youtu.be/0zM4nApSvMg
Old post
This below code you can also use either review what I have updated or develop new pattern
I know this topic is old, but I get more cases to get Youtube id, this is my regex
http(?:s)?:\/\/(?:m.)?(?:www\.)?youtu(?:\.be\/|be\.com\/(?:watch\?(?:feature=youtu.be\&)?v=|v\/|embed\/|user\/(?:[\w#]+\/)+))([^&#?\n]+)
This will work for
http://www.youtube.com/watch?v=0zM4nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/SomeUser#p/a/u/1/QDK8U-VIH_o
http://www.youtube.com/v/0zM4nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM4nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM4nApSvMg
http://youtu.be/0zM4nApSvMg
https://www.youtube.com/watch?v=nBuae6ilH24
https://www.youtube.com/watch?v=pJegNopBLL8
http://www.youtube.com/watch?v=0zM4nApSvMg#t=0m10s
https://www.youtube.com/watch?v=0zM4nApSvMg&feature=youtu.be
https://www.youtube.com/watch?v=_5BTo2oZ0SE
https://m.youtube.com/watch?feature=youtu.be&v=_5BTo2oZ0SE
https://m.youtube.com/watch?v=_5BTo2oZ0SE
https://www.youtube.com/watch?feature=youtu.be&v=_5BTo2oZ0SE&app=desktop
https://www.youtube.com/watch?v=nBuae6ilH24
https://www.youtube.com/watch?v=eLlxrBmD3H4
Note: Using regex match group(1) to get ID
public static String getVideoId(#NonNull String videoUrl) {
String videoId = "";
String regex = PLEASE_COPY_ABOVE_REGEX_PATTERN
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(url);
if(matcher.find()){
videoId = matcher.group(1);
}
return videoId;
}

To me, for the sample links you posted, this regex looks more precise (capturing the ID to Group 1):
https?://(?:www\.)?youtu(?:\.be/|be\.com/(?:watch\?v=|v/|embed/|user/(?:[\w#]+/)+))([^&#?\n]+)
On the demo, see the capture groups in the bottom right pane. However, for the second match, not entirely sure that this is a correct ID as it looks different from the other samples—to be tweaked if needed.
In code, something like:
Pattern regex = Pattern.compile("http://(?:www\\.)?youtu(?:\\.be/|be\\.com/(?:watch\\?v=|v/|embed/|user/(?:[\\w#]+/)+))([^&#?\n]+)");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
VideoID = regexMatcher.group(1);
}

private fun extractVideoId(ytUrl: String?): String? {
var videoId: String? = null
val regex =
"^((?:https?:)?//)?((?:www|m)\\.)?((?:youtube\\.com|youtu.be|youtube-nocookie.com))(/(?:[\\w\\-]+\\?v=|feature=|watch\\?|e/|embed/|v/)?)([\\w\\-]+)(\\S+)?\$"
val pattern: Pattern = Pattern.compile(
regex ,
Pattern.CASE_INSENSITIVE
)
val matcher: Matcher = pattern.matcher(ytUrl)
if (matcher.matches()) {
videoId = matcher.group(5)
}
return videoId
}
The above function with regex can extract video Id from
https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk
https://www.youtube.com/watch?v=wz4MLJBdSpw&t=67s
http://www.youtube.com/watch?v=dQw4w9WgXcQ&a=GxdCwVVULXctT2lYDEPllDR0LRTutYfW
http://www.youtube.com/embed/dQw4w9WgXcQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
https://www.youtube.com/watch?v=EL-UCUAt8DQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
http://youtu.be/dQw4w9WgXcQ
http://www.youtube.com/v/dQw4w9WgXcQ
http://www.youtube.com/e/dQw4w9WgXcQ
http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ

Tried the other ones but failed in my case - adjusted the regex to fit for my urls
public static String extractYoutubeVideoId(String ytUrl) {
String vId = null;
String pattern = "(?<=watch\\?v=|/videos/|embed\\/)[^#\\&\\?]*";
Pattern compiledPattern = Pattern.compile(pattern);
Matcher matcher = compiledPattern.matcher(ytUrl);
if(matcher.find()){
vId= matcher.group();
}
return vId;
}
This one works for below url's
http://www.youtube.com/embed/Woq5iX9XQhA?html5=1
http://www.youtube.com/watch?v=384IUU43bfQ
http://gdata.youtube.com/feeds/api/videos/xTmi7zzUa-M&whatever
https://www.youtube.com/watch?v=C7RVaSEMXNk
Hope it will help some one. Thanks

Using regex from Tấn Nguyên:
public String getVideoIdFromYoutubeUrl(String url){
String videoId = null;
String regex = "http(?:s)?:\\/\\/(?:m.)?(?:www\\.)?youtu(?:\\.be\\/|be\\.com\\/(?:watch\\?(?:feature=youtu.be\\&)?v=|v\\/|embed\\/|user\\/(?:[\\w#]+\\/)+))([^&#?\\n]+)";
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(url);
if(matcher.find()){
videoId = matcher.group(1);
}
return videoId;
}
Credit to Tấn Nguyên.

This will work for me and simple..
public static String getVideoId(#NonNull String videoUrl) {
String reg = "(?:youtube(?:-nocookie)?\\.com\\/(?:[^\\/\\n\\s]+\\/\\S+\\/|(?:v|e(?:mbed)?)\\/|\\S*?[?&]v=)|youtu\\.be\\/)([a-zA-Z0-9_-]{11})";
Pattern pattern = Pattern.compile(reg, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(videoUrl);
if (matcher.find())
return matcher.group(1);
return null;
}

Just pass youtube link in below function , it will detect all type of youtube url
public static String getYoutubeID(String youtubeUrl) {
if (TextUtils.isEmpty(youtubeUrl)) {
return "";
}
String video_id = "";
String expression = "^.*((youtu.be" + "\\/)" + "|(v\\/)|(\\/u\\/w\\/)|(embed\\/)|(watch\\?))\\??v?=?([^#\\&\\?]*).*"; // var regExp = /^.*((youtu.be\/)|(v\/)|(\/u\/\w\/)|(embed\/)|(watch\?))\??v?=?([^#\&\?]*).*/;
CharSequence input = youtubeUrl;
Pattern pattern = Pattern.compile(expression, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
String groupIndex1 = matcher.group(7);
if (groupIndex1 != null && groupIndex1.length() == 11)
video_id = groupIndex1;
}
if (TextUtils.isEmpty(video_id)) {
if (youtubeUrl.contains("youtu.be/") ) {
String spl = youtubeUrl.split("youtu.be/")[1];
if (spl.contains("\\?")) {
video_id = spl.split("\\?")[0];
}else {
video_id =spl;
}
}
}
return video_id;
}

Combining a couple of methods, to cover as many formats as possible, is recommended.
String ID1 = getYoutubeID1(url); regex 1
String ID2 = getYoutubeID2(url); regex 2
String ID3 = getYoutubeID3(url); regex 3
then, using an if/switch statement to choose a value that isn't null and is valid.

I actully try all the above but some times work and some times fail
any way I found this may it help -- in this site
private final static String expression = "(?<=watch\\?v=|/videos/|embed\\/|youtu.be\\/|\\/v\\/|\\/e\\/|watch\\?v%3D|watch\\?feature=player_embedded&v=|%2Fvideos%2F|embed%\u200C\u200B2F|youtu.be%2F|%2Fv%2F)[^#\\&\\?\\n]*";
public static String getVideoId(String videoUrl) {
if (videoUrl == null || videoUrl.trim().length() <= 0){
return null;
}
Pattern pattern = Pattern.compile(expression);
Matcher matcher = pattern.matcher(videoUrl);
try {
if (matcher.find())
return matcher.group();
} catch (ArrayIndexOutOfBoundsException ex) {
ex.printStackTrace();
}
return null;
}
he said
This will work on two kinks of youtube video urls either direct or shared URL.
Direct url: https://www.youtube.com/watch?v=XXXXXXXXXXX
Shared url: https://youtu.be/XXXXXXXXXXX

This is very similar to the link posted in your question.
String url = "https://www.youtube.com/watch?v=wZZ7oFKsKzY";
if(url.indexOf("v=") == -1) //there's no video id
return "";
String id = url.split("v=")[1]; //everything before the first 'v=' will be the first element, i.e. [0]
int end_of_id = id_to_end.indexOf("&"); //if there are other parameters in the url, get only the id's value
if(end_index != -1)
id = url.substring(0, end_of_id);
return id;

Without regex :
String url - "https://www.youtube.com/watch?v=BYrumLBuzHk"
String video_id = url.replace("https://www.youtube.com/watch?v=", "")
So now you have a video ID in string video_id.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Why does this regex fail? - java

Related

method to take string inside curly braces using split or tokenizer

Unable to find pattern in Java

check for string after a string

How to remove text between <script></script> tags

How to get the video id from a YouTube URL with regex in Java

Categories

Resources