Regular Expression to find entire link in string

I have a regular expression in Apex that grabs only part of the link I need from a string. I need it to grab the entire link.
Here is what I'm working with:
String myvar = 'this is an example http://test.com/testing/123654123%0A%0A%0A%';
String myvar1 = '(?:(?:(?:[a-z0-9]{3,9}:(?://)?)(?:[-;:&=+$,w]+#)?[a-z0-9.-]+|(?:www.|[-;:&=+$??,w]+#)[a-z0-9.-]+)((?:/[+~%/.w-]*)?\\??(?:[-+=&;%#.w]*)#?w*)?)';
Pattern MyPattern = Pattern.compile(myvar1);
Matcher MyMatcher = MyPattern.matcher(myvar);
while (MyMatcher.find()) {
System.debug(MyMatcher.group());
Location = MyMatcher.group();
}
This is only returning http://test.com/
I need http://test.com/testing/123654123
How can I modify the regular expression to provide the complete link?
I just need to modify my existing regex to accomplish this. How can I keep as much of the regular expression I'm using as possible?
(?:(?:(?:[a-z0-9]{3,9}:(?://)?)(?:[-;:&=+$,w]+#)?[a-z0-9.-]+|(?:www.|[-;:&=+$??,w]+#)[a-z0-9.-]+)((?:/[+~%/.w-]*)?\\??(?:[-+=&;%#.w]*)#?w*)?)

Use this regex:
https?:\/\/[a-zA-Z0-9.\/-]*
Online demo http://regexr.com/3d7j7
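As a minimal sketch, the pattern above can be tried in plain Java (the question is about Apex, whose Pattern and Matcher classes closely follow Java's; that equivalence is an assumption here). The class and method names are hypothetical. Note that the character class stops at the first character it does not list, so the trailing %0A sequences are dropped, but so would be a query string or an underscore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class LinkFinder {
    // Same pattern as in the answer; forward slashes need no escaping in Java.
    static final Pattern LINK = Pattern.compile("https?://[a-zA-Z0-9./-]*");

    // Collects every substring of the text that matches the link pattern.
    static List<String> findLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK.matcher(text);
        while (matcher.find()) {
            links.add(matcher.group());
        }
        return links;
    }

    public static void main(String[] args) {
        String text = "this is an example http://test.com/testing/123654123%0A%0A%0A%";
        // Prints [http://test.com/testing/123654123]
        System.out.println(findLinks(text));
    }
}
```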

Related

Get link from url and get email by regex

I'm looking for a good regex in Java to extract the URL string from all links, and another one for all email addresses. So far I have this regex for links:
String linkRegex = "http[s]*://(\\w+\\.)*(\\w+)";
Pattern pattern = Pattern.compile(linkRegex);
Matcher matcher = pattern.matcher(stringAddres);
while (matcher.find()) {
String currentLink = matcher.group();
}
With this I get links like http://twitter.com, but I also get incomplete ones like https://google. Is there any way to exclude links like https://google?
I also need a regex that extracts an email address from a string. For example, from this:
href="mailto:contact@example.com">contact@example.com</a></span>
I should get only contact@example.com

There are many answered questions with simple regex patterns that work for most common email addresses. Still, I would suggest this regex, which is based on the RFC 5322 standard:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
Copied from this site.

I would just use look-behind to lock onto the interesting attributes in the text, and then capture everything inside the quotes ("...").
Like this:
((?<=href="mailto:)|(?<=src="))[^"]+
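A rough sketch of how the look-behind pattern above might be used from Java follows. The class name, method name, and sample HTML are made up for illustration. The whole match (group()) holds the attribute value; group 1 only wraps the zero-width look-behinds and is always empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class AttributeExtractor {
    // Look-behind anchors the match right after href="mailto: or src=",
    // then [^"]+ consumes everything up to the closing quote.
    static final Pattern ATTR =
            Pattern.compile("((?<=href=\"mailto:)|(?<=src=\"))[^\"]+");

    static List<String> extract(String html) {
        List<String> values = new ArrayList<>();
        Matcher matcher = ATTR.matcher(html);
        while (matcher.find()) {
            values.add(matcher.group());
        }
        return values;
    }

    public static void main(String[] args) {
        String html = "<a href=\"mailto:contact@example.com\">contact@example.com</a>";
        // Prints [contact@example.com]
        System.out.println(extract(html));
    }
}
```

Because the match is tied to the attribute, the visible link text between the tags is not returned a second time.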

Regex does not work in Java but it does in a regex checker [duplicate]

I have a piece of code that I can't get working in Eclipse with Java 1.7 installed.
There is a regular expression I want to use to match and extract 2 strings from every match, so I am using 2 groups for this.
I have tested my expression on many websites (online regex testers) and it works there, but it isn't working in my Java project in Eclipse.
The source string looks like any one of these:
Formal Language: isNatural
Annotation Tool: isHuman%Human Annotator: isHuman
Hybrid Annotation: conceptType%Hybrid Annotation Tool: conceptType%Hybrid Tagset: conceptType
... and so on.
I want to extract the first words before the ":" and the word after for every match.
The regex I'm using is this:
(\w*\s*\w+):(\s+\w+)%{0,1}
And the snippet of code:
String attribute = parts[0];
Pattern pattern = Pattern.compile("(\\w*\\s*\\w+):(\\s+\\w+)%{0,1}");
Matcher matcher = pattern.matcher(attribute);
OWLDataProperty dataProp = null;
if (matcher.matches()){
    while (matcher.find()){
        String name = null, domain = null;
        domain = matcher.group(1);
        name = matcher.group(2);
        dataProp = factory.getOWLDataProperty(":"+Introspector.decapitalize(name), pm);
        OWLClass domainClass = factory.getOWLClass(":"+domain.replaceAll(" ", ""), pm);
        OWLDataPropertyDomainAxiom domainAxiom = factory.getOWLDataPropertyDomainAxiom(dataProp, domainClass);
        manager.applyChange(new AddAxiom(ontology, domainAxiom));
    }
}
Does anybody know why it's not working?
Many thanks.

When using matches(), you are asking if the string you provided matches your regex as a whole. It is as if you added ^ at the beginning of your regex and $ at the end.
Your regex is otherwise fine, and returns what you expect. I recommend testing it on regexplanet.com, in Java mode. You will see when matches() is true, when it is false, and what each find() will return.
To solve your problem, I think you only need to remove the if (matcher.matches()) condition.
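The difference can be sketched in a small self-contained program. The class and method names below are hypothetical, and the OWL-specific calls from the question are left out. The sketch shows two things: matches() fails on any input with more than one entry, and even when matches() succeeds, a following find() resumes after that match and so finds nothing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class MatchesVsFind {
    static final Pattern PAIR =
            Pattern.compile("(\\w*\\s*\\w+):(\\s+\\w+)%{0,1}");

    // Mirrors the structure in the question: find() guarded by matches().
    static int countWithMatchesGuard(String input) {
        Matcher matcher = PAIR.matcher(input);
        int count = 0;
        if (matcher.matches()) {
            // After a successful matches(), find() continues from the end
            // of that match, so this loop body is never reached.
            while (matcher.find()) {
                count++;
            }
        }
        return count;
    }

    // The suggested fix: call find() directly, without the matches() guard.
    static List<String> extractPairs(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            pairs.add(matcher.group(1) + "=" + matcher.group(2).trim());
        }
        return pairs;
    }

    public static void main(String[] args) {
        String input = "Annotation Tool: isHuman%Human Annotator: isHuman";
        System.out.println(PAIR.matcher(input).matches()); // false
        System.out.println(countWithMatchesGuard(input));  // 0
        // Prints [Annotation Tool=isHuman, Human Annotator=isHuman]
        System.out.println(extractPairs(input));
    }
}
```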

find the path param using regex in the url

What is the regular expression to find the path parameter in a URL?
http://localhost:8080/domain/v1/809pA8
https://localhost:8080/domain/v1/809pA8
I want to retrieve the value (809pA8) from the above URLs using a regular expression. Java is preferable.

I would suggest you do something like
url.substring(url.lastIndexOf('/') + 1);
If you really prefer regexps, you could do
Matcher m = Pattern.compile("/([^/]+)$").matcher(url);
if (m.find())
value = m.group(1);
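Both suggestions can be compared side by side in a short sketch. The class and method names are hypothetical. The two approaches agree on the URLs from the question but differ when the URL ends in a slash: the regex finds no match, while substring returns an empty string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class LastPathSegment {
    // A slash followed by one or more non-slash characters at the end of input.
    static final Pattern LAST_SEGMENT = Pattern.compile("/([^/]+)$");

    // Returns the last path segment, or null if the URL ends with a slash.
    static String viaRegex(String url) {
        Matcher m = LAST_SEGMENT.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    // Returns everything after the last slash (possibly an empty string).
    static String viaSubstring(String url) {
        return url.substring(url.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String url = "http://localhost:8080/domain/v1/809pA8";
        System.out.println(viaRegex(url));     // 809pA8
        System.out.println(viaSubstring(url)); // 809pA8
    }
}
```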

I would try:
String url = "http://localhost:8080/domain/v1/809pA8";
String value = String.valueOf(url.subSequence(url.lastIndexOf('/'), url.length()-1));
No need for regex here, I think.
EDIT: I'm sorry I made a mistake:
String url = "http://localhost:8080/domain/v1/809pA8";
String value = String.valueOf(url.subSequence(url.lastIndexOf('/')+1, url.length()));
See this code working here: https://ideone.com/E30ddC

For your simple case, regex is overkill, as others noted. But if you have more cases, and that is why you prefer regex, take a look at Spring's AntPathMatcher#extractUriTemplateVariables, if you're using Spring. It's actually better equipped for extracting path variables than a raw regex. Here are some good examples.

Quickie regular expression stuck

I have a line of stringy goodness:
"B8&soundFile=http%3A%2F%2Fwww.example.com%2Faeero%2Fj34d1.mp3%2Chttp%3A%2F%2Fwww.example.com%2Faudfgo%2set4.mp3"
Can I use regular expressions to just extract the http up to mp3 for all times it exists?
I have tried reading the documents for regular expressions but none mention how to go FROM http to mp3. Can anyone help?

It would be better to use index-based String operations directly.
String data = "B8&soundFile=http%3A%2F%2Fwww.example.com%2Faeero%2Fj34d1.mp3%2Chttp%3A%2F%2Fwww.example.com%2Faudfgo%2set4.mp3";
System.out.println(data.substring(data.indexOf("http"), data.indexOf(".mp3")));
Output :
http%3A%2F%2Fwww.example.com%2Faeero%2Fj34d1
B8&soundFile=http%3A%2F%2Fwww.example.com%2Faeero%2Fj34d1.mp3%2Chttp%3A%2F%2Fwww.example.com%2Faudfgo%2set4.mp3

I probably wouldn't do this with a regex. URL decode it, break it up by tokens, and parse it using Java's URL class.

Try http.+?mp3
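A minimal sketch of this suggestion in Java, using a find() loop to collect every occurrence (the class and method names are made up for illustration). The reluctant quantifier +? is what makes each match stop at the nearest mp3; a greedy quantifier would run on to the last mp3 in the string and merge both URLs into one match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class Mp3Links {
    // Reluctant match: from "http" to the first following "mp3".
    static final Pattern MP3_URL = Pattern.compile("http.+?mp3");

    static List<String> findAll(String data) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = MP3_URL.matcher(data);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String data = "B8&soundFile=http%3A%2F%2Fwww.example.com%2Faeero%2Fj34d1.mp3"
                + "%2Chttp%3A%2F%2Fwww.example.com%2Faudfgo%2set4.mp3";
        // Prints the two still-encoded URLs, one per line.
        for (String url : findAll(data)) {
            System.out.println(url);
        }
    }
}
```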

The following should do it (assuming you want the http and mp3 as part of your match):
.*(http.*mp3)
if you just want the bits between then:
.*http(.*)mp3
for example:
String input = "B8&soundFile=http%3A%2F%2Fwww.example.com%2Faeero%2Fj34d1.mp3%2Chttp%3A%2F%2Fwww.example.com%2Faudfgo%2set4.mp3";
Pattern p = Pattern.compile(".*(http.*mp3)");
Matcher m = p.matcher(input);
if (m.find()) {
System.out.println(m.group(1));
}
gives us
http%3A%2F%2Fwww.example.com%2Faudfgo%2set4.mp3
