Java: How to copy part of a URL in WebDriver?

I want to know how to copy the "?ned=us&topic=t" part of "http://news.google.com/?ned=us&topic=t". Basically, I want to copy the path of the URL, or the portion after the ".com". How do I do this?
public class Example {
public static String url = "http://news.google.com/?ned=us&topic=t";
public static void main(String[] args) {
WebDriver driver = new FirefoxDriver();
driver.get(url);
WebElement reportCln=driver.findElement(By.id("id_submit_button"));
String path=driver.getCurrentUrl();
System.out.println(path);
}
}

You should have a look at the java.net.URL class and its getPath() and getQuery() methods.
@Test
public void urls() throws MalformedURLException {
final URL url = new URL("http://news.google.com/?ned=us&topic=t");
assertEquals("ned=us&topic=t", url.getQuery());
assertEquals("?ned=us&topic=t", "?" + url.getQuery());
assertEquals("/", url.getPath());
}
Regular expressions are fun, but IMO this is easier to understand.
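A minimal sketch of how the path and query could be joined into "everything after the host". It uses java.net.URI instead of java.net.URL (a choice made for this sketch, mainly because URI.create does not throw a checked exception), and the class and method names are made up for illustration. In the WebDriver case, the input would be the string returned by driver.getCurrentUrl().

```java
import java.net.URI;

final class UrlParts {
    // Returns the path plus "?query" (if there is a query), i.e. the part after host and port.
    static String pathAndQuery(String url) {
        URI uri = URI.create(url);
        String path = uri.getRawPath() == null ? "" : uri.getRawPath();
        String query = uri.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    public static void main(String[] args) {
        // prints: /?ned=us&topic=t
        System.out.println(pathAndQuery("http://news.google.com/?ned=us&topic=t"));
    }
}
```

Drop the leading "/" from the result if only the "?ned=us&topic=t" part is wanted.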

Try this:
String request_uri = null;
String url = "http://news.google.com/?ned=us&topic=t";
if (url.startsWith("http://")) {
request_uri = url.substring(7).split("/")[1];
} else {
request_uri = url.split("/")[1];
}
System.out.println (request_uri); // prints: ?ned=us&topic=t
If you're only interested in the query string, e.g. for google.com/search?q=key+words you want to drop everything up to and including search?, then just split on ? directly and take the second element:
// prints: q=key+words
System.out.println("google.com/search?q=key+words".split("\\?")[1]);

You can use a regular expression to extract the part you want:
String txt = "http://news.google.com/?ned=us&topic=t";
String re1 = "(http:\\/\\/news\\.google\\.com\\/)"; // unwanted part
String re2 = "(\\?.*)"; // wanted part
Pattern p = Pattern.compile(re1 + re2, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Matcher m = p.matcher(txt);
if (m.find())
{
String query = m.group(2);
System.out.print(query);
}

Related

How can I extract a substring from a string using regex in Java?

I have a String xxxxxxxxsrc="/slm/attachment/63338424306/Note.jpg"xxxxxxxx. Now, I want to extract the substrings slm/attachment/63338424306/Note.jpg and Note.jpg from it into variables, i.e. temp1 and temp2.
How can I do that using regex in Java?
Note: 63338424306 could be any random number, and Note.jpg could be anything, like Note.png or abc.jpg or xxxx.yyy etc.
Please help me to extract these two strings using regex.
You can use a negative lookbehind to get the file name:
((?:.(?<!/))+)\"
and the regex below to get the full path:
/(.*)\"
Sample code:
public static void main(String[] args) {
Pattern pattern = Pattern.compile("/(.*)\"");
Pattern pattern1 = Pattern.compile("((?:.(?<!/))+)\"");
String matchString = "/slm/attachment/63338424306/Note.jpg\"xxxxxxxx";
Matcher matcher = pattern.matcher(matchString);
String fullString = "";
while (matcher.find()) {
fullString = matcher.group(1);
}
matcher = pattern1.matcher(matchString);
String fileName = "";
while (matcher.find()) {
fileName = matcher.group(1);
}
System.out.println(fullString + " " + fileName);
}
As per your comment, I am taking the string as declared in my code below.
Please clarify if your input string is not like this or if I'm missing something.
public static void main(String[] args) {
String str = "xxxxxxxxsrc=\"/slm/attachment/63338424306/Note.jpg\"xxxxxxxx";
String url = null;
// The below pattern will grab string between quotes
Pattern p = Pattern.compile("\"([^\"]*)\"");
Matcher m = p.matcher(str);
while (m.find()) {
System.out.println(m.group(1));
url = m.group(1);
}
// and this will grab filename from the path(url)
p = Pattern.compile("(?:.(?<!/))+$");
m = p.matcher(url);
while (m.find()) {
System.out.println(m.group());
}
}
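Both values can also be captured in a single pass with nested groups. The following is only a sketch under the assumption that the path always appears as src="/..." in double quotes; the class and method names are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SrcExtractor {
    // Group 1: the path without its leading slash. Group 2: the last path segment (file name).
    private static final Pattern SRC =
            Pattern.compile("src=\"/((?:[^\"/]+/)*([^\"/]+))\"");

    // Returns {path, fileName}, or null when no src="/..." attribute is found.
    static String[] extract(String input) {
        Matcher m = SRC.matcher(input);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] parts = extract("xxxxxxxxsrc=\"/slm/attachment/63338424306/Note.jpg\"xxxxxxxx");
        System.out.println(parts[0]); // prints: slm/attachment/63338424306/Note.jpg
        System.out.println(parts[1]); // prints: Note.jpg
    }
}
```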

Removing link from Text in Java?

I need to change something like this -> Hello, go here http://www.google.com for your ...
grab the link, change it in a method I made, and put it back into the string like this
-> Hello, go here http://www.yahoo.com for your...
Here is what I have so far:
if(Text.toLowerCase().contains("http://"))
{
// Do stuff
}
else if(Text.toLowerCase().contains("https://"))
{
// Do stuff
}
All I need to do is change the URL in the String to something different. The URL in the String will not always be http://www.google.com, so I cannot just say replace("http://www.google.com","")
Use regex:
String oldUrl = text.replaceAll(".*(https?://)www((\\.\\w+)+).*", "www$2");
text = text.replaceAll("(https?://)www(\\.\\w+)+", "$1" + traslateUrl(oldUrl));
Note: code changed to meet extra requirements in comments below.
You can grab the link from the string using the code below. I assumed the string will contain only .com domains.
String input = "Hello, go here http://www.google.com";
Pattern pattern = Pattern.compile("https?://www\\.[a-z-]*\\.com");
Matcher m = pattern.matcher(input);
while (m.find()) {
String str = m.group();
}
Have you tried something like:
s = s.replaceFirst("http\\S+", newLink);
This will find the first word beginning with http, up to the first whitespace, and replace it with whatever you want.
If you want to keep the old link, then you can do:
String oldURL = null;
if (s.contains("http")) {
String[] words = s.split(" ");
for (String word: words) {
if (word.contains("http")) {
oldURL = word;
break;
}
}
//then replace the url or whatever
}
You can try this
private String removeUrl(String commentstr)
{
// A scheme, then "//" or backslashes, then any run of URL characters
String urlPattern = "((https?|ftp|gopher|telnet|file|Unsure|http):((//)|(\\\\))+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&]*)";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
// Remove every matched URL in a single pass
return m.replaceAll("").trim();
}
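To swap each link for a translated one rather than remove it, Matcher.appendReplacement can rebuild the text in one pass. This is a sketch, not the asker's actual method: the URL pattern is deliberately loose (http or https followed by non-whitespace, so trailing punctuation would be included), and the translate function passed in stands in for whatever method does the real conversion.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlRewriter {
    // Loose pattern: "http://" or "https://" followed by non-whitespace characters.
    private static final Pattern URL =
            Pattern.compile("https?://\\S+", Pattern.CASE_INSENSITIVE);

    // Replaces each URL in text with whatever the translate function returns for it.
    static String rewrite(String text, UnaryOperator<String> translate) {
        Matcher m = URL.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement stops '$' and '\' in the new URL from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(translate.apply(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "Hello, go here http://www.google.com for your ...";
        // prints: Hello, go here http://www.yahoo.com for your ...
        System.out.println(rewrite(text, url -> url.replace("google", "yahoo")));
    }
}
```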

Replacing a string with substring of that string

In Java I have a String:
String string = "sdfgjhjdfg.m\"gb=1234509876\"xcvbnfghj";
I want to replace it with Hi="1234509876".
I could not do this with the String replace function:
string = string.replace(".*gb=(.*)\".*","Hi=(.*)");
The (.*) in the second parameter should be replaced by the group captured in the first parameter.
Please help me.
Try the following code to get the number string out of the main string:
public static void main(String[] args)
{
String string = "sdfgjhjdfg.m\"gb=1234509876\"xcvbnfghj";
Pattern p = Pattern.compile("-?\\d+");
Matcher m = p.matcher(string);
while (m.find()) {
string=m.group();
}
System.out.println("res"+string);
}
You can try something like this
String string = "sdfgjhjdfg.m\"gb=1234509876\"xcvbnfghj";
String newStr=string.replaceAll("gb=1234509876","Hi=1234509876");
System.out.println(newStr);
try
string = string.replace("gb=","Hi=");
try this
String string = "sdfgjhjdfg.m\"gb=1234509876\"xcvbnfghj";
System.out.println(string);
string = string.replaceAll("gb=-?\\d+","Hi='new value'");
System.out.println(string);
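For the literal result asked for in the question, note that String.replace treats both arguments as plain text, while String.replaceAll takes a regex and lets the replacement refer to a captured group as $1. A small sketch, assuming the value after gb= is always a run of digits (the class and method names are only illustrative):

```java
final class GroupReplace {
    // Replaces the whole input with Hi="<digits>", using the digits captured after "gb=".
    // If the input contains no "gb=<digits>", it is returned unchanged.
    static String toHi(String input) {
        return input.replaceAll(".*gb=(\\d+).*", "Hi=\"$1\"");
    }

    public static void main(String[] args) {
        String string = "sdfgjhjdfg.m\"gb=1234509876\"xcvbnfghj";
        System.out.println(toHi(string)); // prints: Hi="1234509876"
    }
}
```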

regex pattern to match particular uri from list of urls

I have a list of URLs (lMapValues) with wildcards, as shown in the code below.
I need to match a URI against this list to find the matching URL.
In the code below, I should get the value of d in the map m as the matching URL.
That is, if part of the URI matches an entry in the list of URLs, that particular URL should be picked.
I tried splitting the URI into tokens and then checking each token against the list lMapValues. However, it is not giving me the correct result. Below is the code for that.
public class Matcher
{
public static void main( String[] args )
{
Map m = new HashMap();
m.put("a","https:/abc/eRControl/*");
m.put("b","https://abc/xyz/*");
m.put("c","https://work/Mypage/*");
m.put("d","https://cr/eRControl/*");
m.put("e","https://custom/MyApp/*");
List lMapValues = new ArrayList(m.values());
List tokens = new ArrayList();
String uri = "cr/eRControl/work/custom.jsp";
StringTokenizer st = new StringTokenizer(uri,"/");
while(st.hasMoreTokens()) {
String token = st.nextToken();
tokens.add(token);
}
for(int i=0;i<lMapValues.size();i++) {
String value = (String)lMapValues.get(i);
String patternString = "\\b(" + StringUtils.join(tokens, "|") + ")\\b";
Pattern pattern = Pattern.compile(patternString);
java.util.regex.Matcher matcher = pattern.matcher(value);
while (matcher.find()) {
System.out.println(matcher.group(1));
System.out.println(value);
}
}
}
}
Please help me with regex pattern to achieve above objective.
Any help will be appreciated.
It's much simpler to check if a string starts with a certain value with String.indexOf().
String[] urls = {
"abc/eRControl",
"abc/xyz",
"work/Mypage",
"cr/eRControl",
"custom/MyApp"
};
String uri = "cr/eRControl/work/custom.jsp";
for (String url : urls) {
if (uri.indexOf(url) == 0) {
System.out.println("Matched: " + url);
}else{
System.out.println("Not matched: " + url);
}
}
Also, there is no need to store the scheme in the map if you are never going to match against it.
If I understand your goal correctly, you might not even need regular expressions here.
Try this...
package test;
import java.util.HashSet;
import java.util.Set;
public class PartialURLMapper {
private static final Set<String> PARTIAL_URLS = new HashSet<String>();
static {
PARTIAL_URLS.add("cr/eRControl");
// TODO add more partial Strings to check against input
}
public static String getPartialStringIfMatching(final String input) {
if (input != null && !input.isEmpty()) {
for (String partial: PARTIAL_URLS) {
// this will be case-sensitive
if (input.contains(partial)) {
return partial;
}
}
}
// no partial match found, we return an empty String
return "";
}
// main method just to add example
public static void main(String[] args) {
System.out.println(PartialURLMapper.getPartialStringIfMatching("cr/eRControl/work/custom.jsp"));
}
}
... it will return:
cr/eRControl
The problem is that i is acting as a key, not as an index, in
String value = (String)lMapValues.get(i);
You will be better served exchanging the map for a list and using a for-each loop.
List<String> patterns = new ArrayList<String>();
...
for (String pattern : patterns) {
....
}
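If the map from the question has to stay as it is, one possible approach is to strip the scheme and the trailing "/*" from each stored pattern and then do a plain prefix check. The sketch below assumes every wildcard is a trailing "/*" and that the URI never carries a scheme; the class and method names are invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WildcardUrlMatcher {
    // Strips the scheme and the trailing "/*" from a pattern such as "https://cr/eRControl/*".
    static String prefixOf(String pattern) {
        String p = pattern.replaceFirst("^[a-zA-Z][a-zA-Z0-9+.-]*:/+", "");
        return p.endsWith("/*") ? p.substring(0, p.length() - 2) : p;
    }

    // Returns the key of the first pattern whose prefix the uri starts with, or null if none does.
    static String findKey(Map<String, String> patterns, String uri) {
        for (Map.Entry<String, String> e : patterns.entrySet()) {
            String prefix = prefixOf(e.getValue());
            if (uri.equals(prefix) || uri.startsWith(prefix + "/")) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("a", "https:/abc/eRControl/*");
        m.put("b", "https://abc/xyz/*");
        m.put("c", "https://work/Mypage/*");
        m.put("d", "https://cr/eRControl/*");
        m.put("e", "https://custom/MyApp/*");
        System.out.println(findKey(m, "cr/eRControl/work/custom.jsp")); // prints: d
    }
}
```

Appending "/" before the startsWith check keeps a pattern like cr/eRControl from matching an unrelated URI such as cr/eRControlExtra/page.jsp.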

Java. Replace relative Links to absolute with regex

I want to replace all relative links with absolute links in a String that represents an HTML file. I wrote the following method, which does not work: some links end up with a duplicated base URL, like http://www.google.dehttp://www.google.de/resource
public static String replacePattern(URL targetUrl,String urlAsString,String patternString) throws IOException{
System.out.println(targetUrl.toString());
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(urlAsString);
Set<String> replacedStrings = new TreeSet<String>();
//return matcher.replaceAll(targetUrl.toString()+"$0");
while (matcher.find()) {
String relativeLink = matcher.group(1);
//System.out.println("Find Link " + relativeLink);
if(!replacedStrings.contains(relativeLink)){
//System.out.println("Relative Link " + relativeLink);
String newLink = targetUrl.toString() + relativeLink;
//System.out.println("New Link " + newLink);
urlAsString = urlAsString.replace(relativeLink,newLink);
replacedStrings.add(relativeLink);
}
}
return urlAsString;
}
urlAsString is a String which contains the whole content. My patterns are
href=['\"](/[^'\"]+)['\"]
and
src=['\"](/[^'\"]+)['\"]
Use the URL class:
URL baseUrl = new URL("http://www.domain.com/folder/");
URL url = new URL(baseUrl, "url.html");
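The duplicated base URL in the question most likely comes from calling String.replace on the whole document: replacing /resource also rewrites the start of a longer link such as /resource/foo, which then gets prefixed a second time. A sketch of a single-pass alternative follows, using Matcher.appendReplacement together with URI.resolve. It merges the two patterns from the question into one, and the class and method names are made up; note that URI.resolve throws IllegalArgumentException for links containing characters that are illegal in a URI, such as spaces.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinkAbsolutizer {
    // Group 1: attribute name plus opening quote. Group 2: the link starting with "/".
    private static final Pattern LINK =
            Pattern.compile("((?:href|src)=['\"])(/[^'\"]+)");

    // Rewrites each matched link exactly once, so no link is prefixed twice.
    static String absolutize(URI base, String html) {
        Matcher m = LINK.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String absolute = base.resolve(m.group(2)).toString();
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1) + absolute));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"/resource\">x</a><img src='/resource/img.png'>";
        System.out.println(absolutize(URI.create("http://www.google.de/"), html));
    }
}
```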
