I'm trying to scrape the six images on the left side of the page from this URL on Amazon using Selenium WebDriver:
http://www.amazon.com/EasyAcc%C2%AE-10000mAh-Brilliant-Smartphone-Bluetooth/dp/B00H9BEC8E
However, whatever I try causes an error. What I've tried so far:
I tried scraping the images directly using XPath and then extracting the src with the getAttribute method. For example, for the first image on the page the XPath is:
.//*[@id='a-autoid-2']/span/input
so I tried the following:
String path1 = ".//*[@id='a-autoid-2']/span/input";
String url = "http://www.amazon.com/EasyAcc%C2%AE-10000mAh-Brilliant-Smartphone-Bluetooth/dp/B00H9BEC8E";
WebDriver driver = new FirefoxDriver();
driver.get(url);
WebElement s;
s = driver.findElement(By.xpath(path1));
String src;
src = s.getAttribute("src");
System.out.println(src);
But I'm unable to find the source.
Note: This problem occurs only when scraping images from certain types of products. For example, I can easily scrape images from this product using Selenium:
http://www.amazon.com/Ultimate-Unification-Diet-Health-Disease/dp/0615797806/
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
public class mytest {
public static void main(String[] args) {
// TODO Auto-generated method stub
String path = ".//*[@id='imgThumbs']/div[2]/img";
String url = "http://www.amazon.com/Ultimate-Unification-Diet-Health-Disease/dp/0615797806/";
WebDriver driver = new FirefoxDriver();
driver.get(url);
WebElement s;
s = driver.findElement(By.xpath(path));
String src;
src = s.getAttribute("src");
System.out.println(src);
driver.close();
}
}
This code works flawlessly. It is only with certain products that there seems to be no way around the problem.
I tried clicking on the image, which causes an iframe to open, but I'm unable to scrape images from this iframe either, even after switching to it with:
driver.switchTo().frame(IFRAMEID);
I know I can use the "screenshot" method but I'm wondering if there's a way to scrape the images directly?
Thanks
Try this code
String path = "//div[#id='imageBlock_feature_div']//span/img";
String url = "http://www.amazon.com/EasyAcc%C2%AE-10000mAh-Brilliant-Smartphone-Bluetooth/dp/B00H9BEC8E";
WebDriver driver = new FirefoxDriver();
driver.get(url);
List<WebElement> srcs;
srcs = driver.findElements(By.xpath(path));
for(WebElement src : srcs) {
System.out.println(src.getAttribute("src"));
}
driver.close();
Result
2015-01-23 12:36:14 [main]-[INFO] Opened url: http://rads.stackoverflow.com/amzn/click/B00H9BEC8E
http://ecx.images-amazon.com/images/I/41cOP3mFX3L._SX38_SY50_CR,0,0,38,50_.jpg
http://ecx.images-amazon.com/images/I/51YkMhRXqcL._SX38_SY50_CR,0,0,38,50_.jpg
http://ecx.images-amazon.com/images/I/51nSbXF%2BCTL._SX38_SY50_CR,0,0,38,50_.jpg
http://ecx.images-amazon.com/images/I/31s%2B31F%2BQmL._SX38_SY50_CR,0,0,38,50_.jpg
http://ecx.images-amazon.com/images/I/41FmTOJEOOL._SX38_SY50_CR,0,0,38,50_.jpg
http://ecx.images-amazon.com/images/I/41U6qpLJ07L._SX38_SY50_CR,0,0,38,50_.jpg
However, to get Amazon images I suggest you try the Amazon Product Advertising API: https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html
It's much better.
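As a side note, the URLs above point at 38x50 thumbnails. A commonly observed (but undocumented) pattern is that stripping the `._..._` size modifier from an Amazon image URL yields the full-size image. A minimal sketch, assuming that pattern holds:

```java
public class AmazonImageUrls {
    // Strip the "._SX38_SY50_..._" size modifier; the remaining URL usually
    // points at the full-size image. This is an observed URL convention,
    // not a documented Amazon API, so it may break without notice.
    static String fullSizeUrl(String thumbnailUrl) {
        return thumbnailUrl.replaceAll("\\._[^/]*_\\.", ".");
    }

    public static void main(String[] args) {
        String thumb = "http://ecx.images-amazon.com/images/I/41cOP3mFX3L._SX38_SY50_CR,0,0,38,50_.jpg";
        System.out.println(fullSizeUrl(thumb));
        // http://ecx.images-amazon.com/images/I/41cOP3mFX3L.jpg
    }
}
```

Since this relies on an unofficial URL convention, the Product Advertising API remains the supported route.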
Related
I am trying to learn Selenium by testing it on different websites. In this process, I am working with the Flipkart website: I would like to type "puma" in the search bar and click one of the resulting items, but I am not able to do that using the code below. Could anyone help me solve it?
Secondly, if I click on any item, it is redirected to a new tab. How can I access the new tab's elements using the same script?
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.NoSuchElementException;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;
public class AutomationTesting {
public static void main(String[] args) {
System.setProperty("webdriver.gecko.driver","/Users/xxxx/eclipse-workspace/seleniumTesting/lib/geckoDriver/geckodriver");
WebDriver driver = new FirefoxDriver();
driver.get("https://www.google.de");
driver.findElement(By.id("lst-ib")).sendKeys("flipkart");
driver.findElement(By.id("lst-ib")).sendKeys(Keys.ENTER);
WebDriverWait wait = new WebDriverWait(driver, 20);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.partialLinkText("Flipkart")));
driver.findElement(By.partialLinkText("Flipkart")).click();
driver.findElement(By.cssSelector("._3Njdz7 [class = '_2AkmmA _29YdH8']")).click();
driver.findElement(By.xpath("//input[@class = 'LM6RPg']")).click();
driver.findElement(By.xpath("//input[@class = 'LM6RPg']")).sendKeys("Puma");
driver.findElement(By.xpath("//button[@class = 'vh79eN']")).click();
driver.findElement(By.xpath("//a[@title='Puma Men Black Wallet' and @class= '_1Nyybr _30XEf0']")).click();
}
}
You need to use the window switchTo feature.
String mainWindowHandle = driver.getWindowHandle();
ArrayList<String> wins = new ArrayList<String>(driver.getWindowHandles());
// You can use a for loop here, or get the assumed second window directly
driver.switchTo().window(wins.get(1));
// Test some things, then switch back
driver.close();
driver.switchTo().window(mainWindowHandle);
See http://www.seleniumhq.org/docs/03_webdriver.jsp#moving-between-windows-and-frames
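Note that getWindowHandles() returns a Set&lt;String&gt;, so indexing into it assumes an ordering the API doesn't guarantee. A small helper that picks out the handle that isn't the main window is more robust; this is a sketch with illustrative handle values:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class WindowHandles {
    // Given the main window's handle and all current handles,
    // return the first handle that is not the main one (the new tab/window).
    static String newWindowHandle(String mainHandle, Set<String> allHandles) {
        for (String handle : allHandles) {
            if (!handle.equals(mainHandle)) {
                return handle;
            }
        }
        throw new IllegalStateException("No other window is open");
    }

    public static void main(String[] args) {
        Set<String> handles = new LinkedHashSet<String>();
        handles.add("CDwindow-AAA");
        handles.add("CDwindow-BBB");
        System.out.println(newWindowHandle("CDwindow-AAA", handles)); // CDwindow-BBB
    }
}
```

In a real script you would call driver.switchTo().window(newWindowHandle(mainWindowHandle, driver.getWindowHandles())).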
You will have to inspect that entire frame. Instead of hand-crafting XPaths on your own, I would suggest the Firebug and FirePath add-ons for Firefox to get the XPath of the elements. Inspect the frame within which the results are displayed, then save the matches in a variable like this:
List<WebElement> searchResults;
searchResults = driver.findElements(By.xpath("Your xpath"));
Then access the elements of this list by index and perform a .click() action on them. See this link for more on capturing XPaths with Firebug and FirePath.
Question 1: During the execution of the last line in the above code I am getting the following error: Unable to locate element: //a[@title='Puma Men Black Wallet' and @class= '_1Nyybr _30XEf0']
--> Due to lazy loading, the WebElement is not present in the DOM at the moment you fire the click event on it. To overcome this you need to bring the WebElement into view and then fire the click event.
Please refer to the code below:
driver.get("https://www.google.de");
driver.findElement(By.id("lst-ib")).sendKeys("flipkart");
driver.findElement(By.id("lst-ib")).sendKeys(Keys.ENTER);
WebDriverWait wait = new WebDriverWait(driver,60);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.partialLinkText("Flipkart")));
driver.findElement(By.partialLinkText("Flipkart")).click();
try{
driver.findElement(By.cssSelector("._3Njdz7 [class = '_2AkmmA _29YdH8']")).click();
}catch(Exception e){
System.out.println("No division");
}
driver.findElement(By.xpath("//input[@class = 'LM6RPg']")).click();
driver.findElement(By.xpath("//input[@class = 'LM6RPg']")).sendKeys("Puma");
driver.findElement(By.xpath("//button[@class = 'vh79eN']")).click();
// Thread.sleep(3000);
wait.until(ExpectedConditions.visibilityOf(driver.findElement(By.xpath("//a[@title='Puma Men Black Wallet']"))));
// getting element into view
((JavascriptExecutor) driver).executeScript("arguments[0].scrollIntoView();", driver.findElement(By.xpath("//*[@alt='Puma Men Black Wallet']")));
Thread.sleep(2000);
driver.findElement(By.xpath("//*[@alt='Puma Men Black Wallet']")).click();
Question 2: If that last click works, the browser is redirected to a new tab. So how do I access the elements in the new tab?
--> As suggested by @Damian Jansen, add that code after the last click event.
String mainWindowHandle = driver.getWindowHandle();
ArrayList<String> wins = new ArrayList<String>(driver.getWindowHandles());
for(String win : wins ){
driver.switchTo().window(win);
// other operation
System.out.println(driver.getTitle());
}
// back to old window
driver.switchTo().window(mainWindowHandle);
System.out.println(driver.getTitle());
Hope this helps you :)
Use the code below to reach the Puma suggestion and then select it:
public static void main(String[] args) throws InterruptedException, AWTException {
System.setProperty("webdriver.chrome.driver","G:\\java programme\\SendkeysExample\\lib\\chromedriver.exe");
WebDriver driver = new ChromeDriver();
driver.manage().window().maximize();
driver.manage().timeouts().implicitlyWait(5, TimeUnit.SECONDS);
driver.get("https://www.google.com");
driver.findElement(By.id("lst-ib")).sendKeys("flipkart");
driver.findElement(By.id("lst-ib")).sendKeys(Keys.ENTER);
driver.findElement(By.linkText("Flipkart")).click();
driver.findElement(By.className("LM6RPg")).sendKeys("Puma");
Robot rb = new Robot();
rb.keyPress(KeyEvent.VK_DOWN);
rb.keyPress(KeyEvent.VK_DOWN);
rb.keyPress(KeyEvent.VK_ENTER);
rb.keyRelease(KeyEvent.VK_DOWN);
rb.keyRelease(KeyEvent.VK_DOWN);
rb.keyRelease(KeyEvent.VK_ENTER);
Thread.sleep(2000);
/*driver.findElement(By.xpath(".//*[@id='container']/div/header/div[1]/div/div/div/div[1]/form/ul/li[2]/a"));
driver.findElement(By.className("icon-add-circle"));*/
driver.close();
}
}
I'm new to using Selenium WebDriver, but I'm currently working on a project that requires me to save an image of a person from a website using just their name. Many solutions I've found online don't seem to work, at least in my scenario. The main issue I need assistance with is singling out specific images from a webpage. For example, if I were to use the link https://www.squarespace.com/about/team/ I would need to be able to download the images, or at least get the link to the image of a single person based on their name. Any information would help, thanks!
Here is the Answer to your Question:
You can use the following code block to retrieve the name of all the Team Members and the link of the image of solely one person based on their name:
import java.util.ArrayList;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
public class Q44980374_url_of_an_image
{
public static void main(String[] args)
{
System.setProperty("webdriver.chrome.driver", "C:\\Utility\\BrowserDrivers\\chromedriver.exe");
ChromeOptions options = new ChromeOptions();
options.addArguments("start-maximized");
options.addArguments("disable-infobars");
WebDriver driver=new ChromeDriver(options);
driver.get("https://www.squarespace.com/about/team/");
JavascriptExecutor jse = (JavascriptExecutor) driver;
jse.executeScript("window.scrollBy(0,1000)", "");
//Get the Name of the Members
List<WebElement> team_member = driver.findElements(By.xpath("//main[@id='content']//div[@class='team-members']/div[@class='team-member']/div[@class='team-member-text']/h3"));
List<String> mem_name_list = new ArrayList<String>();
for (WebElement member:team_member)
{
String member_name = member.getAttribute("innerHTML");
mem_name_list.add(member_name);
}
List<WebElement> team_member_images = driver.findElements(By.xpath("//main[@id='content']//div[@class='team-members']/div[@class='team-member']/div[@class='team-member-portrait']/img"));
List<String> mem_image_list = new ArrayList<String>();
for (WebElement image_link:team_member_images)
{
String member_image = image_link.getAttribute("src");
mem_image_list.add(member_image);
}
for(int i=0;i<team_member.size();i++)
{
System.out.println(mem_name_list.get(i)+"'s image link is : "+mem_image_list.get(i) );
}
}
}
The Output of this code block is:
Anthony Casalena's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/anthony-casalena-300w.jpg
Nicole Anasenes's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/nicole-anasenes-300w.jpg
Andrew Bartholomew's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/andrew-bartholomew-300w.jpg
John Colton's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/john-colton-300w.jpg
Raphael Fontes's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/raphael-fontes-300w.jpg
David Lee's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/david-lee-300w.jpg
Kinjil Mathur's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/kinjil-mathur-300w.jpg
Roberta Meo's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/roberta-meo-300w.jpg
Kris Passet's image link is : https://static1.squarespace.com/static/ta/5134cbefe4b0c6fb04df8065/9025/assets/pages/about/team/executive-team/kris-passet-300w.jpg
Let me know if this Answers your Question.
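If you also want to save each image to disk keyed by the person's name, you can derive a file name from the member name and stream-copy the bytes with java.nio. This is a sketch: the lower-cased, hyphenated naming scheme is my own assumption, not something the page defines:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SaveTeamImages {
    // "Anthony Casalena" -> "anthony-casalena.jpg" (illustrative naming scheme)
    static String fileNameFor(String memberName) {
        return memberName.trim().toLowerCase().replaceAll("[^a-z0-9]+", "-") + ".jpg";
    }

    // Stream-copy the image bytes at imageUrl into dir under fileName
    // (requires network access at runtime).
    static void download(String imageUrl, Path dir, String fileName) throws IOException {
        try (InputStream in = new URL(imageUrl).openStream()) {
            Files.copy(in, dir.resolve(fileName), StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("Anthony Casalena")); // anthony-casalena.jpg
        // In the scraping code you would pair the two lists by index:
        // download(mem_image_list.get(i), java.nio.file.Paths.get("team"), fileNameFor(mem_name_list.get(i)));
    }
}
```

Pairing mem_name_list and mem_image_list by index, as the final loop already does, then gives one file per person.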
I want to save the background image of http://yooz.ir/ to my disk with Java code. This image changes every few days. It has a download link (a blue arrow), but I can't find the image location in order to save it by code. How can I find the location of this image in the URL content with jsoup, HtmlUnit, etc.?
You have the wrong URL. Here is the right one: http://imgs.yooz.ir/yooz/walls/yooz-950602-2.jpg
This element is populated asynchronously, so you need a WebDriver automation tool; the most common is Selenium.
/*
import Selenium; you can add it via Maven:
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-server</artifactId>
<version>2.45.0</version>
</dependency>
*/
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.WebDriverWait;
public class SaveImageFromUrl {
public static void main(String[] args) throws Exception {
// download chrome driver http://chromedriver.storage.googleapis.com/index.html
// or use firefox, w/e u like
System.setProperty("webdriver.chrome.driver", "/path/to/chromedriver");
WebDriver driver = new ChromeDriver(); // opens browsers
driver.get("http://yooz.ir/"); // redirect to site
// wait until element which contains the download link exist in page
new WebDriverWait(driver, 5).until(new ExpectedCondition<WebElement>() {
@Override
public WebElement apply(WebDriver d) {
return d.findElement(By.className("image-day__download"));
}
});
// get the link inside the element via queryselector
// https://developer.mozilla.org/en-US/docs/Web/API/Document/querySelector
String img2download = driver.findElement(By.cssSelector(".image-day__download a")).getAttribute("href");
System.out.println("img2download = " + img2download);
//TODO download..
driver.close();
}
}
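For the //TODO download step, one option is to fetch the image with ImageIO once you have the href. A sketch (note that ImageIO re-encodes the file, so use a raw stream copy instead if you need byte-identical output):

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import javax.imageio.ImageIO;

public class SaveYoozImage {
    // "http://imgs.yooz.ir/yooz/walls/yooz-950602-2.jpg" -> "yooz-950602-2.jpg"
    static String lastSegment(String url) {
        return url.substring(url.lastIndexOf('/') + 1);
    }

    // Fetch the image and write it to dir, named after the URL's last
    // path segment (requires network access at runtime).
    static void save(String imageUrl, File dir) throws IOException {
        BufferedImage img = ImageIO.read(new URL(imageUrl));
        ImageIO.write(img, "jpg", new File(dir, lastSegment(imageUrl)));
    }

    public static void main(String[] args) {
        System.out.println(lastSegment("http://imgs.yooz.ir/yooz/walls/yooz-950602-2.jpg"));
        // save(img2download, new File("."));  // using the href from the Selenium code above
    }
}
```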
I'm trying to follow the Selenium WebDriver basics tutorial on handling HTML tables here:
http://www.toolsqa.com/selenium-webdriver/handling-tables-selenium-webdriver/
The code of "Practice Exercise 1" on that page doesn't work: the problem seems to be the XPath filter here
String sCellValue = driver.findElement(By.xpath(".//*[@id='post-1715']/div/div/div/table/tbody/tr[1]/td[2]")).getText();
and here
driver.findElement(By.xpath(".//*[@id='post-1715']/div/div/div/table/tbody/tr[1]/td[6]/a")).click();
The page used in the sample code is this one
http://www.toolsqa.com/automation-practice-table/
I've tried to change the code, extracting the XPath directly from the page using Firebug, and my new code is the following:
package practiceTestCases;
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
public class PracticeTables_00 {
private static WebDriver driver = null;
public static void main(String[] args) {
driver = new FirefoxDriver();
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
driver.get("http://www.toolsqa.com/automation-practice-table");
//Here we are storing the value from the cell in to the string variable
String sCellValue = driver.findElement(By.xpath("/html/body/div[1]/div[3]/div[2]/div/div/table/tbody/tr[1]/td[2]")).getText();
System.out.println(sCellValue);
// Here we are clicking on the link of first row and the last column
driver.findElement(By.xpath("/html/body/div[1]/div[3]/div[2]/div/div/table/tbody/tr[1]/td[6]/a")).click();
System.out.println("Link has been clicked otherwise an exception would have thrown");
driver.close();
}
}
When I try to execute it, the error is still
Exception in thread "main" org.openqa.selenium.NoSuchElementException: Unable to locate element: {"method":"xpath","selector":"/html/body/div[1]/div[3]/div[2]/div/div/table/tbody/tr[1]/td[2]"}
I'm using Eclipse Luna on Windows 7
Any suggestions? Thank you in advance ...
Cesare
Your XPath is wrong. As far as I can see, you are trying to get the content of the second row, first column (not counting the header row). Try the following XPath on the http://www.toolsqa.com/automation-practice-table/ page:
/html/body/div[2]/div[3]/div[2]/div/div/table/tbody/tr[2]/td[1]
or
//*[@id='content']/table/tbody/tr[2]/td[1]
If you run getText() it will return the following value: Saudi Arabia
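Since only the row and column indices change between cells, you can build such locators with String.format instead of hard-coding each one. A small sketch around the //*[@id='content'] locator:

```java
public class TableXPath {
    // Build the XPath for the cell at 1-based row r, column c of the practice table.
    static String cellXPath(int row, int col) {
        return String.format("//*[@id='content']/table/tbody/tr[%d]/td[%d]", row, col);
    }

    public static void main(String[] args) {
        System.out.println(cellXPath(2, 1));
        // //*[@id='content']/table/tbody/tr[2]/td[1]
    }
}
```

Then driver.findElement(By.xpath(cellXPath(2, 1))).getText() reads the same "Saudi Arabia" cell.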
The page has changed since that practice exercise was written; the IDs and class names used in its XPaths are outdated, so update the XPaths to match the current HTML.
Here I have changed everything:
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
public class PracticeTables_00 {
private static WebDriver driver = null;
public static void main(String[] args) {
driver = new FirefoxDriver();
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
driver.get("http://www.toolsqa.com/automation-practice-table");
// Here we are storing the value from the cell in to the string variable
String sCellValue = driver.findElement(By.xpath(".//*[@id='content']/table/tbody/tr[1]/td[2]")).getText();
System.out.println(sCellValue);
// Here we are clicking on the link of first row and the last column
driver.findElement(By.xpath(".//*[@id='content']/table/tbody/tr[1]/td[6]/a")).click();
System.out.println("Link has been clicked otherwise an exception would have thrown");
driver.close();
}
}
I'm searching for the text "Cheese!" on the Google homepage and I'm unsure how I can click on the resulting links after pressing the search button. For example, if I wanted to click the third link from the top of the results page, how can I identify that link and click on it? My code so far:
package mypackage;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.WebDriverWait;
public class myclass {
public static void main(String[] args) {
System.setProperty("webdriver.chrome.driver", "C:\\selenium-java-2.35.0\\chromedriver_win32_2.2\\chromedriver.exe");
WebDriver driver = new ChromeDriver();
driver.get("http://www.google.com");
WebElement element = driver.findElement(By.name("q"));
element.sendKeys("Cheese!");
element.submit();
//driver.close();
}
}
Google minifies its CSS class names etc., so it is not easy to identify everything.
You also have the problem that you have to "wait" until the site shows the results.
I would do it like this:
public static void main(String[] args) {
WebDriver driver = new FirefoxDriver();
driver.get("http://www.google.com");
WebElement element = driver.findElement(By.name("q"));
element.sendKeys("Cheese!\n"); // send also a "\n"
element.submit();
// wait until the google page shows the result
WebElement myDynamicElement = (new WebDriverWait(driver, 10))
.until(ExpectedConditions.presenceOfElementLocated(By.id("resultStats")));
List<WebElement> findElements = driver.findElements(By.xpath("//*[@id='rso']//h3/a"));
// this are all the links you like to visit
for (WebElement webElement : findElements)
{
System.out.println(webElement.getAttribute("href"));
}
}
This will print:
http://de.wikipedia.org/wiki/Cheese
http://en.wikipedia.org/wiki/Cheese
http://www.dict.cc/englisch-deutsch/cheese.html
http://www.cheese.com/
http://projects.gnome.org/cheese/
http://wiki.ubuntuusers.de/Cheese
http://www.ilovecheese.com/
http://cheese.slowfood.it/
http://cheese.slowfood.it/en/
http://www.slowfood.de/termine/termine_international/cheese_2013/
#Test
public void google_Search()
{
WebDriver driver;
driver = new FirefoxDriver();
driver.get("http://www.google.com");
driver.manage().window().maximize();
WebElement element = driver.findElement(By.name("q"));
element.sendKeys("Cheese!\n");
element.submit();
//Wait until the google page shows the result
WebElement myDynamicElement = (new WebDriverWait(driver, 10)).until(ExpectedConditions.presenceOfElementLocated(By.id("resultStats")));
List<WebElement> findElements = driver.findElements(By.xpath("//*[@id='rso']//h3/a"));
//Get the url of third link and navigate to it
String third_link = findElements.get(2).getAttribute("href");
driver.navigate().to(third_link);
}
There would be multiple ways to find an element (in your case the third Google Search result).
One of the ways would be using XPath:
// For the 3rd link
driver.findElement(By.xpath(".//*[@id='rso']/li[3]/div/h3/a")).click();
// For the 1st link
driver.findElement(By.xpath(".//*[@id='rso']/li[2]/div/h3/a")).click();
// For the 2nd link
driver.findElement(By.xpath(".//*[@id='rso']/li[1]/div/h3/a")).click();
The other options are
By.ByClassName
By.ByCssSelector
By.ById
By.ByLinkText
By.ByName
By.ByPartialLinkText
By.ByTagName
To better understand each one of them, you should try learning Selenium on something simpler than the Google search results page.
Example - http://www.google.com/intl/gu/contact/
To interact with the text input field with the placeholder "How can we help? Ask here.", you could do it this way -
// By.className
driver.findElement(By.className("searchbox")).sendKeys("Hey!");
// By.cssSelector
driver.findElement(By.cssSelector(".searchbox")).sendKeys("Hey!");
// By.id
driver.findElement(By.id("query")).sendKeys("Hey!");
// By.name
driver.findElement(By.name("query")).sendKeys("Hey!");
// By.xpath
driver.findElement(By.xpath(".//*[@id='query']")).sendKeys("Hey!");
Based on a quick inspection of the Google page, this would be the CSS path to the links in the results list:
ol[id="rso"] h3[class="r"] a
So you should do something like:
String path = "ol[id='rso'] h3[class='r'] a";
driver.findElements(By.cssSelector(path)).get(2).click();
However, you could also use XPath (not really recommended as a best practice), or jQuery locators, though I am not sure you can use those anywhere except in Arquillian Graphene.
A simple XPath for locating the Google Search button is:
//span[text()='Google Search']
public class GoogleSearch {
public static void main(String[] args) {
WebDriver driver=new FirefoxDriver();
driver.get("http://www.google.com");
driver.findElement(By.xpath("//input[@type='text']")).sendKeys("Cheese");
driver.findElement(By.xpath("//button[@name='btnG']")).click();
driver.manage().timeouts().implicitlyWait(30,TimeUnit.SECONDS);
driver.findElement(By.xpath("(//h3[@class='r']/a)[3]")).click();
driver.manage().timeouts().implicitlyWait(30,TimeUnit.SECONDS);
}
}
Most of the answers on this page are outdated.
Here's an updated Python version that searches Google and gets all the result hrefs:
import urllib.parse
import re
from time import sleep
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get("https://google.com/")
q = driver.find_element_by_name('q')
q.send_keys("always look on the bright side of life monty python")
q.submit()
sleep(1)
links = driver.find_elements_by_xpath("//h3[@class='r']//a")
for link in links:
    url = urllib.parse.unquote(link.get_attribute("href"))  # decode the url
    url = re.sub(r"^.*?(?:url\?q=)(.*?)&sa.*", r"\1", url, 0, re.IGNORECASE)  # get the clean url
Please note that the element id/name/class (@class='r') will change depending on the user agent.
The above code used PhantomJS's default user agent.