I am working on a Maven project in Java.
I have to load a URL, scroll the page down, and collect the links to all the other items on that page.
So far I load the page dynamically with Selenium, scroll it down, and fetch the links. But it takes too much time. Please help me optimize it.
Example: I am working on a page whose link is here.
My question:
Scrolling a web page with Selenium is very slow. How can I optimize this? (Suggest any other method to do the same, or help me optimize this one.)
Thanks in advance. Looking forward to your response.
Code to load and scroll the page dynamically:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;
import com.google.common.collect.*;
import java.io.File;
import java.util.ArrayList;
import java.util.Date;
import org.apache.commons.io.FileUtils;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.firefox.FirefoxProfile;
/**
*
* @author jhamb
*/
public class Scroll_down {
private static FirefoxProfile createFirefoxProfile() {
File profileDir = new File("/tmp/firefox-profile-dir");
if (profileDir.exists()) {
return new FirefoxProfile(profileDir);
}
FirefoxProfile firefoxProfile = new FirefoxProfile();
File dir = firefoxProfile.layoutOnDisk();
try {
profileDir.mkdirs();
FileUtils.copyDirectory(dir, profileDir);
} catch (IOException e) {
e.printStackTrace();
}
return firefoxProfile;
}
public static void main(String[] args) throws InterruptedException{
String url1 = "http://www.jabong.com/men/shoes/men-sports-shoes/?source=home-leftnav";
System.out.println("Fetching " + url1 + "...");
WebDriver driver = new FirefoxDriver(createFirefoxProfile());
driver.get(url1);
JavascriptExecutor jse = (JavascriptExecutor)driver;
jse.executeScript("window.scrollBy(0,250)", "");
// Scroll down in small steps, once per second, for 60 seconds.
for (int second = 0; second < 60; second++) {
jse.executeScript("window.scrollBy(0,200)", "");
Thread.sleep(1000);
}
String hml = driver.getPageSource();
driver.close();
Document document = Jsoup.parse(hml);
// Select only the divs that actually carry a data-url attribute.
Elements links = document.select("div[data-url]");
for (Element link : links) {
System.out.println(link.attr("data-url"));
}
}
}
Selenium scrolling is done through JavaScript. I don't know what your goal with Selenium is, though: your code has no assertions that compare anything.
If you are sure that your data loads quickly, don't use any sleep method.
Sleep calls make Selenium slower, but they do give the elements time to load properly.
It's up to you what to test, though.
How about page down?
ele.sendKeys(Keys.PAGE_DOWN); //WebElement ele = <Any existing element>
Repeat this until you find the item you are looking for.
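One possible way to act on the advice about sleeps: instead of always scrolling for a fixed 60 seconds, stop as soon as the page height stops changing. The sketch below is plain Java with no Selenium dependency so that the loop itself is easy to check. The name scrollUntilStable, its parameters, and the fake page in main are all illustrative assumptions, not part of any library.

```java
import java.util.function.LongSupplier;

/** Sketch: keep scrolling only while the page keeps growing. */
class ScrollUntilStable {

    /**
     * Runs scrollStep until pageHeight reports the same value for
     * stableRounds consecutive checks, or until maxRounds steps were made.
     * Returns the number of scroll steps performed.
     */
    static int scrollUntilStable(LongSupplier pageHeight, Runnable scrollStep,
                                 int stableRounds, int maxRounds, long pauseMillis) {
        long lastHeight = pageHeight.getAsLong();
        int unchanged = 0;
        int steps = 0;
        while (steps < maxRounds && unchanged < stableRounds) {
            scrollStep.run();
            steps++;
            try {
                Thread.sleep(pauseMillis); // short pause so lazily loaded content can arrive
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                break;
            }
            long height = pageHeight.getAsLong();
            if (height == lastHeight) {
                unchanged++;
            } else {
                unchanged = 0;
                lastHeight = height;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // Fake page standing in for a browser: it grows three times, then stops.
        long[] height = {1000};
        int[] scrolls = {0};
        Runnable fakeScroll = () -> {
            scrolls[0]++;
            if (scrolls[0] <= 3) {
                height[0] += 500;
            }
        };
        int steps = scrollUntilStable(() -> height[0], fakeScroll, 2, 60, 10);
        System.out.println("scroll steps: " + steps + ", final height: " + height[0]);
    }
}
```

With Selenium, the height supplier could be () -> ((Number) jse.executeScript("return document.body.scrollHeight")).longValue() and the scroll step () -> jse.executeScript("window.scrollTo(0, document.body.scrollHeight)"), assuming the page extends its body as more items load. The pause and the number of stable rounds would need tuning for the site in question.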
First Error Screen
Second Error Screen
I am running the selenium example code:
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.WebDriverWait;
import static org.openqa.selenium.support.ui.ExpectedConditions.presenceOfElementLocated;
import java.time.Duration;
public class HelloSelenium {
public static void main(String[] args) {
WebDriver driver = new FirefoxDriver();
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
try {
driver.get("https://google.com/ncr");
driver.findElement(By.name("q")).sendKeys("cheese" + Keys.ENTER);
WebElement firstResult = wait.until(presenceOfElementLocated(By.cssSelector("h3>div")));
System.out.println(firstResult.getAttribute("textContent"));
} finally {
driver.quit();
}
}
}
I am getting the errors shown in the screenshots above.
Note that the action is performed, but the last statement in the try block does not print the attribute of firstResult. I understand the problem is not very easy to read, but solving it should be interesting.
Also, I am using geckodriver (for Firefox) on Manjaro, and I am using Gradle.
The debug window states the error, on this line:
WebElement firstResult = wait.until(presenceOfElementLocated(By.cssSelector("h3>div")));
An exception is thrown because the wait.until call times out: it doesn't find the element you are searching for.
Your CSS selector (h3>div) does not match any element on the page.
I am learning Selenium and using jetblue.com for testing. When I click the "FIND IT" button on the homepage after providing all the required values, the page simply refreshes instead of going to the next screen. Can anyone advise where I am going wrong?
I tried using .click() and submit(), but control does not go to the next page.
package testCases;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.interactions.Actions;
import org.testng.annotations.Test;
public class Calendar {
@Test
public void calControl() throws InterruptedException
{
System.setProperty("webdriver.chrome.driver","C:\\chromedriver_win32\\chromedriver.exe");
ChromeOptions options = new ChromeOptions();
options.addArguments("disable-infobars");
options.addArguments("--start-maximized");
WebDriver driver= new ChromeDriver(options);
driver.get("https://www.jetblue.com");
// driver.findElement(By.className("input-group-btn")).click();
Thread.sleep(3000);
// driver.findElement(By.cssSelector("button[class='btn pull-right']")).click();
List<WebElement> count = driver.findElements(By.className("input-group-btn"));
int count1 = driver.findElements(By.className("input-group-btn")).size();
count.get(0).click();
Thread.sleep(3000);
driver.findElement(By.xpath("//table[@class='calendar']//td//span[.=27]")).click();
System.out.println(count1);
for (int i = 0;i<count1;i++)
{
System.out.println(count.get(i).toString());
}
Thread.sleep(3000);
count.get(1).click();
Thread.sleep(3000);
//driver.findElement(By.xpath("//button/span[@class='foreground-sprite-calendarforward']")).click();
List<WebElement> pullRight = driver.findElements(By.cssSelector("button[class='btn pull-right']"));
int count2 = driver.findElements(By.cssSelector("button[class='btn pull-right']")).size();
do
{
pullRight.get(1).click();
} while (driver.findElement(By.xpath("//div/strong[.='March 2018']")).isDisplayed()==false);
List<WebElement> returnDate = driver.findElements(By.xpath("//table[@class='calendar']//td//span[.=8]"));
int returnCount = driver.findElements(By.xpath("//table[@class='calendar']//td//span[.=3]")).size();
returnDate.get(1).click();
//driver.findElement(By.xpath("//input[@class='piejs']")).click(); Find Button
WebElement from = driver.findElement(By.id("jbBookerDepart"));
from.click();
Thread.sleep(2000);
from.sendKeys("DXB");
from.sendKeys(Keys.TAB);
Thread.sleep(2000);
WebElement to = driver.findElement(By.id("jbBookerArrive"));
to.click();
Thread.sleep(2000);
to.sendKeys("SFO");
to.sendKeys(Keys.TAB);
Thread.sleep(2000);
WebElement findButton = driver.findElement(By.xpath("//*[@id='login-search-wrap']/div[3]/div/div[3]/form/input[5]"));
//System.out.println("Value of button:" +driver.findElement(By.xpath("//*[@id='login-search-wrap']/div[3]/div/div[3]/form/input[5]")).toString());
/*Actions a = new Actions(driver);
//a.click(findButton).build().perform();
a.clickAndHold(findButton).doubleClick().build().perform();*/
/*driver.findElement(By.cssSelector("input[value='Find it']")).submit();
driver.findElement(By.xpath("input[value='Find it']")).submit();*/
System.out.println(findButton.isEnabled());
findButton.click();
Thread.sleep(5000);
}
}
That page is probably using anti-Selenium software. I debugged your code several times, performing some operations by hand and some through WebDriver, and the result is: if ANY operation is performed by WebDriver, the form will not submit. That even includes opening the page. They probably set some flag whenever automated software performs any action on their page.
Have a look at this answer. I don't know what anti-bot method they may be using, but this could be the first step.
I hope you'll find a solution for my problem.
The page I am trying to test (ARender will be deployed in my company and I need to test it): http://arender.fr/ARender/
The XPath I use to click on the next-page button (given by Firebug; I tried others, but the result is the same):
//*[@id='id_#_0.7290579307692522']/tbody/tr/td[2]/div/img
I tried many things and I really can't find the solution. I have already tried clicking it with a JavaScript executor...
Java code :
package firstPackage;
import java.io.File;
import java.io.IOException;
import java.util.*;
import java.util.NoSuchElementException;
import java.util.concurrent.TimeUnit;
import firstPackage.Props;
import firstPackage.methods;
import firstPackage.PropTech;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;
import org.junit.*;
import org.openqa.selenium.WebDriver;
import static org.junit.Assert.*;
import static org.hamcrest.CoreMatchers.*;
import org.openqa.selenium.*;
import org.openqa.selenium.ie.InternetExplorerDriver;
import org.openqa.selenium.ie.InternetExplorerDriverLogLevel;
import org.openqa.selenium.ie.InternetExplorerDriverService;
import org.openqa.selenium.support.ui.Select;
public class TestSelenium {
private WebDriver driver;
protected static InternetExplorerDriverService service;
private StringBuffer verificationErrors = new StringBuffer();
@Before
public void setUp()
{
System.out.println("*******************");
System.out.println("launching IE browser");
System.setProperty("webdriver.ie.driver", PropTech.driverPath+"IEDriverServer.exe");
driver = new InternetExplorerDriver();
driver.manage().window().maximize();
}
@Test
public void test() throws Exception
{
// WebDriver expects a fully qualified URL, so include the scheme.
driver.navigate().to("http://arender.fr" + "/ARender/");
Thread.sleep(15000);
driver.findElement(By.xpath("//div[@title='Next page']/img")).click();
}
@After
public void tearDown()
{
if(driver!=null)
{
System.out.println("Closing IE browser");
driver.quit();
//Kill les process
try {
Runtime.getRuntime().exec("taskkill /F /IM IEDriverServer.exe");
Runtime.getRuntime().exec("taskkill /F /IM iexplore.exe");
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
The element might be loaded dynamically on the page, so try using WebDriverWait. Please also pay attention to the XPath: I've changed it a little. I checked that this runs in IE.
WebDriverWait wait = new WebDriverWait(driver, 30);
driver.get("http://arender.fr/ARender/");
// Click the element after it becomes clickable
wait.until(ExpectedConditions.elementToBeClickable(By.xpath("//td[div[@title='Next page']]"))).click();
The following XPath should work:
//div[@title='Next page']/img
It seems the ID you used in your XPath (@id='id_#_0.7290579307692522') is dynamically generated. Can you confirm that it remains the same even after refreshing the page?
If not, try to build the XPath from other attributes, or use an XPath wildcard search.
XPath wildcard search
thanks
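To illustrate the point about generated IDs, here is a small sketch that evaluates three XPath expressions against made-up markup using the JDK's built-in XPath 1.0 engine (javax.xml.xpath), not Selenium. The markup, the id values, and the class name DynamicIdXPath are assumptions for illustration only; the real ARender page will differ.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DynamicIdXPath {

    /** Counts the nodes that an XPath 1.0 expression selects in a well-formed document. */
    static int countMatches(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Made-up markup: the id suffix stands for a value regenerated on each page load.
        String page = "<html><body><table id='id_a_0.123'><tbody><tr><td/>"
                + "<td><div title='Next page'><img src='next.png'/></div></td>"
                + "</tr></tbody></table></body></html>";

        // An id recorded on an earlier page load no longer matches.
        System.out.println(countMatches(page, "//*[@id='id_a_0.729']/tbody/tr/td[2]/div/img"));
        // A prefix match survives the changing suffix.
        System.out.println(countMatches(page, "//*[starts-with(@id,'id_')]/tbody/tr/td[2]/div/img"));
        // Anchoring on a stable attribute avoids the id entirely.
        System.out.println(countMatches(page, "//div[@title='Next page']/img"));
    }
}
```

On this sample the first expression should select nothing and the other two one node each. The same expressions can be passed to By.xpath(...) in Selenium, which is why the title-based locator suggested above is the more robust choice.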
I am trying to get data from a webpage (http://steamcommunity.com/id/Winning117/games/?tab=all) using a specific tag but I keep getting null. My desired result is to get the "hours played" for a specific game - Cluckles' Adventure in this case.
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
public class TestScrape {
public static void main(String[] args) throws Exception {
String url = "http://steamcommunity.com/id/Winning117/games/?tab=all";
Document document = Jsoup.connect(url).get();
Element playTime = document.select("div#game_605250").first();
System.out.println(playTime);
}
}
Edit: How can I tell if a webpage is using JavaScript and is therefore unable to be parsed by Jsoup?
To execute JavaScript from Java code, there is Selenium:
Selenium-WebDriver makes direct calls to the browser using each
browser’s native support for automation.
To include it with Maven, use this dependency:
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-server</artifactId>
<version>3.4.0</version>
</dependency>
Below is the code of a simple JUnit test that creates a WebDriver instance, goes to the given URL, and executes a simple script to get rgGames.
You have to download the chromedriver executable from https://sites.google.com/a/chromium.org/chromedriver/downloads.
package SeleniumProject.selenium;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Map;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.JUnit4;
import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriverService;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.WebDriverWait;
import junit.framework.TestCase;
@RunWith(JUnit4.class)
public class ChromeTest extends TestCase {
private static ChromeDriverService service;
private WebDriver driver;
@BeforeClass
public static void createAndStartService() {
service = new ChromeDriverService.Builder()
.usingDriverExecutable(new File("D:\\Downloads\\chromedriver_win32\\chromedriver.exe"))
.withVerbose(false).usingAnyFreePort().build();
try {
service.start();
} catch (IOException e) {
System.out.println("service didn't start");
// TODO Auto-generated catch block
e.printStackTrace();
}
}
@AfterClass
public static void createAndStopService() {
service.stop();
}
@Before
public void createDriver() {
ChromeOptions chromeOptions = new ChromeOptions();
DesiredCapabilities capabilities = DesiredCapabilities.chrome();
capabilities.setCapability(ChromeOptions.CAPABILITY, chromeOptions);
driver = new RemoteWebDriver(service.getUrl(), capabilities);
}
@After
public void quitDriver() {
driver.quit();
}
@Test
public void testJS() {
JavascriptExecutor js = (JavascriptExecutor) driver;
// Load a new web page in the current browser window.
driver.get("http://steamcommunity.com/id/Winning117/games/?tab=all");
// Executes JavaScript in the context of the currently selected frame or
// window.
ArrayList<Map> list = (ArrayList<Map>) js.executeScript("return rgGames;");
// Map represent properties for one game
for (Map map : list) {
// compare the value stored under the key "name" to
// Cluckles' Adventure
if ("Cluckles' Adventure".equals(map.get("name"))) {
// print all properties for game Cluckles' Adventure
map.forEach((key1, value) -> {
System.out.println(key1 + " : " + value);
});
}
}
}
}
As you can see, Selenium loads the page with
driver.get("http://steamcommunity.com/id/Winning117/games/?tab=all");
And to get the data for all of Winning117's games, the script returns the rgGames variable:
ArrayList<Map> list = (ArrayList<Map>) js.executeScript("return rgGames;");
The page you want to scrape is rendered by JavaScript, so the HTML that jsoup receives contains no #game_605250 element. All the data is written into the page by JavaScript.
But when I print the document to a file, I see some data like this:
<script language="javascript">
var rgGames = [{"appid":224260,"name":"No More Room in Hell","logo":"http:\/\/cdn.steamstatic.com.8686c.com\/steamcommunity\/public\/images\/apps\/224260\/670e9aba35dc53a6eb2bc686d302d357a4939489.jpg","friendlyURL":224260,"availStatLinks":{"achievements":true,"global_achievements":true,"stats":false,"leaderboards":false,"global_leaderboards":false},"hours_forever":"515","last_played":1492042097},{"appid":241540,"name":"State of Decay","logo":"http:\/\/....
Then you can extract 'rgGames' with some string tools and parse it as a JSON object.
It's not a clever method, but it worked.
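A rough sketch of the string-tools approach, using only the JDK: find the "var rgGames" assignment, cut out the bracketed array, then look inside the object for one game. The class and method names are made up for illustration, and the sample entry for Cluckles' Adventure (including its hours value) is invented test data, not taken from the real page. A proper JSON library would be the safer choice for anything beyond a quick check.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RgGamesExtractor {

    private static final Pattern HOURS = Pattern.compile("\"hours_forever\":\"([^\"]*)\"");

    /** Index of the bracket closing the '[' or '{' at position open, or -1. Skips string contents. */
    static int matchingClose(String text, int open) {
        char openCh = text.charAt(open);
        char closeCh = (openCh == '[') ? ']' : '}';
        int depth = 0;
        boolean inString = false;
        for (int i = open; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                 // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == openCh) {
                depth++;
            } else if (c == closeCh && --depth == 0) {
                return i;
            }
        }
        return -1;
    }

    /** Returns the array text assigned to "var rgGames", or null if it is not in the HTML. */
    static String extractRgGames(String html) {
        int marker = html.indexOf("var rgGames");
        if (marker < 0) return null;
        int start = html.indexOf('[', marker);
        if (start < 0) return null;
        int end = matchingClose(html, start);
        return end < 0 ? null : html.substring(start, end + 1);
    }

    /** Returns hours_forever for the named game, or null if the game or the field is missing. */
    static String hoursForever(String rgGamesJson, String gameName) {
        String needle = "\"name\":\"" + gameName + "\"";
        int pos = rgGamesJson.indexOf('{');
        while (pos >= 0) {
            int end = matchingClose(rgGamesJson, pos);
            if (end < 0) break;
            String game = rgGamesJson.substring(pos, end + 1); // one top-level game object
            if (game.contains(needle)) {
                Matcher m = HOURS.matcher(game);
                return m.find() ? m.group(1) : null;
            }
            pos = rgGamesJson.indexOf('{', end + 1);
        }
        return null;
    }

    public static void main(String[] args) {
        // Shortened sample; the second entry is invented for illustration.
        String html = "<script language=\"javascript\">\nvar rgGames = ["
                + "{\"appid\":224260,\"name\":\"No More Room in Hell\","
                + "\"availStatLinks\":{\"achievements\":true},\"hours_forever\":\"515\"},"
                + "{\"appid\":605250,\"name\":\"Cluckles' Adventure\",\"hours_forever\":\"0.2\"}];\n</script>";
        String json = extractRgGames(html);
        System.out.println(hoursForever(json, "Cluckles' Adventure"));
    }
}
```

With jsoup, the input string could simply be document.html(). Note that the name lookup is a plain substring match, so names containing characters that JSON escapes (quotes, backslashes, non-ASCII as \u sequences) would need real JSON parsing.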
Try this:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;
public class TestScrape {
public static void main(String[] args) throws Exception {
String url = "http://steamcommunity.com/id/Winning117/games/?tab=all";
Document document = Jsoup.connect(url).get();
// select() returns Elements (a list of matches), not a single Element
Elements playTime = document.select("div#game_605250");
Elements val = playTime.select(".hours_played");
System.out.println(val.text());
}
}
I was trying to run Sikuli WebDriver based tests on Sauce On Demand infrastructure.
But I have a problem with RemoteWebDriver.
I have this BaseSikuliWebDriverTest class:
package com.pitito.sikuli.base;
import java.lang.reflect.Method;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;
import org.openqa.selenium.Platform;
import org.openqa.selenium.remote.Augmenter;
import org.openqa.selenium.remote.CapabilityType;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.testng.ITestResult;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;
import com.pitito.core.basetests.BaseLoggingTest;
import com.pitito.selenium.webdriver.RemoteWebDriverSession;
import com.pitito.selenium.webdriver.WebDriverScreenshooter;
import com.pitito.sikuli.webdriver.SikuliFirefoxDriver;
/**
* Base class for all Sikuli WebDriver tests.
*
* @author guillem.hernandez
*/
public abstract class BaseSikuliWebDriverTest extends BaseLoggingTest {
Map<String, Object> sauceJob = new HashMap<String, Object>();
private static SikuliFirefoxDriver sikuliDriver;
protected SikuliFirefoxDriver driver() {
return getDriver();
}
public static SikuliFirefoxDriver getDriver() {
return sikuliDriver;
}
public static void setDriver(SikuliFirefoxDriver driver) {
BaseSikuliWebDriverTest.sikuliDriver = driver;
}
@Override
@BeforeMethod(alwaysRun = true)
protected void setup(Method method, Object[] testArguments) {
super.setup(method, testArguments);
String sessionId = method.getName() + "_" + testArguments.hashCode();
DesiredCapabilities caps = DesiredCapabilities.firefox();
caps.setCapability("id", sessionId);
caps.setCapability("name", sessionId);
caps.setCapability(CapabilityType.BROWSER_NAME, "firefox");
caps.setCapability("platform", Platform.XP);
caps.setCapability("version", "21");
try {
sikuliDriver = (SikuliFirefoxDriver) new Augmenter().augment(new RemoteWebDriver(new URL("http://"
+ RemoteWebDriverSession.USER + ":" + RemoteWebDriverSession.APIKEY
+ "@ondemand.saucelabs.com:80/wd/hub"), caps));
} catch (MalformedURLException e) {
e.printStackTrace();
}
setDriver(sikuliDriver);
}
@Override
@AfterMethod(alwaysRun = true)
protected void teardown(ITestResult tr, Method method) {
if ((logger() != null) && (tr.getStatus() == ITestResult.FAILURE)) {
logUnexpectedException(tr.getThrowable());
}
super.teardown(tr, method);
sikuliDriver.quit();
}
@Override
protected void logScreenshot(String screenshotName) {
logResource(new WebDriverScreenshooter(driver(), screenshotName).getScreenshot());
}
}
The test I implemented is the Sikuli WebDriver example and the code is as follows:
package com.pitito.sikuli.tests;
import java.net.MalformedURLException;
import java.net.URL;
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebElement;
import org.testng.annotations.Test;
import com.pitito.sikuli.base.BaseSikuliWebDriverTest;
import com.pitito.sikuli.webdriver.ImageElement;
/**
* Sikuli Firefox WebDriver Automated Test Example.
*
* @author guillem.hernandez
*/
public class SikuliGoogleCodeTest extends BaseSikuliWebDriverTest {
@Test(groups = { "ES" }, description = "Use Sikuli to search on Google Maps")
public void testSikuliWebDriverPassingExample_ES() {
verifySikuliWebDriverPassingTest();
}
private void verifySikuliWebDriverPassingTest() {
// visit Google Map
driver().get("https://maps.google.com/");
// enter "Denver, CO" as search terms
WebElement input = driver().findElement(By.id("gbqfq"));
input.sendKeys("Denver, CO");
input.sendKeys(Keys.ENTER);
ImageElement image;
// find and click on the image of the lakewood area
try {
image = driver().findImageElement(new URL("https://dl.dropbox.com/u/5104407/lakewood.png"));
image.doubleClick();
// find and click on the image of the kendrick lake area
image =
driver().findImageElement(new URL("https://dl.dropbox.com/u/5104407/kendrick_lake.png"));
image.doubleClick();
// find and click the Satellite icon to switch to the satellite view
image = driver().findImageElement(new URL("https://dl.dropbox.com/u/5104407/satellite.png"));
image.click();
// find and click the plus button to zoom in
image = driver().findImageElement(new URL("https://dl.dropbox.com/u/5104407/plus.png"));
image.click();
// find and click the link button
WebElement linkButton = driver().findElement(By.id("link"));
linkButton.click();
} catch (MalformedURLException e) {
e.printStackTrace();
}
}
}
When I try to run the test, the error I get is this one:
[Invoker 18958118] Invoking @BeforeMethod BaseSikuliWebDriverTest.setup(java.lang.reflect.Method, [Ljava.lang.Object;)[pri:0, instance:com.pitito.sikuli.tests.SikuliGoogleCodeTest@137008a]
Failed to invoke configuration method com.pitito.sikuli.base.BaseSikuliWebDriverTest.setup:org.openqa.selenium.remote.RemoteWebDriver$$EnhancerByCGLIB$$52a1cf6f cannot be cast to com.pitito.sikuli.webdriver.SikuliFirefoxDriver
The problem resides here:
sikuliDriver = (SikuliFirefoxDriver) new Augmenter().augment(new RemoteWebDriver(new URL("http://"
+ RemoteWebDriverSession.USER + ":" + RemoteWebDriverSession.APIKEY
+ "@ondemand.saucelabs.com:80/wd/hub"), caps));
How can I use SikuliFirefoxDriver remotely? How can I cast RemoteWebDriver with SikuliFirefoxDriver? Can I do it?
As far as I know, the Selenium Grid server cannot pass Sikuli commands (and binary screenshots for comparison purposes) through its JSON API. Not even SauceLabs has this capability. Hopefully, it's on the radar to be implemented someday. Someone asked this question on the SauceLabs forum too (and I gave the same answer there).
I know that there is a project in progress called Marionette that is supposed to be able to automate browser/Firefox menus and native dialogs.
I implemented a remote driver version of sikuli driver. You can use that to do this action.
Please feel free to fork:
https://github.com/AJ-72/SikuliRemoteWebdriver
My guess is that SikuliFirefoxDriver cannot be augmented because it was not invoked as a RemoteWebDriver. Try invoking it as a RemoteWebDriver with Sikuli in the desired capabilities.
Please post here if it works (I could not find proof that it is possible, but it is still worth a shot).
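For what it's worth, the ClassCastException in the question can be reproduced without Selenium. The sketch below uses hypothetical stand-in classes (none of them are real Selenium or Sikuli types) and assumes that SikuliFirefoxDriver is a subclass of RemoteWebDriver, while the object returned by Augmenter is a different, generated subclass of RemoteWebDriver, as the "RemoteWebDriver$$EnhancerByCGLIB" in the error message suggests. Two sibling subclasses cannot be cast to each other.

```java
class CastDemo {

    // Hypothetical stand-ins; none of these are real Selenium or Sikuli classes.
    static class RemoteDriver { }

    /** Plays the role of SikuliFirefoxDriver: a subclass that adds image lookup. */
    static class SikuliDriver extends RemoteDriver {
        String findImage() {
            return "image found";
        }
    }

    /** Plays the role of the generated RemoteWebDriver$$EnhancerByCGLIB class. */
    static class AugmentedRemoteDriver extends RemoteDriver { }

    /** Returns true if a blind downcast to SikuliDriver throws ClassCastException. */
    static boolean castFails(RemoteDriver driver) {
        try {
            SikuliDriver sikuli = (SikuliDriver) driver;
            return sikuli == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    /** Checks the runtime type first instead of casting blindly. */
    static String tryFindImage(RemoteDriver driver) {
        if (driver instanceof SikuliDriver) {
            return ((SikuliDriver) driver).findImage();
        }
        return "not a SikuliDriver: " + driver.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        RemoteDriver augmented = new AugmentedRemoteDriver();
        System.out.println(castFails(augmented));
        System.out.println(tryFindImage(augmented));
        System.out.println(tryFindImage(new SikuliDriver()));
    }
}
```

So no cast can turn the augmented remote driver into a SikuliFirefoxDriver; the Sikuli behaviour would have to be provided by a driver class that is itself built for remote use.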