Selenium WebDriver findElements() Fails on Single Quotes - java

My goal is to parse a block of HTML code like below to obtain the text, comments and replies fields as separate parts of the block:
<div id='fooID' class='foo'>
<p>
This is the top caption of picture's description</p>
<p>
T=<img src="http://www.mysite.com/images/img23.jpg" alt="" width="64" height="108"/> </p>
<p>
And here is more text to describe the photo.</p>
<div class=comments>(3 comments)</div>
<div id='reply13' class='replies'>
<a href=javascript:getReply('13',1)>Show reply </a></div>
</div>
My problem is that Selenium's WebDriver does not seem to support single-quoted attribute values in the HTML (notice that the class attribute in the HTML is written 'foo' as opposed to "foo"). In all the examples I have seen in both the Selenium docs and other SO posts, the latter format is what WebDriver commonly expects.
Here is the relevant part of my Java code with my various (unsuccessful) attempts:
java.util.List<WebElement> elementList = driver.findElements(By.xpath("//div[@class='foo']"));
java.util.List<WebElement> elementList = (List<WebElement>) ((JavascriptExecutor)driver).executeScript("return $('.foo')[0]");
java.util.List<WebElement> elementList = driver.findElements(By.xpath("//div[contains(@class, 'foo')]"));
java.util.List<WebElement> elementList = driver.findElements(By.cssSelector("div." + foo_tag)); // where foo_tag = "'foo'".replace("'", "\'");
java.util.List<WebElement> elementList = driver.findElements(By.cssSelector("'foo'"));
Is there a sure way of handling this? Or is there an alternative, better way of extracting the above fields?
Other info:
I'm an HTML noob, but have made efforts to understand the structure of the HTML code/tags
Using Firefox (and, accordingly, FirefoxDriver)
Your help/suggestions greatly appreciated!

It's invalid HTML, so Selenium won't have a chance. You should fix it.
You will have a better chance with HTMLAgilityPack:
http://htmlagilitypack.codeplex.com/
It copes a little better with badly formed HTML, which this is.
Below is an SO post that lists a few options for different languages, with tools like HTMLAgilityPack. You should find a suitable one:
Options for HTML scraping?

The problem is that the HTML specification doesn't know single quotes, as far as I know. Therefore you don't have a problem with the Selenium WebDriver; the problem is the HTML.
Do you have the chance to edit the html code?
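For what it's worth, the quote style of an attribute is normally gone by the time markup has been parsed into a tree, so an XPath such as //div[@class='foo'] should not care whether the source said class='foo' or class="foo". Below is a minimal sketch of that idea. It uses the JDK's built-in XML parser and XPath engine rather than Selenium, and a simplified, well-formed fragment rather than the page above, so it only illustrates the principle. The names QuoteStyleDemo and countMatches are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class QuoteStyleDemo {
    // Counts the nodes that an XPath expression selects in a well-formed markup string.
    static int countMatches(String markup, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(markup)), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Simplified fragments (an assumption): same element, different attribute quoting.
        String singleQuoted = "<div id='fooID' class='foo'><p>caption</p></div>";
        String doubleQuoted = "<div id=\"fooID\" class=\"foo\"><p>caption</p></div>";
        String expression = "//div[@class='foo']";

        // Both calls report one match, because the parser discards the quote style.
        System.out.println(countMatches(singleQuoted, expression));
        System.out.println(countMatches(doubleQuoted, expression));
    }
}
```

If the same locator finds nothing in the browser, it may be worth checking other causes, such as the element being loaded later or sitting inside a frame.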

Related

How to locate ::before element in Selenium and get its contents?

screenshot.png
Hi All,
I have a situation where a ::before pseudo-element in the HTML renders the asterisk (mandatory field marker) on the page. Please see attached screenshot.
HTML code:
<lightning-input-field class="customRequired abc">
::before
<lightning-picklist>
</lightning-picklist>
</lightning-input-field>
How to write xpath for ::before?
I think there is no straightforward solution with XPath, since pseudo-elements are not part of the DOM. Use JavaScript instead; querySelector takes CSS selectors:
# Match on the class name alone; [class="..."] would require the full attribute value "customRequired abc".
css_selector = 'lightning-input-field.customRequired'
browser.execute_script("return window.getComputedStyle(document.querySelector('{}'), '::before').getPropertyValue('content')".format(css_selector))

How to check href of an element found using xpath contains text()

I'm trying to find an element by the text it contains, then check that that element also has a link to a particular place. I'm using selenium/java.
I'm trying to find elements by text when I can to minimise how many changes I will need to make if the UI is updated (reduce test maintenance costs).
I've tried the following, but the assert fails as the getAttribute ends up being null.
WebElement newsHeadlineTemplate = driver.findElement(By.xpath("//*[contains(text(), 'News Headline')]"));
Assert.assertEquals("Template not clickable", "/news/create/new", newsHeadlineTemplate.getAttribute("href"));
HTML for element I'm trying to find/use:
<div class="columns">
<div class="column is-one-third">
<p>News Headline</p>
</div>
</div>
I'm still fairly new to selenium so any help is very much appreciated.
Your XPath selector is slightly off: it matches the <p> tag, but you need to match the <a> tag, which is the following sibling of the <p> tag.
So you need to amend your expression to look like:
//p[text()='News Headline']/following-sibling::a
More information:
XPath Tutorial
XPath Axes
XPath Operators & Functions
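A small sketch of the following-sibling axis, run with the JDK's built-in XPath engine instead of Selenium. The <a> element and its link text are assumptions, since the posted HTML does not show the link; SiblingLinkDemo and evaluate are illustrative names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class SiblingLinkDemo {
    // Hypothetical markup: the <a> after the <p> is assumed, it is not in the posted HTML.
    static final String MARKUP =
            "<div class=\"columns\"><div class=\"column is-one-third\">"
            + "<p>News Headline</p><a href=\"/news/create/new\">Create</a>"
            + "</div></div>";

    // Evaluates the expression against MARKUP and returns the string value of the result.
    static String evaluate(String expression) throws XPathExpressionException {
        return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(MARKUP)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        // The text match selects the <p>, which has no href, so the string value is empty.
        System.out.println("[" + evaluate("//*[contains(text(), 'News Headline')]/@href") + "]");
        // Stepping from the <p> to its sibling <a> reaches the element that carries the href.
        System.out.println(evaluate("//p[text()='News Headline']/following-sibling::a/@href"));
    }
}
```

In the Selenium test, the same expression without the trailing /@href would be passed to By.xpath, and getAttribute("href") would then be called on the returned <a> element.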

@FindAll Is Kind of Slow When Trying to Locate Multiple Elements

I have a page that is localized, and the "Create Account" WebElement can be in English, Chinese or Japanese. I am using Selenium, Java and the TestNG framework to run a test that clicks on this element. However, the slow performance when using @FindAll to locate it makes me wonder if there is a better way to do this.
The element from Inspect element while "English" locale is selected:
<div class="form-group">
<a translate="create-account" class="pointer ng-scope" ng-click="vm.createAccount()">Create Account</a>
</div>
My FindAll declaration:
@FindAll({
@FindBy(linkText="Create Account"),
@FindBy(linkText="创建账号"),
@FindBy(linkText="アカウントを作成")
})
private List<WebElement> createAccount;
As a baseline to compare, if I use the @FindAll above, it takes about 15 seconds before WebDriver clicks on the link. If I use just @FindBy, it takes about 2-3 seconds. However, @FindBy does not work for me as I need to be able to locate the correct locale to click on the link.
You could use a single css selector like:
a[ng-click*='createAccount']
Or one of the xpaths:
//a[contains(@ng-click, 'createAccount')]
//a[contains(text(), 'Create Account') or contains(text(), '创建账号') or contains(text(), 'アカウントを作成')]
For CSS, if you pass only part of the attribute value, the selector should be [attributeName*='part_of_attribute_value']
For a basic list of CSS selector rules, see: w3schools css selectors
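To illustrate why an attribute-based locator avoids the three-way lookup, here is a rough sketch that evaluates the attribute XPaths against the link from the question with different labels. It runs on the JDK's built-in XPath engine, not on Selenium, so it says nothing about timing; it only shows that the match does not depend on the visible text. LocaleIndependentLocatorDemo, markupWithLabel and countMatches are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocaleIndependentLocatorDemo {
    // Builds the link from the question with an arbitrary (localized) label.
    static String markupWithLabel(String label) {
        return "<div class=\"form-group\"><a translate=\"create-account\" "
                + "class=\"pointer ng-scope\" ng-click=\"vm.createAccount()\">"
                + label + "</a></div>";
    }

    // Counts the nodes that an XPath expression selects in the given markup.
    static int countMatches(String markup, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(markup)), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws XPathExpressionException {
        String[] labels = {"Create Account", "创建账号", "アカウントを作成"};
        for (String label : labels) {
            String markup = markupWithLabel(label);
            // Both locators look only at attributes, so the label does not matter.
            System.out.println(countMatches(markup, "//a[contains(@ng-click, 'createAccount')]")
                    + " " + countMatches(markup, "//a[@translate='create-account']"));
        }
    }
}
```

With Selenium, a single @FindBy using one of these attribute locators would replace the three linkText lookups.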
Thanks @lauda for helping out and for the link to the w3schools CSS selectors.
I actually found two more ways that I can identify this link using css:
@FindBy(css="a[translate='create-account']")
private WebElement CreateAccount;
and
@FindBy(css="a.pointer.ng-scope")
private WebElement CreateAccount;
However, I'm not sure why the original solution that lauda posted did not work for me:
a[ng-click*='createAccount']

How to convert selenium webdriver xpath value into css

The current code has long HTML Xpath values that need to be converted & shortened to a css value:
driver.findElement(By.xpath("html/body/div[1]/div/div[2]/div/form/div[3]/div[2]/button")).click();
You could probably choose a better XPath expression than the one above. Without seeing the actual HTML, it looks like you have written down the complete (absolute) XPath. Is it possible to make it shorter and more robust?
Consider the following example:
<html itemscope itemtype="http://schema.org/QAPage">
<body class = "question-page new-topbar">
<div id="notify-container"></div>
<div id="topbar-wrapper"></div>
<button id="button1"></button>
</body>
</html>
You want to click on button1. You can find it using the complete XPath:
driver.findElement(By.xpath("/html/body/button")).click();
or you can find it using XPath with one of the element's attributes, in this case its id:
driver.findElement(By.xpath("//button[@id='button1']")).click();
or, as you wanted, you can use a CSS selector:
driver.findElement(By.cssSelector("button[id='button1']")).click();
If you want us to help you with converting your xpath into css selector, you will need to copy and paste your html code in your question as well. Without looking at the actual code, we can not be 100% sure.
You may find the following link useful when trying to convert xpath into css selector.
https://www.simple-talk.com/dotnet/.net-framework/xpath,-css,-dom-and-selenium-the-rosetta-stone/

How do I get this text using Jsoup?

How do I get "this text" from the following HTML code using Jsoup?
<h2 class="link title"><a href="myhref.html">this text<img width=10
height=10 src="img.jpg" /><span class="blah">
<span>Other texts</span><span class="sometime">00:00</span></span>
</a></h2>
When I try
String s = document.select("h2.title").select("a[href]").first().text();
it returns
this textOther texts00:00
I tried to read the API for Selector in Jsoup but could not figure out much.
Also, how do I get an element with class="link title blah" (multiple classes)? Forgive me, I only know a little of both Jsoup and CSS.
Use Element#ownText() instead of Element#text().
String s = document.select("h2.link.title a[href]").first().ownText();
Note that you can select elements with multiple classes by concatenating the class selectors, as in h2.link.title, which selects <h2> elements that have at least both the link and title classes.
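To make the difference between the two calls concrete, here is a rough sketch of the idea behind "own text" versus "all text". It does not use Jsoup: it uses the JDK's DOM API on a well-formed version of the markup, and the ownText helper below is a hand-written approximation, not Jsoup's implementation. OwnTextDemo and firstLink are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class OwnTextDemo {
    // Parses well-formed markup and returns its first <a> element.
    static Element firstLink(String markup) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(markup)));
        return (Element) doc.getElementsByTagName("a").item(0);
    }

    // Concatenates only the text nodes that are direct children of the element.
    static String ownText(Element element) {
        StringBuilder text = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // The markup from the question, with attribute values quoted so it parses as XML.
        String markup = "<h2 class=\"link title\"><a href=\"myhref.html\">this text"
                + "<img width=\"10\" height=\"10\" src=\"img.jpg\"/>"
                + "<span class=\"blah\"><span>Other texts</span>"
                + "<span class=\"sometime\">00:00</span></span></a></h2>";
        Element link = firstLink(markup);
        System.out.println(link.getTextContent()); // text of the element and all descendants
        System.out.println(ownText(link));         // text of the element's direct text nodes only
    }
}
```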
