<span class="label label-danger" style="font-size : 13px; font-weight : 400;">Critical</span>
Below is the xpath which I am using:
.//tr[@data-index='0']/td/span
I have a line in the HTML source like the one above. I used the corresponding XPath and the getText() method to get the text, i.e. Critical, and that works.
But I have another line on another page like this:
<div class="col-xs-12">
<div id="project-update-success-information" class="panel-confirmation success" style="display: none;">
<span class="fa fa-check"/>
Project Updated
</div>
Below is the XPath I am using:
.//*[@id='project-update-success-information']/span
I used the corresponding XPath and getText(), but unfortunately it doesn't retrieve the text. I suspect that the missing </span> closing tag in the second snippet causes the problem. Is there any other way to get the text?
This question has many answers already, but none of them really explains the problem. First, let us get your initial confusion about self-closing elements out of the way, before moving on to the real problem: No, it is not a problem that an element like
<span class="fa fa-check"/>
does not have a </span> tag. There is no need to indicate where it ends because the /> already tells you that this element does not contain anything and closes at this point.
Then let's look at only the fragment of the document that you show:
<div class="col-xs-12">
<div id="project-update-success-information" class="panel-confirmation success" style="display: none;">
<span class="fa fa-check"/>
Project Updated
</div>
</div>
An XPath expression like (note that most likely you do not need the . at the very beginning of the expression):
//*[@id='project-update-success-information']
will return the inner div element with all that it contains. What it does contain is, exactly in this order:
a whitespace-only text node
a self-closing span element with no content other than an attribute
the text node that contains "Project Updated"
So, it is not at all surprising that when you select the inner div and use .getText(), you end up with 2 text nodes in the result. Another way to get at the text content of an element is by using text() in the XPath expression:
//*[@id='project-update-success-information']/text()
which will return (individual elements separated by --------):
[whitespace-only text node]
-----------------------
Project Updated
The solutions are either
use getText() to retrieve all text nodes and later exclude those that only contain whitespace or
use an XPath expression that targets text nodes directly and excludes the ones that only contain whitespace. The standard way of doing this is with [normalize-space()]:
//*[@id='project-update-success-information']/text()[normalize-space()]
Note that, in general, there is no guarantee that the text content of an element will be in one single text node. It is very likely that you will sometimes encounter HTML or XML where elements have several text nodes, all of them containing non-whitespace characters, e.g.:
<div>
Project
<span/>
Updated
</div>
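The behaviour described above can be checked without a browser. The following is a minimal sketch, assuming the fragment is parsed as well-formed XML with the JDK's built-in javax.xml.xpath API rather than through Selenium; the class name TextNodeDemo and the helper select are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextNodeDemo {
    // The fragment from the question, written as well-formed XML.
    static final String FRAGMENT =
        "<div class=\"col-xs-12\">\n"
      + "  <div id=\"project-update-success-information\""
      + " class=\"panel-confirmation success\" style=\"display: none;\">\n"
      + "    <span class=\"fa fa-check\"/>\n"
      + "    Project Updated\n"
      + "  </div>\n"
      + "</div>";

    // Evaluates an XPath expression and returns the trimmed text of every matched node.
    static List<String> select(String xml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
            .newXPath()
            .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String base = "//*[@id='project-update-success-information']";
        // All text node children: the whitespace-only node and "Project Updated".
        System.out.println(select(FRAGMENT, base + "/text()").size());
        // Only text nodes that contain something other than whitespace.
        System.out.println(select(FRAGMENT, base + "/text()[normalize-space()]"));
        // The self-closing span has no text node children at all.
        System.out.println(select(FRAGMENT, base + "/span/text()").size());
    }
}
```

A browser's HTML parser may build a slightly different tree than an XML parser does, so treat this as an illustration of the XPath expressions, not of Selenium's getText().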
Try this text() method like below:-
//span[@class='fa fa-check']/text()
Hope it will help you :)
The element is empty and thus contains no text
<span class="fa fa-check"/>
If on the other hand it was like
<span class="fa fa-check">Some content</span>
then it would, as in your first attempt, contain some text.
Without knowing more of the content I would try another xpath method: following-sibling.
Try:
driver.findElement(By.cssSelector(".panel-confirmation.success")).getText();
I have the following HTML:
<div class="a-row a-spacing-small a-size-small">
<div class="a-row">
<a class="a-link-normal a-declarative g-visible-js reviewStarsPopoverLink" href="#" data-action="a-popover" data-a-popover="{"closeButton":"false","url":"/gp/customer-reviews/widgets/average-customer-review/popover/ref=wl_it_o_cm_cr_acr_img_hz?ie=UTF8&a=B05555JQP&contextId=wishi&link=1&seeall=1","name":"review-hist-pop.B075555RJQP","max-width":"700","position":"triggerBottom","data":{"itemId":"I2555555554GT","isGridViewInnerPopover":""},"header":"","cache":"true"}">
<i id="review_stars_I2J55555554GT" class="a-icon a-icon-star a-star-4-5">
<span class="a-icon-alt">4.5 out of 5 stars</span>
</i>
<i class="a-icon a-icon-popover"/>
</a>
<a class="a-link-normal g-visible-no-js" href="/product-reviews/B075555JQP/ref=wl_it_o_cm_cr_acr_txt_hz?ie=UTF8&colid=2K4U5555551D&coliid=I2J5555555T&showViewpoints=1">
<span class="a-letter-space"/>
<a id="review_count_I2J55555555GT" class="a-link-normal" href="/product-reviews/B05555555P/ref=wl_it_o_cm_cr_acr_txt_hz?ie=UTF8&colid=255555555D&coliid=I2555555GT&showViewpoints=1">(68)</a>
</div>
<div class="a-row">
<div class="a-row a-size-small itemAvailability">
<div class="a-row itemUsedAndNew">
</div>
I'm trying to extract the value 4.5 out of 5 stars via one of the following XPath:
.//*[contains(@id,'review_stars')]/span[@class='a-icon-alt']
.//*[contains(@id,'review_stars')]
However, everything that I've tried so far has failed (it returns an empty String).
The funny thing is that all of these XPaths actually work in Firebug so I'm not sure why it isn't working in my program (I suspect it has something to do with the fact that the rating isn't actually visible in browser unless you hover over a specific element but I'm not sure if/why/how this would cause the above mentioned problem and how to fix it)
Thanks!
You failed to include the image between the anchor and span. The span is inside the image, not a sibling of the anchor.
try:
.//*[contains(@id,'review_stars')]/i/span[@class='a-icon-alt']
I will attempt to answer my own question although I do not entirely understand why my previous code isn't working. If someone could provide me with an in depth explanation I will accept their answer as the final answer.
For now this is what works for me:
Instead of calling element.getText(); call element.getAttribute("innerHTML");
This returns the correct result but I would like to understand why getText() does not work in this case. Again, if someone could provide an XPath that works or could provide explanation to all this I will accept it as the final answer.
Thanks
To extract the value 4.5 out of 5 stars through XPath you can use :
//a[@class='a-link-normal a-declarative g-visible-js reviewStarsPopoverLink']/i[starts-with(@id,'review_stars_') and @class='a-icon a-icon-star a-star-4-5']/span[@class='a-icon-alt']
Update :
You mentioned "This does not work either. I just tried it." You must have missed part of the XPath I provided; my answer is a tested one.
Note: Although your question was about XPath, your own answer is about the getText() method versus getAttribute("innerHTML"). However, my answer works with both getText() and getAttribute("innerHTML").
I have a structure like this:
<div>
<div>
<span class="">TextA</span>
</div>
<div>
<span class="">TextB</span>
</div>
</div>
I can find the span element with TextA, and from there I want to find the span with TextB and click on it (it cannot be located on its own). So I used an XPath like this:
webDriver.findElement(By.xpath("//span[contains(.,'TextA')]/following-sibling::span")).click();
I got an "Element is not found" exception. I assumed these spans are siblings. Can anyone help me with this? Thanks
I got exception Element is not found. I assumed these spans are siblings ?
These two spans are not siblings - they are children of different elements and cannot be siblings.
Instead of following-sibling, you may use the following axis:
//span[contains(.,'TextA')]/following::span
Or, you may get the div element that contains the span with the TextA text and then get its following sibling:
//div[contains(span, 'TextA')]/following-sibling::div/span
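To compare the axes side by side, here is a minimal sketch that evaluates the expressions against the structure from the question, assuming it is parsed as XML with the JDK's javax.xml.xpath API instead of being run through WebDriver. The class name SiblingAxisDemo and the helper select are illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SiblingAxisDemo {
    // The structure from the question: each span sits in its own div.
    static final String XML =
        "<div>"
      + "<div><span class=\"\">TextA</span></div>"
      + "<div><span class=\"\">TextB</span></div>"
      + "</div>";

    // Returns the text content of every node matched by the XPath expression.
    static List<String> select(String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
            .newXPath()
            .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // No match: the TextA span has no sibling span under the same parent.
        System.out.println(select("//span[contains(.,'TextA')]/following-sibling::span"));
        // The following axis ignores parent boundaries and finds the TextB span.
        System.out.println(select("//span[contains(.,'TextA')]/following::span"));
        // Step from the enclosing div to its sibling div, then down to the span.
        System.out.println(select("//div[contains(span, 'TextA')]/following-sibling::div/span"));
    }
}
```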
The <span>s aren't siblings. To be siblings, they would have to share the same parent.
for example:
<div>
<div>
<span class="">TextA</span>
<span class="">TextB</span>
</div>
</div>
In this case, they are siblings and your selector would work.
The elements that are actually siblings, are the <div>s.
The xpath that could work for you, would be:
//span[contains(.,'TextA')]/../following-sibling::div/span
I want to select some supermarket product info from this page:
http://www.angeloni.com.br/super/index?grupo=15022
For that I should select <ul> tags with class "lstProd ":
If the class name were "lstProd" it would be easy, but the problem is the whitespace at the end of name. I couldn't make Jsoup deal with it.
I tried the code below and other approaches, but I always get an empty list.
org.jsoup.nodes.Document document = Jsoup.connect("http://www.angeloni.com.br/super/index?grupo=15022").get();
org.jsoup.select.Elements list = document.select("ul.lstProd ");
The snippet from the HTML page that I want to get:
<ul class="lstProd ">
<li>
<span class="cod">CÓD. 1341372</span>
<span class="lnkImgProd">
<a href="/super/produto?grupo=15022&idProduto=1341372">
<img src="http://assets.angeloni.com.br/files/images/7/1B/C6/1341372_1_V.jpg" width="120" height="120"
alt="Creme Dental SORRISO Super Refrescante Tubo 90g">
</a>
</span>
<div class="RgtDetProd">
<div class="boxInfoProd">
<span class="descr">
<a href="/super/produto?grupo=15022&idProduto=1341372">Creme Dental SORRISO Super Refrescante
Tubo 90g</a>
</span>
<ul class="lstProdFlags after">
</ul>
</div>
...
I think you are facing two completely separate problems:
Jsoup does not load the site you think it loads. The website you specified renders its contents via JavaScript and loads some content through AJAX after the initial page load. Jsoup can't deal with this. You either need to investigate the AJAX calls and fetch them directly with Jsoup, or use something like Selenium WebDriver to load the page in a real browser, which will render everything as you expect.
CSS class names can't contain spaces for practical purposes [1]. In HTML, spaces are used as separators between class names. Hence <ul class="lstProd "> is the same as <ul class="lstProd">. In CSS selectors, however, a class name is specified by .className, i.e. a dot followed by the class name. You can concatenate several classes like this: element.select(".className1.className2")
[1] Technically you can put spaces in CSS classes, but you need to escape them with '\ '. See https://mathiasbynens.be/notes/css-escapes or Which characters are valid in CSS class names/selectors?
edit: be more precise about CSS class names
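The point about spaces acting as separators can be sketched in plain Java without Jsoup. This is only an illustration of the splitting rule, assuming ASCII whitespace; the class ClassTokenDemo and its two helper methods are hypothetical and are not part of Jsoup or Selenium.

```java
import java.util.Arrays;
import java.util.List;

class ClassTokenDemo {
    // Splits a class attribute value into individual class names on whitespace.
    static List<String> classNames(String classAttribute) {
        String trimmed = classAttribute.trim();
        if (trimmed.isEmpty()) {
            return List.of();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Builds a CSS selector that requires every class name found in the attribute.
    static String selectorFor(String tag, String classAttribute) {
        List<String> names = classNames(classAttribute);
        return names.isEmpty() ? tag : tag + "." + String.join(".", names);
    }

    public static void main(String[] args) {
        // The trailing space does not become part of the class name.
        System.out.println(classNames("lstProd "));
        System.out.println(selectorFor("ul", "lstProd "));
        // Two class names become two chained class selectors.
        System.out.println(selectorFor("ul", "lstProdFlags after"));
    }
}
```

Under this reading, the selector to try for the list in the question would be ul.lstProd, without the trailing space.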
CSS class names CAN contain whitespaces.
And <ul class="lstProd "> is NOT same as <ul class="lstProd">.
And I can see that you have multiple <ul> with same class name.
The better way to inspect or traverse such element is by nth-child
So to find your required selector you can use #abaProd > ul:nth-child(4)
For more details about nth-child
Question is for JAVA + Selenium:
My HTML is:
<section class="d-menu d-outclass-bootstrap unclickable d-apps d-app-list">
<section class="standard-component image-sequence-button" tabindex="0" role="link">
<div class="image-region">
<div class="core-component image">...
</div>
<div class="sequence-region">
<div class="core-component section">
<div>
<section class="standard-component text hide-section-separator-line">
<div class="text-region">
<div class="core-component text">
<span class="main-text">BART Times</span>
<span class="sub-text">Provider</span>
</div>
</div>
</section>
<section class="standard-component speech-bubble hide-section-separator-line">...
<section class="standard-component text">...
</div>
</div>
</div>
<div class="button-region">
<div class="core-component button" tabindex="0" role="link">...
</div>
</section>
<section class="standard-component image-sequence-button" tabindex="0" role="link">...
<section class="standard-component image-sequence-button" tabindex="0" role="link">...
<section class="standard-component image-sequence-button" tabindex="0" role="link">...</section>
EDIT:
All <section class="standard-component image-sequence-button"... have exact same structure and hierarchy (same attributes for all tags). The only thing that changes are the TEXT values of the tags(e.g. span)
PART1:
I'm looking for various elements inside the second section tag. So, What I'm trying to do is get the <span class="main-text"> which has a value BART Times because of the business requirement.
I already know how to get it via xpath:
My XPath (verified via Firebug):
"//section//div[@class = 'sequence-region']//section[@class = 'standard-component text hide-section-separator-line']//span[@class = 'main-text' and text() = '%s']"
I can get the span tag via checking for %s values (e.g. BART Times).
However, due to design considerations, we've been told to use CSS only. So, I tried to come up with a CSS counterpart for the above xpath but did not find it.
The following CSS
"section div.sequence-region section.standard-component.text.hide-section-separator-line span[class=main-text]"
returns all the span tags under all the section tags.
Question1: How do I get the span tag which has a certain TEXT value (the %s part of xpath)?
Things I've tried for that last span tag which did not work (according to Firebug):
span.main-text[text='BART Times']
span[class=main-text][text='BART Times']
span.main-text:contains('BART Times')
span[class=main-text]:contains('BART Times')
span.main-text[text="BART Times"]
span[class=main-text][text="BART Times"]
span.main-text[text=\"BART Times\"]
span[class=main-text][text=\"BART Times\"]
span[text="BART Times"]
span[text=\"BART Times\"]
span:contains('BART Times')
span:contains("BART Times")
span:contains(\"BART Times\")
So, basically I want to put a check on BOTH class and TEXT value of the span tag in CSS selector.
Part 2:
Then I want to get the <section class="standard-component image-sequence-button"... element where I found the <span class="main-text"> and then find other elements inside that specific section tag
Question 2:
Assuming, I found the span tag in question 1 via CSS, how do I get the section tag (which is a super--- parent of the span tag)?
If CSS is not possible, please provide an xpath counterpart for this as a workaround for a while.
CSS selectors can't select based on text. The answers to Is there a CSS selector for elements containing certain text? go into detail on why.
To select based on class and text in XPath: //span[contains(@class, 'main-text') and text() = 'BART Times']
Regarding question 1, it is not possible, as stated in the other answer here. This is another thread about the topic : CSS selector based on element text?
Regarding question 2, once again there is no such parent selector in CSS: Is there a CSS parent selector?. For the XPath counterpart, you can use the parent axis (parent::*), the shortcut notation for the same (..), or put the span selector as a predicate on the parent (the third example below):
....//span[@class = 'main-text' and text() = '%s']/parent::*
....//span[@class = 'main-text' and text() = '%s']/..
....//*[span[@class = 'main-text' and text() = '%s']]
See the following thread for a better (yet more complicated) alternative for matching an element by CSS class using XPath, in case you haven't come across a link on this topic: How can I find an element by CSS class with XPath?
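The three parent forms above can be compared with a minimal sketch that uses the JDK's javax.xml.xpath API on a simplified copy of the markup. The second section ("Other App" / "Vendor") is hypothetical data added for illustration, and the ancestor:: step used to reach the enclosing section is an addition that the answer itself does not show. The names ParentAxisDemo, classOf and subTextFor are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ParentAxisDemo {
    // Simplified version of the markup in the question.
    static final String XML =
        "<section class=\"d-menu d-apps d-app-list\">"
      + "<section class=\"standard-component image-sequence-button\">"
      + "<div class=\"core-component text\">"
      + "<span class=\"main-text\">BART Times</span>"
      + "<span class=\"sub-text\">Provider</span>"
      + "</div></section>"
      // Hypothetical second entry with the same structure but different text.
      + "<section class=\"standard-component image-sequence-button\">"
      + "<div class=\"core-component text\">"
      + "<span class=\"main-text\">Other App</span>"
      + "<span class=\"sub-text\">Vendor</span>"
      + "</div></section>"
      + "</section>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
    }

    // Returns the class attribute of the first element matched by the expression.
    static String classOf(String xpath) throws Exception {
        Node node = (Node) XPathFactory.newInstance().newXPath()
            .evaluate(xpath, parse(), XPathConstants.NODE);
        return ((Element) node).getAttribute("class");
    }

    // Finds the section that encloses the given main text, then reads its sub-text.
    static String subTextFor(String mainText) throws Exception {
        XPath xp = XPathFactory.newInstance().newXPath();
        String section = "//span[@class='main-text' and text()='" + mainText + "']"
            + "/ancestor::section[contains(concat(' ', normalize-space(@class), ' '),"
            + " ' image-sequence-button ')]";
        Node sectionNode = (Node) xp.evaluate(section, parse(), XPathConstants.NODE);
        // A relative expression searches only inside the section that was found.
        return xp.evaluate(".//span[@class='sub-text']", sectionNode);
    }

    public static void main(String[] args) throws Exception {
        String span = "//span[@class='main-text' and text()='BART Times']";
        System.out.println(classOf(span + "/parent::*"));
        System.out.println(classOf(span + "/.."));
        System.out.println(classOf("//*[span[@class='main-text' and text()='BART Times']]"));
        System.out.println(subTextFor("BART Times"));
    }
}
```

With Selenium, the same idea would be to locate the section element first and then call findElement on that element with a relative locator.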
I am trying to locate a specific element on a page but cannot figure out the proper Xpath to use.
Here is the HTML (note that the location of each div can vary):
<div>
<label>First Name</label>
<span class="metadataField metadataFieldReadonly">
<input type="text" name="some-random-value" value="John">
</span>
</div>
<div>
<label>Last Name</label>
<span class="metadataField metadataFieldReadonly">
<input type="text" name="some-random-value" value="Smith">
</span>
</div>
So I am trying to locate the INPUT element matched by //div/span[@class='metadataField metadataFieldReadonly']/input that is in the same div as the label matched by //div/label[text()='Last Name'].
I can successfully locate the label with this (using JAVA):
driver.findElement(By.xpath("//div/label[text()='Last Name']")).click();
And I can successfully locate the first input under the first element (but I may not always want the first element) with this:
driver.findElement(By.xpath("//div/span[@class='metadataField metadataFieldReadonly']/input")).click();
So the problems are that (i) the name tag and value of the INPUT are always different so they cannot be used to pick the element, and (ii) the div with the last name label may not always be the second one, and (iii) the label and span are the same level (siblings) so I cannot figure out how to properly create the Xpath statement.
So in words, I need to find the input of the span in the same div that has a label with 'Last Name' in it.
So I need to know how to combine these two XPath statements into one complex statement (assuming they are in the same div and that the label and span are siblings):
//div/label[text()='Last Name']
//div/span[@class='metadataField metadataFieldReadonly']/input
Thanks
That's downright easy: //div[label[text()='Last Name']]/span[@class='metadataField metadataFieldReadonly']/input - literally, "the input contained in the span with the metadataField and metadataFieldReadonly classes contained in the div that contains the label with the text 'Last Name'". So you're using the label to locate the div and then building from it to the input you want.
To be more robust, you shouldn't count on the ordering of the class names - that isn't guaranteed by the specs. So something like this would be stronger: //div[label[text()='Last Name']]/span[contains(concat(' ', @class,' '),' metadataField ') and contains(concat(' ',@class,' '),' metadataFieldReadonly ')]/input.
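Both expressions can be tried on the markup from the question with a minimal sketch that uses the JDK's javax.xml.xpath API, assuming the HTML is rewritten as well-formed XML (the input elements are self-closed). The third field ("Nickname", with its class names in the opposite order) is hypothetical and is there only to show where the exact-match version stops working. The names LabelInputDemo, valueByExactClass and valueByClassTokens are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LabelInputDemo {
    static final String XML =
        "<form>"
      + "<div><label>First Name</label>"
      + "<span class=\"metadataField metadataFieldReadonly\">"
      + "<input type=\"text\" name=\"some-random-value\" value=\"John\"/></span></div>"
      + "<div><label>Last Name</label>"
      + "<span class=\"metadataField metadataFieldReadonly\">"
      + "<input type=\"text\" name=\"some-random-value\" value=\"Smith\"/></span></div>"
      // Hypothetical extra field whose class names appear in the opposite order.
      + "<div><label>Nickname</label>"
      + "<span class=\"metadataFieldReadonly metadataField\">"
      + "<input type=\"text\" name=\"some-random-value\" value=\"Johnny\"/></span></div>"
      + "</form>";

    // Evaluates an expression to a string; an empty node set yields "".
    static String eval(String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
    }

    // Matches the span only when its class attribute is exactly this string.
    static String valueByExactClass(String label) throws Exception {
        return eval("//div[label[text()='" + label + "']]"
            + "/span[@class='metadataField metadataFieldReadonly']/input/@value");
    }

    // Matches the span when both class names are present, in any order.
    static String valueByClassTokens(String label) throws Exception {
        return eval("//div[label[text()='" + label + "']]"
            + "/span[contains(concat(' ', @class, ' '), ' metadataField ')"
            + " and contains(concat(' ', @class, ' '), ' metadataFieldReadonly ')]"
            + "/input/@value");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(valueByExactClass("Last Name"));
        // Empty string: the exact comparison fails on the reordered class names.
        System.out.println(valueByExactClass("Nickname").isEmpty());
        System.out.println(valueByClassTokens("Nickname"));
    }
}
```

In Selenium the same expressions, without the trailing /@value step, would be passed to By.xpath.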