For the purpose of my question, I have created a simple HTML page, an extract of which is the following:
<table class="fruit-vegetables">
<thead>
<th>Fruit</th>
<th>Vegetables</th>
</thead>
<tbody>
<tr>
<td>
<b>
Apples
</b>
</td>
<td>
Carrots
</td>
</tr>
<tr>
<td>
<i>
Oranges
</i>
</td>
<td>
Peas
</td>
</tr>
</tbody>
</table>
I want to extract the data from the first column called "Fruit" using Jsoup. Thus, the result should be:
Apples
Oranges
I have written a program, an extract of which is the following:
//In reality, it should be connect(html).get().
//Also, suppose that the String `html` has the full source code.
Document doc = Jsoup.parse(html);
Elements table = doc.select("table.fruit-vegetables").select("tbody").select("tr").select("td").select("a");
for(Element element : table){
System.out.println(element.text());
}
The result of this program is:
Apples
Carrots
Oranges
Peas
I know that something is not working correctly, but I can't find my mistake. None of the other questions here on Stack Overflow solved my problem. What do I have to do?
You seem to be looking for
Elements el = doc.select("table.fruit-vegetables td:eq(0)");
for (Element e : el){
System.out.println(e.text());
}
At http://jsoup.org/cookbook/extracting-data/selector-syntax you can find the description of :eq(n):
:eq(n): find elements whose sibling index is equal to n; e.g. form input:eq(1)
So with td:eq(0) we select each <td> that is the first child of its parent, in this case <tr>.
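For comparison, the same "first cell of every row" idea can be sketched without Jsoup, using the XPath engine that ships with the JDK. Here td[1] plays the role of td:eq(0), because XPath positions start at 1. This is a swapped-in technique, not Jsoup itself; the class name FirstColumnDemo and the trimmed, well-formed copy of the table are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FirstColumnDemo {
    // Well-formed copy of the table from the question (hypothetical test input).
    static final String TABLE =
        "<table class=\"fruit-vegetables\"><tbody>"
        + "<tr><td><b>Apples</b></td><td>Carrots</td></tr>"
        + "<tr><td><i>Oranges</i></td><td>Peas</td></tr>"
        + "</tbody></table>";

    // Returns the trimmed text of the first <td> of every row.
    static List<String> firstColumn(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList cells = (NodeList) xpath.evaluate(
                "//table[@class='fruit-vegetables']//tr/td[1]",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < cells.getLength(); i++) {
                result.add(cells.item(i).getTextContent().trim());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        firstColumn(TABLE).forEach(System.out::println);
    }
}
```

The JDK parser needs well-formed XML, so this is only useful for checking the selection logic; real-world HTML is better left to Jsoup.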
I have the following table:
<table>
<tbody>
<tr>
<td>
</td>
<td>
User1
</td>
</tr>
<tr>
<td>
</td>
<td>
User2
</td>
</tr>
</tbody>
</table>
I want to find the a-tag of the tr whose row contains the text User2. I know that I can find an a-tag by partial link text, like findElement(By.partialLinkText("/ResetPassword/")); (the number 2 can change, so I can't use it as a separator). But I need to separate the rows by user. Is there a solution like tr.td.text("User2") > findElement(By.partialLinkText("/ResetPassword/"));?
This XPath should do the trick for you:
.//tr[td[normalize-space(text())='User2']]//a
Just change the "User2" part to the desired user value.
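One way to sanity-check this XPath outside the browser is the XPath engine built into the JDK. The sketch below assumes that the first cell of each row holds a link. The href values (/ResetPassword/1, /ResetPassword/2), the link text, and the class name RowLinkDemo are hypothetical, since the markup in the question does not show the links.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class RowLinkDemo {
    // Hypothetical table: the links and their href values are made up for the demo.
    static final String TABLE =
        "<table><tbody>"
        + "<tr><td><a href=\"/ResetPassword/1\">Reset</a></td><td> User1 </td></tr>"
        + "<tr><td><a href=\"/ResetPassword/2\">Reset</a></td><td> User2 </td></tr>"
        + "</tbody></table>";

    // Returns the href of the first link in the row whose cell text equals `user`,
    // or an empty string when no row matches.
    // Note: `user` is concatenated into the expression, so it must not contain a single quote.
    static String hrefFor(String xml, String user) {
        String expression =
            ".//tr[td[normalize-space(text())='" + user + "']]//a/@href";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hrefFor(TABLE, "User2")); // prints "/ResetPassword/2"
    }
}
```

In Selenium the same expression string would be handed to By.xpath(...) and the returned element clicked; normalize-space is what makes the match tolerant of the whitespace around the user name.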
Hi, I was wondering whether you could use the .getValue() statement with /@a at the end of the XPath to locate the attribute "a".
The main thing to do is find a bulletproof XPath to locate the User2 row; once you have done that, finding the value of "a" should be easy enough.
I hope this helps
You can try something like this (not sure, though):
List<WebElement> rows = table.findElements(By.tagName("tr"));
for (WebElement row : rows) {
    List<WebElement> cells = row.findElements(By.tagName("td"));
    // Compare the cell text: a List<WebElement> never contains a String,
    // so cells.contains("User2") would always be false.
    if (cells.size() > 1 && "User2".equals(cells.get(1).getText().trim())) {
        System.out.println(cells.get(0).getText()); // the first cell contains the link
    }
}
I want to get the content of the last element inside specific tags.
For example:
<tr>
<td><b>my name</b></td>
<td><spec id="nm" nm="eg">Example Name</spec>
</td>
</tr>
....
<tr>
<td><b>samp2</b></td>
<td title="samp2"><div>Example 2</div>
</td>
</tr>
I want to reach "Example Name", and I want the program to be dynamic. How can I do that?
(You can see that the last tag here is "spec"; in another scenario the last tag might be "sam". How can I find the last tag's inner HTML? In the second sample I want to get "Example 2".)
Updated sample
If I have this:
<table>
<tr>
<td>1</td>
<td><div>2</div></td>
</tr>
<tr>
<td><span>3</span></td>
</tr>
</table>
Then the output should be:
2 and 3
because they are the inner HTML of the last tags under each tr tag.
(I want the last tag under each tr tag, but if it has a child element, I want that child's inner HTML.)
Thanks in advance.
You can use the jsoup HTML parser to do this; it lets you find elements with CSS or jQuery-like selectors:
String html = "<table><tr><td>1</td><td>2</td></tr><tr><td>3</td><td>4</td></tr></table>";
Document doc = Jsoup.parse(html);
System.out.println(doc);
Elements elements = doc.select("tr td:last-child");
for(Element element: elements) {
System.out.println(element.html());
}
output
2
4
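For the updated sample in the question, where the last cell sometimes wraps its text in a child element, reading the cell's text rather than its markup gives 2 and 3. In jsoup the analogous change would be calling element.text() instead of element.html(). The sketch below shows the same idea with the JDK's built-in XPath engine instead of jsoup (a swapped-in technique, chosen so that it runs without extra libraries); LastCellDemo is a hypothetical name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class LastCellDemo {
    // The "updated sample" table from the question.
    static final String TABLE =
        "<table>"
        + "<tr><td>1</td><td><div>2</div></td></tr>"
        + "<tr><td><span>3</span></td></tr>"
        + "</table>";

    // Returns the text of the last <td> in every row, ignoring any wrapper tags.
    static List<String> lastCellTexts(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList cells = (NodeList) xpath.evaluate(
                "//tr/td[last()]",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < cells.getLength(); i++) {
                // getTextContent() concatenates the text of all descendants.
                result.add(cells.item(i).getTextContent().trim());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lastCellTexts(TABLE)); // prints "[2, 3]"
    }
}
```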
You can try a regex like:
/<spec[^>]*>(.*?)<\/spec>/
I don't think it is efficient, but you can try it. A more general pattern for the last cell of a row is:
/<td[^>]*>(.*?)<\/td><\/tr>/
This is an approximation; it does not handle child elements. You can apply the following to its result to strip span, div, etc.:
/<(.*?)[^>]*>(.*?)<\/(.*?)>/
Alright, I am having trouble finding an equivalent of Element.children(), because I have an Elements object...
What I am trying to do is download an HTML file (well, I have done that...) and identify a single table row. I've done that by using doc.getElementsByClass("emphasizedRowColor"); because the row I want has the emphasizedRowColor class and no other elements do. I just don't understand how to isolate the one Element in my Elements object RWTableRow.
Html:
<tr class="rwOdd emphasizedRowColor">
<td class="jewel" style="">
<div class="teamJewel" style="background-position: 0px -336px;margin: 0 0 2px 2px;"></div>
</td>
<td class="left" style=""> Detroit</td>
<td style="">18</td>
<td style="">9</td>
<td style="">5</td>
<td style="">4</td>
<td class="narrowStatsColumn cSrt" style="">22</td>
<td class="narrowStatsColumn" style="">9</td>
<td style="">45</td>
<td style="">48</td>
<td style="">3-2-4</td>
<td style="">6-3-0</td>
<td style="">3-3-4</td>
</tr>
I can figure out what to do once I actually get the table as an Element but oh boy I think I just need a fresh set of eyes to figure out what I'm doing...
Java:
Document doc = Jsoup.connect(url).userAgent("Mozilla").get();
Elements RWTableRow = doc.getElementsByClass("emphasizedRowColor");
As you can see, I'm in quite the pickle...
Elements implements java.util.List<Element>, so you can simply call
Element e = RWTableRow.get(0);
And there you have it.
<tbody>
<tr class="odd">
<td>
<input id="nodeAccountOid" type="radio" onclick="setNodeAccountIdToCredentialCheck('E9E2930C4493B569E040A8C0158E4ABD');" style="width:100%;border:0px">
</td>
<td>E9E2930C4493B569E040A8C0158E4ABD</td>
<td>monacho1</td>
<td>urn:dece:org:org:dece:dece:cs</td>
</tr>
<tr class="even">
<td>
<input id="nodeAccountOid" type="radio" onclick="setNodeAccountIdToCredentialCheck('E9E2930C4494B569E040A8C0158E4ABD');" style="width:100%;border:0px">
</td>
<td>E9E2930C4494B569E040A8C0158E4ABD</td>
<td>monacho1</td>
<td>urn:dece:org:org:dece:coord:cs</td>
</tr>
<tr class="odd">
<td>
<input id="nodeAccountOid" type="radio" onclick="setNodeAccountIdToCredentialCheck('E9E2930C4495B569E040A8C0158E4ABD');" style="width:100%;border:0px">
</td>
<td>E9E2930C4495B569E040A8C0158E4ABD</td>
<td>monacho1</td>
<td>urn:dece:org:org:dece:300</td>
</tr>
<tr class="even">
<td>
<input id="nodeAccountOid" type="radio" onclick="setNodeAccountIdToCredentialCheck('E9E2930C4495B569E040A8C0158E4ABD');" style="width:100%;border:0px">
</td>
<td>E9E2930C4495B569E040A8C0158E4ABD</td>
<td>monacho1</td>
<td>urn:dece:org:org:dece:10</td>
</tr>
</tbody>
</table>
I want to select the radio button corresponding to urn:dece:org:org:dece:10, which is in the fourth row of the HTML provided. The row position may change depending on some inputs in the AUT. Please show me a way to select it.
Thanks in advance.
Since the text never changes, you can use that as a starting point within the DOM, and use XPath to navigate through to the input you need:
//td[.='urn:dece:org:org:dece:10']/parent::tr/descendant::input[@id='nodeAccountOid' and @type='radio']
Get the td that has its text equal to urn:dece:org:org:dece:10
Get that td's parent tr
From that parent tr, get the input that has an id equal to nodeAccountOid and has a type of radio.
Therefore it doesn't matter where exactly the elements are, as long as the XPath locator can navigate up to the parent and back down again to the input you need.
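The up-to-the-row-and-back-down navigation can be tried out with the JDK's built-in XPath engine before wiring it into a test. This is only a sketch under a few assumptions: the markup is reduced to two rows, the input tags are self-closed so that the XML parser accepts them, and the class name RadioLocatorDemo is made up. It returns the onclick attribute of the matched input to show which row was hit.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class RadioLocatorDemo {
    // Two rows taken from the question, with <input/> self-closed for the XML parser.
    static final String ROWS =
        "<table><tbody>"
        + "<tr class=\"even\"><td><input id=\"nodeAccountOid\" type=\"radio\" "
        + "onclick=\"setNodeAccountIdToCredentialCheck('E9E2930C4494B569E040A8C0158E4ABD');\"/></td>"
        + "<td>E9E2930C4494B569E040A8C0158E4ABD</td><td>monacho1</td>"
        + "<td>urn:dece:org:org:dece:coord:cs</td></tr>"
        + "<tr class=\"even\"><td><input id=\"nodeAccountOid\" type=\"radio\" "
        + "onclick=\"setNodeAccountIdToCredentialCheck('E9E2930C4495B569E040A8C0158E4ABD');\"/></td>"
        + "<td>E9E2930C4495B569E040A8C0158E4ABD</td><td>monacho1</td>"
        + "<td>urn:dece:org:org:dece:10</td></tr>"
        + "</tbody></table>";

    // Finds the radio input in the row whose cell text equals `urn` and
    // returns its onclick attribute (empty string when nothing matches).
    // Note: `urn` is concatenated into the expression, so it must not contain a single quote.
    static String onclickFor(String xml, String urn) {
        String expression = "//td[.='" + urn + "']/parent::tr"
            + "/descendant::input[@id='nodeAccountOid' and @type='radio']/@onclick";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(onclickFor(ROWS, "urn:dece:org:org:dece:10"));
    }
}
```

With WebDriver, the locator part of the expression (without the trailing /@onclick) would be passed to By.xpath(...) and the resulting element clicked.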
The problem you are facing is to locate a certain element in the web page that may occur in different positions.
If there is another element which can be identified easily by its content, id or name and there is a static relation between this element and the one you would like to locate, then you could use xpath. There are also some examples for this in the selenium documentation as far as I know.
I am looking for help reading the span elements of a table using Selenium and writing their text to a file.
The following is my HTML code:
<table>
<tbody>
<tr>
<td><span>
FIRST
</span>
</td>
</tr>
<tr>
<td>
<span>SECOND</span>
</td>
</tr>
<tr>
<td>
<span>THIRD</span>
</td>
</tr>
</tbody>
</table>
I need to write FIRST SECOND THIRD on a file in java.
Thanks a lot.
I suppose you have located/found the table WebElement. Then you can get span elements content like this:
List<WebElement> spanElements = tableElement.findElements(By.tagName("span"));
for (WebElement element : spanElements) {
String spanContent = element.getText();
//save it to a collection or a StringBuilder, then write it to a file
}
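The "write it to a file" step from the comment in the loop above could look roughly like this, using only java.nio. It is a minimal sketch: the list of strings stands in for the values returned by element.getText(), and the class name SpanWriterDemo, the helper names, and the temp-file location are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

final class SpanWriterDemo {
    // Joins the collected span texts with single spaces and writes them as UTF-8.
    static void writeSpans(List<String> spanTexts, Path target) {
        try {
            byte[] bytes = String.join(" ", spanTexts).getBytes(StandardCharsets.UTF_8);
            Files.write(target, bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes to a temporary file and reads it back, to show what ends up on disk.
    static String roundTrip(List<String> spanTexts) {
        try {
            Path target = Files.createTempFile("spans", ".txt");
            writeSpans(spanTexts, target);
            String content = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            Files.delete(target);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the strings collected from element.getText().
        List<String> texts = Arrays.asList("FIRST", "SECOND", "THIRD");
        System.out.println(roundTrip(texts)); // prints "FIRST SECOND THIRD"
    }
}
```

Since getText() may return surrounding whitespace for spans such as the first one in the question, calling trim() on each value before collecting it is probably a good idea.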
Having a look at this and this might help.
1) An XPath that gets the text of all spans is:
//span/text()
2) In Java code you could write something like
String text = selenium.getText("xpath=//span");