Selenium read a table and write into a file line by line - java

I am looking for help reading the <span> elements of a table with Selenium and writing their text to a file.
This is my HTML:
<table>
<tbody>
<tr>
<td><span>
FIRST
</span>
</td>
</tr>
<tr>
<td>
<span>SECOND</span>
</td>
</tr>
<tr>
<td>
<span>THIRD</span>
</td>
</tr>
</tbody>
</table>
I need to write FIRST, SECOND and THIRD to a file, line by line, in Java.
Thanks a lot.

I suppose you have located/found the table WebElement. Then you can get span elements content like this:
List<WebElement> spanElements = tableElement.findElements(By.tagName("span"));
for (WebElement element : spanElements) {
    // getText() returns the visible text of the element
    String spanContent = element.getText();
    // save it to a collection or a StringBuilder, then write it to a file
}
Having a look at this and this might help.
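
The "write it to a file" step from the comment in the loop above could look roughly like the following. This is a minimal sketch using only the JDK: the class name SpanTextWriter, its methods, and the file name spans.txt are made up for illustration, and the hard-coded list stands in for the strings collected from element.getText().

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SpanTextWriter {

    // Trims each text and drops empty entries, so "\nFIRST\n" becomes "FIRST".
    static List<String> clean(List<String> rawTexts) {
        List<String> lines = new ArrayList<>();
        for (String raw : rawTexts) {
            String text = raw.trim();
            if (!text.isEmpty()) {
                lines.add(text);
            }
        }
        return lines;
    }

    // Writes one entry per line, creating or overwriting the target file.
    static void writeLines(Path target, List<String> rawTexts) throws IOException {
        Files.write(target, clean(rawTexts), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the values collected via element.getText() (assumption).
        List<String> spanTexts = Arrays.asList("\nFIRST\n", "SECOND", "THIRD");
        Path target = Paths.get("spans.txt"); // hypothetical output file name
        writeLines(target, spanTexts);
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

Collecting the strings first and writing them in one call keeps the Selenium loop free of file handling. Appending inside the loop would work as well.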

1) An XPath that selects the text of every span is:
//span/text()
2) In Java, with the older Selenium RC API, you can write something like:
String text = selenium.getText("xpath=//span");

Related

Extract all string data except String containing HTML Table's in java

I have a long String like this.
<p>Some Text above the tabular data. I hope this text will be seen.</p>
<table border="1" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td style="width:150px">
<p>S.No.</p>
</td>
</td>
</tr>
<tr>
<td style="width:150px">
<p>2</p>
</td>
</tbody>
</table>
<p> </p>
<p>Please go through this tabular data.</p>
<table border="1" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td style="width:150px">
<p>S.No.</p>
</td>
</tr>
<tr>
<td style="width:150px">
<p>1</p>
</td>
<tr>
<td style="width:150px">
>
</td>
</td>
</tr>
</tbody>
</table>
<p>End Of String</p>
Now I want to keep everything before and after each HTML table and put "HTML Table..." in place of the table itself, like this. I tried a few things but could not achieve it. I tried splitting the string into arrays, but that didn't work.
Sample Output
<p>Some Text above the tabular data. I hope this text will be seen.</p>
<p> </p>
HTML Table....
<p>Please go through this tabular data.</p>
<p>End Of String</p>
You can do this simply with String.replaceAll, using a regex with the case-insensitive and DOTALL flags (?is), so that . also matches line breaks:
String noTables = longTableString.replaceAll("(?is)(\\<table .*?\\</table\\>)", "HTML Table...");
// result
<p>Some Text above the tabular data. I hope this text will be seen.</p>
HTML Table...
<p> </p>
<p>Please go through this tabular data.</p>
HTML Table...
<p>End Of String</p>
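
To try the replaceAll approach in isolation, a small self-contained sketch might look like this. The class name TableReplacer and the sample HTML are made up for the demo. The pattern is a slight variant of the one above: it uses <table\b instead of "<table " (an assumption on my part that tags without attributes, such as a bare <table>, should be replaced too).

```java
class TableReplacer {

    // (?i) = case-insensitive, (?s) = DOTALL so '.' also matches line breaks.
    // \b lets the pattern match both "<table>" and "<table border=...>".
    static String replaceTables(String html) {
        return html.replaceAll("(?is)<table\\b.*?</table>", "HTML Table...");
    }

    public static void main(String[] args) {
        // Made-up sample input: one upper-case table with attributes, one bare table.
        String html = "<p>Above</p>\n"
                + "<TABLE border=\"1\">\n<tr><td>1</td></tr>\n</TABLE>\n"
                + "<p>Between</p>\n"
                + "<table><tr><td>2</td></tr></table>\n"
                + "<p>End Of String</p>";
        System.out.println(replaceTables(html));
    }
}
```

The reluctant .*? is what keeps the two tables separate; a greedy .* would replace everything from the first opening tag to the last closing tag in one go.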
This may not be the most elegant solution, but you can start by using a regex to locate the tables and then replace them with the desired content. Something like the following should help.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String htmlStr = "<your html string>"; // placeholder for the HTML to process
// capture table elements using a suitable regex
Pattern pattern = Pattern.compile("(<table)([\\s\\S]*?)(</table>)");
Matcher matcher = pattern.matcher(htmlStr);
String result = htmlStr;
while (matcher.find()) {
    // replace the matched table element with another string
    result = result.replace(matcher.group(), "HTML Table....");
}
System.out.println(result); // print output
This approach has a few drawbacks. The regex must match the way the tables are actually written in your HTML, and the whitespace in the output is whatever surrounded the tables in the original string, so you have little control over it. More importantly, evaluating the regex can be CPU-intensive for large HTML strings.
This is just one approach to try.

Using jsoup to get data from first column of table

For the purpose of my question, I have created a simple HTML page, an extract of which is the following:
<table class="fruit-vegetables">
<thead>
<th>Fruit</th>
<th>Vegetables</th>
</thead>
<tbody>
<tr>
<td>
<b>
Apples
</b>
</td>
<td>
Carrots
</td>
</tr>
<tr>
<td>
<i>
Oranges
</i>
</td>
<td>
Peas
</td>
</tr>
</tbody>
</table>
I want to extract the data from the first column called "Fruit" using Jsoup. Thus, the result should be:
Apples
Oranges
I have written a program, an extract of which is the following:
// In reality, it should be connect(html).get().
// Also, suppose that the String `html` has the full source code.
Document doc = Jsoup.parse(html);
Elements table = doc.select("table.fruit-vegetables").select("tbody").select("tr").select("td").select("a");
for (Element element : table) {
    System.out.println(element.text());
}
The result of this program is:
Apples
Carrots
Oranges
Peas
I know that something is not working properly, but I can't find my mistake. None of the other questions here on Stack Overflow solved my problem. What do I have to do?
You seem to be looking for:
Elements el = doc.select("table.fruit-vegetables td:eq(0)");
for (Element e : el) {
    System.out.println(e.text());
}
From http://jsoup.org/cookbook/extracting-data/selector-syntax you can find description of :eq(n) as
:eq(n): find elements whose sibling index is equal to n; e.g. form input:eq(1)
So with td:eq(0) we select each <td> that is the first child of its parent, in this case the <tr>.

Selenium WebDriver Java / JUnit - Getting table element

I have the following table:
<table>
<tbody>
<tr>
<td>
</td>
<td>
User1
</td>
</tr>
<tr>
<td>
</td>
<td>
User2
</td>
</tr>
</tbody>
</table>
I want to find the a-tag in the tr whose row also contains the text User2. I know that I can find an a-tag by partial link text, like findElement(By.partialLinkText("/ResetPassword/")); (the number 2 can change, so I can't use it as a separator). But I need to select it by user. Is there a solution like tr.td.text("User2") > findElement(By.partialLinkText("/ResetPassword/"));?
This XPath should do the trick for you:
.//tr[td[normalize-space(text())='User2']]//a
Just replace the "User2" part with the desired user value.
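
One way to check such an expression before wiring it into Selenium is to run it through the JDK's own XPath engine against a well-formed copy of the markup. The sketch below does that. The class name RowLinkXPathDemo, the helper method, and the href values are invented for the demo (the question's markup does not show the links), and the user name is inlined naively, so it must not contain a single quote.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RowLinkXPathDemo {

    // Returns the href of the first <a> in the row whose cell text equals userName,
    // or null when no such row exists.
    static String findLinkForUser(String xhtml, String userName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        // Same shape as the XPath in the answer; assumes userName has no single quote.
        String expression = ".//tr[td[normalize-space(text())='" + userName + "']]//a";
        Element link = (Element) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODE);
        return link == null ? null : link.getAttribute("href");
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical markup: the href values are invented for this demo.
        String xhtml = "<table><tbody>"
                + "<tr><td><a href=\"/ResetPassword/1\">Reset</a></td><td>\n User1 \n</td></tr>"
                + "<tr><td><a href=\"/ResetPassword/2\">Reset</a></td><td>\n User2 \n</td></tr>"
                + "</tbody></table>";
        System.out.println(findLinkForUser(xhtml, "User2"));
    }
}
```

normalize-space() is what makes the comparison tolerant of the line breaks and spaces around the user name in the cell.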
Hi, I was wondering whether you could use the .getValue() statement with /#a at the end of the XPath to locate the attribute "a".
The main thing is to find a bulletproof XPath that locates the User2 row. Once you have done that, finding the value of "a" should be easy enough.
I hope this helps.
You can try something like this (not sure though):
List<WebElement> rows = table.findElements(By.tagName("tr"));
for (WebElement row : rows) {
    List<WebElement> cells = row.findElements(By.tagName("td"));
    for (WebElement cell : cells) {
        // compare the cell's text (not the WebElement itself) with the user name
        if (cell.getText().trim().equals("User2")) {
            // the cell at position 0 contains the link
            System.out.println(cells.get(0).getText());
        }
    }
}

Regex to iterate over an table and extract the td information inside an div using java

Hello, I know that parsing HTML with regex is not efficient, but I need to do it with regex; I have no other option.
HTML
<div class="test">
<h2>what</h2>
<table cellpadding="0" cellspacing="0">
<tbody>
<tr>
<th>Example </th>
<td> ui </td>
</tr>
<tr>
<th>Sample </th>
<td>123 </td>
</tr>
</tbody>
</table>
</div>
I tried (?s)<div class="test">.*<td>(.*?)</td>.*</div>, but it extracts only the last value. Can anyone tell me what the issue is?
Why use only a regular expression? How about some jQuery?
// descendant selector: the td elements are nested inside the table, not direct children of the div
$("div.test td").each(function() {
    var $this = $(this);
    alert($this.text());
});
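The * quantifier is greedy: it consumes as much as possible, so the first .* swallows almost all of the text, including every <td> except the last.
Try .*? instead. The question mark makes the quantifier reluctant, so it takes only as much as necessary rather than as much as possible. Note that a single match still captures only one <td> value; to get all of them, match <td>(.*?)</td> repeatedly with find().
Otherwise, please be more specific about which parts you want and which you don't.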
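
Applied to the HTML from the question, the reluctant quantifier can be combined with a find() loop to collect every cell. The following is only a sketch under a few assumptions: the class name TdExtractor is made up, the div is assumed to contain no nested <div>, and the <td> tags are assumed to carry no attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TdExtractor {

    // Reluctant quantifier: stop at the first closing </div> after the opening tag.
    // Assumes the div contains no nested <div>.
    private static final Pattern DIV =
            Pattern.compile("(?s)<div class=\"test\">(.*?)</div>");
    // Matches one <td>...</td> cell at a time (assumes attribute-free <td> tags).
    private static final Pattern TD = Pattern.compile("(?s)<td>(.*?)</td>");

    static List<String> extractCells(String html) {
        List<String> cells = new ArrayList<>();
        Matcher div = DIV.matcher(html);
        if (div.find()) {
            // Search only inside the div, and call find() once per cell.
            Matcher td = TD.matcher(div.group(1));
            while (td.find()) {
                cells.add(td.group(1).trim());
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        String html = "<div class=\"test\">\n<h2>what</h2>\n<table>\n<tbody>\n"
                + "<tr>\n<th>Example </th>\n<td> ui </td>\n</tr>\n"
                + "<tr>\n<th>Sample </th>\n<td>123 </td>\n</tr>\n"
                + "</tbody>\n</table>\n</div>";
        System.out.println(extractCells(html)); // prints [ui, 123]
    }
}
```

Splitting the work into two patterns avoids trying to capture a variable number of cells with a single expression, which one capture group cannot do.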

How to iterate over <td> tags with condition using jsoup

I am able to get all the text within the tags, but I want to access only specific td tags.
E.g. I want the text of the second cell in each row whose first cell contains an a element with the attribute
name="Manufacturer"
or name="Content". I am using jsoup.
<table>
<tr>
<td><a name="Manufacturer"></a>manufacturer</td>
<td>happiness</td>
</tr>
<tr>
<td>manuf</td>
<td>hap</td>
</tr>
<tr>
<td>tents</td>
<td>acd</td>
</tr>
<tr>
<td><a name="Content"></a>Contents</td>
<td>abcd</td>
</tr>
</table>
I am using this code:
doc.select("a[name=Manufacturer]");
but it gives me a reference to the first cell. I need to move on to the second cell and get its text.
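You need an attribute selector of the form [attr=value], which matches elements with that attribute value, e.g. [width=500]. To reach the second cell, you can combine it with :has() and the adjacent-sibling combinator, e.g. something like td:has(a[name=Manufacturer]) + td.
Take a look at the Selector Syntax page in the official documentation.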
