I'm working on a tool and I'm at the last steps, but I've run into a small problem and would appreciate a hint.
The page has three data tables. I can get the data from the first two only; how can I reach the third one, titled "Upgrade Warranty and Service Information"?
Here is the HTML for the tables:
<body>
<div id="ibm-pcon">
<div id="ibm-content">
<div id="ibm-leadspace-head" class="ibm-alternate">
<div id="ibm-leadspace-body">
<br></br>
<script type="text/javascript">currentDate();</script>
<br></br>
<!--BEGIN OPTIONAL BREADCRUMBING--> <span style="font-size: small;">Machine Lookup > Warranty Information > </span>
<!--END OPTIONAL BREADCRUMBING-->
<br></br>
<h1>PEW | Warranty Information</h1>
</div>
</div>
<!-- CONTENT_BODY -->
<div id="ibm-content-body">
<div id="ibm-content-main">
<!-- LEADSPACE_BEGIN -->
<!-- This section can be used to test JavaScript and CSS before promoting the data to the template XML. -->
<table class="ibm-results-table" summary="output table" cellpadding="0" cellspacing="0" border="0"><tbody xmlns="http://www.w3.org/TR/xhtml1/">
<thead>
<tr>
<th scope="col" class="pg2OutputTableSectionTitle">Results of Machine Type/Serial Number Query</th>
</tr>
</thead>
<tr>
<td><table class="ibm-data-table ibm-alternating" summary="output table" cellpadding="0" cellspacing="0" border="0"><tbody>
<thead>
<tr>
<th scope="col" colspan="3" class="pg2TableSectionTitle">General Machine Information:</th>
</tr>
</thead>
<tr>
<td>
Type:
<span>1746</span>
</td><td>
Model:
<span>C4A</span>
</td><td>
Serial:
<span>13D06MK</span>
</td>
</tr>
<tr>
<td>
Status:
<span>Proof Of Purchase Rcvd</span>
</td><td>
Build Date:
<span> </span>
</td><td>
Build to Model:
<span> </span>
</td>
</tr>
<tr>
<td>
Geography:
<span>EMEA</span>
</td><td>
Country:
<span>GREECE</span>
</td><td>
Configuration Id:
<span> </span>
</td>
</tr>
<tr>
<td>
OES Order Number:
<span>2076804957</span>
</td><td>
Customer Number:
<span>108401</span>
</td><td>
Delivery Number:
<span>8519501492</span>
</td>
</tr>
<tr>
<td colspan="2">
Service Status:
<span>This machine is currently out of warranty.</span>
</td><td colspan="1">
UAR End Date:
<span>2012-08-02</span>
</td>
</tr>
</tbody></table></td>
</tr>
<tr>
<td><table class="ibm-data-table ibm-alternating" summary="output table" cellpadding="0" cellspacing="0" border="0"><tbody>
<thead>
<tr>
<th scope="col" colspan="3" class="pg2TableSectionTitle">Warranty and Service Information:</th>
</tr>
</thead>
<tr>
<th scope="col">Start Date</th><th scope="col">End Date</th><th scope="col">SDF</th>
</tr>
<tr>
<td>2012-07-04</td><td>2015-07-03</td><td>3XL</td>
</tr>
<tr>
<td colspan="3">
SDF Description:
<span>This product has a 3 year limited warranty and is entitled to CRU (customer replaceable unit) and On-site service. Tier 1 CRUs are customer responsibility, see announcement for details. On-site Service is available Monday - Friday, except holidays, with a next business day response objective.</span>
</td>
</tr>
</tbody></table></td>
</tr>
<tr>
<td><table class="ibm-data-table ibm-alternating" summary="output table" cellpadding="0" cellspacing="0" border="0"><tbody>
<thead>
<tr>
<th scope="col" colspan="3" class="pg2TableSectionTitle">Upgrade Warranty and Service Information:</th>
</tr>
</thead>
<tr>
<th scope="col">Start Date</th><th scope="col">End Date</th><th scope="col">SDF</th>
</tr>
<tr>
<td>2012-07-04</td><td>2015-07-03</td><td>SP4</td>
</tr>
<tr>
<td colspan="3">
SDF Description:
<span>This product has a three year limited warranty which includes a warranty upgrade. This product is entitled to parts and labor and includes on-site repair service. Service is available 7X24 with an 4 hour response objective.</span>
</td>
</tr>
</tbody></table></td>
</tr>
<tr>
<td><table class="ibm-data-table" cellpadding="0" cellspacing="0" border="0"><thead>
<tr>
<th scope="col" class="pg2MessageHead">Messages</th>
</tr>
</thead>
<tbody>
<tr>
<td class="pg2MessagePanel" align="left"> </td>
</tr>
</tbody></table></td>
</tr>
</tbody></table>
</div>
My working code is:
public void actionPerformed(ActionEvent e) {
try {
String getTextArea;
getTextArea = textArea.getText();
String[] arr = getTextArea.split("\\n");
String type = null;
String serial = null;
int line = 0;
for(String s : arr) {
line++;
if(s.isEmpty()) {
textArea_1.append("Empty Line" + '\n');
continue;
}
type = s.substring(0, 4);
serial = s.substring(5, 12);
String html = "bla bla bla" + type + serial;
Document doc = Jsoup.connect(html).get();
Elements tableElements = doc.select("table");
java.util.Iterator<Element> ite = tableElements.select("tr").iterator();
Elements tableElement = doc.select("tr");
java.util.Iterator<Element> ite1 = tableElement.select("table").iterator();
ite.next();
ite1.next();
String result,result1,result2;
result = ite.next().text();
result1 = ite1.next().text();
Scanner sr = new Scanner(result);
Scanner sr1 = new Scanner(result1);
// System.out.println(result);
// System.out.println(result1);
// result of first table
while(sr.hasNext()) {
result = result;
ite.next().text();
String lineOfType;
lineOfType = ite.next().text();
type = lineOfType.substring(6, 10);
String model;
model = lineOfType.substring(18, 21);
serial = lineOfType.substring(30, 37);
ite.next().text();
String country = ite.next().text();
country = country.substring(24, 31);
textArea_1.append(line + "-" + type + '\t' + model + '\t' + serial + " " + country + " ");
}
sr.close();
// result of second table
while(sr1.hasNext()) {
result1 = result1;
String startDate = result1.substring(58, 68);
String endDate = result1.substring(69, 79);
textArea_1.append(startDate + " " + endDate + " ");
break;
}
sr1.close();
// getting the elements for the 3rd table, but this does not work as expected: it gets the second table's data.
Elements tableElement2 = doc.select("tr");
java.util.Iterator<Element> ite2 = tableElement2.select("table").iterator();
ite2.next();
result2 = ite2.next().text();
Scanner sr2 = new Scanner(result2);
// this while shows the same result as the second while !
while(sr2.hasNext()) {
sr2.next();
result2 = result2;
System.out.println(result2);
String srvPkStart = result2.substring(58, 68);
if(srvPkStart.equals(result1.substring(58, 68))) {
srvPkStart = "Not found";
}
String srvPkEnd = result2.substring(69, 79);
if(srvPkEnd.equals(result1.substring(69, 79))) {
srvPkEnd = "";
}
System.out.println(srvPkStart + '\t' + srvPkEnd);
textArea_1.append("ServicePack Dates: " + srvPkStart + '\t' + srvPkEnd + '\n');
break;
}
} // end of for loop
} catch (Exception e2) {
// TODO: handle exception
}
}
});
Let me suggest an easier way to get at those tables: select them by class using org.jsoup.nodes.Element.select().
See the jsoup selector syntax documentation to learn how to select elements.
String html = "<body><div id=\"ibm-pcon\"><div id=\"ibm-content\"><div id=\"ibm-leadspace-head\" class=\"ibm-alternate\"><div id=\"ibm-leadspace-body\"><br></br><script type=\"text/javascript\">currentDate();</script><br></br><!--BEGIN OPTIONAL BREADCRUMBING--> <span style=\"font-size: small;\">Machine Lookup > Warranty Information > </span><!--END OPTIONAL BREADCRUMBING--><br></br><h1>PEW | Warranty Information</h1> </div></div><!-- CONTENT_BODY --><div id=\"ibm-content-body\"><div id=\"ibm-content-main\"><table class=\"ibm-results-table\" summary=\"output table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"><tbody xmlns=\"www.w3.org/TR/xhtml1/\"><thead> <tr><th scope=\"col\" class=\"pg2OutputTableSectionTitle\">Results of Machine Type/Serial Number Query</th> </tr></thead><tr> <td><table class=\"ibm-data-table ibm-alternating\" summary=\"output table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"> <tbody> <thead><tr> <th scope=\"col\" colspan=\"3\" class=\"pg2TableSectionTitle\">General Machine Information:</th></tr> </thead> <tr><td> Type: <span>1746</span></td><td> Model: <span>C4A</span></td><td> Serial: <span>13D06MK</span></td> </tr> <tr><td> Status: <span>Proof Of Purchase Rcvd</span></td><td> Build Date: <span> </span></td><td> Build to Model: <span> </span></td> </tr> <tr><td> Geography: <span>EMEA</span></td><td> Country: <span>GREECE</span></td><td> Configuration Id: <span> </span></td> </tr> <tr><td> OES Order Number: <span>2076804957</span></td><td> Customer Number: <span>108401</span></td><td> Delivery Number: <span>8519501492</span></td> </tr> <tr><td colspan=\"2\"> Service Status: <span>This machine is currently out of warranty.</span></td><td colspan=\"1\"> UAR End Date: <span>2012-08-02</span></td> </tr> </tbody></table> </td></tr><tr> <td><table class=\"ibm-data-table ibm-alternating\" summary=\"output table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"> <tbody> <thead><tr> <th scope=\"col\" colspan=\"3\" 
class=\"pg2TableSectionTitle\">Warranty and Service Information:</th></tr> </thead> <tr><th scope=\"col\">Start Date</th><th scope=\"col\">End Date</th><th scope=\"col\">SDF</th> </tr> <tr><td>2012-07-04</td><td>2015-07-03</td><td>3XL</td> </tr> <tr><td colspan=\"3\"> SDF Description: <span>This product has a 3 year limited warranty and is entitled to CRU (customer replaceable unit) and On-site service. Tier 1 CRUs are customer responsibility, see announcement for details. On-site Service is available Monday - Friday, except holidays, with a next business day response objective.</span></td> </tr> </tbody></table> </td></tr><tr> <td><table class=\"ibm-data-table ibm-alternating\" summary=\"output table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"> <tbody> <thead><tr> <th scope=\"col\" colspan=\"3\" class=\"pg2TableSectionTitle\">Upgrade Warranty and Service Information:</th></tr> </thead> <tr><th scope=\"col\">Start Date</th><th scope=\"col\">End Date</th><th scope=\"col\">SDF</th> </tr> <tr><td>2012-07-04</td><td>2015-07-03</td><td>SP4</td> </tr> <tr><td colspan=\"3\"> SDF Description: <span>This product has a three year limited warranty which includes a warranty upgrade. This product is entitled to parts and labor and includes on-site repair service.Service is available 7X24 with an 4 hour response objective.</span></td> </tr> </tbody></table> </td></tr><tr> <td><table class=\"ibm-data-table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"> <thead><tr> <th scope=\"col\" class=\"pg2MessageHead\">Messages</th></tr> </thead> <tbody><tr> <td class=\"pg2MessagePanel\" align=\"left\"> </td></tr> </tbody></table> </td></tr></tbody> </table></div> </body>";
Document doc = Jsoup.parse(html, "", Parser.xmlParser());
Elements tables = doc.select("table.ibm-data-table.ibm-alternating"); // Tables that have both classes: ibm-data-table and ibm-alternating
System.out.println(tables.size()); // tables.size() = 3
for (Element ele : tables) {
    // Get the section title cell (class pg2TableSectionTitle)
    Elements thElements = ele.select("tr > th.pg2TableSectionTitle");
    if (!thElements.isEmpty()) {
        String tableTitle = thElements.get(0).text();
        System.out.println(tableTitle);
        if (tableTitle.contains("General Machine Information:")) {
            // Apply your logic accordingly for table #General Machine
        }
        // Test the "Upgrade ..." title first: it also contains "Warranty and Service Information:"
        else if (tableTitle.contains("Upgrade Warranty and Service Information:")) {
            // Apply your logic accordingly for table #Upgrade Warranty
        }
        else if (tableTitle.contains("Warranty and Service Information:")) {
            // Apply your logic accordingly for table #Warranty and Service
        }
    }
}
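A note on the order of the title checks: "Upgrade Warranty and Service Information:" also contains the text "Warranty and Service Information:", so with contains() the more specific title has to be tested first (or the titles compared with equals()). The following is only a small sketch of that point in plain Java, without jsoup; the class name TitleDispatch, the method classify and the returned keys are made up for illustration.

```java
import java.util.List;

class TitleDispatch {
    // Hypothetical helper: maps a section title to a short key.
    static String classify(String tableTitle) {
        if (tableTitle.contains("General Machine Information:")) {
            return "GENERAL";
        } else if (tableTitle.contains("Upgrade Warranty and Service Information:")) {
            // Must come before the plain "Warranty ..." check, which this title also satisfies.
            return "UPGRADE";
        } else if (tableTitle.contains("Warranty and Service Information:")) {
            return "WARRANTY";
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        List<String> titles = List.of(
                "General Machine Information:",
                "Warranty and Service Information:",
                "Upgrade Warranty and Service Information:");
        for (String title : titles) {
            System.out.println(title + " -> " + classify(title));
        }
    }
}
```

If the two warranty checks are swapped, the third title falls into the "WARRANTY" branch and the "UPGRADE" branch is never reached.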
I want to loop through the news table and get the title and rating of each row. I tried different options, but I can't understand why the select method returns everything at once.
I need to get each news block in a loop.
I used this selector to get the table:
Elements elements = document.select("#hnmain > tbody > tr:nth-child(3) > td > table");
This query doesn't work in a loop because it returns all the elements at once. I need to get the elements one by one, so that I can do something like this:
List<String> list = new ArrayList<>();
for (Element element: elements){
String title = element...
String rating = element...
list.add(title);
list.add(rating);
}
Sample data from html:
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr class="athing" id="33582264">
<td align="right" valign="top" class="title"><span class="rank">1.</span></td>
<td valign="top" class="votelinks">
<center>
<a id="up_33582264" href="vote?id=33582264&how=up&goto=front%3Fday%3D2022-11-13">
<div class="votearrow" title="upvote"></div></a>
</center></td>
<td class="title"><span class="titleline">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.<span class="sitebit comhead"> (<span class="sitestr">upbase.io</span>)</span></span></td>
</tr>
<tr>
<td colspan="2"></td>
<td class="subtext"><span class="subline"> <span class="score" id="score_33582264">632 points</span> by tonypham <span class="age" title="2022-11-13T12:00:06">20 days ago</span> <span id="unv_33582264"></span> | hide | 456 comments </span></td>
</tr>
<tr class="spacer" style="height:5px"></tr>
<tr class="athing" id="33584941">
<td align="right" valign="top" class="title"><span class="rank">2.</span></td>
<td valign="top" class="votelinks">
<center>
<a id="up_33584941" href="vote?id=33584941&how=up&goto=front%3Fday%3D2022-11-13">
<div class="votearrow" title="upvote"></div></a>
</center></td>
<td class="title"><span class="titleline">Forking Chrome to turn HTML into SVG<span class="sitebit comhead"> (<span class="sitestr">fathy.fr</span>)</span></span></td>
</tr>
If I understand your question correctly, I think this code will work for you:
Document doc = Jsoup.parse("<table border=\"0\" id=\"hnmain\" cellpadding=\"0\" cellspacing=\"0\"> <tbody> <tr class=\"athing\" id=\"33582264\"> <td align=\"right\" valign=\"top\" class=\"title\"><span class=\"rank\">1.</span></td> <td valign=\"top\" class=\"votelinks\"> <center> <a id=\"up_33582264\" href=\"vote?id=33582264&how=up&goto=front%3Fday%3D2022-11-13\"> <div class=\"votearrow\" title=\"upvote\"></div></a> </center></td> <td class=\"title\"><span class=\"titleline\">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.<span class=\"sitebit comhead\"> (<span class=\"sitestr\">upbase.io</span>)</span></span></td> </tr> <tr> <td colspan=\"2\"></td> <td class=\"subtext\"><span class=\"subline\"> <span class=\"score\" id=\"score_33582264\">632 points</span> by tonypham <span class=\"age\" title=\"2022-11-13T12:00:06\">20 days ago</span> <span id=\"unv_33582264\"></span> | hide | 456 comments </span></td> </tr> <tr class=\"spacer\" style=\"height:5px\"></tr> <tr class=\"athing\" id=\"33584941\"> <td align=\"right\" valign=\"top\" class=\"title\"><span class=\"rank\">2.</span></td> <td valign=\"top\" class=\"votelinks\"> <center> <a id=\"up_33584941\" href=\"vote?id=33584941&how=up&goto=front%3Fday%3D2022-11-13\"> <div class=\"votearrow\" title=\"upvote\"></div></a> </center></td> <td class=\"title\"><span class=\"titleline\">Forking Chrome to turn HTML into SVG<span class=\"sitebit comhead\"> (<span class=\"sitestr\">fathy.fr</span>)</span></span></td> </tr>");
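Elements elements = doc.select("#hnmain tr.athing");
for (Element element : elements) {
    // ".title" matches both the rank cell and the title cell, so target ".titleline" instead
    String title = element.select(".titleline").text();
    String rank = element.select(".rank").text();
    // the score sits in the row that follows each "athing" row; it may be missing
    Element subRow = element.nextElementSibling();
    String rating = subRow == null ? "" : subRow.select(".score").text();
    System.out.println(rank + " " + title + " -- " + rating);
}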
I have a table that contains:
a column with a list of invoices
a column with the many charge types for every invoice displayed
I want to write a function that receives a String parameter, for example the invoice number, and returns all the charge types for that invoice number.
Each time a new invoice appears in the table, its first row carries a value (the rowspan attribute in the HTML below).
That value is the number of charge types displayed for that invoice.
For example, the charge types are: Management fee, Payments, Funds Transmission Cost, Acquiring Authorisation Fee, Service, etc.
Here is the code for the table:
<form method="post" action="/accounting/billing/showInvoiceTransactionsCountTotal.html?
jlbz=lfISHfhqWHPj5fSzCwFKoP8c5ukwXecQt0fr4iL6ak" target="detail">
<table>
<tbody>
<tr>
<tr class="odd">
<td rowspan="8">
<a href="/accounting/billing/showInvoice.html?invoiceNumber=BA7123399&jlbz=lfISHfhqWHPj5fSzCwFKoP8c5ukwXecQt0fr4iL6ak">
BA7123399
<input type="hidden" value="BA7123399" name="invoiceChecked"/>
</a>
</td>
<td>Management fee (captured transactions)</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>2</td>
</tr>
<tr class="odd">
<td>Payments</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>2</td>
</tr>
<tr class="odd">
<td>Funds Transmission Cost (FTC)</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>1</td>
</tr>
<tr class="odd">
<td>Acquiring Authorisation Fees</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>2</td>
</tr>
<tr class="odd">
<td>Service</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>2</td>
</tr>
<tr class="odd">
<td>Refunds</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>1</td>
</tr>
<tr class="odd">
<td>Chargebacks</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>1</td>
</tr>
<tr class="odd">
<td>Minimum Billing</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>2</td>
</tr>
<tr class="even">
<td rowspan="4">
<a href="/accounting/billing/showInvoice.html?invoiceNumber=BA7123421&jlbz=lfISHfhqWHPj5fSzCwFKoP8c5ukwXecQt0fr4iL6ak">
BA7123421
<input type="hidden" value="BA7123421" name="invoiceChecked"/>
</a>
</td>
<td>Payments</td>
<td>ALEXAUTOMATION01</td>
<td>ALEXADCODE</td>
<td>1</td>
</tr>
<tr class="even">
<tr class="even">
<tr class="even">
<tr class="odd">
<td rowspan="8">
<a href="/accounting/billing/showInvoice.html?invoiceNumber=BA7123398&jlbz=lfISHfhqWHPj5fSzCwFKoP8c5ukwXecQt0fr4iL6ak">
BA7123398
<input type="hidden" value="BA7123398" name="invoiceChecked"/>
</a>
</td>
<td>Management fee (captured transactions)</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>1</td>
</tr>
<tr class="odd">
<tr class="odd">
<tr class="odd">
<tr class="odd">
<tr class="odd">
<tr class="odd">
<tr class="odd">
<tr class="even">
<td rowspan="10">
<a href="/accounting/billing/showInvoice.html?invoiceNumber=BA7123397&jlbz=lfISHfhqWHPj5fSzCwFKoP8c5ukwXecQt0fr4iL6ak">
BA7123397
<input type="hidden" value="BA7123397" name="invoiceChecked"/>
</a>
</td>
<td>Management fee (captured transactions)</td>
<td>PAYPALC001M2</td>
<td>PAYPALC001A1</td>
<td>2</td>
</tr>
<tr class="even">
<tr class="even">
<tr class="even">
<tr class="even">
<tr class="even">
<tr class="even">
<tr class="even">
<tr class="even">
<tr class="even">
</tbody>
You need some extra logic for this scenario, because the invoice number cell appears only in the first row of each invoice and is missing from the remaining rows of the same invoice. Please try the following.
Algorithm:
1. First, find all the row elements of the table.
2. Iterate over the row elements and look for the expected invoice number.
3. Once the invoice number matches, collect the charge type of each subsequent row until a row with the next invoice number is reached.
Code:
String invoiceNumber = ""; // set this to the invoice number to look up
List<String> chargeTypes = new ArrayList<>();
boolean isInvoiceSpecificCharge = false;
// Find all the row (tr) elements of the table
List<WebElement> rows = driver.findElements(By.xpath("//table/tbody/tr"));
for (WebElement row : rows) {
    // findElements returns an empty list (instead of throwing) when the row has no invoice link
    List<WebElement> links = row.findElements(By.xpath(".//a"));
    if (!links.isEmpty()) {
        // First row of an invoice: the invoice cell is td[1], so the charge type is in td[2]
        isInvoiceSpecificCharge = links.get(0).getText().trim().equalsIgnoreCase(invoiceNumber);
        if (isInvoiceSpecificCharge) {
            chargeTypes.add(row.findElement(By.xpath("./td[2]")).getText());
        }
    } else if (isInvoiceSpecificCharge) {
        // Continuation row of the matched invoice: the charge type is in td[1]
        chargeTypes.add(row.findElement(By.xpath("./td[1]")).getText());
    }
}
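The three steps of the algorithm can also be tried out without a browser. The sketch below is only an illustration in plain Java: each table row is modelled as a two-element array {invoiceNumberOrNull, chargeType}, where null stands for a row that does not repeat the invoice cell. The class name InvoiceCharges, the method chargeTypesFor and this row model are assumptions made for the example, not part of Selenium.

```java
import java.util.ArrayList;
import java.util.List;

class InvoiceCharges {
    // Each row is {invoiceNumberOrNull, chargeType}; null means "same invoice as the row above".
    static List<String> chargeTypesFor(List<String[]> rows, String invoiceNumber) {
        List<String> chargeTypes = new ArrayList<>();
        boolean inMatchingInvoice = false;
        for (String[] row : rows) {
            String rowInvoice = row[0];
            if (rowInvoice != null) {
                // A new invoice starts here: decide whether it is the one we want.
                inMatchingInvoice = rowInvoice.equalsIgnoreCase(invoiceNumber);
            }
            if (inMatchingInvoice) {
                chargeTypes.add(row[1]);
            }
        }
        return chargeTypes;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"BA7123399", "Management fee (captured transactions)"},
                new String[] {null, "Payments"},
                new String[] {null, "Funds Transmission Cost (FTC)"},
                new String[] {"BA7123421", "Payments"});
        System.out.println(chargeTypesFor(rows, "BA7123399"));
    }
}
```

The flag is only re-evaluated on rows that start a new invoice, which is what lets the continuation rows inherit the match from the row above them.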
I am new to Thymeleaf. I am trying to subtract the value of the Paid Amount column from the Total Amount column, but I get an error.
If I comment out the Remaining Amount column, the table renders without it. Here is the template:
<div th:if="${not #lists.isEmpty(cust)}">
<table border="1" style="width: 300px">
<thead>
<tr>
<th>Name</th>
<th>Address</th>
<th>Phone</th>
<th>Total Amount</th>
<th>Paid Amount</th>
<th>Remaining Amount</th>
</tr>
</thead>
<tbody>
<tr th:each="customer : ${cust}">
<td th:text="${customer.name}"></td>
<td th:text="${customer.address}"></td>
<td th:text="${customer.phone}"></td>
<td
th:with="result1=${#aggregates.sum(customer.customerDetails.![totalAmount])}">
<span th:text="${result1}"></span>
</td>
<td
th:with="result3=${#aggregates.sum(customer.payment.![paidAmount])}">
<span th:text="${result3}"></span>
</td>
<td th:with="result=${#aggregates.sum(customer.customerDetails.![totalAmount])}, result2=${#aggregates.sum(customer.payment.![paidAmount])}">
<span th:text="${result}- ${result2}"></span>
</td>
</tr>
</th:block>
</tbody>
</table>
</div>
I have a website that contains a table similar to this one (the real one is bigger):
</table>
<tr>
<td>
<table width="100%" cellspacing="-1" cellpadding="0" border="0" dir="rtl" style="padding-top: 25px;">
<tr>
<td align="right" style="padding-right: 25px;">
<span class="artist_name_txt">
name
<p class="diccografia">subname</p>
</span>
</td>
</tr>
</table>
</td>
</tr>
<tr>
<td>
<table width="100%" border="0" cellspacing="0" cellpadding="0" dir="rtl" style="padding-right: 25px; padding-left: 25px">
<tr>
<td class="songs" align="right">
number1
</td>
</tr>
<tr>
<td class="songs" align="right">
number2
.......
</td>
</tr>
</table>
I need an idea for how to parse the website and extract this table into two arrays:
one with the names, something like names{number1, number2},
and a second with the links, links{number1link, number2link}.
I have tried many approaches and none of them worked.
You should read the jsoup Cookbook; the selector syntax in particular is very powerful.
Here's an example:
final String html = ...
// use Jsoup.connect(url).get() instead if you are fetching the page from a website
Document doc = Jsoup.parse(html);
List<String> names = new ArrayList<>();
List<String> links = new ArrayList<>();
for( Element element : doc.select("a.artist_player_songlist") )
{
names.add(element.text());
links.add(element.attr("href"));
}
System.out.println("Names: " + names);
System.out.println("Links: " + links);
Output:
Names: [number1, number2]
Links: [/number1link, /number2link]
I have html code that is very similar to this:
<TH CLASS="ddtitle">MovieOne</TH>
<TABLE CLASS="datadisplaytable" ><CAPTION class="captiontext">Movies</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col" >Genre</TH>
<TH CLASS="ddheader" scope="col" >Time</TH>
<TH CLASS="ddheader" scope="col" >Days</TH>
<TH CLASS="ddheader" scope="col" >Where</TH>
<TH CLASS="ddheader" scope="col" >Date Range</TH>
<TH CLASS="ddheader" scope="col" >Seating</TH>
<TH CLASS="ddheader" scope="col" >Actors</TH>
</TR>
<TR>
<TD CLASS="dddefault">Action</TD>
<TD CLASS="dddefault">10:00 am - 12:00 pm</TD>
<TD CLASS="dddefault">SMTWTHFSA</TD>
<TD CLASS="dddefault">AMC Showplace</TD>
<TD CLASS="dddefault">Aug 20, 2014 - Sept 12, 2014</TD>
<TD CLASS="dddefault">Reservations</TD>
<TD CLASS="dddefault">Will Ferrel (<ABBR title= "Primary">P</ABBR>) target="Will Ferrel" ></TD>
</TR>
</TABLE>
<TH CLASS="ddtitle">MovieTwo</TH>
<TABLE CLASS="datadisplaytable" ><CAPTION class="captiontext">Movies</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col" >Genre</TH>
<TH CLASS="ddheader" scope="col" >Time</TH>
<TH CLASS="ddheader" scope="col" >Days</TH>
<TH CLASS="ddheader" scope="col" >Where</TH>
<TH CLASS="ddheader" scope="col" >Date Range</TH>
<TH CLASS="ddheader" scope="col" >Seating</TH>
<TH CLASS="ddheader" scope="col" >Actors</TH>
</TR>
<TR>
<TD CLASS="dddefault">Action</TD>
<TD CLASS="dddefault">11:00 am - 12:30 pm</TD>
<TD CLASS="dddefault">SMTWTHFSA</TD>
<TD CLASS="dddefault">Showplace Cinemas</TD>
<TD CLASS="dddefault">Aug 20, 2014 - Sept 12, 2014</TD>
<TD CLASS="dddefault">TBA</TD>
<TD CLASS="dddefault">Zach Galifinakis (<ABBR title= "Primary">P</ABBR>) target="Zach Galifinakis" ></TD>
</TR>
</TABLE>
<TH CLASS="ddtitle">MovieThree</TH>
<BR>
<BR>
Coming Soon
<BR>
What I want to be able to do, is take the individual table data that is relevant for the movie title, and if a Movie doesn't have a table I want to say the values are TBA. So far, I am able to get the relevant table information, but I am unable to skip a table. For example I use this code to get the genre of the movie:
int tcounter = 1;
for (Element elements : li) {
WebElement genre = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[1]"));
WebElement time = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[2]"));
WebElement days = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[3]"));
WebElement where = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[4]"));
WebElement date_range = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[5]"));
WebElement seating = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[6]"));
WebElement actors = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[7]"));
tcounter++;
}
elements refers to a list storing all links on the webpage
(result for [1] would be action, [2] would be 10:00 am - 12:00pm ...).
This is within a for loop that increments the value of the tcounter by 1 in order to receive the data for different tables. Is there a way I can be able to tell the program to see if a table is present under the TH class, and if not give the values TBA and skip it?
This is my second attempt based on siking's answer:
List<WebElement> listings = driver.findElements(By.className("ddtitle"));
String genre = "";
String time = "";
String days = "";
String where = "";
String dateRange = "";
String seating = "";
String actors = "";
for (WebElement potentialMovie : listings) {
    try {
        WebElement actualMovie = potentialMovie.findElement(By.xpath("//table[@class='datadisplaytable']"));
        // System.out.println("Actual: " + actualMovie.getText());
        // make all your assignments, for example:
        genre = actualMovie.findElement(By.xpath("/descendant::table//td")).getText();
        time = actualMovie.findElement(By.xpath("/descendant::table//td[2]")).getText();
        days = actualMovie.findElement(By.xpath("/descendant::table//td[3]")).getText();
        where = actualMovie.findElement(By.xpath("/descendant::table//td[4]")).getText();
        dateRange = actualMovie.findElement(By.xpath("/descendant::table//td[5]")).getText();
        seating = actualMovie.findElement(By.xpath("/descendant::table//td[6]")).getText();
        actors = actualMovie.findElement(By.xpath("/descendant::table//td[7]")).getText();
        System.out.println(genre+" "+time+" "+days+" "+where+" "+dateRange+" "+actors);
    } catch(Exception ex) {
        // there is no table, so:
        genre = "TBA";
    }
}
The problem with this code is that it keeps returning the values for only the first table.
I trimmed down your HTML sample to the following:
<TH CLASS="ddtitle">MovieOne</TH>
<TABLE CLASS="datadisplaytable">
<CAPTION class="captiontext">Movies</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col">Genre</TH>
</TR>
<TR>
<TD CLASS="dddefault">Action</TD>
</TR>
</TABLE>
<TH CLASS="ddtitle">MovieTwo</TH>
<BR/>
<BR/>
Coming Soon
<BR/>
<TH CLASS="ddtitle">MovieThree</TH>
<TABLE CLASS="datadisplaytable">
<CAPTION class="captiontext">Movies</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col">Genre</TH>
</TR>
<TR>
<TD CLASS="dddefault">Action</TD>
</TR>
</TABLE>
Hopefully it is representative of all your cases!
Don't use a counter; iterate over the actual WebElements instead:
// find all the listings on the page...
List<WebElement> listings = driver.findElements(By.className("ddtitle"));
// ... and iterate over them
for (WebElement listing : listings) {
    // default all your variables to TBA for each listing, like:
    String genre = "TBA";
    // grab whatever is the _first_ element under the TH (findElements returns an empty list if there is none) ...
    List<WebElement> next = listing.findElements(By.xpath("following-sibling::*[1]"));
    // ... and check that it has a child element CAPTION (findElement would throw instead of returning null)
    if (!next.isEmpty() && !next.get(0).findElements(By.xpath("caption")).isEmpty()) {
        // make all your assignments, for example (".//" also matches when the browser inserts a tbody):
        genre = next.get(0).findElement(By.xpath(".//tr[2]/td[1]")).getText();
    }
}
Please note that this code is untested, your mileage may vary!
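The core idea, looking only at the element that immediately follows each title and falling back to TBA otherwise, can be checked without Selenium. The sketch below models the page as a flat list of {tagName, text} pairs; that model, the class name ListingDefaults and the method genresByTitle are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ListingDefaults {
    // Each node is {tagName, text}: for "th" the text is the movie title,
    // for "table" it stands in for the genre cell of that table.
    static List<String> genresByTitle(List<String[]> nodes) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.size(); i++) {
            if (!nodes.get(i)[0].equals("th")) {
                continue;
            }
            String genre = "TBA"; // reset the default for every title
            // Only the element immediately after the title counts as "its" table.
            if (i + 1 < nodes.size() && nodes.get(i + 1)[0].equals("table")) {
                genre = nodes.get(i + 1)[1];
            }
            result.add(nodes.get(i)[1] + "=" + genre);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> nodes = List.of(
                new String[] {"th", "MovieOne"},
                new String[] {"table", "Action"},
                new String[] {"th", "MovieTwo"},
                new String[] {"br", ""},
                new String[] {"th", "MovieThree"},
                new String[] {"table", "Action"});
        System.out.println(genresByTitle(nodes));
        // prints [MovieOne=Action, MovieTwo=TBA, MovieThree=Action]
    }
}
```

Because the lookup is anchored to each title rather than to a global table counter, a title without a table cannot shift the data of the titles that come after it.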