Jsoup getting value of tag - java

I am using Jsoup to read all the elements in the HTML, loop through them, and do something based on the type of each element.
I'm not having any luck; I can't find the proper method to check the value of each element.
Any suggestions?
This is my latest attempt:
Elements a = doc.getAllElements();
for(Element e: a)
{
if( e.val().equals("td"))
{
System.out.println("TD");
}
else if(e.equals("tr"))
{
System.out.println("TR");
}
}
This does not print anything.

Try this one:
Elements tdElements = doc.getElementsByTag("td");
for(Element element : tdElements )
{
//Print the value of the element
System.out.println(element.text());
}

It is better to select the elements by their tags:
Elements tdTags = doc.select("td");
Elements trTags = doc.select("tr");
// Loop over all tdTags - you can do the same with trTags
for( Element element : tdTags )
{
System.out.println(element); // print the element
}

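e.tag() will do that. It returns the element's Tag object; e.tagName() returns the name as a String, so e.tagName().equals("td") is the comparison the question was after.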
Elements tdElements = doc.getElementsByTag("td");
for(Element element : tdElements )
{
//Print the tag of the element
System.out.println(element.tag());
}

Related

How can we get index of a WebElement (from a list) passed in a function?

List<WebElement> deleteBtn = driver.findElements(By.xpath("//div[@class='btn']//div[@class='deleteUsers']"));
public void clickDeleteBtn(WebElement element) {
element.click();
/* Here I want to retrieve the index of the element passed in the function */
}
main() {
clickDeleteBtn(deleteBtn.get(5));
}
Suppose the findElements() above gives me a list of 10 WebElements and I pass element indexed 5 in clickDeleteBtn(). How, in the function, can I get the index of the element passed in?
I have tried element.toString() but it only gives me:
Element: [[ChromeDriver: chrome on WINDOWS (f4f6be3ed1e2a964a2dc8f0d848d3e87)] -> xpath: //div[@class='btn']//div[@class='deleteUsers']]
That contains no information about the index of the element.
I'd really appreciate your advice! Thanks
A WebElement itself doesn't know its index inside a list you stored it in.
The only way to find the index of a given WebElement object is to iterate over the list and compare the given WebElement against the current one in the list, as follows:
// Returns the index of element in list, or -1 if it is not present
static int indexOf(WebElement element, List<WebElement> list) {
for (int i = 0; i < list.size(); i++) {
if (element.equals(list.get(i))) {
return i;
}
}
return -1; // reached only after the whole list has been checked
}
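The same equals-based scan is what List.indexOf already performs, so the loop can be delegated to the list itself. The following is a minimal sketch with plain strings standing in for WebElement objects; the class and method names are hypothetical, and it assumes the element type implements equals() meaningfully.

```java
import java.util.Arrays;
import java.util.List;

class IndexLookup {
    // Returns the position of target in items, or -1 if it is absent.
    // List.indexOf compares with equals(), like the manual loop does.
    static <T> int indexOfElement(List<T> items, T target) {
        return items.indexOf(target);
    }

    public static void main(String[] args) {
        // Strings stand in for the WebElements returned by findElements()
        List<String> buttons = Arrays.asList("delete-0", "delete-1", "delete-2");
        System.out.println(indexOfElement(buttons, buttons.get(2)));
        System.out.println(indexOfElement(buttons, "missing"));
    }
}
```

If the caller already knows the index, as in clickDeleteBtn(deleteBtn.get(5)), passing the index alongside the element avoids the search entirely.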

Selenium: Select all the elements on the page containing any text

I want to select all the elements on the page containing any text.
I only want elements that contain text themselves, not parent elements whose text lives solely in their child elements.
This XPath matches elements containing any non-empty text:
//*[text() != ""]
However this
List<WebElement> list = driver.findElements(By.xpath("//*[text() != '']"));
gives me a list of all elements containing text either themselves or in their child elements.
I can iterate over this list with something like the following to collect the elements that actually contain text themselves into the list real:
List<WebElement> real = new ArrayList<>();
JavascriptExecutor js = (JavascriptExecutor) driver;
for (WebElement element : list) {
// Concatenate only the element's direct text nodes (requires jQuery on the page)
String text = (String) js.executeScript("""
return jQuery(arguments[0]).contents().filter(function() {
return this.nodeType == Node.TEXT_NODE;
}).text();
""", element);
if (text.length() > 0) {
real.add(element);
}
}
But this is a kind of workaround.
I'm wondering whether there is a way to get the list of elements that actually contain text directly, or at least more elegantly.
List<WebElement> elementsWithOwnText = new ArrayList<WebElement>();
List<WebElement> allElements = driver.findElements(By.xpath("//*"));
for (WebElement element: allElements) {
List<WebElement> childElements = element.findElements(By.xpath(".//*"));
String text = element.getText();
if (childElements.size() == 0 && text.length() > 0) {
elementsWithOwnText.add(element);
}
}
Be aware of org.openqa.selenium.StaleElementReferenceException. While looping over allElements, any of them may no longer be attached to the page document (with dynamic content, for example).
You can try this; it selects all leaf elements with text:
List<WebElement> list = driver.findElements(By.xpath("//*[not(child::*) and text()]"));
for (WebElement webElement : list)
System.out.println(webElement.getText());
Until you find the XPath that you need, as a temporary solution, I would recommend trying the iteration below too (even though it is not as efficient as a direct XPath).
In my case it took 1 minute to evaluate 700 nodes with text, and it returned 152 elements that have their own text:
public static List<WebElement> getElementsWithText(WebDriver driver) {
return driver.findElements(By.xpath("//*[normalize-space() != '']"))
.stream().filter(element -> doesParentHaveText(element))
.collect(Collectors.toList());
}
private static boolean doesParentHaveText(WebElement element) {
try {
String text = element.getText().trim();
List<WebElement> children = element.findElements(By.xpath("./*"));
for (WebElement child: children) {
text = text.replace(child.getText(), "").trim();
}
// replaceAll takes a regex; replace would look for the literal string
return text.replaceAll("[\\n\\t\\r]", "").trim().length() > 0;
} catch (WebDriverException e) {
return false; // in case something goes wrong while reading the text; you can throw an error here instead
}
}
This could help:
List<String> elements = driver.findElements(By.xpath("//a")).stream().map(productWebElement -> productWebElement.getText()).distinct().collect(Collectors.toList());
// Print count of product found
System.out.println("Total unique product found : " + elements.size());
// Printing product names
System.out.println("All product names are : ");
elements.forEach(name -> System.out.println(name));

How to stop for each loop running on web elements list in selenium

I created a method that runs over a list of web elements and returns the index of the expected value in the list.
The method compares each element's attribute value with the expected value.
It works, but it keeps running until the end of the list. Is adding break; after index = iterator the only way to stop the loop?
public Integer indexInList(String expectedValue,List<WebElement> dropdownOptions,String attributeValue) throws Exception {
Integer index = -1; // -1 meaning not found in list
int iterator = 0; // run on the list
for (WebElement element : dropdownOptions )
{
if(element.getAttribute(attributeValue).equals(expectedValue))
{
index = iterator;
}
iterator ++;
}
return index;
}
Yes, you need to add break; after the line index = iterator.
No. Apart from adding break; after index = iterator, you can put return index; in the same position instead.
Return the iterator as soon as the for loop finds the desired element:
public Integer indexInList(String expectedValue,List<WebElement> dropdownOptions,String attributeValue) throws Exception {
int iterator = 0; // run on the list
for (WebElement element : dropdownOptions )
{
if(element.getAttribute(attributeValue).equals(expectedValue))
{
return iterator;
}
iterator ++;
}
return -1; // -1 meaning not found in list
}
The above is slight modification of your code. You don't need the index variable, you can operate on iterator.
Return the iterator when the desired element is found and it will stop for loop.
If none of the elements satisfies the condition, return -1
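The early-return pattern can be shown without Selenium. This is a minimal sketch, assuming a generic list and a caller-supplied condition; firstIndexMatching and the sample values are hypothetical names, not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FirstMatch {
    // Returns the index of the first item satisfying the condition, or -1.
    // The return inside the loop ends the iteration at the first match.
    static <T> int firstIndexMatching(List<T> items, Predicate<? super T> condition) {
        for (int i = 0; i < items.size(); i++) {
            if (condition.test(items.get(i))) {
                return i;
            }
        }
        return -1; // no item matched
    }

    public static void main(String[] args) {
        // Strings stand in for the attribute values of the dropdown options
        List<String> values = Arrays.asList("red", "green", "green", "blue");
        System.out.println(firstIndexMatching(values, v -> v.equals("green")));
    }
}
```

With duplicates in the list, this returns the first matching index, whereas the original loop without break ends up with the last one.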

Java - Executing two corresponding for loops at same time

I have two for loops written in Java below. The first one grabs all the titles of news articles on a website, and the second one grabs all the links of the same news articles on the same website.
How do I make it so that when the first loop executes once, the second loop executes once, and then the first loop executes a second time, and the second one executes a second time, etc. I would really appreciate your help, thanks.
for( org.jsoup.nodes.Element element : elements1 ){
sendMessageRequest.setText(element.text());
sendMessage(sendMessageRequest);
System.out.print("sent message");
}
for( org.jsoup.nodes.Element element : elements2 ) {
sendMessageRequest.setText(element.text());
sendMessage(sendMessageRequest);
System.out.print("sent message");
}
I'm going to assume that elements1 and elements2 are some kind of Iterable<Element>, e.g. List<Element>.
First, remember that for (Element element : elements1) is just syntactic sugar for:
Iterator<Element> iter = elements1.iterator();
while (iter.hasNext()) {
Element element = iter.next();
// code here
}
Except that you don't have access to the Iterator.
So, if you want to iterate two different Iterable objects, do so the old-fashioned way:
Iterator<Element> iter1 = elements1.iterator();
Iterator<Element> iter2 = elements2.iterator();
while (iter1.hasNext() && iter2.hasNext()) {
Element element1 = iter1.next();
Element element2 = iter2.next();
// code here
}
If the two Iterable objects are not the same length, the loop will only iterate until the shorter one has been exhausted. Extra elements in the other one will simply be ignored.
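The two-iterator loop can be tried in isolation. This is a minimal sketch with strings standing in for the Jsoup elements; the interleave helper and the sample data are hypothetical and only illustrate the order in which items are visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LockstepDemo {
    // Walks both sequences in step and records the visiting order:
    // first[0], second[0], first[1], second[1], ...
    // Stops as soon as either sequence is exhausted.
    static List<String> interleave(Iterable<String> first, Iterable<String> second) {
        List<String> visited = new ArrayList<>();
        Iterator<String> iter1 = first.iterator();
        Iterator<String> iter2 = second.iterator();
        while (iter1.hasNext() && iter2.hasNext()) {
            visited.add(iter1.next());
            visited.add(iter2.next());
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("title1", "title2", "title3");
        List<String> links = Arrays.asList("link1", "link2");
        // The unmatched third title is skipped because links is shorter
        System.out.println(interleave(titles, links));
    }
}
```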
If elements1 and elements2 are guaranteed to have the same length, just iterate through them in one loop (Elements is a List, so use size() and get(i)):
for (int i = 0; i < elements1.size(); i++) {
processMessageRequest(elements1.get(i));
processMessageRequest(elements2.get(i));
}
Using a new method processMessageRequest to make your code more DRY:
private void processMessageRequest(Element e) {
sendMessageRequest.setText(e.text());
sendMessage(sendMessageRequest);
System.out.println("sent message");
}
I'm not sure what the scope of sendMessageRequest is... but with some tweaking this way could work.

compare jsoup elements

Hello, I am trying to compare one Jsoup element with all other elements, and if two elements are equal I need to do count++. In this case I need to compare all elements in links1 with all elements in links2, links3, links4, and so on.
Document document1 = Jsoup.parse(webPage1);
Elements links1 = document1.select("example");
Document document2 = Jsoup.parse(webPage2);
Elements links2 = document2.select("example");
Document document3 = Jsoup.parse(webPage3);
Elements links3 = document3.select("example");
Document document4 = Jsoup.parse(webPage4);
Elements links4 = document4.select("example");
What would the code be, in JSP?
Elements is just a List of Element, so comparing will look like:
for (Element element : links1) {
if(links2.contains(element)){
count++;
}
//maybe do the same thing with links3 links4.
}
If you want to do it in JSP, that is another question.
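One caveat with contains(): depending on the jsoup version, Element.equals may test object identity rather than content, in which case elements parsed from different documents never match. Comparing a string form of each element, such as its outerHtml(), sidesteps this. The sketch below assumes those strings have already been extracted; countShared and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CommonCount {
    // Counts how many items of base also occur in at least one of the other lists.
    static int countShared(List<String> base, List<List<String>> others) {
        Set<String> seen = new HashSet<>();
        for (List<String> other : others) {
            seen.addAll(other); // pool links2, links3, links4, ... into one set
        }
        int count = 0;
        for (String item : base) {
            if (seen.contains(item)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Strings stand in for the outerHtml() of each selected element
        List<String> links1 = Arrays.asList("<a>x</a>", "<a>y</a>", "<a>z</a>");
        List<String> links2 = Arrays.asList("<a>y</a>");
        List<String> links3 = Arrays.asList("<a>z</a>", "<a>w</a>");
        System.out.println(countShared(links1, Arrays.asList(links2, links3)));
    }
}
```

Pooling the other lists into a HashSet keeps each lookup constant-time instead of rescanning every list for every element.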
