Combine two JSoup Elements - java

I am using JSoup for the first time to parse two HTML elements selected by class. I can pull the data from each one successfully; the problem is formatting the output the way I want. The data is for a link hit counter.
The final result I want is something like
https://yadayadayada.com 1,
https://yadayadayada.com 4,
... etc
instead I am getting
https://yadayadayada.com https://yadayadayada.com 1, 4,
This is how I am getting my current output
Document doc = Jsoup.connect(link).get();
Elements links = doc.getElementsByClass("details shorlinkUrl");
Elements count = doc.getElementsByClass("highlight listUrl").append(",");
String counter = count.text();
String linkname = links.text();
System.out.println(linkname.toString()+count.toString());
String results = new StringBuilder(14).append(linkname).append(counter).toString();
Any ideas or direction are greatly appreciated!

When you call the text() method on an Elements object, you get the concatenated text of all elements in the collection. Instead, iterate over the individual elements, read the text of each one separately, and combine the values as needed.
Elements links = doc.getElementsByClass("details shorlinkUrl");
Elements count = doc.getElementsByClass("highlight listUrl");
if(links.size()!= count.size()) {
throw new IllegalStateException("Think about this situation");
}
for(int i = 0; i< links.size(); i++) {
System.out.println(links.get(i).text() + " " +count.get(i).text()+ ",");
}
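The same index-based pairing can be tried without a network call. The following is a minimal sketch that uses plain List<String> values as stand-ins for the text of each jsoup Element; the class name PairByIndex, the method pairLines, and the sample data are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PairByIndex {
    // Pairs the i-th link with the i-th count, producing one formatted line per pair.
    // The two lists stand in for links.get(i).text() and count.get(i).text().
    static List<String> pairLines(List<String> links, List<String> counts) {
        if (links.size() != counts.size()) {
            throw new IllegalStateException("links and counts differ in size");
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < links.size(); i++) {
            lines.add(links.get(i) + " " + counts.get(i) + ",");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample data modelled on the question.
        List<String> links = Arrays.asList("https://yadayadayada.com", "https://yadayadayada.com");
        List<String> counts = Arrays.asList("1", "4");
        for (String line : pairLines(links, counts)) {
            System.out.println(line);
        }
    }
}
```

Failing fast on a size mismatch is one possible policy; skipping unpaired entries would be another.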

Related

How do I select a specific element from a set with similar XPath paths?

There are two drop-down lists in different modules, and each contains a similar value, for example "Jorge". When I need to fill in the list that is lower in the tree, the XPath matches the first occurrence, which belongs to the list that is not expanded.
To be precise, the match is on the values inside the drop-down lists, not on the lists themselves.
I wanted to implement it in Java this way:
Example:
if (findElement(By.xpath("(//example//example)")).isDisplayed()) {
findElement(By.xpath("(//example//example)")).click();
}
But in this case, the element is not displayed.
How can I search through all elements that match the XPath and get the one that is displayed?
I tried indexing the matches like this: (//example//example)[1], (//example//example)[2], (//example//example)[3]
In my case: [1] does not exist; [2] exists but is not displayed (isDisplayed = false); [3] exists and is displayed (isDisplayed = true).
Looping over the index [n] does not work, because, for example, element [1] does not exist.
I described this in the most complicated way possible :D. Sorry.
If someone understands what I mean, please help me. How can I implement this?
UPD:
The problem was solved (for me) by substituting the first value into the expression ()"{1}" immediately.
Now I'm interested in why I get an exception after the first iteration:
Method threw 'org.openqa.selenium.ElementNotInteractableException' exception.
Code:
int number = 1;
String option = "(//ul[contains(@style, 'display: block')]//li//span[contains(text(),'" + valueField + "') or strong[contains(text(),'" + valueField.toUpperCase() + "')]])";
findElement(By.xpath(option+"["+number+"]"));
String[] words = valueField.split(" ");
StringBuilder builder = new StringBuilder();
for (int i = 0; i < words.length; i++) {
builder.append(words[i]);
setFieldByLabel(nameModule, nameLabel, builder.toString());
fastWaitLoading();
for (int y = 0; y < 10; y++) {
if (findElement(By.xpath(option+"["+number+"]")).isDisplayed()) {
new Actions(browser.getWebDriver())
.moveToElement(findElement(option))
.click()
.build()
.perform();
break;
}
number++;
}
}
I don't fully understand your question, but for a situation like this I would recommend collecting all matches into a list with findElements(By.xpath(...)).
That gives you a list of WebElements to iterate over. In a for-each loop, check whether each element is displayed (it is guaranteed to exist, since findElements found it); the first displayed one is the element you should be able to interact with.
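The "collect all matches, then pick the first displayed one" idea can be sketched without a browser. In the sketch below, FakeElement is a hypothetical stand-in for Selenium's WebElement, and FirstDisplayed and firstMatching are illustrative names. With Selenium, the candidate list would come from findElements(By.xpath(option)), which returns an empty list rather than throwing when nothing matches, and the predicate would be WebElement::isDisplayed.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

class FirstDisplayed {
    // Hypothetical stand-in for a located element (WebElement in Selenium).
    static final class FakeElement {
        final String label;
        final boolean displayed;

        FakeElement(String label, boolean displayed) {
            this.label = label;
            this.displayed = displayed;
        }

        boolean isDisplayed() {
            return displayed;
        }
    }

    // Returns the first candidate that passes the visibility check, or empty if none does.
    static <T> Optional<T> firstMatching(List<T> candidates, Predicate<? super T> isVisible) {
        for (T candidate : candidates) {
            if (isVisible.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Because the loop runs over the elements that were actually found, there is no need to guess which [n] indexes exist.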
Yes, it was in plain sight and I missed it. The click was sent to the un-indexed XPath:
new Actions(browser.getWebDriver()).moveToElement(findElement(option)).click().build().perform(); break;
Here is the fix, using the indexed XPath:
new Actions(browser.getWebDriver())
.moveToElement(findElement(option + "[" + number + "]"))
.click()
.build()
.perform();
break;

Capturing an element on its nth instance of an xpath slows down the code after many runs

I have written an XPath for an element whose matching instance changes from run to run. Say I have 100 values from an Excel file to automate one by one: the instance of the element is 1 for the first value, and for the next it can be 2, 3, or 1. It can be any index up to the count stored in closePaths. The search for the element keeps getting slower as I go down the list. Below is my code; xpathNumber appends the instance index to the XPath on every iteration.
driver.switchTo().defaultContent();
wait.until(ExpectedConditions.frameToBeAvailableAndSwitchToIt(0));
int closePaths = driver.findElements(By.xpath("//*[text()[contains(.,'Message Map Data - View and Edit the Message Map Data')]]/following-sibling::div[@id='fb_buttons']/span[2]")).size();
for (int j = 1; j <= closePaths; j++) {
String closePath = "(//*[text()[contains(.,'Message Map Data - View and Edit the Message Map Data')]]/following-sibling::div[@id='fb_buttons']/span[2])";
String xpathNumber = "[" + j + "]";
closePath = closePath + xpathNumber;
try {
driver.findElement(By.xpath(closePath)).click();
} catch (Exception e) {}
}
driver.switchTo().defaultContent();
wait.until(ExpectedConditions.frameToBeAvailableAndSwitchToIt(0));
wait.until(ExpectedConditions.frameToBeAvailableAndSwitchToIt(2));
driver.findElement(By.id("navFind1")).click();

Verifying the data from two arraylist and delete it

I have a master ArrayList called toBeDeleted1 that stores a timestamp and an email. Sample data inside toBeDeleted1:
[1507075234, bunny@outlook.com]
I have another ArrayList called logData1 that stores status, email, timestamp, and ID. Sample data inside logData1:
[16, bunny@outlook.com, 1507075234, 0OX9VQB-01-00P-02]
I want to delete records from logData1 by first comparing each timestamp with the timestamps in toBeDeleted1; if the timestamp matches, I then compare the email. If both match, I want to delete the whole record (status, email, timestamp, ID). But I can't make it work.
This is the sample output from my program:
[16, bunny@outlook.com, 1507075234, 0OX9VQB-01-00P-02]
The data inside toBeDeleted1 is :[1507075234, bunny@outlook.com]
The time1 is :1507075234
The email1 is :bunny@outlook.com
The time is :1507075234
The emails is :bunny@outlook.com
The data is :bunny@outlook.com
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -3
at java.util.ArrayList.elementData(Unknown Source)
at java.util.ArrayList.get(Unknown Source)
at EmailReporting.main(EmailReporting.java:83)
This is my sample program
System.out.println(logData1);
System.out.println("The data inside toBeDeleted1 is :"+toBeDeleted1);
for(int v = 0;v<toBeDeleted1.size();v++) //look through the logdata1 for removing the record base on timestamp
{
String time1 = toBeDeleted1.get(v);
String email1 = toBeDeleted1.get(v+1);
System.out.println("The time1 is :"+time1);
System.out.println("The email1 is :"+email1);
for(int f = logData1.size();f>logData1.size()-1;f--)
{
// System.out.println(logData1.size());
// System.out.println("The data in logdata1 is "+logData1.get(f-2));
if(time1.equals(logData1.get(f-2)))
{
System.out.println("The time is :"+logData1.get(f-2));
System.out.println("The emails is :"+logData1.get(f-3));
if(email1.equals(logData1.get(f-3)))
{
System.out.println("The data is :"+logData1.get(f-3));
logData1.remove(f-1);
logData1.remove(f-2);
logData1.remove(f-3);
logData1.remove(f-4);
f-=4;
}
}
}
}
The error occurred after this line of code executed
System.out.println("The data is :"+logData1.get(f-3));
You can find elements in the list in order using Collections.indexOfSubList:
List<String> toFind = Arrays.asList(email1, time1); // same order as in logData1: email, then timestamp
int emailIndex = Collections.indexOfSubList(logData1, toFind);
A similar lastIndexOfSubList method also exists. That might be more appropriate for your use case.
You can then use this index to remove the matching record from logData1:
int emailIndex = Collections.lastIndexOfSubList(logData1, toFind);
if (emailIndex >= 1) {
logData1.subList(emailIndex-1, emailIndex+3).clear();
}
Just do this in a loop to keep going until all occurrences have been removed.
Note that just doing this in a loop naively will keep on searching over the tail of the list repeatedly. Instead, you can use subList to "chop" the end of the list, to avoid re-searching it:
List<String> view = logData1;
int emailIndex;
while ((emailIndex = Collections.lastIndexOfSubList(view, toFind)) >= 1) {
logData1.subList(emailIndex-1, emailIndex+3).clear();
view = logData1.subList(0, emailIndex-1);
}
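Put together, the subList-based loop might look like the sketch below. It assumes logData1 is a flat list of 4-element records laid out as [status, email, timestamp, ID], as in the question's sample, so the search target is the pair (email, timestamp) in that order. The class name RemoveRecords and the method removeMatching are illustrative, not part of the original code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RemoveRecords {
    // Removes every record [status, email, timestamp, id] whose email and
    // timestamp match. Assumes the list length is a multiple of 4 and that the
    // (email, timestamp) pair only ever appears at positions 1 and 2 of a record.
    static void removeMatching(List<String> logData, String email, String time) {
        List<String> toFind = Arrays.asList(email, time);
        List<String> view = logData;
        int emailIndex;
        while ((emailIndex = Collections.lastIndexOfSubList(view, toFind)) >= 1) {
            // Drop status (emailIndex - 1) through id (emailIndex + 2) in one call.
            logData.subList(emailIndex - 1, emailIndex + 3).clear();
            // Search only the part of the list before the removed record next time.
            view = logData.subList(0, emailIndex - 1);
        }
    }
}
```

A fresh view is taken after each clear() because a structural change to the backing list invalidates earlier subList views.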
Additionally, note that deleting from the middle of an ArrayList is inefficient, because the elements after the ones you delete have to be shifted down. This is why using subList(...).clear() is better, because it does all of those shifts at once. But if you are removing lots of 4-element batches, you can do better.
Instead of calling subList(...).clear() for each match, you can mark the indices of the elements to be deleted in a BitSet:
List<String> view = logData1;
BitSet bits = new BitSet(logData1.size());
int emailIndex;
while ((emailIndex = Collections.lastIndexOfSubList(view, toFind)) >= 1) {
bits.set(emailIndex-1, emailIndex+3);
view = logData1.subList(0, emailIndex-1);
}
And then shift all the elements down at once, discarding the elements you want to delete:
int dst = 0;
for (int src = 0; src < logData1.size(); ++src) {
if (!bits.get(src)) {
logData1.set(dst++, logData1.get(src));
}
}
And now truncate the list:
logData1.subList(dst, logData1.size()).clear();
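The mark, shift, and truncate steps can be combined into one method. This is a sketch under the same assumption as the question's sample data, namely that each record is laid out as [status, email, timestamp, ID]; the names CompactRecords and removeMatching are illustrative. Note that the final truncation needs clear() on the tail view to actually shrink the list.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

class CompactRecords {
    static void removeMatching(List<String> logData, String email, String time) {
        List<String> toFind = Arrays.asList(email, time);
        BitSet doomed = new BitSet(logData.size());
        List<String> view = logData;
        int emailIndex;
        // Mark phase: flag the four slots of every matching record.
        // The list is not structurally modified here, so the views stay valid.
        while ((emailIndex = Collections.lastIndexOfSubList(view, toFind)) >= 1) {
            doomed.set(emailIndex - 1, emailIndex + 3);
            view = logData.subList(0, emailIndex - 1);
        }
        // Shift phase: copy each surviving element down exactly once.
        int dst = 0;
        for (int src = 0; src < logData.size(); ++src) {
            if (!doomed.get(src)) {
                logData.set(dst++, logData.get(src));
            }
        }
        // Truncate phase: drop the now-unused tail.
        logData.subList(dst, logData.size()).clear();
    }
}
```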

jsoup multi element output

Hello, I am trying to print the output of two element selections together.
Document document2 = Jsoup.parse(webPage2);
Document document22 = Jsoup.parse(webPage2);
Elements links2 = document2.select("a.yschttl");
Elements links22 = document22.select("div.abstr");
Can one selection include both a.yschttl and div.abstr, or...
for (Element link2 : links2) {
out.println(link2);
}
Can both links2 and links22 be iterated in the same for loop?
If not, how can I achieve this?
You can do something like:
for (int i = 0; i < links2.size(); i++) {
out.println(links2.get(i));
out.println(links22.get(i));
}
But in this case you will get an IndexOutOfBoundsException if links22 has fewer elements than links2.
What do you want to achieve?
If you are just trying to select both at the same time, you can do something like this:
for (Element link : document.select("a.yschttl, div.abstr")) {
out.println(link);
}
If you are trying to make two selections and output their values in tandem, you will have to do something like what @vacuum suggests, while being careful about the lengths of the lists.
As a side note, you don't have to parse the document twice to make two selections. You can parse once and select twice.
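One way to be careful about the list lengths is to stop at the shorter list. The sketch below uses plain List<String> values as stand-ins for the two jsoup selections; TandemOutput and interleave are illustrative names, and dropping the unpaired tail of the longer list is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.List;

class TandemOutput {
    // Interleaves titles and abstracts pair by pair. Stops at the shorter list,
    // so get(i) can never run past the end of either one.
    static List<String> interleave(List<String> titles, List<String> abstracts) {
        int pairs = Math.min(titles.size(), abstracts.size());
        List<String> out = new ArrayList<>();
        for (int i = 0; i < pairs; i++) {
            out.add(titles.get(i));
            out.add(abstracts.get(i));
        }
        return out;
    }
}
```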

ArrayOutOfIndex... Parsing URLs but don't know exact index for each HTML page?

I am parsing some URL links with the following..
Document jsDoc2 = null;
try {
jsDoc2 = Jsoup.connect(url).get();
Elements thumbs = jsDoc2.select("div.latest-media-images img.latestMediaThumb");
List<String> thumbLinks = new ArrayList<String>();
for(Element thumb : thumbs) {
thumbLinks.add(thumb.attr("src"));
}
for(String thumb : thumbLinks) {
url0 = thumbLinks.get(0);
url1 = thumbLinks.get(1);
url2 = thumbLinks.get(2);
Log.e("URL0", url0);
Log.e("URL1", url1);
Log.e("URL2", url2);
After testing the code with multiple sources, I ran into a problem.
09-19 20:59:56.710: ERROR/AndroidRuntime(7793): Caused by: java.lang.IndexOutOfBoundsException: Invalid index 2, size is 2
09-19 20:59:56.710: ERROR/AndroidRuntime(7793): at java.util.ArrayList.throwIndexOutOfBoundsException(ArrayList.java:255)
09-19 20:59:56.710: ERROR/AndroidRuntime(7793): at java.util.ArrayList.get(ArrayList.java:308)
Sometimes there aren't three links available, and of course when I ask for index 2 and it doesn't exist, the app force-closes.
How can I code defensively, or what is a better way to implement this? I don't know in advance how many URLs will be parsed and loaded into the list.
EDIT:
My code now:
for(int i = 0; i< thumbLinks.size(); i++) {
Log.e("URL" + i, thumbLinks.get(i));
url0 = thumbLinks.get(i);
url1 = thumbLinks.get(i);
//Fix index out of bounds exception
url2 = thumbLinks.get(i);
}
Think this might work better for your loop structure
for(int i = 0; i< thumbLinks.size(); i++) {
Log.e("URL" + i, thumbLinks.get(i));
}
You could also just shorten this to something like:
Elements thumbs = jsDoc2.select("div.latest-media-images img.latestMediaThumb");
int index = 0;
for(Element thumb : thumbs) {
Log.e("URL" + index, thumb.attr("src"));
index++;
}
This second option produces the same result with one loop instead of two. Neither version can throw an index-out-of-bounds error, because each one only touches entries that actually exist in the list.
Your code fails because you request explicit indexes and essentially ignore the loop structure altogether. The point of a loop is that you don't have to get objects from the list by explicit index and risk out-of-bounds exceptions.
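If separate variables such as url0, url1, and url2 are still needed, a bounds-checked lookup is one defensive option. This is a minimal sketch; SafeIndex and getOrDefault are hypothetical helper names, not part of jsoup or Android.

```java
import java.util.List;

class SafeIndex {
    // Returns list.get(index) when the index is valid, otherwise the fallback.
    static <T> T getOrDefault(List<T> list, int index, T fallback) {
        if (index >= 0 && index < list.size()) {
            return list.get(index);
        }
        return fallback;
    }
}
```

For example, url2 = SafeIndex.getOrDefault(thumbLinks, 2, null) would leave url2 as null when only two thumbnails were found, so the caller has to handle that case.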
