Java ScriptEngine and parent window functions

I am using the Java ScriptEngine to call a JS function from a backing bean under certain conditions in the program. The catch is that the JSF page is running in an iFrame. Although the iFrame is not the real problem at this stage, I want it to be able to browse to a new page. For this, I have used top.window.location and parent.location and they work on a button click to load the page out of the iFrame without a problem.
This is a sample of the code I'm using for this:
ScriptEngine se = new ScriptEngineManager().getEngineByName("JavaScript");
se.eval("function someFunc(){parent.location = \"www.someurl.com\";}");
Invocable invocableJS = (Invocable) se;
invocableJS.invokeFunction("someFunc");
With parent.location and top.window.location, I have read that they are provided by the browser itself and are not part of JavaScript. Because of this, I get the following error:
javax.script.ScriptException: sun.org.mozilla.javascript.internal.EcmaError: ReferenceError: "parent" is not defined. (<Unknown source>#1) in <Unknown source> at line number 1
It does exactly the same with document.getElementById("someComponent") and other similar calls. And the error stays the same (sun.org.mozilla...) whether I use Chrome, IE, or FF.
So the question: how can I get the JS function to use parent.location? I imagine that I would need to check which browser is used and then import or call something depending on the browser, but I am not sure how to go about it. Any light that can be shed on this problem will be very helpful.
Thanks in advance.

I have solved this problem by checking for the cookies before the redirect in a JavaScript method, and simply calling that method before the page loads using onload="someMethod();" in the body tag. Everything worked out fine, using parent.location to navigate to the new page. Tested in FF, Chrome, and IE. There was thus also no need for the ScriptEngine in this situation.
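For anyone hitting the same error: the root cause is that the embedded ScriptEngine evaluates pure ECMAScript with no browser attached, so window, parent, document, etc. simply do not exist unless you bind them yourself. A minimal sketch of this, assuming the JDK's built-in "JavaScript" engine:

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class NoBrowserObjects {
    public static void main(String[] args) throws Exception {
        ScriptEngine se = new ScriptEngineManager().getEngineByName("JavaScript");
        se.eval("var x = 1 + 1;");                     // fine: plain ECMAScript
        System.out.println(se.eval("typeof parent"));  // prints "undefined": no browser objects here
        // se.eval("parent.location = 'x';")           // would throw the ReferenceError shown above
    }
}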

Related

Issue with HtmlUnitDriver's frame switch in Java / Issue with switching to a 'srcdoc' based iframe

I have a Java program to scrape a web page, using the selenium-server-standalone-3.9.1 jar.
I create an org.openqa.selenium.htmlunit.HtmlUnitDriver to fetch the page, which contains 2 iframes. JavascriptEnabled is true.
I can access the page and parse it without issues, but for some reason switching to those frames doesn't seem to work. Here is a snippet of the code.
driver.switchTo().defaultContent();
// The line below works fine, so I'm assuming the switch is succeeding.
// FYI: if I give a bogus frame name here, it does raise an exception,
// which indicates that the driver is recognizing the frame OK.
driver.switchTo().frame(driver.findElement(By.id("xyz"))); // no exceptions raised
// Here, driver.getPageSource() shows no content, just an empty document:
// <html>
//   <head/>
//   <body/>
// </html>
// So I can't get any of the actually existing elements inside the iframe here;
// i.e. findElements() etc. would return nothing.
driver.switchTo().defaultContent();
driver.getPageSource(); // shows the full page OK again
I don't believe it's a page-loading issue, as the full source dump from the driver does show the contents of those embedded iframes. So I don't think a "wait" would help here.
What am I missing here?
Tried using WebDriverWait to wait for the frames to be available etc., with no luck.
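For anyone attempting the same, the wait-based pattern referred to above looks roughly like this (a sketch only; "xyz" is the frame id from the snippet, and frameToBeAvailableAndSwitchToIt both waits for the frame and switches to it - though, as noted, it did not help in this case):

import org.openqa.selenium.By;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

WebDriverWait wait = new WebDriverWait(driver, 10); // timeout in seconds
wait.until(ExpectedConditions.frameToBeAvailableAndSwitchToIt(By.id("xyz")));
// driver.getPageSource() here still returned the empty <html><head/><body/></html>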

Customize the browser icon when running a Selenium session?

I have some Selenium sessions where, if certain events occur, I spawn a new browser and leave the old one as is, so I can manually intervene later on. The problem is that it is hard to distinguish such a deserted browser session from the one that is currently running.
Ideally I would like to add a badge to the browser icon that is displayed in the application switcher (cmd-tab) and the dock (but other solutions/suggestions are also welcome, like adding something to the name of the browser). Is that possible?
Using Java on a Mac. A solution can be platform specific.
You can use execute_script as shown below (this Python code has a straightforward Java equivalent):
from selenium import webdriver
import time

driver = webdriver.Chrome()
driver.get("https://stackoverflow.com/questions/9943771/adding-a-favicon-to-a-static-html-page")
head = driver.find_element_by_tag_name("head")
link = driver.find_element_by_css_selector('link[rel="shortcut icon"]')
driver.execute_script('''var link = document.createElement("link");
link.setAttribute("rel", "icon");
link.setAttribute("type", "image/png");
link.setAttribute("href", "https://i.stack.imgur.com/uOtHF.png?s=64&g=1");
arguments[1].remove();
arguments[0].appendChild(link);
''', head, link)
time.sleep(70000)  # keep the browser open so the new icon stays visible
You can use a link element in the head tag to add a favicon. The above code is an example where the Stack Overflow site will show up with my avatar as the icon.
Output: (screenshot of the resulting tab icon omitted)
You should find the current link the website uses, remove it, and replace it with your new link, as shown in the code.
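Since the question asks for Java, here is a rough Java equivalent of the Python snippet above (a sketch, assuming chromedriver is on your PATH; the class name is illustrative, and the image URL is the same one used above):

import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class FaviconSwap {
    public static void main(String[] args) throws InterruptedException {
        WebDriver driver = new ChromeDriver();
        driver.get("https://stackoverflow.com/questions/9943771/adding-a-favicon-to-a-static-html-page");
        WebElement head = driver.findElement(By.tagName("head"));
        WebElement oldLink = driver.findElement(By.cssSelector("link[rel=\"shortcut icon\"]"));
        ((JavascriptExecutor) driver).executeScript(
            "var link = document.createElement('link');"
          + "link.setAttribute('rel', 'icon');"
          + "link.setAttribute('type', 'image/png');"
          + "link.setAttribute('href', 'https://i.stack.imgur.com/uOtHF.png?s=64&g=1');"
          + "arguments[1].remove();"           // remove the site's own favicon link
          + "arguments[0].appendChild(link);", // attach the replacement to <head>
            head, oldLink);
        Thread.sleep(70000); // keep the session open so the badge stays visible
    }
}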

Why is my Jsoup Code not Returning the Correct Elements?

I am working on an app in Android Studio and am having some trouble web-scraping with JSoup. I have successfully connected to the webpage and returned some basic elements to test the library, but now I cannot actually get the elements I need for my app.
I am trying to get a number of elements with the "data-at" attribute. The weird thing is, a few elements with the "data-at" attribute are returned, but not the ones I am looking for. For whatever reason my code is not extracting all of the elements that share the "data-at" attribute on the web page.
This is the URL of the webpage I am scraping:
https://express.liatoyotaofcolonie.com/inventory?f=dealer.name%3ALia%20Toyota%20of%20Colonie&f=submodel%3ACamry&f=trim%3ALE&f=year%3A2020
The method containing the web-scraping code:
@Override
protected String doInBackground(Void... params) {
    String title = "";
    Document doc;
    Log.d(TAG, queryString.toString());
    try {
        doc = Jsoup.connect(queryString.toString()).get();
        Elements content = doc.select("[data-at]");
        for (Element e : content) {
            Log.d(TAG, e.text());
        }
    } catch (IOException e) {
        Log.e(TAG, e.toString());
    }
    return title;
}
(Screenshots: the results in Logcat, the element I want to retrieve, and one of the elements that is actually being retrieved.)
This is because some of the content - including the element you are looking for - is created asynchronously and is not present in the initial DOM (JavaScript ;)).
When you view the source of the page you will notice that there are only 17 data-at occurrences, while running document.querySelectorAll("[data-at]") returns 29 nodes.
What you are able to get with Jsoup is the static content of the page (the initial DOM). You won't be able to fetch dynamically created content, as you do not run the required JS scripts.
To overcome this, you will have to either fetch and parse the required resources manually (e.g. trace what AJAX calls are made by the browser) or use a headless browser setup. Selenium + headless Chrome should be enough.
The latter option will allow you to scrape ANY possible web application, including SPA apps, which is not possible using plain Jsoup.
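A minimal sketch of the Selenium + headless Chrome route, assuming chromedriver is installed and on your PATH (the URL and selector are the ones from the question; the class name is illustrative):

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public class RenderedScrape {
    public static void main(String[] args) {
        ChromeOptions options = new ChromeOptions();
        options.addArguments("--headless");
        WebDriver driver = new ChromeDriver(options);
        try {
            driver.get("https://express.liatoyotaofcolonie.com/inventory?f=dealer.name%3ALia%20Toyota%20of%20Colonie&f=submodel%3ACamry&f=trim%3ALE&f=year%3A2020");
            // getPageSource() returns the DOM after the page's scripts have run,
            // so Jsoup now sees the dynamically created [data-at] nodes too.
            Document doc = Jsoup.parse(driver.getPageSource());
            doc.select("[data-at]").forEach(e -> System.out.println(e.text()));
        } finally {
            driver.quit();
        }
    }
}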
I don't quite know what to do about this, but I'm going to try one more time... The "Problematic Lines" in your code are these:
doc = Jsoup.connect(queryString.toString()).get();
Elements content = doc.select("[data-at]");
It is the queryString that you have requested - the URL points to a page that contains quite a bit of script code. The page you see rendered in a browser is not the same exact HTML that is broadcast to, and received by, Jsoup; "View Source" shows that raw broadcast HTML, before any scripts have modified it.
If the HTML that is broadcast contains any <SCRIPT TYPE="text/javascript"> ... </SCRIPT> in it (and the named URL in your question does), AND those <SCRIPT> tags are involved in the initial loading of the page, then JSoup will not know anything about it... It only parses what it receives, it cannot process any dynamic content.
There are four ways that I know of to get the "post-script-loaded" version of the HTML from a dynamic web page, and I will list them here now. The first is likely the most popular method (in Java) that I have heard about on Stack Overflow:
Selenium: This answer shows how the tool can run JavaScript. These are some Selenium docs. And this page right here has a great "first class" for using the tool to retrieve post-script-processed HTML. Again, there is no way Jsoup can retrieve HTML that is generated in the browser by script (JS/AJAX/Angular/React), since it is just a parser.
Puppeteer: This requires running a language called Node.js. Perhaps calling a simple Node.js program from Java could work, but it would be a "two language" solution. I've never used it. Here is an answer that shows getting, sort of, what you are trying to get: the HTML after the script runs.
WebView: Android Java programmers have a popular class called "WebView" (documented here), which I was told about only recently (it has been out for years), that will execute script in a browser and return the HTML. Here is an answer that shows "JavaScript injection" to retrieve DOM-tree elements from a "WebView" instance (which is how I was told it is done).
Splash: My favorite tool, which I don't think many people have heard of, but it has been the simplest for me. There is an API called the "Splash API"; here is their explanation of a "JavaScript rendering service." Since this is the one I have been using, I'll post a code snippet below that shows how Splash can retrieve post-script-processed HTML.
To run the Splash API (only if you have access to the docker loading program), you start a Splash server as below. These two lines are typed into a GCP (Google Cloud Platform) shell instance, and the server starts right up without any configuration:
Pull the image:
$ sudo docker pull scrapinghub/splash
Start the container:
$ sudo docker run -it -p 8050:8050 --rm scrapinghub/splash
In your code, just prepend this String to your URLs:
"http://localhost:8050/render.html?url="
So in your code, you would use the following command instead, and the script would more than likely load all the HTML elements that you are not finding:
String SPLASH_URL = "http://localhost:8050/render.html?url=";
doc = Jsoup.connect(SPLASH_URL + queryString.toString()).get();

Selenium element can not be found within iframe

I am trying to retrieve a JSON element; the problem is that it doesn't exist in the page source, but I can find it via inspect element.
I have tried with
C.driver.findElement(By.id("ticket-parsed"))
and via XPath
C.driver.findElement(By.xpath("//*[@id=\"ticket_parsed\"]"));
and I can't find it.
Also
C.driver.switchTo().frame("html5-frame");
System.out.println(C.driver.findElement(By.id("ticket_parsed")));
C.driver.switchTo().defaultContent();
I get
[[ChromeDriver: chrome on XP (1f75e50635f9dd5b9535a149a027a447)] -> id: ticket_parsed]
on
driver.switchTo().frame(0) or driver.switchTo().frame(1)
I get that the frame doesn't exist.
And lastly I tried
WebElement frame = C.driver.findElement(By.id("html5-frame"));
C.driver.switchTo().frame(frame.getAttribute("ticket_parsed"));
and I got a NullPointerException.
Here's an image of the source:
What am I doing wrong?
Well!
The element #ticket-parsed is inside an iframe, so you can't click it without getting into the iframe first.
Here is the code to switch to an iframe:
driver.switchTo().frame("frame_name");
or
driver.switchTo().frame(frame_index);
In your case,
driver.switchTo().frame("html5-frame");
After switching into the iframe, you can locate that element by id, XPath, or CSS:
C.driver.findElement(By.id("ticket-parsed"))
NOTE:
After completing the operations inside the iframe, you have to switch back to the main window using the following command.
driver.switchTo().defaultContent();
I didn't find a solution with my existing setup, but I did find a JS command which gets the object correctly:
document.getElementById("html5-frame").contentDocument.getElementById("ticket_parsed")
You can integrate JS commands like this:
JavascriptExecutor js=(JavascriptExecutor)driver;
js.executeScript(*yourCommandHere*);
If you want to get the output of the command, just add the word return before your command (in this specific situation it didn't work, but in any other situation it did):
*TypeOfData* foo = js.executeScript("return *yourCommandHere*");
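Putting the two together, a concrete version for this page would look something like the sketch below; the ids come from the question, and note that contentDocument only works while the iframe is same-origin:

import org.openqa.selenium.JavascriptExecutor;

// "driver" is the already-running ChromeDriver session from the question.
JavascriptExecutor js = (JavascriptExecutor) driver;
Object ticket = js.executeScript(
    "return document.getElementById('html5-frame')"
  + ".contentDocument.getElementById('ticket_parsed').textContent;");
System.out.println(ticket);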
In the end, because of limited time, I had to use unorthodox methods, like taking screenshots and comparing the images to see if they are exactly the same.
Thanks for the help.

selecting pulldown in htmlunit

I am using HtmlUnit in Jython and am having trouble selecting a pull-down. The page I am going to has a table with other AJAX links; I can click on them and move around and it seems okay, but I can't figure out how to click on a pull-down menu that allows for more links on the page (this pull-down affects the AJAX table, so it's not redirecting me or anything).
Here's my code:
selectField1 = page.getElementById("pageNumSelection")
options2 = selectField1.getOptions()
theOption3 = options2[4]
This gets the option I want, and I verify it's right. So I select it:
MoreOnPage = selectField1.setSelectedAttribute(theOption3, True)
and I am stuck here (not sure if selecting it works or not, because I don't get any message), and I'm not sure what to do next. How do I refresh the page to see the larger list? When clicking on links, all you have to do is find the link and call linkNameVariable.click(), and it works; but I'm not sure how to refresh a pull-down. When I try to use the web client to create an XML page based on the select variable, I still get the old page.
To make it a bit easier, I used HtmlUnit Scripter and got some code that should work, but it's Java and I'm not sure how to port it to Jython. Here it is:
try
{
    page = webClient.getPage( url );
    HtmlSelect selectField1 = (HtmlSelect) page.getElementById("pageNumSelection");
    List<HtmlOption> options2 = selectField1.getOptions();
    HtmlOption theOption3 = null;
    for (HtmlOption option : options2)
    {
        if (option.getText().equals("100"))
        {
            theOption3 = option;
            break;
        }
    }
    selectField1.setSelectedAttribute(theOption3, true);
}
catch (Exception e)
{
    e.printStackTrace();
}
Have a look at HtmlForm's getSelectByName:
HtmlSelect htmlSelect = form.getSelectByName("stuff[1].type");
HtmlOption htmlOption = htmlSelect.getOption(3);
htmlOption.setSelected(true);
Be sure that WebClient.setJavaScriptEnabled is called. The documentation seems to indicate that it is on by default, but I think this is wrong.
Alternatively, you can use WebDriver, which is a framework that supports both HtmlUnit and Selenium. I personally find the syntax easier to deal with than HtmlUnit.
If I understand correctly, the selection of an option in the select box triggers an AJAX call which, once finished, modifies some part of the page.
The problem here is that since AJAX is, by definition, asynchronous, you can't really know when the call is finished and when you may inspect the page again to find the new content.
HtmlUnit has a class named NicelyResynchronizingAjaxController, an instance of which you can pass to the WebClient's setAjaxController method. As indicated in the javadoc, using this AJAX controller will automatically make the asynchronous calls coming from a direct user interaction synchronous instead of asynchronous. Once the setSelectedAttribute method is called, you'll thus be able to see the changes made to the original page.
The other option is to use WebClient's waitForBackgroundJavaScript method after the selection is done, and inspect the page once the background JavaScript has ended or the timeout has been reached.
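A sketch of both suggestions together, using the element id from the question (the URL is hypothetical, and method names vary slightly across HtmlUnit versions):

import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSelect;

public class SelectAndResync {
    public static void main(String[] args) throws Exception {
        WebClient webClient = new WebClient();
        // Make user-triggered AJAX calls synchronous, so the page is already
        // updated by the time setSelectedAttribute returns.
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
        HtmlPage page = webClient.getPage("http://example.com/table-page"); // hypothetical URL
        HtmlSelect select = (HtmlSelect) page.getElementById("pageNumSelection");
        page = (HtmlPage) select.setSelectedAttribute(select.getOption(4), true);
        // Alternative: give background scripts up to 10 seconds to finish instead.
        // webClient.waitForBackgroundJavaScript(10000);
        System.out.println(page.asXml());
    }
}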
This isn't really an answer to the question, because I've not used HtmlUnit much before, but you might want to look at Selenium, and in particular Selenium RC. With Selenium RC you are able to control the interactions with a page displayed in a native browser (Firefox, for example). It has developer APIs for Java and Python, amongst others.
I understand that HtmlUnit uses its own JavaScript engine and web-browser rendering engine, and I'm wondering whether that may be a problem.
