Summary
I want to figure out a way to add a <script> tag into the head of DOM using Selenium's JavascriptExecutor, or any other way of doing this would be nice.
I have tried many ways and also found a few similar topics and none of them solved my problem which is why I felt the need to ask it on here.
For example :
Suggested solutions in this question did not solve my problem. Some people say it worked for them but nope, it didn't for me.
What I've been trying to execute?
Here is the small snippet of the code that I want to execute:
WebDriver driver = new FirefoxDriver();
JavascriptExecutor jse = (JavascriptExecutor) driver;
jse.executeScript("var s = document.createElement('script');");
jse.executeScript("s.type = 'text/javascript';");
jse.executeScript("s.text = 'function foo() {console.log('foo')}';");
jse.executeScript("window.document.head.appendChild(s);");
I just skipped the code above where you navigate to a webpage using driver.get() etc. and then try to execute the scripts.
Also, s.text would contain the actual script that I want to use so I just put there a foo() function just to give the idea.
The above code throws this error when you run it:
Exception in thread "main" org.openqa.selenium.JavascriptException: ReferenceError: s is not defined
So far I've tried every possible solution I could find on the Internet but none of them seems to work.
OP came up with the following solution:
jse.executeScript("var s=window.document.createElement('script');" +
"s.type = 'text/javascript';" + "s.text = function foo() {console.log('foo')};" +
"window.document.head.appendChild(s);");
For one, this line is invalid.
jse.executeScript("s.text = 'function foo() {console.log('foo')}';");
Note how you wrap single-quote text in single quotes. Use one set as "\""
I would personally do this by doing (edited to make it a global function):
using OpenQA.Selenium.Support.Extensions;
driver.ExecuteJavascript("window.foo = function foo() {console.log('foo')}");
It's as simple as that. You are registering foo as a method by doing this. After you execute this javascript, you can manually go in to the browser developer tools and call "foo()" to check. Additionally, you can check this by registering it directly in the console. Just enter "function foo() {console.log('foo')}" into your browser console, and then call "foo()".
No need to add this as a script tag.
EDIT #2: I fixed my above code suggestion so that the method is assigned to the window, and thus accessible globally, and outside of the anonymous script that javascript executor runs the code in. The original issues with this not working are resolved by this, at least in my testing of it.
Related
I'm trying to make a program that checks avaliable positions and books the first avaliable one. I started writing it and i ran into a problem pretty early.
The problem is that when I try to connect with the site (which is https) the program doesn't do anything. It doesn't throw an error, it doesn't crash. And the weirdest thing is that it works with some https websites and with some it doesn't.
I've spent countless hours trying to resolve this problem. I tried using htmlunitdriver and it still doesn't work. Please help.
private final WebClient webc = new WebClient(BrowserVersion.CHROME);
webc.getCookieManager().setCookiesEnabled(true);
HtmlPage loginpage = webc.getPage(loginurl);
System.out.println(loginpage.getTitleText());
I'm getting really frustrated with this. Thank you in advance.
As far as i can see this has nothing to do with HttpS. It is a good idea to do some traffic analysis using Charles or Fiddler.
What you can see....
The page returned from the server as response to your first call to https://online.enel.pl/ loads some external javascript. And then the story begins:
This JS looks like
(function() {
var z = "";
var b = "766172205f3078666.....";
eval((function() {
for (var i = 0; i < b.length; i += 2) {
z += String.fromCharCode(parseInt(b.substring(i, i + 2), 16));
}
return z;
})());
})();
As you can see someone likes to hide the real javascript that gets processed.
Next step is to check the javascript after this simple decoding
It is really huge and looks like this
var _0xfbfd = ['\x77\x71\x30\x6b\x77 ....
(function (_0x2ea96d, _0x460da4) {
var _0x1da805 = function (_0x55e996) {
while (--_0x55e996) {
_0x2ea96d['\x70\x75\x73\x68'](_0x2ea96d['\x73\x68\x69\x66\x74']());
}
};
.....
Ok now we have obfuscated javascript. If you like you can start with http://ddecode.com/hexdecoder/ to get some more readable text but this was the step where i have stopped my analysis. Looks like this script does some really bad things or someone still believes in security by obscurity.
If you run this with HtmlUnit, this codes gets interpreted - yes the decoding works and the code runs. Sadly this code runs endless (maybe because of an error or some incompatibility with real browsers).
If you like to get this working, you have to figure out, where the error is and open an bug report for HtmlUnit. For this you can simply start with a small local HtmlFile and include the code from the first external javascript. Then add some log statements to get the decoded version. Then replace this with the decoded version and try to understand what is going on. You can start adding alert statements and check if the code in HtmlUnit follows the same path as browsers do. Sorry but my time is to limited to do all this work but i really like to help/fix if you can point to a specific function in HtmlUnit that works different from real browsers.
Without the URL that you are querying it is dificult to say what could be wrong. However, having worked with HTML unit some time back I found that it was failing with many sites that I needed to get data from. The site owners will do many things to avoid you using programs to access them and you might have to resort to using some lower level library like Apache HTTP components where you have more control over what is going on under the hood.
Also check if the website is constructed using JavaScript which is getting more and more popular but making it increasingly dificult to use programs to interrogate the content.
This works:
HtmlPage page = (HtmlPage) browser.getPage("http://www.somewebsite.com/viewprofile.aspx?profile_id=107992814")
However if I put the URL in a variable like this:
String userPage = "http://www.somewebsite.com/" + profileAnchorLink.getHrefAttribute();
page = (HtmlPage) browser.getPage (userPage);
I get an error that starts off like this
Exception in thread "main" ======= EXCEPTION START ========
Exception class=[net.sourceforge.htmlunit.corejs.javascript.WrappedException]
com.gargoylesoftware.htmlunit.ScriptException: Wrapped com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "data" from undefined (https://www.gstatic.com/swiffy/v7.3.2/runtime.js#72)
Any ideas? I had an html web bot that worked beautifully but then I upgraded to Windows 10 and went through some messy problems, not sure if that has anything to do with it. I made a new project and re-imported the HtmlUnit libraries in case something was broken (kept the same workspace though not sure if that matters) and still to no avail.
The even weirder part is that sometimes it actually works. Initially my program wasn't even using the URL it was just going directly to the link but then something broke so I tried to do things a different way, the URL method was actually working but then it started to work only sometimes and now it doesn't work at all.
So I'm really quite lost on what's going on here.
Seems like the real problem was that I wasn't using getPage properly, after implementing the information from this answer (How to call getPage from HtmlUnit WebClient and have setTimeout not wait forever?) all is well...for now.
What can I do in case if I load the page in Selenium and then I have to do like 100 different parsing requests to this page?
At this moment I use different driver.findElement(By...) and the problem is that every time it is a http (get/post) request from java into selenium. From this case one simple page parsing costs me like 30+ seconds (too much).
I think that I must get source code (driver.getPageSource()) from first request and then parse this string locally (my page does not change while I parse it).
Can I build some kind of HTML object from this string to keep working with WebElement requests?
Do I have to use another lib to build HTML object? (for example - jsoup) In this case I will have to rebuild my parsing requests from webelement's and XPath.
Anything else?
When you call findElement, there is no need for Selenium to parse the page to find the element. The parsing of the HTML happens when the page is loaded. Some further parsing may happen due to JavaScript modifications to the page (like when doing element.innerHTML += ...). What Selenium does is query the DOM with methods like .getElementsByClassName, .querySelector, etc. This being said, if your browser is loaded on a remote machine, things can slow down. Even locally, if you are doing a huge amount of round-trip to between your Selenium script and the browser, it can impact the script's speed quite a bit. What can you do?
What I prefer to do when I have a lot of queries to do on a page is to use .executeScript to do the work on the browser side. This can reduce dozens of queries to a single one. For instance:
List<WebElement> elements = (List<WebElement>) ((JavascriptExecutor) driver)
.executeScript(
"var elements = document.getElementsByClassName('foo');" +
"return Array.prototype.filter.call(elements, function (el) {" +
" return el.attributes.whatever.value === 'something';" +
"});");
(I've not run the code above. Watch out for typos!)
In this example, you'd get a list of all elements of class foo that have an attribute named whatever which has a value equal to something. (The Array.prototype.filter.call rigmarole is because .getElementsByClassName returns something that behaves like an Array but which is not an Array so it does not have a .filter method.)
Parsing locally is an option if you know that the page won't change as you examine it. You should get the page's source by using something like:
String html = (String) ((JavascriptExecutor) driver).executeScript(
"return document.documentElement.outerHTML");
By doing this, you see the page exactly in the way the browser interpreted it. You will have to use something else than Selenium to parse the HTML.
Maybe try evaluating your elements only when you try to use them?
I dont know about the Java equivalent, but in C# you could do something similar to the following, which would only look for the element when it is used:
private static readonly By UsernameSelector = By.Name("username");
private IWebElement UsernameInputElement
{
get { return Driver.FindElement(UsernameSelector); }
}
I need to reach the following scenario:
1) Initializing JS var with JavascriptExecutor that will indicate if some operation is done.
2) Do some ordinary manipulation with the renderer page.
3) Verify the change to the var created in point (1).
For example:
jsc.executeScript("var test = false;");
Now, some manipulation is done.
And then:
String testVal = jsc.executeScript("return test;").toString
I get the error:
org.openqa.selenium.WebDriverException: {"errorMessage":"Can't find
variable: test","request":{"headers":{"Accept":"application/json,
image/png","Connection":"Keep-Alive","Content-Length":"35","Content-Type":"application/json;
charset=utf-8","Host":"localhost:14025"},"httpVersion":"1.1","method":"POST","post":"{\"args\":[],\"script\":\"return
test;\"}","url":"/execute","urlParsed":{"anchor":"","query":"","file":"execute","directory":"/","path":"/execute","relative":"/execute","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/execute","queryKey":{},"chunks":["execute"]},"urlOriginal":"/session/7e2c8ab0-b781-11e4-8a54-6115c321d700/execute"}}
When i'm running them in the same execution, it works correctly.:
String testVal = jsc.executeScript("var test = false; return test;").toString;
From JavascriptExecutor doc i found the explanation i needed:
Within the script, use document to refer to the current document. Note
that local variables will not be available once the script has
finished executing, though global variables will persist.
What is my alternative/workaround to this?
Not sure what is the motivation behind it, but you can use a globally available window object:
jsc.executeScript("window.test = false;");
String testVal = jsc.executeScript("return window.test;").toString
It might also be a use case for executeAsyncSript(), see:
Understanding execute async script in Selenium
Consider a simple DefaultSelenium object
DefaultSelenium sel = new
DefaultSelenium("http://localhost:8080/myapp",4444,"*iexplore","/myAppLevel1");
Now my server is set with the option of -forcedBrowserMode "*firefox" in the command line when I start it up. However, I have 2 different batch files to start the Server, one forced in firefox, one forced to IE. FYI, the -forcedBrowserMode overrides the settings within of the instantiated java object.
The problem is from java, I can't seem to find a way to determine which browser my DefaultSelenium object is running on... I was thinking something like:
sel.getBrowserName();
But nothing like that exists. Are there any other creative ways of doing this?
I need to know because with a GWT web application, to click on a button you need to do it differently based on the browser. As well, you may wonder why I even use the -forcedBrowserMode, because then I can use custom setup firefox/ie installs to test on.
Thanks in advance for help!
I think that you can get the browser by executing some JavaScript, for example verify navigator.userAgent or any browser specific object, for example document.defaultView will be null in IE and not null in FF, something like this:
DefaultSelenium sel = ...
String res = sel.getEval("document.defaultView ? false : true");
boolean isIE = "true".equals(res);