Background
I am using HtmlUnit to simulate user behavior on a page.
I reach a login page where I need to enter the user's credentials.
Issue:
The form that I am supposed to fill in changes dynamically: it adds new input fields whose values change with each character typed.
The input field has several event listeners. As far as I could tell from debugging in Chrome, the keypress event is the most relevant one, as it is what ultimately generates the updated value.
I am getting the following errors when the page "loads":
[User1st] An error occurred while extracting lang code TypeError: Cannot call method "getAttribute" of undefined
4.c.g.h.javascript.StrictErrorReporter : runtimeError: message=[An invalid or illegal selector was specified (selector: '*,:x' error: Invalid selector: *:x).] sourceName= https://???/jquery-1.10.2.min.js] line=[3] lineSource=[null] lineOffset=[0]
some code:
WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setRedirectEnabled(true);
webClient.getOptions().setUseInsecureSSL(true);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.waitForBackgroundJavaScript(5000);
final HtmlPage page = webClient.getPage(WEBSITE_URL);
HtmlForm loginForm = page.getFormByName("login");
HtmlTextInput userIdField = loginForm.getInputByName("USERID");
HtmlPasswordInput passwordField = loginForm.getInputByName("USERPASSWORD");
userIdField.type("ID");
passwordField.setText("PASSWORD");
Next, I simply iterate over the form's input fields and inspect their values.
How can I make sure that all the related JavaScript code, if any, actually gets executed?
I'm not sure if this helps, but simply letting the thread sleep worked for me. This probably gives the JavaScript time to load and run.
Thread.sleep(2000);
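A fixed sleep either waits too long or not long enough. As a rough sketch of an alternative, the helper below polls a condition until it holds or a timeout expires. It uses only the JDK; the class name Poller and its parameters are made up for illustration and are not part of HtmlUnit.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: re-checks a condition until it holds or a deadline passes. */
final class Poller {

    private Poller() {
    }

    /**
     * Evaluates the condition every intervalMillis until it returns true
     * or timeoutMillis has elapsed. Returns true if the condition was met,
     * false on timeout or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With HtmlUnit, the condition would be whatever signals that the page's scripts have finished, for example Poller.waitUntil(() -> !userIdField.getValueAttribute().isEmpty(), 5000, 100). Which condition is appropriate depends on the page, so treat that call as an assumption rather than a recipe.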
Related
Using HtmlUnit 2.36, I'm attempting to press a JavaScript button on a web page, which navigates to another page, and so on:
ScriptResult result = page.executeJavaScript("__doPostBack('LinkBtn_thebutton','')");
Page page = result.getNewPage();
The code above causes the following error, presumably because getNewPage() is no longer supported:
The method getNewPage() is undefined for type ScriptResult
I've also attempted to add a cast with getJavaScriptResult() as shown below with no luck:
HtmlPage page1 = (HtmlPage) result.getJavaScriptResult();
Causing the following error:
Exception in thread "main" java.lang.ClassCastException: class net.sourceforge.htmlunit.corejs.javascript.Undefined cannot be cast to class com.gargoylesoftware.htmlunit.html.HtmlPage
You are not supposed to cast the value returned by result.getJavaScriptResult(); treat the call as if it returned void. If your page is going to be redirected, make sure that redirecting is enabled: webClient.getOptions().setRedirectEnabled(true);
I'm trying to write an application that automatically updates page content (inside my account). I used HtmlUnit because it supports JavaScript.
But I ran into a "your browser is too old" problem.
My code:
public static void main(String[] args) {
    Locale.setDefault(Locale.ENGLISH);
    try (final WebClient client = new WebClient(BrowserVersion.FIREFOX_45)) {
        client.getOptions().setUseInsecureSSL(true);
        client.setAjaxController(new NicelyResynchronizingAjaxController());
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.getOptions().setThrowExceptionOnScriptError(false);
        client.waitForBackgroundJavaScript(30000);
        client.waitForBackgroundJavaScriptStartingBefore(30000);
        client.getOptions().setCssEnabled(false);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        HtmlPage page = client.getPage("https://passport.yandex.ru/passport?mode=auth&retpath=https://toloka.yandex.ru/?ncrnd=5645");
        HtmlForm form = page.getForms().get(0);
        HtmlInput inputLogin = form.getInputByName("login");
        inputLogin.setValueAttribute(userName);
        HtmlInput inputPassw = form.getInputByName("passwd");
        inputPassw.setValueAttribute(password);
        DomElement button = page.getElementsByTagName("button").get(0);
        HtmlPage page2 = button.click();
        System.out.println(page2.asXml());
    }
    catch (IOException e) {
    }
}
The login succeeds, but I can't load the second page (it should redirect to the content page).
Response:
<h1 style="padding-top: 20px;">Browser is too old</h1>
<p>
Unfortunately you are using an old browser.
Please, upgrade to at least IE10 or use one of the modern browsers, e.g.
Yandex.Browser,
Google Chrome or
Mozilla Firefox
</p>
How can I solve it? Thanks.
There is no simple solution for your problem but there are some things you can do.
Use the latest snapshot build of HtmlUnit (http://htmlunit.sourceforge.net/gettingLatestCode.html).
Try different simulated browsers (e.g. Chrome).
Clean up your client settings and set only the options you need (in your case, maybe just setUseInsecureSSL(true)).
waitForBackgroundJavaScript and waitForBackgroundJavaScriptStartingBefore are not options; calling them during client setup is useless.
Check your log; it may contain hints about unsupported JavaScript methods.
Place the call to waitForBackgroundJavaScript after the button click; maybe the redirect is done by some JavaScript with a small delay.
HtmlPage page2 = button.click();
client.waitForBackgroundJavaScript(30000);
And because the JavaScript might have changed the page content, you have to get the page again.
page2 = page2.getEnclosingWindow().getEnclosedPage();
Browser-version checks are usually done with some JavaScript magic. Maybe the trick used by your web site is not (correctly) supported or emulated by HtmlUnit. If you can find the root cause, you can file a bug (see http://htmlunit.sourceforge.net/submittingJSBugs.html for some hints on how to track it down).
I'm trying to get the text box with the id u_0_1e from the page wall, but HtmlUnit does not find anything. The last line prints null.
Here's the code:
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(java.util.logging.Level.OFF);
WebClient client = new WebClient(BrowserVersion.CHROME);
JavaScriptEngine engine = new JavaScriptEngine(client);
client.setJavaScriptEngine(engine);
HtmlPage home = client.getPage("https://www.facebook.com/login.php");
HtmlSubmitInput login = (HtmlSubmitInput) home.getElementById("u_0_1");
HtmlTextInput name = (HtmlTextInput) home.getElementById("email");
HtmlPasswordInput pass = (HtmlPasswordInput) home.getElementById("pass");
name.setValueAttribute("myname");
pass.setValueAttribute("mypass");
HtmlPage page = login.click();
HtmlPage wall = client.getPage("https://www.facebook.com/");
System.out.println(wall.getElementById("u_0_1e"));
I have some comments about your issue.
First of all, you have disabled HtmlUnit's logging. So if you have any JavaScript issue then you are not going to see it. If you are actually getting a JavaScript error then the JavaScript code won't be fully executed. If the element you're trying to fetch was dynamically fetched from the server (probably using AJAX) then the JavaScript errors, if any, might result in that element not being fetched.
If you are web scraping, which is clearly the case, then you don't have any control over the JS, so you can only accept that it does not work, or disable JS and process the AJAX requests manually.
Of course, you will see the page working perfectly in a real browser, but take into consideration that the JavaScript engine HtmlUnit uses is different from those of real browsers.
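On the logging point, a less drastic setting than Level.OFF can be sketched as follows. It assumes, as the question's code does, that HtmlUnit's messages end up in java.util.logging under the com.gargoylesoftware logger. The class and method names are invented for this example, and SEVERE is only a guess at a useful threshold; lower it to WARNING if script problems turn out to be reported there.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper: quiets HtmlUnit's logger without silencing errors. */
final class HtmlUnitLogging {

    // java.util.logging only holds loggers weakly, so keep a strong
    // reference; otherwise the configured level could be lost.
    private static final Logger HTMLUNIT_LOGGER = Logger.getLogger("com.gargoylesoftware");

    private HtmlUnitLogging() {
    }

    /** Drops messages below SEVERE but keeps SEVERE ones visible. */
    static Logger quietButNotSilent() {
        HTMLUNIT_LOGGER.setLevel(Level.SEVERE);
        return HTMLUNIT_LOGGER;
    }
}
```

Calling HtmlUnitLogging.quietButNotSilent() once before creating the WebClient would replace the setLevel(Level.OFF) line from the question.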
Secondly, the two lines containing the word engine are absolutely unneeded.
Thirdly, as I mentioned in a previous question of yours, this will be more suitable to be handled by means of the Facebook API.
Finally, you might find this other answer useful:
JavaScript not being properly executed in HtmlUnit
I am writing a Java application that has a log in screen. Ideally, I would like to take the user supplied data (name, password), and submit it to an ASP form that can verify their credentials. I do not own the ASP form, I can only access the URL. I also do not want the user to be entering their credentials straight into the web form. They would enter their credentials into my program, and my program would put the data into the form and submit, and allow/deny the user based on the response.
Of course, the submit button on the ASP form is a POST request. However, constructing the URL (...login?username=name&password=pass) does not work, as the form must be submitted via the button with the text boxes filled in.
I have tried two approaches:
Using Java's URLConnection class. This does not seem to work because the form submitting is limited to the method I mentioned above, which is constructing the URL.
Using Javascript to access and edit the elements on the page. This has not worked either, because the Javascript is being run from my program, which is not a web browser, and therefore has no access to the 'document' or 'window' commonly used.
Other potential solutions I can think of:
Opening a browser to the login page but not giving it focus, running a script to fill out and submit the form, parsing the response, and then closing the browser. This would not involve the user at all, except for the input into the login page in my Java program.
Using a 3rd party Java library (suggestions? references to tutorials?).
Embedding the URL into my login screen (any help in this regard would be appreciated).
The things that cannot be changed are that my program is in Java, and that the login URL is an ASP form that hides the POST data from the URL.
Let me know if anything needs clarification. Any help is welcome.
Try HtmlUnit; although it was designed for testing, it would be ideal for this. You can use it in conjunction with Selenium WebDriver.
Why don't you open the ASP form in an iframe, use JavaScript to populate all the fields, and then post it?
This should solve your problem.
Sfk is correct: I had a similar problem and managed to fill in the form and submit it with HtmlUnit. @sfk many thanks, you put me on the right path.
So, with HtmlUnit:
//for chrome simulation
WebClient webClient = new WebClient(BrowserVersion.CHROME_16);
//JavaScript is turned off because http://www.google-analytics.com/ga.js caused an error with it on
webClient.setJavaScriptEnabled(false);
HtmlPage page = webClient.getPage("http://yourtargetpage/Default.aspx");
//get the form by name, check page source for name
HtmlForm form = page.getFormByName("aspnetForm");
HtmlPasswordInput inputPass = form.getInputByName("your input password text field name");
HtmlTextInput userName = form.getInputByName("your input user text field name");
HtmlSubmitInput button=form.getInputByName("your target submit button");
//set username and password
userName.setText("myuser");
inputPass.setText("mypassword");
//click the submit button and get the returned page
HtmlPage page2 = button.click();
That's it. You have sent the field values and received the reply page; you can now parse it to get the site's response.
I want to fill in a text field of an HTML form from Java and then click the submit button from Java, so as to get the page source of the document returned after submitting the form.
I can do this by sending an HTTP request directly, but I don't want to do it that way.
I usually do it using HtmlUnit. They have an example on their page:
@Test
public void submittingForm() throws Exception {
    final WebClient webClient = new WebClient();
    // Get the first page
    final HtmlPage page1 = webClient.getPage("http://some_url");
    // Get the form that we are dealing with and within that form,
    // find the submit button and the field that we want to change.
    final HtmlForm form = page1.getFormByName("myform");
    final HtmlSubmitInput button = form.getInputByName("submitbutton");
    final HtmlTextInput textField = form.getInputByName("userid");
    // Change the value of the text field
    textField.setValueAttribute("root");
    // Now submit the form by clicking the button and get back the second page.
    final HtmlPage page2 = button.click();
}
And you can read more here.
If you don't want to talk HTTP directly (why?), then take a look at Watij.
It allows you to invoke a browser (IE) as a COM control within your Java process, navigate through page elements by their document ids, fill in forms and press buttons. Because it's running a browser, JavaScript will run as normal (as if you were doing this manually).
You would probably need to write a Java Applet, as the only other way than sending a direct request would be to have it interface with the browser.
Of course, for this to work, you would have to embed the applet in the page. If you don't control the page, this can't be done. If you do control the page, you might as well be using Javascript, instead of trying to get a Java Applet to do it, which would be much more cumbersome and difficult.
Just to clarify, what is the problem you are having creating an HTTP Request and why do you want to use a different method?
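For comparison, the direct-request route can be sketched with the JDK's own HTTP client (Java 11 or later). This is only an illustration: the class FormPost, the URL and the field names are placeholders, and a real form may expect additional fields (ASP.NET Web Forms pages, for instance, usually require hidden fields such as __VIEWSTATE to be copied from the page first).

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper: submits form fields as a URL-encoded POST body. */
final class FormPost {

    private FormPost() {
    }

    /** Encodes the fields as application/x-www-form-urlencoded text. */
    static String encode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Builds the POST request; the fields travel in the body, not in the URL. */
    static HttpRequest buildRequest(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encode(fields)))
                .build();
    }

    /** Sends the request and returns the response body (the page source). */
    static String submit(String url, Map<String, String> fields)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        return client.send(buildRequest(url, fields),
                HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

Unlike HtmlUnit, this runs no JavaScript and does not look at the form at all, so it only works when the server accepts the bare field values.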