HtmlUnit button click - java

I'm trying to send a message on www.meetme.com but can't figure out how to do it. I can type the message into the comment area, but clicking the Send button doesn't do anything. What am I doing wrong? When I log in and press the Login button, the page does change and everything is fine. Does anyone have any ideas or clues?
HtmlPage htmlPage = null;
HtmlElement htmlElement;
WebClient webClient = null;
HtmlButton htmlButton;
HtmlForm htmlForm;
try{
// Create and initialize WebClient object
webClient = new WebClient(BrowserVersion.FIREFOX_17 );
webClient.setCssEnabled(false);
webClient.setJavaScriptEnabled(false);
webClient.setThrowExceptionOnFailingStatusCode(false);
webClient.setThrowExceptionOnScriptError(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setUseInsecureSSL(true);
webClient.getCookieManager().setCookiesEnabled(true);
/*webClient.setRefreshHandler(new RefreshHandler() {
public void handleRefresh(Page page, URL url, int arg) throws IOException {
System.out.println("handleRefresh");
}
});*/
htmlPage = webClient.getPage("http://www.meetme.com");
htmlForm = htmlPage.getFirstByXPath("//form[@action='https://ssl.meetme.com/login']");
htmlForm.getInputByName("username").setValueAttribute("blah@gmail.com");
htmlForm.getInputByName("password").setValueAttribute("blah");
//Signing in
htmlButton = htmlForm.getElementById("login_form_submit");
htmlPage = (HtmlPage) htmlButton.click();
htmlPage = webClient.getPage("http://www.meetme.com/member/1234567890");
System.out.println("BEFORE CLICK");
System.out.println(htmlPage.asText());
//type message in text area
HtmlTextArea commentArea = (HtmlTextArea)htmlPage.getFirstByXPath("//textarea[@id='profileQMBody']");
commentArea.setText("Testing");
htmlButton = (HtmlButton) htmlPage.getHtmlElementById("profileQMSend");
htmlPage = (HtmlPage)htmlButton.click();
webClient.waitForBackgroundJavaScript(7000);
//The print is exactly the same as the BEFORE CLICK print
System.out.println("AFTER CLICK");
System.out.println(htmlPage.asText());
}catch(ElementNotFoundException e){
e.printStackTrace();
}catch(Exception e){
e.printStackTrace();
}

Without knowing much about the webpage you're accessing, one thing is certain: you can't perform an AJAX request with JavaScript disabled, and your code calls setJavaScriptEnabled(false). If enabling JavaScript doesn't fix it, you will have to keep debugging, but make sure JavaScript stays enabled.
Additionally, make sure you're using HtmlUnit 2.12 (the release that matches BrowserVersion.FIREFOX_17) and update all the deprecated methods in your code.
BTW, I'd also recommend turning the JavaScript warnings off. Check this answer to see how you can do that.
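A minimal sketch of one way to turn those warnings off, using only the JDK. It assumes HtmlUnit's log output ends up in java.util.logging, which is where Apache Commons Logging usually sends it when no other backend such as Log4j is on the classpath; with a different backend the equivalent setting would go in that backend's configuration. The class name QuietHtmlUnitLogging is made up for illustration, and the logger name is the package prefix that appears in HtmlUnit 2.x class names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class QuietHtmlUnitLogging {
    // java.util.logging keeps loggers weakly referenced, so hold on to this one;
    // otherwise the level set below could be lost after a garbage collection.
    private static final Logger HTMLUNIT_LOGGER = Logger.getLogger("com.gargoylesoftware");

    // Switches off every log record published under com.gargoylesoftware.*
    static Logger silence() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
        return HTMLUNIT_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = silence();
        System.out.println(logger.getName() + " level: " + logger.getLevel());
    }
}
```

In the question's code, a call like QuietHtmlUnitLogging.silence() would go before the WebClient is created, alongside switching setJavaScriptEnabled(false) to true.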

Related

Find a form with Java and htmlUnit

I have written a simple program that should log in via a form on a website.
Unfortunately, the form in the HTML has no name or id.
I use the latest version of HtmlUnit and Java 11.
I tried to find the form with the getForms() method, but without success.
HTML snippet from the website I'm trying to log in to
Here is my code to find the form:
//Get the form
HtmlForm form = LoginPage.getFormByName("I tried several options here");
//Get the Submit button
final HtmlButton loginButton = form.getButtonByName("Anmelden");
//Get the text fields for password and username
final HtmlTextInput username = form.getInputByName("text");
final HtmlTextInput password = form.getInputByName("password");
Whatever I tried, I didn't find any form.
This is my connection class if it helps:
public HtmlPage CslPlasmaConnection(){
//Create Webclient to connect to CslPlasma
WebClient CslPlasmaConnection = new WebClient(BrowserVersion.BEST_SUPPORTED);
//helper variable ini with null
HtmlPage CslPlasmaLoginPage = null;
//Get the content from CslPlasma
try {
CslPlasmaLoginPage = CslPlasmaConnection.getPage(URL);
} catch (IOException e) {
e.printStackTrace();
}
//Return CslPlasma Login Page
return CslPlasmaLoginPage;
}
Without knowing the page, I can only guess.
Have a look at this answer: https://stackoverflow.com/a/54188201/4804091
Also try working with the latest page, because some JavaScript may be creating the form after the initial load:
webClient.getPage(url);
webClient.waitForBackgroundJavaScript(10000);
HtmlPage page = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();
If you're sure this is the only form on the page or you know which form number it is, you can use page.getForms() to get all forms of the page and get yours from the resulting list.
Like so:
HtmlForm form = LoginPage.getForms().get(0); // if it's the only form, its index is 0
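When a form has neither a name nor an id, another option is to locate it by what it contains. The sketch below checks such an XPath expression with the JDK's own XPath engine against a hypothetical, well-formed stand-in for the login page (the real markup isn't shown in the question, so the SAMPLE string and the class name FormXPathSketch are assumptions). In HtmlUnit, the same expression string could be passed to getFirstByXPath on the page.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class FormXPathSketch {
    // Hypothetical markup: two anonymous forms, only one of which is a login form.
    static final String SAMPLE =
        "<html><body>"
        + "<form action='/search'><input type='text' name='q'/></form>"
        + "<form action='/login'>"
        + "<input type='text' name='text'/>"
        + "<input type='password' name='password'/>"
        + "<button type='submit' name='Anmelden'>Anmelden</button>"
        + "</form>"
        + "</body></html>";

    // Select the first form that contains a password field, wherever it sits in the page.
    static final String LOGIN_FORM_XPATH = "//form[.//input[@type='password']]";

    static Element findLoginForm(String markup) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(markup));
        return (Element) xpath.evaluate(LOGIN_FORM_XPATH, source, XPathConstants.NODE);
    }

    public static void main(String[] args) throws XPathExpressionException {
        Element form = findLoginForm(SAMPLE);
        System.out.println("action = " + form.getAttribute("action"));
    }
}
```

Selecting by content rather than by index keeps working if the site later adds another form ahead of the login form.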

How to get info from website once logged on with HTMLUNIT?

I have posted about this before, and I have since learned more about how to do it, yet I am still unable to get it working. This is the main part of the code. When I run it, I get a whole bunch of CSS-related warnings in the console, and it won't work. I'm trying to get the user's name, as mentioned in the code. If someone could help, it would mean a lot to me. The website is my school website: https://lionel2.kgv.edu.hk/login/index.php . I have included the logged-in page (I removed most elements except for my user part) if that helps. Thanks in advance,
Vijay.
website:
https://drive.google.com/a/kgv.hk/file/d/0B-O_Xw0mAw7tajJhVlRxTkFhOE0/view?usp=sharing
//most of this is from https://gist.github.com/harisupriyanto/6805988
String loginUrl = "http://lionel2.kgv.edu.hk";
int loginFormNum = 1;
String usernameInputName = "nameinput";
String passwordInputName = "passinput";
String submitLoginButtonValue = "Sign In";
// create the HTMLUnit WebClient instance
WebClient wclient = new WebClient();
// configure WebClient based on your desired
wclient.getOptions().setPrintContentOnFailingStatusCode(false);
wclient.getOptions().setCssEnabled(true);
wclient.getOptions().setThrowExceptionOnFailingStatusCode(false);
wclient.getOptions().setThrowExceptionOnScriptError(false);
try {
final HtmlPage loginPage = (HtmlPage)wclient.getPage(loginUrl);
final HtmlForm loginForm = loginPage.getForms().get(loginFormNum);
final HtmlTextInput txtUser = loginForm.getInputByName(usernameInputName);
txtUser.setValueAttribute(username);
final HtmlPasswordInput txtpass = loginForm.getInputByName(passwordInputName);
txtpass.setValueAttribute(password);
final HtmlSubmitInput submitLogin = loginForm.getInputByValue(submitLoginButtonValue);
final HtmlPage returnPage = submitLogin.click();
final HtmlElement returnBody = returnPage.getBody();
//if (//there is a class called "Login info, then print out the nodeValue.) {
// }
} catch(FailingHttpStatusCodeException e) {
e.printStackTrace();
} catch(Exception e) {
e.printStackTrace();
}
}
You most likely do not need the CSS, so you could disable it.
To improve performance and reduce warnings and errors, I disable or limit as much as possible:
webClient.setJavaScriptTimeout(30 * 1000); // 30s
webClient.getOptions().setTimeout(300 * 1000); // 300s
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setThrowExceptionOnScriptError(false); // no Exceptions because of javascript
webClient.getOptions().setPopupBlockerEnabled(true);
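For the part of the question that is still commented out (printing the text of the element that carries the login info class), here is a rough sketch of the XPath involved, run with the JDK's XPath engine. The markup, the class name logininfo, and the class LoginInfoSketch are all assumptions made for illustration; the real page would need to be inspected for the actual class name. With HtmlUnit, the node-selecting part of the expression (everything inside the outer normalize-space) could be handed to getFirstByXPath on the returned page.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class LoginInfoSketch {
    // Hypothetical stand-in for the page returned after logging in.
    static final String SAMPLE =
        "<html><body>"
        + "<div class='header logininfo'>You are logged in as <a href='/user'>Example User</a></div>"
        + "<div class='content'>Other page content</div>"
        + "</body></html>";

    // Matches elements whose space-separated class list contains the token "logininfo",
    // then returns the element's text with surrounding whitespace collapsed.
    static final String LOGIN_INFO_XPATH =
        "normalize-space(//*[contains(concat(' ', normalize-space(@class), ' '), ' logininfo ')])";

    static String loginInfo(String markup) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(LOGIN_INFO_XPATH, new InputSource(new StringReader(markup)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        System.out.println(loginInfo(SAMPLE));
    }
}
```

An empty result would mean no element with that class is present, which is also a cheap way to tell that the login did not succeed.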

HTMLUnit not clicking a button

I have some simple HtmlUnit code that is supposed to press a button and submit a form on a page, but it doesn't work. This is my code:
public static boolean sub(String ref, String zip) throws Exception {
WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);
webClient.getOptions().setRedirectEnabled(true);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getCookieManager().setCookiesEnabled(true);
final HtmlPage page1 = webClient.getPage("http://site.com/");
webClient.waitForBackgroundJavaScript(20000);
final HtmlButton button = page1.getFirstByXPath("//*[@id=\"lookup\"]");
final HtmlTextInput orderField = page1.getFirstByXPath("//*[@id=\"order-number\"]");
final HtmlTextInput zipField = page1.getFirstByXPath("//*[@id=\"order-user-info\"]");
orderField.setValueAttribute(ref);
zipField.setValueAttribute(zip);
final HtmlPage page2 = button.click();
webClient.waitForBackgroundJavaScript(20000);
System.out.println(page2.asText());
webClient.closeAllWindows();
return true;
}
All this does is print out the text of the first page, but with the text boxes filled in. As you can see, I tried waiting for JavaScript, but it still doesn't work. Any help is appreciated.
UPDATE: I found out some new info. When I enter a correct order number, it just shows the current page with the text boxes filled in, instead of the page it is supposed to redirect to. When I enter wrong info, it shows the current page with the text boxes filled in AND the "Wrong Info" error message. It seems that it is just not redirecting...
Have you tried using
webClient.getOptions().setRedirectEnabled(true);

HtmlUnit can't retrieve page after downloading a file

I'm having this weird problem with HtmlUnit in Java. I am using it to download some data from a website, the process is something like this:
1 - Login
2 - For each element (cars)
----- 3 Search for car
----- 4 Download zip file from a link
The code:
Creation of the webclient:
webClient = new WebClient(BrowserVersion.FIREFOX_3_6);
webClient.setJavaScriptEnabled(true);
webClient.setThrowExceptionOnScriptError(false);
DefaultCredentialsProvider provider = new DefaultCredentialsProvider();
provider.addCredentials(USERNAME, PASSWORD);
webClient.setCredentialsProvider(provider);
webClient.setRefreshHandler(new ImmediateRefreshHandler());
Login:
public void login() throws IOException
{
page = (HtmlPage) webClient.getPage(URL);
HtmlForm form = page.getFormByName("formLogin");
String user = USERNAME;
String password = PASSWORD;
// Enter login and password
form.getInputByName("LoginSteps$UserName").setValueAttribute(user);
form.getInputByName("LoginSteps$Password").setValueAttribute(password);
// Click Login Button
page = (HtmlPage) form.getInputByName("LoginSteps$LoginButton").click();
webClient.waitForBackgroundJavaScript(3000);
// Click on Campa area
HtmlAnchor link = (HtmlAnchor) page.getElementById("ctl00_linkCampaNoiH");
page = (HtmlPage) link.click();
webClient.waitForBackgroundJavaScript(3000);
System.out.println(page.asText());
}
Search for car in website:
private void searchCar(String _regNumber) throws IOException
{
// Open search window
page = page.getElementById("search_gridCampaNoi").click();
webClient.waitForBackgroundJavaScript(3000);
// Write plate number
HtmlInput element = (HtmlInput) page.getElementById("jqg1");
element.setValueAttribute(_regNumber);
webClient.waitForBackgroundJavaScript(3000);
// Click on search
HtmlAnchor anchor = (HtmlAnchor) page.getByXPath("//*[@id=\"fbox_gridCampaNoi_search\"]").get(0);
page = anchor.click();
webClient.waitForBackgroundJavaScript(3000);
System.out.println(page.asText());
}
Download pdf:
try
{
InputStream is = _link.click().getWebResponse().getContentAsStream();
File path = new File(new File(DOWNLOAD_PATH), _regNumber);
if (!path.exists())
{
path.mkdir();
}
writeToFile(is, new File(path, _regNumber + "_pdfs.zip"));
}
catch (Exception e)
{
e.printStackTrace();
}
}
The problem:
The first car works okay, pdf is downloaded, but as soon as I search for a new car, when I get to this line:
page = page.getElementById("search_gridCampaNoi").click();
I get this exception:
Exception in thread "main" java.lang.ClassCastException: com.gargoylesoftware.htmlunit.UnexpectedPage cannot be cast to com.gargoylesoftware.htmlunit.html.HtmlPage
After debugging, I've realized that the moment I make this call:
InputStream is = _link.click().getWebResponse().getContentAsStream();
the return type of page.getElementById("search_gridCampaNoi").click() changes from HtmlPage to WebResponse, so instead of receiving a new page, I'm receiving again the file that I already downloaded.
A couple of screenshots of the debugger showing this situation:
First call, return type OK:
Second call, return type changed and I no longer receive a HtmlPage:
Thanks in advance!
Just in case someone encounters the same problem, I found a workaround. Changing the line:
InputStream is = _link.click().getWebResponse().getContentAsStream();
to
InputStream is = _link.openLinkInNewWindow().getWebResponse().getContentAsStream();
seems to do the trick. I'm still having problems when doing several iterations: sometimes it works and sometimes it doesn't, but at least I have something now.
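The writeToFile helper called in the download step isn't shown in the question. A possible version using only the JDK might look like the sketch below; the class name DownloadWriter and the sample file names in main are made up for illustration. It creates any missing parent directories (the question's path.mkdir() only creates the last level) and closes the response stream when done, so an open stream isn't left behind between iterations.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

class DownloadWriter {
    // Copies the whole stream to target, creating parent directories as needed
    // and overwriting any file left over from a previous run.
    static void writeToFile(InputStream in, File target) throws IOException {
        File parent = target.getParentFile();
        if (parent != null) {
            Files.createDirectories(parent.toPath());
        }
        // try-with-resources closes the stream even if the copy fails.
        try (InputStream source = in) {
            Files.copy(source, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("downloads").toFile();
        File target = new File(new File(dir, "REG123"), "REG123_pdfs.zip");
        writeToFile(new ByteArrayInputStream(new byte[] {1, 2, 3}), target);
        System.out.println(target + " -> " + target.length() + " bytes");
    }
}
```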

apache HTMLUNIT..... PROBLEM in handling javascript

I want to login to a website (http://www.orkut.com) through
com.gargoylesoftware.htmlunit.WebClient
But when I click on the "Submit" button, it doesn't take me to the page that should come after login. Instead it returns the same login page again, so the login itself seems to be failing. When I try the same code with sites that don't use JavaScript, it works fine, so I think I am not handling scripts correctly.
I am using the following code:
public static void main(String[] args) {
final WebClient webClient = new WebClient();
try {
HtmlPage loginPage = webClient.getPage(new URL("https://www.google.com/accounts/ServiceLogin?service=orkut&hl=en-US&rm=false&continue=http%3A%2F%2Fwww.orkut.com%2FRedirLogin%3Fmsg%3D0%26page%3Dhttp%253A%252F%252Fwww.orkut.co.in%252FHome.aspx&cd=IN&passive=true&skipvpage=true&sendvemail=false"));
System.out.println(loginPage.getTextContent());
List<HtmlForm> forms = loginPage.getForms();
HtmlForm loginForm = forms.get(0);
HtmlInput username = loginForm.getInputByName("Email");
HtmlInput password = loginForm.getInputByName("Passwd");
HtmlInput submit = loginForm.getInputByName("signIn");
username.setNodeValue("username");
password.setNodeValue("password");
HtmlPage homePage = submit.click();
Thread.sleep(10 * 1000);
System.out.println(homePage.getTextContent());
}catch(Exception ex) {
ex.printStackTrace();
}
}
When the "submit" button is clicked, the page actually first calls this function
onsubmit="return(gaia_onLoginSubmit());"
which is specified as an attribute of the form below
<form id="gaia_loginform" action="https://www.google.com/accounts/ServiceLoginAuth?service=orkut" method="post"
onsubmit="return(gaia_onLoginSubmit());">
Can anyone help me with this?
NOTE: I WILL PAY FOR THE SOLUTION
According to their site, JavaScript support is provided by Mozilla Rhino, so maybe all you need is to add it to your classpath (and perhaps adjust some configuration).
Also, HtmlUnit has professional support.
