How to enable Flash in HtmlUnit? - java

I'm trying to grab HTML content with HtmlUnit. Everything works fine, but I can't get Flash content: it shows up as an <img> where there is actually an <object>. I have
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setActiveXNative(true);
webClient.getOptions().setAppletEnabled(true);
webClient.getOptions().setCssEnabled(true);
In some places on SO I found answers saying HtmlUnit doesn't support Flash, but those answers seem old, so I'm raising this question. Someone please help.
Thanks.

I found a solution.
For this I had to downgrade my HtmlUnit version from 2.15 to 2.13, because in 2.15 BrowserVersionFeatures.JS_FRAME_RESOLVE_URL_WITH_PARENT_WINDOW seems to be deprecated and I don't know which feature replaced it.
private static BrowserVersion firefox17WithUptoDateFlash = new BrowserVersion(
        BrowserVersion.FIREFOX_17.getApplicationName(),
        BrowserVersion.FIREFOX_17.getApplicationVersion(),
        BrowserVersion.FIREFOX_17.getUserAgent(),
        BrowserVersion.FIREFOX_17.getBrowserVersionNumeric(),
        new BrowserVersionFeatures[] {
                BrowserVersionFeatures.JS_FRAME_RESOLVE_URL_WITH_PARENT_WINDOW,
                BrowserVersionFeatures.STYLESHEET_HREF_EXPANDURL,
                BrowserVersionFeatures.STYLESHEET_HREF_STYLE_NULL
        });

static {
    PluginConfiguration plugin1 = new PluginConfiguration(
            "Shockwave Flash",
            "Shockwave Flash 11.4 r402",
            "NPSWF32_11_4_402_287.dll");
    plugin1.getMimeTypes().add(new PluginConfiguration.MimeType(
            "application/x-shockwave-flash",
            "Adobe Flash movie",
            "swf"));
    firefox17WithUptoDateFlash.getPlugins().add(plugin1);
}
final WebClient webClient = new WebClient(firefox17WithUptoDateFlash);
This custom browser version lets HtmlUnit act as a Flash-enabled, GUI-less browser.

Related

How to download file using headless (gui-less) Selenium WebDriver

I need to download files using a headless web browser in Java. I checked HtmlUnit, where I was able to download files in some simple cases, but I was not able to download when Ajax initiated the download (actually it is more complicated, as there are two requests: the first one fetches the URL from which the second request actually downloads the file). I have replaced HtmlUnit with Selenium. I have already checked two WebDrivers, HtmlUnitDriver and ChromeDriver.
HtmlUnitDriver - similar behaviour to HtmlUnit
ChromeDriver - I am able to download files in visible mode, but when I turn on headless mode, files are no longer downloaded
ChromeOptions lChromeOptions = new ChromeOptions();
HashMap<String, Object> lChromePrefs = new HashMap<String, Object>();
lChromePrefs.put("profile.default_content_settings.popups", 0);
lChromePrefs.put("download.default_directory", _PATH_TO_DOWNLOAD_DIR);
lChromeOptions.setExperimentalOption("prefs", lChromePrefs);
lChromeOptions.addArguments("--headless");
return new ChromeDriver(lChromeOptions);
I know that downloading files in headless mode is turned off for security reasons, but there must be some workaround.
I used HtmlUnit 2.28 before; a few minutes ago I started to work with 2.29, but it still seems that the Ajax function stops somewhere. This is the way I retrieve data after the click and expect file data: _link.click().getWebResponse().getContentAsStream()
Does WebConnectionWrapper show all the requests/responses made on the website? Do you know how I can debug this to get better insight? I can see that the first part of the Ajax function after the link is clicked is properly executed (there are 2 HTTP requests in this function). I even tried to create my own HTTP request to retrieve the data/file after the first response is fetched inside WebConnectionWrapper -> getResponse, but it returns a 404 error, which indicates the second request was somehow made - yet I don't see any log/debug information in either _link.click().getWebResponse().getContentAsStream() or WebConnectionWrapper -> getResponse().
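For the ChromeDriver part: a workaround often suggested for headless downloads is to re-enable them explicitly via the DevTools command Page.setDownloadBehavior, which chromedriver exposes through a POST to its /session/<id>/chromium/send_command endpoint. A minimal sketch of building that command's JSON payload (the helper name is made up; the command and endpoint are the commonly cited ones, so verify them against your chromedriver version):

```java
public class SendCommandPayload {

    // Builds the JSON body for chromedriver's send_command endpoint.
    // Page.setDownloadBehavior with behavior=allow tells headless Chrome to
    // write downloads into downloadPath instead of silently dropping them.
    static String downloadBehaviorPayload(String downloadDir) {
        // Escape backslashes so Windows paths stay valid JSON.
        String dir = downloadDir.replace("\\", "\\\\");
        return "{\"cmd\":\"Page.setDownloadBehavior\","
             + "\"params\":{\"behavior\":\"allow\",\"downloadPath\":\"" + dir + "\"}}";
    }

    public static void main(String[] args) {
        // POST this body to:
        // http://localhost:<port>/session/<sessionId>/chromium/send_command
        System.out.println(downloadBehaviorPayload("C:\\Users\\me\\Downloads"));
    }
}
```

With Selenium you would issue that POST against the running driver's service URL once the session is created, before clicking the download link.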
Regarding HtmlUnit you can try this:
Calling click() on a DOM element is a synchronous call: it returns after the response of this call has been retrieved and processed. But usually the JS libs out there do some async magic (like deferring the real work with setTimeout(..., 10)) for various (good) reasons. Your code has to be aware of this.
A better approach is to do something like this:
Page page = _link.click();
webClient.waitForBackgroundJavaScript(1000);
Sometimes the Ajax requests do a redirect to the new content. We have to address this new content by checking the current window:
page = page.getEnclosingWindow().getEnclosedPage();
Or maybe better: in the case of downloads, the (binary) response might be opened in a new window.
WebWindow tmpWebWindow = webClient.getCurrentWindow();
tmpWebWindow = tmpWebWindow.getTopWindow();
page = tmpWebWindow.getEnclosedPage();
This might be the response you are looking for.
page.getWebResponse().getContentAsStream();
It's a bit tricky to guess what is going on with your web application. If you like, you can reach me via private mail or discuss this on the HtmlUnit user mailing list.
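Once the right window's enclosed page holds the binary response, getWebResponse().getContentAsStream() hands back a plain InputStream, and persisting it is ordinary java.nio. A minimal sketch (saveToFile is a hypothetical helper; it is fed from an in-memory stream here so the example is self-contained):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SaveResponse {

    // Copies the response stream to disk and returns the number of bytes written.
    // In the HtmlUnit case, `in` would be page.getWebResponse().getContentAsStream().
    static long saveToFile(InputStream in, Path target) throws Exception {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("download", ".bin");
        long written = saveToFile(new ByteArrayInputStream("hello".getBytes()), target);
        System.out.println(written + " bytes written to " + target);
    }
}
```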

HtmlUnit can't find forms on website

At the following website I try to access the login and password forms with HtmlUnit: https://zof.interreport.com/diveport#
However this very simple javascript returns an empty list [].
void homePage() throws Exception {
    final WebClient webClient = new WebClient(BrowserVersion.CHROME);
    final HtmlPage page = webClient.getPage("https://zof.interreport.com/diveport#");
    System.out.println(page.getForms());
}
So somehow HtmlUnit doesn't recognize the forms on the page. How can I fix this?
At first: you only show some Java code, but you talk about javascript - is there anything missing?
Regarding the form: the page you are trying to test is one of those pages that do their work on the client side. This implies that after the page is loaded, the real page/DOM is created inside your browser by invoking JavaScript. When using HtmlUnit you have to take care of that. In simple cases it is sufficient to wait for the JavaScript to be processed.
This code works for me:
final WebClient webClient = new WebClient(BrowserVersion.CHROME);
final HtmlPage page = webClient.getPage("https://zof.interreport.com/diveport#");
webClient.waitForBackgroundJavaScriptStartingBefore(5000);
System.out.println(page.getForms());
Take care to use the latest SNAPSHOT build of HtmlUnit.
I have not worked with that API, but here is the trick:
Open the same page in your browser with JavaScript disabled. It does not work. This means the page loads its content using JavaScript DOM operations.
If you cannot get the HTML this way, there must be some way out in the API you are using. Check the HtmlUnit API documentation (the class Javadoc). There is a method
public ScriptResult executeJavaScript(String sourceCode)
The key here is that the API you are using will not execute the JavaScript on its own - you have to code for it.

Java PhantomJS NETWORK_ERR XMLHttpRequest Exception 101

I'm trying to get the page https://secure.twitch.tv/login with PhantomJS in Java using Selenium, but on driver.get(...) it always crashes with this error. I've tried implementing this:
String [] phantomJsArgs = {"--web-security=no", "--ignore-ssl-errors=yes"};
desireCaps.setCapability(PhantomJSDriverService.PHANTOMJS_GHOSTDRIVER_CLI_ARGS, phantomJsArgs);
But that doesn't seem to make a difference. Does anyone know a workaround?
Here is some code:
private void setup() {
    DesiredCapabilities desireCaps = new DesiredCapabilities();
    desireCaps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY,
            "C:\\Users\\Scott\\workspace\\Twitch Bot v2\\libs\\phantomjs.exe");
    desireCaps.setCapability("takesScreenshot", true);
    String[] phantomJsArgs = {"--disable-web-security"};
    desireCaps.setCapability(PhantomJSDriverService.PHANTOMJS_GHOSTDRIVER_CLI_ARGS, phantomJsArgs);
    driver = new PhantomJSDriver(desireCaps);
    //driver = new HtmlUnitDriver();
}
This is what the console is printing out when I try to grab the twitch page.
It seems you are trying to load the page with an async XMLHttpRequest, but the server does not provide cross-origin headers (Access-Control-Allow-Origin) in its response. Loading such a resource with an async XMLHttpRequest is blocked for security reasons.
To bypass this limitation, add the flag --disable-web-security to phantomJsArgs.
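The error boils down to the browser's same-origin rule: without CORS headers, an async XMLHttpRequest is only allowed when the scheme, host, and port of the requested URL match the page's origin. A small illustration of that check using only the JDK (helper names are made up):

```java
import java.net.URI;

public class SameOrigin {

    // Two URLs share an origin when scheme, host, and effective port all match.
    static boolean sameOrigin(URI a, URI b) {
        return a.getScheme().equalsIgnoreCase(b.getScheme())
            && a.getHost().equalsIgnoreCase(b.getHost())
            && effectivePort(a) == effectivePort(b);
    }

    // Fall back to the scheme's default port when none is given explicitly.
    static int effectivePort(URI u) {
        if (u.getPort() != -1) return u.getPort();
        return "https".equalsIgnoreCase(u.getScheme()) ? 443 : 80;
    }

    public static void main(String[] args) {
        // A request from the login page to another subdomain is cross-origin,
        // hence the need for CORS headers (or --web-security=no in PhantomJS).
        System.out.println(sameOrigin(
                URI.create("https://secure.twitch.tv/login"),
                URI.create("https://api.twitch.tv/kraken")));
    }
}
```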
Just another guess at what might be going on: PhantomJS still defaults to SSL 3.0 requests, but lots of websites have disabled SSL 3.0, so these requests will fail. To use more modern protocols, use the following option with PhantomJS:
--ssl-protocol=any

Why doesn't HTMLunit work on this https webpage?

I'm trying to learn more about HTMLunit and doing some tests at the moment. I am trying to get basic information such as page title and text from this site:
https://....com (removed the full url, important part is that it is https)
The code I use is this, which is working fine on other websites:
final WebClient webClient = new WebClient();
final HtmlPage page;
page = (HtmlPage)webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
System.out.println(page.getTitleText());
System.out.println(page.asText());
Why can't I get this basic information? If it is because of security measures, what are the specifics, and can I bypass them? Thanks.
Edit: Hmm, the code stops working after webClient.getPage(); "test2" is not printed, so I cannot check whether page is null or not.
final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_2);
final HtmlPage page;
System.out.println("test1");
try {
    page = (HtmlPage)webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
    System.out.println("test2");
I solved this by adding this line of code:
webClient.setUseInsecureSSL(true);
which is the deprecated way of disabling certificate validation. In the current HtmlUnit version you have to do:
webClient.getOptions().setUseInsecureSSL(true);
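Roughly speaking, setUseInsecureSSL(true) makes HtmlUnit accept any server certificate. What that amounts to can be sketched with plain JSSE: an SSLContext whose trust manager trusts every chain (a sketch of the concept, not HtmlUnit's exact internals; fine for test environments, never for production traffic):

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;

public class InsecureSsl {

    // Builds an SSLContext that skips certificate validation entirely,
    // which is effectively what "insecure SSL" means.
    static SSLContext insecureContext() throws Exception {
        TrustManager trustAll = new X509TrustManager() {
            public void checkClientTrusted(X509Certificate[] chain, String authType) { }
            public void checkServerTrusted(X509Certificate[] chain, String authType) { }
            public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        };
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, new TrustManager[] { trustAll }, new SecureRandom());
        return ctx;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(insecureContext().getProtocol());
    }
}
```

This is why self-signed or misconfigured certificates (like the one on the page above) stop being an obstacle once the option is set.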
I think that this is an authentication problem - if I go to that page in Firefox I get a login box.
Try
webClient.setAuthentication(realm,username,password);
before the call to getPage().

How to open IE from java and perform operations like click() etc. through java?

I want to log in to a website through Java and perform operations, like clicking, adding text to a text field, etc., through Java.
I suggest using a testing framework like HtmlUnit. Even though it's designed for testing, it's a perfectly good programmatic "navigator" of remote websites.
Here's some sample code from the site, showing how to navigate to a page and fill in a form:
public void submittingForm() throws Exception {
    WebClient webClient = new WebClient();
    HtmlPage page1 = webClient.getPage("http://some_url");
    HtmlForm form = page1.getFormByName("myform");
    HtmlSubmitInput button = form.getInputByName("submitbutton");
    HtmlTextInput textField = form.getInputByName("userid");
    textField.setValueAttribute("root");
    HtmlPage page2 = button.click();
}
You could launch it by
Runtime.getRuntime().exec("command-line command to launch IE");
then use Java's Robot class to send mouse clicks and fill in text. This seems rather crude, though, and you can probably do better by communicating directly with the web server (bypassing the browser entirely).
This question's answers may be helpful.
But you should consider direct HTTP as a better way to interact with websites.
You could also use WebTest from Canoo, which actually uses HtmlUnit but with an extra layer on top of it. It should be easier to get started with due to the scripting layer, and it comes with additional abstractions for sending mails, verifying output, etc.
http://webtest.canoo.com/webtest/manual/WebTestHome.html
Might as well try Selenium. It's free and has a fairly nice wrapper for IE.
If you really need a 'real' IE you could try Watij; if you just need browser features in Java, I recommend HttpClient.
Update: as the OP indicated using a real browser was not needed/wanted. An example of a form login using HttpClient can be found here: https://github.com/apache/httpcomponents-client/blob/master/httpclient5/src/test/java/org/apache/hc/client5/http/examples/ClientFormLogin.java
