HtmlUnit and HTTPS pages - java

I'm trying to write a program that checks available positions and books the first available one. I started writing it and ran into a problem pretty early.
The problem is that when I try to connect to the site (which uses HTTPS), the program doesn't do anything. It doesn't throw an error and it doesn't crash. The weirdest thing is that it works with some HTTPS websites and not with others.
I've spent countless hours trying to resolve this problem. I tried using HtmlUnitDriver and it still doesn't work. Please help.
private final WebClient webc = new WebClient(BrowserVersion.CHROME);
webc.getCookieManager().setCookiesEnabled(true);
HtmlPage loginpage = webc.getPage(loginurl);
System.out.println(loginpage.getTitleText());
I'm getting really frustrated with this. Thank you in advance.

As far as I can see, this has nothing to do with HTTPS. It is a good idea to do some traffic analysis using Charles or Fiddler.
Here is what you can see:
The page returned by the server in response to your first call to https://online.enel.pl/ loads some external JavaScript. And then the story begins.
The JS looks like this:
(function() {
var z = "";
var b = "766172205f3078666.....";
eval((function() {
for (var i = 0; i < b.length; i += 2) {
z += String.fromCharCode(parseInt(b.substring(i, i + 2), 16));
}
return z;
})());
})();
As you can see, someone likes to hide the real JavaScript that gets processed.
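If you would rather do this first decoding step outside the browser, here is a minimal Java sketch of the same loop: it reads the string two hex digits at a time and turns each pair into a character. The class and method names are made up for illustration, and the sample input is only the visible start of the string shown above, not the full script.

```java
// Hypothetical helper mirroring the JavaScript loop above:
// every two hex digits become one character.
final class HexScriptDecoder {

    static String decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        StringBuilder out = new StringBuilder(hex.length() / 2);
        for (int i = 0; i < hex.length(); i += 2) {
            // parse one byte, e.g. "76" -> 0x76 -> 'v'
            out.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // First 16 hex digits of the string from the page
        System.out.println(decodeHex("766172205f307866")); // prints "var _0xf"
    }
}
```

Decoding it this way only gives you the text of the script; nothing gets evaluated, so it is a safe way to look at what the page is trying to run.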
The next step is to check the JavaScript after this simple decoding.
It is really huge and looks like this:
var _0xfbfd = ['\x77\x71\x30\x6b\x77 ....
(function (_0x2ea96d, _0x460da4) {
var _0x1da805 = function (_0x55e996) {
while (--_0x55e996) {
_0x2ea96d['\x70\x75\x73\x68'](_0x2ea96d['\x73\x68\x69\x66\x74']());
}
};
.....
OK, now we have obfuscated JavaScript. If you like, you can start with http://ddecode.com/hexdecoder/ to get some more readable text, but this is the step where I stopped my analysis. It looks like this script either does some really bad things or someone still believes in security by obscurity.
If you run this with HtmlUnit, the code gets interpreted: the decoding works and the code runs. Sadly, the code runs endlessly (maybe because of an error, or because of some incompatibility between HtmlUnit and real browsers).
If you want to get this working, you have to figure out where the error is and open a bug report for HtmlUnit. To do this, you can start with a small local HTML file and include the code from the first external JavaScript file. Then add some log statements to get the decoded version. Then replace the encoded script with the decoded version and try to understand what is going on. You can start adding alert statements and check whether the code in HtmlUnit follows the same path as it does in browsers. Sorry, but my time is too limited to do all this work; I would really like to help or fix it, though, if you can point to a specific function in HtmlUnit that behaves differently from real browsers.

Without the URL that you are querying, it is difficult to say what could be wrong. However, having worked with HtmlUnit some time back, I found that it failed with many sites that I needed to get data from. Site owners do many things to keep programs from accessing their sites, and you might have to resort to a lower-level library such as Apache HttpComponents, where you have more control over what is going on under the hood.
Also check whether the website is built with JavaScript, which is getting more and more popular but makes it increasingly difficult to use programs to interrogate the content.
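To give an idea of what "lower level" means here, below is a rough sketch that uses the JDK's own java.net.http.HttpClient (Java 11+) instead of Apache HttpComponents, simply because it needs no extra dependency. The URL, the User-Agent string and the class name are placeholders, not values from the question. Note that a plain HTTP client does not execute any JavaScript, so it only helps when the data is present in the raw response.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical sketch: full control over headers and timeouts, no JavaScript engine.
final class PlainHttpRequestSketch {

    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(20))   // fail instead of hanging forever
                .header("User-Agent", userAgent)   // placeholder value, set what the site expects
                .header("Accept", "text/html")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Pass the URL as the first argument; nothing is sent otherwise.
        if (args.length == 0) {
            System.out.println("usage: PlainHttpRequestSketch <url>");
            return;
        }
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        HttpResponse<String> response = client.send(
                buildRequest(args[0], "Mozilla/5.0 (placeholder)"),
                HttpResponse.BodyHandlers.ofString());
        System.out.println("status: " + response.statusCode());
        System.out.println("body length: " + response.body().length());
    }
}
```

The explicit timeouts are the main point: a request that stalls ends with an exception you can see, rather than a program that silently does nothing.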

Related

only builtin utterances working ASK

I'm developing a skill for Amazon Alexa. I'm trying to test it using echosim.io, but I have the following problem.
My skill name is MyBot, and that is also the invocation name.
In echosim.io, when I say "Alexa, launch MyBot", it gives the welcome response (the help response that I've coded in). When I say "help", it gives me the help response that I've entered.
I have 4 intents:
FaqIntentOne
FIntentOne
FaqIntentTwo
FIntentTwo
And my Sample utterances are as below.
FaqIntentOne what is first answer
FIntentOne give me first answer
FaqIntentTwo what is second answer
FIntentTwo give me second answer
When I say these, Alexa doesn't give me a response.
I have the correct methods and the correct responses set there. Please let me know why it is not working for utterances other than the built-in ones.
When I test in Alexa's test interface on developer.amazon.com, it gives me the correct response.
This is quite confusing.
Below is how it looks in my code.
if ("FaqIntentOne".equals(intentName) || "FIntentOne".equals(intentName)) {
return getFirstHelp(intent, session);
}
else if ("FaqIntentTwo".equals(intentName) || "FIntentTwo".equals(intentName)) {
return getSecondHelp(intent, session);
}
Thanks
Though Amazon has referred people to echosim, it is not 'official' (it was developed by a 3rd party), so if it works in Amazon's test environment and not in echosim then it is possible that the issue is with echosim.
Otherwise I think you are going to need to look more closely at what is happening in your code, i.e. debug it or put in some print statements and compare what happens when it is invoked in those two ways.
If you are running in Lambda - seems to be the most common - then you will need to take a look at CloudWatch logs.
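As a rough idea of what those print statements could look like, here is a stripped-down sketch of the dispatch from the question with logging added. The class name, the route method and the returned strings are stand-ins I made up; your real code returns responses from getFirstHelp and getSecondHelp. The point is to log the intent name that actually arrives, and to log the case where nothing matches.

```java
// Hypothetical stand-in for the intent dispatch in the question, with logging added.
final class IntentRouterSketch {

    static String route(String intentName) {
        // Log what actually arrives, so the two test tools can be compared.
        System.out.println("Received intent: " + intentName);

        if ("FaqIntentOne".equals(intentName) || "FIntentOne".equals(intentName)) {
            return "first";   // the question calls getFirstHelp(intent, session) here
        } else if ("FaqIntentTwo".equals(intentName) || "FIntentTwo".equals(intentName)) {
            return "second";  // the question calls getSecondHelp(intent, session) here
        }

        // Without this branch an unmatched intent fails silently.
        System.out.println("Unhandled intent: " + intentName);
        return "unhandled";
    }

    public static void main(String[] args) {
        System.out.println(route("FIntentTwo"));
        System.out.println(route("SomethingElse"));
    }
}
```

If the log shows a different intent name (or no request at all) when you speak through echosim.io, the problem is on the speech recognition side rather than in your handler.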

HtmlUnit won't open a link unless I manually tell it to

This works:
HtmlPage page = (HtmlPage) browser.getPage("http://www.somewebsite.com/viewprofile.aspx?profile_id=107992814");
However if I put the URL in a variable like this:
String userPage = "http://www.somewebsite.com/" + profileAnchorLink.getHrefAttribute();
page = (HtmlPage) browser.getPage (userPage);
I get an error that starts off like this
Exception in thread "main" ======= EXCEPTION START ========
Exception class=[net.sourceforge.htmlunit.corejs.javascript.WrappedException]
com.gargoylesoftware.htmlunit.ScriptException: Wrapped com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "data" from undefined (https://www.gstatic.com/swiffy/v7.3.2/runtime.js#72)
Any ideas? I had an HTML web bot that worked beautifully, but then I upgraded to Windows 10 and went through some messy problems; I'm not sure if that has anything to do with it. I made a new project and re-imported the HtmlUnit libraries in case something was broken (I kept the same workspace, though I'm not sure if that matters), still to no avail.
The even weirder part is that sometimes it actually works. Initially my program wasn't even using the URL; it was just going directly to the link. Then something broke, so I tried to do things a different way. The URL method was working at first, but then it started to work only sometimes, and now it doesn't work at all.
So I'm really quite lost on what's going on here.
It seems the real problem was that I wasn't using getPage properly. After applying the information from this answer (How to call getPage from HtmlUnit WebClient and have setTimeout not wait forever?), all is well... for now.

How to speed up page parsing in Selenium

What can I do if I load a page in Selenium and then have to run around 100 different parsing requests against that page?
At the moment I use various driver.findElement(By...) calls, and the problem is that each one is an HTTP (GET/POST) request from Java to Selenium. Because of this, parsing one simple page costs me 30+ seconds (too much).
I think I should get the source code (driver.getPageSource()) with the first request and then parse this string locally (my page does not change while I parse it).
Can I build some kind of HTML object from this string so that I can keep working with WebElement requests?
Do I have to use another library to build the HTML object (for example, jsoup)? In that case I will have to rewrite my parsing requests, which currently use WebElements and XPath.
Anything else?
When you call findElement, there is no need for Selenium to parse the page to find the element. The parsing of the HTML happens when the page is loaded. Some further parsing may happen due to JavaScript modifications to the page (like when doing element.innerHTML += ...). What Selenium does is query the DOM with methods like .getElementsByClassName, .querySelector, etc. That said, if your browser is running on a remote machine, things can slow down. Even locally, if you are doing a huge number of round trips between your Selenium script and the browser, it can impact the script's speed quite a bit. What can you do?
What I prefer to do when I have a lot of queries to do on a page is to use .executeScript to do the work on the browser side. This can reduce dozens of queries to a single one. For instance:
List<WebElement> elements = (List<WebElement>) ((JavascriptExecutor) driver)
.executeScript(
"var elements = document.getElementsByClassName('foo');" +
"return Array.prototype.filter.call(elements, function (el) {" +
" return el.attributes.whatever.value === 'something';" +
"});");
(I've not run the code above. Watch out for typos!)
In this example, you'd get a list of all elements of class foo that have an attribute named whatever which has a value equal to something. (The Array.prototype.filter.call rigmarole is because .getElementsByClassName returns something that behaves like an Array but which is not an Array so it does not have a .filter method.)
Parsing locally is an option if you know that the page won't change as you examine it. You should get the page's source by using something like:
String html = (String) ((JavascriptExecutor) driver).executeScript(
"return document.documentElement.outerHTML");
By doing this, you see the page exactly in the way the browser interpreted it. You will have to use something other than Selenium to parse the HTML.
Maybe try evaluating your elements only when you try to use them?
I don't know about the Java equivalent, but in C# you could do something similar to the following, which would only look for the element when it is used:
private static readonly By UsernameSelector = By.Name("username");
private IWebElement UsernameInputElement
{
get { return Driver.FindElement(UsernameSelector); }
}
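For what it's worth, a Java version of the same idea would probably be a plain getter method that performs the lookup each time it is called. The sketch below does not use the Selenium API at all: the Function parameter is a stand-in for something like driver.findElement, and every name in it is hypothetical.

```java
import java.util.function.Function;

// Hypothetical page object: the lookup runs only when the getter is called.
final class LazyLookupSketch {

    private static final String USERNAME_NAME = "username";

    // Stand-in for a driver lookup such as driver.findElement(By.name(...)).
    private final Function<String, String> findByName;

    LazyLookupSketch(Function<String, String> findByName) {
        this.findByName = findByName;
    }

    // Nothing is looked up at construction time; each call queries again.
    String usernameInput() {
        return findByName.apply(USERNAME_NAME);
    }

    public static void main(String[] args) {
        LazyLookupSketch page = new LazyLookupSketch(name -> {
            System.out.println("looking up " + name);
            return "element:" + name;
        });
        System.out.println("page object created, no lookup yet");
        System.out.println(page.usernameInput());
    }
}
```

Keep in mind that this looks the element up again on every access, so it avoids unnecessary lookups but does not reduce the cost of the ones you actually make.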

JPivot Display Mondrian Result

I am trying to display the result of a Mondrian query using JPivot. Many examples show how to use the tag library for JSP, but I need to use the Java API. I looked at the documentation, but I cannot understand how to use it to display the results in a table. Here is my code:
Query query = connection.parseQuery(mdxQuery);
Result result = connection.execute(query);
result.print(new PrintWriter(System.out,true));
I would like to know if I can use the result object to build the jpivot table.
Thanks in advance!
First of all, using JPivot is a pretty bad idea. It was discontinued back in 2008.
There is a good project intended to replace JPivot, called Pivot4j. Although it is still under development (version 0.8 -> 0.9), Pivot4j can actually do the job.
However, if we're talking about your case:
result.print(new PrintWriter(System.out,true));
This line prints the HTML code with the OLAP cube to System.out.
You can write the HTML code to some other output stream (such as a FileOutputStream) and then display it.
OutputStream out = new FileOutputStream("result.html");
result.print(new PrintWriter(out, true));
//then display this file in a browser
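One detail the snippet above leaves out is closing the stream. Here is a small self-contained sketch of the same idea with try-with-resources, so the file is flushed and closed even if printing fails. Since it has to run without Mondrian, a Consumer<PrintWriter> stands in for the result.print(...) call; the class name, method name and file name are made up.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

// Hypothetical helper: hands a PrintWriter to the caller and always closes the file.
final class ResultToFileSketch {

    static void writeTo(Path target, Consumer<PrintWriter> printer) throws IOException {
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(target, StandardCharsets.UTF_8))) {
            // With Mondrian this would be: result.print(out)
            printer.accept(out);
        } // closing the writer flushes it
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("result", ".html");
        writeTo(target, out -> out.println("placeholder for result.print(out)"));
        System.out.println("written to " + target);
    }
}
```

With a real Result object the call would look like writeTo(path, out -> result.print(out)), assuming the print(PrintWriter) method used in the question.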
However, if you want to have the same interface as in JPivot, I don't think there is an easy way to do it without .jsp. In that case I strongly recommend that you try Pivot4j.
Good luck!

Starting OpenOffice from applet

I have the code below, and it works fine from the command line...
But when I put it in an applet, I get the following error:
com.sun.star.lang.IllegalArgumentException
at com.sun.star.comp.bridgefactory.BridgeFactory.createBridge(BridgeFactory.java:158)
at com.sun.star.comp.urlresolver.UrlResolver$_UrlResolver.resolve(UrlResolver.java:130)
Does anybody have a solution for this problem? Where can I find the BridgeFactory source?
Runtime.getRuntime().exec("C:/Program Files/OpenOffice.org 3/program/soffice.exe -accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager"); // oooUrlW - the url of soffice.exe
Thread.sleep(5000);
XComponentContext xLocalContext = com.sun.star.comp.helper.Bootstrap.createInitialComponentContext(null);
XMultiComponentFactory xLocalServiceManager = xLocalContext.getServiceManager();
Object urlResolver = xLocalServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver",xLocalContext);
XUnoUrlResolver xUnoUrlResolver = (XUnoUrlResolver) UnoRuntime.queryInterface(XUnoUrlResolver.class,urlResolver);
Object initialObject = xUnoUrlResolver.resolve("uno:socket,host=localhost,port=8100;urp;StarOffice.ServiceManager");
XPropertySet xPropertySet = (XPropertySet) UnoRuntime.queryInterface(XPropertySet.class,initialObject);
XComponentContext remoteContext = (XComponentContext) UnoRuntime.queryInterface(XComponentContext.class, xPropertySet.getPropertyValue("DefaultContext"));
XMultiComponentFactory remoteServiceManager = remoteContext.getServiceManager();
Object desktop = remoteServiceManager.createInstanceWithContext("com.sun.star.frame.Desktop", remoteContext);
xDesktop =(XDesktop) UnoRuntime.queryInterface( XDesktop.class, desktop);
XComponent xCalcComponent =
newDocComponent(xDesktop, "scalc");
XSpreadsheetDocument xCalcDocument =
(XSpreadsheetDocument)UnoRuntime.queryInterface(
XSpreadsheetDocument.class, xCalcComponent);
XSpreadsheets a=xCalcDocument.getSheets();
Object o = a.getByName("Sheet1");
XSpreadsheet sheet = (XSpreadsheet)UnoRuntime.queryInterface(
XSpreadsheet.class, o);
XCell jjjj = sheet.getCellByPosition(0, 0);
jjjj.setFormula("Some Text ");
Is your applet signed? Otherwise I don't think you can call
Runtime.getRuntime().exec("C:/Program Files/OpenOffice.org 3/program/soffice.exe -accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager");
from an applet.
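A side note on the call itself: Runtime.exec(String) splits the command line on whitespace, which is fragile when the path contains spaces, as "C:/Program Files/..." does. A sketch with ProcessBuilder, passing the executable and the -accept option as separate arguments, could look like this. The class and method names are hypothetical, and this does not get around the applet security restrictions: starting the process still needs the same permissions.

```java
// Hypothetical sketch: build the soffice command as separate arguments.
final class SofficeLauncherSketch {

    static ProcessBuilder buildCommand(String sofficePath, String host, int port) {
        String accept = "-accept=socket,host=" + host + ",port=" + port
                + ";urp;StarOffice.ServiceManager";
        // Each argument is passed as-is, so spaces in the path are not split.
        return new ProcessBuilder(sofficePath, accept);
    }

    public static void main(String[] args) {
        ProcessBuilder pb = buildCommand(
                "C:/Program Files/OpenOffice.org 3/program/soffice.exe", "localhost", 8100);
        System.out.println(pb.command());
        // pb.start() would launch OpenOffice; it is left out so the sketch runs anywhere.
    }
}
```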
I agree with Pierre... you would need a trusted/signed applet to do that. You might also want to reconsider why you are trying to do this with an applet rather than a standalone application (using Web Start or something similar if you need it to be web-deliverable).
One more thing to consider is that the end-user would have to have OpenOffice installed locally (unless they have changed the way their API works) for any Java-OO.o access to work correctly. This requirement may have changed though, it has been a while since I have played around with their API.
Good luck and I hope this helps a little.
It is signed, and I found a kind of solution: on the client I grant
permission java.security.AllPermission; and now everything works...
I still haven't tried grant SignedBy "MyCompany" permission java.security.AllPermission
which I must do...
The error message was misleading me:
com.sun.star.lang.IllegalArgumentException ... stupid message
I must use an applet... it is an Oracle Forms application, and I need to start Calc on the client and fill in some data.
Thanks for the help.
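For reference, the signer-restricted grant mentioned above would look roughly like this in a Java policy file. This is an untested sketch: "MyCompany" is the alias from the post, and the keystore URL is a placeholder that has to point at a keystore containing that alias.

```
// Placeholder keystore location; signedBy aliases are resolved against it.
keystore "file:/path/to/keystore.jks";

// Grant everything only to code signed by the "MyCompany" alias.
grant signedBy "MyCompany" {
    permission java.security.AllPermission;
};
```

Compared with an unconditional grant, this limits AllPermission to code signed with that certificate rather than to every applet the client runs.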
There is a very simple way to place OOo in an applet: use the OfficeBean.
While you'll still have your Java security problem, your code will be a lot tighter. We're using this to do the same thing. My post on how to get OO 3.2 working in Java 6 applets is here if you want to take a look. It works for 3.1 and 3.2.
