I want to find all the nodes of a web page and get their absolute positions. How can I do this via JavaFX?
(In other words, I want to find the absolute position of each HTML tag as it is rendered by a browser. How can I do this with JavaFX or anything else?)
I am using JavaFX 2.2.
For more details, please see the first two comments.
I guess you need this: JavaFX WebView.
Honestly, I have no idea how to use it, but looking at the examples (on the same page) I can see that it uses WebKit and lets you query all the rendered elements.
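Here is a minimal sketch of that idea, assuming you load the page in a WebView and ask WebKit itself for each element's position via JavaScript. The URL is a placeholder, and since JavaFX 2.2 targets Java 7, the listener is an anonymous class rather than a lambda:
import javafx.application.Application;
import javafx.beans.value.ChangeListener;
import javafx.beans.value.ObservableValue;
import javafx.concurrent.Worker;
import javafx.scene.Scene;
import javafx.scene.web.WebEngine;
import javafx.scene.web.WebView;
import javafx.stage.Stage;

public class NodePositions extends Application {
    @Override
    public void start(Stage stage) {
        WebView webView = new WebView();
        final WebEngine engine = webView.getEngine();
        engine.getLoadWorker().stateProperty().addListener(new ChangeListener<Worker.State>() {
            @Override
            public void changed(ObservableValue<? extends Worker.State> obs,
                                Worker.State oldState, Worker.State newState) {
                if (newState == Worker.State.SUCCEEDED) {
                    // Ask the rendering engine for the absolute position of every element.
                    Object positions = engine.executeScript(
                        "var out = [];" +
                        "var els = document.getElementsByTagName('*');" +
                        "for (var i = 0; i < els.length; i++) {" +
                        "  var r = els[i].getBoundingClientRect();" +
                        "  out.push(els[i].tagName + ' ' +" +
                        "    (r.left + window.pageXOffset) + ',' +" +
                        "    (r.top + window.pageYOffset));" +
                        "}" +
                        "out.join('\\n');");
                    System.out.println(positions);
                }
            }
        });
        engine.load("http://example.com"); // placeholder URL
        stage.setScene(new Scene(webView, 800, 600));
        stage.show();
    }

    public static void main(String[] args) { launch(args); }
}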
Related
I am working on a little app for myself. I am trying to get a list of links from a site, for example: http://kinox.to/Stream/Prison_Break.html
If you hover over the big window in the middle that says "kinox.to best online", the link I want shows in the bottom left. The problem is that if I look at the HTML file, I can't find the link anywhere. I guess it has something to do with the site using JavaScript or AJAX.
Is it possible to somehow get the link using JSoup or are there any other Java libraries that could help me?
I did not look closely into the page you are trying to load, but here is what I think the problem is: the link is loaded/generated dynamically via JavaScript. Jsoup does not run JavaScript, so you can't find the link in the HTML.
Two possible solutions:
1) Use something like Selenium WebDriver to access the content. The Java bindings let you remote-control a real browser, which should have no problems loading the page and running all the scripts within. This solution is simple to program, but runs slowly and may depend on an external browser being installed on the machine. An alternative to WebDriver is the JavaFX WebKit engine, if you are on Java 8.
2) Analyse the traffic and the JavaScript on the page and find out where the link comes from. This may take a bit of time, but once you succeed you can use Jsoup to get all the data you need (see the sketch below). This solution should run much faster than solution 1.
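A rough sketch of solution 2, assuming you have already identified the AJAX endpoint that returns the link; the endpoint URL and the selector below are placeholders you would replace with what the traffic analysis turned up:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class LinkFetcher {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint discovered via the browser's network tab
        Document doc = Jsoup.connect("http://example.com/ajax-endpoint")
                .ignoreContentType(true) // endpoint may return JSON or an HTML fragment
                .get();
        // Placeholder selector; pick the element that carries the link
        String link = doc.select("a").first().attr("href");
        System.out.println(link);
    }
}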
One solution and probably the easiest would be to use Selenium:
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

WebDriver driver = new FirefoxDriver();
driver.get("http://kinox.to/Stream/Prison_Break.html");
String mylink = driver.findElement(By.cssSelector("#AjaxStream > a")).getText();
My JSP project is the back-end of a fairly simple site whose purpose is to present many submissions, organized in categories, basically like a typical forum.
The content is loaded entirely from a database, since creating separate files for everything would be extremely redundant.
However, I want to give the users the possibility to navigate properly on my site and also give unique links to each submission.
So for example a link can be: site.com/category1/subcategory2/submission3.jsp
I know how to generate those links, but is there a way to automatically redirect all the theoretically possible links to the main site.com/index.jsp ?
The Java code of the JSP needs access to the original link of course.
Hope someone has an idea...
Big thanks in advance! :)
Alright, in case someone stumbles across this one day...
The way I was able to solve this was with a servlet. Eclipse lets you create one directly in the project, and the wizard even lets you set the URL mapping, for example /main/*, so you don't have to edit web.xml yourself.
The doGet method simply contains the forward:
request.getRequestDispatcher("/index.jsp").forward(request, response);
This kind of forwarding unfortunately causes all relative links in the webpage to fail. This can be solved by linking relative to the root directory, for example. See the neat responses here for alternatives: Browser can't access/find relative resources like CSS, images and links when calling a Servlet which forwards to a JSP
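For reference, a minimal sketch of the whole servlet, using the Servlet 3.0 annotation instead of web.xml; the class name and the /main/* mapping are just the example values from above:
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Catches every URL under /main/* and forwards it to index.jsp.
@WebServlet("/main/*")
public class ForwardServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // In the forwarded index.jsp, request.getRequestURI() still returns the
        // original link, e.g. /main/category1/subcategory2/submission3.jsp
        request.getRequestDispatcher("/index.jsp").forward(request, response);
    }
}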
I'm working on a way to detect defacement on my website. The idea is to crawl the whole website and for each page, take a screenshot or render the website as an image and compare it with the last time the page has been checked.
I'm looking for a way to convert a whole webpage (HTML, CSS, JS) into an image, like a screenshot, regardless of language (though I would prefer Java, Python or C#).
I need it to be fast and usable on a server.
I already tried the following in Java:
CSSBox, but the rendering isn't good enough (no JS support)
Selenium WebDriver, but it's way too slow (time to open Firefox, display the page, etc.) and not usable without a GUI
I think the solution would be some kind of wrapper around a web engine, but I didn't find anything like that (at least in Java). I've been told PhantomJS would fit this need; is that right?
The perfect result would be to create something like that: http://www.page2images.com/home
Use a browser which you can control via a script or command-line options, like PhantomJS. The documentation contains examples of how to take screenshots of URLs.
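Since you prefer Java and need this on a server, here is a hedged sketch of driving PhantomJS from Java. It assumes phantomjs is installed and on the PATH, and uses the rasterize.js script that ships in PhantomJS's examples directory; the URL and output file are placeholders:
import java.io.IOException;

public class PhantomShot {
    public static void main(String[] args) throws IOException, InterruptedException {
        // rasterize.js usage: phantomjs rasterize.js URL filename
        Process p = new ProcessBuilder(
                "phantomjs", "examples/rasterize.js",
                "http://example.com", "page.png") // placeholder URL and output
            .inheritIO() // show PhantomJS output in this process's console
            .start();
        int exit = p.waitFor();
        System.out.println("phantomjs exited with " + exit);
    }
}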
The website you linked offers a good REST API that performs the task: is that not a viable option for you?
Selenium is your best bet. Depending on your page content (i.e. JS libraries, etc.), it might take some time, but you could automate this with a script to run nightly via cron, or using screen.
It has a rich language of assertions and simulated mouse events, and ways to regression-test and/or monitor the state of a set of pages.
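For the screenshot itself, here is a minimal sketch using Selenium's TakesScreenshot interface; the URL and output path are placeholders:
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class PageShot {
    public static void main(String[] args) throws Exception {
        WebDriver driver = new FirefoxDriver();
        try {
            driver.get("http://example.com"); // placeholder URL
            // Ask the driver to render the current page to a temporary file
            File shot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
            Files.copy(shot.toPath(), Paths.get("page.png"),
                    StandardCopyOption.REPLACE_EXISTING);
        } finally {
            driver.quit();
        }
    }
}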
Good luck.
Without a GUI, it's probably not possible to do something like this.
If you're not too tight on the GUI and related things, you can use the JavaFX WebView and take a screenshot of the node using the following code:
WritableImage image = webView.snapshot(null, null);
BufferedImage bufferedImage = SwingFXUtils.fromFXImage(image, null);
// ... e.g. write bufferedImage to disk with ImageIO
References:
WebView#snapshot
SwingFXUtils#fromFXImage
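Putting it together, a fuller sketch that waits for the page to finish loading before snapshotting and then writes a PNG; the URL and file name are placeholders, and snapshot() must run on the FX Application Thread (which the load listener already does):
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import javafx.application.Application;
import javafx.concurrent.Worker;
import javafx.embed.swing.SwingFXUtils;
import javafx.scene.Scene;
import javafx.scene.image.WritableImage;
import javafx.scene.web.WebView;
import javafx.stage.Stage;

public class WebShot extends Application {
    @Override
    public void start(Stage stage) {
        WebView webView = new WebView();
        webView.getEngine().getLoadWorker().stateProperty().addListener(
            (obs, oldState, newState) -> {
                if (newState == Worker.State.SUCCEEDED) {
                    // Page is fully loaded; render the WebView node to an image
                    WritableImage image = webView.snapshot(null, null);
                    BufferedImage buffered = SwingFXUtils.fromFXImage(image, null);
                    try {
                        ImageIO.write(buffered, "png", new File("page.png")); // placeholder path
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            });
        webView.getEngine().load("http://example.com"); // placeholder URL
        stage.setScene(new Scene(webView, 1024, 768));
        stage.show();
    }

    public static void main(String[] args) { launch(args); }
}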
I want to extract HTML data from a website using Java. The problem is that the webpage keeps loading more content once the user scrolls to the bottom of the page. The number of times it scrolls is fixed. My Java code can extract only the first part. How do I extract the content from the remaining scrolls? Is there a way to load the whole page at once with Java? Any help would be appreciated :)
This might be the type of thing that PhantomJS (http://phantomjs.org/) was designed for. It will crawl entire web pages and even execute JavaScript, using a "real" browser in headless mode. I suggest stopping what you're doing with Java and taking a look at PhantomJS instead. It could save you a LOT of time. :)
This type of behavior is implemented in the browser, which interprets the user's scrolling actions to load more content via AJAX and dynamically modifies the in-memory DOM. Consider that your Java runs in a web container on the server, and that web container (e.g. Tomcat, JBoss, etc.) provides a huge amount of underlying code so your app doesn't have to worry about the plumbing.
Conceptually, a similar thing occurs at the client, with the DHTML web page running in its own "container" (the browser), which provides a wealth of functionality, from UI to networking, to DOM, etc. If you remove the browser from the equation and replace it with a Java program, you will need to provide the equivalent of the browser in which the DHTML/Javascript can execute.
I believe that HtmlUnit may fit the bill, but I have not worked with it personally.
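A hedged sketch of that approach with a recent HtmlUnit, assuming the extra content is fetched by AJAX each time the page scrolls to the bottom; the URL, scroll count, and wait time are placeholders:
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class InfiniteScroll {
    public static void main(String[] args) throws Exception {
        try (WebClient client = new WebClient()) {
            client.getOptions().setThrowExceptionOnScriptError(false);
            HtmlPage page = client.getPage("http://example.com"); // placeholder URL
            for (int i = 0; i < 5; i++) { // the page's fixed number of scrolls (assumed 5)
                // Simulate the scroll that triggers the AJAX load
                page.executeJavaScript("window.scrollTo(0, document.body.scrollHeight);");
                client.waitForBackgroundJavaScript(2_000); // give the AJAX call time to finish
            }
            System.out.println(page.asXml()); // full DOM after all content has loaded
        }
    }
}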
I have a web page which is divided into several iframes.
In each frame is a different solution: JavaScript in one, Flash in another, an applet in another.
I am trying to provide a solution where, if a certain event happens while a user interacts with the applet, the applet dies and the same iframe gets loaded with another solution (something href-like). I want to be able to load another applet, or a raw HTML solution, or whatever.
I suspect I need to wrap these solutions in something else like JavaScript, but I am wondering what this solution would look like.
Thanks in advance.
See Applet.getAppletContext().showDocument(url, target).
Note that it is not guaranteed to be implemented in the JRE/browser combo the applet is loaded in, let alone to work. That is where the JavaScript comes in. ;)
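A minimal sketch of that call from inside the applet; the event-handler name and the fallback page are assumptions for illustration:
import java.applet.Applet;
import java.net.MalformedURLException;
import java.net.URL;

public class SwitchingApplet extends Applet {

    // Hypothetical hook: call this when the "certain event" occurs.
    private void onFatalEvent() {
        try {
            URL next = new URL(getDocumentBase(), "fallback.html"); // assumed replacement page
            // "_self" replaces the document in the iframe hosting this applet
            getAppletContext().showDocument(next, "_self");
        } catch (MalformedURLException e) {
            e.printStackTrace();
        }
    }
}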