Screen Capture for Web application

Screen Capture for Web application - java

How to take user's current page screen shot and save into one folder in web application,
I tried in java side,
In that I used Robot class its taking server screen instead of client screen.

You can use phantomjs
http://phantomjs.org
If you want to use it along with nodejs , there are many nodejs bridges available also
I have used node-phantom and it works really well
https://github.com/alexscheelmeyer/node-phantom
Well , this is not a pure jquery solution. But i used node-phantom with the support socket.io and jquery.

If your are interested in taking a sceenshot of the webpages (or parts of them), you should take a look at html2canvas project.
It's not taking the "actual screenshots" of the current page, but rather builds a representation of it based on the properties it reads from the DOM.
If you are using HTML5, take a look at this Take Webpage Screenshot with HTML5 and JavaScript
So, this will take you a screenshot. Later you may decide what to do with it. You can push it to you server side service and store it to a file/db.

You can go for casperjs.
http://casperjs.org/
There is a nice example to capture the screen.
http://docs.casperjs.org/en/latest/modules/casper.html#captureselector

Related

Creating an image from a webpage

I'm working on a way to detect defacement on my website. The idea is to crawl the whole website and for each page, take a screenshot or render the website as an image and compare it with the last time the page has been checked.
I'm looking for a way to convert a whole webpage (HTML, CSS, JS) into an image, like a screenshot, no matter the language is (but I would prefer Java, Python or C#)
I need it to be fast and usable on a server.
I already tried the folowing in Java:
CssBox, but the rendering isn't good enough (no JS)
Selenium Web Driver, but it's way too slow (Time to open firefox, display the page etc...) and not usable without GUI
I think a solution would be a kind of wrapper for a web engine but I didn't find anything about that (at least in Java). I've been told PhantomJS would fit for this need, is it right?
The perfect result would be to create something like that: http://www.page2images.com/home

Use a browser which you can control via a script or command line options like phantomjs. The documentation contains examples how to make screenshots from URLs.

The website you linked offer some good rest API that perform the task: it's not a viable option for you?

Selenium is your best bet. Depending on your page content (ie. JS libraries, etc) it might take some time, but you could automate this with a script to run nightly via cron. Or using screen.
It has a rich language of assertions and simulated mouse events, and ways to regression-test and/or monitor the state of a set of pages.
Good luck.

With no GUI, it's probably not possible to do something like this.
If you're not too tight on the GUI and related things, you can use the JavaFX Webview and take a screenshot of the node using the following code
WritableImage image = webView.snapshot(null, null);
BufferedImage bufferedImage = SwingFXUtils.fromFXImage(image, null);
....
References:
WebView#snapshot
SwingFXUtils#fromFXImage

Reduce HTML using Applets

My supervisor has tasked me with programmatically reducing a website's content by looking at the HTML tags to reveal only the core content. Importantly, this particular piece of the project must be written in Java.
Now having learnt about the differences betweenPlugins, Extensions, Applets, and Widgets, I think I want to use an Extension that calls a client-side Applet. My approach was going to be this:
Using the Google-Chrome API, I was going to display a button that
the user can click.
If clicked, the action is to launch a new browser tab that has the
Applet embedded within it.
The applet automatically sources the called tab's HTML code and
filters it.
Once filtered, the reduced copy of the original site appears.
So I have a few questions. To start, is it even possible to use an Extension with an Applet? Moreover, is it possible for an applet to look # another tabs HTML code? If not, is it possible to just reload the original tab with the Applet now embedded within it and complete the function. Thanks.

Javascript is already on most mobile web platforms. Java is not, and there is no reasonable way mobile customers will be able to install Java. Android, which runs many, but not all, mobile devices has a Java run time environment, and is basically a loader for Java apps. But an Apple iPhone is not an Android device... nor is a Windows Phone.
If you want to summarize content on the client, and in Javascript, as I see it you have two choices:
Succeed with some inner burst of genius where dozens of the best expert PhDs in Natural Language Computing have just begun exploring how to extract "true meaning" from text; OR
look at document.title and be done with it.
The 2nd approach assumes that the authors of web pages set titles and set a title appropriate for summarizing their website. This isn't a perfect assumption, but it is OK
most of the time. It is also a lot less expensive than #1
With the 1st approach you can get a head start with a "natural language toolkit" that can do things like scan text for unusual words and phrases. To get a rough idea of the kinds of software that have been built in this area, review wikipedia: Outline of natural language processing:: toolkits. A popular tookit for python is called NLTK. Whether you use a toolkit from java, or python, it means working on the server because the client will not have the storage, network speed, or CPU. For python there are server side app frameworks like django or web2py that can make building out a server app faster, and on Java there are servlets frameworks. Ultimately you'll need a lot of help, training, or luck and as I have hinted above it can easily be beyond the capabilities of a small team of fresh hires, and certainly way beyond what a single new developer eager to prove his/her capabilities can do in a few weeks on their own with limited help.
Most web pages have titles set like this near the beginning of the downloaded HTML:
<head><title>My Furry Kittens!</title></head>
You don't need to write a parser. If you are running in the browser, the title has been parsed into the DOM or Document Object Model already. The string "My Furry Kittens!" in this example would be available in the global variable document.title.
If you like, you could put a button into a plugin and let people push it to summarize the website. Or, they could just look up at the title. It is already on the page. Of course, if the goal is to scrape titles one can avoid writing a parser and use a "fake" headless scriptable browser like phantomJS or similar.
You can read more about document.title on the Mozilla Developer Network. MDN is a great reference for learning how web browsers work. They are the maintainers of the Mozilla Firefox browser. Most of what you can learn there will also work on Chrome, Internet Explorer, and various mobile platforms.
Good Luck!

How about implementing a local proxy server on the mobile device. The browser would just need to be configured to use the proxy, while the custom proxy implementation can transform the requested html however it likes.

how to extract HTML data from a webpage which scrolls down for a fixed number of times?

I want to extract HTML data from a website using JAVA. The problem is the webpage keeps scrolling down once the user reaches the bottom of the page. Number of times it scrolls down is fixed. My JAVA code can extract only for the 1st part. How do I extract for the remaining scrolls? Is there a way to load the whole page at once with JAVA? ANy help would be appreciated :)

This might be the type of thing that PhantomJS (http://phantomjs.org/) was designed for. It will crawl entire web pages and even execute JavaScript, using a "real" browser in headless mode. I suggest stopping what you're doing with Java and take a look at PhantomJS instead. It could save you a LOT of time. :)

This type of behavior is implemented in the browser, interpreting the user's scrolling actions to load more content via AJAX and dynamically modifying the in-memory DOM in the browser. Consider that your Java runs in a web container on the server, and that web container (i.e. Tomcat, JBoss, etc) provides a huge amount of underlying code so your app doesn't have to worry about the plumbing.
Conceptually, a similar thing occurs at the client, with the DHTML web page running in its own "container" (the browser), which provides a wealth of functionality, from UI to networking, to DOM, etc. If you remove the browser from the equation and replace it with a Java program, you will need to provide the equivalent of the browser in which the DHTML/Javascript can execute.
I believe that HTMLUnit may fill the bill, but have not worked with it personally.

develop a user defined plugin for web browser

How to develop a user defined plugin for a web browser.
It should features:
It should be installed in any browsers.
It should be executed whenever the browser starts.
It should monitor the web page and access the web page that the browser displays.
It should monitor and access the web page (for example, getting a value from a text box) irrespective of the web page the browser displays. (The web page can be of any URL either google or any domain)
How to start with it? It would be helpful if there is some sample. Thanks in advance

For Firefox < 4 write an Addon, for 4 and above Jetpack will be the way to go. For Chrome write a Extension. Opera, well wait till 11.5 ships. Safari 5. IE.
Read the documentation for each browser.
Hm...
I hope you tell the user about that.
Right now it reads like you want to deploy something to a PC and monitor all browsers, well if you want to do that you'll have to put some effort into it.

I don't think 1. is possible, you will have to create multiple versions of your plugin in order to work with each browser.
There is not a single example, because as I mentioned, you are going to have to do something different. You will need to determine and target specific browsers. I would suggest starting with one and once you have it have it working move to the next browser.

Do you mean a Plugin (like Flash, PDF Reader) or and Extension?
Plugins are native programs and extensions are normally coded in JavaScript & HTML.
Depending on what you want to do, an extension is enough powerful and the better choice.
There is no browser independent way to implement plugins. For each browsers you must read the interface reference. For example the reference for chrome: http://code.google.com/chrome/extensions/getstarted.html

Java: "Control" External Application

Is it possible to programmatically start an application from Java and then send commands to it and receive the program's output?
I'm trying to realize this scenario:
I want to access a website that uses lots of javascript and special html + css features -> the website isn't properly displayed in swt.browser or any of the other of the available Browser Widgets. But the website can be displayed without any problems in firefox. So I want to run a hidden instance of firefox, load the website and get the data. (It would be nice if FF can be embedded in a JFrame or so..)
Has anybody got an idea how to realize this?
Any help would really be appreciated!
EDIT: The website loads some Javascript that does some html magic and loads some pictures. When I only read the html from the website I see nothing more than some JavaScript calls. But when the website is loaded in a Browser, it displays some images overlayed with text. That's what I'm trying to show the user of my app.

To start Firefox from within the application, you could use:
Runtime runtime = Runtime.getRuntime();
try {
String path = "/path/to/firefox";
Process process = runtime.exec(path + " " + url);
} catch (IOException e) {
// ...
}
To manipulate processes once they have started, one can often use process.getInputStream() and process.getOutputStream(), but that would not help you in the case of Firefox.
You should probably look into ways of solving your specific problem other than trying to interact directly between your application and a browser instance. Consider either moving the whole interface into a Java gui, or doing a web app from the ground up -- not half and half.

See this article - it will teach you how to start a process, read its output and write to its input stream.
However this solution may be not be the best for your problem. What kind of data do you need to get from the Web Page? Would it be better to read the html with an HTTP GET and then parse it with an Html parser?

If you have a text-mode browser available (like links2 on linux) you might want to see how well that can render the page. For example, the command "links -dump http://someurl.com" will format the page as text and exit immediately, resulting in output that might be easily parseable using the methods that Ray Myers and kgiannakakis suggest.

If the website is static, you could use a web scraper like Jericho to load the URL, parse the HTML and wander your way through the DOM to the info you need.

Although a similar feature to what you describe is planned for FireFox in the future, it is not available yet. The feature is dubbed TaskFox, and from the linked wiki, "its aim is to allow users to quickly access information and perform tasks that would normally take several steps to complete."
News of the upcoming TaskFox feature just broke today, in fact. Perhaps you should consider a career being a psychic instead of a programmer.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.