Java embedded browser with resources in memory

We have a Java desktop app with an embedded browser, currently XULRunner (the Firefox engine) on SWT. The browser's API lets us load a page by specifying either a URI or its HTML content.
What we need is to load HTML pages together with their resources, with everything held in memory. The ideal solution would be a listener invoked whenever the engine tries to load a resource, so we can hand it the appropriate content.
Any ideas? Thank you!

It sounds like you need a small HTTP/web server. There is Jetty, and there are also a few smaller ones; just search for "small java web server" or similar.
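For example, a minimal sketch with embedded Jetty (Jetty 9 API assumed; the class name, resource map, port, and sample content are illustrative, not part of the original answer) that serves everything straight from memory, so the embedded browser can load it over a local URL:

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.AbstractHandler;

public class InMemoryResourceServer {
    // Illustrative in-memory store: request path -> resource bytes.
    static final Map<String, byte[]> resources = new ConcurrentHashMap<>();

    public static void main(String[] args) throws Exception {
        resources.put("/index.html",
                "<html><body><img src=\"/logo.png\"/></body></html>".getBytes());

        Server server = new Server(8080); // port chosen arbitrarily
        server.setHandler(new AbstractHandler() {
            @Override
            public void handle(String target, Request baseRequest,
                               HttpServletRequest request, HttpServletResponse response)
                    throws IOException, ServletException {
                // Every resource lookup is answered from the map, never from disk.
                byte[] content = resources.get(target);
                if (content != null) {
                    response.setStatus(HttpServletResponse.SC_OK);
                    response.getOutputStream().write(content);
                } else {
                    response.setStatus(HttpServletResponse.SC_NOT_FOUND);
                }
                baseRequest.setHandled(true);
            }
        });
        server.start();
        // Point the embedded browser at http://localhost:8080/index.html
    }
}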

In HTML5 you can put your resources inside the HTML itself, for example as Base64-encoded data URIs.
So you can use SWT with a browser that supports HTML5 and prepare your web pages to carry their resources inline.
With the SWT Browser you can then simply call browser.setText(html) to load the page from memory.
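A minimal sketch of that approach (the inlined image is a 1x1 transparent GIF used purely for illustration):

import org.eclipse.swt.SWT;
import org.eclipse.swt.browser.Browser;
import org.eclipse.swt.layout.FillLayout;
import org.eclipse.swt.widgets.Display;
import org.eclipse.swt.widgets.Shell;

public class InMemoryPage {
    public static void main(String[] args) {
        Display display = new Display();
        Shell shell = new Shell(display);
        shell.setLayout(new FillLayout());

        Browser browser = new Browser(shell, SWT.NONE);
        // The image is inlined as a data URI, so no external file is fetched.
        String html = "<html><body><h1>Hello</h1>"
                + "<img src=\"data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///"
                + "yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\"/></body></html>";
        browser.setText(html); // loads the page entirely from memory

        shell.open();
        while (!shell.isDisposed()) {
            if (!display.readAndDispatch()) display.sleep();
        }
        display.dispose();
    }
}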

Related

Upload an HTML page to the server along with the referenced images using Java

I am working on code that requires uploading any kind of document from the client's machine to the server and extracting images out of it. For almost all document types Tika is helpful, but in the case of an HTML page the images are referenced by paths on the local machine. So how do I upload the HTML page along with the images it contains?
I'm using Java Servlets and JSP as the platform.
This is impossible to solve purely server-side; you have to implement a client-side (JavaScript? Java applet? Flash (yuck!)?) solution. The HTML document is just text: it does not contain the images, it only references them. So you have to parse the document, get the images, upload them independently, and then, server-side, process the document and adjust the image references (the values of the src attributes).
Pretty complex, isn't it?
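For the server-side rewriting step, a sketch using Jsoup (the class and method names, the uploadedBaseUrl parameter, and the file-name handling are assumptions for illustration):

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ImageSrcRewriter {
    // Rewrites each img src so it points at the copy uploaded to the server.
    public static String rewrite(String html, String uploadedBaseUrl) {
        Document doc = Jsoup.parse(html);
        for (Element img : doc.select("img[src]")) {
            String src = img.attr("src");
            // Keep only the file name from the client-machine path.
            int cut = Math.max(src.lastIndexOf('/'), src.lastIndexOf('\\'));
            img.attr("src", uploadedBaseUrl + "/" + src.substring(cut + 1));
        }
        return doc.outerHtml();
    }
}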

Jsoup get dynamically generated HTML

I can connect to most sites and get the HTML just fine, but when connecting to a website where most of the content is generated with JavaScript after the initial page load, I don't get any of that data. Is there any way to do this with Jsoup, or does it not support it?
Jsoup has some basic connection handling included, but it is not a web browser. It excels at parsing static HTML content. It does not run any JavaScript, so you are out of luck. However, there are different options that you might follow:
You can analyze the page you want to retrieve and find out how the content you are interested in gets loaded. Often it is not very hard to tap the original source of the loaded content (for example, the endpoint behind an XHR call) and work with that directly. This approach has the benefit that you get what you want with no extra libraries, and retrieval will be fast.
You can use a (full) browser and automate the loading of the page. A very good tool for this is Selenium WebDriver in combination with the headless WebKit browser PhantomJS (see the sketch below). This, however, requires extra software and extra libraries in your project and will run much slower than the first solution.
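A sketch of that second option, assuming the selenium and phantomjsdriver jars are on the classpath and the PhantomJS binary is on the PATH (the class name and URL are illustrative):

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriver;

public class RenderedPageFetch {
    public static void main(String[] args) {
        WebDriver driver = new PhantomJSDriver();
        try {
            driver.get("http://example.com/"); // illustrative URL
            // getPageSource() returns the DOM after JavaScript has run,
            // so Jsoup now sees the dynamically generated content too.
            Document doc = Jsoup.parse(driver.getPageSource());
            System.out.println(doc.title());
        } finally {
            driver.quit();
        }
    }
}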

Web Java applet show web page

Is it possible to make a Java applet that loads a web page? I need to load a web page inside another web page.
An iframe, object, or other HTML constructs are not what I need; I need a Java applet specifically.
Any source code or tutorials?
It is possible.
Swing text components support HTML (see the editor panes lesson in the Java Tutorials), but if the web page is too complex it might not be displayed correctly.
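A minimal sketch (the class name and URL are illustrative); note that JEditorPane's built-in HTML support covers roughly HTML 3.2, which is why complex pages render poorly:

import java.io.IOException;
import javax.swing.JApplet;
import javax.swing.JEditorPane;
import javax.swing.JScrollPane;

public class PageApplet extends JApplet {
    @Override
    public void init() {
        JEditorPane pane = new JEditorPane();
        pane.setEditable(false); // read-only display of the fetched page
        try {
            pane.setPage("http://example.com/"); // illustrative URL
        } catch (IOException e) {
            pane.setText("Could not load page: " + e.getMessage());
        }
        add(new JScrollPane(pane));
    }
}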

Fetching the entire web page from a specific URL using Java

Can I fetch an entire web page, including CSS and images, using Java? That is basically what happens when using the "Save as" action in a browser. I can use any free third-party library.
edit:
The HtmlUnit library seems to do exactly what I need. This is how I use it to grab the entire web page:
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

WebClient webClient = new WebClient();
HtmlPage page = webClient.getPage(new URL("..."));
page.save(new File("...")); // saves the page together with the resources it references
Java has built-in classes you can use to open a stream to an external source, say a web server, and request a page, which returns the page's source to you. You would then need to parse out the links to external images and CSS, request those, and save them accordingly.
Here is a link to an example of opening a stream to an external source (a website).
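Such an example might look like the following rough sketch (the class name and URL are illustrative); note it only fetches the HTML source, and parsing out the referenced images and CSS is the part left to you:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class PageSourceFetcher {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com/"); // illustrative URL
        // Open a stream to the web server and read the page source back.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}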
Maybe the Lobo browser can help you. It is a free, open-source browser written completely in Java. It has some jar libraries that can be added to your project.

Web Page Size Using Java

How can I get the size of a web page, in bytes, from a URL? This should include all images.
Any help? Thanks in advance.
To find the number of bytes used to represent a web page, you would need to:
fetch the HTML page and all images, scripts, CSS files, etc. that it references, transitively,
evaluate any embedded scripts (as per the HTML spec) to see if they pull in further resources, and
sum the byte counts for all resources loaded to give the "web page size".
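A rough sketch of the first and third steps only (no script evaluation, so dynamically loaded resources are missed; Jsoup is assumed for link extraction, and the class name and URL are illustrative):

import java.io.InputStream;
import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PageSize {
    // Downloads one resource and counts the bytes actually transferred.
    static long byteCount(String url) throws Exception {
        long total = 0;
        try (InputStream in = new URL(url).openStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) total += n;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        String pageUrl = "http://example.com/"; // illustrative URL
        Document doc = Jsoup.connect(pageUrl).get();
        long total = byteCount(pageUrl);
        // Sum the directly referenced images, scripts and stylesheets.
        for (Element e : doc.select("img[src], script[src], link[rel=stylesheet]")) {
            String res = e.hasAttr("src") ? e.absUrl("src") : e.absUrl("href");
            if (!res.isEmpty()) total += byteCount(res);
        }
        System.out.println("Approximate page size: " + total + " bytes");
    }
}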
But I don't see what you would learn by doing this. For instance, the web page size (as above) is not a good predictor of network usage.
You say:
I am doing this to analyze the performance of a web page.
A better way would be to use something like the YSlow plugin for Firefox.
