Scraping a ReactJS website using Java and Selenium

I have everything set up to run a headless browser using Selenium in Java. What I can't figure out is what to do next to extract elements from this ReactJS website (I'm not sure whether the site uses ReactJS or plain JavaScript).
I feel like I may be approaching this wrong and/or missing some libraries that would help me along the way.
Any help greatly appreciated.

If you do web interface tests or scraping with Java + Selenium, I advise you to use the NoraUi open source framework.

Related

Automatically navigating a website with java

A few years ago I made a program in .NET that uses the webbrowser control. With that I was able to automatically log in to a website, navigate, and download pictures. It was GUI based since it was using the webbrowser control. It had the advantage that I could follow along and see if something went wrong.
What is the best way forward to replicate that idea in Java? Is there a similar free control that acts as a webbrowser and gives access to the DOM?
I suspect the optimal way would be to use the Google Chrome Developer Tools to work out the login requests and replicate them via GET/POST methods, but at first I would prefer the web browser approach.
You can use Selenium for that. It is a free, open-source suite for automating web applications across different browsers and platforms.
In Java, you can use Selenium, which gives you full control over web browsers as well as the DOM.
In Selenium, WebDriver is the interface that provides fully automated control of the browser you want to use.
This may help You!
Thanks!
You could use the JavaFX WebView, class javafx.scene.web.WebView.
It uses a WebKit engine that is HTML5 compliant and seems to be up to date (it was in Java 8 and 9).
The engine interacts with the JS engine, which may help to introspect and navigate the page.
Example to get the "window" JS object:
JSObject window = (JSObject) webView.getEngine().executeScript("window");
WebView example:
JavaFx Webview HTML5 DragAndDrop
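For the GET/POST approach mentioned in the question, here is a minimal sketch using only the JDK's java.net.http client (Java 11+). The class name, the form field names ("username", "password") and the login URL are assumptions for illustration; the real ones have to be read off the site's login request in the browser's developer tools. It only covers plain form logins, not pages that need JavaScript to run.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class FormLogin {

    // Encode key/value pairs as application/x-www-form-urlencoded.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Build (but do not send) the login POST request.
    static HttpRequest loginRequest(String loginUrl, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
    }

    // A client that stores cookies, so the session survives across requests.
    static HttpClient sessionClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: FormLogin <loginUrl> <user> <password>");
            return;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", args[1]); // hypothetical field name
        fields.put("password", args[2]); // hypothetical field name
        HttpClient client = sessionClient();
        HttpResponse<String> response = client.send(
                loginRequest(args[0], fields), HttpResponse.BodyHandlers.ofString());
        System.out.println("Login status: " + response.statusCode());
    }
}
```

Reusing the same client for later requests sends the stored session cookies along, which is what makes navigating and downloading after the login possible.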

How to integrate selenium and ZAP

I am planning to automate security testing for a web application.
I have Selenium code written in Java, and now I need to integrate it with ZAP.
Please help me integrate the two and generate reports from the results.
You can configure Selenium to use Zap as a proxy. Take a look here for a full solution using WebDriver.io, which is a tiny JS wrapper around selenium. It should give you a high level overview of how to build and run such a solution in the CI. Let me know if you need more help.

How to access html information that is generated by javascript?

I am trying to get article headlines from the NY Times.
But I think the HTML is generated by JavaScript, as it is only visible when I use 'inspect element' in Firefox.
How can I get to the articles? Probably, one of the ways is to emulate a browser but that seems like overkill.
I would prefer to do this in Java but Python is okay too. Your help is appreciated!
edit:
I tried using the API, but it returns a lot of bad URLs (page not found). Does anyone have more ideas on how to get the URLs and headlines?
Selenium is probably what you're looking for; it's a browser automation framework.
You can use Python, but note that Selenium drives a real browser (Firefox, last time I heard) to load and parse a site's content.
You can get the python version here but there are other options.
You could try to use a browser without GUI like HtmlUnit. It has good JavaScript-support and you're able to read the contents of the page from your Java-program.
As an alternative solution to this particular problem, how about using the New York Times API? They provide JSONP for JavaScript support. Using the API is probably more future-proof if they ever change the site layout.

Using java to create a web browser

Is it possible to use Java to build a web browser like Internet Explorer that will open all the web pages and display all the contents?
The only valid answer to that question is:
Yes, it's possible to use Java to build a web browser.
However, a web browser is an exceptionally complex piece of software. Even Google, when building its Google Chrome browser, used existing technology to do it, rather than inventing their own browser from scratch.
If your goal is anything other than building and marketing your own browser, you may want to reconsider what exactly you want to accomplish, in order to find a more direct approach.
I advise you to take a look at the Lobo Browser project, an open-source java-written web browser. Take a look at the source and see how they did it.
Yes, it is possible. JWebPane is a work-in-progress migration of WebKit. It is supposed to be included in JDK 7, but I wouldn't hold my breath.
JWebPane browser = new JWebPane();
new JFrame("Browser").add(browser);
browser.load(someURL);
Yes, it's possible, and here's what you would need to start looking at.
First, search for an HTML renderer in Java. An example would be JWebEngine. You can start by manually downloading HTML pages and verifying that you can view them.
Second, you need to handle the networking piece. Read a tutorial on sockets, or use an HTTP Client such as the Apache HTTPClient project.
Edit:
Just to add one more thought, you should be honest with yourself about why you would work on this project. If it's to rebuild IE or Firefox, that is unrealistic. However, what you might get out of it is learning what the major issues are with browser development, and that may be worthwhile.
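The networking piece from the second step can be sketched with a plain socket, as a learning exercise rather than a real client. This is a minimal example under stated assumptions: the class and method names are made up for illustration, it speaks unencrypted HTTP/1.0 only (so no HTTPS, redirects, or chunked responses), and it assumes the server closes the connection when the response is complete.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class RawHttpGet {

    // Build the text of a minimal HTTP/1.0 GET request.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
    }

    // Send the request and return the whole response (status line, headers, body).
    static String get(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // Read until the server closes the connection.
            byte[] raw = socket.getInputStream().readAllBytes();
            return new String(raw, StandardCharsets.UTF_8);
        }
    }

    // Return everything after the blank line that separates headers from body.
    static String body(String response) {
        int split = response.indexOf("\r\n\r\n");
        return split < 0 ? "" : response.substring(split + 4);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: RawHttpGet <host> [path]");
            return;
        }
        String path = args.length > 1 ? args[1] : "/";
        System.out.println(body(get(args[0], 80, path)));
    }
}
```

The HTML returned by body(...) is what you would hand to the renderer from the first step. For anything beyond experiments, an HTTP client library such as the Apache HttpClient mentioned above handles the cases this sketch ignores.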
Take a look at the JEditorPane class. It can be used to render HTML pages and could form the basis of a simple browser.
Yes. One of the projects in Java After Hours shows you how to build a simple web browser. It's not nearly as full-featured as IE or Firefox of course (it's only one chapter in the book), but it will show you how to get started.
The hardest thing will be the rendering component. Java 7 will include JWebPane, which internally uses WebKit. Here you can find some screenshots.
I developed this browser for my college project; it may be helpful for you.
My Button is an open source Java web browser,
developed for school and college projects and for learning purposes.
Download the source code, extract the .zip file, and copy the “mybutton” folder from “parser\mybutton” to C:\
Import the project “omtMyButton” into Eclipse.
Requires Java 6.
Download the .exe and source code:
https://sourceforge.net/projects/omtmybutton/files/

Programmatic web browser Java library

Does anyone know of any Java library for programmatic web browsing?
Prowser doesn't cut it because there's no "push the button" method and Watij is limited to Internet Explorer Windows only.
HtmlUnit?
http://htmlunit.sourceforge.net/
The above link says:
... HtmlUnit is not a generic unit testing framework. It is specifically a way to simulate a browser for testing purposes...
You may be able to find some of what you want in Selenium, especially when using Selenium Server as in this IBM article.
