How to screen scrape an Ajax site in Java? - java

I wish to screen scrape several Ajax based websites and simulate clicks which refresh part of the webpage, and then read the updated HTML. Is there any Java library which can do this?

Use HtmlUnit; it's great for this. It is a headless browser that can simulate clicks, mouse movement, and pretty much everything else you would want.

I think the only way to do this is to embed a browser so that the JavaScript is executed, and then grab the data once the DOM has been updated. This related Stack Overflow question may help.

These books should help you (although only the first one is intended for Java developers):
Programming Spiders, Bots, and Aggregators in Java;
Webbots, Spiders, and Screen Scrapers;
Spidering Hacks.

Related

Dynamically and in real time update web page contents

I need to create a screen that is automatically updated every minute or so with fresh data from a server-based data source, perhaps a simple text file or XML - and is displayed as a web page.
The screen will show a list of items, each marked with a red, yellow or green icon to indicate its status. Each item has a drop-down where you can select/change the current status. So, when a user changes the status of one of the items, every screen monitoring this web page should be updated automatically.
I'm a pretty novice web programmer (I only have experience with desktop programming, VB and a little C#), so I'm really just hoping for a quick push in the right direction here. I assume that this really isn't all that difficult to implement. Am I wrong? And where can I find more info on how to do this? :-)
I really appreciate any help I can get. Thanks in advance!
This sounds like a job for WebSockets. You don't need to update every minute; you can push an update every time a user makes a change. On the client side, WebSockets are very easy to use from JavaScript. A tutorial can be found here.
On the server side you need a WebSocket server. Which one you need depends on the programming language you use.
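As a rough illustration of the push model described above, here is a minimal client sketch in Java using the JDK's built-in java.net.http.WebSocket (Java 11+). The "itemId=status" message format and the class name are assumptions made up for this example, not part of any real protocol; a browser client would do the equivalent with the JavaScript WebSocket API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.Map;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a client that keeps a local copy of item statuses in sync
// with whatever the server pushes. The message format is hypothetical.
final class StatusBoardClient implements WebSocket.Listener {
    private final Map<String, String> statuses = new ConcurrentHashMap<>();
    private final StringBuilder partial = new StringBuilder();

    // Applies one "itemId=status" message (assumed format) to the local map.
    // Returns false if the message does not match that format.
    boolean applyUpdate(String message) {
        int eq = message.indexOf('=');
        if (eq <= 0 || eq == message.length() - 1) {
            return false;
        }
        statuses.put(message.substring(0, eq).trim(), message.substring(eq + 1).trim());
        return true;
    }

    Map<String, String> statuses() {
        return statuses;
    }

    @Override
    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
        partial.append(data);          // one message may arrive in several fragments
        if (last) {
            applyUpdate(partial.toString());
            partial.setLength(0);
        }
        webSocket.request(1);          // ask for the next message
        return null;                   // the data can be reclaimed immediately
    }

    // Connects to a WebSocket endpoint; the URL is supplied by the caller.
    static WebSocket connect(String url, StatusBoardClient client) {
        return HttpClient.newHttpClient()
                .newWebSocketBuilder()
                .buildAsync(URI.create(url), client)
                .join();
    }
}
```

Assuming a server at a placeholder address such as ws://localhost:8080/status that relays every message to all connected clients, a status change would be announced with connect(url, client).sendText("item-1=red", true), and every other client would see it arrive in onText without polling.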

Posting/clicking buttons java

I have some basic knowledge of Java (I've made little programs to help with everyday life). Now I want to make a program that posts offers on a site every 2-3 minutes. I have never done anything in Java related to the internet or web pages, and even after browsing the internet I am clueless. How would I go about setting up a connection to a certain page, clicking a certain button on that page, then filling in 3 boxes with information and posting the offer?
Here is what I have to click:
Here is the form I have to fill in:
You have to learn a bit more about how HTML forms work, but basically the browser sends a POST request to the web server with the values from the form.
Chrome's Developer Tools and Firefox's web developer toolbar can help you discover what request is sent to the server when you submit the form. Of course, you can fill in the values to your heart's content.
If you already know what content you want to send, you can post it relatively easily; it is described well, with examples, in this Stack Overflow answer: https://stackoverflow.com/a/4206094/182474.
But I strongly recommend learning how browsers work internally from the perspective of the HTTP protocol, because that knowledge is very useful if you want to develop tools that interact with web APIs or simulate posting HTML forms.
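To make the "browser sends a POST request with the form values" idea concrete, here is a minimal sketch using the JDK's java.net.http.HttpClient (Java 11+). The class name, target URL and field names are placeholders; the real ones have to be read from the request shown in the browser's developer tools.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

final class FormPoster {
    // Encodes fields the way a browser does for an ordinary HTML form
    // (application/x-www-form-urlencoded): name=value pairs joined by '&'.
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Sends the form to the given URL and returns the HTTP status code.
    static int post(String url, Map<String, String> fields)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }
}
```

With three hypothetical fields this would be called as FormPoster.post("https://example.com/offers", Map.of("item", "Blue Hat", "price", "5", "amount", "1")). Repeating it every 2-3 minutes is then just a loop with a sleep or a scheduled task.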
You could use the java.awt.Robot class to do all these things =)
You only need to translate your manual actions into automated ones =)
move the mouse onto the text field
write this...
tab
write this other...
tab, press enter
wait N seconds
Hope it helps! =)
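The steps listed above could be sketched with java.awt.Robot roughly like this. The class name, the screen coordinates and the restriction to plain ASCII letters, digits and spaces are assumptions of this sketch; Robot needs a real display and types into whatever window currently has focus.

```java
import java.awt.AWTException;
import java.awt.Robot;
import java.awt.event.InputEvent;
import java.awt.event.KeyEvent;

// Sketch: replays "click, type, Tab, type, Tab, Enter" with java.awt.Robot.
final class RobotFormFiller {
    private final Robot robot;

    RobotFormFiller() throws AWTException {
        robot = new Robot();       // throws AWTException on a machine without a display
        robot.setAutoDelay(50);    // pause 50 ms after each generated event
    }

    // Only plain ASCII letters, digits and spaces are typed; anything else
    // depends on the keyboard layout and is skipped in this sketch.
    static boolean isTypeable(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == ' ';
    }

    void clickAt(int x, int y) {
        robot.mouseMove(x, y);
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
    }

    void press(int keyCode) {
        robot.keyPress(keyCode);
        robot.keyRelease(keyCode);
    }

    void type(String text) {
        for (char c : text.toCharArray()) {
            if (!isTypeable(c)) {
                continue;
            }
            boolean shift = Character.isUpperCase(c);
            if (shift) robot.keyPress(KeyEvent.VK_SHIFT);
            press(KeyEvent.getExtendedKeyCodeForChar(c));
            if (shift) robot.keyRelease(KeyEvent.VK_SHIFT);
        }
    }

    // The listed steps in order; x and y are screen coordinates of the first box.
    void fillAndSubmit(int x, int y, String first, String second) {
        clickAt(x, y);               // move the mouse onto the text field
        type(first);                 // write this...
        press(KeyEvent.VK_TAB);      // tab
        type(second);                // write this other...
        press(KeyEvent.VK_TAB);      // tab
        press(KeyEvent.VK_ENTER);    // press enter
    }
}
```

A call such as new RobotFormFiller().fillAndSubmit(400, 300, "first value", "second value") performs one pass; for the "wait N seconds" step, Thread.sleep between passes is the simplest option, since Robot.delay only accepts up to 60 seconds.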

Making an android app for a website

I'm trying to make an app for website that I DO NOT OWN OR HAVE ACCESS TO ITS DB.
The website is a forum community website, and I wish to make an app that can list the menu and the posts in a UI suited for mobile.
Also, I am trying to see if I can add a real-time notification function that will let the thread poster know when there is a new comment/post on his or her thread. (The website does not support this function.)
Do you guys think I can achieve this through Jsoup, or would I need other utilities too?
Also, I am quite a beginner in Java, so the app cannot be too complicated.
Thanks.
That means you want to write your own web browser! You need to call the URL, get what it returns, parse it, and show it. You would also need to implement something like RSS for specific pages, meaning the app has to call the URL regularly and check for changes or new comments. It's doable, but it won't really be efficient or bug-free. I wouldn't recommend it, but as I said, it's doable.
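The "call the URL regularly and check for changes" part could look roughly like the sketch below, which uses only classes that also exist on Android. The class name is made up, and hashing the whole page is a simplifying assumption: in practice you would first extract just the comment section (for example with Jsoup, as mentioned in the question), otherwise any ad or timestamp change counts as "new".

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PageWatcher {
    private String lastFingerprint;   // null until the first check

    // Downloads the raw HTML of a page (assumes the page is UTF-8 encoded).
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    // SHA-256 of the content as lower-case hex, used as a cheap change detector.
    static String fingerprint(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);   // SHA-256 is a required algorithm
        }
    }

    // Returns true when the content differs from what the previous call saw.
    // The first call only records a baseline and returns false.
    boolean hasChanged(String content) {
        String current = fingerprint(content);
        boolean changed = lastFingerprint != null && !lastFingerprint.equals(current);
        lastFingerprint = current;
        return changed;
    }
}
```

A background task would call watcher.hasChanged(PageWatcher.fetch(threadUrl)) every few minutes and raise a notification when it returns true. On Android the fetch has to run off the main thread.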

How to embed a Twitch Stream inside a JFrame

I am a novice programmer and I'm trying to create a program that has links to many popular 'streamers' on Twitch.tv. (For those who don't know: Twitch.tv is a streaming website for people who stream games). When a user clicks on a link (JButton) to their favourite streamer, I want to open the Twitch Video inside my program UI (A JFrame). How can I achieve this?
All I know is that it's possible with YouTube videos if you render them as HTML5, but Twitch doesn't seem to have this feature and appears to require Adobe Flash. I also tried searching online, but to no avail.
Any help would be greatly appreciated!
You could try one of these two libraries:
http://djproject.sourceforge.net/main/index.html
or
http://www.teamdev.com/jxbrowser/

How does Java interact with buttons on http webpage?

I have little knowledge of web programming. I have looked through a lot of examples, and many of them show how to write content to a web page for the browser to display, but none show how to receive a request when the user presses a button on the web page.
For example, when the server starts, it opens port 80; then I want to go to a web browser and type "localhost:80" to access the web page served by it. The web page has a button, and when I click it, the page changes to something else like "Clicked!".
Can someone show me an example code of this? Link to an example would be great as well.
Thank you very much.
I would recommend using something like Apache's HttpClient to imitate a button press, which is just an HTTP POST. And if you don't want to use a third-party library, the standard Java library already includes its own HTTP client, java.net.HttpURLConnection.
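Here is a small sketch that shows both sides with the JDK alone: a page with a button served by com.sun.net.httpserver.HttpServer, and the same "click" imitated from Java as a POST with HttpURLConnection. The class and method names are invented for this example, and treating every POST to "/" as a click is a simplification.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

final class ButtonDemo {
    static final String FORM_PAGE =
            "<html><body><form method=\"post\" action=\"/\">"
            + "<button type=\"submit\">Click me</button></form></body></html>";

    // Starts a server: GET shows the button, POST (the click) shows "Clicked!".
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            boolean clicked = "POST".equals(exchange.getRequestMethod());
            byte[] body = (clicked ? "<html><body>Clicked!</body></html>" : FORM_PAGE)
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Imitates the button press from Java: an empty POST to the same URL.
    static String pressButton(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.getOutputStream().close();   // send an empty request body
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

After ButtonDemo.start(8080), opening http://localhost:8080/ in a browser shows the button, and clicking it loads the "Clicked!" page. Port 8080 is used here because binding to port 80 usually needs elevated privileges.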
