Reading html list items from android java code - java

As the title says, I found a website (rpg.rem.uz) that displays its content as a list.
I want to read the title of each list item programmatically in my Android Java code, so that I can populate a ListView the same way that list is populated.
Please let me know whether this is possible and how to do it.
Thanks in advance.
EDIT
I tried using Jsoup, but I get a "Handshake failed" exception.

You can use the Jsoup library to parse HTML.
Read the documentation and its examples:
https://jsoup.org

Theoretically you can (with android.text.Html), but in practice you shouldn't.
A WebView (android.webkit.WebView) could meet your needs, but you would be better off providing an API for your site; JSON is what you need.

This may help. First add the jsoup library to your project, then use the code below:
import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
Element content = doc.getElementById("content"); // pass your list id
Elements items = content.getElementsByTag("li");
for (Element item : items) {
    String itemText = item.text(); // title of this list item
}
OR
You can load the HTML directly from your URL:
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
Elements newsHeadlines = doc.select("li");
See the jsoup documentation (https://jsoup.org) for details and more examples.
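As a minimal sketch of the approach above, the list item titles can be collected into a List<String> that a ListView adapter could consume. The class name ListItemTitles, the method titles, and the sample markup are assumptions made for illustration; the real page's list id may differ.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ListItemTitles {

    // Collects the text of every <li> inside the element with the given id.
    // Returns an empty list when no element has that id.
    static List<String> titles(String html, String listId) {
        Document doc = Jsoup.parse(html);
        Element list = doc.getElementById(listId);
        List<String> result = new ArrayList<>();
        if (list == null) {
            return result;
        }
        for (Element item : list.getElementsByTag("li")) {
            result.add(item.text());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical markup standing in for the real page.
        String html = "<ul id=\"content\"><li>First</li><li>Second</li></ul>";
        System.out.println(titles(html, "content")); // prints [First, Second]
    }
}
```

On Android, the returned list could be passed to an ArrayAdapter<String>. If the HTML is fetched with Jsoup.connect(...).get() instead of parsed from a string, that call has to run off the main thread.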

Related

Sun's HTTP-Server Read Data directly

I am trying to get the value of, for example, a text field on the website.
Is there another way than reading it from the URI?
For example, something like
getContent().getvar("name");
You need to use a library such as jsoup to connect to the site (by URL) and get the data as a DOM. Below is an example code snippet:
Document doc = Jsoup.connect("http://google.com").get();
String title = doc.title();
System.out.println("title : " + title);
Elements links = doc.select("a[href]");
You can use methods such as select to get the required element, e.g.:
Element head = doc.select("div.head").first();
See the Javadoc of the Document class and the jsoup examples for more.
If you have the HTML code of the website, you can parse it with jsoup into a Document object. Unfortunately, there is no direct way to read the content of a website or page without actually connecting to it (via its URI).
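To illustrate parsing HTML you already have, here is a small sketch that reads the value of a text field from an HTML string. The class FieldValue, the method fieldValue, and the sample form are hypothetical; the sketch also assumes the field name is a plain identifier, since it is concatenated into the selector.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FieldValue {

    // Returns the value of the first <input> with the given name,
    // or null when the HTML contains no such field.
    static String fieldValue(String html, String fieldName) {
        Document doc = Jsoup.parse(html);
        Element input = doc.select("input[name=" + fieldName + "]").first();
        return input == null ? null : input.val();
    }

    public static void main(String[] args) {
        // Hypothetical form markup.
        String html = "<form><input type=\"text\" name=\"name\" value=\"Alice\"></form>";
        System.out.println(fieldValue(html, "name")); // prints Alice
    }
}
```

This only sees the value present in the HTML that was parsed; text a user types into the field in a browser is not part of that markup.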

Extracting article links only using jsoup

I am trying to use JSoup to extract links of articles from stock symbols.
For example on this page: http://finance.yahoo.com/q/p?s=+AAPL+Press+Releases
there are a bunch of press release titles. When you press each title, you are given a link. I want to use JSoup to extract and store the links of each one of those press releases.
This is what I have so far:
Document doc = Jsoup
.connect("http://finance.yahoo.com/q/p?s=AAPL+Press+Releases").get();
And to get the links I am using
Elements url = doc.select("p").select("a");
System.out.println(url.text());
The output is not just the links; I am getting other information along with them. Please help me tweak the .select() statements to get only the links.
Try this code:
Document document = Jsoup.connect("http://finance.yahoo.com/q/p?s=+AAPL+Press+Releases")
        .get();
Element div = document.select("div.mod.yfi_quote_headline.withsky").first();
Elements aHref = div.select("a[href]");
for (Element element : aHref) {
    System.out.println(element.attr("abs:href"));
}
Output:
http://finance.yahoo.com/news/hagens-berman-payday-millions-e-161500428.html
http://finance.yahoo.com/news/swift-playgrounds-app-makes-learning-185500537.html
http://finance.yahoo.com/news/apple-previews-ios-10-biggest-185500113.html
http://finance.yahoo.com/news/powerful-siri-capabilities-single-sign-185500577.html
http://finance.yahoo.com/news/apple-previews-major-macos-sierra-185500097.html
http://finance.yahoo.com/news/apple-previews-watchos-3-faster-185500388.html
http://finance.yahoo.com/news/apple-union-square-highlights-design-173000006.html
http://finance.yahoo.com/news/apple-opens-development-office-hyderabad-043000495.html
http://finance.yahoo.com/news/apple-announces-ios-app-design-043000238.html
http://finance.yahoo.com/news/apple-celebrates-chinese-music-garageband-230000088.html
http://finance.yahoo.com/news/apple-sap-partner-revolutionize-iphone-183000583.html
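The same idea can be tried offline with a sketch like the one below: restrict the search to one container, then read abs:href so that relative links are resolved against the base URI. The class HeadlineLinks, the method absoluteLinks, and the "headlines" class name are made up for illustration. The sketch also adds a null check, because first() returns null when the selector matches nothing (for example after a page redesign).

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class HeadlineLinks {

    // Returns the absolute URL of every link inside the first element
    // matching containerSelector; links elsewhere on the page are ignored.
    static List<String> absoluteLinks(String html, String baseUri, String containerSelector) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        Element container = doc.select(containerSelector).first();
        if (container == null) {
            return urls; // selector matched nothing
        }
        for (Element link : container.select("a[href]")) {
            urls.add(link.attr("abs:href")); // resolved against baseUri
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical markup; only the links inside div.headlines are wanted.
        String html = "<div class=\"headlines\">"
                + "<a href=\"/news/a.html\">A</a>"
                + "<a href=\"http://example.com/b.html\">B</a>"
                + "</div>"
                + "<p><a href=\"/other.html\">Other</a></p>";
        for (String url : absoluteLinks(html, "http://example.com/", "div.headlines")) {
            System.out.println(url);
        }
    }
}
```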

How to get specific HTML items into an ArrayList of Strings

I'm trying to understand how to make use of the HTML data from the APOD archive. My end goal is an ArrayList of Strings, like so:
From this URL: view-source:http://apod.nasa.gov/apod/archivepix.html
get each entry like this one: 2015 February 26: Love and War by Moonlight<br>
and put them into an ArrayList.
I'm more used to JSON or even XML from REST APIs. Parsing through HTML just seems crazy hard, so it'd be really helpful if someone could point me in the right direction on this.
Thanks!
Take a look at the HTML parser called jsoup.
It will make your task easy.
The jsoup documentation explains how to extract values from HTML.
For example:
Document doc = Jsoup.connect("http://apod.nasa.gov/apod/archivepix.html").get();
Elements links = doc.getElementsByTag("a"); // href is an attribute of <a> elements
for (Element link : links) {
    String linkHref = link.attr("href");
    String linkText = link.text();
}
Adapt the parsing to your needs.
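To end up with strings such as "2015 February 26: Love and War by Moonlight", one possible sketch pairs each link with the text node that precedes it. This assumes every entry has the shape "date: <a href=...>title</a><br>", as in the line quoted in the question; the real page may be laid out differently. The class ArchiveEntries, the method entries, and the second sample entry are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class ArchiveEntries {

    // Builds one "date: title" string per link, taking the date from the
    // text node directly before the link (empty when there is none).
    static List<String> entries(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            Node before = link.previousSibling();
            String date = before instanceof TextNode
                    ? ((TextNode) before).text().trim()
                    : "";
            result.add((date + " " + link.text()).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical excerpt; the second entry is invented sample data.
        String html = "<b>2015 February 26: <a href=\"ap150226.html\">Love and War by Moonlight</a><br>\n"
                + "2015 February 25: <a href=\"example.html\">Example Title</a><br></b>";
        for (String entry : entries(html)) {
            System.out.println(entry);
        }
    }
}
```

On the live page, doc.select("a[href]") would probably also match navigation links, so the selector may need to be narrowed to the element that holds the archive list.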
You could also try JAXP, since you know which element contains the data you want: http://docs.oracle.com/javase/tutorial/jaxp/

Java Jericho hyperlink parsing

I'm trying to figure out a way to get all hyperlinks in a web page, except those inside an anchor tag (<a>).
For this I'm using the Jericho parser.
My initial approach was to take the difference between
List<Element> elementList = source.getAllElements(); and
getAllElements(HTMLElementName.A), but other elements might also contain an anchor link within them, so I don't think that's the right approach.
I recommend Jsoup for HTML processing.
Here's an example of how to get all links (= a tags with an href attribute):
Document doc = Jsoup.connect("http:// - link here -").get(); // connect to the website and parse its HTML
Elements links = doc.select("a[href]"); // select all a tags with an href attribute
for (Element element : links) { // iterate over all links (example)
    // process element
}
Documentation:
Selector API (DOM API is available too)
Cookbook (Examples)
list links (Example)
JavaDoc
By the way, can you explain this a bit more?
"except if they are in an anchor tag"

Java: Extract all links with a certain word in them with JSoup?

Might be an unclear question so here's the code and explanation:
Document doc = Jsoup.parse(exampleHtmlData);
Elements certainLinks = doc.select("a[href=google.com/example/]");
The String exampleHtmlData contains the HTML source of a certain site. This site has a lot of links that direct the user to Google. A few examples:
http://google.com/example/hello
http://google.com/example/certaindir/anotherdir/something
http://google.com/anotherexample
I want to extract all the links that contain google.com/example/ in the link with the doc.select function. How do I do this with JSoup?
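Refer to the jsoup selector syntax. The [attr*=value] form matches elements whose attribute value contains the given substring, whereas [attr=value] requires an exact match: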
Document doc = Jsoup.parse(exampleHtmlData);
Elements certainLinks = doc.select("a[href*=google.com/example/]");
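As a quick check of the contains selector, the following sketch runs it against the three example URLs from the question. The class LinkFilter and the method linksContaining are hypothetical names, and the fragment is concatenated into the selector unquoted, which assumes it contains no ] character.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LinkFilter {

    // Returns the href of every <a> whose href contains the given fragment.
    static List<String> linksContaining(String html, String fragment) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element link : doc.select("a[href*=" + fragment + "]")) {
            result.add(link.attr("href"));
        }
        return result;
    }

    public static void main(String[] args) {
        // The three example URLs from the question, wrapped in sample markup.
        String html = "<a href=\"http://google.com/example/hello\">1</a>"
                + "<a href=\"http://google.com/example/certaindir/anotherdir/something\">2</a>"
                + "<a href=\"http://google.com/anotherexample\">3</a>";
        // Prints the first two URLs; the third does not contain the fragment.
        for (String href : linksContaining(html, "google.com/example/")) {
            System.out.println(href);
        }
    }
}
```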
