I am trying to get data from a website with this code:
@WebServlet(description = "get content from teamforge", urlPatterns = { "/JsoupEx" })
public class JsoupEx extends HttpServlet {
private static final long serialVersionUID = 1L;
private static final String URL = "http://www.moving.com/real-estate/city-profile/results.asp?Zip=60505";
public JsoupEx() {
super();
}
protected void doGet(HttpServletRequest request,
HttpServletResponse response) throws ServletException, IOException {
Document doc = Jsoup.connect(URL).get();
for (Element table : doc.select("table.DataTbl")) {
for (Element row : table.select("tr")) {
Elements tds = row.select("td");
// Index 2 is read below, so the row needs at least three cells
if (tds.size() > 2) {
System.out.println(tds.get(0).text() + ":"
+ tds.get(2).text());
}
}
}
}
}
I am using the jsoup parser. When I run it, I get no errors, but also no output.
What am I missing?
With the following code:
public class Tester {
private static final String URL = "http://www.moving.com/real-estate/city-profile/results.asp?Zip=60505";
public static void main(String[] args) throws IOException {
Document doc = Jsoup.connect(URL).get();
System.out.println(doc);
}
}
I get a java.net.SocketTimeoutException: Read timed out. I think the particular URL you are trying to crawl responds too slowly for Jsoup's default timeout. I am in Europe, so my connection to that site might be slower than yours. However, you might want to check for this exception in the log of your application server.
After setting the timeout to 10 seconds, I was able to download and parse the document:
Connection connection = Jsoup.connect(URL);
connection.timeout(10000);
Document doc = connection.get();
System.out.println(doc);
With the rest of your code I get:
Population:78,413
Population Change Since 1990:53.00%
Population Density:6,897
Male:41,137
Female:37,278
.....
Thanks, Julien. I tried the following code, but I am still getting a SocketTimeoutException:
Connection connection = Jsoup.connect("http://www.moving.com/real-estate/city-profile/results.asp?Zip=60505");
connection.timeout(10000);
Document doc = connection.get();
System.out.println(doc);
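If a single 10-second attempt still times out, one option is to retry with a progressively longer timeout. The following is only a sketch of that idea, not part of the original answer: the `Fetcher` interface and `fetchWithRetry` helper are hypothetical names, and `Fetcher` stands in for a call such as `Jsoup.connect(URL).timeout(ms).get()` so that the example runs without network access.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

public class RetryFetch {
    /** Hypothetical stand-in for a call such as Jsoup.connect(url).timeout(ms).get(). */
    @FunctionalInterface
    interface Fetcher<T> {
        T fetch(int timeoutMillis) throws IOException;
    }

    /**
     * Calls the fetcher up to maxAttempts times, doubling the timeout after
     * every SocketTimeoutException. Other IOExceptions are not retried.
     */
    static <T> T fetchWithRetry(Fetcher<T> fetcher, int timeoutMillis, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SocketTimeoutException last = null;
        int timeout = timeoutMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetcher.fetch(timeout);
            } catch (SocketTimeoutException e) {
                last = e;      // remember the failure and try again
                timeout *= 2;  // give the slow server more time next round
            }
        }
        throw last; // every attempt timed out
    }

    public static void main(String[] args) throws IOException {
        // Simulated slow server: the first two calls time out, the third succeeds.
        int[] calls = {0};
        String result = fetchWithRetry(t -> {
            if (calls[0]++ < 2) {
                throw new SocketTimeoutException("Read timed out");
            }
            return "ok with timeout " + t + " ms";
        }, 3000, 4);
        System.out.println(result); // ok with timeout 12000 ms
    }
}
```

With jsoup on the classpath, the lambda would become something like `t -> Jsoup.connect(URL).timeout(t).get()`.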
Related
I'm working on code that parses a weather site.
I found the CSS class that holds the data I need. How can I extract the date from it as a string (for example "Tue, Oct 12")?
public class Pars {
private static Document getPage() throws IOException {
String url = "https://www.gismeteo.by/weather-mogilev-4251/3-day/";
Document page = Jsoup.parse(new URL(url), 3000);
return page;
}
public static void main(String[] args) throws IOException {
Document page = getPage();
Element Nameday = page.select("div [class=date date-2]").first();
String date = Nameday.select("div [class=date date-2").text();
System.out.println(Nameday);
}
}
The code is meant to parse the weather site. On the page I found the class that contains only the date and day of the week I need. But when I try to convert the element's content to a string, the program crashes with an error.
The problem is the class selector; it should look like this: div.date.date-2. Note also that the selector in the second select call is missing its closing bracket, which makes it invalid.
Working code example:
public class Pars {
private static Document getPage() throws IOException {
String url = "https://www.gismeteo.by/weather-mogilev-4251/3-day/";
return Jsoup.parse(new URL(url), 3000);
}
public static void main(String[] args) throws IOException {
Document page = getPage();
Element dateDiv = page.select("div.date.date-2").first();
if(dateDiv != null) {
String date = dateDiv.text();
System.out.println(date);
}
}
}
Here is an answer to your problem: Jsoup select div having multiple classes
In the future, please make sure your question is more detailed and well structured. Here is the "asking questions" guideline: https://stackoverflow.com/help/how-to-ask
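To see why div.date.date-2 is more robust than matching the whole class attribute, here is a small offline sketch. The HTML string is made up for illustration and is not the real markup of the weather site; `ClassSelectorDemo` and `firstDateText` are hypothetical names. It assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ClassSelectorDemo {
    // Returns the text of the first div that carries both classes, or null if none matches.
    static String firstDateText(String html) {
        Document doc = Jsoup.parse(html);
        Element dateDiv = doc.select("div.date.date-2").first();
        return dateDiv == null ? null : dateDiv.text();
    }

    public static void main(String[] args) {
        // Made-up markup: class order differs and an extra class is present,
        // yet the chained class selector still matches the second div.
        String html = "<div class=\"date date-1\">Mon, Oct 11</div>"
                + "<div class=\"date-2 highlighted date\">Tue, Oct 12</div>";
        System.out.println(firstDateText(html)); // Tue, Oct 12
    }
}
```

The chained form matches any div that has both classes, regardless of their order or of additional classes on the element.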
I need to read a value (the pp score) from the div with class profile-header-extra__rank-box on https://new.ppy.sh/u/9889129, but my code returns nothing. How can I do it?
public class mainClass
{
public static void main(String[] args) throws Exception {
String url = "https://new.ppy.sh/u/9889129";
Document document = Jsoup.connect(url).get();
String ppValue = document.select(".profile-header-extra__rank-global").text();
System.out.println("PP: "+ppValue);
}
}
I have just started with Java and am still a beginner.
What I am trying to do is grab specific information from Yahoo Finance with Jsoup.
public class WebScraping {
public static void main(String[] args) throws Exception {
String url = "https://in.finance.yahoo.com/q/is?s=AAPL&annual";
Document document = Jsoup.connect(url).get();
String information = document.select(".yfnc_tabledata1").text();
System.out.println("Information: " + information);
}
}
But I get the whole table. I want specific information, such as the Net Income, and only for the year 2015.
So I found the solution:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class WebScraping {
public static void main(String[] args) throws Exception {
String url = "https://in.finance.yahoo.com/q/is?s=AAPL&annual";
Document document = Jsoup.connect(url).get();
// :eq(n) is zero-based: the row at index 7, then the cell at index 2 within it
String information = document.select("table tr:eq(7) > td:eq(2)").text();
System.out.println("Information: " + information);
}
}
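The :eq(n) selector used above picks an element by its zero-based position among its siblings. Here is a small offline sketch of how it addresses a single cell. The table and its numbers are made up for illustration and do not come from Yahoo Finance; `EqSelectorDemo` and `cellText` are hypothetical names. It assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class EqSelectorDemo {
    // Returns the text of the cell at the given zero-based row and column.
    static String cellText(String html, int row, int col) {
        Document doc = Jsoup.parse(html);
        return doc.select("table tr:eq(" + row + ") > td:eq(" + col + ")").text();
    }

    public static void main(String[] args) {
        // Made-up figures, used only to show the indexing.
        String html = "<table>"
                + "<tr><td>Period</td><td>2016</td><td>2015</td></tr>"
                + "<tr><td>Total Revenue</td><td>300</td><td>200</td></tr>"
                + "<tr><td>Net Income</td><td>30</td><td>20</td></tr>"
                + "</table>";
        System.out.println(cellText(html, 2, 2)); // 20
    }
}
```

Positional selectors like this depend on the page layout, so they will likely need adjusting if the site adds or removes rows or columns.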
When I scrape with the following code, the output shows no elements inside the body tag, but when I check manually with view-source, the body does contain elements. How can I scrape the hyperlinks from the following URL?
public static void main(String[] args) throws SQLException, IOException {
String search_url = "http://www.manta.com/search?search=geico";
Document doc = Jsoup.connect(search_url).userAgent("Mozilla").get();
System.out.println(doc);
Elements links = doc.select("a[href]");
System.out.println(links);
for (Element a : links) {
System.out.println(a);
String linkhref=a.attr("href");
System.out.println(linkhref);
}
}
I am getting Google Trends data from the HTML response of a URL, and I parse that response with the Jsoup library. I get the data, but only for the first 3-4 requests. After that the request starts failing with a Status-203 error. This happens every day: the code works 3-4 times and then I get this exception. What should I do?
My code is:
public class HTMLParser {
private static HashMap<String, HashMap<String, String>> hostcokkies = new HashMap<String, HashMap<String,String>>();
public static ArrayList<HotTrends> getYouTubeTrendings()
{
Document document;
ArrayList<HotTrends> list = new ArrayList<HotTrends>();
HotTrends trends = null;
try {
document = Jsoup.connect("http://www.google.com/trends/fetchComponent?geo=IN&date=today+12-m&gprop=youtube&cmpt=q&cid=TOP_QUERIES_0_0").get();
Elements links = document.select("a[href]");
for(Element link : links){
trends = new HotTrends();
trends.setWord(link.text());
list.add(trends);
}
}
catch(Exception e)
{
e.printStackTrace();
}
return list;
}
public static void main(String args[])
{
ArrayList<HotTrends> hotTrends = new ArrayList<HotTrends>();
hotTrends = HTMLParser.getYouTubeTrendings();
for(HotTrends trends : hotTrends)
{
System.out.println(trends.getWord());
}
}
}
I have the same problem. Just change your IP address, either manually or with a program (I use Hotspot Shield, http://www.hotspotshield.com/). It works for me.