How to get Ticker symbol from table using jsoup? - java

I'm trying to get the symbols from the table at Yahoo Finance, but I can't figure out why my code doesn't detect the table.
This is what I tried:
public String[] getTrendingTickers() {
String[] trendingTickers = new String[30];
int numTickers = 0;
String url = "https://finance.yahoo.com/trending-tickers/";
try {
Document document = Jsoup.connect(url).get();
for (Element row : document.select("table.W(100%) tr")) {
String ticker = row.select(
".Fz\\(s\\).Ta\\(start\\)\\!.Bgc\\(\\$lv2BgColor\\).Z\\(1\\).Bgc\\(\\$lv3BgColor\\).Pos\\(st\\).simpTblRow\\:h_Bgc\\(\\$hoverBgColor\\).Pend\\(10px\\).Start\\(0\\).Pend\\(15px\\).Pstart\\(6px\\).Ta\\(start\\).Va\\(m\\)")
.text();
System.out.println(ticker);
trendingTickers[numTickers] = ticker;
numTickers++;
}
} catch (Exception e) {
System.out.println(e);
}
return trendingTickers;
}
It fails with the error org.jsoup.select.Selector$SelectorParseException: Could not parse query 'table.W(100%).tr': unexpected token at '(100%).tr'

Here is some sample code that creates a list of all the symbols in the table of the page you reference:
Document document = Jsoup.connect("https://finance.yahoo.com/trending-tickers/").get();
Element table = document.select("table tbody").first();
List<String> symbols = new ArrayList<>();
for (Element row : table.select("tr")) {
    symbols.add(row.select("td").first().text());
}
System.out.println(symbols);
See https://jsoup.org/apidocs/org/jsoup/select/Selector.html for details on the selector syntax.

Related

Parsing currency exchange data from https://uzmanpara.milliyet.com.tr/doviz-kurlari/

I wrote this code with some help. It works the first 10 times, but then it gives me NULL values:
String url = "https://uzmanpara.milliyet.com.tr/doviz-kurlari/";
//Document doc = Jsoup.parse(url);
Document doc = null;
try {
doc = Jsoup.connect(url).timeout(6000).get();
} catch (IOException ex) {
Logger.getLogger(den3.class.getName()).log(Level.SEVERE, null, ex);
}
int i = 0;
String[] currencyStr = new String[11];
String[] buyStr = new String[11];
String[] sellStr = new String[11];
Elements elements = doc.select(".borsaMain > div:nth-child(2) > div:nth-child(1) > table.table-markets");
for (Element element : elements) {
Elements curreny = element.parent().select("td:nth-child(2)");
Elements buy = element.parent().select("td:nth-child(3)");
Elements sell = element.parent().select("td:nth-child(4)");
System.out.println(i);
currencyStr[i] = curreny.text();
buyStr[i] = buy.text();
sellStr[i] = sell.text();
System.out.println(String.format("%s [buy=%s, sell=%s]",
curreny.text(), buy.text(), sell.text()));
i++;
}
for(i = 0; i < 11; i++){
System.out.println("currency: " + currencyStr[i]);
System.out.println("buy: " + buyStr[i]);
System.out.println("sell: " + sellStr[i]);
}
That is the code. I guess it is a connection problem, but I could not solve it. I use NetBeans. Do I have to change the connection properties of NetBeans, or should I add something more to the code?
Can you help me?
There's nothing wrong with the connection. Your query simply doesn't match the page structure.
Somewhere on your page, there's an element with class borsaMain, that has a direct child with class detL. And then somewhere in the descendants tree of detL, there is your table. You can write this as the following CSS element selector query:
.borsaMain > .detL table
There will be two tables in the result, but I suspect you are looking for the first one.
So basically, you want something like:
Element table = doc.selectFirst(".borsaMain > .detL table");
for (Element row : table.select("tr:has(td)")) {
    // your existing loop code
}

I want to get the list of the top 250 movies from IMDb, but I only get "{}" instead of the whole list

I have added the libraries properly and there are no errors, but the code does not produce the desired result. The method returns a string, which I save to a variable and then set on a text view. I'm stuck here. Please help me.
public String TableToJson() throws JSONException {
int i=0;
String s="http://www.imdb.com/chart/top";
Document doc = Jsoup.parse(s);
JSONObject jsonParentObject = new JSONObject();
//JSONArray list = new JSONArray();
for (Element table : doc.select("table")) {
for (Element row : table.select("tr")) {
JSONObject jsonObject = new JSONObject();
Elements tds = row.select("td");
i++;
String no = Integer.toString(i);
String Name = tds.get(1).text();
String rating = tds.get(2).text();
jsonObject.put("Ranking", no);
jsonObject.put("Title", Name);
jsonObject.put("Rating", rating);
jsonParentObject.put(Name, jsonObject);
}
}
return jsonParentObject.toString();
}
and output is only
{}
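The output is {} because Jsoup.parse(s) treats the string s as HTML markup rather than as a URL to fetch, so the parsed document contains no table. Jsoup.connect(s).get() downloads the page instead.
Alternatively, a regular expression will work for you.
A sample pattern could look like this:
<strong title=".*</strong>
It showed 250 matches when tested using freeformatter.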
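As a rough illustration, the regular-expression approach can be sketched in Java with java.util.regex. The class name RatingRegex, the method extractRatings, and the sample markup are all made up for this example, and the real IMDb markup may differ. The sketch uses a tighter pattern than the one above: on a single line of HTML, a greedy .* would run from the first <strong to the last </strong>, so each element is matched separately here and its text is captured in a group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RatingRegex {
    // Matches <strong title="...">text</strong>; group 1 is the element text.
    private static final Pattern STRONG =
            Pattern.compile("<strong title=\"[^\"]*\">([^<]*)</strong>");

    // Returns the text of every matching <strong> element, in document order.
    static List<String> extractRatings(String html) {
        List<String> ratings = new ArrayList<>();
        Matcher m = STRONG.matcher(html);
        while (m.find()) {
            ratings.add(m.group(1).trim());
        }
        return ratings;
    }

    public static void main(String[] args) {
        // Hypothetical sample markup, not copied from the real page.
        String html = "<td><strong title=\"9.2 based on 100 votes\">9.2</strong></td>"
                + "<td><strong title=\"9.1 based on 90 votes\">9.1</strong></td>";
        System.out.println(extractRatings(html)); // prints [9.2, 9.1]
    }
}
```

Regular expressions are brittle against markup changes, so an HTML parser such as jsoup is usually the safer choice once the page is actually fetched.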

Java jsoup link extracting

I am trying to extract the links within a given element in jsoup. Here is what I have done, but it's not working:
Document doc = Jsoup.connect(url).get();
Elements element = doc.select("section.row");
Element s = element.first();
Elements se = s.getElementsByTag("article");
for(Element link : se){
System.out.println("link :" + link.select("href"));
}
Here is the html:
What I am trying to do is get all the links within the article classes. I thought that I must first select the section with class="row" and then somehow derive the links from the article class, but I could not make it work.
Try out this.
Document doc = Jsoup.connect(url).get();
Elements section = doc.select("#main"); //select section with the id = main
Elements allArtTags = section.select("article"); // select all article tags in that section
for (Element artTag : allArtTags) {
    Elements atags = artTag.select("a"); // select all a tags in each article tag
    for (Element atag : atags) {
        System.out.println(atag.text()); // print the link text or
        System.out.println(atag.attr("href")); // print link
    }
}
I'm using this in one of my projects:
final Elements elements = doc.select("div.item_list_section.item_description");
You'll have to select the elements you want to extract links from.
private static ... inspectElement(Element e) {
    try {
        final String name = getAttr(e, "a[href]");
        final String link = e.select("a").first().attr("href");
        //final String price = getAttr(e, "span.item_price");
        //final String category = getAttr(e, "span.item_category");
        //final String spec = getAttr(e, "span.item_specs");
        //final String datetime = e.select("time").attr("datetime");
        ...
    }
    catch (Exception ex) { return null; }
}
private static String getAttr(Element e, String what) {
    try {
        return e.select(what).first().text();
    }
    catch (Exception ex) { return ""; }
}

JSoup parsing data from within a tag

I am managing to parse most of the data I need, except for one value that is contained within the a href tag: I need the number that appears after "mmsi=".
Sunsail 4013
My current parser, shown below, fetches all the other data I need. I tried a few things; the commented-out code occasionally returns "unspecified" for an entry. Is there any way to extend my code so that the number "235083844" is returned before the name "Sunsail 4013"?
try {
File input = new File("shipMove.txt");
Document doc = Jsoup.parse(input, null);
Elements tables = doc.select("table.shipInfo");
for( Element element : tables )
{
Elements tdTags = element.select("td");
//Elements mmsi = element.select("a[href*=/showship.php?mmsi=]");
// Iterate over all 'td' tags found
for( Element td : tdTags ){
// Print it's text if not empty
final String text = td.text();
if( text.isEmpty() == false )
{
System.out.println(td.text());
}
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Example of data parsed and html file here
You can use attr on an Element object to retrieve a particular attribute's value
Use substring to get the required value if the String pattern is consistent
Code
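// Using just your anchor html tag (href reconstructed from the selector in the question; the real markup may differ)
String html = "<a href=\"/showship.php?mmsi=235083844\">Sunsail 4013</a>";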
Document doc = Jsoup.parse(html);
// Just selecting the anchor tag, for your implementation use a generic one
Element link = doc.select("a").first();
// Get the attribute value
String url = link.attr("href");
// Check for nulls here and take the substring from '=' onwards
String id = url.substring(url.indexOf('=') + 1);
System.out.println(id + " "+ link.text());
Gives,
235083844 Sunsail 4013
Modified condition in your for loop from your code:
...
for (Element td : tdTags) {
    // Print its text if not empty
    final String text = td.text();
    if (text.isEmpty() == false) {
        if (td.getElementsByTag("a").first() != null) {
            // Get the attribute value
            String url = td.getElementsByTag("a").first().attr("href");
            // Check for nulls here and take the substring from '=' onwards
            String id = url.substring(url.indexOf('=') + 1);
            System.out.println(id + " " + td.text());
        }
        else {
            System.out.println(td.text());
        }
    }
}
...
The above code would print the desired output.
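The "check for nulls" step mentioned in the comments can be sketched as a small helper that works on the href string alone. The class name MmsiExtractor and the method extractMmsi are made up for this example, and the href format is assumed from the selector in the question. Unlike a plain substring from '=', this version looks for "mmsi=" specifically and stops at a following '&', in case the link carries more parameters.

```java
class MmsiExtractor {
    private static final String KEY = "mmsi=";

    // Returns the value after "mmsi=" in the href, or "" if it is absent.
    static String extractMmsi(String href) {
        if (href == null) {
            return "";
        }
        int start = href.indexOf(KEY);
        if (start < 0) {
            return "";
        }
        start += KEY.length();
        // Stop at the next parameter separator, if there is one.
        int end = href.indexOf('&', start);
        return end < 0 ? href.substring(start) : href.substring(start, end);
    }

    public static void main(String[] args) {
        String id = extractMmsi("/showship.php?mmsi=235083844");
        System.out.println(id + " Sunsail 4013"); // prints 235083844 Sunsail 4013
    }
}
```

In the loop above, the helper would be called with the value of attr("href"), which jsoup returns as an empty string when the attribute is missing.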
If you need the value of an attribute, you should use the attr() method. StringUtils here comes from Apache Commons Lang.
for (Element td : tdTags) {
    Elements aList = td.select("a");
    for (Element a : aList) {
        String val = a.attr("href");
        if (StringUtils.isNotBlank(val)) {
            String yourId = val.substring(val.indexOf("=") + 1);
        }
    }
}

JSoup storing text in a variable

I'm new to Java / Android development.
I made an app to extract text from an HTML class:
protected List<String> doInBackground(String... url) {
try {
Document doc = Jsoup.connect(
"http://example/test.html").get();
Elements st1 = doc.select("a[class*=subject_rating_details");
for (Element element : st1) {
sgrade[0] = st1.get(0).text();
sgrade[1] = st1.get(0).text();
sgrade[2] = st1.get(0).text();
sgrade[3] = st1.get(0).text();
sgrade[4] = st1.get(0).text();
}
} catch (IOException e) {
e.printStackTrace();
}
List<String> pinfo = null;
return pinfo;
}
@Override
protected void onPostExecute(List<String> pinfo) {
prog.dismiss();
}
}
List<ListData> varlist = new ArrayList<ListData>();
String sgrade[] = new String[] {};
I used JSoup to extract from my webpage different text from the HTML class="subject_rating_details".
But it force closes with the code above.
I can successfully extract it with a single String, example:
for (Element element : st1) {
stringname = st1.get(0).text();
stringname = st1.get(1).text();
stringname = st1.get(2).text();
stringname = st1.get(3).text();
stringname = st1.get(4).text();
}
But it only stores the last one ( stringname = st1.get(4).text(); )
I've tried also:
for (Element element : st1) {
stringname1 = st1.get(0).text();
stringname2 = st1.get(1).text();
stringname3 = st1.get(2).text();
stringname4 = st1.get(3).text();
stringname5 = st1.get(4).text();
}
But I need the text from st1 in a single variable.
What can I do?
Thanks
EDIT
I want something like this:
String sgrade[] = new String[] {};
for (Element element : st1) {
sgrade[0] = st1.get(0).text();
sgrade[1] = st1.get(0).text();
sgrade[2] = st1.get(0).text();
sgrade[3] = st1.get(0).text();
sgrade[4] = st1.get(0).text();
}
Later I could then read each text and display it in a TextView:
textview1.setText(sgrade[0]); <--/// This would display "Ford"
textview2.setText(sgrade[1]); <--/// This would display "Mustang"
textview3.setText(sgrade[2]); <--/// This would display "2013"
/// HTML ///
...
<p class="subject_rating_details">Ford</p>
<p class="subject_rating_details">Mustang</p>
<p class="subject_rating_details">2013</p>
...
/// HTML ///
Please try it this way. It collects the text of every element in st1 into a single list named stringname, one entry per element.
List<String> stringname = new ArrayList<String>();
for (Element element : st1) {
    stringname.add(element.text());
}
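One likely reason the array version force closes is that new String[] {} creates an array of length zero, so the assignment sgrade[0] = ... throws an ArrayIndexOutOfBoundsException. A list grows as needed. The sketch below uses plain strings as stand-ins for the element.text() values, and the names GradeCollector and collect are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GradeCollector {
    // Copies each extracted text into a growable list, one entry per element.
    static List<String> collect(List<String> texts) {
        List<String> grades = new ArrayList<>();
        for (String text : texts) {
            grades.add(text);
        }
        return grades;
    }

    public static void main(String[] args) {
        // Stand-ins for the element.text() values from the HTML in the question.
        List<String> grades = collect(Arrays.asList("Ford", "Mustang", "2013"));
        // Convert to an array only once the final size is known.
        String[] sgrade = grades.toArray(new String[0]);
        System.out.println(sgrade[0] + " " + sgrade[1] + " " + sgrade[2]);
        // prints Ford Mustang 2013
    }
}
```

With the values in a list or a correctly sized array, each one can then be passed to setText on its own TextView.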
