Using Jsoup to extract data for Android studio

I am trying to fetch the public IP address of each of my users so I can check whether any of them currently share the same one. The website returns no HTML, just text of the form {"ip":"current ip"}.
I try to extract this and show it in a Toast to check that I have the right data, but the Toast is always blank. Here is my code:
@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_homepage);
    new doit().execute();
}
public class doit extends AsyncTask<Void, Void, Void> {
    String ipAddressGet;
    @Override
    protected Void doInBackground(Void... params) {
        try {
            Document doc = Jsoup.connect("http://ipv4bot.whatismyipaddress.com/").get();
            ipAddressGet = doc.text();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }
    @Override
    protected void onPostExecute(Void aVoid) {
        super.onPostExecute(aVoid);
        Toast.makeText(Homepage.this, ipAddressGet, Toast.LENGTH_LONG).show();
    }
}
Sorry, I have never posted on this website before; hopefully everything is clear.
All the descriptions for Jsoup involve HTML, but this page has none, so I don't know how to apply them here.

Just use this
ipAddressGet = doc.body().ownText();
instead of
ipAddressGet = doc.text();

Change your code:
try {
    Document doc = Jsoup.connect("http://ipv4bot.whatismyipaddress.com/").get();
    ipAddressGet = doc.text();
} catch (Exception e) {
    e.printStackTrace();
}
To this:
try {
    Document doc = Jsoup.connect("http://ipv4bot.whatismyipaddress.com/").get();
    Element body = doc.body();
    ipAddressGet = body.text();
} catch (Exception e) {
    e.printStackTrace();
}
However, to make web requests and handle their responses, consider a well-known library such as Retrofit or Volley.

You do not need JSoup for this. Java can deal with it just fine:
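// Needs: java.io.IOException, java.net.URL, java.nio.charset.StandardCharsets, java.util.Scanner
// Reads the whole response body of requestURL into a String.
private static String readStringFromURL(String requestURL) throws IOException
{
    try (Scanner scanner = new Scanner(new URL(requestURL).openStream(),
            StandardCharsets.UTF_8.toString()))
    {
        // "\\A" matches the start of the input, so the entire stream comes back as one token.
        scanner.useDelimiter("\\A");
        return scanner.hasNext() ? scanner.next() : "";
    }
}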
If the response is JSON, as it is here, many JSON parsing libraries can also read directly from a URL or stream.
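Once the body has been read as a string, the IP value still has to be pulled out of the {"ip":"..."} wrapper. Below is a minimal sketch of that step using only the standard library. The class name IpResponseParser and the method extractIp are made up for illustration, and the regular expression assumes the response has exactly the shape shown in the question. On Android, the bundled org.json.JSONObject (new JSONObject(body).getString("ip")) would probably be the more robust choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the question or the answers.
final class IpResponseParser {
    // Matches "ip":"<value>", allowing optional whitespace around the colon.
    private static final Pattern IP_FIELD =
            Pattern.compile("\"ip\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the value of the "ip" field, or null if the body has no such field. */
    static String extractIp(String body) {
        if (body == null) {
            return null;
        }
        Matcher matcher = IP_FIELD.matcher(body);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

In the question's AsyncTask this could be applied to the downloaded text before storing it, for example ipAddressGet = IpResponseParser.extractIp(body);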

Related

Creating a JSoup class in android using AsyncTask

I know that I have to use an AsyncTask to make Jsoup work on Android, but all the examples online illustrate this by calling arbitrary Jsoup methods directly in the MainActivity.
I want to create an HTMLParser class that contains a function for each element I want to parse, but I can't seem to make it work.
My HTMLParser:
public class HTMLParser extends AsyncTask<Void, Void, Void> {
    private Document doc;
    @Override
    protected Void doInBackground(Void... voids) {
        try {
            doc = Jsoup.connect("https://www.bodybuilding.com/exercises").get();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
    public ArrayList<String> findMuscleGroups() {
        ArrayList<String> muscleGroups = new ArrayList<>();
        Elements section = doc.select("a");
        if (section != null) {
            for (Element exercise : section) {
                if (exercise.hasText() && !muscleGroups.contains(exercise.text()) &&
                        exercise.attr("href").contains("exercises/muscle")) {
                    muscleGroups.add(exercise.text());
                }
            }
        }
        return muscleGroups;
    }
}
In my MainActivity I want to create an HTMLParser object and then use something like ArrayList<String> muscleGroups = htmlParser.findMuscleGroups()
My MainActivity:
HTMLParser parser = new HTMLParser();
new HTMLParser().execute();
for (String muscleGroup : parser.findMuscleGroups()) {
    textView.setText(muscleGroup + "\n");
}
Which won't work. I'm well aware that it isn't supposed to work and there is something I'm missing but I hope you guys can point me in the right direction.

Solution:
Probably not the best, but it works, so there's that.
I've added this method to the HTMLParser class:
public Document initializeDoc() {
    try {
        Document doc = Jsoup.connect("https://www.bodybuilding.com/exercises").get();
        return doc;
    } catch (IOException e) {
        e.printStackTrace();
        return null;
    }
}
In MainActivity I've created
private volatile Document doc = null;
private volatile HTMLParser htmlParser;
variables, and added the following in the onCreate method
htmlParser = new HTMLParser();
Thread t = new Thread(new Runnable() {
    @Override
    public void run() {
        doc = htmlParser.initializeDoc();
    }
});
try {
    t.start();
    t.join();
} catch (InterruptedException e) {
    e.printStackTrace();
}
Needless to say, calling htmlParser.findMuscleGroups(doc), with findMuscleGroups changed to take the Document as a parameter, now works as expected.
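The Thread plus join() approach above passes the result through a volatile field. A possible variation, sketched here with only java.util.concurrent, is to submit the work as a Callable and read the value from the returned Future, which also lets the wait be bounded by a timeout. The class name BlockingLoader and the method loadWithTimeout are invented for this sketch. Like join(), this still blocks the calling thread, so calling it from onCreate would hold up the UI while the page downloads.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not part of the original solution.
final class BlockingLoader {
    /**
     * Runs the task on a background thread and waits for its result,
     * giving up after timeoutSeconds.
     */
    static <T> T loadWithTimeout(Callable<T> task, long timeoutSeconds) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the task", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Background task failed or timed out", e);
        } finally {
            // Stop the worker thread whether or not the task finished.
            executor.shutdownNow();
        }
    }
}
```

With the solution's classes, a call might look like doc = BlockingLoader.loadWithTimeout(htmlParser::initializeDoc, 10); where the 10-second limit is an arbitrary choice.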

file not found exception while fetching RSS feed from news.bitcoin.com in android

The link to the RSS feed : https://news.bitcoin.com/feed/
Here is my code thus far :
MainActivity :
public class MainActivity extends AppCompatActivity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        RSSDataDownload downloadTask = new RSSDataDownload();
        downloadTask.execute();
    }
    public static class RSSDataDownload extends AsyncTask<Void, Void, Void> {
        @Override
        protected Void doInBackground(Void... voids) {
            // DOESN'T WORK, FILE NOT FOUND EXCEPTION:
            String MY_URL = "https://news.bitcoin.com/feed";
            try {
                URL url = new URL(MY_URL);
                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                InputStream inputStream = connection.getInputStream();
                processXML(inputStream);
            } catch (Exception e) {
                e.printStackTrace();
            }
            return null;
        }
        public void processXML(InputStream inputStream) throws Exception {
            DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = documentBuilderFactory.newDocumentBuilder();
            Document XMLdocument = builder.parse(inputStream);
            Element rootElement = XMLdocument.getDocumentElement(); // root element of the XML document
            Log.d("XML", rootElement.getTagName());
        }
    }
}
The above code works for http://www.bitnewz.net/rss/Feed, http://bitcoin.worldnewsoffice.com/rss/category/1/ and a couple of other RSS feeds, but not for https://news.bitcoin.com/feed/.
Is there a special reason? How can I get around it?

I don't know what problem you are facing, but https://news.bitcoin.com/feed/ works for me with the RSS-Parser library, which may give you some ideas:
https://github.com/prof18/RSS-Parser

Jsoup.connect().get() takes only part of html file on Android

I am trying to parse Wikipedia, and my code works well on a desktop computer.
The only thing I changed is that .connect().get() now runs in an AsyncTask, but I get only part of the HTML file (no "body", only half of the second "script" in "title") and I can't understand why.
This is my code for Android:
protected String doInBackground(String... params) {
    try {
        Document doc = Jsoup.connect(params[0]).get();
        return doc.toString();
    } catch (IOException e) {
        //...
        e.printStackTrace();
    }
    return null;
}
And this is the plain version that works on the computer:
String url = "https://en.wikipedia.org/wiki/Protectorate";
Document doc = null;
try {
    doc = Jsoup.connect(url).get();
} catch (IOException e) {
    //...
    e.printStackTrace();
}
I checked: params[0] is https://en.wikipedia.org/wiki/Protectorate, so there is no mistake there.
If you need any extra information, I will of course provide it.

Logcat fools us here, because it truncates long messages (I assume you checked your string with Logcat? See related question).
If you split the result string into chunks, you will see that the whole page was loaded. Try adding something like this logAll function to your AsyncTask class to see the full output:
private class DownloadTask extends AsyncTask<String, Integer, String> {
    @Override
    protected String doInBackground(String... params) {
        try {
            Document doc = Jsoup.connect(params[0]).get();
            return doc.toString();
        } catch (Exception e) {
            e.printStackTrace();
        }
        // Download failed; onPostExecute receives null.
        return null;
    }
    @Override
    protected void onPostExecute(String s) {
        super.onPostExecute(s);
        if (s != null) {
            logAll("async", s);
        }
    }
    // Logs a long string in pieces so that Logcat does not cut it off.
    void logAll(String TAG, String longString) {
        int splitSize = 300;
        if (longString.length() > splitSize) {
            int index = 0;
            while (index < longString.length() - splitSize) {
                Log.e(TAG, longString.substring(index, index + splitSize));
                index += splitSize;
            }
            Log.e(TAG, longString.substring(index));
        } else {
            Log.e(TAG, longString);
        }
    }
}
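The chunking idea can also be separated from the logging call, which makes it easy to check on a desktop JVM without Android. The following is a sketch only: the class name LogChunker and the method splitIntoChunks are invented here, and the chunk size is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the original answer.
final class LogChunker {
    /** Splits text into consecutive pieces of at most chunkSize characters. */
    static List<String> splitIntoChunks(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            // The last piece may be shorter than chunkSize.
            int end = Math.min(text.length(), start + chunkSize);
            chunks.add(text.substring(start, end));
        }
        return chunks;
    }
}
```

Inside the AsyncTask, each piece could then be logged in turn, for example for (String chunk : LogChunker.splitIntoChunks(s, 300)) Log.e("async", chunk);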

Retrieving more than one string with an AsyncTask

I am using AsyncTask in conjunction with StreamScraper to get SHOUTcast metadata for an app I am developing. Right now I am getting only the song title, but I would also like to get the stream title (which is obtained with stream.getTitle();). Below is my AsyncTask.
public class HarvesterAsync extends AsyncTask<String, Void, String> {
    @Override
    protected String doInBackground(String... params) {
        String songTitle = null;
        Scraper scraper = new ShoutCastScraper();
        List<Stream> streams = null;
        try {
            streams = scraper.scrape(new URI(params[0]));
        } catch (URISyntaxException e) {
            e.printStackTrace();
        } catch (ScrapeException e) {
            e.printStackTrace();
        }
        for (Stream stream : streams) {
            songTitle = stream.getCurrentSong();
        }
        return songTitle;
    }
    @Override
    protected void onPostExecute(String s) {
        super.onPostExecute(s);
        MainActivity.songTitle.setText(s);
    }
}
What do I need to change so that I can get more than one string?

The simplest way to return more than one value from a background task in this case is to return an array. The task's result type has to change to match, so the class becomes AsyncTask<String, Void, String[]>.
@Override
protected String[] doInBackground(String... params) {
    String songTitle = null;
    String streamTitle = null; // new
    Scraper scraper = new ShoutCastScraper();
    List<Stream> streams = null;
    try {
        streams = scraper.scrape(new URI(params[0]));
    } catch (URISyntaxException e) {
        e.printStackTrace();
    } catch (ScrapeException e) {
        e.printStackTrace();
    }
    if (streams != null) { // scrape() may have failed above
        for (Stream stream : streams) {
            songTitle = stream.getCurrentSong();
            streamTitle = stream.getTitle(); // new
        }
    }
    return new String[] {songTitle, streamTitle}; // new
}
@Override
protected void onPostExecute(String[] s) {
    super.onPostExecute(s); // this line is unnecessary, BTW
    MainActivity.songTitle.setText(s[0]);
    MainActivity.streamTitle.setText(s[1]); // new. Or whatever you want to do with the stream title.
}
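An array works, but the meaning of s[0] and s[1] lives only in the reader's head. One alternative, sketched below, is a small immutable holder class with named fields. StreamInfo and its label() method are made up for illustration; the task would then be declared as AsyncTask<String, Void, StreamInfo> and return new StreamInfo(songTitle, streamTitle) from doInBackground.

```java
// Hypothetical result holder for illustration; not part of the original answer.
final class StreamInfo {
    final String songTitle;
    final String streamTitle;

    StreamInfo(String songTitle, String streamTitle) {
        this.songTitle = songTitle;
        this.streamTitle = streamTitle;
    }

    /** Builds a one-line label such as "Station: Song", tolerating missing values. */
    String label() {
        String stream = streamTitle == null ? "Unknown stream" : streamTitle;
        String song = songTitle == null ? "Unknown song" : songTitle;
        return stream + ": " + song;
    }
}
```

In onPostExecute(StreamInfo info) the fields can then be read by name, for example MainActivity.songTitle.setText(info.songTitle);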

returning an array of images retrieved from a url with jsoup

I'm getting a NullPointerException when I attempt to get an array of image URLs using Jsoup. I'm really not sure what I'm doing wrong here, as it appears that I'm following the example laid out in the Javadoc. Any help would go a long way, thanks.
public class ImagetestActivity extends Activity {
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        String url = "http://www.goal.com/en/news/1717/editorial/2012/05/20/3116140/in-pictures-chelsea-celebrate-champions-league-success#";
        Document doc = null;
        List<Element> media = new ArrayList<Element>();
        try {
            doc = Jsoup.connect(url).get();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        media = doc.select("[src]");
        for (Element src : media) {
            if (src.tagName().equals("img")) {
                Toast.makeText(ImagetestActivity.this, src.text(),
                        Toast.LENGTH_LONG).show();
            }
        }
    }
}

Try this:
media = doc.select("img[src]");
for (Element src : media) {
    Toast.makeText(ImagetestActivity.this, src.attr("src"),
            Toast.LENGTH_LONG).show();
}
That is, select only the images (no need to check the tag name). You probably also want the src attribute value rather than the inner text, which is always empty for img elements.
