Jsoup.connect().get() takes only part of html file on Android - java

I'm trying to parse Wikipedia, and my code works fine on a desktop computer.
The only thing I changed for Android is that .connect().get() now runs inside an AsyncTask, but I get only part of the HTML file (no "body", and only half of the second "script" in "title") and I can't understand why.
This is my code for Android:
protected String doInBackground(String... params) {
try {
Document doc = Jsoup.connect(params[0]).get();
return doc.toString();
} catch (IOException e) {
//...
e.printStackTrace();
}
return null;
}
And this is the plain version that works on the desktop:
String url = "https://en.wikipedia.org/wiki/Protectorate";
Document doc = null;
try {
doc = Jsoup.connect(url).get();
} catch (IOException e) {
//...
e.printStackTrace();
}
I checked: params[0] is https://en.wikipedia.org/wiki/Protectorate, so there is no mistake there.
If you need any extra information, I will of course provide it.

Logcat is misleading here, because it truncates long messages (I assume you checked your string with Logcat? See the related question).
If you split your result string into chunks, you will see that the whole page was loaded. Try adding something like this logAll function to your AsyncTask class to see the full output:
private class DownloadTask extends AsyncTask<String, Integer, String> {
Document doc = null;
protected String doInBackground(String... params) {
try {
doc = Jsoup.connect(params[0]).get();
return doc.toString();
} catch (Exception e) {
e.printStackTrace();
}
return null; // download failed, so doc may still be null
}
@Override
protected void onPostExecute(String s) {
super.onPostExecute(s);
if (s != null) { // nothing to log if the download failed
logAll("async", s);
}
}
void logAll(String TAG, String longString) {
int splitSize = 300;
if (longString.length() > splitSize) {
int index = 0;
while (index < longString.length()-splitSize) {
Log.e(TAG, longString.substring(index, index + splitSize));
index += splitSize;
}
Log.e(TAG, longString.substring(index));
} else {
Log.e(TAG, longString);
}
}
}
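The chunking idea in logAll can also be tried outside Android. The following is only a minimal sketch in plain Java: the class name LogChunker and the chunkSize parameter are made up for illustration, and on a device each piece would be passed to Log.e instead of System.out.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a long string into fixed-size pieces so each piece fits in one log line. */
final class LogChunker {
    private LogChunker() {}

    // chunkSize is a hypothetical parameter; choose a value below the logger's line limit.
    static List<String> chunk(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            // The last piece may be shorter than chunkSize.
            int end = Math.min(text.length(), start + chunkSize);
            chunks.add(text.substring(start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (String piece : chunk("abcdefg", 3)) {
            System.out.println(piece);
        }
    }
}
```

Joining the pieces back together gives the original string, which is an easy way to confirm that nothing was lost by the download itself.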

Related

Using Jsoup to extract data for Android studio

I am trying to get the public IP address of each of my users so I can check whether any of them are currently the same. The website has no HTML, just text that says: {"ip":"current ip"}
I try to extract this and show it in a Toast just to test that I have the right info, but the Toast is always blank. Here is my code:
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_homepage);
new doit().execute();
}
public class doit extends AsyncTask<Void, Void, Void>{
String ipAddressGet;
@Override
protected Void doInBackground(Void... params) {
try {
Document doc = Jsoup.connect("http://ipv4bot.whatismyipaddress.com/").get();
ipAddressGet = doc.text();
}catch(Exception e){
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(Void aVoid) {
super.onPostExecute(aVoid);
Toast.makeText(Homepage.this, ipAddressGet, Toast.LENGTH_LONG).show();
}
}
Sorry, I have never posted on this website before; hopefully everything is clear.
All the descriptions for Jsoup involve HTML, but this page has none, so I don't know how to apply them here.
Just use this
ipAddressGet = doc.body().ownText();
instead of
ipAddressGet = doc.text();
Change your code:
try {
Document doc = Jsoup.connect("http://ipv4bot.whatismyipaddress.com/").get();
ipAddressGet = doc.text();
} catch(Exception e) {
e.printStackTrace();
}
To this:
try {
Document doc = Jsoup.connect("http://ipv4bot.whatismyipaddress.com/").get();
Element body = doc.body();
ipAddressGet = body.text();
} catch(Exception e) {
e.printStackTrace();
}
But to send web requests and handle their responses, use one of the well-known libraries, such as Retrofit or Volley.
You do not need JSoup for this. Java can deal with it just fine:
private static String readStringFromURL(String requestURL) throws IOException
{
try (Scanner scanner = new Scanner(new URL(requestURL).openStream(),
StandardCharsets.UTF_8.toString()))
{
scanner.useDelimiter("\\A");
return scanner.hasNext() ? scanner.next() : "";
}
}
If your text is also JSON-formatted, most JSON parsing libraries also have built-in capabilities for reading from a URL.
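For a body as small as {"ip":"current ip"}, the value can even be pulled out with the standard library alone. This is only a rough sketch: the class name IpExtractor is made up, and the pattern assumes a flat object whose "ip" value contains no escaped quotes. A real JSON library is the more robust choice for anything larger.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts the "ip" field from a tiny JSON body such as {"ip":"203.0.113.7"}. */
final class IpExtractor {
    private IpExtractor() {}

    // Matches "ip" : "<value>" and captures <value>; assumes no escaped quotes in the value.
    private static final Pattern IP_FIELD =
            Pattern.compile("\"ip\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the captured value, or null when the body has no "ip" field. */
    static String extractIp(String body) {
        Matcher matcher = IP_FIELD.matcher(body);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

The string returned by readStringFromURL above could be passed straight to extractIp, still from a background thread.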

Creating a JSoup class in android using AsyncTask

I know that I have to use an AsyncTask in order to make Jsoup work on Android, but all the examples online illustrate that just by calling random Jsoup methods in the MainActivity.
I want to create an HTMLParser class which will contain a function for each element I want to parse, but I can't seem to make it work.
My HTMLParser:
public class HTMLParser extends AsyncTask<Void, Void, Void> {
private Document doc;
@Override
protected Void doInBackground(Void... voids) {
try {
doc = Jsoup.connect("https://www.bodybuilding.com/exercises").get();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
public ArrayList<String> findMuscleGroups(){
ArrayList<String> muscleGroups = new ArrayList<>();
Elements section = doc.select("a");
if (section != null) {
for (Element exercise : section) {
if (exercise.hasText() && !muscleGroups.contains(exercise.text()) &&
exercise.attr("href").contains("exercises/muscle")) {
muscleGroups.add(exercise.text());
}
}
}
return muscleGroups;
}
}
In my MainActivity I want to be able to create an HTMLParser object and use something like ArrayList<String> muscleGroups = htmlParser.findMuscleGroups()
My MainActivity:
HTMLParser parser = new HTMLParser();
new HTMLParser().execute();
for (String muscleGroup : parser.findMuscleGroups()){
textView.setText(muscleGroup + "\n");
}
Which won't work. I'm well aware that it isn't supposed to work and there is something I'm missing but I hope you guys can point me in the right direction.
Solution:
Probably not the best, but it works so there's that
I've added this method to the HTMLParser class:
public Document initializeDoc(){
try {
Document doc = Jsoup.connect("https://www.bodybuilding.com/exercises").get();
return doc;
} catch (IOException e) {
e.printStackTrace();
return null;
}
}
In MainActivity I've created
private volatile Document doc = null;
private volatile HTMLParser htmlParser;
variables, and added the following in the onCreate method
htmlParser = new HTMLParser();
Thread t = new Thread(new Runnable() {
@Override
public void run() {
doc = htmlParser.initializeDoc();
}
});
try {
t.start();
t.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
And needless to say, calling htmlParser.findMuscleGroups(doc) works as expected
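Note that t.join() makes the calling thread wait until the download finishes. One alternative is to hand the result to a callback instead of waiting for it. The following is only a sketch in plain Java: BackgroundLoader and its load method are hypothetical names, and on Android the callback would still have to be moved onto the main thread (for example with runOnUiThread) before it touches any views. As far as I know, CompletableFuture also requires API level 24 or newer.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Runs a slow loader off the calling thread and passes its result to a callback. */
final class BackgroundLoader {
    private BackgroundLoader() {}

    // loader stands in for something like htmlParser::initializeDoc;
    // onDone stands in for the code that consumes the parsed document.
    static <T> CompletableFuture<Void> load(Supplier<T> loader, Consumer<T> onDone) {
        return CompletableFuture.supplyAsync(loader).thenAccept(onDone);
    }
}
```

The returned future only needs to be joined in tests or command-line programs; in an Activity it can simply be left to complete on its own.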

Jackson Parsing with java

I really hate to do this, but I have two questions: can Jackson 2.7.3 parse the following URL, and do I have to parse every part of the JSON?
Here is the code I am working with so far:
public class Song {
private String tracks;
private String album;
private String images;
public void setTracks(String tracks){
this.tracks=tracks;
}
public void setAlbum(String album){
this.album= album;
}
public void setImages (String images){
this.images= images;
}
}
And
Thread thread = new Thread(new Runnable() {
@Override
public void run() {
try {
ObjectMapper mapper = new ObjectMapper();
Document doc = Jsoup.connect("http://api.spotify.com/v1/search?q=track:" + finalSong + "%20artist:" + finalArtist+"%20" + "&type=track").ignoreContentType(true).get();
String title = doc.body().text(); // text() already returns a String
Song obj = mapper.readValue(title, Song.class);
} catch (JsonGenerationException e) {
e.printStackTrace();
} catch (JsonMappingException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
});
thread.start();
return null;
}
All I need is the "preview_url" and one of the "images" URLs near the top.
The JSON is located at https://api.spotify.com/v1/search?q=track:Ready%20To%20Fall%20artist:rise%20against%20&type=track.
Do you necessarily need to map your JSON response into a class?
If not, you can get the values you want directly, e.g. preview_url, as follows.
You can use readTree to map the JSON result into a tree of nodes.
After that, you can use findPath to search for the property you are looking for.
The images property contains an array, so if you want a specific item from that list, use get to select it.
Example:
JsonNode readTree = mapper.readTree(body);
for (JsonNode node : readTree.findPath("items")) {
System.out.println(node.findPath("images").get(2));
System.out.println(node.findPath("preview_url"));
}

Retrieving more than one string with an AsyncTask

I am using AsyncTask in conjunction with StreamScraper to get SHOUTcast metadata for an app I am developing. Right now I am getting only the song title, but I would also like to get the stream title (which is available via stream.getTitle()). Below is my AsyncTask.
public class HarvesterAsync extends AsyncTask <String, Void, String> {
@Override
protected String doInBackground(String... params) {
String songTitle = null;
Scraper scraper = new ShoutCastScraper();
List<Stream> streams = null;
try {
streams = scraper.scrape(new URI(params[0]));
} catch (URISyntaxException e) {
e.printStackTrace();
} catch (ScrapeException e) {
e.printStackTrace();
}
for (Stream stream: streams) {
songTitle = stream.getCurrentSong();
}
return songTitle;
}
@Override
protected void onPostExecute(String s) {
super.onPostExecute(s);
MainActivity.songTitle.setText(s);
}
}
What do I need to change so that I can get more than one string?
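The simplest way to return more than one value from a background task in this case is to return an array. The third type parameter of the class declaration has to change to match, i.e. AsyncTask<String, Void, String[]>.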
@Override
protected String[] doInBackground(String... params) {
String songTitle = null;
String streamTitle = null; // new
Scraper scraper = new ShoutCastScraper();
List<Stream> streams = null;
try {
streams = scraper.scrape(new URI(params[0]));
} catch (URISyntaxException e) {
e.printStackTrace();
} catch (ScrapeException e) {
e.printStackTrace();
}
if (streams != null) { // scrape() may have failed above, leaving streams null
for (Stream stream: streams) {
songTitle = stream.getCurrentSong();
streamTitle = stream.getTitle(); // new. I don't know what method you call to get the stream title - this is an example.
}
}
return new String[] {songTitle, streamTitle}; // new
}
@Override
protected void onPostExecute(String[] s) {
super.onPostExecute(s); // this line is unnecessary, BTW
MainActivity.songTitle.setText(s[0]);
MainActivity.streamTitle.setText(s[1]); // new. Or whatever you want to do with the stream title.
}
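If the array indices s[0] and s[1] feel too easy to mix up, a small holder class is another option. This is just a sketch: StreamInfo is a made-up name, and the task would then be declared as AsyncTask<String, Void, StreamInfo>, returning new StreamInfo(songTitle, streamTitle) from doInBackground.

```java
/** Immutable pair of values produced by the background task. */
final class StreamInfo {
    final String songTitle;   // current song, may be null if scraping failed
    final String streamTitle; // station or stream name, may be null as well

    StreamInfo(String songTitle, String streamTitle) {
        this.songTitle = songTitle;
        this.streamTitle = streamTitle;
    }
}
```

In onPostExecute the fields are then read by name, e.g. info.songTitle and info.streamTitle, instead of by position.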

Call a web service and parse xml response in blackberry

I currently have a finished design for a BlackBerry application.
Now I need to call a web service from my app, and that web service will return an XML response.
I then need to parse that response from XML into a POJO.
For parsing the XML response, should I go with the basic DOM parser, or should I use some other J2ME-specific parsing approach?
If anybody has a sample or tutorial link for this, it would be very useful to me.
Thanks in advance.
It depends on what your web service serves.
If it is REST-based, you're likely responsible for parsing the XML yourself, with a library. I've only ever used kXml 2, a J2ME library that can be used on BlackBerry devices. To use it, it's best to link to the source (otherwise, you have to preverify the jar and export it, and that never seems to work for me). It's a forward-only pull parser, similar to XmlReader in .NET, if you're familiar with that.
If your web service is WS*-based (i.e. it uses SOAP), you can use a stub generator to generate a client class that you can use. BlackBerry supports JSR 172, the web services API for J2ME. The WTK has a stub generator that works well. Just point the generator to your web service's wsdl file. A web search should clarify how to use it.
Put your XML data into strXML:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream inputStream = new ByteArrayInputStream(strXML.getBytes("UTF-8"));
Document document = builder.parse( inputStream );
Element rootElement = document.getDocumentElement();
rootElement.normalize();
blnViewReport=false;
listNodes(rootElement); // use this function to parse the xml
inputStream.close();
void listNodes(Node node)
{
Node tNode;
String strData;
String nodeName = node.getNodeName();
if( nodeName.equals("Tagname"))
{
tNode=node.getFirstChild();
if(tNode != null && tNode.getNodeType() == Node.TEXT_NODE) // an empty element has no child
{
// here you get the specified tag value
}
}
else if(nodeName.equals("Tag name 2"))
{
// ... handle the remaining tags the same way
}
NodeList list = node.getChildNodes();
if(list.getLength() > 0)
{
for(int i = 0 ; i<list.getLength() ; i++)
{
listNodes(list.item(i));
}
}
}
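For comparison, here is a compact, self-contained version of the same DOM approach. It is only a sketch written against the Java SE DOM API: the class name XmlTextReader and the sample tag names are made up, and calls such as getTextContent may not exist in the J2ME subset available on BlackBerry, so treat it as an illustration of the idea rather than device-ready code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Reads the text of a named element from an XML string using the DOM parser. */
final class XmlTextReader {
    private XmlTextReader() {}

    /** Returns the text of the first element named tagName, or null if there is none. */
    static String firstText(String xml, String tagName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList matches = document.getElementsByTagName(tagName);
            return matches.getLength() > 0 ? matches.item(0).getTextContent() : null;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Each value read this way can then be passed to the matching setter of the POJO.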
I believe that you have already received the request object.
Here is the code I used to parse that object from XML.
_value is the object:
System.out.println("value="+_value);
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = null; // create a parser
try {
parser = factory.newSAXParser();
}
catch (ParserConfigurationException e1)
{
System.out.println("ParserConfigurationException"+e1.getMessage());
}
catch (SAXException e1)
{
System.out.println("SAXException"+e1.getMessage());
}
// instantiate our handler
PharmacyDataXMLHandler pharmacydataXMLHandler= new PharmacyDataXMLHandler();
ByteArrayInputStream objBAInputStream = new java.io.ByteArrayInputStream(_value.getBytes());
InputSource inputSource = new InputSource(objBAInputStream);
// perform the synchronous parse
try {
parser.parse(inputSource, pharmacydataXMLHandler);
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
_pharmacydataList = pharmacydataXMLHandler.getpharmacydataList();
}
public class PharmacyDataXMLHandler extends DefaultHandler
{
private Vector _pharmacyDataList = new Vector();
PharmacyData _pharmacydata;
StringBuffer _sb = null;
public void warning(SAXParseException e) {
System.err.println("warning: " + e.getMessage());
}
public void error(SAXParseException e) {
System.err.println("error: " + e.getMessage());
}
public void fatalError(SAXParseException e) {
System.err.println("fatalError: " + e.getMessage());
}
public void startElement(String uri, String localName, String name,
Attributes attributes) throws SAXException {
try{
_sb = new StringBuffer("");
if(localName.equals("Table"))
{
_pharmacydata= new PharmacyData();
}
}catch (Exception e) {
System.out.println(""+e.getMessage());
}
}
public void endElement(String namespaceURI, String localName, String qName) throws SAXException
{
try{
if(localName.equals("ID"))
{
// System.out.println("Id :"+sb.toString());
this._pharmacydata.setId(_sb.toString());
}
else if(localName.equals("Name"))
{
//System.out.println("name :"+sb.toString());
this._pharmacydata.setName(_sb.toString());
}
else if(localName.equals("PharmacyID"))
{
// System.out.println("pharmacyId :"+sb.toString());
this._pharmacydata.setPharmacyId(_sb.toString());
}
else if(localName.equals("Password"))
{
// System.out.println("password :"+sb.toString());
this._pharmacydata.setPassword(_sb.toString());
}
else if(localName.equals("Phone"))
{
// System.out.println("phone:"+sb.toString());
this._pharmacydata.setPhone(_sb.toString());
}
else if(localName.equals("Transmit"))
{
//System.out.println("transmit"+sb.toString());
this._pharmacydata.setTransmit(_sb.toString());
}
else if(localName.equals("TimeZone"))
{
// System.out.println("timeZone"+sb.toString());
this._pharmacydata.setTimeZone(_sb.toString());
}
else if(localName.equals("FaxModem"))
{
// System.out.println("faxModem"+sb.toString());
this._pharmacydata.setFaxModem(_sb.toString());
}
else if(localName.equals("VoicePhone"))
{
// System.out.println("voicePhone"+sb.toString());
this._pharmacydata.setVoicePhone(_sb.toString());
}
else if(localName.equals("ZipCode"))
{
// System.out.println("zipCode"+sb.toString());
this._pharmacydata.setZipCode(_sb.toString());
}
else if(localName.equals("Address"))
{
// System.out.println("address"+sb.toString());
this._pharmacydata.setAddress(_sb.toString());
}
else if(localName.equals("City"))
{
// System.out.println("city"+sb.toString());
this._pharmacydata.setCity(_sb.toString());
}
else if(localName.equals("State"))
{
// System.out.println("state"+sb.toString());
this._pharmacydata.setState(_sb.toString());
}
else if(localName.equals("WebInterface"))
{
// System.out.println("webInterface"+sb.toString());
this._pharmacydata.setWebInterface(_sb.toString());
}
else if(localName.equals("NABPnumber"))
{
// System.out.println("nabPnumber"+sb.toString());
this._pharmacydata.setNabPnumber(_sb.toString());
}
else if(localName.equals("ServiceType"))
{
// System.out.println("serviceType:"+sb.toString());
this._pharmacydata.setServiceType(_sb.toString());
}
else if(localName.equals("Mobile"))
{
// System.out.println("mobile:"+sb.toString());
this._pharmacydata.setMobile(_sb.toString());
}
else if(localName.equals("Table"))
{
// System.out.println("end table:"+sb.toString());
_pharmacyDataList.addElement(_pharmacydata);
}
}catch (Exception e) {
System.out.println(""+e.getMessage());
}
}
public void characters(char ch[], int start, int length) {
String theString = new String(ch, start, length);
_sb.append(theString);
}
/**
* @return the PharmacyDataList
*/
public Vector getpharmacydataList()
{
return _pharmacyDataList;
}
}
