Extract tweet from URL using Java? - java

From what I've found, Twitter4J seems to be the most prominent tool for working with Twitter from Java. I went through the code examples and Javadoc, but I couldn't find a way to do this.
What I want to do is extract the content of a tweet given its URL. I tried using jsoup with a CSS selector, but when the tweet is part of a conversation it pulls in all the tweets in that conversation. How can I do it using Twitter4J?
Input: tweet URL -> Output: the content of the tweet

Using jsoup is easier. You can do this:
Document doc = Jsoup.connect("the tweet url here").timeout(10*1000).get();
Element tweet = doc.select(".js-tweet-text-container").first();
Now you can use the tweet element to extract the information, for example with tweet.text().
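Since the question asks about Twitter4J: its status lookup works on a numeric status ID rather than on a URL, so a first step would be pulling the ID out of the URL. Below is a minimal sketch using only the standard library. It assumes tweet URLs of the form https://twitter.com/<user>/status/<id>, and the names TweetUrls and extractStatusId are made up for illustration. The resulting ID could then be passed to Twitter4J's showStatus(long) and the text read with getText(), which should give that one tweet rather than the whole conversation.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TweetUrls {
    // Assumed URL shape: .../status/<digits> (or the older .../statuses/<digits>),
    // optionally followed by a query string.
    private static final Pattern STATUS_ID = Pattern.compile("/status(?:es)?/(\\d+)");

    // Returns the numeric status ID found in the URL, or empty if there is none.
    static OptionalLong extractStatusId(String tweetUrl) {
        Matcher m = STATUS_ID.matcher(tweetUrl);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(m.group(1)));
        } catch (NumberFormatException e) {
            // More digits than fit in a long: treat as "no usable ID".
            return OptionalLong.empty();
        }
    }
}
```

With a configured Twitter4J client named twitter (an assumption here), the lookup would look roughly like twitter.showStatus(id).getText().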

Related

How to retrieve html file from a website and get data between specific tags using java

I am going to create a desktop client for my university's parent web interface. When logged in, a web page displays the student details in a table. I want to retrieve those details using Java.
A short Google search brought me to this library: https://jsoup.org
It seems it can both send HTTP requests (to fetch the page from your university's website) and parse the returned HTML, so you can search for the tables you need.
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
log(doc.title());
Elements newsHeadlines = doc.select("#mp-itn b a");
for (Element headline : newsHeadlines) {
    log("%s\n\t%s",
        headline.attr("title"), headline.absUrl("href"));
}
If you don't know how HTML is structured, you should work through a short tutorial on writing simple HTML to understand what is going on and what you are looking for.

expanding links in the tweets in twitter using twitter 4j

Hey, I am doing a project on data retrieval from Twitter. I am collecting tweets about certain kinds of events. Some of the posts contain links; some of these are expanded and some are shortened. I want to save the link from each tweet to my MySQL database. I have found code for expanding URLs; can someone please tell me whether it will work for every shortened URL?
for (URLEntity urle : status.getURLEntities()) {
    System.out.println(urle.getDisplayURL());
    System.out.println(urle.getExpandedURL());
}
With that code you will print the URL twice. From the Javadoc:
getDisplayURL
Returns: the display URL if mentioned URL is shorten, or null if no shorten URL was mentioned.
So, for every URLEntity, print the expanded URL if there is one and fall back to the display URL otherwise:
for (URLEntity urle : status.getURLEntities()) {
    // A String cannot be used as a condition in Java, so test for null explicitly.
    if (urle.getExpandedURL() != null) {
        System.out.println(urle.getExpandedURL());
    } else {
        System.out.println(urle.getDisplayURL());
    }
}
Or, in your case, save them to a database.
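The same choice can be pulled into a small helper so that the database code only ever sees one URL per entity. This is only a sketch: LinkPicker and urlToStore are hypothetical names, and the two parameters stand in for the values returned by getExpandedURL() and getDisplayURL().

```java
class LinkPicker {
    // expandedUrl and displayUrl stand in for URLEntity.getExpandedURL()
    // and URLEntity.getDisplayURL(); either may be null.
    static String urlToStore(String expandedUrl, String displayUrl) {
        if (expandedUrl != null && !expandedUrl.isEmpty()) {
            return expandedUrl; // prefer the expanded form when there is one
        }
        return displayUrl;      // otherwise fall back to the display form
    }
}
```

Inside the loop above this would be called as LinkPicker.urlToStore(urle.getExpandedURL(), urle.getDisplayURL()).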

Using java or javascript instead of Google app script

From Google Apps Script (https://developers.google.com/apps-script/), I got this:
var doc = DocumentApp.create('Hello, World');
// Access the body of the document, then add a paragraph.
doc.getBody().appendParagraph('This document was created by Google Apps Script.');
// Get the URL of the document.
var url = doc.getUrl();
What I would like to do is replicate this from my JavaScript or Java code, so that I can create a doc and get its URL. Any help is appreciated.
You'll need to use the Google Drive API.

Order By doesn't work on Google Spreadsheet API

I am trying to sort a Google Spreadsheet with the Java API but unfortunately it doesn't seem to work. The code I am using is really simple as shown in the API reference.
URL listFeedUrl = new URI(worksheet.getListFeedUrl().toString() + "?orderby=columnname").toURL();
However, this does not work. The feed returned is not sorted at all. Am I missing something? FYI the column I am trying to sort contains email addresses.
EDIT: I just realized that the problem only happens with the old version of Google Spreadsheet.
Maybe this is what is happening: the query is performed on the spreadsheet XML, and the XML tags are in lower case. For example, the title of my column in my spreadsheet is "Nombre" and the XML tag is <gsx:nombre>, so [?orderby=Nombre] does not work. Instead, use [?orderby=nombre] with a lowercase "n".
The correct query for this is:
URL listFeedUrl = new URI(worksheet.getListFeedUrl().toString() + "?orderby=nombre").toURL();
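To avoid retyping column titles in lower case by hand, the query string can be built from the title as it appears in the sheet. A minimal sketch, assuming lowercasing is the only normalization needed (as in the "Nombre" example) and that the title contains no spaces or characters that need URL escaping; ListFeedQuery and orderBy are hypothetical names.

```java
import java.net.URI;
import java.util.Locale;

class ListFeedQuery {
    // Appends an orderby parameter using the lowercased column title.
    // Assumption: the title has no spaces or characters that need escaping.
    static URI orderBy(String listFeedUrl, String columnTitle) {
        String key = columnTitle.toLowerCase(Locale.ROOT);
        // Use '&' if the feed URL already carries a query string.
        String separator = listFeedUrl.contains("?") ? "&" : "?";
        return URI.create(listFeedUrl + separator + "orderby=" + key);
    }
}
```

It would be used as ListFeedQuery.orderBy(worksheet.getListFeedUrl().toString(), "Nombre").toURL().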

get <img> value from a string in java

I'm parsing data from a JSON file. Now I have data like this:
String Content = <p><img class="alignleft size-full wp-image-56999" alt="abdullah" src="http://www.some.com/wp-content/uploads/2013/12/imageName.jpg" width="348" height="239" />Text</p>
<p>Text</p> <p>Text</p><p>The post Some Text appeared first on Some Webiste</p>
Now, I want to divide this string in two pieces. I want to get this URL from src.
http://www.some.com/wp-content/uploads/2013/12/imageName.jpg
and store it in a variable. Also, I want to remove the last line ("The post ... appeared first on ...") and store the remaining text in another variable.
So, the questions are:
Is it possible to do that?
If so, how can I achieve it?
In Java
Get a Document object:
// SAXReader comes from the dom4j library; the input must be well-formed XML.
Document originalDoc = new SAXReader().read(new StringReader("<div>data</div>"));
Then you can parse it (read this tutorial):
http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/
In JavaScript
To get the attribute:
var url = document.getElementsByTagName('img')[0].getAttribute('src');
If you have a string and want a document object, use jQuery:
var stringValue = '<div>data</div>';
var myObject = $(stringValue);
Use String.substring(firstIndex, lastIndex) to get the link from the src attribute.
Learn to use an HTML parser like jsoup; it will be useful in the near future.
If it's a well-structured string, you can parse it using any DOM parser and extract the data from it.
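The substring suggestion can be sketched with the standard library alone. This assumes the markup looks like the sample in the question (a double-quoted src attribute and a final paragraph starting with "<p>The post "); PostContent, firstSrc and withoutFooter are made-up names. For arbitrary HTML, a real parser such as jsoup is the more robust choice.

```java
class PostContent {
    // Returns the value of the first src="..." attribute, or null if none is found.
    static String firstSrc(String html) {
        String marker = "src=\"";
        int start = html.indexOf(marker);
        if (start < 0) {
            return null;
        }
        start += marker.length();
        int end = html.indexOf('"', start); // closing quote of the attribute
        return end < 0 ? null : html.substring(start, end);
    }

    // Drops everything from the last "<p>The post " onward.
    // Returns the input unchanged if that marker is absent.
    static String withoutFooter(String html) {
        int cut = html.lastIndexOf("<p>The post ");
        return cut < 0 ? html : html.substring(0, cut).trim();
    }
}
```

Calling firstSrc(content) yields the image URL for one variable, and withoutFooter(content) yields the remaining markup for the other.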
