Finding out the location of tweets downloaded by flume from twitter


I used some keywords to download tweets from Twitter using Flume. The sample data looks as follows:
{"filter_level":"medium","contributors":null,"text":"Messi, Ozil, CR7 & Suarez Bertengger di Lamborghini t.co/SKk8xnnjl7","geo":null,"retweeted":false,"in_reply_to_screen_name":null,"possibly_sensitive":false,"truncated":false,"lang":"in","entities":{"symbols":[],"urls":[{"expanded_url":"dlvr.it/5XH5Vk","indices":[56,78],"display_url":"dlvr.it/5XH5Vk","url":"t.co/SKk8xnnjl7"}],"hashtags":[],"user_mentions":[]},"in_reply_to_status_id_str":null,"id":461450307856130048,"source":"http://dlvr.it\" rel=\"nofollow\">dlvr.it</a>","in_reply_to_user_id_str":null,"favorited":false,"in_reply_to_status_id":null,"retweet_count":0,"created_at":"Wed Apr 30 10:21:41 +0000 2014","in_reply_to_user_id":null,"favorite_count":0,"id_str":"461450307856130048","place":null,"user":{"location":"Subscribe Us","default_profile":false,"profile_background_tile":true,"statuses_count":158496,"lang":"en","profile_link_color":"006400","profile_banner_url":"pbs.twimg.com/profile_banners/251586988/1368528690","id":251586988,"following":null,"protected":false,"favourites_count":1,"profile_text_color":"333333","description":"Latest Breaking News & Software.\r\n\r\nAkun ini dijual Rp150.000","verified":false,"contributors_enabled":false,"profile_sidebar_border_color":"000000","name":"TOP NEWS","profile_background_color":"000000","created_at":"Sun Feb 13 12:54:44 +0000 
2011","is_translation_enabled":false,"default_profile_image":false,"followers_count":37879,"profile_image_url_https":"pbs.twimg.com/profile_images/449966329588482048/Rb4azNrv_normal.jpeg","geo_enabled":false,"profile_background_image_url":"abs.twimg.com/images/themes/theme14/bg.gif","profile_background_image_url_https":"abs.twimg.com/images/themes/theme14/bg.gif","follow_request_sent":null,"url":"google.com","utc_offset":25200,"time_zone":"Bangkok","notifications":null,"profile_use_background_image":true,"friends_count":10,"profile_sidebar_fill_color":"DDEEF6","screen_name":"7HotNews","id_str":"251586988","profile_image_url":"http://pbs.twimg.com/profile_images/449966329588482048/Rb4azNrv_normal.jpeg","listed_count":19,"is_translator":false},"coordinates":null}
Now I have to find out where each tweet was sent from. Several websites say that the "geo" field in the JSON above gives the location of the tweet, but it is null for most of my tweets.
Can anyone help me with this? I have been stuck on it for two weeks.
Thanks in advance,
RedDevil

A tweet carries two kinds of location: the geo location and the user location.
The user location is unreliable most of the time, since users can type anything they want into their profile.
The geo location is the best way to locate a tweet but, as you have seen, most users do not enable it.
Whether the few geo locations you do have are enough depends on what you need them for.
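The fallback order described above can be sketched as follows. This is only an illustration, assuming the tweet has already been parsed into nested Maps and Lists (for example with Jackson's ObjectMapper.readValue(json, Map.class)). The class name TweetLocation and the "point:", "place:" and "profile:" prefixes are made up for this example. The field names "coordinates", "place" and "user.location" are the ones visible in the sample JSON, and the sketch assumes "coordinates" is a GeoJSON point (longitude first) and that a non-null "place" carries a "full_name".

```java
import java.util.List;
import java.util.Map;

final class TweetLocation {

    /** Returns a best-effort location string, or null when the tweet carries none. */
    static String resolve(Map<String, Object> tweet) {
        // 1. Exact point. GeoJSON order is [longitude, latitude], so swap for display.
        Object coordinates = tweet.get("coordinates");
        if (coordinates instanceof Map) {
            Object point = ((Map<?, ?>) coordinates).get("coordinates");
            if (point instanceof List && ((List<?>) point).size() == 2) {
                List<?> p = (List<?>) point;
                return "point:" + p.get(1) + "," + p.get(0);
            }
        }
        // 2. A place the user tagged on the tweet (assumed to have a "full_name").
        Object place = tweet.get("place");
        if (place instanceof Map) {
            Object fullName = ((Map<?, ?>) place).get("full_name");
            if (fullName != null) {
                return "place:" + fullName;
            }
        }
        // 3. Free-text profile location: least reliable, users can type anything.
        Object user = tweet.get("user");
        if (user instanceof Map) {
            Object location = ((Map<?, ?>) user).get("location");
            if (location instanceof String && !((String) location).trim().isEmpty()) {
                return "profile:" + location;
            }
        }
        return null;
    }
}
```

For the sample tweet in the question, where "coordinates" and "place" are both null, this returns "profile:Subscribe Us", which shows how little the profile field can be trusted.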

Related

how to read recurring events from outlook pst file using libpst

I'm using java-libpst 0.9.3 (https://mvnrepository.com/artifact/com.pff/java-libpst/0.9.3) to read calendar events from a local Outlook PST file.
I am struggling to get the proper information for recurring appointments, including exceptions.
When scanning the appointments, I use
myAppointment.getRecurrenceType() != 0
to check whether this is a recurring event, and then try to use the class
PSTAppointmentRecurrence(byte[] recurrencePattern,
PSTAppointment appt,
PSTTimeZone tz)
to create a recurring appointment object, but I fail to create an instance of this class.
I tried this:
myRecurrendAppointment = new PSTAppointmentRecurrence(myAppointment.getRecurrenceStructure(), myAppointment, ???)
??? is the tz argument: where do I get the system time zone as an instance of PSTTimeZone?
Thanks in advance,
Chris

Get historic prices by ISIN from yahoo finance

I have the following problem:
I have around 1000 unique ISIN numbers of stock exchange listed companies.
I need the historic prices of these companies starting with the earliest listing until today on a daily basis.
However, as far as my research goes, Yahoo can only provide prices for stock ticker symbols, which I do not have.
Is there a way to fetch the historic prices from Yahoo automatically via their API, given only an ISIN such as AT0000609664 (the company Porr)?
I appreciate your replies!

The Answer:
To get the Yahoo ticker symbol from an ISIN, take a look at the yahoo.finance.isin table, here is an example query:
http://query.yahooapis.com:80/v1/public/yql?q=select * from yahoo.finance.isin where symbol in ("DE000A1EWWW0")&env=store://datatables.org/alltableswithkeys
This returns the ticker ADS.DE inside an XML:
<query yahoo:count="1" yahoo:created="2015-09-21T12:18:01Z" yahoo:lang="en-US">
<results>
<stock symbol="DE000A1EWWW0">
<Isin>ADS.DE</Isin>
</stock>
</results>
</query>
<!-- total: 223 -->
<!-- pprd1-node600-lh3.manhattan.bf1.yahoo.com -->
I am afraid your example ISIN won't work, but that's an error on Yahoo's side (see Yahoo Symbol Lookup and type your ISINs in there to check whether the ticker exists on Yahoo).
The Implementation:
Sorry, I am not proficient in Java or R anymore, but this C# code should be close enough to port almost line by line:
public string GetYahooSymbol(string isin)
{
    // Build the YQL URI for this ISIN and fetch the XML response.
    string query = GetQuery(isin);
    XDocument result = GetHttpResult(query);
    // Navigate query/results/stock and read the ticker from the Isin node.
    XElement stock = result.Root.Element("results").Element("stock");
    return stock.Element("Isin").Value; // Value is already a string
}
where GetQuery(string isin) returns the URI for the query to yahoo (see my example URI) and GetHttpResult(string URI) fetches the XML from the web. Then you have to extract the contents of the Isin node and you're done.
I assume you have already implemented the actual data fetch using ticker symbols.
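Since the question asked for Java, here is a rough Java counterpart of the two pieces described above, using only the standard library. It is a sketch, not a tested client: the class and method names (YahooIsinLookup, buildQueryUrl, extractTicker) are invented for this example, the HTTP fetch is left out, and the XML layout is assumed to match the response shown earlier.

```java
import java.io.ByteArrayInputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class YahooIsinLookup {

    /** Builds the query URL from the example above, with the YQL statement URL-encoded. */
    static String buildQueryUrl(String isin) {
        String yql = "select * from yahoo.finance.isin where symbol in (\"" + isin + "\")";
        return "http://query.yahooapis.com/v1/public/yql?q=" + encode(yql)
                + "&env=" + encode("store://datatables.org/alltableswithkeys");
    }

    /** Returns the text of the first Isin element, or null if the response has none. */
    static String extractTicker(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("Isin");
            if (nodes.getLength() == 0) {
                return null;
            }
            String ticker = nodes.item(0).getTextContent().trim();
            return ticker.isEmpty() ? null : ticker;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the response", e);
        }
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

Returning null for a missing Isin node lets the caller skip ISINs that have no ticker on Yahoo, such as the one in the question.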
Also see this question for the inverse problem (symbol -> isin). But for the record:
Query to fetch historical data for a symbol
http://query.yahooapis.com:80/v1/public/yql?q=select * from yahoo.finance.historicaldata where symbol in ("ADS.DE") and startDate = "2015-06-14" and endDate = "2015-09-22"&env=store://datatables.org/alltableswithkeys
where you may pass arbitrary dates and an arbitrary list of ticker symbols. It's up to you to build the query in your code and to pull the results from the XML you get back. The response will be along the lines of
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="71" yahoo:created="2015-09-22T20:00:39Z" yahoo:lang="en-US">
<results>
<quote Symbol="ADS.DE">
<Date>2015-09-21</Date>
<Open>69.94</Open>
<High>71.21</High>
<Low>69.65</Low>
<Close>70.79</Close>
<Volume>973600</Volume>
<Adj_Close>70.79</Adj_Close>
</quote>
<quote Symbol="ADS.DE">
<Date>2015-09-18</Date>
<Open>70.00</Open>
<High>71.43</High>
<Low>69.62</Low>
<Close>70.17</Close>
<Volume>3300200</Volume>
<Adj_Close>70.17</Adj_Close>
</quote>
......
</results>
</query>
<!-- total: 621 -->
<!-- pprd1-node591-lh3.manhattan.bf1.yahoo.com -->
This should get you far enough to write your own code. Note that it may be possible to get the data in .csv format by appending &e=.csv to the query, but I don't know much about that or whether it works for the queries above, so see here for reference.

I found a web service which provides historic data for a given date range. Please have a look:
http://splice.xignite.com/services/Xignite/XigniteHistorical/GetHistoricalQuotesRange.aspx

RestFB get events in Java

With FQL I can find events that contain a given word, but FQL only works in API versions below 2.1. I also use the Graph API Explorer to display events, e.g.
search?q=york&type=event
Example of FQL
SELECT eid, name, location, start_time, description, pic_small, creator, venue FROM event WHERE start_time > "Sun Jun 21 0:00:35 GMT 2015" AND CONTAINS("york")
I would like to search for events using RestFB without FQL, but I do not know how. The documentation is scarce.

I already answered this on GitHub, but perhaps someone else will find it useful.
Your particular case is not in the documentation, but you can apply what the searching section describes to solve it: http://restfb.com/#searching
Connection<Event> eventList =
    facebookClient.fetchConnection("search", Event.class,
        Parameter.with("q", "york"), Parameter.with("type", "event"));
Now you can iterate over eventList; each iteration step yields one page of results as a List<Event>.
The documentation shows how: http://restfb.com/#fetching-connections

Extracting tweets of a hashtag tweeted from a specific location using twitter4j

I tried to extract all the tweets for a hashtag that were tweeted from a specific location, like this:
Query query = new Query("Obama");
GeoLocation gl = new GeoLocation(40.7127,74.0059); // geo location of Newyork
query.setCount(100);
query.setLocale("en");
query.setLang("en");
query.setGeoCode(gl, 50, "Query.MILES");
But, I got an empty result. I got all the tweets on Obama before using the setGeoCode method. Where did I go wrong?

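Call query.setGeoCode with the constant Query.MILES, without quotes around it:
query.setGeoCode(gl, 50, Query.MILES);
Also check the sign of the longitude: New York lies west of Greenwich, so its longitude is -74.0059, not 74.0059.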
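To see why the quoted version returns nothing, it helps to look at the value that ends up in the search request. Twitter's search API expects the geocode parameter as "latitude,longitude,radius" with the radius suffixed by mi or km. The helper below is a hypothetical stand-in that mimics this by plain concatenation; it is not Twitter4J's actual code.

```java
final class GeocodeParam {

    /** Formats a geocode value as "latitude,longitude,radius" followed by the unit suffix. */
    static String geocode(double latitude, double longitude, double radius, String unit) {
        return latitude + "," + longitude + "," + radius + unit;
    }

    public static void main(String[] args) {
        // A unit suffix of "mi" gives a well-formed value.
        System.out.println(geocode(40.7127, -74.0059, 50, "mi"));          // 40.7127,-74.0059,50.0mi
        // The literal text "Query.MILES" is not a unit the API understands.
        System.out.println(geocode(40.7127, -74.0059, 50, "Query.MILES")); // 40.7127,-74.0059,50.0Query.MILES
    }
}
```

The second value is not a radius the search API can interpret, which is consistent with the empty result described in the question.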

How to parse json to a database?

I have a complex JSON document stored in a string. I know Java and a little MySQL. I need to build a database from the JSON.
I'm using some twitter data so the tweets contain the description of the user who tweeted it and in case it's been retweeted, it contains the description of the user who tweeted it before this user.
My objective is to create a user table ( or array or any other data structure ) which contains all the tweets this user tweeted, and all his tweets which have been retweeted.
The tweet object contains around 50-80 objects so giving an example here will make this post really long.
Example
StatusJSONImpl{createdAt=Wed Sep 28 12:04:55 IST 2011, id=118936707830775808, text='RT #nytimesbits: Google's Biggest Threat Is Google http://t.co/kTNqJFJC', source='web', isTruncated=false, inReplyToStatusId=-1, inReplyToUserId=-1, isFavorited=false, inReplyToScreenName='null', geoLocation=null, place=null, retweetCount=6, wasRetweetedByMe=false, contributors=null, annotations=null, retweetedStatus=StatusJSONImpl{createdAt=Wed Sep 28 05:35:26 IST 2011, id=118838689248985088, text='Google's Biggest Threat Is Google http://t.co/kTNqJFJC', source='The New York Times', isTruncated=false, inReplyToStatusId=-1, inReplyToUserId=-1, isFavorited=false, inReplyToScreenName='null', geoLocation=null, place=null, retweetCount=6, wasRetweetedByMe=false, contributors=null, annotations=null, retweetedStatus=null, userMentionEntities=[], urlEntities=[URLEntityJSONImpl{start=34, end=54, url=http://t.co/kTNqJFJC, expandedURL=http://nyti.ms/pR9DfX, displayURL=nyti.ms/pR9DfX}], hashtagEntities=[], user=UserJSONImpl{id=14434070, name='NYTimes Bits Blog', screenName='nytimesbits', location='The Cloud', description='News and analysis on tech and business. Also here: select retweets from NYT tech writers and friends. Account maintained by David F. 
Gallagher (#davidfg).', isContributorsEnabled=true, profileImageUrl='http://a1.twimg.com/profile_images/108833947/bits75_normal.jpg', profileImageUrlHttps='https://si0.twimg.com/profile_images/108833947/bits75_normal.jpg', url='http://nytimes.com/bits', isProtected=false, followersCount=53180, status=null, profileBackgroundColor='9ae4e8', profileTextColor='000000', profileLinkColor='0000ff', profileSidebarFillColor='e0ff92', profileSidebarBorderColor='87bc44', profileUseBackgroundImage=true, showAllInlineMedia=false, friendsCount=139, createdAt=Fri Apr 18 20:49:26 IST 2008, favouritesCount=5, utcOffset=-18000, timeZone='Eastern Time (US & Canada)', profileBackgroundImageUrl='http://a3.twimg.com/profile_background_images/4780380/twitter_post.png', profileBackgroundImageUrlHttps='https://si0.twimg.com/profile_background_images/4780380/twitter_post.png', profileBackgroundTiled=true, lang='en', statusesCount=6360, isGeoEnabled=false, isVerified=true, translator=false, listedCount=4671, isFollowRequestSent=false}}, userMentionEntities=[UserMentionEntityJSONImpl{start=3, end=15, name='NYTimes Bits Blog', screenName='nytimesbits', id=14434070}], urlEntities=[URLEntityJSONImpl{start=51, end=71, url=http://t.co/kTNqJFJC, expandedURL=http://nyti.ms/pR9DfX, displayURL=nyti.ms/pR9DfX}], hashtagEntities=[], user=UserJSONImpl{id=17989546, name='Wolfgang Fasching-K.', screenName='wwwof', location='Vienna', description='Digital ist besser. Fokus: IT & Internet, World News & US Politik, Medien & Pop/Kultur. 
http://www.riverone.at', isContributorsEnabled=false, profileImageUrl='http://a0.twimg.com/profile_images/67758989/SF050069-w_normal.JPG', profileImageUrlHttps='https://si0.twimg.com/profile_images/67758989/SF050069-w_normal.JPG', url='null', isProtected=false, followersCount=59, status=null, profileBackgroundColor='C0DEED', profileTextColor='333333', profileLinkColor='0084B4', profileSidebarFillColor='DDEEF6', profileSidebarBorderColor='C0DEED', profileUseBackgroundImage=true, showAllInlineMedia=false, friendsCount=64, createdAt=Tue Dec 09 17:09:35 IST 2008, favouritesCount=0, utcOffset=3600, timeZone='Vienna', profileBackgroundImageUrl='http://a3.twimg.com/profile_background_images/234523169/Naschmarkt-Wien-Juni10-2010-s.jpg', profileBackgroundImageUrlHttps='https://si0.twimg.com/profile_background_images/234523169/Naschmarkt-Wien-Juni10-2010-s.jpg', profileBackgroundTiled=true, lang='en', statusesCount=269, isGeoEnabled=false, isVerified=false, translator=false, listedCount=4, isFollowRequestSent=false}}

For JSON parsing, I recommend Jackson. Also, in order to validate your input, you should have a look at JSON Schema (for which I have an implementation if you want).
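Here is how to parse JSON held in a string using Jackson:
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
// Jackson 2.x package names; readTree throws a checked exception on malformed input
final ObjectMapper mapper = new ObjectMapper();
final JsonNode node = mapper.readTree(yourInput);
// Access members:
node.get(0); // element 0 of an array node (null if absent)
for (final JsonNode entry : node) {
    // visit each element of an array node
}
node.get("foo"); // property "foo" of an object node (null if absent)
node.get("foo").asText(); // value as text (getTextValue() in Jackson 1.x)
// etc.
It also has plenty of options for binding JSON to POJOs, if that's what you want.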

Your first step will be to parse the JSON to get an object graph, using a library like gson or any of several others.
Then (and this seems really general, but it's a pretty open question) it's a matter of determining what the schema should be, creating the tables, and looping through the object graph populating them.
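As a rough illustration of the "loop through the object graph" step, the sketch below groups tweets by author and files the original of each retweet under its own author, which is close to the user table the question describes. The Tweet and TweetIndex classes are hypothetical, stripped down to four fields, and stand in for whatever objects your JSON library produces. Each resulting list could then become rows in a tweet table keyed by user id.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Minimal stand-in for a parsed tweet; real tweets carry many more fields. */
final class Tweet {
    final long id;
    final long userId;
    final String text;
    final Tweet retweetedStatus; // null unless this tweet is a retweet

    Tweet(long id, long userId, String text, Tweet retweetedStatus) {
        this.id = id;
        this.userId = userId;
        this.text = text;
        this.retweetedStatus = retweetedStatus;
    }
}

final class TweetIndex {

    /** Maps each user id to that user's tweets, including originals seen only via retweets. */
    static Map<Long, List<Tweet>> groupByUser(List<Tweet> tweets) {
        Map<Long, List<Tweet>> byUser = new LinkedHashMap<>();
        for (Tweet tweet : tweets) {
            add(byUser, tweet);
            if (tweet.retweetedStatus != null) {
                add(byUser, tweet.retweetedStatus);
            }
        }
        return byUser;
    }

    private static void add(Map<Long, List<Tweet>> byUser, Tweet tweet) {
        List<Tweet> list = byUser.computeIfAbsent(tweet.userId, id -> new ArrayList<>());
        for (Tweet existing : list) {
            if (existing.id == tweet.id) {
                return; // already recorded, e.g. the same original retweeted twice
            }
        }
        list.add(tweet);
    }
}
```

The linear duplicate check keeps the sketch short; with a real database, a unique key on the tweet id would do the same job.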
You might look at "document databases" (so-called NoSQL) rather than SQL ones if you're allowed to, as they usually allow the schema to be more fluid.

If your problem is just with Twitter, you can look at dedicated APIs like Twitter4J or Spring Social, which should provide ready-made Java beans for tweets.
If you're building a small project, Gson is a good choice for parsing. If you're making something more sophisticated, I suggest using Jackson for parsing and Hibernate as the layer between the application and the SQL database.
