Polling multiple Twitter accounts for tweet impressions and likes - Java

I'm currently working on a use case (developed in Java/Spring) where I have a large number of Twitter accounts (the count can reach into the thousands) to which I post data (tweets) as configured/scheduled.
I've implemented posting data to Twitter, but I'm unsure how to pull impressions, retweets, and likes of tweets from the various accounts.
One solution is to poll all accounts at a regular interval, but in that case I won't get the number of likes on tweets already made: I'm using the user and mentions timeline APIs with the "since_id" parameter, which always fetch only the latest tweets and retweets and therefore never return the like counts of my older tweets.
Another option is to use the streaming APIs, opening a stream for every Twitter account I have, but that doesn't seem feasible because I have a very large number of accounts and I doubt my Java app can handle that many streams.
Can someone please suggest how I can solve this? Any help is greatly appreciated.
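Roughly, my current polling looks like the sketch below (using Twitter4J). The batched lookup at the end is only an idea I haven't verified; the point is that the timeline call with since_id never revisits older tweets, so their counts are never refreshed.

    import java.util.List;
    import twitter4j.Paging;
    import twitter4j.ResponseList;
    import twitter4j.Status;
    import twitter4j.Twitter;
    import twitter4j.TwitterException;
    import twitter4j.TwitterFactory;

    public class TimelinePoller {

        private final Twitter twitter = TwitterFactory.getSingleton();

        // Polls only the tweets newer than the last one already seen.
        public List<Status> pollNewTweets(long sinceId) throws TwitterException {
            ResponseList<Status> statuses =
                    twitter.getUserTimeline(new Paging().sinceId(sinceId));
            for (Status s : statuses) {
                // Counts are only available for the newest tweets returned here;
                // older tweets are never refreshed by this call.
                System.out.println(s.getId() + " likes=" + s.getFavoriteCount()
                        + " retweets=" + s.getRetweetCount());
            }
            return statuses;
        }

        // Unverified idea: re-fetch known tweet IDs in batches (up to 100 per call)
        // to refresh their like/retweet counts.
        public ResponseList<Status> refreshCounts(long... knownTweetIds) throws TwitterException {
            return twitter.lookup(knownTweetIds);
        }
    }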

It seems your problem is one of scale rather than design, judging by the statement "and I doubt that my Java app can handle that many no. of streams."
Let's look in a different direction: it's time to move to the world of "Big Data".
Apache Kafka, Pig, Hive, YARN, Storm, HBase, Hadoop, etc. - the list is overwhelming.
Apache Spark - large-scale data processing that supports concepts such as MapReduce, in-memory processing, stream processing, graph processing, etc.
Apache Storm - stream processing; Storm was created at Twitter, and you could say Apache Storm is its counterpart here.
Apache Kafka - offers brokers that collect streams, then log and buffer them in a fault-tolerant manner.
Hadoop - for storage of the data.
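For instance, the ingestion side with Kafka could be as small as the sketch below (the topic name and JSON payload are made-up examples): your pollers publish engagement events and the heavy processing happens downstream.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class EngagementPublisher {

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Key by account id so all events for one account land in the
                // same partition; the payload format here is an assumption.
                String accountId = "account-42";
                String payload = "{\"tweetId\":123456789,\"likes\":10,\"retweets\":3}";
                producer.send(new ProducerRecord<>("tweet-engagement", accountId, payload));
            }
        }
    }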
http://www.itworld.com/article/2827285/big-data/what-hadoop-can--and-can-t-do.html
happy designing.

Related

Allow app to communicate with web domain

I have a domain that hosts user posts, and I plan to create a user-posts app like 9gag. I need the app to be able to communicate with my domain and fetch the data hosted there.
Things I need the app to do:
1) Allow users to post pictures through the app.
2) Allow users to leave comments through the app.
3) Allow users to leave 'likes' through the app.
I want the data to be stored on my domain, and when a user opens the app, the app should fetch this data from the domain and display it for the user. How can I make my app communicate with the domain?
Thanks!
The best way to do this would be to implement an API on your domain that your app can send requests to. I cannot explain all of this in detail here because it would require a lot of space and a full-blown tutorial, but I can tell you what to research and implement to make this happen.
First off, you need to create an API for your app to send requests to. I suggest a "RESTful" API, as they are pretty straightforward to the average programmer. Here is a good video that explains what an API is and a little bit of how they are typically implemented: https://youtu.be/7YcW25PHnAA
After you have an API set up, you have to "encode" the information so that it is easy to parse once your app has hold of it. To do this we use a "data-interchange format". One of the big ones in use today is JSON; see its website to learn more: http://www.json.org/ JSON is pretty straightforward and easy to understand if you have a grasp of programming concepts such as objects, strings, and arrays.
OK, so you have gotten your information from the server, parsed the JSON, and displayed all your content... now, what do you do if your user gives a thumbs up or comments on something? This is also handled via the API, and this part should be the easiest for you: it involves wrapping up the required data (content ID, user ID, and what they did, i.e. liked the content) and sending it via an HTTP request, just like how you got your information in the first place. The difference is that instead of reading the response data, the app just sends the HTTP request and doesn't care what happens next (at the app level); it's up to the server to record the data from the request.
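Just to make that concrete, here is a bare-bones sketch of what sending a "like" could look like on the Java/Android side. The URL and JSON fields are made up for illustration, and a real app would run this off the main thread.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class LikeSender {

        // Endpoint and payload shape are hypothetical - adjust to your own API.
        public static void sendLike(long contentId, long userId) throws Exception {
            URL url = new URL("https://example.com/api/likes");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);

            String json = "{\"contentId\":" + contentId
                    + ",\"userId\":" + userId
                    + ",\"action\":\"like\"}";
            try (OutputStream os = conn.getOutputStream()) {
                os.write(json.getBytes(StandardCharsets.UTF_8));
            }

            // We only care that the server accepted the request.
            int status = conn.getResponseCode();
            conn.disconnect();
            if (status != 200 && status != 201) {
                throw new IllegalStateException("Server rejected the like: " + status);
            }
        }
    }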
I would highly suggest looking up how to create an API and working through some tutorials... there are a lot of tutorials out there that want you to modify the .htaccess file on the server, and this isn't really necessary (boy, I hope I don't get crucified for saying that; fellow Stack Overflow citizens, if you disagree, please explain your reasoning). Obviously for a large mainstream website the whole .htaccess setup might be a good idea, but for a beginner I don't think it is really needed.

Amazon-MWS: Difference between Reports and Order lists

I'm trying to integrate the orders from Amazon Marketplace into our system. I did that before with Magento and thought this would be just as easy, but somehow I got stuck.
I downloaded the Java APIs from Amazon and started playing around with the examples.
So far so good - I was able to get them running.
But playing with the Reports API and the Orders API, I started to wonder which one to use if I only want to get the unshipped orders to put them into our system.
1. Doing this with the Reports API seems very complicated and involves a lot of calls to MWS. This is documented by Amazon here.
2. Using the Orders API seems pretty straightforward. I only have to create a ListOrdersRequest, define what type of orders I want, and finally get them via a ListOrders call.
So my question is: What is the reason to choose the Reports API over the Orders API?
It seems like Amazon recommends the Reports API, but I really do not understand why it should be so complicated. Why should I fetch reports when I can get the orders directly?
Both approaches can work. Here's why I would pick the Reports API:
Reports are more scalable. I believe MWS reports can return an unlimited number of records, whereas ListOrders can return a maximum of 100 orders. You can get more using ListOrdersByNextToken, but that brings throttling into the problem, and it is not clear whether you're just paging by an offset (which could cause lost or duplicated orders) or working from a snapshot.
You can acknowledge reports and filter on unacknowledged reports. Orders can be acknowledged too, but I don't think there is a way to filter ListOrders by acknowledgement status.
Reports can be scheduled to auto-generate on an interval, as often as every 15 minutes. This means it may not be as many calls as you think: really, it's only three per interval - one to list unacknowledged order reports, one to pull the report you want, and one to acknowledge it.
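To make that schedule concrete, the per-interval loop boils down to the sketch below. The MwsReportsClient interface is hypothetical shorthand for the generated MWS client classes; only the three operation names noted in the comments (GetReportList, GetReport, UpdateReportAcknowledgements) come from MWS itself.

    import java.io.InputStream;
    import java.util.List;

    public class OrderReportPoller {

        // Hypothetical thin wrapper around the generated MWS Reports client;
        // each method corresponds to one real MWS operation.
        interface MwsReportsClient {
            List<String> listUnacknowledgedOrderReportIds(); // GetReportList
            InputStream getReport(String reportId);          // GetReport
            void acknowledgeReport(String reportId);         // UpdateReportAcknowledgements
        }

        private final MwsReportsClient client;

        OrderReportPoller(MwsReportsClient client) {
            this.client = client;
        }

        // Run once per interval (e.g. every 15 minutes): three calls in total
        // for the common case of a single new report.
        public void pollOnce() throws Exception {
            for (String reportId : client.listUnacknowledgedOrderReportIds()) {
                try (InputStream report = client.getReport(reportId)) {
                    // parse the report and import the unshipped orders here...
                }
                client.acknowledgeReport(reportId);
            }
        }
    }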

Evolution of an application, data treatment, and synchronization

I'm an Android developer at a small company in France. I've never been faced with this problem before now.
Explanation:
We have developed an Android application which has to work with and without a network connection (implying both ascending and descending synchronization).
The user has to log in, then we call the web service to fetch the information he needs to do his work. The trouble is that he needs to get ~2500 rows (unmarshalled by Jackson into our objects). This synchronization takes nearly 3 minutes on 3G and more than 5 minutes on EDGE... He then does what he has to do and sends the information back to the server once he has a network connection again.
MySQL and our web services respond in good time (~0.05 s per request for MySQL, and ~105 ms per request when accessing the web service from a web page). We currently need 10-15 requests to get all the needed information.
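For reference, each descending request is essentially the following (simplified; the Row class and endpoint are placeholders):

    import java.io.InputStream;
    import java.net.URL;
    import java.util.List;
    import com.fasterxml.jackson.core.type.TypeReference;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class SyncClient {

        private final ObjectMapper mapper = new ObjectMapper();

        // Placeholder for one of our ~2500 synchronized rows.
        public static class Row {
            public long id;
            public String payload;
        }

        // One of the 10-15 requests: download a JSON array and let Jackson
        // unmarshal it into our objects.
        public List<Row> fetchRows(String endpoint) throws Exception {
            try (InputStream in = new URL(endpoint).openStream()) {
                return mapper.readValue(in, new TypeReference<List<Row>>() {});
            }
        }
    }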
Is there any way to reduce this, or a solution to improve/refactor our coding methods?
In fact, I guess we didn't design the application the right way when I look at the Google Drive mobile or Facebook Messenger apps, which are really, really fast.
So I'm looking for a solution, and moreover, we have a client that will need ~50,000 rows per user in the next few months...
Thanks for all,

Best way to build a chat bot

What framework can I start with to create a simple chatbot? The focus of the bot is very limited (it's for my project management website http://ayeboss.com).
One can compare it with Siri on the iPhone. I want to create a simple "answering" chat which will answer questions like "give me all completed tasks so far", "show me the last completed task", or "show|list|give me my pending tasks", etc. After the user asks the question, I want to present the data to the user.
As of now I am creating a regex dictionary of possible questions, and if there is no match then I do a Lucene search to find the nearest match.
Am I doing it right?
Chatbots within a narrow field like yours typically rely on 2 important concepts:
Intent detection: identifying what the user is requesting
Entity extraction: identifying entities in the user's request. For instance, in a flight reservation bot, examples of entities are the source, destination, and travel dates. In a weather bot, the entities can be the desired date or the location for which the weather is requested.
Your specific type of chatbot has the definite goals of retrieving the list of completed tasks and retrieving the last completed task. To develop this, you need to define the intents of interest. From your examples we can easily define 2 intents:
COMPLETED_TASKS_REQUEST
LAST_COMPLETED_TASK
Based on these 2 intents, there is really no entity to be detected. You simply query your service API to retrieve the requested information in each scenario.
The next phase will be to train a classifier to identify the intents. This can be done by collecting some sample sentences for each request type and training on those.
The flow is then reduced to the following:
Bot receives message
Bot identifies intent
Bot extracts relevant entities (if required)
If the intent is recognised, the bot queries the data source to retrieve the answer; otherwise the bot says it doesn't understand the request. Alternatively, if the bot needs an entity to complete the request, it asks the user to provide that information and then completes its task. This is usually called a slot-based approach. You can read more about how a Dialog Manager works.
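A minimal sketch of this loop, with the intent detector stubbed out as plain regex matching (the patterns and the TaskService interface are placeholders for your real classifier and data source):

    import java.util.regex.Pattern;

    public class TaskBot {

        enum Intent { COMPLETED_TASKS_REQUEST, LAST_COMPLETED_TASK, UNKNOWN }

        // Stand-in for a trained classifier: a couple of hand-written patterns.
        private static final Pattern COMPLETED = Pattern.compile(
                ".*\\b(completed|finished|done)\\b.*tasks?.*", Pattern.CASE_INSENSITIVE);
        private static final Pattern LAST_COMPLETED = Pattern.compile(
                ".*\\blast\\b.*\\b(completed|finished)\\b.*task.*", Pattern.CASE_INSENSITIVE);

        interface TaskService {          // placeholder for your service API
            String listCompletedTasks();
            String lastCompletedTask();
        }

        private final TaskService tasks;

        TaskBot(TaskService tasks) {
            this.tasks = tasks;
        }

        Intent detectIntent(String message) {
            if (LAST_COMPLETED.matcher(message).matches()) return Intent.LAST_COMPLETED_TASK;
            if (COMPLETED.matcher(message).matches()) return Intent.COMPLETED_TASKS_REQUEST;
            return Intent.UNKNOWN;
        }

        String reply(String message) {
            switch (detectIntent(message)) {
                case LAST_COMPLETED_TASK:     return tasks.lastCompletedTask();
                case COMPLETED_TASKS_REQUEST: return tasks.listCompletedTasks();
                default:                      return "Sorry, I don't understand that yet.";
            }
        }
    }

Once you have real training data, detectIntent is the only piece you swap out for a proper classifier; the rest of the flow stays the same.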
Note that if you're not into machine learning or NLP, you can easily train an intent detector on platforms like wit.ai or api.ai, and the intent and entity classification part of this task is reduced to simple HTTP API requests. That said, when building genuinely complicated or sophisticated bots, it is almost always better to build your own models, since you have full control and can handle edge cases better. Platforms like wit.ai or api.ai need to perform well across many domains, while you can focus on making yours an expert in task management.
Hope this helps.
PS: To make your bot more interesting, we can add one more intent, such as retrieving the status of a specific task given its ID. E.g. a user can ask "what is the status of task 54?". This intent can be called TASK_STATUS_REQUEST. In this example, the intent has an entity, the ID of the requested task, so you will need to extract that :)
This is an NLP task, and building a system like this requires a lot of R&D. You can start by building a set of questions that might be asked, analyzing them, and coming up with word patterns for each type of question. The next step would be to transform the English sentence into some formal structure (maybe SQL or lambda calculus). The backend DB should hold the data, which can then be queried with that formal language.
The main problem lies in converting the English sentence to a formal language. You can start with regexes and progressively make it more sophisticated by checking the part of speech and syntactic structure of the input sentences. Check out the NLTK package for NLP tasks.
On top of the chat bot library, you can integrate an instant messaging library like Hyphenate to enable the chat bot for mobile and web communication.
Here are a few simple steps:
Hyphenate Console: create a chatbot entity by signing up for an account at the Hyphenate console (console.hyphenate.io); this gives your chatbot an identity and a voice via a Hyphenate IM account for the bot.
Platform SDK: integrate your app (iOS, Android, or Web) with the Hyphenate IM services and open source UI library.
Webhooks (event callbacks): set up Hyphenate webhooks to receive the messages from the user, which are pushed to your developer backend, then process them with your chatbot AI library.
Backend REST API: push the chatbot's messages to the user from your developer backend via the REST APIs provided by Hyphenate.
Hooray! Webhooks + backend REST API = relay messages between chatbot and user.
http://docs.hyphenate.io/docs/chat-bot-integration
You can use the Microsoft NLP framework, which is pretty straightforward and easy to use for beginners. Known as LUIS, it is one of the Cognitive Services that Microsoft provides.
It's basically a combination of API calls.
Not sure which language you are familiar with, but in Java you can do it using the Apache OpenNLP library. This is a very good and easy-to-use library for natural language processing. To take a very basic approach, you can break text into sentences and tokenize them into words. Then you can lemmatize the words to reduce them to their base forms. Then you can classify or categorize them using the categorizer with proper training data - the better the training, the smarter the chat bot. You can also select categories to make the chat bot hold a conversation in a more engaging way. Here is a very good article with a detailed example and demo.
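For the categorization step, a rough sketch with OpenNLP's document categorizer might look like this (assuming OpenNLP 1.8+ and a training file where each line is "CATEGORY sample sentence"; the file name and input sentence are placeholders):

    import java.io.File;
    import opennlp.tools.doccat.DoccatFactory;
    import opennlp.tools.doccat.DoccatModel;
    import opennlp.tools.doccat.DocumentCategorizerME;
    import opennlp.tools.doccat.DocumentSample;
    import opennlp.tools.doccat.DocumentSampleStream;
    import opennlp.tools.tokenize.SimpleTokenizer;
    import opennlp.tools.util.InputStreamFactory;
    import opennlp.tools.util.MarkableFileInputStreamFactory;
    import opennlp.tools.util.ObjectStream;
    import opennlp.tools.util.PlainTextByLineStream;
    import opennlp.tools.util.TrainingParameters;

    public class IntentCategorizer {

        public static void main(String[] args) throws Exception {
            // Training data: one "CATEGORY sample sentence" pair per line.
            InputStreamFactory data = new MarkableFileInputStreamFactory(new File("intents.train"));
            ObjectStream<String> lines = new PlainTextByLineStream(data, "UTF-8");
            ObjectStream<DocumentSample> samples = new DocumentSampleStream(lines);

            DoccatModel model = DocumentCategorizerME.train(
                    "en", samples, TrainingParameters.defaultParams(), new DoccatFactory());
            samples.close();

            DocumentCategorizerME categorizer = new DocumentCategorizerME(model);
            String[] tokens = SimpleTokenizer.INSTANCE.tokenize("show me my pending tasks");
            double[] outcomes = categorizer.categorize(tokens);
            System.out.println("Intent: " + categorizer.getBestCategory(outcomes));
        }
    }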

Aggregating results from various sources - Java application architecture

I searched Google and Stack Overflow for my problem but couldn't find a good solution. Below is the description.
Our Java web application displays search results from our local database and from external web service API calls, so the search logic has to combine these results and display them on the results page. The problem is that the external API calls return results more slowly than our local DB calls. Performance is crucial for our search results, and the results must be live, i.e. we should not cache or persist the external results in our local DB. Right now we spawn two threads, one for the DB call and another for the external API, then combine the results and display them on the screen. But this kills the performance of our application, particularly when we call more than one external API.
Is there any architectural solution for this problem?
Any help would be greatly appreciated.
Thanks.
You cannot display data before you have it.
1) You can display your local data immediately and, as the external data arrives, add it via Ajax.
2) If there are repeated queries, you could cache external answers for a short time (displaying them with a warning that they are old and will be replaced by a fresh answer) and, as soon as the fresh answer arrives, push the new one.
With at least 1), the system will be responsive; with 2), a usable answer can be available immediately, even if it is not current.
BTW, if the external source takes that long to answer, are you sure its answer is not stale (e.g. if they gather some data and then wait for the rest, what they gathered so far can go stale)? So maybe (and maybe not) short-term persisting is not as bad as you think.
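As a Java-side sketch of the "show what you have, fill in the rest later" idea (assuming Java 9+ for completeOnTimeout; searchDb and searchExternal are placeholders for your real calls):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;

    public class AggregatingSearch {

        // Placeholders for the real data sources.
        static List<String> searchDb(String query)       { return List.of("local:" + query); }
        static List<String> searchExternal(String query) { return List.of("external:" + query); }

        public static List<String> search(String query) {
            CompletableFuture<List<String>> local =
                    CompletableFuture.supplyAsync(() -> searchDb(query));
            CompletableFuture<List<String>> external =
                    CompletableFuture.supplyAsync(() -> searchExternal(query))
                            // Give the slow source a fixed budget, then fall back to "nothing yet".
                            .completeOnTimeout(List.of(), 500, TimeUnit.MILLISECONDS);

            List<String> combined = new ArrayList<>(local.join());
            combined.addAll(external.join()); // waits at most ~500 ms beyond the local call
            return combined;                  // late external results can still be pushed via Ajax
        }
    }

This puts an upper bound on page latency instead of always waiting for the slowest source, which matches option 1) above.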
