What framework can I start with to create a simple chatbot? The focus of the bot is very limited (it's for my project management website http://ayeboss.com).
One can compare it with Siri on the iPhone. I want to create a simple "answering" chat that will answer questions like "give me all completed tasks so far", "show me the last completed task", or "show|list|give me my pending tasks". After the user asks the question, I want to present the data to the user.
As of now I am creating a regex dictionary of possible questions, and if there is no match then I do a Lucene search to find the nearest match.
Am I doing it right?
Chatbots within a narrow domain like yours typically rely on 2 important concepts:
Intent detection: identifying what the user is requesting
Entity extraction: identifying entities in the user's request. For instance, in a flight reservation bot, examples of entities are the source, destination, and travel dates. In a weather bot the entities can be the desired date for the forecast or the location where the weather is required.
Your specific type of chatbot has the definite goals of retrieving the list of completed tasks and retrieving the last completed task. To develop this, you need to define the intents of interest. From your examples we can easily define 2 intents:
COMPLETED_TASKS_REQUEST
LAST_COMPLETED_TASK
Based on these 2 intents, there is really no entity to be detected. You simply query your service API to retrieve the requested information in each scenario.
The next phase will be to train a classifier to identify the intents. This can be done by getting some sample sentences for each request type and training over those.
The flow is then reduced to the following:
Bot receives message
Bot identifies intent
Bot extracts relevant entities (if required)
If the intent is recognised, the bot queries the data source to retrieve the answer; otherwise it complains that it doesn't understand the request. Alternatively, if the bot needs an entity to complete the request, it asks the user to provide that information and then completes its task. This is usually called a slot-based approach; you can read more on how a Dialog Manager works.
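To make that flow concrete, here is a minimal Java sketch of the dispatch step. The IntentClassifier interface and the TaskService calls are hypothetical placeholders, not from any particular library; they stand in for whichever classifier and data-access layer you end up using.

import java.util.List;

// Hypothetical sketch of the bot's dispatch logic; all names are placeholders.
public class TaskBot {

    interface IntentClassifier {
        String classify(String message); // e.g. returns "COMPLETED_TASKS_REQUEST"
    }

    interface TaskService {
        List<String> completedTasks();
        String lastCompletedTask();
    }

    private final IntentClassifier classifier;
    private final TaskService tasks;

    public TaskBot(IntentClassifier classifier, TaskService tasks) {
        this.classifier = classifier;
        this.tasks = tasks;
    }

    public String handle(String message) {
        String intent = classifier.classify(message);
        switch (intent) {
            case "COMPLETED_TASKS_REQUEST":
                return String.join("\n", tasks.completedTasks());
            case "LAST_COMPLETED_TASK":
                return tasks.lastCompletedTask();
            default:
                return "Sorry, I didn't understand that request.";
        }
    }
}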
Note that if you're not into Machine Learning or NLP, you can easily train an intent detector on platforms like wit.ai or api.ai, and the intent/entity classification part of this task is reduced to simple HTTP API requests. Though when building genuinely complicated or sophisticated bots it is almost always better to build your own models, since you have full control and can handle edge cases better. Platforms like wit.ai or api.ai generally need to perform well across multiple fields, while you can focus on making yours an expert in task management.
Hope this helps.
PS: To make your bot more interesting, we can add one more intent, like retrieving the status of a specific task given its id. E.g. a user can ask "what is the status of task 54". This intent can be called:
TASK_STATUS_REQUEST. In this example, the intent has an entity, which is the id of the requested task, so you will need to extract that :)
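For that intent, a simple regex is often enough to pull out the task id. A minimal sketch (the pattern and class name are purely illustrative, not from any framework):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TaskIdExtractor {
    // Matches phrases like "status of task 54"; the pattern is an illustrative assumption.
    private static final Pattern TASK_ID =
            Pattern.compile("task\\s+(\\d+)", Pattern.CASE_INSENSITIVE);

    public static Integer extractTaskId(String message) {
        Matcher m = TASK_ID.matcher(message);
        return m.find() ? Integer.parseInt(m.group(1)) : null;
    }
}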
This is an NLP task, and building a system like this requires a lot of R&D. You can start by building a set of questions that might be asked, analyzing the questions, and coming up with word patterns for each type of question. The next step would be to transform the English sentence into some form of formal structure (maybe SQL or lambda calculus). The backend DB should have the data stored in it, which can then be queried using the formal language.
The main problem lies in converting the English sentence to a formal language. You can start with regex and progressively make it more complex by checking the part of speech and syntactic structure of input sentences. Check out the NLTK package for doing NLP tasks.
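As a starting point for the regex-to-formal-language step, here is a rough Java sketch that maps matched patterns straight to SQL templates. The table and column names (tasks, status, completed_at) are assumptions for illustration only.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class QuestionToSql {
    // Each regex maps to a SQL template; the schema names are hypothetical.
    private static final Map<Pattern, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put(Pattern.compile("(show|list|give me).*completed tasks", Pattern.CASE_INSENSITIVE),
                  "SELECT * FROM tasks WHERE status = 'COMPLETED'");
        RULES.put(Pattern.compile("last completed task", Pattern.CASE_INSENSITIVE),
                  "SELECT * FROM tasks WHERE status = 'COMPLETED' ORDER BY completed_at DESC LIMIT 1");
        RULES.put(Pattern.compile("(show|list|give me).*pending tasks", Pattern.CASE_INSENSITIVE),
                  "SELECT * FROM tasks WHERE status = 'PENDING'");
    }

    public static String toSql(String question) {
        return RULES.entrySet().stream()
                .filter(e -> e.getKey().matcher(question).find())
                .map(Map.Entry::getValue)
                .findFirst()
                .orElse(null); // no match: fall back to a fuzzy/Lucene search or ask for clarification
    }
}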
On top of a chatbot library, you can integrate an instant messaging library like Hyphenate to enable the chatbot for mobile and web communication.
Here are a few simple steps:
Hyphenate Console: Create a chatbot entity by signing up for an account at the Hyphenate console (console.hyphenate.io); give your chatbot an identity and a voice by creating a Hyphenate IM account for the bot.
Platform SDK: Integrate your app (iOS, Android, or Web) with Hyphenate IM services and the open-source UI library.
Webhooks (event callback): Set up Hyphenate webhooks so that messages from the user are pushed to your developer backend, then process them with your chatbot AI library.
Backend REST API: Push the chatbot's messages to the user from your developer backend via the REST APIs provided by Hyphenate.
Hooray! Webhooks + backend REST API = relay messages between chatbot and user.
http://docs.hyphenate.io/docs/chat-bot-integration
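To sketch the webhook-in, REST-out relay with nothing but the JDK's built-in HTTP server: the payload handling below is a placeholder, and the actual Hyphenate webhook format and REST endpoint are in the docs linked above, not in this code.

import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class WebhookRelay {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/webhook", exchange -> {
            // 1. Read the incoming webhook body (the user's message).
            String body;
            try (InputStream in = exchange.getRequestBody()) {
                body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
            // 2. Hand it to your chatbot logic (placeholder).
            String reply = "You said: " + body;
            // 3. Normally you would POST `reply` back through the IM provider's
            //    REST API here; the endpoint and payload format are provider-specific.
            byte[] replyBytes = reply.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, replyBytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(replyBytes);
            }
        });
        server.start();
    }
}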
You can use Microsoft's NLP framework LUIS (Language Understanding Intelligent Service), which is pretty straightforward and easy to use for beginners. It's one of the Cognitive Services that Microsoft provides.
It's basically a combination of API calls.
Not sure which language you are familiar with, but in Java you can do it using the Apache OpenNLP library. This is a very good and easy-to-use library for natural language processing. As a very basic approach, you can detect sentences and tokenize them into words. Then you can lemmatize the words to reduce them to their basic forms. Then you can classify or categorize them using the document categorizer with proper training data: the better the training, the smarter the chatbot. You can also select categories to make the chatbot's conversation more engaging. Here is a very good article with a detailed example and demo.
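Roughly, the categorizer part could look like the sketch below (assuming a recent OpenNLP version, around 1.8+; intents.txt is a hypothetical training file you would have to create, with one "CATEGORY sample sentence" per line):

import opennlp.tools.doccat.DoccatFactory;
import opennlp.tools.doccat.DoccatModel;
import opennlp.tools.doccat.DocumentCategorizerME;
import opennlp.tools.doccat.DocumentSample;
import opennlp.tools.doccat.DocumentSampleStream;
import opennlp.tools.tokenize.SimpleTokenizer;
import opennlp.tools.util.InputStreamFactory;
import opennlp.tools.util.MarkableFileInputStreamFactory;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;
import java.io.File;
import java.nio.charset.StandardCharsets;

public class IntentTrainer {
    public static void main(String[] args) throws Exception {
        // intents.txt: e.g. "COMPLETED_TASKS_REQUEST show me all completed tasks"
        InputStreamFactory in = new MarkableFileInputStreamFactory(new File("intents.txt"));
        ObjectStream<String> lines = new PlainTextByLineStream(in, StandardCharsets.UTF_8);
        ObjectStream<DocumentSample> samples = new DocumentSampleStream(lines);

        DoccatModel model = DocumentCategorizerME.train(
                "en", samples, TrainingParameters.defaultParams(), new DoccatFactory());

        DocumentCategorizerME categorizer = new DocumentCategorizerME(model);
        String[] tokens = SimpleTokenizer.INSTANCE.tokenize("give me my pending tasks");
        double[] outcomes = categorizer.categorize(tokens);
        System.out.println("Predicted intent: " + categorizer.getBestCategory(outcomes));
    }
}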
Related
I'm currently working on a use case (developed in Java, Spring) where I have a large number of Twitter accounts (the number can go into the thousands) to which I can post data (tweets) as and when configured/scheduled.
I've implemented posting data to Twitter, but I'm confused about how to pull impressions, retweets, and likes of tweets from the various Twitter accounts.
One solution is to poll all accounts at a regular interval, but in that case I won't get the number of likes on tweets already made, because I'm using the user and mentions timeline APIs with the "since_id" parameter, which does not return the number of likes on my older tweets, as it always fetches the latest tweets and retweets.
Another option is to use the streaming APIs, in which case I would be opening a stream for every Twitter account I have, but that doesn't seem feasible to me because I have a very large number of Twitter accounts and I doubt that my Java app can handle that many streams.
Can someone please suggest how I can solve this? Any help is greatly appreciated.
It seems your problem is due to scale rather than design, given your statement "I doubt that my Java app can handle that many streams."
Let's look in a different direction.
It's time to move to the world of "Big Data".
Apache Kafka, Pig, Hive, YARN, Storm, HBase, Hadoop, etc. The list is overwhelming.
Apache Spark: large-scale data processing that supports concepts such as MapReduce, in-memory processing, stream processing, graph processing, etc.
Storm was created at Twitter; its open-source counterpart, you could say, is Apache Storm.
Apache Kafka offers brokers that collect streams, log them, and buffer them in a fault-tolerant manner.
Hadoop for storage of the data.
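If you went the Kafka route, feeding the engagement metrics into it could look roughly like the producer sketch below; the broker address, topic name, and payload format are assumptions for illustration, not prescriptions.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class TweetMetricsProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key = account id, value = a JSON-ish metrics snapshot (format is illustrative).
            producer.send(new ProducerRecord<>("tweet-metrics", "account-42",
                    "{\"tweetId\":123,\"likes\":10,\"retweets\":3}"));
        }
    }
}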
http://www.itworld.com/article/2827285/big-data/what-hadoop-can--and-can-t-do.html
Happy designing.
I have a domain that hosts user posts. I plan to create a user-posts-based app like 9gag. I need the app to be able to communicate with my domain and fetch the data hosted there.
Things I need the app to do:
1) Allow users to post pictures through the app.
2) Allow users to leave comments through the app.
3) Allow users to leave 'likes' through the app.
I want the data to be stored on my domain; when a user opens the app, it will fetch this data from the domain and display it to the user. How can I make my app communicate with the domain?
Thanks!
The best way to do this would be to implement an API on your domain that your app can send requests to. I cannot explain all this in detail here because it would require a lot of space and a full blown tutorial, but I can tell you what to research and what to implement to make this happen.
First off you need to create an API for your app to send requests to. I suggest a "RESTful" API, as they are pretty straightforward to the average programmer. Here is a good video that explains what an API is and a little bit of how they are typically implemented: https://youtu.be/7YcW25PHnAA
After you have an API set up, you have to "encode" the information so that it is easy to parse once your app has hold of it. To do this we use a "data-interchange format". One of the big ones being used today is JSON; see their website to learn more: http://www.json.org/ JSON is pretty straightforward and easy to understand if you have a grasp of programming concepts like objects, strings, arrays, etc.
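To illustrate the parsing side, here is a minimal sketch using the org.json library; the field names ("posts", "title", "likes") are made up for the example and would match whatever your API actually returns.

import org.json.JSONArray;
import org.json.JSONObject;

public class FeedParser {
    public static void main(String[] args) {
        // A response shaped like this is an assumption about your API's format.
        String response = "{\"posts\":[{\"id\":1,\"title\":\"First post\",\"likes\":5}]}";

        JSONArray posts = new JSONObject(response).getJSONArray("posts");
        for (int i = 0; i < posts.length(); i++) {
            JSONObject post = posts.getJSONObject(i);
            System.out.println(post.getInt("id") + ": " + post.getString("title")
                    + " (" + post.getInt("likes") + " likes)");
        }
    }
}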
OK, so you have gotten your information from the server, parsed the JSON you got, and displayed all your content... now, what do you do if your user gives a thumbs up or comments on something? This is also implemented via the API. This part should be easiest for you: it involves wrapping up the required data (content id, user id, what they did [i.e. liked the content]) and sending it via an HTTP request, just like how you got your information in the first place. But instead of reading the data in the response, now we are just sending the HTTP request from the app, and we don't care what happens next (at the app level); it's up to the server to record the data from the HTTP request.
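A bare-bones sketch of that "send a like" request using HttpURLConnection follows; the URL and parameter names are placeholders for whatever your API defines.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class LikeSender {
    public static void sendLike(long contentId, long userId) throws Exception {
        // Endpoint and parameter names are hypothetical.
        URL url = new URL("https://example.com/api/like");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

        String body = "content_id=" + contentId + "&user_id=" + userId;
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        // Fire and forget: we only check that the server accepted the request.
        System.out.println("Server responded with HTTP " + conn.getResponseCode());
    }
}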
I would highly suggest looking up how to create an API and looking through some tutorials... there are a lot of tutorials out there that want you to modify the .htaccess file on the server; this is not really necessary (boy, I hope I don't get crucified for saying that; fellow Stack Overflow citizens, if you disagree, please explain your reasoning). Obviously for a large mainstream website the whole .htaccess setup might be a good idea, but for a beginner, I don't think it is really needed.
I have some issues with my part of our final year project. We are implementing a plagiarism detection framework. I'm working on the internet source detection part. Currently my internet search algorithm is complete, but I need to enhance it so that the internet search delay is reduced.
My idea is like this:
First, the user is prompted to insert some web links as the initial knowledge feed for the system.
Then it crawls the internet and expands its knowledge.
Once the knowledge is fetched, the system doesn't need to query the internet again. Can someone provide me some guidance to implement this? We are using Java, but any abstract detail will surely help me.
If the server-side programming is in your hands, then you can keep a table with a boolean column in the database which shows whether the details were read before. Every time your client connects to the server, it checks the boolean first; if the boolean is false, it means there is a need to send updates to the client, otherwise no updates will be sent.
The boolean becomes true every time the client downloads any data from the server, and becomes false whenever the database is updated.
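A rough JDBC sketch of that check (the sync_state table and its columns are hypothetical; any schema with a per-client "dirty" flag would do):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class SyncCheck {
    /** Returns true if fresh data must be sent to this client. */
    public static boolean needsUpdate(Connection db, long clientId) throws Exception {
        try (PreparedStatement ps = db.prepareStatement(
                "SELECT up_to_date FROM sync_state WHERE client_id = ?")) {
            ps.setLong(1, clientId);
            try (ResultSet rs = ps.executeQuery()) {
                // If the flag is false (or the row is missing), send updates.
                return !rs.next() || !rs.getBoolean("up_to_date");
            }
        }
    }

    /** Call after the client has downloaded data, marking it as up to date. */
    public static void markUpToDate(Connection db, long clientId) throws Exception {
        try (PreparedStatement ps = db.prepareStatement(
                "UPDATE sync_state SET up_to_date = TRUE WHERE client_id = ?")) {
            ps.setLong(1, clientId);
            ps.executeUpdate();
        }
    }
}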
I'm not quite sure that I understand what you're asking. Anyway:
if you're looking for a Java web crawler, then I recommend that you read this question
if you're looking for Java libraries to build a knowledge base (KB), then it really depends on (1) what kind of properties your KB should have, and (2) what kind of reasoning capabilities you expect from your KB. One option is to use the Jena framework, but this requires that you're comfortable with Semantic Web formalisms.
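If you do go the Jena route, a minimal sketch of loading and dumping a small RDF knowledge base looks like the following; the data file name is a placeholder, and you would need a Turtle/RDF file of your own.

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.rdf.model.StmtIterator;

public class KnowledgeBaseDemo {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        // "sources.ttl" is a hypothetical Turtle file listing your seed links/facts.
        model.read("sources.ttl");

        StmtIterator it = model.listStatements();
        while (it.hasNext()) {
            Statement s = it.nextStatement();
            System.out.println(s.getSubject() + " -- " + s.getPredicate() + " --> " + s.getObject());
        }
    }
}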
Good luck!
I just got a project in which I need to implement lots of workflows, and I am considering using the jBPM engine to implement them. So I want to know: are there limits I need to think through before using the jBPM engine, or any alternatives?
Our workflow is something like the following:
user fills in the application form => assistant manager approval => dept director approval => director approval => boss approval. And we need to customize the task forms and integrate with other legacy systems.
Is there any workflow foundation in Java, like Windows Workflow Foundation on Windows?
Any recommendations are greatly appreciated!
I have to say, my experience with workflow in both Java and .NET, when it comes to the core libraries or API libraries, is that they were either under-featured or overcomplicated.
That said, I found that in most cases having a table with statuses did the trick. Let me explain.
Have a foreign key in the table which contains your application form referring to a status table.
Have the status table with an ID (PrimaryKey) Column and StatusName Column.
The statuses should include:
ID(1) Captured - the user fills in the application form.
ID(2) Pending Assistant Manager Approval - awaiting the assistant manager's approval.
etc. for all of the statuses...
Have a user table, or make use of Java ACLs, with group assignments mapping each of the users to groups.
For instance, the person or people who have access to Assistant Manager Approvals will be able to see the application forms which have a status of Captured.
You should get the picture by now.
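As a sketch of that pattern (the table and column names are invented for illustration), advancing an application to its next status can be as simple as one UPDATE:

import java.sql.Connection;
import java.sql.PreparedStatement;

public class ApplicationWorkflow {
    // Hypothetical status ids matching the status table described above.
    public static final int CAPTURED = 1;
    public static final int PENDING_ASSISTANT_MANAGER_APPROVAL = 2;
    public static final int PENDING_DEPT_DIRECTOR_APPROVAL = 3;

    /** Moves an application to the given status; the schema is illustrative. */
    public static void moveToStatus(Connection db, long applicationId, int statusId) throws Exception {
        try (PreparedStatement ps = db.prepareStatement(
                "UPDATE application_form SET status_id = ? WHERE id = ?")) {
            ps.setInt(1, statusId);
            ps.setLong(2, applicationId);
            ps.executeUpdate();
        }
    }
}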
On the other hand, I do find jBPM very useful, but I also think it has its place in much bigger workflow environments.
I am doing a keyboard for Android.
I want to have a plugin structure to allow users to improve the prediction engine.
The prediction engine is done with Android NDK and it was coded in C. I have a wrapper class that calls the C code. An instance of this wrapper class is a field in the InputMethodService.
The prediction engine gets updated by sending complete sentences. Something like:
public void updateEngine(String sentence);
Plugins should be calling that method.
An example of a plugin could be a .txt parser: you select a txt file and the plugin starts sending all of its sentences to the main app.
I would like plugins to be customizable, e.g. they might have a screen where you can choose the maximum number of sentences to send, whether to run in the background, etc.
The UI (I don't know if it should be in the main app or in the plugin, see my questions below) should be able to ask the plugin how many sentences it can send (to draw a progress bar).
My questions are:
Should I use Intents or IPC?
I think I should use Intents since I just use primitive types.
Should my operations be atomic or send an array of sentences?
I am willing to start with atomic operations but I am worried about performance.
Must the plugins be Activities or Services?
They should be Activities and, if necessary ("process in background" on), launch a Service. Or perhaps they are just Services and the main app takes care of the UI.
Who should save information about the last execution, the plugin or the main app?
e.g. When was the last time the plugin was used.
Should I use Intents or IPC?
Either works. Intents may be simpler.
Should my operations be atomic or send an array of sentences?
I'd bundle these up into a single operation if possible. Lots of little cross-process trips are more expensive than one large one, as I understand it.
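As a rough sketch of the Intent-based, batched approach (the action string and extra key are invented names, and the keyboard app would need a matching BroadcastReceiver):

import android.content.Context;
import android.content.Intent;

public class EngineUpdateSender {
    // Hypothetical action/extra names agreed between the keyboard app and its plugins.
    public static final String ACTION_UPDATE_ENGINE = "com.example.keyboard.ACTION_UPDATE_ENGINE";
    public static final String EXTRA_SENTENCES = "sentences";

    /** Sends a whole batch of sentences in one broadcast instead of one per sentence. */
    public static void sendSentences(Context context, String[] sentences) {
        Intent intent = new Intent(ACTION_UPDATE_ENGINE);
        intent.setPackage("com.example.keyboard"); // restrict delivery to the keyboard app
        intent.putExtra(EXTRA_SENTENCES, sentences);
        context.sendBroadcast(intent);
    }
}

On the receiving side, the keyboard app's receiver would pull the array out with getStringArrayExtra and loop over updateEngine(...) for each sentence.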
Must the plugins be Activities or Services?
You appear to want the plugin to call into the engine, which is fine, but...when? Plugins won't get control automatically at install time. The choice of trigger mechanism for the plugin calling into the engine will dictate whether the plugin needs an activity or not.
Who should save information about the last execution, the plugin or the main app?
This doesn't make a lot of sense to me given the rest of what you have here, so I can't comment. This may roll back to the lack-of-trigger issue I mention above. For example, under what circumstances would a plugin ever be used more than once?