social media slang identifier [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am working on a project to identify social media slang. I have to identify abbreviations in different comments. The problem is that the same abbreviation can have different meanings: in one comment GM means Good Morning, while in another comment GM means General Manager.
So I need to differentiate between these two, although the text is the same in both cases (i.e. GM).
I am really confused by this problem and have no idea how to approach it.
Can anyone help me solve this?

This is a hard problem. You need some semantic algorithm to make this distinction.
You cannot infer the meaning just from the syntax or just from the textual representation.
Search for "disambiguation natural language processing" (the usual term is word-sense disambiguation). You will find lots of resources.
This is just a hint. As said, the problem is broad and complex.

This sounds like a very complex issue.
As I understand it, you would need a fairly large dictionary of these abbreviations, along with the lexical field (a.k.a. semantic field) in which each one is used.
To detect the lexical field, you could also group the speakers into "work related", "colleagues from university" or "drinking buddies", and maybe define a standard for these groups so that data from other users can be used as well. A related concept that may help here is argot, which is a sort of synonym of slang.
So for instance, if someone says "the GM's feedback was actually pretty good", you can tell not only that GM is used as an ordinary noun, but also that "feedback" belongs to the "business" lexical field.
An actual time frame, and data you'd work with would be useful, and I will edit this answer accordingly.
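The dictionary-plus-lexical-field idea above can be sketched roughly as follows. This is only a minimal illustration, not a real disambiguation system: the class name SlangDisambiguator, the method expandGm and the cue-word lists are all hypothetical, and a usable version would need a much larger dictionary and probably statistical weighting.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Picks an expansion for "GM" by counting how many context words match each sense. */
final class SlangDisambiguator {

    // Hypothetical, hand-made sense inventory: expansion -> cue words from its lexical field.
    private static final Map<String, Set<String>> GM_SENSES = new LinkedHashMap<>();
    static {
        GM_SENSES.put("Good Morning",
                new HashSet<>(Arrays.asList("coffee", "everyone", "sunshine", "today", "wake")));
        GM_SENSES.put("General Manager",
                new HashSet<>(Arrays.asList("feedback", "meeting", "office", "report", "team")));
    }

    /** Returns the expansion with the most matching cue words, or null if nothing matches. */
    static String expandGm(String comment) {
        // Lower-case and split on anything that is not a letter to get rough tokens.
        String[] tokens = comment.toLowerCase(Locale.ROOT).split("[^a-z]+");
        String best = null;
        int bestScore = 0;
        for (Map.Entry<String, Set<String>> sense : GM_SENSES.entrySet()) {
            int score = 0;
            for (String token : tokens) {
                if (sense.getValue().contains(token)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = sense.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(expandGm("The GM's feedback was actually pretty good"));
        System.out.println(expandGm("GM everyone, time for coffee"));
    }
}
```

With these cue words, the first comment matches only "feedback" and the second matches "everyone" and "coffee", so the two occurrences of GM resolve differently. A comment with no cue words yields null, which is where speaker groups or other signals would have to take over.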


How to start learning Big Data? What are the modules I need to concentrate on as a developer [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm planning to learn Big Data. I have gone through some tutorials, but I'm a little confused about which modules I need to concentrate on from a developer's perspective. At present I'm working in Java. I hope your responses will help with the next step of my Big Data journey.
First, I'd suggest getting familiar with the term. Big Data is a somewhat fluffy and debated one, more a marketing catchphrase than a technical specification, and it covers a huge range of technology.
Starting from that, I'd try to determine which aspect (IoT, building/running datacenters, ETL/data integration/warehousing, analytics/statistics/machine learning...) or perhaps which field of application (retail, bioinformatics...) you're interested in, and which is reasonably accessible from an employment point of view. I'd also think about the tech stack you'd like to work on (Scala, Python...).
Reverse engineering job offers could be a way to get to that information actually.
The Data Scientist profile (ETL + machine learning + visualization) has gained broad acceptance and encompasses certain skill sets. Big Data Analyst and Big Data Engineer can also be found, arguably with less well-defined profiles.
Nowadays one can get a whole MSc in data science (here's a personal evaluation of it), but perhaps you can get your foot in the door by a less fancy route too. Trainings come in varying quality. I found Andrew Ng's machine learning and deep learning (big neural networks) MOOCs stunning, and everything coming from the EPFL Scala side (if you want to go down that road) is technically superior and OK in terms of presentation (I tried Big Data Analysis with Scala and Spark).

Create too many classes or have some schema-less data structure (like a dictionary)? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have to use 50 different custom data types (classes) that are defined in a document (XML/JSON). They have only fields, no methods, and possibly strong validations.
My question is: should I go ahead and create (or generate) 50 classes, or use some generic data structure (like HashMap<String,Object>)?
Update: My fear is that if I go with class generation, my codebase might grow considerably,
and if I go the schema-less way, my data integrity might be compromised. So which one is the lesser evil?
Unless it is just ridiculous, more code is more forgivable, in general. There are a few different reasons:
If you give them base classes at the right points, you can have it both ways, as your handling code can hold the base classes, and may have anchor points for extracting, validating or cleaning information stored in the different formats. Surely some of the processing can be shared.
If absolutely everything really falls to the base class, you can refactor the sub-classes out of existence without pain. On the other hand, if you start the amorphous way, gathering the special cases back into separate classes is more likely to go wrong.
Excessively large code is only bad if the extra volume does not clarify the logic for readers. I would have the classes, if they constitute units in which people think.
Also, actual functionality is more important than format or even readability. So if the risk is to data integrity vs code bloat, protect the content, not the form.
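To make the trade-off concrete, here is a minimal sketch of the "base class at the right point" idea next to the schema-less alternative. All names (BaseRecord, GeoPoint, TypedVsSchemaless, the latitude/longitude fields) are hypothetical and not taken from the question; the real 50 types would come from the XML/JSON document.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical base type: every generated data class knows how to validate itself. */
abstract class BaseRecord {
    abstract void validate();
}

/** One hypothetical generated class: fields only, plus its validation rule. */
final class GeoPoint extends BaseRecord {
    final double latitude;
    final double longitude;

    GeoPoint(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }

    @Override
    void validate() {
        if (latitude < -90 || latitude > 90 || longitude < -180 || longitude > 180) {
            throw new IllegalArgumentException("coordinates out of range");
        }
    }
}

final class TypedVsSchemaless {
    /** Shared handling code only needs to hold the base type. */
    static boolean isValid(BaseRecord record) {
        try {
            record.validate();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Schema-less: a misspelled key still compiles and is only noticed when read back.
        Map<String, Object> loose = new HashMap<>();
        loose.put("lattitude", 48.85);
        System.out.println(loose.get("latitude")); // prints null, the typo went unnoticed

        // Typed: field names are checked by the compiler, values by validate().
        System.out.println(isValid(new GeoPoint(48.85, 2.35)));
        System.out.println(isValid(new GeoPoint(480.85, 2.35)));
    }
}
```

The map version is shorter, but nothing stops a wrong key or a wrong value type from getting in. The class version costs more lines, yet each rule lives next to the data it protects.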

What would be your interpretation of this requested queue implementation? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I have been reading two books on Java, and while covering data structures I started doing some online research on queue implementations. I have an extensive background in Flex; however, ActionScript isn't comparable to more advanced languages.
Let's say I was at a job interview and was asked to implement a queue of Object. How should I approach it? I am not looking for code help here; I would like to know what your quick answer would be. I have been to the Java online docs and understand that there are 13 known implementing classes, and LinkedList is one of them.
A Google search returns more results with LinkedList implementation code than with any other.
My apologies if you find this question to be rubbish or pointless in any way.
Oracle's Java online doc ref:
Do you know what the concept of a queue is and how it differs from a stack (closely related data structure)? If so, you should be able to think of multiple ways to implement it.
Which is best depends on the exact requirements of the task it's being used to address.
So the right response to that interview question is not to start coding but to ask them for more information about the requirements your implementation has to address. Performance? Memory size? Multitasking? Any limits on maximum queue depth, e.g. to guard against things like a DoS attack? What's being enqueued: objects, primitives, other? Specific kinds thereof? Parameterized type? Are there any values which should be discarded (maybe null shouldn't be enqueued)?
Knowing the requirements, you should be able to judge which answer is appropriate. Starting coding without asking the requirements is immediately going to earn you a demerit.
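As an illustration only, suppose the interviewer answered those questions with: a parameterized element type, no null elements, a fixed maximum depth, and single-threaded use. Under those assumed requirements, one possible answer is a singly linked list with head and tail pointers. The class name BoundedQueue and its methods are made up for this sketch, and it is not thread-safe.

```java
import java.util.NoSuchElementException;

/** A minimal FIFO queue on a singly linked list, with a maximum depth. Not thread-safe. */
final class BoundedQueue<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;

        Node(T value) {
            this.value = value;
        }
    }

    private final int maxDepth;
    private Node<T> head; // next element to dequeue
    private Node<T> tail; // most recently enqueued element
    private int size;

    BoundedQueue(int maxDepth) {
        if (maxDepth <= 0) {
            throw new IllegalArgumentException("maxDepth must be positive");
        }
        this.maxDepth = maxDepth;
    }

    /** Adds to the tail; rejects null and returns false instead of growing past maxDepth. */
    boolean enqueue(T value) {
        if (value == null) {
            throw new NullPointerException("null elements are not allowed");
        }
        if (size == maxDepth) {
            return false;
        }
        Node<T> node = new Node<>(value);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
        return true;
    }

    /** Removes from the head, so elements leave in the order they arrived. */
    T dequeue() {
        if (head == null) {
            throw new NoSuchElementException("queue is empty");
        }
        T value = head.value;
        head = head.next;
        if (head == null) {
            tail = null;
        }
        size--;
        return value;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        BoundedQueue<String> queue = new BoundedQueue<>(2);
        queue.enqueue("first");
        queue.enqueue("second");
        System.out.println(queue.enqueue("third")); // false, the queue is full
        System.out.println(queue.dequeue());        // first
    }
}
```

Different answers to the requirements questions would change the design: concurrent producers and consumers would point towards something like the java.util.concurrent queues, and a strict memory budget towards an array-backed ring buffer.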

Should names be in good English? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
(This might be the wrong place to ask the question, please let me know).
Should I name my method isStaticallyImported or isStaticlyImported?
(They'd be pronounced pretty much the same way, I believe)
Of course they should be in good English. Even if the human brain will likely have no problems reading garbled words, compilers do not enjoy the same luxury.
How many times have you miswritten a variable name, then later on used the correct spelling, only to find out that the program crashed at run/compile time?
This problem is only amplified when working on code that was not written by you, because we think of things as, well, things, and having to specially remember that the thing had to be spelled in a special way is just an unneeded break to your workflow.
Yes, your variables should be clear to the developer. You can name it whatever you want and it will work because the compiler doesn't care. When you name the variable in a human readable manner then developers after you will be able to read and understand your code much easier. You should name it "isStaticallyImported".
They should be in the most easily understandable language for those using and maintaining it in my opinion.
I'm also pretty sure the compiler doesn't care about the quality of spelling.

My own code vs library [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
This is a kind of unusual question for developers, but for some reason I want to post it here, and I hope to get an adequate answer.
Here is a simple example:
I wrote a Java function that calculates the distance between two geo points. The function is no more than 50 lines of code. I decided to download source code from IBM that does the same thing, but when I opened it I saw that it looks very complicated and is almost a thousand lines of code.
What kind of people write such source code? Are they just very good programmers? Should I use their source code or my own?
I have noticed this kind of thing many times, and from time to time I start to wonder whether it is just me who does not know exactly how to program, or maybe I am wrong?
Do you have the same kind of feeling when you browse through other people's source code?
The code you found, does it do the exact same calculation? Perhaps it takes into account some edge cases you didn't think of, or uses an algorithm that has better numerical stability, lower asymptotic complexity, or is written to take advantage of branch prediction or CPU caches. Or it could be just over-engineered.
Remember the saying: "For every complex problem there is a solution that is simple, elegant, and wrong." If you are dealing with numerical software, even the most basic problems like adding a bunch of numbers can turn out to be surprisingly complex.
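To illustrate the point about adding numbers, here is a small sketch comparing a plain loop with Kahan (compensated) summation, one well-known technique for reducing floating-point rounding error. It is only an example of why "longer" library code can be more accurate; it says nothing about what the IBM code actually does. The class and method names are made up.

```java
/** Compares naive summation with Kahan (compensated) summation on doubles. */
final class CompensatedSum {

    /** The obvious loop: rounding errors from each addition accumulate. */
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    /** Kahan summation: carries the low-order bits lost by each addition into the next one. */
    static double kahanSum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0; // running estimate of the lost low-order bits
        for (double v : values) {
            double corrected = v - compensation;
            double next = sum + corrected;               // low-order bits of corrected may be lost here
            compensation = (next - sum) - corrected;     // recover what was lost
            sum = next;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] values = new double[1_000_000];
        java.util.Arrays.fill(values, 0.1);
        // Mathematically the total is 100000; compare how far each method drifts.
        System.out.println("naive: " + naiveSum(values));
        System.out.println("kahan: " + kahanSum(values));
    }
}
```

The compensated version is a few lines longer and harder to read at first glance, yet it stays much closer to the exact total. Multiply that kind of care across edge cases and you can easily get from 50 lines to many more.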
