How would I be able to implement KNN on a dataset that has words in its fields as well as numbers? I understand that asking questions without source code and an in-depth explanation of that code is looked down upon, but I just want a couple of pointers on how one might go about this so I can be pointed in the right direction and try to learn.
Pseudocode or even just bullet points explaining the steps required to achieve this would be incredibly ideal but all information will be welcome and greatly appreciated.
Disclaimer: Not Homework
I've written very basic calculator parsers in Java using tokenizers, but I have recently started writing a program to help me understand chemistry. As I began writing more and more formulas, it became increasingly apparent that the code required to solve each equation is almost more tedious than mastering the equations themselves. Take the equation PV = nRT: how could I write a parser that would allow me to input all known variables and solve for the remaining one if it is solvable? I can do the logic behind solvability, but here are a few requirements:
must be able to solve for any unknown variable.
parsing should be capable for formulas of any size (ex: I want to implement more than one formula, such as π = MRT and formulas of increasing complexity, and only want to have to define them once.)
Once again, this is purely for my enjoyment and to be used as a learning tool. Any help would be appreciated, as searching Google and StackOverflow for this problem has given me either vague or inapplicable answers.
As you have posted no code I will speak my thoughts.
I used to have an hp48G calculator which had a library with many formulas covering different areas.
Instead of parsing a single line, you had to choose through menus which was the formula to apply, then the calculator would ask for each parameter separately, and apply it to the formula to provide the result.
If you follow this approach, with the help of Java's interfaces, I think you can do something like what you are asking.
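A minimal sketch of that menu-style approach, assuming each formula is defined once behind a small interface and the solver works out which single variable is missing. The names `Formula`, `IdealGasLaw`, `FormulaSketch` and `solve` are hypothetical, and the rearrangements of PV = nRT are written out by hand rather than parsed:

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class FormulaSketch {

    /** Hypothetical interface: one implementation per formula. */
    interface Formula {
        Set<String> variables();
        double solveFor(String unknown, Map<String, Double> known);
    }

    /** PV = nRT with R in L*atm/(mol*K); each rearrangement is hand-written. */
    static final class IdealGasLaw implements Formula {
        static final double R = 0.082057;

        @Override
        public Set<String> variables() {
            return Set.of("P", "V", "n", "T");
        }

        @Override
        public double solveFor(String unknown, Map<String, Double> known) {
            switch (unknown) {
                case "P": return known.get("n") * R * known.get("T") / known.get("V");
                case "V": return known.get("n") * R * known.get("T") / known.get("P");
                case "n": return known.get("P") * known.get("V") / (R * known.get("T"));
                case "T": return known.get("P") * known.get("V") / (R * known.get("n"));
                default: throw new IllegalArgumentException("Unknown variable: " + unknown);
            }
        }
    }

    /** Solvable only when exactly one of the formula's variables is missing. */
    static double solve(Formula formula, Map<String, Double> known) {
        Set<String> missing = new LinkedHashSet<>(formula.variables());
        missing.removeAll(known.keySet());
        if (missing.size() != 1) {
            throw new IllegalArgumentException("Need exactly one unknown, got: " + missing);
        }
        return formula.solveFor(missing.iterator().next(), known);
    }

    public static void main(String[] args) {
        // One mole at 273.15 K in 22.4 L: pressure should come out close to 1 atm.
        double p = solve(new IdealGasLaw(), Map.of("V", 22.4, "n", 1.0, "T", 273.15));
        System.out.println("P = " + p + " atm");
    }
}
```

This does not parse formula strings; it only shows how the "pick a formula, supply the known values" idea could be structured so that a second formula such as π = MRT is just another implementation of the same interface.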
First of all, hello everyone, this is my first post here. I am asking for your help: I'm designing a network topology for access points, and to make it more creative I want to do some coding in Java. I want to model it using a graph data structure, but my problem is the following: I want to place access points at the most frequented locations. What is the best algorithm for finding the best places to put the APs? Also, could I use two graphs, one for people and the other for APs? I would really appreciate your help because I'm pretty lost here.
I am trying to implement a paper and I am having trouble representing the linear equations mentioned in it. I am using lp_solve (a linear programming solver) to solve the equations, but I am not able to represent some of them in Java so that lp_solve can solve them. If anyone has expertise in this, please help.
The paper I am trying to implement is http://www.cs.cmu.edu/~dshahaf/kdd2010-shahaf-guestrin.pdf and the equations are in section 2.2.1.
From what I can tell, you seem to have trouble figuring out or implementing some functions that would represent how certain mathematical functions work. It doesn't sound like you've run into an error, so I'll write down a few tips I can think of.
First off, check if the functions you are looking for already exist in the basic library by taking a look in the documentation. Maybe it doesn't state it exactly like you want, but perhaps some of the functionality is there.
http://lpsolve.sourceforge.net/5.5/Java/docs/api/
If you can't find everything you want, then you've got two options. One is to program the functions you desire yourself, and the other is to use another fleshed out Java library such as Colt which has many features.
http://dst.lbl.gov/ACSSoftware/colt/
I wanted to try some NLP things on a neural network, but for the input I need vectors of words; "one-of-k" encoding can't be used because of the big vocabulary. So I tried "multidimensional scaling", which for reasons unknown to me doesn't work. "Programming Collective Intelligence" was the book that I followed for this.
This isn't actually the problem I wanted to work on. So if there is a library available that does this work, I could overcome this obstacle and experiment on my actual problem.
You can try OpenNLP and the APIs it provides. Although, I must admit I have not clearly understood the question.
The problem: I have 10 cards with values 1 to 10. I have to arrange the cards in such a way that the sum of 5 cards is 36 and the product of the remaining 5 cards is 360.
I have successfully written a GA to solve this card problem in Java. Now I am thinking of solving the same problem with a neural network. Is it possible to solve this with an NN? What approach should I take?
This problem is hard to solve directly with a Neural Network. Neural Networks are not going to have a concept of sum or product, so they won't be able to tell the difference between a valid and invalid solution directly.
If you created enough examples and labelled them, then the neural network might be able to learn to tell the "good" and "bad" arrangements apart just by memorising them all. But it would be a very inefficient and inaccurate way of doing this, and it would be somewhat pointless: you'd have to have a separate program that knew how to solve the problem in order to create the data to train the neural network.
P.S. I think you are a bit lucky that you managed to get the GA to work as well - I suspect it only worked because the problem is small enough for the GA to try most of the possible solutions in the vicinity of the answer(s) and hence it stumbles upon a correct answer by chance before too long.
To follow up on @mikera's comments on why Neural Networks (NNs) might not be best for this task, it is useful to consider how NNs are usually used.
A NN is usually used in a supervised learning task. That is, the implementer provides many examples of input and the correct output that goes with that input. The NN then finds a general function which captures the provided input/output pairs and hopefully captures many other previously unseen input/output pairs as well.
In your problem you are solving one particular optimization, so there isn't much training to be done. There is just one right answer (or a few). So, NNs aren't really designed for such problems.
Note that lacking a built-in concept of sum or product doesn't necessarily hurt an NN. You just have to create your own input layer which has sum and product features so that the NN can learn directly from these features. But in this problem it won't help very much.
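As a rough illustration of what such hand-made input features could look like, here is a small sketch. It assumes an arrangement is an array of the 10 card values in which the first five cards form the "sum" group and the last five form the "product" group; the class and method names are hypothetical:

```java
class CardFeatures {

    /**
     * Turns an arrangement of 10 cards into two input features:
     * [sum of the first five cards, product of the last five cards].
     * Assumes the first half is the "sum" group and the second half the "product" group.
     */
    static double[] features(int[] arrangement) {
        if (arrangement.length != 10) {
            throw new IllegalArgumentException("Expected exactly 10 cards");
        }
        double sum = 0;
        double product = 1;
        for (int i = 0; i < 5; i++) {
            sum += arrangement[i];
        }
        for (int i = 5; i < 10; i++) {
            product *= arrangement[i];
        }
        return new double[] { sum, product };
    }

    public static void main(String[] args) {
        double[] f = features(new int[] { 2, 7, 8, 9, 10, 1, 3, 4, 5, 6 });
        System.out.println(f[0] + ", " + f[1]); // prints "36.0, 360.0"
    }
}
```

With these two numbers as inputs the network no longer has to discover addition or multiplication, but checking `sum == 36 && product == 360` directly is then simpler than training anything, which is why the features do not help much here.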
Note also that your problem is so small that even a naive enumeration of all permutations of the numbers (10! = 3,628,800) should be achievable in a few seconds at most.
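A minimal sketch of such an enumeration, under one simplifying assumption: since the order of cards within each group doesn't matter, it is enough to try every way of choosing the 5 "sum" cards (252 subsets) rather than all 10! orderings. The class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

class CardSplit {

    /** Returns every 5-card "sum" group whose sum is 36 while the other 5 cards multiply to 360. */
    static List<List<Integer>> findSplits() {
        List<List<Integer>> result = new ArrayList<>();
        // Each bit of mask says whether card (bit index + 1) belongs to the sum group.
        for (int mask = 0; mask < (1 << 10); mask++) {
            if (Integer.bitCount(mask) != 5) {
                continue; // only consider splits into 5 and 5 cards
            }
            int sum = 0;
            long product = 1;
            List<Integer> sumGroup = new ArrayList<>();
            for (int card = 1; card <= 10; card++) {
                if ((mask & (1 << (card - 1))) != 0) {
                    sum += card;
                    sumGroup.add(card);
                } else {
                    product *= card;
                }
            }
            if (sum == 36 && product == 360) {
                result.add(sumGroup);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findSplits()); // prints [[2, 7, 8, 9, 10]]
    }
}
```

The only split it finds puts 2, 7, 8, 9 and 10 in the sum group, leaving 1, 3, 4, 5 and 6 (product 360) in the other.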