I am making a program that calculates the equation for the tangent of a graph at a given point and ideally I'd want it to work for any type of graph. e.g. 1/x ,x^2 ,ln(x), e^x, sin, tan. I know how to work out the tangent and everything but I just don't really know how to get the input from the user.
Would I have to have options where they choose the type of graph and then fill in the coefficients for it e.g. "Choice 1: 1/(Ax^B) Enter the values of A and B"? Or is there a way so that the program recognises what the user types in so that instead of entering a choice and then the values of A and B, the user can type "1/3x^2" and the program would recognise that the A and B are 3 and 2 and that the graph is a 1/x graph.
This website is kind of an example of what I would like to be able to do: https://www.symbolab.com/solver/tangent-line-calculator
Thanks for any help :)
Looks like you want to evaluate the expression. In that case, you could look into Dijkstra's Shunting-Yard algorithm to convert the expression to postfix (reverse Polish) notation, and then evaluate the expression using a stack. Alternatively, you can use a library such as exp4j. There are multiple tutorials for it, but remember that you need to handle both binary and unary operations (binary meaning the operator takes two operands, as in a + b, while unary takes one, like sin(x)).
Then, after you evaluate the expression, you can use first principles to solve. I have an example of this system working without exp4j on my github repository. If you go back in the commit history, you can see the implementation with exp4j as well.
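To make the Shunting-Yard suggestion concrete, here is a minimal sketch, not the code from the repository mentioned above. It assumes the input is already split into space-separated tokens, is well-formed (balanced parentheses), and uses only the binary operators + - * / ^ and the variable x; the class and method names are made up for illustration. Unary functions such as sin would need extra handling.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

final class ShuntingYardSketch {
    // Precedence of each supported binary operator; '^' binds tightest.
    private static final Map<String, Integer> PRECEDENCE =
            Map.of("+", 1, "-", 1, "*", 2, "/", 2, "^", 3);

    // Converts space-separated infix tokens to postfix (reverse Polish) order.
    static List<String> toPostfix(String infix) {
        List<String> output = new ArrayList<>();
        Deque<String> operators = new ArrayDeque<>();
        for (String token : infix.trim().split("\\s+")) {
            if (PRECEDENCE.containsKey(token)) {
                while (!operators.isEmpty() && PRECEDENCE.containsKey(operators.peek())) {
                    int top = PRECEDENCE.get(operators.peek());
                    int current = PRECEDENCE.get(token);
                    // '^' is right-associative; the other operators are left-associative.
                    boolean popTop = token.equals("^") ? top > current : top >= current;
                    if (!popTop) {
                        break;
                    }
                    output.add(operators.pop());
                }
                operators.push(token);
            } else if (token.equals("(")) {
                operators.push(token);
            } else if (token.equals(")")) {
                // Assumes balanced parentheses: a matching "(" is on the stack.
                while (!operators.peek().equals("(")) {
                    output.add(operators.pop());
                }
                operators.pop(); // discard the "("
            } else {
                output.add(token); // a number or the variable x
            }
        }
        while (!operators.isEmpty()) {
            output.add(operators.pop());
        }
        return output;
    }

    // Evaluates a postfix token list for a given value of x using a stack.
    static double evaluate(List<String> postfix, double x) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : postfix) {
            if (PRECEDENCE.containsKey(token)) {
                double right = stack.pop();
                double left = stack.pop();
                switch (token) {
                    case "+": stack.push(left + right); break;
                    case "-": stack.push(left - right); break;
                    case "*": stack.push(left * right); break;
                    case "/": stack.push(left / right); break;
                    default:  stack.push(Math.pow(left, right)); break; // "^"
                }
            } else if (token.equals("x")) {
                stack.push(x);
            } else {
                stack.push(Double.parseDouble(token));
            }
        }
        return stack.pop();
    }
}
```

For example, evaluate(toPostfix("1 / ( 3 * x ^ 2 )"), 2.0) computes 1/(3*2^2). Note that the implied multiplication in "1/3x^2" has to be made explicit by a tokenizer before this step.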
Parsing a formula from user input is itself a problem much harder than calculating the tangent. If this is an assignment, see if the wording allows for choosing the function and its parameters from a menu, as you're suggesting, because otherwise you are going to spend 10% of your time writing code for calculating the derivative and 90% on reading the function from the standard input.
If it's your own idea and you'd like to try your hand at it, a teaser is that you will likely need to design a whole class structure for different operators, constants, and the unknown. Keep a stack of mathematical operations, because in 1+2*(x+1)+3 the multiplication needs to happen before the outer additions, but after the inner one. You'll have to deal with reading non-uniform input that has a high level of freedom (in whitespace, omission of * sign, implicit zero before a –, etc.) Regular expressions may be of help, but be prepared for a debugging nightmare and a ton of special cases anyway.
If you're fine with restricting your users (yourself?) to valid expressions following JavaScript syntax (which your examples are not, due to the implied multiplication and the haphazard rules of precedence thereof to the 1/...) and you can trust them absolutely in having no malicious intentions, see this question. You wouldn't have your expression represented as a formula internally, but you would still be able to evaluate it in different points x. Then you can approximate the derivative by (f(x+ε) - f(x)) / ε with some sufficiently small ε (but not too small either, using trial and error for convergence). Watch out for points where the function has a jump, but in basic principle this works, too.
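The approximation (f(x+ε) - f(x)) / ε translates to Java roughly as follows. This is a sketch only: the function is passed in as a DoubleUnaryOperator standing in for whatever evaluator you end up with, and the class and method names are invented for the example.

```java
import java.util.function.DoubleUnaryOperator;

final class TangentSketch {
    // Forward-difference approximation of f'(x0): (f(x0 + eps) - f(x0)) / eps.
    static double slopeAt(DoubleUnaryOperator f, double x0, double eps) {
        return (f.applyAsDouble(x0 + eps) - f.applyAsDouble(x0)) / eps;
    }

    // Returns {m, c} such that the tangent line at x0 is y = m * x + c.
    static double[] tangentAt(DoubleUnaryOperator f, double x0, double eps) {
        double m = slopeAt(f, x0, eps);
        double c = f.applyAsDouble(x0) - m * x0; // the line passes through (x0, f(x0))
        return new double[] {m, c};
    }

    public static void main(String[] args) {
        // f(x) = x^2 at x0 = 3: the exact tangent is y = 6x - 9.
        double[] line = tangentAt(x -> x * x, 3.0, 1e-6);
        System.out.println("y = " + line[0] + " * x + " + line[1]);
    }
}
```

The value 1e-6 for ε is just a starting point; as said above, the right size depends on the function and needs some experimentation.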
I would like to create two models of binary prediction: one with the cut point strictly greater than 0.5 (in order to obtain fewer signals but better ones) and second with the cut point strictly less than 0.5.
Doing the cross-validation, we have a test error related to the cut point equal to 0.5. How can I do it with other cut value? I talk about XGBoost for Java.
xgboost returns a list of scores. You can do whatever you want with that list of scores.
I think that in Java in particular, predict returns a 2D float array (float[][]) with one row per input instance.
In binary prediction you probably used a logistic objective, so your scores will be between 0 and 1.
Take your scores object and create a custom function that will calculate new predictions, by the rules you've described.
If you are using an automated, xgboost-implemented cross-validation function, you might want to build a customized evaluation function that applies your cut point, and pass it as an argument to xgb.cv.
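A custom thresholding function and the matching error metric could look roughly like this. It is plain Java and deliberately does not call any XGBoost4J API; it assumes you have already flattened the predicted scores into a float[] and have the true 0/1 labels in an int[]. The names are invented for the example.

```java
final class ThresholdSketch {
    // Labels an instance 1 only when its score is strictly greater than the cut point.
    static int[] classify(float[] scores, float cutPoint) {
        int[] labels = new int[scores.length];
        for (int i = 0; i < scores.length; i++) {
            labels[i] = scores[i] > cutPoint ? 1 : 0;
        }
        return labels;
    }

    // Fraction of instances whose thresholded prediction differs from the true label.
    static double errorRate(float[] scores, int[] trueLabels, float cutPoint) {
        int[] predicted = classify(scores, cutPoint);
        int wrong = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] != trueLabels[i]) {
                wrong++;
            }
        }
        return (double) wrong / predicted.length;
    }
}
```

Calling errorRate on the held-out fold of each cross-validation round, once with a cut point above 0.5 and once with one below, gives the test error for each of the two models.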
If you want to be smart when setting your threshold, I suggest reading about the ROC curve and its AUC, and about the precision-recall curve.
I am creating a graphing calculator and I need an algorithm to interpret equations that users input. For example, if the user types in "x^3+5x^2-4x-9", the algorithm should take the string input and return (0, -9), (1, -7) and so on. How should I go about doing this? Are there any open source libraries I can use? Thanks in advance.
You could implement a Shunting-Yard Algorithm, but there are plenty of mathematical parsers already out there. There is a very famous saying in software development:
"Don't reinvent the wheel."
I encountered this problem when developing my own app. If there are already open source libraries out there, you should definitely use them to your advantage. That being said, I would recommend the MathEval library if double precision is enough for you. It usually is, because going beyond double with something like BigDecimal, which represents decimal numbers exactly, is extremely expensive in terms of speed and memory, two things you want to keep low in computation.
The easiest method, O(n) in the number of sample points, is to loop over a range of x values: in each iteration, set x through MathEval's setVariable method and retrieve the result through MathEval's evaluate method. The expression may be undefined at some points (division by zero, for example), so work out those restrictions and check for them inside your loop.
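The loop itself might be sketched like this. To avoid guessing at MathEval's exact signatures, the parsed expression is represented here by a DoubleUnaryOperator; in real code the lambda would be replaced by the set-variable and evaluate calls of whichever library you pick. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleUnaryOperator;

final class PlotPointsSketch {
    // Samples f at from, from + step, ... up to and including `to`,
    // skipping points where the result is infinite or NaN.
    static List<double[]> sample(DoubleUnaryOperator f, double from, double to, double step) {
        List<double[]> points = new ArrayList<>();
        // Small tolerance so that rounding does not drop the last point.
        int steps = (int) Math.floor((to - from) / step + 1e-9);
        for (int i = 0; i <= steps; i++) {
            double x = from + i * step;
            double y = f.applyAsDouble(x);
            if (Double.isFinite(y)) {
                points.add(new double[] {x, y});
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // Stand-in for the parsed expression x^3+5x^2-4x-9 from the question.
        DoubleUnaryOperator f = x -> x * x * x + 5 * x * x - 4 * x - 9;
        for (double[] p : sample(f, 0.0, 2.0, 1.0)) {
            System.out.println("(" + p[0] + ", " + p[1] + ")");
        }
    }
}
```

Computing x as from + i * step instead of repeatedly adding step keeps rounding error from accumulating over a long range.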
Check out this link, Terrance, which mentions the MathEval library:
Built-in method for evaluating math expressions in Java
Let me know if you need any help because I have implemented both the algorithm from scratch and through an external library.
What are the best resources on learning 'number crunching' using Java? I am referring to things like correct methods of decimal number processing, best practices, APIs, notable idioms for performance, and common pitfalls (and their solutions) while coding for number processing using Java.
This question seems a bit open ended and open to interpretation. As such, I will just give two short things.
1) Decimal precision - never assume that two floating point (float or double) numbers are equal, even if you went through the exact same steps to calculate them both. Due to a number of issues with rounding in various situations, you often cannot be certain that a decimal number is exactly what you expect. If you do double myNumber = calculateMyNumber() and then do a bunch of things and then come back to it and check if (myNumber == calculateMyNumber()), that evaluation could be false even if you have not changed the calculations done in calculateMyNumber().
2) There are limitations in the size and precision of numbers that you can keep track of. If you have int myNumber = 2000000000 and if (myNumber*2 < myNumber), that will actually evaluate to true, because an int is only 32 bits wide: the product does not fit, so it overflows and wraps around to a negative value. Look into classes that encapsulate large numbers, such as BigInteger and BigDecimal.
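Both points are easy to try out. The sketch below uses the well-known 0.1 + 0.2 case as a simpler stand-in for the rounding issue in 1), and the int example from 2); the class name, the helper name and the tolerance value are choices made for this example only.

```java
import java.math.BigInteger;

final class NumericPitfallsSketch {
    // Compares two doubles up to an absolute tolerance instead of using ==.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    // Reports whether n * 2 overflows the int range, without silently wrapping.
    static boolean doublingOverflows(int n) {
        try {
            Math.multiplyExact(n, 2);
            return false;
        } catch (ArithmeticException overflow) {
            return true;
        }
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                   // false
        System.out.println(nearlyEqual(sum, 0.3, 1e-9));  // true

        int myNumber = 2000000000;
        System.out.println(myNumber * 2 < myNumber);      // true: the product wrapped around
        System.out.println(doublingOverflows(myNumber));  // true

        // BigInteger has no fixed width, so the product is exact.
        BigInteger big = BigInteger.valueOf(myNumber).multiply(BigInteger.valueOf(2));
        System.out.println(big);                          // 4000000000
    }
}
```

A fixed absolute tolerance only suits values of moderate size; for very large or very small numbers a relative tolerance is usually the better choice.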
You will figure stuff like this out as a side effect if you study the computer representations of numbers, or binary representations of numbers.
First, you should learn about floating point math. This is not specific to java, but it will allow you to make informed decisions later about, for example, when it's OK to use Java primitives such as float and double. Relevant topics include (copied from a course that I took on scientific computing):
Sources of error: roundoff, truncation error, incomplete convergence, statistical error, program bug.
Computer floating point arithmetic and the IEEE standard.
Error amplification through cancellation.
Conditioning, condition number, and error amplification.
This leads you to decisions about whether to use Java's BigDecimal, BigInteger, etc. There are lots of questions and answers about this already.
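As one concrete instance of "error amplification through cancellation" from the topic list above, consider (1 - cos x) / x^2 for small x, whose true value is close to 0.5. The sketch below is my own illustration, not taken from the course mentioned; it compares the direct formula with an algebraically equivalent one that avoids subtracting two nearly equal numbers.

```java
final class CancellationSketch {
    // Direct formula: for tiny x, cos(x) rounds to (almost) exactly 1,
    // so the subtraction loses nearly all significant digits.
    static double naive(double x) {
        return (1.0 - Math.cos(x)) / (x * x);
    }

    // Uses the identity 1 - cos(x) = 2 * sin(x / 2)^2, which involves no subtraction.
    static double stable(double x) {
        double s = Math.sin(x / 2.0);
        return 2.0 * s * s / (x * x);
    }

    public static void main(String[] args) {
        double x = 1e-8;
        System.out.println("naive:  " + naive(x));
        System.out.println("stable: " + stable(x));
    }
}
```

Both methods use plain double arithmetic; the difference comes entirely from how the formula is arranged, which is why this topic is worth studying before reaching for BigDecimal.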
Next, you're going to hit performance, including both CPU and memory. You probably will find various rules of thumb, such as "autoboxing a lot is a serious performance problem." But as always, the best thing to do is profile your own application. For example, let's say somebody insists that some optimization is important, even though it affects legibility. If that optimization doesn't matter for your application, then don't do it!
Finally, where possible, it often pays to use a library for numerical stuff. Your first attempt probably will be slower and more buggy than existing libraries. For example, for goodness sake, don't implement your own linear programming routine.
I am trying to design a way to represent mathematical equations as Java Objects. This is what I've come up with so far:
Term
-Includes fields such as coefficient (which could be negative), exponent and variable (x, y, z, etc). Some fields may even qualify as their own terms altogether, introducing recursion.
-Objects that extend Term would include things such as TrigTerm to represent trigonometric functions.
Equation
-This is a collection of Terms
-The toString() method of Equation would call the toString() method of all of its Terms and concatenate the results.
The overall idea is that I would be able to programmatically manipulate the equations (for example, a derivative method that would return an equation that is the derivative of the equation it was called for, or an evaluate method that would evaluate an equation for a certain variable equaling a certain value).
What I have works fine for simple equations:
This is just two Terms: one with a variable "x" and an exponent "2" and another which is just a constant "3."
But not so much for more complex equations:
Yes, this is a terrible example but I'm just making a point.
So now for the question: what would be the best way to represent math equations as Java objects? Are there any libraries that already do this?
"what would be the best way to represent math equations as Java objects?"
I want you to notice that you don't have any equations. Equations look like this:
x = 3
What you have are expressions: collections of symbols that could, under some circumstances, evaluate out to some particular values.
You should write a class Expression. Expression has three subclasses: Constant (e.g. 3), Variable (e.g. x), and Operation.
An Operation has a type (e.g. "exponentiation" or "negation") and a list of Expressions to work on. That's the key idea: an Operation, which is an Expression, also has some number of Expressions.
So your x^2 + 3 is SUM(EXP(X, 2), 3) -- that is, the SUM Operation, taking two expressions, the first being the Exponentiation of the Expressions Variable X and Constant 2, and the second being the Constant 3.
This concept can be infinitely elaborated to represent any expression you can write on paper.
The hard part is evaluating a string that represents your expression and producing an Expression object -- as someone suggested, read some papers about parsing. It's the hardest part but still pretty easy.
Evaluating an Expression (given fixed values for all your Variables) and printing one out are actually quite easy. More complicated transforms (like differentiation and integration) can be challenging but are still not rocket science.
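A bare-bones version of that class structure, with evaluation and printing, might look like the following. It is a sketch under my own naming assumptions: the wrapper class, the Type enum and its four members, and the helper xSquaredPlusThree are all invented here, and differentiation is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ExpressionSketch {
    interface Expression {
        double evaluate(Map<String, Double> variables);
    }

    static final class Constant implements Expression {
        private final double value;
        Constant(double value) { this.value = value; }
        @Override public double evaluate(Map<String, Double> variables) { return value; }
        @Override public String toString() { return String.valueOf(value); }
    }

    static final class Variable implements Expression {
        private final String name;
        Variable(String name) { this.name = name; }
        @Override public double evaluate(Map<String, Double> variables) {
            Double value = variables.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value given for " + name);
            }
            return value;
        }
        @Override public String toString() { return name; }
    }

    enum Type { SUM, PRODUCT, EXP, NEGATION }

    // An Operation is itself an Expression and holds the Expressions it works on.
    static final class Operation implements Expression {
        private final Type type;
        private final List<Expression> operands;
        Operation(Type type, Expression... operands) {
            this.type = type;
            this.operands = Arrays.asList(operands);
        }
        @Override public double evaluate(Map<String, Double> variables) {
            switch (type) {
                case SUM: {
                    double total = 0.0;
                    for (Expression e : operands) total += e.evaluate(variables);
                    return total;
                }
                case PRODUCT: {
                    double product = 1.0;
                    for (Expression e : operands) product *= e.evaluate(variables);
                    return product;
                }
                case EXP:
                    return Math.pow(operands.get(0).evaluate(variables),
                                    operands.get(1).evaluate(variables));
                default: // NEGATION
                    return -operands.get(0).evaluate(variables);
            }
        }
        @Override public String toString() {
            return type + operands.stream().map(Object::toString)
                    .collect(Collectors.joining(", ", "(", ")"));
        }
    }

    // Builds SUM(EXP(x, 2), 3), i.e. x^2 + 3.
    static Expression xSquaredPlusThree() {
        return new Operation(Type.SUM,
                new Operation(Type.EXP, new Variable("x"), new Constant(2)),
                new Constant(3));
    }
}
```

Evaluating xSquaredPlusThree() with x bound to 2 walks the tree bottom-up: EXP yields 4, then SUM adds the constant 3. A derivative method would follow the same recursive pattern, returning a new Expression per node type.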
Consult a good compiler book for details about how to write the part of a compiler that converts input into an expression tree.
You might find this series inspirational: http://compilers.iecc.com/crenshaw/
If you "just" want to evaluate an input string, then have a look at the snippet compiler in the Javassist library.
Here I described the representation of parsed math expressions as Abstract Syntax Trees in the Symja project.
The D[f,x] function in the D.java file implements a derivative function by reading the initial Derivative[] rules from the System.mep file.
We were just assigned a new project in my data structures class -- Generating text with markov chains.
Overview
Given an input text file, we create an initial seed of length n characters. We add that to our output string and choose our next character based on frequency analysis.
This is the cat and there are two dogs.
Initial seed: "Th"
Possible next letters -- i, e, e
Therefore, probability of choosing i is 1/3, e is 2/3.
Now, say we choose i. We add "i" to the output string. Then our seed becomes
hi and the process continues.
My solution
I have 3 classes, Node, ConcreteTrie, and Driver
Of course, the ConcreteTrie class isn't a Trie of the traditional sense. Here is how it works:
Given the sentence with k=2:
This is the cat and there are two dogs.
I generate Nodes Th, hi, is, ... + ... , gs, s.
Each of these nodes has children that are the letters that follow it. For example, Node Th would have children i and e. I maintain counts in each of those nodes so that I can later generate the probabilities for choosing the next letter.
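For comparison, the same counts can be kept in nested hash maps instead of Node objects. This is only a sketch of one alternative, not the ConcreteTrie described above: the class and method names are invented, and the text is lower-cased so that "Th" and "th" share counts, matching the i, e, e example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Random;

final class MarkovSketch {
    // Maps each k-character seed to the counts of the characters that follow it.
    static Map<String, Map<Character, Integer>> buildCounts(String text, int k) {
        String lower = text.toLowerCase(Locale.ROOT);
        Map<String, Map<Character, Integer>> counts = new HashMap<>();
        for (int i = 0; i + k < lower.length(); i++) {
            String seed = lower.substring(i, i + k);
            char next = lower.charAt(i + k);
            counts.computeIfAbsent(seed, s -> new HashMap<>()).merge(next, 1, Integer::sum);
        }
        return counts;
    }

    // Picks a follower with probability proportional to its count.
    static char pickNext(Map<Character, Integer> followers, Random random) {
        int total = 0;
        for (int count : followers.values()) {
            total += count;
        }
        int r = random.nextInt(total);
        for (Map.Entry<Character, Integer> entry : followers.entrySet()) {
            r -= entry.getValue();
            if (r < 0) {
                return entry.getKey();
            }
        }
        throw new IllegalStateException("followers must not be empty");
    }

    // Generates up to `length` characters, starting from the first k characters of the text.
    static String generate(String text, int k, int length, Random random) {
        Map<String, Map<Character, Integer>> counts = buildCounts(text, k);
        StringBuilder out = new StringBuilder(text.toLowerCase(Locale.ROOT).substring(0, k));
        while (out.length() < length) {
            Map<Character, Integer> followers = counts.get(out.substring(out.length() - k));
            if (followers == null) {
                break; // dead end: this seed only occurs at the very end of the text
            }
            out.append(pickNext(followers, random));
        }
        return out.toString();
    }
}
```

Building the table is one pass over the text, and each generated character costs one hash lookup plus a scan over that seed's followers.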
My question:
First of all, what is the most efficient way to complete this project? My solution seems to be very fast, but I really want to knock my professor's socks off. (On my last project, a variation of the edit distance problem, I did an A*, a genetic algorithm, a BFS, and simulated annealing -- and I know that the problem is NP-hard.)
Second, what's the point of this assignment? It doesn't really seem to relate to much of what we've covered in class. What are we supposed to learn?
On the relevance of this assignment with what you covered in class (Your second question). The idea of a 'data structures' class is to expose students to the very many structures frequently encountered in CS: lists, stacks, queues, hashes, trees of various types, graphs at large, matrices of various creed and greed, etc. and to provide some insight into their common implementations, their strengths and weaknesses and generally their various fields of application.
Since most any game / puzzle / problem can be mapped to some set of these structures, there is no lack of subjects upon which to base lectures and assignments. Your class seems interesting because while keeping some focus on these structures, you are also given a chance to discover real applications.
For example in a thinly disguised fashion the "cat and two dogs" thing is an introduction to statistical models applied to linguistics. Your curiosity and motivation prompted you to make the relation with markov models and it's a good thing, because chances are you'll meet "Markov" a few more times before graduation ;-) and certainly in a professional life in CS or related domain. So, yes! it may seem that you're butterflying around many applications etc. but so long as you get a feel for what structures and algorithms to select in particular situations, you're not wasting your time!
Now, a few hints on possible approaches to the assignment
The trie seems like a natural support for this type of problem. Maybe you can ask yourself, however, how this approach would scale if you had to index, say, a whole book rather than this short sentence. It seems to scale mostly linearly, although this depends on how each choice is made at each of the three hops in the trie (for this 2nd-order Markov chain): as the number of choices increases, picking a path may become less efficient.
A possible alternative storage for building the index is a stochastic matrix (actually a plain, if sparse, count matrix during the statistics-gathering process, turned stochastic at the end when you normalize each row, or column, depending on how you set it up, to sum to one (100%)). Such a matrix would be roughly 729 x 28, and would allow the indexing, in one single operation, of a two-letter tuple and its associated following letter. (I got 28 for including the "start" and "stop" signals, details...)
The cost of this more efficient indexing is the use of extra space. Space-wise the trie is very efficient, only storing the combinations of letter triplets effectively in existence; the matrix, however, wastes some space (you can bet that in the end it will be very sparsely populated, even after indexing much more text than the "dog/cat" sentence).
This size vs. CPU compromise is very common, although some algorithms/structures are sometimes better than others on both counts... Furthermore, the matrix approach wouldn't scale nicely, size-wise, if the problem was changed to base the choice of letters on the preceding, say, three characters.
Nonetheless, maybe look into the matrix as an alternate implementation. It is very much in the spirit of this class to try various structures and see why/where they are better than others (in the context of a specific task).
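A rough sketch of the matrix variant follows. It makes a simplifying assumption that differs slightly from the sizes quoted above: it uses 27 symbols (a to z plus one bucket for everything else, such as spaces and punctuation) for both rows and columns, so the matrix is 729 x 27, and it omits the "start" and "stop" signals. All names are invented for the example.

```java
final class TransitionMatrixSketch {
    // Assumption for this sketch: 'a'..'z' plus one bucket for any other character.
    static final int SYMBOLS = 27;

    static int symbolIndex(char c) {
        char lower = Character.toLowerCase(c);
        return (lower >= 'a' && lower <= 'z') ? lower - 'a' : SYMBOLS - 1;
    }

    // A two-letter tuple maps directly to one row: no tree traversal is needed.
    static int rowIndex(char first, char second) {
        return symbolIndex(first) * SYMBOLS + symbolIndex(second);
    }

    // Counts every (two letters -> next letter) occurrence, then normalizes
    // each non-empty row so that it sums to one.
    static double[][] build(String text) {
        double[][] matrix = new double[SYMBOLS * SYMBOLS][SYMBOLS];
        for (int i = 0; i + 2 < text.length(); i++) {
            int row = rowIndex(text.charAt(i), text.charAt(i + 1));
            matrix[row][symbolIndex(text.charAt(i + 2))] += 1.0;
        }
        for (double[] row : matrix) {
            double sum = 0.0;
            for (double value : row) {
                sum += value;
            }
            if (sum > 0.0) {
                for (int j = 0; j < row.length; j++) {
                    row[j] /= sum;
                }
            }
        }
        return matrix;
    }
}
```

With the example sentence, the row for "th" ends up with 1/3 in the column for i and 2/3 in the column for e, while the vast majority of the 729 rows stay all zero, which is the wasted space discussed above.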
A small side trip you can take is to create a tag cloud based on the probabilities of the letters pairs (or triplets): both the trie and the matrix contain all the data necessary for that; the matrix with all its interesting properties, may be more suited for this.
Have fun!
You are using a bigram approach with characters, but it is usually applied to words, because the output is more meaningful with a simple generator like the one in your case.
1) From my point of view you are doing everything right. But maybe you should try to slightly randomize the selection of the next node? E.g. select a random node from the 5 most frequent ones. I mean, if you always select the node with the highest probability, your output string will be too uniform.
2) I've done exactly the same homework at my university. I think the point is to show students that Markov chains are powerful, but that without extensive study of the application domain the generator's output will be ridiculous.