I've three sets of data such as:
x y
4 0
6 60
8 0
Does anyone know of any (efficient) Java code that can give me the values of a, b, and c (the coefficients)?
I assume you want the formula in this form:
y = a * x^2 + b*x + c
If you have only three points you can describe the quadratic curve that goes through all three points with the formula:
y = ((x-x2) * (x-x3)) / ((x1-x2) * (x1-x3)) * y1 +
((x-x1) * (x-x3)) / ((x2-x1) * (x2-x3)) * y2 +
((x-x1) * (x-x2)) / ((x3-x1) * (x3-x2)) * y3
In your example:
x1 = 4, y1 = 0, x2 = 6, y2 = 60, x3 = 8, y3 = 0
To get the coefficients a, b, c in terms of x1, x2, x3, y1, y2 and y3 you just need to multiply the formula out and then collect the terms. It's not difficult, and it will run very fast but it will be quite a lot of code to type in. It would probably be better to look for a package that already does this for you, but if you want to do it yourself, this is how you could do it.
The fact that two of the y terms are zero in your example makes the formula a lot simpler, and you might be able to take advantage of that. But if that was just a coincidence and not a general rule, then you need the full formula.
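Multiplying the three-point formula out and collecting the powers of x can be sketched as follows. This is a minimal sketch, not a library routine: the class name QuadraticThroughThreePoints and the method name coefficients are made up for illustration, and the x values are assumed to be distinct.

```java
final class QuadraticThroughThreePoints {

    /** Returns {a, b, c} such that y = a*x^2 + b*x + c passes through the three points. */
    static double[] coefficients(double x1, double y1,
                                 double x2, double y2,
                                 double x3, double y3) {
        // Denominators of the three Lagrange basis polynomials.
        double d1 = (x1 - x2) * (x1 - x3);
        double d2 = (x2 - x1) * (x2 - x3);
        double d3 = (x3 - x1) * (x3 - x2);
        if (d1 == 0 || d2 == 0 || d3 == 0) {
            throw new IllegalArgumentException("x values must be distinct");
        }
        double w1 = y1 / d1;
        double w2 = y2 / d2;
        double w3 = y3 / d3;

        // Each basis term (x - p)(x - q) expands to x^2 - (p + q)x + p*q.
        double a = w1 + w2 + w3;
        double b = -(w1 * (x2 + x3) + w2 * (x1 + x3) + w3 * (x1 + x2));
        double c = w1 * x2 * x3 + w2 * x1 * x3 + w3 * x1 * x2;
        return new double[] {a, b, c};
    }

    public static void main(String[] args) {
        double[] abc = coefficients(4, 0, 6, 60, 8, 0);
        System.out.println("a=" + abc[0] + " b=" + abc[1] + " c=" + abc[2]);
    }
}
```

For the points in the question this gives a = -15, b = 180, c = -480, i.e. y = -15*(x-4)*(x-8).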
Lagrange interpolation is probably the most 'efficient' (how do you measure that?) solution you are going to find. So I'll suggest a completely general code. You did want code, right? This code can go linear, quadratic, cubic, ... for any number of points.
I didn't actually try to compile it, so I don't know if the source code is up to date. You know how online demos go. Yet the applet from the associated web page is fully functional. The jar file will run standalone. With a resizable window, you really don't need to customize it.
It depends on exactly what you are looking for: Are you looking for the unique polynomial which is defined by those three points, or are you looking for a library which will generate a polynomial which passes through all points?
If you are looking for the first, the best technique is to construct the coefficient matrix (that is, the set of three linear equations which uniquely constrain this quadratic) and apply Gaussian elimination to get your result. For three points this is most efficiently worked out by hand, but you can also use the Apache Commons Math library's RealMatrix solve methods. (EDIT: Thanks for the correction; I speak before I think sometimes ;)
If you are looking for the second, this is a specific case of a general class of problems called polynomial interpolation, and there are several ways of solving it. Splines are my personal favorite, but all have their strengths and weaknesses. Luckily, Apache Commons Math implements several such methods. I would look at the SplineInterpolator class. Splines use cubics instead of quadratics, but they tend to be very good approximations. They also don't fail if one point is a linear multiple of another.
For just three points, both methods should perform about equally. If you have more than three points, however, I would strongly recommend using interpolation, as Gaussian elimination scales poorly (O(n^3)), and splines (or another interpolation technique) are less likely to fail.
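For the first option, the coefficient matrix has one row [x^2, x, 1] per point and the y values form the right-hand side. The following is a rough, dependency-free sketch of Gaussian elimination with partial pivoting on that system; the names VandermondeSolve, solve and quadratic are hypothetical, and a library solver would normally be the safer choice.

```java
final class VandermondeSolve {

    /** Solves the n x n system m * unknowns = rhs by Gaussian elimination with partial pivoting. */
    static double[] solve(double[][] m, double[] rhs) {
        int n = rhs.length;
        double[][] a = new double[n][n + 1];   // augmented copy, so the inputs stay untouched
        for (int i = 0; i < n; i++) {
            System.arraycopy(m[i], 0, a[i], 0, n);
            a[i][n] = rhs[i];
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;                   // row with the largest entry in this column
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            if (a[pivot][col] == 0.0) throw new ArithmeticException("singular system");
            double[] tmp = a[col]; a[col] = a[pivot]; a[pivot] = tmp;
            for (int r = col + 1; r < n; r++) { // eliminate the entries below the pivot
                double factor = a[r][col] / a[col][col];
                for (int k = col; k <= n; k++) a[r][k] -= factor * a[col][k];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {     // back substitution
            double sum = a[i][n];
            for (int k = i + 1; k < n; k++) sum -= a[i][k] * x[k];
            x[i] = sum / a[i][i];
        }
        return x;
    }

    /** Builds the rows [x^2, x, 1] for three points and solves for {a, b, c}. */
    static double[] quadratic(double[] xs, double[] ys) {
        double[][] m = new double[3][];
        for (int i = 0; i < 3; i++) m[i] = new double[] {xs[i] * xs[i], xs[i], 1.0};
        return solve(m, ys);
    }
}
```

Calling VandermondeSolve.quadratic(new double[] {4, 6, 8}, new double[] {0, 60, 0}) should return values close to {-15, 180, -480}, up to floating-point rounding.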
I'm using the nonlinear least squares Levenberg-Marquardt algorithm in Java to fit a number of exponential curves (A + B*exp(C*x)). Although the data is quite clean and closely follows the model, the algorithm fails to fit the majority of the curves even with an excessive number of iterations (5000-6000). For the curves it can fit, it does so in about 150 iterations.
LeastSquaresProblem problem = new LeastSquaresBuilder()
    .start(start).model(jac).target(dTarget)
    .lazyEvaluation(false).maxEvaluations(5000)
    .maxIterations(6000).build();
LevenbergMarquardtOptimizer optimizer = new LevenbergMarquardtOptimizer();
LeastSquaresOptimizer.Optimum optimum = optimizer.optimize(problem);
My question is: how would I define a convergence criterion in Apache Commons Math so that the optimizer stops before hitting the maximum number of iterations?
I don't believe Java is your problem. Let's address the mathematics.
This problem is easier to solve if you change your function.
Your assumed equation is:
y = A + B*exp(C*x)
It'd be easier if you could do this:
y-A = B*exp(C*x)
Now A is just a constant that can be zero or whatever value you need to shift the curve up or down. Let's call the shifted variable z = y - A:
z = B*exp(C*x)
Taking the natural log of both sides:
ln(z) = ln(B*exp(C*x))
We can simplify that right hand side to get the final result:
ln(z) = ln(B) + C*x
Transform your (x, y) data to (x, ln(z)) and you can use least squares fitting of a straight line, where C is the slope in (x, ln(z)) space and ln(B) is the intercept. This requires z = y - A to be positive for every data point. Lots of software is available to do that.
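The transformation can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the offset A is taken as known, every y - A must be positive, and the class name LogLinearExponentialFit and its fit method are invented for this example. Note also that least squares on ln(z) weights the errors differently from least squares on y itself, so the result is often best used as a starting point for a nonlinear fit.

```java
final class LogLinearExponentialFit {

    /**
     * Fits y = A + B*exp(C*x) for a known offset A by fitting a straight line
     * to (x, ln(y - A)). Returns {B, C}.
     */
    static double[] fit(double[] x, double[] y, double offsetA) {
        int n = x.length;
        double sx = 0, sz = 0, sxx = 0, sxz = 0;
        for (int i = 0; i < n; i++) {
            double shifted = y[i] - offsetA;
            if (shifted <= 0) {
                throw new IllegalArgumentException("y - A must be positive to take the log");
            }
            double lnZ = Math.log(shifted);
            sx += x[i];
            sz += lnZ;
            sxx += x[i] * x[i];
            sxz += x[i] * lnZ;
        }
        // Ordinary least squares line: ln(z) = ln(B) + C*x
        double slope = (n * sxz - sx * sz) / (n * sxx - sx * sx);
        double intercept = (sz - slope * sx) / n;
        return new double[] {Math.exp(intercept), slope};
    }
}
```

With noise-free data generated from A = 2, B = 3, C = 0.5, the call LogLinearExponentialFit.fit(x, y, 2.0) should recover B and C up to rounding.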
I am trying to find a function to perform Lagrange interpolation in Java. I have 3 (x,y) pairs, where x and y are BigInteger objects, and would like to use some interpolation function to determine f(0) for the polynomial f used to calculate these (x,y) pairs. Something like this seems perfect, except that this class doesn't seem to actually belong to a package I can import: http://nssl.eew.technion.ac.il/files/Projects/thresholddsaimporvement/doc/javadoc/Lagrange.html
Forgive me if my question is naive, any help I can get would really be appreciated.
A similar class seems to be in Apache Commons Math. This is probably going to be much more reliable than what you found.
http://commons.apache.org/math/apidocs/org/apache/commons/math3/analysis/polynomials/PolynomialFunctionLagrangeForm.html
It would appear that you construct it with PolynomialFunctionLagrangeForm(double[] x, double[] y), then call value(0) to get the value at x = 0.
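Since PolynomialFunctionLagrangeForm works on double values, large BigInteger inputs would lose precision. If exact arithmetic matters, f(0) can be computed directly from the Lagrange formula with BigInteger fractions. The sketch below is an assumption-laden illustration, not the class from the linked javadoc: the name LagrangeAtZero is made up, the x values are assumed distinct, and it uses ordinary rational arithmetic (if the pairs come from a scheme that works modulo a prime, modular inverses would be needed instead).

```java
import java.math.BigInteger;

final class LagrangeAtZero {

    /** Returns {numerator, denominator} of f(0) in lowest terms, with a positive denominator. */
    static BigInteger[] valueAtZero(BigInteger[] x, BigInteger[] y) {
        BigInteger num = BigInteger.ZERO;
        BigInteger den = BigInteger.ONE;
        for (int i = 0; i < x.length; i++) {
            // Term i: y_i * product over j != i of (0 - x_j) / (x_i - x_j)
            //       = y_i * product of x_j / (x_j - x_i)
            BigInteger termNum = y[i];
            BigInteger termDen = BigInteger.ONE;
            for (int j = 0; j < x.length; j++) {
                if (j == i) continue;
                termNum = termNum.multiply(x[j]);
                termDen = termDen.multiply(x[j].subtract(x[i]));
            }
            // Add the fractions: num/den + termNum/termDen
            num = num.multiply(termDen).add(termNum.multiply(den));
            den = den.multiply(termDen);
        }
        BigInteger g = num.gcd(den);           // den is non-zero when the x values are distinct
        num = num.divide(g);
        den = den.divide(g);
        if (den.signum() < 0) {
            num = num.negate();
            den = den.negate();
        }
        return new BigInteger[] {num, den};
    }
}
```

For points taken from f(x) = x^2 + 2x + 5, namely (1, 8), (2, 13) and (3, 20), the result is the fraction 5/1.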
Are there any methods which do that? I have an application where I need the area under the curve, and I am given the formula, so if I can do the integration by hand, I should be able to do it programmatically? I can't find the name of the method I'm referring to, but this image demonstrates it: http://www.mathwords.com/a/a_assets/area%20under%20curve%20ex1work.gif
Edit: to everyone replying, I have already implemented the rectangular, trapezoidal and Simpson's rules. However, they take something like 10k+ strips to be accurate, so should I not be able to find the integrated version of a function programmatically? If not, there must be a bloody good reason for that.
Numerical integration
There are multiple methods that can be used. For a description, have a look at Numerical Recipes: The Art of Scientific Computing.
For Java there is the Apache Commons Math library. Its integration routines are in the Numerical Analysis section.
Symbolic integration
Check out jScience. Functions module "provides support for fairly simple symbolic math analysis (to solve algebraic equations, integrate, differentiate, calculate expressions, and so on)".
If the type of function is known in advance, it may be possible to integrate it faster for that specific case than with a general-purpose library.
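As an illustration of the known-function-type case, suppose the formula is a polynomial. Then the antiderivative is available in closed form and no strips are needed at all. The sketch below assumes the polynomial is supplied as an array of coefficients; the class name PolynomialIntegral and that representation are assumptions made for this example.

```java
final class PolynomialIntegral {

    /**
     * coeffs[k] is the coefficient of x^k.
     * Returns the definite integral from lo to hi (exact up to floating-point rounding).
     */
    static double integrate(double[] coeffs, double lo, double hi) {
        return antiderivativeAt(coeffs, hi) - antiderivativeAt(coeffs, lo);
    }

    /** Evaluates the sum of coeffs[k] * x^(k+1) / (k+1) with Horner's scheme. */
    private static double antiderivativeAt(double[] coeffs, double x) {
        double result = 0.0;
        for (int k = coeffs.length - 1; k >= 0; k--) {
            result = result * x + coeffs[k] / (k + 1);
        }
        return result * x;   // the antiderivative has no constant term, so factor out one x
    }
}
```

For example, PolynomialIntegral.integrate(new double[] {1, 2, 3}, 0, 2) integrates 1 + 2x + 3x^2 over [0, 2] and gives 14.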
To compute it exactly, you would need a computer algebra system library of some sort to perform symbolic manipulations. Such systems are rather complicated to implement, and I am not familiar with any high quality, open source libraries for Java. An alternative, though, assuming it meets your requirements, would be to estimate the area under the curve using the trapezoidal rule. Depending on how accurate you require your result to be, you can vary the size of the subdivisions accordingly.
I would recommend using Simpson's rule or the trapezium rule, because it could be excessively complicated to integrate every single type of graph.
See numerical analysis, specifically numerical integration. How about using the Riemann sum method?
You can use numerical integration with one of the rules already mentioned, such as Simpson's or the trapezoidal rule, or Monte Carlo simulation, which relies on a pseudo-random number generator.
You can try some libraries for symbolic integration, but I'm not sure that you can get symbolic representation of every integral.
Here's a simple but efficient approach:
public static double area(DoubleFunction<Double> f, double start, double end, int intervals) {
    double deltaX = (end - start)/intervals;
    double area = 0.0;
    double effectiveStart = start + (deltaX / 2);
    for (int i=0; i<intervals; ++i) {
        area += f.apply(effectiveStart + (i * deltaX));
    }
    return deltaX * area;
}
This is a Riemann sum using the midpoint rule, which is a variation of the trapezoidal rule, except instead of calculating the area of a trapezoid, I use a rectangle whose height is f(x) at the middle of the interval. It needs one function evaluation per interval and, for smooth functions, its error is typically about half that of the trapezoidal rule. This is why my effective starting value of x is at the middle of the first interval. And by computing each x from the integer loop index instead of repeatedly adding deltaX, I avoid accumulating round-off error in x.
I also improve performance by waiting till the end of the loop before multiplying by deltaX. I could have written the loop like this:
for (int i=0; i<intervals; ++i) {
    area += deltaX * f.apply(effectiveStart + (i * deltaX)); // this is x * y for each rectangle
}
But deltaX is constant, so it's faster to wait till the loop is finished.
One of the most popular forms of numeric integration is the Runge-Kutta order 4 (RK4) technique. Its implementation is as follows:
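double dx;  // step size
double y;   // initial value
// f(y) is the derivative dy/dx; dx and y must be assigned before the loop
for (int i = 0; i < number_of_iterations; i++) {
    double k1 = f(y);
    double k2 = f(y + dx / 2 * k1);
    double k3 = f(y + dx / 2 * k2);
    double k4 = f(y + dx * k3);
    y += dx / 6 * (k1 + 2 * k2 + 2 * k3 + k4);
}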
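It is a fourth-order method for differential equations and converges much faster than the rectangle and trapezoid rules. Note, however, that when the integrand depends only on x, an RK4 step reduces to Simpson's rule, so it offers no gain over Simpson's rule for a plain area calculation. It is one of the more commonly used techniques for integration in physics simulations.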
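The snippet above lets the derivative depend on y only. A slightly more general sketch, in which the derivative is a function f(x, y), might look like the following; the class name Rk4 and the parameter layout are assumptions for illustration. When f ignores y, each step evaluates f at the left end, twice at the midpoint and at the right end with weights 1, 4, 1, which is one panel of Simpson's rule.

```java
import java.util.function.DoubleBinaryOperator;

final class Rk4 {

    /** Integrates dy/dx = f(x, y) from x0 over the given number of steps of size dx, starting at y0. */
    static double integrate(DoubleBinaryOperator f, double x0, double y0, double dx, int steps) {
        double y = y0;
        for (int i = 0; i < steps; i++) {
            double x = x0 + i * dx;   // derived from the loop index to avoid drift in x
            double k1 = f.applyAsDouble(x, y);
            double k2 = f.applyAsDouble(x + dx / 2, y + dx / 2 * k1);
            double k3 = f.applyAsDouble(x + dx / 2, y + dx / 2 * k2);
            double k4 = f.applyAsDouble(x + dx, y + dx * k3);
            y += dx / 6 * (k1 + 2 * k2 + 2 * k3 + k4);
        }
        return y;
    }
}
```

For instance, Rk4.integrate((x, y) -> y, 0.0, 1.0, 0.01, 100) approximates e by solving dy/dx = y from x = 0 to x = 1.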
I need help to make my code below more efficient, and to clean it up a little.
As shown in this image, x and y can be any point around the whole screen, and I am trying to find the angle t. Is there a way I can reduce the number of lines here?
Note: The origin is in the top left corner, and moving right/down is moving in the positive direction
o := MiddleOfScreenX - x;
a := MiddleOfScreenY - y;
t := Abs(Degrees(ArcTan(o / a)));
if (x > MiddleOfScreenX) then
begin
  if (y > MiddleOfScreenY) then
    t := 180 + t
  else
    t := 360 - t;
end
else
  if (y > MiddleOfScreenY) then
    t := 180 - t;
The code is in Pascal, but answers in other languages with similar syntax, or in C++ or Java, are fine as well.
:= sets the variable to that value
Abs() result is the absolute of that value (removes negatives)
Degrees() converts from radians to degrees
ArcTan() returns the inverse tan
See http://www.cplusplus.com/reference/clibrary/cmath/atan2/ for a C function.
atan2 takes two separate arguments, so it can determine the quadrant.
Pascal may have ArcTan2; see http://www.freepascal.org/docs-html/rtl/math/arctan2.html or http://www.gnu-pascal.de/gpc/Run-Time-System.html
o := MiddleOfScreenX - x;
a := MiddleOfScreenY - y;
t := Degrees(ArcTan2(o, a));
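Since answers in Java were also welcome, here is a rough Java equivalent. It assumes the goal is the same 0 to 360 degree range that the original quadrant tests produce; atan2 returns values between roughly -180 and 180 degrees, so negative results are shifted by 360. The class and method names are made up for this sketch.

```java
final class ScreenAngle {

    /** Angle t in degrees, in the range [0, 360), matching the quadrant logic of the question. */
    static double degrees(double x, double y, double middleX, double middleY) {
        double o = middleX - x;
        double a = middleY - y;
        double t = Math.toDegrees(Math.atan2(o, a));  // sign and quadrant handled by atan2
        return t < 0 ? t + 360.0 : t;
    }
}
```

Unlike ArcTan(o / a), this also works when a is zero, because no division takes place.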
The number of lines of code isn't necessarily the only optimization you need to consider. Trigonometric functions are costly in terms of the time it takes for a single one to finish its computation (i.e., a single cos() call may require hundreds of additions and multiplications, depending on the implementation).
In the case of a commonly used function in signal processing, the discrete Fourier transform, the results of thousands of cos() and sin() calculations are pre-calculated and stored in a massive lookup table. The tradeoff is that you use more memory when running your application, but it runs MUCH faster.
Please see the following article, or search for the importance of "precomputed twiddle factors", which essentially means calculating a ton of complex exponentials in advance.
In the future, you should also mention what you are trying to optimize for (e.g., CPU cycles used, number of bytes of memory used, cost, among other things). I can only assume that you mean to optimize in terms of instructions executed, and by extension, the number of CPU cycles used (i.e., you want to reduce CPU overhead).
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.9421&rep=rep1&type=pdf
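The precomputation idea can be sketched in Java as a small table of cosines and sines at N equally spaced angles, computed once and then looked up by index. The class name TwiddleTable is hypothetical and this is not taken from the linked article; a DFT would typically combine the two arrays as cos - i*sin.

```java
final class TwiddleTable {
    final double[] cos;
    final double[] sin;

    /** Precomputes cos(2*pi*k/n) and sin(2*pi*k/n) for k = 0 .. n-1. */
    TwiddleTable(int n) {
        cos = new double[n];
        sin = new double[n];
        for (int k = 0; k < n; k++) {
            double angle = 2.0 * Math.PI * k / n;
            cos[k] = Math.cos(angle);   // paid once here instead of on every use
            sin[k] = Math.sin(angle);
        }
    }
}
```

The table trades 2*n doubles of memory for avoiding repeated trigonometric calls in an inner loop.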
You should only need one test to determine what to do with the arctan. Your existing tests recover the information destroyed by Abs().
atan() normally returns a value in the range -pi/2 to pi/2. Your coordinate system is a bit strange: rotate 90 deg clockwise to get a "standard" one, though you take atan of x/y as opposed to y/x. I'm already having a hard time resolving this in my head.
Anyway, I believe your test just needs to be that if a is negative, you add 180 deg. If you want to avoid negative angles, add 360 deg if the result is then negative.
Does anyone know of a scientific/mathematical library in Java that has a straightforward implementation of weighted linear regression? Something along the lines of a function that takes 3 arguments and returns the corresponding coefficients:
linearRegression(x,y,weights)
This seems fairly straightforward, so I imagine it exists somewhere.
PS) I've tried Flanagan's library: http://www.ee.ucl.ac.uk/~mflanaga/java/Regression.html, it has the right idea but seems to crash sporadically and complain about my degrees of freedom?
Not a library, but the code is posted: http://www.codeproject.com/KB/recipes/LinReg.aspx
(and includes the mathematical explanation for the code, which is a huge plus).
Also, it seems that there is another implementation of the same algorithm here: http://sin-memories.blogspot.com/2009/04/weighted-linear-regression-in-java-and.html
Finally, there is a lib from a University in New Zealand that seems to have it implemented: http://www.cs.waikato.ac.nz/~ml/weka/ (pretty decent javadocs). The specific method is described here:
http://weka.sourceforge.net/doc/weka/classifiers/functions/LinearRegression.html
I was also searching for this, but I couldn't find anything. The reason might be that you can reduce the problem to standard regression as follows:
The weighted linear regression without an intercept term can be represented as
diag(sqrt(weights)) y = diag(sqrt(weights)) X b
where diag(sqrt(weights)) T basically means multiplying each row of the matrix T by the square root of its weight. Therefore, the translation between weighted and unweighted regressions without an intercept is trivial.
To translate a regression with an intercept, y = Xb + u, into a regression without one, y = Xb, you add an additional column to X: a new column containing only ones.
Now that you know how to simplify the problem, you can use any library to solve the standard linear regression.
Here's an example, using Apache Commons Math:
import org.apache.commons.math3.stat.regression.OLSMultipleLinearRegression;

void linearRegression(double[] xUnweighted, double[] yUnweighted, double[] weights) {
    double[] y = new double[yUnweighted.length];
    double[][] x = new double[xUnweighted.length][2];
    for (int i = 0; i < y.length; i++) {
        // scale each row (including the column of ones) by the square root of its weight
        double sqrtWeight = Math.sqrt(weights[i]);
        y[i] = sqrtWeight * yUnweighted[i];
        x[i][0] = sqrtWeight * xUnweighted[i];
        x[i][1] = sqrtWeight;
    }
    OLSMultipleLinearRegression regression = new OLSMultipleLinearRegression();
    regression.setNoIntercept(true);  // the intercept is already carried by column x[i][1]
    regression.newSampleData(y, x);
    double[] regressionParameters = regression.estimateRegressionParameters();
    double slope = regressionParameters[0];
    double intercept = regressionParameters[1];
    System.out.println("y = " + slope + "*x + " + intercept);
}
This can be explained intuitively by the fact that in linear regression with u=0, if you take any point (x,y) and convert it to (xC,yC), the error for the new point will also get multiplied by C. In other words, linear regression already applies higher weight to points with higher x. We are minimizing the squared error, that's why we extract the roots of the weights.
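For the single-variable case, the same weighted fit can also be written out in closed form without any library, which may serve as a cross-check. This is a minimal sketch with a made-up class name, WeightedLineFit; it assumes non-negative weights and at least two distinct x values that carry non-zero weight.

```java
final class WeightedLineFit {

    /** Returns {slope, intercept} minimizing the sum of w[i] * (y[i] - slope*x[i] - intercept)^2. */
    static double[] fit(double[] x, double[] y, double[] w) {
        double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
        for (int i = 0; i < x.length; i++) {
            sw += w[i];
            swx += w[i] * x[i];
            swy += w[i] * y[i];
            swxx += w[i] * x[i] * x[i];
            swxy += w[i] * x[i] * y[i];
        }
        // Solution of the weighted normal equations for a straight line.
        double denom = sw * swxx - swx * swx;
        if (denom == 0) {
            throw new IllegalArgumentException("need at least two distinct weighted x values");
        }
        double slope = (sw * swxy - swx * swy) / denom;
        double intercept = (swy - slope * swx) / sw;
        return new double[] {slope, intercept};
    }
}
```

A point with weight zero has no influence on the result, and points lying exactly on a line are reproduced regardless of the weights.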
I personally used the org.apache.commons.math.stat.regression.SimpleRegression class of the Apache Commons Math library.
I also found a more lightweight class from Princeton University but didn't test it:
http://introcs.cs.princeton.edu/java/97data/LinearRegression.java.html
Here's a direct Java port of the C# code for weighted linear regression from the first link in Aleadam's answer:
https://github.com/lukehutch/WeightedLinearRegression.java