How do I implement a nonlinear optimization with nonlinear constraints in Java? I am currently using org.apache.commons.math3.optim.nonlinear.scalar.noderiv, and I have read that none of the optimizers (such as the one I am currently working with, SimplexOptimizer) accept constraints directly; instead, one must map the constrained parameters to unconstrained ones via the MultivariateFunctionPenaltyAdapter or MultivariateFunctionMappingAdapter classes. However, as far as I can tell, even with these wrappers one can only impose linear or "simple" constraints. I am wondering if there is any way to include nonlinear inequality constraints?
For example, suppose that my objective function is a function of three parameters a, b, and c (depending on them non-linearly), and that additionally these parameters are subject to the constraint that ab
Any advice that would solve the problem using just Apache Commons would be great, but suggestions for extending existing classes or augmenting the package would of course also be welcome.
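For reference, here is how I understand the adapter approach is meant to be used for the "simple" (box-bound) case. This is only a minimal sketch, assuming commons-math3; the objective, bounds, and tolerances below are placeholders rather than my actual model:

import org.apache.commons.math3.analysis.MultivariateFunction;
import org.apache.commons.math3.optim.InitialGuess;
import org.apache.commons.math3.optim.MaxEval;
import org.apache.commons.math3.optim.PointValuePair;
import org.apache.commons.math3.optim.nonlinear.scalar.GoalType;
import org.apache.commons.math3.optim.nonlinear.scalar.MultivariateFunctionMappingAdapter;
import org.apache.commons.math3.optim.nonlinear.scalar.ObjectiveFunction;
import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.NelderMeadSimplex;
import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.SimplexOptimizer;

public class BoundedSimplexSketch {
    public static void main(String[] args) {
        // Placeholder objective in the original (bounded) parameters a, b, c.
        MultivariateFunction objective =
                x -> Math.pow(x[0] - 1, 2) + Math.pow(x[1] - 2, 2) + Math.pow(x[2] - 3, 2);

        // The adapters can only express simple box bounds like these, not a
        // nonlinear relation between the parameters.
        double[] lower = {0.0, 0.0, 0.0};
        double[] upper = {5.0, 5.0, 5.0};
        MultivariateFunctionMappingAdapter adapter =
                new MultivariateFunctionMappingAdapter(objective, lower, upper);

        SimplexOptimizer optimizer = new SimplexOptimizer(1e-10, 1e-12);
        double[] start = {1.0, 1.0, 1.0};
        PointValuePair result = optimizer.optimize(
                new MaxEval(10000),
                new ObjectiveFunction(adapter),                      // optimize the adapted function
                GoalType.MINIMIZE,
                new InitialGuess(adapter.boundedToUnbounded(start)), // start point mapped to the unbounded space
                new NelderMeadSimplex(3));

        double[] solution = adapter.unboundedToBounded(result.getPoint()); // map back to a, b, c
        System.out.println(java.util.Arrays.toString(solution));
    }
}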
My best attempt so far at implementing the COBYLA package is given below:
public static double[] Optimize(double[][] contractDataMatrix, double[] minData, double[] maxData, double[] modelData, String modelType, String weightType) {
    ObjectiveFunction objective = new ObjectiveFunction(contractDataMatrix, modelType, weightType);
    double rhobeg = 0.5;
    double rhoend = 1.0e-6;
    int iprint = 3;
    int maxfun = 3500;
    int n = modelData.length;
    Calcfc calcfc = new Calcfc() {
        @Override
        public double Compute(int n, int m, double[] x, double[] con) {
            con[0] = x[3] * x[3] - 2 * x[0] * x[1];
            System.out.println("constraint: " + (x[3] * x[3] - 2 * x[0] * x[1]));
            return objective.value(x);
        }
    };
    COBYLAExitStatus result = COBYLA.FindMinimum(calcfc, n, 1, modelData, rhobeg, rhoend, iprint, maxfun);
    return modelData;
}
The issue is that I am still getting illegal values in my optimization. As you can see, within the anonymous override of the Compute method I am printing out the value of my constraint, and the result is often negative. But shouldn't this value be constrained to be non-negative?
EDIT: I found the bug in my code, which was unrelated to the optimizer itself but rather my implementation.
Best,
Paul
You might want to consider an optimizer that is not available in Apache Commons Math. COBYLA is a derivative-free method for relatively small optimization problems (fewer than 100 variables) with nonlinear constraints. I have ported the original Fortran code to Java; the source code is here.
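For reference, here is a minimal sketch of driving that port, mirroring the Calcfc/FindMinimum signatures already used in the question above (the toy objective and constraint are placeholders). COBYLA treats a constraint as satisfied when the corresponding con[i] is non-negative:

// Toy problem: minimize (x0 - 2)^2 + (x1 - 1)^2 subject to x0 * x1 - 1 >= 0.
Calcfc calcfc = new Calcfc() {
    @Override
    public double Compute(int n, int m, double[] x, double[] con) {
        con[0] = x[0] * x[1] - 1.0;   // must end up >= 0 at the solution
        return Math.pow(x[0] - 2.0, 2) + Math.pow(x[1] - 1.0, 2);
    }
};
double[] x = {1.0, 1.0};              // starting guess, overwritten with the result
COBYLAExitStatus status = COBYLA.FindMinimum(calcfc, 2, 1, x, 0.5, 1.0e-8, 1, 3500);

Note that COBYLA is not a feasible-point method: intermediate evaluations may violate the constraints, and only the final point is expected to satisfy them (approximately, to within the tolerances).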
I want to use Apache Commons Math's DBSCANClusterer<T extends Clusterable> to perform a clustering using the DBSCAN algorithm, but with a custom distance metric, as my data points contain non-numerical values. This seems to have been easily achievable in the older, now deprecated version (note that the fully qualified name of that class is org.apache.commons.math3.stat.clustering.DBSCANClusterer<T>, whereas it is org.apache.commons.math3.ml.clustering.DBSCANClusterer<T> for the current release). In the older version, Clusterable would take a type parameter T describing the type of the data points being clustered, and the distance between two points would be defined by one's implementation of Clusterable.distanceFrom(T), e.g.:
class MyPoint implements Clusterable<MyPoint> {
    private String someStr = ...;
    private double someDouble = ...;

    @Override
    public double distanceFrom(MyPoint p) {
        // Arbitrary distance metric goes here, e.g.:
        double stringsEqual = this.someStr.equals(p.someStr) ? 0.0 : 10000.0;
        return stringsEqual + Math.sqrt(Math.pow(p.someDouble - this.someDouble, 2.0));
    }
}
In the current release, Clusterable is no longer parameterized. This means that one has to come up with a way of representing one's (potentially non-numerical) data points as a double[] and return that representation from getPoint(), e.g.:
class MyPoint implements Clusterable {
    private String someStr = ...;
    private double someDouble = ...;

    @Override
    public double[] getPoint() {
        double[] res = new double[2];
        res[1] = someDouble; // obvious
        res[0] = ...;        // some way of representing someStr as a double required
        return res;
    }
}
And then provide an implementation of DistanceMeasure that defines the custom distance function in terms of the double[] representations of the two points being compared, e.g.:
class CustomDistanceMeasure implements DistanceMeasure {
    @Override
    public double compute(double[] a, double[] b) {
        // Let's mimic the distance function from earlier, assuming that
        // a[0] is different from b[0] if the two 'someStr' variables were
        // different when their double representations were created.
        double stringsEqual = a[0] == b[0] ? 0.0 : 10000.0;
        return stringsEqual + Math.sqrt(Math.pow(a[1] - b[1], 2.0));
    }
}
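For reference, this is how I understand the two pieces above are meant to be wired together (a minimal sketch; eps and minPts are placeholder values):

import java.util.Collection;
import java.util.List;
import org.apache.commons.math3.ml.clustering.Cluster;
import org.apache.commons.math3.ml.clustering.DBSCANClusterer;

class ClusteringSketch {
    // MyPoint and CustomDistanceMeasure are the classes sketched above.
    static List<Cluster<MyPoint>> cluster(Collection<MyPoint> points) {
        DBSCANClusterer<MyPoint> clusterer =
                new DBSCANClusterer<>(2.0, 1, new CustomDistanceMeasure()); // eps = 2.0, minPts = 1
        return clusterer.cluster(points);
    }
}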
My data points are of the form (integer, integer, string, string):
class MyPoint {
    int i1;
    int i2;
    String str1;
    String str2;
}
And I want to use a distance function/metric that essentially says "if str1 and/or str2 differ for MyPoint mpa and MyPoint mpb, the distance is maximal, otherwise the distance is the Euclidean distance between the integers" as illustrated by the following snippet:
class Dist {
    static double distance(MyPoint mpa, MyPoint mpb) {
        if (!mpa.str1.equals(mpb.str1) || !mpa.str2.equals(mpb.str2)) {
            return Double.MAX_VALUE;
        }
        return Math.sqrt(Math.pow(mpa.i1 - mpb.i1, 2.0) + Math.pow(mpa.i2 - mpb.i2, 2.0));
    }
}
Questions:
1. How do I represent a String as a double in order to enable the above distance metric in the current release (v3.6.1) of Apache Commons Math? String.hashCode() is insufficient, as hash-code collisions would cause different strings to be considered equal. This seems like an unsolvable problem, as I'm essentially trying to create a unique mapping from an infinite set of strings to a finite set of numerical values (64-bit doubles).
2. As (1) seems impossible, am I misunderstanding how to use the library? If so, where did I take a wrong turn?
3. Is my only alternative to use the deprecated version for this kind of distance metric? If yes, (3a) why would the designers choose to make the library less general? Perhaps in favor of speed? Perhaps to get rid of the self-reference in class MyPoint implements Clusterable<MyPoint>, which some might consider bad design? (I realize that this might be too opinionated, so please disregard it if that is the case.) For the commons-math experts: (3b) what downsides are there to using the deprecated version other than forward compatibility (the deprecated version will be removed in 4.0)? Is it slower? Perhaps even incorrect?
Note: I am aware of ELKI, which is apparently popular among a set of SO users, but it does not fit my needs as it is marketed as a command-line and GUI tool rather than a Java library to be included in third-party applications:
You can even embed ELKI into your application (if you accept the
AGPL-3 license), but we currently do not (yet) recommend to do so,
because the API is still changing substantially. [...]
ELKI is not designed as embeddable library. It can be used, but it is
not designed to be used this way. ELKI has tons of options and
functionality, and this comes at a price, both in runtime (although it
can easily outperform R and Weka, for example!) memory usage and in
particular in code complexity.
ELKI was designed for research in data mining algorithms, not for
making them easy to include in arbitrary applications. Instead, if you
have a particular problem, you should use ELKI to find out which
approach works good, then reimplement that approach in an optimized
manner for your problem (maybe even in C++ then, to further reduce
memory and runtime).
I'm planning on creating a calculator for physics that would run off of a few equations. But I realized that it would be a lot of code.
With the equation v = x/t (just one of many I want to include), there are already three possible equations:
v = x/t, x = vt, t = x/v
What I was planning to have the program do is:
-Ask the user what equation they're going to use
-Ask what variable is missing
-Solve for it with a matching equation
My question is whether there is a way I can structure the code more efficiently. Without a better approach, it seems like I'd be writing a lot of very similar code for each variant of an equation.
I'm planning to create this using multiple classes, if it isn't clear.
There are two approaches I can think of that would make the most sense.
The first more traditional way would be to make a bunch of classes for each kind of equation you wanted to include.
public class Velocity implements Equation {
    public double solveT(double v, double x) {
        if (v != 0)
            return x / v;
        else
            return 0; //or whatever value is appropriate
    }

    public double solveX(double v, double t) {
        return v * t;
    }

    public double solveV(double t, double x) {
        if (t != 0)
            return x / t;
        else
            return 0; //or whatever value is appropriate
    }
}
This keeps all of your different equations separate, and if you define an empty Equation interface you can substitute different Equation objects as needed. The drawback is that you'd have a lot of classes to keep track of, and you would have to make sure that the Equation object you're trying to call methods on is the correct instance, e.g. not calling solveX() on a Density instance that doesn't have a solveX() method. However, having each class separate is a nice way to organize and debug.
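The empty Equation interface mentioned above could be as minimal as this (a marker type shared by all equation classes):

public interface Equation {
    // Intentionally empty: it only provides a common type so different
    // equation classes (Velocity, Density, ...) can be used interchangeably.
}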
The other approach is to use Java 8 lambdas:
interface TwoTermEq {
    double solve(double a, double b);
}

public class Calculator {
    public double solveTwoTermEq(TwoTermEq eq, double a, double b) {
        return eq.solve(a, b);
    }

    public static void main(String[] args) {
        TwoTermEq velSolveX = (t, v) -> t * v;
        TwoTermEq velSolveT = (x, v) -> v != 0.0 ? x / v : 0.0;
        TwoTermEq velSolveV = (x, t) -> t != 0.0 ? x / t : 0.0;
        //define as many equations as needed...
        Calculator c = new Calculator();
        //select which equation to run, collect user input into t and v
        double t = 3.0, v = 2.0; // placeholder values standing in for user input
        //do the calculation
        double result = c.solveTwoTermEq(velSolveX, t, v);
    }
}
This lets you define your equations all in one place and doesn't need a boatload of classes. You could similarly define interfaces for ThreeTermEq, FourTermEq, etc., as well as solveThreeTermEq(), solveFourTermEq(), etc. methods for the Calculator class. The drawback here is that it might become more difficult to maintain and organize, and I believe there's an upper limit on how big a class file can be; if a class file becomes too big it won't compile, which could happen if you've defined tons of equations.
For me the choice would come down to how I wanted to organize the code; if I wanted to only include a (relatively) small number of (relatively) simple equations, I would probably use lambdas. If I wanted to include every physics equation across as many physics topics as possible, I'd probably use classes.
Either way, there's going to have to be some similar code written for different permutations of an equation - I don't think there's really any way around that. You could try for a novel approach using a bunch of Objects to try to circumvent that, but I think that would be overwrought and not worth the effort; it's not like flipping variables around is hard.
You would probably be best off using some kind of symbolic math toolbox. Maple and MATLAB are good languages/environments for working with equations, as they recognize symbolic math and can manipulate equations fairly easily. Java does not have any built-in libraries for this, and it is difficult to find any libraries that would support a 'Computer Algebra System' to manipulate the equations for you. You might want to look at JAS (Java Algebra System), but I'm not sure it will do what you're looking for. Most likely, you will need to solve for each variable by hand and build functions for each individual expression.
If you're sticking with Java, this is how I would go about it. In terms of code organization, I would just create one Equation class that holds an array of all the variations of a given equation. The variations (i.e. V=I*R, I=V/R, R=V/I) would all be passed into the constructor for the class. A solve method could then be implemented that takes the requested variable to be solved for, plus the other variables and their values (distinguished by two arrays: one for characters and one for values).
Usage could be as follows:
Equation ohmsLaw = new Equation(new String[] {"V=I*R", "I=V/R", "R=V/I"});
double resistance = ohmsLaw.solve('R', new char[] {'I', 'V'}, new double[] {0.5, 12.0});
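One possible skeleton for such a class, matching the usage above, might look like this (hypothetical; the actual expression parsing is only stubbed out here):

import java.util.HashMap;
import java.util.Map;

public class Equation {
    private final String[] variations; // e.g. "V=I*R", "I=V/R", "R=V/I"

    public Equation(String[] variations) {
        this.variations = variations;
    }

    public double solve(char target, char[] knownVars, double[] knownValues) {
        // Find the variation whose left-hand side is the requested variable.
        for (String variation : variations) {
            if (variation.charAt(0) == target) {
                Map<Character, Double> values = new HashMap<>();
                for (int i = 0; i < knownVars.length; i++) {
                    values.put(knownVars[i], knownValues[i]);
                }
                return evaluate(variation.substring(variation.indexOf('=') + 1), values);
            }
        }
        throw new IllegalArgumentException("No variation solves for " + target);
    }

    private double evaluate(String rightHandSide, Map<Character, Double> values) {
        // The "little bit of symbolic parsing" mentioned below would go here.
        throw new UnsupportedOperationException("expression parsing not implemented");
    }
}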
You would need to write a little bit of symbolic parsing, but that makes it fun, right?
May or may not have been the answer you were looking for, but hopefully it's some help. Good luck!
I have a system of nonlinear dynamics which I wish to solve to optimality. I know how to do this in MATLAB, but I wish to implement it in Java, and for some reason I'm lost on how to do it there.
What I have is the following:
z(t) which returns states in a dynamic system.
z(t) = [state1(t),...,state10(t)]
The rate of change of this dynamic system is given by:
z'(t) = f(z(t),u(t),d(t)) = [dstate1(t)/dt,...,dstate10(t)/dt]
where u(t) and d(t) are external variables whose values I know.
In addition I have a function, let's denote it g(t), which is defined from a state variable:
g(t) = state4(t)/c1
where c1 is some constant.
Now I wish to solve the following unconstrained nonlinear system numerically:
g(t) - c2 = 0
f(z(t),u(t),0)= 0
where c2 is some constant. The above system can be seen as a simple f'(x) = 0 problem consisting of 11 equations and 1 unknown, and if I were to solve this in MATLAB I would do the following:
[output] = fsolve(@myDerivatives, someInitialGuess);
I am aware of the fact that Java doesn't come with any built-in solvers. So as I see it, there are two options for solving the above-mentioned problem:
Option 1: Do it myself: I could use numerical methods, e.g. Gauss-Newton or similar, to solve this system of nonlinear equations. However, I will start by using a Java toolbox first, and then move to a numerical method afterwards.
Option 2: Solvers (e.g. commons optim). This is the option I would like to look into. I have been looking into this toolbox; however, I have failed to find an exact example of how to actually use the MultivariateFunction evaluator and the numerical optimizer. Do any of you have experience in doing so?
Please let me know if you have any ideas or suggestions for solving this problem.
Thanks!
Please compare the following canonical problem types with what your original problem looks like:
A global optimization problem
minimize f(y)
is solved by looking for solutions of the derivatives system
0=grad f(y) or 0=df/dy (partial derivatives)
(the gradient is the column vector containing all partial derivatives), that is, you are computing the "flat" or horizontal points of f(y).
For optimization under constraints
minimize f(y,u) such that g(y,u)=0
one builds the Lagrangian functional
L(y,p,u) = f(y,u)+p*g(y,u) (scalar product)
and then compute the flat points of that system, that is
g(y,u)=0, dL/dy(y,p,u)=0, dL/du(y,p,u)=0
After that, as in the global optimization case, you have to determine the type of the flat point: maximum, minimum, or saddle point.
Optimal control problems have the structure (one of several equivalent variants)
minimize integral(0,T) f(t,y(t),u(t)) dt
such that y'(t)=g(t,y(t),u(t)), y(0)=y0 and h(T,y(T))=0
To solve it, one considers the Hamiltonian
H(t,y,p,u)=f(t,y,u)-p*g(t,y,u)
and obtains the transformed problem
y' = -dH/dp = g, (partial derivatives, gradient)
p' = dH/dy,
with boundary conditions
y(0)=y0, p(T)= something with dh/dy(T,y(T))
u(t) realizes the minimum in v -> H(t,y(t),p(t),v)
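Regarding the practical question of wiring this into Apache Commons Math (Option 2 in the question): there is no direct fsolve equivalent, but one common workaround is to minimize the sum of squared residuals of the system. A minimal sketch, assuming commons-math3 and a placeholder two-equation system (your actual f and g would go in residuals()):

import org.apache.commons.math3.analysis.MultivariateFunction;
import org.apache.commons.math3.optim.InitialGuess;
import org.apache.commons.math3.optim.MaxEval;
import org.apache.commons.math3.optim.PointValuePair;
import org.apache.commons.math3.optim.nonlinear.scalar.GoalType;
import org.apache.commons.math3.optim.nonlinear.scalar.ObjectiveFunction;
import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.NelderMeadSimplex;
import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.SimplexOptimizer;

public class FsolveLikeSketch {

    // Placeholder residuals; replace with g(z) - c2 and the components of f(z, u, 0).
    static double[] residuals(double[] z) {
        return new double[] {
            z[0] * z[0] + z[1] - 3.0,
            z[0] - z[1] + 1.0
        };
    }

    public static void main(String[] args) {
        // Sum of squared residuals: zero exactly at a root of the system.
        MultivariateFunction sumOfSquares = z -> {
            double s = 0.0;
            for (double r : residuals(z)) {
                s += r * r;
            }
            return s;
        };

        SimplexOptimizer optimizer = new SimplexOptimizer(1e-12, 1e-14);
        PointValuePair result = optimizer.optimize(
                new MaxEval(20000),
                new ObjectiveFunction(sumOfSquares),
                GoalType.MINIMIZE,
                new InitialGuess(new double[] {1.0, 1.0}),  // someInitialGuess
                new NelderMeadSimplex(2));

        System.out.println(java.util.Arrays.toString(result.getPoint())
                + " -> squared residual norm = " + result.getValue());
    }
}

Whether this is adequate depends on the conditioning of your system; a dedicated root-finder or least-squares solver may behave better, but the pattern of implementing MultivariateFunction and passing it through ObjectiveFunction is the same.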
I'm currently developing a Java-based library for network coding (http://en.wikipedia.org/wiki/Network_coding). This is very CPU-intensive, and I therefore need some help optimizing the encoding stage. What I'm essentially doing is creating random linear combinations of the original data, where addition is XOR and multiplication is a Galois-field multiplication (in GF(2^16)).
I've come as far as I'm capable with the optimizations. For instance I'm using tricks like this: http://groups.google.com/group/comp.dsp/browse_thread/thread/cba57ae9db9971fd/7cd21eec39ddae1a?hl=en&lnk=gst&q=Sarwate+Galois#7cd21eec39ddae1a to make the multiplications faster.
I'm therefore looking for tips on how to optimize this further. It's hard to profile, since the profilers I've used don't give any hints about which operation is the most expensive (e.g. is it the array lookup or the XOR?). So I'm at the point where I'm more or less randomly trying out different ideas and testing whether they improve the overall performance.
More specifically, some potential areas of improvement that I need help with are:
How can I make sure that Java can skip the bounds checking on the array operations?
How can I retrieve the code that actually executes after HotSpot is done optimizing?
Here's the core of the algorithm. It might be hard to understand out of context but if you see any unnecessarily expensive operations I'm doing then please let me know!
int messageFragmentStart = 0;
int messageFragmentEnd = fragmentCharSize;
int coefficientIndex = fragmentID * messageFragmentsPerDataBlock;
final int resultArrayIndexStart = fragmentID * fragmentCharSize;
for (int messageFragmentIndex = 0; messageFragmentIndex < messageFragmentsPerDataBlock; messageFragmentIndex++) {
    final int coefficientLogValue = coefficientLogValues[coefficientIndex++];
    int resultArrayIndex = resultArrayIndexStart;
    for (int i = messageFragmentStart; i < messageFragmentEnd; i++) {
        // GF(2^16) multiplication via log/exp tables: exp(log(coefficient) + log(data)).
        final int logSum = coefficientLogValue + logOfDataToEncode[i];
        final int messageMultipliedByCoefficient = expTable[logSum];
        // Accumulate with XOR (addition in GF(2^16)).
        resultArray[resultArrayIndex++] ^= messageMultipliedByCoefficient;
    }
    messageFragmentStart += fragmentCharSize;
    messageFragmentEnd = Math.min(messageFragmentEnd + fragmentCharSize, maxTotalLength);
}
You can't make Java forgo the bounds checking, as it's specified in the JLS. But in most cases the JIT is able to avoid it as long as the bounds check is simple (e.g. i < array.length); if not, there's no way to avoid it (well, I assume one could play with Unsafe?).
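As a hedged illustration (no guarantees), this is the loop shape HotSpot eliminates bounds checks for most reliably: keep the loop bound tied to the array indexed by the loop variable, and hold the arrays in locals/parameters. Names here mirror the question's code but are otherwise placeholders:

static void encodeFragment(int[] logOfDataToEncode, int[] expTable, int[] resultArray,
                           int coefficientLogValue, int resultStart) {
    int resultIndex = resultStart;
    for (int i = 0; i < logOfDataToEncode.length; i++) {
        // The check on logOfDataToEncode[i] matches the i < length pattern and is
        // usually eliminated; the checks on expTable[] and resultArray[] can only
        // go away if the JIT can prove those indices stay in range.
        resultArray[resultIndex++] ^= expTable[coefficientLogValue + logOfDataToEncode[i]];
    }
}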
For your second problem, there's this, which should fulfill that purpose just fine.
But anyhow, from your code it seems this problem is trivial to vectorize, and sadly the JVM isn't very good at that (if it does it at all). Hence implementing the same code in C/C++ using compiler intrinsics (you could even try the auto-vectorization of ICC/GCC) could lead to quite noticeable speedups, assuming we're not completely memory-bound. So I'd implement it in C++ and call it via JNI, at least for reference.
In the following line of code:
x = x.times(x).plus(y);
in what order are these expressions going to be executed?
Will it be like:
x = (x + y)*x
or x = (x^2) + y,
or something else and why?
Links to documentation about this specific subject would be highly appreciated, as I had no luck with my search. Apparently I don't know where to look or what to look for.
Thank you.
These are methods; the fact that they are called "plus" and "times" doesn't mean that they'll necessarily follow the behaviour of the built-in + and * operators.
So x.times(x) will be executed first. This will return a reference to an object, on which plus(y) will then be executed. The return value of this will then be assigned to x. It's equivalent to:
tmp = x.times(x);
x = tmp.plus(y);
Here's a link to the documentation that most likely contains the required answer (probably at 15.7). It's highly technical and verbose, but not inaccessible to most people (I believe).
However, it seems that you're just starting programming, so you'll be better off reading the other answers here and programming more to get an intuitive feel (not exactly a 'feel', as it's systematic and rigorous) for the order of operations etc.
Don't be afraid to write "throw-away" code (which you can incidentally save too) to find out things you don't know, if you don't know where else to look for the answer. You can always google more intensively or dive through the language specs at a later date. You'll learn faster this way. :)
One simple way to find out is to write something like this:
class Number {
    private int number;

    public Number(int x) {
        number = x;
    }

    public Number times(Number x) {
        System.out.println("times");
        return new Number(number * x.number);
    }

    public Number plus(Number x) {
        System.out.println("plus");
        return new Number(number + x.number);
    }
}
Method chains get executed from left to right, with each method using the result from the previous method, so it will be x = (x^2) + y.
What you're referring to in the algebraic expressions is operator precedence - evaluating multiplications before addition, for example. The Java compiler knows about these rules for expressions, and will generate code to evaluate them as you expect.
For method calling, there are no "special rules". When given x = x.times(x).plus(y); the compiler only knows that to evaluate x.times(x).plus(y), it first needs to know what x is, so it can call times on it. Likewise, it then needs to know what x.times(x) is so it can call the plus method on that result. Hence, this type of statement is parsed left to right: (x * x) + y.
Some languages allow the creation of functions that are "infix" with user-supplied precedence (such as Haskell: see http://www.haskell.org/tutorial/functions.html, section "Fixity declarations"). Java is, alas, not one of them.
It's going to be executed in left-to-right order, as
x = (x.times(x)).plus(y)
The other way:
x = x.(times(x).plus(y))
doesn't even make sense to me. You would have to rewrite it as
x = x.times(x.plus(y))
to make sense of it, but the fact that the second x is contained within times() while the y is outside it rules out that interpretation.
The reason the documentation doesn't say anything about this is probably that such expressions follow the normal rules for how a statement like a.b().c().d() is evaluated: from left to right. We start with a and call the function b() on it. Then, we call c() on the result of that call, and we call d() on the result of c(). Hence, x.times(x).plus(y) will first perform the multiplication, then the addition.