How to use SimplexSolver or SimplexOptimizer in Java Apache Commons Math?

I'm trying to use the apache commons math library version 3.5+ to solve an optimization problem. Basically, I'm trying to fit a (gamma) distribution to some data points. I can't seem to find any simple examples of how to use the new (version 3.5) optimization tools, such as SimplexSolver, SimplexOptimizer, or OptimizationData, to solve a trivial optimization problem.
Similar questions have been asked here before, but all the answers seem to be for older versions of Apache Math - in 3.5 things were restructured, and none of the example code I could find works.
Does anyone have a working example how to use the new optimizers or solvers? I'm most interested in SimplexOptimizer, but at this point anything would be useful.

Indeed, the optimizers may be hard to use: lots of parameters, of which different combinations are required for the different types of optimizers, and all of them hidden in the generic OptimizationData array that the optimizers receive. Unless you start matching the code against the papers it refers to, you can hardly get any results out of them at all.
When I wanted to give some of these solvers/optimizers a try, the main source of reliable, working "examples" for me turned out to be the unit tests of these classes, which are usually quite elaborate and cover many cases. For example, regarding the SimplexOptimizer, you may want to have a look at the test cases in org/apache/commons/math4/optim/nonlinear/scalar/noderiv/, containing the test classes SimplexOptimizerMultiDirectionalTest.java and SimplexOptimizerNelderMeadTest.java.
(Sorry, maybe this is not what you expected or hoped for, but... I found these tests tremendously helpful when I tried to figure out which OptimizationData these optimizers actually need.)
EDIT
Just for reference, a complete example, extracted from one of the basic unit tests:
import java.util.Arrays;
import org.apache.commons.math3.analysis.MultivariateFunction;
import org.apache.commons.math3.optim.InitialGuess;
import org.apache.commons.math3.optim.MaxEval;
import org.apache.commons.math3.optim.PointValuePair;
import org.apache.commons.math3.optim.nonlinear.scalar.GoalType;
import org.apache.commons.math3.optim.nonlinear.scalar.ObjectiveFunction;
import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.NelderMeadSimplex;
import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.SimplexOptimizer;
import org.apache.commons.math3.util.FastMath;

public class SimplexOptimizerExample
{
    public static void main(String[] args)
    {
        SimplexOptimizer optimizer = new SimplexOptimizer(1e-10, 1e-30);
        final FourExtrema fourExtrema = new FourExtrema();
        final PointValuePair optimum =
            optimizer.optimize(
                new MaxEval(100),
                new ObjectiveFunction(fourExtrema),
                GoalType.MINIMIZE,
                new InitialGuess(new double[]{ -3, 0 }),
                new NelderMeadSimplex(new double[]{ 0.2, 0.2 }));
        System.out.println(Arrays.toString(optimum.getPoint()) + " : "
            + optimum.getSecond());
    }

    private static class FourExtrema implements MultivariateFunction
    {
        // The following function has 4 local extrema.
        final double xM = -3.841947088256863675365;
        final double yM = -1.391745200270734924416;
        final double xP = 0.2286682237349059125691;
        final double yP = -yM;
        final double valueXmYm = 0.2373295333134216789769; // Local maximum.
        final double valueXmYp = -valueXmYm; // Local minimum.
        final double valueXpYm = -0.7290400707055187115322; // Global minimum.
        final double valueXpYp = -valueXpYm; // Global maximum.

        public double value(double[] variables)
        {
            final double x = variables[0];
            final double y = variables[1];
            return (x == 0 || y == 0) ? 0 : FastMath.atan(x)
                * FastMath.atan(x + 2) * FastMath.atan(y) * FastMath.atan(y)
                / (x * y);
        }
    }
}
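To connect this back to the original goal of fitting a gamma distribution: below is a hedged sketch of how the same optimizer could minimize a negative log-likelihood. The data array and the starting values are placeholders of my own, not from the original post, and the class name is illustrative.
import java.util.Arrays;
import org.apache.commons.math3.analysis.MultivariateFunction;
import org.apache.commons.math3.distribution.GammaDistribution;
import org.apache.commons.math3.optim.InitialGuess;
import org.apache.commons.math3.optim.MaxEval;
import org.apache.commons.math3.optim.PointValuePair;
import org.apache.commons.math3.optim.nonlinear.scalar.GoalType;
import org.apache.commons.math3.optim.nonlinear.scalar.ObjectiveFunction;
import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.NelderMeadSimplex;
import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.SimplexOptimizer;

public class GammaFitSketch
{
    public static void main(String[] args)
    {
        // Placeholder observations -- replace with your own data points.
        final double[] data = { 1.2, 0.8, 2.3, 1.7, 0.9, 3.1 };

        // Negative log-likelihood of a gamma(shape, scale) model for the data;
        // minimizing it yields the maximum-likelihood fit.
        MultivariateFunction negLogLikelihood = params -> {
            double shape = params[0];
            double scale = params[1];
            if (shape <= 0 || scale <= 0)
            {
                return Double.MAX_VALUE; // keep the simplex in the valid region
            }
            GammaDistribution g = new GammaDistribution(shape, scale);
            double nll = 0;
            for (double d : data)
            {
                nll -= Math.log(g.density(d));
            }
            return nll;
        };

        SimplexOptimizer optimizer = new SimplexOptimizer(1e-10, 1e-30);
        PointValuePair fit = optimizer.optimize(
            new MaxEval(1000),
            new ObjectiveFunction(negLogLikelihood),
            GoalType.MINIMIZE,
            new InitialGuess(new double[]{ 1, 1 }),
            new NelderMeadSimplex(new double[]{ 0.5, 0.5 }));
        System.out.println("shape/scale: " + Arrays.toString(fit.getPoint()));
    }
}
Nelder-Mead is a reasonable choice here because the likelihood is smooth but its derivatives are awkward to supply by hand; if the simplex drifts into shape <= 0 or scale <= 0, the guard above simply returns a huge value to push it back into the valid region.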

Related

Need help converting DCGAN to Java for Tensorflow for Java

I am trying to get DCGAN (Deep Convolutional Generative Adversarial Networks) to work with TensorFlow for Java.
I have added the necessary code to DCGAN's model.py, as shown below, to output a graph to be later used in TensorFlow for Java.
# at the beginning, to define where the model will be saved
self.load_dir = load_dir
self.models_dir = models_dir
graph = tf.Graph()
self.graph = graph
self.graph.as_default()

# near the end, where the session is run in order to build and save the model
# to be used in TensorFlow for Java. A model is saved every 200 samples,
# as defined by DCGAN's default settings.
steps = "training_steps-" + "{:08d}".format(step)
set_models_dir = os.path.join(self.models_dir, steps)
builder = tf.saved_model.builder.SavedModelBuilder(set_models_dir)
self.builder = builder
self.builder.add_meta_graph_and_variables(self.sess, [tf.saved_model.tag_constants.SERVING])
self.builder.save()
The code above outputs a graph that is then loaded by the following Java code:
package Main;

import java.awt.image.BufferedImage;
import java.io.File;
import java.util.Random;
import javax.imageio.ImageIO;
import org.tensorflow.Tensor;

public class DCGAN {
    public static void main(String[] args) throws Exception {
        String model_dir = "E:\\AgentWeb\\mnist-steps\\training_steps-00050000";
        //SavedModelBundle model = SavedModelBundle.load(model_dir, "serve");
        //Session sess = model.session();
        Random rand = new Random();
        int sample_num = 64;
        int z_dim = 100;
        float[][] gen_random = new float[64][100];
        for (int i = 0; i < sample_num; i++) {
            for (int j = 0; j < z_dim; j++) {
                gen_random[i][j] = (float) rand.nextGaussian();
            }
        }
        Tensor<Float> sample_z = Tensor.<Float>create(gen_random, Float.class);
        Tensor<Float> sample_inputs = Tensor.<Float>create(placeholder, Float.class);
        // placeholder is the tensor which I want to create after solving the problem below.
        //Tensor result = sess.runner().fetch("t_vars").feed("z", sample_z).feed("inputs", sample_inputs).run().get(3);
    }
}
(I have left some comments as I used them for debugging)
With this method I am stuck at a certain portion of translating the Python code to Java for use in TensorFlow for Java. In DCGAN's model.py, where the images are processed, there is the following code:
sample = [get_image(sample_file,
                    input_height=self.input_height,
                    input_width=self.input_width,
                    resize_height=self.output_height,
                    resize_width=self.output_width,
                    crop=self.crop,
                    grayscale=self.grayscale) for sample_file in sample_files]
which calls get_image in saved_utils.py as follows:
def get_image(image_path, input_height, input_width,
              resize_height=64, resize_width=64,
              crop=True, grayscale=False):
    image = imread(image_path, grayscale)
    return transform(image, input_height, input_width,
                     resize_height, resize_width, crop)
which then calls a method called imread as follows
def imread(path, grayscale=False):
    if (grayscale):
        return scipy.misc.imread(path, flatten=True).astype(np.float)
    else:
        # Reference: https://github.com/carpedm20/DCGAN-tensorflow/issues/162#issuecomment-315519747
        img_bgr = cv2.imread(path)
        # Reference: https://stackoverflow.com/a/15074748/
        img_rgb = img_bgr[..., ::-1]
        return img_rgb.astype(np.float)
My question is: I am unsure what the img_rgb = img_bgr[..., ::-1] part does, and how do I translate it for use in my Java file with TensorFlow for Java?
I am familiar with the way Python slices arrays, but I am unfamiliar with the three dots used there.
I did read the Stack Overflow reference in the comment, which mentions that it is similar to img[:, :, ::-1], but I am not really sure what exactly it is doing.
Any help is appreciated and thank you for taking your time to read this long post.
What imread and get_image basically do is:
1) read an image
2) convert it from BGR to RGB
3) convert it to floats
4) rescale the image
As for the slicing itself: the ... ellipsis stands for all the leading dimensions, and ::-1 reverses the last axis. For an H x W x 3 image array, img_bgr[..., ::-1] is therefore the same as img[:, :, ::-1] - it flips the channel order from BGR to RGB.
You can do this in Java either by using an imaging library, such as JMagick or AWT, or by using TensorFlow.
If you use TensorFlow, it is possible to run this preprocessing in eager mode or by building and running a small graph. For example, given tf, an instance of org.tensorflow.op.Ops:
tf.image.decode* can read the content of an image (you need to know the type of your image, though, to pick the right operation)
tf.reverse can reverse the values in your channel dimension (BGR to RGB)
tf.dtypes.cast can convert the image to floats
tf.image.resizeBilinear can rescale your image
If you'd rather not involve TensorFlow for this step, a plain-Java sketch follows.
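To make the semantics concrete, here is a minimal plain-Java sketch of what img_bgr[..., ::-1] amounts to (the method name is mine, and the image is assumed to already be decoded into an H x W x C float array):
// Reverse the channel (last) axis of an H x W x C image array: BGR -> RGB.
// This is the plain-Java equivalent of NumPy's img_bgr[..., ::-1].
static float[][][] reverseChannels(float[][][] bgr) {
    int h = bgr.length, w = bgr[0].length, c = bgr[0][0].length;
    float[][][] rgb = new float[h][w][c];
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            for (int k = 0; k < c; k++) {
                rgb[y][x][k] = bgr[y][x][c - 1 - k];
            }
        }
    }
    return rgb;
}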

Trying to convert this formula into an arithmetic expression in Java

I'm trying to take user input in the form of myMonthlyPayment, myAnnualInterestRate, and myPrincipal in order to calculate the number of months needed to pay off debt, using the formula I've attached to this post - that is, months = (ln(M) - ln(M - (r/1200) * P)) / ln(1 + r/1200), where M is the monthly payment, r the annual interest rate in percent, and P the principal. What I have in Eclipse for the formula right now is:
monthsNeeded = ((Math.log(myMonthlyPayment) - Math.log(myMonthlyPayment)
- ((myAnnualInterestRate / 1200.0) * myPrincipal))
/ ((Math.log(myAnnualInterestRate) / 1200.0) + 1.0));
I should be getting an output of 79 months with the inputs I'm using but instead I'm getting -62. I know the formula is correct, I'm almost positive I've made a mistake somewhere in the translation of it into Java. If someone could point it out that would be greatly appreciated!
So I've fixed it, with a sample input and output.
I didn't put much effort into making this code beautiful, but you can see that even separating the formula into three parts using method extraction (although, lacking the domain knowledge, I didn't know what to name them) makes the code easier to understand.
public class Example {
    public static void main(String[] args) {
        double myMonthlyPayment = 2000;
        double myAnnualInterestRate = 5;
        double myPrincipal = 200000;
        System.out.println(a(myMonthlyPayment));
        System.out.println(b(myPrincipal, myAnnualInterestRate, myMonthlyPayment));
        System.out.println(c(myAnnualInterestRate));
        double monthsNeeded = (a(myMonthlyPayment) - b(myPrincipal, myAnnualInterestRate, myMonthlyPayment))
                / c(myAnnualInterestRate);
        System.out.println(monthsNeeded);
    }

    // the denominator: ln(annualRate/1200 + 1)
    private static double c(double myAnnualInterestRate) {
        return Math.log((myAnnualInterestRate / 1200.0) + 1);
    }

    // ln(monthlyPayment - (annualRate/1200) * principal)
    private static double b(double myPrincipal, double myAnnualInterestRate, double myMonthlyPayment) {
        return Math.log(myMonthlyPayment - (myAnnualInterestRate / 1200.0) * myPrincipal);
    }

    // ln(monthlyPayment)
    private static double a(double myMonthlyPayment) {
        return Math.log(myMonthlyPayment);
    }
}
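As a quick sanity check of the three pieces with the sample inputs above: a(2000) = ln(2000) ≈ 7.6009; b computes ln(2000 - (5/1200) * 200000) = ln(1166.67) ≈ 7.0619; and c computes ln(1 + 5/1200) ≈ 0.0041580. The result is (7.6009 - 7.0619) / 0.0041580 ≈ 129.6, i.e. about 130 months to pay off this particular loan.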
I think this is what you're looking for:
monthsNeeded = (Math.log(myMonthlyPayment) - Math.log(myMonthlyPayment - myAnnualInterestRate / 1200d * myPrincipal)) / Math.log(myAnnualInterestRate / 1200d + 1);
It seems that, in your solution, you weren't subtracting myAnnualInterestRate / 1200 * myPrincipal inside your second Math.log(...). You had also left some calculations outside of Math.log(...) in the bottom half of your equation.
If you have an equation that does an operation inside a natural log, when you convert that equation to Java code, the operation needs to still be done, inside the natural log:
ln(someNumber + 10)
would be converted to:
Math.log(someNumber + 10)
NOT:
Math.log(someNumber) + 10
Hope this helps and good luck. :)

Why is the eval class giving me a casting error from int to double?

I am trying to make a method that takes a string formula, and solves the integral of that formula by doing a Riemann's sum with very small intervals. I am using the ScriptEngine and ScriptEngineManager classes to evaluate the function (with the eval() method). For some reason, I am getting this error:
Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
at sum.integral(sum.java:31)
at sum.main(sum.java:13)
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class sum {
    // testing method
    public static void main(String[] args) throws ScriptException {
        double x = integral("5*x^2", 0, 5);
        System.out.println(x);
    }

    public static double integral(String function, double lower, double upper) throws ScriptException {
        double total = 0;
        ScriptEngineManager mgr = new ScriptEngineManager();
        ScriptEngine engine = mgr.getEngineByName("JavaScript");
        // Evaluates the function from lower to upper in .001 steps, adding each value to the total.
        for (double i = lower; i < upper; i += .001) {
            // evaluates the function at this point
            engine.put("x", i);
            total += (double) engine.eval(function);
        }
        return total;
    }
}
Nashorn uses optimistic typing (since JDK 8u40), so it will use integers when doubles are not needed. Thus, you cannot count on it returning a Double.
Also, 5*x^2 means "five times x, XOR two" in JavaScript: for x = 3 it evaluates 5*3 = 15, then 15 ^ 2 = 13. The ** exponentiation operator is defined in newer versions of the JavaScript language, but Nashorn doesn't support it yet.
If you change your JavaScript code to 5*x*x it will work, but it would be safer to do:
total += 0.001 * ((Number)engine.eval(function)).doubleValue();
Compiling Frequently Used Code
Since you call this function repeatedly in a loop, a best practice is to compile the function in advance. This performance optimization is not strictly necessary, but as things stand the engine has to compile your function every time (although it may use a cache to help with that).
import javax.script.Compilable;
import javax.script.CompiledScript;
import javax.script.Invocable;
import javax.script.ScriptContext;

CompiledScript compiledScript = ((Compilable) engine)
        .compile("function func(x) { return " + function + " }");
compiledScript.eval(compiledScript.getEngine()
        .getBindings(ScriptContext.ENGINE_SCOPE));
Invocable funcEngine = (Invocable) compiledScript.getEngine();
// . . .
total += 0.001 * ((Number) funcEngine.invokeFunction("func", i)).doubleValue();
Using ES6 Language Features
In the future, when Nashorn does support the ** operator, if you want to use it you may need to turn on ES6 features like this:
import jdk.nashorn.api.scripting.NashornScriptEngineFactory;
NashornScriptEngineFactory factory = new NashornScriptEngineFactory();
ScriptEngine enjin = factory.getScriptEngine("--language=es6");
Or like this:
java -Dnashorn.args=--language=es6
* Edited to account for the mathematical fix pointed out in the comments.
Your JS snippet returns an Integer (*), because x^2 is not the correct way to get a power of 2 in JavaScript. Try 5*Math.pow(x,2) instead, and the expression will return a Double.
In JavaScript, the ^ operator is bitwise XOR.
Also the loop to compute the integral is wrong, you need to multiply by rectangle width:
double delta = 0.001;
for (double i = lower; i < upper; i += delta) {
    // evaluates the function at the left edge of this interval
    engine.put("x", i);
    total += delta * ((Number) engine.eval(function)).doubleValue();
}
(*) See David's answer for a tentative explanation. But in the comments, @A.Sundararajan provides evidence against this. I have not investigated the exact reason; I only observed that I got an Integer, and was guessing that the use of a bitwise operation in the expression (from the OP's original code) was triggering a conversion to integer. I originally edited my post to include the fix for the math error, but David's newer answer (by about 4 minutes ^^) is more complete for the original question and should remain the accepted answer, IMHO.
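Putting the two fixes together (reading the result as a Number and multiplying by the rectangle width), a corrected version of the question's integral method could look like this sketch:
public static double integral(String function, double lower, double upper) throws ScriptException {
    double delta = 0.001; // rectangle width
    double total = 0;
    ScriptEngineManager mgr = new ScriptEngineManager();
    ScriptEngine engine = mgr.getEngineByName("JavaScript");
    for (double i = lower; i < upper; i += delta) {
        engine.put("x", i);
        // Number covers both the Integer and Double results Nashorn may return
        total += delta * ((Number) engine.eval(function)).doubleValue();
    }
    return total;
}
Called as integral("5*Math.pow(x,2)", 0, 5), this returns approximately 208.3, close to the exact value of 625/3 ≈ 208.33.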

In Apache Spark, can I easily repeat/nest a SparkContext.parallelize?

I am trying to model a genetics problem we are trying to solve, building up to it in steps. I can successfully run the PiAverage example from the Spark examples. That example "throws darts" at a circle (10^6 of them in our case) and counts the number that "land in the circle" to estimate pi.
Let's say I want to repeat that process 1000 times (in parallel) and average all those estimates. I am trying to see the best approach - it seems like there would have to be two calls to parallelize? Nested calls? Is there not a way to chain map or reduce calls together? I can't see it.
I want to know the wisdom of something like the idea below. I thought of tracking the resulting estimates using an accumulator. jsc is my SparkContext, and the full code of a single run is at the end of the question - thanks for any input!
Accumulator<Double> accum = jsc.accumulator(0.0);
// make a list 1000 long to pass to parallelize (no for loops in Spark, right?)
List<Integer> numberOfEstimates = new ArrayList<Integer>(HOW_MANY_ESTIMATES);
// pass this "dummy list" to parallelize, which then
// calls a pieceOfPI method to produce each individual estimate
// accumulating the estimates. PieceOfPI would contain a
// parallelize call too with the individual test in the code at the end
jsc.parallelize(numberOfEstimates).foreach(accum.add(pieceOfPI(jsc, numList, slices, HOW_MANY_ESTIMATES)));
// get the value of the total of PI estimates and print their average
double totalPi = accum.value();
// output the average of averages
System.out.println("The average of " + HOW_MANY_ESTIMATES + " estimates of Pi is " + totalPi / HOW_MANY_ESTIMATES);
None of the matrix-based or other answers I have seen on SO seem to address this specific question. I have done several searches, but I am not seeing how to do this without "parallelizing the parallelization." Is that a bad idea?
(and yes I realize mathematically I could just do more estimates and effectively get the same results :) Trying to build a structure my boss wants, thanks again!
I have put my entire single-test program here if that helps, sans an accumulator I was testing out. The core of this would become PieceOfPI():
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.SparkConf;

public class PiAverage implements Serializable {

    public static void main(String[] args) {
        PiAverage pa = new PiAverage();
        pa.go();
    }

    public void go() {
        // should be a parameter, like all these finals should be:
        // int slices = (args.length == 1) ? Integer.parseInt(args[0]) : 2;
        final int SLICES = 16;
        // how many "darts" are thrown at the circle to get one single Pi estimate
        final int HOW_MANY_DARTS = 1000000;
        // how many "dartboards" to collect to average the Pi estimate,
        // which we hope converges on the real Pi
        final int HOW_MANY_ESTIMATES = 1000;

        SparkConf sparkConf = new SparkConf().setAppName("PiAverage")
                .setMaster("local[4]");
        JavaSparkContext jsc = new JavaSparkContext(sparkConf);

        // set up a "dummy" ArrayList of size HOW_MANY_DARTS -- how many darts to throw
        List<Integer> throwsList = new ArrayList<Integer>(HOW_MANY_DARTS);
        for (int i = 0; i < HOW_MANY_DARTS; i++) {
            throwsList.add(i);
        }
        // set up a "dummy" ArrayList of size HOW_MANY_ESTIMATES
        List<Integer> numberOfEstimates = new ArrayList<Integer>(HOW_MANY_ESTIMATES);
        for (int i = 0; i < HOW_MANY_ESTIMATES; i++) {
            numberOfEstimates.add(i);
        }

        JavaRDD<Integer> dataSet = jsc.parallelize(throwsList, SLICES);
        long totalPi = dataSet.filter(new Function<Integer, Boolean>() {
            public Boolean call(Integer i) {
                double x = Math.random();
                double y = Math.random();
                return x * x + y * y < 1;
            }
        }).count();

        System.out.println(
                "The average of " + HOW_MANY_DARTS + " estimates of Pi is "
                        + 4 * totalPi / (double) HOW_MANY_DARTS);
        jsc.stop();
        jsc.close();
    }
}
Let me start with your "background question". Transformation operations like map, join, groupBy, etc. fall into two categories: those that require a shuffle of data as input from all the partitions, and those that don't. Operations like groupBy and join require a shuffle, because you need to bring together all records from all of the RDD's partitions that share the same keys (think of how SQL JOIN and GROUP BY operations work). On the other hand, map, flatMap, filter, etc. don't require shuffling, because each operation works fine on the output of the previous step's partition. They work on single records at a time, not on groups of them with matching keys. Hence, no shuffling is necessary.
This background is necessary to understand that an "extra map" does not add significant overhead. A sequence of operations like map, flatMap, etc. is "squashed" together into a "stage" (shown when you look at the details of a job in the Spark web console), so that only one RDD is materialized - the one at the end of the stage. A short illustration follows.
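As a sketch (the variable names are illustrative, assuming an existing JavaRDD<Integer> rdd and the usual Spark Java API imports):
// map and filter are narrow transformations: Spark fuses them into a single
// stage, and no intermediate RDD is materialized between them.
JavaRDD<Integer> narrow = rdd.map(x -> x + 1).filter(x -> x % 2 == 0);
// groupBy must gather equal keys from all partitions, so it triggers a
// shuffle and starts a new stage.
JavaPairRDD<Integer, Iterable<Integer>> grouped = narrow.groupBy(x -> x % 10);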
On to your first question. I wouldn't use an accumulator for this. They are intended for "side-band" data, like counting how many bad lines you parsed. In this example, you might use accumulators to count how many (x,y) pairs were inside the radius of 1 vs. outside, as an example.
The JavaPiSpark example in the Spark distribution is about as good as it gets. You should study why it works: it's the right dataflow model for Big Data systems. You could also use "aggregators" - in the Javadocs, click the "index" and look at the agg, aggregate, and aggregateByKey functions. However, they are no easier to understand and are not necessary here. They do provide greater flexibility than map-then-reduce, so they are worth knowing.
The problem with your code is that you are effectively trying to tell Spark what to do, rather than expressing your intent and letting Spark optimize how it does it for you. A sketch of the intent-expressing style is shown below.
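For instance (a sketch reusing the constants from your code, not tested against your exact setup): parallelize over the estimates themselves, let each task throw its darts in an ordinary loop, and average the resulting RDD of estimates - no nested parallelize needed.
List<Integer> estimateIndices = new ArrayList<Integer>(HOW_MANY_ESTIMATES);
for (int i = 0; i < HOW_MANY_ESTIMATES; i++) {
    estimateIndices.add(i);
}
// Each task computes one complete Pi estimate locally, so Spark only
// parallelizes across estimates and no nested context is required.
JavaRDD<Double> piEstimates = jsc.parallelize(estimateIndices, SLICES)
    .map(e -> {
        long hits = 0;
        for (int d = 0; d < HOW_MANY_DARTS; d++) {
            double x = Math.random();
            double y = Math.random();
            if (x * x + y * y < 1) {
                hits++;
            }
        }
        return 4.0 * hits / HOW_MANY_DARTS;
    });
double average = piEstimates.reduce(Double::sum) / HOW_MANY_ESTIMATES;
System.out.println("The average of " + HOW_MANY_ESTIMATES + " estimates of Pi is " + average);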
Finally, I suggest you buy and study O'Reilly's "Learning Spark". It does a good job explaining the internal details, like staging, and it shows lots of example code you can use, too.

Java: Apache Regression gives me absolutely wrong regression parameters

I wanted to get regression parameters by using Apache's Commons Math3 library and the OLSMultipleLinearRegression.
The regression should be polynomial with a power of 2.
It worked fine with test data, but when I use this experimental data the method gives me an absolutely wrong regression.
public static void poly() {
    OLSMultipleLinearRegression quadRegression = new OLSMultipleLinearRegression();
    double[] y = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,
        26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,
        51,52,53,54,55,56,57,58,59 };
    double[][] x = { {1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{0.95,0.90},{0.96,0.91},{0.96,0.92},{0.96,0.92},{0.96,0.92},{0.92,0.84},{0.92,0.85},
        {0.92,0.86},{0.93,0.86},{0.93,0.87},{0.89,0.80},{0.90,0.81},{0.90,0.81},{0.90,0.82},{0.89,0.80},{0.90,0.81},{0.90,0.82},{0.91,0.82},{0.91,0.83},{0.90,0.80},{0.90,0.80},{0.90,0.81},{0.91,0.82},
        {0.89,0.79},{0.89,0.80},{0.90,0.80},{0.90,0.81},{0.88,0.77},{0.88,0.77},{0.88,0.78},{0.88,0.78},{0.86,0.73},{0.86,0.74},{0.86,0.74},{0.86,0.74},{0.84,0.71},{0.85,0.72},{0.85,0.72},{0.85,0.73},
        {0.84,0.71},{0.84,0.71},{0.84,0.71},{0.84,0.71},{0.83,0.69},{0.83,0.69},{0.83,0.69},{0.82,0.68},{0.82,0.68},{0.82,0.68},{0.82,0.68} };
    quadRegression.newSampleData(y, x);
    quadRegression.setNoIntercept(false);
    double[] results = quadRegression.estimateRegressionParameters();
}
For this input data I get the equation y = 117.54x² - 504.83x + 389.088, which would result in a y-value of 379,760.85 for x = 59 - way beyond my input values.
So I have either handled the class absolutely wrongly or I am stuck in a mathematical pitfall.
If someone could please explain to me what I did wrong or misinterpreted - this problem is driving me insane.
