Smoothing experimental data with piecewise functions

I have a data set of a single measurement vs. time (about 3000 points). I'd like to smooth the data by fitting a curve through it. The experiment is a multi-stage physical process so I am pretty sure a single polynomial won't fit the whole set.
Therefore I'm looking at a piecewise series of polynomials, and I'd like to specify how many polynomials are used. This seems to me to be a fairly straightforward thing, and I was hoping that there would be some pre-built library to do it. I've seen org.apache.commons.math3.fitting.PolynomialFitter in Apache Commons Math, but it seems to work only with a single polynomial.
Can anyone suggest the best way to do this? Java preferred but I could work in Python.

If you're looking for local regression, Commons Math implements it as LoessInterpolator. You'll get the end result as a "spline," a smooth sequence of piecewise cubic polynomials.
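To illustrate what local regression does, here is a minimal self-contained sketch in plain Java. It is not the Commons Math implementation: the class name LocalLinearSmoother, the bandwidth parameter (the fraction of points used for each local fit), and the omission of robustness iterations are assumptions made for brevity. Each smoothed value comes from a straight line fitted to the nearest points with tricube weights.

```java
public class LocalLinearSmoother {

    /**
     * Smooths y at every x[i] with a weighted straight-line fit over the
     * nearest points. Assumes x is sorted in increasing order.
     *
     * @param bandwidth fraction of all points used in each local fit (0 < bandwidth <= 1)
     */
    public static double[] smooth(double[] x, double[] y, double bandwidth) {
        int n = x.length;
        // Number of points in each local fit (at least 2, at most n).
        int k = Math.min(n, Math.max(2, (int) Math.ceil(bandwidth * n)));
        double[] smoothed = new double[n];
        int left = 0; // first index of the current window
        for (int i = 0; i < n; i++) {
            // Slide the window so it holds the k nearest neighbours of x[i].
            while (left + k < n && x[left + k] - x[i] < x[i] - x[left]) {
                left++;
            }
            int right = left + k - 1;
            double maxDist = Math.max(x[i] - x[left], x[right] - x[i]);

            // Accumulate weighted sums for the line y = a + b * (x - x[i]).
            double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
            for (int j = left; j <= right; j++) {
                double dx = x[j] - x[i];
                double u = maxDist > 0 ? Math.abs(dx) / maxDist : 0.0;
                double t = 1.0 - u * u * u;
                double w = t * t * t; // tricube weight
                sw += w;
                swx += w * dx;
                swy += w * y[j];
                swxx += w * dx * dx;
                swxy += w * dx * y[j];
            }
            // The intercept a is the fitted value at x[i]; fall back to the
            // weighted mean if the local fit is degenerate.
            double denom = sw * swxx - swx * swx;
            smoothed[i] = Math.abs(denom) < 1e-12
                    ? swy / sw
                    : (swy * swxx - swx * swxy) / denom;
        }
        return smoothed;
    }
}
```

A call such as LocalLinearSmoother.smooth(times, values, 0.1) would use about 10% of the points for each local fit (times and values are placeholder names). With Commons Math on the classpath, new LoessInterpolator().smooth(times, values) should return smoothed values directly, and interpolate(times, values) returns the spline.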

finmath lib has a class called Curve that implements several interpolation schemes (linear, spline, Akima, etc.). A curve can expose its points as parameters to a solver, so you can optimize over all of them at once (for example with a Levenberg-Marquardt optimizer) to minimize the distance between your data and the curve, measured in a norm of your choice.
This is what the "Curve Calibration" in the library does, an application from mathematical finance. If the curve has as many points (parameters) as there are data points, you will likely get a perfect fit. If it has fewer points than data, you get the best fit in your chosen norm.
The Levenberg-Marquardt optimizer in finmath lib is multithreaded and very fast (more than 200 points are fitted in well under one second).
See
the Curve class at http://svn.finmath.net/finmath%20lib/trunk/src/main/java/net/finmath/marketdata/model/curves/Curve.java
and the LM optimizer at http://svn.finmath.net/finmath%20lib/trunk/src/main/java/net/finmath/optimizer/LevenbergMarquardt.java
Disclaimer: I am the/a developer of that library.
Note: I also like commons-math, but for the curve fitting I don't use it (yet), since I need(ed) some fitting properties specific to my application (mathematical finance).
(Edit)
Here is a small demo. (Note: This demo requires finmath-lib 1.2.13 or the current 1.2.12-SNAPSHOT, available at mvn.finmath.net or github.com/finmath/finmath-lib; it is not compatible with 1.2.12.)
package net.finmath.tests.marketdata.curves;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import org.junit.Test;
import net.finmath.marketdata.model.curves.Curve;
import net.finmath.marketdata.model.curves.CurveInterface;
import net.finmath.optimizer.LevenbergMarquardt;
import net.finmath.optimizer.SolverException;
/**
* A short demo on how to use {@link net.finmath.marketdata.model.curves.Curve}.
*
* @author Christian Fries
*/
public class CurveTest {
private static NumberFormat numberFormat = new DecimalFormat("0.0000");
/**
* Run a short demo on how to use {@link net.finmath.marketdata.model.curves.Curve}.
*
* @param args Not used.
* @throws SolverException Thrown if optimizer fails.
* @throws CloneNotSupportedException Thrown if curve cannot be cloned for optimization.
*/
public static void main(String[] args) throws SolverException, CloneNotSupportedException {
(new CurveTest()).testCurveFitting();
}
/**
* Tests fitting of curve to given data.
*
* @throws SolverException Thrown if optimizer fails.
* @throws CloneNotSupportedException Thrown if curve cannot be cloned for optimization.
*/
@Test
public void testCurveFitting() throws SolverException, CloneNotSupportedException {
/*
* Build a curve (initial guess for our fitting problem, defines the times).
*/
Curve.CurveBuilder curveBuilder = new Curve.CurveBuilder();
curveBuilder.setInterpolationMethod(Curve.InterpolationMethod.LINEAR);
curveBuilder.setExtrapolationMethod(Curve.ExtrapolationMethod.LINEAR);
curveBuilder.setInterpolationEntity(Curve.InterpolationEntity.VALUE);
// Add some points - which will not be fitted
curveBuilder.addPoint(-1.0 /* time */, 1.0 /* value */, false /* isParameter */);
curveBuilder.addPoint( 0.0 /* time */, 1.0 /* value */, false /* isParameter */);
// Add some points - which will be fitted
curveBuilder.addPoint( 0.5 /* time */, 2.0 /* value */, true /* isParameter */);
curveBuilder.addPoint( 0.75 /* time */, 2.0 /* value */, true /* isParameter */);
curveBuilder.addPoint( 1.0 /* time */, 2.0 /* value */, true /* isParameter */);
curveBuilder.addPoint( 2.2 /* time */, 2.0 /* value */, true /* isParameter */);
curveBuilder.addPoint( 3.0 /* time */, 2.0 /* value */, true /* isParameter */);
final Curve curve = curveBuilder.build();
/*
* Create the data to which the curve should be fitted
*/
final double[] givenTimes = { 0.0, 0.5, 0.75, 1.0, 1.5, 1.75, 2.5 };
final double[] givenValues = { 3.5, 12.3, 13.2, 7.5, 5.5, 2.9, 4.4 };
/*
* Find a best fitting curve.
*/
// Define the objective function
LevenbergMarquardt optimizer = new LevenbergMarquardt(
curve.getParameter() /* initial parameters */,
givenValues /* target values */,
100, /* max iterations */
Runtime.getRuntime().availableProcessors() /* max number of threads */
) {
@Override
public void setValues(double[] parameters, double[] values) throws SolverException {
CurveInterface curveGuess = null;
try {
curveGuess = curve.getCloneForParameter(parameters);
} catch (CloneNotSupportedException e) {
throw new SolverException(e);
}
for(int valueIndex=0; valueIndex<values.length; valueIndex++) {
values[valueIndex] = curveGuess.getValue(givenTimes[valueIndex]);
}
}
};
// Fit the curve (find best parameters)
optimizer.run();
CurveInterface fittedCurve = curve.getCloneForParameter(optimizer.getBestFitParameters());
// Print out fitted curve
for(double time = -2.0; time < 5.0; time += 0.1) {
System.out.println(numberFormat.format(time) + "\t" + numberFormat.format(fittedCurve.getValue(time)));
}
// Check fitted curve
double errorSum = 0.0;
for(int pointIndex = 0; pointIndex<givenTimes.length; pointIndex++) {
errorSum += fittedCurve.getValue(givenTimes[pointIndex]) - givenValues[pointIndex];
}
System.out.println("Mean deviation: " + errorSum);
/*
* Test: With the given data, the fit cannot overcome the error of -2.5 at 0.0.
* Hence we test if the mean deviation is -2.5 (the optimizer reduces the variance)
*/
org.junit.Assert.assertTrue(Math.abs(errorSum - -2.5) < 1E-5);
}
}

Related

How to enable linear relaxation outputs

I have a rather complex MILP, but the main problem is the number of continuous variables, not the number of binaries. I just "hard-coded" the linear relaxation to understand its output, and it takes approx. 10-15 minutes to solve (which is not extremely surprising). If I run the MILP with outputs, I don't see anything happening for the first 10 minutes, because it takes those 10 minutes to construct a first integer-feasible solution. So it would help to be able to enable the same outputs I am seeing when solving the linear relaxation "manually" (so something like Iteration: 1 Dual objective = 52322816.412592) within the B&B output.
Is this possible? I googled a bit, but I only found solutions for steering the solution algorithm, or for deriving linear relaxations using callbacks, while I am interested in a "simple" output of the intermediate steps.

It sounds like you are asking for extra detailed logging during the linear relaxation part of the solve during the B&B. Have a look at the CPLEX parameter settings like IloCplex.Param.MIP.Display (try setting this to 5) and also IloCplex.Param.Simplex.Display (try setting to 1 or 2).

Within Java, you could rely on IloConversion objects, which allow you to locally change the type of one or more variables.
See the sample AdMIPex6.java:
/* --------------------------------------------------------------------------
* File: AdMIPex6.java
* Version 20.1.0
* --------------------------------------------------------------------------
* Licensed Materials - Property of IBM
* 5725-A06 5725-A29 5724-Y48 5724-Y49 5724-Y54 5724-Y55 5655-Y21
* Copyright IBM Corporation 2001, 2021. All Rights Reserved.
*
* US Government Users Restricted Rights - Use, duplication or
* disclosure restricted by GSA ADP Schedule Contract with
* IBM Corp.
* --------------------------------------------------------------------------
*
* AdMIPex6.java -- Solving a model by passing in a solution for the root node
* and using that in a solve callback
*
* To run this example, command line arguments are required:
* java AdMIPex6 filename
* where
* filename Name of the file, with .mps, .lp, or .sav
* extension, and a possible additional .gz
* extension.
* Example:
* java AdMIPex6 mexample.mps.gz
*/
import ilog.concert.*;
import ilog.cplex.*;
public class AdMIPex6 {
static class Solve extends IloCplex.SolveCallback {
boolean _done = false;
IloNumVar[] _vars;
double[] _x;
Solve(IloNumVar[] vars, double[] x) { _vars = vars; _x = x; }
public void main() throws IloException {
if ( !_done ) {
setStart(_x, _vars, null, null);
_done = true;
}
}
}
public static void main(String[] args) {
try (IloCplex cplex = new IloCplex()) {
cplex.importModel(args[0]);
IloLPMatrix lp = (IloLPMatrix)cplex.LPMatrixIterator().next();
IloConversion relax = cplex.conversion(lp.getNumVars(),
IloNumVarType.Float);
cplex.add(relax);
cplex.solve();
System.out.println("Relaxed solution status = " + cplex.getStatus());
System.out.println("Relaxed solution value = " + cplex.getObjValue());
double[] vals = cplex.getValues(lp.getNumVars());
cplex.use(new Solve(lp.getNumVars(), vals));
cplex.delete(relax);
cplex.setParam(IloCplex.Param.MIP.Strategy.Search,
IloCplex.MIPSearch.Traditional);
if ( cplex.solve() ) {
System.out.println("Solution status = " + cplex.getStatus());
System.out.println("Solution value = " + cplex.getObjValue());
}
}
catch (IloException e) {
System.err.println("Concert exception caught: " + e);
}
}
}
If you use OPL, you could have a look at "Relax integrity constraints and dual value":
int nbKids=300;
float costBus40=500;
float costBus30=400;
dvar int+ nbBus40;
dvar int+ nbBus30;
minimize
costBus40*nbBus40 +nbBus30*costBus30;
subject to
{
ctKids:40*nbBus40+nbBus30*30>=nbKids;
}
main {
var status = 0;
thisOplModel.generate();
if (cplex.solve()) {
writeln("Integer Model");
writeln("OBJECTIVE: ",cplex.getObjValue());
}
// relax integrity constraint
thisOplModel.convertAllIntVars();
if (cplex.solve()) {
writeln("Relaxed Model");
writeln("OBJECTIVE: ",cplex.getObjValue());
writeln("dual of the kids constraint = ",thisOplModel.ctKids.dual);
}
}

JavaBDD sat count with subset of variables

I am using JavaBDD to do some computation with BDDs.
I have a very large BDD with many variables and I want to calculate how many ways it can be satisfied with a small subset of those variables.
My current attempt looks like this:
// var 1,2,3 are BDDVarSets with 1 variable.
BDDVarSet union = var1;
union = union.union(var2);
union = union.union(var3);
BDDVarSet restOfVars = allVars.minus(union);
BDD result = largeBdd.exist(restOfVars);
double sats = result.satCount(); // Returns a very large number (way too large).
double partSats = result.satCount(union); // Returns an incorrect number. It is documented that this should not work.
Is the usage of exist() incorrect?

After a bit of playing around I understood what my problem was.
double partSats = result.satCount(union);
does return the correct answer. It takes the total number of satisfying assignments over all variables and divides it by 2^(number of variables not in the set).
The reason I thought satCount(union) did not work was an incorrect usage of exist() somewhere else in the code.
Here is the implementation of satCount(varSet) for reference:
/**
* <p>Calculates the number of satisfying variable assignments to the variables
* in the given varset. ASSUMES THAT THE BDD DOES NOT HAVE ANY ASSIGNMENTS TO
* VARIABLES THAT ARE NOT IN VARSET. You will need to quantify out the other
* variables first.</p>
*
* <p>Compare to bdd_satcountset.</p>
*
* @return the number of satisfying variable assignments
*/
public double satCount(BDDVarSet varset) {
BDDFactory factory = getFactory();
if (varset.isEmpty() || isZero()) /* empty set */
return 0.;
double unused = factory.varNum();
unused -= varset.size();
unused = satCount() / Math.pow(2.0, unused);
return unused >= 1.0 ? unused : 1.0;
}
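A small worked example of the arithmetic may help. The sketch below mirrors only the scaling step of the method above in plain Java. The class name, method name, and numbers are made up for illustration, and no JavaBDD classes are involved.

```java
public class SatCountScaling {

    /**
     * Mirrors the scaling in satCount(BDDVarSet): the count over all factory
     * variables is divided by 2^(variables outside the set), with a floor of 1.
     */
    public static double scaledSatCount(double fullSatCount, int totalVars, int varsInSet) {
        if (varsInSet == 0 || fullSatCount == 0.0) {
            return 0.0;
        }
        double scaled = fullSatCount / Math.pow(2.0, totalVars - varsInSet);
        return scaled >= 1.0 ? scaled : 1.0;
    }

    public static void main(String[] args) {
        // Hypothetical case: the factory has 10 variables, and after exist() the
        // BDD depends on only 3 of them, with 5 of the 8 assignments satisfying it.
        // satCount() counts every free variable twice: 5 * 2^7 = 640.
        double full = 5 * Math.pow(2.0, 7);
        System.out.println(scaledSatCount(full, 10, 3)); // prints 5.0
    }
}
```

This also shows why the plain satCount() looks "way too large": every variable that was quantified out still doubles the count.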

Getting the unicode value of the first character in a string

I am asked to take the Unicode value of the first character of a string, multiply it by 10%, and add whatever level the object currently has. I believe I have the logic and the code right, yet I still get an error that says: expected:<0> but was:<8>. Any suggestions? Maybe the logic needs a slight adjustment, although I'm fairly certain it's right. Take note of the getLevel method, because that's where the error is:
public class PouchCreature implements Battleable {
private String name;
private int strength;
private int levelUps;
private int victoriesSinceLevelUp;
/**
* Standard constructor. levelUps and victoriesSinceLevelUp start at 0.
*
* @param nameIn desired name for this PouchCreature
* @param strengthIn starting strength for this PouchCreature
*/
public PouchCreature(String nameIn, int strengthIn) {
this.name = nameIn;
this.strength = strengthIn;
this.levelUps = 0;
this.victoriesSinceLevelUp = 0;
}
/**
* Copy constructor.
*
* @param other reference to the existing object which is the basis of the new one
*/
public PouchCreature(PouchCreature other) {
this.name=other.name;
this.strength=other.strength;
this.levelUps=other.levelUps;
this.victoriesSinceLevelUp=other.victoriesSinceLevelUp;
}
/**
* Getter for skill level of the PouchCreature, which is based on the
* first character of its name and the number of levelUps it has.
* Specifically, the UNICODE value of the first character in its name
* taken %10 plus the levelUps.
*
* @return skill level of the PouchCreature
*/
public int getLevel() {
int value = (int)((int)(getName().charAt(0)) * 0.1);
return value + this.levelUps;
}

You've said you're supposed to increase the value by 10%. What you're actually doing, though, is reducing it by 90%: you take just 10% of it and then truncate that to an int. 67.0 * 0.1 = 6.7, which when truncated to an int is 6.
Change the 0.1 to 1.1 to increase it by 10%:
int value = (int)((int)(getName().charAt(0)) * 1.1);
// --------------------------------------------^
There, if getName() returns "Centaur" (for instance), the C has the Unicode value 67, and value ends up being 73.

We need to see the code you're calling the class with and that is generating your error message. Why is it expecting 0? 8 seems like a valid return value from the information you've given.
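Since the test code is not shown, it may help to compare the possible readings side by side. The Javadoc in the question says "taken %10", which could also mean the remainder operator rather than 10 percent. That reading is an assumption, not something the question confirms. The sketch below uses hypothetical method names and computes all three variants without the levelUps term.

```java
public class FirstCharLevel {

    /** The question's version: 10 percent of the code point, truncated. */
    public static int tenPercent(String name) {
        return (int) (name.charAt(0) * 0.1);
    }

    /** The first answer's version: the code point increased by 10 percent, truncated. */
    public static int increasedByTenPercent(String name) {
        return (int) (name.charAt(0) * 1.1);
    }

    /** Assumed alternative reading of "%10": the remainder after division by 10. */
    public static int moduloTen(String name) {
        return name.charAt(0) % 10;
    }

    public static void main(String[] args) {
        String name = "Centaur"; // 'C' has the Unicode value 67
        System.out.println(tenPercent(name));            // prints 6
        System.out.println(increasedByTenPercent(name)); // prints 73
        System.out.println(moduloTen(name));             // prints 7
    }
}
```

For a name starting with 'P' (Unicode value 80), tenPercent gives 8 and moduloTen gives 0, which would be consistent with the reported "expected:<0> but was:<8>" if levelUps is 0.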

Documenting Varargs Appropriately for Javadoc

I'm using varargs in a method for optional parameters. Any suggestion for how best to document the method?
Here's a wonderfully contrived example:
/**
*
* @param consumption
* liters of liquid consumed after last pee
* @param options
* urgency
* how badly you have to pee on a scale of 1-3,
* 3 being the highest (default 1)
* bribe
* what's a toilet worth to you? (default 0)
* @return waitTime
* minutes until you'll be able to relieve yourself
*/
public int whenCanIUseTheBathroom(int consumption, int... options){
// Segment handling options, defining defaults/fallbacks
int urgency = 1;
int bribe = 0;
if(options.length > 0) {
urgency = options[0];
}
if(options.length == 2) {
bribe = options[1];
}
// Segment determining one's fate
...
}

Varargs are not usually used to implement optional parameters with different meanings, because they do not support different types for the "subparams", offer poor refactoring support (want to insert a new "subparam" or remove an old one?), and are inflexible (you can't omit "urgency" while providing "bribe"). Therefore, there is no standard way to document them with Javadoc, either.
Optional parameters are typically implemented using overloading (usually with delegation), or a variant of the builder pattern, which allows you to write:
new BathroomRequest(3).withBribe(2).compute();
For a more thorough discussion of that approach, see Joshua Bloch's Effective Java, item 2.
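A minimal sketch of that builder-style variant could look as follows. Only BathroomRequest, withBribe, and compute appear above; withUrgency and the waiting-time formula are invented purely for illustration.

```java
public class BathroomRequest {

    private final int consumption; // required: liters consumed
    private int urgency = 1;       // optional, scale 1-3, default 1
    private int bribe = 0;         // optional, default 0

    public BathroomRequest(int consumption) {
        this.consumption = consumption;
    }

    /** Sets the optional urgency and returns this request for chaining. */
    public BathroomRequest withUrgency(int urgency) {
        this.urgency = urgency;
        return this;
    }

    /** Sets the optional bribe and returns this request for chaining. */
    public BathroomRequest withBribe(int bribe) {
        this.bribe = bribe;
        return this;
    }

    /**
     * Returns the waiting time in minutes.
     * The formula is a placeholder invented for this sketch.
     */
    public int compute() {
        return Math.max(0, 30 - 5 * consumption - 5 * urgency - 2 * bribe);
    }
}
```

Each optional value now has its own named, individually documented method, so a caller can supply a bribe without supplying an urgency, and each method gets an ordinary @param tag.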

Trend analysis using iterative value increments

We have configured iReport to generate the following graph:
The real data points are in blue, the trend line is green. The problems include:
Too many data points for the trend line
Trend line does not follow a Bezier curve (spline)
The source of the problem is with the incrementer class. The incrementer is provided with the data points iteratively. There does not appear to be a way to get the set of data. The code that calculates the trend line looks as follows:
import java.math.BigDecimal;
import net.sf.jasperreports.engine.fill.*;
/**
* Used by an iReport variable to increment its average.
*/
public class MovingAverageIncrementer
implements JRIncrementer {
private BigDecimal average;
private int incr = 0;
/**
* Instantiated by the MovingAverageIncrementerFactory class.
*/
public MovingAverageIncrementer() {
}
/**
* Returns the newly incremented value, which is calculated by averaging
* the previous value from the previous call to this method.
*
* @param jrFillVariable Unused.
* @param object New data point to average.
* @param abstractValueProvider Unused.
* @return The newly incremented value.
*/
public Object increment( JRFillVariable jrFillVariable, Object object,
AbstractValueProvider abstractValueProvider ) {
BigDecimal value = new BigDecimal( ( ( Number )object ).doubleValue() );
// Average every 10 data points
//
if( incr % 10 == 0 ) {
setAverage( ( value.add( getAverage() ).doubleValue() / 2.0 ) );
}
incr++;
return getAverage();
}
/**
* Changes the value that is the moving average.
* @param average The new moving average value.
*/
private void setAverage( BigDecimal average ) {
this.average = average;
}
/**
* Returns the current moving average.
* @return Value used for plotting on a report.
*/
protected BigDecimal getAverage() {
if( this.average == null ) {
this.average = new BigDecimal( 0 );
}
return this.average;
}
/** Helper method. */
private void setAverage( double d ) {
setAverage( new BigDecimal( d ) );
}
}
How would you create a smoother and more accurate representation of the trend line?

This depends on the behavior of the item you are measuring. Is this something that moves (or changes) in a manner that can be modeled?
If the item is not expected to change, then your trend should be the underlying mean value of the entire sample set, not just the past two measurements. You can get this using Bayes' theorem. The running average can be calculated incrementally using the simple formula
Mtn1 = (Mtn * N + x) / (N+1)
where x is the measurement at time t+1, Mtn1 is the mean at time t+1, Mtn is the mean at time t, and N is the number of measurements taken by time t.
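The formula can be sketched in plain Java as follows. The class and method names are assumptions for this example, and the sketch is independent of the JRIncrementer interface used in the question.

```java
public class RunningMean {

    private double mean = 0.0; // Mtn: mean of all measurements so far
    private long count = 0;    // N: number of measurements so far

    /** Adds one measurement x and returns the updated mean Mtn1. */
    public double add(double x) {
        mean = (mean * count + x) / (count + 1);
        count++;
        return mean;
    }

    public double getMean() {
        return mean;
    }

    public static void main(String[] args) {
        RunningMean running = new RunningMean();
        System.out.println(running.add(2.0)); // prints 2.0
        System.out.println(running.add(4.0)); // prints 3.0
        System.out.println(running.add(6.0)); // prints 4.0
    }
}
```

Only the current mean and the count are stored, so the incrementer never needs access to the full data set.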
If the item you are measuring fluctuates in a manner that can be predicted by some underlying equation, then you can use a Kalman filter to provide a best estimate of the next point based on the previous (recent) measurements and the equation that models the predicted behavior.
As a starting point, the Wikipedia entries on Bayesian estimators and Kalman filters will be helpful.
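As a rough illustration of the Kalman idea, the sketch below filters a scalar signal under the simplest possible model: the value is assumed to stay roughly constant between measurements (a random walk). The class name and the two noise parameters are assumptions for this example. A real application would replace the prediction step with the equation that models the measured quantity.

```java
public class ScalarKalmanFilter {

    private final double processNoise;     // how much the true value may drift per step
    private final double measurementNoise; // variance of the measurement error
    private double estimate;               // current best estimate
    private double errorVariance;          // uncertainty of the estimate

    public ScalarKalmanFilter(double initialEstimate, double initialVariance,
                              double processNoise, double measurementNoise) {
        this.estimate = initialEstimate;
        this.errorVariance = initialVariance;
        this.processNoise = processNoise;
        this.measurementNoise = measurementNoise;
    }

    /** Feeds one measurement into the filter and returns the new estimate. */
    public double update(double measurement) {
        // Predict: the value is assumed unchanged, but the uncertainty grows.
        errorVariance += processNoise;

        // Correct: blend prediction and measurement according to the gain.
        double gain = errorVariance / (errorVariance + measurementNoise);
        estimate += gain * (measurement - estimate);
        errorVariance *= (1.0 - gain);
        return estimate;
    }
}
```

A small processNoise relative to measurementNoise gives a smoother but slower trend line. Both values usually have to be tuned for the data at hand.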

Resulting Image
The result is still incomplete, however it clearly shows a better trend line than that in the question.
Calculation
There were two key components missing:
Sliding window. A List of Double values that cannot grow beyond a given size.
Calculation. A variation on the accepted answer (one less call to getIterations()):
((value - previousAverage) / (getIterations() + 1)) + previousAverage
Source Code
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import net.sf.jasperreports.engine.fill.AbstractValueProvider;
import net.sf.jasperreports.engine.fill.JRFillVariable;
import net.sf.jasperreports.engine.fill.JRIncrementer;
/**
* Used by an iReport variable to increment its average.
*/
public class RunningAverageIncrementer
implements JRIncrementer {
/** Default number of tallies. */
private static final int DEFAULT_TALLIES = 128;
/** Number of tallies within the sliding window. */
private static final int DEFAULT_SLIDING_WINDOW_SIZE = 30;
/** Stores a sliding window of values. */
private List<Double> values = new ArrayList<Double>( DEFAULT_TALLIES );
/**
* Instantiated by the RunningAverageIncrementerFactory class.
*/
public RunningAverageIncrementer() {
}
/**
* Calculates the average of previously known values.
* @return The average of the list of values returned by getValues().
*/
private double calculateAverage() {
double result = 0.0;
List<Double> values = getValues();
for( Double d: getValues() ) {
result += d.doubleValue();
}
return result / values.size();
}
/**
* Called each time a new value to be averaged is received.
* @param value The new value to include for the average.
*/
private void recordValue( Double value ) {
List<Double> values = getValues();
// Throw out old values that should no longer influence the trend.
//
if( values.size() > getSlidingWindowSize() ) {
values.remove( 0 );
}
this.values.add( value );
}
private List<Double> getValues() {
return values;
}
private int getIterations() {
return getValues().size();
}
/**
* Returns the newly incremented value, which is calculated by averaging
* the previous value from the previous call to this method.
*
* @param jrFillVariable Unused.
* @param tally New data point to average.
* @param abstractValueProvider Unused.
* @return The newly incremented value.
*/
public Object increment( JRFillVariable jrFillVariable, Object tally,
AbstractValueProvider abstractValueProvider ) {
double value = ((Number)tally).doubleValue();
recordValue( value );
double previousAverage = calculateAverage();
double newAverage =
((value - previousAverage) / (getIterations() + 1)) + previousAverage;
return new BigDecimal( newAverage );
}
protected int getSlidingWindowSize() {
return DEFAULT_SLIDING_WINDOW_SIZE;
}
}
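For comparison, the sliding-window part on its own can be sketched without any JasperReports types. This is an alternative formulation and not the code above: it keeps a running sum so the mean does not have to be recomputed over the whole window, and the class name is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowAverage {

    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0; // sum of the values currently in the window

    public SlidingWindowAverage(int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be at least 1");
        }
        this.windowSize = windowSize;
    }

    /** Adds a value, drops the oldest one if the window is full, and returns the mean. */
    public double add(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();
        }
        return sum / window.size();
    }

    public static void main(String[] args) {
        SlidingWindowAverage average = new SlidingWindowAverage(3);
        System.out.println(average.add(1.0)); // prints 1.0
        System.out.println(average.add(2.0)); // prints 1.5
        System.out.println(average.add(3.0)); // prints 2.0
        System.out.println(average.add(4.0)); // prints 3.0 (window is now 2, 3, 4)
    }
}
```

A larger window gives a smoother trend line that reacts more slowly to changes. For very long runs the running sum can accumulate rounding error, in which case recomputing it from the window now and then is a simple remedy.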
