Solving a non linear system in java (using optim toolbox)


I have a system of nonlinear dynamics which I wish to solve to optimality. I know how to do this in MATLAB, but I wish to implement it in Java. For some reason I am lost on how to do it in Java.
What I have is following:
z(t) which returns states in a dynamic system.
z(t) = [state1(t),...,state10(t)]
The rate of change of this dynamic system is given by:
z'(t) = f(z(t),u(t),d(t)) = [dstate1(t)/dt,...,dstate10(t)/dt]
where u(t) and d(t) are external variables whose values I know.
In addition I have a function, let's denote it g(t), which is defined from a state variable:
g(t) = state4(t)/c1
where c1 is some constant.
Now I wish to solve the following unconstrained nonlinear system numerically:
g(t) - c2 = 0
f(z(t),u(t),0)= 0
where c2 is some constant. The above system can be seen as a simple f'(x) = 0 problem consisting of 11 equations and 1 unknown, and if I were supposed to solve this in MATLAB I would do the following:
[output] = fsolve(@myDerivatives, someInitialGuess);
I am aware that Java doesn't come with any built-in solvers. So as I see it there are two options for solving the above problem:
Option 1: Do it myself: I could use a numerical method such as Gauss-Newton or similar to solve this system of nonlinear equations. However, I will start by using a Java toolbox first, and then move to a numerical method afterwards.
Option 2: Solvers (e.g. commons optim): This solution is what I would like to look into. I have been looking into this toolbox, but I have failed to find an exact example of how to actually use the MultivariateFunction evaluator and the numerical optimizer. Do any of you have experience in doing so?
Please let me know if you have any ideas or suggestions for solving this problem.
Thanks!
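For reference, Option 1 might look roughly like the sketch below: plain Newton iteration for a square system F(x) = 0, with the Jacobian approximated by forward differences. It uses no external library. The class name NewtonSketch, the step size, and the toy system in main are assumptions made for illustration, not the system from the question, and a singular Jacobian is not handled.

```java
import java.util.function.Function;

// Hypothetical helper: Newton's method for a square system F(x) = 0.
class NewtonSketch {

    /** Returns an approximate root of F, starting from x0. */
    static double[] solve(Function<double[], double[]> F, double[] x0, int maxIter, double tol) {
        double[] x = x0.clone();
        int n = x.length;
        for (int iter = 0; iter < maxIter; iter++) {
            double[] fx = F.apply(x);
            double norm = 0.0;
            for (double v : fx) norm = Math.max(norm, Math.abs(v));
            if (norm < tol) break; // residual is small enough

            // Forward-difference Jacobian: J[i][j] approximates dF_i/dx_j.
            double[][] J = new double[n][n];
            for (int j = 0; j < n; j++) {
                double h = 1e-7 * Math.max(1.0, Math.abs(x[j]));
                double[] xh = x.clone();
                xh[j] += h;
                double[] fh = F.apply(xh);
                for (int i = 0; i < n; i++) J[i][j] = (fh[i] - fx[i]) / h;
            }

            // Newton step: solve J * d = -F(x), then x <- x + d.
            double[] rhs = new double[n];
            for (int i = 0; i < n; i++) rhs[i] = -fx[i];
            double[] d = gaussSolve(J, rhs);
            for (int i = 0; i < n; i++) x[i] += d[i];
        }
        return x;
    }

    /** Gaussian elimination with partial pivoting; overwrites A and b. */
    static double[] gaussSolve(double[][] A, double[] b) {
        int n = b.length;
        for (int c = 0; c < n; c++) {
            int p = c;
            for (int r = c + 1; r < n; r++) {
                if (Math.abs(A[r][c]) > Math.abs(A[p][c])) p = r;
            }
            double[] tmpRow = A[c]; A[c] = A[p]; A[p] = tmpRow;
            double tmp = b[c]; b[c] = b[p]; b[p] = tmp;
            for (int r = c + 1; r < n; r++) {
                double m = A[r][c] / A[c][c];
                b[r] -= m * b[c];
                for (int k = c; k < n; k++) A[r][k] -= m * A[c][k];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = b[r];
            for (int k = r + 1; k < n; k++) s -= A[r][k] * x[k];
            x[r] = s / A[r][r];
        }
        return x;
    }

    public static void main(String[] args) {
        // Toy system (an assumption): x^2 + y^2 = 4 and x = y.
        Function<double[], double[]> F = v -> new double[] {
            v[0] * v[0] + v[1] * v[1] - 4.0,
            v[0] - v[1]
        };
        double[] root = solve(F, new double[] {1.0, 1.0}, 50, 1e-10);
        System.out.println(root[0] + " " + root[1]);
    }
}
```

For the system in the question, F would stack g(t) - c2 and the components of f(z, u, 0). If there are more equations than unknowns, the Newton step would have to be replaced by a least-squares step such as Gauss-Newton.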

Please compare what your original problem looks like:
A global optimization problem
minimize f(y)
is solved by looking for solutions of the derivatives system
0=grad f(y) or 0=df/dy (partial derivatives)
(the gradient is the column vector containing all partial derivatives), that is, you are computing the "flat" or horizontal points of f(y).
For optimization under constraints
minimize f(y,u) such that g(y,u)=0
one builds the Lagrangian functional
L(y,p,u) = f(y,u)+p*g(y,u) (scalar product)
and then computes the flat points of that system, that is
g(y,u)=0, dL/dy(y,p,u)=0, dL/du(y,p,u)=0
After that, as in the global optimization case, you have to determine the type of the flat point: maximum, minimum or saddle point.
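For the unconstrained case with two variables, that classification step can be sketched with the usual second-derivative test on the Hessian. The class and method names are made up for illustration. The result is a local statement only, and the constrained (Lagrangian) case needs a different test that is not shown here.

```java
// Hypothetical helper: second-derivative test for a function of two variables.
class FlatPointSketch {

    /**
     * Classifies a stationary point from the entries of the symmetric Hessian
     * [[fxx, fxy], [fxy, fyy]] evaluated at that point.
     */
    static String classify(double fxx, double fxy, double fyy) {
        double det = fxx * fyy - fxy * fxy;
        if (det < 0) return "saddle point";
        if (det > 0) return fxx > 0 ? "minimum" : "maximum";
        return "inconclusive"; // det == 0: the test gives no answer
    }

    public static void main(String[] args) {
        // f(x, y) = x^2 + y^2 has Hessian [[2, 0], [0, 2]] at its flat point (0, 0).
        System.out.println(classify(2, 0, 2));   // minimum
        // f(x, y) = x^2 - y^2 has Hessian [[2, 0], [0, -2]] at (0, 0).
        System.out.println(classify(2, 0, -2));  // saddle point
    }
}
```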
Optimal control problems have the structure (one of several equivalent variants)
minimize integral(0,T) f(t,y(t),u(t)) dt
such that y'(t)=g(t,y(t),u(t)), y(0)=y0 and h(T,y(T))=0
To solve it, one considers the Hamiltonian
H(t,y,p,u)=f(t,y,u)-p*g(t,y,u)
and obtains the transformed problem
y' = -dH/dp = g, (partial derivatives, gradient)
p' = dH/dy,
with boundary conditions
y(0)=y0, p(T)= something with dh/dy(T,y(T))
u(t) realizes the minimum in v -> H(t,y(t),p(t),v)

Related

What would be the best way to build a Big-O runtime complexity analyzer for pseudocode in a text file?

I am trying to create a class that takes in a string input containing pseudocode and computes its worst-case runtime complexity. I will be using regex to split each line, analyze its worst case, and add up the complexities (based on the big-O rules) for each line to give a final worst-case runtime. The pseudocode written will follow a few rules for declaration, initialization and operations on data structures. This is something I can control. How should I go about designing a class considering the rules of iterative and recursive analysis?
Any help in C++ or Java is appreciated. Thanks in advance.
#include <string>
using std::string;

class PseudocodeAnalyzer
{
public:
    string inputCode;
    string performIterativeAnalysis(string line);
    string performRecursiveAnalysis(string line);
    string analyzeTotalComplexity(string inputCode);
};
An example of an iterative algorithm: check if every number in a grid is odd:
Array A = Array[N][N]
for i in 1 to N
    for j in 1 to N
        if A[i][j] % 2 == 0
            return false
        endif
    endloop
endloop
Worst-case Time-Complexity: O(n*n)

The concept: "I wish to write a program that analyses pseudocode in order to print out the algorithmic complexity of the algorithm it describes" is mathematically impossible!
Let me try to explain why that is, or how you get around the inevitability that you cannot write this.
Your pseudocode has certain capabilities. You call it pseudocode, but given that you are now trying to parse it, it's still a 'real' language where terms have real meaning. This language is capable of expressing algorithms.
So, which algorithms can it express? Presumably, 'all of them'. There is this concept called a 'turing machine': You can prove that anything a computer can do, a turing machine can also do. And turing machines are very simple things. Therefore, if you have some simplistic computer and you can use that computer to emulate a turing machine, you can therefore use it to emulate a complete computer. This is how, in fundamental informatics, you can prove that a certain CPU or system is capable of computing all the stuff some other CPU or system is capable of computing: Use it to compute a turing machine, thus proving you can run it all. Any system that can be used to emulate a turing machine is called 'turing complete'.
Then we get to something very interesting: If your pseudocode can be used to express anything a real computer can do, then your pseudocode can be used to 'write'... your very pseudocode checker!
So let's say we do just that and stick the pseudocode that describes your pseudocode checker in a function we shall call pseudocodechecker. It takes as argument a string containing some pseudocode, and returns a string such as O(n^2).
You can then write this program in pseudocode:
if pseudocodechecker(this-very-program) == O(n^2)
    runSomeAlgorithmThatIsO(1)
else
    runSomeAlgorithmThatIsO(n^2)
And this is self-defeating: We have 'programmed' a paradox. It's like "This statement is a lie", or "the set of all sets that do not contain themselves". If it's false it is true, and if it is true it is false. [Insert GIF of exploding computer here].
Thus, we have mathematically proved that what you want is impossible, unless one of the following is true:
A. Your pseudocode-based checker is incorrect. As in, it will flat out give a wrong answer sometimes, thus solving the paradox: If you feed your program a paradox, it gives a wrong answer. But how useful is such an app? An app where you know the answer it gives may be incorrect?
B. Your pseudocode-based checker is incomplete: The official definition of your pseudocode language is so incapable, you cannot even write a turing machine in it.
That last one seems like a nice solution; but it is quite drastic. It pretty much means that your algorithm can only loop over constant ranges. It cannot loop until a condition is true, for example. Another nice solution appears to be: The program is capable of realizing that an answer cannot be given, and will then report 'no answer available', but unfortunately, with some more work, you can show that you can still use such a system to develop a paradox.

The answer by @rzwitserloot and the ones given in the link are correct. Let me just add that it is possible to compute an approximation both to the halting problem and to the problem of finding the time complexity of a piece of code (written in a Turing-complete language!). (Compare that to the existence of automated theorem provers for arithmetic and other second order logics, which are undecidable!) A tool that under-approximated the complexity problem would output the correct time complexity for some inputs, and "don't know" for other inputs.
Indeed, the whole wide field of code analyzers, often built into the IDEs that we use every day, more often than not under-approximate decision problems that are uncomputable, e.g. reachability, nullability or value analyses.
If you really want to write such a tool: the basic idea is to identify heuristics, i.e., common patterns for which a solution is known, such as various patterns of nested for-loops with only very basic arithmetic operations manipulating the indices, or simple recursive functions where the recurrence relation can be spotted straight-away. It would actually be not too hard (though definitely not easy!) to write a tool that could solve most of the toy problems (such as the one you posted) that are given as homework to students, and that are often posted as questions here on SO, since they follow a rather small number of patterns.
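To make the heuristic idea concrete, here is a rough sketch that recognises exactly one pattern: loops of the form "for i in 1 to N" closed by "endloop". It reports the maximum nesting depth as the exponent and answers "don't know" for anything else that looks like a loop. The class name and the assumption that every non-loop line costs O(1) are mine, not part of the question.

```java
// Hypothetical heuristic: only understands "for <var> in 1 to N" / "endloop" lines
// and assumes every other line is a constant-time statement.
class LoopDepthSketch {

    /** Returns the estimated bound, or "don't know" if a loop is not recognised. */
    static String estimate(String pseudocode) {
        int depth = 0;
        int maxDepth = 0;
        for (String raw : pseudocode.split("\\R")) {
            // Drop optional leading line numbers such as "3. ".
            String line = raw.trim().replaceFirst("^\\d+\\.\\s*", "");
            if (line.matches("for\\s+\\w+\\s+in\\s+1\\s+to\\s+N")) {
                depth++;
                maxDepth = Math.max(maxDepth, depth);
            } else if (line.startsWith("for ") || line.startsWith("while ")) {
                return "don't know"; // a loop shape this heuristic does not recognise
            } else if (line.equals("endloop")) {
                if (--depth < 0) return "don't know"; // unbalanced endloop
            }
        }
        if (depth != 0) return "don't know"; // a loop was never closed
        if (maxDepth == 0) return "O(1)";
        return maxDepth == 1 ? "O(n)" : "O(n^" + maxDepth + ")";
    }

    public static void main(String[] args) {
        String code = String.join("\n",
            "Array A = Array[N][N]",
            "for i in 1 to N",
            "for j in 1 to N",
            "if A[i][j] % 2 == 0",
            "return false",
            "endif",
            "endloop",
            "endloop");
        System.out.println(estimate(code)); // O(n^2)
    }
}
```

A real tool would need many more patterns (other loop bounds, recursion, calls), but each one can be added as another recognised shape while everything else keeps falling through to "don't know".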
If you wish to go beyond simple heuristics, the main theoretical concept underlying more powerful code analyzers is abstract interpretation. Applied to your use case, this would mean developing a mapping between code constructs in your language and code constructs in a different language (or simpler code constructs in the same language) for which it is easier to compute the time complexity. This mapping would have to conform to some constraints; in particular, the mapped constructs must have the same or worse time complexity as the original code. Actually, mapping a piece of code to a recurrence relation would be an example of abstract interpretation. So is replacing a line of code with something like "O(1)". So, the task is just to formalize some of the things that we do in our heads anyway when we are analyzing the time complexity of code.
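A very small sketch in the spirit of that mapping: each construct of an already-parsed program is replaced by an abstract cost, here just the exponent k in O(n^k). The node types and the rules are assumptions chosen for illustration. They over-approximate on purpose (an if is charged for its worse branch, a loop is always charged n iterations), so the result is the same or worse than the true complexity.

```java
// Hypothetical abstract domain: a cost is just the exponent k in O(n^k).
class AbstractCostSketch {

    interface Node { int degree(); }

    /** A basic statement is abstracted to O(1), i.e. degree 0. */
    static Node stmt() { return () -> 0; }

    /** A sequence costs as much as its most expensive part. */
    static Node seq(Node... parts) {
        return () -> {
            int max = 0;
            for (Node p : parts) max = Math.max(max, p.degree());
            return max;
        };
    }

    /** A loop running up to n times multiplies the body cost by n. */
    static Node loopToN(Node body) { return () -> body.degree() + 1; }

    /** Both branches are assumed possible, so take the worse one. */
    static Node branch(Node thenPart, Node elsePart) {
        return () -> Math.max(thenPart.degree(), elsePart.degree());
    }

    public static void main(String[] args) {
        // The grid example: a statement, then two nested loops around an if.
        Node program = seq(stmt(), loopToN(loopToN(branch(stmt(), stmt()))));
        System.out.println("O(n^" + program.degree() + ")"); // O(n^2)
    }
}
```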

Exponential function parameters

I have 3 points [x0 y0], [x1 y1], [x2 y2] with the strict conditions x0<x1<x2, y0<y1<y2. All these points lie on some exponential function y=ae^(bx)+c. I need to find a, b, c. It's not possible to solve the system of 3 equations exactly, so I need to approximate it. Is there a math library in Java that will help me solve this problem? I found something similar in Mathcad
https://help.ptc.com/mathcad/en/index.html#page/PTC_Mathcad_Help/exponential_regression.html but could not find one in Java.
Another way to put it: how can I solve a system of 3 equations in 3 unknowns approximately?
ae^(bx_0)+c=y_0
ae^(bx_1)+c=y_1
ae^(bx_2)+c=y_2

You have to solve a system of nonlinear equations, for which only an approximate solution is possible, but it can be found using the multivariate Newton-Raphson method.
The algorithm is, quite frankly, a notational pain but you can go through it here -
http://fourier.eng.hmc.edu/e176/lectures/NM/node21.html.
What is happening essentially is that you have a function whose derivatives lead you to an 'equilibrium' from an initial point (which you guess as a possible root).
If you are not willing to write the code yourself this repo can give you a starter of sorts - https://github.com/prasser/newtonraphson.
But AFAIK, no ready library exists for this purpose. You can use Wolfram's Mathematica or MATLAB/OCTAVE for ready libraries though.
That said, here are a few other (more complicated) things you can look into
https://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm
https://www1.fpl.fs.fed.us/optimization.html
http://icl.cs.utk.edu/f2j/
http://optalgtoolkit.sourceforge.net/
http://scribblethink.org/Computer/Javanumeric/index.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html
Hope this helps!
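For this particular three-point problem there is also a simpler route than full multivariate Newton-Raphson, sketched below. This is a different technique from the one described above. Subtracting the equations eliminates c, and dividing the differences eliminates a, which leaves a single monotone equation in b that bisection can solve. The class name, the bracket and the iteration count are assumptions, and collinear points (b = 0) are not handled.

```java
// Hypothetical helper: fits y = a*e^(b*x) + c through three points
// by elimination plus bisection (not Newton-Raphson).
class ExpFitSketch {

    /** (e^(b*x2) - e^(b*x1)) / (e^(b*x1) - e^(b*x0)), written with expm1 for accuracy. */
    static double ratio(double b, double x0, double x1, double x2) {
        return Math.expm1(b * (x2 - x1)) / -Math.expm1(-b * (x1 - x0));
    }

    /** Returns {a, b, c}; assumes x0 < x1 < x2, y0 < y1 < y2 and non-collinear points. */
    static double[] fit(double x0, double y0, double x1, double y1, double x2, double y2) {
        double target = (y2 - y1) / (y1 - y0);
        // ratio(b) increases with b and tends to (x2-x1)/(x1-x0) as b -> 0,
        // so the sign of b is known before the search starts.
        double sign = target > (x2 - x1) / (x1 - x0) ? 1.0 : -1.0;
        double lo = 1e-12, hi = 1.0; // bracket for |b|
        while ((ratio(sign * hi, x0, x1, x2) - target) * sign < 0) hi *= 2.0;
        for (int i = 0; i < 200; i++) {
            double mid = 0.5 * (lo + hi);
            if ((ratio(sign * mid, x0, x1, x2) - target) * sign < 0) lo = mid; else hi = mid;
        }
        double b = sign * 0.5 * (lo + hi);
        // Back-substitute: a from the first difference, c from the first point.
        double a = (y1 - y0) / (Math.exp(b * x1) - Math.exp(b * x0));
        double c = y0 - a * Math.exp(b * x0);
        return new double[] {a, b, c};
    }

    public static void main(String[] args) {
        // Points generated from a = 2, b = 0.5, c = 1 at x = 0, 1, 2.
        double[] abc = fit(0, 2 * Math.exp(0.0) + 1, 1, 2 * Math.exp(0.5) + 1, 2, 2 * Math.exp(1.0) + 1);
        System.out.println(abc[0] + " " + abc[1] + " " + abc[2]);
    }
}
```

With more than three points this elimination no longer applies, and a least-squares method such as Levenberg-Marquardt is the usual choice.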

What is a distributive function under IFDS and why is pointer analysis non-distributive?

I'm doing an inter-procedural analysis project in Java at the moment and I'm looking into using an IFDS solver to compute the control flow graph of a program. I'm finding it hard to follow the maths involved in the description of the IFDS framework and graph reachability. I've read in several places that it's not possible to compute the points-to sets of a program using this solver as "pointer analysis is known to be a non-distributive problem." [1] Other sources have said that this is often specifically with regard to 'strong updates', which from what I can gather are field write statements.
I think I can basically follow how the solver computes edges and works out the dataflow facts. But I don't quite follow what this: f(A ∪ B) = f(A) ∪ f(B) means in practical terms as a definition of a distributive function, and therefore what it means to say that points-to analysis deals with non-distributive functions.
The linked source [1] gives an example specific to field write statements:
A a = new A();
A b = a;
A c = new C();
b.f = c;
It claims that in order to reason about the assignment to b.f one must also take into account all aliases of the base b. I can follow this. But what I don't understand is what are the properties of this action that make it non-distributive.
A similar (I think) example from [2]:
x = y.n
Where before the statement there are points-to edges y-->obj1 and obj1.n-->obj2 (where obj1 and 2 are heap objects). They claim
it is not possible to correctly deduce that the edge x-->obj2 should be generated after the statement if we consider each input edge independently. The flow function for this statement is a function of the points-to graph as a whole and cannot be decomposed into independent functions of each edge and then merged to get a correct result.
I think I almost understand what at least the first example is saying, but I am not getting the concept of distributive functions, which is blocking me from getting the full picture. Can anyone explain what a distributive or non-distributive function is on a practical basis with regard to pointer analysis, without using set theory, which I am having difficulty following?
[1] http://karimali.ca/resources/pubs/conf/ecoop/SpaethNAB16.pdf
[2] http://dl.acm.org/citation.cfm?doid=2487568.2487569 (paywall, sorry)

The distributiveness of a flow function is defined as: f(a Π b) = f(a) Π f(b), with Π being the merge function. In IFDS, Π is defined as the set union ∪.
What this means is that it doesn't matter whether or not you apply the merge function before or after the flow function, you will get the same result in the end.
In a traditional data-flow analysis, you go through the statements of your CFG and propagate sets of data-flow facts. So with a flow function f, for each statement, you compute f(in, stmt) = out, with in and out the sets of information you want to keep. For example, take the in-set {(a, allocA), (b, allocA)}, denoting that the allocation site of objects a and b is allocA, and the statement "b.f = new X();", whose allocation site we will name allocX. You would likely get the out-set {(a, allocA), (b, allocA), (a.f, allocX), (b.f, allocX)}, because a and b are aliased.
IFDS explodes the in-set into its individual data-flow facts. So for each fact, instead of running your flow-function once with your entire in-set, you run it on each element of the in-set: ∀ d ∈ in, f(d, stmt) = out_d. The framework then merges all out_d together into the final out-set.
The issue here is that for each flow function, you don't have access to the entire in-set, meaning that for the example we presented above, running the flow-function f((a, allocA)) on the statement would yield a first out-set {(a, allocA)}, f((b, allocA)) would yield a second out-set {(b, allocA)}, and f(0) would yield a third out-set {(0), (b.f, allocX)}.
So the global out-set after you merge the results would be {(a, allocA), (b, allocA), (b.f, allocX)}. We are missing the fact {(a.f, allocX)} because when running the flow function f(0), we only know that the in-fact is 0 and that the statement is "b.f = new X();". Because we don't know that a and b refer to the allocation site allocA, we don't know that they are aliased, and we therefore cannot know that a.f should also point to allocX after the statement.
IFDS runs on the assumption of distributiveness: merging the out-sets after running the flow-function should yield the same results as merging the in-sets before running the flow-function.
In other words, if you need to combine information from multiple elements on the in-set to create a certain data-flow fact in your out-set, then you are not distributive, and should not express your problem in IFDS (unless you do something to handle those combination cases, like the authors of the paper you refer to as [1] did).
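The same point can be checked mechanically on the "x = y.n" example from the question. The sketch below is only an illustration, not how an IFDS solver is implemented: points-to edges are encoded as plain strings (an assumption), strong updates are ignored, and the flow function is applied once to the merged set and once to each fact separately.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical encoding: a points-to edge is a string such as "y->obj1".
class DistributivitySketch {

    /** Flow function for "x = y.n": keeps the input and adds x->o2 when y->o1 and o1.n->o2 are both present. */
    static Set<String> flow(Set<String> in) {
        Set<String> out = new HashSet<>(in);
        for (String e1 : in) {
            if (!e1.startsWith("y->")) continue;
            String base = e1.substring(3);   // the object y points to
            String prefix = base + ".n->";
            for (String e2 : in) {
                if (e2.startsWith(prefix)) out.add("x->" + e2.substring(prefix.length()));
            }
        }
        return out;
    }

    /** The merge function: plain set union. */
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> u = new HashSet<>(a);
        u.addAll(b);
        return u;
    }

    public static void main(String[] args) {
        Set<String> a = Collections.singleton("y->obj1");
        Set<String> b = Collections.singleton("obj1.n->obj2");
        // Merge first, then flow: both facts are visible together.
        System.out.println(flow(union(a, b)).contains("x->obj2"));        // true
        // Flow each fact alone, then merge: the combined fact is never produced.
        System.out.println(union(flow(a), flow(b)).contains("x->obj2"));  // false
    }
}
```

Since f(A ∪ B) and f(A) ∪ f(B) differ here, this flow function is not distributive.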

Convex optimization, java

I'm looking for a Java library to solve this problem:
We know X is sparse (most of its entries are zero), so X can be recovered by solving this:
variable X;
minimize(norm(X,1)+norm(A*X - Y,2));
This is MATLAB code; matrix A and vector Y are known and I want the best X.
I saw JOptimizer, but I couldn't use it. (Doesn't have good documentation or examples).

What you need is a reasonably good LP Solver.
Possible Java LP Solver Options
Apache Commons (Math) Simplex Solver.
See this blog post.
If you have access to CPLEX (not-free), its Java API would work great.
Also, you can look into SuanShu, a Java numerical and statistical library
lpSolve has a Java wrapper which can do the job.
Finally, JOptimizer is indeed a good option. Not sure if you looked at this example.
Hope at least one of those helps.

As far as I can tell, you're trying to solve a binary integer program for feasibility
Ax = b, x in {0,1}.
I'm not completely sure, but it seems that you might be interested in the optimization problem
min 1'*x
s.t. Ax = b, x in {0,1}
where 1 is a vector of 1's of the same dimension as x.
The feasibility problem may be in practice much easier than the optimization problem - it all depends on a particular A and b.
If you can get a license for either CPLEX or Gurobi (if you're an academic), these are excellent integer programming solvers with good Java APIs. If you don't have access to these, lpsolve may be a good option.
As far as I can tell, JOptimizer will not solve your problem since your variables are integers (although I have never used JOptimizer).

To solve convex optimization problems in java you can use the following library https://github.com/erikerlandson/gibbous

Formula manipulation algorithm

I want to make a program that, when given a formula, can manipulate the formula to make any value (or in the case of simultaneous formulas, a common value) the subject of the formula.
For example if given:
a + b = c
d + b = c
The program should therefore say:
b = c - a, d = c - b etc.
I'm not sure if java can do this automatically or not when I give the original formula as input. I am not really interested in solving the equation and getting the result of each variable, I am just interested in returning a manipulated formula.
Please let me know if I need to make an algorithm or not for this, and if so, how would I go about doing this. Also, if there are any helpful links that you might have, please post them.
Regards

Take a look at JavaCC. It's a little daunting at first but it's the right tool for something like this. Plus there are already examples of what you are trying to achieve.

Not sure what exactly you are after, but this problem in its general form is hard. Very hard.
In fact, given a set of "formulas" (axioms) and deduction rules (mathematical equivalence operations), we cannot in general decide whether a given formula is correct or not. This problem is actually undecidable.
This issue was first posed by Hilbert as the Entscheidungsproblem.

I read a book called Fluid Concepts and Creative Analogies by Douglas Hofstadter that talked about this sort of algebraic manipulations that would automatically rewrite equations in other ways attempting to join equations to other equations an infinite (yet restricted) number of ways given rules. It was an attempt to prove yet unproven theorems/proofs by brute force.
http://en.wikipedia.org/wiki/Fluid_Concepts_and_Creative_Analogies
Douglas Hofstadter's Numbo program attempts to do what you want. He doesn't give you the source, only describes how it works in detail.
It sounds like you want a program to do what high school students do when they solve algebraic problems: move from a position where you know something, modify it and combine it with other equations, to prove something previously unknown. It takes a strong artificial intelligence to do this. The part of your brain that does this is the neocortex, which does science, and its operating principle is as yet not understood.
If you want something that will do what college students do when they manipulate equations in calculus, you'll have to build a fairly strong artificial intelligence.
http://en.wikipedia.org/wiki/Neocortex
When we can do whole-brain emulation of a human neo cortex, I will post the answer here.

Yes, you need to write some algorithm to do this kind of computer algebra. At least
a parser to interpret the input
an algebra model to relate parsed operands ('a', 'b', ...) and operator ('+', '=')
implement any appropriate rule to support the manipulation you wish to do
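As a rough illustration of those three pieces, here is a deliberately tiny sketch that handles only equations of the form "p + q = r" with distinct variable names. A regex stands in for the parser, three captured names stand in for the algebra model, and the rewrite rule is "move the other term across the equals sign". All names are made up for the example. A real tool would parse into an expression tree instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: only understands equations of the form "p + q = r".
class RearrangeSketch {

    private static final Pattern SUM =
        Pattern.compile("\\s*(\\w+)\\s*\\+\\s*(\\w+)\\s*=\\s*(\\w+)\\s*");

    /** Makes `subject` the subject of the equation, or throws if it cannot. */
    static String solveFor(String equation, String subject) {
        Matcher m = SUM.matcher(equation);                       // parser step
        if (!m.matches()) throw new IllegalArgumentException("unsupported form: " + equation);
        String left = m.group(1), right = m.group(2), total = m.group(3);  // algebra model
        if (left.equals(right)) throw new IllegalArgumentException("repeated variable: " + equation);
        // Rewrite rules: move the other term across "=" by flipping its sign.
        if (subject.equals(left)) return left + " = " + total + " - " + right;
        if (subject.equals(right)) return right + " = " + total + " - " + left;
        if (subject.equals(total)) return total + " = " + left + " + " + right;
        throw new IllegalArgumentException(subject + " does not occur in: " + equation);
    }

    public static void main(String[] args) {
        System.out.println(solveFor("a + b = c", "b")); // b = c - a
        System.out.println(solveFor("d + b = c", "d")); // d = c - b
    }
}
```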
