Z3 producing different models when run multiple times - java

I've been using Z3 with the Java bindings for 2 years now.
For some reason, I've always generated the SMT-LIB2 code myself as a String and then used parseSMTLIB2String to build the corresponding Z3 Expr.
As far as I can remember, every time I entered the exact same input twice with this method, I always got the same model.
But I recently decided to change and to use the Java API directly, building the expressions with ctx.mk...(). Basically, I'm not generating the String and then parsing it, but letting Z3 do the job of building the Z3 Expr.
What happens now is that I get different models while I've checked that the solver does indeed store the exact same code.
My Java code looks something like this:
static final Context context = new Context();
static final Solver solver = context.mkSolver();

public static void someFunction() {
    solver.add(context.mk...()); // Add some bool expr to the solver
    Status status = solver.check();
    if (status == Status.SATISFIABLE) {
        System.out.println(solver.getModel()); // Prints different models for the same expr
    }
}
I'm making more than one call to someFunction() during runtime, and the checked expression context.mk...() changes. But if I run my program twice, the same sequence of expressions is checked, and it sometimes gives me different models from one run to another.
I've tried disabling the auto-config parameter and setting my own random seed, but Z3 still produces different models sometimes. I'm only using bounded Integer variables and uninterpreted functions.
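For reference, this is roughly what that attempt looks like through the Java bindings (a minimal sketch; I'm assuming the global parameter names auto_config, smt.random_seed and sat.random_seed here, and that they must be set before the Context is created; the constraint is just a placeholder):

import com.microsoft.z3.*;

public class Z3SeedSketch {
    public static void main(String[] args) {
        // Global parameters must be set before the Context is created.
        Global.setParameter("auto_config", "false");
        Global.setParameter("smt.random_seed", "42");
        Global.setParameter("sat.random_seed", "42");

        Context context = new Context();
        Solver solver = context.mkSolver();
        solver.add(context.mkEq(context.mkIntConst("x"), context.mkInt(1))); // placeholder constraint
        System.out.println(solver.check());
        System.out.println(solver.getModel());
    }
}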
Am I using the API in the wrong way?
I could add the whole SMTLib2 code to this question if needed but it isn't really short and contains multiple solver calls (I don't even know which of them will produce a different model from one execution to another, I just know that some do).
I should point out that I've read the following threads, but found the answers to be either outdated or (if I understood correctly) in favour of "Z3 is deterministic and should produce the same model for the same input":
Z3 timing variation
Randomness in Z3 Results
different run time for the same code in Z3
Edit:
Surprisingly enough, with the following code I seem to always get the same models, and Z3 now seems deterministic. However, the memory consumption is huge compared to my previous code, since I need to keep the context in memory for a while. Any idea what I could do to achieve the same behaviour with less memory use?
public static void someFunction() {
    Context context = new Context();
    Solver solver = context.mkSolver();
    solver.add(context.mk...()); // Add some bool expr to the solver
    Status status = solver.check();
    if (status == Status.SATISFIABLE) {
        System.out.println(solver.getModel()); // Seems to always print the same model :-)
    }
}
Here is the memory consumption I get from calling the method "someFunction" multiple times:

As long as it doesn't toggle between SAT and UNSAT on the same problem, it's not a bug.
One of the answers you linked explains what's happening:
Randomness in Z3 Results
"That being said, if we solve the same problem twice in the same execution path, then Z3 can produce different models. Z3 assigns internal unique IDs to expressions. The internal IDs are used to break ties in some heuristics used by Z3. Note that the loop in your program is creating/deleting expressions. So, in each iteration, the expressions representing your constraints may have different internal IDs, and consequently the solver may produce different solutions."
Perhaps when Z3 parses the input it assigns the same IDs each time, whereas with the API they may differ, although I'd find that a bit hard to believe...
If you need this behavior and you're sure you were getting it with the SMT encoding, you could always print the expressions built with the API and then parse them back in.
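If it helps, a minimal sketch of that round-trip with the Java bindings might look like this (the constraint is just a placeholder, and the exact signature and return type of parseSMTLIB2String differ between Z3 versions, so treat this as an outline rather than guaranteed working code):

import com.microsoft.z3.*;

public class RoundTripSketch {
    public static void main(String[] args) {
        Context context = new Context();

        // Build a constraint with the API and print it as SMT-LIB2 text.
        BoolExpr built = context.mkEq(context.mkIntConst("x"), context.mkInt(1));
        Solver printer = context.mkSolver();
        printer.add(built);
        String smt2 = printer.toString(); // the asserted constraints in SMT-LIB2 form

        // Parse the text back in; recent versions return a BoolExpr[] here.
        BoolExpr[] reparsed = context.parseSMTLIB2String(smt2, null, null, null, null);

        Solver solver = context.mkSolver();
        solver.add(reparsed);
        System.out.println(solver.check());
        System.out.println(solver.getModel());
    }
}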

I think I spotted the specific parts of the code producing these strange opposite behaviours.
Maybe the Z3 experts around can tell me if I'm completely wrong.
First of all, if I try the same code (no matter if it's manually generated code or code generated with the API) twice in a single run of my program, I sometimes end up with different models. That is something I didn't notice before, and this actually isn't a real problem for me.
My main concern however is what happens if I run my program twice, checking the exact same code during the two runs.
When I'm generating the code manually, I end up with function definitions like this:
(declare-fun var!0 () Int)
(declare-fun var!2 () Int)
(declare-fun var!42 () Int)
(assert (and
  (or (= var!0 0) (= var!0 1))
  (or (= var!2 0) (= var!2 1))
  (or (= var!42 0) (= var!42 1))
))
(define-fun fun ((i! Int)) Int
  (ite (= i! 0) var!0
    (ite (= i! 1) var!2
      (ite (= i! 2) var!42 -1)
    )
  )
)
As far as I can tell (and from what I've read about it (see here)), the API doesn't offer a direct equivalent of the define-fun I used for the "fun" function.
So what I did to define it with the API was something like this:
(declare-fun var!0 () Int)
(declare-fun var!2 () Int)
(declare-fun var!42 () Int)
(assert (and
  (or (= var!0 0) (= var!0 1))
  (or (= var!2 0) (= var!2 1))
  (or (= var!42 0) (= var!42 1))
))
(declare-fun fun (Int) Int)
(assert (forall
  ((i! Int))
  (ite (= i! 0) (= (fun i!) var!0)
    (ite (= i! 1) (= (fun i!) var!2)
      (ite (= i! 2) (= (fun i!) var!42) (= (fun i!) -1))
    )
  )
))
It seems that with the first method, checking the same code for different runs always (or at least so often that it isn't a real problem for me) gives the same models.
With the second method, checking the same code for different runs very often gives different models.
Can anybody tell me if there is indeed some logic behind what I've described, regarding how Z3 actually works?
Since I need my results to be as reproducible as possible, I went back to the manual code generation and it seems to work perfectly fine. I would love to see a function in the API allowing us to define functions directly, rather than having to use the "forall" encoding, and see whether what I just described holds or not.
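For what it's worth, since the index domain is bounded here, one quantifier-free alternative is to declare fun as an uninterpreted function and assert one ground equality per concrete index. A sketch with the Java API (the older, non-generic bindings are assumed; the names and the bound of 3 indices are only illustrative, and unlike the define-fun version, fun stays unconstrained on all other indices):

import com.microsoft.z3.*;

public class BoundedFunSketch {
    public static void main(String[] args) {
        Context ctx = new Context();
        Solver solver = ctx.mkSolver();

        // var!0, var!1, var!2, each constrained to be 0 or 1.
        IntExpr[] vars = new IntExpr[3];
        for (int i = 0; i < vars.length; i++) {
            vars[i] = ctx.mkIntConst("var!" + i);
            solver.add(ctx.mkOr(ctx.mkEq(vars[i], ctx.mkInt(0)),
                                ctx.mkEq(vars[i], ctx.mkInt(1))));
        }

        // fun : Int -> Int, pinned down only on the indices that are actually used.
        FuncDecl fun = ctx.mkFuncDecl("fun", ctx.getIntSort(), ctx.getIntSort());
        for (int i = 0; i < vars.length; i++) {
            solver.add(ctx.mkEq(ctx.mkApp(fun, ctx.mkInt(i)), vars[i]));
        }

        System.out.println(solver.check());
        System.out.println(solver.getModel());
    }
}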

Related

java ternary operator can be replaced with Math.max call

I have the following code: delay = (delay > 200) ? delay : 200;
Java issues the warning "Can be replaced with 'Math.max' call" for this.
Here I see that Math.max(a, b) is actually the same as (a > b) ? a : b, so the ternary operator is not worse than Math.max.
So why does Java issue this warning message if there are no advantages to replacing the ternary operator with a Math.max method call?
I doubt that this is a real compiler warning, probably some IDE inspection/warning.
Nonetheless, you are correct, there are no hard technical reasons to prefer one over the other.
But: from the point of view of a human reader, using Math.max() has one major advantage: it is easier to read and understand. That simple.
Besides: do not duplicate code unless you have to.
Always remember: you write your code for your human readers. Compilers accept anything that is syntactically correct. But for your human readers, there is a difference between a condition and an assignment vs a very telling "take the maximum of two numbers".
Math.max(a, b) is more readable than the ternary expression because:
the value 200 does not need to be repeated.
there is no need to write and mentally evaluate the > comparison.
In general, the ternary is more powerful because it lets you do things like this:
delay = (delay > 200) ? 200 : delay;
delay = (delay < 200) ? delay : 200;
delay = (delay > 200) ? delay : 300;
The reader of your code needs to understand which of those things you are actually doing. It takes time to parse it and understand it is a simple max().
The max shows your intention more clearly.
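To make that concrete, here is a small sketch (the wrapper class and the sample value are just for illustration): the first two ternaries in the list above are the same clamp in disguise, the third is not a min/max at all, and Math.max/Math.min state the intent directly.

public class ClampExample {
    public static void main(String[] args) {
        int delay = 150;

        // The first two ternaries above are the same operation in disguise:
        int a = Math.min(delay, 200); // delay = (delay > 200) ? 200 : delay;
        int b = Math.min(delay, 200); // delay = (delay < 200) ? delay : 200;

        // The third one is not a min/max at all, which only becomes clear
        // after reading the condition and both branches carefully:
        int c = (delay > 200) ? delay : 300;

        // The question's original line, with the intent made explicit:
        delay = Math.max(delay, 200);

        System.out.println(a + " " + b + " " + c + " " + delay); // 150 150 300 200
    }
}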
In addition to the existing answers, there can be a performance advantage if the lower limit (in your case, 200) is not a constant but a derived value:
delay = (delay > readLimitFromFile()) ? delay : readLimitFromFile();
This could end up doing 2 expensive disk-read operations, when one operation would be sufficient. Using Math.max:
delay = Math.max(delay, readLimitFromFile());
would use only one disk-read operation.

What is a distributive function under IFDS and why is pointer analysis non-distributive?

I'm doing an inter-procedural analysis project in Java at the moment and I'm looking into using an IFDS solver to compute the control flow graph of a program. I'm finding it hard to follow the maths involved in the description of the IFDS framework and graph reachability. I've read in several places that it's not possible to compute the points-to sets of a program using this solver because "pointer analysis is known to be a non-distributive problem." [1] Other sources have said that this is often specifically with regard to 'strong updates', which from what I can gather are field write statements.
I think I can basically follow how the solver computes edges and works out the dataflow facts. But I don't quite follow what f(A ∪ B) = f(A) ∪ f(B) means in practical terms as a definition of a distributive function, and therefore what it means to say that points-to analysis deals with non-distributive functions.
The linked source [1] gives an example specific to field write statements:
A a = new A();
A b = a;
A c = new C();
b.f = c;
It claims that in order to reason about the assignment to b.f one must also take into account all aliases of the base b. I can follow this. But what I don't understand is what are the properties of this action that make it non-distributive.
A similar (I think) example from [2]:
x = y.n
Where, before the statement, there are points-to edges y-->obj1 and obj1.n-->obj2 (where obj1 and obj2 are heap objects). They claim:
it is not possible to correctly deduce that the edge x-->obj2 should be generated after the statement if we consider each input edge independently. The flow function for this statement is a function of the points-to graph as a whole and cannot be decomposed into independent functions of each edge and then merged to get a correct result.
I think I almost understand what at least the first example is saying, but I am not getting the concept of distributive functions, which is blocking me from getting the full picture. Can anyone explain what a distributive or non-distributive function is on a practical basis with regards to pointer analysis, without using set theory, which I am having difficulty following?
[1] http://karimali.ca/resources/pubs/conf/ecoop/SpaethNAB16.pdf
[2] http://dl.acm.org/citation.cfm?doid=2487568.2487569 (paywall, sorry)
The distributiveness of a flow function is defined as f(a ⊓ b) = f(a) ⊓ f(b), with ⊓ being the merge operator. In IFDS, the merge ⊓ is defined as the set union ∪.
What this means is that it doesn't matter whether or not you apply the merge function before or after the flow function, you will get the same result in the end.
In a traditional data-flow analysis, you go through the statements of your CFG and propagate sets of data-flow facts. So with a flow function f, for each statement you compute f(in, stmt) = out, where in and out are the sets of information you want to keep. For example, given the in-set {(a, allocA), (b, allocA)} (denoting that the allocation site of objects a and b is allocA) and the statement "b.f = new X();" (which we will name allocX), you would likely get the out-set {(a, allocA), (b, allocA), (a.f, allocX), (b.f, allocX)}, because a and b are aliased.
IFDS explodes the in-set into its individual data-flow facts. So for each fact, instead of running your flow-function once with your entire in-set, you run it on each element of the in-set: ∀ d ∈ in, f(d, stmt) = out_d. The framework then merges all out_d together into the final out-set.
The issue here is that for each flow function, you don't have access to the entire in-set, meaning that for the example we presented above, running the flow-function f((a, allocA)) on the statement would yield a first out-set {(a, allocA)}, f((b, allocA)) would yield a second out-set {(b, allocA)}, and f(0) would yield a third out-set {(0), (b.f, allocX)}.
So the global out-set after you merge the results would be {(a, allocA), (b, allocA), (b.f, allocX)}. We are missing the fact {(a.f, allocX)} because when running the flow function f(0), we only know that the in-fact is 0 and that the statement is "b.f = new X();". Because we don't know that a and b refer to the allocation site allocA, we don't know that they are aliased, and we therefore cannot know that a.f should also point to allocX after the statement.
IFDS runs on the assumption of distributiveness: merging the out-sets after running the flow-function should yield the same results as merging the in-sets before running the flow-function.
In other words, if you need to combine information from multiple elements on the in-set to create a certain data-flow fact in your out-set, then you are not distributive, and should not express your problem in IFDS (unless you do something to handle those combination cases, like the authors of the paper you refer to as [1] did).
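As a toy illustration of that last point (plain Java sets rather than real IFDS code; the string encoding of facts like "a->allocA" is made up for the example), here is a flow function for b.f = new X(); applied once to the whole in-set and once fact-by-fact:

import java.util.*;

public class NonDistributiveDemo {

    // Toy flow function for the statement "b.f = new X();" (allocation site allocX).
    // A fact "v->s" means variable v points to allocation site s.
    // For every variable that the in-set shows to be an alias of b
    // (i.e. it points to the same site as b), add "<that variable>.f->allocX".
    static Set<String> flow(Set<String> in) {
        Set<String> out = new HashSet<>(in);
        out.add("b.f->allocX"); // b.f points to allocX after the statement in any case
        String bSite = null;
        for (String fact : in) {
            if (fact.startsWith("b->")) {
                bSite = fact.substring(3);
            }
        }
        if (bSite != null) {
            for (String fact : in) {
                String[] parts = fact.split("->");
                if (parts.length == 2 && parts[1].equals(bSite) && !parts[0].equals("b")) {
                    out.add(parts[0] + ".f->allocX"); // alias of b, so its .f also points to allocX
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> a = Set.of("a->allocA");
        Set<String> b = Set.of("b->allocA");

        // f(A ∪ B): both facts are visible at once, so the alias is discovered.
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        System.out.println("f(A ∪ B)    = " + flow(union));

        // f(A) ∪ f(B): each application sees a single fact, so the alias is missed.
        Set<String> merged = new HashSet<>(flow(a));
        merged.addAll(flow(b));
        System.out.println("f(A) ∪ f(B) = " + merged);
    }
}

Running it, f(A ∪ B) contains a.f->allocX while f(A) ∪ f(B) does not, which is exactly the distributivity equation failing.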

Java's switch equivalent in Clojure?

Is there an equivalent of Java's switch construct in Clojure? If yes, what is it? If not, do we have to use an if-else ladder to achieve it?
case is a good option, as pointed out by Jan.
cond is also very useful in many related circumstances, particularly if you want to switch on the basis of evaluating a range of different conditional expressions, e.g.
(defn account-message [balance]
  (cond
    (< balance 0) "Overdrawn!"
    (< balance 100) "Low balance"
    (> balance 1000000) "Rich as creosote"
    :else "Good balance"))
Note that the result of cond is determined by the first matching expression, so a negative balance will display "Overdrawn!" even though it also matches the low balance case.
[I have edited the code - removed the extra bracket at the end to make it work]
Try the case macro:
(case (+ 2 3)
6 "error"
5 "ok")
or with default value
(case (+ 2 3)
5 "ok"
"error")
Remember that according to the documentation
The test-constants are not evaluated. They must be compile-time literals, and need not be quoted. (...)
See more examples at ClojureDocs.
Though @Jan's and @mikera's suggestions to use case or cond (may I add condp to the list?) are sound from a functional¹ standpoint, and though case's limitations (e.g. test values can only be compile-time constants; a default return value is mandatory) mirror those of switch, there are some subtle differences:
case cannot be used with Java enum constants;
case's dispatch is based on hashing, AFAIK, which makes it comparable to hash maps in terms of performance; switch is way faster;
you cannot fall through with case, which means that you must use other options (condp with value sets?) to mirror switch's behaviour.
[¹] not functional as in functional-programming, functional as in fulfilling a function, serving a purpose.

Dynamically generating high performance functions in clojure

I'm trying to use Clojure to dynamically generate functions that can be applied to large volumes of data - i.e. a requirement is that the functions be compiled to bytecode in order to execute fast, but their specification is not known until run time.
e.g. suppose I specify functions with a simple DSL like:
(def my-spec [:add [:multiply 2 :param0] 3])
I would like to create a function compile-spec such that:
(compile-spec my-spec)
Would return a compiled function of one parameter x that returns 2x+3.
What is the best way to do this in Clojure?
Hamza Yerlikaya has already made the most important point, which is that Clojure code is always compiled. I'm just adding an illustration and some information on some low-hanging fruit for your optimisation efforts.
Firstly, the above point about Clojure's code always being compiled includes closures returned by higher-order functions and functions created by calling eval on fn / fn* forms and indeed anything else that can act as a Clojure function. Thus you don't need a separate DSL to describe functions, just use higher order functions (and possibly macros):
(defn make-affine-function [a b]
  (fn [x] (+ (* a x) b)))

((make-affine-function 31 47) 5)
; => 202
Things would be more interesting if your specs were to include information about the types of parameters, as then you could be interested in writing a macro to generate code using those type hints. The simplest example I can think of would be a variant of the above:
(defmacro make-primitive-affine-function [t a b]
  (let [cast #(list (symbol (name t)) %)
        x (gensym "x")]
    `(fn [~x] (+ (* ~(cast a) ~(cast x)) ~(cast b)))))

((make-primitive-affine-function :int 31 47) 5)
; => 202
Use :int, :long, :float or :double (or the non-namespace-qualified symbols of corresponding names) as the first argument to take advantage of unboxed primitive arithmetic appropriate for your argument types. Depending on what your function's doing, this may give you a very significant performance boost.
Other types of hints are normally provided with the #^Foo bar syntax (^Foo bar does the same thing in 1.2); if you want to add them to macro-generated code, investigate the with-meta function (you'll need to merge '{:tag Foo} into the metadata of the symbols representing the formal arguments to your functions or let-introduced locals that you wish to put type hints on).
Oh, and in case you'd still like to know how to implement your original idea...
You can always construct the Clojure expression to define your function -- (list 'fn ['x] (a-magic-function-to-generate-some-code some-args ...)) -- and call eval on the result. That would enable you to do something like the following (it would be simpler to require that the spec includes the parameter list, but here's a version assuming arguments are to be fished out from the spec, are all called paramFOO and are to be lexicographically sorted):
(require '[clojure.walk :as walk])

(defn compile-spec [spec]
  (let [params (atom #{})]
    (walk/prewalk
     (fn [item]
       (if (and (symbol? item) (.startsWith (name item) "param"))
         (do (swap! params conj item)
             item)
         item))
     spec)
    (eval `(fn [~@(sort @params)] ~@spec))))

(def my-spec '[(+ (* 31 param0) 47)])

((compile-spec my-spec) 5)
; => 202
The vast majority of the time, there is no good reason to do things this way and it should be avoided; use higher-order functions and macros instead. However, if you're doing something like, say, evolutionary programming, then it's there, providing the ultimate flexibility -- and the result is still a compiled function.
Even if you don't AOT compile your code, as soon as you define a function it gets compiled to bytecode on the fly.

Drools Rules: How can I use a method on "when" section?

I need to execute a method in the "when" section of a DSLR file and I'm not sure if it's possible. Example:
rule "WNPRules_10"
when
$reminder:Reminder(source == "HMI")
$user:User(isInAgeRange("30-100")==true)
Reminder(clickPercentual >= 10)
User(haveAtLeastOptIns("1,2,3,4") == true)
then
$reminder.setPriority(1);update($reminder);
end
(note: isInAgeRange() and haveAtLeastOptIns() are methods of User)
I tried with eval() and no errors appeared, but it didn't execute. Like this:
rule "WNPRules_10"
when
$reminder:Reminder(source == "HMI")
$user:User(eval($user.isInAgeRange("30-100")==true))
Reminder(clickPercentual >= 10)
User(eval($user.haveAtLeastOptIns("1,2,3,4") == true))
then
$reminder.setPriority(1);update($reminder);
end
How can I resolve this problem?
Your second attempt looks fairly confused. Also, you have two User patterns: do you want them to refer to the same instance of User, or can they be separate instances (or must they be separate)? That will change things a bit in some cases, depending on your intent.
In terms of the simplest rewrite I can think of:
rule "WNPRules_10"
when
$reminder:Reminder(source == "HMI")
$user:User()
eval($user.isInAgeRange("30-100") && $user.haveAtLeastOptIns("1,2,3,4"))
Reminder(clickPercentual >= 10)
then
$reminder.setPriority(1);update($reminder);
end
Note the use of the eval() top-level element. It also uses only one User pattern and then applies the constraints to it. (In a future version, inline evals will work without having to write eval!)
