Choco solver propagation and search strategy interaction - java

I have started to use choco-solver in my work and I don't understand how propagators and search strategies interact with each other.
I think Choco has some flag that tells it whether the domain of any constrained variable changed during propagation. If so, propagation runs again and again until no more domain changes occur. Only after that, if the constraints are still neither satisfied nor failed, is the search strategy brought into the solving process.
But the output of my program shows me that I'm wrong. The propagators really do run 2 or 3 times, changing domains each time, but then the search strategy is called.
Please help me: where am I wrong in my conclusions?
Or should it work just the way I think, and there is some mistake in my code that leads to the wrong output?
Sorry for my bad English.

Choco is a Constraint Programming solver; these solvers all work according to the same principle.
Unlike a brute-force search, a constraint solver will first call all (relevant) propagators to remove, from the variable domains, values that it knows the variables can't take. Calling one propagator might make new values impossible and might thus trigger other propagators to run again.
Once all propagators report that they can't remove any more values (we call this a fixpoint), the search strategy is consulted to see what to do next. (In general this is a guess at what a solution might look like, and we might need to backtrack.)
If all variables have only one possible value, this is a solution. However, it can happen that during search a variable loses all its possible values. In this case a propagator fails. If we have already made search decisions, we need to backtrack. If this happens at the root node, the problem is unsatisfiable.
For more information, try the tutorials of several constraint solvers. A lot of them can be found via Wikipedia. You might also be able to find online courses.
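If you want to watch this interleaving yourself, Choco can trace it. Below is a minimal sketch (assuming Choco 4.x; the toy model and variable names are made up). With showDecisions() enabled you will see propagation shrink the domains to a fixpoint, then a search decision, then propagation again, and so on:

import org.chocosolver.solver.Model;
import org.chocosolver.solver.Solver;
import org.chocosolver.solver.search.strategy.Search;
import org.chocosolver.solver.variables.IntVar;

public class FixpointDemo {
    public static void main(String[] args) {
        Model model = new Model("fixpoint demo");
        IntVar x = model.intVar("x", 0, 10);
        IntVar y = model.intVar("y", 0, 10);
        model.arithm(x, "+", y, "=", 10).post(); // each posted constraint contributes propagators
        model.arithm(x, ">", y).post();

        Solver solver = model.getSolver();
        solver.setSearch(Search.inputOrderLBSearch(x, y)); // consulted only once a fixpoint is reached
        solver.showDecisions(); // logs each branching decision taken by the strategy

        while (solver.solve()) {
            System.out.println(x.getValue() + ", " + y.getValue());
        }
    }
}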

To complete Dekker's answer, and based on my experience: in practice the fixpoint is generally reached within a pretty small number of iterations. Propagation can still be slow (because there are many constraints, or because "global constraints" may be slow to propagate), but it is rare for the ping-pong effect to be dramatic. Choco Solver and similar solvers have many tricks to make propagation efficient...
So it is completely OK that each propagator is called only 2-3 times before branching.

Related

Dealing with uninitialized solutions in OptaPlanner

I'm creating a schedule generator for a school and I am facing two challenges:
1: User feedback during construction phase
During the construction heuristic phase I'm not getting any callbacks to the bestSolutionConsumer passed to SolverManager.solveAndListen, which means that I'm not able to give any feedback to the user during this phase. (It's only about 10 seconds or so as of today, but still annoying.)
I suspect that this is by design (judging from this question), but please correct me if I'm wrong.
(I suspect that the idea is that the construction heuristic phase should be quick anyway, and that 99% of a long running solve will be spent in the local search phase, and thus that's the only phase that actually matters. Correct?)
2: Manual placement of lectures
This scheduling program will only be semi-automatic. I'd like the user to be able to pin lectures, move lectures around manually, and even remove lectures from the schedule by putting them in a pile on the side for later placement (where the later placement could perhaps be done by OptaPlanner).
Rethink definition of initialized?
This led me to rethink what I consider an initialized solution. If I...
want to have progress feedback even during initial placement of lectures, and
want to allow the user to interact with a schedule where only half of the lectures have been scheduled
...then maybe I should make the timeslot nullable or have a sentinel timeslot value for unscheduled lectures, and simply penalize such solutions.
In this scenario, I'm imagining that a solution is immediately and trivially initialized (all lectures initially in the unscheduled state, but formally speaking the solution is still initialized) and that the construction phase is basically skipped.
Questions
Is this stupid for some reason?! It feels like I'm throwing out a large part of OptaPlanner's capabilities.
Am I overlooking any downsides of this approach?
Is it even possible to skip construction phase by doing this?
Also, repeated planning is of importance to me, and the docs say:
Repeated planning (especially real-time planning) does not mix well with a nullable planning variable.
Does the same apply also to the approach of using an unscheduled sentinel value?
1/ No, this is not stupid. It is, in fact, an example of over-constrained planning.
2/ Well, now that variables are nullable, you need to write your constraints such that nulls are acceptable. And you may run into situations where the solver will find it easier to just leave some variables null, unless there is a pretty substantial penalty for that. You may have to design special constraints to work around that, or in the worst case even custom moves.
3/ Construction heuristics are not mandatory, but they can still be useful. Even if they leave some variables null, they can still give you a decent initial solution. You may also want to try a custom phase.
4/ If you worry about some of the things above, indeed introducing a dummy value instead of making a variable nullable could solve some of those worries. (And introduce others, as every constraint now has to deal with this dummy value.)
My advice is to do a quick proof of concept. See how each of the approaches behaves. Pick the one that you prefer to deal with. There are no silver bullets.
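As a rough illustration of points 2/ and 4/: below is a minimal sketch of the nullable-variable approach, assuming OptaPlanner 8 constraint streams (exact API details vary between versions; Timeslot is the question's planning value class, and the penalty weight is an arbitrary placeholder):

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.variable.PlanningVariable;

@PlanningEntity
public class Lecture {
    // nullable = true: the solution counts as initialized even while some
    // lectures are still unscheduled (timeslot == null).
    @PlanningVariable(valueRangeProviderRefs = "timeslotRange", nullable = true)
    private Timeslot timeslot;

    public Timeslot getTimeslot() { return timeslot; }
    public void setTimeslot(Timeslot timeslot) { this.timeslot = timeslot; }
}

And in the ConstraintProvider, the penalty that keeps the solver from happily leaving everything null:

import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;
import org.optaplanner.core.api.score.stream.Constraint;
import org.optaplanner.core.api.score.stream.ConstraintFactory;

// forEachIncludingNullVars also matches entities whose planning variable is null
Constraint unassignedLecture(ConstraintFactory factory) {
    return factory.forEachIncludingNullVars(Lecture.class)
            .filter(lecture -> lecture.getTimeslot() == null)
            .penalize(HardSoftScore.ofSoft(1000)) // must outweigh the gain of dodging other constraints
            .asConstraint("Unassigned lecture");
}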

How to read Drools memory in order to detect partially matched rules

I'd like to be able to read Drools memory in such a way that I can detect which condition has matched, even though the rule didn't fire in the end.
Say I have this rule:
rule "MyRule"
when
FirstFact(condition == "str")
SecondFast(anotherCondition > 30)
then
...
end
If I insert only a FirstFact object into memory and call fireAllRules(), the rule will not fire. But I'd still like to be able to track down that the first condition of this rule matched.
I understand this is a weird requirement, and it may take some time to develop as it would probably not be straightforward, but if there's a way to do it, I'm interested.
I was thinking of accessing Drools memory and visiting all the conditions contained in the Rete tree, but I am not sure whether that's a good approach, or even possible.
Thanks!
Because of the way RETE works, what you are trying to do is not possible directly. Please read this other question to get an idea of a possible solution: Drools 7, event listener to whenever a rule is activated (even if partially matched)
Hope it helps,
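The workaround usually suggested there is to split the rule so that each interesting condition becomes a full match of a smaller auxiliary rule, which the engine can then report normally. A sketch in the question's own DRL (FirstFactMatched is a hypothetical marker class you would add to your fact model):

rule "MyRule - first condition"
when
    FirstFact(condition == "str")
then
    insertLogical(new FirstFactMatched()); // logically inserted: retracted automatically if the condition stops holding
end

rule "MyRule"
when
    FirstFactMatched()
    SecondFact(anotherCondition > 30)
then
    // original consequence
end

You can then observe FirstFactMatched facts through a query or a working-memory event listener, which tells you the first condition matched even when "MyRule" itself never fires.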

Quest editor in java for big game

It's a very specific problem I have:
I'm working on a text-based RPG, where the main work is to implement an editor that makes it possible to add NPCs and items and place them on the map (...) without any programming knowledge.
All of these things work fine with some SQL queries, and the whole thing already works. Now I'm working on quest editing. My basic concept is that every time the player enters a command, a database entry for that specific string is queried, which is linked to a set of conditions and actions that have unique IDs. Those are queried in the Java code, where a specific condition (e.g. that the player's money equals 100) has a piece of code that returns the result. This means that hundreds (or more) of ifs have to be evaluated each time a command is entered, and the same goes for the actions tied to the command. I'm not even sure if that is the right way (if anyone has a proposal for this, feel free to post).
The point is that quests basically consist of quest stages, which also have conditions for being enabled and actions performed when enabled. That means that with each entered command, all of these queries also have to take place. I thought about using some kind of trigger, but I don't have a good idea how to implement it, because I don't really want to edit Java code from this editor. I also considered using Prolog, but in that case I'd also have to add triggers to the Java code, I guess.
I know that this is a little bit hard to handle in a forum like this, but if anyone has a suggestion, I'd be really glad.
EDIT:
As suggested in a comment, I'd like to shorten the whole thing: if any command (out of hundreds or thousands) could trigger one particular quest/quest stage (out of thousands), and these triggers should be set with an editor, what's a proper way to implement that?
Reasoning over lots of facts and triggering actions when a set of facts matches specific conditions is a good match for Drools.
You could represent every action/decision that the player has made as a fact, which you could insert into a Drools knowledge session.
In that session you could store all of your "triggers" as Drools rules, which will fire when a collection of facts in memory matches the condition.
Drools supports dynamic addition/removal/editing of rules and is explicitly targeted at allowing non-developers to write logic using a simpler rule language.
The specific part of Drools to start with is the core: Drools Expert.
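For example, a generated quest-stage trigger could look roughly like the following sketch (all fact and field names are invented for illustration):

rule "Quest 42 / stage 3: merchant rewards the player"
when
    Command(text == "talk merchant")               // the command the player just entered
    Player(money >= 100)                           // a condition row from your editor
    $stage : QuestStage(questId == 42, stage == 2) // the previous stage must be active
then
    modify($stage) { setStage(3) }                 // an action row from your editor
end

On the Java side, each time the player enters a command you insert a Command fact, call fireAllRules(), and retract the fact again; the engine only evaluates the rules whose conditions that fact can affect, instead of running through hundreds of ifs.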

When to stop following the advice of static code analysis?

I have been using static code analysis on a project with more than 100,000 lines of Java code for quite a while now. I started with FindBugs, which gave me around 1500 issues at the beginning. I fixed the most severe ones over time and started using additional tools like PMD, Lint4J, JNorm and now Enerjy.
With the more severe issues fixed, there is a huge number of low-severity issues left. How do you handle these low-priority issues?
Do you try fixing all of them?
Or only in newly written code?
Do you regularly disable certain rules? (I find that I do in nearly all of the available tools.)
And if you ignore or disable rules, do you document that? What do your managers say about "leaving some thousand low priority issues not fixed"? Do you use (multiple) tool-specific comments in the code, or is there a better way?
Keep in mind that static analysis tools are meant to generate a lot of false positives; this is the price you pay for generally avoiding false negatives. That is, they assume that you would much rather be told incorrectly that something looks suspicious (a false positive) than be told that something that's actually a problem is perfectly fine (a false negative).
So in general, you should be configuring these tools rather than accepting the out-of-the-box defaults, which generate a lot of noise.
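As one concrete form of such configuration: besides a project-wide rules file, most Java tools let you silence a single finding at the use site with a recorded justification, rather than disabling the whole rule. A sketch using the FindBugs/SpotBugs annotations (ReportCache is a made-up class; EI_EXPOSE_REP is a real FindBugs pattern):

import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;

public class ReportCache {
    private final byte[] buffer = new byte[1024];

    // Suppress one specific finding, with the reason recorded next to the code,
    // instead of disabling the rule for the whole project.
    @SuppressFBWarnings(
            value = "EI_EXPOSE_REP",
            justification = "Callers are internal and performance-critical; copying is too costly")
    public byte[] rawBuffer() {
        return buffer;
    }
}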
Do you try fixing all of them?
On projects where I have technical control, my standard modus operandi is to encourage a culture where people review all new static analysis defects from our CI system. If, over a period of time, we decline to fix enough defects of a specific kind, we disable that rule, since it has become just noise. Every so often we'll look at the disabled rules to make sure that they're still relevant.
But once we've turned off the less effective rules, yes, we fix all the defects. If you think that something is a problem, you should fix it unless other things you have to do take priority. Ignoring it damages your team's culture and sends the wrong message.
And if you ignore or disable rules, do you document those?
The rules file is part of our project, so a commit message is sufficient to document the fact that such-and-such rules were disabled in this commit.
What do your managers say about "leaving some thousand low priority issues not fixed"?
Nothing. We make sure that they understand what we're doing and why, but they're usually (rightfully so) focused on higher-level metrics, like velocity and overall project health.
What do your managers say about "leaving some thousand low priority issues not fixed"?
I expect managers to prioritize: to decide (or be told) whether any task is high or low priority, and to be happy with people spending time on high-priority instead of low-priority tasks.
If you look at the analogy of your bug-tracking database, a good number of the bugs reported there are low-priority bugs that you'll probably never get to. Sure, they are real bugs and you would like to fix them, but most programmers work under very real constraints and don't have the time to address every concern. I wrote an article recently about the special nature of static analysis defects.
One important difference about addressing static analysis bugs, though, is that they are typically much easier to deal with than a regularly reported bug. Thus a quick scan of the defects to identify not only the high-priority items to fix but also the ones that are easiest to fix can be useful. Static analysis defects, after all, are detected very early in the development process, and the specific parts of the code in question are very plainly spelled out. Thus you'll likely catch quite a few pieces of low-hanging fruit among the lower-priority ones.
The various strategies I've seen used to make this successful include:
* First of all, make sure the analysis is tuned properly. Static analysis comes "out of the box" with factory settings and can't possibly understand all code. If you can't tune it yourself, get some help (disclaimer: we provide some of that type of help). You'll lower the false-positive rate and find more good bugs.
* Identify the characteristics that, for the most part, prioritize the defects (these could be specific categories, specific areas of the code, built-in prioritization scoring provided by the static analysis tool, etc.).
* Determine what threshold is important and possibly make it an acceptance criterion (e.g. all high and critical issues need to be addressed).
* Make sure that each defect that blocks the acceptance criteria is addressed (addressed meaning that it at least has to be looked at, because some could be false positives).
* Make sure that the ones marked as false positives are checked, either through a peer code review process or through a tail-end audit process, so you don't have any mismarking problems.
Bottom line: choose what to focus on, review what is necessary, and don't fix everything unless your business requirements make you.
Decide what rules you want to follow at your company and disable the rest. A lot of the rules that you can set up with those tools are subjective style issues and can safely be ignored. You should document what rules you are following once in a company style guide (so you don't clutter your code with that information).
I wouldn't make a change to old code just based on the recommendation of a static analysis tool. That code is already tested and presumably working. If I need to go into the code and make a change for any other reason, then I'll run the analysis and make any changes it recommends.
The key, in my mind, is to at least review all the issues, even if you decide in the end not to fix them. The reason is that bugs, like misery, love company. I can't count how many times I've found all kinds of nasty bugs that FindBugs didn't report, but found them by looking at seemingly unimportant ones that it did report.
I personally find code analysis useful but overrated.
I can't speak for Java, but C# also has code analysis. I agree it gives you a lot of smart ideas about how to do things better, but at times the recommendations are just annoying. Some suggestions are not based on common sense but are only a matter of style. After some time playing around with code analysis, I stopped using it. The reason is that I happened to disagree with many warnings and didn't want to follow the proposals.
In general, I would recommend doing what the code analysis says, but only up to a certain point. Whatever seems to be a matter of personal style on the part of whoever wrote the rule definitions can easily be ignored. You can add exceptions for rule 1, then 2, then 3... until it gets old. Then you just deactivate the feature.

Could the Drools update method potentially be causing my problems

I am currently writing an application using Drools 5.0. This application seems to be running a little slowly, but I have a theory why. It receives many updates for facts already stored in the knowledge session, and under the hood the Drools update function really does a retraction followed by an insertion. This application has over 200 rules. Some of the rules are written to fire when certain facts are removed; others are written to fire when certain facts are asserted into the knowledge session. Since update really does a retraction and then an insertion, will the retraction- and insertion-related rules still fire during an update, even though nothing is really being inserted into or retracted from the knowledge session?
One thing to note: I hooked up the WorkingMemoryFileLogger to my knowledge session to get a better idea about what's going on. That's when I saw lots of unexpected retraction/insertion rule activations being added to the agenda, though it seems they never actually fire. It seems to me that updating facts can be expensive, especially depending on your fact model, and should be used sparingly. Is this correct?
I think you have understood it correctly. An update is kind of like a retract plus an assert.
The first thing to be sure of is whether your rules are giving you what you want - i.e. do they work, and you just want to improve performance?
In some ways you can think of an update (also check out the "modify" keyword) as part of the evils of mutability ;) When you update, you tell the network that the fact has changed, but it doesn't yet track changes at the field level (that is TBD), so it may cause more work than is necessary: all these activations are created that are not really needed (as they involve fields whose values didn't actually change).
It's hard to be more specific - if you provided some sample rules/fact model (if you can, in a safe way of course!) we might be able to suggest some ideas for breaking it down to be more granular.
Good luck!
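To make the retract-plus-assert point concrete, this is roughly what such an update looks like at the Drools 5 API level (a sketch; the Account fact class and the pre-built knowledge base are assumptions):

import org.drools.KnowledgeBase;
import org.drools.runtime.StatefulKnowledgeSession;
import org.drools.runtime.rule.FactHandle;

public class UpdateDemo {
    public static void run(KnowledgeBase kbase) {
        StatefulKnowledgeSession ksession = kbase.newStatefulKnowledgeSession();

        Account account = new Account("ACME", 100); // hypothetical fact class
        FactHandle handle = ksession.insert(account);
        ksession.fireAllRules();

        // A new reading arrives for the same fact. Under the hood this is
        // retract + assert, so retraction- and insertion-sensitive rules can
        // re-activate even when no field they test actually changed.
        account.setBalance(150);
        ksession.update(handle, account);
        ksession.fireAllRules();

        ksession.dispose();
    }
}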
The best way to know is to profile the app and find out exactly what's happening. Use something like OptimizeIt or JProbe in combination with the jvisualvm.exe that ships with JDK 1.6. Don't guess - get more data.
In my experience, the update() method is only necessary if you need an entity to be reevaluated by the when clause within the context of the currently executing rule. Since the RETE evaluation happens all at once upon entry into the rule, removing some update() statements (where possible) will speed up execution. Sometimes this involves setting some flags and postponing the real update() until a later rule. You can also move some of the evaluation of the current entity's state into an if statement in the then clause, using the when clause only for more basic filtering.
