Right now, in my Drools project I have two groups of rules in separate DRL files, split by agenda group. For the agenda group "preCheck" I am setting auto-focus to true for each rule in that group. Example:
rule "preCheckDuplicate"
agenda-group "preCheck"
auto-focus true
no-loop true
salience 50
when
$f : IngestFileMetadata(isDuplicate.equalsIgnoreCase("True"))
then
$f.setIsDuplicate("True");
end
For the other agenda group, "defaultRules", the rules do NOT have the auto-focus attribute set. Example:
rule "duplicate file default"
agenda-group "defaultRules"
activation-group "isDuplicate"
no-loop true
salience 0
when
$f : IngestFileMetadata(isDuplicate.equals("True"))
then
insert(createResponse($f));
end
When invoking the rules via the REST API, I am also trying to set the focus to the "preCheck" agenda group through the JSON payload. Example:
{
  "lookup": "defaultStatelessKieSession",
  "set-focus": "preCheck",
  "commands": [
    {
      "insert": {
        "out-identifier": "IngestFileMetadata",
        "return-object": "true",
        "entry-point": "DEFAULT",
        "object": {
          "com.hms.ingestion.rules.IngestFileMetadata": {
            * * * * * data attributes here * * * * *
          }
        }
      }
    },
    {
      "fire-all-rules": {"out-identifier": "fired"}
    },
    {
      "query": {"name": "rulesResponses", "out-identifier": "rulesResponses"}
    }
  ]
}
However, when the rules are executed, it seems like the rules in the "defaultRules" agenda group are evaluated first, and I have no idea why. I'm relatively new to Drools, so it's entirely possible I'm misunderstanding the concept of agenda groups, but I was sure this design would ensure the "preCheck" rules evaluate first.
Can anyone provide any insight on why this is not happening? If I need to provide more details I can.
Thanks in advance.
Agenda groups allow you to place rules into groups, and to place those groups onto a stack. The stack has push/pop behavior.
Before going into how to use agenda groups, note that how you set the focus depends on what type of KieSession you are using in your rule engine. For a stateful session, you can configure it directly by calling ksession.getAgenda().getAgendaGroup("preCheck").setFocus();.
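Because of the stack semantics, the group whose focus is set last is the one that fires first. A minimal stateful-session sketch (group names taken from the question; ksession is assumed to be an existing stateful KieSession):

// Each setFocus() pushes the group onto the agenda's focus stack;
// the group pushed last fires first, then focus pops to the next one.
ksession.getAgenda().getAgendaGroup("defaultRules").setFocus(); // fires second
ksession.getAgenda().getAgendaGroup("preCheck").setFocus();     // fires first
ksession.fireAllRules();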
For a stateless session, you have to declare an explicit rule that sets the focus of the session to the particular agenda group. You can use a rule like the following:
rule "global"
salience 100
when
$f : IngestFileMetadata()
then
drools.setFocus($f.getAgenda());
end
Note: you have to find some way to get the agenda group name into your rule file. In the above example, getAgenda() is a method on your IngestFileMetadata class that returns the agenda group name as a String.
It turns out my issue was that I had to issue an explicit update on my fact after changing its attributes in my pre-check rule. That solved the problem.
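For reference, a sketch of the pre-check rule from the question with that fix applied (the update($f) call is the addition):

rule "preCheckDuplicate"
    agenda-group "preCheck"
    auto-focus true
    no-loop true
    salience 50
when
    $f : IngestFileMetadata(isDuplicate.equalsIgnoreCase("True"))
then
    $f.setIsDuplicate("True");
    update($f);  // explicitly notify the engine the fact changed so other rules re-evaluate it
end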
I have a question about stream mode in Drools.
I'm using this rule:
declare MetaMessage
    @role(event)
end

rule "rule1"
    ruleflow-group "default"
when
    $inMess : MetaMessage() from entry-point "default"
    not(MetaMessage(this != $inMess, this after [0s,10s] $inMess) from entry-point "default")
then
    //do things
end
If I send a MetaMessage, I expect the rule to execute after the 10s specified, but nothing happens.
If I send a new MetaMessage after 10 seconds, the rule executes.
Edit: if I change the rule and take away the not, it works like a charm.
I'm not sure what I'm doing wrong.
This is how I create the KieBase:
KieBaseConfiguration config = KieServices.Factory.get().newKieBaseConfiguration();
config.setOption(EventProcessingOption.STREAM);
KieBase kieBase = kieHelper.build(config);
KieSession kieSession = kieBase.newKieSession();
Edit 2
I fire the rules using fireAllRules() every time a new MetaMessage arrives on a Kafka queue.
So I have a consumer collecting messages and inserting them into the session like this:
EntryPoint ep = kieSession.getEntryPoint("default");
ep.insert(metaMessage);
kieSession.fireAllRules();
Edit 3
I have another simple rule that gets executed together with the previous one:
rule "AccumulatedTest"
when
accumulate(MetaMessage( timestamp > 0 ); $cnt: count(1))
then
log.info("Message n: "+$cnt);
end
The first time a message gets inserted (when the kieSession is newly created) I get the info "Message n: 0".
But after that this rule never fires again, no matter what other messages get inserted into the session.
You use ruleflow-group 'default', so you must set the focus to that agenda group to get the rule executed, like getSession().getAgenda().getAgendaGroup("default").setFocus();
Most likely you set the focus before or after the event insertion. After inserting the first message you set the focus, but no rule had been added to the agenda at that point (the rule was not yet eligible for execution), so the focus reset to the main agenda group. After 10 seconds you state that no rule was triggered, despite the rule having been added to the 'default' group (because the main group had focus). When you insert the second message and set focus to the 'default' group, that executes the rule already on the agenda, triggered by the first message.
The rule does get executed if you:
remove ruleflow-group (the latest documentation promotes agenda-group)
add auto-focus true to the rule
set the agenda group focus after 10 seconds
See the documentation on how agenda groups work.
OK, thanks to @EstebanAliverti and @Mike, I have fixed the problem.
Let me elaborate:
I had to create a scheduled fireAllRules() that runs every 1s (as @EstebanAliverti suggested; see the sketch after this list)
I removed the ruleflow-group from the rule because (as @Mike suggested) the focus switches back to the default agenda group, and in my implementation I cannot set the agenda group from the scheduled execution
So now fireAllRules() runs every 1s and there is no ruleflow or agenda group
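A minimal sketch of such a scheduler, assuming the kieSession variable from the snippets above (the synchronization is an extra assumption, since the Kafka consumer inserts events from another thread):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Re-fire the rules every second so temporal conditions (the 10s window)
// are re-evaluated even when no new message arrives.
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(() -> {
    synchronized (kieSession) {  // the Kafka consumer thread also uses the session
        kieSession.fireAllRules();
    }
}, 1, 1, TimeUnit.SECONDS);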
Now the rule looks like this and it works as intended
declare MetaMessage
    @role(event)
end

rule "rule1"
when
    $inMess : MetaMessage()
    not(MetaMessage(this != $inMess, this after [0s,10s] $inMess))
then
    //do things
end
Last but not least, since I removed the entry point (I wasn't using that functionality), I now insert the messages directly into the KieSession:
kieSession.insert(metaMessage);
kieSession.fireAllRules();
I would like to know if there is a way to tell Elasticsearch that I don't mind missing or erroneous indices in my search query. In other words, I have a query that goes against 7 different indices, but one of them might be missing depending on the circumstances. Is there a way to say: forget the broken one and get me the results from the other 6 indices?
SearchRequestBuilder builder = elasticsearchClient.getClient().prepareSearch(indices)
    .setQuery(Query.buildQueryFrom(term1, term2))
    .addAggregation(AggregationBuilders.terms("term")
        .field("field")
        .shardSize(shardSize)
        .size(size)
        .minDocCount(minCount));
The above is an example of the query I am running.
Take a look at the ignore_unavailable option, which is part of the multi index syntax. This has been available since at least version 1.3 and allows you to ignore missing or closed indexes when performing searches (among other multi index operations).
It is exposed in the Java API by IndicesOptions. Browsing through the source code, I found there is a setIndicesOptions() method on the SearchRequestBuilder used in the example. You need to pass it an instance of IndicesOptions.
There are various static factory methods on the IndicesOptions class for building an instance with your specific desired options. You would probably benefit from the convenient lenientExpandOpen() factory method (or the deprecated lenient(), depending on your version), which sets ignore_unavailable=true, allow_no_indices=true, and expand_wildcards=open.
Here is a modified version of the example query which should provide the behavior you are looking for:
SearchRequestBuilder builder = elasticsearchClient.getClient().prepareSearch(indices)
    .setQuery(Query.buildQueryFrom(term1, term2))
    .addAggregation(AggregationBuilders.terms("term")
        .field("field")
        .shardSize(shardSize)
        .size(size)
        .minDocCount(minCount))
    .setIndicesOptions(IndicesOptions.lenientExpandOpen());
Have you tried using index aliases?
Rather than referring to the individual indexes, you can specify a single alias; behind it there can be several indexes.
Here I'm adding the two good indexes to the alias and removing the missing/broken one:
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
  "actions" : [
    { "remove" : { "index" : "bad-index", "alias" : "alias-index" } },
    { "add" : { "index" : "good-index1", "alias" : "alias-index" } },
    { "add" : { "index" : "good-index2", "alias" : "alias-index" } }
  ]
}'
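With the alias in place, the search from the question can target the alias instead of the explicit index list (a sketch reusing the client and helper names from the question):

SearchRequestBuilder builder = elasticsearchClient.getClient()
    .prepareSearch("alias-index")  // resolves to every index behind the alias
    .setQuery(Query.buildQueryFrom(term1, term2));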
I have a Java application that writes to a log file in JSON format.
The fields that come in the logs are variable.
Logstash reads this log file and sends the events on to Kibana.
I've configured Logstash with the following file:
input {
  file {
    path => ["[log_path]"]
    codec => "json"
  }
}
filter {
  json {
    source => "message"
  }
  date {
    match => [ "data", "dd-MM-yyyy HH:mm:ss.SSS" ]
    timezone => "America/Sao_Paulo"
  }
}
output {
  elasticsearch_http {
    flush_size => 1
    host => "[host]"
    index => "application-%{+YYYY.MM.dd}"
  }
}
I've managed to display everything correctly in Kibana without any mapping.
But when I try to create a terms panel to show a count of the servers that sent those messages, I have a problem.
I have a field called server in my JSON that holds the server's name (like a1-name-server1), but the terms panel splits the server name on the "-".
I would also like to count the number of times an error message appears, but the same problem occurs: the terms panel splits the error message on the spaces.
I'm using Kibana 3 and Logstash 1.4.
I've searched a lot on the web and couldn't find any solution.
I also tried using the .raw field from Logstash, but it didn't work.
How can I manage this?
Thanks for the help.
Your problem here is that your data is being tokenized, which is what makes it searchable. By default, ES will split your message field into different parts (tokens) so they can be searched individually. For example, you may want to search for the word ERROR in your logs, expecting results like "There was an error in your cluster" or "Error processing whatever". If the data in that field were not analyzed with tokenizers, you wouldn't be able to search like this.
This analyzed behaviour is helpful when you want to search, but it doesn't allow you to group different messages that have the same content, which is your use case. The solution is to update your mapping, setting not_analyzed for the specific field that you don't want split into tokens. That alone would work for your server field, but it would break searching on it.
What I usually do in this kind of situation is use index templates and multi-fields. The index template lets me set a mapping for every index whose name matches a pattern, and multi-fields give me both the analyzed and not_analyzed behaviour in the same field.
Using the following query would do the job for your problem:
curl -XPUT https://example.org/_template/name_of_index_template -d '
{
  "template": "indexname*",
  "mappings": {
    "type": {
      "properties": {
        "field_name": {
          "type": "multi_field",
          "fields": {
            "field_name": {
              "type": "string",
              "index": "analyzed"
            },
            "untouched": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}'
Then, in your terms panel, you can use field_name.untouched to consider the entire content of the field when counting the distinct elements.
If you don't want to use index templates (maybe your data is in a single index), setting the mapping with the Put Mapping API would do the job too. If you use multi-fields, there is no need to reindex the data: from the moment you set the new mapping on the index, new data will be duplicated into the two subfields (field_name and field_name.untouched). If you just change the mapping from analyzed to not_analyzed, you won't see any change until you reindex all your data.
Since you didn't define a mapping in Elasticsearch, the default settings apply to every field in your type in your index. The default for string fields (like your server field) is to analyze the field, meaning that Elasticsearch will tokenize the field's contents. That is why it's splitting your server names into parts.
You can overcome this issue by defining a mapping. You don't have to define all your fields, only the ones you don't want Elasticsearch to analyze. In your particular case, sending the following PUT request will do the trick:
PUT http://[host]:9200/[index_name]/_mapping/[type]
{
  "type" : {
    "properties" : {
      "server" : { "type" : "string", "index" : "not_analyzed" }
    }
  }
}
You can't do this on an already existing index because switching from analyzed to not_analyzed is a major change in the mapping.
I have the following document in my collection:
{
"_id":NumberLong(106379),
"_class":"x.y.z.SomeObject",
"name":"Some Name",
"information":{
"hotelId":NumberLong(106379),
"names":[
{
"localeStr":"en_US",
"name":"some Other Name"
}
],
"address":{
"address1":"5405 Google Avenue",
"city":"Mountain View",
"cityIdInCitiesCodes":"123456",
"stateId":"CA",
"countryId":"US",
"zipCode":"12345"
},
"descriptions":[
{
"localeStr":"en_US",
"description": "Some Description"
}
],
},
"providers":[
],
"some other set":{
"a":"bla bla bla",
"b":"bla,bla bla",
}
"another Property":"fdfdfdfdfdf"
}
I need to run through all the documents in the collection and, if "providers": [] is empty, create a new set based on values from the information section.
I'm far from being a MongoDB expert, so I have a few questions:
Can I do it as an atomic operation?
Can I do this from the MongoDB console? As far as I understood, I can do it using $addToSet and $each?
If not, is there a Java-based driver that provides such functionality?
Can I do it as an atomic operation?
Every document will be updated in an atomic fashion. There is no "atomic" in MongoDB in the RDBMS sense of all operations succeeding or failing together, but you can prevent other writes from interleaving by using the $isolated operator.
Can I do this from the MongoDB console?
Sure you can. To find all documents with an empty providers array, you can issue a command like:
db.zz.find({ providers : { $size : 0 } })
To add a fixed string to all documents where the array has zero length, you can issue an update such as (note the multi option, so every matching document is updated, not just the first):
db.zz.update({ providers : { $size : 0 } }, { $addToSet : { providers : "zz" } }, { multi : true })
If you want to add a portion to your document based on the document's own data, you can use the notorious $where query (do mind the warnings appearing in that link), or, as you mentioned, query for the empty providers array and use cursor.forEach().
If not, is there a Java-based driver that provides such functionality?
Sure, there is a Java driver, as there is for every other major programming language. It can do practically everything described here, basically everything you can do from the shell. I suggest you get started with the Java Language Center.
There are also several frameworks that facilitate working with MongoDB and bridge the object-document divide. I will not give a list here as I'm pretty biased, but a quick Google search will do.
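For illustration, a sketch with the modern Java driver that mirrors the shell update above (the database and collection names "test" and "so" are placeholders):

import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

// Connects to localhost by default
MongoCollection<Document> coll = MongoClients.create()
        .getDatabase("test")
        .getCollection("so");

// Add a value to every empty providers array, across all matching documents
coll.updateMany(
        Filters.size("providers", 0),
        Updates.addToSet("providers", "zz"));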
db.so.find({ providers: { $size: 0 } }).forEach(function(doc) {
    doc.providers.push(doc.information.hotelId);
    db.so.save(doc);
});
This will push the information.hotelId of the corresponding document into an empty providers array. Replace that with whatever field you would rather insert into the providers array.
The situation is simple. I created a rules file:
package org.domain.rules;
dialect "mvel"
import eu.ohim.fsp.core.configuration.domain.xsd.Section;
global java.lang.String sectionName;
rule "rule 1"
salience 1000
when
Section($name : nameOfTheSection)
eval(sectionName == null)
then
System.out.println("Section: " + $name+ "("+$name.length()+")");
System.out.println("Section Name: " + sectionName + "("+sectionName.length()+")");
System.out.println("Mark Details: " + sectionName.equals(null));
end
And before firing the rules I inserted the Section object with a valid coreName and set the global:
public void fireInserted(Section section1) {
    kstateful.insert(section1);
    kstateful.setGlobal("sectionName", new String("markudetails"));
    kstateful.fireAllRules();
}
The result is:
Section: markudetails(12)
Section Name: markudetails(12)
Mark Details: false
QUESTION: How can this be possible? In the when part it is null, and in the then part it is not null!
Global variables are not part of the knowledge base; they are a separate channel for pushing context into the rule execution, so it is not appropriate to use them in a when clause. The exact reason it was null in your case may be hard to trace, since rule activation is completely decoupled from rule execution: the variable may simply not be bound at when-clause evaluation time, yet be bound by then-clause execution time. (Note that your snippet inserts the fact before calling setGlobal, so the activation was created while sectionName was still null.)
To summarize: don't use globals in a when clause, that's not what they are for.
Your problem has an easy general solution: insert a configuration object into the knowledge base. That object can carry your desired "sectionName" property, which you will then find easy to test in a when clause; see the sketch below.
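A minimal sketch of that configuration-fact approach (the RuleConfig class name is hypothetical):

// Plain fact carrying the configuration, inserted like any other object
public class RuleConfig {
    private final String sectionName;
    public RuleConfig(String sectionName) { this.sectionName = sectionName; }
    public String getSectionName() { return sectionName; }
}

// Insert the configuration before firing; the when clause can then
// match on it, e.g. RuleConfig(sectionName == null)
kstateful.insert(new RuleConfig("markudetails"));
kstateful.insert(section1);
kstateful.fireAllRules();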
As an aside, it is meaningless to test object.equals(null): by the equals contract this can never produce true. There is also no need for new String("markudetails"); just use "markudetails".