Data Driven Rules Engine - Drools - java

I have been evaluating Drools as a Rules Engine for use in our Business Web Application.
My use case is an Order Management Application.
And the rules are of following kind:
- If User Type is "SPECIAL" give an extra 5% discount.
- If User has made 10+ Purchases already, give an extra 3% discount.
- If Product Category is "OLD", give a Gift Hamper to the user worth $5.
- If Product Category is "NEW", give a Gift Hamper to the user worth $1
- If User has made purchases of over $1000 in the past, Shipping is Free
The immediate challenges I see are:
- There is no meaningful UI that I can offer to the end users to modify the rules.
- Guvnor UI, or any editor for modifying drl files, is just not acceptable from an end user's point of view.
- Most of these Rules will operate on often huge data available in db
So,
- I want a way for Admin users to specify these Rules from within my Web App UI.
- Could I store these "Rules" in the database, and then operate on them via Drools? At least that allows me to "modify" these Rules via my "own" UI. So this is something like a Decision Table in the DB.
- What is the best way to go about this?

You asked me to give an answer to your question, given my answer to Data driven business rules. My answer to that question was that SQL is a bad solution to execute business rules stored in the database. The person who asked that question wanted to generate SQL expressions from their stored business rules, and I cautioned against doing that, because it would lead to problems in security, testability, performance, and maintenance.
I have not used Drools, but I gather from documentation that it includes Guvnor, a business rules manager that supports using an RDBMS as a repository for user-defined rules.
[Drools] Guvnor uses the JCR standard for storing assets such as rules. The default implementation is Apache Jackrabbit, http://jackrabbit.apache.org. This includes an out of the box storage engine/database, which you can use as is, or configure to use an existing RDBMS if needed. (http://docs.jboss.org/drools/release/5.2.0.Final/drools-guvnor-docs/html/chap-database_configuration.html)
Apache Jackrabbit is not an RDBMS. It is a content repository: "a hierarchical content store with support for structured and unstructured content, full text search, versioning, transactions, observation, and more." This seems like a more appropriate repository for Drools.
But Drools doesn't say it tries to use SQL to execute those business rules. It has a separate component, Drools Expert (Rules Engine) to do that.
Drools Expert is a declarative, rule based, coding environment. This allows you to focus on "what it is you want to do", and not the "how to do this".
(http://www.jboss.org/drools/drools-expert.html)
SQL is also a declarative programming language, but it's designed to perform relational operations on table-structured data. A language to implement a rules engine has different goals, and can probably do things that SQL can't (and vice-versa).
So I would suggest if you use Drools, feel free to use an RDBMS as a repository as they document (use their JCR-compliant implementation of content repository, do not try to design your own). Then use their Drools Expert as a specialized language designed for executing rules.

There is no meaningful UI that I can offer to the end users to modify the rules.
Out of the box, Guvnor provides web based decision tables (and Excel if you prefer), as you say you would like to provide. It provides guided editors for more complex rules, but your rules would appear to be very simple.
Guvnor UI, or any editor for modifying drl files, is just not acceptable from an end user's point of view.
As mentioned, Guvnor supports decision tables. If you don't like the layout of the Guvnor web application, then you can just embed the Guvnor editors into your own web application.
Most of these Rules will operate on often huge data available in db
The size of your database is irrelevant to the use of Guvnor. Guvnor is for editing rules, not runtime evaluation. Drools Expert is the runtime rules engine. It's fast. It can deal with very large volumes of data and very large volumes of rules. All you need to do is write database queries to get relevant chunks of that data into the rules engine at runtime. You need to do that, whatever solution you try to implement.
On a side-note, if what you're really after is an explanation of when rules engines are good (and bad) solutions to a problem, then I would recommend reading the Why use a Rule engine? section of the Drools Expert manual.

Generally, I've found it is easier to work at a more abstract level, such as a Domain Model, and have some sort of programmatic conversion from that to Drools rules, instead of dealing with Drools rules directly. That way, you can store your Domain Model however you like, build UIs around it, and still have the option to generate Drools rules on demand. The challenge with this is creating a programmatic transformation from your model to Drools rules, but templating tools will help here. I've used Groovy templating for this, and it has worked well.
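To make the "domain model in, Drools rules out" idea concrete, here is a minimal sketch in plain Java string templating (the answer above used Groovy templates, which are not shown here). Everything named in it is an assumption for illustration: the `DiscountRule` record, the `Order` fact class with its `addDiscount` method, and the `com.example.orders` package are hypothetical and would have to match your own fact model.

```java
import java.util.List;

final class DrlTemplateSketch {

    /** Hypothetical domain-model row, e.g. one record of a "discount_rule" table edited via your own UI. */
    record DiscountRule(String name, String field, String operator, String value, int percent) {}

    /** Renders a single rule as DRL text. Order and addDiscount(...) are assumed names, not Drools API. */
    static String toDrl(DiscountRule r) {
        return String.join("\n",
            "rule \"" + r.name() + "\"",
            "when",
            "    $order : Order( " + r.field() + " " + r.operator() + " " + r.value() + " )",
            "then",
            "    $order.addDiscount( " + r.percent() + " );",
            "end");
    }

    /** Renders a whole DRL document; the package and import lines are hypothetical. */
    static String toDrlFile(List<DiscountRule> rules) {
        StringBuilder sb = new StringBuilder();
        sb.append("package rules.generated\n");
        sb.append("import com.example.orders.Order\n");
        for (DiscountRule r : rules) {
            sb.append('\n').append(toDrl(r)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDrlFile(List.of(
            new DiscountRule("special-user-discount", "userType", "==", "\"SPECIAL\"", 5),
            new DiscountRule("loyal-user-discount", "purchaseCount", ">=", "10", 3))));
    }
}
```

Because the generated text is later compiled as a rule, the field names, operators, and values coming from the Admin UI should be checked against a whitelist before rendering, for the same reasons the first answer gives against generating SQL from stored rules.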

Related

Configurable data model - How to not re-invent the wheel?

So my company decided that it needs some kind of system / service that has the following properties:
Uses Java and Spring-Boot as a Back-End
Has an Angular Front-End for an Admin UI
Uses mongoDB for persistence
The system should include following functionality:
Users (Experts, Data-Engineers, Developers) should be able to define the data model including dynamic types having a set of properties and relationships via an Admin UI.
The system should support multi-tenancy, meaning that it should integrate multiple clients from different tenants.
It is important, that different clients have different projections of the data when reading, meaning that not all clients are allowed to read all the properties of an entity, but are restricted to what has been configured for them.
There must be some kind of validation for the properties (e.g. if it is of type string it must follow a common pattern, if it is of type enum only certain values are accepted)
I did some research and came to the conclusion that there are certain drawbacks with this proposed solution being:
using java for handling dynamic types
favoring abstract code instead of an explicit data model violating the "use before re-use principle"
I also fear, that we are re-inventing the wheel - meaning that there might be existing solutions for dynamically defining a data model.
I have to take these decisions as a given and try to find a way to still use existing implementations as far as possible.
In the end it will adopt a pattern similar to the EAV model. In my opinion there will be almost no possibility to adopt domain-specific language and rules, since it aims to be as abstract as possible. It seems to me that it is a clear case of the inner-platform effect, meaning that it will result in a system that is
"so customizable as to become a replica, and often a poor replica, of
the software development platform they are using"
Nonetheless I have to deliver some kind of implementation which has to move within this frame.
I don't want to re-invent the wheel and create a proprietary solution - so I am thinking about using at least some standard for solving this issue, as for example generating json-schema from the user configuration instead of inventing my own data structures and validation logic.
Has anyone experience with json-schema and dialects or can point me to any other solution that I can adapt to make my life easier without having to come up with a home-made solution?
PS: I am not sure if these design questions belong to SO or any other place, so please let me know if you think I misused SO
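As a rough illustration of the json-schema idea mentioned in the question, the sketch below turns a user-configured type definition into a JSON Schema document, so that validation can be delegated to any standard JSON Schema validator instead of home-made logic. The `PropertyDef` record and the method names are hypothetical; only the schema keywords (`type`, `properties`, `pattern`, `enum`, `additionalProperties`) come from the JSON Schema vocabulary. The JSON is built by hand to keep the example dependency-free; in a Spring Boot service a JSON library would normally do this.

```java
import java.util.List;
import java.util.stream.Collectors;

final class JsonSchemaSketch {

    /** Hypothetical admin-UI property definition; pattern and allowedValues may each be null. */
    record PropertyDef(String name, String pattern, List<String> allowedValues) {}

    /** Minimal JSON string quoting (handles only backslashes and double quotes). */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Schema fragment for one string property, with optional pattern and enum constraints. */
    static String propertySchema(PropertyDef p) {
        StringBuilder sb = new StringBuilder("{\"type\":\"string\"");
        if (p.pattern() != null) {
            sb.append(",\"pattern\":").append(quote(p.pattern()));
        }
        if (p.allowedValues() != null) {
            sb.append(",\"enum\":[")
              .append(p.allowedValues().stream().map(JsonSchemaSketch::quote).collect(Collectors.joining(",")))
              .append("]");
        }
        return sb.append("}").toString();
    }

    /** Schema for one dynamic type; unknown properties are rejected. */
    static String typeSchema(String typeName, List<PropertyDef> props) {
        String properties = props.stream()
            .map(p -> quote(p.name()) + ":" + propertySchema(p))
            .collect(Collectors.joining(","));
        return "{\"title\":" + quote(typeName)
            + ",\"type\":\"object\",\"properties\":{" + properties + "}"
            + ",\"additionalProperties\":false}";
    }
}
```

One possible way to approach the per-client projections would be to generate one such schema per client that lists only the properties that client is allowed to read, though that is a design assumption rather than something json-schema prescribes.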

Is there a mature Java Workflow Engine for BPM backed by NoSQL?

I am researching how to build a general application or microservice to enable building workflow-centric applications. I have done some research about frameworks (see below), and the most promising candidates share a hard reliance upon RDBMSes to store workflow and process state combined with JPA-annotated entities. In my opinion, this damages the possibility of designing a general, data-driven workflow microservice. It seems that a truly general workflow system can be built upon NoSQL solutions like MongoDB or Cassandra by storing data objects and rules in JSON or XML. These would allow executing code to enforce types or schemas while using one or two simple Java objects to retrieve and save entities. As I see it, this could enable a single application to be deployed as a Controller for different domains' Model-View pairs without modification (admittedly given a very clever interface).
I have tried to find a workflow engine/BPM framework that supports NoSQL backends. The closest I have found is Activiti-Neo4J, which appears to be an abandoned project enabling a connector between Activiti and Neo4J.
Is there a Java Workflow Engine/BPM framework that supports NoSQL backends and generalizes data objects without requiring specific POJO entities?
If I were to give up on my ideal, magically general solution, I would probably choose a framework like jBPM or Activiti, since they have great feature sets and are mature. In trying to find other candidates, I have found a veritable graveyard of abandoned projects like this one on Java-Source.net.
Yes, Temporal Workflow has pluggable persistence and runs on Cassandra as well as on SQL databases. It was tested to up to 100 Cassandra nodes and could support tens of thousands of events per second and hundreds of millions of open workflows.
It lets you model your workflow logic as plain old Java classes and ensures that the code is fully fault tolerant and durable across all sorts of failures. This includes local variables and threads.
See this presentation that goes into more details about the programming model.
I think the reason why workflow engines are often based on an RDBMS is not the database schema but rather the need for a transaction-safe data store.
Transactional robustness is an important factor for workflow engines, especially for long-running or nested transactions, which are typical for complex workflows.
So maybe this is one reason why most engines (like Activiti) did not focus on a data-driven approach. (I am not talking about data replication here, which is covered by NoSQL databases in most cases.)
If you take a look at the Imixs-Workflow Project you will find a different approach based on Java Enterprise. This engine uses a generic data object which can consume any kind of serializable data values. The problem of the data retrieval is solved with the Lucene Search technology. Each object is translated into a virtual document with name/value pairs for each item. This makes it easy to search through the processed business data as also to query structured workflow data like the status information or the process owners. So this is one possible solution.
Apart from that, you always have the option to store your business data into a NoSQL database. This is independent from the workflow data of a running process instance as far as you link both objects together.
Going back to the aspect of transactional robustness it's a good idea to store the reference to your NoSQL data storage into the process instance, which is transaction aware. Take also a look here.
So the only problem you can run into is the fact that it's very hard to synchronize a transaction context from an EJB/JPA to an 'external' NoSQL database. For example: what will you do when your data was successfully saved into your NoSQL data storage (e.g. Cassandra), but the transaction of the workflow engine fails and a rollback is triggered?
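One common way to approach the rollback question is a compensating action: write to the NoSQL store first, and undo that write if the transactional workflow update fails. The sketch below is only an illustration of that pattern, not something the engines discussed here provide; `DocumentStore`, `WorkflowTx`, and `InMemoryStore` are hypothetical stand-ins for a NoSQL client and a transaction-aware workflow call.

```java
import java.util.HashMap;
import java.util.Map;

final class CompensationSketch {

    /** Hypothetical stand-in for the external NoSQL store. */
    interface DocumentStore {
        void save(String id, String json);
        void delete(String id);
    }

    /** Hypothetical stand-in for the transactional workflow update; throws when the transaction rolls back. */
    interface WorkflowTx {
        void linkDocument(String processInstanceId, String documentId);
    }

    /** In-memory store used only to make the sketch runnable. */
    static final class InMemoryStore implements DocumentStore {
        final Map<String, String> docs = new HashMap<>();
        @Override public void save(String id, String json) { docs.put(id, json); }
        @Override public void delete(String id) { docs.remove(id); }
    }

    /**
     * Saves the business data, then stores its reference in the process instance.
     * If the workflow transaction fails, the saved document is removed again.
     * Returns true when both steps succeeded.
     */
    static boolean saveAndLink(DocumentStore store, WorkflowTx tx,
                               String processInstanceId, String docId, String json) {
        store.save(docId, json);
        try {
            tx.linkDocument(processInstanceId, docId);
            return true;
        } catch (RuntimeException rollback) {
            store.delete(docId); // compensating action for the already-saved document
            return false;
        }
    }
}
```

The compensation can itself fail (for example if the NoSQL store becomes unreachable between the two calls), so a real system would also need some periodic cleanup of orphaned documents.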
The designers of the Activiti project have also been aware of the problem you have stated, but knew it would be quite a re-write to implement such flexibility which, arguably, should have been designed into the project from the beginning. As you'll see in the link provided below, the problem has been a lack of interfaces toward which to code different implementations other than that of a relational database. With version 6 they went ahead and ripped off the bandaid and refactored the framework with a set of interfaces for which different implementations (think Neo4J, MongoDB or whatever other persistence technology you fancy) could be written and plugged in.
In the linked article below, they provide some code examples for a simple in-memory implementation of the aforementioned interfaces. Looks pretty cool and sounds to perhaps be precisely what you're looking for.
https://www.javacodegeeks.com/2015/09/pluggable-persistence-in-activiti-6.html

ETL architecture

I've been asked to make an ETL-style application that transfers information from one data source to another. At the moment, I've decided to use a three-layer architecture but I would like to find out more about the best practices as well as the life cycle described on this wikipedia page:
http://en.wikipedia.org/wiki/Extract,_transform,_load
Four-layered approach for ETL architecture design
Functional layer: Core functional ETL processing (extract, transform, and load).
Operational management layer: Job-stream definition and management, parameters, scheduling, monitoring, communication and alerting.
Audit, balance and control (ABC) layer: Job-execution statistics, balancing and controls, rejects- and error-handling, codes management.
Utility layer: Common components supporting all other layers.
Real-life ETL cycle
The typical real-life ETL cycle consists of the following execution steps:
Cycle initiation
Build reference data
Extract (from sources)
Validate
Transform (clean, apply business rules, check for data integrity, create aggregates or disaggregates)
Stage (load into staging tables, if used)
Audit reports (for example, on compliance with business rules. Also, in case of failure, helps to diagnose/repair)
Publish (to target tables)
Archive
Clean up
I don't know what your situation is or what your requirements are, but you're likely overthinking the problem.
The name alone is "the" architecture:
Extract
Transform
Load
Exporting a DB table to a CSV can be considered "ET" while loading the CSV is the "L". Most ETL problems are simply not complicated.
Beyond that, you should grab any of the 1 or 2 million ETL and ESB packages already available in Java, free and commercial, libraries and full boat processing systems, and simply adopt one of them that you like best.
Get a whiteboard, string some bubbles together with lines, and turn that into code.
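To show how small the "bubbles and lines" can be, here is a minimal sketch of the three steps as plain Java methods, with CSV lines as the hand-off between transform and load. The `Row` record and the in-memory source are hypothetical; a real job would read from and write to actual data sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class EtlSketch {

    /** Hypothetical record shape shared by source and target. */
    record Row(int id, String name) {}

    /** Extract: stands in for a query against the source system. */
    static List<Row> extract() {
        return List.of(new Row(1, " alice "), new Row(2, "bob"));
    }

    /** Transform: clean the names and render each row as a CSV line. */
    static List<String> transform(List<Row> rows) {
        return rows.stream()
            .map(r -> r.id() + "," + r.name().trim().toUpperCase(Locale.ROOT))
            .collect(Collectors.toList());
    }

    /** Load: parse the CSV lines and append them to the target (here just a list). */
    static List<Row> load(List<String> csvLines) {
        List<Row> target = new ArrayList<>();
        for (String line : csvLines) {
            String[] parts = line.split(",", 2);
            target.add(new Row(Integer.parseInt(parts[0]), parts[1]));
        }
        return target;
    }

    public static void main(String[] args) {
        System.out.println(load(transform(extract())));
    }
}
```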
To answer the question, "What's the best practice?" the answer depends on what you are trying to accomplish.
To simplify let's assume you are doing one of the following:
1. You are building a data warehouse that will restructure the data in some way
2. You are moving data from point A to point B, but you are not restructuring the data
When I use the word "restructuring", I mean changing the grain or lowest level of detail of a table.
For 1. The ten steps outlined in your question are generally followed. General best practices:
As much transformation logic as possible is pushed onto database resources, not ETL software (ETL software is generally slower)
Validate, Transform, and Audit steps are used to employ whatever Master Data Management (MDM) standards your organization uses
For 2. This is much more straightforward so either method outlined in your question can be used.

Usecase for Workflow Engine

We have an issue where a Database table has to be updated with the status of a particular entity. Presently, it's all Java code with a lot of if conditions and an update to the status. I was thinking along the lines of using a Workflow engine since there can be multiple flows in the future. Is it overkill to use a Workflow Engine here? Where do you draw the line?
It depends on the complexity of your use case.
In a simple use case, we have a database column updated by multiple consumers for each stage in an Order lifecycle. This is done by a web service calling into the database.
The simple lifecycle goes from ACKNOWLEDGED > ACCEPTED/REJECTED > FULFILLED > CLOSED. All of these are in the same table on the same column. This is executed in java classes with no workflow.
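A lifecycle this simple can be kept in plain Java as a small transition table, which also replaces scattered if conditions with one lookup. The sketch below is an assumption about how such a table might look: the enum and method names are hypothetical, and the `REJECTED` to `CLOSED` transition is a guess, since the lifecycle above does not say what follows a rejection.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

final class OrderLifecycleSketch {

    enum Status { ACKNOWLEDGED, ACCEPTED, REJECTED, FULFILLED, CLOSED }

    /** Allowed next statuses for each current status. */
    private static final Map<Status, Set<Status>> ALLOWED = new EnumMap<>(Status.class);
    static {
        ALLOWED.put(Status.ACKNOWLEDGED, EnumSet.of(Status.ACCEPTED, Status.REJECTED));
        ALLOWED.put(Status.ACCEPTED, EnumSet.of(Status.FULFILLED));
        ALLOWED.put(Status.REJECTED, EnumSet.of(Status.CLOSED)); // assumption: rejected orders are simply closed
        ALLOWED.put(Status.FULFILLED, EnumSet.of(Status.CLOSED));
        ALLOWED.put(Status.CLOSED, EnumSet.noneOf(Status.class));
    }

    static boolean canMove(Status from, Status to) {
        return ALLOWED.get(from).contains(to);
    }

    /** Returns the new status, or throws if the transition is not in the table. */
    static Status move(Status from, Status to) {
        if (!canMove(from, to)) {
            throw new IllegalStateException(from + " -> " + to + " is not allowed");
        }
        return to;
    }
}
```

The web service would call `move` before issuing the database update for the status column, so an invalid transition is rejected before anything is written.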
A workflow engine is better suited to a more complex use case, one that involves actions on multiple data providers (e.g. a database, content management, document management, or a search engine), multiple parallel processes, forking based on the success or failure of a previous step, sending an email at a certain step, or offline error alerting.
You can look at Apache ODE to implement this.
We have an issue where a Database table has to be updated with the status of a particular entity. Presently, it's all Java code with a lot of if conditions and an update to the status.
Sounds like a one-off, localized update, with no need to orchestrate actions among workflow participants.
Maybe a rule engine is better suited for this. Drools could be a good candidate. When X then Y.
If you're using Spring, this is a good article on how to implement your requirement
http://www.javaworld.com/javaworld/jw-04-2005/jw-0411-spring.html
I think you should consider a workflow engine. Workflow should be separated from application logic.
Reasons:
Maintainable: Easier to modify, add new flows and even easier to replace by another workflow engine.
Business Process Management: Workflows are mostly software representations of BPM, so they are usually designed by process designers (non-technical people). Coding them inside the application is therefore not a good idea. Instead, BPM products such as ALBPM or jBPM, which support graphical workflow design, should be used.
Monitoring business flows: They are often monitored by the Top level managers and used to make strategic decisions.
Easier for Data mining/Reports/Statistics.
ALBPM (now Oracle BPM) is a commercial tool from Oracle suitable for large-scope projects.
My recommendation is jBPM, an open source tool from JBoss. Unlike ALBPM, which requires a separate DB and application server, it can be packaged with your application and run as another module in it. I think it is suitable for your project.

JSF - correct way to build forms

OK, I would like to know the correct way to build forms in JSF. I have a multi-database app (the user can switch databases at runtime; all databases are built on the same schema), and now I want to build forms for data input.
I tried the built-in functionality in NetBeans that generates entity classes from a database, but as far as I understand, this only works correctly when there is a single database. I use Hibernate for my DB connections. I have already completed the part that switches between databases.
Do you have any advice on how to build forms for this app? A dynamic form build would be preferable; it could be driven by an XML file. Looking forward to your replies!
If your application is really divided into independent layers (DAO / Service / presentation for example, or MVC if you prefer), then the presentation layer, which is managed by the JSF framework, must not be impacted by the database connection.
You say that every database uses the same structure, so I don't really think that your JSF forms design and structures will be impacted by the database chosen by the user. This parameter will be taken into consideration in the deeper layers of your application, the ones managed by Hibernate in particular.
So to answer your question, I would say that you don't have to care about this specificity when designing your pages with JSF. So use the "default" best practices for JSF developments.
Have a look at how Seam does it with the Seam-gen tool. It will generate the entire application - including forms - from the database. It's based on Freemarker templates.
There is no such thing as the "correct way" of building forms. It all depends on the functional requirements and on what is available.
If the forms are purely meant as kind of "database admin tool", then you need to build from the DB to the UI (bottom-up approach).
If the forms are purely meant to give the enduser some functionality (e.g. register form, order form, contact form, etc), then often a top-down approach (build from UI to DB) is preferred.
In your case it's likely the bottom-up approach. You're lucky: there are many more generators/tools available for that (as you have probably already found out). As long as you keep everything abstract, I don't foresee (re)usability or maintainability problems.
However, it is possible to "convert" some XML file to a JSF (XHTML) page with little help of XSL and a Filter.
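The XSL half of that conversion can be sketched with the JDK's built-in `javax.xml.transform` API. The XML form format (`form` and `field` elements with `id`, `name`, and `label` attributes) is made up for this example, and the stylesheet emits only labels and inputs without value bindings; the servlet Filter that would apply the transformation per request is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class FormXslSketch {

    /** Hypothetical XML form description; the element and attribute names are assumptions. */
    static final String FORM_XML =
        "<form id='customer'>"
        + "<field name='firstName' label='First name'/>"
        + "<field name='email' label='E-mail'/>"
        + "</form>";

    /** Stylesheet mapping each field to a JSF label and input from the standard JSF HTML tag library. */
    static final String FORM_XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:h='http://java.sun.com/jsf/html'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/form'>"
        + "<h:form id='{@id}'><xsl:apply-templates select='field'/></h:form>"
        + "</xsl:template>"
        + "<xsl:template match='field'>"
        + "<h:outputLabel for='{@name}' value='{@label}'/>"
        + "<h:inputText id='{@name}'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the form description and returns the generated markup. */
    static String toXhtml(String formXml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(formXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXhtml(FORM_XML, FORM_XSL));
    }
}
```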
