Configurable data model - How to not re-invent the wheel? - java

So my company decided that it needs some kind of system / service that has the following properties:
Uses Java and Spring-Boot as a Back-End
Has an Angular Front-End for an Admin UI
Uses MongoDB for persistence
The system should include following functionality:
Users (Experts, Data-Engineers, Developers) should be able to define the data model including dynamic types having a set of properties and relationships via an Admin UI.
The system should support multi-tenancy, meaning that it should serve multiple clients from different tenants.
It is important that different clients have different projections of the data when reading: not all clients are allowed to read all properties of an entity; each is restricted to what has been configured for it.
There must be some kind of validation for the properties (e.g. if a property is of type string it must follow a common pattern; if it is of type enum, only certain values are accepted).
I did some research and concluded that this proposed solution has certain drawbacks:
using Java for handling dynamic types
favoring abstract code over an explicit data model, violating the "use before re-use" principle
I also fear that we are re-inventing the wheel, meaning that there might be existing solutions for dynamically defining a data model.
I have to take these decisions as a given and try to find a way to still use existing implementations as far as possible.
In the end it will adopt a pattern similar to the EAV model. In my opinion there will be almost no possibility to accommodate domain-specific language and rules, since it aims to be as abstract as possible. It seems to me a clear case of the inner-platform effect, meaning that it will result in a system that is
"so customizable as to become a replica, and often a poor replica, of
the software development platform they are using"
Nonetheless I have to deliver some kind of implementation which has to move within this frame.
I don't want to re-invent the wheel and create a proprietary solution, so I am thinking about using at least some standard for solving this issue, for example generating JSON Schema from the user configuration instead of inventing my own data structures and validation logic.
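For illustration, a minimal validation sketch, assuming the networknt json-schema-validator library (one of several Java implementations); the schema content and field names below are invented for the example:

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.networknt.schema.JsonSchema;
    import com.networknt.schema.JsonSchemaFactory;
    import com.networknt.schema.SpecVersion;
    import com.networknt.schema.ValidationMessage;

    import java.util.Set;

    public class DynamicTypeValidator {
        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper();

            // Hypothetical schema, as it might be generated from a user-defined
            // type configuration: a pattern-constrained string and an enum.
            String schemaJson = """
                {
                  "type": "object",
                  "properties": {
                    "sku":    { "type": "string", "pattern": "^[A-Z]{3}-[0-9]{4}$" },
                    "status": { "enum": ["ACTIVE", "ARCHIVED"] }
                  },
                  "required": ["sku"]
                }
                """;

            JsonSchema schema = JsonSchemaFactory
                    .getInstance(SpecVersion.VersionFlag.V7)
                    .getSchema(mapper.readTree(schemaJson));

            // An entity instance that violates both the pattern and the enum.
            JsonNode entity = mapper.readTree("{\"sku\":\"abc\",\"status\":\"DELETED\"}");
            Set<ValidationMessage> errors = schema.validate(entity);
            errors.forEach(e -> System.out.println(e.getMessage()));
        }
    }

The appeal of this route is that the validation semantics (pattern, enum, required) are specified by the JSON Schema standard rather than by home-made code.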
Does anyone have experience with JSON Schema and its dialects, or can you point me to any other solution that I can adapt to make my life easier without having to come up with a home-made solution?
PS: I am not sure if these design questions belong on SO or somewhere else, so please let me know if you think I misused SO.

Related

Where to put event upcasters in a microservice architecture?

I'm "playing" with Axon Framework with some small examples where the query and command services (and the logic behind them) are running as separated applications in several Docker containers.
Everything works fine so far and I started to evolve the event versioning topic. I haven't implemented that yet, but I like the idea to share the events as an API via JSON schema. But I've got stuck using that idea with the potential need of event upcasters.
If I understand that approach correctly every listening component has to upcasts the events independently, therefore it might be a good idea to share the upcasters, there is no need for different implementations, right? But then the upcasters seem to became a part of the API, or am I missing something?
How do you deal with that situation? Or generally, what are the best practices for API definitions in such scenario?
When working in a microservices environment with distinct repositories for the different services, I feel it is commonplace to have a dedicated module/package/repository for the API of the given microservice, or a dedicated module for the shared language within a Bounded Context.
Especially when following the notion of a Bounded Context, where every service within the context speaks the same language, this to me emphasizes the requirement to share the created upcasters as well.
So, in short: yes, I would group the upcasters together with the API in question.
Schema languages typically also have solutions in place to support several versions of a message. Thus, if you were to use a schema language as your core API, that would also include a (albeit different) form of upcaster.
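To make the shared-upcaster idea concrete, here is a rough sketch using Axon's SingleEventUpcaster with a Jackson-based serializer; the event name and field rename are invented for the example:

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.node.ObjectNode;
    import org.axonframework.serialization.SimpleSerializedType;
    import org.axonframework.serialization.upcasting.event.IntermediateEventRepresentation;
    import org.axonframework.serialization.upcasting.event.SingleEventUpcaster;

    // Revision 1 -> 2 of a (hypothetical) OrderPlacedEvent: the "clientId" field
    // was renamed to "customerId". Shipping this class with the event API module
    // lets every listening component apply the exact same transformation.
    public class OrderPlacedEvent1To2Upcaster extends SingleEventUpcaster {

        private static final SimpleSerializedType TARGET =
                new SimpleSerializedType("com.example.api.OrderPlacedEvent", "1");

        @Override
        protected boolean canUpcast(IntermediateEventRepresentation ir) {
            return ir.getType().equals(TARGET);
        }

        @Override
        protected IntermediateEventRepresentation doUpcast(IntermediateEventRepresentation ir) {
            return ir.upcastPayload(
                    new SimpleSerializedType(TARGET.getName(), "2"),
                    JsonNode.class,
                    payload -> {
                        ObjectNode node = (ObjectNode) payload;
                        node.set("customerId", node.remove("clientId"));
                        return node;
                    });
        }
    }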
This is my 2 cents on the situation; hope this helps you out!

Is there a mature Java Workflow Engine for BPM backed by NoSQL?

I am researching how to build a general application or microservice to enable building workflow-centric applications. I have done some research about frameworks (see below), and the most promising candidates share a hard reliance upon RDBMSes to store workflow and process state, combined with JPA-annotated entities. In my opinion, this damages the possibility of designing a general, data-driven workflow microservice. It seems that a truly general workflow system could be built upon NoSQL solutions like MongoDB or Cassandra by storing data objects and rules in JSON or XML. These would allow executing code to enforce types or schemas while using one or two simple Java objects to retrieve and save entities. As I see it, this could enable a single application to be deployed as a Controller for different domains' Model-View pairs without modification (admittedly given a very clever interface).
I have tried to find a workflow engine/BPM framework that supports NoSQL backends. The closest I have found is Activiti-Neo4J, which appears to be an abandoned project providing a connector between Activiti and Neo4J.
Is there a Java Workflow Engine/BPM framework that supports NoSQL backends and generalizes data objects without requiring specific POJO entities?
If I were to give up on my ideal, magically general solution, I would probably choose a framework like jBPM or Activiti, since they have great feature sets and are mature. In trying to find other candidates, I have found a veritable graveyard of abandoned projects, like this one on Java-Source.net.
Yes, Temporal Workflow has pluggable persistence and runs on Cassandra as well as on SQL databases. It has been tested with up to 100 Cassandra nodes and can support tens of thousands of events per second and hundreds of millions of open workflows.
It allows you to model your workflow logic as plain old Java classes and ensures that the code is fully fault tolerant and durable across all sorts of failures. This includes local variables and threads.
See this presentation that goes into more details about the programming model.
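For a flavor of that programming model, a minimal sketch against the Temporal Java SDK; the workflow name and logic are invented for the example:

    import io.temporal.workflow.Workflow;
    import io.temporal.workflow.WorkflowInterface;
    import io.temporal.workflow.WorkflowMethod;

    import java.time.Duration;

    @WorkflowInterface
    public interface GreetingWorkflow {
        @WorkflowMethod
        String greet(String name);
    }

    // Plain Java implementation: the local variable and the sleep below survive
    // process crashes because Temporal replays the workflow from its event history.
    class GreetingWorkflowImpl implements GreetingWorkflow {
        @Override
        public String greet(String name) {
            String greeting = "Hello, " + name;  // durable local state
            Workflow.sleep(Duration.ofDays(1));  // durable timer, not a thread sleep
            return greeting;
        }
    }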
I think the reason why workflow engines are often based on an RDBMS is not the database schema itself but rather the need for a transaction-safe data store.
Transactional robustness is an important factor for workflow engines, especially for long-running or nested transactions, which are typical for complex workflows.
So maybe this is one reason why most engines (like Activiti) did not focus on a data-driven approach. (I am not talking about data replication here, which is covered by NoSQL databases in most cases.)
If you take a look at the Imixs-Workflow project you will find a different approach based on Java Enterprise. This engine uses a generic data object which can consume any kind of serializable data values. The problem of data retrieval is solved with Lucene search technology: each object is translated into a virtual document with name/value pairs for each item. This makes it easy to search through the processed business data as well as to query structured workflow data like status information or process owners. So this is one possible solution.
Apart from that, you always have the option to store your business data in a NoSQL database. This is independent of the workflow data of a running process instance, as long as you link both objects together.
Going back to the aspect of transactional robustness, it's a good idea to store the reference to your NoSQL data in the process instance, which is transaction-aware. Take also a look here.
The only problem you can run into is that it's very hard to synchronize a transaction context from EJB/JPA to an 'external' NoSQL database. For example: what do you do when your data was successfully saved into your NoSQL store (e.g. Cassandra), but the transaction of the workflow engine fails and a rollback is triggered?
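A generic sketch of that reference-passing approach (all types below are hypothetical placeholders, not Imixs or Cassandra API):

    // Save the business data to the NoSQL store first, then record only its key
    // in the transaction-aware process instance. If the engine's transaction
    // rolls back, the worst case is an orphaned document in the NoSQL store,
    // which an idempotent write or a periodic cleanup job can deal with.
    interface NoSqlStore { String save(Object businessData); }
    interface ProcessInstance { void setProperty(String name, Object value); }
    interface WorkflowEngine { void process(ProcessInstance instance); }

    class BusinessDataAttacher {

        private final NoSqlStore nosqlStore;
        private final WorkflowEngine workflowEngine;

        BusinessDataAttacher(NoSqlStore nosqlStore, WorkflowEngine workflowEngine) {
            this.nosqlStore = nosqlStore;
            this.workflowEngine = workflowEngine;
        }

        void attachAndProcess(ProcessInstance instance, Object businessData) {
            String key = nosqlStore.save(businessData);    // outside the engine's transaction
            instance.setProperty("businessDataRef", key);  // rolled back with the engine
            workflowEngine.process(instance);
        }
    }

An orphaned document is the acceptable failure mode here: a dangling NoSQL entry is far easier to clean up than a committed process instance pointing at data that was never written.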
The designers of the Activiti project were also aware of the problem you have stated, but knew it would take quite a rewrite to implement such flexibility, which, arguably, should have been designed into the project from the beginning. As you'll see in the link provided below, the problem has been a lack of interfaces against which to code implementations other than that of a relational database. With version 6 they went ahead, ripped off the bandaid, and refactored the framework around a set of interfaces for which different implementations (think Neo4J, MongoDB, or whatever other persistence technology you fancy) can be written and plugged in.
In the linked article below, they provide some code examples for a simple in-memory implementation of the aforementioned interfaces. It looks pretty cool and sounds like it may be precisely what you're looking for.
https://www.javacodegeeks.com/2015/09/pluggable-persistence-in-activiti-6.html

Data Driven Rules Engine - Drools

I have been evaluating Drools as a Rules Engine for use in our Business Web Application.
My use case is an Order Management Application.
And the rules are of following kind:
- If User Type is "SPECIAL" give an extra 5% discount.
- If User has made 10+ Purchases already, give an extra 3% discount.
- If Product Category is "OLD", give a Gift Hamper to the user worth $5.
- If Product Category is "NEW", give a Gift Hamper to the user worth $1
- If User has made purchases of over $1000 in the past, Shipping is Free
The immediate challenges I see are:
- There is no meaningful UI that I can offer to the end users to modify the rules.
- The Guvnor UI, or any editor to modify drl files, is just not acceptable from the end users' point of view.
- Most of these rules will operate on the often huge data available in the DB.
So,
- I want a way for Admin users to specify these rules from within my Web App UI.
- Could I store these "Rules" in the database and then operate on them via Drools? At least that would allow me to "modify" these rules via my "own" UI. So this is something like a Decision Table in the DB.
- What is the best way to go about this?
You asked me to give an answer to your question, given my answer to Data driven business rules. My answer to that question was that SQL is a bad solution to execute business rules stored in the database. The person who asked that question wanted to generate SQL expressions from their stored business rules, and I cautioned against doing that, because it would lead to problems in security, testability, performance, and maintenance.
I have not used Drools, but I gather from documentation that it includes Guvnor, a business rules manager that supports using an RDBMS as a repository for user-defined rules.
[Drools] Guvnor uses the JCR standard for storing assets such as rules. The default implementation is Apache Jackrabbit, http://jackrabbit.apache.org. This includes an out of the box storage engine/database, which you can use as is, or configure to use an existing RDBMS if needed. (http://docs.jboss.org/drools/release/5.2.0.Final/drools-guvnor-docs/html/chap-database_configuration.html)
Apache Jackrabbit is not an RDBMS; it is a content repository, that is, "a hierarchical content store with support for structured and unstructured content, full text search, versioning, transactions, observation, and more." This seems like a more appropriate repository for Drools.
But Drools doesn't say it tries to use SQL to execute those business rules. It has a separate component, Drools Expert (Rules Engine) to do that.
Drools Expert is a declarative, rule based, coding environment. This allows you to focus on "what it is you want to do", and not the "how to do this".
(http://www.jboss.org/drools/drools-expert.html)
SQL is also a declarative programming language, but it's designed to perform relational operations on table-structured data. A language to implement a rules engine has different goals, and can probably do things that SQL can't (and vice-versa).
So I would suggest that if you use Drools, you feel free to use an RDBMS as a repository as they document (use their JCR-compliant implementation of a content repository; do not try to design your own). Then use Drools Expert as a specialized language designed for executing rules.
There is no meaningful UI that I can offer to the end users to modify the rules.
Out of the box, Guvnor provides web-based decision tables (and Excel, if you prefer), as you say you would like to provide. It provides guided editors for more complex rules, but your rules appear to be very simple.
The Guvnor UI, or any editor to modify drl files, is just not acceptable from the end users' point of view
As mentioned, Guvnor supports decision tables. If you don't like the layout of the Guvnor web application, then you can just embed the Guvnor editors into your own web application.
Most of these rules will operate on the often huge data available in the DB.
The size of your database is irrelevant to the use of Guvnor. Guvnor is for editing rules, not runtime evaluation. Drools Expert is the runtime rules engine. It's fast. It can deal with very large volumes of data and very large volumes of rules. All you need to do is write database queries to get relevant chunks of that data into the rules engine at runtime. You need to do that, whatever solution you try to implement.
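A minimal sketch of that runtime pattern (KieServices/KieSession are the Drools KIE API; the repository and fact types are hypothetical placeholders):

    import java.util.List;

    import org.kie.api.KieServices;
    import org.kie.api.runtime.KieContainer;
    import org.kie.api.runtime.KieSession;

    public class DiscountEvaluator {

        // Placeholder domain types standing in for your own model.
        public record Order(String userId, double total) {}
        public interface OrderRepository { List<Order> findOrdersFor(String userId); }

        public void applyDiscounts(OrderRepository repository, String userId) {
            KieContainer container = KieServices.Factory.get().getKieClasspathContainer();
            KieSession session = container.newKieSession();
            try {
                // Insert only the relevant chunk of database data as facts.
                repository.findOrdersFor(userId).forEach(session::insert);
                session.fireAllRules();
            } finally {
                session.dispose();
            }
        }
    }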
On a side-note, if what you're really after is an explanation of when rules engines are good (and bad) solutions to a problem, then I would recommend reading the Why use a Rule engine? section of the Drools Expert manual.
Generally, I've found it is easier to work at a more abstract level, such as a Domain Model, and have some sort of programmatic conversion from that to Drools rules, instead of dealing with Drools rules directly. That way, you can store your Domain Model however you like, build UIs around it, etc., and still have the option to generate Drools rules on demand. The challenge with this is creating a programmatic transformation from your model to Drools rules, but templating tools will help here. I've used Groovy templating for this, and it has worked well.
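For example, a rule stored as plain data in the database can be rendered into DRL text on demand. A rough sketch (the Order fact and its addDiscount method are hypothetical; a real templating engine, such as the Groovy templating mentioned above, would do this more robustly):

    // A rule row as it might be stored in the database.
    record DiscountRule(String name, String userType, int discountPercent) {}

    class DrlGenerator {
        String toDrl(DiscountRule rule) {
            return """
                    rule "%s"
                    when
                        $o : Order( userType == "%s" )
                    then
                        $o.addDiscount(%d);
                    end
                    """.formatted(rule.name(), rule.userType(), rule.discountPercent());
        }
    }

Feeding new DiscountRule("Special user discount", "SPECIAL", 5) through toDrl yields a rule equivalent to the first bullet in the question, without the admin UI ever exposing DRL syntax.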

Looking for design patterns to isolate framework layers from each other

I'm wondering if anyone has any experience in "isolating" framework objects from each other (Spring, Hibernate, Struts). I'm beginning to see design "problems" where an object from one framework gets used in another object from a different framework. My fear is we're creating tightly coupled objects.
For instance, I have an application where we have a DynaActionForm with several attributes...one of which is a POJO generated by the Hibernate Tools. This POJO gets used everywhere...the JSP populates data to it, the Struts Action sends it down to a Service Layer, the DAO will persist it...ack!
Now, imagine that someone decides to do a little refactoring on that POJO...so that means the JSP, Action, Service, DAO all needs to be updated...which is kind of painful...There has got to be a better way?!
There's a book called Core J2EE Patterns: Best Practices and Design Strategies (2nd Edition)...is this worth a look? I don't believe it touches on any specific frameworks, but it looks like it might give some insight on how to properly layer the application...
Thanks!
For instance, I have an application where we have a DynaActionForm with several attributes...one of which is a POJO generated by the Hibernate Tools. This POJO gets used everywhere...the JSP populates data to it, the Struts Action sends it down to a Service Layer, the DAO will persist it...ack!
To me, there is nothing wrong with having Domain Objects as a "transversal" layer in a web application (after all, you want their state to go from the database to the UI, and I don't see the need to map them into intermediate structures).
Now, imagine that someone decides to do a little refactoring on that POJO...so that means the JSP, Action, Service, DAO all needs to be updated...which is kind of painful...There has got to be a better way?!
Sure, you could read "Beans" from the database at the DAO layer level, map them into "Domain Objects" at the service layer and map the Domain Objects into "Value Objects" for the presentation layer and you would have very low coupling. But then you'll realize that:
Adding a column in a database usually means adding some information on the view and vice-versa.
Duplication of objects and mappings are extremely painful to do and to maintain.
And you'll forget this idea.
There's a book called Core J2EE Patterns: Best Practices and Design Strategies (2nd Edition)...is this worth a look? I don't believe it touches on any specific frameworks, but it looks like it might give some insight on how to properly layer the application...
This book was a "showcase" of how to implement (over-engineered) applications using the whole J2EE stack (with EJB 2.x) and has somehow always been considered too complicated (too many patterns). On top of that, it is today clearly outdated. So it is interesting but must be taken with a giant grain of salt.
In other words, I wouldn't recommend that book (at least certainly not as state of the art). Instead, have a look at Real World Java EE Patterns - Rethinking Best Practices (see Chapter 3 - Mapping of the Core J2EE patterns into Java EE) and/or the Spring literature if you are not using Java EE.
First, avoid Struts 1. Having to extend a framework class (like DynaActionForm) is one of the reasons this framework is no longer a good choice.
You don't use Spring classes in the usual scenarios. Spring is non-invasive: it just wires your objects. You depend on it only if you use some interfaces like ApplicationContextAware, or if you are using the Hibernate or JDBC extensions. Using these extensions together with Hibernate/JDBC is completely fine, and it is not an undesired coupling.
Update: If you are forced to work with Struts 1 (honestly, try negotiating for Struts 2; Struts 1 is obsolete!), the usual way to go is to create a copy of the form class that contains the exact same fields but does not extend the framework class, together with a factory method that takes the form class and returns the simple POJO, as sketched below. This is duplication of code, but I've seen it in practice and it is not that bad (compared to the use of Struts 1 :) )
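A sketch of that copy-and-factory approach (the DTO, form bean, and field names are invented; DynaActionForm.get is the Struts 1 accessor):

    import java.math.BigDecimal;

    import org.apache.struts.action.DynaActionForm;

    // Same fields as the form, but no framework superclass: the service layer
    // and DAO only ever see this POJO.
    public class OrderDto {

        private String customerId;
        private BigDecimal total;

        // Factory method: the only place that touches the Struts class.
        public static OrderDto fromForm(DynaActionForm form) {
            OrderDto dto = new OrderDto();
            dto.customerId = (String) form.get("customerId");
            dto.total = new BigDecimal((String) form.get("total"));
            return dto;
        }

        public String getCustomerId() { return customerId; }
        public BigDecimal getTotal()  { return total; }
    }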
I think your problem is not so big as it seems.
Let's imagine, what can you really change in your POJO:
1) The name of its class: any IDE with refactoring support will automatically make all the necessary changes for you.
2) Adding a field/method: this almost always means adding new functionality, which should always be done manually and carefully. It usually causes some changes in your service layer, very seldom in the DAO, and usually in your view (JSP).
3) Changing a method's implementation: with good design, this should not cause any changes in other classes.
That's all, imho.
Make a decision about the technology for implementing the business logic (EJB or Spring) and use its dependency-injection facilities. Using DI will make different parts of your program communicate with each other through interfaces, as in the sketch below. That should be enough to reach the necessary (small enough) level of coupling.
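A minimal Spring sketch of what communicating through interfaces looks like (all names are hypothetical):

    import org.springframework.stereotype.Service;

    // The layers only know each other's interfaces; Spring wires the implementations.
    interface OrderRepository { void save(String orderId); }
    interface OrderService   { void place(String orderId); }

    @Service
    class OrderServiceImpl implements OrderService {

        private final OrderRepository repository;  // an interface, not a concrete DAO

        OrderServiceImpl(OrderRepository repository) {  // constructor injection
            this.repository = repository;
        }

        @Override
        public void place(String orderId) {
            repository.save(orderId);
        }
    }

Swapping the persistence technology then means providing a different OrderRepository implementation; OrderServiceImpl does not change.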
It's always nice to keep things clear if you can and separate the layers etc. But don't go overboard. I've seen systems where the developers were so intent on strictly adhering to their adopted patterns and practices that they ended up with a system worse than the imaginary one they were trying to avoid.
The art of good design is understanding the good practices and patterns, knowing when and how to apply them, but also knowing when it's appropriate to break or ignore them.
So take a good look at how you can achieve what you are after, read up on the patterns. Then do a trial on a separate proof of concept or a small part of your system to see your ideas in practice. My experience is that only once you actually put some code in place, do you really see the pros and cons of the idea. Once you have done that, you will be able to make an informed decision about what you will or will not introduce.
Finally, it's possible to build a system which does handle all the issues you are concerned about, but be pragmatic: is each goal you are attempting to reach worth the extra code and APIs you will have to introduce to reach it?
I'd say that Core J2EE Patterns: Best Practices and Design Strategies (2nd Edition) addresses EJB 2.0 concerns, some of which would be considered anti-patterns today. Knowledge is never wasted, but I wouldn't make this my first choice.
The problem is that it's impossible to decouple all the layers. Refactoring the POJO means modifying the problem you're solving, so all the layers DO have to be modified. There's no way around that.
Pure decoupling of layers that have no knowledge of each other requires a lot of duplication, translation, and mapping to occur. Don't fall for the idea that loose coupling means this work goes away.
One thing you can do is have a service layer that's expressed in terms of XML requests and responses. It forces you to map the XML to objects on the service side, but it does decouple the UI from the rest.
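A tiny JAXB sketch of such an XML-facing service contract (the element and field names are invented):

    import java.util.List;

    import javax.xml.bind.annotation.XmlAccessType;
    import javax.xml.bind.annotation.XmlAccessorType;
    import javax.xml.bind.annotation.XmlElement;
    import javax.xml.bind.annotation.XmlRootElement;

    // The UI marshals to/from this XML contract; the service maps it to domain
    // objects internally, so neither side depends on the other's classes.
    @XmlRootElement(name = "orderRequest")
    @XmlAccessorType(XmlAccessType.FIELD)
    public class OrderRequest {

        @XmlElement
        private String customerId;

        @XmlElement(name = "sku")
        private List<String> skus;

        public String getCustomerId() { return customerId; }
        public List<String> getSkus() { return skus; }
    }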

"Integration" between Rails' ActiveRecord and Java's Hibernate

Hi everybody: let me do a bit of "concept mining" here: I am involved in maintaining/extending an application whose functionality is distributed across several servers. For example, we have a machine running the ApplicationServer, another running the DataServer, and so on.
This application has a Web Interface. The current UI is totally implemented in Java, and in a way that makes adding new functionality hard. One of my goals is extending this interface, and we're considering shifting the whole thing to another platform, like Rails, for example.
Problem being, the database that is manipulated by the UI (possibly Rails in the future) is also manipulated by ApplicationServer (Java).
So, my main question is: both Rails and Java can access databases through their own ORM (ActiveRecord for Rails and Hibernate or similar for Java). Is there any way to guarantee that the mappings are consistent?*
Even if the answer is a hard "no", I'd also like to hear your thoughts on how you'd approach this scenario.
I hope the question is clear enough, but warn me if it isn't and I'll edit accordingly. =D
*Edit: per request, I'm extending this explanation. What I mean is: how do we make sure things don't break when someone needs to add a new field to the database and edits the Hibernate mapping because of it? I know that Rails "guesses" the entity attributes pretty much by itself (making things easier), but I was wondering if there is some "magical way" to "connect" ActiveRecord directly to the Hibernate mapping.
It depends on your case and how important it is to actually ensure that things won't break. I would probably code the Rails app to do its best, and then write a good set of DB integration test cases on the Rails side to test against breakage.
Because Hibernate needs a mapping configuration whereas Rails uses the database layout directly, it's best to make the DB changes on the Hibernate/mapped-Java-class side and then run the test suite on the Rails side afterwards.
This might be coming too late to the party, but ActiveJDBC is an ActiveRecord-like implementation in Java which reads metadata and configures itself pretty much the same way as ActiveRecord: http://code.google.com/p/activejdbc/
You should look at using DataMapper instead of ActiveRecord. DataMapper and Hibernate follow roughly the same pattern, so the mappings would be similar. Also, DataMapper defines the mapping in the class itself rather than figuring it out from the model. This is much closer to Hibernate, and you could probably write a simple hbm-to-dm converter and just eval the output at the top of your model classes. If you didn't design your original data model with Rails in mind, none of the convention-over-configuration standards are likely to be there; with DataMapper, the default is to map properties and relationships the way Hibernate does.
Another idea: if you use Hibernate annotations instead of XML mappings, maybe you could use JRuby as the bridge to build the Ruby model from the Java one.
But either way, if you have good tests, it should be obvious when a data model change breaks something.
