Let's say I have a library that provides two independent plugin interfaces, 2 implementations per plugin interface, and one parent POM file.
There are also some abstract tests in the "core", which the plugins have to implement and pass to be considered compliant.
From a code perspective, the plugin interfaces don't depend on the core at all.
Only the core depends on the plugins.
My assumptions are:
the core abstractions go into the "core" artifact and its tests are packaged in a test-jar.
each implementation of a "plugin" goes into a separate artifact (4 artifacts in this example).
the parent POM also goes into a separate artifact.
I have considered several options about how to structure the dependencies between each artifact, which can be boiled down to these 2:
Leave it at just 6 artifacts. Every "plugin" depends on "core". All "plugins" and "core" depend on the parent artifact.
This makes it possible for the library users to only specify 2 artifacts in their pom.xml/build.gradle, because "core" is a transitive dependency.
BUT, given some changes to the "core" that cause a version bump, I would have to update every plugin implementation to bump the dependency version. Also, if users didn't specify the core explicitly, they now depend on an outdated core.
Extract plugin interfaces into separate artifacts, so that implementations no longer depend on core. Which now creates 8 artifacts.
Now, unlike the previous approach, library users can no longer skip the "core" dependency in their pom.xml/build.gradle - it now has to be specified. That means they would have to depend on 3 artifacts.
But, overall, any update of the core no longer forces a cascading update of the plugins. The plugin implementations need version bumps only if their respective interface updates.
The downside is probably that I now have 2 more artifacts to maintain.
My questions are:
Which approach is the more correct one? Does it depend on project size or some other factors?
Are there other approaches?
Is it bad that users have to depend on plugins & "core" explicitly, even if plugins transitively bring "core" in the first approach?
Anything that is intrinsic to the problem and cannot be solved? (like, is it a given that 8 artifacts are to be maintained, with no way to minimize that?)
Is it correct to provide abstract tests in the "test-jar", if I want to make sure that all plugin implementations comply with the interface contracts? Or do I have to copy-paste the tests in each plugin implementation?
Reply to #vokail
"Generally, if you release a new version of the core, you must release a new version of the plugin, right?"
Currently the code is structured in such a way that the plugins have no dependencies on the core. With the 1st scheme, if the core updates, the plugins must update. With the 2nd scheme, if the core updates, the plugins don't care.
"I think it's possible to have more than two plugin implementations"
"plugin developers need to use only this artifact as a direct dependency"
True & true
"plugin-api needs only core-api"
Currently, I cannot invert the dependency in that way. The plugins know nothing about the core, except the plugin APIs.
As a note, there are 2 plugin APIs. Their code doesn't depend on the core, and they don't depend on each other.
With the 1st scheme, all plugin APIs are inside the single core artifact.
With the 2nd scheme, all plugin APIs are in separate artifacts (so it's 1 core artifact and 2 separate API artifacts = 3 artifacts in total).
"core-api can be implemented by more than one core-impl (in the future)"
Mhm... I don't see that in the future.
"It's better for my plugin implementation to depend only on an interface, not on the core one"
To clarify, this is what I meant.
From the library user's perspective, the 1st scheme looks like this:
// Implementation of the "A" API, variant 1
implementation 'library:plugin-a1-impl:1.0.0'
// Implementation of the "B" API, variant 2
implementation 'library:plugin-b2-impl:1.0.0'
// Both plugins transitively bring in "library:core:1.0.0".
// But if, for example, core:1.1.0 is released, it has to be added explicitly.
The 2nd scheme looks like this:
// Implementation of the "A" API, variant 1
// Transitively brings in "library:plugin-a-api" - a new artifact
implementation 'library:plugin-a1-impl:1.0.0'
// Implementation of the "B" API, variant 2
// Transitively brings in "library:plugin-b-api" - a new artifact
implementation 'library:plugin-b2-impl:1.0.0'
// Core has to be specified explicitly - nothing depends on it; the core itself depends on the plugin APIs
implementation 'library:core:1.0.0'
"just do one artifact and let people depend on that only (as an example of minimizing that)"
Currently there are separate projects that depend on the library, and they use different plugin implementations. Users pick between different implementations of the same APIs depending on their shared dependencies.
For example, there's A, and there are 2 implementations: A-oranges, A-apples. If the project already uses oranges, it imports A-oranges. If it already uses apples, it imports A-apples.
In other words, the plugins are more like adapters between the library and external projects.
Another depiction of the differences between 2 options:
Squares represent ".jar" artifacts. Circles inside a square represent interfaces/classes and their dependencies on each other.
It could be said that the code is DIP-compliant - both the core and the plugin implementations depend on abstractions.
It's only a question of artifact structuring - is it worth extracting abstractions into separate artifacts as well?
I suppose it depends on how, and how often, you release a new version of the core and the plugins. Generally, if you release a new version of the core, you must release a new version of the plugin, right? If not, please specify this.
I'm for solution 2, but with a little difference, as in the following example:
As you can see, I've introduced a plugin-api artifact with only the interfaces used by plugins, because:
I think it's possible to have more than two plugin implementations
plugin developers need to use only this artifact as a direct dependency
plugin-api needs only core-api
core-api can be implemented by more than one core-impl (in the future)
Following this approach, your focus will be to design plugin-api as well as you can, stabilize it, and then let plugin developers do the job.
What if:
core-impl changes? For example, a bugfix or a new release. Ask yourself: do I need to change core-api? For example, to provide a new feature to plugin-api? If so, release a new core-api and then release a new plugin-api.
core-api changes? Like before.
plugin-api changes? You need to change only the plugin-impls.
To answer on your questions:
Which approach is the more correct one? Does it depend on project size or some other factors?
There is no "correct one", depends for sure on project size ( just count how many feature/methos/interfaces you have in core-api and plugin-api ), how many developers works on it and how your release process works
Are there other approaches?
See the answer above; you can also study some big projects, like Apache or Eclipse Foundation ones, to learn their patterns, but that depends heavily on the subject and can be a huge task.
Is it bad that users have to depend on plugins & "core" explicitly, even if plugins transitively bring "core" in the first approach?
To my understanding, yes. It's better for my plugin implementation to depend only on an interface, not on the core one.
Anything that is intrinsic to the problem and cannot be solved? (like, is it a given that 8 artifacts are to be maintained, with no way to minimize that?)
Well, if you are alone and this is an open-source project used only by yourself, don't overengineer it: just do one artifact and let people depend on that only (as an example of minimizing that).
Is it correct to provide abstract tests in the "test-jar", if I want to make sure that all plugin implementations comply with the interface contracts? Or do I have to copy-paste the tests in each plugin implementation?
For me it's better to have a plugin-api and let plugin implementations declare only that; it's clearer and more concise. For the tests, I'm not sure if you plan to run the tests on the implementations yourself or to "ask" plugin developers to run them. For sure copy-paste is not the right choice; you can use a command pattern or similar to structure these tests, see here.
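As a sketch of the test-jar route (the Maven configuration below is assumed, with coordinates borrowed from the examples above): the core publishes its abstract contract tests as a secondary test-jar artifact, and each plugin implementation consumes it in test scope.
In the core's pom.xml:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <!-- attaches library:core:test-jar next to the main jar -->
        <goal>test-jar</goal>
      </goals>
    </execution>
  </executions>
</plugin>
In a plugin implementation's pom.xml:
<dependency>
  <groupId>library</groupId>
  <artifactId>core</artifactId>
  <version>1.0.0</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
The implementation module then extends the abstract test classes, so the compliance suite runs against every implementation without copy-paste. One caveat: a test-jar does not bring the core's test dependencies along transitively, so the implementations may have to redeclare the test framework themselves.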
After the updated question, I'm still for solution 2; even if there are two separate plugin APIs, it's better to have distinct plugin-api artifacts.
It's only a question of artifact structuring - is it worth extracting abstractions into separate artifacts as well?
I think yes, in the long run. If you separate them into different artifacts, you can change them independently; for example, change something in plugin-apiA and it doesn't affect plugin-apiB. If you change the core, then of course both can be affected.
Note: I think my diagram above can still work; can't you make an abstract set of interfaces for the plugin APIs and have a common artifact for them?
If plugin A and B are two distinct types of plugin, then option 2 is the better pick. However:
if A depends on core#v1, and
if B depends on core#v2,
then core#v2 has to be binary compatible with core#v1; otherwise it will fail when, for example, someone depends on an implementation of A and an implementation of B: they would always have to upgrade the plugin versions together in any case.
You could probably use the Java Module System to hide the details (e.g., only expose an interface that is likely to never change), which makes Vokail's solution 2 unnecessary in some sense: you don't need a separate core-impl, because Java modules ensure that, apart from your core module, no one accesses the details (the impl). This also allows you to reuse the same package.
If the A and B interfaces are in core, then the likelihood of a binary incompatibility drops considerably.
Related
The general architecture
We have an internal Java application (let's call it com.example.framework) acting as a kind of framework, in the sense of being extensible through plugins. These plugins can serve various purposes. As an example, there will be plugins for the support of different database providers, e.g., MysqlPlugin, OraclePlugin and MssqlPlugin. On the other hand, there might be support for exchange formats such as JSON or XML, etc.
Code splitting
The framework application is developed as a separate multi-module Java project with the parent group id com.example.framework, having the API/SPI as a distinct child module. Therefore, the plugins have this api-module as a dependency, called com.example.framework.api, which works perfectly fine. Ideally, each plugin will have its own artifact under a group called com.example.framework.plugins, such that only those plugins that are really needed get installed.
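As a sketch (artifact names invented, following the group ids described above), a plugin's POM might look like this:
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example.framework.plugins</groupId>
  <artifactId>mysql-plugin</artifactId>
  <version>1.0.0</version>
  <dependencies>
    <!-- the framework's API/SPI module; provided by the framework at runtime -->
    <dependency>
      <groupId>com.example.framework</groupId>
      <artifactId>com.example.framework.api</artifactId>
      <version>1.0.0</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>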
The problem to solve
To ease the developer experience, I would like to group plugins of similar functionality, which might even want to share a bit of code, together into one git project, while keeping some special ones alone. Now I wonder what the best way of structuring this in the Maven system is.
Current idea
The best solution I could find is to also use the multi-module approach for grouped plugin projects, to get split artifacts while being able to share code between two plugins. However, I am still confused about the groupId and version of the parent:
The Maven naming convention suggests using a unique groupId for each project. This would mean introducing another depth of naming, e.g. com.example.framework.plugins.sql.mysql, which would be inconvenient, since the name of a plugin would no longer be sufficient to derive the full module name (given the a-priori knowledge of the common package name com.example.framework.plugins). So I wonder: is the purpose of the convention solely to avoid possible duplicates by design? Since I control the namespace and all plugins, I would make sure that there are no conflicts.
The actual question
If I were to remove the intermediate name layer and thus have multiple parent POMs with the same groupId, what problems could arise? Since the plugins would not even share versions, the parent has no real purpose and also no artifact of its own that could collide - or am I missing anything?
Or is my entire structure not ideal, and should I adopt some other form? During my research I could not find any similar use case.
Usually, different related projects share the same groupId. There is no problem in that. The linked Maven page is misleading.
Just as shown in the picture, one (Java) app references two third-party jars (packageA and packageB), which reference packageC-0.1 and packageC-0.2 respectively. It would work well if packageC-0.2 were compatible with packageC-0.1. However, sometimes packageA uses something that is not supported in packageC-0.2, and Maven will pick only one version of the jar for the whole build. This issue is also known as "Jar Hell".
It would be difficult in practice to rewrite packageA or force its developers to update to packageC-0.2.
How do you tackle these problems? This often happens in large-scale companies.
I have to declare that this problem mostly occurs in BIG companies, because a big company has a lot of departments, and it would be very expensive to have the whole company update a dependency every time certain developers use new features from a new version of some dependency jar. This is not a big deal in small companies.
Any response will be highly appreciated.
Let me cast a brick to attract jade first (i.e., offer a modest idea to draw out better ones).
Alibaba is one of the largest e-commerce companies in the world, and we tackle these problems by creating an isolation container named Pandora. Its principle is simple: package those middlewares together and load them with different ClassLoaders, so that they can work well together even if they reference the same packages in different versions. But this needs a runtime environment provided by Pandora, which runs as a Tomcat process. I have to admit that this is a heavyweight plan. Pandora is built on the fact that the JVM identifies a class by its ClassLoader plus the class name.
If you know someone who may know the answer, share the link with them.
We are a large company and we have this problem a lot. We have large dependency trees that span several developer groups. What we do:
We manage versions by BOMs (Maven dependencyManagement lists) of "recommended versions" that are published by the maintainers of the jars. This way, we make sure that recent versions of the artifacts are used.
We try to reduce the large dependency trees by separating the functionality that is used inside a developer group from the one that they offer to other groups.
But I admit that we are still trying to find better strategies. Let me also mention that using "microservices" is a strategy against this problem, but in many cases it is not valid for us (mainly because we could no longer have global transactions on databases).
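A minimal sketch of the consuming side (the BOM coordinates are invented): the project imports the BOM, and all recommended versions then apply without being repeated in each module.
<dependencyManagement>
  <dependencies>
    <!-- imports the maintainers' list of recommended versions -->
    <dependency>
      <groupId>com.mycompany.platform</groupId>
      <artifactId>recommended-versions-bom</artifactId>
      <version>42</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>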
This is a common problem in the java world.
Your best option is to regularly maintain and update the dependencies of both packageA and packageB.
If you have control over those applications - make time to do it. If you don't have control, demand that the vendor or author make regular updates.
If both packageA and packageB are used internally, you can use the following practice: have all internal projects in your company refer to a parent in their Maven pom.xml that defines "up to date" versions of commonly used third-party libraries.
For example:
<properties>
  <framework.jersey>2.27</framework.jersey>
  <framework.spring>4.3.18.RELEASE</framework.spring>
  <framework.spring.security>4.2.7.RELEASE</framework.spring.security>
</properties>
Therefore, if your projects "A" and "B" both use Spring and both use the latest version of your company's parent pom, they will both get 4.3.18.RELEASE.
When a new version of spring is released and desirable, you update your company's parent pom, and force all other projects to use that latest version.
This will solve many of these dependency mismatch issues.
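A sketch of the consuming side (spring-core chosen as an example): a project using that parent declares the dependency without hard-coding a version.
<dependency>
  <groupId>org.springframework</groupId>
  <artifactId>spring-core</artifactId>
  <!-- the version is controlled centrally by the company parent pom -->
  <version>${framework.spring}</version>
</dependency>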
Don't worry, it's common in the java world, you're not alone. Just google "jar hell" and you can understand the issue in the broader context.
By the way mvn dependency:tree is your friend for isolating these dependency problems.
I agree with the answer of #JF Meier. In a Maven multi-module project, a dependencyManagement node is usually defined in the parent POM file for unified version management: the dependencies declared inside it only define the versions centrally, and modules that later declare those dependencies directly do not need to repeat the version. It looks like this:
in the parent pom:
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.devzuz.mvnbook.proficio</groupId>
      <artifactId>proficio-model</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
in your module, you do not need to set the version:
<dependencies>
  <dependency>
    <groupId>com.devzuz.mvnbook.proficio</groupId>
    <artifactId>proficio-model</artifactId>
  </dependency>
</dependencies>
This will avoid the problem of inconsistent versions.
This question can't be answered in general.
In the past we usually just didn't use two different versions of one dependency. If the version changed, team- or company-wide refactoring was necessary. I doubt mixing versions is even possible with most build tools.
But to answer your question...
Simple answer: don't use two versions of one dependency within one compilation unit (usually a module).
But if you really have to do this, you could write a wrapper module that references the legacy version of the library (a sketch of one way to do this follows below).
But my personal opinion is that within one module there should be no need for such constructs, because "one module" should be small enough to stay manageable. Otherwise it might be a strong indicator that the project could use some modularization refactoring. However, I know very well that some projects in "large-scale companies" can be a huge mess where no 'good' option is available. I guess you are talking about a situation where packageA is owned by a different team than packageB... and that is generally a very bad design decision, due to the lack of separation and the inherent dependency problems.
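One standard technique for such a wrapper module is class relocation with the maven-shade-plugin: the legacy packageC is shaded into the wrapper jar under a relocated package name, so the two versions can no longer clash on the classpath. A sketch (the package names are invented):
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- rewrites the old packageC classes into a private namespace -->
          <relocation>
            <pattern>org.thirdparty.packagec</pattern>
            <shadedPattern>mycompany.shaded.org.thirdparty.packagec</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>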
First of all, try to avoid the problem. As mentioned in #Henry's comment, don't use 3rd party libraries for trivial tasks.
However, we all use libraries. And sometimes we end up with the problem you describe, where we need two different versions of the same library. If library 'C' has removed and added some APIs between the two versions, and the removed APIs are needed by 'A' while 'B' needs the new ones, you have an issue.
In my company, we run our Java code inside an OSGi container. Using OSGi, you can modularize your code in "bundles", which are jar files with some special directives in their manifest file. Each bundle jar has its own classloader, so two bundles can use different versions of the same library. In your example, you could split the application code that uses 'packageA' into one bundle, and the code that uses 'packageB' into another. The two bundles can call each other's APIs, and it will all work fine as long as your bundles do not use 'packageC' classes in the signatures of the methods used by the other bundle (known as API leakage).
To get started with OSGi, you can e.g. take a look at OSGi enRoute.
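A sketch of such a bundle's build with the Felix maven-bundle-plugin (the package names are invented): only the bundle's API packages are exported, while its copy of packageC stays private to the bundle's classloader.
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <!-- visible to other bundles -->
      <Export-Package>com.mycompany.bundlea.api</Export-Package>
      <!-- kept inside this bundle, including its own copy of packageC -->
      <Private-Package>com.mycompany.bundlea.internal.*,org.thirdparty.packagec.*</Private-Package>
    </instructions>
  </configuration>
</plugin>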
I have a task that involves migrating an API gateway from Zuul to Spring Cloud Gateway. There are two main versions currently: 1.0.1.RELEASE and 2.0.0.RC1. The first version is very basic, and I'd have to manually implement filters related to rate limiting, authentication, etc.
The second version has all the features we need, with complete YML support. We have a strict rule in the company to never use a beta or RC, and we need the first version of the gateway to be in production within a couple of weeks, so there is not enough time to wait for the final release of version 2.
My team leader specifically asked me to make 2 variants, using versions 1.0.1 and 2.0.0 of SCG. How do you implement the module for maximum reusability? I mean, I want switching between the two versions to be as easy as possible, and I want to reuse as much of the logic as I can. The first thing that came to my mind is simply to create two separate projects. What do you think?
As I understand the question, you want an easy transition from the version 1.0.1.RELEASE to 2.0.0.RC1 of some dependency.
I would approach it as follows:
Create 3 modules (or projects):
api
bindings-1
bindings-2
The api module contains the API which you'll define to access functions of the dependency.
The bindings-1 and bindings-2 modules both implement what's defined in api, but based on the versions 1.0.1.RELEASE and 2.0.0.RC1 respectively.
Your code will use the dependency only and exclusively via the api. No direct access to the classes and methods provided by the dependency. I would not even include the dependency as a compile-time dependency. You'll then import bindings-1 or bindings-2 depending on which version you want to use.
Having a separate api will require certain effort. It will seem overengineered. But if you don't do this, bindings to the dependency will diffuse in your code and switching from one version to another will be much more difficult.
With a dedicated api you will be forced to crystallize everything you need from the dependency in your api - in a version-independent manner.
I would also not develop bindings-1/bindings-2 as SCM branches. It's not like you'll ever merge them, so why branches?
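A minimal sketch of that layout as a Maven multi-module build (group and module names invented):
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.gateway</groupId>
  <artifactId>gateway-parent</artifactId>
  <version>1.0.0</version>
  <packaging>pom</packaging>
  <modules>
    <!-- version-independent interfaces; no SCG dependency at all -->
    <module>api</module>
    <!-- implements api against SCG 1.0.1.RELEASE -->
    <module>bindings-1</module>
    <!-- implements api against SCG 2.0.0.RC1 -->
    <module>bindings-2</module>
  </modules>
</project>
The application then declares api plus exactly one of the bindings modules, so switching versions becomes a one-line dependency change.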
I've come into a situation where I'm responsible for a legacy Java Maven project, and I'm facing some problems with the versioning and the dependencies.
The project consists of some sub-applications, which consist of some modules; all applications (or their modules) also depend on something like a commons module (or, more precisely, on some sub-modules of that commons module).
Now I'm facing problems:
1) Tracking the dependencies: changing the version of a module leads me into a big mess, as I don't know where the module is used, so I would have to search for its usages and update the versions manually.
Is there a better approach for this dependency/version mess - maybe tool support (currently using IntelliJ IDEA)?
2) I'm struggling to find a clear approach to versioning the modules/applications in different branches.
It is clear for normal release/maintenance branches - something like semantic versioning, e.g., the minor version is increased and stays stable for that branch, and no other branch uses the same minor version - that would be best.
But what if I have to keep a branch with the same minor version for another customer - and it has to be a different branch (it seems that, in the past and also to the management, common product development was unknown; the other branch must not receive certain features - not even disabled, as regulation does not allow them to be in the code).
How do I distinguish between the branches? Should I use one of the four version digits for the branch, or should I attach some characters to the last digit (e.g., an abbreviation of the project)?
A) 1.5.3.2
B) 1.0.3.2-US vs 1.0.3.2-EU
(And yes, I know that a product with feature toggles or something similar would be the best solution, but that's not an option (due to management and regulatory obligations).)
The question is also whether tooling supports the chosen way.
For 1): Yes, there is. You can use a parent pom and set the version as a property in the parent pom. When updating a module version, updating the parent pom is enough.
For 2): You can use git branches to set different versions for customers. If you use git branches, you need to update all branches when there's a new commit. It is also convenient for applying different fixes for different customers.
Or you can also solve it in Maven through profiles, with a different package command like:
mvn package -PEU
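A sketch of such profiles (the contents are invented; profiles can switch properties, dependencies or resources per customer):
<profiles>
  <profile>
    <id>EU</id>
    <properties>
      <!-- picks customer-specific configuration at build time -->
      <customer>EU</customer>
    </properties>
  </profile>
  <profile>
    <id>US</id>
    <properties>
      <customer>US</customer>
    </properties>
  </profile>
</profiles>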
I have a Java project, built with Maven, that aggregates several components, each one in its own Maven project. Any one of these components may evolve separately.
The structure of my project can be described as follows:
my-main-project that depends on:
my-component-1
my-component-2
etc.
Nowadays, all pom.xml files are using "snapshot" versions, so they are all using the "latest" version available in my repository.
But once I send a release version to my customer, I'm supposed to freeze the versions and make a TAG (or equivalent) in my source-control, so I can restore a previous state in case of maintenance.
So, my question is: should I change all pom.xml files before each release, give version numbers to the components, and tie everything together with these dependency versions? Also, if I have many components (my project currently has 30+ small subcomponents), would I have to renumber/reversion each one before each release? When a single component evolves (due to a bug fix or enhancement), I must increase its version so that the changes do not affect pre-existing releases, right?
How people using maven generally handle this many-component versioning case?
Of course, I could just rely on my version-control tags to restore a previous point in time, and just tag every component on each release; but I don't like this approach, since dependency versioning (with Maven) gives me much more control and visibility over what is packaged, over (broken) compatibility relations, and much more.
General Considerations
You may consider some relations between your components.
Are they really independent (each one vs. each other)? Or are there some kinds of relations... some common lifecycles?
If you find some relationship between them, consider using Maven multi-modules: http://www.sonatype.com/books/mvnex-book/reference/multimodule.html. In a few words, you will have a parent with one version and some modules (some jars, much like Spring and its submodules). This will help you reduce version management.
You may consider using the maven-release-plugin. It will help you tag, build and deploy your modules automatically, dealing more easily with versioning and the links to SCM and the repository.
Moreover, combined with multi-modules, it will help you drastically!
There are a lot of topics dealing with this on Stack Overflow.
I don't know if you already know that. I could explain it a lot further if you want, but you may have enough elements to search by yourself if you don't.
Straight Answers
So, my question is: should I change all pom.xml files before each release, give version numbers to the components, and tie everything with this dependency versions?
Yes, you should. In Application Lifecycle Management, following the changes is REALLY important. So, as you can imagine, and as you pointed out, you really should build and tag each of your components. It can be painful, but with the maven-release-plugin and multi-modules (even more so with a Continuous Integration platform) it becomes easier.
would I have to renumber/reversion each one before each release?
For exactly the same reasons : yes !
must I increase its version so that the changes do not affect pre-existing releases, right?
Yes, you should, too. Assuming you choose a common versioning scheme like MAJOR.minor.correction, the first number indicates compatibility breaks. Minor versions may bring some breaks, but should not. Corrections should NEVER affect compatibility.
How people using maven generally handle this many-component versioning case?
I cannot reply for everyone, but my previous comments on the release plugin and multi-modules are considered best practices. If you want to go a little further, you can imagine using a more powerful SCM (ClearCase, Perforce, ...), but the Maven integration is weaker, not as well documented, and the community provides fewer examples than for SVN or Git.
Maven Release Plugin
If you are using a multi-module pom.xml you should be able to run mvn release:prepare release:perform -DautoVersionSubmodules=true and have it do a "release" build of all your modules: remove the -SNAPSHOT versions, tag, and upload the artifacts to your repository. That is what the release plugin and its workflow exist solely to do.
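A sketch of the corresponding configuration in the parent POM (autoVersionSubmodules is the plugin's option for giving all submodules the version chosen for the parent):
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-release-plugin</artifactId>
  <configuration>
    <!-- submodules get the same release version as the parent, no prompt per module -->
    <autoVersionSubmodules>true</autoVersionSubmodules>
  </configuration>
</plugin>
mvn release:prepare then bumps the versions and creates the SCM tag, and mvn release:perform checks out that tag and deploys the artifacts.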