Maven Mojo leaks static information to subsequent executions - java

I am working on a Maven compiler plug-in for a new programming language. Like with other Maven compiler plug-ins, this plug-in is invoked in different phases of a module build (e.g., compile, test-compile), and for multiple modules during a reactor build. The language has a "bootstrap" component that stores certain static information which includes Class and ClassLoader objects. The information stored in this bootstrap component is not reusable across multiple executions of the compiler plug-in (it needs to be freshly generated each time). Currently, compilation fails because invalid information from an earlier execution is picked up (e.g., the test-compile execution might use some information left over from the earlier compile phase).
I realize that there are a number of different ways to solve such an issue (e.g., handle the currently static information in a different way, or provide some way of wiping it out between phases, etc.). My question here is primarily about Maven-specific ways to address this sort of problem.
The default instantiation strategy already creates a new plug-in instance every time, but that doesn't help in my case since the static data is not in the Maven plug-in itself, but in the compiler library that it depends on (come to think of it, I don't think it would help with static data either way). Another option would be to start a separate JVM for each execution, similar to the <fork> option of the Maven Compiler Plugin, but it appears that this existing mechanism is not easily reusable and would have to be essentially re-created from scratch.
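For concreteness, the kind of classloader-based workaround I can imagine looks roughly like this sketch (CompilerFacade, the jar path, and the source directory are hypothetical placeholders):

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoExecutionException;

public class IsolatedCompileMojo extends AbstractMojo {
    @Override
    public void execute() throws MojoExecutionException {
        try {
            // The compiler library (the thing holding the static state) goes
            // into the isolated loader, not onto the plug-in's own classpath.
            URL[] compilerClasspath = { new File("path/to/compiler.jar").toURI().toURL() };
            // Parent is the platform loader, so the compiler classes (and
            // their statics) are loaded from scratch on every execution.
            try (URLClassLoader isolated = new URLClassLoader(
                    compilerClasspath, ClassLoader.getSystemClassLoader().getParent())) {
                Class<?> facade = isolated.loadClass("com.example.CompilerFacade");
                Object compiler = facade.getDeclaredConstructor().newInstance();
                facade.getMethod("compile", File.class).invoke(compiler, new File("src/main/lang"));
            }
        } catch (Exception e) {
            throw new MojoExecutionException("Isolated compilation failed", e);
        }
    }
}

But this feels like hand-rolling infrastructure that Maven may already provide.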
Is there a Maven mechanism to isolate data (and loaded classes) between different executions of the same plug-in?

Related

How do big companies tackle the package dependency conflict problem?

Suppose a Java app references two third-party jars, packageA and packageB, which depend on packageC-0.1 and packageC-0.2 respectively. Everything would work if packageC-0.2 were backward compatible with packageC-0.1. However, sometimes packageA uses something that packageC-0.2 no longer supports, and Maven will resolve only one version of the jar. This issue is also known as "Jar Hell".
In practice it is difficult to rewrite packageA or to force its developers to upgrade to packageC-0.2.
How do you tackle these problems? This often happens in large companies.
I should point out that this problem mostly occurs in BIG companies: a big company has a lot of departments, and it would be very expensive to make the whole company upgrade a dependency every time some developers want the new features of a newer version. This is not a big deal in small companies.
Any response will be highly appreciated.
Let me toss out a brick to attract jade, as we say: I'll offer a rough answer first in the hope of drawing out better ones.
Alibaba is one of the largest e-commerce companies in the world, and we tackle these problems with an isolation container named Pandora. Its principle is simple: package those middlewares together and load them with different ClassLoaders, so that they work side by side even when they reference the same packages in different versions. But this requires a runtime environment provided by Pandora, which runs as a Tomcat process, so I have to admit it is a heavyweight solution. Pandora builds on the fact that the JVM identifies a class by its defining class loader plus its class name.
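To illustrate that class-identity point with a minimal sketch (jar paths and the class name are hypothetical): the same class name loaded through two loaders yields two distinct classes.

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class TwoVersionsDemo {
    public static void main(String[] args) throws Exception {
        // Each loader sees exactly one version of packageC; the null parent
        // keeps the application classpath out of the picture.
        URLClassLoader v1 = new URLClassLoader(
                new URL[] { new File("lib/packageC-0.1.jar").toURI().toURL() }, null);
        URLClassLoader v2 = new URLClassLoader(
                new URL[] { new File("lib/packageC-0.2.jar").toURI().toURL() }, null);

        Class<?> c1 = v1.loadClass("com.example.packagec.Api");
        Class<?> c2 = v2.loadClass("com.example.packagec.Api");

        // Same fully qualified name, different defining loaders:
        // to the JVM these are two unrelated classes.
        System.out.println(c1 == c2); // false
    }
}

Pandora essentially builds its middleware isolation on this behavior.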
If you know someone who might know the answer, please share the link with them.
We are a large company and we have this problem a lot. We have large dependency trees that span several developer groups. What we do:
We manage versions with BOMs (Maven dependencyManagement lists) of "recommended versions", published by the maintainers of the respective jars. This way we make sure that recent versions of the artifacts are used.
We try to reduce the large dependency trees by separating the functionality that is used inside a developer group from the functionality it offers to other groups.
But I admit that we are still trying to find better strategies. Let me also mention that using "microservices" is a strategy against this problem, but in many cases it is not a valid strategy for us (mainly because we could not have global transactions on databases any more).
This is a common problem in the Java world.
Your best options are to regularly maintain and update dependencies of both packageA and packageB.
If you have control over those applications - make time to do it. If you don't have control, demand that the vendor or author make regular updates.
If both packageA and packageB are used internally, you can use the following practice: have all internal projects in your company refer to a parent pom.xml that defines "up to date" versions of commonly used third-party libraries.
For example:
<properties>
  <framework.jersey>2.27</framework.jersey>
  <framework.spring>4.3.18.RELEASE</framework.spring>
  <framework.spring.security>4.2.7.RELEASE</framework.spring.security>
</properties>
Therefore, if two of your projects use Spring and both inherit from the latest version of your company's parent POM, they will both use 4.3.18.RELEASE.
When a new version of Spring is released and desirable, you update your company's parent POM and have all other projects move to that version.
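For instance, a project inheriting the parent could reference the property instead of a literal version (artifact chosen purely for illustration):

<dependency>
  <groupId>org.springframework</groupId>
  <artifactId>spring-context</artifactId>
  <version>${framework.spring}</version>
</dependency>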
This will solve many of these dependency mismatch issues.
Don't worry, it's common in the Java world; you're not alone. Just google "jar hell" and you can understand the issue in its broader context.
By the way, mvn dependency:tree is your friend for isolating these dependency problems.
I agree with @JF Meier's answer. In a Maven multi-module project, a dependencyManagement section is usually defined in the parent POM to manage versions in one place. Dependencies declared there only pin the versions; modules that then declare those dependencies directly do not need to repeat the version. It looks like this:
In the parent POM:
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.devzuz.mvnbook.proficio</groupId>
      <artifactId>proficio-model</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
In your module, you do not need to set the version:
<dependencies>
  <dependency>
    <groupId>com.devzuz.mvnbook.proficio</groupId>
    <artifactId>proficio-model</artifactId>
  </dependency>
</dependencies>
This will avoid the problem of inconsistency.
This question can't be answered in general.
In the past we usually just didn't use different versions of the same dependency; if a version changed, team- or company-wide refactoring was necessary. I doubt most build tools support two versions at once.
But to answer your question:
Simple answer: don't use two versions of one dependency within one compilation unit (usually a module).
But if you really have to, you could write a wrapper module that references the legacy version of the library.
My personal opinion, though, is that within one module there should be no need for such constructs, because a single module should be small enough to stay manageable. Otherwise it might be a strong indicator that the project could use some modularization refactoring. However, I know very well that some projects in "large-scale companies" can be a huge mess where no good option is available. I guess you are talking about a situation where packageA is owned by a different team than packageB, and that is generally a very bad design decision because of the lack of separation and the dependency problems it brings.
First of all, try to avoid the problem. As mentioned in @Henry's comment, don't use third-party libraries for trivial tasks.
However, we all use libraries. And sometimes we end up with the problem you describe, where we need two different versions of the same library. If library 'C' has removed and added some APIs between the two versions, and the removed APIs are needed by 'A', while 'B' needs the new ones, you have an issue.
In my company, we run our Java code inside an OSGi container. Using OSGi, you can modularize your code into "bundles", which are jar files with some special directives in their manifest file. Each bundle jar has its own classloader, so two bundles can use different versions of the same library. In your example, you could split the application code that uses packageA into one bundle, and the code that uses packageB into another. The two bundles can call each other's APIs, and it will all work fine as long as neither bundle uses packageC classes in the signatures of the methods the other bundle calls (known as API leakage).
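As a rough sketch (bundle and package names are made up), each bundle's MANIFEST.MF pins the packageC version range it needs:

In the bundle built around packageA:
Import-Package: com.example.packagec;version="[0.1,0.2)"

In the bundle built around packageB:
Import-Package: com.example.packagec;version="[0.2,0.3)"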
To get started with OSGi, you can e.g. take a look at OSGi enRoute.

Maven: create many jars from one module

I have a Maven (multi-module) project that contains something like an IAction. In this project I have about 50 implementations of different actions. Each action consists of a MyAction.java and a MyAction.properties file.
I use Java's service provider mechanism (java.util.ServiceLoader) to load all the implementations at runtime. This all works great, but now I want to package each action into its own jar, so that I end up with 50 jars.
What would be a good way to accomplish this? I don't really want to create a sub-module for each action, as that takes a lot of maintenance.
You can use the Maven assembly plugin to create additional artifacts for a module. Note, however, that if you want these artifacts deployed and not only built, you need to attach them to the build once they are created. This can be achieved with, for example, the build-helper-maven-plugin.
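A minimal sketch of such an extra assembly artifact (descriptor path and execution id are placeholders; you would need one execution per action jar):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <executions>
    <execution>
      <id>action-one-jar</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
      <configuration>
        <descriptors>
          <descriptor>src/assembly/action-one.xml</descriptor>
        </descriptors>
      </configuration>
    </execution>
    <!-- ... one such execution (or a generated one) per action ... -->
  </executions>
</plugin>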
However, as for the iteration over 50 elements, you might want to consider writing your own specialized plugin. There are plugins that let you run another plugin in a loop, but this will blow up your POM. If the build is that specific, you might therefore consider writing an individual plugin specialized for this task. You can do this in pure Java instead of XML, and it is not as hard as it sounds: there is documentation, there are standard APIs into Maven, and you can check the source code of the plugins named above to see how to meet your requirements.

Can I force GWT compilation without an entry point? (To validate GWT compatibility)

This question is related to, but not a duplicate of, this question.
My issue is slightly different; I have a "utility module", shared between the client and server code, and it contains no GWT-specific code.
I understand that normally all the sources are pulled into one specific project, where everything is compiled together. But there is one issue with that: I only get to know whether my utility project is "GWT compatible" when I compile the main project. This is way too late; I haven't even gotten around to starting on the main project, but I want to know, before I commit to my SCM, that my utility project is "GWT compatible".
In other words, I want to validate the utility project for GWT compatibility, independently of its use in a separate project (module).
There's a large part of the JRE that is not covered by GWT, and it is particularly likely in a utility module that non-GWT-compatible classes or methods are used. That is what I want to validate against.
EDIT: I could add a "dummy entry point", I suppose, but that makes the project depend on GWT, which I don't want, since it is "general" code, also to be used by people that don't use GWT. If it matters, I use Maven as the build system.
EDIT2: No matter what I do, I will only get real compilation/validation with an entry point (it does NOT need to reference any of the classes). Neither <force>true</force> nor <failOnError>true</failOnError> will do. Is there a way I can define that entry point for the shared project such that only the gwt-maven-plugin sees it, but not javac (so as not to add an unneeded dependency to the Java code)?
The compiler actually always visits all code on the source path (note: not quite the same as the classpath), by starting at the requested module with any <source> tags, and then checking each <inherits> along the way. If it finds something that isn't compatible or isn't compilable, it will mark it as broken, and move on - as long as nothing actually depends on it (i.e. an EntryPoint, or something that an EntryPoint depends on) you'll just see this message:
Validating newly compiled units
Ignored 1 unit with compilation errors in first pass.
Compile with -strict or with -logLevel set to TRACE or DEBUG to see all errors.
If you include that -strict flag, the compile will actually fail when it hits something that can't be included correctly.
This work is done in the very early stages of the compile, while constructing the TypeOracle, which is used for Generators, long before any JS is built. That type oracle is passed to generators, which need to be able to ask questions like 'what interfaces on the sourcepath have a JSO implementation' and 'what are all possible subclasses of List'. Generators can do a huge number of things, including emit even more types which then need to be parsed, compiled, and the process continues until a full JProgram is created of all possible types, based on the current set of modules.
That JProgram then gets compiled down based on what can be reached from the roots - the entrypoint, as well as a few other details such as how to emulate Java details like casts, arrays, longs, exceptions, etc.
If -strict was not specified, and the compiler ends up needing to reach something which is unavailable due to earlier compilation problems, that is the time you find out. Using -strict to stop earlier will help ensure that you catch those issues sooner.
One more fun fact: by default, with com.google.gwt.user.User in your module (or any other <inherits> that depends on it), you already have an entrypoint, or several! These do some quick checks that your page is working correctly, such as using a strict doctype, or the browser actually matching the expected user.agent setting. This means that it is usually possible to compile a module even without your own entrypoint (except with gwt-maven-plugin:compile, which will not consider a module compilable just because of those built-in entrypoints).
EDIT: Okay, even one more: From http://www.gwtproject.org/doc/latest/DevGuideCompilingAndDebugging.html, combined with -strict, it looks like you can force the validation to run without actually compiling to JS:
-validateOnly Validate all source code, but do not compile
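Putting that together, a validation-only run of the plain GWT compiler might look like this (classpath and module name are illustrative placeholders):

java -cp gwt-dev.jar:gwt-user.jar:src \
  com.google.gwt.dev.Compiler -strict -validateOnly com.example.util.UtilModule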
I don't think it's possible because the GWT compiler does not compile any unused code.
This means that your shared utility module may contain code that is not compatible with GWT, but it will not cause any problems as long as GWT code never calls the incompatible classes or methods. Without an entry point, the GWT compiler won't know which code is used and which is not; it will assume that all of it is unused.

How to automate a build of a Java class and all the classes it depends on?

I guess this is kind of a follow-on to question 1522329.
That question talked about getting a list of all classes used at runtime via the java -verbose:class option.
What I'm interested in is automating the build of a JAR file which contains my class(es), and all other classes they rely on. Typically, this would be where I am using code from some third party open source product's "client logic" but they haven't provided a clean set of client API objects. Their complete set of code goes server-side, but I only need the necessary client bits.
This would seem a common issue but I haven't seen anything (e.g. in Eclipse) which helps with this. Am I missing something?
Of course I can still do it manually: bite the bullet and include all the third-party code in a massive JAR (offending my purist sensibilities), walk through the source, use trial and error, or use -verbose:class type stuff (but the latter wouldn't work where, say, my code runs as part of a J2EE servlet, and thus I would only want to see this for a given Tomcat webapp and, ideally, only for classes related to my classes therein).
I would recommend using a build system such as Ant or Maven. Maven is designed with Java in mind, and is what I use pretty much exclusively. You can even have Maven assemble (using the assembly plugin) all of the dependent classes into one large jar file, so you don't have to worry about dependencies.
http://maven.apache.org/
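A minimal sketch of that assembly configuration, using the built-in jar-with-dependencies descriptor, would be:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>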
Edit:
Regarding the servlet, you can also define which dependencies you want packaged up with your jar, and if you are making a stand alone application you can have the jar tool make an executable jar.
note: yes, I am a bit of a Maven advocate, as it has made the project I work on much easier. No I do not work on the project personally. :)
Take a look at ProGuard.
ProGuard is a free Java class file shrinker, optimizer, obfuscator, and preverifier. It detects and removes unused classes, fields, methods, and attributes. It optimizes bytecode and removes unused instructions. It renames the remaining classes, fields, and methods using short meaningless names. Finally, it preverifies the processed code for Java 6 or for Java Micro Edition.
What you want is to include not only the classes you rely on, but also the classes that those classes rely on, and so on.
So that's not really a build problem, but more a dependency one. To answer your question, you can either solve this with Maven (apparently) or Ant + Ivy.
I work with Ivy, and I sometimes build an "uber-jar" using the zipgroupfileset feature of the Ant jar task. Not very elegant, some would say, but it's done in 10 seconds :-)
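For reference, a minimal version of that trick looks like this (paths are illustrative):

<jar destfile="dist/uber.jar">
  <!-- my own compiled classes -->
  <fileset dir="build/classes"/>
  <!-- merge the contents of every dependency jar into the uber-jar -->
  <zipgroupfileset dir="lib" includes="*.jar"/>
</jar>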

How many separate "Eclipse projects" are considered excessive for one actual development project?

I'm currently working on a project that contains many different Eclipse projects referencing each other to make up one large project. Is there a point where a developer should ask themselves if they should rethink the way their development project is structured?
NOTE: My project currently contains 25+ different Eclipse projects.
My general rule of thumb is to create a new project for every reusable component. So, for example, if I have some isolated functionality that can be packaged, say, as a jar, I create a new project so I can build, package, and distribute the component independently.
Also, if there are certain projects that you do not need to make frequent changes to, you can build them only when required and keep them "closed" in eclipse to save time on indexing, etc. Even if you think that a certain component is not reusable, as long as it is separated from the rest of the code base in terms of logic/concerns you may be well served by just separating it out. Sometimes seemingly specific code might be reusable in another project or in a future version of the same project.
When compiled, a project would typically result in a jar. So if your application consists of potentially reusable components, it is ok to use a project for each.
I'm a big fan of using a lot of projects: I feel that this "breaks down" large things beyond what I can do with packages, and helps me orient and navigate.
Of course, if you're developing Eclipse plug-ins, everything would be a project anyway.
The only thing I would watch out for has to do with your source control and its ability to handle moves of files between projects. Subclipse had been giving me trouble with that, or maybe it was my SVN server.
If your project has that many sub-projects, or modules, needed to compose your final artifact, then it is time to look at something like Maven and set up a multi-module project. It will (a) allow you to work on each module independently without IDE worries, and (b) allow easy setup in your IDE (and others' IDEs) through the mvn eclipse:eclipse goal. In addition, when building your entire top-level project, Maven will derive from the dependencies you have declared which modules need to be built in which order.
The book Maven: The Definitive Guide explains things in much better detail in chapter 6 (once you have the basics).
This will also keep your project from being explicitly tied to Eclipse. Being able to build independently of an IDE means that any Joe Schmoe can come along and easily work with your code base using whatever tools he/she needs.
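For what it's worth, the top-level POM of such a multi-module build is tiny (group, artifact, and module names are placeholders):

<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>app-parent</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <modules>
    <module>app-core</module>
    <module>app-web</module>
  </modules>
</project>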
Create jars for the projects you don't work in often. That should greatly reduce the clutter. If you work on all the projects often, then you can add targets to your build that will jar up the respective projects for you, which condenses everything down to one file that you can then include on the class path.
An additional method is to create many different workspaces. The benefit of separate workspaces is that you remove some of the visual clutter and performance overhead of having lots of projects open. You can use targets to jar up all of your projects and put them in a repository so you can reference them in each workspace.
At a former job the entire application was more than 170 projects. While it was rarely necessary to have all of them checked out locally, even the 30-40 projects constantly in our scope made reindexing etc. very slow.
Yeesh. One Eclipse project per actual project. If you have reusable components, make them into libraries, for heaven's sake. Break the non-reusable projects into packages; that's what they are there for.
That's a hard question, and answers span from having just one Eclipse project to having one Eclipse project for every single class.
My bottom line:
- You can have too few projects, and never too many (of course, use automation, e.g. mvn eclipse:eclipse).
- Use -Declipse.useProjectReferences=true/false when using Maven to switch the workspace between jar and project dependencies.
- Use the mvn release plugin to generate consecutive releases (automatic version increase).
- Multiple projects give you independent versioning, which is extremely important. E.g., one developer may work on a new version of a module while you still depend on the previous one, and at some point you decide to upgrade to the newer version (possibly by increasing its version in the pom.xml dependency section). Or, in another scenario, if one project contains a bug you can downgrade to its previous version.
- Multiple projects make you think about the architecture more than if you just have packages.
- Multiple projects generally make architectural problems more evident than a single project does. Would anyone like to comment on this?
- You never know whether your project will evolve into OSGi/SOA/EDA, where you need the separation.
- Even if you're 100% sure that your projects will be deployed as one jar in the old way in a single JVM, it still does not hurt (mvn assembly plugin) to have multiple Eclipse projects for logically independent pieces of code.
BTW, the project I work on is divided into 24 Eclipse projects.
Hell, we have more than 100. Projects don't cost anything.
