Maven building only changed files - java

Let's say I have a module structure like the one below:
Modules
->utils
->domain
->client
->services
->deploy (this is at the module level)
Now to launch the client I need to build all the modules, i.e. utils, domain, client and services, because I load the jars of all of them to finally launch the client.
All the jars get assembled in the deploy module.
My question: if I change anything in services, for example, is there a way for Maven, when running a build from deploy, to recognise that only services has to be built, and hence build it and deploy it to the deploy folder?

If you only call "mvn install" without "clean", the compiler plugin will compile only modified classes.

For GIT
mvn install -amd -pl $(git status | grep -E "modified:|deleted:|added:" | awk '{print $2}' | cut -f1 -d"/" | sort -u | paste -sd, -)
OR
Add the following function to your .bashrc file (it lives in your home directory, ~/.bashrc; create it if it doesn't exist).
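mvn_changed_modules(){
    # Require a Maven goal; use return (not exit) so the interactive shell stays open
    [ -z "$1" ] && echo "Expected command : mvn_changed_modules (install/build/clean or any maven command)" && return 1
    # Top-level directory of each changed file, de-duplicated and comma-separated as -pl expects
    modules=$(git status | grep -E "modified:|deleted:|added:" | awk '{print $2}' | cut -f1 -d"/" | sort -u | paste -sd, -)
    if [ -z "$modules" ];
    then
        echo "No changes (modified / deleted / added) found"
    else
        echo -e "Changed modules are : $modules\n\n"
        mvn "$1" -amd -pl "$modules"
    fi
}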
Then, after restarting your bash (command prompt), you can just use the following command from the ROOT directory itself.
smilyface#machine>ProjectRootDir]$ mvn_changed_modules install
How it works
As per the question, mvn install -amd -pl services is the command to run when something has changed in the services module. So, first derive the module names from the changed files and pass them to the mvn install command.
For example, say the following files are modified (output of git status):
services/pom.xml
services/ReadMe.txt
web/src/java/com/some/Name.java
Then services and web are the modules that need to be built / compiled / installed.
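The mapping from changed paths to a module list can be sketched as a small shell function. This is only a minimal sketch, assuming every top-level directory is a Maven module; modules_from_paths is a hypothetical helper name, not part of Maven or Git.

```shell
# Hypothetical helper: reads changed file paths on stdin (one per line) and
# prints the distinct top-level directories as a comma-separated list,
# which is the format -pl expects.
modules_from_paths() {
  grep '/' |        # skip files in the root directory (no module prefix)
    cut -d/ -f1 |   # keep the first path component (assumed to be the module)
    sort -u |       # de-duplicate
    paste -sd, -    # join the remaining lines with commas
}

# Example with the file list shown above
printf '%s\n' services/pom.xml services/ReadMe.txt web/src/java/com/some/Name.java |
  modules_from_paths
```

It could then be combined with any command that lists changed paths, for example mvn install -amd -pl "$(git diff --name-only HEAD | modules_from_paths)".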

Within a multi-module build you can use:
mvn -pl ChangedModule compile
Run from the root module, this compiles only the given ChangedModule. The compiler plugin compiles only the files that have changed. However, a change in ChangedModule can require recompiling other modules that depend on it. To include them, use:
mvn -amd -pl ChangedModule compile
where -amd means "also make dependents". This works without first installing all the modules into the local repository with mvn install.

After trying the advice above, I ran into the following problems:
Maven install (without clean) still takes a lot of time, which for several projects can mean an extra 10-20s.
Sebasjm's solution is fast and useful (I used it for a couple of months), but if you have several changed projects, rebuilding them all every time (even when you haven't changed anything since the last build) is a huge waste of time.
What really worked for me is comparing source modification dates against the modification date of the .jar in the local repository. If you check only the files the VCS reports as changed (see sebasjm's answer), the date comparison takes no noticeable time (for me it was less than 1s for 100 changed files).
The main benefit of this approach is a very accurate rebuild of only the projects that really changed.
The main drawback is that the modification date comparison takes a bit more than a one-liner script.
For those who want to try it but are too lazy to write such a script themselves, I'm sharing my version: https://github.com/bugy/rebuilder (linux/windows).
It can do some additional useful things, but the main idea and central algorithm are as explained above.
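The date comparison itself can be sketched in a few lines of shell. This is not the algorithm of the linked tool, just a minimal illustration of the idea; module_is_stale is a hypothetical name, and the jar path in the usage comment uses made-up coordinates.

```shell
# Hypothetical helper: succeeds (exit 0) when a module needs a rebuild,
# i.e. the jar is missing or at least one file under the source directory
# has a newer modification time than the jar.
#   $1 = module source directory, $2 = path to the installed jar
module_is_stale() {
  # No jar yet: the module has never been installed, so it must be built
  [ -f "$2" ] || return 0
  # find lists files newer than the jar; one hit is enough to decide
  [ -n "$(find "$1" -type f -newer "$2" -print | head -n 1)" ]
}

# Usage (hypothetical coordinates, not run here):
#   jar=~/.m2/repository/com/example/services/1.0-SNAPSHOT/services-1.0-SNAPSHOT.jar
#   module_is_stale services/src "$jar" && mvn install -pl services
```

Limiting the find to the files the VCS reports as changed, as described above, would make the check cheaper on large modules.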

If you are using SVN and *nix, from the root module
mvn install -amd -pl $(svn st | colrm 1 8 | sed 's|/.*||' | xargs echo | sed 's- -,:-g' | sed 's|^|:|')

I had the same frustration and also wrote a project for it at the time. Alas, it is not available, but I found people who implemented something similar:
for example - https://github.com/erickzanardo/maven-watcher
It uses nodejs and assumes a maven project, but should work on windows and unix alike.
The idea of my implementation is to watch for changes and then compile what changed, kind of like nodemon.
So for example
When a java file changes - I compile the module
When a class file or jar changes - I do something else (for example copy the jar under tomcat and restart tomcat)
And the two are unrelated, so if the java compilation fails, there is no reason for the jar file to update, and it's quite stable.
I have used it on a project with 23K .java files and it worked smoothly.
It took the watch process a couple of seconds to start, but then it would only run when a change was detected, so the overall experience was nice.
The next step I intended to add is similar to your SVN support: list the modified files and use them as initialization.
Important to note: if compilation fails, it will retry on the next modification. So if you are modifying multiple jars and the compilation keeps failing while you are writing code, it will retry compiling everything on each code change until it compiles successfully.
If you'd like, I can try to find my old project, fix it up a bit and publish it.

mvn clean install to run full build
mvn install to compile only changed and prepare war/jars other binaries
mvn compile to compile only changed files...
So mvn compile is the fastest, but if you run/debug your project from wars/jars, it might not show those changes.

The question and the answers posted so far do not take the dependency tree into account. What if the utils module is changed? We need to rebuild (retest at least) it and all the modules depending on it.
Ways to do so:
https://github.com/avodonosov/hashver-maven-plugin/
https://github.com/vackosar/gitflow-incremental-builder/
Gradle Enterprise is a commercial service that provides a build cache, in particular for Maven
Migrate to newer build tools like Gradle or Bazel which support build caches out of box.

Related

Installing parquet-tools

I am trying to install parquet tools on a FreeBSD machine.
I cloned this repo: git clone https://github.com/apache/parquet-mr
Then I did cd parquet-mr/parquet-tools
Then I did mvn clean package -Plocal
As specified here: https://github.com/apache/parquet-mr/tree/master/parquet-tools
This is what I got:
Why is this dependency error here? How do I get around it?
On Ubuntu 20, I install via pip:
python3 -m pip install parquet-tools
Haven't tried on FreeBSD but I'd imagine it would also work. See related answer for a caveat on using pip on FreeBSD.
And you can view a file with:
parquet-tools show filename.parquet
I know the question specifies FreeBSD, but if you're on mac, you can do
brew install parquet-tools
parquet-tools is just one module of parquet-mr. It depends on some of the other modules.
When you build from a source version that corresponds to a release, those other modules will be available to Maven, because release artifacts are published as a part of the release process.
However, when building from a snapshot version, you have to make those dependencies available yourself. There are two ways to do so:
Option 1: Build and install all modules of the parent directory:
git clone https://github.com/apache/parquet-mr
cd parquet-mr
mvn install -Plocal
This will put the snapshot artifacts in your local ~/.m2 directory. Subsequently, you can (re)build just parquet-tools like you initially tried, because now the snapshot artifacts will already be available from ~/.m2.
Option 2: Build the parquet-mr modules from the parent directory, while asking Maven to build needed modules as well along the way:
git clone https://github.com/apache/parquet-mr
cd parquet-mr
mvn package -pl parquet-tools -am -Plocal
Option 1 will build more projects than option 2, so if you only need parquet-tools, you are better off with the latter. Please note though that probably both will require installation of a thrift compiler.
Parquet tools: a utility for reading Parquet files. You can clone it from GitHub and run a few Maven commands.
1. git clone https://github.com/Parquet/parquet-mr.git
2. cd parquet-mr/parquet-tools/
3. mvn clean package -Plocal
Or you can download a stable release and build it locally.
1. Download a stable Parquet release:
https://github.com/apache/parquet-mr/archive/apache-parquet-1.8.2.tar.gz
2. Maven local install.
D:\parquet>cd parquet-tools && mvn clean package -Plocal
3. Test it (paste a parquet file under target directory):
D:\parquet\parquet-tools\target>java -jar parquet-tools-1.8.2.jar schema out.parquet
(where out.parquet is my parquet file under target directory)
// Read parquet file
D:\parquet\parquet-tools\target>java -jar parquet-tools-1.8.2.jar cat out.parquet
// Read the first few lines of a parquet file
D:\parquet\parquet-tools\target>java -jar parquet-tools-1.8.2.jar head -n5 out.parquet
Some answers have a broken link for the jar download, but you can get it from Maven Central.
However, this jar and others like it are built so that the hadoop dependencies are "provided", and if you build from source, you'll get that default. So you need to set -Dhadoop.scope=compile when you build, or the result will only work when run on a hadoop node using the "hadoop ..." command.
To make matters worse, this tool apparently disables System.out and System.err, so exceptions that cause main() to fail are never printed and you'll be left wondering what happened.
I also found that the default settings for the maven-license-plugin caused it to fail the build when files showed up that it didn't expect (e.g. nbactions.xml if you use NetBeans).

Execute Maven Plugin in every Build

I have a question I can't answer by myself.
Introduction:
We've developed a corporate pom which enforces some rules on projects and provides some company-wide profiles, distribution management, etc.
Most of our developers use the corporate pom, but not everyone does, and furthermore not all old projects will be migrated within the next development cycle.
So I decided to write a maven plugin which simply checks whether someone builds a project at the development stage (not qs/prod) and whether the user uses our corporate pom.
Problem:
I want to force the plugin to execute on every "mvn clean install" or something similar without configuring the plugin within the project pom.
My first guess was the settings.xml, but unfortunately you can't execute plugins via settings.xml.
Does someone have a solution?
Question in short: force a plugin execution on every build without providing the plugin in the project pom or on the command line!
In general you don't control the developer's system, so trying to solve this with the settings.xml or with the Maven distribution is not the way to go. You must look for a place which is used by every project. One such place could be the SCM, but adding hooks there is probably hard.
Assuming you have an artifact repository manager, that is the place to do these kinds of checks. I know both Nexus and Artifactory can validate the uploaded files, and are especially strong in analyzing the pom.xml.
So your focus should not be on trying to solve this with a maven plugin, but on a common place in the infrastructure.
As Robert mentioned, it may be better to validate this kind of thing on the infrastructure instead of on the developer machine.
A pre-receive hook in git does the trick.
#!/bin/bash
#
# Hook simply validates whether the corporate pom is used
# and rejects the push when the corporate pom is absent
echo "### Validate commit against Company rules... ####"
corppom_artefactId='corporate-pom'
# A pre-receive hook gets "<old-rev> <new-rev> <ref-name>" lines on stdin, not as arguments
while read oldrev newrev refname; do
    # Blob id of the pom.xml in the pushed revision
    pom=`git ls-tree --full-name -r ${newrev} | grep pom.xml | awk '{ print $3 }'`
    # Project seems not to be a maven project. So it's okay that the corporate pom is missing
    if [ -z "${pom}" ];
    then
        continue;
    else
        git cat-file blob ${pom} | grep -q "$corppom_artefactId"
        if [[ $? -ne 0 ]];
        then
            echo "### NO CORPORATE POM... Bye Bye ###"
            # Rejecting the push
            exit 1;
        else
            echo "### CORPORATE POM IS USED. GREAT! ###"
        fi
    fi
done
Be aware that this script is just an example! It will not work on multi-module projects and, furthermore, is not coded very well. But it is sufficient to illustrate the approach.
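One way the multi-module limitation might be addressed is to check every pom.xml blob of the pushed revision instead of a single one. The following is only a sketch under assumptions: uses_corporate_pom and check_revision are hypothetical names, the match is a plain text search for the artifactId element (it does not verify that the element sits inside the parent section), and paths are assumed to contain no spaces.

```shell
corppom_artefactId='corporate-pom'

# Succeeds when the POM text on stdin contains the corporate artifactId element.
uses_corporate_pom() {
  grep -q "<artifactId>${corppom_artefactId}</artifactId>"
}

# Checks every pom.xml of the given revision and reports each offender.
# Succeeds only if all POMs reference the corporate pom.
check_revision() {
  git ls-tree --full-name -r "$1" |
    awk '$4 ~ /(^|\/)pom\.xml$/ { print $3, $4 }' |   # blob id and path
    {
      status=0
      while read -r blob path; do
        if ! git cat-file blob "$blob" | uses_corporate_pom; then
          echo "### NO CORPORATE POM in $path ###"
          status=1
        fi
      done
      [ "$status" -eq 0 ]
    }
}

# Inside the hook loop this could replace the single-pom check:
#   check_revision "$newrev" || exit 1
```

A revision without any pom.xml passes the check, which matches the original behaviour of skipping non-Maven projects.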

How do I know which project is requesting a specific jar from Maven

I'm using Eclipse and recently upgraded all my projects to use the latest version of a library.
However in the Maven repository I can still see the old version of the library.
I've deleted manually the old library from the Maven repository, but it keeps coming back.
I am sure all the projects in Eclipse point to the new version: I've checked all my pom.xml, I've used the "Dependency Hierarchy" tool, etc.
Is there a way to know which project is telling Maven to download the old version of the library?
Many thanks!
You can use the Maven dependency plugin's tree goal:
mvn dependency:tree
and filter using the includes option which uses the pattern [groupId]:[artifactId]:[type]:[version].
Re: "and I have many". Perform the following in the topmost directory:
find . -name "pom.xml" -type f -exec mvn dependency:tree -f {} ';' | grep '^\[.*\] [-+\\\|].*'
Syntax details may vary from shell to shell.
Hint: Try it in a bottommost project directory first to ensure that it runs as intended. Since you have many projects, it may take a while to finish, and you would otherwise notice possible errors only at the end.
You can use the command below to get a tree of all dependencies and then find out where the specific artifact is coming from.
You can pipe it to grep to show only the related ones if you are on a linux/unix based os.
mvn dependency:tree
Thanks guys, appreciated, but it certainly is not an easy way. It looks like you have to do it project by project (and I have many). Plus, most of my poms reference poms in other folders, and it's not able to process those either.
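For the many-projects case, the output of a single mvn dependency:tree run can be post-processed to list only the projects that pull in a given artifact. The sketch below assumes the usual plugin banner line ending in "@ <project> ---" before each project's tree; projects_using is a hypothetical helper name and the coordinates in the usage note are made up.

```shell
# Hypothetical helper: reads `mvn dependency:tree` output on stdin and prints
# the name of every project whose tree mentions the given text.
projects_using() {
  awk -v needle="$1" '
    # Plugin banner, e.g. "[INFO] --- dependency:3.6.0:tree (default-cli) @ app-a ---"
    $NF == "---" && $(NF-2) == "@" { project = $(NF-1); next }
    # Any later line mentioning the needle belongs to the current project
    project != "" && index($0, needle) > 0 && !(project in seen) {
      seen[project] = 1
      print project
    }
  '
}
```

Typical use would be to save the output once, for example mvn dependency:tree > tree.txt, and then run projects_using 'com.example:old-lib:jar:1.0' < tree.txt. When only one artifact is of interest, the plugin's own filter (mvn dependency:tree -Dincludes=groupId:artifactId) may already be enough.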

could the first ever maven build be made offline?

The problem: you have a zipped java project distribution, which depends on several libraries like spring-core, spring-context, jackson, testng and slf4j. The task is to make the thing buildable offline. It's okay to create a project-scope local repo with all required library jars.
I've tried to do that. It looks like even though the project contains the jars it requires for javac and runtime, the build would still require internet access. Maven would still reach out to the network to fetch most of the plugins it requires for the build. I assume that maven is run with an empty .m2 directory (as this may be the first launch of the build, which may be an offline build). No, I am not okay with distributing a full maven repo snapshot along with the project itself, as this looks like an utter mess to me.
A bit of background: the broader task is to create a windows portable-style JDK/IntelliJ Idea distribution which goes along with the project and allows for some minimal java coding/running inside the IDE with minimal configuration and minimal internet access. The project is targeted towards students in a computer class, with little or no control over system configuration. It is desirable to keep the console build system intact for the offline mode, but I guess that maven is overly dependent on the network, so I would have to ditch it in favor of good old ant.
So, what's your opinion: could the first maven build be run completely in offline mode? My gut feeling is that the initial maven distribution just contains the bare minimum required to pull essential plugins off the main repo and is not fully functional without seeing the main repo at least once.
Maven has a '-o' switch which allows you to build offline:
-o,--offline Work offline
Of course, you will need to have your dependencies already cached into your $HOME/.m2/repository for this to build without errors. You can load the dependencies with:
mvn dependency:go-offline
I tried this process and it doesn't seem to fully work. I did a:
rm -rf $HOME/.m2/repository
mvn dependency:go-offline # lot of stuff downloaded
# unplugged my network
# develop stuff
mvn install # errors from missing plugins
What did work however is:
rm -rf $HOME/.m2/repository
mvn install # while still online
# unplugged my network
# develop stuff
mvn install
You could run mvn dependency:go-offline on a brand new .m2 repo for the concerned project. This should download everything that maven needs to be able to run offline. If these are then put into a project-scope local repo, you should be able to achieve what you want. I haven't tried this, though.
Specify a local repository location, either within the settings.xml file with <localRepository>...</localRepository> or by running mvn with the -Dmaven.repo.local=... parameter.
After the initial project build, all necessary artifacts should be cached locally, and you can reference this repository location in the same ways while running other Maven builds in offline mode (mvn -o ...).
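The two steps can be wrapped in a small shell function so that the online and the offline build always point at the same repository directory. This is a sketch under assumptions: mvn_project_repo is a hypothetical name, the .repo directory inside the project is an arbitrary choice, and the MVN variable is only a convenience for dry runs, not a Maven feature.

```shell
# Hypothetical wrapper: runs Maven against a repository directory stored
# inside the project. MVN defaults to "mvn"; setting MVN=echo prints the
# arguments instead of running a build.
#   $1 = project directory, remaining arguments are passed to Maven
mvn_project_repo() {
  project_dir=$1
  shift
  "${MVN:-mvn}" -f "$project_dir/pom.xml" "-Dmaven.repo.local=$project_dir/.repo" "$@"
}

# While online: populate the project-scoped repository once
#   mvn_project_repo /path/to/project install
# Later, without network access: reuse the same repository in offline mode
#   mvn_project_repo /path/to/project -o install
```

Running the real build once while online (rather than only dependency:go-offline) follows the observation above that plugins needed by install were otherwise missing.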

How do you keep Hudson from giving Maven the -B option for builds?

When Hudson goes to build my project, it executes Maven as follows:
Executing Maven: -B -f /path/to/root/pom.xml clean install
This works fine on most projects. (The -B is for "batch" or "non-interactive mode", BTW).
But it fails for this one project that uses AndroMDA (which I can't recommend for future projects, it's really a pain-in-the-butt; slows down the build by 1000% with code generation for things that could be trivially done with inheritance and annotation-based config).
For some reason unknown to me, when Maven is given the -B flag the generated classes are not put on the classpath, causing compilation errors for references to the generated classes. I've tested building manually with -B and without it, and the result is that it builds fine without -B (outside of Hudson) and it doesn't build with -B (again, outside of Hudson).
Using Hudson version 1.369 and an external Maven 2.2.1 install.
Any advice greatly appreciated!!!
P.S. Hudson is AWESOME!!!!
The simplest version would be to have a free style project, and call maven yourself.
