Multiple Java projects and refactoring

I have recently joined a project that is made up of multiple different projects. A lot of these projects depend on each other, using JAR files of the other projects included in a library, so any time you change one project, you then have to know which other projects use it and update them too. I would like to make this much easier, and was thinking about merging all this Java code into one project in separate packages. Is it possible to do this and then deploy only some of the packages in a jar? I would prefer not to deploy only part of it, but I have been asked if this is possible.
Is there a better way to handle this?

Approach 1: Using Hudson
If you use a continuous integration server like Hudson, then you can configure upstream/downstream projects (see Terminology).
A project can have one or several downstream projects. The downstream projects are added to the build queue if the current project is built successfully. It is possible to set it up so that the downstream project is added to the build queue even if the current project is unstable (default is off).
What this means is, if someone checks in some code into one project, at least you would get early warning if it broke other builds.
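To make the upstream/downstream idea concrete, here is a minimal sketch of how you could work out which projects are affected by a change. It is not how Hudson does it internally; the class name, the project names and the dependency map are all made up for illustration, and it assumes you can write down which project uses which JAR.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: the project names and the dependency map are illustrative only.
class DownstreamFinder {

    /**
     * Returns every project that directly or transitively depends on {@code changed}.
     * {@code dependsOn} maps a project to the projects whose JARs it uses.
     */
    static Set<String> downstreamOf(String changed, Map<String, List<String>> dependsOn) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(changed);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
                // A project is downstream if it lists the current project as a dependency.
                if (entry.getValue().contains(current) && affected.add(entry.getKey())) {
                    queue.add(entry.getKey());
                }
            }
        }
        return affected;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = Map.of(
                "web", List.of("service"),
                "service", List.of("common"),
                "batch", List.of("common"),
                "common", List.of());
        // Prints the projects that need a rebuild after "common" changes (order may vary).
        System.out.println(downstreamOf("common", dependsOn));
    }
}
```

With a CI server the same information lives in the job configuration, so nobody has to remember it.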
Approach 2: Using Maven
If the projects are not too complex, then perhaps you could create a main project, and make these sub-projects child modules of this project. However, mangling a project into a form that Maven likes can be quite tricky.

If you use Eclipse (or any decent IDE) you can just make one project depend on another, and supply that configuration aspect in your SVN, and assume checkouts in your build scripts.
Note that if one project depends on a certain version of another project, the JAR file is a far simpler way to manage this. A major refactoring could immediately mean lots of work in all the other projects to fix things, whereas you could just drop the new jar into each project as required and do the migration work then.

I guess it probably all depends on the specific project, but I think I would keep all the projects separate. This helps keep the whole system loosely coupled. You can use a tool such as Maven to help manage all the dependencies between the projects. Managing dependencies like this is one of Maven's main strengths.

Using Ant as your build tool, you can package your project any way that you want. However, leaving parts of your code out of the distribution seems like it would be error prone; you might accidentally leave out necessary classes (presumably, all of your classes are necessary).
In relation to keeping your code in different projects, I have a loose guideline. Keep the code that changes together in the same project and package it in its own jar file. This works best when some of your code can be broken out into utility libraries that change less frequently than your main application.
For example, you might have an application where you've generated web service client classes from a web service WSDL (using something like the Axis library). The web service interface will likely change infrequently, so you don't want to have the regeneration step reoccurring all the time in your main application build. Create a separate project for this piece so that you only have to recreate the web service client classes when the WSDL changes. Create a separate jar and use it in your main application. This style also allows other projects to reuse these utility modules.
When following this style, you should place a version number in the jar manifest so that you can keep track of which applications are using which versions of your module. Depending on how far you want to take this, you could also keep a text file in the jar that details the changes that have occurred for each revision (much like an open source library).
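As a rough sketch of the version-in-the-manifest idea, the standard Implementation-Title and Implementation-Version attributes can be written and read back with java.util.jar. The class name, the module name "ws-client" and the version "1.4.2" are invented for the example; in a real build Ant or Maven would normally write the manifest for you.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper showing a version number carried in a jar manifest.
class ManifestVersion {

    /** Builds a manifest with the standard Implementation-Title/Version attributes. */
    static Manifest manifestFor(String title, String version) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be present, otherwise the main attributes are not written.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.IMPLEMENTATION_TITLE, title);
        main.put(Attributes.Name.IMPLEMENTATION_VERSION, version);
        return manifest;
    }

    /** Writes an in-memory jar that contains only the manifest. */
    static byte[] writeJar(Manifest manifest) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // No class files are needed for the demonstration.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads Implementation-Version back from the jar, or null if there is no manifest. */
    static String readVersion(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            Manifest manifest = jar.getManifest();
            return manifest == null
                    ? null
                    : manifest.getMainAttributes().getValue(Attributes.Name.IMPLEMENTATION_VERSION);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] jar = writeJar(manifestFor("ws-client", "1.4.2"));
        System.out.println(readVersion(jar));
    }
}
```

An application can log these values at startup, which makes it easy to see which version of each module is actually deployed.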

It's all possible (we had the same situation some years ago). How hard or easy it'll be depends on your IDE (refactoring, merging, organizing the new project) and your build tool (deploying). We used IDEA as the IDE and Ant as the build tool and it wasn't too hard. It took one Sunday (nobody working or committing), with 2 people on one computer.
I'm not sure what you mean by
"deploy only some of the packages in a jar"
I think you will need all of them at runtime, won't you? As I understood they depend on each other.

Related

Java package / class library convention?

I am a C# developer and I am messing around with Java. In C# I would normally have my front end project and then when I need to add another layer to the project (i.e service layer etc) I would add a class library in the solution and add a reference to it.
What is the convention in Java? Do you add another Java project to the workspace and then reference the project? Or do you add a package to the project which contains your front end?
UPDATE
Sorry, I am using eclipse...hence the reference to 'workspace'

There's no real convention. When you say "workspace" you're not referring to Java, but rather a development environment (sounds like Eclipse). There are a number of ways to do it; you could do it the way you're suggesting, you could include the dependency via Maven, you could combine them all together in one project, etc.
Which to choose depends on your needs, who else will be consuming either the individual libraries or the completed project, and so on.

How to divide your source code depends a lot on the structure of your project. It is important to pay attention to good code organization. You should keep classes for a common task or for a distinct application layer in their own packages. You should watch for inter-package dependencies.
Using different "projects" (be it Maven or Eclipse projects) helps ensure that you (your developers) do not violate structural boundaries, because the compiler checks the dependencies (one project references the other project, like in C#/VS). Maven generates a build artifact (e.g. a JAR file) for every project.
To summarize, I think it is a good idea to create new individual projects for each program module in order to be able to manage the dependencies between the projects explicitly.

You are assuming everyone works with eclipse, it seems (your references to "workspace").
You can do anything you want, but keep in mind others might not be able to include 'separate' projects for various components of the application.
You can easily address that issue by using some build tool (ant, maven) to build appropriate jars for the various app components, like data-model, persistence, API, etc.
If your front end is an RIA, it might make more sense to develop it as a separate project, although that is not necessary. If your app is some sort of Java-driven UI, you can still do whatever you want. In both cases, make sure the UI components have their own package hierarchy.

Yes, I guess I would create a separate package. So your UI code might be in com.mycompany.app.ui, your service code in com.mycompany.app.service, etc. However you want to organize your classes is up to you. Java itself doesn't care what packages the classes are in. The packages just help to make the code more manageable for the developers.

Unlike most things in Java, there's no real convention defined for how to split up a project.
In my experience, it makes sense to include code that serves a particular business purpose in a single project, and to separate out code that you intend to share between multiple projects, or code that is not specific to a particular business purpose (e.g. database access, JMS libraries, etc.), into a separate project.

If the UI and the server layer are being developed in a single project, which means packaged and deployed in the same WAR file, I'd create a new package for the service and add classes and interfaces as needed.
If the service layer is deployed separately, I'd add dependencies as a JAR to the web project. All I should need are clients for the service.

If you're working in Eclipse follow these steps:
1) Right-click the project and choose "Build Path"-"Configure Build Path..."
2) Switch to Libraries tab and click Add External JARs (or just Add JARs if they're already in the workspace).
3) Now you can either manually add import of the corresponding class, or just hit Ctrl+Shift+O (Source-Organize Imports) and Eclipse will do the job for you.

I suggest you use NetBeans. Then you can create a Java class library; when you deploy your project, NetBeans will generate the jar files for you and place them in the right location. I'm also an MS developer, hope it helps.

Is there any benefit in using Maven Multimodule when working in a small application?

We are building a small application using different architectural layers such as domain, interface, infrastructure and application. This follows the Onion DDD model. Now I am wondering if there is any benefit in splitting the application into a multimodule maven project. As far as I can see now it seems to make things more difficult than needed. The entire application will be deployed as a single WAR file into a Tomcat container.

Splitting your application makes sense for the following:
When a certain part of the project needs to have new functionality or bug fixes, you can simply focus on that module and run just the tests for it. Compiling a fraction of all the code and running just the related tests speeds up your work.
You can re-use the code from the modules across different projects. Let's assume your project contains some well-written, generic-enough code for mail sending. If you later have another project that needs mail sending functionality, you can simply re-use your existing module or build upon it (in another module, by adding it as a dependency).
Easier maintainability in the long run. Maybe now it seems like a small project. In a few months things might look different, and then you'll need to do more refactoring to split things into logical units (modules).
Conceptual clarity (as added by Adriaan Koster).
Concerning the WAR: You can have an assembly module which puts things together and produces a final WAR file from all the related modules.
Initially, this may seem like more work, but in the long run, modularized projects are easier to work with and to maintain. Most sane developers would prefer this approach.

Using multiple modules forces you to have a hierarchy of dependencies. You have one module which is standalone and doesn't depend on any other of your modules. You have another which only depends on that. It might appear harder than allowing anything to depend on anything else, but the latter approach results in a mess of dependencies which is hard to fix later.
If you are trying to follow a layered model I suggest you place each layer in a different module. This will ensure you are not tempted to break the model.

Short answer: today it is small, tomorrow it will be bigger and more complicated to maintain, reuse, extend, integrate with other systems and so on.

As far as I know, Maven does little to help with WAR dependencies. As you are talking about a single WAR, this should never be a problem.
You can separate Java classes into several "jar" submodules, but if you split the WAR project into several smaller WARs using some kind of "overlapped" packaging, things get complicated.
Just for information: one of our projects contains too many web pages, so we decided to split it into several WAR submodules. However, the session is not shared between the different deployed WARs, and we are not going to use Kerberos stuff. In the end, we modified a lot of the sources of Glassfish, Jetty, MyFaces, etc. to make them resolve web.xml stuff inside JARs, and converted the whole project to Facelets 2.0 (to avoid the dependency on the JDK tools.jar and a custom resource handler). The only reason was to change the WAR submodules into JAR submodules and move all webapp/pages into class resources. So the conclusion: Maven does a great job for JAR dependencies, but not for WARs, or only for a single WAR.
EDIT You can put applicationContext.xml in one of the base submodules, and import it by classpath:com/example/applicationContext.xml. Also, Spring 3.0 does have annotation support; you can make Spring auto-scan them instead of declaring them all in the XML.

Splitting your project into multiple Maven projects is useful if you want to reuse your classes in another project or if your projects are deployed in different configurations.
Maybe think of a web service: if you are hosting the server, you could build a project for your domain classes (models) and your endpoint interfaces that could be used by both server and client. The server would be another project that is built into a WAR.
To develop further clients the first project could be used, too.
Use a parent project for dependency management on common projects (like logging) and different profiles and build configurations.

Best ways to manage generated artifacts for web service/xml bindings in a java webapp/client?

I'm working on a couple of web services that use JAXB bindings for the messages (in JAX-WS or spring-ws). When using these bindings there's always some code that is automatically generated from the WSDL to bind the message objects. I'm struggling to figure out the best way I can make this work so that it's easy to work with, hard to break and integrates nicely with IDEs (mostly using eclipse).
I think there are a couple of ways to go about this. The three main options I see right now are:
Generate code, keep the source artifacts and check them into the repository. Pros: integrates easily with IDEs (source highlighting etc), works within the build system. Cons: generated code changes each time you regenerate it, possibly creating noisy commits. It's also redundant since the WSDL file is already checked in, usually.
Generate code as part of the build process. Don't keep source artifacts or only keep them in output directories. Pros: fixes all the cons from the previous one. Cons: harder to integrate with IDE, though maybe this build step can be run automatically? I currently use this on one of my projects but the first time I checkout the project it appears broken, which is a minor nuisance.
Keep generated bindings in separate libraries (jars) included with maven or manually updated jars, depending on your build process. I got the idea from a thread on java.net. This seems more stable and uses explicit versioning but seems a bit heavyweight.
Which one of these options would you implement and how? We're currently using maven and eclipse, so any ideas in that regard would be great. I think this problem generalises to most other build systems and IDE combinations though, even other languages perhaps.

I went for option 3. If you already host your own repository (and optionally CI), it's not that heavyweight. All it takes is a simple POM. It's even possible to include some utility/wrapper/builder classes (that often make life easier with generated classes) and use them in several projects.

I'd go for option 2 and generate code in the "standard" ${project.build.directory}/generated-sources/<toolname> location as part of the build process. Using generated sources is well supported by m2eclipse (use Maven > Update Project Configuration once sources have been generated) and, if I remember well, by the maven eclipse plugin as well (i.e. the folder will be added to the Java Build Path). Actually, I think NetBeans also handles this fine. Not sure about IDEA.
For the generation itself, you may need the maven-jaxb2-plugin if I understood correctly.

How to modularize a (large) Java App?

I have a rather large (several MLOC) application at hand that I'd like to split up into more maintainable separate parts. Currently the product is comprised of about 40 Eclipse projects, many of them having inter-dependencies. This alone makes a continuous build system unfeasible, because it would have to rebuild very much with each checkin.
Is there a "best practice" way of how to
identify parts that can immediately be separated
document inter-dependencies visually
untangle the existing code
handle "patches" we need to apply to libraries (currently handled by putting them in the classpath before the actual library)
If there are (free/open) tools to support this, I'd appreciate pointers.
Even though I do not have any experience with Maven, it seems like it forces a very modular design. I wonder now whether this is something that can be retrofitted iteratively, or whether a project that was to use it would have to be laid out with modularity in mind right from the start.
Edit 2009-07-10
We are in the process of splitting out some core modules using Apache Ant/Ivy. Really helpful and well designed tool, not imposing as much on you as maven does.
I wrote down some more general details and personal opinion about why we are doing that on my blog - too long to post here and maybe not interesting to everyone, so follow at your own discretion: www.danielschneller.com

Using OSGi could be a good fit for you. It would allow you to create modules out of the application. You can also organize dependencies in a better way. If you define your interfaces between the different modules correctly, then you can use continuous integration, as you only have to rebuild the module that you affected on check-in.
The mechanisms provided by OSGi will help you untangle the existing code. Because of the way the classloading works, it also helps you handle the patches in an easier way.
Some concepts of OSGi that seem to be a good match for you, as shown from wikipedia:
The framework is conceptually divided into the following areas:
Bundles - Bundles are normal jar components with extra manifest headers.
Services - The services layer connects bundles in a dynamic way by offering a publish-find-bind model for plain old Java objects(POJO).
Services Registry - The API for management services (ServiceRegistration, ServiceTracker and ServiceReference).
Life-Cycle - The API for life cycle management (install, start, stop, update, and uninstall bundles).
Modules - The layer that defines encapsulation and declaration of dependencies (how a bundle can import and export code).
Security - The layer that handles the security aspects by limiting bundle functionality to pre-defined capabilities.

First: good luck & good coffee. You'll need both.
I once had a similar problem. Legacy code with awful circular dependencies, even between classes from different packages, like org.example.pkg1.A depends on org.example.pk2.B and vice versa.
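If you can export a "who depends on whom" map for your packages or projects, finding such circular dependencies is a plain depth-first search. The sketch below is only an illustration of that idea; the class name and the package names in the example are hypothetical, and a real dependency analysis tool will give you far more detail.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: finds one circular dependency in a dependency map.
class CycleFinder {

    private enum State { VISITING, DONE }

    /** Returns one cycle as a list of names (first element repeated at the end), or an empty list. */
    static List<String> findCycle(Map<String, List<String>> dependsOn) {
        Map<String, State> state = new HashMap<>();
        List<String> path = new ArrayList<>();
        for (String start : dependsOn.keySet()) {
            List<String> cycle = visit(start, dependsOn, state, path);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        return List.of();
    }

    private static List<String> visit(String node, Map<String, List<String>> dependsOn,
                                      Map<String, State> state, List<String> path) {
        State seen = state.get(node);
        if (seen == State.DONE) {
            return List.of();
        }
        if (seen == State.VISITING) {
            // The node is already on the current path, so the tail of the path is a loop.
            List<String> cycle = new ArrayList<>(path.subList(path.indexOf(node), path.size()));
            cycle.add(node);
            return cycle;
        }
        state.put(node, State.VISITING);
        path.add(node);
        for (String dependency : dependsOn.getOrDefault(node, List.of())) {
            List<String> cycle = visit(dependency, dependsOn, state, path);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        path.remove(path.size() - 1);
        state.put(node, State.DONE);
        return List.of();
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = Map.of(
                "org.example.pkg1", List.of("org.example.pkg2"),
                "org.example.pkg2", List.of("org.example.pkg1"));
        System.out.println(findCycle(dependsOn));
    }
}
```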
I started with maven2 and fresh eclipse projects. First I tried to identify the most common functionalities (logging layer, common interfaces, common services) and created maven projects. Each time I was happy with a part, I deployed the library to the central nexus repository so that it was almost immediately available for other projects.
So I slowly worked up through the layers. maven2 handled the dependencies and the m2eclipse plugin provided a helpful dependency view. BTW - it's usually not too difficult to convert an eclipse project into a maven project. m2eclipse can do it for you and you just have to create a few new folders (like src/main/java) and adjust the build path for source folders. Takes just a minute or two. But expect more difficulties, if your project is an eclipse plugin or rcp application and you want maven not only to manage artifacts but also to build and deploy the application.
In my opinion, Eclipse, Maven and Nexus (or any other Maven repository manager) are a good basis to start from. You're lucky if you have good documentation of the system architecture and this architecture is really implemented ;)

I had a similar experience in a small code base (40 kloc). There are no "rules":
I compiled with and without a "module" in order to see its usage
I started from "leaf modules", modules without other dependencies
I handled cyclic dependencies (this is a very error-prone task)
with Maven there is a great deal of documentation (reports) that can be deployed in your CI process
with Maven you can always see what uses what, both in the site and in NetBeans (with a very nice directed graph)
with Maven you can import library code into your codebase, apply source patches and compile with your products (sometimes this is very easy, sometimes it is very difficult)
Check also Dependency Analyzer (screenshot, source: javalobby.org) and the NetBeans dependency graph (screenshot, source: zimmer428.net).

Maven is painful to migrate to for an existing system. However it can cope with 100+ module projects without much difficulty.

The first thing you need to decide is what infrastructure you will move to. Should it be a lot of independently maintained modules (which translates to individual Eclipse projects), or will you consider it a single chunk of code which is versioned and deployed as a whole? The first is well suited for migrating to a Maven-like build environment, the latter for having all the source code in at once.
In any case you WILL need a continuous integration system running. Your first task is to make the code base build automatically, so you can let your CI system watch over your source repository and rebuild it when you change things. I decided on a non-Maven approach here, and we focus on having an easy Eclipse environment, so I created a build environment using ant4eclipse and Team ProjectSet files (which we use anyway).
The next step would be getting rid of the circular dependencies. This will make your build simpler, get rid of Eclipse warnings, and eventually allow you to get to the "checkout, compile once, run" stage. This might take a while :-( When you migrate methods and classes, do not MOVE them, but extract or delegate them, leave their old name lying around and mark it deprecated. This will separate your untangling from your refactoring, and allow code "outside" your project to still work with the code inside your project.
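A minimal sketch of the "delegate and deprecate" step, with invented class and method names: the logic moves to its new home, while the old entry point stays and simply forwards, so callers outside your project keep compiling.

```java
// Hypothetical classes illustrating "extract and delegate" instead of moving.
class PriceCalculator {

    /** New home of the logic, for example in a freshly extracted module. */
    static long grossCents(long netCents, int vatPercent) {
        return netCents + netCents * vatPercent / 100;
    }
}

class LegacyOrderUtil {

    /**
     * Old entry point, kept so that existing callers still compile.
     *
     * @deprecated use {@link PriceCalculator#grossCents(long, int)} instead
     */
    @Deprecated
    static long calcGross(long netCents, int vatPercent) {
        // Pure delegation: no logic is left behind in the old location.
        return PriceCalculator.grossCents(netCents, vatPercent);
    }
}
```

Callers get a deprecation warning instead of a compile error, and you can delete the old method once the warnings are gone.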
You WILL benefit from a source repository which allows for moving files, and keeping history. CVS is very weak in this regard.

I wouldn't recommend Maven for a legacy source code base. It could give you many headaches just trying to adapt everything to work with it.
I suppose what you need is to do an architectural layout of your project. A tool might help, but the most important part is to organize a logical view of the modules.

It's not free but Structure101 will give you as good as you will get in terms of tool support for hitting all your bullet points. But for the record I'm biased, so you might want to check out SonarJ and Lattix too. ;-)

How many multiple "Eclipse Projects" is considered too excessive for one actual development project?

I'm currently working on a project that contains many different Eclipse projects referencing each other to make up one large project. Is there a point where a developer should ask themselves if they should rethink the way their development project is structured?
NOTE: My project currently contains 25+ different Eclipse projects.

My general rule of thumb is that I create a new project for every reusable component. So, for example, if I have some isolated functionality that can be packaged, say, as a jar, I would create a new project so I can build, package and distribute the component independently.
Also, if there are certain projects that you do not need to make frequent changes to, you can build them only when required and keep them "closed" in eclipse to save time on indexing, etc. Even if you think that a certain component is not reusable, as long as it is separated from the rest of the code base in terms of logic/concerns you may be well served by just separating it out. Sometimes seemingly specific code might be reusable in another project or in a future version of the same project.

When compiled, a project would typically result in a jar. So if your application consists of potentially reusable components, it is ok to use a project for each.
I'm a big fan of using a lot of projects, I feel that this "breaks down" large things beyond what I can do with packages, and helps me orient and navigate.
Of course, if you're developing Eclipse plug-ins, everything would be a project anyway.
The only thing I would watch out for has to do with your source control and its ability to handle moves of files between projects. Subclipse had been giving me trouble with it, or maybe it was my SVN server that did.

If your project has that many sub-projects, or modules, needed to actually compose your final artifact, then it is time to look at something like Maven and setting up a multi-module project. It will allow you to work on each module independently without IDE worries, and allow easy setup in your IDE (and others' IDEs) through the mvn eclipse:eclipse goal. In addition, when building your entire top-level project, Maven will be able to derive from the list of dependencies you have described which modules need to be built in what order.
Here's a quick link via google and a link to the book Maven: The Definitive Guide, which will explain things in much better detail in chapter 6 (once you have the basics).
This will also force your project to not be explicitly tied to Eclipse. Being able to build independent from an ide means that any Joe Schmoe can come along and easily work with your code base using whatever tools he/she needs.
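Deriving the build order from declared dependencies is essentially a topological sort. The following is a small sketch of that idea (Kahn's algorithm), not Maven's actual implementation; the class name and the module names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: orders modules so each one comes after everything it depends on.
class BuildOrder {

    static List<String> of(Map<String, Set<String>> dependsOn) {
        Map<String, Integer> pending = new TreeMap<>();          // module -> unbuilt dependencies
        Map<String, List<String>> dependents = new HashMap<>();  // module -> modules that use it
        dependsOn.forEach((module, deps) -> {
            pending.put(module, deps.size());
            for (String dep : deps) {
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(module);
            }
        });
        // Modules that only appear as a dependency have nothing to wait for.
        for (String dep : dependents.keySet()) {
            pending.putIfAbsent(dep, 0);
        }

        Deque<String> ready = new ArrayDeque<>();
        pending.forEach((module, count) -> {
            if (count == 0) {
                ready.add(module);
            }
        });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String module = ready.remove();
            order.add(module);
            for (String user : dependents.getOrDefault(module, List.of())) {
                // One dependency less to wait for; build the user once none are left.
                if (pending.merge(user, -1, Integer::sum) == 0) {
                    ready.add(user);
                }
            }
        }
        if (order.size() != pending.size()) {
            throw new IllegalStateException("dependency cycle among modules");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> dependsOn = Map.of(
                "war", Set.of("web", "service"),
                "web", Set.of("service"),
                "service", Set.of("domain"),
                "domain", Set.of());
        System.out.println(of(dependsOn));
    }
}
```

Note that a cycle makes any such ordering impossible, which is one more reason to untangle circular dependencies before splitting things into modules.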

Create jars for the projects you don't work in often. That should greatly reduce the clutter. If you work on all the projects often, then you can add targets to your build that will jar up the respective projects for you, which condenses everything down to one file that you can then include on the class path.

An additional method is to create many different workspaces. The benefit of separate workspaces is that you can remove some of the visual clutter and performance overhead of having lots of projects. You can use targets to jar up all of your projects and put them in a repository so you can reference them in each workspace.

At a former job the entire application was more than 170 projects. While it was rarely necessary to have all projects checked out locally, even the 30-40 projects constantly in our scope made reindexing, etc. very slow.

Yeesh. One Project for each Project. If you are using reusable projects, make them into a library, for heaven's sake. Break the non-reusable projects into packages, that's what they are there for.

That's a hard question, and answers range from having just one Eclipse project to having one Eclipse project for every single class.
My bottom line:
You can have too few projects, and never too many (of course, use automation, e.g. mvn eclipse:eclipse).
Use -Declipse.useProjectReferences=true/false when using Maven to switch the workspace mode between jar and project dependencies.
Use the mvn release plugin to generate consecutive releases (automatic version increase).
Multiple projects give you independent versioning, which is extremely important. E.g. one dev may work on a new version of a module while you still depend on the previous one, and at some point you decide to upgrade to the newer version (possibly by increasing its version in the pom.xml dependency section). Or, in another scenario, if one project contains a bug, you downgrade to its previous version.
Multiple projects make you think about the architecture more than if you have just packages.
Multiple projects generally make architectural problems more evident than if you have just one project. Anyone would like to comment on this?
You never know whether your project will evolve into OSGi/SOA/EDA, where you need separation.
Even if you're 100% sure that your projects will be deployed as one jar, the old way, in a single JVM, it still does not hurt (mvn assembly plugin) to have multiple Eclipse projects for logically independent pieces of code.
BTW, the project I work on is divided into 24 eclipse projects.

Hell, we have more than 100. Projects don't cost anything.
