First off, I'm coming (back) to Java from C#, so apologies if my terminology or philosophy doesn't quite line up.
Here's the background: we've got a growing collection of internal support tools written for the web. They use HTML5/AJAX/other buzzwords for the frontend and Java for the backend. These tools utilize a lightweight in-house framework so they can share an administrative interface for security and other configuration. Each tool has been written by a separate author and I expect that trend to continue, so I'd like to make it easy for future authors to stay "standardized" on the third-party libraries that we've already decided to use for things like DI, unit testing, ORM, etc.
Our package naming currently looks like this:
com.ourcompany.tools.framework
com.ourcompany.tools.apps.app1name
com.ourcompany.tools.apps.app2name
...and so on.
So here's my question: should each of these apps (and the framework) be treated as a separate project for purposes of Maven setup, Eclipse, etc?
We could have lots of apps appear here over time, so it seems like separation would keep dependencies cleaner and let someone jump in on a single tool more easily. On the other hand, (1) maybe "splitting" deeper portions of a package structure over multiple projects is a code smell and (2) keeping them combined would make tool writers more inclined to use third-party libraries already in place for the other tools.
FWIW, my initial instinct is to separate them.
What say you, Java gurus?
I would absolutely separate them. For the purposes of Maven, make sure each app/project declares the appropriate dependencies on the framework and on any other apps it needs, so you don't have to build everything when you just want to build a single app.
I keep my projects separated out, but use a parent pom for including all of the dependencies and other common properties. Individual tools / projects have a name and a reference to the parent project, and any project-specific dependencies, if any. This works for helping to keep to common libraries and dependencies, since the common ones are already all configured, but allows me to focus on the specific portion of the codebase that I need to work with.
I'd definitely separate these kinds of things out into separate projects.
You should use Maven to handle the dependencies / build process automatically (both for your own internal shared libraries and third party dependencies). There won't be any issue having multiple applications reference the same shared libraries - you can even keep multiple versions around if you need to.
Couple of bonuses from this approach:
This forces you to think carefully about your API design for the shared projects which will be a good thing in the long run.
It will probably also give you about the right granularity for source code control - i.e. your developers can check out and work on specific applications or backend modules individually
If there is a section of a project that is likely to be used on more than one project it makes sense to pull that out. It will make it a little cleaner as well if you need to update the code in one of the commonly used projects.
If you keep them together you will have fewer obstacles developing, building and deploying your tools.
We had the opposite situation, having many separate projects. After merging them into one project tree we are much more productive and this is more important to us than whatever conventions happen to be trending.
Related
We currently have an application which is essentially a fully-functional demo for potential clients. All the functionality is there. However, we use generic branding/logos, call our own web services (which would later be swapped out for calls to client web-services), etc.
Here is my question. If we have two different clients, we would prefer as little duplicate code as possible. I understand that this could be done -- from a java perspective -- by simply including a shared JAR. However, we will need to change around resources. Also, one client may not want some functionality that another client does want. On top of this, if we are doing general bug fixes, we will normally want these fixes to be in both versions of the application.
We are using Git for version control and Maven for building the project.
One option we discussed is simply branching the project and maintaining separate versions. However, then we would have to manually merge changes that we want reflected in all versions of the app.
Another option we discussed is somehow swapping out resources, etc. using maven profiles. However, if we need to make any non-superficial changes to the code itself, this could be a problem. We might have to get into factories and different implementations.
Does anyone have recommendations on the best way to handle this?
We use a library project with git submodules to handle all of our similar projects. The master project is pretty hefty but we use a configuration file to determine what features should be in the finished product.
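As a rough illustration of the configuration-file idea (not the actual setup described above), a shared code base could read per-client feature switches and branding from a properties file. The class name, the key names (`feature.*`, `branding.name`) and the sample values are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: one shared code base, per-client switches in a config file.
final class ClientFeatures {
    private final Properties props = new Properties();

    // In a real build the text would come from a per-client resource file;
    // it is passed in as a string here to keep the sketch self-contained.
    ClientFeatures(String propertiesText) {
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A feature is on only if its key is present and set to "true".
    boolean isEnabled(String feature) {
        return Boolean.parseBoolean(props.getProperty("feature." + feature, "false"));
    }

    // Falls back to generic branding when the client file does not override it.
    String branding() {
        return props.getProperty("branding.name", "Generic Demo");
    }

    public static void main(String[] args) {
        ClientFeatures clientA = new ClientFeatures(
                "branding.name=Client A\nfeature.reports=true\nfeature.export=false\n");
        System.out.println(clientA.branding());
        System.out.println("reports enabled: " + clientA.isEnabled("reports"));
        System.out.println("export enabled: " + clientA.isEnabled("export"));
    }
}
```

Bug fixes then land once in the shared code, and only the small per-client file differs between deliverables.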
I would like to know how you set up your projects in Java. For example, in my current work project, a six-year-old J2EE app with approximately 2 million LoC, we only have one project in Eclipse. The package structure is split into tiers and then domains, so it follows guidelines from Sun/Oracle. A huge Ant script builds different jars out of this one source folder.
Personally I think it would be better to have multiple projects, at least for each tier. Recently I was playing around with a project structure like this:
Domainproject (contains only annotated pojos, needed by all other projects)
Datalayer (only persistence)
Businesslogic (services)
Presenter
View
This way, it should be easier to exchange components. In addition, when using a build tool like Maven I can have everything in a repository so when I am only working on the frontend I can get the rest as a dependency in my classpath.
Does this make sense to you? Do you use different approaches, and what do they look like?
Furthermore, I am struggling with how to name my packages/projects correctly. Right now, the above project structure is reflected in the names of the packages, e.g. de.myapp.view, and it continues with some technical subfolders like internal or interfaces. What I am missing here, and I don't know how to do this properly, is the distinction by domain. When the project gets bigger it would be nice to recognise a particular domain but also the technical details, to navigate more easily within the project.
This leads to my second question: how do you name your projects and packages?
Your approach makes sense. I would normally decompose into a model (shared), numerous libraries, and then the applications consuming that code and the GUIs - all as separate projects. I tend to follow the Pragmatic Programmers' dictum of build toolsets, not applications. That way you can reassemble your components in lots of different ways.
Each library/application would be its own project, with unit/functional tests and a deliverable (in your case, a Maven artifact that you can store and version appropriately).
The only headache is managing the interfaces and linking between these components. An effective integration test environment is key here.
"This leads to my second question: how do you name your projects and packages?"
For project names I prefer an internal code name, like Longhorn for Windows Vista. This name never changes (like my kids' names), so marketing etc. can register any name, rebrand, and so on.
Packages are a question of (personal) preferences and style. And normally the senior programmer decides the structure. Of course there are some "standards" as "gui" for UI classes, "util","misc", "impl" for interface implementations, "domain" for domain object classes, etc that you should use consistently and express your style.
What is considered best practice deciding how to define the set of JAR's for a project (for example a Swing GUI)? There are many possible groupings:
JAR per layer (presentation, business, data)
JAR per (significant?) GUI panel. For significant system, this results in a large number of JAR's, but the JAR's are (should be) more re-usable - fine-grained granularity
JAR per "project" (in the sense of an IDE project); "common.jar", "resources.jar", "gui.jar", etc
I am an experienced developer; I know the mechanics of creating JAR's, I'm just looking for wisdom on best-practice.
Personally, I like the idea of a JAR per component (e.g. a panel), as I am mad-keen on encapsulation and the holy grail of re-use across projects. I am concerned, however, that on a practical, performance level, the JVM would struggle with class loading across dozens, maybe hundreds, of small JARs. Each JAR would contain the GUI panel code and its necessary resources (i.e. not centralised), so each panel can stand alone.
When I say "holy grail of reuse", I say this more because it demonstrates a cleanly decoupled, encapsulated design, rather than necessarily expecting its re-use elsewhere. I consider myself a "normally intelligent" person; I find that the spaghetti of intertwined nonsense I've had to deal with during my career slows me down 10- to 100-fold. A cleanly decoupled design allows me to deal with one concept at a time, one layer, one class.
Does anyone have wisdom to share?
I would recommend as few JARs as possible.
The logic behind it: disk storage is about the cheapest commodity there is, whereas the time spent tracing complex dependencies is costly.
Hence the emergence of the .war files where all dependencies of the web application are put into a single file.
BTW, Eclipse has a JAR exporter plugin which puts all dependent jars into a super jar and exposes the entry-level main method, so you can start your app with the java -jar file.jar command. Although the resultant jar may be large, the upside is not having to maintain very complex class paths for your application.
So, in your case I would go with one jar per project. If you determine that you indeed need to reuse some code in another project, just refactor it into the base project and make it a dependency of both your existing project and the other project.
You can actually use both approaches. Spring, for example, offers a big monolithic jar file which contains the most common functionality. If you want, however, you can also download independent jar files. It is then left to the user to select what is best. Big jar files are easier to deploy, but they are harder to upgrade. Also, you may end up adding a big jar when you only need a single class. I find that it is easier to spot dependencies with small jar files. I also think that updating/upgrading is easier.
Java provides encapsulation and re-use at the class layer - the jar file format doesn't really provide it. I don't see much advantage in putting a significant component in its own jar, unless you think lots of people will be downloading it.
I read somewhere (and I was trying to find it when I found this) that project per layer is the best. It's what I've been doing. Struts, Spring MVC, Swing, whatever in one layer, EJBs in another, business services in another and DAOs in another. I put all of the DTOs in its own project as well, even though they don't represent a layer, but are instead passed through the layers.
The main benefit I remember reading about was being able to version each jar separately.
Oh, and BTW, each layer actually has two jars, one for the interfaces that the layer above uses, and another for the implementation(s).
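A minimal sketch of that two-jars-per-layer split, assuming a hypothetical customer DAO. Everything is in one file here so it runs as-is; the comments mark which jar each type would live in, and the jar names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class LayerJarsSketch {

    // Would live in the interface jar (hypothetical "dao-api"):
    // the only thing the layer above compiles against.
    interface CustomerDao {
        Optional<String> findName(int id);
    }

    // Would live in the implementation jar (hypothetical "dao-impl"):
    // replaceable without recompiling the layers above.
    static final class InMemoryCustomerDao implements CustomerDao {
        private final Map<Integer, String> rows = new HashMap<>();

        InMemoryCustomerDao() {
            rows.put(1, "Ada");
        }

        @Override
        public Optional<String> findName(int id) {
            return Optional.ofNullable(rows.get(id));
        }
    }

    // Business layer: knows the interface, never the implementation class.
    static final class CustomerService {
        private final CustomerDao dao;

        CustomerService(CustomerDao dao) {
            this.dao = dao;
        }

        String greeting(int id) {
            return dao.findName(id).map(n -> "Hello, " + n).orElse("Unknown customer");
        }
    }

    public static void main(String[] args) {
        // Only the wiring code needs both jars on its classpath.
        CustomerService service = new CustomerService(new InMemoryCustomerDao());
        System.out.println(service.greeting(1));
        System.out.println(service.greeting(2));
    }
}
```

With this split, the implementation jar can be versioned and swapped independently as long as the interface jar stays stable.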
I have a rather large (several MLOC) application at hand that I'd like to split up into more maintainable separate parts. Currently the product is comprised of about 40 Eclipse projects, many of them having inter-dependencies. This alone makes a continuous build system unfeasible, because it would have to rebuild very much with each checkin.
Is there a "best practice" way of how to
identify parts that can immediately be separated
document inter-dependencies visually
untangle the existing code
handle "patches" we need to apply to libraries (currently handled by putting them in the classpath before the actual library)
If there are (free/open) tools to support this, I'd appreciate pointers.
Even though I do not have any experience with Maven, it seems like it forces a very modular design. I wonder now whether this is something that can be retrofitted iteratively, or whether a project that was to use it would have to be laid out with modularity in mind right from the start.
Edit 2009-07-10
We are in the process of splitting out some core modules using Apache Ant/Ivy. Really helpful and well designed tool, not imposing as much on you as maven does.
I wrote down some more general details and personal opinion about why we are doing that on my blog - too long to post here and maybe not interesting to everyone, so follow at your own discretion: www.danielschneller.com
Using OSGi could be a good fit for you. It would allow you to create modules out of the application. You can also organize dependencies in a better way. If you define your interfaces between the different modules correctly, then you can use continuous integration, as you only have to rebuild the module that you affected on check-in.
The mechanisms provided by OSGi will help you untangle the existing code. Because of the way the classloading works, it also helps you handle the patches in an easier way.
Some concepts of OSGi that seem to be a good match for you, as shown from wikipedia:
The framework is conceptually divided into the following areas:
Bundles - Bundles are normal jar components with extra manifest headers.
Services - The services layer connects bundles in a dynamic way by offering a publish-find-bind model for plain old Java objects(POJO).
Services Registry - The API for management services (ServiceRegistration, ServiceTracker and ServiceReference).
Life-Cycle - The API for life cycle management (install, start, stop, update, and uninstall bundles).
Modules - The layer that defines encapsulation and declaration of dependencies (how a bundle can import and export code).
Security - The layer that handles the security aspects by limiting bundle functionality to pre-defined capabilities.
First: good luck & good coffee. You'll need both.
I once had a similar problem: legacy code with awful circular dependencies, even between classes from different packages, e.g. org.example.pkg1.A depends on org.example.pkg2.B and vice versa.
I started with maven2 and fresh eclipse projects. First I tried to identify the most common functionalities (logging layer, common interfaces, common services) and created maven projects. Each time I was happy with a part, I deployed the library to the central nexus repository so that it was almost immediately available for other projects.
So I slowly worked up through the layers. maven2 handled the dependencies and the m2eclipse plugin provided a helpful dependency view. BTW - it's usually not too difficult to convert an eclipse project into a maven project. m2eclipse can do it for you and you just have to create a few new folders (like src/main/java) and adjust the build path for source folders. Takes just a minute or two. But expect more difficulties, if your project is an eclipse plugin or rcp application and you want maven not only to manage artifacts but also to build and deploy the application.
In my opinion, eclipse, maven and nexus (or any other maven repository manager) are a good basis to start from. You're lucky if you have good documentation of the system architecture and this architecture is really implemented ;)
I had a similar experience in a small code base (40 kloc). There are no "rules":
I compiled with and without a "module" in order to see its usage
I started from "leaf modules", modules without other dependencies
I handled cyclic dependencies (this is a very error-prone task)
with maven there is a great deal of documentation (reports) that can be deployed in your CI process
with maven you can always see what uses what, both in the site and in netbeans (with a very nice directed graph)
with maven you can import library code in your codebase, apply source patches and compile with your products (sometimes this is very easy, sometimes it is very difficult)
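The "start from leaf modules" and "handle cyclic dependencies" steps can be sketched as a small topological sort (Kahn's algorithm) over a hand-written dependency map. This is only an illustration, not something Maven or NetBeans exposes; the module names are hypothetical, and in practice the map would be filled from whatever your build tool reports.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ModuleOrder {

    // deps maps each module to the modules it depends on.
    // Returns modules leaf-first; modules caught in (or depending on) a cycle
    // never reach zero unresolved dependencies and are therefore left out.
    static List<String> leafFirst(Map<String, List<String>> deps) {
        Map<String, Integer> unresolved = new TreeMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            unresolved.putIfAbsent(e.getKey(), 0);
            for (String dep : e.getValue()) {
                unresolved.putIfAbsent(dep, 0);
                unresolved.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Leaf modules have nothing left to resolve, so they can be extracted first.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : unresolved.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String module = ready.remove();
            order.add(module);
            for (String user : dependents.getOrDefault(module, Collections.<String>emptyList())) {
                if (unresolved.merge(user, -1, Integer::sum) == 0) {
                    ready.add(user);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("web", Arrays.asList("core"));
        deps.put("core", Arrays.asList("logging"));
        deps.put("logging", Collections.<String>emptyList());
        System.out.println("Extraction order: " + leafFirst(deps));

        // Two modules that need each other: neither can be extracted on its own.
        deps.put("billing", Arrays.asList("orders"));
        deps.put("orders", Arrays.asList("billing"));
        List<String> order = leafFirst(deps);
        List<String> stuck = new ArrayList<>(deps.keySet());
        stuck.removeAll(order);
        Collections.sort(stuck);
        System.out.println("Blocked by a cycle: " + stuck);
    }
}
```

Whatever is left out of the returned order is the set that needs untangling before it can become its own module.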
Check also Dependency Analyzer:
(source: javalobby.org)
Netbeans:
(source: zimmer428.net)
Maven is painful to migrate to for an existing system. However it can cope with 100+ module projects without much difficulty.
The first thing you need to decide is what infrastructure you will move to. Should it be a lot of independently maintained modules (which translates to individual Eclipse projects), or will you consider it a single chunk of code which is versioned and deployed as a whole? The first is well suited for migrating to a Maven-like build environment, the latter for having all the source code in at once.
In any case you WILL need a continuous integration system running. Your first task is to make the code base build automatically, so you can let your CI system watch over your source repository and rebuild it when you change things. I decided on a non-Maven approach here, and we focus on having an easy Eclipse environment, so I created a build environment using ant4eclipse and Team ProjectSet files (which we use anyway).
The next step would be getting rid of the circular dependencies - this will make your build simpler, get rid of Eclipse warnings, and eventually allow you to get to the "checkout, compile once, run" stage. This might take a while :-( When you migrate methods and classes, do not MOVE them, but extract or delegate them and leave their old name lying around and mark them deprecated. This will separate your untangling from your refactoring, and allow code "outside" your project to still work with the code inside your project.
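A minimal sketch of that "extract, delegate, deprecate" step. The class and method names are hypothetical, and both classes are nested in one file only so the example runs as-is; in a real migration they would sit in different modules.

```java
import java.util.Locale;

final class DelegationSketch {

    // New home of the logic, in the freshly extracted module.
    static final class DateFormatting {
        static String isoDate(int year, int month, int day) {
            return String.format(Locale.ROOT, "%04d-%02d-%02d", year, month, day);
        }
    }

    // Old location: the name stays, so existing callers keep compiling.
    static final class LegacyUtils {
        /**
         * @deprecated use {@link DateFormatting#isoDate(int, int, int)} instead.
         */
        @Deprecated
        static String formatDate(int year, int month, int day) {
            // No logic left here, only a forward to the new implementation.
            return DateFormatting.isoDate(year, month, day);
        }
    }

    public static void main(String[] args) {
        // Old and new call sites give the same result during the transition.
        System.out.println(LegacyUtils.formatDate(2009, 7, 10));
        System.out.println(DateFormatting.isoDate(2009, 7, 10));
    }
}
```

Callers can then be moved to the new name one at a time, and the deprecated forwarder is deleted once the compiler stops reporting uses of it.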
You WILL benefit from a source repository which allows for moving files, and keeping history. CVS is very weak in this regard.
I wouldn't recommend Maven for a legacy source code base. It could give you many headaches just trying to adapt everything to work with it.
I suppose what you need is to do an architectural layout of your project. A tool might help, but the most important part is to organize a logical view of the modules.
It's not free but Structure101 will give you as good as you will get in terms of tool support for hitting all your bullet points. But for the record I'm biased, so you might want to check out SonarJ and Lattix too. ;-)
We have this constant discussion in our project as to the granularity of our maven modules. We have come to agree that there may be differences in the needs of a framework (like spring) and an in-house application that is always deployed monolithically.
We also agree that it's fairly sensible to hide implementation details of adapters to external systems behind a separate API module, so the implementation classes don't bleed into the classpath of the main implementations.
But that's as far as we go. It's a web project so we have modules like "web", "core" and "adapter(s)". We have multiple backends, but we don't require plugability.
What criteria do you use for modularizing in maven? Which modules do you make for web projects?
In my opinion, the project division should be pretty fine grained, even for "only a webapp".
I would make separate projects for the data access layer interfaces and implementation, business layer interfaces and implementation, and the webapp itself. I would also make at least one "commons" project for containing code relevant to more than one of the other projects. But this is just the beginning. I would not hesitate to extract a commons-util project for utility classes relevant regardless of the application that is being developed (String, Date, Reflection, etc). I would also make a project for useful utilities when doing testing (commons-test). And that's just the next step ... ;)
If I wrote generally useful code relevant to hibernate, I would put it in a hibernate-utils project. Useful Spring utilities would go in a spring-utils project etc. When doing this, many projects will only contain a single or a few packages, and the packages will commonly contain few classes.
My reasoning for doing this, is that it helps me think about the code I write. Is this REALLY business logic, or is it general String manipulation, Date manipulation, Hibernate specific logic etc? My layers become cleaner, and it becomes harder to get circular dependencies between packages and projects (we don't want those). In addition, it becomes much easier to reuse code in other projects. There will always be other projects...
I have also found that it is easier for new developers to get the hang of the structure, because the projects become smaller and more manageable; it's easier to start coding when you feel you don't have to understand everything.
As a last advantage to the fine grained approach, build times reduce because you don't have to build everything every time.