I'm wondering if there is some recommended reading, best practice or opinion on how to organize larger Java projects.
I have observed that some folks split everything up into projects (i.e. modules) and create many, many projects that share a web of dependencies. This has the advantage that compilation is often very fast, but when the project gets large nobody knows anymore what depends on what and why, not to mention dependent libraries, version conflicts & co.
The alternative is to have just a couple of projects, such as frontend, backend, and so on, and let namespacing do the job.
Any opinions or further reading anyone could recommend?
As soon as you start splitting a big project up into smaller projects, you encounter a lot of dependency tracking that you generally didn't have to consider. You could manage this yourself or you could use software which already handles a lot of the core issues.
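To make "dependency tracking" concrete, here is a minimal sketch of the question a dependency tool answers for you: given each module's direct dependencies, what does a module depend on in total? The module names are hypothetical, and this is only an illustration of the idea, not how Ivy or Maven implement it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DependencyWeb {

    /**
     * Returns every module reachable from {@code start} through direct
     * dependencies. {@code start} itself only appears if a cycle leads back to it.
     */
    public static Set<String> transitiveDependencies(Map<String, List<String>> directDeps, String start) {
        Set<String> seen = new TreeSet<>();
        Deque<String> toVisit = new ArrayDeque<>(directDeps.getOrDefault(start, List.of()));
        while (!toVisit.isEmpty()) {
            String module = toVisit.pop();
            // Only expand a module the first time it is encountered.
            if (seen.add(module)) {
                toVisit.addAll(directDeps.getOrDefault(module, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical module names, for illustration only.
        Map<String, List<String>> directDeps = Map.of(
                "frontend", List.of("services", "util"),
                "services", List.of("persistence", "util"),
                "persistence", List.of("util"),
                "util", List.of());
        System.out.println(transitiveDependencies(directDeps, "frontend"));
        // prints [persistence, services, util]
    }
}
```

With a handful of modules you can keep this in your head; with dozens, you want a tool to compute and check it for you.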
I would recommend Apache's Ivy. It integrates well with Apache's Ant, and has a separate configuration file (which gets checked in) to track what is required for each kind of build.
Apache's Maven is another good choice; however, it does a lot more than Apache's Ivy. Sometimes that "a lot more" means you do less of what you would have done anyway; sometimes it means you are doing (and configuring) things that you didn't do before. Depending on how well your practices fit Maven's, migrating to Maven might be easy or very hard.
In addition, using Ivy, you can set up your own private repository of "permitted" jar files to pull from, and that will make code auditing much easier. Basically, reconfigure ivy to not pull from the web, but to pull from your local repository only, and then control access to the repository to only allow jar files which were reviewed to have acceptable licensing.
Once you have software in place, you can afford to split projects up into smaller pieces. This will permit you to do the right thing (if your project favors small decomposition) instead of the expedient thing (a few big chunks which might not really buy you much in decomposition maintainability). As far as where to make the cuts, that depends heavily on the specifics of your application.
Many small pieces tend to be easier for a new person to digest one by one. They also get people thinking about where functionality should be added to a project; however, it does cost time and effort to untangle and separate all of the components. The plus side is that it is generally easier to test and validate something smaller; the downside is that it is a longer road to decompose one monolithic collection of responsibilities into many small, well-integrated yet functionally disparate units.
Good luck
A very large project will need some way of tracking all of the libraries and other dependencies that it uses. The de facto standard for doing this is Maven. It's definitely the best way to start keeping track of what is going into your application.
Then you can decide how to split your application up. Basically, what you're trying to do here is split your application into complete functional pieces. For instance, if you had a website with a contact form, a photo gallery, a shopping cart, and a forum, you would split the project into pieces that each contain one of those modules.
Actually, you will want to utilize both projects and namespacing.
Namespacing is an important tool for differentiating what purpose a code has at the code level. Regardless of what project a class comes from, the package should give me some idea of its purpose.
At a higher level, it is easier to manage builds and your development environment by having your code separated into projects. For instance, if you are developing a UI, why do you need to have the database code loaded into your IDE? It is just extra clutter in your workspace. Separate projects also make it simpler to share common functionality between different projects. This will of course lead to needing some form of dependency management, for which either of the mentioned tools, Maven or Ivy, will suffice.
An important note though. Do not use split packages between projects. This causes nightmares if you or anyone who will ever use your code wants to do so in an OSGi environment. So, your namespaces should be unique within a project, although they should share a common root with other related projects.
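As a rough illustration of the rule, the sketch below reports any package that is declared by more than one project. The project and package names are made up, and a real check would read the package lists from your build output rather than a hard-coded map.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class SplitPackageCheck {

    /** Maps each package that appears in more than one project to the projects containing it. */
    public static Map<String, Set<String>> findSplitPackages(Map<String, List<String>> packagesByProject) {
        Map<String, Set<String>> owners = new TreeMap<>();
        packagesByProject.forEach((project, packages) -> {
            for (String pkg : packages) {
                owners.computeIfAbsent(pkg, k -> new TreeSet<>()).add(project);
            }
        });
        // Keep only packages owned by two or more projects: those are split.
        owners.values().removeIf(projects -> projects.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Hypothetical projects sharing the common root com.example.shop.
        Map<String, List<String>> packagesByProject = Map.of(
                "shop-ui", List.of("com.example.shop.ui", "com.example.shop.util"),
                "shop-db", List.of("com.example.shop.db", "com.example.shop.util"));
        System.out.println(findSplitPackages(packagesByProject));
        // prints {com.example.shop.util=[shop-db, shop-ui]}
    }
}
```

Here com.example.shop.util is the offender: both projects share the root com.example.shop, which is fine, but neither should contribute classes to a package the other also declares.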
I have a project in Android Studio with two different app modules, both of which use the same Android library (core code). I want to change some resources and logic in one application, but this code is in core (the Android library). How do I approach this? How can I customize the core code?
I've been asking the same question, and after some research I've found that there are different approaches:
Gradle Flavors:
Why Product Flavors?
They address the issue of having separate project code for each version of the app while still having one project code.
Given a scenario where you have a free and a paid app you can limit features in the free and expose all the other features in the paid version of the app.
Given another scenario where you want to implement region-specific functions depending on the country, you can use product flavors for such a use case.
White labeling (these are apps that are developed by a certain company, and they are re-branded and resold by other companies).
Pros
They address the issue of having a separate project code base for each version of the app.
They keep the code tidy and make it much easier and faster to navigate through the code base, as everything related to a specific product flavor is kept in its corresponding folder.
Cons
(Scaling Up) The more variants, the greater the complexity which thereby makes it harder to maintain the codebase.
IDEs sometimes take time to build the project after switching between variants.
Source: https://levelup.gitconnected.com/simple-guide-to-android-product-flavors-674106455038
Multi-Module
Why?
Faster build times.
Fine-grained dependency control.
Improve reusability across other apps.
Improves the ownership & the quality of the codebase.
Stricter boundaries when compared to packages.
Pros
Scales well as the application grows with new features
Medium to large development teams are able to work on different modules without affecting each other (Merge Conflicts)
Encapsulates unit and ui tests to their specific features
Keeps resources separated between modules, which improves readability and organization
Keeps logic contained in their own modules, which can be hidden behind interfaces
Forces the developer to keep their code better organized and structured
Improved build speed, as changes in a module means only that module will need to be rebuilt
Cons
Adds additional boilerplate around the construction of the modules
More development time overall
Requires a Maven/Gradle file for each module
Navigation between module activities can be difficult to set up correctly
Requires a lot more pre-planning on how best to structure code, and determining where shared code bases are stored.
Limited amount of online resources showing best practices
Source:
https://medium.com/swlh/modularization-by-feature-and-layer-with-android-architecture-components-80bf317d737
https://codelift.dev/android-modular-app-architecture/
Also look at the Gradle documentation to get started with modularizing.
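One way to read the "hidden behind interfaces" point in terms of the original question: the core library defines what may be customized, and each app module supplies its own implementation. The sketch below is plain Java with hypothetical names (Branding, WelcomeScreen); it does not cover overriding Android resources, only logic.

```java
public class CoreCustomization {

    /** Would live in the shared core library: the points an app is allowed to customize. */
    public interface Branding {
        String appTitle();

        // Default behavior that an app may keep or override.
        default String welcomeMessage() {
            return "Welcome to " + appTitle();
        }
    }

    /** Core logic depends only on the interface, never on a concrete app. */
    public static final class WelcomeScreen {
        private final Branding branding;

        public WelcomeScreen(Branding branding) {
            this.branding = branding;
        }

        public String render() {
            return "[" + branding.appTitle() + "] " + branding.welcomeMessage();
        }
    }

    public static void main(String[] args) {
        // First app module: only sets the title, keeps the default message.
        Branding free = () -> "Free App";

        // Second app module: overrides the default message as well.
        Branding paid = new Branding() {
            @Override
            public String appTitle() {
                return "Paid App";
            }

            @Override
            public String welcomeMessage() {
                return "Thanks for upgrading!";
            }
        };

        System.out.println(new WelcomeScreen(free).render()); // prints [Free App] Welcome to Free App
        System.out.println(new WelcomeScreen(paid).render()); // prints [Paid App] Thanks for upgrading!
    }
}
```

The core module stays untouched; each app decides how much of the default behavior to replace.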
You can build the library's AAR and use it in your main projects by simply copying it into the lib folder in your project directory.
Or you can build it with a service like JitPack and add it to your project as an implementation dependency in Gradle.
Are there any libraries or frameworks out there that are designed to facilitate the building of projects from within another full fledged programming language?
It is very easy to specify logic, sets, and complicated rules in programming languages like C++, Java, etc., but it seems like an uphill battle to do these things in a Makefile. I haven't dug into Ant or Maven or any of the other build tools yet, but if I could just write all my build logic in a C++ program it would be much easier! (Assuming I had some helpful tools.)
Maybe you'd like to look at this project: http://www.scons.org/
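SCons build files are ordinary Python scripts, which is the kind of thing you are asking for. To show how little is needed for the core of it, here is a sketch in Java of the basic make rule: rebuild when the target is missing or older than any of its sources. The class and method names are made up, and timestamps are passed in as plain numbers to keep the example self-contained.

```java
import java.util.List;
import java.util.OptionalLong;

public class StaleCheck {

    /**
     * @param targetModified last-modified time of the target, or empty if it does not exist
     * @param sourceModified last-modified times of the sources the target is built from
     * @return true if the target must be (re)built
     */
    public static boolean needsRebuild(OptionalLong targetModified, List<Long> sourceModified) {
        if (!targetModified.isPresent()) {
            return true; // a missing target always has to be built
        }
        long target = targetModified.getAsLong();
        // Stale as soon as one source is newer than the target.
        return sourceModified.stream().anyMatch(source -> source > target);
    }

    public static void main(String[] args) {
        System.out.println(needsRebuild(OptionalLong.of(100), List.of(90L, 120L))); // prints true
        System.out.println(needsRebuild(OptionalLong.of(100), List.of(90L, 95L)));  // prints false
        System.out.println(needsRebuild(OptionalLong.empty(), List.of(90L)));       // prints true
    }
}
```

Everything beyond this rule (dependency scanning, parallel builds, caching) is what existing build tools already provide.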
But, what would you like to do besides specifying modules dependencies and build rules? I think putting too much logic into the build system is just creating another extra problem; try to stick to well known configuration tools and you'll have one thing less to worry about.
As for Maven and Ant, they are a little bit Java-centric, but they can be used for any kind of project and have loads of plugins to perform almost any task you could imagine. If you prefer a more Unix-oriented environment but want a higher-level layer on top of make, you can use the Autotools.
Dig into Ant and/or Maven. These can handle all the different tasks needed for creating complicated builds. This will save you reinventing the wheel by trying to replicate all those features in C++. I think the issue you are struggling with is that you need to spend the time to learn to use a new build tool.
I was just about to include the HtmlUnit library in a project. I unpacked the zip-file and realised that it had no less than 12 dependencies.
I've always been concerned when it comes to introducing dependencies. I suppose I have to ship all these dependencies together with the application (8.7 mb in this particular case). Should I bother checking for, say, security updates for these libraries? Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another library I need, depends on a different version of xalan?
The task HtmlUnit solves for me could be solved "manually" but that would probably not be as elegant.
Should I be concerned about this? What are the best practices in situations like these?
Edit: I'm interested in the general situation, not particularly involving HtmlUnit. I just used it here as an example as that was my current concern.
Handle your dependencies with care. They can bring you much speed, but can be a pain to maintain down the road. Here are my thoughts:
Use some software to maintain your dependencies. Maven is what I would use for Java to do this. Without it you will very soon lose track of your dependencies.
Remember that the various libraries have different licenses. It is not guaranteed that a given license works for your setting. I work for a software house and we cannot use GPL-based libraries in any of the software we ship, as the software we sell is closed source. Similarly, we should avoid LGPL as well if we can. (This is due to some intricate lawyer reasoning, don't ask me why.)
For unit testing I'd say go all out. It is not the end of the world if you have to rewrite your tests in the future. By then, that part of the software may be either extremely stable or no longer maintained at all. Losing those tests is not that big a deal, as you already got a huge gain in speed when you adopted the library.
Some libraries are harder to replace later than others. Some are like a marriage that should last the life of the software, but some other are just tools that are easily replaceable. (Think Spring versus an xml library)
Check how the community supports older versions of the library. Are they supporting older versions at all? What happens when life continues and you are stuck at a version? Is there an active community, or do you have the skill to maintain it yourself?
How long is your software supposed to last? Is it one year, five years, ten years or beyond? If the software has a short life span, you can use more libraries to get where you are going, as it is less important to be able to keep up with upgrading them.
It could be a serious issue if there isn't an active community which maintains the libraries in the long term. It is OK to use libraries, but to be honest you should take care to get the sources and put them into your VCS.
Should I bother checking for, say, security updates for these libraries?
In general, it is probably a good idea to do this. But then so should everyone upstream and downstream of you.
In your particular case, we are talking about test code. If potential security flaws in libraries used only in testing are significant, your downstream users are doing something strange ...
Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another library I need, depends on a different version of xalan?
Ah yes. Assuming that you are building your own classpaths, etc. by hand, you need to decide which version of the dependent libraries to use. It is usually safe to just pick the most recent of the versions used. But if the newer version is not backwards compatible with the older one (for your use case), then you've got a problem.
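If you do resolve this by hand, "pick the most recent" needs a numeric comparison, since comparing version strings as text would put 2.10 before 2.9. The following is a minimal sketch under that assumption; the version numbers are invented, and it only handles purely numeric dotted versions (no qualifiers such as "-beta").

```java
import java.util.List;

public class NewestVersion {

    /** Compares dotted numeric versions such as "2.7.1"; missing parts count as 0. */
    public static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns the highest of the versions requested by the different libraries. */
    public static String newest(List<String> requestedVersions) {
        return requestedVersions.stream()
                .max(NewestVersion::compareVersions)
                .orElseThrow();
    }

    public static void main(String[] args) {
        // Two libraries asking for different versions of the same dependency.
        System.out.println(newest(List.of("2.9.0", "2.10.1"))); // prints 2.10.1
    }
}
```

Whether the chosen version actually works with both libraries still has to be verified by running your tests; the comparison only picks a candidate.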
Should I be concerned about this?
IMO, for your particular example (where we are talking about test code), no.
What are the best practices in situations like these?
Use Maven! It explicitly exposes the dependencies to the folks who download your code, making it possible for them to deal with the issue. It also tells you when you've got dependency version conflicts and provides a simple "exclude" mechanism for dealing with them.
Maven also removes the need to create distributions. You publish just your artifacts with references to their dependencies. The Maven command then downloads the dependent artifacts from wherever they have been published.
EDIT
Obviously, if you are using HtmlUnit for production code (rather than just tests), then you need to pay more attention to security issues.
A similar thing has happened to me actually.
Two of my dependencies had the same 'transitive' dependency but a different version.
My favorite solution is to avoid "dependency creep" by not including too many dependencies. So, the simplest solution would be to remove the one I need less, or the one I could replace with a simple Util class, etc.
Too bad, it's not always that simple. In unfortunate cases where you actually need both libraries, it is possible to try to sync their versions, i.e. downgrade one of them so that dependency versions match.
In my particular case, I manually edited one of the jars, deleted the older dependency from it, and hoped it would still work with the new version loaded from the other jar. Luckily, it did (i.e. the maintainers of the transitive dependency hadn't removed any classes or methods that the library used).
Was it ugly - Yes (Yuck!), but it worked.
I try to avoid introducing frivolous dependencies, because it is possible to run into conflicts.
One interesting technique I have seen used to avoid conflicts: renaming a library's package (if its license allows you to -- most BSD-style licenses do.) My favorite example of this is what Sun did when they built Xerces into the JRE as the de-facto JAXP XML parser: they renamed the whole of Xerces from org.apache.xerces to com.sun.org.apache.xerces.internal. Is this technique drastic, time consuming, and hard to maintain? Yes. But it gets the job done, and I think it is an important possible alternative to keep in mind.
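The heart of the renaming is a simple prefix mapping over class names, sketched below with the Xerces example. This shows only the name mapping; actually relocating a library also means rewriting every reference in its source or bytecode, which is where the real effort goes.

```java
public class PackageRelocation {

    /** Moves a class name from one package root to another; other names are returned unchanged. */
    public static String relocate(String className, String fromPrefix, String toPrefix) {
        // Match the prefix only at a package boundary, so "org.apache.xercesfoo" is left alone.
        if (className.equals(fromPrefix) || className.startsWith(fromPrefix + ".")) {
            return toPrefix + className.substring(fromPrefix.length());
        }
        return className;
    }

    public static void main(String[] args) {
        String from = "org.apache.xerces";
        String to = "com.sun.org.apache.xerces.internal";
        System.out.println(relocate("org.apache.xerces.parsers.SAXParser", from, to));
        // prints com.sun.org.apache.xerces.internal.parsers.SAXParser
        System.out.println(relocate("org.apache.xalan.Version", from, to));
        // prints org.apache.xalan.Version
    }
}
```

Because the relocated copy lives under a different package root, it can sit on the same classpath as another version of the original library without clashing.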
Another possibility is -- license terms abided -- copying/renaming single classes or even single methods out of a library.
HtmlUnit can do a lot, though. If you are really using a large portion of its functionality on a lot of varied input data, it would be hard to make a case for spending the large amount of time it would take to re-write the functionality from scratch, or repackage it.
As for the security concerns -- you might weigh the chances of a widely used library having problems, vs. the likelihood of your hand-written less-tested code having some security flaw. Ultimately you are responsible for the security of your programs, though -- so do what you feel you must.
I've started work on a client-server application. At first I naturally created two projects in Eclipse, two source control repositories, etc. But I'm quickly seeing that there is a bit of shared code between the two that would probably benefit from sharing (in the same project or in a shared library) instead of copying.
In addition, I've been learning and trying test-driven development, and it seems to me that it would be easier to test against real client components rather than having to set up a huge amount of code just to mock something, when the code is probably mostly in the client. In this case it seems best to have the client and server together in one project, thinly separated by root packages (org.myapp.client.*, org.myapp.server.*, and maybe org.myapp.shared.* too).
My biggest concern in merging the client and server, however, is security: how do I ensure that the server pieces of the code do not reach a user's computer? When Eclipse bundles a JAR, I'd have to pick out the server-specific bits and hope I don't miss any, right?
So especially if you are writing client-server applications yourself (and especially in Java, though this can turn into a language-agnostic question if you'd like to share your experience with this in other languages), what sort of separation do you keep between your client and server code? Are they just in different packages/namespaces or completely different binaries using shared libraries, or something else entirely? How do you test the code together and yet ship separately?
A lot of this is going to depend on your specific implementation but I typically find that you have at least three assemblies (binaries) that are created with a project like this.
A common DLL that contains shared functionality used by both the client and the server
The DLL/EXE for the client
The DLL/EXE for the server
Using this approach you have your shared items, but you make sure that items that are server specific are never in a distribution that is sent to the client workstations.
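In Java terms, the same guarantee can be checked mechanically rather than by hoping nothing was missed. The sketch below lists the entries of a client JAR and reports anything under the server package from the question (org.myapp.server); the class name ClientJarCheck and the sample entry names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class ClientJarCheck {

    /** Returns the entries whose path starts with the forbidden prefix, e.g. "org/myapp/server/". */
    public static List<String> forbiddenEntries(List<String> entryNames, String forbiddenPrefix) {
        return entryNames.stream()
                .filter(name -> name.startsWith(forbiddenPrefix))
                .collect(Collectors.toList());
    }

    /** Reads the names of all entries in the given JAR file. */
    public static List<String> entryNames(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(entry -> entry.getName())
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // With a JAR path argument, inspect that JAR; otherwise use a made-up listing.
        List<String> names = args.length > 0
                ? entryNames(Path.of(args[0]))
                : List.of("org/myapp/client/Main.class",
                          "org/myapp/shared/Message.class",
                          "org/myapp/server/Secrets.class");
        List<String> leaked = forbiddenEntries(names, "org/myapp/server/");
        System.out.println(leaked.isEmpty()
                ? "OK: no server classes"
                : "Server classes found: " + leaked);
    }
}
```

Run as part of the build, a check like this fails loudly if server code ever ends up in the client distribution. With three separate binaries it should never trigger, which is the point of splitting them.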
Neither. It should be 3. (common, client and server) However, it doesn't necessarily need to be three "projects". Using Maven I create three sub-modules under a master project. You can do something similar using Ant.
I have found that at least one project per finished entity (server deployment, client binary, etc) works well with e.g. Hudson. Then you can have shared code in a basic project available to all.
In some projects, I have to deal with more than one programming language (for example, a Delphi GUI application which communicates with a C# or Java app). The Subversion repository currently contains three top branches, one per language.
Should I change this and group all parts of the project in the trunk like in the following example to make branching and tagging on project level easier?
project1
    branches
        ...
    tags
        ...
    trunk
        csharp_app
        delphi_app
        java_app
        ...
project2
    ...
If these separate sub-projects interact, then they need to move in lockstep, and you need to tag/branch/release the C#/Java/whatever components together. If they're unrelated, then I would advocate (perhaps) separate repositories, or separate directories within the same repository. But not branches or tags.
Branches are used to manage different development streams on the same codebase. Tags are used to indicate a particular point in the project's evolution.
I think the programming language is irrelevant. Ask yourself what your releasable is, and how you need to manage this. I've done this successfully in the past with projects incorporating Java and C++, and the language is not the issue - it's keeping the components in sync that you need to manage.
I wouldn't necessarily create a new top-level directory per language. What happens if your Java component suddenly requires a JNI layer? It strikes me that the implementation is reflected in the top-level directory structure, and that shouldn't really be a concern.
Programming language is irrelevant when you're managing a single project. A single module may be written in a variety of programming languages but still be too tightly coupled to be worth separating. If each module of the application (whether or not it's written in the same language) is independent enough to be considered a separate project (and consequently to be versioned independently), you may want to separate it. Otherwise, don't do that.
I don't think it's a good idea, because theoretically the different components could break compatibility, and if they don't stay synced up it would be hard to go back to the last known-good configuration.
Yes. The criterion is whether the apps require some form of synchronization of their features between each other, and that communication protocol (API, shared code, shared libs) may change over time. If the apps have nothing to do with each other, then separate repositories. Having apps written in multiple languages in the repository is irrelevant.