We recently migrated from SVN, with most code in a single repo, to git, with most projects in their own repos (about 70 of them). We build about a dozen different apps from this Java source. The apps all run on *nix servers. We use Maven and Nexus to build. Many of us struggle to develop a feature when it touches more than one repo. Here are a few of the challenges:
The developer has to branch each repo separately - we use the same name for all branches for one feature to make tracking less difficult.
One must update the poms of all repos to point to the updated versions of each repo's artifact. If multiple people are working on the same branch, there can be a lot of merging of others' pom changes. When I commit a change to a repo, the artifact is renamed to "-SNAPSHOT", which means more pom updates.
Changes need to be pushed in the right order or our automated builds will fail, e.g., repo A depends on a change to repo B; if repo A is pushed before repo B is built and deployed, then repo A won't build.
The person reviewing the feature has to look at changes in multiple repos.
When the feature is merged from its branch to, say, master, one has to remember all the repos that were touched.
It looks like switching to a mostly monorepo approach might be best, though there are some drawbacks there:
Building the entire codebase with maven takes a looong time. (Why can't maven be more like make, only building things that have changed or whose dependencies have changed?)
Each push kicks off a big set of builds and many unit tests rather than just one repo's artifact build and test.
The developers who generally work in one or two repos prefer this new multi-repo world and will resist a change back.
I've looked into git submodules and subtrees, which don't seem to solve many of our issues (not sure about Google Repo). Some of us use tools like "mu" to help. It would be sweet if there were a toolkit that would help developers maintain versions in poms and track changes across repos.
Let me know if you have a set of procedures or tools you use to ease development in this kind of environment.
"with most projects in their own repos (about 70 of them)."
For me this is where the problems start. My vote goes for minimising this number significantly.
If you really don't want a single repo (1 repo gets my vote) then you could separate the code base into n*change_often repos with 1*change_rarely repo. Keeping the n small is important. This way you would avoid rebuilding the bits that change rarely.
Also, even with a single repo you don't need to reference everything by source; you can use binaries for base libraries. When a base library changes, the person making the change could also update all the references in one go so that all projects are up to date.
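The idea of rebuilding only what changed, and leaving the rarely changing bits alone, can be sketched as a walk over reverse dependencies. This is only an illustration: the module names and the dependency map below are hypothetical, and it is not how Maven itself decides what to build.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: given module -> direct dependencies, find what must be rebuilt after a change. */
class AffectedModules {

    static Set<String> affectedBy(Map<String, List<String>> dependsOn, Set<String> changed) {
        // TreeSet keeps the result sorted so the output is stable.
        Set<String> affected = new TreeSet<>(changed);
        Deque<String> queue = new ArrayDeque<>(changed);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
                // A module is affected if it directly depends on something already affected.
                if (entry.getValue().contains(current) && affected.add(entry.getKey())) {
                    queue.add(entry.getKey());
                }
            }
        }
        return affected;
    }

    public static void main(String[] args) {
        // Hypothetical modules: two apps sharing a rarely changing base library.
        Map<String, List<String>> dependsOn = Map.of(
                "base-lib", List.of(),
                "billing", List.of("base-lib"),
                "billing-app", List.of("billing"),
                "reports-app", List.of("base-lib"));
        System.out.println(affectedBy(dependsOn, Set.of("billing")));
        System.out.println(affectedBy(dependsOn, Set.of("base-lib")));
    }
}
```

A change to "billing" only touches "billing" and "billing-app", while a change to "base-lib" touches everything, which is why keeping the rarely changing code as a binary pays off. Inside a single multi-module Maven build, the reactor options -pl together with -amd (also make dependents) give a similar selection.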
I am facing a problem and am not sure what the best way to go is.
So I have two git repositories with Spring projects which have to share the same database. (Both are spring projects with hibernate).
One of them is the main project, which I will call MASTER and which should modify all Hibernate entities. The other, which I will call SLAVE, is a secondary project that only needs to read from the same database.
Here is a small illustration of what I have.
The issue appeared when I realized that I need to keep a duplicate of the entities in both MASTER and SLAVE.
I found two ways to go with this issue.
Using git submodules, where my entities would be an independent submodule.
Building a JAR from the entity classes and including it in both projects.
Neither solution meets my requirements, for these reasons:
The submodule solution is not good because whenever I commit anything from MASTER, I want SLAVE to track those changes. Please note that I have 3 git branches for both projects (master, staging and production), so each branch should have its own corresponding version of the entities.
The JAR solution will work, but I do not find it nice and solid, as I would have to build the JAR all the time and add a dependency to every project.
The development of these projects is done independently from each other.
Please, could you share your opinion on this issue?
I am fairly sure that I am not the only one trying to achieve this.
You should consider publishing your jar to a Maven repository for easier exchange between the projects. You could even host your own, like Sonatype Nexus. https://www.sonatype.com/nexus-repository-sonatype
Personally, I think that managing the versions can be very annoying when you have multiple projects, especially when you are testing something and you have to create a new jar and then publish it over and over again. However, your project will be rebuildable and you can control which project/module can use a newer version of your entity dependency.
I have a very well structured big project which is maintained in SVN in my team. We now want to move to Git, which will later be integrated with continuous integration servers like TeamCity or Jenkins (yet to decide).
When I looked at the SVN repo, I found that, since SVN allows you to create tags from anywhere, it has a lot of tags that each contain only a single project.
The codebase is so huge that I import all the code in one git repo. One approach would be to create a separate repo for each project, or to combine some of the projects into one repo (not sure).
What I want to achieve is to retain all the history and break up the repo so that it becomes easy to manage. (It will also be helpful in the future when integrating with Jenkins for automated builds on every commit.)
How can I ensure that I retain the history, keep all the existing tags, and move to git?
git-svn is the obvious solution to help with this migration but, in my opinion, this is better for a one-off migration from SVN to git.
If you want to keep both the SVN and git repositories alive and allow commits to both (keeping them both in sync), I would recommend SubGit.
When we moved from svn to git I found this stackoverflow post very helpful, specifically the second answer. I assume from your description that everything is under a single trunk. That does make it a little more complicated. I have not tried moving individual projects before, but you may be able to do it in one of two ways:
Migrate the entire svn repo, tags, branches, etc. to your local git repo (before you push to the remote). Then break up the projects by following the suggestions in (this) stackoverflow post. This should give you individual repos that you can then push to the remote repos.
You may be able to alter the steps when running the svn to git conversion and specify an individual project, but that seems dangerous and confusing since the tags/branches won't necessarily line up.
We are currently trying to pilot the transition from SVN to Git to increase production and collaboration within our team.
However, we are facing some issues with transitioning and with finding counterparts to what currently works for us. I've been reading up on Git and can't seem to find a specific answer.
Here are some issues:
Our project is composed of several subprojects each built as a project of its own. How do we manage these subprojects with Git? One of the main issues I've encountered is when switching branches, I have to individually switch branches among
I've read about Subproject support as mentioned in https://git.wiki.kernel.org/index.php/SubprojectSupport, but I've also read that this isn't supported by git-svn.
We have multiple SVN branches currently, each representing a release. Most of us have all relevant branches (usually 2-3) checked out in our workspace. Switching branches might be okay if it's fast, but another problem is the configuration of our build paths & etc (considering we don't have any dependency management system in place at our level of development and all are done manually). Is there a way to go around this in Git, either by allowing multiple branches active in a workspace, or through rapid switching?
I'm not sure if there will be any specific correct answer, but pointing me to relevant resources will be helpful as well. Thank you.
Your question is rather broad (or possibly contains multiple questions), but I'll try a general answer:
Our project is composed of several subprojects each built as a project of its own. How do we manage these subprojects with Git?
Usually you would put these into one Git repo, each in its own subdirectory. You can use multiple repositories, but that only makes sense if the projects are versioned, branched and released independently. Branches are always per repository in Git (unlike in SVN, where you branch a single directory), so the rule of thumb is: what is branched together shares one repo, what is branched separately gets its own repo.
We have multiple SVN branches currently, each representing a release. Most of us have all relevant branches (usually 2-3) checked out in our workspace. Switching branches might be okay if it's fast, but another problem is the configuration of our build paths & etc (considering we don't have any dependency management system in place at our level of development and all are done manually). Is there a way to go around this in Git, either by allowing multiple branches active in a workspace, or through rapid switching?
You cannot have multiple branches checked out in one working directory (how would that even work?). You can make multiple clones (each with its own working directory), then check out different branches. However, I'm not sure that is the best solution for you.
Switching branches in git is very fast - essentially just the time for the filesystem I/O required to change the files that need changing.
About the build paths: If you switch branches in git, the paths do not change, because the switch happens inside the working directory.
A final note: It looks like you should really look into some kind of dependency management and artifact management. Doing all this with source code only is rather error-prone and difficult.
Background. My org uses Maven, Bamboo and Artifactory to support a continuous integration process. We rely on Maven's SNAPSHOT qualifier to help manage storage in Artifactory (rotate out old SNAPSHOT builds) and also to help keep cross-team integrations current (Maven checks for updates to SNAPSHOT dependencies automatically on each build).
Problem. One of the challenges we're having is around correctly promoting builds from environment to environment while continuing to use SNAPSHOT. Say that a tester deploys version 1.8.2-SNAPSHOT to a functional test environment, and it's at rev 1400 in Subversion. Let's say also that it passes functional test. By the time a tester decides to pull 1.8.2-SNAPSHOT from Artifactory into the performance testing environment, a developer could have committed a change to Subversion, so the actual binary in Artifactory is at a different rev. How do we ensure that the rev doesn't change out from under us when using SNAPSHOT builds?
Constraints. We obviously don't want to deploy different builds unknowingly. We also don't want to rebuild from source as we want to test the exact binary in performance test that we tested in functional test.
Approaches we've considered. The thought is that we want to stamp the versions with a fourth component, like 1.8.2.1400, where the fourth component is a Subversion rev. (As a side question, is there a Maven plugin or something else that does that automatically?) But if we do that, then essentially we lose the SNAPSHOT feature since Maven and Artifactory think that these are different versions.
We are using Scrum, so we deploy to the test environments very early (like day two or so). I don't think it makes sense to remove the SNAPSHOT qualifier that early in the dev cycle because we lose the SNAPSHOT benefits again.
Would appreciate knowing how other orgs solve this issue.
Just to circle back on this one, I wanted to share what we are doing.
Basically we deploy snapshot builds like 1.8.2-SNAPSHOT into the development environment. No other teams need to use these builds, so it is fine to leave -SNAPSHOT on them.
But any build that we deploy to a test environment (e.g. functional test, system test) or else production must include the revision; e.g., 1.8.2.1400. We call these "quads". The reason for insisting upon quads in test is that we can attach issues (features, bugfixes, etc.) to specific revisions so the testers know what to test. For production it's really just because we want to deploy exactly the same artifact that we tested, so that means we're deploying a quad.
Anyway hope that information is useful to somebody.
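The "quad" naming described above can be sketched as a small helper. How the team actually stamps its artifacts is not stated, so the class and method names here are hypothetical and the stripping of the -SNAPSHOT suffix is an assumption about the intended result.

```java
/** Sketch: derive a four-part "quad" version such as 1.8.2.1400 from a base version and a VCS revision. */
class QuadVersion {

    private static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";

    static String of(String baseVersion, long revision) {
        if (revision < 0) {
            throw new IllegalArgumentException("revision must not be negative: " + revision);
        }
        // Assumption: a quad replaces the -SNAPSHOT qualifier with the revision number.
        String release = baseVersion.endsWith(SNAPSHOT_SUFFIX)
                ? baseVersion.substring(0, baseVersion.length() - SNAPSHOT_SUFFIX.length())
                : baseVersion;
        return release + "." + revision;
    }

    public static void main(String[] args) {
        System.out.println(of("1.8.2-SNAPSHOT", 1400)); // prints 1.8.2.1400
    }
}
```

Because the revision is part of the version, two builds from different revisions can never be mistaken for each other in the artifact repository.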
If you enable "uniqueVersion" for your snapshot builds, every snapshot deployed will have a unique id. You can use that to ensure you are deploying the correctly promoted builds across environments.
And, as a side note, you can use the buildnumber-maven-plugin to add Subversion build numbers to artifacts.
Rather than embed the build number or VCS revision in the artifact's version, we embed the CI build number in the META-INF/MANIFEST.MF file.
See for instance "Using Hudson environment variables to identify your builds". Although the article is written for Jenkins/Hudson, I believe it is trivial to port to Bamboo.
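Reading such a build number back at runtime could look roughly like this. It is a sketch only: the attribute name "Build-Number" is an assumption, not something the answer specifies, and the manifest here is parsed from an in-memory string rather than from a real jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

/** Sketch: look up a CI build number stored as a main attribute of MANIFEST.MF. */
class BuildInfo {

    // Hypothetical attribute name; use whatever key your CI job actually writes.
    static final String BUILD_NUMBER_ATTRIBUTE = "Build-Number";

    static String buildNumber(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue(BUILD_NUMBER_ATTRIBUTE);
        return value != null ? value : "unknown";
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the META-INF/MANIFEST.MF content of a deployed artifact.
        String text = "Manifest-Version: 1.0\r\nBuild-Number: 1400\r\n\r\n";
        Manifest manifest = new Manifest(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println(buildNumber(manifest));
    }
}
```

With this approach the artifact keeps its plain version, and the deployed binary can still be traced back to the CI build that produced it.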
I'm currently working on a project that contains many different Eclipse projects referencing each other to make up one large project. Is there a point where a developer should ask themselves if they should rethink the way their development project is structured?
NOTE: My project currently contains 25+ different Eclipse projects.
My general rule of thumb is to create a new project for every reusable component. So, for example, if I have some isolated functionality that can be packaged, say, as a jar, I would create a new project so I can build, package and distribute the component independently.
Also, if there are certain projects that you do not need to make frequent changes to, you can build them only when required and keep them "closed" in eclipse to save time on indexing, etc. Even if you think that a certain component is not reusable, as long as it is separated from the rest of the code base in terms of logic/concerns you may be well served by just separating it out. Sometimes seemingly specific code might be reusable in another project or in a future version of the same project.
When compiled, a project would typically result in a jar. So if your application consists of potentially reusable components, it is ok to use a project for each.
I'm a big fan of using a lot of projects, I feel that this "breaks down" large things beyond what I can do with packages, and helps me orient and navigate.
Of course, if you're developing Eclipse plug-ins, everything would be a project anyway.
The only thing I would watch out for has to do with your source control and its ability to handle moves of files between projects. Subclipse had been giving me trouble with it, or maybe it's my SVN server that did.
If your project has that many sub-projects, or modules, needed to actually compose your final artifact, then it is time to look at something like Maven and at setting up a multi-module project. It will allow you to work on each module independently without IDE worries, and it allows easy setup in your IDE (and others' IDEs) through the mvn eclipse:eclipse goal. In addition, when building your entire top-level project, Maven will be able to derive, from the list of dependencies you have described, which modules need to be built and in what order.
Here's a quick link via google and a link to the book Maven: The Definitive Guide, which will explain things in much better detail in chapter 6 (once you have the basics).
This will also force your project to not be explicitly tied to Eclipse. Being able to build independent from an ide means that any Joe Schmoe can come along and easily work with your code base using whatever tools he/she needs.
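Deriving a build order from declared dependencies, as described above, is essentially a topological sort. The sketch below illustrates the idea with hypothetical module names; it is not Maven's actual reactor code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Sketch: order modules so that every module comes after the modules it depends on. */
class BuildOrder {

    static List<String> order(Map<String, List<String>> dependsOn) {
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        // Iterate in sorted key order so the result is deterministic.
        for (String module : new TreeMap<>(dependsOn).keySet()) {
            visit(module, dependsOn, done, inProgress, ordered);
        }
        return ordered;
    }

    private static void visit(String module, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress, List<String> ordered) {
        if (done.contains(module)) {
            return;
        }
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Dependency cycle involving " + module);
        }
        // Build all dependencies first, then the module itself.
        for (String dependency : dependsOn.getOrDefault(module, List.of())) {
            visit(dependency, dependsOn, done, inProgress, ordered);
        }
        inProgress.remove(module);
        done.add(module);
        ordered.add(module);
    }

    public static void main(String[] args) {
        // Hypothetical multi-module project: app needs core and web, web needs core.
        Map<String, List<String>> dependsOn = Map.of(
                "app", List.of("core", "web"),
                "web", List.of("core"),
                "core", List.of());
        System.out.println(order(dependsOn)); // prints [core, web, app]
    }
}
```

The same ordering is what you want when the modules live in separate repos and have to be built or pushed one after another by hand.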
Create jars for the projects you don't work in often. That should greatly reduce the clutter. If you work on all the projects often, then you can add targets to your build that will jar up the respective projects for you, which condenses everything down to one file that you can then include on the class path.
An additional method is to create many different workspaces. The benefit of separate workspaces is that you can remove some of the visual clutter and performance overhead of having lots of projects. You can use targets to jar up all of your projects and put them in a repository so you can reference them in each workspace.
At a former job the entire application was more than 170 projects. While it was rarely necessary to have all projects checked out locally, even the 30-40 projects constantly in our scope made reindexing, etc. very slow.
Yeesh. One Project for each Project. If you are using reusable projects, make them into a library for heaven's sake. Break the non-reusable projects into packages, that's what they are there for.
That's a hard question, and answers range from having just one Eclipse project to having one Eclipse project for every single class.
My bottom line:
You can have too few projects, and never too many (of course use automation, e.g. mvn eclipse:eclipse).
Use -Declipse.useProjectReferences=true/false when using Maven to switch the workspace mode between jar and project dependencies.
Use the mvn release plugin to generate consecutive releases (automatic version increase).
Multiple projects give you independent versioning, which is extremely important. E.g. one dev may work on a new version of a module while you still depend on the previous one, and at some point you decide to upgrade to the newer version (possibly by increasing its version in the pom.xml dependency section). Or, in another scenario, if one project contains a bug, you downgrade to its previous version.
Multiple projects make you think about the architecture more than if you have just packages.
Multiple projects generally make architectural problems more evident than if you have just one project. Would anyone like to comment on this?
You never know whether your project will evolve into OSGI/SOA/EDA, where you need separation.
Even if you're 100% sure that your projects will be deployed as one jar, the old way, in a single JVM, it still does not hurt (mvn assembly plugin) to have multiple Eclipse projects for logically independent pieces of code.
BTW, the project I work on is divided into 24 eclipse projects.
Hell, we have more than 100. Projects don't cost anything.