Which Cobertura artifacts should I check in to SVN?

I have Cobertura integrated with my project, and it works as expected. However, I am not sure which of the Cobertura artifacts to check-in to SVN.
The directory structure looks something like:
MainProjectDir
cobertura.ser
coberturaDir
cssDir
imagesDir
instrumentedDir
js
reports
LOTS OF html FILES
The coberturaDir takes up just over 1 MB, and checking that directory in seems troublesome for future commits.
My goal is to keep track of the total coverage for the project and for each class.
Of the cobertura artifacts, what should I be committing to SVN?
Thanks,
Sean

None of them.
You should be able to regenerate the Cobertura reports by pointing your build at an older revision in your version control system. Since the reports are derived from a particular version of the software, there's no need to store them. The same principle applies to generated documentation (javadoc, doxygen) and binary files produced from your source code (jars, exes, class files).
If you need history, I'd suggest saving the reports outside of version control, somewhere like a file server. You can then compress old report directories into ZIPs or tarballs so they remain available but are archived to reduce space and make the latest data easier to find. You can also take the measurements and metrics that are most important, put them into a single file such as a spreadsheet, and keep that on the file server.

Like Thomas Owens said: None of them.
Ah, you say. I want to be able to see the results and save them. I want to be able to link them back to the developers and see how my test coverage changes over time.
In that case, use a Continuous Integration system like Jenkins. Jenkins can examine your XML-based Cobertura coverage reports and display them as graphs. It saves these graphs with each build, and each build shows you who made the commit that triggered it and how coverage changed since the last build. You can even play a CI Game and award points to developers who create unit tests that expand your coverage. (First prize is a Cadillac Eldorado. Second prize is a set of steak knives, and third prize is you're fired.)
Jenkins is pretty simple to set up and get working. You'll need the Cobertura plugin, which is easy to install. It'll do what you want without your having to check in your Cobertura files.
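For reference, here is a minimal sketch of the Maven configuration that makes Cobertura emit the XML report Jenkins reads (the plugin version and the report pattern below are assumptions; adjust them to your build):

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>cobertura-maven-plugin</artifactId>
  <version>2.7</version>
  <configuration>
    <formats>
      <!-- XML is what the Jenkins Cobertura plugin parses; HTML is for humans -->
      <format>xml</format>
      <format>html</format>
    </formats>
  </configuration>
</plugin>

Run mvn cobertura:cobertura and point the Jenkins Cobertura plugin at a report pattern such as **/target/site/cobertura/coverage.xml.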

Related

Multiple Maven Projects, Single JaCoCo Site?

I am working for a company that is using multiple Maven projects/modules to create what will eventually become one product. To help me explain, imagine a file structure similar to below:
- Parent Directory
- Project_1
- /src/
- /target/
- POM.xml
- Project_2
- /src/
- /target/
- POM.xml
Along the way we are using JUnit to unit test our code, and it is an important contractual requirement that we achieve above a certain percentage threshold of code coverage with our tests.
We are using JaCoCo to generate coverage reports in the form of an HTML website. JaCoCo itself is proving invaluable, but one major problem we have is that each module's report is generated separately, under its own /target/site/jacoco/ directory.
I have done some investigating myself and found that, unless I am mistaken, JaCoCo by default does not support the ability to converge multiple Maven projects into a single JaCoCo report.
So my question is, can anybody suggest an alternative solution - something that will allow us to converge multiple reports onto a single web server?
One option we have is to move all sites into individual folders on a web server and then have an index page linking them together, but it's "clumsy" at best. For example:
- Web Server
- index.html
- Project_1
- (Generated report files)
- Project_2
- (Generated report files)
Any better suggestions would be greatly appreciated.
JaCoCo does not provide a simple way to do this as of today. However, they do specify three alternatives that are described here: https://github.com/jacoco/jacoco/wiki/MavenMultiModule
Their most suitable approach involves creating a separate reporter module that declares dependencies on all the other modules (referred to in the GitHub article as "Strategy: Module with Dependencies").
The reporter module uses the jacoco:report-aggregate Maven goal (http://www.eclemma.org/jacoco/trunk/doc/report-aggregate-mojo.html) to fetch all the individual reports and combine them into one.
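A rough sketch of what such a reporter module's POM might look like (the com.example group id and the project_1/project_2 artifact ids mirror the question and are placeholders; the linked wiki page is the authoritative reference):

<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>parent</artifactId>
    <version>1.0.0-SNAPSHOT</version>
  </parent>
  <artifactId>coverage-report</artifactId>

  <dependencies>
    <!-- one dependency per module whose coverage should be aggregated -->
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>project_1</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>project_2</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.jacoco</groupId>
        <artifactId>jacoco-maven-plugin</artifactId>
        <version>0.8.8</version>
        <executions>
          <execution>
            <id>report-aggregate</id>
            <phase>verify</phase>
            <goals>
              <goal>report-aggregate</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

The aggregated report then ends up under this module's target/site/jacoco-aggregate/ directory.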
An example project:
https://prismoskills.appspot.com/lessons/Maven/Chapter_06_-_Jacoco_report_aggregation.jsp
There are many different approaches you can go with.
First of all, you might want to consider something like Sonar: you compile all your modules and run a Sonar analysis that inspects coverage, among other things. Sonar uploads the results to the Sonar server (with the database and everything) so that you can see in the UI what went wrong.
Another approach is rolling your own Maven plugin (assuming you're using Maven). The report generated by JaCoCo is also available as XML, if I'm not mistaken, so it can be parsed pretty easily. One could write a Maven plugin that identifies all such reports, parses them, and provides a unified view.
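For orientation, the JaCoCo XML report has a fairly simple shape, so a custom plugin or script mostly just sums the missed/covered attributes per counter type (the content below is trimmed and the numbers are made up for illustration):

<report name="Project_1">
  <package name="com/example">
    <class name="com/example/Foo" sourcefilename="Foo.java">
      <counter type="LINE" missed="10" covered="90"/>
      <counter type="BRANCH" missed="2" covered="14"/>
    </class>
    <counter type="LINE" missed="10" covered="90"/>
  </package>
  <counter type="LINE" missed="10" covered="90"/>
</report>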
Yet another approach is to fail the whole build when coverage doesn't reach some threshold. I know it doesn't answer your question directly, but if you do this you more or less guarantee a minimum level of coverage (which can be raised from time to time at the project level).
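With the jacoco-maven-plugin, that threshold check is the check goal with a coverage rule; a minimal sketch (the 80% line-coverage minimum and the plugin version are just example values):

<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.8.8</version>
  <executions>
    <execution>
      <id>check-coverage</id>
      <goals>
        <goal>check</goal>
      </goals>
      <configuration>
        <rules>
          <rule>
            <element>BUNDLE</element>
            <limits>
              <limit>
                <counter>LINE</counter>
                <value>COVEREDRATIO</value>
                <minimum>0.80</minimum>
              </limit>
            </limits>
          </rule>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>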

Track test coverage history (trend)

Goal - track unit/functional/integration tests coverage history on some regular basis (weekly, every release, etc) for relatively long period (6 months).
The JUnit plugin in Jenkins does that out of the box for every build, but the problem is that it does not allow tracking specific slices (milestones, releases, etc.), and history is kept only for some fixed number of builds. So we are dependent on the Jenkins workspace folder content, which is not reliable.
Currently, we are capturing metrics from Jenkins and manually moving them into a table in Confluence, so that we can use the raw data to build graphs and trends. As you can imagine, this approach requires a lot of manual effort and does not scale when we need to track different test types or multiple projects.
Is there any existing tool that provides the capability to track history and show the trend?
Atlassian Clover can track historical coverage; however, keep in mind that you will still have to gather the history point files in some place:
https://confluence.atlassian.com/display/CLOVER/'Historical'+Report
https://confluence.atlassian.com/display/CLOVER/clover-historypoint

Want artifact traceability without giving up the SNAPSHOT qualifier

Background. My org uses Maven, Bamboo and Artifactory to support a continuous integration process. We rely on Maven's SNAPSHOT qualifier to help manage storage in Artifactory (rotate out old SNAPSHOT builds) and also to help keep cross-team integrations current (Maven checks for updates to SNAPSHOT dependencies automatically on each build).
Problem. One of the challenges we're having is around correctly promoting builds from environment to environment while continuing to use SNAPSHOT. Say that a tester deploys version 1.8.2-SNAPSHOT to a functional test environment, and it's at rev 1400 in Subversion. Let's say also that it passes functional test. By the time a tester decides to pull 1.8.2-SNAPSHOT from Artifactory into the performance testing environment, a developer could have committed a change to Subversion, so the actual binary in Artifactory is at a different rev. How do we ensure that the rev doesn't change out from under us when using SNAPSHOT builds?
Constraints. We obviously don't want to deploy different builds unknowingly. We also don't want to rebuild from source as we want to test the exact binary in performance test that we tested in functional test.
Approaches we've considered. The thought is that we want to stamp the versions with a fourth component, like 1.8.2.1400, where the fourth component is a Subversion rev. (As a side question, is there a Maven plugin or something else that does that automatically?) But if we do that, then essentially we lose the SNAPSHOT feature since Maven and Artifactory think that these are different versions.
We are using Scrum, so we deploy to the test environments very early (like day two or so). I don't think it makes sense to remove the SNAPSHOT qualifier that early in the dev cycle because we lose the SNAPSHOT benefits again.
Would appreciate knowing how other orgs solve this issue.
Just to circle back on this one, I wanted to share what we are doing.
Basically we deploy snapshot builds like 1.8.2-SNAPSHOT into the development environment. No other teams need to use these builds, so it is fine to leave -SNAPSHOT on them.
But any build that we deploy to a test environment (e.g., functional test, system test) or to production must include the revision, e.g., 1.8.2.1400. We call these "quads". The reason for insisting on quads in test is that we can attach issues (features, bug fixes, etc.) to specific revisions, so the testers know what to test. For production it's really just that we want to deploy exactly the same artifact that we tested, so that means we're deploying a quad.
Anyway hope that information is useful to somebody.
If you enable "uniqueVersion" for your snapshot builds, every snapshot deployed will have a unique ID. You can use that to ensure you are promoting the correct build across environments.
As a side note, you can use the buildnumber-maven-plugin to add Subversion build numbers to artifacts.
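A minimal sketch of that wiring (the plugin version is an assumption; how you then use ${buildNumber}, e.g. in the final name or the manifest, is up to you):

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>buildnumber-maven-plugin</artifactId>
  <version>1.4</version>
  <executions>
    <execution>
      <phase>validate</phase>
      <goals>
        <goal>create</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <!-- fail if the working copy has local modifications, so the number is meaningful -->
    <doCheck>true</doCheck>
    <doUpdate>false</doUpdate>
  </configuration>
</plugin>

After the create goal runs, ${buildNumber} holds the SCM revision (the SVN revision when the project's <scm> section points at Subversion).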
Rather than embed the build number or VCS revision in the artifact's version, we embed the CI build number in the META-INF/MANIFEST.MF file.
See, for instance, Using Hudson environment variables to identify your builds. Although the article targets Jenkins/Hudson, I believe it is trivial to port to Bamboo.
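A hedged sketch of what that can look like with the maven-jar-plugin; BUILD_NUMBER is the Jenkins/Hudson environment variable, and ciBuildNumber is just a property name chosen for this example (Bamboo exposes a similar variable, e.g. bamboo.buildNumber, that you would pass the same way):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <manifestEntries>
        <!-- supplied by the CI server, e.g. mvn -DciBuildNumber=${BUILD_NUMBER} package -->
        <Implementation-Build>${ciBuildNumber}</Implementation-Build>
      </manifestEntries>
    </archive>
  </configuration>
</plugin>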

CPD / PMD between projects?

I am rephrasing this question to make it a little more straightforward and easy to understand, hopefully.
I have roughly 30 components (internal) that go into a single web application. That means 30 different projects with their own separate POM. I use inheritance quite a bit in my POMs so one of the things they inherit is a PMD/CPD configuration to prevent code duplication.
Even though I have CPD/PMD running, it only detects duplicate code within the same project. I would like it to detect in any of my projects if there is code shared among the projects that can be refactored out. Moreover, I was looking for something that could (using the same concept/pattern) verify that no code is shared between other open source dependencies.
It would be CPD/PMD, except it would operate on the source jars. This task would consume a large amount of memory if you scan all projects and their dependencies for duplication. Right now, I would just like to apply that to internal projects. If it works, then it would be relatively easy/straightforward to scale that out.
Walter
I'm not sure I got everything but...
I'd create an aggregating module with all projects as dependencies, use the maven-dependency-plugin and its unpack-dependencies mojo to fetch all the dependencies' sources JARs (the mojo can take a classifier as a parameter) and unpack them (maybe into target/generated-sources/java; the build-helper-maven-plugin may help here), and finally run pmd:cpd on the whole source base.
This may need some tweaking, I didn't test this at all.
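Still, a rough sketch of that idea (untested; com.mycompany stands in for the group id of the internal projects declared as dependencies of the aggregating module):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>unpack-sources</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>unpack-dependencies</goal>
      </goals>
      <configuration>
        <!-- pull the -sources jars of the internal projects only -->
        <classifier>sources</classifier>
        <includeGroupIds>com.mycompany</includeGroupIds>
        <outputDirectory>${project.build.directory}/generated-sources/java</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>

Then add that directory as a source root (the build-helper-maven-plugin's add-source goal) and run mvn pmd:cpd on the aggregating module.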
It sounds like you want to find duplicate code anywhere in your 30 projects. I can't speak for PMD; I assume you'd point it at one giant project containing all the source files from the union of the projects. But yes, this would take a lot of RAM and CPU.
Another tool that does this is the Java CloneDR. CloneDR finds duplicate code whether it is exactly the same or close (e.g., a few edits apart), regardless of source code layout or intervening comments. It is pretty easy to set it up to process all the files in your set of projects.
Just run PMD's CPD as a stand-alone program. All it needs is a directory, and it will recurse. At least, it did for me. I moved all my source to one directory and ran the CPD GUI from the batch file distributed with PMD 4.2.5.
You can perhaps take a look at Sonar: its Sonar-CPD engine is much more scalable and can detect cross-project duplications.
You can try Lizard for Python.
It doesn't work on source jars, though.
"Code Duplicate Detector
lizard -Eduplicate {path to your code}"
https://pypi.org/project/lizard/
PMD's CPD provides more granularity, since it allows the user to specify the minimum number of tokens before a block of code is flagged as duplicate.
https://pmd.github.io/latest/pmd_userdocs_cpd.html#cli-options-reference
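In a Maven build, for example, that threshold is the minimumTokens parameter of the maven-pmd-plugin (100 is the usual default; lower it to flag smaller duplicates):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-pmd-plugin</artifactId>
  <configuration>
    <minimumTokens>100</minimumTokens>
  </configuration>
</plugin>

followed by mvn pmd:cpd (or pmd:cpd-check to fail the build when duplicates are found).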

Build management / Continuous Integration best practices

How does your team handle Builds?
We use Cruise Control, but (due to lack of knowledge) we are facing some problems:
- Code freeze in SVN
- Build management
Specifically, how do you make available a particular release when code is constantly being checked in?
Generally, can you discuss what best practices you use in release management?
I'm positively astonished that this isn't a duplicate, but I can't find another one.
Okay, here's the deal. These are two separate but related questions.
For build management, the essential point is that you should have an automatic, repeatable build that rebuilds the entire collection of software from scratch and goes all the way to your deliverable configuration. In other words, you should effectively build a release candidate every time. Many projects don't really do this, but I've seen it burn people (read: "been burned by it") too many times.
Continuous integration says that this build process should be repeated every time there is a significant change event to the code (like a check in) if at all possible. I've done several projects in which this turned into a build every night because the code was large enough that it took several hours to build, but the ideal is to set up your build process so that some automatic mechanism --- like an ant script or make file --- only rebuilds the pieces affected by a change.
You handle the issue of providing a specific release by preserving, in some fashion, the exact configuration of all affected artifacts for each build, so you can apply your repeatable build process to the exact configuration you had. (That's why it's called "configuration management.") The usual version control tools, like Git or Subversion, provide ways to identify and name configurations so they can be recovered; in SVN, for example, you might construct a tag for a particular build. You simply need to keep a little metadata around so you know which configuration you used.
You might want to read one of the "Pragmatic Version Control" books, and of course the stuff on CI and Cruise Control on Martin Fowler's site is essential.
Look at continuous integration: best practices, from Martin Fowler.
Well, I have managed to find a related thread I participated in a year ago. You might find it useful as well.
And here is how we do it.
[Edited]
We are using Cruise Control as our integration tool. We just deal with the trunk, which is the main Subversion repository in our case. We seldom pull out a new branch for new story cards, only when there is a chance of complex conflicts. Normally, we pull out a branch for a version release, create the build from that, and deliver it to our test team. Meanwhile we continue working in trunk and wait for feedback from the test team. Once everything is tested, we create a tag from the branch, which is logically immutable in our case, so we can release any version at any time to any client if needed. In case of bugs in the release we don't create the tag yet; we fix things in the branch. After everything is fixed and approved by the test team, we merge the changes back to trunk and create a new tag from the branch specific to that release.
So the idea is that our branches and tags don't really participate in continuous integration directly. Merging branch code back to the trunk automatically makes that code part of CI (continuous integration). We normally do just bug fixes for a specific release in branches, so they don't really participate in the CI process, I believe. Conversely, if we do start new story cards in a branch for some reason, we don't keep that branch apart for too long; we try to merge it back to trunk as soon as possible.
To be precise:
We create branches manually when we plan the next release
We create a branch for the release and fix bugs in that branch if needed
After everything is good, we make a tag from that branch, which is logically immutable
At last, we merge the branch back to trunk if it has fixes/modifications
Release Management goes well beyond continuous integration.
In your case, you should use Cruise Control to automatically make a tag, which allows developers to go on coding while your incremental build can take place.
If your build is incremental, that means you can trigger it every x minutes (and not on every commit, because if commits are too frequent and your build is too long, it may not have time to finish before the next build tries to start). The 'x' should be tailored to be longer than a compile/unit-test cycle.
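As a sketch, in Cruise Control's config.xml that period is the schedule interval, in seconds (the project name and Ant target below are placeholders):

<project name="my-project">
  <schedule interval="600">
    <!-- build at most every 10 minutes, and only when the modification set has detected a change -->
    <ant buildfile="build.xml" target="release-candidate"/>
  </schedule>
</project>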
Continuous integration should include the automatic launch of unit tests as well.
Beyond that, a full release management process will involve:
a series of deployments to homologation servers
a full cycle of homologation / UAT (User Acceptance Test)
non-regression tests
performance / stress tests
pre-production (and parallel run tests)
before finally releasing into production.
Again "release management" is much more complex than just "continuous integration" ;)
Long story short: Create a branch copied from trunk and checkout/build your release on that branch on the build server.
However, getting to that point in a completely automated fashion using cc.net is not an easy task. I could go into details about our build process if you like, but it's probably too fine-grained for this discussion.
I agree with Charlie about having an automatic, repeatable build from scratch. But we don't do everything for the "continuous" build, only for nightly, beta, weekly, or Omega (GA/RTM/Gold) release builds, simply because some things, like generating documentation, can take a long time, and for the continuous build you want to give developers rapid feedback on the result.
I totally agree with preserving the exact configuration, which is why branching a release or tagging is a must. If you have to maintain a release, i.e. you can't just release another copy of trunk, then a branch-on-release approach is the way to go, but you will need to get comfortable with merging.
You can use Team Foundation Server 2008 and Microsoft Visual Studio Team System to accomplish your source control, branching, and releases.
