I am creating a system that will compile first- and second-year Java programs. At the moment I have it compiling single Java files.
As I started trying to get the system to compile projects with multiple classes, it occurred to me that, being first- and second-year students, they are not going to hand up their projects all in the same format.
I spent all of yesterday researching this but could not find out much about things like:
What are the main differences between NetBeans and Eclipse projects when compiling
How to compile projects in jar files
Just the different formats in general
So my question is, is there a compiler out there that compiles all the different formats, or do you have to set up the different formats to a certain way to compile them?
Any examples of this as well?
Make it a requirement to use Maven to build (yes, it has its faults, but at least you'll get consistency).
What I understand is you want something that can compile all types of Java projects (NetBeans, Eclipse, etc.)
Sorry to say this but there isn't one that can compile all the formats out there. But you could write your own, for at least the most common types of formats that you receive from the students.
Check out this page for more information: Building Java Projects.
What I suggest is, start by studying the build architecture used by those tools (NetBeans, Eclipse, etc.) and come up with a build script of your own that can extract the class paths of all the classes in the java project. Let your script do the work for you!
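As a starting point for such a script, the project type can usually be guessed from marker files in the project root. The following is only a minimal sketch: the class and enum names are hypothetical, and the source directories are the tools' conventional defaults (an Eclipse project can declare other source folders in its .classpath file), so treat the result as a first guess rather than a guarantee.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

/** Hypothetical helper: guesses which tool produced a submitted project. */
final class ProjectLayoutDetector {

    enum Layout { MAVEN, NETBEANS_ANT, ECLIPSE, PLAIN_ANT, UNKNOWN }

    /** Classifies a project by the names of the entries in its top-level directory. */
    static Layout classify(Set<String> rootEntries) {
        if (rootEntries.contains("pom.xml")) return Layout.MAVEN;
        // NetBeans Ant-based projects keep their metadata in an nbproject/ directory.
        if (rootEntries.contains("nbproject")) return Layout.NETBEANS_ANT;
        // Eclipse Java projects carry a .classpath file next to .project.
        if (rootEntries.contains(".classpath")) return Layout.ECLIPSE;
        if (rootEntries.contains("build.xml")) return Layout.PLAIN_ANT;
        return Layout.UNKNOWN;
    }

    /** Conventional default source directory, relative to the project root (an assumption). */
    static String defaultSourceDir(Layout layout) {
        switch (layout) {
            case MAVEN:
                return "src/main/java";
            case NETBEANS_ANT:
            case ECLIPSE:
                return "src";
            default:
                return "."; // no convention known: search from the root
        }
    }

    /** Lists the project root on disk and classifies it. */
    static Layout detect(Path projectRoot) throws IOException {
        Set<String> names = new TreeSet<>();
        try (Stream<Path> entries = Files.list(projectRoot)) {
            entries.forEach(p -> names.add(p.getFileName().toString()));
        }
        return classify(names);
    }
}
```

The order of the checks matters: a NetBeans Ant project also contains a build.xml, so the more specific markers are tested first.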
If you don't wish to write your own scripts, then you may consider changing the projects you receive into a standard project format. Check out this and this link to see more about migrating from Ant to Maven or Maven to Ant.
Else, you can always manually port your existing projects into other IDE, provided they follow the same build mechanism. Check out this answer to know more.
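Whatever the project format, once the source root is known, all classes can be compiled in one pass with the JDK's own javax.tools API instead of an IDE. This is a rough sketch under some assumptions: the class name MultiFileCompiler is made up, the code must run on a JDK (on a plain JRE no system compiler is available), and any library JARs the students depend on have to be passed in by the caller.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

/** Hypothetical helper: compiles every .java file under a source root. */
final class MultiFileCompiler {

    /** Builds javac options: the output directory plus an optional class path. */
    static List<String> options(String outputDir, List<String> classPathEntries) {
        List<String> opts = new ArrayList<>(Arrays.asList("-d", outputDir));
        if (!classPathEntries.isEmpty()) {
            opts.add("-classpath");
            opts.add(String.join(File.pathSeparator, classPathEntries));
        }
        return opts;
    }

    /** Recursively collects every regular file ending in .java under the source root. */
    static List<File> findSources(Path sourceRoot) throws IOException {
        try (Stream<Path> paths = Files.walk(sourceRoot)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.toString().endsWith(".java"))
                        .map(Path::toFile)
                        .collect(Collectors.toList());
        }
    }

    /** Returns true if all sources compiled; diagnostics go to System.err. */
    static boolean compile(Path sourceRoot, Path outputDir, List<String> classPath)
            throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; run this on a JDK");
        }
        List<File> sources = findSources(sourceRoot);
        if (sources.isEmpty()) {
            return false; // nothing to compile
        }
        Files.createDirectories(outputDir); // the -d target must exist
        try (StandardJavaFileManager fm = compiler.getStandardFileManager(null, null, null)) {
            Iterable<? extends JavaFileObject> units = fm.getJavaFileObjectsFromFiles(sources);
            return compiler.getTask(null, fm, null,
                    options(outputDir.toString(), classPath), null, units).call();
        }
    }
}
```

Passing all files to a single compilation task lets the compiler resolve references between the students' classes itself, so the order of the files does not matter.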
We require the software version number of a Maven project both in the Java code and in the NSIS installer script. Following the DRY principle, the version number should be stored in the Maven POM only. What is the best way to get this version number in the Java code as well as in the NSIS script? Updates to the version number should of course be propagated without the developer having to care about it.
The current approach: Wherever the version number is needed, ${"versionNr"} is inserted as a substitute. Then, during the maven build phase, all java and NSIS source files are filtered and the key is replaced by the version number. To avoid changes in the checked in source code, the filtered files are actually copied to a different location not within the scm. Having the original source and the source filtered by maven causes a lot of confusion, which I would like to avoid.
Any hints?
I typically put the version parameter (like ${project.version}) in a properties-file and only apply filtering on that one file in the maven build. Like
app.version=${project.version}
Then I use this properties file in the code to get the version.
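Reading the filtered file could look roughly like this. It is only a sketch: the class name AppVersion and the "unknown" fallback are my own choices, and only the app.version key comes from the example above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper: reads the version from a Maven-filtered properties file. */
final class AppVersion {

    static final String KEY = "app.version";

    /** Returns the version, or "unknown" if the key is missing or was never filtered. */
    static String fromProperties(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String version = props.getProperty(KEY, "unknown");
        // A leftover ${...} placeholder means Maven resource filtering did not run.
        return version.startsWith("${") ? "unknown" : version;
    }
}
```

In the application the Reader would typically wrap a classpath resource obtained with getResourceAsStream, using whatever file name the properties file was given in src/main/resources.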
pom.properties gets built into JAR file (to META-INF/maven/<groupId>/<artifactId>/pom.properties) when the project is packaged up. It looks something like:
#Generated by Maven
#Mon Sep 26 09:03:19 EST 2011
version=1.0-SNAPSHOT
groupId=my.project.group.id
artifactId=my-artifactid
You could read this as a resource in your Java code, and use the java.util.Properties API to read the version out.
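A minimal sketch of that lookup, assuming the groupId and artifactId are known at compile time (the class name PomVersion is made up). Note that the file only exists once the project has been packaged, so the lookup comes back empty when the classes are run straight from an IDE's output folder.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper: reads the version Maven wrote into pom.properties. */
final class PomVersion {

    /** Location of pom.properties inside the packaged JAR. */
    static String resourcePath(String groupId, String artifactId) {
        return "META-INF/maven/" + groupId + "/" + artifactId + "/pom.properties";
    }

    /** Returns the version, or empty if the resource is absent or unreadable. */
    static Optional<String> read(ClassLoader loader, String groupId, String artifactId) {
        try (InputStream in = loader.getResourceAsStream(resourcePath(groupId, artifactId))) {
            if (in == null) {
                return Optional.empty(); // not running from a packaged JAR
            }
            Properties props = new Properties();
            props.load(in);
            return Optional.ofNullable(props.getProperty("version"));
        } catch (IOException e) {
            return Optional.empty();
        }
    }
}
```

With the example file above, a call such as PomVersion.read(loader, "my.project.group.id", "my-artifactid") would be expected to yield 1.0-SNAPSHOT when the JAR is on the class path.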
Not sure whether NSIS scripts can read property files, but according to the source code of the NSIS plugin it creates a few !defines, including PROJECT_VERSION which gets the project version straight from the POM. Maybe you can use this.
I am using Eclipse 3.4.2 to develop my code. As part of my project definition I reference a utility library to which I have attached the source code. So far, so good - I can see that source when I bring up classes from the library and while I am debugging.
Now however I would like to make a change to one of the classes while still retaining all the features of the Eclipse Java editor (specifically things like tool tips and quick fix). These features seem to work when I'm viewing the source (I can CTRL+LClick through method names for instance), but it is read-only. On the other hand I can explicitly open the source file which will allow me to edit it, but I lose all of the "smart" editing features.
I've recently switched to Eclipse from IntelliJ where this was possible so I'm hoping it is in Eclipse as well. Note that although I could simply include the code as a project in my workspace, I'd really rather not. The workspace is already quite large and I don't want to further slow Eclipse down by adding projects I rarely would ever touch.
I am not sure I get your question right. When you add a precompiled library (the JAR) to your project's build path and attach source to this JAR, Eclipse will show you the source code when you click on a .class inside the JAR. The same goes for the debugger, which will also allow you to step through the code lines in the source, if the classes in the JAR were compiled with line number information.
Now what you seem to want to do is modify the classes inside the JAR (the source view is just an overlay which can even be off, if you attach a different version of the source), which is not possible, because they are wrapped up in binary form in the JAR archive - even though Eclipse is smart enough to display them individually.
I guess you would expect your changes to be hot-swapped into the running program by the debugger. This can only be done through a recompile once you finished your changes. Usually Eclipse does that automatically when you save a Java source file. As your source file is however not part of the workspace (or an external folder explicitly declared as Java source) - it will not do that recompile and swap.
I'd recommend including the source of your external library as a project in Eclipse and not worrying about performance too much - I work with 3.4.2 every day and my workspace has about 45 open projects with tens of thousands of classes and millions of lines of code. I assign a gigabyte of RAM to the Eclipse VM and have no problems with that on a Core2Duo 2.6GHz machine.
I'm soon going to check in the very first commit of a new Java project. I work with Eclipse Ganymede and a bunch of plug ins are making things a little bit easier.
Previously I've been part of projects where the entire Eclipse project was checked in. It's quite convenient to get the project settings after a check out. However this approach still was not problem free:
I strongly suspect that some Eclipse configuration files would change without user interaction (from when I used Eclipse Europa), making them appear as changed (as they were changed, but not interactively) when it's time to do a commit.
There are settings unique to each development machine as well as settings global for all developers on a project. Keeping these apart was hard.
Sometimes, if one developer's Eclipse version differed from the others', Eclipse would get angry and mess up the project configuration. In other cases it would upgrade the format of the configuration files, and if those were committed, the configuration was messed up for everyone else.
For this specific project I have another reason not to commit the project files:
There might be developers who prefer NetBeans which will join the project later. However they won't join within the coming months.
How do you organize this? What do you check into versioning control and what do you keep outside? What do you consider best practice in this kind of situation?
At a minimum you should check in the .project and .classpath files. If anybody on your team is hard-coding an external JAR location in the .classpath you should put them up against the wall and shoot them. I use Maven to manage my dependencies, but if you are not using Maven you should create user libraries for your external JARs with a consistent naming convention.
After that you need to consider things on a plug-in by plug-in basis. For example, I work with Spring so I always check in the .springBeans file, and likewise for CheckStyle I always check in the .checkstyle file.
It gets a bit trickier when it comes to the configuration in the .settings folder, but I generally check in the following if I change the default settings for my project and want them shared with the rest of the team:
.settings/org.eclipse.jdt.ui.prefs - it contains the settings for the import ordering
.settings/org.eclipse.jdt.core.prefs - it contains the settings for the compiler version
In general I haven't noticed Ganymede modifying files without me modifying the project preferences.
I recommend using Maven so that the entire life cycle is outside of any IDE. You can easily create an Eclipse project with it on the command line, and you can use whatever you want if it's not Eclipse. It has its quirks but takes out a lot of bitterness when it comes to dependencies and build management.
In our world, we check in the entire Eclipse project and the entire parallel but separate Netbeans project. Our motivations for this were entirely focused on "when I do a checkout, I want a functional configuration immediately afterward." This means that we had to do some work:
Create runnable configurations for each primary IDE (people like what they like). This includes main class, working directory, VM parameters, etc.
Create useful start up scripts for all of our relevant scenarios.
Create edited datasets that don't cause the checkout to take too much longer (it's a big project).
This philosophy was worth cash money (or at least labor hours which are almost more valuable) when our new hire was able to check out the project from Subversion into Eclipse and immediately run a functional system with a (small) real data set without any fuss or bother on his part.
Follow up: this philosophy of "make the new guy's life easier" paid off again when he changed IDEs (he decided to try Netbeans after using Eclipse for quite a long time and decided to stick with it for a while). No configuration was required at all, he just opened the Netbeans project in the same directory that Eclipse had been pointing to. Elapsed switchover time: approximately 60 seconds.
I only ever check in things that are done by humans. Anything that is generated (whether automatically or not) should be easy to regenerate and is liable to change (as you've stated). The only exception to this is when the generated files are hard to get right (that is, they require a lot of human intervention ;) ). However, things like this should really be automated somehow.
Try to port your project to a build system like Maven. It has everything you need to get the same experience of the project on every machine you use.
There are plugins for just about everything, like the Eclipse plugin. You just type "mvn eclipse:eclipse" and the plugin generates your entire ready-to-work Eclipse project.
To answer your question: never check in files that are not being used by your project at any time in the development cycle. That means that metadata files like Eclipse properties etc. should never be checked into an SCM.
I like checking in the .project, .classpath, and similar files only if they will be identical on any Eclipse user's machine anyway. (People using other IDEs should be able to check out and build your project regardless, but that issue is orthogonal to whether or not to check in Eclipse-only files.)
If different users working on the project will want to make changes or tweaks to their .project or .classpath or other files, I recommend that you do not check them into source control. It will only cause headaches in the long run.
I use IntelliJ, which has XML project files. I don't check those in, because they change frequently and are easy to recreate if I need to.
I don't check in JAR files. I keep those in a separate repository, a la Maven 2.
I don't check in WARs or JARs or javadocs or anything else that can be generated.
I do check in SQL and scripts and Java source and XML config.
I'd suggest having the actual project files ignored by the version control system due to the downsides you mentioned.
If there is enough consistent information in the project settings that there would be benefit from having it accessible, copy it to a location that Eclipse doesn't treat as special, and you'll have it available to work with on checkout (and copy back to where Eclipse will pay attention to it). There is a decent chance that keeping the actual project files separate from the controlled ones will result in loss of synch, so I'd only suggest this if there is clear benefit from having the settings available (or you're confident that you'll be able to keep them synchronised)
In our case, we used to check in the project files (.project and .classpath) to make it easy for all developers to create their project workspace. A common preferences file and team project set were located in source control as well, so creating your workspace was as simple as import preferences and import team project set. This worked very well, but does rely on everyone having a consistent environment, any customizations would have to be applied after the basic workspace is created.
We still do this for the most part, but Maven is now used so of course dependency management is handled via Maven instead. To avoid conflicting information, the .project and .classpath were removed from source control and are now generated via maven goals before we import the team project set. This would easily allow for different environments, as you would simply need scripts to generate the IDE specific portions based on the Maven configuration.
PS-For ease of maintenance though, I prefer having everyone use the same environment. Anything else inevitably becomes a full time maintenance job for someone.
Netbeans 6.5 has an improved Eclipse project import which is supposed to sync changes from Netbeans back to Eclipse: http://wiki.netbeans.org/NewAndNoteWorthyNB65#section-NewAndNoteWorthyNB65-EclipseProjectImportAndSynchronization
Don't. Only check in the source code of your projects.
As a response to:
"There are settings unique to each development machine as well as settings global for all developers on a project. Keeping these apart was hard."
Eclipse offers a number of ways to keep local settings manageable: Java Classpath Variables (Java > Build Path > Classpath Variables) are one, 'Linked Resources' (General > Workspace > Linked Resources) are another http://help.eclipse.org/stable/index.jsp?topic=/org.eclipse.platform.doc.user/concepts/concepts-13.htm Creating a README that states which settings to set before building/running the project works pretty well in my opinion.
Now, how to make sure your continuous build system understands the changes that were made to the Eclipse settings, that's another issue... (I have a separate build.xml for Ant that I keep up to date by hand.)
Some of my colleagues are convinced that committing build artefacts to the subversion repository is a good idea. The argument is that this way, installation and update on the test machines is easy - just "svn up"!
I'm sure there are weighty arguments against this bad practice, but all I can think of are lame ones like "it takes up more room". What are the best, killer reasons to not do this? And what other approaches should we do instead?
This is for Java code if that makes a difference. Everything is compiled from Eclipse (with no automated PDE builds).
When I say add the build artifacts, I mean a commit would look like this:
"Added the new Whizbang feature"
M src/foo/bar/Foo.java
M bin/Foo.jar
Each code change has the corresponding generated jar file.
In my opinion the code repository should only contain source code as well as third party libraries required to compile this source code (also the third party libraries might be retrieved with some dependency management tool during the build process). The resulting binaries should not get checked in along with the source code.
I think the problem in your case is that you don't have proper build scripts in place. That's why building a binary from the sources involves some work like starting up Eclipse, importing the project, adjusting classpaths, etc.
If there are build scripts in place, getting the binaries can be done with a command like:
svn update; ant dist
I think the most important reason not to check in the binaries along with the source is the resulting size of your repository. This will cause:
A larger repository, and maybe too little space on the versioning system server
Lots of traffic between versioning system server and the clients
Longer update times (imagine you do an SVN update from the internet...)
Another reason might be:
Source code is easily comparable, so lots of the features of a versioning system do make sense. But you can't easily compare binaries...
Also your approach as described above introduces a lot of overhead in my opinion. What if a developer forgets to update a corresponding jar file?
Firstly, Subversion (and all others nowadays) are not source code control managers (I always thought SCM means Software Configuration Management), but version control systems.
That means they store changes to the stuff you store in them, it doesn't have to be source code, it could be image files, bitmap resources, configuration files (text or xml), all kinds of stuff. There's only 1 reason why built binaries shouldn't be considered as part of this list, and that's because you can rebuild them.
However, think why you would want to store the released binaries in there as well.
Firstly, it's a system to assist you, not to tell you how you should build your applications. Make the computer work for you, instead of against you. So what if storing binaries takes up space - you have hundreds of gigabytes of disk space and super fast networks. It's not a big deal to store binary objects in there anymore (whereas ten years ago it might have been a problem - this is perhaps why people think of binaries in SCM as a bad practice).
Secondly, as a developer, you might be comfortable with using the system to rebuild any version of an application, but the others who might use it (e.g. QA, test, support) might not. This means you'd need an alternative system to store the binaries, and really, you already have such a system: it's your SCM! Make use of it.
Thirdly, you assume that you can rebuild from source. Obviously you store all the source code in there, but you don't store the compiler, the libraries, the SDKs, and all the other dependent bits that are required. What happens when someone comes along and asks "can you build me the version we shipped 2 years ago, a customer has a problem with that version"? 2 years is an eternity nowadays; do you even have the same compiler you used back then? What happens when you check all the source out only to find that the newly updated SDK is incompatible with your source and fails with errors? Do you wipe your development box and reinstall all the dependencies just to build this app? Can you even remember what all the dependencies were?!
The last point is the big one, to save a few k of disk space, you might cost yourself days if not weeks of pain. (And Sod's law also says that whichever app you need to rebuild will be the one that required the most obscure, difficult to set up dependency you were ever glad to get rid of)
So store the binaries in your SCM, don't worry over trivialities.
PS. we stick all binaries in their own 'release' directory per project, then when we want to update a machine, we use a special 'setup' project that consists of nothing but svn:externals. You export the setup project and you're done as it fetches the right things and puts them into the right directory structure.
A continuous integration server like Hudson would have the ability to archive build artifacts. It doesn't help your argument with "why not" but at least it is an alternative.
"I'm sure there are weighty arguments against this bad practice"
You have the wrong presumption that committing "build artifacts" to the version control is a bad idea (unless you wrongly phrased your question). It is not.
It is ok, and very important indeed, to keep what you call "build artifacts" in version control. More than that, you should also keep compilers and anything else used to transform the set of source files to a finished product.
Five years from now, you'll certainly be using different compilers and different build environments, which may not be able to compile today's version of your project, for whatever reason. What could have been a simple small change to fix a bug in a legacy version will turn into a nightmare of porting that old software to current compilers and build tools, just to recompile a source file that had a one-line change.
So, there is no reason you should be so afraid of storing "build artifacts" in version control. What you may want to do is to keep them in separate places.
I suggest separating them like:
ProjectName
|--- /trunk
| |--- /build
| | |--- /bin <-- compilers go here
| | |--- /lib <-- libraries (*.dll, *.jar) go here
| | '--- /object <-- object files (*.class, *.jar) go here
| '--- /source <-- sources (*.java) go here
| |--- package1 <-- sources (*.java) go here
| |--- package2 <-- sources (*.java) go here
You have to configure your IDE or your build scripts to place object files in /ProjectName/trunk/build/object (perhaps even recreating the directory structure under .../source).
This way, you give your users the option to checkout either /ProjectName/trunk to get the full building environment, or /ProjectName/trunk/source to get the source of the application.
In ../build/bin and ../build/lib you must place the compilers and libraries that were used to compile the final product, the ones used to ship the software to the user. In 5 or 10 years, you will have them there, available for your use in some eventuality.
"committing build artifacts to the subversion repository" can be a good idea if you know why.
It is a good idea for a release management purpose, more specifically for:
1/ Packaging issue
If a build artifact is not just an exe (or a dll or...), but also:
some configuration files
some scripts to start/stop/restart your artifact
some sql to update your database
some sources (compressed into a file) to facilitate debugging
some documentation (javadoc compressed in a file)
then it is a good idea to have a build artifact and all those associated files stored in a VCS.
(Because it is not anymore just a matter of "re-building" the artifact, but also of "retrieving" all those extra files that will make that artifact run)
2/ Deployment issue
Suppose you need to deploy many artifacts in different environments (test, homologation, pre-production, production).
If:
you produce many build artifacts
those artifacts take a long time to recreate from scratch
then having those artifacts in a VCS is a good idea, in order to avoid recreating them.
You can just query them from environment to environment.
But you need to remember:
1/ you cannot store every artifact you make in the VCS: all the intermediate builds you make for continuous integration purposes must not be stored in the VCS (or you end up with a huge repository with many useless versions of the binaries).
Only the versions needed for homologation and production purposes need to be referenced.
For intermediate builds, you need an external repository (Maven or a shared directory) in order to publish/test those builds quickly.
2/ you should not store them in the same Subversion Repository, since your development is committed (revision number) much more often than your significant builds (the ones deemed worthy of homologation and production deployment)
That means the artifacts stored in that second repository must have a naming convention for the tag (or for a property) in order to easily retrieve the revision number of the development from which they have been built.
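One possible convention, purely as an illustration, is to append the development revision to the tag name as "-r<revision>". The sketch below assumes exactly that made-up scheme (artifact-version-rNNNN); the class name ReleaseTag and the sample values are hypothetical as well.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical tag naming scheme: <artifact>-<version>-r<source revision>. */
final class ReleaseTag {

    // The revision is the run of digits after the final "-r" at the end of the name.
    private static final Pattern REVISION_SUFFIX = Pattern.compile("-r(\\d{1,18})$");

    /** Builds a tag name that records the development revision it was built from. */
    static String format(String artifact, String version, long revision) {
        return artifact + "-" + version + "-r" + revision;
    }

    /** Extracts the development revision from a tag name, if it follows the scheme. */
    static OptionalLong sourceRevision(String tagName) {
        Matcher m = REVISION_SUFFIX.matcher(tagName);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }
}
```

The same information could equally be stored in a Subversion property on the tag; the point is only that the mapping from a stored artifact back to its source revision can be recovered mechanically.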
In my experience, storing JARs in SVN can end in a mess.
I think it is better to keep the JAR files in a Maven repository manager like Nexus.
This also has the advantage that you can use a dependency management tool like Maven or Ivy.
Binaries, especially your own, but also third party, have no place in a source control tool like SVN.
Ideally you should have build scripts to build your own binaries (which can then be automated with one of the many fine automatic build tools that can check the source straight out of SVN).
For third party binaries you will need a dependency management tool like Maven2. You can then set up a local Maven repository to handle all third party binaries (or just rely on the public ones). The local repo can also manage your own binaries.
Putting the binaries in the trunk or branches is definitely overkill. Besides taking up space like you mention, it also leads to inconsistencies between source and binaries. When you refer to revision 1234, you don't want to wonder whether that means "the build resulting from the source at revision 1234" vs "the binaries in revision 1234". The same rule of avoiding inconsistencies applies to auto-generated code. You should not version what can be generated by the build.
OTOH I'm more or less OK with putting binaries in tags. This way it is easy for other projects to use the binaries of other projects via svn:externals, without needing to build all these dependencies. It also enables testers to easily switch between tags without needing a full build environment.
To get binaries in tags, you can use this procedure:
check out a clean working copy
run the build script and evaluate any test results
if the build is OK, svn add the binaries
instead of committing to the trunk or branch, tag directly from your working copy like this: svn copy myWorkingCopyFolder myTagURL
discard the working copy to avoid accidental commits of binaries to the trunk or branch
We have a tagbuild script to semi-automate steps 3 and 4.
One good reason would be to quickly get an executable running on a new machine. In particular if the build environment takes a while to set up. (Load compilers, 3rd party libraries and tools, etc.)
On my projects, I usually have post-build hooks to build from a special working copy on the server, namely in a path reachable from a HTTP browser. That means, after every commit, anyone [who can read the internal web] can easily download the relevant binaries.
No consistency problems, instant updating + a path towards automated testing.
Version control should have everything you need to do: svn co and then build. It shouldn't have intermediates or final product, as that defeats the purpose. You can create a new project in SVN for the result and version the binary result separately (for releases and patches if needed).
Checking in significant binaries violates a usage principle of source code/SVN, namely that files in source control should possess a meaningful property of difference.
Today's source file is meaningfully different from yesterday's source file; a diff will produce a set of changes which make sense to a human reader. Today's picture of the front of the office does not possess a meaningful diff with regard to yesterday's picture of the office.
Because things like images do not possess the concept of difference, WHY are you storing them in a system which exists to record and store the differences between files?
Revision-based storage is about storing histories of changes to files. There is no meaningful change history in the data of (say) JPEG files. Such files are stored just as well in a plain directory.
More practically, storing large files - build output files - in SVN makes checkout slow. The potential to abuse SVN as a generalised binary repository is there. It all seems fine at first - because there aren't many binary files. Of course, the number of files increases as time passes; I've seen modules which take hours to check out.
It is better to store large associated binary files (and output files) in a directory structure and refer to them from the build process.
Do you mean you have the sources plus the result of the build in the same repository?
This is a good argument for a daily build, with versioned build scripts in a separate repository. Binaries in a repository are not bad in themselves, but sources + result of build in the same place looks bad to me.
If you build several binaries and don't notice a build breakage somewhere, then you end up with binaries from different revisions, and you are setting yourself up for a subtle bug chase.
Advocate for a daily, separately versioned autobuild script, rather than just against the binaries + code.
Subversion is a Source Control Manager -> Binaries are not source
If you use the "svn up" command to update production, can all developers with commit permissions update/modify/break production?
Alternatives: Use continuous integration like Hudson or Cruise Control.
I think the feeling of having done something wrong when binary files are committed to the VCS comes from the basic idea that one should never put redundant things in an archive, which in turn is motivated by resource economy and the drawbacks of managing the same data twice.
That is why, if you can easily reconstruct your archived state of work from the other files of that version, for example by simply recompiling or installing standard setups, you should not commit such binaries, but rather commit something like a README or INSTALL file. If reconstruction is too difficult or too likely to fail, do commit them.