How to assess the risk of a java version upgrade? - java

I'm being asked to assess whether we can safely upgrade the java version on one of our production-deployed webapps. The codebase is fairly large and we want to avoid having to regression test everything (no automated tests, sadly), but we've already encountered at least one problem during some manual testing (XmlStringReader.getLocalName now throws an IllegalStateException when it just used to return null) and higher-ups are pretty nervous about the upgrade.
The current suggested approach is to do a source compare of the JDK sources for each version and assess those changes to see which ones might have impact, but there are a lot of changes to go through (and as mentioned the codebase is kinda large). Would it be safer and easier to just review the published change lists for each version? Or is there an easier way to conduct this assessment?
Edit: I forgot to mention the version upgrade being considered is a minor version upgrade, i.e. 1.6.10 to 1.6.33

Nothing is going to replace testing it in a real system. You may be able to catch something blatant in a bug report or by visual inspection, but detecting changes due to more complex interactions will be impossible. The same goes for seemingly simple changes which alter how the GC affects your running app, how HotSpot optimizes your code (you are checking the C++ code too, right?), or how some key algorithm performs...
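The behavior change from the question can be pinned down with a small probe. The question says XmlStringReader; the standard StAX class javax.xml.stream.XMLStreamReader is assumed here, since that is where this change was observed in practice (later 1.6 builds throw IllegalStateException for getLocalName() on events with no local name, where earlier builds returned null):

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

// Probes how getLocalName() behaves on an event where no local name is
// defined (a CHARACTERS event). Running this under both JDK builds turns a
// vague "behavior changed somewhere" into a concrete, diffable fact.
public class LocalNameProbe {
    public static void main(String[] args) throws Exception {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader("<root>text</root>"));
        r.next(); // START_ELEMENT <root>
        r.next(); // CHARACTERS "text": no local name for this event
        String result;
        try {
            result = "returned " + r.getLocalName();
        } catch (IllegalStateException e) {
            result = "threw IllegalStateException";
        }
        System.out.println("getLocalName on CHARACTERS event: " + result);
    }
}
```

A handful of tiny probes like this, one per suspected behavior change, is far cheaper than diffing the JDK sources.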

As jtahlborn says: nothing will replace testing it properly.
I would go further and state that without automation, this is a cost you will incur again and again.
The correct answer is to
Define a regression test suite
Run it (verify it)
Automate as much as possible as you go through it
A simpler approach is to just run it and catch the errors as you or your customers find them. Personally I think this is a good way to end up with demotivated developers, managers and customers. I strongly recommend you don't use this approach.


What are the dangers of using Hibernate Search with Elasticsearch version alpha 3 in production?

I am developing an e-commerce application. I am using Spring MVC 4 annotation-based configuration with Hibernate and MySQL. I needed to integrate a search engine, so I decided to go with Hibernate Search and Elasticsearch. I need to know whether using Hibernate Search alpha 3 in a production environment will pose any threats to my e-commerce web application. If the alpha version is a threat, what could be an alternative solution for me?
Answering on behalf of the Hibernate Search team (I'm the project lead).
When we release anything, it is in our opinion good code, and we believe the features we implemented are reliable. You might think of it as similar to when you write code for your own project and consider it "done, this will do the job nicely".
But while we take pride in our goal to write great code, we're human and sometimes we're wrong.
We test in various environments and OS combinations, and every pull request is peer-reviewed by another committer and open for scrutiny from anybody (it's all public on GitHub), so I would say that the quality is generally quite high.
What's the risk of using any non-Final version?
Environment
While we test in many combinations of OS, JDKs, databases, hardware types (small embedded to very high end servers), the combinations we can test are limited: Red Hat kindly sponsors us but the budget is not unlimited.
When you all download an Alpha/Beta and test it on your environment, you might catch some corner case that we don't know about.
Do yourself a favour, and get your team to test our preview builds regularly on the environment which matters to you: if it fails and you report it, we'll make sure it works for the Final release.
Several people help out by doing this, so the Final will have had much better coverage. However consider testing this in your own environment, so that your specific requirements are covered.
So when you push an Alpha into production, it might still have issues related to some environment that we don't know of yet. You could look through our issue tracker, though, to see whether any issues reported by other volunteers could affect you: if no issues are reported, chances are that the next release will not be "more reliable" than the Alpha but just the same again.
Test Coverage
We develop various unit tests, integration tests and performance tests to cover new features and to guard against regressions.
Other people might try to use the new features in ways that we didn't anticipate, or simply on field and type combinations which we didn't cover with our tests.
When you download our previews and use them against your own requirements, that might uncover issues which we have not covered. The best way to make sure that the Final release will be suited to your requirements is to try it out early on, and let us know what is not good yet.
If you send us a patch which adds a unit test to cover for your use case you can get a very high return on the investment: we'll include that test in our codebase, so that continuous integration will ensure that your requirements will be covered by future releases as well.
Of course if you tried and it works just fine - as we expect - then you might as well put it into production, as long as you understand the following points on what differentiates an Alpha from a Final.
So what is different between an Alpha and a Beta release?
Generally it's about feature coverage.
For a Beta release we typically require it to be "feature complete": to have all the things implemented which we think you will need to benefit from the new features.
An example in this case could be that the Alpha release of 5.6 - the first version supporting Elasticsearch - didn't have the feature to rebuild the Index. I think having this option is essential for various practical reasons, but if your specific use case doesn't require it (you might have your own strategy?) then the lack of such a feature might not bother you.
A Beta release will incorporate any fixes about issues that you all have let us know from testing the Alpha version. So the chances that it will "just work" on your environment are even higher.
What's different between a Beta and a Candidate Release or Final?
After publishing the Beta1 release we might still have some pending work to do, but we expect the API of new features to not change anymore. Unless someone has a significant and reasonable concern.
So we expect even more people to be happy to test a Beta, and when we have received enough feedback like "It's working great!" (we love to hear that, please let us know!) then we make a call that enough people have tested it and we call it a Final release - possibly with a candidate release to give people a last chance to try it.
At this point it's probably too late to tell us that some API is confusing, or that you'd suggest a method be named differently, etc., so make sure to try it out as early as practical for your project.
I hope that helps making a sensible choice for your specific use case.
In terms of Ready for Production: I think an Alpha is as ready as writing your own integration would be; just make sure you test it as you would your own team's code and study the release notes in detail to be equally aware of known limitations.
Much depends on how you can handle a potential issue with it: I would not recommend it on a mission-critical system, but some people ultimately get a better integration, either because they can test early versions in an environment very similar to production, or because they can handle the risk of having it in production already.

How to unit test for backward and forward compatibility?

I am working on developing a plug-in API that uses Java serialization. The idea is similar to Smalltalk's system images. I was wondering how best to automate testing for whether changes I am making will break deserialization, since some changes seem innocuous, like adding a method to an implemented interface (as long as it is not called; otherwise it will result in an AbstractMethodError).
Yes, this is more for an experimental spike rather than production code so please do not suggest not using serialisation.
For backward compatibility of data, keep a lot of old messages in binary form, and see if you can still deserialize them with the new code.
For backward compatibility of code, you'll need some way of building your old code (e.g. one version per release) and testing that against data created from the newest version of the code. This is a slightly more challenging problem - you may want to build a small test jar on each appropriate release, and put that into source control at the same time to avoid having to build the same code again and again. Your tests would then try all the different jar files against the output of the new code.
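The first suggestion (keep old messages in binary form and try to deserialize them with new code) can be sketched with plain java.io serialization. The Config class below is a hypothetical stand-in for your real serialized types; in a real test the byte[] would be loaded from fixture files produced by an older release, not generated in-process:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical serializable class under test
class Config implements Serializable {
    private static final long serialVersionUID = 1L; // pin this to preserve compatibility
    String name = "default";
}

public class CompatCheck {
    // Serialize an object to bytes (simulates an "old message in binary form")
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(o);
        oos.close();
        return bos.toByteArray();
    }

    // Deserialize with the current code; a break shows up as an exception here
    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes));
        try {
            return ois.readObject();
        } finally {
            ois.close();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] old = serialize(new Config());        // in practice: read from a stored fixture
        Config restored = (Config) deserialize(old); // must not throw after code changes
        System.out.println("roundtrip ok: " + restored.name);
    }
}
```

Checking in one fixture file per release, then looping over all of them in a unit test, gives you the backward-compatibility half of the matrix with very little ongoing effort.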
To be honest, this all sounds like quite a lot of work for an experimental spike. For real work I'd just use protocol buffers of course :)

Should I be concerned with large number of dependencies?

I was just about to include the HtmlUnit library in a project. I unpacked the zip-file and realised that it had no less than 12 dependencies.
I've always been concerned when it comes to introducing dependencies. I suppose I have to ship all these dependencies together with the application (8.7 MB in this particular case). Should I bother checking for, say, security updates for these libraries? Finally (and most importantly, actually what I'm most concerned about): what if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if, for instance, HtmlUnit depends on one version of xalan and another library I need depends on a different version of xalan?
The task HtmlUnit solves for me could be solved "manually" but that would probably not be as elegant.
Should I be concerned about this? What are the best practices in situations like these?
Edit: I'm interested in the general situation, not particularly involving HtmlUnit. I just used it here as an example as that was my current concern.
Handle your dependencies with care. They can bring you much speed, but can be a pain to maintain down the road. Here are my thoughts:
Use some software to manage your dependencies. Maven is what I would use for Java. Without it you will very soon lose track of your dependencies.
Remember that the various libraries have different licenses. It is not guaranteed that a given license works for your setting. I work for a software house and we cannot use GPL-based libraries in any of the software we ship, as the software we sell is closed source. Similarly, we should avoid LGPL if we can (this is due to some intricate lawyer reasoning, don't ask me why).
For unit testing I'd say go all out. It is not the end of the world if you have to rewrite your tests in the future. By then, that part of the software may be either extremely stable or not even maintained anymore. Losing those tests is not that big of a deal, since you already gained speed when you adopted the library.
Some libraries are harder to replace later than others. Some are like a marriage that should last the life of the software, but some other are just tools that are easily replaceable. (Think Spring versus an xml library)
Check out how the community supports older versions of the library. What happens when life continues and you are stuck at a version? Is there an active community, or do you have the skills to maintain it yourself?
For how long is your software supposed to last? One year, five years, ten years or beyond? If the software has a short lifespan, you can afford to use more libraries to get where you are going, as it is less important to be able to keep up with upgrading them.
It could be a serious issue if there isn't an active community maintaining the libraries over the long term. It is OK to use libraries, but to be honest you should consider getting the sources and putting them into your VCS.
Should I bother checking for, say, security updates for these libraries?
In general, it is probably a good idea to do this. But then so should everyone upstream and downstream of you.
In your particular case, we are talking about test code. If potential security flaws in libraries used only in testing are significant, your downstream users are doing something strange ...
Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another library I need, depends on a different version of xalan?
Ah yes. Assuming that you are building your own classpaths etc. by hand, you need to make a decision about which version of the dependent library you should use. It is usually safe to just pick the most recent of the versions used. But if the newer version is not backward compatible with the older one (for your use case), then you've got a problem.
Should I be concerned about this?
IMO, for your particular example (where we are talking about test code), no.
What are the best practices in situations like these?
Use Maven! It explicitly exposes the dependencies to the folks who download your code, making it possible for them to deal with the issue. It also tells you when you've got dependency version conflicts and provides a simple "exclude" mechanism for dealing with it.
Maven also removes the need to create distributions. You publish just your artifacts with references to their dependencies. Maven then downloads the dependent artifacts from wherever they have been published.
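The "exclude" mechanism mentioned above looks roughly like this in a pom.xml. The coordinates below are illustrative (check the actual groupId/artifactId/version of the libraries you use); the idea is to drop the transitive xalan pulled in by HtmlUnit and declare the version you want explicitly:

```xml
<dependencies>
    <dependency>
        <groupId>net.sourceforge.htmlunit</groupId>
        <artifactId>htmlunit</artifactId>
        <version>YOUR_VERSION</version>
        <exclusions>
            <!-- Suppress the transitive xalan that conflicts with another library -->
            <exclusion>
                <groupId>xalan</groupId>
                <artifactId>xalan</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <!-- Then pin the single xalan version the whole build should use -->
    <dependency>
        <groupId>xalan</groupId>
        <artifactId>xalan</artifactId>
        <version>YOUR_VERSION</version>
    </dependency>
</dependencies>
```

Running mvn dependency:tree afterwards shows exactly which version of each transitive dependency ended up on the classpath.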
EDIT
Obviously, if you are using HtmlUnit for production code (rather than just tests), then you need to pay more attention to security issues.
A similar thing has happened to me actually.
Two of my dependencies had the same 'transitive' dependency but a different version.
My favorite solution is to avoid "dependency creep" by not including too many dependencies. So, the simplest solution would be to remove the one I need less, or the one I could replace with a simple Util class, etc.
Too bad it's not always that simple. In unfortunate cases where you actually need both libraries, you can try to sync their versions, i.e. downgrade one of them so that the dependency versions match.
In my particular case, I manually edited one of the jars, deleted the older dependency from it, and hoped it would still work with the newer version loaded from the other jar. Luckily, it did (i.e. the maintainers of the transitive dependency hadn't removed any classes or methods that the library used).
Was it ugly? Yes (yuck!), but it worked.
I try to avoid introducing frivolous dependencies, because it is possible to run into conflicts.
One interesting technique I have seen used to avoid conflicts: renaming a library's package (if its license allows you to -- most BSD-style licenses do.) My favorite example of this is what Sun did when they built Xerces into the JRE as the de-facto JAXP XML parser: they renamed the whole of Xerces from org.apache.xerces to com.sun.org.apache.xerces.internal. Is this technique drastic, time consuming, and hard to maintain? Yes. But it gets the job done, and I think it is an important possible alternative to keep in mind.
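For Maven users, this package-renaming technique can be automated at build time with the maven-shade-plugin's "relocation" feature, which rewrites the package names inside the bundled classes. A sketch (the shaded package name is illustrative):

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <configuration>
        <relocations>
            <!-- Rewrite org.apache.xerces to a private package so it cannot
                 clash with another copy of Xerces on the classpath -->
            <relocation>
                <pattern>org.apache.xerces</pattern>
                <shadedPattern>com.example.shaded.org.apache.xerces</shadedPattern>
            </relocation>
        </relocations>
    </configuration>
</plugin>
```

This is essentially the Sun/Xerces trick from the paragraph above, minus the manual maintenance burden, though the license caveat still applies.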
Another possibility, license terms permitting, is copying/renaming single classes or even single methods out of a library.
HtmlUnit can do a lot, though. If you are really using a large portion of its functionality on a lot of varied input data, it would be hard to make a case for spending the large amount of time it would take to re-write the functionality from scratch, or repackage it.
As for the security concerns -- you might weigh the chances of a widely used library having problems, vs. the likelihood of your hand-written less-tested code having some security flaw. Ultimately you are responsible for the security of your programs, though -- so do what you feel you must.

Which work process in my company should I Improve first?

I've just started working in a new place, and I see several things they do that I find really terrible, and I want to know if they are indeed so wrong, or if I am just too strict. Please let me know if my criticism is in place, and give your opinion on which problem is the worst and should be fixed first. The development is all in Java.
Not using svnignore. This means svn stat can't be used, and developers forget to add files and break the build.
Generated files go to same folders as committed files. Can't use simple maven clean, have to find them one by one. Maven clean doesn't remove all of them.
Not fixing IDE analysis warnings. Running code analysis returns about 5,000 warnings, of many different kinds.
Not following conventions: spring beans names sometimes start with uppercase and sometimes not, ant properties sometimes with underline and sometimes with dots delimiter, etc.
Incremental build takes 6 minutes, even when nothing is changed.
Developers only use remote debug, and don't know how to run the Tomcat server internally from the IDE.
Developers always restart the server after every compilation, instead of dynamically reloading the class and saving the server's state. It takes them at least 10 minutes to start checking any change in the code.
Developers only compile from the command line. When there are compilation errors, they manually open the file and go to the problematic line.
A complete mess in project dependencies. Over 200 open-source libraries are depended on, and no one knows what is actually needed and why. They do know that not all dependencies are necessary.
Mixing Maven and Ant in a way that disables the benefits of both. In one case, even dependency checks are not done by Maven.
Not using generics properly.
Developers don't use Subversion integration with IDE (Eclipse, Intellij Idea).
What do you think? Where should I start? Is any of the things I mentioned not really a problem?
I'd look at it like this:
Anything that affects productivity should be solved first
Things that affect profitability solved second (most productivity fixes are profitability fixes too)
Nitpicky stuff last
Therefore, you should have the following (in order of my opinion):
7 - Restarting the Server after Compilation
5 - Incremental Build Speed
6 - Remote Debugging only
8 - Compiling from command line
12 - Subversion Integration (kind of in the same league as 5. above)
2 - Generated Files
11 - Not using Generics Correctly
Then
1 - svnignore
9 - Project Dependencies (this will take a great deal of time i'm sure)
10 - Mixing Maven + Ant
3 - IDE Warnings
4 - Conventions
The reason I have the ordering in this sense is time vs. benefit. If it takes a developer 16 minutes to compile and check their code, it's bloody well insane to be honest. Say a developer compiles 5x per day: we're talking about 80 minutes of doing nothing.
After that it's productivity. If you speed up the rate at which your developers can do their work, the turnover of work completed will rise substantially. (Profitability++)
After this come the "nitpicky" things. I say this not to imply that they're not important, but from the looks of things you have much bigger fish to fry, so get those done first before correcting casing in code.
Not being a maven user, I can't comment on some of the above but most things in your list look like good areas for improvement. I think the best advice I can give is:
Take it slow. They've probably been doing it this way for ages and may resist change.
Get the team (and possibly the manager(s)) involved in the changes. Hold a meeting to discuss what improvements you see (keep it simple, just a few), what they think of them, and whether they think it's sensible to implement them. Then, if agreed, pair with someone to get the improvement in place.
Offer presentations on working practices which are easily changed. E.g. show them, in a live setting, the difference dynamic class reloading makes during a debugging session.
Prioritize the list above and focus on a few at a time.
Be gentle. Change is hard! Too much at once will potentially alienate or disengage folk.
Start with quick wins that will make an immediate and positive difference to the developers in the team. This will build their confidence in accepting more difficult changes.
Best of luck...
Comments to the issues
1) Not using svnignore. This means svn stat can't be used, and developers forget to add files and break the build.
Doesn't sound very critical to me - I assume from the above that you have a CI or nightly build (if not, that would be a major issue indeed). The purpose of the CI build is to catch such problems, so IMHO it is not a catastrophe if it is broken every now and then. Of course if it happens daily, that's a different story :-(
2) Generated files go to same folders as committed files. Can't use simple maven clean, have to find them one by one. Maven clean doesn't remove all of them.
This is bad, and is fairly simple to fix (under normal circumstances :-)
3) Not fixing IDE analyze warnings. Analyze code returns about 5,000 warning, of many different kinds.
This is bad, and it takes a lot of time to fix. However, skimming through the analysis results to spot really critical issues could be a high priority task.
4) Not following conventions: spring beans names sometimes start with uppercase and sometimes not, ant properties sometimes with underline and sometimes with dots delimiter, etc.
Not a catastrophe, OTOH easy to fix.
5) Incremental build takes 6 minutes, even when nothing is changed.
This is bad, and (considering cases 9 and 10 below) may be a rather daunting task to fix.
6) Developers only use remote debug, and don't know how to run the Tomcat server internally from the IDE.
A short demo and mentoring should not take a lot of effort. However, there may be cultural issues, and old members of the team might not be willing to learn new tricks. So a sensitive approach is required.
7) Developers always restart the server after every compilation, instead of dynamically reloading the class and saving the server's state. It takes them at least 10 minutes to start checking any change in the code.
Same as above.
8) Developers only compile from the command line. When there are compilation errors, they manually open the file and go to the problematic line.
Same as above.
9) A complete mess in project dependencies. Over 200 open-source libraries are depended on, and no one knows what is actually needed and why. They do know that not all dependencies are necessary.
This is bad, and is a huge task to fix.
10) Mixing Maven and Ant in a way that disables the benefits of both. In one case, even dependency checks are not done by Maven.
Same as above.
11) Not using generics properly.
Do you mean they are still programming the Java 1.4 way, using non-generic collections et al.? If the code is otherwise stable, this can be fixed (and developers educated) gradually.
12) Developers don't use Subversion integration with IDE (Eclipse, Intellij Idea).
See case 6.
Priorities
I would try ordering tasks based on cost vs benefit ratio. My order would be:
First, tasks to simplify and speed up day to day work, and build up your credibility and authority within the team: 7, 6, 8, 12, 2
Then the fundamental, but difficult and time-consuming tasks, for which you need more support from the team: (continuous integration in case there is none yet), 5, 10, 9
The rest can be introduced gradually: 1, 3, 4, 11
You don't mention continuous integration, but one good thing to start with is to give developers rapid feedback when the build is broken. Once you have that, you can add code quality metrics to encourage them to correct warnings or bad use of generics.
Only with all that in place should you start working on their productivity by showing them how to hot deploy, debug, use an IDE and so on...
And good luck :)
In general, I think that stuff that wastes time for developers should be taken care of first. The less idle the developer is, the greater the benefit.
1) This together with 12) should be prioritized higher, as broken builds take time.
3) IDE warnings aren't that important, as they are just warnings and could be as simple as unused imports.
4) Lack of naming conventions doesn't break anything and can be enforced gradually. That will take a lot of time however you do it. This shouldn't be highly prioritized.
5) I assume you have a separate build server that takes care of the builds; six minutes doesn't sound like a very long time. If you don't have a build server, that would be a good investment.
7) Sounds like a lot of time is being wasted for all developers, this should be highly prioritized.
11) These could cause a lot of the warnings in 3), should be fixed, but not highest priority.
12) Subversion integration with IDE should help out a lot, I think the developers would see this as very useful.
First of all, I totally agree with your list of problems, since I have been on projects that have basically the same issues.
Start with the places where you think you can gain the most productivity.
I would think that you can save a considerable amount of time by:
Getting Tomcat running in the IDE.
Fixing any build script (and maybe also test) issues to make the builds run fast.
Compiling within the IDE and hot deploying (hotswap/JRebel); this will also save you a lot of time.
As already posted, get a continuous build server up and running for the project.
And if you have an issue tracker, add these issues to it so that everyone is aware of what needs to be done. When the high-priority stuff has been completed, try to push for time to get the other stuff fixed as well; it's really annoying with all the small problems, and even though they seem small now, they can cause considerable headaches after a while.
First of all, since you are new, you need to be careful not to come across as very annoying.
I would suggest you start by talking to your own boss, and say that you may have some experiences that might be useful to your new company. Management backup is essential if you want something done rather quickly.
You will need to demonstrate to your boss and coworkers that your suggestions are immediately beneficial to them in their daily work. Hence select just one pain point and fix it well (but reversibly, as being able to go back is a nice option to have when trying things out). Be prepared to do a lot of work yourself and a lot of mentoring.
Based on your description, I would suggest that a quick demonstration of a proof-of-concept web application being run inside the IDE, with hot code replacement and instant redeployment of changed files, would be very eye-opening. Do NOT go too fast: be certain that everybody understands what you do. Then, as a final demonstration, do it again at normal speed.

Strategies for migrating medium-sized code base from Java 1.4.2 to Java 5

I'm in the process of reviewing a code base (~20K LOC) and trying to determine how to migrate it from 1.4.2 to 5. Obviously, it's not an overnight project, and the suggestion I have received is to write new code against Java 5 and migrate the old code in a piecemeal fashion. Also, I'm no expert in the new features of Java 5 (i.e. I know of them, but have never written any for production use).
My questions:
What features of Java 5 are typically used in production code? (i.e. generics, auto-boxing, etc.) Are there features to be avoided / not considered to be best-practices?
What are the best refactoring strategies I can use to migrate a code base of this size? (i.e. make changes to classes one at a time, only when a class is edited, etc.) Objective: reduce risk to the code base. Limitation: resources to do the refactoring.
Any advice is appreciated - thanks in advance.
UPDATE - a year too late, but better late than never? =)
Thank you for all of the comments - lots of great points of view. In the life of a software developer, there's always going to be the projects you strive to finish but never get around to because of something more "urgent".
With respect to the use of Java 5 (at that time), it was something which was required in the client's production environment, so that was why we did not use Java 6.
I found that the stronger typing for collections, enums and unboxing of primitives were the features I tended to apply the most, both to old and new code. The refactoring was fairly straightforward, but code comprehension improved significantly and standards became easier to enforce. The one I had the most trouble with was generics; it's a concept which I still haven't had a chance to fully grasp and appreciate, so it was difficult for me to find previous cases where the application of generics was appropriate.
Thanks again to everyone who contributed to this thread and apologies for the late follow up.
Java 5 is almost completely backwards compatible with Java 4. Typically, the only change you must make when you migrate is to rename any identifiers called enum in the Java 4 code, since enum became a keyword in Java 5.
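Concretely, enum was a legal identifier under -source 1.4 but is a reserved keyword from Java 5 on, so code that used it as a variable name needs a mechanical rename:

```java
// Before Java 5, "enum" was a valid identifier:
//
//   int enum = 3;   // compiles under -source 1.4, syntax error under 1.5
//
// The fix is a simple rename:
public class EnumRename {
    public static void main(String[] args) {
        int enumValue = 3; // renamed from "enum"
        System.out.println(enumValue);
    }
}
```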
The full list of potential compatibility problems is listed here:
http://java.sun.com/j2se/1.5.0/compatibility.html
The only other one that I've run into in practice is related to the change in the JAXP implementation. In our case, it simply meant removing xerces.jar from the classpath.
As far as refactoring goes, I think that migrating your collection classes to the new strongly-typed generic versions and removing unnecessary casts is a good idea. But as another poster pointed out, changing to generic collections tends to work best if you work in vertical slices. Otherwise, you end up having to add casts to the code to make the generic types compatible with the non-generic types.
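The cast removal described above, as a small before/after sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class GenericsMigration {
    public static void main(String[] args) {
        // Before (Java 1.4 style): raw types force a cast at every read
        List rawNames = new ArrayList();
        rawNames.add("alice");
        String first = (String) rawNames.get(0);

        // After (Java 5): the element type is checked at compile time
        List<String> names = new ArrayList<String>();
        names.add("alice");
        String sameFirst = names.get(0); // no cast needed

        System.out.println(first.equals(sameFirst));
    }
}
```

The payoff is that a wrong element type becomes a compile error instead of a ClassCastException at runtime, which is exactly the kind of safety net you want during a migration.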
Another feature I like to use when I'm migrating code is the @Override annotation. It helps to catch inheritance problems when you're refactoring code.
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Override.html
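A small illustration of what @Override buys you during a refactor (class names are illustrative):

```java
// @Override asks the compiler to verify that a method really overrides a
// superclass method. If Parser.name() were renamed or its signature changed,
// XmlParser would fail to compile instead of silently becoming an overload.
class Parser {
    String name() { return "base"; }
}

class XmlParser extends Parser {
    @Override
    String name() { return "xml"; }
}

public class OverrideDemo {
    public static void main(String[] args) {
        Parser p = new XmlParser();
        System.out.println(p.name());
    }
}
```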
The new concurrency library is very useful if your code uses threading. For example, you may be able to replace home-grown thread pools with a ThreadPoolExecutor.
http://java.sun.com/j2se/1.5.0/docs/relnotes/features.html#concurrency
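For instance, a home-grown worker-thread pool can often be replaced wholesale with an ExecutorService; a minimal sketch in the Java 5 idiom of the time:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolDemo {
    public static void main(String[] args) throws Exception {
        // A fixed pool of 4 worker threads, managed entirely by the library
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Submit a unit of work and get a Future handle for its result
        Future<Integer> result = pool.submit(new Callable<Integer>() {
            public Integer call() {
                return 21 * 2; // stand-in for real work
            }
        });

        System.out.println("result: " + result.get()); // blocks until done
        pool.shutdown();
    }
}
```

Compared to hand-rolled pools, you get queuing, lifecycle management and exception propagation (via Future.get) for free.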
I would definitely take the approach of updating the code as you change it during normal maintenance. Other than the compatibility issues, I don't think there is a compelling reason to use the new Java 5 features unless you're already changing the code for other reasons.
There is one very real issue with the "viral" nature of generics; once you start introducing them at a given layer in an architecture you generally want to introduce it at the layer above & below as well. I have found that introducing generics is probably best done in full "verticals". But you do not have to do all the verticals at once.
This is a really hard question to answer because it depends on what code will be affected and how critical that code is.
First and foremost, when migration is a nontrivial undertaking, do yourself a favour and upgrade to the latest version of Java, which would be Java 6, not Java 5. Java 6 has been out for a year and a half or more and is mature. There's no reason not to pick it over Java 5 (IMHO).
Secondly, like any software project, your goal should be to get something into production as soon as you possibly can. So you need to identify a slice of your system. The smaller the better; the more non-critical, the better.
The other thing to do is just try starting up your app under Java 6 and seeing what breaks. It might be worse than you expected. It might be much better.
The other thing you'll probably need to be aware of is that, by the sounds of it, you will have jars/libraries in your app that have since been deprecated. Some may not even be compatible with Java beyond 1.4.2. You will probably want to upgrade all of these to the latest versions as well.
This will probably mean more stuff breaking, but using old/deprecated APIs is just kicking the can down the road and will cause you other problems.
There are exceptions to this where upgrading can have far-reaching consequences. Axis1 to Axis2 comes to mind. Those situations require more careful thought.
As for what features are used... all of them pretty much. I can't think of any that should be avoided off the top of my head.
Also, I just noticed the size of your project: ~20K LOC. That's actually quite small (eg I've written an app about that size in the last 3 months by myself).
Lastly, this also depends on how easily you will find things that break. If you have good unit test coverage then great. That's pretty rare though. If you can just run through the app and reliably find problems it's not too bad.
The problematic situations are where scenarios are hard to test and it's likely you won't uncover problems straight away. That calls for more caution.
You would want to migrate stuff that doesn't work in the transition from 1.4 to 5 (not sure what that would be), but I'd be wary of migrating stuff for the sake of it.
If you do take this route, some questions:
Do you have comprehensive test coverage? If not, you should write unit tests for the code you're going to be migrating.
Do you have components that are widely used within your codebase? If so, they are probably candidates to have their APIs migrated (e.g. using generics, etc.)
In terms of what's widely used from Java 5: generics are important and make your life a lot easier. I don't see autoboxing too much, nor enums (this is all relative). Varargs almost never. Annotations are useful for frameworks, but I only consume them; I don't think I've ever implemented one myself.
20 (non-comment) kLOC should be small enough to insert generics in one big bang. Obviously, make sure your code compiles and runs on Java SE 5 first. The relatively easy thing about generics is that adding them makes very little change to semantics (although certain overloadings can change, because the compiler now sees more precise static types - Iterator<char[]> iter; ... System.out.println(iter.next()); is a bad example off the top of my head).
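That println example, spelled out: println has both println(char[]) and println(Object) overloads, and generifying the iterator changes the static type of next() from Object to char[], which silently selects a different overload:

```java
public class OverloadPitfall {
    public static void main(String[] args) {
        char[] data = {'h', 'i'};
        Object asObject = data; // what a raw Iterator's next() would return

        System.out.println(data);     // println(char[]): prints the characters, "hi"
        System.out.println(asObject); // println(Object): prints "[C@<hashcode>"
    }
}
```

This is one of the few places where adding generics changes behavior rather than just adding compile-time checks, so it is worth grepping for overloaded calls on collection elements during a big-bang migration.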
In some cases, adding generics will highlight conceptual problems with the code: using one Map as two maps with disjoint key sets, for example. TreeMap is an example in the Java library where a single class has two distinct modes (using a Comparator<T>, or relying on the keys being Comparable<T>).
Things like the enhanced for loop and autoboxing are very local and can be added piecemeal. Enums are rarer and might take some thinking about how you are actually going to use them.
I think you're going about this the wrong way. Your plan shouldn't be to update all current code to Java 1.5, your plan should be to ensure that all current code runs exactly the same in 1.5 as it did in 1.4.2, and that all future code written will work fine in 1.5.
I've gone through a few transitions like this of varied sized code bases. The goal was always to make sure we had a ton of unit tests so that we could easily plug in 1.5 and run our tests through it. We actually encountered about 10 problems, mostly related to regular expression libraries not supporting something or supporting something differently.
Write all new code in 1.5, then, and if you change an older class for whatever reason, spend a minute and introduce generics there, but there's no reason to refactor everything. That sounds a bit dangerous to me if you don't have the tests in place.
