Context
Running multiple versions of the same library seems to be a common need, and there are many questions about it when dealing with versioned jar dependencies.
However, I have another constraint here: my code is part of a rolling-release MOAB where code has no version. I cannot depend on an older version of a library from the MOAB.
The use case of that question is being able to load different versions of the same code at runtime for compatibility.
Eg: GET /my/api/call?compat_version=42
I have to be able to provide several compatibility versions (i.e. code from version x that has not been changed since). This must be the actual code that was running when version x was the current/latest version, not some kind of retrocompatibility trick.
Naive solution
The "obvious" way seems to duplicate the code for each version. For instance by having per-version packages:
com.me.thing.v1
com.me.thing.v2
com.me.thing.v3
...
and dynamically loading the code from the associated package based on the provided compat_version parameter, by whatever technique. Let's suppose for now that all those versions share a common interface (API).
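For concreteness, a minimal sketch of what that dispatch could look like (ThingApi and ThingApiImpl are made-up names; only the com.me.thing.vN package layout comes from above):

    // ThingApi.java: the common interface shared by all frozen versions.
    public interface ThingApi {
        String handle(String request);
    }

    // ThingApiResolver.java: picks the frozen implementation for a compat version.
    public final class ThingApiResolver {
        public static ThingApi forCompatVersion(int compatVersion) throws Exception {
            // e.g. com.me.thing.v42.ThingApiImpl is the untouched copy of the code
            // that was current when 42 was the latest version.
            String className = "com.me.thing.v" + compatVersion + ".ThingApiImpl";
            return (ThingApi) Class.forName(className)
                                   .getDeclaredConstructor()
                                   .newInstance();
        }
    }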
Challenge
I'd like to challenge that and maybe find a better option than the naive solution.
Since using the exact code from version x is a prerequisite, I don't believe I can get rid of the copy-paste (but please, tell me I'm wrong).
What technique would you suggest as a simple (but not necessarily easy) and robust implementation? Reflection? Dependency injection?
Is there a "good" pattern for doing such things? Is there any literature on that?
This was already an old problem when Java was developed, hence Sun's emphasis on binary compatibility (which existed for Solaris of course as well). This is the original guarantee offered by the platform- that you can upgrade the bits underneath and applications will continue to work, unmodified.
The way to run legacy code in the JVM world is to run the full legacy application.
Many segregation architectures have of course been developed over the years and reached various levels of maturity, like OSGi and many others before it, but there are edge cases upon edge cases and many failure modes.
Do not futz with multiple code versions within a single JVM. It was never a design goal, and in environments where it matters it only leads to pain.
Related
I want to keep only the java.util, java.io, java.nio and java.math packages and remove all other packages, like java.sql, from my JDK.
How can I remove them?
So if I write a program which imports a removed package, it should give a "package doesn't exist" error.
Use a SecurityManager instead of hacking the JDK
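Roughly, and only as a sketch (a real setup also needs a policy file granting your own code its normal permissions, since installing any SecurityManager switches the default policy on): block the packages at class-loading time instead of deleting them.

    public class NoSqlSecurityManager extends SecurityManager {
        @Override
        public void checkPackageAccess(String pkg) {
            // Deny the packages you would otherwise have deleted from the JDK.
            if (pkg.equals("java.sql") || pkg.startsWith("java.sql.")) {
                throw new SecurityException("access to " + pkg + " is blocked");
            }
            super.checkPackageAccess(pkg);
        }

        public static void main(String[] args) throws Exception {
            System.setSecurityManager(new NoSqlSecurityManager());
            // With the standard launcher class loader, resolving a java.sql class
            // now fails with a SecurityException rather than "package doesn't exist".
            Class.forName("java.sql.DriverManager");
        }
    }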
I'm going to give you the best answer I can.
Why you really shouldn't be doing what you want to do
When you're writing code, it is commonly agreed that you should develop that code in a way that is extensible. That is, your code can be plugged into other applications, or it can be changed and added to, very easily. Now with that principle in mind, let's review what happens when you delete possible functionality from your program. Say you delete the SQL package, and in the future you want a backend database to provide some persistence in your program. You're screwed.
The idea of Java, in fact I'd go as far as to say the major advantage of Java, is its commonality, consistency and standardization of patterns. A getter is always a getter. A variable (that isn't a constant) starts with a lower-case letter. Classes have a standardized way of being structured. All these things make developing in Java quite intuitive.
The JDK is part of that consistency, and to edit it is to really impact one of the major points of Java. It sounds like you want to implement your program in a different, more compact language.
Finally, you have no idea how the client may want to extend your project in the future. If you want repeat business from the client, and to build a good reputation at the same time, you want to design your code with good design practices in mind.
There is no such tool, AFAIK.
Removing stuff from the Java libraries can be technically tricky, because it can be difficult to know whether your code might directly or indirectly use some class or method.
There are potentially "licensing issues" if you add or remove classes from a JRE installer, and ship it to other people.
Concerning your proposed use case.
If you are building this as a web application, then you are going to have a lot of difficulty cutting out classes that are not needed. A typical webapp server-side framework uses a lot of Java SE interfaces.
If you accepted and ran code from someone who wanted to bring down your service, they could do it using nothing but the Object class. (Hint: infinite loops and filling the heap.) Running untrusted code on your server is a bad idea. Period.
Think about the consequence for someone trying to run legitimate code on your server. Why shouldn't they be allowed to use library classes / methods? (I'd certainly be a bit miffed if I couldn't use "ordinary" library classes ...)
My advice would be to reconsider whether it is a good idea to implement such a service at all, given the risks and the difficulty you could have if your safeguards were ineffective. If you decide to proceed, I advise running the untrusted code within the JVM in a security sandbox. As a second level of defence, in case Java security is compromised, I'd recommend running the service "chrooted", or better still in an isolated virtual machine that can be turned off if you run into problems.
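For reference, the crudest possible version of that sandbox (an illustration only; it grants no permissions to any code source, including your own, so a realistic policy would distinguish the untrusted code's CodeSource from the rest):

    import java.security.*;

    public class Sandbox {
        public static void enable() {
            // A Policy that grants nothing: no file, network, reflection or exit permissions.
            Policy.setPolicy(new Policy() {
                @Override
                public PermissionCollection getPermissions(CodeSource codesource) {
                    return new Permissions();
                }
                @Override
                public void refresh() {
                }
            });
            // Activate it; from here on, privileged operations throw SecurityException.
            System.setSecurityManager(new SecurityManager());
        }
    }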
My Question: What is a good way to deal with two different versions of an API? (Sub-question: Is there any way to avoid classpath reference problems if you include two libraries with the same class-names?)
Description of question: I have a project that uses an API. I have spent the last few months developing for the old version and I'm about to add features for a newer version. So far as I can tell there's only one critical difference. However, I don't have the API yet (waiting on someone to get me the jar) so I can't be sure whether there are more differences.
Description of subquestion: I'm worried that there may be class reference inconsistencies between the two APIs (like I said, I don't have the jar yet to be sure).
I realize you may want some code to look at, but this is a design question, not a coding issue. I'm hoping to get some best practices out of this. Thanks!
Not a duplicate: I've looked at a few questions which may appear to be duplicates, but they didn't really address my issue :)
Avoid, or shade.
If you have full control of the classpath you can use ordering: this is brittle and non-intuitive.
If you don't, you're at the mercy of whatever ordering you happen to get; hence shading, and its related headaches.
There are also classloader isolation strategies you can use. For example, if you're in a container environment (Java EE, OSGi) you can put the different versions of the libraries in different classloader contexts (i.e., different EJB jars, or different web applications) and they won't interfere with each other.
OSGi can do the same sort of thing; deploy library A.1 and A.2 into your OSGi container, and deploy project.uno (which uses A.1) and project.dos (which uses A.2) into the same JVM, and the OSGi container handles resolution of the classpaths.
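Outside a container you can hand-roll the same isolation with two URLClassLoaders; a sketch (jar paths and the class name are made up):

    import java.net.URL;
    import java.net.URLClassLoader;

    public class IsolationDemo {
        public static void main(String[] args) throws Exception {
            // Each version gets its own loader with no parent on the application classpath,
            // so the two copies of com.example.Widget live in separate namespaces.
            ClassLoader v1 = new URLClassLoader(new URL[] { new URL("file:lib/widget-1.0.jar") }, null);
            ClassLoader v2 = new URLClassLoader(new URL[] { new URL("file:lib/widget-2.0.jar") }, null);

            Class<?> widgetV1 = Class.forName("com.example.Widget", true, v1);
            Class<?> widgetV2 = Class.forName("com.example.Widget", true, v2);

            System.out.println(widgetV1 == widgetV2); // false: same name, different classes
        }
    }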
I was just about to include the HtmlUnit library in a project. I unpacked the zip-file and realised that it had no less than 12 dependencies.
I've always been concerned when it comes to introducing dependencies. I suppose I have to ship all these dependencies together with the application (8.7 MB in this particular case). Should I bother checking for, say, security updates for these libraries? Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another library I need depends on a different version of xalan?
The task HtmlUnit solves for me could be solved "manually" but that would probably not be as elegant.
Should I be concerned about this? What are the best practices in situations like these?
Edit: I'm interested in the general situation, not particularly involving HtmlUnit. I just used it here as an example as that was my current concern.
Handle your dependencies with care. They can bring you much speed, but can be a pain to maintain down the road. Here are my thoughts:
Use some software to maintain your dependencies. Maven is what I would use for Java to do this. Without it you will very soon lose track of your dependencies.
Remember that the various libraries have different licenses. It is not a given that a particular license works for your setting. I work for a software house and we cannot use GPL-based libraries in any of the software we ship, as the software we sell is closed source. Similarly we should avoid LGPL as well if we can (this is due to some intricate lawyer reasoning, don't ask me why).
For unit testing I'd say go all out. It is not the end of the world if you have to rewrite your tests in the future. By then, that part of the software may either be extremely stable or no longer maintained at all. Losing those tests is not that big of a deal, as you already gained a lot of speed when you adopted the library.
Some libraries are harder to replace later than others. Some are like a marriage that should last the life of the software, but some other are just tools that are easily replaceable. (Think Spring versus an xml library)
Check out how the community supports older versions of the library. What happens when life moves on and you are stuck at a particular version? Is there an active community, or do you have the skills to maintain it yourself?
How long is your software supposed to last? One year, five years, ten years or beyond? If the software has a short lifespan, you can lean more heavily on libraries to get where you are going, as it is not as important to be able to keep up with upgrading them.
It could be a serious issue if there isn't an active community maintaining the libraries over the long term. It is OK to use libraries, but to be honest you should take care to get the sources and put them into your VCS.
Should I bother checking for, say, security updates for these libraries?
In general, it is probably a good idea to do this. But then so should everyone upstream and downstream of you.
In your particular case, we are talking about test code. If potential security flaws in libraries used only in testing are significant, your downstream users are doing something strange ...
Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another library I need, depends on a different version of xalan?
Ah yes. Assuming that you are building your own classpaths, etc. by hand, you need to make a decision about which version of the dependent libraries you should use. It is usually safe to just pick the most recent of the versions used. But if the newer version is not backwards compatible with the older one (for your use case), then you've got a problem.
Should I be concerned about this?
IMO, for your particular example (where we are talking about test code), no.
What are the best practices in situations like these?
Use Maven! It explicitly exposes the dependencies to the folks who download your code, making it possible for them to deal with the issue. It also tells you when you've got dependency version conflicts and provides a simple "exclude" mechanism for dealing with it.
Maven also removes the need to create distributions. You publish just your artifacts, with references to their dependencies. The Maven command then downloads the dependent artifacts from wherever they have been published.
EDIT
Obviously, if you are using HtmlUnit for production code (rather than just tests), then you need to pay more attention to security issues.
A similar thing has happened to me actually.
Two of my dependencies had the same 'transitive' dependency but a different version.
My favorite solution is to avoid "dependency creep" by not including too many dependencies. So, the simplest solution would be to remove the one I need less, or the one I could replace with a simple Util class, etc.
Too bad, it's not always that simple. In unfortunate cases where you actually need both libraries, it is possible to try to sync their versions, i.e. downgrade one of them so that dependency versions match.
In my particular case, I manually edited one of the jars, deleted the older dependency from it, and hoped it would still work with new version loaded from other jar. Luckily, it did (i.e. maintainers of the transitive dependency didn't remove any classes or methods that library used).
Was it ugly? Yes (yuck!), but it worked.
I try to avoid introducing frivolous dependencies, because it is possible to run into conflicts.
One interesting technique I have seen used to avoid conflicts: renaming a library's package (if its license allows you to -- most BSD-style licenses do.) My favorite example of this is what Sun did when they built Xerces into the JRE as the de-facto JAXP XML parser: they renamed the whole of Xerces from org.apache.xerces to com.sun.org.apache.xerces.internal. Is this technique drastic, time consuming, and hard to maintain? Yes. But it gets the job done, and I think it is an important possible alternative to keep in mind.
Another possibility is -- license terms abided -- copying/renaming single classes or even single methods out of a library.
HtmlUnit can do a lot, though. If you are really using a large portion of its functionality on a lot of varied input data, it would be hard to make a case for spending the large amount of time it would take to re-write the functionality from scratch, or repackage it.
As for the security concerns -- you might weigh the chances of a widely used library having problems, vs. the likelihood of your hand-written less-tested code having some security flaw. Ultimately you are responsible for the security of your programs, though -- so do what you feel you must.
I'm in the process of reviewing a code base (~20K LOC) and trying to determine how to migrate it from 1.4.2 to 5. Obviously, it's not an overnight project, and the suggestion which I have received is to write new code against Java 5 and migrate the old code in a piecemeal fashion. Also, I'm no expert in the new features in Java 5 (i.e. I know of them, but have never written any for production use).
My questions:
What features of Java 5 are typically used in production code? (i.e. generics, auto-boxing, etc.) Are there features to be avoided / not considered to be best-practices?
What are the best refactoring strategies which I can use migrate a code base of this size? (i.e. make changes to classes one at a time only when a class is edited, etc.) Objective - reduce risk on the code base. Limitation - resources to do refactoring.
Any advice is appreciated - thanks in advance.
UPDATE - a year too late, but better late than never? =)
Thank you for all of the comments - lots of great points of view. In the life of a software developer, there's always going to be the projects you strive to finish but never get around to because of something more "urgent".
With respect to the use of Java 5 (at that time), it was something which was required in the client's production environment, so that was why we did not use Java 6.
I found that the stronger typing for collections, enums and unboxing of primitives were the features I tended to apply the most, both to old and new code. The refactoring was fairly straightforward, but code comprehension improved significantly and standards became easier to enforce. The one I had the most trouble with was generics; I think it's a concept which I still haven't had a chance to fully grasp and appreciate yet, so it was difficult for me to find cases where the application of generics was appropriate.
Thanks again to everyone who contributed to this thread and apologies for the late follow up.
Java 5 is almost completely backwards compatible with Java 1.4. Typically, the only change you must make when you migrate is to rename any identifiers called enum in the old code, since enum is now a keyword.
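For example (an illustrative snippet), a 1.4 source file like this needs nothing more than a rename:

    public class LegacyCounter {
        int total(int[] values) {
            // Compiled fine under Java 1.4; fails under -source 1.5 because
            // "enum" became a keyword, so the whole fix is renaming the variable:
            // int enum = 0;
            int sum = 0;
            for (int i = 0; i < values.length; i++) {
                sum += values[i];
            }
            return sum;
        }
    }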
The full list of potential compatibility problems is listed here:
http://java.sun.com/j2se/1.5.0/compatibility.html
The only other one that I've run into in practice is related to the change in the JAXP implementation. In our case, it simply meant removing xerces.jar from the classpath.
As far as refactoring goes, I think that migrating your collection classes to use the new strongly-typed generic versions and removing unnecessary casting is a good idea. But as another poster pointed out, changing to generic collections tends to work best if you work in vertical slices. Otherwise, you end up having to add casting to the code to make the generic types compatible with the non-generic types.
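The change itself is mostly mechanical; a small before/after sketch with made-up names:

    import java.util.ArrayList;
    import java.util.List;

    public class GenericsMigration {
        // Before (1.4 style): raw collection, plus a cast at every read.
        String firstNameOld(List names) {
            return (String) names.get(0);
        }

        // After (Java 5): the type parameter carries the information, no cast needed.
        String firstNameNew(List<String> names) {
            return names.get(0);
        }

        public static void main(String[] args) {
            List<String> names = new ArrayList<String>();
            names.add("Ada");
            System.out.println(new GenericsMigration().firstNameNew(names));
        }
    }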
Another feature I like to use when I'm migrating code is the @Override annotation. It helps to catch inheritance problems when you're refactoring code.
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Override.html
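A small illustration with hypothetical classes: with the annotation in place, renaming or changing the signature of the superclass method turns a silent "accidental new method" into a compile error.

    class Shape {
        public double area() { return 0.0; }
    }

    class Circle extends Shape {
        private final double radius;

        Circle(double radius) { this.radius = radius; }

        @Override
        public double area() { // a typo like "are()" would now fail to compile
            return Math.PI * radius * radius;
        }
    }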
The new concurrency library is very useful if your code uses threading. For example, you may be able to replace home-grown thread pools with a ThreadPoolExecutor.
http://java.sun.com/j2se/1.5.0/docs/relnotes/features.html#concurrency
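For example, a home-grown pool of worker threads often collapses into a few lines (the pool size and the task below are placeholders):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class PoolExample {
        public static void main(String[] args) throws InterruptedException {
            // Executors.newFixedThreadPool returns a ThreadPoolExecutor under the hood.
            ExecutorService pool = Executors.newFixedThreadPool(4);
            for (int i = 0; i < 10; i++) {
                final int jobId = i;
                pool.execute(new Runnable() {
                    public void run() {
                        System.out.println("processing job " + jobId);
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }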
I would definitely take the approach of updating the code as you change it during normal maintenance. Other than the compatibility issues, I don't think there is a compelling reason to use the new Java 5 features unless you're already changing the code for other reasons.
There is one very real issue with the "viral" nature of generics; once you start introducing them at a given layer in an architecture you generally want to introduce it at the layer above & below as well. I have found that introducing generics is probably best done in full "verticals". But you do not have to do all the verticals at once.
This is a really hard question to answer because it depends on what code will be affected and how critical that code is.
First and foremost, when migration is a nontrivial undertaking, do yourself a favour and upgrade to the latest version of Java, which would be Java 6 not Java 5. Java 6 has been out for a year and a half or more and is mature. There's no reason to not pick it over Java 5 (imho).
Secondly, like any software project, your goal should be to get something into production as soon as you possibly can. So you need to identify a slice of your system; the smaller and the more non-critical, the better.
The other thing to do is just try starting up your app under Java 6 and seeing what breaks. It might be worse than you expected. It might be much better.
The other thing you'll probably need to be aware of is that by the sounds of it you will have jars/libraries in your app that have since been deprecated. Some may not even be compatible with Java beyond 1.4.2. You will probably want to upgrade all of these to the latest version as well.
This will probably mean more stuff breaking but using old/deprecated APIs is just kicking the can down the street and causes you other problems.
There are exceptions to this where upgrading can have far-reaching consequences. Axis1 to Axis2 comes to mind. Those situations require more careful thought.
As for what features are used... all of them pretty much. I can't think of any that should be avoided off the top of my head.
Also, I just noticed the size of your project: ~20K LOC. That's actually quite small (eg I've written an app about that size in the last 3 months by myself).
Lastly, this also depends on how easily you will find things that break. If you have good unit test coverage then great. That's pretty rare though. If you can just run through the app and reliably find problems it's not too bad.
The problematic situations are where scenarios are hard to test and it's likely you won't uncover problems straight away. That calls for more caution.
You would want to migrate stuff that doesn't work in the transition from 1.4 to 5 (not sure what that would be), but I'd be wary of migrating stuff for the sake of it.
If you do take this route, some questions:
Do you have comprehensive test coverage ? If not, you should write unit tests for the code you're going to be migrating.
Do you have components that are widely used within your codebase ? If so, they are probably candidates to be migrated in terms of their API (e.g. using generics etc.)
In terms of what's widely used from Java 5. Generics is important and makes your life a lot easier. I don't see autoboxing too much, nor enums (this is all relative). Varargs almost never. Annotations are useful for frameworks, but I consume these. I don't think I've ever implemented one myself.
20 (non-comment) kloc should be small enough to insert generics with a big bang. Obviously make sure your code compiles and runs on Java SE 5 first. The relatively easy thing about generics is that adding them makes very little change to semantics (certain overload resolutions can change because of the more specific static types; Iterator<char[]> iter; ... System.out.println(iter.next()); is a quick example off the top of my head).
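To spell that example out (a toy demo): generifying the iterator silently switches the call from println(Object) to println(char[]).

    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.List;

    public class PrintlnOverload {
        public static void main(String[] args) {
            List<char[]> words = Arrays.asList("hi".toCharArray());

            Iterator rawIter = words.iterator();   // pre-generics style
            System.out.println(rawIter.next());    // println(Object): prints something like [C@1b6d3586

            Iterator<char[]> typedIter = words.iterator();
            System.out.println(typedIter.next());  // println(char[]): prints hi
        }
    }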
In some cases adding generics will highlight conceptual problems with the code. Using one Map as two maps with disjoint key sets, for example. TreeMap is an example in the Java library where a single class has two distinct modes (using Comparator<T> or Comparable<T>).
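A hypothetical illustration of that first smell: no single Map<K, V> signature fits, which is the hint that the code really wants two maps.

    import java.util.HashMap;
    import java.util.Map;

    public class MixedMapSmell {
        // Pre-generics: user IDs map to names, product codes map to prices, all in one map.
        private final Map lookup = new HashMap();

        // After generification the two roles get their own, honestly typed maps.
        private final Map<Integer, String> userNames = new HashMap<Integer, String>();
        private final Map<String, Double> productPrices = new HashMap<String, Double>();
    }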
Things like enhanced-for and auto-boxing are very local and can be added piecemeal. enums are rarer and might take some thinking about how you are actually going to use them.
I think you're going about this the wrong way. Your plan shouldn't be to update all current code to Java 1.5, your plan should be to ensure that all current code runs exactly the same in 1.5 as it did in 1.4.2, and that all future code written will work fine in 1.5.
I've gone through a few transitions like this of varied sized code bases. The goal was always to make sure we had a ton of unit tests so that we could easily plug in 1.5 and run our tests through it. We actually encountered about 10 problems, mostly related to regular expression libraries not supporting something or supporting something differently.
Write all new code in 1.5 then, and if you change an older class for whatever reason, spend a minute and implement generics, but there's no reason to refactor everything. That sounds a bit dangerous to me if you don't have the tests in place.
What is classpath hell and is/was it really a problem for Java?
Classpath hell is an unfortunate consequence of dynamic linking of the kind carried out by Java.
Your program is not a fixed entity but rather the exact set of classes loaded by a JVM in a particular instance.
It is very possible to be in situations where the same command line on different platforms or even on the same one would result in completely different results because of the resolution rules.
There could be differences in standard libraries (very common). Libraries could be hidden by one another (and an older version may even be used instead of a newer one). The directory structure could mess up resolution. A different version of the same class may appear in multiple libraries, and the first one encountered will be used, etc. Since Java, by specification, uses a first-encountered policy, unknown ordering dependencies can lead to problems. Of course, since this is the command line and it is part of the spec, there are no real warnings.
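When you suspect the first-encountered rule is biting you, one quick diagnostic is to ask the JVM where a class was actually loaded from (a small sketch; pass the suspect class name as the argument):

    public class WhereFrom {
        public static void main(String[] args) throws ClassNotFoundException {
            // e.g. java WhereFrom org.apache.xerces.parsers.SAXParser
            Class<?> suspect = Class.forName(args[0]);
            java.security.CodeSource src = suspect.getProtectionDomain().getCodeSource();
            // getCodeSource() is null for classes loaded by the bootstrap loader (JRE classes).
            System.out.println(src == null ? "loaded from the JRE/bootstrap" : src.getLocation());
        }
    }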
It is very much still a problem. For example, on Mac OS the horrible support from Apple means that your machine ends up with several JVMs and several JREs, and you can never easily port things from place to place. If you have multiple libraries that were compiled against specific but different versions of other libraries, you could have problems, etc.
However, this problem is not inherent in Java. I remember my share of DLL hell situations while programming windows in the 90s. Any situation where you have to count on something in the file system to assemble your program rather than having a single well defined executable is a problem.
However, the benefits of this model are still great, so I'm willing to tolerate this hell. There are also steps in the right direction on Sun's side. For example, Java 6 allows you to simply specify a directory of jars rather than having to enumerate them.
BTW: Classpaths are also a problem if you are using an environment that uses a non-default classloader. For example, I have had a lot of problems running things like Hibernate or Digester under Eclipse because the classloaders were incompatible.
Classpath/jar-hell has a couple of escape hatches if they make sense for your project:
OSGi
JarJarLinks
NetBeans Module System - Not sure if this is usable outside of NetBeans
Others?
I think "classpath hell" refers to the time when the classpath of a Java app could only be set by using the CLASSPATH environment variable. This led to many applications requiring changes to the global system configuration (different for each OS), version conflicts between applications, and general confusion.
This is a somewhat more concrete example:
When two libraries (or a library and the application) require different versions of the same third library. If both versions of the third library use the same class names, there is no way to load both versions of the third library with the same classloader.
Take a look at http://en.wikipedia.org/wiki/Java_Classloader#JAR_hell for more examples.
There's a lot of good stuff at http://mindprod.com/jgloss/classpath.html and http://java.sun.com/javase/6/docs/technotes/tools/windows/classpath.html
I've only had issues with classpaths when I am not setting it myself using -cp. Trying to figure out how third-party software sets its classpath can be a pain at times.