How to unit test for backward and forward compatibility?

How to unit test for backward and forward compatibility? - java

I am working on developing an Plug-In API that uses Java serialization. The idea is similar to SmallTalk's system images. I was wondering how would to best to automate testing for whether changes I am making will break deserialization since some changes seem to be innocuous like adding a method to an interface that is implemented (as long as that is not called, otherwise it will result in a AbstractMethodException).
Yes, this is more for an experimental spike rather than production code so please do not suggest not using serialisation.

For backward compatibility of data, keep a lot of old messages in binary form, and see if you can still deserialize them with the new code.
For backward compatibility of code, you'll need some way of building your old code (e.g. one version per release) and testing that against data created from the newest version of the code. This is a slightly more challenging problem - you may want to build a small test jar on each appropriate release, and put that into source control at the same time to avoid having to build the same code again and again. Your tests would then try all the different jar files against the output of the new code.
To be honest, this all sounds like quite a lot of work for an experimental spike. For real work I'd just use protocol buffers of course :)

Related

Run simultaneously multiple versions of same feature

Context
Running multiple version of the same library seems to be a usual need and there are many questions for this when dealing with versioned jar dependencies.
However, I have another constraint here: my code is part of a rolling-release MOAB where code has no version. I cannot depend on a older version of a library from the MOAB.
The use case of that question is being able to load different versions of the same code at runtime for compatibility.
Eg: GET /my/api/call?compat_version=42
I have to be able to provide several compatibility versions (ie code from version x that have not been changed). This must be the actual code that was running when version x was the current/latest version and not any kind of retrocompatibility trick.
Naive solution
The "obvious" way seems to duplicate the code for each version. For instance by having per-version packages:
com.me.thing.v1
com.me.thing.v2
com.me.thing.v3
...
and dynamically loading the code from the associate package upon the provided compat_version parameter by whatever technique. Let's suppose for know that all those versions share a common interface (API).
Challenge
I'd like to challenge that and maybe find a better option than the naive solution.
Since using the exact code from version x is a prerequisite, I don't believe I can get rid of the copy-paste (but please, tell me I'm wrong).
What technique would you suggest as a simple (but not necessarily easy) and robust implementation? Reflection? Dependency injection?
Is there a "good" pattern for doing such things? Is there any literature on that?

This was already an old problem when Java was developed, hence Sun's emphasis on binary compatibility (which existed for Solaris of course as well). This is the original guarantee offered by the platform- that you can upgrade the bits underneath and applications will continue to work, unmodified.
The way to run legacy code in the JVM world is to run the full legacy application.
Many segregation architectures have of course been developed over the years and reached various levels of maturity- like OSGI and many others before it- but there are edge cases upon edge cases and many failure modes.
Do not futz with multiple code versions within a single JVM. It was never a design goal and in environments where it matters only leads to pain.

How to remove java packages from jdk

I want to keep only java util, io, nioand math packages and want to remove all other packages like java.sql and others from my JDK.
How can I remove them?
So if I write some program which import removed packages it will give
error package doesn't exist.

Use a SecurityManager instead of hacking the JDK

I'm going to give you the best answer I can.
Why you really shouldn't be doing what you want to do
When you're writing code, it is commonly agreed to develop that code in a way that is extendable. That is, your code can be plugged into other applications, or it can be changed and added to, very easily. Now with that principle in mind, let's review what happens when you delete the possible functionality of your program. Say you delete the SQL package, and in the future, you want a backend database to provide some persistence in your program. You're screwed.
The idea of Java, in fact I'd go as far as to say the major advantage of Java, is it's commonality, consistency and standardization of patterns. A getter is always a getter. A variable (that isn't a constant) starts with a lower case letter. Classes have a standardized way of being structured. All these things make developing in Java quite intuitive.
The JDK is part of that consistency, and to edit it is to really impact one of the major points of Java. It sounds like you want to implement your program in a different, more compact language.
Finally, you have no idea how the client may want to extend your project in the future. IF you want to have some repeatable business from the client, and generate a good reputation at the same time, you want to design your code with good design practise in mind.

There is no such tool, AFAIK.
Removing stuff from the Java libraries can be technically tricky, 'cos it can be difficult to know if your code might directly or indirectly use some class or method.
There are potentially "licensing issues" if you add or remove classes from a JRE installer, and ship it to other people.
Concerning your proposed use case.
If you are building this as a web application, then you are going to have a lot of difficulty cutting out classes that are not needed. A typical webapp server-side framework uses a lot of Java SE interfaces.
If you accepted and ran code someone who wanted to try and bring down your service, they could do it without using only the Object class. (Hint: infinite loops and filling the heap.) Running untrusted code on your server is a bad idea. Period.
Think about the consequence for someone trying to run legitimate code on your server. Why shouldn't they be allowed to use library classes / methods? (I'd certainly be a bit miffed if I couldn't use "ordinary" library classes ...)
My advice would be reconsider if it was a good idea to implement such a service at all ... given the risks, and the difficulty you could have if your safeguards were ineffective. If you decide to proceed, I advise running the untrusted code within the JVM in a security box. As a second level of defence in case Java security is compromised, I'd recommend running the service "chrooted" or better still in an isolated virtual machine that can be turned off if you run into problems.

How to assess the risk of a java version upgrade?

I'm being asked to assess whether we can safely upgrade the java version on one of our production-deployed webapps. The codebase is fairly large and we want to avoid having to regression test everything (no automated tests sadly), but we've already encountered at least one problem during some manual testing (XmlStringReader.getLocalName now throws an IllegalStateExeption when it just used to return null) and higher-ups are pretty nervous about the upgrade.
The current suggested approach is to do a source compare of the JDK sources for each version and assess those changes to see which ones might have impact, but it seems there's a lot of changes to go through (and as mentioned the codebase is kinda large). Is it safe and easier to just review the java version changes for each version? Or is there an easier way to conduct this assessment?
Edit: I forgot to mention the version upgrade being considered is a minor version upgrade, i.e. 1.6.10 to 1.6.33

Nothing is going to replace testing it in a real system. you may be able to catch something blatant in a bug report or in visual inspection, but detecting changes due to more complex interactions will be impossible. or even detecting seemingly simple changes which alter how the GC affects your running app or how hotspot optimizes your code (you are checking the c++ code too, right) or how some key algorithm performs...

As #jtahlborn says: Nothing will replace testing it properly.
I would go further and state that without automation then this is cost a you will occur again and again.
The correct answer is to
Define a regression
Run it (verify it)
Automate as much as possible as you go through it
A simpler scenario is to simply run it and catch the errors are you or your customers find time. Personally I think this is good way to get demotivated developers, managers and customers. I strongly recommend you don't use this approach.

Should I be concerned with large number of dependencies?

I was just about to include the HtmlUnit library in a project. I unpacked the zip-file and realised that it had no less than 12 dependencies.
I've always been concerned when it comes to introducing dependencies. I suppose I have to ship all these dependencies together with the application (8.7 mb in this particular case). Should I bother checking for, say, security updates for these libraries? Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another library I need, depends on a different version of xalan?
The task HtmlUnit solves for me could be solved "manually" but that would probably not be as elegant.
Should I be concerned about this? What are the best practices in situations like these?
Edit: I'm interested in the general situation, not particularly involving HtmlUnit. I just used it here as an example as that was my current concern.

Handle your dependencies with care. They can bring you much speed, but can be a pain to maintain down the road. Here are my thoughts:
Use some software to maintain your dependencies. Maven is what I would use for Java to do this. Without it you will very soon loose track of your dependencies.
Remember that the various libraries have different licenses. It is not granted that a given license works for your setting. I work for a software house and we cannot use GPL based libraries in any of the software we ship, as the software we sell are closed source. Similarly we should avoid LGPL as well if we can (This is due to some intricate lawyer reasoning, don't ask me why)
For unit testing I'd say go all out. It is not the end of the world if you have to rewrite your tests in the future. It might even be then that that part of the software is either extremely stable or maybe not even maintained no more. Loosing those is not that big of a deal as you already had a huge gain of gaining speed when you got it.
Some libraries are harder to replace later than others. Some are like a marriage that should last the life of the software, but some other are just tools that are easily replaceable. (Think Spring versus an xml library)
Check out how the community support older versions of the library. Are they supporting older versions? What happens when life continues and you are stuck at a version? Is there an active community or do you have the skill to maintain it yourself?
For how long are your software supposed to last? Is it one year, five year, ten year or beyond? If the software has short time span, then you can use more to get where you are going as it is not that important to be able to keep up with upgrading your libraries.

It could be a serious issue if there isn't a active community which does maintain the libraries on long term. It is ok to use libraries, but to be honest you should care to get the sources and put them into your VCS.

Should I bother checking for, say, security updates for these libraries?
In general, it is probably a good idea to do this. But then so should everyone upstream and downstream of you.
In your particular case, we are talking about test code. If potential security flaws in libraries used only in testing are significant, your downstream users are doing something strange ...
Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another library I need, depends on a different version of xalan?
Ah yes. Assuming that you are building your own classpaths, etc by hand, you need to make a decision about which version of the dependent libraries you should use. It is usually safe to just pick the most recent of the versions used. But if the older version is not backwards incompatible with the new (for your use case) then you've got a problem.
Should I be concerned about this?
IMO, for your particular example (where we are talking about test code), no.
What are the best practices in situations like these?
Use Maven! It explicitly exposes the dependencies to the folks who download your code, making it possible for them to deal with the issue. It also tells you when you've got dependency version conflicts and provides a simple "exclude" mechanism for dealing with it.
Maven also removes the need to create distributions. You publish just your artifacts with references to their dependents. The Maven command then downloads the dependent artifacts from wherever they have been published.
EDIT
Obviously, if you are using HtmlUnit for production code (rather than just tests), then you need to pay more attention to security issues.

A similar thing has happened to me actually.
Two of my dependencies had the same 'transitive' dependency but a different version.
My favorite solution is to avoid "dependency creep" by not including too many dependencies. So, the simplest solution would be to remove the one I need less, or the one I could replace with a simple Util class, etc.
Too bad, it's not always that simple. In unfortunate cases where you actually need both libraries, it is possible to try to sync their versions, i.e. downgrade one of them so that dependency versions match.
In my particular case, I manually edited one of the jars, deleted the older dependency from it, and hoped it would still work with new version loaded from other jar. Luckily, it did (i.e. maintainers of the transitive dependency didn't remove any classes or methods that library used).
Was it ugly - Yes (Yuck!), but it worked.

I try to avoid introducing frivolous dependencies, because it is possible to run into conflicts.
One interesting technique I have seen used to avoid conflicts: renaming a library's package (if its license allows you to -- most BSD-style licenses do.) My favorite example of this is what Sun did when they built Xerces into the JRE as the de-facto JAXP XML parser: they renamed the whole of Xerces from org.apache.xerces to com.sun.org.apache.xerces.internal. Is this technique drastic, time consuming, and hard to maintain? Yes. But it gets the job done, and I think it is an important possible alternative to keep in mind.
Another possibility is -- license terms abided -- copying/renaming single classes or even single methods out of a library.
HtmlUnit can do a lot, though. If you are really using a large portion of its functionality on a lot of varied input data, it would be hard to make a case for spending the large amount of time it would take to re-write the functionality from scratch, or repackage it.
As for the security concerns -- you might weigh the chances of a widely used library having problems, vs. the likelihood of your hand-written less-tested code having some security flaw. Ultimately you are responsible for the security of your programs, though -- so do what you feel you must.

Eclipse as an IDE - What do you find missing as a beginner in Java?

I am working on a solution that aims at solving problems that newbie programmers experience when they are "modifying code" while bug fixing / doing change requests, on code in production. Eclipse, as we all know is a great IDE. Features such as Code Completion, Open Declaration, Type Hierarchy, Package Explorer, Navigator, Finding References etc aids people in fixing things quicker compared to say using something like Textpad.
If you are a newbie java programmer and you are using Eclipse IDE, what areas of the Eclipse IDE do you think were less helpful/ less intuitive? If you are a seasoned programmer, what are the common issues that newbies look up to you to solve for them?
Please ignore issues related to : Domain Expertise (Business Knowledge), Infra( where to test your change etc), performance related (eclipse search being slow,etc), Skill level in a particular language (think of the developer as a noob) ... and think one language - Java
I did a local survey in my small team and here are some:
Newbies using Eclipse to handle code that is written to interfaces where the implementation is supplied at runtime. Doing a 'Open Declaration' will always show you an interface. This could be confusing at times.
Eclipse is not intuitive while developing EJBs. Sure, you know all you have to do to create a new bean is to right click and 'Create Bean', however, once created it shows no contextual help to what the next step should be. For instance, generating stubs.
When Data Source Mapping with entity beans, changing something screws up the entire flow of things and eclpise never complains / hints.
Developing applications that make use of Struts, eclipse doesn't tell you that when you change struts-config.xml, particular web flow would get affected.
At this point, to me, as someone who is interested in collecting opinions for my research, it appears as if Eclipse could use more 'contextual runtime hints'.
I am sure the community would have a lot more to add... Please add more of your negative experiences (just from the code change perspective).
EDIT:
I guess, my question was too lengthy and confusing. I am gonna rephrase it a bit and keep it short:
While "making a code change" (not analogous to code formatting, infra related activities, CVS etc... say something like refactoring), what feature(s) of eclipse IDE did you not like / hate the most? Here are the examples:
When modifying code that has been written to interfaces: 'Open Declaration /F3 on an object instance shows you the interface when the implementation is supplied at runtime'.
When changing apps using EJBs: No contextual help
When changing apps using MVCs(Spring / Struts) : No warnings about change impact.

Missing in Eclipse are:
Software visualization, as for example System Complexity View [Lanza 2003]
And also by Lanza, the Class Blueprint [Ducasse 2005]
Post Scriptum: Software visualization in Eclipse: X-Ray provides System Complexity View of Java projects, http://xray.inf.usi.ch/xray.php (via #anjaguzzi and Paul Lammertsma)
And then collaborative filtering "other developers that edited this method before also edited" [Zimmermann 2005]
And the collection of browsable examples, and autocompletion at the level of these examples. That is, for example if your write
ByteBuffer buf = file.
and hit autocompletion it should search the codebase and the interwebs for examples that convert files to bytebuffers and insert that 10-20 lines there.
Parseweb supports developers by recommending method invocation sequences that yield a required
destination data type from given input parameter types. http://doi.acm.org/10.1145/1453101.1453129
Prospector supports developers by recommending method invocation sequences that yield a required
destination data type from given input parameter types.http://doi.acm.org/10.1145/1064978.1065018
Strathcona provides source code examples and structural con-
text for the code fragment under development. http://lsmr.cpsc.ucalgary.ca/papers/holmes-icse-2005.pdf
Rascal recommends how and when to call the methods of objects from common libraries such as Java Swing, based on an analysis of existing classes. It uses collaborative filtering. http://dx.doi.org/10.1007/s10462-005-9012-8
And of course also the feature that I can write a Unit test and then the IDE searches the interwebs for classes that pass the test. Yes, this can be done!
CodeGenie is an Eclipse plugin that allows you to write unit tests and then uses the Sourcerer source code search engine to find passing classes. http://doi.acm.org/10.1145/1529282.1529384
CodeConjurer which is based on Merobase also offers that feature, see http://dx.doi.org/10.1109/MS.2008.110
This list could go on and on, good starting points for more work are the proceedings of past
Conference on Mining Software Repositories (MSR)
Workshop on Search-driven Software Engineering (SUITE)
Workshop on Recommendation Systems for Software Engineering (RSSE)
which are all under the umbrella of the ICSE conference.

"newbie issues" I've seen myself (I've used Eclipse for a good while, but it keeps "surprising" me occasionally) and helping colleagues just starting to use Eclipse:
It's large and complex enough to be very intimidating to some at first. Seems people consider netbeans easier to use initially. One colleague took refuge with the VI editor for a god while...
Installing plugins can be tricky (finding site URLs, awareness of plugins, why is "install"+"update" under the Help menu???)
Updates are still slow (but much better than before) with Eclipse 3.5/Galileo. It's difficult to understand which plugins to install just by their name sometimes.
Any platform besides Mac - preferences under the Window menu seems illogical?
Understanding how to set the project class path neatly. Setting the right project JDK version.
Lack of or unexpected interaction between ant/maven build tools' classpath and that of eclipse's (ant/maven clean causes Eclipse compiler errors when classpath is shared etc.).
Views and (large number of) perspectives are confusing/overwhelming at first. Which are useful when? How to drag views to the desired location or restoring closed ones?
Some JDK/Eclipse version combinations required too much PermGen space than available by default, took a while to diagnose.

For me, most of the newbie problems in Eclipse come from one of it's strengths, its configurability & plugin structure.
When I need to change a property in Eclipse, I always seem to have to spend a few minutes working out where to change it. Example: changing the Java editor to insert 4 spaces instead of a tab. The search bar in the properties is always welcome :-)
That and the lack of documentation for some of the plugins always makes for fun when I'm setting up a project.
EDIT: You can always show the classes that implement an interface using ctrl-T.
One thing I would add is that when I have a complex project, I tend to use Refresh & Project->Rebuild All *a lot". And I use TortoiseSVN to maniuplate stuff outside of Eclipse, because a lot of times this is easier (some refactoring for instance). However, if I'm modifying the project outside of Eclipse, I *always" quit Eclipse, and do a full refresh and build when I restart it. Otherwise Eclipse gets very confused sometimes.

I think the biggest problem I faced (and still face) with Eclipse is that it isn't particularly aware of standard technologies that surround modern Java development. If I'm developing an application, it might include the following:
Spring
Maven
JSF/Struts 2
Subversion
JUnit
I think Eclipse handles those technologies in increasing levels of awareness: (so JUnit will be fine, it works out of the box; Subversion requires Subclipse, and it's a little ropier than the CVS support; JSF needs some WTP tooling to be installed; Maven...you're probably best off setting up your own external tools commands rather than trust M2Eclipse, unless it's become dramatically better in recent times; and Spring, well, as you say, try ctrl-clicking on a method and you'll almost certainly get an interface, because the implementation is hidden away behind a Spring config file).
Getting all of that to play together and check out/compile, then later compile/run tests/check in is the difficult bit. The code change itself is probably easy :)

For me, the biggest hurdle to learning to use eclipse effectively was understanding where to set the classpath and also how to figure out exactly what is included on the classpath for various stages of development (compile, build, test). I was confused for a long time about the difference between compile time, debug configurations, and run configurations classpaths. Then if you throw ant into the mix (which automatically creates a run configuration ) it makes it even more confusing for newbies.

As a beginner, I didn't do EJB or Struts stuff. Or even data source mapping. So I think the question's title may be a little misleading.
I would have appreciated having something like JadClipse built in to "look at" library code when I hit it in debugging or such. But it should be made VERY clear that this is "reconstituted code" and not meant to be hacked around in.
Second, noobs need to be made much more aware that Shift-F2 will get them the API documentation for whatever class/method they're looking at. I know too many novice Java programmers who explore their APIs with nothing more than code completion; they're missing many valuable hints provided by the library authors.

A mindreader which generates code on the fly, so that a single click is sufficient to complete a project.

I found Visual Studio easy to pick up, I tried clicking on each button at least once, and figured out what the whole thing does. It's thought out by a single design team at the highest level, and everything follows the standards top to bottom, more or less.
Then, I play with Eclipse. In technical terms, it's janky at best. Look at the preferences dialog; it's an overwhelming trainwreck, unless you already know exactly what you're looking for, and what the developer working on that feature decided to call it.
Eclipse's configurability relies on the fact you already know how to configure it. The learning curve there is awful, and the only saving grace is that most of the defaults are okay to begin with.

I was a noob to eclipse recently, mostly doing Android and BlackBerry stuff. And one thing that eludes me to this day is the massive multitude of options and settings and various places they can be found in. For instance, if you have a plugin installed (say BlackBerry plugin), the setting might be found in the general prefs or BlackBerry prefs or the project prefs.
It's always a hunt.

Here are basic missing features as far as I know :
Show the beginning '{' when you are at the end '}' where beginning '{' is out of the view
Automatically synchronize the editor with package explorer
Go to different view (package explorer, outline, etc) with keyboard
Inline find, which does not open a dialog.
Go to next error location with keyboard
Go to next/previous structure with keyboard
More stability in general.
These features work great in IntelliJ. Especially #1 and #5 are really useful.

VCS integration - typically the developer is also new to merging changes, keeping working copies in sync, resolving conflicts and so on. There are often several ways of achieving the same thing in Eclipse and I've seen this cause confusion on several occasions (actually, I've seen this with experienced developers too; they know Subversion but not Eclipse, and the latter tries to 'hide' the underlying repository operations).

I think the issue with all IDEs for beginners is the disconnection from the tool chain: how the compiler takes some source and compiles it to bytecode which I then run using the VM with a correctly configured classpath.
As a dev I love the fact I don't need to deal with this - and I've never found myself unable to do what I want concerning more complex build configurations - but it's too important for beginners to understand what's happening when you press that Play button to be ignored.

when i first tried using eclipse i absolutely hated it's coplexity.. you had to do a whole bunch of things befor you were able to start working. furthermore you have way to many options to check and it's not always selfexplaining what each button does.
instead i started using netbeans. way more intuitive and easier to handle. check there gui.. not to much buttons and most of the time you know what the button does even if you have no clue of java (as i had at that time).
when i changed back to eclipse (due to some features not supported in netbeans) it seemed far easier to work with it. so some part of the gui might just be added in a not intuitive way and beginners will definitly have a hard time with it.

simplicity
clearness
consistency
I might write it much more in detail, but i think, that eclipse is overweight and much too much oriented on features - instead of ease of use. IMO, this concerns beginners as well as professionals.

Eclipse has no visual designer for Swing components.
Compare that to Visual Studio, where:
click 'new form'
drag buttons and text boxes onto the form, move them around, add some labels
double click on a button, add some code
done, one quick application, show it to the boss, get paid/promotion/coffee break
In Eclipse you either have to use Netbeans instead (ie not Eclipse...), or use IBM's SWT, or code the Swing forms by hand.
I feel it would be very nice to have a great wysiwyg forms designer in Eclipse for Swing forms.

The other confusing part is that for web development there is a separate version that needs to be used! Well I didn't like this at all and I know that a lot of people just don't about this.

Eclipse is missing Maven Embedded in standard distribution , Maven would help any user in getting their program all the jars and better library management .
Netbeans already has this tools.
Also eclipse misses integrated tools for hibernate , spring , xfire and tomcat deployment using maven.
Check this site http://maven.apache.org/

All of my other problems with Eclipse have already been mentioned except one: It's slooooooooow. Was their goal to prove the "Java is slow" people right? I'm guessing this is related to the "Eclipse does everything", but I stopped using it because it lags every time I click on anything. Change tabs? Lag. Open preferences? Lag. Change tabs in preferences? Lag. It's like using Photoshop with 32 Mb of memory.
Oh and it's incredibly ugly. I wish I could get it with real GTK+ integration.

The android SDK integration is full of jank. xml layouts don't render correctly. Code completion doesn't work well. IBM needs to fix the jank for sure.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.