Java - Access private class

I know how to access private variables, but I have the following class that I'm trying to test:
ProcessStatusResult:
@Getter
class ProcessStatusBody {
    public ProcessStatusBody(ProcessStatus status) {
        this.status = status;
    }
    ProcessStatus status;
}
@Getter
public class ProcessStatusResult {
    ProcessStatusBody body;
    ...
    public ProcessStatusResult(ProcessStatus status) {
        body = new ProcessStatusBody(status);
        ...
    }
}
In my test, I need to get the ProcessStatus inside ProcessStatusBody to validate it, but I have no idea how to do that.
Is there a way to use reflection (or some other method) to access this without having to add a getter in ProcessStatusResult for the sake of the test alone?

There are broadly speaking 3 different ways to tackle this problem.
The common way - (ab)use package private.
It's common to put (unit [1]) tests in the same package as the code they test: a different 'source root', but the same package. You'd have:
src/main/java/com/foo/Whatever.java
src/test/java/com/foo/TestWhatever.java
When you do this, your test code can just 'see' all package private stuff (because, same package).
Some projects would like to highlight this a bit and have an annotation that indicates 'this item is package private for the sake of test code; we would have made it private otherwise', and if you really want to go all-out, you add linter tools that detect any attempt to interact with entities so marked from other source files in the same package (not the test code) and flag it as an error.
These tools are, in my experience, unfortunately not commonly used, and I'm not readily aware of linter tools that do this. A somewhat common annotation for this is Guava's @VisibleForTesting. It doesn't do anything except serve as documentation (for now; one could write a linter tool that actually checks for style violations, of course).
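As a rough sketch of the package-private approach: the snippet below declares a project-local marker annotation so it stays self-contained. That annotation, and the StatusBody / StatusResult classes, are hypothetical stand-ins (Guava's real annotation lives in com.google.common.annotations, and a String replaces ProcessStatus here).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical project-local stand-in for Guava's @VisibleForTesting.
// It is documentation only and changes nothing about access rules.
@Retention(RetentionPolicy.SOURCE)
@Target({ElementType.FIELD, ElementType.METHOD, ElementType.CONSTRUCTOR, ElementType.TYPE})
@interface VisibleForTesting {}

// Stand-in for ProcessStatusBody.
class StatusBody {
    @VisibleForTesting
    final String status; // package-private instead of private, so same-package tests can read it

    StatusBody(String status) {
        this.status = status;
    }
}

// Stand-in for ProcessStatusResult.
class StatusResult {
    @VisibleForTesting
    final StatusBody body;

    StatusResult(String status) {
        this.body = new StatusBody(status);
    }
}
```

A test class in the same package can then simply read result.body.status, with full IDE support and compile-time checking.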
The uncommon way, reflection and mocking
Test frameworks can do this, but it severely hampers productivity. With direct access, if you want your status variable, all you need to do is type body. and hit CTRL+SPACE or equivalent in your IDE and it'll be shown right there; you can just select it, you can CMD+click it once you've typed or auto-completed it to be taken straight to its definition, and if you typo the name, your IDE will red-wavy-underline it.
Replace body.status with something like reflect.getField(body, "status", ProcessStatus.class); and it's a heck of a mess. It's long, ugly, and mismatches generics to types (what if status is of type List<String>? List<String>.class isn't valid Java). You could use reflection hackery to allow you to assign the result of reflect.getField to anything you please, but now you have no type safety. If you mess up the type, you won't know until you run it. If you mess up the generics, you may not ever know. Ouch.
Hence, this isn't great either, and is consequently not used often.
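To make that trade-off concrete, here is a minimal sketch of what such a helper might look like. The Reflect class and its getField signature are hypothetical (they just mirror the reflect.getField call above, not any real library), and Body is a stand-in for ProcessStatusBody with a String in place of ProcessStatus.

```java
import java.lang.reflect.Field;

// Hypothetical helper, named after the reflect.getField(...) call above.
final class Reflect {
    private Reflect() {}

    // Reads a field declared on the target's class (private or not) and casts it.
    static <T> T getField(Object target, String name, Class<T> type) {
        try {
            Field field = target.getClass().getDeclaredField(name);
            field.setAccessible(true);
            // A wrong 'type' argument only shows up here, as a ClassCastException.
            return type.cast(field.get(target));
        } catch (ReflectiveOperationException e) {
            // A typo in the field name also only surfaces at run time.
            throw new IllegalStateException("Cannot read field '" + name + "'", e);
        }
    }
}

// Stand-in for ProcessStatusBody; the field is private on purpose.
class Body {
    private final String status;

    Body(String status) {
        this.status = status;
    }
}
```

Reflect.getField(body, "status", String.class) works, but Reflect.getField(body, "statsu", String.class) compiles just as happily and fails only when the test runs, which is exactly the productivity cost described above.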
The nuanced way: That's not how it works.
private perhaps just means private. It's not meant for consumption by anything 'external to the source file', and that includes the unit tests. No source file consists solely of 100% private everything: there is some API you're supposed to plug into from 'outside', and test code can get at everything except private (so package private, protected, and public are all accessible by test code). Test the public API parts, don't test the private stuff. After all, the normal idea behind private is that you're free to mess with these as much as you please, change what they do, change their signatures; all you need to check is what impact that has based solely on what else is in the source file you are looking at.
There is benefit in not having to extend that to: "... oh and you also need to revisit all the unit tests that interact directly with this".
Make an actual public-ish API (package private if you wish, but you write it with the actual intent that code from outside this source file will want to use this feature) and test that. Or, if truly this field is not a detail that even other source files in the same package are meant to know about then do not test it - it's too fine grained a test.
There are upsides and downsides to testing at such incredibly fine grained levels. Generally, that level of fine-grainedness makes tests simpler and more 'localized' (any error the tests catch will be pointing right at the actual problem), but going that deeply means you need a heck of a lot of tests, most tests turn into the trivial, and it creates significant resistance to refactoring - after all, even a simple refactor requires modifying a lot of tests as well. You're also risking "Copycat syndrome", where the same programmer writes both the unit test and the code that is being tested, and any mental error is therefore likely to exist in both, thus making the test completely useless.
A solution to copycat syndrome is to disentangle the writing of the test with the writing of the code. This is obviously easy to do when one person writes the test and another writes the implementation, but you can do it on your own - just make sure there's time and a 'mental break' between writing tests and writing the implementation, enough to prevent your brain from replaying the exact same reasoning when writing either (which is the likely path to making the same error twice in a row without realizing).
However, when you disentangle like this, it becomes really weird and unwieldy to test with the granularity of 'testing the private stuff' - testing like this tends to follow the notion of 'there is some API spec of sorts and that is what we test', and private stuff by definition isn't part of such a spec.
Whether you like to write the tests first, or the tests last - the same principle applies. It's bizarre to write tests and, during the writing of them, already start cooking up the private methods that you envision will be required to implement it: That's not your 'job' as test writer.
It's similarly strange (slightly less so, perhaps) to write tests that test implementation choices.
SO isn't intended for opinion so I won't spend more than two sentences on it, but my strong suggestion is that third option. private means: Not to be tested directly.
[1] Really, any test where the scope of what it tests is clearly restricted to some 'unit' of code that, under your project's code style rules, is required to be in a single package. The two concepts ('what do I stuff in packages' and 'what does this test') need to at least line up, or 'what do I test' needs to be more fine-grained than that. In other words, integration tests that test the entire project don't get to use this strategy at all, but then if those are testing private variables, I'm pretty sure you're doing something wrong.

Well, here you have some working code using reflection:
// Requires: import java.lang.reflect.Field;
// Assuming you have the objects
ProcessStatusResult processStatusResult = new ProcessStatusResult(new ProcessStatus());
ProcessStatusBody processStatusBody = processStatusResult.getBody();
try {
// The status field, you know the name, right?
Field privateField = ProcessStatusBody.class.getDeclaredField("status");
// Set the accessibility to true since it is not visible
privateField.setAccessible(true);
// Here we go
ProcessStatus status = (ProcessStatus) privateField.get(processStatusBody);
System.out.println(status);
} catch (NoSuchFieldException | SecurityException | IllegalArgumentException | IllegalAccessException e) {
// do something
}
Sorry, I didn't realize you said: "Is there a way to use reflection..."

Related

Converting a public method into a private method

I recently refactored some code, merging a public method that was only ever used in conjunction with another public method into a single call.
public class Service {
    public String getUsername() {
        return SecurityContext.getName();
    }
    // return type assumed to be String
    public String getIdentityUserIdByUsername(String username) {
        return db.getUser(username).getId();
    }
}
This was being used in a few other classes as service.getIdentityUserIdByUsername(service.getUsername()), which seemed redundant. A new method was created combining the two calls.
public String getIdentityUserId() {
    return getIdentityUserIdByUsername(getUsername());
}
The getIdentityUserIdByUsername() method is still being used in other classes without the need for getUsername(). However, the getUsername() method is no longer used in other classes.
My example is much simpler than the real implementation. The method has test coverage that was a bit awkward to write (mocking static classes without Powermock, a bit of googling, etc.). It's likely we will need the getUsername() method in the future, and the method will not change.
It was suggested in code review that the getUsername() method should now be private, since it is not called anywhere else. This would require the explicit tests for the method to be removed or commented out, which means either repeated effort to rewrite them later or ugly commented-out code left behind.
Is it best practice to change the method to private or leave it public because it has explicit coverage and you might need it in the future?
Is it best practice to change the method to private or leave it public because it has explicit coverage and you might need it in the future?
IMO, you are asking the wrong question. So-called "best practice" doesn't come into it. (Read the references below!)
The real question is which of the alternatives is / are most likely to be best for you. That is really for you to decide. Not us.
The alternatives are:
You could remove the test case for the private method.
You could comment out the test case.
You could fix the test case so that it runs with the private version of the method.
You could leave the method as public.
To make a rational decision, you need to consider the technical and non-technical pros and cons of each alternative ... in the context of your project. But don't be too concerned about making the wrong decision. In the big picture, it is highly unlikely that making the wrong choice will have serious consequences.
Finally, I would advise against dismissing options just because they are a "code smell". That phrase has the same issue as "best practice". It causes you to dismiss valid options based on generalizations ... and current opinions (even fashions) on what is good or bad "practice".
Since you want someone else's opinion ("best practice" is just opinion!), mine is that all of the alternatives are valid. But my vote would be to leave the method as public. It is the least amount of work, and an unused method in an API does little harm. And as you say, there is a reasonable expectation that the method will be used in the future.
You don't need to agree with your code reviewer. (But this is not worth making enemies over ...)
References:
No Best Practices by James Bach
There is no such thing as "Best Practices": Context Matters. by Ted Neward.
It can make sense to want to test private methods. The industry standard way to do this, which has quite some advantages, is this:
Ensure that the test code lives in the same package as the code it tries to test. That doesn't mean same directory; for example, have src/main/java/pkg/MyClass.java and src/test/java/pkg/MyClassTest.java.
Make your private methods package private instead. Annotate them with @VisibleForTesting (from Guava) if you want some record of this.
Separately from this, the entry space for public methods (public in the sense of: this is part of my API and defines the access points where external code calls my code) is normally some list of entry points, if you have such a list at all. More often there is no such definition. One could say that all public methods in all public types implicitly form the list (i.e. that the keyword public implies that it is for consumption by external code), which then by tautology decrees that any public method has the proper signature. Not a very useful definition. In practice, the keyword public does not have to mean 'this is API accessible'. Various module systems (such as Jigsaw or OSGi) have solutions for this, generally by letting you declare certain packages as actually public.
With such tooling, 'treeshaking' your public methods to point out that they need no longer be public makes sense. Without them... you can't really do this. There is such a notion as 'this method is never called in my codebase, but it is made available to external callers; callers that I don't have available here, and the point is that this is released, and there are perhaps projects that haven't even started being written yet which are intended to call this'.
Assuming you do have the tree-shaking concept going, you can still leave them in for that 'okay maybe not today but tomorrow perhaps' angle. If that applies, leave it in. If you can't imagine any use case where external code needs access to it, just delete it. If it really needs to be recovered, hey, there's always the history in version control.
If the method is a public static then you can leave it as is, because there is no impact from it being public. It is a side-effect-free method; exposing it will never cause any harm.
If it is an object-level public method, then:
1) Keep it if it is like an API: it has well-defined input and output, delivers well-defined functionality and has tests associated with it. Its being public doesn't harm anything.
2) Make it private immediately if it has side effects. If it causes other methods to behave differently because it changes the state of the object, then its being public is harmful.

How to test something like a converter

I have a question regarding testing classes like a converter.
Let's say I have a converter from EntityA to EntityB. The converter looks like this:
public EntityB convert(EntityA a) {
    // call internal methods
    return b;
}
private xy internalMethod1(...) {
    // call other internal method
}
private xy internalMethod2(...) {
    ....
}
private xy internalMethod3(...) {
    ....
}
private xy internalMethod4(...) {
    ....
}
The converter has one public method and 4 internal methods to convert the entity.
How should I test it?
Option 1
I only test the public method and cover all cases of the internal methods with different example inputs.
Advantages:
Testing only the "interface"; the tests don't know the internal structure.
Internal refactoring is very easy and needs no changes to the tests.
Disadvantages:
Really big, maybe unclear tests that have to cover all cases.
Every input must pass through all the methods.
Option 2
I write tests for my public method and my private methods. (Some test frameworks can access private methods, like Powermock or Spock (Groovy).)
I test every method on its own and mock every other internal method.
Advantages:
Really small tests that only test the method itself and mock all other methods.
Disadvantages:
The tests know how it is implemented internally, and I must change them if I refactor a method, a method name or something in the internal calling structure.
Option 3
I write some new classes that do the internal work and have public methods.
Advantages:
Tests are maybe clearer and only cover the specialised classes.
Disadvantages:
More classes for one conversion task.
Please help me understand what the best practice is here.
Maybe with some good links/hints.
Thank you for your time.
The points you make are valid, but I think you might not be estimating their weight correctly.
Writing brittle tests (tests that are coupled to the implementation code) makes for a rigid code base that is hard to change. Since the point of writing tests in the first place is to be able to go fast, this is counterproductive.
This is why you write your tests through the API only - it decouples the tests from the implementation. As you've said, this might make writing the tests a bit harder, but the reward is worth the effort since you'll get safety and be able to refactor easily.
Option 3 comes into play when you see a code smell where some tests cover only some of the code, and other tests only cover the other part of the code. This usually means there's a collaborator that maybe needs to be extracted. This is especially true when some internal functions only use some parameters and others don't. Also, when there's code duplication and the like.
What I would suggest, is to write it using the way you described in option 1, and then extract code out if needed, in the refactoring stage.
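As an illustration of option 1, here is a small sketch. DisplayNameConverter and its helpers are hypothetical (a String-to-String conversion stands in for EntityA to EntityB), and the checks are plain static methods rather than framework tests so that the snippet stays self-contained.

```java
// Hypothetical converter: one public entry point, private helpers behind it.
class DisplayNameConverter {
    public String convert(String rawName) {
        return capitalize(trimToEmpty(rawName));
    }

    private String trimToEmpty(String s) {
        return s == null ? "" : s.trim();
    }

    private String capitalize(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }
}

// Plain-Java checks standing in for @Test methods. Each one targets the
// behaviour of a private helper, but only ever calls convert().
class DisplayNameConverterChecks {
    static boolean nullInputBecomesEmpty() {
        return new DisplayNameConverter().convert(null).isEmpty();
    }

    static boolean surroundingWhitespaceIsTrimmed() {
        return new DisplayNameConverter().convert("  Ada ").equals("Ada");
    }

    static boolean firstLetterIsCapitalized() {
        return new DisplayNameConverter().convert("ada").equals("Ada");
    }
}
```

Renaming, merging or extracting the private helpers later leaves all three checks untouched, which is the decoupling described above.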

Should I test (duplicate) data, or only the behavior?

From a design perspective, I am wondering whether I should test the data, especially if it is generally known data (not something very configurable). This can apply to things like popular file extensions, special IP addresses etc.
Suppose we have an emergency phone number classifier:
public class ContactClassifier {
public final static String EMERGENCY_PHONE_NUMBER = "911";
public boolean isEmergencyNumber(String number) {
return number.equals(EMERGENCY_PHONE_NUMBER);
}
}
Should I test it this way ("911" duplication):
@Test
public void testClassifier() {
    assertTrue(contactClassifier.isEmergencyNumber("911"));
    assertFalse(contactClassifier.isEmergencyNumber("111-other-222"));
}
or (test if properly recognized "configured" number):
@Test
public void testClassifier() {
    assertTrue(contactClassifier.isEmergencyNumber(ContactClassifier.EMERGENCY_PHONE_NUMBER));
    assertFalse(contactClassifier.isEmergencyNumber("111-other-222"));
}
or inject "911" in the constructor, which looks the most reasonable to me? But even if I do so, should I write a test for the "application glue" to check that the component was instantiated with the proper value? If someone can make a typo in data (code), then I see no reason someone can't make a typo in the test case (I bet such data would be copy-pasted).
What is the point of testing data like that? That a constant value is in fact constant? It's already defined in the code. Java makes sure the value is what it says, so don't bother.
What you should do in a unit test is test the implementation: whether it is correct or not. To test incorrect behaviour, you use data defined inside the test, marked as wrong, and send it to the method. To test that data is handled correctly, you either input it in the test, if it is a border value that is not well known, or use application-wide known values (constants inside interfaces) if they are already defined somewhere.
What is bothering you is that data that should be well known to everyone is placed in the test, and that is not correct at all. What you can do is move it to the interface level. This way, by design, your application's well-known data is part of the contract and its correctness is checked by the Java compiler.
Values that are well known should not be checked, but should be maintained through interfaces of some sort. Changing such a value is easy, yes, and your test will not fail when it changes, but to avoid accidents you should have merge requests, reviews and tasks associated with the change. If someone does change it by accident, you can catch that at code review. If you commit everything straight to master, you have bigger problems than doubly defined constants.
Now, onto parts that are bothering you in other approaches:
1) If someone can do a typo in data (code), then I see no reasons someone can do a typo in tests case (I bet such data would be copy-paste)
Actually, if someone changes values in the data and then continues to develop, at some point they will run a clean install and see those failed tests. At that point they will probably change or ignore the test to make it pass. If you have a person who changes data that randomly, you have bigger issues; and if not, and the change is defined by a task, you made someone do the change twice (at least). No pros and many cons.
2) Worrying about someone making a mistake is generally bad practice. You can't catch it using code. Code reviews are designed for that. You can worry though about someone not correctly using the interface you defined.
3) Should I test it this way:
@Test
public void testClassifier() {
    assertTrue(contactClassifier.isEmergencyNumber(ContactClassifier.EMERGENCY_PHONE_NUMBER));
    assertFalse(contactClassifier.isEmergencyNumber("111-other-222"));
}
Also not this way. This is not a test but a test batch, i.e. multiple tests in the same method. It should be written this way (following naming conventions):
@Test
public void testClassifier_emergencyNumberSupplied_correctnessConfirmed() {
    assertTrue(contactClassifier.isEmergencyNumber(ContactClassifier.EMERGENCY_PHONE_NUMBER));
}
@Test
public void testClassifier_incorrectValueSupplied_correctnessNotConfirmed() {
    assertFalse(contactClassifier.isEmergencyNumber("111-other-222"));
}
4) It's not necessary when the method is properly named, but if the test is long enough you might consider naming the values inside it. For example:
@Test
public void testClassifier_incorrectValueSupplied_correctnessNotConfirmed() {
    String nonEmergencyNumber = "111-other-222";
    assertFalse(contactClassifier.isEmergencyNumber(nonEmergencyNumber));
}
External constants as such have a problem. Because the constant is a compile-time constant, the reference to the original class disappears and the value is copied into the test class's own constant pool. Hence, when the constant is later changed in the original class, the compiler does not see a dependency between the .class files, and leaves the old constant value in the test class.
So you would need a clean build.
Furthermore, tests should be short, clear to read and fast to write. Tests deal with concrete cases of data. Abstractions are counterproductive, and may even lead to errors in the tests themselves. Constants (like a speed limit) should be etched in stone, should be literals. Value properties like the maximum velocity of a car brand can stem from some kind of table lookup.
Of course, repeated values could be placed in local constants: this prevents typos, is an easy abstraction (being local), and clarifies the semantic meaning of a value.
However, as test cases will in general use a constant maybe two or three times (positive and negative test), I would go for bare constants.
In my opinion the test should check behaviour and not the internal implementation.
The fact that isEmergencyNumber checks the number against a constant declared in the class you're trying to test is a detail of the internal implementation. You shouldn't rely on it in the test, because it is not safe.
Let me give you some examples:
Example #1: Someone changed EMERGENCY_PHONE_NUMBER by mistake and didn't notice. The second test will never catch it.
Example #2: Suppose a careless developer changes ContactClassifier to the following code. Of course this is a complete edge case and most likely will never happen in practice, but it helps to show what I mean.
public final static String EMERGENCY_PHONE_NUMBER = new String("911");
public boolean isEmergencyNumber(String number) {
return number == EMERGENCY_PHONE_NUMBER;
}
In this case your second test will not fail, because it relies on the internal implementation, but your first test, which checks real-world behaviour, will catch the problem.
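A runnable sketch of Example #2, for anyone who wants to see it happen. BrokenContactClassifier reproduces the faulty code above; the BrokenClassifierDemo class and its two method names are made up for illustration and stand in for the two test styles.

```java
// Deliberately broken variant from Example #2: identity comparison instead of equals().
class BrokenContactClassifier {
    public static final String EMERGENCY_PHONE_NUMBER = new String("911");

    public boolean isEmergencyNumber(String number) {
        return number == EMERGENCY_PHONE_NUMBER;
    }
}

class BrokenClassifierDemo {
    // Mirrors the second test: feeds the class's own constant back in.
    // Same object on both sides of ==, so the bug stays hidden.
    static boolean checkWithConstant() {
        return new BrokenContactClassifier()
                .isEmergencyNumber(BrokenContactClassifier.EMERGENCY_PHONE_NUMBER);
    }

    // Mirrors the first test: passes an independent "911" literal.
    // The literal is a different object than new String("911"), so == is false.
    static boolean checkWithLiteral() {
        return new BrokenContactClassifier().isEmergencyNumber("911");
    }
}
```

checkWithConstant() returns true (the assertTrue in the second test would pass), while checkWithLiteral() returns false (the assertTrue in the first test would fail and expose the bug).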
Writing a unit test serves an important purpose: you specify rules to be followed by the method being tested.
So, when the method breaks that rule i.e. the behavior changes, the test would fail.
I suggest writing down in human language what you want the rule to be, and then writing it in computer language accordingly.
Let me elaborate.
Option 1: When I ask the ContactClassifier.isEmergencyNumber method, "Is the string "911" an emergency number?", it should say yes.
This translates to
assertTrue(contactClassifier.isEmergencyNumber("911"));
What this means is that you want to control and test which number is specified by the constant ContactClassifier.EMERGENCY_PHONE_NUMBER: its value should be 911, and the method isEmergencyNumber(String number) should apply its logic against this "911" string.
Option 2: When I ask the ContactClassifier.isEmergencyNumber method, "Is the string specified in ContactClassifier.EMERGENCY_PHONE_NUMBER an emergency number?", it should say yes.
This translates to
assertTrue(contactClassifier.isEmergencyNumber(ContactClassifier.EMERGENCY_PHONE_NUMBER));
What this means is that you don't care which string is specified by the constant ContactClassifier.EMERGENCY_PHONE_NUMBER, just that the method isEmergencyNumber(String number) applies its logic against that string.
So, the answer would depend on which one of above behaviors you want to ensure.
I'd opt for
@Test
public void testClassifier() {
    assertTrue(contactClassifier.isEmergencyNumber("911"));
    assertFalse(contactClassifier.isEmergencyNumber("111-other-222"));
}
as this doesn't test against something from the class under test that might be faulty. Testing with
@Test
public void testClassifier() {
    assertTrue(contactClassifier.isEmergencyNumber(ContactClassifier.EMERGENCY_PHONE_NUMBER));
    assertFalse(contactClassifier.isEmergencyNumber("111-other-222"));
}
will never catch it if someone introduces a typo into ContactClassifier.EMERGENCY_PHONE_NUMBER.
In my opinion it is not necessary to test this logic. The reason is that this logic is trivial to me.
We could test every line of our code, but I don't think that is a good idea. Take getters and setters, for example. If we follow the theory of testing every line of code, we have to write a test for each getter and setter. But these tests have low value and cost time to write and maintain. That is not a good investment.

Difference between @VisibleForTesting and @Deprecated in unit tests

Let's say that there just so happens to be an existing long private method (one which I'm not allowed to refactor into smaller pieces at this stage in the development process), but I really want to write a couple of regression-protection unit tests for it, just for now.
I just heard of this @VisibleForTesting annotation, but am not too sure of its benefits and gotchas. Previously, I had always been marking things with @Deprecated and comments to try and make it VERY CLEAR, like:
... some code ...
// ====================================== TESTING USE ONLY BELOW ======================================
@Deprecated // TESTING ONLY, DO NOT USE!
boolean testGiveAccessToSomethingPrivate() {
    // call some private method and get the results
}
It seems that whenever I mark something as @VisibleForTesting, it exposes the method for realz, without any indication to the user of the API that this method was only meant for testing (whereas if I mark the method with @Deprecated, most IDEs will put a strike-through on it that warns other developers not to accidentally use the test method in their actual code).
@Deprecated
When a method is marked @Deprecated, programmers who use it know that the method may be removed, or may behave differently, in a future version. If that happens, their code will break and they will have to rework it. So they should use the method very carefully, or replace it with another one. (Often the programmer who wrote the API provides a similar, non-deprecated method to use instead.)
@VisibleForTesting
You have learnt public, private, protected and so on, and you use them well. But in the real world, not everything goes as we imagine. For example, your class has a private variable and you want to test that variable: how can you?
class SimpleClass {
    private int a;
    public void simpleMethod() {
        if (a < 0) {
            // do something
        } else {
            // do another thing
        }
    }
}
One way is to change the scope of variable a to package level, because your test file will be in the same package as the file being tested. But another programmer comes along and asks themselves: "Why did someone make a package-level? I think private would be better." (Of course, they don't know about, or don't bother reading, your test code, because life is so busy.) So you give them a way to know, by using @VisibleForTesting. They read this, Google it, and realise: "Ah, I understand. They want this variable to be testable."
That's a very short story about those two annotations. What they have in common: they don't change the way your code runs. They change nothing, but tell other people something. A very famous annotation that every Java developer knows and that works similarly is @Override.
The difference: @Deprecated is for someone using your code, and @VisibleForTesting is for someone reading your code. Let's make programmers' lives easier in both cases.
Hope this helps :)

How do I generate the source code to create an object I'm debugging?

Typical scenario for me:
The legacy code I work on has a bug that only a client in production is having
I attach a debugger and figure out how to reproduce the issue on their system given their input. But, I don't know why the error is happening yet.
Now I want to write an automated test on my local system to try and reproduce then fix the bug
That last step is really hard. The input can be very complex and have a lot of data to it. Creating the input by hand (eg: P p = new P(); p.setX("x"); p.setY("x"); imagine doing this 1000 times to create the object) is very tedious and error prone. In fact you may notice there's a typo in the example I just gave.
Is there an automated way to take a field from a break point in my debugger and generate source code that would create that object, populated the same way?
The only thing I've come up with is to serialize this input (using Xstream, for example). I can save that to a file and read it back in in an automated test. This has a major problem: If the class changes in certain ways (eg: a field/getter/setter name is renamed), I won't be able to deserialize the object anymore. In other words, the tests are extremely fragile.
Java standard serialisation is well known to be not very useful when objects change between versions (content, naming of fields). It's fine for quick demo projects.
More suitable for your needs is an approach where your objects support your own custom (binary) serialisation:
This is not difficult: use DataOutputStream to write out all fields of an object. But now introduce versioning, by first writing out a versionId. Objects that have only one version write out versionId 1. That way, when you later have to change your objects (remove fields, add fields), you can raise the version number.
Such an ICustomSerializable will then first read the version number from the input stream, in a readObject() method, and depending on the version id call readVersionV1() or, e.g., readVersionV2().
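// Requires: import java.io.DataInputStream; java.io.DataOutputStream; java.io.IOException;
public interface ICustomSerializable {
    void writeObject(DataOutputStream dos) throws IOException;
    Object readObject(DataInputStream dis) throws IOException;
}
public class Foo {
    public static final int VERSION_V1 = 1;
    public static final int VERSION_V2 = 2;
    public static final int CURRENT_VERSION = VERSION_V2;
    private int version;
    private int fooNumber;
    private double fooDouble;
    public void writeObject(DataOutputStream dos) throws IOException {
        // Write the version id first, so a reader knows which layout follows.
        dos.writeInt(this.version);
        if (version == VERSION_V1) {
            writeVersionV1(dos);
        } else if (version == VERSION_V2) {
            writeVersionV2(dos);
        } else {
            // IllegalFormatException cannot be constructed with a message, so use IllegalStateException.
            throw new IllegalStateException("unknown version: " + this.version);
        }
    }
    public void writeVersionV1(DataOutputStream dos) throws IOException {
        dos.writeInt(this.fooNumber);
        dos.writeDouble(this.fooDouble);
    }
    // writeVersionV2(...) and the read methods are not shown.
}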
Getters and setters, and a constructor which initialises the version to CURRENT_VERSION, are needed as well.
This kind of serialisation is safe under refactoring, provided you also change or add the appropriate read and write version. For complex objects using classes from external libraries not under your control it can be more work, but strings and lists are easily serialized.
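The read side is only described in words above, so here is a self-contained round-trip sketch of the idea. The class name VersionedFoo, the helper methods toBytes/fromBytes, and the assumption that version 1 stored only fooNumber while version 2 added fooDouble are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VersionedFoo {
    static final int VERSION_V1 = 1;
    static final int VERSION_V2 = 2;
    static final int CURRENT_VERSION = VERSION_V2;

    final int fooNumber;
    final double fooDouble; // assumed to have been added in version 2

    VersionedFoo(int fooNumber, double fooDouble) {
        this.fooNumber = fooNumber;
        this.fooDouble = fooDouble;
    }

    // Always writes the current layout, prefixed with its version id.
    void writeObject(DataOutputStream dos) throws IOException {
        dos.writeInt(CURRENT_VERSION);
        dos.writeInt(fooNumber);
        dos.writeDouble(fooDouble);
    }

    // Reads the version id first, then dispatches to the matching layout.
    static VersionedFoo readObject(DataInputStream dis) throws IOException {
        int version = dis.readInt();
        switch (version) {
            case VERSION_V1:
                return readVersionV1(dis);
            case VERSION_V2:
                return readVersionV2(dis);
            default:
                throw new IOException("unknown version: " + version);
        }
    }

    private static VersionedFoo readVersionV1(DataInputStream dis) throws IOException {
        // Old data has no fooDouble, so fall back to a default value.
        return new VersionedFoo(dis.readInt(), 0.0);
    }

    private static VersionedFoo readVersionV2(DataInputStream dis) throws IOException {
        return new VersionedFoo(dis.readInt(), dis.readDouble());
    }

    // Convenience wrappers so tests can work with plain byte arrays.
    byte[] toBytes() {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             DataOutputStream dos = new DataOutputStream(bytes)) {
            writeObject(dos);
            dos.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static VersionedFoo fromBytes(byte[] data) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            return readObject(dis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A byte array captured from production while the class was still at version 1 keeps loading after the class moves to version 2, which is what makes the saved test input less fragile than a plain field-name-based dump.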
I think what you want to do is store the "state", and then restore that in your test to ensure the bug stays fixed.
Short answer: There is, afaik, no such general code generation tool, but as long as several constraints are met, writing such a tool is little work.
Long Comment:
There are constraints under which that can work. If everything is just beans with getters and setters for all the fields you need, then generating code for this is not so difficult. And yes, that would be safe against renaming if you refactor the generated code along with the normal code. If setters are missing, then this approach will not work. And that is only one example of why this is no general solution.
Refactoring can also, for example, move fields to other classes. How would you then bring in the values for the other fields of that class? How can you later know whether your saved state, altered that way, still reflects the critical data? Or worse, imagine the refactoring gives the same field a different meaning than before.
The nature of the bug itself is also a constraint. Imagine, for example, that the bug happened because a field or method had a particular name. If a refactoring now changes the name, the bug will no longer appear, regardless of your saved state.
Those are just arbitrary examples that may have nothing at all to do with your real-life cases. But this is a case-by-case decision, not a general strategy. Anyway, if you know that your code, the bug and your refactorings are all well-behaved enough for this, then making such a tool takes less than a day, probably much less.
With XStream you would partially get this as well, but you would have to change the XML yourself. If you used, for example, db4o, you would have to tell it that this and that field now has this and that name.
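Under the "everything is a bean" constraint described above, such a tool could start out like this sketch. BeanSourceGenerator is a hypothetical name, P mirrors the bean from the question, and only a handful of value types (String, Integer, Long, Boolean, null) are handled; nested objects and collections are left out.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

final class BeanSourceGenerator {
    private BeanSourceGenerator() {}

    // Emits Java source that rebuilds 'bean' through its no-arg constructor and setters.
    static String generate(Object bean, String varName) {
        Class<?> type = bean.getClass();
        // Sort setters by name so the generated source is stable between runs.
        Map<String, Method> setters = new TreeMap<>();
        for (Method m : type.getMethods()) {
            String name = m.getName();
            if (name.startsWith("set") && name.length() > 3 && m.getParameterCount() == 1) {
                setters.put(name, m);
            }
        }
        String typeName = type.getSimpleName();
        StringBuilder src = new StringBuilder();
        src.append(typeName).append(' ').append(varName)
           .append(" = new ").append(typeName).append("();\n");
        for (String setterName : setters.keySet()) {
            String property = setterName.substring(3);
            try {
                // Read the current value through the matching getter.
                Object value = type.getMethod("get" + property).invoke(bean);
                src.append(varName).append('.').append(setterName)
                   .append('(').append(literal(value)).append(");\n");
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("No usable getter for " + property, e);
            }
        }
        return src.toString();
    }

    // Turns a value into a Java literal; only a few simple types are supported.
    static String literal(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            String escaped = ((String) value).replace("\\", "\\\\").replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }
        if (value instanceof Integer || value instanceof Boolean) {
            return value.toString();
        }
        if (value instanceof Long) {
            return value + "L";
        }
        throw new IllegalArgumentException("Unsupported value type: " + value.getClass());
    }
}

// Bean mirroring the P example from the question.
class P {
    private String x;
    private String y;

    public String getX() { return x; }
    public void setX(String x) { this.x = x; }
    public String getY() { return y; }
    public void setY(String y) { this.y = y; }
}
```

Evaluating BeanSourceGenerator.generate(p, "p") from the debugger's expression window would give text that can be pasted into a test and refactored along with the rest of the code.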
