I am working on a legacy system where XStream is being used to serialize objects in order to keep two databases in sync. A new object is first stored in one database, then the stored object is serialized and sent to be stored in the other database.
Up until recently, the structure of the object in question was like this:
public class Project {
List<Milestone> milestones;
[...]
}
But, after changes to the requirements, the structure is supposed to be like this:
public class Project {
List<Goal> goals;
}
public class Goal {
List<Milestone> milestones;
}
In order to keep the milestones of legacy data, which knew nothing about goals, the final structure of project was this:
public class Project {
List<Goal> goals;
List<Milestone> milestones;
}
So there are two paths from a Project to a Milestone: one direct, and one through a Goal. The problem occurs when this structure is deserialized and stored. When it is deserialized by XStream, the Milestone objects connected directly to the Project become different objects from the ones connected through Goals, even though they have the same id.
As long as Hibernate's Session#merge() was used to persist this object, this was no problem, since merge() doesn't care about object identity as long as the database identifiers are the same.
But I can no longer use merge() for this purpose and have to rely on Session#save() instead. And save() DOES care about object identity! So now I get an org.hibernate.NonUniqueObjectException when trying to store the deserialized object.
I figure the least intrusive way to solve this, if it's possible, is to make XStream create one object per database id. But is this possible?
After some consideration, it is apparent to me that the problem is not XStream, as it has mechanisms for object references. The problem is another nifty "feature" of the project I'm working on: it has two versions of each domain class, one for communication with Hibernate and one for "logic use" (don't ask me why...). In the conversion between these two versions (which basically moves values from one object to another), objects are new-ed uncritically, resulting in the same "Hibernate object" being transformed into multiple "Java objects". So I can't really blame XStream for not understanding that these should be the same :)
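For what it's worth, a minimal sketch of how such a conversion could avoid the duplication (ConversionContext and its use are my own invention, not part of the project described): keep an identity map from each Hibernate object to its logic counterpart, so the same Milestone is converted at most once, no matter how many paths reach it.

import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Identity-map-based conversion: each source object is converted at most
// once, so shared references stay shared in the converted graph.
class ConversionContext {
    private final Map<Object, Object> converted = new IdentityHashMap<>();

    @SuppressWarnings("unchecked")
    <S, T> T convert(S source, Function<S, T> factory) {
        Object target = converted.get(source);
        if (target == null) {
            target = factory.apply(source);
            converted.put(source, target);
        }
        return (T) target;
    }
}

Every place the conversion currently does new Milestone(...), it would instead go through context.convert(hibernateMilestone, ...), passing the existing conversion logic as the factory.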
Related
I have an application whose data is represented by POJOs (plain old Java objects).
While running, the application manipulates and remembers data as desired.
Now I want to implement a save/load feature.
I am NOT asking about basic file I/O.
I am NOT asking whether ObjectOutputStream exists.
The options I have found include:
1) JSON/XML/YAML libraries such as Gson, Jackson
2) Roll your own binary file format marking everything as Serializable with a Serialization Proxy pattern.
Option 1 is unsuitable because my data model can feature cyclic references. Gson resulted in a stack overflow.
Option 2 is unsuitable because the files should be cross-platform and independent of the JVM; it should work on desktop and Android Java.
A properties file is also obviously unsuitable due to the complexity of the model.
Please do not attack my use case; my data model is perfectly well designed. The example may not be.
I will now give example code of the kind of structure that needs to be saved.
class Application {
//This College is my top level object. It could correspond to an individual save file.
College college = new College();
//I would love to be able to just throw this guy into a file.
SomeLibrary.writeToFile(college);
//And read another back.
College college2 = SomeLibrary.readFromFile(anotherCollege);
}
class College {
//The trees are implemented recursively, so this is actually just the root of each tree.
Tree<Course> artCourseTree;
Tree<Course> engineeringCourseTree;
Tree<Course> businessCourseTree;
List<Student> maleStudents;
List<Student> femaleStudents;
}
class Course {
//Each course only has 2 students in this example. Ignore.
Student student1;
Student student2;
List<Exam> examsInCourse;
LocalDate courseStartDate;
Period duration;
}
class Student {
String name;
List<Exam> listOfExamsTaken;
}
class Exam {
Student studentTakingIt;
LocalDate dateTaken;
BigDecimal score;
}
As you can see, Exams are intended to be the atomic objects in this model, at the bottom of the hierarchy. However, not only are they referenced by both Students and Courses, they also refer back up to a Student and contain non-primitives such as LocalDate and BigDecimal. The model is given meaning by referencing different subsets of Exams in different Courses and Students.
I need to save the relationships, the arrangement of these things, an arbitrary number of these things, as well as the data they hold.
What hope do I have of saving and loading such a model?
What options are there to implement a save/load feature on such a model, with such constraints?
Is it really industry standard for every Java program to roll its own binary file format and create a monstrous apparatus to serialize and deserialize everything? Is it that or JSON? What am I missing here? Do I have to just snapshot the VM somehow? Why is there no standard practice for this?
Circular references are a common use case and can be handled by annotating the fields with @JsonManagedReference and @JsonBackReference. Check out this SO answer for more details. Another option is to implement a custom serializer and resolve the circular references yourself. Here is an example of that.
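For illustration, a minimal sketch of the managed/back reference pair using stripped-down Student and Exam classes from the question (assumes Jackson databind on the classpath):

import com.fasterxml.jackson.annotation.JsonBackReference;
import com.fasterxml.jackson.annotation.JsonManagedReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

public class CycleDemo {
    static class Student {
        public String name;
        @JsonManagedReference // forward side of the cycle: serialized as usual
        public List<Exam> listOfExamsTaken = new ArrayList<>();
    }

    static class Exam {
        @JsonBackReference // back side: skipped on write, re-linked on read
        public Student studentTakingIt;
        public BigDecimal score;
    }

    public static void main(String[] args) throws Exception {
        Student s = new Student();
        s.name = "Ada";
        Exam e = new Exam();
        e.studentTakingIt = s;
        e.score = new BigDecimal("95.5");
        s.listOfExamsTaken.add(e);

        ObjectMapper mapper = new ObjectMapper();
        String json = mapper.writeValueAsString(s); // no StackOverflowError
        Student back = mapper.readValue(json, Student.class);
        System.out.println(back.listOfExamsTaken.get(0).studentTakingIt == back); // true
    }
}

Note that because the question's Exams are shared between Courses and Students, @JsonIdentityInfo (id-based references) may be a better fit than a strict parent/child pair; the pair shown here only covers the Student/Exam cycle.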
However, do consider the following aspects before going ahead with using files as a database:
You will have to manage concurrent writes yourself. If not handled correctly, this might result in corruption or loss of data, because files are not ACID-compliant by nature.
The solution does not scale as the file grows. Time to serialize and deserialize will increase proportionately.
You won't be able to query the data stored in the file easily. You will always have to deserialize the data first and then query the POJOs.
I highly recommend checking out SQLite, which is a small, fast, self-contained, high-reliability, full-featured SQL database engine.
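A small sketch of that route over plain JDBC (assumes the xerial sqlite-jdbc driver on the classpath; the file and table names are made up):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class SqliteDemo {
    public static void main(String[] args) throws Exception {
        // the driver registers itself; the database file is created on first use
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:college.db");
             Statement st = conn.createStatement()) {
            st.executeUpdate("CREATE TABLE IF NOT EXISTS student (id INTEGER PRIMARY KEY, name TEXT)");
            st.executeUpdate("INSERT INTO student (name) VALUES ('Ada')");
        }
    }
}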
It is not easy to explain my issue.
JPA creates some complex objects for calculations, and these are stored in a database.
We decided to set the results on working copies of these objects.
This means that for each object model we created a separate working-copy model file with the same fields, but with different LocalDate values and new result fields.
When the calculation starts, the working copies are instantiated.
I don't think this approach is the best.
I am considering the prototype pattern to clone the objects.
But then I run into the problem of how to add the new fields. How?
Instantiation has a cost, and this approach creates lots of additional model class files.
The only alternative I can think of is to put the result fields into the calculation data models as transient fields (see the sketch below).
Maybe an inner class or a local class?
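A minimal sketch of that transient-field idea (the class and field names are made up for illustration):

import java.math.BigDecimal;
import java.time.LocalDate;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Transient;

// The results live on the persistent entity but are never written
// to the database.
@Entity
public class CalculationData {
    @Id
    private Long id;

    private LocalDate validFrom;  // persisted input field

    @Transient
    private BigDecimal result;    // set during the calculation, not persisted

    // getters and setters omitted
}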
I also tried to use an interface as a data bucket.
But that is not really the purpose of interfaces, and it only works with a lot of curious tricks.
For unit tests and user input, I think it is best to use the builder pattern and then tell JPA to store the parent object, or not?
Sorry, but my answer was too long for a comment :(
There is a big, complex object graph with Lists and Sets, one-to-many relationships, etc. When I set the results in a new class, I can't determine the right object, e.g. in a list. So we built the same structure for these results and separated those classes into a package. Maybe it is possible not to build the structure a second time, with references to the "basic classes" as well. It should be sufficient to reference one result class from each basic class. It would only take a little bit more navigation to get values from deeper classes. For a similar use case there must be a best practice, right? Interfaces or something. I really dislike the many classes for the results. Is it not possible to clone a class and add members to it for the results, or to group them more easily, or something like that?
This could be a solution for somebody:
http://help.eclipse.org/luna/index.jsp?topic=%2Forg.eclipse.jdt.doc.isv%2Freference%2Fapi%2Forg%2Feclipse%2Fjdt%2Fcore%2FIWorkingCopy.html
Here you would work with the Eclipse API and create IWorkingCopies.
For the task described here, it is far too much.
It is a rather general question, but I will give a stripped-down example. Say I have a web CRUD application that manages simple entities stored in a database, nothing but the classics: JSP view, @RequestMapping-annotated controller, transactional service layer and DAO.
On an update, I need to know the previous values of my fields, because a business rule asks for a test involving the old and new values.
So I am searching for a best practice on that use case.
I think that Spring's code is far more extensively tested and more robust than my own, and I would like to do it the Spring way as much as possible.
Here is what I have tried :
1/ Load an empty object in the controller and manage the update in the service:
Data.java:
class Data {
int id; // primary key
String name;
// ... other fields, getters, and setters omitted for brevity
}
DataController
...
#RequestMapping("/data/edit/{id}", method=RequestMethod.GET)
public String edit(#PathVariable("id") int id, Model model) {
model.setAttribute("data", service.getData(id);
return "/data/edit";
}
#RequestMapping("/data/edit/{id}", method=RequestMethod.POST)
public String update(#PathVariable("id") int id, #ModelAttribute Data data, BindingResult result) {
// binding result tests omitted ..
service.update(id, data)
return "redirect:/data/show";
}
DataService
@Transactional
public void update(int id, Data form) {
    Data data = dataDao.find(id);
    // ok, I have the old values in data and the new values in form -> do the tests ...
    // and MANUALLY copy the fields from form to data
    data.setName(form.getName());
    ...
}
It works fine, but in a real case with many domain objects and many fields in each, it is quite easy to forget one ... whereas Spring's WebDataBinder does all of it, including validation, in the controller, without me having to write anything other than @ModelAttribute!
2/ I tried to preload the Data from the database by declaring a Converter
DataConverter
public class DataConverter implements Converter<String, Data> {
    @Override
    public Data convert(String strid) {
        return dataService.getData(Integer.valueOf(strid));
    }
}
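For completeness, a hedged sketch of how such a converter is typically registered (assuming Java-based Spring MVC configuration; the exact mechanism depends on the Spring version, and older setups would use XML or WebMvcConfigurerAdapter):

import org.springframework.context.annotation.Configuration;
import org.springframework.format.FormatterRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebConfig implements WebMvcConfigurer {
    @Override
    public void addFormatters(FormatterRegistry registry) {
        // from here on, String ids in @PathVariable positions can bind to Data
        registry.addConverter(new DataConverter());
    }
}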
Absolutely magic! The Data is fully initialized from the database, and the fields present in the form are properly updated. But ... there is no way to get the previous values ...
So my question is: what could be the way to use Spring's DataBinder magic and still have access to the previous values of my domain objects?
You have already found the possible choices, so I will just add some ideas here ;)
I will start with your option of using an empty bean and copying the values over to a loaded instance:
As you have shown in your example, it's an easy approach, and it's quite easily adaptable into a generalized solution.
You do not need to copy the properties manually! Take a look at the 'BeanWrapperImpl' class. This Spring object allows you to copy properties and is in fact the one used by Spring itself to achieve its magic. It's used by the 'ParameterResolvers', for example.
So copying properties is the easy part: clone the loaded object, fill the loaded object, and compare them somehow.
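A sketch of that copying step (the helper class is mine, not a Spring class; Spring's BeanUtils.copyProperties does something very similar):

import java.beans.PropertyDescriptor;
import org.springframework.beans.BeanWrapper;
import org.springframework.beans.BeanWrapperImpl;

public class PropertyCopier {
    // Copy every readable property of source onto the matching
    // writable property of target.
    public static void copyProperties(Object source, Object target) {
        BeanWrapper src = new BeanWrapperImpl(source);
        BeanWrapper dst = new BeanWrapperImpl(target);
        for (PropertyDescriptor pd : src.getPropertyDescriptors()) {
            String name = pd.getName();
            if (!"class".equals(name) && dst.isWritableProperty(name)) {
                dst.setPropertyValue(name, src.getPropertyValue(name));
            }
        }
    }
}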
If you have only one service, or just a few, this is the way to go.
In my case we needed this feature on every entity. Using Hibernate, we have the issue that an entity might change not only inside a specific service call, but theoretically all over the place.
So I decided to create a 'MappedSuperclass' which all entities need to extend. This entity has a 'PostLoad' event listener which clones the entity into a transient field directly after loading. (This works if you don't have to load thousands of entities in a request.) You then also need 'PostPersist' and 'PostUpdate' listeners to clone the new state again, as you probably won't reload the entity before another modification.
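A hedged sketch of that superclass (assuming JPA annotations; the shallow clone() is a simplification):

import javax.persistence.MappedSuperclass;
import javax.persistence.PostLoad;
import javax.persistence.PostPersist;
import javax.persistence.PostUpdate;
import javax.persistence.Transient;

@MappedSuperclass
public abstract class SnapshotEntity implements Cloneable {

    @Transient
    private SnapshotEntity loadedState; // snapshot of the last known DB state

    @PostLoad
    @PostPersist
    @PostUpdate
    private void snapshot() {
        try {
            // shallow copy; deep-clone here if nested state must be compared
            this.loadedState = (SnapshotEntity) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new IllegalStateException(e);
        }
    }

    public SnapshotEntity getLoadedState() {
        return loadedState;
    }
}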
To facilitate the controller mapping I have implemented a 'StringToEntityConverter' doing exactly what you did, just generalized to support any entity type.
Finding the changes in a generalized approach will involve quite a bit of reflection. It's not that hard, and I don't have the code available right now, but you can also use the 'BeanWrapper' for that:
Create a wrapper for both objects, get all 'PropertyDescriptors', and compare the results. The hardest part is to find out when to stop: do you compare only the first level, or do you need a deep comparison?
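A sketch of a first-level comparison along those lines (helper names are mine):

import java.beans.PropertyDescriptor;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import org.springframework.beans.BeanWrapper;
import org.springframework.beans.BeanWrapperImpl;

public class BeanDiff {
    // Returns property name -> [old value, new value] for every
    // first-level property whose value changed.
    public static Map<String, Object[]> diff(Object oldBean, Object newBean) {
        BeanWrapper oldW = new BeanWrapperImpl(oldBean);
        BeanWrapper newW = new BeanWrapperImpl(newBean);
        Map<String, Object[]> changes = new LinkedHashMap<>();
        for (PropertyDescriptor pd : oldW.getPropertyDescriptors()) {
            String name = pd.getName();
            if ("class".equals(name)) continue;
            Object before = oldW.getPropertyValue(name);
            Object after = newW.getPropertyValue(name);
            if (!Objects.equals(before, after)) {
                changes.put(name, new Object[] { before, after });
            }
        }
        return changes;
    }
}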
One other solution could also be to rely on Hibernate Envers. This would work if you do not need the changes during the same transaction. As Envers tracks the changes during a flush and creates a 'Revision', you can "simply" fetch two revisions and compare them.
In all scenarios you will have to write comparison code. I'm not aware of a library for it, but there is probably something around in the Java world :)
Hope that helps a bit.
Let's say you have a Client and a Server that want to share/synchronize the same Models/Objects. The models point to each other, and you want them to keep pointing at the same objects after being sent/serialized between the client and the server. My current solution roughly looks like this:
class Person {
    // global lookup table so references can be resolved by id
    static Map<Integer, Person> allPeople = new HashMap<>();
    int myDogId; // reference to the Dog by id instead of by object
    static Person getPerson(int key) {
        return allPeople.get(key);
    }
    Dog getMyDog() {
        return Dog.getDog(myDogId);
    }
}
class Dog {
    static Map<Integer, Dog> allDogs = new HashMap<>();
    int myOwnersId;
    static Dog getDog(int key) {
        return allDogs.get(key);
    }
    Person getMyOwner() {
        return Person.getPerson(myOwnersId);
    }
}
But I'm not too satisfied with this solution, with the fields being integers and so on. This should also be a pretty common problem. So what I'm looking for here is a name for this problem, a pattern, a common solution, or a library/framework.
There are two issues here.
Are you replicating the data in the Client and the Server (if so, why?), or does one, the other, or a database agent hold the Model? And how does each agent access (its/the) model?
If the model is only held by one agent (Client, Server, Database), then the other agents need a way to remotely query the model (e.g., object enumerators, getters and setters for various fields) operating on abstract model entities (e.g., model element identifiers, which might be implemented as integers, as you have done).
Regardless of who holds the model (one or all), each model can be implemented naturally. The normal implementation has each object simply refer to other objects using normal object references, as if you had coded this without any thought of sharing between agents, and unlike what you did.
You can associate an objectid with each object, as you have, but your application code doesn't need to use it; it is only necessary when referencing a remote copy of the model. Whether this objectid is associated with each object as a special field, a hash table, or is computed on the fly is just an implementation detail.
One way to handle this is to compute the objectid on the fly. You can do this if there is a canonical spanning tree over the entire model. In this case, the objectid is "just" the path from the root of the spanning tree to the location of the object. If you don't have a spanning tree, or it is too expensive to compute, you can assign objectids as objects are created (see the sketch below).
The real problem with a duplicated, distributed model, as you have suggested you have, is keeping it up to date with both agents updating it. How do you prevent one agent from creating an object (and assigning an objectid) at the same time as the other, such that the objects being created are different but get the same objectid, or are the same but get different objectids? You'll need remote locking and signalling to keep the models in sync (this is the same problem as "cache coherency" for multiple CPUs; just think of each object as acting like a cache line). The way it is generally solved is to designate who holds the master copy (perhaps of the entire model, perhaps of individual objects within the model) and then issue queries, reads, reads-with-intent-to-modify, or writes to ensure that the "unique" entire model gets updated.
The only solution I am aware of is to send the complete structure, i.e. the Dogs and Persons, over the network. Then they will end up pointing to the correct copies on the other side of the network. The implementation of this solution, however, depends on a lot of circumstances. For example, if your inclusion relation defines a tree, you can approach this problem differently than if it is a graph with cycles.
Have a look at this for more information.
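A minimal sketch of that behaviour with plain Java serialization, using stripped-down Dog/Person classes with direct references (as the other answer recommends):

import java.io.*;

class Person implements Serializable { Dog myDog; }
class Dog implements Serializable { Person myOwner; }

public class GraphRoundTrip {
    public static void main(String[] args) throws Exception {
        Person p = new Person();
        Dog d = new Dog();
        p.myDog = d;
        d.myOwner = p;

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(p); // writes the whole reachable graph once
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            Person copy = (Person) in.readObject();
            // in-stream identity is preserved: the cycle survives the trip
            System.out.println(copy.myDog.myOwner == copy); // true
        }
    }
}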
I guess one can use the proxy pattern for this.
This is a problem about historical data handling.
Suppose you have a class MyClass like the following one:
class MyClass {
String field1;
Integer field2;
Long field3;
String getField1() {...}
void setField1(String ...) {...}
...
}
Now, suppose I need to make MyClass able to store and retrieve old data. What's the best way to do this?
The requirements are to persist the classes through Hibernate too, and to have at most two tables per "entity": either only one table, or one table for the "continuity" class (the one which represents the entity as it evolves over time) and another table for the historical data (as is suggested here).
Please note that I have to be able to assign an arbitrary valid time to the values of the fields.
The class should have an interface like:
class MyClass {
// how to store the fields????
String getField1At(Instant i) {...}
void setField1At(Instant i, String ...) {...}
...
}
I'm currently using the JTemporal library, which has a TemporalAttribute<T> class that behaves like a map: you can do things like T value = myAttr.get(i) to get the version of myAttr at Instant i. I know how to persist a TemporalAttribute in a table with Hibernate (it's simple: persist the SortedMap used by the TemporalAttribute, and you get a table with start and end valid times and the value of the attribute).
The real problem is that here we have multiple attributes.
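To make the multi-attribute case concrete, here is a minimal sketch (plain sorted maps, not the JTemporal API) with one valid-time map per field:

import java.time.Instant;
import java.util.Map;
import java.util.TreeMap;

class MyClass {
    // one history map per field, keyed by the start of validity
    private final TreeMap<Instant, String> field1History = new TreeMap<>();
    private final TreeMap<Instant, Integer> field2History = new TreeMap<>();
    // ...and one map per additional field

    String getField1At(Instant i) {
        Map.Entry<Instant, String> e = field1History.floorEntry(i);
        return e == null ? null : e.getValue();
    }

    void setField1At(Instant i, String value) {
        // the value is valid from i until the start of the next entry
        field1History.put(i, value);
    }
}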
I have a solution in mind but it's not clear, and I'd like to hear your ideas.
Your project reminds me of Hibernate Envers.
The Envers project aims to enable easy auditing of persistent classes. All that you have to do is annotate your persistent class, or some of its properties that you want to audit, with @Audited. For each audited entity, a table will be created, which will hold the history of changes made to the entity. You can then retrieve and query historical data without much effort.
With Envers you can:
choose what you want to audit (on a per-attribute basis)
make your own Revision Entity (which stores information such as revision number, author, timestamp...)
Using Hibernate Envers for this decouples the entities and the revision data (both in the database and in your code).
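A hedged sketch of that Envers setup, using MyClass from the question (the revision-query helper and the revision number are illustrative):

import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import org.hibernate.envers.AuditReader;
import org.hibernate.envers.AuditReaderFactory;
import org.hibernate.envers.Audited;

@Entity
@Audited // every change is recorded in a MyClass_AUD audit table
class MyClass {
    @Id
    Long id;
    String field1;
    Integer field2;
    Long field3;
}

class HistoryQuery {
    MyClass loadAtRevision(EntityManager em, Long id, Number revision) {
        AuditReader reader = AuditReaderFactory.get(em);
        return reader.find(MyClass.class, id, revision); // state at that revision
    }
}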
You can do something like this simply by adding a version number to your domain class. I did something like this where the Id was a composite of a db-assigned number and the version number, but I would advise against that. Use a normal surrogate key, and if you really want, make the [id, version] tuple a natural key.
You can actually version entire object graphs that way, just by ensuring that the version number is the same for all elements of the graph. You can then easily go back to any previous version.
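A hypothetical sketch of that versioning approach (entity and field names are made up):

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

// A surrogate primary key plus a [businessId, version] pair identifying one
// snapshot of the logical entity; every save writes a new row instead of
// updating in place.
@Entity
public class MyClassVersion {
    @Id
    @GeneratedValue
    private Long id;          // surrogate key

    private Long businessId;  // identifies the logical entity across versions
    private Integer version;  // incremented for each new snapshot

    private String field1;
    private Integer field2;
    private Long field3;
}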
You should write a lot of service tests to ensure the integrity of the code that manages the versions.