Versioned Serialization in Java

I have a simple Java class that I need to serialize to be stored as a value in an RDBMS or a key-value store. The class is just a collection of properties of simple types (native types or Maps/Lists of native types). The issue is that the class will likely be evolving over time (likely: adding new properties, less likely but still possible: renaming a property, changing the type of a property, deleting a property).
I'd like to be able to gracefully handle changes in the class version. Specifically, when an attempt is made to de-serialize an instance from an older version, I'd like to be able to specify some sort of handler for managing the transition of the older instance to the newest version.
I don't think Java's built-in serialization is appropriate here. Before I try to roll my own solution, I'm wondering if anyone knows of any existing libraries that might help? I know of a ton of alternative serialization methods for Java, but I'm specifically looking for something that will let me gracefully handle changes to a class definition over time.
Edit:
For what it's worth, I ended up going with Protocol Buffers (http://code.google.com/p/protobuf/) serialization, since it's flexible about adding and renaming fields, while being one less piece of code I have to maintain (compared to custom Java serialization using readObject/writeObject).

Java serialisation allows customising of the serial form by providing readObject and writeObject methods. Together with ObjectInputStream.readFields, ObjectOutputStream.putFields and defining serialPersistentFields, the serialised form can be unrelated to the actual fields in the implementation class.
However, Java serialisation produces opaque data that is not amenable to reading and writing through other techniques.
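For illustration, a minimal sketch of that technique (the class and field names here are invented):

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

// Hypothetical example: the serial form is pinned to two logical fields,
// independent of how the class stores its data internally.
public class Settings implements Serializable {
    private static final long serialVersionUID = 1L;

    // Only these two fields exist in the serialised form.
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("name", String.class),
        new ObjectStreamField("timeout", Integer.TYPE)
    };

    private transient String displayName;   // internal representation, free to change
    private transient int timeoutMillis;

    private void writeObject(ObjectOutputStream out) throws IOException {
        ObjectOutputStream.PutField fields = out.putFields();
        fields.put("name", displayName);
        fields.put("timeout", timeoutMillis);
        out.writeFields();
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream.GetField fields = in.readFields();
        displayName = (String) fields.get("name", null);
        timeoutMillis = fields.get("timeout", 0);
    }
}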

Perhaps you should map your Java class into the relational model instead. Dumping a language-specific serialized blob into a database column is a horrible approach.

This is pretty straightforward using readObject and writeObject.
Try setting serialVersionUID to a fixed value, then define a static final field for your own version number. readObject can then use a switch statement to populate the fields depending on the version. We use this to store historical data on our file system. It's very quick on retrieval, so much so that users can't tell the difference.
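A rough sketch of what that can look like (the class, fields and version numbers are hypothetical):

import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example: serialVersionUID stays fixed, and an explicit version
// marker written into the stream drives a switch in readObject.
public class Record implements Serializable {
    private static final long serialVersionUID = 1L;  // never changes
    private static final int CURRENT_VERSION = 2;     // our own version field

    private String name;
    private int retries;   // added in version 2

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.writeInt(CURRENT_VERSION);
        out.writeUTF(name);
        out.writeInt(retries);
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        int version = in.readInt();
        switch (version) {
            case 1:
                name = in.readUTF();
                retries = 3;               // sensible default for data written before the field existed
                break;
            case 2:
                name = in.readUTF();
                retries = in.readInt();
                break;
            default:
                throw new InvalidObjectException("Unknown version " + version);
        }
    }
}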

I had a similar problem. I found that Java's serialVersionUID doesn't help much when you have multiple versions of objects, so I rolled my own.
Here is what I do to save our user sessions:
In my DB, besides the BLOB field for serialized objects, I added a version column.
Whenever we change the session object, I save the old class under a versioned name, for example SessionV3.
Session is always written to the DB with current version number.
When reading the session, it's deserialized into the current session object directly if the version is current. Otherwise, it's deserialized into the old object and manually copied into the current session object (SessionV3 => Session).
Once in a while, we run a DB script to remove really old session versions so we can clean the old session classes out of the code. If we care about the old sessions, we can choose to convert them as well.
There might be an easier way to do this, but our approach gives us the most flexibility.
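A compressed, hypothetical sketch of that read path (Session, SessionV3 and the column names are placeholders for our real classes):

import java.io.ByteArrayInputStream;
import java.io.ObjectInputStream;
import java.sql.ResultSet;

// Hypothetical read path: the version column decides how the BLOB is interpreted.
Session readSession(ResultSet rs) throws Exception {
    int version = rs.getInt("session_version");
    byte[] blob = rs.getBytes("session_data");

    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
        Object raw = in.readObject();
        if (version == Session.CURRENT_VERSION) {
            return (Session) raw;                 // current layout, no conversion needed
        } else if (version == 3) {
            SessionV3 old = (SessionV3) raw;      // older layout: copy field by field
            Session converted = new Session();
            converted.setUserId(old.getUserId());
            // ... copy the remaining fields ...
            return converted;
        }
        throw new IllegalStateException("Unsupported session version " + version);
    }
}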

Never tried it, but you may be able to do something with a custom class loader to load the correct version of the class file at runtime for the object being deserialized.

Related

Strategy for managing Java object serialization version when using cache

I have an application that uses JBoss Cache to cache some internal objects and the cache data is persisted in a MySQL database. When the application starts, data persisted in the database will be loaded into the cache. Java serialization mechanism is used to persist the data. I assigned a fixed value to serialVersionUID field for those persistent Java classes.
The problem is that when someone introduces incompatible changes to the serialization format, loading the cache data from the database fails due to de-serialization errors. After some research, I found some possible solutions, but I'm not sure which one is best.
Change the serialVersionUID to a new value. This approach seems easy, but do I need to change the serialVersionUID in every class, or just in the classes that actually changed? Since what's being read is really an object graph, will it be a problem during de-serialization if different classes use different serialization versions?
Keep the same serialVersionUID value but write my own readObject and writeObject.
Use the Externalizable interface.
Which approach may be the best solution?
The first option is good enough for a cache system, as an incompatible serialization change should be a rare event and should be treated as a cache miss. Basically, you need to discard the incompatible class instances from the cache and re-add the new ones. There is no need to change the serialVersionUID of compatible classes.
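For illustration, a small self-contained sketch of treating an incompatible entry as a cache miss (the cache shape and key are made up):

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.util.Map;

// Hypothetical helper: deserialize a cached blob, evicting it if the class has
// changed incompatibly since the entry was written.
static Object loadOrEvict(Map<String, byte[]> cache, String key) {
    byte[] blob = cache.get(key);
    if (blob == null) {
        return null;
    }
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
        return in.readObject();
    } catch (InvalidClassException e) {
        cache.remove(key);   // stale entry from an older class version: treat as cache miss
        return null;         // caller recomputes and re-adds the value
    } catch (IOException | ClassNotFoundException e) {
        throw new IllegalStateException("Corrupt cache entry for key " + key, e);
    }
}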
If you want to further minimize the number of incompatibilities between versions of a serializable class, you might consider the second option: customize your class's serialized form or provide a serialization proxy. Here you should ask whether a logical representation of the class is simpler than the default physical representation, which might serialize unnecessary implementation details.
Providing a custom serialized form can have further important advantages for a cache system: the serialized form may be smaller and serialization may be faster.
See also "Effective Java" by Joshua Bloch, which has a good chapter discussing serialization issues.
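A bare-bones sketch of the serialization proxy pattern from that book (the Money class is just an illustration):

import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.Serializable;

// Hypothetical example of a serialization proxy: only the proxy's simple,
// logical fields ever hit the stream, so the outer class can evolve freely.
public final class Money implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String currency;
    private final long cents;

    public Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    private Object writeReplace() {
        return new Proxy(currency, cents);   // serialize the proxy instead of this
    }

    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy required");   // block direct deserialization
    }

    private static final class Proxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String currency;
        private final long cents;

        Proxy(String currency, long cents) {
            this.currency = currency;
            this.cents = cents;
        }

        private Object readResolve() {
            return new Money(currency, cents);   // rebuild the real object on read
        }
    }
}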

Serialize a static class

In my Android application I have several activities where the user configures a certain operation. In every step of the configuration I store the parameters supplied by the user in a static class that manages this operation. After the configuration, all activities can access these parameters and everything works perfectly.
Except that I want to persist this configuration so that in future executions of the app there is no need to configure it again.
How can I restore the static class state?
I didn't want to create a table in the database just to store this one object (I think it is an ugly solution).
I could also dump all the configuration into SharedPreferences, but I have a lot of parameters in the manager, and it stores lists of objects with the results of the operation execution, so it would be a bit of a pain in the ass to store it all manually in a key/value solution.
Instead I was thinking of serializing the class into a file; on application startup I check whether the file exists, and if so, I deserialize it into my manager again.
Is this a correct approach, or are there prettier solutions? Also, would the list objects in this static class be serialized, or do I need to serialize each of them separately?
I was thinking of doing something like what is shown in this example.
I think serializing to a file is an excellent idea. Seems a lot cleaner and simpler than dealing with databases or key/value pairs. Otherwise it feels like you are serializing the serialization.
And to answer your second question: generally, implementations of java.util.List also implement java.io.Serializable, so you do not need to serialize them separately.
I would do exactly what you describe with the file, and in that example.
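If it helps, here is a rough sketch of the file round trip on Android. Since the static class itself can't be serialized, the sketch assumes a small Serializable snapshot object (ConfigSnapshot, invented here) that the manager can export and re-import; the file name is also made up.

import android.content.Context;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical ConfigSnapshot: a Serializable holder for the manager's parameters,
// including its lists, which are written out along with the object graph.
void saveConfig(Context context, ConfigSnapshot snapshot) throws Exception {
    try (ObjectOutputStream out = new ObjectOutputStream(
            context.openFileOutput("config.ser", Context.MODE_PRIVATE))) {
        out.writeObject(snapshot);
    }
}

ConfigSnapshot loadConfig(Context context) throws Exception {
    try (ObjectInputStream in = new ObjectInputStream(context.openFileInput("config.ser"))) {
        return (ConfigSnapshot) in.readObject();
    }
}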

Can a non-serializable Java object be stored in a MySQL BLOB column?

I have Java objects which are not serializable. They come from an external library and I cannot flag them as Serializable. Here are a couple of questions:
1) Can they still be written to a MySQL BLOB column?
2) Is there any other way of persisting them outside of my JVM?
Any help will be useful.
1) Have you tried it?
2) Sure, for example in XML files. I personally use XStream.
1) Can they still be written to a mySQL BLOB column?
Yes, but you'll need to implement a serialisation algorithm to generate the bytes. Also, you will need to be sure you can access all the required internal state.
2) Is there any other way of persisting them outside of my JVM?
Take a look at XStream
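For reference, the XStream round trip is roughly this (externalObject stands in for an instance of the third-party class):

import com.thoughtworks.xstream.XStream;

// XStream reads the object's fields reflectively, so the third-party class
// does not need to implement Serializable.
XStream xstream = new XStream();
String xml = xstream.toXML(externalObject);   // persist this string, e.g. in a TEXT column
Object copy = xstream.fromXML(xml);           // reconstruct it later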
Well, they don't have to be serializable in terms of "Java's default binary serialization scheme" but they have to be serializable somehow, by definition: you've got to have some way of extracting the data into a byte array, and then reconstituting it later from that array.
How you do that is up to you - there are lots of serialization frameworks/protocols/etc around.
They can, but not automatically. You'll have to come up with your own code to construct a binary representation of your object, persist the binary data to your database, and then reconstruct the object when you pull the data out of the database.
Yes, but again it will not be automatic. You'll have to come up with your own binary representation of the object, decide how you want to store it, and then reconstruct the object when you want to read it.
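As a sketch of that, assuming you can reach the needed state through the object's accessors (ExternalThing and its getters are hypothetical):

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;

// Hypothetical: build your own byte representation from the object's accessible
// state, then store it in the BLOB column with plain JDBC.
void store(Connection conn, long id, ExternalThing thing) throws Exception {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream data = new DataOutputStream(buffer)) {
        data.writeUTF(thing.getName());   // whatever state you can reach
        data.writeInt(thing.getCount());
    }
    try (PreparedStatement ps =
            conn.prepareStatement("UPDATE things SET payload = ? WHERE id = ?")) {
        ps.setBytes(1, buffer.toByteArray());
        ps.setLong(2, id);
        ps.executeUpdate();
    }
}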
Serializable doesn't do anything in itself; it's just a way to hint that a class can be serialized. Some tools require the presence of the interface while some do not.
I haven't looked into storing Java objects as MySQL BLOBs, but if serializable Java objects can be stored that way, then I see no reason why it wouldn't be possible.
2) Is there any other way of persisting them outside of my JVM?
There are many ways to persist objects outside the JVM: store them to disk, FTP, network storage, etc., and there are just as many tools for storing them in various formats (such as XML).

What's the best way to read a UDT from a database with Java?

I thought I knew everything about UDTs and JDBC until someone on SO pointed out some details of the Javadoc of java.sql.SQLInput and java.sql.SQLData to me. The essence of that hint was (from SQLInput):
An input stream that contains a stream of values representing an instance of an SQL structured type or an SQL distinct type. This interface, used only for custom mapping, is used by the driver behind the scenes, and a programmer never directly invokes SQLInput methods.
This is quite the opposite of what I am used to doing (and which is also in stable, productive use with the Oracle JDBC driver): implement SQLData and provide this implementation in a custom mapping to
ResultSet.getObject(int index, Map mapping)
The JDBC driver will then call-back on my custom type using the
SQLData.readSQL(SQLInput stream, String typeName)
method. I implement this method and read each field from the SQLInput stream. In the end, getObject() will return a correctly initialised instance of my SQLData implementation holding all data from the UDT.
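In code, that approach looks roughly like this (the ADDRESS_T type and its attributes are invented for the example):

import java.sql.SQLData;
import java.sql.SQLException;
import java.sql.SQLInput;
import java.sql.SQLOutput;
import java.util.HashMap;
import java.util.Map;

// Hypothetical mapping class for an "ADDRESS_T" SQL object type.
public class Address implements SQLData {
    private String sqlTypeName;
    private String street;
    private String city;

    @Override
    public String getSQLTypeName() {
        return sqlTypeName;
    }

    @Override
    public void readSQL(SQLInput stream, String typeName) throws SQLException {
        sqlTypeName = typeName;
        street = stream.readString();   // attributes are read in declaration order
        city = stream.readString();
    }

    @Override
    public void writeSQL(SQLOutput stream) throws SQLException {
        stream.writeString(street);
        stream.writeString(city);
    }
}

// Usage: pass the type map explicitly when reading the column.
Map<String, Class<?>> mapping = new HashMap<>();
mapping.put("ADDRESS_T", Address.class);
Address address = (Address) resultSet.getObject(1, mapping);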
To me, this seems like the perfect way to implement such a custom mapping. Good reasons for going this way:
I can use the standard API, instead of using vendor-specific classes such as oracle.sql.STRUCT, etc.
I can generate source code from my UDTs, with appropriate getters/setters and other properties
My questions:
What do you think about my approach, implementing SQLData? Is it viable, even if the Javadoc states otherwise?
What other ways of reading UDT's in Java do you know of? E.g. what does Spring do? what does Hibernate do? What does JPA do? What do you do?
Addendum:
UDT support and integration with stored procedures is one of the major features of jOOQ. jOOQ aims at hiding the more complex "JDBC facts" from client code, without hiding the underlying database architecture. If you have questions similar to the above, jOOQ might provide an answer for you.
The advantage of configuring the driver so that it works behind the scenes is that the programmer does not need to pass the type map into ResultSet.getObject(...) and therefore has one less detail to remember (most of the time). The driver can also be configured at runtime using properties to define the mappings, so the application code can be kept independent of the details of the SQL type to object mappings. If the application could support several different databases, this allows different mappings to be supported for each database.
Your method is viable; its main characteristic is that the application code uses explicit type mappings.
In the behind-the-scenes approach, the ResultSet.getObject(int) method will use the type mappings defined on the connection rather than those passed by the application code in ResultSet.getObject(int index, Map mapping). Otherwise the approaches are the same.
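As a small sketch of that behind-the-scenes configuration (reusing the hypothetical ADDRESS_T/Address mapping from above):

import java.sql.Connection;
import java.util.Map;

// Register the mapping once on the connection, so plain getObject(int)
// returns Address instances without an explicit map argument.
Map<String, Class<?>> typeMap = connection.getTypeMap();
typeMap.put("ADDRESS_T", Address.class);
connection.setTypeMap(typeMap);

Address address = (Address) resultSet.getObject(1);   // uses the connection's type map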
Other Approaches
I have seen another approach used with JBoss 4 based on these classes:
org.jboss.ejb.plugins.cmp.jdbc.JDBCParameterSetter
org.jboss.ejb.plugins.cmp.jdbc.JDBCResultSetReader.AbstractResultSetReader
The idea is the same but the implementation is non-standard (it probably pre-dates the version of the JDBC standard defining SQLData/SQLInput).
What other ways of reading UDT's in Java do you know of? E.g. what does Spring do? what does Hibernate do? What does JPA do? What do you do?
An example of how something similar to this can be done in Hibernate/JPA is shown in this answer to another question:
Java Enums, JPA and Postgres enums - How do I make them work together?
I know what Spring does: you write implementations of their RowMapper interface. I've never used SQLData with Spring. Your post was the first time I'd ever heard of or thought about that interface.
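For reference, a minimal RowMapper sketch, assuming a plain Address bean with a (street, city) constructor (both hypothetical):

import java.sql.ResultSet;
import java.sql.SQLException;
import org.springframework.jdbc.core.RowMapper;

// Hypothetical mapper: Spring's JdbcTemplate calls mapRow once per result row.
class AddressRowMapper implements RowMapper<Address> {
    @Override
    public Address mapRow(ResultSet rs, int rowNum) throws SQLException {
        return new Address(rs.getString("street"), rs.getString("city"));
    }
}

// Usage: List<Address> addresses = jdbcTemplate.query("SELECT street, city FROM addresses", new AddressRowMapper());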

How to Serialize Hibernate Collections Properly?

I'm trying to serialize objects from a database that have been retrieved with Hibernate, and I'm only interested in the objects' actual data in its entirety (cycles included).
Now I've been working with XStream, which seems powerful. The problem with XStream is that it looks at the information all too blindly: it recognizes Hibernate's PersistentCollections for what they are, with all the Hibernate metadata included. I don't want to serialize those.
So, is there a reasonable way to extract the original Collection from within a PersistentCollection, and also to initialize all referenced data the objects might be pointing to? Or can you recommend a better approach?
(The results from Simple seem perfect, but it can't cope with such basic util classes as Calendar. It also accepts only one annotated object at a time.)
The solution described here worked well for me: http://jira.codehaus.org/browse/XSTR-226
The idea is to have a custom XStream converter/mapper for Hibernate collections, which extracts the actual collection from the Hibernate one and calls the corresponding standard converter (for ArrayList, HashMap, etc.).
I recommend a simpler approach: use Dozer (http://dozer.sf.net). Dozer is a bean mapper; you can use it to convert, say, a PersonEJB to an object of the same class. Dozer will recursively trigger all proxy fetches through getter() calls, and will also convert source types to destination types (say, java.sql.Date to java.util.Date).
Here's a snippet:
MapperIF mapper = DozerBeanMapperSingletonWrapper.getInstance();
PersonEJB serializablePerson = mapper.map(myPersonInstance, PersonEJB.class);
Bear in mind that as Dozer walks through your object tree it will trigger the proxy loading one by one, so if your object graph has many proxies you will see many queries, which can be expensive.
What generally seems to be the best way to do it, and the way I am currently doing it, is to have another layer of DTO (Data Transfer Object) objects. This way you can exclude data that you don't want to go over the channel, as well as limit the depth to which the graph is serialized. I use Dozer for my current mapping from Hibernate objects to DTOs for the Flex client.
It works great, with a few caveats:
It's not fast, in fact it's downright slow. If you send a lot of data, Dozer will not perform very well. This is mostly because of the Reflection involved in performing its magic.
In a few cases you'll have to write custom converters for special behavior. These work very well, but they are bi-directional. I personally had to hack the Dozer source to allow uni-directional custom converters.
