Strategy for managing Java object serialization version when using cache - java

I have an application that uses JBoss Cache to cache some internal objects and the cache data is persisted in a MySQL database. When the application starts, data persisted in the database will be loaded into the cache. Java serialization mechanism is used to persist the data. I assigned a fixed value to serialVersionUID field for those persistent Java classes.
The problem is that when someone introduces an incompatible change to the serialization format, loading the cache data from the database fails with deserialization errors. After some research, I found a few possible solutions, but I'm not sure which one is best.
Change the serialVersionUID to use a new version. This approach seems to be easy. But do I need to change all the classes' serialVersionUID field value to the new version, or just the actually changed classes? Because what's been read is actually an object graph, if different serialization versions are used for different classes, will that be a problem when de-serializing the whole object graph?
Use the same serialVersionUID value but create my own readObject and writeObject methods.
Use Externalizable interface.
Which approach may be the best solution?

The first option is good enough for a cache system as an incompatible serialization change should be a rare event and should be treated as a cache miss. Basically you need to discard from cache the incompatible class instances and re-add the new ones. No need to change the serialVersionUID for compatible classes.
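A minimal sketch of the "treat it as a cache miss" idea, assuming the cache loader has the stored bytes in hand. CacheEntryCodec and its method names are hypothetical and are not part of JBoss Cache; the point is only that a failed deserialization becomes an empty result instead of a startup failure.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Optional;

/** Hypothetical helper: a failed deserialization is reported as a cache miss. */
final class CacheEntryCodec {

    private CacheEntryCodec() {}

    /** Serializes a value with standard Java serialization. */
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Returns the stored value, or empty if the current classes cannot read the bytes. */
    static Optional<Object> tryRead(byte[] stored) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return Optional.ofNullable(in.readObject());
        } catch (IOException | ClassNotFoundException e) {
            // InvalidClassException (thrown on a serialVersionUID mismatch) is an
            // IOException, so it lands here: report a miss and let the caller reload.
            return Optional.empty();
        }
    }
}
```

On an empty result the caller would delete the stale row and rebuild the entry from its original source, exactly as for any other miss.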
If you want to further minimize the number of incompatibilities between serializable class versions you might consider the second option to customize your class serialization form or provide a serialization proxy. Here you should think if a logical representation of the class is simpler than the default physical representation that might serialize unnecessary implementation details.
Providing a custom serialization form might have further important advantages for a cache system: the size of the serialization form might be reduced and the speed of serialization process might be increased.
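As an illustration of the serialization proxy mentioned above, here is a small sketch. The Range class and its fields are made up for the example; only the writeReplace/readResolve/readObject hooks are standard Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical value class whose stream format is defined by a proxy. */
final class Range implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int low;
    private final int high;

    Range(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }

    int low() { return low; }
    int high() { return high; }

    // Serialization writes the proxy instead of this object.
    private Object writeReplace() {
        return new SerializedForm(low, high);
    }

    // A stream that contains a raw Range (bypassing the proxy) is rejected.
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Serialization proxy required");
    }

    /** The logical representation: the only thing that ends up in the stream. */
    private static final class SerializedForm implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int low;
        private final int high;

        SerializedForm(int low, int high) {
            this.low = low;
            this.high = high;
        }

        // Deserialization rebuilds the real object through its constructor,
        // so the invariant check runs again.
        private Object readResolve() {
            return new Range(low, high);
        }
    }

    /** Demo helper: copies a Range by serializing and deserializing it. */
    static Range copyViaSerialization(Range range) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(range);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Range) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because only SerializedForm appears in the stream, the internals of Range can change freely as long as the proxy keeps its two fields.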
See also "Effective Java" by Joshua Bloch, which has a good chapter discussing serialization issues.

Related

What happens in a clustered wicket environment when a model object is not serializable?

I'm seeing a NotSerializableException in the logs for a few of our model objects and I know that the fix for this is to make them serializable, but we are also seeing MarkupExceptions about components not being added to the page and I'm wondering if this could be related. We're only seeing the errors in production where clustering is turned on.
So my question is, what happens when a model object is not serializable, even if all its attributes are serializable?
As far as I know, if you do not declare a class as serializable, then it will be missing from the serialized version on the subsequent actions (e.g. form submissions, behaviour, AJAX). Consequently, when the object is deserialized, it is likely that any object references will be null if child object cannot successfully be reloaded from the storage.
You should definitely avoid serializing objects unnecessarily. This includes in response to AJAX requests.
Best practices dictate:
Store only the minimum serialized objects required
Avoid declaring variables 'final' and referencing these from anonymous inner classes
Avoid having too many objects stored as fields in serialized objects (pages, panels or lists, etc)
Load objects for each request - especially model objects, which should be loaded from your data repository as each request is processed
Load your data objects outside of the constructor
Use models for all data, and attempt to mirror the Wicket structure for simplicity (e.g. using CompoundPropertyModel, where fields are loaded from the model object using reflection, based on the wicket:id used)
Use detachable models for any large objects that are loaded
Avoid using anonymous inner classes too much - use proper event handlers so your code is easier to read and maintain.
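The detachable-model advice above can be sketched in plain Java. Wicket ships its own LoadableDetachableModel for this purpose; the classes below are a hypothetical JDK-only analogue, written only to show the idea that the small key is serialized while the loaded object is not.

```java
import java.io.Serializable;

/** Hypothetical analogue of a detachable model: the loaded value is never serialized. */
abstract class DetachableValue<T> implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient T value;          // dropped on serialization
    private transient boolean attached; // false again after deserialization

    /** Loads the value from wherever it really lives (database, service, ...). */
    protected abstract T load();

    final T get() {
        if (!attached) {
            value = load();
            attached = true;
        }
        return value;
    }

    /** Called at the end of a request so the heavy object is not kept around. */
    final void detach() {
        value = null;
        attached = false;
    }

    final boolean isAttached() {
        return attached;
    }
}

/** Example subclass: only the small userId is part of the serialized state. */
final class GreetingValue extends DetachableValue<String> {
    private static final long serialVersionUID = 1L;

    private final int userId;

    GreetingValue(int userId) {
        this.userId = userId;
    }

    @Override
    protected String load() {
        // Stand-in for a repository lookup.
        return "user-" + userId;
    }
}
```

In a real Wicket page the framework calls detach at the end of each request; here it has to be called by hand.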
I have been working on a complex Wicket application for some time and I can tell you that you want to avoid duplicate objects appearing due to overuse of serialization / deserialization - it can be a nightmare to debug and fix.
Have a read of these for more information / suggestions:
http://letsgetdugg.com/2009/04/19/wicket-anti-patterns-avoiding-session-bloat/
https://cwiki.apache.org/confluence/display/WICKET/Best+Practices+and+Gotchas

JPA, complex object graphs and Serialisation

I have a "Project" entity/class that includes a number of "complex" fields, e.g. fields referenced as interfaces with many possible implementations. To give an example: an interface Property<T>, with T of virtually any type (as many types as I have implemented).
I use JPA. For those fields I have had no choice but to actually serialize them to store them. Although I have no need to use those objects in my queries, this is obviously leading to some issues, eg maintenance/updates to start with.
I have two questions:
1) is there a "trick" I could consider to keep my database up to date in case I have a "breaking" change in my serialised class (most of the time serialisation changes are handled well)?
2) will moving to JDO help at all? I have very little experience with JDO, but my understanding is that with JDO, having serialised objects in the tables will never happen (how are changes handled though?).
In support to 2) I must also add that the object graphs I have can be quite complex, possibly involving 10s of tables just to retrieve a full "Project" for instance.
JDO supports persistence of interface fields (DataNucleus JPA also allows it, but as a vendor extension). A field that can hold any of several implementation types is a problem for the RDBMS rather than for JDO as such. The underlying datastore is the real issue, since it cannot adequately mirror your model, and one of the many other datastores could help with that. For example, DataNucleus JDO/JPA supports GAE/Datastore, Neo4j, MongoDB, HBase, ODF, Excel, etc., and simply persists the "id" of the related object in a "column" (or equivalent) in the owning object's representation, so breaking changes would be much rarer than they are now.

DynaBeans vs CodeGenerated JavaBeans performance implications

I have to use JavaBeans in my application.
The application is a config driven application. Depending on config, different JavaBeans Classes will be required.
One option is that depending on configuration, I use a code generator to generate JavaBean classes.
Other option that sounds very appealing is use Dynamic Beans from Apache Beanutils. It saves me from one extra step of Code generation.
Can you please help me understand the performance and memory implications of using DynaBeans vs generated JavaBeans?
Is there any better alternative to DynaBeans?
In both cases, I will be using Apache BeanUtils to invoke getters/setters later.
I have been looking at the BeanUtils implementation of BasicDynaBean and have reached the following conclusions when comparing it with a code-generated JavaBean.
Memory
BasicDynaBean uses a HashMap to store keys/values. If there are 1000 instances of some DynaBean, a lot of memory is wasted because the keys are stored again in each instance. It therefore consumes more memory than a code-generated JavaBean, and I would not recommend it if you are going to keep a large number of DynaBean instances in memory.
Speed
To access fields, it calls get/put on the HashMap. It is therefore faster than code-generated JavaBeans in my case, because there I would have to invoke the getter/setter methods through reflection.

Can a non-serialized Java object be stored in mySQL BLob column?

I have some Java objects which are not serializable. They come from an external library and I cannot flag them as serializable. Here are a couple of questions.
1) Can they still be written to a mySQL BLOB column?
2) Is there any other way of persisting them outside of my JVM?
Any help will be useful.
Thanks
-a.
1) Have you tried it ?
2) Sure, for example in XML files. I personally use XStream.
1) Can they still be written to a mySQL BLOB column?
Yes, but you'll need to implement a serialisation algorithm to generate the bytes. Also, you will need to be sure you can access all the required internal state.
2) Is there any other way of persisting them outside of my JVM?
Take a look at XStream
Well, they don't have to be serializable in terms of "Java's default binary serialization scheme" but they have to be serializable somehow, by definition: you've got to have some way of extracting the data into a byte array, and then reconstituting it later from that array.
How you do that is up to you - there are lots of serialization frameworks/protocols/etc around.
They can, but not automatically. You'll have to come up with your own code to construct a binary representation of your object, persist the binary data to your database, and then reconstruct the object when you pull the data out of the database.
Yes, but again it will not be automatic. You'll have to come up with your own binary representation of the object, decide how you want to store it, and then reconstruct the object when you want to read it.
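A minimal sketch of such a hand-written binary representation, using only java.io. LibraryPoint is a made-up stand-in for the third-party class, and the codec assumes every piece of state it needs is reachable through public getters and a constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Stand-in for a third-party class that does not implement Serializable. */
final class LibraryPoint {
    private final int x;
    private final String label;

    LibraryPoint(int x, String label) {
        this.x = x;
        this.label = label;
    }

    int getX() { return x; }
    String getLabel() { return label; }
}

/** Hypothetical codec: turns a LibraryPoint into bytes and back. */
final class LibraryPointCodec {

    private LibraryPointCodec() {}

    static byte[] encode(LibraryPoint point) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            // The field order here defines the format; decode must read in the same order.
            out.writeInt(point.getX());
            out.writeUTF(point.getLabel());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static LibraryPoint decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int x = in.readInt();
            String label = in.readUTF();
            return new LibraryPoint(x, label);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting byte[] can then be bound to a BLOB column, for example with PreparedStatement.setBytes, and read back with ResultSet.getBytes.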
Serializable doesn't do anything in itself; it's just a marker that a class may be serialized. Some tools require the interface to be present, while others do not.
I haven't looked into storing java objects as mySQL BLOB, but if serializable java objects can be then I see no reason why it wouldn't be possible.
2) Is there any other way of persisting them outside of my JVM?
There are many ways to persist objects outside JVM. Store it to disk, ftp, network storage, etc., and there exist just as many tools for storing in various format (such as XML, etc.).

Versioned Serialization in Java

I have a simple Java class that I need to serialize to be stored as a value in an RDBMS or a key-value store. The class is just a collection of properties of simple types (native types or Maps/Lists of native types). The issue is that the class will likely be evolving over time (likely: adding new properties, less likely but still possible: renaming a property, changing the type of a property, deleting a property).
I'd like to be able to gracefully handle changes in the class version. Specifically, when an attempt is made to de-serialize an instance from an older version, I'd like to be able to specify some sort of handler for managing the transition of the older instance to the newest version.
I don't think Java's built-in serialization is appropriate here. Before I try to roll my own solution, I'm wondering if anyone knows of any existing libraries that might help? I know of a ton of alternative serialization methods for Java, but I'm specifically looking for something that will let me gracefully handle changes to a class definition over time.
Edit:
For what it's worth, I ended up going with Protocol Buffers (http://code.google.com/p/protobuf/) serialization, since it is flexible about adding and renaming fields, while being one less piece of code I have to maintain (compared with custom Java serialization using readObject/writeObject).
Java serialisation allows customising of the serial form by providing readObject and writeObject methods. Together with ObjectInputStream.readFields, ObjectOutputStream.putFields and defining serialPersistentFields, the serialised form can be unrelated to the actual fields in the implementation class.
However, Java serialisation produces opaque data that is not amenable to reading and writing through other techniques.
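To make the serialPersistentFields mechanism concrete, here is a small sketch. The Timeout class is invented for the example: its serial form is a single long named "seconds", while the implementation happens to store milliseconds.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

/** Hypothetical class whose serial form differs from its implementation field. */
final class Timeout implements Serializable {
    private static final long serialVersionUID = 1L;

    // The serial form is declared explicitly: one long named "seconds".
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("seconds", long.class)
    };

    // Implementation detail that is free to change: the value is held in milliseconds.
    private long millis;

    Timeout(long seconds) {
        this.millis = seconds * 1000L;
    }

    long seconds() {
        return millis / 1000L;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        // Write the declared serial field rather than the implementation field.
        ObjectOutputStream.PutField fields = out.putFields();
        fields.put("seconds", seconds());
        out.writeFields();
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        // Read the declared serial field and convert it to the internal unit.
        ObjectInputStream.GetField fields = in.readFields();
        millis = fields.get("seconds", 0L) * 1000L;
    }

    /** Demo helper: copies a Timeout by serializing and deserializing it. */
    static Timeout copyViaSerialization(Timeout timeout) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(timeout);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Timeout) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If the internal unit later changes again, only writeObject and readObject need to be adjusted; data already stored keeps the "seconds" layout.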
Perhaps you should map your Java class into the relational model. Dumping some language serialized blob into a database column is a horrible approach.
This is pretty straightforward using read and write object.
Try setting serialVersionUID to a fixed value, then define a static final field for your version (static fields are not serialized, so writeObject has to write the version to the stream explicitly). readObject can then use a switch statement to populate the fields depending on the version. We use this to store historical data on our file system. It's very quick on retrieval, so much so that users can't tell the difference.
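A sketch of that version-switch approach, under the assumption that the class writes all of its own state by hand. The Customer class, its fields, and the writingLegacyFormat hook are hypothetical; the hook exists only so the example can produce old-format bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

final class Customer implements Serializable {
    private static final long serialVersionUID = 1L; // stays fixed across format changes
    private static final int CURRENT_VERSION = 2;

    // All state is transient because writeObject/readObject handle it explicitly.
    private transient String name;
    private transient String email;                  // hypothetical field added in version 2
    private transient int versionToWrite = CURRENT_VERSION; // demo-only hook

    Customer(String name, String email) {
        this.name = Objects.requireNonNull(name);
        this.email = Objects.requireNonNull(email);
    }

    String name() { return name; }
    String email() { return email; }

    /** Demo only: makes this instance serialize itself in the version 1 layout. */
    Customer writingLegacyFormat() {
        versionToWrite = 1;
        return this;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(versionToWrite); // the version always comes first
        out.writeUTF(name);
        if (versionToWrite >= 2) {
            out.writeUTF(email);
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int version = in.readInt();
        switch (version) {
            case 1:
                name = in.readUTF();
                email = ""; // version 1 had no email, so pick a default
                break;
            case 2:
                name = in.readUTF();
                email = in.readUTF();
                break;
            default:
                throw new InvalidObjectException("Unknown format version " + version);
        }
        versionToWrite = CURRENT_VERSION; // field initializers do not run on deserialization
    }

    /** Demo helper: copies a Customer by serializing and deserializing it. */
    static Customer copyViaSerialization(Customer customer) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(customer);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Customer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each new format version adds one case to the switch; older cases stay in place for as long as old data may still exist.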
I had a similar problem. I found out Java's serialVersionUID doesn't help much when you have multiple versions of objects. So I rolled out my own.
Here is what I do to save our user sessions,
In my DB, besides the BLOB field for serialized objects, I added a version column.
Whenever we change the session object, I save the old class, for example SessionV3.
Session is always written to the DB with current version number.
When reading the session, it's deserialized into session object directly if version is current. Otherwise, it's deserialized into old object and manually copied into current session object (SessionV3 => Session).
Once in a while, we run a DB script to remove really old session versions so we can delete the old session classes from the code. If we care about the old sessions, we can choose to convert them as well.
There might be an easier way to do this, but our approach gives us the most flexibility.
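The steps above could look roughly like the following sketch. SessionRow stands in for a database row with a version column and a BLOB; the class names other than SessionV3 and Session, the locale field, and the "en" default are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Old shape of the session, kept only so that old blobs can still be read. */
final class SessionV3 implements Serializable {
    private static final long serialVersionUID = 3L;
    final String user;

    SessionV3(String user) { this.user = user; }
}

/** Current shape of the session. */
final class Session implements Serializable {
    private static final long serialVersionUID = 4L;
    static final int CURRENT_VERSION = 4;

    final String user;
    final String locale; // hypothetical field added after V3

    Session(String user, String locale) {
        this.user = user;
        this.locale = locale;
    }

    /** Manual copy from the old shape, with a default for the new field. */
    static Session fromV3(SessionV3 old) {
        return new Session(old.user, "en");
    }
}

/** Stand-in for one database row: a version column plus the BLOB. */
final class SessionRow {
    final int version;
    final byte[] blob;

    SessionRow(int version, byte[] blob) {
        this.version = version;
        this.blob = blob;
    }
}

final class SessionStore {
    private SessionStore() {}

    /** Sessions are always written with the current version number. */
    static SessionRow write(Session session) {
        return new SessionRow(Session.CURRENT_VERSION, serialize(session));
    }

    /** Reads directly if the version is current, otherwise converts the old shape. */
    static Session read(SessionRow row) {
        Object value = deserialize(row.blob);
        switch (row.version) {
            case Session.CURRENT_VERSION:
                return (Session) value;
            case 3:
                return Session.fromV3((SessionV3) value);
            default:
                throw new IllegalStateException("Unsupported session version " + row.version);
        }
    }

    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Removing support for a version then means deleting its case and its old class once the cleanup script has purged the matching rows.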
I have never tried it, but you may be able to do something with a custom class loader that loads the correct version of the class file at runtime for the object being deserialized.
