Re-serializing JBPM process variables directly via MySQL - java

I'm working with an application that uses JBPM 3.1 and MySQL. The core problem is that there are process instances whose variables contain an older version of an external, non-JBPM Serializable class. When the main application is upgraded, these process instances cause JBPM to throw an exception, since the serialVersionUID (SUID) of that class has changed in the main application.
I believe I have a method for fixing the deserialization process using the technique described in the following:
How to deserialize an object persisted in a db now when the object has different serialVersionUID
However, my problem is figuring out where in MySQL JBPM stores process instance variables, so I can write a program that iterates over all the variables of all instances and re-serializes them so the offending class will have the new SUID and JBPM can operate against the processes.
From my initial look at the JBPM tables, it appears that JBPM_BYTEARRAY and/or JBPM_BYTEBLOCK may be the tables to operate against. However, I'm unsure how to proceed. I'm guessing each process variable is stored in a wrapping container class. Is that class org.jbpm.context.exe.VariableInstance? Or is it something else?
I figure that if I have the proper jar files on the classpath, and I know what the main class is that JBPM uses to store process variables in MySQL, I can deserialize the stored object (which will fix the SUID problem with the embedded problem class instance) and then serialize it back. Since the JBPM documentation mentions converters, I'm unsure whether I have to replicate the conversion process JBPM performs when deserializing, or whether standard Java deserialization is enough.

Some analysis of JBPM indicates that binary data may be split across multiple records. This may not be necessary for MySQL itself, but the JBPM code is written to support multiple RDBMSs, and some of them have limits on the size of binary records.
Since the question earned me a tumbleweed badge, I was not going to get a usable MySQL-based answer within the deadline I had to meet, so I reconsidered the core problem and the context in which it occurs, and came up with a solution that avoided the need to perform direct MySQL operations.
The main application in question already has some custom modifications to JBPM, so the solution I implemented alters the JBPM source that performs the deserialization of process instance variables. This avoids the need to deal with the JBPM logic that extracts the serialized binary data from the RDBMS.
In the class org.jbpm.context.exe.converter.SerializableToByteArrayConverter, I modified the code to use a custom ObjectInputStream class that returns the latest SUID of a class. The technique of simply replacing the descriptor with the latest version of the class, as described in the post referenced in the question, does not work if the new class includes new fields: doing so causes an end-of-data exception, since the base deserialization code tries to read the "new" fields from the old serialized form of the class.
Therefore, I just need to replace the SUID but keep all other parts of the descriptor the same. Since the JDK does not make ObjectStreamClass extensible, I created a subclass of ObjectInputStream that returns the new SUID based on a calling pattern the Java library executes against ObjectInputStream when deserializing data.
The pattern: when reading the class descriptor of a serialized object, the readUTF() function is called (to obtain the class name), followed by a readLong() call. Therefore, if this calling sequence occurs, and readUTF() returned the class name whose SUID I want to change, I return the newer SUID from the readLong() call.
The custom code reads a configuration file that specifies class names and the associated SUIDs that should be mapped to the latest SUIDs of the listed classes. This allows alternate classes to be mapped in the future without modifying the custom code.
Note that this approach is applicable to general deserialization operations where one needs to map old SUIDs to the latest SUIDs of specified classes while leaving the other parts of the serialized class descriptor alone, which avoids end-of-data problems when the newer class definition includes additional field declarations not present in the older class definition.
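A minimal sketch of such a custom stream, assuming the real implementation differs in its details; the class name SuidMappingObjectInputStream is illustrative, and the hard-coded map stands in for the configuration file mentioned above:

import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.util.HashMap;
import java.util.Map;

public class SuidMappingObjectInputStream extends ObjectInputStream {

    // class name -> replacement SUID; the real code loads this from a configuration file
    private final Map<String, Long> suidMap = new HashMap<String, Long>();

    // remembers the last class name read, so the readUTF()/readLong() sequence
    // used when reading a class descriptor can be detected
    private String lastClassName;

    public SuidMappingObjectInputStream(InputStream in) throws IOException {
        super(in);
        suidMap.put("com.example.LegacyValue", 42L); // hypothetical class and new SUID
    }

    @Override
    public String readUTF() throws IOException {
        lastClassName = super.readUTF();
        return lastClassName;
    }

    @Override
    public long readLong() throws IOException {
        long original = super.readLong();            // always consume the value from the stream
        Long replacement = suidMap.get(lastClassName);
        lastClassName = null;                        // only applies to the call right after readUTF()
        return replacement != null ? replacement.longValue() : original;
    }
}

In the modified converter, a subclass like this would be used in place of the plain ObjectInputStream that reads the variable's byte array, so the SUID substitution happens transparently during normal JBPM deserialization.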

Do you know if you made changes that break the serialization contract, or did you just add new fields? If it is simply adding new fields, then just declare the prior serialVersionUID in the new version of the class. Otherwise, you will have to read all the variables that have different serialVersionUIDs and save them under the new class, because you are the only person who knows how to convert them.
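For the "just added fields" case, pinning the new class to its previous SUID is a one-liner; ExternalValue and the SUID value below are placeholders (the real value comes from running the JDK serialver tool against the old class):

public class ExternalValue implements java.io.Serializable {
    // keep the SUID of the old, already-persisted version of the class
    private static final long serialVersionUID = 1234567890123456789L; // placeholder value

    private String existingField;
    private int newlyAddedField; // absent in old data; deserializes to its default (0)
}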

Related

What is the use of making classes serializable?

I have noticed that many of the library classes, such as ArrayList and String, and even the exceptions, have a serialVersionUID. Why have they been made like this? What is the practical use of doing that? FYI, I am familiar with the concept of serialization; please point out its practical purpose.
For your reference, here is the serialVersionUID for ClassCastException:
public class ClassCastException extends RuntimeException {
private static final long serialVersionUID = -9223365651070458532L;
Where is these objects' state going to be persisted? And where is that state going to be retrieved from?
I am currently working on a project where we are building REST controllers whose input and output parameters are JSON. We are creating simple POJOs for the input and output parameters. I have seen people making those POJOs serializable. What is the point of doing that?
But I haven't seen any readObject or writeObject calls, which are used to read and write the state of an object. Will the POJO's state be persisted just by making it serializable? If yes, where will it be stored?
If you want the full story, read the spec: Java Object Serialization Specification.
[...] many of the library classes, such as ArrayList and String, and even the exceptions, have a serialVersionUID. Why have they been made like this?
To support backwards compatibility when reading objects that were written in an older version of the class. See Stream Unique Identifiers.
Where is these objects' state going to be persisted?
Wherever you decide. See Writing to an Object Stream.
And where is that state going to be retrieved from?
Wherever you put it. See Reading from an Object Stream.
[...] input and output parameters are JSON. [...] I have seen people making those POJOs serializable. What is the point of doing that?
None. JSON is not using Java serialization. Java serialization creates a binary stream. JSON creates text.
Will the POJO's state be persisted just by making it serializable? If yes, where will it be stored?
No, see above.
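To make the "wherever you decide" part concrete, here is a small hedged example that writes an ArrayList to a file and reads it back; the file name is arbitrary:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

public class PersistDemo {
    public static void main(String[] args) throws Exception {
        ArrayList<String> names = new ArrayList<String>();
        names.add("Alice");

        // the programmer decides where the bytes go: here, a file on disk
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("names.ser"))) {
            out.writeObject(names);
        }

        // ...and where they are read back from
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("names.ser"))) {
            @SuppressWarnings("unchecked")
            ArrayList<String> restored = (ArrayList<String>) in.readObject();
            System.out.println(restored); // prints [Alice]
        }
    }
}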

Use of Serializable other than writing & reading objects to/from a file

In which cases is it good coding practice to implement Serializable, other than for writing and reading objects to/from a file? In a project, I went through code where a class implements Serializable even though nothing in that class/project writes or reads objects to/from a file.
If the object leaves the JVM it was created in, the class should implement Serializable.
Serialization is a method by which an object can be represented as a sequence of bytes that includes the object's data as well as information about the object's type and the types of data stored in the object.
After a serialized object has been written to a file, it can be read from the file and deserialized; that is, the type information and the bytes that represent the object and its data can be used to recreate the object in memory.
This is the main purpose of deserialization: to recover the object's data, its type, and the types of its fields from a written (loosely speaking) representation of the object. And hence serialization is required in the first place, to make this possible.
So, whenever your object has a possibility of leaving the JVM the program is being executed in, you should make its class implement Serializable.
This covers reading/writing objects to files, passing an object over the internet or any other type of connection, and so on. Whenever the object leaves the JVM it was created in, it should implement Serializable so that it can be serialized and deserialized for recognition once it enters another (or the same) JVM.
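For instance, an object "leaving the JVM" over a plain TCP connection might look like the sketch below (host and port are made up; the receiving side would wrap the socket's input stream in an ObjectInputStream and call readObject()):

import java.io.ObjectOutputStream;
import java.net.Socket;
import java.util.HashMap;

public class SendOverNetwork {
    public static void main(String[] args) throws Exception {
        HashMap<String, Integer> payload = new HashMap<String, Integer>();
        payload.put("answer", 42);

        try (Socket socket = new Socket("example.com", 9000);   // hypothetical receiver
             ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream())) {
            out.writeObject(payload); // works because HashMap (and its contents) are Serializable
        }
    }
}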
Many good reads at:
1: Why Java needs Serializable interface?
2: What is the purpose of Serialization in Java?
Benefits of serialization:
To persist data for future use.
To send data to a remote computer using client/server Java technologies like RMI, socket programming, etc.
To flatten an object into array of bytes in memory.
To send objects between the servers in a cluster.
To exchange data between applets and servlets.
To store user sessions in web applications.
To activate/passivate Enterprise JavaBeans.
You can refer to this article for more details.
If you ever expect your objects to be used as data in an RMI setting, they should be serializable, since RMI requires objects either to be Serializable (if they are to be serialized and sent to the remote side) or to be a UnicastRemoteObject if you need a remote reference.
In earlier versions of Java (before Java 5), marker interfaces were a good way to declare metadata, but we now have annotations, which are more powerful for declaring metadata for classes.
Annotations provide a very flexible and dynamic capability, and their configuration lets us decide whether the metadata is retained only in the byte code or is also available at run time.
So if you are not going to read and write objects, the only remaining purpose of serialization is to declare metadata for the class, and if you are going to declare metadata for a class then I personally suggest you don't use serialization; just go for annotations.
An annotation is a better choice than a marker interface, and JUnit is a perfect example of using annotations, e.g. @Test for marking a test method. The same could also be achieved with a Test marker interface.
There is another example which indicates that annotations are the better choice: @ThreadSafe looks a lot better than implementing a ThreadSafe marker interface.
There are other cases in which you want to send an object by value instead of by reference:
Sending objects over the network.
Can't really send objects by reference here.
Multithreading, particularly in Android
Android uses Serializable/Parcelable to send information between Activities. It has something to do with memory mapping and multithreading. I don't really understand this though.
Along with Martin C's answer, I want to add that if you use Serializable then you can easily load your object graph into memory. For example, you have a Student class which has a Department field; if you serialize your Student, the Department will also be saved (a short sketch follows the list below). Moreover, it also allows you:
1. to rename variables in a serialized class while maintaining backwards-compatibility.
2. to access data from deleted fields in a new version (in other words, change the internal representation of your data while maintaining backwards-compatibility).
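A short sketch of that object-graph behaviour; the class and field names follow the example above, and both classes must implement Serializable, otherwise writing the Student fails with a NotSerializableException for the Department:

import java.io.Serializable;

class Department implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
}

class Student implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    Department department; // written to the stream together with the Student that references it
}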
Some frameworks/environments might depend upon data objects being serializable. For example in J2EE, the HttpSession attributes must be serializable in order to benefit from Session Persistence. Also RMI and other dark ages artifacts use serialization.
Therefore, though you might not immediately need your data objects to be serializable, it might make sense to implement Serializable just in case (it is almost free, unless you need to go through the pain of declaring readObject/writeObject methods).

Serializing objects with changing class source code

Note: Due to the lack of questions like this on SO, I've decided to put one up myself as a Q&A
Serializing objects (using an ObjectOutputStream and an ObjectInputStream) is a method of storing an instance of a Java object as data that can later be deserialized for use. This can cause problems and frustration when the class used to deserialize the data does not remain the same (source-code changes; program updates).
So how can an Object be serialized and deserialized with an updated / downgraded version of a Class?
Here are a few common ways of serializing an object that can be deserialized in a backwards-compatible way.
1. Store the data in JSON format using import and export methods designed to save all the fields needed to recreate the instance. This can be made backwards-compatible by including a version key that allows an update algorithm to be called if the version is too low. A common library for this is the Google Gson library, which can represent Java objects in JSON as well as read and write ordinary JSON files.
2. Use the built-in Java Properties class in a way similar to the method described above. Properties objects can later be stored using a stream (store()), written as a regular Java properties file, or saved in XML (storeToXML()).
3. Sometimes simple objects can easily be represented with key-value pairs in a place where storing them in a JSON, XML, or Properties file is either too complicated or not necessary (overkill, one could say). In this case, an effective way of serializing the object can be to use the ObjectOutputStream class to serialize a HashMap containing key-value pairs, where the key could be a String and the value could be an Object (HashMap<String, Object>). This allows all of the object's fields to be stored, along with a version key, while providing much versatility, as sketched below.
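A rough sketch of approach 3; the Settings class, its fields, and the version handling are hypothetical, but the shape is the point: the map, not the class itself, is what gets serialized, so the class layout can change as long as the keys are handled.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

public class Settings {
    private static final int FORMAT_VERSION = 2; // bump whenever keys change

    private String host;
    private int port;

    public void save(File file) throws IOException {
        HashMap<String, Object> data = new HashMap<String, Object>();
        data.put("version", FORMAT_VERSION);
        data.put("host", host);
        data.put("port", port);
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(data); // only the HashMap is serialized, not Settings itself
        }
    }

    public void load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            @SuppressWarnings("unchecked")
            HashMap<String, Object> data = (HashMap<String, Object>) in.readObject();
            int version = (Integer) data.get("version");
            host = (String) data.get("host");
            // "port" did not exist before version 2, so fall back to a default for old files
            port = version >= 2 ? (Integer) data.get("port") : 8080;
        }
    }
}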
Note: Although serializing an object using the ObjectOutputStream for persistence storage is normally considered bad convention, it can be used either way as long as the class' source code remains the same.
Also Note about versioning: Changes to a class can be safely made without disrupting deserialization using an ObjectOutputStream as long as they are a compatible change. As mentioned in the Versioning of Serializable Objects chapter of the Object Serialization Specification:
A compatible change is a change that does not affect the contract
between the class and its callers.

How to map CSV records to a bean?

I'm looking for a Java library that can help me parse a CSV file containing pipe-delimited records and create instances of my bean class from them.
I've looked into several alternatives such as SuperCSV, OpenCSV, BeanIO, JFileHelper, jsefa, ... but none of them seems to have what it takes.
Requirements of the library:
support records with a variable number of fields
provide iterator-style access so the file is never loaded entirely into memory
support mapping a field to its actual type, i.e. be able to take a date field and put a java.util.Date into my bean instead of a String
let me supply my own factory object to create the beans from, instead of defaulting to Class.newInstance()
All the libraries I've looked into seem to lack requirement #4.
I can live with reflection, but the problem is that it still creates a new bean object for every line in the CSV file. Since the only thing I want to do with my bean at this point is pass it to my persistence layer and store it in the DB, it makes sense to put a couple of the bean instances into a pool and create a factory that takes instances from this pool. This way I can re-use my instances and parsing a 100000 line CSV file won't result in 100000 instances living in memory until the GC comes along.
Does anyone know of a library that can handle all these requirements?
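No particular library is endorsed here, but the pooled-factory idea from requirement #4 could look roughly like this; BeanFactory, PooledRecordFactory, and Record are all hypothetical names, and the parsing library would have to call create() instead of Class.newInstance() and hand each bean back to release() after it has been persisted:

import java.util.ArrayDeque;
import java.util.Date;
import java.util.Deque;

// hypothetical hook a CSV library could accept instead of calling Class.newInstance()
interface BeanFactory<T> {
    T create();
    void release(T bean);
}

class Record {
    String name;
    Date created;

    void clear() {
        name = null;
        created = null;
    }
}

class PooledRecordFactory implements BeanFactory<Record> {
    private final Deque<Record> pool = new ArrayDeque<Record>();

    @Override
    public Record create() {
        Record r = pool.poll();                 // reuse an instance if one is available
        return r != null ? r : new Record();
    }

    @Override
    public void release(Record bean) {
        bean.clear();                           // reset fields before the next CSV line reuses it
        pool.push(bean);
    }
}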
This might be an alternative: https://github.com/org-tigris-jsapar/jsapar
It will probably fall short on requirement #4 though.
Here, you can find a more comprehensive list of alternatives: https://org-tigris-jsapar.github.io/jsapar/links
EDIT
As of jsapar version 1.8, it is now possible to customize Java object creation via an external factory class, so I guess that requirement #4 is now also satisfied.

Best practice: Java/XML serialization: how to determine to what class to deserialize?

I have an application that saves its context to XML. In this application, there is a hierarchy of classes that all implement a common interface and that represent different settings. For instance, one setting class may consist of 4 public float fields, while another may consist of a single HashMap.
I am trying to determine the best way to handle writing this to and reading it from XML in a generic way. I have read a lot on this site about JAXB and XStream, for instance, which are able to build a specific class instance from XML.
However, my question relates to the fact that the actual class can be anything that implements a given interface. When you read the XML file, how would you determine the actual class to instantiate from the XML data? How do you do that in your applications?
I thought that I could write the class name in an XML attribute, read it, and compare it to all possible class names until I find a match. Is there a more sensible way?
Thanks
XStream should already take care of this and create an object of the correct type.
The tutorial seems to confirm that:
To reconstruct an object, purely from the XML:
Person newJoe = (Person)xstream.fromXML(xml);
If you don't know the type, you will have to first assign it to the common interface type:
CommonInterface newObject = (CommonInterface)xstream.fromXML(xml);
// now you can either check its type or call virtual methods
In my case, I just have a kind of header that stores the name of the class that was serialized, and when deserializing I use the header value to figure out which class I should deserialize the values into.
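A rough sketch of that header idea, assuming JAXB (javax.xml.bind) as the mapper, JAXB-annotated settings classes, and a class attribute on the root element (all assumptions, not something the libraries require); note that loading a class name taken from the file should only be done with trusted input:

import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class SettingsReader {
    public static Object read(File file) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(file);

        // the root element carries the fully qualified name of the concrete settings class
        String className = doc.getDocumentElement().getAttribute("class");
        Class<?> type = Class.forName(className);

        // unmarshal the same DOM into whatever class the header named
        return JAXBContext.newInstance(type).createUnmarshaller().unmarshal(doc);
    }
}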
A best practice would be to use an established, well-documented XML parser/mapper. All of the serialization/deserialization work has already been done, so you can worry about your business logic instead. Castor and Apache Axiom are two APIs that I have used to marshal/unmarshal (serialize/deserialize) Java classes and XML.
http://www.castor.org
Apache Axiom
