How does serialization tool skip unknown fields during deserialization?

How does a serialization tool (e.g. Hessian) deserialize a class of a different version with the same serialVersionUID? In most cases, it can skip unknown fields (those whose classes are not found by the class loader) and stay compatible. But recently I tried appending a new field of type Map<String, Object> and put an object of an unknown class into the map; deserialization then threw a ClassNotFoundException.
Why can't it skip the map like the other fields?
Is this a problem with the tool's implementation or with the serialization mechanism itself?

This would depend on the tool itself. serialVersionUID is intended for use by Java's built-in serializer (ObjectOutputStream) which, as best I can tell from reading the Hessian source, is not used by Hessian.
For Hessian specifically, the best source I can find which mentions these kinds of changes is this email:
At least for Hessian, it's best to think of versioning as a set of
types of changes that can be handled.
Specifically Hessian can manage the following kinds of changes: 1)
if you add or drop a field, the side that doesn't understand the
field will ignore it. 2) some field type changes are possible, if
Hessian can convert (e.g. int to long) 3) there's some flexibility
on map(bean) types, depending on how much information Hessian has
(which is a reason to prefer concrete types.)
So, if the sender sends an untyped map {"field1", 10} and the target
is known to be MyValue { int field1; }, then Hessian can map the
fields.
But it cannot manage things like: 1) field name changes (the data
will be dropped). 2) class name changes where the target is
underdefined, like Object field1. If you send a MyValue2 as the new
field1, when the previous version was MyValue1, Hessian can't make
that automatic transition. (But as with #3 above, a "MyValue2 field1"
would give Hessian enough information to translate.) 3) class
splits, e.g. creating a subclass and pushing some fields into it.
4) map to list or list to map changes.
Basically, I don't think Hessian intends to support unknown types in maps.

Related

Are there good alternatives for serializing enums in Java?

The Java language benefited much from adding enums to it; but unfortunately they don't work well when sending serialized objects between systems that have different code levels.
Example: assume that you have two systems A and B. They both start off with the same code levels, but at some point they start to see code updates at different points in time. Now assume that there is some
public enum Whatever { FIRST; }
And there are other objects that keep references to constants of that enum. Those objects are serialized and sent from A to B or vice versa. Now consider that B has a newer version of Whatever
public enum Whatever { FIRST, SECOND }
Then:
class SomethingElse implements Serializable { ...
private final Whatever theWhatever;
SomethingElse(Whatever theWhatever) {
this.theWhatever = theWhatever; ..
gets instantiated ...
SomethingElse somethin = new SomethingElse(Whatever.SECOND)
and then serialized and sent over to A (for example as result of some RMI call). Which is bad, because now there will be an error during deserialization on A: A knows the Whatever enum class, but in a version that doesn't have SECOND.
We figured this out the hard way, and now I am very hesitant to use enums in situations that would actually be "perfect for enums", simply because I know that I can't easily extend an existing enum later on.
Now I am wondering: are there (good) strategies to avoid such compatibility issues with enums? Or do I really have to go back to "pre-enum" times and, instead of enums, rely on a solution where I use plain strings all over the place?
Update: please note that using the serialVersionUID doesn't help here at all. It only helps make an incompatible change "more obvious". But the point is: I don't care why deserialization fails, because I have to prevent it from happening. And I am also not in a position to change the way we serialize our objects. We are doing RMI, and we are serializing to binary; I have no means to change that.

As @Jesper mentioned in the comments, I would recommend something like JSON for your inter-service communication. This will give you more control over how unknown enum values are handled.
For example, using the always awesome Jackson you can use the Deserialization Features READ_UNKNOWN_ENUM_VALUES_AS_NULL or READ_UNKNOWN_ENUM_VALUES_USING_DEFAULT_VALUE. Both will allow your application logic to handle unknown enum values as you see fit.
Example (straight from the Jackson doc)
enum MyEnum { A, B, @JsonEnumDefaultValue UNKNOWN }
...
final ObjectMapper mapper = new ObjectMapper();
mapper.enable(DeserializationFeature.READ_UNKNOWN_ENUM_VALUES_USING_DEFAULT_VALUE);
MyEnum value = mapper.readValue("\"foo\"", MyEnum.class);
assertSame(MyEnum.UNKNOWN, value);

After going back and forth between different solutions, I settled on one based on the suggestion from @GuiSim: build a class that contains an enum value. This class can
do custom deserialization, so I can make sure that no exceptions are thrown during the deserialization process
provide simple methods like isValid() and getEnumValue(): the first one tells you whether the enum deserialization actually worked, and the second one returns the deserialized enum (or throws an exception)
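
A minimal sketch of such a wrapper, under stated assumptions: the class name SafeEnum, the ofName factory, and the roundTrip helper are hypothetical, and instead of a hand-written readObject it simply stores the constant's name as a String, which is one way to keep the enum lookup out of the deserialization step. The isValid() and getEnumValue() methods follow the description above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical enum standing in for the older code level that only knows FIRST.
enum Whatever { FIRST }

// Hypothetical wrapper: only the constant's name is serialized, never the enum constant itself.
final class SafeEnum<E extends Enum<E>> implements Serializable {
    private static final long serialVersionUID = 1L;

    private final Class<E> type;
    private final String name;

    private SafeEnum(Class<E> type, String name) {
        this.type = type;
        this.name = name;
    }

    static <E extends Enum<E>> SafeEnum<E> of(E value) {
        return new SafeEnum<>(value.getDeclaringClass(), value.name());
    }

    // Simulates a holder created by a newer peer whose enum has extra constants.
    static <E extends Enum<E>> SafeEnum<E> ofName(Class<E> type, String name) {
        return new SafeEnum<>(type, name);
    }

    // True if the stored name maps to a constant known on this side.
    boolean isValid() {
        try {
            Enum.valueOf(type, name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Returns the constant, or throws IllegalArgumentException if it is unknown here.
    E getEnumValue() {
        return Enum.valueOf(type, name);
    }

    // Serializes and deserializes a value in memory, as a stand-in for the RMI hop.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }
}
```

With this shape, a holder carrying a name the receiver does not know still deserializes; the caller checks isValid() before asking for the value.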

Java deserialization of old Object

I'm having a problem when serializing and deserializing my objects in my project. I'm writing the object to a name.dat file.
However, whenever I make a change in the Name class I can no longer deserialize it, since the old and new versions are treated as two different classes.
Is there any way around this?

Your best options are:
1. Don't change your classes :-)
2. Throw away any serialized objects each time you change your classes.
3. Don't use Java object serialization.
Given that 1) and 2) are probably out of the question, option 3) should be given serious consideration. There are a variety of alternatives to Java serialization, depending on the nature of the data you are persisting. These include:
Using Java properties files
Storing the data in a classical database (using SQL and the JDBC API)
Using an object-relational database mapping such as Hibernate
Using XML or JSON and a "binding" technology so that you can serialize / deserialize POJOs.
Finally, it is possible to implement class versioning using Java object serialization. However, it is tricky. And if you are continually changing the classes, then it is not going to be pleasant. Start by reading Versioning of Serializable Objects.
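
As a rough sketch of what such versioning can look like, assuming a Name class with hypothetical fields first and nickname: pinning serialVersionUID keeps compatible edits (such as adding a field) from changing the stream identity, and a readObject method can supply a default for a field that older data does not contain. The toBytes and fromBytes helpers are only there to make the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class Name implements Serializable {
    // Pinned explicitly so that compatible edits to the class do not change it.
    private static final long serialVersionUID = 1L;

    private final String first;
    // Assumed to be added in a later version; data written before that has no value for it.
    private String nickname;

    Name(String first, String nickname) {
        this.first = first;
        this.nickname = nickname;
    }

    String nickname() {
        return nickname;
    }

    // Called by the serialization runtime; fills in a default when the field is absent or null.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (nickname == null) {
            nickname = first;
        }
    }

    static byte[] toBytes(Name name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(name);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    static Name fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Name) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deserialization failed", e);
        }
    }
}
```

Removing fields, changing field types, or moving the class in the hierarchy are not covered by this pattern and generally need more care.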

GreenDao and entity inheritance

My task is to build a disk cache on Android for my application (it is a sort of messenger). I'd like to store messages in a database, but I have run into the problem of storing different types of messages (currently 5 types; each type has its own fields, and they all extend a base class).
GreenDao documentation says:
Note: currently it’s impossible to have another entity as a super class (there are no polymorphic queries either)
I am planning to have an entity that maps almost 1 to 1 to the base class, except for one column: raw binary or JSON data into which every child class can write anything it needs.
My questions are:
Is GreenDao a good solution in such a case? Are there any solutions that allow me not to worry about inheritance, and how much do they cost in terms of efficiency?
How do I "serialize" data to such a field (which method should I override, or where should I put the code that does all the necessary things)?
How do I give GreenDao the correct constructor to "deserialize" JSON or binary into the correct class instance?
Should I use reflection, or just a switch/case to find the correct constructor (only 5 types of constructors are possible)? How much will reflection "cost" in such a case?

If you really need inheritance, GreenDao is not the right choice, since it doesn't support it. But I think you can go without inheritance:
You can design an entity with a discriminator column (messagetype) and a binary or text column (data). Then you can use an abstract factory to create the desired objects from data depending on the messagetype.
If the conversion is complex, I'd put it in a separate class, otherwise I'd put it as a method in the keep section.
Be aware that this design may slow you down, if you really have a lot of messages, since separate tables would reduce index sizes.
Talking about indexes: if you want to access a message through some property of your data column later on, you are screwed since you can't put an index on it.
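
The discriminator-plus-data design could be sketched roughly as follows. This is an illustration only: MessageRow stands in for the generated entity, the two message types and their payload encoding are made up, and a plain switch-based factory method is used here rather than a full abstract factory, which seems sufficient for a handful of types and avoids reflection.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical flat row as it might be stored: discriminator plus opaque payload.
final class MessageRow {
    final int messageType;
    final byte[] data;

    MessageRow(int messageType, byte[] data) {
        this.messageType = messageType;
        this.data = data;
    }
}

abstract class Message {
    static final int TYPE_TEXT = 1;
    static final int TYPE_IMAGE = 2;

    abstract int type();

    // Each subclass decides how its own fields are written into the data column.
    abstract byte[] encode();

    MessageRow toRow() {
        return new MessageRow(type(), encode());
    }

    // Factory: picks the concrete class from the discriminator, no reflection needed.
    static Message fromRow(MessageRow row) {
        String payload = new String(row.data, StandardCharsets.UTF_8);
        switch (row.messageType) {
            case TYPE_TEXT:
                return new TextMessage(payload);
            case TYPE_IMAGE:
                return new ImageMessage(payload);
            default:
                throw new IllegalArgumentException("Unknown message type: " + row.messageType);
        }
    }
}

final class TextMessage extends Message {
    final String text;

    TextMessage(String text) {
        this.text = text;
    }

    @Override
    int type() {
        return TYPE_TEXT;
    }

    @Override
    byte[] encode() {
        return text.getBytes(StandardCharsets.UTF_8);
    }
}

final class ImageMessage extends Message {
    final String url;

    ImageMessage(String url) {
        this.url = url;
    }

    @Override
    int type() {
        return TYPE_IMAGE;
    }

    @Override
    byte[] encode() {
        return url.getBytes(StandardCharsets.UTF_8);
    }
}
```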

Use of Serializable other than Writing & Reading object to/from File

In which cases is it good coding practice to implement Serializable other than for writing and reading objects to/from a file? In a project whose code I went through, a class implements Serializable even though nothing in that class or project writes or reads objects to/from a file.

If the object leaves the JVM it was created in, the class should implement Serializable.
Serialization is a method by which an object can be represented as a sequence of bytes that includes the object's data as well as information about the object's type and the types of data stored in the object.
After a serialized object has been written into a file, it can be read from the file and deserialized; that is, the type information and bytes that represent the object and its data can be used to recreate the object in memory.
This is the main purpose of deserialization: to recover the object's data, the object's type, and the types of its variables from a written (loosely speaking) representation of the object. Serialization is what makes this possible in the first place.
So, whenever your object may leave the JVM the program is executing in, you should make the class implement Serializable.
This covers reading/writing objects to files and passing an object over the internet or any other type of connection. Whenever the object leaves the JVM it was created in, it should implement Serializable, so that it can be serialized and then deserialized once it enters another (or the same) JVM.
Many good reads at :
1: Why Java needs Serializable interface?
2: What is the purpose of Serialization in Java?

Benefits of serialization:
To persist data for future use.
To send data to a remote computer using client/server Java technologies like RMI , socket programming etc.
To flatten an object into array of bytes in memory.
To send objects between the servers in a cluster.
To exchange data between applets and servlets.
To store user session in Web applications
To activate/passivate enterprise java beans.
You can refer to this article for more details.

If you ever expect your object to be used as data in a RMI setting, they should be serializable, as RMI either needs objects Serializable (if they are to be serialized and sent to the remote side) or to be a UnicastRemoteObject if you need a remote reference.

In earlier versions of Java (before Java 5), marker interfaces were a good way to declare metadata, but now we have annotations, which are a more powerful way to declare metadata for classes.
Annotations are flexible, and we can configure whether the annotation metadata is kept in the bytecode or is also available at run time.
Here, if you do not intend to read and write objects, the only purpose left for Serializable is to declare metadata for the class; and if you are going to declare metadata for a class, I personally suggest you don't use serialization and just go for an annotation.
Annotations are a better choice than marker interfaces, and JUnit is a perfect example of using annotations, e.g. @Test for specifying a test. The same could also be achieved by using a Test marker interface.
Another example that suggests annotations are the better choice: @ThreadSafe looks a lot better than implementing a ThreadSafe marker interface.

There are other cases in which you want to send an object by value instead of by reference:
Sending objects over the network.
Can't really send objects by reference here.
Multithreading, particularly in Android
Android uses Serializable/Parcelable to send information between Activities. It has something to do with memory mapping and multithreading. I don't really understand this though.

Along with Martin C's answer, I want to add that if you use Serializable you can easily load your object graph into memory. For example, suppose you have a Student class which has a Department. If you serialize your Student, the Department will be saved as well. Moreover, it also allows you:
1. to rename variables in a serialized class while maintaining backwards-compatibility.
2. to access data from deleted fields in a new version (in other words, change the internal representation of your data while maintaining backwards-compatibility).
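
The Student and Department case can be illustrated with a small sketch; the field names and the copy helper are assumptions added for the example. Writing the Student also writes the Department it references, so the copy comes back with its own Department instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class Department implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;

    Department(String name) {
        this.name = name;
    }
}

final class Student implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    // Referenced object; it is serialized together with the Student.
    final Department department;

    Student(String name, Department department) {
        this.name = name;
        this.department = department;
    }

    // Serializes the student (and everything reachable from it) and reads it back.
    static Student copy(Student original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Student) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Copy failed", e);
        }
    }
}
```

Note that every class reachable from the root has to be Serializable too (or the reference marked transient), otherwise writeObject fails with a NotSerializableException.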

Some frameworks/environments might depend upon data objects being serializable. For example in J2EE, the HttpSession attributes must be serializable in order to benefit from Session Persistence. Also RMI and other dark ages artifacts use serialization.
Therefore, though you might not immediately need your data objects to be serializable, it might make sense to declare Serializable just in case (It is almost free, unless you need to go through the pain of declaring readObject/writeObject methods)

what is the best way to merge two java beans in RESTful API?

The scenario is simple:
The UI calls a RESTful API to get an object tree, then the UI changes some data and calls the RESTful API to update it.
But for security or performance reasons, my RESTful API can NOT send the whole object tree to the UI.
We have two choices for this purpose: creating a separate Java bean for the RESTful API, or extending the existing business Java bean and adding @JsonIgnore.
The second looks smarter because we reuse the business class.
But now we have a problem: I need to merge the object from the UI with the object from the DB, otherwise I will lose some data.
But how do I know which pieces of data will come from the UI?
I know I can hard-code copying the fields one by one.
But this way is dangerous.
I am asking for a generic way that avoids hard-coding the field copies.
I tried org.apache.commons.beanutils.BeanUtils, but it can't meet the requirement because it always overwrites the target fields.
So I am thinking of this approach:
If the field in the UI bean is not null, then overwrite the value of the field with the same name in the destination bean. But how do I handle a field of a primitive type like int, which has a default value of 0?
I can't tell whether the field really carries a UI value of 0 or just did not come back from the UI.
I tried converting primitive types to object types, but that still causes trouble with the boolean type: many Java tools, like BeanUtils, don't support "Boolean isValid() {…}". And this kind of conversion is dangerous in existing code.
I tried this code:
JacksonAnnotationIntrospector ai = new JacksonAnnotationIntrospector();
AnnotatedClass ac = AnnotatedClass.construct(MyClassDTO.class, ai, null);
String[] ignoredList = ai.findPropertiesToIgnore(ac);
for(String one: ignoredList){
System.out.println(one);
}
but ignoredList is always null. I am using Jackson 1.9.2

You could consider using JsonPatch. We use it and it works quite well. Of course it means you apply patches at the JSON level and not in the bean directly so if you need to support more than just JSON, it might be a problem.
Here's an implementation: https://github.com/fge/json-patch

I found the solution on Jackson:
MyBean defaults = objectMapper.readValue(defaultJson, MyBean.class);
ObjectReader updater = objectMapper.readerForUpdating(defaults);
MyBean merged = updater.readValue(overridesJson);
it comes from :
readerForUpdating
merging on Jackson
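
For comparison, the "overwrite only when the UI value is not null" idea from the question can also be sketched without Jackson, using plain field reflection. This is only an illustration under assumptions: NonNullMerger and AccountDto are hypothetical names, and the DTO uses wrapper types (Integer, Boolean) so that null can mean "not sent by the UI". A primitive field would always be copied, because its boxed value is never null.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class NonNullMerger {
    private NonNullMerger() {
    }

    // Copies every non-null instance field of fromUi onto fromDb and returns fromDb.
    static <T> T merge(T fromUi, T fromDb) {
        for (Class<?> c = fromUi.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                int mods = field.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isFinal(mods) || field.isSynthetic()) {
                    continue; // constants and compiler-generated fields are left alone
                }
                try {
                    field.setAccessible(true);
                    Object value = field.get(fromUi);
                    if (value != null) {
                        field.set(fromDb, value);
                    }
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot copy field " + field.getName(), e);
                }
            }
        }
        return fromDb;
    }
}

// Hypothetical DTO; wrapper types make "absent" distinguishable from 0 or false.
final class AccountDto {
    String name;
    Integer loginCount;
    Boolean valid;
}
```

This treats null as "not provided", so it cannot express "the UI deliberately cleared this field"; a patch format such as JSON Patch handles that case explicitly.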
