Java serialization: readObject() vs. readResolve()

The book Effective Java and other sources provide a pretty good explanation on how and when to use the readObject() method when working with serializable Java classes. The readResolve() method, on the other hand, remains a bit of a mystery. Basically all documents I found either mention only one of the two or mention both only individually.
Questions that remain unanswered are:
What is the difference between the two methods?
When should which method be implemented?
How should readResolve() be used, especially in terms of returning what?
I hope you can shed some light on this matter.

readResolve is used for replacing the object read from the stream. The only use I've ever seen for this is enforcing singletons; when an object is read, replace it with the singleton instance. This ensures that nobody can create another instance by serializing and deserializing the singleton.

Item 90, Effective Java, 3rd Ed covers readResolve and writeReplace for serial proxies - their main use. The examples do not write out readObject and writeObject methods because they are using default serialisation to read and write fields.
readResolve is called after readObject has returned (conversely, writeReplace is called before writeObject, and probably on a different object). The object the method returns replaces this, both as the object returned to the user of ObjectInputStream.readObject and in any further back-references to the object in the stream. Both readResolve and writeReplace may return objects of the same or of different types. Returning the same type is useful in some cases where fields must be final and either backward compatibility is required or values must be copied and/or validated.
Use of readResolve does not by itself enforce the singleton property: a crafted stream can capture a reference to the deserialized instance before readResolve runs unless every instance field is transient (see Effective Java Item 89).
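To see the call order concretely, here is a small self-contained sketch (not from the book; all names are illustrative) in which each hook just prints its name. Writing prints writeReplace then writeObject; reading prints readObject then readResolve.
import java.io.*;

public class HookOrder implements Serializable {
    private static final long serialVersionUID = 1L;

    private Object writeReplace() { System.out.println("writeReplace"); return this; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        System.out.println("writeObject");
        out.defaultWriteObject();
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        System.out.println("readObject");
        in.defaultReadObject();
    }

    private Object readResolve() { System.out.println("readResolve"); return this; }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new HookOrder());   // prints writeReplace, then writeObject
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            in.readObject();                    // prints readObject, then readResolve
        }
    }
}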

readResolve can be used to adjust the data that was read through the readObject method. For example, the XStream API uses this feature to initialize attributes that were not present in the XML being deserialized.
http://x-stream.github.io/faq.html#Serialization
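The idiom the FAQ describes boils down to something like the following sketch (class and field names are made up for illustration): readResolve re-initializes whatever the serialized form did not carry and then returns this.
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class Server implements Serializable {
    private static final long serialVersionUID = 1L;

    private String host;
    private transient List<String> log;   // not part of the serialized/XML form

    private Object readResolve() {
        if (log == null) {
            log = new ArrayList<>();      // restore what the stream didn't contain
        }
        return this;
    }
}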

readObject() is an existing method of the ObjectInputStream class.
During deserialization, readObject() internally checks whether the class of the object being deserialized declares a readResolve() method. If a readResolve() method exists, it will be invoked.
A sample readResolve() implementation would look like this:
protected Object readResolve() {
    return INSTANCE;
}
So the intent of writing a readResolve() method is to ensure that the instance already living in the JVM is returned, instead of a new object being created during deserialization.

readResolve is for when you may need to return an existing object, e.g. because you're checking for duplicate inputs that should be merged, or (e.g. in eventually-consistent distributed systems) because it's an update that may arrive before you're aware of any older versions.

readResolve() ensures the singleton contract is preserved during serialization.

As already answered, readResolve is a (typically private) method that ObjectInputStream uses while deserializing an object. It is called just before the actual instance is returned. In the case of a singleton, we can force it to return the already existing singleton instance reference instead of the deserialized instance's reference.
Similarly we have writeReplace for ObjectOutputStream.
Example for readResolve:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
public class SingletonWithSerializable implements Serializable {
    private static final long serialVersionUID = 1L;

    public static final SingletonWithSerializable INSTANCE = new SingletonWithSerializable();

    private SingletonWithSerializable() {
        if (INSTANCE != null)
            throw new RuntimeException("Singleton instance already exists!");
    }

    private Object readResolve() {
        return INSTANCE;
    }

    public void leaveTheBuilding() {
        System.out.println("SingletonWithSerializable.leaveTheBuilding() called...");
    }

    public static void main(String[] args) throws FileNotFoundException, IOException, ClassNotFoundException {
        SingletonWithSerializable instance = SingletonWithSerializable.INSTANCE;
        System.out.println("Before serialization: " + instance);
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("file1.ser"))) {
            out.writeObject(instance);
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("file1.ser"))) {
            SingletonWithSerializable readObject = (SingletonWithSerializable) in.readObject();
            System.out.println("After deserialization: " + readObject);
        }
    }
}
Output:
Before serialization: com.ej.item3.SingletonWithSerializable@7852e922
After deserialization: com.ej.item3.SingletonWithSerializable@7852e922

When serialization is used to convert an object so that it can be saved to a file, we can hook into the process with a method, readResolve(). The method is private and is kept in the same class whose object is being reconstructed during deserialization.
It ensures that after deserialization the object returned is the very same one that was serialized, so that, for example, instanceSer.hashCode() == instanceDeSer.hashCode().
readResolve() is not a static method. After in.readObject() is called during deserialization, it just makes sure that the returned object is the same as the one that was serialized below with out.writeObject(instanceSer):
..
ObjectOutput out = new ObjectOutputStream(new FileOutputStream("file1.ser"));
out.writeObject(instanceSer);
out.close();
In this way it also helps with the singleton design pattern implementation, because the same instance is returned every time.
public static ABCSingleton getInstance(){
    return ABCSingleton.instance; //instance is static
}

I know this question is really old and has an accepted answer, but as it pops up very high in google search I thought I'd weigh in because no provided answer covers the three cases I consider important - in my mind the primary use for these methods. Of course, all assume that there is actually a need for custom serialization format.
Take, for example, collection classes. Default serialization of a linked list or a BST would result in a huge loss of space with very little performance gain compared to just serializing the elements in order. This is even more true if the collection is a projection or a view - one that keeps a reference to a larger structure than it exposes through its public API.
If the serialized object has immutable fields which need custom serialization, the usual writeObject/readObject solution is insufficient, as the deserialized object is created before the part of the stream written in writeObject is read. Take this minimal implementation of a linked list:
public class List<E> implements Serializable {
    public final E head;
    public final List<E> tail;

    public List(E head, List<E> tail) {
        if (head == null)
            throw new IllegalArgumentException("null as a list element");
        this.head = head;
        this.tail = tail;
    }
    //methods follow...
}
This structure can be serialized by recursively writing the head field of every link, followed by a null value. Deserializing such a format, however, becomes impossible: readObject can't change the values of the final member fields, which are now fixed to null. Here comes the writeReplace/readResolve pair:
private Object writeReplace() {
    return new Serializable() {
        private transient List<E> contents = List.this;

        private void writeObject(ObjectOutputStream oos) throws IOException {
            List<E> list = contents;
            while (list != null) {
                oos.writeObject(list.head);
                list = list.tail;
            }
            oos.writeObject(null);
        }

        @SuppressWarnings("unchecked")
        private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
            E head = (E) ois.readObject();
            if (head != null) {
                readObject(ois); //read the tail and assign it to this.contents
                this.contents = new List<>(head, this.contents);
            }
        }

        private Object readResolve() {
            return this.contents;
        }
    };
}
I am sorry if the above example doesn't compile (or work), but hopefully it is sufficient to illustrate my point. If you think this is a very far-fetched example, please remember that many functional languages run on the JVM and this approach becomes essential in their case.
We may want to actually deserialize an object of a different class than the one we wrote to the ObjectOutputStream. This would be the case with views, such as a java.util.List implementation which exposes a slice of a longer ArrayList. Obviously, serializing the whole backing list is a bad idea and we should only write the elements from the viewed slice. Why stop there, however, and keep a useless level of indirection after deserialization? We could simply read the elements from the stream into an ArrayList and return it directly instead of wrapping it in our view class.
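A rough sketch of that view case, with made-up names and the actual List methods left out: the view's writeReplace writes a plain ArrayList holding only the visible elements, and that ArrayList is what comes back out of deserialization.
import java.io.Serializable;
import java.util.ArrayList;

class SliceView<E> implements Serializable {
    private final ArrayList<E> backing;   // possibly much larger than the slice
    private final int from, to;

    SliceView(ArrayList<E> backing, int from, int to) {
        this.backing = backing;
        this.from = from;
        this.to = to;
    }

    // Replace the view with a plain ArrayList containing only the visible elements;
    // after deserialization the caller gets that ArrayList directly, no readResolve needed.
    private Object writeReplace() {
        return new ArrayList<>(backing.subList(from, to));
    }
}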
Alternatively, having a similar delegate class dedicated to serialization may be a design choice. A good example would be reusing our serialization code. For example, if we have a builder class (similar to StringBuilder for String), we can write a serialization delegate which serializes any collection by writing an empty builder to the stream, followed by the collection size and the elements returned by the collection's iterator. Deserialization would involve reading the builder, appending all subsequently read elements, and returning the result of the final build() from the delegate's readResolve. In that case we would need to implement the serialization only in the root class of the collection hierarchy, and no additional code would be needed from current or future implementations, provided they implement the abstract iterator() and builder() methods (the latter for recreating a collection of the same type - which would be a very useful feature in itself). Another example would be a class hierarchy whose code we don't fully control - our base class(es) from a third-party library could have any number of private fields we know nothing about and which may change from one version to another, breaking our serialized objects. In that case it would be safer to write the data and rebuild the object manually on deserialization.

The readResolve Method
For Serializable and Externalizable classes, the readResolve method allows a class to replace/resolve the object read from the stream before it is returned to the caller. By implementing the readResolve method, a class can directly control the types and instances of its own instances being deserialized. The method is defined as follows:
ANY-ACCESS-MODIFIER Object readResolve()
throws ObjectStreamException;
The readResolve method is called when ObjectInputStream has read an object from the stream and is preparing to return it to the caller. ObjectInputStream checks whether the class of the object defines the readResolve method. If the method is defined, the readResolve method is called to allow the object in the stream to designate the object to be returned. The object returned should be of a type that is compatible with all uses. If it is not compatible, a ClassCastException will be thrown when the type mismatch is discovered.
For example, a Symbol class could be created for which only a single instance of each symbol binding existed within a virtual machine. The readResolve method would be implemented to determine if that symbol was already defined and substitute the preexisting equivalent Symbol object to maintain the identity constraint. In this way the uniqueness of Symbol objects can be maintained across serialization.
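A minimal sketch of that Symbol idea (the class below is illustrative, not taken from the spec): deserialized instances are swapped for the canonical one kept in an interning table.
import java.io.Serializable;
import java.util.concurrent.ConcurrentHashMap;

public final class Symbol implements Serializable {
    private static final long serialVersionUID = 1L;

    private static final ConcurrentHashMap<String, Symbol> TABLE = new ConcurrentHashMap<>();

    private final String name;

    private Symbol(String name) {
        this.name = name;
    }

    public static Symbol of(String name) {
        return TABLE.computeIfAbsent(name, Symbol::new);   // one instance per name
    }

    // Swap the freshly deserialized copy for the canonical instance.
    private Object readResolve() {
        return of(name);
    }
}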

Related

Java: Cannot serialize object with a DateTimeFormatter property? [duplicate]

I have:
class MyClass extends MyClass2 implements Serializable {
//...
}
MyClass2 has a property that is not serializable. How can I serialize (and de-serialize) this object?
Correction: MyClass2 is, of course, not an interface but a class.
As someone else noted, chapter 11 of Josh Bloch's Effective Java is an indispensable resource on Java serialization.
A couple points from that chapter pertinent to your question:
assuming you want to serialize the state of the non-serializable field in MyClass2, that field must be accessible to MyClass, either directly or through getters and setters. MyClass will have to implement custom serialization by providing readObject and writeObject methods.
the non-serializable field's class must have an API that allows getting its state (for writing to the object stream) and then instantiating a new instance with that state (when later reading from the object stream).
per Item 74 of Effective Java, MyClass2 must have a no-arg constructor accessible to MyClass, otherwise it is impossible for MyClass to extend MyClass2 and implement Serializable.
I've written a quick example below illustrating this.
class MyClass extends MyClass2 implements java.io.Serializable {

    public MyClass(int quantity) {
        setNonSerializableProperty(new NonSerializableClass(quantity));
    }

    private void writeObject(java.io.ObjectOutputStream out)
            throws java.io.IOException {
        // note, here we don't need out.defaultWriteObject(); because
        // MyClass has no other state to serialize
        out.writeInt(super.getNonSerializableProperty().getQuantity());
    }

    private void readObject(java.io.ObjectInputStream in)
            throws java.io.IOException {
        // note, here we don't need in.defaultReadObject();
        // because MyClass has no other state to deserialize
        super.setNonSerializableProperty(new NonSerializableClass(in.readInt()));
    }
}
/* this class must have no-arg constructor accessible to MyClass */
class MyClass2 {
    /* this property must be gettable/settable by MyClass. It cannot be final, therefore. */
    private NonSerializableClass nonSerializableProperty;

    public void setNonSerializableProperty(NonSerializableClass nonSerializableProperty) {
        this.nonSerializableProperty = nonSerializableProperty;
    }

    public NonSerializableClass getNonSerializableProperty() {
        return nonSerializableProperty;
    }
}

class NonSerializableClass {
    private final int quantity;

    public NonSerializableClass(int quantity) {
        this.quantity = quantity;
    }

    public int getQuantity() {
        return quantity;
    }
}
MyClass2 is just an interface, so technically it has no properties, only methods. That being said, if you have instance variables that are themselves not serializable, the only way I know of to get around it is to declare those fields transient.
ex:
private transient Foo foo;
When you declare a field transient it will be ignored during the serialization and deserialization process. Keep in mind that when you deserialize an object with a transient field, that field's value will always be its default (usually null).
Note you can also override the readResolve() method of your class in order to initialize transient fields based on other system state.
If possible, the non-serialiable parts can be set as transient
private transient SomeClass myClz;
Otherwise you can use Kryo. Kryo is a fast and efficient object graph serialization framework for Java (e.g. Java serialization of java.awt.Color requires 170 bytes, Kryo only 4 bytes), which can also serialize non-serializable objects. Kryo can also perform automatic deep and shallow copying/cloning. This is direct copying from object to object, not object->bytes->object.
Here is an example of how to use Kryo:
Kryo kryo = new Kryo();
// #### Store to disk...
Output output = new Output(new FileOutputStream("file.bin"));
SomeClass someObject = ...
kryo.writeObject(output, someObject);
output.close();
// ### Restore from disk...
Input input = new Input(new FileInputStream("file.bin"));
SomeClass someObject = kryo.readObject(input, SomeClass.class);
input.close();
Serialized objects can also be compressed by registering the appropriate serializer:
kryo.register(SomeObject.class, new DeflateCompressor(new FieldSerializer(kryo, SomeObject.class)));
If you can modify MyClass2, the easiest way to address this is declare the property transient.
It depends on why that member of MyClass2 isn't serializable.
If there's some good reason why MyClass2 can't be represented in a serialized form, then chances are good the same reason applies to MyClass, since it's a subclass.
It may be possible to write a custom serialized form for MyClass by implementing readObject and writeObject, in such a way that the state of the MyClass2 instance data in MyClass can be suitably recreated from the serialized data. This would be the way to go if MyClass2's API is fixed and you can't add Serializable.
But first you should figure out why MyClass2 isn't serializable, and maybe change it.
You will need to implement writeObject() and readObject() and do manual serialization/deserialization of those fields. See the javadoc page for java.io.Serializable for details. Josh Bloch's Effective Java also has some good chapters on implementing robust and secure serialization.
You can start by looking into the transient keyword, which marks fields as not part of the persistent state of an object.
Several possibilities have popped up and I summarize them here:
Implement writeObject() and readObject() as sk suggested
declare the property transient and it won't be serialized as first stated by hank
use XStream as stated by boris-terzic
use a Serial Proxy as stated by tom-hawtin-tackline
XStream is a great library for doing fast Java to XML serialization for any object no matter if it is Serializable or not. Even if the XML target format doesn't suit you, you can use the source code to learn how to do it.
A useful approach for serialising instances of non-serializable classes (or at least subclasses of them) is known as a Serial Proxy. Essentially you implement writeReplace to return an instance of a completely different serializable class which implements readResolve to return a copy of the original object. I wrote an example of serialising java.awt.BasicStroke on Usenet.
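The shape of such a proxy looks roughly like this hedged sketch (the Money class and its fields are invented for illustration, not the BasicStroke example): the outer class writes a small proxy instead of itself, and the proxy's readResolve rebuilds the real object through the public constructor, so validation runs again on deserialization.
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.Serializable;

public final class Money implements Serializable {
    private static final long serialVersionUID = 1L;

    private final long cents;
    private final String currency;

    public Money(long cents, String currency) {
        if (currency == null)
            throw new IllegalArgumentException("currency must not be null");
        this.cents = cents;
        this.currency = currency;
    }

    // Serialize a proxy instead of this object.
    private Object writeReplace() {
        return new SerializationProxy(this);
    }

    // Reject any stream that tries to deserialize the outer class directly.
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy required");
    }

    private static class SerializationProxy implements Serializable {
        private static final long serialVersionUID = 1L;

        private final long cents;
        private final String currency;

        SerializationProxy(Money m) {
            this.cents = m.cents;
            this.currency = m.currency;
        }

        // Rebuild the real object; the constructor re-checks its invariants.
        private Object readResolve() {
            return new Money(cents, currency);
        }
    }
}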

Java serialization: readFields() beyond of readObject()?

ObjectInputStream.readFields() can be called only from within a private void readObject(ObjectInputStream) method.
public ObjectInputStream.GetField readFields() throws IOException, ClassNotFoundException {
    SerialCallbackContext ctx = curContext;
    if (ctx == null) {
        throw new NotActiveException("not in call to readObject");
    }
    ...
I'm in a situation where I can't use default serialization for reading an object (i.e. ObjectInputStream.defaultReadObject()) and don't wish to implement a readObject() method in all my classes. In the ideal case I would like to have an ownDefaultReadObject() method that constructs the new object from the serialized fields (e.g. by reflection).
Any ideas?
In case someone would like to know more: the field names in some of my classes were renamed (e.g. by an obfuscator) to a, b, c etc. Such classes were serialized with the renamed fields using default Java serialization. I need to deserialise them to the original classes (I know the pairs of field names for each class: a => fieldName, b => age, c => gender etc.).
To rename fields from an object stream, the method you need to override is ObjectInputStream.readClassDescriptor which returns an ObjectStreamClass.
Instances of ObjectStreamClass fulfil one of two different roles through largely separate subsets of the interface. For the avoidance of doubt, this design choice should not be copied.
One role describes the fields of a serialisable class running in the current JVM instance; find these instances through ObjectStreamClass.lookup.
The other describes the fields of a serialisable class as represented in a particular serialised stream; these instances are returned by implementations of ObjectInputStream.readClassDescriptor.
In your override, call super.readClassDescriptor. This will read in the data from the stream. Substitute the value from the stream with one having the new field names, if it's a class you're interested in.
How to create your own ObjectStreamClass? Write dummy instances of the classes you are interested in to an ObjectOutputStream. You can do this as part of the build, keeping just the binary data. Read it back with another ObjectInputStream whose readClassDescriptor is overridden to stash the descriptors.
ObjectInputStream.defaultReadObject/readFields wouldn't make any sense outside of readObject (or similar) because they rely on the current deserialising object rather than an argument. There are other limitations to prevent other code from calling defaultReadObject to rewrite fields that must remain constant, copied, validated, security checked or similar.
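A hedged sketch of the override (the map of replacement descriptors, stashed as described above, is assumed to exist already; class and variable names are illustrative):
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.util.Map;

class RenamingObjectInputStream extends ObjectInputStream {
    // className -> descriptor stashed earlier, carrying the original field names
    private final Map<String, ObjectStreamClass> replacements;

    RenamingObjectInputStream(InputStream in, Map<String, ObjectStreamClass> replacements) throws IOException {
        super(in);
        this.replacements = replacements;
    }

    @Override
    protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException {
        ObjectStreamClass fromStream = super.readClassDescriptor();   // consume the descriptor from the stream
        ObjectStreamClass substitute = replacements.get(fromStream.getName());
        return substitute != null ? substitute : fromStream;          // swap it if this is a class we care about
    }
}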

Best design approach for creating Immutable Class

I am reading about the specific guidelines that needs to be followed while creating Immutable Class in Effective Java.
I read that in an immutable class, methods should not be allowed to be overridden, because an overridden method may change the behaviour of the class. The following design approaches are available in Java to solve this problem:
We can mark the class final, but as per my understanding this has one disadvantage: it makes the class inextensible.
Secondly, we can make the individual methods final, but I cannot see a disadvantage other than having to mark each method final individually in order to prevent overriding.
As per the book, the better approach is to make the constructor private or package-private and provide a public static factory method for creating the object.
My question is: even with a private or package-private constructor, the class can no longer be extended (anywhere with a private constructor, or outside its package with a package-private one), so it has the same problem as the first approach. How is it considered better than the previous ones?
An immutable object should not be extensible. Why?
Because extending it will allow either direct access to fields (if they are protected which would allow writing methods that change them), or adding state which may be mutable.
Imagine we wrote a class FlexiblyRoundableDouble that extends Double, which has an additional field roundingMode that lets us choose a "rounding mode". You could write a setter for this field, and now your object is mutable.
You can argue that if all the methods are set as final, you cannot change the original behavior of the object. The only methods that could access your roundingMode field are new methods that are not polymorphically available if you assign your object to a Double variable. But when a class's contract says that it's immutable, you make decisions based on that. For example, if you write a clone() method or copy constructor for a class that has Double fields, you know that you don't need to deep-copy the Double fields, as they do not change their state, and can therefore be safely shared between the two clones.
Also, you can write methods that return the internal object without fearing that the caller will then change that object. If the object was mutable, you'd have to make a "defensive copy" of it. But if it's immutable, it's safe to return a reference to the actual internal object.
However, what happens if someone assigned a FlexiblyRoundableDouble to one of your Double fields? That object would be mutable. The clone() would assume it isn't, it will be shared between two objects, perhaps even returned by a method. The caller would then be able to cast it back as a FlexiblyRoundableDouble, change the field... and it will affect other objects that use that same instance.
Therefore, immutable objects should be final.
All this has nothing to do with the constructor issue. Objects can be safely immutable with public constructors (as demonstrated by String, Double, Integer and other standard Java immutables). The static factory method is simply a way utilizing the fact that the object is immutable, and several other objects can hold references to it safely, to create fewer objects with the same value.
Providing a static factory method gives you room to implement the Flyweight Pattern.
They're stating that you should hide the possibility of creating a new object using a constructor, and should rather make a call to a method which checks if an object with similar state exists in the "object pool" (a map filled with objects waiting to be re-used). Not re-using immutable objects is a waste of memory; this is why String literals are encouraged, and new String() is shunned (unless needed).
class ImmutableType {
    private static final Map<Definition, ImmutableType> POOL = new HashMap<>();

    private final Definition definition;

    private ImmutableType(Definition def) {
        definition = def;
    }

    public static ImmutableType get(Definition def) {
        if (POOL.containsKey(def))
            return POOL.get(def);
        else {
            ImmutableType obj = new ImmutableType(def);
            POOL.put(def, obj);
            return obj;
        }
    }
}
Definition stores the state of the ImmutableType. If a type with the same definition already exists in the pool, then re-use it. Otherwise, create it, add it to the pool then return it as the value.
As for the statement about marking the class final, immutable types should not be extensible in the first place (to avoid possibly modifying behavior). Marking every method final is just crazy for immutable classes.

static factory method question!

On this site it says that a new object isn't being created on each invocation, which leads to efficiency, but from what I can see an object is being created each time in the static method:
do not need to create a new object upon each invocation - objects can be cached and reused, if necessary.
http://www.javapractices.com/topic/TopicAction.do?Id=21
So why are static factory methods so efficient?
Isn't writing something like Object obj = new Object() the same as if I did Object obj = Someclass.GetObj();?
class Someclass
{
    public static Object GetObj()
    {
        return new Object();
    }
}
There is caching, but a new object is created either way...
Objects can be cached and reused. They aren't always. There are a number of other advantages, like:
better naming of the method
returning subclasses
There is an item in Effective Java for that, so go ahead and read it. The book is a must-read anyway.
Update: as I said, objects can be cached. But it depends on the implementation. The one you show does not cache them. The one shown by Peter caches them. You have that option. With a constructor - you don't.
They are more flexible - for example, if the input parameters for the new object are not valid, you can return null or some null-object implementation (an instance which does nothing, but will not break your code with a NullPointerException), or, as previously mentioned by others, you can cache created instances. There is another benefit of using factory methods over constructors - you can name them whatever you like, which can be more readable if there are multiple constructors with lots of optional parameters.
EDIT: if you want to use only one instance, you can use this simple factory:
class Someclass{
    private static Object o = new Object();

    public static Object getObj(){
        return o;
    }
}
When you use new Object(), a new Object has to be created.
If you use a static factory, it can optionally create a new object, or it can reuse an existing one.
A simple example is using Integer.valueOf(int) instead of new Integer(int). The static factory has a cache of small integers and can avoid creating a significant portion of the integers used; for some use cases this can be all of them. The latter always creates a new object, which is relatively inefficient.
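For instance, a quick way to see the difference (Integer.valueOf is documented to cache at least the values -128 to 127):
Integer a = Integer.valueOf(100);
Integer b = Integer.valueOf(100);
System.out.println(a == b);                                 // true: the same cached instance
System.out.println(new Integer(100) == new Integer(100));   // false: two fresh objects (constructor deprecated in newer JDKs)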
The link you presented gives a very different explanation of the Factory pattern. Generally the factory pattern is used to obtain instances of classes which implement the same interface but provide different behavior for the same contract. It allows us to choose a different implementation at run time. Check out the example here:
http://www.allapplabs.com/java_design_patterns/factory_pattern.htm
Factory pattern is not generally used for caching objects. Singleton pattern is defined to ensure only one instance of the object is created.
The idea is that you use them as a strategy. If later you want to implement caching, you just change that method and add it in there. Compare this with having "new Bla()" scattered all over the code, and trying to implement caching for the Bla class.
Since the method is static, and usually just a few lines of code, it means it can be resolved at compile time, and even inlined.
Thus there is no advantage of using "new Bla()" instead of factory methods at all.
Using a factory, in some situations you can make your code more flexible, faster and also more readable.
For example, imagine you have to write a class which downloads some data from a URL:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.HashMap;

public class WavAudio {
    private byte[] raw;
    private static HashMap<String, WavAudio> cache = new HashMap<>();

    private WavAudio(byte[] raw) {
        this.raw = raw;
    }

    public static WavAudio loadFromUrl(String someUrl) throws IOException {
        //If the data has been loaded previously we don't have to do it again (faster..)
        if (cache.containsKey(someUrl))
            return cache.get(someUrl);
        //Else we'll load the data (that would take some time)
        InputStream ires = (new URL(someUrl)).openStream();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int nBytesRead;
        while ((nBytesRead = ires.read(buffer, 0, buffer.length)) > 0)
            baos.write(buffer, 0, nBytesRead);
        byte[] downloaded = baos.toByteArray();
        WavAudio curr = new WavAudio(downloaded);
        cache.put(someUrl, curr);
        return curr;
    }

    public static void main(String[] args) throws IOException {
        WavAudio wav1 = WavAudio.loadFromUrl("http://someUrl_1");
        SomePlayer.play(wav1); //the first melody is playing
        WavAudio wav2 = WavAudio.loadFromUrl("http://someUrl_2");
        SomePlayer.play(wav2); //the second melody is playing
        //won't be downloaded twice
        WavAudio wav3 = WavAudio.loadFromUrl("http://someUrl_1");
        SomePlayer.play(wav3);
    }
}

readObject() vs. readResolve() to restore transient fields

According to Serializable javadoc, readResolve() is intended for replacing an object read from the stream. But surely (?) you don't have to replace the object, so is it OK to use it for restoring transient fields and return the original reference, like so:
private Object readResolve() {
    transientField = something;
    return this;
}
as opposed to using readObject():
private void readObject(ObjectInputStream s) {
    s.defaultReadObject();
    transientField = something;
}
Is there any reason to choose one over other, when used to just restore transient fields? Actually I'm leaning toward readResolve() because it needs no parameters and so it could be easily used also when constructing the objects "normally", in the constructor like:
class MyObject {
    MyObject() {
        readResolve();
    }
    ...
}
In fact, readResolve has been defined to give you greater control over the way objects are deserialized. As a consequence, you're free to do whatever you want (including setting a value for a transient field).
However, I imagine your transient field is set to a constant value. Otherwise, it would be a sure sign that something is wrong: either your field is not really transient, or your data model relies on false assumptions.
Use readResolve. The readObject method lets you customize how the object is read, if the format is different than the expected default. This is not what you are trying to do. The readResolve method, as its name implies, is for resolving the object after it is read, and its purpose is precisely to let you resolve object state that is not restored after deserialization. This is what you are trying to do. You may return "this" from readResolve.
