Completing object construction after GSON deserialization


I have successfully started using GSON to serialize and de-serialize a hierarchy of objects in my Android application.
Some of the objects being serialized have members which I must mark as transient (or otherwise use alternative GSON annotations to prevent them being serialized) because they are references to objects that I do not want to serialize as part of the output JSON string. Those references are to objects which must be separately constructed by some other means.
Once the structure is de-serialized back into Java objects, at some point I need to fill in those references. I could easily do this perhaps by using a series of setXXX() type methods, but until that is done, those objects are in an incomplete state. What I am therefore wondering is whether there is a more robust approach to this.
Ways I have thought of so far:
Have the objects throw a RuntimeException (or something more suitable) if they're in an incomplete state; that is, if they're asked to do some work when some initialization method wasn't called.
Separate out the serializable bits into a separate data model object. In other words, take out the stuff that can't be serialized. After GSON de-serialization, build up my 'real' objects using those data objects in their composition. This seems to defeat the convenience of using GSON somewhat.
Write a custom deserializer for GSON to handle the special creation of those objects.
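As a rough sketch of the first option (fail fast while the object is incomplete), something like the following could work. The names `ReportGenerator`, `Printer`, and `attach` are hypothetical and only illustrate the idea; Gson skips `transient` fields by default, so `printer` is null right after deserialization.

```java
import java.util.Objects;

// Hypothetical collaborator that is never serialized.
interface Printer {
    String print(String text);
}

// Hypothetical model class: "title" is serialized, "printer" is not.
class ReportGenerator {
    private String title;
    private transient Printer printer; // null after deserialization

    ReportGenerator(String title) {
        this.title = title;
    }

    // Must be called once after deserialization to complete the object.
    void attach(Printer printer) {
        this.printer = Objects.requireNonNull(printer, "printer");
    }

    String run() {
        if (printer == null) {
            // Fail fast with a clear message instead of a NullPointerException deeper down.
            throw new IllegalStateException("attach(Printer) was not called after deserialization");
        }
        return printer.print(title);
    }
}
```

This does not remove the incomplete state; it only makes misuse visible early, which is why the other two options may still be preferable.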

Check out https://github.com/julman99/gson-fire
It's a library I made that extends Gson to handle cases like post-serialization and post-deserialization hooks.
It also has many other features that I've needed over time with Gson.

I would likely take the second approach, because as I typically design my applications, anything that needs to be serialized/deserialized is really just plain old data, or POJOs if you prefer. If I find myself needing to customize/configure the serialization API to do what I want, I tend to simplify what's being serialized, so the serialization API doesn't need the extra configurations.
So, if I have a more complicated data model, parts of which aren't to be serialized/deserialized, then I extract from it a simpler set of POJOs, as a conceptually separate data model to participate in the serialization/deserialization. This does then indeed require an extra step to map between the two data models, but that's usually pretty simple, also.
If the third approach is preferred, then note also the Instance Creator feature, as it can provide another useful hook into customizing the deserialization process.
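The second approach (a separate, plain data model) might look roughly like this. `AccountData`, `Account`, and `AuditLog` are hypothetical names; only `AccountData` would ever be handed to Gson, and the mapping step is the constructor plus `toData()`.

```java
// Plain data holder: the only type handed to the JSON library.
class AccountData {
    String owner;
    long balanceCents;
}

// Hypothetical service that should not be serialized.
interface AuditLog {
    void record(String message);
}

// The "real" object: never incomplete, because its constructor demands everything.
class Account {
    private final String owner;
    private long balanceCents;
    private final AuditLog log;

    Account(AccountData data, AuditLog log) {
        this.owner = data.owner;
        this.balanceCents = data.balanceCents;
        this.log = log;
    }

    void deposit(long cents) {
        balanceCents += cents;
        log.record(owner + " deposited " + cents);
    }

    // Map back to the data model before serializing.
    AccountData toData() {
        AccountData data = new AccountData();
        data.owner = owner;
        data.balanceCents = balanceCents;
        return data;
    }
}
```

The cost is the extra mapping code; the gain is that no `Account` can exist without its `AuditLog`.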

Related

Serialize an existing Java object with circular dependencies

I have an object from a class I cannot modify.
The object has a circular dependency.
I would like to serialize the object, but I don't have access to java source code, as it's in a library.
In C++ I could create a subclass, override the virtual methods, then cast down to get the desired behavior. In Java this is not possible.
What options do I have besides creating a new POJO class and copying over every field by hand?

Serialization using Serializable is a bit smelly anyway. I'd prefer using an external serializer, using either JSON (e.g. Jackson, GSON) or a binary format (e.g. Kryo). Either way, you can write custom serializers in all of these that resolve your circular dependencies.
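To illustrate what such a custom serializer has to do, here is a library-free sketch that does not use the Jackson, GSON, or Kryo APIs. It writes each object once and replaces later encounters with a reference. `Person` is a hypothetical stand-in for the unmodifiable library class, and the JSON-like output format is an assumption made for the example.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Stand-in for a library class with a cycle: a.partner == b and b.partner == a.
class Person {
    final String name;
    Person partner;

    Person(String name) {
        this.name = name;
    }
}

class CycleSafeWriter {
    // Writes each Person once; a repeated Person becomes {"ref": id}.
    static String write(Person root) {
        StringBuilder out = new StringBuilder();
        write(root, new IdentityHashMap<>(), out);
        return out.toString();
    }

    private static void write(Person p, Map<Person, Integer> seen, StringBuilder out) {
        if (p == null) {
            out.append("null");
            return;
        }
        Integer id = seen.get(p);
        if (id != null) {
            // Already written: emit a back-reference instead of recursing forever.
            out.append("{\"ref\":").append(id).append('}');
            return;
        }
        id = seen.size() + 1;
        seen.put(p, id);
        // Note: the name is not escaped; a real serializer must escape strings.
        out.append("{\"id\":").append(id)
           .append(",\"name\":\"").append(p.name).append("\",\"partner\":");
        write(p.partner, seen, out);
        out.append('}');
    }
}
```

For two persons "a" and "b" that point at each other, `CycleSafeWriter.write(a)` produces `{"id":1,"name":"a","partner":{"id":2,"name":"b","partner":{"ref":1}}}`.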

I'd opt for Jackson, even if you are going to be the only one reading/writing your objects. Many advantages, up to and including:
Diagnosing/debugging the serialized JSON makes life easy.
Staying on this well-trodden path means solutions to any questions are a Google away, e.g., http://www.baeldung.com/jackson-deserialization

Should I reuse one instance of GSON or create new ones on demand?

In most of my classes (especially server resources) I tend to create new instances of com.google.gson.Gson on demand. Sometimes I create them with the default constructor (for handling of simple POJOs), sometimes I use more sophisticated variants created with custom com.google.gson.GsonBuilder.
I know that Gson is a threadsafe class, so there is nothing standing against reusing the same instance of Gson instead of creating new ones. Heck, I might even reuse a static constant for this!
My question is this: should I create new instances whenever I need them, or should I create and use just one? What sort of performance implications will I be facing, if I serialize simple POJO with a Gson instance that was created with GsonBuilder and taught how to parse more complex data structures (had few custom serializers being registered)?

I know this is an old question but for future reference, the answer to this question is that you should go for the single instance if it is possible.
Creating a Gson object has a cost that grows with the number of custom serializers, deserializers, and handlers you register on it, although I doubt that you will see any big performance boost from reusing it.
About the second question: Gson internally keeps a list of registered serializers, and each one is checked against the object you are trying to parse. So you basically add more iterations each time you register a custom serializer, but again, this is not a big performance issue compared to having a couple of big, clumsy objects in memory.

Write object with transient attributes to stream (Java)

I want to write an object into a stream (or byte array) with its transient attributes to be able to reconstruct it in another VM. I don't want to modify its attributes because that object is a part of legacy application.
Standard Java serialization mechanism doesn't help. What other options do I have?
Update:
The reason I'm asking the question is that I want to modify an existing Spring application. It called a bean's method in-process earlier but now I want to move the bean on a separate machine and use Spring remoting through HTTP invoker. And I have a problem with parameters that have transient fields that need to be passed to this method but not needed to be serialized in other parts of the app.

Hmm - if an attribute is marked as transient, that means exactly that it's not meant to be considered part of the object's persistent state, e.g. for serialization. The fact that you want to do this at all is a code smell, and the correct solution is to stop those fields being transient.
Let's say though that for whatever reason you can't modify the target classes themselves. My first thought was that you could customise the serialisation by implementing readObject() and writeObject() methods, but that would also require changes to the target class.
In that case, you'll need to work with some kind of reflection-based or metadata-based API in order to do this. There are many libraries that will convert objects to and from XML or JSON or DB rows, etc. Your best bet would be to use one of these to convert the object to and from "hydrated" form (and likely you'll need to customise them, as any sane serialiser will ignore transient fields). Which one to pick depends on your current software stack, and your precise requirements.

I assume you cannot change the legacy code. In this case I think you will have to resort to going over the object fields with reflection and DataOutputStream.
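A minimal sketch of that idea, assuming the legacy class only has `int` and `String` fields of interest. `LegacyParam` is a hypothetical stand-in for the real class, and the name/value layout written to the stream is an arbitrary choice for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Stand-in for the legacy class that cannot be edited.
class LegacyParam {
    private String name = "job-7";
    private transient int retries = 3; // skipped by default Java serialization
}

class ReflectiveWriter {
    // Writes every non-static int and String field declared on the class,
    // transient or not, as name/value pairs. Superclass fields, null strings,
    // and other field types are not handled in this sketch.
    static byte[] write(Object target) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Field f : target.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue;
                }
                f.setAccessible(true); // reach private fields
                if (f.getType() == int.class) {
                    out.writeUTF(f.getName());
                    out.writeInt(f.getInt(target));
                } else if (f.getType() == String.class) {
                    out.writeUTF(f.getName());
                    out.writeUTF((String) f.get(target));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }
}
```

The receiving side would read the pairs back with a `DataInputStream` and set the fields reflectively in the same way.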

transient variables are supposed to be those that aren't serializable or are easily recalculated.
My first suggestion is to look for methods on this object to recalculate the transient fields.

Difference between serializing and deserializing and writing internals to a file and then reading them and passing them in constructor

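Let's say we have a class:
import java.io.Serializable;
import java.util.Date;

class A implements Serializable {
    String s;
    int i;
    Date d;

    public A() {
    }

    public A(String s, int i, Date d) {
        this.s = s;
        this.i = i; // the remaining assignments were elided ("blah blah") in the original
        this.d = d;
    }
}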
Now let's say that, one way, I store all the internal values s, i, and d in a file, read them back, pass them to the constructor, and create a new object. The other way, I serialize the object and then deserialize it into a new object. What is the basic difference between the two approaches?
I know serialization will be slow and secure, and the other approach is not. Are there any other differences?

Read this article; it explains quite well what serialization is about (it is written for Java RMI, but the explanation of serialization and its problems is the same): http://oreilly.com/catalog/javarmi/chapter/ch10.html
The main differences I see are:
(As the other answers say) with the file approach you are responsible for serializing and deserializing yourself. What is going to happen when one of the properties is another big, complex class? What are you going to do then? Save its values as well?
Serialization depends on reflection, while the file thing depends on getters/setters/constructors. With reflection you don't need public setters/getters or a constructor with parameters. With the file thing you need them.
Extracted from the link above:
Using Serialization
Serialization is a mechanism built into the core Java libraries for writing a graph of objects into a stream of data. This stream of data can then be programmatically manipulated, and a deep copy of the objects can be made by reversing the process. This reversal is often called deserialization.
In particular, there are three main uses of serialization:
As a persistence mechanism. If the stream being used is FileOutputStream, then the data will automatically be written to a file.
As a copy mechanism. If the stream being used is ByteArrayOutputStream, then the data will be written to a byte array in memory. This byte array can then be used to create duplicates of the original objects.
As a communication mechanism. If the stream being used comes from a socket, then the data will automatically be sent over the wire to the receiving socket, at which point another program will decide what to do.
The important thing to note is that the use of serialization is independent of the serialization algorithm itself. If we have a serializable class, we can save it to a file or make a copy of it simply by changing the way we use the output of the serialization mechanism.
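The "copy mechanism" use can be sketched with the standard `ObjectOutputStream` and `ByteArrayOutputStream` classes. The helper name `DeepCopy.copy` is made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DeepCopy {
    // Serializes the object graph to memory and reads it back as a new graph.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }
}
```

Swapping the `ByteArrayOutputStream` for a `FileOutputStream` or a socket stream gives the persistence and communication uses without touching the class being serialized.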

In your first approach, you are responsible for maintaining the logical relationship between the data values (in the sense that you store the data and then read it back and construct the object back).
In the second approach, Java does this for you behind the scenes.
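The two approaches from the question can be put side by side. This is only a sketch: `Sample` mirrors the question's class `A`, `TwoWays` and its method names are invented here, and byte arrays stand in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Date;

class Sample implements Serializable {
    private static final long serialVersionUID = 1L;
    final String s;
    final int i;
    final Date d;

    Sample(String s, int i, Date d) {
        this.s = s;
        this.i = i;
        this.d = d;
    }
}

class TwoWays {
    // Approach 1: we pick the format and the field order ourselves.
    static byte[] writeFields(Sample a) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(a.s);
            out.writeInt(a.i);
            out.writeLong(a.d.getTime());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Approach 1, reading: values go back through the constructor.
    static Sample readFields(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            // Arguments are evaluated left to right, matching the write order.
            return new Sample(in.readUTF(), in.readInt(), new Date(in.readLong()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Approach 2: the JDK records class metadata and field values for us.
    static byte[] serialize(Sample a) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(a);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Sample deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Sample) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the first pair, adding a field means editing both methods by hand; in the second pair nothing changes, but the stream also carries class metadata and is therefore larger.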

Serialization and Deserialization in Java
Serialization is a process by which we can store the state of an object into any storage medium. We can store the state of the object into a file, into a database table etc. Deserialization is the opposite process of serialization where we retrieve the object back from the storage medium.
Example: Assume you have a Java bean object whose variables hold some values. Now you want to store this object in a file or in a database table. This can be achieved using serialization. You can later retrieve the object from the file or database whenever you need it. This can be achieved using deserialization. (Post by Bobin Goswami.)

There is no real difference, other than that you are implementing a custom serialization scheme, which will typically involve more code, since default serialization requires just an interface declaration.
You can achieve something very similar with Externalizable - you are in control of exactly what data is saved, so you can choose to save just the constructor arguments and construct the object from that. (You could achieve this also with serialization by marking non-constructor arguments as transient.)
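A small sketch of the `Externalizable` variant, in which only the constructor argument is written and the derived state is rebuilt on read. The `Temperature` class and its `roundTrip` helper are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class Temperature implements Externalizable {
    private double celsius;
    private double fahrenheit; // derived value, never written to the stream

    public Temperature() { } // Externalizable requires a public no-arg constructor

    Temperature(double celsius) {
        init(celsius);
    }

    private void init(double c) {
        celsius = c;
        fahrenheit = c * 9 / 5 + 32;
    }

    double fahrenheit() {
        return fahrenheit;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeDouble(celsius); // only the "constructor argument" is saved
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        init(in.readDouble()); // same initialization path as the constructor
    }

    // Helper for the example: serialize to memory and read back.
    static Temperature roundTrip(Temperature t) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(t);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Temperature) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```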

The section on Serialization in Joshua Bloch's Effective Java, 2nd Ed. is really a good read on this subject. Something that is very important to keep in mind:
Using your own homegrown persistence method is intralinguistic. When you read data back from a store, you control how an object's state is restored. Very often this is with constructors and/or static factories. The invariants of the object's state are preserved. Encapsulation is maintained because you don't necessarily need to disclose implementation details as part of the custom store. The downside, of course, is that data very often needs to go places, and @pakore nicely outlined those situations in which serialization is useful.
Serialization is an extralinguistic mechanism. Bloch makes compelling arguments for why serialization (in particular, the Serializable interface) should be invoked only with the greatest of care. Serialization can bypass constructors because reconstitution of objects does not depend on one. There are profound possible security concerns. The invariants of your object's state are vulnerable. Moreover, using Serializable tends to lock you into supporting a particular class implementation (i.e., it destroys encapsulation) because much of your object's state becomes part of the class's exported API once it becomes Serializable (this can be proactively deferred by marking certain instance fields as transient).
TL;DR: Serialization is a common and even fundamental aspect of modern Java-based computing. Data these days must go places, and serialization provides a commonly used mechanism for communication. Because of the vulnerabilities that serialization may introduce and because it may cause much (or all) of your object's internal state to become part of its exported API, the Serializable interface should be used with the greatest of care.
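The point that deserialization bypasses constructors can be observed directly. In this sketch the hypothetical `Range` class counts constructor calls in a static field, which is not part of the serialized state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Range implements Serializable {
    private static final long serialVersionUID = 1L;
    static int constructorCalls = 0; // static, so never serialized

    final int low;
    final int high;

    Range(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low > high"); // the invariant
        }
        constructorCalls++;
        this.low = low;
        this.high = high;
    }

    // Helper for the example: serialize to memory and read back.
    static Range roundTrip(Range r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(r);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Range) in.readObject(); // does not run Range(int, int)
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the constructor check never runs on this path, a tampered byte stream could yield a `Range` with `low > high`; re-checking the invariant in a private `readObject` method is the usual defence.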

How to cache any object type to memory/disk in java?

Is there a generic way to cache any type of object (be in a java class, or a word document etc.) to memory or disk?
Is simply serializing the object, and retaining the file extension (if it has one) enough to rebuild the object?

You seem to be using the word "object" to describe two different things.
If your object is a Java object, then having it implement Serializable is enough, provided you then use the Java methods to serialize and de-serialize it.
If you want to cache arbitrary data from the filesystem, the best way is to read it into a byte array (or an ArrayList). Then you can just write the array back to disk or wherever you want it.
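For the arbitrary-file case, a minimal in-memory cache keyed by path might look like this. `FileBytesCache` is a made-up name, and the sketch has no eviction or size limit.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FileBytesCache {
    private final Map<Path, byte[]> cache = new ConcurrentHashMap<>();

    // Reads the file on first access; later calls return the cached array.
    // The array is shared, so callers should not modify it.
    byte[] get(Path file) {
        return cache.computeIfAbsent(file, f -> {
            try {
                return Files.readAllBytes(f);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Writes the cached bytes of "source" to "target", e.g. back to disk.
    void writeTo(Path source, Path target) {
        try {
            Files.write(target, get(source));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Nothing here depends on the file extension: a Word document and a serialized Java object are both just bytes at this level.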

If you're talking about the inbuilt Java serialization, then you wouldn't even need to retain the file extension. The serialized form has enough information such that the deserialization process will produce an identical object without any additional help. I suppose that depending on how your code is structured, though, you might need to store some metadata for your own benefit so that you know what to cast the resulting Object as.
Note that Java serialization doesn't seem to fit your requirements, though - it cannot serialize any type of object, only those that implement Serializable. Perhaps you need to think a little more about what you mean by "simply serializing the object", since that's the rub.

No.
There is a class of objects which cannot be deserialized in a meaningful way. Think of an open network connection which is in the middle of transferring a file. You can not store that to disk, close your app, open your app, deserialize that connection and expect that it "just continues".
Java has an interface Serializable which indicates that an object can be serialized. It's up to you to ensure that is indeed possible. Typically an object is Serializable if all the data it holds is Serializable, or that data which is not Serializable is marked transient.
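As a sketch of the transient case: the hypothetical `Session` class below holds a plain `Object` (which is not Serializable) in a transient field and rebuilds it in the private `readObject` hook that Java serialization calls. Without that hook the field would simply be null after deserialization, since field initializers do not run.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Session implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String user;                   // serialized
    private transient Object lock = new Object(); // not Serializable, so marked transient

    Session(String user) {
        this.user = user;
    }

    String user() {
        return user;
    }

    boolean hasLock() {
        return lock != null;
    }

    // Called by ObjectInputStream; restores the non-transient fields,
    // then rebuilds the state that was deliberately left out.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        lock = new Object();
    }

    // Helper for the example: serialize to memory and read back.
    static Session roundTrip(Session s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(s);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```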
This is not to say that you could not, theoretically, dump the memory contents to a file as a byte stream, and read it back again later. You could build something like that I suppose. But to expect that it works is a different thing altogether.
In short, it is not possible to serialize any type. However, there is a generic way to serialize Java objects which are marked to be Serializable.

Not sure what you mean by "or a word document". Serialization can be used for disk caching, not sure what the purpose of using it in memory would be since it would probably be far faster to simply keep the original object.
A more robust solution might be Ehcache, which can manage the size of the cache as well as move it between memory and disk.

If you're wondering about the cross platform (disk or memory) persistence part of the question, look at Java's Preferences class.

My, what a lot of answers!
Any object can make itself serializable by implementing java.io.Serializable.
But:
A default serialiser is implemented in ObjectOutputStream, which simply walks the object tree. This is fine for simple javabean type objects, but it can have undesirable effects such as system objects being serialised (I once inspected a serialised java object file and found that it was including all of the system timezone objects). And, of course, if your object has objects inside it that are not serializable (and not transient), then ObjectOutputStream will throw an exception.
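The exception case is easy to reproduce. In this sketch `Engine`, `Car`, `SafeCar`, and `SerializationCheck` are hypothetical names; the only real API used is `ObjectOutputStream` and `NotSerializableException`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class Engine { } // does not implement Serializable

class Car implements Serializable {
    Engine engine = new Engine(); // non-transient, non-serializable field
}

class SafeCar implements Serializable {
    transient Engine engine = new Engine(); // skipped by ObjectOutputStream
}

class SerializationCheck {
    // Returns false when the object graph contains something that is not Serializable.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

`canSerialize(new Car())` returns false, while `canSerialize(new SafeCar())` returns true because the offending field is skipped.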
(Actually, even for JavaBean objects the default serializer is awful: it emits the class name of java.lang.String for every string field.)
So if your object is complicated, then you really should implement Externalizable and write a serialiser and deserializer with some smarts.
http://download.oracle.com/javase/6/docs/platform/serialization/spec/serial-arch.html#7185
So basically - no, you can't serialise any old object. You have to design objects that are intended to be serialised and, ideally, that have some smarts about how they get themselves to and from a stream.

You cannot serialize just any object in Java. Moreover, Java uses shallow copying (or is it called something else?) for serialization, so if you want to serialize something like a HashMap, it might not save your data.
