Write object with transient attributes to stream (Java)


I want to write an object into a stream (or byte array) together with its transient attributes, so that I can reconstruct it in another VM. I don't want to modify its attributes, because the object is part of a legacy application.
The standard Java serialization mechanism doesn't help. What other options do I have?
Update:
The reason I'm asking is that I want to modify an existing Spring application. It used to call a bean's method in-process, but now I want to move the bean to a separate machine and use Spring remoting through the HTTP invoker. The problem is that some parameters have transient fields that need to be passed to this method but do not need to be serialized in other parts of the app.

Hmm - if an attribute is marked as transient, that means exactly that it's not meant to be considered part of the object's persistent state, e.g. for serialization. The fact that you want to do this at all is a code smell, and the correct solution is to stop those fields being transient.
Let's say though that for whatever reason you can't modify the target classes themselves. My first thought was that you could customise the serialisation by implementing readObject() and writeObject() methods, but that would also require changes to the target class.
In that case, you'll need to work with some kind of reflection-based or metadata-based API in order to do this. There are many libraries that will convert objects to and from XML or JSON or DB rows, etc. Your best bet would be to use one of these to convert the object to and from "hydrated" form (and likely you'll need to customise them, as any sane serialiser will ignore transient fields). Which one to pick depends on your current software stack, and your precise requirements.

I assume you cannot change the legacy code. In this case I think you will have to resort to going over the object fields with reflection and DataOutputStream.
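A minimal sketch of that reflection approach, under stated assumptions: LegacyParam is a hypothetical stand-in for the legacy class, ReflectiveCodec is an illustrative helper name, and only int and String fields declared directly on the class are handled. Fields are sorted by name so that the writing and the reading VM agree on the order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for a legacy class that cannot be modified.
class LegacyParam {
    private String name;
    private transient int sessionCount; // skipped by standard serialization

    LegacyParam() { }

    LegacyParam(String name, int sessionCount) {
        this.name = name;
        this.sessionCount = sessionCount;
    }

    String getName() { return name; }
    int getSessionCount() { return sessionCount; }
}

class ReflectiveCodec {
    // Instance fields declared on the class itself, in a stable name-sorted order.
    private static Field[] instanceFields(Class<?> type) {
        Field[] fields = Arrays.stream(type.getDeclaredFields())
                .filter(f -> !Modifier.isStatic(f.getModifiers()))
                .sorted(Comparator.comparing(Field::getName))
                .toArray(Field[]::new);
        for (Field f : fields) {
            f.setAccessible(true);
        }
        return fields;
    }

    // Writes every instance field, transient ones included (int and String only).
    static byte[] write(Object source) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Field f : instanceFields(source.getClass())) {
                if (f.getType() == int.class) {
                    out.writeInt(f.getInt(source));
                } else if (f.getType() == String.class) {
                    String value = (String) f.get(source);
                    out.writeBoolean(value != null); // marker so null survives the trip
                    if (value != null) {
                        out.writeUTF(value);
                    }
                } else {
                    throw new IllegalArgumentException("Unsupported field: " + f);
                }
            }
        } catch (IOException | IllegalAccessException e) {
            throw new IllegalStateException("Could not write " + source.getClass(), e);
        }
        return bytes.toByteArray();
    }

    // Fills an existing instance from bytes produced by write().
    static void read(byte[] data, Object target) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (Field f : instanceFields(target.getClass())) {
                if (f.getType() == int.class) {
                    f.setInt(target, in.readInt());
                } else if (f.getType() == String.class) {
                    f.set(target, in.readBoolean() ? in.readUTF() : null);
                } else {
                    throw new IllegalArgumentException("Unsupported field: " + f);
                }
            }
        } catch (IOException | IllegalAccessException e) {
            throw new IllegalStateException("Could not read " + target.getClass(), e);
        }
    }
}
```

A real legacy class would also need superclass fields, nested objects and the remaining primitive types to be covered, which is where a ready-made reflection-based library starts to pay off.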

transient variables are supposed to be those that aren't serializable or are easily recalculated.
My first suggestion is to look for methods on this object to recalculate the transient fields.

Related

Check whether a Java Object has been modified

I would like to use a clean/automatic way to check if a Java Object has been modified.
My specific problem is the following:
In my Java application, I use the XStream library to deserialize XML to Java objects, which the user can then modify. I'd like a way to check whether these objects in memory differ at some point from the serialized ones, so I can inform the user and ask whether they want to save the changes (i.e. serialize using XStream) or not.
In my application there are many objects, and they are quite complex.
Please consider that I don't use databases in my application, so I'm not interested in solutions like using hibernate.

Two approaches:
Implement a hashcode for your objects, and compare the hashcode of the in-memory objects against the hashcode of the serialized objects to see if they've been changed. This has a low impact on your class design, but every check has to recompute the hashcode of every object, so its cost grows with the number and size of your objects. Note that two objects might return the same hashcode, but a good hashing implementation will make this very unlikely. If you are concerned about this, implement and use your own equals() method.
Have your objects implement the Observer pattern and have each setter method, or any other method that modifies the object, notify the observer when it's called. Performance will be better for large numbers of objects (as long as they aren't changing constantly), but it requires you to introduce Observer code into possibly lightweight classes. Java provides the java.util.Observable class and the java.util.Observer interface as utilities (both deprecated since Java 9), but you'll still need to do most of the work.

You can store a version field in the object. Whenever the object changes, it should increment its version field; you can then compare that value with the version field of the serialized object.
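A minimal sketch of the version-field idea. The class names Note and ChangeTracker are hypothetical, and the tracker simply remembers the version seen at the last save instead of re-reading the serialized copy.

```java
import java.util.Objects;

// Hypothetical domain object that counts its own modifications.
class Note {
    private String title;
    private int version; // incremented on every real change

    Note(String title) {
        this.title = title;
    }

    String getTitle() { return title; }
    int getVersion() { return version; }

    void setTitle(String newTitle) {
        // Only count the call as a change when the value actually differs.
        if (!Objects.equals(title, newTitle)) {
            title = newTitle;
            version++;
        }
    }
}

// Remembers the version that was current when the object was last saved.
class ChangeTracker {
    private final Note note;
    private int savedVersion;

    ChangeTracker(Note note) {
        this.note = note;
        this.savedVersion = note.getVersion();
    }

    boolean isModified() {
        return note.getVersion() != savedVersion;
    }

    void markSaved() {
        savedVersion = note.getVersion();
    }
}
```

Note that a counter reports a change even when a value is later set back to what it was; whether that matters depends on how strict the "unsaved changes" prompt needs to be.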

Completing object construction after GSON deserialization

I have successfully started using GSON to serialize and de-serialize a hierarchy of objects in my Android application.
Some of the objects being serialized have members which I must mark as transient (or otherwise use alternative GSON annotations to prevent them being serialized) because they are references to objects that I do not want to serialize as part of the output JSON string. Those references are to objects which must be separately constructed by some other means.
Once the structure is de-serialized back into Java objects, at some point I need to fill in those references. I could easily do this perhaps by using a series of setXXX() type methods, but until that is done, those objects are in an incomplete state. What I am therefore wondering is whether there is a more robust approach to this.
Ways I have thought of so far:
Have the objects throw a RuntimeException (or something more suitable) if they're in an incomplete state; that is, if they're asked to do some work when some initialization method wasn't called.
Separate out the serializable bits into a separate data model object. In other words, take out the stuff that can't be serialized. After GSON de-serialization, build up my 'real' objects using those data objects in their composition. This seems to defeat the convenience of using GSON somewhat.
Write a custom deserializer for GSON to handle the special creation of those objects.

Check out https://github.com/julman99/gson-fire
It's a library I made that extends Gson to handle cases like Post-serialization and Post-deserialization
Also it has many other cool features that I've needed over time with Gson.

I would likely take the second approach, because as I typically design my applications, anything that needs to be serialized/deserialized is really just plain old data, or POJOs if you prefer. If I find myself needing to customize/configure the serialization API to do what I want, I tend to simplify what's being serialized, so the serialization API doesn't need the extra configurations.
So, if I have a more complicated data model, parts of which aren't to be serialized/deserialized, then I extract from it a simpler set of POJOs, as a conceptually separate data model to participate in the serialization/deserialization. This does then indeed require an extra step to map between the two data models, but that's usually pretty simple, also.
If the third approach is preferred, then note also the Instance Creator feature, as it can provide another useful hook into customizing the deserialization process.
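The second approach (a separate plain data model) might look roughly like the sketch below. PlayerData, SoundService and Player are hypothetical names. The point is that the "real" object takes both the deserialized data and its non-serializable collaborators in its constructor, so it can never exist in an incomplete state.

```java
import java.util.Objects;

// Plain data: the only part a serializer such as Gson would ever see.
class PlayerData {
    String name;
    int score;

    PlayerData(String name, int score) {
        this.name = name;
        this.score = score;
    }
}

// Hypothetical collaborator that must be constructed by some other means.
class SoundService {
    String play(String clip) {
        return "playing " + clip;
    }
}

// The 'real' object is composed from the data object plus its collaborators.
class Player {
    private final PlayerData data;
    private final SoundService sound;

    Player(PlayerData data, SoundService sound) {
        // Failing here, at construction time, replaces the "incomplete state" checks.
        this.data = Objects.requireNonNull(data, "data");
        this.sound = Objects.requireNonNull(sound, "sound");
    }

    // Hands back the serializable part when it is time to save.
    PlayerData toData() {
        return data;
    }

    String celebrate() {
        return sound.play(data.name + "-fanfare");
    }
}
```

After deserialization the mapping step is then a single constructor call per object, for example new Player(data, soundService).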

Difference between serializing and deserializing and writing internals to a file and then reading them and passing them in constructor

Let's say we have a class:
import java.io.Serializable;
import java.util.Date;
class A implements Serializable {
    String s;
    int i;
    Date d;
    public A() {
    }
    // Stores the three values that make up the object's state.
    public A(String s, int i, Date d) {
        this.s = s;
        this.i = i;
        this.d = d;
    }
}
Now let's say that in one approach I store the internal values s, i and d in a file, read them back, and pass them to the constructor to create a new object. In the second approach I serialize the object and then deserialize it into a new object. What is the basic difference between the two approaches?
I know serialization will be slow and secure, and the other approach is not. Are there any other differences?

Read this article; it explains quite well what serialization is about (it is written for Java RMI, but the serialization explanation and problems are the same): http://oreilly.com/catalog/javarmi/chapter/ch10.html
The main differences I see are:
(As the other answers say) you are responsible for serializing and deserializing. What is going to happen when one of the properties is another big, complex class? What are you going to do then? Save its value as well?
Serialization depends on reflection, while the file thing depends on getters/setters/constructors. With reflection you don't need public setters/getters or a constructor with parameters. With the file thing you need them.
Extracted from the link above:
Using Serialization
Serialization is a mechanism built into the core Java libraries for writing a graph of objects into a stream of data. This stream of data can then be programmatically manipulated, and a deep copy of the objects can be made by reversing the process. This reversal is often called deserialization.
In particular, there are three main uses of serialization:
As a persistence mechanism. If the stream being used is FileOutputStream, then the data will automatically be written to a file.
As a copy mechanism. If the stream being used is ByteArrayOutputStream, then the data will be written to a byte array in memory. This byte array can then be used to create duplicates of the original objects.
As a communication mechanism. If the stream being used comes from a socket, then the data will automatically be sent over the wire to the receiving socket, at which point another program will decide what to do.
The important thing to note is that the use of serialization is independent of the serialization algorithm itself. If we have a serializable class, we can save it to a file or make a copy of it simply by changing the way we use the output of the serialization mechanism.
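The "copy mechanism" use from the excerpt can be sketched as follows. DeepCopy is a hypothetical helper name; the only assumption is that the whole object graph is Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DeepCopy {
    // Serializes the graph into a byte array and reads it straight back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Copy failed", e);
        }
    }
}
```

Swapping the ByteArrayOutputStream for a FileOutputStream or a socket stream would give the persistence and communication uses without touching the class being serialized.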

In your first approach, you are responsible for maintaining the logical relationship between the data values (in the sense that you store the data and then read it back and construct the object back).
In the second approach, Java does this for you behind the scenes.

Serialization and Deserialization in Java
Serialization is a process by which we can store the state of an object into any storage medium. We can store the state of the object into a file, into a database table etc. Deserialization is the opposite process of serialization where we retrieve the object back from the storage medium.
Eg1: Assume you have a Java bean object and its variables are having some values. Now you want to store this object into a file or into a database table. This can be achieved using serialization. Now you can retrieve this object again from the file or database at any point of time when you need it. This can be achieved using deserialization: (Post by Bobin Goswami).

No real difference, other than that you are implementing a custom serialization scheme. That will typically involve more code, since default serialization requires just an interface declaration.
You can achieve something very similar with Externalizable - you are in control of exactly what data is saved, so you can choose to save just the constructor arguments and construct the object from that. (You could achieve this also with serialization by marking non-constructor arguments as transient.)
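A sketch of the Externalizable variant, assuming a hypothetical class Reading with the same three fields as the question plus one derived field. Only the constructor arguments are written, and the derived field is rebuilt on read. Externalizable requires a public no-arg constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.Date;

class Reading implements Externalizable {
    private static final long serialVersionUID = 1L;

    private String s;
    private int i;
    private Date d;
    private String label; // derived from s and i, never written to the stream

    public Reading() { } // required by Externalizable

    Reading(String s, int i, Date d) {
        init(s, i, d);
    }

    private void init(String s, int i, Date d) {
        this.s = s;
        this.i = i;
        this.d = d;
        this.label = s + "#" + i;
    }

    String getLabel() { return label; }
    Date getDate() { return d; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // Save just the constructor arguments.
        out.writeUTF(s);
        out.writeInt(i);
        out.writeLong(d.getTime());
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Arguments are evaluated left to right, matching the write order.
        init(in.readUTF(), in.readInt(), new Date(in.readLong()));
    }

    static Reading roundTrip(Reading original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Reading) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }
}
```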

The section on Serialization in Joshua Bloch's Effective Java, 2nd Ed. is really a good read on this subject. Something that is very important to keep in mind:
Using your own homegrown persistence method is intralinguistic. When you read data back from a store, you control how an object's state is restored. Very often this is with constructors and/or static factories. The invariants of the object's state are preserved. Encapsulation is maintained because you don't necessarily need to disclose implementation details as part of the custom store. The downside, of course, is that data very often needs to go places, and @pakore nicely outlined the situations in which serialization is useful.
Serialization is an extralinguistic mechanism. Bloch makes compelling arguments for why serialization (in particular, the Serializable interface) should be invoked only with the greatest of care. Serialization can bypass constructors because reconstitution of objects does not depend on one. There are profound possible security concerns. The invariants of your object's state are vulnerable. Moreover, using Serializable tends to lock you into supporting a particular class implementation (i.e., it destroys encapsulation) because much of your object's state becomes part of the class's exported API once it becomes Serializable (this can be proactively deferred by marking certain instance fields as transient).
TL;DR: Serialization is a common and even fundamental aspect of modern Java-based computing. Data these days must go places, and serialization provides a commonly used mechanism for communication. Because of the vulnerabilities that serialization may invoke and because it may cause much (or all) of your object's internal state to become part of its exported API, the Serializable interface should be used with the greatest of care.

How to cache any object type to memory/disk in java?

Is there a generic way to cache any type of object (be it a Java class, a Word document, etc.) to memory or disk?
Is simply serializing the object, and retaining the file extension (if it has one) enough to rebuild the object?

You seem to be using the word Object to describe two different things.
If your object is a Java object, then having it implement Serializable is enough, provided you then use the Java methods to serialize and deserialize it.
If you want to cache arbitrary data from the filesystem, the best way is to read it into a byte array (or an ArrayList). Then you can just write the array back to disk, or wherever you want it.
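For the arbitrary-file case, a minimal sketch using java.nio.file.Files. FileByteCache is a hypothetical class name, and an unbounded HashMap is assumed to be acceptable for the amount of data involved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

class FileByteCache {
    // Raw file contents keyed by absolute path; no eviction in this sketch.
    private final Map<String, byte[]> cache = new HashMap<>();

    // Reads the file on first use and serves later calls from memory.
    byte[] load(Path file) {
        return cache.computeIfAbsent(file.toAbsolutePath().toString(), key -> {
            try {
                return Files.readAllBytes(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Writes the cached bytes of 'original' to 'target', byte for byte.
    void restore(Path original, Path target) {
        try {
            Files.write(target, load(original));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the bytes are stored untouched, the file extension plays no role in rebuilding the content; it only matters to whatever program opens the restored file.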

If you're talking about the inbuilt Java serialization, then you wouldn't even need to retain the file extension. The serialized form has enough information such that the deserialization process will produce an identical object without any additional help. I suppose that depending on how your code is structured, though, you might need to store some metadata for your own benefit so that you know what to cast the resulting Object as.
Note that Java serialization doesn't seem to fit your requirements, though - it cannot serialize any type of object, only those that implement Serializable. Perhaps you need to think a little more about what you mean by "simply serializing the object", since that's the rub.

No.
There is a class of objects which cannot be deserialized in a meaningful way. Think of an open network connection which is in the middle of transferring a file. You can not store that to disk, close your app, open your app, deserialize that connection and expect that it "just continues".
Java has an interface Serializable which indicates that an object can be serialized. It's up to you to ensure that is indeed possible. Typically an object is Serializable if all the data it holds is Serializable, or that data which is not Serializable is marked transient.
This is not to say that you could not, theoretically, dump the memory contents to a file as a byte stream, and read it back again later. You could build something like that I suppose. But to expect that it works is a different thing altogether.
In short, it is not possible to serialize any type. However, there is a generic way to serialize Java objects which are marked to be Serializable.
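The rule about non-serializable data can be illustrated with a small sketch. LiveSocket, BadHolder, GoodHolder and SerializableCheck are hypothetical names; LiveSocket stands in for something like an open connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical resource that does not implement Serializable.
class LiveSocket { }

// Claims to be Serializable but holds a non-serializable, non-transient field.
class BadHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    String host = "example.org";
    LiveSocket socket = new LiveSocket();
}

// Same data, but the problematic field is excluded from the serialized form.
class GoodHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    String host = "example.org";
    transient LiveSocket socket = new LiveSocket();
}

class SerializableCheck {
    // Returns false when default serialization rejects the object graph.
    static boolean canSerialize(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

After deserialization the transient socket field of GoodHolder would be null, so the class still has to re-establish the connection itself.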

Not sure what you mean by "or a word document". Serialization can be used for disk caching, but I'm not sure what the purpose of using it in memory would be, since it would probably be far faster to simply keep the original object.
A more robust solution might be Ehcache, which can manage the size of the cache as well as move it between memory and disk.

If you're wondering about the cross platform (disk or memory) persistence part of the question, look at Java's Preferences class.

My, what a lot of answers!
Any object can make itself serializable by implementing java.io.Serializable.
But:
A default serialiser is implemented in ObjectOutputStream, which simply walks the object tree. This is fine for simple javabean type objects, but it can have undesirable effects such as system objects being serialised (I once inspected a serialised java object file and found that it was including all of the system timezone objects). And, of course, if your object has objects inside it that are not serializable (and not transient), then ObjectOutputStream will throw an exception.
(Actually, even for JavaBean objects the default serializer is awful - the default serializer emits the classname of java.lang.String for every string field.)
So if your object is complicated, then you really should implement Externalizable and write a serialiser and deserializer with some smarts.
http://download.oracle.com/javase/6/docs/platform/serialization/spec/serial-arch.html#7185
So basically - no, you can't serialise any old object. You have to design objects that are intended to be serialised and, ideally, that have some smarts about how they get themselves to and from a stream.

You cannot serialize just any object in Java. Moreover, Java uses shallow copying (or is it called something else) for serialization, so if you want to serialize something like a HashMap, it might not save your data.

Serializing a Java object which might change later on

I need to serialize a Java object which might change later on; for example, some of its variables might be added or removed. What are the pitfalls of such an approach, and what precautions should I take if this remains the only way out?

You definitely need to add a serialVersionUID field right from the beginning.
Changes might make the serialized objects incompatible. Adding or removing fields can violate class contracts (up to the point of exceptions being thrown): when you deserialize an instance written by a class version that lacked a field the current version expects, that field is set to its type's default value, so the most likely problems are NullPointerExceptions. This can be averted by implementing readObject() and writeObject(). Other changes (such as changing a field's type) can cause deserialization to fail entirely.
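A minimal sketch of both precautions, using a hypothetical Customer class. The email field is imagined to have been added in a later class version, and a null email in the constructor stands in for a stream written before the field existed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Customer implements Serializable {
    // Fixed from the start so that compatible changes keep old streams readable.
    private static final long serialVersionUID = 1L;

    private final String name;
    private String email; // imagined to be added later; absent in old streams

    Customer(String name, String email) {
        this.name = name;
        this.email = email;
    }

    String getName() { return name; }
    String getEmail() { return email; }

    // Invoked by ObjectInputStream; restores the invariant "email is never null".
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (email == null) {
            email = "unknown";
        }
    }

    static Customer roundTrip(Customer original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Customer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }
}
```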

As Michael pointed out, Java provides some support for serialization with java.io.Serializable. The main problem with the Java support is that versioning is clunky and requires the user to deal with it.
Instead I would recommend something like Google's Protocol Buffers or Apache Thrift. For both, you define the object in a very simple language and they generate the serialization code for you. Both also handle the versioning for you, so you don't have to worry about whether you are reading an old or a new version of the object.
For example, say you have a type foo which has a field bar, and you write a bunch of foo objects to disk. Some time later you add a field baz to foo and write a few more foo objects to disk. When you read them back they will all be foo objects; it will seem as if the original foo objects simply never set their baz field.

I suppose the short answer would be that you will have to implement some sort of custom deserialization process that knows about the changes and deserializes older versions of an object correctly. You should also include the serialVersionUID field, which will keep track of your version and help you find out whether a serialized object is an old version. You can read more about this here

When you know that your serialized object will change in the future, you should create a new serialized object in another namespace instead of changing the existing one.
And adding a serialVersionUID like Michael described is also a ToDo.
