We have a whole bunch of serialized classes, and we want the database bits to be invalidated whenever the "signature" (i.e. the field structure and serialization code) of a class changes.
Is there a utility that can generate a "hash" for a class file that will optimally detect when the serialization structure for java.io.Serializable changes for that class?
There's really no way to "optimally detect when a serialization structure changes" for one rather important reason:
Serialization breaks encapsulation.
When you implement Serializable, all the private and package-private fields and members of a class become part of that class's exported API. From the moment that a class is published into the wild, its serialized form (which contains all of its implementation details) is a part of its contract. Changing the serialized form will have one of two consequences:
Backward compatibility. Because Serializable breaks encapsulation, the serialized form becomes part of its exported API. When an implementation detail changes, at the developer's discretion customized readObject() and writeObject() methods can be designed to continue to support the original serialized form (even if it would change as a result of the new implementation). This is desirable if the API is far flung and changing the serialized form would break many clients of the API. In this case, even though the new implementation would otherwise change the serialized form, the serialVersionUID needs to remain the same to continue to support the original serialized form.
Forced upgrade. If the implementation of a class changes and it is impossible or infeasible to support the original serialized form, changing the serialVersionUID will cause clients of the API to break, thereby forcing clients to be upgraded to use the new serialized form. This may be desirable in certain circumstances (but will force clients to upgrade their code).
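As a rough sketch of the backward-compatibility option: the hypothetical Temperature class below is assumed to have originally stored a double named celsius, and now stores kelvin internally. It pins the old serialized form with serialPersistentFields and custom writeObject()/readObject() methods, so the serialVersionUID can stay the same. The class and field names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class CompatibleFormDemo {

    /** Hypothetical class: it used to store Celsius and now stores Kelvin internally. */
    static class Temperature implements Serializable {
        // Unchanged, because the serialized form is unchanged.
        private static final long serialVersionUID = 1L;

        // The published serialized form: a single double named "celsius".
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("celsius", double.class)
        };

        private double kelvin; // new implementation detail, never written directly

        Temperature(double celsius) {
            this.kelvin = celsius + 273.15;
        }

        double celsius() {
            return kelvin - 273.15;
        }

        // Writes the original form, whatever the internal representation is.
        private void writeObject(ObjectOutputStream out) throws IOException {
            ObjectOutputStream.PutField fields = out.putFields();
            fields.put("celsius", celsius());
            out.writeFields();
        }

        // Reads the original form and converts it to the new representation.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            ObjectInputStream.GetField fields = in.readFields();
            this.kelvin = fields.get("celsius", 0.0) + 273.15;
        }
    }

    /** Serializes and deserializes in memory. */
    static Temperature roundTrip(Temperature value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Temperature) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Temperature(25.0)).celsius());
    }
}
```

The stream only ever contains celsius, so older clients that know nothing about kelvin should still be able to read it.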
It is worth mentioning that if you do not explicitly declare a static final serialVersionUID in your serializable class, the Java environment will automatically compute one for you by applying a complex procedure to the code (that takes into account fields and method signatures).
In short, the serialVersionUID should track with the serialized form that is used rather than the actual class implementation. If you want the serialVersionUID to change automatically whenever the class implementation changes, you can simply omit the explicit declaration of the serialVersionUID (but this may have other negative consequences). The decision to change the serialVersionUID needs to be made explicitly depending on how you want your API to behave when an implementation detail changes.
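For anyone who wants to look at the value the runtime uses, java.io.ObjectStreamClass exposes it. A minimal sketch follows, with hypothetical Point and Pinned classes. For a class without an explicit declaration this returns the automatically computed value, and for a class with one it returns the declared value.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SuidProbe {

    /** Hypothetical class without an explicit serialVersionUID. */
    static class Point implements Serializable {
        int x;
        int y;
    }

    /** Hypothetical class with an explicit serialVersionUID. */
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 42L;
        String name;
    }

    /** Returns the serialVersionUID the serialization runtime uses for the class. */
    static long suidOf(Class<?> cls) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(cls);
        if (descriptor == null) {
            // lookup() returns null for classes that are not serializable
            throw new IllegalArgumentException(cls.getName() + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Point  (computed): " + suidOf(Point.class));
        System.out.println("Pinned (declared): " + suidOf(Pinned.class));
    }
}
```

Storing the computed value next to the persisted data and comparing it at startup is one possible way to invalidate stale entries. It reacts to more than field changes, though, since method signatures also feed into the computation.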
The Java specification states that
The serialization runtime associates with each serializable class a
version number, called a serialVersionUID, which is used during
deserialization to verify that the sender and receiver of a serialized
object have loaded classes for that object that are compatible with
respect to serialization. If the receiver has loaded a class for the
object that has a different serialVersionUID than that of the
corresponding sender's class, then deserialization will result in an
InvalidClassException.
But if I assign the same serial version id to all the classes, as follows
static final long serialVersionUID = 1L;
Now all my classes will have the same serialVersionUID, and this will never result in an InvalidClassException.
I know that we give it explicitly so that its value remains constant across different JVM implementations.
Please let me know what is the use of putting the same id on all the classes, and what will happen if we modify a class between serialization and deserialization?
I believe the serialVersionUID is only used to determine the version of that class that has been serialized - the class name is always present too, so it won't lead to any ambiguity.
Please let me know what is the use of putting same id for all the classes
They're effectively unrelated values. It's just simple to use 1 and increment it every time you make a breaking change.
what will happen if we modify class in between serialization and deserialization ?
That entirely depends on the type of modification you make. If you're just adding or removing methods, or static fields, it's fine - that wouldn't affect the serialized data anyway. If you make any changes to instance fields, that's when life gets hairy. You'd need to study the serialization format in detail to work out exactly what would constitute a breaking change, but it's entirely possible that just changing the name of a field could break things - e.g. if fields are serialized in name order.
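To see which members actually take part in the default serialized form, you can ask ObjectStreamClass for the serializable fields. This is only a sketch with a hypothetical Account class. The string it builds could be hashed if you want a fingerprint that reacts to field changes and nothing else.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class SerializedShape {

    /** Hypothetical example class. */
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        static int instances;          // static: not part of the serialized form
        transient String cachedLabel;  // transient: not part of the serialized form
        int id;                        // serialized
        String owner;                  // serialized

        String label() {               // methods are never serialized
            return id + ":" + owner;
        }
    }

    /** Lists the fields that default serialization would write for the class. */
    static String shapeOf(Class<?> cls) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(cls);
        if (descriptor == null) {
            throw new IllegalArgumentException(cls.getName() + " is not serializable");
        }
        StringBuilder shape = new StringBuilder();
        for (ObjectStreamField field : descriptor.getFields()) {
            shape.append(field.getName()).append(':')
                 .append(field.getType().getName()).append(';');
        }
        return shape.toString();
    }

    public static void main(String[] args) {
        System.out.println(shapeOf(Account.class));
    }
}
```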
You might want to consider using a binary data format which plays a bit more pleasantly with data format changes, such as Protocol Buffers. (There are other benefits available too, such as portability and speed - and protobuf isn't the only game in town, either.)
There are 3 ways to define the serialVersionUID :
1. private static final long serialVersionUID = 1L; (Default)
2. private static final long serialVersionUID = -8940196742313994740L; (Generated)
3. Don't define serialVersionUID and let the JVM define it at runtime.
But I don't understand the first way!
I have already seen code where somebody defines "serialVersionUID = 1L" for all Java classes.
What is the meaning? Is that useful?
If all classes have the same serialVersionUID 1L, is there no problem?
What is the meaning? Is that useful?
Yes. The point of serialVersionUID is to give the programmer control over which versions of a class are considered incompatible in regard to serialization. As long as the serialVersionUID stays the same, the serialization mechanism will make a best effort to translate serialized instances, which may not be what you want. If you make a semantic change that renders older versions incompatible, you can change the serialVersionUID to make deserializing older instances fail.
If all classes have the same serialVersionUID 1L, is there no problem?
No - the serialVersionUID is per class.
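A small sketch to illustrate: two unrelated, hypothetical classes that both declare 1L are written to the same stream and read back. The class name travels in the stream next to the id, so each object comes back as its own type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SameUidDemo {

    /** Two hypothetical, unrelated classes sharing the same serialVersionUID. */
    static class Cat implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "Tom";
    }

    static class Invoice implements Serializable {
        private static final long serialVersionUID = 1L;
        int total = 99;
    }

    /** Writes all values to one in-memory stream and reads them back in order. */
    static Object[] roundTrip(Object... values) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                for (Object value : values) {
                    out.writeObject(value);
                }
            }
            Object[] restored = new Object[values.length];
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                for (int i = 0; i < restored.length; i++) {
                    restored[i] = in.readObject();
                }
            }
            return restored;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (Object restored : roundTrip(new Cat(), new Invoice())) {
            System.out.println(restored.getClass().getSimpleName());
        }
    }
}
```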
This is explained here:
The serialVersionUID is a universal version identifier for a Serializable class. Deserialization uses this number to ensure that a loaded class corresponds exactly to a serialized object. If no match is found, then an InvalidClassException is thrown.
From the javadoc:
The serialization runtime associates with each serializable class a version number, called a serialVersionUID, which is used during deserialization to verify that the sender and receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization. If the receiver has loaded a class for the object that has a different serialVersionUID than that of the corresponding sender's class, then deserialization will result in an InvalidClassException. A serializable class can declare its own serialVersionUID explicitly by declaring a field named "serialVersionUID" that must be static, final, and of type long:
ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;
The JVM will throw an InvalidClassException if the serialVersionUID of a serialized object does not match the serialVersionUID of the class it is being deserialized as.
The serialVersionUID of a class should change every time the class changes in an incompatible way. Normally this means every time you change the shape of a class (i.e. the set of serialized fields or their types change).
There are some cases where you don't want the serialVersionUID to change. For instance, you might accept old versions of the object into your application. In this case, you can leave the serialVersionUID the same, and new fields will come through with their default values (null, 0 or false).
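Two versions of one class cannot live side by side in a single program, so the sketch below cheats. It serializes a hypothetical PersonV1, rewrites the class name inside the byte stream to PersonV2 (same serialVersionUID, two extra fields) and reads the result back. That is only a stand-in for "old data, new class", not something to do in real code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EvolveDemo {

    /** Stand-in for the old version of the class. */
    static class PersonV1 implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        PersonV1(String name) { this.name = name; }
    }

    /** Stand-in for the new version: same serialVersionUID, two added fields. */
    static class PersonV2 implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int age = 18;             // initializer does not run during deserialization
        String email = "unknown"; // initializer does not run during deserialization
    }

    /** Serializes a PersonV1 and deserializes the bytes as a PersonV2. */
    static PersonV2 readOldStreamAsNewClass(String name) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(new PersonV1(name));
            }
            byte[] bytes = buffer.toByteArray();
            renameClass(bytes, PersonV1.class.getName(), PersonV2.class.getName());
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return (PersonV2) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Overwrites the first occurrence of one class name with another of equal length. */
    static void renameClass(byte[] stream, String from, String to) {
        byte[] oldName = from.getBytes(StandardCharsets.UTF_8);
        byte[] newName = to.getBytes(StandardCharsets.UTF_8);
        if (oldName.length != newName.length) {
            throw new IllegalArgumentException("class names must have equal length");
        }
        for (int i = 0; i + oldName.length <= stream.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(stream, i, i + oldName.length), oldName)) {
                System.arraycopy(newName, 0, stream, i, newName.length);
                return;
            }
        }
        throw new IllegalArgumentException("class name not found in stream");
    }

    public static void main(String[] args) {
        PersonV2 person = readOldStreamAsNewClass("Ada");
        System.out.println(person.name + ", age=" + person.age + ", email=" + person.email);
    }
}
```

Note that the added fields get the JVM defaults, not the values from their field initializers, because no constructor of the serializable class runs.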
Yes, I've seen code that defined serialVersionUID like that too. I think it is a bad idea.
In this context there is only one "distinguished" choice, and that is to declare no serialVersionUID field at all. (0L is not special: a declared 0L is an ordinary version id, just like 1L.) Omitting the field means "compute the version id at runtime by applying the standard algorithm" based on the actual class that you are serializing / deserializing. That means that whenever your code's effective serialization signature changes, the serialization / deserialization code will use a different version id. From a (big picture) type safety perspective, this is the safest thing to do, though it is also somewhat inefficient, and potentially more fragile.
What is the meaning?
The 1L has no special meaning. It is just a number that will match 1L as the "other" version id.
To my mind, you are better off either omitting the field, or using a version number that was (at some point) generated using the standard algorithm.
If you omit the field, then you get definite deserialization exceptions if classes change in ways that could be a source of problems. If you need this, it is a good thing.
On the other hand, if you use a generated version id, you (the programmer) can make your own decision about when to regenerate the id. And when you do decide to regenerate, the id will only change if the class signature has changed. (If the class's representation etc. hasn't changed, the regenerated id should be identical to the original one!) And when the id does change, you can think about whether to add custom methods ('readObject', etc) to deal with the incompatibility.
However, if you use 1L, you can't tell if the version id needs to change without checking your code history, and comparing the old / new versions of the classes ... back as far as you need to.
Is that useful?
It depends on what you consider "useful" to mean. If you think it is a good thing to hard wire the version id to "trust me, it is ok", then 1L is useful.
My recollection is that some versions of Eclipse offer 1L as one of the possible auto-corrections for a missing serialVersionUID field warning. That is probably where the examples you see have come from.
Imagine you write a class with a serialVersionUID, instantiate it, then serialize it to a file (with ObjectOutputStream).
Then you modify the class.
Then, with the modified class, you deserialize (read in) the version you serialized before modification. Java will check the serialVersionUID of the current class and the serialized class, and if they don't match, the deserialization will fail. This is a deliberate fail-fast to prevent much more subtle (and harder to debug) errors occurring later on due to class version incompatibilities.
If you omit serialVersionUID, the runtime computes a default one from the structure of the class, so almost any change to the class will make the check fail.
If you always set it to 1L, then the check will always pass (the value is always the same), so you are still vulnerable to subtle class version incompatibility problems.
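The fail-fast check can be provoked without actually editing a class. The sketch below serializes a hypothetical Settings object and flips one bit of the serialVersionUID stored in the stream, which stands in for "the class was changed and given a new id". The byte offsets assume the standard stream layout for a single top-level object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class UidMismatchDemo {

    /** Hypothetical example class. */
    static class Settings implements Serializable {
        private static final long serialVersionUID = 1L;
        int volume = 7;
    }

    static byte[] serialize(Serializable value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a copy of the stream whose recorded serialVersionUID differs by one bit. */
    static byte[] withDifferentUid(byte[] stream) {
        byte[] copy = stream.clone();
        // Assumed layout: 4-byte header, TC_OBJECT, TC_CLASSDESC,
        // 2-byte class name length, class name, 8-byte serialVersionUID.
        int nameLength = ((copy[6] & 0xFF) << 8) | (copy[7] & 0xFF);
        int lastUidByte = 8 + nameLength + 7;
        copy[lastUidByte] ^= 1;
        return copy;
    }

    /** Returns the simple class name read, or "InvalidClassException" if the check failed. */
    static String tryRead(byte[] stream) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            return in.readObject().getClass().getSimpleName();
        } catch (InvalidClassException e) {
            return "InvalidClassException";
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = serialize(new Settings());
        System.out.println(tryRead(original));
        System.out.println(tryRead(withDifferentUid(original)));
    }
}
```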
The value is used while serializing an object and de-serializing the object back into JVM.
Further, if your class changes and you don't want to support backward compatibility (i.e. being able to de-serialize an object that was serialized using the previous version of the class), you can change the version number to any other value.
However, to support backward compatibility you need to keep the previously set value of serialVersionUID.
The best practice is to change the serialVersionUID every time you make an incompatible change to the class.
It's important to make clear the fact that having a class implement the Serializable interface makes ALL fields not declared transient part of the exported API of the class, whether or not those fields are declared private.
In other words, implementing Serializable:
breaks encapsulation. If the class has any chance to become a successful, long-lived class then you must support the serialized form ... forever.
can seriously impair your ability to evolve that class, precisely because it is a part of its exported API. The alternative is to break backward compatibility.
can create security problems for your class and its application. Deserialization is a way of creating Java objects without a constructor, so it's possible to violate a class's invariants by providing rogue byte streams to the deserialization facility.
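The last point is easy to reproduce. In this sketch (hypothetical NaiveAge and CheckedAge classes), the constructor rejects negative values, but a tampered byte stream gets one past NaiveAge anyway. CheckedAge repeats the check in readObject() and rejects the stream. The forging step assumes the int field value occupies the last four bytes of the stream, which should hold for a single-field class like this one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class InvariantDemo {

    /** Invariant (years >= 0) enforced in the constructor only. */
    static class NaiveAge implements Serializable {
        private static final long serialVersionUID = 1L;
        final int years;
        NaiveAge(int years) {
            if (years < 0) throw new IllegalArgumentException("negative age");
            this.years = years;
        }
    }

    /** Same invariant, re-checked when an instance is deserialized. */
    static class CheckedAge implements Serializable {
        private static final long serialVersionUID = 1L;
        final int years;
        CheckedAge(int years) {
            if (years < 0) throw new IllegalArgumentException("negative age");
            this.years = years;
        }
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            if (years < 0) throw new InvalidObjectException("negative age");
        }
    }

    /** Serializes the value, then overwrites the trailing int field with -1. */
    static byte[] forge(Serializable value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            byte[] bytes = buffer.toByteArray();
            for (int i = bytes.length - 4; i < bytes.length; i++) {
                bytes[i] = (byte) 0xFF;
            }
            return bytes;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object read(byte[] stream) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            return in.readObject();
        }
    }

    /** Reads a forged NaiveAge and returns the value that slipped through. */
    static int forgedNaiveYears() {
        try {
            return ((NaiveAge) read(forge(new NaiveAge(5)))).years;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if CheckedAge refuses the forged stream. */
    static boolean checkedRejectsForgery() {
        try {
            read(forge(new CheckedAge(5)));
            return false;
        } catch (InvalidObjectException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("NaiveAge after forgery: " + forgedNaiveYears());
        System.out.println("CheckedAge rejects forgery: " + checkedRejectsForgery());
    }
}
```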
The serialVersionUID should be thought of as a property of the serialized form. It is meant to convey to one JVM whether or not there is a difference between the serialized form of a class instance that it is receiving and the serialized form of that same class rendered (maybe) somewhere else.
You can see the potential problems that may occur if the serialized forms are different but the UIDs are the same. The receiving JVM will assume that the serialized forms of the old class and the new one are the same when they aren't, and will dutifully go ahead and attempt to deserialize the byte stream.
TLDR: You shouldn't change the UID when you feel like it. You should change it when the serialized form of the class changes so that versions of software that use older versions of your class (with the different serialized form) will break instead of (possibly silently) doing the wrong thing. Not designing a good serialized form for your classes will make it harder (even much harder) to provide backward compatibility for their clients. In the ideal case, the serialized form for a class persists throughout its entire evolution (and so its UID need never change).
You can assign any long value to serialVersionUID, but you have to change it every time you modify your class in a way that makes previously serialized instances incompatible.
The second looks like a generated serialVersionUID, based on the features of the current class version.
I can only persist objects to the DB if they implement Serializable, even if I don't add a private static final long serialVersionUID = 1L; or similar.
Question: is it mandatory to set this serialVersionUID? What are the drawbacks if I don't?
You don't need a serialVersionUID to make an object serializable. It's only needed when you need to be able to read objects serialized using an old version of the class, or to maintain the serialization format when you make some minor changes (like the ordering of the fields, for example).
I wrote a blog post (in French sorry, but google translate could help) about that.
First, what is serialVersionUID and is it mandatory?
The serialVersionUID is used as a version control in a Serializable class. If you do not explicitly declare a serialVersionUID, the JVM will do it for you automatically, based on various aspects of your Serializable class, as described in the Java(TM) Object Serialization Specification.
Drawbacks if I don't?
The default serialVersionUID computation is highly sensitive to class details that may vary between compiler implementations, and can thus result in unexpected InvalidClassExceptions during deserialization.
So to avoid this, it is better to specify a serialVersionUID explicitly.
I think it's not mandatory. Eclipse shows the same warning when I create a class that extends JFrame. I always add @SuppressWarnings("serial").
On a project, we have several serialized objects. It will be necessary to use these objects on machines with different JVMs (possibly different versions).
The serialVersionUIDs of our own objects are fixed and won't change, but we are concerned about the serialVersionUIDs of the standard JDK classes, for instance ArrayList/HashSet, that are used in our serialized objects.
So the question is, can these serialVersionUIDs change between different versions of a JVM or between different JVMs?
Or do we have to use another serialization mechanism to support different JVMs?
The serialVersionUID should only be changed if there is a change to the class that would not be compatible with previously serialized versions of it.
To see what changes would potentially break compatibility, check the specification.
I highly doubt that a new version of Java would introduce any changes to core classes that would break compatibility.
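If you want to check this on the JVMs you actually deploy to, a few lines are enough. As far as I know the standard collection classes declare their serialVersionUID explicitly, so the values printed by this sketch should come out identical on every runtime you compare.

```java
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.HashSet;

public class JdkUidDemo {

    /** Returns the serialVersionUID the running JVM uses for the given serializable class. */
    static long uidOf(Class<?> cls) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(cls);
        if (descriptor == null) {
            throw new IllegalArgumentException(cls.getName() + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Run this on each target JVM and compare the output.
        System.out.println(System.getProperty("java.version"));
        System.out.println("ArrayList: " + uidOf(ArrayList.class));
        System.out.println("HashSet:   " + uidOf(HashSet.class));
    }
}
```

Matching ids only say that the classes consider themselves compatible. They do not replace an actual round-trip test between the JVMs in question.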
We use serialVersionUID as a version code for the class, and we should change this field when we modify the class in an incompatible way. During deserialization, this field is used to check that the serialized data and the loaded class are the same version.
For example, you serialize an object of class A and save it in a binary file; you can deserialize the file to the original object later. But if you add a field to A and do not change the serialVersionUID, the deserialization may return a malformed object. And if you change the serialVersionUID, the deserialization will reject the input and throw an exception. An exception is better than an unknown error.
These errors/exceptions happen if and only if you use an old serialization result to create an instance of a modified class. If you don't use serialization for data persistence, there won't be any problems.