I'm analyzing a Java SE 7 project with SonarQube version 5.1.
It reported squid:S1948 on the code below.
Fields in a "Serializable" class should either be transient or serializable
Fields in a Serializable class must themselves be either Serializable or transient even if the class is never explicitly serialized or deserialized. That's because under load, most J2EE application frameworks flush objects to disk, and an allegedly Serializable object with non-transient, non-serializable data members could cause program crashes, and open the door to attackers.
enum ShutterSpeed {
private final Rational value; // Make "value" transient or serializable.
...
}
I think that enum fields are never serialized as of J2SE 5.0 (see Serialization of Enum Constants).
Is this a false-positive?
Whole code and issue are here.
It is actually a false-positive. The Serialization of Enum Constants (which you've provided a link to) says that:
Enum constants are serialized differently than ordinary serializable
or externalizable objects. The serialized form of an enum constant
consists solely of its name; field values of the constant are not
present in the form.
As I see it, it doesn't make sense to mark Enum's field values as transient or make them implement Serializable, since they'll never get serialized, no matter if they're marked as transient or implement Serializable.
If that analyzing tool forces you to do one of these two things, then you'll be writing useless code. If I were you, I'd try to disable that warning for enums.
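To make the point concrete, here is a minimal sketch, assuming a made-up Rational class that is deliberately not Serializable (the real Rational from the question is not shown, so this one is only a stand-in). The enum constant still round-trips through an ObjectOutputStream, because only its name is written:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class EnumFieldDemo {
    // Hypothetical stand-in for the question's Rational: NOT Serializable on purpose.
    static final class Rational {
        final int numerator;
        final int denominator;

        Rational(int numerator, int denominator) {
            this.numerator = numerator;
            this.denominator = denominator;
        }
    }

    enum ShutterSpeed {
        FAST(new Rational(1, 1000)),
        SLOW(new Rational(1, 30));

        private final Rational value; // neither transient nor Serializable

        ShutterSpeed(Rational value) {
            this.value = value;
        }

        Rational value() {
            return value;
        }
    }

    // Writes the constant to an in-memory stream and reads it back.
    static ShutterSpeed roundTrip(ShutterSpeed speed) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(speed);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ShutterSpeed) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ShutterSpeed copy = roundTrip(ShutterSpeed.FAST);
        // No NotSerializableException, and the very same constant comes back.
        System.out.println(copy == ShutterSpeed.FAST); // prints "true"
    }
}
```

The deserialized value is resolved by name to the existing constant, so its value field is whatever the running JVM initialized it with, not something read from the stream.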
As said, it's a false positive, so you can suppress the warning:
@SuppressWarnings("squid:S1948")
I would simply mark the field as transient.
Related
I have a class that is serialised. Now I need to add a new variable into the class, with setter and getter methods. This class is sent over wire in RMI.
Without changing the UID, can I add new parameters and getter and setter methods for it? I tried to write an example class that is sent over wire, and did not change the UID, and added new parameters and getter and setter methods for it. On the other end, I tested it and I still got the values properly. I had assumed, if I add new parameters, getter and setter methods, I need to change the UID. Am I wrong?
If you hard-code the serialVersionUID of a class (to 1L, usually), store some instances, and then redefine the class, you basically get this behavior (which is more or less common sense):
New fields (present in class definition, not present in the serialized instance) are assigned a default value, which is null for objects, or the same value as an uninitialized field for primitives.
Removed fields (not present in class definition but present in the serialized instance) are simply ignored.
So the general rule of thumb is, if you simply add fields and methods, and don't change any of the existing stuff, AND if you're OK with default values for these new fields, you're generally OK.
Wow, a lot of bad information.
Java serialization is very robust. There is a well-defined set of rules governing backwards compatibility of objects with the same UID and different data. The basic idea is that as long as you don't change the type of an existing member, you can keep the same UID without data issues.
That said, your code still needs to be smart about handling classes with potentially missing data. The object may deserialize correctly, but there may not be data in certain fields (e.g. if you added a field to the class and are deserializing an old version of the class). If your code can handle this, then you can probably keep the current UID. If not, then you should probably change it.
In addition to the predefined rules, there are advanced usage scenarios where you could even change the type of existing fields and still manage to deserialize the data, but that is generally only necessary in extreme situations.
Java serialization is very well documented online; you should be able to find all this information in the relevant Sun/Oracle tutorials and docs.
This only matters if you let Java generate a default UID for your class. It uses the actual members and methods of the class to generate it, so the UID becomes invalid once you change the class structure. If you provide a UID for your class, then this only matters if you need to deserialize older versions of your class, from a file for example.
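A small sketch of the difference, using the standard ObjectStreamClass.lookup(...).getSerialVersionUID() call. The Pinned and Generated classes are invented for illustration:

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionUidDemo {
    // Explicit UID: stays 1L no matter how many fields or methods are added later.
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int addedLater; // imagine this field arrived in a second release
    }

    // No explicit UID: the runtime computes one from the class structure.
    static class Generated implements Serializable {
        String name;
    }

    // Returns the UID the serialization runtime will use for this class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Pinned.class));    // prints "1"
        System.out.println(uidOf(Generated.class)); // a computed hash; changes if the class changes
    }
}
```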
I want to highlight the changes that affect serialization compatibility.
The link to the Oracle Java docs, with more details, is below.
Incompatible Changes
Incompatible changes to classes are those changes for which the guarantee of interoperability cannot be maintained. The incompatible changes that may occur while evolving a class are:
Deleting fields
Moving classes up or down the hierarchy
Changing a nonstatic field to static or a nontransient field to transient
Changing the declared type of a primitive field
Changing the writeObject or readObject method so that it no longer writes or reads the default field data or changing it so that it attempts to write it or read it when the previous version did not.
Changing a class from Serializable to Externalizable or vice versa.
Changing a class from a non-enum type to an enum type or vice versa.
Removing either Serializable or Externalizable.
Adding the writeReplace or readResolve method to a class, if the behavior would produce an object that is incompatible with any older version of the class.
Link from where the above information is taken
http://docs.oracle.com/javase/7/docs/platform/serialization/spec/version.html#6678
As for the subject title: Why is it legal to declare a transient variable in a non serializable class?
What would the use be?
The transient modifier can be seen by code other than the serialization mechanism, and is used by some object databases to mark a data field as not persistent. Aside from that, there isn't any harm in allowing this.
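As a rough sketch of how non-serialization code can see the modifier, reflection exposes it through Modifier.isTransient. The Session class and the persistentFieldNames helper are made up for illustration; a real object database would do something more involved:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class TransientReflectionDemo {
    // Note: this class does NOT implement Serializable.
    static class Session {
        String user;
        transient String cachedToken; // marked "do not persist"
    }

    // Collects the instance fields a hypothetical persistence layer would store.
    static List<String> persistentFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (field.isSynthetic() || Modifier.isStatic(mods) || Modifier.isTransient(mods)) {
                continue; // skip compiler-generated, static, and transient fields
            }
            names.add(field.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(persistentFieldNames(Session.class)); // prints "[user]"
    }
}
```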
Because other serialization mechanisms that don't require Serializable can make use of it too.
How about if a subclass implements Serializable?
In any case, it is impossible for the compiler to enforce this rule, i.e. emit a compile error
based on class hierarchy (except - of course - superclass defined methods).
If the Serializable interface is just a Marker-Interface that is used for passing some-sort of meta-data about classes in java - I'm a bit confused:
After reading the process of java's serialization algorithm (metadata bottom-to-top, then actual instance data top-to-bottom), I can't really understand what data cannot be processed through that algorithm.
In short and formal:
What data may cause the NotSerializableException?
How should I know that I am not supposed to add the implements Serializable clause for my class?
First of all, if you don't plan to ever serialize an instance of your class, there is no need to even think about serializing it. Only implement what you need, and don't try to make your class serializable just for the sake of it.
If your object has a reference (transitive or direct) to any non-serializable object, and this reference is not marked with the transient keyword, then your object won't be serializable.
Generally, it makes no sense to serialize objects that can't be reused when deserialized later or somewhere else. This could be because the state of the object is only meaningful here and now (if it has a reference to a running thread, for example), or because it uses some resource like a socket, a database connection, or something like that. A whole lot of objects don't represent data, and shouldn't be serializable.
NotSerializableException is thrown when you try to serialize an object whose class has not been marked as Serializable, that's all. Extending a non-serializable class and adding the Serializable interface is perfectly fine.
There is no data that can't be serialized.
Anything your Serializable class has in it that is not Serializable will throw this exception. You can avoid it by using the transient keyword.
Common examples of things you can't serialize include Swing components and Threads. If you think about it it makes sense because you could never deserialize them and have it make sense.
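A minimal sketch of both outcomes, assuming two invented classes (Broken and Fixed) that each hold a Thread:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NotSerializableDemo {
    static class Broken implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "job";
        Thread worker = new Thread(); // Thread is not Serializable
    }

    static class Fixed implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "job";
        transient Thread worker = new Thread(); // skipped by serialization
    }

    // Returns false if writing the object fails with NotSerializableException.
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Broken())); // prints "false"
        System.out.println(canSerialize(new Fixed()));  // prints "true"
    }
}
```

After deserializing a Fixed instance, worker would be null, so the class has to recreate it if it needs one.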
All the primitive data types, and all classes that implement Serializable either directly,
class MyClass implements Serializable {
}
or indirectly,
class MyClass extends SomeClass {
}
where SomeClass implements Serializable,
can be serialized. All the fields of a serializable class get serialized except the fields that are marked transient. If a serializable class contains a field that is not serializable (not a primitive and not an implementation of the Serializable interface), then NotSerializableException will be thrown.
Answer to the second question: as @JB Nizet said, mark a class Serializable only if you are going to write its instances to some stream; otherwise never mark it Serializable.
You need to handle the serialization of your own Objects.
Java will handle the primitive data types for you.
More info: http://www.tutorialspoint.com/java/java_serialization.htm
After reading the process of java's serialization algorithm (metadata bottom-to-top, then actual instance data top-to-bottom), I can't really understand what data cannot be processed through that algorithm.
The answer to this is certain system-level classes such as Thread, OutputStream and its subclasses which are not serializable. Explained very well on the oracle documents: http://www.oracle.com/technetwork/articles/java/javaserial-1536170.html
Below is an excerpt:
On the other hand, certain system-level classes such as Thread, OutputStream and its subclasses, and Socket are not serializable. Indeed, it would not make any sense if they were. For example, thread running in my JVM would be using my system's memory. Persisting it and trying to run it in your JVM would make no sense at all.
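NotSerializableException is thrown when something inside your serializable class is not itself marked as serializable. One such case can be:
class Super {}
class Sub implements Serializable
{
Super parent; // "super" is a reserved word, so the field needs another name
}
Here Super is not declared Serializable, so serializing a Sub whose parent field is non-null will throw NotSerializableException.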
More practically, no object can be serialized (via Java's built-in
mechanism) unless its class implements the Serializable interface.
Being an instance of such a class is not a sufficient condition,
however: for an object to be successfully serialized, it must also be
true that all non-transient references it holds must be null or refer to
serializable objects. (Do note that that is a recursive condition.)
Primitive values, nulls, and transient variables aren't a problem.
Static variables do not belong to individual objects, so they don't
present a problem either.
Some common classes are reliably serialization-safe. Strings are
probably most notable here, but all the wrapper classes for primitive
types are also safe. Arrays of primitives are reliably serializable.
Arrays of reference types can be serialized if all their elements can be
serialized.
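The recursive condition above can be checked with a small sketch. Node and canSerialize are names invented for this example; the point is that the runtime value of each reference decides the outcome, not the declared field type:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RecursiveConditionDemo {
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object payload; // declared as Object; what it refers to at runtime matters

        Node(Object payload) {
            this.payload = payload;
        }
    }

    // Returns false if writing the object fails with NotSerializableException.
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Node(null)));                  // true: null is fine
        System.out.println(canSerialize(new Node("text")));                // true: String
        System.out.println(canSerialize(new Node(new int[] {1, 2, 3})));   // true: primitive array
        System.out.println(canSerialize(new Node(new Object[] {"a", 1}))); // true: all elements serializable
        System.out.println(canSerialize(new Node(new Object[] {"a", new Object()}))); // false
    }
}
```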
What data may cause the NotSerializableException?
In Java, we serialize objects (instances of a class that implements the Serializable interface). So if a class has not implemented the Serializable interface, its instances cannot be serialized (in that case NotSerializableException will be thrown).
The Serializable interface is merely a marker interface. In a way, it is just a stamp on a class that tells the JVM the class can be serialized.
How should I know that I am not supposed to add the implements
Serializable clause for my class?
It all depends on your need.
If you want to store the object in a database, you can
serialize it to a sequence of bytes and store it in the
database as persistent data.
You can serialize your object so that it can be used by another JVM
running on a different machine.
Say I have an ArrayList<B> array of objects from a certain class B, which extends A. B has an instance field bb and A a field aa. I know that saving array to a .dat-file using ObjectOutputStream requires that B (not just ArrayList!) implement Serializable. I've found, however, that when loading the object back from the file (using an ObjectInputStream):
arrayLoaded = (ArrayList<B>)myObjIn.readObject();
the loaded array isn't identical to the original array: In the particular case, arrayLoaded.get(0).bb has the same value as in array, but arrayLoaded.get(0).aa is "zeroed". It has a default initialization value, regardless of its value when array was saved to file. However, this problem is solved by letting also A implement Serializable.
What bothers me is that this error is so subtle: no exception, no warning (in eclipse), nothing. Is there a reason for this or is this simply an oversight by the java developers? Do I just have to accept it and think hard about which classes in the hierarchy implement Serializable every time I want to use object IO streams?
Just because B implements Serializable, that does not retroactively include the fields of the non-serializable superclass in what gets serialized. (This makes sense, especially when you consider that being able to serialize private and package-private fields of any class just by extending it and implementing Serializable would violate its encapsulation.)
A field declared in A will behave the same as a field declared as transient in B. There is a workaround however. From the documentation for Serializable:
To allow subtypes of non-serializable classes to be serialized, the
subtype may assume responsibility for saving and restoring the state
of the supertype's public, protected, and (if accessible) package
fields. The subtype may assume this responsibility only if the class
it extends has an accessible no-arg constructor to initialize the
class's state.
So you will need to implement writeObject and readObject in B to handle the serialization/deserialization of A.aa.
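A minimal sketch of that workaround, reusing the names A, B, aa, and bb from the question. The int field types are an assumption, since the question does not give them:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SuperclassStateDemo {
    static class A {          // not Serializable
        int aa;               // assumed to be an int for this sketch

        A() { }               // accessible no-arg constructor, run during deserialization
    }

    static class B extends A implements Serializable {
        private static final long serialVersionUID = 1L;
        int bb;

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes B's own fields (bb)
            out.writeInt(aa);         // B takes responsibility for A's state
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();   // restores bb
            aa = in.readInt();        // restores the inherited field, in the same order
        }
    }

    // Serializes to memory and reads the object back.
    static B roundTrip(B original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (B) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        B b = new B();
        b.aa = 7;
        b.bb = 9;
        B copy = roundTrip(b);
        System.out.println(copy.aa + " " + copy.bb); // prints "7 9"
    }
}
```

Without the two private methods, copy.aa would come back as 0, because A's no-arg constructor is the only thing that initializes it.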
What bothers me is that this error is so subtle: no exception, no warning (in eclipse), nothing. Is there a reason for this or is this simply an oversight by the java developers?
It is by design. (See @Paul Bellora's answer). The alternatives would be to:
Make it illegal to declare a class Serializable unless its superclass is Serializable. That's obviously unworkable.
Automatically serialize the superclass's fields, which breaks if they shouldn't or can't be serialized. (Note that we can't rely on transient here, because if the designer of the superclass didn't intend it to be serializable, he/she won't have labelled the fields.)
Do I just have to accept it and think hard about which classes in the hierarchy implement Serializable every time I want to use object IO streams?
Basically, yes. In particular, you need to think hard when you write a Serializable subclass of an existing non-Serializable class.
I guess it is possible to write FindBugs / PMD / etc rules to flag this particular usage as potentially problematic.
To make a class serializable we do the following:
class A implements Serializable {
transient Object a;
}
And not this:
serializable class A {
transient Object a;
}
Why do we implement a special interface to make a class serializable, but use the keyword transient to exclude some fields?
Why aren't special keywords used in both cases? Was there any reason to do the same kind of thing in two different ways? I know there is no such keyword as serializable, but why wasn't it introduced instead of the special interface Serializable?
Why isn't some special keyword used to
mark classes as serializable too?
The Serializable interface looks like a
magic number in code and not like a
language feature.
I think you have to look at it the other way: language keywords exist mainly to support compile-time language constructs. Serialization is a runtime mechanism. Additionally, you don't want to have an extra keyword for everything, because you then can't use it as an identifier. A marker interface on the other hand is much less intrusive.
The question is thus: why do we need a language keyword to mark transient fields? And the answer is that there simply was no other way to mark specific fields at that time.
Nowadays, one would use annotations for this purpose in both cases (and for other things like the obscure strictfp keyword as well).
Serializable is a marker interface. Interfaces are a standard way (in Java and in some other languages) of indicating features of a class; an "is a" relationship. Making Serializable an interface means we can declare methods that accept or return Serializable objects, just as we can with other interfaces. Anything else would have required syntax changes to the language (at the time; now we have annotations, but I think an interface would still be used).
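For instance, a helper can use Serializable as a type bound, which a keyword on the class could not offer. This is only a sketch; deepCopy is a made-up utility, not part of the JDK:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

class SerializableParameterDemo {
    // The bound lets the compiler reject arguments whose static type is not Serializable.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>(Arrays.asList("a", "b"));
        ArrayList<String> copy = deepCopy(names);
        System.out.println(copy.equals(names)); // prints "true"
        System.out.println(copy == names);      // prints "false": a distinct object
        // deepCopy(new Object()) would not compile: Object is not Serializable.
    }
}
```

The check is only on the static type, so an ArrayList holding non-serializable elements would still fail at runtime.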
Serializable is a marker interface (like Cloneable) that is used to set a flag for standard Java runtime library code that an object can be serialised according to the designer of that class.
The transient keyword can be used to specify that an attribute does not need to be serialised, for instance because it is a derived attribute.
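A derived attribute could be handled roughly like this. Rectangle and its area field are invented for illustration; the private readObject hook is the standard way to recompute transient state:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DerivedFieldDemo {
    static class Rectangle implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int width;
        private final int height;
        private transient int area; // derived from width and height, so not stored

        Rectangle(int width, int height) {
            this.width = width;
            this.height = height;
            this.area = width * height;
        }

        int area() {
            return area;
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();  // restores width and height
            area = width * height;   // recompute the derived value
        }
    }

    // Serializes to memory and reads the object back.
    static Rectangle roundTrip(Rectangle original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Rectangle) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Rectangle(3, 4)).area()); // prints "12"
    }
}
```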
See also this reply to a similar question on SO and this one about designing marker interfaces.
Update
Why marker interfaces and not keywords for things like serializable, cloneable, etc.? My guess is that marker interfaces let the Java runtime library be extended consistently, whereas the language would end up with too many keywords if every behavioural aspect made it in.
Class attributes cannot implement interfaces, and transient can be seen as a generic property of an attribute, so it makes sense that transient was introduced as a language keyword.
So you're asking why you can't mark a class as not serializable (like a transient member)? Why wouldn't you just not mark class members of the not-to-serialize type as transient? Or use a serialization delegate for that class type when you do the serialization? It seems a little weird that you would want to tell Java to not do something at this level instead of telling it to do something.
The transient keyword is used to keep a variable or field from being stored; we do this to protect sensitive information that we don't want distributed everywhere. We use the Serializable interface to make a class serializable. We could also use the Externalizable interface, but Serializable is usually preferred because of some advantages.
Go through this to clearly understand serialization and the transient keyword.
http://www.codingeek.com/java/io/object-streams-serialization-deserialization-java-example-serializable-interface/