Let's say we have three classes:
class foo { // not a singleton
    String s;
    foo(String s) { this.s = s; }
}
class bar {
    foo f;
    int i;
    bar(foo f, int i) { this.f = f; this.i = i; }
}
class baz {
    foo sameF;
    baz(foo sameF) { this.sameF = sameF; }
}
Now we create instances
foo onlyFoo = new foo("the only one");
bar theBar = new bar(onlyFoo, 123);
baz theBaz = new baz(onlyFoo);
After that, we want to store them serialized in a file.
If we deserialize theBaz, modify its onlyFoo, and serialize theBaz to the file again, theBar still contains the original onlyFoo, so there are now two different versions of onlyFoo.
What I want instead is to store theBaz and theBar without onlyFoo, i.e., store the three objects separately. When someone deserializes theBaz, I want to hand them onlyFoo as well. If they serialize a changed onlyFoo again, theBaz and theBar will both see the same modified onlyFoo. So when someone requests an object (for example theBar), they get the full object with all referenced objects (onlyFoo), just like the normal serialization process would return.
I know that I have to store the objects and keep track of the references manually and separately, because default serialization cannot handle this. What I don't know is: how do I partially serialize/deserialize Java objects? I need to separate the primitives and their wrappers from the 'higher' objects and store those objects separately.
Update
I cannot modify the classes.
I don't know all the classes in advance. It should work with any serializable object, even ones I have never heard of (they may or may not be final).
If you want more control, you can override writeObject() and readObject()
and do the serialization yourself.
class bar {
...
private void writeObject(ObjectOutputStream stream) throws IOException {
// write a version number first (e.g. 1), so you can evolve the format later
stream.writeInt(version);
stream.writeInt(i);
// leave out the shared reference:
// stream.writeObject(f);
}
}
// readObject() is the analog; see
// http://docs.oracle.com/javase/6/docs/platform/serialization/spec/output.html#861
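A minimal self-contained sketch of the pair, with readObject() reading back exactly what writeObject() wrote (the field names mirror the question; the version handling is just a placeholder):
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class bar implements Serializable {
    foo f;          // intentionally not part of the custom stream format
    int i;
    private static final int version = 1;

    private void writeObject(ObjectOutputStream stream) throws IOException {
        stream.writeInt(version);
        stream.writeInt(i);
        // f is left out on purpose
    }

    private void readObject(ObjectInputStream stream) throws IOException {
        int streamVersion = stream.readInt(); // branch on this once the format changes
        i = stream.readInt();
        // f was never written, so it stays null here; whoever deserializes
        // bar is responsible for re-attaching the shared foo
    }
}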
Mark references you don't want serialized with the transient keyword.
You can make foo transient as below:
class bar {
transient foo f;
int i;
}
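Note that a transient field comes back as null after deserialization, so the code doing the reading has to re-attach the shared instance. A hypothetical sketch (readBar() and readFoo() stand for however you load the separately stored objects):
// readBar()/readFoo() are hypothetical helpers for loading the separately stored objects.
bar theBar = readBar();   // theBar.f is null here, because f is transient
foo onlyFoo = readFoo();  // the separately stored, shared instance
theBar.f = onlyFoo;       // re-attach the shared reference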
Related
I'm trying to write a class that takes a mega object and ensures that only certain fields have been changed. Normally you would handle this with annotations or validation on the object itself, but that is unfortunately not an option here: the only thing I can change is the one class I am working on, which receives the very large (and very nested!) object that I'm supposed to somehow validate.
My initial thought was to make a 'whitelist' of things that are allowed to change, then iterate over all properties in the object and check whether anything not on the whitelist has been updated. I have the old version of the object, so I can check each field against the old one, but I'm not entirely sure how to do this, or whether there is a better solution. I've never tried something like this before.
Any suggestions are appreciated. If there isn't a better solution, how should I create the whitelist and iterate over all properties (and nested properties) of the mega object?
UPDATE:
Based on the suggestions, here is what I'm trying out. It still has a few problems, though (please note I'm just throwing things around; this is by no means my final class or good programming yet):
isTraversable works on collections, but I'm not sure how to check custom classes, e.g. a Person class, which would still need to be iterated through.
There are cyclic references all over the place; I'm not sure how to handle those either.
import java.lang.reflect.Field;
import java.util.Collection;
import java.util.List;
import java.util.Objects;

public class Test {

    private Object obj1;
    private Object obj2;
    private List<String> whitelist;

    public void validate(Object objectToTraverse,
                         Object objectToCompareTo,
                         List<String> whitelist) {
        this.obj1 = objectToTraverse;
        this.obj2 = objectToCompareTo;
        this.whitelist = whitelist;
        traverseAndCompare(obj1, obj2);
    }

    private void traverseAndCompare(Object objectToTraverse,
                                    Object objectToCompareTo) {
        try {
            for (Field field : objectToTraverse.getClass()
                                               .getDeclaredFields()) {
                field.setAccessible(true);
                if (isTraversable(field)) {
                    traverseAndCompare(field.get(objectToTraverse),
                                       field.get(objectToCompareTo));
                } else {
                    // compare the field on the current pair of objects,
                    // not on the top-level obj1/obj2
                    getFieldValuesAndCompare(field,
                                             objectToTraverse,
                                             objectToCompareTo);
                }
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    private boolean getFieldValuesAndCompare(Field field,
                                             Object obj1,
                                             Object obj2)
            throws Exception {
        Object value1 = field.get(obj1);
        Object value2 = field.get(obj2);
        return compare(value1, value2);
    }

    private boolean compare(Object value1, Object value2) {
        return Objects.equals(value1, value2);
    }

    private boolean isTraversable(Field field) {
        // This should handle collections, but it does not work
        // on custom classes, e.g. a Person class
        if (Collection.class.isAssignableFrom(field.getType())) {
            return true;
        }
        // Need to somehow figure out if this is a class with
        // properties I can traverse, or something with a value,
        // like String, Long, etc., hopefully
        // without listing everything
        return false;
    }
}
Posting a descriptive answer, since the object cannot be shared for legal reasons.
You have a couple of choices, each with pros and cons.
Reflection
You can maintain a list of fields that are not allowed to change, with their full paths, like a.b.c. You can then write plain reflection code, or use a common utility like http://commons.apache.org/proper/commons-beanutils/ to read values (even deep in the object tree) and compare them.
This needs less code and less maintenance, but you need to know the exact list of blacklisted fields, and performance-wise it will take a little more time.
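A rough sketch of the reflection route with Commons BeanUtils, assuming the mega object exposes standard getters; the class name and property paths below are made up for illustration:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

import org.apache.commons.beanutils.PropertyUtils;

public class FrozenFieldsCheck {

    // Hypothetical property paths; these depend entirely on your object model.
    private static final List<String> BLACKLIST =
            Arrays.asList("id", "owner.name", "audit.createdOn");

    // Returns the blacklisted paths whose values differ between the two objects.
    public static List<String> changedBlacklistedFields(Object oldObj, Object newObj)
            throws Exception {
        List<String> changed = new ArrayList<>();
        for (String path : BLACKLIST) {
            Object before = PropertyUtils.getNestedProperty(oldObj, path);
            Object after  = PropertyUtils.getNestedProperty(newObj, path);
            if (!Objects.equals(before, after)) {
                changed.add(path);
            }
        }
        return changed;
    }
}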
Simple plain code technique
Write your own comparator or a method in Java that goes through all the fields that must not change and decides. It needs a lot of code, but it is very easy to maintain and is the best option performance-wise.
How would I go about implementing a deep copy for Foo? It contains an instance of Bar, which then has a reference to that Foo.
public class Foo {
Bar bar;
Foo () {
bar = new Bar(this);
}
Foo (Foo oldFoo) {
bar = new Bar(oldFoo.bar);
}
public static void main(String[] args) {
Foo foo = new Foo();
Foo newFoo = new Foo(foo);
}
class Bar {
Foo foo;
Bar (Foo foo) {
this.foo = foo;
}
Bar (Bar oldBar) {
foo = new Foo(oldBar.foo);
}
}
}
As it stands, this code would cause a stack overflow due to infinite recursion.
Also, this is the most simplistic example I could construct. In practice, the object graph would be larger, with multiple instance variables which could themselves be collections. Think multiple Bars, with multiple Foos, for instance.
EDIT: I'm currently in the process of implementing @chiastic-security's method. Am I doing it correctly for Foo? I'm using a separate HashMap to contain all parts of the object graph so that I can write the deep copy functionality as generally as possible.
Foo (Foo oldFoo) throws Exception {
this(oldFoo, new IdentityHashMap<Object, Object>(), new IdentityHashSet<Object>());
}
Foo (Foo oldFoo, IdentityHashMap<Object, Object> clonedObjects, IdentityHashSet<Object> cloning) throws Exception {
System.out.println("Copying a Foo");
HashMap<Object, Object> newToOldObjectGraph = new HashMap<Object, Object>();
newToOldObjectGraph.put(bar, oldFoo.bar);
deepCopy(newToOldObjectGraph, clonedObjects, cloning);
}
void deepCopy(HashMap<Object, Object> newToOldObjectGraph, IdentityHashMap<Object, Object> clonedObjects, IdentityHashSet<Object> cloning) throws Exception {
for (Entry<Object, Object> entry : newToOldObjectGraph.entrySet()) {
Object newObj = entry.getKey();
Object oldObj = entry.getValue();
if (clonedObjects.containsKey(oldObj)) {
newObj = clonedObjects.get(oldObj);
}
else if (cloning.contains(oldObj)){
newObj = null;
}
else {
cloning.add(oldObj);
// Recursively deep clone
newObj = newObj.getClass().getConstructor(oldObj.getClass(), clonedObjects.getClass(), cloning.getClass()).
newInstance(oldObj, clonedObjects, cloning);
clonedObjects.put(oldObj, newObj);
cloning.remove(oldObj);
}
if (newObj == null && clonedObjects.containsKey(oldObj)) {
newObj = clonedObjects.get(oldObj);
}
}
}
The easiest way to implement a deep copy that might involve circular references, if you want it to be tolerant of changes to the structure later, would be to use an IdentityHashMap and an IdentityHashSet (from here). When you want to copy:
Create an empty IdentityHashMap<Object,Object>, to map source objects to their clones.
Create an empty IdentityHashSet<Object> to track all the objects that are currently in the process of being cloned, but haven't yet finished.
Start the copy process going. At each stage, when you want to copy an object, look it up in your IdentityHashMap to see if you've already cloned that bit. If you have, return the copy that you find in the IdentityHashMap.
Check in the IdentityHashSet to see if you're in the middle of cloning the object you've now reached (because of a circular reference). If you are, just set the reference to null for now, and move on.
If you haven't previously cloned this (i.e., the source object isn't in the map), and you're not in the middle of cloning it (i.e., it's not in the set), add it to the IdentityHashSet, recursively deep clone it, and then when you've finished the recursive call, add the source/clone pair to the IdentityHashMap, and remove it from the IdentityHashSet.
Now at the end of your recursive cloning, you need to deal with the null references you left hanging because you encountered a circular reference. You can walk the graph of source and destination simultaneously. Whenever you find an object in the source graph, look it up in your IdentityHashMap, and find out what it should map to. If it exists in the IdentityHashMap, and if it's currently null in the destination graph, then you can set the destination reference to the clone you find in the IdentityHashMap.
This will make sure you don't clone the same part of the graph twice, but always end up with the same reference whenever there's an object that appears twice in your graph. It will also mean that circular references don't cause infinite recursion.
The point of using the Identity versions is that if two objects in your graph are the same as determined by .equals(), but different instances as determined by ==, then a HashSet and HashMap would identify the two, and you'd end up joining things together that shouldn't be joined. The Identity versions will treat two instances as the same only if they're identical, i.e., the same as determined by ==.
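To make the bookkeeping concrete, here is a minimal sketch for the Foo/Bar pair from the question. It is a slight variation on the steps above: instead of leaving null placeholders and patching them afterwards, it registers each clone in the IdentityHashMap before descending into its fields, which is already enough to preserve this particular cycle:
import java.util.IdentityHashMap;
import java.util.Map;

public class Foo {
    Bar bar;

    Foo() {
        bar = new Bar(this);
    }

    Foo(Foo oldFoo) {
        this(oldFoo, new IdentityHashMap<Object, Object>());
    }

    Foo(Foo oldFoo, Map<Object, Object> clones) {
        clones.put(oldFoo, this);               // register before descending
        Object seen = clones.get(oldFoo.bar);
        bar = (seen != null) ? (Bar) seen : new Bar(oldFoo.bar, clones);
    }

    static class Bar {
        Foo foo;

        Bar(Foo foo) {
            this.foo = foo;
        }

        Bar(Bar oldBar, Map<Object, Object> clones) {
            clones.put(oldBar, this);           // register before descending
            Object seen = clones.get(oldBar.foo);
            foo = (seen != null) ? (Foo) seen : new Foo(oldBar.foo, clones);
        }
    }

    public static void main(String[] args) {
        Foo copy = new Foo(new Foo());
        System.out.println(copy.bar.foo == copy); // true: the cycle is preserved
    }
}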
If you want to do all this but without having to implement it yourself, you could have a look at the Java Deep Cloning Library.
I have been working to upgrade my Java code baseline so that it follows good security practices and have run into an issue related to generics. Say you have the following:
public class SomeClass<T>
{
private T value;
public T getValue()
{
return value;
}
public void setValue(T value)
{
this.value = value;
}
}
I have not found a good answer on how to edit these methods so that value does not leak (as it does in this example class) for a generic object that does not implement Cloneable and in some cases has no default constructor.
As I understand it, you want to make sure that nothing outside SomeClass can mutate the object value.
In C++, you could return a const reference (avoiding copying altogether), but Java does not have that. So let's look at copying...
First, know that some objects cannot be copied: streams, GUI elements, etc. Thus, trying to copy all objects is a hopeless endeavor from the start.
But what about objects that are copiable?
In Java, you cannot call the copy constructor (or any other constructor) of a generic (Calling constructor of a generic type).
There is the Cloneable interface, but that is really nothing more than a promise that clone() works; it does not actually expose clone() publicly. Thus, for generics, you have to use reflection, as shown here.
Unfortunately, there is no good solution. The only viable one (except for changing the purpose or semantics of your class) is to use the clone method as shown in the link above, and realize that some objects cannot be copied.
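For the generic case, the reflective call looks roughly like this; a sketch only, assuming T's runtime class declares a public clone() (as ArrayList does, for example), and accepting that anything without one simply cannot be copied:
import java.lang.reflect.Method;

public class SomeClass<T> {
    private T value;

    // Returns a copy when T exposes a public clone(); otherwise it throws,
    // which is exactly the "some objects cannot be copied" caveat above.
    @SuppressWarnings("unchecked")
    public T getValue() {
        if (value == null) {
            return null;
        }
        try {
            Method clone = value.getClass().getMethod("clone");
            return (T) clone.invoke(value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("value is not copyable via clone()", e);
        }
    }

    public void setValue(T value) {
        this.value = value; // note: still stores the caller's reference
    }
}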
Ultimately, the best thing to do is find a solution that does not require this. Make a (non-generic) read-only wrapper class that exposes the non-mutating methods. Or stipulate in documentation that mutating methods must not be called.
I can see three approaches:
Make copies. This of course would only work with types that can be copied (and that you know how to copy).
Only support immutable types.
Remove getValue(). Instead, provide methods that operate directly on this.value without exposing it outside the class. In this approach, setValue() can still be problematic (you need to make sure that the caller does not hold on to the object reference after calling setValue()).
If T can be an arbitrary type that you have no control over, then options 1 and 2 won't be suitable.
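A small sketch of the third approach; the method names are made up, the point is that callers get answers about the value rather than the value itself:
import java.util.Objects;

public class SomeClass<T> {
    private T value;

    public void setValue(T value) {
        this.value = value;
    }

    // Operate on the value internally instead of handing the reference out.
    public String valueAsString() {
        return String.valueOf(value);
    }

    public boolean valueEquals(Object other) {
        return Objects.equals(value, other);
    }
}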
I believe I understand you. If you want to restrict a generic type, you should use the extends keyword (which, for type parameters, covers interfaces as well as classes). Then only classes that implement the bounding interface can be used as the type argument for this class. One example:
public class Stack {
public static void main(String[] args) {
SomeClass<Animal> sc = new SomeClass<>(); // compile error: Animal does not implement Comparable
SomeClass<Person> sc1 = new SomeClass<>(); // fine: Person implements Comparable
}
}
class SomeClass<T extends Comparable<T>> // note that for type parameters, extends covers interfaces as well as classes
{
private T value;
public T getValue()
{
return value;
}
public void setValue(T value)
{
this.value = value;
}
}
class Person implements Comparable<Person>{
@Override
public int compareTo(Person p){
return 0;
}
}
class Animal {
}
I hope this helped you. :)
An object whose state is encapsulated in a mutable object should generally never expose to the outside world any reference to that object, and should avoid giving the outside world a reference to any mutable object (even a copy) which claims to encapsulate its state. The problem is that given code:
Foo foo = myEntity1.getFoo();
foo.bar = 23;
myEntity2.setFoo(foo);
foo.bar = 47;
myEntity3.setFoo(foo);
there is no clear indication whether or how the change to foo.bar would affect the various entities. If the code had instead been:
Foo foo = myEntity1.getFoo();
foo = foo.withBar(23); // makes a new instance which is like foo, but where bar==23
myEntity2.setFoo(foo);
foo = foo.withBar(47); // makes a new instance which is like foo, but where bar==47
myEntity3.setFoo(foo);
it would be very clear that the bar property of myEntity1's foo will be unaffected, that of myEntity2 will be 23, and that of myEntity3 will be 47. If foo is a mutable class, the pattern should be:
Foo foo = new Foo();
myEntity1.writeTo(foo); // Copy properties from myEntity1 to the supplied instance
foo.bar = 23;
myEntity2.readFrom(foo); // Copy properties from the supplied instance to myEntity2
foo.bar = 47;
myEntity3.readFrom(foo); // Copy properties from the supplied instance to myEntity3
Here, myEntity1 isn't giving the caller an object, but is instead copying data to an object supplied by the caller. Consequently, it's much clearer that the caller shouldn't expect the writes to foo.bar to affect the entities directly, but merely change what will be written in subsequent readFrom calls.
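For completeness, a minimal sketch of the withBar style used in the second snippet, written as an immutable class:
// Immutable variant of Foo: withBar() returns a new instance instead of
// mutating this one, which is what makes the second snippet above unambiguous.
public final class Foo {
    public final int bar;

    public Foo(int bar) {
        this.bar = bar;
    }

    public Foo withBar(int newBar) {
        return new Foo(newBar);
    }
}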
Say, I have a class X which has a field value, that is,
class X implements Serializable {
private int value;
// ...
}
Further, it has getters and setters, not shown here. This class is serialized.
At the deserialization end, the same class has the value field, but its access specifier is public. Further, this class does not have getters and setters. So, my questions are:
Does deserialization fail in case the access specifier of the field changes OR some or all of the methods go missing in the class at the deserialization end?
What is the mechanism by which fields are assigned their values during deserialization?
Some good links: The Java serialization algorithm revealed
1) does deserialization fail in case the access specifier of the field
changes OR some or all of the methods go missing in the class at the
deserialization end ?
Serialization happens using reflection.
Java detects changes to a class using the
private static final long serialVersionUID
The default involves a hashcode. Serialization creates a single hashcode, of type long, from the following information:
The class name and modifiers
The names of any interfaces the class implements
Descriptions of all methods and constructors except private methods and constructors
Descriptions of all fields except private static and private transient fields
The default behavior of the serialization mechanism is a classic "better safe than sorry" strategy. The serialization mechanism uses the SUID, which defaults to an extremely sensitive index, to tell when a class has changed. If it has, the serialization mechanism refuses to create instances of the new class using data that was serialized with the old class.
2) what is the mechanism by which fields are assigned their values
during deserialization ?
The real details can be read in the Java Object Serialization Specification.
To answer your questions:
Serialization has a basic sanity check to see if the serialization ends use the same version of a class: the serialVersionUID member must be equal. Read the section Stream Unique Identifiers to know more about it. Basically, it's a static value which you can either manage yourself by declaring it on your class, or let the compiler generate one for you. If the compiler generates it, ANY change to a class will result in a change of serialVersionUID and hence will make the deserialization fail if the ends do not have exactly the same classes. If you want to avoid this, declare the variable yourself and update it manually when a change to the class' member variables does make classes incompatible.
The Java Virtual Machine does a lot of the magic here, it can access all internal state directly without the need for getters (fields marked transient or static aren't serialized though). Also, while the Serializable interface doesn't specify any methods to implement, there are a number of 'magic methods' which you can declare to influence the serialization process. Read section "The writeObject Method" and onwards to know more. Be aware though that you should use these sparingly as they might confuse any maintenance developers!
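To illustrate the serialVersionUID point from the first answer above: declaring the field explicitly means compatible changes (such as removing getters or changing access modifiers) will not change the stream version; you bump it by hand only when the serialized fields become incompatible:
import java.io.Serializable;

class X implements Serializable {
    // Managed by hand; adding/removing methods no longer affects compatibility.
    private static final long serialVersionUID = 1L;

    private int value;
}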
You don't need to have getters/setters to serialize/deserialize using Java serialization; for example, check this code:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Main {
public static class Q implements Serializable {
private int x;
public Q() {
x = 10;
}
public void printValue() {
System.out.println(x);
}
}
public static void main(String[] args) throws Exception {
Q q = new Q();
FileOutputStream fos = new FileOutputStream("c:\\temp.out");
ObjectOutputStream oos = new ObjectOutputStream(fos);
oos.writeObject(q);
oos.close();
FileInputStream fis = new FileInputStream("c:\\temp.out");
ObjectInputStream oin = new ObjectInputStream(fis);
Q q2 = (Q)oin.readObject();
oin.close();
q2.printValue();
}
}
I don't really know how you got these results, but what you describe is not the default behaviour of serialization, so I guess you are using it wrong. Here is some sample code:
public class X implements Serializable
{
private int value;
public int getValue() { return value; }
public void setValue(int value) { this.value = value; }
}
Here is the serialization/deserialization process:
X x = new X();
x.setValue(4);
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(buffer);
oos.writeObject(x);
oos.flush();
oos.close();
ByteArrayInputStream in = new ByteArrayInputStream(buffer.toByteArray());
ObjectInputStream ois = new ObjectInputStream(in);
Object obj = ois.readObject();
if (obj instanceof X)
{
X readObject = (X) obj;
System.out.println(readObject.getValue());
}
You probably used Java reflection to get your results. Make sure you use getDeclaredFields() and getDeclaredMethods() instead of the variants without Declared in the method name.
Does deserialization fail in case the access specifier of the field changes
No.
OR some or all of the methods go missing in the class at the deserialization end?
Yes, unless the receiving class has a serialVersionUID member whose value equals the value encoded in the stream.
what is the mechanism by which fields are assigned their values during deserialization?
Too broad, but:
Reflection, and
name matching (rather than matching by position in the class and stream).
I have a Runnable class like:
class R1 implements Runnable {
private static final Log LOGGER = LogFactory.getLog(R1.class);
private final ObjectClass obj;
private final SomeService service;
public R1(ObjectClass obj, SomeService service) {
this.obj = obj;
this.service = service;
}
@Override
public void run() {
String value = this.obj.getSomeValue();
LOGGER.debug("Value is " + value);
// some actions, such as:
// service.someMethod(obj);
}
}
I use an ExecutorService object to execute R1 and put R1 in a queue.
But later, outside R1, I change the value in the ObjectClass instance that I passed to R1, so the actions in R1 after getSomeValue() don't behave as I expected. If I want to keep the value of the ObjectClass object in R1 unchanged, what can I do? Suppose the object is big and has a lot of get and set methods.
To make the problem clearer: I need to pass the obj into a service class object which is also used as a parameter in the runnable class. I have changed the original code accordingly.
As per comments, apparently my suggested solution has problems.
As such, follow the other suggestions about creating a new instance and copying the properties you require across. Or create a lightweight data object that holds the properties you require. Either way, I believe you need 2 instances to do what you want.
I suggest you implement a clone method that creates a new instance:
http://download.oracle.com/javase/1.5.0/docs/api/java/lang/Cloneable.html
The problem here is that you have passed the instance into your R1 class, but it is still the same single instance, so changes to it will affect everything else. Implementing a clone method will allow you to easily create a copy of your instance that can be used in your R1 class, while allowing you to make further changes to your original.
In your R1 class,
public R1(ObjectClass obj) {
//this.obj = obj;
this.obj = obj.clone();
}
P.S. you must implement this method yourself. It won't just automatically give you a deep copy.
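A sketch of what such a clone() override might look like, assuming a hypothetical ObjectClass with one simple field and one mutable collection (your real class will differ):
import java.util.ArrayList;
import java.util.List;

public class ObjectClass implements Cloneable {

    private String someValue;                        // hypothetical fields
    private List<String> items = new ArrayList<>();

    public String getSomeValue() {
        return someValue;
    }

    @Override
    public ObjectClass clone() {
        try {
            ObjectClass copy = (ObjectClass) super.clone();
            // super.clone() is shallow, so duplicate mutable state by hand
            copy.items = new ArrayList<>(this.items);
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);             // cannot happen: we implement Cloneable
        }
    }
}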
Depending on the nature of your program, there are a couple options.
You could "Override Clone Judiciously" (Item 11 in Effective Java) and clone the object before handing it to the runnable. If overriding clone doesn't work for you, it might be better to do one of the following:
Create a new instance of the object manually and copy the values from obj.
Add a subset of the data contained in obj. So instead of passing obj into the constructor, you would pass in someValue. I would advocate this method, so that you only supply R1 with the data it needs, and not the entire object.
Alternatively, if it doesn't matter that the data in obj changes before R1 is executed, then you only need to make sure that obj doesn't change while R1 is executing. In this case, you could add the synchronized keyword to the getSomeValue() method (and to the setters that modify it), and then have R1 synchronize on obj like so:
@Override
public void run() {
synchronized (obj) {
String value = obj.getSomeValue();
}
// some actions.
}
Pass the object to the constructor, and don't keep a reference to it.
If the object is too big, maybe an immutable parameter object, with just enough data/methods, is better.
If possible, try making your ObjectClass immutable (no state changes supported). In Java you have to do this yourself; there's no notion of a 'const' object (as in C++).
Perhaps you can keep your original ObjectClass but create a new class, ImmutableObjectClass, which takes your original in the constructor.
Assumption: You don't care if R1 operates on old data.
You can then change your code to:
public class R1 implements Runnable {
private final String value;
// Option 1: Pull out the String in the constructor.
public R1(ObjectClass obj) {
this.value = obj.getSomeValue(); // Now it is immutable
}
// Option 2: Pass the String directly into the constructor.
public R1(String value) {
this.value = value; // This constructor has no coupling
}
@Override public void run() {
// Do stuff with value
}
}
If you want R1 to operate on the latest data, as opposed to what the data was when you constructed it, then you will need some type of synchronisation between R1 and the data modification.
The "problem" here is that under the Java Memory Model, threads may (and do) cache field values. This means that if one thread updates a field (of the ObjectClass object), other threads won't "see" the change - they'll still be looking at their cached (stale) value.
In order to make changes visible across threads, you have two options:
Make the fields you'll be changing in ObjectClass volatile - the volatile keyword forces threads not to cache the field's value (i.e. always use the latest value)
synchronize access, both read and write, to the fields - all changes made within a synchronized block are visible to other threads synchronizing on the same lock object (if you synchronize methods, the this object is used as the lock)
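A minimal sketch of both options, using a hypothetical ObjectClass with a single field:
// Option 1: volatile makes each write to the field visible to other threads.
class ObjectClassVolatile {
    private volatile String someValue;

    String getSomeValue() { return someValue; }
    void setSomeValue(String v) { someValue = v; }
}

// Option 2: synchronized reads and writes on the same lock (here, "this")
// give the same visibility guarantee and also make compound updates atomic.
class ObjectClassSynchronized {
    private String someValue;

    synchronized String getSomeValue() { return someValue; }
    synchronized void setSomeValue(String v) { someValue = v; }
}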