Deep copying Java objects with circular references

How would I go about implementing a deep copy for Foo? It contains an instance of Bar, which then has a reference to that Foo.
public class Foo {
Bar bar;
Foo () {
bar = new Bar(this);
}
Foo (Foo oldFoo) {
bar = new Bar(oldFoo.bar);
}
public static void main(String[] args) {
Foo foo = new Foo();
Foo newFoo = new Foo(foo);
}
class Bar {
Foo foo;
Bar (Foo foo) {
this.foo = foo;
}
Bar (Bar oldBar) {
foo = new Foo(oldBar.foo);
}
}
}
As it stands, this code would cause a stack overflow due to infinite recursion.
Also, this is the simplest example I could construct. In practice, the object graph would be larger, with multiple instance variables which could themselves be collections. Think multiple Bars, with multiple Foos, for instance.
EDIT: I'm currently in the process of implementing @chiastic-security's method. Am I doing it correctly for Foo? I'm using a separate HashMap to contain all parts of the object graph so that I can write the deep copy functionality as generally as possible.
Foo (Foo oldFoo) throws Exception {
this(oldFoo, new IdentityHashMap<Object, Object>(), new IdentityHashSet<Object>());
}
Foo (Foo oldFoo, IdentityHashMap<Object, Object> clonedObjects, IdentityHashSet<Object> cloning) throws Exception {
System.out.println("Copying a Foo");
HashMap<Object, Object> newToOldObjectGraph = new HashMap<Object, Object>();
newToOldObjectGraph.put(bar, oldFoo.bar);
deepCopy(newToOldObjectGraph, clonedObjects, cloning);
}
void deepCopy(HashMap<Object, Object> newToOldObjectGraph, IdentityHashMap<Object, Object> clonedObjects, IdentityHashSet<Object> cloning) throws Exception {
for (Entry<Object, Object> entry : newToOldObjectGraph.entrySet()) {
Object newObj = entry.getKey();
Object oldObj = entry.getValue();
if (clonedObjects.containsKey(oldObj)) {
newObj = clonedObjects.get(oldObj);
}
else if (cloning.contains(oldObj)){
newObj = null;
}
else {
cloning.add(oldObj);
// Recursively deep clone
newObj = newObj.getClass().getConstructor(oldObj.getClass(), clonedObjects.getClass(), cloning.getClass()).
newInstance(oldObj, clonedObjects, cloning);
clonedObjects.put(oldObj, newObj);
cloning.remove(oldObj);
}
if (newObj == null && clonedObjects.containsKey(oldObj)) {
newObj = clonedObjects.get(oldObj);
}
}
}

The easiest way to implement a deep copy that might involve circular references, if you want it to be tolerant of later changes to the structure, is to use an IdentityHashMap and an IdentityHashSet (the latter is not part of the JDK; Collections.newSetFromMap(new IdentityHashMap<>()) gives an equivalent). When you want to copy:
1. Create an empty IdentityHashMap<Object,Object> to map source objects to their clones.
2. Create an empty IdentityHashSet<Object> to track the objects that are currently being cloned but haven't finished yet.
3. Start the copy process. At each stage, when you want to copy an object, look it up in your IdentityHashMap to see whether you've already cloned it. If you have, return the copy that you find in the IdentityHashMap.
4. Check the IdentityHashSet to see whether you're in the middle of cloning the object you've now reached (because of a circular reference). If you are, just set the reference to null for now, and move on.
5. If you haven't previously cloned this object (i.e., the source object isn't in the map), and you're not in the middle of cloning it (i.e., it's not in the set), add it to the IdentityHashSet, recursively deep clone it, and then, when the recursive call has finished, add the source/clone pair to the IdentityHashMap and remove the object from the IdentityHashSet.
6. At the end of your recursive cloning, you need to deal with the null references you left hanging because you encountered a circular reference. You can walk the source and destination graphs simultaneously. Whenever you find an object in the source graph, look it up in your IdentityHashMap to find out what it should map to. If it exists in the IdentityHashMap, and the corresponding reference is currently null in the destination graph, set the destination reference to the clone you find in the IdentityHashMap.
This will make sure you don't clone the same part of the graph twice, but always end up with the same reference whenever there's an object that appears twice in your graph. It will also mean that circular references don't cause infinite recursion.
The point of using the Identity versions is that if two objects in your graph are the same as determined by .equals(), but different instances as determined by ==, then a HashSet and HashMap would identify the two, and you'd end up joining things together that shouldn't be joined. The Identity versions will treat two instances as the same only if they're identical, i.e., the same as determined by ==.
If you want to do all this but without having to implement it yourself, you could have a look at the Java Deep Cloning Library.
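As a rough illustration, here is a minimal sketch of the identity-map idea applied to the Foo/Bar pair from the question. It is a simplified variant, not the exact procedure described above: each new copy is registered in the IdentityHashMap before its fields are copied, so a circular reference finds the half-built copy in the map and no separate "currently cloning" set or null fix-up pass is needed. The deepCopy methods and the private Foo(boolean) constructor are hypothetical names introduced for this example.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class Foo {
    Bar bar;

    Foo() {
        bar = new Bar(this);
    }

    // Hypothetical helper constructor: creates an empty Foo whose fields are filled in by deepCopy.
    private Foo(boolean copying) { }

    Foo deepCopy() {
        return deepCopy(new IdentityHashMap<>());
    }

    Foo deepCopy(Map<Object, Object> clones) {
        Foo existing = (Foo) clones.get(this);
        if (existing != null) {
            return existing;            // already copied, or currently being copied
        }
        Foo copy = new Foo(true);
        clones.put(this, copy);         // register before recursing so cycles terminate
        copy.bar = (bar == null) ? null : bar.deepCopy(clones);
        return copy;
    }
}

class Bar {
    Foo foo;

    Bar(Foo foo) {
        this.foo = foo;
    }

    Bar deepCopy(Map<Object, Object> clones) {
        Bar existing = (Bar) clones.get(this);
        if (existing != null) {
            return existing;
        }
        Bar copy = new Bar(null);
        clones.put(this, copy);
        copy.foo = (foo == null) ? null : foo.deepCopy(clones);
        return copy;
    }
}
```

With this sketch, new Foo().deepCopy() returns a Foo whose bar is a new Bar that points back at the new Foo, and the original pair is left untouched.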

Related

Does a copy constructor make a shallow copy?

My question is simple: does a copy constructor make a deep copy or a shallow copy?
Here is the situation I faced:
I'm making a node editor application. I have an abstract Node class with an abstract method called Create(). I override that method in all subclasses in this way:
public Node Create(){
TestClass theTest = new TestClass();
theTest.Name = "Test Node";
theTest.Title = "Default Node";
theTest.setSize(new Point2D.Float(250,200));
System.out.print(theTest.getClass());
return theTest;
}
I thought this should make a deep copy. Since that didn't work, I tried this also.
public Node Create(Point2D location) {
TestClass theTest = null;
try {
theTest = this.getClass().newInstance();
} catch (InstantiationException | IllegalAccessException e) {
e.printStackTrace();
}
if (theTest != null) {
theTest.Name = "The Node";
theTest.Title = "Defaul Node";
theTest.setSize((new Point2D.Float(250,200)));
theTest.Location = location;
}
return theTest;
}
Then all the subclass types are added into a list and a popup menu is created with subclasses. User can click it and add a new node. This is the code to add a node. This method is called by a MouseEvent of the JMenuItem.
private void addNode(Node node){
Node newNode = node.Create(locationPersistence);
nodes.add(newNode);
}
But no luck. It seems to create a shallow copy instead of a deep copy. When I add the first node, it appears fine. But when I add a second node of the same type, the first node disappears and reappears at the new location. Does this mean that it is making a shallow copy? If so, how can I achieve a deep copy?

First, Java has no built-in copy constructor. There is the Cloneable interface and the clone() method, but by default that method makes a shallow copy.
Your code assigns the same Point2D object to the Location property of both nodes. You need to create a new Point2D instance and use it in the new node.
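A minimal sketch of that fix, assuming a simplified node class (NodeSketch is a hypothetical name, not the asker's Node): the constructor stores its own copy of the point, so moving the shared point later does not move nodes that were created earlier. Point2D.clone() is public in java.awt.geom.Point2D.

```java
import java.awt.geom.Point2D;

class NodeSketch {
    Point2D location;

    // Store a private copy so later changes to the caller's point do not move this node.
    NodeSketch(Point2D location) {
        this.location = (Point2D) location.clone();
    }
}
```

In the question's code, the equivalent change would be to assign a copy of the location argument inside Create instead of the argument itself.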

Java avoids deep copying.
For the immutable String class this is no problem, as Strings may be shared.
For the old mutable java.awt Point2D.Float class there is indeed a problem. Replacing it with an immutable class would probably be better than deep copying.
The javafx.geometry.Point2D is immutable.
For mutable arrays there is a problem. Even a final array can have its elements changed from outside. Here the advice would be to use collections instead.
private final List<Point2D> points = new ArrayList<>();
public List<Point2D> getPoints() {
return Collections.unmodifiableList(points);
}
Use the java convention of field and method names starting with a small letter.
Java is quite rigorous with respect to that.
C/C++ partly need deep copying for keeping objects on the local stack.
Java removed the need somewhat for copy constructors, but historically failed for String: String has a senseless copy constructor, probably instigated by intern() and having an internal char array.

A copy constructor is when your class contains a constructor that accepts an instance of itself as parameter. The parameter is used to create a new instance of the class that has the exact same values for its fields as the instance class that was provided as parameter.
Your Node class will have to have a constructor like this:
public class Node {
public Node(Node n) {
//copy all fields in Node n here
//eg this.a = n.a
//this.b = n.b etc
}
}
Then when you inherit from Node, you need to call this parent constructor in the child class constructor as well:
public class TestClass extends Node {
public TestClass(TestClass t) {
super(t);
//copy any additional fields that are only present in TestClass here
}
}
Now, the difference between shallow and deep copy.
A shallow copy is when a reference is set equal to another reference.
E.g.:
Point2D.Float a = new Point2D.Float(50, 50);
Point2D.Float b = a;
When you change the value of one of a's members, b will also be affected. The reason is that both a and b are references to the same object.
a.x = 100;
System.out.println(b.x == 100); //prints true
A deep copy is when a and b each refer to their own instance. This can be done as follows:
Point2D.Float a = new Point2D.Float(50, 50);
Point2D.Float b = new Point2D.Float();
b.x = a.x;
b.y = a.y;
If I now type:
a.x = 100;
then b.x will not change to this same value, but will keep the value that was originally stored in a, in this case 50.
System.out.println(b.x == 100); //prints false
System.out.println(b.x == 50); //prints true
If you want a deep copy in your constructor, then you need to ensure that all members of the class that are references to mutable classes refer to their own instances.
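To make the last point concrete, here is a small sketch of a copy constructor that gives the new object its own instances of every mutable member. The Route class and its fields are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Route {
    final String name;          // String is immutable, so sharing the reference is safe
    final List<int[]> points;   // a mutable list of mutable arrays: both levels need copying

    Route(String name) {
        this.name = name;
        this.points = new ArrayList<>();
    }

    // Copy constructor: the new Route gets its own list and its own arrays.
    Route(Route other) {
        this.name = other.name;
        this.points = new ArrayList<>(other.points.size());
        for (int[] p : other.points) {
            this.points.add(p.clone());   // copy each array, not just the reference to it
        }
    }
}
```

Copying only the list (new ArrayList<>(other.points)) would still share the int[] elements between the two routes, which is the same symptom the question describes.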

Why does Collections.copy still copy references instead of cloning the objects that are copied?

OK, I just want to copy List<String[]> list to List<String[]> list2. After that I will modify the objects in list2, and I don't want that to affect any object in list.
String[] s={"1","2"};
List<String[]> list=new ArrayList<String[]>();
list.add(s);
List<String[]> list2=new ArrayList<String[]>(list);
Collections.copy(list2,list);
list2.get(0)[1]="3";
for (String[] strings : list) {
System.out.println(Arrays.toString(strings));
}
Output: [1, 3]
Why does changing things in list2 affect list?
How can I fix it?

Collections.copy() does not perform a deep copy.
It simply copies elements from one collection to another. It is basically the same as the ArrayList constructor, so you do not need to call both.
The first element in both lists still refers to the same array. Thus, when you run your code, you are modifying the one array that both lists share. To avoid this, you can iterate over all your elements and use Arrays.copyOf on each list item.
Something like this:
private List<String[]> deepCopy(List<String[]> list) {
List<String[]> copy = new ArrayList<String[]>(list.size());
for (String[] element : list) {
copy.add(Arrays.<String> copyOf(element, element.length));
}
return copy;
}
EDIT: Java < 1.6 version:
private List<String[]> deepCopy(List<String[]> list) {
List<String[]> copy = new ArrayList<String[]>(list.size());
for (String[] element : list) {
String[] elementCopy = new String[element.length];
System.arraycopy(element, 0, elementCopy, 0, element.length);
copy.add(elementCopy);
}
return copy;
}

What you are asking for is to copy the collection and clone all the objects it holds. In your case you only have lists of string arrays, and the other answers already cover that well. Just as a note: in general it is not that easy, because there is no universal recipe for cloning objects. Each object in the collection could itself have references to other objects, and those objects could have further references to more objects. So it depends on what you need.
Do you just need to clone the objects in the collection, but not the objects those objects refer to?
This is called "shallow copying".
Or do you need to copy all the objects down the object reference tree? This is called "deep copying".
Or you might have requirements for something in between, meaning you only need to copy certain objects.
As you can see, there is no silver bullet which solves everything. That's why it is not implemented in a generic collection class. In some cases cloning might not even be possible, for example if you are dealing with open file handles or other system resources.
But what you can do is have your classes implement the Cloneable interface. Within the clone() method you can call super.clone(), which already does a shallow clone for you. Everything beyond that you need to implement yourself. Then you only need to call clone() for each object in your collection in order to create the cloned objects.
This is how to implement Cloneable:
class MyClass implements Cloneable {
private int a;
private int b;
private MyClass c;
@Override
public Object clone() throws CloneNotSupportedException {
return super.clone();
}
}
Calling clone() would give you a copy of your MyClass-object. That means a and b would be copies as well as the reference to object c. But only the reference, not the object c itself. If you need that, you need to do something like this:
@Override
public Object clone() throws CloneNotSupportedException {
MyClass clone = (MyClass)super.clone();
// c is declared as MyClass, so the cloned field is cast to MyClass as well
clone.c = (MyClass)c.clone();
return clone;
}
Here is a more detailed explanation for shallow and deep copying and the Cloneable interface:
http://javapapers.com/core-java/java-clone-shallow-copy-and-deep-copy/

Most Concise Way to Determine List<Foo> Contains Element Where Foo.getBar() = "Baz"?

Given a starting List<Foo>, what is the most concise way to determine if a Foo element having a property bar (accessed by getBar()) has a value of "Baz"? The best answer I can come up with is a linear search:
List<Foo> listFoo;
for(Foo f:listFoo) {
if(f.getBar().equals("Baz")) {
// contains value
}
}
I looked into HashSet but there doesn't seem to be a way to use contains() without first instantiating a Foo to pass in (in my case, Foo is expensive to create). I also looked at HashMap, but there doesn't seem to be a way to populate without looping through the list and adding each Foo element one at a time. The list is small, so I'm not worried about performance as much as I am clarity of code.
Most of my development experience is with C# and Python, so I'm used to more concise statements like:
// C#
List<Foo> listFoo;
bool contains = listFoo.Count(f => f.getBar=="Baz")>0;
or
# Python
# list_foo = [Foo(), ...]
contains = "Baz" in (f.bar for f in list_foo)
Does Java have a way to pull this off?

Java does not support closures (yet), so your solution is one of the shortest. Another way would be to use, for example, the closure-like Predicate together with Iterables from google-collections:
boolean contains = Iterables.any(iterableCollection, new Predicate<Foo>() {
@Override
public boolean apply(Foo foo) {
return foo != null && foo.getBar().equals("Baz");
}
});
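The answers here predate Java 8. Since Java 8, lambdas and the Stream API make this a one-liner with Stream.anyMatch. The sketch below assumes a minimal Foo with a getBar() accessor, as in the question; the ContainsDemo class and containsBar method are hypothetical names for this example.

```java
import java.util.List;

class Foo {   // minimal stand-in for the question's Foo
    private final String bar;

    Foo(String bar) {
        this.bar = bar;
    }

    String getBar() {
        return bar;
    }
}

class ContainsDemo {
    // True if any non-null element has a bar equal to the wanted value.
    static boolean containsBar(List<Foo> listFoo, String wanted) {
        return listFoo.stream().anyMatch(f -> f != null && wanted.equals(f.getBar()));
    }
}
```

Usage would look like ContainsDemo.containsBar(listFoo, "Baz"). Putting the known value on the left of equals keeps the check safe when getBar() returns null.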

In and of itself Java does not.
Also (just as an FYI) f.getBar == "Baz" won't work for string comparison in Java, because strings are objects. When you use the == operator you are actually comparing references, which are equal only if both refer to the same object at the same memory location. The equals method is the right way to compare objects by value. Specifically, it is best to write "Baz".equals(f.getBar()), as this also avoids nasty NullPointerExceptions.
Now to address your question. I can think of ways to do it, but it probably depends on the relationship of the parent object Foo to the child object Bar. Will it always be one to one or not? In other words could the Bar value of "Baz" be associated with more than one Foo object?
Where I'm going with this is the HashMap that you mentioned earlier, because it has the methods containsKey and containsValue. Since HashMap allows duplicate values associated with different keys, you could put Foo as the key and its Bar value as the value. Then just use myHashMap.containsValue("Baz") to determine whether it is in "the list". If it is, you can always retrieve the keys (the Foos) that are associated with it.

You can only emulate this in Java, e.g. using a "function object". But since this is a bit awkward and verbose in Java, it is only worth the trouble if you have several different predicates to select elements from a list:
interface Predicate<T> {
boolean isTrueFor(T item);
}
Foo getFirst(List<Foo> listFoo, Predicate<Foo> pred) {
for(Foo f:listFoo) {
if(pred.isTrueFor(f)) {
return f;
}
}
// no element matched the predicate
return null;
}
class FooPredicateBar implements Predicate<Foo> {
private final String expected;
FooPredicateBar(String expected) {
this.expected = expected;
}
public boolean isTrueFor(Foo item) {
return item != null && expected.equals(item.getBar());
}
}
...
List<Foo> listFoo;
Foo theItem = getFirst(listFoo, new FooPredicateBar("Baz"));

You can also use Apache Commons CollectionUtils:
boolean contains = CollectionUtils.exists(listFoo, new Predicate() {
public boolean evaluate(Object input) {
return "Baz".equals(((Foo)input).getBar());
}
});

Java memory allocation on stack vs heap

I feel like a novice for asking this question -- but why is it that when I pass the Set below into my method and point it to a new HashSet, it still comes out as the EmptySet? Is it because local variables are allocated on the stack, and so my new is blown away when I exit the method? How could I achieve the functional equivalent?
import java.util.HashSet;
import java.util.Set;
public class TestMethods {
public static void main(final String[] args) {
final Set<Integer> foo = java.util.Collections.emptySet();
test(foo);
}
public static void test(Set<Integer> mySet) {
mySet = new HashSet<Integer>();
}
}

Java passes references by value; think of mySet as just a copy of the foo reference. In void test(Set<Integer> mySet), the mySet variable is just a local variable within that function, so setting it to something else doesn't affect the caller in main.
On entry, mySet does reference (or "point to", if you like) the same Set as the foo variable does in main, though.
If you want to alter the reference in main, you could do e.g.:
foo = test(); //foo can't be final now though
public static Set<Integer> test() {
return new HashSet<Integer>();
}

... Is it because local variables are allocated on the stack, and so my new is blown away when I exit the method?
No. It is because of the argument passing semantics of Java.
Java arguments are passed "by value", but in the case of an object or array type, the value you are passing is the object/array reference. When you create and assign a new set object to mySet, you are simply setting the local variable / parameter. Since Java uses pass by value, this has no effect on the foo variable in the main method.
When you enter the test method, you have two copies of the reference to the Set instance obtained in the main method (the empty set); one in foo and one in mySet. Your code then replaces the reference in mySet with a reference to a newly created HashSet, but this new reference doesn't get passed back to the caller. (You could change your code to pass it back ... for example as the result of the test method. But you have to do this explicitly.)
OK - however -- if I were to do add or some other operation within my method call, that allocation would be preserved. Why is that?
That is because when you call an instance method using the reference in foo or mySet, that method is executed on the object (HashSet) that the reference refers to. Assuming that the two references point to the same object, your "allocation will be preserved". Or more precisely, you can observe the effects of operations on one reference to an object via operations on other references to the same object.
Just remember that Java method calls copy references to objects, not the objects themselves.
By the way you won't be able to add elements to a set returned by Collections.emptySet(). That set object is immutable. Calling (for example) add on it will throw an exception.
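The difference between reassigning the parameter and calling a method through it can be seen in a small sketch (PassByValueDemo and its method names are made up for this example):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class PassByValueDemo {
    // Reassigning the parameter only changes the method's local copy of the reference.
    static void reassign(Set<Integer> mySet) {
        mySet = new HashSet<>();
        mySet.add(1);
    }

    // Calling a method through the parameter changes the object the caller also refers to.
    static void mutate(Set<Integer> mySet) {
        mySet.add(1);
    }

    public static void main(String[] args) {
        Set<Integer> foo = new HashSet<>();
        reassign(foo);
        System.out.println(foo.size());   // prints 0: the caller's set was never touched
        mutate(foo);
        System.out.println(foo.size());   // prints 1: the shared object was modified

        try {
            mutate(Collections.<Integer>emptySet());
        } catch (UnsupportedOperationException e) {
            System.out.println("emptySet() cannot be modified");
        }
    }
}
```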

Your 'foo' referred to an empty set going into the test() call, the test call did not modify that object, and so it's still an empty set on return from there.
Within the test() method, 'mySet' is just a local reference, which refers to the original set (foo) on entry, and when you did the assignment of a new HashSet to that reference, you lost the reference to the original set. But these effects are all entirely local to the test() method, because java simply gave test() a duplicate of the reference to the original set.
Now, within test(), since you have a reference to the original object, you can modify that object. For instance, you could add elements to that set. But you can't change the reference in the calling function, you can only change what it refers to. So you can't replace the one collection with a different one, and if you wanted a HashSet in the first place, you'd have to new the HashSet in main().

Not sure I understand the question. In the test method, you are instantiating a new set and assigning it to the local mySet variable. mySet then no longer will reference the same set as foo does back in Main.
When you return from the method, foo still references the original emptySet() and the HashSet created in the method will be marked for garbage collection.

import java.util.HashSet;
import java.util.Set;
public class TestMethods {
public static void main(final String[] args) {
final Set<Integer> foo = java.util.Collections.emptySet();
test(foo);
}
public static void test(Set<Integer> mySet) {
// here mySet points to the same object as foo in main
mySet = new HashSet<Integer>();
// mySet now points to a new object created by your HashSet constructor call,
// any subsequent operations on mySet are no longer associated with foo, because
// they are no longer referencing the same object
}
}
How could I achieve the functional equivalent?
I am not sure if I understand this question, are you looking for a return?
public static Set<Integer> test(Set<Integer> mySet) {
for(Integer i : mySet){
// do something??
}
mySet = new HashSet<Integer>();
return mySet;
}
Now, if you assign foo to what test returns, you have the "functional equivalent"?

you should read this book:
"A Programmer's Guide to Java SCJP Certification: A Comprehensive Primer (3rd Edition)"

Java Object Oriented Design: Returning multiple objects in java

The code below throws a NullPointerException in Java.
public class New{
int i;
New(int i)
{
this.i = i;
}
public void func(New temp)
{
temp.i = 10;
temp = new New(20);
}
public static void main(String[] args)
{
New n = null;
n.func(n);
System.out.println("value "+ n.i);
}
}
The reason is that Java passes object references by value. If I want to return one object, I can return it from the function.
But if I have multiple objects, the only way I can return the object references is by putting them into another object, such as a container which holds references to all of them.
Is there a better way to do it?
In C++, I normally just pass the address of a pointer to handle this scenario. If I just want to return two objects of a single type, creating a container and passing it around is overkill.
What is the problem with returning multiple objects from a function? Why can't the semantics of functions in all these languages be changed?

Most often you create an object to hold the combination of objects you want to return.
For a more general-purpose solution, you can return a collection, an array, or some sort of tuple, such as Pair, Triple, etc. (the latter you will need to create).
Note, you don't generally pass a mutable object as a parameter, but return an immutable one:
public Pair<Integer, Integer> getLowHighTemp() {
int low = 0, high = 0; // placeholder values
// do stuff...
return new Pair<>(low, high);
}
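Since java.util has no Pair type, the class has to come from somewhere. A minimal sketch of what it could look like follows; Pair, Thermometer and lowHigh are hypothetical names for this example, not an existing library API.

```java
// Hypothetical minimal immutable pair.
final class Pair<A, B> {
    final A first;
    final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

class Thermometer {
    // Returns the lowest and highest reading as one immutable result object.
    // Assumes readings contains at least one element.
    static Pair<Integer, Integer> lowHigh(int[] readings) {
        int low = readings[0];
        int high = readings[0];
        for (int r : readings) {
            low = Math.min(low, r);
            high = Math.max(high, r);
        }
        return new Pair<>(low, high);
    }
}
```

A small purpose-built class with named fields (for example low and high) is often clearer for callers than a generic pair, at the cost of one more type.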

This is more of 2 questions than one.
Firstly, the problem with your code is that n is declared but never initialized: it is null when you call n.func(n), and that is what throws the exception.
Secondly if you would like to return 2 objects, you need to have a container object that will hold 2 objects.

You can return some kind of Collection. Returning a Map or List is pretty common.
