Creating a deep copy of a multi-level list? - java

I have:
ArrayList<ArrayList<ArrayList<Task>>> optimalPaths = new ArrayList<ArrayList<ArrayList<Task>>>();
I would like to create a deep copy of optimalPaths. The copy itself should contain no references whatsoever to optimalPaths. Would the following code work?
ArrayList<ArrayList<ArrayList<Task>>> altPaths = new ArrayList<ArrayList<ArrayList<Task>>>();
for (ArrayList<ArrayList<Task>> e : optimalPaths) {
    altPaths.add((ArrayList<ArrayList<Task>>) e.clone()); // Create deep copy of optimalPaths
}
I'm not sure if there are still references within altPaths on some level.

You can do it yourself, copying every level by hand (this assumes Task exposes a working clone() method):
ArrayList<ArrayList<ArrayList<Task>>> altPaths = new ArrayList<>();
for (ArrayList<ArrayList<Task>> outer : optimalPaths) {
    ArrayList<ArrayList<Task>> newOuter = new ArrayList<>();
    for (ArrayList<Task> inner : outer) {
        ArrayList<Task> newInner = new ArrayList<>();
        for (Task task : inner) {
            newInner.add((Task) task.clone());
        }
        newOuter.add(newInner);
    }
    altPaths.add(newOuter);
}

You can copy by serializing and then deserializing, provided the Task class implements Serializable and has no transient fields that you want to copy:
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream out = new ObjectOutputStream(bos);
out.writeObject(optimalPaths);
ByteArrayInputStream bis = new ByteArrayInputStream(bos.toByteArray());
ObjectInputStream in = new ObjectInputStream(bis);
ArrayList<ArrayList<ArrayList<Task>>> copied = (ArrayList<ArrayList<ArrayList<Task>>>) in.readObject();
Or use an external class to do that: SerializationUtils from Apache Commons.
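As a sketch, the serialization round-trip can be wrapped in a small generic helper using only the JDK (the `deepCopy` name and demo types are illustrative; the copied object graph must be Serializable):

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class DeepCopyDemo {
    // Deep-copies any Serializable object graph via an in-memory
    // serialization round-trip.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<ArrayList<String>> nested = new ArrayList<>();
        nested.add(new ArrayList<>(List.of("a", "b")));
        ArrayList<ArrayList<String>> copy = deepCopy(nested);
        copy.get(0).set(0, "changed");
        // Mutating the copy leaves the original untouched: no shared references.
        System.out.println(nested.get(0).get(0)); // prints "a"
    }
}
```

Since the copy is rebuilt from bytes, it cannot share any references with the original, at any nesting depth.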

Related

Groovy deep copy json map

I am trying to create a deep copy of a JSON map in groovy for a build config script.
I have tried the selected answer
def deepcopy(orig) {
    bos = new ByteArrayOutputStream()
    oos = new ObjectOutputStream(bos)
    oos.writeObject(orig); oos.flush()
    bin = new ByteArrayInputStream(bos.toByteArray())
    ois = new ObjectInputStream(bin)
    return ois.readObject()
}
from this existing question but it fails for JSON maps with java.io.NotSerializableException: groovy.json.internal.LazyMap
How can I create a deep copy of the JSON map?
Once you round-trip the map through JSON text (JsonOutput.toJson, then JsonSlurper again), you have the copy: every parse builds a fresh object graph.
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
def json = new JsonSlurper().parseText('''{"l1": {"l2": {"l3": 42}}}''')
json.l1.l2.l3 = 23
assert '''{"l2":{"l3":23}}''' == JsonOutput.toJson(json.l1)

Java: storing a big map in resources

I need to use a big file that contains String,String pairs and because I want to ship it with a JAR, I opted to include a serialized and gzipped version in the resource folder of the application. This is how I created the serialization:
ObjectOutputStream out = new ObjectOutputStream(
new BufferedOutputStream(new GZIPOutputStream(new FileOutputStream(OUT_FILE_PATH, false))));
out.writeObject(map);
out.close();
I chose to use a HashMap<String,String>, the resulting file is 60MB and the map contains about 4 million entries.
Now when I need the map and I deserialize it using:
final InputStream in = FileUtils.getResource("map.ser.gz");
final ObjectInputStream ois = new ObjectInputStream(new BufferedInputStream(new GZIPInputStream(in)));
map = (Map<String, String>) ois.readObject();
ois.close();
this takes about 10~15 seconds. Is there a better way to store such a big map in a JAR? I ask because I also use the Stanford CoreNLP library which uses big model files itself but seems to perform better in that regard. I tried to locate the code where the model files are read but gave up.
Your problem is that you zipped the data; store it as plain text instead.
The performance hit is most probably in unzipping the stream, and JARs are already compressed, so there is little space saved by storing the file zipped.
Basically:
Store the file in plain text
Use Files.lines(Paths.get("myfilename.txt")) to stream the lines
Consume each line with minimal code
Something like this, assuming data is in form key=value (like a Properties file):
Map<String, String> map = new HashMap<>();
Files.lines(Paths.get("myfilename.txt"))
     .map(s -> s.split("=", 2)) // limit of 2 in case a value contains '='
     .forEach(a -> map.put(a[0], a[1]));
Disclaimer: Code may not compile or work as it was thumbed in on my phone (but there's a reasonable chance it will work)
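The steps above can be sketched end to end with only the JDK (the file is a temp file created for the demo; real code would read the resource from the JAR instead):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

public class PlainTextMap {
    public static void main(String[] args) throws IOException {
        // Write a small key=value file to a temp location for the demo.
        Path file = Files.createTempFile("map-demo", ".txt");
        Files.write(file, List.of("alpha=1", "beta=2", "url=http://x?a=b"));

        Map<String, String> map = new HashMap<>();
        // split with a limit of 2 keeps '=' characters inside values intact
        try (Stream<String> lines = Files.lines(file)) {
            lines.map(s -> s.split("=", 2))
                 .forEach(a -> map.put(a[0], a[1]));
        }
        System.out.println(map.get("url")); // prints "http://x?a=b"
        Files.delete(file);
    }
}
```

Note the try-with-resources around the stream: Files.lines holds the file open until the stream is closed.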
What you can do is apply a technique from the book Java Performance: The Definitive Guide by Scott Oaks, which stores the zipped content of the object in a byte array. For this we need a wrapper class, which I call MapHolder here:
public class MapHolder implements Serializable {
    // This will contain the zipped content of my map
    private byte[] content;
    // My actual map, defined as transient as I don't want to serialize its
    // content but its zipped content
    private transient Map<String, String> map;

    public MapHolder(Map<String, String> map) {
        this.map = map;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream zip = new GZIPOutputStream(baos);
             ObjectOutputStream oos = new ObjectOutputStream(
                 new BufferedOutputStream(zip))) {
            oos.writeObject(map);
        }
        this.content = baos.toByteArray();
        out.defaultWriteObject();
        // Clear the temporary field content
        this.content = null;
    }

    private void readObject(ObjectInputStream in) throws IOException,
            ClassNotFoundException {
        in.defaultReadObject();
        try (ByteArrayInputStream bais = new ByteArrayInputStream(content);
             GZIPInputStream zip = new GZIPInputStream(bais);
             ObjectInputStream ois = new ObjectInputStream(
                 new BufferedInputStream(zip))) {
            this.map = (Map<String, String>) ois.readObject();
            // Clear the temporary field content
            this.content = null;
        }
    }

    public Map<String, String> getMap() {
        return this.map;
    }
}
Your code will then simply be:
final ByteArrayInputStream in = new ByteArrayInputStream(
        Files.readAllBytes(Paths.get("/tmp/map.ser")));
final ObjectInputStream ois = new ObjectInputStream(in);
MapHolder holder = (MapHolder) ois.readObject();
map = holder.getMap();
ois.close();
As you may have noticed, you no longer zip the content yourself; it is zipped internally while serializing the MapHolder instance.
You could consider one of many fast serialization libraries:
protobuf (https://github.com/google/protobuf)
flat buffers (https://google.github.io/flatbuffers/)
cap'n proto (https://capnproto.org)

How to copy new object java

In my program I want to save some state using a stack, but all the objects in the stack are always the same (the last one I entered).
Here is my save code:
public void saveState() {
    state.setMatrix(temp.clone()); // temp is int[][]
    state.setScore(score);         // score is int
    State newGameState = new State(state); // copy constructor
    stack.push(newGameState);
}
I guess I need to copy the state because the stack stores references. How should I do it? What did I do wrong?
Thanks.
You need to use deep copy. So, you can do one of the following:
use state.clone() (of course, you have to check if it creates a deep copy)
serialize and then deserialize object you want to copy
implement this by yourself
You are adding information to state, and then you save gameState. If this is not a typo, it seems you are not really saving anything new.
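One likely culprit in the posted code: `clone()` on an `int[][]` copies only the outer array, so the row arrays are still shared between all saved states. A deep copy of the matrix needs to clone each row, as in this sketch (method and class names are illustrative):

```java
public class MatrixCopy {
    // clone() on a 2D array is shallow: the rows are shared references.
    // Copy each row explicitly to get a truly independent matrix.
    static int[][] deepCopyMatrix(int[][] src) {
        int[][] copy = new int[src.length][];
        for (int i = 0; i < src.length; i++) {
            copy[i] = src[i].clone(); // cloning an int[] does fully copy primitives
        }
        return copy;
    }

    public static void main(String[] args) {
        int[][] temp = {{1, 2}, {3, 4}};
        int[][] shallow = temp.clone();
        int[][] deep = deepCopyMatrix(temp);
        temp[0][0] = 99;
        System.out.println(shallow[0][0]); // prints 99: row still shared
        System.out.println(deep[0][0]);    // prints 1: fully independent
    }
}
```

With only `temp.clone()`, every State pushed on the stack ends up seeing the same row arrays, which would explain all stack entries looking identical.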
Use ObjectOutputStream and ObjectInputStream.
Write object:
FileOutputStream fout = new FileOutputStream(file-full-path);
ObjectOutputStream oos = new ObjectOutputStream(fout);
oos.writeObject(your-object);
oos.close();
Read Object
FileInputStream fin = new FileInputStream(file-full-path);
ObjectInputStream ois = new ObjectInputStream(fin);
Your-Class yourObject = (Your-Class) ois.readObject();
ois.close();

Implementing duplication or cloning in groovy/java

I have a method to duplicate(clone) as below
static duplicateRecord(record) {
    def copyRecord = [:]
    record.each { fieldname, value ->
        if (value) {
            copyRecord[(fieldname)] = value?.clone()
        }
    }
    return copyRecord
}
Is there a clone() method in Groovy/Java that accomplishes the same functionality?
This should do it.
Copied from: https://stackoverflow.com/a/13155429/889945
// standard deep copy implementation
def deepcopy(orig) {
    bos = new ByteArrayOutputStream()
    oos = new ObjectOutputStream(bos)
    oos.writeObject(orig); oos.flush()
    bin = new ByteArrayInputStream(bos.toByteArray())
    ois = new ObjectInputStream(bin)
    return ois.readObject()
}
I think you would need to implement the Cloneable interface. This post shows how to clone an object in Groovy without implementing the Cloneable interface, though I have not tested it.
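On the Java side, a deep clone() via Cloneable might look like this sketch (class and field names are hypothetical; the key point is that super.clone() is shallow, so mutable fields must be re-cloned by hand):

```java
public class Record implements Cloneable {
    private String name;
    private int[] values; // mutable field that must be copied explicitly

    public Record(String name, int[] values) {
        this.name = name;
        this.values = values;
    }

    @Override
    public Record clone() {
        try {
            Record copy = (Record) super.clone();
            // super.clone() copies field references only; deep-copy mutable state.
            copy.values = this.values.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: we implement Cloneable
        }
    }

    public int[] getValues() { return values; }

    public static void main(String[] args) {
        Record original = new Record("r1", new int[]{1, 2, 3});
        Record copy = original.clone();
        copy.getValues()[0] = 42;
        System.out.println(original.getValues()[0]); // prints 1: array not shared
    }
}
```

Immutable fields like the String can be left alone; only mutable state needs the extra copy.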

How to test a Weka Text Classification (FilteredClassifier)

Looked at lots of examples for this, and so far no luck. I'd like to classify free text.
Configure a text classifier. (FilteredClassifier using StringToWordVector and LibSVM)
Train the classifier (add in lots of documents, train on filtered text)
Serialize the FilteredClassifier to disk, quit the app
Then later
Load up the serialized FilteredClassifier
Classify stuff!
It goes ok up to when I try to read from disk and classify things. All the documents and examples show the training list and testing list being built at the same time, and in my case, I'm trying to build a testing list after the fact.
A FilteredClassifier alone is not enough to create a testing Instance with the same "dictionary" as the original training set, so how do I save everything I need to classify at a later date?
http://weka.wikispaces.com/Use+WEKA+in+your+Java+code just says "Instances loaded from somewhere" and doesn't say anything about using a similar dictionary.
ClassifierFramework cf = new WekaSVM();
if (!cf.isTrained()) {
    train(cf);          // Train, save to disk
    cf = new WekaSVM(); // reloads from file
}
cf.test("this is a test");
Ends up throwing
java.lang.ArrayIndexOutOfBoundsException: 2
    at weka.core.DenseInstance.value(DenseInstance.java:332)
    at weka.filters.unsupervised.attribute.StringToWordVector.convertInstancewoDocNorm(StringToWordVector.java:1587)
    at weka.filters.unsupervised.attribute.StringToWordVector.input(StringToWordVector.java:688)
    at weka.classifiers.meta.FilteredClassifier.filterInstance(FilteredClassifier.java:465)
    at weka.classifiers.meta.FilteredClassifier.distributionForInstance(FilteredClassifier.java:495)
    at weka.classifiers.AbstractClassifier.classifyInstance(AbstractClassifier.java:70)
    at ratchetclassify.lab.WekaSVM.test(WekaSVM.java:125)
Serialize your Instances header, which holds the definition of the training data (the "dictionary" you mention), at the same time as your classifier:
Instances trainInstances = ... // your training data
Instances trainHeader = new Instances(trainInstances, 0);
trainHeader.setClassIndex(trainInstances.classIndex());

OutputStream os = new FileOutputStream(fileName);
ObjectOutputStream objectOutputStream = new ObjectOutputStream(os);
objectOutputStream.writeObject(classifier);
if (trainHeader != null)
    objectOutputStream.writeObject(trainHeader);
objectOutputStream.flush();
objectOutputStream.close();
To deserialize:
Classifier classifier = null;
Instances trainHeader = null;
InputStream is = new BufferedInputStream(new FileInputStream(fileName));
ObjectInputStream objectInputStream = new ObjectInputStream(is);
classifier = (Classifier) objectInputStream.readObject();
try { // see if we can load the header
    trainHeader = (Instances) objectInputStream.readObject();
} catch (Exception e) {
    // no header was stored; ignore
}
objectInputStream.close();
Use trainHeader to create a new Instance:
int numAttributes = trainHeader.numAttributes();
double[] vals = new double[numAttributes];
for (int i = 0; i < numAttributes - 1; i++) {
    Attribute attribute = trainHeader.attribute(i);
    double value;
    if (attribute.isNominal() || attribute.isString()) {
        value = attribute.indexOfValue(myStrVal); // get myStrVal from your source
    } else {
        value = myNumericVal; // numeric attribute: get myNumericVal from your source
    }
    vals[i] = value;
}
vals[numAttributes - 1] = Instance.missingValue(); // the class value is unknown
Instance instance = new Instance(1.0, vals);
instance.setDataset(trainHeader);
return instance;
