How to implement save/load functionality of Java application data/state? - java

I have an application of POJOs (plain old java objects) representing my data.
While running, the application manipulates and remembers data as desired.
Now I want to implement a save/load feature.
I am NOT asking about basic file I/O.
I am NOT asking whether ObjectOutputStream exists.
Options I have found are those such as:
1) JSON/XML/YAML libraries such as Gson, Jackson
2) Roll your own binary file format marking everything as Serializable with a Serialization Proxy pattern.
Option 1 is unsuitable because my data model can feature cyclic references. Gson resulted in a stack overflow.
Option 2 is unsuitable because the files should be cross platform and independent of JVM; it should work on desktop and android java.
A properties file is also obviously unsuitable due to the complexity of the model.
Please do not attack my use case; my data model is perfectly well designed. The example may not be.
I will now give example code of the kind of structure that needs to be saved.
class Application {
//This College is my top level object. It could correspond to an individual save file.
College college = new College();
//I would love to be able to just throw this guy into a file.
SomeLibrary.writeToFile(college);
//And read another back.
College college2 = SomeLibrary.readFromFile(anotherCollege);
}
class College {
//The trees are implemented recursively, so this is actually just the root of each tree.
Tree<Course> artCourseTree;
Tree<Course> engineeringCourseTree;
Tree<Course> businessCourseTree;
List<Student> maleStudents;
List<Student> femaleStudents;
}
class Course {
//Each course only has 2 students in this example. Ignore.
Student student1;
Student student2;
List<Exam> examsInCourse;
LocalDate courseStartDate;
Period duration;
}
class Student {
String name;
List<Exam> listOfExamsTaken;
}
class Exam {
Student studentTakingIt;
LocalDate dateTaken;
BigDecimal score;
}
As you can see, Exams are intended to be the atomic object in this model at the bottom of the hierarchy. However, not only are they referenced by both Students and Courses, but they also refer back up to a Student and contain nonprimitives such as LocalDate and BigDecimal. The model is given meaning by referencing different subsets of Exams in different Courses and Students.
I need to save the relationships, the arrangement of these things, an arbitrary number of these things, as well as the data they hold.
What hope do I have of saving and loading such a model?
What options are there to implement a save/load feature on such a model, with such constraints?
Is it really industry standard for every java program to roll its own binary file format and create a monstrous apparatus to serialize and deserialize everything? It's that or JSON? What am I missing here? Do I have to just snapshot the VM somehow? Why is there not a standard practice for this?

Circular references are a common use case and can be handled with Jackson's @JsonManagedReference and @JsonBackReference annotations. Check out this SO answer for more details. Another option is to implement a custom serializer and resolve the circular references yourself. Here is an example of the same.
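As a rough illustration of resolving the cycle yourself (a sketch only, not tied to any library): give each Student an id, write every Exam with the id of its student instead of the object, and re-link both directions on load. The pipe-separated text format and the CycleSafeCodec class are made up for this example, Student and Exam are trimmed versions of the classes in the question, and names are assumed not to contain '|' or line breaks.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class Student {
    final String name;
    final List<Exam> exams = new ArrayList<>();
    Student(String name) { this.name = name; }
}

class Exam {
    final Student student;      // back-reference: this is what creates the cycle
    final LocalDate dateTaken;
    final BigDecimal score;
    Exam(Student student, LocalDate dateTaken, BigDecimal score) {
        this.student = student;
        this.dateTaken = dateTaken;
        this.score = score;
        student.exams.add(this); // keep both directions in sync
    }
}

// Hypothetical helper: writes ids instead of object references.
final class CycleSafeCodec {
    static String write(List<Student> students) {
        Map<Student, Integer> ids = new IdentityHashMap<>();
        StringBuilder out = new StringBuilder();
        // Pass 1: every student gets an id equal to its position in the file.
        for (Student s : students) {
            ids.put(s, ids.size());
            out.append("student|").append(ids.get(s)).append('|').append(s.name).append('\n');
        }
        // Pass 2: exams refer to their student by id, so no recursion happens.
        for (Student s : students) {
            for (Exam e : s.exams) {
                out.append("exam|").append(ids.get(e.student)).append('|')
                   .append(e.dateTaken).append('|').append(e.score).append('\n');
            }
        }
        return out.toString();
    }

    static List<Student> read(String text) {
        List<Student> students = new ArrayList<>();
        for (String line : text.split("\n")) {
            String[] f = line.split("\\|", -1);
            if (f[0].equals("student")) {
                students.add(new Student(f[2]));
            } else if (f[0].equals("exam")) {
                // Look the owner up by id; the Exam constructor re-links it.
                Student owner = students.get(Integer.parseInt(f[1]));
                new Exam(owner, LocalDate.parse(f[2]), new BigDecimal(f[3]));
            }
        }
        return students;
    }
}
```

Exams that are shared between Courses and Students would need ids of their own, handled the same way, so that each Exam is written once and referenced everywhere else.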
However, do consider the following aspects before going ahead with using files as a database:
You will have to manage concurrent writes yourself. If they are not handled correctly, data can be corrupted or lost, because plain files give you no ACID guarantees.
The solution does not scale: as the file grows, the time to serialize and deserialize it increases proportionately.
You won't be able to query the data stored in the file easily. You will always have to deserialize it first and then query the POJOs.
I'd highly recommend checking out SQLite, which is a small, fast, self-contained, high-reliability, full-featured SQL database engine.

Related

How to identify a specific object byte offset value and extract it directly from a serialized binary file?

I have the following:
public class Building implements Serializable {
int buildingID;
String buildingName;
List<Person> personList;
...
}
class Person implements Serializable {
int age;
String name;
byte[] importantData;
...
}
I'm planning to serialize Building as a binary file. We can assume that personList will hold numerous Person entries (3GB+). In the future, I plan to use an existing Building file to extract a specific importantData from a specific Person entry in personList. Currently, the most straightforward way for me to do this would be to deserialize the file back to Building object to get the specific importantData. However, since this Building file is pretty big, the deserializing process will take some time.
I would like to do this in a much faster way by simply reading the data directly from the serialized file (skipping deserialization). The problem is that I'm not exactly sure how I can obtain or learn the byte offset value where importantData is actually stored in the file. Additionally, is it possible to get this offset value without running a byte comparison on the serialized Building file itself?
Suggestions
Don't use Java serialization and the plain file system to manage large datasets.
It is always better to use a dedicated data store to manage data across multiple service instances.
Use some kind of columnar data store, since you need to fetch specific parts of a record.
Data consistency can be improved by using a shared data store (still not guaranteed; it depends on the application logic and data store support).
Java serialization can make it difficult to change the structure of the data in the future.
If serialization is the only option, then look into Kryo or protobuf-based serialization.
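If you do stay with a single file, one way to get at a known byte offset is to control the format yourself instead of using ObjectOutputStream. The following is only a sketch under that assumption: each importantData payload is written length-prefixed, its offset is recorded while writing, and a later read seeks straight to it. IndexedBlobFile and its method names are hypothetical, not part of any library.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for a hand-rolled, offset-addressable file.
final class IndexedBlobFile {

    /** Writes each payload as [int length][bytes] and returns the offset of every record. */
    static List<Long> write(Path file, List<byte[]> payloads) {
        List<Long> offsets = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(0); // start from an empty file
            for (byte[] payload : payloads) {
                offsets.add(raf.getFilePointer()); // remember where this record starts
                raf.writeInt(payload.length);
                raf.write(payload);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return offsets;
    }

    /** Reads a single record without touching the rest of the file. */
    static byte[] readAt(Path file, long offset) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            byte[] payload = new byte[raf.readInt()];
            raf.readFully(payload);
            return payload;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned offsets are the index; in practice they would have to be stored as well, for example in a small side file keyed by Person. RandomAccessFile is unbuffered, so a real writer for 3GB+ of data would want buffering on top of this.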

schema.org deserialize in Java

I am trying to deserialize schema.org's objects but every time I face a wall of complexity. I'm not sure if it's my fault or no one ever did this. I tried several of schema.org's items and all of them sooner or later encounter the same issue (for obvious reasons actually). The problem lies in properties like "Author". For example, a cooking recipe has an author. Schema.org/Recipe says that the author can be a Person or an Organisation. Both are schema.org objects.
Until now it's easy. I get a schema for a Recipe and pass it to jsonschema2pojo.org and obtain my classes.
Then with Gson
Gson gson = new Gson();
Recipe recipe = gson.fromJson(myString,Recipe.class);
myString is the json-ld I used to generate the Recipe classes. Once I try to download some more recipes from the web, I immediately encounter schemas where the Author is not a schema.org item, but a simple String. From this point on I am blocked. The parser is stuck, exactly like google's schemaorg-java parser.
I did read that some people modify the class to have authors as Object and then modify the getters and setters. A deserializer should be made for the whole Recipe class, but it must behave differently only for Author (and other similar parameters).
Isn't there an easier way to deserialize schema.org in java? Am I googling wrong?
If you're using GSON you'll need to create custom deserialisers. Your best bet is to read the type of the author value and if it is a string create a custom Author POJO with the name set as the string.
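The branching itself is small. The sketch below shows only that decision and deliberately leaves out the Gson wiring, so it runs on the JDK alone: the parsed author value is assumed to arrive either as a String or as a Map with "@type" and "name" keys, which is how a generic JSON parser would typically hand it over. Author and AuthorCoercion are hypothetical names, and arrays of authors are not handled.

```java
import java.util.Map;

// Hypothetical POJO that both shapes of "author" are coerced into.
class Author {
    final String type;  // e.g. "Person" or "Organization"; null when the markup did not say
    final String name;
    Author(String type, String name) {
        this.type = type;
        this.name = name;
    }
}

final class AuthorCoercion {
    /** Accepts either a bare string or an object with "@type" and "name" keys. */
    static Author fromParsedValue(Object value) {
        if (value instanceof String) {
            // "author": "Jane Doe"  ->  Author with only the name set
            return new Author(null, (String) value);
        }
        if (value instanceof Map) {
            // "author": {"@type": "Person", "name": "Jane Doe"}
            Map<?, ?> object = (Map<?, ?>) value;
            Object type = object.get("@type");
            Object name = object.get("name");
            return new Author(type == null ? null : type.toString(),
                              name == null ? null : name.toString());
        }
        // Anything else is rejected rather than guessed at.
        throw new IllegalArgumentException("Unsupported author value: " + value);
    }
}
```

With Gson, the same two branches would sit inside a custom deserializer registered for the author type.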
You've chosen a difficult language to implement this parser. Strongly typed languages are going to have a tough time deserialising data from a loosely typed language.
On top of that, schema.org isn't well defined, and people will get their schema.org markup wrong. It's up to you to decide how to handle it: do you reject all data that doesn't conform exactly to the schema, or do you try to coerce it?
I'm curious what you're working on. Is it a web service?

XStream - treat objects with the same identifier as one

I am working on a legacy system where XStream is being used to serialize objects in order to keep two databases in sync. A new object is first stored in one database, then the stored object is serialized and sent to be stored in the other database.
Up until recently, the structure of the object in question was like this:
public class Project {
List<Milestone> milestones;
[...]
}
But, after changes to the requirements, the structure is supposed to be like this:
public class Project {
List<Goal> goals;
}
public class Goal {
List<Milestone> milestones;
}
In order to keep the milestones of legacy data, which knew nothing about goals, the final structure of project was this:
public class Project {
List<Goal> goals;
List<Milestone> milestones;
}
So, there are two paths from a Project to a Milestone, one directly and one through a Goal. The problem occurs when this structure is deserialized and stored. When it is deserialized by XStream, the objects for the Milestones connected to the Project directly become different objects from the ones connected through Goals, even though they have the same id.
As long as Hibernate's Session#merge() was used to persist this object, it was no problem, since merge() doesn't care about the object identifiers as long as the db identifiers are the same.
But I can no longer use merge() for this purpose, and have to rely on Session#save() instead. And save() DOES care about the object identifiers! So now I get an org.hibernate.NonUniqueObjectException when trying to store the deserialized object.
I figure the least intrusive way to solve this is, if it's possible, to make XStream create 1 object per database id. But is this possible?
After some consideration, it is apparent to me that the problem is not XStream, as it has mechanisms for object references. The problem is another nifty "feature" of the project I'm working on: it has 2 versions of each domain class, one for communication with Hibernate, and one for "logic use" (don't ask me why...). In the conversion between these two versions (which basically moves values from one object to another), objects are new-ed uncritically, resulting in the same "Hibernate-object" being transformed into multiple "Java-objects". So I can't really blame XStream for not understanding that these should be the same :)
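A common way to stop a conversion step from creating duplicates is to remember what has already been converted, keyed by database id. This is only a sketch of that idea: MilestoneEntity, Milestone and MilestoneConverter are hypothetical stand-ins for the two versions of the domain class and the code that maps between them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Hibernate-facing version of the class.
class MilestoneEntity {
    final long id;
    final String title;
    MilestoneEntity(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

// Hypothetical stand-in for the "logic use" version of the class.
class Milestone {
    final long id;
    final String title;
    Milestone(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

final class MilestoneConverter {
    // One entry per database id, so each id maps to exactly one converted object.
    private final Map<Long, Milestone> alreadyConverted = new HashMap<>();

    Milestone convert(MilestoneEntity entity) {
        // Only create a new Milestone the first time this id is seen.
        return alreadyConverted.computeIfAbsent(
                entity.id, id -> new Milestone(id, entity.title));
    }
}
```

Using one converter instance per Project conversion keeps the cache short-lived, and both paths (Project to Milestone, and Project to Goal to Milestone) then end up at the same object.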

Transfer of a Java Serialized Object

Is it possible to declare an instance of a serializable object in one Java program / class, then repeat the definitions of the internal objects in a different program /class entirely, and load in a big complex object from a data file? The goal is to be able to write an editor for items that's kept locally on my build machine, then write the game itself and distribute it to people who would like to play the game.
I'm writing a game in Java as a hobbyist project. Within my game, there's a family of classes that extend a parent class, GameItem. Items might be in various families like HealingPotion, Bomb, KeyItem, and so on.
class GameItem implements Serializable {
String ItemName;
String ImageResourceLocation;
....
}
What I want to do is include definitions of how to create each item in a particular family of items, but then have a big class called GameItemList, which contains all possible items that can occur as you play the game.
class GameItemList implements Serializable {
LinkedList<GameItem> gameItemList;
//methods here like LookUpByName, LookUpByIndex that return references to an item
}
Maybe at some point - as the player starts a new game, or as the game launches, do something like:
//create itemList
FileInputStream fileIn = new FileInputStream("items.dat");
ObjectInputStream in = new ObjectInputStream(fileIn);
GameItemList allItems = (GameItemList)in.readObject();
in.close();
//Now I have an object called allItems that can be used for lookups.
Thanks guys, any comments or help would be greatly appreciated.
When you serialize an object, every non-static field of the object is serialized, unless it is marked transient. And this behavior is of course recursive. So yes, you can serialize an object, then deserialize it, and the deserialized object will have the same state as the serialized one. A different behavior would make serialization useless.
I wouldn't use native serialization for long-term storage of data, though. Serialized objects are hard to inspect, impossible to modify using a text editor, and maintaining backward compatibility with older versions of the classes is hard. I would use a more open format like XML or JSON.
Yes, that is possible. If an object is correctly serialized, it can be deserialized on any other machine as long as the application running there knows the definition of the class to be deserialized.
This will work, but Java serialization is notorious for making it hard to "evolve" classes -- the internal representation is explicitly tied to the on-disk format. You can work around this with custom reader / writer methods, but you might consider a more portable format like JSON or XML instead of object serialization.
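To make the round trip concrete, here is a small in-memory sketch. The GameItem below is a simplified version of the one in the question with a transient field added for illustration (cachedImage is hypothetical), and SerializationRoundTrip is just a helper name. Declaring serialVersionUID explicitly avoids the default, which is computed from the class's shape and so breaks old files after fairly harmless changes to the class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class GameItem implements Serializable {
    // Explicit version id: editor and game must agree on it.
    private static final long serialVersionUID = 1L;

    String itemName;
    String imageResourceLocation;
    transient Object cachedImage; // not written to the stream; null after loading

    GameItem(String itemName, String imageResourceLocation) {
        this.itemName = itemName;
        this.imageResourceLocation = imageResourceLocation;
    }
}

final class SerializationRoundTrip {
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            // The reading program does not have the class on its classpath.
            throw new IllegalStateException(e);
        }
    }
}
```

The editor and the game would both need the same GameItem class (same package and name) on their classpaths; the bytes produced by toBytes are what would go into items.dat.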

Permanent collections in Java

My assignment asks me to develop a small-scale Student Accommodation Management System:
The application should be developed using object-oriented concepts using Student class and Apartment class, implementing the appropriate data fields and methods for the classes. Data may be stored in collections i.e. array of objects, vectors, etc. or into data files except a database.
So far, I have worked with Sets. I am not sure if it is the right way, but I added HashSets to my classes. Example:
public static Set<Apartment> listOfApartments = new HashSet<Apartment>();
// (in the Apartment class)
Now I have just realized that I actually need persistent collections, or some other solution, to store the data permanently.
Any Suggestions?
If I were you I would use something such as an ArrayList to store data, especially students. Sets do not allow duplicate data, so this could cause problems down the line.
With regards to persisting your data, you should take a look at the ObjectOutputStream to store your objects and to the ObjectInputStream to load them back into your application. You can take a look here for an ObjectStreams tutorial.
What I would recommend though is to use something such as XStream (you can see how to use it here). This will allow your application to store data in a human readable way (which is helpful for debugging) and will also allow your data to be read by different programming languages.
If Apartment is Serializable, then a Set<Apartment> such as a HashSet is also Serializable and doesn't require any extra work to persist using the java.io classes.
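To make a class Serializable, you must:
make it implement the interface java.io.Serializable
make sure every non-transient instance field is itself serializable
The class itself does not need a default constructor; only its nearest non-serializable superclass needs an accessible no-arg constructor, and Object already has one.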
It is that easy
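For example, saving and reloading the whole set could look roughly like this. It is a sketch only: ApartmentStore is a hypothetical helper, and the Apartment fields are invented for the example. Note that a static field such as listOfApartments is not written when an individual Apartment is serialized, so the set has to be passed to the stream explicitly.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

class Apartment implements Serializable {
    private static final long serialVersionUID = 1L;
    final String number; // invented field, used as the apartment's identity
    final int rooms;     // invented field

    Apartment(String number, int rooms) {
        this.number = number;
        this.rooms = rooms;
    }

    // A HashSet needs equals/hashCode to recognise the same apartment after loading.
    @Override public boolean equals(Object o) {
        return o instanceof Apartment && ((Apartment) o).number.equals(number);
    }
    @Override public int hashCode() {
        return number.hashCode();
    }
}

final class ApartmentStore {
    static void save(Path file, Set<Apartment> apartments) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new HashSet<>(apartments)); // the whole set in one call
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @SuppressWarnings("unchecked")
    static Set<Apartment> load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Set<Apartment>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling save when the program exits and load when it starts is enough to keep the data between runs.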
