What is the purpose of Serialization in Java?

I have read quite a number of articles on Serialization and how it is so nice and great but none of the arguments were convincing enough. I am wondering if someone can really tell me what is it that we can really achieve by serializing a class?

Let's define serialization first, then we can talk about why it's so useful.
Serialization is simply turning an existing object into a byte array. This byte array represents the class of the object, the version of the object, and the internal state of the object. This byte array can then be used between JVMs running the same code to transmit/read the object.
Why would we want to do this?
There are several reasons:
Communication: If you have two machines that are running the same code, and they need to communicate, an easy way is for one machine to build an object with information that it would like to transmit, and then serialize that object to the other machine. It's not the best method for communication, but it gets the job done.
Persistence: If you want to store the state of a particular operation in a database, it can be easily serialized to a byte array, and stored in the database for later retrieval.
Deep Copy: If you need an exact replica of an Object, and don't want to go to the trouble of writing your own specialized clone() method, simply serializing the object to a byte array, and then de-serializing it to another object achieves this goal.
Caching: Really just an application of the above, but sometimes an object takes 10 minutes to build, but would only take 10 seconds to de-serialize. So, rather than hold onto the giant object in memory, just cache it out to a file via serialization, and read it in later when it's needed.
Cross JVM Synchronization: Serialization works across different JVMs that may be running on different architectures.
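To make the byte-array idea and the deep-copy trick above concrete, here is a minimal sketch using the standard ObjectOutputStream and ObjectInputStream classes. The class names SerializationDemo and Order and the helper names toBytes, fromBytes and deepCopy are made up for illustration; they are not from any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SerializationDemo {
    // Turn an object, and everything it references, into a byte array.
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild an object from bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Deep copy: serialize, then immediately deserialize.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        return (T) fromBytes(toBytes(original));
    }

    public static void main(String[] args) {
        Order original = new Order("alice");
        original.items.add("book");

        Order copy = deepCopy(original);
        copy.items.add("pen"); // changes only the copy

        System.out.println(original.items); // [book]
        System.out.println(copy.items);     // [book, pen]
    }
}

// Hypothetical example class; any class that implements Serializable would do.
class Order implements Serializable {
    private static final long serialVersionUID = 1L;
    final String customer;
    final List<String> items = new ArrayList<>();

    Order(String customer) {
        this.customer = customer;
    }
}
```

The same byte array could just as well be written to a socket or a database column, which is all the communication and persistence cases really amount to.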

While you're running your application, all of its objects are stored in memory (RAM). When you exit, that memory gets reclaimed by the operating system, and your program essentially 'forgets' everything that happened while it was running. Serialization remedies this by letting your application save objects to disk so it can read them back the next time it starts. If your application is going to provide any way of saving/sharing a previous state, you'll need some form of serialization.

I can share my story, and I hope it will give some idea of why serialization is necessary, although the answers to your question are already remarkably detailed.
I had several projects that needed to load and read a bunch of text files. The files contained stop words, biomedical verbs, biomedical abbreviations, words semantically connected to each other, etc. The contents of these files are simple: words!
Now for each project, I needed to read the words from each of these files and put them into different arrays; as the contents of the file never changed, it became a common, however redundant, task after the first project.
So, what I did is that I created one object to read each of these files and to populate individual arrays (instance variables of the objects). Then I serialized the objects and then for the later projects, I simply deserialized them. I didn't have to read the files and populate the arrays again and again.
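A rough sketch of that "build once, deserialize afterwards" pattern might look like the following. The class name WordListCache, the method loadOrBuild and the in-line word list are all assumptions made for the example; in the real projects the builder would be the code that reads and parses the text files.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class WordListCache {
    // Return the cached list if the cache file exists; otherwise build it once and cache it.
    @SuppressWarnings("unchecked")
    static ArrayList<String> loadOrBuild(Path cacheFile, Supplier<ArrayList<String>> builder) {
        try {
            if (Files.exists(cacheFile)) {
                try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(cacheFile))) {
                    return (ArrayList<String>) in.readObject();
                }
            }
            ArrayList<String> words = builder.get();
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(cacheFile))) {
                out.writeObject(words);
            }
            return words;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path cache = Files.createTempFile("stopwords", ".ser");
        Files.delete(cache); // start without a cache file

        // Stand-in for the slow step (reading and parsing the text files).
        Supplier<ArrayList<String>> slowBuild = () -> new ArrayList<>(List.of("a", "an", "the"));

        ArrayList<String> first = loadOrBuild(cache, slowBuild); // builds and writes the cache
        ArrayList<String> second = loadOrBuild(cache, () -> {
            throw new IllegalStateException("builder should not run when the cache exists");
        });

        System.out.println(first.equals(second)); // true
        Files.deleteIfExists(cache);
    }
}
```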

In essence:
Serialization is the process of converting a set of object instances that contain references to each other into a linear stream of bytes, which can then be sent through a socket, stored to a file, or simply manipulated as a stream of data.
See uses from Wiki:
Serialization has a number of advantages. It provides:
a method of persisting objects which is more convenient than writing their properties to a text file on disk, and re-assembling them by reading this back in.
a method of issuing remote procedure calls, e.g., as in SOAP
a method for distributing objects, especially in software componentry such as COM, CORBA, etc.
a method for detecting changes in time-varying data.

The most obvious is that you can transmit the serialized object over a network, and the recipient can construct a duplicate of the original instance. Likewise, you can save a serialized structure to a file system.
Also, note that serialization is recursive, so you can serialize an entire heterogeneous data structure in one swell foop, if desired.
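As a small illustration of the recursive part, the sketch below pushes a mixed list (a String, an Integer and a map of lists) through a single writeObject call. The class name GraphDemo and the helper roundTrip are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class GraphDemo {
    // Serialize to an in-memory buffer and read the result straight back.
    static Object roundTrip(Object root) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(root); // walks the whole structure reachable from root
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        HashMap<String, List<Integer>> scores = new HashMap<>();
        scores.put("alice", new ArrayList<>(List.of(1, 2, 3)));

        ArrayList<Object> data = new ArrayList<>();
        data.add("text");
        data.add(42);
        data.add(scores);
        data.add(scores); // the same map instance, referenced twice

        ArrayList<Object> copy = (ArrayList<Object>) roundTrip(data);

        System.out.println(copy.equals(data));          // true
        System.out.println(copy.get(2) == copy.get(3)); // true: the shared reference survives
    }
}
```

Everything reachable from the root has to be Serializable, otherwise writeObject fails with a NotSerializableException.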

Serialized objects maintain state across space (they can be transferred over a network, a file system, etc.) and across time (they can outlive the JVM that created them).
Sometimes this is useful.

I use serialized objects to standardize the arguments I pass to functions or class constructors. Passing one serialized bean is much cleaner than a long list of arguments. The result is code that is easier to read and debug.

For the simple purpose of learning (notice, I said learning, I did not say best, or even good, but just for the sake of understanding stuff), you could save your data to a text file on the computer, then have a program that reads that info, and based on the file, you could have your program respond differently. If you were more advanced, it wouldn't necessarily have to be a txt file, but something else.
Serializing on the other hand, puts things directly into computer language. It's like you're telling a Spanish computer something in Spanish, rather than telling it something in French, forcing it to learn French, then save things into its native Spanish by translating everything. Not the most tech-intensive answer, I'm just trying to create an understandable example in a common language format.
Serialization is also faster, because in Java, objects are handled on the heap, and take much longer than if they were represented as primitives on the stack. Speed, speed, speed. And less file processing from a programmer point of view.

One classic example of serialization in daily life is the "Save Game" option in computer games. When the player decides to save his progress, the application writes the current state of the game to a file via serialization, and when the player chooses "Load Game", the serialized file is read and the game state is re-created.
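A toy version of such a save/load pair could look like this. GameState and its fields are hypothetical, and a real game would have far more state; the transient field is there only to show that fields marked that way are left out of the save file.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SaveGameDemo {
    // Hypothetical game state; only non-transient fields end up in the save file.
    static class GameState implements Serializable {
        private static final long serialVersionUID = 1L;
        String playerName;
        int level;
        transient long sessionStartMillis; // skipped when saving, 0 after loading
    }

    // "Save Game": write the state object to a file.
    static void save(GameState state, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "Load Game": read the file and re-create the state object.
    static GameState load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (GameState) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        GameState state = new GameState();
        state.playerName = "hero";
        state.level = 3;
        state.sessionStartMillis = System.currentTimeMillis();

        Path file = Files.createTempFile("savegame", ".ser");
        save(state, file);
        GameState loaded = load(file);

        System.out.println(loaded.playerName + " " + loaded.level); // hero 3
        System.out.println(loaded.sessionStartMillis);              // 0
        Files.delete(file);
    }
}
```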

Related

Partial deserialization and serialization in Java?

There are a huge number of libraries and approaches out there to serialize and de-serialize objects in Java.
What I would like to do involves rather large and complex objects which need to get sent back and forth between processing nodes.
However, each node only is interested in one or a few, usually small parts of the whole object. The processing node processes that part and creates a new part that would need to get spliced into the existing serialized object before it gets sent on.
For this, two things would be of high importance:
being able to just deserialize parts of the serialized object (and thus save parsing/deserialization time, object creation time, memory...) and to also add the serialization of some new part to the existing serialized object (again saving time and memory) -- skipping the unwanted parts in the serialized version should be extremely fast and efficient and should ideally be possible in a streaming mode, without the need to keep the whole serialized data in memory at once
overall compact and fast serialization and deserialization.
I am pretty flexible as to how much automation I get for actually creating typed objects versus untyped maps and lists: if all else fails I would be able to represent the whole object as a nested data structure of just maps, arrays and the basic datatypes boolean, String and Number.
UPDATE: forgot to mention two additional, rather important requirements:
the solution must be possible with the existing objects, i.e. it is not possible to re-implement the current object using, e.g., a different collections class.
ideally the solution should be based on open-source software because the software I need this for will be published itself as open-source.
It sounds like you're planning a design where a whole bunch of data is sent to a processing node, and that node will only read/modify/write a small part of it. But then will send the whole bundle on to another node.
Why not have the host that has all the data figure out which node needs which data, and only send that data? Then processing can happen in parallel, instead of daisy-chain. And your total network traffic will be less than every node sending a full copy of everything: O(n*m).
It might be worth designing your own message format, potentially based on JSON, binary, or something else.
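If you do go the custom binary route, one possible layout (purely an assumption on my part, not an existing format) is to write every part as a name, a length and the raw bytes. A reader can then jump over the parts it does not care about without decoding them. SectionedMessage, write and readPart are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class SectionedMessage {
    // Layout: part count, then for each part: name, payload length, payload bytes.
    static byte[] write(Map<String, byte[]> parts) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(parts.size());
            for (Map.Entry<String, byte[]> part : parts.entrySet()) {
                out.writeUTF(part.getKey());
                out.writeInt(part.getValue().length);
                out.write(part.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Return the payload of one part, or null if it is absent; other payloads are skipped, not decoded.
    static byte[] readPart(InputStream source, String wanted) {
        try (DataInputStream in = new DataInputStream(source)) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                int length = in.readInt();
                if (name.equals(wanted)) {
                    byte[] payload = new byte[length];
                    in.readFully(payload);
                    return payload;
                }
                int remaining = length;
                while (remaining > 0) { // skipBytes may skip fewer bytes than requested
                    int skipped = in.skipBytes(remaining);
                    if (skipped <= 0) {
                        throw new EOFException("truncated message");
                    }
                    remaining -= skipped;
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> parts = new LinkedHashMap<>();
        parts.put("metadata", "id=42".getBytes(StandardCharsets.UTF_8));
        parts.put("bulk", new byte[1_000_000]); // large part this node does not need
        parts.put("result", "done".getBytes(StandardCharsets.UTF_8));

        byte[] message = write(parts);
        byte[] result = readPart(new ByteArrayInputStream(message), "result");
        System.out.println(new String(result, StandardCharsets.UTF_8)); // done
    }
}
```

Each payload could itself be a serialized object or a JSON document. Because the reader only ever works on a stream, the whole message does not have to sit in memory at once.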

Saving object for future use

I am writing a program (a bot) to play a Risk-like game in an AI competition. I'm new to programming so I've used some very basic coding so far. In this game, each round the program receives some information from the game engine. In the program, I have a class BotState, which allows me to treat information from the current round, such as the opponent bot moves, or the regions currently under my control, etc. This information is put in some ArrayLists. I have some getters to access this information and use them in the main class.
My problem is that each round, the information is overwritten (each round means a new run of the program), so I can only access the information from the current round. What I would like to do is save all of the information each round, so that for example if the game state is at round 10, I still can access the moves that the opponent made on round 8.
I looked for ways to solve this problem, and I came across something named "object serialization". I didn't quite understand how it works, so I would like to know if there is a simpler/better way to do what I want, or if serialization is the way to go. Thanks for your help.
edit: I can't link the program to my disk or a database. I upload the source files of the bot to the game server, so everything has to be in the source files
Object serialization should be fairly simple for your case. Simply put, it is a way to store your object on disk and later take the data from the disk and recreate your object in memory in the same state it was in before serializing it.
Another way is to define some sort of representation yourself, e.g. an XML chunk for each object, and to store those chunks in an XML file. You can view this as custom serialization, but it's still serialization.
Another way is to store your objects in a database.
All in all, you need some permanent/persistent storage for your objects (whether it's the disk directly or a DB, which again uses the disk at the lowest level).
Consider using a modeling framework for your application. The Eclipse Modeling Framework (EMF) comes with a simple XMI serialization built into it. If your model is small and/or simple enough it may be worth it. Have a look at this EMF introduction tutorial and this tutorial on serialization in EMF.
Also, have a look at this question: What's the easiest way to persist java objects?.

Storing Large Amounts of Dictionary-Like Data Within an Application in Java

I fear I may not be truly understanding the utility of database software like MySQL, so perhaps this is an easy question to answer.
I'm writing a program that stores and accesses a bestiary for use in the program. It is a stand-alone application, meaning that it will not connect to the internet or a database (which I am under the impression requires a connection to a server). Currently, I have an enormous .txt file that it parses via a simple pattern (Habitat is on every tenth line, starting with the seventh; name is on every tenth line, starting with the first; etc.) This is prone to parsing errors (problems with reading data that is unrecognizable with the specified encoding, as a lot of the data is copy/pasted by lazy data-entry-ists) and I just feel that parsing a giant .txt file every time I want data is horribly inefficient. Plus, I've never seen a deployed program that had a .txt lying around called "All of our important data.txt".
Are databases the answer? Can they be used simply in basic applications like this one? Writing a class for each animal seems silly. I've heard XML can help, too - but I know virtually nothing about it except that it's a mark-up language.
In summary, I just don't know how to store large amounts of data within an application. A good analogy would be: How would you store data for a dictionary/encyclopedia application?
So you are saying that a standalone application without internet access cannot have a database connection? Well, your basic assumption that a DB cannot exist in standalone apps is wrong. Today's web applications use browser-assisted SQL databases to store data. All you need is to experiment rather than speculate. If you need direction, start with the lightweight SQLite.
While databases are undoubtedly a good idea for the kind of application you're describing, I'll throw another suggestion your way, which might suit you if your data doesn't necessarily need to change at all, and there's not a "huge" amount of it.
Java provides the ability to serialise objects, which you could use to persist and retrieve object instance data directly to/from files. Using this simple approach, you could:
Write code to parse your text file into a collection of serialisable application-specific object instances;
Serialise these instances to some file(s) which form part of your application;
De-serialise the objects into memory every time the application is run;
Write your own Java code to search and retrieve data from these objects yourself, for example using ordered collection structures with custom comparators.
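The steps above could be sketched roughly as follows. Creature, BestiaryDemo and the two sample entries are invented for the example, and the parsing of the original text file is left out. A TreeMap with String.CASE_INSENSITIVE_ORDER stands in for the "ordered collection with a comparator" idea.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.TreeMap;

class BestiaryDemo {
    // Hypothetical record for one bestiary entry.
    static class Creature implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String habitat;

        Creature(String name, String habitat) {
            this.name = name;
            this.habitat = habitat;
        }
    }

    // Serialise the whole collection to a file that ships with the application.
    static void save(Path file, TreeMap<String, Creature> bestiary) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(bestiary);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // De-serialise the collection into memory at start-up.
    @SuppressWarnings("unchecked")
    static TreeMap<String, Creature> load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (TreeMap<String, Creature>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Ordered by name, ignoring case; this comparator is itself serialisable.
        TreeMap<String, Creature> bestiary = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        bestiary.put("Kraken", new Creature("Kraken", "deep sea"));
        bestiary.put("Griffin", new Creature("Griffin", "mountains"));

        Path file = Files.createTempFile("bestiary", ".ser");
        save(file, bestiary);
        TreeMap<String, Creature> loaded = load(file);

        System.out.println(loaded.get("kraken").habitat); // deep sea
        System.out.println(loaded.firstKey());            // Griffin
        Files.delete(file);
    }
}
```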
This approach may suffice if you:
Don't expect your data to change;
Do expect it to always fit within memory on the JVMs you're expecting the application will be run on;
Don't require sophisticated querying abilities.
Even if one or more of the above things do not hold, it may still suit you to try this approach, so that your next step could be to use a so-called object-relational mapping tool like Hibernate or Castor to persist your serialisable data not in a file, but a database (XML or relational). From there, you can use the power of some database to maintain and query your data.

Non-RAM Storage

I am learning Java from "Thinking In Java" by Bruce Eckel. I am unable to understand the concept of Non-RAM Storage.
As the book says:
Non-RAM storage. If data lives completely outside a program, it can
exist while the program is not running, outside the control of the
program. The two primary examples of this are streamed objects, in
which objects are turned into streams of bytes, generally to be sent
to another machine, and persistent objects, in which the objects are
placed on disk so they will hold their state even when the program is
terminated. The trick with these types of storage is turning the
objects into something that can exist on the other medium, and yet can
be resurrected into a regular RAM-based object when necessary. Java
provides support for lightweight persistence,and mechanism such as
JDBC!
What is lightweight persistence? What is meant by turning the objects into something that can exist on the other medium, and yet can be resurrected into a regular RAM-based object when necessary?
Persistent data is information that can outlive the program that creates it. The majority of complex programs use persistent data: GUI applications need to store user preferences across program invocations, web applications track user movements and orders over long periods of time, etc. (source provided below)
Here is the answer to your question:
Lightweight persistence is a form of storage that requires little or no work from the developer. For example, Java serialization is a form of lightweight persistence because it can be used to persist Java objects directly to a file with very little effort.
I am very happy that you are not just reading the book, but asking questions about anything you come across in it. Good luck!
source
There is a process in Java (and other languages) called serialization. Basically it lets you turn an object into a byte stream, so it can be written to a file, stored in a database, sent to a cloud, etc. The idea is that there is an easy and automatic translation between the stored object and the in-memory RAM object. If you do it yourself, such as writing individual fields to a file or database, you need to come up with a file format or database schema. This is heavy-weight storage.
Here is a tutorial on Java serialization: http://www.tutorialspoint.com/java/java_serialization.htm
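To show the difference between doing it yourself and letting serialization do the translation, here is a small side-by-side sketch. Person, PersistenceStyles and the four helper methods are hypothetical names. The "manual" pair has to agree on a field order by hand, while the "automatic" pair leaves the format to ObjectOutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class PersistenceStyles {
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Do it yourself: we invent the format (name first, then age)...
    static byte[] writeManually(Person p) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(p.name);
            out.writeInt(p.age);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // ...and must read the fields back in exactly the same order.
    static Person readManually(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String name = in.readUTF();
            int age = in.readInt();
            return new Person(name, age);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Built-in serialization: the stream format is handled for us.
    static byte[] writeAutomatically(Person p) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static Person readAutomatically(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Person) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person ada = new Person("Ada", 36);
        Person a = readManually(writeManually(ada));
        Person b = readAutomatically(writeAutomatically(ada));
        System.out.println(a.name + " " + a.age); // Ada 36
        System.out.println(b.name + " " + b.age); // Ada 36
    }
}
```

With two fields the manual version is no great burden, but every field added later has to be added in both the writer and the reader.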

need help creating an internal log file

I have a character creator program for an RPG that I've created and I want a way to log what the initial rolls are before the user gets a chance to edit anything. I could easily dump the info to a text file or something similar, but I want something that can't be edited. I save the character info in a serialized class object. What would be the best way to log this info in serialized form (i.e. inside the class object)? I thought of a string but that could get rather large. Is there a better way?
I would just create an object to hold the initial roll values; can't be more than a couple dozen bytes -- how many times is the user going to roll? You don't really explain what the rolls represent and if there's a fixed number of them.
If this information only has to persist during the running of the application, there's not much more to do beyond that.
If this information has to persist beyond program execution, then you'll have to write it to a file one way or another. If that's the case, then there's no way to truly protect it. RPG game writers have been trying to do this for decades with very little success. Typically, the solution is to keep player stats at a central server that the users have no access to. Cf. the "duping" problem that Diablo suffered from.
Your best bet is to write your "rolls" object to be serializable. Serialize it to a string, convert the string to bytes, encrypt the bytes with an encryption key hidden deep in the code, and write that to a file. Or if you want, just securely sign the string instead. That would allow the users to see the stats file, but not modify it.
The old Hack game used to save game state to a file with similar protection and it worked very well. One trick they used on Unix systems was to make the file's inode number part of the signed state. That way, if the user tried to copy the file (to back up a game in progress), the inode number would change, invalidating the file. Hack was very unforgiving about these things; if it detected a modified save file, your character died on the spot, game over.
The problem here is that dedicated users will reverse-engineer your code and discover the encryption algorithm and key, no matter how hard you try to hide them, or (if you're only signing the file) simply modify your executable so that it doesn't bother checking the signature.
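For what it's worth, the "sign it" variant might look roughly like this. I'm using HMAC-SHA256 from javax.crypto as the signing step, which is my own choice for the sketch rather than anything Hack did. SignedRolls, the embedded key and the sample rolls are made up, and as noted above a key embedded in the program can be dug out by a determined user.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class SignedRolls {
    // Hypothetical embedded key; anyone who reverse-engineers the program can recover it.
    private static final byte[] KEY = "demo-key-not-secret".getBytes(StandardCharsets.UTF_8);

    // Serialize any Serializable object (here: the initial rolls) to bytes.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Compute an HMAC-SHA256 tag over the serialized bytes.
    static byte[] sign(byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(KEY, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recompute the tag and compare; any change to the data makes this fail.
    static boolean verify(byte[] data, byte[] signature) {
        return MessageDigest.isEqual(sign(data), signature);
    }

    public static void main(String[] args) {
        int[] initialRolls = {14, 9, 17, 12, 11, 15};
        byte[] data = serialize(initialRolls);
        byte[] signature = sign(data); // store this next to the data

        System.out.println(verify(data, signature)); // true

        byte[] tampered = data.clone();
        tampered[tampered.length - 1] ^= 1; // simulate a user editing the file
        System.out.println(verify(tampered, signature)); // false
    }
}
```

The users can still read the rolls, since nothing is encrypted here; the tag only lets the program notice that the bytes were changed.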
Without more detail on what you're trying to do, this is as much advice as I can give you. It would help if (as is the custom) you told us what you've tried so far.
