Thinking about what is serializable and what is not, do I get it right that if no error messages pop up during de/serialization then everything has been perfectly serialized and deserialized? Or is it still possible while not getting any errors to have my object somehow damaged or changed during de/serialization?
My question may seem odd but it's rather difficult for a newbie like myself to keep track of every part of an object (which is fairly vast) whether this part can be serialized or not. So I'd rather fully rely on error indications if it's an adequate approach.
Actually, no. The absence of errors when writing or reading a serializable object to or from a data stream only means that no exceptional situation occurred. It does not mean that you will get consistent data.
You can read a lot more about this in Effective Java by Joshua Bloch. It has several items on serialization.
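To make that concrete, here is a minimal sketch of one way data can get lost without any error. The Session class and its fields are made up for illustration; the point is only that a transient field is skipped silently by default Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SilentLossDemo {

    // Hypothetical class, for illustration only.
    static final class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "alice";
        // transient fields are skipped by default serialization, without any error
        transient String authToken = "secret";
    }

    // Serializes and deserializes in memory; checked exceptions are wrapped to keep the demo short.
    static Session roundTrip(Session original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTrip(new Session());
        System.out.println(copy.user);      // alice
        System.out.println(copy.authToken); // null: nothing was thrown, yet the value is gone
    }
}
```

So a clean round trip tells you that nothing blew up, not that the copy equals the original. Comparing the deserialized object against the original in a test is the more reliable check.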
I was wondering how the serialization of MicroStream works in detail.
Since it is described as "Super-Fast" it has to rely on code generation, right? Or is it based on reflection?
How would it perform in comparison to Protobuf serialization, which relies on code generation that directly reads out of the Java fields and writes them into a ByteBuffer and vice versa?
Using reflection would drastically decrease the performance when serializing objects on a huge scale, wouldn't it?
I'm looking for a fast way to transmit and persist objects for a multiplayer-game and every millisecond counts. :)
Thanks in advance!
PS: Since I don't have enough reputation, I can not create the "microstream"-tag. https://microstream.one/
I am the lead developer of MicroStream.
(This is not an alias account. I really just created it. I've been reading Stack Overflow for 10 years or so but never had a reason to create an account. Until now.)
On every initialization, MicroStream analyzes the current runtime's versions of all required entity and value type classes and derives optimized metadata from them.
The same is done when encountering a class at runtime that was unknown so far.
The analysis is done via reflection, but since it is only done once for every handled class, the reflection performance cost is negligible.
The actual storing and loading or serialization and deserialization is done via optimized framework code based on the created metadata.
If a class layout changes, the type analysis creates a mapping from the field layout in which the class's instances were stored to that of the current class.
This happens automatically if possible (unambiguous changes or via some configurable heuristics), otherwise via a user-provided mapping. Performance stays the same since the JVM does not care whether it (simply put) copies a loaded value #3 to position #3 or to position #5. It's all in the metadata.
ByteBuffers are used, more precisely direct ByteBuffers, but only as an anchor for off-heap memory to work on via direct "Unsafe" low-level operations. If you are not familiar with "Unsafe" operations, a short and simple notion is: "It's as direct and fast as C++ code." You can do anything you want very fast and close to memory, but you are also responsible for everything. For more details, google "sun.misc.Unsafe".
No code is generated. No byte code hacking, tacit replacement of instances by proxies or similar monkey business is used. On the technical level, it's just a Java library (including "Unsafe" usage), but with a lot of properly devised logic.
As a side note: reflection is not as slow as it is commonly considered to be. Not any more. It was, but it has been optimized considerably in some past Java version(s).
It's only slow if every operation has to do all the class analysis, field lookups, etc. anew (which an awful lot of frameworks seem to do because they are just badly written). If the fields are collected (set accessible, etc.) once and then cached, reflection is actually surprisingly fast.
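As a rough illustration of the "collect once, then cache" idea (this is not MicroStream's code, just a sketch with made-up names such as CachedFieldReader and Player), the field lookup and setAccessible calls happen once per class and every later read only calls Field.get:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CachedFieldReader {

    // Result of the one-time analysis per class: its accessible instance fields.
    private static final Map<Class<?>, List<Field>> CACHE = new ConcurrentHashMap<>();

    // The expensive part; runs once per class. Only fields declared directly
    // in the class are collected here, superclass fields are ignored for brevity.
    private static List<Field> analyze(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            field.setAccessible(true);
            fields.add(field);
        }
        return fields;
    }

    // The cheap part; reuses the cached Field objects on every call.
    static Map<String, Object> read(Object instance) {
        List<Field> fields = CACHE.computeIfAbsent(instance.getClass(), CachedFieldReader::analyze);
        Map<String, Object> values = new LinkedHashMap<>();
        try {
            for (Field field : fields) {
                values.put(field.getName(), field.get(instance));
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return values;
    }

    // Hypothetical entity class used for the demo.
    static final class Player {
        private final String name;
        private final int score;

        Player(String name, int score) {
            this.name = name;
            this.score = score;
        }
    }

    public static void main(String[] args) {
        System.out.println(read(new Player("ann", 7)));
    }
}
```

The slow pattern would be calling getDeclaredFields and setAccessible inside read on every invocation.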
Regarding the comparison to Protobuf-Serialization:
I can't say anything specific about it since I haven't used Protocol Buffers and I don't know how it works internally.
As usual with complex technologies, a truly meaningful comparison might be pretty difficult to do since different technologies have different optimization priorities and limitations.
Most serialization approaches give up referential consistency and only store "data" (i.e. if two objects reference a third, deserialization will create TWO instances of that third object).
Like this: A->C<-B ==serialization==> A->C1 B->C2.
This basically breaks/ruins/destroys object graphs and makes serialization of cyclic graphs impossible, since it creates an endlessly cascading replication. See JSON serialization, for example. Funny stuff.
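To show what A->C<-B means in code, here is a small sketch. It does not use MicroStream; it contrasts a naive "data only" copy (a made-up helper) with a round trip through standard java.io serialization, which does keep shared references as long as both roots go into the same stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SharedReferenceDemo {

    // Hypothetical graph node with a single outgoing reference.
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node ref;

        Node(String name, Node ref) {
            this.name = name;
            this.ref = ref;
        }
    }

    // "Data only" copy: duplicates whatever a reference points to.
    // On a cyclic graph this recursion would never terminate normally.
    static Node dataOnlyCopy(Node node) {
        return node == null ? null : new Node(node.name, dataOnlyCopy(node.ref));
    }

    // Round trip of both roots through ONE java.io stream, which tracks object identity.
    static Node[] javaRoundTrip(Node a, Node b) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Node[] {a, b});
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node c = new Node("C", null);
        Node a = new Node("A", c);
        Node b = new Node("B", c);

        // A->C1, B->C2: the shared C has been duplicated
        System.out.println(dataOnlyCopy(a).ref == dataOnlyCopy(b).ref); // false

        // A->C<-B: both copies still point at the same C
        Node[] copies = javaRoundTrip(a, b);
        System.out.println(copies[0].ref == copies[1].ref); // true
    }
}
```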
Even Brian Goetz' draft for a Java "Serialization 2.0" includes that limitation (see "Limitations" at http://cr.openjdk.java.net/~briangoetz/amber/serialization.html) (and another one which breaks the separation of concerns).
MicroStream does not have that limitation. It handles arbitrary object graphs properly without ruining their references.
Keeping referential consistency intact is by far not "trying to do too much", as he writes. It is "doing it properly". One just has to know how to do it properly. And it even is rather trivial if done correctly.
So, depending on how many limitations Protobuf-Serialization has ("pacts with the devil"), it might be hardly or even not at all comparable to MicroStream in general.
Of course, you can always create some performance comparison tests for your particular requirements and see which technology suits you best. Just make sure you are aware of the limitations a certain technology imposes on you (ruined referential consistency, forbidden types, required annotations, required default constructor / getters / setters, etc.).
MicroStream has none*.
(*) within reason: Serializing/storing system-internals (e.g. Thread) or non-entities (like lambdas or proxy instances) is, while technically possible, intentionally excluded.
I'm trying to change the page persistence in our XPages application, intending to move from "Keep pages in memory" to "Keep only the current page in memory". And of course I get run-time errors telling me that XPages cannot serialize a JavaScript function. But which function? The stack trace only shows the standard Java error stuff, but nothing about which variable or function cannot be serialized.
I had similar issues before, and it always cost me a lot of time to dig deep in the code and solve the problem. It takes ages... and I've really had it by now.
Is there a clever way to find out which function cannot be serialized??
UPDATE
What OpenLog Logger comes up with:
Client Version: Release 9.0.1FP3 January 12, 2015
Database: aalto803.nsf
Agent: /aASK.xsp
Method: class java.lang.StackTraceElement.writeValue
Error Num: -
Error Line: 364
Error Msg: Impossible de sérialiser une fonction JavaScript
Language: Java
Stack Trace:
java.io.IOException: Impossible de sérialiser une fonction JavaScript
at com.ibm.jscript.types.FBSValue.writeValue(FBSValue.java:364)
at com.ibm.jscript.types.FBSDefaultObject.writeExternal(FBSDefaultObject.java:746)
at com.ibm.jscript.std.ObjectObject.writeExternal(ObjectObject.java:106)
at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1462)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:346)
at java.util.HashMap.writeObject(HashMap.java:942)
at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1020)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1502)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
at java.io.ObjectOutputStream.writeUnshared(ObjectOutputStream.java:413)
at com.ibm.xsp.application.AbstractSerializingStateManager$FastObjectOutputStream.writeObjectEx(AbstractSerializingStateManager.java:438)
at com.ibm.xsp.application.AbstractSerializingStateManager$FastObjectOutputStream.writeObjectEx(AbstractSerializingStateManager.java:417)
at com.ibm.xsp.application.AbstractSerializingStateManager$FastObjectOutputStream.writeObjectEx(AbstractSerializingStateManager.java:417)
at com.ibm.xsp.application.AbstractSerializingStateManager$FastObjectOutputStream.writeObjectEx(AbstractSerializingStateManager.java:417)
at com.ibm.xsp.application.AbstractSerializingStateManager$FastObjectOutputStream.writeObjectEx(AbstractSerializingStateManager.java:417)
at com.ibm.xsp.application.AbstractSerializingStateManager.saveSerializedView(AbstractSerializingStateManager.java:294)
at com.ibm.xsp.application.AbstractSerializingStateManager.doSaveSerializedView(AbstractSerializingStateManager.java:269)
at com.ibm.xsp.application.FileStateManager.doSaveSerializedView(FileStateManager.java:290)
at com.ibm.xsp.application.FileStateManager.doSaveSerializedView(FileStateManager.java:270)
at com.ibm.xsp.application.AbstractStateManager.saveSerializedView(AbstractStateManager.java:114)
at com.ibm.xsp.application.StateManagerImpl.saveSerializedView(StateManagerImpl.java:152)
at com.ibm.xsp.application.ViewHandlerExImpl._saveViewState(ViewHandlerExImpl.java:455)
at com.ibm.xsp.application.ViewHandlerExImpl.saveViewState(ViewHandlerExImpl.java:449)
at com.ibm.xsp.application.ViewHandlerExImpl._renderView(ViewHandlerExImpl.java:324)
at com.ibm.xsp.application.ViewHandlerExImpl.renderView(ViewHandlerExImpl.java:336)
at com.sun.faces.lifecycle.RenderResponsePhase.execute(RenderResponsePhase.java:103)
XPages OpenLog Logger not only catches uncaught exceptions (it sounds like this is one of those), but also catches which component triggered the problem. It needs an error XPage in the application (otherwise after the error occurs there is no Render Response phase run, from which XPages OpenLog Logger retrieves the details). That might help you track it down.
Otherwise, check what functions you're storing in viewScope etc. That might help you narrow things down. SSJS is not really designed for object-oriented programming, which I think is where issues arise when storing functions in scope.
The answer to your question is not so much WHICH function can't be serialized, it's that NONE of your functions can be serialized. Or, if you want to get very technical, none can be expected to persist reliably. SSJS is not meant to be serialized. In this blog post here: http://xomino.com/2014/03/26/why-learning-javascript-is-more-critical-to-xpage-developers-than-java/ there is a good discussion in the comments about why and where serialization is toxic, specifically with SSJS (you can sort of disregard the actual discussion surrounding the blog post of Java vs JavaScript, just concentrate on the bits concerning serialization).
A recent discovery of mine answers this question too IMHO, see "XPages: how to put a Java Date value in an ObjectObject". A few years back, I started moving my code from SSJS to Java, and I had some (better: a lot of) trouble with the ObjectObject and ArrayObject classes, mostly with the rather ugly way values have to be converted using a class called FBSUtility.
My main issue was that I couldn't manage to store a Date in an ObjectObject object. At a later stage, I was happy to have found a call with a JSContext parameter, FBSUtility.wrap(jsContext, someDate), which permits storing a Date value. Call me clueless about what the JSContext actually does here (which I am), but I thought that was the end of it.
Recently, in order to test our application, I changed the Persistence Mode, from a few pages in memory to everything on disk, hence forcing the serialization of all objects. I found out that a specific element of my application no longer worked, always stopping on a Serialization error. Further tests proved that there were no errors when I removed all Date values from the OO object.
Earlier, I had already adopted JsonJavaObject and JsonJavaArray classes for some other parts of the application (yeah I know, messy coding, big application, never time to do things right, 50 Mb template db, etc.). I rewrote the code to remove all use of the classes JSContext, FBSUtility, ObjectObject and ArrayObject, to replace them with the JsonJava classes, and there's no longer the dreaded message that a JS function cannot be serialized.
So, what I learned is: if your Persistence Mode is set to Keep pages on Disk or Keep only the current page in memory, try to avoid ObjectObject objects and never use FBSUtility in combination with JSContext.
When I first started using the Java Preferences API, the one glaring omission from the API was a putObject() method. I've always wondered why they did not include it.
So, I did some googling and I found this article from IBM which shows you how to do it: http://www.ibm.com/developerworks/library/j-prefapi/
The method they're using seems a bit hackish to me, because you have to break the Object up into byte matrices, store them, and reassemble them later.
My question is, has anyone tried this approach? Can you testify that it is a good way to store/retrieve objects?
I'm also curious why the Java devs left putObject() out of the API. Does anyone have valuable insight?
"I'm also curious why the Java devs left putObject() out of the API. Does anyone have valuable insight?"
From: http://docs.oracle.com/javase/7/docs/technotes/guides/preferences/designfaq.html
"Why doesn't this API contain methods to read and write arbitrary serializable objects?
Serialized objects are somewhat fragile: if the version of the program that reads such a property differs from the version that wrote it, the object may not deserialize properly (or at all). It is not impossible to store serialized objects using this API, but we do not encourage it, and have not provided a convenience method."
The article describes a reliable way to do it. I see there are a couple of things I may do differently (like I would store the count of the number of pieces as well as the pieces themselves so that I can figure things out easily when I retrieve them).
Your comment about Serialization is wrong though: the object you want to store has to be Serializable. That's how the ObjectOutputStream that the document uses does its job.
So, yes, it looks like a reliable mechanism, provided you have Serializable objects. I imagine that putObject and getObject are not part of the API for two reasons:
It's not part of the way that is native to Windows registries.
It risks people putting huge amounts of data in the registry.
Storing serialized objects in the registry strikes me as being somewhat concerning because they can be so big. I would only use it for occasions when there is no way to reconstruct the Object from constructors, and the serialized version is relatively small.
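For reference, the approach discussed here (serialize, cut into pieces, store the number of pieces next to the pieces) could look roughly like the sketch below. This is not the code from the IBM article; PrefsObjectStore and the key naming scheme (key + ".count", key + ".0", ...) are my own assumptions. The chunk size follows the documented limit that a byte array passed to putByteArray must not exceed three quarters of Preferences.MAX_VALUE_LENGTH.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.prefs.Preferences;

final class PrefsObjectStore {

    // Largest byte array that putByteArray accepts for a single value.
    static final int CHUNK = Preferences.MAX_VALUE_LENGTH * 3 / 4;

    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Cuts data into pieces of at most chunkSize bytes.
    static byte[][] split(byte[] data, int chunkSize) {
        int count = (data.length + chunkSize - 1) / chunkSize;
        byte[][] chunks = new byte[count][];
        for (int i = 0; i < count; i++) {
            int from = i * chunkSize;
            chunks[i] = Arrays.copyOfRange(data, from, Math.min(data.length, from + chunkSize));
        }
        return chunks;
    }

    static byte[] join(byte[][] chunks) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            joined.write(chunk, 0, chunk.length);
        }
        return joined.toByteArray();
    }

    // Stores the piece count first, then each piece under its own key.
    // Pieces left over from an earlier, larger value are not cleaned up in this sketch.
    static void putObject(Preferences node, String key, Serializable value) throws IOException {
        byte[][] chunks = split(toBytes(value), CHUNK);
        node.putInt(key + ".count", chunks.length);
        for (int i = 0; i < chunks.length; i++) {
            node.putByteArray(key + "." + i, chunks[i]);
        }
    }

    // Returns null if nothing was stored under the key.
    static Object getObject(Preferences node, String key) throws IOException, ClassNotFoundException {
        int count = node.getInt(key + ".count", 0);
        if (count == 0) {
            return null;
        }
        byte[][] chunks = new byte[count][];
        for (int i = 0; i < count; i++) {
            chunks[i] = node.getByteArray(key + "." + i, null);
            if (chunks[i] == null) {
                throw new IOException("Missing piece " + i + " for key " + key);
            }
        }
        return fromBytes(join(chunks));
    }
}
```

Usage would be something like PrefsObjectStore.putObject(Preferences.userNodeForPackage(MyApp.class), "settings", mySettings), where MyApp and mySettings are placeholders. The caveat from the design FAQ still applies: if the class changes between versions of the program, the stored bytes may no longer deserialize.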
I have written a math game in Java, and have distributed some copies to a few beta-testers. The problem is that the version I have given them is saving the GameData via object serialization, which I found out is mainly for sending Objects, or in this case, ArrayLists of GameData, over a network. It is NOT persistence; that is what a relational database is for. Knowing this, I would like to know if it would be better to create a database on the beta-tester's machine (and rewrite the game), or continue with the Object serialization version of the game, and then retrieve the Objects when they are ready to send the data?
My guess would be to just move their data to a database that is created on their computer, and then give them the database version of the game. That way, the data can be persisted and be much easier to manipulate. What turns me away from that idea is the question of how am I going to write their database into mine (in the future)?
Although relatively rare, there are still lots of applications that use serialization for storage and retrieval of objects. It's not wrong to do this, just slightly unusual. If it's working for you, stick with it because databases are a heavyweight solution. What you found out about serialization is only an opinion, and an ill-informed one at that.
In terms of using an embedded database, two options to consider are SQLite and HyperSQL. However, serialization is also an option, and in my opinion it should be your default option if you've already implemented it. Some considerations:
With serialization you've generally got to retrieve the entire object, which is slow if you've got an object with several dozen fields and you only want to read one of them. If you're making queries like these, then use a database. I suspect that you're just reading in all of your serialized objects at startup and serializing them back out to disk at shutdown, in which case there's no reason to use a database instead of serialization.
Java's default serialization mechanism is fairly slow. You may want to consider another serialization mechanism, such as Kryo or Jackson, but only if you're not happy with your program's serialization performance.
It is difficult to advise on the best choice of technology without knowing what you are persisting and why.
If the state is simply a snapshot of your game state (i.e. a save file) or a "best scores" table, then you don't need a database. Serializing using JSON, XML or ... Java Object serialization is sufficient.
If the state needs to be read or updated incrementally or shared with other applications ... or users on other machines ... then a database is more appropriate.
Serialization mechanisms are problematic if the requirements include incremental changes, etcetera. You end up building a database-like layer over the top of the serialization.
As to whether you should stick with Java serialization ... or switch to JSON or XML or something like that:
Object serialization is simple, but it can be fragile if you change the classes that you are serializing. This fragility can be mitigated, but it is messy and you lose the simplicity. (You need to write custom readObject and writeObject methods that know how to read "old versions" of the serialized objects.)
JSON and XML are a bit more complicated, but still relatively simple if you use an object binding mechanism.
It is worth noting that changes to the persisted object classes (or the database schemas) are potentially problematic no matter what you do. There is no easy universal solution to this problem.
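As a sketch of the custom readObject idea mentioned above: the SaveGame class and its nickname field are hypothetical. With a stable serialVersionUID, a field that was added after a file was written simply comes back with its default value (null here), and readObject can patch that up. The null passed in below stands in for such an old file, which is not reproduced in the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SaveGame implements Serializable {
    // Kept stable so that files written by older releases are still accepted.
    private static final long serialVersionUID = 1L;

    int level;
    String nickname; // assume this field was added in a later release

    SaveGame(int level, String nickname) {
        this.level = level;
        this.nickname = nickname;
    }

    // Called by ObjectInputStream instead of the plain default field read.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (nickname == null) {
            nickname = "anonymous"; // repair data that predates the field
        }
    }

    // In-memory round trip; checked exceptions are wrapped to keep the demo short.
    static SaveGame roundTrip(SaveGame game) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(game);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SaveGame) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new SaveGame(3, null)).nickname);  // anonymous
        System.out.println(roundTrip(new SaveGame(3, "zed")).nickname); // zed
    }
}
```

This only covers compatible changes such as added fields. Renamed or retyped fields need more work, which is where the messiness comes from.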
UPDATE
Given the additional information that you provided in your first comment (below), it seems like you don't need a database in the game itself. All you need is something that can read and analyse the session state save files that your beta testers provide for you. Indeed, it doesn't even seem like the actual app needs to be able to read the files. (But that's unclear, because you've not said what the real purpose of these files is ... or at least, not what the entire purpose is.)
It is also worth noting that you are probably saving the wrong information if your aim is to tune the sets of questions. What you really need to do is record, for each individual question, how long the user took and whether the answer was right or wrong. And you probably need to know what the actual answer given was ... so that you can spot cases where the user's answer was actually right and you "marked" it as wrong ... or vice versa.
"What turns me away from that idea is the question of how am I going to write their database into mine (in the future)?"
Exactly. If you hadn't prematurely "analysed" the data, you wouldn't have this problem.
But ignoring that, it seems that a simple state saving mechanism is sufficient to meet your (still hypothetical / inferred) requirement of keeping a personal score board for the end user. Your "tuning" stuff would be better implemented using a custom log file. I cannot see any value in incorporating a database as part of the app itself.
I presume you are using Java serialisation. If so, there is nothing wrong with it. Just be aware of its limitations: different versions of Java might not be able to retrieve the file.
Also, if you change the class, previously saved data may no longer be readable.
If you decide to change, you could look at XML, JSON, Protocol Buffers, Thrift, Avro etc. as well as a DB.
Note:
XML support is built into Java.
Java DB (Derby) is also included with Java.
Other serialisation schemes require a separate library.
I've had a look around but nothing seems to quite cover what I wish to do. Is it possible to save a Class<?> instance object at run-time? If so how would I go about doing it?
Have you gone through the concept of serialization in Java? This link will help you with your problem.
In short, java.io.Serializable is your friend here.
This is a set of comments on Java Serialization rather than an answer. Just some info not (yet) in the other answers.
Serialization not only saves an object, it saves all the objects it references, directly and indirectly. This can be really cool, but you might write one little objectette and find you've unexpectedly created a 10MB file.
If there's a reference, however indirect, to a non-serializable object in the one you're writing, the write will throw an Exception.
If you're using a socket, reset the ObjectOutput stream regularly. Otherwise, every time an object is written other than the first time, all that gets sent is a reference to the original data. Send the same object with successive values of 1, 2, 3, 4, and 5, and the object read will have values of 1, 1, 1, 1, and 1. Also, without reset, memory usage will soar because both ObjectOutput and ObjectInput will keep pretty much everything sent in memory. (Though it will only keep one copy of each distinct object.)
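The effect of a missing reset is easy to reproduce in memory, without a socket. In this sketch (ResetDemo and Counter are made-up names) the same object is written three times with values 1, 2 and 3, once without and once with a reset before each write:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class ResetDemo {

    // Hypothetical mutable object that is sent repeatedly.
    static final class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    // Writes the SAME Counter three times with values 1, 2, 3 and returns what the reader sees.
    static List<Integer> send(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            Counter counter = new Counter();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (int i = 1; i <= 3; i++) {
                    counter.value = i;
                    if (resetBetweenWrites) {
                        out.reset(); // forget previously written objects
                    }
                    out.writeObject(counter);
                }
            }
            List<Integer> received = new ArrayList<>();
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                for (int i = 0; i < 3; i++) {
                    received.add(((Counter) in.readObject()).value);
                }
            }
            return received;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(send(false)); // [1, 1, 1]
        System.out.println(send(true));  // [1, 2, 3]
    }
}
```

ObjectOutputStream.writeUnshared is another option when only a single object needs to be resent, although it does not apply to the objects that object references.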
Serialization doesn't work if a class changes between being written and being read. Clever work with Externalization can get around this, however. (And remember serialVersionUID if your IDE will let you forget.)
Externalization lets you write the code to serialize a class. This can be very useful. You can put in version numbers and check them, and you can leave out data that isn't needed or can be recreated during the read. It takes more work than automatic serialization, though.
When doing an Externalization read, be aware that all references may refer to objects whose data has not yet arrived; you can't consistently sum amounts from a list of child objects, for instance. It might pay to call a method after readObject to set up values that need to be calculated. (It's often better to send redundant information than to recalculate it.)
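A minimal sketch of the version-number idea with Externalizable follows. VersionedPlayer, its fields and the two format versions are invented for the example, and readSimulatedV1 only fakes an old payload by hand so that the old branch can be exercised.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

final class VersionedPlayer implements Externalizable {
    private static final int CURRENT_VERSION = 2;

    String name = "";
    int score;

    // Externalizable needs a public no-arg constructor for deserialization.
    public VersionedPlayer() {
    }

    VersionedPlayer(String name, int score) {
        this.name = name;
        this.score = score;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(CURRENT_VERSION); // the format version always goes first
        out.writeUTF(name);
        out.writeInt(score);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        int version = in.readInt();
        name = in.readUTF();
        switch (version) {
            case 1: // hypothetical old format: no score was stored
                score = 0;
                break;
            case 2:
                score = in.readInt();
                break;
            default:
                throw new InvalidObjectException("Unknown format version " + version);
        }
    }

    // Normal round trip in the current format.
    static VersionedPlayer roundTrip(VersionedPlayer player) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(player);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (VersionedPlayer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hand-writes a version 1 payload and feeds it to readExternal directly.
    static VersionedPlayer readSimulatedV1(String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeInt(1);
                out.writeUTF(name);
            }
            VersionedPlayer player = new VersionedPlayer();
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                player.readExternal(in);
            }
            return player;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Anything derived from other objects is better computed in a separate step after the whole graph has been read, for the reason given above.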
I learned all this the hard way.
I learned about Serialization from this website. It teaches the concept well. I recommend starting there.