What are the lightweight options one has to persist Java objects [closed]

What are the lightweight options one has to persist Java objects [closed] - java

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
What are the lightweight options one has to persist Java objects ?
I'm looking for the easiest possible solution. I don't want anything fancy featurewise just something simple that works with reasonnably simple objects (some collections, relationships and simple inheritance) and doesn't bring too much apparent complexity (if any) to the existing codebase.
The options I'm aware of include Hibernate and frameworks such as EMF but they sound (and have been experienced to be) too complex and time-consuming. I'd like something out of the box, preferably file oriented than dababase oriented, that I can just put on top of my plain old java objects.
This is a newbie question, thanks in advance for any tutorial-like, context clarifying guidance.

db4o is an object database so there's no schema setup and it adapts well to changes in object structure over time.

If you are looking at something simple you might also want the data in a format you can read (not a binary file or database). If that is the case, you should look at JAXB (Java's XML Binding) that is part of Java 6 and later. There are other technologies that may be able to do this better such as XML Beans but this one is built in.
Take a look at this page from Java's API. It is pretty straight forward to serializing and deserializing Java objects.
http://java.sun.com/javase/6/docs/api/javax/xml/bind/JAXBContext.html
Basically you use the following:
JAXBContext jc = JAXBContext.newInstance( "com.acme.foo" );
// unmarshal from foo.xml
Unmarshaller u = jc.createUnmarshaller();
FooObject fooObj = (FooObject)u.unmarshal( new File( "foo.xml" ) );
// marshal to System.out
Marshaller m = jc.createMarshaller();
m.marshal( fooObj, System.out );
Just make sure your FooObject has the #XmlRootElement annotation. Java bean properties are automatically interpreted but you can use other annotations to control how the XML looks and model more complex objects.

The ObjectOutputStream and ObjectInputStream can do this. It's built into Java and allows you to serialize/deserialize objects quite easily.
The bad thing about these is that if you change your objects you can't import existing old objects. You'll have to think of a way to stay compatible with existing objects that might already be saved on a user's computer, such as adding a version number to your classes and creating the ability to convert old objects to new ones.
More info here:
http://java.sun.com/j2se/1.4.2/docs/api/java/io/ObjectOutputStream.html
http://java.sun.com/j2se/1.4.2/docs/api/java/io/ObjectInputStream.html
EDIT:
You may also consider adding the following attribute to all your classes right off the bat. This may allow you to then add attributes to classes and still be able to load old Object files. In RMI-land this works, but I'm not sure about files. Don't ever remove attributes though.
static final long serialVersionUID = 1L;

Try Simple XML persistence http://simple.sourceforge.net. Its really simple and requires no configuration. Just take your existing POJOs and annotate them, then you can read and write them to a file. Cyclical object graphs are supported as well as all Java collections. It can be used like so.
Serializer serializer = new Persister();
MyObject object = serializer.read(MyObject.class, new File("file.xml));
and writing is just as easy
serializer.write(myInstance, new File("file.xml"));
This is an extremely lightweight approach, no dependancies other than standard Java 1.5. Compared with other object to XML technologies such as JAXB, XStream, Castor, which are dependant on a whole host of other projects Simple XML is very lightweight and memory efficient.

See this similar question Light-weight alternative to hibernate. The most light weight framework I know of is iBatis

You could try XStream which is an open source library. All it does is turn your POJOs into XML which you can then saved to disk. It is very easy to use and requires only a few lines of code to use.

One simple approach is to serialize (as in Serializable) your objects to disk.
http://www.devx.com/Java/Article/9931/1954

The simplest option I know of is using Xstream to make an xml file out of ANY java obect.

I have to say the initial learning curve for Hibernate is relatively shallow. You should be able to get a test system up and running in less than a day. It's only when you want the more advanced features where it starts to get steeper. I would recommend you definitely take a look, I mean what's a day if you end up not choosing it?
Having said that, have you considered just serializing your objects directly to disk if you just want something quick n dirty?

I have used PersistentObject extensively in production code and it serves my needs well.

Related

How to replace XML with Java? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
I don't understand worldwide fashion to use XML for storing settings, page or GUI layouts and even bean sets.
If we take into account the fact, that XML is not a single language, but a template to define multiple languages, the mass insanity become obvious.
Why use DOZENS of languages instead of using some ONE?
For example, Java.
It can define or declare any type of GUI hierarchy, setting tables and so on.
So why don't we use java files to store all things?
For example, one could provide settings file in java format like this:
class Properties {
String dbpassword = "password";
String database = "localhost";
String dbuser = "mkyong";
}
The this file could be compiled on the fly by runtime compiler and fields extracted by reflection.
So, why don't we do so?
The only reason coming in mind is security.
External file could contain not only properties but some constructor code, which could hack the application or crash it.
So. Is there any protection for this case? Can we limit possible functionality of runtime compiled files?
UPDATE
I don't mean namely Java. Java is just a sample. I need ways to use single language.
UPDATE 2
I claim, that it would be better to use one single language for entire project aspects.
Suppose that in MVC pattern we would use three separate languages for model, view and controller! This would be nonsense! We use single language. I mean we people use different languages, but each one programmer uses one single language still separating MVC. This is what configuration files are: the separation of one more concept. No any sense to use different language just because concept was separated.
This also applies to HTML / JavaScript / CSS. No sense to have three languages. No any damned sense!
Of course I understand historical reasons etc. But these are no sense, these are just pesky reasons.
So, why then we are seeing new and new projects with the same nature?
Why did MS created XAML for WPF? They made new fresh C# language, why not to put XAML into it too?
Why did Oracle created FXML for JavaFX? Thanks God they abandoned JavaFX script! Because they wanted to have THREE languages for just RIA!

The comments have identified one possible explanation:
A Java syntax would be hard to deal with from other languages.
... though for the configuration properties of Java program, this seems a bit of a stretch. (Why would you read a Java program's config properties from a non-Java program?)
I think the real reason nobody does what you propose is that it introduces some new problems which make your solution as bad as doing XML configs the "hard way".
The person writing the configs needs to understand Java syntax.
If they make a mistake, the application has to deal with a Java compilation error in the middle of starting the application.
Accessing the properties would entail the application programmer writing of reflective code. This code is typically fragile, and writing it would be as hard as writing a bunch of DOM-walking code to read stuff from a parsed XML configuration file.
Now maybe XML is not the best choice for configuration files. These days, I'd prefer classic "Properties" file syntax or JSON.
But it XML is OK too, if you do it the right way. For example:
If you use a Java / XML binding library (e.g. JAXB) you don't need to write an XML parser, or a bunch of gnarly DOM-walking code to pick information out a parse tree.
You can uses a DTD or XML schema driven editor so that the user doesn't need to write (or even see) raw XML files.
If your configs are really complicated, you could use something like EMF to model them and then generate DTDs / Schemas and all of the Java code for the config accessor libraries and customizable application-specific config editors.
For the record, the approach that you proposed is commonly used in scripting languages like Perl, Ruby, Python and even "sh". It just that it tends to be problematic in statically compiled languages.
Re your followup "Why did XXX do YYY" questions:
I don't know. I wasn't in the room.
My opinions would be pure speculation.
It doesn't make any difference why they did it anyway.
And in response to your general "it doesn't make sense" comment/complaint, the IT world is rife with pragmatic compromises / clunky solutions. Complaining about it doesn't achieve anything. If you want to achieve something, try your way out, and show us the evidence that it really works in Java.

Using executable code for configuration is a recipe for disaster. Consider:
class Properties
{
String dbpassword = "password";
String database = "localhost";
String dbuser = "mkyong";
static
{
File userDir = new File("c:\\users\\mkyong");
userDir.delete();
//etc...
}
}
You can try to counter this by using a security manager to limit damage, but they are not perfect. Or you could write your own parser that only accepts a sub-set of what a full Java program can do (limited API, no APIs, etc. similar to JSON vs full Javascript). But at the end of the day, is the Java syntax any better than the other configuration file formats?

Simplest way to convert java objects to xml [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
What is the best way to convert a java object to xml with open source apis
I found existing questions 1 and 2 on the subject, but I wasn't sure if it was up-to-date, and the best fit for what I'm trying to do.
I've had one strong suggestion of XMLBeans, but there isn't too much discussion on it here on SO, and it's not even mentioned on the first link above (not with any upvotes, anyway).
Is JAXB still the best recommendation for this? If so, are there any simple tutorials that walkthrough A->B with object->xml?
Update:
I'm being given Lists of java objects, and need to convert them to xml following a given schema.

I prefer to use JAXB that supports both annotation and XML based mapping.

I'd go with XStream, as your question 2 suggests.
It needs no annotations, just configuration, and can serialize simple objects out of the box.
To make the objects fit a given schema, you'd convert them to objects that closely resemble the schema first and serialize those to XML.
The XStream page also holds the tutorials you request.

What is JAXB and why would I use it? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
There is guy here swearing that JAXB is the greatest thing since sliced bread. I am curious to see what Stack Overflow users think the use case is for JAXB and what makes it a good or a bad solution for that case.

I'm a big fan of JAXB for manipulating XML. Basically, it provides a solution to this problem (I'm assuming familiarity with XML, Java data structures, and XML Schemas):
Working with XML is difficult. One needs a way to take an XML file - which is basically a text file - and convert it into some sort of data structure, which your program can then manipulate.
JAXB will take an XML Schema that you write and create a set of classes that correspond to that schema. The JAXB utilities will create the hierarchy of data structures for manipulating that XML.
JAXB can then be used to read an XML file, and then create instances of the generated classes - laden with the data from your XML. JAXB also does the reverse: takes java classes, and generates the corresponding XML.
I like JAXB because it is easy to use, and comes with Java 1.6 (if you are using 1.5, you can download the JAXB .jars.) The way it creates the class hierarchy is intuitive, and in my experience, does a decent job abstracting away the "XML" so that I can focus on "data".
So to answer your question: I would expect that, for small XML files, JAXB might be overkill. It requires you to create and maintain an XML schema, and to use "standard textbook methods" of utilizing Java classes for data structures. (Main classes, small inner-classes to represent "nodes", and a huge hierarchy of them.) So, JAXB is probably not that great for a simple linear list of "preferences" for an application.
But if you have a rather complex XML schema, and lots of data contained within it, then JAXB is fantastic. In my project, I was converting large amounts of data between binary (which was consumed by a C program) and XML (so that humans could consume and modify that data). The resulting XML Schema was nontrivial (many levels of hierarchy, some fields could be repeated, others could not) so JAXB was helpful in being able to manipulate that.

Here's a reason not to use it: performance suffers. There is a good deal of overhead when marshaling and unmarshaling. You might also want to consider another API for XML-Object binding -- such as JiBX:
http://jibx.sourceforge.net/

I use JAXB at work all the time and I really love it. It's perfect for complex XML schemas that are always changing and especially good for random access of tags in an XML file.
I hate to pimp but I just started a blog and this is literally the first thing I posted about!
Check it out here:
http://arthur.gonigberg.com/2010/04/21/getting-started-with-jaxb/

It's an "ORM for XML". Most often used alongside JAX-WS (and indeed the Sun implementations are developed together) for WS Death Star systems.

With JAXB you can automatically create XML representations of your objects (marshalling) and object representations of the XML (unmarshalling).
As far as the XML Schema is concerned, you have two choices:
Generate Java classes from an XSD
Generate an XSD from your Java classes
There are also some simpler XML serialization libraries like XStream, Digester or XMLBeans that might be alternatives.

JAXB is great if you have to code to some external XML spec defined as an XML schema (xsd).
For example, you have a trading application and you must report the trades to the Uber Lame Trade Reporting App and they've given you ultra.xsd to be getting on with. Use the $JAVA_HOME/bin/xjc compiler to turn the XML into a bunch of Java classes (e.g. UltraTrade).
Then you can just write a simple adapter layer to convert your trade objects to UltraTrades and use the JAXB to marshal the data across to Ultra-Corp. Much easier than messing about converting your trades into their XML format.
Where it all breaks down is when Ultra-Corp haven't actually obeyed their own spec, and the trade price which they have down as a xsd:float should actually be expressed as a double!

Why we need JAXB?
The remote components (written in Java) of web services uses XML as a mean to exchange messages between each other. Why XML? Because XML is considered light weight option to exchange message on Networks with limited resources.
So often we need to convert these XML documents into objects and vice versa. E.g: Simple Java POJO Employee can be used to send Employee data to remote component( also a Java programme).
class Employee{
String name;
String dept;
....
}
This Pojo should be converted (Marshall) in to XML document as follow:
<Employee>
<Name>...</Name>
<Department>...</Department>
</Employee>
And at the remote component, back to Java object from XML document (Un-Marshall).
What is JAXB?
JAXB is a library or a tool to perform this operation of Marshalling and UnMarshalling. It spares you from this headache, as simple as that.

You can also check out JIBX too. It is also a very good xml data binder, which is also specialized in OTA (Open Travel Alliance) and is supported by AXIS2 servers. If you're looking for performance and compatibility, you can check it out :
http://jibx.sourceforge.net/index.html

JAXB provides improved performance via default marshalling optimizations.
JAXB defines a programmer API for reading and writing Java objects to and from XML documents, thus simplifying the reading and writing of XML via Java.

Homemade vs. Java Serialization

I have a certain POJO which needs to be persisted on a database, current design specifies its field as a single string column, and adding additional fields to the table is not an option.
Meaning, the objects need to be serialized in some way. So just for the basic implementation I went and designed my own serialized form of the object which meant concatenating all it's fields into one nice string, separated by a delimiter I chose. But this is rather ugly, and can cause problems, say if one of the fields contains my delimiter.
So I tried basic Java serialization, but from a basic test I conducted, this somehow becomes a very costly operation (building a ByteArrayOutputStream, an ObjectOutputStream, and so on, same for the deserialization).
So what are my options? What is the preferred way for serializing objects to go on a database?
Edit: this is going to be a very common operation in my project, so overhead must be kept to a minimum, and performance is crucial. Also, third-party solutions are nice, but irrelevant (and usually generate overhead which I am trying to avoid)

Elliot Rusty Harold wrote up a nice argument against using Java Object serialization for the objects in his XOM library. The same principles apply to you. The built-in Java serialization is Java-specific, fragile, and slow, and so is best avoided.
You have roughly the right idea in using a String-based format. The problem, as you state, is that you're running into formatting/syntax problems with delimiters. The solution is to use a format that is already built to handle this. If this is a standardized format, then you can also potentially use other libraries/languages to manipulate it. Also, a string-based format means that you have a hope of understanding it just by eyeballing the data; binary formats remove that option.
XML and JSON are two great options here; they're standardized, text-based, flexible, readable, and have lots of library support. They'll also perform surprisingly well (sometimes even faster than Java serialization).

You might try Protocol Buffers, it is a open-source project from Google, it is said to be fast (generates shorter serialized form than XML, and works faster). It also handles addition of new field gently (inserts default values).

You need to consider versioning in your solution. Data incompatibility is a problem you will experience with any solution that involves the use of a binary serialization of the Object. How do you load an older row of data into a newer version of the object?
So, the solutions above which involve serializing to a name/value pairs is the approach you probably want to use.
One solution is to include a version number as one of field values. As new fields are added, modified or removed then the version can be modified.
When deserializing the data, you can have different deserialization handlers for each version which can be used to convert data from one version to another.

XStream or YAML or OGNL come to mind as easy serialization techniques. XML has been the most common, but OGNL provides the most flexibility with the least amount of metadata.

Consider putting the data in a Properties object and use its load()/store() serialization. That's a text-based technique so it's still readable in the database:
public String getFieldsAsString() {
Properties data = new Properties();
data.setProperty( "foo", this.getFoo() );
data.setProperty( "bar", this.getBar() );
...
ByteArrayOutputStream out = new ByteArrayOutputStream();
data.store( out, "" );
return new String( out.toByteArray(), "8859-1" ); //store() always uses this encoding
}
To load from string, do similar using a new Properties object and load() the data.
This is better than Java serialization because it's very readable and compact.
If you need support for different data types (i.e. not just String), use BeanUtils to convert each field to and from a string representation.

I'd say your initial approach is not all that bad if your POJO consists of Strings and primitive types. You could enforce escaping of the delimiter to prevent corruptions. Also if you use Hibernate you encapsulate the serialization in a custom type.
If you do not mind another dependency, Hessian is supposedly a more efficient way of serializing Java objects.

How about the standard JavaBeans persistence mechanism:
java.beans.XMLEncoder
java.beans.XMLDecoder
These are able to create Java POJOs from XML (which have been persisted to XML). From memory, it looks (something) like...
<object class="java.util.HashMap">
<void method="put">
<string>Hello</string>
<float>1</float>
</void>
</object>
You have to provide PersistenceDelegate classes so that it knows how to persist user-defined classes. Assuming you don't remove any public methods, it is resilient to schema changes.

You can optimize the serialization by externalizing your object. That will give you complete control over how it is serialized and improve the performance of process. This is simple to do, as long as your POJO is simple (i.e. doesn't have references to other objects), otherwise you can easily break serialization.
tutorial here
EDIT: Not implying this is the preferred approach, but you are very limited in your options if ti is performance critical and you can only use a string column in the table.

If you are using a delimiter you could use a character which you know would never occur in your text such as \0, or special symbols http://unicode.org/charts/symbols.html
However the time spent sending the data to the database and persisting it is likely to be much larger than the cost of serialization. So I would suggest starting with some thing simple and easy to read (like XStream) and look at where your application is spending most of its time and optimise that.

I have a certain POJO which needs to be persisted on a database, current design specifies its field as a single string column, and adding additional fields to the table is not an option.
Could you create a new table and put a foreign key into that column!?!? :)
I suspect not, but let's cover all the bases!
Serialization:
We've recently had this discussion so that if our application crashes we can resurrect it in the same state as previously. We essentially dispatch a persistance event onto a queue, and then this grabs the object, locks it, and then serializes it. This seems pretty quick. How much data are you serializing? Can you make any variables transient (i.e. cached variables)? Can you consider splitting up your serialization?
Beware: what happens if your objects change (locking) or classes change (diferent serialization id)? You'll need to upgrade everything that's serialized to latest classes. Perhaps you only need to store this overnight so it doesn't matter?
XML:
You could use something like xstream to achieve this. Building something custom is doable (a nice interview question!), but I'd probably not do it myself. Why bother? Remember if you have cyclic links or if you have referencs to objects more than once. Rebuilding the objects isn't quite so trivial.
Database storage:
If you're using Oracle 10g to store blobs, upgrade to the latest version, since c/blob performance is massively increased. If we're talking large amounts of data, then perhaps zip the output stream?
Is this a realtime app, or will there be a second or two pauses where you can safely persist the actual object? If you've got time, then you could clone it and then persist the clone on another thread. What's the persistance for? Is it critical it's done inside a transaction?

Consider changing your schema. Even if you find a quick way to serialize a POJO to a string how do you handle different versions? How do you migrate the database from X->Y? Or worse from A->D? I am seeing issues where we stored a serialize object into a BLOB field and have to migrate a customer across multiple versions.

Have you looked into JAXB? It is a mechanism by which you can define a suite of java objects that are created from an XML Schema. It allows you to marshal from an object hierarchy to XML or unmarshal the XML back into an object hierarchy.

I'll second suggestion to use JAXB, or possibly XStream (former is faster, latter has more focus on object serialization part).
Plus, I'll further suggest a decent JSON-based alternative, Jackson (http://jackson.codehaus.org/Tutorial), which can fully serializer/deserialize beans to JSON text to store in the column.
Oh and I absolutely agree in that do not use Java binary serialization under any circumstances for long-term data storage. Same goes for Protocol Buffers; both are too fragile for this purpose (they are better for data transfer between tigtly coupled systems).

You might try Preon. Preon aims to be to binary encoded data what Hibernate is to relational databases and JAXB to XML.

Serialize Java objects into Java code

Does somebody know a Java library which serializes a Java object hierarchy into Java code which generates this object hierarchy? Like Object/XML serialization, only that the output format is not binary/XML but Java code.

Serialised data represents the internal data of objects. There isn't enough information to work out what methods you would need to call on the objects to reproduce the internal state.
There are two obvious approaches:
Encode the serialised data in a literal String and deserialise that.
Use java.beans XML persistence, which should be easy enough to process with your favourite XML->Java source technique.

I am not aware of any libraries that will do this out of the box but you should be able to take one of the many object to XML serialisation libraries and customise the backend code to generate Java. Would probably not be much code.
For example a quick google turned up XStream. I've never used it but is seems to support multiple backends other than XML - e.g. JSON. You can implement your own writer and just write out the Java code needed to recreate the hierarchy.
I'm sure you could do the same with other libraries, in particular if you can hook into a SAX event stream.
See:
HierarchicalStreamWriter

Great question. I was thinking about serializing objects into java code to make testing easier. The use case would be to load some data into a db, then generate the code creating an object and later use this code in test methods to initialize data without the need to access the DB.
It is somehow true that the object state doesn't contain enough info to know how it's been created and transformed, however, for simple java beans there is no reason why this shouldn't be possible.
Do you feel like writing a small library for this purpose? I'll start coding soon!

XStream is a serialization library I used for serialization to XML. It should be possible and rather easy to extend it so that it writes Java code.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.