Cassandra custom type mapping? (Java)

I'd like to be able to do custom serialization/deserialization of raw types with Cassandra without needing to create a UDT. Basically I want to store a string column, but have it automatically deserialized into a String wrapper type I have. Is there any way to hook into the Cassandra mapping code? I looked through it and I didn't see an obvious place to plug in a custom mapper.

Currently it is not possible to plug a custom mapping mechanism into the Java driver, but JAVA-721 is scheduled for 2.1.7 and will probably bring the capability you are looking for.
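For a rough idea of the direction, here is a sketch of what a custom codec could look like with the TypeCodec API that later driver versions shipped under that ticket; MyString is a hypothetical wrapper type, and the exact signatures should be treated as illustrative:

import java.nio.ByteBuffer;
import com.datastax.driver.core.DataType;
import com.datastax.driver.core.ProtocolVersion;
import com.datastax.driver.core.TypeCodec;
import com.datastax.driver.core.exceptions.InvalidTypeException;

// Maps the CQL varchar type onto the hypothetical MyString wrapper by
// delegating to the driver's built-in String codec
public class MyStringCodec extends TypeCodec<MyString> {

    private final TypeCodec<String> inner = TypeCodec.varchar();

    public MyStringCodec() {
        super(DataType.varchar(), MyString.class);
    }

    @Override
    public ByteBuffer serialize(MyString value, ProtocolVersion version) throws InvalidTypeException {
        return value == null ? null : inner.serialize(value.getValue(), version);
    }

    @Override
    public MyString deserialize(ByteBuffer bytes, ProtocolVersion version) throws InvalidTypeException {
        String s = inner.deserialize(bytes, version);
        return s == null ? null : new MyString(s);
    }

    @Override
    public MyString parse(String value) throws InvalidTypeException {
        String s = inner.parse(value);
        return s == null ? null : new MyString(s);
    }

    @Override
    public String format(MyString value) throws InvalidTypeException {
        return value == null ? "NULL" : inner.format(value.getValue());
    }
}

Registered once via cluster.getConfiguration().getCodecRegistry().register(new MyStringCodec()), the driver then applies it wherever a MyString is bound or retrieved.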

Related

Using Morphia's fromDBObject without a Datastore

I am working on an application that directly uses Java MongoDB driver for Mongo queries.
I’d like to use Morphia to map retrieved Documents to my POJOs and vice versa (but I do not want to do queries through Morphia itself).
I am trying to achieve this with Morphia 1.1; however, fromDBObject in this version requires a Morphia Datastore as an argument (previous versions did without it), and I do not want to give Morphia an actual connection to the database. I am not using references to join data from different collections, so transforming an already retrieved document into a POJO does not require fetching any additional data from the DB.
Can I achieve this in version 1.1 (e.g. by creating and passing an empty, non-functional Datastore (how would I create one?), or just by passing null)?
If not, I can live with the older (1.0.1) version – but does that make sense?
And if not – what would be the best solution for mapping POJOs to Mongo documents – are there any other, currently maintained, libraries to achieve this?
And, again, if not – what would be the best way to implement this functionality myself? The solution should be as generic as possible regarding document and POJO classes schema, I am OK with annotating my entity classes.
Did you try passing in null for the Datastore? It's used for resolving any @Reference fields for the most part. You should be fine just passing null. But as always, "try it and see."
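For illustration, a minimal sketch of that, assuming the Morphia 1.1 fromDBObject(Datastore, Class, DBObject) signature; MyEntity is a placeholder for one of your annotated classes, and dbObject is a document you already fetched with the plain driver:

import com.mongodb.DBObject;
import org.mongodb.morphia.Morphia;

Morphia morphia = new Morphia();
morphia.map(MyEntity.class);

// No Datastore: fine as long as MyEntity has no @Reference fields
MyEntity entity = morphia.fromDBObject(null, MyEntity.class, dbObject);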

DataStax Cassandra Java driver - Object mapper - Auto create tables

The actual use case I'm working on has many classes that should be persisted (basically different sensor types). Currently I have to create the table by hand for every sensor type. Isn't there a driver mechanism that could auto-create the respective tables if they don't exist (as seen in e.g. Hibernate)?
This would allow me to deploy the app on other systems without needing to recreate the tables. Furthermore, it is quite handy for quick prototyping ;)
I created a partial solution to the problem - a table/UDT create-query generation facility. It can be found here:
https://gist.github.com/eintopf/3ae360110846cb80a227
Unfortunately the type mapping is NOT complete at the moment, since the respective type mapper class in DataStax's object mapper package is private.
The program just builds all the CREATE queries; you can use them however you like (copy-paste into cqlsh, or run them directly on the Cassandra session from Java).
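As a rough illustration of what the gist does, here is a stripped-down, hedged sketch that reflects over the driver's mapping annotations; the type mapping is deliberately minimal, for the same reason noted above:

import java.lang.reflect.Field;
import com.datastax.driver.mapping.annotations.Column;
import com.datastax.driver.mapping.annotations.PartitionKey;
import com.datastax.driver.mapping.annotations.Table;

public class CreateTableGenerator {

    // Incomplete on purpose: the driver's own Java-to-CQL type mapper is private
    private static String cqlType(Class<?> javaType) {
        if (javaType == String.class) return "text";
        if (javaType == Integer.class || javaType == int.class) return "int";
        if (javaType == Long.class || javaType == long.class) return "bigint";
        if (javaType == Double.class || javaType == double.class) return "double";
        throw new IllegalArgumentException("unmapped type: " + javaType);
    }

    public static String createTableQuery(Class<?> entity) {
        Table table = entity.getAnnotation(Table.class);
        StringBuilder cql = new StringBuilder("CREATE TABLE IF NOT EXISTS ")
                .append(table.keyspace()).append('.').append(table.name()).append(" (");
        String partitionKey = null;
        for (Field field : entity.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            String name = (column != null && !column.name().isEmpty()) ? column.name() : field.getName();
            cql.append(name).append(' ').append(cqlType(field.getType())).append(", ");
            if (field.isAnnotationPresent(PartitionKey.class)) {
                partitionKey = name;
            }
        }
        return cql.append("PRIMARY KEY (").append(partitionKey).append("))").toString();
    }
}

The resulting string can then be pasted into cqlsh or passed to session.execute(...).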
Not at the moment, but this is a planned feature (JAVA-569).

What's the best way to read a UDT from a database with Java?

I thought I knew everything about UDTs and JDBC until someone on SO pointed out some details of the Javadoc of java.sql.SQLInput and java.sql.SQLData to me. The essence of that hint was (from SQLInput):
An input stream that contains a stream of values representing an instance of an SQL structured type or an SQL distinct type. This interface, used only for custom mapping, is used by the driver behind the scenes, and a programmer never directly invokes SQLInput methods.
This is quite the opposite of what I am used to doing (which is also used and stable in production systems with the Oracle JDBC driver): implement SQLData and provide that implementation in a custom mapping to
ResultSet.getObject(int index, Map mapping)
The JDBC driver will then call back on my custom type using the
SQLData.readSQL(SQLInput stream, String typeName)
method. I implement this method and read each field from the SQLInput stream. In the end, getObject() will return a correctly initialised instance of my SQLData implementation holding all data from the UDT.
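To make that concrete, here is a minimal sketch of such an implementation; the SQL type ADDRESS_T and its two attributes are made up for illustration:

import java.sql.SQLData;
import java.sql.SQLException;
import java.sql.SQLInput;
import java.sql.SQLOutput;

// Custom mapping for a hypothetical SQL object type ADDRESS_T(street VARCHAR, no NUMBER)
public class Address implements SQLData {

    public String street;
    public int no;
    private String typeName;

    @Override
    public String getSQLTypeName() throws SQLException {
        return typeName;
    }

    @Override
    public void readSQL(SQLInput stream, String typeName) throws SQLException {
        this.typeName = typeName;
        // Attributes must be read in the order they are declared in the SQL type
        street = stream.readString();
        no = stream.readInt();
    }

    @Override
    public void writeSQL(SQLOutput stream) throws SQLException {
        stream.writeString(street);
        stream.writeInt(no);
    }
}

Reading it back then looks like this:

Map<String, Class<?>> mapping = new HashMap<>();
mapping.put("MY_SCHEMA.ADDRESS_T", Address.class);
Address address = (Address) resultSet.getObject(1, mapping);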
To me, this seems like the perfect way to implement such a custom mapping. Good reasons for going this way:
I can use the standard API, instead of using vendor-specific classes such as oracle.sql.STRUCT, etc.
I can generate source code from my UDTs, with appropriate getters/setters and other properties
My questions:
What do you think about my approach, implementing SQLData? Is it viable, even if the Javadoc states otherwise?
What other ways of reading UDTs in Java do you know of? E.g. what does Spring do? What does Hibernate do? What does JPA do? What do you do?
Addendum:
UDT support and integration with stored procedures is one of the major features of jOOQ. jOOQ aims at hiding the more complex "JDBC facts" from client code, without hiding the underlying database architecture. If you have questions similar to the above, jOOQ might provide an answer for you.
The advantage of configuring the driver so that it works behind the scenes is that the programmer does not need to pass the type map into ResultSet.getObject(...) and therefore has one less detail to remember (most of the time). The driver can also be configured at runtime using properties to define the mappings, so the application code can be kept independent of the details of the SQL type to object mappings. If the application could support several different databases, this allows different mappings to be supported for each database.
Your method is viable; its main characteristic is that the application code uses explicit type mappings.
In the behind-the-scenes approach, the ResultSet.getObject(int) method will use the type mappings defined on the connection rather than those passed by the application code in ResultSet.getObject(int index, Map mapping). Otherwise the approaches are the same.
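As a sketch of that variant, reusing the hypothetical Address class from above:

// Register the mapping once on the connection; plain getObject(int) then uses it
Map<String, Class<?>> map = connection.getTypeMap();
map.put("MY_SCHEMA.ADDRESS_T", Address.class);
connection.setTypeMap(map);

Address address = (Address) resultSet.getObject(1); // no explicit map argument needed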
Other Approaches
I have seen another approach used with JBoss 4 based on these classes:
org.jboss.ejb.plugins.cmp.jdbc.JDBCParameterSetter
org.jboss.ejb.plugins.cmp.jdbc.JDBCResultSetReader.AbstractResultSetReader
The idea is the same but the implementation is non-standard (it probably pre-dates the version of the JDBC standard defining SQLData/SQLInput).
What other ways of reading UDTs in Java do you know of? E.g. what does Spring do? What does Hibernate do? What does JPA do? What do you do?
An example of how something similar to this can be done in Hibernate/JPA is shown in this answer to another question:
Java Enums, JPA and Postgres enums - How do I make them work together?
I know what Spring does: you write implementations of their RowMapper interface. I've never used SQLData with Spring. Your post was the first time I'd ever heard of or thought about that interface.

Versioned Serialization in Java

I have a simple Java class that I need to serialize to be stored as a value in an RDBMS or a key-value store. The class is just a collection of properties of simple types (native types or Maps/Lists of native types). The issue is that the class will likely be evolving over time (likely: adding new properties, less likely but still possible: renaming a property, changing the type of a property, deleting a property).
I'd like to be able to gracefully handle changes in the class version. Specifically, when an attempt is made to de-serialize an instance from an older version, I'd like to be able to specify some sort of handler for managing the transition of the older instance to the newest version.
I don't think Java's built-in serialization is appropriate here. Before I try to roll my own solution, I'm wondering if anyone knows of any existing libraries that might help? I know of a ton of alternative serialization methods for Java, but I'm specifically looking for something that will let me gracefully handle changes to a class definition over time.
Edit:
For what it's worth, I ended up going with Protocol Buffers (http://code.google.com/p/protobuf/) serialization, since it's flexible about adding and renaming fields, while being one less piece of code I have to maintain (compared to custom Java serialization using readObject/writeObject).
Java serialisation allows customising the serial form by providing readObject and writeObject methods. Together with ObjectInputStream.readFields, ObjectOutputStream.putFields and defining serialPersistentFields, the serialised form can be unrelated to the actual fields in the implementation class.
However, Java serialisation produces opaque data that is not amenable to reading and writing through other techniques.
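A minimal sketch of that mechanism, with invented field names; the serial form is pinned to two named fields that can outlive refactorings of the real instance fields:

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class Settings implements Serializable {

    private static final long serialVersionUID = 1L;

    // The serial form consists of exactly these named fields, decoupled
    // from the actual instance fields below
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("name", String.class),
        new ObjectStreamField("timeoutMillis", long.class)
    };

    private String name;
    private long timeoutMillis;

    private void writeObject(ObjectOutputStream out) throws IOException {
        ObjectOutputStream.PutField fields = out.putFields();
        fields.put("name", name);
        fields.put("timeoutMillis", timeoutMillis);
        out.writeFields();
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream.GetField fields = in.readFields();
        name = (String) fields.get("name", null);
        timeoutMillis = fields.get("timeoutMillis", 0L); // default when absent in old streams
    }
}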
Perhaps you should map your Java class onto the relational model instead. Dumping a language-specific serialized blob into a database column is a horrible approach.
This is pretty straightforward using readObject and writeObject.
Try setting serialVersionUID to a fixed value, then define a static final field for your version. readObject can then use a switch statement to construct the fields depending on the version. We use this to store historical data on our file system. It's very quick on retrieval, so much so that users can't tell the difference.
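A hedged sketch of that pattern, with an invented record class and version numbers:

import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class HistoricalRecord implements Serializable {

    // Fixed so old and new class versions stay stream-compatible
    private static final long serialVersionUID = 1L;

    private static final int CURRENT_VERSION = 2;

    private String name;
    private String unit; // added in version 2

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.writeInt(CURRENT_VERSION);
        out.writeUTF(name);
        out.writeUTF(unit);
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        int version = in.readInt();
        switch (version) {
            case 1:
                name = in.readUTF();
                unit = "unknown"; // sensible default for pre-version-2 data
                break;
            case 2:
                name = in.readUTF();
                unit = in.readUTF();
                break;
            default:
                throw new InvalidObjectException("unsupported version " + version);
        }
    }
}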
I had a similar problem. I found out Java's serialVersionUID doesn't help much when you have multiple versions of objects, so I rolled my own.
Here is what I do to save our user sessions:
In my DB, besides the BLOB field for serialized objects, I added a version column.
Whenever we change the session object, I save the old class, for example SessionV3.
Session is always written to the DB with current version number.
When reading the session, it's deserialized into session object directly if version is current. Otherwise, it's deserialized into old object and manually copied into current session object (SessionV3 => Session).
Once in a while, we run a DB script to remove really old session versions so we can clean the old session classes out of the code. If we care about the old sessions, we can choose to convert them instead.
There might be an easier way to do this, but our approach gives us the most flexibility.
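A hedged sketch of the read path described above; SessionV3, CURRENT_VERSION and the conversion method are illustrative:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

// Dispatch on the version column stored next to the BLOB
Session readSession(int version, byte[] blob) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
        if (version == CURRENT_VERSION) {
            return (Session) in.readObject();
        }
        if (version == 3) {
            SessionV3 old = (SessionV3) in.readObject();
            return Session.fromV3(old); // manual field-by-field copy
        }
        throw new IllegalStateException("unsupported session version " + version);
    }
}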
Never tried it, but you may be able to do something with a custom classloader to load the correct version of the class file at runtime for the object being deserialized.

How to Serialize Hibernate Collections Properly?

I'm trying to serialize objects from a database that have been retrieved with Hibernate, and I'm only interested in the objects' actual data in its entirety (cycles included).
Now I've been working with XStream, which seems powerful. The problem with XStream is that it looks at the data all too blindly: it serializes Hibernate's PersistentCollections as they are, with all the Hibernate metadata included. I don't want to serialize those.
So, is there a reasonable way to extract the original Collection from within a PersistentCollection, and also to initialize all referenced data the objects might be pointing to? Or can you recommend a better approach?
(The results from Simple seem perfect, but it can't cope with such basic util classes as Calendar, and it accepts only one annotated object at a time.)
The solution described here worked well for me: http://jira.codehaus.org/browse/XSTR-226
The idea is to have a custom XStream converter/mapper for Hibernate collections, which extracts the actual collection from the Hibernate one and calls the corresponding standard converter (for ArrayList, HashMap, etc.).
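Roughly the shape of such a converter, as a hedged sketch (simplified from that issue; package names are from Hibernate 3, and only List/Set-style collections are handled here):

import java.util.ArrayList;
import java.util.Collection;

import org.hibernate.collection.PersistentCollection;

import com.thoughtworks.xstream.converters.Converter;
import com.thoughtworks.xstream.converters.MarshallingContext;
import com.thoughtworks.xstream.converters.UnmarshallingContext;
import com.thoughtworks.xstream.io.HierarchicalStreamReader;
import com.thoughtworks.xstream.io.HierarchicalStreamWriter;

public class HibernateCollectionConverter implements Converter {

    @Override
    public boolean canConvert(Class type) {
        return PersistentCollection.class.isAssignableFrom(type);
    }

    @Override
    public void marshal(Object source, HierarchicalStreamWriter writer, MarshallingContext context) {
        // Copying forces initialization and strips the Hibernate wrapper;
        // the plain ArrayList is then handled by XStream's standard converter
        Collection<Object> plain = new ArrayList<Object>((Collection<?>) source);
        context.convertAnother(plain);
    }

    @Override
    public Object unmarshal(HierarchicalStreamReader reader, UnmarshallingContext context) {
        throw new UnsupportedOperationException("serialization only");
    }
}

Registered with xstream.registerConverter(new HibernateCollectionConverter()); maps (PersistentMap) would need an analogous converter copying into a HashMap.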
I recommend a simpler approach: use Dozer (http://dozer.sf.net). Dozer is a bean mapper; you can use it to convert, say, a PersonEJB to an object of the same class. Dozer will recursively trigger all proxy fetches through getter calls, and will also convert source types to destination types (say, java.sql.Date to java.util.Date).
Here's a snippet:
// Maps the (possibly proxied) Hibernate instance onto a plain instance of the same class
MapperIF mapper = DozerBeanMapperSingletonWrapper.getInstance();
PersonEJB serializablePerson = mapper.map(myPersonInstance, PersonEJB.class);
Bear in mind that as Dozer walks your object tree it will trigger the proxy loading one by one, so if your object graph has many proxies you will see many queries, which can be expensive.
What generally seems to be the best way to do it, and the way I am currently doing it, is to have another layer of DTO objects. This way you can exclude data that you don't want to go over the channel, as well as limit the depth to which the graph is serialized. I use Dozer for my current DTO (Data Transfer Object) mapping from Hibernate objects to the Flex client.
It works great, with a few caveats:
It's not fast; in fact it's downright slow. If you send a lot of data, Dozer will not perform very well. This is mostly because of the reflection involved in performing its magic.
In a few cases you'll have to write custom converters for special behavior. These work very well, but they are bi-directional. I personally had to hack the Dozer source to allow uni-directional custom converters.
