GreenDao and entity inheritance

My task is to build a disk cache on Android for my application (it is some sort of messenger). I'd like to store messages in a database, but I have run into a problem storing different types of messages (currently there are 5 types; each type has its own fields and they all extend a base class).
GreenDao documentation says:
Note: currently it’s impossible to have another entity as a super class (there are no polymorphic queries either)
I am planning to have an entity that maps almost 1 to 1 to the base class, except for one column: raw binary or JSON data into which every child class can write anything it needs.
My questions are:
Is GreenDao a good solution in such a case? Are there any solutions that let me not worry about inheritance, and how much do they cost in terms of efficiency?
How do I "serialize" data to such a field (which method should I override, or where should I put the code that does all the necessary work)?
How do I give GreenDao the correct constructor to "deserialize" JSON or binary data into the correct class instance?
Should I use reflection, or just a switch/case to find the correct constructor (only 5 types of constructors are possible)? How much will reflection "cost" in such a case?

If you really need inheritance, GreenDao is not the right choice, since it doesn't support it. But I think you can go without inheritance:
You can design an entity with a discriminator column (messagetype) and a binary or text column (data). Then you can use an abstract factory to create the desired objects from data depending on the messagetype.
If the conversion is complex, I'd put it in a separate class, otherwise I'd put it as a method in the keep section.
Be aware that this design may slow you down, if you really have a lot of messages, since separate tables would reduce index sizes.
Talking about indexes: if you want to access a message through some property of your data column later on, you are screwed since you can't put an index on it.
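
The discriminator-plus-factory idea above could look roughly like the following. This is only a sketch in plain Java: the Message subclasses, the "text"/"location" type values and the comma-separated encoding are made up for illustration, and the GreenDao entity itself is left out (its messagetype and data columns would supply the two arguments of fromRow). It uses a lookup table of decoders rather than reflection, which for 5 types is about as cheap as a switch.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical base class and two message types; names are illustrative only.
abstract class Message {
    abstract String messageType(); // value for the discriminator column
    abstract byte[] toData();      // value for the data column
}

final class TextMessage extends Message {
    final String text;
    TextMessage(String text) { this.text = text; }
    @Override String messageType() { return "text"; }
    @Override byte[] toData() { return text.getBytes(StandardCharsets.UTF_8); }
}

final class LocationMessage extends Message {
    final double lat;
    final double lon;
    LocationMessage(double lat, double lon) { this.lat = lat; this.lon = lon; }
    @Override String messageType() { return "location"; }
    @Override byte[] toData() { return (lat + "," + lon).getBytes(StandardCharsets.UTF_8); }
}

// Factory that picks a decoder by discriminator value; no reflection involved.
final class MessageFactory {
    private static final Map<String, Function<byte[], Message>> DECODERS = new HashMap<>();
    static {
        DECODERS.put("text", data -> new TextMessage(new String(data, StandardCharsets.UTF_8)));
        DECODERS.put("location", data -> {
            String[] parts = new String(data, StandardCharsets.UTF_8).split(",");
            return new LocationMessage(Double.parseDouble(parts[0]), Double.parseDouble(parts[1]));
        });
    }

    static Message fromRow(String messageType, byte[] data) {
        Function<byte[], Message> decoder = DECODERS.get(messageType);
        if (decoder == null) {
            throw new IllegalArgumentException("Unknown message type: " + messageType);
        }
        return decoder.apply(data);
    }
}
```

When saving, the entity would be filled from messageType() and toData(); when loading, MessageFactory.fromRow(...) turns the row back into the right subclass.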

Related

Best approach for linking diverse entity types in JPA

Short version for the hasty:
There are various tables/entities in my domain model which have the same field (a UUID). There is a table where I need to link rows/instances of such entities to other JPA-managed entities. In other words, the instance of the field in that link table won't be known up-front. The two approaches I can think of are:
Use an abstract entity and a TABLE_PER_CLASS strategy, or
use a @MappedSuperclass and store the class name of the instance in the link table as well, or something similar that lets me define logic for getting the actual instance from the right table.
Both have advantages and disadvantages in terms of complexity and performance. Which do you believe to be best, is there maybe a third option, or have you tried something like this in the past and would advise/strongly warn against it?
Long version in case you want more background:
I have a database/object model wherein many types have a common field: a universally unique identifier (UUID). The reason for this is that instances of these types can be subject to changes. The changes follow the command model and their data can be encapsulated and itself persisted. Let's call such a change a "mutation". It must be possible to find out which mutations exist in the database for any given entity, and vice-versa, on which entity a stored mutation operates.
Take the following entities with UUIDs as an (extremely simplified) example:
To store the "mutations", we use a table/entity called MutationHolder. To link a mutation to its target entity, there's a MutationEntityLink. The only reason this data isn't directly on the MutationHolder is because there can be direct or indirect links, but that's of little importance here so I left it out:
The question comes down to how I can model the entity field in MutationEntityLink. There are two approaches I can think of.
The first is to make an abstract @Entity-annotated class with the UUID field. Customer, Contract and Address would extend it. So it is a TABLE_PER_CLASS strategy. I assume that I could use this as a type for the entity field, although I'm not certain. However, I fear this might have a serious performance penalty since JPA would need to query many tables to find the actual instance.
The second is to simply use @MappedSuperclass and just store the UUID for an entity in the entity field of MutationEntityLink. In order to get the actual entity with that UUID, I'd have to solve it programmatically. Adding an additional column with the class name of the entity, or something else that allows me to identify it or paste it in a JPQL query would do. This requires more work but seems more efficient. I'm not averse to coding some utility classes or doing some reflection/custom annotation work if needed.
My question is which of these approaches seems best? Alternatively, you might have a better suggestion, or notice I'm missing something; for example, maybe there's a way to add a type column even with TABLE_PER_CLASS inheritance to point JPA to the right table? Perhaps you've tried something like this and want to warn me about numerous issues that would arise.
Some additional info:
We create the database schema, so we can add whatever we want.
A single table inheritance strategy isn't an option. The tables must remain distinct. For the same reason, joined inheritance doesn't seem a good fit either.
The JPA provider is Hibernate and using things that are not part of the JPA standard isn't an issue.

If the entities don't have anything in common besides having a uuid, I'd use the second approach you describe: use @MappedSuperclass. Making the common superclass an entity would prevent you from using a different inheritance strategy if needed, would require a table for that super entity even if no instances exist, and from a business point of view it's just wrong.
The link itself could be implemented in multiple ways, e.g. you could subclass MutationEntityLink for each entity to map (e.g. CustomerMutationEntityLink etc.) or do as you described it, i.e. only store the uuid as well as some discriminator/type information and resolve programmatically (we're using that approach for something similar btw.).
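
To illustrate the "store uuid plus type information and resolve programmatically" variant, here is a rough sketch without any JPA dependency. The field names and the EntityResolver class are invented for the example; in a real application each registered finder would presumably run a query against the matching entity's table (for instance through the EntityManager) instead of reading from an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.function.Function;

// Stand-in for the link row: only the target's UUID and a type discriminator are stored.
final class MutationEntityLink {
    final UUID entityUuid;
    final String entityType;

    MutationEntityLink(UUID entityUuid, String entityType) {
        this.entityUuid = entityUuid;
        this.entityType = entityType;
    }
}

// Maps each discriminator value to a lookup function for that entity type.
final class EntityResolver {
    private final Map<String, Function<UUID, Object>> finders = new HashMap<>();

    void register(String entityType, Function<UUID, Object> finder) {
        finders.put(entityType, finder);
    }

    // Returns the linked entity, or an empty Optional if no row with that UUID exists.
    Optional<Object> resolve(MutationEntityLink link) {
        Function<UUID, Object> finder = finders.get(link.entityType);
        if (finder == null) {
            throw new IllegalStateException("No finder registered for type " + link.entityType);
        }
        return Optional.ofNullable(finder.apply(link.entityUuid));
    }
}
```

One finder is registered per entity type (Customer, Contract, Address, ...), so adding a new linkable type only means adding one registration.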

You need to use @MappedSuperclass when inheriting associations/methods/properties, whereas TABLE_PER_CLASS is generally used when you have an entity and sub-entities. If there are entities having an association with the base class in the model, then use TABLE_PER_CLASS since the base class behaves like an entity. Otherwise, since the base class would include properties/attributes and methods which are general to such entities not related to each other, using @MappedSuperclass would be a better idea.
Example 1: You need to set alarms for some different activities like "take medicine", "call mom", "go to doctor" etc. The content of the alarm message does not matter, you will need a reminder. So use TABLE_PER_CLASS since the alarm message, which is your base class, is like an entity here.
Example 2: Assume the base class AbstractDomainObject enables you to create login ID, loginName, creation/modification date for each object where no entity has an association with the base class, you will need to specify the association for the sake of clearing later, like "Company", "University" etc. In this situation, using @MappedSuperclass would be better.

How does serialization tool skip unknown fields during deserialization?

How does a serialization tool (e.g. Hessian) deserialize a class of a different version with the same serialVersionUID? In most cases, it can skip unknown fields (those not found by the class loader) and stay compatible. But last time, I tried appending a new field of type Map<String, Object> and put some unknown object into the map; then it threw a ClassNotFoundException.
Why can't it skip the map like the others?
Is it a problem with the tool's implementation or with the serialization mechanism?

This would depend on the tool itself. serialVersionUID is intended for use by Java's built-in serializer (ObjectOutputStream) which, as best I can tell from reading the Hessian source, is not used by Hessian.
For Hessian specifically, the best source I can find which mentions these kinds of changes is this email:
At least for Hessian, it's best to think of versioning as a set of
types of changes that can be handled.
Specifically Hessian can manage the following kinds of changes: 1)
if you add or drop a field, the side that doesn't understand the
field will ignore it. 2) some field type changes are possible, if
Hessian can convert (e.g. int to long) 3) there's some flexibility
on map(bean) types, depending on how much information Hessian has
(which is a reason to prefer concrete types.)
So, if the sender sends an untyped map {"field1", 10} and the target
is known to be MyValue { int field1; }, then Hessian can map the
fields.
But it cannot manage things like: 1) field name changes (the data
will be dropped). 2) class name changes where the target is
underdefined, like Object field1. If you send a MyValue2 as the new
field1, when the previous version was MyValue1, Hessian can't make
that automatic transition. (But as with #3 above, a "MyValue2 field1"
would give Hessian enough information to translate.) 3) class
splits, e.g. creating a subclass and pushing some fields into it.
4) map to list or list to map changes.
Basically, I don't think Hessian intends to support unknown types in maps.

Best practice design pattern for defining "types" in a database with potential multi language requirement?

More specifically, my question is this:
I want users on multiple front ends to see the "Type" of a database row. Let's say for ease that I have a person table and the types can be Student, Teacher, Parent etc.
The specific program would be Java with Hibernate; I doubt that's important for the question, but let's say my data is modelled into entity beans and a Person "type" field is an enum that contains my 3 options. Ideally I want my Person object to have a getType() method that my front end can use to display the type, and I also need a way for my front end to know the potential types.
With the enum method I have this functionality but what I don't have is the ability to easily add new types without re-compiling.
So the next thought is that I put my types into a config file and simply store them in the database as strings. My getType() method works, but now my front end has to load a config file to get the potential types AND there's nothing to keep them in sync: I could remove a type from my config file and the type in the database would point to nothing. I don't like this either.
Final thought is that I create a PersonTypes database table, this table has a number for type_id and a string defining the type. This is OK, and if the foreign key is set up I can't delete types that I'm using, my front end will need to get sight of potential types, I guess the best way is to provide a service that will use the hibernate layer to do this.
The problem with this method is that my types are all in English in the database, and I want my application to support multiple languages (eventually), so I need some sort of properties file to store the labels for the types. So do I have a PersonType table that purely contains integers and then a properties file that describes the label per integer? That seems backwards.
Is there a common design pattern to achieve this kind of behaviour? Or can anyone suggest a good way to do this?
Regards,
Glen x

I would go with the last approach that you have described. Having the type information in a separate table should be good enough, and it will let you use all the benefits of SQL for managing additional constraints (types will probably be unique, and foreign key checks will ensure that you won't introduce any misbehaviour when you delete records).
If each type has an i18n value defined in properties files, then you are safe. If the type is removed, this value will simply not be used. If you want, you can change the properties files at runtime.
Another approach I can think of would be to store i18n strings along with the type information in PersonType. This is acceptable for a small number of languages, although it might be considered an antipattern. But it would allow you to have a method like this:
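// Requires: import java.util.Locale;
public String getName(PersonType type, Locale loc) {
    // Compare by language code so that regional variants such as en_US or de_AT also match.
    String lang = loc.getLanguage();
    if (lang.equals(Locale.ENGLISH.getLanguage())) {
        return type.getEnglishName();
    } else if (lang.equals(Locale.GERMAN.getLanguage())) {
        return type.getGermanName();
    } else {
        return type.getDefaultName();
    }
}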

Internationalizing dynamic values is always difficult. Your last method for storing the types is the right one.
If you want to be able to i18n them, you can use resource bundles as properties files in your app. This forces you to modify the properties files and redeploy and restart the app each time a new type is added. You can also fall back to the English string stored in database if the type is not found in the resource bundle.
Or you can implement a custom ResourceBundle class that fetches its keys and values from the database directly, and have an additional PersonTypeI18n table which contains the translations for all the locales you want to support.
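
A custom ResourceBundle along those lines might look like the sketch below. It is deliberately minimal: the map passed to the constructor stands in for the rows of the PersonTypeI18n table for one locale (loading them from the database is left out), and the class name, key format and labelFor helper are assumptions made for the example.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;
import java.util.ResourceBundle;

// ResourceBundle whose entries come from a map instead of a .properties file.
final class PersonTypeBundle extends ResourceBundle {
    private final Map<String, String> translations;

    PersonTypeBundle(Map<String, String> translationsForLocale) {
        this.translations = new HashMap<>(translationsForLocale);
    }

    @Override
    protected Object handleGetObject(String key) {
        // Returning null tells ResourceBundle the key is not in this bundle.
        return translations.get(key);
    }

    @Override
    public Enumeration<String> getKeys() {
        return Collections.enumeration(translations.keySet());
    }

    // Label lookup that falls back to the name stored on the type row itself.
    String labelFor(String typeKey, String fallbackName) {
        return containsKey(typeKey) ? getString(typeKey) : fallbackName;
    }
}
```

With this, a type that has no translation yet still shows its stored (English) name instead of failing with a MissingResourceException.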

You can use the following practices:
Use the singleton design pattern
Use a caching framework such as Ehcache to cache the person types and reload them when needed.

Structural design pattern

I'm working with three separate classes: Group, Segment and Field. Each group is a collection of one or more segments, and each segment is a collection of one or more fields. There are different types of fields that subclass the Field base class. There are also different types of segments that are all subclasses of the Segment base class. The subclasses define the types of fields expected in the segment. In any segment, some of the fields defined must have values inputted, while some can be left out. I'm not sure where to store this metadata (whether a given field in a segment is optional or mandatory.)
What is the cleanest way to store this metadata?

I'm not sure you are giving enough information about the complete application to get the best answer. However here are some possible approaches:
Define an isValid() method in your base class, which by default returns true. In your subclasses, you can code specific logic for each Segment or FieldType to return false if any requirements are missing. If you want to report an error message to say which fields are missing, you could add a List argument to the isValid method to allow each type to report the list of missing values.
Use Annotations (as AlexR said above).
The benefit of the above 2 approaches is that meta data is within the code, tied directly to the objects that require it. The disadvantage is that if you want to change the required fields, you will need to update the code and deploy a new build.
If you need something which can be changed on the fly, then Gangus's suggestion of XML is a good start, because your application could reload the XML definition at run-time and produce different validation results.
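
For the first approach (an isValid() method that reports missing values through a List argument), a bare-bones sketch could be the following. AddressSegment and its two fields are hypothetical and only there to show one mandatory and one optional value.

```java
import java.util.List;

abstract class Segment {
    // Default: no requirements, so nothing is ever reported as missing.
    boolean isValid(List<String> missing) {
        return true;
    }
}

// Hypothetical segment with one mandatory and one optional value.
class AddressSegment extends Segment {
    String street; // mandatory
    String unit;   // optional

    @Override
    boolean isValid(List<String> missing) {
        boolean ok = true;
        if (street == null || street.isEmpty()) {
            missing.add("street"); // report which value is missing
            ok = false;
        }
        return ok;
    }
}
```

The caller passes in an empty list and can build an error message from whatever names end up in it.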

I think the best place for such data is a plain XML file, and the best structure for working with it is an XML DOM with XPath. Working with classes would be too complicated.

Since Java 5 was released, this kind of metadata can be stored using annotations. Define your own annotation @MandatoryField and mark all mandatory fields with it. Then you can inspect the object field by field using reflection, check whether any uninitialized fields are mandatory, and throw an exception in that case.
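
A minimal sketch of that annotation approach, assuming "uninitialized" simply means the field is still null. HeaderSegment and the validator class are invented for the example; instead of throwing directly, the validator returns the names of the missing fields so the caller can decide what to do.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Marker annotation; it must be retained at runtime so reflection can see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface MandatoryField {}

// Hypothetical segment type: 'code' is mandatory, 'comment' is optional.
class HeaderSegment {
    @MandatoryField String code;
    String comment;
}

final class MandatoryFieldValidator {
    // Returns the names of @MandatoryField fields that are still null,
    // walking up the class hierarchy so inherited fields are checked too.
    static List<String> missingFields(Object target) {
        List<String> missing = new ArrayList<>();
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!f.isAnnotationPresent(MandatoryField.class)) {
                    continue;
                }
                f.setAccessible(true);
                try {
                    if (f.get(target) == null) {
                        missing.add(f.getName());
                    }
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot read field " + f.getName(), e);
                }
            }
        }
        return missing;
    }
}
```

A caller that wants the exception behaviour can throw when the returned list is not empty.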

Persisting data suited for enums

Most projects have some sort of data that are essentially static between releases and well-suited for use as an enum, like statuses, transaction types, error codes, etc. For example's sake, I'll just use a common status enum:
public enum Status {
    ACTIVE(10, "Active"),
    EXPIRED(11, "Expired");
    /* other statuses... */
    /* constructors, getters, etc. */
}
I'd like to know what others do in terms of persistence regarding data like these. I see a few options, each of which have some obvious advantages and disadvantages:
Persist the possible statuses in a status table and keep all of the possible status domain objects cached for use throughout the application
Only use an enum and don't persist the list of available statuses, creating a data consistency holy war between me and my DBA
Persist the statuses and maintain an enum in the code, but don't tie them together, creating duplicated data
My preference is the second option, although my DBA claims that our end users might want to access the raw data to generate reports, and not persisting the statuses would lead to an incomplete data model (counter-argument: this could be solved with documentation).
Is there a convention that most people use here? What are people's experiences with each, and are there other alternatives?
Edit:
After thinking about it for a while, my real persistence struggle comes with handling the id values that are tied to the statuses in the database. These values would be inserted as default data when installing the application. At this point they'd have ids that are usable as foreign keys in other tables. I feel like my code needs to know about these ids so that I can easily retrieve the status objects and assign them to other objects. What do I do about this? I could add another field, like "code", to look stuff up by, or just look up statuses by name, which is icky.

We store enum values using some explicit string or character value in the database. Then to go from database value back to enum we write a static method on the enum class to iterate and find the right one.
If you expect a lot of enum values, you could create a static mapping HashMap<String,MyEnum> to translate quickly.
Don't store the actual enum name (i.e. "ACTIVE" in your example) because that's easily refactored by developers.
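
Roughly, that could look like this. The one-letter codes "A" and "E" are made up for the example; the point is that the stored value is an explicit code that survives renaming or reordering of the constants, and the static map avoids iterating over values() on every lookup.

```java
import java.util.HashMap;
import java.util.Map;

enum Status {
    ACTIVE("A"),
    EXPIRED("E");

    // Explicit value written to the database; independent of name() and ordinal().
    private final String dbCode;

    // Lookup table from database value back to the enum constant.
    private static final Map<String, Status> BY_DB_CODE = new HashMap<>();
    static {
        for (Status s : values()) {
            BY_DB_CODE.put(s.dbCode, s);
        }
    }

    Status(String dbCode) {
        this.dbCode = dbCode;
    }

    String toDb() {
        return dbCode;
    }

    static Status fromDb(String dbCode) {
        Status s = BY_DB_CODE.get(dbCode);
        if (s == null) {
            throw new IllegalArgumentException("Unknown status code: " + dbCode);
        }
        return s;
    }
}
```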

I'm using a blend of the three approaches you have documented...
Use the database as the authoritative source for the Enum values. Store the values in a 'code' table of some sort. Each time you build, generate a class file for the Enum to be included in your project.
This way, if the enum changes value in the database, your code will be properly invalidated and you will receive appropriate compile errors from your Continuous Integration server. You have a strongly typed binding to your enumerated values in the database, and you don't have to worry about manually syncing the values between code and the data.

Joshua Bloch gives an excellent explanation of enums and how to use them in his book "Effective Java, Second Edition" (p.147)
There you can find all sorts of tricks how to define your enums, persist them and how to quickly map them between the database and your code (p.154).
During a talk at the Jazoon 2007, Bloch gave the following reasons to use an extra attribute to map enums to DB fields and back: An enum is a constant but code isn't. To make sure that a developer editing the source can't accidentally break the DB mapping by reordering the enums or renaming them, you should add a specific attribute (like "dbName") to the enum and use that to map it.
Enums have an intrinsic id (the ordinal, which is used by the switch() statement) but this id changes when you change the order of elements (for example by sorting them or by adding elements in the middle).
So the best solution is to add a toDB() and fromDB() method and an additional field. I suggest using short, readable strings for this new field, so you can decode a database dump without having to look up the enums.

While I am not familiar with the idea of "attributes" in Java (and I don't know what language you're using), I've generally used the idea of a code table (or domain specific tables) and I've attributed my enum values with more specific data, such as human readable strings (for instance, if my enum value is NewStudent, I would attribute it with "New Student" as a display value). I then use Reflection to examine the data in the database and insert or update records in order to bring them in line with my code, using the actual enum value as the key ID.

What I have done on several occasions is to define the enum in the code and a storage representation in the persistence layer (DB, file, etc.) and then have conversion methods to map them to each other. These conversion methods need only be used when reading from or writing to the persistent store, and the application can use the type-safe enums everywhere. In the conversion methods I used switch statements to do the mapping. This also makes it possible to throw an exception if a new or unknown state is to be converted (usually because either the app or the data is newer than the other and new or additional states have been declared).
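
As an illustration of such switch-based conversion methods (the enum, the converter class and the integer storage codes are all hypothetical):

```java
enum TransactionType { DEPOSIT, WITHDRAWAL }

final class TransactionTypeConverter {
    // Enum -> storage code; only called when writing to the persistent store.
    static int toStorage(TransactionType type) {
        switch (type) {
            case DEPOSIT:
                return 1;
            case WITHDRAWAL:
                return 2;
            default:
                // A constant was added to the enum without a storage code.
                throw new IllegalArgumentException("No storage code for " + type);
        }
    }

    // Storage code -> enum; only called when reading from the persistent store.
    static TransactionType fromStorage(int code) {
        switch (code) {
            case 1:
                return TransactionType.DEPOSIT;
            case 2:
                return TransactionType.WITHDRAWAL;
            default:
                // The data is newer than the application, or it is corrupt.
                throw new IllegalArgumentException("Unknown storage code: " + code);
        }
    }
}
```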

If there's at least a minor chance that the list of values will need to be updated, then it's option 1. Otherwise, it's option 3.

Well we don't have a DBA to answer to, so our preference is for option 2).
We simply save the Enum value into the database, and when we are loading data out of the database and into our Domain Objects, we just cast the integer value to the enum type.
This avoids any of the synchronisation headaches with options 1) and 3). The list is defined once - in the code.
However, we have a policy that nobody else accesses the database directly; they must come through our web services to access any data. So this is why it works well for us.

In your database, the primary key of this "domain" table doesn't have to be a number. Just use a varchar pk and a description column (for the purposes your dba is concerned about). If you need to guarantee the ordering of your values without relying on the alphabetical sort, just add a numeric column named "order" or "sequence".
In your code, create a static class with constants whose name (camel-cased or not) maps to the description and value maps to the pk. If you need more than this, create a class with the necessary structure and comparison operators and use instances of it as the value of the constants.
If you do this a lot, build a script to generate the instantiation / declaration code.
