Mapping a document with a partly-defined schema - Java

I'm writing a demo app using Spring & MongoDB as a database.
My main domain class looks like:
@Document
public class Person {
    @Id
    private String id;
    // some other fields
    private DBObject additionalData;
}
The key point is that additionalData is a subdocument with no schema specified; it is essentially user-defined JSON. But when I parse this JSON (using the expression (DBObject) JSON.parse(value)), it is stored as a string in MongoDB, and I need it to be a nested document structure.
I searched for a couple of hours and found no solution. Any ideas?

I'm not really sure what to expect from casting the result of
JSON.parse(value)
to DBObject, which is an interface, not a class.
Try casting the result to an implementation of DBObject such as BasicDBObject (or BasicDBList), or to a Map<String, Object> as mentioned in the comments (it is also an interface, but it does work).
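For illustration, a minimal sketch of that cast, assuming the legacy com.mongodb.util.JSON helper from the older driver and a hypothetical setter on Person:
import com.mongodb.BasicDBObject;
import com.mongodb.util.JSON;

// JSON.parse returns a BasicDBObject for a JSON object
// (or a BasicDBList for a JSON array), so the concrete cast is safe here
BasicDBObject additionalData = (BasicDBObject) JSON.parse(value);
person.setAdditionalData(additionalData); // hypothetical setter on the Person class above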
If you're working with Spring Data REST, you probably won't need to deserialize "manually"; Spring will do it for you. Check this answer for a basic example of what to do.
Having data with no schema specified may not be the best idea around (MongoDB spares you from enforcing one at the database level, but you should still do it at the application level). That said, I use similar tricks in production, and you can make it work.

Related

How do I insert a JSON-formatted string into an H2 database using Hibernate?

I have a PostgreSQL production database, but I'm trying to run some of my automated tests against an in-memory H2 database. I'm trying to persist JSON-formatted data to a table, but while I'm able to write the data with no complaints, I get conversion exceptions when I read it back. I have no problem doing this against the production PostgreSQL database.
The object I'm persisting is structured similar to the following:
@Entity
public class Record {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Integer id;

    @Column(columnDefinition = "jsonb")
    @Convert(converter = PersonalInfoConverter.class)
    private PersonalInfo personalInfo;

    public Record() {}

    public Record(PersonalInfo personalInfo) {
        this.personalInfo = personalInfo;
    }
}
The PersonalInfoConverter just uses a Jackson ObjectMapper to serialize/deserialize the object to/from a String (pretty standard stuff with writeValueAsString and readValue). To get jsonb to work with H2, I used this trick, which basically sets jsonb as an alias for H2's JSON type.
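For reference, a minimal sketch of what such a converter typically looks like, assuming Jackson and javax.persistence (the class and field names mirror the snippet above):
import java.io.IOException;
import javax.persistence.AttributeConverter;
import javax.persistence.Converter;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

@Converter
public class PersonalInfoConverter implements AttributeConverter<PersonalInfo, String> {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public String convertToDatabaseColumn(PersonalInfo attribute) {
        try {
            // serialize the POJO to a JSON string for the jsonb column
            return attribute == null ? null : MAPPER.writeValueAsString(attribute);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException("Failed to serialize PersonalInfo", e);
        }
    }

    @Override
    public PersonalInfo convertToEntityAttribute(String dbData) {
        try {
            // deserialize the JSON string read from the column back into the POJO
            return dbData == null ? null : MAPPER.readValue(dbData, PersonalInfo.class);
        } catch (IOException e) {
            throw new IllegalStateException("Failed to deserialize PersonalInfo", e);
        }
    }
}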
I kept running into conversion errors when reading records from the database, until I stumbled upon this question, which linked to a further discussion on GitHub about inserting JSON-formatted strings into H2 tables. It sounds like, to get this to work properly, I need to specifically annotate the string inserted into the H2 database. I assumed that if this were the case, Hibernate would handle it itself, but it doesn't seem to work out of the box. How do I configure my code to get this working?
In the meantime, I'm working around this issue by using jsonb as an alias to H2's text type instead:
CREATE TYPE "JSONB" as text;
I've created a project to demonstrate the issue.
Hibernate does not know about the "JSON" SQL data type and how it needs to be handled. Just use text as you do now; that's totally fine. AFAIU, the JSON data type in H2 is just like a domain type with validation, i.e. you could replace it with TEXT CHECK is_json(..), so there is not much value in using that particular data type. You could tell Hibernate to append FORMAT JSON by using @ColumnTransformer, but then you'd have issues with PostgreSQL again. Overall, this cross-database testing with proprietary features that Hibernate does not abstract over is simply a mess. I would suggest you simply drop H2 and use PostgreSQL with fsync=off for testing, which is already quite fast.
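If you did want to experiment with the @ColumnTransformer route anyway, a sketch might look like the following (an untested assumption on my part, using org.hibernate.annotations.ColumnTransformer; as noted above, it would break the PostgreSQL side):
// appends H2's FORMAT JSON qualifier to the bound parameter on writes
@Column(columnDefinition = "jsonb")
@Convert(converter = PersonalInfoConverter.class)
@ColumnTransformer(write = "? FORMAT JSON")
private PersonalInfo personalInfo;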

DDD implementation with Spring Data and JPA + Hibernate: problem with identities

So I'm trying, for the first time and in a not-so-complex project, to implement Domain-Driven Design by separating all my code into application, domain, infrastructure and interfaces packages.
I also went with the full separation of JPA entities from the domain models that hold my business logic as rich models, and used the Builder pattern to instantiate them. This approach has given me a headache, and I can't figure out whether I'm doing it all wrong when using JPA + ORM and Spring Data with DDD.
Process explanation
The application is a REST API consumer (without any user interaction) that processes a fairly big amount of data resources through daily Scheduler tasks and stores or updates them in MySQL. I'm using RestTemplate to fetch and convert the JSON responses into domain objects, and from there I apply any business logic within the domain itself, e.g. validation, events, etc.
From what I have read, the aggregate root object should have an identity for its whole lifecycle, and it should be unique. I have used the id of the REST API object because it is already what I use to identify and track it in my business domain. I have also created a property for the technical id, so that when I convert entities to domain objects it can hold a reference for the update process.
When I need to persist the domain objects to the data source (MySQL) for the first time, I convert them into entity objects and persist them using the save() method. So far so good.
Now, when I need to update those records in the data source, I first fetch them from the data source as a list of Employees, convert the entity objects to domain objects, and then fetch the list of Employees from the REST API as domain models. At this point I have two lists of the same domain object type, List<Employee>. I iterate over them using streams and check whether objects are not equal() between the two lists; if so, the Employee objects that need updating are collected into a third list. By then I have already passed the technical id to the domain objects in that third list, so Hibernate can identify and update the records that already exist.
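A hypothetical sketch of that diff step (withPersistId is an assumed copy-helper on the domain object, not shown in the code below, and equals() is assumed to ignore the technical id):
// pair each API employee with its persisted counterpart by business id,
// keep only those whose state differs, and carry the technical id over
List<Employee> toUpdate = apiEmployees.stream()
    .map(api -> dbEmployees.stream()
        .filter(db -> db.getEmployeeId().equals(api.getEmployeeId()))
        .findFirst()
        .filter(db -> !db.equals(api))
        .map(db -> api.withPersistId(db.getPersistId()))
        .orElse(null))
    .filter(Objects::nonNull)
    .collect(Collectors.toList());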
These are all fairly simple steps, up until I use the saveAll() method to update the records.
Questions
I always see Hibernate using INSERT instead of updating the list of records. If I'm right, the Hibernate session is not recognising the objects I'm passing to it, because I detached them when I converted them to domain objects?
Does anyone have a better idea how I can implement this differently, or how to fix this problem?
Or should I stop using this approach of two separate objects and go back to using them as rich entity models?
Simple classes to explain it with code
EmployeeDO.java
@Entity
@Table(name = "employees")
public class EmployeeDO implements Serializable {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String name;

    public EmployeeDO() {}

    // ...omitted getters/setters
}
Employee.java
public class Employee {
    private Long persistId;
    private Long employeeId;
    private String name;

    private Employee() {}

    // ...omitted getters and Builder
}
EmployeeConverter.java
public class EmployeeConverter {
    public static EmployeeDO serialize(Employee employee) {
        EmployeeDO target = new EmployeeDO();
        if (employee.getPersistId() != null) {
            target.setId(employee.getPersistId());
        }
        target.setName(employee.getName());
        return target;
    }

    public static Employee deserialize(EmployeeDO employee) {
        return new Employee.Builder(employee.getEmployeeId())
            .withPersistId(employee.getId()) // <-- technical id setter
            .withName(employee.getName())
            .build();
    }
}
EmployeeRepository.java
@Component
public class EmployeeRepositoryImpl implements EmployeeRepository {
    @Autowired
    EmployeeJpaRepository db;

    @Override
    public List<Employee> findAll() {
        return db.findAll().stream()
            .map(employee -> EmployeeConverter.deserialize(employee))
            .collect(Collectors.toList());
    }

    @Override
    public void saveAll(List<Employee> employees) {
        db.saveAll(employees.stream()
            .map(employee -> EmployeeConverter.serialize(employee))
            .collect(Collectors.toList()));
    }
}
EmployeeJpaRepository.java
@Repository
public interface EmployeeJpaRepository extends JpaRepository<EmployeeDO, Long> {
}
I use the same approach in my project: two different models, one for the domain and one for persistence.
First, I would suggest you don't use the converter approach but use the Memento pattern instead. Your domain entity exports a memento object and can be restored from that same object. Yes, the domain then has two functions that aren't related to the domain itself (they exist just to satisfy a non-functional requirement), but, on the other hand, you avoid exposing functions, getters and constructors that the domain business logic never uses.
As for the persistence part, I don't use JPA, exactly for this reason: you have to write a lot of code to reload, update and persist the entities correctly. I write SQL directly: I can write and test it fast, and once it works I'm sure it does what I want. With the memento object I have exactly what I will use in the insert/update query, and I spare myself a lot of the headaches of making JPA handle complex table structures.
Anyway, if you want to use JPA, the only solution is to:
load the persistence entities and transform them into domain entities
update the domain entities according to the changes you have to make in your domain
save the domain entities, which means:
reload the persistence entities
change them (or create new ones) according to the changes you get from the updated domain entities
save the persistence entities
I've tried a mixed solution, where the domain entities are extended by the persistence ones (a bit complex to do). A lot of care must be taken to avoid the domain model adapting to the restrictions that JPA imposes through the persistence model.
Here there's an interesting reading about the splitting of the two models.
Finally, my suggestion is to think about how complex the domain is and use the simplest solution for the problem:
Is it big, with a lot of complex behaviour? Is it expected to grow into a big one? Use two models, domain and persistence, and manage the persistence directly with SQL. It avoids a lot of chaos in the read/update/save phase.
Is it simple? Then, first, should you use the DDD approach at all? If the answer is really yes, I would let the JPA annotations live inside the domain. Yes, it's not pure DDD, but we live in the real world, and the time needed to do something simple the pure way should not be orders of magnitude bigger than the time needed to do it with some compromises. On the other hand, you can put all this mapping into an XML file in the infrastructure layer, avoiding cluttering the domain with it, as is done in the Spring DDD sample here.
When you want to update an existing object, you first have to load it through entityManager.find() and apply the changes to that object, or use entityManager.merge(), since you are working with detached entities.
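A minimal sketch of those two options, assuming an injected EntityManager and the converter from the question:
// option 1: load the managed entity and mutate it; dirty checking issues an UPDATE at flush time
EmployeeDO managed = entityManager.find(EmployeeDO.class, employee.getPersistId());
managed.setName(employee.getName());

// option 2: merge the detached instance that carries the technical id
entityManager.merge(EmployeeConverter.serialize(employee));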
Anyway, modelling rich domain models based on JPA is the perfect use case for Blaze-Persistence Entity Views.
Blaze-Persistence is a query builder on top of JPA which supports many of the advanced DBMS features on top of the JPA model. I created Entity Views on top of it to allow easy mapping between JPA models and custom interface-defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure the way you like and map attributes (getters) via JPQL expressions to the entity model. Since the attribute name is used as the default mapping, you mostly don't need explicit mappings, as 80% of the use cases are DTOs that are a subset of the entity model.
The interesting point here is that entity views can also be updatable and support automatic translation back to the entity/DB model.
A mapping for your model could look as simple as the following:
@EntityView(EmployeeDO.class)
@UpdatableEntityView
interface Employee {
    @IdMapping("persistId")
    Long getId();
    Long getEmployeeId();
    String getName();
    void setName(String name);
}
Querying is a matter of applying the entity view to a query, the simplest being just a query by id:
Employee dto = entityViewManager.find(entityManager, Employee.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections (https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features), and it can also be saved back. Here is a sample repository:
@Repository
interface EmployeeRepository {
    Employee findOne(Long id);
    void save(Employee e);
}
It will only fetch the mappings that you tell it to fetch and also only update the state that you make updatable through setters.
With the Jackson integration you can deserialize your payload onto a loaded entity view, or you can avoid loading altogether and use the Spring MVC integration to capture just the state that was transferred and flush that. It could look like the following:
@RequestMapping(path = "/employee/{id}", method = RequestMethod.PUT, consumes = MediaType.APPLICATION_JSON_VALUE)
public ResponseEntity<String> updateEmp(@EntityViewId("id") @RequestBody Employee emp) {
    employeeRepository.save(emp);
    return ResponseEntity.ok(emp.getId().toString());
}
Here you can see an example project: https://github.com/Blazebit/blaze-persistence/tree/master/examples/spring-data-webmvc

Working with POJO classes as a schema for MongoDB, specifically handling schema changes

I'm using simple Java classes as the schema for my MongoDB collection.
There are several frameworks for serialization/deserialization to/from JSON and CRUD operations for Mongo (I've looked into the Jackson serializer and Morphia).
But none of them seems to provide a solution for handling changes:
Let's say I have this class as my schema:
class Person {
    String name;
    int age;
    String occupation;
}
In my code, I will probably use a setter in some place for age:
Person newDbEntry = new Person();
newDbEntry.setAge(45);
newDbEntry.setOccupation("Carpenter");
Now let's say that at some point in the development process, it was decided that the age field needs to be renamed to "theAge", and it was also decided to remove the "occupation" field from this collection completely, moving it to a new collection.
The problem that I'm faced with is that all my queries look like this:
JsonObject query = new JsonObject().put("age", new JsonObject().put("$gte", 22));
In other words, all field names are written in queries as strings (and likewise in all other Mongo APIs: update, findAndModify, etc.).
I'm looking for a way to "bind" all mentions of the field "age" in my code with the POJO class- so that when something in the POJO schema changes (like renaming this field), I'll have (ideally) compiler errors in all queries that mention this field.
As it currently stands, changes to schema cause no compiler errors and - more critically - usually no runtime errors. The old string query just quietly returns no results, or something similar. This makes changes to the schema very hard to implement.
How should this be done correctly?
Here's the solution that I ended up using:
Project Lombok now supports generating field-name constants:
https://projectlombok.org/features/experimental/FieldNameConstants
So instead of using the name hardcoded as a string:
serviceRepository.setField(id, "service.serviceName", "newName");
I use:
serviceRepository.setField(id, ConnectivityServiceDetails.Fields.service + "." + ConnectivityService.Fields.serviceName, "newName");
This way, when we search in IntelliJ for usages of this field (or try to refactor it), those places will also be found automatically.
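For context, a minimal sketch of the Lombok side (FieldNameConstants is an experimental feature, so details may vary by version; the class below is just an illustration mirroring the identifiers above):
import lombok.experimental.FieldNameConstants;

@FieldNameConstants
public class ConnectivityServiceDetails {
    private ConnectivityService service;
    // Lombok generates an inner Fields class, so
    // ConnectivityServiceDetails.Fields.service == "service"
}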

Specify MongoDB collection name at runtime in Spring Boot

I am trying to reuse my existing EmployeeRepository code (see below) in two different microservices to store data in two different collections (in the same database).
@Document(collection = "employee")
public interface EmployeeRepository extends MongoRepository<Employee, String>
Is it possible to modify @Document(collection = "employee") to accept runtime parameters? E.g. something like @Document(collection = ${COLLECTION_NAME}).
Would you recommend this approach or should I create a new Repository?
This is a really old thread, but I will add some better information here in case someone else finds this discussion, because things are a bit more flexible than the accepted answer claims.
You can use an expression for the collection name, because SpEL is an acceptable way to resolve it. For example, if you have a property in your application.properties file like this:
mongo.collection.name = my_docs
And if you create a spring bean for this property in your configuration class like this:
#Bean("myDocumentCollection")
public String mongoCollectionName(#Value("${mongo.collection.name}") final String collectionName) {
return collectionName
}
Then you can use that as the collection name for a persistence document model like this:
@Document(collection = "#{@myDocumentCollection}")
public class SomeModel {
    @Id
    private String id;

    // other members and accessors/mutators
    // omitted for brevity
}
It shouldn't be possible; the documentation states that the collection field should be the collection name, therefore not an expression:
http://docs.spring.io/spring-data/data-mongodb/docs/current/api/org/springframework/data/mongodb/core/mapping/Document.html
As far as your other question is concerned: even if passing an expression were possible, I would recommend creating a new repository class. The code duplication would not be bad, and your microservices may need to perform different queries; the single-repository approach would force you to keep query methods for all microservices within the same interface, which isn't very clean.
Take a look at this video, they list some very interesting approaches: http://www.infoq.com/presentations/Micro-Services
I used @environment.getProperty() to read from my application.yml, like so:
application.yml:
mongodb:
  collections:
    dwr-suffix: dwr
Model:
#Document("Log-#{#environment.getProperty('mongodb.collections.dwr-suffix')}")
public class Log {
#Id
String logId;
...

appengine datastore: change an entity property's type

I would like to change an entity property from String to Long. I have seen Nick answering a similar problem in Change IntegerProperty to FloatProperty of existing AppEngine DataStore, but I am writing in Java and need a code example, since I don't know anything about MapReduce.
E.g. we want to change userId from String to Long in this class.
I would also like advice on storing dates as long instead of String, so that the time information can be consumed readily from Android, GWT and more (over REST JSON or RPC). Right now, GWT does not have Joda-Time and has limited support for java.util.Date and parsing.
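For example, a trivial sketch of the epoch-millisecond round trip that makes this portable:
// store: epoch milliseconds travel cleanly over JSON/RPC to GWT and Android clients
long timestamp = new java.util.Date().getTime();

// restore on the consuming side without any date-parsing library
java.util.Date restored = new java.util.Date(timestamp);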
If you really want to convert from String to Long, I can't see any other choice than to write a conversion snippet using the raw GAE API, e.g.:
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.PreparedQuery;
import com.google.appengine.api.datastore.Query;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Query q = new Query(Task.class.getName());
PreparedQuery pq = datastore.prepare(q);
for (Entity entity : pq.asIterable()) {
    String orig = entity.getProperty("userId").toString();
    entity.setProperty("userId", Long.parseLong(orig)); // setProperty overwrites the old String value
    datastore.put(entity); // without put(), the converted entity is never written back
}
What is your persistence interface? JDO (mine), JPA, Objectify, Twig, or the raw GAE/J API? I don't think many people can give you a code example without knowing this.
Also, please give a code extract of your existing persistent entity (with an underlying date-time, I presume), including the data member you are talking about.
Your class is using JPA, not JDO. The latest version (v2.x) of the GAE JPA plugin allows persisting java.util.Date as Long or String. This wouldn't cater for your migration of existing data (see Jonathan's reply for that), but it would allow you to persist future Date fields as Long. IIRC, specifying the "jdbc-type" (a DataNucleus extension annotation) as INTEGER would trigger that.
