Microservices Restful API - DTOs or not?

REST API - DTOs or not?
I would like to re-ask this question in the context of microservices. Here is the quote from the original question:
I am currently creating a REST-API for a project and have been reading
article upon article about best practices. Many seem to be against
DTOs and simply just expose the domain model, while others seem to
think DTOs (or User Models or whatever you want to call it) are bad
practice. Personally, I thought that this article made a lot of sense.
However, I also understand the drawbacks of DTOs with all the extra
mapping code, domain models that might be 100% identical to their
DTO-counterpart and so on.
Now, my question:
I lean towards using one object through all the layers of my application (in other words, just exposing the domain object rather than creating a DTO and manually copying over each field). The differences between my REST contract and the domain object can be addressed using Jackson annotations like @JsonIgnore, @JsonProperty(access = Access.WRITE_ONLY) or @JsonView. If one or two fields need a transformation that cannot be done with a Jackson annotation, I will write custom logic to handle just that (trust me, I haven't come across this scenario even once in my 5+ years of working with REST services).
I would like to know if I am missing any really bad effects of not copying the domain object to a DTO.

I would vote for using DTOs and here is why:
Different requests (events) and DB entities. Your requests and responses often differ from what you have in the domain model. This matters especially in a microservice architecture, where you have a lot of events coming from other microservices. For instance, you have an Order entity, but the event you get from another microservice is OrderItemAdded. Even if half of the events (or requests) are the same as the entities, it still makes sense to have DTOs for all of them in order to avoid a mess.
Coupling between the DB schema and the API you expose. When using entities you basically expose how you model your DB in a particular microservice. In MySQL you would probably want your entities to have relations, so they will be pretty massive in terms of composition. In other types of DBs, you would have flat entities without lots of inner objects. This means that if you use entities to expose your API and want to change your DB from, let's say, MySQL to Cassandra, you'll need to change your API as well, which is obviously a bad thing.
Consumer-Driven Contracts. This is probably related to the previous point, but DTOs make it easier to ensure that communication between microservices does not break as they evolve. Because the contracts and the DB are not coupled, this is simply easier to test.
Aggregation. Sometimes you need to return more than you have in one single DB entity. In this case, your DTO will be just an aggregator.
Performance. Microservices imply a lot of data transfer over the network, which may cause performance issues. If clients of your microservice need less data than you store in the DB, you should give them less data. Again - just make a DTO and your network load will decrease.
Forget about LazyInitializationException. DTOs don't have any lazy loading or proxying, as opposed to domain entities managed by your ORM.
The DTO layer is not that hard to support with the right tools. The usual problem when mapping entities to DTOs and back is that you need to set the right fields manually each time you want to make a conversion. It's easy to forget to update the mapping when adding new fields to the entity and to the DTO, but fortunately there are a lot of tools that can do this task for you. For instance, we used MapStruct on our project - it can generate the conversion code for you automatically, at compile time.
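To make the aggregation and "send less data" points concrete, here is a minimal hand-written sketch (Java 16+ for records). All class and field names (OrderEntity, CustomerEntity, OrderSummaryDto, OrderSummaryMapper) are made up for illustration, and the entities are plain classes standing in for whatever your ORM manages. A generator such as MapStruct would replace the hand-written mapper in a real project.

```java
import java.util.List;

// Hypothetical persistence-side classes; in a real service an ORM would manage these.
final class CustomerEntity {
    final long id;
    final String name;
    final String passwordHash; // internal, must never reach a client

    CustomerEntity(long id, String name, String passwordHash) {
        this.id = id;
        this.name = name;
        this.passwordHash = passwordHash;
    }
}

final class OrderEntity {
    final long id;
    final long customerId;
    final List<Long> itemPricesInCents;
    final String internalNotes; // internal as well

    OrderEntity(long id, long customerId, List<Long> itemPricesInCents, String internalNotes) {
        this.id = id;
        this.customerId = customerId;
        this.itemPricesInCents = itemPricesInCents;
        this.internalNotes = internalNotes;
    }
}

// The API contract: aggregates two entities and carries only what clients need.
record OrderSummaryDto(long orderId, String customerName, int itemCount, long totalInCents) {}

final class OrderSummaryMapper {
    private OrderSummaryMapper() {}

    // Hand-written mapping; a change to the DB model only has to be absorbed here.
    static OrderSummaryDto toDto(OrderEntity order, CustomerEntity customer) {
        long total = order.itemPricesInCents.stream().mapToLong(Long::longValue).sum();
        return new OrderSummaryDto(order.id, customer.name, order.itemPricesInCents.size(), total);
    }
}
```

The DTO has no passwordHash or internalNotes field at all, so those values cannot leak by accident, and the storage layout can change without touching the contract.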

The Pros of Just exposing Domain Objects
The less code you write, the fewer bugs you produce.
Despite having extensive (arguably) test cases in our code base, I have come across bugs due to missed or wrong copying of fields from domain to DTO or vice versa.
Maintainability - less boilerplate code.
If I have to add a new attribute, I don't have to add it to the domain object, the DTO, the mapper and, of course, the test cases. Don't tell me that this can be achieved using a reflection-based beanCopy utility; it defeats the whole purpose.
Lombok, Groovy, Kotlin - I know, but they will only save me the getter/setter headache.
DRY
Performance
I know this falls under the category of "premature performance optimization is the root of all evil". But it will still save some CPU cycles by not having to create (and later garbage collect) at least one more object per request.
Cons
DTOs will give you more flexibility in the long run
If only I ever need that flexibility. At least, whatever I have come across so far are CRUD operations over HTTP, which I can manage using a couple of @JsonIgnore annotations. And if one or two fields need a transformation that cannot be done with a Jackson annotation then, as I said earlier, I can write custom logic to handle just that.
Domain objects getting bloated with annotations.
This is a valid concern. If I use JPA or MyBatis as my persistence framework, the domain object might carry those annotations, and then there will be Jackson annotations too. In my case this does not apply much, though: I am using Spring Boot and I can get away with application-wide properties like mybatis.configuration.map-underscore-to-camel-case: true and spring.jackson.property-naming-strategy: SNAKE_CASE
Long story short, at least in my case, the cons don't outweigh the pros, so it doesn't make any sense to repeat myself by having a new POJO as a DTO. Less code, fewer chances of bugs. So I am going ahead with exposing the domain object and not having a separate "view" object.
Disclaimer: This may or may not be applicable to your use case. This observation is based on my use case (basically a CRUD API with 15-ish endpoints).

The decision is a much simpler one in case you use CQRS because:
for the write side you use Commands that are already DTOs; Aggregates - the rich behavior objects in your domain layer - are not exposed/queried so there is no problem there.
for the read side, because you use a thin layer, the objects fetched from persistence should already be DTOs. There should be no mapping problem because you can have a read model for every use case. In the worst case you can use something like GraphQL to select only the fields you need.
If you do not split the read from write then the decision is harder because there are tradeoffs in both solutions.
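A rough sketch of that split, under some loud assumptions: the names (RenameProductCommand, Product, ProductListItem, ProductService) are invented, and the read side here projects from the same in-memory map instead of a dedicated read store, which a real CQRS setup would normally have. The point is only that the command and the read model are already DTOs, while the aggregate itself is never serialized.

```java
import java.util.HashMap;
import java.util.Map;

// Write side: the command is already a DTO (plain data, no behaviour).
record RenameProductCommand(long productId, String newName) {}

// The aggregate holds the behaviour and invariants and is never exposed to clients.
final class Product {
    private final long id;
    private String name;

    Product(long id, String name) {
        this.id = id;
        this.name = name;
    }

    void rename(String newName) {
        if (newName == null || newName.isBlank()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        this.name = newName;
    }

    long id() { return id; }
    String name() { return name; }
}

// Read side: one flat read model per use case.
record ProductListItem(long id, String name) {}

final class ProductService {
    private final Map<Long, Product> store = new HashMap<>();

    void add(Product product) {
        store.put(product.id(), product);
    }

    // Command handler: load the aggregate, invoke behaviour on it.
    void handle(RenameProductCommand command) {
        Product product = store.get(command.productId());
        if (product == null) {
            throw new IllegalArgumentException("unknown product " + command.productId());
        }
        product.rename(command.newName());
    }

    // Query: returns the read model, or null when the product does not exist.
    ProductListItem find(long id) {
        Product product = store.get(id);
        return product == null ? null : new ProductListItem(product.id(), product.name());
    }
}
```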

Related

Using same Entity classes in different spring data repositories

I'm trying to put together a project in which I have to persist some entity classes using different spring data repositories (gemfire, jpa, mongodb etc). As the data is more or less the same that needs to go into these repositories, I was wondering if I can use the same entity class for all of them to save me from converting from one object to another?
I got it working for gemfire and jpa, but the entity class is already starting to look a bit weird.
@Id // spring-data-gemfire
@javax.persistence.Id // jpa
@GeneratedValue
private Long id;
So far I can see following options:
Create an interface based separate Entity (domain) classes - Trying to re-use same class looks like a bit of premature optimization.
Externalize xml based mapping for JPA, not sure if gemfire and mongodb mapping can be externalized.
Use different concrete entity classes and use some copy constructor/converter for the conversion.
Been literally hitting my head against the wall to find the best approach - Any response is much appreciated. Thanks
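For reference, this is roughly what I have in mind for the interface plus copy-constructor options. Customer, JpaCustomer and GemfireCustomer are placeholder names, and the store-specific annotations are only indicated in comments so the sketch compiles on its own.

```java
// Store-agnostic view of the data that every store-specific class implements.
interface Customer {
    Long getId();
    String getName();
}

// Would carry the JPA annotations (@Entity, @javax.persistence.Id, ...) in the real code.
final class JpaCustomer implements Customer {
    private final Long id;
    private final String name;

    JpaCustomer(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    // Copy constructor: accepts any Customer, whichever store it came from.
    JpaCustomer(Customer other) {
        this(other.getId(), other.getName());
    }

    @Override public Long getId() { return id; }
    @Override public String getName() { return name; }
}

// Would carry the Spring Data GemFire annotations in the real code.
final class GemfireCustomer implements Customer {
    private final Long id;
    private final String name;

    GemfireCustomer(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    GemfireCustomer(Customer other) {
        this(other.getId(), other.getName());
    }

    @Override public Long getId() { return id; }
    @Override public String getName() { return name; }
}
```

Code above the repositories would depend only on Customer, and each repository would convert at its own boundary, e.g. new GemfireCustomer(jpaCustomer).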

If by weird you mean that your application domain objects/entity classes are starting to accumulate many different but separate (mapping) annotations (some even semantically the same, e.g. SD Commons' o.s.data.annotation.Id and JPA's @javax.persistence.Id) for the different data stores in which those entities will be persisted, then I suppose that is understandable.
The annotation pollution only increases as the number of representations for your entities increases. For example, think of Jackson annotations for JSON mapping or JAXB for XML, etc. Pretty soon, you have more metadata than actual data. :-)
However, it is more a matter of preference, convenience, simplicity, really.
Some developers are purists and like to externalize everything. Others like to keep information (metadata) close to the code using it. Even certain patterns have emerged to address these types of concerns... DTOs, Bounded Contexts (see Fowler's BoundedContext, which has a strong correlation to DDD and Microservices).
Personally, I use the following rules when designing and applying architectural principles/decisions in my code, especially when introducing something new:
Simplicity
Consistency
DRY
Test
Refactor
(along with a few others as well... good OOD, SoC, SOLID, Design Patterns, etc).
In that order too. If something starts getting too complex, refactor and simplify it. Be consistent in what you do by following/using patterns, conventions; familiarity is 1 key to consistency. But, don't keep repeating yourself either.
At the end of the day, it is really about maintaining the application. Will someone else who picks up where you left off be able to understand the organization and logic quickly, and be able to maintain it... simplicity is king. It does not mean it is so simple it is not viable or valuable. Even complex things can be simple if organized properly. However, breaking things apart and introducing abstractions can have hidden costs (see closing thoughts).
To more concretely answer (a few of) your questions...
I am not certain about MongoDB, but (Spring Data) GemFire does not have an external mapping. Minimally, @Region (on the entity class) and @Id are required, along with @PersistenceConstructor if your entity class has more than 1 constructor. For example.
This sounds suspiciously like DTOs. Personally, I think Bounded Contexts are a better, more natural model of the application's data, since the domain model should not be unduly tied to any persistent store or external representation (e.g. JSON, XML, etc). The application domain model is the 1 true state of the application and it should model the concept that it represents in a natural way, not superficially to satisfy some representation or persistent store (hence the mapping/conversion).
Anyway, try not to beat yourself up too much. It is all about managing complexity. Try to let yourself just do and use testing and other feedback loops to find an answer that is right for your application. You'll know.
Hope this helps.

What is a good strategy for converting jpa entities into restful resources

Restful resources do not always have a one-to-one mapping with your jpa entities. As I see it there are a few problems that I am trying to figure out how to handle:
When a resource has information that is populated and saved by more than one entity.
When an entity has more information in it than you want to send down as a resource. I could just use Jackson's @JsonIgnore but I would still have issues 1, 3 and 4.
When an entity (like an aggregate root) has nested entities and you want to include part of its nested entities but only to a certain level of nesting as your resource.
When you want to exclude one piece of an entity when it's part of one parent entity but exclude a separate piece when it's part of a different parent entity.
Blasted circular references (I got this mostly working with JSOG using Jackson's @JsonIdentityInfo)
Possible solutions:
The only way I could think of that would handle all of these issues would be to create a whole bunch of "resource" classes that would have constructors that took the needed entities to construct the resource and put necessary getters and setters for that resource on it. Is that overkill?
To solve 2, 3, 4 , and 5 I could just do some pre and post processing on the actual entity before sending it to Jackson to serialize or deserialize my pojo into JSON, but that doesn't address issue 1.
These are all problems I would think others have come across, and I am curious what solutions other people have come up with. (I am currently using JPA 2, Spring MVC, Jackson, and Spring-Data but am open to other technologies.)

With a combination of JAX-RS 1.1 and Jackson/GSON you can expose JPA entities directly as REST resources, but you will run into a myriad of problems.
DTOs i.e. projections onto the JPA entities are the way to go. It would allow you to separate the resource representation concerns of REST from the transactional concerns of JPA. You get to explicitly define the nature of the representations. You can control the amount of data that appears in the representation, including the depth of the object graph to be traversed, if you design your DTOs/projections carefully. You may need to create multiple DTOs/projections for the same JPA entity for the different resources in which the entity may need to be represented differently.
Besides, in my experience, using annotations like @JsonIgnore and @JsonIdentityInfo on JPA entities doesn't exactly lead to more usable resource representations. You may eventually run into trouble when merging the objects back into the persistence context (because of ignored properties), or your clients may be unable to consume the resource representations, since object references as a scheme may not be understood. Most JavaScript clients will usually have trouble consuming object references produced by the @JsonIdentityInfo annotation, due to the lack of standardization here.
There are additional aspects that DTOs/projections make possible. JPA @EmbeddedId keys do not fit naturally into REST resource representations. Some advocate using the JAX-RS @MatrixParam annotation to identify the resource uniquely in the resource URIs, but this does not work out of the box for most clients. Matrix parameters are after all only a design note, and not a standard (yet). With a DTO/projection, you can serve out the resource representation against a computed Id (could be a combination of the constituent keys).
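As an illustration of the computed Id, here is a small sketch in plain Java (16+ for records). OrderLineKey stands in for an @EmbeddedId class, and the "orderId-lineNumber" format, like every name here, is just an assumption for the example; it also assumes order ids are non-negative so the dash is unambiguous.

```java
// Stand-in for a JPA @EmbeddedId composite key (hypothetical).
record OrderLineKey(long orderId, int lineNumber) {}

// The projection served to clients carries one opaque id instead of the composite key.
record OrderLineResource(String id, String product, int quantity) {}

final class OrderLineIds {
    private OrderLineIds() {}

    // Computed id: a combination of the constituent keys, usable as a URI path segment.
    static String toResourceId(OrderLineKey key) {
        return key.orderId() + "-" + key.lineNumber();
    }

    // Inverse mapping, used when a client sends the id back in a request.
    static OrderLineKey fromResourceId(String id) {
        int dash = id.indexOf('-');
        if (dash <= 0 || dash == id.length() - 1) {
            throw new IllegalArgumentException("malformed id: " + id);
        }
        return new OrderLineKey(
                Long.parseLong(id.substring(0, dash)),
                Integer.parseInt(id.substring(dash + 1)));
    }
}
```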
Note: I currently work on the JBoss Forge plugin for REST where some or all of these issues exist and would be fixed in some future release via the generation of DTOs.

I agree with the other answers that DTOs are the way to go. They solve many problems:
Separation of layers and clean code. One day you may need to expose the data model using a different format (e.g. XML) or interface (e.g. non web-service based). Keeping all configuration (such as @JsonIgnore, @JsonIdentityInfo) for each interface/format in the domain model would make it really messy. DTOs separate the concerns. They can contain all the configuration required by your external interface (web service) without involving changes in the domain model, which can stay web-service and format agnostic.
Security - you easily control what is exposed to the client and what the client is allowed to modify.
Performance - you easily control what is sent to the client.
Issues such as (circular) entity references, lazily-loaded collections are also resolved explicitly and knowingly by you on converting to DTO.
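A small sketch of the last point, with invented names (Department, Employee and their DTOs). The entities reference each other, which is what trips up naive serialization; the mapper decides explicitly how deep the representation goes and which fields are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entities with a bidirectional (circular) relationship.
final class Department {
    final String name;
    final List<Employee> employees = new ArrayList<>();

    Department(String name) {
        this.name = name;
    }
}

final class Employee {
    final String name;
    final String salaryBand;     // sensitive, deliberately not exposed
    final Department department; // back-reference that closes the cycle

    Employee(String name, String salaryBand, Department department) {
        this.name = name;
        this.salaryBand = salaryBand;
        this.department = department;
        department.employees.add(this); // fine for a sketch; avoid leaking 'this' in real code
    }
}

// DTOs form a tree, not a graph: an employee DTO has no link back to its department.
record EmployeeDto(String name) {}
record DepartmentDto(String name, List<EmployeeDto> employees) {}

final class DepartmentMapper {
    private DepartmentMapper() {}

    // One level of nesting, chosen here on purpose rather than left to a serializer.
    static DepartmentDto toDto(Department department) {
        List<EmployeeDto> employees = new ArrayList<>();
        for (Employee employee : department.employees) {
            employees.add(new EmployeeDto(employee.name));
        }
        return new DepartmentDto(department.name, List.copyOf(employees));
    }
}
```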

Given your constraints, there looks to be no other solution than Data Transfer Objects - yes, it's occurring frequently enough that people named this pattern...

If your application is completely CRUDish then the way to go is definitely Spring Data REST, in which you absolutely do not need DTOs. If it's more complicated than that, you will be safer with DTOs securing the application layer. But do not attempt to encapsulate DTOs inside the controller layer. They belong to the service layer, because the mapping is also part of the logic (what you let into the application and what you let out of it). This way the application layer stays hermetic. Of course, in most cases it can be a mix of those two.

Should raw Hibernate annotated POJO's be returned from the Data Access Layer, or Interfaces instead?

I understand separating the data layer objects (DAOs) in their own layer that abstracts the data access logic and data source specifics from service and business layers as outlined in DAO and Service layers (JPA/Hibernate + Spring) and other questions. I have experience creating these layers, but I've always used either raw JDBC or similar lower level ways of interfacing with the DB (such as Spring's SimpleJDBC), and am new to Hibernate.
My question comes in that in raw JDBC or other ways where you are actually dealing with a result set (or a thin wrapper around it) at the data access layer, the resulting POJOs where you stick your data are extremely clean and know nothing about where the data came from, and I've never worried about returning these to the service layer and beyond. However it appears that with Hibernate, you have a lot of your Hibernate / data structure specific logic right in the POJO annotations (things like 1 to many mappings, lazy loading preferences, etc). I feel uncomfortable returning them (or Collections of them) from my DAOs and up to my service layer and am tempted to have all the POJOs implement interfaces that I pass back instead. Is this good practice, or over complicating?

Annotations have one drawback of coupling some framework knowledge with Java objects. That's the price you pay for not having separate metadata definitions. POJOs still remain POJOs, though, and from practical standpoint I see no good reason to complicate design just because of annotations.
Let's think: if you used XML mappings, would you even have that concern? Most likely not. So pay the penalty and move on; and in the unlikely case that you change your persistence framework, you will go ahead and remove those annotations. In all cases they should have no side effects on your code outside of your DAO layer.
Just my 2 cents...

Both ;) I'm pretty ambivalent--I prefer interfaces, it's just easier to mock, use across non-Hibernate systems, etc. but in my case I've usually needed to provide an external API with datatypes, so it's almost always made sense. That, and I generate the interfaces automatically, so I don't have to actually do anything.
For isolated systems with no external API requirements, or if you never need the types outside of Hibernate, I'm not convinced it really matters all that much, although the purists will have my head on a pike for saying so (and they're arguably correct).

what is a good pattern for converting between hibernate entities and data transfer objects?

I have had similar questions and concerns as to how to convert between Hibernate entities and data transfer objects to be returned by a web service as are discussed in this question:
Is using data transfer objects in ejb3 considered best practice
One of the factors mentioned here is that if the domain model changes, a set of DTOs will protect consumers in the case of a web service.
Even though it seems like it will add a substantial amount of code to my project, this reasoning seems sound.
Is there a good design pattern that I can use to convert a Hibernate entity (which implements an interface) to a DTO that implements the same interface?
So assuming both of the following implement 'Book', I would need to convert a BookEntity.class to a BookDTO.class so that I can let JAXB serialize and return.
Again, this whole prospect seems dubious to me, but if there are good patterns out there for helping to deal with this conversion, I would love to get some insight.
Is there perhaps some interesting way to convert via reflection? Or a 'builder' pattern that I'm not thinking of?
Should I just ignore the DTO pattern and pass entities around?

Should I just ignore the DTO pattern and pass entities around?
My preference is usually "yes". I don't like the idea of parallel hierarchies created just for the sake of architectural or layer purity.
The original reason for the DTO pattern was excessive chattiness in EJB 1.0 and 2.0 apps when passing entity EJBs to the view tier. The solution was to put the entity bean state into a DTO.
Another reason that's usually given for creating DTOs is to prohibit modification by the view layer. DTOs are immutable objects in that case, with no behavior. They do nothing but ferry data to the view layer.
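For what it's worth, that "immutable carrier" flavour is cheap to write in current Java (16+). A minimal sketch, with UserViewDto as a made-up name:

```java
import java.util.List;

// Immutable DTO: no setters, no behaviour, and the list is copied defensively.
record UserViewDto(String name, List<String> roles) {
    UserViewDto {
        roles = List.copyOf(roles); // unmodifiable copy, detached from the caller's list
    }
}
```

The view layer can read dto.name() and dto.roles(), but any attempt to add to the list throws UnsupportedOperationException, and later changes to the source list do not show up in the DTO.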
I would argue that DTO is a Core J2EE pattern that's become an anti-pattern.
I realize that some people would disagree. I'm simply offering my opinion. It's not the only way to do it, nor necessarily the "right" way. It's my preference.

There needs to be a contrarian view amongst all the jolly kicking of the DTO.
tl;dr - It is sometimes still useful.
The advantage of the DTO is that you don't have to add a zillion annotations to your domain classes.
You start with @Entity. Not so bad. But then you need JAXB, so you add @XmlElement etc. - and then you need JSON, so you add things like @JsonManagedReference for Jackson to do the right thing with relationships, then you add etc. etc. etc. ad infinitum.
Pretty soon your POJO ain't so plain any more. Read about "domain driven design" sometime.
In addition you can "filter" some properties that you don't want the view to know about.

We should not forget that entity objects are not easy to handle when they are in the managed state. This makes passing them to GUI forms problematic. To be more precise, child objects are handled eagerly. This cannot be done outside the session, which causes exceptions. So they either have to be evicted (detached) from the entity manager or they have to be converted to appropriate DTOs. Unless of course there is a pattern that I am not aware of, which I would be very glad to know.

To quickly create a "look-alike" DTO without a bunch of duplicate get/set code, you can use BeanUtils.copyProperties. That function helps you quickly copy the data from the entity (DAO side) to the DTO class. Just remember that more than one common library provides BeanUtils.copyProperties and their signatures are not the same: Apache Commons BeanUtils takes (dest, orig), while Spring's BeanUtils takes (source, target).
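To show the idea without pulling in either library, here is a simplified stand-in using only java.lang.reflect. It is not how Apache Commons or Spring implement copyProperties (they work on getters and setters); this sketch copies same-named, type-compatible fields declared directly on the two classes, and FieldCopier, BookEntity and BookDto are invented names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical entity and look-alike DTO.
final class BookEntity {
    String title;
    int pages;
    long revision; // persistence detail with no counterpart in the DTO
}

final class BookDto {
    String title;
    int pages;
}

final class FieldCopier {
    private FieldCopier() {}

    // Copies each non-static field of source into the same-named field of target, if any.
    // Only fields declared directly on the two classes are considered (no superclasses).
    static void copy(Object source, Object target) {
        for (Field src : source.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(src.getModifiers())) {
                continue;
            }
            try {
                Field dst = target.getClass().getDeclaredField(src.getName());
                int mods = dst.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isFinal(mods)
                        || !dst.getType().isAssignableFrom(src.getType())) {
                    continue; // skip constants and incompatible types
                }
                src.setAccessible(true);
                dst.setAccessible(true);
                dst.set(target, src.get(source));
            } catch (NoSuchFieldException e) {
                // target has no field of that name: nothing to copy
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("could not copy field " + src.getName(), e);
            }
        }
    }
}
```

Note the trade-off mentioned elsewhere on this page: a renamed field is silently skipped rather than caught by the compiler.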

I know this is an old question, but thought I would add an answer offering a framework to help in case someone else is tackling this problem.
Our project has JAXB annotated POJOs that are separate from the JPA annotated POJOs. Our team was debating how best to move data between the two objects (actually data structures).
Here is an option for people to consider:
We found and are experimenting with Dozer which handles (1) same name, (2) XML mapping and (3) custom conversions as ways to copy data between two POJOs.
It has been very easy to use so far.

Spring MVC: should service layer be returning operation specific DTO's?

In my Spring MVC application I am using DTO in the presentation layer in order to encapsulate the domain model in the service layer. The DTO's are being used as the spring form backing objects.
Hence my services look something like this:
userService.storeUser(NewUserRequestDTO req);
The service layer will translate DTO -> Domain object and do the rest of the work.
Now my problem is that when I want to retrieve a DTO from the service to perform, say, an update or a display, I can't seem to find a better way to do it than to have multiple lookup methods that return different DTOs, like...
EditUserRequestDTO userService.loadUserForEdit(int id);
DisplayUserDTO userService.loadUserForDisplay(int id);
but something does not feel right about this approach. Perhaps the service should not return things like EditUserRequestDTO and the controller should be responsible of assembling a requestDTO from a dedicated form object and vice versa.
The reason I have separate DTOs is that DisplayUserDTO is strongly typed to be read-only, and also many properties of the user are entities from a lookup table in the DB (like city and state), so the DisplayUserDTO would have the string description of those properties while the EditUserRequestDTO would have the ids that back the select drop-down lists in the forms.
What do you think?
thanks

I like the stripped down display objects. It's more efficient than building the whole domain object just to display a few fields of it. I have used a similar pattern with one difference. Instead of using an edit version of a DTO, I just used the domain object in the view. It significantly reduced the work of copying data back and forth between objects. I haven't decided if I want to do that now, since I'm using the annotations for JPA and the Bean Validation Framework and mixing the annotations looks messy. But I'm not fond of using DTOs for the sole purpose of keeping domain objects out of the MVC layer. It seems like a lot of work for not much benefit. Also, it might be useful to read Fowler's take on anemic objects. It may not apply exactly, but it's worth thinking about.
1st Edit: reply to below comment.
Yes, I like to use the actual domain objects for all the pages that operate on a single object at a time: edit, view, create, etc.
You said you are taking an existing object and copying the fields you need into a DTO and then passing the DTO as part of the model to your templating engine for a view page (or vice-versa for a create). What does that buy you? The ref to the DTO doesn't weigh any less than the ref to the full domain object, and you have all the extra attribute copying to do. There's no rule that says your templating engine has to use every method on your object.
I would use a small partial domain object if it improves efficiency (no relationship graphs to build), especially for the results of a search. But if the object already exists don't worry about how big or complex it is when you are sticking it in the model to render a page. It doesn't move the object around in memory. It doesn't cause the templating engine stress. It just accesses the methods it needs and ignores the rest.
2nd edit:
Good point. There are situations where you would want a limited set of properties available to the view (e.g. different front-end and back-end developers). I should read more carefully before replying. If I were going to do what you want, I would probably put separate methods on User (or whatever class) of the form forEdit() and forDisplay(). That way you could just get the User from the service layer and tell it to give you limited-use copies of itself. I think maybe that's what I was reaching for with the anemic objects comment.
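Something along these lines is what I mean. The DTO names come from the question; their fields, and the City class, are my guesses at what they would contain (Java 16+ for records).

```java
// Lookup-table entity (hypothetical shape).
record City(int id, String name) {}

// Read-only display version: lookup values as their string descriptions.
record DisplayUserDTO(String name, String city) {}

// Edit version: lookup values as ids, to back the select drop-downs.
record EditUserRequestDTO(int id, String name, int cityId) {}

final class User {
    private final int id;
    private final String name;
    private final City city;

    User(int id, String name, City city) {
        this.id = id;
        this.name = name;
        this.city = city;
    }

    // The domain object hands out limited copies of itself.
    DisplayUserDTO forDisplay() {
        return new DisplayUserDTO(name, city.name());
    }

    EditUserRequestDTO forEdit() {
        return new EditUserRequestDTO(id, name, city.id());
    }
}
```

The service then needs only one lookup method returning User, and the controller picks user.forEdit() or user.forDisplay() as needed.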

You should use a DTO and never an ORM entity in the MVC layer! There are a number of really good questions already asked on this, such as the following: Why should I isolate my domain entities from my presentation layer?
But to add to that question: you should separate them so that the ORM entity is not bound directly on a POST. Otherwise there is the potential for someone to submit an extra field and cause all kinds of mayhem, which then requires otherwise unnecessary extra validation.
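A toy sketch of that over-posting problem, with the framework's form binding simulated by a plain Map of posted parameters. Account, UpdateProfileRequest and ProfileBinder are invented names, not Spring MVC classes.

```java
import java.util.Map;

// The form-backing DTO lists exactly the fields a client may change.
record UpdateProfileRequest(String displayName, String email) {}

// Hypothetical entity with a field the client must never set.
final class Account {
    private String displayName;
    private String email;
    private boolean admin;

    void apply(UpdateProfileRequest request) {
        this.displayName = request.displayName();
        this.email = request.email();
        // 'admin' is untouched: the DTO has no such field to carry it.
    }

    String displayName() { return displayName; }
    String email() { return email; }
    boolean isAdmin() { return admin; }
}

final class ProfileBinder {
    private ProfileBinder() {}

    // Simulated binding: only the DTO's own fields are read; extra posted keys are ignored.
    static UpdateProfileRequest bind(Map<String, String> posted) {
        return new UpdateProfileRequest(posted.get("displayName"), posted.get("email"));
    }
}
```

If the entity itself were the binding target, a posted admin=true could end up on the entity unless you remembered to block it explicitly.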
