Why are data transfer objects (DTOs) an anti-pattern?

Why are data transfer objects (DTOs) an anti-pattern? - java

I've recently overheard people saying that data transfer objects (DTOs) are an anti-pattern.
Why? What are the alternatives?

Some projects have all data twice. Once as domain objects, and once as data transfer objects.
This duplication has a huge cost, so the architecture needs to get a huge benefit from this separation to be worth it.

DTOs are not an anti-pattern. When you're sending some data across the wire (say, to an web page in an Ajax call), you want to be sure that you conserve bandwidth by only sending data that the destination will use. Also, often it is convenient for the presentation layer to have the data in a slightly different format than a native business object.
I know this is a Java-oriented question, but in .NET languages anonymous types, serialization, and LINQ allow DTOs to be constructed on-the-fly, which reduces the setup and overhead of using them.

"DTO an AntiPattern in EJB 3.0" (original link currently offline) says:
The heavy weight nature of Entity
Beans in EJB specifications prior to
EJB 3.0, resulted in the usage of
design patterns like Data Transfer
Objects (DTO). DTOs became the
lightweight objects (which should have
been the entity beans themselves in
the first place), used for sending the
data across the tiers... now EJB 3.0
spec makes the Entity bean model same
as Plain old Java object (POJO). With
this new POJO model, you will no
longer need to create a DTO for each
entity or for a set of entities... If
you want to send the EJB 3.0 entities
across the tier make them just
implement java.io.Serialiazable

OO purists would say that DTO is anti-pattern because objects become data table representations instead of real domain objects.

I don't think DTOs are an anti-pattern per se, but there are antipatterns associated with the use of DTOs. Bill Dudney refers to DTO explosion as an example:
http://www.softwaresummit.com/2003/speakers/DudneyJ2EEAntiPatterns.pdf
There are also a number of abuses of DTOs mentioned here:
http://anirudhvyas.com/root/2008/04/19/abuses-of-dto-pattern-in-java-world/
They originated because of three tier systems (typically using EJB as technology) as a means to pass data between tiers. Most modern day Java systems based on frameworks such as Spring take a alternative simplified view using POJOs as domain objects (often annotated with JPA etc...) in a single tier... The use of DTOs here is unnecessary.

Some consider DTOs an anti-pattern due to their possible abuses. They're often used when they shouldn't be/don't need to be.
This article vaguely describes some abuses.

The question should not be "why", but "when".
Definitively it's anti-pattern when only result of using it is higher cost - run-time or maintenance. I worked on projects having hundreds of DTOs identical to database entity classes. Each time you wanted to add a single field you ad to add id like four times - to DTO, to entity, to conversion from DTO to domain classes or entities, the inverse conversion, ... You forgot some of the places and data got inconsistent.
It's not anti-pattern when you really need different representation of domain classes - flatter, richer, ...
Personally I start with a domain class and pass it around, with proper checks at the right places. I can annotate and/or add some "helper" classes to make mappings to database, to serialization formats like JSON or XML ... I can always split a class to two if I feel the need.
It's about your point of view - I prefer to look at a domain object as a single object playing various roles, instead of multiple objects created from each other. If the only role an object is to transport data, then it's DTO.

If you're building a distributed system, then DTOs are certainly not an anti pattern. Not everyone will develop in that sense, but if you have a (for example) Open Social app all running off JavaScript.
It will post a load of data to your API. This is then deserialized into some form of object, typically a DTO/Request object. This can then be validated to ensure the data entered is correct before being converted into a model object.
In my opinion, it's seen as an anti-pattern because it's mis-used. If you're not build a distributed system, chances are you don't need them.

DTO becomes a necessity and not an ANTI-PATTERN when you have all your domain objects load associated objects EAGERly.
If you don't make DTOs, you will have unnecessary transferred objects from your business layer to your client/web layer.
To limit overhead for this case, rather transfer DTOs.

I think the people mean it could be an anti-pattern if you implement all remote objects as DTOs. A DTO is merely just a set of attributes and if you have big objects you would always transfer all the attributes even if you do not need or use them. In the latter case prefer using a Proxy pattern.

The intention of a Data Transfer Object is to store data from different sources and then transfer it into a database (or Remote Facade) at once.
However, the DTO pattern violates the Single Responsibility Principle, since the DTO not only stores data, but also transfers it from or to the database/facade.
The need to separate data objects from business objects is not an antipattern, since it is probably required to separate the database layer anyway.
Instead of DTOs you should use the Aggregate and Repository Patterns, which separates the collection of objects (Aggregate) and the data transfer (Repository).
To transfer a group of objects you can use the Unit Of Work pattern, that holds a set of Repositories and a transaction context; in order to transfer each object in the aggregate separately within the transaction.

Related

What is the main reason to create a DTO nowadays? [duplicate]

Why should I use DTOs/Domain objects when I can just put all business classes in a class library, use them in the business logic, then also pass those same business objects on to boundary classes?
UPDATE:
All are great points, thanks for the help. A follow up question:
Where do you typically place these DTOs? Alongside the Domain objects i.e. in the same namespace?
namespace MedSched.Medical
{
public class MedicalGroup
{
//...
}
public class MedicalGroupDTO
{
//...
}
}

The DTO's provide an abstraction layer for your domain model. You can therefore change the model and not break the contract you have for your service clients. This is akin to a common practice in database design where views and procedures become the abstraction layer for the underlying data model.
Serialization - you may over expose data and send bloated messages over the wire. This may be mitigated by using serialization attributes, but you may still have extra information.
Implicit vs. Explicit Contracts - by exposing domain objects you leave it up to the client to interpret their usage since they do not have the full model at their disposal. You will often make updates to domain objects implicitly based on the presence or removal of associations, or blindly accept all changes. DTOs explicitly express the usage of the service and the desired operation.
Disconnected Scenarios - having explicit messages or DTOs would make it easier for you to implement messaging and messaging patterns such as Message Broker, etc.
DDD - pure DDD demands that domain objects are externally immutable, and therefore you must offload this to another object, typically a DTO.

I can think of 2 basic scenarios for using DTOs:
You're creating your business object from data that's incomplete and will fail validation. For example, you're parsing a CSV file or an Excel file from where your business object is created. If you use data directly from these objects to create your business object, it is quite possible to fail several validation rules within the object, because data from such files are prone to errors. They also tend to have a different structure that you have in your final business object: having a placeholder for that incomplete data will be useful.
You're transporting your business object through a medium that is bandwidth intensive. If you are using a web service, you will need to use DTOs to simplify your object before transport; otherwise the CLR will have a hard time trying to serialize all your data.

DTOs are Data Transfer Objects, with Transfer being the key word. They are great when you want to pass your objects across the wire and potentially communicate with a different language because they are "light weight" (ie no business logic.)
If your application isn't web service based, DTOs aren't really going to buy you anything.
So deciding to use or not use DTOs should be based on the architecture of your application.

There are times when the data you want to pass around doesn't map exactly to the structure the business objects, in which case you use DTOs.
Ex: when you want to pass a subset of the data or a projection.

Microservices Restful API - DTOs or not?

REST API - DTOs or not?
I would like to re-ask this question in Microservices' context. Here is the quote from original question.
I am currently creating a REST-API for a project and have been reading
article upon article about best practices. Many seem to be against
DTOs and simply just expose the domain model, while others seem to
think DTOs (or User Models or whatever you want to call it) are bad
practice. Personally, I thought that this article made a lot of sense.
However, I also understand the drawbacks of DTOs with all the extra
mapping code, domain models that might be 100% identical to their
DTO-counterpart and so on.
Now, My question
I am more aligned towards using one Object through all the layers of my application (In other words, just expose Domain Object rather than creating DTO and manually copying over each fields). And the differences in my Rest contract vs domain object can be addressed using Jackson annotations like #JsonIgnore or #JsonProperty(access = Access.WRITE_ONLY) or #JsonView etc). Or if there is one or two fields that needs a transformation which cannot be done using Jackson Annotation, then I will write custom logic to handle just that (Trust me, I haven't come across this scenario not even once in my 5+ years long journey in Rest services)
I would like to know if I am missing any real bad effects for not copying the Domain to DTO

I would vote for using DTOs and here is why:
Different requests (events) and your DB entities. Often it happens that your requests/responses different from what you have in the domain model. Especially it makes sense in microservice architecture, where you have a lot of events coming from other microservices. For instance, you have Order entity, but the event you get from another microservice is OrderItemAdded. Even if half of the events (or requests) are the same as entities it still does make sense to have a DTOs for all of them in order to avoid a mess.
Coupling between DB schema and API you expose. When using entities you basically expose how you model your DB in a particular microservice. In MySQL you probably would want to have your entities to have relations, they will be pretty massive in terms of composition. In other types of DBs, you would have flat entities without lots of inner objects. This means that if you use entities to expose your API and want to change your DB from let's say MySQL to Cassandra - you'll need to change your API as well which is obviously a bad thing to have.
Consumer Driven Contracts. Probably this is related to the previous bullet, but DTOs makes it easier to make sure that communication between microservices is not broken whilst their evolution. Because contracts and DB are not coupled this is just easier to test.
Aggregation. Sometimes you need to return more than you have in one single DB entity. In this case, your DTO will be just an aggregator.
Performance. Microservices implies a lot of data transferring over the network, which may cost you issues with performance. If clients of your microservice need less data than you store in DB - you should provide them less data. Again - just make a DTO and your network load will be decreased.
Forget about LazyInitializationException. DTOs doesn't have any lazy loading and proxying as opposed to domain entities managed by your ORM.
DTO layer is not that hard to support with right tools. Usually, there is a problem when mapping entities to DTOs and backwards - you need to set right fields manually each time you want to make a conversion. It's easy to forget about setting the mapping when adding new fields to the entity and to the DTO, but fortunately, there are a lot of tools that can do this task for you. For instance, we used to have MapStruct on our project - it can generate conversion for you automatically and in compile time.

The Pros of Just exposing Domain Objects
The less code you write, the less bugs you produce.
despite of having extensive (arguable) test cases in our code base, I have came across bugs due to missed/wrong copying of fields from domain to DTO or viceversa.
Maintainability - Less boiler plate code.
If I have to add a new attribute, I don't have to add in Domain, DTO, Mapper and the testcases, of course. Don't tell me that this can be achieved using a reflection beanCopy utils, it defeats the whole purpose.
Lombok, Groovy, Kotlin I know, but it will save me only getter setter headache.
DRY
Performance
I know this falls under the category of "premature performance optimization is the root of all evil". But still this will save some CPU cycles for not having to create (and later garbage collect) one more Object (at the very least) per request
Cons
DTOs will give you more flexibility in the long run
If only I ever need that flexibility. At least, whatever I came across so far are CRUD operations over http which I can manage using couple of #JsonIgnores. Or if there is one or two fields that needs a transformation which cannot be done using Jackson Annotation, As I said earlier, I can write custom logic to handle just that.
Domain Objects getting bloated with Annotations.
This is a valid concern. If I use JPA or MyBatis as my persistent framework, domain object might have those annotations, then there will be Jackson annotations too. In my case, this is not much applicable though, I am using Spring boot and I can get away by using application-wide properties like mybatis.configuration.map-underscore-to-camel-case: true , spring.jackson.property-naming-strategy: SNAKE_CASE
Short story, at least in my case, cons doesn't outweigh the pros, so it doesn't make any sense to repeat myself by having a new POJO as DTO. Less code, less chances of bugs. So, going ahead with exposing the Domain object and not having a separate "view" object.
Disclaimer: This may or may not be applicable in your use case. This observation is per my usecase (basically a CRUD api having 15ish endpoints)

The decision is a much simpler one in case you use CQRS because:
for the write side you use Commands that are already DTOs; Aggregates - the rich behavior objects in your domain layer - are not exposed/queried so there is no problem there.
for the read side, because you use a thin layer, the objects fetched from the persistence should be already DTOs. There should be no mapping problem because you can have a readmodel for every use case. In worst case you can use something like GraphQL to select only the fields you need.
If you do not split the read from write then the decision is harder because there are tradeoffs in both solutions.

Persistence entities as data transfer objects

I have some persistence capable Java objects with all the #annotations in my web application. All these objects reside in a data layer. Is it a best practice to use these persistence objects as data transfer objects?
For example, if I want to pass back the data fetched from data store, should I directly return those persistence objects or should I manually copy data to a intermediary DTO and pass it back to other layers? Which approach would you suggest?

I would say that it is OK to do so (in fact the major advantage for these ORMs were to use these domain objects in different layers w/o having unnecessary DTOs) if you follow the following guidelines:
You don't extend the session boundaries i.e., any changes related to database should always be done using the Data Access Layer you have defined and not through these passed objects in other layers.
Any data that you need in other layer (layers above the Data Access Layer like Business Logic Layer and Presentation Layer) is pre-populated in these objects otherwise you will get exceptions according to the ORM behavior.
Don't extend the session boundaries for resolving the issue mentioned in No. 2

If you are absolutely certain no copies will end up in the Session storage or will be persisted/migrated to another instance/whatever, there is no need to do it.
If you need to keep these objects over multiple sessions/requests then it makes sense.
Another use-case is when you need to decouple loginc and persistence layer very thoroughly (i.e. to swap different persistence layers) then the coupling through the annotations could be troublesome.

I have never needed the extra level of abstraction provided by copying the persistence instance to a different DTO class.

How large should be Data Transfer Objects?

As I understand, Data Transfer Objects are used for different purposes, so let's bound the scope with the view layer in Java (JSF) -based web-applications (i.e. there are usually some entity-objects mapped on a DB, which can be also used in a Business-Logic layer, and some transfer objects used in a presentation layer).
So, I have some misunderstanding about how should well-designed DTOs look. Should I keep them as small as possible? Or should I try to pass with them as much information as possible and design them in such way that only some (different in different use-cases) part of the DTO fields is initialized at a time?
Should I consider using some OO principles (at least inheritance and composition) when designing DTOs or they should be as simple as only a number of primitive-type fields with their accessors?

DTOs, if at all different from the domain objects/entities, should be as big as needed - you should transfer exactly all data that you need.

DTOs in any language should be pretty light weight. Whether or not to use inheritance is a question only you could answer - it really depends on the business needs. Otherwise the DTO itself should include the basic get/set properties.
Generally these objects are pretty light weight, however it really depends on the data / properties you need. If your DTO has 1 property vs 50 properties, if you need 50 so be it. When it comes time to passing data to functions / methods the DTO saves your life from having to add all those extra parameters. You're essentially just passing one object.

DTOs should be as lightweight as possible, distinct from the business objects, and limited in scope (for example package level objects).
I say they should be separate from the business objects, contrary to Bozho's statement "if at all different from the domain objects", because a DTO will often need setters that users of the business object should not use.
I have, for example, a Person object and a PersonDTO... the DTO needs a setter for the person's name (first, last, etc.) but that is retrieved from an external data source and my application is not allowed to change it, so my business object "Person" shouldn't have a setter.

Spring MVC: should service layer be returning operation specific DTO's?

In my Spring MVC application I am using DTO in the presentation layer in order to encapsulate the domain model in the service layer. The DTO's are being used as the spring form backing objects.
hence my services look something like this:
userService.storeUser(NewUserRequestDTO req);
The service layer will translate DTO -> Domain object and do the rest of the work.
Now my problem is that when I want to retrieve a DTO from the service to perform say an Update or Display I can't seem to find a better way to do it then to have multiple methods for the lookup that return different DTO's like...
EditUserRequestDTO userService.loadUserForEdit(int id);
DisplayUserDTO userService.loadUserForDisplay(int id);
but something does not feel right about this approach. Perhaps the service should not return things like EditUserRequestDTO and the controller should be responsible of assembling a requestDTO from a dedicated form object and vice versa.
The reason do have separate DTO's is that DisplayUserDTO is strongly typed to be read only and also there are many properties of user that are entities from a lookup table in the db (like city and state) so the DisplayUserDTO would have the string description of the properties while the EditUserRequestDTO will have the id's that will back the select drop down lists in the forms.
What do you think?
thanks

I like the stripped down display objects. It's more efficient than building the whole domain object just to display a few fields of it. I have used a similar pattern with one difference. Instead of using an edit version of a DTO, I just used the domain object in the view. It significantly reduced the work of copying data back and forth between objects. I haven't decided if I want to do that now, since I'm using the annotations for JPA and the Bean Validation Framework and mixing the annotations looks messy. But I'm not fond of using DTOs for the sole purpose of keeping domain objects out of the MVC layer. It seems like a lot of work for not much benefit. Also, it might be useful to read Fowler's take on anemic objects. It may not apply exactly, but it's worth thinking about.
1st Edit: reply to below comment.
Yes, I like to use the actual domain objects for all the pages that operate on a single object at a time: edit, view, create, etc.
You said you are taking an existing object and copying the fields you need into a DTO and then passing the DTO as part of the model to your templating engine for a view page (or vice-versa for a create). What does that buy you? The ref to the DTO doesn't weigh any less than the ref to the full domain object, and you have all the extra attribute copying to do. There's no rule that says your templating engine has to use every method on your object.
I would use a small partial domain object if it improves efficiency (no relationship graphs to build), especially for the results of a search. But if the object already exists don't worry about how big or complex it is when you are sticking it in the model to render a page. It doesn't move the object around in memory. It doesn't cause the templating engine stress. It just accesses the methods it needs and ignores the rest.
2nd edit:
Good point. There are situations where you would want a limited set of properties available to the view (ie. different front-end and back-end developers). I should read more carefully before replying. If I were going to do what you want I would probably put separate methods on User (or whatever class) of the form forEdit() and forDisplay(). That way you could just get User from the service layer and tell User to give you the use limited copies of itself. I think maybe that's what I was reaching for with the anemic objects comment.

You should use a DTO and never an ORM in the MVC layer! There are a number of really good questions already asked on this, such as the following: Why should I isolate my domain entities from my presentation layer?
But to add to that question, you should separate them to help prevent the ORM being bound on a post as the potential is there for someone to add an extra field and cause all kinds of mayhem requiring unnecessary extra validation.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.