Best practices with a domain object collection in DDD - java

I have the following class:
public class DomainClass {
    private Integer value;
    private Integer total;
    private Integer month;

    public Integer getValue() { return value; }
    public Integer getTotal() { return total; }

    public Double getPercent() { /* Business logic */ }
}
I want to perform the same getPercent operation over a list of DomainClass objects without repeating code. I have two ideas for handling this, but I don't know whether they're good ones.
Idea 1 - Create a service and iterate the list:
public class DomainObjectService {
    ....
    public Double getPercent(List<DomainClass> list) {
        double value = 0, total = 0;
        for (DomainClass d : list) {
            value += d.getValue();
            total += d.getTotal();
        }
        // do percent operation
    }
}
Idea 2 - Query the aggregated values at the database, fill an object and call the business method:
public class DomainObjectService {
    ....
    public Double getPercent() {
        double value, total;
        .... query the data with SQL and set the above variables
        // do percent operation
        return new DomainBusiness(value, total).getPercentage();
    }
}
I've read that in DDD an entity should handle its own logic, but how should an operation over a collection like this be treated?
And if my basic DDD knowledge is wrong, I would like to know good articles/books/examples of DDD in Java.

How do you manage your entities? Do you use any kind of ORM?
My solution for this kind of operation is to build a class that manages the collection of objects.
So, for example:
public class DomainClasses {
    private final List<DomainClass> domainClasses;
    ....
    public Double getPercent() {
        // manage the percent operation ...
        // ... on all the members the way ...
        // ... your business is expected ...
        // ... to do it on the collection
        return something;
    }
    // class initialization
}
In this way you can reuse the getPercent code of each class, but also implement a specific version of it for the collection. Moreover, the collection can access the package-private getters, if any, of DomainClass to perform these calculations. This way you expose nothing more than the functions you need to build your domain objects.
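As a minimal sketch, assuming DomainClass exposes getValue() and getTotal() as in the question, the collection's method could be implemented like this:
public Double getPercent() {
    double value = 0, total = 0;
    for (DomainClass d : domainClasses) {
        value += d.getValue();   // may be a package-private getter
        total += d.getTotal();
    }
    return total == 0 ? 0d : (value / total) * 100; // guard against an empty collection
}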
Note: this solution is viable if you manage your persistence without any ORM. If you do want to use an ORM, it will require additional work to configure the container class correctly.
Some links:
https://www.mehdi-khalili.com/orm-anti-patterns-part-4-persistence-domain-model (I work with a Domain Model separated from the Persistence Model)
https://softwareengineering.stackexchange.com/questions/371344/why-not-use-an-orm-with-ddd (this is what I'm doing: translating the Domain objects to DTOs that get persisted - it's a bit of extra code to write for collections, but once it's tested it works, always, and what you get is a domain with no interference from the ORM framework)
Update, to answer the question in the comments: I use the Memento pattern.
Storing
My Domain Class has a function that exports all its data into a Memento object. The repository takes a Domain instance, asks for the Memento, and then:
I generate the SQL insert/update (plain SQL with transaction management from Spring)
You can load your JPA entity and update it with the Memento information (care should be taken, but if you write tests, once it's done it will always work - hence, tests are important ;) )
Reading
For the reverse, building a Domain instance from the saved data, I do this:
in the persistence layer, where the repository code is implemented, I've extended my Memento (let's call it PersistedMemento)
when I have to load something, I build a PersistedMemento and use it to build an instance of the Domain Class
my Domain Class has a function that builds objects from a Memento. Note: this may not always be necessary, but in my case the main constructor has extra checks that cannot be done when the object is rebuilt from a saved one. Anyway, this simplifies the rebuilding of the Domain Class.
To protect the Domain classes from being used outside the world of the domain:
my repositories require an existing transaction, so they cannot be used directly just anywhere in the code
the Memento classes have protected constructors, so they are usable only in the Domain package or the Repository package. The PersistedMemento is also hidden in the Repository package, so no instances can be created elsewhere.
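To make the mechanics concrete, here is a minimal sketch of the round trip described above. All names are hypothetical, and the PersistedMemento subclass and the repository itself are omitted:
public class DomainEntity {
    private int value;

    public DomainEntity(int value) {
        // the main constructor, with the extra checks
        if (value < 0) throw new IllegalArgumentException("value must be >= 0");
        this.value = value;
    }

    private DomainEntity() {
        // used only when rebuilding from a Memento; skips the checks above
    }

    // Export the state so the repository can persist it.
    public Memento toMemento() {
        return new Memento(value);
    }

    // Rebuild a domain instance from saved data.
    public static DomainEntity fromMemento(Memento m) {
        DomainEntity e = new DomainEntity();
        e.value = m.getValue();
        return e;
    }

    public static class Memento {
        private final int value;

        // protected: instantiable only from the domain and repository side
        protected Memento(int value) {
            this.value = value;
        }

        protected int getValue() {
            return value;
        }
    }
}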
Notes
Of course, this is not a perfect solution. The Domain Class has 2 functions that are there to support a non-domain requirement. The Memento class could also be sub-classed, and an instance used to build the Domain Class (but why? It's much simpler to build it with the default constructor). But, except for this small amount of pollution, the domain stays really clean and I can concentrate on the domain requirements without thinking about how to manage the persistence.

Related

What's the point of having DTO object when you have the same object as POJO (Entity)?

I would like to understand the benefits of creating DTO objects when you already have the same objects as POJOs (entities).
In my project I have both:
DTO classes are used to communicate between the web service and the application
POJO entity classes (JPA) are used for communication between the database and the application
If I look at a DTO class (let's call it MyObjDTO) and the same class on the POJO side (let's call it MyObjPOJO), there is no difference at all, except that MyObjPOJO has annotations because it's an @Entity.
So in fact I have 2 classes in my project that look the same (same attributes, same methods) but serve different purposes.
IMO, in this case the DTO class is useless and increases application complexity, because everything I do with the DTO class I could do with my POJO class; moreover, for a single type of object I have to maintain at least 2 classes (the DTO and the POJO). For instance, if I add an attribute, I have to add it in both classes.
I'm not an expert and I'm questioning my own thoughts; what do you think about it?
This answer is a replica of what can be found on Stack Exchange. IMHO the question should be closed for being posted in the wrong forum. It's currently attracting opinionated answers, though not necessarily so, and isn't tied to Java in any particular way.
DTO is a pattern and it is implementation (POJO/POCO) independent. DTO says: since each call to a remote interface is expensive, the response to each call should bring as much data as possible. So, if multiple requests would be required to bring data for a particular task, the data to be brought can be combined in a DTO so that a single request can bring all the required data. The Catalog of Patterns of Enterprise Application Architecture has more details.
DTO's are a fundamental concept, not outdated.
What is somewhat outdated is the notion of having DTOs that contain no logic at all, are used only for transmitting data, are "mapped" from domain objects before transmission to the client, and there are mapped to view models before being passed to the display layer. In simple applications, the domain objects can often be reused directly as DTOs and passed straight through to the display layer, so that there is only one unified data model. For more complex applications you don't want to expose the entire domain model to the client, so a mapping from domain models to DTOs is necessary. Having a separate view model that duplicates the data from the DTOs almost never makes sense.
However, the reason why this notion is outdated rather than just plain wrong is that some (mainly older) frameworks/technologies require it, as their domain and view models are not POJOS and instead tied directly to the framework.
Most notably, Entity Beans in J2EE prior to the EJB 3 standard were not POJOs and instead were proxy objects constructed by the app server - it was simply not possible to send them to the client, so you had no choice about having a separate DTO layer - it was mandatory.
Although DTO is not an outdated pattern, it is often applied needlessly, which might make it appear outdated.
From Java guru Adam Bien:
The most misused pattern in the Java Enterprise community is the DTO. DTO was clearly defined as a solution for a distribution problem. DTO was meant to be a coarse-grained data container which efficiently transports data between processes (tiers). ~ Adam Bien
From Martin Fowler:
DTOs are called Data Transfer Objects because their whole purpose is to shift data in expensive remote calls. They are part of implementing a coarse grained interface which a remote interface needs for performance. Not just do you not need them in a local context, they are actually harmful both because a coarse-grained API is more difficult to use and because you have to do all the work moving data from your domain or data source layer into the DTOs. ~ Martin Fowler
Here is a Java EE specific example of a common but incorrect use of the DTO pattern. If you're unfamiliar with Java EE, you just need to know the MVC pattern: a "JSF ManagedBean" is a class used by the View, and a "JPA Entity" is the Model in the MVC pattern.
So, for example, say you have a JSF ManagedBean. A common question is whether the bean should hold a reference to a JPA Entity directly, or whether it should maintain a reference to some intermediary object which is later converted to an Entity. I have heard this intermediary object referred to as a DTO, but if your ManagedBeans and Entities are operating within the same JVM, then there is little benefit to using the DTO pattern.
Furthermore, consider Bean Validation annotations (again, if you're unfamiliar with Java EE, know that Bean Validation is an API for validating data). Your JPA entities are likely annotated with @NotNull and @Size validations. If you're using a DTO, you'll want to repeat these validations in your DTO so that clients using your remote interface don't need to send a message just to find out they've failed basic validation. Imagine all that extra work of copying Bean Validation annotations between your DTO and entity; but if your view and entities are operating within the same JVM, there is no need to take on this extra work: just use the entities.
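For instance, a hypothetical entity/DTO pair where the same constraints have to be kept in sync by hand:
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;

@Entity
public class Customer {
    @Id
    private Long id;

    @NotNull
    @Size(min = 1, max = 50)
    private String name;
    // getters and setters omitted
}

// The DTO repeats the same constraints so remote clients fail fast,
// and every change now has to be made in both places.
class CustomerDto {
    @NotNull
    @Size(min = 1, max = 50)
    private String name;
    // getters and setters omitted
}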
The Catalog of Patterns of Enterprise Application Architecture provides a concise explanation of DTOs, and here are more references I found illuminating:
HOW TO DEAL WITH J2EE AND DESIGN PATTERNS
How to use DTO in JSF + Spring + Hibernate
Pros and Cons of Data Transfer Objects
Martin Fowler's description of DTO
Martin Fowler explains the problem with DTOs. Apparently they were being misused as early as 2004.
Most of this comes down to Clean Architecture and a focus on separation of concerns.
My biggest use case for the entities is so I don't litter the DTOs with runtime variables or methods that I've added for convenience (such as display names/values or post-calculated values).
If it's a very simple entity then it isn't such a big deal, but if you're being extremely strict with Clean Architecture there end up being a lot of redundant models (DTO, DBO, Entity).
It's really a preference for how much you want to dedicate to strict Clean Architecture.
https://medium.com/android-dev-hacks/detailed-guide-on-android-clean-architecture-9eab262a9011
There is an advantage, even if a small one, to having a separation of layers in your architecture and having objects "morph" as they travel through the layers. This decoupling allows you to replace any layer in your software with minimal change: just update the mapping of fields between the 2 objects and you're all set.
If the 2 objects have the same members...well, then that's what Apache Commons BeanUtils.copyProperties() is for ;)
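For instance, a minimal sketch using the MyObjDTO/MyObjPOJO names from the question (the mapper class itself is hypothetical):
import org.apache.commons.beanutils.BeanUtils;

public final class DtoMapper {
    public static MyObjDTO toDto(MyObjPOJO entity) throws Exception {
        MyObjDTO dto = new MyObjDTO();
        // Copies every readable property of 'entity' into the writable
        // property of 'dto' with the same name (note: destination comes first).
        BeanUtils.copyProperties(dto, entity);
        return dto;
    }
}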
Other people have already explained the benefits of DTOs; here I will talk about how to solve the trouble of maintaining one more DTO version of each object.
I developed a library, beanKnife, to automatically generate DTOs. It creates a new class based on the original POJO. You can filter the inherited properties, modify existing properties or add new properties. All you need to do is write a configuration class, and the library does the rest for you. The configuration supports inheritance, so you can extract the common parts to simplify the configuration even more.
Here is the example
@Entity
class Pojo1 {
    private int a;
    @OneToMany(mappedBy = "b")
    private List<Pojo2> b;
}

@Entity
class Pojo2 {
    private String a;
    @ManyToOne()
    private Pojo1 b;
}

// Include all properties. By default, nothing is included.
// To change this behaviour, here we use a base configuration which all other final configurations will inherit.
@PropertiesIncludePattern(".*")
// By default, the generated class name is the original class name with "View" appended.
// This annotation changes that behaviour: now class Pojo1 will generate the class Pojo1Dto.
@ViewGenNameMapper("${name}Dto")
class BaseConfiguration {
}

// generates Pojo1Dto, whose b is a list of Pojo2Info instead of a list of Pojo2
@ViewOf(value = Pojo1.class)
class Pojo1DtoConfiguration extends BaseConfiguration {
    private List<Pojo2Info> b;
}

// generates Pojo1Info, which excludes b
@ViewOf(value = Pojo1.class, genName = "Pojo1Info", excludes = "b")
class Pojo1InfoConfiguration extends BaseConfiguration {}

// generates Pojo2Dto, whose b is a Pojo1Info instead of a Pojo1
@ViewOf(value = Pojo2.class)
class Pojo2DtoConfiguration extends BaseConfiguration {
    private Pojo1Info b;
}

// generates Pojo2Info, which excludes b
@ViewOf(value = Pojo2.class, genName = "Pojo2Info", excludes = "b")
class Pojo2InfoConfiguration extends BaseConfiguration {}
will generate
class Pojo1Dto {
    private int a;
    private List<Pojo2Info> b;
}

class Pojo1Info {
    private int a;
}

class Pojo2Dto {
    private String a;
    private Pojo1Info b;
}

class Pojo2Info {
    private String a;
}
Then use it like this
Pojo1 pojo1 = ...
Pojo1Dto pojo1Dto = Pojo1Dto.read(pojo1);
Pojo2 pojo2 = ...
Pojo2Dto pojo2Dto = Pojo2Dto.read(pojo2);

Correct way of finding what was modified by a POST in a Spring MVC controller?

It is a rather general question, but I will give a stripped-down example. Say I have a web CRUD application that manages simple entities stored in a database; nothing but the classics: JSP view, RequestMapping-annotated controller, transactional service layer and DAO.
On an update, I need to know the previous values of my fields, because a business rule asks for a test involving both the old and new values.
So I am searching for a best practice for that use case.
I think that Spring code is far more extensively tested and robust than my own, and I would like to do it the Spring way as much as possible.
Here is what I have tried:
1/ Load an empty object in the controller and manage the update in the service:
Data.java:
class Data {
    int id; // primary key
    String name;
    // ... other fields, getters, and setters omitted for brevity
}
DataController
...
#RequestMapping("/data/edit/{id}", method=RequestMethod.GET)
public String edit(#PathVariable("id") int id, Model model) {
model.setAttribute("data", service.getData(id);
return "/data/edit";
}
#RequestMapping("/data/edit/{id}", method=RequestMethod.POST)
public String update(#PathVariable("id") int id, #ModelAttribute Data data, BindingResult result) {
// binding result tests omitted ..
service.update(id, data)
return "redirect:/data/show";
}
DataService
@Transactional
public void update(int id, Data form) {
    Data data = dataDao.find(id);
    // ok, I have the old values in data and the new values in form -> do the test stuff ...
    // ... and MANUALLY copy the fields from form to data
    data.setName(form.getName());
    ...
}
It works fine, but in a real case with many domain objects and many fields in each, it is quite easy to forget one ... whereas Spring's WebDataBinder does all of it, including validation, in the controller, without my having to write a single thing other than @ModelAttribute!
2/ I tried to preload the Data from the database by declaring a Converter
DataConverter
public class DataConverter implements Converter<String, Data> {
    @Override
    public Data convert(String strid) {
        return dataService.getData(Integer.valueOf(strid));
    }
}
Absolutely magic! The data is fully initialized from the database, and the fields present in the form are properly updated. But ... there is no way to get the previous values ...
So my question is: how can I use Spring's DataBinder magic and still have access to the previous values of my domain objects?
You have already found the possible choices, so I will just add some ideas here ;)
I will start with your option of using an empty bean and copying the values over to a loaded instance:
As you have shown in your example it's an easy approach. It's quite easily adaptable to create a generalized solution.
You do not need to copy the properties manually! Take a look at the 'BeanWrapperImpl' class. This Spring class allows you to copy properties and is in fact the one used by Spring itself to achieve its magic. It's used by the 'ParameterResolvers', for example.
So copying properties is the easy part. Clone the loaded object, fill the loaded object and compare them somehow.
If you have one service or just several this is the way to go.
In my case we needed this feature on every entity. Using Hibernate, we have the issue that an entity might be changed not only inside a specific service call, but theoretically all over the place.
So I decided to create a 'MappedSuperclass' which all entities need to extend. This entity has a 'PostLoad' event listener which clones the entity into a transient field directly after loading. (This works as long as you don't have to load thousands of entities in a request.) Then you also need 'PostPersist' and 'PostUpdate' listeners to clone the new state again, as you probably don't reload the entity before another modification.
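A sketch of that base class, reduced to its skeleton (all names are hypothetical):
import javax.persistence.MappedSuperclass;
import javax.persistence.PostLoad;
import javax.persistence.PostPersist;
import javax.persistence.PostUpdate;
import javax.persistence.Transient;

@MappedSuperclass
public abstract class TrackedEntity {

    @Transient
    private TrackedEntity previousState; // snapshot of the last synchronized state

    @PostLoad
    @PostPersist
    @PostUpdate
    void snapshot() {
        this.previousState = copy(); // re-snapshot after every sync with the database
    }

    // Each entity decides how to copy itself; a shallow copy is often enough.
    protected abstract TrackedEntity copy();

    public TrackedEntity getPreviousState() {
        return previousState;
    }
}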
To facilitate the controller mapping I have implemented a 'StringToEntityConverter' doing exactly what you did, just generalized to support any entity type.
Finding the changes in a generalized approach will involve quite a bit of reflection. It's not that hard and I don't have the code available right now, but you can also use the 'BeanWrapper' for that:
Create a wrapper for both objects. Get all 'PropertyDescriptors' and compare the results. The hardest part is to find out when to stop: do you compare only the first level, or do you need a deep comparison?
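A minimal first-level comparison along those lines (a hypothetical helper built on Spring's 'BeanWrapperImpl'):
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

import org.springframework.beans.BeanWrapper;
import org.springframework.beans.BeanWrapperImpl;

public final class ShallowDiff {

    // Returns the names of the first-level properties whose values differ.
    public static List<String> changedProperties(Object before, Object after) {
        BeanWrapper oldWrap = new BeanWrapperImpl(before);
        BeanWrapper newWrap = new BeanWrapperImpl(after);
        List<String> changed = new ArrayList<>();
        for (PropertyDescriptor pd : oldWrap.getPropertyDescriptors()) {
            String name = pd.getName();
            if ("class".equals(name)) {
                continue; // skip Object#getClass
            }
            if (!Objects.equals(oldWrap.getPropertyValue(name),
                                newWrap.getPropertyValue(name))) {
                changed.add(name);
            }
        }
        return changed;
    }
}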
One other solution could be to rely on Hibernate Envers. This works if you do not need the changes during the same transaction. As Envers tracks the changes during a flush and creates a 'Revision', you can "simply" fetch two revisions and compare them.
In all scenarios you will have to write comparison code. I'm not aware of a library for it, but there is probably something around in the Java world :)
Hope that helps a bit.

Custom Constructor : Apache Cayenne 3.2M

I'm new to the API. It appears to me that you have to construct objects via the 'context' object like this:
ServerRuntime cayenneRuntime = new ServerRuntime("cayenne-project.xml");
ObjectContext context = cayenneRuntime.newContext();
...
MyEntity entity = context.newObject(MyEntity.class);
Rather than just creating Java objects the usual way with new:
MyEntity entity = new MyEntity();
But I want to create a constructor for my 'MyEntity' class that would do something like:
public MyEntity(String inputFile) {
    ...
    // call setters based on information derived from inputFile (size, time created, etc.)
    ...
}
How can I achieve this? Ideally I want to keep the logic on the MyEntity class itself, rather than having a 'wrapper' class somewhere else to instantiate the object and perform the setting. I guess I could have a 'helper' method which just applies the settings to a previously instantiated instance... but is there an idiom I'm missing here?
You got it right about creating the object via 'context.newObject(..)' - this is the best way to do it and will keep you out of trouble. Still, you can actually have your own constructor (provided you also maintain a default constructor for the framework to use):
public MyEntity(String inputFile) {
    ...
}

public MyEntity() {
}
Then you can create your object first, and add it to the context after that:
MyEntity e = new MyEntity(inputFile);
context.registerNewObject(e);
As far as idioms go, a very common one is to avoid business logic in persistent objects. ORM models are often reused in more than one application, and behavior you add to the entities doesn't apply uniformly everywhere. The other side of this argument is that anything but the simplest methods depends on knowledge of the surrounding environment - something you don't want your entities to be aware of.
Instead one would write a custom service layer that sits on top of the entities and contains all the business logic (often used with a dependency injection container). Services are not wrappers of entities (in fact services are often singletons). You can think of them as configurable strategy objects. In the Java world such layered design and this type of separation of concerns is very common and is probably the most flexible approach.
But if you want to hack something together quickly, and don't envision it growing into a complex multi-module system, then using a custom constructor or a static factory method in the entity is just fine, of course.

How do you keep clean layers separation with Hibernate/ORM?

How is it possible to keep clean layers with Hibernate/ORM (or other ORMs...)?
What I mean by clean layer separation is, for example, keeping all of the Hibernate stuff in the DAO layer.
For example, when creating a big CSV export stream, we often have to do some Hibernate operations like evict to avoid an OutOfMemoryError... The filling of the output stream belongs to the view, but the evict belongs to the DAO.
What I mean is that we are not supposed to put evict operations in the frontend / service, nor are we supposed to put business logic in the DAO... So what can we do in such situations?
There are many cases where you have to do some stuff like evict, flush, clear, refresh, particularly when you play a bit with transactions, large data or things like that...
So how do you do to keep clear layers separation with an ORM tool like Hibernate?
Edit: something I don't like either at work is that we have a custom abstract DAO that permits a service to pass a Hibernate criterion as an argument. This is practical, but to me, in theory, a service that calls this DAO shouldn't be aware of a criterion. I mean, we shouldn't have to import Hibernate classes into the business / view logic in any way.
Is there an answer, simple or otherwise?
If by "clean" you mean that upper layers don't know about implementations of the lower layers, you can usually apply the
Tell, don't ask principle. For your CSV streaming example, it would be something like, say:
// This is a "global" API (meaning it is visible to all layers). This is ok as
// it is a specification and not an implementation.
public interface FooWriter {
void write(Foo foo);
}
// DAO layer
public class FooDaoImpl {
...
public void streamBigQueryTo(FooWriter fooWriter, ...) {
...
for (Foo foo: executeQueryThatReturnsLotsOfFoos(...)) {
fooWriter.write(foo);
evict(foo);
}
}
...
}
// UI layer
public class FooUI {
...
public void dumpCsv(...) {
...
fooBusiness.streamBigQueryTo(new CsvFooWriter(request.getOutputStream()), ...);
...
}
}
// Business layer
public class FooBusinessImpl {
...
public void streamBigQueryTo(FooWriter fooWriter, ...) {
...
if (user.canQueryFoos()) {
beginTransaction();
fooDao.streamBigQueryTo(fooWriter, ...);
auditAccess(...);
endTransaction();
}
...
}
}
In this way you can deal with your specific ORM with freedom. The downside of this "callback" approach: if your layers are on different JVMs then it might not be very workable (in the example you would need to be able to serialize CsvFooWriter).
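For completeness, one possible CsvFooWriter for the example above (assuming Foo exposes getId() and getName(); real code would also escape separators and quotes):
import java.io.OutputStream;
import java.io.PrintWriter;

public class CsvFooWriter implements FooWriter {

    private final PrintWriter out;

    public CsvFooWriter(OutputStream outputStream) {
        this.out = new PrintWriter(outputStream);
    }

    @Override
    public void write(Foo foo) {
        out.println(foo.getId() + "," + foo.getName()); // naive CSV line
        out.flush();
    }
}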
About generic DAOs: I have never felt the need; most object access patterns I have found are different enough to make a specific implementation desirable. But certainly, doing layer separation and forcing the business layer to create Hibernate criteria are contradictory paths. I would specify a different query method in the DAO layer for each different query, and then let the DAO implementation get the results in whatever way it chooses (criteria, query language, raw SQL, ...). So instead of:
public class FooDaoImpl extends AbstractDao<Foo> {
    ...
    public Collection<Foo> getByCriteria(Criteria criteria) {
        ...
    }
}

public class FooBusinessImpl {
    ...
    public void doSomethingWithFoosBetween(Date from, Date to) {
        ...
        Criteria criteria = ...;
        // Build your criteria to get only foos between from and to
        Collection<Foo> foos = fooDaoImpl.getByCriteria(criteria);
        ...
    }

    public void doSomethingWithActiveFoos() {
        ...
        Criteria criteria = ...;
        // Build your criteria to filter out passive foos
        Collection<Foo> foos = fooDaoImpl.getByCriteria(criteria);
        ...
    }
    ...
}
I would do:
public class FooDaoImpl {
    ...
    public Collection<Foo> getFoosBetween(Date from, Date to) {
        // build and execute the query according to from and to
    }

    public Collection<Foo> getActiveFoos() {
        // build and execute the query to get active foos
    }
}

public class FooBusinessImpl {
    ...
    public void doSomethingWithFoosBetween(Date from, Date to) {
        ...
        Collection<Foo> foos = fooDaoImpl.getFoosBetween(from, to);
        ...
    }

    public void doSomethingWithActiveFoos() {
        ...
        Collection<Foo> foos = fooDaoImpl.getActiveFoos();
        ...
    }
    ...
}
Though someone could think that I'm pushing some business logic down to the DAO layer, it seems a better approach to me: changing the ORM implementation to an alternative one would be easier this way. Imagine, for example, that for performance reasons you need to read Foos using raw JDBC to access some vendor-specific extension: with the generic DAO approach you would need to change both the business and DAO layers; with this approach you would just reimplement the DAO layer.
Well, you can always tell your DAO layer to do what it needs to do when you want to. Having a method like cleanUpDatasourceCache in your DAO layer, or something similar (or even a set of these methods for different objects), is not bad practice to me.
And your service layer is then able to call that method without any assumption about what is done by the DAO under the hood. A specific implementation which uses direct JDBC calls would do nothing in that method.
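A rough sketch of such a contract, reusing the method name suggested above (Foo and find are assumptions):
public interface FooDao {

    Foo find(long id);

    // A Hibernate implementation would call session.evict(...) or session.clear();
    // a plain-JDBC implementation can leave this as a no-op.
    void cleanUpDatasourceCache();
}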
Usually a DAO layer to wrap the data access logic is necessary. Other times the EntityManager is all you want to use for CRUD operations; in those cases I wouldn't add a DAO, as it would add unnecessary complexity to the code.
How should EntityManager be used in a nicely decoupled service layer and data access layer?
If you don't want to tie your code to Hibernate, you can use Hibernate through JPA instead and not bother too much about abstracting everything within your DAOs. You are less likely to switch from JPA to something else than you are to replace Hibernate.
My 2 cents: I think the layer separation pattern is great as a starting point for most cases, but there is a point where we have to analyze each specific application case by case and design a more flexible solution. What I mean is: ask yourself, for example:
is your DAO expected to be reused in another context other than exporting CSV data?
does it make sense to have another implementation of the same DAO interface without Hibernate?
If both answers are no, maybe a little bit of coupling between persistence and data presentation is OK. I like the callback solution proposed above.
IMHO, a strict implementation of a pattern sometimes has a higher cost in readability, maintainability, etc., and those are the very issues we were trying to fix by adopting a pattern in the first place.
You can achieve layer separation by implementing the DAO pattern and doing all Hibernate/JDBC/JPA-related stuff in the DAO itself.
For example, you can specify a generic DAO interface as
public interface GenericDao<T, PK extends Serializable> {

    /** Persist the newInstance object into the database */
    PK create(T newInstance);

    /**
     * Retrieve an object that was previously persisted to the database,
     * using the indicated id as primary key
     */
    T read(PK id);

    /** Save changes made to a persistent object. */
    void update(T transientObject);

    /** Remove an object from persistent storage in the database */
    void delete(T persistentObject);
}
and its implementation as
public class GenericDaoHibernateImpl<T, PK extends Serializable>
        implements GenericDao<T, PK>, FinderExecutor {

    private Class<T> type;

    public GenericDaoHibernateImpl(Class<T> type) {
        this.type = type;
    }

    public PK create(T o) {
        return (PK) getSession().save(o);
    }

    public T read(PK id) {
        return (T) getSession().get(type, id);
    }

    public void update(T o) {
        getSession().update(o);
    }

    public void delete(T o) {
        getSession().delete(o);
    }
}
This way, service classes can call any method on any DAO without making any assumptions about the internal implementation of that method.
Have a look at the GenericDao link.
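For illustration, hypothetical usage from a service class (Person is an assumed entity):
GenericDao<Person, Long> personDao = new GenericDaoHibernateImpl<>(Person.class);

Long id = personDao.create(new Person("Alice")); // Person and its constructor are hypothetical
Person person = personDao.read(id);
person.setName("Alice B.");
personDao.update(person);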
Hibernate (either as a SessionManager or a JPA EntityManager) is the DAO. The Repository pattern is, as far as I have seen, the best starting place. There is a great image over at the DDD Sample website which I think speaks volumes about how you keep things separate.
My application layer has interfaces that are explicit business actions or values. The business rules are in the domain model, and things like Hibernate live in the infrastructure. Services are defined at the domain layer as interfaces, and in my case implemented in the infrastructure. This means that for a given Foo domain object (an aggregate root in DDD terminology) I usually get the Foo from a FooService, and the FooService talks to a FooRepository, which allows one to find a Foo based on some criteria. That criteria is expressed via method parameters (possibly complex object types) which, on the implementation side, for example in a HibernateFooRepository, would be translated into HQL or Hibernate criterion.
If you need batch processing, it should exist at the application level and use domain services to facilitate this. StartBatchTransaction/EndBatchTransaction. Hibernate may listen to start/end events in order to coordinate purging, loading, whatever.
In the specific case of serializing domain entities, though, I see nothing wrong with taking a set of criteria and iterating over them one at a time (from root entities).
I find that often, in the pursuit of separation, we try to make things completely general. They are not one and the same: your application has to do something, and that something can and should be expressed rather explicitly.
If you can substitute an InMemoryFooRepository where a HibernateFooRepository was previously being used, you're on the right path. The natural flow of unit and integration testing your objects encourages this when you adhere to, or at least try to respect, the layering outlined in the image I linked above.
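As a sketch of that layering (Foo and the method names are assumptions; only the interface lives in the domain, while implementations live in the infrastructure):
import java.util.Date;
import java.util.List;

// Domain-level contract: criteria are plain method parameters, so no
// Hibernate type leaks out of the infrastructure layer. A
// HibernateFooRepository can translate them into HQL or criterion objects,
// and an InMemoryFooRepository can back the same interface in tests.
public interface FooRepository {
    Foo findById(long id);
    List<Foo> findCreatedBetween(Date from, Date to);
}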
You got some good answers here; I would like to add my thoughts (by the way, this is something to take care of in our code as well). I would also like to focus on the issue of having Hibernate/JPA annotations on entities that you might need to use outside of your DAL (i.e. in the business logic, or even sent to your client side):
A. If you use the GenericDAO pattern for a given entity, you may find your entity being annotated with Hibernate (or maybe JPA) annotations such as @Table, @ManyToOne and so on. This means that your client code may contain Hibernate/JPA annotations and you would require an appropriate JAR to get it compiled, or need some other support in your client code. This is the case, for example, if you use GWT as your client (which can have support for JPA annotations in order to get entities compiled) and share the entities between the server and the client code, or if you write a Java client that performs a bean lookup using InitialContext against a Java application server (in this case you will need an appropriate JAR as well).
B. Another approach is to work with Hibernate/JPA-annotated code on the server side and expose web services (let's say RESTful web services or SOAP). This way, the client works with an "interface" that does not expose knowledge of Hibernate/JPA (for example, in the case of SOAP, a WSDL defines the contract between the client of the service and the service itself). By breaking the architecture into a service-oriented one, you get all kinds of benefits such as loose coupling and ease of replacing pieces of code, and you can concentrate all the DAL logic in one service that serves the rest of your services, and later on replace the DAL, if needed, by another service.
C. You can use an "object to object" mapping framework such as Dozer to map objects of classes with Hibernate/JPA annotations to what I call "true" POJOs - i.e. Java beans with no annotations whatsoever on them.
D. Finally, regarding annotations - why use annotations at all? Hibernate can use hbm.xml files as an alternative way of doing the "ORM magic" - this way your classes can remain annotation-free.
E. One last point - I would suggest you look at the stuff we did at Ovirt - you can download the code by git cloning our repo. Under engine/backend/manager/modules/bll you will find a Maven project holding our BLL logic, and under engine/backend/manager/modules/dal our DAL layer (although currently implemented with Spring-JDBC, and some Hibernate experiments, you will get some good ideas on how to use it in your code). I would add that if you go for a similar solution, I suggest you inject the DAOs in your code and not hold them in a singleton like we did with the getXXXDao methods (this is legacy code we should strive to remove and move to injection).
I would recommend you let the database handle the export-to-CSV operation rather than building it yourself in Java; it isn't as efficient. ORM shouldn't really be used for those large-scale batch operations, because ORM should only be used to manipulate transactional data.
Large scale Java batch operations should really be done by JDBC directly with transactional support turned off.
However, if you do this regularly, I recommend setting up a reporting database: a delayed replica of the main database that is not used by the application, kept up to date by the database-specific replication tools that may come with your database.
Your solution architect should be able to work with the other groups to help set this up for you.
If you really have to do it in the application tier, then using raw JDBC calls may be the better option. With raw JDBC you can perform a query to assemble the data that you require on the database side and fetch the data one row at a time then write to your output stream.
To answer your layers question: though I don't like using the word "layers", because it usually implies one thing on top of another, I would rather use the word "components", and I have the following component groups.
application
domain - just annotated JPA classes, no persistence logic, usually a plain JAR file; but I recommend just plopping it in as a package in the EJB rather than having to deal with classpath issues
contracts - WSDL and XSD files that define an interface between different components be it web services or just UI.
transaction scripts - Stateless EJBs that would have a transaction and persistence units injected into them and do the manipulation and persistence of the domain objects. These may implement the interfaces generated by the contracts.
UI - a separate WAR project with EJBs injected into them.
database
O/R diagram - this is the contract that is agreed upon by application and data team to ensure THE MINIMUM that the database will provide. It does not have to show everything.
DDLs - this is the database-side implementation of the O/R diagram, which will contain everything, but generally no one should care because it is an implementation detail.
batch - batch operations such as export or replicate
reporting - provides queries to get business value reports from the system.
legacy
messaging contracts - these are contracts used by messaging systems such as JMS or WS-Notifications or standard web services.
their implementation
transformation scripts - used to transform one contract to another.
It seems to me we need to take another look at the layers.
(I hope someone corrects me if I get this wrong.)
Front End/UI
Business
Service/DAO
So for the case of generating a report, the layers break down like so.
Front End/UI
will have a UI with a button "Get Some Report"
the button will then call the Business layer that knows what the report is about.
The data returned by the report generator is given any final formatting before being returned to the user.
Business
MyReportGenerator.GenerateReportData() or similar will be called
Service/DAO
inside the report generator, DAOs will be used. DAOLocator.getDAO(Entity.class) or similar factory-type methods would be used to get the DAOs; the returned DAOs will extend a common DAO interface
Well, to get a clean separation of concerns, or what you might call clean layer separation, you can add a Service layer to your application, which lies between your frontend and DAO layer.
You can put your business logic in the Service layer and database-related things in the DAO layer using Hibernate.
So if you need to change something in your business logic, you can edit your Service layer without changing the DAO, and if you want to change the DAO layer, you can do so without changing the actual business logic, i.e. the Service layer.

Is a DAO Only Meant to Access Databases?

I have been brushing up on my design patterns and came across a thought that I could not find a good answer for anywhere. So maybe someone with more experience can help me out.
Is the DAO pattern only meant to be used to access data in a database?
Most the answers I found imply yes; in fact most that talk or write on the DAO pattern tend to automatically assume that you are working with some kind of database.
I disagree though. I could have a DAO like follows:
public interface CountryData {
    public List<Country> getByCriteria(Criteria criteria);
}

public final class SQLCountryData implements CountryData {
    public List<Country> getByCriteria(Criteria criteria) {
        // Get from an SQL database.
    }
}

public final class GraphCountryData implements CountryData {
    public List<Country> getByCriteria(Criteria criteria) {
        // Get from an injected in-memory graph data structure.
    }
}
Here I have a DAO interface and 2 implementations, one that works with an SQL database and one that works with say an in-memory graph data structure. Is this correct? Or is the graph implementation meant to be created in some other kind of layer?
And if it is correct, what is the best way to abstract implementation specific details that are required by each DAO implementation?
For example, take the Criteria Class I reference above. Suppose it is like this:
public final class Criteria {
    private String countryName;

    public String getCountryName() {
        return this.countryName;
    }

    public void setCountryName(String countryName) {
        this.countryName = countryName;
    }
}
For the SQLCountryData, it needs to somehow map the countryName property to an SQL identifier so that it can generate the proper SQL. For the GraphCountryData, perhaps some sort of Predicate Object against the countryName property needs to be created to filter out vertices from the graph that fail.
What's the best way to abstract details like this without coupling client code working against the abstract CountryData with implementation specific details like this?
Any thoughts?
EDIT:
The example I included of the Criteria class is simple enough, but consider if I want to allow the client to construct complex criteria, where they not only specify the property to filter on, but also the equality operator, logical operators for compound criteria, and the value.
DAOs are part of the DAL (Data Access Layer) and you can have data backed by any kind of implementation (XML, RDBMS, etc.). You just need to ensure that the proper instance is injected/used at runtime. DI frameworks like Spring/Guice shine in this case. Also, your Criteria interface/implementation should be generic enough that only business details are captured (i.e. the country name criterion) and the actual mapping is again handled by the implementation class.
For SQL, in your case, you can either hand-generate the SQL, generate it using a helper library like Spring, or use a full-fledged framework like MyBatis. In our project, Spring XML configuration files were used to decouple the client and the implementation; it might vary in your case.
EDIT: I see that you have raised a similar concern in the previous question. The answer still remains the same. You can add as much flexibility as you want in your interface; you just need to ensure that the implementation is smart enough to make sense of all the arguments it receives and map them appropriately to the underlying source. In our case, we retrieved the value object from the business layer and converted it to a map in the SQL implementation layer which could be used by MyBatis. Again, this process was pretty much transparent, and the only way for the service layer to communicate with the DAO was via the interface-defined value objects.
No, I don't believe it's tied to only databases. The acronym is for Data Access Object, not "Database Access Object" so it can be usable with any type of data source.
The whole point of it is to separate the application from the backing data store so that the store can be modified at will, provided it still follows the same rules.
That doesn't just mean turfing Oracle and putting in DB2. It could also mean switching to a totally non-DBMS-based solution.
OK, this is a bit of a philosophical question, so I'll tell you what I think about it.
DAO usually stands for Data Access Object. The source of data is not always a database, although in real-world projects implementations usually come down to that.
It can be XML, a text file, some remote system, or, as you stated, an in-memory graph of objects.
From what I've seen in real-world projects: yes, you're right, you should provide different DAO implementations for accessing the data in different ways.
In this case one DAO goes to the DB, and another DAO implementation goes to the object graph.
The interface of the DAO has to be designed very carefully. Your Criteria has to be generic enough to encapsulate the way you're going to get the data.
How to achieve this level of decoupling? The answer can vary depending on your system, but in general, I would say, the answer would be "as usual, by adding another level of indirection" :)
You can also think about your criteria object as a data object where you supply only the data needed for the query. In this case you won't even need to support different Criteria.
Each particular implementation of the DAO will take this data and treat it in its own way: one will construct a query for the graph, another will bind it to your SQL.
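Reusing the Criteria example from the question, a sketch of such a pure-data criteria object (Country's getName() is an assumption):
// The criteria object is pure data; each DAO interprets it in its own way.
public final class CountryCriteria {
    private final String countryName;

    public CountryCriteria(String countryName) {
        this.countryName = countryName;
    }

    public String getCountryName() {
        return countryName;
    }
}

// The graph-backed DAO can turn the same data into an in-memory filter:
//     Predicate<Country> filter =
//         c -> c.getName().equals(criteria.getCountryName());
// while the SQL-backed DAO binds it to a query parameter:
//     SELECT ... FROM country WHERE name = ?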
To minimize maintenance hassle I would suggest using a dependency injection framework (like Spring, for example). Usually these frameworks are well suited to instantiating your DAO objects and play well together.
Good Luck!
No, DAO for databases only is a common misconception.
DAO is a "Data Access Object", not a "Database Access Object". Hence anywhere you need to CRUD data to/from ( e.g. file, memory, database, etc.. ), you can use DAO.
In Domain Driven Design there is a Repository pattern. While Repository as a word is far better than three random letters (DAO), the concept is the same.
The purpose of the DAO/Repository pattern is to abstract a backing data store, which can be anything that can hold a state.
