I've had the opportunity to rework a good number of old, poorly maintained Perl scripts from a department library into a newer Java design, which should hopefully be more maintainable. Originally, this library did a number of things relating to our Active Directory instance, including looking for and reporting on new users, keeping track of which users we knew about, etc.
The next functionality to replicate is the ability to store simple user information in a database -- things like names, employee IDs and account names, nothing too complex. Because I generally don't enjoy JDBC, and I had the opportunity to expand my horizons a bit, I decided to poke at Hibernate. I know it's very likely overkill for what I'm doing with this application, but I figured that it was a good learning opportunity.
The issue that I have is fairly simple. I've got creating new persistent objects down; that's no sweat. Where I hit a speed bump is in retrieving those objects from the database using Hibernate. I can load the class by its built-in ID, but I don't see an option to load on anything else, and needless to say, there isn't an option to save the database's user ID into AD itself. I'm wondering if someone can provide a bit of insight on how to load already-seen users from the database without the user ID; a tutorial or link would be fine. I've tried reading the Hibernate documentation itself, but it's massive, and the vast majority doesn't apply to what I'm actually doing.
Thanks.
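Your best bet is to read section 10.4 of the Hibernate reference guide on HQL queries. Although you can use the Hibernate Criteria API to formulate queries, HQL is probably easier to grasp, IMHO. In a nutshell, you formulate queries through the Hibernate Session, using the persistent object's attributes as restriction criteria. For example, assuming a mapped class named User with an accountName property (both names are hypothetical), the HQL query "from User u where u.accountName = :accountName" selects users by account name instead of by ID.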
I am creating a webapp in Spring Boot (Spring + Hibernate + MySQL).
I have already created all the CRUD operations for the data of my app, and now I need to process the data and create reports.
Given the complexity of these reports, I will create some summary or preprocessed tables. This way, I can trigger the report creation once, and then retrieve the reports efficiently.
My question is whether I should build all the reports in Java or in stored procedures in MySQL.
Pros of doing it in Java:
More logging
More control of the structures (entities, maps, lists, etc.)
Catching exceptions
Portability if I change my DB engine (unlikely, but you never know)
Cons of doing it in Java:
Maybe memory?
Any thoughts on this?
Thanks!
Java, though both are possible. It depends on what matters most to you, what skills are available for maintenance, and what that maintenance costs. Stored procedures are usually very fast, but availability and performance also depend on exactly which database you use. You will need specialized skills, and the result will be tied to that specific database.
Hibernate comes with a dialect for each supported database to get the best performance out of the persistence layer. It’s not as fast as a stored procedure, but it comes pretty close. With Spring Data on top of that, most of the difficulty is gone. Maintenance will not cost that much, and people who know Spring Data are easier to find than specialists in a particular database vendor.
You can still create various “difficult” queries easily with HQL, so nothing is blocked there. But Hibernate offers more. You can have caching handled by Ehcache, and with Hibernate Envers you will have auditing done in no time. That’s the nice thing about this framework: it’s widely used, and many free Maven dependencies are there for the taking. And if you want to change your database in the future, you can do it by changing about three parameters in your application.properties file when using Spring Data.
You can play with some annotations and see what performs better. For example, the @Inheritance annotation lets you have some classes end up in the same table or split them across several tables. There is also @MappedSuperclass, which lets you have one JpaObject with the id that all your entities can extend. If you want some more tricks on JPA, maybe check this post with my answer on how to use a superclass and a general repository.
As per the complexity of these reports, I will create some summary or
preprocessed tables. This way, I can trigger the reports creation
once, and then get them efficiently.
My first thought is: is this required? It seems like adding complexity to the application that perhaps isn't needed. Premature optimisation and all that. Try writing the reports in SQL and checking the execution plan. If it's good enough, you have less code to maintain and no added batch jobs to administer. Consider load testing with e.g. JMeter or Gatling to see how it holds up under stress.
Consider using Querydsl or jOOQ for reporting. Both provide a database abstraction layer and a fluent API for querying databases, which deliver the benefits listed in the "Pros of doing it in Java" section of the question and may be better suited to the problem. The blog post "jOOQ vs. Hibernate: When to Choose Which" is well worth a read.
As part of my Java program, I need to run a lot of queries against an (Oracle) database.
Currently, we mix SQL and Java, which (I know) is a bad, bad thing.
What is the right way to handle something like this? If possible, include examples.
Thank you.
EDIT:
A bit more information about the application. It is a web application that derives content mainly from the database (it takes user input and paints content to be seen next based on what database believes to be true).
The biggest concern I have with how it's done today is that mixing Java code and SQL queries looks "out-of-place" when they are coupled as tightly as they are (queries hardcoded as part of the source code).
I am looking for a cleaner way to handle this situation, which would improve the maintainability and clarity of the project at hand.
For what you've described, incorporating an object relational mapper (ORM) or rewriting as stored procedures is probably more work than you want to embrace. Both have non-trivial learning curves.
Instead a good practice is consolidating SQL in a class per table or purpose. Take a look at the table data gateway object and the data access object design patterns to see how this is done in practice.
The benefits of this approach are many. You are better positioned for reuse because queries are in one spot. Client code becomes more readable as you replace several lines of JDBC and SQL with a method call (e.g. userTableDataGateway.getContentToShow(pageId)). Finally, this will help you see more clearly the problem that an ORM helps solve.
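A minimal sketch of what such a gateway might look like, using plain JDBC. The content table and its page_id, position and body columns are hypothetical, invented only for illustration; only the method name getContentToShow comes from the answer above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Table data gateway: all SQL touching the (hypothetical) content table lives here.
final class UserTableDataGateway {
    // Hypothetical schema: content(page_id, position, body).
    static final String SELECT_CONTENT_FOR_PAGE =
        "SELECT body FROM content WHERE page_id = ? ORDER BY position";

    private final Connection connection;

    UserTableDataGateway(Connection connection) {
        this.connection = connection;
    }

    // Callers get plain Java values and never see SQL or JDBC types.
    List<String> getContentToShow(long pageId) throws SQLException {
        try (PreparedStatement statement =
                 connection.prepareStatement(SELECT_CONTENT_FOR_PAGE)) {
            statement.setLong(1, pageId);
            try (ResultSet rows = statement.executeQuery()) {
                List<String> bodies = new ArrayList<>();
                while (rows.next()) {
                    bodies.add(rows.getString("body"));
                }
                return bodies;
            }
        }
    }
}
```

Client code then shrinks to a single call such as userTableDataGateway.getContentToShow(pageId), and a change to the query touches only this class.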
Well, one thing you could consider is an Object Relational Mapper (for example, Hibernate). This would allow you to map your database schema to Java objects, which would generally clean up your Java code.
However, if performance and speed are of the essence, you might be better off using a plain JDBC driver.
This would of course also depend on the task your application is trying to accomplish. If, for example, you need to do batch updates based on a CSV file, I might go with a pure JDBC solution. If you're designing a web application, I would definitely go with an ORM solution.
Also, note that a pure JDBC solution would involve having SQL in your Java code. Actually, for that matter, you would have to have some form of SQL, be it HQL, JPQL, or plain SQL, in any ORM solution as well. Point being, there's nothing wrong with some SQL in your Java application.
Edit in response to the OP's edits
If I were writing a web application from scratch, I would use an ORM. However, since you already have a working application, making the transition from a pure JDBC solution to an ORM would be pretty painful. It would clean up your code, but there is a significant learning curve involved and it takes quite a bit of set-up. Some of the pain from setting-up would be alleviated if you are working with some sort of bean-management system, like Spring, but it would still be pretty significant.
It would also depend on where you want to go with your application. If you plan on maintaining and adding to this code for a significant period, a refactor may be in order. I would not, however, recommend a re-write of your system just because you don't like having SQL hard-coded in your application.
Based on your updates, I concur with Tim Pote's edits re: the learning curve to integrate ORM. However, instead of integrating ORM, you could do things like using prepared statements, which you in turn store in a properties file. Or even store your queries in the DB so that you can make subtle updates to them that can then be read in immediately without restarting your app server. Both of these strategies would declutter your Java code of hard-coded SQL.
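As a rough sketch of the properties-file idea: the class name SqlCatalog and the key user.findByAccount are made up for illustration, and the properties text is passed in as a string only to keep the example self-contained. A real application would more likely load it from a file or classpath resource.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Holds named SQL statements loaded from properties-format text.
final class SqlCatalog {
    private final Properties queries = new Properties();

    SqlCatalog(String propertiesText) {
        try {
            queries.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the SQL registered under the given name, failing fast on typos.
    String sql(String name) {
        String sql = queries.getProperty(name);
        if (sql == null) {
            throw new IllegalArgumentException("No query named " + name);
        }
        return sql;
    }
}
```

The returned text would then be handed to connection.prepareStatement(catalog.sql("user.findByAccount")), so the Java code binds parameters but contains no SQL of its own.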
Ultimately though, I don't think there's a clear answer to your question, because there's nothing inherently wrong with what you're doing. It's just a bit inflexible, but perhaps acceptably so for your circumstances.
That said, I'm posting this as an answer!
I'm not sure of the state of the project, but you may also want to look at an 'alternate' object relational mapper called MyBatis. It has a lower learning curve than the popular Hibernate or EclipseLink and lets you actually write the queries, so you know what the code is doing. That is, if ORM is your thing.
I'm working with JPA right now (mainly because it is the current trend and it needs to be learned). JPA is the Java standard for ORM. If you are going to learn what is currently a typical ORM way of doing things, JPA is probably the best way to go. Frameworks like Hibernate and EclipseLink implement it. Depending on which framework you choose to underpin your JPA app, you can use proprietary features, but that will tie you to that framework pretty much for good. JPA is not hard to start using, but it can be very cryptic when it doesn't work, since it obscures the interaction with the database quite a bit (mind you, it does allow the option of using native SQL queries, but that kind of negates the reason why people say JPA-style DB access is good).
And yes, there are still people using JDBC with prepared statements. And normally there are practices/patterns that you will use when programming with plain old JDBC that act like a very, very minimalist ORM... or really, closer to MyBatis. Again, if you go this route, use prepared statements. They negate a number of dangers.
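The best known of those dangers is SQL injection. A small illustration of the difference, assuming a hypothetical users table with an account_name column:

```java
// Contrasts SQL built by concatenation with a parameterized statement.
final class SqlTextDemo {
    // Safe: the SQL text is fixed; the value is bound separately,
    // e.g. with PreparedStatement.setString(1, accountName).
    static final String PARAMETERIZED =
        "SELECT id FROM users WHERE account_name = ?";

    // Unsafe: the input becomes part of the SQL text itself.
    static String concatenated(String accountName) {
        return "SELECT id FROM users WHERE account_name = '" + accountName + "'";
    }
}
```

Passing the input x' OR '1'='1 to concatenated produces a WHERE clause that matches every row, whereas the parameterized form treats the same input as a single literal value.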
This is a religious kind of question, so given the way you wrote it, you will hear a lot of proselytizing. In fact, someone might shoot down your question for this. I think the only worse thing you could ask is whether Emacs or vi is better, in front of a crowd of Unix geeks.
Your question seems too generic. However, if you have a mix of direct SQL on Oracle and SQL in Java, it would be better to invest some time in an ORM like Hibernate or Apache Cayenne. An ORM is a separate design approach that segregates database operations from the Java side. All the DB interactions and DB design are implemented in the ORM, and all the access and business logic resides in Java. This is a suggestion; I'm still unclear about your actual problem, though.
The biggest concern I have with how it's done today is that mixing
Java code and SQL queries looks "out-of-place" when coupled as
tightly as it is (Queries hardcoded as part of source code)
This assumption of yours is not really "correct" or "incorrect"; there is not going to be a true/false answer to your question. The following question explains that there are several ways of dealing with mixing Java and SQL:
Java Programming - Where should SQL statements be stored?
It essentially distinguishes between SQL being:
Hardcoded in business objects
Embedded in SQLJ clauses
Encapsulated in separate classes e.g. Data Access Objects
Metadata driven (decouple the object schema from the data schema - describe the mappings between them in metadata)
Put into external files (e.g. Properties or Resource files)
Put into stored procedures
I'll add to that:
Embedded in CriteriaQuery statements
Embedded in jOOQ statements.
Apache Cayenne is one of the easiest ORMs to use. It comes with CayenneModeler, a tool for modeling data objects and their mappings. I would recommend Cayenne for a beginner in ORM. It can create mapping classes and sync with the DB through the modeler.
Basically what the title says. Going forward, we need to start supporting both database platforms (and will start writing migrations accordingly), but we need to do the initial "port" first.
Our DBAs are confident they can convert the schema, tables, data types, etc. but our developers have less confidence that the DAOs will "just work". Can someone point us towards some resources we can review? Ideally common pitfalls to avoid, specific tests to run, etc. We will of course run the full suite of database tests at the application layer, but want to do as much preparation as possible before then.
Pay attention to and test performance under load. Oracle does some things fundamentally differently than other database vendors. Tom Kyte's excellent book Expert Oracle Database Architecture points out several differences. A couple of highlights:
Oracle never locks data just to read it. Many other databases do.
A writer of data in Oracle never blocks a reader. A reader of data never blocks a writer. Again, many other vendors do.
Not paying attention to things like this can cause big headaches after a conversion when locking issues surface. This is not to imply a superiority of one product over another, rather it just means that what works well with one vendor's product may fail miserably in another, and custom approaches depending on the database may be required.
Ditto (although on a quite simple schema, have to say). "Just worked". Hibernate magic.
I had peace of mind because we had 100% test coverage for the DAO layer. So when the schema was recreated on MS SQL, and some table and column names were updated in the mapping (I don't remember why, but the DBAs asked for it, maybe a naming convention), we just ran our tests and found no failures.
P.S. I recalled one interesting detail: the functional tests were all OK. But when PTE started on the MS SQL database, we found that concurrent access to one particular table was several times slower than on Oracle due to lock propagation. We had to redesign that functionality.
I think the first step would be to get an empty MS SQL schema, set hibernate.hbm2ddl.auto=create and let Hibernate create the tables there. Then show this to your DBAs and ask if it makes sense.
Populating data is less of a problem; I'd guess queries would be more slippery (especially if you use raw JDBC in some places). You might also want to check query plans for commonly used queries and see if these make sense, too.
I'm working on a medium-sized project in Java (GWT to be precise), and I'm still in the process of deciding what ORM to use.
I just refuse to write SQL queries unless utterly and completely necessary (not the case :D)
I want to use ONLY annotations, no XML configuring [except database location, username, etc], and I DON'T want to create any tables or define them. I want this to be done by the framework completely.
Call me lazy, but I like Java/GWT programming, not creating tables and coping with that sort of thing, and it's a plus in my assignment if I actually use an ORM :D
I've considered so far:
Hibernate with annotations: I've found little documentation for getting started from the ground up with this, and few examples and the like. It's as if they didn't actually want you to use 100% annotations.
DataNucleus
JDO: It seems interesting. I'd never heard of DataNucleus until this week, but it seems extremely mature, and I actually discovered it because Google uses it in GWT, so that's a good sign. I also like the fact that they mention I don't need to define any tables or columns, though I think Hibernate can achieve this as well. I actually enjoyed reading through their documentation (though I haven't finished yet), quite the opposite of Hibernate's.
JPA: I'm not totally sure if DataNucleus/JPA can work with annotation-only configuration, though I might need to take a deeper look into the documentation.
As you might guess, I'm quite inclined to JDO... but it'd be nice to hear what people who've used it have to say vs the other alternatives, and if I'm missing some very important point here.
Edit 1: I know I'll need to XML the database location/usr/pwd, I meant I don't want to use an XML to configure the mapping or database schema.
JPA (1 and 2) is pretty much XML free, depending on how it's packaged. You most certainly don't need it for the schema. It also supports annotations for details when the tables are generated.
The only issue with these is that while they can create a database, they're a DB MAPPING tool, not a DB DEFINITION tool. Specifically, most won't allow you to create the arbitrary indexes that you may well need to get the DB tuned properly to your queries.
But other than that, JPA should fill your needs, and it has a lot of implementations (Hibernate is just one implementation).
This is self-promotion, but I've been working for a while on a simple Java ORM package called ORMLite. I wanted something much less complicated than Hibernate but without writing SQL directly. It's completely annotation based, can create (and destroy) tables, and currently supports MySQL, Postgres, Derby, and H2. Adding other databases would be simple if I had access to a server.
http://ormlite.com/
It has a pretty flexible QueryBuilder and table paging. Joining is, however, not supported.
I have a large tree of Java Objects in my Desktop Application and am trying to decide on the best way of persisting them as a file to the file system.
Some thoughts I've had were:
Roll my own serializer using DataOutputStream: This would give me the greatest control of what was in the file, but at the cost of micromanaging it.
Straight old Serialization using ObjectOutputStream and its various related classes: I'm not sold on it though since I find the data brittle. Changing any object's structure breaks the serialized instances of it. So I'm locked in to what seems to be a horrible versioning nightmare.
XML Serialization: It's not as brittle, but it's significantly slower than straight-out serialization. It can be transformed outside of my program.
JavaDB: I'd considered this since I'm comfortable writing JDBC applications. The difference here is that the database instance would only persist while the file was being opened or saved. It's not pretty but... it does lend itself to migrating to a central server architecture if the need arises later, and it introduces the possibility of querying the data model in a simpler way.
I'm curious to see what other people think. And I'm hoping that I've missed some obvious, and simpler approach than the ones above.
Here are some more options culled from the answers below:
An Object Database - Has significantly less infrastructure than ORM approaches and performs faster than an XML approach. thanks aku
I would go for your final option, JavaDB (Sun's distribution of Derby), and use an object relational layer like Hibernate or iBatis. Using the first three approaches means you are going to spend more time building a database engine than developing application features.
Have a look at Hibernate as a simpler way to interface to a database.
In my experience, you're probably better off using an embedded database. SQL, while less than perfect, is usually much easier than designing a file format that performs well and is reliable.
I haven't used JavaDB, but I've had good luck with H2 and SQLite. SQLite is a C library, which means a little more work in terms of deployment. However, it has the benefit of storing the entire database in a single, cross-platform file. Basically, it is a pre-packaged, generic file format. SQLite has been so useful that I've even started using it instead of text files in scripts.
Be careful using Hibernate if you're working with a small persistence problem. It adds a lot of complexity and library overhead. Hibernate is really nice if you're working with a large number of tables, but it will probably be cumbersome if you only need a few tables.
db4objects might be the best choice
XStream from codehaus.org
XML serialization/deserialization largely without coding.
You can use annotations to tweak it.
Working well in two projects where I work.
See my users group presentation at http://cjugaustralia.org/?p=61
I think it depends on what you need. Let's see the options:
1) Discarded immediately! I won't even justify it. :)
2) If you need simple, quick, one-method persistence, stick with it. It will persist the complete data graph as it is! Be aware of how long you'll be maintaining the persisted objects. As you pointed out yourself, versioning can be a problem.
3) Slower than (2), needs extra code and can be edited by the user. I would only use it if the data is supposed to be used by a client in another language.
4) If you need to query your data in any way, stick with the DB solution.
Well, I think you have already answered your question :)
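For option 2, a minimal sketch of the one-method round trip. TreeNode and TreeStore are hypothetical names, and byte arrays stand in for the file to keep the example self-contained. Declaring serialVersionUID explicitly is one common way to soften the versioning problem, since compatible changes such as adding a field then no longer invalidate previously saved data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type standing in for the application's object tree.
final class TreeNode implements Serializable {
    // A fixed UID keeps old data readable after compatible class changes.
    private static final long serialVersionUID = 1L;

    final String name;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) {
        this.name = name;
    }
}

final class TreeStore {
    // Serializes the whole graph reachable from root in one call.
    static byte[] save(TreeNode root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the graph from bytes produced by save.
    static TreeNode load(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (TreeNode) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Swapping the byte array streams for FileOutputStream and FileInputStream would give the on-disk version.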