We're going to write a new web interface for a big system based on an Oracle database. All business rules are already coded in PL/SQL stored procedures and we'd like to reuse as much code as possible. We'll write some new stored procedures that combine the existing business rules and return the final result dataset.
We want to do this at the database level to avoid Java-to-database round trips. The interface layer will be written in Java (we'd like to use GWT), so we need a way of passing data from Oracle stored procedures to the Java service side. The data can be, e.g., a set of properties of a specific item or a list of items fulfilling certain criteria.
Would anyone recommend a preferable way of doing this?
We're considering one of the following two scenarios:
1. passing objects and lists of objects (DB object types defined at the schema level)
2. passing a sys_refcursor
We verified that both approaches are "doable", the question is more about design decision, best practice, possible maintenance problems, flexibility, etc.
I'd appreciate any hints.
I would recommend sticking with a refcursor with well-defined keys (agreed on by both the Java and the PL/SQL developers). This is much easier to extend in the future: you can easily convert the refcursor to a HashMap, and then the HashMap to a POJO using Apache Commons BeanUtils if needed. I'm working on a big telecom project that uses many approaches to this issue, and the refcursor seems to be the best at the end of the day.
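A minimal sketch of the refcursor-to-HashMap-to-POJO path described above. JDBC exposes an open refcursor as a ResultSet; how it is obtained from the stored procedure call is driver-specific and left out here. Plain reflection stands in for Apache Commons BeanUtils, and CursorRows, Item and its properties are hypothetical names, not part of the original system.

```java
import java.lang.reflect.Method;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class CursorRows {
    // Copy each row of the cursor into a map keyed by lower-cased column label.
    static List<Map<String, Object>> toMaps(ResultSet rs) throws SQLException {
        ResultSetMetaData md = rs.getMetaData();
        List<Map<String, Object>> rows = new ArrayList<>();
        while (rs.next()) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (int i = 1; i <= md.getColumnCount(); i++) {
                row.put(md.getColumnLabel(i).toLowerCase(Locale.ROOT), rs.getObject(i));
            }
            rows.add(row);
        }
        return rows;
    }

    // Call every public one-argument setter whose property name matches a key
    // (ignoring case). Values must already have the setter's parameter type.
    static <T> T toBean(Map<String, Object> row, Class<T> type) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            for (Method m : type.getMethods()) {
                String name = m.getName();
                if (name.startsWith("set") && name.length() > 3 && m.getParameterCount() == 1) {
                    Object value = row.get(name.substring(3).toLowerCase(Locale.ROOT));
                    if (value != null) {
                        m.invoke(bean, value);
                    }
                }
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map row to " + type.getName(), e);
        }
    }
}

// Hypothetical POJO; its property names are the keys both teams agreed on.
class Item {
    private String name;
    private Integer quantity;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Integer getQuantity() { return quantity; }
    public void setQuantity(Integer quantity) { this.quantity = quantity; }
}
```

Adding a column to the cursor then only requires a new key and a new setter. A real mapper would also need type conversion, since the driver's value types (numeric columns in particular) may not match the setter's parameter type.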
In the past I have achieved exactly the same with a classic JDBC CallableStatement, without any performance or maintenance issues. With ORM solutions like Hibernate making persistence much more flexible, you can also wrap your solution around Hibernate, as achieved in this post. Also see this example if you are not already familiar with the way stored procedures and CallableStatement work.
It's been a while since I've done something like that, but the way I remember is that you need to define a view that calls your stored procedure, and you can then easily read the result sets from within java, with the OR-mapper of your choice.
So, this seems close to your scenario 1, which never caused any problems in my experience.
The one thing to be careful about is transaction handling: if your stored procedures write data, and you call several of them within a Java EE transaction, you might end up with inconsistent data.
So, I'm working on using Jooq to create a caching layer over Postgres. I've been using the MockConnection/MockDataProvider objects to intercept every query, and this is working, but I'm having a few issues.
First, how do I distinguish between reads and writes? That is, how do I tell whether a query is an insert/update/etc. or a select, given only the MockExecuteContext that's passed into the execute method in MockDataProvider?
And I'm a bit confused on how I can do invalidations. The basic scheme I'm implementing right now is that whenever a "write" query is made to a table, I invalidate all cached queries that involve that table. This goes back to my first question, on telling different types of queries from each other, but also brings up another issue: how would I identify the tables used in a query given only the sql string and the bindings (both are attributes of MockExecuteContext)?
Also, is this a correct approach at caching? My first thought was to override the fetch() method, but that method is final, and I'd rather not change something already embedded in Jooq itself. This is the only other way I could think of to intercept all requests made so I could create a separate, persistent caching layer.
I have seen this (https://groups.google.com/forum/#!topic/jooq-user/xSjrvnmcDHw) question, but I'm still not clear on how Lukas recommended to identify tables from the object. I can try to implement a Postgres NOTIFY, but I wanted something native in Jooq first. I've seen this issue (https://github.com/jOOQ/jOOQ/issues/2665) pop up a lot too, but I'm not sure how it applies.
Keep in mind that I'm new to Jooq, so it's quite possible that I'm missing something obvious.
Thanks!
The project I'm working on has a REST API written in JRuby/Java, with an endpoint that hits a MySQL database to retrieve a number of records.
We need to allow the client to filter those records using one or more columns, including boolean checks and range values.
The easiest way we can do this is to add a string parameter to the API, then add it into the SQL statement.
Collectively, the development team agree that this is a bad idea but the alternative is to provide an almost identical syntax for filtering, which is translated into SQL. The allure of the SQL injection parameter is strong.
So my question is, are there any circumstances under which this is a safe thing to do?
In particular, might we consider the WHERE clause safe to use if it has been fully parsed beforehand and identified as such? Or, at the very least, if it has been checked for certain trigger words such as DROP, SELECT etc.?
Also if anyone knows of a good library that could act as a go-between (translating or parsing an external expression into a WHERE clause) that would be great.
The OData and GData protocols already implement this functionality in a safe and standard way. You can find server and client implementations for both, for Ruby, PHP, MySQL etc. Check here for the OData libraries.
Leaving aside the SQL injection issue, you'll expose your inner implementation (both the chosen database, MySQL, and your table structure) directly in the form of your API.
E.g. if you change to some NoSQL-type implementation at the backend, your public-facing API will break immediately. The same happens if you restructure your database. I wouldn't do this even in an environment in which I wasn't worried about the probability or severity of injection attacks.
Besides the security implications, allowing an arbitrary WHERE clause is a bad idea because it takes the "I" out of "API": it's not an interface. The API is supposed to free the user of the need to know details of the implementation, like table and column names.
If clients are interacting with your data by constructing their own WHERE clauses, then you can never change the database. There might be code out there with those statements programmed in. If a bug or new feature required you to alter the DB in a way that would break existing client interactions, you'd be stuck. The API should provide the filtering capability and translate requests into calls to the backend in a way that lets you change the backend without breaking the API.
There are numerous ORMs for this purpose, especially in Ruby (ActiveRecord, Sequel).
The most basic thing you need to do is escape the string input, which will pretty much prevent SQL injection if you do it properly.
It also helps not to insert parameters directly into the SQL statement if you don't have to. Instead, check their validity and then map them to logical ones (this isn't always possible). For example, if there is an HTML dropdown list, and submitting the form passes some parameter 'firstitem', map 'firstitem' to an id or similar that you will then use to search on, rather than using the user-supplied version (assuming this mapping doesn't involve the DB).
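One way to sketch the "check validity, then map" idea for the filtering API: the client sends logical field and operator names, the server maps them through whitelists to real column names and SQL operators, and values only ever travel as bind parameters. The class name, field names and column names below are hypothetical and only illustrate the shape of such a translator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class FilterTranslator {
    // Hypothetical whitelist: public API field name -> real column name.
    private static final Map<String, String> COLUMNS =
            Map.of("price", "unit_price", "active", "is_active");
    // Hypothetical whitelist: API operator name -> SQL operator.
    private static final Map<String, String> OPERATORS =
            Map.of("eq", "=", "lt", "<", "le", "<=", "gt", ">", "ge", ">=");

    // One filter condition as it arrives from the API client.
    static final class Filter {
        final String field;
        final String op;
        final Object value;

        Filter(String field, String op, Object value) {
            this.field = field;
            this.op = op;
            this.value = value;
        }
    }

    // A WHERE clause body plus the values to bind to its placeholders, in order.
    static final class Where {
        final String sql;
        final List<Object> params;

        Where(String sql, List<Object> params) {
            this.sql = sql;
            this.params = params;
        }
    }

    // Anything not found in the whitelists is rejected; client text never
    // becomes part of the SQL string.
    static Where translate(List<Filter> filters) {
        List<String> conditions = new ArrayList<>();
        List<Object> params = new ArrayList<>();
        for (Filter f : filters) {
            String column = COLUMNS.get(f.field);
            String operator = OPERATORS.get(f.op);
            if (column == null || operator == null) {
                throw new IllegalArgumentException("Unsupported filter: " + f.field + " " + f.op);
            }
            conditions.add(column + " " + operator + " ?");
            params.add(f.value);
        }
        return new Where(String.join(" AND ", conditions), params);
    }
}
```

The resulting clause would be appended to a fixed SELECT and executed through a PreparedStatement, binding each entry of params in order. Because clients only see the logical names, the columns behind them can be renamed without breaking the API.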
I am a fan of ORM (Object Relational Mapping) and I have been using it with Rails for the past year and a half. Prior to that, I used to write raw queries using JDBC and make the database do the heavy lifting via stored procedures. With ORM, I was initially happy to do stuff like coach.manager and manager.coaches, which was very simple and easy to read.
But as time went by, innumerable associations crept in and I ended up doing a.b.c.d, which fired queries in all directions behind the scenes. With Rails and Ruby, the garbage collector went nuts and it took an insane amount of time to load a very complex page that involves relatively little data. I had to replace this ORM-style code with a simple stored procedure, and the improvement was enormous. A page that took 50 seconds to load now takes only 2 seconds.
With this huge difference, should I continue using ORM? It is very clear that it has severe overheads compared to a raw query.
In general, what are the pitfalls of using an ORM framework like Hibernate or ActiveRecord?
An ORM is only a tool. If you don't use it correctly, you'll have bad results.
Nothing stops you from using dedicated HQL/criteria queries, with fetch joins or projections, to return the information that your page must display in as few queries as possible. This will take more or less the same time as dedicated SQL queries.
But of course, if you just get everything by ID and navigate through your objects without realizing how many queries it generates, it will lead to long loading times. The key is to know exactly what the ORM does behind the scenes, and to decide whether it's appropriate or whether another strategy must be adopted.
I think you've already identified the major tradeoff associated with ORM software. Every time you add a new layer of abstraction that tries to provide a generalized implementation of something that you used to do by hand there is going to be some loss of performance/efficiency.
As you noted, traversing multiple relationships such as a.b.c.d can be inefficient, because most ORM software will be doing an independent database query for each . along the way. But I'm not sure that means you should eliminate ORM altogether. Most ORM solutions (or at least, certainly Hibernate) allow you to specify custom queries where you can bring back exactly what you want in a single database operation. This should be about as fast as your dedicated SQL.
Really the issue is about understanding how the ORM layer is working behind the scenes, and realizing that while something like a.b.c.d is simple to write, what it causes the ORM layer to do as it is evaluated is not. As a general rule I always go with the simplest possible approach to begin, and then write optimized queries in areas where it makes sense/where it is obvious that the simple approach will not scale.
I'd say, one should use the appropriate tool for different tasks.
E.g., for CRUD operations, ORM frameworks like Hibernate can speed up development and will perform well enough. Sometimes you need to do some tweaking to achieve acceptable performance. I'm not convinced that your task (the one that took 50 seconds) could not be done properly with an ORM, but you did not provide us with the details.
On the other hand, bulk operations involving hundreds of thousands of records, for example, are not the type of task you'd expect Hibernate to do without a significant performance penalty.
As was mentioned already, an ORM is only a tool, and you can use it either well or badly.
One of the most typical performance problems in ORMs is the 1+N queries problem. It is caused by loading additional objects for each object in a list, typically the entities of a 1-to-n relation that are fetched separately for every element of the list. The ways of dealing with it are using HQL queries, specifying only the needed fields in a projection, or marking the fetching of 1-to-n relations as lazy.
You must always know exactly what the ORM is doing in order to achieve good performance. Not understanding which operations are done in the background is a road to disaster (slow, buggy and hard-to-analyze code because of unnecessary and wrongly written workarounds).
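The 1+N pattern can be illustrated without any ORM at all. The sketch below uses a toy in-memory stand-in for the database that merely counts queries; FakeDb and the manager/coach data are made up for the illustration, and the comments indicate which kind of SQL each method would correspond to.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class QueryCountDemo {
    // Toy stand-in for a database that counts how many queries it receives.
    static final class FakeDb {
        int queries = 0;
        private final Map<String, List<String>> coachesByManager = new LinkedHashMap<>();

        FakeDb() {
            coachesByManager.put("manager1", List.of("coachA", "coachB"));
            coachesByManager.put("manager2", List.of("coachC"));
            coachesByManager.put("manager3", List.of());
        }

        // Roughly: SELECT ... FROM manager
        List<String> findManagers() {
            queries++;
            return new ArrayList<>(coachesByManager.keySet());
        }

        // Roughly: SELECT ... FROM coach WHERE manager_id = ?
        List<String> findCoaches(String manager) {
            queries++;
            return coachesByManager.get(manager);
        }

        // Roughly: SELECT ... FROM manager LEFT JOIN coach ..., which is what
        // an HQL query with "left join fetch" is meant to produce.
        Map<String, List<String>> findManagersWithCoaches() {
            queries++;
            return new LinkedHashMap<>(coachesByManager);
        }
    }

    // Navigating manager.coaches one manager at a time: 1 query + N more.
    static int countCoachesOneByOne(FakeDb db) {
        int total = 0;
        for (String manager : db.findManagers()) {
            total += db.findCoaches(manager).size();
        }
        return total;
    }

    // Loading the association together with the managers: a single query.
    static int countCoachesWithFetchJoin(FakeDb db) {
        int total = 0;
        for (List<String> coaches : db.findManagersWithCoaches().values()) {
            total += coaches.size();
        }
        return total;
    }
}
```

With three managers the first method issues four queries and the second issues one; with thousands of rows in an HTML table the gap grows linearly.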
I'm with Petar from your comments regarding the lazy fetching. Say you have an HTML table filled with fields from object a.b.c.d. You could find your framework round-tripping to the database thousands of times (possibly many more). The disadvantage of an ORM in this case is that you have to read the documentation thoroughly. Most frameworks support disabling lazy fetching, and many even support adding your own processing logic to bind the data set.
The net out is that almost any ORM is almost undoubtedly better than anything you are going to write yourself. You will find yourself saddled with maintaining huge libraries of boilerplate or worse writing the same code over and over again.
We are currently investigating a switch from our own data store layer, with a clean separation of transfer objects and data access objects, to JPA. We used a generator to create the TOs, the DAOs and the SQL DDL from documentation in DocBook format. This way the documentation, the database structure and the generated Java classes were always in sync, and the database itself was well documented.
What we discovered so far by using JPA:
- Foreign key references cannot be used for imports, some special queries and so on, because they must not be placed in a managed entity. JPA only allows the target class there.
- Access to some user session scope is difficult, if not impossible. We still have no clue how to get the user's id into the column 'userWhoLastMadeAnUpdate' in some PrePersist method.
- Something expected to be quite easy with an ORM, namely "class mapping", does not work at all. We are using HalDateTime (http://sourceforge.net/projects/haldatetime/) internally, especially in the client. Mapping it with JPA directly is not possible although HalDateTime supports it. Due to JPA restrictions we have to use two fields in the entity.
- JPA can use an XML file to describe the mapping. In that case you have to look into at least two files to even understand the relationship between the Java class and the database. And the XML file becomes huge for large applications.
- Alternatively, ORMs provide annotations in the Java class itself, so it's easier to learn and understand the relationship. But it forces you to see all that database stuff in the client layer (which completely breaks a proper layering).
- You will have to restrict yourself to staying as close to a clean database structure as possible. Otherwise you will for sure end up with a mess of queries and statements generated by the ORM.
- Use an ORM which provides a query language that is close to SQL itself (JPA seems quite acceptable here). An ORM-specific language makes supporting a large application really expensive.
I just started working on upgrading a small component in a distributed Java application. The main application is a rather complicated applet/servlet combo running on JBoss, and it uses Hibernate extensively for its data access. The component I am working on, however, is a very straightforward data importing service.
Basically the workflow is
1. Listen for a network event
2. Parse the data packet, extract a set of identifiers
3. Map the identifier set to a primary key in our database
4. Parse the rest of the packet and insert items in a related table using the foreign key found in step 3
5. Repeat
The previous version of this component used a Hibernate-based DAL that is no longer usable for a variety of reasons (in particular, it is EOL), so I am in charge of replacing the data access layer for this component.
So on the one hand I think I should use Hibernate because that's what the rest of the application does, but on the other I think I should just use the regular java.sql.* classes because my requirements are really straightforward and aren't expected to change any time soon.
So my question is (and I understand it is subjective): at what point do you think that the added complexity of using an ORM tool (in terms of configuration, dependencies...) is worth it?
UPDATE
Due to the way the data access layer for the main application was written (weird dependencies), I cannot easily use it; I would have to implement it myself.
Consider why the Spring-Hibernate combination is used.
With plain JDBC, even a simple operation takes many steps, like getting a connection, creating a statement and handling the result set. All of these steps also need a lot of exception handling.
But with Spring and Hibernate you only have to write this:
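    public PostProfiles findPostProfilesById(long id) {
        // HibernateTemplate opens the session, runs the query and translates exceptions.
        List<?> list = getHibernateTemplate().find("from PostProfiles where id=?", id);
        // Return null instead of failing with IndexOutOfBoundsException when nothing matches.
        return list.isEmpty() ? null : (PostProfiles) list.get(0);
    }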
Everything else is taken care of by the framework. I hope this solves your dilemma.
I think the answer really depends on your skill set. It would probably take a similar amount of time to craft a simple solution involving a handful of tables either way (Hibernate or raw JDBC) if you are comfortable with both techniques.
As I am pretty comfortable with Hibernate, I'd just choose it, as I prefer working at a higher level and not worrying about things that Hibernate handles for me. Yes, it has its own glitches, but especially for simple data models it does the job, and does it well.
The only reasons why I would choose plain JDBC would be:
- uber-complicated, maximally optimized SQL that is performance critical;
- Hibernate being stupid and not being able to express what I want.
And especially if you say you are already managing other entities with Hibernate, why not keep your code in the same style everywhere?
I think you are better off using the JDBC API. From what you describe, the two operations (select the foreign key from one table, insert into table_2) can easily be executed with a simple stored procedure call.
The advantage of using this technique is that you can manage transactions/exceptions within your stored procedure call.
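For comparison, here is a rough sketch of the same two operations done directly with java.sql classes instead of a stored procedure, with the transaction handled on the Java side. The table names, column names and the PacketImporter class are hypothetical; the connection is assumed to come from whatever pool or DataSource the application already has.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

final class PacketImporter {
    // Hypothetical table and column names.
    private static final String FIND_SQL =
            "SELECT id FROM device WHERE identifier = ?";
    private static final String INSERT_SQL =
            "INSERT INTO device_item (device_id, payload) VALUES (?, ?)";

    // Looks up the primary key for an identifier, then inserts the items under
    // that key, all in one transaction. Returns the key that was used.
    static long importPacket(Connection con, String identifier, List<String> items)
            throws SQLException {
        con.setAutoCommit(false);
        try {
            long deviceId;
            try (PreparedStatement find = con.prepareStatement(FIND_SQL)) {
                find.setString(1, identifier);
                try (ResultSet rs = find.executeQuery()) {
                    if (!rs.next()) {
                        throw new SQLException("Unknown identifier: " + identifier);
                    }
                    deviceId = rs.getLong(1);
                }
            }
            try (PreparedStatement insert = con.prepareStatement(INSERT_SQL)) {
                for (String item : items) {
                    insert.setLong(1, deviceId);
                    insert.setString(2, item);
                    insert.addBatch();
                }
                // Send all inserts to the database in one batch.
                insert.executeBatch();
            }
            con.commit();
            return deviceId;
        } catch (SQLException e) {
            // Undo the partial import before reporting the failure.
            con.rollback();
            throw e;
        }
    }
}
```

Try-with-resources removes most of the cleanup boilerplate that older JDBC code needed, which narrows the gap to the framework-based version for a component this small.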
I'm trying to write a program with Hibernate. My domain is now complete and I'm writing the database.
I got confused about what to do. Should I
- define my SQL tables as classes and let Hibernate create them, or
- create the tables in the database, reverse engineer them and let Hibernate make my classes?
I heard the first option from someone and read about the second option on the NetBeans site.
Does anyone know which approach is correct?
It depends on how you best conceptualize the program you are writing. When I am designing my system I usually think in terms of entities and their relationships to each other, so I start with my business objects, then write my Hibernate mappings and let Hibernate create the database.
Other people think better in terms of database tables, in which case that approach is best for them. So you have to decide which one works for you based on your experience.
I believe you can do either, so it's down to preference.
Personally, I write the lot by hand. While Hibernate does a reasonable job of creating a database for you it doesn't do it as well as I can do myself. I'd assume the same goes for the Java classes it produces although I've never used that feature.
With regards to the generated classes (if you went the class generation route) I'm betting every field has a getter/setter whether fields should be read only or not (did somebody say thread safety and mutability) and that you can't add behavior because it gets overridden if you regenerate the classes.
Definitely write the Java objects first, then add the persistence and let Hibernate generate the tables.
If you go the other way you lose the benefit of OOD and all that good stuff.
I'm in favor of writing Java first. It can be a personal preference though.
If you analyse your domain, you will probably find that there is some duplication.
For example, the audit columns (user creator and editor, time created and edited) are often common to most tables.
The id is often a common field.
Look at your domain to see your duplication.
The duplication is an opportunity to reuse.
You could use inheritance, or composition.
Advantages:
- less time: you will have far fewer things to write
- logical: the same logical field is written once (otherwise it would be many similar fields)
- reuse: in the client code for your entities, you can write reusable code. For example, if all your entities have the same id field called ident because of their superclass, client code can make the generic call object.getIdent() without having to find out the exact class of the object, so it will be more reusable.
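A small sketch of the inheritance variant, written as plain Java so it stands on its own. The entity names and fields are invented for illustration; with JPA or Hibernate annotations the base class would typically be marked @MappedSuperclass, which is omitted here to keep the example free of dependencies.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Fields shared by every entity (id and audit columns) are written once here.
abstract class BaseEntity {
    private Long ident;
    private String createdBy;
    private Instant createdAt;

    public Long getIdent() { return ident; }
    public void setIdent(Long ident) { this.ident = ident; }
    public String getCreatedBy() { return createdBy; }
    public void setCreatedBy(String createdBy) { this.createdBy = createdBy; }
    public Instant getCreatedAt() { return createdAt; }
    public void setCreatedAt(Instant createdAt) { this.createdAt = createdAt; }
}

// Hypothetical entities: each one adds only its own fields.
class Customer extends BaseEntity {
    private String name;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class Invoice extends BaseEntity {
    private long amountInCents;

    public long getAmountInCents() { return amountInCents; }
    public void setAmountInCents(long amountInCents) { this.amountInCents = amountInCents; }
}

final class Entities {
    // Generic client code: works for any entity without knowing its exact class.
    static List<Long> idents(List<? extends BaseEntity> entities) {
        List<Long> result = new ArrayList<>();
        for (BaseEntity entity : entities) {
            result.add(entity.getIdent());
        }
        return result;
    }
}
```

Composition would look similar, with the audit fields moved into a separate object that each entity holds instead of inherits.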