Design Problem - Generating SQL Queries for business calculations

We have an application where the user is allowed to enter expressions for performing calculations on the fields of a database table. The calculations allow various types of functions (math, logic, string, date, etc.), for example MAX(col1, col2, col3).
Note that these expressions can get complex through nested functions. For example:
IF(LENGTH(StringColumn)=0, MAX(col1, col2, 32), MIN(col1, col2, col3)) > LENGTH(col2)
One way we have implemented this is to use a JavaCC parser to parse the user-entered expressions and generate a tree data structure. The tree is then traversed in Java, and SQL is generated for each of the functions used in the expressions. Finally, once the query has been generated for a user-entered expression, Java executes it using a simple database call.
A major problem with this framework is that database issues have to be handled in Java. By database issues I mean database limitations or performance optimizations. One limitation of Microsoft SQL Server is that only 10 nested CASE WHEN statements are allowed. This means that, while parsing, the Java code needs to estimate how many CASE WHENs the query string will have before it is translated.
Similarly, if there are any SQL performance optimizations to be done, handling them in Java is simply not logical.
Does anyone know about any better design approaches for this problem?
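To make the setup above concrete, here is a minimal sketch of a tree that renders itself as SQL and reports how deeply its CASE expressions are nested, so the limit can be checked in one place before the query is sent. All class names (Expr, Col, Num, Call, Compare, IfExpr, SqlGenerator) are hypothetical and not taken from the original application, and the nesting limit is passed in as a setting rather than hard-coded.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical node types; a JavaCC grammar would produce its own tree classes.
interface Expr {
    String toSql();   // render this node as SQL text
    int caseDepth();  // deepest chain of nested CASE expressions at or below this node
}

final class Col implements Expr {
    private final String name;
    Col(String name) { this.name = name; }
    public String toSql() { return name; }
    public int caseDepth() { return 0; }
}

final class Num implements Expr {
    private final long value;
    Num(long value) { this.value = value; }
    public String toSql() { return Long.toString(value); }
    public int caseDepth() { return 0; }
}

// A function call such as LENGTH(col); rendered as NAME(arg, arg, ...).
final class Call implements Expr {
    private final String function;
    private final List<Expr> args;
    Call(String function, Expr... args) {
        this.function = function;
        this.args = Arrays.asList(args);
    }
    public String toSql() {
        return function + "(" + args.stream().map(Expr::toSql).collect(Collectors.joining(", ")) + ")";
    }
    public int caseDepth() {
        return args.stream().mapToInt(Expr::caseDepth).max().orElse(0);
    }
}

final class Compare implements Expr {
    private final Expr left;
    private final String op;
    private final Expr right;
    Compare(Expr left, String op, Expr right) { this.left = left; this.op = op; this.right = right; }
    public String toSql() { return left.toSql() + " " + op + " " + right.toSql(); }
    public int caseDepth() { return Math.max(left.caseDepth(), right.caseDepth()); }
}

// IF(cond, a, b) is rendered as a CASE expression and adds one nesting level.
final class IfExpr implements Expr {
    private final Expr condition;
    private final Expr whenTrue;
    private final Expr whenFalse;
    IfExpr(Expr condition, Expr whenTrue, Expr whenFalse) {
        this.condition = condition; this.whenTrue = whenTrue; this.whenFalse = whenFalse;
    }
    public String toSql() {
        return "CASE WHEN " + condition.toSql() + " THEN " + whenTrue.toSql()
             + " ELSE " + whenFalse.toSql() + " END";
    }
    public int caseDepth() {
        int deepestChild = Math.max(condition.caseDepth(),
                Math.max(whenTrue.caseDepth(), whenFalse.caseDepth()));
        return 1 + deepestChild;
    }
}

final class SqlGenerator {
    private final int maxCaseNesting; // assumed per-database setting, e.g. 10 for SQL Server
    SqlGenerator(int maxCaseNesting) { this.maxCaseNesting = maxCaseNesting; }

    String generate(Expr root) {
        int depth = root.caseDepth();
        if (depth > maxCaseNesting) {
            throw new IllegalArgumentException(
                    "CASE nesting " + depth + " exceeds limit " + maxCaseNesting);
        }
        return root.toSql();
    }
}
```

Keeping the limit in the generator rather than in the parser means the tree itself stays database-neutral; only the last step knows about the target database.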

Rather than reimplement a very SQL-like language that gets translated to SQL, have your users query the database with SQL.

I would look into Hibernate and its HQL query language.
In response to the poster above, I think it would be a bad idea to let your users query the database with SQL directly, as you'd be opening yourself up to SQL injection attacks.

Some time ago I wrote a Java applet with dynamic filter routines; there I translate the SQL statements to JavaScript statements and execute them with JavaScript's exec function.

You could have a look at the JPA 2.0 Criteria API or the Hibernate Criteria API.
JPA 2.0 provides the so-called Criteria API (http://stackoverflow.com/questions/2602757/creating-queries-using-criteria-api-jpa-2-0)
Hibernate has its own Criteria API (even before JPA 2.0), but it is different from the JPA 2.0 Criteria API. (http://www.ibm.com/developerworks/java/library/j-typesafejpa/)
The aim of both Criteria APIs is to provide a way to create SQL queries at runtime in a more pleasant way than concatenating strings. (http://docs.jboss.org/hibernate/core/3.3/reference/en/html/querycriteria.html)
(The JPA 2.0 Criteria API has an extra feature: it provides some kind of code generation that makes it possible to write queries in a compile-time safe way. (http://docs.jboss.org/hibernate/core/3.3/reference/en/html/querycriteria.html))

Another approach I could think of is to look for language recognizers supported by the database (which is Oracle in my case). If the database supports a framework similar to what we currently use in Java (i.e. JavaCC), then the intermediate string could be parsed and translated into a SQL query inside the database.
The intermediate string I refer to here is similar to the user-entered string but may not be exactly the same (e.g. column names could be transformed to the actual physical column names).
Any thoughts (pros and cons) about this approach? Also, any suggestions on language recognizers in Oracle would be highly appreciated.
Thank you.

Related

Parse SQL and Evaluate an Expression

I have a SQL query that I would like to parse and evaluate. I have parsed the SQL using JSqlParser. Now I need to evaluate the WHERE clause of the SQL. I would like to do it in Flink as part of the filter function, basically stream.filter(Predicate<Row>). The Predicate<Row> is what I need to get from evaluating the SQL's WHERE clause, so that I can apply it to each streaming record.
Ex: SELECT COLUMN FROM TABLE WHERE (ac IS NOT NULL AND ac = 'On')
I would like to parse the above query and given a streaming record with say ac = on, I would like to run the above expression evaluation on that record.
Any thoughts on how I can do it?
I would like to try expression evaluation with a depth-first traversal (DFS), but I am confused about how to go about it. Any help is appreciated!

If the SQL query is known at compile time, it's more straightforward to do this by integrating Flink SQL (via the Table API) into your DataStream application. See the docs for more info and examples.
The overall approach would be to convert your DataStream into a dynamic Table (which can be done automatically if the stream is a convenient type, such as a POJO), apply the SQL query to it, and then (if necessary) convert the resulting Table back to a DataStream.
Or maybe just implement the entire application with the Table API if you don't need any of the features that are unique to DataStreams.
On the other hand, if the query is dynamic and isn't provided until runtime, you'll need to pursue something like what you've proposed. FWIW, others with similar requirements have used dynamic languages with JVM-based runtimes, such as JavaScript via Rhino, or Groovy. The overall approach is to use a BroadcastProcessFunction, with the dynamic code being broadcast into the operator.
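For the dynamic case, the depth-first evaluation mentioned in the question could look roughly like the sketch below. It deliberately uses neither JSqlParser nor Flink classes: Cond, IsNotNull, EqualsTo and AndCond are hypothetical node types that the parsed WHERE clause would first have to be mapped onto, and a record is assumed to be available as a Map from column name to value.

```java
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical condition tree; the parser's own expression classes would be mapped onto it.
interface Cond {
    Predicate<Map<String, Object>> compile();
}

final class IsNotNull implements Cond {
    private final String column;
    IsNotNull(String column) { this.column = column; }
    public Predicate<Map<String, Object>> compile() {
        return row -> row.get(column) != null;
    }
}

final class EqualsTo implements Cond {
    private final String column;
    private final Object expected;
    EqualsTo(String column, Object expected) { this.column = column; this.expected = expected; }
    public Predicate<Map<String, Object>> compile() {
        // Exact, case-sensitive comparison; a missing or null column value never matches.
        return row -> expected.equals(row.get(column));
    }
}

final class AndCond implements Cond {
    private final Cond left;
    private final Cond right;
    AndCond(Cond left, Cond right) { this.left = left; this.right = right; }
    public Predicate<Map<String, Object>> compile() {
        // Children are compiled first (depth-first), then combined into one predicate.
        return left.compile().and(right.compile());
    }
}
```

The tree is walked once, when the query arrives, and the resulting predicate is then applied to every record. Note that 'On' and 'on' are treated as different values here, whereas a database might compare them case-insensitively depending on its collation.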

Single line select using string builder or Stored Procedure

I have a lot of single-line select queries in my application, with multiple joins spanning 5-6 tables. These queries are generated using StringBuilders, based on many conditions that depend on input from a form etc. However, my team lead, who happens to be a SQL developer, has asked me to convert those single-line queries to stored procedures.
Is there any advantage in moving the single-line select queries to the backend and performing all the if/else logic there as stored procedures?

One advantage of having all your SQL in stored procedures is that you keep your queries in one place, the database, so it would be a lot easier to change or modify them without making a lot of changes in the application or front-end layer.
Besides, DBAs or SQL developers could fine-tune the SQL if it is stored in database procedures. You could keep all your functions and stored procedures in a package, which would be better in terms of performance and for organizing your objects (similar to creating packages in Java). And of course, with packages you could restrict direct access to their objects.
This is more a matter of team or department policy on where to keep the SQL, whether in the front end or in the database itself, and of course, as @Gimby mentioned, many people could have different views.
Update 1
If you have a select statement which returns something, use a function; if you have INSERT/UPDATE/DELETE or similar work such as sending emails or other business rules, use a procedure. Call these from the front end by passing parameters.

I'm afraid that is a question that will result in many different answers based on many different personal opinions.
It's business logic you are talking about here in any case; in -my- opinion that belongs in the application layer. But I know a whole club of Oracle devs who wholeheartedly disagree with me.

If you use PreparedStatement in Java, then there is no big difference in performance between Java queries and stored procedures. (If you use Statement in Java, then you have a problem.)
But a stored procedure is a good way to organize and reuse your SQL code. You can group them in packages, you can change them without recompiling the Java code, and your DBA or SQL specialist can tune them.
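If the queries stay in Java, the conditional parts can still be assembled with a StringBuilder while every user-supplied value goes through a PreparedStatement placeholder. The following is only a sketch under that assumption; DynamicQuery and whereIfPresent are hypothetical names, and the table and column names in the usage note are made up.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects optional conditions and their values separately,
// so values are bound as parameters and never concatenated into the SQL text.
final class DynamicQuery {
    private final StringBuilder sql;
    private final List<Object> params = new ArrayList<>();
    private boolean hasWhere = false;

    DynamicQuery(String baseSelect) {
        this.sql = new StringBuilder(baseSelect);
    }

    // Adds "condition" (containing one '?') only when the form supplied a value.
    DynamicQuery whereIfPresent(String condition, Object value) {
        if (value == null) {
            return this;
        }
        sql.append(hasWhere ? " AND " : " WHERE ").append(condition);
        params.add(value);
        hasWhere = true;
        return this;
    }

    String sql() {
        return sql.toString();
    }

    List<Object> params() {
        return params;
    }

    // Binds the collected values in order; the caller is responsible for closing the statement.
    PreparedStatement prepare(Connection connection) throws SQLException {
        PreparedStatement ps = connection.prepareStatement(sql.toString());
        for (int i = 0; i < params.size(); i++) {
            ps.setObject(i + 1, params.get(i));
        }
        return ps;
    }
}
```

For example, new DynamicQuery("SELECT o.id FROM orders o").whereIfPresent("o.status = ?", status).whereIfPresent("o.total > ?", minTotal) adds a condition only for the fields the user actually filled in.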

Java MS SQL -> mySQL conversion

I am building an application at work and need some advice. I have a somewhat unique problem in which I need to gather data housed in a MS SQL Server, and transplant it to a mySQL Server every 15 mins.
I have done this previously in C# with a DataGrid, but now I am trying to build a Java version that I can run on an Ubuntu server, and I cannot find a similar model for Java.
Just to give a little background
When I pull the data from the MS SQL Server, it always has 9 columns, but could have anywhere from 0 - 1000 rows.
Before inserting into the mySQL Server blindly, I do manipulate some of the data.
I convert a time column to CST based on a STATE column
I strip some characters to prevent SQL injection
I tried using the ResultSet, but I am having issues with the "forward only result sets" rules.
What would be the best data structure to hold that information, manipulate it, and then parse it to insert later into mySQL?

This sounds like a job for PreparedStatements!
Defined here: http://download.oracle.com/javase/6/docs/api/java/sql/PreparedStatement.html
Quick example: http://download.oracle.com/javase/tutorial/jdbc/basics/prepared.html
PreparedStatements allow you to batch up sets of data before pushing them into the target database. They also let you use the PreparedStatement.setString method, which handles escaping characters for you.
For the time conversion thing, I would retrieve the STATE value from the row and then retrieve the time value. Before calling PreparedStatement.setDate, convert the time to CST if necessary.
I don't think that you would need all the overhead that an ORM tool requires.
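A rough sketch of that approach follows: rows are read into a plain list of small objects (which sidesteps the forward-only ResultSet), the time is converted per row, and the inserts are batched. Everything specific here is an assumption for illustration: the SourceRow and RowCopier classes, the state-to-zone table, and the target_table, state and event_time names. The sketch also uses setTimestamp rather than setDate, on the assumption that the time of day has to be preserved.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for one source row; only two of the nine columns are shown.
final class SourceRow {
    final String state;
    final LocalDateTime localTime;
    SourceRow(String state, LocalDateTime localTime) {
        this.state = state;
        this.localTime = localTime;
    }
}

final class RowCopier {
    private static final ZoneId CENTRAL = ZoneId.of("America/Chicago");

    // Assumed mapping from STATE to time zone; a real table would cover every state.
    private static final Map<String, ZoneId> ZONE_BY_STATE = new HashMap<>();
    static {
        ZONE_BY_STATE.put("NY", ZoneId.of("America/New_York"));
        ZONE_BY_STATE.put("CA", ZoneId.of("America/Los_Angeles"));
        ZONE_BY_STATE.put("TX", CENTRAL);
    }

    // Interprets localTime in the state's zone and returns the same instant in Central time.
    // Unknown states are assumed to be in Central time already.
    static LocalDateTime toCentral(String state, LocalDateTime localTime) {
        ZoneId source = ZONE_BY_STATE.getOrDefault(state, CENTRAL);
        return localTime.atZone(source).withZoneSameInstant(CENTRAL).toLocalDateTime();
    }

    // Inserts all rows in one batch; values are bound as parameters, so no manual escaping.
    static void insertAll(Connection target, List<SourceRow> rows) throws SQLException {
        String sql = "INSERT INTO target_table (state, event_time) VALUES (?, ?)";
        try (PreparedStatement ps = target.prepareStatement(sql)) {
            for (SourceRow row : rows) {
                ps.setString(1, row.state);
                ps.setTimestamp(2, Timestamp.valueOf(toCentral(row.state, row.localTime)));
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

With at most around 1000 rows of nine columns, holding the whole result in a List in memory between the read and the write should not be a concern.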

You could consider using an ORM technology like Hibernate. This might seem a little heavyweight at first, but it means you can maintain the various table mappings for various databases with ease as well as having the power of Java's RegEx lib for any manipulation requirements.
So you'd have a Java class that represents the source table (with its Hibernate mapping) and another Java class that represents the target table and lastly a conversion utility class that does any manipulation of that data. Hibernate takes care of the CRUD SQL for you, so no need to worry about Database specific SQL (as long as you get the mapping correct).
It also lessens the SQL injection problem.

Is HibernateCallback best for executing SQL/procedures?

I'm working on a web-based application that belongs to an automobile manufacturer, developed in Spring and Hibernate with an MS SQL Server 2005 database.
There are three kinds of use cases:
1) Through this application, end users can request the creation of a Car, Bus, Truck, etc. through web-based interfaces. When a user logs in, an HTML form is displayed for capturing the technical specification of the vehicle; for example, if someone wants to request a Car, he can specify the Engine Make/Model, Tire, Chassis details, etc. and submit the form. I'm using Hibernate here for persistence, i.e. I have a Car entity that gets saved in the DB for each such request.
2) This part of the application deals with the generation of reports. These reports mainly deal with the number of requests received in a day and the summary. Some of the reports calculate the turnaround time for individual Create vehicle requests.
I'm using plain JDBC calls with PreparedStatement (if the report can be generated with SQL), CallableStatement (if the report is complex enough to need a DB procedure/function to fetch all details), and HibernateCallback to execute the SQL/procedures and display the information on screen.
3) Search: This part of the application allows end users to search for various request data, e.g. how many vehicles have been requested in a year. I'm using a DB procedure with CallableStatement, once again executing these procedures within HibernateCallback, then populating and returning the search result to the GUI in a POJO.
I'm using native SQL in (2) and (3) above because, for reporting and search purposes, the data structure to display on screen does not match any of my entities. For example, the Car entity has more than 100 attributes, but for reporting purposes I don't need more than 10 of them, so I thought loading all 100 attributes does not make any sense; why not use plain SQL and retrieve just the data needed for display on screen?
Similarly for Search, I had to write procedures/functions because the search algorithm is not straightforward and Hibernate has no way to write a stored-procedure kind of thing.
This is working fine for the prototype; however, I would like to know:
a. Whether my approach of using native SQL and DB procedures is fine for cases 2 and 3, based on my judgement.
b. Also, whether executing SQL in HibernateCallback is the correct approach.
Need expert help.

I would like to know (...) if my approach for using native SQLs and DB procedures are fine for case 2 and 3 based on my judgment
Nothing forces you to use a stored procedure for case 2; you could use HQL and projections, as already pointed out:
select f.id, f.firstName from Foo f where ...
Which would return an Object[] or a List<Object[]> depending on the where condition.
And if you want type safe results, you could use a SELECT NEW expression (assuming you're providing the appropriate constructor):
select new Foo(f.id, f.firstName) from Foo f
And you can even return non-entities:
select new com.acme.LigthFoo(f.id, f.firstName) from Foo f
For case 3, the situation seems different. Just in case, note that the Criteria API is more appropriate than HQL to build dynamic queries. But it looks like this won't help here.
I would like to know (...) whether executing SQLs in HibernateCallback is correct approach?
First of all, there are several restrictions when using stored procedures and I prefer to avoid them when possible. Secondly, if you want to return entities, it is neither the only way nor the simplest one, as we saw. So for case 2, I would consider using HQL.
For case 3, since you aren't returning entities at all, I would consider not using Hibernate API but the JDBC support from Spring which offers IMHO a cleaner API than Session#connection() and the HibernateCallback.
More interesting readings:
References
Hibernate Core reference guide
14.6. The select clause (about the select new)
16.1.5. Returning non-managed entities (about ResultTransformer)
16.2.2. Using stored procedures for querying
Resources
Hibernate 3.2: Transformers for HQL and SQL
Related questions
hibernate SQLquery extract variable
hibernate query language or using criteria

You should strive to use as much HQL as possible, unless you have a good argument (like performance, but do a benchmark first). If the use of native queries becomes excessive, you should consider whether Hibernate has been a good choice.
Note a few things:
You can have native queries and stored procedures that result in Hibernate entities. You just have to map the query / stored procedure call to a class and call it by session.createSQLQuery(queryName)
If you really need to construct native queries at runtime, the newest versions of Hibernate have a doWork(..) method, with which you can do JDBC work.

You say
For example, the Car entity has more than 100 attributes, but for reporting purposes I don't need more than 10 of them, so I thought loading all 100 attributes does not make any sense
but HQL in Hibernate allows you to do a projection (select only a subset of the columns back). You don't have to pull the entire entity if you don't want to.
Then you get all the benefits of HQL (typing of results, HQL join syntax) but you can pretty much write SQL-ish code.
See here for the HQL docs and here for the select syntax. If you're used to SQL it's pretty easy.
So to answer you directly
a - No, I think you should be using HQL
b - Becomes irrelevant if you go with my suggestion for a.

Hibernate translation capabilities

Our project must be able to run on both Oracle and SQL Server. The problem is that we have a number of HQL and native queries with non-standard operators (e.g. bitand and ||) and functions (e.g. SUBSTR) that work fine in Oracle but not in SQL Server.
I wonder if Hibernate is capable of translating them dynamically. I suppose that with HQL it may be, because it creates an AST, but I doubt the same applies to native queries.
Additional question: what's the best approach for dealing with these troublesome queries? Conditionals, subclassing, others... The goal is not to modify the code a lot.
Thanks in advance.

Use custom Dialects for HQL. Instead of using ||, create your own function called concat. Then, in the SQL Server dialect add this to the constructor:
registerFunction("concat", new VarArgsSQLFunction(Hibernate.STRING, "", "+", ""));
You don't have to change the Oracle dialect because Oracle already has a concat function so it just passes through, but for other functions, you may need to register new functions in both.
For SQL queries, since you're building them dynamically anyway, you could use base class methods, for example super.addBitAndClause(leftSide, rightSide).
You can even get to the dialect dynamically, although Hibernate didn't make it easy, since the method is not on the SessionFactory interface:
Dialect d = ((SessionFactoryImpl)sessionFactory).getDialect();
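For the native queries, the idea of routing the non-portable pieces through helper methods could be sketched as follows. This uses an interface with one implementation per database instead of the base-class method mentioned above, and the names SqlFragments, OracleFragments and SqlServerFragments are hypothetical. The operands are expected to be column names or placeholders, not user input.

```java
// Hypothetical abstraction over the SQL fragments that differ between the two databases.
interface SqlFragments {
    String bitAnd(String left, String right);
    String concat(String left, String right);
    String substring(String value, int start, int length);
}

final class OracleFragments implements SqlFragments {
    public String bitAnd(String left, String right) {
        return "BITAND(" + left + ", " + right + ")";
    }
    public String concat(String left, String right) {
        return "(" + left + " || " + right + ")";
    }
    public String substring(String value, int start, int length) {
        return "SUBSTR(" + value + ", " + start + ", " + length + ")";
    }
}

final class SqlServerFragments implements SqlFragments {
    public String bitAnd(String left, String right) {
        return "(" + left + " & " + right + ")";
    }
    public String concat(String left, String right) {
        return "(" + left + " + " + right + ")";
    }
    public String substring(String value, int start, int length) {
        return "SUBSTRING(" + value + ", " + start + ", " + length + ")";
    }
}
```

The query-building code then depends only on SqlFragments, and the implementation can be chosen once at startup, for instance based on the dialect obtained above.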

I suggest moving the HQL queries from the code to an external .hbm file and using named queries before switching the database. The HQL queries shouldn't be a problem, as you already said. Native queries are a problem, and you have to find the equivalent for the other DBMS. But by putting the queries into the external file, you can configure the SessionFactory to use the database-specific .hbm file and do not need to change the code, which depends only on the named query; that query can be either native SQL or HQL.
To get a named query you can do the following:
Query query = session.getNamedQuery("YourNamedHQLorSQLQuery");
