We use Oracle on a project and would like to also support MySQL. How close are their SQL dialects?
Is it perhaps even possible to use the same SQL source for both without too many gymnastics?
Details:
We're using iBatis, a persistence manager that cleanly segregates the SQL statements into resource files. But we work at the SQL level, which has its advantages (and disadvantages).
We'd prefer not to move to an object-relational mapper like Hibernate, which would fully shield us from dialect differences.
We've tried hard to keep to a generic subset of Oracle SQL.
There's no PL/SQL.
We don't use stored procedures or triggers (yet, anyway).
We use check constraints, unique constraints, and foreign key constraints.
We use ON DELETE CASCADEs.
We use transactions (done at the iBatis API level).
We call a few Oracle timestamp functions in the queries.
We would use the InnoDB storage engine with MySQL (it supports transactions and constraints).
So what are your thoughts? Would we need to maintain two different sets of iBatis SQL resource files, one for each dialect, or is it possible to have a single set of SQL supporting both MySQL and Oracle?
Final Update: Thanks for all the answers, and especially the pointers to Troels Arvin's page on differences. It's really regrettable that the standard isn't more, well, standard. For us the issues turn out to be MySQL auto-increment vs. Oracle sequences, MySQL LIMIT vs. Oracle ROW_NUMBER(), and perhaps the odd function or two. Almost everything else ought to transfer pretty easily, modulo a few edits to make sure we're using SQL-92 as #mjv points out. The larger issue is that some queries may need to be hand-optimized differently in each DBMS.
Expect a few minor bumps in the road, but on the whole it should be relatively easy.
From the list of features you currently use, there should only be a few syntactic or semantic differences, in general easy to fix or account for. The fact that you do not use PL/SQL and/or stored procedures is a plus. A good rule of thumb is to try to stick to SQL-92, which most DBMSes support, in particular both Oracle and MySQL. (Note this is not the current SQL standard, which is SQL:2008.)
A few of the differences:
"LIMIT" is a famous one: to limit the number of rows to retrieve in the results list, MySQL uses LIMIT n, at the end of the query, Oracle uses RowNumber() in the WHERE clause (which is pain, for you also need to reference it in the SELECT list...)
Some datatypes are different. I think mostly BOOLEAN (but who uses this ;-) ) Also some I think subtle differences with the DATETIME type/format.
Some function names are different (SUBSTRING vs. SUBSTR and such...)
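For illustration, here is a minimal sketch of keeping the handful of dialect-specific statements side by side and picking one at runtime. The table, column, and dialect-flag names are hypothetical; with iBatis the two strings would simply live in per-dialect resource files.
// Same logical query, phrased per dialect; only the vendor-specific parts
// (IFNULL vs. NVL, LIMIT vs. ROW_NUMBER()) differ. "dialect" is a hypothetical flag.
String mysqlSql = "SELECT IFNULL(nickname, 'anonymous') AS name FROM users "
                + "ORDER BY created DESC LIMIT 10";
String oracleSql = "SELECT NVL(nickname, 'anonymous') AS name FROM ("
                 + "  SELECT nickname, ROW_NUMBER() OVER (ORDER BY created DESC) AS rn FROM users"
                 + ") WHERE rn <= 10";
String sql = "oracle".equals(dialect) ? oracleSql : mysqlSql;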
Just found what seems to be a good resource about differences between SQL implementations.
Reading the responses from others: yes, DDL could be a problem. I discounted that, probably because many applications do not require DDL at runtime; you set up the schema once and then just use SQL for querying, adding, or updating the data.
I believe that maintaining a single set of SQL resource files for both MySQL and Oracle has several disadvantages: you end up caught between staying compatible with both engines and solving a particular problem well. It is best to have one set of SQL for each engine and thus make the most of each one's capabilities.
Features that look identical in a brochure may be implemented very differently.
See these examples:
Limiting result sets
MYSQL
SELECT columns
FROM tablename
ORDER BY key ASC
LIMIT n
ORACLE
SELECT * FROM (
  SELECT
    ROW_NUMBER() OVER (ORDER BY key ASC) AS rownumber,
    columns
  FROM tablename
)
WHERE rownumber <= n
Limit—with offset
MYSQL
SELECT columns
FROM tablename
ORDER BY key ASC
LIMIT n OFFSET skip
ORACLE
SELECT * FROM (
  SELECT
    ROW_NUMBER() OVER (ORDER BY key ASC) AS rn,
    columns
  FROM tablename
)
WHERE rn > skip AND rn <= (n + skip)
You can check this Comparison of different SQL implementations
In addition to the stuff others have mentioned, Oracle and MySQL handle outer joins quite differently. Oracle offers a proprietary syntax that MySQL won't cope with, but Oracle will also cope with the standard syntax.
Oracle only:
SELECT a.foo, b.bar
FROM a, b
WHERE a.foo = b.foo(+)
MySQL and Oracle:
SELECT a.foo, b.bar
FROM a
LEFT JOIN b
ON (a.foo=b.foo)
So you may have to convert some outer joins.
You definitely won't be able to keep your DDL the same. As far as DML goes, there are many similarities (there's a core subset of ANSI SQL standard supported by every database) but there are some differences as well.
To start, MySQL uses auto increment values and Oracle uses sequences. It's possible to work around this (sequence + trigger on the Oracle side to simulate auto increment), but the difference is there; see the sketch below. Built-in functions are also quite different.
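For example, here is a minimal JDBC sketch of how the key handling typically differs. The table, column, and sequence names (emp, emp_seq) are hypothetical, and conn is assumed to be an open java.sql.Connection.
// MySQL: AUTO_INCREMENT column; ask the driver for the generated key.
PreparedStatement ins = conn.prepareStatement(
    "INSERT INTO emp (name) VALUES (?)", Statement.RETURN_GENERATED_KEYS);
ins.setString(1, "Smith");
ins.executeUpdate();
ResultSet keys = ins.getGeneratedKeys();   // first row/column holds the new id

// Oracle: fetch the next value from a sequence and insert it explicitly
// (or let a BEFORE INSERT trigger fill the column from the sequence).
ResultSet seq = conn.createStatement().executeQuery("SELECT emp_seq.NEXTVAL FROM dual");
seq.next();
long id = seq.getLong(1);
PreparedStatement oraIns = conn.prepareStatement("INSERT INTO emp (id, name) VALUES (?, ?)");
oraIns.setLong(1, id);
oraIns.setString(2, "Smith");
oraIns.executeUpdate();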
Basically, depending on what exactly you intend to use it may or may not be possible to keep one set of statements for both. Incidentally, even with Hibernate dialects it's not always possible to have the same set of queries - HQL is great but not always enough.
Oracle treats empty strings as NULLs. MySQL treats empty strings as empty strings and NULLs as NULLs.
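A minimal sketch of the kind of defensive normalization this difference can force on the Java side; the column, parameter index, and variable are purely illustrative, and ps is an assumed java.sql.PreparedStatement.
// Oracle stores NULL for '', so "WHERE middle_name = ''" never matches there,
// while MySQL keeps the empty string. Normalizing on the way in keeps both
// databases behaving the same.
String middleName = "";   // value coming from the application, possibly empty
if (middleName == null || middleName.isEmpty()) {
    ps.setNull(1, java.sql.Types.VARCHAR);
} else {
    ps.setString(1, middleName);
}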
I have a Korma based software stack that constructs fairly complex queries against a MySQL database. I noticed that when I am querying for datetime columns, the type that I get back from the Korma query changes depending on the syntax of the SQL query being generated. I've traced this down to the level of clojure.java.jdbc/query. If the form of the query is like this:
select modified from docs order by modified desc limit 10
then I get back maps corresponding to each database row in which :modified is a java.sql.Timestamp. However, sometimes our query generator generates more complex union queries, such that we need to apply an order by ... limit ... constraint to the final result of the union. Korma does this by wrapping the query in parentheses. Even with only a single subquery--i.e., a simple parenthesized select--so long as we add an "outer" order by ..., the type of :modified changes.
(select modified from docs order by modified desc limit 10) order by modified desc
In this case, clojure.java.jdbc/query returns :modified values as strings. Some of our higher level code isn't expecting this, and gets exceptions.
We're using a fork of Korma, which is using an old (0.3.7) version of clojure.java.jdbc. I can't tell if the culprit is clojure.java.jdbc or java.jdbc or MySQL. Anyone seen this and have ideas on how to fix it?
Moving to the latest jdbc in a similar situation changed several other things for us and was a decidedly "non-trivial" task. I would suggest getting off of the Korma fork soon and then debugging this.
For us the changes centered on how what Korma returned from update calls changed between the versions of the backing jdbc. It was well worth getting current, even though it's a moderately painful process.
Getting current with jdbc will give you fresh new problems!
Best of luck with this :-) These things tend to be fairly specific to the DB server you are using.
Other options for you are to have a policy of always specifying an order-by parameter, or to build a library to coerce the strings into dates. Both of these have some long-term technical debt problems.
Is there any performance improvement in calling a procedure which returns a SYS_REFCURSOR versus calling a query directly?
For example
CREATE OR REPLACE PROCEDURE my_proc
(
  p_id IN NUMBER,
  emp_cursor IN OUT SYS_REFCURSOR
)
AS
BEGIN
  OPEN emp_cursor FOR
    SELECT * FROM emp WHERE emp_number = p_id;
END;
/
and call the above from Java by registering the OUT parameter, passing the IN parameter, and fetching the results, along the lines of the sketch below.
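For reference, a minimal sketch of that call, assuming an open java.sql.Connection named connection, a numeric variable empNumber, and the Oracle JDBC driver (which provides oracle.jdbc.OracleTypes):
CallableStatement cs = connection.prepareCall("{call my_proc(?, ?)}");
cs.setLong(1, empNumber);                                    // IN parameter
cs.registerOutParameter(2, oracle.jdbc.OracleTypes.CURSOR);  // OUT ref cursor
cs.execute();
ResultSet rs = (ResultSet) cs.getObject(2);
while (rs.next()) {
    // read the emp columns as usual
}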
Or
From Java get the results from emp table by
preparedStatement = prepareStatement(connection, "select * from emp where emp_number=?", values);
resultSet = preparedStatement.executeQuery();
Which one of the above is a better option to call from Java?
There is no performance difference assuming your prepareStatement method is using the appropriate type for all bind variables. That is, you would need to ensure that you are calling setLong, setDate, setString, etc. depending on the data type of the parameter. If you bind the data incorrectly (e.g. calling setString to bind a numeric value), you may force Oracle to do data type conversion which may prevent the optimizer from using an index that would improve performance.
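For example (the parameter index and value are illustrative):
// emp_number is numeric, so bind it as a number...
preparedStatement.setLong(1, 12345L);
// ...rather than as a string, which may force an implicit datatype conversion
// and can keep the optimizer from using an index on the column:
// preparedStatement.setString(1, "12345");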
From a code organization and maintenance standpoint, however, I would rather have the queries in the database rather than in the Java application. If you find that a query is using a poor plan, for example, it's likely to be much easier for a DBA to address the problem if the query is in a stored procedure than if the query is embedded in a Java application. If the query is stored in the database, you can also use the database's dependency tracking functions to more easily do an impact analysis if you need to do something like determine what would be impacted if the emp table needs to change.
Well, I don't think there is a major difference from the Java invocation standpoint.
Some differences I can think of are:
You will now have to maintain two different code bases: your Java code and your stored procedures. In case of errors, you will have to debug in two different places, and fix problems in two different places.
Once production-ready, making changes to the database is probably going to require some additional formalisms besides those required to change the Java code deployed.
Another important matter to take into account is database independence: if you are building a product to work with different kinds of databases, you would be forced to write different versions of your stored procedures, and you will have more code to maintain (debug, bugfix, change, etc.).
This is very important if you're building a product that you intend to deploy in different environments for different (possibly yet unknown) clients, where you cannot predict which RDBMS they will be using.
If you want to use an ORM framework (e.g. Hibernate, EclipseLink), it will generate pretty well-optimized queries for you. Plus, it would be more difficult to integrate one later on if you use stored procedures.
With a proper amount of logging it is easy to analyze your queries for optimization purposes. You could use JDBC logging or the logging provided by your ORM provider and actually see how the query is being used by the application, how many times, how often, etc., and optimize where it matters.
I've just tested my application under the profiler and found out that SQL strings use about 30% of my memory! This is bizarre.
There are a lot of strings like this stored in app memory. These are SQL queries generated by Hibernate; note the different numbers and trailing underscores:
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=? for update
select avatardata0_.Id as Id4347_0_,...... where avatardata0_.Id=? for update
Here is the part I can't understand. Why does Hibernate have to generate different SQL strings with different identifiers like "Id4305_0_" for each query? Why can't it use one query string for all identical queries? Is this some kind of trick to bypass query caching?
I would greatly appreciate it if someone could explain why this is happening and how to avoid wasting resources like this.
UPDATE
OK, I found it. I was wrong to assume a memory leak; it was my fault. Hibernate is working as intended.
My app created 121(!) SessionFactories in 10 threads, and they produced about 2300 instances of SingleTableEntityPersister. Each SingleTableEntityPersister generates about 15 SQL queries with different identifiers. Hibernate was forced to generate about 345,000 different SQL queries. Everything is fine, nothing weird :)
There is a logic behind the query string that Hibernate generates. Its primary aim is to get unique aliases for table and column names.
From your query,
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=?
avatardata0_ ==> avatardata is the alias of the table, and 0_ is appended to indicate it is the first table in the query. So if it were the second table (or entity) in the query, it would be shown as avatardata1_. It uses the same logic for the column aliases.
So, this way all the possible conflicts are avoided.
You are seeing these queries because you have turned on the show_sql flag in the configuration. This is intended for debugging queries. Once your application starts working you are supposed to turn it off.
Read more in the API docs here.
I am not much aware of the memory consumption part, but you can repeat your tests with the above flag turned off and see if there is any improvement.
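For instance, if the flag is set programmatically rather than in hibernate.cfg.xml, a sketch of turning it off could look like this (hibernate.show_sql and hibernate.format_sql are standard Hibernate property names):
org.hibernate.cfg.Configuration cfg = new org.hibernate.cfg.Configuration();
cfg.setProperty("hibernate.show_sql", "false");    // stop echoing generated SQL
cfg.setProperty("hibernate.format_sql", "false");
// ...then build the SessionFactory from cfg as before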
Assuming you are using SQL Server, you might want to check the parameter type declaration for '?', making sure the declaration results in the same, fixed-length declaration every time.
Dynamic-length parameters would result in separate execution plans for each query. This could possibly consume a lot of resources. What we see as the same procedure gets interpreted by SQL Server as a different query, rendering a separate execution plan.
Thus,
exec myprocedure @p1 varchar(3)='foo'
and
exec myprocedure @p1 varchar(6)='foobar'
would result in different plans, simply by the fact that the declarations of @p1 differ in size.
There is a lot to know about this behaviour. If the above applies to you, I would recommend you read up on 'parameter sniffing'.
No... you can generate a common query inside Hibernate. The logic behind it is to map to the table and fetch the records from there. It uses a common query for all databases. Please create a common query like this:
Example :
select t.Id as Id4305_0_,...... from t where t.Id=?
If we use the LIMIT clause in a query which also has an ORDER BY clause and execute the query via JDBC, will there be any effect on performance? (using a MySQL database)
Example:
SELECT modelName from Cars ORDER BY manuDate DESC Limit 1
I read in one of the threads on this forum that, by default, a set number of rows is fetched at a time. How can I find the default fetch size?
I want only one record. Originally, I was using as follows:
SQL Query:
SELECT modelName from Cars ORDER BY manuDate DESC
In the JAVA code, I was extracting as follows:
if(resultSett.next()){
//do something here.
}
Definitely the LIMIT 1 will have a positive effect on performance. Instead of the entire (well, depending on the default fetch size) data set of matches being returned from the DB server to the Java code, only one row will be returned. This saves a lot of network bandwidth and Java memory usage.
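If you want to see or influence that fetch size, JDBC exposes it directly. A sketch, assuming an open java.sql.Connection named connection; the default is driver-specific, and MySQL Connector/J's default of 0 means it reads the whole result set, treating setFetchSize only as a hint:
Statement st = connection.createStatement();
System.out.println(st.getFetchSize());   // the driver's default fetch-size hint
st.setFetchSize(50);                     // hint: fetch about 50 rows per round trip
ResultSet rs = st.executeQuery(
    "SELECT modelName FROM Cars ORDER BY manuDate DESC LIMIT 1");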
Always delegate constraints like LIMIT, ORDER BY, WHERE, etc. to the SQL side as much as possible instead of doing it on the Java side. The DB will do it much better than your Java code ever can (if the table is properly indexed, of course). You should try to write the SQL query so that it returns exactly the information you need.
The only disadvantage of writing DB-specific SQL queries is that the SQL language is not entirely portable among different DB servers, which would require you to change the SQL queries every time you change DB servers. But in the real world it's very rare anyway to switch to a completely different DB make. Externalizing SQL strings to XML or properties files should help a lot anyway.
There are two ways the LIMIT could speed things up:
by producing less data, which means less data gets sent over the wire and processed by the JDBC client
by potentially having MySQL itself look at fewer rows
The second one of those depends on how MySQL can produce the ordering. If you don't have an index on manuDate, MySQL will have to fetch all the rows from Cars, then order them, then give you the first one. But if there's an index on manuDate, MySQL can just look at the first entry in that index, fetch the appropriate row, and that's it. (If the index also contains modelName, MySQL doesn't even need to fetch the row after it looks at the index -- it's a covering index.)
With all that said, watch out! If manuDate isn't unique, the ordering is only partially deterministic (the order for all rows with the same manuDate is undefined), and your LIMIT 1 therefore doesn't have a single correct answer. For instance, if you switch storage engines, you might start getting different results.
I need to use an entity framework with my application, and I have used table partitions in an Oracle database. With plain JDBC, I am able to select data from a specific partition. But I don't know whether I can do the same with Hibernate or EclipseLink (JPA). If someone knows how to do that, please do let me know.
Usually the select statement in JDBC/SQL is:
select * from TABLE_NAME partition(PARTITION_NAME) where FIELD_NAME='PARAMETER_VALUE';
How can I do the same with Hibernate or JPA?
Please share at least a link for learning sources.
Thanks!!!
JPA or any other ORM framework does not support Oracle partitioned tables natively (at least to my knowledge).
There are different possible solutions though, depending on the nature of your problem:
Refactor your classes so that data that needs to be treated differently in real life belongs in a separate class. Sometimes this is called vertical partitioning (partitions are obtained not across rows, but rather across columns).
Use Oracle partitioned tables underneath and use native SQL queries or stored procedures from JPA; see the sketch after this list. This is just a possible solution (I haven't attempted this).
Use Hibernate Shards. Although the typical use case for Hibernate Shards is not for a single database, it presents a singular view of distributed databases to an application developer.
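For the second option, a minimal sketch using the standard JPA native-query API; the entity class, table, and partition names are hypothetical, and em is an assumed EntityManager.
@SuppressWarnings("unchecked")
List<Employee> rows = em.createNativeQuery(
        "SELECT * FROM employees PARTITION (p_2023) WHERE status = ?",
        Employee.class)
    .setParameter(1, "ACTIVE")
    .getResultList();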
Related:
JPA Performance, Don't Ignore the Database
EclipseLink supports partitioning or sharding with different options.
You can find more about this and examples here:
http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Advanced_JPA_Development/Data_Partitioning
Table partitioning is data organization at the physical level. In a word, partitioning is a poor man's index. Like the latter, it is supposed to be entirely transparent to the user. A SQL query normally refers to the entire table, not to a particular partition. It is then the query optimizer's job to decide whether it can leverage a certain partition, or index.