I've created a functional index on a Sybase table.
create index acadress_codpost_lower on acadress(LOWER(l5_codpost))
I then run a complex query that uses the index. Without the index it takes 17.086 seconds. With the index it takes 0.076 seconds.
I've run it from two different SQL clients and on both development and pre-prod Sybase servers. In all cases I see the acceleration from the index.
However, when we run an identical query from Java (and I know it's identical, since I've logged the generated SQL and used that directly in the SQL clients), the performance is exactly the same as before we added the index.
What possible reason might there be for identical SQL queries to use the index when run from ACE and SQuirreL but not from Java?
My first thought is that maybe Sybase is caching execution plans for the Prepared Statements and not using the index. We've tried restarting the Java server several times (other services use the Sybase server so it's harder to bounce) and it has made no difference.
The other possibility is that we are using a very old version of the Sybase driver:
jConnect (TM) for JDBC(TM)/7.00(Build 26502)/P/EBF17993/JDK16/Thu Jun 3 3:09:09 2010
Is it possible that functional indexes are not supported by this version of jConnect?
Does anyone know if either of these theories might be correct, or whether there is something else I've missed?
I've been looking into this off and on for the past week or so and while I still do not have a definitive answer I do have a plausible theory.
I tried the suggestions from the comments, and thanks to them I was able to narrow the cause down to a single change. If I have the query:
"where LOWER(aca.l5_codpost) like '"+StringEscapeUtils.escapeSql("NG179GT".toLowerCase())+"'"
Then the query uses the index and returns extremely quickly.
If on the other hand I have:
where LOWER(aca.l5_codpost) like :postcode
query.setString("postcode", "NG179GT".toLowerCase());
Then it does not use the index.
The theory is that Sybase is optimizing the query plan with no information about the contents of :postcode, so it is not using the index. It doesn't recompile the query once it does know the contents so it never uses the index.
I've tried forcing the index using (index acadress_codpost_lower) and that made no difference.
I've tried set forceplan off and set literal_autoparam off and neither made any difference.
The only thing I can find that changes the behavior is directly embedding the value into the query string vs. having it as a parameter.
So the workaround is embedding the parameter into the query string, although I'd still like to know what's actually happening and how to solve the problem properly.
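For completeness, a minimal sketch of that workaround, assuming a plain JDBC Connection named connection; the aca.id column is invented for illustration, while the escape call is the same one shown above:

import org.apache.commons.lang.StringEscapeUtils;

// Embed the lower-cased, escaped value directly in the SQL text, so Sybase
// compiles the plan with the literal visible and can choose the index.
String postcode = "NG179GT";
String sql = "select aca.id from acadress aca"
           + " where LOWER(aca.l5_codpost) like '"
           + StringEscapeUtils.escapeSql(postcode.toLowerCase()) + "'";
ResultSet rs = connection.createStatement().executeQuery(sql);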
I have a Korma based software stack that constructs fairly complex queries against a MySQL database. I noticed that when I am querying for datetime columns, the type that I get back from the Korma query changes depending on the syntax of the SQL query being generated. I've traced this down to the level of clojure.java.jdbc/query. If the form of the query is like this:
select modified from docs order by modified desc limit 10
then I get back maps corresponding to each database row, in which :modified is a java.sql.Timestamp. However, sometimes our query generator generates more complex union queries, such that we need to apply an order by ... limit ... constraint to the final result of the union. Korma does this by wrapping the query in parentheses. Even with only a single subquery (i.e., a simple parenthesized select), so long as we add an "outer" order by ..., the type of :modified changes.
(select modified from docs order by modified desc limit 10) order by modified desc
In this case, clojure.java.jdbc/query returns :modified values as strings. Some of our higher level code isn't expecting this, and gets exceptions.
We're using a fork of Korma, which is using an old (0.3.7) version of clojure.java.jdbc. I can't tell if the culprit is clojure.java.jdbc, the JDBC driver, or MySQL. Has anyone seen this and have ideas on how to fix it?
Moving to the latest jdbc in a similar situation changed several other things for us and was a decidedly "non-trivial" task. I would suggest getting off of the Korma fork soon and then debugging this.
For us the changes focused on what Korma returned on update calls, which changed between the versions of the backing jdbc. It was well worth getting current, even though it's a moderately painful process.
Getting current with jdbc will give you fresh new problems!
Best of luck with this :-) These things tend to be fairly specific to the DB server you are using.
Other options for you are to have a policy of always specifying an order-by parameter, or to build a library to coerce the strings into dates. Both of these have some long-term technical debt problems.
I've just tested my application under a profiler and found out that SQL strings use about 30% of my memory! This is bizarre.
There are a lot of strings like this stored in app memory. These are SQL queries generated by Hibernate; note the different numbers and trailing underscores:
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=? for update
select avatardata0_.Id as Id4347_0_,...... where avatardata0_.Id=? for update
Here is the part I can't understand. Why does Hibernate have to generate different SQL strings with different identifiers like "Id4305_0_" for each query? Why can't it use one query string for all identical queries? Is this some kind of trick to bypass query caching?
I would greatly appreciate it if someone could explain why this is happening and how to avoid wasting resources like this.
UPDATE
OK, I found it. I was wrong to assume a memory leak; it was my fault. Hibernate is working as intended.
My app created 121(!) SessionFactories across 10 threads, and they produced about 2300 instances of SingleTableEntityPersister. Each SingleTableEntityPersister generates about 15 SQL queries with different identifiers, so Hibernate was forced to generate about 345,000 different SQL strings. Everything is fine, nothing weird :)
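If you hit the same thing, the fix boils down to sharing one SessionFactory across the application. A minimal sketch; the class name and configure() call are just one common arrangement, not taken from the original app:

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

// Build the SessionFactory once and share it across all threads, so the
// per-persister SQL strings are generated a single time.
public final class HibernateUtil {
    private static final SessionFactory SESSION_FACTORY =
            new Configuration().configure().buildSessionFactory();

    private HibernateUtil() {
    }

    public static SessionFactory getSessionFactory() {
        return SESSION_FACTORY;
    }
}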
There is logic behind the query strings that Hibernate generates. Their primary aim is to get unique aliases for table and column names.
From your query,
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=?
avatardata0_ ==> avatardata is the alias of the table, and 0_ is appended to indicate it is the first table in the query. So if it were the second table (or entity) in the query, it would have been shown as avatardata1_. The same logic is used for the column aliases.
So, this way all the possible conflicts are avoided.
You are seeing these queries because you have turned on the show_sql flag in the configuration. This is intended for debugging queries. Once your application is working, you are supposed to turn it off.
Read more in the API docs here.
I am not too aware of the memory consumption part, but you could repeat your tests with the above flag turned off and see if there is any improvement.
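For example, a minimal sketch of switching the flag off programmatically; hibernate.show_sql is the real property name, while the configure() call and surrounding code are just one common arrangement:

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

// Disable SQL logging before building the factory; in XML configuration this
// is the show_sql <property> element.
Configuration cfg = new Configuration().configure();
cfg.setProperty("hibernate.show_sql", "false");
SessionFactory factory = cfg.buildSessionFactory();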
Assuming you are using SQL Server, you might want to check the parameter type declaration for '?', making sure it results in the same fixed-length declaration every time.
Dynamic-length parameters would result in separate execution plans for each query, which could consume a lot of resources. What we see as the same procedure gets interpreted by SQL Server as a different query, rendering a separate execution plan.
Thus,
exec myprocedure @p1 varchar(3)='foo'
and
exec myprocedure @p1 varchar(6)='foobar'
would result in different plans, simply because the declarations of @p1 differ in size.
There is a lot to know about this behaviour. If the above applies to you, I would recommend you read up on 'parameter sniffing'.
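If that is what's happening, one possible way to pin the declaration down from JDBC is to cast the marker to a fixed type in the statement text. This is a sketch only; the table and column names are guesses based on the aliases above:

// The CAST keeps the parameter's declared type at varchar(50) regardless of
// how long the bound value is, so the same plan can be matched and reused.
String name = "foo";
PreparedStatement ps = connection.prepareStatement(
        "select Id from AvatarData where Name = cast(? as varchar(50))");
ps.setString(1, name);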
No... you can generate a common query inside Hibernate. The logic behind it is to map to the table and fetch the records from there; the same common query is used for every database. You could create a common query like this, for example:
select t.Id as Id4305_0_,...... from t where t.Id=?
Recently one of my colleagues made a comment that I should not use
LIKE '%'||?||'%'
rather use
LIKE ?
in the SQL and then replace the LIKE ? marker with LIKE '%'||?||'%' before I execute the SQL. He made the point that with a single parameter marker, DB2 will always cache the statement and thus cut down on the SQL prepare time.
However, I am not sure if that is accurate. To me it seems it should be the other way around, since we are doing more processing by doing a string replacement on the SQL every time the query is executed.
Does anyone know if a single marker really speeds up execution? Just FYI: I am using the Spring 2.5 JDBC framework, and the DB2 version is 9.2.
My question is: does DB2 treat "LIKE ?" differently from "LIKE '%'||?||'%'" as far as caching and preparation go?
'LIKE ?' is a parameter marker in a PreparedStatement. Prepared statements are an optimization at the JDBC driver and database level. The thinking is that databases analyze queries to decide how to most efficiently process them. The DB can then cache the resulting query plan, keyed on the full statement text. Reusing identical statements reuses the query plan. So basically, if you are running the same query multiple times with different comparison strings, and if the query plan stays cached, then yes, using 'LIKE ?' will be faster.
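In code, that means keeping the statement text constant and putting the wildcards in the bound value; the table and column names below are invented for illustration:

// The SQL string never changes between calls, so the cached plan for the
// statement can be reused; only the bound value varies.
String searchTerm = "smith";
PreparedStatement ps = conn.prepareStatement(
        "select name from customer where name like ?");
ps.setString(1, "%" + searchTerm + "%");
ResultSet rs = ps.executeQuery();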
Some useful (though somewhat dated) info on PreparedStatements:
Prepared Statements
More Prepared Statements
I haven't done much DB2, not since the '90s, and I'm not really sure I'm understanding what your underlying question is. Way back then I got a phone call from the head of the DBA team: "What are you doing differently from every other programmer we've got!??" Mind you, this was early in my career, so tentatively I answered, "Nothing....", imagine it in kind of a whiny voice. "Well then, why do your queries take 50% of the CPU resources of any of the other guys???" I took a quick poll of all the other guys and found I was the only one using prepared statements.

Now, under the covers Spring automatically makes prepared statements, and databases have improved statement caching a lot over the years, but if you make use of them properly, you can get the speedup there, AND it'll make the statement cache swap things out less often. It really depends on your use case: if you're only going to hit the query once, then there would be no difference; if it's a few thousand times, obviously it would make a much greater difference.
in the SQL and then replace the LIKE ? marker with LIKE '%'||?||'%' before I execute the SQL. He made the point that with a single parameter marker, DB2 will always cache the statement and thus cut down on the SQL prepare time.
Unless DB2 is some sort of weird alien SQL database, or its driver does some crazy things, the database server will never see your prepared statement until you actually execute it. So you can swap clauses in and out of the SQL string all day long, and it will have no effect until you actually send it to the server by executing it.
Currently working on the deployment of an OFBiz-based ERP, we've come across the following problem: some of the framework's code calls resultSet.last() to find the total number of rows in the result set. Using the Oracle JDBC driver v11 and v10, it tries to cache all of the rows in client memory, crashing the JVM because it doesn't have enough heap space.
After researching, the problem seems to be that the Oracle JDBC driver implements scrollable cursors on the client side, instead of on the server, through a cache. Using the DataDirect driver solves that issue, but then the call to resultSet.last() takes too long to complete, so the application server aborts the transaction.
Is there any way to implement scrollable cursors via JDBC in Oracle without resorting to the DataDirect driver?
And what is the fastest way to know the length of a given ResultSet?
Thanks in advance
Ismael
"what is the fastest way to know the length of a given resultSet"
The ONLY way to really know is to count them all. You want to know how many 'SMITH's are in the phone book. You count them.
If it is a small result set, and quickly arrived at, it is not a problem. E.g. there won't be many Gandalfs in the phone book, and you probably want to get them all anyway.
If it is a large result set, you might be able to do an estimate, though that's not generally something that SQL is well-designed for.
To avoid caching the entire result set on the client, you can try
select id, count(1) over () n from junk;
Then each row will have an extra column (in this case n) with the count of rows in the result set. But it will still take the same amount of time to arrive at the count, so there's still a strong chance of a timeout.
A compromise is get the first hundred (or thousand) rows, and don't worry about the pagination beyond that.
Your proposed "workaround" with count basically doubles the work done by the DB server. It must first walk through everything to count the number of results, and then do the same again to return the results. Much better is the method mentioned by Gary (count(*) over(), i.e. analytics). But even there the whole result set must be created before the first output is returned to the client, so it is potentially slow and memory-consuming for large outputs.
The best way in my opinion is to select only the page you want on the screen (+1 row to determine that a next one exists), e.g. rows 21 to 41, and have another button (use case) to count them all in the (rare) case someone needs it.
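A rough JDBC sketch of that page-at-a-time idea, reusing the junk/id example above (the nested ROWNUM query is the usual Oracle pagination pattern):

import java.sql.*;
import java.util.*;

// Fetch rows 21..41 plus one probe row; if the probe row comes back, a next
// page exists and we drop it from the returned list.
static List<Long> fetchPage(Connection conn) throws SQLException {
    String sql = "select id from ("
               + " select t.*, rownum rn from ("
               + "  select id from junk order by id"
               + " ) t where rownum <= 42"
               + ") where rn > 20";
    List<Long> page = new ArrayList<Long>();
    ResultSet rs = conn.createStatement().executeQuery(sql);
    while (rs.next()) {
        page.add(rs.getLong("id"));
    }
    if (page.size() > 21) {        // the probe row came back
        page.remove(page.size() - 1);
    }
    return page;
}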
I've been writing a java app on my machine and it works perfectly using the DB I set up, but when I install it on site it blows up because the DB is slightly different.
So I'm in the process of writing some code to verify that:
A: I've got the DB details correct
B: The database has all the tables I expect, and they have the right columns.
I've got A down but I've got no idea where to start with B, any suggestions?
Target DB for the current client is Oracle, but the app can be configured to run on SQL Server as well. So a generic solution would be appreciated, but is not necessary, as I'm sure I can figure out how to get from one to the other.
You'll want to query the information schema of the database. Here are some examples for Oracle; every platform I am aware of has something similar.
http://www.alberton.info/oracle_meta_info.html
You might be able to use a database migration tool like LiquiBase for this; most of these tools have some way of checking the database. I don't have first-hand experience using it, so it's a guess.
I use DbUnit to test databases. It is a Java-based solution that integrates well with JUnit, and it is possible to use it with almost no Java. I haven't used it in exactly the same situation as you described, but it should be close enough to work.
The most generic solution would be to execute queries whose select clause lists the expected columns and whose from clause names the table, inside a try/catch block. You can make the where clause 1=2 so as not to fetch any data. If the query executes without throwing an exception, then you have the expected table and columns.
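A quick sketch of that idea; the expected table and column names here are examples only:

import java.sql.*;

// Runs a query that can never return rows; an exception means the table or
// one of the expected columns is missing.
static boolean hasExpectedShape(Connection conn) {
    try {
        Statement st = conn.createStatement();
        st.executeQuery("select id, name, created_date from customer where 1=2");
        return true;
    } catch (SQLException e) {
        return false;
    }
}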
The "slightly different" piece might be better handled by scripting the creation of the database in the first place. An automated process gives you a better chance of making the two identical.
Another point worth making is that you minimize your risk by making your dev and prod environments identical: same database schema and vendor for both. Change the circumstances that make the two different.
Lastly, you don't say what is "slightly" different, but sometimes these are unavoidable (e.g. Oracle uses sequences, SQL Server uses identities). Maybe Hibernate can help you to switch between vendors more reliably. It abstracts details in such a way that changing databases can mean modifying a single value in a configuration file.
What you need to have is basically Unit Tests for your database. "A column must exist named FOOBAR, the type must be Integer. No foreign keys may exist etc."
This is doable with plain JUnit and JDBC (ask the table for its metadata), and you may want this approach to be absolutely certain about what is being done, which may be harder when using e.g. DbUnit.
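A minimal sketch of such a test, reusing the FOOBAR example above (the table name and connection details are placeholders):

import java.sql.*;
import org.junit.Test;
import static org.junit.Assert.*;

public class SchemaTest {
    @Test
    public void foobarMustBeAnIntegerColumn() throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:XE", "scott", "tiger");
        // Ask the driver's metadata for the column and check its JDBC type.
        ResultSet rs = conn.getMetaData()
                .getColumns(null, null, "MYTABLE", "FOOBAR");
        assertTrue("column FOOBAR is missing", rs.next());
        assertEquals(Types.INTEGER, rs.getInt("DATA_TYPE"));
        conn.close();
    }
}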
You can check for the presence of tables, columns, views, etc. using these tables in Oracle
USER_TABLES
USER_VIEWS
USER_PROCEDURE
(or for everything)
USER_OBJECTS WHERE OBJECT_TYPE = '??'
To keep going... USER_TAB_COLS for table columns
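For example, given a java.sql.Connection conn, you could read one table's columns like this (the table name is just a sample; Oracle stores names upper-case by default):

// Query the USER_TAB_COLS dictionary view for the columns of one table.
PreparedStatement ps = conn.prepareStatement(
        "select column_name, data_type from user_tab_cols where table_name = ?");
ps.setString(1, "MYTABLE");
ResultSet rs = ps.executeQuery();
while (rs.next()) {
    System.out.println(rs.getString(1) + " " + rs.getString(2));
}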
Regards
K
I use MigrateDB for this. It lets you build queries that do things like check for the existence of given tables, columns, rows, indexes, etc. for a given database and use those as "tests." If a test fails, it triggers an "action" (which is just another query that knows how to remedy the problem.)
MigrateDB supports multiple database platforms (you can specify the "check for table existence query" for each platform, for example), completely configurable tests (you can make your own up), comes with fairly complete Oracle tests, and can be run in "audit only" mode so that it only tells you what the differences are.
It's a nice, robust solution.
If you're using plain JDBC, you should try this method: DatabaseMetaData.getTables, and the other similar methods available in the metadata class.
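For instance, a small sketch listing every base table visible to a java.sql.Connection conn (the filter arguments are one common choice):

import java.sql.*;

// DatabaseMetaData.getTables returns one row per matching table.
DatabaseMetaData meta = conn.getMetaData();
ResultSet tables = meta.getTables(null, null, "%", new String[] { "TABLE" });
while (tables.next()) {
    System.out.println(tables.getString("TABLE_NAME"));
}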