Java + Spring + Hibernate: "order by" slow on production


I have a Java Spring server, working with Hibernate and MySQL, and I have a view in my database, which I've mapped as an entity with hibernate.
The view is completely flat, no references to other entities. No fetches etc.
I can query the view using a simple select with no problems, both directly on the sql server using "Workbench", and using hql.
Locally I can also use "order by" clauses with no noticeable impact on the performance.
My problem is that when I move to a dev server, the "order by" clause makes the query extremely slow, to the point that it times out.
This happens only when the server makes the hql query.
I can still query the dev DB directly using "Workbench" with no performance problems.
The only differences between the two servers (that I can think of) are that the dev server is deployed on Google App Engine, and that the JDBC drivers are different.
On the local server I use com.mysql.jdbc.Driver, and on App Engine com.mysql.jdbc.GoogleDriver.
My local DB is a MySQL database, and the dev one is on Google's Cloud SQL, which is also MySQL (pretty much).
On my Dev DB I have about 28000 rows in total in the view, and on my local DB I have about 21000, so that also shouldn't make the difference.
I can add code if you think it could help. I wasn't sure where to start.
Thank you very much in advance.

I would enable 'show_sql=true' to see the generated queries, and would also try the 'GoogleDriver' against the local DB to rule out a driver issue.
Is it possible to connect to the DEV DB from your local workspace? If so, try that using 'com.mysql.jdbc.Driver' to rule out an issue with the DEV DB itself.
If possible, also connect to the DEV DB from a MySQL manager and run the query directly there.
28K rows is far too few for an 'order by' clause to cause problems on its own.
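As a minimal sketch of the first suggestion, the settings below turn on Hibernate's SQL logging. The property names hibernate.show_sql and hibernate.format_sql are standard Hibernate settings; the class and method names are hypothetical, and how the Properties object reaches your SessionFactory or Spring configuration depends on your setup.

```java
import java.util.Properties;

// Hypothetical helper: copies existing Hibernate settings and enables SQL logging.
final class SqlLoggingSettings {

    static Properties withSqlLogging(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        // Print every generated SQL statement to stdout.
        props.setProperty("hibernate.show_sql", "true");
        // Pretty-print the statements so the ORDER BY clause is easy to spot.
        props.setProperty("hibernate.format_sql", "true");
        return props;
    }
}
```

The logged statement can then be pasted into Workbench against the dev DB to check whether the generated SQL itself is slow.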
Edit
If you have not enabled 'connector-j' before deploying to App Engine, it will still try to use the 'mysql' driver, and will eventually time out.
<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
...
<use-google-connector-j>true</use-google-connector-j>
</appengine-web-app>

You might want to look to see if an index on the ORDER BY field exists in the DEV server but does not on PROD. That can have a profound effect on both ordering and searching.

Related

MySQL only shows some data that is inserted by Hibernate

I have a Java application that loads data from a very large text file into MySQL via Hibernate. So far, the application seems to work OK - the data gets loaded.
Then I wanted to monitor the progress of the application while it was inserting data, so I wrote a small command-line utility that basically queries select count(*) from my_table;.
The first time this query is run (from either the CLI or MySQL Workbench), I get the correct number of records, as expected. All subsequent executions of this query return the exact same number! Even though the data-loading application is still running!
If I stop and start the MySQL process, querying for the number of records shows the correct number, as the data-loading application would report it.
I've never seen anything like this before. It looks like there is some strange MySQL caching issue going on here, and I'm concerned it may cause problems for other non-Hibernate applications that may want to access this database.
Does anyone know how to tweak MySQL (or possibly Hibernate) so that MySQL always shows what's being added to the database?
Technical details:
MySQL: 5.7.26, in a docker container, using InnoDB storage
Hibernate version: 5.4.2.Final
Spring version: 5.1.7.RELEASE

Calling FLUSH TABLES seems to resolve this, in the sense that after I flush the tables, I can see how many records have been added by the application.

How to determine what SQL query a java web application is using to return the data?

I have been given a java web application for which I have the source code to. The application queries an Oracle database to return data back to the user in web page. I need to update a value for the returned data in the database, without knowing the table or column names. Is there a general way to determine what query the application is submitting to return the data for a particular page so I can find the right tables and columns?
I am new to java and web development so not sure where to start looking.
Thanks!

Well, there's always the old fashioned way of finding out. You can find the source code for the specific page you're looking at and identify the query that's being executed to retrieve the data. I'm assuming that's not what you're looking for, though.
Some other options include the JDBC logging feature (see "Enabling and Using JDBC Logging") or JProfiler, whose JDBC probe shows you all SQL statements in the events view. Once you find the SQL statement, you can use the standard text search in your IDE to locate the specific code and make alterations.
Hope that helps!

If you can run a controlled test (e.g., you are the only person on that web application), you could turn on SQL tracing on the DB connection and then run your transaction several times. To do this:
Look at all the connections from that application using v$session. You can limit these by tweaking your connection pool settings (e.g., set min and max connections to 1), assuming this is your test environment.
Turn on the 10046 trace (see https://oracle-base.com/articles/misc/sql-trace-10046-trcsess-and-tkprof -- there are many other examples).
The 10046 trace will show you what the application is doing -- SQL by SQL. You can even set the level to 12 to get the bind variable values (assuming you are using prepared statements).
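If you would rather switch the trace on from Java than from a DBA session, a rough sketch follows. It assumes the commonly documented ALTER SESSION SET EVENTS syntax for event 10046, that the connection you pass in is the same one the application uses (hence the pool size of 1), and that the database user may alter its session. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for toggling Oracle's 10046 trace on one JDBC session.
final class Trace10046 {

    // Commonly documented levels: 1 = basic, 4 = binds, 8 = waits, 12 = binds and waits.
    static String enableSql(int level) {
        if (level != 1 && level != 4 && level != 8 && level != 12) {
            throw new IllegalArgumentException("unsupported trace level: " + level);
        }
        return "ALTER SESSION SET EVENTS '10046 trace name context forever, level " + level + "'";
    }

    static String disableSql() {
        return "ALTER SESSION SET EVENTS '10046 trace name context off'";
    }

    // Runs the enable statement on the given session; only that session is traced.
    static void enable(Connection connection, int level) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(enableSql(level));
        }
    }

    static void disable(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(disableSql());
        }
    }
}
```

The resulting trace file is written on the database server, so you still need access to it (or a DBA) to read it with tkprof.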

How to update tables from back end that it reflects in retrieved entities

I'm developing a Spring MVC web application on Windows 7, using Eclipse Juno, EclipseLink JPA as the ORM, and Glassfish as the application server with Oracle 11g. While working with EclipseLink, I noticed that when I update a table manually by executing an update PL/SQL query, it has no effect on entities already retrieved by EclipseLink until I restart the server. This happens even though I disabled the EclipseLink cache by setting <shared-cache-mode>NONE</shared-cache-mode> in persistence.xml and by using EntityManager.clear(), EntityManager.close() and @Cacheable(false).
Then I noticed that when I update tables using the Oracle SQLDeveloper table designer, it works fine and entities show the updated information. So I checked the SQLDeveloper log to see what query it uses to update rows, and I saw that it uses ORA_ROWSCN and ROW ROWID in the where clause. After that, I used exactly the same where clause as SQLDeveloper to update the tables, but the entities still showed old information.
I'm wondering what factors cause EclipseLink not to fetch real-time data from the database, while after updating a table with the SQLDeveloper designer EclipseLink does show real-time data. It seems that modifying table data with the SQLDeveloper table designer also marks the table as changed using some database feature, and that EclipseLink reads this mark before hitting the table.
Also, for more clarification: does anyone know what steps EclipseLink goes through before it decides to hit the database when a TypedQuery is executed? I'm curious where it stores cached entities, since the cache is reset only by restarting the computer; I tried restarting Glassfish, killing the java process and logging off the current user, but none of them worked. Why is EclipseLink still caching entities when I configured it not to use any caching? Is it possible to turn off the cache in EclipseLink completely?

Are batch inserts not working only because of the MySQL driver? What about others?

Earlier I was trying to get batch inserts working in Hibernate. I tried everything: For the config I set batch_size(50), order_inserts(true), order_updates(true), use_second_level_cache(false), use_query_cache(false). For the session I used setCacheMode(CacheMode.IGNORE) and setFlushMode(FlushMode.MANUAL). Still the MySQL query log showed that each insert was coming in separately.
The ONLY thing that worked was setting rewriteBatchedStatements=true in the JDBC connection string. This worries me, as my application is supposed to support any JDBC database and I'm trying to avoid DB-specific optimizations.
Is the only reason Hibernate can't actually use batch statements that the MySQL driver doesn't support them by default? What about other drivers: do I have to add options to the connection string so they can support batched inserts? If you need specific DBs, think SQL Server, SQLite, Postgres, etc.

One reason it could not be working is that hibernate disables batching if you use the Identity id generation strategy.
Also MySQL doesn't support JDBC batch prepared statements the same way as other databases without turning on the rewrite option.
I don't see a problem with turning this flag on, though. If you set up your application for a different database, you will have to change settings such as the dialect, driver name, etc. anyway, and since this flag is part of the JDBC connection string, the rest of your application is isolated from it.
Basically I think you are doing the right thing.
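To illustrate keeping the MySQL-specific flag confined to the connection string, here is a small sketch. The helper name is hypothetical; rewriteBatchedStatements is the Connector/J option mentioned in the question, and URLs for other databases are returned unchanged.

```java
// Hypothetical helper: adds the MySQL batch-rewrite option only to MySQL JDBC URLs.
final class JdbcUrls {

    static String withBatchRewrite(String jdbcUrl) {
        // Leave non-MySQL URLs alone so the rest of the application stays database-neutral.
        if (!jdbcUrl.startsWith("jdbc:mysql:")) {
            return jdbcUrl;
        }
        // Respect an explicit setting that is already present.
        if (jdbcUrl.contains("rewriteBatchedStatements=")) {
            return jdbcUrl;
        }
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "rewriteBatchedStatements=true";
    }
}
```

In practice the same effect is usually achieved by simply editing the URL in the per-database configuration file, which is the point of the answer above.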

As batch insert (or bulk insert) is part of the SQL standard, ORM frameworks like Hibernate support and implement it. Please see "Chapter 13. Batch Processing" and "Hibernate / MySQL Bulk insert problem".
Basically, you need to set the JDBC batch size via the variable named hibernate.jdbc.batch_size to a reasonable size. Also don't forget to end the batch transaction with flush() and clear().
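The flush-and-clear rhythm can be sketched independently of Hibernate as below. The class, method and parameter names are hypothetical; in a real Hibernate application, persist would typically call session.save(entity) and flushAndClear would call session.flush() followed by session.clear(), with batchSize matching hibernate.jdbc.batch_size.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the "flush and clear every N entities" loop.
final class BatchWriter {

    // Persists all items and returns how many times flushAndClear was invoked.
    static <T> int writeInBatches(List<T> items, int batchSize,
                                  Consumer<? super T> persist, Runnable flushAndClear) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        int pending = 0;
        for (T item : items) {
            persist.accept(item);
            pending++;
            if (pending == batchSize) {
                // Push the batch to the database and release first-level cache memory.
                flushAndClear.run();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            // Flush the final, partially filled batch.
            flushAndClear.run();
            flushes++;
        }
        return flushes;
    }
}
```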

How to generically test a database connection with hibernate

I have a service method on an api that can be called to check the health of my database connection.
The method is pulling the query string from a properties file (depends on DB vendor, using Sybase and HSQL for now, more in future), and executing it. Then the method lets the caller know if it succeeded or failed.
In addition to this, I was using the Query.setHint("javax.persistence.query.timeout") to set a timeout on the query:
javax.persistence.EntityManager entityManager;
...
Query heartbeatQuery = entityManager.createNativeQuery(heartbeatQueryString);
heartbeatQuery.setHint("javax.persistence.query.timeout", heartbeatTimeout);
heartbeatQuery.getResultList();
My problem is the timeout property is working against my Sybase DB, but not against my HSQL DB. It sounds like it depends on the vendor, so I don't know for sure when it will work.
Is there a better way to generically test the DB connection & include some kind of timeout parameter?

Well, sadly, no. JPA's query hints are not mandatory, i.e. it's up to the implementor (EclipseLink, Hibernate, etc.) to enforce them or not. Moreover, even if the implementor does choose to recognize a certain query hint, it won't work if the database does not support that hint's functionality (some implementors are nice and tell you when a hint won't work against the current DB, while others fail silently). In the case of HSQLDB there's no way to set the query timeout. You can only set a timeout for the login (i.e. how long it should wait for a successful login before failing), but not for the duration of queries.
Things are not so grim however. On the one hand, even if you'd solve this, you'd still stumble over other issues with HSQLDB, as it does not support a lot of other nice functionalities that most dbs have. You should only use HSQLDB for basic integration/unit testing. For more involved testing, you can use the integrated MySQL Java library. You can find it here:
http://dev.mysql.com/doc/refman/5.0/en/connector-mxj.html
This is simply a packaged, fully working MySQL server with a Java API for start and stop, and it works on most major OSs (Windows, Linux, OS X, etc.). This way your integration tests can start a real MySQL server and run your code there, where things such as a query timeout hint will work fine.
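A vendor-independent workaround, which is not part of the answer above, is to enforce the timeout on the Java side by running the heartbeat in a separate thread and waiting on it for a bounded time. The sketch below uses only java.util.concurrent; the class and method names are hypothetical. Note that a timeout here only stops the caller from waiting: whether the underlying query is actually cancelled depends on how the driver reacts to thread interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: reports whether a check completes without error inside a time limit.
final class HealthCheck {

    static <T> boolean succeedsWithin(Callable<T> check, long timeoutMillis) {
        // Daemon thread so a stuck check cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "db-health-check");
            thread.setDaemon(true);
            return thread;
        });
        Future<T> future = executor.submit(check);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Stop waiting and ask the worker thread to give up.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The check itself failed, e.g. the query threw an exception.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With the code from the question, the call might look like HealthCheck.succeedsWithin(() -> heartbeatQuery.getResultList(), timeoutMillis), where timeoutMillis is whatever limit you choose; creating the query inside the callable avoids sharing an EntityManager across threads.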
