I have a Java application that loads data from a very large text file into MySQL via Hibernate. So far, the application seems to work OK - the data gets loaded.
Then I wanted to monitor the progress of the application while it was inserting data, so I wrote a small command-line utility that basically just runs select count(*) from my_table;.
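In essence, the utility is just this (a minimal sketch; the connection URL, credentials, and table name are placeholders for what the real tool reads from config):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RowCountMonitor {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; the real utility reads them from a config file.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/mydb", "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM my_table")) {
            rs.next();
            System.out.println("rows: " + rs.getLong(1));
        }
    }
}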
The first time this query is run (from either the CLI or MySQL Workbench), I get the correct number of records, as expected. All subsequent executions of this query return the exact same number, even though the data-loading application is still running!
If I stop and restart the MySQL process, querying for the number of records again shows the correct, up-to-date count, matching what the data-loading application reports.
I've never seen anything like this before. It looks like there is some strange MySQL caching issue going on here, and I'm concerned it may cause problems for other non-Hibernate applications that may want to access this database.
Does anyone know how to tweak MySQL (or possibly Hibernate) so that MySQL always shows what's being added to the database?
Technical details:
MySQL: 5.7.26, in a docker container, using InnoDB storage
Hibernate version: 5.4.2.Final
Spring version: 5.1.7.RELEASE
Calling FLUSH TABLES seems to resolve this, in the sense that after I flush the tables, I can see how many records have been added by the application.
I have been given a Java web application for which I have the source code. The application queries an Oracle database and returns data to the user in a web page. I need to update a value for the returned data in the database, without knowing the table or column names. Is there a general way to determine what query the application is submitting to return the data for a particular page, so I can find the right tables and columns?
I am new to Java and web development, so I'm not sure where to start looking.
Thanks!
Well, there's always the old-fashioned way of finding out: you can read the source code for the specific page you're looking at and identify the query being executed to retrieve the data. I'm assuming that's not what you're looking for, though.
Some other options include using JDBC's logging feature (Enabling and Using JDBC Logging) or JProfiler (the JDBC probe shows you all SQL statements in the events view). Once you find the SQL statement, you can use standard text-search features within your IDE to locate the specific code and make alterations.
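For the plain-JDBC route, one generic knob is the DriverManager log writer; whether actual SQL text shows up depends on the driver (Oracle's driver also has its own logging properties on top of this):

import java.io.PrintWriter;
import java.sql.DriverManager;

public class JdbcTraceSetup {
    public static void enableDriverTrace() {
        // Route whatever trace output the JDBC driver emits through
        // DriverManager to stdout; drivers vary in how much they log here.
        DriverManager.setLogWriter(new PrintWriter(System.out, true));
    }
}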
Hope that helps!
If you can run a controlled test (e.g., you are the only person on that web application), you could turn on SQL tracing on the DB connection and then run your transaction several times. To do this:
look at all the connections from that application using v$session -- assuming this is your test environment, you can control the number of connections by tweaking your connection pool settings (e.g., set min and max connections to 1).
turn on the 10046 trace (see https://oracle-base.com/articles/misc/sql-trace-10046-trcsess-and-tkprof -- there are many other examples).
The 10046 trace will show you what the application is doing -- SQL by SQL. You can even set the level to 12 to get the bind variable values (assuming you are using prepared statements); a rough sketch of issuing the trace commands follows.
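For example, if you can get at the connection under test (tracing someone else's session would go through DBMS_MONITOR instead, as the linked article describes):

import java.sql.Connection;
import java.sql.Statement;

public class TraceHelper {
    static void traceSession(Connection connection) throws Exception {
        try (Statement stmt = connection.createStatement()) {
            // Level 12 = include bind variable values and wait events.
            stmt.execute("ALTER SESSION SET events '10046 trace name context forever, level 12'");
            // ... run the transaction you want to inspect ...
            stmt.execute("ALTER SESSION SET events '10046 trace name context off'");
        }
    }
}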
I have a Java Spring server, working with Hibernate and MySQL, and I have a view in my database, which I've mapped as an entity with Hibernate.
The view is completely flat, no references to other entities. No fetches etc.
I can query the view using a simple select with no problems, both directly on the sql server using "Workbench", and using hql.
Locally I can also use "order by" clauses with no noticeable impact on the performance.
My problem is that when I move to a dev server, the "order by" clause causes the query to be extremely slow, which causes a timeout.
This happens only when the server makes the hql query.
I can still query the dev DB directly using "Workbench" with no performance problems.
The only differences between the two servers (that I can think of) are that the dev server is deployed on Google's App Engine, and that the JDBC drivers are different.
On the local server I use com.mysql.jdbc.Driver, and on App Engine com.mysql.jdbc.GoogleDriver.
My local DB is a MySQL database, and the dev one is on Google's Cloud SQL, which is also MySQL (pretty much).
On my dev DB I have about 28,000 rows in total in the view, and on my local DB about 21,000, so that also shouldn't make a difference.
I can add code if you think it could help. I wasn't sure where to start.
Thank you very much in advance.
I would enable 'show_sql=true' to see the generated queries (a snippet showing where that setting goes is below), and I would also use the 'GoogleDriver' against the local DB to rule out any issues with the driver.
Would it be possible to connect to the dev DB from your local workspace? If so, try that using 'com.mysql.jdbc.Driver' to rule out any issues with the dev DB itself.
If possible, also connect to the dev DB from a MySQL client such as Workbench and run the query directly against it.
28K rows is far too few to cause problems with an 'order by' clause.
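On the 'show_sql' point, a minimal sketch of enabling it when the SessionFactory is built programmatically ('configuration' is assumed to be the org.hibernate.cfg.Configuration you already construct; the property names are standard Hibernate settings):

import java.util.Properties;
import org.hibernate.cfg.Configuration;

public class ShowSqlConfig {
    static void enableSqlLogging(Configuration configuration) {
        Properties props = new Properties();
        props.setProperty("hibernate.show_sql", "true");   // print each generated SQL statement
        props.setProperty("hibernate.format_sql", "true"); // pretty-print the output
        configuration.addProperties(props);
    }
}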
Edit
If you have not enabled 'connector-j' before deploying to App Engine, it will still try to use the standard 'mysql' driver and will eventually time out.
<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
...
<use-google-connector-j>true</use-google-connector-j>
</appengine-web-app>
More info.
You might want to check whether an index on the ORDER BY field exists in one environment but not the other. That can have a profound effect on both ordering and searching.
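If the index turns out to be missing on the dev side, creating it on the view's base table is the usual fix. A sketch, with all identifiers as placeholders:

import java.sql.Connection;
import java.sql.Statement;

public class AddOrderByIndex {
    static void createIndex(Connection connection) throws Exception {
        try (Statement stmt = connection.createStatement()) {
            // Index the column the view's ORDER BY sorts on, in the underlying table.
            stmt.execute("CREATE INDEX idx_base_order_col ON base_table (order_col)");
        }
    }
}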
I'm developing a Spring MVC web application using Windows 7, Eclipse Juno, EclipseLink JPA as the ORM, and GlassFish as the application server, with Oracle 11g. While working with EclipseLink I noticed that when I update a table manually by executing an update PL/SQL query, it doesn't have any effect on entities already retrieved by EclipseLink until I restart the server. This happens even though I disabled the EclipseLink cache by putting <shared-cache-mode>NONE</shared-cache-mode> in persistence.xml and using EntityManager.clear(), EntityManager.close(), and @Cacheable(false).
Then I noticed that when I update tables using the Oracle SQL Developer table designer, it works fine and the entities show the updated information. So I checked the SQL Developer log to see what query it uses to update rows, and I saw that it uses ORA_ROWSCN and ROWID in the where clause. After that, I used exactly the same where clause as SQL Developer to update the tables, but the entities still showed old information.
I'm wondering what factors are involved here such that EclipseLink does not fetch real-time data from the database, yet after updating a table with the SQL Developer designer it does show real-time data. It seems that modifying table data with the SQL Developer table designer also marks the table as changed via some database feature, and EclipseLink reads that mark before hitting the table.
Also, for clarification: does anyone know what steps EclipseLink goes through before deciding to hit the database when a TypedQuery is executed? I'm curious where it stores the cached entities, since the cache only resets when I restart the computer; I tried restarting GlassFish, killing the Java process, and logging off the current user, but none of that worked. Why is EclipseLink still caching entities when I have configured it not to use any caching? Is it possible to completely turn off caching in EclipseLink?
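For reference, the per-entity part of what I tried looks like this (the entity name is just an example):

import javax.persistence.Cacheable;
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
@Cacheable(false) // opt this entity out of the shared (L2) cache, on top of <shared-cache-mode>NONE</shared-cache-mode>
public class MyFlatEntity {
    @Id
    private Long id;
}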
I have a service method on an api that can be called to check the health of my database connection.
The method is pulling the query string from a properties file (depends on DB vendor, using Sybase and HSQL for now, more in future), and executing it. Then the method lets the caller know if it succeeded or failed.
In addition to this, I was using the Query.setHint("javax.persistence.query.timeout") to set a timeout on the query:
javax.persistence.EntityManager entityManager;
...
// Run the vendor-specific heartbeat query natively
Query heartbeatQuery = entityManager.createNativeQuery(heartbeatQueryString);
// JPA-standard hint; the value is in milliseconds, but enforcement is vendor-dependent
heartbeatQuery.setHint("javax.persistence.query.timeout", heartbeatTimeout);
heartbeatQuery.getResultList();
My problem is the timeout property is working against my Sybase DB, but not against my HSQL DB. It sounds like it depends on the vendor, so I don't know for sure when it will work.
Is there a better way to generically test the DB connection & include some kind of timeout parameter?
Well, sadly, no. JPA's query hints are not mandatory, i.e. it's up to the implementor (EclipseLink, Hibernate, etc.) to enforce them or not. Moreover, even if the implementor does choose to recognize a certain query hint, if that hint's functionality is not supported by the database then it won't work (here some implementors are nice and tell you when a certain hint won't work against the current DB, while others fail silently). In the case of HSQLDB, there is no way to set a query timeout: you can only set a timeout for the login (i.e. how long it should wait for a successful login before failing), not for the duration of queries.
Things are not so grim, however. On the one hand, even if you solved this, you would still stumble over other issues with HSQLDB, as it does not support a lot of other functionality that most databases have. You should only use HSQLDB for basic integration/unit testing. For more involved testing, you can use the embedded MySQL Java library (Connector/MXJ). You can find it here:
http://dev.mysql.com/doc/refman/5.0/en/connector-mxj.html
This is simply a packaged, fully working MySQL server with a Java API for starting and stopping it, and it works on most major OSs (Windows, Linux, OS X, etc.). This way your integration tests can start a real MySQL server and exercise your code there, where things like the query timeout hint will work fine.
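Aside from that, if all you need is a bound on how long the health check itself can take, one vendor-neutral approach is to enforce the timeout in the JVM rather than the database. This is my illustration, not a JPA facility:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import javax.persistence.EntityManager;

public class HeartbeatCheck {
    static boolean isAlive(EntityManager em, String heartbeatSql, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<List<?>> result = executor.submit(
                () -> em.createNativeQuery(heartbeatSql).getResultList());
            result.get(timeoutMillis, TimeUnit.MILLISECONDS); // give up if the DB hangs
            return true;
        } catch (Exception e) {
            return false; // timeout or query failure: report unhealthy
        } finally {
            executor.shutdownNow(); // note: the underlying JDBC call may keep running
        }
    }
}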
I have a scenario where the unit of work is defined as:
Update table T1 in database server S1
Update table T2 in database server S2
And I want the above unit of work to happen either completely or not at all (as with any database transaction). How can I do this? I searched extensively and found this post, which comes close to what I want, but it seems to be very specific to Hibernate.
I am using Spring, iBatis and Tomcat (6.x) as the container.
It really depends on how robust a solution you need. The minimal level of reliability for such a thing is XA transactions. To use that, you need, for starters, a database and JDBC driver that support it; then you can configure Spring to use it (here is an outline).
If XA isn't robust enough for you (XA has failure scenarios, e.g., a hardware failure during the second phase of the commit), then what you really need to do is put all the data in one database and then have a separate process propagate it. The data may be temporarily inconsistent, but it is recoverable.
Edit: What I mean is: put all of the data into one database -- either the first database, or a different database dedicated to this purpose. This database essentially becomes a queue from which the final data view is fed. The write to that database (assuming a decent database product) will either complete fully or fail completely. Then a separate thread polls that database and distributes any missing data to the other databases. If the process fails, when that thread starts up again it will continue the distribution process. The data may not exist everywhere you want it right away, but nothing gets lost.
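A rough sketch of that propagation step, with all table, column, and helper names invented for illustration:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

public class Propagator {
    // Reads unapplied rows from the "queue" database and pushes them to the
    // target databases; applyTo(...) is a hypothetical idempotent writer.
    void propagateOnce(DataSource queueDb, DataSource server1, DataSource server2) throws Exception {
        try (Connection q = queueDb.getConnection();
             PreparedStatement select = q.prepareStatement(
                 "SELECT id, payload FROM pending_updates WHERE applied = 0");
             ResultSet rs = select.executeQuery()) {
            while (rs.next()) {
                long id = rs.getLong("id");
                String payload = rs.getString("payload");
                applyTo(server1, payload); // writes must be idempotent so retries are safe
                applyTo(server2, payload);
                try (PreparedStatement done = q.prepareStatement(
                         "UPDATE pending_updates SET applied = 1 WHERE id = ?")) {
                    done.setLong(1, id);
                    done.executeUpdate();
                }
            }
        }
    }

    void applyTo(DataSource target, String payload) {
        // hypothetical: apply the recorded update to the target database
    }
}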
You want a distributed transaction manager. I like using Atomikos which can be run within a JVM.
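To give a feel for it, a minimal sketch of driving a JVM-level Atomikos transaction from code, assuming both servers are reached through XA-capable DataSources configured elsewhere (the class below is Atomikos's JTA transaction manager):

import com.atomikos.icatch.jta.UserTransactionManager;

public class TwoDatabaseUpdate {
    public void updateBoth() throws Exception {
        UserTransactionManager utm = new UserTransactionManager();
        utm.init();
        try {
            utm.begin(); // one distributed (XA) transaction spanning both servers
            // ... update table T1 on server S1 via its XA DataSource ...
            // ... update table T2 on server S2 via its XA DataSource ...
            utm.commit(); // two-phase commit across S1 and S2
        } catch (Exception e) {
            utm.rollback(); // neither update becomes visible
            throw e;
        } finally {
            utm.close();
        }
    }
}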