Is it possible to know the progress of a transaction.commit operation? - java

I'm using JPA in my application to bundle a series of inserts and updates into one commit() operation.
While that commit is running, is it possible to learn the progress of that operation (0-100%) so I can display that in a progress bar to the user?
I could split my updates into many commits, but that would make the entire job take longer.
Using EclipseLink as my JPA provider.

I think the only way to build something like that would be to use Hibernate's org.hibernate.stat.internal.StatisticsImpl class. You can programmatically read various metrics from an instance of this class. Hibernate statistics generation must be enabled for this to work; you can enable it by setting the property hibernate.generate_statistics to true.
The statistics instance has a method called getQueryExecutionCount() that you might be able to use to build a progress bar. It returns the number of queries executed so far by the current JPA EntityManagerFactory or Hibernate SessionFactory. If you keep polling that method while the queries are still running, you can show the percentage of completed queries by dividing the return value of getQueryExecutionCount() by the total number of queries that need to be processed. Here's a good tutorial that explains all the different metrics that are available.
I must also point out that turning on Hibernate statistics can slow your application down. So if you want to use this feature in production, you must also test whether that slowdown is acceptable.
EDIT: You could also choose to turn Hibernate statistics on only right before the queries run and turn them off once they've completed.
The StatisticsImpl class has a method called setStatisticsEnabled(boolean b) that you can use to toggle it programmatically.
EDIT 2: I'm assuming here that you are using Hibernate as the JPA provider. If not, I'll remove this answer.
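For illustration, here is a rough, hedged sketch of that polling loop. It assumes Hibernate is the provider (per EDIT 2); the expected total query count (EXPECTED_QUERIES) and the updateProgressBar callback are placeholders you would supply yourself:

import javax.persistence.EntityManagerFactory;
import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;

public class CommitProgressMonitor implements Runnable {
    private static final long EXPECTED_QUERIES = 500L; // hypothetical total, known to your job
    private final Statistics statistics;
    private volatile boolean running = true;

    public CommitProgressMonitor(EntityManagerFactory emf) {
        // unwrap the Hibernate SessionFactory backing the JPA EntityManagerFactory
        SessionFactory sessionFactory = emf.unwrap(SessionFactory.class);
        this.statistics = sessionFactory.getStatistics();
        this.statistics.setStatisticsEnabled(true); // enable only for this job
    }

    @Override
    public void run() {
        while (running) {
            long executed = statistics.getQueryExecutionCount();
            int percent = (int) Math.min(100L, executed * 100 / EXPECTED_QUERIES);
            updateProgressBar(percent); // placeholder for your UI callback
            try {
                Thread.sleep(200L); // poll a few times per second
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public void stop() {
        running = false;
        statistics.setStatisticsEnabled(false); // turn statistics off afterwards
    }

    private void updateProgressBar(int percent) {
        System.out.println("Progress: " + percent + "%");
    }
}

You would start this on a background thread before calling commit() and stop it afterwards.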

Related

How does Spring JPA with hibernate manage concurrent updates

I am trying to write an app that can run active-active, with both instances using the same DB. I was worried that in updating an entity (calling repo.findById(), entity.setX(), repo.save(entity)) there was a potential race condition where updates could overwrite each other if there was a large enough delay between the find and the save.
To test it I made a method that loads the entity, waits 10 seconds, then adds something to a list attribute and saves it. Calling that twice, I expected the second save to overwrite the first one, but surprisingly both updates were persisted.
While this is what I wanted, I was wondering if anyone knew how/why Spring did this, because I want to be sure it will do this every time. I understand it uses optimistic locking by default, but that requires the @Version annotation (which I don't have). Will this also work if the updates come from separate apps, or did it only work because both of them were from the same application?
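For reference, a minimal sketch of what enabling optimistic locking with JPA's @Version annotation typically looks like (the entity and field names here are illustrative, not from the question):

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class Account {
    @Id
    private Long id;

    private String x;

    @Version              // the provider compares this column on UPDATE;
    private Long version; // a stale value raises an OptimisticLockException

    public void setX(String x) {
        this.x = x;
    }
}

Without a @Version field, Hibernate performs no staleness check on UPDATE.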

How to simulate slow SQL database in test?

I have a bug that manifests itself only when the database is slow. It applies to even the simplest database operations (select, etc.).
I would like to create test, where I force my database to be slow. How to do that?
I use Spring Boot, Spring Data, Hibernate, MariaDB.
Ideally, I want the 'slow database' part to be completely contained in the test code, in my Java application. That way, the test will be completely automated and self-contained.
I want to slow down database access only for one test (not globally, for all access).
It was suggested that I introduce a database trigger (BEFORE SELECT) with a sleep.
But this is not flexible, because it slows down every access, not just the access in one test.
I see four possible solutions to this problem.
You don't have to create a slow database; you can create a slow connection to the database. If you run the database on a different (virtual) machine, there are tools that simulate shitty internet connections by delaying network responses randomly.
You can use the sleep(10) function provided by your database. This would require "injecting" it into the SQL query, or overriding a method for the purpose of the test and replacing SELECT with SELECT SLEEP(10).
Simulate a stress test on the database with mysqlslap if you use MySQL.
Another solution, a bit stupid tho: you can use spring-aop and attach a delay aspect around the DAO method executions with a small random sleep (see the sketch below). This way you have control over it, you don't have to modify existing code, and you let Spring do the delaying without any integration into the real system. Not that stupid after all; this one is quite flexible and I think I would go with it. It's the easiest to set up.
If it's stupid, but it works, it's not stupid.
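A minimal, hedged sketch of that aspect idea; the pointcut package, profile name, and delay range are assumptions you would adapt:

import java.util.concurrent.ThreadLocalRandom;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Component;

@Aspect
@Component
@Profile("slow-db") // only active when this profile is enabled
public class SlowDatabaseAspect {

    @Around("execution(* com.example.repository..*(..))") // hypothetical DAO package
    public Object delay(ProceedingJoinPoint pjp) throws Throwable {
        // random delay of 100-500 ms before forwarding to the real method
        Thread.sleep(ThreadLocalRandom.current().nextLong(100, 500));
        return pjp.proceed();
    }
}

A single test can then opt in by activating the profile, e.g. with Spring's @ActiveProfiles("slow-db") on the test class, leaving all other tests unaffected.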
I had a similar need when developing on a SQL Server DB.
To simulate a slow query you can use the following (but this is specific to SQL Server):
select * from TABLE
WAITFOR DELAY '00:00:45' -- to simulate 45 seconds of delay
If you want to write a Spring Boot test, maybe you can use the @SpyBean annotation:
@SpyBean
SomeBeanCallingTheDatabase someBeanCallingTheDatabase;

// ...
// in the test method
doAnswer(invocation -> {
    Thread.sleep(300L); // any value here
    return invocation.callRealMethod();
})
.when(someBeanCallingTheDatabase)
.find(any());

// call the service using the bean above
The easy answer is to write a test repository class that has a Thread.sleep embedded in it.
Credit: this answer was provided by https://stackoverflow.com/users/37213/duffymo in the comments.
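A minimal sketch of that idea; the repository interface and its find() method are illustrative stand-ins, not from the original question:

interface SomeRepository {
    Object find(long id);
}

public class SlowTestRepository implements SomeRepository {
    private final SomeRepository delegate;
    private final long delayMillis;

    public SlowTestRepository(SomeRepository delegate, long delayMillis) {
        this.delegate = delegate;
        this.delayMillis = delayMillis;
    }

    @Override
    public Object find(long id) {
        try {
            Thread.sleep(delayMillis); // simulate a slow database round-trip
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return delegate.find(id); // forward to the real repository
    }
}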

How to inspect data within a database transaction?

I'm running an integration test that executes some Hibernate code within a single transaction (managed by Spring). The test is failing with a duplicate key violation and I'd like to hit a breakpoint just before this and inspect the table contents. I can't just go into MySQL Workbench and run a SELECT query as it would be outside the transaction. Is there another way?
After reading your comments, my impression is that you are predominantly interested in how to hit a breakpoint and at the same time be able to examine the database contents. Under normal circumstances I would just suggest logging the SQL. With the breakpoint in mind, my suggestion is:
Reduce the isolation level to READ_UNCOMMITTED for the integration test.
Reducing the isolation level will allow you to see the uncommitted values in the database during debugging. As long as you don't have parallel activity within the integration test, it should be fine.
The isolation level can be set on a per-connection basis, so there is no need for anything to be done on the server.
One side note: if you are using Hibernate, even parallel activities may work fine when you reduce the isolation level, because Hibernate largely behaves as if it were in REPEATABLE_READ thanks to the transactional Level 1 cache.
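As a hedged illustration of the per-connection point: a throwaway JDBC session that lowers its own isolation level so it can read the test transaction's uncommitted rows while you sit at the breakpoint (the URL, credentials, and table name are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DirtyReadInspector {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/mydb", "user", "password")) {
            // this connection may now see rows the paused test has not committed
            con.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
            try (Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery("SELECT * FROM mytable")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // first column, for brevity
                }
            }
        }
    }
}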
The following can be run from Eclipse's "Display" view:
java.util.Arrays.deepToString(
        em.createNativeQuery("SELECT mystuff FROM mytable").getResultList().toArray())
    .replace("], ", "]\n");
This displays all the data, albeit not in a very user-friendly way; e.g. you will need to work out which columns the comma-separated fields correspond to.

Parallel updates to different entity properties

I'm using JDO to access Datastore entities. I'm currently running into issues because different processes access the same entities in parallel, and I'm unsure how to go about solving this.
I have entities containing values and calculated values: (key, value1, value2, value3, calculated)
The calculation happens in a separate task queue.
The user can edit the values at any time.
If the values are updated, a new task is pushed to the queue that overwrites the old calculated value.
The problem I currently have is in the following scenario:
User creates entity
Task is started
User notices an error in his initial entry and quickly updates the entity
Task finishes based on the old data (from step 1) and overwrites the entire entity, also removing the newly entered values (from step 3)
User is not happy
So my questions:
Can I make the task fail on update in step 4? Wrapping the task in a transaction does not seem to solve this issue in all cases due to eventual consistency (or, quite possibly, my understanding of Datastore transactions is just wrong).
Is using the low-level setProperty method the only way to update a single field of an entity, and will this solve my problem?
If none of the above, what's the best way to deal with a use case like this?
Background:
At the moment, I don't mind trading performance for consistency. I will care about performance later.
This was my first AppEngine application, and because it was a learning process, it does not use some of the best practices. I'm well aware that, in hindsight, I should have thought longer and harder about my data schema. For instance, none of my entities use ancestor relationships where they would be appropriate. I come from a relational background and it shows.
I am planning a major refactoring, probably moving to Objectify, but in the meantime I have a few urgent issues that need to be solved ASAP. And I'd like to first fully understand the Datastore.
Obviously JDO comes with optimistic concurrency checking (should the user enable it) for transactions, which would prevent or reduce the chance of such things. Optimistic concurrency is equally applicable to relational datastores, so you likely know what it does.
Google's JDO plugin obviously uses the low-level API's setProperty() method under the covers; the log even tells you what low-level calls are made (in terms of PUT and GET). Moving to some other API will not, on its own, solve such problems.
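For illustration, a hedged sketch of what a single-property update with the low-level API looks like; note that, as said above, this read-modify-write is still racy on its own (the property and class names are illustrative):

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Key;

public class SingleFieldUpdate {
    public void updateCalculated(Key key, double calculated) throws EntityNotFoundException {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        Entity entity = ds.get(key);                  // read the whole entity
        entity.setProperty("calculated", calculated); // modify one property
        ds.put(entity);                               // write the whole entity back
    }
}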
Whenever you need to handle write conflicts in GAE, you almost always need transactions. However, it's not just as simple as "use a transaction":
First of all, make sure each logical unit of work can be defined in a transaction. There are limits to transactions: no queries without ancestors, and only a certain number of entity groups can be accessed. You might find you need to do some extra work before the transaction starts (i.e., look up the keys of entities that will participate in the transaction).
Make sure each unit of work is idempotent. This is critical. Some units of work are automatically idempotent, for example "set my email address to xyz". Some units of work are not automatically idempotent, for example "move $5 from account A to account B". You can make transactions idempotent by creating an entity before the transaction starts, then deleting the entity inside the transaction. Check for existence of the entity at the start of the transaction and simply return (completing the txn) if it's been deleted.
When you run a transaction, catch ConcurrentModificationException and retry the process in a loop (see the sketch below). Now when any txn gets conflicted, it will simply retry until it succeeds.
The only bad thing about collisions here is that they slow the system down and waste effort during retries. However, you will get at least one completed transaction per second (maybe a bit less if you have XG transactions) of throughput.
Objectify4 handles the retries for you; just define your unit of work as a run() method and run it with ofy().transact(). Just make sure your work is idempotent.
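A minimal sketch of that retry loop; the unit of work is a placeholder and, as stressed above, must be idempotent:

import java.util.ConcurrentModificationException;

public class TransactionRetry {
    public static void runWithRetries(Runnable unitOfWork, int maxRetries) {
        for (int attempt = 0; ; attempt++) {
            try {
                unitOfWork.run(); // begin txn, do work, commit - all inside here
                return;           // success
            } catch (ConcurrentModificationException e) {
                if (attempt >= maxRetries) {
                    throw e; // give up after too many conflicted attempts
                }
                // fall through and retry; an optional short backoff goes here
            }
        }
    }
}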
The way I see it, you can either prevent the first task from updating the object because certain values have changed since the task was first launched,
or you can embed the object's values within the task request so that the second calc task will restore the object's state with consistent value and calculated members.

Hibernate Dirty Object usage

I have a Hibernate entity in my code. I fetch it and, based on the value of one of its properties (say isProcessed), go on and:
change the value of isProcessed to "Yes" (the property that I checked)
add some task to a DelayedExecutor.
In my performance test, I found that if I hammer the function, a classic dirty-read scenario happens and I add too many tasks to the executor, all of which get executed. I can't deduplicate by checking the equality of the objects in the queue based on anything; Java will just execute all of the tasks that get added.
How can I use Hibernate's dirty-object machinery to check isProcessed before adding the task to the executor? Would that work?
I hope I have been clear enough.
If you can do all of your queries to dispatch your tasks using the same Session, you can probably patch something together. The caveat is that you have to understand how hibernate's caching mechanisms (yes, that's plural) work. The first-level cache that is associated with the Session is going to be the key here. Also, it's important to know that executing a query and hydrating objects will not look into and return objects from the first-level cache...the right hand is not talking to the left hand.
So, to accomplish what you're trying to do (assuming you can keep using the same Session...if you can't do this, then I think you're out of luck) you can do the following:
execute your query
for each returned object, re-load it with Session's get method
check the isProcessed flag and dispatch if need be
By calling get, you'll be sure to get the object from the first-level cache...where all the dirty objects pending flush are held.
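A hedged sketch of those three steps, assuming one long-lived Session; the entity and its mapping are illustrative stand-ins:

import java.util.List;
import org.hibernate.Session;

public class TaskDispatcher {
    @SuppressWarnings("unchecked")
    public void dispatchPending(Session session) {
        // step 1: execute your query
        List<MyTask> candidates = session
                .createQuery("from MyTask t where t.processed = false")
                .list();

        for (MyTask candidate : candidates) {
            // step 2: re-load with get(), which consults the first-level
            // cache where the dirty, pending-flush instances live
            MyTask current = (MyTask) session.get(MyTask.class, candidate.getId());

            // step 3: check the flag and dispatch only if still unprocessed
            if (!current.isProcessed()) {
                current.setProcessed(true); // mark before handing to the executor
                dispatch(current);
            }
        }
    }

    private void dispatch(MyTask task) {
        // hand the task to the DelayedExecutor here
    }
}

class MyTask {
    private Long id;
    private boolean processed;

    Long getId() { return id; }
    boolean isProcessed() { return processed; }
    void setProcessed(boolean processed) { this.processed = processed; }
}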
For background, this is an extremely well-written and helpful document about hibernate caching.
