Strategy for locking rows for updates with multiple processing nodes


I have a Spring Boot application backed by PostgreSQL.
The application updates rows in a database. Previously it used SELECT FOR UPDATE SKIP LOCKED to fetch new data and perform the updates in a single thread. This prevented several nodes from updating the same row (and, as a consequence, sped up the update process for documents).
Now, to reduce processing time, selecting rows and requesting updates (from an external service) run in separate threads: multiple workers use RestTemplate to smooth out I/O waiting time and fill a queue with ready updates, while another worker thread does the post-processing by taking items from the queue and inserting them into the database. So the select and the update now happen in separate threads and in different transactions.
What is a good way to preserve the behavior of SELECT FOR UPDATE SKIP LOCKED when processing is split across threads, so that different nodes do not update the same rows?
I am thinking about adding a few columns to the table, such as update-status, update-started and node, and selecting with WHERE STATUS != 'IN PROGRESS'. To avoid holding rows forever if the app crashes, I would add a condition like update-started < now() - '20 minutes'::INTERVAL so that stale rows become selectable again.
The second option is to pass the pooled connection together with the document to the other thread; maybe that is the better solution. As far as I know, I can then also see which node holds the lock, which is good for monitoring.
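The status-column idea above could be sketched roughly as follows. This is only an illustration under assumptions: the table name documents, the columns update_status, update_started and update_node, and the status values NEW, IN_PROGRESS and DONE are all hypothetical. The claim is a single UPDATE ... RETURNING statement, so the row lock is held only for the duration of that statement, and the status column plus a 20-minute lease keep other nodes away afterwards.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Sketch only: table and column names are hypothetical.
final class LeaseClaim {

    // Claims up to N rows that are new, or whose lease is older than 20 minutes.
    // SKIP LOCKED lets concurrent nodes pass over rows that are being claimed right now.
    static final String CLAIM_SQL =
        "UPDATE documents SET update_status = 'IN_PROGRESS', "
      + "update_started = now(), update_node = ? "
      + "WHERE id IN ("
      + "  SELECT id FROM documents "
      + "  WHERE update_status = 'NEW' "
      + "     OR (update_status = 'IN_PROGRESS' "
      + "         AND update_started < now() - interval '20 minutes') "
      + "  ORDER BY id LIMIT ? FOR UPDATE SKIP LOCKED) "
      + "RETURNING id";

    // Marks a row as done only if this node still owns the lease.
    static final String FINISH_SQL =
        "UPDATE documents SET update_status = 'DONE' "
      + "WHERE id = ? AND update_node = ? AND update_status = 'IN_PROGRESS'";

    // Returns the ids that now belong to nodeId; other nodes will skip them.
    static List<Long> claim(Connection conn, String nodeId, int batchSize) throws SQLException {
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(CLAIM_SQL)) {
            ps.setString(1, nodeId);
            ps.setInt(2, batchSize);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
            }
        }
        return ids;
    }
}
```

The update_node column also serves the monitoring wish from the question, since it records which node holds each row. If FINISH_SQL updates zero rows, the lease expired and another node took the row over, so the late result should probably be discarded.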

Related

use database as a queue of tasks

In one of our java applications (based on postgresql db), we have a database table that maintains a list of tasks to be executed.
Each row has a json blob for the details of a task as well as scheduled time value.
We have a few java workers/threads whose jobs are to search for tasks that are ready for execution (based on its schedule value), execute and delete them from the table. Execution of a task may take a few seconds.
The problem is, more than one worker may grab the same row, causing duplicate execution of a task, which is something we want to avoid.
One approach is, when doing select to grab a row, do it with FOR UPDATE to lock the row, supposedly preventing other worker from grabbing the same row that's locked.
My concern with this approach is that the row is only locked while the select transaction is executing in the db (according to this). While the java code is actually executing the selected row/task, the lock is gone and another worker can grab the row again.
Can someone shed some light on whether the above approach is going to work for sure? Thanks!

Treat the DB calls as atomic instructions and design lock free algos around your table, using updates to change a boolean column "in-progress" from false to true. Could also just be a state int (0=avail, 1=inprogress, N=resultcode).
Make sure you have a partial index on state 0 (and possibly 1 to recover from crashes to find tasks in progress), so that the ...where state=0 remains selective and fast (on top of the scheduled time index of course).
Hope this helps.
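A minimal sketch of this compare-and-set style claim over JDBC, assuming a hypothetical table tasks with columns id, state and scheduled_at (none of these names come from the question). The UPDATE only matches while state is still 0, so when two workers race for the same task, one sees an update count of 1 and the other sees 0.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch only: table, column and index names are hypothetical.
final class TaskStateCas {
    static final int AVAILABLE = 0;
    static final int IN_PROGRESS = 1;

    // Partial index: only available tasks are indexed, so the lookup stays small.
    static final String PARTIAL_INDEX_DDL =
        "CREATE INDEX IF NOT EXISTS tasks_available_idx "
      + "ON tasks (scheduled_at) WHERE state = 0";

    // The WHERE clause is the "compare" and the SET is the "swap".
    static final String CLAIM_SQL =
        "UPDATE tasks SET state = 1 WHERE id = ? AND state = 0";

    // Returns true only for the single worker whose UPDATE changed the row.
    static boolean tryClaim(Connection conn, long taskId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(CLAIM_SQL)) {
            ps.setLong(1, taskId);
            return ps.executeUpdate() == 1;
        }
    }
}
```

A worker that gets false from tryClaim simply moves on to the next candidate task, and no lock is held while the task itself runs.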

When one thread has successfully locked the row on a given connection, another one attempting to obtain a lock on the row on a different connection should fail. You should issue the select-for-update with some kind of no-wait clause to request immediate failure if the row is locked.
Now, this doesn't solve the query vs. lock race, as a failed lock may interrupt a thread's execution. You can solve that by doing the following in each execution:
1. Select all records with new tasks (regardless of whether they're being processed or not).
2. For each new task returned in [1], run a matching select-for-update, then continue with processing the task if the lock succeeds.
3. If any lock attempt fails, skip the task without failing the entire process.
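The per-task lock attempt could be sketched like this, assuming PostgreSQL (which reports a failed NOWAIT lock with SQLSTATE 55P03, lock_not_available) and a hypothetical table tasks with an id column. The task runs inside the same transaction that holds the row lock, which also addresses the question's concern that the lock disappears once the select transaction ends.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.LongConsumer;

// Sketch only: the table name is hypothetical, the SQLSTATE is PostgreSQL-specific.
final class NoWaitLock {
    static final String LOCK_NOT_AVAILABLE = "55P03";
    static final String LOCK_SQL = "SELECT id FROM tasks WHERE id = ? FOR UPDATE NOWAIT";

    static boolean isLockNotAvailable(SQLException e) {
        return LOCK_NOT_AVAILABLE.equals(e.getSQLState());
    }

    // Returns true if the task was processed, false if it was skipped.
    static boolean processIfUnlocked(Connection conn, long taskId, LongConsumer work)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // the lock lives until commit or rollback
        try (PreparedStatement ps = conn.prepareStatement(LOCK_SQL)) {
            ps.setLong(1, taskId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {      // row was already deleted by another worker
                    conn.rollback();
                    return false;
                }
            }
            work.accept(taskId);       // runs while the row lock is still held
            conn.commit();
            return true;
        } catch (SQLException | RuntimeException e) {
            conn.rollback();           // also clears the aborted transaction state
            if (e instanceof SQLException && isLockNotAvailable((SQLException) e)) {
                return false;          // someone else holds the lock: skip this task
            }
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Holding a transaction open for a task that takes a few seconds is a trade-off: it ties up one connection per running task, so the pool must be sized for the number of workers.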

Multiple node behavior

I have written code that pulls database rows and processes them.
Right now I select 100 rows and set their status to ProcessInprogress; after successful processing I set the status of the 100 rows to Processed one by one.
This process is scheduled to run every 2 min under quartz.
Question: what do I have to take care of so that this process runs successfully when my code is deployed on multiple nodes, so that another node does not process the same data twice?
Please suggest:)

Right now I am locking the records fetched by the polling process so that the same records are not fetched by the polling process of another instance or node. But this makes the other process wait for the lock to be released.
Is there anything I can do so that the other process moves on to fetch the next 100 records instead of waiting on the lock?
Please share any further suggestions for handling multiple nodes for a database polling process running in java.
Thanks
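One possible shape for a non-waiting batch fetch is sketched below. The database is not named in the question, so this assumes one that supports LIMIT together with FOR UPDATE SKIP LOCKED (PostgreSQL 9.5+ or MySQL 8+, for example). The table name records and the status value New are hypothetical; ProcessInprogress is the status mentioned above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Sketch only: table name and the 'New' status value are hypothetical.
final class BatchPoller {
    // Rows locked by another node are skipped, so this node gets the next free ones.
    static final String SELECT_SQL =
        "SELECT id FROM records WHERE status = 'New' "
      + "ORDER BY id LIMIT ? FOR UPDATE SKIP LOCKED";

    static final String MARK_SQL =
        "UPDATE records SET status = 'ProcessInprogress' WHERE id = ?";

    // Selects and marks a batch in one transaction, then releases the locks.
    static List<Long> claimBatch(Connection conn, int batchSize) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        List<Long> ids = new ArrayList<>();
        try {
            try (PreparedStatement select = conn.prepareStatement(SELECT_SQL)) {
                select.setInt(1, batchSize);
                try (ResultSet rs = select.executeQuery()) {
                    while (rs.next()) {
                        ids.add(rs.getLong(1));
                    }
                }
            }
            try (PreparedStatement mark = conn.prepareStatement(MARK_SQL)) {
                for (long id : ids) {
                    mark.setLong(1, id);
                    mark.addBatch();
                }
                mark.executeBatch();
            }
            conn.commit(); // after this, the status column protects the rows
            return ids;
        } catch (SQLException | RuntimeException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Each quartz run on each node would call claimBatch(conn, 100) and then process only the returned ids.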

better way to process records, select and update lock

I had a problem in my software that sometimes caused a lock on the SQL server.
This was caused by a process that selects a group of records and starts processing them.
Based on some values and a calculation the records get updated.
While a record is being updated, the page containing that record is locked by the SQL server against selects, which results in a lock that never resolves itself.
To solve the problem we created a second table from which we select. The main table is copied into it before the process starts, so the table being updated is never the one being selected from, and no lock can occur.
What I am looking for is a simpler and better solution to this problem, because to me this feels like a workaround for something I'm doing the wrong way, and I would really like to improve the processing.

Try to change TRANSACTION ISOLATION LEVEL on database. Here is link.

I guess your default isolation level is set to repeatable read, which causes the select to set a shared lock on the returned records; a deadlock then happens when concurrent requests come in. To solve this, you should use a locking select (to lock the records with an X lock rather than an S lock).

Multiple thread selecting row from database optimisation

I have a java application where 15 threads each select a row from a table with 11,000 records through a synchronized method called getNext(). The threads are getting slow at selecting a row and take a huge amount of time. Each thread follows this process:
1. The thread checks whether a row with the resume column set to 1 exists.
A. If it exists, the thread takes the id of that row and uses it to select another row with an id greater than that id.
B. Otherwise it selects a row with an id greater than 0.
2. The last row received from step 1 above is marked by setting its resume column to 1.
3. The thread takes the row data and works on it.
Question:
1. How can multiple threads access the same table, selecting rows that another thread has not selected, and be fast?
2. How can threads be made to resume, in case of a crash, at the last row that was selected by any of the threads?

1.:
It seems the multiple database operations in getNext() are the bottleneck. If the data isn't changed by an outside source, you could read "id" and "resume" of all rows and cache them. Then you would have only one query and could operate purely in memory for reads. This would save a lot of expensive DB calls in getNext().
2.:
Basically you need some sort of transaction, or at least another column that gets updated when a thread has finished processing that row. The processing and the update need to happen in a single transaction. If something goes wrong before the transaction finishes, you can roll back to the state in which the row wasn't processed.

If the threads are all on the same machine, they could use a shared data structure to avoid working on the same thing instead of synchronization. But the following assumes the threads are on different machines (maybe different members of an application server cluster) and can only communicate via the database.
Remove synchronization on the getNext() method. When setting the resume flag to 1 (step 2), do so atomically for the chosen row: update table set resume=1 where id=? and resume=0, then commit. Only one thread will succeed at this, and the thread that does gets that unit of work. At the same time, set a resume time; if the time elapsed since the resume time exceeds some maximum, assume the thread working on that unit of work has crashed and set the resume flag back to 0. After the work is finished, set the resume time to null, or otherwise mark the work as done.
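For the same-machine case mentioned at the start of this answer, a shared concurrent queue can replace the synchronized getNext() entirely. The sketch below is an in-memory illustration only: the class and method names are made up, the row ids are assumed to have been loaded with one query beforehand, and the "work" is just a counter. It does not cover crash recovery, which still needs a persisted marker such as the resume flag.

```java
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: hands out row ids to worker threads without a synchronized method.
final class SharedWorkQueue {

    // Returns, for every row id, how many times a worker processed it.
    static Map<Long, Integer> processAll(List<Long> rowIds, int threads) {
        Queue<Long> pending = new ConcurrentLinkedQueue<>(rowIds);
        Map<Long, Integer> timesProcessed = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                Long id;
                // poll() is atomic: each id is removed by exactly one worker.
                while ((id = pending.poll()) != null) {
                    timesProcessed.merge(id, 1, Integer::sum); // stand-in for real work
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return timesProcessed;
    }
}
```

With 15 threads and 11,000 ids, every id should end up with a count of exactly 1, because no two workers can poll the same element.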

Well, I would think about several issues here:
1. Are you keeping status in your DB? I would look for an approach where you call a select for update that filters by inactive status (be sure to get just one row in the select) and immediately update the row to active (in the same transaction). It would be nice to know what DB you're using; I'm not sure "select for update" is always an option.
2. Process the row and, when you're finished, update it to finished status.
3. Be sure to keep a timestamp in the table to identify when you last changed the status. Make yourself a rule to decide when an active thread will be treated as lost.
4. Define other possible error scenarios (what happens if the process fails).
5. You would also need to analyze the scenario. How many rows does your table have? How many threads call it concurrently? How many inserts occur in a given time? Depending on this, you will have to see how the DB performs.
6. I'm assuming your getNext() is synchronized; with what I wrote in point 1 you might get around this.

Thread Priority on application using java

Can you help me with these problems:
A. We have a table on which read and write operations happen simultaneously. Writes are very frequent, so reads are very slow; sometimes my web application does not come up because of the heavy write load on this table. How could I handle such a scenario? Writes come from a different Java application while reads come from our web application, so the web application becomes very slow. Any ideas?
B. Writes to this table come from 200 threads. These threads take connections from a connection pool and write into the table, and this application runs 24/7. Is thread priority the issue here, stopping the read operations from the web application?
C. Can we have master-master replication for that table only, so that writes happen in one table and reads happen in the other, and every two minutes data migrates from one table to the other?
Please advise.
Thanks in advance.

Check connection pool size - maybe it's too small and your threads waste time waiting for connection from pool.
Check your database settings; if you are just running it with out-of-the-box parameters, there may be good room for improvement.
You probably need some kind of event-driven system: when a vehicle sends data, the DB is not updated directly; instead a message is added to some queue (e.g. JMS). Your app then caches data on startup, and updates both the cache and the database upon receiving this message. The key thing is that the only component that interacts with the DB is your app, and data changes only when you receive an event, so you don't need to query the DB to read the data, and you can do the updates in the background using only a few threads. There are quite good open-source messaging systems (e.g. Apache Active MQ) and caching libraries (e.g. EH Cache), so you can build a reasonably performant and fault-tolerant system without too much effort.
I guess introducing messaging would be a serious reengineering, so to solve your immediate problem replication might be the best solution: merge data from the updateable table into another one every 2 minutes, and the tracker will read that other table. Obviously this works well only if the web app reads the data and does not update it; otherwise you need to put in a lot of effort to keep the 2 tables in sync. A variation of that is batching: data from the vehicle is inserted into an intermediate table, and then every 2 minutes transferred into the main table from which the reader queries it; the intermediate table is cleaned after the transfer.

The one true way to solve this is to use a queue of write events and to stop the writing periodically so that the reader has a chance.
Create a queue for incoming write updates
Create an AtomicXXX (see java.util.concurrent.atomic) to use as a lock
Create a thread pool to read from the queue and execute the updates when the lock is unset
Use javax.swing.Timer to periodically set the lock and read the table data.
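The queue-plus-flag idea in the steps above might be sketched as follows. All names are hypothetical, the writes are plain Runnable objects, and the sketch is deliberately single-class: worker threads from a pool would call drain() in a loop, and whatever timer is used would call beginRead() and endRead() around the periodic read.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: queues writes and holds them back while a reader is active.
final class WriteGate {
    private final Queue<Runnable> pendingWrites = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean readerActive = new AtomicBoolean(false);

    // Producers enqueue writes instead of hitting the table directly.
    void submit(Runnable write) {
        pendingWrites.add(write);
    }

    void beginRead() {
        readerActive.set(true);
    }

    void endRead() {
        readerActive.set(false);
    }

    // Applies queued writes until the queue is empty or a reader takes the gate.
    // Returns the number of writes applied by this call.
    int drain() {
        int applied = 0;
        Runnable write;
        while (!readerActive.get() && (write = pendingWrites.poll()) != null) {
            write.run();
            applied++;
        }
        return applied;
    }

    int backlog() {
        return pendingWrites.size();
    }
}
```

Note that the flag only stops new writes from starting; a write already in flight when beginRead() is called still finishes, so a stricter version would have the reader wait for active writers to complete.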

Before trying anything too complicated try this perhaps:
1) Don't use Thread priorities, they are rarely what you want.
2) Set up your own priority scheme, perhaps simply by having a (priority) queue for both reads and writes where reads are prioritized. That is: add read and write requests to a single queue and have them block or be notified of the result.
3) check your database features to optimize write heavy tables
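Point 2 could be sketched with a PriorityBlockingQueue whose comparator puts reads ahead of writes and keeps arrival order within each kind. The class, enum and field names are made up for illustration; a real version would carry the actual request and a way to hand back the result.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: a single queue where reads are always served before writes.
final class ReadFirstQueue {
    enum Kind { READ, WRITE } // READ sorts first because it is declared first

    static final class Request {
        final Kind kind;
        final long seq;      // arrival order, used as a tie-breaker
        final String label;

        Request(Kind kind, long seq, String label) {
            this.kind = kind;
            this.seq = seq;
            this.label = label;
        }
    }

    private final AtomicLong nextSeq = new AtomicLong();
    private final PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>(
        16,
        Comparator.comparing((Request r) -> r.kind).thenComparingLong(r -> r.seq));

    void add(Kind kind, String label) {
        queue.add(new Request(kind, nextSeq.getAndIncrement(), label));
    }

    // Returns the label of the next request to serve, or null if the queue is empty.
    String pollLabel() {
        Request r = queue.poll();
        return r == null ? null : r.label;
    }
}
```

A strict read-first order can starve writes when reads never stop arriving, so some form of aging or a cap on consecutive reads may be needed in practice.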
