In one of our java applications (based on postgresql db), we have a database table that maintains a list of tasks to be executed.
Each row has a JSON blob with the details of a task, as well as a scheduled time value.
We have a few Java workers/threads whose job is to search for tasks that are ready for execution (based on their scheduled time), execute them, and delete them from the table. Execution of a task may take a few seconds.
The problem is, more than one worker may grab the same row, causing duplicate execution of a task, which is something we want to avoid.
One approach is, when selecting to grab a row, to do it with FOR UPDATE to lock the row, supposedly preventing other workers from grabbing the same row while it's locked.
My concern with this approach is that the row is only locked while the SELECT's transaction is being executed in the db (according to this); by the time the Java code is actually executing the selected task, the lock is gone and another worker can grab it again.
Can someone shed some light on whether the above approach is going to work for sure? Thanks!
Treat the DB calls as atomic instructions and design lock-free algorithms around your table, using updates to flip a boolean column "in_progress" from false to true. It could also just be a state int (0 = available, 1 = in progress, N = result code).
Make sure you have a partial index on state 0 (and possibly on 1, to find in-progress tasks when recovering from crashes), so that the ...WHERE state = 0 predicate remains selective and fast (on top of the scheduled-time index, of course).
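A minimal JDBC sketch of that atomic claim, assuming a tasks table with id and state columns (the names are illustrative, not from your schema):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Try to claim one available task; returns true iff this worker now owns it. */
static boolean claimTask(Connection conn, long taskId) throws SQLException {
    // Only the worker whose UPDATE reports 1 affected row owns the task;
    // every other worker sees 0 rows affected and simply moves on.
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE tasks SET state = 1 WHERE id = ? AND state = 0")) {
        ps.setLong(1, taskId);
        return ps.executeUpdate() == 1;
    }
}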
Hope this helps.
When one thread has successfully locked the row on a given connection, another one attempting to obtain a lock on the same row from a different connection should fail. You should issue the select-for-update with some kind of no-wait clause to request immediate failure if the row is locked.
Now, this doesn't solve the query vs lock race, as a failed lock may interrupt a thread's execution. You can solve that by (in each execution):
1. Select all records with new tasks (regardless of whether they're being processed or not).
2. For each new task returned in step 1, run a matching select-for-update with the no-wait clause, and continue with processing the task only if the lock succeeds.
3. If any lock attempt fails, skip that task without failing the entire process.
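On Postgres the no-wait clause is NOWAIT; a hedged JDBC sketch of that loop (table and column names assumed):

import java.sql.*;
import java.util.List;

static void processDueTasks(Connection conn, List<Long> dueIds) throws SQLException {
    conn.setAutoCommit(false);
    for (long id : dueIds) {                      // ids from the initial broad SELECT
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT payload FROM tasks WHERE id = ? FOR UPDATE NOWAIT")) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    // Lock held until commit: execute the task, then delete the row.
                }
            }
            conn.commit();                        // releases the row lock
        } catch (SQLException lockBusy) {
            conn.rollback();                      // another worker holds it: skip
        }
    }
}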
Right now, I am thinking of implementing multi-threading to take tasks corresponding to records in the DB tables. The tasks will be ordered by created date. I am stuck on handling the case where, when one task (record) is taken, the other threads should skip it and chase the next one.
Is there any way to do this? Many thanks in advance.
One solution is to make a synchronized pickATask() method, so that free threads can only pick a task through this method.
This will force the other free threads to wait for their turn.

// The intrinsic lock serializes pick-up, so no two threads get the same task.
// taskQueue is a hypothetical field holding the not-yet-claimed tasks.
public synchronized NeedTask pickATask() {
    return taskQueue.poll(); // next unclaimed task, or null if none remain
}
Depending on how heavy your data insertion is, you can either use global synchronized variables or use a table in the database itself to record values like (String task, boolean taken, boolean finished, int ownerPid).
Using the database to check the status tends to give you faster code at large scale, but if you do not have too many threads, or this code will run just once, the synchronized global-variable approach may be the better solution.
In my opinion, if you create multiple threads to read from the DB, every thread is involved in I/O and in some kind of serialization while reading rows from the same table. That is not scalable and also has a performance impact.
My solution would be a single producer thread that reads rows in batches, creates tasks, and submits them for execution (to a thread pool of workers that do the actual work). Now we have two modules that can be scaled independently. On the producer side, if required, we can create multiple threads where every thread reads a partition of the data; for example, thread 1 reads rows 0-100 and thread 2 reads rows 101-200.
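A rough sketch of that split; Task, fetchBatch, and sleepQuietly stand in for application-specific pieces:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Single producer reads rows in batches and hands tasks to a worker pool.
static void runPipeline() {
    ExecutorService workers = Executors.newFixedThreadPool(8);
    while (true) {
        List<Task> batch = fetchBatch(100);       // one DB reader, no contention
        for (Task t : batch) {
            workers.submit(t::execute);           // actual work done by the pool
        }
        if (batch.isEmpty()) sleepQuietly(3_000); // back off when there is no work
    }
}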
It depends on how you manage your communication between Java and the DB: direct JDBC calls, Hibernate, Spring Data, or another ORM framework. If you use plain JDBC you can manage this whole issue at the DB level. You will need to configure your DB to lock the record upon writing, i.e. once a record has been selected for update, no one can read it until the update is finished.
If you use an ORM framework (such as Hibernate, for example), the framework lets you manage concurrency issues; see optimistic and pessimistic locking. Pessimistic locking does approximately what is described above: once the record is being updated, no one can read it until the update is finished. Optimistic locking uses a versioning mechanism: multiple threads can try to update the record, but only the first one succeeds, and the rest get an exception saying that they are now working with stale data and should read the record again.
The versioning mechanism adds a version column, usually a number or sometimes a timestamp. Each thread reads the record and, upon update, checks whether the version in the DB is still the same. If so, no one else has updated the record, and the update changes the version (increments it, or sets the current timestamp). If the version has changed, someone else updated the record since it was read, so this thread holds a stale copy and is not allowed to update it. Optimistic locking shows better performance in environments where reads heavily outnumber writes.
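Stripped of the framework, the optimistic check boils down to a guarded UPDATE; a sketch against an assumed record table with a numeric version column:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Returns true iff this update won the race; false means the copy was stale. */
static boolean updateWithVersion(Connection conn, long id, long readVersion,
                                 String newValue) throws SQLException {
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE record SET value = ?, version = version + 1 " +
            "WHERE id = ? AND version = ?")) {
        ps.setString(1, newValue);
        ps.setLong(2, id);
        ps.setLong(3, readVersion);               // the version seen at read time
        return ps.executeUpdate() == 1;           // 0 rows = someone updated first
    }
}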
I have an application that works with a database table like
Id, state, procdate, result
When there is a need to process some data, the app sets state to PROCESSING. After processing, the result is written to the result column and the state goes back to STANDBY.
To do the first switch to PROCESSING, I start a transaction, do a select for update, then update the state and procdate.
Then I do the work and, again using select for update, update the state and the result.
The processing may take up to 5 minutes. The state switching is needed to see how many rows are in progress. The problem is that another request for processing may occur, and it has to wait until the first processing ends.
So I want to keep the row locked. If I issue the locking select for update just after committing the PROCESSING state, a second request may intercept and lock the row first.
So how can I both keep the lock and commit the changes?
You'll need to handle this with your design. Here is an idea.
Your records initially have a status, say 'READY', and a processing id, initially null.
When you start, update the status to 'PROCESSING' and set the processing id to a value for this job run; this can come from a sequence within Oracle, so that it is unique per process run. Commit.
The process then runs with the same id, selecting rows whose status is 'PROCESSING' and whose processing id matches its own. Complete the processing, update the status to 'COMPLETE' (or 'STANDBY' as you have it), commit.
This allows a second process to select new 'READY' records and set them for its own processing without interference with the already running process.
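A sketch of the claim step in JDBC; the jobs table is assumed, and the run id would come from a sequence as described above:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Claim all READY rows for this run; the commit makes the claim visible. */
static int claimForRun(Connection conn, long runId) throws SQLException {
    // Assumes autoCommit is off; run_id marks ownership after the lock is gone.
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE jobs SET status = 'PROCESSING', run_id = ? " +
            "WHERE status = 'READY'")) {
        ps.setLong(1, runId);
        int claimed = ps.executeUpdate();
        conn.commit();
        return claimed;
    }
}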
Here are two approaches I have taken. (I provide a third, but have never had to take that approach.)
1) Why not exit the transaction after committing the changes?
2) If option 1 is not viable, then you could simply:
COMMIT the changes
attempt to re-acquire the lock; if you fail, leave the screen, else just continue.
3) If it is absolutely imperative that no one can ever acquire the lock in the middle of a commit... you could actually lock another object. I will admit, I have never had to take this approach, but it would be as follows:
Initial phase
LOCK GLOBALOBJECT
Attempt to Acquire record lock for table
UNLOCK GLOBALOBJECT
Test to see if record lock was attained
Phase for committing the change
LOCK GLOBALOBJECT
COMMIT change
Acquire record lock for table
UNLOCK GLOBALOBJECT
Test to see if the record lock was attained (failure should never happen here...)
I have never needed this kind of logic, and I really do not like it, since it requires a global locking object for this table. Again, it depends on your code and on how critical it is that no one can commit changes while still being in the transaction.
However, just make sure you are not gold-plating your code when simply exiting the transaction after committing a change would be fine for your stakeholders.
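For what it's worth, option 2 in rough JDBC form might look like this (a NOWAIT-style clause is assumed to be available, as in Oracle or Postgres; the table name is hypothetical):

import java.sql.*;

// Commit the change, then immediately try to re-take the row lock.
static boolean commitAndRelock(Connection conn, long id) throws SQLException {
    conn.commit();                                // releases the current row lock
    try (PreparedStatement ps = conn.prepareStatement(
            "SELECT id FROM t WHERE id = ? FOR UPDATE NOWAIT")) {
        ps.setLong(1, id);
        try (ResultSet rs = ps.executeQuery()) {
            return rs.next();                     // re-acquired: stay on the screen
        }
    } catch (SQLException lostRace) {
        return false;                             // someone else got it: leave
    }
}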
I was asked to implement an "access policy" to limit the number of concurrent executions of a certain process within an application (NOT a web application) which has a direct connection to the database.
The application is running on several machines, and if more than one user tries to call the process, only one execution should be allowed at a given time, and the others must return an error message (NOT wait for the first execution to end).
Although I'm using Java/Postgres, is sort of a general question.
Given that I have the same application running in several machines, the simplest solution I can think of is implementing some sort of "database flag".
Something like checking whether the process is currently active:
SELECT Active FROM Process
If it's active, return a 'concurrent access policy error'. If not, activate it:
UPDATE Process SET Active = 'Y'
Once the execution is finished, simply update the active flag:
UPDATE Process SET Active = 'N'
However, I've encountered a major issue:
If I don't use a DB transaction to change the active flag, and the application is killed, the active flag will remain 'Y' forever.
If I use a DB transaction, the first point is solved. However, the change of the active flag on one host (from 'N' to 'Y') will only be visible after the commit, so the other hosts will never read 'Y' and will therefore execute anyway.
Any ideas?
Don't bother with an active flag, instead simply lock a row based on the user ID. Keep that row locked in a dedicated transaction/connection. When the other user tries to lock the row (using SELECT ... FOR UPDATE) you'll get an error, and you can report it.
If the process holding the transaction fails, the lock is freed. If it quits, the lock is freed. If the DB is rebooted, the lock is freed.
Win all around.
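A sketch of that pattern; the process_lock table (one pre-inserted row per user id) and the names are assumptions:

import java.sql.*;

// Opens a dedicated connection whose open transaction holds the row lock.
static Connection acquireProcessLock(String url, long userId) throws SQLException {
    Connection lockConn = DriverManager.getConnection(url);
    lockConn.setAutoCommit(false);                // keep the transaction open
    try (PreparedStatement ps = lockConn.prepareStatement(
            "SELECT user_id FROM process_lock WHERE user_id = ? FOR UPDATE NOWAIT")) {
        ps.setLong(1, userId);
        ps.executeQuery();                        // throws if another host holds it
    } catch (SQLException held) {
        lockConn.close();                         // report the policy error upstream
        throw held;
    }
    return lockConn;                              // crash/close releases the lock
}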
Instead of having only a simple Y/N flag, store the timestamp at which Active was set, and have your client application refresh it regularly (say, every minute or every five minutes). Then if a client crashes, the other clients only have to wait just over that time limit before assuming that client is dead and taking over. This is a "heartbeat" mechanism to check that the client that started the process is still alive.
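A minimal sketch of that takeover check, assuming a heartbeat timestamp column on a single-row Process table:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Claim the flag if it is free, or if its owner stopped heartbeating. */
static boolean claimOrTakeOver(Connection conn) throws SQLException {
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE Process SET Active = 'Y', heartbeat = now() " +
            "WHERE Active = 'N' OR heartbeat < now() - interval '5 minutes'")) {
        return ps.executeUpdate() == 1;           // 0 rows = a live client owns it
    }
}

The running client would then refresh heartbeat on a timer with a similar UPDATE, and set Active back to 'N' when it finishes.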
A simpler solution would be to configure the database to only accept one connection at a time?
I am not sure if a RDBMS is the best system to solve this kind of issue. But I recently implemented a similar thing in SQL Server 2012. So here's what I learned from that experience.
In general, I mean in principle, you need an atomic "check the value, update the value" operation on one single record, i.e. an atomic SELECT/UPDATE. This is what makes the matter complex. And because there is normally no such standard single atomic operation in the RDBMSs, you can get familiar with ISOLATION LEVEL SERIALIZABLE and use it.
This is how I implemented it in SQL Server 2012, and I've tested it seriously; it works fine. I have a table called DistributedLock; each record in it represents a logical lock. The operations I allow are tryLock and releaseLock (implemented as two stored procedures).
tryLock is (practically) non-blocking. If it succeeds, it returns an ID/stamp to the caller, who can use that ID/stamp later to call releaseLock. If one calls releaseLock without actually holding the lock (without having the latest ID/stamp, that is), the call succeeds and does nothing; otherwise (if the caller has the lock) the call succeeds and releases the lock held by the caller. I also have support for timeouts: if some process grabs the ID/stamp of a given lock/record and forgets to release it, it expires automatically after some time.
Here is what the table looks like:
[DistributedLockID] [bigint] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL -- surrogate PK
[ResourceID] [nvarchar](256) NOT NULL -- resource/lock logical identifier
[Duration] [int] NOT NULL
[AcquisitionTime] [datetime] NULL
[RecordStamp] [bigint] NOT NULL
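Since the question is about Postgres, here is a rough Postgres-flavored sketch of the tryLock idea as one guarded UPDATE (the real implementation was T-SQL stored procedures):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Returns a stamp on success, or -1 if the lock is held and not yet expired. */
static long tryLock(Connection conn, String resourceId) throws SQLException {
    long stamp = System.nanoTime();               // kept by the caller for releaseLock
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE DistributedLock SET AcquisitionTime = now(), RecordStamp = ? " +
            "WHERE ResourceID = ? AND (AcquisitionTime IS NULL OR " +
            "      AcquisitionTime + Duration * interval '1 second' < now())")) {
        ps.setLong(1, stamp);
        ps.setString(2, resourceId);
        return ps.executeUpdate() == 1 ? stamp : -1;   // 1 row updated = acquired
    }
}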
I guess you can figure out the rest (or try, and then ping me if you get stuck).
I'm implementing an event listener, querying new items to process, by creationTime in ascending order.
I deal with multithreading.
My current workflow is:
Query a batch of items (let's say 50) carrying the "New" flag.
Loop through those items and, for each one, update its status to "InProgress".
For each item, still within the loop, start the corresponding process, detached in a thread (using Akka Actors in my case).
As soon as a process is fully completed, update the item's flag to "Consumed".
I set a polling frequency of 3 seconds, which obviously means a query for new items may run BEFORE the currently retrieved items have been fully processed and flagged "Consumed" (due to multithreading).
Only the querying is single-threaded; otherwise it would lead to retrieving duplicates.
I wonder if step 2 is essential: updating each item with the "InProgress" flag.
Indeed, it would slow down the whole process.
I thought about skipping this step, but to ensure that future queries don't retrieve items that are currently being processed (imagine a very long computation), I would NOT start the next retrieval query until the whole batch is processed.
Basically, my query step would wait for the workers to finish their current jobs.
Obviously, this would make sense if the kind of jobs are similar in computation time.
What is a good practice of polling database while dealing with multithreaded computation?
I have a Java application where 15 threads select rows from a table with 11,000 records through a synchronized method called getNext(). The threads have become slow at selecting rows, taking a huge amount of time. Each thread follows this process:
The thread checks if a row with the resume column set to 1 exists.
A. If it exists, the thread takes the id of that row and uses it to select another row with an id greater than the taken id.
B. Otherwise, it selects a row with an id greater than 0.
The last row received, based on the outcome of step 1 above, is marked by setting its resume column to 1.
The thread takes the row data and works on it.
Question:
How can multiple threads access the same table, selecting rows that another thread has not selected, and be fast?
How can threads be made to resume, in case of a crash, at the last row that was selected by any of the threads?
1.:
It seems the multiple database operations in getNext() are the bottleneck. If the data isn't changed by an outside source, you could read the id and resume values of all rows once and cache them. Then you would have only one query and could serve reads from memory afterwards. This would save a lot of expensive DB calls in getNext().
2.:
Basically you need some sort of transaction, or at least another column that gets updated when a thread has finished processing a row. The processing and the update need to happen in a single transaction; if something happens while the transaction is unfinished, you can roll back to the state in which the row wasn't processed.
If the threads are all on the same machine, they could use a shared data structure to avoid working on the same thing, instead of synchronizing through the database. But the following assumes the threads are on different machines (maybe different members of an application server cluster) and can only communicate via the database.
Remove the synchronization on the getNext() method. When setting the resume flag to 1 (step 2), do so atomically: UPDATE table SET resume = 1 WHERE resume = 0, then commit. Only one thread will succeed at this, and the thread that does gets that unit of work. At the same time, set a resume time; if the resume time exceeds some maximum, assume the thread working on that unit of work has crashed and set the resume flag back to 0. After the work is finished, set the resume time to null, or otherwise mark the work as done.
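In JDBC that atomic claim might look like this (column names follow the answer; the work table name is assumed):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** One thread wins the row; everyone else sees 0 rows updated and moves on. */
static boolean claimRow(Connection conn, long rowId) throws SQLException {
    // Assumes autoCommit is off; the commit makes the claim visible to others.
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE work SET resume = 1, resume_time = now() " +
            "WHERE id = ? AND resume = 0")) {
        ps.setLong(1, rowId);
        boolean won = ps.executeUpdate() == 1;
        conn.commit();
        return won;
    }
}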
Well, I would think of several issues here:
Are you keeping status in your DB? I would look for an approach where you do a select for update filtered by inactive status (make sure to get just one row in the select) and immediately update it to active (in the same transaction). It would be nice to know which DB you're using; I'm not sure "select for update" is always an option.
Process, and when you're finished, update to finished status.
Be sure to keep a timestamp in the table to identify when you last changed the status. Make yourself a rule for deciding when an active thread is to be treated as lost.
Define other possible error scenarios (what happens if the process fails).
You would also need to analyze the scenario: how many rows does your table have? How many threads call it concurrently? How many inserts occur in a given time? Depending on this, you will have to see how DB performance holds up.
I'm assuming your getNext() is synchronized; with what I wrote in point 1 you might get around this...