I'm trying to process a ResultSet from multiple threads. I want to make sure that whenever one thread calls next(), all other threads are locked out. This is important because if several threads call next() simultaneously, rows will be skipped. Here is what I did:
import java.sql.ResultSet;
import java.sql.SQLException;

public class MainClass {
    private static ResultSet rs;

    public static void main (String [] args) throws InterruptedException {
        Thread thread1 = new Thread(new Runnable() {
            @Override
            public void run() {
                runWhile();
            }});
        Thread thread2 = new Thread(new Runnable() {
            @Override
            public void run() {
                runWhile();
            }});
        thread1.start();
        thread2.start();
        thread1.join();
        thread2.join();
        System.exit(0);
    }

    private static void runWhile () {
        String username = null;
        try {
            while ((username = getUsername()) != null) {
                // Use username to complete my logic
            }
        } catch (SQLException e) {
            // getUsername() declares SQLException, so it must be handled here
            e.printStackTrace();
        }
    }

    /**
     * This method locks ResultSet rs until the String username is retrieved.
     * This prevents skipping the rows
     * @return the next username, or null when no rows remain
     * @throws SQLException
     */
    private synchronized static String getUsername() throws SQLException {
        if (rs.next()) {
            return rs.getString(1).trim();
        }
        else
            return null;
    }
}
Is this a correct way of using synchronized? Does it lock the ResultSet and make sure other threads do not interfere?
Is this a good approach?
JDBC objects shouldn't be shared between threads. That goes for Connections, Statements, and ResultSets. The best case here would be that the JDBC vendor follows the spec and does internal locking so that you can get by with this, in which case all the threads are still trying to acquire the same lock and only one can make progress at a time. This will be slower than using a single thread, because on top of doing the same work to read from the database there is extra overhead from managing all the threading.
(Locking done by the driver could be for the driver's benefit, so the provider doesn't have to deal with bug reports of race conditions caused by users misusing their software. That it does locking doesn't necessarily imply the software should actually be used by multiple threads.)
Multithreading pays off when threads can make progress concurrently; see Amdahl's Law. If you can read the ResultSet in one thread and use the results to create tasks that you submit to an ExecutorService (as Peter Lawrey recommends in a comment), that would make more sense, as long as those tasks can work independently and don't have to wait on each other.
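A minimal sketch of that idea, assuming the per-row work is independent. To keep it runnable without a database, an Iterator<String> stands in for the ResultSet; SingleReaderDemo, process and processAll are hypothetical names, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SingleReaderDemo {

    // Hypothetical per-row work; stands in for "use username to complete my logic".
    static String process(String username) {
        return username.trim().toUpperCase();
    }

    // One thread walks the rows; only the per-row work is handed to the pool.
    static List<String> processAll(Iterator<String> rows, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            while (rows.hasNext()) {            // with JDBC: while (rs.next())
                String username = rows.next();  // with JDBC: rs.getString(1)
                futures.add(pool.submit(() -> process(username)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());      // collected in submission order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> rows = Arrays.asList(" alice ", "bob", " carol");
        System.out.println(processAll(rows.iterator(), 2));
    }
}
```

Only the reading thread ever touches the row source, so no lock on it is needed; whether this is faster than a plain loop depends on how expensive the per-row work is.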
I suggest creating the ResultSet and then copying all the data into a DTO (Data Transfer Object) or a DAO (Data Access Object). Once the data is in the DTO or DAO, close your ResultSet, Statement and Connection.
A very simple structure for a DTO/DAO that stores the records in order, with their fields and parsing capabilities, is this:
ArrayList<HashMap<String, Object>> table = new ArrayList<HashMap<String, Object>>();
HashMap<String, Object> record = new HashMap<String, Object>();
String field1 = "something";
Integer field2 = Integer.valueOf(45); // new Integer(int) is deprecated
record.put("field1", field1);
record.put("field2", field2);
table.add(record);
You may (and probably should) automate this and make the DTO/DAO flexible enough that the same class works for any table, without hard-coded or fixed names.
Remember that you will need to create a wrapper and the methods for storing and reading the data, and that these methods should be thread-safe.
Keep in mind that this design only works if you have enough memory to store all the records of your ResultSet.
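One possible shape for such a wrapper, as a sketch only: it gets its thread safety by being immutable after construction rather than by locking, which is an assumption about the use case (rows are only read once copied). RowSnapshot and its methods are hypothetical names; the JDBC calls used (getMetaData, getColumnCount, getColumnLabel, getObject) are standard java.sql API.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical read-only copy of a query result; safe to share between threads
// because nothing can change after the constructor finishes.
final class RowSnapshot {
    private final List<Map<String, Object>> rows;

    private RowSnapshot(List<Map<String, Object>> rows) {
        this.rows = rows;
    }

    // Defensive copy so later changes to the source do not leak in.
    static RowSnapshot of(List<Map<String, Object>> source) {
        List<Map<String, Object>> copy = new ArrayList<>();
        for (Map<String, Object> row : source) {
            copy.add(Collections.unmodifiableMap(new LinkedHashMap<>(row)));
        }
        return new RowSnapshot(Collections.unmodifiableList(copy));
    }

    // Reads every row using the column labels reported by the driver,
    // so no table-specific names are hard-coded.
    static RowSnapshot fromResultSet(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        int columns = meta.getColumnCount();
        List<Map<String, Object>> copy = new ArrayList<>();
        while (rs.next()) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (int i = 1; i <= columns; i++) {
                row.put(meta.getColumnLabel(i), rs.getObject(i));
            }
            copy.add(row);
        }
        return of(copy);
    }

    int size() {
        return rows.size();
    }

    Object get(int row, String column) {
        return rows.get(row).get(column);
    }
}
```

After fromResultSet returns, the ResultSet, Statement and Connection can be closed and the snapshot handed to any number of worker threads.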
I am using the Spring framework under Tomcat to write a service that can handle multiple concurrent requests. There is a static variable declared in my service class that is shared across all threads and protected by a read-write lock. This static collection is read and written periodically. Each time I update it I acquire the write lock, and each time I read it I acquire the read lock.
The data stored in the collection consists of entries from a database table, so it is a collection of type List. Basically, I have a table that is rarely updated, so I am caching it in process memory.
There are times when I need to log this data. Can I log the returned object without holding the lock in the method, or will that cause a race condition? The logging is for debugging purposes.
Also, the returned collection is read-only and is not modified by any method. No one outside of the service object uses this Collection.
I feel this should work because when a new collection is allocated, the old collection will only go away once all references to it are gone. And while updating, no new references to the old or new object can be obtained until the write lock is released.
The code looks as follows:
class ObjService {
private static Collection<OtherObj> _staticCollection;
private static ReentrantReadWriteLock rwlock = new ReentrantReadWriteLock(true);
private Collection<OtherObj> getCollection () {
Collection<OtherObj> retVal = null;
rwlock.readLock().lock();
if (_staticCollection != null) {
retVal = _staticCollection;
rwlock.readLock().unlock();
log (retVal);
}
else {
rwlock.readLock().unlock();
ReloadCollectionFromDB ();
rwlock.readLock().lock();
retVal = _staticCollection;
rwlock.readLock().unlock();
}
return retVal;
}
private void ReloadCollectionFromDB () {
Collection<OtherObj> otherObjCol = null;
try {
otherObjCol = objRepo.findAll ();
}
catch (Exception ex) {
// log exception
return;
}
rwlock.writeLock().lock();
_staticCollection = otherObjCol;
rwlock.writeLock().unlock();
}
// periodically get data from DB
@Scheduled(initialDelayString = "120000", fixedDelayString = "540000")
void readLoadCache () {
ReloadCollectionFromDB ();
}
}
If there are better ways of doing this, I would appreciate some guidance.
Many thanks,
~Ash
Usecase : Rotation of credentials for a datastore
What I want:
1. When updateCredentials is called, it waits until all threads are done fetching credentials (via the synchronized block) before updating to the new credentials.
2. I DO NOT want calls to doSomeQuery to make each other wait to fetch credentials. This object can be used from multiple threads, so that wait is wasteful.
Is there a method or pattern to achieve this? The code sample below achieves item 1 but not item 2.
private final Object credentialUpdate = new Object();

public void updateCredentials(String user, String pass) {
    synchronized (credentialUpdate) {
        this.user = user;
        this.pass = pass;
    }
}

public void doSomeQuery(String query) {
    String curUser;
    String curPass;
    synchronized (credentialUpdate) {
        curUser = this.user;
        curPass = this.pass;
    }
    // execute query
}
Use java.util.concurrent.locks.ReadWriteLock and its implementation ReentrantReadWriteLock. From the Javadoc:
A ReadWriteLock maintains a pair of associated locks, one for read-only operations and one for writing. The read lock may be held simultaneously by multiple reader threads, so long as there are no writers. The write lock is exclusive.
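Applied to the credential example, this could look roughly like the sketch below. CredentialStore and currentCredentials are hypothetical names; the query itself would run after the read lock is released, using the copied values.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CredentialStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private String user;
    private String pass;

    // Writer: waits for in-flight readers, then blocks new ones while updating.
    public void updateCredentials(String user, String pass) {
        lock.writeLock().lock();
        try {
            this.user = user;
            this.pass = pass;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers: may hold the read lock at the same time as other readers,
    // and always see a matching user/pass pair.
    public String[] currentCredentials() {
        lock.readLock().lock();
        try {
            return new String[] { user, pass };
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        CredentialStore store = new CredentialStore();
        store.updateCredentials("alice", "secret");
        String[] creds = store.currentCredentials();
        System.out.println(creds[0]);
    }
}
```

Readers only exclude the writer, not each other, which covers both requirements from the question.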
I'm using something like
Cache<Integer, Item> cache;
where the Items are independent of each other and look like
private static class Item {
private final int id;
... some mutable data
synchronized doSomething() {...}
synchronized doSomethingElse() {...}
}
The idea is to obtain the item from the cache and call a synchronized method on it. In case of a miss, the item can be recreated, that's fine.
A problem occurs when an item gets evicted from the cache and recreated while a thread runs a synchronized method. A new thread obtains a new item and synchronizes on it... so for a single id, there are two threads inside the synchronized method. FAIL.
Is there an easy way around it? It's Guava Cache, if it helps.
I think the suggestion from Louis, using the keys for locking, is the simplest and most practical one. Here is a code snippet that illustrates the idea without the help of Guava libraries:
static final Lock[] locks = new Lock[ ... ];
static { /* initialize lock array */ }
int id;
void doSomething() {
    // floorMod keeps the index non-negative even for a negative id
    final Lock lock = locks[Math.floorMod(id, locks.length)];
    lock.lock();
    try {
        /* protected code */
    } finally {
        lock.unlock();
    }
}
The size of the lock array limits the maximum amount of parallelism you get. If your code is purely CPU-bound, you can size the array by the number of available processors, and this is the perfect solution. If your code waits for I/O, you might need an arbitrarily large array of locks, or you limit the number of threads that can run the critical section. In this case another approach might be better.
Comments on a more conceptual level:
If you want to prevent the item from being evicted, you need a mechanism called pinning. Internally this is used by most cache implementations, e.g. for blocking during I/O operations. Some caches may expose a way for applications to do it.
In a JCache-compatible cache, there is the concept of an EntryProcessor. The EntryProcessor allows you to run a piece of code on an entry atomically. This means the cache does all the locking for you. Depending on the scope of the problem, this may be an advantage, since it also works in clustered scenarios, where the locking is cluster-wide.
Another idea that comes to mind is vetoable eviction. This is a concept EHCache 3 is implementing. By specifying a vetoable eviction policy you can implement a pinning mechanism on your own.
I'm sure that there are multiple solutions for your issue.
I wrote down one of them, using a unique lock for each itemId:
public class LockManager {
private Map<Integer, Lock> lockMap = new ConcurrentHashMap<>();
public synchronized Lock getOrCreateLockForId(Integer itemId) {
Lock lock;
if (lockMap.containsKey(itemId)) {
System.out.println("Get lock");
lock = lockMap.get(itemId);
} else {
System.out.println("Create lock");
lock = new ReentrantLock();
lockMap.put(itemId, lock);
}
return lock;
}
public synchronized Lock getLockForId(Integer itemId) {
Lock lock;
if (lockMap.containsKey(itemId)) {
System.out.println("get lock");
return lockMap.get(itemId);
} else {
throw new IllegalStateException("First lock, then unlock");
}
}
}
So, instead of using synchronized methods in class Item, use LockManager to get the Lock by itemId and call lock.lock() once it has been retrieved.
Also note that LockManager should have singleton scope, and the same instance should be shared across all usages.
Below is an example of using LockManager:
// Acquire before the try block so that finally only unlocks a lock that is held.
lockManager.getOrCreateLockForId(itemId).lock();
try {
    System.out.println("start doing something" + num);
    try {
        Thread.sleep(5000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    System.out.println("completed doing something" + num);
} finally {
    lockManager.getLockForId(itemId).unlock();
}
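As a hedged variant of the same per-id locking idea, ConcurrentHashMap.computeIfAbsent can create each lock atomically, so the manager itself needs no synchronized methods. PerIdLocks, lockFor and withLock are hypothetical names. Like the class above, this sketch never removes locks, so the map grows with the number of distinct ids.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class PerIdLocks {
    private final ConcurrentMap<Integer, Lock> locks = new ConcurrentHashMap<>();

    // Returns the one lock for this id, creating it atomically on first use.
    Lock lockFor(Integer itemId) {
        return locks.computeIfAbsent(itemId, id -> new ReentrantLock());
    }

    // Runs the action while holding the lock for the given id.
    void withLock(Integer itemId, Runnable action) {
        Lock lock = lockFor(itemId);
        lock.lock();
        try {
            action.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        PerIdLocks perIdLocks = new PerIdLocks();
        perIdLocks.withLock(42, () -> System.out.println("working on item 42"));
    }
}
```

Because the lock is keyed by id rather than by Item instance, it survives eviction and recreation of the cached item.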
We need to lock a method responsible for loading database data into a HashMap-based cache.
A possible situation is that a second thread tries to access the method while the first thread is still loading the cache.
We consider the second thread's effort in this case to be superfluous. We would therefore like to have that second thread wait until the first thread is finished, and then return (without loading the cache again).
What I have works, but it seems quite inelegant. Are there better solutions?
private static final ReentrantLock cacheLock = new ReentrantLock();
private void loadCachemap() {
if (cacheLock.tryLock()) {
try {
this.cachemap = retrieveParamCacheMap();
} finally {
cacheLock.unlock();
}
} else {
try {
cacheLock.lock(); // wait until thread doing the load is finished
} finally {
try {
cacheLock.unlock();
} catch (IllegalMonitorStateException e) {
logger.error("loadCachemap() finally {}",e);
}
}
}
}
I prefer a more resilient approach using read locks AND write locks. Something like:
private static final ReadWriteLock cacheLock = new ReentrantReadWriteLock();
private static final Lock cacheReadLock = cacheLock.readLock();
private static final Lock cacheWriteLock = cacheLock.writeLock();
private void loadCache() throws Exception {
// Expiry.
while (storeCache.expired(CachePill)) {
/**
* Allow only one in - all others will wait for 5 seconds before checking again.
*
* Eventually the one that got in will finish loading, refresh the Cache pill and let all the waiting ones out.
*
* Also waits until all read locks have been released - not sure if that might cause problems under busy conditions.
*/
if (cacheWriteLock.tryLock(5, TimeUnit.SECONDS)) {
try {
// Got a lock! Start the rebuild if still out of date.
if (storeCache.expired(CachePill)) {
rebuildCache();
}
} finally {
cacheWriteLock.unlock();
}
}
}
}
Note that storeCache.expired(CachePill) detects a stale cache, which may be more than you want, but the concept is the same: establish a write lock before updating the cache, which denies all read attempts until the rebuild is done. Also, manage multiple write attempts in a loop of some sort, or just drop out and let the read lock wait for access.
A read from the cache now looks like this:
public Object load(String id) throws Exception {
    Store store = null;
    // Make sure cache is fresh.
    loadCache();
    // Establish a read lock so we do not attempt a read while the cache is being updated.
    cacheReadLock.lock();
    try {
        store = storeCache.get(id);
    } finally {
        // Make sure the lock is cleared.
        cacheReadLock.unlock();
    }
    return store;
}
The primary benefit of this form is that read access does not block other read access but everything stops cleanly during a rebuild - even other rebuilds.
You didn't say how complicated your structure is and how much concurrency / congestion you need. There are many ways to address your need.
If your data is simple, use a ConcurrentHashMap or similar to hold your data. Then just read and write from any thread without extra locking.
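A small sketch of the ConcurrentHashMap option, assuming string keys and values. ParamCache and loadFromDatabase are hypothetical; the loader here just fabricates a value so the example runs without a database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ParamCache {
    private final Map<String, String> params = new ConcurrentHashMap<>();

    // Plain reads and writes are thread-safe without any explicit lock.
    void put(String key, String value) {
        params.put(key, value);
    }

    String get(String key) {
        return params.get(key);
    }

    // Loads a missing entry at most once per key, even under contention.
    String getOrLoad(String key) {
        return params.computeIfAbsent(key, ParamCache::loadFromDatabase);
    }

    // Placeholder for the real database lookup.
    static String loadFromDatabase(String key) {
        return "value-of-" + key;
    }

    public static void main(String[] args) {
        ParamCache cache = new ParamCache();
        System.out.println(cache.getOrLoad("timeout"));
    }
}
```

This fits per-entry loading; replacing the whole map at once, as in the question, is a different situation and still needs one of the other approaches.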
Another alternative is to use the actor model and put reads and writes on the same queue.
If all you need is to fill a read-only map that is initialized from the database once requested, you could use any form of double-checked locking, which may be implemented in a number of ways. The easiest variant would be the following:
private volatile Map<T, V> cacheMap;
public void loadCacheMap() {
if (cacheMap == null) {
synchronized (this) {
if (cacheMap == null) {
cacheMap = retrieveParamCacheMap();
}
}
}
}
But I would personally prefer to avoid any form of synchronization here and just make sure that the initialization is done before any other thread can access it (for example, in an init method in a DI container). In this case you would even avoid the overhead of volatile.
EDIT: The answer works only when a single initial load is expected. In case of multiple updates, you could replace the tryLock with some other form of test and test-and-set, for example something like this:
private final AtomicReference<CountDownLatch> sync =
        new AtomicReference<>(new CountDownLatch(0));

private void loadCacheMap() throws InterruptedException {
    CountDownLatch oldSync = sync.get();
    if (oldSync.getCount() == 0) { // if nobody updating now
        CountDownLatch newSync = new CountDownLatch(1);
        if (sync.compareAndSet(oldSync, newSync)) {
            try {
                cacheMap = retrieveParamCacheMap();
            } finally {
                newSync.countDown(); // release waiters even if the load fails
            }
            return;
        }
    }
    sync.get().await(); // await() can throw InterruptedException
}
I'm new to Java programming. I have a use case where I have to execute two DB queries in parallel. The structure of my class is something like this:
class A {
    public Object func_1() {
        //executes db query1
    }
    public Object func_2() {
        //executes db query2
    }
}
Now I have to add another function, func_3, to the same class that calls these two functions and makes sure they execute in parallel. For this, I'm using Callables and Futures. Is this the right way to use them? I'm storing this in a temporary variable and then using it to call func_1 and func_2 from func_3 (which I'm not sure is the correct approach). Or is there another way to handle cases like these?
class A {
    public Object func_1() {
        //executes db query1
    }
    public Object func_2() {
        //executes db query2
    }
    public void func_3() throws InterruptedException {
        final A that = this;
        Callable<Object> call1 = new Callable<Object>() {
            @Override
            public Object call() {
                return that.func_1();
            }
        };
        Callable<Object> call2 = new Callable<Object>() {
            @Override
            public Object call() {
                return that.func_2();
            }
        };
        ArrayList<Callable<Object>> list = new ArrayList<Callable<Object>>();
        list.add(call1);
        list.add(call2);
        ExecutorService executor = Executors.newFixedThreadPool(2);
        // invokeAll is declared to return a List, so no cast to ArrayList
        List<Future<Object>> futureList = executor.invokeAll(list);
        //process result accordingly
        executor.shutdown(); // otherwise the pool threads keep running
    }
}
First of all, you do NOT need to store this in another local variable: methods of the outer class are available simply as func_1() or func_2(), and when you need the outer instance itself you can use A.this.
Secondly, yes, this is a common way to do it. Also, if you are going to call func_3 often, avoid creating a fixed thread pool on every call; pass the executor in as a parameter instead, since thread creation is rather costly.
The whole idea of an Executor(Service) is to use a small number of threads for many small tasks. Here you use a two-thread executor for two tasks. I would either create a globally defined executor or just spawn two threads for the two tasks.
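A sketch of the version with the executor passed in, assuming Java 8 or later so that method references can replace the anonymous Callables. ParallelQueries and the string results are placeholders for the real class and queries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelQueries {
    Object func1() { return "result-1"; } // placeholder for db query 1
    Object func2() { return "result-2"; } // placeholder for db query 2

    // The caller owns the executor, so no pool is created per call.
    List<Object> func3(ExecutorService executor)
            throws InterruptedException, ExecutionException {
        List<Callable<Object>> tasks = new ArrayList<>();
        tasks.add(this::func1); // no "that" variable needed
        tasks.add(this::func2);

        // invokeAll blocks until both tasks have completed.
        List<Future<Object>> futures = executor.invokeAll(tasks);

        List<Object> results = new ArrayList<>();
        for (Future<Object> future : futures) {
            results.add(future.get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            System.out.println(new ParallelQueries().func3(executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

In a real application the executor would typically be created once at startup and shut down when the application stops.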