This is a design problem: I am trying to figure out at which level (application, class, object, or even finer) I should place locks to ensure atomicity.
I have an application, say Engine, which has a class A with a method that contains a map.
class A {
    public void methodA() {
        Map<X, Y> testMap = Maps.newHashMap();
    }
}
Now I have multiple threads accessing this map. What I want to ensure is an atomic {read and write} combination on this map. The options I have are:
1. ConcurrentHashMap
2. Collections.synchronizedMap
3. static synchronizedMap outside the methodA
4. Application-level locks using Redis or Memcache
What I should mention is that I am considering the 4th option because the application Engine can have multiple instances.
Right now I am facing race conditions when multiple threads try to read and write to the map.
For option 1 I get bucket-level locking, which doesn't help because different threads can be directed to different instances of the Engine app.
For option 2 I get an object-level lock, which faces the same issue as option 1.
For option 3 I get a class-level lock, which suffers from the same flaw in a multi-instance app.
Option 4 seems the most viable, but it carries a performance overhead. So is there some way in Java to ensure this lock at the class level and prevent threads from modifying different instances of the app?
EDIT
With reference to Chetan's comment: this local map is later used to talk to the DAO of a database, which is global, and that's where the race condition is encountered.
ConcurrentHashMap
Even though all individual operations are thread-safe, a 'get' may or may not reflect the most recent 'put', and compound check-then-act sequences are not atomic.
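To make the distinction concrete, here is a minimal sketch (class and key names are mine, not from the question): a naive get-then-put on a ConcurrentHashMap is not atomic, but merge() performs the whole read-modify-write as one atomic step.

```java
import java.util.concurrent.ConcurrentHashMap;

public class AtomicMapOps {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

        // Broken pattern: check-then-act is NOT atomic even on a ConcurrentHashMap;
        // another thread can interleave between get() and put().
        //
        // Integer v = counts.get("sheep");
        // counts.put("sheep", v == null ? 1 : v + 1);

        // Correct: merge() performs the read-modify-write as a single atomic step.
        counts.merge("sheep", 1, Integer::sum);
        counts.merge("sheep", 1, Integer::sum);

        System.out.println(counts.get("sheep"));
    }
}
```

Note that none of this spans JVMs, which is why it cannot solve the multi-instance problem on its own.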
Collections.synchronizedMap
Collections.synchronizedMap(map) creates a blocking map, which will degrade performance, albeit ensuring consistency. Use this option only if each thread needs an up-to-date view of the map.
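A caveat worth showing with a small sketch (names are mine): the wrapper only synchronizes individual calls, so compound operations and iteration still need an explicit synchronized block on the map itself, as the Collections.synchronizedMap Javadoc requires.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = Collections.synchronizedMap(new HashMap<>());

        // Individual calls lock the wrapper, but a compound read-then-write
        // must hold the same monitor explicitly to be atomic:
        synchronized (map) {
            Integer v = map.get("key");
            map.put("key", v == null ? 1 : v + 1);
        }

        // Iteration likewise requires manual synchronization per the Javadoc.
        synchronized (map) {
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                System.out.println(e.getKey() + "=" + e.getValue());
            }
        }
    }
}
```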
static synchronizedMap outside the methodA
Same as option 2: making the map static only widens the scope of the reference, it does not change the locking behavior, and it still does not span multiple JVMs.
Application level locks using Redis or Memcache
From what I understood from your question, only this option makes sense.
This is my understanding of your requirement, correct me if I am wrong and I will update the answer accordingly: you have multiple instances (as in multiple instances residing in multiple JVMs) of the 'engine' application, and while persisting the map (which is local to each instance) into the database you have to avoid race conditions.
Related
Given what I understand of concurrency in Java, it seems shared access to instance members must be coded to handle multi-threaded access only if the threads access the same instance of a given object, such as a servlet. See here:
Why instance variable in Servlet is not thread-safe
Since not all applications are servlet-based, how do you determine which objects need to accommodate multi-threaded access? For example, in a large, non-servlet-based enterprise application, given the sheer number of classes, how do you determine from a design standpoint which objects will have only one instance shared across multiple threads at run-time? The only situation I can think of is a singleton.
In Java's EL API, javax.el.BeanELResolver has a private inner class that uses synchronization to serialize access to one of its members. Unless I am missing something, BeanELResolver does not look like a singleton, and so each thread should have its own instance of BeanELResolver. What could have been the design consideration behind synchronizing one of its members?
There are many cases in which the state of one class can be shared across many threads, not just singletons. For example you could have a class or method creating objects (some sort of factory) and injecting the same dependency in all the created objects. The injected dependency will be shared across all the threads that call the factory method. The dependency could be anything: a counter, database access class, etc.
For example:
class ThreadSafeCounter {
    private final String name;
    private final AtomicInteger i = new AtomicInteger();
    ThreadSafeCounter(String name) { this.name = name; }
    int increment() { return i.incrementAndGet(); }
}
class SheepTracker {
    private final ThreadSafeCounter sheepCounter;
    public SheepTracker(ThreadSafeCounter c) { sheepCounter = c; }
    public int addSheep() { return sheepCounter.increment(); }
}
class SheepTrackerFactory {
    private final ThreadSafeCounter c;
    public SheepTrackerFactory(ThreadSafeCounter c) { this.c = c; }
    public SheepTracker newSheepAdder() {
        return new SheepTracker(c);
    }
}
In the above, the SheepTrackerFactory can be used by many threads that all need to do the same thing, i.e., keeping track of sheep. The number of sheep across all the threads is maintained in a global state variable, the ThreadSafeCounter (it could be just an AtomicInteger in this example, but bear with me, you can imagine how this class could contain additional state/operations). Now each SheepTracker can be a lightweight class that performs other operations that don't require synchronization, but when they need to increment the number of sheep, they will do it in a thread-safe way.
You're asking a very broad question, so I'll try to answer with a broad answer. One of the first things your design has to consider, long before you dive into classes, is the design of the application's threading. In this step you consider the task at hand, and how to best utilize the hardware that has to solve it. Based on that, you choose the best threading design for your application.
For instance: does the application perform intense computations? If so, can parts of the computation be parallelized to make better use of a multi-core CPU? If so, make sure to design multiple threads that compute on different cores in parallel.
Does your application perform a lot of I/O operations? If so, it's better to parallelize them so multiple threads could handle the input/output (which is slow and requires a lot of waiting for external devices) while other threads continue working on their own tasks. This is why servlets are executed by multiple threads in parallel.
Once you decide on the tasks you want to parallelize and the ones you prefer executing in a single thread, you go into the design of the classes themselves. Now it's clear which parts of your software have to be thread safe, and which don't. You have a data structure that's being accessed by a thread pool responsible for I/O? It has to be thread safe. You have an object that's being accessed by a single thread that performs maintenance tasks? It doesn't have to be.
Anyway, this has nothing to do with singletons. Singleton is a design pattern that means that only a single instance of a certain object can be created. It doesn't say anything about the number of threads accessing it or its members.
Any instance can be shared between threads, not only singletons.
That's why it's pretty hard to come up with a design where anyone on a development team can instantly see which types or instances will be shared between threads and which won't. It is outright impossible to prevent the sharing of some instances. So the solution must lie somewhere else. Read up on "memory barriers" to understand the details.
Synchronization is used for two purposes:
Define memory barriers (i.e. when changes should become visible to other threads)
Make sure that complex data structures can be shared between threads (i.e. locking).
Since there is no way to prevent people from sharing a single BeanELResolver instance between different threads, they probably need to make sure that concurrent access doesn't break some complex structure (probably a Map).
I would like to reuse instances of non-thread safe classes for performance reasons in a Servlet. I have two options,
use ThreadLocal, where Java takes care of the instance management per thread
use a static HashMap with the Thread as the key, where the instances are managed at that level
With the ThreadLocal approach there is potential for memory leaks, especially in a servlet environment. Because of this, I am thinking of using the second option. I was wondering if anyone has experience with this approach and knows any pitfalls of using it?
Prefer the ThreadLocal approach because it is likely synchronized (or better yet, requires no synchronization) at the correct granularity and no larger.
If you roll your own solution using HashMap you'll have to acquire a lock over the HashMap every time you want to access any thread-local data. Why? Because a new thread could be created and threads can die. These are implicitly adding/removing items from a HashMap, which require synchronization on the full HashMap. You'll also have quite the time keeping object lifetimes straight because HashMap will keep alive all items it contains as long as it is referable from any thread. That is not how ThreadLocal store behaves.
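For comparison, here is a minimal sketch of the ThreadLocal approach (class names and the StringBuilder payload are illustrative, not from the question). ThreadLocal.withInitial gives each thread its own lazily created instance with no locking at all, and remove() addresses the servlet-container leak concern by clearing the value before a pooled thread is reused.

```java
public class ThreadLocalDemo {
    // One StringBuilder per thread; withInitial supplies the per-thread
    // instance lazily, with no explicit locking and no shared HashMap to guard.
    private static final ThreadLocal<StringBuilder> BUF =
            ThreadLocal.withInitial(StringBuilder::new);

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            StringBuilder sb = BUF.get(); // this thread's private instance
            sb.append(Thread.currentThread().getName());
            System.out.println(sb.length() > 0);
            BUF.remove(); // important in servlet containers: threads are pooled,
                          // so clear the value before the thread is reused
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}
```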
The problem is not ThreadLocal itself, but the way it's being used. See here for a detailed explanation. So, your own implementation won't make a difference.
Quick question about "best practices" in Java. Suppose you have a database object, with the primary data structure for the database as a map. Further, suppose you wanted to synchronize any getting/setting info for the map. Is it better to synchronize every method that accesses/modifies the map, or do you want to create sync blocks around the map every time it's modified/accessed?
Depends on the scope of your units of work that need to be atomic. If you have a process that performs multiple operations that represent a single change of state, then you want to synchronize that entire process on the Map object. If you are synchronizing each individual operation, multiple threads can still interleave with each other on reads and writes. It would be like using a database cursor in read-uncommitted mode. You might make a decision based on some other threads half-complete work, seeing an incomplete/incorrect data state.
(And of course insert obligatory suggestion to use classes from java.util.concurrent.locks instead of the synchronized keyword :) )
In the general case, for non-private methods it is better to synchronize on a private final Object than to declare the methods themselves synchronized. The rationale is that you do not want a rogue caller to acquire your lock (which, for a synchronized instance method, is the object itself). For private methods you have complete control over how they can be called.
Personally, I avoid synchronized methods and encapsulate the method in a synchronized() block instead. This gives me tighter control and prevents outside sources from stealing my monitor. I cannot think of cases where you would want to provide an outside source access to your monitor, but if you did you could instead pass them your lock object just the same. But like I said, I would avoid that.
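A minimal sketch of the private-lock idiom described above (class name is mine): because the monitor is a private final field, no outside code can synchronize on it and steal or hold it.

```java
public class CounterWithPrivateLock {
    // Private final lock: external callers cannot synchronize on it,
    // so no outside code can acquire (or deny us) our monitor.
    private final Object lock = new Object();
    private int count;

    public void increment() {
        synchronized (lock) {
            count++;
        }
    }

    public int get() {
        synchronized (lock) {
            return count;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CounterWithPrivateLock c = new CounterWithPrivateLock();
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) c.increment();
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        System.out.println(c.get());
    }
}
```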
In order to avoid race conditions, we can synchronize the write and access methods on the shared variables, locking those variables against other threads.
My question is: are there other (better) ways to avoid race conditions? Locking makes the program slow.
What I found are:
using Atomic classes, if there is only one shared variable.
using an immutable container for multiple shared variables and declaring the container reference volatile. (I found this method in the book "Java Concurrency in Practice")
I'm not sure if these perform faster than the synchronized approach; are there any other, better methods?
thanks
Avoid state.
Make your application as stateless as it is possible.
Each thread (sequence of actions) should take a context in the beginning and use this context passing it from method to method as a parameter.
When this technique does not solve all your problems, use the Event-Driven mechanism (+Messaging Queue).
When your code has to share something with other components it throws event (message) to some kind of bus (topic, queue, whatever).
Components can register listeners to listen for events and react appropriately.
In this case there are no race conditions (except inserting events to the queue). If you are using ready-to-use queue and not coding it yourself it should be efficient enough.
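A minimal sketch of the queue-based hand-off described above (class and event names are mine): producers only enqueue, and a single consumer owns all mutable state, so there is nothing to race on.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class EventBusDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Single consumer owns all state; producers only enqueue events,
        // so there is no shared mutable state to race on.
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 2; i++) {
                    String event = queue.take(); // blocks until an event arrives
                    System.out.println("handled " + event);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        queue.put("event-1");
        queue.put("event-2");
        consumer.join();
    }
}
```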
Also, take a look at the Actors model.
Atomics are indeed more efficient than classic locks due to their non-blocking behavior i.e. a thread waiting to access the memory location will not be context switched, which saves a lot of time.
Probably the best guideline when synchronization is needed is to see how you can reduce the critical section size as much as possible. General ideas include:
Use read-write locks instead of full locks when only a part of the threads need to write.
Find ways to restructure code in order to reduce the size of critical sections.
Use atomics when updating a single variable.
Note that some algorithms and data structures that traditionally need locks have lock-free versions (they are more complicated however).
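As a sketch of the first guideline (class name is mine): a ReentrantReadWriteLock lets any number of readers proceed concurrently while writers get exclusive access, which shrinks contention when reads dominate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadMostlyCache {
    private final Map<String, String> map = new HashMap<>();
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    public String get(String key) {
        rw.readLock().lock(); // many readers may hold this concurrently
        try {
            return map.get(key);
        } finally {
            rw.readLock().unlock();
        }
    }

    public void put(String key, String value) {
        rw.writeLock().lock(); // exclusive: blocks readers and other writers
        try {
            map.put(key, value);
        } finally {
            rw.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadMostlyCache cache = new ReadMostlyCache();
        cache.put("a", "1");
        System.out.println(cache.get("a"));
    }
}
```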
Well, first off, the Atomic classes don't actually lock: they combine volatile fields with compare-and-swap (CAS) hardware instructions, which is why they can be faster than doing it yourself by hand with synchronized.
Second, immutability works great for multi-threading; you no longer need monitor locks and such, but that's because you can only read your immutables, you can't modify them.
You can't get rid of synchronized/volatile if you want to avoid race conditions in a multithreaded Java program (i.e. if multiple threads can read AND WRITE the same data). Your best bet, if you want better performance, is to avoid at least some of the built-in thread-safe classes, which do a more generic sort of locking, and make your own implementation that is more tied to your context and thus might allow more granular synchronization & lock acquisition.
Check out this implementation of BlockingCache done by the Ehcache guys;
http://www.massapi.com/source/ehcache-2.4.3/src/net/sf/ehcache/constructs/blocking/BlockingCache.java.html
One of the alternatives is to make shared objects immutable. Check out this post for more details.
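A minimal sketch of that alternative, in the style of the "Java Concurrency in Practice" immutable-holder pattern (class names are mine): related values are frozen into one immutable object, and a single volatile reference is swapped atomically, so readers always see the pair in a consistent state without locking.

```java
public class ImmutableHolderDemo {
    // Immutable snapshot of two related values; threads always see the
    // pair in a consistent state because the whole object is replaced at once.
    static final class Range {
        final int low, high;
        Range(int low, int high) { this.low = low; this.high = high; }
    }

    // volatile guarantees readers see the latest fully constructed Range.
    private volatile Range range = new Range(0, 10);

    public void setRange(int low, int high) {
        range = new Range(low, high); // single atomic reference write
    }

    public boolean contains(int x) {
        Range r = range; // one read: low and high are guaranteed consistent
        return r.low <= x && x <= r.high;
    }

    public static void main(String[] args) {
        ImmutableHolderDemo d = new ImmutableHolderDemo();
        d.setRange(5, 15);
        System.out.println(d.contains(10));
        System.out.println(d.contains(2));
    }
}
```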
You can perform up to 50 million lock/unlock operations per second. If you want this to be more efficient, I suggest using coarser-grained locking, i.e. don't lock every little thing, but have locks for larger objects. Once you have many more locks than threads, you are less likely to have contention, and adding even more locks may just add overhead.
Suppose we have a class called AccountService that manages the state of accounts.
AccountService is defined as
interface AccountService {
    public void debit(Account account);
    public void credit(Account account);
    public void transfer(Account from, Account to);
}
Given this definition, what is the best way to implement transfer() so that you can guarantee that transfer is an atomic operation.
I'm interested in answers that reference Java 1.4 code as well as answers that might use resources from java.util.concurrent in Java 5
Synchronize on both Account objects and do the transfer. Make sure you always synchronize in the same order. In order to do so, make the Accounts implement Comparable, sort the two accounts, and synchronize in that order.
If you don't order the accounts, you run the possibility of deadlock if one thread transfers from A to B and another transfers from B to A.
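A minimal sketch of the ordered-locking idea (class and field names are mine; it assumes each Account has a unique id to compare on, as in the JCiP example):

```java
public class TransferDemo {
    static final class Account implements Comparable<Account> {
        final long id;
        long balance;
        Account(long id, long balance) { this.id = id; this.balance = balance; }
        public int compareTo(Account other) { return Long.compare(id, other.id); }
    }

    // Always lock the account with the smaller id first; every thread then
    // acquires the two monitors in the same global order, so no deadlock.
    static void transfer(Account from, Account to, long amount) {
        Account first  = from.compareTo(to) < 0 ? from : to;
        Account second = first == from ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance   += amount;
            }
        }
    }

    public static void main(String[] args) {
        Account a = new Account(1, 100);
        Account b = new Account(2, 100);
        transfer(a, b, 30); // locks a then b
        transfer(b, a, 10); // also locks a then b: same global order
        System.out.println(a.balance + " " + b.balance);
    }
}
```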
This exact example is discussed on page 207 of Java Concurrency in Practice, a critical book for anybody doing multi-threaded Java development. The example code is available from the publisher's website:
Dynamic lock-ordering deadlock. (bad)
Inducing a lock ordering to avoid deadlock.
A classic example very well explained here - http://www.javaworld.com/javaworld/jw-10-2001/jw-1012-deadlock.html?page=4
You probably need full transaction support (if it's a real application, of course).
The difficulty of the solution depends heavily on your environment. Describe your system in detail and we'll try to help you (what kind of application is it? does it use a web server? which web server? what is used to store data? and so on).
If you can guarantee that all accesses are made through the transfer method, then probably the easiest approach is just to make transfer a synchronized method. This will be thread-safe because this guarantees that only one thread will be running the transfer method at any one time.
If other methods may also access the AccountService, then you might decide to have them all use a single global lock. An easy way of doing this is to surround all code that accesses the AccountService in a synchronized (X) {...} block where X is some shared / singleton object instance (that could be the AccountService instance itself). This will be thread safe because only one thread will be accessing the AccountService at any one time, even if they are in different methods.
If that still isn't sufficient, then you'll need to use more sophisticated locking approaches. One common approach would be to lock the accounts individually before you modify them... but then you must be very careful to take the locks in a consistent order (e.g. by account ID) otherwise you will run into deadlocks.
Finally if AccountService is a remote service then you are into distributed locking territory.... unless you have a PhD in computer science and years of research budget to burn you should probably avoid going there.
Couldn't you avoid having to synchronize using an AtomicReference<Double> for the account balance, along with get() and set()?