Multithreading with classes - java

This is a bit of an interesting question but I wanted to know everyone's thoughts on this design pattern.
public class MyThreadedMap {
    private ConcurrentHashMap<Integer, Object> map;
    ...

    public class Wrapper {
        public Object get(int index) {
            return map.get(index);
        }
    }
}
At this point multiple threads will each have their own instance of Wrapper and will access the map with wrapper.get(index).
I found that the performance difference between having the wrapper and not having it is marginal; the wrapper helps only slightly. When I place synchronized on the get method there is a serious performance hit.
What exactly is happening here? When an inner class is instantiated am I creating a copy of that get method for each instance? Would it be best if I just left the wrapper out since there is no real performance gain?

ConcurrentHashMap has fancy ways of minimizing synchronization overhead. When you synchronize the get method, it imposes normal synchronization overhead, thus the performance hit.
If there is no other code in the Wrapper class, I would just leave it out as it doesn't appear to add anything.
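For illustration, a minimal sketch of the two variants being discussed, assuming a stripped-down MyThreadedMap (the slowGet name is made up here): delegating straight to ConcurrentHashMap lets its internal locking do the work, while adding synchronized funnels every caller through a single monitor.
import java.util.concurrent.ConcurrentHashMap;

public class MyThreadedMap {
    private final ConcurrentHashMap<Integer, Object> map = new ConcurrentHashMap<>();

    // Delegates straight to ConcurrentHashMap: concurrent readers do not block each other.
    public Object get(int index) {
        return map.get(index);
    }

    // Adding synchronized serializes all callers on 'this', which is where the
    // serious performance hit described above comes from.
    public synchronized Object slowGet(int index) {
        return map.get(index);
    }
}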

Related

Is "double checked locking" broken here in java?

I found an example of double-checked locking.
However, I think this example is invalid, because another thread may see a non-null reference to the DoorControlManager object for door 1 but see the default values for its fields rather than the values set in the constructor.
(Ref: https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html)
Could you let me know whether I am right?
Thanks a lot!
public class DoorControlManager {
    private static HashMap<Integer, DoorControlManager> mInstances = new HashMap<>();

    public static DoorControlManager getInstance(int door) {
        if (!mInstances.containsKey(door)) {
            synchronized (mInstances) {
                if (!mInstances.containsKey(door)) {
                    mInstances.put(door, new DoorControlManager(door));
                }
            }
        }
        return mInstances.get(door);
    }
    ...
}
Yes, this code is broken, though not for the normal reason.
In this case, you have different threads accessing a HashMap without proper synchronization. Since HashMap is not a thread-safe class, this is not thread-safe. It is possible that the first containsKey call will see stale values in the internals of the map and behave in unspecified (implementation-dependent) ways.
Making "simple" changes to concurrency sensitive code can completely destroy the properties that make the original version thread-safe. If you are going to attempt to write "clever" code like this, you need to have a deep understanding of Java concurrency ... and how the Java Memory Model really works.
There are a couple of ways that this code could be written correctly:
Use a ConcurrentHashMap and implement the getInstance method as:
return mInstances.computeIfAbsent(
        door, key -> new DoorControlManager(key));
Keep using a HashMap and don't use the DCL pattern. Simply lock before testing.
Note that the DCL initialization pattern in Java 5+ is not broken, provided that you are initializing a single field and the field is declared as volatile. But there are other (better) ways to achieve the same effect, so its use is not recommended.
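For reference, a minimal sketch of that Java 5+ single-field DCL idiom (a plain lazy singleton, not the per-door map from the question; the class name is illustrative), with the field declared volatile:
public class LazyHolderExample {
    // volatile is what makes this double-checked locking variant safe on Java 5+.
    private static volatile LazyHolderExample instance;

    private LazyHolderExample() { }

    public static LazyHolderExample getInstance() {
        LazyHolderExample result = instance;       // first check, no lock
        if (result == null) {
            synchronized (LazyHolderExample.class) {
                result = instance;                  // second check, under the lock
                if (result == null) {
                    instance = result = new LazyHolderExample();
                }
            }
        }
        return result;
    }
}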

anonymous-inner-classes vs static field

I prefer to use a static field for instances of classes that don't store state in their fields, instead of anonymous inner classes. I think this is good practice for reducing memory and GC usage if sort (or another such method) is called very often. But my colleague prefers to use anonymous inner classes for this case, saying that the JIT will optimize it.
class MyClass {
    // ... other fields of the class

    /*access modifier*/ final static Comparator<MyClass> comparator = new Comparator<MyClass>() {
        public int compare(MyClass o1, MyClass o2) {
            // comparing logic
        }
    };
}
Usage example (I prefer):
List<MyClass> list = ...;
Collections.sort(list, MyClass.comparator);
Usage example (my colleague prefer):
List<MyClass> list = ...;
Collections.sort(list, new Comparator<MyClass>() {
    public int compare(MyClass o1, MyClass o2) {
        // comparing logic
    }
});
1. Are anonymous inner classes optimized in OpenJDK?
2. Please tell me the good practice for this case.
I think this is good practice for reducing memory and GC usage if sort (or another such method) is called very often.
Well it's the other way round. If you are bothered about memory, the static fields will be there in memory until the class is unloaded.
However, the concern here is more one of readability than of memory or performance. If you find yourself using a Comparator instance maybe 2-3 times or more, it's better to store it in a field to avoid repeating the code. Even better, mark the field final. If you are going to use it only once, there is no point in storing it as a static field.
But my colleague prefers to use anonymous inner classes for this case, saying that the JIT will optimize it.
I don't understand what kind of optimization your colleague is talking about. You should ask him/her for further clarification. IMO, this is just premature optimization, and you should really not be bothered.
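To illustrate the "store a reused Comparator in a final field" advice, a small sketch assuming a hypothetical MyClass with an int field value (field and class names here are illustrative); on Java 8+ the same comparator could also be written as Comparator.comparingInt(m -> m.value):
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class MyClass {
    final int value;

    MyClass(int value) {
        this.value = value;
    }

    // One reusable instance, created when the class is initialized.
    static final Comparator<MyClass> BY_VALUE = new Comparator<MyClass>() {
        @Override
        public int compare(MyClass o1, MyClass o2) {
            return Integer.compare(o1.value, o2.value);
        }
    };
}

class SortingExample {
    static void sortByValue(List<MyClass> list) {
        Collections.sort(list, MyClass.BY_VALUE);
    }
}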

how to locally synchronize two maps?

For instance,
class Test {
    static Map a = new ...
    static Map b = new ...

    public void process() {
        ...
        a.put(...);
        b.put(...);
    }
}
Do I have to lock like this:
synchronized (a) {
    a.put(...);
}
synchronized (b) {
    b.put(...);
}
This seems to be awkward. Any other right way to do this? Thanks.
No, you need both operations in one synchronized block; otherwise another thread may see inconsistencies between the two maps.
One possible option would be a synchronized method, or you could use some other private object or one of the maps as a monitor. Here is the synchronized method example:
class Test {
    static Map a = new ...
    static Map b = new ...

    public synchronized void process() {
        ...
        a.put(...);
        b.put(...);
    }
}
You can use a dedicated object like
Object mapLock = new Object();
to synchronize on.
Or you can synchronize on a, keeping in mind that even when you only need access to b, you still have to synchronize on a.
Synchronizing on this is not a good idea in general. It is a bad habit, and doing so may eventually result in bad performance or non-obvious deadlocks, if not in this application then in others you write.
Avoid synchronized(this) in Java?
You can also consider using a ReadWriteLock from the concurrency package.
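A sketch of the dedicated-lock approach described above, assuming the Test class from the question with String keys and values for illustration; the lock is static because the maps are static:
import java.util.HashMap;
import java.util.Map;

class Test {
    private static final Object MAP_LOCK = new Object();
    private static final Map<String, String> a = new HashMap<>();
    private static final Map<String, String> b = new HashMap<>();

    public void process(String key, String value) {
        // Both puts happen under the same lock, so no thread can observe
        // a state where only one of the two maps has been updated.
        synchronized (MAP_LOCK) {
            a.put(key, value);
            b.put(key, value);
        }
    }
}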
You do need to run both operations within one synchronized block. It's worth noting that in your example you've defined the maps statically, while the process() method is an instance method. Synchronizing the method will mean that calls to the same instance are synchronized, but calls to two different instances will not be (as the lock used when applying the synchronized keyword to a method is effectively this). You could either make the process() method static, or use a synchronized (Test.class) {} block instead to ensure that there's no racing happening.
You will also need to be careful about how you expose the maps to clients - if you're offering them up as properties, then I would probably wrap them with Collections.unmodifiableMap() to ensure that nothing else can go and screw with them while you're not looking. However, that doesn't entirely protect clients from having an "odd" time, as they will still see changes happen in potentially unsafe ways. As such, I'd also probably declare the types as ConcurrentHashMap to make things a little safer (although there are still some dangerous operations, such as sharing an Iterator between threads).

Determining synchronization scope?

In trying to improve my understanding of concurrency issues, I am looking at the following scenario (edit: I've changed the example from List to Runtime, which is closer to what I am trying to do):
public class Example {
    private final Object lock = new Object();
    private final Runtime runtime = Runtime.getRuntime();

    public void add(Object o) {
        synchronized (lock) { runtime.exec(program + " -add " + o); }
    }

    public Object[] getAll() {
        synchronized (lock) { return runtime.exec(program + " -list "); }
    }

    public void remove(Object o) {
        synchronized (lock) { runtime.exec(program + " -remove " + o); }
    }
}
As it stands, each method is by itself thread-safe when used standalone. Now, what I'm trying to figure out is how to handle the case where the calling class wishes to do this:
for (Object o : example.getAll()) {
    // problems if multiple threads perform this operation concurrently
    example.remove(o);
}
But as noted, there is no guarantee that the state will be consistent between the call to getAll() and the calls to remove(). If multiple threads call this, I'll be in trouble. So my question is - How should I enable the developer to perform the operation in a thread safe manner? Ideally I wish to enforce the thread safety in a way that makes it difficult for the developer to avoid/miss, but at the same time not complicated to achieve. I can think of three options so far:
A: Make the lock 'this', so the synchronization object is accessible to calling code, which can then wrap the code blocks. Drawback: Hard to enforce at compile time:
synchronized (example) {
    for (Object o : example.getAll()) {
        example.remove(o);
    }
}
B: Place the combined code into the Example class - and benefit from being able to optimize the implementation, as in this case. Drawback: a pain to add extensions, and potential mixing of unrelated logic:
public class Example {
    ...
    public void removeAll() {
        synchronized (lock) { runtime.exec(program + " -clear"); }
    }
}
C: Provide a Closure class. Drawback: Excess code, potentially too generous of a synchronization block, could in fact make deadlocks easier:
public interface ExampleClosure {
    public void execute(Example example);
}

public class Example {
    ...
    public void execute(ExampleClosure closure) {
        synchronized (this) { closure.execute(this); }
    }
}

example.execute(new ExampleClosure() {
    public void execute(Example example) {
        for (Object o : example.getAll()) {
            example.remove(o);
        }
    }
});
Is there something I'm missing? How should synchronization be scoped to ensure the code is thread safe?
Use a ReentrantReadWriteLock which is exposed via the API. That way, if someone needs to synchronize several API calls, they can acquire a lock outside of the method calls.
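A rough sketch of that suggestion, assuming Example wraps an internal list (the field and method names here are illustrative, not from the original code); a caller that needs getAll() and remove() to act as one unit takes the write lock around both calls, and since the lock is reentrant the nested acquisitions inside Example still work:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Example {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Object> items = new ArrayList<>();

    // Exposed so callers can hold the lock across several method calls.
    public ReentrantReadWriteLock getLock() {
        return lock;
    }

    public Object[] getAll() {
        lock.readLock().lock();
        try {
            return items.toArray();
        } finally {
            lock.readLock().unlock();
        }
    }

    public void remove(Object o) {
        lock.writeLock().lock();
        try {
            items.remove(o);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
A caller wanting the compound operation would call example.getLock().writeLock().lock(), run the getAll()/remove() loop inside a try block, and unlock in a finally block.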
In general, this is a classic multithreaded design issue. By synchronizing the data structure rather than synchronizing concepts that use the data structure, it's hard to avoid the fact that you essentially have a reference to the data structure without a lock.
I would recommend that locks not be done so close to the data structure. But it's a popular option.
A potential technique to make this style work is to use an editing tree-walker. Essentially, you expose a function that does a callback on each element.
// pointer to function:
// - takes Object by reference and can be safely altered
// - if returns true, Object will be removed from list
typedef bool (*callback_function)(Object *o);
public void editAll(callback_function func) {
    synchronized (lock) {
        for each element o { if (func(o)) { remove o } }
    }
}
So then your loop becomes:
bool my_function(Object *o) {
...
if (some condition) return true;
}
...
editAll(my_function);
...
The company I work for (Corensic) has test cases extracted from real bugs to verify that Jinx is finding the concurrency errors properly. This type of low-level data structure locking without higher-level synchronization is a pretty common pattern. The tree-editing callback seems to be a popular fix for this race condition.
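A rough Java rendering of that editing-callback idea, assuming Example keeps an internal list; the EditCallback interface and the items field are made-up names for illustration:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class Example {
    private final Object lock = new Object();
    private final List<Object> items = new ArrayList<>();

    /** Return true from edit() if the element should be removed. */
    public interface EditCallback {
        boolean edit(Object o);
    }

    public void editAll(EditCallback callback) {
        synchronized (lock) {
            // The whole walk happens under one lock, so nothing can change
            // the list between inspecting an element and removing it.
            for (Iterator<Object> it = items.iterator(); it.hasNext(); ) {
                if (callback.edit(it.next())) {
                    it.remove();
                }
            }
        }
    }
}
On Java 8+ the caller's loop then collapses to something like example.editAll(o -> shouldRemove(o)), where shouldRemove is whatever condition applies.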
I think everyone is missing his real problem. When iterating over the new array of Objects and trying to remove one at a time, the operation is still technically unsafe (though the ArrayList implementation would not explode, it just wouldn't have the expected results).
Even with CopyOnWriteArrayList there is the possibility of an out-of-date read on the current list by the time you are trying to remove.
The two suggestions you offered are fine (A and B). My general suggestion is B. Making a collection thread-safe is very difficult. A good way to do it is to give the client as little functionality as possible (within reason). So offering the removeAll method and removing the getAll method would suffice.
Now you could at the same time say, 'well, I want to keep the API the way it is and let the client worry about additional thread-safety.' If that's the case, document the thread-safety. Document the fact that a 'look up and modify' action is both non-atomic and non-thread-safe.
Today's concurrent list implementations are all thread-safe for the single functions that are offered (get, remove and add are all thread-safe). Compound functions are not, though, and the best that can be done is documenting how to make them thread-safe.
I think j.u.c.CopyOnWriteArrayList is a good example of a similar problem to the one you're trying to solve.
JDK had a similar problem with Lists - there were various ways to synchronize on arbitrary methods, but no synchronization on multiple invocations (and that's understandable).
So CopyOnWriteArrayList actually implements the same interface but has a very special contract, and whoever calls it, is aware of it.
Similar with your solution - you should probably implement List (or whatever interface this is) and at the same time define special contracts for existing/new methods. For example, getAll's consistency is not guaranteed, and calls to .remove do not fail if o is null or isn't inside the list, etc. If users want both combined and safe/consistent operations, this class of yours would provide a special method that does exactly that (e.g. safeDeleteAll), leaving the other methods as close to the original contract as possible.
So to answer your question - I would pick option B, but would also implement the interface your original object is implementing.
From the Javadoc for List.toArray():
The returned array will be "safe" in that no references to it are maintained by this list. (In other words, this method must allocate a new array even if this list is backed by an array). The caller is thus free to modify the returned array.
Maybe I don't understand what you're trying to accomplish. Do you want the Object[] array to always be in-sync with the current state of the List? In order to achieve that, I think you would have to synchronize on the Example instance itself and hold the lock until your thread is done with its method call AND any Object[] array it is currently using. Otherwise, how will you ever know if the original List has been modified by another thread?
You have to use the appropriate granularity when you choose what to lock. What you're complaining about in your example is too low a level of granularity, where the lock doesn't cover all the methods that have to happen together. You need to make methods that combine all the actions that need to happen together within the same lock.
Locks are reentrant so the high-level method can call low-level synchronized methods without a problem.
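A tiny sketch of that last point, with illustrative names: because intrinsic locks are reentrant, a high-level synchronized method can freely call the low-level synchronized ones while holding the same lock for the whole compound action.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class Registry {
    private final List<Object> items = new ArrayList<>();

    public synchronized Object[] getAll() {
        return items.toArray();
    }

    public synchronized void remove(Object o) {
        items.remove(o);
    }

    // High-level compound operation: the lock on 'this' is held for the whole
    // loop, and the nested getAll()/remove() calls reacquire it reentrantly.
    public synchronized void removeMatching(Predicate<Object> condition) {
        for (Object o : getAll()) {
            if (condition.test(o)) {
                remove(o);
            }
        }
    }
}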

Generating singletons

This might sound like a weird idea and I haven't thought it through properly yet.
Say you have an application that ends up requiring a certain number of singletons to do some I/O for example. You could write one singleton and basically reproduce the code as many times as needed.
However, as programmers we're supposed to come up with inventive solutions that avoid redundancy or repetition of any kind. What would be a solution for making multiple somethings that could each act as a singleton?
P.S: This is for a project where a framework such as Spring can't be used.
You could introduce an abstraction like this:
public abstract class Singleton<T> {
    private T object;

    public synchronized T get() {
        if (object == null) {
            object = create();
        }
        return object;
    }

    protected abstract T create();
}
Then for each singleton, you just need to write this:
public final Singleton<Database> database = new Singleton<Database>() {
    @Override
    protected Database create() {
        // connect to the database, return the Database instance
    }
};

public final Singleton<LogCluster> logs = new Singleton<LogCluster>() {
    ...
Then you can use the singletons by writing database.get(). If the singleton hasn't been created, it is created and initialized.
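For example (continuing the hypothetical database field above):
Database db = database.get();    // first call creates and caches the instance
Database again = database.get(); // later calls return the same object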
The reason people probably don't do this, and prefer to just repeatedly write something like this:
private Database database;

public synchronized Database getDatabase() {
    if (database == null) {
        // connect to the database, assign the database field
    }
    return database;
}

private LogCluster logs;

public synchronized LogCluster getLogs() {
    ...
Is because in the end it is only one more line of code for each singleton, and the chance of getting the initialize-singleton pattern wrong is pretty low.
However, as programmers we're supposed to come up with inventive solutions that avoid redundancy or repetition of any kind.
That is not correct. As programmers, we are supposed to come up with solutions that meet the following criteria:
meet the functional requirements; e.g. perform as required without bugs,
are delivered within the mandated timeframe,
are maintainable; e.g. the next developer can read and modify the code,
performs fast enough for the task in hand, and
can be reused in future tasks.
(These criteria are roughly ordered by decreasing priority, though different contexts may dictate a different order.)
Inventiveness is NOT a requirement, and "avoid[ing] redundancy or repetition of any kind" is not either. In fact both of these can be distinctly harmful ... if the programmer ignores the real criteria.
Bringing this back to your question. You should only be looking for alternative ways to do singletons if it is going to actually make the code more maintainable. Complicated "inventive" solutions may well return to bite you (or the people who have to maintain your code in the future), even if they succeed in reducing the number of lines of repeated code.
And as others have pointed out (e.g. @BalusC), current thinking is that the singleton pattern should be avoided in a lot of classes of application.
There does exist a multiton pattern. Regardless, I am 60% certain that the real solution to the original problem is an RDBMS.
@BalusC is right, but I will say it more strongly: Singletons are evil in all contexts.
Webapps, desktop apps, etc. Just don't do it.
All a singleton is in reality is a global wad of data. Global data is bad. It makes proper unit testing impossible. It makes tracing down weird bugs much, much harder.
The Gang of Four book is flat out wrong here. Or at least obsolete by a decade and a half.
If you want only one instance, have a factory that makes only one. It's easy.
How about passing a parameter to the function that creates the singleton (for example, its name or specialization), so that it knows to create one singleton for each unique parameter?
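A hedged Java sketch of that parameterized approach (sometimes called a multiton), keyed by the parameter; the class and field names here are illustrative:
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class IoChannel {
    private static final ConcurrentMap<String, IoChannel> INSTANCES = new ConcurrentHashMap<>();

    private final String name;

    private IoChannel(String name) {
        this.name = name;
    }

    // One instance per distinct name, created lazily and thread-safely.
    public static IoChannel getInstance(String name) {
        return INSTANCES.computeIfAbsent(name, IoChannel::new);
    }
}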
I know you asked about Java, but here is a solution in PHP as an example:
abstract class Singleton
{
    protected function __construct()
    {
    }

    final public static function getInstance()
    {
        static $instances = array();
        $calledClass = get_called_class();

        if (!isset($instances[$calledClass]))
        {
            $instances[$calledClass] = new $calledClass();
        }

        return $instances[$calledClass];
    }

    final private function __clone()
    {
    }
}
Then you just write:
class Database extends Singleton {}
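A rough Java analogue of the same per-class idea, sketched under the assumption that each participating class has an accessible no-arg constructor (the Singletons helper name is made up):
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class Singletons {
    private static final ConcurrentMap<Class<?>, Object> INSTANCES = new ConcurrentHashMap<>();

    private Singletons() { }

    @SuppressWarnings("unchecked")
    public static <T> T getInstance(Class<T> type) {
        // One instance per concrete class, created on first request.
        return (T) INSTANCES.computeIfAbsent(type, t -> {
            try {
                return t.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + t, e);
            }
        });
    }
}
Usage would then be something like Database db = Singletons.getInstance(Database.class);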
