Quick question about "best practices" in Java. Suppose you have a database object, with the primary data structure for the database as a map. Further, suppose you wanted to synchronize any getting/setting info for the map. Is it better to synchronize every method that accesses/modifies the map, or do you want to create sync blocks around the map every time it's modified/accessed?
Depends on the scope of your units of work that need to be atomic. If you have a process that performs multiple operations representing a single change of state, then you want to synchronize that entire process on the Map object. If you synchronize each individual operation separately, multiple threads can still interleave with each other on reads and writes. It would be like using a database cursor in read-uncommitted mode: you might make a decision based on another thread's half-complete work, seeing an incomplete or incorrect data state.
(And of course insert obligatory suggestion to use classes from java.util.concurrent.locks instead of the synchronized keyword :) )
In the general case, it is better to synchronize on a private final lock object in your non-private methods than to declare those methods synchronized. The rationale is that a synchronized instance method locks on this, which a rogue caller can also synchronize on from the outside and thereby hold your lock. For private methods you have complete control over how they can be called.
Personally, I avoid synchronized methods and encapsulate the method in a synchronized() block instead. This gives me tighter control and prevents outside sources from stealing my monitor. I cannot think of cases where you would want to provide an outside source access to your monitor, but if you did you could instead pass them your lock object just the same. But like I said, I would avoid that.
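As a minimal sketch of the pattern both answers describe (the class and field names here are invented for illustration), every public method synchronizes on a private final lock object rather than on this:

import java.util.HashMap;
import java.util.Map;

public class SafeRegistry {
    // Outside code cannot see this lock, so it cannot block our methods by accident.
    private final Object lock = new Object();
    private final Map<String, String> data = new HashMap<>();

    public String get(String key) {
        synchronized (lock) {
            return data.get(key);
        }
    }

    public void put(String key, String value) {
        synchronized (lock) {
            data.put(key, value);
        }
    }

    // A compound action must hold the lock for the whole unit of work;
    // otherwise other threads can interleave between the read and the write.
    public void putIfAbsent(String key, String value) {
        synchronized (lock) {
            if (!data.containsKey(key)) {
                data.put(key, value);
            }
        }
    }
}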
Related
I need to make a data structure keyed off of username and then some data (additional collections) in a POJO. The data needs to be thread safe.
So I'm thinking for the main structure, ConcurrentHashMap<String, MyPOJO>. For the operations I need to perform on MyPOJO, I may either just read it, or I may perform write operations on it.
Would the best approach be to do a get on the map and then operate on MyPOJO in a synchronized block? I assume I just need to put a synchronized block in the update methods and the read methods would automatically be blocked? Is that the best approach in a highly concurrent app? Or do I need to use something like ReadWriteLock on BOTH the get/set operations?
If I use something like StampedLock, each MyPOJO would need one, correct, so I can do record-level locking?
Thanks!
Would the best approach be to do a get on the map and then operate on MyPOJO in a synchronized block?
I assume that you mean a synchronized block on the MyPOJO instance itself (or a private lock owned by the instance).
My answer is yes, if you do it right.
I assume I just need to put a synchronized block in the update methods and the read methods would automatically be blocked?
No, that's not correct. All methods that access or update a mutable object would need to synchronize on the same lock.
If you don't synchronize for both reads and writes, you risk various thread-safety concerns, including problems with visibility of writes. Heisenbugs.
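As a rough sketch of what that can look like (the field is hypothetical), every method that reads or writes the mutable state synchronizes on the same per-instance lock:

public class MyPOJO {
    private final Object lock = new Object();
    private int score;   // stand-in for the real mutable fields/collections

    public int getScore() {
        synchronized (lock) {   // reads must lock too, for visibility
            return score;
        }
    }

    public void updateScore(int delta) {
        synchronized (lock) {
            score += delta;
        }
    }
}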
Is that the best approach in a highly concurrent app? Or do I need to use something like ReadWriteLock on BOTH the get/set operations?
It depends.
On the ReadWriteLock issue:
Unless it is likely that you will get significant lock contention on a specific MyPOJO instance, it is probably not worth the effort to optimize this.
If the access and update methods only hold the lock for a relatively short period of time, that reduces the impact of any contention.
More generally, I have a suspicion that you might be confusing "highly concurrent" with "highly scalable". Java multi-threading only performs up to the limit of the cores (and memory) on a single machine. Beyond that, clever tweaks to improve concurrency get you nowhere. To scale up further, you need to change the system architecture so that requests are handled by multiple JVM instances on different machines.
So ... to sum up ... ReadWriteLock might help if you have significant contention on individual MyPOJO instances AND there are likely to be a lot of parallel read operations on individual instances.
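If you did decide the contention justified it, the same MyPOJO rewritten with a per-instance ReentrantReadWriteLock might look roughly like this (again, the field is just a placeholder):

import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class MyPOJO {
    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
    private int score;   // hypothetical state

    public int getScore() {
        rwLock.readLock().lock();      // many readers may hold the read lock at once
        try {
            return score;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    public void updateScore(int delta) {
        rwLock.writeLock().lock();     // writers get exclusive access
        try {
            score += delta;
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}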
If I use something like StampedLock, each MyPOJO would need one correct, so I can do record level locking?
I doubt that there would be much benefit unless you have significant contention; see above. But yes, if you used a StampedLock per instance you would get record-level locking ... just like you would with any other per-instance locking.
FWIW: This smells to me of "premature optimization". Furthermore, if you expect that your solution will need to scale beyond a single JVM in the short to medium term, then it is arguably a waste of time to optimize the single JVM solution too much.
I have an ArrayList which I add items to, within a BroadcastReceiver callback.
However, the ArrayList will eventually be attached to an adapter, and then I wish to display the contents of the list on the screen.
The array contains peer information from a P2P app I'm working on so it will be subject to change frequently as devices drop in and out of connection/range.
So basically the ArrayList will be read from and written to frequently.
I come from a C++ background, so I would normally use a lock to protect my ArrayList when accessing it, but I'm unsure what I should use in Java/Android.
Any advice please.
Using a lock is never wrong. All synchronized does is use a lock under the hood. Some Java purists may complain, but you tend to get more flexibility out of just using a semaphore (and sometimes it's the only way to be correct). There are also some ugly corner cases to wait/notify that you have to really understand to get right, which semaphores simply avoid. If you're familiar with them, I wouldn't hesitate to use one just because you're in Java now.
Use a BlockingQueue instead of an ArrayList. It'll make your list thread safe. As per the documentation:
A Queue that additionally supports operations that wait for the queue to become non-empty when retrieving an element, and wait for space to become available in the queue when storing an element.
The synchronized keyword locks on whatever object is specified. If the method is marked as synchronized and it's an instance method, it locks on the enclosing instance. If the method is static, it locks on the Class object. If an object is specified in parentheses after the synchronized keyword in a synchronized block, the lock is held on that object. I would typically use a thread-safe collection like AndroidWarrior proposed, but if that's not possible, just make sure that your accessors and mutators lock on the same object.
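To make the three forms concrete, here is a small sketch (the class and methods are made up; note that the instance methods all lock on the same object, this):

import java.util.ArrayList;
import java.util.List;

public class PeerStore {
    private static int instancesCreated;                   // class-level state
    private final List<String> peers = new ArrayList<>();

    public PeerStore() {
        incrementInstances();
    }

    // Static synchronized method: locks on PeerStore.class.
    private static synchronized void incrementInstances() {
        instancesCreated++;
    }

    // Synchronized instance method: locks on the enclosing instance (this).
    public synchronized void addPeer(String peer) {
        peers.add(peer);
    }

    // Synchronized block naming the lock object explicitly; using this here
    // means accessors and mutators share one lock.
    public List<String> snapshot() {
        synchronized (this) {
            return new ArrayList<>(peers);
        }
    }
}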
Let's say I have a Set and another Queue. I want to check whether the set contains(element), and if not, add(element) to the queue. I want to do the two steps atomically.
One obvious way is to use synchronized blocks or Lock.lock()/unlock() methods. Under thread contention, these will cause context switches. Is there any simple design strategy for achieving this in a non-blocking manner? Maybe using some Atomic constructs?
I don't think you can rely on any mechanism, except the ones you pointed out yourself, simply because you're operating on two structures.
There's decent support for concurrent/atomic operations on one data structure (like "put if not exists" in a ConcurrentHashMap), but for a sequence of operations, you're stuck with either a lock or a synchronized block.
For some operations you can employ what is called a "safe sequence", where concurrent operations may overlap without conflicting. For instance, you might be able to add a member to a set (in theory) without the need to synchronize, since two threads simultaneously adding the same member do not conceptually conflict with each other.
But to query one object and then conditionally operate on a different object is a much more complicated scenario. If your sequence were to query the set, then conditionally insert the member into the set and into the queue, the query and first insert could be replaced with a compare-and-swap operation that syncs without stalling (except perhaps at the memory-access level), and then the member could be inserted into the queue based on the success of the first operation, only needing to synchronize the queue insert itself. However, this sequence still leaves a window where another thread could fail its insert into the set and yet not find the member in the queue.
Since the contended case is the relevant one here, you should look at "spin locks". They do not give up the CPU but spin on a flag, expecting the flag to become free very soon.
Note, however, that real spin locks are seldom useful in Java because the normal Lock is quite good. See this blog post where someone first implemented a spin lock in Java, only to find that after some corrections (i.e. after making the test correct) spin locks are on par with the standard classes.
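For reference, a bare-bones spin lock can be built on an AtomicBoolean; this is only a sketch of the idea, and as noted above a ReentrantLock is usually at least as good in practice:

import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // Busy-wait until we win the compare-and-set; the thread never yields the CPU.
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait();   // spin hint, available since Java 9
        }
    }

    public void unlock() {
        locked.set(false);
    }
}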
You can use java.util.concurrent.ConcurrentHashMap to get the semantics you want. It has a putIfAbsent that does an atomic insert. You then essentially try to add an element to the map, and if it succeeds, you know that the thread that performed the insert is the only one that has, and you can then put the item in the queue safely. The other significant point here is that the operations on a ConcurrentMap ensure "happens-before" semantics.
ConcurrentMap<Element,Boolean> set = new ConcurrentHashMap<Element,Boolean>();
Queue<Element> queue = ...;

void maybeAddToQueue(Element e) {
    if (set.putIfAbsent(e, true) == null) {
        queue.offer(e);
    }
}
Note, the actual value type (Boolean) of the map is unimportant here.
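Since the value type is just a placeholder, an equivalent sketch on Java 8+ uses a concurrent set view instead, where add returns false if the element was already present (same structure as the snippet above, with a ConcurrentLinkedQueue assumed for the queue):

import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

Set<Element> set = ConcurrentHashMap.newKeySet();
Queue<Element> queue = new ConcurrentLinkedQueue<>();

void maybeAddToQueue(Element e) {
    if (set.add(e)) {   // atomic: only the thread that actually inserted e sees true
        queue.offer(e);
    }
}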
In order to avoid race conditions, we can synchronize the write and access methods on the shared variables, locking those variables against other threads.
My question is whether there are other (better) ways to avoid race conditions? Locks make the program slow.
What I found are:
using Atomic classes, if there is only one shared variable.
using an immutable container for multiple shared variables and declaring the container reference volatile. (I found this method in the book "Java Concurrency in Practice")
I'm not sure if they perform faster than the synchronized way; are there any other better methods?
thanks
Avoid state.
Make your application as stateless as possible.
Each thread (sequence of actions) should take a context at the beginning and use this context, passing it from method to method as a parameter.
When this technique does not solve all your problems, use an event-driven mechanism (plus a messaging queue).
When your code has to share something with other components, it publishes an event (message) to some kind of bus (topic, queue, whatever).
Components can register listeners to listen for events and react appropriately.
In this case there are no race conditions (except inserting events into the queue). If you are using a ready-made queue rather than coding it yourself, it should be efficient enough.
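A minimal sketch of the queue idea (all class and field names here are invented): producers only publish events, and a single consumer thread owns the mutable state, so that state needs no lock at all.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class EventBusExample {
    // The queue is the only shared structure; it handles its own synchronization.
    private final BlockingQueue<String> events = new ArrayBlockingQueue<>(1024);
    // Touched only by the consumer thread, so it needs no lock.
    private final Map<String, Integer> counts = new HashMap<>();

    public void publish(String event) throws InterruptedException {
        events.put(event);   // may be called from any producer thread
    }

    public void startConsumer() {
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String event = events.take();
                    counts.merge(event, 1, Integer::sum);   // single-threaded update
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }
}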
Also, take a look at the Actor model.
Atomics are indeed more efficient than classic locks due to their non-blocking behavior, i.e. a thread waiting to access the memory location will not be context-switched, which saves a lot of time.
Probably the best guideline when synchronization is needed is to see how you can reduce the critical section size as much as possible. General ideas include:
Use read-write locks instead of full locks when only a part of the threads need to write.
Find ways to restructure code in order to reduce the size of critical sections.
Use atomics when updating a single variable.
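For the single-variable case, a small illustrative example using AtomicLong (the class name is arbitrary):

import java.util.concurrent.atomic.AtomicLong;

public class HitCounter {
    private final AtomicLong hits = new AtomicLong();

    public void record() {
        hits.incrementAndGet();   // lock-free atomic read-modify-write
    }

    public long current() {
        return hits.get();        // volatile read: always sees the latest value
    }
}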
Note that some algorithms and data structures that traditionally need locks have lock-free versions (they are more complicated however).
Well, first off, the Atomic classes don't take monitor locks the way you would by hand; they combine volatile memory semantics with low-level compare-and-swap operations, which is why they can be cheaper than explicit synchronization.
Second, immutability works great for multi-threading; you no longer need monitor locks and such, but that's because you can only read your immutables, you can't modify them.
You can't get rid of synchronized/volatile if you want to avoid race conditions in a multithreaded Java program (i.e. if multiple threads can read AND WRITE the same data). Your best bet, if you want better performance, is to avoid at least some of the built-in thread-safe classes, which do a more generic sort of locking, and make your own implementation that is more tied to your context and thus might allow you to use more granular synchronization and lock acquisition.
Check out this implementation of BlockingCache done by the Ehcache guys:
http://www.massapi.com/source/ehcache-2.4.3/src/net/sf/ehcache/constructs/blocking/BlockingCache.java.html
One of the alternatives is to make shared objects immutable. Check out this post for more details.
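A small sketch of the immutable-container idea mentioned in the question (the pattern follows "Java Concurrency in Practice"; the fields are invented, and it assumes overwrites by concurrent writers are acceptable, e.g. a single writer thread):

public class Tracker {
    // Immutable snapshot of values that must stay consistent with each other.
    private static final class Range {
        final int lower;
        final int upper;
        Range(int lower, int upper) {
            this.lower = lower;
            this.upper = upper;
        }
    }

    // volatile guarantees readers always see the latest, fully constructed Range.
    private volatile Range range = new Range(0, 0);

    public void setRange(int lower, int upper) {
        range = new Range(lower, upper);   // replace the whole snapshot in one write
    }

    public boolean contains(int value) {
        Range r = range;                   // one read, so lower and upper are consistent
        return r.lower <= value && value <= r.upper;
    }
}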
You can perform up to 50 million lock/unlocks per second. If you want this to be more efficient, I suggest using coarser-grained locking, i.e. don't lock every little thing, but have locks for larger objects. Once you have many more locks than threads, you are less likely to have contention, and having still more locks may just add overhead.
If multiple threads are updating the same variable, what should I do so each thread updates the variable correctly?
Any help would be greatly appreciated
There are several options:
1) Using no synchronization at all
This can only work if the data is of primitive type (not long/double), and you don't care about reading stale values (which is unlikely)
2) Declaring the field as volatile
This will guarantee that stale values are never read. It also works fine for objects (assuming the objects aren't changed after creation), because of the happens-before guarantees of volatile variables (See "Java Memory Model").
3) Using java.util.concurrent.AtomicLong, AtomicInteger etc
They are all thread safe, and support special operations like atomic incrementation and atomic compare-and-set operations.
4) Protecting reads and writes with the same lock
This approach provides mutual exclusion, which allows defining a large atomic operation, where multiple data members are manipulated as a single operation.
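A hedged sketch contrasting options 2) through 4) (the class is made up for illustration):

import java.util.concurrent.atomic.AtomicInteger;

public class Counters {
    // Option 2: volatile gives visibility, but compound updates like x++ are
    // still not atomic, so it suits flags or single-writer values.
    private volatile boolean shutdownRequested;

    // Option 3: atomic read-modify-write without an explicit lock.
    private final AtomicInteger requests = new AtomicInteger();

    // Option 4: one lock guarding several fields that must change together.
    private final Object lock = new Object();
    private long total;
    private long count;

    public void requestShutdown() { shutdownRequested = true; }
    public boolean isShutdownRequested() { return shutdownRequested; }

    public void recordRequest() { requests.incrementAndGet(); }

    public void addSample(long value) {
        synchronized (lock) {    // both fields update as one atomic step
            total += value;
            count++;
        }
    }

    public double average() {
        synchronized (lock) {    // reads use the same lock for a consistent view
            return count == 0 ? 0.0 : (double) total / count;
        }
    }
}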
This is a major problem with multi-threaded applications, and spans more than I could really cover in an answer, so I'll point you to some resources.
http://download.oracle.com/javase/tutorial/essential/concurrency/sync.html
http://www.vogella.de/articles/JavaConcurrency/article.html#concurrencyjava_synchronized
Essentially, you use the synchronized keyword to guard a piece of code with a lock on an object. This makes sure that the piece of code is only being run by one thread at a time. You can also synchronize on the same object in multiple places.
Additionally, you need to look out for several pitfalls, such as Deadlock.
http://tutorials.jenkov.com/java-concurrency/deadlock.html
Errors caused by misuse of locks are often very difficult to debug and track down, because they aren't reliably reproducible. So you always need to be careful to put all of your locks in the correct locations.
You should implement locking on the variable in question.
E.g.
http://download.oracle.com/javase/tutorial/essential/concurrency/newlocks.html