Will JIT optimize this code? Is synchronization required? - java

Below is a class that holds a map of misspelled to correctly spelled terms. The map is updated periodically by a Quartz job that calls updateCache(). The updateCache method processes the keys and values of the input map and stores them in a temporary map object. After the processing is complete (after the for loop), it assigns the temporary map to the instance field misspelledToCorrectlySpelled.
package com.test;
import java.util.HashMap;
import java.util.Map;
import org.checkthread.annotations.ThreadSafe;
@ThreadSafe
public class SpellCorrectListCacheManager {
    private Map<String, String> misspelledToCorrectlySpelled =
            new HashMap<String, String>(0);
    /*
     * invoked by a quartz job thread
     */
    public void updateCache(Map<String, String> map) {
        Map<String, String> tempMap = new HashMap<String, String>(map.size());
        for (Map.Entry<String, String> entry : map.entrySet()) {
            // process key and values
            String key = entry.getKey().toLowerCase();
            String value = entry.getValue().toLowerCase();
            tempMap.put(key, value);
        }
        // update local variable
        this.misspelledToCorrectlySpelled = tempMap;
    }
    /*
     * Could be invoked by *multiple* threads
     */
    public Map<String, String> getMisspelledToCorrectlySpelled() {
        return misspelledToCorrectlySpelled;
    }
}
Question 1: Will the JIT optimize this code?
Actual Code
/*
 * since tempMap is assigned to misspelledToCorrectlySpelled and not
 * used anywhere else, will JIT remove tempMap as shown in the optimized
 * version below?
 */
Map<String, String> tempMap = new HashMap<String, String>(map.size());
for (Map.Entry<String, String> entry : map.entrySet()) {
    // process key and values
    String key = entry.getKey().toLowerCase();
    String value = entry.getValue().toLowerCase();
    tempMap.put(key, value);
}
this.misspelledToCorrectlySpelled = tempMap;
Optimized Code
this.misspelledToCorrectlySpelled = new HashMap<String, String>(map.size());
for (Map.Entry<String, String> entry : map.entrySet()) {
    // process key and values
    String key = entry.getKey().toLowerCase();
    String value = entry.getValue().toLowerCase();
    this.misspelledToCorrectlySpelled.put(key, value);
}
Question 2: Assuming JIT won't optimize the code, should method getMisspelledToCorrectlySpelled be synchronized?
/*
* is this assignment atomic operation?
*
* Does this need to be synchronized?
*
* By not synchronizing, the new map may not
* be visible to other threads *immediately* -- this is
* ok since the new map will be visible after a bit of time
*
*/
this.misspelledToCorrectlySpelled = tempMap;
}

You should use an AtomicReference to store the new map, in order to avoid synchronization and visibility problems. But the biggest problem in your code is that you give several threads access to a mutable map that is not thread-safe. You should wrap your map in an unmodifiable map:
private AtomicReference<Map<String, String>> misspelledToCorrectlySpelled =
        new AtomicReference<Map<String, String>>(Collections.unmodifiableMap(new HashMap<String, String>(0)));
/*
 * invoked by a quartz job thread
 */
public void updateCache(Map<String, String> map) {
    Map<String, String> tempMap = new HashMap<String, String>(map.size());
    for (Map.Entry<String, String> entry : map.entrySet()) {
        // process key and values
        String key = entry.getKey().toLowerCase();
        String value = entry.getValue().toLowerCase();
        tempMap.put(key, value);
    }
    // update local variable
    this.misspelledToCorrectlySpelled.set(Collections.unmodifiableMap(tempMap));
}
/*
 * Could be invoked by *multiple* threads
 */
public Map<String, String> getMisspelledToCorrectlySpelled() {
    return misspelledToCorrectlySpelled.get();
}
To answer your question about JIT optimization: no, the JIT won't remove the temp map usage.

Synchronization is required where synchronization is required. Nothing about the JIT can be assumed except that a compliant implementation will be compliant with the JLS and Java Memory Model and will honor the code designed with these rules. (There are multiple methods of synchronization, not all utilize the synchronized keyword.)
Synchronization is required here unless it's "okay" that a stale version is seen. (That is likely not the case here, and it could be a very stale version with caches and all, so not something to bet on!) The assignment of the reference itself is atomic insofar as no "partial write" can occur, but without a volatile field, a lock, or some other happens-before relationship it is not guaranteed to be propagated ("visible") to other threads immediately, or indeed at any particular time.
Happy coding.
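For what it's worth, here is a minimal sketch of the "publish a finished snapshot" idea using a volatile field instead of AtomicReference. The class name VolatileSpellCache is made up for illustration, and the sketch assumes a single updater thread and that readers never need to modify the map; Locale.ROOT is my addition to keep the lower-casing independent of the default locale.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical variant of the cache from the question; names are illustrative.
class VolatileSpellCache {
    // volatile: a write to this field happens-before every later read of it,
    // so a reader that sees the new reference also sees the fully built map.
    private volatile Map<String, String> misspelledToCorrectlySpelled =
            Collections.emptyMap();

    // Invoked by the updater thread.
    void updateCache(Map<String, String> map) {
        Map<String, String> tempMap = new HashMap<>(map.size());
        for (Map.Entry<String, String> entry : map.entrySet()) {
            tempMap.put(entry.getKey().toLowerCase(Locale.ROOT),
                        entry.getValue().toLowerCase(Locale.ROOT));
        }
        // Publish the finished map in one step; it is never mutated afterwards.
        misspelledToCorrectlySpelled = Collections.unmodifiableMap(tempMap);
    }

    // May be invoked by many threads; returns a read-only snapshot.
    Map<String, String> getMisspelledToCorrectlySpelled() {
        return misspelledToCorrectlySpelled;
    }
}
```

A reader that grabbed the old map keeps a consistent (if stale) snapshot; the next call to the getter returns the new one.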

Related

How to atomically replace all the values of a singleton map? [duplicate]

I have a stateful bean in a multi-threaded environment, which keeps its state in a map. Now I need a way to replace all values of that map in one atomic action.
public final class StatefulBean {
private final Map<String, String> state = new ConcurrentSkipListMap<>();
public StatefulBean() {
//Initial state
this.state.put("a", "a1");
this.state.put("b", "b1");
this.state.put("c", "c1");
}
public void updateState() {
//Fake computation of new state
final Map<String, String> newState = new HashMap<>();
newState.put("b", "b1");
newState.put("c", "c2");
newState.put("d", "d1");
atomicallyUpdateState(newState);
/*Expected result
* a: removed
* b: unchanged
* c: replaced
* d: added*/
}
private void atomicallyUpdateState(final Map<String, String> newState) {
//???
}
}
At the moment I use ConcurrentSkipListMap as implementation of a ConcurrentMap, but that isn't a requirement.
The only way I see to solve this problem is to make the global state volatile and completely replace the map, or to use an AtomicReferenceFieldUpdater.
Is there a better way?
My updates are quite frequent, once or twice a second, but change only very few values. Also, the whole map will only ever contain fewer than 20 values.
An approach with CAS and AtomicReference would be to copy the map content on each bulk update.
AtomicReference<Map<String, String>> workingMapRef = new AtomicReference<>(new HashMap<>());
This map can be concurrent, but for "bulk updates" it is read-only. Then, in updateState, loop on doUpdateState() until it returns true, which means that your values have been updated.
void updateState() {
while (!doUpdateState());
}
boolean doUpdateState() {
Map<String, String> workingMap = workingMapRef.get();
//copy map content
Map<String, String> newState = new HashMap<>(workingMap); //you can make it concurrent
newState.put("b", "b1");
newState.put("c", "c2");
newState.put("d", "d1");
return workingMapRef.compareAndSet(workingMap, newState);
}
The simplest, least fuss method is to switch the map instead of replacing map contents. Whether using volatile or AtomicReference (I don't see why you'd need AtomicReferenceFieldUpdater particularly), shouldn't make too much of a difference.
This makes sure that your map is always in proper state, and allows you to provide snapshots too. It doesn't protect you from other concurrency issues though, so if something like lost updates are a problem you'll need further code (although AtomicReference would give you CAS methods for handling those).
The question is actually rather simple if you only consider the complete atomic replacement of the map. It would be informative to know what other operations affect the map and how. I'd also like to hear why ConcurrentSkipListMap was chosen over ConcurrentHashMap.
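To make the "switch the map" idea a bit more concrete, here is a rough sketch assuming Java 8 or later. SnapshotState and its method names are invented for the example; the only library calls are AtomicReference.set, get and updateAndGet. Full replacement is a plain set, while the single-entry update goes through updateAndGet, which re-applies the function if another thread swapped the map in the meantime, so that kind of update is not lost.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder class; the names are illustrative, not from the question.
class SnapshotState {
    private final AtomicReference<Map<String, String>> ref =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    // Replaces the whole map in one step, regardless of the previous contents.
    void replaceAll(Map<String, String> newState) {
        ref.set(Collections.unmodifiableMap(new HashMap<>(newState)));
    }

    // Changes one entry on top of the current snapshot. updateAndGet retries
    // the function if the reference changed concurrently (CAS loop).
    void put(String key, String value) {
        ref.updateAndGet(current -> {
            Map<String, String> copy = new HashMap<>(current);
            copy.put(key, value);
            return Collections.unmodifiableMap(copy);
        });
    }

    // Readers always get an immutable, internally consistent snapshot.
    Map<String, String> snapshot() {
        return ref.get();
    }
}
```

With fewer than 20 entries and one or two updates per second, copying the map on every write should be cheap, but that is an assumption worth measuring.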
Since the map is quite small, it's probably enough to just use synchronized in all places you access it.
private void atomicallyUpdateState(final Map<String, String> newState) {
synchronized(state) {
state.clear();
state.putAll(newState);
}
}
but don't forget any of them: every occurrence of something like
String myStatevalue = state.get("myValue");
need to become
String myStatevalue;
synchronized (state) {
myStatevalue = state.get("myValue");
}
otherwise the read and the update are not synchronized with each other, which causes a race condition.
Extend a map implementation of your choice and add a synchronized method:
class MyReplaceMap<K, V> extends HashMap<K, V> //or whatever
{
public synchronized void replaceKeys(final Map<K, V> newMap)
{
//.. do some stuff
}
}
Of course, you could always make state a non-final volatile field and re-assign it (assignment is atomic):
private volatile Map<String, String> state = new HashMap<>();
//...
final Map<String, String> newState = new HashMap<>();
newState.put("b", "b1");
newState.put("c", "c2");
newState.put("d", "d1");
state = newState;
As client code maintains a reference to the bean not the map, replacing the value (i.e. the whole map) would seem to be the simplest solution.
Unless there's any significant performance concerns (although using locking is likely to perform worse and less predictably unless the map is huge) I'd try that before anything requiring more advanced knowledge.
It's how a functional programmer would do it.
Using a ReadWriteLock can help to atomically replace all values in a Map.
private static final ReadWriteLock LOCK = new ReentrantReadWriteLock();
private void atomicallyUpdateState(final Map<String, String> newState) {
LOCK.writeLock().lock();
try {
state.clear();
state.putAll(newState);
} finally {
LOCK.writeLock().unlock();
}
}
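Note that the write lock only helps if every reader takes the matching read lock; otherwise a reader can still observe the map between clear() and putAll(). A self-contained sketch of both sides, with LockedState and its method names being assumptions made for the example:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical bean; a plain HashMap is enough because the lock guards all access.
class LockedState {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> state = new HashMap<>();

    // Writer side: exclusive, so no reader sees the half-replaced map.
    void replaceAll(Map<String, String> newState) {
        lock.writeLock().lock();
        try {
            state.clear();
            state.putAll(newState);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader side: shared with other readers, blocked while a writer is active.
    String get(String key) {
        lock.readLock().lock();
        try {
            return state.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```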

Keeping a track of duplicate inserts in a Map (Multithreaded environment)

I am looking for a way to keep track of the number of times an insert of the same key is attempted into a Map in a multithreaded environment, such that the Map can be read and updated by multiple threads at the same time. If keeping track of duplicate key insert attempts is not easily achievable, an alternative solution would be to kill the application at the first sign of a duplicate key insert attempt.
The following user defined singleton Spring bean shows a global cache used by my application which is loaded using multiple partitioned spring batch jobs (one job for each DataType to be loaded). The addResultForDataType method can be called by multiple threads at the same time.
public class JobResults {
private Map<DataType, Map<String, Object>> results;
public JobResults() {
results = new ConcurrentHashMap<DataType, Map<String, Object>>();
}
public void addResultForDataType(DataType dataType, String uniqueId, Object result) {
Map<String, Object> dataTypeMap = results.get(dataType);
if (dataTypeMap == null) {
synchronized (dataType) {
dataTypeMap = results.get(dataType);
if (dataTypeMap == null) {
dataTypeMap = new ConcurrentHashMap<String, Object>();
results.put(dataType, dataTypeMap);
}
}
}
dataTypeMap.put(uniqueId, result);
}
public Map<String, Object> getResultForDataType(DataType dataType) {
return results.get(dataType);
}
}
Here :
DataType can be thought of as the table name or file name from
where the data is loaded. Each DataType indicates one table or file.
uniqueId represents the primary key for each record in the table or file.
result is the object representing the entire row.
The above method is called once per record. At any given time, multiple threads can be inserting a record for the same DataType or a different DataType.
I thought of creating another map to keep a track of the duplicate inserts :
public class JobResults {
private Map<DataType, Map<String, Object>> results;
private Map<DataType, ConcurrentHashMap<String, Integer>> duplicates;
public JobResults() {
results = new ConcurrentHashMap<DataType, Map<String, Object>>();
duplicates = new ConcurrentHashMap<DataType, ConcurrentHashMap<String, Integer>>();
}
public void addResultForDataType(DataType dataType, String uniqueId, Object result) {
Map<String, Object> dataTypeMap = results.get(dataType);
ConcurrentHashMap<String,Integer> duplicateCount = duplicates.get(dataType);
if (dataTypeMap == null) {
synchronized (dataType) {
dataTypeMap = results.get(dataType);
if (dataTypeMap == null) {
dataTypeMap = new ConcurrentHashMap<String, Object>();
duplicateCount = new ConcurrentHashMap<String, Integer>();
results.put(dataType, dataTypeMap);
duplicates.put(dataType, duplicateCount);
}
}
}
duplicateCount.putIfAbsent(uniqueId, 0);
duplicateCount.put(uniqueId, duplicateCount.get(uniqueId)+1);//keep track of duplicate rows
dataTypeMap.put(uniqueId, result);
}
public Map<String, Object> getResultForDataType(DataType dataType) {
return results.get(dataType);
}
}
I realize that the statement duplicateCount.put(uniqueId, duplicateCount.get(uniqueId)+1); is not implicitly thread-safe. To make it thread-safe, I will need to use synchronization, which will slow down my inserts. How can I keep track of the duplicate inserts without impacting the performance of my application? If keeping track of duplicate inserts is not easy, I would be fine with just throwing an exception at the first sign of an attempt to overwrite an existing entry in the map.
Note I am aware that a Map does not allow duplicate keys. What I want is a way to keep a track of any such attempts and halt the application rather than overwrite entries in the Map.
Try something like this:
ConcurrentHashMap<String, AtomicInteger> duplicateCount = new ConcurrentHashMap<String, AtomicInteger>();
Then when you're ready to increment a count, do this:
final AtomicInteger oldCount = duplicateCount.putIfAbsent(uniqueId, new AtomicInteger(1));
if (oldCount != null) {
oldCount.incrementAndGet();
}
So, if you do not have a count in the map yet, you will put 1, if you have, you will get the current value and atomically increment it. This should be thread safe.
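If you are on Java 8 or later, a similar count can probably be kept without AtomicInteger by using ConcurrentHashMap.merge, which performs the read-modify-write atomically per key. The sketch below is only an illustration; InsertCounter, record and attemptsFor are names I made up.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper that counts how often each id was inserted.
class InsertCounter {
    private final ConcurrentHashMap<String, Integer> attempts =
            new ConcurrentHashMap<>();

    // Records one insert attempt and returns the total for this id so far.
    // merge() stores 1 if the key is absent, otherwise applies Integer::sum
    // to the old value and 1, atomically for that key.
    int record(String uniqueId) {
        return attempts.merge(uniqueId, 1, Integer::sum);
    }

    // Number of recorded attempts, 0 if the id was never seen.
    int attemptsFor(String uniqueId) {
        return attempts.getOrDefault(uniqueId, 0);
    }
}
```

Any return value greater than 1 from record means the id has been seen before, so the caller can log it or abort at that point.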
If you want to keep track of the number of inserts, you can change the inner map type to something like Map<String, Pair<Integer, Object>> (or, if you don't use Apache Commons, just Map<String, Map.Entry<Integer, Object>>), where the Integer value is the number of updates:
String key = ...;
Object value = ...;
dataTypeMap.compute(key, (k, current) -> {
    if (current == null) {
        /* Initial count is 0 */
        return Pair.of(0, value);
    } else {
        /* Increment count */
        return Pair.of(current.getLeft() + 1, value);
    }
});
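If all you care about is ensuring that there are no duplicate inserts, you can simply use putIfAbsent, which returns the previous value when the key was already present (computeIfAbsent would not work for this check, because it also returns a non-null value when it has just inserted one):
String key = ...;
Object value = ...;
if (dataTypeMap.putIfAbsent(key, value) != null) {
    /* There was already a value */
    throw new IllegalStateException(...);
}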
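Put together with the class from the question, the fail-fast variant might look roughly like this. It is a sketch under a few assumptions: Java 8 or later, a String standing in for the DataType key (the real type is not shown in the question), and StrictJobResults as an invented name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical rewrite of JobResults; String stands in for DataType.
class StrictJobResults {
    private final ConcurrentHashMap<String, ConcurrentHashMap<String, Object>> results =
            new ConcurrentHashMap<>();

    void addResult(String dataType, String uniqueId, Object result) {
        // computeIfAbsent creates the inner map atomically, so the
        // double-checked synchronized block is no longer needed.
        ConcurrentHashMap<String, Object> inner =
                results.computeIfAbsent(dataType, k -> new ConcurrentHashMap<>());
        // putIfAbsent returns the existing value, or null if the key was free.
        Object previous = inner.putIfAbsent(uniqueId, result);
        if (previous != null) {
            throw new IllegalStateException(
                    "Duplicate insert for " + dataType + "/" + uniqueId);
        }
    }

    Map<String, Object> getResults(String dataType) {
        return results.get(dataType);
    }
}
```

The existing entry is left untouched when the exception is thrown, so the first record for a given id wins.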

Managing nested maps with string keys

Can I access my nestedMap in my iterator when the nestedMap is created in the put() method, like this:
@Override
public String put(final String row, final String column, final String value) {
    /**
     * Second map which is contained by centralMap, that contain Strings as
     * Keys and Values.
     */
    Map<String, String> nestedMap;
    if (centralMap.containsKey(row))
        nestedMap = centralMap.get(row);
    else
        nestedMap = new HashMap<String, String>();
    if (!nestedMap.containsKey(column))
        counter++;
    centralMap.put(row, nestedMap);
    return nestedMap.put(column, value);
}
and the centralMap is declared as an instance variable,
private final Map<String, Map<String, String>> centralMap;
but instantiated just in the constructor, like this:
centralMap = new HashMap<String, Map<String, String>>();
The method I'm trying to implement is the remove method:
@Override
public void remove() {
    for (Map<String, String> map : centralMap.values()) {
        map = centralMap.get(keyName);
        iteratorNested.remove();
        if (map.size() <= 0)
            iteratorCentral.remove();
    }
}
Thanks a lot!
Not sure what you're asking exactly, but I think this is a little easier to read:
@Override
public String put(final String row, final String column, final String value) {
    /**
     * Second map which is contained by centralMap, that contain Strings as
     * Keys and Values.
     */
    Map<String, String> nestedMap = centralMap.get(row);
    if (nestedMap == null) {
        // first value for this row: create the nested map and register it once
        nestedMap = new HashMap<String, String>();
        centralMap.put(row, nestedMap);
    }
    if (!nestedMap.containsKey(column))
        counter++;
    return nestedMap.put(column, value);
}
I can't quite understand what you're doing in the second stanza, so can't help you improve that. And I don't see an iterator as referred to in your question.
You're making me guess, but maybe ELSEWHERE in your program (it would really help to see more code, and a specific function prototype or statement of behavior you're seeking) you want to be able to iterate through the contents of the centralMap instance, and nested instances of nestedMap. Yes you can.
public void iterateOverAllNested()
{
    // Map itself is not Iterable; iterate over its entry set
    for (Map.Entry<String,Map<String,String>> nested : centralMap.entrySet()) {
        final String centralKey = nested.getKey();
        final Map<String,String> nestedMap = nested.getValue();
        System.out.println("Central map row/column: "+centralKey);
        for (Map.Entry<String,String> entry : nestedMap.entrySet()) {
            System.out.println(" key="+entry.getKey()+", value="+entry.getValue());
        }
    }
}
Note that this smells. Nested maps of untyped Strings are probably wrong. Any chance you've been writing Perl recently? I suggest you write a second SO question asking about a good data structure for your specific problem. You can include this code as your starting place, and folks will likely offer a cleaner solution.
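In case the underlying goal is "remove one cell and drop the row once it is empty", here is one possible sketch that avoids iterators altogether. It assumes Java 8 or later for computeIfAbsent, and NestedStringMap with its remove(row, column) signature is my guess at the intended behavior, not something taken from your code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical two-level map keyed by row and column.
class NestedStringMap {
    private final Map<String, Map<String, String>> centralMap = new HashMap<>();

    // Creates the nested map on first use, then stores the value.
    // Returns the previous value for (row, column), or null.
    String put(String row, String column, String value) {
        return centralMap.computeIfAbsent(row, r -> new HashMap<>())
                         .put(column, value);
    }

    // Removes one cell; when the row becomes empty the row itself is removed.
    String remove(String row, String column) {
        Map<String, String> nestedMap = centralMap.get(row);
        if (nestedMap == null) {
            return null; // unknown row, nothing to remove
        }
        String old = nestedMap.remove(column);
        if (nestedMap.isEmpty()) {
            centralMap.remove(row);
        }
        return old;
    }

    boolean containsRow(String row) {
        return centralMap.containsKey(row);
    }
}
```

This is not thread-safe, and it does not maintain the counter field from your put method; both would need to be added if you rely on them.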

Iterating over a HashMap of HashMaps in Java (or Scala)

I created a class Foo that has the method toArray() that returns an Array<Int>.
Now, I have a HashMap mapping Strings to HashMaps, which map Objects to Foo. That is:
HashMap<String,HashMap<Object,Foo>>
And I want to create a new object of type:
HashMap<String,HashMap<Object,Array<Int>>>
It is obtained by calling the function toArray() for every element Foo in the original HashMap.
To do so I normally would do something like:
public static HashMap<String,HashMap<Object,Array<Int>>> changeMap(Map mpOld) {
Object key2;
String key1;
Iterator it2;
HashMap<String,HashMap<Object,Array<Int>>> mpNew=
new HashMap<String,HashMap<Object,Array<Int>>>();
Iterator it1 = mpOld.keySet().iterator();
while (it1.hasNext()) {
key1=it1.next();
it2= mpOld.get(key1).keySet().iterator();
mpNew.put(key1,new HashMap<Object,Array<Int>>());
while (it2.hasNext()) {
key2=it2.next();
mpNew.get(key1).put(key2,mpOld.get(key1).get(key2).toArray());
//TODO clear entry mpOld.get(key1).get(key2)
}
//TODO clear entry mpOld.get(key1)
}
return mpNew;
}
Similar code works just fine, but the HashMap is too big to hold two of them in memory. As you can see, I added two points where I want to clear some entries. The problem is that if I do, I get either a concurrency error or the iterator loop just terminates.
I wonder if there is a better way to iterate through the maps and copy the information.
Also, I'm working in a Scala project, but here I have to use Java types for some compatibility issues. Although java.util.HashMap is not an iterator, maybe Scala has some hidden functionality to deal with this?
Thanks,
Iterators offer a remove() method that safely removes the previously accessed item. Iterate over the key/value entries of the map, converting them and adding them to the new map, and removing the old ones as you go.
/**
 * Transfers and converts all entries from <code>map1</code> to
 * <code>map2</code>. Specifically, the {@link Foo} objects of the
 * inner maps will be converted to integer arrays via {@link Foo#toArray}.
 *
 * @param map1 Map to be emptied.
 * @param map2 Receptacle for the converted entries.
 */
private static void transfer(Map<String, Map<Object, Foo>> map1,
        Map<String, Map<Object, int[]>> map2) {
    final Iterator<Entry<String, Map<Object, Foo>>> mapIt
            = map1.entrySet().iterator();
    while (mapIt.hasNext()) {
        final Entry<String, Map<Object, Foo>> mapEntry = mapIt.next();
        // read key and value before removing: an Entry is not guaranteed
        // to stay valid once the backing map has been modified
        final String key = mapEntry.getKey();
        final Map<Object, Foo> fooMap = mapEntry.getValue();
        mapIt.remove();
        final Map<Object, int[]> submap = new HashMap<Object, int[]>();
        map2.put(key, submap);
        final Iterator<Entry<Object, Foo>> fooIt
                = fooMap.entrySet().iterator();
        while (fooIt.hasNext()) {
            final Entry<Object, Foo> fooEntry = fooIt.next();
            final Object fooKey = fooEntry.getKey();
            final Foo foo = fooEntry.getValue();
            fooIt.remove();
            submap.put(fooKey, foo.toArray());
        }
    }
}
I did not have time to check it, but I guess something like this should work on scala Maps (assuming you use scala 2.8 which is finally here):
mpO.mapValues(_.mapValues(_.toArray))
It would take your outer map, and "replace" all inner maps with a new one, where the values are the Int arrays. Keys, and the general "structure" of the maps remain the same. According to scaladoc "The resulting map wraps the original map without copying any elements.", so it won't be a real replacement.
If you also do an
import scala.collection.JavaConversions._
then the java maps can be used the same way as scala maps: JavaConversions contain a bunch of implicit methods that can convert between scala and java collections.
BTW, using a Map<String,HashMap<Object,Array<Int>>> might not be really convenient in the end; if I were you, I would consider introducing some classes that hide the complexity of this construct.
Edit reflecting to your comment
import scala.collection.JavaConversions._
import java.util.Collections._
object MapValues {
def main(args: Array[String]) {
val jMap = singletonMap("a",singletonMap("b", 1))
println(jMap)
println(jMap.mapValues(_.mapValues(_+1)))
}
}
prints:
{a={b=1}}
Map(a -> Map(b -> 2))
Showing that the implicits are applied both to the outer and inner map quite nicely. This is the purpose of the JavaConversions object: even if you have a java collection you can use it as a similar scala class (with boosted features).
You don't have to do anything else, just import JavaConversions._
For example, considering String keys, let's call the input data Map<String, Map<String, Object>> data:
for (Entry<String, Map<String, Object>> entry : data.entrySet()) {
    String itemKey = entry.getKey();
    for (Entry<String, Object> innerEntry : entry.getValue().entrySet()) {
        String innerKey = innerEntry.getKey();
        Object o = innerEntry.getValue();
        // whatever, here you have itemKey, innerKey and o
    }
}
The set is backed by the map, so changes to the map are reflected in the set, and vice-versa. If the map is modified while an iteration over the set is in progress (except through the iterator's own remove operation), the results of the iteration are undefined. The set supports element removal, which removes the corresponding mapping from the map, via the Iterator.remove, Set.remove, removeAll, retainAll, and clear operations.
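Why don't you call the remove() method on the iterator, right after iterator.next() has returned the key? Here iterator is the iterator of the key set. Calling set.remove(key) directly on the key set is also possible, but only outside of a running iteration, for the reason quoted above.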
PS: also try to refactor your data structure, maybe some intermediate classes which handle the data retrieval? A map in a map with arrays as values doesn't say anything and is difficult to keep track of.
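As a small illustration of the quoted rule, the following sketch removes entries during iteration strictly through the iterator. IteratorRemoveDemo and dropNegatives are hypothetical names, and the Integer values are just sample data.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class IteratorRemoveDemo {
    // Removes every entry with a negative value while iterating.
    static void dropNegatives(Map<String, Integer> map) {
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (entry.getValue() < 0) {
                // Removal goes through the iterator, so the iteration stays
                // valid and the mapping disappears from the backing map.
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", -2);
        map.put("c", 3);
        dropNegatives(map);
        System.out.println(map.keySet()); // no longer contains "b"
    }
}
```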
