Keeping a track of duplicate inserts in a Map (Multithreaded environment)

I am looking for a way to keep track of the number of times an insert of the same key is attempted into a Map in a multithreaded environment, where the Map can be read and updated by multiple threads at the same time. If keeping track of duplicate key insert attempts is not easily achievable, an alternative would be to kill the application at the first sign of a duplicate key insert attempt.
The following user-defined singleton Spring bean shows a global cache used by my application, which is loaded by multiple partitioned Spring Batch jobs (one job for each DataType to be loaded). The addResultForDataType method can be called by multiple threads at the same time.
public class JobResults {
private Map<DataType, Map<String, Object>> results;
public JobResults() {
results = new ConcurrentHashMap<DataType, Map<String, Object>>();
}
public void addResultForDataType(DataType dataType, String uniqueId, Object result) {
Map<String, Object> dataTypeMap = results.get(dataType);
if (dataTypeMap == null) {
synchronized (dataType) {
dataTypeMap = results.get(dataType);
if (dataTypeMap == null) {
dataTypeMap = new ConcurrentHashMap<String, Object>();
results.put(dataType, dataTypeMap);
}
}
}
dataTypeMap.put(uniqueId, result);
}
public Map<String, Object> getResultForDataType(DataType dataType) {
return results.get(dataType);
}
}
Here :
DataType can be thought of as the table name or file name from
where the data is loaded. Each DataType indicates one table or file.
uniqueId represents the primary key for each record in the table or file.
result is the object representing the entire row.
The above method is called once per record. At any given time, multiple threads can be inserting a record for the same DataType or a different DataType.
I thought of creating another map to keep track of the duplicate inserts:
public class JobResults {
private Map<DataType, Map<String, Object>> results;
private Map<DataType, ConcurrentHashMap<String, Integer>> duplicates;
public JobResults() {
results = new ConcurrentHashMap<DataType, Map<String, Object>>();
duplicates = new ConcurrentHashMap<DataType, ConcurrentHashMap<String, Integer>>();
}
public void addResultForDataType(DataType dataType, String uniqueId, Object result) {
Map<String, Object> dataTypeMap = results.get(dataType);
ConcurrentHashMap<String,Integer> duplicateCount = duplicates.get(dataType);
if (dataTypeMap == null) {
synchronized (dataType) {
dataTypeMap = results.get(dataType);
if (dataTypeMap == null) {
dataTypeMap = new ConcurrentHashMap<String, Object>();
duplicateCount = new ConcurrentHashMap<String, Integer>();
results.put(dataType, dataTypeMap);
duplicates.put(dataType, duplicateCount);
}
}
}
duplicateCount.putIfAbsent(uniqueId, 0);
duplicateCount.put(uniqueId, duplicateCount.get(uniqueId)+1);//keep track of duplicate rows
dataTypeMap.put(uniqueId, result);
}
public Map<String, Object> getResultForDataType(DataType dataType) {
return results.get(dataType);
}
}
I realize that the statement duplicateCount.put(uniqueId, duplicateCount.get(uniqueId)+1); is not thread safe, because the get and the put are two separate operations. To make it thread safe, I would need to use synchronization, which will slow down my inserts. How can I keep track of the duplicate inserts without impacting the performance of my application? If keeping track of duplicate inserts is not easy, I would be fine with just throwing an exception at the first sign of an attempt to overwrite an existing entry in the map.
Note: I am aware that a Map does not allow duplicate keys. What I want is a way to keep track of any such attempts and halt the application rather than overwrite entries in the Map.

Try something like this:
ConcurrentHashMap<String, AtomicInteger> duplicateCount = new ConcurrentHashMap<String, AtomicInteger>();
Then when you're ready to increment a count, do this:
final AtomicInteger oldCount = duplicateCount.putIfAbsent(uniqueId, new AtomicInteger(1));
if (oldCount != null) {
oldCount.incrementAndGet();
}
If there is no count in the map for that key yet, this puts 1. Otherwise putIfAbsent returns the existing counter, which is then incremented atomically. This is thread safe.
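A variant of the same idea, as a minimal sketch rather than a drop-in implementation: computeIfAbsent creates the counter on first use and incrementAndGet does the atomic update, so no AtomicInteger is allocated once the key exists. The class name InsertCounter and its methods are hypothetical and not part of the question's code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: counts how many times an insert was attempted per key.
class InsertCounter {
    private final ConcurrentHashMap<String, AtomicInteger> attempts = new ConcurrentHashMap<>();

    /** Records one insert attempt and returns the total number of attempts for this key. */
    int recordAttempt(String uniqueId) {
        // computeIfAbsent creates the counter atomically the first time a key is seen;
        // incrementAndGet then updates the shared counter atomically.
        return attempts.computeIfAbsent(uniqueId, k -> new AtomicInteger()).incrementAndGet();
    }

    /** Returns the number of attempts recorded so far, or 0 for an unseen key. */
    int attemptsFor(String uniqueId) {
        AtomicInteger count = attempts.get(uniqueId);
        return count == null ? 0 : count.get();
    }

    public static void main(String[] args) throws InterruptedException {
        InsertCounter counter = new InsertCounter();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                counter.recordAttempt("row-1");
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.attemptsFor("row-1")); // prints 2000
    }
}
```

A return value greater than 1 from recordAttempt means the key had already been seen, so the caller can decide there whether to log or abort.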

If you want to keep track of the number of inserts, you can change the inner map type to something like Map<String, Pair<Integer, Object>> (or, if you don't use Apache Commons, just Map<String, Map.Entry<Integer, Object>>), where the Integer value is the number of updates:
String key = ...;
Object value = ...;
dataTypeMap.compute(key, (k, current) -> {
if (current == null) {
/* Initial count is 0 */
return Pair.of(0, value);
} else {
/* Increment count */
return Pair.of(current.getLeft() + 1, value);
}
});
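If all you care about is ensuring that there are no duplicate inserts, you can simply use putIfAbsent, which returns the previous value, or null if the key was absent (computeIfAbsent does not work for this check, because it returns the existing or the newly computed value without telling you which one it was):
String key = ...;
Object value = ...;
if (dataTypeMap.putIfAbsent(key, value) != null) {
/* There was already a value */
throw new IllegalStateException(...);
}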
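One way the fail-fast idea could be applied to the JobResults class from the question, as a sketch under assumptions: it relies on putIfAbsent returning the previous value (or null when the key was absent), and on computeIfAbsent to create the inner map without the synchronized block. The class name FailFastJobResults and the DataType enum are hypothetical stand-ins, since the real DataType is not shown.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FailFastJobResults {
    // Hypothetical stand-in for the question's DataType.
    enum DataType { CUSTOMER, ORDER }

    private final Map<DataType, Map<String, Object>> results = new ConcurrentHashMap<>();

    void addResultForDataType(DataType dataType, String uniqueId, Object result) {
        // Atomically create the inner map the first time a DataType is seen.
        Map<String, Object> dataTypeMap =
                results.computeIfAbsent(dataType, k -> new ConcurrentHashMap<>());
        // putIfAbsent never overwrites; a non-null return means the key already existed.
        Object previous = dataTypeMap.putIfAbsent(uniqueId, result);
        if (previous != null) {
            throw new IllegalStateException(
                    "Duplicate insert for " + dataType + " / " + uniqueId);
        }
    }

    Map<String, Object> getResultForDataType(DataType dataType) {
        return results.get(dataType);
    }
}
```

Because the existing entry is left untouched when the exception is thrown, the first value inserted for a key is the one that survives.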

Related

ConcurrentHashMap, find by value, compare fields and put

How can I check whether a ConcurrentHashMap already contains a value whose field matches a given key, and put a new value if it does not? It has to be a ConcurrentHashMap because I have N threads.
Here is an example of what I want. However, it is not thread-safe.
Map<Integer, Record> map = new ConcurrentHashMap<>();
// it works, but I think it's unsafe
int get(Object key) {
for (Map.Entry<Integer, Record> next : map.entrySet()) {
if (next.getValue().a == key) {
return next.getValue().b;
}
}
int code = ...newCode();
map.put(code, new Record(...));
return code;
}
record Record(Object a, int b) {
}

What you're suggesting would defeat the purpose of using a hash map, since you're iterating through the Map instead of looking entries up by key.
What you should really do is create a second Map where the field Record.a is the key and the field Record.b is the value (or just the whole Record). Then update your logic to insert into both Maps appropriately.
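A minimal sketch of that second map, assuming the lookup field can be used as a map key (so it is compared with equals and hashCode rather than ==). The class name CodeRegistry and the AtomicInteger counter standing in for the question's newCode() are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class CodeRegistry {
    // Reverse index: lookup field (Record.a in the question) -> code.
    private final Map<Object, Integer> codeByKey = new ConcurrentHashMap<>();
    // Forward index: code -> lookup field.
    private final Map<Integer, Object> keyByCode = new ConcurrentHashMap<>();
    // Hypothetical replacement for the question's newCode().
    private final AtomicInteger nextCode = new AtomicInteger();

    /** Returns the code for the key, allocating one atomically if the key is new. */
    int get(Object key) {
        return codeByKey.computeIfAbsent(key, k -> {
            int code = nextCode.incrementAndGet();
            // Writing to a different map inside the mapping function is allowed.
            keyByCode.put(code, k);
            return code;
        });
    }

    /** Returns the key registered for the code, or null if there is none. */
    Object keyFor(int code) {
        return keyByCode.get(code);
    }
}
```

computeIfAbsent on a ConcurrentHashMap runs the mapping function at most once per key, so two threads asking for the same new key receive the same code.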

Unflatten a HashMap of values

I currently have a Map of key value pairs in the format of
a.b.c: value1
e.f: value2
g: [
g.h: nested_value1
g.i: nested_value2
]
and I need to 'unflatten' this to a new Map in a nested structure -
a:
b:
c: value1
e:
f: value2
g: [
h: nested_value1
i: nested_value2
]
My current attempt doesn't get very far, and throws a ConcurrentModificationException
private static Map<String, Object> unflatten(Map<String, Object> flattened) {
Map<String, Object> unflattened = new HashMap<>();
for (String key : flattened.keySet()) {
doUnflatten(flattened, unflattened, key, flattened.get(key));
}
return unflattened;
}
private static Map<String, Object> doUnflatten(
Map<String, Object> flattened,
Map<String, Object> unflattened,
String key,
Object value) {
String[] parts = StringUtils.split(key, '.');
for (int i = 0; i < parts.length; i++) {
String part = parts[i];
Object current = flattened.get(part);
if (i == (parts.length - 1)) {
unflattened.put(part, value);
} else if (current == null) {
if ((current = unflattened.get(part)) == null) {
current = new HashMap<>();
}
unflattened.put(part, current);
unflattened = (Map<String, Object>) current;
} else if (current instanceof Map) {
unflattened.put(part, current);
unflattened = (Map<String, Object>) current;
}
}
return unflattened;
}
Am I missing something obvious here? One solution is to use a library like JsonFlattener; the only issue is that this would involve converting back and forth between the Map and JSON a lot.
Edit: Thanks for the pointers - I am half way there, one thing I forgot to mention was it also needs to unflatten a collection of HashMaps

Your error occurs because you iterate over the key set and then modify the map by some means other than the iterator.
The iterators returned by all of this class's "collection view
methods" are fail-fast: if the map is structurally modified at any
time after the iterator is created, in any way except through the
iterator's own remove method, the iterator will throw a
ConcurrentModificationException. Thus, in the face of concurrent
modification, the iterator fails quickly and cleanly, rather than
risking arbitrary, non-deterministic behavior at an undetermined time
in the future.
You could get around this by using a new map.

The problem with your implementation is that you are writing the output into the same Map that you use for the input, which causes ConcurrentModificationException.
Implementation becomes straightforward with a separate Map for output:
Map<String,Object> unflattened = new HashMap<>();
for (Map.Entry<String,Object> e : flattened.entrySet()) {
String[] parts = StringUtils.split(e.getKey(), ".");
// Find the map to be used as a destination for put(...)
Map<String,Object> dest = unflattened;
for (int i = 0 ; i != parts.length-1 ; i++) {
Object tmp = dest.get(parts[i]);
if (tmp == null) {
// We did not see this branch yet
Map<String,Object> next = new HashMap<>();
dest.put(parts[i], next);
dest = next;
continue;
}
if (!(tmp instanceof Map)) {
throw new IllegalStateException();
}
dest = (Map<String,Object>)tmp;
}
// Put the entry into the destination Map<>
dest.put(parts[parts.length-1], e.getValue());
}
Note that the process of "unflattening" may fail when the initial map describes an inconsistent hierarchy, for example, one with a branch and a leaf having the same name:
"a.b.c" -> "x" // OK: "a.b" is a branch
"a.b.d" -> "y" // OK: "a.b" is a branch
"a.b" -> "z" // Error: "a.b" is a leaf
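A self-contained sketch of the same approach that also rejects such an inconsistent hierarchy in either insertion order. It uses String.split instead of StringUtils so that it needs no external library; the class name Unflattener is hypothetical, and the handling of lists of maps mentioned in the question's edit is not covered.

```java
import java.util.HashMap;
import java.util.Map;

class Unflattener {
    @SuppressWarnings("unchecked")
    static Map<String, Object> unflatten(Map<String, Object> flattened) {
        // Separate output map, so the input is never modified while iterating.
        Map<String, Object> root = new HashMap<>();
        for (Map.Entry<String, Object> e : flattened.entrySet()) {
            String[] parts = e.getKey().split("\\.");
            Map<String, Object> dest = root;
            // Walk (and create) the branch maps for all but the last part.
            for (int i = 0; i < parts.length - 1; i++) {
                Object child = dest.computeIfAbsent(parts[i], k -> new HashMap<String, Object>());
                if (!(child instanceof Map)) {
                    throw new IllegalStateException(
                            "'" + parts[i] + "' in '" + e.getKey() + "' is already a leaf");
                }
                dest = (Map<String, Object>) child;
            }
            // putIfAbsent leaves an existing branch in place and reports the clash.
            String leaf = parts[parts.length - 1];
            if (dest.putIfAbsent(leaf, e.getValue()) != null) {
                throw new IllegalStateException(
                        "'" + leaf + "' in '" + e.getKey() + "' is already a branch");
            }
        }
        return root;
    }
}
```

For the input {"a.b.c" -> "value1", "e.f" -> "value2"} this yields a map where "a" maps to {"b" -> {"c" -> "value1"}} and "e" maps to {"f" -> "value2"}.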

Create a new Map instance for your result instead of attempting to reuse the current one. Also, send in the map value, so it doesn't need to be extracted:
private static Map<String, Object> unflatten(Map<String, Object> flattened) {
Map<String, Object> unflattened = new HashMap<>();
for (String key : flattened.keySet()) {
doUnflatten(unflattened, key, flattened.get(key));
}
return unflattened;
}
This also prevents the original keys from being present in the resulting map.
The above also requires a slight rewrite of the doUnflatten method:
private static void doUnflatten(Map<String, Object> current, String key,
Object originalValue) {
String[] parts = StringUtils.split(key, ".");
for (int i = 0; i < parts.length; i++) {
String part = parts[i];
if (i == (parts.length - 1)) {
current.put(part, originalValue);
return;
}
Map<String, Object> nestedMap = (Map<String, Object>) current.get(part);
if (nestedMap == null) {
nestedMap = new HashMap<>();
current.put(part, nestedMap);
}
current = nestedMap;
}
}
Couple of notes: There's no need to return the map from the method. Divide the loop into two distinct cases: Either the value should be written to the map, or a nested map should be created or retrieved.

The simplest solution is to replace the line
for (String key : flattened.keySet()) {
with
for (String key : new ArrayList<>(flattened.keySet())) {
which iterates over a copy of the key set. For large amounts of data, however, copying the keys may not be very efficient.

returning a key from inner hashmap that has lowest value

I have a nested HashMap:
HashMap<String, Map<String,Integer>> map = new HashMap<>();
The key for the nested map may have multiple values:
{Color={Red=4, Blue=6}}
I want to be able to return the key of the nested map that has the lowest value. In this case, if I gave the key Color from the outer map, I want to have Red returned.
Any help is greatly appreciated.

Get the inner map by key.
Get the Iterator of the inner map.
Assign the first kvp as the minimum.
Loop through the iterator checking if any subsequent kvp is less than the minimum and assign it if true.
Return the minimum's key.
Code Sample:
public static void main(String[] args) throws Exception {
Map<String, Map<String,Integer>> map = new HashMap() {{
put("Color", new HashMap() {{
put("Red", 4);
put("Orange", 1);
put("Blue", 6);
put("Yellow", 2);
}});
}};
System.out.println(getInnerKeyWithLowestValue(map, "Color"));
}
public static String getInnerKeyWithLowestValue(Map<String, Map<String,Integer>> map, String outerKey) {
Map<String, Integer> innerMap = map.get(outerKey);
// Make sure inner map was retrieved
if (innerMap != null) {
Iterator<Map.Entry<String,Integer>> it = innerMap.entrySet().iterator();
Map.Entry<String, Integer> minimum = it.next();
while (it.hasNext()) {
Map.Entry<String, Integer> next = it.next();
if (next.getValue() < minimum.getValue()) {
minimum = next;
}
}
return minimum.getKey();
}
return ""; // Inner map doesn't exist
}
Results:
Orange

If Java 8 is an option for you, it is easy to write a very concise method to do that:
public static String lowestValueKey(Map<String, Map<String, Integer>> map, String key) {
return map.get(key).entrySet().stream()
.min(Comparator.comparing(Map.Entry::getValue))
.get().getKey();
}
Also using Maps inside Maps can be very tedious sometimes. You may consider using Table<String, String, Integer> from Guava library.
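The stream version above throws if the outer key is missing (NullPointerException) or the inner map is empty (NoSuchElementException). A hedged variant that returns an Optional instead might look like this; the class name LowestValue is hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

class LowestValue {
    /** Returns the inner key with the smallest value, or empty if there is none. */
    static Optional<String> lowestValueKey(Map<String, Map<String, Integer>> map,
                                           String outerKey) {
        // A missing outer key is treated like an empty inner map.
        return map.getOrDefault(outerKey, Collections.emptyMap())
                .entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

With the question's data {Color={Red=4, Blue=6}}, lowestValueKey(map, "Color") returns Optional[Red], and an unknown outer key returns Optional.empty.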

Get the inner HashMap from the outer map and sort its entries by value. The first entry will then be the one with the lowest value in the inner map.

How to get key depending upon the value from hashmap

I want to retrieve the specific key associated with the value in a hashmap
I want to retrieve the key of "ME", how can I get it?
Code snippet :
HashMap<Integer,String> map = new HashMap<Integer,String>();
map.put(1,"I");
map.put(2,"ME");

There's a small problem with what you are trying to do. There can be multiple occurrences of the same value in a hashmap, so if you look up the key by value, there might be multiple results (multiple keys with the same value).
Nevertheless, if you are sure this won't occur, it can be done; see the following example:
import java.util.*;
public class Main {
public static void main(String[] args) {
HashMap<Integer, String> map = new HashMap<Integer, String>();
map.put(5, "vijf");
map.put(36, "zesendertig");
}
static Integer getKey(HashMap<Integer, String> map, String value) {
Integer key = null;
for(Map.Entry<Integer, String> entry : map.entrySet()) {
if((value == null && entry.getValue() == null) || (value != null && value.equals(entry.getValue()))) {
key = entry.getKey();
break;
}
}
return key;
}
}

Iterate over the entries of the map :
for(Entry<Integer, String> entry : map.entrySet()){
if("ME".equals(entry.getValue())){
Integer key = entry.getKey();
// do something with the key
}
}

You will have to iterate through the collection of keys to find your value.
Take a look at this post for details: Java Hashmap: How to get key from value?

If your values are guaranteed to be unique, use a Guava BiMap (the HashMap counterpart is called HashBiMap).
Integer key = map.inverse().get("ME");
Guava Documentation.

/**
* Return keys associated with the specified value
*/
public List<Integer> getKey(String value, Map<Integer, String> map) {
List<Integer> keys = new ArrayList<Integer>();
for(Entry<Integer, String> entry:map.entrySet()) {
if(value.equals(entry.getValue())) {
keys.add(entry.getKey());
}
}
return keys;
}
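The same lookup can be written with streams, as a minimal sketch. It uses Objects.equals so that a null value does not cause a NullPointerException; the class name KeysByValue and the method name keysFor are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class KeysByValue {
    /** Returns every key whose value equals the given value (possibly an empty list). */
    static <K, V> List<K> keysFor(Map<K, V> map, V value) {
        return map.entrySet().stream()
                // Objects.equals is null-safe on both sides.
                .filter(e -> Objects.equals(value, e.getValue()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

This is still a linear scan over all entries, so for frequent lookups by value a second map keyed by value is the better fit.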

Managing nested maps with string keys

Can I access my nestedMap in my iterator when the nestedMap is created in the put() method, like this:
@Override
public String put(final String row, final String column, final String value) {
/**
* Second map which is contained by centralMap, that contain Strings as
* Keys and Values.
*/
Map<String, String> nestedMap;
if (centralMap.containsKey(row))
nestedMap = centralMap.get(row);
else
nestedMap = new HashMap<String, String>();
if (!nestedMap.containsKey(column))
counter++;
centralMap.put(row, nestedMap);
return nestedMap.put(column, value);
}
The centralMap is declared as an instance variable,
private final Map<String, Map<String, String>> centralMap;
but instantiated only in the constructor, like this:
centralMap = new HashMap<String, Map<String, String>>();
The method I'm trying to implement is the iterator's remove method:
@Override
public void remove() {
for (Map<String, String> map : centralMap.values()) {
map = centralMap.get(keyName);
iteratorNested.remove();
if (map.size() <= 0)
iteratorCentral.remove();
}
}
Thanks a lot!

Not sure what you're asking exactly, but I think this is a little easier to read:
@Override
public String put(final String row, final String column, final String value) {
/**
* Second map which is contained by centralMap, that contain Strings as
* Keys and Values.
*/
Map<String, String> nestedMap = centralMap.get(row);
if (nestedMap == null) {
nestedMap = new HashMap<String, String>();
centralMap.put(row,nestedMap);
}
if (!nestedMap.containsKey(column))
counter++;
centralMap.put(row, nestedMap);
return nestedMap.put(column, value);
}
I can't quite understand what you're doing in the second stanza, so can't help you improve that. And I don't see an iterator as referred to in your question.
You're making me guess, but maybe ELSEWHERE in your program (it would really help to see more code, and a specific function prototype or statement of behavior you're seeking) you want to be able to iterate through the contents of the centralMap instance, and nested instances of nestedMap. Yes you can.
public void iterateOverAllNested()
{
// A Map is not Iterable itself; iterate over its entry set.
for (Map.Entry<String,Map<String,String>> nested : centralMap.entrySet()) {
final String centralKey = nested.getKey();
final Map<String,String> nestedMap = nested.getValue();
System.out.println("Central map row/column: "+centralKey);
for (Map.Entry<String,String> entry : nestedMap.entrySet()) {
System.out.println(" key="+entry.getKey()+", value="+entry.getValue());
}
}
}
Note that this smells. Nested maps of untyped Strings are probably wrong. Any chance you've been writing Perl recently? I suggest you write a second SO question asking about a good data structure for your specific problem. You can include this code as your starting place, and folks will likely offer a cleaner solution.
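For what it is worth, on Java 8 or later the put logic can be shortened with computeIfAbsent, and removing a cell can drop the row once its nested map becomes empty. The following is only a sketch of that idea: the class name StringTable and the remove(row, column) signature are assumptions, since the question's iterator code is not shown in full.

```java
import java.util.HashMap;
import java.util.Map;

class StringTable {
    private final Map<String, Map<String, String>> centralMap = new HashMap<>();

    /** Stores the value and returns the previous value for that cell, or null. */
    String put(String row, String column, String value) {
        // Creates the nested map on first use of a row, then writes into it.
        return centralMap.computeIfAbsent(row, r -> new HashMap<>()).put(column, value);
    }

    /** Removes one cell; drops the whole row when its nested map becomes empty. */
    String remove(String row, String column) {
        Map<String, String> nested = centralMap.get(row);
        if (nested == null) {
            return null;
        }
        String removed = nested.remove(column);
        if (nested.isEmpty()) {
            centralMap.remove(row);
        }
        return removed;
    }

    /** Total number of cells, computed on demand instead of kept in a counter. */
    int size() {
        return centralMap.values().stream().mapToInt(Map::size).sum();
    }

    boolean containsRow(String row) {
        return centralMap.containsKey(row);
    }
}
```

Computing the size on demand avoids having to keep a separate counter in sync with every put and remove.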
