Sort a ConcurrentHashMap with thread safety - java

I am using a ConcurrentHashMap in my multithreaded application. I was able to sort it as described here, but since I am converting the map to a list, I am a bit worried about thread safety. My ConcurrentHashMap is a static variable, so I can guarantee there will be only one instance of it. But when I sort it, I convert it to a list, sort the list, and then put the entries back into a new ConcurrentHashMap.
Is this good practice in a multithreaded environment?
Please let me know your thoughts and suggestions.
Thank you in advance.

You should use a ConcurrentSkipListMap. It is thread-safe, fast, and keeps its entries ordered by key, according to the keys' Comparable implementation.
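A minimal sketch of this suggestion, assuming String keys and Integer values. The class name SortedScores and the sample data are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

class SortedScores {
    // Hypothetical shared map; keys stay sorted by their natural (Comparable) order.
    static final ConcurrentNavigableMap<String, Integer> SCORES = new ConcurrentSkipListMap<>();

    // Returns the keys in iteration order, which is ascending key order.
    static List<String> keysInOrder() {
        return new ArrayList<>(SCORES.keySet());
    }

    public static void main(String[] args) {
        SCORES.put("pear", 3);
        SCORES.put("apple", 1);
        SCORES.put("mango", 2);
        System.out.println(keysInOrder()); // [apple, mango, pear]
    }
}
```

Note that the ordering is by key. If the goal is to order entries by value, this map does not do that on its own.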

If you don't change it a lot and all you want is to have it sorted, you should use a TreeMap wrapped by a Collections.synchronizedMap() call.
Your code would be something like this:
public class YourClass {
    public static final Map<Something, Something> MAP = Collections.synchronizedMap(new TreeMap<Something, Something>());
}

My ConcurrentHashMap is a static variable, so I can guarantee there will be only one instance of it. But when I sort it, I convert it to a list, sort the list, and then put the entries back into a new ConcurrentHashMap.
This is not a simple problem.
I can tell you for a fact that using a ConcurrentHashMap won't make this thread-safe, and neither will using a synchronizedMap wrapper. The problem is that sorting is not supported as a single atomic operation. Rather, it involves a sequence of Map API operations, probably with significant time gaps between them.
I can think of two approaches to solving this:
1. Avoid the need for sorting in the first place by using a Map that keeps the keys in order; e.g. use ConcurrentSkipListMap.
2. Wrap the Map class in a custom synchronized wrapper class with a synchronized sort method. The problem with this approach is that you are likely to reintroduce the concurrency bottleneck that you avoided by using ConcurrentHashMap.
And it is worth pointing out that it doesn't make any sense to sort a HashMap or a ConcurrentHashMap because these maps will not preserve the order into which you sort the elements. You could use a LinkedHashMap, which preserves the entry insertion order.
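If a sorted view is only needed occasionally, one option in the spirit of the point above is to copy the entries into a local list, sort the copy, and put the result into a LinkedHashMap. This is a sketch under the assumption that a weakly consistent snapshot is acceptable; the names SortedSnapshot and sortedByValue are hypothetical, and Map.entry requires Java 9 or later:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SortedSnapshot {
    // Copies the entries, sorts the copy by value, and returns them in a LinkedHashMap.
    // The source map is never modified, so other threads can keep using it.
    static Map<String, Integer> sortedByValue(ConcurrentHashMap<String, Integer> source) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        // Copy each pair so later updates to the source cannot change the snapshot.
        source.forEach((k, v) -> entries.add(Map.entry(k, v)));
        entries.sort(Map.Entry.comparingByValue());

        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> live = new ConcurrentHashMap<>();
        live.put("b", 2);
        live.put("a", 3);
        live.put("c", 1);
        System.out.println(sortedByValue(live)); // {c=1, b=2, a=3}
    }
}
```

The returned LinkedHashMap is not itself thread-safe, so it is best treated as a read-only result that is handed to one consumer or published once.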

Related

Retrieval of data in HashSet

I want to know in what order data is retrieved from a HashSet.
I inserted the data in one order, and the output is in a different order.
Can someone please explain the logic behind this?
The code is like this:
import java.util.HashSet;

class Test
{
    public static void main(String[] args)
    {
        HashSet<String> h = new HashSet<String>();
        // Adding elements into HashSet using add()
        h.add("India");
        h.add("Australia");
        h.add("South Africa");
        System.out.println(h);
    }
}
Output:- [South Africa, Australia, India]
From the Javadoc of HashSet:
It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.
HashSet works the same way as a HashMap whose values are ignored. Internally it uses a HashMap in which every value is a constant Object called PRESENT. This lets HashSet guarantee uniqueness but not order; it locates set elements the same way HashMap locates keys.
You can find the HashSet source code online.
As said, the ordering of elements in a HashSet is not guaranteed to be anything, nor to be constant over time.
This is due to the nature of the underlying data structure.
In your case, it looks like the Strings were stored in a LIFO queue, but another implementation of HashSet may well do things differently (and even this one might start to behave differently as more items are inserted).
As per the above, please see the Javadoc for HashSets - the order is not guaranteed. https://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html
Use the LinkedHashSet if you want it to maintain the order of elements.
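A short sketch of that suggestion, reusing the country names from the question. The class SetOrderDemo and its method names are made up for illustration, and TreeSet is included only as an extra point of comparison that the answers above do not mention:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // LinkedHashSet iterates in the order in which elements were first added.
    static List<String> insertionOrder(String... items) {
        Set<String> set = new LinkedHashSet<>();
        for (String item : items) {
            set.add(item);
        }
        return new ArrayList<>(set);
    }

    // TreeSet iterates in sorted (natural) order, regardless of insertion order.
    static List<String> sortedOrder(String... items) {
        Set<String> set = new TreeSet<>();
        for (String item : items) {
            set.add(item);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder("India", "Australia", "South Africa"));
        // [India, Australia, South Africa]
        System.out.println(sortedOrder("India", "Australia", "South Africa"));
        // [Australia, India, South Africa]
    }
}
```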

Should we use HashSet?

A HashSet is backed by a HashMap. From its Javadoc:
This class implements the Set interface, backed by a hash table (actually a HashMap instance).
When taking a look at the source we can also see how they relate to each other:
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
Therefore a HashSet<E> is backed by a HashMap<E,Object>. For all HashSets in our application we have one reference object PRESENT that we use in the HashMap for the value. While the memory needed to store PRESENT is negligible, we still store a reference to it for each entry in the map.
Would it not be more efficient to use null instead of PRESENT? A further consideration is whether we should forgo the HashSet altogether and use a HashMap directly, when circumstances permit the use of a Map instead of a Set.
My basic problem that triggered these thoughts is the following situation: I have a collection of objects on with the following properties:
big collection of objects > 30'000
Insertion order is not relevant
Efficient check if an item is contained
Adding new items to the collection is not relevant
The chosen solution should perform optimally with respect to the above criteria and minimize memory consumption. On this basis the data structures HashSet and HashMap spring to mind. When thinking about alternative approaches, the key question is:
How to check containment efficiently?
The only answer that comes to my mind is using the item's hash to calculate the storage location. I might be missing something here. Are there any other approaches?
I had a look at various questions that shed some light on the issue but did not quite answer my question:
Java : HashSet vs. HashMap
clarifying facts behind Java's implementation of HashSet/HashMap
Java HashSet vs HashMap
I am not looking for suggestions of alternative libraries or frameworks to address this, but I want to understand whether there is another way to think about efficient containment checking of an element in a Collection.
In short, yes, you should use HashSet. It might not be the most efficient possible Set implementation, but that hardly ever matters unless you are working with huge amounts of data.
In that case, I would suggest using specialized libraries. EnumMaps if you can use enums, primitive maps like Trove if your data is mostly primitives, a bunch of other data structures that are optimized for certain data types, or even an in-memory database.
Don't get me wrong, I'm someone who likes performance tuning, too, but replacing the built-in data structures should only be done when it's really necessary. For most cases, they work perfectly fine.
What you could do, in case you really want to save the last bit of memory and do not care about inserting, is using a fixed-sized array, sorting that and doing a binary search every time. But I doubt that it's more efficient than a HashSet.
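The sorted-array idea could be sketched roughly as follows. The class name SortedArrayLookup is hypothetical, and the sketch assumes String elements and a collection that is built once and never modified afterwards:

```java
import java.util.Arrays;

class SortedArrayLookup {
    private final String[] sorted;

    // Copies and sorts the input once; after that the array is only read.
    SortedArrayLookup(String[] items) {
        this.sorted = items.clone();
        Arrays.sort(this.sorted);
    }

    // Binary search needs O(log n) comparisons per lookup.
    boolean contains(String item) {
        return Arrays.binarySearch(sorted, item) >= 0;
    }

    public static void main(String[] args) {
        SortedArrayLookup lookup = new SortedArrayLookup(new String[] {"c", "a", "b"});
        System.out.println(lookup.contains("b")); // true
        System.out.println(lookup.contains("z")); // false
    }
}
```

The array stores only one reference per element, so it uses less memory than a hash table with its per-entry node objects, but each lookup costs O(log n) comparisons instead of the expected O(1) of a HashSet.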
HashMaps and HashSets are used in entirely different ways, so maybe the two shouldn't be compared in terms of "which is more efficient". A HashSet is better suited to modelling a mathematical set (e.g. {1,2,3,4}): it contains no duplicates and allows only one null element. A HashMap is more of a key-value pair system: it allows multiple null values as well as duplicate values, just not duplicate keys. I know this is probably answering "what is the difference between a HashMap and a HashSet", but my point is that they really can't be compared.

java: maps zoo, what to choose

I'm pretty new to the Java world (I write primarily in C/C++). I'm using maps in my apps.
Since java.util.Map is an interface, I need to instantiate one of its implementations. Usually I use HashMap, like this:
Map<String, MyClass> x = new HashMap<>();
But in the Java docs I found many other implementations, like TreeMap, LinkedHashMap, Hashtable, etc. I want to know whether I can continue blindly using HashMap or whether there are any important differences between those Map implementations.
A brief list of points to know would be fine.
Thanks.
Never bother with Hashtable, it's a relic from Java 1.0;
HashMap is the universal default due to O(1) lookup and reliance only on equals and hashCode, guaranteed to be implemented for all Java objects;
TreeMap gives you sorted iteration over the map entries (plus a lot more—see NavigableMap), but requires a comparison strategy and has slower insertion and lookup – O(logN) – than HashMap;
LinkedHashMap preserves insertion/access order when iterating over the entries.
SortedMap implementations offer some great features, like headMap and tailMap. NavigableMap implementations offer even more features with terrific performance for operations that assume sorted keys.
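As a small illustration of those range operations, here is a sketch using TreeMap. The class name RangeViews and the sample data are made up:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class RangeViews {
    // Builds a small sorted map to query.
    static NavigableMap<Integer, String> sample() {
        NavigableMap<Integer, String> map = new TreeMap<>();
        map.put(30, "thirty");
        map.put(10, "ten");
        map.put(40, "forty");
        map.put(20, "twenty");
        return map;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> map = sample();
        System.out.println(map.headMap(30));    // keys strictly below 30: {10=ten, 20=twenty}
        System.out.println(map.tailMap(30));    // keys from 30 upwards: {30=thirty, 40=forty}
        System.out.println(map.floorKey(25));   // greatest key <= 25: 20
        System.out.println(map.ceilingKey(25)); // smallest key >= 25: 30
    }
}
```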
Further out there are java.util.concurrent map implementations, like ConcurrentHashMap, which offer great concurrent performance and atomic get/put operations.
HashMap: use it almost all the time. Note that your key objects need proper implementations of the equals and hashCode methods. Does not preserve insertion order.
Hashtable: never use it.
LinkedHashMap: the same as HashMap, but preserves insertion order. Extra memory overhead for the links between entries.
TreeMap: supports natural ordering, but insertion takes O(log n).
Hashtable is the thread-safe version of HashMap, but you shouldn't use it anymore. Instead, use ConcurrentHashMap, which is a newer implementation of a thread-safe map.
TreeMap is mostly used when you want your keys sorted; it implements the SortedMap interface. The put/get performance is O(log n).
ConcurrentSkipListMap is used if you need a thread-safe SortedMap.
LinkedHashMap is used when you want to iterate over the keys in insertion order.
I mostly use HashMap, or ConcurrentHashMap if I need it to be thread-safe.
There are of course important differences between each of these maps. It depends purely on what you are trying to do. Recall that a HashMap becomes pretty useless (read: inefficient) when you have a poor hash function in place. The LinkedHashMap is a HashMap that is backed by a doubly linked list, so you can iterate over it in a predictable order. You pay the overhead that is associated with a linked list, of course. TreeMap keeps elements in order, so you pay that overhead. Hashtable is a synchronized collection that is generally avoided.
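The ordering differences described in the answers above can be seen with a small experiment. This is only an illustrative sketch; the class MapOrderDemo and its sample keys are made up:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderDemo {
    // Inserts the same three keys into the given map and returns them in iteration order.
    static List<String> keysAfterInserting(Map<String, Integer> map) {
        map.put("banana", 2);
        map.put("cherry", 3);
        map.put("apple", 1);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap: insertion order.
        System.out.println(keysAfterInserting(new LinkedHashMap<>())); // [banana, cherry, apple]
        // TreeMap: sorted by key.
        System.out.println(keysAfterInserting(new TreeMap<>()));       // [apple, banana, cherry]
        // HashMap: some order that depends on hashing; it is not specified.
        System.out.println(keysAfterInserting(new HashMap<>()));
    }
}
```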
are any important differences between those Map implementations
Yes there are some major differences to consider when choosing an implementation of Map.
Concurrency: Will you be manipulating this map across threads?
NULLs: Do you want to accept, or reject, NULL pointers as key and/or value?
Sorting: Do you want map entries put in some order, such as sorted order or original-insertion order? Do you want support for the SortedMap/NavigableMap interfaces?
Not Modifiable: Do you want a map to be frozen, refusing to accept or remove entries?
Identity: Do you want to compare keys based on reference-equality or object-equality?
Efficiency: Do you want to take advantage of the very fast performance and very little memory used when your key is an enum?
Literals: Do you want the convenience of declaring and populating a map in a single line of code?
Legacy: Do you want to avoid using a legacy map, created before the modern Java Collections Framework?
Here is a graphic table I made comparing the features of each of the ten Map implementations bundled with Java 11.

Synchronizing LinkedHashmap externally

What is the best way to synchronize a LinkedHashMap externally, without using Collections.synchronizedMap?
When Collections.synchronizedMap is used, the entire data structure is locked, so performance suffers badly.
What is the best way to lock only the required part of the data structure? E.g. if a thread is accessing key K1, it should lock only the key (K1) and value (V1) part of the data structure.
You can't get a fine-grained-locking, FIFO-eviction concurrent map from the built-in Java implementations.
Check out Guava's Cache or the open-source ConcurrentLinkedHashMap project.
I think you may want to synchronize the subsequent operation you do, just on the value coming from the map:
Object value = map.get(key);
if (value != null) { // get() returns null for a missing key, and synchronized (null) throws
    synchronized (value) {
        doSomethingWith(value);
    }
}
Synchronizing on the values obtained from the Map makes sense, since they can be shared and accessed concurrently; the example I posted above should do what you need.
By the way, you can also synchronize on the key, using two nested synchronized blocks:
synchronized (key) {
    Object value = map.get(key);
    if (value != null) { // guard against a missing key, as above
        synchronized (value) {
            doSomethingWith(value);
        }
    }
}
The key is usually just used to access the object (by hashing). Keys are matched by hash code and equals, so it doesn't make much sense to me to synchronize on the key.
Or, maybe you can subclass ConcurrentHashMap adding what is missing from LinkedHashMap.
Louis Wasserman's suggestion is probably the best because it gives you a lot of useful functionality. However, even if you lock on the entire map, you have to be hitting it really, really hard to make that a bottleneck (as in, your code is mostly doing read/write on the map). If you don't need the additional functionality of Guava's Cache, a synchronized map could be simpler & better. You could also use a ReadWriteLock if you mostly read from the map.
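The ReadWriteLock idea mentioned above could look roughly like this. It is only a sketch: the class name GuardedLinkedMap is made up, and it assumes an insertion-ordered LinkedHashMap (in an access-ordered one, get() reorders entries and would need the write lock instead):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class GuardedLinkedMap<K, V> {
    // Insertion-ordered map, so get() does not modify its structure.
    private final Map<K, V> map = new LinkedHashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Many readers may hold the read lock at the same time.
    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Writers get exclusive access.
    V put(K key, V value) {
        lock.writeLock().lock();
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Returns a copy of the keys in insertion order, taken under the read lock.
    List<K> keys() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(map.keySet());
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

This still locks the whole map rather than a single entry; it only lets concurrent readers proceed in parallel, which helps when reads dominate.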
The best option would be to use java.util.concurrent.ConcurrentHashMap.
I can't see how it would be possible to externally lock only parts of your Map, since you cannot control which shared data structures are accessed internally by a call to any of the map's methods.
If you don't need a LinkedHashMap, use a ConcurrentHashMap from the java.util.concurrent package.
It is specifically designed for both speed and thread safety. It uses the minimal possible locking to achieve its thread safety.
An insertion in a HashMap, or LinkedHashMap, can cause a rehash because it increases the ratio between the size and the number of buckets. Having two or more threads rehash simultaneously would be a disaster.
Even if you are only doing a get, another thread may be removing an entry from the same bucket, so you are scanning a linked list that is being modified under you. You could also have two or more threads appending to the main linked list at the same time.
If you can do without the linking, use java.util.concurrent.ConcurrentHashMap, as already suggested.
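As a rough illustration of what ConcurrentHashMap provides without any external locking, the sketch below uses merge, which ConcurrentHashMap performs atomically for the affected key. The HitCounter class, its methods, and the "page" key are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;

class HitCounter {
    private final ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

    // Atomically adds 1 to the counter for this key, creating it if absent.
    void record(String key) {
        hits.merge(key, 1, Integer::sum);
    }

    int count(String key) {
        return hits.getOrDefault(key, 0);
    }

    // Starts several threads that all update the same key, then returns the final count.
    static int runDemo(int threads, int perThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.record("page");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.count("page");
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // 4000
    }
}
```

The same loop with a plain HashMap and a separate get followed by put could lose updates or corrupt the map, for the reasons described above.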

How can I use a Java ConcurrentNavigableMap with comparator instead of a TreeMap?

I need to build a queue that always stays sorted by its keys.
TreeMap seems to be great for this, as in this example:
http://www.javaexamples4u.com/2009/03/treemap-comparator.html
The big problem is that it's not thread-safe. Then I found ConcurrentNavigableMap,
which is great, but how do I use a comparator the same way as with TreeMap? I didn't find any example of it.
It sounds like you are actually looking for a PriorityQueue. If you need a thread-safe version of it, you can use a PriorityBlockingQueue.
A priority queue is a queue where you can retrieve items ordered by "importance." In the case of Java, you can use a Comparator or the items' natural order (if they implement Comparable).
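A minimal sketch of a PriorityBlockingQueue with a Comparator, assuming String elements ordered by length. The class name PriorityDemo and the ordering rule are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

class PriorityDemo {
    // Adds the words to a thread-safe priority queue and removes them shortest first.
    static List<String> drainByLength(String... words) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        // 11 is just an initial capacity; the queue grows as needed.
        PriorityBlockingQueue<String> queue = new PriorityBlockingQueue<>(11, byLength);
        for (String word : words) {
            queue.add(word);
        }
        List<String> result = new ArrayList<>();
        String next;
        // poll() returns the smallest remaining element, or null when the queue is empty.
        while ((next = queue.poll()) != null) {
            result.add(next);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(drainByLength("ccc", "a", "bb")); // [a, bb, ccc]
    }
}
```

Only removal through poll() or take() follows the priority order; iterating over the queue does not visit elements in any particular order.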
If you really need to use a ConcurrentNavigableMap, you will need to use an implementation of it such as ConcurrentSkipListMap. Just allocate an instance of ConcurrentSkipListMap and pass it the comparator you want to use.
new ConcurrentSkipListMap<MyKeyType, MyValueType>(new MyKeyComparator());
ConcurrentNavigableMap is just the interface. You need to use a concrete class implementing it, which in the standard collections library is ConcurrentSkipListMap.
You should basically be able to use ConcurrentSkipListMap as a drop-in replacement for TreeMap, including using the comparator. Operations will generally have similar performance characteristics (O(log n)), but as I understand it, the size() operation of ConcurrentSkipListMap requires traversal of the skip list rather than simply reading a variable, so just be slightly careful if you call this frequently on a large map.
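For completeness, here is a small runnable sketch of passing a comparator to ConcurrentSkipListMap. It assumes Integer keys in descending order; the class name DescendingSkipList is hypothetical:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

class DescendingSkipList {
    // The comparator is passed to the constructor, exactly as with TreeMap.
    static ConcurrentNavigableMap<Integer, String> newDescendingMap() {
        Comparator<Integer> descending = Comparator.reverseOrder();
        return new ConcurrentSkipListMap<>(descending);
    }

    public static void main(String[] args) {
        ConcurrentNavigableMap<Integer, String> map = newDescendingMap();
        map.put(1, "low");
        map.put(3, "high");
        map.put(2, "medium");
        List<Integer> keys = new ArrayList<>(map.keySet());
        System.out.println(keys);           // [3, 2, 1]
        System.out.println(map.firstKey()); // 3
    }
}
```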
