I have an interesting problem I would like some help with. I have implemented two queues for two separate conditions: one ordered FIFO and the other by the natural order of a key (a ConcurrentMap). That is, you can imagine both queues holding the same data, just ordered differently. My question (and I am looking for an efficient way of doing this): if I find a key in the ConcurrentMap based on some criteria, what is the best way of finding the "position" of that key in the FIFO map? Essentially I would like to know whether it is the first key (which is easy) or, say, the 10th key.
Any help would be greatly appreciated.
There is no API for accessing the order in a FIFO map. The only way you can do it is iterate over keySet(), values() or entrySet() and count.
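Counting by iteration can be sketched as follows. This is only a minimal sketch, assuming the FIFO structure is an insertion-ordered LinkedHashMap; the names FifoPosition and positionOf are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FifoPosition {
    /** Returns the zero-based insertion position of key, or -1 if absent. Runs in O(n). */
    static <K> int positionOf(Map<K, ?> fifo, K key) {
        int index = 0;
        for (K candidate : fifo.keySet()) {
            if (candidate.equals(key)) {
                return index;
            }
            index++;
        }
        return -1;
    }

    public static void main(String[] args) {
        // LinkedHashMap iterates in insertion order, so it stands in for the FIFO map here.
        Map<String, Integer> fifo = new LinkedHashMap<>();
        fifo.put("c", 3);
        fifo.put("a", 1);
        fifo.put("b", 2);
        System.out.println(positionOf(fifo, "a")); // 1
        System.out.println(positionOf(fifo, "z")); // -1
    }
}
```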
I believe something like the code below will do the job. I've left the implementation of element --> key as an abstract method. Note the counter being used to assign increasing numbers to elements. Also note that if add(...) is being called by multiple threads, the elements in the FIFO are only loosely ordered. That forces the fancy max(...) and min(...) logic. It's also why the position is approximate. First and last are special cases. First can be indicated clearly. Last is tricky because the current implementation returns a real index.
Since this is an approximate location, I would suggest you consider making the API return a float between 0.0 and 1.0 to indicate relative position in the queue.
If your code needs to support removal using some means other than pop(...), you will need to use approximate size, and change the return to ((id - min) / (max - min)) * size, with all the appropriate int / float casting & rounding.
public abstract class ApproximateLocation<K extends Comparable<K>, T> {
protected abstract K orderingKey(T element);
private final ConcurrentMap<K, Wrapper<T>> _map = new ConcurrentSkipListMap<K, Wrapper<T>>();
private final Deque<Wrapper<T>> _fifo = new LinkedBlockingDeque<Wrapper<T>>();
private final AtomicInteger _counter = new AtomicInteger();
public void add(T element) {
K key = orderingKey(element);
Wrapper<T> wrapper = new Wrapper<T>(_counter.getAndIncrement(), element);
_fifo.add(wrapper);
_map.put(key, wrapper);
}
public T pop() {
Wrapper<T> wrapper = _fifo.pop();
_map.remove(orderingKey(wrapper.value));
return wrapper.value;
}
public int approximateLocation(T element) {
Wrapper<T> wrapper = _map.get(orderingKey(element));
Wrapper<T> first = _fifo.peekFirst();
Wrapper<T> last = _fifo.peekLast();
if (wrapper == null || first == null || last == null) {
// element is not in composite structure; fifo has not been written to yet because of concurrency
return -1;
}
int min = Math.min(wrapper.id, Math.min(first.id, last.id));
int max = Math.max(wrapper.id, Math.max(first.id, last.id));
if (wrapper == first || max == min) {
return 0;
}
if (wrapper == last) {
return max - min;
}
return wrapper.id - min;
}
private static class Wrapper<T> {
final int id;
final T value;
Wrapper(int id, T value) {
this.id = id;
this.value = value;
}
}
}
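The suggestion of returning a relative position, and the scaled formula for queues with arbitrary removal, could look roughly like this. It is a sketch only: the class and method names are hypothetical, and min and max are assumed to be the ids currently at the head and tail of the FIFO, as in the class above.

```java
final class RelativePosition {
    /** Maps an id to a fraction in [0.0, 1.0] between the head id (min) and the tail id (max). */
    static double relativePosition(int id, int min, int max) {
        if (max == min) {
            return 0.0; // a single element is the head
        }
        // Convert to double before subtracting to avoid int overflow.
        double fraction = ((double) id - min) / ((double) max - min);
        // Clamp, since concurrent writers can make the bounds slightly stale.
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    /** ((id - min) / (max - min)) * size, rounded to the nearest int. */
    static int approximateIndex(int id, int min, int max, int approximateSize) {
        return (int) Math.round(relativePosition(id, min, max) * approximateSize);
    }

    public static void main(String[] args) {
        System.out.println(relativePosition(5, 0, 10));    // 0.5
        System.out.println(approximateIndex(5, 0, 10, 4)); // 2
    }
}
```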
If you can use a ConcurrentNavigableMap, the size of the headMap gives you exactly what you want.
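A small sketch of the headMap idea, with made-up names (HeadMapRank, rankOf). Note that headMap(key).size() is the rank of the key in the map's own key order, so to read it as a FIFO position this sketch assumes the map is keyed by an increasing sequence number. Also keep in mind that size() on a ConcurrentSkipListMap (and its views) is not a constant-time operation.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

final class HeadMapRank {
    /** Number of keys strictly smaller than key, i.e. its zero-based rank in key order. */
    static <K> int rankOf(ConcurrentNavigableMap<K, ?> map, K key) {
        return map.headMap(key).size();
    }

    public static void main(String[] args) {
        // Assumption: keys are sequence numbers handed out at insertion time.
        ConcurrentSkipListMap<Long, String> bySequence = new ConcurrentSkipListMap<>();
        bySequence.put(100L, "first");
        bySequence.put(101L, "second");
        bySequence.put(102L, "third");
        System.out.println(rankOf(bySequence, 102L)); // 2
    }
}
```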
I'm trying to write a data structure that is capable of setting all the values in O(1).
My code:
public class myData {
boolean setAllStatus = false;
HashMap<Integer, Integer> hasMap = new HashMap<>();
int setAllValue = 0;
int count = 0;
public void set(int key, int value) {
hasMap.put(key, value);
}
public int get(int key) {
if (setAllStatus) {
if (hasMap.get(key) != null) {
if (count == hasMap.size()) {
return setAllValue;
} else {
// do something
}
} else {
throw new NullPointerException();
}
} else {
if (hasMap.get(key) == null) {
throw new NullPointerException();
} else {
return hasMap.get(key);
}
}
}
public void setAll(int value) {
setAllStatus = true;
setAllValue = value;
count = hasMap.size();
}
public static void main(String[] args) {
myData m = new myData();
m.set(1, 4);
m.set(4, 5);
System.out.println(m.get(4)); // 5
m.setAll(6);
System.out.println(m.get(4)); // 6
m.set(8, 7);
System.out.println(m.get(8)); // 7
}
}
When I set values for the first time and then set all of them to one specific value, it works, but I'm not sure how to handle putting a new value after setAll has been called.
What kind of solution can I use to make it work?
If you want to enhance your knowledge of data structures, I suggest you implement your own version of the hash table data structure from the ground up (define an array of buckets, learn how to store elements in a bucket, how to resolve collisions and so on...) instead of decorating the HashMap.
Your current code is very contrived:
By its nature, get() should not do anything apart from retrieving a value associated with a key because that's the only responsibility of this method (have a look at the implementation of get() in the HashMap class). Get familiar with the Single responsibility principle.
The idea of throwing an exception when the given key isn't present in the map is strange. And NullPointerException is not the right type of exception to describe this case, NoSuchElementException would be more intuitive.
You might also be interested in learning What does it mean to "program to an interface"?
And the main point is that you've picked the wrong starting point (see the advice at the very beginning): learn more about data structures starting from the simplest ones, like the dynamic array, try to implement them from scratch, and gradually learn more about class design and language features.
Time complexity
Regarding the time complexity: since your class decorates a HashMap, the methods set() and get() run in constant time O(1).
If you need to change all the values in a HashMap, that can only be done in linear time O(n). Assuming that all existing values are represented by objects that are distinct from one another, it's inherently impossible to perform this operation in constant time, because we need to make this change for every node in each bucket.
The only situation when all values can be set to a new value in a constant time is the following very contrived example where each and every key would be associated with the same object, and we need to maintain a reference to this object (i.e. it would always retrieve the same value for every key that is present, which doesn't seem to be particularly useful):
public class SingleValueMap<K, V> {
private Map<K, V> map = new HashMap<>();
private V commonValue;
public void setAll(V newValue) {
this.commonValue = newValue;
}
public void add(K key) {
map.put(key, commonValue);
}
public void add(K key, V newValue) {
setAll(newValue);
map.put(key, commonValue);
}
public V get(K key) {
if (!map.containsKey(key)) throw new NoSuchElementException();
return commonValue;
}
}
And since we are no longer using the actual HashMap's functionality for storing the values, HashMap can be replaced with HashSet:
public class SingleValueMap<K, V> {
private Set<K> set = new HashSet<>();
private V commonValue;
public void setAll(V newValue) {
this.commonValue = newValue;
}
public void add(K key) {
set.add(key);
}
public void add(K key, V newValue) {
setAll(newValue);
set.add(key);
}
public V get(K key) {
if (!set.contains(key)) throw new NoSuchElementException();
return commonValue;
}
}
If I'm understanding the problem here correctly, every time setAll is called, we effectively forget about all the values of the HashMap and track only its keys basically as if it were a HashSet, where get uses the value passed into setAll. Additionally, any new set calls should still track both the key and the value until setAll is called some time later.
In other words, you need to track the set of keys before setAll, and the set of key-and-values after setAll separately in order to be able to distinguish them.
See if you can find a way, either amortized or through constant-time operations, to keep track of which keys are and are not associated with the latest setAll operation.
Given that this looks like a homework problem, I am hesitating to help further (as per these SO guidelines), but if this is not homework, let me know and I can delve further into this topic.
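For reference, one common way to make that distinction is to stamp every write with a version number. The following is just a sketch under that assumption, not the only possible design; the class name SetAllMap and its fields are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

final class SetAllMap<K, V> {
    /** A value together with the logical time at which it was written. */
    private static final class Stamped<T> {
        final T value;
        final long version;

        Stamped(T value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<K, Stamped<V>> entries = new HashMap<>();
    private long clock = 0;          // incremented on every write
    private long setAllVersion = -1; // -1 means setAll has not been called yet
    private V setAllValue;

    void set(K key, V value) {
        entries.put(key, new Stamped<>(value, clock++));
    }

    /** O(1): only remembers the value and when it was set. */
    void setAll(V value) {
        setAllValue = value;
        setAllVersion = clock++;
    }

    V get(K key) {
        Stamped<V> entry = entries.get(key);
        if (entry == null) {
            throw new NoSuchElementException("no such key: " + key);
        }
        // An entry written after the last setAll wins; otherwise setAll overrides it.
        return entry.version > setAllVersion ? entry.value : setAllValue;
    }

    public static void main(String[] args) {
        SetAllMap<Integer, Integer> m = new SetAllMap<>();
        m.set(1, 4);
        m.set(4, 5);
        System.out.println(m.get(4)); // 5
        m.setAll(6);
        System.out.println(m.get(4)); // 6
        m.set(8, 7);
        System.out.println(m.get(8)); // 7
    }
}
```

All three operations stay O(1) on average because setAll never touches the stored entries.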
I am trying to implement a hash cons in Java, comparable to what String.intern does for strings. I.e., I want a class that stores all distinct values of a data type T in a set and provides a T intern(T t) method that checks whether t is already in the set. If so, the instance in the set is returned; otherwise t is added to the set and returned. The reason is that the resulting values can be compared using reference equality, since two equal values returned from intern will for sure also be the same instance.
Of course, the most obvious candidate data structure for a hash cons is java.util.HashSet<T>. However, it seems that its interface is flawed and does not allow efficient insertion, because there is no method to retrieve an element that is already in the set or insert one if it is not in there.
An algorithm using HashSet would look like this:
class HashCons<T>{
HashSet<T> set = new HashSet<>();
public T intern(T t){
if(set.contains(t)) {
return ???; // <----- PROBLEM
} else {
set.add(t); // <--- Inefficient, second hash lookup
return t;
}
}
}
As you see, the problem is twofold:
This solution would be inefficient since I would access the hash table twice, once for contains and once for add. But okay, this may not be too big a performance hit, since the correct bucket will be in the cache after the contains, so add will not trigger a cache miss and will thus be quite fast.
I cannot retrieve an element already in the set (see line flagged PROBLEM). There is just no method to retrieve the element in the set. So it is just not possible to implement this.
Am I missing something here? Or is it really impossible to build a usual hash cons with java.util.HashSet?
I don't think it's possible using HashSet. You could use some kind of Map instead and use your value as both key and value. The java.util.concurrent.ConcurrentMap also happens to possess the quite convenient method
putIfAbsent(K key, V value)
that returns the value if it is already existent. However, I don't know about the performance of this method (compared to checking "manually" on non-concurrent implementations of Map).
Here is how you would do it using a HashMap:
class HashCons<T>{
Map<T,T> map = new HashMap<T,T>();
public T intern(T t){
if (!map.containsKey(t))
map.put(t,t);
return map.get(t);
}
}
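The three lookups above (containsKey, put, get) can be reduced to one. A minimal sketch, assuming Java 8 or later, where Map.putIfAbsent is available on plain HashMap as well; the class name Interner is hypothetical. For concurrent use, a ConcurrentHashMap could be swapped in for the HashMap.

```java
import java.util.HashMap;
import java.util.Map;

final class Interner<T> {
    private final Map<T, T> pool = new HashMap<>();

    /** Returns the canonical instance equal to t, registering t if none exists yet. */
    T intern(T t) {
        // putIfAbsent returns the previously mapped value, or null if t was just inserted.
        T existing = pool.putIfAbsent(t, t);
        return existing != null ? existing : t;
    }

    public static void main(String[] args) {
        Interner<String> interner = new Interner<>();
        String a = new String("x");
        String b = new String("x"); // equal to a, but a different instance
        System.out.println(interner.intern(a) == a); // true
        System.out.println(interner.intern(b) == a); // true
    }
}
```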
I think the reason why it is not possible with HashSet is quite simple: to the set, if contains(t) is fulfilled, it means that the given t also equals one of the t' in the set. There is no reason to be able to return it (as you already have it).
Well, HashSet is implemented as a HashMap wrapper in OpenJDK, so you won't save any memory compared to the solution suggested by aRestless.
10-min sketch
class HashCons<T> {
T[] table;
int size;
int sizeLimit;
HashCons(int expectedSize) {
init(Math.max(Integer.highestOneBit(expectedSize * 2) * 2, 16));
}
private void init(int capacity) {
table = (T[]) new Object[capacity];
size = 0;
sizeLimit = (int) (capacity * 2L / 3);
}
T cons(@Nonnull T key) {
int mask = table.length - 1;
int i = key.hashCode() & mask;
do {
if (table[i] == null) break;
if (key.equals(table[i])) return table[i];
i = (i + 1) & mask;
} while (true);
table[i] = key;
if (++size > sizeLimit) rehash();
return key;
}
private void rehash() {
T[] table = this.table;
if (table.length == (1 << 30))
throw new IllegalStateException("HashCons is full");
init(table.length << 1);
for (T key : table) {
if (key != null) cons(key);
}
}
}
I'm looking at the algorithms course on coursera.org by Robert Sedgewick. He mentions that keys should be immutable for a priority queue. Why?
Here is a simple example:
public class PriorityQueueDemo // class name is illustrative; the original snippet omitted it
{
public static void main(String[] args)
{
Comparator<AtomicInteger> comparator = new AtomicIntegerComparater();
PriorityQueue<AtomicInteger> queue =
new PriorityQueue<AtomicInteger>(10, comparator);
AtomicInteger lessInteger = new AtomicInteger(10);
AtomicInteger middleInteger = new AtomicInteger(20);
AtomicInteger maxInteger = new AtomicInteger(30);
queue.add(lessInteger);
queue.add(middleInteger);
queue.add(maxInteger);
while (queue.size() != 0)
{
System.out.println(queue.remove());
}
queue.add(lessInteger);
queue.add(middleInteger);
queue.add(maxInteger);
lessInteger.addAndGet(30);
while (queue.size() != 0)
{
System.out.println(queue.remove());
}
}
}
class AtomicIntegerComparater implements Comparator<AtomicInteger>
{
@Override
public int compare(AtomicInteger x, AtomicInteger y)
{
if (x.get() < y.get())
{
return -1;
}
if (x.get() > y.get())
{
return 1;
}
return 0;
}
}
You will get an output like
10
20
30
40
20
30
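Note that in the second round of removals, 40 is removed first, although one would expect it to be removed last. The element had the value 10 when it was added, so it was placed at the head of the queue, and the queue was never told about the later change.
However, if you add another element to the same queue after the mutation, the elements in this example come out in the correct order again, because the insertion re-compares and moves elements on its way up the heap. This is not guaranteed in general.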
queue.add(lessInteger);
queue.add(middleInteger);
queue.add(maxInteger);
lessInteger.addAndGet(30);
queue.add(new AtomicInteger(5));
while (queue.size() != 0)
{
System.out.println(queue.remove());
}
would result as
5
20
30
40
Check the siftUpUsingComparator method of PriorityQueue:
private void siftUpUsingComparator(int k, E x) {
while (k > 0) {
int parent = (k - 1) >>> 1;
Object e = queue[parent];
if (comparator.compare(x, (E) e) >= 0)
break;
queue[k] = e;
k = parent;
}
queue[k] = x;
}
Is this applicable to other collections?
Well, it depends on the implementation of that collection.
For example, TreeSet falls into the same category: it only uses the Comparator when inserting, not when iterating.
TreeSet<AtomicInteger> treeSets = new TreeSet<AtomicInteger>(comparator);
lessInteger.set(10);
treeSets.add(middleInteger);
treeSets.add(lessInteger);
treeSets.add(maxInteger);
lessInteger.addAndGet(30);
for (Iterator<AtomicInteger> iterator = treeSets.iterator(); iterator.hasNext();) {
AtomicInteger atomicInteger = iterator.next();
System.out.println(atomicInteger);
}
Would result in
40
20
30
Which is not expected.
The reason why keys (entries) of a PriorityQueue should be immutable is that the PriorityQueue can not detect changes of these keys. For example, when you insert a key with a certain priority, it will be placed at a certain position in the queue. (Actually, in the backing implementation, it is more like a "tree", but this does not matter here). When you now modify this object, then its priority may change. But it will not change its position in the queue, because the queue does not know that the object was modified. The placement of this object in the queue may then simply be wrong, and the queue will become inconsistent (that is, return objects in the wrong order).
Note that the objects do not strictly have to be completely immutable. The important point is that there may be no modification of the objects that affects their priority. It's perfectly feasible to modify a field of the object that is not involved in the computation of the priority. But care has to be taken, because whether a change affects the priority may or may not be specified explicitly in the class of the respective entries.
Here is a simple example that shows how it breaks - the first queue prints numbers in order as expected but the second doesn't because one of the numbers has been mutated after having been added to the queue.
public static void main(String[] args) throws Exception {
PriorityQueue<Integer> ok = new PriorityQueue<>(Arrays.asList(1, 2, 3));
Integer i = null;
while ((i = ok.poll()) != null) System.out.println(i); //1,2,3
PriorityQueue<AtomicInteger> notOk = new PriorityQueue<>(Comparator.comparing(AtomicInteger::intValue));
AtomicInteger one = new AtomicInteger(1);
notOk.add(one);
notOk.add(new AtomicInteger(2));
notOk.add(new AtomicInteger(3));
one.set(7);
AtomicInteger ai = null;
while ((ai = notOk.poll()) != null) System.out.println(ai); //7,2,3
}
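If a priority really has to change while the element is queued, a common workaround is to remove the element, change it, and add it back. A rough sketch of that idea follows; the helper names update and drain are hypothetical, and remove(Object) on a PriorityQueue is a linear-time operation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class PriorityUpdate {
    /** Applies a priority-changing mutation while the element is outside the queue. */
    static <E> void update(PriorityQueue<E> queue, E element, Runnable mutation) {
        boolean wasQueued = queue.remove(element); // remove first, while the heap is still consistent
        mutation.run();
        if (wasQueued) {
            queue.add(element); // re-inserted at the position matching its new priority
        }
    }

    /** Polls everything and returns the values in removal order. */
    static List<Integer> drain(PriorityQueue<AtomicInteger> queue) {
        List<Integer> out = new ArrayList<>();
        AtomicInteger next;
        while ((next = queue.poll()) != null) {
            out.add(next.get());
        }
        return out;
    }

    public static void main(String[] args) {
        Comparator<AtomicInteger> byValue = Comparator.comparingInt(AtomicInteger::get);
        PriorityQueue<AtomicInteger> queue = new PriorityQueue<>(byValue);
        AtomicInteger one = new AtomicInteger(1);
        queue.add(one);
        queue.add(new AtomicInteger(2));
        queue.add(new AtomicInteger(3));
        update(queue, one, () -> one.set(7));
        System.out.println(drain(queue)); // [2, 3, 7]
    }
}
```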
Why cannot I retrieve an element from a HashSet?
Consider my HashSet containing a list of MyHashObjects with their hashCode() and equals() methods overridden correctly. I was hoping to construct a MyHashObject myself, and set the relevant hash code properties to certain values.
I can query the HashSet to see if there are "equivalent" objects in the set using the contains() method. So even though contains() returns true for the two objects, they may not be == true.
How come then there isn’t any get() method similar to how the contains() works?
What is the thinking behind this API decision?
If you know what element you want to retrieve, then you already have the element. The only question for a Set to answer, given an element, is whether it contains() it or not.
If you want to iterate over the elements, just use a Set.iterator().
It sounds like what you're trying to do is designate a canonical element for an equivalence class of elements. You can use a Map<MyObject,MyObject> to do this. See this Stack Overflow question or this one for a discussion.
If you are really determined to find an element that .equals() your original element with the constraint that you must use the HashSet, I think you're stuck with iterating over it and checking equals() yourself. The API doesn't let you grab something by its hash code. So you could do:
MyObject findIfPresent(MyObject source, HashSet<MyObject> set)
{
if (set.contains(source)) {
for (MyObject obj : set) {
if (obj.equals(source))
return obj;
}
}
return null;
}
It is brute-force and O(n) ugly, but if that's what you need to do...
You can use HashMap<MyHashObject, MyHashObject> instead of HashSet<MyHashObject>.
Calling containsKey() with your "reconstructed" MyHashObject first uses hashCode() to find the bucket and then, on a hash match, uses equals() to compare your "reconstructed" object against the original; you can then retrieve the original using get().
Lookup is O(1) on average, but the downside is that you will likely have to override both the equals() and hashCode() methods.
It sounds like you're essentially trying to use the hash code as a key in a map (which is what HashSets do behind the scenes). You could just do it explicitly, by declaring HashMap<Integer, MyHashObject>.
There is no get for HashSets because typically the object you would supply to the get method as a parameter is the same object you would get back.
If you know the order of elements in your Set, you can retrieve them by converting the Set to an Array. Something like this:
Set mySet = MyStorageObject.getMyStringSet();
Object[] myArr = mySet.toArray();
String value1 = myArr[0].toString();
String value2 = myArr[1].toString();
The need to get a reference to the object that is contained inside a Set is common. It can be achieved in two ways:
Use HashSet as you wanted, then:
public Object getObjectReference(HashSet<Xobject> set, Xobject obj) {
if (set.contains(obj)) {
for (Xobject o : set) {
if (obj.equals(o))
return o;
}
}
return null;
}
For this approach to work, you need to override both hashCode() and equals(Object o) methods
In the worst scenario we have O(n)
Second approach is to use TreeSet
public Object getObjectReference(TreeSet<Xobject> set, Xobject obj) {
if (set.contains(obj)) {
return set.floor(obj);
}
return null;
}
This approach gives O(log(n)), more efficient.
You don't need to override hashCode for this approach but you have to implement Comparable interface. ( define function compareTo(Object o)).
One of the easiest ways is to convert to Array:
Object[] elements = set.toArray(); // convert once, not on every iteration
for(int i = 0; i < elements.length; i++) {
System.out.println(elements[i]);
}
Suppose I know for sure that, in my application, the object is never used for searching in any list or hash data structure, and that equals() is not called anywhere except indirectly by the hash data structure while adding. Is it advisable to update the existing object in the set from within the equals() method? Refer to the code below. If I add this bean to a HashSet, I can do group aggregation on the object with the matching key (id). This way I am able to implement aggregation functions such as sum, max, min, and so on. If it is not advisable, please feel free to share your thoughts.
public class MyBean {
String id,
name;
double amountSpent;
@Override
public int hashCode() {
return id.hashCode();
}
@Override
public boolean equals(Object obj) {
if(obj!=null && obj instanceof MyBean ) {
MyBean tmpObj = (MyBean) obj;
if(tmpObj.id!=null && tmpObj.id.equals(this.id)) {
tmpObj.amountSpent += this.amountSpent;
return true;
}
}
return false;
}
}
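For comparison, the same per-id aggregation can be done without any side effects in equals(), by keeping a map from id to the running total. This is only a sketch of one alternative; the class SpendTotals and its methods are made up for illustration, and Map.merge assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

final class SpendTotals {
    private final Map<String, Double> totals = new HashMap<>();

    /** Adds amountSpent to the running sum for id, creating the entry on first use. */
    void add(String id, double amountSpent) {
        totals.merge(id, amountSpent, Double::sum);
    }

    /** Returns the sum for id, or 0.0 if nothing was recorded. */
    double total(String id) {
        return totals.getOrDefault(id, 0.0);
    }

    public static void main(String[] args) {
        SpendTotals totals = new SpendTotals();
        totals.add("a", 1.5);
        totals.add("a", 2.5);
        totals.add("b", 1.0);
        System.out.println(totals.total("a")); // 4.0
        System.out.println(totals.total("b")); // 1.0
    }
}
```

Swapping Double::sum for Math::max or Math::min gives the other aggregations mentioned in the question.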
First of all, convert your set to an array. Then, get the item by indexing the array.
Set<String> uniqueItem = new HashSet<>();
uniqueItem.add("0");
uniqueItem.add("1");
uniqueItem.add("0");
Object[] arrayItem = uniqueItem.toArray();
for(int i = 0; i < uniqueItem.size(); i++) {
System.out.println("Item " + i + " " + arrayItem[i].toString());
}
If you could use List as a data structure to store your data, instead of using Map to store the result in the value of the Map, you can use following snippet and store the result in the same object.
Here is a Node class:
private class Node {
public int row, col, distance;
public Node(int row, int col, int distance) {
this.row = row;
this.col = col;
this.distance = distance;
}
public boolean equals(Object o) {
return (o instanceof Node &&
row == ((Node) o).row &&
col == ((Node) o).col);
}
}
If you store your result in distance variable and the items in the list are checked based on their coordinates, you can use the following to change the distance to a new one with the help of lastIndexOf method as long as you only need to store one element for each data:
List<Node> nodeList;
nodeList = new ArrayList<>(Arrays.asList(new Node(1, 2, 1), new Node(3, 4, 5)));
Node tempNode = new Node(1, 2, 10);
if(nodeList.contains(tempNode))
nodeList.get(nodeList.lastIndexOf(tempNode)).distance += tempNode.distance;
It is basically reimplementing Set whose items can be accessed and changed.
If you want to have a reference to the real object using the same performance as HashSet, I think the best way is to use HashMap.
Example (in Kotlin, but similar in Java) of finding an object, changing some field in it if it exists, or adding it in case it doesn't exist:
val map = HashMap<DbData, DbData>()
val dbData = map[objectToFind]
if(dbData!=null){
++dbData.someIntField
}
else {
map[objectToFind] = objectToFind
}
I want to sort an ArrayList of pairs of integers. So far I've been able to sort them according to the first element, but I get something like (1,2), (1,-2). I also want to sort them according to the second element so I can get a correctly sorted ArrayList, but I can't seem to make it work.
The code for the first element sorting is:
private class FirstElmComparator implements Comparator<Pair> {
public int compare(Pair pr1, Pair pr2) {
return pr1.compareFirstElms(pr2);
}
}
and compareFirstElms function is the following:
protected int compareFirstElms (Pair p) {
return (new Integer (this.p1)).compareTo(new Integer (p.p1));
}
I can think of making the second element comparator as the following:
private class SecondElmComparator implements Comparator<Pair> {
public int compare(Pair pr1, Pair pr2) {
return pr1.compareSecondElms(pr2);
}
}
protected int compareSecondElms (Pair p) {
return (new Integer (this.p2)).compareTo(new Integer (p.p2));
}
NOTE: p1 and p2 are the first and second element in a pair.
But I think it will override the first element sorting order, or am I mistaken?
Can anybody help me with this.
You create one common comparator that evaluates both elements of the Pair.
public int compare(Pair pr1, Pair pr2) {
int firstResult = pr1.compareFirstElms(pr2);
if (firstResult == 0) { //First comparison returned that both elements are equal
return pr1.compareSecondElms(pr2);
} else {
return firstResult;
}
}
It's very simple, implement it like this:
If the first elements of the pairs to compare are different, then sort on the first element.
Otherwise (if the first elements are equal), sort on the second element.
You wouldn't use two distinct comparators but one (which might in turn call others to do the internal work).
So in pseudocode, the comparison would look like this:
public int compare(Pair pr1, Pair pr2) {
int result = compare(pr1.first, pr2.first);
if( result == 0 ) {
result = compare(pr1.second, pr2.second);
}
return result;
}
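On Java 8 and later, the same "first, then second" rule can also be written as a comparator chain. A minimal sketch, assuming a Pair class with int fields p1 and p2 as in the question; the PairSort wrapper and the constant name are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PairSort {
    static final class Pair {
        final int p1;
        final int p2;

        Pair(int p1, int p2) {
            this.p1 = p1;
            this.p2 = p2;
        }

        @Override
        public String toString() {
            return "(" + p1 + "," + p2 + ")";
        }
    }

    /** Orders by the first element; ties are broken by the second element. */
    static final Comparator<Pair> BY_FIRST_THEN_SECOND =
            Comparator.comparingInt((Pair p) -> p.p1).thenComparingInt(p -> p.p2);

    public static void main(String[] args) {
        List<Pair> pairs = new ArrayList<>(Arrays.asList(
                new Pair(1, 2), new Pair(1, -2), new Pair(0, 5)));
        pairs.sort(BY_FIRST_THEN_SECOND);
        System.out.println(pairs); // [(0,5), (1,-2), (1,2)]
    }
}
```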
Well, first off, you need to write an explicit method for that:
public int compare(Pair p) {
int first = compareFirstElms(p);
return first == 0 ? compareSecondElms(p) : first;
}
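Secondly, don’t over-engineer. Comparing two ints needs no conversion to Integer objects: Integer.compare(this.p1, p.p1) does it directly. (Writing this.p1 - p.p1 is even shorter, but the subtraction can overflow when the values are far apart, which yields the wrong sign.)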
Thirdly, I would choose explicit, concise yet complete names. Don’t arbitrarily abbreviate parts of words, this doesn’t exactly help readability. How about compareByFirst and compareBySecond, respectively?