Mapping int to int (in Java) - java

In Java.
How can I map a set of numbers(integers for example) to another set of numbers?
All the numbers are positive and all the numbers are unique in their own set.
The first set of numbers can have any value, the second set of numbers represent indexes of an array, and so the goal is to be able to access the numbers in the second set through the numbers in the first set. This is a one to one association.
Speed is crucial as the method will have to be called many times each second.
Edit: I tried it with SE hashmap implementation, but found it to be slow for my purposes.

There's an article, devoted to this problem (with a solution): Implementing a world fastest Java int-to-int hash map
Code can be found in related GitHub repository. (Best results are in class IntIntMap4a.java )
Citation from the article:
Summary
If you want to optimize your hash map for speed, you have to do as much as you can of the following list:
Use underlying array(s) with capacity equal to a power of 2 - it will allow you to use cheap & instead of expensive % for array index
Do not store the state in the separate array - use dedicated fields for free/removed keys and values.
Interleave keys and values in the one array - it will allow you to load a value into memory for free.
Implement a strategy to get rid of 'removed' cells - you can sacrifice some of remove performance in favor of more frequent get/put.
Scramble the keys while calculating the initial cell index - this is required to deal with the case of consecutive keys.
Yes, I know how to use citation formatting. But it looks awful and doesn't handle bullet lists well.
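To make the list above more concrete, here is a minimal sketch of an open-addressing int-to-int map that follows several of those points: power-of-two capacity, keys and values interleaved in one array, a dedicated field for the "free" key, and key scrambling. It is not the article's IntIntMap4a. The class name, the NO_VALUE sentinel and the multiplicative scramble constant are assumptions made for this example, and removal and resizing are left out.

```java
/** Minimal open-addressing int-to-int map: fixed capacity, no removal. Illustrative only. */
class IntIntMapSketch {
    static final int NO_VALUE = -1;          // returned for absent keys; fine when values are array indexes
    private static final int FREE_KEY = 0;   // a cell whose key slot holds 0 is empty

    private final int[] data;                // interleaved: data[i] = key, data[i + 1] = value
    private final int slotMask;              // data.length - 1 (data.length is a power of two)
    private final int maxSize;               // keep at least one empty cell so probing terminates
    private int size;
    private boolean hasFreeKey;              // key 0 lives in dedicated fields, not in the array
    private int freeValue;

    IntIntMapSketch(int expectedSize) {
        int capacity = 2;
        while (capacity < expectedSize * 2) capacity <<= 1;   // power of two, load factor <= 0.5
        data = new int[capacity * 2];
        slotMask = data.length - 1;
        maxSize = capacity - 1;
    }

    // Scramble the key so that consecutive keys do not land in consecutive cells.
    private int firstSlot(int key) {
        int h = key * 0x9E3779B9;            // assumed scramble constant (golden-ratio based)
        h ^= h >>> 16;
        return (h << 1) & slotMask;          // even slot index; a cheap & instead of %
    }

    void put(int key, int value) {
        if (key == FREE_KEY) { hasFreeKey = true; freeValue = value; return; }
        int i = firstSlot(key);
        while (true) {
            int k = data[i];
            if (k == FREE_KEY) {             // empty cell: insert here
                if (size == maxSize) throw new IllegalStateException("sketch does not resize");
                data[i] = key;
                data[i + 1] = value;
                size++;
                return;
            }
            if (k == key) { data[i + 1] = value; return; }   // overwrite existing key
            i = (i + 2) & slotMask;          // linear probing to the next cell
        }
    }

    int get(int key) {
        if (key == FREE_KEY) return hasFreeKey ? freeValue : NO_VALUE;
        int i = firstSlot(key);
        while (true) {
            int k = data[i];
            if (k == FREE_KEY) return NO_VALUE;
            if (k == key) return data[i + 1];   // value sits right next to the key
            i = (i + 2) & slotMask;
        }
    }

    public static void main(String[] args) {
        IntIntMapSketch map = new IntIntMapSketch(16);
        map.put(21, 42);
        System.out.println(map.get(21));     // 42
        System.out.println(map.get(99));     // -1 (absent)
    }
}
```

Because no boxing happens and a lookup usually touches a single cache line, this layout avoids the main costs of HashMap<Integer, Integer>. Whether it is fast enough for a given workload still has to be measured.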

The structure you are looking for is called an associative array. In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears just once in the collection.
In Java in particular, as already mentioned, this is easily done with a HashMap.
HashMap<Integer, Integer> cache = new HashMap<Integer, Integer>();
You can insert elements with the method put
cache.put(21, 42);
and you can retrieve a value with get
Integer key = 21;
Integer value = cache.get(key);
System.out.println("Key: " + key +" value: "+ value);
Key: 21 value: 42
If you want to iterate through the data, you can use an iterator over the key set:
Iterator<Integer> iterator = cache.keySet().iterator();
while (iterator.hasNext()) {
Integer key = iterator.next();
System.out.println("key: " + key + " value: " + cache.get(key));
}

Sounds like HashMap<Integer,Integer> is what you're looking for.

If you are willing to use an external library, you can use Apache's IntToIntMap, which is part of Apache Lucene.
It implements a pretty efficient int to int map that uses primitives for tasks that should not suffer the boxing overhead.

If you have a limit for the size of the first list, you can just use a large array. Suppose you know the first list only has numbers 0-99; then you can use int[100]. Use the first number as an array index.
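A minimal sketch of that direct-lookup idea, assuming the keys are known to lie in the range 0 to keyLimit - 1. The class name and the ABSENT sentinel are made up for this example; -1 works as a sentinel because the stored values are array indexes.

```java
import java.util.Arrays;

/** Direct lookup table: the key itself is the array index. */
class DirectLookupSketch {
    static final int ABSENT = -1;            // marks keys that have no mapping
    private final int[] indexByKey;

    DirectLookupSketch(int keyLimit) {
        indexByKey = new int[keyLimit];
        Arrays.fill(indexByKey, ABSENT);
    }

    void put(int key, int index) { indexByKey[key] = index; }

    int get(int key) { return indexByKey[key]; }

    public static void main(String[] args) {
        DirectLookupSketch lookup = new DirectLookupSketch(100);
        lookup.put(21, 42);
        System.out.println(lookup.get(21));  // 42
        System.out.println(lookup.get(22));  // -1
    }
}
```

A lookup is a single array read with no hashing and no boxing, at the price of memory proportional to the largest possible key.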

Your requirements can be satisfied by the Map interface. As an example, see HashMap<K,V>.
See Map and HashMap

Related

String Array as Key of HashMap

I need to solve two problems for our project. (1) I have to find a way to keep an array (String[] or int[]) as a key of a Map. The requirement is that if the contents of two arrays are equal regardless of order (String[] a={"A","B"}, String[] b={"B","A"}), then they should be considered equal/same keys, i.e., if I use a or b as the key of the Map, then a.equals(b) should be true.
I found that Java Sets add up the hash codes of all the objects stored in them. Adding the hash codes makes it possible to compare two HashSets to see whether they are equal, which means that this mechanism compares two Java Sets based on their contents.
So for the above problem I can use Sets as the key of the Map, but I want to use arrays as the key. Any ideas for this?
(2) Next, we are interested in an efficient partial key matching mechanism: for instance, to see whether any key in the Map contains a portion of the array, something like Key.contains(new String[]{"A"}).
Please share your ideas or any alternative way of doing this. I am concerned with space- and time-optimal implementations, as this will be used in data stream processing projects, so space and time really are an issue.
Q1 - You can't use bare arrays as HashMap keys if you want key equality based on the array elements. Arrays inherit equals(Object) and hashCode() implementations from java.lang.Object, and they are based on object identity, not array contents.
The best alternative I can think of is to wrap the arrays as (immutable) lists.
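A small sketch of the wrapping idea, contrasted with a bare array key. Sorting a copy before wrapping is an assumption added here so that {"A","B"} and {"B","A"} produce the same key, as the question asks; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ArrayKeySketch {
    // Order-insensitive key: sort a copy of the array, then wrap it as an unmodifiable list.
    static List<String> keyOf(String[] array) {
        String[] copy = array.clone();
        Arrays.sort(copy);
        return Collections.unmodifiableList(Arrays.asList(copy));
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> byContent = new HashMap<>();
        byContent.put(keyOf(new String[] {"A", "B"}), 1);
        // List.equals and List.hashCode are content-based, so this lookup succeeds.
        System.out.println(byContent.get(keyOf(new String[] {"B", "A"})));   // 1

        Map<String[], Integer> byIdentity = new HashMap<>();
        byIdentity.put(new String[] {"A", "B"}, 1);
        // Arrays use identity-based equals/hashCode, so an equal-looking array is not found.
        System.out.println(byIdentity.get(new String[] {"A", "B"}));         // null
    }
}
```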
Q2 - I don't think there is a simple efficient way to do this. The best I can think of are:
Extract all possible subarrays of each array and make each one an alternative key in the hash table. The problem is that the keys will take O(N M^2) space where M is the average (?) number of strings in the primary key String[]'s . Lookup will still be O(1).
Build an inverted index that gives the location of each string in all of the keys, then do a "phrase search" for a sequence of strings in the key space. That should scale better in terms of space usage, but lookup is a lot more expensive. And it is complicated.
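The inverted-index option can be sketched in a simplified form. The version below ignores the position of each string inside a key, so it answers "which keys contain all of these strings" rather than doing a true phrase search. The class name, the method names and the use of integer key ids are assumptions for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class InvertedIndexSketch {
    // token -> ids of the keys that contain that token
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    void addKey(int keyId, String[] key) {
        for (String token : key) {
            postings.computeIfAbsent(token, t -> new HashSet<>()).add(keyId);
        }
    }

    // Ids of all keys that contain every token of the partial key (order ignored).
    // An empty partial key yields an empty result in this sketch.
    Set<Integer> keysContaining(String[] partial) {
        Set<Integer> result = null;
        for (String token : partial) {
            Set<Integer> ids = postings.getOrDefault(token, Collections.emptySet());
            if (result == null) {
                result = new HashSet<>(ids);
            } else {
                result.retainAll(ids);       // intersect with the ids seen so far
            }
        }
        return result == null ? Collections.emptySet() : result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addKey(0, new String[] {"A", "B"});
        index.addKey(1, new String[] {"B", "C"});
        System.out.println(index.keysContaining(new String[] {"A"}));   // [0]
    }
}
```

Space grows with the total number of strings across all keys rather than with the number of subarrays, which is the trade-off described above.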
I tried to use Java 8 lambda expressions to solve your problem.
For Problem 1:
String[] arr1 = {"A","B","A","C","D"};
List<String> list1 = new ArrayList<String>(new LinkedHashSet<>(Arrays.asList(arr1)));
list1.stream().forEach(x -> System.out.println(x));
If you would like to check whether they are equal, I suggest sorting them first and then comparing.
Of course, it's much better to use a Set and its hash code to do the comparison.
For Problem 2 (some variables from above are re-used):
String[] arr2 = {"A"};
List<String> list2 = new ArrayList<String>(Arrays.asList(arr2)); //Assume List2 element is also unique
long numOfKeyContain = list1.stream().filter(list2::contains).count();
System.out.println(numOfKeyContain); // number of elements of list1 that also appear in list2

Why `floorEntry` and other methods are not accessible in PatriciaTrie?

While implementing an ip-lookup structure, I was trying to maintain a set of keys in a trie-like structure that allows me to search the "floor" of a key (that is, the largest key that is less or equal to a given key). I decided to use Apache Collections 4 PatriciaTrie but sadly, I found that the floorEntry and related methods are not public. My current "dirty" solution is forcing them with reflection (in Scala):
val pt = new PatriciaTrie[String]()
val method = pt.getClass.getSuperclass.getDeclaredMethod("floorEntry", classOf[Object])
method.setAccessible(true)
// and then for retrieving the entry for floor(key)
val entry = method.invoke(pt, key).asInstanceOf[Entry[String, String]]
Is there any clean way to have the same functionality? Why are these methods not publicly available?
Why those methods are not public, I don't know. (Maybe it's because you can achieve what you want with common Map API).
Here's a way to fulfil your requirement:
PatriciaTrie<String> trie = new PatriciaTrie<>();
trie.put("a", "a");
trie.put("b", "b");
trie.put("d", "d");
String floorKey = trie.headMap("d").lastKey(); // d
According to the docs, this is very efficient, since it depends on the number of bits of the largest key of the trie.
EDIT: As per the comment below, the code above has a bounds issue: headMap() returns a view of the map whose keys are strictly lower than the given key. This means that, e.g. for the above example, trie.headMap("b").lastKey() will return "a", instead of "b" (as needed).
In order to fix this bounds issue, you can use the following trick:
String cFloorKey = trie.headMap("c" + "\uefff").lastKey(); // b
String dFloorKey = trie.headMap("d" + "\uefff").lastKey(); // d
Now everything works as expected, since \uefff sorts after virtually any character that appears in real keys. (It is not literally the highest char value, which is \uffff, so keys that themselves contain characters above \uefff would break the trick.) Searching for key + "\uefff" will return key if it belongs to the trie, or the element immediately prior to key if key is not present in the trie.
Now, this trick works for String keys, but the idea extends to sorted maps with other key types as well, e.g. for Integer keys you could search for key + 1, for Date keys you could add 1 millisecond, etc.
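The Integer variant mentioned above can be sketched with java.util.TreeMap standing in for any SortedMap (PatriciaTrie itself is not used here, so that the example needs nothing beyond the JDK). The helper name is hypothetical, and the result is compared against TreeMap's own floorKey.

```java
import java.util.SortedMap;
import java.util.TreeMap;

class FloorViaHeadMapSketch {
    // headMap(toKey) is exclusive, so asking for key + 1 makes key itself eligible.
    // Assumes key < Integer.MAX_VALUE; returns null when no key <= the given key exists.
    static Integer floorViaHeadMap(SortedMap<Integer, String> map, int key) {
        SortedMap<Integer, String> head = map.headMap(key + 1);
        return head.isEmpty() ? null : head.lastKey();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(1, "a");
        map.put(2, "b");
        map.put(4, "d");
        System.out.println(floorViaHeadMap(map, 3) + " " + map.floorKey(3));   // 2 2
        System.out.println(floorViaHeadMap(map, 4) + " " + map.floorKey(4));   // 4 4
    }
}
```

Note the emptiness check: calling lastKey() on an empty head view throws NoSuchElementException, which also applies to the String trick.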

HashMap should be unsorted but still sorts according to key

According to these:
http://docs.oracle.com/javase/6/docs/api/java/util/HashMap.html
Difference between HashMap, LinkedHashMap and TreeMap
java beginner : How key gets sorted in hashmaps?
The HashMap in Java should be unsorted but it is being sorted with respect to Key.
I experienced this as a problem because I needed inserted-order data. So, I used LinkedHashMap instead. But still I am confused why the HashMap sorted it.
Can anyone explain it?
I did a simple example to view the sort.
public static void main(String[] args) {
HashMap<Integer, String> newHashMap = new HashMap<Integer, String>();
newHashMap.put(2, "First");
newHashMap.put(0, "Second");
newHashMap.put(3, "Third");
newHashMap.put(1, "Fourth");
Iterator<Entry<Integer, String>> iterator = newHashMap.entrySet()
.iterator();
while (iterator.hasNext()) {
Map.Entry<Integer, String> entry = iterator.next();
System.out.println("Key: " + entry.getKey());
System.out.println("Value: " + entry.getValue());
iterator.remove();
}
}
Result:
Key: 0
Value: Second
Key: 1
Value: Fourth
Key: 2
Value: First
Key: 3
Value: Third
Edit:
I tried to insert 50 random numbers using Random of Java and I found some data unsorted. But, it still manages to sort most of the integers.
Random results:
...
Key: 36
Value: random
Key: 43
Value: random
Key: 47
Value: random
Key: 44
Value: random
Key: 45
Value: random
...
It's a coincidence (not really, rather it has to do with the hashing algorithm).
Try adding
newHashMap.put(-5, "Fifth");
as last.
Output will be
Key: 0
Value: Second
Key: 1
Value: Fourth
Key: 2
Value: First
Key: 3
Value: Third
Key: -5
Value: Fifth
The javadoc specifically says
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
You should not infer too much! Just because three or four numbers appear sorted, it doesn't mean they have been sorted.
The hash code of a positive int is usually just that int, so if all your keys are lower than the length of the internal array the Map maintains, they may appear sorted.
Try with really big values, and you'll see that the apparent ordering vanishes. For example, use
100,200,300,100001, 100002, 10003, 999123456, 888777666, ....
You can't assume that it will be sorted. The reason why it appears sorted, in this simple example: A HashMap is internally constructed from "Bins". These Bins contain the actual elements. They are basically small lists that reside in an array.
[0] -> [ Bin0: ... ]
[1] -> [ Bin1: ... ]
[2] -> [ Bin2: ... ]
[3] -> [ Bin3: ... ]
When an item is inserted into the HashMap, then the "Bin" in which it should be inserted is - to simplify it a little - found by using the hashCode() of the object as an array index. For example, if the hashCode is 2, it will be inserted into Bin 2. When this "index" is greater than the array size, it will be placed into Bin (index%arraySize) - that is, if the hashCode is 5, it will be inserted into Bin 1.
And since a HashMap initially has an internal array size of 16, inserting Integer objects between 0 and 15 will coincidentally place the elements in the right order into the array. (The hashCode of an Integer is just its value, of course).
(Note: The actual algorithms and hash functions may be slightly more complicated, but that's the basic idea)
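As a rough illustration of the bin selection described above, the following sketch computes a bin index the way recent OpenJDK versions of HashMap do: XOR the hash code with its upper 16 bits, then mask with capacity - 1. The class and method names are made up for this example, and the exact hash function is an implementation detail that has changed between Java versions.

```java
class BucketIndexSketch {
    // Mixes the upper 16 bits into the lower ones (as in OpenJDK 8+ HashMap).
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // capacity must be a power of two, so & (capacity - 1) acts like a modulo.
    static int bucketIndex(int key, int capacity) {
        return spread(Integer.hashCode(key)) & (capacity - 1);
    }

    public static void main(String[] args) {
        for (int key : new int[] {2, 0, 3, 1, -5, 17}) {
            System.out.println(key + " -> bin " + bucketIndex(key, 16));
        }
        // 2 -> bin 2, 0 -> bin 0, 3 -> bin 3, 1 -> bin 1, -5 -> bin 4, 17 -> bin 1
    }
}
```

Small non-negative keys map to the bin with their own number, which is why iteration looks sorted, while -5 lands in bin 4 and 17 shares bin 1 with the key 1.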
It's pure coincidence. Sometimes it appears to be sorted, but keep adding keys and the dream will shatter.
I wrote this little program:
import java.util.Map;
import java.util.HashMap;
class MapTest {
public static void main(String[] args){
int count = Integer.parseInt(args[0]);
Map<Integer, Integer> map = new HashMap<Integer, Integer>();
for (int i = 0; i < count; i++) map.put(i, i);
System.out.println(map);
}
}
When running java MapTest 20, I get the following output (line-wrapped for readability):
{0=0, 1=1, 2=2, 3=3, 4=4, 5=5, 6=6, 7=7, 8=8, 9=9, 10=10, 11=11, 12=12, 13=13,
14=14, 15=15, 17=17, 16=16, 19=19, 18=18}
It's simply a property of the implementation of HashMap that Integers added sequentially (and starting at 0) at first seem to be ordered.
As everyone is saying (and they are right), you should assume that the keys in a HashMap are not sorted.
Now they LOOK sorted in your case for two simple reasons:
1 - You are using Integer as a key: the HashMap uses the key's hashCode() method to find the index in the underlying array it uses to store the Entry instances (which hold your keys and values). It just so happens that the hash code of an Integer is its own value.
2 - You are not setting the initial size of the HashMap and are thus using its default initial capacity (which is 16). So until you add a key below 0 or above 15, you will see the keys stored in order, since the HashMap gets the index by doing (in simplified form)
int index = newKey.hashCode() % this.capacity;
Later on, HashMap might increase the capacity of its underlying array if you insert a lot of key-value pairs (how and when it decides to do that is very interesting if you are into algorithms and data structures), so you might end up in a situation in which your Integer keys look sorted again, but they are not intentionally sorted.
Btw if your keys are going to be Integers and you can estimate the maximum key value you are going to have I'd suggest to directly use an array. Access is faster and the memory used will be the same or slightly less.
I would recommend you to use a LinkedHashMap.
In the LinkedHashMap the elements can be accessed in their insertion order.
You can't make assumptions about the orderings on HashMap objects. They will order as they please, implementation-defined. You should treat them as unordered data structures.
Actually, the order cannot be guaranteed.
HashMap uses the hash code to hash the data so that lookups are fast.
Your keys are so simple that they happen to come out sorted.
Mine is an educated guess, but the reason is likely to be the fact that the default hashCode method uses the memory location. The memory locations of small Integers (and your keys are autoboxed into Integer) are most likely fixed: it would be nonsense to have Integer.valueOf(1) return a different memory location on multiple calls. Finally, most likely these fixed memory locations are in ascending order. This would explain this coincidence, but well, one would need to dig into the implementation of Integer and HashMap to prove this.
Correction: in case of Integer "A hash code value for this object, equal to the primitive int value represented by this Integer object." (JavaDoc). Which, although a different number, confirms the idea.
Since no answer really utilized looking at the Java source, I will do so! :)
When you call the put() function, the internal hash function uses the hashCode of the object, to generate the hashing index. [put() source]
The hash() function simply ensures that hashCodes that differ only by constant multiples at each bit position have a bounded number of collisions [use Google to see why that is the case].
Things just coincidentally worked here. That is about it.
Questioner said:
"The HashMap in Java should be unsorted but it is being sorted with respect to Key."
Yes, correct. I will show you with the sample below:
Map<String, String> map1 = new HashMap<String, String>();
map1.put("c","xxxx");
map1.put("b","yyyy");
map1.put("a","zzzz");
for (String key :map1.keySet())
System.out.println(map1.get(key));
System.out.println();
Map<Integer,String> map2 = new HashMap<Integer,String>();
map2.put(3,"xxxx");
map2.put(2,"yyyy");
map2.put(1,"zzzz");
for (int key :map2.keySet())
System.out.println(map2.get(key));
The output shows HashMap using the key to order the data:
xxxx
yyyy
zzzz
zzzz
yyyy
xxxx

java why does an enhanced for loop iterate over null results in a HashMap with more than 16 or more items?

The code I used originally, which returned a NullPointerException in some places for any HashMap containing 16 or more elements:
for(Entry<Integer, String> e : myHashMap.entrySet()){
System.out.println(e.getKey() + ": "+e.getValue());
}
The code I am now using, which works on the same HashMap, regardless of size:
int i = 0; //variable to show the index
int c = 0; //variable to count the number items found
while(c < myHashMap.size()){
if(myHashMap.containsKey(i)){ //if the HashMap contains the key i
System.out.println(i + ": "+myHashMap.get(i)); //Print found item
c++; //increment up to count the number of objects found
}
i++; //increment to iterate to the next key
}
What is the difference between the two? Why does the first one iterate over null values? And, more importantly, why does the first one iterate out of order if there are 16 or more items? (ie: 12,13,17,15,16,19,18 instead of the neat 12,13,14,15,16,17,18,19 in the second)
I think I am just starting to scratch the surface of java, so I would like to understand why it was designed this way. Any book recommendations on this kind of thing are welcome.
You should read the documentation of a class and try to understand its purpose before starting to use it. HashMap provides efficient storage but no guaranteed order. It's just a coincidence that you didn't discover this with smaller HashMap sizes, because the default capacity is 16 and the hash codes of contiguous Integer objects are contiguous too. But that is not a property you can rely on. You always have to assume no guaranteed order for a HashMap.
If you need the insertion order you can use a LinkedHashMap, if you need ascending order of the keys you can use a TreeMap. If you have a contiguous range of Integer keys and want ascending order you can simply use an array as well.
The foreach loop for(Entry<Integer, String> e : myHashMap.entrySet()) does not “iterate over null values”. It iterates over the values contained in the HashMap which are the values you have added before. There can be at most one null key contained in the map, if you added it. You might see null values in the debugger when looking at the internal array of a HashMap which are unused slots as a HashMap has a capacity which can be larger than its size.
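The difference between the map types named above can be seen in a short sketch. The class and method names are made up for this example; the keys and values are the ones used in the earlier HashMap question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderSketch {
    // Inserts the same four entries and reports the order in which the map iterates its keys.
    static List<Integer> keysInIterationOrder(Map<Integer, String> map) {
        map.put(2, "First");
        map.put(0, "Second");
        map.put(3, "Third");
        map.put(1, "Fourth");
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order.
        System.out.println(keysInIterationOrder(new LinkedHashMap<>()));   // [2, 0, 3, 1]
        // TreeMap keeps ascending key order.
        System.out.println(keysInIterationOrder(new TreeMap<>()));         // [0, 1, 2, 3]
    }
}
```

Both orders are guaranteed by the respective class contracts, unlike whatever order a plain HashMap happens to produce.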

Which collection is suitable for tel number + name pair in Java?

I am trying to add to a collection the following pairs
698xxxxxxx - personA
698xxxxxxx - personB
699xxxxxxx - personA
699xxxxxxx - personB
I go through a lot of files and try to add the pairs I find there to a collection. I want to be able to have a table that shows each number and which people it was correlated with, without having duplicate PAIRS. For example:
1-personA ok
1-personB ok
2-personA ok
3-personB ok
3-personB NOT OK as its already there
I tried using a Multimap, but I'm not sure if it's the right choice. Whatever the solution is, please also show me how to iterate through its values so I can use the pairs. Sorry for the demanding post, but I'm new to Java and I find it a little hard to understand the APIs.
Thanks in advance
There are three obvious alternatives, depending on what you require.
If there can only be one person for each phone number, then a simple Map<PhoneNo, Name>.
If a given phone number can be associated with multiple people, then either a Map<Phone,Set<Name>> or a multi-map class.
If you also want to find out the phone number or numbers for each person, you need two maps or two multi-maps ... or a bidirectional map.
There is a secondary choice you need to make: hash-table versus tree-based organizations. A hash table will give you O(1) lookup/insert/remove (assuming that the hash function is good). A tree-based implementation gives O(logN) operations ... but it also allows you to iterate over the entries (or values) in key order.
While the standard Java class libraries don't provide multi-maps or bidirectional maps, they can easily be implemented by combining the simple collection classes.
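A minimal sketch of the Map<Phone,Set<Name>> option built from standard collections, including the iteration the question asks for. Plain Strings stand in for the Phone and Name types, the tree-based variants are chosen so that output comes out in key order, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class PhoneBookSketch {
    // number -> people seen with that number; the Set rejects duplicate pairs
    private final Map<String, Set<String>> peopleByNumber = new TreeMap<>();

    // Returns false if this exact (number, person) pair was already present.
    boolean add(String number, String person) {
        return peopleByNumber.computeIfAbsent(number, n -> new TreeSet<>()).add(person);
    }

    // Flattens the map into "number - person" lines by iterating entries and their sets.
    List<String> pairs() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : peopleByNumber.entrySet()) {
            for (String person : entry.getValue()) {
                result.add(entry.getKey() + " - " + person);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        PhoneBookSketch book = new PhoneBookSketch();
        book.add("1", "personA");
        book.add("1", "personB");
        book.add("3", "personB");
        System.out.println(book.add("3", "personB"));   // false, pair already there
        System.out.println(book.pairs());               // [1 - personA, 1 - personB, 3 - personB]
    }
}
```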
You can choose Map Interface in Java, which accepts key and value pairs.
You can have this as a reference: http://www.tutorialspoint.com/java/java_map_interface.htm
You may need a HashMap with the name of the person as the key and a HashSet of numbers as the value. HashSet does not allow duplicates, so duplicate numbers will not be stored in it. Here is the code:
HashMap<String, HashSet<String>> records;
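In Java there are a couple of options. If you don't know about the cardinality of persons or numbers, then go for:
public class Pair {
final String number;
final String person;
Pair(String number, String person) { this.number = number; this.person = person; }
// equals and hashCode are needed so that a HashSet can recognise duplicate pairs
@Override public boolean equals(Object o) { return o instanceof Pair && number.equals(((Pair) o).number) && person.equals(((Pair) o).person); }
@Override public int hashCode() { return java.util.Objects.hash(number, person); }
}
Then use a Set to be safe from duplicates, like:
Set<Pair> pairs = new HashSet<>();
....
pairs.add(new Pair("689xxxx", "personA"));
for (Pair pair : pairs) {
System.out.println(pair.number + " - " + pair.person);
}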
Hajo
