String Array as Key of HashMap - java

I need to solve two problems for our project. (1) I need a way to use an array (String[] or int[]) as a key of a Map. The requirement is that two arrays with the same contents (String[] a={"A","B"}, String[] b={"B","A"}) should be treated as the same key, i.e., whether I use a or b as the key of the Map, a.equals(b) should be true.
I found that Java Sets compute their hash code by adding up the hash codes of all the objects stored in them. This makes it possible to compare two HashSets based on their contents rather than their identity.
So for the above problem I could use Sets as keys of the Map, but I want to use arrays as keys. Any ideas?
(2) Next, we are interested in an efficient partial key matching mechanism: for instance, checking whether any key in the Map contains a portion of an array, something like key.contains(new String[]{"A"}).
Please share your ideas or any alternative ways of doing this. I am concerned with space- and time-optimal implementations, as this will be used in data stream processing projects where space and time are really an issue.

Q1 - You can't use bare arrays as HashMap keys if you want key equality based on the array elements. Arrays inherit equals(Object) and hashCode() implementations from java.lang.Object, and they are based on object identity, not array contents.
The best alternative I can think of is to wrap the arrays as (immutable) lists.
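A minimal sketch of the list-wrapping idea, assuming element order should not matter (as the question requires). The class `ArrayKeys` and the helper `keyOf` are hypothetical names introduced for illustration: the helper sorts a copy of the array and wraps it in an unmodifiable list, so the key's equals and hashCode depend only on the contents.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ArrayKeys {
    // Hypothetical helper: builds an order-insensitive, content-based key.
    static List<String> keyOf(String... elements) {
        String[] copy = elements.clone();   // never mutate the caller's array
        Arrays.sort(copy);                  // canonical order, so {"A","B"} and {"B","A"} match
        return Collections.unmodifiableList(Arrays.asList(copy));
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> counts = new HashMap<>();
        counts.put(keyOf("A", "B"), 1);
        System.out.println(counts.get(keyOf("B", "A"))); // prints 1
    }
}
```

Note that duplicates are preserved here, so {"A","A"} and {"A"} remain different keys; if they should be equal, a Set-based key would be the closer fit.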
Q2 - I don't think there is a simple efficient way to do this. The best I can think of are:
Extract all possible subarrays of each array and make each one an alternative key in the hash table. The problem is that the keys will take O(N M^2) space, where M is the average (?) number of strings in the primary String[] keys. Lookup will still be O(1).
Build an inverted index that gives the location of each string in all of the keys, then do a "phrase search" for a sequence of strings in the key space. That should scale better in terms of space usage, but lookup is a lot more expensive. And it is complicated.
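A simplified sketch of the inverted-index option. It is an assumption-laden reduction of the idea above: it records only which keys contain each string (not the position inside the key), so it answers "which keys contain all of these strings, in any order" rather than a true phrase search. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class InvertedKeyIndex {
    private final List<String[]> keys = new ArrayList<>();
    // string -> ids of the keys that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Registers a key and returns its id (its position in the key list).
    int add(String... key) {
        int id = keys.size();
        keys.add(key.clone());
        for (String s : key) {
            postings.computeIfAbsent(s, k -> new HashSet<>()).add(id);
        }
        return id;
    }

    // Ids of all keys that contain every string in 'part', in any order.
    // An empty 'part' yields an empty result in this sketch.
    Set<Integer> keysContaining(String... part) {
        Set<Integer> result = null;
        for (String s : part) {
            Set<Integer> ids = postings.getOrDefault(s, Collections.<Integer>emptySet());
            if (result == null) {
                result = new HashSet<>(ids);   // start from the first posting list
            } else {
                result.retainAll(ids);         // intersect with each further one
            }
        }
        return result == null ? Collections.<Integer>emptySet() : result;
    }
}
```

Space grows with the total number of strings across all keys rather than with the number of subarrays; the price is that each lookup costs a set intersection instead of a single hash probe.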

I tried using Java 8 lambda expressions to solve your problem.
For Problem 1:
String[] arr1 = {"A","B","A","C","D"};
List<String> list1 = new ArrayList<String>(new LinkedHashSet<>(Arrays.asList(arr1)));
list1.stream().forEach(x -> System.out.println(x));
If you would like to check whether they are equal, I suggest sorting them first and then comparing.
Of course, it is much better to use a Set and its hashCode to do the comparison.
For Problem 2 (some variables from above are reused):
String[] arr2 = {"A"};
List<String> list2 = new ArrayList<String>(Arrays.asList(arr2)); // assume the elements of list2 are also unique
// count the elements of list1 that also appear in list2
long numOfKeyContain = list1.stream().filter(list2::contains).count();
System.out.println(numOfKeyContain); // number of elements of list1 that are also in list2

Related

How to calculate domino-chain for integer pairs?

The problem I'm facing is more like of algorithmic nature.
Let's say that I have a list of pair objects containing integers. Is there a way to sort the list so that the second part of the pair is equal to first part of the next pair?
For instance given this list of pairs:
A = Pair(2,1),Pair(2,3),Pair(1,3).
After sorting the list becomes:
A = Pair(1,3), Pair(3,2),Pair(2,1).
As you can see it is allowed to change the order of values inside the pair like the Pair(2,3) which became Pair(3,2).
I thought about using the Comparator or Comparable interfaces, but they don't cover complex cases like the one above.
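One possible way to approach this, offered only as a sketch: a plain backtracking search that tries every unused pair in both orientations. The class `DominoChain` and its methods are hypothetical, pairs are modelled as two-element int arrays, and the search is exponential in the worst case, so it suits small inputs only. The problem is equivalent to finding an Eulerian trail in a multigraph whose edges are the pairs, which is the route to take for large inputs.

```java
import java.util.ArrayList;
import java.util.List;

class DominoChain {
    // Returns a chain that uses every pair exactly once, or null if none exists.
    // Each int[] is {first, second}; a pair may be flipped in the result.
    static List<int[]> chain(List<int[]> pairs) {
        List<int[]> result = new ArrayList<>();
        boolean[] used = new boolean[pairs.size()];
        return extend(pairs, used, result) ? result : null;
    }

    private static boolean extend(List<int[]> pairs, boolean[] used, List<int[]> result) {
        if (result.size() == pairs.size()) {
            return true;                                  // every pair has been placed
        }
        for (int i = 0; i < pairs.size(); i++) {
            if (used[i]) continue;
            int a = pairs.get(i)[0];
            int b = pairs.get(i)[1];
            int[][] orientations = {{a, b}, {b, a}};      // as given, and flipped
            for (int[] o : orientations) {
                boolean fits = result.isEmpty()
                        || result.get(result.size() - 1)[1] == o[0];
                if (!fits) continue;
                used[i] = true;
                result.add(o);
                if (extend(pairs, used, result)) return true;
                result.remove(result.size() - 1);         // undo and try another option
                used[i] = false;
            }
        }
        return false;
    }
}
```

For the list in the question, any result returned satisfies the rule that each pair's second value equals the next pair's first value; the exact chain found depends on the search order.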

Number and string sorting Al

I want to sort some number+string combination but the sorting will be based on the number from that combination. Can you suggest an optimal solution?
Say my strings are:
12 Masdf
4 Oasd
44 Twer
and so on. The sorting will be based on the numbers like 12, 4, 44 and after the sorting I have to show the full alphanumeric strings.
As the program will run on thousands of records, I don't want to split the string and compare the number on each iteration. My plan is to extract the numbers into an array and then sort the array. After sorting, I want to put the numbers back together with their associated strings and keep those in a string array for display.
It should be done in C++. Algorithms should be applied - Insertion sort, Quick sort, Merge sort, etc.
Create a class to store the full string and the number. Make the class Comparable. Convert your list of strings to a list of that class. Sort the list using whichever sort method is relevant. Iterate over the list and print the string fields.
Sorry, that was an answer for Java, since you tagged it Java. Replace/remove Comparable for whatever is good for C++.
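The steps above could look roughly like this in Java. It is a sketch that assumes every input has the form "<integer><space><rest>", as in the question's samples; `NumberedLine` and `sortByNumber` are hypothetical names. The number is parsed once per string, at construction, so comparisons during the sort touch only ints.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NumberedLine implements Comparable<NumberedLine> {
    final int number;   // parsed once, at construction
    final String text;  // the full original string

    NumberedLine(String text) {
        this.text = text;
        // Assumption: the number is everything before the first space.
        this.number = Integer.parseInt(text.substring(0, text.indexOf(' ')));
    }

    @Override
    public int compareTo(NumberedLine other) {
        return Integer.compare(number, other.number);
    }

    // Returns the original strings ordered by their leading number.
    static List<String> sortByNumber(List<String> lines) {
        List<NumberedLine> parsed = new ArrayList<>();
        for (String line : lines) {
            parsed.add(new NumberedLine(line));
        }
        Collections.sort(parsed);
        List<String> sorted = new ArrayList<>();
        for (NumberedLine n : parsed) {
            sorted.add(n.text);
        }
        return sorted;
    }
}
```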
I am going to assume these two parts are in separate variables and are not together as one string (if they were you could just store them in a list).
First consider a Map. Each 'bucket' of the map can be represented by a number. Within each of the map's buckets is a list of strings. (Note that this could also be solved with an array, especially if the integer part is always under some fixed value.) The Java equivalent would look like:
Map<Integer, List<String>> map = new HashMap<>();
To sort with this custom collection, first look up the integer part of the value in the map, which returns a list. Every item in that list has the same starting number. Then search the list for the string part of the value (I am assuming the list is kept sorted, so you can use whatever sort you want, e.g. selection sort or quicksort).
The advantage of this approach is that if the number is not found in the HashMap, you instantly know there is no string part for it.

Mapping int to int (in Java)

In Java.
How can I map a set of numbers(integers for example) to another set of numbers?
All the numbers are positive and all the numbers are unique in their own set.
The first set of numbers can have any value, the second set of numbers represent indexes of an array, and so the goal is to be able to access the numbers in the second set through the numbers in the first set. This is a one to one association.
Speed is crucial as the method will have to be called many times each second.
Edit: I tried it with SE hashmap implementation, but found it to be slow for my purposes.
There's an article, devoted to this problem (with a solution): Implementing a world fastest Java int-to-int hash map
Code can be found in related GitHub repository. (Best results are in class IntIntMap4a.java )
Citation from the article:
Summary
If you want to optimize your hash map for speed, you have to do as much as you can of the following list:
Use underlying array(s) with capacity equal to a power of 2 - it will allow you to use cheap & instead of expensive % for array index
Do not store the state in the separate array - use dedicated fields for free/removed keys and values.
Interleave keys and values in the one array - it will allow you to load a value into memory for free.
Implement a strategy to get rid of 'removed' cells - you can sacrifice some of remove performance in favor of more frequent get/put.
Scramble the keys while calculating the initial cell index - this is required to deal with the case of consecutive keys.
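To make some of these points concrete, here is a small sketch written for illustration; it is not the article's IntIntMap4a code. It shows a power-of-two capacity with `&` indexing, keys and values interleaved in one array, a dedicated field for the key that doubles as the "free" marker, and key scrambling. Removal is left out. The class name, the fixed initial capacity of 16, and the NO_VALUE sentinel of -1 (chosen because the question says all numbers are positive) are assumptions.

```java
class IntIntMapSketch {
    static final int NO_VALUE = -1;          // assumption: stored values are never -1
    private static final int FREE_KEY = 0;   // a cell whose key is 0 is empty

    private int[] data = new int[16 * 2];    // interleaved: [key0, val0, key1, val1, ...]
    private int mask = 16 - 1;               // capacity - 1; capacity is a power of two
    private int size;                        // number of occupied cells
    private boolean hasFreeKey;              // key 0 lives in dedicated fields
    private int freeValue;

    // Scrambles the key so that consecutive keys do not land in consecutive cells.
    private static int scramble(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    int get(int key) {
        if (key == FREE_KEY) {
            return hasFreeKey ? freeValue : NO_VALUE;
        }
        int idx = scramble(key) & mask;       // cheap '&' instead of '%'
        while (true) {
            int k = data[idx * 2];
            if (k == FREE_KEY) return NO_VALUE;
            if (k == key) return data[idx * 2 + 1];
            idx = (idx + 1) & mask;           // linear probing
        }
    }

    void put(int key, int value) {
        if (key == FREE_KEY) {
            hasFreeKey = true;
            freeValue = value;
            return;
        }
        int idx = scramble(key) & mask;
        while (true) {
            int k = data[idx * 2];
            if (k == FREE_KEY) {              // empty cell: insert here
                data[idx * 2] = key;
                data[idx * 2 + 1] = value;
                if (++size * 2 > mask + 1) resize();   // keep the load factor at or below 0.5
                return;
            }
            if (k == key) {                   // existing key: overwrite
                data[idx * 2 + 1] = value;
                return;
            }
            idx = (idx + 1) & mask;
        }
    }

    // Doubles the capacity and reinserts every occupied cell.
    private void resize() {
        int[] old = data;
        int newCapacity = (mask + 1) * 2;
        data = new int[newCapacity * 2];
        mask = newCapacity - 1;
        size = 0;
        for (int i = 0; i < old.length; i += 2) {
            if (old[i] != FREE_KEY) put(old[i], old[i + 1]);
        }
    }
}
```

Because at least half of the cells are always empty, every probe sequence is guaranteed to reach an empty cell, so `get` and `put` terminate.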
Yes, I know how to use citation formatting. But it looks awful and doesn't handle bullet lists well.
The structure you are looking for is called an associative array. In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears just once in the collection.
In Java in particular, as already mentioned, this is easily done with a HashMap.
HashMap<Integer, Integer> cache = new HashMap<Integer, Integer>();
You can insert elements with the method put
cache.put(21, 42);
and you can retrieve a value with get
Integer key = 21;
Integer value = cache.get(key);
System.out.println("Key: " + key +" value: "+ value);
Key: 21 value: 42
If you want to iterate through the data, you need to obtain an iterator:
Iterator<Integer> it = cache.keySet().iterator();
while (it.hasNext()) {
    Integer k = it.next();
    System.out.println("key: " + k + " value: " + cache.get(k));
}
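As an alternative sketch, iterating over entrySet() with a for-each loop yields each key together with its value, which avoids a second lookup per key. The class `EntryIteration` and the method `describe` are hypothetical names used only to make the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class EntryIteration {
    // Builds one "key: K value: V" line per entry, in the map's iteration order.
    static List<String> describe(Map<Integer, Integer> cache) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : cache.entrySet()) {
            lines.add("key: " + e.getKey() + " value: " + e.getValue());
        }
        return lines;
    }
}
```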
Sounds like HashMap<Integer,Integer> is what you're looking for.
If you are willing to use an external library, you can use Apache's IntToIntMap, which is a part of Apache Lucene.
It implements a pretty efficient int-to-int map that works on primitives, so it does not suffer the boxing overhead.
If you have a limit on the size of the first list, you can just use a large array. Suppose you know the first list only has numbers 0-99; then you can use int[100] and use the first number as the array index.
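A minimal sketch of the array approach, assuming the keys are known to lie in 0..keyLimit-1 and that -1 can serve as the "absent" marker because valid targets are array indexes. `DirectIndexMap` and `ABSENT` are hypothetical names.

```java
import java.util.Arrays;

class DirectIndexMap {
    static final int ABSENT = -1;    // assumption: real values are never negative
    private final int[] table;

    DirectIndexMap(int keyLimit) {
        table = new int[keyLimit];
        Arrays.fill(table, ABSENT);  // mark every slot as unmapped
    }

    void put(int key, int value) {
        table[key] = value;          // the key itself is the array index
    }

    int get(int key) {
        return table[key];
    }
}
```

Lookups are a single array access with no hashing or boxing; the cost is memory proportional to the key range rather than to the number of mappings.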
Your requirements can be satisfied by the Map interface. As an example, see HashMap<K,V>.
See Map and HashMap

Efficient way to find the difference between two data sets

I have two copies of data, where 1 represents my volumes and 2 represents my issues. I have to compare COPY2 with COPY1 and find all the elements which are missing in COPY2 (COPY1 will always be a superset; COPY2 will be either equal to it or a subset of it).
Now, I have to get the missing volumes and issues in COPY2.
For the scenario in the following figure, I should get this result:
Missing files: 1-C, 1-D, 2-C, 2-C, 3-A, 3-B, 4-E.
Question-
What data structure should I use to store the above values (volume and issue) in java?
How should I implement this scenario in java in the most efficient manner to find the difference between these 2 copies?
I suggest a flat HashSet<VolumeIssue>. Each VolumeIssue instance corresponds to one categorized issue, such as 1-C.
In that case all you will need to find the difference is a call
copy1.removeAll(copy2);
What is left in copy1 are all the issues present in copy1 and missing from copy2.
Note that your VolumeIssue class must properly implement equals and hashCode for this to work.
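A sketch of what such a class might look like. The field names and types (an int volume and a String issue) are assumptions, as is the static helper `missing`, which copies copy1 first so that neither input set is modified.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class VolumeIssue {
    final int volume;     // assumed representation, e.g. 1
    final String issue;   // assumed representation, e.g. "C"

    VolumeIssue(int volume, String issue) {
        this.volume = volume;
        this.issue = Objects.requireNonNull(issue);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof VolumeIssue)) return false;
        VolumeIssue other = (VolumeIssue) o;
        return volume == other.volume && issue.equals(other.issue);
    }

    @Override
    public int hashCode() {
        return Objects.hash(volume, issue);   // consistent with equals
    }

    @Override
    public String toString() {
        return volume + "-" + issue;
    }

    // Elements present in copy1 but missing from copy2; the inputs are left untouched.
    static Set<VolumeIssue> missing(Set<VolumeIssue> copy1, Set<VolumeIssue> copy2) {
        Set<VolumeIssue> diff = new HashSet<>(copy1);
        diff.removeAll(copy2);
        return diff;
    }
}
```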
Since you've added the Guava tag, I'd go for a variation of Marco Topolnik's answer. Instead of removing one set from the other, use Sets.difference(left, right):
Returns an unmodifiable view of the difference of two sets. The returned set contains all elements that are contained by set1 and not contained by set2. set2 may also contain elements not present in set1; these are simply ignored. The iteration order of the returned set matches that of set1.
What data structure should I use to store the above values (volume and issue) in java?
You can use a HashMap of key/value pairs, where the key is the volume and the value is a List of issues.
How should I implement this scenario in java in the most efficient manner to find the difference between these 2 copies?
Get the value for the same key from both HashMaps, which gives you two Lists, then find the difference between those two lists.
Suppose list1 and list2 are the lists of values for the same key from the two maps. Then:
List<Issue> diff = new ArrayList<>(list1);
diff.removeAll(list2); // removeAll returns a boolean; the difference is what remains in diff

Adding elements into ArrayList at position larger than the current size

Currently I'm using an ArrayList to store a list of elements, and I need to insert new elements at specific positions. Sometimes I need to insert elements at a position larger than the current size. For example:
ArrayList<String> arr = new ArrayList<String>();
arr.add(3,"hi");
Now I already know there will be an IndexOutOfBoundsException. Is there another way or another object where I can do this while still keeping the order? This is because I have methods that find elements based on their index. For example:
ArrayList<String> arr = new ArrayList<String>();
arr.add("hi");
arr.add(0,"hello");
I would expect to find "hi" at index 1 instead of index 0 now.
So in summary, short of manually inserting null into the elements in-between, is there any way to satisfy these two requirements:
Insert elements into position larger than current size
Push existing elements to the right when I insert elements in the middle of the list
I've looked at "Java ArrayList add item outside current size", as well as HashMap, but HashMap doesn't satisfy my second criterion. Any help would be greatly appreciated.
P.S. Performance is not really an issue right now.
UPDATE: There have been some questions on why I have these particular requirements, it is because I'm working on operational transformation, where I'm inserting a set of operations into, say, my list (a math formula). Each operation contains a string. As I insert/delete strings into my list, I will dynamically update the unapplied operations (if necessary) through the tracking of each operation that has already been applied. My current solution now is to use a subclass of ArrayList and override some of the methods. I would certainly like to know if there is a more elegant way of doing so though.
Your requirements are contradictory:
... I will need to insert new elements at specific positions.
There is a need for me to enter elements at a position larger than the current size.
These imply that positions are stable; i.e. that an element at a given position remains at that position.
I would expect to find "hi" at index 1 instead of index 0 now.
This states that positions are not stable under some circumstances.
You really need to make up your mind which alternative you need.
If you must have stable positions, use a TreeMap or HashMap. (A TreeMap allows you to iterate the keys in order, but at the cost of more expensive insertion and lookup ... for a large collection.) If necessary, use a "position" key type that allows you to "always" generate a new key that goes between any existing pair of keys.
If you don't have to have stable positions, use an ArrayList, and deal with the case where you have to insert beyond the end position using append.
I fail to see how it is sensible for positions to be stable if you insert beyond the end, and allow instability if you insert in the middle. (Besides, the latter is going to make the former unstable eventually ...)
You can even use a TreeMap to maintain the order of the keys.
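A short sketch of the TreeMap idea with integer positions as keys. `SparsePositions` and `inPositionOrder` are hypothetical names. Positions stay stable here: putting a value at a position never shifts the other entries, so this covers inserting beyond the current size but not the "push existing elements to the right" requirement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class SparsePositions {
    // Returns the stored strings in ascending order of their position keys.
    static List<String> inPositionOrder(TreeMap<Integer, String> slots) {
        return new ArrayList<>(slots.values());
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> slots = new TreeMap<>();
        slots.put(3, "hi");       // no need to fill positions 0..2 first
        slots.put(0, "hello");
        System.out.println(inPositionOrder(slots)); // prints [hello, hi]
    }
}
```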
First and foremost, I would say use a Map instead of a List; I think your problem can be solved in a better way with a Map. But if you really want to do this with an ArrayList:
ArrayList<String> a = new ArrayList<String>(); // create an empty list
a.addAll(Arrays.asList(new String[100])); // pre-fill with n nulls; n is 100 here, but choose a value that suits your requirements
a.add(7,"hello");
a.add(2,"hi");
a.add(1,"hi2");
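A variation on the same idea, sketched under the assumption that padding with nulls is acceptable: instead of pre-filling a fixed number of slots, a hypothetical helper `insertAt` pads only as far as needed and otherwise falls back to the normal shifting insert.

```java
import java.util.ArrayList;
import java.util.List;

class PaddedInsert {
    // Inserts 'value' at 'index', padding with nulls when index is past the end.
    static <T> void insertAt(List<T> list, int index, T value) {
        while (list.size() < index) {
            list.add(null);           // fill the gap up to the requested position
        }
        list.add(index, value);       // shifts later elements right when index < size
    }

    public static void main(String[] args) {
        List<String> arr = new ArrayList<>();
        insertAt(arr, 3, "hi");
        insertAt(arr, 0, "hello");
        System.out.println(arr);      // prints [hello, null, null, null, hi]
    }
}
```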
Use the Vector class to solve this issue.
Vector<String> vector = new Vector<>();
vector.setSize(100);
vector.set(98, "a");
When setSize(100) is called, all 100 elements are initialized to null.
For those who are still dealing with this, you may do it like this.
Object[] array= new Object[10];
array[0]="1";
array[3]= "3";
array[2]="2";
array[7]="7";
List<Object> list= Arrays.asList(array);
The catch is that you need to know the total size up front. This should really be just a comment, but I do not have enough reputation to post one.
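One caveat worth illustrating: the list returned by Arrays.asList is a fixed-size view backed by the array, so calling add or remove on it throws UnsupportedOperationException. A sketch of a workaround, with the hypothetical helper `toResizableList`, is to copy it into an ArrayList:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResizableFromArray {
    // Copies the array into an independent, growable list.
    static List<Object> toResizableList(Object[] array) {
        return new ArrayList<>(Arrays.asList(array));
    }

    public static void main(String[] args) {
        Object[] array = new Object[10];
        array[3] = "3";
        List<Object> list = toResizableList(array);
        list.add(0, "first");               // allowed: the copy can grow
        System.out.println(list.size());    // prints 11
    }
}
```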
