Find indices of common elements in arraylists in java - java

I have several ArrayLists with no repeated elements. I want to find their intersection and return indices of common elements in each arraylist.
For example, if I have input as {0,1,2},{3,0,4},{5,6,0}, then I want to return {0},{1},{2} i.e. indices of common element 0 here.
One way I can think of is to use successive retainAll() calls on all the ArrayLists to get the intersection, and then find the indices of the intersection's elements using indexOf() on each input ArrayList.
Is there a better way to do that?

Sorting the list first would require at least O(nlogn) time. If you are looking for a more efficient algorithm you could get O(n) using hashmaps.
For example with
A=[0,1,2],B=[3,0,4],C=[5,6,0]
You can loop through each list and, for every element, append its index to a hash map entry keyed on the element. The final hash will look like
H = {0:[0,1,2], 1:[1], 2:[2], 3:[0], 4:[2], 5:[0], 6:[1]}
Here, the key is the element, and the value is the list of indices at which it appears, one per input list that contains it. Now, just loop through the hashmap to find any index lists that have a size of 3 (the number of input lists, in this case) to get the indices.
The code would look something like this (untested):
int[][] lists = {{0,1,2}, {3,0,4}, {5,6,0}};
// Create the hashmap
Map<Integer, List<Integer>> H = new HashMap<Integer, List<Integer>>();
for (int i = 0; i < lists.length; i++) {
    // use lists[i].length so lists of different sizes are handled too
    for (int j = 0; j < lists[i].length; j++) {
        // create the list if this is the first occurrence
        if (!H.containsKey(lists[i][j]))
            H.put(lists[i][j], new ArrayList<Integer>());
        // add the index to the list
        H.get(lists[i][j]).add(j);
    }
}
// Print out indexes for elements that are shared between all lists
for (Map.Entry<Integer, List<Integer>> e : H.entrySet()) {
    // check that the list of indexes matches the # of lists
    if (e.getValue().size() == lists.length) {
        System.out.println(e.getKey() + ":" + e.getValue());
    }
}
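EDIT: Just noticed you suggested using retainAll() in your question. That can also be O(n), but only if the collection you pass in has a fast contains(), such as a HashSet. ArrayList.retainAll() calls contains() on its argument for every element, so passing another ArrayList makes each call O(n*m).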
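For reference, here is a minimal sketch of the same hash map idea written against List<List<Integer>> instead of int[][]. The class name CommonElementIndices and the method name commonIndices are made up for this example, and it relies on the question's assumption that no list contains repeated elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CommonElementIndices {

    // Maps every element that occurs in all lists to its index in each list,
    // in the order of the input lists. Assumes no list has repeated elements.
    static Map<Integer, List<Integer>> commonIndices(List<List<Integer>> lists) {
        Map<Integer, List<Integer>> positions = new HashMap<>();
        for (List<Integer> list : lists) {
            for (int j = 0; j < list.size(); j++) {
                positions.computeIfAbsent(list.get(j), k -> new ArrayList<>()).add(j);
            }
        }
        // Drop elements that were not seen exactly once per list.
        positions.values().removeIf(indices -> indices.size() != lists.size());
        return positions;
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = Arrays.asList(
                Arrays.asList(0, 1, 2),
                Arrays.asList(3, 0, 4),
                Arrays.asList(5, 6, 0));
        System.out.println(commonIndices(lists)); // {0=[0, 1, 2]}
    }
}
```

Each element is visited once, so the expected running time is linear in the total number of elements.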

Here is a very inefficient but fairly readable solution using streams that returns a list of lists of indices.
int[][] source = {{0, 1, 2}, {3, 0, 4}, {5, 6, 0}};
List<List<Integer>> result = Arrays.stream(source)
    .map(list -> IntStream.range(0, list.length)
        // keep index i if list[i] occurs in every array (linear scan, so the arrays need not be sorted)
        .filter(i -> Arrays.stream(source)
            .allMatch(l -> Arrays.stream(l).anyMatch(x -> x == list[i])))
        .boxed()
        .collect(Collectors.toList()))
    .collect(Collectors.toList());
You can add toArray calls to convert to arrays if required.

Related

Arraylist Comparison using one loop

This has been on my mind for a while, ever since I started working with arrays and ArrayLists. When comparing the values of one array against every element of another array, I used to use two nested for loops, since that was the easiest way to do it. Recently I learned that this greatly increases the time complexity, so I have been looking for another solution. Can anyone help me solve this with a better algorithm? I am using Java, but a solution in any language would be fine; I only need the algorithm. Thanks in advance.
This is what i am doing:
a1 = [1,2,3,4,5]
b1 = [9,5,4,3,8,3,7]
I want to count how many times the elements of a1 occur in b1.
So what i am doing is:
int count = 0;
for (int i = 0; i < a1.length; i++) {
    for (int j = 0; j < b1.length; j++) {
        if (a1[i] == b1[j]) {
            count = count + 1;
        }
    }
}
System.out.println("count is " + count);
There's no need for an explicit loop to get what you want:
ArrayList<Integer> l1 = new ArrayList<Integer>();
l1.add(1);
l1.add(2);
l1.add(3);
l1.add(4);
l1.add(5);
ArrayList<Integer> l2 = new ArrayList<Integer>();
l2.add(9);
l2.add(5);
l2.add(4);
l2.add(3);
l2.add(8);
l2.add(3);
l2.add(7);
ArrayList<Integer> lFiltered = new ArrayList<Integer>(l2);
lFiltered.removeAll(l1);
int Times = l2.size() - lFiltered.size();
System.out.println("number of migrants : " + Times);
It is enough to build a copy of l2 without the elements of l1 and then count how many elements were removed.
Use hashing, e.g. using a Set or Map
If you want to compare the objects as a whole:
properly implement equals and hashcode for your class (if not implemented already)
put all the elements of list A into a Set, then see which elements from list B are in that Set
If you just want to compare objects by some attribute:
define a method that maps the objects to that attribute (or combination of attributes, e.g. as a List)
create a Map<KeyAttributeType, List<YourClass>> and for each element from list A, add the element to that Map, creating the list on first use: map.computeIfAbsent(getKey(x), k -> new ArrayList<>()).add(x)
for each element from list B, calculate the value of the key function and get the elements it "matches" from the map: matches = map.get(getKey(y))
Given your code, your case seems to be a bit different, though. You have lists or arrays of numbers, so no additional hashing is necessary, and you do not just want to see which items "match", but count all combinations of matching items. For this, you could create a Map<Integer, Long> to count how often each element of the first list appears, and then get the sum of those counts for the elements from the second list.
int[] a1 = {1,2,3,4,5};
int[] b1 = {9,5,4,3,8,3,7};
Map<Integer, Long> counts = IntStream.of(b1).boxed()
.collect(Collectors.groupingBy(x -> x, Collectors.counting()));
System.out.println(counts); // {3=2, 4=1, 5=1, 7=1, 8=1, 9=1}
long total = IntStream.of(a1).mapToLong(x -> counts.getOrDefault(x, 0L)).sum();
System.out.println(total); // 4
Of course, instead of using the Stream API you can just as well use regular loops.
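A loop-based version of the same counting idea might look like the sketch below. The class name MatchCounter and the method name countMatches are invented for the example; the logic follows the stream version above.

```java
import java.util.HashMap;
import java.util.Map;

class MatchCounter {

    // Counts the pairs (i, j) with a[i] == b[j] in expected O(a.length + b.length) time.
    static long countMatches(int[] a, int[] b) {
        // First pass: how often does each value occur in b?
        Map<Integer, Long> counts = new HashMap<>();
        for (int x : b) {
            counts.merge(x, 1L, Long::sum);
        }
        // Second pass: add up the counts for the values in a.
        long total = 0;
        for (int x : a) {
            total += counts.getOrDefault(x, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] a1 = {1, 2, 3, 4, 5};
        int[] b1 = {9, 5, 4, 3, 8, 3, 7};
        System.out.println(countMatches(a1, b1)); // 4
    }
}
```

This gives the same result as the nested loops in the question, without comparing every pair.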
Use ArrayLists.
To compare the content of both arrays:
ArrayList<String> listOne = new ArrayList<>(Arrays.asList(yourArray1));
ArrayList<String> listTwo = new ArrayList<>(Arrays.asList(yourArray));
listOne.retainAll(listTwo);
System.out.println(listOne);
To find missing elements:
listTwo.removeAll(listOne);
System.out.println(listTwo);
To count the common elements:
// Time complexity is O(n^2)
int count = 0;
for (String element : listOne) {
    for (String element2 : listTwo) {
        if (element.equalsIgnoreCase(element2)) {
            count += 1;
        }
    }
}

Java: remove a range of indices from a list

Consider a linked list of strings I got from somewhere
LinkedList<String> names = getNames();
Now, I want to remove the first k elements from the list. Currently, I'll do it this way:
for (int i = 0 ; i < k ; i++) {
names.removeFirst();
}
Is there some way to do it more efficiently and to instead call something like:
names.removeRange(0, k);
Note that I prefer not to construct a whole new list using subList(), since for small values of k, popping k times would be even more efficient than constructing the new list.
Maybe something like this:
names.subList(0, k).clear();
This is more efficient. Note that subList() only returns a view of the original list, so the following does not release any memory:
names.subList(k, names.size());
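A small self-contained sketch of the subList(0, k).clear() idiom on a LinkedList follows. The class name RemoveFirstK and the helper name removeFirst are hypothetical, and the sample names are made up.

```java
import java.util.Arrays;
import java.util.LinkedList;

class RemoveFirstK {

    // Removes the first k elements in place, without building a new list.
    // Throws IndexOutOfBoundsException if k is larger than the list size.
    static void removeFirst(LinkedList<String> names, int k) {
        names.subList(0, k).clear();
    }

    public static void main(String[] args) {
        LinkedList<String> names = new LinkedList<>(Arrays.asList("a", "b", "c", "d", "e"));
        removeFirst(names, 2);
        System.out.println(names); // [c, d, e]
    }
}
```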

Getting the indices of an unsorted double array after sorting

This question comes as a companion of this one that regarded fastest sorting of a double array.
Now I want to get the top-k indices corresponding to the unsorted array.
I have implemented this version which (unfortunately) uses autoboxing and HashMap as proposed in some answers including this one:
HashMap<Double, Integer> map = new HashMap<Double, Integer>();
for(int i = 0; i < numClusters; i++) {
map.put(scores[i], i);
}
Arrays.sort(scores);
HashSet<Integer> topPossibleClusters = new HashSet<Integer>();
for(int i = 0; i < numClusters; i++) {
topPossibleClusters.add(map.get(scores[numClusters - (i+1)]));
}
As you can see this uses a HashMap with keys the Double values of the original array and as values the indices of the original array.
So, after sorting the original array I just retrieve it from the map.
I also use HashSet as I am interested in deciding if an int is included in this set, using .contains() method. (I don't know if this makes a difference since as I mentioned in the other question my arrays are small -50 elements-). If this does not make a difference point it out though.
I am not interested in the value per se, only the indices.
My question is whether there is a faster approach to go with it?
This sort of interlinking/interlocking collections lends itself to fragile, easily broken, hard to debug, unmaintainable code.
Instead create an object:
class Data {
double value;
int originalIndex;
}
Create an array of Data objects storing the original value and index.
Sort them using a custom comparator that looks at data.value and sorts descending.
Now the top X items in your array are the ones you want and you can just look at the value and originalIndex as you need them.
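The steps above could be sketched roughly as follows. The class name TopKIndices and the method name topK are assumptions made for this example, and the sample scores are invented.

```java
import java.util.Arrays;
import java.util.Comparator;

class TopKIndices {

    // Pairs a value with the position it had in the unsorted array.
    static final class Data {
        final double value;
        final int originalIndex;

        Data(double value, int originalIndex) {
            this.value = value;
            this.originalIndex = originalIndex;
        }
    }

    // Returns the original indices of the k largest scores, largest first.
    static int[] topK(double[] scores, int k) {
        Data[] data = new Data[scores.length];
        for (int i = 0; i < scores.length; i++) {
            data[i] = new Data(scores[i], i);
        }
        // Custom comparator: sort by value, descending.
        Arrays.sort(data, Comparator.comparingDouble((Data d) -> d.value).reversed());
        int[] result = new int[Math.min(k, data.length)];
        for (int i = 0; i < result.length; i++) {
            result[i] = data[i].originalIndex;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] scores = {0.3, 0.9, 0.1, 0.7};
        System.out.println(Arrays.toString(topK(scores, 2))); // [1, 3]
    }
}
```

Unlike a HashMap keyed on the score, this keeps both entries when two scores are equal, and it leaves the original array untouched.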
As Tim points out, linking multiple collections is rather error-prone. I would suggest using a TreeMap as this would allow for a standalone solution.
Let's say you have double[] data; first copy it to a TreeMap:
final TreeMap<Double, Integer> dataWithIndex = new TreeMap<>();
for(int i = 0; i < data.length; ++i) {
dataWithIndex.put(data[i], i);
}
N.B. You can declare dataWithIndex as a NavigableMap to be less specific, but it's so much longer and it doesn't really add much as there is only one implementation in the JDK.
This will populate the Map in O(n lg n) time as each put is O(lg n) - this is the same complexity as sorting. In reality it will probably be a little slower, but it will scale in the same way.
Now, say you need the first k elements, you need to first find the kth element - this is O(k):
final Iterator<Double> keyIter = dataWithIndex.keySet().iterator();
Double kthKey = null; // initialised so that its later use compiles
for (int i = 0; i < k; ++i) {
    kthKey = keyIter.next();
}
Now you just need to get the sub-map that has all the entries up to the kth entry:
final Map<Double, Integer> topK = dataWithIndex.headMap(kthKey, true);
If you only need to do this once, then with Java 8 you can do something like this:
List<Entry<Double, Integer>> topK = IntStream.range(0, data.length).
mapToObj(i -> new SimpleEntry<>(data[i], i)).
sorted(comparing(Entry::getKey)).
limit(k).
collect(toList());
i.e. take an IntStream for the indices of data and mapToObj to an Entry of the data[i] => i (using the AbstractMap.SimpleEntry implementation). Now sort that using Entry::getKey and limit the size of the Stream to k entries. Now simply collect the result to a List. This has the advantage of not clobbering duplicate entries in the data array.
It is almost exactly what Tim suggests in his answer, but using an existing JDK class.
This method is also O(n lg n). The catch is that if the TreeMap approach is reused then it's O(n lg n) to build the Map but only O(k) to reuse it. If you want to use the Java 8 solution with reuse then you can do:
List<Entry<Double, Integer>> sorted = IntStream.range(0, data.length).
mapToObj(i -> new SimpleEntry<>(data[i], i)).
sorted(comparing(Entry::getKey)).
collect(toList());
i.e. don't limit the size to k elements. Now, to get the first k elements you just need to do:
List<Entry<Double, Integer>> subList = sorted.subList(0, k);
The magic of this is that it's O(1).

Finding index of duplicated element in arraylist

I'm trying to find the index position of the duplicates in an arraylist of strings. I'm having trouble figuring out a way to efficiently loop through the arraylist and report the index of the duplicate. My initial thought was to use Collections.binarySearch() to look for a duplicate, but I'm not sure how I would be able to compare the elements of the arraylist to each other with binarySearch. The only other thought I had would involve looping through the list, which is quite massive, too many times to even be feasible. I have limited java knowledge so any help is appreciated.
Not elegant, but should work:
Map<String, List<Integer>> indexList = new HashMap<String, List<Integer>>();
for (int i = 0; i < yourList.size(); i++) {
String currentString = yourList.get(i);
List<Integer> indexes = indexList.get(currentString);
if (indexes == null) {
indexList.put(currentString, indexes = new LinkedList<Integer>());
}
indexes.add(i);
if (indexes.size() > 1) {
// found duplicate, do what you like
}
}
// if you skip the last if in the for loop you can do this:
for (String string : indexList.keySet()) {
if (indexList.get(string).size() > 1) {
// String string has multiple occurrences
// List of corresponding indexes:
List<Integer> indexes = indexList.get(string);
// do what you want
}
}
It sounds like you're out of luck.
You will have to inspect every element (i.e. iterate through the whole list). Think about it logically - if you could avoid this, it means that there's one element that you haven't inspected. But this element could be any value, and so could be a duplicate of another list element.
Binary searches are a smart way to reduce the number of elements checked when you are aware of some relationship that holds across the list - so that checking one element gives you information about the others. For instance, for a sorted list if the middle element is greater than 5, you know that every element after it is also greater than five.
However, I don't think there's a way to make such an inference when it comes to duplicate checking. You'd have to sort the list in terms of "number of elements that this duplicates" (which is begging the question), otherwise no tests you perform on element x will give you insight into whether y is a duplicate.
This may not be a memory-efficient solution, but I think it is what you were looking for. The program could probably be improved further.
import java.io.*;
import java.util.*;
class ArrayList2_CountingDuplicates
{
public static void main(String[] args)throws IOException
{
ArrayList<String> als1=new ArrayList<String>();
ArrayList<String> als2=new ArrayList<String>();
int arr[];
int n,i,j,c=0;
String s;
BufferedReader p=new BufferedReader(new InputStreamReader(System.in));
n=Integer.parseInt(p.readLine());
arr=new int[n];
for(i=0;i<n;i++)
als1.add(p.readLine());
for(i=0;i<n;i++)
{
s=als1.get(i);
als1.remove(i);
als2.add(s);
arr[c]=1;
while(als1.contains(s))
{
j=als1.indexOf(s);
als1.remove(j);
arr[c]=arr[c]+1;
}
n=n-arr[c];
c=c+1;
i=-1;
}
for(i=0;i<c;i++)
System.out.println(als2.get(i)+" has frequency "+arr[i]);
}
}
I was looking for such a method and eventually I came up with my own solution with a more functional approach to solve the problem.
public <T> Map<T, List<Integer>> findDuplicatesWithIndexes(List<T> elems) {
return IntStream.range(0, elems.size())
.boxed()
.collect(Collectors.groupingBy(elems::get))
.entrySet().stream()
.filter(e -> e.getValue().size() > 1)
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
}
It returns a map consisting of duplicated elements as the keys and list of all indexes of repeating element as the value.

Remove multiple elements from ArrayList

I have a bunch of indexes and I want to remove elements at these indexes from an ArrayList. I can't do a simple sequence of remove()s because the elements are shifted after each removal. How do I solve this?
To remove elements at indexes:
Collections.sort(indexes, Collections.reverseOrder());
for (int i : indexes)
strs.remove(i);
Or, using the Stream API from Java 8:
indexes.sort(Comparator.reverseOrder());
indexes.stream().mapToInt(i -> i).forEach(strs::remove);
Sort the indices in descending order and then remove them one by one. If you do that, there's no way a remove will affect any indices that you later want to remove.
How you sort them will depend on the collection you are using to store the indices. If it's a list, you can do this:
List<Integer> indices;
Collections.sort(indices, new Comparator<Integer>() {
    public int compare(Integer a, Integer b) {
        //todo: handle null
        return b.compareTo(a);
    }
});
Edit
@aioobe found the helper that I failed to find. Instead of the above, you can use
Collections.sort(indices, Collections.reverseOrder());
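Put together, a minimal runnable sketch of the descending-order removal could look like this. The class name RemoveAtIndices and the method name removeAt are invented here, and the sketch assumes the indices are distinct and within range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RemoveAtIndices {

    // Removes the elements at the given positions from list, in place.
    // Assumes the indices are distinct and valid for the list.
    static <T> void removeAt(List<T> list, List<Integer> indexes) {
        // Work on a copy so the caller's index list is not reordered.
        List<Integer> sorted = new ArrayList<>(indexes);
        Collections.sort(sorted, Collections.reverseOrder());
        // Unboxing to int selects remove(int index) rather than remove(Object).
        for (int i : sorted) {
            list.remove(i);
        }
    }

    public static void main(String[] args) {
        List<String> strs = new ArrayList<>(Arrays.asList("0", "1", "2", "3", "4", "5", "6"));
        removeAt(strs, Arrays.asList(2, 5, 3));
        System.out.println(strs); // [0, 1, 4, 6]
    }
}
```

Removing from the highest index down means that no removal shifts an element that still has to be removed.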
I came here for removing elements in specific range (i.e., all elements between 2 indexes), and found this:
list.subList(indexStart, indexEnd).clear()
You can remove the elements starting from the largest index downwards, or if you have references to the objects you wish to remove, you can use the removeAll method.
You might want to use the subList method with the range of indexes you would like to remove and then call clear() on it.
(Note that the second parameter is exclusive: in this example I pass 2, meaning only indexes 0 and 1 will be removed.)
public static void main(String[] args) {
    ArrayList<String> animals = new ArrayList<String>();
    animals.add("cow");
    animals.add("dog");
    animals.add("chicken");
    animals.add("cat");
    animals.subList(0, 2).clear();
    for (String s : animals)
        System.out.println(s);
}
the result will be:
chicken
cat
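You can remove the indexes in reverse order. If the indexes are consecutive, like 1,2,3, you can remove the whole range at once. ArrayList.removeRange() is protected, so from outside the class use list.subList(1, 4).clear() instead (the end index is exclusive).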
I think nanda was the correct answer.
List<T> toRemove = new LinkedList<T>();
for (T t : masterList) {
if (t.shouldRemove()) {
toRemove.add(t);
}
}
masterList.removeAll(toRemove);
You can sort the indices as many said, or you can use an iterator and call remove()
List<String> list = new ArrayList<String>();
list.add("0");
list.add("1");
list.add("2");
list.add("3");
list.add("4");
list.add("5");
list.add("6");
List<Integer> indexes = new ArrayList<Integer>();
indexes.add(2);
indexes.add(5);
indexes.add(3);
int cpt = 0;
Iterator<String> it = list.iterator();
while(it.hasNext()){
it.next();
if(indexes.contains(cpt)){
it.remove();
}
cpt++;
}
it depends what you need, but the sort will be faster in most cases
Use guava! The method you are looking is Iterators.removeAll(Iterator removeFrom, Collection elementsToRemove)
If you have a great many elements to remove (and a long list), it may be faster to iterate over the list and add all elements that are not to be removed to a new list, since each remove() step in an ArrayList shifts all elements after the removed one by one position. In this case, if your index list is not already sorted (so that you could iterate over it in parallel with the main list), you may want to use a HashSet or BitSet or some similar O(1)-access structure for the contains() check:
/**
 * Creates a new List containing all elements of {@code original},
 * apart from those with an index in {@code indices}.
 * Neither the original list nor the indices collection is changed.
 * @return a new list containing only the remaining elements.
 */
public <X> List<X> removeElements(List<X> original, Collection<Integer> indices) {
// wrap for faster access.
indices = new HashSet<Integer>(indices);
List<X> output = new ArrayList<X>();
int len = original.size();
for(int i = 0; i < len; i++) {
if(!indices.contains(i)) {
output.add(original.get(i));
}
}
return output;
}
Order your list of indexes in descending order, like this:
2,12,9,7,3 becomes 12,9,7,3,2
and then do this:
// indexes is an int[], so remove(int index) is called rather than remove(Object)
for (int i = 0; i < indexes.length; i++) {
    source_array.remove(indexes[i]);
}
This should resolve your problem.
If the elements you wish to remove are all grouped together, you can do a subList(start, end).clear() operation.
If the elements you wish to remove are scattered, it may be better to create a new ArrayList, add only the elements you wish to include, and then copy back into the original list.
Edit: I realize now this was not a question of performance but of logic.
If you want to remove positions X to the end of the list:
// a is the ArrayList; subList() returns a view, so clearing it removes the range from a
a.subList(X, a.size()).clear();
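Assuming your indexes list is sorted in ascending order (e.g. 1, 3, 19, 29), you can do this:
for (int i = 0; i < indexes.size(); i++) {
    // i elements have already been removed, so each remaining target has shifted left by i
    originalArray.remove(indexes.get(i) - i);
}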
A more efficient method that I guess I have not seen above is creating a new Arraylist and selecting which indices survive by copying them to the new array. And finally reassign the reference.
I ended up here for a similar query and @aioobe's answer helped me figure out the solution.
However, if you are populating the list of indices to delete yourself, you might want to consider using this:
indices.add(0, i);
This will eliminate the need for (the costly) reverse-sorting of the list before iterating over it, while removing elements from the main ArrayList.
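As an illustration, here is a rough sketch of that idea. The class name PrependIndices, the method name removeEmpty and the "remove empty strings" criterion are all made up for the example. A LinkedList is used for the indices because inserting at position 0 of an ArrayList shifts every existing entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class PrependIndices {

    // Removes all empty strings from list, in place.
    static void removeEmpty(List<String> list) {
        List<Integer> indices = new LinkedList<>();
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i).isEmpty()) {
                // Prepend, so the collected indices end up in descending order.
                indices.add(0, i);
            }
        }
        // Already largest-first, so no reverse sort is needed before removing.
        for (int i : indices) {
            list.remove(i);
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "", "b", "", "c"));
        removeEmpty(list);
        System.out.println(list); // [a, b, c]
    }
}
```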
