Java: See if ArrayList contains ArrayList with duplicate values

I'm trying to write a method that determines whether an ArrayList (a2) contains another ArrayList (a1), given that both lists may contain duplicate values. containsAll won't work here: it ignores how many times each value occurs, so it returns true as long as every value is present at least once.
This is what I have so far. I believe the logic would work, but I cannot call .remove inside the for-each loop:
public boolean isSubset(ArrayList<Integer> a1, ArrayList<Integer> a2) {
Integer a1Size= a1.size();
for (Integer integer2:a2){
for (Integer integer1: a1){
if (integer1==integer2){
a1.remove(integer1);
a2.remove(integer2);
if (a1Size==0){
return true;
}
}
}
}
return false;
}
Thanks for the help.

Updated
I think the clearest statement of your question is in one of your comments:
Yes, the example " Example: [dog,cat,cat,bird] is a match for
containing [cat,dog] is false but containing [cat,cat,dog] is true?"
is exactly what I am trying to achieve.
So really, you are not looking for a "subset", because these are not sets. They can contain duplicate elements. What you are really saying is you want to see whether a1 contains all the elements of a2, in the same amounts.
One way to get to that is to count all the elements in both lists. We can get such a count using this method:
private Map<Integer, Integer> getCounter (List<Integer> list) {
Map<Integer, Integer> counter = new HashMap<>();
for (Integer item : list) {
counter.put (item, counter.containsKey(item) ? counter.get(item) + 1 : 1);
}
return counter;
}
We'll rename your method to be called containsAllWithCounts(), and it will use getCounter() as a helper. Your method will also accept List objects as its parameters, rather than ArrayList objects: it's a good practice to specify parameters as interfaces rather than implementations, so you are not tied to using ArrayList types.
With that in mind, we simply scan the counts of the items in a2 and see that they are the same in a1:
public boolean containsAllWithCounts(List<Integer> a1, List<Integer> a2) {
Map<Integer,Integer> counterA1 = getCounter(a1);
Map<Integer,Integer> counterA2 = getCounter(a2);
boolean containsAll = true;
for (Map.Entry<Integer, Integer> entry : counterA2.entrySet ()) {
Integer key = entry.getKey();
Integer count = entry.getValue();
containsAll &= counterA1.containsKey(key) && counterA1.get(key).equals(count);
if (!containsAll) break;
}
return containsAll;
}
If you like, I can rewrite this code to handle arbitrary types, not just Integer objects, using Java generics. Also, all the code can be shortened using Java 8 streams (which I originally used - see comments below). Just let me know in comments.
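A generic, stream-based version along the lines offered above might look like the following sketch. The class name CountedContainment and the helper name counts are made up for illustration, and the sketch assumes the lists contain no null elements (Collectors.groupingBy rejects null keys).

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class CountedContainment {
    // Counts how often each element occurs in the list.
    static <T> Map<T, Long> counts(List<T> list) {
        return list.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // True if every element of 'wanted' occurs in 'source' exactly as often as in 'wanted'.
    static <T> boolean containsAllWithCounts(List<T> source, List<T> wanted) {
        Map<T, Long> sourceCounts = counts(source);
        return counts(wanted).entrySet().stream()
                .allMatch(e -> e.getValue().equals(sourceCounts.get(e.getKey())));
    }
}
```

With the example from the question, a source of [dog, cat, cat, bird] matches [cat, cat, dog] but not [cat, dog], because the cat counts differ in the second case.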

If you want to remove elements from a list while iterating, you have two choices:
iterate over a copy
use a concurrent list implementation
See also:
http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#synchronizedList-java.util.List-
By the way, why don't you override the contains method?
Here you use a simple type like Integer, but what happens when you are using a List<SomeComplexClass>?
Example of removing while iterating over a copy:
List<Integer> list1 = new ArrayList<Integer>();
List<Integer> list2 = new ArrayList<Integer>();
List<Integer> listCopy = new ArrayList<>(list1);
Iterator<Integer> iterator1 = listCopy.iterator();
while(iterator1.hasNext()) {
Integer next1 = iterator1.next();
Iterator<Integer> iterator2 = list2.iterator();
while (iterator2.hasNext()) {
Integer next2 = iterator2.next();
if(next1.equals(next2)) list1.remove(next1);
}
}
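Another option, not mentioned in the answer above, is to remove through the Iterator itself: Iterator.remove() is the one structural change the iterator tolerates. A minimal sketch, with the class and method names chosen for illustration and assuming the list holds no null elements:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class IteratorRemoval {
    // Removes every element equal to 'value' while iterating, without a copy.
    static <T> void removeAllEqualTo(List<T> list, T value) {
        Iterator<T> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals(value)) {
                it.remove(); // safe: the removal goes through the iterator
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 1, 3));
        removeAllEqualTo(numbers, 1);
        System.out.println(numbers); // [2, 3]
    }
}
```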
See also this answer about iterators:
Concurrent Modification exception
Also, don't use the == operator to compare objects :) Use the equals method instead.
About removeAll() and similar methods:
Keep in mind that many classes that implement the List interface do not support every optional operation, so you can end up with an UnsupportedOperationException. That is why I prefer a "low-level" binary/linear/mixed search in this case.
To compare objects of complex classes, you will need to override the equals and hashCode methods.
If you want to remove the duplicate values, simply put the arraylist(s) into a HashSet. It will remove the duplicates based on equals() of your object.
- Olga
In Java, HashMap works by using hashCode to locate a bucket. Each bucket is a list of items residing in that bucket. The items are scanned, using equals for comparison. When adding items, the HashMap is resized once a certain load percentage is reached.
So, sometimes it will have to compare against a few items, but generally it's much closer to O(1) than O(n).
In short, there is no need to use more resources (memory) and pull in unnecessary classes, as the hash map "get" method gets expensive when many items end up in the same bucket.
hashCode -> bucket -> [if many items in the bucket] get = linear scan
So what matters when removing items?
The complexity of equals and hashCode, and the use of a proper algorithm to iterate.

I know this is maybe amateurish, but...
There is no need to remove the items from both lists; just remove them from one list:
public boolean isSubset(ArrayList<Integer> a1, ArrayList<Integer> a2) {
for(Integer a1Int : a1){
for (int i = 0; i<a2.size();i++) {
if (a2.get(i).equals(a1Int)) {
a2.remove(i);
break;
}
}
if (a2.size()== 0) {
return true;
}
}
return false;
}
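Note that the method above empties a2 as a side effect. A variant that leaves both arguments untouched could work on a copy instead. The following is a sketch with hypothetical names, using "at least as many occurrences" as the containment rule:

```java
import java.util.ArrayList;
import java.util.List;

final class MultisetContainment {
    // True if 'container' holds every element of 'wanted' at least as often as 'wanted' does.
    static <T> boolean containsWithMultiplicity(List<T> container, List<T> wanted) {
        List<T> remaining = new ArrayList<>(container); // work on a copy
        for (T item : wanted) {
            // remove(Object) deletes one occurrence and reports whether it found one
            if (!remaining.remove(item)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the element type is generic, remaining.remove(item) always resolves to remove(Object), so Integer elements are not mistaken for indexes.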

If you want to remove the duplicate values, simply put the arraylist(s) into a HashSet. It will remove the duplicates based on equals() of your object.

Related

How can I compare two array lists for equality with a custom comparator?

To be specific, I have two lists:
List<SystemUserWithNameAndId> list1;
List<SystemUserWithNameAndId> list2;
I want to check if they contain the same system users and ordering is not an issue. I tried to use a comparator to sort them first and then check if they're equal using the equals() method of lists. But I don't want to override the equals method for SystemUserWithNameAndId and I was wondering if I could use the comparator I created for sorting or a similar one to check for equality without explicitly iterating through the lists after sorting.
Comparator<SystemUserWithNameAndId> systemUserComparator = new Comparator<SystemUserWithNameAndId>()
{
@Override
public int compare(SystemUserWithNameAndId systemUser1, SystemUserWithNameAndId systemUser2)
{
final int systemUserId1 = systemUser1.getSystemUserId();
final int systemUserId2 = systemUser2.getSystemUserId();
return systemUserId1 == systemUserId2
? 0
: systemUserId1 - systemUserId2;
}
};
Collections.sort(systemUsers1, systemUserComparator);
Collections.sort(systemUsers2, systemUserComparator);
return systemUsers1.equals(systemUsers2);
Ideally, I want to be able to say,
CollectionUtils.isEqualCollections(systemUsers1, systemUsers2, someCustomComparator);

Just implement the method that iterates, and reuse it every time you need it:
public static <T> boolean areEqualIgnoringOrder(List<T> list1, List<T> list2, Comparator<? super T> comparator) {
// if not the same size, lists are not equal
if (list1.size() != list2.size()) {
return false;
}
// create sorted copies to avoid modifying the original lists
List<T> copy1 = new ArrayList<>(list1);
List<T> copy2 = new ArrayList<>(list2);
Collections.sort(copy1, comparator);
Collections.sort(copy2, comparator);
// iterate through the elements and compare them one by one using
// the provided comparator.
Iterator<T> it1 = copy1.iterator();
Iterator<T> it2 = copy2.iterator();
while (it1.hasNext()) {
T t1 = it1.next();
T t2 = it2.next();
if (comparator.compare(t1, t2) != 0) {
// as soon as a difference is found, stop looping
return false;
}
}
return true;
}

Here's a Java 8 way of solving your problem. First make sure the lists are of equal length:
List<SystemUserWithNameAndId> list1 = ... ;
List<SystemUserWithNameAndId> list2 = ... ;
if (list1.size() != list2.size()) {
return false;
}
Now build a Comparator using the new comparator utilities. The idea is that instead of writing custom logic for a comparator, most comparators do something like comparing two objects by extracting a key from them, and then comparing the keys. That's what this does.
Comparator<SystemUserWithNameAndId> comp =
Comparator.comparingInt(SystemUserWithNameAndId::getSystemUserId);
Sort the lists. Of course, you might want to make copies before sorting if you don't want your function to have the side effect of sorting its input. If your input lists aren't random access (who uses LinkedList nowadays?) you might also want to copy them to ArrayLists to facilitate random access.
list1.sort(comp);
list2.sort(comp);
Run a stream over the indexes of the lists, calling the comparator on each pair. The comparator returns 0 if the elements are equal according to this comparator. If this is true for all pairs of elements, the lists are equal.
return IntStream.range(0, list1.size())
.allMatch(i -> comp.compare(list1.get(i), list2.get(i)) == 0);
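Put together as one reusable method that sorts copies rather than its inputs, the steps above might look like this sketch (the class and method names are placeholders):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

final class UnorderedEquality {
    static <T> boolean equalIgnoringOrder(List<T> a, List<T> b, Comparator<? super T> comp) {
        if (a.size() != b.size()) {
            return false;
        }
        // Sort copies so the caller's lists keep their order.
        List<T> sortedA = new ArrayList<>(a);
        List<T> sortedB = new ArrayList<>(b);
        sortedA.sort(comp);
        sortedB.sort(comp);
        // Equal if the comparator sees no difference at any index.
        return IntStream.range(0, sortedA.size())
                .allMatch(i -> comp.compare(sortedA.get(i), sortedB.get(i)) == 0);
    }
}
```

For the type in the question, the comparator argument would be Comparator.comparingInt(SystemUserWithNameAndId::getSystemUserId).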

Get unique values from ArrayList in Java

I have an ArrayList with a number of records, and one column contains gas names such as CO2, CH4, SO2, etc. Now I want to retrieve only the distinct (unique) gas names from the ArrayList, without repetition. How can it be done?

You should use a Set. A Set is a Collection that contains no duplicates.
If you have a List that contains duplicates, you can get the unique entries like this:
List<String> gasList = // create list with duplicates...
Set<String> uniqueGas = new HashSet<String>(gasList);
System.out.println("Unique gas count: " + uniqueGas.size());
NOTE: This HashSet constructor identifies duplicates by invoking the elements' hashCode() and equals() methods.

You can use Java 8 Stream API.
Method distinct is an intermediate operation that filters the stream and allows only distinct values (by default using the Object::equals method) to pass to the next operation.
I wrote an example below for your case:
// Create the list with duplicates.
List<String> listAll = Arrays.asList("CO2", "CH4", "SO2", "CO2", "CH4", "SO2", "CO2", "CH4", "SO2");
// Create a list with the distinct elements using stream.
List<String> listDistinct = listAll.stream().distinct().collect(Collectors.toList());
// Display them on the terminal using Stream::collect with a built-in Collector.
String collectAll = listAll.stream().collect(Collectors.joining(", "));
System.out.println(collectAll); //=> CO2, CH4, SO2, CO2, CH4 etc..
String collectDistinct = listDistinct.stream().collect(Collectors.joining(", "));
System.out.println(collectDistinct); //=> CO2, CH4, SO2

I hope I understand your question correctly: assuming that the values are of type String, the most efficient way is probably to convert to a HashSet and iterate over it:
ArrayList<String> values = ... //Your values
HashSet<String> uniqueValues = new HashSet<>(values);
for (String value : uniqueValues) {
... //Do something
}

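You can use this to make a list unique:
ArrayList<String> listWithDuplicateValues = new ArrayList<>();
listWithDuplicateValues.add("first");
listWithDuplicateValues.add("first");
listWithDuplicateValues.add("second");
// Collectors.toList() does not promise an ArrayList, so request one explicitly instead of casting.
ArrayList<String> uniqueList = listWithDuplicateValues.stream().distinct().collect(Collectors.toCollection(ArrayList::new));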

ArrayList values = ... // your values
Set uniqueValues = new HashSet(values); //now unique

Here's a straightforward way without resorting to custom comparators or stuff like that:
Set<String> gasNames = new HashSet<String>();
List<YourRecord> records = ...;
for(YourRecord record : records) {
gasNames.add(record.getGasName());
}
// now gasNames is a set of unique gas names, which you could operate on:
List<String> sortedGasses = new ArrayList<String>(gasNames);
Collections.sort(sortedGasses);
Note: Using a TreeSet instead of a HashSet would give a sorted collection directly, so the Collections.sort call above could be skipped. However, TreeSet is otherwise less efficient, so it is often better, and rarely worse, to use a HashSet even when sorting is needed.
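The TreeSet alternative from the note can be sketched as follows. The sample gas names are made up, and a plain list of strings stands in for the records:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class SortedUniqueNames {
    // Returns the distinct names in natural (alphabetical) order.
    static List<String> sortedUnique(List<String> names) {
        return new ArrayList<>(new TreeSet<>(names));
    }

    public static void main(String[] args) {
        List<String> gases = Arrays.asList("SO2", "CO2", "CH4", "CO2", "SO2");
        System.out.println(sortedUnique(gases)); // [CH4, CO2, SO2]
    }
}
```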

When I had the same question, I had a hard time adapting the solutions to my case, though all the previous answers have good insights.
Here is a solution for when you need a list of unique objects, NOT strings.
Let's say you have a list of Record objects. The Record class has only properties of type String, and no property of type int.
Here implementing hashCode() looks difficult at first, as hashCode() needs to return an int.
The following is a sample Record class.
public class Record{
String employeeName;
String employeeGroup;
Record(String name, String group){
employeeName= name;
employeeGroup = group;
}
public String getEmployeeName(){
return employeeName;
}
public String getEmployeeGroup(){
return employeeGroup;
}
@Override
public boolean equals(Object o){
if(o instanceof Record){
if (((Record) o).employeeGroup.equals(employeeGroup) &&
((Record) o).employeeName.equals(employeeName)){
return true;
}
}
return false;
}
@Override
public int hashCode() { // equal objects must return equal codes; the code need not be unique
int hash = 3; //this could be anything, but I would choose a prime (e.g. 5, 7, 11)
//again, the multiplier could be anything like 59, 79, 89, any prime
hash = 89 * hash + Objects.hashCode(this.employeeGroup);
hash = 89 * hash + Objects.hashCode(this.employeeName);
return hash;
}
}
As suggested earlier by others, the class needs to override both the equals() and the hashCode() method to be able to use HashSet.
Now, let's say the list of Records is allRecord (a List<Record>).
Set<Record> distinctRecords = new HashSet<>();
for(Record rc: allRecord){
distinctRecords.add(rc);
}
This will only add the distinct Records to the HashSet, distinctRecords.
Hope this helps.
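As a shorter alternative to the hand-written methods above, java.util.Objects can build both equals() and hashCode() from the same fields. This is a sketch, not the answer's original code; the class is named EmployeeRecord here only to avoid clashing with java.lang.Record in newer Java versions, and the sample data is invented.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

final class EmployeeRecord {
    private final String employeeName;
    private final String employeeGroup;

    EmployeeRecord(String name, String group) {
        this.employeeName = name;
        this.employeeGroup = group;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EmployeeRecord)) return false;
        EmployeeRecord other = (EmployeeRecord) o;
        return Objects.equals(employeeName, other.employeeName)
                && Objects.equals(employeeGroup, other.employeeGroup);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals(), so equal records hash alike.
        return Objects.hash(employeeName, employeeGroup);
    }

    public static void main(String[] args) {
        List<EmployeeRecord> all = Arrays.asList(
                new EmployeeRecord("Ann", "Ops"),
                new EmployeeRecord("Ann", "Ops"),
                new EmployeeRecord("Bob", "Ops"));
        Set<EmployeeRecord> distinct = new HashSet<>(all);
        System.out.println(distinct.size()); // 2
    }
}
```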

public static <T> List<T> getUniqueValues(List<T> input) {
return new ArrayList<>(new LinkedHashSet<>(input));
}
Don't forget to implement equals() and hashCode() for your element type first.

If you have a list of some kind of object (bean), you can do this:
List<aBean> gasList = createDuplicateGasBeans();
Set<aBean> uniqueGas = new HashSet<aBean>(gasList);
as Mathias Schwarz said above, but you have to provide your aBean with the methods hashCode() and equals(Object obj). This can be done easily in Eclipse via the dedicated menu 'Generate hashCode() and equals()' (while in the bean class).
The Set will use the overridden methods to identify equal objects.

One liner for getting a sublist from a Set

Is there a one-liner (maybe from Guava or Apache Collections) that gets a sublist from a set? Internally it should do something like this:
public <T> List<T> sublist(Set<T> set, int count) {
Iterator<T> iterator = set.iterator();
List<T> sublist = new LinkedList<T>();
int pos = 0;
while (iterator.hasNext() && pos++ < count) {
sublist.add(iterator.next());
}
return sublist;
}
Obviously, if there are not enough elements it has to return as many as possible.

With Guava:
return FluentIterable.from(set)
.limit(count)
.toList();
(Also, this won't actually iterate over the whole set, in contrast to most of these other solutions -- it'll actually only iterate through the first count elements and then stop.)

(new LinkedList<Object>(mySet)).subList(0, Math.min(count, mySet.size()))
But please note: the code (even your original code) is a little bit smelly, since the iteration order of a set depends on the actual set implementation (it is unspecified for HashSet and follows the key order for TreeSet). So it is actually an open question which elements make it into the final sublist.

This should do it:
return (new LinkedList<T>(set)).subList(0, count);
But ensure that count isn't larger than the size of the set; otherwise subList throws an IndexOutOfBoundsException.
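With Java 8 streams (an addition here, not part of the answer above), limit() sidesteps the size restriction because it simply stops when the set runs out of elements. A sketch with illustrative names, assuming count is not negative:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class SetPrefix {
    // Returns up to 'count' elements in the set's own iteration order.
    static <T> List<T> firstN(Set<T> set, int count) {
        return set.stream().limit(count).collect(Collectors.toList());
    }
}
```

Which elements come first still depends on the set implementation, so this is only predictable for ordered sets such as LinkedHashSet or TreeSet.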

You could use a TreeSet and use its subSet method:
Returns a view of the portion of this set whose elements range from fromElement to toElement. If fromElement and toElement are equal, the returned set is empty unless fromExclusive and toExclusive are both true. The returned set is backed by this set, so changes in the returned set are reflected in this set, and vice-versa. The returned set supports all optional set operations that this set supports.
EXAMPLE USING INTEGER:
TreeSet<Integer> t = new TreeSet<Integer>();
t.add(1);
t.add(2);
t.add(3);
t.add(4);
t.add(5);
System.out.println("Before SubSet:");
for(Integer s : t){
System.out.println(s);
}
System.out.println("\nAfter SubSet:");
for(Integer s : t.subSet(2,false,5,true)){
System.out.println(s);
}
OUTPUT:
Before SubSet:
1
2
3
4
5
After SubSet:
3
4
5
Alternatively, if you do not know the elements and want to return the elements between two positions, you can use an ArrayList constructed from the Set and use its subList method.
System.out.println("\nAfter SubSet:");
t = new TreeSet<>(new ArrayList<>(t).subList(2, 5));
for(Integer s : t){
System.out.println(s);
}

What about this:
Set<String> s = new HashSet<String>();
// add at least two items to the set
Set<String> subSet = new HashSet<String>(new ArrayList<String>(s).subList(1, 2));
This takes the sublist from index 1 (inclusive) to index 2 (exclusive), i.e. the single element at position 1.

Without creating a copy of the Set beforehand, you can do (using Guava) :
Lists.newLinkedList(Iterables.getFirst(Iterables.partition(mySet, count), ImmutableList.of()))
It's a real LinkedList containing only (up to) the first count elements, not a view on a larger list.

Cross compare ArrayList elements and remove duplicates

I have an ArrayList<MyObject> that may (or may not) contain duplicates of MyObject that I need to remove from the list. How can I do this without checking for duplication twice, as I would if I iterated over the list in two nested for-loops and cross-checked every item against every other item?
I just need to check every item once, so comparing A:B is enough - I don't want to compare B:A again, as I already did that.
Furthermore; can I just remove duplicates from the list while looping? Or will that somehow break the list and my loop?
Edit: Okay, I forgot an important part, as I see from the first answers: a duplicate of MyObject is not just meant in the Java sense of Object.equals(Object). I need to be able to compare objects using my own algorithm, as the equality of MyObjects is calculated by an algorithm that checks the object's fields in a special way that I need to implement!
Furthermore, I can't just override equals in MyObject, as there are several different algorithms that implement different strategies for checking the equality of two MyObjects. For example, there is a simple HashComparer and a more complex EuclidDistanceComparer, both being AbstractComparers implementing different algorithms for public abstract boolean isEqual(MyObject obj1, MyObject obj2);

Sort the list, and the duplicates will be adjacent to each other, making them easy to identify and remove. Just go through the list remembering the value of the previous item so you can compare it with the current one. If they are the same, remove the current item.
And if you use an ordinary for-loop to go through the list, you control the current position. That means that when you remove an item, you can decrement the position (n--) so that the next time around the loop will visit the same position (which will now be the next item).
You need to provide a custom comparison in your sort? That's not so hard:
Collections.sort(myArrayList, new Comparator<MyObject>() {
public int compare(MyObject o1, MyObject o2) {
return o1.getThing().compareTo(o2.getThing());
}
});
I've written this example so that getThing().compareTo() stands in for whatever you want to do to compare the two objects. You must return an integer that is zero if they are the same, positive if o1 is greater than o2, and negative if o1 is less than o2. If getThing() returned a String or a Date, you'd be all set because those classes have a compareTo method already. But you can put whatever code you need to in your custom Comparator.
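The sort-then-scan idea described at the start of this answer could look roughly like the sketch below. It assumes the list is mutable and that the comparator returns 0 exactly for the elements you consider duplicates; the class and method names are illustrative.

```java
import java.util.Comparator;
import java.util.List;

final class SortedDedup {
    // Sorts the list in place, then drops every element that compares equal to its predecessor.
    static <T> void removeDuplicates(List<T> list, Comparator<? super T> comparator) {
        list.sort(comparator);
        for (int n = 1; n < list.size(); n++) {
            if (comparator.compare(list.get(n - 1), list.get(n)) == 0) {
                list.remove(n); // remove by index
                n--;            // revisit this position, which now holds the next item
            }
        }
    }
}
```

Note that this reorders the list; if the original order matters, sorting is not the right tool.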

Create a set and it will remove the duplicates automatically for you if the ordering is not important.
Set<MyObject> mySet = new HashSet<MyObject>(yourList);

Instantiate a new set-based collection such as HashSet. Don't forget to implement equals and hashCode for MyObject.
Good Luck!

If object order is insignificant
If the order is not important, you can put the elements of the list into a Set:
Set<MyObject> mySet = new HashSet<MyObject>(yourList);
The duplicates will be removed automatically.
If object order is significant
If ordering is significant, then you can manually check for duplicates, e.g. using this snippet:
// Copy the list.
ArrayList<String> newList = (ArrayList<String>) list.clone();
// Iterate
for (int i = 0; i < list.size(); i++) {
for (int j = list.size() - 1; j >= i; j--) {
// If i is j, then it's the same object and don't need to be compared.
if (i == j) {
continue;
}
// If the compared objects are equal, remove them from the copy and break
// to the next loop
if (list.get(i).equals(list.get(j))) {
newList.remove(list.get(i));
break;
}
System.out.println("" + i + "," + j + ": " + list.get(i) + "-" + list.get(j));
}
}
This will remove all duplicates, leaving the last duplicate value as original entry. In addition, it will check each combination only once.
Using Java 8
Java Streams make it even more elegant:
List<Integer> newList = oldList.stream()
.distinct()
.collect(Collectors.toList());
If you need to consider two of your objects equal based on your own definition, you could do the following:
public static <T, U> Predicate<T> distinctByProperty(Function<? super T, ?> propertyExtractor) {
Set<Object> seen = ConcurrentHashMap.newKeySet();
return t -> seen.add(propertyExtractor.apply(t));
}
(by Stuart Marks)
And then you could do this:
List<MyObject> newList = oldList.stream()
.filter(distinctByProperty(t -> {
// Your custom property to use when determining whether two objects
// are equal. For example, consider two object equal if their name
// starts with the same character.
return t.getName().charAt(0);
}))
.collect(Collectors.toList());
Furthermore
You cannot modify a list directly while an Iterator (which a for-each loop uses under the hood) is iterating over it; doing so throws a ConcurrentModificationException. You can modify the list if you loop over it with an index-based for loop. Then you must control the index yourself (decrementing it after removing an entry).
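For the custom equality strategies described in the question, a pairwise check that never compares the same pair twice can be sketched with a BiPredicate standing in for the question's isEqual(MyObject, MyObject) method. The class and method names are hypothetical, and the approach is still quadratic in the worst case:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

final class CustomDedup {
    // Keeps the first occurrence of each element, using 'isEqual' instead of equals().
    static <T> List<T> distinctBy(List<T> input, BiPredicate<? super T, ? super T> isEqual) {
        List<T> result = new ArrayList<>();
        for (T candidate : input) {
            boolean alreadyKept = false;
            // Each candidate is only compared with elements kept so far.
            for (T kept : result) {
                if (isEqual.test(kept, candidate)) {
                    alreadyKept = true;
                    break;
                }
            }
            if (!alreadyKept) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

Building a new list avoids removing from the list being iterated, so no ConcurrentModificationException can occur and the original order is preserved.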

Or use http://docs.oracle.com/javase/6/docs/api/java/util/SortedSet.html if you need a sort order.
EDIT: What about using http://docs.oracle.com/javase/6/docs/api/java/util/TreeSet.html? It allows you to pass in a Comparator at construction time and uses that Comparator, instead of equals(), to decide whether two elements are duplicates. This gives you the flexibility of creating different sets that are ordered according to your Comparator and that implement your "equality" strategy.
Don't forget about equals() and hashCode() though...
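A TreeSet constructed with a Comparator already treats two elements as duplicates whenever the comparator returns 0 for them. A minimal sketch, using String.CASE_INSENSITIVE_ORDER as a stand-in for a custom strategy (the class and method names are made up):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class ComparatorDedup {
    static Set<String> distinctIgnoringCase(List<String> input) {
        Set<String> result = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        result.addAll(input); // elements that compare as 0 to an existing one are ignored
        return result;
    }

    public static void main(String[] args) {
        System.out.println(distinctIgnoringCase(Arrays.asList("b", "a", "A", "B"))); // [a, b]
    }
}
```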

Count the occurrences of items in ArrayList

I have a java.util.ArrayList<Item> and an Item object.
Now, I want to obtain the number of times the Item is stored in the arraylist.
I know that I can do arrayList.contains() check but it returns true, irrespective of whether it contains one or more Items.
Q1. How can I find the number of times the Item is stored in the list?
Q2. Also, if the list contains more than one Item, how can I determine the indexes of the other Items, given that arrayList.indexOf(item) returns only the index of the first Item every time?

You can use Collections class:
public static int frequency(Collection<?> c, Object o)
Returns the number of elements in the specified collection equal to the specified object. More formally, returns the number of elements e in the collection such that (o == null ? e == null : o.equals(e)).
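A short usage sketch for Collections.frequency, with strings standing in for the Item objects (the sample data and the wrapper method are made up):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class FrequencyDemo {
    static int occurrences(List<String> items, String item) {
        // Relies on equals(), so a custom Item class would need a suitable equals() override.
        return Collections.frequency(items, item);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "a", "c", "a");
        System.out.println(occurrences(items, "a")); // 3
        System.out.println(occurrences(items, "z")); // 0
    }
}
```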
If you need to count occurrences in a long list many times, I suggest you use a HashMap to store the counters and update them as you insert new items into the list. This avoids recomputing the counters each time, but of course you won't have indices.
HashMap<Item, Integer> counters = new HashMap<Item, Integer>(5000);
ArrayList<Item> items = new ArrayList<Item>(5000);
void insert(Item newEl)
{
if (counters.containsKey(newEl))
counters.put(newEl, counters.get(newEl)+1);
else
counters.put(newEl, 1);
items.add(newEl);
}
A final hint: you can use another collections library (like Apache Commons Collections) with its Bag data structure, which is described as
Defines a collection that counts the number of times an object appears in the collection.
So exactly what you need.

This is easy to do by hand.
public int countNumberEqual(ArrayList<Item> itemList, Item itemToCheck) {
int count = 0;
for (Item i : itemList) {
if (i.equals(itemToCheck)) {
count++;
}
}
return count;
}
Keep in mind that if you don't override equals in your Item class, this method will use object identity (as this is the implementation of Object.equals()).
Edit: Regarding your second question (please try to limit posts to one question apiece), you can do this by hand as well.
public List<Integer> indices(ArrayList<Item> items, Item itemToCheck) {
ArrayList<Integer> ret = new ArrayList<Integer>();
for (int i = 0; i < items.size(); i++) {
if (items.get(i).equals(itemToCheck)) {
ret.add(i);
}
}
return ret;
}

As the other respondents have already said, if you're firmly committed to storing your items in an unordered ArrayList, then counting items will take O(n) time, where n is the number of items in the list. Here at SO, we give advice but we don't do magic!
As I just hinted, if the list gets searched a lot more than it's modified, it might make sense to keep it sorted. If your list is sorted then you can find your item in O(log n) time, which is a lot quicker; and if you have a compareTo (or Comparator) implementation that is consistent with your equals, all the identical items will be right next to each other.
Another possibility would be to create and maintain two data structures in parallel. You could use a HashMap containing your items as keys and their count as values. You'd be obligated to update this second structure any time your list changes, but item count lookups would be O(1).
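A sketch of that two-structure idea follows. The CountingList class is hypothetical; it keeps a list and a count map in step, assuming all modifications go through its own methods and that the item type has consistent equals() and hashCode() implementations:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CountingList<T> {
    private final List<T> items = new ArrayList<>();
    private final Map<T, Integer> counts = new HashMap<>();

    void add(T item) {
        items.add(item);
        counts.merge(item, 1, Integer::sum); // insert 1 or increment the existing count
    }

    boolean remove(T item) {
        if (!items.remove(item)) {
            return false; // nothing to remove, counts stay as they are
        }
        // Decrement, dropping the entry once the count would reach zero.
        counts.computeIfPresent(item, (key, count) -> count == 1 ? null : count - 1);
        return true;
    }

    // Constant-time lookup on average, instead of scanning the list.
    int count(T item) {
        return counts.getOrDefault(item, 0);
    }
}
```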

I could be wrong, but it seems to me that the data structure you actually want might be a Multiset (from google-collections/Guava) rather than a List. It allows multiples, unlike Set, but doesn't care about the order. Given that, it has an int count(Object element) method that does exactly what you want. And since it isn't a list and has implementations backed by a HashMap, getting the count is considerably more efficient.

Thanks for all your nice suggestions. The code below is really very useful, as List has no search method that gives the number of occurrences.
void insert(Item newEl)
{
if (counters.containsKey(newEl))
counters.put(newEl, counters.get(newEl)+1);
else
counters.put(newEl, 1);
items.add(newEl);
}
Thanks to Jack. Good posting.
Thanks,
Binod Suman
http://binodsuman.blogspot.com

I know this is an old post, but since I did not see a hash map solution, I decided to add some pseudocode using a hash map for anyone who needs it in the future. It assumes an ArrayList of Float values.
Map<Float,Float> hm = new HashMap<>();
for(float k : Arralistentry) {
Float j = hm.get(k);
hm.put(k,(j==null ? 1 : j+1));
}
for(Map.Entry<Float, Float> value : hm.entrySet()) {
System.out.println("\n" +value.getKey()+" occurs : "+value.getValue()+" times");
}
