Count the occurrences of items in ArrayList - java

I have a java.util.ArrayList<Item> and an Item object.
Now, I want to obtain the number of times the Item is stored in the arraylist.
I know that I can do arrayList.contains() check but it returns true, irrespective of whether it contains one or more Items.
Q1. How can I find the number of times the Item is stored in the list?
Q2. Also, If the list contains more than one Item, then how can I determine the index of other Items because arrayList.indexOf(item) returns the index of only first Item every time?

You can use the Collections class:
public static int frequency(Collection<?> c, Object o)
Returns the number of elements in the specified collection equal to the specified object. More formally, returns the number of elements e in the collection such that (o == null ? e == null : o.equals(e)).
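As a minimal sketch of how that call might be used (the class name FrequencyDemo and the sample data are made up for illustration; only Collections.frequency comes from the JDK):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; only Collections.frequency is the real API here.
class FrequencyDemo {
    // Counts how often target occurs in items, using equals() (or a null check).
    static int countOf(List<String> items, String target) {
        return Collections.frequency(items, target);
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>(Arrays.asList("a", "b", "a", null, "a"));
        System.out.println(countOf(items, "a"));  // prints 3
        System.out.println(countOf(items, null)); // prints 1
    }
}
```

Note that this still walks the whole list on every call, so it is O(n) per lookup.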
If you need to count occurrences in a long list many times, I suggest you use a HashMap to store the counters and update them as you insert new items into the list. This avoids recomputing the counts on every query, but of course you won't have indices.
HashMap<Item, Integer> counters = new HashMap<Item, Integer>(5000);
ArrayList<Item> items = new ArrayList<Item>(5000);
void insert(Item newEl)
{
if (counters.containsKey(newEl))
counters.put(newEl, counters.get(newEl)+1);
else
counters.put(newEl, 1);
items.add(newEl);
}
A final hint: you can use another collections library (like Apache Commons Collections) and use its Bag data structure, which is described as
Defines a collection that counts the number of times an object appears in the collection.
So exactly what you need.
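The counter-map idea from this answer could also be wrapped in a small class. The sketch below is only one possible shape: the name CountingList and its methods are invented for illustration, and it assumes Java 8 or later for Map.merge and Map.getOrDefault.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper: a list plus a counter map that are always updated together.
class CountingList<T> {
    private final List<T> items = new ArrayList<>();
    private final Map<T, Integer> counters = new HashMap<>();

    void insert(T newEl) {
        counters.merge(newEl, 1, Integer::sum); // 1 if absent, otherwise old count + 1
        items.add(newEl);
    }

    // Expected O(1) lookup; relies on T having consistent equals() and hashCode().
    int count(T el) {
        return counters.getOrDefault(el, 0);
    }

    int size() {
        return items.size();
    }
}
```

Removal would need the mirror-image bookkeeping (decrement, and drop the key at zero), which is left out here.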

This is easy to do by hand.
public int countNumberEqual(ArrayList<Item> itemList, Item itemToCheck) {
int count = 0;
for (Item i : itemList) {
if (i.equals(itemToCheck)) {
count++;
}
}
return count;
}
Keep in mind that if you don't override equals in your Item class, this method will use object identity (as this is the implementation of Object.equals()).
Edit: Regarding your second question (please try to limit posts to one question apiece), you can do this by hand as well.
public List<Integer> indices(ArrayList<Item> items, Item itemToCheck) {
ArrayList<Integer> ret = new ArrayList<Integer>();
for (int i = 0; i < items.size(); i++) {
if (items.get(i).equals(itemToCheck)) {
ret.add(i);
}
}
return ret;
}

As the other respondents have already said, if you're firmly committed to storing your items in an unordered ArrayList, then counting items will take O(n) time, where n is the number of items in the list. Here at SO, we give advice but we don't do magic!
As I just hinted, if the list gets searched a lot more than it's modified, it might make sense to keep it sorted. If your list is sorted then you can find your item in O(log n) time, which is a lot quicker; and if you have a compareTo (or Comparator) implementation that is consistent with your equals, all the identical items will be right next to each other.
Another possibility would be to create and maintain two data structures in parallel. You could use a HashMap containing your items as keys and their count as values. You'd be obligated to update this second structure any time your list changes, but item count lookups would be O(1).
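For the sorted-list option, here is a rough sketch of counting by binary search. It assumes the list is already sorted ascending, supports fast random access (like ArrayList), and that compareTo is consistent with equals; the class and method names are made up.

```java
import java.util.List;

// Hypothetical helper: counts equal elements in a sorted list in O(log n).
class SortedCount {
    // Index of the first element that is >= key.
    static <T extends Comparable<? super T>> int lowerBound(List<T> sorted, T key) {
        int lo = 0, hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).compareTo(key) < 0) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Index of the first element that is > key.
    static <T extends Comparable<? super T>> int upperBound(List<T> sorted, T key) {
        int lo = 0, hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).compareTo(key) <= 0) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // All elements equal to key sit in [lowerBound, upperBound).
    static <T extends Comparable<? super T>> int count(List<T> sorted, T key) {
        return upperBound(sorted, key) - lowerBound(sorted, key);
    }
}
```

The same two bounds also answer the index question: the matching indices are exactly the range from lowerBound up to (but not including) upperBound.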

I could be wrong, but it seems to me like the data structure you actually want might be a Multiset (from google-collections/guava) rather than a List. It allows multiples, unlike Set, but doesn't actually care about the order. Given that, it has an int count(Object element) method that does exactly what you want. And since it isn't a list and has implementations backed by a HashMap, getting the count is considerably more efficient.

Thanks for all your nice suggestions. The code below is really very useful, as List has no search method that gives the number of occurrences.
void insert(Item newEl)
{
if (counters.containsKey(newEl))
counters.put(newEl, counters.get(newEl)+1);
else
counters.put(newEl, 1);
items.add(newEl);
}
Thanks to Jack. Good posting.
Thanks,
Binod Suman
http://binodsuman.blogspot.com

I know this is an old post, but since I did not see a hash map solution, I decided to add some pseudocode using a hash map for anyone who needs it in the future. It assumes an ArrayList of Float values.
Map<Float, Integer> hm = new HashMap<>();
for (Float k : arrayListEntries) { // arrayListEntries stands for your ArrayList<Float>
    Integer j = hm.get(k);
    hm.put(k, (j == null ? 1 : j + 1)); // first sighting counts as 1
}
for (Map.Entry<Float, Integer> entry : hm.entrySet()) {
    System.out.println("\n" + entry.getKey() + " occurs : " + entry.getValue() + " times");
}
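On Java 8 or later, the same counting pass might be written with streams. This is only a sketch: OccurrenceCounter and countAll are invented names, and it assumes the list contains no null elements (Collectors.groupingBy rejects null keys).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper: builds a value -> occurrence count map in one pass.
class OccurrenceCounter {
    static <T> Map<T, Long> countAll(List<T> items) {
        return items.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Float, Long> counts = countAll(Arrays.asList(1.5f, 2.0f, 1.5f));
        counts.forEach((value, n) -> System.out.println(value + " occurs : " + n + " times"));
    }
}
```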

Related

Java: See if ArrayList contains ArrayList with duplicate values

I'm currently trying to create a method that determines whether an ArrayList (a2) contains an ArrayList (a1), given that both lists contain duplicate values (containsAll wouldn't work: if an ArrayList contains duplicate values, it returns true regardless of how many times each value occurs).
This is what I have (I believe it would work, however I cannot use .remove within the for loop):
public boolean isSubset(ArrayList<Integer> a1, ArrayList<Integer> a2) {
Integer a1Size= a1.size();
for (Integer integer2:a2){
for (Integer integer1: a1){
if (integer1==integer2){
a1.remove(integer1);
a2.remove(integer2);
if (a1Size==0){
return true;
}
}
}
}
return false;
}
Thanks for the help.
Updated
I think the clearest statement of your question is in one of your comments:
Yes, the example " Example: [dog,cat,cat,bird] is a match for
containing [cat,dog] is false but containing [cat,cat,dog] is true?"
is exactly what I am trying to achieve.
So really, you are not looking for a "subset", because these are not sets. They can contain duplicate elements. What you are really saying is you want to see whether a1 contains all the elements of a2, in the same amounts.
One way to get to that is to count all the elements in both lists. We can get such a count using this method:
private Map<Integer, Integer> getCounter (List<Integer> list) {
Map<Integer, Integer> counter = new HashMap<>();
for (Integer item : list) {
counter.put (item, counter.containsKey(item) ? counter.get(item) + 1 : 1);
}
return counter;
}
We'll rename your method to be called containsAllWithCounts(), and it will use getCounter() as a helper. Your method will also accept List objects as its parameters, rather than ArrayList objects: it's a good practice to specify parameters as interfaces rather than implementations, so you are not tied to using ArrayList types.
With that in mind, we simply scan the counts of the items in a2 and see that they are the same in a1:
public boolean containsAllWithCounts(List<Integer> a1, List<Integer> a2) {
Map<Integer,Integer> counterA1 = getCounter(a1);
Map<Integer,Integer> counterA2 = getCounter(a2);
boolean containsAll = true;
for (Map.Entry<Integer, Integer> entry : counterA2.entrySet ()) {
Integer key = entry.getKey();
Integer count = entry.getValue();
containsAll &= counterA1.containsKey(key) && counterA1.get(key).equals(count);
if (!containsAll) break;
}
return containsAll;
}
If you like, I can rewrite this code to handle arbitrary types, not just Integer objects, using Java generics. Also, all the code can be shortened using Java 8 streams (which I originally used - see comments below). Just let me know in comments.
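A generic variant might look roughly like the sketch below. It keeps the same "equal counts" rule as the Integer version; the class name CountedContainment is invented, and Map.merge assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generic version of getCounter()/containsAllWithCounts().
class CountedContainment {
    private static <T> Map<T, Integer> getCounter(List<T> list) {
        Map<T, Integer> counter = new HashMap<>();
        for (T item : list) {
            counter.merge(item, 1, Integer::sum); // start at 1, then add 1 per repeat
        }
        return counter;
    }

    // True if every distinct element of a2 occurs in a1 exactly as often as in a2.
    static <T> boolean containsAllWithCounts(List<T> a1, List<T> a2) {
        Map<T, Integer> counterA1 = getCounter(a1);
        Map<T, Integer> counterA2 = getCounter(a2);
        for (Map.Entry<T, Integer> entry : counterA2.entrySet()) {
            // counterA1.get() returns null for a missing key, which equals() rejects.
            if (!entry.getValue().equals(counterA1.get(entry.getKey()))) {
                return false;
            }
        }
        return true;
    }
}
```

With the example from the question, [dog, cat, cat, bird] checked against [cat, dog] gives false, and checked against [cat, cat, dog] gives true.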
If you want to remove elements from a list while iterating, you have two choices:
iterate over a copy
use a concurrent list implementation
See also:
http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#synchronizedList-java.util.List-
By the way, why don't you override the contains method?
Here you use a simple type like Integer; what happens when you use List<SomeComplexClass>?
example remove with iterator over copy:
List<Integer> list1 = new ArrayList<Integer>();
List<Integer> list2 = new ArrayList<Integer>();
List<Integer> listCopy = new ArrayList<>(list1);
Iterator<Integer> iterator1 = listCopy.iterator();
while(iterator1.hasNext()) {
Integer next1 = iterator1.next();
Iterator<Integer> iterator2 = list2.iterator();
while (iterator2.hasNext()) {
Integer next2 = iterator2.next();
if(next1.equals(next2)) list1.remove(next1);
}
}
see also this answer about iterator:
Concurrent Modification exception
Also, don't use the == operator to compare objects :) Use the equals method instead.
About the use of removeAll() and similar methods:
Keep in mind that many classes that implement the List interface don't support all of its optional methods, so you can end up with an UnsupportedOperationException. That is why I prefer a "low-level" binary/linear/mixed search in this case.
And to compare objects of complex classes you will need to override the equals and hashCode methods.
If you want to remove the duplicate values, simply put the arraylist(s) into a HashSet. It will remove the duplicates based on equals() of your object.
- Olga
In Java, HashMap works by using hashCode to locate a bucket. Each bucket is a list of items residing in that bucket. The items are scanned, using equals for comparison. When adding items, the HashMap is resized once a certain load percentage is reached.
So, sometimes it will have to compare against a few items, but generally it's much closer to O(1) than O(n).
In short: there is no need to use more resources (memory) and "harness" unnecessary classes, as the hash map "get" method gets very expensive as the item count grows.
hashCode -> put to bucket [if many item in bucket] -> get = linear scan
So what counts when removing items?
The complexity of equals and hashCode, and the use of a proper algorithm to iterate.
I know this is maybe amateurish, but...
There is no need to remove the items from both lists, so, just take it from the one list
public boolean isSubset(ArrayList<Integer> a1, ArrayList<Integer> a2) {
for(Integer a1Int : a1){
for (int i = 0; i<a2.size();i++) {
if (a2.get(i).equals(a1Int)) {
a2.remove(i);
break;
}
}
if (a2.size()== 0) {
return true;
}
}
return false;
}
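A variant of the same idea that leaves both arguments untouched could work on a copy. This is only a sketch with made-up names, and note its rule differs from the counting answer above: it asks whether the container holds each element at least as many times as the contained list does, not exactly as many.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: multiset-style containment check that does not mutate its inputs.
class MultisetContainment {
    // True if every element of contained can be matched against a distinct,
    // equal element of container.
    static <T> boolean containsWithDuplicates(List<T> container, List<T> contained) {
        List<T> remaining = new ArrayList<>(container); // work on a copy
        for (T element : contained) {
            // remove(Object) drops one equal element and reports whether it found one.
            if (!remaining.remove(element)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the element type is generic here, remaining.remove(element) resolves to remove(Object) rather than remove(int), even for a List<Integer>.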
If you want to remove the duplicate values, simply put the arraylist(s) into a HashSet. It will remove the duplicates based on equals() of your object.

Insert Objects in a Constant Length List - Java

I am looking for a good optimal strategy to write a code for the following problem.
I have a List of Objects.
The Objects have a String "valuation" field among other fields. The valuation field may or may not be unique.
The List is of CONSTANT length which is calculated within the program. The length would usually be between 100 and 500.
The Objects are all sorted within the list based on String field - valuation
As new objects are found or created: The String field valuation is compared with the existing members of the list.
If the comparison fails e.g. with the bottom member of the list, then the Object is NOT added to the list.
If the comparison succeeds and the new Object is added to the list - within the sort criteria;the new object is added in the right position and the bottom member is ousted from the list to keep the length of the list constant.
One strategy which I am thinking:
1. Keep adding members to the list until it reaches maxLength.
2. Sort the list (e.g. Collections.sort with a comparator).
3. When a new member is created, compare it with the bottom member of the list.
4. If success, replace the bottom member; else continue.
5. Re-sort the list, if success,
and continue.
The program loops through million or more iterations, thus optimized comparison and running has become an issue.
Any guidance on a good strategy to address this within the Java domain. What lists will be the most effective e.g. LinkedList or ArrayLists or Sets etc. Which sort/insert (standard package) will be the most effective?
Consider this example based on TreeSet and comparing over a String for Results. As you can see, after enough iterations, only elements with very large keys are left in List. On my quite old laptop, I had 10.000 items in less than 50ms - so roundabout 5s per million list operations.
public class Valuation {
public static class Element implements Comparable<Element> {
String valuation;
String data;
Element(String v, String d) {
valuation = v;
data = d;
}
@Override
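public int compareTo(Element e) {
// Tie-break on data so that elements with equal valuations are not dropped as duplicates by the TreeSet.
int byValuation = valuation.compareTo(e.valuation);
return byValuation != 0 ? byValuation : data.compareTo(e.data);
}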
}
private TreeSet<Element> ts = new TreeSet<Element>();
private final static int LISTLENGTH = 500;
public static void main(String[] args) {
NumberFormat nf = new DecimalFormat("00000");
Random r = new Random();
Valuation v = new Valuation();
for(long l = 1; l < 150; ++l) {
long start = System.currentTimeMillis();
for(int j = 0; j < 10000; ++j) {
v.pushNew(new Element(nf.format(r.nextInt(50000))
, UUID.randomUUID().toString()));
}
System.out.println("10.000 finished in " + (System.currentTimeMillis()-start) + "ms. Set contains: " + v.ts.size());
}
for(Element e : v.ts) {
System.out.println("-> " + e.valuation);
}
}
private void pushNew(Element hexString) {
if(ts.size() < LISTLENGTH) {
ts.add(hexString);
} else {
if(ts.first().compareTo(hexString) < 0) {
ts.add(hexString);
if(ts.size() > LISTLENGTH) {
ts.remove(ts.first());
}
}
}
}
}
Any guidance on a good strategy to address this within the Java domain.
My advice would be - there is no need to do any sorting. You can ensure your data is sorted by doing binary insertion as you add more objects into your collection.
This way, as you add more items, the collection itself is already in a sorted state.
After the 500th item, if you want to add another one, we just perform another binary insertion. Finding the insertion point always remains at O(log(n)) (an array-backed list additionally has to shift the elements behind it), and there is no need to perform any sorting.
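A binary insertion into a fixed-length list might be sketched as follows. The class BoundedSortedList and its method names are invented; it assumes maxLength is at least 1, that "bottom member" means the last element in comparator order, and that a candidate which ties with the bottom member is rejected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical fixed-capacity list that stays sorted by inserting at the right position.
class BoundedSortedList<T> {
    private final List<T> items;
    private final Comparator<? super T> order;
    private final int maxLength; // assumed to be >= 1

    BoundedSortedList(int maxLength, Comparator<? super T> order) {
        this.maxLength = maxLength;
        this.order = order;
        this.items = new ArrayList<>(maxLength + 1);
    }

    // Returns true if the candidate was added, false if it did not beat the bottom member.
    boolean offer(T candidate) {
        if (items.size() == maxLength
                && order.compare(candidate, items.get(maxLength - 1)) >= 0) {
            return false;
        }
        int pos = Collections.binarySearch(items, candidate, order);
        if (pos < 0) {
            pos = -(pos + 1); // "not found" encodes the insertion point
        }
        items.add(pos, candidate);
        if (items.size() > maxLength) {
            items.remove(items.size() - 1); // remove(int): oust the bottom member
        }
        return true;
    }

    List<T> view() {
        return Collections.unmodifiableList(items);
    }
}
```

With at most 500 elements the shifting done by ArrayList.add(int, E) is a short array copy, so no re-sort is ever needed.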
Comparing with your algorithm
Your algorithm works fine for steps 1 - 4. But step 5 will likely be the bottleneck of your algorithm:
5. Re-Sort the List - if success
This is because, even though your list will only have a maximum of 500 items, an unbounded number of insertions can be performed on it after the 500th item has been added.
Imagine having 1 million more insertions where (in the worst-case scenario) all 1 million items "succeed" and can be inserted into the list; that implies your algorithm will need to perform 1 million more sorts!
That will be 1 million * n log(n) for sorting.
Compare that with binary insertion: in the worst case it will be 1 million * log(n) for insertion (no sorting).
What lists will be the most effective e.g. LinkedList or ArrayLists or Sets etc.
If you use ArrayList, insertion won't be as efficient as with a linked list, since ArrayList is backed by an array. However, accessing elements is only O(1) for ArrayList, compared to O(n) for a linked list. So there isn't a data structure which is efficient for all scenarios. You will have to plan your algorithm first and see which one fits best for your strategy.
Which sort/insert (standard package) will be the most effective?
As far as I know, Arrays.sort() and Collections.sort() are available and will give you good O(n log(n)) performance: for objects they use a tuned merge sort (TimSort), and Arrays.sort() uses a dual-pivot quicksort for primitives. Either will be more effective than a simple insertion/bubble/selection sort you write yourself.

Efficient search for not empty intersection (Java)

I have a method that returns an integer value or integer range (initial..final) and I want to know if values are all disjoint.
Is there a more efficient solution than the following one:
ArrayList<Integer> list = new ArrayList<Integer>();
// For single value
int value;
if(!list.contains(value))
list.add(value);
else
error("",null);
// Range
int initialValue,finalValue;
for(int i = initialValue; i <= finalValue; i++){
if(!list.contains(i))
list.add(i);
else
error("",null);
}
Finding a value (contains) in HashSet is a constant-time operation (O(1)) on average, which is better than a List, where contains is linear (O(n)). So, if your lists are large enough, it may be worthwhile to replace your first line with:
HashSet<Integer> list = new HashSet<Integer>();
The reason for this is that to find a value in an (unsorted) list, you need to check every index in the list until you find the one you want or run out of indexes to check. On average you'll check half the list before finding a value if the value is in the list, or the whole list if it's not. For a hash table, you generate an index from the value you want to find, then you check that one index (it's possible you need to check more than one, but it should be uncommon in a well-designed hash table).
Also, if you use a Set, you get a guarantee that each value is unique, so if you try to add a value that already exists, add will return false. You can use that to slightly simplify the code (note: This will not work if you use a List, because add always returns true on a List):
HashSet<Integer> list = new HashSet<Integer>();
int value;
if(!list.add(value))
error("",null);
Problems involving ranges often lend themselves to the use of a tree. Here's a way to do that using TreeSet:
public class DisjointChecker {
private final NavigableSet<Integer> integers = new TreeSet<Integer>();
public boolean check(int value) {
return integers.add(value);
}
public boolean check(int from, int to) {
NavigableSet<Integer> range = integers.subSet(from, true, to, true);
if (range.isEmpty()) {
addRange(from, to);
return true;
}
else {
return false;
}
}
private void addRange(int from, int to) {
for (int i = from; i <= to; ++i) {
integers.add(i);
}
}
}
Here, rather than calling an error handler, the check methods return a boolean indicating whether the arguments were disjoint from all previous arguments. The semantics of the range version are different to in the original code; if the range is not disjoint, none of the elements are added, whereas in the original, any below the first non-disjoint element are added.
A few points may deserve elaboration:
Set::add returns a boolean indicating whether the addition modified the set; we can use that as the return value from the method.
NavigableSet is an obscure but standard subinterface of SortedSet which is sadly neglected. Although you could actually use a plain SortedSet here with only minor modifications.
The NavigableSet::subSet method (like SortedSet::subSet) returns a lightweight view on the underlying set which is restricted to a given range. This provides a very efficient way to query the tree for any overlap with the whole range in one operation.
The addRange method here is very simple, and runs in O(m log n) when adding m items to a checker which has seen n items previously. It would be possible to make a version which ran in O(m) by writing an implementation of SortedSet which described a range of integers and then using Set::addAll, because TreeSet's implementation of this contains a special case for adding other SortedSets in linear time. The code for that special set implementation is very simple, but involves a lot of boilerplate, so I leave it as an exercise for the reader!
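A different technique from the one described above, offered only as a sketch: instead of storing every integer, store each accepted range once in a TreeMap keyed by its start. The class name IntervalDisjointChecker is invented, and the code assumes from <= to. It keeps the same semantics as DisjointChecker (a rejected range adds nothing).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical alternative: stores disjoint ranges as start -> end (both inclusive).
class IntervalDisjointChecker {
    private final TreeMap<Integer, Integer> ranges = new TreeMap<>();

    boolean check(int value) {
        return check(value, value);
    }

    // Returns true and records [from, to] if it overlaps no earlier range.
    boolean check(int from, int to) {
        // Stored ranges are pairwise disjoint, so among those starting at or before
        // `to`, the one with the greatest start also has the greatest end.
        Map.Entry<Integer, Integer> candidate = ranges.floorEntry(to);
        if (candidate != null && candidate.getValue() >= from) {
            return false; // overlap found, nothing is added
        }
        ranges.put(from, to);
        return true;
    }
}
```

Each check costs O(log k) for k stored ranges, regardless of how wide the ranges are.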

How to select a random key from a HashMap in Java?

I'm working with a large ArrayList<HashMap<A,B>>, and I would repeatedly need to select a random key from a random HashMap (and do some stuff with it). Selecting the random HashMap is trivial, but how should I select a random key from within this HashMap?
Speed is important (as I need to do this 10000 times and the hashmaps are large), so just selecting a random number k in [0,9999] and then doing .next() on the iterator k times, is really not an option. Similarly, converting the HashMap to an array or ArrayList on every random pick is really not an option. Please, read this before replying.
Technically I feel that this should be possible, since the HashMap stores its keys in an Entry[] internally, and selecting at random from an array is easy, but I can't figure out how to access this Entry[]. So any ideas to access the internal Entry[] are more than welcome. Other solutions (as long as they don't consume linear time in the hashmap size) are also welcome of course.
Note: heuristics are fine, so if there's a method that excludes 1% of the elements (e.g. because of multi-filled buckets) that's no problem at all.
Off the top of my head:
List<A> keysAsArray = new ArrayList<A>(map.keySet());
Random r = new Random();
then just
map.get(keysAsArray.get(r.nextInt(keysAsArray.size())));
I managed to find a solution without performance loss. I will post it here since it may help other people -- and potentially answer several open questions on this topic (I'll search for these later).
What you need is a second custom Set-like data structure to store the keys, not a list as some suggested here. List-like data structures are too expensive to remove items from. The operations needed are adding/removing elements in constant time (to keep it up-to-date with the HashMap) and a procedure to select the random element. The following class MySet does exactly this:
class MySet<A> {
ArrayList<A> contents = new ArrayList<A>();
HashMap<A,Integer> indices = new HashMap<A,Integer>();
Random R = new Random();
//selects random element in constant time
A randomKey() {
return contents.get(R.nextInt(contents.size()));
}
//adds new element in constant time
void add(A a) {
indices.put(a,contents.size());
contents.add(a);
}
//removes element in constant time
void remove(A a) {
int index = indices.get(a);
contents.set(index,contents.get(contents.size()-1));
indices.put(contents.get(index),index);
contents.remove((int)(contents.size()-1));
indices.remove(a);
}
}
You need access to the underlying entry table.
// defined statically
Field table = HashMap.class.getDeclaredField("table");
table.setAccessible(true);
Random rand = new Random();
public Entry randomEntry(HashMap map) {
Entry[] entries = (Entry[]) table.get(map);
int start = rand.nextInt(entries.length);
for(int i=0;i<entries.length;i++) {
int idx = (start + i) % entries.length;
Entry entry = entries[idx];
if (entry != null) return entry;
}
return null;
}
This still has to traverse the entries to find one which is there so the worst case is O(n) but the typical behaviour is O(1).
Sounds like you should consider either an ancillary List of keys or a real object, not a Map, to store in your list.
As @Alberto Di Gioacchino pointed out, there is a bug in the accepted solution with the removal operation. This is how I fixed it.
class MySet<A> {
ArrayList<A> contents = new ArrayList<A>();
HashMap<A,Integer> indices = new HashMap<A,Integer>();
Random R = new Random();
//selects random element in constant time
A randomKey() {
return contents.get(R.nextInt(contents.size()));
}
//adds new element in constant time
void add(A item) {
indices.put(item,contents.size());
contents.add(item);
}
//removes element in constant time
void remove(A item) {
int index = indices.get(item);
contents.set(index,contents.get(contents.size()-1));
indices.put(contents.get(index),index);
contents.remove(contents.size()-1);
indices.remove(item);
}
}
I'm assuming you are using HashMap as you need to look something up at a later date?
If not the case, then just change your HashMap to an Array/ArrayList.
If this is the case, why not store your objects in a Map AND an ArrayList so you can look up randomly or by key.
Alternatively, could you use a TreeMap instead of HashMap? I don't know what type your key is, but you could use TreeMap.floorKey() in conjunction with some key randomizer.
After spending some time, I came to the conclusion that you need to create a model which is backed by a List<Map<A, B>> and a List<A> to maintain your keys. You need to keep your List<Map<A, B>> and List<A> private and just provide the operations/methods to the caller. In this way, you will have full control over the implementation, and the actual objects will be safer from external changes.
Btw, your questions lead me to,
Why does the java.util.Set<V> interface not provide a get(Object o) method?, and
Bimap: I was trying to be clever but, of course, its values() method also returns Set.
This example, IndexedSet, may give you an idea about how-to.
[edited]
This class, SetUniqueList, might help you if you decide to create your own model. It explicitly states that it wraps the list, not copies. So, I think, we can do something like,
List<A> list = new ArrayList(map.keySet());
SetUniqueList unikList = new SetUniqueList(list, map.keySet());
// Now unikList should reflect all the changes to the map keys
...
// Then you can do
unikList.get(i);
Note: I didn't try this myself. Will do that later (rushing to home).
Since Java 8, there is an O(log(N)) approach with O(log(N)) additional memory: create a Spliterator via map.entrySet().spliterator(), make log(map.size()) trySplit() calls and choose either the first or the second half randomly. When there are say less than 10 elements left in a Spliterator, dump them into a list and make a random pick.
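The Spliterator idea could be sketched like this. The names are invented, the threshold of 10 is the one mentioned above, and two caveats are my own assumptions: the pick is only approximately uniform (the halves are not guaranteed to be equal in size), and a chosen part may turn out to be empty, in which case this sketch simply retries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Spliterator;

// Hypothetical helper: narrows the key set by repeated trySplit() calls.
class SpliteratorRandomKey {
    static <K> K randomKey(Map<K, ?> map, Random rnd) {
        if (map.isEmpty()) {
            throw new IllegalArgumentException("map is empty");
        }
        while (true) {
            Spliterator<K> rest = map.keySet().spliterator();
            while (rest.estimateSize() > 10) {
                Spliterator<K> prefix = rest.trySplit(); // splits off a part, or returns null
                if (prefix == null) {
                    break; // cannot be split any further
                }
                if (rnd.nextBoolean()) {
                    rest = prefix; // keep the split-off part instead of the remainder
                }
            }
            List<K> candidates = new ArrayList<>();
            rest.forEachRemaining(candidates::add);
            if (!candidates.isEmpty()) {
                return candidates.get(rnd.nextInt(candidates.size()));
            }
            // The chosen part contained no keys; start over with a fresh spliterator.
        }
    }
}
```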
If you absolutely need to access the Entry array in HashMap, you can use reflection. But then your program will be dependent on that concrete implementation of HashMap.
As proposed, you can keep a separate list of keys for each map. You would not keep deep copies of the keys, so the actual memory denormalisation wouldn't be that big.
Third approach is to implement your own Map implementation, the one that keeps keys in a list instead of a set.
How about wrapping HashMap in another implementation of Map? The other map maintains a List, and on put() it does:
if (inner.put(key, value) == null) listOfKeys.add(key);
(I assume that nulls for values aren't permitted, if they are use containsKey, but that's slower)
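Combining this wrapper idea with the swap-with-last removal used by MySet might look roughly like the sketch below. RandomKeyMap and its methods are invented names, it is not a full Map implementation, and randomKey() assumes the map is not empty.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical wrapper: a HashMap plus a key list that supports O(1) random picks.
class RandomKeyMap<K, V> {
    private final Map<K, V> inner = new HashMap<>();
    private final Map<K, Integer> indices = new HashMap<>(); // key -> position in keys
    private final List<K> keys = new ArrayList<>();
    private final Random random = new Random();

    V put(K key, V value) {
        if (!inner.containsKey(key)) { // containsKey also copes with null values
            indices.put(key, keys.size());
            keys.add(key);
        }
        return inner.put(key, value);
    }

    V get(K key) {
        return inner.get(key);
    }

    V remove(K key) {
        Integer index = indices.remove(key);
        if (index == null) {
            return null; // key was never stored
        }
        int last = keys.size() - 1;
        K moved = keys.get(last);
        keys.set(index, moved); // overwrite the removed slot with the last key
        keys.remove(last);      // remove(int): drop the tail in constant time
        if (index != last) {
            indices.put(moved, index);
        }
        return inner.remove(key);
    }

    K randomKey() {
        return keys.get(random.nextInt(keys.size()));
    }

    int size() {
        return keys.size();
    }
}
```

The extra memory is one list slot and one index entry per key; the keys themselves are shared references, not copies.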

Cross compare ArrayList elements and remove duplicates

I have an ArrayList<MyObject> that may (or may not) contain duplicates of MyObject I need to remove from the List. How can I do this in a way that I don't have to check duplication twice as I would do if I were to iterate the list in two for-loops and cross checking every item with every other item.
I just need to check every item once, so comparing A:B is enough - I don't want to compare B:A again, as I already did that.
Furthermore; can I just remove duplicates from the list while looping? Or will that somehow break the list and my loop?
Edit: Okay, I forgot an important part looking through the first answers: A duplicate of MyObject is not just meant in the Java way meaning Object.equals(Object), but I need to be able to compare objects using my own algorithm, as the equality of MyObjects is calculated using an algorithm that checks the Object's fields in a special way that I need to implement!
Furthermore, I can't just override equals in MyObject as there are several different algorithms that implement different strategies for checking the equality of two MyObjects - e.g. there is a simple HashComparer and a more complex EuclidDistanceComparer, both being AbstractComparers implementing different algorithms for the public abstract boolean isEqual(MyObject obj1, MyObject obj2);
Sort the list, and the duplicates will be adjacent to each other, making them easy to identify and remove. Just go through the list remembering the value of the previous item so you can compare it with the current one. If they are the same, remove the current item.
And if you use an ordinary for-loop to go through the list, you control the current position. That means that when you remove an item, you can decrement the position (n--) so that the next time around the loop will visit the same position (which will now be the next item).
You need to provide a custom comparison in your sort? That's not so hard:
Collections.sort(myArrayList, new Comparator<MyObject>() {
public int compare(MyObject o1, MyObject o2) {
return o1.getThing().compareTo(o2.getThing());
}
});
I've written this example so that getThing().compareTo() stands in for whatever you want to do to compare the two objects. You must return an integer that is zero if they are the same, positive if o1 is greater than o2, and negative if o1 is less than o2. If getThing() returned a String or a Date, you'd be all set because those classes have a compareTo method already. But you can put whatever code you need to in your custom Comparator.
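The sort-then-scan approach described in this answer could be sketched as follows. The class and method names are made up, and it assumes the comparator returns 0 exactly for the objects you consider duplicates.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: sorts in place, then removes adjacent duplicates.
class SortDedupe {
    static <T> void sortAndDedupe(List<T> list, Comparator<? super T> cmp) {
        list.sort(cmp);
        for (int n = 1; n < list.size(); n++) {
            // After sorting, duplicates are neighbours: compare with the previous item.
            if (cmp.compare(list.get(n - 1), list.get(n)) == 0) {
                list.remove(n); // remove(int): drop the current item
                n--;            // revisit this position, which now holds the next item
            }
        }
    }
}
```

Each element is compared only with its predecessor, so no pair is checked twice; the price is that the original order of the list is lost.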
Create a set and it will remove the duplicates automatically for you if the ordering is not important.
Set<MyObject> mySet = new HashSet<MyObject>(yourList);
Instantiate a new set-based collection, HashSet. Don't forget to implement equals and hashCode for MyObject.
Good Luck!
If object order is insignificant
If the order is not important, you can put the elements of the list into a Set:
Set<MyObject> mySet = new HashSet<MyObject>(yourList);
The duplicates will be removed automatically.
If object order is significant
If ordering is significant, then you can manually check for duplicates, e.g. using this snippet:
// Copy the list.
ArrayList<String> newList = (ArrayList<String>) list.clone();
// Iterate
for (int i = 0; i < list.size(); i++) {
for (int j = list.size() - 1; j >= i; j--) {
// If i is j, then it's the same object and don't need to be compared.
if (i == j) {
continue;
}
// If the compared objects are equal, remove them from the copy and break
// to the next loop
if (list.get(i).equals(list.get(j))) {
newList.remove(list.get(i));
break;
}
System.out.println("" + i + "," + j + ": " + list.get(i) + "-" + list.get(j));
}
}
This will remove all duplicates, leaving the last duplicate value as original entry. In addition, it will check each combination only once.
Using Java 8
Java Streams make it even more elegant:
List<Integer> newList = oldList.stream()
.distinct()
.collect(Collectors.toList());
If you need to consider two of your objects equal based on your own definition, you could do the following:
public static <T, U> Predicate<T> distinctByProperty(Function<? super T, ?> propertyExtractor) {
Set<Object> seen = ConcurrentHashMap.newKeySet();
return t -> seen.add(propertyExtractor.apply(t));
}
(by Stuart Marks)
And then you could do this:
List<MyObject> newList = oldList.stream()
.filter(distinctByProperty(t -> {
// Your custom property to use when determining whether two objects
// are equal. For example, consider two object equal if their name
// starts with the same character.
return t.getName().charAt(0);
}))
.collect(Collectors.toList());
Furthermore
You cannot modify a list while an Iterator (which is what a for-each loop uses) is looping through it. This will throw a ConcurrentModificationException. You can modify the list if you are looping over it with an index-based for loop. Then you must control the index position yourself (decrementing it when removing an entry).
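For the case in the question, where equality is decided by a pluggable algorithm rather than equals(), one order-preserving sketch is to build a new list instead of removing in place. Here a BiPredicate stands in for the question's isEqual(obj1, obj2); the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical helper: keeps the first of each group of "equal" items, in original order.
class CustomDedupe {
    static <T> List<T> dedupe(List<T> input, BiPredicate<? super T, ? super T> isEqual) {
        List<T> kept = new ArrayList<>();
        for (T candidate : input) {
            boolean duplicate = false;
            for (T earlier : kept) {
                // Each candidate is compared only against items already kept,
                // so no pair is ever tested in both directions.
                if (isEqual.test(earlier, candidate)) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

This is still quadratic in the worst case, which is hard to avoid when the only available operation is a pairwise isEqual; it does sidestep ConcurrentModificationException because the input list is never modified.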
Or http://docs.oracle.com/javase/6/docs/api/java/util/SortedSet.html if you need sort-order..
EDIT: What about deriving from http://docs.oracle.com/javase/6/docs/api/java/util/TreeSet.html, it will allow you to pass in a Comparator at construction time. You override add() to use your Comparator instead of equals() - this will give you the flexibility of creating different sets that are ordered according to your Comparator and they will implement your "Equality"-Strategy.
Don't forget about equals() and hashCode() though...
