I have a list of objects that contains two string properties.
public class A {
public String a;
public String b;
}
I want to retrieve two Sets one containing property a and one b.
The naive approach is something along these lines:
List<A> list = ....
Set<String> listofa = new HashSet<>();
Set<String> listofb = new HashSet<>();
for (A item : list) {
if (item.a != null)
listofa.add(item.a);
if (item.b != null)
listofb.add(item.b);
}
Trying to do this in a functional way with Guava, I ended up with this approach:
Function<A, String> getAFromList = new Function<A, String>() {
    @Nullable
    @Override
    public String apply(@Nullable A input) {
        // input is declared @Nullable, so guard against null before dereferencing
        return input == null ? null : input.a;
    }
};
Function<A, String> getBFromList = new Function<A, String>() {
    @Nullable
    @Override
    public String apply(@Nullable A input) {
        return input == null ? null : input.b;
    }
};
FluentIterable<A> iterables = FluentIterable.from(list);
Set<String> listofAs = ImmutableSet.copyOf(iterables.transform(getAFromList).filter(Predicates.notNull()));
Set<String> listofBs = ImmutableSet.copyOf(iterables.transform(getBFromList).filter(Predicates.notNull()));
However, this way I iterate over the list twice.
Is there any way to avoid iterating twice or multiple times?
In general, how does one solve these use cases in a functional way (not only in Guava/Java)?
Firstly, you're after an optimisation, but if performance is key, use regular Java methods over Guava (i.e. your first method). See here.
I think that because you want two results, at some point you will need to iterate twice (unless you pass in one of the sets to be populated, but that is definitely not an FP approach as it would not be a pure function).
However if iteration was expensive enough to need an optimisation you would iterate that once to an intermediate structure:
a_b_pairs = transformToJustAB(input) //single expensive iteration
list_of_a = transformA(a_b_pairs) //multiple cheaper iterations
list_of_b = transformB(a_b_pairs)
So the simple answer is that you have to iterate twice. Think about it. If you have N elements in your List you will need to do N inserts into the first Set and N inserts into the second Set. Functional or otherwise, you will have to iterate N twice whether it be on conversion (extraction) or insert.
If you were going for two Lists it would be different because you could create views and only iterate as needed.
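For comparison, here is a minimal sketch of a single traversal that still performs the two inserts per element discussed above. It uses the three-argument Stream.collect from plain Java 8 rather than Guava. The BothSets accumulator and the split method are hypothetical names introduced only for this illustration, and class A is given a constructor for convenience.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SinglePassSplit {
    /** Same shape as the question's class A, with a constructor added for convenience. */
    static final class A {
        final String a;
        final String b;
        A(String a, String b) { this.a = a; this.b = b; }
    }

    /** Hypothetical mutable accumulator that holds both result sets. */
    static final class BothSets {
        final Set<String> aValues = new HashSet<>();
        final Set<String> bValues = new HashSet<>();

        // Called once per element: skips nulls, exactly like the naive loop.
        void accept(A item) {
            if (item.a != null) aValues.add(item.a);
            if (item.b != null) bValues.add(item.b);
        }

        // Only used when the stream is processed in parallel.
        void merge(BothSets other) {
            aValues.addAll(other.aValues);
            bValues.addAll(other.bValues);
        }
    }

    /** Walks the list once and fills both sets. */
    static BothSets split(List<A> list) {
        return list.stream().collect(BothSets::new, BothSets::accept, BothSets::merge);
    }
}
```

The mutation is confined to the accumulator created inside collect, so callers still see a function from a list to a pair of sets. Whether that counts as "functional enough" is a matter of taste.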
This can be solved in one iteration with Multimaps.index:
Function<A, String> filterAB = new Function<A, String>() {
@Override
public String apply(A input) {
if (input.a != null) {
return "a";
}
if (input.b != null) {
return "b";
}
return "empty";
}
};
ImmutableListMultimap<String, A> partitionedMap = Multimaps.index(list, filterAB);
The output will be a Guava Multimap with up to three separate entries:
an immutable list with all objects whose a is not null under key "a".
an immutable list with all objects whose a is null but whose b is not null under key "b".
and possibly an immutable list with objects where both a and b are null under key "empty".
What you're trying to achieve is partitioning or splitting of the collection using predicates.
With Guava, you can use Multimaps.index. See related question and answer here.
Related
Given:
public abstract class Cars {}
...
public class Ford extends Cars {}
...
public class Dodge extends Cars {}
...
public class Volkswagen extends Cars {}
...
If I have two ArrayList objects:
List<Cars> dealer1 = new ArrayList<>();
List<Cars> dealer2 = new ArrayList<>();
dealer1.addAll(asList(new Ford("ford1"), new Dodge("dodge1")));
dealer2.addAll(asList(new Dodge("dodge2"), new Volkswagen("vw1")));
I then want to create a merged list from the two with only one instance of each subclass, such that:
dealerMerged = ["ford1", "dodge1", "vw1"]
OR
dealerMerged = ["ford1", "dodge2", "vw1"]
It doesn't matter which instance makes it into the merged list.
Is this possible? I had a search through and saw something about using Set but that seems to only ensure unique references, unless I've badly misunderstood something.
Overriding equals() will work but DON'T
You can always make your collection distinct by converting it to a Set (as @Arun states in a comment) or by using the distinct operation on a Stream of your collections. But remember that those approaches rely on equals() (and hashCode()). So a quick idea would be to override equals() so that it compares only the class type. But wait! If you do so, all Dodge objects become equal to each other even though they have different properties, such as the names dodge1 and dodge2. In the real world your code probably serves more than this one use case, and equals() is relied on in many other places. So stay away from doing that.
If you are looking for a Java 8 way, a stateful filter is perfect
We can use the filter operation on our concatenated stream. The filter operation is pretty straightforward: it takes a predicate and decides which elements to keep or ignore. The following is a commonly used function that you will find all over the blogs that solve this problem.
public static <T> Predicate<T> distinctBy(Function<? super T, ?> keyExtractor) {
Map<Object, Boolean> seen = new ConcurrentHashMap<>();
return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
Here the distinctBy function returns a predicate (that will be used in filter operation). It maintains state about what it's seen previously and returns whether the given element was seen for the first time. (You can read further explanation about this here)
You can use this Stateful Filter like
Stream.of(dealer1, dealer2)
.flatMap(Collection::stream)
.filter(distinctBy(Cars::getClass))
.collect(Collectors.toList())
.forEach(cars -> System.out.println(cars));
So what did we actually do here?
We concatenated the two ArrayLists with flatMap, which gives us a single stream of the merged elements. (If you are new to the Stream API, see this Stream Concatenation article.)
We then use the filter() operation, which is fed the predicate returned by the distinctBy method.
A ConcurrentHashMap keeps track of which keys have already been seen. putIfAbsent returns null only the first time a key is added, so only the first element per key passes the filter.
The key is produced by getClass(), which returns the runtime Class object of each element and therefore distinguishes the subclasses.
We can then collect or iterate over the filtered list.
Try using a Map instead of a List, as in the following solution. This lets you store Cars instances by their type, so you will always have only one entry per class (this will be the most recently added instance, by the way).
public class CarsCollection {
Map<Class<? extends Cars>, Cars> coll = new HashMap<>();
public <T extends Cars> void add(Class<T> cls, T obj) {
coll.put(cls, obj);
}
}
public class Temp {
public static void main(String[] args) {
CarsCollection cars = new CarsCollection();
cars.add(Ford.class, new Ford("ford1"));
cars.add(Dodge.class, new Dodge("dodge1"));
cars.add(Dodge.class, new Dodge("dodge2"));
cars.add(Volkswagen.class, new Volkswagen("vw1"));
System.out.println(cars);
}
}
You could add all the elements of the first list to the result list (assuming there is no duplicate in the first list) and then loop through the second list, adding an element to the result only if no instance of the same class is already in it.
That could look something like this:
List<Cars> dealerMerged = new ArrayList<>(dealer1); // copy, so dealer1 itself is not modified
for (Cars car2 : dealer2) {
    boolean isAlreadyRepresented = false;
    for (Cars car1 : dealerMerged) {
        if (car1.getClass().equals(car2.getClass())) {
            isAlreadyRepresented = true;
            break; // no need to look further
        }
    }
    if (!isAlreadyRepresented) {
        dealerMerged.add(car2);
    }
}
Just use class of the object as key in the map. This example with Java stream does exactly that:
List<Cars> merged = Stream.of(dealer1, dealer2)
.flatMap(Collection::stream)
.collect( Collectors.toMap( Object::getClass, Function.identity(), (c1, c2) -> c1 ) )
.values()
.stream().collect( Collectors.toList() );
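The three-argument Collectors.toMap used above does not promise any particular order for the resulting values. If the order matters, a sketch along the following lines keeps the first instance seen for each class, in encounter order, by using a LinkedHashMap with putIfAbsent. The class name MergeByClass and the method onePerClass are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MergeByClass {
    /** Returns one element per runtime class; the first one encountered wins. */
    static <T> List<T> onePerClass(List<T> first, List<T> second) {
        // LinkedHashMap remembers insertion order, so the result order is predictable.
        Map<Class<?>, T> byClass = new LinkedHashMap<>();
        addAllByClass(byClass, first);
        addAllByClass(byClass, second);
        return new ArrayList<>(byClass.values());
    }

    private static <T> void addAllByClass(Map<Class<?>, T> byClass, List<T> source) {
        for (T item : source) {
            // putIfAbsent keeps the existing entry when the class is already present.
            byClass.putIfAbsent(item.getClass(), item);
        }
    }
}
```

With the dealer lists from the question this would yield ford1, dodge1 and vw1, in that order.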
I'm currently trying to write a method that determines whether an ArrayList (a2) contains an ArrayList (a1), given that both lists may contain duplicate values. (containsAll wouldn't work: if a list contains duplicate values, it returns true regardless of how many times each value occurs.)
This is what I have (I believe it would work, but I cannot use .remove within the for loop):
public boolean isSubset(ArrayList<Integer> a1, ArrayList<Integer> a2) {
Integer a1Size= a1.size();
for (Integer integer2:a2){
for (Integer integer1: a1){
if (integer1==integer2){
a1.remove(integer1);
a2.remove(integer2);
if (a1Size==0){
return true;
}
}
}
}
return false;
}
Thanks for the help.
Updated
I think the clearest statement of your question is in one of your comments:
Yes, the example " Example: [dog,cat,cat,bird] is a match for
containing [cat,dog] is false but containing [cat,cat,dog] is true?"
is exactly what I am trying to achieve.
So really, you are not looking for a "subset", because these are not sets. They can contain duplicate elements. What you are really saying is you want to see whether a1 contains all the elements of a2, in the same amounts.
One way to get to that is to count all the elements in both lists. We can get such a count using this method:
private Map<Integer, Integer> getCounter (List<Integer> list) {
Map<Integer, Integer> counter = new HashMap<>();
for (Integer item : list) {
counter.put (item, counter.containsKey(item) ? counter.get(item) + 1 : 1);
}
return counter;
}
We'll rename your method to be called containsAllWithCounts(), and it will use getCounter() as a helper. Your method will also accept List objects as its parameters, rather than ArrayList objects: it's a good practice to specify parameters as interfaces rather than implementations, so you are not tied to using ArrayList types.
With that in mind, we simply scan the counts of the items in a2 and see that they are the same in a1:
public boolean containsAllWithCounts(List<Integer> a1, List<Integer> a2) {
Map<Integer,Integer> counterA1 = getCounter(a1);
Map<Integer,Integer> counterA2 = getCounter(a2);
boolean containsAll = true;
for (Map.Entry<Integer, Integer> entry : counterA2.entrySet ()) {
Integer key = entry.getKey();
Integer count = entry.getValue();
containsAll &= counterA1.containsKey(key) && counterA1.get(key).equals(count);
if (!containsAll) break;
}
return containsAll;
}
If you like, I can rewrite this code to handle arbitrary types, not just Integer objects, using Java generics. Also, all the code can be shortened using Java 8 streams (which I originally used - see comments below). Just let me know in comments.
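As a rough idea of what the generic, Java 8 flavoured version mentioned above might look like, here is a sketch under the same rule as the Integer code: every element of a2 must occur in a1 exactly as many times as in a2. The class name CountedContains and the helper counts are illustrative names, not taken from the original answer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CountedContains {
    /** Counts how often each element occurs in the list. */
    static <T> Map<T, Integer> counts(List<T> list) {
        Map<T, Integer> counter = new HashMap<>();
        for (T item : list) {
            // merge inserts 1 for a new key, otherwise adds 1 to the stored count
            counter.merge(item, 1, Integer::sum);
        }
        return counter;
    }

    /** True if every element of a2 occurs in a1 with exactly the same count. */
    static <T> boolean containsAllWithCounts(List<T> a1, List<T> a2) {
        Map<T, Integer> counterA1 = counts(a1);
        Map<T, Integer> counterA2 = counts(a2);
        // allMatch stops at the first mismatch, like the break in the loop version
        return counterA2.entrySet().stream()
                .allMatch(e -> e.getValue().equals(counterA1.get(e.getKey())));
    }
}
```

Because the element type is generic, the result depends on the elements having consistent equals() and hashCode() implementations.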
If you want to remove elements from a list while iterating over it, you have two choices:
iterate over a copy
use a concurrent list implementation
See also:
http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#synchronizedList-java.util.List-
By the way, why don't you override the contains method?
Here you use a simple type like Integer, but what about when you are using a List<SomeComplexClass>?
example remove with iterator over copy:
List<Integer> list1 = new ArrayList<Integer>();
List<Integer> list2 = new ArrayList<Integer>();
List<Integer> listCopy = new ArrayList<>(list1);
Iterator<Integer> iterator1 = listCopy.iterator();
while(iterator1.hasNext()) {
Integer next1 = iterator1.next();
Iterator<Integer> iterator2 = list2.iterator();
while (iterator2.hasNext()) {
Integer next2 = iterator2.next();
if(next1.equals(next2)) list1.remove(next1);
}
}
See also this answer about iterators:
Concurrent Modification exception
Also, don't use the == operator to compare objects :) Use the equals method instead.
About the use of removeAll() and similar methods:
Keep in mind that many classes that implement the List interface don't support all of its optional methods, so you can end up with an UnsupportedOperationException. That is why I prefer a "low level" binary/linear/mixed search in this case.
And to compare objects of complex classes you will need to override the equals and hashCode methods.
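Besides iterating over a copy, the iterator's own remove() method is the usual way to delete elements during iteration without a ConcurrentModificationException. The sketch below removes one occurrence from a target list for each element of a second list. The names IteratorRemoval and removeEach are hypothetical, and the target list is assumed to support removal (for example an ArrayList).

```java
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

final class IteratorRemoval {
    /** Removes one occurrence from target for each element of toRemove. */
    static <T> void removeEach(List<T> target, List<T> toRemove) {
        for (T unwanted : toRemove) {
            Iterator<T> it = target.iterator();
            while (it.hasNext()) {
                // Objects.equals compares by value and tolerates null elements
                if (Objects.equals(it.next(), unwanted)) {
                    it.remove(); // safe: removal goes through the iterator
                    break;       // only one occurrence per unwanted element
                }
            }
        }
    }
}
```

Going through the iterator also sidesteps the remove(int index) versus remove(Object o) overload confusion that lists of Integer are prone to.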
If you want to remove the duplicate values, simply put the arraylist(s) into a HashSet. It will remove the duplicates based on equals() of your object.
- Olga
In Java, HashMap works by using hashCode to locate a bucket. Each bucket is a list of items residing in that bucket. The items are scanned, using equals for comparison. When adding items, the HashMap is resized once a certain load percentage is reached.
So, sometimes it will have to compare against a few items, but generally it's much closer to O(1) than O(n).
In short: there is no need to use more resources (memory) and pull in unnecessary classes, as the hash map's get method becomes expensive when many items end up in the same bucket.
hashCode -> put into bucket [if many items in bucket] -> get = linear scan
So what matters when removing items?
The cost of equals and hashCode, and using a proper algorithm to iterate.
I know this is maybe amateurish, but...
There is no need to remove the items from both lists, so just remove them from one list:
public boolean isSubset(ArrayList<Integer> a1, ArrayList<Integer> a2) {
for(Integer a1Int : a1){
for (int i = 0; i<a2.size();i++) {
if (a2.get(i).equals(a1Int)) {
a2.remove(i);
break;
}
}
if (a2.size()== 0) {
return true;
}
}
return false;
}
I am building a couple of methods which are supposed to create a cache of input strings, load them in to a list, and then determine the number of occurrences of each string in that list, ranking them in order of the most common elements.
The string, or elements themselves are coming from a JUnit test. It's calling up a method called
lookupDistance(dest)
where "dest" is a String (destination airport code), and the lookupDistance returns the distance between two airport codes....
There's the background. The problem is that I want to load all of the "dest" strings in to a cache. What's the best way to do that?
I have skeleton code that has a method called:
public List<String> mostCommonDestinations()
How would I add "dest" strings to the List in a transparent way? The JUnit test case is only calling lookupDistance(dest), so how can I also redirect those "dest" strings to the List in this method?
How would I then quantify the number of occurrences of each element and say, rank the top three or four?
Have a Map<String, Integer> destinations = new HashMap<>();
In lookupDistance(dest), do something like this (untested pseudocode):
Integer count = destinations.get(dest);
if (count == null) {
    destinations.put(dest, 1);
} else {
    destinations.put(dest, count + 1); // store the incremented count back in the map
}
This way, you count the occurrences of each dest.
Go through the Map and find the highest counts. That's a bit tricky. One approach might be:
List<Map.Entry<String, Integer>> list = new ArrayList<>();
list.addAll(destinations.entrySet());
// now you have a list of "entries", each of which maps from dest to its respective counter
// that list now has to be sorted
Collections.sort(list, comparator);
The comparator used in this invocation still has to be written. It takes two arguments, which are elements of the list, and compares them according to their counter value. The sort routine will do the rest.
Comparator<Map.Entry<String, Integer>> comparator = new Comparator<Map.Entry<String, Integer>>() {
    @Override
    public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
        // descending order, so the most common destinations come first
        return Integer.compare(b.getValue(), a.getValue());
    }
};
OK, so now we have a sorted List of entries from which you can pick the top 5 or so. I think that's pretty much it. All this looks more complicated than it should be, so I'm curious to see other solutions.
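The counting and ranking steps described above can also be sketched with Java 8's Map.merge and streams. The class DestinationStats and its record method are hypothetical names. In the asker's code, record(dest) would presumably be called from inside lookupDistance(dest), and the limit parameter is an addition to the skeleton's mostCommonDestinations() signature.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class DestinationStats {
    private final Map<String, Integer> counts = new HashMap<>();

    /** Call this once per lookup to count the destination. */
    void record(String dest) {
        counts.merge(dest, 1, Integer::sum);
    }

    /** Returns up to 'limit' destinations, most frequently requested first. */
    List<String> mostCommonDestinations(int limit) {
        return counts.entrySet().stream()
                .sorted((x, y) -> {
                    // higher counts first; ties are broken alphabetically
                    int byCount = Integer.compare(y.getValue(), x.getValue());
                    return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
                })
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

This sorts all entries on every call, which is fine for a handful of airport codes. For a very large map a bounded priority queue would avoid the full sort.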
You can add known destinations at startup and keep adding new strings to the cache as they arrive. That's one way. The other way is to cache strings as they are requested, keeping them for future requests. In that case your lookupDistance should also cache the string.
Start by making a small class that contains a HashMap. The key would be your destination string, and the value can either be an object, if you want to keep several pieces of information, or just a number specifying how many times that string was used. I would recommend using a data object.
Please note that the code below is just to give you an idea; treat it as a sketch.
class Cache {
    private static final Cache INSTANCE = new Cache(); // single shared cache
    private final Map<String, CacheObject> entries = new HashMap<>();
    public void add(String key, CacheObject value) { entries.put(key, value); }
    public CacheObject lookup(String key) { return entries.get(key); }
    public CacheObject remove(String key) { return entries.remove(key); }
    public static Cache getInstance() { return INSTANCE; }
}
class CacheObject {
    public int lookupCount;
    public long lastUsed; // e.g. milliseconds since the epoch
}
In your lookupDistance you can simply do
CacheObject entry = Cache.getInstance().lookup(dest);
if (entry == null) {
    entry = new CacheObject();
    Cache.getInstance().add(dest, entry);
}
entry.lookupCount++; // count every lookup, not just the first one
entry.lastUsed = System.currentTimeMillis();
I have an ArrayList with a number of records, and one column contains gas names such as CO2, CH4, SO2, etc. Now I want to retrieve only the distinct (unique) gas names from the ArrayList, without repetition. How can it be done?
You should use a Set. A Set is a Collection that contains no duplicates.
If you have a List that contains duplicates, you can get the unique entries like this:
List<String> gasList = // create list with duplicates...
Set<String> uniqueGas = new HashSet<String>(gasList);
System.out.println("Unique gas count: " + uniqueGas.size());
NOTE: This HashSet constructor identifies duplicates by invoking the elements' equals() methods.
You can use Java 8 Stream API.
Method distinct is an intermediate operation that filters the stream and allows only distinct values (by default using the Object::equals method) to pass to the next operation.
I wrote an example below for your case,
// Create the list with duplicates.
List<String> listAll = Arrays.asList("CO2", "CH4", "SO2", "CO2", "CH4", "SO2", "CO2", "CH4", "SO2");
// Create a list with the distinct elements using stream.
List<String> listDistinct = listAll.stream().distinct().collect(Collectors.toList());
// Display them on the terminal using Stream::collect with a built-in Collector.
String collectAll = listAll.stream().collect(Collectors.joining(", "));
System.out.println(collectAll); //=> CO2, CH4, SO2, CO2, CH4 etc..
String collectDistinct = listDistinct.stream().collect(Collectors.joining(", "));
System.out.println(collectDistinct); //=> CO2, CH4, SO2
I hope I understand your question correctly: assuming that the values are of type String, the most efficient way is probably to convert to a HashSet and iterate over it:
ArrayList<String> values = ... //Your values
HashSet<String> uniqueValues = new HashSet<>(values);
for (String value : uniqueValues) {
... //Do something
}
You can use this to make a list unique:
List<String> listWithDuplicateValues = new ArrayList<>();
listWithDuplicateValues.add("first");
listWithDuplicateValues.add("first");
listWithDuplicateValues.add("second");
// Collectors.toList() does not promise an ArrayList, so keep the declared type as List
List<String> uniqueList = listWithDuplicateValues.stream().distinct().collect(Collectors.toList());
ArrayList values = ... // your values
Set uniqueValues = new HashSet(values); //now unique
Here's a straightforward way without resorting to custom comparators or stuff like that:
Set<String> gasNames = new HashSet<String>();
List<YourRecord> records = ...;
for(YourRecord record : records) {
gasNames.add(record.getGasName());
}
// now gasNames is a set of unique gas names, which you could operate on:
List<String> sortedGasses = new ArrayList<String>(gasNames);
Collections.sort(sortedGasses);
Note: Using a TreeSet instead of a HashSet would keep the names sorted as they are added, so the Collections.sort call above could be skipped. A TreeSet has O(log n) add and lookup instead of HashSet's expected O(1), though, so it's often better, and rarely worse, to use a HashSet even when sorting is needed.
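To make the two routes from the note concrete, here is a small sketch that produces the same sorted list of unique names either way. SortedGasNames and its two method names are invented for this example, and the input is assumed to contain no null elements (a TreeSet with natural ordering rejects them).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

final class SortedGasNames {
    /** Deduplicate with a HashSet, then sort once at the end. */
    static List<String> viaHashSet(Collection<String> names) {
        List<String> sorted = new ArrayList<>(new HashSet<>(names));
        Collections.sort(sorted);
        return sorted;
    }

    /** Deduplicate with a TreeSet, which keeps its elements sorted at all times. */
    static List<String> viaTreeSet(Collection<String> names) {
        return new ArrayList<>(new TreeSet<>(names));
    }
}
```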
When I was doing the same query, I had hard time adjusting the solutions to my case, though all the previous answers have good insights.
Here is a solution when one has to acquire a list of unique objects, NOT strings.
Let's say, one has a list of Record object. Record class has only properties of type String, NO property of type int.
Here implementing hashCode() becomes difficult as hashCode() needs to return an int.
The following is a sample Record Class.
public class Record{
String employeeName;
String employeeGroup;
Record(String name, String group){
employeeName= name;
employeeGroup = group;
}
public String getEmployeeName(){
return employeeName;
}
public String getEmployeeGroup(){
return employeeGroup;
}
@Override
public boolean equals(Object o) {
    if (o instanceof Record) {
        Record other = (Record) o;
        // Objects.equals also copes with null fields
        return Objects.equals(other.employeeGroup, employeeGroup)
                && Objects.equals(other.employeeName, employeeName);
    }
    return false;
}
@Override
public int hashCode() { // need not be unique, but equal records must return equal codes
    int hash = 3; // any starting value works; a small prime is conventional
    // the multiplier could be any odd prime, such as 59, 79 or 89
    hash = 89 * hash + Objects.hashCode(this.employeeGroup);
    hash = 89 * hash + Objects.hashCode(this.employeeName);
    return hash;
}
}
As suggested earlier by others, the class needs to override both the equals() and the hashCode() method to be able to use HashSet.
Now, let's say, the list of Records is allRecord(List<Record> allRecord).
Set<Record> distinctRecords = new HashSet<>();
for(Record rc: allRecord){
distinctRecords.add(rc);
}
This will add only the distinct Records to the HashSet, distinctRecords.
Hope this helps.
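public static <T> List<T> getUniqueValues(List<T> input) {
    // LinkedHashSet drops duplicates while keeping the original order
    return new ArrayList<>(new LinkedHashSet<>(input));
}
Don't forget to implement equals() (and a matching hashCode()) on your element type first.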
If you have a list of some kind of object (bean), you can do this:
List<aBean> gasList = createDuplicateGasBeans();
Set<aBean> uniqueGas = new HashSet<aBean>(gasList);
as Mathias Schwarz said above, but you have to give your aBean class the methods hashCode() and equals(Object obj). In Eclipse these can be generated easily with the dedicated menu entry 'Generate hashCode() and equals()' (while in the bean class).
The Set will use the overridden methods to decide which objects are equal.
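As an illustration of what such a bean might look like, here is a sketch with hand-written equals() and hashCode() based on java.util.Objects. The class GasBean and its two fields are assumptions made for this example; the original aBean is not shown in the answer.

```java
import java.util.Objects;

final class GasBean {
    private final String name; // e.g. "CO2"
    private final String unit; // hypothetical second property, e.g. "ppm"

    GasBean(String name, String unit) {
        this.name = name;
        this.unit = unit;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof GasBean)) return false;
        GasBean other = (GasBean) obj;
        return Objects.equals(name, other.name) && Objects.equals(unit, other.unit);
    }

    @Override
    public int hashCode() {
        // must use the same fields as equals()
        return Objects.hash(name, unit);
    }
}
```

Two beans with the same name and unit then collapse into a single entry when added to a HashSet.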
OK, so here is my issue. I have two HashSets, and I use the removeAll method to delete the values that exist in one set from the other.
Prior to calling the method, I add the values to the Sets. I call .toUpperCase() on each String before adding because the values are of different cases in both lists. There is no rhyme or reason to the case.
Once I call removeAll, I need to have the original cases back for the values that are left in the Set. Is there an efficient way of doing this without running through the original list and using compareToIgnoreCase?
Example:
List1:
"BOB"
"Joe"
"john"
"MARK"
"dave"
"Bill"
List2:
"JOE"
"MARK"
"DAVE"
After this, create a separate HashSet for each List using toUpperCase() on Strings. Then call removeAll.
Set1.removeAll(set2);
Set1:
"BOB"
"JOHN"
"BILL"
I need to get the list to look like this again:
"BOB"
"john"
"Bill"
Any ideas would be much appreciated. I know it is poor, there should be a standard for the original list but that is not for me to decide.
In my original answer, I unthinkingly suggested using a Comparator, but this causes the TreeSet to violate the equals contract and is a bug waiting to happen:
// Don't do this:
Set<String> setA = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
setA.add("hello");
setA.add("Hello");
System.out.println(setA);
Set<String> setB = new HashSet<String>();
setB.add("HELLO");
// Bad code; violates symmetry requirement
System.out.println(setB.equals(setA) == setA.equals(setB));
It is better to use a dedicated type:
public final class CaselessString {
private final String string;
private final String normalized;
private CaselessString(String string, Locale locale) {
this.string = string;
normalized = string.toUpperCase(locale);
}
@Override public String toString() { return string; }
@Override public int hashCode() { return normalized.hashCode(); }
@Override public boolean equals(Object obj) {
if (obj instanceof CaselessString) {
return ((CaselessString) obj).normalized.equals(normalized);
}
return false;
}
public static CaselessString as(String s, Locale locale) {
return new CaselessString(s, locale);
}
public static CaselessString as(String s) {
return as(s, Locale.ENGLISH);
}
// TODO: probably best to implement CharSequence for convenience
}
This code is less likely to cause bugs:
Set<CaselessString> set1 = new HashSet<CaselessString>();
set1.add(CaselessString.as("Hello"));
set1.add(CaselessString.as("HELLO"));
Set<CaselessString> set2 = new HashSet<CaselessString>();
set2.add(CaselessString.as("hello"));
System.out.println("1: " + set1);
System.out.println("2: " + set2);
System.out.println("equals: " + set1.equals(set2));
This is, unfortunately, more verbose.
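Applied to the lists from the question, the dedicated-type idea might be used roughly as follows. To keep the sketch self-contained it defines a compact stand-in called Caseless rather than reusing CaselessString. The names CaselessRemoval and removeIgnoringCase are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class CaselessRemoval {
    /** Compact stand-in for CaselessString: equality ignores case, the original text is kept. */
    static final class Caseless {
        final String original;
        final String normalized;
        Caseless(String s) {
            original = s;
            normalized = s.toUpperCase(Locale.ENGLISH);
        }
        @Override public int hashCode() { return normalized.hashCode(); }
        @Override public boolean equals(Object obj) {
            return obj instanceof Caseless && ((Caseless) obj).normalized.equals(normalized);
        }
    }

    /** Returns the entries of list1, in their original case, that do not occur in list2. */
    static List<String> removeIgnoringCase(List<String> list1, List<String> list2) {
        Set<Caseless> remaining = new LinkedHashSet<>(); // keeps list1's order
        for (String s : list1) remaining.add(new Caseless(s));
        Set<Caseless> unwanted = new HashSet<>();
        for (String s : list2) unwanted.add(new Caseless(s));
        remaining.removeAll(unwanted);
        List<String> result = new ArrayList<>();
        for (Caseless c : remaining) result.add(c.original);
        return result;
    }
}
```

Because both sets use the same equals() and hashCode(), removeAll behaves consistently whichever set is larger.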
It could be done by:
Moving the content of your lists into case-insensitive TreeSets,
then removing all common Strings case-insensitively thanks to TreeSet#removeAll(Collection<?> c)
and finally relying on the fact that ArrayList#retainAll(Collection<?> c) will iterate over the elements of the list and for each element it will call contains(Object o) on the provided collection to know whether the value should be kept or not and here as the collection is case-insensitive, we will keep only the Strings that match case-insensitively with what we have in the provided TreeSet instance.
The corresponding code:
List<String> list1 = new ArrayList<>(
Arrays.asList("BOB", "Joe", "john", "MARK", "dave", "Bill")
);
List<String> list2 = Arrays.asList("JOE", "MARK", "DAVE");
// Add all values of list1 in a case insensitive collection
Set<String> set1 = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
set1.addAll(list1);
// Add all values of list2 in a case insensitive collection
Set<String> set2 = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
set2.addAll(list2);
// Remove all common Strings ignoring case
set1.removeAll(set2);
// Keep in list1 only the remaining Strings ignoring case
list1.retainAll(set1);
for (String s : list1) {
System.out.println(s);
}
Output:
BOB
john
Bill
NB 1: It is important to put the content of the second list into a TreeSet too, especially if we don't know its size, because the behavior of TreeSet#removeAll(Collection<?> c) depends on the sizes of both collections. If the current collection is strictly bigger than the provided collection, it calls remove(Object o) on the current collection for each element of the provided one; in that case the provided collection could be a plain list. Otherwise it calls contains(Object o) on the provided collection to decide whether a given element should be removed, so if that collection is not case-insensitive, we won't get the expected result.
NB 2: The behavior of ArrayList#retainAll(Collection<?> c) described above is the same as that of the default implementation of retainAll(Collection<?> c) in AbstractCollection, so this approach will work with any collection whose retainAll(Collection<?> c) implementation behaves the same way.
You can use a HashMap whose keys are the upper-case strings and whose values are the original mixed-case strings.
Keys of a HashMap are unique and you can get a set of them using HashMap.keySet();
to retrieve the original case, it's as simple as HashMap.get("UPPERCASENAME").
And according to the documentation:
Returns a set view of the keys
contained in this map. The set is
backed by the map, so changes to the
map are reflected in the set, and
vice-versa. The set supports element
removal, which removes the
corresponding mapping from this map,
via the Iterator.remove, Set.remove,
removeAll, retainAll, and clear
operations. It does not support the
add or addAll operations.
So HashMap.keySet().removeAll will affect the HashMap :)
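A minimal sketch of this map-based idea, assuming the two lists from the question. The names UpperCaseKeys and removeIgnoringCase are made up here, and a LinkedHashMap is used instead of a plain HashMap only so that the output keeps the order of the first list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class UpperCaseKeys {
    /** Removes list2's entries from list1 ignoring case, returning list1's original spellings. */
    static List<String> removeIgnoringCase(List<String> list1, List<String> list2) {
        // upper-case key -> original spelling
        Map<String, String> byUpper = new LinkedHashMap<>();
        for (String s : list1) {
            byUpper.put(s.toUpperCase(Locale.ROOT), s);
        }
        Set<String> toRemove = new HashSet<>();
        for (String s : list2) {
            toRemove.add(s.toUpperCase(Locale.ROOT));
        }
        // the key set is a view: removing keys removes the mappings from the map
        byUpper.keySet().removeAll(toRemove);
        return new ArrayList<>(byUpper.values());
    }
}
```

If list1 contains the same name in several spellings, only the last spelling survives, because later put calls overwrite earlier ones.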
EDIT: use McDowell's solution. I overlooked the fact that you didn't actually need the letters to be upper case :P
This would be an interesting one to solve using google-collections. You could have a constant Function like so:
private static final Function<String, String> TO_UPPER = new Function<String, String>() {
    public String apply(String input) {
        return input.toUpperCase();
    }
};
and then what you're after could be done something like this:
final Collection<String> toRemove = Collections2.transform(list2, TO_UPPER);
// list1 is a List, so Collections2.filter is used here rather than Sets.filter
Collection<String> kept = Collections2.filter(list1, new Predicate<String>() {
    public boolean apply(String input) {
        return !toRemove.contains(input.toUpperCase());
    }
});
That is:
Build an upper-case-only version of the 'to discard' list
Apply a filter to the original list, retaining only those items whose uppercased value is not in the upper-case-only list.
Note that the output of Collections2.transform isn't an efficient Set implementation, so if you're dealing with a lot of data and the cost of probing that list will hurt you, you can instead use
Set<String> toRemove = Sets.newHashSet(Collections2.transform(list2, TO_UPPER));
which will restore an efficient lookup, returning the filtering to O(n) instead of O(n^2).
As far as I know, HashSets use the object's hashCode method (together with equals) to distinguish elements from each other.
You would therefore override these methods in your object so that they handle case the way you need.
If you're really using String, you cannot override them, because you cannot extend the String class.
Therefore you need to create your own class containing a String attribute that you fill with your content. You might want a getValue() and setValue(String) method in order to modify the string.
Then you can add instances of your own class to the HashSet.
This should solve your problem.
regards