Get unique values from ArrayList in Java - java

I have an ArrayList with a number of records, and one column contains gas names such as CO2, CH4, SO2, etc. Now I want to retrieve only the distinct (unique) gas names from the ArrayList, without repetition. How can it be done?

You should use a Set. A Set is a Collection that contains no duplicates.
If you have a List that contains duplicates, you can get the unique entries like this:
List<String> gasList = // create list with duplicates...
Set<String> uniqueGas = new HashSet<String>(gasList);
System.out.println("Unique gas count: " + uniqueGas.size());
NOTE: This HashSet constructor identifies duplicates by invoking the elements' hashCode() and equals() methods.

You can use Java 8 Stream API.
Method distinct is an intermediate operation that filters the stream and allows only distinct values (by default using the Object::equals method) to pass to the next operation.
I wrote an example below for your case,
// Create the list with duplicates.
List<String> listAll = Arrays.asList("CO2", "CH4", "SO2", "CO2", "CH4", "SO2", "CO2", "CH4", "SO2");
// Create a list with the distinct elements using stream.
List<String> listDistinct = listAll.stream().distinct().collect(Collectors.toList());
// Display them on the terminal using Stream::collect with a built-in Collector.
String collectAll = listAll.stream().collect(Collectors.joining(", "));
System.out.println(collectAll); //=> CO2, CH4, SO2, CO2, CH4 etc..
String collectDistinct = listDistinct.stream().collect(Collectors.joining(", "));
System.out.println(collectDistinct); //=> CO2, CH4, SO2

I hope I understand your question correctly: assuming that the values are of type String, the most efficient way is probably to convert to a HashSet and iterate over it:
ArrayList<String> values = ... //Your values
HashSet<String> uniqueValues = new HashSet<>(values);
for (String value : uniqueValues) {
... //Do something
}

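You can use this to make a list unique:
ArrayList<String> listWithDuplicateValues = new ArrayList<>();
listWithDuplicateValues.add("first");
listWithDuplicateValues.add("first");
listWithDuplicateValues.add("second");
// Collectors.toList() does not promise an ArrayList, so ask for one explicitly instead of casting.
ArrayList<String> uniqueList = listWithDuplicateValues.stream().distinct().collect(Collectors.toCollection(ArrayList::new));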

ArrayList values = ... // your values
Set uniqueValues = new HashSet(values); //now unique

Here's a straightforward way without resorting to custom comparators or stuff like that:
Set<String> gasNames = new HashSet<String>();
List<YourRecord> records = ...;
for(YourRecord record : records) {
gasNames.add(record.getGasName());
}
// now gasNames is a set of unique gas names, which you could operate on:
List<String> sortedGasses = new ArrayList<String>(gasNames);
Collections.sort(sortedGasses);
Note: Using a TreeSet instead of a HashSet would yield the names already sorted, so the Collections.sort call above could be skipped. However, TreeSet operations are O(log n) rather than the expected O(1) of HashSet, so it's often better, and rarely worse, to use a HashSet even when sorting is needed.
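For comparison, the TreeSet variant mentioned in the note could look roughly like this. It is only a sketch: the class name SortedGasNames and the helper uniqueSorted are made up for illustration, and the names are assumed to be plain Strings sorted in their natural order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SortedGasNames {
    // Collects the unique names and keeps them in natural (alphabetical) order.
    static Set<String> uniqueSorted(List<String> names) {
        return new TreeSet<>(names);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("SO2", "CO2", "CH4", "CO2", "SO2");
        System.out.println(uniqueSorted(names)); // [CH4, CO2, SO2]
    }
}
```

Whether this beats HashSet plus Collections.sort depends on the data size; for small lists the difference is unlikely to matter.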

When I was dealing with the same problem, I had a hard time adjusting the solutions to my case, though all the previous answers have good insights.
Here is a solution for when one has to obtain a list of unique objects, NOT strings.
Let's say one has a list of Record objects. The Record class has only properties of type String, NO property of type int.
Here implementing hashCode() looks difficult at first, as hashCode() needs to return an int.
The following is a sample Record class.
public class Record{
String employeeName;
String employeeGroup;
Record(String name, String group){
employeeName= name;
employeeGroup = group;
}
public String getEmployeeName(){
return employeeName;
}
public String getEmployeeGroup(){
return employeeGroup;
}
@Override
public boolean equals(Object o){
if(o instanceof Record){
if (((Record) o).employeeGroup.equals(employeeGroup) &&
((Record) o).employeeName.equals(employeeName)){
return true;
}
}
return false;
}
@Override
public int hashCode() { //equal objects must return equal codes; the code need not be unique
int hash = 3; //this could be anything, but I would choose a prime (e.g. 5, 7, 11)
//again, the multiplier could be anything like 59, 79, 89, any prime
hash = 89 * hash + Objects.hashCode(this.employeeGroup);
hash = 89 * hash + Objects.hashCode(this.employeeName); //include every field that equals() compares
return hash;
}
}
As suggested earlier by others, the class needs to override both the equals() and the hashCode() method to be able to use HashSet.
Now, let's say, the list of Records is allRecord(List<Record> allRecord).
Set<Record> distinctRecords = new HashSet<>();
for(Record rc: allRecord){
distinctRecords.add(rc);
}
This will add only the distinct Records to the HashSet, distinctRecords.
Hope this helps.
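As a shorter alternative to hand-rolling the prime arithmetic, java.util.Objects can build both methods from the fields. The following is a sketch under assumptions, not the answer's own code: EmployeeRecord is a hypothetical stand-in for the Record class above, and LinkedHashSet is chosen only so the first-seen order is kept.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class DistinctRecords {
    // Hypothetical stand-in for the Record class shown above.
    static final class EmployeeRecord {
        final String employeeName;
        final String employeeGroup;

        EmployeeRecord(String name, String group) {
            this.employeeName = name;
            this.employeeGroup = group;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof EmployeeRecord)) return false;
            EmployeeRecord other = (EmployeeRecord) o;
            // Objects.equals is null-safe, unlike calling equals on the field directly.
            return Objects.equals(employeeName, other.employeeName)
                    && Objects.equals(employeeGroup, other.employeeGroup);
        }

        @Override
        public int hashCode() {
            // Combines the same fields that equals() compares.
            return Objects.hash(employeeName, employeeGroup);
        }
    }

    // Returns the distinct records in first-seen order.
    static Set<EmployeeRecord> distinct(List<EmployeeRecord> all) {
        return new LinkedHashSet<>(all);
    }

    public static void main(String[] args) {
        List<EmployeeRecord> all = Arrays.asList(
                new EmployeeRecord("Ann", "Ops"),
                new EmployeeRecord("Ann", "Ops"),
                new EmployeeRecord("Bob", "Ops"));
        System.out.println(distinct(all).size()); // 2
    }
}
```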

public static <T> List<T> getUniqueValues(List<T> input) {
return new ArrayList<>(new LinkedHashSet<>(input));
}
Don't forget to implement your equals() and hashCode() methods first.

If you have a list of some kind of object (bean), you can do this:
List<aBean> gasList = createDuplicateGasBeans();
Set<aBean> uniqueGas = new HashSet<aBean>(gasList);
as Mathias Schwarz said above, but you have to provide your aBean with the methods hashCode() and equals(Object obj). In Eclipse these can be generated easily via the dedicated menu 'Generate hashCode() and equals()' (while in the bean class).
The Set will call the overridden methods to tell equal objects apart.

Related

Java: See if ArrayList contains ArrayList with duplicate values

I'm currently trying to create a method that determine if an ArrayList(a2) contains an ArrayList(a1), given that both lists contain duplicate values (containsAll wouldn't work as if an ArrayList contains duplicate values, then it would return true regardless of the quantity of the values)
This is what I have: (I believe it would work however I cannot use .remove within the for loop)
public boolean isSubset(ArrayList<Integer> a1, ArrayList<Integer> a2) {
Integer a1Size= a1.size();
for (Integer integer2:a2){
for (Integer integer1: a1){
if (integer1==integer2){
a1.remove(integer1);
a2.remove(integer2);
if (a1Size==0){
return true;
}
}
}
}
return false;
}
Thanks for the help.
Updated
I think the clearest statement of your question is in one of your comments:
Yes, the example " Example: [dog,cat,cat,bird] is a match for
containing [cat,dog] is false but containing [cat,cat,dog] is true?"
is exactly what I am trying to achieve.
So really, you are not looking for a "subset", because these are not sets. They can contain duplicate elements. What you are really saying is you want to see whether a1 contains all the elements of a2, in the same amounts.
One way to get to that is to count all the elements in both lists. We can get such a count using this method:
private Map<Integer, Integer> getCounter (List<Integer> list) {
Map<Integer, Integer> counter = new HashMap<>();
for (Integer item : list) {
counter.put (item, counter.containsKey(item) ? counter.get(item) + 1 : 1);
}
return counter;
}
We'll rename your method to be called containsAllWithCounts(), and it will use getCounter() as a helper. Your method will also accept List objects as its parameters, rather than ArrayList objects: it's a good practice to specify parameters as interfaces rather than implementations, so you are not tied to using ArrayList types.
With that in mind, we simply scan the counts of the items in a2 and see that they are the same in a1:
public boolean containsAllWithCounts(List<Integer> a1, List<Integer> a2) {
Map<Integer,Integer> counterA1 = getCounter(a1);
Map<Integer,Integer> counterA2 = getCounter(a2);
boolean containsAll = true;
for (Map.Entry<Integer, Integer> entry : counterA2.entrySet ()) {
Integer key = entry.getKey();
Integer count = entry.getValue();
containsAll &= counterA1.containsKey(key) && counterA1.get(key).equals(count);
if (!containsAll) break;
}
return containsAll;
}
If you like, I can rewrite this code to handle arbitrary types, not just Integer objects, using Java generics. Also, all the code can be shortened using Java 8 streams (which I originally used - see comments below). Just let me know in comments.
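A generic version along the lines offered above might look like the following. This is only a sketch: the class name CountedContains and the helper name countOccurrences are made up here, and it keeps the same "equal counts" rule as the Integer version.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CountedContains {
    // Counts how many times each element occurs in the list.
    static <T> Map<T, Integer> countOccurrences(List<T> list) {
        Map<T, Integer> counter = new HashMap<>();
        for (T item : list) {
            counter.merge(item, 1, Integer::sum);
        }
        return counter;
    }

    // True when every element of a2 occurs in a1 exactly as often as it does in a2.
    static <T> boolean containsAllWithCounts(List<T> a1, List<T> a2) {
        Map<T, Integer> counterA1 = countOccurrences(a1);
        for (Map.Entry<T, Integer> entry : countOccurrences(a2).entrySet()) {
            // A missing key yields null, which never equals a count.
            if (!entry.getValue().equals(counterA1.get(entry.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> pets = Arrays.asList("dog", "cat", "cat", "bird");
        System.out.println(containsAllWithCounts(pets, Arrays.asList("cat", "dog")));        // false
        System.out.println(containsAllWithCounts(pets, Arrays.asList("cat", "cat", "dog"))); // true
    }
}
```

The element type needs consistent equals() and hashCode() implementations for the HashMap counting to work.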
If you want to remove elements from a list while iterating over it, you have 2 choices:
iterate over a copy
use a concurrent list implementation
See also:
http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#synchronizedList-java.util.List-
By the way, why don't you override the contains method?
Here you use a simple object like Integer; what about when you are using a List<SomeComplexClass>?
Example of removal while iterating over a copy:
List<Integer> list1 = new ArrayList<Integer>();
List<Integer> list2 = new ArrayList<Integer>();
List<Integer> listCopy = new ArrayList<>(list1);
Iterator<Integer> iterator1 = listCopy.iterator();
while(iterator1.hasNext()) {
Integer next1 = iterator1.next();
Iterator<Integer> iterator2 = list2.iterator();
while (iterator2.hasNext()) {
Integer next2 = iterator2.next();
if(next1.equals(next2)) list1.remove(next1);
}
}
See also this answer about iterators:
Concurrent Modification exception
Also, don't use the == operator to compare objects :) Use the equals method instead.
About the use of removeAll() and similar methods:
Keep in mind that many classes that implement the List interface don't support all of its methods, so you can end up with an UnsupportedOperationException. Thus I prefer a "low level" binary/linear/mixed search in this case.
For comparing objects of complex classes you will need to override the equals and hashCode methods.
If you want to remove the duplicate values, simply put the arraylist(s) into a HashSet. It will remove the duplicates based on equals() of your object.
- Olga
In Java, HashMap works by using hashCode to locate a bucket. Each bucket is a list of items residing in that bucket. The items are scanned, using equals for comparison. When adding items, the HashMap is resized once a certain load percentage is reached.
So, sometimes it will have to compare against a few items, but generally it's much closer to O(1) than O(n).
In short, there is no need to use more resources (memory) and "harness" unnecessary classes, as the hash map's get method gets expensive when many items land in the same bucket:
hashCode -> put to bucket [if many items in bucket] -> get = linear scan
So what counts when removing items?
The complexity of equals and hashCode, and the use of a proper algorithm to iterate.
I know this is maybe amateurish, but...
There is no need to remove the items from both lists, so, just take it from the one list
public boolean isSubset(ArrayList<Integer> a1, ArrayList<Integer> a2) {
for(Integer a1Int : a1){
for (int i = 0; i<a2.size();i++) {
if (a2.get(i).equals(a1Int)) {
a2.remove(i);
break;
}
}
if (a2.size()== 0) {
return true;
}
}
return false;
}
If you want to remove the duplicate values, simply put the arraylist(s) into a HashSet. It will remove the duplicates based on equals() of your object.

Remove array elements belonging to the same group leaving only 1 of the group?

I have an array myArray[] of objects MyThing which contains X elements. I need to remove elements belonging to the same group, but leaving one representative of each group.
MyThing class has a field groupId
public class MyThing {
private int groupId;
//...other fields
public int getGroupId(){return groupId;}
//getter and setter
}
So I have to compare groupId integer value of array elements (myArray[x].getGroupId()) and remove all element belonging the same group except the first such element in the array.
This way I will get an array of unique elements with only 1 from the same group. For example, if I have an array with a.getGroupId()=1, b.getGroupId()=2, c.getGroupId()=1 after purification, the array will contain only {a,b}, and c will be removed since it's of the same group as a.
Because this is the custom object, I cannot use Set<T>.
Any ideas?
PS. please let me know if I explained this clearly since it's kind of confusing.
A set by definition doesn't contain any duplicates. A set determines whether two items are alike by using either the object's equals()/compareTo(..) method or a Comparator. If you only want unique items in your set, implementing the Comparable interface and overriding equals() is what you want to do. BUT in your case, you're only interested in objects in unique groups, so it's better to create a custom Comparator for the occasion, which you then supply to the Set, telling it to use that instead of the "natural ordering".
Set<MyThing> myThings = new TreeSet<>(new Comparator<MyThing>() {
@Override
public int compare(MyThing o1, MyThing o2)
{
return o1.getGroupId() - o2.getGroupId();
}
});
myThings.addAll(Arrays.asList(myArray));
After creating the set, you add your entire array to it, using the convenience method addAll(..).
(How the comparator sorts your objects is completely up to you to decide.)
You could loop through your array and use a set to keep track of which IDs have already occurred. Then if an ID was already added to the set, remove the element from the array:
Set<Integer> uniqueIDs = new HashSet<Integer>();
for(MyThing thing : myArray){
int groupID = thing.getGroupId();
if(!uniqueIDs.add(groupID)){
// DUPLICATE, REMOVE IT
}
}
Use a TreeSet and a custom Comparator class that inspects your objects and treats two with the same group as equal.
http://docs.oracle.com/javase/6/docs/api/java/util/TreeSet.html
Algorithm pseudocode:
Create TreeSet
Add all array elements to TreeSet
Convert TreeSet back to array
For a sample implementation: see Martin's answer
I just rewrote Martin's solution because its comparator is broken: the subtraction might overflow.
Set<MyThing> myThings = new TreeSet<>(new Comparator<MyThing>() {
@Override
public int compare(MyThing o1, MyThing o2) {
return Integer.compare(o1.getGroupId(), o2.getGroupId());
}
});
myThings.addAll(Arrays.asList(myArray));
Why don't you try something like (semi-pseudo-code here):
List<Integer> uniqGroups = new ArrayList<Integer>();
for (int i = 0; i < myArray.length; i++) {
int groupId = myArray[i].getGroupId();
if (!uniqGroups.contains(groupId)) {
// Hasn't been seen before, keep around
uniqGroups.add(groupId);
}
else {
// Already seen, remove or otherwise clean up the array
myArray[i] = null;
}
}
As you just need to distinguish your objects by groupId, you might override the hashCode() and equals() methods in your class:
class MyThing {
private int groupId;
public int getGroupId(){return groupId;}
// new code to add...
@Override
public int hashCode() {
return groupId;
}
@Override
public boolean equals(Object o) {
return (o instanceof MyThing
&& (groupId == ((MyThing)o).groupId));
}
}
and then, use a HashSet<MyThing> class to remove the MyThing objects in myArray with duplicated groupId:
myArray = new HashSet<MyThing>(Arrays.asList(myArray)).toArray(new MyThing[0]);
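Note that a plain HashSet does not say which element of each group survives or in what order. If the first element of each group must be kept, as the question asks, one possible sketch (not taken from any of the answers) is a LinkedHashMap keyed by group id. The MyThing below is a hypothetical cut-down version with a label field added purely for the demo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FirstPerGroup {
    // Hypothetical minimal MyThing: a label (for the demo only) plus a group id.
    static final class MyThing {
        final String label;
        final int groupId;

        MyThing(String label, int groupId) {
            this.label = label;
            this.groupId = groupId;
        }

        int getGroupId() { return groupId; }
    }

    // Keeps the first element seen for each group id, in original array order.
    static MyThing[] firstPerGroup(MyThing[] things) {
        Map<Integer, MyThing> byGroup = new LinkedHashMap<>();
        for (MyThing thing : things) {
            byGroup.putIfAbsent(thing.getGroupId(), thing);
        }
        return byGroup.values().toArray(new MyThing[0]);
    }

    public static void main(String[] args) {
        MyThing[] myArray = {
                new MyThing("a", 1), new MyThing("b", 2), new MyThing("c", 1)
        };
        for (MyThing t : firstPerGroup(myArray)) {
            System.out.println(t.label); // a, then b
        }
    }
}
```

This leaves equals() and hashCode() of MyThing untouched, which may matter if other code relies on identity semantics.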

List that can only contain a value once

This is most certainly a noob question, but I haven't been able to find a good answer on Google or here, so I have to ask:
What kinda list should I use in Java, when I just want a value to be added once?
The problem is that I'm doing a web technology project in college (a webshop), and I have this cloud I connect to. I can then request the customer IDs of those who bought items in my shop. What I want to do is extract these IDs and add them to a list. But when extracting them I get the ID returned for every item they have bought, so I want a list that can check: "This value is already in this list, do nothing", or "This ID is not in the list, let's add the ID".
Is there a list that can do this, or a way to do it with a list without it getting too complicated?
You want a Set, this is the data structure that prevents duplicates. This is a Collection so you can define a function like so:
public Collection<MyObject> foo()
{
return new HashSet<MyObject>();
}
and at a later time change the return internally to this:
public Collection<MyObject> foo()
{
return new ArrayList<MyObject>();
}
And your API won't break.
A Set contains every value only once.
Though, the problem with HashSet is that the order in which the elements were added gets lost. So if you want to preserve the order of elements, I would suggest using a LinkedHashSet.
With a LinkedHashSet, iterating over the elements will return them in the order they were inserted.
public static void main(String[] args) {
Set<String> hashSet = new HashSet<>();
hashSet.add("first");
hashSet.add("second");
hashSet.add("third");
for (String s : hashSet) {
System.out.println(s); // no particular order
}
Set<String> linkedHashSet = new LinkedHashSet<>();
linkedHashSet.add("first");
linkedHashSet.add("second");
linkedHashSet.add("third");
for (String s : linkedHashSet) {
System.out.println(s); // "first", "second", "third"
}
}
public boolean insertRecord(Programmer targetProgrammer, List<Programmer> programmerList) {
boolean flag = false;
for (Programmer p : programmerList){
if (targetProgrammer.getId() == p.getId()) {
return true;
}
}
return flag;
}
// Then when you invoke:
Programmer target = new Programmer(1,"Dev","Java");
if (!insertRecord(target, myList)) {
myList.add(target);
}
What you will be looking for is a Set, as a Set is a Collection that contains no duplicates.
There are a few types that you could use depending on your needs:
HashSet
LinkedHashSet
CopyOnWriteArraySet
EnumSet
TreeSet
ConcurrentSkipListSet
Better use HashSet as it takes care of your problem of unique IDs implicitly. Still better is SortedSet where you can have the unique elements printed in sorted order automatically.
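Applied to the customer-ID case from the question, this could be sketched as follows. The int IDs and the names UniqueCustomerIds and collect are assumptions made for the example; the real IDs may well be Strings or longs.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class UniqueCustomerIds {
    // Collects each customer ID once, in the order it was first seen.
    static Set<Integer> collect(int[] idPerPurchasedItem) {
        Set<Integer> ids = new LinkedHashSet<>();
        for (int id : idPerPurchasedItem) {
            // add() returns false and changes nothing when the ID is already present.
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        int[] idPerPurchasedItem = {42, 7, 42, 42, 13, 7};
        System.out.println(collect(idPerPurchasedItem)); // [42, 7, 13]
    }
}
```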

In Java, How to remove duplication from an ArrayList<StringBuilder> efficiently?

I tried to use HashSet to remove the duplications from an ArrayList<StringBuilder>.
E.g. Here is an ArrayList, each line is a StringBuilder object.
"u12e5 u13a1 u1423"
"u145d"
"u12e5 u13a1 u1423"
"u3ab4 u1489"
I want to get the following:
"u12e5 u13a1 u1423"
"u145d"
"u3ab4 u1489"
My current implementation is:
static void removeDuplication(ArrayList<StringBuilder> directCallList) {
HashSet<StringBuilder> set = new HashSet<StringBuilder>();
for(int i=0; i<directCallList.size()-1; i++) {
if(set.contains(directCallList.get(i)) == false)
set.add(directCallList.get(i));
}
StringBuilder lastString = directCallList.get(directCallList.size()-1);
directCallList.clear();
directCallList.addAll(set);
directCallList.add(lastString);
}
But the performance becomes worse and worse as the ArrayList size grows. Is there any problem with this implementation? Or do you have any better ones in terms of performance?
StringBuilder doesn't override equals() or hashCode(). Two StringBuilders are only equal if they are the exact same object, so adding them to a HashSet won't exclude two different StringBuilder objects with identical content.
You should convert the StringBuilders to String objects.
Also, you should initialize your HashSet with an "initial capacity" in the constructor. This will help with the speed if you are dealing with large numbers of objects.
Lastly, it's not necessary to call contains() on the hashset before adding an object. Just add your Strings to the set, and the set will reject duplicates (and will return false).
Let's analyze your method to find where we can improve it:
static void removeDuplication(ArrayList<StringBuilder> directCallList) {
HashSet<StringBuilder> set = new HashSet<StringBuilder>();
for(int i=0; i<directCallList.size()-1; i++) {
if(set.contains(directCallList.get(i)) == false)
set.add(directCallList.get(i));
}
This for loop repeats once for each element in the ArrayList. This seems unavoidable for the task at hand. However, since HashSet can only contain one of each item, the if statement is redundant. HashSet.add() does the exact same check again.
StringBuilder lastString = directCallList.get(directCallList.size()-1);
I don't understand the need to get the lastString from your list and then add it. If your loop works correctly, it should have already been added to the HashSet.
directCallList.clear();
Depending on the implementation of the list, this can take up to O(n) time because it might need to visit every element in the list.
directCallList.addAll(set);
Again, this takes O(n) time. If there are no duplicates, set contains the original items.
directCallList.add(lastString);
This line seems to be a logic error. You will add a String which is already in the set and added to directCallList.
}
So overall, this algorithm takes O(n) time, but there is a constant factor of 3. If you can reduce this factor, you can improve the performance. One way to do this is to simply create a new ArrayList, rather than clearing the existing one.
Additionally, this removeDuplication() function can be written in one line if you use the correct constructors and return the ArrayList without duplicates:
static List<StringBuilder> removeDuplication(List<StringBuilder> inList) {
return new ArrayList<StringBuilder>(new HashSet<StringBuilder>(inList));
}
Of course, this still doesn't address the issues with StringBuilder that others have pointed out.
So you had some other options, but I like my solutions short, simple, and to the point. I've changed your method to no longer manipulate the parameter, but rather return a new List. I used a Set<String> to see if the contents of each StringBuilder was already included and returned the unique Strings. I also used a for each loop instead of accessing by index.
static List<StringBuilder> removeDuplication(List<StringBuilder> directCallList) {
HashSet<String> set = new HashSet<String>();
List<StringBuilder> returnList = new ArrayList<StringBuilder>();
for(StringBuilder builder : directCallList) {
if(set.add(builder.toString()))
returnList.add(builder);
}
return returnList;
}
As Sam states, StringBuilder does not override hashCode and equals, so the Set will not work appropriately.
I think the answer is to wrap the Builder in an object that executes toString only once:
class Wrapper{
final String string;
final StringBuilder builder;
Wrapper(StringBuilder builder){
this.builder = builder;
this.string = builder.toString();
}
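public int hashCode(){return string.hashCode();}
// Compare against the other wrapper's string; string.equals(o) would always be false for a Wrapper.
public boolean equals(Object o){return o instanceof Wrapper && string.equals(((Wrapper) o).string);}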
}
public Set removeDups(List<StringBuilder> list){
Set<Wrapper> set = ...;
for (StringBuilder builder : list)
set.add(new Wrapper(builder));
return set;
}
The removeDups method could be updated to extract the builders from the set and return a List<StringBuilder>
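One way that update could look is sketched below. The set implementation is elided in the answer, so LinkedHashSet here is an assumption, chosen so the builders come back in their original order; the enclosing class name BuilderDedup is also made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BuilderDedup {
    // Wraps a builder together with a snapshot of its content, taken once.
    static final class Wrapper {
        final String string;
        final StringBuilder builder;

        Wrapper(StringBuilder builder) {
            this.builder = builder;
            this.string = builder.toString();
        }

        @Override
        public int hashCode() { return string.hashCode(); }

        @Override
        public boolean equals(Object o) {
            return o instanceof Wrapper && string.equals(((Wrapper) o).string);
        }
    }

    // Returns the builders with distinct content, keeping the first of each.
    static List<StringBuilder> removeDups(List<StringBuilder> list) {
        Set<Wrapper> set = new LinkedHashSet<>(); // assumed implementation
        for (StringBuilder builder : list) {
            set.add(new Wrapper(builder));
        }
        List<StringBuilder> result = new ArrayList<>(set.size());
        for (Wrapper wrapper : set) {
            result.add(wrapper.builder);
        }
        return result;
    }

    public static void main(String[] args) {
        List<StringBuilder> calls = Arrays.asList(
                new StringBuilder("u12e5 u13a1 u1423"),
                new StringBuilder("u145d"),
                new StringBuilder("u12e5 u13a1 u1423"),
                new StringBuilder("u3ab4 u1489"));
        System.out.println(removeDups(calls)); // [u12e5 u13a1 u1423, u145d, u3ab4 u1489]
    }
}
```

The snapshot is only valid as long as the builders are not modified after wrapping.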
As explained, StringBuilders don't override Object#equals and aren't Comparable.
Although using StringBuilders to concatenate your Strings is the way to go, I would suggest that once you are done with your concatenation, you should store the underlying strings (stringBuilder.toString()) instead of the StringBuilders in your list.
Removing duplicates then becomes a one-liner:
Set<String> set = new HashSet<String>(list);
Or even better, store the strings in the set directly if you don't need to know that there are duplicates.

Collection removeAll ignoring case?

OK, so here is my issue. I have two HashSets, and I use the removeAll method to delete values that exist in one set from the other.
Prior to calling the method, I obviously add the values to the Sets. I call .toUpperCase() on each String before adding because the values are of different cases in both lists. There is no rhyme or reason to the case.
Once I call removeAll, I need to have the original cases back for the values that are left in the Set. Is there an efficient way of doing this without running through the original list and using compareToIgnoreCase?
Example:
List1:
"BOB"
"Joe"
"john"
"MARK"
"dave"
"Bill"
List2:
"JOE"
"MARK"
"DAVE"
After this, create a separate HashSet for each List using toUpperCase() on Strings. Then call removeAll.
Set1.removeAll(set2);
Set1:
"BOB"
"JOHN"
"BILL"
I need to get the list to look like this again:
"BOB"
"john"
"Bill"
Any ideas would be much appreciated. I know it is poor, there should be a standard for the original list but that is not for me to decide.
In my original answer, I unthinkingly suggested using a Comparator, but this causes the TreeSet to violate the equals contract and is a bug waiting to happen:
// Don't do this:
Set<String> setA = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
setA.add("hello");
setA.add("Hello");
System.out.println(setA);
Set<String> setB = new HashSet<String>();
setB.add("HELLO");
// Bad code; violates symmetry requirement
System.out.println(setB.equals(setA) == setA.equals(setB));
It is better to use a dedicated type:
public final class CaselessString {
private final String string;
private final String normalized;
private CaselessString(String string, Locale locale) {
this.string = string;
normalized = string.toUpperCase(locale);
}
@Override public String toString() { return string; }
@Override public int hashCode() { return normalized.hashCode(); }
@Override public boolean equals(Object obj) {
if (obj instanceof CaselessString) {
return ((CaselessString) obj).normalized.equals(normalized);
}
return false;
}
public static CaselessString as(String s, Locale locale) {
return new CaselessString(s, locale);
}
public static CaselessString as(String s) {
return as(s, Locale.ENGLISH);
}
// TODO: probably best to implement CharSequence for convenience
}
This code is less likely to cause bugs:
Set<CaselessString> set1 = new HashSet<CaselessString>();
set1.add(CaselessString.as("Hello"));
set1.add(CaselessString.as("HELLO"));
Set<CaselessString> set2 = new HashSet<CaselessString>();
set2.add(CaselessString.as("hello"));
System.out.println("1: " + set1);
System.out.println("2: " + set2);
System.out.println("equals: " + set1.equals(set2));
This is, unfortunately, more verbose.
It could be done by:
Moving the content of your lists into case-insensitive TreeSets,
then removing all common Strings case-insensitively thanks to TreeSet#removeAll(Collection<?> c)
and finally relying on the fact that ArrayList#retainAll(Collection<?> c) will iterate over the elements of the list and for each element it will call contains(Object o) on the provided collection to know whether the value should be kept or not and here as the collection is case-insensitive, we will keep only the Strings that match case-insensitively with what we have in the provided TreeSet instance.
The corresponding code:
List<String> list1 = new ArrayList<>(
Arrays.asList("BOB", "Joe", "john", "MARK", "dave", "Bill")
);
List<String> list2 = Arrays.asList("JOE", "MARK", "DAVE");
// Add all values of list1 in a case insensitive collection
Set<String> set1 = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
set1.addAll(list1);
// Add all values of list2 in a case insensitive collection
Set<String> set2 = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
set2.addAll(list2);
// Remove all common Strings ignoring case
set1.removeAll(set2);
// Keep in list1 only the remaining Strings ignoring case
list1.retainAll(set1);
for (String s : list1) {
System.out.println(s);
}
Output:
BOB
john
Bill
NB 1: It is important to put the content of the second list into a TreeSet, especially if we don't know its size, because the behavior of TreeSet#removeAll(Collection<?> c) depends on the sizes of both collections. If the current collection is strictly bigger than the provided collection, it calls remove(Object o) directly on the current collection for each element, and in this case the provided collection could be a list. Otherwise, it calls contains(Object o) on the provided collection to decide whether a given element should be removed, so if that collection is not case-insensitive, we won't get the expected result.
NB 2: The behavior of the method ArrayList#retainAll(Collection<?> c) described above is the same as the behavior of the default implementation of the method retainAll(Collection<?> c) that we can find in AbstractCollection such that this approach will actually work with any collections whose implementation of retainAll(Collection<?> c) has the same behavior.
You can use a HashMap whose keys are the upper-case strings and whose values are the original mixed-case strings.
Keys of a HashMap are unique, and you can get a set of them using HashMap.keySet();
to retrieve the original case, it's as simple as HashMap.get("UPPERCASENAME").
And according to the documentation:
Returns a set view of the keys
contained in this map. The set is
backed by the map, so changes to the
map are reflected in the set, and
vice-versa. The set supports element
removal, which removes the
corresponding mapping from this map,
via the Iterator.remove, Set.remove,
removeAll, retainAll, and clear
operations. It does not support the
add or addAll operations.
So HashMap.keySet().removeAll will affect the HashMap :)
EDIT: use McDowell's solution. I overlooked the fact that you didn't actually need the letters to be upper case :P
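For completeness, the map-based idea can be sketched like this. The names CaseInsensitiveRemove and removeIgnoringCase are invented for the example, and LinkedHashMap and Locale.ROOT are choices made here (to keep list order and get locale-independent upper-casing), not something the answer specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class CaseInsensitiveRemove {
    // Removes from list1 every value that appears in list2 ignoring case,
    // and returns the survivors in their original spelling.
    static List<String> removeIgnoringCase(List<String> list1, List<String> list2) {
        // Upper-cased key -> original spelling. If list1 holds values that differ
        // only in case, the last spelling wins.
        Map<String, String> originalByUpper = new LinkedHashMap<>();
        for (String s : list1) {
            originalByUpper.put(s.toUpperCase(Locale.ROOT), s);
        }
        Set<String> toRemove = new HashSet<>();
        for (String s : list2) {
            toRemove.add(s.toUpperCase(Locale.ROOT));
        }
        // The key set is a view: removing keys removes the mappings too.
        originalByUpper.keySet().removeAll(toRemove);
        return new ArrayList<>(originalByUpper.values());
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("BOB", "Joe", "john", "MARK", "dave", "Bill");
        List<String> list2 = Arrays.asList("JOE", "MARK", "DAVE");
        System.out.println(removeIgnoringCase(list1, list2)); // [BOB, john, Bill]
    }
}
```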
This would be an interesting one to solve using google-collections. You could have a constant Function like so:
private static final Function<String, String> TO_UPPER = new Function<String, String>() {
public String apply(String input) {
return input.toUpperCase();
}
};
and then what you're after could be done something like this:
final Collection<String> toRemove = Collections2.transform(list2, TO_UPPER);
// Collections2.filter accepts any Collection; Sets.filter would require list1 to be a Set.
Collection<String> kept = Collections2.filter(list1, new Predicate<String>() {
public boolean apply(String input) {
return !toRemove.contains(input.toUpperCase());
}
});
That is:
Build an upper-case-only version of the 'to discard' list
Apply a filter to the original list, retaining only those items whose uppercased value is not in the upper-case-only list.
Note that the output of Collections2.transform isn't an efficient Set implementation, so if you're dealing with a lot of data and the cost of probing that list will hurt you, you can instead use
Set<String> toRemove = Sets.newHashSet(Collections2.transform(list2, TO_UPPER));
which will restore an efficient lookup, returning the filtering to O(n) instead of O(n^2).
As far as I know, a HashSet uses the objects' hashCode() and equals() methods to distinguish them from each other.
You should therefore override these methods in your object in order to handle the differences in case.
If you're really using String, you cannot override these methods, as you cannot extend the String class.
You therefore need to create your own class containing a String attribute that you fill with your content. You might want getValue() and setValue(String) methods in order to modify the string.
Then you can add instances of your own class to the HashSet.
This should solve your problem.
regards
