removing duplicates from list of lists and preserving lists - java

I have an arrayList of arrayLists. Each inner arraylist contains some objects with the format (name.version) .
{ {a.1,b.2,c.3} , {a.2,d.1,e.1} , {b.3,f.1,z.1}....}
For example a.1 implies name = a and version is 1.
So I want to eliminate duplicates in this ArrayList of lists. For me, two objects are duplicates when they have the same name.
So essentially my output should be
{ { a.1,b.2,c.3},{d.1,e.1} ,{f.1 ,z.1} }
Note that I want the output in the exact same form (that is, I don't want a single list with no duplicates).
Can someone provide me with an optimal solution for this?
I can loop through each inner list and place the contents in a HashSet, but there are two issues. First, I can't get the answer back in the form of a list of lists. Second, I would need to override equals for that object, and I am not sure whether that would break other code. These objects are meaningfully equal if their names are the same, but only in this case; I am not sure that holds everywhere else the class is used.
Thanks

I used Iterator.remove() to modify the collection while moving through it.
// build your example input as ArrayList<ArrayList<String>>
String[][] tmp = { { "a.1", "b.2", "c.3" }, { "a.2", "d.1", "e.1" },
{ "b.3", "f.1", "z.1" } };
List<List<String>> test = new ArrayList<List<String>>();
for (String[] array : tmp) {
test.add(new ArrayList<String>(Arrays.asList(array)));
}
// keep track of elements we've already seen
Set<String> nameCache = new HashSet<String>();
// iterate and remove if seen before
for (List<String> list : test) {
for (Iterator<String> it = list.iterator(); it.hasNext();) {
String element = it.next();
String name = element.split("\\.")[0];
if (nameCache.contains(name)) {
it.remove();
} else {
nameCache.add(name);
}
}
}
System.out.println(test);
Output
[[a.1, b.2, c.3], [d.1, e.1], [f.1, z.1]]

List<List<Pair>> inputs; // in whatever format you have them
List<List<Pair>> uniqued = new ArrayList<>(); // output to here
Set<String> seen = new HashSet<String>();
for (List<Pair> list : inputs) {
List<Pair> output = new ArrayList<>();
for (Pair p : list)
if (seen.add(p.getName()))
output.add(p);
uniqued.add(output);
}

Create a Set. Iterate over the list of lists' items. See if the item is in the Set. If it is already there, ignore it. If it isn't, add it to the Set and the list of lists.
Your method will return a new list of lists, not modify the old one. Modifying a list while iterating over it is a pain.
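The steps described above can be sketched as follows. This is only a minimal sketch: the class and method names are made up for illustration, and it assumes the elements are plain strings whose name is everything before the first dot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DedupByName {
    // Returns a new list of lists; the input is left untouched.
    static List<List<String>> dedup(List<List<String>> input) {
        Set<String> seenNames = new HashSet<>();
        List<List<String>> result = new ArrayList<>();
        for (List<String> inner : input) {
            List<String> kept = new ArrayList<>();
            for (String item : inner) {
                // Assumption: the name is the part before the first dot.
                String name = item.split("\\.")[0];
                // add() returns false when the name was already seen.
                if (seenNames.add(name)) {
                    kept.add(item);
                }
            }
            result.add(kept);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> input = Arrays.asList(
                Arrays.asList("a.1", "b.2", "c.3"),
                Arrays.asList("a.2", "d.1", "e.1"),
                Arrays.asList("b.3", "f.1", "z.1"));
        System.out.println(dedup(input)); // [[a.1, b.2, c.3], [d.1, e.1], [f.1, z.1]]
    }
}
```

Because the result is built separately, the input can even be an unmodifiable list.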

Related

Adding data into a list and keeping the right order

Consider the following piece of code in Java:
final LinkedHashSet<String> allData = new LinkedHashSet<String>();
for (final String folder: folders) {
for (final String d: data.get(folder)) {
allData.add(d);
}
}
when folders is List<String> and data is Map<String, List<String>>.
Example of data:
{data1=[prepare, echo2, echo1], data2=[prepare, sleep2, check]}
Until now, I didn't care about the order of the allData and it would look:
[prepare, echo1, echo2, sleep2, check]
I have a list dataList which contains the right order of the data (and of course all data):
[prepare, sleep1, echo1, sleep2, echo2, check]
I want to iterate through the data and add into the allData those that are defined in data in the right order.
The output for the previous example should be:
[prepare, echo1, sleep2, echo2, check]
Another example:
data = {data1=[prepare], data2=[prepare, sleep2,sleep1,echo2]}
dataList = [prepare, sleep1, echo1, sleep2, echo2, check]
expected output:
allData = [prepare,sleep1,sleep2,echo2]
Of course, there could be more than two inner data (data1,data2,...,dataN).
I'm not sure that it's the right thing to add code to the inner for loop, because doing so means adding another loop to iterate through dataList, and that does not feel efficient (3 loops).
What good, clean and efficient way should I use?
EDIT: I don't try to sort the collection. But I thought of doing the following algorithm:
Creating a Set of all data and then iterate through the dataList and insert into the LinkedHashSet.
Is it efficient?
EDIT2: What I tried to do:
final Set<String> activeData = new HashSet<String>();
for (final String folder: folders) {
for (final String d: data.get(folder)) {
activeData.add(d);
}
}
final LinkedHashSet<String> allData = new LinkedHashSet<String>();
for (final String main_data : dataList) {
for (final String active_data: activeData) {
if (main_data.equals(active_data)) {
allData.add(active_data);
}
}
}
That works for me, but it does not look good: four for loops. It does not feel clean or efficient. Is there a better algorithm?
One solution is:
Iterate over dataList and check whether each element exists in data; if it does, add it to allData, which keeps the order.
Set<String> tempSet = new HashSet<>();
folders.forEach(folder -> tempSet.addAll(data.get(folder))); // flattening all the data elements into single set
//Iterate over dataList where you are already maintaining order. If element exists, add it to allData
dataList.forEach(item -> {
if(tempSet.contains(item)){
allData.add(item);
}
});
If you are using Java 7 or below
Set<String> tempSet = new HashSet<>();
// flattening all the data elements into single set
for(String folder:folders) {
tempSet.addAll(data.get(folder));
}
//Iterate over dataList where you are already maintaining order. If element exists, add it to allData
for(String item:dataList){
if(tempSet.contains(item)){
allData.add(item);
}
}
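For comparison, the same idea can be written as a single stream pipeline. This is a sketch only; the class and method names are hypothetical, and it assumes every folder in folders has an entry in data.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class OrderedActiveData {
    // Keeps the elements of dataList that occur in any folder, in dataList order.
    static Set<String> inReferenceOrder(List<String> folders,
                                        Map<String, List<String>> data,
                                        List<String> dataList) {
        // Flatten all folder contents into one set for O(1) membership checks.
        Set<String> active = folders.stream()
                .flatMap(folder -> data.get(folder).stream())
                .collect(Collectors.toSet());
        // Walk the reference order and keep only the active elements.
        return dataList.stream()
                .filter(active::contains)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        Map<String, List<String>> data = new HashMap<>();
        data.put("data1", Arrays.asList("prepare"));
        data.put("data2", Arrays.asList("prepare", "sleep2", "sleep1", "echo2"));
        List<String> dataList =
                Arrays.asList("prepare", "sleep1", "echo1", "sleep2", "echo2", "check");
        System.out.println(inReferenceOrder(Arrays.asList("data1", "data2"), data, dataList));
        // [prepare, sleep1, sleep2, echo2]
    }
}
```

Each element is visited a constant number of times, so the cost grows linearly with the total amount of data rather than with the product of the list sizes.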

Finding duplicate and non duplicate in Java

I know this question has been answered on "how to find" many times, however I have a few additional questions. Here is the code I have
public static void main (String [] args){
List<String> l1= new ArrayList<String>();
l1.add("Apple");
l1.add("Orange");
l1.add("Apple");
l1.add("Milk");
//List<String> l2=new ArrayList<String>();
//HashSet is a good choice as it does not allow duplicates
HashSet<String> set = new HashSet<String>();
for( String e: l1){
//if(!(l2).add(e)) -- did not work
if(!(set).add(e)){
System.out.println(e);
}
}
}
Question 1:The list did not work because List allows Duplicate while HashSet does not- is that correct assumption?
Question 2: What does this line mean: if(!(set).add(e))
In the for loop we are checking if String e is in the list l1 and then what does this line validates if(!(set).add(e))
This code will print apple as output as it is the duplicate value.
Question 3: How can i have it print non Duplicate values, just Orange and Milk but not Apple? I tried this approach but it still prints Apple.
List unique= new ArrayList(new HashSet(l1));
Thanks in advance for your time.
1) Yes that is correct. We often use sets to remove duplicates.
2) The add method of HashSet returns false when the item is already in the set. That's why it is used to check whether the item exists in the set.
3) To do this, you need to count the number of occurrences of each item in the list, store the counts in a hash map, then print the items that have a count of 1. Or, you could just do this (which is a little dirty and slower, but takes a little less space than using a hash map):
List<String> l1= new ArrayList<>();
l1.add("Apple");
l1.add("Orange");
l1.add("Apple");
l1.add("Milk");
HashSet<String> set = new HashSet<>(l1);
for (String item : set) {
if (l1.stream().filter(x -> !x.equals(item)).count() == l1.size() - 1) {
System.out.println(item);
}
}
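The hash map approach mentioned in 3) could look roughly like this. It is a sketch with made-up class and method names; a LinkedHashMap is used here (my choice, not something the answer specifies) so that the result keeps the order of first appearance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CountOccurrences {
    // Returns the items that occur exactly once, in order of first appearance.
    static List<String> occurringOnce(List<String> items) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String item : items) {
            counts.merge(item, 1, Integer::sum); // start at 1, then add 1 per repeat
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> l1 = Arrays.asList("Apple", "Orange", "Apple", "Milk");
        System.out.println(occurringOnce(l1)); // [Orange, Milk]
    }
}
```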
You're right.
Well... adding to a collection doesn't necessarily need to return anything. Fortunately, the folks at Sun (now Oracle) decided to return whether the item was successfully added to the collection. This is indicated by a boolean return value: true for success.
You can extend your current code with the following logic: if an element wasn't added successfully to the set, it was a duplicate, so add it to another set (Set<String> duplicates) and later remove all the duplicates from the original set.
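A minimal sketch of that logic, with hypothetical class and method names; the first set is a LinkedHashSet only so that the printed order matches the input order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DuplicateTracker {
    // Returns the elements that appear exactly once in the input.
    static Set<String> withoutDuplicated(List<String> items) {
        Set<String> seen = new LinkedHashSet<>();
        Set<String> duplicates = new HashSet<>();
        for (String item : items) {
            if (!seen.add(item)) {     // add() failed, so the item was already present
                duplicates.add(item);
            }
        }
        seen.removeAll(duplicates);    // drop every value that occurred more than once
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(withoutDuplicated(Arrays.asList("Apple", "Orange", "Apple", "Milk")));
        // [Orange, Milk]
    }
}
```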
Question 1:The list did not work because List allows Duplicate while HashSet does not- is that correct assumption?
That is correct.
Question 2: What does this line mean: if(!(set).add(e)) In the for loop we are checking if String e is in the list l1 and then what does this line validates if(!(set).add(e))
This code will print apple as output as it is the duplicate value.
set.add(e) attempts to add an element to the set, and it returns a boolean indicating whether it was added. Negating the result will cause new elements to be ignored and duplicates to be printed. Note that if an element is present 3 times it will be printed twice, and so on.
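To make the return value concrete, here is a small sketch (class and method names are made up) that collects what the question's loop would print instead of printing it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AddReturnValue {
    // Collects every element for which set.add(e) returned false.
    static List<String> rejectedByAdd(List<String> items) {
        Set<String> set = new HashSet<>();
        List<String> rejected = new ArrayList<>();
        for (String e : items) {
            if (!set.add(e)) {
                rejected.add(e);
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // "Apple" occurs three times, so it is rejected twice.
        System.out.println(rejectedByAdd(Arrays.asList("Apple", "Orange", "Apple", "Apple")));
        // [Apple, Apple]
    }
}
```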
Question 3: How can i have it print non Duplicate values, just Orange and Milk but not Apple? I tried this approach but it still prints Apple. List<String> unique= new ArrayList<String>(new HashSet<String>(l1));
There are a number of ways to approach it. This one doesn't have the best performance but it's pretty straightforward:
for (int i = 0; i < l1.size(); i++) {
boolean hasDup = false;
for (int j = 0; j < l1.size(); j++) {
if (i != j && l1.get(i).equals(l1.get(j))) {
hasDup = true;
break;
}
}
if (!hasDup) {
System.out.println(l1.get(i));
}
}
With the power of Java 8...
public static void main(String[] args) {
List<String> l1 = new ArrayList<>();
l1.add("Apple");
l1.add("Orange");
l1.add("Apple");
l1.add("Milk");
// remove duplicates
List<String> li = l1.parallelStream().distinct().collect(Collectors.toList());
System.out.println(li);
// map with duplicates frequency
Map<String, Long> countsList = l1.stream().collect(Collectors.groupingBy(fe -> fe, Collectors.counting()));
System.out.println(countsList);
// filter the map where only once
List<String> l2 = countsList.entrySet().stream().filter(map -> map.getValue().longValue() == 1)
.map(map -> map.getKey()).collect(Collectors.toList());
System.out.println(l2);
}

Iterating over sets of sets

I am attempting to write a program that iterates over a set of sets. In the example code below, I am getting an error that stating that iter.next() is of type object rather than a set of strings. I am having some other more mysterious issues with iterating over sets of sets as well. Any suggestions?
Set<String> dogs= new HashSet<String>();
dogs.add("Irish Setter");
dogs.add("Poodle");
dogs.add("Pug");
dogs.add("Beagle");
Set<String> cats = new HashSet<String>();
cats.add("Himalayan");
cats.add("Persian");
Set<Set<String>> allAnimals = new HashSet<Set<String>>();
allAnimals.add(cats);
allAnimals.add(dogs);
Iterator iter = allAnimals.iterator();
System.out.println(allAnimals.size());
while (iter.hasNext())
{
System.out.println(iter.next().size());
}
A related question with the same setup (minus the loop).
The code fragment below results in a final output that includes tildes. But I don't want to change allAnimals as I go! How can I edit extension without affecting the larger set (allAnimals).
for (Set<String> extension : allAnimals)
{
System.out.println("Set size: " + extension.size());
extension.add("~");
System.out.println(extension);
}
System.out.println(allAnimals);
Your allAnimals variable is of type Set<Set<String>>, however, when you ask its Iterator you "forget" the type information. According to the compiler, your iterator just contains Objects. Change the line where you get the Iterator to this
Iterator<Set<String>> iter = allAnimals.iterator();
and all should be fine.
Use an enhanced for loop to traverse the sets; it is easier than using an iterator:
for (Set<String> names : allAnimals) {
System.out.println(names.size());
}
For example, to traverse all the animals' names:
for (Set<String> names : allAnimals) {
for (String name : names) {
System.out.println(name);
}
}
You did not declare the element type of your iterator, so as far as the compiler is concerned, next() returns an Object.
I would just use a (nested) foreach loop:
for(Set<String> animals : allAnimals) {
int size = animals.size(); // if you want it
for (String animal : animals) {
// do something with the name
}
}
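For the related question (editing extension without affecting allAnimals), one option is to work on a copy of each inner set. The sketch below uses hypothetical class and method names. Copying also sidesteps a subtler problem: adding to a set that is itself stored in a HashSet changes its hash code while it sits in the outer set, so lookups on the outer set may stop finding it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CopyInnerSets {
    // Returns extended copies of each inner set; the originals are not modified.
    static List<Set<String>> extendCopies(Set<Set<String>> allAnimals, String extra) {
        List<Set<String>> extended = new ArrayList<>();
        for (Set<String> original : allAnimals) {
            Set<String> extension = new HashSet<>(original); // shallow copy of the inner set
            extension.add(extra);                            // only the copy changes
            extended.add(extension);
        }
        return extended;
    }

    public static void main(String[] args) {
        Set<String> cats = new HashSet<>(Arrays.asList("Himalayan", "Persian"));
        Set<String> dogs = new HashSet<>(Arrays.asList("Pug", "Beagle"));
        Set<Set<String>> allAnimals = new HashSet<>();
        allAnimals.add(cats);
        allAnimals.add(dogs);
        System.out.println(extendCopies(allAnimals, "~")); // copies, each containing "~"
        System.out.println(allAnimals);                    // originals, without "~"
    }
}
```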

How to remove multiple elements in Vector in Java?

I read from .txt file all of the ids and insert these ids into Vector.
String pathSelectedfile = fileChooser.getSelectedFile().getAbsolutePath();
File selectedFile = new File(pathSelectedfile);
Scanner readFile = new Scanner(selectedFile);
Vector ids=new Vector();
while (readFile.hasNextLine()) {
String id= readFile.nextLine();
ids.addElement(id);
}
Then I want to remove multiple ids from the Vector. I can do that with a for loop,
but the amount of data is too big. Thanks a lot.
To remove multiple values
Vector vector = new Vector();
vector.add("value1");
vector.add("value2");
vector.add("value3");
vector.add("value4");
System.out.println("Size : "+vector.size());
// to remove single value
vector.remove("value1");
System.out.println("Size : "+vector.size());
Vector itemsToRemove = new Vector();
itemsToRemove.add("value3");
itemsToRemove.add("value4");
//remove multiple values
vector.removeAll(itemsToRemove);
System.out.println("Size : "+vector.size());
//to remove all elements
vector.removeAllElements();
// or
vector.clear();
But instead of Vector, consider using ArrayList, since Vector is an obsolete collection.
Read this: Why is Java Vector class considered obsolete or deprecated?
Also use generics, like ArrayList<String> idList = new ArrayList<String>(), if you store only String elements in the list.
If you want to skip duplicates when adding elements in Vector, use the following code
Vector vector = new Vector() {
@Override
public synchronized boolean add(Object e) {
if(!contains(e)){
return super.add(e);
}
System.out.println("Element " + e +" is duplicate");
return false ;
}
};
But if you want to add only unique elements, use a Set.
To completely remove the duplicated ids, you could use the following:
Set<String> ids=new LinkedHashSet<String>();
Set<String> duplicates=new HashSet<String>();
while (readFile.hasNextLine()) {
String id= readFile.nextLine();
if(!ids.add(id)) {
duplicates.add(id);
}
}
ids.removeAll(duplicates);
Note that unlike Vector, LinkedHashSet is not synchronized. In most cases this is not a bad thing, but in the case that you actually need it to be synchronized, wrap it using Collections.synchronizedSet()
Read the Javadoc and pay attention to the methods starting with remove: http://docs.oracle.com/javase/6/docs/api/java/util/Vector.html. This should be your first approach, not SO.
If you "want to remove multiple ids in Vector" do the following
ids = new Vector(new HashSet(ids));
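Note that a plain HashSet does not guarantee any iteration order, so the ids may come back shuffled. If the file order matters, a LinkedHashSet keeps the first occurrence of each id in place. A small sketch with hypothetical names, using ArrayList rather than Vector:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class DedupKeepOrder {
    // Keeps one copy of each id, in order of first appearance.
    static List<String> dedup(List<String> ids) {
        return new ArrayList<>(new LinkedHashSet<>(ids));
    }

    public static void main(String[] args) {
        System.out.println(dedup(Arrays.asList("id3", "id1", "id3", "id2", "id1")));
        // [id3, id1, id2]
    }
}
```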

How to avoid java.util.ConcurrentModificationException when iterating through and removing elements from an ArrayList

I have an ArrayList that I want to iterate over. While iterating over it I have to remove elements at the same time. Obviously this throws a java.util.ConcurrentModificationException.
What is the best practice to handle this problem? Should I clone the list first?
I remove the elements not in the loop itself but another part of the code.
My code looks like this:
public class Test {
private ArrayList<A> abc = new ArrayList<A>();
public void doStuff() {
for (A a : abc)
a.doSomething();
}
public void removeA(A a) {
abc.remove(a);
}
}
a.doSomething might call Test.removeA();
Two options:
Create a list of values you wish to remove, adding to that list within the loop, then call originalList.removeAll(valuesToRemove) at the end
Use the remove() method on the iterator itself. Note that this means you can't use the enhanced for loop.
As an example of the second option, removing any strings with a length greater than 5 from a list:
List<String> list = new ArrayList<String>();
...
for (Iterator<String> iterator = list.iterator(); iterator.hasNext(); ) {
String value = iterator.next();
if (value.length() > 5) {
iterator.remove();
}
}
From the JavaDocs of the ArrayList
The iterators returned by this class's iterator and listIterator
methods are fail-fast: if the list is structurally modified at any
time after the iterator is created, in any way except through the
iterator's own remove or add methods, the iterator will throw a
ConcurrentModificationException.
You are trying to remove a value from the list inside an enhanced for loop, which is not possible, whatever trick you apply (as you did in your code).
A better way is to work at the iterator level, as others have advised here.
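The fail-fast behaviour quoted above is easy to reproduce. The sketch below (class and method names are made up) removes an element once through the list inside a for-each loop and once through the iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Removes "def" through the list while a for-each loop is running.
    static boolean forEachRemovalThrows() {
        List<String> names = new ArrayList<>(Arrays.asList("abc", "def", "ghi", "xyz"));
        try {
            for (String name : names) {
                if (name.equals("def")) {
                    names.remove(name); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // raised by the next call to the hidden iterator's next()
        }
    }

    // Removes "def" through the iterator, which is the supported way.
    static List<String> iteratorRemoval() {
        List<String> names = new ArrayList<>(Arrays.asList("abc", "def", "ghi", "xyz"));
        for (Iterator<String> it = names.iterator(); it.hasNext();) {
            if (it.next().equals("def")) {
                it.remove();
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(forEachRemovalThrows()); // true
        System.out.println(iteratorRemoval());      // [abc, ghi, xyz]
    }
}
```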
I wonder how people have not suggested traditional for loop approach.
for( int i = 0; i < lStringList.size(); i++ )
{
String lValue = lStringList.get( i );
if(lValue.equals("_Not_Required"))
{
lStringList.remove(lValue);
i--;
}
}
This works as well.
In Java 8 you can use the Collection Interface and do this by calling the removeIf method:
yourList.removeIf((A a) -> a.value == 2);
More information can be found here
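A self-contained sketch of removeIf, using strings and a length check as a stand-in for the question's class A (the class and method names here are hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveIfDemo {
    // Returns a copy of values without the strings longer than maxLength.
    static List<String> dropLongerThan(List<String> values, int maxLength) {
        List<String> copy = new ArrayList<>(values);
        copy.removeIf(v -> v.length() > maxLength); // removes matches in one pass
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(dropLongerThan(
                Arrays.asList("short", "muchlonger", "tiny", "exceeds"), 5));
        // [short, tiny]
    }
}
```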
You should really just iterate backwards over the list in the traditional way.
Every time you remove an element from the list, the elements after it are shifted forward. As long as you don't remove elements other than the one currently being visited, the following code should work.
public class Test {
private ArrayList<A> abc = new ArrayList<A>();
public void doStuff(){
for(int i = (abc.size() - 1); i >= 0; i--)
abc.get(i).doSomething();
}
public void removeA(A a){
abc.remove(a);
}
}
It is possible to remove an element while iterating over the list. See my examples below:
ArrayList<String> names = new ArrayList<String>();
names.add("abc");
names.add("def");
names.add("ghi");
names.add("xyz");
Given the ArrayList of names above, I want to remove the name "def" from it:
for(String name : names){
if(name.equals("def")){
names.remove("def");
}
}
The above code throws a ConcurrentModificationException because you are modifying the list while iterating over it.
So, to remove the name "def" from the ArrayList, do it this way:
Iterator<String> itr = names.iterator();
while(itr.hasNext()){
String name = itr.next();
if(name.equals("def")){
itr.remove();
}
}
With the above code, the iterator removes the name "def" from the ArrayList. If you then print the list, you will see the output below.
Output : [abc, ghi, xyz]
Use a plain indexed loop instead. java.util.ConcurrentModificationException is thrown by the list's iterator when the list is modified during iteration, and an indexed loop does not use an iterator.
So try:
for(int i = 0; i < list.size(); i++){
list.get(i).action();
}
Here is an example where I use a different list to add the objects for removal, then afterwards I use stream.foreach to remove elements from original list :
private ObservableList<CustomerTableEntry> customersTableViewItems = FXCollections.observableArrayList();
...
private void removeOutdatedRowsElementsFromCustomerView()
{
ObjectProperty<TimeStamp> currentTimestamp = new SimpleObjectProperty<>(TimeStamp.getCurrentTime());
long diff;
long diffSeconds;
List<Object> objectsToRemove = new ArrayList<>();
for(CustomerTableEntry item: customersTableViewItems) {
diff = currentTimestamp.getValue().getTime() - item.timestamp.getValue().getTime();
diffSeconds = diff / 1000 % 60;
if(diffSeconds > 10) {
// Element has been idle for too long, meaning no communication, hence remove it
System.out.printf("- Idle element [%s] - will be removed\n", item.getUserName());
objectsToRemove.add(item);
}
}
objectsToRemove.stream().forEach(o -> customersTableViewItems.remove(o));
}
One option is to modify the removeA method to this -
public void removeA(A a,Iterator<A> iterator) {
iterator.remove(); // Iterator.remove() takes no argument; it removes the element last returned by next()
}
But this would mean your doSomething() should be able to pass the iterator to the remove method. Not a very good idea.
Can you do this with a two-step approach?
In the first loop, when you iterate over the list, instead of removing the selected elements, mark them as to be deleted. For this, you may simply copy these elements (shallow copy) into another List.
Then, once your iteration is done, simply call removeAll on the first list with the second list as the argument.
In my case, the accepted answer did not work: it stops the exception, but it causes some inconsistency in my list. The following solution works perfectly for me.
List<String> list = new ArrayList<>();
List<String> itemsToRemove = new ArrayList<>();
for (String value: list) {
if (value.length() > 5) { // your condition
itemsToRemove.add(value);
}
}
list.removeAll(itemsToRemove);
In this code, I have added the items to remove, in another list and then used list.removeAll method to remove all required items.
Instead of a for-each loop, use a plain indexed for loop. For example, the code below removes all the elements of the ArrayList without throwing java.util.ConcurrentModificationException. It counts down from the last index so that removals do not shift the elements still to be visited. You can add a condition inside the loop according to your use case.
for(int i = abc.size() - 1; i >= 0; i--) {
abc.remove(i);
}
Sometimes old school is best. Just go for a simple for loop but make sure you start at the end of the list otherwise as you remove items you will get out of sync with your index.
List<String> list = new ArrayList<>();
for (int i = list.size() - 1; i >= 0; i--) {
if ("removeMe".equals(list.get(i))) {
list.remove(i);
}
}
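You can also use CopyOnWriteArrayList instead of an ArrayList. It has been available since JDK 1.5. Its iterators work on a snapshot of the list, so they never throw ConcurrentModificationException, at the cost of copying the underlying array on every write.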
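A minimal sketch of that behaviour (the class and method names are made up). The for-each loop iterates over a snapshot, so removing through the list is safe; the iterator's own remove() is not supported by this class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {
    // Removes every occurrence of unwanted while iterating over the same list.
    static List<String> removeWhileIterating(List<String> source, String unwanted) {
        List<String> list = new CopyOnWriteArrayList<>(source);
        for (String value : list) {     // iterates over a snapshot of the array
            if (value.equals(unwanted)) {
                list.remove(value);     // copies the array; the snapshot is unaffected
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeWhileIterating(Arrays.asList("abc", "def", "ghi"), "def"));
        // [abc, ghi]
    }
}
```

Since every write copies the whole array, this fits lists that are read far more often than they are modified.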
Do something simple like this:
for (Object object: (ArrayList<String>) list.clone()) {
list.remove(object);
}
An alternative Java 8 solution using stream:
theList = theList.stream()
.filter(element -> !shouldBeRemoved(element))
.collect(Collectors.toList());
In Java 7 you can use Guava instead:
theList = FluentIterable.from(theList)
.filter(new Predicate<String>() {
@Override
public boolean apply(String element) {
return !shouldBeRemoved(element);
}
})
.toImmutableList();
Note, that the Guava example results in an immutable list which may or may not be what you want.
for (A a : new ArrayList<>(abc)) {
a.doSomething();
abc.remove(a);
}
"Should I clone the list first?"
That will be the easiest solution, remove from the clone, and copy the clone back after removal.
An example from my rummikub game:
@SuppressWarnings("unchecked")
public void removeStones() {
ArrayList<Stone> clone = (ArrayList<Stone>) stones.clone();
// remove the stones moved to the table
for (Stone stone : stones) {
if (stone.isOnTable()) {
clone.remove(stone);
}
}
stones = (ArrayList<Stone>) clone.clone();
sortStones();
}
I know I'm late, but I'm answering because I think this solution is simple and elegant:
List<String> listFixed = new ArrayList<String>();
List<String> dynamicList = new ArrayList<String>();
public void fillingList() {
listFixed.add("Andrea");
listFixed.add("Susana");
listFixed.add("Oscar");
listFixed.add("Valeria");
listFixed.add("Kathy");
listFixed.add("Laura");
listFixed.add("Ana");
listFixed.add("Becker");
listFixed.add("Abraham");
dynamicList.addAll(listFixed);
}
public void updatingListFixed() {
for (String newList : dynamicList) {
if (!listFixed.contains(newList)) {
listFixed.add(newList);
}
}
// the loop above adds elements; the part below also removes one
String removeRegister = "";
for (String fixedList : listFixed) {
if (!dynamicList.contains(fixedList)) {
removeRegister = fixedList;
}
}
listFixed.remove(removeRegister);
}
All of this updates one list from the other, and you could do it all with just one list.
In the updating method you check both lists and can add or remove elements between them.
This means both lists always stay the same size.
Use an Iterator instead of working on the ArrayList directly.
Get an iterator from the set with a matching element type,
then move to the next element and remove it:
Iterator<Insured> itr = insuredSet.iterator();
while (itr.hasNext()) {
itr.next();
itr.remove();
}
Calling next() first is important here: remove() deletes the element that next() last returned.
List<String> list1 = new ArrayList<>();
list1.addAll(OriginalList);
List<String> list2 = new ArrayList<>();
list2.addAll(OriginalList);
This is also an option.
If your goal is to remove all elements from the list, you can iterate over each item, and then call:
list.clear()
What about:
import java.util.Collections;
List<A> abc = Collections.synchronizedList(new ArrayList<>());
ERROR
There was a mistake when I added to the same list I was iterating over:
fun <T> MutableList<T>.mathList(_fun: (T) -> T): MutableList<T> {
for (i in this) {
this.add(_fun(i)) // <--- ERROR
}
return this // <--- ERROR
}
SOLUTION
It works great when adding to a new list:
fun <T> MutableList<T>.mathList(_fun: (T) -> T): MutableList<T> {
val newList = mutableListOf<T>() // <--- SOLUTION
for (i in this) {
newList.add(_fun(i)) // <--- SOLUTION
}
return newList // <--- SOLUTION
}
Just add a break after your ArrayList.remove(A) statement
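A sketch of that suggestion, with hypothetical names. It only works when a single element has to go: the loop exits right after the removal, so the iterator is never asked for another element and has no chance to notice the change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveThenBreak {
    // Removes the first element equal to target, then stops iterating.
    static List<String> removeFirstMatch(List<String> source, String target) {
        List<String> list = new ArrayList<>(source);
        for (String value : list) {
            if (value.equals(target)) {
                list.remove(value);
                break; // leave the loop before the iterator is used again
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeFirstMatch(Arrays.asList("abc", "def", "ghi", "def"), "def"));
        // [abc, ghi, def]
    }
}
```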
