Built in method for removing duplicates in a string array


Is there a built-in Java method for removing duplicates from an array of strings?
Maybe using a Set? If so, how can I use it?
Thanks

Instead of an array of strings, you can use a Set directly (a Set never holds two equal elements). If you have to work with an array of strings, you can copy the array into a set and then copy the set back into an array:
import java.util.Arrays;
import java.util.HashSet;
public class HelloWorld {
    public static void main(String[] args) {
        String[] dupArray = {"hi", "hello", "hi"};
        dupArray = removeDuplicates(dupArray);
        for (String s : dupArray) {
            System.out.println(s);
        }
    }
    public static String[] removeDuplicates(String[] dupArray) {
        HashSet<String> mySet = new HashSet<String>(Arrays.asList(dupArray));
        dupArray = new String[mySet.size()];
        mySet.toArray(dupArray);
        return dupArray;
    }
}
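One thing the HashSet round trip does not guarantee is the original order of the elements. If order matters, a LinkedHashSet can be swapped in. The following is a minimal sketch of that variant; the class name OrderedDedup is made up for illustration and is not part of the answer above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class; the name is an assumption for this sketch.
class OrderedDedup {
    // Removes duplicates and keeps the first occurrence of each string in place.
    static String[] removeDuplicates(String[] input) {
        // LinkedHashSet iterates in insertion order, unlike HashSet.
        Set<String> seen = new LinkedHashSet<>(Arrays.asList(input));
        return seen.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] result = removeDuplicates(new String[] {"hi", "hello", "hi"});
        System.out.println(Arrays.toString(result)); // prints [hi, hello]
    }
}
```

On Java 8 or later, Arrays.stream(input).distinct().toArray(String[]::new) should give the same result without an explicit set.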

Related

Type mismatch: convert from String to List<String>

I have the algorithm for my school program in mind, but I am having difficulty with some basics, I guess.
Here is the code with the problem:
import java.io.*;
import java.util.*;
public class Main {
public static void main(String[] args) throws FileNotFoundException {
String allWords = System.getProperty("user.home") + "/allwords.txt";
Anagrams an = new Anagrams(allWords);
for(List<String> wlist : an.getSortedByAnQty()) {
//[..............];
}
}
}
public class Anagrams {
List<String> myList = new ArrayList<String>();
public List<String> getSortedByAnQty() {
myList.add("aaa");
return myList;
}
}
I get "Type mismatch: cannot convert from element type String to List".
How should I declare getSortedByAnQty() correctly?

an.getSortedByAnQty() returns a List<String>. When you iterate over that List, you get the individual Strings, so the enhanced for loop should have a String variable :
for(String str : an.getSortedByAnQty()) {
//[..............];
}
If the main method should remain as is, you should change getSortedByAnQty to return a List<List<String>>.
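For the second option, here is a rough sketch of what a getSortedByAnQty() returning List<List<String>> could look like. The grouping logic, the class name AnagramGroups, and the constructor taking a word list (instead of a file path) are all assumptions made for illustration; the question does not show the real implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical variant of the Anagrams class from the question.
class AnagramGroups {
    private final List<String> words;

    AnagramGroups(List<String> words) {
        this.words = words;
    }

    // Returns one inner list per anagram group, largest group first.
    List<List<String>> getSortedByAnQty() {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String word : words) {
            // Words with the same sorted letters are anagrams of each other.
            char[] letters = word.toCharArray();
            Arrays.sort(letters);
            groups.computeIfAbsent(new String(letters), k -> new ArrayList<>()).add(word);
        }
        List<List<String>> result = new ArrayList<>(groups.values());
        // Sort by group size, descending; List.sort is stable for equal sizes.
        result.sort((a, b) -> Integer.compare(b.size(), a.size()));
        return result;
    }

    public static void main(String[] args) {
        AnagramGroups an = new AnagramGroups(Arrays.asList("tea", "dog", "eat", "god", "ate"));
        for (List<String> wlist : an.getSortedByAnQty()) {
            System.out.println(wlist); // [tea, eat, ate] then [dog, god]
        }
    }
}
```

With this return type, the loop variable List<String> wlist from the question's main method compiles as written.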

char[] cArray = "MYString".toCharArray();
Convert the string to a char array as above, then iterate over the array to build a list of Strings, as below:
List<String> list = new ArrayList<String>(cArray.length);
for(char c : cArray){
list.add(String.valueOf(c));
}

Duplicate elements not removed from ArrayList

I am trying to add only unique elements to a list using the code below. I tried to ignore case, but I am still getting duplicates.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
public class RemoveDuplicatesIgnoreCase {
public static void main(String args[]) {
// String Array with duplicates Colors
String[] colorArray={"Black","BLACK","black","Cobalt","COBALT","cobalt","IVORY","Ivory","ivory","White","WHITE","white"};
List<String> uniqueColorList=new ArrayList<String>();
for (String color : colorArray) {
if(!uniqueColorList.contains(color)&& !uniqueColorList.contains(color.toLowerCase())&& !uniqueColorList.contains(color.toUpperCase()))
{
uniqueColorList.add(color);
}
}
Iterator<String> itr=uniqueColorList.iterator();
while(itr.hasNext())
{
System.out.println(itr.next());
}
}
}
Output:
Black
BLACK
Cobalt
COBALT
IVORY
White
WHITE
I want to avoid adding case sensitive & case insensitive duplicates.

I would use a Set instead of an ArrayList and add the strings in lowercase. A Set doesn't allow duplicate elements.
Set<String> uniqueColorList = new HashSet<String>();
for (String color : colorArray) {
uniqueColorList.add(color.toLowerCase());
}

You have to convert both values to lowercase to find a match.

I think the RIGHT way to do this would be encapsulating the Color in an Object.
It is only minimal overhead and makes your code A LOT more readable:
public class ColorString {
public final String str;
public ColorString(String str) {
this.str = str;
}
public boolean equals(Object obj) {
if (obj == null) return false;
if (obj == this) return true;
if (!(obj instanceof ColorString )) return false;
ColorString col = (ColorString) obj;
if (this.str == null) return (col.str == null);
return this.str.equalsIgnoreCase(col.str);
}
public int hashCode() { // Always override hashCode AND equals
// Guard against a null str, which equals() above also allows
return (str == null) ? 0 : str.toLowerCase().hashCode();
}
}
If you do it like this, you can use all the standard-methods, you can use a Set, an ArrayList.contains and so on. This solution is more sensible, since it is the representation of the idea: You don't have Strings, but you have a "color" and you have special rules, when two "color"s should be considered equal or not.
And if you want to expand your solution e.g. by allowing multiple colors with similar names to be treated as the same "color" you just have to change one method and everything still works!

I would use a Set of lowercase versions of the colors to track uniqueness:
public static void main(String args[]) {
String[] colorArray={"Black","BLACK","black","Cobalt","COBALT","cobalt","IVORY","Ivory","ivory","White","WHITE","white"};
List<String> colors = new ArrayList<String>();
Set<String> uniqueColors = new HashSet<String>();
for (String color : colorArray) {
if (uniqueColors.add(color.toLowerCase())) {
colors.add(color);
}
}
// colors now contains case-insensitive unique names
}
This code makes use of two things about a Set:
Sets allow only unique values, so putting in lowercase copies of the strings takes care of the case-insensitive part.
The add() method returns true if the operation changed the set, which only happens if the value being added is new to the set. Using this return value avoids having to call contains(): simply attempt to add the value and you'll find out whether it's unique or not.
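Put together as a complete program, the add()-return-value technique could look like the sketch below. The class and method names are invented for this example, and Locale.ROOT is an addition of mine to keep the lowercasing independent of the default locale.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical demo class; names are assumptions for this sketch.
class CaseInsensitiveDedup {
    // Keeps the first spelling of each color and drops later case variants.
    static List<String> uniqueIgnoringCase(String[] values) {
        List<String> result = new ArrayList<>();
        Set<String> seenLowercase = new HashSet<>();
        for (String value : values) {
            // add() returns false when the lowercase form was already present.
            if (seenLowercase.add(value.toLowerCase(Locale.ROOT))) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] colorArray = {"Black", "BLACK", "black", "Cobalt", "COBALT", "cobalt",
                "IVORY", "Ivory", "ivory", "White", "WHITE", "white"};
        System.out.println(uniqueIgnoringCase(colorArray)); // prints [Black, Cobalt, IVORY, White]
    }
}
```

The list keeps the original spelling and order, while the set is only used for bookkeeping.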

Your problem is that you only cover all lower-case and all upper-case Strings, and not any other mix of cases (e.g. you also have capitalized Strings).
To make things short, you can just extend ArrayList and override contains to use ignore-case comparison for each String , as suggested in this thread:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
public class RemoveDuplicatesIgnoreCase {
public static void main(String args[]) {
// String Array with duplicates Colors
String[] colorArray={"Black","BLACK","black","Cobalt","COBALT","cobalt","IVORY","Ivory","ivory","White","WHITE","white"};
List<String> uniqueColorList=new IgnoreCaseStringList();
for (String color : colorArray) {
if(!uniqueColorList.contains(color))
{
uniqueColorList.add(color);
}
}
Iterator<String> itr=uniqueColorList.iterator();
while(itr.hasNext())
{
System.out.println(itr.next());
}
}
}
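public class IgnoreCaseStringList extends ArrayList<String> {
@Override
public boolean contains(Object o) {
// Only Strings can match; this also avoids a ClassCastException or NullPointerException
if (!(o instanceof String)) return false;
String paramStr = (String)o;
for (String s : this) {
if (paramStr.equalsIgnoreCase(s)) return true;
}
return false;
}
}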

It doesn't work because you start by adding Black which is neither uppercase nor lowercase.
You could just decide to add only uppercase or only lowercase strings. Better yet, use a TreeSet if insertion order doesn't matter to you; a TreeSet keeps its elements sorted alphabetically.

The reason is that you only check for the all-lowercase and all-uppercase forms.
When you check BLACK, the list already contains Black, but "Black" equals neither "black" nor "BLACK".
Only its first character is uppercase.

First off, this part:
uniqueColorList.contains(color)
won't catch the duplicates, because contains() uses .equals() on the strings, which is case-sensitive. Your next problem is that the lowercase and uppercase checks don't help either, because the first entry is mixed case. What it boils down to is that WHITE and White are technically distinct strings. The easiest option is to store everything in a single case, as per ZouZou's suggestion. Otherwise you need to do what Domi said and implement your own contains method that does a case-insensitive check.

Set<String> duplicateDetection = new HashSet<>();
if (duplicateDetection.add(color.toLowerCase())) {
uniqueColorList.add(color);
}
If you remove items from the list, you also need to remove them from the duplicate-detection set:
uniqueColorList.remove(color);
duplicateDetection.remove(color.toLowerCase());

You only check whether the list being built already contains the same-case, uppercase, or lowercase variant of the element to be added, so the search is not exhaustive. If the list already contains the string in a different case combination, the condition passes and the color is added.

Try this..
public static void main(String[] a) {
String[] colorArray = {"Black", "BLACK", "black", "Cobalt", "COBALT", "cobalt", "IVORY", "Ivory", "ivory", "White", "WHITE", "white"};
List<String> uniqueColorList = new ArrayList<String>();
for (int i = 0; i < colorArray.length; i++) {
for (int j = i+1; j < colorArray.length; j++) {
if (!colorArray[i].equals("")) {
if (colorArray[i].equalsIgnoreCase(colorArray[j])) {
colorArray[j] = "";
}
}
}
}
System.out.println(Arrays.toString(colorArray));
for (String color : colorArray) {
if (!color.equals("")) {
uniqueColorList.add(color);
}
}
Iterator<String> itr = uniqueColorList.iterator();
while (itr.hasNext()) {
System.out.println(itr.next());
}
}

I want to avoid adding case sensitive & case insensitive duplicates.
You have to convert all Strings to either lowercase or uppercase before you compare them, because String comparison is case-sensitive.
Alternatively, you can use equalsIgnoreCase().
But a better and easier way is to use a Set, since it keeps unique values only.
Set<String> uniqueVal = new HashSet<String>();
for (String color : colorArray) {
uniqueVal.add(color.toLowerCase());
}
You can convert the Set into a List again:
List<String> uniqueList=new ArrayList<>(uniqueVal);
// Now list contains unique values only

public static void main(String args[]) {
String[] colorArray={"Black","BLACK","black","Cobalt","COBALT","cobalt","IVORY","Ivory","ivory","White","WHITE","white"};
Set<String> uniqueColorList = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
for (String color : colorArray) {
uniqueColorList.add(color);
}
Iterator<String> itr=uniqueColorList.iterator();
while(itr.hasNext())
{
System.out.println(itr.next());
}
}
This will solve your problem.
Output:
Black
Cobalt
IVORY
White

Problem finding empty elements in comma-separated data

import java.util.ArrayList;
import java.util.StringTokenizer;
public class test {
public static void main(String args[]) {
ArrayList<String> data = new ArrayList<String>();
String s = "a,b,c,d,e,f,g";
StringTokenizer st = new StringTokenizer(s,",");
while(st.hasMoreTokens()){
data.add(st.nextToken());
}
System.out.println(data);
}
}
Problem finding empty elements in CSV data:
The above code works well when the data is complete. If some data is missing, it fails to detect the empty fields.
Example:
Complete data: a,b,c,d,e,f,g
If a, d, e and g are removed:
New data: ,b,c,,,f,
Four values are missing.
I need a way to put this data into an ArrayList with null or "" values for the empty fields.

You can use Guava Splitter to do that:
import com.google.common.base.Splitter;
import java.util.List;
public class Example
{
private static final Splitter SPLITTER = Splitter.on(",").trimResults();
public List<String> split(String singleLine) {
// splitToList returns a List; split() returns an Iterable<String>
return SPLITTER.splitToList(singleLine);
}
}

I'm sure there are more elegant solutions, but a simple one would be to use the String.split() method:
public static void main(String args[]) {
ArrayList<String> data = new ArrayList<String>();
String s = ",b,c,,,f,";
//create an array of strings, using "," as a delimiter
//if there is no letter between commas, an empty string will be
//placed in strings[] instead
String[] strings = s.split(",", -1);
for (String ss : strings) {
data.add(ss);
}
System.out.println(data);
}
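The -1 limit is what makes the difference here. As a small sketch (the class name SplitLimitDemo is invented for illustration), the three approaches can be compared on the sample line from the question:

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class comparing tokenizing strategies.
class SplitLimitDemo {
    // Splits on commas and keeps every empty field, including trailing ones.
    static List<String> splitKeepingEmpties(String line) {
        return Arrays.asList(line.split(",", -1));
    }

    public static void main(String[] args) {
        String s = ",b,c,,,f,";
        // StringTokenizer skips empty fields entirely.
        System.out.println(new StringTokenizer(s, ",").countTokens()); // prints 3
        // split(",") without a limit drops trailing empty strings.
        System.out.println(s.split(",").length); // prints 6
        // A negative limit keeps all seven fields.
        System.out.println(splitKeepingEmpties(s).size()); // prints 7
    }
}
```

Each missing value ends up as an empty string in the result, which can be mapped to null afterwards if that is preferred.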

Convert ArrayList to String

I have an ArrayList and I need to convert it to a single String.
Each value in the String should be inside quotation marks, and the values should be separated by commas, something like this:
ArrayList list = [a,b,c]
String s = "'a','b','c'";
I am looking for an efficient solution.

You can follow these steps:
Create an empty StringBuilder instance:
StringBuilder builder = new StringBuilder();
Iterate over your list.
For each element, append its quoted representation to your StringBuilder instance:
builder.append("'").append(eachElement).append("', ");
After the loop, a trailing comma and space are left over and need to be removed. You can use StringBuilder.replace() or StringBuilder.setLength() to cut off those last two characters.
You can take a look at the documentation of StringBuilder to learn more about the methods you can use.
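The steps above can be sketched roughly as follows. The class and method names are made up for this example, and setLength() is just one of several ways to trim the trailing separator.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; names are assumptions for this sketch.
class QuoteJoin {
    // Wraps each item in single quotes and joins the items with ", ".
    static String quoteAndJoin(List<String> items) {
        StringBuilder builder = new StringBuilder();
        for (String item : items) {
            builder.append("'").append(item).append("', ");
        }
        // Drop the trailing ", " (two characters) unless the list was empty.
        if (builder.length() > 0) {
            builder.setLength(builder.length() - 2);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteAndJoin(Arrays.asList("a", "b", "c"))); // prints 'a', 'b', 'c'
    }
}
```

On Java 8 or later, a stream with Collectors.joining("', '", "'", "'") is another option for non-empty lists.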

Take a look at the documentation of StringBuilder and StringBuffer.

Maybe an overkill here but providing a more functional approach through Guava:
import com.google.common.base.Function;
import com.google.common.base.Joiner;
import com.google.common.collect.Collections2;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
public class Main {
public static void main(String ... args){
List<String> list = new ArrayList<String>(){{add("a");add("b");add("c");}};
Collection<String> quotedList = Collections2.transform(list,new Function<String, String>() {
@Override
public String apply(String s) {
return "'"+s+"'";
}
});
System.out.println(Joiner.on(",").join(quotedList));
}
}

Use the StringUtils class from Apache Commons Lang (org.apache.commons.lang3.StringUtils):
StringUtils.join(list, ", ");
or
String s = (!list.isEmpty())? "'" + StringUtils.join(list , "', '")+ "'":null;

how to remove duplicate array elements using hashmap in java

How can I remove duplicate elements from an array using a HashMap, without using a HashSet, in Java? The code below removes the duplicates using a second list.
Now I need to write it using a HashMap that holds key/value pairs.
import java.util.*;
class TestArray{
public static void main(String arg[])
{
ArrayList<String> wordDulicate = new ArrayList<String>();
wordDulicate.add("chennai");
wordDulicate.add("bangalore");
wordDulicate.add("hyderabad");
wordDulicate.add("delhi");
wordDulicate.add("bangalore");
wordDulicate.add("mumbai");
wordDulicate.add("mumbai");
wordDulicate.add("goa");
wordDulicate.add("calcutta");
wordDulicate.add("hyderabad");
ArrayList<String> nonDupList = new ArrayList<String>();
Iterator<String> dupIter = wordDulicate.iterator();
while(dupIter.hasNext())
{
String dupWord = dupIter.next();
if(nonDupList.contains(dupWord))
{
dupIter.remove();
}else
{
nonDupList.add(dupWord);
}
}
System.out.println(nonDupList);
}
}

A HashSet is implemented in terms of a HashMap anyway. If you specifically want to use a HashMap, use it the same way as HashSet does: use a dummy constant new Object() as the map value everywhere.

Well a HashMap will prevent you from entering duplicate keys, the same way as HashSet. Actually, many implementations of HashSet just use a HashMap under the hood.
So you can do:
HashMap<String, String> map = new HashMap<String, String>();
for (String s : wordDulicate) // the list from the question
map.put( s, s );
Now map.keySet() contains each word exactly once, and you can access the keys and values as with any other HashMap.
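A plain HashMap does not remember insertion order, so if the de-duplicated list should keep the order of the input, the map can be used only for lookups. The following is a minimal sketch under that assumption; the class name MapDedup and the choice to store occurrence counts as values are mine, not part of the answers above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are assumptions for this sketch.
class MapDedup {
    // Returns the words without duplicates and fills counts with occurrences per word.
    static List<String> removeDuplicates(List<String> words, Map<String, Integer> counts) {
        List<String> unique = new ArrayList<>();
        for (String word : words) {
            // merge() returns the updated count; 1 means the key was new.
            if (counts.merge(word, 1, Integer::sum) == 1) {
                unique.add(word);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("chennai", "bangalore", "hyderabad", "delhi",
                "bangalore", "mumbai", "mumbai", "goa", "calcutta", "hyderabad");
        Map<String, Integer> counts = new HashMap<>();
        System.out.println(removeDuplicates(words, counts));
        // prints [chennai, bangalore, hyderabad, delhi, mumbai, goa, calcutta]
        System.out.println(counts.get("mumbai")); // prints 2
    }
}
```

Storing the count as the value gives the key/value pairs the question asks for, and the counts also show which words were duplicated.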

import java.util.HashSet;
import java.util.Stack;
public class stackdupes {
public static void main(String[] args) {
Stack<Integer> st = new Stack<Integer>();
int[] arr= {1,2,3,3,4,5,5,7};
HashSet<Integer> set = new HashSet<Integer>();
for (int i=0;i<arr.length;i++) {
if(set.add(arr[i]) == true)
st.push(arr[i]);
}
System.out.println(st);
}
}
