How to count unique elements in the array? Need only idea - java

For example: String[] str = {"M1","M1","M1","M2","M3"};
The most commonly recommended answer is HashSet. Which of its methods should I use, or do you have a better idea?

Unless you want to implement this yourself, a Set is the way to go. A set will only allow unique elements to be added and will automatically filter duplicates.
The HashSet functionality works as follows:
The hash is computed for the object. Next the set checks if any of the objects with the same hash-value .equals() the new value. If so, the new value is ignored. If not, it is added to the set.
If you add everything to the set and then ask for its size, you will get the number of unique elements.

new HashSet<>(Arrays.asList(str)).size();
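As a slightly fuller sketch of the same idea, the following wraps the one-liner in a method and also shows that add() reports whether an element was new. The class name UniqueCounter and the method countUnique are illustrative names, not part of any library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative helper; the class and method names are made up for this sketch.
final class UniqueCounter {

    // Copies the array into a HashSet, which drops duplicates, and returns the set's size.
    static int countUnique(String[] values) {
        Set<String> unique = new HashSet<>(Arrays.asList(values));
        return unique.size();
    }

    public static void main(String[] args) {
        String[] str = {"M1", "M1", "M1", "M2", "M3"};
        System.out.println(countUnique(str)); // prints 3

        // add() returns true only when the element was not in the set yet.
        Set<String> set = new HashSet<>();
        System.out.println(set.add("M1")); // prints true
        System.out.println(set.add("M1")); // prints false
    }
}
```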

I prefer to use what the standard library already provides, which for your requirement is Set.
You can do the following:
Set<String> set = new HashSet<String>(Arrays.asList(str));
set.size();

You can try this too
String[] str = {"M1","M1","M1","M2","M3"};
HashMap<String,String> map=new HashMap<>();
for(String i:str){
map.put(i, i);
}
System.out.println(map.keySet().size());

Instead of creating a temporary list as in other answers, you can also use:
Set<String> set = new HashSet<> ();
Collections.addAll(set, str);
int countUnique = set.size();
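If Java 8 or later is available, the Stream API is another way to avoid the temporary list. This is only a sketch (the class name DistinctCount is made up); distinct() compares elements with equals(), so it gives the same count as the Set-based versions.

```java
import java.util.Arrays;

// Illustrative class name; only java.util.Arrays and the Stream API are used.
final class DistinctCount {

    // Streams the array, drops repeated elements, and counts what is left.
    static long countDistinct(String[] values) {
        return Arrays.stream(values).distinct().count();
    }

    public static void main(String[] args) {
        String[] str = {"M1", "M1", "M1", "M2", "M3"};
        System.out.println(countDistinct(str)); // prints 3
    }
}
```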

Related

How to convert an object to a set?

I have a set named "all" that contains Doc objects. Then I have another set called "partDocs" that contains other sets, each of which contains Doc objects. Let's say that "all" contains [Doc1, Doc2, Doc3]. I want partDocs to contain sets like [[Doc1], [Doc2], [Doc3]]. How can I do that?
I tried
Set<Doc> all = new HashSet<Doc>(); // contains [Doc1, Doc2, Doc3]
Set<Set<Doc>> partDocs = new HashSet<Set<Doc>>();
Set<Doc> set2 = new HashSet<Doc>();
for (i = 0; i < all.size(); i++) {
set2.clear();
set2.add(all.stream().toList().get(i)); // adds the i.th element of the all set
partDocs.add(set2);
}
However, when I do this, partDocs only has the last element of the all set, because set2 keeps changing and its last value is [Doc3].
I also tried the following, but its syntax is wrong:
for (i = 0; i < all.size(); i++) {
partDocs.add(Set<allDocuments.stream().toList().get(i)>);
}
Does anyone have any idea about how to implement this?
I don't know if you actually want to do this. It seems like very confusing code but try the following:
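Set<Set<Doc>> partDocs = new HashSet<Set<Doc>>();
Set<Doc> all = new HashSet<>(); // contains [Doc1, Doc2, Doc3]
for (Doc doc : all) {
    partDocs.add(Set.of(doc)); // Set.of (Java 9+) returns an immutable single-element set
}
And now you have a set (partDocs) that contains sets, each of which contains one Doc.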
The invalidation of set2 is flawed.
Instead of
set2.clear();
it should be
set2 = new HashSet<Doc>();
This particular bug may be easier to understand if we use a list for partDocs
List<Set<Doc>> partDocs = new ArrayList<Set<Doc>>();
which results in
[[Doc3], [Doc3], [Doc3]]
Essentially, we are adding the same Set to the List and since we are using the same reference, consecutive calls to set2.clear() and set2.add() will affect both the local variable and the Set in the List.
Going back to partDocs being a Set, it will end up with only one entry, because a Set guarantees to not contain duplicate values and three identical instances are of course duplicates.
So, to fix this bug, we must ensure that the three objects added to partDocs are truly distinct, which is accomplished (partly) by the new invalidation routine
set2 = new HashSet<Doc>();
and otherwise by the fact that we add a different element to it each time.
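The difference between the two variants can be reproduced with a small, self-contained sketch. String stands in for the Doc type from the question, and the class and method names (AliasingDemo, withClear, withNewSet) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; String stands in for the Doc type from the question.
final class AliasingDemo {

    // Buggy variant: one shared inner set is cleared and refilled on every pass.
    static List<Set<String>> withClear(List<String> all) {
        List<Set<String>> partDocs = new ArrayList<>();
        Set<String> set2 = new HashSet<>();
        for (String doc : all) {
            set2.clear();
            set2.add(doc);
            partDocs.add(set2); // adds the same reference again
        }
        return partDocs;
    }

    // Fixed variant: a fresh inner set is created on every pass.
    static List<Set<String>> withNewSet(List<String> all) {
        List<Set<String>> partDocs = new ArrayList<>();
        for (String doc : all) {
            Set<String> set2 = new HashSet<>();
            set2.add(doc);
            partDocs.add(set2);
        }
        return partDocs;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("Doc1", "Doc2", "Doc3");
        System.out.println(withClear(all));  // prints [[Doc3], [Doc3], [Doc3]]
        System.out.println(withNewSet(all)); // prints [[Doc1], [Doc2], [Doc3]]
    }
}
```

Collecting the result of withClear into a HashSet leaves a single entry, which matches the behaviour described in the question.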

How can we add (long, String) to an ArrayList?

I need to create a list with values of type - (long,String)
like -
ArrayList a = new ArrayList();
a.add(1L,branchName);
How can I do this? If I use a list, add() accepts only (int, String).
You should note that ArrayList's add(int,String) adds the String element in the given int index (if the index is valid). The int parameter is not part of the contents of the ArrayList.
Perhaps an ArrayList is not the correct choice for you. If you wish to map Long keys to String values, use Map<Long,String>.
Map<Long,String> a = new HashMap<> ();
a.put(1L,branchName);
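A short sketch may make the distinction clearer: with add(int, E) the int is a position, while with a Map the Long is stored as a key. The class name IndexVersusKey, its methods, and the branch names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative class; names and sample values are assumptions for this sketch.
final class IndexVersusKey {

    // add(int, E) inserts at a position; the int itself is never stored as data.
    static List<String> insertAtFront(List<String> names, String name) {
        List<String> copy = new ArrayList<>(names);
        copy.add(0, name);
        return copy;
    }

    // A Map stores the Long key together with its String value.
    static Map<Long, String> singleBranch(long id, String branchName) {
        Map<Long, String> branches = new HashMap<>();
        branches.put(id, branchName);
        return branches;
    }

    public static void main(String[] args) {
        System.out.println(insertAtFront(Arrays.asList("east"), "north")); // prints [north, east]
        System.out.println(singleBranch(1L, "north").get(1L));             // prints north
    }
}
```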
You can define a custom class, e.g.
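class IndexAndBranchName {
    final long index;
    final String branchName;
    IndexAndBranchName(long index, String branchName) {
        this.index = index;
        this.branchName = branchName;
    }
}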
and then add instances of this to the ArrayList:
ArrayList<IndexAndBranchName> a = new ArrayList<>();
a.add(new IndexAndBranchName(index, branchName));
Whether you use this approach or something like Eran's depends upon what you need to use the list for subsequently:
If you want to look "branches" up by index, use a Map; however, you can only store a single value per key; you could use a Guava Multimap or similar if you want multiple values per key.
If you simply want all of the index/branch name pairs, you can use this approach.
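The trade-off between the two approaches can be sketched as follows. Instead of a custom class, this example uses the JDK's AbstractMap.SimpleImmutableEntry as a ready-made pair; the class name PairsVersusMap and the sample data are assumptions for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative class; shows a list of pairs next to a map built from the same data.
final class PairsVersusMap {

    // A list of (long, String) pairs keeps every pair, even when indexes repeat.
    static List<Map.Entry<Long, String>> asPairs(long[] indexes, String[] names) {
        List<Map.Entry<Long, String>> pairs = new ArrayList<>();
        for (int i = 0; i < indexes.length; i++) {
            pairs.add(new AbstractMap.SimpleImmutableEntry<Long, String>(indexes[i], names[i]));
        }
        return pairs;
    }

    // A map keeps one value per key; a later put() replaces the earlier value.
    static Map<Long, String> asMap(long[] indexes, String[] names) {
        Map<Long, String> map = new HashMap<>();
        for (int i = 0; i < indexes.length; i++) {
            map.put(indexes[i], names[i]);
        }
        return map;
    }

    public static void main(String[] args) {
        long[] indexes = {1L, 1L, 2L};
        String[] names = {"north", "south", "east"};
        System.out.println(asPairs(indexes, names).size()); // prints 3
        System.out.println(asMap(indexes, names).size());   // prints 2
        System.out.println(asMap(indexes, names).get(1L));  // prints south
    }
}
```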
You can use the code below for your question.
A HashMap alone is probably the better option, but if you specifically need an ArrayList, you can store maps in it:
List<Map<Object, Object>> mylist = new ArrayList<Map<Object, Object>>();
Map<Object, Object> map = new HashMap<>();
map.put(1L, "BranchName");
mylist.add(map);

HashSet objects

I'm writing a piece of code which takes a great many objects and adds them to another array. The catch is, I don't want any duplicates. Is there a way I could use a HashSet to solve this problem?
public static Statistic[] combineStatistics(Statistic[] rptData, Statistic[] dbsData) {
HashSet<Statistic> set = new HashSet<Statistic>();
for (int i=0; i<rptData.length; i++) {
set.add(rptData[i]);
}
/*If there's no data in the database, we don't have anything to add to the new array*/
if (dbsData!=null) {
for (int j=0; j<dbsData.length;j++) {
set.add(dbsData[j]);
}
}
Statistic[] total=set.toArray(new Statistic[0]);
for (int workDummy=0; workDummy<total.length; workDummy++) {
System.out.println(total[workDummy].serialName);
}
return total;
}//end combineStatistics()
Properly implement equals(Object obj) and hashCode() on YourObject if you expect value equality instead of reference equality.
Set<YourObject> set = new HashSet<YourObject>(yourCollection);
or
Set<YourObject> set = new HashSet<YourObject>();
set.add(...);
then
YourObject[] array = set.toArray(new YourObject[0]);
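As a minimal sketch of what "properly implement equals and hashCode" can look like: the class Stat below is hypothetical, and it assumes that the serialName field (the one printed in the question's code) is enough to decide whether two statistics are the same. A real Statistic class may need to compare more fields.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class; assumes serialName alone identifies a statistic.
final class Stat {
    final String serialName;

    Stat(String serialName) {
        this.serialName = serialName;
    }

    // Two Stat objects are equal when their serialName values are equal.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Stat)) return false;
        return Objects.equals(serialName, ((Stat) obj).serialName);
    }

    // Must be consistent with equals(): equal objects return the same hash.
    @Override
    public int hashCode() {
        return Objects.hashCode(serialName);
    }

    public static void main(String[] args) {
        Set<Stat> set = new HashSet<>();
        set.add(new Stat("A-1"));
        set.add(new Stat("A-1")); // equal by value, so it is rejected
        set.add(new Stat("B-2"));
        System.out.println(set.size()); // prints 2
    }
}
```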
I think you should pay attention to:
1 - what to do if there is a duplicate in the original Collection? Use the first added to the array? Use the other(s)?
2 - You definitely need to implement equals and hashCode so that the set can tell which objects are duplicates
3 - Are you going to create a fixed size array and then won't add anymore objects? Or are you going to keep adding stuff?
You can use any kind of Set actually, but if you use LinkedHashSet, then you will have a defined iteration order (insertion order, like an array). HashSet won't guarantee any order, and TreeSet keeps its elements sorted in ascending order (natural ordering, or a Comparator you supply).
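The ordering difference is easy to see with a few strings. The class name SetOrdering and the sample values are made up; HashSet is left out of the printed output because its iteration order is unspecified.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative class comparing the iteration order of two Set implementations.
final class SetOrdering {

    // LinkedHashSet iterates in the order elements were first inserted.
    static Set<String> insertionOrder(List<String> input) {
        return new LinkedHashSet<>(input);
    }

    // TreeSet iterates in ascending natural order.
    static Set<String> sortedOrder(List<String> input) {
        return new TreeSet<>(input);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(insertionOrder(input)); // prints [pear, apple, fig]
        System.out.println(sortedOrder(input));    // prints [apple, fig, pear]
    }
}
```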
Depends on what you are referring to as a duplicate. If you mean an identical object, then you could use a List and simply check whether the List contains the object before adding it. Note that contains() compares with equals() and scans the whole list, so each check is O(n).
Object obj = new Object();
List<Object> list = new ArrayList<Object>();
if (!list.contains(obj)) {
list.add(obj);
}

In Java, How to remove duplication from an ArrayList<StringBuilder> efficiently?

I tried to use HashSet to remove the duplications from an ArrayList<StringBuilder>.
E.g. Here is an ArrayList, each line is a StringBuilder object.
"u12e5 u13a1 u1423"
"u145d"
"u12e5 u13a1 u1423"
"u3ab4 u1489"
I want to get the following:
"u12e5 u13a1 u1423"
"u145d"
"u3ab4 u1489"
My current implementation is:
static void removeDuplication(ArrayList<StringBuilder> directCallList) {
HashSet<StringBuilder> set = new HashSet<StringBuilder>();
for(int i=0; i<directCallList.size()-1; i++) {
if(set.contains(directCallList.get(i)) == false)
set.add(directCallList.get(i));
}
StringBuilder lastString = directCallList.get(directCallList.size()-1);
directCallList.clear();
directCallList.addAll(set);
directCallList.add(lastString);
}
But the performance becomes worse and worse as the ArrayList size grows. Is there any problem with this implementation? Or do you have any better ones in terms of performance?
StringBuilder doesn't override equals() or hashCode(). Two StringBuilders are only equal if they are the exact same object, so adding them to a HashSet won't exclude two different StringBuilder objects with identical content.
You should convert the StringBuilders to String objects.
Also, you should initialize your HashSet with an "initial capacity" in the constructor. This will help with the speed if you are dealing with large numbers of objects.
Lastly, it's not necessary to call contains() on the hashset before adding an object. Just add your Strings to the set, and the set will reject duplicates (and will return false).
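The points above can be checked with a small sketch. The class name BuilderEquality and its methods are illustrative; the initial capacity of twice the list size is an assumption chosen so that all elements fit under HashSet's default 0.75 load factor without rehashing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative class contrasting identity-based and content-based deduplication.
final class BuilderEquality {

    // HashSet<StringBuilder> compares by identity, so equal contents are not merged.
    static int distinctBuilders(List<StringBuilder> builders) {
        return new HashSet<>(builders).size();
    }

    // Converting to String first makes the set compare by content.
    static int distinctContents(List<StringBuilder> builders) {
        // Pre-sized set; no contains() call is needed because add() ignores duplicates.
        Set<String> seen = new HashSet<>(Math.max(16, builders.size() * 2));
        for (StringBuilder builder : builders) {
            seen.add(builder.toString());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<StringBuilder> list = Arrays.asList(
                new StringBuilder("u145d"), new StringBuilder("u145d"));
        System.out.println(distinctBuilders(list)); // prints 2
        System.out.println(distinctContents(list)); // prints 1
    }
}
```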
Let's analyze your method to find where we can improve it:
static void removeDuplication(ArrayList<StringBuilder> directCallList) {
HashSet<StringBuilder> set = new HashSet<StringBuilder>();
for(int i=0; i<directCallList.size()-1; i++) {
if(set.contains(directCallList.get(i)) == false)
set.add(directCallList.get(i));
}
This for loop repeats once for each element in the ArrayList. This seems unavoidable for the task at hand. However, since HashSet can only contain one of each item, the if statement is redundant. HashSet.add() does the exact same check again.
StringBuilder lastString = directCallList.get(directCallList.size()-1);
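I don't understand the need to get the lastString from your list and then add it separately. The loop above stops one element short (i < size()-1), which is why the last element is missing from the set; if the loop covered the whole list, the last element would already have been added to the HashSet.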
directCallList.clear();
Depending on the implementation of the list, this can take up to O(n) time because it might need to visit every element in the list.
directCallList.addAll(set);
Again, this takes O(n) time. If there are no duplicates, set contains the original items.
directCallList.add(lastString);
This line looks like a logic error: the last element is appended without any duplicate check, so it ends up in directCallList twice whenever an equal element is already in the set.
}
So overall, this algorithm takes O(n) time, but there is a constant factor of 3. If you can reduce this factor, you can improve the performance. One way to do this is to simply create a new ArrayList, rather than clearing the existing one.
Additionally, this removeDuplication() function can be written in one line if you use the correct constructors and return the ArrayList without duplicates:
static List<StringBuilder> removeDuplication(List<StringBuilder> inList) {
return new ArrayList<StringBuilder>(new HashSet<StringBuilder>(inList));
}
Of course, this still doesn't address the issues with StringBuilder that others have pointed out.
So you had some other options, but I like my solutions short, simple, and to the point. I've changed your method to no longer manipulate the parameter, but rather return a new List. I used a Set<String> to see if the contents of each StringBuilder was already included and returned the unique Strings. I also used a for each loop instead of accessing by index.
static List<StringBuilder> removeDuplication(List<StringBuilder> directCallList) {
HashSet<String> set = new HashSet<String>();
List<StringBuilder> returnList = new ArrayList<StringBuilder>();
for(StringBuilder builder : directCallList) {
if(set.add(builder.toString()))
returnList.add(builder);
}
return returnList;
}
As Sam states, StringBuilder does not override hashCode and equals, and so the Set will not work appropriately.
I think the answer is to wrap the Builder in an object that executes toString only once:
class Wrapper{
final String string;
final StringBuilder builder;
Wrapper(StringBuilder builder){
this.builder = builder;
this.string = builder.toString();
}
public int hashCode(){return string.hashCode();}
public boolean equals(Object o){return o instanceof Wrapper && string.equals(((Wrapper) o).string);}
}
public Set removeDups(List<StringBuilder> list){
Set<Wrapper> set = ...;
for (StringBuilder builder : list)
set.add(new Wrapper(builder));
return set;
}
The removeDups method could be updated to extract the builders from the set and return a List<StringBuilder>
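That update might look like the following sketch. It assumes a LinkedHashSet so that the first occurrence of each content is kept in its original position, and it uses an equals() that compares the cached strings of two wrappers. The outer class name BuilderDedup is made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative container class for the wrapper-based approach.
final class BuilderDedup {

    // Wraps a builder and caches its content so toString() runs only once.
    static final class Wrapper {
        final String string;
        final StringBuilder builder;

        Wrapper(StringBuilder builder) {
            this.builder = builder;
            this.string = builder.toString();
        }

        @Override
        public int hashCode() {
            return string.hashCode();
        }

        // Compares the cached content of two wrappers.
        @Override
        public boolean equals(Object o) {
            return o instanceof Wrapper && string.equals(((Wrapper) o).string);
        }
    }

    // Returns the first builder seen for each distinct content, in encounter order.
    static List<StringBuilder> removeDups(List<StringBuilder> list) {
        Set<Wrapper> set = new LinkedHashSet<>();
        for (StringBuilder builder : list) {
            set.add(new Wrapper(builder));
        }
        List<StringBuilder> result = new ArrayList<>(set.size());
        for (Wrapper wrapper : set) {
            result.add(wrapper.builder);
        }
        return result;
    }
}
```

Note that the cached string is a snapshot: if a builder is modified after it has been wrapped, the wrapper no longer reflects its content.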
As explained, StringBuilders don't override Object#equals (and they only became Comparable in Java 11).
Although using StringBuilders to concatenate your Strings is the way to go, I would suggest that once you are done with your concatenation, you should store the underlying strings (stringBuilder.toString()) instead of the StringBuilders in your list.
Removing duplicates then becomes a one-liner:
Set<String> set = new HashSet<String>(list);
Or even better, store the strings in the set directly if you don't need to know that there are duplicates.
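A sketch of that last suggestion, using the sample lines from the question: each line is built with a StringBuilder, but only the resulting String is stored. The class name LineCollector, the method collectLines, and the choice of LinkedHashSet (to keep first-seen order) are assumptions for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative class; stores finished strings directly in a set.
final class LineCollector {

    // Joins each group of tokens with a StringBuilder and stores only the resulting String.
    static Set<String> collectLines(String[][] tokenGroups) {
        Set<String> lines = new LinkedHashSet<>();
        for (String[] tokens : tokenGroups) {
            StringBuilder sb = new StringBuilder();
            for (String token : tokens) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(token);
            }
            lines.add(sb.toString()); // a repeated line is silently ignored
        }
        return lines;
    }

    public static void main(String[] args) {
        String[][] groups = {
            {"u12e5", "u13a1", "u1423"},
            {"u145d"},
            {"u12e5", "u13a1", "u1423"},
            {"u3ab4", "u1489"}
        };
        System.out.println(collectLines(groups)); // prints [u12e5 u13a1 u1423, u145d, u3ab4 u1489]
    }
}
```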

How to find the duplicate entries of an array of string and make them null by using HashMap

I have an array of strings. I want to find the duplicate strings in the array and set the duplicates to null, using a HashMap and with good time complexity.
Sounds like you want to use a Set. The loop below sets all duplicate entries to null, but you can also just create an array which has only the unique entries (and no null values):
String[] array =
Set<String> found = new LinkedHashSet<String>();
for(int i=0;i<array.length;i++)
if(!found.add(array[i]))
array[i] = null;
// just the entries without duplicates.
String[] unique = found.toArray(new String[found.size()]);
You don't actually need a map. Here's an example that uses a HashSet instead. (Assuming that you want the repeated strings "nulled" out.)
String[] strs = "aa,bb,cc,aa,xx,cc,dd".split(",");
Set<String> seen = new HashSet<String>();
for (int i = 0; i < strs.length; i++)
if (!seen.add(strs[i]))
strs[i] = null;
// Prints [aa, bb, cc, null, xx, null, dd]
System.out.println(Arrays.toString(strs));
You can do it in O(n) time, by iterating over your array once, sticking every new element into a HashSet and replacing array elements that are already in the HashSet with nulls.
Instead of a HashMap you can use a Set too. The steps are (you can work the details out yourself):
for every string in the array
if the string exists in the Map/Set null it
otherwise add it to the Map/Set
That's it.
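Since the question asked for a HashMap specifically, the steps above can be sketched with one. Here the map records the index at which each string was first seen, which is an illustrative choice of value; the class name NullDuplicates is also made up. Each lookup and insertion is expected O(1), so the whole pass is O(n).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative class; nulls out repeated strings in place using a HashMap.
final class NullDuplicates {

    // Keeps the first occurrence of each string and replaces later ones with null.
    static void nullOutDuplicates(String[] array) {
        Map<String, Integer> firstIndex = new HashMap<>();
        for (int i = 0; i < array.length; i++) {
            if (array[i] == null) {
                continue; // already null, nothing to do
            }
            if (firstIndex.containsKey(array[i])) {
                array[i] = null;               // seen before: null it
            } else {
                firstIndex.put(array[i], i);   // first time: remember where it was seen
            }
        }
    }

    public static void main(String[] args) {
        String[] strs = "aa,bb,cc,aa,xx,cc,dd".split(",");
        nullOutDuplicates(strs);
        System.out.println(Arrays.toString(strs)); // prints [aa, bb, cc, null, xx, null, dd]
    }
}
```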
