I read data from a text file, so there may be:
John
Mary
John
Leeds
I now need to get 3 unique elements in the ArrayList, because there are only 3 unique values in the file output (as above).
I can use a HashTable and add information to it, then simply copy its data into the List.
Are there other solutions?
Why do you need to store it in a List? Do you actually require the data to be ordered or support index-based look-ups?
I would suggest storing the data in a Set. If ordering is unimportant you should use HashSet. However, if you wish to preserve ordering you could use LinkedHashSet.
If you have a List containing duplicates, and you want a List without, you could do:
List<String> newList = new ArrayList<String>(new HashSet<String>(oldList));
That is, wrap the old list into a set to remove duplicates and wrap that set in a list again.
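If the order in which the values were read matters, the same wrapping trick can be done with a LinkedHashSet, as suggested above. This is only a minimal sketch; the class name OrderedDedup and the method name dedupe are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedup {
    // Returns a new list without duplicates, keeping the first-seen order.
    // (Hypothetical helper, not part of any library.)
    static <T> List<T> dedupe(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("John", "Mary", "John", "Leeds");
        System.out.println(dedupe(names)); // prints [John, Mary, Leeds]
    }
}
```

With a plain HashSet the result would hold the same three values, but their order would not be guaranteed.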
You can check list.contains() before adding.
if(!list.contains(value)) {
list.add(value);
}
I guessed it would be obvious! However, adding items to a HashSet and then creating a list from this set would be more efficient.
Use a set instead of a list. Take a look at the Java Collections Tutorials, and specifically the Java Sets Tutorial.
In a nutshell, sets contain one of something. Perfect :)
Here is how I solved it:
import groovy.io.*;
def arr = ["5", "5", "7", "6", "7", "8", "0"]
List<String> uniqueList = new ArrayList<String>(new HashSet<String>( arr.asList() ));
System.out.println( uniqueList )
Another approach is to use distinct() on a Java 8 stream:
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
// public static void main(String args[]) ...
// list of strings, including some nulls and blanks as well ;)
List<String> list = Arrays.asList("John", "Mary", "John", "Leeds",
null, "", "A", "B", "C", "D", "A", "A", "B", "C", "", null);
// collect distinct without duplicates
List<String> distinctElements = list.stream()
.distinct()
.collect(Collectors.toList());
// unique elements
System.out.println(distinctElements);
Output:
[John, Mary, Leeds, null, , A, B, C, D]
List<String> distinctElements = list.stream()
.distinct().filter(s -> s != null && !s.isEmpty()) // compare content with isEmpty(), not references with !=
.collect(Collectors.toList());
This collects the distinct items and also skips null and empty strings.
class HashSetList<T extends Object>
extends ArrayList<T> {
private HashSet<Integer> _this = new HashSet<>();
@Override
public boolean add(T obj) {
if (_this.add(obj.hashCode())) {
super.add(obj);
return true;
}
return false;
}
}
I now use this kind of structure for small programs: for little overhead you keep the list's getters and setters and gain uniqueness. You can also override hashCode to decide whether one item equals another. Note that only hash codes are compared here, so two different objects whose hash codes collide are treated as duplicates.
I have defined a list like the one below
List<String> list = List.of("val1", "val2", "val3");
Now I have the below String
String myStr = "rel1,wel12,val1";
Now I need to check whether the String contains any one of the elements of the list (in the above case it does, as it contains val1), and then get that value into a variable.
I have tried the below and it works, but I'm sure there is a better way to do that using any of the Collections libraries
List<String> list = List.of("val1", "val2", "val3");
String myStr = "rel1,wel12,val1";
String matchedStr =StringUtils.EMPTY;
String[] vals = myStr.split(",");
for(String val:vals) {
if(list.contains(val)){
matchedStr=val;
break;
}
}
You can use Java Streams to get the first String that matches:
Optional<String> result = Stream.of(vals).filter(list::contains).findFirst();
Your way is alright if the lists aren't too big. I am considering the string as a list too because it can be made one from splitting it as you've done already. You can make a set from the bigger one, and iterate on the smaller one.
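To make the "set from the bigger one, iterate on the smaller one" idea concrete, here is a rough sketch. The class name FirstMatch and the method name firstCommon are invented for this example, and it assumes you only want the first match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class FirstMatch {
    // Builds a HashSet from the bigger list once, then scans the smaller one.
    // (Hypothetical helper for illustration.)
    static Optional<String> firstCommon(List<String> bigger, List<String> smaller) {
        Set<String> lookup = new HashSet<>(bigger);
        for (String candidate : smaller) {
            if (lookup.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("val1", "val2", "val3");
        List<String> parts = Arrays.asList("rel1,wel12,val1".split(","));
        System.out.println(firstCommon(list, parts)); // prints Optional[val1]
    }
}
```

For lists as small as the ones in the question the plain loop is just as good; the set only pays off when the bigger list is large.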
Another way to go would be to find the intersection of two lists.
List<String> list = Arrays.asList("red", "blue", "blue", "green", "red");
List<String> otherList = Arrays.asList("red", "green", "green", "yellow");
Now we can find the intersection:
Set<String> result = list.stream()
.distinct()
.filter(otherList::contains)
.collect(Collectors.toSet());
The result should contain "red" and "green".
Some more info on collectors.
Depending on the possible values in the problem domain, there may be no need to split the input string. Just call String#contains.
If so, you can flip your logic. Rather than loop on the split parts, loop on the target list of strings. For each string in the list, ask if your unprocessed input string contains that element. If yes, bail out of the loop.
Tip: If this code is in a method returning a string, and returns null if no match was found, learn about returning an Optional.
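A minimal sketch of the flipped loop described above, returning an Optional instead of null. The names ContainsScan and firstContained are made up for this example. It assumes a plain substring match is acceptable, which means a target such as "val1" would also match an input part such as "val12".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ContainsScan {
    // Loops over the target list and asks the unsplit input whether it contains each element.
    // (Hypothetical helper; substring semantics, so "val1" also matches inside "val12".)
    static Optional<String> firstContained(String input, List<String> targets) {
        for (String target : targets) {
            if (input.contains(target)) {
                return Optional.of(target); // bail out on the first hit
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("val1", "val2", "val3");
        System.out.println(firstContained("rel1,wel12,val1", list)); // prints Optional[val1]
    }
}
```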
I would favour Zobayer's answer, or using List#retainAll.
final List<String> first = List.of("one", "two", "three");
final List<String> out = new ArrayList<>(Arrays.asList("five,four,three".split(",")));
out.retainAll(first);
out contains the single entry, "three".
I have a simple LinkedList that contains strings.
LinkedList<String> list = new LinkedList<String>();
list.add("A, B, C, D");
list.add("R");
list.add("A");
list.add("C, D");
So, our LinkedList is: [ "A, B, C, D", "R", "A" ,"C, D" ]
As you can see, "A" and "C, D" are already contained in "A,B,C,D".
What is the most efficient way to remove the contained strings?
First, you can use the contains() method before adding new values (as long as you're adding a single String every time, which you are not...).
Second, it seems like this "problem" can easily be avoided if you change the way you add the strings, or drop the LinkedList restriction.
Anyway, this is a simple method that might suit your needs:
private void deleteIfContains(LinkedList<String> list, String str) {
Iterator<String> headIterator = list.iterator();
HashMap<Integer, String> newValues = new HashMap<>();
int index = 0;
while (headIterator.hasNext()) {
String headString = headIterator.next();
if (headString.contains(str)) {
headIterator.remove();
//replace method won't handle ','..you will need to use regex for it
newValues.put(index, headString.replace(str, ""));
}
index++;
}
//Avoid ConcurrentModificationException
for (int i : newValues.keySet()) {
list.add(i, newValues.get(i));
}
}
I would suggest you use a Set instead, but then you would have to store every letter in its own String variable (maybe you should use Character?).
If you really want to stick to your own idea, consider implementing your own Set. But first figure out what should happen in this situation:
LinkedList<String> list = new LinkedList<String>();
list.add("A, B, C, D");
list.add("C, E");
C should be rejected but what about E?
As @nikowis says, the best solution depends on the problem definition.
If the values are the elements "A", "B", "C", "D", ..., the most efficient solution (in computation time) may be to transform the list into a List<Set<String>> or a single Set<String>.
If the values are substrings, for example "C, E" is ONE value (and not the two values "C" and "E"), you can use a trie of substrings (https://en.wikipedia.org/wiki/Trie). It can find the presence of a substring in the trie very quickly (O(N), with N the length of the string to add).
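For the first case (the values are the individual elements), a possible sketch of the "single Set" transformation looks like this. The class name FlattenToSet and the method name flatten are assumptions for the example, and it assumes the elements are separated by commas with optional spaces, as in the question.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FlattenToSet {
    // Splits every entry on commas and collects the trimmed parts into one set.
    // (Hypothetical helper; LinkedHashSet keeps the first-seen order.)
    static Set<String> flatten(List<String> entries) {
        Set<String> elements = new LinkedHashSet<>();
        for (String entry : entries) {
            for (String part : entry.split(",")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    elements.add(trimmed); // duplicates are silently ignored by the set
                }
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("A, B, C, D", "R", "A", "C, D");
        System.out.println(flatten(list)); // prints [A, B, C, D, R]
    }
}
```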
Convert the CSV-format strings to individual string values, then store them as set elements. If the add() method returns false, the value is already present.
String[] values = csvStr1.split(",");
Set<String> hashSet = new HashSet<String>(Arrays.asList(values));
String[] values2 = csvStr2.split(",");
for (String value : values2) {
    // Set.add returns false when the element was already in the set
    if (!hashSet.add(value)) {
        // value already present. Ignore this or do whatever you want.
    }
}
I have 2 text files with data. I am reading these files with BufferedReader and putting the data of one column per file in a List<String>.
I have duplicated data in each one, but I need to have unique data in the first List to compare with the duplicated data in the second List.
How can I get unique values from a List?
It can be done in one line by using an intermediate Set:
List<String> uniqueList = new ArrayList<>(new HashSet<>(list));
In Java 8, use distinct() on a stream:
List<String> uniqueList = list.stream().distinct().collect(Collectors.toList());
Alternatively, don't use a List at all; just use a Set (like HashSet) from the start for the collection you only want to hold unique values.
Convert the ArrayList to a HashSet.
List<String> listWithDuplicates; // Your list containing duplicates
Set<String> setWithUniqueValues = new HashSet<>(listWithDuplicates);
If for some reason, you want to convert the set back to a list afterwards, you can, but most likely there will be no need.
List<String> listWithUniqueValues = new ArrayList<>(setWithUniqueValues);
In Java 8:
// List with duplicates
List<String> listAll = Arrays.asList("A", "A", "B", "C", "D", "D");
// filter the distinct
List<String> distinctList = listAll.stream()
.distinct()
.collect(Collectors.toList());
System.out.println(distinctList);// prints out: [A, B, C, D]
This will also work with objects, but you will probably have to adapt your equals method.
I just worked out a solution; maybe it will be helpful to others.
first will be populated with the duplicated values from the BufferedReader.
ArrayList<String> first = new ArrayList<String>();
To extract the unique values I just create a new ArrayList, as below:
ArrayList<String> otherList = new ArrayList<>();
for(String s : first) {
if(!otherList.contains(s))
otherList.add(s);
}
A lot of posts on the internet talk about assigning my ArrayList to a List, Set, Hashtable or TreeSet.
Can anyone explain the difference in theory, and which one is best to use in practice?
Thanks for your time, guys.
I was wondering if it is possible to do a select * where value is in Java String[] or do I need to build a string of values from the string array first?
I am trying to find the best way of doing this technically without singular selects or building a string of values from the array.
Thanks for your time.
String[] lookupValues = new String[]{"1", "2", "3", "4"};
Select * from database where value in (lookupvalues)
Try Arrays.toString(lookupValues) or write a utility function:
public String toString(String[] values, boolean addQuotes) {
StringBuilder buff = new StringBuilder();
for(String value : values) {
if(addQuotes) {
buff.append(String.format("\"%s\"", value)).append(",");
} else {
buff.append(value).append(",");
}
}
buff.deleteCharAt(buff.length()-1); // delete the last comma
return buff.toString();
}
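As an alternative to pasting the values into the SQL text, one could build a matching number of ? placeholders and bind the values afterwards (for example with PreparedStatement.setString). This is only a sketch of the string-building part; the class name InClause, the method name placeholders and the table name my_table are all hypothetical.

```java
import java.util.Collections;

class InClause {
    // Builds "?, ?, ?" with one placeholder per lookup value.
    // (Hypothetical helper; count must be at least 1 to yield a valid IN list.)
    static String placeholders(int count) {
        return String.join(", ", Collections.nCopies(count, "?"));
    }

    public static void main(String[] args) {
        String[] lookupValues = new String[]{"1", "2", "3", "4"};
        // "my_table" is a made-up table name for this example
        String sql = "SELECT * FROM my_table WHERE value IN ("
                + placeholders(lookupValues.length) + ")";
        System.out.println(sql); // prints SELECT * FROM my_table WHERE value IN (?, ?, ?, ?)
    }
}
```

Each value would then be bound to its placeholder by position, which avoids having to quote or escape the values by hand.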
Are you looking to search for a string in an array? If yes, you might want to look at Collections.binarySearch() or Arrays.binarySearch(), but please note that the collection or array needs to be sorted before doing a binary search.
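A small sketch of the binary search route, assuming the lookup happens in Java rather than in the database. The class name SortedLookup and the method name containsSorted are invented for the example.

```java
import java.util.Arrays;

class SortedLookup {
    // Sorts a copy of the array, then uses Arrays.binarySearch on it.
    // (Hypothetical helper; binarySearch returns a negative value when the key is absent.)
    static boolean containsSorted(String[] values, String key) {
        String[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        return Arrays.binarySearch(sorted, key) >= 0;
    }

    public static void main(String[] args) {
        String[] lookupValues = new String[]{"4", "2", "3", "1"};
        System.out.println(containsSorted(lookupValues, "3")); // prints true
        System.out.println(containsSorted(lookupValues, "5")); // prints false
    }
}
```

Sorting on every call defeats the purpose for repeated lookups; in that case sort once and keep the sorted array around.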
I know that this is not a direct answer to your question considering only arrays. I suggest using Guava, since it offers an implementation of filter for Java Collections. Using proven libraries gives you flexibility if you use different kinds of filters.
You can easily convert your array into a proper collection:
String[] lookupValues = new String[]{"1", "2", "3", "4"};
List<String> list = new ArrayList<String>(Arrays.asList(lookupValues));
And filter it with Guava filters:
Collection<String> filtered = Collections2.filter(list,
Predicates.containsPattern("3"));
print(filtered);
will find
3
If you want the filtered collection as a list, you can use something like this from Guava:
List<String> filteredList = Lists.newArrayList(Collections2.filter(
list, Predicates.containsPattern("3")));
In Java, what's the most efficient way to return the common elements from two String Arrays? I can do it with a pair of for loops, but that doesn't seem to be very efficient. The best I could come up with was converting to a List and then applying retainAll, based on my review of a similar SO question:
List<String> compareList = Arrays.asList(strArr1);
List<String> baseList = Arrays.asList(strArr2);
baseList.retainAll(compareList);
EDITED:
This is a one-liner:
compareList.retainAll(new HashSet<String>(baseList));
The retainAll impl (in AbstractCollection) iterates over this, and uses contains() on the argument. Turning the argument into a HashSet will result in fast lookups, so the loop within the retainAll will execute as quickly as possible.
Also, the name baseList hints at it being a constant, so you will get a significant performance improvement if you cache this:
static final Set<String> BASE = Collections.unmodifiableSet(new HashSet<String>(Arrays.asList("one", "two", "three", "etc")));
static void retainCommonWithBase(Collection<String> strings) {
strings.retainAll(BASE);
}
If you want to preserve the original List, do this:
static List<String> retainCommonWithBase(List<String> strings) {
List<String> result = new ArrayList<String>(strings);
result.retainAll(BASE);
return result;
}
Sort both arrays.
Once sorted, you can iterate both sorted arrays exactly once, using two indexes.
This will be O(NlogN).
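A possible sketch of the sort-then-walk approach, assuming the arrays contain no nulls and that each common value should be reported once. The class name SortedIntersection and the method name common are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortedIntersection {
    // Sorts copies of both arrays, then walks them once with two indexes.
    // (Hypothetical helper; assumes no null elements.)
    static List<String> common(String[] a, String[] b) {
        String[] x = a.clone();
        String[] y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);
        List<String> result = new ArrayList<>();
        int i = 0, j = 0;
        while (i < x.length && j < y.length) {
            int cmp = x[i].compareTo(y[j]);
            if (cmp < 0) {
                i++;          // x[i] is smaller, advance in the first array
            } else if (cmp > 0) {
                j++;          // y[j] is smaller, advance in the second array
            } else {
                // equal: record the value once, even if it repeats in the inputs
                if (result.isEmpty() || !result.get(result.size() - 1).equals(x[i])) {
                    result.add(x[i]);
                }
                i++;
                j++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] strings1 = {"a", "b", "b", "c"};
        String[] strings2 = {"b", "c", "c", "d"};
        System.out.println(common(strings1, strings2)); // prints [b, c]
    }
}
```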
I would use HashSets (and retainAll) then, which would make the whole check O(n) (for each element in the first set lookup if it exists (contains()), which is O(1) for HashSet). Lists are faster to create though (HashSet might have to deal with collisions...).
Keep in mind that Set and List have different semantics (lists allow duplicate elements, nulls...).
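For comparison, a minimal sketch of the HashSet plus retainAll idea from this answer. The class name SetIntersection and the method name common are hypothetical. Because the result is a Set, duplicates and the original order are lost.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetIntersection {
    // Copies the first array into a HashSet and keeps only what the second set contains.
    // (Hypothetical helper; each contains() lookup on a HashSet is expected O(1).)
    static Set<String> common(String[] first, String[] second) {
        Set<String> result = new HashSet<>(Arrays.asList(first));
        result.retainAll(new HashSet<>(Arrays.asList(second)));
        return result;
    }

    public static void main(String[] args) {
        String[] strArr1 = {"a", "b", "b", "c"};
        String[] strArr2 = {"b", "c", "c", "d"};
        Set<String> shared = common(strArr1, strArr2);
        System.out.println(shared.size());        // prints 2
        System.out.println(shared.contains("b")); // prints true
    }
}
```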
retainAll is not supported by the fixed-size list that Arrays.asList returns. Copy it into an ArrayList, or use a set instead:
import java.util.*;
public class Main {
public static void main(String[] args) {
String[] strings1={"a","b","b","c"},strings2={"b","c","c","d"};
List<String> list=Arrays.asList(strings1);
//list.retainAll(Arrays.asList(strings2)); // throws UnsupportedOperationException
//System.out.println(list);
Set<String> set=new LinkedHashSet<String>(Arrays.asList(strings1));
set.retainAll(Arrays.asList(strings2));
System.out.println(set);
}
}
What you want is called intersection.
See that:
Intersection and union of ArrayLists in Java
Using a hash-based collection provides a much faster contains() method, particularly for strings, which have an optimized hashCode.
If you can import libraries you can consider using the Sets.intersection of Guava.
Edit:
Didn't know about the retainAll method.
Note that the AbstractCollection implementation, which does not seem to be overridden for HashSet and LinkedHashSet, is:
public boolean retainAll(Collection c) {
boolean modified = false;
Iterator it = iterator();
while (it.hasNext()) {
if (!c.contains(it.next())) {
it.remove();
modified = true;
}
}
return modified;
}
Which means you call contains() on the collection parameter!
Which means that if you pass a List parameter, you get an equals call on many items of the list for every iteration!
This is why I don't think the above implementations using retainAll are good.
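public <T> List<T> intersection(List<T> list1, List<T> list2) {
    boolean firstIsBigger = list1.size() > list2.size();
    List<T> big = firstIsBigger ? list1 : list2;
    // the smaller list becomes the HashSet used for the contains() lookups
    Set<T> small = firstIsBigger ? new HashSet<T>(list2) : new HashSet<T>(list1);
    big.retainAll(small); // retainAll returns a boolean and modifies the bigger list in place
    return big;
}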
I chose to build the Set from the smaller list because the set is then faster to construct, and a big list iterates pretty well...
Notice that one of the original list parameters may be modified; it's up to you to make a copy...
I was asked this question during a technical interview. My answer was the following lines of code:
public static void main(String[] args) {
String[] temp1 = {"a", "b", "c"};
String[] temp2 = {"c", "d", "a", "e", "f"};
String[] temp3 = {"b", "c", "a", "a", "f"};
ArrayList<String> list1 = new ArrayList<String>(Arrays.asList(temp1));
System.out.println("list1: " + list1);
ArrayList<String> list2 = new ArrayList<String>(Arrays.asList(temp2));
System.out.println("list2: " + list2);
ArrayList<String> list3 = new ArrayList<String>(Arrays.asList(temp3));
System.out.println("list3: " + list3);
list1.retainAll(list2);
list1.retainAll(list3);
for (String str : list1)
System.out.println("Commons: " + str);
}
Output:
list1: [a, b, c]
list2: [c, d, a, e, f]
list3: [b, c, a, a, f]
Commons: a
Commons: c