Is there any null free list data structure?

I am using a LinkedList, serverList, to store elements. At the moment it also accepts null, which is not what I want. Is there another data structure that will not add null elements to serverList but still maintains insertion order?
public List<String> getServerNames(ProcessData dataHolder) {
// some code
String localIP = getLocalIP(localPath, clientId);
String localAddress = getLocalAddress(localPath, clientId);
// some code
List<String> serverList = new LinkedList<String>();
serverList.add(localIP);
if (ppFlag) {
serverList.add(localAddress);
}
if (etrFlag) {
for (String remotePath : holderPath) {
String remoteIP = getRemoteIP(remotePath, clientId);
String remoteAddress = getRemoteAddress(remotePath, clientId);
serverList.add(remoteIP);
if (ppFlag) {
serverList.add(remoteAddress);
}
}
}
return serverList;
}
This method returns a List which I iterate over in an ordinary for loop. If everything is null, I would rather have an empty serverList than a list holding four null values. In the code above, getLocalIP, getLocalAddress, getRemoteIP and getRemoteAddress can each return null, which then gets added to the linked list. I know I can add an if check, but then I need four of them, one before each add. Is there a better data structure I can use here?
One constraint I have: this library is used under very heavy load, so this code has to be fast since it will be called many times.

I am using LinkedList data structure serverList to store the elements in it.
That's most probably wrong, given that you're aiming at speed. An ArrayList is much faster unless you're using it as a Queue or alike.
I know I can add an if check, but then I need four of them, one before each add. Is there a better data structure I can use here?
A collection silently ignoring nulls would be a bad idea. It may be useful sometimes and very surprising at other times. Moreover, it'd violate the List.add contract. So you won't find it in any serious library and you shouldn't implement it.
Just write a method
<E> void addIfNotNullTo(Collection<E> collection, E e) {
if (e != null) {
collection.add(e);
}
}
and use it. It won't make your code really shorter, but it'll make it clearer.
One constraint I have: this library is used under very heavy load, so this code has to be fast since it will be called many times.
Note that any IO is many orders of magnitude slower than simple list operations.
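As a minimal sketch of how such a helper could be used, assuming the four lookups have already been performed and are passed in as plain values (the class name and the collect method are hypothetical stand-ins for getServerNames):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class NullSkippingAdd {
    // Adds the element only when it is non-null; returns whether it was added.
    static <E> boolean addIfNotNull(Collection<? super E> target, E element) {
        return element != null && target.add(element);
    }

    // Hypothetical stand-in for getServerNames(): the lookup results arrive as arguments.
    static List<String> collect(String localIp, String localAddress,
                                String remoteIp, String remoteAddress) {
        List<String> servers = new ArrayList<>();
        addIfNotNull(servers, localIp);
        addIfNotNull(servers, localAddress);
        addIfNotNull(servers, remoteIp);
        addIfNotNull(servers, remoteAddress);
        return servers;
    }
}
```

The list itself stays an ordinary ArrayList, so the List.add contract is untouched and the null policy is visible at each call site.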

Use Apache Commons Collections:
ListUtils.predicatedList(new ArrayList(), PredicateUtils.notNullPredicate());
Adding null to this list throws IllegalArgumentException. Furthermore you can back it by any List implementation you like and if necessary you can add more Predicates to be checked.
Same exists for Collections in general.

There are data structures that do not allow null elements, such as ArrayDeque, but these will throw an exception rather than silently ignore a null element, so you'd have to check for null before insertion anyway.
If you're dead set against adding null checks before insertion, you could instead iterate over the list and remove null elements before you return it.
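A minimal sketch of that second option, assuming Java 8 or later (the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class NullStripping {
    // Returns a copy of the input, in the same order, without its null elements.
    static List<String> withoutNulls(List<String> raw) {
        List<String> cleaned = new ArrayList<>(raw);
        cleaned.removeIf(Objects::isNull); // one pass over the copy
        return cleaned;
    }
}
```

In getServerNames() the same removeIf call could be applied directly to serverList just before the return statement.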

The simplest way would be to just override LinkedList#add() in your getServerNames() method.
List<String> serverList = new LinkedList<String>() {
public boolean add(String item) {
if (item != null) {
super.add(item);
return true;
} else
return false;
}
};
serverList.add(null);
serverList.add("NotNULL");
System.out.println(serverList.size()); // prints 1
If you then see yourself using this at several places, you can probably turn it into a class.

You can use a plain Java HashSet to store your values. null may be added multiple times, but it will only ever appear once in the Set. You can remove null from the Set and then convert it to an ArrayList before returning. Note that a HashSet does not keep insertion order and also drops duplicate values.
Set<String> serverSet = new HashSet<String>();
serverSet.add(localIP);
if (ppFlag) {
serverSet.add(localAddress);
}
if (etrFlag) {
for (String remotePath : holderPath) {
String remoteIP = getRemoteIP(remotePath, clientId);
String remoteAddress = getRemoteAddress(remotePath, clientId);
serverSet.add(remoteIP);
if (ppFlag) {
serverSet.add(remoteAddress);
}
}
}
serverSet.remove(null); // remove null from your set - no exception if null not present
List<String> serverList = new ArrayList<String>(serverSet);
return serverList;
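Since the question asks for insertion order, a variant of the same idea with LinkedHashSet may fit better. This is only a sketch: the class name and the varargs signature are assumptions, and like any Set it also collapses duplicate non-null values, which may or may not be wanted.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class OrderedUniqueServers {
    static List<String> collect(String... candidates) {
        Set<String> seen = new LinkedHashSet<>(); // iterates in first-insertion order
        for (String candidate : candidates) {
            seen.add(candidate); // repeated values, including null, collapse to one entry
        }
        seen.remove(null); // no-op when null was never added
        return new ArrayList<>(seen);
    }
}
```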

Since you use Guava (it's tagged), I have this alternative if you have the luxury of being able to return a Collection instead of a List.
Why Collection? Because the List.add contract forces you to either return true or throw an exception, while Collection.add allows you to return false if nothing was added.
class MyVeryOwnList<T> extends ForwardingCollection<T> { // Note: not ForwardingList
private final List<T> delegate = new LinkedList<>(); // Keep a linked list
@Override protected Collection<T> delegate() { return delegate; }
@Override public boolean add(T element) {
if (element == null) {
return false;
} else {
return delegate.add(element);
}
}
@Override public boolean addAll(Collection<? extends T> elements) {
return standardAddAll(elements);
}
}

Related

Lazy reinit a list to modifiable after declaring with Collections.emptyList()

I want to declare a list as List<String> info = Collections.emptyList(), but re-initialize it to a modifiable list when the user calls add(String msg).
Is this the correct way:
private List<String> info = Collections.emptyList();
public void addInfo(String s){
final List<String> e = Collections.emptyList();
if(info == e){
info = new ArrayList<>();
}
info.add(s);
}
Or
if(info.equals(e)){
If I have 3 of these can I have this common code :
public void addInfo(String s) {
info = addTo(s, info);
}
public void addWarn(String s) {
warn = addTo(s, warn);
}
public void addErr(String s) {
errs = addTo(s, errs);
}
private List<String> addTo(String s, @org.jetbrains.annotations.NotNull List<String> t){
final List<String> e = Collections.emptyList();
if(t.equals(e)){
t = new ArrayList<>();
}
t.add(s);
return t;
}
I guess the following won't work because a new list is created?
private void addTo(String s, @org.jetbrains.annotations.NotNull List<String> t){
final List<String> e = Collections.emptyList();
if(t.equals(e)){
t = new ArrayList<>();
}
t.add(s);
}

Note that even if Collections.emptyList() always returns the one instance held in Collections.EMPTY_LIST, a reference comparison does not detect when a caller used JDK 9+ List.of() to initialize the field. On the other hand, being non-empty does not guarantee mutability either.
The entire logic is suitable only for a private method where all callers and their usage are known.
But you should consider the alternative of dropping these special cases altogether. Since Java 8, the default constructor new ArrayList<>() does not create a backing array; allocation is deferred until the first element is added.
So you can initialize all fields with a plain new ArrayList<>() and implement addInfo, addWarn, and addErr with a plain add call, getting rid of the addTo method, the conditionals, and the repeated assignments. Even declaring the fields final is possible, while still not requiring a significant amount of memory for the unused lists.
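A sketch of what that simplification could look like. The class name MessageLog and the read-only accessor methods are assumptions added for illustration; only the three fields and the add methods come from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class MessageLog {
    // Each list starts without a dedicated backing array; one is allocated on the first add.
    private final List<String> info = new ArrayList<>();
    private final List<String> warn = new ArrayList<>();
    private final List<String> errs = new ArrayList<>();

    void addInfo(String s) { info.add(s); }
    void addWarn(String s) { warn.add(s); }
    void addErr(String s)  { errs.add(s); }

    // Hypothetical accessors: read-only views, so all writes go through the add methods.
    List<String> info() { return Collections.unmodifiableList(info); }
    List<String> warn() { return Collections.unmodifiableList(warn); }
    List<String> errs() { return Collections.unmodifiableList(errs); }
}
```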

Using .equals is the only correct solution -- but equivalent to the much simpler info.isEmpty().

Considering your code:
final List<String> e = Collections.emptyList();
if(info == e){
info = new ArrayList<>();
}
info.add(s);
I don't believe there's any guarantee in the Java API that the same reference will always be returned from emptyList() (the javadoc states "Implementations of this method need not create a separate List object for each call").
Given you may modify the list it'd make more sense to initialise with new ArrayList<>() rather than emptyList(). Really doesn't make much sense to use an unmodifiable list that you may want to modify.
However if you really need to use emptyList() for some reason, then perhaps:
if (info.isEmpty())
info = new ArrayList<>();
Given you are about to add an item to it this test will only pass once anyway.

List inside Redis Hash on Java

I'm using Spring with Redis and I am working with a list inside a hash. Everything works great on a single thread; problems come when I have to update the list value from more than one instance.
Here is my code to put and get value from Hash:
public void put(String hashName, int key , List<myObj> myList) {
redis.opsForHash().put(hashName, String.valueOf(key), myList);
}
public List<myObj> get(String hashName, String key) {
Object map = redis.opsForHash().get(hashName,key);
if (map==null) {
log.info("no keys found");
return new ArrayList<myObj>();
}
List<myObj> myList= mapper.convertValue(map, new TypeReference<List<myObj>>(){});
return myList;
}
What I do to perform update is:
List<myObj> myList= hash.get(hashName,key);
myList.add(obj);
hash.put(hashName, key, myList);
When there is more than one instance, I run into a race condition. Is there a way to update the list values atomically?

Your current implementation is not good, because put() replaces the whole list. When several threads each want to add a single element, each first obtains the current list, then adds its element, then puts the new list back. Each thread overwrites the result of the previous one, so the last one wins. Marking put() as synchronized does not help, because the read-modify-write sequence as a whole is not atomic.
Solution
Don't replace the whole list. Instead, add a single element to the list. Remove the method put() and add a new one like following:
public synchronized void add(String hashName, int key, myObj element) {
List<myObj> myList;
Object map = redis.opsForHash().get(hashName, String.valueOf(key));
if (map != null) {
myList= mapper.convertValue(map, new TypeReference<List<myObj>>(){});
} else {
myList = new ArrayList<myObj>();
}
myList.add(element);
redis.opsForHash().put(hashName, String.valueOf(key), myList);
}
Besides, make sure there are no attempts to modify the list directly and that the only way to add elements is through your add() method. Use Collections.unmodifiableList():
public List<myObj> get(String hashName, String key) {
Object map = redis.opsForHash().get(hashName,key);
if (map==null) {
log.info("no keys found");
return new ArrayList<myObj>();
}
List<myObj> myList= mapper.convertValue(map, new TypeReference<List<myObj>>(){});
return Collections.unmodifiableList(myList);
}
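To make the lost update concrete, here is a small single-threaded replay of the problematic interleaving. It is only an illustration: an in-memory HashMap stands in for the Redis hash, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LostUpdateDemo {
    // In-memory stand-in for the Redis hash; get() hands out a copy, like a deserialized value.
    private final Map<String, List<String>> store = new HashMap<>();

    List<String> get(String key) {
        List<String> stored = store.get(key);
        return stored == null ? new ArrayList<>() : new ArrayList<>(stored);
    }

    void put(String key, List<String> value) {
        store.put(key, new ArrayList<>(value));
    }

    // Replays "A reads, B reads, A writes, B writes" deterministically on one thread.
    static List<String> interleavedAppend() {
        LostUpdateDemo hash = new LostUpdateDemo();
        List<String> seenByA = hash.get("k"); // A reads []
        List<String> seenByB = hash.get("k"); // B reads []
        seenByA.add("fromA");
        hash.put("k", seenByA);               // store now holds [fromA]
        seenByB.add("fromB");
        hash.put("k", seenByB);               // store now holds [fromB]; A's element is lost
        return hash.get("k");
    }
}
```

Keep in mind that synchronized only serializes callers inside a single JVM, so the same interleaving can still happen between separate application instances that share one Redis server.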

How to convert the following code to Java 8 streams and lambdas

I have a complicated requirement where a list of records has comments in it. We have reporting functionality in which each and every change should be logged and reported. Hence, as per our design, we create a whole new record even if a single field has been updated.
Now we want to get the history of comments (sorted by timestamp in reverse order) stored in our DB. After running the query I get the list of comments, but it contains duplicate entries because some other field was changed. It also contains null entries.
I wrote the following code to remove duplicate and null entries.
List<Comment> toRet = new ArrayList<>();
dbCommentHistory.forEach(ele -> {
//Directly copy if toRet is empty.
if (!toRet.isEmpty()) {
int lastIndex = toRet.size() - 1;
Comment lastAppended = toRet.get(lastIndex);
// If comment is null don't proceed
if (ele.getComment() == null) {
return;
}
// remove if we have same comment as last time
if (StringUtils.compare(ele.getComment(), lastAppended.getComment()) == 0) {
toRet.remove(lastIndex);
}
}
//add element to new list
toRet.add(ele);
});
This logic works fine and has been tested, but I want to convert this code to use lambdas, streams and other Java 8 features.

You can use the following snippet:
Collection<Comment> result = dbCommentHistory.stream()
.filter(c -> c.getComment() != null)
.collect(Collectors.toMap(Comment::getComment, Function.identity(), (first, second) -> second, LinkedHashMap::new))
.values();
If you need a List instead of a Collection you can use new ArrayList<>(result).
If you have implemented the equals() method in your Comment class like the following
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
return Objects.equals(comment, ((Comment) o).comment);
}
you can just use this snippet:
List<Comment> result = dbCommentHistory.stream()
.filter(c -> c.getComment() != null)
.distinct()
.collect(Collectors.toList());
But this would keep the first comment, not the last.
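The difference between the two snippets can be checked with a small experiment. This is a sketch with a minimal stand-in for Comment: the id field is an assumption added only to tell instances apart, and hashCode is defined alongside equals because distinct() relies on both.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

final class WhichDuplicateSurvives {
    // Minimal stand-in for the question's Comment type.
    static final class Comment {
        final int id;
        final String comment;
        Comment(int id, String comment) { this.id = id; this.comment = comment; }
        int getId() { return id; }
        String getComment() { return comment; }
        @Override public boolean equals(Object o) {
            return o instanceof Comment && Objects.equals(comment, ((Comment) o).comment);
        }
        @Override public int hashCode() { return Objects.hashCode(comment); }
    }

    // toMap + LinkedHashMap: position of the first occurrence, instance of the last one.
    static List<Integer> idsViaToMap(List<Comment> history) {
        Map<String, Comment> byText = history.stream()
            .filter(c -> c.getComment() != null)
            .collect(Collectors.toMap(Comment::getComment, Function.identity(),
                                      (first, second) -> second, LinkedHashMap::new));
        List<Integer> ids = new ArrayList<>();
        for (Comment c : byText.values()) {
            ids.add(c.getId());
        }
        return ids;
    }

    // distinct(): keeps the first instance of each comment text.
    static List<Integer> idsViaDistinct(List<Comment> history) {
        return history.stream()
            .filter(c -> c.getComment() != null)
            .distinct()
            .map(Comment::getId)
            .collect(Collectors.toList());
    }
}
```

For the texts a, b, a, null with ids 1 to 4, the toMap variant yields ids 3, 2 and the distinct variant yields ids 1, 2. Both drop the repeated "a" even though it is not adjacent to the first one, which differs from a loop that only compares each element with its predecessor.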

If I understand the logic in the question's code, you want to remove consecutive repeated comments but keep duplicates if there is some different comment between them in the input list.
In this case, simply using .distinct() (once equals and hashCode have been properly defined) won't work as intended, as non-consecutive duplicates will be eliminated as well.
The more "streamy" solution here is to use a custom Collector that, when folding elements into the accumulator, removes only the consecutive duplicates.
static final Collector<Comment, ArrayDeque<Comment>, List<Comment>> COMMENT_COLLECTOR = Collector.of(
    ArrayDeque::new, // supplier
    (deque, comment) -> { // accumulator: skip a comment that repeats the previous one
        if (deque.isEmpty() || !Objects.equals(deque.getLast().getComment(), comment.getComment())) {
            deque.addLast(comment);
        }
    },
    (left, right) -> { // combiner: drop the first element of right if it repeats the last of left
        if (!left.isEmpty() && !right.isEmpty()
                && Objects.equals(left.getLast().getComment(), right.getFirst().getComment())) {
            right.removeFirst();
        }
        left.addAll(right);
        return left;
    },
    ArrayList::new); // finisher: copy the deque into a List
Notice that Deque (in java.util) is not a List, but it has convenient operations to access the first and last elements. ArrayDeque is the plain array-based implementation (what ArrayList is to List), which is why the accumulator type is ArrayDeque and a finisher copies the result into an ArrayList.
By default the collector receives the elements in the input stream's order, so this works. I know it is not much less code, but it is as good as it gets. If you define a static Comment comparison method that gracefully handles null elements or null comment strings, you can make it a bit more compact:
static boolean sameComment(final Comment a, final Comment b) {
if (a == b) {
return true;
} else if (a == null || b == null) {
return false;
} else {
return Objects.equals(a.getComment(), b.getComment());
}
}
static final Collector<Comment, ArrayDeque<Comment>, List<Comment>> COMMENT_COLLECTOR = Collector.of(
    ArrayDeque::new, // supplier
    (deque, comment) -> { // accumulator
        if (!sameComment(deque.peekLast(), comment)) {
            deque.addLast(comment);
        }
    },
    (left, right) -> { // combiner: drop the first element of right if it repeats the last of left
        if (!left.isEmpty() && sameComment(left.peekLast(), right.peekFirst())) {
            right.removeFirst();
        }
        left.addAll(right);
        return left;
    },
    ArrayList::new); // finisher: ArrayDeque is not a List, so copy it into one
----------
Perhaps you would prefer to declare a proper (named) class that implements Collector to make it clearer and avoid defining a lambda for each Collector action, or at least implement the lambdas passed to Collector.of as static methods to improve readability.
Now the code to do the actual work is rather trivial:
List<Comment> unique = dbCommentHistory.stream()
.collect(COMMENT_COLLECTOR);
That is it. However, it may become a bit more involved if you want to handle null Comment elements. The code above already handles the comment's string being null by considering it equal to another null string; null elements have to be filtered out first:
List<Comment> unique = dbCommentHistory.stream()
.filter(Objects::nonNull)
.collect(COMMENT_COLLECTOR);

Your code can be simplified a bit. Notice that this solution does not use streams or lambdas, but it seems to be the most succinct option:
List<Comment> toRet = new ArrayList<>(dbCommentHistory.size());
Comment last = null;
for (final Comment ele : dbCommentHistory) {
if (ele != null && (last == null || !Objects.equals(last.getComment(), ele.getComment()))) {
toRet.add(last = ele);
}
}
The outcome is not exactly the same as the question's code, since the latter may add null elements to toRet, but it seems to me that you actually want to remove them completely. It is easy to modify the code (making it a bit longer) to get the same output, though.
If you insist on using .forEach, that is not difficult either; in that case last needs to be calculated at the beginning of the lambda. You may want to use an ArrayDeque so that you can conveniently use peekLast:
Deque<Comment> toRet = new ArrayDeque<>(dbCommentHistory.size());
dbCommentHistory.forEach( ele -> {
if (ele != null) {
final Comment last = toRet.peekLast();
if (last == null || !Objects.equals(last.getComment(), ele.getComment())) {
toRet.addLast(ele);
}
}
});

Element is present but `Set.contains(element)` returns false

How can an element not be contained in the original set but in its unmodified copy?
The original set does not contain the element while its copy does.
The following method returns true, although it should always return false. The implementation of c and clusters is in both cases HashSet.
public static boolean confumbled(Set<String> c, Set<Set<String>> clusters) {
return (!clusters.contains(c) && new HashSet<>(clusters).contains(c));
}
Debugging has shown that the element is contained in the original, but Set.contains(element) returns false for some reason.
Could somebody please explain to me what's going on?

If you change an element in the Set (in your case the elements are Set<String>, so adding or removing a String will change them), Set.contains(element) may fail to locate it, since the hashCode of the element will be different than what it was when the element was first added to the HashSet.
When you create a new HashSet containing the elements of the original one, the elements are added based on their current hashCode, so Set.contains(element) will return true for the new HashSet.
You should avoid putting mutable instances in a HashSet (or using them as keys in a HashMap), and if you can't avoid it, make sure you remove the element before you mutate it and re-add it afterwards. Otherwise your HashSet will be broken.
An example :
Set<String> set = new HashSet<String>();
set.add("one");
set.add("two");
Set<Set<String>> setOfSets = new HashSet<Set<String>>();
setOfSets.add(set);
boolean found = setOfSets.contains(set); // returns true
set.add("three");
Set<Set<String>> newSetOfSets = new HashSet<Set<String>>(setOfSets);
found = setOfSets.contains(set); // returns false
found = newSetOfSets.contains(set); // returns true
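The remove, mutate, re-add advice can be sketched as a small helper. The class and method names are hypothetical, and the sketch assumes the caller holds the only references that mutate the inner set.

```java
import java.util.Set;

final class SafeMutation {
    // Adds a value to a member of a HashSet of sets without stranding it in a stale bucket.
    static boolean addToMember(Set<Set<String>> outer, Set<String> member, String value) {
        boolean wasPresent = outer.remove(member); // found via its still-current hashCode
        member.add(value);                         // the member's hashCode changes here
        if (wasPresent) {
            outer.add(member);                     // re-inserted under the new hashCode
        }
        return outer.contains(member);
    }
}
```

If the outer set never contained the member, the helper still mutates it but does not insert it, so the return value reports whether the outer set ends up containing an equal set.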

The most common reason for this is that the element or key was altered after insertion resulting in a corruption of the underlying data structure.
Note: when you add a Set<String> to a Set<Set<String>>, you are adding a copy of the reference. The underlying Set<String> is not copied, and if you alter it, those changes affect the Set<Set<String>> you put it into.
e.g.
Set<String> s = new HashSet<>();
Set<Set<String>> ss = new HashSet<>();
ss.add(s);
assert ss.contains(s);
// altering the set after adding it corrupts the HashSet
s.add("Hi");
// there is a small chance it may still find it.
assert !ss.contains(s);
// build a correct structure by copying it.
Set<Set<String>> ss2 = new HashSet<>(ss);
assert ss2.contains(s);
s.add("There");
// not again.
assert !ss2.contains(s);

If the primary Set was a TreeSet (or perhaps some other NavigableSet) then it is possible, if your objects are imperfectly compared, for this to happen.
The critical point is that HashSet.contains looks like:
public boolean contains(Object o) {
return map.containsKey(o);
}
and map is a HashMap and HashMap.containsKey looks like:
public boolean containsKey(Object key) {
return getNode(hash(key), key) != null;
}
so it uses the hashCode of the key to check for presence.
A TreeSet, however, uses a TreeMap internally, and its containsKey relies on getEntry, which looks like:
final Entry<K,V> getEntry(Object key) {
// Offload comparator-based version for sake of performance
if (comparator != null)
return getEntryUsingComparator(key);
...
So it uses a Comparator to find the key.
So, in summary, if your hashCode method does not agree with your Comparator's compare method (say compare always returns 1 while hashCode returns different values), then you will see this kind of obscure behaviour.
class BadThing {
final int hash;
public BadThing(int hash) {
this.hash = hash;
}
@Override
public int hashCode() {
return hash;
}
@Override
public String toString() {
return "BadThing{" + "hash=" + hash + '}';
}
}
public void test() {
Set<BadThing> primarySet = new TreeSet<>(new Comparator<BadThing>() {
@Override
public int compare(BadThing o1, BadThing o2) {
return 1;
}
});
// Make the things.
BadThing bt1 = new BadThing(1);
primarySet.add(bt1);
BadThing bt2 = new BadThing(2);
primarySet.add(bt2);
// Make the secondary set.
Set<BadThing> secondarySet = new HashSet<>(primarySet);
// Have a poke around.
test(primarySet, bt1);
test(primarySet, bt2);
test(secondarySet, bt1);
test(secondarySet, bt2);
}
private void test(Set<BadThing> set, BadThing thing) {
System.out.println(thing + " " + (set.contains(thing) ? "is" : "NOT") + " in <" + set.getClass().getSimpleName() + ">" + set);
}
prints
BadThing{hash=1} NOT in <TreeSet>[BadThing{hash=1}, BadThing{hash=2}]
BadThing{hash=2} NOT in <TreeSet>[BadThing{hash=1}, BadThing{hash=2}]
BadThing{hash=1} is in <HashSet>[BadThing{hash=1}, BadThing{hash=2}]
BadThing{hash=2} is in <HashSet>[BadThing{hash=1}, BadThing{hash=2}]
So even though the object is in the TreeSet, contains does not find it, because the comparator never returns 0. However, once it is in the HashSet all is fine, because HashSet uses hashCode and equals to find it, and those behave in a valid way.
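For contrast, a minimal sketch showing that the lookup succeeds as soon as the comparator returns 0 for an element compared with itself. The Thing class and the method name are made up for this example.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

final class ConsistentComparatorDemo {
    static final class Thing {
        final int id;
        Thing(int id) { this.id = id; }
    }

    // Reports whether a TreeSet built with the given comparator can find an element it holds.
    static boolean findsOwnElement(Comparator<Thing> comparator) {
        Thing thing = new Thing(1);
        Set<Thing> set = new TreeSet<>(comparator);
        set.add(thing);
        set.add(new Thing(2));
        return set.contains(thing);
    }
}
```

Calling findsOwnElement((a, b) -> 1) gives false, while findsOwnElement(Comparator.comparingInt(t -> t.id)) gives true.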

How to de-dupe a List of Objects?

A Rec object has a member variable called tag which is a String.
If I have a List of Recs, how could I de-dupe the list based on the tag member variable?
I just need to make sure that the List contains only one Rec with each tag value.
Something like the following, but I'm not sure what the best algorithm is to keep track of counts, etc.:
private List<Rec> deDupe(List<Rec> recs) {
for(Rec rec : recs) {
// How to check whether rec.tag exists in another Rec in this List
// and delete any duplicates from the List before returning it to
// the calling method?
}
return recs;
}

Store it temporarily in a HashMap<String,Rec>.
Create a HashMap<String,Rec>. Loop through all of your Rec objects. For each one, if the tag already exists as a key in the HashMap, then compare the two and decide which one to keep. If not, then put it in.
When you're done, the HashMap.values() method will give you all of your unique Rec objects.
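One possible shape for this, as a sketch only: it uses LinkedHashMap instead of HashMap so that the result keeps the encounter order, and it resolves every conflict by keeping the first Rec seen. The Rec stand-in and its payload field are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DedupeByTag {
    // Minimal stand-in for the question's Rec type.
    static final class Rec {
        final String tag;
        final int payload;
        Rec(String tag, int payload) { this.tag = tag; this.payload = payload; }
    }

    // Keeps the first Rec seen for each tag, in encounter order.
    static List<Rec> deDupe(List<Rec> recs) {
        Map<String, Rec> byTag = new LinkedHashMap<>();
        for (Rec rec : recs) {
            byTag.putIfAbsent(rec.tag, rec); // Map.merge could apply a different keep rule
        }
        return new ArrayList<>(byTag.values());
    }
}
```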

Try this:
private List<Rec> deDupe(List<Rec> recs) {
Set<String> tags = new HashSet<String>();
List<Rec> result = new ArrayList<Rec>();
for(Rec rec : recs) {
if(!tags.contains(rec.tag)) {
result.add(rec);
tags.add(rec.tag);
}
}
return result;
}
This checks each Rec against a Set of tags. If the set contains the tag already, it is a duplicate and we skip it. Otherwise we add the Rec to our result and add the tag to the set.

This becomes easier if Rec is .equals based on its tag value. Then you could write something like:
private List<Rec> deDupe( List<Rec> recs )
{
List<Rec> retList = new ArrayList<Rec>( recs.size() );
for ( Rec rec : recs )
{
if (!retList.contains(rec))
{
retList.add(rec);
}
}
return retList;
}

I would do that with Google Collections (now Guava). You can use the filter function with a predicate that remembers previous tags and filters out any Rec whose tag has been seen before.
Something like this:
private Iterable<Rec> deDupe(List<Rec> recs)
{
Predicate<Rec> filterDuplicatesByTagPredicate = new FilterDuplicatesByTagPredicate();
return Iterables.filter(recs, filterDuplicatesByTagPredicate);
}
private static class FilterDuplicatesByTagPredicate implements Predicate<Rec>
{
private Set<String> existingTags = Sets.newHashSet();
@Override
public boolean apply(Rec input)
{
String tag = input.getTag();
return existingTags.add(tag);
}
}
I changed the method slightly to return Iterable instead of List, but of course you can change that if it's important.

If you don't care about shuffling the data around (i.e. you have a small list of small objects), you can do this:
private <T> List<T> deDupe(List<T> thisListHasDupes){
Set<T> tempSet = new HashSet<T>();
for(T t:thisListHasDupes){
tempSet.add(t);
}
List<T> deDupedList = new ArrayList<T>();
deDupedList.addAll(tempSet);
return deDupedList;
}
Remember that implementations of Set need consistent and valid equals and hashCode methods. So if you have a custom object, make sure that's taken care of.
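As an illustration of what "taken care of" might mean for Rec, here is a sketch in which equality is based on the tag alone. That choice is an assumption about the desired semantics, and LinkedHashSet is swapped in so the surviving elements keep their original order.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;

final class TagEquality {
    // Stand-in for Rec: two instances are equal when their tags are equal.
    static final class Rec {
        final String tag;
        Rec(String tag) { this.tag = tag; }

        @Override public boolean equals(Object o) {
            return o instanceof Rec && Objects.equals(tag, ((Rec) o).tag);
        }
        @Override public int hashCode() { return Objects.hashCode(tag); }
    }

    // Copies through a LinkedHashSet: duplicates are dropped, first occurrences keep their order.
    static <T> List<T> deDupe(List<T> withDupes) {
        return new ArrayList<>(new LinkedHashSet<>(withDupes));
    }
}
```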
