Limited SortedSet - java

I'm looking for an implementation of SortedSet with a limited number of elements. If more elements are added than the specified maximum, the comparator decides whether to add the item and remove the last one from the set.
SortedSet<Integer> t1 = new LimitedSet<Integer>(3);
t1.add(5);
t1.add(3);
t1.add(1);
// [1,3,5]
t1.add(2);
// [1,2,3]
t1.add(9);
// [1,2,3]
t1.add(0);
// [0,1,2]
Is there an elegant way in the standard API to accomplish this?
I've written a JUnit test for checking implementations:
@Test
public void testLimitedSortedSet() {
final LimitedSortedSet<Integer> t1 = new LimitedSortedSet<Integer>(3);
t1.add(5);
t1.add(3);
t1.add(1);
System.out.println(t1);
// [1,3,5]
t1.add(2);
System.out.println(t1);
// [1,2,3]
t1.add(9);
System.out.println(t1);
// [1,2,3]
t1.add(0);
System.out.println(t1);
// [0,1,2]
Assert.assertTrue(3 == t1.size());
Assert.assertEquals(Integer.valueOf(0), t1.first());
}

With the standard API you'd have to do it yourself, i.e. extend one of the sorted set classes and add the logic you want to the add() and addAll() methods. Shouldn't be too hard.
Btw, I don't fully understand your example:
t1.add(9);
// [1,2,3]
Shouldn't the set contain [1,2,9] afterwards?
Edit: I think now I understand: you want to only keep the smallest 3 elements that were added to the set, right?
Edit 2: An example implementation (not optimised) could look like this:
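import java.util.Collection;
import java.util.TreeSet;

class LimitedSortedSet<E> extends TreeSet<E> {
    private final int maxSize;

    LimitedSortedSet( int maxSize ) {
        this.maxSize = maxSize;
    }

    @Override
    @SuppressWarnings("unchecked")
    public boolean addAll( Collection<? extends E> c ) {
        boolean added = super.addAll( c );
        if( size() > maxSize ) {
            // first element beyond the limit; it and everything after it is dropped
            E firstToRemove = (E)toArray( )[maxSize];
            // clear the view directly: removeAll( tailSet(...) ) can iterate the view
            // while modifying the set and throw a ConcurrentModificationException
            tailSet( firstToRemove ).clear();
        }
        return added;
    }

    @Override
    @SuppressWarnings("unchecked")
    public boolean add( E o ) {
        boolean added = super.add( o );
        if( size() > maxSize ) {
            E firstToRemove = (E)toArray( )[maxSize];
            tailSet( firstToRemove ).clear();
        }
        return added;
    }
}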
Note that tailSet() returns the subset including the parameter (if it is in the set). This means that if you can't calculate the next higher value (it doesn't need to be in the set), you'll have to re-add that element. This is done in the code above.
If you can calculate the next value, e.g. if you have a set of integers, doing something like tailSet( lastElement + 1 ) would be sufficient and you wouldn't have to re-add the last element.
Alternatively you can iterate over the set yourself and remove all elements that follow the last you want to keep.
Another alternative, although that might be more work, would be to check the size before inserting an element and remove accordingly.
Update: as msandiford correctly pointed out, the first element that should be removed is the one at index maxSize. Thus there's no need to re-add the last wanted element.
Important note:
As @DieterDP correctly pointed out, the implementation above violates the Collection#add() API contract, which states that if a collection refuses to add an element for any reason other than it being a duplicate, an exception must be thrown.
In the example above the element is first added but might be removed again due to size constraints, or other elements might be removed, so this violates the contract.
To fix that, you might want to change add() and addAll() to throw exceptions in those cases (or maybe in any case, in order to make them unusable) and provide alternate methods to add elements which don't violate any existing API contract.
In any case, the above example should be used with care, since using it with code that isn't aware of the violations might result in unwanted and hard-to-debug errors.
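One way to provide such an alternate method is to not implement SortedSet at all and wrap a TreeSet instead. The following is only a minimal sketch under that assumption: the class name BoundedSmallestSet and the methods offer() and view() are made up for illustration, and it assumes the goal is to keep the maxSize smallest elements.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: keeps only the maxSize smallest elements offered to it.
final class BoundedSmallestSet<E> {
    private final TreeSet<E> items;
    private final int maxSize;

    // A null comparator means natural ordering, as with TreeSet itself.
    BoundedSmallestSet(int maxSize, Comparator<? super E> comparator) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be at least 1");
        }
        this.maxSize = maxSize;
        this.items = new TreeSet<E>(comparator);
    }

    // Returns true if the element was newly added and survived trimming.
    boolean offer(E e) {
        if (!items.add(e)) {
            return false; // an equal element is already present
        }
        if (items.size() <= maxSize) {
            return true;
        }
        E evicted = items.pollLast(); // drop the largest element
        return evicted != e;          // false if the new element itself was dropped
    }

    // Read-only view, so callers cannot bypass the size limit.
    SortedSet<E> view() {
        return Collections.unmodifiableSortedSet(items);
    }
}
```

Because offer() is not Collection#add(), it is free to report "not kept" by returning false, and code that expects a regular Set never sees an element silently disappear.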

I'd say this is a typical application for the decorator pattern, similar to the decorator collections exposed by the Collections class: unmodifiableXXX, synchronizedXXX, singletonXXX etc. I would take Guava's ForwardingSortedSet as base class, and write a class that decorates an existing SortedSet with your required functionality, something like this:
public final class SortedSets {
    public static <T> SortedSet<T> maximumSize(
            final SortedSet<T> original, final int maximumSize){
        return new ForwardingSortedSet<T>() {
            @Override
            protected SortedSet<T> delegate() {
                return original;
            }
            @Override
            public boolean add(final T e) {
                if(original.size()<maximumSize){
                    return original.add(e);
                }else return false;
            }
            // implement other methods accordingly
        };
    }
}

No, there is nothing like that in the existing Java library.
But yes, you can build one like the one below using composition. I believe it will be easy.
public class LimitedSet<E> implements SortedSet<E> {
    private final TreeSet<E> treeSet = new TreeSet<E>();
    private final int expectedSize;
    public LimitedSet(int expectedSize) {
        this.expectedSize = expectedSize;
    }
    public boolean add(E e) {
        boolean result = treeSet.add(e);
        if(treeSet.size() > expectedSize) {
            // remove the one you like ;)
        }
        return result;
    }
    // all other methods delegate to the "treeSet"
}
UPDATE
After reading your comment:
As you always need to remove the last element,
you can consider maintaining a stack internally.
It will increase memory use by O(n),
but it makes it possible to retrieve the last element in O(1), i.e. constant time.
I believe it should do the trick.

Related

Which Data Structure would be more suitable for the following task in Java?

Every 5 minutes, within the 20th minute cycle, I need to retrieve the data. Currently, I'm using the map data structure.
Is there a better data structure? Every time I read and set the data, I have to write to the file to prevent program restart and data loss.
For example, if the initial data in the map is:
{-1:"result1",-2:"result2",-3:"result3",-4:"result4"}
I want to get the last -4 period's value which is "result4", and set the new value "result5", so that the updated map will be:
{-1:"result5",-2:"result1",-3:"result2",-4:"result3"}
And again, I want to get the last -4 period's value which is "result3", and set the new value "result6", so the map will be:
{-1:"result6",-2:"result5",-3:"result1",-4:"result2"}
The code:
private static String getAndSaveValue(int a) {
//read the map from file
HashMap<Long,String> resultMap=getMapFromFile();
String value=resultMap.get(-4L);
for (Long i = 4L; i >= 2; i--){
resultMap.put(Long.parseLong(String.valueOf(i - 2 * i)),resultMap.get(1 - i));
}
resultMap.put(-1L,"result" + a);
//save the map to file
saveMapToFile(resultMap);
return value;
}
Based on your requirement, I think a LinkedList would be a suitable data structure:
public class Test {
public static void main(String[] args) {
LinkedList<String> ls=new LinkedList<String>();
ls.push("result4");
ls.push("result3");
ls.push("result2");
ls.push("result1");
System.out.println(ls);
ls.push("result5"); //pushing new value
System.out.println("Last value:"+ls.pollLast()); //this will return `result4`
System.out.println(ls);
ls.push("result6"); //pushing new value
System.out.println("Last value:"+ls.pollLast()); // this will give you `result3`
System.out.println(ls);
}
}
Output:
[result1, result2, result3, result4]
Last value:result4
[result5, result1, result2, result3]
Last value:result3
[result6, result5, result1, result2]
Judging by your example, you need a FIFO data structure which has a bounded size.
There's no bounded general-purpose implementation of the Queue interface in the JDK. Only the concurrent implementations can be bounded in size. But if you're not going to use it in a multithreaded environment, one of those is not the best choice, because thread safety doesn't come for free: concurrent collections are slower and can also confuse the reader of your code.
To achieve your goal, I suggest using composition by wrapping ArrayDeque, which is an array-based implementation of the Queue and performs way better than LinkedList.
Note that the preferred approach is not to extend ArrayDeque (an IS-A relationship) and override its add() and offer() methods, but to include it in a class as a field (a HAS-A relationship), so that method calls on an instance of your class are forwarded to the underlying collection. You can find more information regarding this approach in the item "Favor composition over inheritance" of Effective Java by Joshua Bloch.
public class BoundQueue<T> {
private Queue<T> queue;
private int limit;
public BoundQueue(int limit) {
this.queue = new ArrayDeque<>(limit);
this.limit = limit;
}
public void offer(T item) {
if (queue.size() == limit) {
queue.poll(); // or throw new IllegalStateException() depending on your needs
}
queue.add(item);
}
public T poll() {
return queue.poll();
}
public boolean isEmpty() {
return queue.isEmpty();
}
}
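Applied to the example from the question, the same idea might look like the sketch below. The class name PeriodHistory and the method getAndSet() are illustrative only, the history is assumed to be ordered newest first, and saving to a file is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: fixed-size history of results, newest value first.
final class PeriodHistory {
    private final Deque<String> values = new ArrayDeque<String>();
    private final int periods;

    // initial is ordered newest first, e.g. [result1, result2, result3, result4]
    PeriodHistory(int periods, List<String> initial) {
        if (initial.size() > periods) {
            throw new IllegalArgumentException("more initial values than periods");
        }
        this.periods = periods;
        for (String v : initial) {
            values.addLast(v);
        }
    }

    // Removes and returns the oldest value once the history is full
    // (null before that), then records the new value as the newest one.
    String getAndSet(String newest) {
        String oldest = values.size() >= periods ? values.pollLast() : null;
        values.addFirst(newest);
        return oldest;
    }

    @Override
    public String toString() {
        return values.toString();
    }
}
```

Both addFirst() and pollLast() are constant-time on ArrayDeque, so no shifting of map keys is needed on each cycle.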

What is the use of LinkedHashMap.removeEldestEntry?

I am aware the answer to this question is easily available on the internet. I need to know what happens if I choose not to removeEldestEntry. Below is my code:
package collection;
import java.util.*;
public class MyLinkedHashMap {
private static final int MAX_ENTRIES = 2;
public static void main(String[] args) {
LinkedHashMap lhm = new LinkedHashMap(MAX_ENTRIES, 0.75F, false) {
protected boolean removeEldestEntry(Map.Entry eldest) {
return false;
}
};
lhm.put(0, "H");
lhm.put(1, "E");
lhm.put(2, "L");
lhm.put(3, "L");
lhm.put(4, "O");
System.out.println("" + lhm);
}
}
Even though I am not allowing the removeEldestEntry my code works fine.
So, internally what is happening?
removeEldestEntry always gets checked after an element was inserted. For example, if you override the method to always return true, the LinkedHashMap will always be empty, since after every put or putAll insertion, the eldest element will be removed, no matter what. The JavaDoc shows a very sensible example on how to use it:
protected boolean removeEldestEntry(Map.Entry eldest){
return size() > MAX_SIZE;
}
Alternatively, you might only want to remove an entry if it is unimportant:
protected boolean removeEldestEntry(Map.Entry eldest){
    if(size() > MAX_ENTRIES){
        if(isImportant(eldest)){
            //Handle an important entry here, like reinserting it to the back of the list
            this.remove(eldest.getKey());
            this.put(eldest.getKey(), eldest.getValue());
            //removeEldestEntry will be called again, now with the next entry
            //so the size should not exceed the MAX_ENTRIES value
            //WARNING: If every element is important, this will loop indefinitely!
        } else {
            return true; //Element is unimportant
        }
    }
    return false; //Size not reached or eldest element was already handled otherwise
}
Why can't people just answer the OP's simple question!
If removeEldestEntry returns false then no items will ever be removed from the map and it will essentially behave like a normal Map.
Expanding on the answer by DavidNewcomb:
I'm assuming that you are learning how to implement a cache.
The method LinkedHashMap.removeEldestEntry is a method very commonly used in cache data structures, where the size of the cache is limited to a certain threshold. In such cases, the removeEldestEntry method can be set to automatically remove the oldest entry when the size exceeds the threshold (defined by the MAX_ENTRIES attribute) - as in the example provided here.
On the other hand, when you override the removeEldestEntry method this way, you are ensuring that nothing ever happens when the MAX_ENTRIES threshold is exceeded. In other words, the data structure would not behave like a cache, but rather a normal map.
Your removeEldestEntry method is identical to the default implementation of LinkedHashMap.removeEldestEntry, so your LinkedHashMap will simply behave like a normal LinkedHashMap with no overridden methods, retaining whatever keys and values you put into it unless and until you explicitly remove them by calling remove, clear, etc. The advantage of using LinkedHashMap is that the collection views (keySet(), values(), entrySet()) always return Iterators that traverse the keys and/or values in the order they were added to the Map.
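To make the contrast concrete, here is a small sketch of the bounded variant that the answers describe. The class name BoundedMap and its constructor parameter are made up for this example; only removeEldestEntry itself comes from LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a map that keeps only the most recently inserted entries.
final class BoundedMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedMap(int maxEntries) {
        // accessOrder = false: the eldest entry is the one inserted first.
        super(16, 0.75f, false);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after a new entry is inserted; returning true evicts the eldest entry.
        return size() > maxEntries;
    }
}
```

With the same five put calls as in the question and a limit of 2, only the entries for keys 3 and 4 should remain, whereas the version that returns false keeps all five.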

Iterator retrieve first value and place it back on the same iterator

I have the following scenario: I have an existing iterator Iterator<String> it and I iterate over its head (say first k elements, which are flagged elements, i.e. they start with '*' ). The only way to know that the flagged elements are over, is by noticing that the (k+1)th element is not flagged.
The problem is that if I do that, the iterator it will not provide me the first value anymore on the next call to next().
I want to pass this iterator to a method as its only argument and I would like to avoid changing its signature and its implementation. I know I could do this:
public void methodAcceptingIterator(Iterator<String> it) //current signature
//change it to
public void methodAcceptingIterator(String firstElement, Iterator<String> it)
But this looks like a workaround/hack that decreases the elegance and generality of the code, so I don't want to do this.
Any ideas how I could solve this problem ?
You could use Guava's PeekingIterator (link contains the javadoc for a static method which, given an Iterator, will return a wrapping PeekingIterator). That includes a method T peek() which shows you the next element without advancing to it.
The solution is to create your own Iterator implementation which stores the firstElement and uses the existing iterator as an underlying Iterator to delegate the requests for the rest of the elements to.
Something like:
public class IteratorMissingFirst<E> implements Iterator<E>{
private Iterator<E> underlyingIterator;
private E firstElement;
private boolean firstElOffered;
public IteratorMissingFirst(E firstElement, Iterator<E> it){
this.firstElement = firstElement;
this.underlyingIterator = it;
this.firstElOffered = false;
}
public boolean hasNext(){
if(!firstElOffered && firstElement != null){
return true;
}
else{
return underlyingIterator.hasNext();
}
}
public E next(){
if(!firstElOffered){
firstElOffered = true;
return firstElement;
}
else return underlyingIterator.next();
}
public void remove(){
}
}
Why don't you just have methodAcceptingIterator store the first element it gets out of the iterator in a variable? Or -- in a pinch -- just copy the contents of the Iterator into an ArrayList at the beginning of your method; now you can revisit elements as often as you like.
With Guava, you can implement Razvan's solution in an easier way by using some methods from the Iterators class:
Iterators.concat(Iterators.singletonIterator(firstElement), it)
This gives you an iterator working similar to IteratorMissingFirst, and it's easy to extend if you need to look at more than one element in front (but it creates two objects instead of only one).
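If Guava is not available, the peeking idea can also be sketched with the JDK alone. This is an illustrative wrapper, not Guava's PeekingIterator; the class name LookAheadIterator is an assumption, and remove() is deliberately unsupported.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical JDK-only wrapper offering a one-element look-ahead.
final class LookAheadIterator<E> implements Iterator<E> {
    private final Iterator<E> source;
    private E buffered;
    private boolean hasBuffered;

    LookAheadIterator(Iterator<E> source) {
        this.source = source;
    }

    // Returns the next element without consuming it.
    E peek() {
        if (!hasBuffered) {
            if (!source.hasNext()) {
                throw new NoSuchElementException();
            }
            buffered = source.next();
            hasBuffered = true;
        }
        return buffered;
    }

    @Override
    public boolean hasNext() {
        return hasBuffered || source.hasNext();
    }

    @Override
    public E next() {
        if (hasBuffered) {
            E result = buffered;
            buffered = null;
            hasBuffered = false;
            return result;
        }
        return source.next();
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException("remove is not supported");
    }
}
```

The flagged head can then be consumed with a loop such as while (it.hasNext() && it.peek().startsWith("*")) it.next(); and, since the wrapper is itself an Iterator<String>, it can be passed to methodAcceptingIterator without changing that method's signature.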

Removing the "first" object from a Set

Under certain situations, I need to evict the oldest element in a Java Set. The set is implemented using a LinkedHashSet, which makes this simple: just get rid of the first element returned by the set's iterator:
Set<Foo> mySet = new LinkedHashSet<Foo>();
// do stuff...
if (mySet.size() >= MAX_SET_SIZE)
{
Iterator<Foo> iter = mySet.iterator();
iter.next();
iter.remove();
}
This is ugly: 3 lines to do something I could do with 1 line if I was using a SortedSet (for other reasons, a SortedSet is not an option here):
if (/*stuff*/)
{
mySet.remove(mySet.first());
}
So is there a cleaner way of doing this, without:
changing the Set implementation, or
writing a static utility method?
Any solutions leveraging Guava are fine.
I am fully aware that sets do not have inherent ordering. I'm asking about removing the first entry as defined by iteration order.
LinkedHashSet is a wrapper for LinkedHashMap which supports a simple "remove oldest" policy. To use it as a Set you can do
Set<String> set = Collections.newSetFromMap(new LinkedHashMap<String, Boolean>(){
protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
return size() > MAX_ENTRIES;
}
});
if (!mySet.isEmpty())
mySet.remove(mySet.iterator().next());
seems to be less than 3 lines.
You have to synchronize around it of course if your set is shared by multiple threads.
If you really need to do this at several places in your code, just write a static method.
The other solutions proposed are often slower since they imply calling the Set.remove(Object) method instead of the Iterator.remove() method.
@Nullable
public static <T> T removeFirst(Collection<? extends T> c) {
Iterator<? extends T> it = c.iterator();
if (!it.hasNext()) { return null; }
T removed = it.next();
it.remove();
return removed;
}
With guava:
if (!set.isEmpty() && set.size() >= MAX_SET_SIZE) {
set.remove(Iterables.get(set, 0));
}
I will also suggest an alternative approach. Yes, it is changing the implementation, but not drastically: extend LinkedHashSet and have that condition in the add method:
public class LimitedLinkedHashSet<E> extends LinkedHashSet<E> {
    @Override
    public boolean add(E element) {
        boolean added = super.add(element);
        // your 5-line logic from above or my solution with guava
        return added;
    }
}
It's still 5 lines, but it is invisible to the code that's using it. And since this is actually a specific behaviour of the set, it is logical to have it within the set.
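Filled in, the subclass idea might look like the sketch below. The class name BoundedLinkedHashSet and the maxSize parameter are assumptions for this example, and the eviction uses the iterator approach from the question.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;

// Hypothetical sketch: a LinkedHashSet that evicts its oldest element on overflow.
final class BoundedLinkedHashSet<E> extends LinkedHashSet<E> {
    private static final long serialVersionUID = 1L;
    private final int maxSize;

    BoundedLinkedHashSet(int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be at least 1");
        }
        this.maxSize = maxSize;
    }

    @Override
    public boolean add(E element) {
        boolean added = super.add(element);
        if (added && size() > maxSize) {
            // The first element in iteration order is the oldest insertion.
            Iterator<E> it = iterator();
            it.next();
            it.remove();
        }
        return added;
    }
}
```

Re-adding an element that is already present returns false and does not change its position, because LinkedHashSet keeps the original insertion order in that case.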
I think the way you're doing it is fine. Is this something you do often enough to be worth finding a shorter way? You could do basically the same thing with Guava like this:
Iterables.removeIf(Iterables.limit(mySet, 1), Predicates.alwaysTrue());
That adds the small overhead of wrapping the set and its iterator for limiting and then calling the alwaysTrue() predicate once... doesn't seem especially worth it to me though.
Edit: To put what I said in a comment in an answer, you could create a SetMultimap that automatically restricts the number of values it can have per key like this:
SetMultimap<K, V> multimap = Multimaps.newSetMultimap(map,
    new Supplier<Set<V>>() {
        public Set<V> get() {
            return Sets.newSetFromMap(new LinkedHashMap<V, Boolean>() {
                @Override protected boolean removeEldestEntry(Map.Entry<V, Boolean> eldestEntry) {
                    return size() > MAX_SIZE;
                }
            });
        }
    });
Quick and dirty one-line solution: mySet.remove(mySet.toArray(new Foo[mySet.size()])[0]) ;)
However, I'd still go for the iterator solution, since this would be more readable and should also be faster.
Edit: I'd go for Mike Samuel's solution. :)

Simple database-like collection class in Java

The problem: Maintain a bidirectional many-to-one relationship among java objects.
Something like the Google/Commons Collections bidi maps, but I want to allow duplicate values on the forward side, and have sets of the forward keys as the reverse side values.
Used something like this:
// maintaining disjoint areas on a gameboard. Location is a space on the
// gameboard; Regions refer to disjoint collections of Locations.
MagicalManyToOneMap<Location, Region> forward = // the game universe
Map<Region, Set<Location>> inverse = forward.getInverse(); // live, not a copy
Location parkplace = Game.chooseSomeLocation(...);
Region mine = forward.get(parkplace); // assume !null; should be O(log n)
Region other = Game.getSomeOtherRegion(...);
// moving a Location from one Region to another:
forward.put(parkplace, other);
// or equivalently:
inverse.get(other).add(parkplace); // should also be O(log n) or so
// expected consistency:
assert ! inverse.get(mine).contains(parkplace);
assert forward.get(parkplace) == other;
// and this should be fast, not iterate every possible location just to filter for mine:
for (Location l : mine) { /* do something clever */ }
The simple java approaches are: 1. To maintain only one side of the relationship, either as a Map<Location, Region> or a Map<Region, Set<Location>>, and collect the inverse relationship by iteration when needed; Or, 2. To make a wrapper that maintains both sides' Maps, and intercept all mutating calls to keep both sides in sync.
1 is O(n) instead of O(log n), which is becoming a problem. I started in on 2 and was in the weeds straightaway. (Know how many different ways there are to alter a Map entry?)
This is almost trivial in the sql world (Location table gets an indexed RegionID column). Is there something obvious I'm missing that makes it trivial for normal objects?
I might misunderstand your model, but if your Location and Region have correct equals() and hashCode() implemented, then the set of Location -> Region is just a classical simple Map implementation (multiple distinct keys can point to the same object value). The Region -> Set of Location is a Multimap (available in Google Coll.). You could compose your own class with the proper add/remove methods to manipulate both submaps.
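A minimal sketch of such a composed class, using only the JDK, could look like this. The class name ManyToOneIndex and its method names are invented for the example; it relies on hash maps, so it assumes correct equals() and hashCode() as noted above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: many-to-one mapping with a maintained inverse index.
final class ManyToOneIndex<K, V> {
    private final Map<K, V> forward = new HashMap<K, V>();
    private final Map<V, Set<K>> inverse = new HashMap<V, Set<K>>();

    V get(K key) {
        return forward.get(key);
    }

    // Read-only view of the keys mapped to the value; empty if there are none.
    Set<K> keysOf(V value) {
        Set<K> keys = inverse.get(value);
        return keys == null ? Collections.<K>emptySet() : Collections.unmodifiableSet(keys);
    }

    // Moves the key to the given value, updating both maps in one place.
    void put(K key, V value) {
        remove(key);
        forward.put(key, value);
        Set<K> keys = inverse.get(value);
        if (keys == null) {
            keys = new HashSet<K>();
            inverse.put(value, keys);
        }
        keys.add(key);
    }

    // Removes the key from both sides of the relationship.
    void remove(K key) {
        if (!forward.containsKey(key)) {
            return;
        }
        V old = forward.remove(key);
        Set<K> keys = inverse.get(old);
        if (keys != null) {
            keys.remove(key);
            if (keys.isEmpty()) {
                inverse.remove(old);
            }
        }
    }
}
```

Since put() and remove() are the only mutators and the inverse sets are exposed read-only, the two maps cannot drift apart, at the cost of not offering the full Map interface.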
Maybe an overkill, but you could also use in-memory sql server (HSQLDB, etc). It allows you to create index on many columns.
I think you could achieve what you need with the following two classes. While it does involve two maps, they are not exposed to the outside world, so there shouldn't be a way for them to get out of sync. As for storing the same "fact" twice, I don't think you'll get around that in any efficient implementation, whether the fact is stored twice explicitly as it is here, or implicitly as it would be when your database creates an index to make joins more efficient on your 2 tables. You can add new things to the MagicSet and it will update both mappings, or you can add things to the MagicMapper, which will then update the inverse map automatically. The girlfriend is calling me to bed now so I cannot run this through a compiler, but it should be enough to get you started. What puzzle are you trying to solve?
public class MagicSet<L, R> {
    private final Map<L, R> forward;
    private final R r;
    private final Set<L> set;
    public MagicSet(Map<L, R> forward, R r) {
        this.forward = forward;
        this.r = r;
        this.set = new HashSet<L>();
    }
    public void add(L l) {
        set.add(l);
        forward.put(l, r);
    }
    public void remove(L l) {
        set.remove(l);
        forward.remove(l);
    }
    public int size() {
        return set.size();
    }
    public boolean contains(L l) {
        return set.contains(l);
    }
    // Caution: do not use the remove method from this iterator. If this class was going
    // to be reused often, you would want to return a wrapped iterator that handled the
    // remove method properly. In fact, if you did that, I think you could then extend
    // AbstractSet, and MagicSet would then fully implement java.util.Set.
    public Iterator<L> iterator() {
        return set.iterator();
    }
}
public class MagicMapper<L, R> { // note that it doesn't implement Map, though it could with some extra work. I don't get the impression you need that though.
    private final Map<L, R> forward;
    private final Map<R, MagicSet<L, R>> inverse;
    public MagicMapper() {
        forward = new HashMap<L, R>();
        inverse = new HashMap<R, MagicSet<L, R>>();
    }
    public R getForward(L key) {
        return forward.get(key);
    }
    public MagicSet<L, R> getBackward(R key) {
        return inverse.get(key); // this assumes you want a null if
        // you try to use a key that has no mapping. otherwise you'd return a blank MagicSet
    }
    public void put(L l, R r) {
        R oldVal = forward.get(l);
        // if the L had already belonged to an R, we need to undo that mapping
        MagicSet<L, R> oldSet = inverse.get(oldVal);
        if (oldSet != null) { oldSet.remove(l); }
        // now get the set that belongs to the R, and add the L to it.
        MagicSet<L, R> newSet = inverse.get(r);
        if (newSet == null) {
            newSet = new MagicSet<L, R>(forward, r);
            inverse.put(r, newSet);
        }
        newSet.add(l); // magically updates the "forward" map
    }
}
