Fixed-size collection that keeps top (N) values in Java - java

I need to keep the top N (< 1000) integers while adding values from a big list of integers (a lazy list of around a million elements). I want to try adding each value to a collection, but the collection should keep only the top N (highest) integers. Is there a preferred data structure for this purpose?

I'd suggest using a sorted data structure, such as TreeSet. Before insertion, check the number of items in the set; if it has reached 1000 and the smallest number is smaller than the new number, remove the smallest and add the new one. Note that a TreeSet stores each value only once, so duplicate values are collapsed.
TreeSet<Integer> set = ...;
public void add (int n) {
if (set.size () < 1000) {
set.add (n);
} else {
Integer first = set.first();
if (first.intValue() < n) {
set.pollFirst();
set.add (n);
}
}
}

Google Guava MinMaxPriorityQueue class.
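You can also use custom sorting by supplying a comparator (use the orderedBy(Comparator<B> comparator) method). When the queue is full it evicts its greatest element according to that comparator, so keeping the highest values requires a reversed comparator.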
Note: This collection is NOT a sorted collection.
See javadoc
Example:
@Test
public void test() {
final int maxSize = 5;
// Natural order
final MinMaxPriorityQueue<Integer> queue = MinMaxPriorityQueue
.maximumSize(maxSize).create();
queue.addAll(Arrays.asList(10, 30, 60, 70, 20, 80, 90, 50, 100, 40));
assertEquals(maxSize, queue.size());
assertEquals(new Integer(50), Collections.max(queue));
System.out.println(queue);
}
Output:
[10, 50, 40, 30, 20]

One efficient solution is a slightly tweaked array-based priority queue using a binary min-heap.
First N integers are simply added to the heap one by one or you can build it from array of first N integers (slightly faster).
After that, compare the incoming integer with the root element (which is the MIN value found so far). If the new integer is larger than that, simply replace the root with this new integer and perform a down-heap operation (i.e. trickle the new integer down until both its children are larger or it becomes a leaf). The data structure guarantees you will always have the N largest integers so far, with each addition taking O(log N) time in the worst case.
Here is my C# implementation, the mentioned method is named "EnqueueDown". The "EnqueueUp" is a standard enqueue operation that expands the array, adds new leaf and trickles it up.
I have tested it on 1M numbers with max heap size of 1000 and it runs under 200 ms:
namespace ImagingShop.Research.FastPriorityQueue
{
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
public sealed class FastPriorityQueue<T> : IEnumerable<Tuple<T, float>>
{
private readonly int capacity;
private readonly Tuple<T, float>[] nodes;
private int count = 0;
public FastPriorityQueue(int capacity)
{
this.capacity = capacity;
this.nodes = new Tuple<T, float>[capacity];
}
public int Capacity => this.capacity;
public int Count => this.count;
public T FirstNode => this.nodes[0].Item1;
public float FirstPriority => this.nodes[0].Item2;
public void Clear()
{
this.count = 0;
}
public bool Contains(T node) => this.nodes.Any(tuple => Equals(tuple.Item1, node));
public T Dequeue()
{
T nodeHead = this.nodes[0].Item1;
int index = (this.count - 1);
this.nodes[0] = this.nodes[index];
this.count--;
DownHeap(0); // sift the element that was moved to the root back down
return nodeHead;
}
public void EnqueueDown(T node, float priority)
{
if (this.count == this.capacity)
{
if (priority < this.nodes[0].Item2)
{
return;
}
this.nodes[0] = Tuple.Create(node, priority);
DownHeap(0);
return;
}
int index = this.count;
this.count++;
this.nodes[index] = Tuple.Create(node, priority);
UpHeap(index);
}
public void EnqueueUp(T node, float priority)
{
int index = this.count;
this.count++;
this.nodes[index] = Tuple.Create(node, priority);
UpHeap(index);
}
public IEnumerator<Tuple<T, float>> GetEnumerator()
{
for (int i = 0; i < this.count; i++) yield return this.nodes[i];
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void DownHeap(int index)
{
while (true)
{
int indexLeft = (index << 1);
int indexRight = (indexLeft | 1);
int indexMin = ((indexLeft < this.count) && (this.nodes[indexLeft].Item2 < this.nodes[index].Item2))
? indexLeft
: index;
if ((indexRight < this.count) && (this.nodes[indexRight].Item2 < this.nodes[indexMin].Item2))
{
indexMin = indexRight;
}
if (indexMin == index)
{
break;
}
Flip(index, indexMin);
index = indexMin;
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void Flip(int indexA, int indexB)
{
var temp = this.nodes[indexA];
this.nodes[indexA] = this.nodes[indexB];
this.nodes[indexB] = temp;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void UpHeap(int index)
{
while (true)
{
if (index == 0)
{
break;
}
int indexParent = (index >> 1);
if (this.nodes[indexParent].Item2 <= this.nodes[index].Item2)
{
break;
}
Flip(index, indexParent);
index = indexParent;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
The basic implementation is taken from "Cormen, Thomas H. Introduction to algorithms. MIT press, 2009."
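Since the question is about Java, here is a minimal Java sketch of the same idea: a fixed-size binary min-heap over an int array whose root is the smallest retained value. The class name TopNHeap and its methods are made up for illustration, and it uses the conventional 0-based child indices (2i+1 and 2i+2) rather than the indexing in the C# code above.

```java
import java.util.Arrays;

/** Hypothetical helper: keeps the n largest ints seen so far in a fixed-size binary min-heap. */
final class TopNHeap {
    private final int[] heap; // heap[0] is the smallest of the retained values
    private int size;

    TopNHeap(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        heap = new int[capacity];
    }

    /** Offers a value; it is kept only if the heap is not full or it beats the current minimum. */
    void offer(int value) {
        if (size < heap.length) {
            heap[size] = value;
            siftUp(size++);
        } else if (value > heap[0]) {
            heap[0] = value; // replace the root, then restore the heap property
            siftDown(0);
        }
    }

    private void siftUp(int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (heap[parent] <= heap[i]) break;
            swap(i, parent);
            i = parent;
        }
    }

    private void siftDown(int i) {
        while (true) {
            int left = 2 * i + 1, right = left + 1, min = i;
            if (left < size && heap[left] < heap[min]) min = left;
            if (right < size && heap[right] < heap[min]) min = right;
            if (min == i) return;
            swap(i, min);
            i = min;
        }
    }

    private void swap(int a, int b) {
        int t = heap[a];
        heap[a] = heap[b];
        heap[b] = t;
    }

    /** Returns a sorted (ascending) copy of the retained values. */
    int[] sortedValues() {
        int[] copy = Arrays.copyOf(heap, size);
        Arrays.sort(copy);
        return copy;
    }
}
```

Offering 10, 30, 60, 70, 20, 80, 90, 50, 100, 40 to a TopNHeap of capacity 3 should leave 80, 90 and 100 in the heap. Unlike a TreeSet, duplicates are retained.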

In Java 1.7 one may use java.util.PriorityQueue. To keep the top N items, order the queue ascending (for integers, the natural order). In this manner the smallest retained number is always at the head and can be removed if there are too many items in the queue.
package eu.pawelsz.example.topn;
import java.util.Comparator;
import java.util.PriorityQueue;
public class TopN {
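public static <E> void add(int keep, PriorityQueue<E> priorityQueue, E element) {
priorityQueue.add(element);
// evict the smallest element once the queue holds more than `keep` items;
// polling before adding would wrongly replace the minimum with a smaller element
if (priorityQueue.size() > keep) {
priorityQueue.poll();
}
}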
public static void main(String[] args) {
int N = 4;
PriorityQueue<Integer> topN = new PriorityQueue<>(N, new Comparator<Integer>() {
@Override
public int compare(Integer o1, Integer o2) {
return Integer.compare(o1, o2); // avoids the overflow that o1 - o2 can cause
}
});
add(N, topN, 1);
add(N, topN, 2);
add(N, topN, 3);
add(N, topN, 4);
System.out.println("smallest: " + topN.peek());
add(N, topN, 8);
System.out.println("smallest: " + topN.peek());
add(N, topN, 5);
System.out.println("smallest: " + topN.peek());
add(N, topN, 2);
System.out.println("smallest: " + topN.peek());
}
}

// Keeps the top K elements in the queue (assumes the queue orders its smallest element first)
public static <E> void add(int keep, PriorityQueue<E> priorityQueue, E element) {
if (priorityQueue.size() < keep) {
priorityQueue.add(element);
}
else {
priorityQueue.add(element); // size = keep + 1
priorityQueue.poll(); // drop the smallest, resized to keep
}
}

The fastest way is likely a simple array items = new Item[N]; and a revolving cursor int cursor = 0;. The cursor points to the insertion point of the next element.
To add a new element use the method
put(Item newItem) { items[cursor++] = newItem; if(cursor == N) cursor = 0; }
when accessing this structure you can make the last item added appear at index 0 via a small recalculation of the index, i.e.
get(int index) { return items[ cursor > index ? cursor-index-1 : cursor-index-1+N ]; }
(the -1 is because cursor always point at the next insertion point, i.e. cursor-1 is the last element added).
Summary: put(item) will add a new item. get(0) will get the last item added, get(1) will get the second last item, etc.
In case you need to take care of the case where n < N elements have been added you just need to check for null.
(TreeSets will likely be slower)
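Put together, the two one-liners above could look like the following sketch. The class name LastN is made up for illustration. Note that this structure remembers the last N items added, not the N largest ones.

```java
/** Hypothetical ring buffer: remembers the last N items added (not the N largest). */
final class LastN<T> {
    private final Object[] items;
    private int cursor; // index of the next insertion point

    LastN(int n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        items = new Object[n];
    }

    /** Adds an item, overwriting the oldest one once the buffer is full. */
    void put(T item) {
        items[cursor++] = item;
        if (cursor == items.length) cursor = 0;
    }

    /** get(0) is the most recently added item, get(1) the one before it; null if not filled yet. */
    @SuppressWarnings("unchecked")
    T get(int index) {
        int n = items.length;
        // cursor - 1 is the last element added; wrap around when the result would be negative
        return (T) items[cursor > index ? cursor - index - 1 : cursor - index - 1 + n];
    }
}
```

After putting 1, 2, 3, 4 into a LastN of size 3, get(0) returns 4, get(1) returns 3 and get(2) returns 2; the value 1 has been overwritten.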

Your Question is answered here:
Size-limited queue that holds last N elements in Java
To summarize:
No, there is no such data structure in the default Java SDK, but Apache Commons Collections 4 has a CircularFifoQueue.

Related

Insert an element to Array List sorted descending order

I'm trying to insert an element in the correct position in an array list that is sorted in descending order.
The complexity to find the correct position must be O(LOGN).
That's why I tried using binary search to find the correct position.
This is what I did:
I added:
middle = (low + high) / 2;
after the while loop.
The problem is that it's inserting the elements in ascending order instead of descending order.
public void insert(E x) {
if(q.size()==0){
q.add(0, x);
}
else{
int place = binarySearch(x);
q.add(place, x);
}
}
private int binarySearch (E x) {
int size = q.size();
int low = 0;
int high = size - 1;
int middle = 0;
while(high > low) {
middle = (low + high) / 2;
if(q.get(middle).getPriority() == x.getPriority()) {
return middle;
}
if(q.get(middle).getPriority() < x.getPriority()) {
low = middle + 1;
}
if(q.get(middle).getPriority() > x.getPriority()) {
high = middle - 1;
}
}
middle = (low + high) / 2;
if(q.get(middle).getPriority() < x.getPriority()) {
return middle + 1 ;
}
return middle;
}
There are a few problems with your code:
all your comparisons are the wrong way, thus you are inserting in ascending order
you should loop while (high >= low), or you can not insert an element that's smaller than all the existing elements; also, with this you no longer need the if/else in insert
if you want ties to be handled such that the oldest element is sorted first, remove the "same as middle" check and reverse the if/else within the loop; this way, in case of ties, the low bound is raised, inserting the new element after the older one
now, after the while loop, you can just return low
This seems to work (Note: Changed to Integer instead of E for testing, populating an initially empty list with random integers.):
public void insert(E x) {
q.add(binarySearch(x), x);
}
private int binarySearch (E x) {
int low = 0;
int high = q.size() - 1;
while (high >= low) {
int middle = (low + high) / 2;
if (q.get(middle).getPriority() < x.getPriority()) {
high = middle - 1;
} else {
low = middle + 1;
}
}
return low;
}
Tests and example output:
@Data @AllArgsConstructor
class E {
int id, priority;
public String toString() { return String.format("%d/%d", id, priority); }
}
Random random = new Random();
int id = 0;
for (int i = 0; i < 50; i++) {
test.insert(new E(id++, random.nextInt(20)));
}
System.out.println(test.q);
// [2/19, 3/19, 24/19, 32/19, 46/19, 18/18, 23/18, 39/18, 31/17, 10/16, 28/16, 40/16, 45/16, 7/15, 19/14, 33/14, 37/14, 38/14, 36/13, 44/13, 5/11, 12/11, 15/11, 20/11, 30/11, 9/10, 41/10, 48/10, 16/9, 34/9, 13/8, 1/7, 8/7, 35/7, 0/6, 6/6, 22/6, 29/6, 21/5, 26/5, 42/5, 14/4, 27/4, 47/4, 25/3, 4/1, 11/1, 17/1, 43/1, 49/0]
This could be a lot simpler using Collections.binarySearch. The method returns the index of the key if it is found, or a negative value encoding where it should be inserted:
the index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1). The insertion point is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key, or list.size() if all elements in the list are less than the specified key. Note that this guarantees that the return value will be >= 0 if and only if the key is found.
Here is a quick example of a SortedList
class SortedList<E> extends ArrayList<E>{
Comparator<E> comparator;
public SortedList(Comparator<E> comparator) {
this.comparator = comparator;
}
@Override
public boolean add(E e) {
int index = Collections.binarySearch(this, e, comparator);
if(index < 0){
index = -index - 1;
}
if(index >= this.size()){
super.add(e);
} else {
super.add(index, e);
}
return true;
}
}
And a test case for a descending order:
SortedList<Integer> list = new SortedList<>(
(i1, i2) -> i2 - i1
);
list.add(10);
list.add(20);
list.add(15);
list.add(10);
System.out.println(list);
[20, 15, 10, 10]
The comparator passed to the constructor lets you set the order used for insertion. Note that this is not safe, since it does not override every method that can add elements, but it is a quick answer ;)
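As a side note, a subtraction comparator such as (i1, i2) -> i2 - i1 can overflow for values far apart. A minimal sketch of the same insertion that avoids this with Comparator.reverseOrder() follows; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class DescendingInsert {
    /** Inserts value into a list that is already sorted in descending order. */
    static void insert(List<Integer> sortedDesc, int value) {
        Comparator<Integer> desc = Comparator.reverseOrder();
        int index = Collections.binarySearch(sortedDesc, value, desc);
        if (index < 0) {
            index = -index - 1; // decode the insertion point from the negative result
        }
        sortedDesc.add(index, value);
    }
}
```

Finding the position is O(log N) on a random-access list, but ArrayList.add(index, e) still shifts the tail, so each insertion is O(N) overall.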

Java Recursive MergeSort for ArrayLists

I have been having a problem with my mergesort function, as I am not able to sort a series of integers or strings whenever inputting it into the program. I have an outside class that calls items into it, however it simply doesn't sort the numbers/strings. The two methods are below, I don't know where the problem is. Numbers are randomly inputted.
CODE:
/**
* Takes in entire vector, but will merge the following sections together:
* Left sublist from a[first]..a[mid], right sublist from a[mid+1]..a[last].
* Precondition: each sublist is already in ascending order
*
* @param a
* reference to an array of integers to be sorted
* @param first
* starting index of range of values to be sorted
* @param mid
* midpoint index of range of values to be sorted
* @param last
* last index of range of values to be sorted
*/
private void merge(ArrayList<Comparable> a, int first, int mid, int last) {
int x;
int i;
ArrayList<Comparable> left = new ArrayList<Comparable>();
ArrayList<Comparable> right = new ArrayList<Comparable>();
mergeSort(a,first,mid);
for(i = 0; i < a.size() - mid; i++){
left.add(i,a.get(i));
a.remove(i);
}
mergeSort(a,mid,last);
for (x = mid; x < a.size(); x++) {
right.add(x,a.get(x));
a.remove(x);
}
if ((left.get(i).compareTo(right.get(x))) > 0) {
i++;
a.add(i);
} else if (i < x) {
x++;
a.add(x);
}
System.out.println();
System.out.println("Merge");
System.out.println();
}
/**
* Recursive mergesort of an array of integers
*
* @param a
* reference to an array of integers to be sorted
* @param first
* starting index of range of values to be sorted
* @param last
* ending index of range of values to be sorted
*/
public void mergeSort(ArrayList<Comparable> a, int first, int last) {
int mid = (first + last)/2;
if(first == last){
}else if(last - first == 1){
merge(a,first, mid ,last);
}else{
last = mid;
}
}
I have an outside class that calls items into it, however it simply doesn't sort the numbers/strings. The two methods are below, I don't know where the problem is.
The first problem is that if you call your mergeSort method with first = 0 and last = a.size() you won't sort anything as you only call merge if last-first == 1 :
public void mergeSort(ArrayList<Comparable> a, int first, int last) {
int mid = (first + last)/2;
if(first == last){
}else if(last - first == 1){
// you only merge if last - first == 1...
merge(a,first, mid ,last);
}else{
last = mid;
}
}
Apart from this point, I don't get how you're trying to implement the merge sort algorithm. It's neither a top-down nor a bottom-up implementation. You're splitting inside the merge method, which is also really odd. It would have been easier to help you if you had provided your pseudocode and the way you call your public method. IMHO you have a real issue with your algorithm.
In fact the merge sort algorithm is really simple to implement. To illustrate this, I wrote this top down implementation of the merge sort algorithm using Deque instead of List objects:
import java.util.Deque;
import java.util.LinkedList;
public class Example {
private LinkedList<Comparable> merge(final Deque<Comparable> left, final Deque<Comparable> right) {
final LinkedList<Comparable> merged = new LinkedList<>();
while (!left.isEmpty() && !right.isEmpty()) {
if (left.peek().compareTo(right.peek()) <= 0) {
merged.add(left.pop());
} else {
merged.add(right.pop());
}
}
merged.addAll(left);
merged.addAll(right);
return merged;
}
public void mergeSort(final LinkedList<Comparable> input) {
if (input.size() > 1) { // an empty list must stop the recursion too
final LinkedList<Comparable> left = new LinkedList<Comparable>();
final LinkedList<Comparable> right = new LinkedList<Comparable>();
// boolean used to decide if we put elements
// in left or right LinkedList
boolean logicalSwitch = true;
while (!input.isEmpty()) {
if (logicalSwitch) {
left.add(input.pop());
} else {
right.add(input.pop());
}
logicalSwitch = !logicalSwitch;
}
mergeSort(left);
mergeSort(right);
input.addAll(merge(left, right));
}
}
}
I used Deque because peek()/pop() is way prettier IMHO than get(0) and remove(0), but it's up to you. If you absolutely want to use ArrayList, the corresponding implementation follows.
import java.util.ArrayList;
import java.util.List;
public class Example {
private List<Comparable> merge(final List<Comparable> left, final List<Comparable> right) {
final List<Comparable> merged = new ArrayList<>();
while (!left.isEmpty() && !right.isEmpty()) {
if (left.get(0).compareTo(right.get(0)) <= 0) {
merged.add(left.remove(0));
} else {
merged.add(right.remove(0));
}
}
merged.addAll(left);
merged.addAll(right);
return merged;
}
public void mergeSort(final List<Comparable> input) {
if (input.size() > 1) { // an empty list must stop the recursion too
final List<Comparable> left = new ArrayList<Comparable>();
final List<Comparable> right = new ArrayList<Comparable>();
boolean logicalSwitch = true;
while (!input.isEmpty()) {
if (logicalSwitch) {
left.add(input.remove(0));
} else {
right.add(input.remove(0));
}
logicalSwitch = !logicalSwitch;
}
mergeSort(left);
mergeSort(right);
input.addAll(merge(left, right));
}
}
}
Both implementations work with Integer and String or any other Comparable.
Hope it helps.
There are several problems but an important one is that you should not iterate over a list while modifying the list, i.e. in:
for (i = 0; i < a.size() - mid; i++){
left.add(i,a.get(i));
a.remove(i);
}
because once you remove an element, indexes for others are not the same... So you add in left elements of a that are not what you think.
A working code is the following (with some comments) :
private static void merge(ArrayList<Comparable> a) {
if (a.size()<=1) return; // small list don't need to be merged
// SEPARATE
int mid = a.size()/2; // estimate half the size
ArrayList<Comparable> left = new ArrayList<Comparable>();
ArrayList<Comparable> right = new ArrayList<Comparable>();
for(int i = 0; i < mid; i++) left.add(a.remove(0)); // put first half part in left
while (a.size()!=0) right.add(a.remove(0)); // put the remainings in right
// Here a is now empty
// SORT EACH PART INDEPENDENTLY
merge(left); // merge the left part
merge(right); // merge the right part
// MERGE PARTS
// while there is something in the two lists
while (left.size()!=0 && right.size()!=0) {
// compare both heads, add the lesser into the result and remove it from its list
if (left.get(0).compareTo(right.get(0))<0) a.add(left.remove(0));
else a.add(right.remove(0));
}
// fill the result with what remains in left OR right (they can't both still contain elements)
while(left.size()!=0) a.add(left.remove(0));
while(right.size()!=0) a.add(right.remove(0));
}
It has been tested on some inputs... Example:
[4, 7, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11]
[0, 1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
For efficiency you may use the subList method to avoid constructing too many sublists explicitly, but then you need to take care with the indices.
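To illustrate the index bookkeeping, here is a minimal sketch that sorts index ranges of the same list instead of removing elements from the front. It works on plain indices rather than subList views, and the class name IndexMergeSort is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexMergeSort {
    /** Sorts the list in place in ascending order; equal elements keep their relative order. */
    static <T extends Comparable<? super T>> void sort(List<T> a) {
        sort(a, 0, a.size());
    }

    // Sorts the half-open range [from, to).
    private static <T extends Comparable<? super T>> void sort(List<T> a, int from, int to) {
        if (to - from < 2) return; // zero or one element: already sorted
        int mid = (from + to) >>> 1;
        sort(a, from, mid);
        sort(a, mid, to);

        // Merge the two sorted halves into a temporary buffer.
        List<T> merged = new ArrayList<>(to - from);
        int i = from, j = mid;
        while (i < mid && j < to) {
            // <= takes from the left half on ties, which keeps the sort stable
            merged.add(a.get(i).compareTo(a.get(j)) <= 0 ? a.get(i++) : a.get(j++));
        }
        while (i < mid) merged.add(a.get(i++));
        while (j < to) merged.add(a.get(j++));

        // Copy the merged run back into its place in the original list.
        for (int k = 0; k < merged.size(); k++) {
            a.set(from + k, merged.get(k));
        }
    }
}
```

On an ArrayList, get and set are constant time, so this avoids the repeated remove(0) calls, each of which shifts the whole list.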
A WARNING about Kraal's implementation that got the checkmark. It's a great implementation, but Kraal's Merge sort doesn't preserve the relative order of items that have the same value, which in some cases, when sorting objects for instance, is an important strength that merge sort has that other sorting algorithms, like quicksort, do not have. I modified Kraal's code to preserve relative orders.
private static List<Object> merge(final List<Object> left, final List<Object> right) {
printArr("left", left);
printArr("Right", right);
final List<Object> merged = new ArrayList<>();
while (!left.isEmpty() && !right.isEmpty()) {
if(left.get(0).getValue()-right.get(0).getValue() <= 0){
merged.add(left.remove(0));
} else {
merged.add(right.remove(0));
}
}
merged.addAll(left);
merged.addAll(right);
return merged;
}
public static void mergeSort(final List<Object> input) {
if (input.size() > 1) {
final List<Object> left = new ArrayList<Object>();
final List<Object> right = new ArrayList<Object>();
boolean logicalSwitch = true;
while (!input.isEmpty()) {
if (logicalSwitch) {
left.add(input.remove(0));
} else {
right.add(input.remove(input.size()/2));
}
logicalSwitch = !logicalSwitch;
}
mergeSort(left);
mergeSort(right);
input.addAll(merge(left, right));
}
}
If you want to sort a list using merge sort without implementing the sorting algorithm yourself, I recommend using the standard Java sort, because it uses a merge sort variant for non-primitive types.
Collections.sort();
If you would like to implement your own version of Merge sort then you should look first at this implementation.
And if you are interested in better understanding sorting algorithms I recommend this book.
public class MergeSort{
public void sort(List<Integer> list){
sortAndMerge(list, 0, list.size()-1);
}
public void sortAndMerge(List<Integer> list, int start, int end){
if((end - start) >= 2){
int mid = (end - start)/2;
sortAndMerge(list, start, start + mid);
sortAndMerge(list, start + mid +1, end);
int i=start;
int j=start + mid +1;
while(i<j && j<=end){
if(list.get(i) > list.get(j)){
list.add(i, list.remove(j));
i++;
j++;
}else if(list.get(i).equals(list.get(j))){ // == would compare Integer references, not values
list.add(i+1, list.remove(j));
i++;
j++;
}else{
i++;
}
}
}else{
if(end > start){
if(list.get(start) > list.get(end)){
int endValue = list.remove(end);
list.add(start, endValue);
}
}
}
}
}

How to implement a Spliterator for streaming Fibonacci numbers?

I'm playing with Java 8 Spliterator and created one to stream Fibonacci numbers up to a given n. So for the Fibonacci series 0, 1, 1, 2, 3, 5, 8, ...
n fib(n)
-----------
-1 0
1 0
2 1
3 1
4 2
Following is my implementation which prints a bunch of 1 before running out of stack memory. Can you help me find the bug? (I think it's not advancing the currentIndex but I'm not sure what value to set it to).
Edit 1: If you decide to answer, please keep it relevant to the question. This question is not about efficient fibonacci number generation; it's about learning spliterators.
FibonacciSpliterator:
@RequiredArgsConstructor
public class FibonacciSpliterator implements Spliterator<FibonacciPair> {
private int currentIndex = 3;
private FibonacciPair pair = new FibonacciPair(0, 1);
private final int n;
@Override
public boolean tryAdvance(Consumer<? super FibonacciPair> action) {
// System.out.println("tryAdvance called.");
// System.out.printf("tryAdvance: currentIndex = %d, n = %d, pair = %s.\n", currentIndex, n, pair);
action.accept(pair);
return n - currentIndex >= 2;
}
@Override
public Spliterator<FibonacciPair> trySplit() {
// System.out.println("trySplit called.");
FibonacciSpliterator fibonacciSpliterator = null;
if (n - currentIndex >= 2) {
// System.out.printf("trySplit Begin: currentIndex = %d, n = %d, pair = %s.\n", currentIndex, n, pair);
fibonacciSpliterator = new FibonacciSpliterator(n);
long currentFib = pair.getMinusTwo() + pair.getMinusOne();
long nextFib = pair.getMinusOne() + currentFib;
fibonacciSpliterator.pair = new FibonacciPair(currentFib, nextFib);
fibonacciSpliterator.currentIndex = currentIndex + 3;
// System.out.printf("trySplit End: currentIndex = %d, n = %d, pair = %s.\n", currentIndex, n, pair);
}
return fibonacciSpliterator;
}
@Override
public long estimateSize() {
return n - currentIndex;
}
@Override
public int characteristics() {
return ORDERED | IMMUTABLE | NONNULL;
}
}
FibonacciPair:
@RequiredArgsConstructor
@Value
public class FibonacciPair {
private final long minusOne;
private final long minusTwo;
@Override
public String toString() {
return String.format("%d %d ", minusOne, minusTwo);
}
}
Usage:
Spliterator<FibonacciPair> spliterator = new FibonacciSpliterator(5);
StreamSupport.stream(spliterator, true)
.forEachOrdered(System.out::print);
Besides the fact that your code is incomplete, there are at least two recognizable errors in your tryAdvance method. First, you are not actually making any advance: you never modify any state of your spliterator. Second, you invoke the action’s accept method unconditionally, which does not match the fact that you return a conditional value rather than true.
The purpose of tryAdvance is:
as the name suggests, try to make an advance, i.e. calculate a next value
if there is a next value, invoke action.accept with that value and return true
otherwise just return false
Note further that your trySplit() does not look very convincing; I don’t even know where to start. You are better off inheriting from AbstractSpliterator and not implementing a custom trySplit(). Your operation doesn’t benefit from parallel execution anyway. A stream constructed with that source could only gain an advantage from parallel execution if you chain it with quite expensive per-element operations.
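To make the tryAdvance contract concrete, here is a minimal sketch that extends the JDK's Spliterators.AbstractLongSpliterator and therefore inherits its trySplit(). It emits plain long values rather than the FibonacciPair objects from the question, and the class name SimpleFibSpliterator and the helper firstN are made up for illustration.

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.LongConsumer;
import java.util.stream.StreamSupport;

/** Hypothetical spliterator over the first `limit` Fibonacci numbers: 0, 1, 1, 2, 3, ... */
final class SimpleFibSpliterator extends Spliterators.AbstractLongSpliterator {
    private final int limit;
    private int emitted;
    private long current = 0, next = 1;

    SimpleFibSpliterator(int limit) {
        super(limit, Spliterator.ORDERED | Spliterator.NONNULL);
        this.limit = limit;
    }

    @Override
    public boolean tryAdvance(LongConsumer action) {
        if (emitted >= limit) {
            return false; // nothing left: do not call the action
        }
        action.accept(current); // hand over exactly one value
        long sum = current + next; // then advance the internal state
        current = next;
        next = sum;
        emitted++;
        return true;
    }

    /** Convenience helper: collects the first n values through a sequential stream. */
    static long[] firstN(int n) {
        return StreamSupport.longStream(new SimpleFibSpliterator(n), false).toArray();
    }
}
```

For example, SimpleFibSpliterator.firstN(10) yields 0, 1, 1, 2, 3, 5, 8, 13, 21, 34. The values are only meaningful while they fit in a long, so the sketch assumes a small limit.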
In general you don't need to implement the spliterator yourself. If you really need a Spliterator object, you may use a stream for this purpose:
Spliterator.OfLong spliterator = Stream
.iterate(new long[] { 0, 1 },
prev -> new long[] { prev[1], prev[0] + prev[1] })
.mapToLong(pair -> pair[1]).spliterator();
Testing:
for(int i=0; i<20; i++)
spliterator.tryAdvance((LongConsumer)System.out::println);
Please note that holding Fibonacci numbers in a long variable is questionable: it overflows after Fibonacci number 92. So if you want to create a spliterator which just iterates over the first 92 Fibonacci numbers, I'd suggest using a predefined array for this purpose:
Spliterator.OfLong spliterator = Spliterators.spliterator(new long[] {
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765,
10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309,
3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141,
267914296, 433494437, 701408733, 1134903170, 1836311903, 2971215073L, 4807526976L,
7778742049L, 12586269025L, 20365011074L, 32951280099L, 53316291173L, 86267571272L, 139583862445L,
225851433717L, 365435296162L, 591286729879L, 956722026041L, 1548008755920L, 2504730781961L,
4052739537881L, 6557470319842L, 10610209857723L, 17167680177565L, 27777890035288L,
44945570212853L, 72723460248141L, 117669030460994L, 190392490709135L, 308061521170129L,
498454011879264L, 806515533049393L, 1304969544928657L, 2111485077978050L, 3416454622906707L,
5527939700884757L, 8944394323791464L, 14472334024676221L, 23416728348467685L, 37889062373143906L,
61305790721611591L, 99194853094755497L, 160500643816367088L, 259695496911122585L, 420196140727489673L,
679891637638612258L, 1100087778366101931L, 1779979416004714189L, 2880067194370816120L,
4660046610375530309L, 7540113804746346429L
}, Spliterator.ORDERED);
Array spliterator also splits well, so you will have real parallel processing.
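The limit of 92 can be checked rather than taken on trust. The following sketch counts how many terms of the sequence 1, 1, 2, 3, ... fit in a long by letting Math.addExact signal the first overflow; the class and method names are made up for illustration.

```java
final class FibOverflow {
    /** Counts how many terms of 1, 1, 2, 3, ... fit in a long before the next sum overflows. */
    static int countFittingInLong() {
        long prev = 1, cur = 1; // the first two terms
        int count = 2;
        while (true) {
            long next;
            try {
                next = Math.addExact(prev, cur); // throws ArithmeticException on overflow
            } catch (ArithmeticException overflow) {
                return count;
            }
            prev = cur;
            cur = next;
            count++;
        }
    }
}
```

This returns 92, which matches the length of the predefined array above.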
Ok, let's write the spliterator. Using OfLong is still too boring: let's switch to BigInteger and not limit the user to 92 numbers. The tricky thing here is to quickly jump to the given Fibonacci number. I'll use the matrix multiplication algorithm described here for this purpose. Here's my code:
static class FiboSpliterator implements Spliterator<BigInteger> {
private final static BigInteger[] STARTING_MATRIX = {
BigInteger.ONE, BigInteger.ONE,
BigInteger.ONE, BigInteger.ZERO};
private BigInteger[] state; // previous and current numbers
private int cur; // position
private final int fence; // max number to cover by this spliterator
public FiboSpliterator(int max) {
this(0, max);
}
// State is not initialized until traversal
private FiboSpliterator(int cur, int fence) {
assert fence >= 0;
this.cur = cur;
this.fence = fence;
}
// Multiplication of 2x2 matrix, by definition
static BigInteger[] multiply(BigInteger[] m1, BigInteger[] m2) {
return new BigInteger[] {
m1[0].multiply(m2[0]).add(m1[1].multiply(m2[2])),
m1[0].multiply(m2[1]).add(m1[1].multiply(m2[3])),
m1[2].multiply(m2[0]).add(m1[3].multiply(m2[2])),
m1[2].multiply(m2[1]).add(m1[3].multiply(m2[3]))};
}
// Log(n) algorithm to raise 2x2 matrix to n-th power
static BigInteger[] power(BigInteger[] m, int n) {
assert n > 0;
if(n == 1) {
return m;
}
if(n % 2 == 0) {
BigInteger[] root = power(m, n/2);
return multiply(root, root);
} else {
return multiply(power(m, n-1), m);
}
}
@Override
public boolean tryAdvance(Consumer<? super BigInteger> action) {
if(cur == fence)
return false; // traversal finished
if(state == null) {
// initialize state: done only once
if(cur == 0) {
state = new BigInteger[] {BigInteger.ZERO, BigInteger.ONE};
} else {
BigInteger[] res = power(STARTING_MATRIX, cur);
state = new BigInteger[] {res[1], res[0]};
}
}
action.accept(state[1]);
// update state
if(++cur < fence) {
BigInteger next = state[0].add(state[1]);
state[0] = state[1];
state[1] = next;
}
return true;
}
@Override
public Spliterator<BigInteger> trySplit() {
if(fence - cur < 2)
return null;
int mid = (fence+cur) >>> 1;
if(mid - cur < 100) {
// resulting interval is too small:
// instead of jumping we just store prefix into array
// and return ArraySpliterator
BigInteger[] array = new BigInteger[mid-cur];
for(int i=0; i<array.length; i++) {
tryAdvance(f -> {});
array[i] = state[0];
}
return Spliterators.spliterator(array, ORDERED | NONNULL | SORTED);
}
// Jump to another position
return new FiboSpliterator(cur, cur = mid);
}
@Override
public long estimateSize() {
return fence - cur;
}
@Override
public int characteristics() {
return ORDERED | IMMUTABLE | SIZED| SUBSIZED | NONNULL | SORTED;
}
@Override
public Comparator<? super BigInteger> getComparator() {
return null; // natural order
}
}
This implementation is actually faster in parallel for a very big fence value (like 100000). A wiser implementation is probably also possible, one that splits unevenly and reuses the intermediate results of the matrix multiplication.

How to implement a Median-heap

Like a Max-heap and Min-heap, I want to implement a Median-heap to keep track of the median of a given set of integers. The API should have the following three functions:
insert(int) // should take O(logN)
int median() // will be the topmost element of the heap. O(1)
int delmedian() // should take O(logN)
I want to use an array (a) implementation to implement the heap where the children of array index k are stored in array indices 2*k and 2*k + 1. For convenience, the array starts populating elements from index 1.
This is what I have so far:
The Median-heap will have two integers to keep track of number of integers inserted so far that are > current median (gcm) and < current median (lcm).
if abs(gcm-lcm) >= 2 and gcm > lcm we need to swap a[1] with one of its children.
The child chosen should be greater than a[1]. If both are greater,
choose the smaller of two.
Similarly for the other case. I can't come up with an algorithm for how to sink and swim elements. I think it should take into consideration how close the number is to the median, so something like:
private void swim(int k) {
while (k > 1 && absless(k, k/2)) {
exch(k, k/2);
k = k/2;
}
}
I can't come up with the entire solution though.
You need two heaps: one min-heap and one max-heap. Each heap contains about one half of the data. Every element in the min-heap is greater or equal to the median, and every element in the max-heap is less or equal to the median.
When the min-heap contains one more element than the max-heap, the median is in the top of the min-heap. And when the max-heap contains one more element than the min-heap, the median is in the top of the max-heap.
When both heaps contain the same number of elements, the total number of elements is even.
In this case you have to choose according to your definition of median: a) the mean of the two middle elements; b) the greater of the two; c) the lesser; d) either of the two, chosen at random...
Every time you insert, compare the new element with those at the top of the heaps in order to decide where to insert it. If the new element is greater than the current median, it goes to the min-heap. If it is less than the current median, it goes to the max heap. Then you might need to rebalance. If the sizes of the heaps differ by more than one element, extract the min/max from the heap with more elements and insert it into the other heap.
In order to construct the median heap for a list of elements, we should first use a linear time algorithm and find the median. Once the median is known, we can simply add elements to the Min-heap and Max-heap based on the median value. Balancing the heaps isn't required because the median will split the input list of elements into equal halves.
If you extract an element you might need to compensate the size change by moving one element from one heap to another. This way you ensure that, at all times, both heaps have the same size or differ by just one element.
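The two-heap scheme described above can be sketched compactly with java.util.PriorityQueue. This is only one possible arrangement: the class name TwoHeapMedian is made up for illustration, and for an even number of elements it arbitrarily picks option c), the lesser of the two middle elements.

```java
import java.util.Collections;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical two-heap median tracker; for an even count it reports the lower middle element. */
final class TwoHeapMedian {
    // max-heap holding the smaller half; its head is the reported median
    private final PriorityQueue<Integer> lower = new PriorityQueue<>(Collections.reverseOrder());
    // min-heap holding the larger half
    private final PriorityQueue<Integer> upper = new PriorityQueue<>();

    void insert(int n) {
        if (lower.isEmpty() || n <= lower.peek()) {
            lower.add(n);
        } else {
            upper.add(n);
        }
        rebalance();
    }

    int median() {
        if (lower.isEmpty()) throw new NoSuchElementException("no elements");
        return lower.peek();
    }

    int delMedian() {
        if (lower.isEmpty()) throw new NoSuchElementException("no elements");
        int m = lower.poll();
        rebalance();
        return m;
    }

    // Invariant: lower holds either as many elements as upper, or exactly one more.
    private void rebalance() {
        if (lower.size() > upper.size() + 1) {
            upper.add(lower.poll());
        } else if (upper.size() > lower.size()) {
            lower.add(upper.poll());
        }
    }
}
```

After inserting 5, 1, 9, 3, 7 the median is 5; deleting it leaves 1, 3, 7, 9, for which this sketch reports 3. insert and delMedian are O(log N), and median is O(1).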
Here is a Java implementation of a MedianHeap, developed with the help of comocomocomocomo's explanation above.
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Scanner;
/**
*
* @author BatmanLost
*/
public class MedianHeap {
//stores all the numbers less than the current median in a maxheap, i.e median is the maximum, at the root
private PriorityQueue<Integer> maxheap;
//stores all the numbers greater than the current median in a minheap, i.e median is the minimum, at the root
private PriorityQueue<Integer> minheap;
//comparators for PriorityQueue
private static final maxHeapComparator myMaxHeapComparator = new maxHeapComparator();
private static final minHeapComparator myMinHeapComparator = new minHeapComparator();
/**
* Comparator for the minHeap, smallest number has the highest priority, natural ordering
*/
private static class minHeapComparator implements Comparator<Integer>{
@Override
public int compare(Integer i, Integer j) {
// Integer.compare avoids i==j, which compares references for boxed Integers
return Integer.compare(i, j);
}
}
/**
* Comparator for the maxHeap, largest number has the highest priority
*/
private static class maxHeapComparator implements Comparator<Integer>{
// opposite to minHeapComparator, invert the return values
@Override
public int compare(Integer i, Integer j) {
// arguments swapped to reverse the natural ordering
return Integer.compare(j, i);
}
}
/**
* Constructor for a MedianHeap, to dynamically generate median.
*/
public MedianHeap(){
// initialize maxheap and minheap with appropriate comparators
maxheap = new PriorityQueue<Integer>(11,myMaxHeapComparator);
minheap = new PriorityQueue<Integer>(11,myMinHeapComparator);
}
/**
* Returns true if there is no median yet, i.e. nothing has been inserted
* @return true if both heaps are empty
*/
private boolean isEmpty(){
return maxheap.size() == 0 && minheap.size() == 0 ;
}
/**
* Inserts into MedianHeap to update the median accordingly
* @param n the number to insert
*/
public void insert(int n){
// initialize if empty
if(isEmpty()){ minheap.add(n);}
else{
//add to the appropriate heap
// if n is less than or equal to current median, add to maxheap
if(Double.compare(n, median()) <= 0){maxheap.add(n);}
// if n is greater than current median, add to min heap
else{minheap.add(n);}
}
// fix the chaos, if any imbalance occurs in the heap sizes
//i.e, absolute difference of sizes is greater than one.
fixChaos();
}
/**
* Re-balances the heap sizes
*/
private void fixChaos(){
//if sizes of heaps differ by 2, then it's a chaos, since median must be the middle element
if( Math.abs( maxheap.size() - minheap.size()) > 1){
//check which one is the culprit and take action by kicking out the root from culprit into victim
if(maxheap.size() > minheap.size()){
minheap.add(maxheap.poll());
}
else{ maxheap.add(minheap.poll());}
}
}
/**
* returns the median of the numbers encountered so far
* @return the current median
*/
public double median(){
//if total size (no. of elements entered) is even, then the median is the average of the 2 middle elements
//i.e, average of the root's of the heaps.
if( maxheap.size() == minheap.size()) {
return ((double)maxheap.peek() + (double)minheap.peek())/2 ;
}
//else median is middle element, i.e, root of the heap with one element more
else if (maxheap.size() > minheap.size()){ return (double)maxheap.peek();}
else{ return (double)minheap.peek();}
}
/**
* String representation of the numbers and median
* @return the inserted numbers followed by their median
*/
public String toString(){
StringBuilder sb = new StringBuilder();
sb.append("\n Median for the numbers : " );
for(int i: maxheap){sb.append(" "+i); }
for(int i: minheap){sb.append(" "+i); }
sb.append(" is " + median()+"\n");
return sb.toString();
}
/**
* Adds all the array elements and returns the median.
* @param array the numbers to insert
* @return the median after all insertions
*/
public double addArray(int[] array){
for(int i=0; i<array.length ;i++){
insert(array[i]);
}
return median();
}
/**
* Just a test
* @param N size of the random test array
*/
public void test(int N){
int[] array = InputGenerator.randomArray(N);
System.out.println("Input array: \n"+Arrays.toString(array));
addArray(array);
System.out.println("Computed Median is :" + median());
Arrays.sort(array);
System.out.println("Sorted array: \n"+Arrays.toString(array));
if(N%2==0){ System.out.println("Calculated Median is :" + (array[N/2] + array[(N/2)-1])/2.0);}
else{System.out.println("Calculated Median is :" + array[N/2] +"\n");}
}
/**
* Another testing utility
*/
public void printInternal(){
System.out.println("Less than median, max heap:" + maxheap);
System.out.println("Greater than median, min heap:" + minheap);
}
//Inner class to generate input for basic testing
private static class InputGenerator {
public static int[] orderedArray(int N){
int[] array = new int[N];
for(int i=0; i<N; i++){
array[i] = i;
}
return array;
}
public static int[] randomArray(int N){
int[] array = new int[N];
for(int i=0; i<N; i++){
array[i] = (int)(Math.random()*N*N);
}
return array;
}
public static int readInt(String s){
System.out.println(s);
Scanner sc = new Scanner(System.in);
return sc.nextInt();
}
}
public static void main(String[] args){
System.out.println("You got to stop the program MANUALLY!!");
while(true){
MedianHeap testObj = new MedianHeap();
testObj.test(InputGenerator.readInt("Enter size of the array:"));
System.out.println(testObj);
}
}
}
Here is my code, based on the answer provided by comocomocomocomo:
import java.util.PriorityQueue;
public class Median {
private PriorityQueue<Integer> minHeap =
new PriorityQueue<Integer>();
private PriorityQueue<Integer> maxHeap =
// Integer.compare avoids the overflow that o2-o1 can produce for large values
new PriorityQueue<Integer>((o1, o2) -> Integer.compare(o2, o1));
public float median() {
int minSize = minHeap.size();
int maxSize = maxHeap.size();
if (minSize == 0 && maxSize == 0) {
return 0;
}
if (minSize > maxSize) {
return minHeap.peek();
}if (minSize < maxSize) {
return maxHeap.peek();
}
return (minHeap.peek()+maxHeap.peek())/2F;
}
public void insert(int element) {
float median = median();
if (element > median) {
minHeap.offer(element);
} else {
maxHeap.offer(element);
}
balanceHeap();
}
private void balanceHeap() {
int minSize = minHeap.size();
int maxSize = maxHeap.size();
int tmp = 0;
if (minSize > maxSize + 1) {
tmp = minHeap.poll();
maxHeap.offer(tmp);
}
if (maxSize > minSize + 1) {
tmp = maxHeap.poll();
minHeap.offer(tmp);
}
}
}
Isn't a perfectly balanced binary search tree (BST) a median heap? It is true that even red-black BSTs aren't always perfectly balanced, but it might be close enough for your purposes. And log(n) performance is guaranteed!
AVL trees are more tightly balanced than red-black BSTs, so they come even closer to being a true median heap.
Here is a Scala implementation, following comocomocomocomo's idea above.
import java.util.{Comparator, PriorityQueue}
class MedianHeap(val capacity:Int) {
private val minHeap = new PriorityQueue[Int](capacity / 2)
private val maxHeap = new PriorityQueue[Int](capacity / 2, new Comparator[Int] {
override def compare(o1: Int, o2: Int): Int = Integer.compare(o2, o1)
})
def add(x: Int): Unit = {
if (x > median) {
minHeap.add(x)
} else {
maxHeap.add(x)
}
// Re-balance the heaps.
if (minHeap.size - maxHeap.size > 1) {
maxHeap.add(minHeap.poll())
}
if (maxHeap.size - minHeap.size > 1) {
minHeap.add(maxHeap.poll)
}
}
def median: Double = {
if (minHeap.isEmpty && maxHeap.isEmpty)
return Int.MinValue
if (minHeap.size == maxHeap.size) {
return (minHeap.peek+ maxHeap.peek) / 2.0
}
if (minHeap.size > maxHeap.size) {
return minHeap.peek()
}
maxHeap.peek
}
}
Another way to do it without using a max-heap and a min-heap would be to use a median-heap right away.
In a max-heap, the parent is greater than the children.
We can have a new type of heap where the parent is in the 'middle' of the children - the left child is smaller than the parent and the right child is greater than the parent. All even entries are left children and all odd entries are right children.
The same swim and sink operations that can be performed in a max-heap can also be performed in this median-heap, with slight modifications. In a typical swim operation in a max-heap, the inserted entry swims up until it is smaller than its parent. In a median-heap, it swims up until it is less than its parent (if it is an odd entry) or greater than its parent (if it is an even entry).
Here's my implementation for this median-heap. I have used an array of Integers for simplicity.
package priorityQueues;
import edu.princeton.cs.algs4.StdOut;
public class MedianInsertDelete {
private Integer[] a;
private int N;
public MedianInsertDelete(int capacity){
// accounts for '0' not being used
this.a = new Integer[capacity+1];
this.N = 0;
}
public void insert(int k){
a[++N] = k;
swim(N);
}
public int delMedian(){
int median = findMedian();
exch(1, N--);
sink(1);
a[N+1] = null;
return median;
}
public int findMedian(){
return a[1];
}
// entry swims up so that its left child is smaller and right is greater
private void swim(int k){
while(even(k) && k>1 && less(k/2,k)){
exch(k, k/2);
if ((N > k) && less (k+1, k/2)) exch(k+1, k/2);
k = k/2;
}
while(!even(k) && (k>1 && !less(k/2,k))){
exch(k, k/2);
if (!less (k-1, k/2)) exch(k-1, k/2);
k = k/2;
}
}
// if the left child is larger or if the right child is smaller, the entry sinks down
private void sink (int k){
while(2*k <= N){
int j = 2*k;
if (j < N && less (j, k)) j++;
if (less(k,j)) break;
exch(k, j);
k = j;
}
}
private boolean even(int i){
return (i%2) == 0;
}
private void exch(int i, int j){
int temp = a[i];
a[i] = a[j];
a[j] = temp;
}
private boolean less(int i, int j){
return a[i] <= a[j];
}
public static void main(String[] args) {
MedianInsertDelete medianInsertDelete = new MedianInsertDelete(10);
for(int i = 1; i <=10; i++){
medianInsertDelete.insert(i);
}
StdOut.println("The median is: " + medianInsertDelete.findMedian());
medianInsertDelete.delMedian();
StdOut.println("Original median deleted. The new median is " + medianInsertDelete.findMedian());
}
}

Split larger collection (Collections, Arrays, List) into smaller collections in Java and also keep track of last one returned

public Collection<Comment> getCommentCollection() {
commentCollection = movie.getCommentCollection();
return split((List<Comment>) commentCollection, 4);
}
public Collection<Comment> split(List<Comment> list, int size){
int numBatches = (list.size() / size) + 1;
Collection[] batches = new Collection[numBatches];
Collection<Comment> set = commentCollection;
for(int index = 0; index < numBatches; index++) {
int count = index + 1;
int fromIndex = Math.max(((count - 1) * size), 0);
int toIndex = Math.min((count * size), list.size());
batches[index] = list.subList(fromIndex, toIndex);
set = batches[index];
}
return set;
}
I am trying to split a bigger collection into smaller collections, depending on the number of items in the original collection. And then return one of the smaller collections every time the get method is called while keeping track of which smaller collection is returned. How can I achieve this?
Maybe I don't understand the question, but this is part of List:
List<E> subList(int fromIndex, int toIndex)
Returns a view of the portion of this list between the specified fromIndex, inclusive, and toIndex, exclusive. (If fromIndex and toIndex are equal, the returned list is empty.) The returned list is backed by this list, so non-structural changes in the returned list are reflected in this list, and vice-versa. The returned list supports all of the optional list operations supported by this list.
This method eliminates the need for explicit range operations (of the sort that commonly exist for arrays). Any operation that expects a list can be used as a range operation by passing a subList view instead of a whole list. For example, the following idiom removes a range of elements from a list:
list.subList(from, to).clear();
docs.oracle.com/javase/1.5.0/docs/api/java/util/List.html
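To make the quoted documentation concrete, here is a small sketch of the view behaviour. The class and method names (SubListDemo, removeRange, demo) are made up for this example; only List.subList, set and clear come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the question's code.
class SubListDemo {
    // Removes the elements at indices [from, to) from the list in place.
    static <T> void removeRange(List<T> list, int from, int to) {
        list.subList(from, to).clear();
    }

    static List<Integer> demo() {
        // Arrays.asList is fixed-size, so wrap it in an ArrayList to allow removal
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        List<Integer> view = list.subList(1, 3); // a view of [2, 3], not a copy
        view.set(0, 20);                         // writes through: list is now [1, 20, 3, 4, 5]
        removeRange(list, 1, 3);                 // list is now [1, 4, 5]
        // 'view' must not be used after this point: the backing list was
        // structurally modified through a different view.
        return list;
    }
}
```

Because the sublist is only a view, splitting a list into pages with subList copies nothing, but the pages become unusable if the original list is structurally modified afterwards.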
This is simple: just use Lists.partition() from Guava. If I understand what you want correctly, it's exactly what it does.
private int runs = 0;
public void setRunsOneMore() {
runs++;
}
public void setRunsOneLess() {
runs--;
}
public Collection<Comment> getCommentCollection() {
commentCollection = movie.getCommentCollection();
Collection[] com = split((List<Comment>) commentCollection,4);
try{
return com[runs];
} catch(ArrayIndexOutOfBoundsException e) {
runs = 0;
}
return com[runs];
}
public Collection[] split(List<Comment> list, int size){
int numBatches = (list.size() / size) + 1;
Collection[] batches = new Collection[numBatches];
Collection<Comment> set = commentCollection;
for(int index = 0; index < numBatches; index++) {
int count = index + 1;
int fromIndex = Math.max(((count - 1) * size), 0);
int toIndex = Math.min((count * size), list.size());
batches[index] = list.subList(fromIndex, toIndex);
}
return batches;
}
Setting the current "run" with the next & previous button actions
public String userNext() {
userReset(false);
getUserPagingInfo().nextPage();
movieController.setRunsOneMore();
return "user_movie_detail";
}
public String userPrev() {
userReset(false);
getUserPagingInfo().previousPage();
movieController.setRunsOneLess();
return "user_movie_detail";
}
I'm not entirely sure what you're asking... do you want to remove the first 4 items from the source Collection before returning them, so that you get the next 4 the next time you call the method? If so, you could just use the Iterator:
List<Comment> group = new ArrayList<Comment>();
Iterator<Comment> iter = commentCollection.iterator();
while (iter.hasNext() && group.size() < 4) {
group.add(iter.next());
iter.remove();
}
By doing this, though, you'd be destroying the movie object's collection of comments (unless it returns a copy of that collection each time, in which case the above wouldn't work at all). I'm guessing you're trying to do something like paging, in which case I'd suggest doing something different like partitioning a List of comments with size 4 and keeping track of a current index (the page) in that partition list.
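The paging idea suggested above might look something like this. It is a minimal sketch, not a drop-in replacement: Pager and its methods are hypothetical names, it works on a private copy of the list so the source collection is never modified, and it wraps around at both ends (which may or may not be what you want).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: pages through a list in fixed-size chunks and
// remembers which chunk was handed out last.
class Pager<T> {
    private final List<T> items;
    private final int pageSize;
    private int current = 0; // index of the page returned by currentPage()

    Pager(List<T> items, int pageSize) {
        if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be positive");
        this.items = new ArrayList<>(items); // copy, so the source stays untouched
        this.pageSize = pageSize;
    }

    int pageCount() {
        // ceiling division: no empty trailing page when the size is an exact multiple
        return (items.size() + pageSize - 1) / pageSize;
    }

    List<T> currentPage() {
        int from = Math.min(current * pageSize, items.size());
        int to = Math.min(from + pageSize, items.size());
        return items.subList(from, to); // a view onto the private copy
    }

    List<T> next() {
        if (pageCount() > 0) current = (current + 1) % pageCount();
        return currentPage();
    }

    List<T> previous() {
        if (pageCount() > 0) current = (current - 1 + pageCount()) % pageCount();
        return currentPage();
    }
}
```

With ten items and a page size of 4, this gives three pages of sizes 4, 4 and 2; calling next() on the last page wraps around to the first. The next and previous button actions would then call next() and previous() instead of adjusting a counter by hand.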
public static <E extends Object> List<List<E>> split(Collection<E> input, int size) {
List<List<E>> master = new ArrayList<List<E>>();
if (input != null && input.size() > 0) {
List<E> col = new ArrayList<E>(input);
boolean done = false;
int startIndex = 0;
int endIndex = col.size() > size ? size : col.size();
while (!done) {
master.add(col.subList(startIndex, endIndex));
if (endIndex == col.size()) {
done = true;
}
else {
startIndex = endIndex;
endIndex = col.size() > (endIndex + size) ? (endIndex + size) : col.size();
}
}
}
return master;
}
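You can create a separate sublist that is an independent copy of the original list using the ArrayList constructor. Note that this is a shallow copy: the new list has its own structure, but the elements themselves are shared with the original.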
import java.util.ArrayList;
import java.util.List;
class Scratch {
public static void main(String[] args) {
final List<String> parent = new ArrayList<>();
parent.add("One");
parent.add("Two");
parent.add("Three");
// using the ArrayList constructor here
final List<String> copy = new ArrayList<>(parent.subList(0, 2));
// modifying the new list doesn't affect the original
copy.remove(0);
// outputs:
// parent: [One, Two, Three]
// copy: [Two]
System.out.println("parent: " + parent);
System.out.println("copy: " + copy);
}
}
You can use Vector.removeAll(collection), for example:
public Collection<Comment> getCommentCollection() {
commentCollection = movie.getCommentCollection();
Vector<Comment> group = new Vector<Comment>();
for (Comment com : commentCollection) {
group.add(com);
if (group.size() == 4) {
break;
}
}
// remove the returned comments from the source so the next call yields the next four
movie.getCommentCollection().removeAll(group);
return group;
}
This assumes movie.getCommentCollection() returns the live, modifiable collection (for example a Vector) rather than a copy.
Here is my implementation; I hope it helps.
It depends on CollectionUtils and Lists; see:
https://mvnrepository.com/artifact/org.apache.commons/commons-lang3/
/**
* Efficient collection partition
*
* @param baseCollection base collection to split
* @param maxSize maximum number of elements in each sublist returned
* @param balancing whether the sizes of the returned sublists should be balanced
* @return list of sublists, ordered according to the base collection's iterator
* @since 2020/03/12
*/
public static <T> List<List<T>> partition(final Collection<T> baseCollection, int maxSize, boolean balancing) {
if (CollectionUtils.isEmpty(baseCollection)) {
return Collections.emptyList();
}
int size = baseCollection.size() % maxSize == 0 ? baseCollection.size()/maxSize : baseCollection.size()/maxSize+1;
if (balancing) {
maxSize = baseCollection.size() % size == 0 ? baseCollection.size()/size : baseCollection.size()/size+1;
}
int fullElementSize = baseCollection.size() % size == 0 ? size : baseCollection.size() % size;
List<List<T>> result = Lists.newArrayListWithExpectedSize(size);
Iterator<T> it = baseCollection.iterator();
for (int i = 0; i < size; i++) {
if (balancing && i == fullElementSize) {
maxSize--;
}
maxSize = Math.min(maxSize, baseCollection.size()-i*maxSize);
List<T> subList = Lists.newArrayListWithExpectedSize(maxSize);
for (int i1 = 0; i1 < maxSize; i1++) {
if (it.hasNext()) {
subList.add(it.next());
} else {
break;
}
}
result.add(subList);
}
return result;
}
