Best way of calculating permutations [closed] - java

I'm trying to figure out the best way of finding/automating all the possible permutations for a certain scenario.
I have a program which takes in a set of numbers [X, Y, Z]. Each number has a predefined uncertainty e. Therefore, I want to run my program against [X, Y, Z], [X+e, Y, Z], [X-e, Y, Z], [X, Y+e, Z], etc. Right now I have built an object which contains all 27 possibilities, and I'm iterating through it to provide my program with each new set of inputs (so I run my program 27 times, each time with a different set of inputs).
As time goes on, I'll need to update my program to take in a bigger set of numbers, so I'm wondering whether there is a better way of calculating all the possible permutations of my base set.
I'd rather know how to implement this myself than use any existing libraries (if there are any). I see this as a learning exercise. Thanks!

Instead of writing down the 3x3x3 sets of 3 numbers by hand, you can use nested loops. If you have 3 loops, one inside the other, each running 3 times, you get 27 outputs:
double[] numbers = new double[3];
double[] e = {-1e-6, 0, 1e-6};
for (double eX : e) {
for (double eY : e) {
for (double eZ : e) {
double[] newNumbers = {numbers[0] + eX, numbers[1] + eY, numbers[2] + eZ};
// Run your program using "newNumbers". Just as an example:
System.out.println(Arrays.toString(newNumbers));
}
}
}
As for
as time goes, I'd need to update my program to take in a bigger set of numbers
If the size of the set is going to be small and fixed, you can just add more nested loops. If not, you are going to need a technique that does not hard-code the number of loops.
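For a set whose size is not fixed, one option is to replace the nested loops with a counter that works like an odometer: each position holds an index into the offsets array, and incrementing the counter walks through every combination. The following is only a minimal sketch of that idea. The class name Perturbations and the method name all are made up for illustration, and the code assumes the offsets array is not empty.

```java
import java.util.ArrayList;
import java.util.List;

final class Perturbations {
    /** Returns every variant of base where each entry is shifted by one of the offsets. */
    static List<double[]> all(double[] base, double[] offsets) {
        int n = base.length;
        int k = offsets.length; // assumed to be at least 1
        List<double[]> result = new ArrayList<>();
        int[] digit = new int[n]; // digit[i] selects the offset applied to base[i]
        while (true) {
            double[] variant = new double[n];
            for (int i = 0; i < n; i++) {
                variant[i] = base[i] + offsets[digit[i]];
            }
            result.add(variant);
            // Advance the odometer: bump the last digit and carry to the left on overflow.
            int pos = n - 1;
            while (pos >= 0 && ++digit[pos] == k) {
                digit[pos] = 0;
                pos--;
            }
            if (pos < 0) {
                return result; // every digit rolled over, so all combinations were visited
            }
        }
    }
}
```

With three numbers and three offsets this produces the same 27 inputs as the nested loops. With n numbers it produces 3^n variants, so the number of runs grows quickly as the set gets bigger.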

Here is a permutation method I found some time ago. It prints them within the method. It only does single dimension permutations but you may be able to adapt it to your needs.
public static void generate(int n, int[] a) {
if (n == 1) {
System.out.println(Arrays.toString(a));
} else {
for (int i = 0; i < n - 1; i++) {
generate(n - 1, a);
if ((n & 1) == 0) {
swap(i, n - 1, a);
} else {
swap(0, n - 1, a);
}
}
generate(n - 1, a);
}
}
public static void swap(int a, int b, int[] array) {
int temp = array[a];
array[a] = array[b];
array[b] = temp;
}
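One possible way to adapt the method above is to collect the permutations into a list instead of printing them, so the caller can feed each one to other code. This is a sketch rather than a definitive version; the class name HeapPermutations and the method name permutations are illustrative, and the recursion is the same swap scheme as in the code above.

```java
import java.util.ArrayList;
import java.util.List;

final class HeapPermutations {
    /** Returns all orderings of the input; the input array itself is left unchanged. */
    static List<int[]> permutations(int[] input) {
        List<int[]> out = new ArrayList<>();
        generate(input.length, input.clone(), out);
        return out;
    }

    private static void generate(int n, int[] a, List<int[]> out) {
        if (n <= 1) {
            out.add(a.clone()); // copy, because a keeps being modified in place
            return;
        }
        for (int i = 0; i < n - 1; i++) {
            generate(n - 1, a, out);
            // Same rule as the original: swap index i for even n, index 0 for odd n.
            swap(a, (n & 1) == 0 ? i : 0, n - 1);
        }
        generate(n - 1, a, out);
    }

    private static void swap(int[] array, int i, int j) {
        int temp = array[i];
        array[i] = array[j];
        array[j] = temp;
    }
}
```

Calling HeapPermutations.permutations(new int[] {1, 2, 3}) returns six arrays, one per ordering.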

I believe the best way to do this is to implement a Spliterator and wrap it in a Stream:
public interface Combinations<T> extends Stream<List<T>> {
public static <T> Stream<List<T>> of(Collection<T> collection) {
SpliteratorSupplier<T> supplier =
new SpliteratorSupplier<T>(collection);
return supplier.stream();
}
...
}
Which solves the general use-case:
Combinations.of(List.of(X, Y, Z)).forEach(t -> process(t));
Implementing the Spliterator is straightforward but tedious and I have written about it here. The key components are a DispatchSpliterator:
private Iterator<Supplier<Spliterator<T>>> spliterators = null;
private Spliterator<T> spliterator = Spliterators.emptySpliterator();
...
protected abstract Iterator<Supplier<Spliterator<T>>> spliterators();
...
@Override
public Spliterator<T> trySplit() {
if (spliterators == null) {
spliterators = Spliterators.iterator(spliterators());
}
return spliterators.hasNext() ? spliterators.next().get() : null;
}
@Override
public boolean tryAdvance(Consumer<? super T> consumer) {
boolean accepted = false;
while (! accepted) {
if (spliterator == null) {
spliterator = trySplit();
}
if (spliterator != null) {
accepted = spliterator.tryAdvance(consumer);
if (! accepted) {
spliterator = null;
}
} else {
break;
}
}
return accepted;
}
A Spliterator for each prefix:
private class ForPrefix extends DispatchSpliterator<List<T>> {
private final int size;
private final List<T> prefix;
private final List<T> remaining;
public ForPrefix(int size, List<T> prefix, List<T> remaining) {
super(binomial(remaining.size(), size),
SpliteratorSupplier.this.characteristics());
this.size = size;
this.prefix = requireNonNull(prefix);
this.remaining = requireNonNull(remaining);
}
@Override
protected Iterator<Supplier<Spliterator<List<T>>>> spliterators() {
List<Supplier<Spliterator<List<T>>>> list = new LinkedList<>();
if (prefix.size() < size) {
for (int i = 0, n = remaining.size(); i < n; i += 1) {
List<T> prefix = new LinkedList<>(this.prefix);
List<T> remaining = new LinkedList<>(this.remaining);
prefix.add(remaining.remove(i));
list.add(() -> new ForPrefix(size, prefix, remaining));
}
} else if (prefix.size() == size) {
list.add(() -> new ForCombination(prefix));
} else {
throw new IllegalStateException();
}
return list.iterator();
}
}
and one for each combination:
private class ForCombination extends DispatchSpliterator<List<T>> {
private final List<T> combination;
public ForCombination(List<T> combination) {
super(1, SpliteratorSupplier.this.characteristics());
this.combination = requireNonNull(combination);
}
@Override
protected Iterator<Supplier<Spliterator<List<T>>>> spliterators() {
Supplier<Spliterator<List<T>>> supplier =
() -> Collections.singleton(combination).spliterator();
return Collections.singleton(supplier).iterator();
}
}


What edits should I make to my constructor? [duplicate]

I am trying to implement some functions to a generic array such as setting the count of the item to int count via method setcount( DT item, int count). I've asked this before and a kind Stack Overflow user has been nice enough to explain that using a hashmap would've been better.
class SimpleHistogram<T> implements Histogram<T>, Iterable<T> {
private final Map<T, Integer> bins = new HashMap<>();
public SimpleHistogram() {
this(List.of());
}
SimpleHistogram(List<? extends T> items) {
for (T item : items) {
Integer count = bins.getOrDefault(item, 0);
bins.put(item, count + 1);
}
}
@Override
public void setCount(T item, int count) {
bins.put(item, count);
}
@Override
public Iterator<T> iterator() {
return bins.keySet().iterator();
}
}
@Override
public int getTotalCount() {
return bins.size();
}
However, there seems to be an error when I tried to run it using my test cases and it seems that the issue stems from the constructor that I'm provided with.
I've tried to debug the issue but the only solution available is to change from
public SimpleHistogram() {
this(List.of());
}
to
public SimpleHistogram(Character[] target) {
this(List.of());
}
which would be wrong since it should take in any generic array.
Any suggestions on what changes should I make?
Here are the test cases by the way:
public class SimpleHistogramTest {
@Test
public void testHistogram() {
Character[] target = {'a','b','c','a'};
Histogram<Character> h = new SimpleHistogram<>(target); //here is where the problem arises
Iterator<Character> iter = h.iterator();
int elemCount = 0;
while(iter.hasNext()) {
iter.next();
elemCount++;
}
assertEquals(3, elemCount);
assertEquals(2, h.getCount('a'));
assertEquals(1, h.getCount('b'));
assertEquals(1, h.getCount('c'));
assertEquals(4, h.getTotalCount());
You should either overload your constructor with a new one that accepts an array, or pass the array to the existing constructor as a list.
SimpleHistogram(T[] items) {
this(Arrays.asList(items));
}
or
@Test
public void testHistogram() {
Character[] target = {'a','b','c','a'};
Histogram<Character> h = new SimpleHistogram<>(Arrays.asList(target));
Iterator<Character> iter = h.iterator();
int elemCount = 0;
while(iter.hasNext()) {
iter.next();
elemCount++;
}
}
Or you can use the varargs (three-dot) syntax for the parameter.
SimpleHistogram(T... items) {
this(Arrays.asList(items));
}
Formatted code for the comment answer:
The method name is a little misleading. If you want to add to an item's count rather than overwrite it, check whether the HashMap already contains the key. If it does, get its current value, add the count to it, and put the key back with the new value:
public void addCount(T item, int count) {
if (bins.containsKey(item)){
bins.put(item, bins.get(item)+count);
}else{
bins.put(item, count);
}
}
public void setCount(T item, int count) {
bins.put(item, count);
}
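The same add-or-set logic can be written more compactly with Map.merge, which inserts the value when the key is absent and combines it with the existing value otherwise. The following is a small self-contained sketch; the class name Counts is made up for illustration. It also assumes that getTotalCount is meant to return the sum of all counts, which is what the test in the question appears to expect (4 for the items a, b, c, a).

```java
import java.util.HashMap;
import java.util.Map;

final class Counts<T> {
    private final Map<T, Integer> bins = new HashMap<>();

    @SafeVarargs
    Counts(T... items) {
        for (T item : items) {
            addCount(item, 1);
        }
    }

    /** Adds count to the item's current count, starting from zero if it is new. */
    void addCount(T item, int count) {
        bins.merge(item, count, Integer::sum);
    }

    /** Overwrites the item's count. */
    void setCount(T item, int count) {
        bins.put(item, count);
    }

    int getCount(T item) {
        return bins.getOrDefault(item, 0);
    }

    /** Sum of all counts (an assumption about the intended meaning). */
    int getTotalCount() {
        int total = 0;
        for (int c : bins.values()) {
            total += c;
        }
        return total;
    }
}
```

With new Counts<Character>('a', 'b', 'c', 'a'), getCount('a') is 2 and getTotalCount() is 4.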

Is there a way to return a set of objects combined as one object?

So I have a list of objects. These objects are intervals, for example [2-10]. The left end is always less than the right end (start < end).
Now say I calculate the union of intervals,
like [2-9] and [10-12]. I want to return [2-9,10-12] as a single object.
Is there any way to return that instance of intervals using just a list of interval objects?
Also, the non-number characters are produced by my toString method and are not to be confused with part of the object itself.
Once you have the List of your interval unions, you can print it as required using the toString method you have already implemented. Printing the list looks like this:
public static void main(String... args) throws InterruptedException {
List<MyInterval> myList = new ArrayList<>();
myList.add(new MyInterval(5, 10));
myList.add(new MyInterval(20, 30));
System.out.println(myList.toString());
}
static class MyInterval {
int x, y;
MyInterval(int x, int y) {
this.x = x;
this.y = y;
}
@Override
public String toString() {
return x + "-" + y;
}
}
Which prints
[5-10, 20-30]
So that is exactly as required. Now replace MyInterval with your collection of combined interval unions and you are done.
I suggest using a new structure for your IntervalUnion class that can handle multiple start and end points.
WARNING: This union implementation is not correct. It empties the two intervals being unioned, and it does not merge overlapping intervals: the union of [1,4] and [2,6] will not give the expected result [1,6].
public class Interval {
List<Integer> starts = new ArrayList<>();
List<Integer> ends = new ArrayList<>();
public Interval() {
}
public Interval(int start, int end) {
starts.add(start);
ends.add(end);
}
public Interval union(Interval interval) {
Interval result = new Interval();
while(this.starts.size()>0||interval.starts.size()>0){
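// Take from this interval when the other one is exhausted or starts later;
// the isEmpty() check avoids calling get(0) on an empty list.
if(!this.starts.isEmpty() && (interval.starts.isEmpty() || this.starts.get(0) <= interval.starts.get(0))) {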
result.starts.add(this.starts.get(0));
result.ends.add(this.ends.get(0));
this.starts.remove(0);
this.ends.remove(0);
} else {
result.starts.add(interval.starts.get(0));
result.ends.add(interval.ends.get(0));
interval.starts.remove(0);
interval.ends.remove(0);
}
}
return result;
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("[");
for(int i = 0; i < starts.size();i++){
sb.append(starts.get(i));
sb.append("-");
sb.append(ends.get(i));
if(i<starts.size()-1) {
sb.append(",");
}
}
sb.append("]");
return sb.toString();
}
public static void main(String... args) {
Interval i = new Interval(1, 2);
Interval j = new Interval(4, 5);
Interval k = new Interval(7, 9);
System.out.println(i.union(j).union(k));
}
}
With output:
[1-2,4-5,7-9]
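To address the limitation in the warning above, a common approach is to sort the intervals by start and merge each one into the previous interval whenever they overlap. The sketch below is not the code from this answer: it represents each interval as an int[] {start, end}, and the names IntervalMerge, union, and format are made up for illustration. It treats intervals that share an endpoint as overlapping, while keeping [2-9] and [10-12] separate as in the question.

```java
import java.util.ArrayList;
import java.util.List;

final class IntervalMerge {
    /** Merges overlapping intervals; the inputs are not modified. */
    static List<int[]> union(List<int[]> intervals) {
        List<int[]> sorted = new ArrayList<>(intervals);
        sorted.sort((a, b) -> Integer.compare(a[0], b[0])); // order by start
        List<int[]> merged = new ArrayList<>();
        for (int[] iv : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && iv[0] <= last[1]) {
                // Overlaps the previous interval: extend its end if needed.
                last[1] = Math.max(last[1], iv[1]);
            } else {
                merged.add(new int[] {iv[0], iv[1]}); // copy so inputs stay untouched
            }
        }
        return merged;
    }

    /** Formats the intervals in the same style as the toString above. */
    static String format(List<int[]> intervals) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < intervals.size(); i++) {
            if (i > 0) {
                sb.append(",");
            }
            sb.append(intervals.get(i)[0]).append("-").append(intervals.get(i)[1]);
        }
        return sb.append("]").toString();
    }
}
```

For the inputs [1,4] and [2,6] this gives [1-6], and for [2,9] and [10,12] it gives [2-9,10-12].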

Build an adaptive mesh refinement with ForkJoin and Streams

I want to build an adaptive mesh refinement in 3D.
The basic principle is the following:
I have a set of cells with unique cell IDs.
I test each cell to see if it needs to be refined.
If refinement is required, I create 8 new child cells and add them to the list of cells to check for refinement.
Otherwise, this is a leaf node and I add it to my list of leaf nodes.
I want to implement it using the ForkJoin framework and Java 8 streams. I read this article, but I don't know how to apply it to my case.
For now, what I came up with is this:
public class ForkJoinAttempt {
private final double[] cellIds;
public ForkJoinAttempt(double[] cellIds) {
this.cellIds = cellIds;
}
public void refineGrid() {
ForkJoinPool pool = ForkJoinPool.commonPool();
double[] result = pool.invoke(new RefineTask(100));
}
private class RefineTask extends RecursiveTask<double[]> {
final double cellId;
private RefineTask(double cellId) {
this.cellId = cellId;
}
@Override
protected double[] compute() {
return ForkJoinTask.invokeAll(createSubtasks())
.stream()
.map(ForkJoinTask::join)
.reduce(new double[0], new Concat());
}
}
private double[] refineCell(double cellId) {
double[] result;
if (checkCell()) {
result = new double[8];
for (int i = 0; i < 8; i++) {
result[i] = Math.random();
}
} else {
result = new double[1];
result[0] = cellId;
}
return result;
}
private Collection<RefineTask> createSubtasks() {
List<RefineTask> dividedTasks = new ArrayList<>();
for (int i = 0; i < cellIds.length; i++) {
dividedTasks.add(new RefineTask(cellIds[i]));
}
return dividedTasks;
}
private class Concat implements BinaryOperator<double[]> {
@Override
public double[] apply(double[] a, double[] b) {
int aLen = a.length;
int bLen = b.length;
@SuppressWarnings("unchecked")
double[] c = (double[]) Array.newInstance(a.getClass().getComponentType(), aLen + bLen);
System.arraycopy(a, 0, c, 0, aLen);
System.arraycopy(b, 0, c, aLen, bLen);
return c;
}
}
public boolean checkCell() {
return Math.random() < 0.5;
}
}
... and I'm stuck here.
This doesn't do much for now, because I never call the refineCell function.
I also might have a performance issue with all those double[] I create. And merging them in this way might not be the most efficient way to do it too.
But first things first, can anyone help me on implementing the fork join in that case?
The expected result of the algorithm is an array of leaf cell IDs (double[])
Edit 1:
Thanks to the comments, I came up with something that works a little better.
Some changes:
I went from arrays to lists. This is not good for the memory footprint, because I'm not able to use Java primitives. But it made the implementation simpler.
The cell IDs are now Long instead of Double.
Ids are not randomly chosen any more:
Root level cells have IDs 1, 2, 3 etc.;
Children of 1 have IDs 10, 11, 12, etc.;
Children of 2 have IDs 20, 21, 22, etc.;
You get the idea...
I refine all cells whose ID is lower than 100
This allows me for the sake of this example to better check the results.
Here is the new implementation:
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.*;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
public class ForkJoinAttempt {
private static final int THRESHOLD = 2;
private List<Long> leafCellIds;
public void refineGrid(List<Long> cellsToProcess) {
leafCellIds = ForkJoinPool.commonPool().invoke(new RefineTask(cellsToProcess));
}
public List<Long> getLeafCellIds() {
return leafCellIds;
}
private class RefineTask extends RecursiveTask<List<Long>> {
private final CopyOnWriteArrayList<Long> cellsToProcess = new CopyOnWriteArrayList<>();
private RefineTask(List<Long> cellsToProcess) {
this.cellsToProcess.addAll(cellsToProcess);
}
@Override
protected List<Long> compute() {
if (cellsToProcess.size() > THRESHOLD) {
System.out.println("Fork/Join");
return ForkJoinTask.invokeAll(createSubTasks())
.stream()
.map(ForkJoinTask::join)
.reduce(new ArrayList<>(), new Concat());
} else {
System.out.println("Direct computation");
List<Long> leafCells = new ArrayList<>();
for (Long cell : cellsToProcess) {
Long result = refineCell(cell);
if (result != null) {
leafCells.add(result);
}
}
return leafCells;
}
}
private Collection<RefineTask> createSubTasks() {
List<RefineTask> dividedTasks = new ArrayList<>();
for (List<Long> list : split(cellsToProcess)) {
dividedTasks.add(new RefineTask(list));
}
return dividedTasks;
}
private Long refineCell(Long cellId) {
if (checkCell(cellId)) {
for (int i = 0; i < 8; i++) {
Long newCell = cellId * 10 + i;
cellsToProcess.add(newCell);
System.out.println("Adding child " + newCell + " to cell " + cellId);
}
return null;
} else {
System.out.println("Leaf node " + cellId);
return cellId;
}
}
private List<List<Long>> split(List<Long> list)
{
int[] index = {0, (list.size() + 1)/2, list.size()};
List<List<Long>> lists = IntStream.rangeClosed(0, 1)
.mapToObj(i -> list.subList(index[i], index[i + 1]))
.collect(Collectors.toList());
return lists;
}
}
private class Concat implements BinaryOperator<List<Long>> {
@Override
public List<Long> apply(List<Long> listOne, List<Long> listTwo) {
return Stream.concat(listOne.stream(), listTwo.stream())
.collect(Collectors.toList());
}
}
public boolean checkCell(Long cellId) {
return cellId < 100;
}
}
And the method testing it:
int initialSize = 4;
List<Long> cellIds = new ArrayList<>(initialSize);
for (int i = 0; i < initialSize; i++) {
cellIds.add(Long.valueOf(i + 1));
}
ForkJoinAttempt test = new ForkJoinAttempt();
test.refineGrid(cellIds);
List<Long> leafCellIds = test.getLeafCellIds();
System.out.println("Leaf nodes: " + leafCellIds.size());
for (Long node : leafCellIds) {
System.out.println(node);
}
The output confirms that it adds 8 children to each root cell. But it does not go further.
I know why, but I don't know how to solve it: even though the refineCell method adds the new cells to the list of cells to process, the createSubTasks method is not called again, so it cannot know that I have added new cells.
Edit 2:
To state the problem differently, what I'm looking for is a mechanism where a Queue of cell IDs is processed by some RecursiveTasks while others add to the Queue in parallel.
First, let’s start with the Stream based solution
public class Mesh {
public static long[] refineGrid(long[] cellsToProcess) {
return Arrays.stream(cellsToProcess).parallel().flatMap(Mesh::expand).toArray();
}
static LongStream expand(long d) {
return checkCell(d)? LongStream.of(d): generate(d).flatMap(Mesh::expand);
}
private static boolean checkCell(long cellId) {
return cellId > 100;
}
private static LongStream generate(long cellId) {
return LongStream.range(0, 8).map(j -> cellId * 10 + j);
}
}
While the current flatMap implementation has known issues with parallel processing that might apply when the mesh is too unbalanced, the performance for your actual task might be reasonable, so this simple solution is always worth a try before you start implementing something more complicated.
If you really need a custom implementation, e.g. if the workload is unbalanced and the Stream implementation can’t adapt well enough, you can do it like this:
public class MeshTask extends RecursiveTask<long[]> {
public static long[] refineGrid(long[] cellsToProcess) {
return new MeshTask(cellsToProcess, 0, cellsToProcess.length).compute();
}
private final long[] source;
private final int from, to;
private MeshTask(long[] src, int from, int to) {
source = src;
this.from = from;
this.to = to;
}
@Override
protected long[] compute() {
return compute(source, from, to);
}
private static long[] compute(long[] source, int from, int to) {
long[] result = new long[to - from];
ArrayDeque<MeshTask> next = new ArrayDeque<>();
while(getSurplusQueuedTaskCount()<3) {
int mid = (from+to)>>>1;
if(mid == from) break;
MeshTask task = new MeshTask(source, mid, to);
next.push(task);
task.fork();
to = mid;
}
int pos = 0;
for(; from < to; ) {
long value = source[from++];
if(checkCell(value)) result[pos++]=value;
else {
long[] array = generate(value);
array = compute(array, 0, array.length);
result = Arrays.copyOf(result, result.length+array.length-1);
System.arraycopy(array, 0, result, pos, array.length);
pos += array.length;
}
while(from == to && !next.isEmpty()) {
MeshTask task = next.pop();
if(task.tryUnfork()) {
to = task.to;
}
else {
long[] array = task.join();
int newLen = pos+to-from+array.length;
if(newLen != result.length)
result = Arrays.copyOf(result, newLen);
System.arraycopy(array, 0, result, pos, array.length);
pos += array.length;
}
}
}
return result;
}
static boolean checkCell(long cellId) {
return cellId > 1000;
}
static long[] generate(long cellId) {
long[] sub = new long[8];
for(int i = 0; i < sub.length; i++) sub[i] = cellId*10+i;
return sub;
}
}
This implementation calls the compute method of the root task directly to incorporate the caller thread into the computation. The compute method uses getSurplusQueuedTaskCount() to decide whether to split. As its documentation says, the idea is to always have a small surplus, e.g. 3. This ensures that the evaluation can adapt to unbalanced workloads, as idle threads can steal work from other tasks.
The splitting is not done by creating two sub-tasks and waiting for both. Instead, only one task is split off, representing the second half of the pending work, and the current task's workload is adapted to reflect the first half.
Then, the remaining workload is processed locally. Afterwards, the last pushed subtask is popped and attempted to unfork. If unforking succeeded, the current workload’s range is adapted to cover the subsequent task’s range too and the local iteration continues.
That way, any surplus task that has not been stolen by another thread is processed in the simplest and most lightweight way, as if it was never forked.
If the task has been picked up by another thread, we have to wait for its completion now and merge the result array.
Note that when waiting for a sub task via join(), the underlying implementation will also check if unforking and local evaluation is possible, to keep all worker threads busy. However, adjusting our loop variable and directly accumulating the results in our target array is still better than a nested compute invocation that still needs merging the result arrays.
If a cell is not a leaf, the resulting nodes are processed recursively by the same logic. This again allows for adaptive local and concurrent evaluation, so the execution will adapt to unbalanced workloads, e.g. if a particular cell has a larger subtree or the evaluation of a particular cell takes much longer than others.
It must be emphasized that in all cases, a significant processing workload is needed to draw a benefit from parallel processing. If, like in the example, there is mostly data copying only, the benefit might be much smaller, non-existent or in the worst case, the parallel processing may perform worse than sequential.
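For comparison, here is a deliberately simpler fork/join sketch that forks one subtask per refined cell and joins the results, without the surplus-based splitting or the array handling described above. It is not the implementation from this answer. The names RecursiveRefine, RefineTask, isLeaf, and refine are illustrative, and the leaf rule (IDs of 100 or more are leaves) is taken from the question's rule of refining all cells with an ID below 100.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class RecursiveRefine {
    /** Assumed leaf rule: cells with an ID of 100 or more are not refined further. */
    static boolean isLeaf(long id) {
        return id >= 100;
    }

    static final class RefineTask extends RecursiveTask<List<Long>> {
        private final List<Long> cells;

        RefineTask(List<Long> cells) {
            this.cells = cells;
        }

        @Override
        protected List<Long> compute() {
            List<Long> leaves = new ArrayList<>();
            List<RefineTask> subtasks = new ArrayList<>();
            for (long id : cells) {
                if (isLeaf(id)) {
                    leaves.add(id);
                } else {
                    // Refine: create the 8 children and hand them to a new subtask.
                    List<Long> children = new ArrayList<>(8);
                    for (int i = 0; i < 8; i++) {
                        children.add(id * 10 + i);
                    }
                    subtasks.add(new RefineTask(children));
                }
            }
            // invokeAll runs the subtasks (possibly in parallel) and returns once all are done.
            for (RefineTask task : invokeAll(subtasks)) {
                leaves.addAll(task.join());
            }
            return leaves;
        }
    }

    static List<Long> refine(List<Long> roots) {
        return ForkJoinPool.commonPool().invoke(new RefineTask(roots));
    }
}
```

Because newly created cells are wrapped in a new task right away, no shared queue is needed: each task discovers its own follow-up work. The trade-off is one task object and one result list per refined cell, which is likely to cost more than the array-based version above when the per-cell work is small.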

Java Priority Queue Comparator

I have defined my own compare function for a priority queue, however the compare function needs information of an array. The problem is that when the values of the array changed, it did not affect the compare function. How do I deal with this?
Code example:
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Scanner;
public class Main {
public static final int INF = 100;
public static int[] F = new int[201];
public static void main(String[] args){
PriorityQueue<Integer> Q = new PriorityQueue<Integer>(201,
new Comparator<Integer>(){
public int compare(Integer a, Integer b){
if (F[a] > F[b]) return 1;
if (F[a] == F[b]) return 0;
return -1;
}
});
Arrays.fill(F, INF);
F[0] = 0; F[1] = 1; F[2] = 2;
for (int i = 0; i < 201; i ++) Q.add(i);
System.out.println(Q.peek()); // Prints 0, because F[0] is the smallest
F[0] = 10;
System.out.println(Q.peek()); // Still prints 0 ... OMG
}
}
So, essentially, you are changing your comparison criteria on the fly, and that's just not functionality that the priority queue contract offers. Note that this might seem to work in some cases (e.g. a heap might reorder some of the items when removing or inserting another item), but since you have no guarantees, it's just not a valid approach.
What you could do is, every time you change your array, take all the elements out and put them back in. This is of course very expensive (O(n*log(n))), so you should probably try to rework your design to avoid changing the array values at all.
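A cheaper variant of the same idea, when only one array entry changes at a time, is to remove just the affected element, update its value, and re-insert it. The sketch below assumes that all updates go through a single method; the class name IndexedQueue and the method name updateKey are made up for illustration. The element is removed before its key changes so that the heap is never left in an inconsistent order.

```java
import java.util.PriorityQueue;

final class IndexedQueue {
    private final int[] keys;
    private final PriorityQueue<Integer> queue;

    IndexedQueue(int[] keys) {
        this.keys = keys.clone();
        // Orders indices by their current key, like the comparator in the question.
        this.queue = new PriorityQueue<>((a, b) -> Integer.compare(this.keys[a], this.keys[b]));
        for (int i = 0; i < this.keys.length; i++) {
            queue.add(i);
        }
    }

    /** Changes the key of one index while keeping the queue order valid. */
    void updateKey(int index, int newValue) {
        boolean wasQueued = queue.remove(Integer.valueOf(index)); // linear-time search
        keys[index] = newValue;
        if (wasQueued) {
            queue.add(index); // re-insert with the new key
        }
    }

    Integer peek() {
        return queue.peek();
    }

    public static void main(String[] args) {
        IndexedQueue q = new IndexedQueue(new int[] {0, 1, 2, 100, 100});
        System.out.println(q.peek()); // 0, because keys[0] is the smallest
        q.updateKey(0, 10);
        System.out.println(q.peek()); // 1, because keys[1] is now the smallest
    }
}
```

PriorityQueue.remove(Object) has to search for the element, so each update costs O(n) rather than the O(n*log(n)) of rebuilding the whole queue.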
Your comparator is only getting called when you modify the queue (that is, when you add your items). After that, the queue has no idea something caused the order to change, which is why it remains the same.
It is quite confusing to have a comparator like this. If you have two values, A and B, and A>B at some point, everybody would expect A to stay bigger than B. I think your usage of a priority queue for this problem is wrong.
Use custom implementation of PriorityQueue that uses comparator on peek, not on add:
public class VolatilePriorityQueue <T> extends AbstractQueue <T>
{
private final Comparator <? super T> comparator;
private final List <T> elements = new ArrayList <T> ();
public VolatilePriorityQueue (Comparator <? super T> comparator)
{
this.comparator = comparator;
}
@Override
public boolean offer (T e)
{
return elements.add (e);
}
@Override
public T poll ()
{
if (elements.isEmpty ()) return null;
else return elements.remove (getMinimumIndex ());
}
@Override
public T peek ()
{
if (elements.isEmpty ()) return null;
else return elements.get (getMinimumIndex ());
}
@Override
public Iterator <T> iterator ()
{
return elements.iterator ();
}
@Override
public int size ()
{
return elements.size ();
}
private int getMinimumIndex ()
{
T e = elements.get (0);
int index = 0;
for (int count = elements.size (), i = 1; i < count; i++)
{
T ee = elements.get (i);
if (comparator.compare (e, ee) > 0)
{
e = ee;
index = i;
}
}
return index;
}
}

Picking a random element from a set

How do I pick a random element from a set?
I'm particularly interested in picking a random element from a
HashSet or a LinkedHashSet, in Java.
Solutions for other languages are also welcome.
int size = myHashSet.size();
int item = new Random().nextInt(size); // In real life, the Random object should be rather more shared than this
int i = 0;
for(Object obj : myHashSet)
{
if (i == item)
return obj;
i++;
}
A somewhat related Did You Know:
There are useful methods in java.util.Collections for shuffling whole collections: Collections.shuffle(List<?>) and Collections.shuffle(List<?> list, Random rnd).
In Java 8:
static <E> E getRandomSetElement(Set<E> set) {
return set.stream().skip(new Random().nextInt(set.size())).findFirst().orElse(null);
}
Fast solution for Java using an ArrayList and a HashMap: [element -> index].
Motivation: I needed a set of items with RandomAccess properties, especially to pick a random item from the set (see pollRandom method). Random navigation in a binary tree is not accurate: trees are not perfectly balanced, which would not lead to a uniform distribution.
public class RandomSet<E> extends AbstractSet<E> {
List<E> dta = new ArrayList<E>();
Map<E, Integer> idx = new HashMap<E, Integer>();
public RandomSet() {
}
public RandomSet(Collection<E> items) {
for (E item : items) {
idx.put(item, dta.size());
dta.add(item);
}
}
@Override
public boolean add(E item) {
if (idx.containsKey(item)) {
return false;
}
idx.put(item, dta.size());
dta.add(item);
return true;
}
/**
* Override element at position <code>id</code> with last element.
* @param id
*/
public E removeAt(int id) {
if (id >= dta.size()) {
return null;
}
E res = dta.get(id);
idx.remove(res);
E last = dta.remove(dta.size() - 1);
// skip filling the hole if last is removed
if (id < dta.size()) {
idx.put(last, id);
dta.set(id, last);
}
return res;
}
@Override
public boolean remove(Object item) {
@SuppressWarnings(value = "element-type-mismatch")
Integer id = idx.get(item);
if (id == null) {
return false;
}
removeAt(id);
return true;
}
public E get(int i) {
return dta.get(i);
}
public E pollRandom(Random rnd) {
if (dta.isEmpty()) {
return null;
}
int id = rnd.nextInt(dta.size());
return removeAt(id);
}
@Override
public int size() {
return dta.size();
}
@Override
public Iterator<E> iterator() {
return dta.iterator();
}
}
This is faster than the for-each loop in the accepted answer:
int index = rand.nextInt(set.size());
Iterator<Object> iter = set.iterator();
for (int i = 0; i < index; i++) {
iter.next();
}
return iter.next();
The for-each construct calls Iterator.hasNext() on every loop, but since index < set.size(), that check is unnecessary overhead. I saw a 10-20% boost in speed, but YMMV. (Also, this compiles without having to add an extra return statement.)
Note that this code (and most other answers) can be applied to any Collection, not just Set. In generic method form:
public static <E> E choice(Collection<? extends E> coll, Random rand) {
if (coll.size() == 0) {
return null; // or throw IAE, if you prefer
}
int index = rand.nextInt(coll.size());
if (coll instanceof List) { // optimization
return ((List<? extends E>) coll).get(index);
} else {
Iterator<? extends E> iter = coll.iterator();
for (int i = 0; i < index; i++) {
iter.next();
}
return iter.next();
}
}
If you want to do it in Java, you should consider copying the elements into some kind of random-access collection (such as an ArrayList). Because, unless your set is small, accessing the selected element will be expensive (O(n) instead of O(1)). [ed: list copy is also O(n)]
Alternatively, you could look for another Set implementation that more closely matches your requirements. The ListOrderedSet from Commons Collections looks promising.
In Java:
Set<Integer> set = new LinkedHashSet<Integer>(3);
set.add(1);
set.add(2);
set.add(3);
Random rand = new Random(System.currentTimeMillis());
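// toArray() returns Object[], which cannot be cast to int[]; request an Integer[] instead
Integer[] setArray = set.toArray(new Integer[0]);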
for (int i = 0; i < 10; ++i) {
System.out.println(setArray[rand.nextInt(set.size())]);
}
List asList = new ArrayList(mySet);
Collections.shuffle(asList);
return asList.get(0);
This is identical to accepted answer (Khoth), but with the unnecessary size and i variables removed.
int random = new Random().nextInt(myhashSet.size());
for(Object obj : myhashSet) {
if (random-- == 0) {
return obj;
}
}
Though doing away with the two aforementioned variables, the above solution still remains random because we are relying upon random (starting at a randomly selected index) to decrement itself toward 0 over each iteration.
Clojure solution:
(defn pick-random [set] (let [sq (seq set)] (nth sq (rand-int (count sq)))))
Java 8+ Stream:
static <E> Optional<E> getRandomElement(Collection<E> collection) {
return collection
.stream()
.skip(ThreadLocalRandom.current()
.nextInt(collection.size()))
.findAny();
}
Based on the answer of Joshua Bone but with slight changes:
Ignores the Streams element order for a slight performance increase in parallel operations
Uses the current thread's ThreadLocalRandom
Accepts any Collection type as input
Returns the provided Optional instead of null
Perl 5
@hash_keys = (keys %hash);
$rand = int(rand(@hash_keys));
print $hash{$hash_keys[$rand]};
Here is one way to do it.
C++. This should be reasonably quick, as it doesn't require iterating over the whole set, or sorting it. This should work out of the box with most modern compilers, assuming they support tr1. If not, you may need to use Boost.
The Boost docs are helpful here to explain this, even if you don't use Boost.
The trick is to make use of the fact that the data has been divided into buckets, and to quickly identify a randomly chosen bucket (with the appropriate probability).
//#include <boost/unordered_set.hpp>
//using namespace boost;
#include <tr1/unordered_set>
using namespace std::tr1;
#include <iostream>
#include <stdlib.h>
#include <assert.h>
using namespace std;
int main() {
unordered_set<int> u;
u.max_load_factor(40);
for (int i=0; i<40; i++) {
u.insert(i);
cout << ' ' << i;
}
cout << endl;
cout << "Number of buckets: " << u.bucket_count() << endl;
for(size_t b=0; b<u.bucket_count(); b++)
cout << "Bucket " << b << " has " << u.bucket_size(b) << " elements. " << endl;
for(size_t i=0; i<20; i++) {
size_t x = rand() % u.size();
cout << "we'll quickly get the " << x << "th item in the unordered set. ";
size_t b;
for(b=0; b<u.bucket_count(); b++) {
if(x < u.bucket_size(b)) {
break;
} else
x -= u.bucket_size(b);
}
cout << "it'll be in the " << b << "th bucket at offset " << x << ". ";
unordered_set<int>::const_local_iterator l = u.begin(b);
while(x>0) {
l++;
assert(l!=u.end(b));
x--;
}
cout << "random item is " << *l << ". ";
cout << endl;
}
}
The solutions above speak in terms of latency but don't guarantee an equal probability of each index being selected.
If that needs to be considered, try reservoir sampling (http://en.wikipedia.org/wiki/Reservoir_sampling). Collections.shuffle() (as suggested by a few) uses one such algorithm.
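As a rough illustration of reservoir sampling with a reservoir of size one: walk the elements once and replace the current choice with the i-th element with probability 1/i. The class name Reservoir and the method name pick are made up for this sketch, and the uniformity relies on the supplied Random being uniform.

```java
import java.util.Random;

final class Reservoir {
    /** Returns a uniformly chosen element in one pass, or null if there are none. */
    static <E> E pick(Iterable<E> items, Random rnd) {
        E chosen = null;
        int seen = 0;
        for (E item : items) {
            seen++;
            // Keep the new item with probability 1/seen; otherwise keep the old choice.
            if (rnd.nextInt(seen) == 0) {
                chosen = item;
            }
        }
        return chosen;
    }
}
```

This does not need the size in advance, so it also works for iterators or streams of unknown length. For a Set whose size is known, it is not faster than skipping to a random index, because it still visits every element.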
Since you said "Solutions for other languages are also welcome", here's the version for Python:
>>> import random
>>> random.choice([1,2,3,4,5,6])
3
>>> random.choice([1,2,3,4,5,6])
4
Can't you just get the size/length of the set/array, generate a random number between 0 and the size/length, then call the element whose index matches that number? HashSet has a .size() method, I'm pretty sure.
In pseudocode:
function randFromSet(target){
var targetLength:uint = target.length()
var randomIndex:uint = random(0,targetLength);
return target[randomIndex];
}
PHP, assuming "set" is an array:
$foo = array("alpha", "bravo", "charlie");
$index = array_rand($foo);
$val = $foo[$index];
The Mersenne Twister functions are better but there's no MT equivalent of array_rand in PHP.
Icon has a set type and a random-element operator, unary "?", so the expression
? set( [1, 2, 3, 4, 5] )
will produce a random number between 1 and 5.
The random seed is initialized to 0 when a program is run, so to produce different results on each run use randomize()
In C#
Random random = new Random((int)DateTime.Now.Ticks);
OrderedDictionary od = new OrderedDictionary();
od.Add("abc", 1);
od.Add("def", 2);
od.Add("ghi", 3);
od.Add("jkl", 4);
int randomIndex = random.Next(od.Count);
Console.WriteLine(od[randomIndex]);
// Can access via index or key value:
Console.WriteLine(od[1]);
Console.WriteLine(od["def"]);
Javascript solution ;)
function choose (set) {
return set[Math.floor(Math.random() * set.length)];
}
var set = [1, 2, 3, 4], rand = choose (set);
Or alternatively:
Array.prototype.choose = function () {
return this[Math.floor(Math.random() * this.length)];
};
[1, 2, 3, 4].choose();
In lisp
(defun pick-random (set)
(nth (random (length set)) set))
How about just
public static <A> A getRandomElement(Collection<A> c, Random r) {
return new ArrayList<A>(c).get(r.nextInt(c.size()));
}
For fun I wrote a RandomHashSet based on rejection sampling. It's a bit hacky, since HashMap doesn't let us access its table directly, but it should work just fine.
It doesn't use any extra memory, and lookup time is O(1) amortized (because Java's hash table is dense).
class RandomHashSet<V> extends AbstractSet<V> {
private Map<Object,V> map = new HashMap<>();
public boolean add(V v) {
return map.put(new WrapKey<V>(v),v) == null;
}
@Override
public Iterator<V> iterator() {
return new Iterator<V>() {
RandKey key = new RandKey();
@Override public boolean hasNext() {
return true;
}
@Override public V next() {
while (true) {
key.next();
V v = map.get(key);
if (v != null)
return v;
}
}
@Override public void remove() {
throw new UnsupportedOperationException();
}
};
}
@Override
public int size() {
return map.size();
}
static class WrapKey<V> {
private V v;
WrapKey(V v) {
this.v = v;
}
@Override public int hashCode() {
return v.hashCode();
}
@Override public boolean equals(Object o) {
if (o instanceof RandKey)
return true;
return v.equals(o);
}
}
static class RandKey {
private Random rand = new Random();
int key = rand.nextInt();
public void next() {
key = rand.nextInt();
}
@Override public int hashCode() {
return key;
}
@Override public boolean equals(Object o) {
return true;
}
}
}
The easiest with Java 8 is:
outbound.stream().skip(n % outbound.size()).findFirst().get()
where n is a non-negative random integer (a negative n yields a negative remainder, and skip() then throws). Of course, this performs worse than the plain for (elem : col) loop.
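If n may be any int, including a negative one, Math.floorMod can replace the % operator so that the index always lands in range. The sketch below assumes a hypothetical helper named SkipPick.pickAt; it is one possible way to wrap the one-liner, not the only one.

```java
import java.util.Collection;
import java.util.NoSuchElementException;

// Hypothetical helper; the class and method names are illustrative only.
final class SkipPick {

    // Maps any int (including negatives) onto a valid position, then walks
    // the stream to that position. This is O(n), like the for-each loop.
    static <T> T pickAt(Collection<T> items, int n) {
        if (items.isEmpty()) {
            throw new NoSuchElementException("collection is empty");
        }
        int index = Math.floorMod(n, items.size()); // always in [0, size)
        return items.stream().skip(index).findFirst().get();
    }
}
```

A call such as SkipPick.pickAt(outbound, ThreadLocalRandom.current().nextInt()) would then be safe for any value the generator returns. Note that findFirst() throws a NullPointerException if the selected element is null.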
With Guava we can do a little better than Khoth's answer:
public static <E> E random(Set<E> set) {
int index = random.nextInt(set.size());
if (set instanceof ImmutableSet) {
// ImmutableSet.asList() is O(1), as is .get() on the returned list
return ((ImmutableSet<E>) set).asList().get(index);
}
return Iterables.get(set, index);
}
In Mathematica:
a = {1, 2, 3, 4, 5}
a[[ ⌈ Length[a] Random[] ⌉ ]]
Or, in recent versions, simply:
RandomChoice[a]
Random[] generates a pseudorandom float between 0 and 1. This is multiplied by the length of the list and then the ceiling function is used to round up to the next integer. This index is then extracted from a.
Since hash table functionality is frequently done with rules in Mathematica, and rules are stored in lists, one might use:
a = {"Badger" -> 5, "Bird" -> 1, "Fox" -> 3, "Frog" -> 2, "Wolf" -> 4};
PHP, using MT:
$items_array = array("alpha", "bravo", "charlie");
$last_pos = count($items_array) - 1;
$random_pos = mt_rand(0, $last_pos);
$random_item = $items_array[$random_pos];
You can also copy the set to an array and index into it.
This will probably be fine at small scale; the for loop in the most-voted answer is O(n) anyway.
Object[] arr = set.toArray();
int v = (int) arr[rnd.nextInt(arr.length)];
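If many random picks are needed from the same set, the O(n) copy can be paid once and reused. The sketch below assumes a hypothetical class named SnapshotPicker; it works for any element type rather than only Integer, and it does not see changes made to the set after the copy is taken.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Random;

// Hypothetical helper; copies the collection once so each pick is O(1).
final class SnapshotPicker<T> {
    private final List<T> snapshot;
    private final Random rnd;

    SnapshotPicker(Collection<T> source, Random rnd) {
        if (source.isEmpty()) {
            throw new IllegalArgumentException("source is empty");
        }
        this.snapshot = new ArrayList<>(source); // O(n) copy, done once
        this.rnd = rnd;
    }

    // Returns a uniformly chosen element of the snapshot in O(1).
    T next() {
        return snapshot.get(rnd.nextInt(snapshot.size()));
    }
}
```

For a single pick this costs the same as the toArray() version above; the benefit only shows up when next() is called repeatedly.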
If you really just want to pick "any" object from the Set, without any guarantees on the randomness, the easiest is taking the first returned by the iterator.
Set<Integer> s = ...
Iterator<Integer> it = s.iterator();
if(it.hasNext()){
Integer i = it.next();
// i is a "random" object from set
}
A generic solution using Khoth's answer as a starting point.
/**
* @param set a Set in which to look for a random element
* @param <T> generic type of the Set elements
* @return a random element in the Set or null if the set is empty
*/
public <T> T randomElement(Set<T> set) {
int size = set.size();
if (size == 0) {
return null; // random.nextInt(0) would throw IllegalArgumentException
}
int item = random.nextInt(size);
int i = 0;
for (T obj : set) {
if (i == item) {
return obj;
}
i++;
}
return null;
}
