I need to randomly sample a subset of n elements from a list in Scala, and I was wondering if there was a convenient way to do so without recourse to manually checking that each of the n elements are unique. At the moment I have something like this:
import scala.util.Random
def sample[A](itms: List[A], sampleSize: Int): List[A] = {
  var numbersSeen = Set[Int]()
  var sampled = List[A]()
  val itmLen = itms.size
  var sampleIdex = Random.nextInt(itmLen)
  while (sampled.size < sampleSize) {
    if (numbersSeen.contains(sampleIdex)) {
      sampleIdex = Random.nextInt(itmLen)
    } else {
      numbersSeen += sampleIdex
      sampled = itms(sampleIdex) :: sampled
    }
  }
  sampled
}
I was hoping there was something more elegant that can be done to either generate a non-repeating random list of integers in a range or to randomly sample n elements from a list.
If your list is not too long you could shuffle a list of index numbers and then march through that list.
In Scala that would be something like:
val aList = ('A' to 'Z').toList
val aListIterator = scala.util.Random.shuffle((0 until aList.length).toList).toIterator
and then in your looping structure:
...
if( aListIterator.hasNext ) aList(aListIterator.next)
...
If your list is huge, a function that returns a unique random number in the range of your list size (used as an index) might be a better approach. Jeff Preshing recently blogged about unique random numbers: http://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers.
You can pick one element at random, then sample from the rest of the list (everything except the element you've just picked) with sampleSize - 1, (tail-)recursively:
import scala.util.Random
def sample[A](itms: List[A], sampleSize: Int): List[A] = {
  // Each step copies the head into the picked slot and then drops the head,
  // which removes the picked element from the pool.
  def collect(vect: Vector[A], sampleSize: Int, acc: List[A]): List[A] = {
    if (sampleSize == 0) acc
    else {
      val index = Random.nextInt(vect.size)
      collect(vect.updated(index, vect(0)).tail, sampleSize - 1, vect(index) :: acc)
    }
  }
  collect(itms.toVector, sampleSize, Nil)
} //> sample: [A](itms: List[A], sampleSize: Int)List[A]
sample((1 to 10).toList, 5) //> res0: List[Int] = List(6, 8, 2, 1, 10)
itms.map(x => (x, util.Random.nextDouble)).sortBy(_._2).take(sampleSize).map(_._1)
as long as you don't care about the inefficiency of sort.
You could take a random sample from the set of subsets, i.e.:
val distinctSubsets = itms.to[Set].subsets(sampleSize)
Then choose one of those randomly.
What about this approach?
trait RandomOrdering[T] extends Ordering[T]
object RandomOrdering {
implicit def defaultOrdering[T] = new RandomOrdering[T] {
def compare(x:T, y:T) = (Random nextInt 3) - 1
}
}
def sample[A](items:List[A], sampleSize:Int)(implicit r:RandomOrdering[A]) =
items.sorted take sampleSize
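It might be less performant, but it also allows you to inject a different RandomOrdering. Note that a comparator returning random results does not satisfy the Ordering contract, so the resulting sample is not guaranteed to be uniform.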
I am trying to find a way to get a random value from a provided list of different ranges using ThreadLocalRandom, and return that one random value from a method. I've been trying different approaches, and not having much luck.
I've tried this:
private static final Long[][] values = {
{ 233L, 333L },
{ 377L, 477L },
{ 610L, 710L }
};
// This isn't correct
long randomValue = ThreadLocalRandom.current().nextLong(values[0][0]);
But I could not figure out how to get a random value out of it for a specific range, so thought I'd try the Map approach, I tried creating a Map of Integers and List of Longs:
private static Map<Integer, List<Long>> mapValues = new HashMap<>();
{{233L, 333L}, {377L, 477L}, {610L, 710L}} // ranges I want
I am not sure how to store those value ranges into the Map.
I've tried adding in values, for example:
// Need to get the other value for the range in here, in this case 333L
map.put(1, 233L);
I am not sure how to add the 333L to the List, I have searched and tried various things but always get errors, such as: found 'long', required List
I want the Integer in the Map to be an id for the associated range, for example, 1 for 233L-333L, so that I can tell it first, get a random Int key from the Map, for example 1, and then use ThreadLocalRandom.current().nextLong(origin, bound) where origin would be 233L and bound would be 333L, and then return a random value within that range of 233L-333L.
I am not sure if this is possible, or I am simply approaching this the wrong way - any guidance/help is appreciated!
It's pretty straightforward. Your long[][] will do fine.
First, select a random index, then select a long between values[index][0] and values[index][1] (see note 1).
long[][] values = {
{ 233L, 333L },
{ 377L, 477L },
{ 610L, 710L }
};
// Select a random index
int index = ThreadLocalRandom.current().nextInt(0, values.length);
// Determine lower and upper bounds
long min = values[index][0];
long max = values[index][1];
long rnd = ThreadLocalRandom.current().nextLong(min, max);
Of course, you could also abstract it away into some convenient classes.
Note that, for the distribution of values to be even, all ranges must have the same size (which seems to be the case in your code).
Implementation with even distribution
However, if you want to support different ranges while the distribution has to remain even, another approach is required.
We could calculate a single random number whose upper bound is the total number of possible values, and then check which 'bucket' that number falls into.
Here is a working example. To test that the distribution is even, a random number is generated a million times. As you can see, each value occurs approximately 200,000 times.
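The bucket idea can be sketched roughly as follows. This is a minimal illustration and not the answer's original example: the class name RangeSampler and the method sample are hypothetical, each range is assumed to be an {origin, bound} pair with an exclusive upper bound, and the ranges are assumed not to overlap.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: picks a value uniformly from the union of several ranges.
final class RangeSampler {

    // Each row is {origin, bound}; the bound is exclusive (an assumption of this sketch).
    static long sample(long[][] ranges) {
        // Total number of possible values across all ranges.
        long total = 0;
        for (long[] r : ranges) {
            total += r[1] - r[0];
        }
        // One random offset into the concatenation of all ranges.
        long offset = ThreadLocalRandom.current().nextLong(total);
        // Walk the buckets until the offset falls inside one of them.
        for (long[] r : ranges) {
            long size = r[1] - r[0];
            if (offset < size) {
                return r[0] + offset;
            }
            offset -= size;
        }
        throw new IllegalStateException("unreachable when total > 0");
    }

    public static void main(String[] args) {
        long[][] ranges = { { 233L, 333L }, { 377L, 477L }, { 610L, 710L } };
        System.out.println(sample(ranges));
    }
}
```

Because a single offset is drawn over the combined size, a larger range is chosen proportionally more often, so every individual value has the same probability regardless of which range it belongs to.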
1 In my examples, the upper bound is exclusive. This is consistent with many methods from the Java standard libraries, like ThreadLocalRandom.nextLong(origin, bound) or LongStream.range(long start, long end).
int range = ThreadLocalRandom.current().nextInt(3);
long randomValue = ThreadLocalRandom.current().nextLong(values[range][0],values[range][1]);
This will work with the array solution you tried first: first you select the range, then you get the random value.
The easiest approach is the most straightforward.
private static final long[][] values = {
    { 233L, 333L },
    { 377L, 477L },
    { 610L, 710L }
};
public static void main(String[] args) {
for (long v[] : values) {
long low = v[0];
long high = v[1];
System.out.println("Between " + low + " and " + high + " -> "
+ getRandom(low, high));
}
}
public static long getRandom(long low, long high) {
// add 1 to high to make range inclusive
return ThreadLocalRandom.current().nextLong(low, high + 1);
}
I have used this code to randomise 1000000 numbers without duplicates. Here's what I have so far.
protected void randomise() {
int[] copy = new int[getArray().length];
// used to indicate if elements have been used
boolean[] used = new boolean[getArray().length];
Arrays.fill(used,false);
for (int index = 0; index < getArray().length; index++) {
int randomIndex;
do {
randomIndex = getRandomIndex();
} while (used[randomIndex]);
copy[index] = getArray()[randomIndex];
used[randomIndex] = true;
}
// Copy the randomised values back into the original array
for (int index = 0; index < getArray().length; index++) {
getArray()[index] = copy[index];
}
}
public static void main(String[] args) {
RandomListing count = new SimpleRandomListing(1000000);
//Will choose 1000000 random numbers
System.out.println(Arrays.toString(count.getArray()));
}
This method is too slow. Can you let me know how this can be done more efficiently? I appreciate all replies.
A more efficient way to do this is to start with a pool of numbers (e.g. a List of all numbers between 0 and 1000000) and then remove each number as you use it. That way, every number you draw is guaranteed to be unused, rather than spending time retrying until you find a "good" unused number.
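One way to get the effect of the pool without repeatedly removing elements from a list is a Fisher-Yates shuffle. This is a swapped-in technique rather than what the answer literally describes, and the names UniqueRandomNumbers and shuffledRange are made up for this sketch.

```java
import java.util.Random;

// Hypothetical helper illustrating an in-place Fisher-Yates shuffle.
final class UniqueRandomNumbers {

    // Returns the numbers 0..n-1 in random order, each exactly once.
    static int[] shuffledRange(int n, Random rnd) {
        int[] pool = new int[n];
        for (int i = 0; i < n; i++) {
            pool[i] = i;
        }
        // Walk backwards, swapping each slot with a random slot at or
        // before it. This takes O(n) time and never needs a retry.
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = pool[i];
            pool[i] = pool[j];
            pool[j] = tmp;
        }
        return pool;
    }
}
```

If the values live in a List rather than an array, java.util.Collections.shuffle gives the same kind of result without any hand-written loop.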
It looks like you're using a linear search to find matches. Try using a binary search; it's more efficient. The array you are searching must be sorted to implement a binary search.
So given a string such as: 0100101, I want to return a random single index of one of the positions of a 1 (1, 4, 6).
So far I'm using:
protected int getRandomBirthIndex(String s) {
ArrayList<Integer> birthIndicies = new ArrayList<Integer>();
for (int i = 0; i < s.length(); i++) {
if ((s.charAt(i) == '1')) {
birthIndicies.add(i);
}
}
return birthIndicies.get(Randomizer.nextInt(birthIndicies.size()));
}
However, it's causing a bottleneck in my code (45% of CPU time is in this method), as the strings are over 4000 characters long. Can anyone think of a more efficient way to do this?
If you're interested in a single index of one of the positions with 1, and assuming there is at least one 1 in your input, you can just do this:
String input = "0100101";
final int n=input.length();
Random generator = new Random();
char c=0;
int i=0;
do{
i = generator.nextInt(n);
c=input.charAt(i);
}while(c!='1');
System.out.println(i);
This solution is fast and does not consume much memory when, for example, 1s and 0s are distributed uniformly. As highlighted by @paxdiablo, it can perform poorly in some cases, for example when 1s are scarce.
You could use String.indexOf(int) to find each 1 (instead of iterating every character). I would also prefer to program to the List interface and to use the diamond operator <>. Something like,
private static Random rand = new Random();
protected int getRandomBirthIndex(String s) {
List<Integer> birthIndicies = new ArrayList<>();
int index = s.indexOf('1');
while (index > -1) {
birthIndicies.add(index);
index = s.indexOf('1', index + 1);
}
return birthIndicies.get(rand.nextInt(birthIndicies.size()));
}
Finally, if you need to do this many times, save the List as a field and re-use it (instead of calculating the indices every time). For example with memoization,
private static Random rand = new Random();
private static Map<String, List<Integer>> memo = new HashMap<>();
protected int getRandomBirthIndex(String s) {
List<Integer> birthIndicies;
if (!memo.containsKey(s)) {
birthIndicies = new ArrayList<>();
int index = s.indexOf('1');
while (index > -1) {
birthIndicies.add(index);
index = s.indexOf('1', index + 1);
}
memo.put(s, birthIndicies);
} else {
birthIndicies = memo.get(s);
}
return birthIndicies.get(rand.nextInt(birthIndicies.size()));
}
Well, one way would be to remove the creation of the list each time, by caching the list based on the string itself, assuming the strings are used more often than they're changed. If they're not, then caching methods won't help.
The caching method involves having, rather than just a string, an object consisting of:
current string;
cached string; and
list based on the cached string.
You can provide a function to the clients to create such an object from a given string and it would set the string and the cached string to whatever was passed in, then calculate the list. Another function would be used to change the current string to something else.
The getRandomBirthIndex() function then receives this structure (rather than the string) and follows the rule set:
if the current and cached strings are different, set the cached string to be the same as the current string, then recalculate the list based on that.
in any case, return a random element from the list.
That way, if the list changes rarely, you avoid the expensive recalculation where it's not necessary.
In pseudo-code, something like this should suffice:
# Constructs fastie from string.
# Sets cached string to something other than
# that passed in (lazy list creation).
def fastie.constructor(string s):
    me.current = s
    me.cached = s + "!"

# Changes current string in fastie. No list update in
# case you change it again before needing an element.
def fastie.changeString(string s):
    me.current = s

# Get a random index, will recalculate list first but
# only if necessary. Empty list returns index of -1.
def fastie.getRandomBirthIndex():
    me.recalcListFromCached()
    if me.list.size() == 0:
        return -1
    return me.list[random(me.list.size())]

# Recalculates the list from the current string.
# Done on an as-needed basis.
def fastie.recalcListFromCached():
    if me.current != me.cached:
        me.cached = me.current
        me.list = empty
        for idx = 0 to me.cached.length() - 1 inclusive:
            if me.cached[idx] == '1':
                me.list.append(idx)
You also have the option of speeding up the actual search for the 1 characters by, for example, using indexOf() to locate them with the underlying Java libraries rather than checking each character individually in your own code (again, pseudo-code):
def fastie.recalcListFromCached():
    if me.current != me.cached:
        me.cached = me.current
        me.list = empty
        idx = me.cached.indexOf('1')
        while idx != -1:
            me.list.append(idx)
            idx = me.cached.indexOf('1', idx + 1)
This method can be used even if you don't cache the values. It's likely to be faster using Java's probably-optimised string search code than doing it yourself.
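For reference, the pseudo-code might translate to Java roughly as follows. This is only a sketch: the class name BirthIndexCache is hypothetical, and it uses a null cached string (instead of appending "!") to mark that the list has not been built yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical Java rendering of the "fastie" pseudo-code.
final class BirthIndexCache {
    private final Random rand = new Random();
    private final List<Integer> ones = new ArrayList<>();
    private String current;
    private String cached; // null means the list has not been built yet

    BirthIndexCache(String s) {
        this.current = s;
    }

    // Only records the new string; the list is rebuilt lazily on the next lookup.
    void changeString(String s) {
        this.current = s;
    }

    // Returns a random index of a '1', or -1 if the string contains none.
    int getRandomBirthIndex() {
        recalcIfStale();
        if (ones.isEmpty()) {
            return -1;
        }
        return ones.get(rand.nextInt(ones.size()));
    }

    // Rebuilds the index list only when the current string differs from the cached one.
    private void recalcIfStale() {
        if (current.equals(cached)) {
            return;
        }
        cached = current;
        ones.clear();
        for (int idx = cached.indexOf('1'); idx != -1; idx = cached.indexOf('1', idx + 1)) {
            ones.add(idx);
        }
    }
}
```

Repeated calls on an unchanged string then cost one string comparison plus one random lookup, instead of a full scan.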
However, you should keep in mind that your supposed problem of spending 45% of time in that code may not be an issue at all. It's not so much the proportion of time spent there as it is the absolute amount of time.
By that, I mean it probably makes no difference what percentage of the time is being spent in that function if it finishes in 0.001 seconds (and you're not wanting to process thousands of strings per second). You should only really become concerned if the effects become noticeable to the user of your software somehow. Otherwise, optimisation is pretty much wasted effort.
You could also try this approach. Its best-case complexity is O(1), its worst case may go to O(n), and in the absolute worst case it may never finish, since that depends entirely on the Randomizer function you are using.
private static Random rand = new Random();
protected int getRandomBirthIndex(String s) {
List<Integer> birthIndicies = new ArrayList<>();
int index = s.indexOf('1');
while (index > -1) {
birthIndicies.add(index);
index = s.indexOf('1', index + 1);
}
return birthIndicies.get(rand.nextInt(birthIndicies.size()));
}
If your Strings are very long and you're sure they contain a lot of 1s (or whatever String you're looking for), it's probably faster to randomly "poke around" in the String until you find what you are looking for. That way you save the time spent iterating over the String:
String s = "0100101";
int index = ThreadLocalRandom.current().nextInt(s.length());
while(s.charAt(index) != '1') {
System.out.println("got not a 1, trying again");
index = ThreadLocalRandom.current().nextInt(s.length());
}
System.out.println("found: " + index + " - " + s.charAt(index));
I'm not sure about the statistics, but in rare cases this solution might take much longer than the iterating solution. One such case is a long String with only very few occurrences of the search string.
If the Source-String doesn't contain the search String at all, this code will run forever!
One possibility is to use a short-circuited Fisher-Yates style shuffle. Create an array of the indices and start shuffling it. As soon as the next shuffled element points to a one, return that index. If you find you've iterated through indices without finding a one, then this string contains only zeros so return -1.
If the length of the strings is always the same, the array indices can be static as shown below, and doesn't need reinitializing on new invocations. If not, you'll have to move the declaration of indices into the method and initialize it each time with the correct index set. The code below was written for strings of length 7, such as your example of 0100101.
// delete this and uncomment below if string lengths vary
private static int[] indices = { 0, 1, 2, 3, 4, 5, 6 };
protected int getRandomBirthIndex(String s) {
int tmp;
/*
* int[] indices = new int[s.length()];
* for (int i = 0; i < s.length(); ++i) indices[i] = i;
*/
for (int i = 0; i < s.length(); i++) {
int j = randomizer.nextInt(indices.length - i) + i;
if (j != i) { // swap to shuffle
tmp = indices[i];
indices[i] = indices[j];
indices[j] = tmp;
}
if ((s.charAt(indices[i]) == '1')) {
return indices[i];
}
}
return -1;
}
This approach terminates quickly if 1's are dense, guarantees termination after s.length() iterations even if there aren't any 1's, and the locations returned are uniform across the set of 1's.
I know that if you have two HashSets then you can create a third one by adding the two. However, for my purpose I need to change my previous HashSet, check a certain condition, and, if it is not met, change the set again. My purpose is that I will give an input, say the number 456, and look for digits (1 through 9, plus 0). If the HashSet does not reach size 10, I multiply the number by 2 and do the same. So I'll get 912; the size is 6 now (and I need to get all digits 1-9 and 0, i.e., size 10). Now I multiply it by 3 and get 2736; the size is now 7. I keep doing so until I get size 10. When I get size 10, I end the loop and return the last number that concluded the loop, following the incremental multiplication rule. My approach is as follows. It has errors so it won't run, but it represents my understanding as of now.
public long digitProcessSystem(long N) {
// changing the passed in number into String
String number = Long.toString(N);
//splitting the String so that I can investigate each digit
String[] arr = number.split("");
// Storing the digits(which are Strings now) into HashSet
Set<String> input = new HashSet<>(Arrays.asList(arr));
// Count starts for incremental purpose later.
count =1;
//When I get all digits; 1-9, & 0, I need to return the last number that concluded the condition
while (input.size() == 10) {
return N;
}
// The compiler telling me to delete the else but as a new Java user so far my understanding is that I can use `else` with `while`loops.Correct me if I'm missing something.
else {
// Increment starts following the rule; N*1, N*2,N*3,...till size is 10
N = N*count;
// doing everything over
String numberN = Long.toString(N);
String[] arr1 = number.split("");
// need to change the previous `input`so that the new updated `HashSet` gets passed in the while loop to look for size 10.This is error because I'm using same name `input`. But I don't want to create a new `set` , I need to update the previous `set` which I don't know how.
Set<String> input = new HashSet<>(Arrays.asList(arr1));
// increments count
count++;
}
Call clear() on input and then add the new values. Something like
// Set<String> input = new HashSet<>(Arrays.asList(arr1));
input.clear();
input.addAll(Arrays.asList(arr1));
and
while (input.size() == 10) {
should be
if (input.size() == 10) {
Otherwise, your else isn't tied to an if.
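As a small self-contained illustration of reusing one set instead of declaring a new one, here is a sketch. The class SetReuseDemo and the method refill are hypothetical names, and the number is assumed to be non-negative so that no minus sign ends up in the set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: the same Set instance is emptied and refilled in place.
final class SetReuseDemo {

    // Replaces the contents of an existing set with the digits of n.
    static void refill(Set<String> input, long n) {
        input.clear();
        input.addAll(Arrays.asList(Long.toString(n).split("")));
    }

    public static void main(String[] args) {
        Set<String> input = new HashSet<>();
        refill(input, 456L);
        System.out.println(input.size());
        refill(input, 912L);
        System.out.println(input.size());
    }
}
```

If the digits are meant to accumulate across multiplications, as the sizes quoted in the question suggest, leaving out the clear() call keeps the earlier digits in the set.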
So I'm working on a program which is supposed to randomly put people in 6 rooms (the final output is the list of rooms with who is in each room). So I figured out how to do all that.
//this is the main sorting sequence:
for (int srtM = 0; srtM < Guys.length; srtM++) {
done = false;
People newMove = Guys[srtM]; //Guys is an array of People
while (!done) {
newMove.rndRoom(); //sets random number from 4 to 6
if (newMove.getRoom() == 4 && !room4.isFull()) {
room4.add(newMove); //adds person into the room4 object rList
done = true;
} else if (newMove.getRoom() == 5 && !room5.isFull()) {
room5.add(newMove);
done = true;
} else if (newMove.getRoom() == 6 && !room6.isFull()) {
room6.add(newMove);
done = true;
}
}
}
The problem now is that the code for reasons I don't completely understand (something with the way I wrote it here) is hardly random. It seems the same people are put into the same rooms almost every time I run the program. For example me, I'm almost always put by this program into room 6 together with another one friend (interestingly, we're both at the end of the Guys array). So how can I make it "truly" random? Or a lot more random than it is now?
Thanks in advance!
Forgot to mention that "rndRoom()" does indeed use the standard Random method (for 4-6) in the background:
public int rndRoom() {
if (this.gender == 'M') {
this.room = (rnd.nextInt((6 - 4) + 1)) + 4;
}
if (this.gender == 'F') {
this.room = (rnd.nextInt(((3 - 1) + 1))) + 1;
}
return this.room;
}
If you want it to be more random, try working with the Random class directly, something like this:
Random random = new Random();
for (int i = 0; i < 6; i++)
{
    // nextInt(6) returns 0 to 5, so adding 1 gives a room number from 1 to 6
    int roomChoice = random.nextInt(6) + 1;
}
Of course, this is not exactly the code you will want to use; it is just an example of how to use Random, so change it to suit your needs.
Also, the reason I wrote random.nextInt(6) + 1 is that random.nextInt(6) gets you a random number from 0 to 5, so if you want a number from 1 to 6 you have to add 1.
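The same offset trick works for any inclusive range. A minimal sketch, where the class RandomRange and the method randomInRange are hypothetical names:

```java
import java.util.Random;

// Hypothetical helper for drawing an int from an inclusive range.
final class RandomRange {

    // Returns a value in [min, max], both ends inclusive.
    static int randomInRange(Random random, int min, int max) {
        // nextInt(bound) yields 0 .. bound - 1, so shift the result by min.
        return random.nextInt(max - min + 1) + min;
    }

    public static void main(String[] args) {
        Random random = new Random();
        System.out.println(randomInRange(random, 4, 6)); // one of 4, 5, 6
    }
}
```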
On another note, getting "truly" random numbers is not as easy as it seems. When you generate a "random" number, the program uses pseudo-random number generation: a deterministic algorithm produces an endless sequence of numbers that only looks random. When large samples of pseudo-random numbers are taken, each possible value occurs with approximately equal frequency, even though the values are not evenly spaced within the sequence.
There might be something wrong with code you didn't post.
I've built a working example with what your classes might be, and it is distributing pretty randomly:
http://pastebin.com/u8sZRxi6
OK so I figured out why the results don't seem very random. So the room sorter works based on an alphabetical people list of 18 guys. There are only 3 guy rooms (rooms 4, 5 and 6) So each guy has a 1 in 3 chance to be put in say, room 6. But each person could only possibly be in 2 of the 6 spots in each room (depending on where they are in the list).
The first two people, for example, could each only be in either the first or second spot of each room. By "spot" I mean their place in the room list which is printed at the end. I, on the other hand, am second to last on the list, so at that point I could only be in either the last or second-to-last spot of each room.
Sorry if it's confusing but I figured out this is the reason the generated room lists don't appear very random - it's because only the same few people could be put in each room spot every time. The lists are random though, it's just the order in which people appear in each list which is not random.
So in order to make the lists look more random I had to make people's positions in the room random too. So the way I solved this is by adding a shuffler action which mixes the Person arrays:
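public static void shuffle(Person[] arr) {
    Random rgen = new Random();
    // Fisher-Yates: swap each slot with a random slot at or before it,
    // which makes every ordering equally likely
    for (int i = arr.length - 1; i > 0; i--) {
        int randPos = rgen.nextInt(i + 1);
        Person tmp = arr[i];
        arr[i] = arr[randPos];
        arr[randPos] = tmp;
    }
}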
TL;DR the generated room lists were random - but since the order of the people that got put into the rooms wasn't random the results didn't look very random. In order to solve this I shuffled the Person arrays.