Probability of x possible outcomes [duplicate]


This question already has answers here:
Random weighted selection in Java
(7 answers)
Closed 2 years ago.
So I am writing a program in which a user can define as many random outcomes as they want. They also define the probability of each (so no, they are not equally likely). What is the best way to check which one occurred? Note: My program happens to be a Minecraft plugin, but the question is more of a general Java one, so I am trying to make the code reflect that:
Map<String,Integer> possibilities = new HashMap<String,Integer>();
int x = (int) (Math.random() * 100);
My idea was to create another variable, add each probability I have already checked to it, and compare that running total with x. If x isn't below it, rinse and repeat, but I'm unsure of how to structure this.
So, for example: if the user configured 3 different outcomes with a 30, 20 and 50 percent chance respectively, how would I do this?

Use a NavigableMap, which will allow you to retrieve the correct outcome with one clean and simple lookup. (And internally, this uses an efficient O(log n) lookup—not that your maps will be large enough to matter.)
import java.util.NavigableMap;
import java.util.TreeMap;
import static java.util.concurrent.ThreadLocalRandom.current;
final class LoadedDie {
public static void main(String... argv) {
/* One-time setup */
NavigableMap<Integer, String> loot = new TreeMap<>();
int cumulative = 0;
loot.put(cumulative += 20, "Gold");
loot.put(cumulative += 30, "Iron");
loot.put(cumulative += 50, "Coal");
/* Repeated use */
System.out.println(loot.higherEntry(current().nextInt(cumulative)).getValue());
System.out.println(loot.higherEntry(current().nextInt(cumulative)).getValue());
}
}
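The same cumulative-key idea can be wrapped so that it starts from the user-defined Map<String, Integer> in the question. This is only a sketch: the class name WeightedPicker and the methods outcomeFor and pick are made up for illustration, and skipping non-positive weights is an assumption, not something the question specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: picks a key with probability proportional to its weight. */
final class WeightedPicker {
    // Key = running total of weights up to and including this outcome.
    private final NavigableMap<Integer, String> cumulative = new TreeMap<>();
    private int total = 0;

    WeightedPicker(Map<String, Integer> possibilities) {
        for (Map.Entry<String, Integer> e : possibilities.entrySet()) {
            if (e.getValue() <= 0) {
                continue; // assumption: non-positive weights can never be drawn
            }
            total += e.getValue();
            cumulative.put(total, e.getKey());
        }
        if (total == 0) {
            throw new IllegalArgumentException("no positive weights");
        }
    }

    /** Maps a roll in [0, total) to its outcome; separate so it can be checked without randomness. */
    String outcomeFor(int roll) {
        return cumulative.higherEntry(roll).getValue();
    }

    /** Draws one outcome at random. */
    String pick() {
        return outcomeFor(ThreadLocalRandom.current().nextInt(total));
    }

    public static void main(String[] args) {
        Map<String, Integer> possibilities = new LinkedHashMap<>();
        possibilities.put("A", 30);
        possibilities.put("B", 20);
        possibilities.put("C", 50);
        System.out.println(new WeightedPicker(possibilities).pick());
    }
}
```

Because the roll is drawn from [0, total), the weights do not have to add up to 100; they only have to be positive integers.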

Here's one way to do it.
public static String getOutcome(Map<String, Integer> possibilities) {
int x = (int) (Math.random() * 100);
for (Map.Entry<String, Integer> possibility : possibilities.entrySet()) {
if (x < possibility.getValue()) { // x is in 0..99, so a weight of 30 must match exactly 30 values
return possibility.getKey();
}
x -= possibility.getValue();
}
// unreachable if probabilities are correctly mapped
return null;
}

Related

In Apache Spark, can I easily repeat/nest a SparkContext.parallelize?

I am trying to model a genetics problem we are trying to solve, building up to it in steps. I can successfully run the PiAverage examples from Spark Examples. That example "throws darts" at a circle (10^6 in our case) and counts the number that "land in the circle" to estimate Pi.
Let's say I want to repeat that process 1000 times (in parallel) and average all those estimates. I am trying to see the best approach, seems like there's going to be two calls to parallelize? Nested calls? Is there not a way to chain map or reduce calls together? I can't see it.
I want to know the wisdom of something like the idea below. I thought of tracking the resulting estimates using an accumulator. jsc is my SparkContext, full code of single run is at end of question, thanks for any input!
Accumulator<Double> accum = jsc.accumulator(0.0);
// make a list 1000 long to pass to parallelize (no for loops in Spark, right?)
List<Integer> numberOfEstimates = new ArrayList<Integer>(HOW_MANY_ESTIMATES);
// pass this "dummy list" to parallelize, which then
// calls a pieceOfPI method to produce each individual estimate
// accumulating the estimates. PieceOfPI would contain a
// parallelize call too with the individual test in the code at the end
jsc.parallelize(numberOfEstimates).foreach(accum.add(pieceOfPI(jsc, numList, slices, HOW_MANY_ESTIMATES)));
// get the value of the total of PI estimates and print their average
double totalPi = accum.value();
// output the average of averages
System.out.println("The average of " + HOW_MANY_ESTIMATES + " estimates of Pi is " + totalPi / HOW_MANY_ESTIMATES);
It doesn't seem like a matrix or other answers I see on SO give the answer to this specific question, I have done several searches but I am not seeing how to do this without "parallelizing the parallelization." Is that a bad idea?
(and yes I realize mathematically I could just do more estimates and effectively get the same results :) Trying to build a structure my boss wants, thanks again!
I have put my entire single-test program here if that helps, sans an accumulator I was testing out. The core of this would become PieceOfPI():
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.Accumulable;
import org.apache.spark.Accumulator;
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.SparkConf;
public class PiAverage implements Serializable {
public static void main(String[] args) {
PiAverage pa = new PiAverage();
pa.go();
}
public void go() {
// should make a parameter like all these finals should be
// int slices = (args.length == 1) ? Integer.parseInt(args[0]) : 2;
final int SLICES = 16;
// how many "darts" are thrown at the circle to get one single Pi estimate
final int HOW_MANY_DARTS = 1000000;
// how many "dartboards" to collect to average the Pi estimate, which we hope converges on the real Pi
final int HOW_MANY_ESTIMATES = 1000;
SparkConf sparkConf = new SparkConf().setAppName("PiAverage")
.setMaster("local[4]");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
// setup "dummy" ArrayList of size HOW_MANY_DARTS -- how many darts to throw
List<Integer> throwsList = new ArrayList<Integer>(HOW_MANY_DARTS);
for (int i = 0; i < HOW_MANY_DARTS; i++) {
throwsList.add(i);
}
// setup "dummy" ArrayList of size HOW_MANY_ESTIMATES
List<Integer> numberOfEstimates = new ArrayList<Integer>(HOW_MANY_ESTIMATES);
for (int i = 0; i < HOW_MANY_ESTIMATES; i++) {
numberOfEstimates.add(i);
}
JavaRDD<Integer> dataSet = jsc.parallelize(throwsList, SLICES);
long totalPi = dataSet.filter(new Function<Integer, Boolean>() {
public Boolean call(Integer i) {
double x = Math.random();
double y = Math.random();
if (x * x + y * y < 1) {
return true;
} else
return false;
}
}).count();
System.out.println(
"The average of " + HOW_MANY_DARTS + " estimates of Pi is " + 4 * totalPi / (double)HOW_MANY_DARTS);
jsc.stop();
jsc.close();
}
}

Let me start with your "background question". Transformation operations like map, join, groupBy, etc. fall into two categories; those that require a shuffle of data as input from all the partitions, and those that don't. Operations like groupBy and join require a shuffle, because you need to bring together all records from all the RDD's partitions with the same keys (think of how SQL JOIN and GROUP BY ops work). On the other hand, map, flatMap, filter, etc. don't require shuffling, because the operation works fine on the input of the previous step's partition. They work on single records at a time, not groups of them with matching keys. Hence, no shuffling is necessary.
This background is necessary to understand that an "extra map" does not have a significant overhead. A sequence of operations like map, flatMap, etc. is "squashed" together into a "stage" (which is shown when you look at details for a job in the Spark Web console) so that only one RDD is materialized, the one at the end of the stage.
On to your first question. I wouldn't use an accumulator for this. They are intended for "side-band" data, like counting how many bad lines you parsed. In this example, you might use accumulators to count how many (x,y) pairs were inside the radius of 1 vs. outside, as an example.
The JavaSparkPi example in the Spark distribution is about as good as it gets. You should study why it works. It's the right dataflow model for Big Data systems. You could use "aggregators". In the Javadocs, click the "index" and look at the agg, aggregate, and aggregateByKey functions. However, they are no more understandable and not necessary here. They provide greater flexibility than map then reduce, so they are worth knowing.
The problem with your code is that you are effectively trying to tell Spark what to do, rather than expressing your intent and letting Spark optimize how it does it for you.
Finally, I suggest you buy and study O'Reilly's "Learning Spark". It does a good job explaining the internal details, like staging, and it shows lots of example code you can use, too.

How to efficiently remove duplicate collision pairs in spatial hash grid?

I'm working on a 2D game for Android, so performance is a real concern. In this game a lot of collisions can occur between objects, and I don't want to brute-force check in O(n^2) whether every game object collides with every other one. To reduce the number of collision checks I decided to use spatial hashing as the broad-phase algorithm because it seems quite simple and efficient: divide the scene into rows and columns and check collisions only between objects residing in the same grid cell.
Here's the basic concept I quickly sketched:
public class SpatialHashGridElement
{
HashSet<GameObject> gameObjects = new HashSet<GameObject>();
}
static final int SPATIAL_HASH_GRID_ROWS = 4;
static final int SPATIAL_HASH_GRID_COLUMNS = 5;
static SpatialHashGridElement[] spatialHashGrid = new SpatialHashGridElement[SPATIAL_HASH_GRID_ROWS * SPATIAL_HASH_GRID_COLUMNS];
void updateGrid()
{
float spatialHashGridElementWidth = screenWidth / SPATIAL_HASH_GRID_COLUMNS;
float spatialHashGridElementHeight = screenHeight / SPATIAL_HASH_GRID_ROWS;
for(SpatialHashGridElement e : spatialHashGrid)
e.gameObjects.clear();
for(GameObject go : displayList)
{
for(int i = 0; i < go.vertices.length/3; i++)
{
int row = (int) Math.abs(((go.vertices[i*3 + 1] / spatialHashGridElementHeight) % SPATIAL_HASH_GRID_ROWS));
int col = (int) Math.abs(((go.vertices[i*3 + 0] / spatialHashGridElementWidth) % SPATIAL_HASH_GRID_COLUMNS));
if(!spatialHashGrid[row * SPATIAL_HASH_GRID_COLUMNS + col].gameObjects.contains(go))
spatialHashGrid[row * SPATIAL_HASH_GRID_COLUMNS + col].gameObjects.add(go);
}
}
}
The code probably isn't of the highest quality, so if you spot anything to improve please don't hesitate to tell me. The most worrying problem right now is that the same collision pair may be checked in more than one grid cell. Worst case example (assuming none of the objects spans more than 2 cells in each direction):
Here we have 2 gameObjects colliding(red and blue). Each of them resides in 4 cells => therefore in each cell there will be the same pair to check.
I can't come up with some efficient approach to remove the possibility of duplicate pairs without a need to filter the grid after creating it in updateGrid(). Is there some brilliant way to detect that some collision pair has been already inserted even during the updateGrid function? I will be very grateful for any tips!

I'm trying to explain my idea using some pseudo-code (C# language elements):
public partial class GameObject {
// ...
Set<GameObject> collidedSinceLastTick = new HashSet<GameObject>();
public boolean collidesWith(GameObject other) {
if (collidedSinceLastTick.contains(other)) {
return true; // or even false, see below
}
boolean collided = false;
// TODO: your costly logic here
if (collided) {
collidedSinceLastTick.add(other);
// maybe return false if other actions depend on a GameObject just colliding once per tick
}
return collided;
}
// ...
}
HashSet and .hashCode() can both be tuned in some cases. Maybe you could even remove displayList and "hold" everything in spatialHashGrid to reduce the memory footprint a little. Of course, do that only if you don't need special access to displayList. For comparison, in XML's Document Object Model objects can be accessed by a path through the tree, and "hot spots" can be accessed by ID, where the ID has to be assigned explicitly. For serializing (saving game state or whatever), iterating through spatialHashGrid should not be an issue performance-wise. It is a bit slower than serializing the gameObject set because you may have to suppress duplicates, although Java serialization with its default settings does not save the same object twice; it saves just a reference after the first occurrence of an object.
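A different way to avoid repeated work, sketched here as an alternative to the per-object set above, is to remember which pairs have already been emitted while walking the grid cells. The sketch assumes every game object carries a unique int id (the question's GameObject has no such field), and the names PairDeduper, pairKey and uniquePairs are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: emits each candidate pair once, however many cells the two objects share. */
final class PairDeduper {

    /** Order-independent key: smaller id in the high 32 bits, larger id in the low 32 bits. */
    static long pairKey(int idA, int idB) {
        int lo = Math.min(idA, idB);
        int hi = Math.max(idA, idB);
        return ((long) lo << 32) | (hi & 0xFFFFFFFFL);
    }

    /** Each inner list holds the ids of the objects overlapping one grid cell. */
    static List<int[]> uniquePairs(List<List<Integer>> cells) {
        Set<Long> seen = new HashSet<>();
        List<int[]> pairs = new ArrayList<>();
        for (List<Integer> cell : cells) {
            for (int i = 0; i < cell.size(); i++) {
                for (int j = i + 1; j < cell.size(); j++) {
                    int a = cell.get(i);
                    int b = cell.get(j);
                    // Set.add returns false when this pair was already seen in an earlier cell.
                    if (a != b && seen.add(pairKey(a, b))) {
                        pairs.add(new int[] {Math.min(a, b), Math.max(a, b)});
                    }
                }
            }
        }
        return pairs;
    }
}
```

Only the pairs returned by uniquePairs would then go through the costly narrow-phase test. On Android the boxed Long keys allocate, so a primitive-keyed set may be preferable; that trade-off is not measured here.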

NullPointerException in Java my java program [duplicate]

This question already has answers here:
What is a NullPointerException, and how do I fix it?
(12 answers)
Closed 7 years ago.
I'm working on a program in Eclipse that generates objects for runners in a 100 m race.
Each runner has a lane, a name and three separate times for completing the race.
However, when I try to generate the times for each object, I get java.lang.NullPointerException.
Here is the method for generating the times.
public double[] getTimes() {
for (int i = 0; i <= (times.length - 1); i++) {
rolled = 1.0 + roll.nextDouble() * 100.0;
// set rolled to a new random number between 1 and 100
times[i] = rolled;
// set index i of the array times to be the nearest
// double to roll's value.
}
return times;
}
and then the code in which the method is called.
public void testGetTimes() {
double[] times = performance.getTimes();
assertEquals(2, times.length);
assertEquals(9.2, times[0], 0.01);
assertEquals(9.4, times[1], 0.01);
}
I'd try to fix it with the debugger, but every time I try to step into the for loop, I get InvocationTargetException,(Throwable line: not available
Initialization of times, roll and rolled:
public class Performance {
private int lane;
private String name;
double[] times = new double[3];
int rolling;
Random roll = new Random();
double rolled;
double average;
double best;
and of performance:
public class PerformanceTest {
Performance performance;
@Before
public void setup() {
performance = new Performance(1, "", new double[]{9.2, 9.4});
}

It looks like one of your objects is uninitialized. On the other hand, initializing it to null can cause the same problem. As of Java 8, I would suggest using an Optional to contain any values that you are not immediately certain of, or are not immediately aware of, and putting their processing in an Optional.ifPresent(Consumer) call. It minimizes the possibility of a NullPointerException, and invites functional programming methods, which can seriously cut the amount of time it takes to get the job done.
As far as your assertion error goes, I'm sure you already know that this is only caused by an assert statement. Looks like times[0] is substantially larger than you were expecting. It appears, from what we can see, that you're setting it to 1.0 + random{0.0...1.0} * 100.0; I don't know what the precondition that leads you to expect 9.4 is, but this could easily hit the forties.
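To illustrate the Optional suggestion above, here is a minimal sketch. The class OptionalDemo and its methods are hypothetical and are not part of the asker's Performance class; the only library calls used are Optional.empty, Optional.of and Optional.ifPresent.

```java
import java.util.Optional;

/** Hypothetical demo of wrapping a possibly-missing value in Optional. */
final class OptionalDemo {

    /** Returns the first time, or an empty Optional when the array is null or has no elements. */
    static Optional<Double> firstTime(double[] times) {
        if (times == null || times.length == 0) {
            return Optional.empty();
        }
        return Optional.of(times[0]);
    }

    /** The lambda passed to ifPresent runs only when a value exists, so nothing is dereferenced on null. */
    static String describe(double[] times) {
        StringBuilder sb = new StringBuilder();
        firstTime(times).ifPresent(t -> sb.append("first time: ").append(t));
        return sb.length() == 0 ? "no times recorded" : sb.toString();
    }
}
```

Calling describe(null) takes the empty branch instead of throwing, which is the behaviour the answer is after. It does not explain why the field was null in the first place, which still has to be found in the constructor.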

Java fastest way to get matching range

I have a set of integer ranges, which represent lower and upper bounds of classes. For example:
0..500 xsmall
500..1000 small
1000..1500 medium
1500..2500 large
In my case there can be over 500 classes. These classes do not overlap, but they can differ in size.
I can implement finding the matching range as a simple linear search through a list, for example
class Range
{
int lower;
int upper;
String category;
boolean contains(int val)
{
return lower <= val && val < upper;
}
}
public String getMatchingCategory(int val)
{
for (Range r : listOfRanges)
{
if (r.contains(val))
{
return r.category;
}
}
return null;
}
However, this seems slow; as I need on average N/2 look-ups. If the classes were equally sized, I could use division. Is there a standard technique to find the correct range faster?

What you are looking for is a SortedMap and its methods tailMap and firstKey. Check out the documentation for full details.
The advantage of this approach over plain arrays is in the ease of maintaining your ranges: you can insert/remove new boundaries at any point with almost no runtime cost; with arrays it means copying both parallel arrays in full.
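A minimal sketch of that lookup, assuming the map is keyed by each range's exclusive upper bound and that the ranges leave no gaps. RangeLookup, add and categoryOf are illustrative names; tailMap, firstKey and isEmpty are the standard SortedMap methods.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical lookup keyed by the exclusive upper bound of each range. */
final class RangeLookup {
    private final SortedMap<Integer, String> byUpperBound = new TreeMap<>();
    private final int lowest; // inclusive lower bound of the first range

    RangeLookup(int lowest) {
        this.lowest = lowest;
    }

    void add(int exclusiveUpper, String category) {
        byUpperBound.put(exclusiveUpper, category);
    }

    /** Returns the category whose range contains val, or null when val is outside all ranges. */
    String categoryOf(int val) {
        if (val < lowest || val == Integer.MAX_VALUE) {
            return null; // below the first range, or no exclusive upper bound can exceed val
        }
        // tailMap(k) keeps keys >= k, so val + 1 selects the first upper bound strictly above val.
        SortedMap<Integer, String> tail = byUpperBound.tailMap(val + 1);
        return tail.isEmpty() ? null : tail.get(tail.firstKey());
    }
}
```

The val + 1 matters because the question's ranges are half-open: 500 belongs to "small", not "xsmall".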
Update
I've written code for both variants and benchmarked it:
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class BinarySearch
{
static final int ARRAY_SIZE = 128, INCREMENT = 1000;
static final int[] arrayK = new int[ARRAY_SIZE];
static final String[] arrayV = new String[ARRAY_SIZE];
static final SortedMap<Integer,String> map = new TreeMap<>();
static {
for (int i = 0, j = 0; i < arrayK.length; i++) {
arrayK[i] = j; arrayV[i] = String.valueOf(j);
map.put(j, String.valueOf(j));
j += INCREMENT;
}
}
final Random rnd = new Random();
int rndInt;
@Setup(Level.Invocation) public void nextInt() {
rndInt = rnd.nextInt((ARRAY_SIZE-1)*INCREMENT);
}
@GenerateMicroBenchmark
public String array() {
final int i = Arrays.binarySearch(arrayK, rndInt);
return arrayV[i >= 0? i : -(i+1)];
}
@GenerateMicroBenchmark
public String sortedMap() {
return map.tailMap(rndInt).values().iterator().next();
}
}
Benchmark results:
Benchmark Mode Thr Cnt Sec Mean Mean error Units
array thrpt 1 5 5 10.948 0.033 ops/usec
sortedMap thrpt 1 5 5 5.752 0.070 ops/usec
Interpretation: array search is only twice as fast and this factor is quite stable across array sizes. In the presented code the array size is 1024 and the factor is 1.9. I've also tested with array size 128, where the factor is 2.05.

Here, Arrays.binarySearch is your friend. Simply put all the boundaries in and handle the possible cases. Assuming your ranges leave no holes between them, you only need to put the upper bounds in.
For your example
0..500 xsmall
500..1000 small
1000..1500 medium
1500..2500 large
you'd use
int[] boundaries = {500, 1000, 1500, 2500};
and look up the input. Handle the two cases (found/not found) and you're done. Forget about ranges; they're nice, but they don't fit your problem.
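The two cases can be sketched as follows. The class BoundarySearch, the parallel NAMES array and the LOWEST constant are assumptions added for illustration; Arrays.binarySearch returns the index when the key is found and -(insertionPoint) - 1 when it is not.

```java
import java.util.Arrays;

/** Hypothetical lookup over sorted exclusive upper bounds, with no gaps between ranges. */
final class BoundarySearch {
    static final int LOWEST = 0; // inclusive lower bound of the first range
    static final int[] UPPER = {500, 1000, 1500, 2500};
    static final String[] NAMES = {"xsmall", "small", "medium", "large"};

    /** Returns the category containing val, or null when val lies outside every range. */
    static String categoryOf(int val) {
        if (val < LOWEST) {
            return null;
        }
        int r = Arrays.binarySearch(UPPER, val);
        // Found: val equals an upper bound, which is exclusive, so it belongs to the next range.
        // Not found: r == -(insertionPoint) - 1, and the insertion point is the range index.
        int index = (r >= 0) ? r + 1 : -(r + 1);
        return index < NAMES.length ? NAMES[index] : null;
    }
}
```

For instance, 500 is found at index 0 and therefore maps to "small", while 499 is not found, has insertion point 0 and maps to "xsmall".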
Update
I also wrote a benchmark and no matter how I try I'd lose my bet as the ratio is about 3 rather than 5. The strange things like S001024 in my results stand for the size 1024.

How do I make two classes to work together in Java? [closed]

Closed 9 years ago.
I have been working for weeks on a project where I have to write one class that generates 500 random numbers from 1 to 250. A second class has to inherit the first class's properties and write all those numbers to a text file. I am having problems getting at those properties and working with them, and I haven't found a way to do it online.
My First class is
import java.util.Random;
public class GenKeys {
public static void random(){
for (int i = 0; i < 250; i++) {
int x = (int) (Math.random() * 100);
}
}
}
and my second code is
import java.util.Random;
import java.io.*;
import java.lang.*;
public class MainProg extends GenKeys{
public static void main(String[] args){
public static void random(){
try {
BufferedWriter out = new BufferedWriter(new FileWriter("file.txt"));
out.write( x + System.getProperty("line.separator"));// when i compile the x is not found!!!
out.close();
} catch (IOException e) {
System.out.print(e);
}
}
How can I make the two classes work together?

What am I doing wrong?
You are using inheritance instead of just using an instance of GenKeys in MainProg
You keep overwriting your random values, since you only use a single variable x, when you should be using e.g. an array
You create 250 values in range [0..99] instead of 500 values in range [1..250]
You don't store or return anything from your random() method

and i havent found a way to do it online.
I'm not sure you've looked hard enough.
How to get your code working
Firstly, change the method's return type from void to int and give it a clearer name:
public static int randomNum()
Then remove the loop from that method and just return the generated number:
return (int) (Math.random() * 250) + 1; // 1..250; by the way, there is also a Random class
In the random method of the second class, you want the loop. Open the writer once before the loop and close it afterwards:
BufferedWriter out = new BufferedWriter(new FileWriter("file.txt"));
for (int x = 0; x < 500; x++)
{
out.write(randomNum() + System.getProperty("line.separator"));
}
out.close();
The various issues with your code
You're misusing inheritance here. Your class is not a type of GenKeys. It simply uses it, so it should be a field in your class.
Secondly, a method can only return one value, or one object. It cannot return 250 numbers as is. You're assigning 250 numbers to x one after another, so only the last number generated is kept.

I don't think this is the right approach. You need another class, for example KeyWriter, that inherits from GenKeys. Let it use the GenKeys method random (which doesn't need to be static).
Also, your random method is wrong: you only generate 250 keys instead of 500, and they are not in the range 1 to 250.
My solution is:
1) inherit KeyWriter from GenKeys
2) modify random to return only 1 generated number using nextInt
3) use a loop inside KeyWriter to call random 500 times and write those values to a file
4) use the KeyWriter class inside your main method
I won't post the actual solution, because it looks like you're doing your homework.

Well, some things aren't correct here, but the weirdest of all is that you made the random() function void.
void random()
Where does x go? You just create a new int, but do nothing with it.
Besides this, there are other problems, as other people have mentioned.
I'd recommend reading about functions in Java, especially about the difference between int and void return types.

Some problems (and comments) I see off the bat:
x is not an instance field and is not stored anywhere, so it cannot be accessible from the child class.
Like others have said, x is being overwritten with each iteration of your for loop.
Why is the MainProg.random() method declared inside the MainProg.main() method?
I don't think inheritance is the way to go unless it is absolutely required for this project. Why not just make an instance of your random class inside the main method of the MainProg class?
If you want to use inheritance, I believe a call to super.random() will be necessary inside the MainProg.random() method. (Please someone confirm this. I'm not 100% sure.)
If it was me I would do something along the lines of this in my GenKeys.random() method:
public int[] random() {
int[] keys = new int[500];
for(int i = 0; i < 500; ++i)
{
keys[i] = (int) (Math.random() * 100);
}
return keys;
}
This code creates and returns an array of 500 keys. NOT in the range of 1-250. See here for that: How do I generate random integers within a specific range in Java?
Hopefully that will get you started on the right track.
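Putting the points from these answers together, a version without inheritance might look like the sketch below. KeyGenerator and KeyFileWriter are illustrative names, not the asker's GenKeys and MainProg, and the sketch assumes the assignment allows composition instead of inheritance.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

/** Hypothetical generator: returns all keys instead of overwriting one local variable. */
final class KeyGenerator {
    private final Random random = new Random();

    /** Returns count random integers, each in the inclusive range [min, max]. */
    int[] generate(int count, int min, int max) {
        int[] keys = new int[count];
        for (int i = 0; i < count; i++) {
            // nextInt(n) yields 0..n-1, so shift by min to land in min..max
            keys[i] = min + random.nextInt(max - min + 1);
        }
        return keys;
    }
}

/** Hypothetical writer: uses the generator's result rather than inheriting from it. */
final class KeyFileWriter {
    /** Writes one key per line; try-with-resources closes the writer even if a write fails. */
    static void write(int[] keys, Path target) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target)) {
            for (int key : keys) {
                out.write(Integer.toString(key));
                out.newLine();
            }
        }
    }
}
```

A main method would then call KeyFileWriter.write(new KeyGenerator().generate(500, 1, 250), Paths.get("file.txt")), with java.nio.file.Paths imported.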

x is a local variable of random(), so you can't access it from outside that method.
You are also trying to generate 500 random numbers between 1 and 250, so change the for loop in the first class:
for (int i = 0; i < 500; i++){
.....
}
