Random point in circle method is not uniformly distributed [duplicate] - java

This question already has answers here:
Generate a random point within a circle (uniformly)
I have the following method in Java:
public static Vector2d random(Circle circle) {
    // this returns a random number between 0 and Math.PI * 2
    double angle = MathUtils.random(0, Math.PI * 2);
    // give the point inside the unit circle
    // this returns a normalized vector from a given angle
    Vector2d point = new Vector2d(angle);
    // however, this is only along the edge
    // now add a random magnitude (because this is a normalized vector, we can just multiply it by the desired magnitude)
    double magnitude = Math.random();
    point = point.multiply(magnitude);
    // now expand this to fit the radius
    point = point.multiply(circle.getRadius());
    // now translate by circleCenter
    return point.add(circle.getCenter());
}
This does return a point in the defined circle, however, when you do this many times and plot the points, you can clearly see most points will be toward the center.
Why is this? I don't see how my math can do this.
Comment if you want me to add an image of the points on the plot, if you think that could be helpful.

Of course: with a uniformly distributed r, every ring gets the same number of points, and a ring with a small r has less room, so its points are closer to each other.
As said by @DBrowne, you can adjust the density with the inverse CDF trick.
Alternatively, you can spare function evaluations by drawing uniform points in [-R,R]x[-R,R] and rejecting the ones such that X²+Y²>R² (about 21% of them). The method generalizes to any shape known by its implicit equation.
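A minimal sketch of the rejection approach described above, assuming plain double coordinates instead of the question's Vector2d and Circle types (the class name RejectionDiskSampler and its method signature are made up for illustration):

```java
import java.util.Random;

final class RejectionDiskSampler {
    /** Returns {x, y}, drawn uniformly from the disk with the given center and radius. */
    static double[] sample(Random rng, double cx, double cy, double radius) {
        while (true) {
            // Uniform point in the square [-R, R] x [-R, R].
            double x = (2.0 * rng.nextDouble() - 1.0) * radius;
            double y = (2.0 * rng.nextDouble() - 1.0) * radius;
            // Keep the point only if it falls inside the circle; otherwise draw again.
            if (x * x + y * y <= radius * radius) {
                return new double[] {cx + x, cy + y};
            }
        }
    }

    public static void main(String[] args) {
        double[] p = sample(new Random(), 0.0, 0.0, 1.0);
        System.out.println(p[0] + ", " + p[1]);
    }
}
```

No square root or trigonometric call is needed, at the price of redrawing a bit more than one point in five.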

Your math is flawed. Here's an explanation of why and the correct solution:
The task is to generate uniformly distributed numbers within a circle
of radius R in the (x,y) plane. At first polar coordinates seem like
a great idea, and the naive solution is to pick a radius r uniformly
distributed in [0, R], and then an angle theta uniformly distributed
in [0, 2pi]. BUT, you end up with an excess of points near the origin
(0, 0)! This is wrong because if we look at a certain angle interval,
say [theta, theta+dtheta], there need to be more points generated
further out (at large r), than close to zero. The radius must not be
picked from a uniform distribution, but one that goes as
pdf_r = (2/R^2)*r
That's easy enough to do by calculating the inverse of the cumulative
distribution, and we get for r:
r = R*sqrt( rand() )
where rand() is a uniform random number in [0, 1]
http://www.anderswallin.net/2009/05/uniform-random-points-in-a-circle-using-polar-coordinates/
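As a sketch of the corrected method, here is the r = R*sqrt(rand()) formula in Java. It assumes plain double coordinates rather than the question's Vector2d and Circle types; the class name PolarDiskSampler is hypothetical:

```java
import java.util.Random;

final class PolarDiskSampler {
    /** Returns {x, y}, drawn uniformly from the disk with the given center and radius. */
    static double[] sample(Random rng, double cx, double cy, double radius) {
        // The angle is uniform in [0, 2*pi).
        double angle = 2.0 * Math.PI * rng.nextDouble();
        // The square root compensates for the area growing linearly with r.
        double r = radius * Math.sqrt(rng.nextDouble());
        return new double[] {cx + r * Math.cos(angle), cy + r * Math.sin(angle)};
    }

    public static void main(String[] args) {
        double[] p = sample(new Random(), 0.0, 0.0, 1.0);
        System.out.println(p[0] + ", " + p[1]);
    }
}
```

A quick sanity check: about a quarter of the points should land within half the radius, because that inner disk covers a quarter of the area.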

Related

Getting a point from beginning coordinates, angle and distance

I want to get a Vector containing a coordinate. I know my beginning coordinates, angle and distance. So far I've tried doing:
Vector2 pos = new Vector2(beginningX, beginningY).add(distance, distance).rotate(angle);
But it doesn't work as I expect it to. When the rotation isn't 0 the coordinates become big, and the ending point isn't where I expect it to be. I know this must be a simple problem, but I just can't solve it.
EDIT:
Tried doing:
Vector2 pos = new Vector2(beginningX, beginningY).add(distance, 0).rotate(angle);
(Adding distance to x only) Still no success.
I'd say you're doing it wrong: you need to rotate the distance vector and add it to the position vector:
Vector2 pos = new Vector2(beginningX, beginningY).add(new Vector2(distance, 0).rotate(angle) );
You might want to read up on vector math but basically it amounts to this (if I correctly understood what you're trying to do):
If you rotate a vector you're always rotating around point 0/0. Thus you'll want to create a vector that covers the distance from 0/0 to your distance on the x-axis:
0---------------->d
Now you rotate that vector by some angle:
      d
     /
    /
   /
  /
 /
0
Then you offset that vector by your starting point, i.e. you add the two vectors (for simplicity I assume your starting point lies on the y-axis):
      d
     /
    /
   /
  /
 /
s
|
|
|
0
You need to rotate only the distance vector, rather than a sum of beginning and distance. Addition is the same in either order (commutative), so you can try this way:
Vector2 pos = new Vector2(distance, 0).rotate(angle).add(beginningX, beginningY);
Advantage: This chained call does not create a temporary Vector2 for the beginning position that would immediately become garbage for the garbage collector. Conserving space and garbage collection time will be important when your code handles millions of vectors.
This is simple vector addition. I'm assuming 2D coordinates, with angles measured counterclockwise from x-axis:
x(new) = x(old) + distance*cos(angle)
y(new) = y(old) + distance*sin(angle)
Be sure that your angles are in radians when you plug them into trig functions.
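The two formulas above can be sketched without any vector class. The class name PolarOffset and the method endpoint are made up for this example, and the angle is assumed to be in radians, measured counterclockwise from the x-axis:

```java
final class PolarOffset {
    /** Returns {x, y} of the point at the given distance and angle (radians) from the start. */
    static double[] endpoint(double startX, double startY, double angleRadians, double distance) {
        return new double[] {
            startX + distance * Math.cos(angleRadians),
            startY + distance * Math.sin(angleRadians)
        };
    }

    public static void main(String[] args) {
        // Convert degrees to radians before calling the trig functions.
        double[] p = endpoint(1.0, 2.0, Math.toRadians(90.0), 5.0);
        // Prints a point very close to (1, 7).
        System.out.println(p[0] + ", " + p[1]);
    }
}
```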

How do I find if a set of coordinates is inside a 2D triangle?

Does anybody know how to find out if a set of coordinates is within a triangle whose vertex coordinates you have? I know how to work out the lengths of the sides, the area and the perimeter, but I have no idea where to begin working out whether other points lie within the triangle.
Any advice would be appreciated
You can create a Polygon object.
Polygon triangle = new Polygon();
Add the vertexes of your triangle with the addPoint(int x, int y) method.
And then, you just need to check if the set of coordinates is inside your triangle using contains(double x, double y) method.
Use the contains method of the Polygon class as documented here.
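A small sketch of the Polygon approach, using java.awt.Polygon. Note that addPoint takes int coordinates, so this assumes integer vertices; the wrapper class and method names are hypothetical:

```java
import java.awt.Polygon;

final class PolygonTriangleDemo {
    /** Builds a triangle from three integer vertices and tests whether (px, py) is inside. */
    static boolean inTriangle(int ax, int ay, int bx, int by, int cx, int cy,
                              double px, double py) {
        Polygon triangle = new Polygon();
        triangle.addPoint(ax, ay);
        triangle.addPoint(bx, by);
        triangle.addPoint(cx, cy);
        return triangle.contains(px, py);
    }

    public static void main(String[] args) {
        System.out.println(inTriangle(0, 0, 10, 0, 0, 10, 2.0, 2.0)); // true
        System.out.println(inTriangle(0, 0, 10, 0, 0, 10, 8.0, 8.0)); // false
    }
}
```

Points that lie exactly on an edge follow the insideness rules documented for java.awt.Shape, so they are best avoided in tests.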
For a solution without using the Polygon-class:
Assume that you are given three points A, B, C, the vertices of your triangle. Let P be the point you want to check. First calculate the vectors representing the edges of your triangle. Let us call them AB, BC, CA. Also calculate the three vectors PA, PB, PC.
Now calculate the cross product of each edge vector with the corresponding vector to P: AB with PA, BC with PB, and CA with PC.
The cross product of the first pair gives you |AB|*|PA|*sin(alpha), where alpha is the angle between AB and PA, multiplied by a unit vector perpendicular to AB and PA. Ignore this vector (for 2D vectors you can imagine it standing perpendicular to your screen), because we are only interested in the sign of the sine.
The angle alpha can take values between 0 and 2*Pi. The sine is 0 exactly at 0 and Pi. For every angle in between, the sine is positive, and for every angle between Pi and 2*Pi it is negative.
So the sign of the sine tells you on which side of AB the point P lies. Let's say P is on the left hand side of AB and the sine is positive.
If the cross products of all three pairs have the same sign, the point P is on the same side (say, the left hand side) of each edge of the triangle. This just means that it has to be inside the triangle.
Of course this method can also be used for calculating whether a point P is in a convex polygon. Be aware that this method only works if the sides of the polygon are directed, i.e. the vertices are traversed consistently in one direction.
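The same-side test can be sketched as follows. One assumption differs from the text: the sketch uses the vector from the edge's start point to P (AP instead of PA), which only flips the sign convention. Points on an edge are counted as inside, and the class name TriangleSideTest is made up:

```java
final class TriangleSideTest {
    /** z-component of (b - a) x (p - a): positive if p is to the left of a->b (y axis pointing up). */
    static double cross(double ax, double ay, double bx, double by, double px, double py) {
        return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
    }

    /** True if P is inside or on the border of triangle ABC, for either vertex order. */
    static boolean contains(double ax, double ay, double bx, double by,
                            double cx, double cy, double px, double py) {
        double d1 = cross(ax, ay, bx, by, px, py);
        double d2 = cross(bx, by, cx, cy, px, py);
        double d3 = cross(cx, cy, ax, ay, px, py);
        boolean hasNegative = d1 < 0 || d2 < 0 || d3 < 0;
        boolean hasPositive = d1 > 0 || d2 > 0 || d3 > 0;
        // Mixed signs mean P is on different sides of different edges, so it is outside.
        return !(hasNegative && hasPositive);
    }

    public static void main(String[] args) {
        System.out.println(contains(0, 0, 10, 0, 0, 10, 2, 2)); // true
        System.out.println(contains(0, 0, 10, 0, 0, 10, 8, 8)); // false
    }
}
```

Checking for mixed signs, rather than for "all positive", makes the test independent of whether the vertices are given clockwise or counterclockwise.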

Gradient Descent in linear regression

I am trying to implement linear regression in java. My hypothesis is theta0 + theta1 * x[i].
I am trying to figure out the value of theta0 and theta1 so that the cost function is minimum.
I am using gradient descent to find out the value -
In the
while(repeat until convergence)
{
calculate theta0 and theta1 simultaneously.
}
what is this repeat until convergence?
I understood that it is the local minimum but what is exact code that I should put in the while loop?
I am very new to machine learning and just began to code basic algos to get better understanding. Any help will be greatly appreciated.
Gradient descent is an iterative approach for minimizing a given function. We start with an initial guess of the solution and take the gradient of the function at that point. We step the solution in the negative direction of the gradient and repeat the process. With a suitable step size, the algorithm will eventually converge where the gradient is zero (which corresponds to a local minimum). So your job is to find the values of theta0 and theta1 that minimize the loss function [for example, least squared error].
The term "converges" means you have reached the local minimum and further iterations do not affect the values of the parameters, i.e. the values of theta0 and theta1 remain constant. Let's see an example. Note: assume it is in the first quadrant for this explanation.
Let's say you have to minimize a function f(x) [the cost function in your case]. For this you need to find the value of x that minimizes f(x). Here is the step-by-step procedure to find the value of x using the gradient descent method:
1. You choose the initial value of x. Let's say it is at point A in the figure.
2. You calculate the gradient of f(x) with respect to x at A.
3. This gives the slope of the function at point A. Since the function is increasing at A, it will yield a positive value.
4. You subtract this positive value from the initial guess of x and update the value of x, i.e. x = x - [some positive value]. This brings x closer to D [i.e. the minimum] and reduces the value of f(x) [see the figure]. Let's say after iteration 1 you reach point B.
5. At point B, you repeat the same process as mentioned in step 4 and reach point C, and finally point D.
6. At point D, since it is a local minimum, when you calculate the gradient you get 0 [or very close to 0]. Now you try to update the value of x, i.e. x = x - [0]. You will get the same x [or a value very close to the previous x]. This condition is known as "convergence".
The above steps are for an increasing slope but are equally valid for a decreasing slope. For example, the gradient at point G is some negative value. When you update x, i.e. x = x - [negative value] = x - [-(some positive value)] = x + some positive value. This increases the value of x and brings x close to point F [or close to the minimum].
There are various approaches to terminating gradient descent. As @mattnedrich said, the two basic approaches are:
Use a fixed number of iterations N; for this the pseudo code will be
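iter = 0
while (iter < N) {
    // alpha is the learning rate (step size); compute both gradients before updating
    theta0 = theta0 - alpha * (gradient with respect to theta0)
    theta1 = theta1 - alpha * (gradient with respect to theta1)
    iter++
}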
Repeat until two consecutive values of theta0 and theta1 are almost identical. The pseudo code is given by @Gerwin in another answer.
Gradient descent is one of the approaches to minimizing the cost function in linear regression. There exists a direct solution too. The normal equation can be used to find the values of theta0 and theta1 in a single step. If X is the input matrix, y is the output vector and theta is the vector of parameters you want to calculate, then for the squared error approach you can find the value of theta in a single step using this matrix equation
theta = inverse(transpose (X)*X)*transpose(X)*y
But as this involves matrix computation (including a matrix inverse), it becomes a more computationally expensive operation than gradient descent when the matrix X is large.
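For the two-parameter hypothesis theta0 + theta1 * x from the question, the matrix equation reduces to the textbook closed form for simple linear regression, so no matrix library is needed. A minimal sketch, with the hypothetical class name SimpleLinearFit, assuming the x values are not all equal:

```java
final class SimpleLinearFit {
    /** Returns {theta0, theta1} minimizing the squared error of theta0 + theta1 * x[i] against y[i]. */
    static double[] fit(double[] x, double[] y) {
        int m = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < m; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= m;
        meanY /= m;
        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < m; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double theta1 = sxy / sxx;
        double theta0 = meanY - theta1 * meanX;
        return new double[] {theta0, theta1};
    }

    public static void main(String[] args) {
        // The points lie exactly on y = 1 + 2x.
        double[] theta = fit(new double[] {1, 2, 3, 4}, new double[] {3, 5, 7, 9});
        System.out.println(theta[0] + ", " + theta[1]); // 1.0, 2.0
    }
}
```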
I hope this may answer your query. If not then let me know.
Gradient descent is an optimization algorithm (minimization, to be exact; there is gradient ascent for maximization too). In the case of linear regression, we minimize the cost function. It belongs to the gradient-based optimization family, and its idea is that moving the parameters along the negative gradient takes the cost down the hill of the cost surface towards the optimum.
In your algorithm, repeat till convergence means repeat until you reach the optimal point on the cost surface/curve, which is detected when the gradient is very close to zero for some iterations. In that case, the algorithm is said to have converged (possibly to a local optimum; gradient descent converges to a local optimum in many cases).
To determine if your algorithm has converged, you can do the following:
calculate gradient
theta = theta - gradientTheta
while(True):
    calculate gradient
    newTheta = theta - gradient
    if gradient is very close to zero and abs(newTheta - theta) is very close to zero:
        break from loop # (The algorithm has converged)
    theta = newTheta
For detail on Linear Regression and Gradient Descent and other optimizations you can follow Andrew Ng's notes : http://cs229.stanford.edu/notes/cs229-notes1.pdf.
I do not know very much about gradient descent, but we learned another way to calculate linear regression from a number of points:
http://en.wikipedia.org/wiki/Simple_linear_regression#Fitting_the_regression_line
But if you really want to add the while loop, I recommend the following:
Eventually, theta0 and theta1 will converge to a certain value. This means that, no matter how often you apply the formula, it will always stay in the vicinity of that value. (http://en.wikipedia.org/wiki/(%CE%B5,_%CE%B4)-definition_of_limit).
So applying the code once more will not change theta0 and theta1 very much, only by a very small amount. Or: the difference between theta0 and the next theta0 (and likewise for theta1) is smaller than a certain amount.
This brings us to the following code:
double little = 1E-10;
double $theta0, $theta1; // values from the previous iteration
do {
    $theta0 = theta0;
    $theta1 = theta1;
    // now calculate the new theta0, theta1 simultaneously.
} while (Math.abs(theta0 - $theta0) + Math.abs(theta1 - $theta1) > little);
You need to do the following inside of the while loop:
while (some condition is not met) {
    // 1) Compute the gradient using theta0 and theta1
    // 2) Use the gradient to compute newTheta0 and newTheta1 values
    // 3) Set theta0 = newTheta0 and theta1 = newTheta1
}
You can use several different criteria for terminating the gradient descent search. For example, you can run gradient descent
For a fixed number of iterations
Until the gradient value at (theta0, theta1) is sufficiently close to zero (indicating a minimum)
Each iteration you should get closer and closer to the optimal solution. That is, if you compute the error (how well your theta0, theta1 model predicts your data) for each iteration it should get smaller and smaller.
To learn more about how to actually write this code, you can refer to:
https://www.youtube.com/watch?v=CGHDsi_l8F4&list=PLnnr1O8OWc6ajN_fNcSUz9k5gF_E9huF0
https://www.youtube.com/watch?v=kjes46vP5m8&list=PLnnr1O8OWc6asSH0wOMn5JjgSlqNiK2C4
Initially assign theta[0] and theta[1] some arbitrary values and then calculate the value of your hypothesis (theta[0] + theta[1]*x1), and then calculate new values of theta[0] and theta[1] with the gradient descent algorithm. By the algorithm:
theta[0](new) = theta[0](old) - alpha*[partial derivative of J(theta[0],theta[1]) w.r.t. theta[0]]
theta[1](new) = theta[1](old) - alpha*[partial derivative of J(theta[0],theta[1]) w.r.t. theta[1]]
where alpha: learning rate
J(theta[0],theta[1]) = cost function
You'll get new values of theta[0] and theta[1]. You then need to calculate the new value of your hypothesis again. Repeat this process of calculating theta[0] and theta[1] until the difference between theta[i](new) and theta[i](old) is less than some small threshold, for example 0.001
For details refer: http://cs229.stanford.edu/notes/cs229-notes1.pdf
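Putting the answers together, a "repeat until convergence" loop could look like the sketch below. It assumes the mean squared error cost J = 1/(2m) * sum((theta0 + theta1*x[i] - y[i])^2); the class name, the parameter names and the iteration cap are choices made for this example, not part of any answer above:

```java
final class GradientDescentLine {
    /** Returns {theta0, theta1} fitted by batch gradient descent. */
    static double[] fit(double[] x, double[] y, double alpha, double tolerance, int maxIterations) {
        double theta0 = 0.0;
        double theta1 = 0.0;
        int m = x.length;
        for (int iter = 0; iter < maxIterations; iter++) {
            // Partial derivatives of J with respect to theta0 and theta1.
            double grad0 = 0.0;
            double grad1 = 0.0;
            for (int i = 0; i < m; i++) {
                double error = theta0 + theta1 * x[i] - y[i];
                grad0 += error;
                grad1 += error * x[i];
            }
            grad0 /= m;
            grad1 /= m;
            // Simultaneous update: both new values are computed from the old ones.
            double newTheta0 = theta0 - alpha * grad0;
            double newTheta1 = theta1 - alpha * grad1;
            double change = Math.abs(newTheta0 - theta0) + Math.abs(newTheta1 - theta1);
            theta0 = newTheta0;
            theta1 = newTheta1;
            // Converged: another step no longer moves the parameters noticeably.
            if (change < tolerance) {
                break;
            }
        }
        return new double[] {theta0, theta1};
    }

    public static void main(String[] args) {
        double[] theta = fit(new double[] {1, 2, 3, 4}, new double[] {3, 5, 7, 9},
                0.05, 1e-12, 1_000_000);
        // The data lie on y = 1 + 2x, so the result should be close to (1, 2).
        System.out.println(theta[0] + ", " + theta[1]);
    }
}
```

The iteration cap guards against a learning rate that is too large, in which case the parameter change never drops below the tolerance.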

Determine if vertex is convex. Help understanding

I am studying the following code.
boolean convex(double x1, double y1, double x2, double y2,
        double x3, double y3)
{
    if (area(x1, y1, x2, y2, x3, y3) < 0)
        return true;
    else
        return false;
}
/* area: determines area of triangle formed by three points
 */
double area(double x1, double y1, double x2, double y2,
        double x3, double y3)
{
    double areaSum = 0;
    areaSum += x1 * (y3 - y2);
    areaSum += x2 * (y1 - y3);
    areaSum += x3 * (y2 - y1);
    /* for actual area, we need to multiply areaSum by 0.5, but we are
     * only interested in the sign of the area (+/-)
     */
    return areaSum;
}
I do not understand the concept of the area being negative.
Shouldn't area always be positive? Maybe I am lacking some understanding of the terms here.
I tried to contact the original writer but this code is about 8 years old and I have no way to contact the original writer.
This method of determining if the given vertex x2y2 is convex seems really mobile. I really want to understand it.
Any direction or reference to help me understand this piece of code will be appreciated greatly.
Source code : http://cgm.cs.mcgill.ca/~godfried/teaching/cg-projects/97/Ian/applets/BruteForceEarCut.java
The algorithm uses a very simple formula with which you can compute twice the area of a triangle.
This formula has two advantages:
it doesn't require any division
it returns a negative area if the points are in counterclockwise order.
In the code sample, the actual value of the area doesn't matter; only the sign of the result is needed.
The formula can also be used to check if three points are collinear (the result is then zero).
You can find more information about this formula on this site : http://www.mathopenref.com/coordtrianglearea.html
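To see the sign behaviour concretely, here is the question's formula wrapped in a small class (the class name SignedAreaDemo is made up). With the y axis pointing up, counterclockwise points give a negative value with this particular formula:

```java
final class SignedAreaDemo {
    /** Twice the signed area of the triangle, using the same formula as the question's code. */
    static double area(double x1, double y1, double x2, double y2, double x3, double y3) {
        double areaSum = 0;
        areaSum += x1 * (y3 - y2);
        areaSum += x2 * (y1 - y3);
        areaSum += x3 * (y2 - y1);
        return areaSum;
    }

    public static void main(String[] args) {
        System.out.println(area(0, 0, 1, 0, 0, 1)); // counterclockwise: -1.0
        System.out.println(area(0, 0, 0, 1, 1, 0)); // clockwise: 1.0
        System.out.println(area(0, 0, 1, 1, 2, 2)); // collinear: 0.0
    }
}
```

Swapping any two of the three points flips the sign, which is why the sign encodes the direction in which the points are visited.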
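This algorithm is basically using the cross product of two vectors (two sides of the triangle) and interpreting the sign of the result. This is the core of the Gift Wrapping Algorithm used to find convex hulls.
In 2D, a cross b reduces to the single number |a|*|b|*sin(theta), so the sign of the result follows the sign of sin(theta), i.e. whether the path through the three points turns left or right at the middle vertex, and that decides whether the vertex is convex. Per a wiki article on cross products...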
Because the magnitude of the cross product goes by the sine of the
angle between its arguments, the cross product can be thought of as a
measure of ‘perpendicularity’ in the same way that the dot product is
a measure of ‘parallelism’. Given two unit vectors, their cross
product has a magnitude of 1 if the two are perpendicular and a
magnitude of zero if the two are parallel. The opposite is true for
the dot product of two unit vectors.
The use of "area" is slightly misleading on part of the original coder in my opinion.
You know about how integrals work, right? One way to think of integrals is in terms of the area under the integrated curve. For functions that are strictly positive, that definition works great, but when the function becomes negative at some point, there is a problem because then you have to take the absolute value, right?
That is not always so, actually, and it can be quite useful in some contexts to leave the curve negative. Think back to what was said earlier: the area under the curve. All that space between negative infinity and our function. Clearly, that is absurd, right? A better way to think of it is as the difference between the area under the curve, and the area under the x axis. That way, when the function is positive, our curve is gaining more area, and when it is negative, it is gaining less than the x axis.
The same thing applies to plane figures that are not strict functions. In order to really determine this, we have to define which direction our edge is going as it travels around the figure. We can define it so that all the area on the right of our curve is inside the region, and all the area to the left is outside (or we can define it the other way around, but I will use the first way).
So our figure includes all the area from there to the edge at infinity of the plane that is directly to our right. Regions enclosed clockwise really include their conventional interior twice. Regions enclosed counterclockwise don't include their conventional interior at all. The area, then, is the difference between our region and the whole plane.
The application of this to concavity is fairly simple, if you understand what it actually means to be concave or convex. The triangle you are given is concave if it is cutting an area out from the plane, and it is convex if you are adding extra area to it. That is the exact same thing we were doing to determine our area, so positive area corresponds to a convex shape, and negative area corresponds to a concave shape.
You can also do other weird things with this conceptual model. For instance, you can turn a region 'inside out' by reversing the edge direction.
I'm sorry if this has been a little hard to follow, but this is the actual way I understand negative area.

Generate random lat lon

I need it to stress test some location based web service. The input is 4 pairs of lat/lon defining a bounding rectangle or a set of points defining a polygon.
Are there any libraries/algorithms for generating random point on a map? (Python/java)
In java you can use Math.random()
For example, if you want to generate a random number between 1 and 10:
int randomNumGenerated = (int)(Math.random()*10) + 1;
You can apply this to the issue you are trying to solve easily.
Take a look at this question, which deals with generating points inside an arbitrary 4-point convex polygon.
Random points inside a 4-sided Polygon
This article, on sphere point picking explains far better than I could why the naive approach of generating 2 random numbers on the interval [0,1) will lead to a poor distribution of points across the surface of the sphere. That may or may not be a concern of OP.
However, it ought to be of concern to OP that randomly generating a set of 4 points on the surface of the Earth might necessitate some tricky programming. Consider the case of the 'polygon' defined by the points (lat/long, all in degrees) (+5,90),(+5,-90),(-5,-90),(-5,90). Does the point (0,0) lie inside this polygon or outside it ? What about the point (0,180) ? It's very easy to generate such ambiguous polygons -- the surface of a sphere is not well modelled by the Euclidean plane.
I'd take a completely different approach -- generate 1 point at random, then generate lat and long offsets. This will give you a quasi-rectangular patch on the surface, and you can tune the generation of the offsets to avoid ambiguous polygons. If you want to generate polygons which are not quasi-rectangular, generate a series of points and angles which, when combined, define a polygon which suits your needs.
Simple: Generate two random numbers, one for latitude and one for longitude, inside the bounding rectangle of the map, for each point.
double longitude = Math.random() * Math.PI * 2;
or use
public static LatLng random(Random r) {
    return new LatLng((r.nextDouble() * -180.0) + 90.0,
            (r.nextDouble() * -360.0) + 180.0);
}
Why wouldn't you just generate the latitude as a random number between -90 and 90, and the longitude as another random number between -180 and 180?
Then you have a point. You can then generate as many points as you need to make a polygon.
You can generate a random number between a and b with something like:
rnum = a + rnd() * (b-a); // where rnd() gives a number from 0 to 1
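Applied to a bounding rectangle, the a + rnd() * (b-a) formula could be sketched like this. The class and method names are hypothetical, coordinates are in degrees, and boxes that cross the 180° meridian are not handled. The second method is an optional variant that spreads points evenly over the sphere's surface by drawing sin(latitude) uniformly, which addresses the distribution concern raised in the sphere point picking answer:

```java
import java.util.Random;

final class RandomLatLon {
    /** Uniform random number between a and b. */
    static double between(Random rng, double a, double b) {
        return a + rng.nextDouble() * (b - a);
    }

    /** Returns {lat, lon}, uniform in degrees inside the bounding box. */
    static double[] inBox(Random rng, double minLat, double maxLat, double minLon, double maxLon) {
        return new double[] {between(rng, minLat, maxLat), between(rng, minLon, maxLon)};
    }

    /** Returns {lat, lon}, uniform by surface area on a sphere inside the bounding box. */
    static double[] inBoxAreaUniform(Random rng, double minLat, double maxLat,
                                     double minLon, double maxLon) {
        double sinMin = Math.sin(Math.toRadians(minLat));
        double sinMax = Math.sin(Math.toRadians(maxLat));
        // Clamp to [-1, 1] so rounding can never push asin outside its domain.
        double s = Math.max(-1.0, Math.min(1.0, between(rng, sinMin, sinMax)));
        double lat = Math.toDegrees(Math.asin(s));
        return new double[] {lat, between(rng, minLon, maxLon)};
    }

    public static void main(String[] args) {
        double[] p = inBox(new Random(), 10.0, 20.0, -5.0, 5.0);
        System.out.println(p[0] + ", " + p[1]);
    }
}
```

For a small box far from the poles the two methods give nearly the same distribution; the difference only matters for boxes that span a large range of latitudes.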
