Closest Point - Flaw in approach


I am solving Closest Point Problem from here
Problem Statement :
We are given an array of n points in the plane, and the problem is to find out the closest pair of points in the array.
INPUT : Input will be two arrays X and Y, X[] stores x coordinates and Y[] stores y coordinates.
OUTPUT : Smallest distance.
My Algorithm :
Note : Approach works only for positive coordinates.
1. Find the distance of every point from (0,0) and store it in a distance array.
2. Sort the distance array calculated in the previous step.
3. Find the smallest distance by calculating the difference between consecutive values in the distance array.
Code :
import java.util.Arrays;
public class ClosestPoint {
int x[]={2,12,40,5,12,3},y[]={3,30,50,1,10,4}; // x and y coordinates
float distance[] = {0,0,0,0,0,0}; // distance
void calculateDis(){
for(int i=0;i<x.length;i++){
int dis=(x[i]*x[i] + y[i]*y[i]);
distance[i]= (float)Math.sqrt(dis);
}
}
float findClosest() {
float closest = Float.MAX_VALUE;
for(int i=0;i<distance.length-1;i++) {
float pairDis= distance[i+1]-distance[i];
if(closest>pairDis) {
closest =pairDis;
}
}
return closest;
}
public static void main(String arg[]) {
ClosestPoint p =new ClosestPoint();
p.calculateDis(); // calculate distance from 0,0.
Arrays.sort(p.distance);
System.out.println(p.findClosest());
}
}
Correct answer :
1.4
My Answer :
0.099
I am not getting the correct answer. Can someone point out the flaw in my approach?
Thanks.

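The actual problem is in the logic. You are calculating each point's distance from the origin and comparing those distances. The difference between two such distances says nothing about how far apart the two points are, so this can give the wrong answer. In your data, 0.099 is the difference between the origin distances of (3,4) and (5,1), which are 5.0 and 5.099, yet those two points are about 3.6 apart.
Consider the points (3,4) and (4,3). Both are at the same distance from the origin: 5. Your algorithm sorts the distances and takes the minimum difference between consecutive values, so here it returns 0 (the sorted array is 5.0, 5.0), but the actual answer is √2 ≈ 1.414.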
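For comparison, here is a minimal brute-force sketch that measures the distance between the points of each pair directly instead of going through the origin. The class and method names are hypothetical, and the O(n^2) double loop is only meant to show a correct baseline, not the faster divide-and-conquer algorithm.

```java
// Brute-force closest pair: compares every pair of points directly, O(n^2).
final class ClosestPairBruteForce {

    /** Returns the smallest Euclidean distance between any two of the given points. */
    static double closestDistance(int[] x, int[] y) {
        double bestSquared = Double.POSITIVE_INFINITY;
        for (int i = 0; i < x.length; i++) {
            for (int j = i + 1; j < x.length; j++) {
                double dx = x[i] - x[j];
                double dy = y[i] - y[j];
                // Compare squared distances; take the square root once at the end.
                bestSquared = Math.min(bestSquared, dx * dx + dy * dy);
            }
        }
        return Math.sqrt(bestSquared);
    }

    public static void main(String[] args) {
        int[] x = {2, 12, 40, 5, 12, 3};
        int[] y = {3, 30, 50, 1, 10, 4};
        // The closest pair in the question's data is (2,3) and (3,4).
        System.out.println(closestDistance(x, y));
    }
}
```

On the question's data this should report roughly 1.41, which matches the expected answer of 1.4.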

Related

How to calculate distance between two points in n dimensions with Java?

I'd like to write a function that can calculate the Euclidean distance between two points, no matter how many coordinates the points have (assuming both points have the same number of coordinates)?
For two dimensions its of course
public static Double getDistance(Point2D p, Point2D ref) {
double dXSquared = Math.pow(p.getX()-ref.getX(), 2);
double dYSquared = Math.pow(p.getY()-ref.getY(), 2);
return Double.valueOf(Math.sqrt(dXSquared + dYSquared));
}
Is there an elegant way of doing this without having to write workarounds to figure out how many coordinates a point has? Like direct vector operations as in numpy would be nice.

Shuffling through all the points in a 3-dimensional space without storing all possible coordinates

I'm programming a 3-dimensional cellular automaton. The way I'm iterating through it right now in each generation is:
1. Create a list of all possible coordinates in the 3D space.
2. Shuffle the list.
3. Iterate through the list until all coordinates have been visited.
4. Go to step 2.
Here's the code:
I've a simple 3 integer struct
public class Coordinate
{
public int x;
public int y;
public int z;
public Coordinate(int x, int y, int z) {this.x = x; this.y = y; this.z = z;}
}
then at some point I do this:
List<Coordinate> all_coordinates = new ArrayList<>();
[...]
for(int z=0 ; z<length ; z++)
{
for(int x=0 ; x<diameter ; x++)
{
for(int y=0 ; y<diameter ; y++)
{
all_coordinates.add(new Coordinate(x,y,z));
}
}
}
and then in the main algorithm I do this:
private void next_generation()
{
Collections.shuffle(all_coordinates);
for (int i=0 ; i < all_coordinates.size() ; i++)
{
[...]
}
}
The problem is, once the automaton gets too large, the list containing all possible points gets huge. I need a way to shuffle through all the points without having to actually store all the possible points in memory. How should I go about this?

One way to do this is to start by mapping your three dimensional coordinates into a single dimension. Let's say that your three dimensions' sizes are X, Y, and Z. So your x coordinate goes from 0 to X-1, etc. The full size of your space is X*Y*Z. We'll call that S.
To map any coordinate in 3-space to 1-space, you use the formula (x*Y*Z) + (y*Z) + z.
Of course, once you generate the numbers, you have to convert back to 3-space. That's a simple matter of reversing the conversion above. Assuming that coord is the 1-space coordinate and that the divisions are integer divisions:
x = coord/(Y*Z)
coord = coord % (Y*Z)
y = coord/Z
z = coord % Z
Now, with a single dimension to work with, you've simplified the problem to one of generating all the numbers from 0 to S-1 in pseudo-random order, without duplication.
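The mapping between 3-space and 1-space can be sketched as follows. This is a row-major layout in which x is scaled by Y*Z and y by Z, so every coordinate gets a distinct index in 0..S-1. The class name and the sizeY/sizeZ parameter names are hypothetical.

```java
// Row-major flattening of (x, y, z) into one index and back.
final class GridIndex {

    /** Flattens (x, y, z) into a single index in [0, X*Y*Z). */
    static long toIndex(int x, int y, int z, int sizeY, int sizeZ) {
        return ((long) x * sizeY + y) * sizeZ + z;
    }

    /** Inverse of toIndex: returns {x, y, z} for a flattened index. */
    static int[] fromIndex(long index, int sizeY, int sizeZ) {
        int z = (int) (index % sizeZ);
        long rest = index / sizeZ;
        int y = (int) (rest % sizeY);
        int x = (int) (rest / sizeY);
        return new int[] {x, y, z};
    }
}
```

Only the sizes of the y and z dimensions are needed for the conversion; the size of the x dimension only bounds the largest index.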
I know of at least three ways to do this. The simplest uses a multiplicative inverse, as I showed here: Given a number, produce another random number that is the same every time and distinct from all other results.
When you've generated all of the numbers, you "re-shuffle" the list by picking different x and m values for the multiplicative inverse calculations.
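As a rough illustration of the idea, the sketch below walks the range with a multiplier that is coprime to the range size, which guarantees that every index is produced exactly once. It is an assumption about the general technique, not the code from the linked answer. The names are hypothetical, the ordering is only weakly random, and the arithmetic assumes that i * multiplier fits in a long.

```java
// Visits 0..size-1 in a scrambled order using constant state (a multiplier and an offset).
final class CoprimeWalk {

    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    /**
     * Maps i in [0, size) to a distinct value in [0, size).
     * The mapping is a permutation only when gcd(multiplier, size) == 1.
     */
    static long permute(long i, long multiplier, long offset, long size) {
        return Math.floorMod(i * multiplier + offset, size);
    }

    public static void main(String[] args) {
        long size = 60;        // S = X*Y*Z for a 3x4x5 grid
        long multiplier = 17;  // coprime to 60
        for (long i = 0; i < size; i++) {
            System.out.print(permute(i, multiplier, 5, size) + " ");
        }
    }
}
```

Re-shuffling then amounts to choosing another multiplier that is coprime to the size, and another offset.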
Another way of creating a non-repeating pseudo-random sequence in a particular range is with a linear feedback shift register. I don't have a ready example, but I have used them. To change the order, (i.e. re-shuffle), you re-initialize the generator with different parameters.
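Since no example is given, here is a small sketch of the LFSR idea under stated assumptions: a 16-bit Galois LFSR with the commonly cited tap mask 0xB400, which cycles through all 65535 nonzero 16-bit states. Ranges smaller than that are handled by skipping states that fall outside the range. The class and method names are hypothetical, and a larger space would need a wider register with suitable taps.

```java
// 16-bit Galois LFSR used to visit 0..size-1 once each without storing a shuffled list.
final class Lfsr16 {

    /** Advances the register by one step; the state must be in 1..65535. */
    static int next(int state) {
        int lsb = state & 1;
        state >>>= 1;
        if (lsb != 0) {
            state ^= 0xB400; // taps for a maximal-length 16-bit sequence
        }
        return state;
    }

    /**
     * Returns every index in [0, size) exactly once, in LFSR order.
     * Requires size <= 65535 and seed in 1..65535. The array is built here only
     * to make the order easy to inspect; a real loop could consume values on the fly.
     */
    static int[] order(int size, int seed) {
        int[] out = new int[size];
        int state = seed;
        int count = 0;
        for (int step = 0; step < 65535 && count < size; step++) {
            if (state - 1 < size) {      // states run over 1..65535, so shift down by one
                out[count++] = state - 1;
            }
            state = next(state);
        }
        return out;
    }
}
```

A different seed starts the same cycle at a different position; a different tap mask gives a different cycle.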
You might also be interested in the answers to this question: Unique (non-repeating) random numbers in O(1)?. That user was only looking for 1,000 numbers, so he could use a table, and the accepted answer reflects that. Other answers cover the LFSR, and a Linear congruential generator that is designed with a specific period.
None of the methods I mentioned require that you maintain much state. The amount of state you need to maintain is constant, whether your range is 20 or 20,000,000.
Note that all of the methods I mentioned above give pseudo-random sequences. They will not be truly random, but they'll likely be close enough to random to fit your needs.

How to compare two curves (array of points)

I am having trouble finding a method to compare two trajectories (curves).
The first one, the original, contains points (x,y).
The second one can be offset, scaled up or down, and rotated. It is also an array of points (x,y).
My first method was to find, for each point, the smallest distance to the other trajectory, sum these distances, and divide by the number of points. The result tells me the average error per point:
http://www.mathopenref.com/coorddist.html
I also found this method:
https://help.scilab.org/docs/6.0.0/en_US/fminsearch.html
But I can't figure out how to use it.
I would like to compare both trajectories, but my result has to account for rotation, or at least for an offset at the beginning.
My current method calculates the error per point (distance):
1. Get a coordinate (x,y) from the second trajectory.
2. In a loop, find the minimum distance between the (x,y) from step 1 and the points of the original trajectory.
3. Add the smallest distance found in step 2 to a running sum.
4. Divide the sum of the smallest distances by the number of points in the second trajectory.
My result describes the average error (distance) per point compared with the original trajectory.
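The steps above can be sketched in Java as follows. Points are represented as double[]{x, y} arrays, and the class and method names are hypothetical. The sketch only reproduces the described metric and does nothing about rotation, scale, or offset.

```java
// Average nearest-point distance between two trajectories.
final class TrajectoryError {

    /** Mean, over the points of candidate, of the distance to the nearest point of original. */
    static double averageNearestDistance(double[][] original, double[][] candidate) {
        double sum = 0.0;
        for (double[] p : candidate) {
            double best = Double.POSITIVE_INFINITY;
            for (double[] q : original) {
                best = Math.min(best, Math.hypot(p[0] - q[0], p[1] - q[1]));
            }
            sum += best;
        }
        return sum / candidate.length;
    }
}
```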
But I cannot figure out how to handle a trajectory that is rotated, scaled, or shifted.
Please look at my example trajectories:
http://pokazywarka.pl/trajectory/
http://pokazywarka.pl/trajectory2/

So you need to compare the shapes of 2 curves in a way that is invariant to rotation, translation and scale.
Solution
Let's assume 2 sine waves for testing. Both are rotated and scaled, but with the same aspect ratio, and one has added noise. I generated them in C++ like this:
struct _pnt2D
{
double x,y;
// inline
_pnt2D() {}
_pnt2D(_pnt2D& a) { *this=a; }
~_pnt2D() {}
_pnt2D* operator = (const _pnt2D *a) { *this=*a; return this; }
//_pnt2D* operator = (const _pnt2D &a) { ...copy... return this; }
};
List<_pnt2D> curve0,curve1; // curves points
_pnt2D p0,u0,v0,p1,u1,v1; // curves OBBs
const double deg=M_PI/180.0;
const double rad=180.0/M_PI;
void rotate2D(double alfa,double x0,double y0,double &x,double &y)
{
double a=x-x0,b=y-y0,c,s;
c=cos(alfa);
s=sin(alfa);
x=x0+a*c-b*s;
y=y0+a*s+b*c;
}
// this code is the init stuff:
int i;
double x,y,a;
_pnt2D p,*pp;
Randomize();
for (x=0;x<2.0*M_PI;x+=0.01)
{
y=sin(x);
p.x= 50.0+(100.0*x);
p.y=180.0-( 50.0*y);
rotate2D(+15.0*deg,200,180,p.x,p.y);
curve0.add(p);
p.x=150.0+( 50.0*x);
p.y=200.0-( 25.0*y)+5.0*Random();
rotate2D(-25.0*deg,250,100,p.x,p.y);
curve1.add(p);
}
OBB (oriented bounding box)
Compute the OBB of each curve, which gives its rotation angle and position, then rotate one of the curves so that both start at the same position and have the same orientation.
If the OBB sizes are too different then the curves are different.
For the above example it yields this result:
Each OBB is defined by start point P and basis vectors U,V where |U|>=|V| and z coordinate of U x V is positive. That will ensure the same winding for all OBBs. It can be done in OBBox_compute by adding this to the end:
// |U|>=|V|
if ((u.x*u.x)+(u.y*u.y)<(v.x*v.x)+(v.y*v.y)) { _pnt2D p; p=u; u=v; v=p; }
// (U x V).z > 0
if ((u.x*v.y)-(u.y*v.x)<0.0)
{
p0.x+=v.x;
p0.y+=v.y;
v.x=-v.x;
v.y=-v.y;
}
So curve0 has p0,u0,v0 and curve1 has p1,u1,v1.
Now we want to rescale, translate and rotate curve1 to match curve0. It can be done like this:
// compute OBB
OBBox_compute(p0,u0,v0,curve0.dat,curve0.num);
OBBox_compute(p1,u1,v1,curve1.dat,curve1.num);
// difference angle = - acos((U0.U1)/(|U0|.|U1|))
a=-acos(((u0.x*u1.x)+(u0.y*u1.y))/(sqrt((u0.x*u0.x)+(u0.y*u0.y))*sqrt((u1.x*u1.x)+(u1.y*u1.y))));
// rotate curve1
for (pp=curve1.dat,i=0;i<curve1.num;i++,pp++)
rotate2D(a,p1.x,p1.y,pp->x,pp->y);
// rotate OBB1
rotate2D(a,0.0,0.0,u1.x,u1.y);
rotate2D(a,0.0,0.0,v1.x,v1.y);
// translation difference = P0-P1
x=p0.x-p1.x;
y=p0.y-p1.y;
// translate curve1
for (pp=curve1.dat,i=0;i<curve1.num;i++,pp++)
{
pp->x+=x;
pp->y+=y;
}
// translate OBB1
p1.x+=x;
p1.y+=y;
// scale difference = |U0|/|U1|
x=sqrt((u0.x*u0.x)+(u0.y*u0.y))/sqrt((u1.x*u1.x)+(u1.y*u1.y));
// scale curve1
for (pp=curve1.dat,i=0;i<curve1.num;i++,pp++)
{
pp->x=((pp->x-p0.x)*x)+p0.x;
pp->y=((pp->y-p0.y)*x)+p0.y;
}
// scale OBB1
u1.x*=x;
u1.y*=x;
v1.x*=x;
v1.y*=x;
You can use the approach from "Understanding 4x4 homogenous transform matrices" to do all this in one step. Here is the result:
sampling
In case of non-uniform or very different point density between the curves, or between parts of one curve, you should re-sample your curves to a common point density. You can use linear or polynomial interpolation for this. You also do not need to store the new sampling in memory; instead you could build a function that returns the point of each curve parametrized by arc-length from the start.
point curve0(double distance);
point curve1(double distance);
comparison
Now you can subtract the 2 curves and sum up the absolute values of the differences. Then divide the sum by the curve length and threshold the result.
double sum=0.0,l;
for (l=0.0;l<=bigger_curve_length;l+=step)
sum+=fabs(curve0(l)-curve1(l)); // |difference| = distance between the two sampled points
sum/=bigger_curve_length;
if (sum>threshold) { /* curves are different */ }
else { /* curves match */ }
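The sampling and comparison steps can be sketched in Java as follows, assuming both curves have already been aligned and are stored as polylines of double[]{x, y} points. All names are hypothetical. Two choices differ from the pseudocode: the arc-length lookup uses plain linear interpolation, and the sum is divided by the number of samples rather than by the curve length, so the result is a mean distance that does not depend on the step.

```java
// Arc-length sampling of polylines and a mean point-to-point difference.
final class CurveCompare {

    /** Total length of a polyline. */
    static double length(double[][] curve) {
        double total = 0.0;
        for (int i = 1; i < curve.length; i++) {
            total += Math.hypot(curve[i][0] - curve[i - 1][0], curve[i][1] - curve[i - 1][1]);
        }
        return total;
    }

    /** Point at arc-length s from the start of the polyline, clamped to its end points. */
    static double[] pointAt(double[][] curve, double s) {
        if (s <= 0) {
            return curve[0].clone();
        }
        for (int i = 1; i < curve.length; i++) {
            double dx = curve[i][0] - curve[i - 1][0];
            double dy = curve[i][1] - curve[i - 1][1];
            double seg = Math.hypot(dx, dy);
            if (seg > 0 && s <= seg) {
                double t = s / seg; // fraction of the current segment
                return new double[] {curve[i - 1][0] + t * dx, curve[i - 1][1] + t * dy};
            }
            s -= seg;
        }
        return curve[curve.length - 1].clone();
    }

    /** Mean distance between corresponding points, sampled every step along the longer curve. */
    static double meanDifference(double[][] a, double[][] b, double step) {
        double longer = Math.max(length(a), length(b));
        double sum = 0.0;
        int samples = 0;
        for (double l = 0.0; l <= longer; l += step) {
            double[] p = pointAt(a, l);
            double[] q = pointAt(b, l);
            sum += Math.hypot(p[0] - q[0], p[1] - q[1]);
            samples++;
        }
        return sum / samples;
    }
}
```

The returned value would then be compared against a threshold chosen for the data at hand.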
You should also try this with a +180deg rotation, as the orientation difference obtained from the OBB covers only half of the true range.
Here are a few related QAs:
compare shapes
How can i produce multi point linear interpolation?

Using random function in selecting an object if two same distance values

I have an ArrayList unsolvedOutlets containing object Outlet that has attributes longitude and latitude.
Using the longitude and latitude of the Outlet objects in the ArrayList unsolvedOutlets, I need to find the smallest distance in that list using the distance formula SQRT((X2 - X1)^2 + (Y2 - Y1)^2), where (X1, Y1) is given. I use Collections.min(list) to find the smallest distance.
My problem is if there are two or more values with the same smallest distance, I'd have to randomly select one from them.
Code:
ArrayList<Double> distances = new ArrayList<Double>();
Double smallestDistance = 0.0;
for (int i = 0; i < unsolvedOutlets.size(); i++) {
distances.add(Math.sqrt(
(unsolvedOutlets.get(i).getLatitude() - currSolved.getLatitude())*
(unsolvedOutlets.get(i).getLatitude() - currSolved.getLatitude())+
(unsolvedOutlets.get(i).getLongitude() - currSolved.getLongitude())*
(unsolvedOutlets.get(i).getLongitude() - currSolved.getLongitude())));
distances.add(0.0); //added this to test
distances.add(0.0); //added this to test
smallestDistance = Collections.min(distances);
System.out.println(smallestDistance);
}
The console prints 0.0, but it won't stop. Is there a way to know whether multiple values share the same smallest value? Then I'd incorporate the Random function. Did that make sense? lol but if anyone has the logic for that, it would be really helpful!!
Thank you!

Keep track of the indices with min distance in your loop and after the loop choose one at random:
Random random = ...
...
List<Integer> minDistanceIndices = new ArrayList<>();
double smallestDistance = Double.POSITIVE_INFINITY; // start above any real distance so the first comparison succeeds
for (int i = 0; i < unsolvedOutlets.size(); i++) {
double newDistance = Math.sqrt(
(unsolvedOutlets.get(i).getLatitude() - currSolved.getLatitude())*
(unsolvedOutlets.get(i).getLatitude() - currSolved.getLatitude())+
(unsolvedOutlets.get(i).getLongitude() - currSolved.getLongitude())*
(unsolvedOutlets.get(i).getLongitude() - currSolved.getLongitude()));
distances.add(newDistance);
if (newDistance < smallestDistance) {
minDistanceIndices.clear();
minDistanceIndices.add(i);
smallestDistance = newDistance;
} else if (newDistance == smallestDistance) {
minDistanceIndices.add(i);
}
}
if (!unsolvedOutlets.isEmpty()) {
int index = minDistanceIndices.get(random.nextInt(minDistanceIndices.size()));
Object chosenOutlet = unsolvedOutlets.get(index);
System.out.println("chosen outlet: "+ chosenOutlet);
}
As Jon Skeet mentioned you don't need to take the square root to compare the distances.
Also if you want to use distances on a sphere your formula is wrong:
With your formula you'll get the same distance for (0° N, 180° E) to (0° N, 0° E) as for (90° N, 180° E) to (90° N, 0° E), but while you need to travel around half the earth to travel from the first to the second, the last 2 coordinates both denote the north pole.
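If distances on a sphere are what is needed, one common choice is the haversine formula. The sketch below is an illustration of that formula and is not taken from the answer. The class name is hypothetical, and the Earth radius of 6371 km is an approximate mean value.

```java
// Great-circle distance between two latitude/longitude pairs given in degrees.
final class GreatCircle {

    static final double EARTH_RADIUS_KM = 6371.0; // approximate mean radius (assumption)

    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = phi2 - phi1;
        double dLambda = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                + Math.cos(phi1) * Math.cos(phi2) * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        // Clamp to 1.0 to guard against rounding slightly above the valid asin range.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}
```

With this formula the two north-pole coordinates from the example come out at essentially zero distance, while the two equator coordinates are half the Earth's circumference apart.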

Note: I believe fabian's solution is superior to this, but I've kept it around to demonstrate that there are many different ways of implementing this...
I would probably:
Create a new type which contained the distance from the outlet as well as the outlet (or just the square of the distance), or use a generic Pair type for the same purpose
Map (using Stream.map) the original list to a list of these pairs
Order by the distance or square-of-distance
Look through the sorted list until you find a distance which isn't the same as the first one in the list
You then know how many - and which - outlets have the same distance.
Another option would be to simply shuffle the original collection, then sort the result by distance, then take the first element - that way even if multiple of them do have the same distance, you'll be taking a random one of those.
JB Nizet's option of "find the minimum, then perform a second scan to find all those with that distance" would be fine too - and quite possibly simpler :) Lots of options...
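A minimal sketch of the "find the minimum, then scan again" option might look like this. Plain double[]{x, y} arrays stand in for the Outlet objects, all names are hypothetical, and squared distances are compared so that no square root is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Two-pass search: find the minimum distance, collect all ties, pick one at random.
final class NearestWithTies {

    static double squaredDistance(double[] p, double px, double py) {
        double dx = p[0] - px;
        double dy = p[1] - py;
        return dx * dx + dy * dy;
    }

    /** Indices of all points whose squared distance to (px, py) equals the minimum. */
    static List<Integer> nearestIndices(double[][] points, double px, double py) {
        double min = Double.POSITIVE_INFINITY;
        for (double[] p : points) {
            min = Math.min(min, squaredDistance(p, px, py));
        }
        List<Integer> ties = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            if (squaredDistance(points[i], px, py) == min) {
                ties.add(i);
            }
        }
        return ties;
    }

    /** Picks one of the nearest points at random; points must not be empty. */
    static int pickRandom(double[][] points, double px, double py, Random random) {
        List<Integer> ties = nearestIndices(points, px, py);
        return ties.get(random.nextInt(ties.size()));
    }
}
```

The exact equality test is safe here because both passes compute the squared distance with the same expression on the same inputs.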

digit categorisation using Euclidean distance

I want to categorise digits which are represented in a 64 dimensional space which gives an 8X8 pixel character image. Each attribute is an integer from 0...16. I have 20 rows of 64 values plus one at the end which determines the category. The category is previously determined by UCI but I want to know how they got each particular category for each row. So they say they used Euclidean distance to determine the category.
My question is how do I apply Euclidean distance to 64 values? I tried to use the following formula (Pythagorean theorem) Math.sqrt(Math.pow(x2-x1,2)+Math.pow(y2-y1,2)) within a row, but the result was too big and I do not know what it represents. For example, for the first row I obtained 1612, whose square root is 40.15.
This is my code for the process:
public static void main(String[]args)
{
int row[]= new int[64];
for(int z=0;z<64;z++)
{
row[z]=digits[0][z]; //get the first row and store it
}
double result = 0;
for(int z=0;z<64;z+=2)
{
double distance = Math.pow(row[z]-row[z+1],2);
result = result+distance; //add distance each time
System.out.print(result+", ");
}
}
The first row of digits is this:
0,0,5,13,9,1,0,0,0,0,13,15,10,15,5,0,0,3,15,2,0,11,8,0,0,4,12,0,0,8,8,0,0,5,8,0,0,9,8,0,0,4,11,0,1,12,7,0,0,2,14,5,10,12,0,0,0,0,6,13,10,0,0,0,0
I am not sure if this makes sense but if something is not clear please do ask.
Thanks in advance.

"My question is how do I apply Euclidean distance to 64 values?"
You do not. Distance is a measure between two objects, each of which can have 64 values, but you need two objects. In particular, euclidean distance is defined as
dist(x, y) = ||x-y||_2 = sqrt[ SUM_{i=1}^d (x_i - y_i)^2 ]
where d is the number of dimensions, and x_i means ith dimension of x.
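In Java the definition can be sketched as a loop over the dimensions of two equally long vectors, for example two 64-value rows. The class and method names are hypothetical.

```java
// Euclidean distance between two vectors of the same dimension.
final class Euclidean {

    static double distance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            sum += d * d; // (x_i - y_i)^2
        }
        return Math.sqrt(sum);
    }
}
```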
"So they say they used Euclidean distance to determine the category."
They said more than that, as the distance itself does not define anything besides... distance. A category, on the other hand, is an abstract object, which might be defined by some characteristic point (centroid); you then assign each sample the category with the closest (in terms of the given distance) centroid.
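Assuming the categories really are represented by centroids, which the source data description may or may not confirm, the assignment step could be sketched like this. The centroid values would have to come from elsewhere, for example from averaging labelled rows, and all names are hypothetical.

```java
// Nearest-centroid assignment: returns the index of the closest centroid.
final class NearestCentroid {

    static int classify(double[] sample, double[][] centroids) {
        int best = -1;
        double bestSquared = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double sum = 0.0;
            for (int i = 0; i < sample.length; i++) {
                double d = sample[i] - centroids[c][i];
                sum += d * d;
            }
            // Squared distance preserves the ordering, so no square root is needed.
            if (sum < bestSquared) {
                bestSquared = sum;
                best = c;
            }
        }
        return best;
    }
}
```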
