Convert lat/long to US State


I have access to a list of lat/long coordinates, and I want to know (roughly) which US state each coordinate is located in. I can tolerate some loss of precision, but I can't rely on external libraries or APIs. I can also add a database of locations to my code.
What is a reasonable way to do this?
I thought about 3 possibilities:
1) Represent each state by a single point at its center, then do a nearest-neighbour search
2) Represent each state by points located at cities in the state, then do a nearest-neighbour search (with many more points)
3) Represent each state by a simple bounding box, then use some algorithm to query which bounding box my point belongs to
What do you think is best? I lean toward solution 3, but I can't find a list of coarse "bounding boxes" for US states.

I did a little searching and found a suitable solution for what you are looking for, including a dataset of bounding boxes.
Answer on StackOverflow: LINK
Dataset: LINK
Algorithm to use (implement): LINK
So yes, the proper way to implement this is solution 3 with the given dataset.
Hope it helps :)

1) Will not work; consider [image]
2) Has a high likelihood of not working for at least some states. Compare states whose towns/cities are clustered toward the middle with states whose towns/cities are clustered toward the edge.
3) Will not work [image] (these were supposed to be 90 degree angles, perfect squares, but drawing with a mouse is hard :) )
If you want to do this even vaguely accurately, you will need some shape data that defines the boundaries between states. You will then need an algorithm that can determine whether a point is within an irregular polygon.
See "List of the United States (US) state boundaries / borders as latitude/longitude pairs for geofence?"
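As a minimal sketch of the point-in-polygon step, the JDK's own java.awt.geom.Path2D can do the containment test, so no external library is needed. The class name StateLookup and its methods are hypothetical, and the test treats lat/lon as flat x/y coordinates, which is an approximation that should be acceptable for a rough lookup. The boundary rings would have to come from real shape data such as the list linked above.

```java
import java.awt.geom.Path2D;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps a region name to a polygon and answers "which region contains this point?"
class StateLookup {
    private final Map<String, Path2D.Double> shapes = new LinkedHashMap<>();

    /** Registers a region from a ring of {lon, lat} vertices (at least three). */
    void addRegion(String name, double[][] lonLatRing) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(lonLatRing[0][0], lonLatRing[0][1]);
        for (int i = 1; i < lonLatRing.length; i++) {
            path.lineTo(lonLatRing[i][0], lonLatRing[i][1]);
        }
        path.closePath();
        shapes.put(name, path);
    }

    /** Returns the name of the first region containing the point, or null if none does. */
    String find(double lat, double lon) {
        for (Map.Entry<String, Path2D.Double> entry : shapes.entrySet()) {
            // x = longitude, y = latitude; planar test, ignores Earth curvature
            if (entry.getValue().contains(lon, lat)) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

A cheap speed-up would be to check each region's bounding box (path.getBounds2D()) before running the full polygon test.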

Related

How can I correctly convert geographical coordinates to pixels on screen?

I'm trying to make a Java project that pinpoints the place on a image of a map, when given coordinates (taken from Google Maps).
I've tried using the top-left corner of the image (the place with the highest latitude and the lowest longitude) as a kind of reference point, which would be the (0,0) point on the map image, and then I've tried to calculate every place on the map relative to that reference point. However, this method proved inaccurate, probably because of the curvature of the Earth (note that the map I'm working with (Serbia) covers an area of 4° latitude and 4° longitude).
I've seen a couple of answers that talk about converting to the Mercator projection, but they are not very clear to me, because they don't cover a case similar to mine and are not written in Java.
What can I do to pinpoint those points more accurately (±3km would be accurate enough)?

As comments have correctly pointed out, in order to precisely convert between geographic coordinates and map position, you have to know the method of projection used for the map, and a sufficient number of parameters so that tuning the remaining parameters using a suitable set of reference points becomes feasible.
So if you assume a conic projection, then read the document David pointed out, and this referenced follow-up as well. As you can see, within the family of conic projections, there are a few alternatives to choose from. Each of them is described by a few parameters (i.e. standard parallels, cone constant, aspect ratio, …). You'd make guesses for these and then use some numerical optimization to obtain a best fit. Then you take the best parameter fit for each kind of projection and see which of them has the best overall fit. Quite a bit of work. If you don't want to implement the code for all these projections you can use proj.4 either on the command line or as a native library. To do the numeric optimization, you could possibly try to adapt one of the COIN-OR projects to your application.
In any case, the first step would be creating a suitable set of reference points which you can use to evaluate the fit. So pick a few prominent points on your map and find Google Earth coordinates for these. I'd say you should have at least a dozen points, to account for the fact that you know so little about your map. Otherwise there is a great risk that you will tune the large number of parameters to exactly fit your points while the rest of the map is still completely off. Even with this number of reference points, since the area of Serbia is not that big (compared to maps spanning whole continents), the errors of a wrong guess or a bad fit might be very small. So it might be hard to actually decide which projection has been used.
With all that I said above, and even with external libraries taking care of the projection and the numerical optimization, it might easily take you half a year just to set up the tools to work out the projection. So decide whether that's worth the effort. If not, there are several alternatives. One would be to take a different map, one where you know the projection. Or contact the author of your map and obtain the projection. Or ask someone working in geodesy in Serbia, because they might have enough experience to recognize the projection at a glance, I don't know.
One other option is to combine the fact that you need reference points with the fact that you might not be able to work out the exact projection in any case. Simply combine these in the following way: choose a suitably dense set of reference points, evenly distributed over the map. Then interpolate between them, piecewise linearly or with higher degree or using some weighted interpolation scheme or whatever. You know there is a projection behind all this, but you give up on working out the projection, and simply mitigate the symptom: by having enough reference points, each data item is close enough to a reference point to keep the error smaller than your threshold.
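One possible reading of "some weighted interpolation scheme" is inverse-distance weighting over the reference points. The sketch below is only an illustration under that assumption: the class name ReferenceInterpolator and its layout are made up, and the weights use squared distance in degrees, which is a crude but simple choice.

```java
// Hypothetical sketch: inverse-distance-weighted interpolation between reference points.
class ReferenceInterpolator {
    private final double[][] geo;   // reference points as {lat, lon}
    private final double[][] pixel; // matching image positions as {x, y}

    ReferenceInterpolator(double[][] geo, double[][] pixel) {
        if (geo.length == 0 || geo.length != pixel.length) {
            throw new IllegalArgumentException("need matching, non-empty reference arrays");
        }
        this.geo = geo;
        this.pixel = pixel;
    }

    /** Estimates the pixel position for a coordinate from all reference points. */
    double[] toPixel(double lat, double lon) {
        double sumW = 0, x = 0, y = 0;
        for (int i = 0; i < geo.length; i++) {
            double dLat = lat - geo[i][0];
            double dLon = lon - geo[i][1];
            double d2 = dLat * dLat + dLon * dLon;
            if (d2 == 0) {
                // Exactly on a reference point: return its known pixel position
                return new double[] {pixel[i][0], pixel[i][1]};
            }
            double w = 1.0 / d2; // closer reference points get larger weights
            sumW += w;
            x += w * pixel[i][0];
            y += w * pixel[i][1];
        }
        return new double[] {x / sumW, y / sumW};
    }
}
```

Plain inverse-distance weighting never extrapolates beyond the range of the reference pixels, so reference points should also cover the edges of the map. A piecewise linear scheme over triangles would behave better between sparse points, at the cost of more code.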

I found the answer I was looking for in this thread: Java, convert lat/lon to UTM
I found out that the actual projection of my map was UTM. From there, it was simply a matter of finding a class that would convert my lat/lon coordinates into UTM eastings and northings (very useful code in this answer), and then doing simple math to find out where the point is relative to the boundaries of the map, and it's actually working.
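The "simple math" is presumably a linear mapping from eastings and northings to pixels. A sketch of that step, assuming the map image is axis-aligned with the UTM grid and that the UTM coordinates of its edges are known (the class UtmMapImage and its fields are hypothetical, and the lat/lon to UTM conversion itself is not shown):

```java
// Hypothetical sketch: linear mapping from UTM coordinates to image pixels.
class UtmMapImage {
    private final double minEasting, maxEasting, minNorthing, maxNorthing;
    private final int widthPx, heightPx;

    UtmMapImage(double minEasting, double maxEasting,
                double minNorthing, double maxNorthing,
                int widthPx, int heightPx) {
        this.minEasting = minEasting;
        this.maxEasting = maxEasting;
        this.minNorthing = minNorthing;
        this.maxNorthing = maxNorthing;
        this.widthPx = widthPx;
        this.heightPx = heightPx;
    }

    /** Returns {x, y} in pixels, with (0,0) at the top-left corner of the image. */
    double[] toPixel(double easting, double northing) {
        double x = (easting - minEasting) / (maxEasting - minEasting) * widthPx;
        // Image y grows downward while northing grows upward, so the axis is flipped
        double y = (maxNorthing - northing) / (maxNorthing - minNorthing) * heightPx;
        return new double[] {x, y};
    }
}
```

All points must be converted using the same UTM zone as the map edges, otherwise the eastings are not comparable.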

What is a good data structure for a multi resolution graph?

I have a data set consisting of hundreds of millions of data points. I'd like to be able to render such a set efficiently depending on the zoom level (i.e. axis scale). I'd like a sampled subset to be rendered at the full view. As you zoom in, you'll be able to see more detailed data points until you reach maximum zoom, at which point you'll be able to see individual data points. What would be a good data structure to store such a data set and allow multi-resolution access?

You need to keep your points spatially indexed, because "outlier" and "density" are spatial properties -- an outlier is a point that happens to be in a low-density area; and "zooming out" would mean replacing sets of close-together points with 'sampled' points; and when "zooming in" you really, really want to ignore all those points that fall outside the current window. Your operations could be something like:
void addPoint(Point2D p);
void removePoint(Point2D p);
Iterator<Point2D> getPointsToPaint(Rectangle2D viewArea, int maxDensity, double densityArea);
where the viewArea represents the window you want to find points for, and the maxDensity parameter could be used to control point abstraction: when more than maxDensity points fall within a densityArea square, you return maxDensity random points within that area instead. getPointsToPaint would then cover your viewArea with densityArea sampling boxes and return the points within: the real points if there are fewer than maxDensity, and the "sampled" ones if there are more (nobody will notice whether 10 points within a 1 mm² area are random or not).
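To make the semantics of these operations concrete, here is a deliberately naive sketch. It scans a plain list instead of using a spatial index, returns a List rather than an Iterator, and keeps the first maxDensity points of each sampling box instead of random ones. All three are simplifications, and the class name SampledPointStore is made up.

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Naive illustration of the interface above; a real version would sit on a spatial index.
class SampledPointStore {
    private final List<Point2D> points = new ArrayList<>();

    void addPoint(Point2D p) { points.add(p); }

    void removePoint(Point2D p) { points.remove(p); }

    /** Returns at most maxDensity points per densityArea-sized box inside viewArea. */
    List<Point2D> getPointsToPaint(Rectangle2D viewArea, int maxDensity, double densityArea) {
        Map<String, Integer> countPerBox = new HashMap<>();
        List<Point2D> result = new ArrayList<>();
        for (Point2D p : points) {
            if (!viewArea.contains(p)) {
                continue; // outside the current window
            }
            long boxX = (long) Math.floor((p.getX() - viewArea.getX()) / densityArea);
            long boxY = (long) Math.floor((p.getY() - viewArea.getY()) / densityArea);
            String box = boxX + "," + boxY;
            int seen = countPerBox.merge(box, 1, Integer::sum);
            if (seen <= maxDensity) {
                result.add(p); // box not yet saturated, keep the real point
            }
        }
        return result;
    }
}
```

With hundreds of millions of points the linear scan is the part that has to be replaced by a tree or grid index; the per-box capping logic can stay roughly as it is.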
Typical spatial structures are quad-trees (for 2D) and kd-trees (for any number of dimensions). However, in their default implementations, neither of them is very good for quickly changing dynamic data. Another option is to use spatial hashing; but you really seem to need a multi-level approach, and for multi-level, trees are always the way to go. From a quick review of search results for "dynamic spatial indexing", it seems that a variant of the r-tree may be what you are looking for. Beware that these data structures are not easy to implement from scratch. The best approach may be to rely on an external GIS system to do the bookkeeping for you. Several Java GISs are available.

Not 100% sure what kind of data you are rendering, but I guess you could do sampling and calculate an approximation, and as you zoom in you make the approximation more and more accurate?

cost / mapping function for determining center of object based on detected features

I wrote an object tracker that will try to detect and follow a moving object in a recorded video. In order to maximize the detection rate, my algorithm is using a bunch of detection & tracking algorithms (cascade, foreground & particle tracker). Each tracking algorithm will return me some point of interest that might be part of the object that I'm trying to track. Let's assume (for the simplicity of this example) that my object is a rectangle and that the three tracking algorithms returned the points 1, 2 and 3:
Based on the relation / distance of these three points it is possible to calculate the center of gravity (blue X in above image) of the tracked object. So for each frame I might be able to come up with some good estimate of the center of gravity. However, the object might move from one frame to the next:
In this example I merely rotated the original object. My algorithm will give me three new points of interest: 1',2' and 3'. I could again calculate the center of gravity based on these three new points, but I would throw away important information that I've acquired from the previous frame: based on points 1, 2 and 3 I already do know something about the relationship of these points and thus by combining the information from 1, 2 and 3 and 1',2' and 3' I should be able to come up with a better estimate of the center of gravity.
Furthermore, the next frame might yield a fourth data point:
This is what I would like to do (but I don't know how):
Based on the individual points (and their relationship to each other) that are returned from the different tracking algorithms, I want to build up a localization map of the tracked object. Intuitively I feel like I need to come up with A) an identification function that will identify individual points across frames and B) some cost function that will determine how similar tracked points (and the relationship / distance between them) are from frame to frame, but I can't get my head around how to implement this. Alternatively, maybe some kind of map buildup based on the points will work. But again, I don't know how to approach this.
Any advice (and example code) is highly appreciated!
EDIT1
A simple particle filter would probably work too, but again I don't know how to define the cost function. A particle filter for tracking a certain color is easy to program: for each pixel you calculate the difference between target color and pixel color. But how would I do the same for estimating the relationship between tracked points?
EDIT2 intuitively I feel like Kalman filters could also help with the prediction step. See slides 24 - 32 of this pdf. Or am I misled?

What I think you're trying to do is essentially build up a state space of features, which can be applied to a filtering process, such as an Extended Kalman Filter. This is a useful framework when you have multiple observations in every frame, and you're trying to estimate or measure something indicated by these observations.
To determine the similarity of the tracked points, you can perform simple template matching from frame to frame for small regions around the points. One way of doing this is to extract an NxN (say, 7x7) region around point a in frame n and point a' in frame n+1, followed by normalised cross correlation between the extracted regions. This will give you a reasonable measure of how similar the patches are. If the patches are not similar, then you've probably lost track of that point.
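A minimal sketch of the patch comparison, assuming both patches have already been cut out as grayscale arrays of equal size. The class and method names are made up, and this is the zero-mean variant of normalised cross correlation, so a uniform brightness or contrast change between frames does not affect the score.

```java
// Hypothetical helper: zero-mean normalised cross correlation of two equal-sized patches.
class PatchMatcher {
    /** Returns a score in [-1, 1]; values near 1 mean the patches look alike. */
    static double normalizedCrossCorrelation(double[][] a, double[][] b) {
        int rows = a.length, cols = a[0].length;
        int n = rows * cols;
        double meanA = 0, meanB = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                meanA += a[r][c];
                meanB += b[r][c];
            }
        }
        meanA /= n;
        meanB /= n;
        double num = 0, varA = 0, varB = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double da = a[r][c] - meanA;
                double db = b[r][c] - meanB;
                num += da * db;
                varA += da * da;
                varB += db * db;
            }
        }
        double denom = Math.sqrt(varA * varB);
        // A completely flat patch has no texture to correlate; report "no evidence"
        return denom == 0 ? 0 : num / denom;
    }
}
```

A threshold on this score (the exact value would need tuning on real footage) can then decide whether a point is still being tracked or has been lost.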

There is an enormous literature on this and related problems, starting in the 1980s. Try searching for "optical flow" algorithms. The input for such algorithms is two successive frames of the same scene. The output is a vector field, one vector per pixel in the second image, which shows the direction and speed of movement of the feature at that pixel. This presentation is a pretty nice summary.
A nice thing about optical flow is that many algorithms for it parallelize nicely and map onto your favorite video card GPU, so they can run in real time. Think ESPN overlays.

In my opinion, to identify which point is which in each frame you will have to use a higher-dimensional description. For example, if you want to know where each point has moved between two frames (assuming the same points are extracted in both), you will have to build vectors or simplices between the points and then deduce a structure among them (such as the angles between them).
The main problem is that the number of combinations grows with the number of points. If your camera is fixed, you could use the background as a reference to deduce the object's rotations and translations; that is, build vectors between background interest points and object points in order to identify them clearly.
Hope that helps you move forward.

I would recommend looking into the divided difference filter (DDF), which is similar to the extended Kalman filter (EKF), but does not require an approximate model of the dynamics of your system (which you may not have). Basically the DDF approximates the derivatives used in the EKF using a difference equation. There are plenty of papers online about this, but I do not know whether you have access to them so I have not linked them here. If you are working from a university or a company that has access to online journals (like IEEE Xplore), then just Google "divided difference filter" and check out some of the papers.
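The following is not a divided difference filter, only a sketch of the one idea mentioned above: replacing an analytic derivative (the Jacobian an EKF would need) with a central difference. The class name, the method name and the step size h are assumptions made for the illustration.

```java
import java.util.function.Function;

// Hypothetical helper: numeric Jacobian of a vector function via central differences.
class NumericJacobian {
    /** Returns J with J[i][j] approximating d f_i / d x_j at the point x. */
    static double[][] centralDifference(Function<double[], double[]> f, double[] x, double h) {
        int n = x.length;
        int m = f.apply(x).length;
        double[][] jac = new double[m][n];
        for (int j = 0; j < n; j++) {
            double[] plus = x.clone();
            double[] minus = x.clone();
            plus[j] += h;
            minus[j] -= h;
            double[] fPlus = f.apply(plus);
            double[] fMinus = f.apply(minus);
            for (int i = 0; i < m; i++) {
                jac[i][j] = (fPlus[i] - fMinus[i]) / (2 * h);
            }
        }
        return jac;
    }
}
```

The actual filter goes further and also uses the differences to propagate the covariance, so the papers are still the place to look for the full algorithm.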

Find location using only distance and bearing?

Triangulation works by checking your angle to three KNOWN targets.
"I know that's the Lighthouse of Alexandria, it's located here (X,Y) on a map, and it's to my right at 90 degrees." Repeat 2 more times for different targets and angles.
Trilateration works by checking your distance from three KNOWN targets.
"I know that's the Lighthouse of Alexandria, it's located here (X,Y) on a map, and I'm 100 meters away from that." Repeat 2 more times for different targets and ranges.
But both of those methods rely on knowing WHAT you're looking at.
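For reference, trilateration with three identified targets in a plane reduces to a 2x2 linear system once the first circle equation is subtracted from the other two. A sketch, with a made-up class name and noise-free distances assumed:

```java
// Hypothetical helper: 2D trilateration from three known points and measured distances.
class Trilateration {
    /**
     * @param p three known target positions as {x, y}
     * @param r measured distances to those targets, in the same order
     * @return the estimated position as {x, y}
     */
    static double[] locate(double[][] p, double[] r) {
        // Subtracting circle 0 from circles 1 and 2 leaves two linear equations in x and y
        double a = 2 * (p[1][0] - p[0][0]);
        double b = 2 * (p[1][1] - p[0][1]);
        double c = r[0] * r[0] - r[1] * r[1]
                 - p[0][0] * p[0][0] + p[1][0] * p[1][0]
                 - p[0][1] * p[0][1] + p[1][1] * p[1][1];
        double d = 2 * (p[2][0] - p[0][0]);
        double e = 2 * (p[2][1] - p[0][1]);
        double f = r[0] * r[0] - r[2] * r[2]
                 - p[0][0] * p[0][0] + p[2][0] * p[2][0]
                 - p[0][1] * p[0][1] + p[2][1] * p[2][1];
        double det = a * e - b * d;
        if (Math.abs(det) < 1e-12) {
            throw new IllegalArgumentException("targets are collinear, position is ambiguous");
        }
        // Cramer's rule for the 2x2 system
        return new double[] {(c * e - b * f) / det, (a * f - c * d) / det};
    }
}
```

With noisy ranges or more than three targets, a least-squares fit over all circle equations would be the usual replacement for the exact solve.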
Say you're in a forest and you can't differentiate between trees, but you know where key trees are. These trees have been hand picked as "landmarks."
You have a robot moving through that forest slowly.
Do you know of any ways to determine location based solely off of angle and range, exploiting geometry between landmarks? Note, you will see other trees as well, so you won't know which trees are key trees. Ignore the fact that a target may be occluded. Our pre-algorithm takes care of that.
1) If this exists, what's it called? I can't find anything.
2) What do you think the odds are of having two identical location 'hits?' I imagine it's fairly rare.
3) If there are two identical location 'hits,' how can I determine my exact location after I next move the robot? (I assume the chances of having 2 occurrences of EXACT angles in a row, after I reposition the robot, would be statistically impossible, barring a forest growing in rows like corn). Would I just calculate the position again and hope for the best? Or would I somehow incorporate my previous position estimate into my next guess?
If this exists, I'd like to read about it, and if not, develop it as a side project. I just don't have time to reinvent the wheel right now, nor the time to implement this from scratch. So if it doesn't exist, I'll have to figure out another way to localize the robot, since that's not the aim of this research; if it does, let's hope it's semi-easy.

Great question.
The name of the problem you're investigating is localization, and together with mapping it is one of the two most important and challenging problems in robotics at the moment. Put simply, localization is the problem of "given some sensor observations, how do I know where I am?"
Landmark identification is one of the hidden 'tricks' that underpin so much of the practice of robotics. If it isn't possible to uniquely identify a landmark, you can end up with a high proportion of misinformation, particularly given that real sensors are stochastic (i.e. there will be some uncertainty associated with the result). Your choice of an appropriate localization method will almost certainly depend on how well you can uniquely identify a landmark, or associate patterns of landmarks with a map.
The simplest method of self-localization in many cases is Monte Carlo localization. One common way to implement this is by using particle filters. The advantage of these is that they cope well when you don't have great models of motion and sensor capability, and need something robust that can deal with unexpected effects (like moving obstacles or landmark obscuration). A particle represents one possible state of the vehicle. Initially particles are uniformly distributed; as the vehicle moves and more sensor observations are incorporated, particle states are updated to move away from unlikely states. In the example given, particles would move away from areas where the ranges / bearings don't match what should be visible from the current position estimate. Given sufficient time and observations, particles tend to clump together into areas where there is a high probability of the vehicle being located. Look up the work of Sebastian Thrun, particularly the book "Probabilistic Robotics".
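A compact sketch of the particle filter loop described above. To keep it short it makes a strong simplifying assumption that does not hold in the forest scenario: every measured range is already associated with a known landmark. With anonymous trees, the weighting step would additionally have to decide which landmark each observation belongs to. The class name, the Gaussian range model and all parameters are illustrative assumptions.

```java
import java.util.Random;

// Hypothetical sketch of Monte Carlo localization with range-only measurements.
class MonteCarloLocalizer {
    private final double[][] landmarks; // known map positions {x, y}
    private final double[][] particles; // candidate robot positions {x, y}
    private final double sigma;         // assumed std dev of the range sensor
    private final Random rng;

    MonteCarloLocalizer(double[][] landmarks, int count, double worldSize, double sigma, long seed) {
        this.landmarks = landmarks;
        this.sigma = sigma;
        this.rng = new Random(seed);
        this.particles = new double[count][2];
        for (double[] p : particles) { // start uniformly spread over the world
            p[0] = rng.nextDouble() * worldSize;
            p[1] = rng.nextDouble() * worldSize;
        }
    }

    /** Motion update: shift every particle by the odometry estimate plus noise. */
    void move(double dx, double dy, double noise) {
        for (double[] p : particles) {
            p[0] += dx + rng.nextGaussian() * noise;
            p[1] += dy + rng.nextGaussian() * noise;
        }
    }

    /** Measurement update: weight particles by range likelihood, then resample. */
    void observe(double[] ranges) {
        int n = particles.length;
        double[] cumulative = new double[n];
        double total = 0;
        for (int i = 0; i < n; i++) {
            double logW = 0;
            for (int k = 0; k < landmarks.length; k++) {
                double expected = Math.hypot(landmarks[k][0] - particles[i][0],
                                             landmarks[k][1] - particles[i][1]);
                double err = ranges[k] - expected;
                logW -= err * err / (2 * sigma * sigma);
            }
            total += Math.exp(logW);
            cumulative[i] = total;
        }
        if (total == 0) return; // no particle explains the data; keep the old set
        double[][] resampled = new double[n][];
        for (int i = 0; i < n; i++) { // draw particles in proportion to their weight
            double target = rng.nextDouble() * total;
            int j = 0;
            while (cumulative[j] < target) j++;
            resampled[i] = particles[j].clone();
        }
        System.arraycopy(resampled, 0, particles, 0, n);
    }

    /** Position estimate: the mean of all particles. */
    double[] estimate() {
        double x = 0, y = 0;
        for (double[] p : particles) { x += p[0]; y += p[1]; }
        return new double[] {x / particles.length, y / particles.length};
    }
}
```

A typical loop alternates move(...) with the odometry reading and observe(...) with the latest ranges. The mean is only a sensible estimate once the particles have collapsed into a single cluster; while several clusters remain, the position is still ambiguous.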

What you're looking for is Monte Carlo localization (also known as a particle filter). Here's a good resource on the subject.
Or nearly anything from the probabilistic robotics crowd, Dellaert, Thrun, Burgard or Fox. If you're feeling ambitious, you could try to go for a full SLAM solution - a bunch of libraries are posted here.
Or if you're really really ambitious, you could implement from first principles using Factor Graphs.

I assume you want to start by turning on the robot inside the forest. I further assume that the robot can calculate the position of every tree using angle and distance.
Then you can identify the landmarks by iterating through the trees and calculating the distance to all its neighbours. In Matlab you can use pdist to get a list of all (unique) pairwise distances.
Then you can iterate through the trees to identify landmarks. For every tree, compare the distances to all its neighbours to the known distances between landmarks. Whenever you find a candidate landmark, you check its possible landmark neighbours for the correct distance signature. Since you say that you always should be able to see five landmarks at any given time, you will be trying to match 20 distances, so I'd say that the chance of false positives is not too high. If the candidate landmark and its candidate fellow landmarks do not match the complete relative distance pattern, you go check the next tree.
Once you have found all the landmarks, you simply triangulate.
Note that depending on how accurately you can measure angles and distances, you may need to be able to see more landmark trees at any given time. My guess is that you need to space landmarks with sufficient density that you can see at least three at a time if you have high measurement accuracy.

I guess you need only distance to two landmarks and the order of seeing them (i.e. from left to right you see point A and B)

(1) "Robotic mapping" and "perceptual aliasing".
(2) Two identical hits are inevitable. Since the robot can only distinguish between a finite number X of distinguishable tree configurations, even if the configurations are completely random, there is almost certainly at least one location that looks "the same" as some other location even if you encounter far fewer than X/2 different trees. Those are called "birthday paradox collisions". You may be lucky that the particular location you are at is in fact actually unique, but I wouldn't bet my robot on it.
So you:
(a) have a map of a large area with some, but not all trees on it.
(b) a robot somewhere in the actual forest that, without looking at the map, has looked at the nearby trees and generated an internal map of all the trees in a tiny area and its relative position to them
(c) To the robot, every tree looks the same as every other tree.
You want to find: Where is the robot on the large map?
If only each actual tree had a unique name written on it that the robot could read, and then (some of) those trees and their names were on the map, this would be trivial.
One approach is to attach a (not necessarily unique) "signature" to each tree that describes its position relative to nearby trees.
Then, as you travel along, the robot drives up to a tree and finds a "signature" for that tree, and you find all the trees on the map that "match" that signature.
If only one unique tree on the map matches, then the tree the robot is looking at might be that tree on the map (yay, you know where the robot is) -- put down a weighty but tentative dot on the map at the robot's relative position to the matching tree -- the tree the robot is next to is certainly not any of the other trees on the map.
If several of the trees on the map match -- they all have the same non-unique signature -- then you could put some less-weighty tentative dots on the map at the robots position relative to each one of them.
Alas, even if you find one or more matches, it is still possible that the tree the robot is looking at is not on the map at all, and the signature of that tree is coincidentally the same as one or more trees on the map, and so the robot could be anywhere on the map.
If none of the trees on the map matches, then the tree the robot is looking at is definitely not on the map. (Perhaps later on, once the robot knows exactly where it is, it should start adding these trees to the map?)
As you drive down the path, you push the dots in your estimated direction and speed of travel.
Then as you inspect other trees, possibly after driving down the path a little further, you eventually have lots of dots on the map, and hopefully one heavy, highly overlapping cluster at the actual position, and hopefully the other dots are easily ignored isolated coincidences.
The simplest signature is a list of distances from a particular tree to nearby trees.
A particular tree on the map is "matched" to a particular tree in the forest when, for each and every nearby tree on the map, there is a corresponding nearby tree in the forest at "the same" distance, as far as you can tell with your known distance and angular errors.
(By "nearby", I mean "close enough that the robot should be able to definitely confirm that the tree is actually there", although it's probably simpler to approximate this with something like "My robot can see all trees out to a range of R, so I'm only going to bother even trying to match trees that are within a circle of R*1/3 from my robot, and my list of distances only include trees that are within a circle of R*2/3 from the particular tree I'm trying to match").
If you know your north-south orientation even very roughly, you can create signatures that are "more unique", i.e., have fewer spurious matches on the map and (hopefully) in the real forest.
A "match" for the tree the robot is next to occurs when, for each nearby tree on the map, there is a corresponding tree in the forest at "the same" distance and direction, as far as you can tell with your known distance and angular errors.
Say you see that tree "Fred" on the map has another tree 10 meters in the N to W quadrant from it, but the robot is next to a tree that definitely doesn't have any trees at that distance in the N to W quadrant, but it has a tree 10 meters away to the South.
In that case, then (using a more complex signature) you can definitely tell the robot is not next to Fred, even though the simple signature would give a (false) match.
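The simplest, distance-only signature and its matching rule could be sketched as follows. The class name, the greedy pairing and the single tolerance value are assumptions made for illustration; the direction-aware signature would add a bearing to each entry and compare that as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the distance-only tree signature and its match test.
class TreeSignature {
    /** Distances from the tree at index center to every other tree within radius. */
    static List<Double> signature(double[][] trees, int center, double radius) {
        List<Double> sig = new ArrayList<>();
        for (int i = 0; i < trees.length; i++) {
            if (i == center) continue;
            double d = Math.hypot(trees[i][0] - trees[center][0],
                                  trees[i][1] - trees[center][1]);
            if (d <= radius) sig.add(d);
        }
        return sig;
    }

    /**
     * True if every distance in the map signature has its own counterpart in the
     * observed signature within tol. Extra observed trees are allowed, because
     * the map does not contain every tree in the forest.
     */
    static boolean matches(List<Double> mapSig, List<Double> observedSig, double tol) {
        List<Double> remaining = new ArrayList<>(observedSig);
        for (double d : mapSig) {
            int best = -1;
            for (int i = 0; i < remaining.size(); i++) {
                double diff = Math.abs(remaining.get(i) - d);
                if (diff <= tol && (best < 0 || diff < Math.abs(remaining.get(best) - d))) {
                    best = i;
                }
            }
            if (best < 0) return false; // a mapped neighbour is missing in the forest
            remaining.remove(best);     // each observed tree may be used only once
        }
        return true;
    }
}
```

Greedy pairing can reject a valid match when two distances lie within tol of each other; a stricter implementation would solve a small assignment problem instead.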
Another approach:
The "digital paper" solves a similar problem ... Can you plant a few trees in a pattern that is specifically designed to be easily recognized?

Convert a list java.awt.geom.Point2D to a java.awt.geom.Area

I have a set of points that I want to turn into a closed polygon in Java. I'm currently trying to use java.awt.geom.Point2D and java.awt.geom.Area but can't figure out how to turn a group of the points into an Area.
I think I can define a set of Line2Ds based on the points and then add those to the Area, but that's a lot of work and I'm lazy. So is there an easier way?
The problem is I have a list of lat/lon coordinates and want to build up an area that I can use for hit testing.
Non-core Java libraries are a possibility as well.
Update: I looked at using java.awt.Polygon, but it only supports ints and I'm operating with doubles for the coordinates.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4077518
Hear that, "customer"? You should be using GeneralPath, even though the absence of Polygon2D since the late 1990s is an obvious monster-truck-sized hole in the API.
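A short sketch of the path-based approach, using Path2D.Double (the double-precision sibling of GeneralPath in java.awt.geom) so the coordinates are not truncated. The helper class PolygonAreas is made up, and the points are assumed to be listed in order around the polygon.

```java
import java.awt.geom.Area;
import java.awt.geom.Path2D;
import java.awt.geom.Point2D;
import java.util.List;

// Hypothetical helper: builds a closed polygon Area from an ordered list of points.
class PolygonAreas {
    static Area toArea(List<? extends Point2D> points) {
        if (points.size() < 3) {
            throw new IllegalArgumentException("a polygon needs at least three points");
        }
        Path2D.Double path = new Path2D.Double();
        boolean first = true;
        for (Point2D p : points) {
            if (first) {
                path.moveTo(p.getX(), p.getY()); // start the outline
                first = false;
            } else {
                path.lineTo(p.getX(), p.getY()); // straight edge to the next vertex
            }
        }
        path.closePath(); // connect the last vertex back to the first
        return new Area(path);
    }
}
```

For hit testing alone, path.contains(x, y) works directly on the Path2D; wrapping it in an Area is only needed for operations such as union or intersection.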

If you are actually working with Geodetic lat/lon values, you can actually use OpenMap to do some of this work. I just spent some time using the Geo class in that API to bounce an object around an area defined by a polygon of lat/lon points. There are intersection calls and everything and all of the math is done spherically so that the points are more correct as far as projections go.

The simplest (and laziest) thing to do is to create a bounding box for the points from the maximum and minimum of the X, Y ordinate values.
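The bounding-box variant takes only a few lines. In this sketch the class name BoundingBoxes is made up; the result is a Rectangle2D.Double, which can be used for hit testing through its contains method.

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;
import java.util.List;

// Hypothetical helper: axis-aligned bounding box of a list of points.
class BoundingBoxes {
    static Rectangle2D.Double of(List<? extends Point2D> points) {
        if (points.isEmpty()) {
            throw new IllegalArgumentException("no points");
        }
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (Point2D p : points) {
            minX = Math.min(minX, p.getX());
            minY = Math.min(minY, p.getY());
            maxX = Math.max(maxX, p.getX());
            maxY = Math.max(maxY, p.getY());
        }
        return new Rectangle2D.Double(minX, minY, maxX - minX, maxY - minY);
    }
}
```

Note that Rectangle2D treats points on its right and bottom edges as outside, so a point with the maximum X or Y value will not test as contained.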
If you need a closer fit then rather than devise your own algorithm, this might be a good place to start:
