Fast algorithm to find thousands of points in millions of polygons? - java

I am trying to locate thousands of points within millions of polygons via a web service. At first I implemented a point-in-polygon algorithm in Java, but it took a long time. Then I split the table in MySQL and tried using multiple threads, but it is still inefficient. Is there a faster algorithm or implementation for solving this?
Some additional details about the polygons: they are 2D, static, and complex (possibly with holes).
Any suggestion will be appreciated.

Testing a point against a million polygons is going to take a lot of time, no matter how efficient your point in polygon function is.
You need to narrow down the search list. Start by computing a bounding box for each polygon and only run the full test for polygons whose bounding box contains the point.
If the polygons are unchanging you could convert each polygon to a set of triangles. Testing to see if a point is in a triangle should be much faster than testing to see if it's in an arbitrary polygon. Even though the number of triangles will be much larger than the number of polygons, it might be faster overall.
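A rough sketch of the bounding-box pre-filter in Java, assuming the polygons fit in memory as java.awt.geom.Path2D shapes built with the even-odd winding rule so holes are handled; class and field names here are illustrative:

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: precompute a bounding box per polygon and only run the
// expensive contains() test when the point falls inside that box.
final class PolygonIndex {
    private final List<Path2D> polygons = new ArrayList<>();
    private final List<Rectangle2D> boxes = new ArrayList<>();

    void add(Path2D polygon) {
        polygons.add(polygon);
        boxes.add(polygon.getBounds2D());   // cached once, reused for every query
    }

    // Returns the index of the first polygon containing the point, or -1.
    int findContainingPolygon(double x, double y) {
        for (int i = 0; i < polygons.size(); i++) {
            if (boxes.get(i).contains(x, y) && polygons.get(i).contains(x, y)) {
                return i;
            }
        }
        return -1;
    }
}
```

This is still a linear scan over all polygons per point, just with a cheap rejection test; the answers below cut the candidate set down further with an R-tree or a grid.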

If the collection of polygons is static it may be helpful to first register them onto a spatial data structure - an R-tree might be a good choice, assuming that the polygons do not overlap each other too much.
To test a point against the polygon collection the enclosing leaf in the tree would first be found (an O(log(n)) style operation) and then it would only be necessary to perform the full point-in-polygon test for the polygons that are associated with the enclosing leaf.
This approach should greatly speed up each point test, but requires an additional setup phase to build the R-tree.
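As a rough illustration, assuming the JTS Topology Suite (org.locationtech.jts) is available, its STRtree provides exactly this kind of index over the polygon envelopes (names below are illustrative):

```java
import org.locationtech.jts.geom.Coordinate;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.geom.GeometryFactory;
import org.locationtech.jts.geom.Point;
import org.locationtech.jts.index.strtree.STRtree;
import java.util.List;

// Sketch: load polygon envelopes into an STRtree, then for each query point
// only run the full point-in-polygon test on the few envelope matches.
final class RTreeLookup {
    private final STRtree index = new STRtree();
    private final GeometryFactory factory = new GeometryFactory();

    void add(Geometry polygon) {
        index.insert(polygon.getEnvelopeInternal(), polygon);
    }

    Geometry find(double x, double y) {
        Point p = factory.createPoint(new Coordinate(x, y));
        @SuppressWarnings("unchecked")
        List<Geometry> candidates = index.query(p.getEnvelopeInternal());
        for (Geometry candidate : candidates) {
            if (candidate.contains(p)) {   // full point-in-polygon test, holes included
                return candidate;
            }
        }
        return null;
    }
}
```

Since the question mentions MySQL, a spatial index on a geometry column can play a similar bounding-box pre-filter role on the database side.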
Hope this helps.

If you deal with millions of polygons, you need some kind of space partitioning, or it is going to be slow no matter how optimized your hit-test function is or how many threads work on solving your query.
What kind of space partitioning? It depends:
2D? 3D?
Is your polygon set static? If not, does it change frequently?
What kind of requests are you doing on this set?
What kind of polygons are they? Triangles? Convex? Concave? Complex? With holes?
We need more information to help you.
EDIT
Here is a simple space partitioning scheme.
Suppose there is a Cartesian grid over your 2D space with a given step.
When you add a polygon:
Compute its bounding box
Find all the grid cells that intersect with the bounding box
For each cell, add a row to a special table.
The table looks like this: cell_x, cell_y, polygon_id. Add the proper indexes (at least on cell_x and cell_y).
Of course, you want to choose your grid step so that most polygons lie in fewer than 10 cells, or else your cell table will quickly become huge.
It's now easy to find the polygons at a given point:
Compute which cell your point falls in
Get all polygons associated to this cell
For each polygon, use your hit-test function
This solution is far from optimal, but it is easy to implement.
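A minimal in-memory version of this scheme in Java (names are illustrative; the same cell_x/cell_y keys could equally go into the MySQL table described above):

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the grid scheme: every polygon is registered in all cells its bounding
// box touches; a point query only tests the polygons registered in its own cell.
final class GridIndex {
    private final double cellSize;
    private final Map<Long, List<Path2D>> cells = new HashMap<>();

    GridIndex(double cellSize) { this.cellSize = cellSize; }

    private static long key(long cx, long cy) { return (cx << 32) ^ (cy & 0xffffffffL); }

    void add(Path2D polygon) {
        Rectangle2D b = polygon.getBounds2D();
        long minCx = (long) Math.floor(b.getMinX() / cellSize);
        long maxCx = (long) Math.floor(b.getMaxX() / cellSize);
        long minCy = (long) Math.floor(b.getMinY() / cellSize);
        long maxCy = (long) Math.floor(b.getMaxY() / cellSize);
        for (long cx = minCx; cx <= maxCx; cx++) {
            for (long cy = minCy; cy <= maxCy; cy++) {
                cells.computeIfAbsent(key(cx, cy), k -> new ArrayList<>()).add(polygon);
            }
        }
    }

    Path2D find(double x, double y) {
        long cx = (long) Math.floor(x / cellSize);
        long cy = (long) Math.floor(y / cellSize);
        List<Path2D> candidates = cells.get(key(cx, cy));
        if (candidates == null) return null;
        for (Path2D p : candidates) {
            if (p.contains(x, y)) return p;   // full hit-test only on this cell's polygons
        }
        return null;
    }
}
```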

I think this is a case where divide and conquer would do: you could try making sub-polygons or simplifying some of the points, or maybe try a heuristic approach. Those are my five cents.

Related

OpenGL very large mesh clipping

For my work I had to get into OpenGL 3d rendering recently, and I admit I'm quite new to this topic.
Without getting into too much detail, I have to deal with a HUGE array of data (vertices) from which I need to draw a shape. Basically, think of a plane of a very odd shape in 3D space. This shape is being added to on the fly. Think of a car moving on a plane and painting its trail behind it - but not just a simple trail: one with holes, discarded sections, etc. And it generates a new section several times per second for hours.
So, obviously, what you end up with is A LOT of vertices, that do get optimized somewhat, but not enough. Millions of them.
And obviously I can't just feed it to the GPU of an embedded system as a vertex VBO.
So I've been reading about culling and clipping, and as far as I understand I only need to display the visible triangles of this array, and not render everything else.
Now, how do I do that properly?
The simplest brute-force solution would be to go through all the triangles and, if they lie outside the frustum, simply not draw them: generate a buffer of what I DO draw and pass it to the GPU.
One idea I had is to divide world space into squares, a kind of chunks, and basically split the "trail" mesh between them. Each square would hold the data for its part of the trail, and then I could use frustum culling, maybe, to decide which squares to render and which to skip.
But I'm not convinced it's a great solution. I also read that you should reduce the number of GL function calls as much as possible, and calling it for hundreds of squares doesn't seem great.
So I decided to ask for advice from people who understand the subject better than I do. Sadly, I don't get much learning time - I need to dive right into it.
If anyone could give me some directed tips it'd be appreciated.
You'd be better off using some form of spatial partitioning tree (e.g. octree, quadtree, etc.). That's a similar approach to your second suggestion; however, because it's hierarchical, searching the tree is O(log N) rather than O(N).
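A minimal quadtree sketch in Java over the ground-plane bounds of the mesh chunks (the names, the 2D simplification, and the use of a query rectangle instead of a full frustum test are all assumptions for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Axis-aligned 2D rectangle, e.g. the ground-plane bounds of a mesh chunk.
final class Rect {
    final float minX, minY, maxX, maxY;
    Rect(float minX, float minY, float maxX, float maxY) {
        this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
    }
    boolean overlaps(Rect o) {
        return minX <= o.maxX && maxX >= o.minX && minY <= o.maxY && maxY >= o.minY;
    }
    boolean contains(Rect o) {
        return minX <= o.minX && maxX >= o.maxX && minY <= o.minY && maxY >= o.maxY;
    }
}

// Sketch of a quadtree of chunk bounds; query() returns only chunks whose bounds
// overlap the query area (e.g. the frustum's 2D footprint).
final class QuadTree<T> {
    private static final int MAX_ITEMS = 8;
    private static final int MAX_DEPTH = 10;

    private final Rect bounds;
    private final int depth;
    private final List<Rect> itemBounds = new ArrayList<>();
    private final List<T> items = new ArrayList<>();
    private QuadTree<T>[] children;   // null until this node splits

    QuadTree(Rect bounds) { this(bounds, 0); }
    private QuadTree(Rect bounds, int depth) { this.bounds = bounds; this.depth = depth; }

    void insert(Rect r, T item) {
        if (children != null) {
            for (QuadTree<T> c : children) {
                if (c.bounds.contains(r)) { c.insert(r, item); return; }
            }
        }
        itemBounds.add(r);
        items.add(item);
        if (children == null && items.size() > MAX_ITEMS && depth < MAX_DEPTH) split();
    }

    void query(Rect area, List<T> out) {
        if (!bounds.overlaps(area)) return;
        for (int i = 0; i < items.size(); i++) {
            if (itemBounds.get(i).overlaps(area)) out.add(items.get(i));
        }
        if (children != null) for (QuadTree<T> c : children) c.query(area, out);
    }

    @SuppressWarnings("unchecked")
    private void split() {
        float mx = (bounds.minX + bounds.maxX) / 2, my = (bounds.minY + bounds.maxY) / 2;
        children = new QuadTree[] {
            new QuadTree<>(new Rect(bounds.minX, bounds.minY, mx, my), depth + 1),
            new QuadTree<>(new Rect(mx, bounds.minY, bounds.maxX, my), depth + 1),
            new QuadTree<>(new Rect(bounds.minX, my, mx, bounds.maxY), depth + 1),
            new QuadTree<>(new Rect(mx, my, bounds.maxX, bounds.maxY), depth + 1)
        };
        // Re-insert existing items so they sink into children where they fully fit.
        List<Rect> rs = new ArrayList<>(itemBounds);
        List<T> ts = new ArrayList<>(items);
        itemBounds.clear(); items.clear();
        for (int i = 0; i < rs.size(); i++) insert(rs.get(i), ts.get(i));
    }
}
```

In practice you would query with the frustum's footprint (or test chunk AABBs against all six frustum planes) and batch the surviving chunks into as few draw calls as possible, which also addresses the concern about per-square GL calls.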

Selecting points randomly in regions corresponding with neighbors, avoiding infinite recursion

Pardon the wall of text. I'll add images later. I need to generate a somewhat realistic map of cubic-meter voxels, with water, sand, grasses, trees, minerals, deserts, beaches, islands, etc., without any sort of Voronoi cop-out (i.e. be smart about relating these factors to each other). Yes, this is a game.
I figured I'd generate critical points randomly and interpolate them for elevation and humidity readings, but I'm at a loss with random generation. Basically I need a somewhat even distribution of points without having to make the full list at once. I need to generate roughly 20x20x20 at a time, and probably work with approximately 1000x1000x1000 cells of critical points, but I'd expect strange things to happen at the edges of the large cells. Does anyone know of any way to select points in this way? The real trouble is that points should prefer to be in proximity to others in "mountain-range" style chains.
The problem here is that this is happening across these 1-km cells.
I can simply pick points this way within a cell, but since a cell and its neighbor need to be dependent on each other, my trivial algorithm would have encountered the need to head to infinity for one of these cells, or else produce a grid-like pattern of broken chains. The chains should not break more frequently on cell boundaries. If they do, a somewhat problematic wafer-like pattern shows up in generation and makes for a poor result that is unusable for design/gameplay reasons.
n.b. For the purposes of these, the system-level random generator can be seeded and is practically uniform. As far as within a cell, I can select chains just fine.
I also considered having a cell spill over into any ungenerated cells so its chains start connected to existing ones, but that would break the determinism of generation simply based on location and seed, adding order of generation as a factor.
Again, for the purposes of realism and design I'm trying to stay away from using Perlin. Or should I post on gamedev.SE?
The key concept is to create the interfaces between a cell and all six of its neighbors before filling in the cell. Picture creating your entire world as a grid of hollow boxes, but before you do that, create it as a wire-frame outline, but before you do that, create the grid of wire-frame intersections.
Here's a simplistic approach -- you'll have to improve this. First, consider this method of generating the entire world at once:
(1) select all the vertex voxels -- perhaps the upper-north-west voxel in each cell -- and set their world attributes to reasonable values based entirely on location and seed.
(2) select the lines of voxels connecting the vertices, and fill in all their world attributes, based on location and seed, but constrained to match up with the existing vertex voxel values at each end.
(3) select the planes of voxels describing the faces bounded by the existing lines, and fill in all their world attributes, based on location and seed, but constrained to match up with the existing line voxel values along the edges.
(4) fill in the cell, based on location and seed, but constrained to match up with the existing bounding six faces.
Now, consider that this method doesn't need to be done all at once. All that is necessary is that you create all six faces of a cell before filling it in, and that you create all four bounding lines of a face before you fill that in, and that you create the two end points of a line before filling that in. After the first cell, some of these will already exist.
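A small sketch of the "based entirely on location and seed" part, which is what makes this incremental version order-independent: derive each voxel's random stream from a hash of its world coordinates and the global seed, so a shared vertex, edge or face sees identical values no matter which cell asks for it first. The hash constants below are arbitrary illustrative choices:

```java
import java.util.Random;

// Sketch: a deterministic per-location random source. Any cell can be generated at
// any time and will see identical values for shared vertices, edges and faces.
final class LocationRandom {
    private final long worldSeed;

    LocationRandom(long worldSeed) { this.worldSeed = worldSeed; }

    Random at(long x, long y, long z) {
        long h = worldSeed;
        h = h * 6364136223846793005L + x;   // arbitrary odd multiplier; any decent
        h = h * 6364136223846793005L + y;   // integer hash works here
        h = h * 6364136223846793005L + z;
        h ^= (h >>> 33);
        return new Random(h);
    }
}

// Usage: attributes of the vertex voxel at (cx*20, cy*20, cz*20) depend only on its
// coordinates and the world seed, never on the order in which cells are generated.
// double elevation = new LocationRandom(seed).at(cx * 20, cy * 20, cz * 20).nextDouble();
```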
The reason I said that you'll have to improve this idea is that it produces noticeable gradient boundaries at the cell boundaries. I'm afraid that each interface voxel will not only need to contain world attributes, but the rate of change of each attribute across the interface at that point. This implies that each line voxel will have to contain two rates of change for each attribute, and each vertex, three.
I'm not going to describe how you would constrain the gradient of a world attribute as it approaches a voxel with a predefined gradient because I'm sure you can handle it, my answer is already too long, and I don't know how.

Finding the intersection of 2 arbitrary cubes in 3d

So, I'd like to figure out a function that allows you to determine if two cubes of arbitrary rotation and size intersect.
If the cubes are not arbitrary in their rotation (but locked to a particular axis), the intersection is simple: you check whether they intersect in all three dimensions by checking their bounds to see if they cross or are within one another in all three dimensions. If they cross or are within one another in only two, they do not intersect. This method can be used to determine whether the arbitrary cubes are even candidates for intersection, using their highest/lowest x, y, and z to create an outer bounds.
That's the first step. In theory, from that information we can tell which 'side' they are on from each other, which means we can eliminate some of the quads (sides) from our intersection. However, I can't assume that we have that information, since the rotation of the cubes may make it difficult to determine simply.
My thought is to take each pair of quads, find the intersection of their planes, then determine if that line intersects with at least one edge of each of the pairs of sides. If any pair of sides has a line of intersection that intersects with any of their edges, the quads intersect. If none intersect, the two cubes do not intersect.
We can then determine the depth of the intersection on the second cube by where the plane-intersection line intersects with its edge(s).
This is simply speculative, however. Is there a better, more efficient way to determine the intersection of these two cubes? I can think of a number of different ways to do this, and I can also tell that they could be very different in terms of amount of computation required.
I'm working in Java at the moment, but C/C++ solutions are cool too (I can port them); even pseudocode, since it is perhaps a big question.
To find the intersection (contact) points of two arbitrary cubes in three dimensions, you have to do it in two phases:
Detect collisions. This is usually two phases itself, but for simplicity, let's just call it "collision detection".
The algorithm will be either SAT (Separating axis theorem), or some variant of polytope expansion/reduction. Again, for simplicity, let's assume you will be using SAT.
I won't explain in detail, as others have already done so many times, and better than I could. The "take-home" from this, is that collision detection is not designed to tell you where a collision has taken place; only that it has taken place.
Upon detection of an intersection, you need to compute the contact points. This is done via a polygon clipping algorithm. In this case, let's use https://en.wikipedia.org/wiki/Sutherland%E2%80%93Hodgman_algorithm
There are easier and better ways to do this, but SAT is easy to grasp in 3D, and SH clipping is also easy to get your head around, so it's a good starting point for you.
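For reference, a compact sketch of the SAT overlap test for two oriented boxes (a yes/no test only, in line with the point above that detection by itself does not give you contact points); the Vec3 and Obb classes here are illustrative:

```java
// Minimal 3D vector helper for the SAT sketch.
final class Vec3 {
    final double x, y, z;
    Vec3(double x, double y, double z) { this.x = x; this.y = y; this.z = z; }
    double dot(Vec3 o)  { return x * o.x + y * o.y + z * o.z; }
    Vec3 cross(Vec3 o)  { return new Vec3(y * o.z - z * o.y, z * o.x - x * o.z, x * o.y - y * o.x); }
    Vec3 minus(Vec3 o)  { return new Vec3(x - o.x, y - o.y, z - o.z); }
    double length()     { return Math.sqrt(dot(this)); }
}

// Oriented box: center, three orthonormal local axes, three half-extents.
final class Obb {
    Vec3 center;
    Vec3[] axes = new Vec3[3];
    double[] halfExtents = new double[3];

    // Half-length of the box's projection onto an axis (the axis need not be unit
    // length; both sides of the SAT comparison scale by the same factor).
    double projectedRadius(Vec3 axis) {
        return halfExtents[0] * Math.abs(axes[0].dot(axis))
             + halfExtents[1] * Math.abs(axes[1].dot(axis))
             + halfExtents[2] * Math.abs(axes[2].dot(axis));
    }
}

final class Sat {
    // True if the boxes overlap: no separating axis exists among the 3 + 3 face
    // normals and the 9 edge-edge cross products.
    static boolean intersects(Obb a, Obb b) {
        Vec3 d = b.center.minus(a.center);
        Vec3[] axes = new Vec3[15];
        int n = 0;
        for (int i = 0; i < 3; i++) axes[n++] = a.axes[i];
        for (int i = 0; i < 3; i++) axes[n++] = b.axes[i];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                axes[n++] = a.axes[i].cross(b.axes[j]);

        for (Vec3 axis : axes) {
            if (axis.length() < 1e-9) continue;   // parallel edges give a degenerate axis
            if (Math.abs(d.dot(axis)) > a.projectedRadius(axis) + b.projectedRadius(axis)) {
                return false;                     // found a separating axis
            }
        }
        return true;
    }
}
```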
You should take a look at the field of computer graphics; it has many techniques for this, e.g. the Weiler–Atherton clipping algorithm. There are also many data structures that could ease the process for you, such as AABBs (axis-aligned bounding boxes).
Try using the separating axis theorem. It should apply in 3D as it does in 2D.
If you create polygons from the sides of the cubes, then another approach is to use Constructive Solid Geometry (CSG) operations on them. By building a Binary Space Partitioning (BSP) tree of each cube you can perform an intersection on them. The result of the intersection is a set of polygons representing the intersection. In your case, if the number of polygons is zero then the cubes don't intersect.
I would add that this approach is probably not a good real time solution, but you didn't indicate if this needed to happen in frame refresh time or not.
Since porting is an option, you can look at the JavaScript library that does CSG located at
http://evanw.github.io/csg.js/docs/
I've ported this library to C# at
https://github.com/johnmott59/CGSinCSharp

A data-structure for efficiently indexing thousands of moving points?

Situation:
I have potentially tens of thousands of moving (2D) points. They affect each other only within a certain radius. They can move from place to place (not teleporting, just flying around the screen, essentially).
Since I have to check for updates every tick, it is rather important to do this efficiently.
My naive solution is to simply create a grid type structure with grid spacing somewhere around the radius of effect and as points move from cell to cell, update which cell they are in. So when I need to do effects checking, I only have to check a point's cell and a few neighboring cells.
I am familiar with quadtree, but I worry that it is a bit more expensive than what I need to do, but I am open to suggestions if this is indeed the correct route.
Also, for added information, this is in Java.
Thanks
I was, and for my next project will be, in a similar situation.
I chose the simple grid variant, because it's simpler and faster to implement.
Tens of thousands is at the border where a quadtree or k-d tree could make sense (especially when many cells would be empty).
You should test whether a grid approach is sufficient. It probably is.
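A minimal sketch of that grid variant in Java (illustrative names; the cell size is chosen equal to the interaction radius, so the 3x3 block of cells around a point covers every possible neighbour):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: points live in buckets keyed by their cell; moving a point only touches
// two buckets, and a neighbour query only scans the 3x3 block of cells around it.
final class SpatialGrid<T> {
    private final double cellSize;                     // roughly the interaction radius
    private final Map<Long, Set<T>> buckets = new HashMap<>();
    private final Map<T, Long> cellOf = new HashMap<>();

    SpatialGrid(double cellSize) { this.cellSize = cellSize; }

    private long key(long cx, long cy) { return (cx << 32) ^ (cy & 0xffffffffL); }

    void addOrMove(T point, double x, double y) {
        long newKey = key((long) Math.floor(x / cellSize), (long) Math.floor(y / cellSize));
        Long oldKey = cellOf.get(point);
        if (oldKey != null) {
            if (oldKey == newKey) return;              // still in the same cell
            buckets.get(oldKey).remove(point);
        }
        buckets.computeIfAbsent(newKey, k -> new HashSet<>()).add(point);
        cellOf.put(point, newKey);
    }

    // All points in the 3x3 block of cells around (x, y); callers still apply the
    // exact radius check themselves.
    List<T> nearby(double x, double y) {
        List<T> result = new ArrayList<>();
        long cx = (long) Math.floor(x / cellSize);
        long cy = (long) Math.floor(y / cellSize);
        for (long dx = -1; dx <= 1; dx++) {
            for (long dy = -1; dy <= 1; dy++) {
                Set<T> bucket = buckets.get(key(cx + dx, cy + dy));
                if (bucket != null) result.addAll(bucket);
            }
        }
        return result;
    }
}
```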

Getting boundary information from a 3d array

Hey, I'm currently trying to extract information from a 3d array, where each entry represents a coordinate in order to draw something out of it. The problem is that the array is ridiculously large (and there are several of them) meaning I can't actually draw all of it.
What I'm trying to accomplish then, is just to draw a representation of the outside coordinates, a shell of the array if you'd like. This array is not full, can have large empty spaces with only a few pixels set, or have large clusters of pixel data grouped together. I do not know what kind of shape to expect (could be a simple cube, or a complex concave mesh), and am struggling to come up with an algorithm to effectively extract the border. This array effectively stores a set of points in a 3d space.
I thought of creating six 2D meshes (one for each side of the 3D array), getting the shallowest point they can find for each position, and then drawing them separately. As I said, however, this 3D shape could be concave, which creates problems with this approach. Imagine a cone with a circle on top (said circle bigger than the cone's base). While the top and side meshes would get the correct depth info out of the shape, the bottom mesh would connect the base to the circle through vertical lines, making me effectively lose the conical shape.
Then I thought of analysing the array slice by slice and creating two meshes from the slice data. I believe this should work for any type of shape; however, I'm struggling to find an algorithm which accurately gives me the border info for each slice. Once again, if you just try to create height maps from the slices, you will run into problems if they have any concavities. I also thought of some sort of edge-tracking algorithm, but the array does not provide continuous data, and there is almost certainly not a continuous edge along each slice.
I tried looking into volume rendering, as used in medical imaging and such, as it deals with similar problems to the one I have, but couldn't really find anything that I could use.
If anyone has any experience with this sort of problem, or any valuable input, could you please point me in the right direction.
P.S. I would prefer to get a closed representation of the shell, thus my earlier 2d mesh approach. However, an approach that simply gives me the shell points, without any connection between them, that would still be extremely helpful.
Thank you,
Ze
I would start by reviewing your data structure. As you observed, the array does not maintain any obvious spatial relationships between points. An octree is a pretty good representation for data like you described. Depending upon the complexity of your point set, you may be able to find the crust using just the octree - assuming you have some connectivity between nearby points.
Alternatively, you may then turn to more rigorous algorithms like raycasting or marching cubes.
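If an unconnected shell is enough (the question says shell points without connectivity would already help), a single pass that keeps every filled voxel with at least one empty 6-neighbour gives you exactly that; a minimal sketch, assuming the data can be viewed as a boolean occupancy array:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a voxel belongs to the shell if it is filled and at least one of its six
// face neighbours is empty (or lies outside the array).
final class ShellExtractor {
    private static final int[][] NEIGHBOURS = {
        {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}
    };

    static List<int[]> shell(boolean[][][] filled) {
        List<int[]> result = new ArrayList<>();
        for (int x = 0; x < filled.length; x++) {
            for (int y = 0; y < filled[x].length; y++) {
                for (int z = 0; z < filled[x][y].length; z++) {
                    if (filled[x][y][z] && isExposed(filled, x, y, z)) {
                        result.add(new int[] {x, y, z});
                    }
                }
            }
        }
        return result;
    }

    private static boolean isExposed(boolean[][][] filled, int x, int y, int z) {
        for (int[] n : NEIGHBOURS) {
            int nx = x + n[0], ny = y + n[1], nz = z + n[2];
            if (nx < 0 || ny < 0 || nz < 0
                    || nx >= filled.length || ny >= filled[nx].length || nz >= filled[nx][ny].length) {
                return true;                         // touches the array boundary
            }
            if (!filled[nx][ny][nz]) return true;    // empty neighbour: on the shell
        }
        return false;
    }
}
```

Connecting those shell points into a closed surface is where the octree crust or marching cubes approaches mentioned above and in the next answer come in.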
I guess it's a bit late by now to be truly useful to you, but for reference I'd say this is a perfect scenario for volumetric modeling (as you guessed yourself). As long as you know the bounding box of your point cloud, you can map these coordinates to a voxel space and increase the density (value) of each voxel for each data point. Once you have your volume fully defined, you can then use the marching cubes algorithm to produce a 3D surface mesh for a given threshold value (iso value). The resulting surface doesn't need to be continuous, but it will wrap all voxels with values > isovalue inside. The 2D equivalent is a heatmap. You can refine the surface quality by adjusting the iso threshold (higher means tighter) and the voxel resolution.
Since you're using Java, you might like to take a look at my toxiclibs volumeutils library, which also comes with several examples (for Processing) showing the general approach.
"Imagine a cone with a circle on top (said circle bigger than the cone's base). While the top and side meshes would get the correct depth info out of the shape, the bottom mesh would connect the base to the circle through vertical lines, making me effectively lose the conical shape."
Even an example as simple as this would be impossible to reconstruct manually, let alone algorithmically. The possibility of your data representing a cylinder with a cone shaped hole is as likely as the vertices representing a cone with a disk attached to the top.
"I do not know what kind of shape to expect (could be a simple cube..."
Again, without further information on how the data was generated, 8 vertices arranged in the form of a cube might as well represent 2 crossed squares. If you knew that the data was generated by, for example, a rotating 3d scanner of some sort then that would at least be a start.
