Quadtree Removal - java

I am writing a removal method for a quad tree.
Now when you remove an item in a node, you will need to check its siblings to see if you need to collapse the nodes and merge them into one.
For checking the siblings, should I store a pointer to the parent node, or is there a way to do this recursively and better?
Thanks

For removal in a quadtree you'll basically need to do the following:
1. Find the object's leaf, then remove it from the list of the node that contains it (the node that holds the leaves).
2. Check whether the removal of the leaf leaves the node empty; if it does, remove the node itself.
3. Check whether the surrounding nodes are empty as well, and if so, collapse this node into the parent by "unsubdividing" (this can get recursively tricky to do). The trick is to just check whether the adjacent nodes have anything in them. If not, you're safe to throw the whole quadrant away and step up one level. Doing this recursively will collapse the tree back up to where an adjacent node with a leaf exists.
After step 1, you're basically done. If you want to save memory and keep the tree efficient, then you should do steps 2 and 3 as well.
And yes, you should retain a parent node reference to make reverse traversal efficient.
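
The steps above can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the question: a point quadtree over a square region, a hypothetical class `QuadNode`, a leaf capacity of 1, and a collapse rule that is slightly more general than "all siblings are empty" (four leaf children are merged whenever their combined points fit into one leaf again). The parent reference is what lets `remove` walk back up without recursion.

```java
import java.util.ArrayList;
import java.util.List;

/** Point quadtree sketch; every node keeps a parent reference so removal can collapse upward. */
final class QuadNode {
    static final int CAPACITY = 1;          // assumed maximum number of points per leaf
    static final double MIN_HALF = 1e-9;    // stop subdividing below this size (guards duplicates)
    final double cx, cy, half;              // square region: centre and half-width
    final QuadNode parent;                  // null for the root
    QuadNode[] children;                    // null while this node is a leaf
    final List<double[]> points = new ArrayList<>();

    QuadNode(double cx, double cy, double half, QuadNode parent) {
        this.cx = cx; this.cy = cy; this.half = half; this.parent = parent;
    }

    boolean isLeaf() { return children == null; }

    /** Picks the quadrant containing (x, y); index order matches subdivide(). */
    private QuadNode childFor(double x, double y) {
        return children[(x >= cx ? 1 : 0) + (y >= cy ? 2 : 0)];
    }

    void insert(double x, double y) {
        if (!isLeaf()) { childFor(x, y).insert(x, y); return; }
        points.add(new double[] {x, y});
        if (points.size() > CAPACITY && half > MIN_HALF) subdivide();
    }

    private void subdivide() {
        double h = half / 2;
        children = new QuadNode[] {
            new QuadNode(cx - h, cy - h, h, this), new QuadNode(cx + h, cy - h, h, this),
            new QuadNode(cx - h, cy + h, h, this), new QuadNode(cx + h, cy + h, h, this)};
        List<double[]> old = new ArrayList<>(points);
        points.clear();
        for (double[] p : old) childFor(p[0], p[1]).insert(p[0], p[1]);
    }

    /** Call on the root. Removes the point(s) equal to (x, y); returns false if none was stored. */
    boolean remove(double x, double y) {
        QuadNode leaf = this;
        while (!leaf.isLeaf()) leaf = leaf.childFor(x, y);            // step 1: find the leaf
        boolean removed = leaf.points.removeIf(p -> p[0] == x && p[1] == y);
        if (removed) {
            // steps 2 and 3: climb via parent references, merging while the siblings allow it
            for (QuadNode n = leaf.parent; n != null && n.tryCollapse(); n = n.parent) { }
        }
        return removed;
    }

    /** Merges four leaf children back into this node if their points fit into one leaf. */
    private boolean tryCollapse() {
        int total = 0;
        for (QuadNode c : children) {
            if (!c.isLeaf()) return false;       // a sibling is still subdivided: stop here
            total += c.points.size();
        }
        if (total > CAPACITY) return false;
        for (QuadNode c : children) points.addAll(c.points);
        children = null;                         // "unsubdivide"
        return true;
    }
}
```

Without the `parent` field the same collapse can be done on the way back out of a recursive `remove`, at the cost of re-checking every node on the path even when nothing changed.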

Split a tree into a forest using jgrapht

I have a tree represented with the library jgrapht. There are various types of nodes, and I need to cut out every subtree starting from a particular node type.
As you can see in this example, this tree represents the source code of a Java class. I need to create multiple jgrapht objects by splitting the main tree at each node of type "Entry". In total I should get 7 trees from this big one. The structure I use is a DirectedPseudograph.
Although I'm not 100% clear about what you want, it seems there are various solution approaches.
Starting from every outgoing neighbor of the root node, you could run a depth-first search and record the nodes returned. The nodes reachable by the DFS algorithm belong to the same subtree. For this you can use the DepthFirstIterator.
You could create a subgraph without the root node, for instance by using the AsSubgraph class. You can then invoke the ConnectivityInspector on the resulting induced subgraph. Since every subtree is a disconnected graph component, the connectivity inspector will be able to find each of these components.
Btw, unless you need the capabilities of a Pseudograph, for performance it would be better to use the SimpleDirectedGraph. Obviously, the latter does not allow parallel edges or self-loops.
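
As a rough, library-free illustration of the first approach: the sketch below stores the tree as a plain child map instead of a jgrapht graph, so `SubtreeSplitter`, `split`, `reachable` and the `isEntry` predicate are all hypothetical names. It also assumes that an "Entry" nested inside another "Entry" stays part of the outer subtree. With jgrapht, the loop in `reachable` would correspond to iterating a DepthFirstIterator started at the entry vertex.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

final class SubtreeSplitter {

    /** Maps every top-most "entry" vertex to the set of vertices in its subtree. */
    static Map<String, Set<String>> split(Map<String, List<String>> children,
                                          String root,
                                          Predicate<String> isEntry) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String v = pending.pop();
            if (isEntry.test(v)) {
                result.put(v, reachable(children, v));   // cut here, do not look deeper
            } else {
                for (String c : children.getOrDefault(v, List.of())) pending.push(c);
            }
        }
        return result;
    }

    /** Depth-first search collecting every vertex reachable from start (start included). */
    static Set<String> reachable(Map<String, List<String>> children, String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String v = stack.pop();
            if (seen.add(v)) {
                for (String c : children.getOrDefault(v, List.of())) stack.push(c);
            }
        }
        return seen;
    }
}
```

Each returned vertex set could then be turned into its own graph object, for example by building an induced subgraph over that set.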

Algorithm to remove locations from a set which lie within a distance

I am looking for a solution where I have a set of locations which have some priority.
I want to remove lower-priority locations such that no remaining location lies within a particular distance (say 100 meters) of any other remaining location.
A k-d tree sounds well-suited to this problem.
If you're removing the vast majority of the points, it might make the most sense to start from the point with the highest priority and, for each point, do something similar to a nearest-neighbour search in the tree (stopping as soon as we find a point within the given distance) to check whether to insert the point.
You may want to try to find a self-balancing variant or occasionally rebalance the tree during this process, as unbalanced trees lead to slow operations.
If a significant portion of the points will remain, it might be better to insert all the points into the tree to start and do modified nearest-neighbour search (ignoring the point itself, bounded by distance) starting from the lowest priority and removing relevant points as we go.
Using appropriate construction techniques, you can construct a balanced tree from the start.
Insertion and deletion take O(log n) in a balanced tree (a simple approach to deletion is to just set a "deleted" flag in the node, but this doesn't ever make the tree smaller) and O(n) in an unbalanced tree. Nearest-neighbour search is similar, although it might take up to O(n) even for balanced trees, but this is a worst case; on average it should be closer to O(log n).
The k-d tree is a binary tree in which every node is a k-dimensional point. Every non-leaf node can be thought of as implicitly generating a splitting hyperplane that divides the space into two parts, known as half-spaces. Points to the left of this hyperplane are represented by the left subtree of that node and points right of the hyperplane are represented by the right subtree. The hyperplane direction is chosen in the following way: every node in the tree is associated with one of the k-dimensions, with the hyperplane perpendicular to that dimension's axis. So, for example, if for a particular split the "x" axis is chosen, all points in the subtree with a smaller "x" value than the node will appear in the left subtree and all points with larger "x" value will be in the right subtree. In such a case, the hyperplane would be set by the x-value of the point, and its normal would be the unit x-axis.
Searching for a nearest neighbour in a k-d tree proceeds as follows:
1. Starting with the root node, the algorithm moves down the tree recursively, in the same way that it would if the search point were being inserted (i.e. it goes left or right depending on whether the point is less than or greater than the current node in the split dimension).
2. Once the algorithm reaches a leaf node, it saves that node point as the "current best".
3. The algorithm unwinds the recursion of the tree, performing the following steps at each node:
   - If the current node is closer than the current best, then it becomes the current best.
   - The algorithm checks whether there could be any points on the other side of the splitting plane that are closer to the search point than the current best. In concept, this is done by intersecting the splitting hyperplane with a hypersphere around the search point that has a radius equal to the current nearest distance. Since the hyperplanes are all axis-aligned, this is implemented as a simple comparison to see whether the distance between the splitting coordinate of the search point and the current node is less than the distance (over all coordinates) from the search point to the current best.
   - If the hypersphere crosses the plane, there could be nearer points on the other side of the plane, so the algorithm must move down the other branch of the tree from the current node looking for closer points, following the same recursive process as the entire search.
   - If the hypersphere doesn't intersect the splitting plane, then the algorithm continues walking up the tree, and the entire branch on the other side of that node is eliminated.
4. When the algorithm finishes this process for the root node, the search is complete.
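
A minimal sketch of the "insert from highest priority down" variant, under assumptions that are not in the question: locations are given as planar x/y coordinates in metres (not latitude/longitude), the tree is a plain 2-d tree without any rebalancing, and `Location`, `KdFilter`, `filter` and `anyWithin` are hypothetical names. Instead of a full nearest-neighbour search, `anyWithin` stops as soon as it finds any point inside the radius, and it uses the same splitting-plane test described above to skip the far branch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class Location {
    final double x, y;      // assumed planar coordinates in metres
    final int priority;     // higher value = more important
    Location(double x, double y, int priority) { this.x = x; this.y = y; this.priority = priority; }
}

final class KdFilter {
    private static final class Node {
        final Location loc; Node left, right;
        Node(Location loc) { this.loc = loc; }
    }

    private Node root;

    /** Split on x at even depths and on y at odd depths. */
    private static double coord(Location p, int depth) { return depth % 2 == 0 ? p.x : p.y; }

    private void insert(Location p) {
        if (root == null) { root = new Node(p); return; }
        Node n = root;
        for (int depth = 0; ; depth++) {
            boolean goLeft = coord(p, depth) < coord(n.loc, depth);
            Node next = goLeft ? n.left : n.right;
            if (next == null) {
                if (goLeft) n.left = new Node(p); else n.right = new Node(p);
                return;
            }
            n = next;
        }
    }

    /** True if some stored location lies within distance r of q. */
    private boolean anyWithin(Node n, Location q, double r, int depth) {
        if (n == null) return false;
        double dx = q.x - n.loc.x, dy = q.y - n.loc.y;
        if (dx * dx + dy * dy <= r * r) return true;
        double diff = coord(q, depth) - coord(n.loc, depth);
        Node near = diff < 0 ? n.left : n.right;
        Node far = diff < 0 ? n.right : n.left;
        if (anyWithin(near, q, r, depth + 1)) return true;
        // Only cross the splitting plane if the search circle reaches it.
        return Math.abs(diff) <= r && anyWithin(far, q, r, depth + 1);
    }

    /** Keeps locations in descending priority order, dropping any that is too close to a kept one. */
    static List<Location> filter(List<Location> all, double minDist) {
        List<Location> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingInt((Location l) -> l.priority).reversed());
        KdFilter tree = new KdFilter();
        List<Location> kept = new ArrayList<>();
        for (Location l : sorted) {
            if (!tree.anyWithin(tree.root, l, minDist, 0)) {
                tree.insert(l);
                kept.add(l);
            }
        }
        return kept;
    }
}
```

Because nothing here rebalances the tree, already-sorted input can degrade it towards a list; shuffling locations of equal priority or rebuilding the tree occasionally would be one way to soften that.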
I'm assuming you meant to remove locations, starting with those with lowest priority, within a certain distance to another location.
You could use a quad tree to represent the relative locations. The idea is that you construct a tree where each node has four children. Each child represents a quadrant and as you add each location, you traverse the tree. If you hit a childless node, then you create new child nodes representing once again that same area divided into four regions until each location is in its own region.
Once you've created this tree, you can reduce the sample sizes for each distance check between locations to only those within a certain depth and only locations within the nearby quadrants.
This method, however, is approximate: it doesn't guarantee that you check pairs of locations that sit close together on opposite sides of a major quadrant boundary. It does, however, let you efficiently distance-check most nearby locations, which reduces the run time. To get around the boundary problem, you can create multiple quad trees from the same data with slight offsets, but of course each additional quad tree multiplies the run time, since you will build and check each tree individually.
Does this answer your question?

Given a large tree structure, is there an efficient algorithm to do querying or filtering on the tree?

Let's say I wanted all nodes whose parent(s) matched some certain condition.
Is there an accepted way of doing this other than inspecting each node and building a results object full of either nodes or subtrees?
If the tree is not already sorted or indexed in some way based on the search condition, then you cannot prune the tree traversal (i.e. you cannot decide to not take the right child at some particular node, for instance). Therefore, you have no choice but to traverse the entire tree.
That's pretty much it. You simply have to access each node to see whether it matches the criteria.
But there are some ways to speed it up:
Use an index. If you are repeatedly querying the same property, it might be beneficial to create an index on that property and use it for searching. This could speed up your code immensely. Doing so is not free though: you need to calculate the index up front, update it every time you update the tree, and you need more memory to keep it.
If you have a multi-core machine, you can process individual subtrees in parallel by using separate threads.
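
For the unindexed case, a full traversal might look like the sketch below. `TreeItem` and `TreeQuery` are hypothetical types, and "parent(s)" is read here as the direct parent only. An explicit stack is used instead of recursion so that a very deep tree cannot overflow the call stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical n-ary tree node used only for this illustration. */
final class TreeItem<T> {
    final T value;
    final List<TreeItem<T>> children = new ArrayList<>();
    TreeItem(T value) { this.value = value; }

    /** Creates a child holding the given value and returns it. */
    TreeItem<T> add(T childValue) {
        TreeItem<T> child = new TreeItem<>(childValue);
        children.add(child);
        return child;
    }
}

final class TreeQuery {
    /** Returns, in pre-order, the values of all nodes whose direct parent satisfies the condition. */
    static <T> List<T> childrenOfMatching(TreeItem<T> root, Predicate<? super T> parentCondition) {
        List<T> result = new ArrayList<>();
        Deque<TreeItem<T>> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            TreeItem<T> n = stack.pop();
            if (parentCondition.test(n.value)) {
                for (TreeItem<T> c : n.children) result.add(c.value);
            }
            // Push in reverse so children are visited left to right.
            for (int i = n.children.size() - 1; i >= 0; i--) stack.push(n.children.get(i));
        }
        return result;
    }
}
```

Every node is still visited exactly once, so this is O(n); only an index or a sorted structure would let the traversal skip subtrees.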

external node of binary tree

Hi
I have written code similar to merge sort, for which we can draw a binary decision tree. But when I want to merge the elements, I do not need the external nodes that have just one element in them. What should I do with them? Should I return them?
I'd say you remove the parent because your tree at that point degrades to a list. If the action at the leaf depends on data at the parent then there's not much you can do unless you can merge actions into one.

Java TreeNode: How to prevent getChildCount from doing expensive operation?

I'm writing a Java Tree in which tree nodes could have children that take a long time to compute (in this case, it's a file system, where there may be network timeouts that prevent getting a list of files from an attached drive).
The problem I'm finding is this:
getChildCount() is called before the user specifically requests opening a particular branch of the tree. I believe this is done so the JTree knows whether to show a + icon next to the node.
An accurate count of children from getChildCount() would need to perform the potentially expensive operation.
If I fake the value of getChildCount(), the tree only allocates space for that many child nodes before asking for an enumeration of the children. (If I return '1', I'll only see 1 child listed, even though there are more.)
The enumeration of the children can be expensive and time-consuming, I'm okay with that. But I'm not okay with getChildCount() needing to know the exact number of children.
Any way I can work around this?
Added: The other problem is that if one of the nodes represents a floppy drive (how archaic!), the drive will be polled before the user asks for its files; if there's no disk in the drive, this results in a system error.
Update: Unfortunately, implementing the TreeWillExpand listener isn't the solution. That can allow you to veto an expansion, but the number of nodes shown is still restricted by the value returned by TreeNode.getChildCount().
http://java.sun.com/docs/books/tutorial/uiswing/components/tree.html#data
Scroll down a little; there is a tutorial on exactly how to create lazy-loading nodes for the JTree, complete with examples and documentation.
I'm not sure if it's entirely applicable, but I recently worked around problems with a slow tree by pre-computing the answers to methods that would normally require going through the list of children. I only recompute them when children are added or removed or updated. In my case, some of the methods would have had to go recursively down the tree to figure out things like 'how many bytes are stored' for each node.
If you need a lot of access to a particular feature of your data structure that is expensive to compute, it may make sense to pre-compute it.
In the case of TreeNodes, this means that your TreeNodes would have to store their child count. To explain it in a bit more detail: when you create a node n0, this node has a child count (cc) of 0. When you add a node n1 as a child of this one, you increase cc by n1.cc + 1.
The tricky bit is the remove operation. You have to keep backlinks to parents and go up the hierarchy to subtract the cc of your current node.
In case you just want to have a hasChildren feature for your nodes or override getChildCount, a boolean might be enough and would not force you to go up the whole hierarchy in case of removal. Or you could remove the backlinks and just say that you lose precision on remove operations. The TreeNode interface actually doesn't force you to provide a remove operation, but you probably want one anyway.
Well, that's the deal. In order to come up with precomputed precise values, you will have to keep backlinks of some sort. If you don't, you'd better call your method hasHadChildren or the more amusing isVirgin.
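
A small sketch of the precomputed count with backlinks. `CountingNode` is a hypothetical class (it does not implement Swing's TreeNode), it caches the number of descendants as in the cc example above, and it assumes a node has no parent at the time it is added.

```java
import java.util.ArrayList;
import java.util.List;

/** Node that caches how many descendants it has, kept current through parent backlinks. */
final class CountingNode {
    private CountingNode parent;                       // backlink, null for the root
    private final List<CountingNode> children = new ArrayList<>();
    private int descendantCount;                       // cached, never recomputed by traversal

    /** Adds a (currently parentless) child and updates the counts of all ancestors. */
    void add(CountingNode child) {
        children.add(child);
        child.parent = this;
        adjust(child.descendantCount + 1);
    }

    /** Removes a direct child; returns false if it was not a child of this node. */
    boolean remove(CountingNode child) {
        if (!children.remove(child)) return false;
        child.parent = null;
        adjust(-(child.descendantCount + 1));
        return true;
    }

    /** Walks up the backlinks, applying the change to this node and every ancestor. */
    private void adjust(int delta) {
        for (CountingNode n = this; n != null; n = n.parent) n.descendantCount += delta;
    }

    int getChildCount() { return children.size(); }         // direct children only
    int getDescendantCount() { return descendantCount; }    // O(1) thanks to the cache
}
```

Both `add` and `remove` cost time proportional to the depth of the node rather than the size of the subtree, which is the trade the answer describes.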
There are a few parts to the solution:
Like Lorenzo Boccaccia said, use the TreeWillExpandListener
You also need to call nodesWereInserted on the tree model, so the proper number of nodes will be displayed. See this code
I have determined that if you don't know the child count, TreeNode.getChildCount() needs to return at least 1 (it can't return 0)
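
Put together, those parts might look like the following sketch. `LazyNode`, `loadChildren`, `install` and the `lister` callback are hypothetical names, and the single "Loading..." placeholder child is one assumed way of making getChildCount() return at least 1 before the real children are known. The Swing calls used (TreeWillExpandListener, DefaultTreeModel.removeNodeFromParent and nodesWereInserted) are standard API.

```java
import java.util.List;
import java.util.function.Function;
import javax.swing.JTree;
import javax.swing.event.TreeExpansionEvent;
import javax.swing.event.TreeWillExpandListener;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;
import javax.swing.tree.MutableTreeNode;

/** Node that reports one placeholder child until its real children are loaded. */
final class LazyNode extends DefaultMutableTreeNode {
    private boolean loaded;

    LazyNode(Object userObject) {
        super(userObject);
        add(new DefaultMutableTreeNode("Loading..."));   // keeps getChildCount() at 1
    }

    /** Replaces the placeholder with real children and notifies the model; runs only once. */
    void loadChildren(DefaultTreeModel model, List<?> childObjects) {
        if (loaded) return;
        loaded = true;
        model.removeNodeFromParent((MutableTreeNode) getFirstChild());
        int[] indices = new int[childObjects.size()];
        for (int i = 0; i < indices.length; i++) {
            add(new LazyNode(childObjects.get(i)));
            indices[i] = i;
        }
        model.nodesWereInserted(this, indices);          // lets the JTree show every new child
    }

    /** Loads a node's children the first time the user expands it. */
    static void install(JTree tree, DefaultTreeModel model, Function<Object, List<?>> lister) {
        tree.addTreeWillExpandListener(new TreeWillExpandListener() {
            @Override public void treeWillExpand(TreeExpansionEvent e) {
                Object last = e.getPath().getLastPathComponent();
                if (last instanceof LazyNode) {
                    LazyNode node = (LazyNode) last;
                    // The expensive call happens here, not in getChildCount().
                    node.loadChildren(model, lister.apply(node.getUserObject()));
                }
            }
            @Override public void treeWillCollapse(TreeExpansionEvent e) { }
        });
    }
}
```

In this sketch every node, including plain files, shows an expand handle until it has been opened once, and `lister` runs on the event dispatch thread; moving the listing to a background thread would be a natural refinement for slow network drives.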