RectangleTree
The RectangleTree class represents a generic multidimensional space
partitioning tree.  It is heavily templatized to control splitting behavior and
other behaviors, and is the actual class underlying trees such as the
RTree.  In general, the RectangleTree class is not meant to
be used directly, and instead one of the numerous variants should be used
instead:
The RectangleTree and its variants are capable of inserting points and
deleting them.  This is different from BinarySpaceTree
and other mlpack tree types, where the tree is built entirely in batch at
construction time.  However, this capability comes with a runtime cost, and so
in general the use of RectangleTree with mlpack algorithms will be slower than
the batch-construction treesโbut, if insert/delete functionality is required,
RectangleTree is the only choice.
For users who want to use RectangleTree directly or with custom behavior,
the full class is still detailed in the subsections below.  RectangleTree
supports the TreeType API and
can be used with mlpackโs tree-based algorithms, although using custom behavior
may require a template typedef.
- Template parameters
 - Constructors
 - Basic tree properties
 - Bounding distances with the tree
 StatisticTypetemplate parameterSplitTypetemplate parameterDescentTypetemplate parameterAuxiliaryInformationTypetemplate parameter- Tree traversals
 - Example usage
 
๐ See also
RTree- R-Tree on Wikipedia
 - R-Trees: A Dynamic Index Structure for Spatial Searching (pdf)
 - Tree-Independent Dual-Tree Algorithms (pdf)
 
๐ Template parameters
The RectangleTree class takes six template parameters.  The first three of
these are required by the
TreeType API
(see also
this more detailed section). The
full signature of the class is:
template<typename DistanceType,
         typename StatisticType,
         typename MatType,
         typename SplitType,
         typename DescentType,
         template<typename> class AuxiliaryInformationType>
class RectangleTree;
- 
    
DistanceType: the distance metric to use for distance computations.RectangleTreerequires that this isEuclideanDistance, and a compilation error will be thrown if any otherDistanceTypeis specified. StatisticType: this holds auxiliary information in each tree node. By default,EmptyStatisticis used, which holds no information.- See the 
StatisticTypesection for more details. 
- See the 
 - 
    
MatType: the type of matrix used to represent points. Must be a type matching the Armadillo API. By default,arma::matis used, but other types such asarma::fmator similar will work just fine. SplitType: the class defining how an individualRectangleTreenode should be split. By default,RTreeSplitis used.- See the 
SplitTypesection for more details. 
- See the 
 DescentType: the class defining how a child node is chosen for point insertion. By default,RTreeDescentHeuristicis used.- See the 
DescentTypesection for more details. 
- See the 
 AuxiliaryInformationType: holds information specific to the variant of theRectangleTree. By default,NoAuxiliaryInformationis used.
Note that the TreeType API requires trees to have only three template
parameters.  In order to use a RectangleTree with its six template parameters
with an mlpack algorithm that needs a TreeType, it is easiest to define a
template typedef:
template<typename DistanceType, typename StatisticType, typename MatType>
using CustomTree = Rectangle<DistanceType, StatisticType, MatType,
    CustomSplitType, CustomDescentType, CustomAuxiliaryInformationType>
Here, CustomSplitType, CustomDescentType, and
CustomAuxiliaryInformationType are the desired splitting and descent
strategies and auxiliary information type.  This is the way that all
RectangleTree variants (such as RTree) are defined.
๐ Constructors
RectangleTrees are constructed by inserting points in a dataset sequentially.
The dataset is not permuted during the construction process.
node = RectangleTree(data)node = RectangleTree(data, maxLeafSize=20, minLeafSize=8)node = RectangleTree(data, maxLeafSize=20, minLeafSize=8, maxNumChildren=5, minNumChildren=2)- Construct a 
RectangleTreeon the givendatawith the given construction parameters. - Default template parameters are used, meaning that this tree will be a
RTree. - By default, 
datais copied. Avoid a copy by usingstd::move()(e.g.std::move(data)); when doing this,datawill be set to an empty matrix. 
- Construct a 
 
node = RectangleTree<DistanceType, StatisticType, MatType, SplitType, DescentType, AuxiliaryInformationType>(data)node = RectangleTree<DistanceType, StatisticType, MatType, SplitType, DescentType, AuxiliaryInformationType>(data, maxLeafSize=20, minLeafSize=8)node = RectangleTree<DistanceType, StatisticType, MatType, SplitType, DescentType, AuxiliaryInformationType>(data, maxLeafSize=20, minLeafSize=8, maxNumChildren=5, minNumChildren=2)- Construct a 
RectangleTreeon the givendata, using custom template parameters to control the behavior of the tree and the given construction parameters. - By default, 
datais copied. Avoid a copy by usingstd::move()(e.g.std::move(data)); when doing this,datawill be set to an empty matrix. 
- Construct a 
 
node = RectangleTree(dimensionality)- Construct an empty 
RectangleTreewith no children, no points, and default template parameters. - Use 
node.Insert()to insert points into the tree. All points must have dimensionalitydimensionality. 
- Construct an empty 
 
node.Insert(x)- Insert the point 
xinto the tree. xshould have vector type compatible with the chosenMatType; so, for defaultMatType,arma::vecis the expected type.- If a custom 
MatTypeis specified (e.g.arma::fmat), thenxshould have type equivalent to the corresponding column vector type (e.g.arma::fvec). - Due to tree rebalancing, this may change the internal structure of the
tree; so references and pointers to children of 
nodemay become invalid. - Warning: This will throw an exception if 
nodeis not the root of the tree! 
- Insert the point 
 - `node.Delete(i)
    
- Delete the point with index 
ifrom the tree. - The point to be deleted from the tree will be 
node.Dataset().col(i); after deleting, the column will be removed fromnode.Dataset()and all indexes held in all tree nodes will be updated. (Thus, this operation can be expensive!) - Due to tree rebalancing, this may change the internal structure of the
tree; so references and pointers to children of 
nodemay become invalid. - Warning: This will throw an exception if 
nodeis not the root of the tree! 
 - Delete the point with index 
 
Notes:
- 
    
The name
nodeis used here forRectangleTreeobjects instead oftree, because eachRectangleTreeobject is a single node in the tree. The constructor returns the node that is the root of the tree. - 
    
See also the developer documentation on tree constructors.
 
๐ Constructor parameters:
| name | type | description | default | 
|---|---|---|---|
data | 
      MatType | 
      Column-major matrix to build the tree on. | (N/A) | 
maxLeafSize | 
      size_t | 
      Maximum number of points to store in each leaf. | 20 | 
    
minLeafSize | 
      size_t | 
      Minimum number of points to store in each leaf. | 8 | 
    
maxNumChildren | 
      size_t | 
      Maximum number of children allowed in each non-leaf node. | 5 | 
    
minNumChildren | 
      size_t | 
      Minimum number of children in each non-leaf node. | 2 | 
    
dimensionality | 
      size_t | 
      Dimensionality of points to be held in the tree. | (N/A) | 
| ย | ย | ย | ย | 
x | 
      arma::vec | 
      Column vector: point to insert into tree.  Should have type matching the column vector type associated with MatType, and must have node.Dataset().n_rows elements. | 
      (N/A) | 
i | 
      size_t | 
      Index of point in node.Dataset() to delete from node. | 
      (N/A) | 
๐ Basic tree properties
Once a RectangleTree object is constructed, various properties of the tree
can be accessed or inspected.  Many of these functions are required by the
TreeType API.
๐ Navigating the tree
- 
    
node.NumChildren()returns the number of children innode. This is0ifnodeis a leaf, and between the values ofnode.MinNumChildren()andnode.MaxNumChildren()(inclusive) otherwise. - 
    
node.IsLeaf()returns aboolindicating whether or notnodeis a leaf. node.Child(i)returns aRectangleTree&that is theith child.imust be less thannode.NumChildren().- This function should only be called if 
node.NumChildren()is not0(e.g. ifnodeis not a leaf). Note that this returns a validRectangleTree&that can itself be used just like the root node of the tree! 
node.Parent()will return aRectangleTree*that points to the parent ofnode, orNULLifnodeis the root of theRectangleTree.
๐ Accessing members of a tree
node.Bound()will return anHRectBound<DistanceType, ElemType>&object that represents the hyperrectangle bounding box ofnode.ElemTypeis the element type ofMatType; so, if default template parameters are used,ElemTypeisdouble.boundis a hyperrectangle that encloses all the descendant points ofnode. It may be somewhat loose (e.g. points may not be very near the edges).
- 
    
node.Stat()will return aStatisticType&holding the statistics of the node that were computed during tree construction. - 
    
node.Distance()will return aEuclideanDistance&. SinceEuclideanDistancehas no members, this function is not likely to be useful, but it is required by the TreeType API. - 
    
node.AuxiliaryInfo()returns anAuxiliaryInformationType&that holds any auxiliary information required by the node. - 
    
node.MinNumChildren()returns the minimum number of children that the node is required to have as asize_t. If points are deleted such that the number of children falls below this limit, thennodewill become a leaf and the tree will be rebalanced. - 
    
node.MaxNumChildren()returns the maximum number of children that the node is required to have as asize_t. If points are inserted such that the number of children goes above this limit, new nodes will be added and the tree will be rebalanced. - 
    
node.MaxLeafSize()returns the maximum number of points that the node is allowed to hold as asize_t. If the number of points held bynodeexceeds this limit during insertion, thennodewill be split and the tree will be rebalanced. node.MinLeafSize()returns the minimum number of points that the node is allowed to hold as asize_t. If the number of points held bynodegoes under this limit during deletion, thennodewill be deleted (if possible) and the tree will be rebalanced.
See also the developer documentation for basic tree functionality in mlpack.
๐ Accessing data held in a tree
- 
    
node.Dataset()will return aconst MatType&that is an internally-held representation of the dataset the tree was built on. node.NumPoints()returns asize_tindicating the number of points held directly innode.- If 
nodeis not a leaf, this will return0, asRectangleTreeonly holds points directly in its leaves. - If 
nodeis a leaf, then this will return values betweennode.MinLeafSize()andnode.MaxLeafSize()(inclusive). - If the tree has fewer than 
node.MinLeafSize()points total, thennode.NumPoints()will return a value less thannode.MinLeafSize(). 
- If 
 node.Point(i)returns asize_tindicating the index of theiโth point innode.Dataset().imust be in the range[0, node.NumPoints() - 1](inclusive).nodemust be a leaf (as non-leaves do not hold any points).- The 
iโth point innodecan then be accessed asnode.Dataset().col(node.Point(i)). - Accessing the actual 
iโth point itself can be done with, e.g.,node.Dataset().col(node.Point(i)). - Point indices are not necessarily contiguous for 
RectangleTrees; that is,node.Point(i) + 1is not necessarilynode.Point(i + 1). 
node.NumDescendants()returns asize_tindicating the number of points held in all descendant leaves ofnode.- If 
nodeis the root of the tree, thennode.NumDescendants()will be equal tonode.Dataset().n_cols. 
- If 
 node.Descendant(i)returns asize_tindicating the index of theiโth descendant point innode.Dataset().imust be in the range[0, node.NumDescendants() - 1](inclusive).nodedoes not need to be a leaf.- The 
iโth descendant point innodecan then be accessed asnode.Dataset().col(node.Descendant(i)). - Accessing the actual 
iโth descendant itself can be done with, e.g.,node.Dataset().col(node.Descendant(i)). - Descendant point indices are not necessarily contiguous for
RectangleTrees; that is,node.Descendant(i) + 1is not necessarilynode.Descendant(i + 1). 
๐ Accessing computed bound quantities of a tree
The following quantities are cached for each node in a RectangleTree, and so
accessing them does not require any computation.  In the documentation below,
ElemType is the element type of the given MatType; e.g., if MatType is
arma::mat, then ElemType is double.
node.FurthestPointDistance()returns anElemTyperepresenting the distance between the center of the bound ofnodeand the furthest point held bynode.- If 
nodeis not a leaf, this returns 0 (becausenodedoes not hold any points). 
- If 
 - 
    
node.FurthestDescendantDistance()returns anElemTyperepresenting the distance between the center of the bound ofnodeand the furthest descendant point held bynode. - 
    
node.MinimumBoundDistance()returns anElemTyperepresenting the minimum possible distance from the center of the node to any edge of its bound. node.ParentDistance()returns anElemTyperepresenting the distance between the center of the bound ofnodeand the center of the bound of its parent.- If 
nodeis the root of the tree,0is returned. 
- If 
 
Note: for more details on each bound quantity, see the developer documentation on bound quantities for trees.
๐ Other functionality
node.Center(center)computes the center of the hyperrectangle bounding box ofnodeand stores it incenter.centershould be of typearma::Col<ElemType>&, whereElemTypeis the element type of the specifiedMatType.centerwill be set to have size equivalent to the dimensionality of the dataset held bynode.- This is equivalent to calling 
node.Bound().Center(center). 
- A 
RectangleTreecan be serialized withdata::Save()anddata::Load(). 
๐ Bounding distances with the tree
The primary use of trees in mlpack is bounding distances to points or other tree nodes. The following functions can be used for these tasks.
node.GetNearestChild(point)node.GetFurthestChild(point)- Return a 
size_tindicating the index of the child that is closest to (or furthest from)point, with respect to theMinDistance()(orMaxDistance()) function. - If there is a tie, the node with the lowest index is returned.
 - If 
nodeis a leaf,0is returned. pointshould be a column vector type of the same type asMatType. (e.g., ifMatTypeisarma::mat, thenpointshould be anarma::vec.)
- Return a 
 node.GetNearestChild(other)node.GetFurthestChild(other)- Return a 
size_tindicating the index of the child that is closest to (or furthest from) theRectangleTreenodeother, with respect to theMinDistance()(orMaxDistance()) function. - If there is a tie, the node with the lowest index is returned.
 - If 
nodeis a leaf,0is returned. 
- Return a 
 
node.MinDistance(point)node.MinDistance(other)- Return a 
doubleindicating the minimum possible distance betweennodeandpoint, or theRectangleTreenodeother. - This is equivalent to the minimum possible distance between any point
contained in the bounding hyperrectangle of 
nodeandpoint, or between any point contained in the bounding hyperrectangle ofnodeand any point contained in the bounding hyperrectangle ofother. pointshould be a column vector type of the same type asMatType. (e.g., ifMatTypeisarma::mat, thenpointshould be anarma::vec.)
- Return a 
 node.MaxDistance(point)node.MaxDistance(other)- Return a 
doubleindicating the maximum possible distance betweennodeandpoint, or theRectangleTreenodeother. - This is equivalent to the maximum possible distance between any point
contained in the bounding hyperrectangle of 
nodeandpoint, or between any point contained in the bounding hyperrectangle ofnodeand any point contained in the bounding hyperrectangle ofother. pointshould be a column vector type of the same type asMatType. (e.g., ifMatTypeisarma::mat, thenpointshould be anarma::vec.)
- Return a 
 node.RangeDistance(point)node.RangeDistance(other)- Return a 
RangeType<ElemType>whose lower bound isnode.MinDistance(point)ornode.MinDistance(other), and whose upper bound isnode.MaxDistance(point)ornode.MaxDistance(other). ElemTypeis the element type ofMatType.pointshould be a column vector type of the same type asMatType. (e.g., ifMatTypeisarma::mat, thenpointshould be anarma::vec.)
- Return a 
 
๐ Tree traversals
Like every mlpack tree, the RectangleTree class provides a single-tree and
dual-tree traversal that can be paired
with a RuleType class to implement a
single-tree or dual-tree algorithm.
RectangleTree::SingleTreeTraverser- Implements a depth-first single-tree traverser.
 
RectangleTree::DualTreeTraverser- Implements a dual-depth-first dual-tree traverser.
 
๐ StatisticType
Each node in a RectangleTree holds an instance of the StatisticType class.
This class can be used to store additional bounding information or other cached
quantities that a RectangleTree does not already compute.
mlpack provides a few existing StatisticType classes, and a custom
StatisticType can also be easily implemented:
EmptyStatistic: an empty statistic class that does not hold any information- Custom 
StatisticTypes: implement a fully customStatisticType 
Note: this section is still under constructionโnot all statistic types are documented yet.
๐ EmptyStatistic
The EmptyStatistic class is an empty placeholder class that is used as the
default StatisticType template parameter for mlpack trees.
The class does not hold any members and provides no functionality. See the implementation.
๐ Custom StatisticTypes
A custom StatisticType is trivial to implement.  Only a default constructor
and a constructor taking a RectangleTree is necessary.
class CustomStatistic
{
 public:
  // Default constructor required by the StatisticType policy.
  CustomStatistic();
  // Construct a CustomStatistic for the given fully-constructed
  // `RectangleTree` node.  Here we have templatized the tree type to make it
  // easy to handle any type of `RectangleTree`.
  template<typename TreeType>
  StatisticType(TreeType& node);
  //
  // Adding any additional precomputed bound quantities can be done; these
  // quantities should be computed in the constructor.  They can then be
  // accessed from the tree with `node.Stat()`.
  //
};
Example: suppose we wanted to know, for each node, the exact time at which it
was created.  A StatisticType could be created that has a
std::time_t member,
whose value is computed in the constructor.
๐ SplitType
The SplitType template parameter controls the algorithm used to split each
node of a RectangleTree while building.  The splitting strategy used can be
entirely arbitraryโthe SplitType simply needs to split a leaf node and a
non-leaf node into children.
mlpack provides several drop-in choices for SplitType, and it is also possible
to write a fully custom split:
RTreeSplit: splits according to a simple binary heuristicRStarTreeSplit: finds the best possible binary split that minimizes the volume of the two children and maximizes the margin between themXTreeSplit: an improved splitting strategy that minimizes overlap of sibling nodesRPlusTreeSplit: split by partitioning two nodes along the dimension that minimizes overall node volumeRPlusPlusTreeSplit: split using maximum bounding rectangles to ensure zero overlap between sibling nodesHilbertRTreeSplit<>: use deferred splitting and Z-ordering values of points to decide the split- Custom 
SplitTypes: implement a fully customSplitTypeclass 
Note: this section is still under constructionโnot all split types are documented yet.
๐ RTreeSplit
The RTreeSplit class implements the original R-tree splitting strategy and can
be used with the RectangleTree class.  This is the splitting
strategy used for the RTree class, and is the same strategy
proposed in the original paper
(pdf).  The
strategy works as follows:
- Find the two furthest-apart points (or children if the node is not a leaf).
 - Create two children with each point (or child) as the only point (or child).
 - Iteratively add each remaining point (or child) to the new child whose hyperrectangle bound volume increases the least.
 
For implementation details, see the source code.
๐ RStarTreeSplit
The RStarTreeSplit class implements the improved R*-tree splitting strategy
and can be used with the RectangleTree class.  This is the
splitting strategy used for the RStarTree class, and is the
strategy proposed in the
R*-tree paper (pdf).  The
strategy computes, for each possible binary split in each dimension,
- The combined volume of the two child nodes,
 - The size of the margin between the two child nodes, and
 - The size of the overlap between the two child nodes.
 
The split that minimizes the combined volume and maximizes the overlap is chosen.
In addition, the RStarTreeSplit will sometimes perform forced reinsertion,
where points are removed from a node during the splitting process and reinserted
into the tree.  This can help decrease the overlap between adjacent nodes in the
tree, which in turn improves the quality of the tree for search and other tasks.
For implementation details, see the source code.
๐ XTreeSplit
The XTreeSplit class implements the improved splitting strategy for the
XTree as described in the
X-tree paper (pdf).  This strategy is
an improved version of the standard RTreeSplit, where the
overlap of sibling nodes is minimized.
When overlap cannot be prevented, XTreeSplit will instead create
โsuper-nodesโ with more children than typically allowed.  The split is then
deferred until a later time when overlap can be more effectively avoided.
For implementation details, see the source code.
๐ RPlusTreeSplit
The RPlusTreeSplit class implements the splitting policy of the
R+-tree.  The strategy splits nodes (leaves and non-leaves) by
partitioning along the dimension that results in the two children with minimum
volume, similar to the kd-tree.
For implementation details, see
the source code.
Note that RPlusTreeSplit is a template typedef of the general
RPlusTreeSplitType<> class.
๐ RPlusPlusTreeSplit
The RPlusPlusTreeSplit class implements the splitting policy of the
R++-tree.  This class can only be used in a tree that
uses RPlusPlusTreeAuxiliaryInformation
as the AuxiliaryInformationType.
The splitting strategy splits leaf nodes along an arbitrarily-chosen dimension, and splits non-leaf nodes along the dimension that minimizes the number of descendant nodes that also must be split along that dimension.
For implementation details, see
the source code.
Note that RPlusPlusTreeSplit is a template typedef of the general
RPlusTreeSplitType<> class.
๐ HilbertRTreeSplit<>
The HilbertRTreeSplit<> class is an implementation of the
HilbertRTree splitting strategy.  This strategy, proposed
in the original paper (pdf), has two
main differences from the standard RTreeSplit strategy:
- The idea of space-filling curves is used to order points for insertion.
 - Instead of one node splitting into two, the 
HilbertRTreeSplit<>class defers splitting, and re-splits a group of two nodes into three nodes.- Note: this behavior is configurable, see below.
 
 
When inserting a point, one cooperating sibling node is found. If both the node and its cooperating sibling are full, then all points in the two nodes as well as the point being inserted are ordered by Z-ordering value (also known as Morton ordering), and split evenly into three nodes.
Notes:
- 
    
HilbertRTreeSplit<>has one template parameter, which controls the number of sibling nodes to split. This is why the class must be specified asHilbertRTreeSplit<>and notHilbertRTreeSplit. - 
    
By default,
HilbertRTreeSplit<>splits two sibling nodes into three new nodes; but this is configurable:HilbertRTreeSplit<N>will splitNsibling nodes intoN + 1new nodes. - 
    
The concept of splitting based on Z-ordering is also used in the
UBTreeSplitstrategy for theUBTree, a variant of theBinarySpaceTreeclass. 
๐ Custom SplitTypes
Custom split strategies for a RectangleTree can be implemented via the
SplitType template parameter.  By default, the RTreeSplit
splitting strategy is used, but it is also possible to implement and use a
custom SplitType.  Any custom SplitType class must implement the following
signature:
class SplitType
{
 public:
  // Given the leaf node `tree`, split into multiple nodes.  `TreeType` will be
  // the relevant `RectangleTree` type.  `tree` should be modified directly.
  //
  // `relevels` is an auxiliary array used by some splitting strategies, such as
  // the `RStarTreeSplit`, to indicate whether a node needs to be reinserted
  // into the tree.
  template<typename TreeType>
  static void SplitLeafNode(TreeType* tree, std::vector<bool>& relevels);
  // Given the non-leaf node `tree`, split into multiple nodes.  `TreeType` will
  // be the relevant `RectangleTree` type.  `tree` should be modified directly.
  //
  // `relevels` is an auxiliary array used by some splitting strategies, such as
  // the `RStarTreeSplit`, to indicate whether a node needs to be reinserted
  // into the tree.
  template<typename TreeType>
  static void SplitNonLeafNode(TreeType* tree, std::vector<bool>& relevels);
};
๐ DescentType
The DescentType template parameter controls the algorithm used to assign child
points and child nodes to nodes in a RectangleTree.  The strategy used can be
arbitrary: the DescentType simply needs to return an index of a child to
insert a point or node into.
mlpack provides several drop-in choices for DescentType, and it is also
possible to write a fully custom split:
RTreeDescentHeuristic: selects the closest child, which is the child whose volume will increase the least.RStarTreeDescentHeuristic: selects a child such that overlap is minimized and volume increase is minimizedRPlusTreeDescentHeuristic: selects a child that does not cause overlap; if not possible, creates a new childRPlusPlusTreeDescentHeuristic: selects the child whose maximum bounding box contains the point such that overlap is minimized and volume increase is minimized.HilbertRTreeDescentHeuristic: select the first child with minimum Z-order value greater than the point or node to be inserted.- Custom 
SplitTypes: implement a fully customSplitTypeclass 
Note: this section is still under constructionโnot all split types are documented yet.
๐ RTreeDescentHeuristic
The RTreeDescentHeuristic is the default descent strategy for the
RectangleTree and is used by the RTree.  The strategy is
simple: the child node whose volume will increase the least is chosen as the
child to insert a point or other node into.
For implementation details, see the source code.
๐ RStarTreeDescentHeuristic
The RStarTreeDescentHeuristic is a descent strategy for the
RectangleTree and is used by the
RStarTree.  The heuristic will always prefer to insert a
point or node into a child node whose hyperrectangle bound already contains the
point or node to be inserted.
When inserting a point or node into a node whose children are leaves, the strategy will choose to insert into the child where the overall overlap of childrenโs volumes after insertion is minimized.
When inserting a point or node into a node whose children are not leaves, the strategy will choose to insert into the child whose volume is the smallest after insertion.
For implementation details, see the source code.
๐ RPlusTreeDescentHeuristic
The RPlusTreeDescentHeuristic is the descent strategy used by the
RPlusTree.  When determining which node to insert a point
into, the following heuristic is used:
- If the point to be inserted already falls within the bounding hyperrectangle of a child, select that child.
 - If the point to be inserted does not fall within the bounding hyperrectangle of any child, but a childโs volume can be expanded to encompass the point without causing any children to overlap, select that child.
 - If neither of the conditions above are true, insert the point into a new
child node.  This child node will likely be rebalanced or modified later by
RPlusTreeSplit. 
๐ RPlusPlusTreeDescentHeuristic
The RPlusPlusTreeDescentHeuristic is the descent strategy used by the
RPlusPlusTree.  The strategy chooses the child whose
outer bound (held by
RPlusPlusTreeAuxiliaryInformation)
contains the point to be inserted.
For implementation details, see the source code.
๐ HilbertRTreeDescentHeuristic
The HilbertRTreeDescentHeuristic is the descent strategy used by the
HilbertRTree.  The strategy depends on the concept of
Z-ordering (or Morton ordering): the child node whose minimum Z-ordering value
is closest to but greater than the Z-ordering value of the point to be inserted
is chosen.
For implementation details, see the source code.
๐ Custom DescentTypes
Custom descent strategies for a RectangleTree can be implemented via the
DescentType template parameter.  By default, the
RTreeDescentHeuristic descent strategy is used,
but it is also possible to implement and use a custom DescentType.  Any custom
DescentType class must implement the following signature:
class DescentType
{
 public:
  // Return a `size_t` indicating which child of `node` should be chosen to
  // insert `point` in.
  //
  // `TreeType` will be the relevant `RectangleTree` type.
  template<typename TreeType>
  static size_t ChooseDescentNode(const TreeType* node, const size_t point);
  // Return a `size_t` indicating which child of `node` should be chosen to
  // insert `insertedNode` in.
  //
  // `TreeType` will be the relevant `RectangleTree` type.
  template<typename TreeType>
  static size_t ChooseDescentNode(const TreeType* node,
                                  const TreeType* insertedNode);
};
๐ AuxiliaryInformationType
The AuxiliaryInformationType template parameter holds any auxiliary
information required by the SplitType or DescentType strategies.  By
default, the NoAuxiliaryInformation class is used, which holds nothing.
Different variants of RectangleTrees may use other predefined types for their
AuxiliaryInformationTypes:
XTreeAuxiliaryInformation: used for theXTree.RPlusPlusTreeAuxiliaryInformation: used for theRPlusPlusTree.DiscreteHilbertRTreeAuxiliaryInformation: used for theHilbertRTree.
๐ XTreeAuxiliaryInformation
The XTreeAuxiliaryInformation class is the auxiliary information type used by
the XTree class, and is meant to be used with the
XTreeSplit splitting strategy.  It holds information required
to construct super-nodes (a concept specific to X-trees), where splitting is
being deferred.
For implementation details, see the source code.
๐ RPlusPlusTreeAuxiliaryInformation
The RPlusPlusTreeAuxiliaryInformation class is used by the
RPlusPlusTree to store information required for tree
building.  In addition to the regular
HRectBound that is used to maintain the
minimum bounding rectangle of each node, each R++-tree node also maintains an
โouter boundโ that represents the maximum bounding rectangle.  This maximum
bounding rectangle is used for splitting, instead of the minimum bounding
rectangle; this helps prevent overlap in nodes.
For an object auxInfo, the function auxInfo.OuterBound() will return an
HRectBound&.  If the tree was built with a
non-standard MatType, then the type returned will be
HRectBound<EuclideanDistance, ElemType>, where ElemType is the element type
of the given MatType.
For implementation details, see the source code.
๐ DiscreteHilbertRTreeAuxiliaryInformation
The DiscreteHilbertRTreeAuxiliaryInformation class is used by the
HilbertRTree.  It stores the largest Z-ordering value of
any descendant point of a node.  (This can be accessed with the HilbertValue()
method.)
For more details, see the source code.
๐ Custom AuxiliaryInformationTypes
Custom AuxiliaryInformationTypes can be implemented and used with the
AuxiliaryInformationType template parameter.  Any custom
AuxiliaryInformationType class must implement the following signature:
// TreeType will be the type of RectangleTree that the auxiliary information
// type is being used in.
template<typename TreeType>
class CustomAuxiliaryInformationType
{
 public:
  // Default constructor is required.
  CustomAuxiliaryInformationType();
  // Construct the object with a tree node that may not yet be constructed.
  CustomAuxiliaryInformationType(TreeType* node);
  // Construct the object with another object and another tree node, optionally
  // making a 'deep copy' instead of just copying pointers where relevant.
  CustomAuxiliaryInformationType(const CustomAuxiliaryInformationType& other,
                                 TreeType* node,
                                 const bool deepCopy = true);
  // Just before a point is inserted into a node, this is called.
  // `node` is the node that will have `node.Dataset().col(point)` inserted into
  // it.
  //
  // Optionally, this method can manipulate `node`.  If so, `true` should be
  // returned to indicate that `node` was changed.  Otherwise, return `false`
  // and the RectangleTree will perform its default behavior.
  bool HandlePointInsertion(TreeType* node, const size_t point);
  // Just before a child node is inserted into a node, this is called.
  // `node` is the node that will have `nodeToInsert` inserted into it as a
  // child.
  //
  // Optionally, this method can manipulate `node`.  If so, `true` should be
  // returned to indicate that `node` was changed.  Otherwise, return `false`
  // and the RectangleTree will perform its default behavior.
  bool HandleNodeInsertion(TreeType* node,
                           TreeType* nodeToInsert,
                           const bool atMaxDepth);
  // Just before a point is deleted from a node, this is called.
  // `node` is the node that will have `node.Dataset().col(point)` deleted from
  // it.
  //
  // Optionally, this method can manipulate `node`.  If so, `true` should be
  // returned to indicate that `node` was changed.  Otherwise, return `false`
  // and the RectangleTree will perform its default behavior.
  bool HandlePointDeletion(TreeType* node, const size_t point);
  // Just before a child node is deleted from a node, this is called.
  // `node` is the node that will have `node.Child(nodeIndex)` deleted from it.
  //
  // Optionally, this method can manipulate `node`.  If so, `true` should be
  // returned to indicate that `node` was changed.  Otherwise, return `false`
  // and the RectangleTree will perform its default behavior.
  bool HandleNodeRemoval(TreeType* node, const size_t nodeIndex);
  // When `node` is changed, this is called so that the auxiliary information
  // can be updated.  If information needs to be propagated upward, return
  // `true` and then `UpdateAuxiliaryInfo(node->Parent())` will be called.
  bool UpdateAuxiliaryInfo(TreeType* node);
};
๐ Example usage
The RectangleTree class is only really necessary when a custom split type or
custom descent strategy is intended to be used.  For simpler use cases, one of
the typedefs of RectangleTree (such as RTree) will suffice.
For this reason, all of the examples below explicitly specify all six template
parameters of RectangleTree.
Writing a custom splitting strategy,
writing a custom descent strategy,
and writing a custom auxiliary information
type are discussed in the previous sections.
Each of the parameters in the examples below can be trivially changed for
different behavior.
Build a RectangleTree on the cloud dataset and print basic statistics
about the tree.
// See https://datasets.mlpack.org/cloud.csv.
arma::mat dataset;
mlpack::data::Load("cloud.csv", dataset, true);
// Build the rectangle tree with a leaf size of 10.  (This means that leaf nodes
// cannot contain more than 10 points.)
//
// The std::move() means that `dataset` will be empty after this call, and no
// data will be copied during tree building.
mlpack::RectangleTree<mlpack::EuclideanDistance,
                      mlpack::EmptyStatistic,
                      arma::mat,
                      mlpack::RTreeSplit,
                      mlpack::RTreeDescentHeuristic,
                      mlpack::NoAuxiliaryInformation> tree(std::move(dataset));
// Print the bounding box of the root node.
std::cout << "Bounding box of root node:" << std::endl;
for (size_t i = 0; i < tree.Bound().Dim(); ++i)
{
  std::cout << " - Dimension " << i << ": [" << tree.Bound()[i].Lo() << ", "
      << tree.Bound()[i].Hi() << "]." << std::endl;
}
std::cout << std::endl;
// Print the number of children in the root, and the allowable range.
std::cout << "Number of children of root: " << tree.NumChildren()
    << "; allowable range: [" << tree.MinNumChildren() << ", "
    << tree.MaxNumChildren() << "]." << std::endl;
// Print the number of descendant points of the root, and of each of its
// children.
std::cout << "Descendant points of root:        "
    << tree.NumDescendants() << "." << std::endl;
for (size_t i = 0; i < tree.NumChildren(); ++i)
{
  std::cout << "Descendant points of child " << i << ":  "
      << tree.Child(i).NumDescendants() << "." << std::endl;
}
std::cout << std::endl;
// Compute the center of the RectangleTree.
arma::vec center;
tree.Center(center);
std::cout << "Center of tree: " << center.t();
Build two RectangleTrees on subsets of the corel dataset and compute minimum
and maximum distances between different nodes in the tree.
// See https://datasets.mlpack.org/corel-histogram.csv.
arma::mat dataset;
mlpack::data::Load("corel-histogram.csv", dataset, true);
// Convenience typedef for the tree type.
using TreeType = mlpack::RectangleTree<mlpack::EuclideanDistance,
                                       mlpack::EmptyStatistic,
                                       arma::mat,
                                       mlpack::RTreeSplit,
                                       mlpack::RTreeDescentHeuristic,
                                       mlpack::NoAuxiliaryInformation>;
// Build trees on the first half and the second half of points.
TreeType tree1(dataset.cols(0, dataset.n_cols / 2));
TreeType tree2(dataset.cols(dataset.n_cols / 2 + 1, dataset.n_cols - 1));
// Compute the maximum distance between the trees.
std::cout << "Maximum distance between tree root nodes: "
    << tree1.MaxDistance(tree2) << "." << std::endl;
// Get the leftmost grandchild of the first tree's root---if it exists.
if (!tree1.IsLeaf() && !tree1.Child(0).IsLeaf())
{
  TreeType& node1 = tree1.Child(0).Child(0);
  // Get the leftmost grandchild of the second tree's root---if it exists.
  if (!tree2.IsLeaf() && !tree2.Child(0).IsLeaf())
  {
    TreeType& node2 = tree2.Child(0).Child(0);
    // Print the minimum and maximum distance between the nodes.
    mlpack::Range dists = node1.RangeDistance(node2);
    std::cout << "Possible distances between two grandchild nodes: ["
        << dists.Lo() << ", " << dists.Hi() << "]." << std::endl;
    // Print the minimum distance between the first node and the first
    // descendant point of the second node.
    const size_t descendantIndex = node2.Descendant(0);
    const double descendantMinDist =
        node1.MinDistance(node2.Dataset().col(descendantIndex));
    std::cout << "Minimum distance between grandchild node and descendant "
        << "point: " << descendantMinDist << "." << std::endl;
    // Which child of node2 is closer to node1?
    const size_t closestIndex = node2.GetNearestChild(node1);
    std::cout << "Child " << closestIndex << " is closest to node1."
        << std::endl;
    // And which child of node1 is further from node2?
    const size_t furthestIndex = node1.GetFurthestChild(node2);
    std::cout << "Child " << furthestIndex << " is furthest from node2."
        << std::endl;
  }
}
Build a RectangleTree on 32-bit floating point data and save it to disk.
// See https://datasets.mlpack.org/corel-histogram.csv.
arma::fmat dataset;
mlpack::data::Load("corel-histogram.csv", dataset);
// Build the RectangleTree using 32-bit floating point data as the matrix
// type.  We will still use the default EmptyStatistic and EuclideanDistance
// parameters.  A leaf size of 100 is used here.
mlpack::RectangleTree<mlpack::EuclideanDistance,
                      mlpack::EmptyStatistic,
                      arma::fmat,
                      mlpack::RTreeSplit,
                      mlpack::RTreeDescentHeuristic,
                      mlpack::NoAuxiliaryInformation> tree(
    std::move(dataset), 100);
// Save the tree to disk with the name 'tree'.
mlpack::data::Save("tree.bin", "tree", tree);
std::cout << "Saved tree with " << tree.Dataset().n_cols << " points to "
    << "'tree.bin'." << std::endl;
Load a 32-bit floating point RectangleTree from disk, then traverse it
manually and find the number of leaf nodes with less than 10 points.
// This assumes the tree has already been saved to 'tree.bin' (as in the example
// above).
// This convenient typedef saves us a long type name!
using TreeType = mlpack::RectangleTree<mlpack::EuclideanDistance,
                                       mlpack::EmptyStatistic,
                                       arma::fmat,
                                       mlpack::RTreeSplit,
                                       mlpack::RTreeDescentHeuristic,
                                       mlpack::NoAuxiliaryInformation>;
TreeType tree;
mlpack::data::Load("tree.bin", "tree", tree);
std::cout << "Tree loaded with " << tree.NumDescendants() << " points."
    << std::endl;
// Recurse in a depth-first manner.  Count both the total number of leaves, and
// the number of leaves with less than 10 points.
size_t leafCount = 0;
size_t totalLeafCount = 0;
std::stack<TreeType*> stack;
stack.push(&tree);
while (!stack.empty())
{
  TreeType* node = stack.top();
  stack.pop();
  if (node->NumPoints() < 10)
    ++leafCount;
  ++totalLeafCount;
  for (size_t i = 0; i < node->NumChildren(); ++i)
    stack.push(&node->Child(i));
}
// Note that it would be possible to use TreeType::SingleTreeTraverser to
// perform the recursion above, but that is more well-suited for more complex
// tasks that require pruning and other non-trivial behavior; so using a simple
// stack is the better option here.
// Print the results.
std::cout << leafCount << " out of " << totalLeafCount << " leaves have fewer "
    << "than 10 points." << std::endl;
Build a RectangleTree by iteratively inserting points from the corel dataset,
print some information, and then remove a few randomly chosen points.
// See https://datasets.mlpack.org/corel-histogram.csv.
arma::mat dataset;
mlpack::data::Load("corel-histogram.csv", dataset, true);
// This convenient typedef saves us a long type name!
using TreeType = mlpack::RectangleTree<mlpack::EuclideanDistance,
                                       mlpack::EmptyStatistic,
                                       arma::mat,
                                       mlpack::RTreeSplit,
                                       mlpack::RTreeDescentHeuristic,
                                       mlpack::NoAuxiliaryInformation>;
// Create an empty tree of the right dimensionality.
TreeType t(dataset.n_rows);
// Insert points one by one for the first half of the dataset.
for (size_t i = 0; i < dataset.n_cols / 2; ++i)
  t.Insert(dataset.col(i));
std::cout << "After inserting half the points, the root node has "
    << t.NumDescendants() << " descendant points and "
    << t.NumChildren() << " child nodes." << std::endl;
// For the second half, insert the points backwards.
for (size_t i = dataset.n_cols - 1; i >= dataset.n_cols / 2; --i)
  t.Insert(dataset.col(i));
std::cout << "After inserting all the points, the root node has "
    << t.NumDescendants() << " descendant points and "
    << t.NumChildren() << " child nodes." << std::endl;
// Remove three random points.
t.Delete(mlpack::math::RandInt(0, t.NumDescendants()));
std::cout << "After removing 1 point, the root node has " << t.NumDescendants()
    << " descendant points." << std::endl;
t.Delete(mlpack::math::RandInt(0, t.NumDescendants()));
std::cout << "After removing 2 points, the root node has " << t.NumDescendants()
    << " descendant points." << std::endl;
t.Delete(mlpack::math::RandInt(0, t.NumDescendants()));
std::cout << "After removing 3 points, the root node has " << t.NumDescendants()
    << " descendant points." << std::endl;