Spatial indexing mechanism
Index concept
An important research area of raster-vector integrated spatial data structure is how to establish an effective spatial index structure. At present, there are many studies on the index structure of line elements, such as PMR quadtree, band tree and barrel method, while the index structure of surface elements mainly includes quadtree and R tree. These structures have their own application fields and relative advantages, but they also have shortcomings.
Spatial index refers to a data structure arranged in a certain order according to the position and shape of spatial objects or some spatial relationship between spatial objects. It contains the outline information of spatial objects, such as object identification, external rectangle and pointers to spatial object entities. As an assistant spatial data structure, spatial index is between spatial operation algorithm and spatial object. It can improve the speed and efficiency of spatial operation by filtering a large number of spatial objects which are independent of specific spatial operation. The performance of spatial index directly affects the overall performance of spatial database and geographic information system. It is a key technology of spatial database and geographic information system. Generally, large spatial index is a kind of spatial index of data structure which divides the space from top to bottom and divides the space step by step. The representative indexes include BSP tree, K-D-B tree, R tree, R+tree and CELL tree. In addition, the simple grid spatial index has a wide range of applications.
Index type
Grid Spatial Index
The idea of grid spatial index is simple and easy to understand and implement. The basic idea is to divide the research area into equal and different size grids with horizontal and vertical lines, and record the spatial entities contained in each grid. When users query spatial objects, they first compute the grid in which users query objects, and then quickly query the selected spatial entities in the grid, which greatly accelerates the query speed of spatial index.
BSP Tree Spatial Index
BSP tree is a binary tree, which divides the space into two parts step by step (Figure 7-19). BSP tree can be well adapted to the distribution of spatial objects in spatial database, but in general, BSP tree is deep and has adverse effects on various operations.
KDB Tree Spatial Index
KDB tree is a development from B tree to multi-dimensional space. It has good dynamic characteristics for indexing points in multi-dimensional space, and it is convenient to delete and increase spatial point objects. Its disadvantage is that it does not directly support the elements occupying a certain spatial range, such as lines and surfaces in two-dimensional space. This disadvantage can be partially solved by means of spatial mapping or transformation. Spatial mapping or transformation is to transform regions in 2n-dimensional space into points in 2n-dimensional space. In this way, point index structure can be used to index regions, and region query in original space can be transformed into point query in high-dimensional space. However, there are still some shortcomings in spatial mapping or transformation methods: point query in high-dimensional space is much more difficult than point query in original space; after transformation, adjacent areas in original space may become quite far away in point space, which will affect the performance of spatial index.
R tree and R+ tree
R-tree is based on the smallest outsourcing rectangle of the object (Figure 7-20), which can directly index the spatial objects occupying a certain range of space. Each node N of the R tree corresponds to the disk page D (N) and region I (N). If the node is not a leaf node, all the sub-nodes of the node are within the scope of region I (N), and stored in the disk page D (N). If the node is a leaf node, then the disk page D (N) will store a series of sub-regions within the scope of region I (N), and the sub-regions are tight. Around a space object, it is usually an outer rectangle of a space object.
The number of sub-nodes per node in the R-tree is limited. The lower limit guarantees the effective utilization of disk space by index, the number of nodes less than the lower limit will be deleted, and the sub-nodes of the node will be allocated to other nodes. The reason for setting the upper limit is that each node corresponds to only one disk page. If the space required by a node is larger than one disk page, the node will be divided into two new nodes. All subnodes of the incoming node will be assigned to these two new nodes.
Because of the overlap of spatial regions corresponding to R-tree brothers’nodes, R-tree can easily insert and delete. However, because of the overlap between regions, spatial indexing may need to search multiple paths to get the final results, so its spatial search efficiency is low. It is this reason that leads to the emergence of R + tree (Figure 7-21). In R + tree, there is no overlap of spatial regions corresponding to sibling nodes, and the speed of spatial index searching can be greatly improved by the division of regions without overlap. However, the efficiency of insertion and deletion operations is reduced because of the need to ensure that the corresponding spatial regions of sibling nodes do not overlap when inserting and deleting spatial objects.
CELL tree
Considering the difficulty of R tree and R + in insertion, deletion and spatial search efficiency, CELL tree emerged as the times require. It uses convex polygon instead of rectangle as the basic unit of space partition. The partition method is similar to BSP tree, and the subspace is no longer covered by each other. The number of disk accesses of CELL tree is less than that of R tree and R + tree. Because the number of disk accesses is the key index affecting the performance of spatial index, CELL tree is an excellent spatial index method (Figure 7-22).