In the nearest neighbor problem a set P of data points in d-dimensional space is given. These points are preprocessed into a data structure, so that given any query point q, the nearest (or generally k nearest) points of P to q can be reported efficiently. ANN is designed for data sets that are small enough that the search structure can be stored in main memory (in contrast to approaches from databases that assume that the data resides in secondary storage). The distance between two points can be defined in many ways. ANN assumes that distances are measured using any class of distance functions called Minkowski metrics. These include the well known Euclidean distance, Manhattan distance, and max distance.
Nearest Neighbor Search (9) |
Kd-Trees (7) |
Point Location (7) |
Range Search (7) |