An Empirical Study of Distance Metrics for k-Nearest Neighbor Algorithm
Last modified: 2015-02-05
Abstract
This research studies the performance of k-nearest neighbor classification under different distance measures. We compare 11 distance metrics: Euclidean, Standardized Euclidean, Mahalanobis, City block, Minkowski, Chebychev, Cosine, Correlation, Hamming, Jaccard, and Spearman. A series of experiments was performed on eight synthetic datasets with various kinds of distributions. The distance metrics that yielded highly accurate predictions were City block, Chebychev, Euclidean, Mahalanobis, Minkowski, and Standardized Euclidean.
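To illustrate how the choice of distance metric plugs into k-nearest neighbor classification, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy data, and the choice of k = 3 are all assumptions made for illustration, and only three of the 11 metrics are shown.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    """Euclidean distance: the Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def city_block(a, b):
    """City block (Manhattan) distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebychev(a, b):
    """Chebychev distance: largest absolute difference over all coordinates."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, distance):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters (not the paper's datasets).
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

METRICS = {
    "euclidean": euclidean,
    "cityblock": city_block,
    "chebychev": chebychev,
}

# Classify the same query point under each metric.
predictions = {
    name: knn_predict(train_X, train_y, (0.5, 0.5), 3, fn)
    for name, fn in METRICS.items()
}
```

On clusters this well separated, every metric in the sketch gives the same label. Differences between metrics would be expected to show up only on harder distributions, which is the kind of comparison the study reports.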
Keywords
Data Classification; Synthetic Data; Distance Metrics; k-Nearest Neighbors