Abstract
Similarity metrics are widely used in computer graphics. This paper concentrates on a recent metric based on algorithmic complexity, the Normalized Compression Distance (NCD), a universal distance for comparing strings. The measure has already been used in computer graphics for image registration and viewpoint selection. However, no previous study has examined how the measure should be used, in particular which compressor and image format are most suitable. We present a practical study of NCD applied to color images. The questions we try to answer are: Is NCD a suitable metric for image comparison? How robust is it to rotation, translation, and scaling? Which image formats and compression algorithms are the most adequate? Our results show that NCD can be used to address some of the selected image comparison problems, but care must be taken in choosing the compressor and image format.
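For readers unfamiliar with the measure, NCD is commonly defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of s and xy is the concatenation of the two inputs. The following is a minimal sketch of that definition in Python, not the setup used in the study: the function names `ncd` and `pseudo_random_bytes` are illustrative, the compressors are those in the Python standard library, and each image is assumed to be available as a raw byte string (for example, the contents of an uncompressed image file).

```python
import bz2
import hashlib
import lzma
import zlib


def ncd(x: bytes, y: bytes, compress=zlib.compress) -> float:
    """Normalized Compression Distance between two byte strings.

    `compress` is any function mapping bytes to compressed bytes;
    zlib is only a default chosen for this sketch.
    """
    cx = len(compress(x))
    cy = len(compress(y))
    cxy = len(compress(x + y))  # compressed size of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def pseudo_random_bytes(seed: str, blocks: int = 200) -> bytes:
    """Deterministic, hard-to-compress test data (illustrative helper)."""
    return b"".join(
        hashlib.sha256(f"{seed}-{i}".encode()).digest() for i in range(blocks)
    )


if __name__ == "__main__":
    a = pseudo_random_bytes("a")
    b = pseudo_random_bytes("b")
    # Compare how different general-purpose compressors score the same pairs.
    for name, fn in [("zlib", zlib.compress),
                     ("bz2", bz2.compress),
                     ("lzma", lzma.compress)]:
        print(name, "identical:", round(ncd(a, a, fn), 3),
              "unrelated:", round(ncd(a, b, fn), 3))
```

Values near 0 indicate that one input adds little information given the other, and values near 1 indicate unrelated inputs. Because real compressors only approximate the ideal, the result depends on the compressor and on how the image bytes are laid out, which is the dependence the study examines.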
About this article
Cite this article
Vázquez, PP., Marco, J. Using Normalized Compression Distance for image similarity measurement: an experimental study. Vis Comput 28, 1063–1084 (2012). https://doi.org/10.1007/s00371-011-0651-2
DOI: https://doi.org/10.1007/s00371-011-0651-2