A Comparison of Affine Region Detectors

International Journal of Computer Vision

Abstract

The paper gives a snapshot of the state of the art in affine covariant region detectors, and compares their performance on a set of test images under varying imaging conditions. Six types of detectors are included: detectors based on affine normalization around Harris points (Mikolajczyk and Schmid, 2002; Schaffalitzky and Zisserman, 2002) and Hessian points (Mikolajczyk and Schmid, 2002); a detector of ‘maximally stable extremal regions’, proposed by Matas et al. (2002); an edge-based region detector (Tuytelaars and Van Gool, 1999); a detector based on intensity extrema (Tuytelaars and Van Gool, 2000); and a detector of ‘salient regions’, proposed by Kadir, Zisserman and Brady (2004). The performance is measured against changes in viewpoint, scale, illumination, defocus and image compression.

The objective of this paper is also to establish a reference test set of images and performance software, so that future detectors can be evaluated in the same framework.
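
The abstract does not spell out how performance is scored, so the following is only a minimal sketch of one plausible measure, not the paper's software. It assumes each detected region is an ellipse (cx, cy, a, b, c) satisfying a·dx² + 2b·dx·dy + c·dy² ≤ 1, that regions from both images have already been mapped into a common frame, and that two regions correspond when their overlap error is at most a threshold. The function names and the 0.4 threshold are illustrative assumptions.

```python
import math

def inside(ellipse, x, y):
    """True if (x, y) lies inside the ellipse (cx, cy, a, b, c)."""
    cx, cy, a, b, c = ellipse
    dx, dy = x - cx, y - cy
    return a * dx * dx + 2 * b * dx * dy + c * dy * dy <= 1.0

def bounds(ellipse):
    """Axis-aligned bounding box (x0, x1, y0, y1) of the ellipse."""
    cx, cy, a, b, c = ellipse
    det = a * c - b * b
    hx, hy = math.sqrt(c / det), math.sqrt(a / det)
    return cx - hx, cx + hx, cy - hy, cy + hy

def overlap_error(e1, e2, step=0.02):
    """1 - |intersection| / |union|, estimated by sampling a regular grid."""
    b1, b2 = bounds(e1), bounds(e2)
    x0, x1 = min(b1[0], b2[0]), max(b1[1], b2[1])
    y0, y1 = min(b1[2], b2[2]), max(b1[3], b2[3])
    nx = int((x1 - x0) / step) + 1
    ny = int((y1 - y0) / step) + 1
    inter = union = 0
    for i in range(nx):
        x = x0 + (i + 0.5) * step
        for j in range(ny):
            y = y0 + (j + 0.5) * step
            in1, in2 = inside(e1, x, y), inside(e2, x, y)
            inter += in1 and in2
            union += in1 or in2
    return 1.0 - inter / union if union else 1.0

def repeatability(regions_a, regions_b, max_error=0.4):
    """Fraction of the smaller region set that has a counterpart in the other.

    max_error is an assumed, illustrative threshold on the overlap error.
    """
    if not regions_a or not regions_b:
        return 0.0
    small, large = sorted((regions_a, regions_b), key=len)
    matched = sum(
        1 for rs in small
        if any(overlap_error(rs, rl) <= max_error for rl in large)
    )
    return matched / len(small)
```

Calling `repeatability` on the region lists a detector returns for a reference image and a transformed image (after mapping both into one frame) gives a single score per image pair, which is the kind of number a shared test set would let different detectors be compared on.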

References

  • Baumberg, A. 2000. Reliable feature matching across widely separated views. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head Island, South Carolina, USA, pp. 774–781.

  • Brown, M. and Lowe, D. 2003. Recognizing panoramas. In Proceedings of the International Conference on Computer Vision, Nice, France, pp. 1218–1225.

  • Canny, J. 1986. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8: 679–698.

  • Csurka, G., Dance, C., Bray, C., and Fan, L. 2004. Visual categorization with bags of keypoints. In Proceedings Workshop on Statistical Learning in Computer Vision.

  • Dorko, G. and Schmid, C. 2003. Selection of scale invariant neighborhoods for object class recognition. In Proceedings International Conference on Computer Vision, Nice, France, pp. 634–640.

  • Fergus, R., Perona, P., and Zisserman, A. 2003. Object class recognition by unsupervised scale-invariant learning. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA.

  • Ferrari, V., Tuytelaars, T., and Van Gool, L. 2001. Simultaneous object recognition and segmentation by image exploration. In Proceedings European Conference on Computer Vision, Prague, Czech Republic, pp. 40–54.

  • Ferrari, V., Tuytelaars, T., and Van Gool, L. 2005. Simultaneous object recognition and segmentation from single or multiple model views. International Journal of Computer Vision, to appear.

  • Goedeme, T., Tuytelaars, T., and Van Gool, L. 2004. Fast wide baseline matching for visual navigation. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, pp. 24–29.

  • Harris, C. and Stephens, M. 1988. A combined corner and edge detector. In Alvey Vision Conference, pp. 147–151.

  • Hartley, R.I. and Zisserman, A. 2004. Multiple View Geometry in Computer Vision, 2nd edition, Cambridge University Press, ISBN: 0521540518.

  • Kadir, T., Zisserman, A., and Brady, M. 2004. An affine invariant salient region detector. In Proceedings of the 8th European Conference on Computer Vision, Prague, Czech Republic, pp. 345–457.

  • Lazebnik, S., Schmid, C., and Ponce, J. 2003a. A sparse texture representation using affine-invariant regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA, pp. 319–324.

  • Lazebnik, S., Schmid, C., and Ponce, J. 2003b. Affine-invariant local descriptors and neighborhood statistics for texture recognition. In Proceedings of the International Conference on Computer Vision, Nice, France, pp. 649–655.

  • Lazebnik, S., Schmid, C., and Ponce, J. 2005. A sparse texture representation using local affine regions. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(8):1265–1278.

  • Lindeberg, T. and Gårding, J. 1997. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing 15(6):415–434.

  • Lindeberg, T. 1998. Feature detection with automatic scale selection. International Journal of Computer Vision 30(2):79–116.

  • Lowe, D. 1999. Object recognition from local scale-invariant features. In Proceedings of the 7th International Conference on Computer Vision, Kerkyra, Greece, pp. 1150–1157.

  • Lowe, D. 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2):91–110.

  • Matas, J., Burianek, J., and Kittler, J. 2000. Object Recognition using the Invariant Pixel-Set Signature. In Proceedings of the British Machine Vision Conference, London, UK, pp. 606–615.

  • Matas, J., Chum, O., Urban, M., and Pajdla, T. 2002. Robust wide-baseline stereo from maximally stable extremal regions. In Proceedings of the British Machine Vision Conference, Cardiff, UK, pp. 384–393.

  • Matas, J., Chum, O., Urban, M., and Pajdla, T. 2004. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing 22(10):761–767.

  • Mikolajczyk, K. and Schmid, C. 2001. Indexing based on scale invariant interest points. In Proceedings of the 8th International Conference on Computer Vision, Vancouver, Canada.

  • Mikolajczyk, K. and Schmid, C. 2002. An affine invariant interest point detector. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark.

  • Mikolajczyk, K. and Schmid, C. 2003. A performance evaluation of local descriptors. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA.

  • Mikolajczyk, K., Zisserman, A., and Schmid, C. 2003. Shape recognition with edge-based features. In Proceedings of the British Machine Vision Conference, Norwich, UK.

  • Mikolajczyk, K. and Schmid, C. 2004. Scale & affine invariant interest point detectors. International Journal of Computer Vision 60(1):63–86.

  • Mikolajczyk, K. and Schmid, C. 2005. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(10):1615–1630.

  • Obdržálek, Š. and Matas, J. 2002. Object recognition using local affine frames on distinguished regions. In Proceedings of the British Machine Vision Conference, Cardiff, UK, pp. 113–122.

  • Opelt, A., Fussenegger, M., Pinz, A., and Auer, P. 2004. Weak hypotheses and boosting for generic object detection and recognition. In Proceedings of European Conference on Computer Vision, Prague, Czech Republic, pp. 71–84.

  • Pritchett, P. and Zisserman, A. 1998. Wide baseline stereo matching. In Proceedings of the 6th International Conference on Computer Vision, Bombay, India, pp. 754–760.

  • Rothganger, F., Lazebnik, S., Schmid, C., and Ponce, J. 2003. 3D object modeling and recognition using affine-invariant patches and multi-view spatial constraints. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA, pp. 272–277.

  • Rothganger, F., Lazebnik, S., Schmid, C., and Ponce, J. 2005. Object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints. International Journal of Computer Vision, to appear.

  • Schaffalitzky, F., and Zisserman, A. 2002. Multi-view matching for unordered image sets, or “How do I organize my holiday snaps?”. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, pp. 414–431.

  • Schaffalitzky, F. and Zisserman, A. 2003. Automated Location matching in movies. Computer Vision and Image Understanding, 92(2):236–264.

  • Schmid, C. and Mohr, R. 1997. Local grayvalue invariants for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(5):530–535.

  • Se, S., Lowe, D., and Little, J. 2002. Mobile robot localization and mapping with uncertainty using scale-invariant visual landmarks. International Journal of Robotics Research 21(8):735–758.

  • Sedgewick, R. 1988. Algorithms, 2nd edition. Addison-Wesley.

  • Sivic, J., and Zisserman, A. 2003. Video Google: A text retrieval approach to object matching in videos. In Proceedings of the International Conference on Computer Vision, Nice, France.

  • Sivic, J., Schaffalitzky, F., and Zisserman, A. 2004. Object level grouping for video shots. In Proceedings of the 8th European Conference on Computer Vision, Prague, Czech Republic, pp. 724–734.

  • Sivic, J., and Zisserman, A. 2004. Video data mining using configurations of viewpoint invariant regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, pp. 488–495.

  • Tell, D. and Carlsson, S. 2000. Wide baseline point matching using affine invariants computed from intensity profiles. In Proceedings of the 6th European Conference on Computer Vision, Dublin, Ireland, pp. 814–828.

  • Tell, D. and Carlsson, S. 2002. Combining appearance and topology for wide baseline matching. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, pp. 68–81.

  • Turina, A., Tuytelaars, T., and Van Gool, L. 2001. Efficient Grouping under perspective skew. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition, Hawaii, USA, pp. 247–254.

  • Tuytelaars, T. and Van Gool, L. 1999. Content-based image retrieval based on local affinely invariant regions. In Int. Conf. on Visual Information Systems, pp. 493–500.

  • Tuytelaars, T., Van Gool, L., D'haene, L., and Koch, R. 1999. Matching of affinely invariant regions for visual servoing. In Int. Conference Robotics and Automation ICRA 99.

  • Tuytelaars, T. and Van Gool, L. 2000. Wide baseline stereo matching based on local, affinely invariant regions. In Proceedings of the 11th British Machine Vision Conference, Bristol, UK, pp. 412–425.

  • Tuytelaars, T. and Van Gool, L. 2004. Matching Widely Separated Views based on Affine Invariant Regions. International Journal of Computer Vision 59(1):61–85.

Author information

Corresponding author

Correspondence to K. Mikolajczyk.

Additional information

First online version published in October, 2005

About this article

Cite this article

Mikolajczyk, K., Tuytelaars, T., Schmid, C. et al. A Comparison of Affine Region Detectors. Int J Comput Vision 65, 43–72 (2005). https://doi.org/10.1007/s11263-005-3848-x

  • DOI: https://doi.org/10.1007/s11263-005-3848-x
