Abstract
We address the problem of object detection and segmentation using global holistic properties of object shape. Global shape representations are highly susceptible to clutter inevitably present in realistic images, and thus can be applied robustly only using a precise segmentation of the object. To this end, we propose a figure/ground segmentation method for extraction of image regions that resemble the global properties of a model boundary structure and are perceptually salient. Our shape representation, called the chordiogram, is based on geometric relationships of object boundary edges, while the perceptual saliency cues we use favor coherent regions distinct from the background. We formulate the segmentation problem as an integer quadratic program and use a semidefinite programming relaxation to solve it. The obtained solutions provide a segmentation of the object as well as a detection score used for object recognition. Our single-step approach achieves state-of-the-art performance on several object detection and segmentation benchmarks.
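For intuition, the generic shape of such a relaxation can be sketched as follows. This is the standard textbook lifting, not the paper's exact objective: the cost matrix $Q$, the label vector $x$, and the $\pm 1$ encoding of figure/ground labels are assumptions introduced here purely for illustration.

```latex
% Integer quadratic program over binary figure/ground labels
% (Q, x, and the +/-1 encoding are illustrative assumptions)
\begin{align}
  &\min_{x \in \{-1,+1\}^n} \; x^\top Q x \\
% Lift with X = x x^\top, so that x^\top Q x = \operatorname{tr}(Q X),
% then drop the non-convex rank-one constraint on X
  &\min_{X \in \mathbb{S}^n} \; \operatorname{tr}(Q X)
  \quad \text{s.t.} \quad X \succeq 0, \quad X_{ii} = 1, \; i = 1, \dots, n
\end{align}
```

The relaxed problem is convex, and a discrete labeling is typically recovered from the optimal $X$ by a rounding step.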
Cite this article
Toshev, A., Taskar, B. & Daniilidis, K. Shape-Based Object Detection via Boundary Structure Segmentation. Int J Comput Vis 99, 123–146 (2012). https://doi.org/10.1007/s11263-012-0521-z
DOI: https://doi.org/10.1007/s11263-012-0521-z