Fast Approximate Energy Minimization with Label Costs

Delong, Andrew; Osokin, Anton; Isack, Hossam N.; Boykov, Yuri

doi:10.1007/s11263-011-0437-z

Published: 12 July 2011

Volume 96, pages 1–27, (2012)

International Journal of Computer Vision

Abstract

The α-expansion algorithm has had a significant impact in computer vision due to its generality, effectiveness, and speed. It is commonly used to minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main algorithmic contribution is an extension of α-expansion that also optimizes “label costs” with well-characterized optimality bounds. Label costs penalize a solution based on the set of labels that appear in it, for example by simply penalizing the number of labels in the solution.
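To make the notion of a label cost concrete, here is a minimal sketch of evaluating such an energy on a toy problem. It assumes the simplest case mentioned above: a fixed cost for every label that appears in the labeling, added to unary terms and a Potts pairwise term. The function name `labeling_energy` and all numbers are hypothetical and are not taken from the authors' code.

```python
from typing import Dict, List, Sequence, Tuple


def labeling_energy(
    labels: Sequence[int],
    unary: Sequence[Sequence[float]],
    edges: List[Tuple[int, int]],
    smooth_weight: float,
    label_cost: Dict[int, float],
) -> float:
    """Unary + Potts pairwise + per-label costs (illustrative form only)."""
    # Unary term: cost of giving variable p the label labels[p].
    data = sum(unary[p][l] for p, l in enumerate(labels))
    # Potts pairwise term: pay smooth_weight for each edge whose ends differ.
    smooth = sum(smooth_weight for p, q in edges if labels[p] != labels[q])
    # Label-cost term: pay once for every distinct label that appears.
    cost = sum(label_cost[l] for l in set(labels))
    return data + smooth + cost


# Toy chain of 4 variables and 3 candidate labels (all values hypothetical).
unary = [
    [0.0, 2.0, 2.0],
    [0.0, 2.0, 2.0],
    [2.0, 0.0, 2.0],
    [2.0, 1.0, 0.5],
]
edges = [(0, 1), (1, 2), (2, 3)]
label_cost = {0: 1.0, 1: 1.0, 2: 1.0}

three_labels = labeling_energy([0, 0, 1, 2], unary, edges, 0.5, label_cost)
two_labels = labeling_energy([0, 0, 1, 1], unary, edges, 0.5, label_cost)
print(three_labels, two_labels)
```

In this toy setup the two labelings tie on unary plus pairwise terms alone, so the per-label cost is what makes the labeling with fewer distinct labels cheaper.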

Our energy has a natural interpretation as minimizing description length (MDL) and sheds light on classical algorithms like K-means and expectation-maximization (EM). Label costs are useful for multi-model fitting and we demonstrate several such applications: homography detection, motion segmentation, image segmentation, and compression. Our C++ and MATLAB code is publicly available at http://vision.csd.uwo.ca/code/.
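One way to read the K-means connection is through a toy clustering problem in which each candidate cluster center is a label, the fit term is a squared distance, and every center that is actually used incurs a fixed cost. The sketch below assumes exactly that setup, with no pairwise term, and finds the best set of centers by exhaustive search. It is not the paper's extended α-expansion: brute force is only feasible for a handful of candidates, and the name `best_subset` and the data are hypothetical.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple


def best_subset(
    points: Sequence[float],
    centers: Sequence[float],
    cost_per_label: float,
) -> Tuple[float, Tuple[float, ...]]:
    """Exhaustively pick the centers minimizing fit error + label costs."""
    best: Optional[Tuple[float, Tuple[float, ...]]] = None
    for k in range(1, len(centers) + 1):
        for subset in combinations(centers, k):
            # Each point is assigned to its nearest center in the subset.
            fit = sum(min((x - c) ** 2 for c in subset) for x in points)
            # Every center in the subset pays the same fixed cost.
            total = fit + cost_per_label * k
            if best is None or total < best[0]:
                best = (total, subset)
    assert best is not None  # centers must be non-empty
    return best


points = [0.0, 1.0, 10.0, 11.0]
centers = [0.5, 5.5, 10.5]

cheap_labels = best_subset(points, centers, 2.0)
costly_labels = best_subset(points, centers, 150.0)
print(cheap_labels, costly_labels)
```

With a small per-label cost the search keeps the two centers that fit the two groups of points; with a large cost it falls back to a single center, which illustrates how the label cost controls the number of models in the solution.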

Author information

Authors and Affiliations

Department of Computer Science, University of Western Ontario, London, Ontario, Canada, N6A 5B7
Andrew Delong, Hossam N. Isack & Yuri Boykov
Department of Computational Mathematics and Cybernetics, Moscow State University, Moscow, Russia
Anton Osokin

Corresponding author

Correspondence to Yuri Boykov.

Additional information

The authors assert equal contribution and joint first authorship.

About this article

Cite this article

Delong, A., Osokin, A., Isack, H.N. et al. Fast Approximate Energy Minimization with Label Costs. Int J Comput Vis 96, 1–27 (2012). https://doi.org/10.1007/s11263-011-0437-z

Published: 12 July 2011
Issue Date: January 2012
DOI: https://doi.org/10.1007/s11263-011-0437-z
