Fast Approximate Energy Minimization with Label Costs

International Journal of Computer Vision

Abstract

The α-expansion algorithm has had a significant impact in computer vision due to its generality, effectiveness, and speed. It is commonly used to minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main algorithmic contribution is an extension of α-expansion that also optimizes “label costs” with well-characterized optimality bounds. Label costs penalize a solution based on the set of labels that appear in it, for example by simply penalizing the number of labels in the solution.
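As a concrete illustration of what a label cost does, the sketch below evaluates a small energy made of unary terms, a pairwise term, and a cost charged once for each distinct label that appears. The function name `labeling_energy`, the choice of a Potts pairwise term, and the toy numbers are assumptions made for illustration, not the authors' implementation. The code only evaluates the energy of a given labeling; it does not perform α-expansion.

```python
from typing import Dict, List, Sequence, Tuple


def labeling_energy(
    labels: Sequence[int],
    unary: Sequence[Sequence[int]],
    edges: Sequence[Tuple[int, int]],
    smooth_weight: int,
    label_cost: Dict[int, int],
) -> int:
    """Unary costs + Potts pairwise costs + a cost per distinct label used."""
    # Unary (data) term: cost of assigning labels[p] to site p.
    data = sum(unary[p][l] for p, l in enumerate(labels))
    # Pairwise term (assumed Potts): pay smooth_weight when neighbours disagree.
    smooth = sum(smooth_weight for p, q in edges if labels[p] != labels[q])
    # Label-cost term: each label is charged once if it appears anywhere.
    used = sum(label_cost[l] for l in set(labels))
    return data + smooth + used


# Hypothetical toy problem: four sites on a chain, two labels.
unary: List[List[int]] = [[0, 1], [0, 1], [1, 0], [1, 0]]
edges = [(0, 1), (1, 2), (2, 3)]

two_labels = [0, 0, 1, 1]
one_label = [0, 0, 0, 0]

no_cost = {0: 0, 1: 0}
with_cost = {0: 3, 1: 3}

# Without label costs the two-label solution is cheaper; with them it is not.
e_two_free = labeling_energy(two_labels, unary, edges, 1, no_cost)
e_one_free = labeling_energy(one_label, unary, edges, 1, no_cost)
e_two_paid = labeling_energy(two_labels, unary, edges, 1, with_cost)
e_one_paid = labeling_energy(one_label, unary, edges, 1, with_cost)
```

In this toy setting a per-label cost of 3 is enough to make the single-label solution preferable, which is the "penalize the number of labels" behaviour described above.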

Our energy has a natural interpretation as minimizing description length (MDL) and sheds light on classical algorithms like K-means and expectation-maximization (EM). Label costs are useful for multi-model fitting and we demonstrate several such applications: homography detection, motion segmentation, image segmentation, and compression. Our C++ and MATLAB code is publicly available at http://vision.csd.uwo.ca/code/.
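One loose way to read the K-means connection is to take a K-means-style objective (sum of squared distances to the assigned center) and add a fixed cost for every center that is actually used. The sketch below does this for 1-D points. The function name `clustering_energy`, the uniform cost `label_cost`, and the data are hypothetical choices for illustration, and the squared-distance data term is an assumption rather than something stated in the abstract.

```python
from typing import List, Sequence, Tuple


def clustering_energy(
    points: Sequence[float],
    centers: Sequence[float],
    label_cost: float,
) -> Tuple[float, List[int]]:
    """Squared distance to the nearest center, plus label_cost per center used."""
    assignment: List[int] = []
    total = 0.0
    for x in points:
        dists = [(x - c) ** 2 for c in centers]
        # Assign each point to its nearest center, as in a K-means step.
        best = min(range(len(centers)), key=dists.__getitem__)
        assignment.append(best)
        total += dists[best]
    # Charge once for every center that received at least one point.
    total += label_cost * len(set(assignment))
    return total, assignment


points = [0.0, 1.0, 10.0, 11.0]

# A small label cost favours two centers; a large one favours a single center.
cheap_two, _ = clustering_energy(points, [0.5, 10.5], label_cost=10.0)
cheap_one, _ = clustering_energy(points, [5.5], label_cost=10.0)
costly_two, _ = clustering_energy(points, [0.5, 10.5], label_cost=200.0)
costly_one, _ = clustering_energy(points, [5.5], label_cost=200.0)
```

Under these assumptions the number of clusters is no longer fixed in advance: it falls out of the trade-off between fit error and the per-label cost, which is in the spirit of a description-length argument.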

Author information

Corresponding author

Correspondence to Yuri Boykov.

Additional information

The authors assert equal contribution and joint first authorship.

About this article

Cite this article

Delong, A., Osokin, A., Isack, H.N. et al. Fast Approximate Energy Minimization with Label Costs. Int J Comput Vis 96, 1–27 (2012). https://doi.org/10.1007/s11263-011-0437-z

  • DOI: https://doi.org/10.1007/s11263-011-0437-z
