EM, MCMC, and Chain Flipping for Structure from Motion with Unknown Correspondence

Dellaert, Frank; Seitz, Steven M.; Thorpe, Charles E.; Thrun, Sebastian

doi:10.1023/A:1020245811187

Published: January 2003

Machine Learning, volume 50, pages 45–71 (2003)

Abstract

Learning spatial models from sensor data raises the challenging data association problem of relating model parameters to individual measurements. This paper proposes an EM-based algorithm that solves the model learning and data association problems in parallel. The algorithm is developed in the context of the structure from motion problem, which is the problem of estimating a 3D scene model from a collection of image data. To accommodate the spatial constraints in this domain, we compute virtual measurements as sufficient statistics to be used in the M-step. We develop an efficient Markov chain Monte Carlo sampling method, called chain flipping, to calculate these statistics in the E-step. Experimental results show that we can solve hard data association problems when learning models of 3D scenes, and that we can do so efficiently. We conjecture that this approach can be applied to a broad range of model learning problems from sensor data, such as the robot mapping problem.
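
The E-step idea in the abstract can be illustrated with a toy sketch. It is not the paper's implementation. It assumes a single image, 2D points, isotropic Gaussian noise, and a one-to-one assignment between measurements and features. The sampler is a plain Metropolis sampler with a pairwise swap proposal, which is a simpler stand-in for chain flipping. Treating each virtual measurement as a marginal-weighted average of the measurements is likewise an assumption made for illustration. All function and parameter names (`log_likelihood`, `sample_marginals`, `virtual_measurements`, `sigma`) are hypothetical.

```python
import math
import random


def log_likelihood(pred, meas, assign, sigma):
    """Gaussian log-likelihood (up to a constant) of one assignment.

    assign[k] = j means measurement k is attributed to predicted feature j.
    """
    total = 0.0
    for k, j in enumerate(assign):
        dx = meas[k][0] - pred[j][0]
        dy = meas[k][1] - pred[j][1]
        total -= (dx * dx + dy * dy) / (2.0 * sigma ** 2)
    return total


def sample_marginals(pred, meas, sigma, n_samples=5000, burn_in=500, seed=0):
    """Estimate marginals[j][k] ~ P(measurement k belongs to feature j).

    Assumption: Metropolis sampling over permutations with a symmetric
    pairwise swap proposal (NOT the chain flipping proposal).
    """
    rng = random.Random(seed)
    n = len(pred)
    assign = list(range(n))
    rng.shuffle(assign)
    current = log_likelihood(pred, meas, assign, sigma)
    counts = [[0] * n for _ in range(n)]
    for step in range(burn_in + n_samples):
        a, b = rng.sample(range(n), 2)
        assign[a], assign[b] = assign[b], assign[a]  # propose a swap
        proposed = log_likelihood(pred, meas, assign, sigma)
        accept = proposed >= current or rng.random() < math.exp(proposed - current)
        if accept:
            current = proposed
        else:
            assign[a], assign[b] = assign[b], assign[a]  # reject: undo the swap
        if step >= burn_in:
            for k, j in enumerate(assign):
                counts[j][k] += 1
    return [[c / n_samples for c in row] for row in counts]


def virtual_measurements(marginals, meas):
    """Assumed form: per-feature weighted average of all measurements."""
    return [
        (sum(w * m[0] for w, m in zip(row, meas)),
         sum(w * m[1] for w, m in zip(row, meas)))
        for row in marginals
    ]


if __name__ == "__main__":
    pred = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]    # current model prediction
    meas = [(10.2, -0.1), (0.1, 9.8), (-0.2, 0.1)]   # unordered measurements
    marginals = sample_marginals(pred, meas, sigma=0.5)
    print(virtual_measurements(marginals, meas))
```

With well-separated points as in this example, the sampled marginals should concentrate on the matching pairs, so each virtual measurement lies close to one real measurement. In a full EM loop, an M-step would then re-estimate the model from these virtual measurements as if the correspondence were known.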

Author information

Authors and Affiliations

College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
Frank Dellaert
Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA
Steven M. Seitz
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Charles E. Thorpe & Sebastian Thrun

About this article

Cite this article

Dellaert, F., Seitz, S.M., Thorpe, C.E. et al. EM, MCMC, and Chain Flipping for Structure from Motion with Unknown Correspondence. Machine Learning 50, 45–71 (2003). https://doi.org/10.1023/A:1020245811187

Issue Date: January 2003
DOI: https://doi.org/10.1023/A:1020245811187
