SoftPOSIT: Simultaneous Pose and Correspondence Determination

David, Philip; DeMenthon, Daniel; Duraiswami, Ramani; Samet, Hanan

doi:10.1023/B:VISI.0000025800.10423.1f


Published: September 2004

Volume 59, pages 259–284, (2004)

International Journal of Computer Vision

Philip David¹,², Daniel DeMenthon¹, Ramani Duraiswami¹ & Hanan Samet¹


Abstract

The problem of pose estimation arises in many areas of computer vision, including object recognition, object tracking, site inspection and updating, and autonomous navigation when scene models are available. We present a new algorithm, called SoftPOSIT, for determining the pose of a 3D object from a single 2D image when correspondences between object points and image points are not known. The algorithm combines the iterative softassign algorithm (Gold and Rangarajan, 1996; Gold et al., 1998) for computing correspondences and the iterative POSIT algorithm (DeMenthon and Davis, 1995) for computing object pose under a full-perspective camera model. Our algorithm, unlike most previous algorithms for pose determination, does not have to hypothesize small sets of matches and then verify the remaining image points. Instead, all possible matches are treated identically throughout the search for an optimal pose. The performance of the algorithm is extensively evaluated in Monte Carlo simulations on synthetic data under a variety of levels of clutter, occlusion, and image noise. These tests show that the algorithm performs well in a variety of difficult scenarios, and empirical evidence suggests that the algorithm has an asymptotic run-time complexity that is better than previous methods by a factor of the number of image points. The algorithm is being applied to a number of practical autonomous vehicle navigation problems including the registration of 3D architectural models of a city to images, and the docking of small robots onto larger robots.
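The abstract does not spell out how softassign computes correspondences. As a rough illustration only, the sketch below assumes the formulation usually given for softassign (Gold et al., 1998): distances between image points and projected model points are turned into positive weights, and a Sinkhorn-style alternating row and column normalization (Sinkhorn, 1964) pushes the weights toward a soft one-to-one assignment, with one slack row and one slack column for unmatched points. The function name, the parameters `beta`, `alpha`, `slack`, and `n_iters`, and the toy distances are assumptions made for this example, not values or code from the paper.

```python
import math


def softassign_weights(sq_dists, beta, alpha=1.0, slack=1.0, n_iters=50):
    """Soft match weights from squared distances (illustrative sketch).

    sq_dists[j][k] is a hypothetical squared distance between image point j
    and projected model point k. The returned matrix has one extra slack
    column and one extra slack row for points left unmatched.
    """
    n_rows, n_cols = len(sq_dists), len(sq_dists[0])
    # Initial weights exceed the slack value when a distance is below alpha.
    m = [[math.exp(-beta * (d - alpha)) for d in row] + [slack]
         for row in sq_dists]
    m.append([slack] * (n_cols + 1))

    for _ in range(n_iters):
        # Normalize each non-slack row; its slack-column entry is in the sum.
        for j in range(n_rows):
            s = sum(m[j])
            m[j] = [v / s for v in m[j]]
        # Normalize each non-slack column; its slack-row entry is in the sum.
        for k in range(n_cols):
            s = sum(m[j][k] for j in range(n_rows + 1))
            for j in range(n_rows + 1):
                m[j][k] /= s
    return m


if __name__ == "__main__":
    # Two image points, two model points; the diagonal pairs are close.
    weights = softassign_weights([[0.01, 4.0], [3.5, 0.02]], beta=1.0)
    for row in weights:
        print(["%.3f" % v for v in row])
```

In an iterative scheme of the kind the abstract describes, weights like these would be recomputed after each pose update, with `beta` raised gradually so that the soft assignment hardens as the pose estimate improves. That schedule is likewise an assumption of this sketch.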


References

Arya, S., Mount, D.M., Netanyahu, N.S., Silverman, R., and Wu, A. 1998. An optimal algorithm for approximate nearest neighbor searching. Journal of the ACM, 45(6):891–923.
Baird, H.S. 1985. Model-Based Image Matching Using Location. MIT Press: Cambridge, MA.
Beis, J.S. and Lowe, D.G. 1999. Indexing without invariants in 3D object recognition. IEEE Trans. Pattern Analysis and Machine Intelligence, 21(10):1000–1015.
Beveridge, J.R. and Riseman, E.M. 1992. Hybrid weak-perspective and full-perspective matching. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, Champaign, IL, pp. 432–438.
Beveridge, J.R. and Riseman, E.M. 1995. Optimal geometric model matching under full 3D perspective. Computer Vision and Image Understanding, 61(3):351–364.
Brand, P. and Mohr, R. 1994. Accuracy in image measure. In Proc. SPIE, Videometrics III, Boston, MA, pp. 218–228.
Breuel, T.M. 1992. Fast recognition using adaptive subdivisions of transformation space. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Champaign, IL, pp. 445–451.
Bridle, J.S. 1990. Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters. In Proc. Advances in Neural Information Processing Systems, Denver, CO, pp. 211–217.
Burns, J.B., Weiss, R.S., and Riseman, E.M. 1993. View variation of point-set and line-segment features. IEEE Trans. Pattern Analysis and Machine Intelligence, 15(1):51–68.
Cass, T.A. 1992. Polynomial-time object recognition in the presence of clutter, occlusion, and uncertainty. In Proc. European Conf. on Computer Vision, Santa Margherita Ligure, Italy, pp. 834–842.
Cass, T.A. 1994. Robust geometric matching for 3D object recognition. In Proc. 12th IAPR Int. Conf. on Pattern Recognition, Jerusalem, Israel, vol. 1, pp. 477–482.
Cass, T.A. 1998. Robust affine structure matching for 3D object recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(11):1265–1274.
DeMenthon, D. and Davis, L.S. 1993. Recognition and tracking of 3D objects by 1D search. In Proc. DARPA Image Understanding Workshop, Washington, DC, pp. 653–659.
DeMenthon, D. and Davis, L.S. 1995. Model-based object pose in 25 lines of code. International Journal of Computer Vision, 15(1/2):123–141.
DeMenthon, D., David, P., and Samet, H. 2001. SoftPOSIT: An algorithm for registration of 3D models to noisy perspective images combining softassign and POSIT. University of Maryland, College Park, MD, Report CS-TR-969, CS-TR 4257.
Ely, R.W., Digirolamo, J.A., and Lundgren, J.C. 1995. Model supported positioning. In Proc. SPIE, Integrating Photogrammetric Techniques with Scene Analysis and Machine Vision II, Orlando, FL.
Fiore, P.D. 2001. Efficient linear solution of exterior orientation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(2):140–148.
Fischler, M.A. and Bolles, R.C. 1981. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Comm. Association for Computing Machinery, 24(6):381–395.
Geiger, D. and Yuille, A.L. 1991. A common framework for image segmentation. International Journal of Computer Vision, 6(3):227–243.
Gold, S. and Rangarajan, A. 1996. A graduated assignment algorithm for graph matching. IEEE Trans. on Pattern Analysis and Machine Intelligence, 18(4):377–388.
Gold, S., Rangarajan, A., Lu, C.-P., Pappu, S., and Mjolsness, E. 1998. New algorithms for 2D and 3D point matching: Pose estimation and correspondence. Pattern Recognition, 31(8):1019–1031.
Grimson, E. 1990. Object Recognition by Computer: The Role of Geometric Constraints. MIT Press: Cambridge, MA.
Grimson, E. and Huttenlocher, D.P. 1991. On the verification of hypothesized matches in model-based recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 13(12):1201–1213.
Haralick, R.M., Lee, C., Ottenberg, K., and Nolle, M. 1991. Analysis and solutions of the three point perspective pose estimation problem. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Maui, HI, pp. 592–598.
Hartley, R. and Zisserman, A. 2000. Multiple View Geometry in Computer Vision. Cambridge University Press: Cambridge, UK.
Horaud, R., Conio, B., Leboulleux, O., and Lacolle, B. 1989. An analytic solution for the perspective 4-point problem. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, San Diego, CA, pp. 500–507.
Horn, B.K.P. 1986. Robot Vision. MIT Press: Cambridge, MA.
Jacobs, D.W. 1992. Space efficient 3-D model indexing. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Champaign, IL, pp. 439–444.
Jurie, F. 1999. Solution of the simultaneous pose and correspondence problem using gaussian error model. Computer Vision and Image Understanding, 73(3):357–373.
Lamdan, Y. and Wolfson, H.J. 1988. Geometric hashing: A general and efficient model-based recognition scheme. In Proc. IEEE Int. Conf. on Computer Vision, Tampa, FL, pp. 238–249.
Lu, C.-P., Hager, G.D. and Mjolsness, E. 2000. Fast and globally convergent pose estimation from video images. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(6):610–622.
Moon, T.K. 1996. The expectation-maximization algorithm. IEEE Signal Processing Magazine, 13(6):47–60.
Morokoff, W.J. and Caflisch, R.E. 1994. Quasi-random sequences and their discrepancies. SIAM Journal on Scientific Computing, 15(6):1251–1279.
Murase, H. and Nayar, S.K. 1995. Visual learning and recognition of 3-D objects from appearance. Int. Journal of Computer Vision, 14(1):5–24.
Olson, C.F. 1997. Efficient pose clustering using a randomized algorithm. Int. Journal of Computer Vision, 23(2):131–147.
Procter, S. and Illingworth, J. 1997. ForeSight: Fast object recognition using geometric hashing with edge-triple features. In Proc. Int. Conf. on Image Processing, vol. 1, Santa Barbara, CA, pp. 889–892.
Sinkhorn, R. 1964. A relationship between arbitrary positive matrices and doubly stochastic matrices. Annals of Mathematical Statistics, 35(2):876–879.
Ullman, S. 1989. Aligning pictorial descriptions: An approach to object recognition. Cognition, 32:193–254.
Wunsch, P. and Hirzinger, G. 1996. Registration of CAD models to images by iterative inverse perspective matching. In Proc. Int. Conf. on Pattern Recognition, vol. 1, Vienna, Austria, pp. 78–83.
Yuan, J.-C. 1989. A general photogrammetric method for determining object position and orientation. IEEE Trans. on Robotics and Automation, 5(2):129–142.

Author information

Authors and Affiliations

University of Maryland Institute for Advanced Computer Studies, College Park, MD, 20742, USA
Philip David, Daniel DeMenthon, Ramani Duraiswami & Hanan Samet
Army Research Laboratory, 2800 Powder Mill Road, Adelphi, MD, 20783-1197, USA
Philip David


About this article

Cite this article

David, P., DeMenthon, D., Duraiswami, R. et al. SoftPOSIT: Simultaneous Pose and Correspondence Determination. International Journal of Computer Vision 59, 259–284 (2004). https://doi.org/10.1023/B:VISI.0000025800.10423.1f


Issue Date: September 2004
DOI: https://doi.org/10.1023/B:VISI.0000025800.10423.1f
