SoftPOSIT: Simultaneous Pose and Correspondence Determination

  • Published:
International Journal of Computer Vision

Abstract

The problem of pose estimation arises in many areas of computer vision, including object recognition, object tracking, site inspection and updating, and autonomous navigation when scene models are available. We present a new algorithm, called SoftPOSIT, for determining the pose of a 3D object from a single 2D image when correspondences between object points and image points are not known. The algorithm combines the iterative softassign algorithm (Gold and Rangarajan, 1996; Gold et al., 1998) for computing correspondences and the iterative POSIT algorithm (DeMenthon and Davis, 1995) for computing object pose under a full-perspective camera model. Our algorithm, unlike most previous algorithms for pose determination, does not have to hypothesize small sets of matches and then verify the remaining image points. Instead, all possible matches are treated identically throughout the search for an optimal pose. The performance of the algorithm is extensively evaluated in Monte Carlo simulations on synthetic data under a variety of levels of clutter, occlusion, and image noise. These tests show that the algorithm performs well in a variety of difficult scenarios, and empirical evidence suggests that the algorithm has an asymptotic run-time complexity that is better than previous methods by a factor of the number of image points. The algorithm is being applied to a number of practical autonomous vehicle navigation problems including the registration of 3D architectural models of a city to images, and the docking of small robots onto larger robots.
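The correspondence half of the approach described above can be illustrated with a small sketch of a softassign-style step. This is not the authors' implementation: the function name `softassign`, the parameters `beta`, `alpha`, and `slack`, and their default values are assumptions chosen for illustration. The sketch takes squared distances between image points and projected model points, converts them to match weights, and applies Sinkhorn-style alternating row and column normalization, with an extra slack row and column to absorb points that have no match (clutter or occlusion). The pose update that POSIT would contribute is omitted.

```python
import math

def softassign(dist2, beta, alpha=1.0, slack=1e-3, iters=50):
    """Turn squared distances into a soft assignment matrix (illustrative sketch).

    dist2[j][k] is the squared distance between image point j and the
    projection of model point k under the current pose guess.
    Returns a (J+1) x (K+1) matrix whose last row and last column are
    slack entries for unmatched points. All parameter names and defaults
    here are hypothetical, not taken from the paper.
    """
    J, K = len(dist2), len(dist2[0])
    # Initial weights: close pairs get large values, distant pairs small ones.
    m = [[math.exp(-beta * (dist2[j][k] - alpha)) for k in range(K)] + [slack]
         for j in range(J)]
    m.append([slack] * (K + 1))

    # Alternating normalization. The slack row and slack column take part
    # in the sums but are not themselves normalized.
    for _ in range(iters):
        for j in range(J):                      # rows of real image points
            s = sum(m[j])
            m[j] = [v / s for v in m[j]]
        for k in range(K):                      # columns of real model points
            s = sum(m[j][k] for j in range(J + 1))
            for j in range(J + 1):
                m[j][k] /= s
    return m

# Two image points, two model points; point 0 lies near model point 0,
# and point 1 lies near model point 1.
m = softassign([[0.0, 4.0], [4.0, 0.0]], beta=5.0)
```

In a full simultaneous pose and correspondence loop, a step like this would alternate with a pose update weighted by the assignment matrix, while `beta` is gradually increased so that the soft assignments harden toward one-to-one matches.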

References

  • Arya, S., Mount, D.M., Netanyahu, N.S., Silverman, R., and Wu, A. 1998. An optimal algorithm for approximate nearest neighbor searching. Journal of the ACM, 45(6):891–923.

  • Baird, H.S. 1985. Model-Based Image Matching Using Location. MIT Press: Cambridge, MA.

  • Beis, J.S. and Lowe, D.G. 1999. Indexing without invariants in 3D object recognition. IEEE Trans. Pattern Analysis and Machine Intelligence, 21(10):1000–1015.

  • Beveridge, J.R. and Riseman, E.M. 1992. Hybrid weak-perspective and full-perspective matching. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, Champaign, IL, pp. 432–438.

  • Beveridge, J.R. and Riseman, E.M. 1995. Optimal geometric model matching under full 3D perspective. Computer Vision and Image Understanding, 61(3):351–364.

  • Brand, P. and Mohr, R. 1994. Accuracy in image measure. In Proc. SPIE, Videometrics III, Boston, MA, pp. 218–228.

  • Breuel, T.M. 1992. Fast recognition using adaptive subdivisions of transformation space. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Champaign, IL, pp. 445–451.

  • Bridle, J.S. 1990. Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters. In Proc. Advances in Neural Information Processing Systems, Denver, CO, pp. 211–217.

  • Burns, J.B., Weiss, R.S., and Riseman, E.M. 1993. View variation of point-set and line-segment features. IEEE Trans. Pattern Analysis and Machine Intelligence, 15(1):51–68.

  • Cass, T.A. 1992. Polynomial-time object recognition in the presence of clutter, occlusion, and uncertainty. In Proc. European Conf. on Computer Vision, Santa Margherita Ligure, Italy, pp. 834–842.

  • Cass, T.A. 1994. Robust geometric matching for 3D object recognition. In Proc. 12th IAPR Int. Conf. on Pattern Recognition, Jerusalem, Israel, vol. 1, pp. 477–482.

  • Cass, T.A. 1998. Robust affine structure matching for 3D object recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(11):1265–1274.

  • DeMenthon, D. and Davis, L.S. 1993. Recognition and tracking of 3D objects by 1D search. In Proc. DARPA Image Understanding Workshop, Washington, DC, pp. 653–659.

  • DeMenthon, D. and Davis, L.S. 1995. Model-based object pose in 25 lines of code. International Journal of Computer Vision, 15(1/2):123–141.

  • DeMenthon, D., David, P., and Samet, H. 2001. SoftPOSIT: An algorithm for registration of 3D models to noisy perspective images combining softassign and POSIT. University of Maryland, College Park, MD, Report CS-TR-969, CS-TR 4257.

  • Ely, R.W., Digirolamo, J.A., and Lundgren, J.C. 1995. Model supported positioning. In Proc. SPIE, Integrating Photogrammetric Techniques with Scene Analysis and Machine Vision II, Orlando, FL.

  • Fiore, P.D. 2001. Efficient linear solution of exterior orientation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(2):140–148.

  • Fischler, M.A. and Bolles, R.C. 1981. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Comm. Association for Computing Machinery, 24(6):381–395.

  • Geiger, D. and Yuille, A.L. 1991. A common framework for image segmentation. International Journal of Computer Vision, 6(3):227–243.

  • Gold, S. and Rangarajan, A. 1996. A graduated assignment algorithm for graph matching. IEEE Trans. on Pattern Analysis and Machine Intelligence, 18(4):377–388.

  • Gold, S., Rangarajan, A., Lu, C.-P., Pappu, S., and Mjolsness, E. 1998. New algorithms for 2D and 3D point matching: Pose estimation and correspondence. Pattern Recognition, 31(8):1019–1031.

  • Grimson, E. 1990. Object Recognition by Computer: The Role of Geometric Constraints. MIT Press: Cambridge, MA.

  • Grimson, E. and Huttenlocher, D.P. 1991. On the verification of hypothesized matches in model-based recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 13(12):1201–1213.

  • Haralick, R.M., Lee, C., Ottenberg, K., and Nolle, M. 1991. Analysis and solutions of the three point perspective pose estimation problem. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Maui, HI, pp. 592–598.

  • Hartley, R. and Zisserman, A. 2000. Multiple View Geometry in Computer Vision. Cambridge University Press: Cambridge, UK.

  • Horaud, R., Conio, B., Leboulleux, O., and Lacolle, B. 1989. An analytic solution for the perspective 4-point problem. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, San Diego, CA, pp. 500–507.

  • Horn, B.K.P. 1986. Robot Vision. MIT Press: Cambridge, MA.

  • Jacobs, D.W. 1992. Space efficient 3-D model indexing. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Champaign, IL, pp. 439–444.

  • Jurie, F. 1999. Solution of the simultaneous pose and correspondence problem using Gaussian error model. Computer Vision and Image Understanding, 73(3):357–373.

  • Lamdan, Y. and Wolfson, H.J. 1988. Geometric hashing: A general and efficient model-based recognition scheme. In Proc. IEEE Int. Conf. on Computer Vision, Tampa, FL, pp. 238–249.

  • Lu, C.-P., Hager, G.D., and Mjolsness, E. 2000. Fast and globally convergent pose estimation from video images. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(6):610–622.

  • Moon, T.K. 1996. The expectation-maximization algorithm. IEEE Signal Processing Magazine, 13(6):47–60.

  • Morokoff, W.J. and Caflisch, R.E. 1994. Quasi-random sequences and their discrepancies. SIAM Journal on Scientific Computing, 15(6):1251–1279.

  • Murase, H. and Nayar, S.K. 1995. Visual learning and recognition of 3-D objects from appearance. Int. Journal of Computer Vision, 14(1):5–24.

  • Olson, C.F. 1997. Efficient pose clustering using a randomized algorithm. Int. Journal of Computer Vision, 23(2):131–147.

  • Procter, S. and Illingworth, J. 1997. ForeSight: Fast object recognition using geometric hashing with edge-triple features. In Proc. Int. Conf. on Image Processing, vol. 1, Santa Barbara, CA, pp. 889–892.

  • Sinkhorn, R. 1964. A relationship between arbitrary positive matrices and doubly stochastic matrices. Annals of Mathematical Statistics, 35(2):876–879.

  • Ullman, S. 1989. Aligning pictorial descriptions: An approach to object recognition. Cognition, 32:193–254.

  • Wunsch, P. and Hirzinger, G. 1996. Registration of CAD models to images by iterative inverse perspective matching. In Proc. Int. Conf. on Pattern Recognition, vol. 1, Vienna, Austria, pp. 78–83.

  • Yuan, J.-C. 1989. A general photogrammetric method for determining object position and orientation. IEEE Trans. on Robotics and Automation, 5(2):129–142.

Cite this article

David, P., DeMenthon, D., Duraiswami, R. et al. SoftPOSIT: Simultaneous Pose and Correspondence Determination. International Journal of Computer Vision 59, 259–284 (2004). https://doi.org/10.1023/B:VISI.0000025800.10423.1f

  • Issue Date:

  • DOI: https://doi.org/10.1023/B:VISI.0000025800.10423.1f
