Abstract
We present an end-to-end system that goes from video sequences to high-resolution, editable, dynamically controllable face models. The capture system employs synchronized video cameras and structured light projectors to record videos of a moving face from multiple viewpoints. A novel spacetime stereo algorithm is introduced to compute depth maps accurately and overcome over-fitting deficiencies in prior work. A new template fitting and tracking procedure fills in missing data and yields point correspondence across the entire sequence without using markers. We demonstrate a data-driven, interactive method for inverse kinematics that draws on the large set of fitted templates and allows for posing new expressions by dragging surface points directly. Finally, we describe new tools that model the dynamics in the input sequence to enable new animations, created via key-framing or texture-synthesis techniques.
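As a reading aid, the general idea behind spacetime stereo can be sketched as window matching that aggregates intensity differences over both space and time. The abstract gives no algorithmic details, so the block below is only a basic illustration under stated assumptions (rectified grayscale sequences, a constant disparity inside each window, a sum-of-squared-differences cost). It is not the algorithm introduced in this paper, and the names `spacetime_ssd` and `best_disparity` are hypothetical.

```python
# Illustrative sketch only: basic windowed spacetime matching for rectified
# image sequences. This is NOT the algorithm introduced in the paper.

def spacetime_ssd(left, right, x, y, t, d, half_xy=1, half_t=1):
    """Sum of squared differences over a window spanning space and time.

    left, right: sequences indexed as [frame][row][col] of grayscale values.
    d: candidate disparity, so left (x, y) is compared with right (x - d, y).
    Assumption: the disparity is constant inside the window.
    """
    cost = 0.0
    for dt in range(-half_t, half_t + 1):          # temporal extent
        for dy in range(-half_xy, half_xy + 1):    # vertical extent
            for dx in range(-half_xy, half_xy + 1):  # horizontal extent
                a = left[t + dt][y + dy][x + dx]
                b = right[t + dt][y + dy][x + dx - d]
                cost += (a - b) ** 2
    return cost


def best_disparity(left, right, x, y, t, max_d, half_xy=1, half_t=1):
    """Return the disparity in [0, max_d] with the lowest spacetime cost."""
    # Skip candidates whose window would fall off the left image border.
    candidates = [d for d in range(max_d + 1) if x - d - half_xy >= 0]
    return min(
        candidates,
        key=lambda d: spacetime_ssd(left, right, x, y, t, d, half_xy, half_t),
    )
```

Setting `half_t=0` reduces the cost to ordinary single-frame window matching. Widening the temporal extent lets time-varying structured light patterns disambiguate matches that look alike in any one frame.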
Spacetime faces: high resolution capture for modeling and animation
SIGGRAPH '04: ACM SIGGRAPH 2004 Papers