
Photobook: Content-based manipulation of image databases

International Journal of Computer Vision

Abstract

We describe the Photobook system, a set of interactive tools for browsing and searching images and image sequences. These query tools differ from those used in standard image databases in that they make direct use of the image content rather than relying on text annotations. Direct search on image content is made possible by semantics-preserving image compression, which reduces images to a small set of perceptually significant coefficients. We discuss three types of Photobook descriptions in detail: one that allows search based on appearance, one that uses 2-D shape, and a third that allows search based on textural properties. These image content descriptions can be combined with each other and with text-based descriptions to provide a sophisticated browsing and search capability. In this paper we demonstrate Photobook on databases containing images of people, video keyframes, hand tools, fish, texture swatches, and 3-D medical data.
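
As a rough illustration of what searching on a small set of coefficients combined with text descriptions could look like, the following minimal Python sketch ranks database entries by distance between their coefficient vectors and a query vector, optionally restricted by text tags. Everything in it is an assumption made for illustration: the `Entry` class, the `search` function, the tag names, the coefficient values, and the choice of Euclidean distance are hypothetical and are not taken from the Photobook implementation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One database image: a hypothetical id, a short coefficient vector, text tags."""
    image_id: str
    coeffs: list
    tags: set = field(default_factory=set)


def distance(a, b):
    """Euclidean distance between two equal-length coefficient vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("coefficient vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(database, query_coeffs, required_tags=frozenset(), top_k=3):
    """Return the ids of the top_k entries closest to the query.

    Entries are first filtered so that each one carries every required tag,
    then sorted by distance between coefficient vectors.
    """
    candidates = [e for e in database if set(required_tags) <= e.tags]
    ranked = sorted(candidates, key=lambda e: distance(e.coeffs, query_coeffs))
    return [e.image_id for e in ranked[:top_k]]


# Made-up coefficient vectors standing in for compressed image descriptions.
database = [
    Entry("face_01", [0.9, 0.1, 0.0], {"person"}),
    Entry("face_02", [0.8, 0.2, 0.1], {"person"}),
    Entry("tool_01", [0.85, 0.15, 0.05], {"tool"}),
    Entry("fish_01", [0.0, 0.9, 0.4], {"fish"}),
]

query = [0.88, 0.12, 0.0]
print(search(database, query))                            # content only
print(search(database, query, required_tags={"person"}))  # content plus text
```

The point of the sketch is that once each image is reduced to a short vector, a query becomes a sort over cheap vector comparisons, and a text annotation acts as a filter applied before ranking.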




Additional information

Perceptual Computing Section, The Media Laboratory, Massachusetts Institute of Technology


About this article

Cite this article

Pentland, A., Picard, R.W. & Sclaroff, S. Photobook: Content-based manipulation of image databases. Int J Comput Vision 18, 233–254 (1996). https://doi.org/10.1007/BF00123143


  • DOI: https://doi.org/10.1007/BF00123143
