Photobook: Content-based manipulation of image databases

Pentland, A.; Picard, R. W.; Sclaroff, S.

doi:10.1007/BF00123143

Published: June 1996

International Journal of Computer Vision, Volume 18, pages 233–254 (1996)

Abstract

We describe the Photobook system, which is a set of interactive tools for browsing and searching images and image sequences. These query tools differ from those used in standard image databases in that they make direct use of the image content rather than relying on text annotations. Direct search on image content is made possible by use of semantics-preserving image compression, which reduces images to a small set of perceptually-significant coefficients. We discuss three types of Photobook descriptions in detail: one that allows search based on appearance, one that uses 2-D shape, and a third that allows search based on textural properties. These image content descriptions can be combined with each other and with text-based descriptions to provide a sophisticated browsing and search capability. In this paper we demonstrate Photobook on databases containing images of people, video keyframes, hand tools, fish, texture swatches, and 3-D medical data.
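
The core idea in the abstract, reducing each image to a few coefficients and searching on those instead of on text annotations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy 4x4 images, and above all the choice of low-frequency 2-D DCT coefficients as the compact description. The DCT is only a simple stand-in here; it is not the appearance, shape, or texture description the paper develops.

```python
import math


def dct2_lowfreq(image, k):
    """Return the k*k lowest-frequency (unnormalized) 2-D DCT-II
    coefficients of a square image given as a list of rows.
    Stand-in for a learned, perceptually motivated basis."""
    n = len(image)
    coeffs = []
    for u in range(k):          # vertical frequency index
        for v in range(k):      # horizontal frequency index
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (image[y][x]
                              * math.cos(math.pi * (2 * y + 1) * u / (2 * n))
                              * math.cos(math.pi * (2 * x + 1) * v / (2 * n)))
            coeffs.append(total)
    return coeffs


def distance(a, b):
    """Euclidean distance between two coefficient vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def build_index(images, k=2):
    """Precompute the compact description of every database image."""
    return {name: dct2_lowfreq(img, k) for name, img in images.items()}


def search(index, query_image, k=2, top=3):
    """Rank database images by similarity to the query in coefficient space."""
    q = dct2_lowfreq(query_image, k)
    ranked = sorted(index, key=lambda name: distance(index[name], q))
    return ranked[:top]


# Hypothetical toy database of 4x4 grayscale images.
database = {
    "horizontal_gradient": [[0, 1, 2, 3] for _ in range(4)],
    "vertical_gradient": [[r] * 4 for r in range(4)],
    "flat": [[1.5] * 4 for _ in range(4)],
}

# Query: a slightly perturbed horizontal gradient.
query = [
    [0.0, 1.0, 2.0, 3.2],
    [0.0, 1.0, 2.0, 3.0],
    [0.1, 1.0, 2.0, 3.0],
    [0.0, 1.0, 2.0, 3.0],
]

index = build_index(database)
results = search(index, query)
print(results)
```

In this sketch each image is compared through four numbers rather than sixteen pixels, and the perturbed query ranks the horizontal gradient first. A system along the lines the abstract describes would swap in a richer description and could merge the resulting ranking with text-based filters.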

Additional information

Perceptual Computing Section, The Media Laboratory, Massachusetts Institute of Technology

About this article

Cite this article

Pentland, A., Picard, R.W. & Sclaroff, S. Photobook: Content-based manipulation of image databases. Int J Comput Vision 18, 233–254 (1996). https://doi.org/10.1007/BF00123143

Received: 26 July 1994
Revised: 10 February 1995
Issue Date: June 1996
DOI: https://doi.org/10.1007/BF00123143
