9 - Attention and the minimal subscene

Published online by Cambridge University Press:  01 September 2009

Laurent Itti
Affiliation:
University of Southern California, Los Angeles
Michael A. Arbib
Affiliation:
University of Southern California, Los Angeles

Summary

Introduction

The Mirror System Hypothesis (MSH), described in Chapter 1, asserts that recognition of manual actions may ground the evolution of the language-ready brain. More specifically, the hypothesis suggests that manual praxic actions provide the basis for the successive evolution of pantomime, then protosign and protospeech, and finally the articulatory actions (of hands, face and – most importantly for speech – voice) that define the phonology of language. But whereas a praxic action just is a praxic action, a communicative action (which is usually a compound of meaningless articulatory actions; see Goldstein, Byrd, and Saltzman, this volume, on duality of patterning) is about something else. We want to give an account of that relationship between the sign and the signified (Arbib, this volume, Section 1.4.3).

Words and sentences can be about many things and abstractions, or can have social import within a variety of speech acts. However, here we choose to focus our discussion by looking at two specific tasks of language in relation to a visually perceptible scene: (1) generating a description of the scene, and (2) answering a question about the scene. At one level, vision appears to be highly parallel, whereas producing or understanding a sentence appears to be essentially serial. However, in each case there is both low-level parallel processing (across the spatial dimension in vision, across the frequency spectrum in audition) and high-level seriality in time (a sequence of visual fixations or foci of attention in vision, a sequence of words in language).

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2006

