Online interpretation of scalar quantifiers: Insight into the semantics–pragmatics interface

https://doi.org/10.1016/j.cogpsych.2008.09.001

Abstract

Scalar implicature has served as a test case for exploring the relations between semantic and pragmatic processes during language comprehension. Most studies have used reaction time methods and the results have been variable. In these studies, we use the visual-world paradigm to investigate implicature. We recorded participants’ eye movements during commands like “Point to the girl that has some of the socks” in the presence of a display in which one girl had two of four socks and another had three of three soccer balls. These utterances contained an initial period of ambiguity in which the semantics of some was compatible with both characters. This ambiguity could be immediately resolved by a pragmatic implicature which would restrict some to a proper subset. Instead in Experiments 1 and 2, we found that participants were substantially delayed, suggesting a lag between semantic and pragmatic processing. In Experiment 3, we examined interpretations of some when competitors were inconsistent with the semantics (girl with socks vs. girl with no socks). We found quick resolution of the target, suggesting that previous delays were specifically linked to pragmatic analysis.

Introduction

Where does language end and communication begin? While many aspects of utterances are tightly linked to word meaning and syntactic structure, other facets are clearly added by context-sensitive, inferential processes. For example, in the dialogue in (1), we can infer from the response of the Little Red Hen that she has not finished making the bread. Indeed, if we do not make this inference her response would be a non-sequitur.

(1) The Lazy Dog: Have you made the bread yet?

The Little Red Hen: I’ve planted the grain.

But this inference is not part of the truth conditional content of the Hen’s statement. Planting the grain does not rule out the possibility of making bread. In fact, on the farm, one event typically precedes the other. This division between semantically-encoded meaning and the inferences that we can derive from it was made prominent by Grice (1957, 1975). Semantics is used to refer to the truth conditional content of the utterance or the aspects of the interpretation that can be directly calculated from the meanings of words and the structural relationships between them. In contrast, pragmatics is used to refer to the aspects of interpretation arrived at by an inferential analysis of the utterances with respect to the context and the communicator’s goals. Grice proposed that while pragmatic inferences make use of the semantic analysis, they are distinct from this truth conditional content because they are in fact defeasible. In other words, we can imagine a situation where our initial inference (the bread is not finished) is explicitly canceled by subsequent statements from the Hen in (2).

(2) The Little Red Hen: (frustrated) And I’ve cut the wheat and ground the flour. In fact, I’ve done everything and now I will eat the bread all by myself!

While Grice’s distinction between semantic content and pragmatic inference has been widely accepted, there are divergent theories about the nature of these two levels of representation and their relation to one another (Levinson, 1983, Levinson, 2000, Recanati, 2003, Sperber and Wilson, 1986/1995). These theories are not psycholinguistic models but they differ in their conception of the processes that mediate between semantic and pragmatic interpretation and how they might interact. These processes have been explored by psycholinguists, primarily by examining the effects of context on language comprehension or contrasting the processing of utterances that require a particular inference with those that do not. Evidence of early pragmatic processing has been demonstrated across phenomena as diverse as the resolution of lexical ambiguity, the use of contrast sets to predict the referent of a modified noun, and the interpretation of metaphoric expressions (Rayner and Duffy, 1986, Glucksberg et al., 1982, Frisson and Pickering, 1999, Sedivy et al., 1999). For example, Sedivy and her colleagues (1999) demonstrated that listeners were quicker to comprehend “Pick up the tall glass” in the presence of another contrasting member of the same category (e.g., a short glass). This rapid sensitivity to the presence of a pragmatically specified comparison set suggests that upon hearing tall, listeners were able to quickly generate an inference that the referent is likely to belong to a set of objects from the same category (presumably the one with a short and tall item) (Sedivy, 2003, Grodner and Sedivy, in press).

However, this research leaves open the question of whether these rapid pragmatic inferences are preceded by some degree of semantic analysis. In fact, in many cases, pragmatic inferences seem to depend upon aspects of lexical and compositional semantics. In the example above, the relevance of the contrast set can only be determined after recognizing that tall is a scalar adjective that encodes the dimension of height. Thus we might expect to find some moment in processing, however brief it may be, when the semantic contribution of a given word is available but the pragmatic inference that it triggers is not. The experiments in this paper are an attempt to find that moment. We do this by exploring a test case where the division between semantic meaning and pragmatic inference is sharply defined: the interpretation of scalar quantifiers.

Linguists have long noted that terms like some have two distinct interpretations (Horn, 1972, Horn, 1989, Gazdar, 1979). Typically, sentences like (3) will be taken to imply that Ernie ate only a proper subset of the apples (he did not eat all of them).

(3) Bert: Where are the apples that I bought?

Ernie: I ate some of them.

However, on occasion some can be used in a context that does not exclude the total set. For example, in (4) Cookie Monster asserts that he has eaten some of the cookies but then goes on to explain that he ate all of them.

(4) Bert: If you ate some of the cookies, then I won’t have enough for the party.

Cookie Monster: I ate some of the cookies. In fact, I ate all of them.

Grice argued that these two interpretations are actually the result of a single meaning of some that is compatible with all. As Fig. 1 illustrates, the two terms, some and all, can be ordered on a scale with respect to the strength of the information that they convey (Horn, 1972, Horn, 1989, Gazdar, 1979). On this theory, the meaning of the weaker term (some) is consistent with all amounts greater than a lower boundary (some is greater than none) up through and including the maximum value (all). In sentences like (4), this meaning is transparent. Utterances with interpretations like this are termed lower-bounded since the scalar term has a lower boundary but no upper bound.

However, weaker scalar expressions are typically interpreted as having an upper boundary which excludes referents which are compatible with the maximal term (as in 3). This happens via a pragmatic inference called scalar implicature. According to Grice (1975), the participants in a conversation expect that each will tailor their contribution to be as informative as required but no more informative than is required (Quantity Maxim, p. 45). Thus one can imagine a situation where Ernie had actually polished off the apples and uttered (5).

(5) Ernie: I ate all of the apples.

The existence of this more informative alternative means that if the speaker chooses instead to use a weaker scalar term like in (3), the listener can apply the Quantity Maxim and infer that this was a situation where the speaker is not in a position to make the stronger assertion (presumably because the stronger scalar term was not true). When this inference is made, the resulting interpretation is called upper-bounded since it imposes an additional boundary on the upper end of the scale. In other words, like Bert, we can infer that if Ernie had eaten all of the apples, he would have simply said so. Thus he must have eaten some-but-not-all of them. However, like all pragmatic inferences, the scalar implicature is defeasible allowing for the possibility of lower-bounded interpretations when the inference is cancelled or never calculated (as in 4).
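The contrast between the two readings can be sketched in a few lines of code. The Python below is purely illustrative: the function names and the representation of a situation as a pair of counts are assumptions introduced here, not part of the Gricean account. It treats the lower-bounded meaning as the semantics of some and derives the upper-bounded reading by additionally negating the stronger alternative all.

```python
def some_lower_bounded(eaten: int, total: int) -> bool:
    """Semantic meaning of 'some': more than none, up to and including all."""
    return 0 < eaten <= total


def all_of(eaten: int, total: int) -> bool:
    """Meaning of the stronger scale-mate 'all'."""
    return total > 0 and eaten == total


def some_upper_bounded(eaten: int, total: int) -> bool:
    """'some' plus the scalar implicature: the stronger alternative is denied."""
    return some_lower_bounded(eaten, total) and not all_of(eaten, total)


if __name__ == "__main__":
    total = 4  # hypothetical number of apples Bert bought
    for eaten in range(total + 1):
        print(eaten, some_lower_bounded(eaten, total), some_upper_bounded(eaten, total))
```

On this sketch the two readings differ only when the total set is consumed: the lower-bounded function accepts that situation, as in (4), while the upper-bounded function rejects it, as in (3).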

This logic can be extended to any set of terms which can be placed on an ordinal scale and which differ in their strength (Horn, 1972, Horn, 1989, Levinson, 2000). Parallel inferences have been noted for a wide range of expressions including scalar adjectives (warm vs. hot), aspectual verbs (start vs. finish), and logical operators (or vs. and). Thus if Ernie says he likes his soup warm, we can infer that he doesn’t like it hot, or if Bert says that he has started the book, we can infer that he hasn’t finished it. Scalar implicatures can even be generated in cases where alternatives are ordered solely by virtue of the context or our knowledge of common practices (Hirschberg, 1985, Papafragou and Tantalou, 2004). For example, in sentence (1), our knowledge of making bread establishes a scale and as a result, the Hen’s use of the weaker alternative (planted the grain) leads the listener to infer that the stronger is not true (made bread).

In summary, the Gricean description of scalar implicature provides an explanation for the two readings of weak scalar terms which invokes constraints at two distinct levels of interpretation. At the semantic level, the meaning of some is always compatible with the total set (some-and-possibly-all). However, at the pragmatic level, the interpretation can vary. Typically, an implicature will be calculated, as in (3), and some will be compatible with only a proper subset (some-but-not-all). However, this implicature is optional, and when it is absent or cancelled, as in (4), the pragmatic interpretation will have the same content as the semantic analysis. Note that the scalar implicature limits interpretation to a subset of the circumstances that are allowable on the basis of the semantic restriction alone. This creates an ideal situation for understanding the relationship between semantic and pragmatic processing since the facets of meaning that are assigned at each level of analysis have consequences for the potential referents of the quantified phrase. In the remainder of the Introduction, we will first briefly review recent studies comparing semantic and pragmatic interpretation of various scalar terms using offline and online measures of language comprehension and then we will describe a series of experiments designed to isolate these components within real-time processing.

Psycholinguistic studies have provided some empirical support for this two-level analysis of scalar interpretation. One source of evidence comes from research on developmental changes in the construal of scalar terms. These studies have demonstrated that while adults consistently favor upper-bounded readings, children prefer lower-bounded interpretations for a variety of scalar terms. For example, Noveck (2001) asked children and adults to evaluate statements like “x might be y” in contexts where “x must be y” was true. He found that while adults overwhelmingly rejected the weaker modal, seven- to nine-year-olds accepted it, suggesting that they treated the weaker statement as compatible with the stronger one. Similarly, Papafragou and Musolino (2003) found that five-year-olds, but not adults, were content to accept weak scalar predicates like started in situations where the stronger term finished applied.

In the absence of a distinction between semantics and pragmatics, this pattern would be puzzling. For example, if we assume that the scalars are simply semantically ambiguous (some means both some-and-possibly-all and some-but-not-all), then we are confronted with a learning paradox. Adults typically use weak scalars in situations in which they are upper-bounded. Children, like adults, show a robust bias for the more common interpretation of an ambiguous word (Swinney & Prather, 1989). So how and why would children develop a preference for the less frequent lower-bounded readings? Noveck (2001) points out that the theoretical distinction between semantics and pragmatics allows us to make sense of this pattern: children, like adults, correctly retrieve a lower-bounded semantics for these scalar terms but, unlike adults, they fail to make the pragmatic implicature.

The nature of scalar implicature can also be explored by investigating the time-course of interpretation in adults. Several researchers have done this by comparing reaction times to sentences with implicatures to those without implicatures. For example, Bott and Noveck (2004) examined the response times for truth-value judgments of sentences containing weak scalar quantifiers like “Some elephants are mammals.” For underinformative statements like these, participants’ spontaneous judgments reveal how they are interpreting the sentence. False responses indicate an upper-bounded interpretation, while true responses indicate a lower-bounded one. They found that participants who judged the statements to be false took longer than those who judged them to be true. The authors attribute this difference to the time that it takes to generate the implicature. A similar data pattern has emerged in several other studies measuring speeded truth-value judgments of underinformative usages of some (Rips, 1975, Noveck and Posada, 2003, De Neys and Schaeken, 2007).

While these results are consistent with the two-level analysis described above, aspects of the method limit the conclusions we can draw. First, the use of underinformative statements introduces potential confounds. To link increases in reading time to one interpretation, the experimenters must either manipulate the participants’ construal of the critical scalar or measure spontaneously occurring differences in the preferred analysis. If interpretation is directly manipulated (e.g., by instructing participants to analyze some as some-but-not-all or some-and-possibly-all), then we cannot be sure that the processes involved in deliberately carrying out this instruction are the same as those that would be involved in ordinary comprehension. If we examine spontaneous variation in interpretation (e.g., by comparing trials where the pragmatic reading is accessed with those where the semantic reading prevails), then we necessarily move from an experimental design to a correlational one. This introduces the third-variable problem: the possibility that differences in reaction time between the two response types are attributable to some third factor which is responsible both for the longer reaction times and for the contrasting responses.

Feeney and colleagues have explored this latter possibility. They suggest that the reaction time differences between “pragmatic responders” and “logical responders” are attributable to differences in the participants’ response strategies (Feeney, Scrafton, Duckworth, & Handley, 2004). Thus they note that the participants in Noveck and Posada’s study (2003) who selected the upper-bounded response to the underinformative sentences also had slower reaction times to the other items, suggesting that these participants may simply be more cautious and systematic. They attribute the use of overt strategies to the large number of critical trials that were used and the relative scarcity of filler items (raising awareness of the critical items). When the authors conducted a parallel study with fewer trials but more trial types, they found that individuals failed to adopt a consistent strategy and instead produced both “pragmatic” and “logical” responses. Critically, the within-subject comparison revealed no difference in the reaction times for the lower- and upper-bounded interpretations.

A second methodological limitation is that research on this topic has relied almost exclusively on sentence final judgments about the validity or truth of a statement. These judgments provide limited information about the processes that underlie the apparent delays. This is problematic for several reasons. First, since judgment times are not directly mapped onto separable periods of analyses, it is unclear whether scalar implicatures are ever preceded by a period of semantic analysis. Instead participants who spontaneously adopt an upper-bounded interpretation might do so in a single unitary process. Second, the use of verification tasks creates uncertainty about whether the increases in reaction times are actually attributable to linguistic processes, rather than processes involved in verification (Bott & Noveck, 2004). Is the delay caused by the time taken to calculate the implicature and thus restrict the meaning of the target utterance (the conversion of “Some elephants are mammals” to “Only some elephants are mammals”)? Or does it reflect the need for additional time to test this more restrictive meaning against the participant’s stored knowledge (the time required to ascertain that not only are there elephants that are mammals, but in fact there are no elephants which are not)?

These limitations are highlighted by differences in the findings across experiments. For example, while many investigators have found delays for upper-bounded interpretations, others have found no differences in the reaction times (Feeney et al., 2004) or delays for items in lower-bounded contexts (Bezuidenhout & Cutting, 2002). If we take these increases in reaction times as indicative of additional stages in processing, these studies are difficult to reconcile. Feeney and colleagues (2004) have suggested that the pattern might be explained by a three-stage process: first the logical reading is accessed, then in most contexts the implicature is calculated, and finally in some contexts this implicature can be cancelled, restoring the lower-bounded reading. The direction of the reaction time difference will depend on whether the lower-bounded readings in a particular study reflect the first or third stage of processing. However, in the absence of detailed information about the time-course of interpretation, this account remains speculative.

A recent paper by Breheny et al. (2006) addresses some of these limitations. The authors use a phrase-by-phrase self-paced reading task to examine the effects of context on the generation of implicatures for both some and or. This task has greater temporal resolution and places fewer demands on the participants. In the case of or, Breheny and colleagues found that scalar terms embedded in upper-bounded contexts were read more slowly than those in lower-bounded contexts, suggesting that the pragmatic inference involves an additional process which is not automatically triggered across all utterances (Experiment 1). In the case of some, additional support for this hypothesis is provided by the reading times for a continuation that presupposes the upper-bounded interpretation (Experiment 3). Participants were presented with the upper-bounded or lower-bounded context seen in (6) and (7) and their reading times were compared during two critical regions following the quantifier.

(6) Upper-bounded context: Mary asked John whether he intended to host all his relatives in his tiny apartment. John replied that he intended to host some of his relatives. The rest would stay in a nearby hotel.

(7) Lower-bounded context: Mary was surprised to see John cleaning his apartment and she asked the reason why. John told her that he intended to host some of his relatives. The rest would stay in a nearby hotel.

They found that those who encountered the term in the upper-bounded context showed delays in reading the quantified phrase (“some of his relatives”), suggesting that the scalar implicature was calculated at this initial period. In contrast, those who encountered the term in the lower-bounded context demonstrated delays in the following region, in which the proper subset was explicitly referred to (“the rest would stay”), suggesting that the upper-bounded inference had not yet been made in the initial period.

While these studies are clearly consistent with the two-level analysis described above they also leave some questions open. First, because they rely on complex contextual manipulations to drive interpretation, it is difficult to pin the differences in reading times directly to the inference process. For example, in the study described above, the upper-bounded context not only emphasizes the need for a boundary, it also (1) contains considerably more overlap between the context sentence and the target sentence, (2) makes use of the contrasting scalar term (all), and (3) provides an antecedent in the discourse (“all his relatives”) for the critical scalar phrase (“some of his relatives”). It is not clear what the impact of each of these differences would be on reading times in the critical regions independent of their effects on implicature. Second, while these studies have greater temporal sensitivity, they still leave open the question of how participants arrive at the two analyses. Is the upper-bounded analysis slower because it is preceded by the lower-bounded one? Or does the delay merely reflect a difference in the length of a single process caused by a difference in complexity or accessibility of the two analyses?

One way to circumvent these problems is to use a procedure that provides an indirect measure of comprehension as it takes place. The visual-world eye-tracking paradigm has been used extensively in psycholinguistic research to yield a sensitive, time-locked measure of linguistic processing (Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). Participants are presented with spoken instructions, asking them to manipulate objects within a visual reference world, while their eye movements to those objects are measured. This procedure has two advantages for exploring the relationship between semantics and pragmatics. First, since eye movements are typically made without conscious reflection, they provide a more implicit measure of comprehension prior to overt judgments, which may invoke higher-level strategic processes. Second, because eye movements are rapid, frequent and tightly linked to the processing of spoken language, they provide a fine-grained measure of how interpretation unfolds over time. Thus rather than having to infer the difficulty of a process based on the length of sentence final reaction times, these fixations provide information about the nature of the interpretation at a given point in time.

In the following experiments, we investigated how the processing of scalar terms unfolds over the course of online speech comprehension. Participants heard stories in which two types of objects were divided up between four characters, two boys and two girls. These stories were accompanied by a visual display. In the first experiment, the items were always divided such that one of the critical characters (e.g., the girls) had a proper subset of one item (e.g., the socks) while the other had the total set of a second item (e.g., the soccer balls). In the critical condition, participants were given instructions like “Point to the girl that has some of the socks” (see Fig. 2) and their eye movements were recorded. These trials contained a period of semantic ambiguity beginning at the onset of the quantifier during which the referent of a lower-bounded reading of some is compatible with both of the critical characters.

Eye movements to the target in this condition were compared to those in trials asking for “all of the socks” (in a context where one participant has all the socks and another has a proper subset of the soccer balls). In this case, the Distractor character (the girl with some-but-not-all of the soccer balls) is inconsistent with the semantics of the quantifier. Thus if semantic meaning constrains interpretation prior to calculation of a pragmatic implicature, we would predict quick referential disambiguation in the all trials but prolonged competition between the two characters during the some trials. To ensure that differences between these trials were not simply due to preferences for larger quantities or a greater difficulty in calculating upper bounds relative to lower bounds, we also used terms from a number scale, two and three. Like all, these terms do not require a pragmatic inference to specify exact quantities and consequently do not have the same temporary semantic ambiguity as some. Thus performance on the trials with two provides a crucial comparison, since the meaning of two rules out the same competitor as some would once the implicature is calculated (the girl with soccer balls). By comparing these trials we can see whether there is any temporal delay between reference restriction via semantic content and reference restriction via pragmatic implicature.
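The referential logic behind these predictions can be laid out as a small sketch. The Python below is an illustration under stated assumptions, not the authors’ analysis: the `Character` class, the `compatible` and `candidates` functions, and the single simplified display (one girl with two of four socks, one girl with three of three soccer balls) are hypothetical, and numerals are given an exact reading, in line with the claim above that they need no pragmatic inference to specify exact quantities. The sketch asks which characters remain possible referents at the quantifier, before the noun is heard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    name: str
    held: int   # number of items this character has
    total: int  # number of items of that type in the story


def compatible(quantifier: str, c: Character, with_implicature: bool = False) -> bool:
    """Is the character a possible referent once the quantifier is heard?"""
    if quantifier == "some":
        lower_bounded = 0 < c.held <= c.total
        if with_implicature:
            # Scalar implicature: exclude the character holding the total set.
            return lower_bounded and c.held < c.total
        return lower_bounded
    if quantifier == "all":
        return c.total > 0 and c.held == c.total
    if quantifier == "two":      # exact reading assumed
        return c.held == 2
    if quantifier == "three":    # exact reading assumed
        return c.held == 3
    raise ValueError(f"unknown quantifier: {quantifier}")


DISPLAY = [
    Character("girl with socks", held=2, total=4),
    Character("girl with soccer balls", held=3, total=3),
]


def candidates(quantifier: str, with_implicature: bool = False) -> list[str]:
    return [c.name for c in DISPLAY if compatible(quantifier, c, with_implicature)]


if __name__ == "__main__":
    for q in ("some", "two", "all", "three"):
        print(q, candidates(q), candidates(q, with_implicature=True))
```

In this sketch only some leaves both girls as candidates on the semantic reading; adding the implicature narrows the set to the same single referent that two picks out semantically, which is the comparison the trials are designed to exploit.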

Section snippets

Participants

Twenty undergraduate students at Harvard University participated in this study. They received either course credit or $5 for their participation. All participants were native monolingual English speakers.

Procedure

Participants sat in front of an inclined podium divided into four quadrants, each containing a shelf where pictures could be placed (i.e. upper-left, upper-right, lower-left, and lower-right). A camera at the center of the display was focused on the participant’s face and recorded the direction

Participants

Twenty undergraduate students at Harvard University participated in this study. They received either course credit or $5 for their participation. All were native monolingual English speakers who had no history of participation in the previous experiment. Two additional students took part in the study but were excluded from these analyses due to experimental error.

Procedure

The procedure was identical to Experiment 1.

Materials

The materials were similar to Experiment 1 with one key difference: the distribution of

Participants

Twenty undergraduate students at Harvard University participated in this study. They received either course credit or $5 for their participation. All were native monolingual English speakers who had not participated in the previous experiments.

Procedure

The procedure was identical to the previous experiments.

Materials

The materials compared the interpretation of the scalar quantifier some in two different referential contexts. For the 2-referent trials, we again introduced participants to displays that contrasted

General discussion

This study explores the real-time interaction between semantic and pragmatic meaning by investigating the interpretation of scalar terms. In Experiments 1 and 2, we found quick resolution of the referent when participants heard two, three, and all but initial delays when they heard some and had to make an upper-bounding inference. In Experiment 3, we again found delays in looks to the referent when some contrasted with all but we found no such delays when some was contrasted with none. These

Acknowledgments

This work benefited from conversations with members of the Laboratory for Developmental Studies and MIT-Harvard Number Reading Group. We are grateful to Ayo Adigun, Hila Katz, Charlotte Distefano, and Jane Pollock for their assistance in data collection and coding. Portions of this work have been presented at the 19th annual meeting of CUNY Sentence Processing and the 28th annual meeting of the Cognitive Science Society. This material is based upon work supported by the National Science

References (63)

  • K. Ito et al. (2008). Anticipatory effects of intonation: Eye movements during instructed visual search. Journal of Memory and Language.
  • M. Kjelgaard et al. (1999). Prosodic facilitation and interference in the resolution of temporary syntactic closure ambiguity. Journal of Memory and Language.
  • I.A. Noveck (2001). When children are more logical than adults: Experimental investigation of scalar implicatures. Cognition.
  • I.A. Noveck et al. (2003). Characterizing the time course of an implicature: An evoked potentials study. Brain and Language.
  • A. Papafragou et al. (2003). Scalar implicatures: Experiments at the semantics–pragmatics interface. Cognition.
  • L.J. Rips (1975). Quantification and semantic memory. Cognitive Psychology.
  • J. Sedivy et al. (1999). Achieving incremental semantic interpretations through contextual representation. Cognition.
  • J. Snedeker et al. (2004). The developing constraints on parsing decisions: The role of lexical-biases and referential scenes in child and adult sentence processing. Cognitive Psychology.
  • D. Swingley et al. (2002). Recognition of words referring to present and absent objects by 24-month-olds. Journal of Memory and Language.
  • G. Altmann et al. Now you see it, now you don’t: Mediating the mapping between language and the visual world.
  • Baumann, S. (2005). Degrees of givenness and their prosodic marking. In Paper presented at the international symposium...
  • M.E. Beckman et al. (1994). The ToBI annotation conventions.
  • Breheny, R. (2004). Some scalar implicatures aren’t quantity implicatures—but some are. In Proceedings of the 9th...
  • Brugos, A., Shattuck-Hufnagel, S., & Vielleux, N. (2006). Transcribing prosodic structure of spoken utterances with...
  • R. Carston. Informativeness, relevance and scalar implicature.
  • G. Chierchia. Scalar implicatures, polarity phenomena, and the syntax/pragmatic interface.
  • Chierchia, G. (2004b). Numerals and formal vs. substantive features of mass and count. In Paper presented at Linguistic...
  • W. De Neys et al. (2007). When people are more logical under cognitive load: Dual task impact on scalar implicature. Experimental Psychology.
  • A. Feeney et al. (2004). The story of some: Everyday pragmatic inferences by children and adults. Canadian Journal of Experimental Psychology.
  • A. Fernald et al. (1998). Rapid gains in speed of verbal processing by infants in the second year. Psychological Science.
  • S. Frisson et al. (1999). The processing of metonymy: Evidence from eye movement. Journal of Experimental Psychology: Learning, Memory and Cognition.