Some utterances are underinformative: The onset and time course of scalar inferences

https://doi.org/10.1016/j.jml.2004.05.006

Abstract

When Tarzan asks Jane Do you like my friends? and Jane answers Some of them, her underinformative reply implicates Not all of them. This scalar inference arises when a less-than-maximally informative utterance implies the denial of a more informative proposition. Default Inference accounts (e.g., Levinson, 1983, 2000) argue that this inference is linked to lexical items (e.g., some) and is generated automatically and largely independently of context. Alternatively, Relevance theory (Sperber & Wilson, 1986/1995) treats such inferences as contextual and as arriving effortfully with deeper processing of utterances. We compare these accounts in four experiments that employ a sentence verification paradigm. We focus on underinformative sentences, such as Some elephants are mammals, because these are false with a scalar inference and true without it. Experiment 1 shows that participants are less accurate and take significantly longer to answer correctly when instructions call for a Some but not all interpretation rather than a Some and possibly all interpretation. Experiment 2, which modified the paradigm of Experiment 1 so that correct responses to both interpretations required the same overt response, confirms the results of the first experiment. Experiment 3, which imposed no interpretation, reveals that those who employed a Some but not all reading of the underinformative items took longest to respond. Experiment 4 shows that the rate of scalar inferences increased as the permitted response time did. These results argue against a Neo-Gricean account and in favor of Relevance theory.

Introduction

There is a growing body of psycholinguistic work that focuses on the comprehension of logical terms. These studies can be broken down into two sets. One investigates the way logical inferences are made on-line in the context of story comprehension (e.g., Lea, 1995; Lea, O'Brien, Fisch, & Noveck, 1990). In this approach, the comprehension of a term like or is considered tantamount to knowing the logical inference schemas attached to it; for or, this would be or-elimination (where the two premises p or q and not-q imply p). The other line of research investigates non-standard existential quantifiers, such as few or a few, demonstrating how the meanings of quantifiers—besides conveying notions about amount—transmit information about the speaker's prior expectations and indicate where the addressee ought to place her focus (Moxey, Sanford, & Dawydiak, 2001; Paterson, Sanford, Moxey, & Dawydiak, 1998; Sanford, Moxey, & Paterson, 1996). For example, positive quantifiers like a few put the focus on the quantified objects (e.g., those who got to the match in A few of the fans went to the match), while negative quantifiers like few place the focus on the quantified objects' complement (e.g., those fans who did not go to the match in Few of the fans went to the match).

In the present paper, we investigate a class of inference—which we will refer to as scalar inference—that is orthogonal to the ones discussed above but is arguably central to the way listeners treat logical terms. Such inferences arise when a less-than-maximally informative utterance is taken to imply the denial of the more informative proposition (or else to imply a lack of knowledge concerning the more informative one). Consider the following dialogues:

  • (1)

    Peter: Are Cheryl and Tony coming for dinner?

    Jill: We are going to have Cheryl or Tony.

  • (2)

    John: Did you get to meet all of my friends?

    Robyn: Some of them.

In (1), Jill's statement can be taken to mean that not both Cheryl and Tony are coming for dinner and, in (2), that Robyn did not meet all of John's friends. These interpretations are the result of scalar inferences, which we will describe in detail below. Before we do so, note that the responses in each case are compatible with the questioner's stronger expectation from a strictly logical point of view; if Jill knows that both Cheryl and Tony are coming, her reply is still true and if in fact Robyn did meet all of John's friends, she also spoke truthfully. Or is logically compatible with and and some is logically compatible with all.

Scalar inferences are examples of what Paul Grice (1989) called generalized implicatures, proposed as he aimed to reconcile logical terms with their non-logical meanings. Grice, who was especially concerned with propositional connectives, focused on logical terms that become, through conversational contexts, part of the speaker's overall meaning. In one prime example, he described how the disjunction or has a weak sense, compatible with formal logic's ∨ (the inclusive-or), but gains a stronger sense (but not both) through conversational use (making the disjunction exclusive). What the disjunction says, he argued, is compatible with the weaker sense, but through conversational principles it often means the stronger one. Any modern account of the way logical terms are understood in context would not be complete without considering these pragmatic inferences.

Grice's generalized implicatures were assumed to occur very systematically, although in some contexts they do not arise. These were contrasted with particularized implicatures, which were assumed to be less systematic and always clearly context dependent. His reasons for making the distinction had to do with his debates with fellow philosophers on the meaning of logical connectives and quantifiers, not with the goal of providing a processing model of comprehension, and there is some vagueness in his view of the exact role of context in the case of generalized implicatures (see Carston, 2002, pp. 107–116). In summary, Grice can be said to have inspired work on implicatures (by providing a framework), but there is not enough in the theory to describe, for example, how a scalar inference manifests itself in real time.

Pragmatic theorists who have followed up on Grice and are keen on describing how scalar inferences actually work can be divided into two camps. On the one hand, there are those who assume that the inference generally goes through unless subsequently cancelled by the context. That is, scalars operate on relatively weak terms: the speaker's choice of a weak term implies the rejection of a stronger term from the same scale. To elucidate with disjunctions, the connectives or and and may be viewed as part of a scale (〈or, and〉), where and constitutes the more informative element (since p and q entails p or q). In the event that a speaker chooses to utter a disjunctive sentence, p or q, the hearer will take it as suggesting that the speaker either has no evidence that the stronger element in the scale, i.e., p and q, holds or perhaps has evidence that it does not hold. Presuming that the speaker is cooperative and well informed, the hearer will tend to infer that it is not the case that p and q hold, thereby interpreting the disjunction as exclusive. A strong default approach has been defended by Neo-Griceans like Levinson (2000) and to some extent by Horn (1984, p. 13). More recently, Chierchia (2004) and Chierchia, Crain, Guasti, Gualmini, and Meroni (2001) have essentially defended the strong default view while drawing a syntactic distinction with respect to scalar terms: when a scalar is embedded in a downward-entailing context (e.g., negations and question forms), Chierchia and colleagues predict that scalar inferences are not produced (also see Noveck, Chierchia, Chevaux, Guelminger, & Sylvestre, 2002); otherwise, they assume that scalar inferences go through.
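
The scale relation can be made concrete with a worked example. The following minimal sketch in Python (our own illustration; neither account is committed to this code) verifies by truth-table enumeration that p and q entails p or q, and shows that the enriched exclusive reading is simply the inclusive reading conjoined with the denial of the stronger scale-mate:

    from itertools import product

    # Every valuation that makes the stronger scale-mate ("and") true
    # also makes the weaker one ("or") true: p and q entails p or q.
    assert all(p or q
               for p, q in product([True, False], repeat=2)
               if p and q)

    def inclusive_or(p, q):
        # The weak, linguistically encoded sense (logic's inclusive ∨).
        return p or q

    def exclusive_or(p, q):
        # The enriched sense: the weak sense plus denial of "p and q".
        return (p or q) and not (p and q)

    # The two senses differ only in the valuation where both disjuncts hold.
    for p, q in product([True, False], repeat=2):
        print(p, q, inclusive_or(p, q), exclusive_or(p, q))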

For the sake of exposition, we focus on Levinson (2000) because he has provided the most extensive proposal for the way pragmatically enriched “default” or “preferred” meanings of weak scalar terms are put in place. Scalars are considered by Levinson to result from a Q-heuristic, dictating that “What isn't said isn't (the case).” It is named Q because it is directly related to Grice's (1989) first maxim of Quantity: Make your contribution as informative as is required. In other words, this proposal assumes that scalar inferences are general and automatic. When one hears a weak scalar term like or, some, might, etc., the default assumption is that the speaker knows that a stronger term from the same scale is not warranted, or that she does not have enough information to know whether the stronger term is called for. Default means that relatively weak terms prompt the inference automatically—or becomes not both, some becomes some but not all, etc. A scalar inference can still be cancelled, but any cancellation occurs subsequent to the production of the scalar term.

On the other hand, there are pragmatists who argue against the default view and in favor of a more contextual account. Such an account assumes that an utterance can be inferentially enriched in order to better appreciate the speaker's intention, but that this enrichment does not operate on specific words as a first step toward a default meaning. We focus on Relevance theory because it arguably presents the most extensive contextualist view of pragmatic inferences in general and of scalar inferences in particular (see the Postface of Sperber & Wilson, 1995). According to this account, a scalar is but one example of the pragmatic inferences that arise when a speaker intends and expects a hearer to draw an interpretation of an utterance that is relevant enough. How far the hearer goes in processing an utterance's meaning is governed by principles concerning effect and effort; namely, listeners try to gain as many effects as possible for the least effort.

A non-enriched interpretation of a scalar term (the one that more closely coincides with the word's meaning) could very well lead to a satisfying interpretation of that term in an utterance. Consider Some monkeys like bananas: a weak interpretation of Some (with which the utterance can be glossed as Some and possibly all monkeys like bananas) can suffice for the hearer and require no further pragmatic enrichment. The potential to derive a scalar inference comes into play when an addressee applies relevance more stringently. A scalar inference could well be drawn by a hearer in an effort to make an utterance more informative (leading to an utterance that can be glossed as Some but not all monkeys like bananas). Scalar inferences thus play an optional role in such enrichment; they are not steadfastly linked to the words that could prompt them. If a scalar inference does arrive in a context that enriches an underinformative utterance, then, all things being equal, the inference ought to be linked with extra effort.

One can better appreciate the two accounts by taking an arbitrary utterance (3) and comparing the linguistically encoded meaning (4a) and the meaning inferred by way of scalar inference (4b):

  • (3)

    Some X are Y.

  • (4a)

    Some and possibly all X are Y (logical interpretation).

  • (4b)

    Some but not all X are Y (pragmatic interpretation).

Note that (4a) is less informative than (4b) because the former is compatible with any one of four possible set relations underlying some in (3). That is, some X are Y is true under four representations, where (i) X is a subset of Y, (ii) Y is a subset of X, (iii) X and Y overlap, and (iv) X and Y coincide. With interpretation (4b), only (ii) and (iii) remain as possibly true. The interpretation represented by (4b) thus reduces the range of meanings of some. According to Levinson, the interpretation in (4b) is prepotently adopted through the Q-heuristic. This becomes the default meaning unless something specific in the context leads one to cancel (4b) and then adopt the reading in (4a).
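
The four relations can be made concrete with a small sketch (the extensions below are toy sets of our own choosing, purely for illustration) that evaluates the logical reading (4a) and the pragmatic reading (4b) against each of (i)–(iv):

    # Toy extensions for X and Y illustrating the four set relations.
    relations = {
        "(i)   X subset of Y":    ({1, 2},    {1, 2, 3}),
        "(ii)  Y subset of X":    ({1, 2, 3}, {1, 2}),
        "(iii) X and Y overlap":  ({1, 2, 3}, {3, 4}),
        "(iv)  X and Y coincide": ({1, 2},    {1, 2}),
    }

    def some_logical(x, y):
        # (4a) Some and possibly all X are Y: a non-empty intersection.
        return bool(x & y)

    def some_pragmatic(x, y):
        # (4b) Some but not all X are Y: a non-empty intersection,
        # plus the denial that every X is a Y.
        return bool(x & y) and not x <= y

    # Under (4b), only (ii) and (iii) come out true.
    for name, (x, y) in relations.items():
        print(name, some_logical(x, y), some_pragmatic(x, y))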

According to Relevance theory, a listener starts with the interpretation that corresponds to the meaning of the words, as in (4a); if that reading is satisfactory, she will adopt it. However, if the listener aims to make (3) more relevant, e.g., more informative, she will adopt (4b) instead. Given that (4b) arrives by way of a supplementary step (the scalar inference), it amounts to deeper processing that comes at a cost in cognitive effort.

We propose that the two explanations can be separated by looking at the time course of processing sentences involving scalar inference. Consider first the Neo-Gricean view. This account assumes that the “default” meaning—which includes the negation of the stronger elements on the scale—is the initial interpretation of the weak scalar term. It follows that to interpret the sentence without the inference, the listener must pass through a stage where the scalar inference has been considered and then rejected on the basis of contextual information. Thus, the time taken to process a sentence without a scalar inference must be greater than or equal to the time taken for one in which the inference is present. In contrast, Relevance theory holds that the weaker sense of the scalar term is considered first, and the inference denying the stronger term is made only if it is sufficiently relevant. Comparing the processing times of sentences interpreted with a scalar inference to those interpreted without it can therefore serve as evidence to distinguish the two theories.
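
The contrast reduces to an ordinal prediction over response times. As a minimal sketch (our formulation of the argument, not code proposed by either theory's authors), the function below classifies a pattern of mean response times by the account it favors:

    def favored_account(rt_with_inference, rt_without_inference):
        """Classify mean response times (in ms) for the two readings of
        an underinformative sentence by the account whose ordinal
        prediction they satisfy."""
        if rt_with_inference > rt_without_inference:
            # Enrichment adds a costly extra step: Relevance theory.
            return "Relevance theory"
        if rt_with_inference < rt_without_inference:
            # Cancelling the default adds the extra step: Default Inference.
            return "Default Inference"
        # Equal times still satisfy the DI prediction that
        # RT(without inference) >= RT(with inference).
        return "Default Inference (boundary case)"

    # Hypothetical means: the pragmatic (enriched) reading is slower.
    print(favored_account(3300, 2700))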

We should state at this point that although Levinson (1983, 2000) believes that default rules and heuristics are an integral part of his theory and that processing issues are central to it, his account has not explicitly made the processing predictions that we suggest above. Nevertheless, we feel that there is some intrinsic interest in generating predictions from such a default model because this idea is at the heart of many Neo-Gricean claims. To avoid confusion between predictions based on a range of Neo-Gricean accounts and those of the default model we test here, we refer to the processing predictions described above as stemming from a Default Inference (DI) account.

Response time experiments in which the interpretation of a scalar term has been important have generally instructed their participants to interpret the term in a strictly logical way (i.e., without the scalar inference). For example, Meyer (1970) instructed participants to treat some to mean some and possibly all in sentence verification tasks with sentences like some pennies are coins. To our knowledge, the only early psychological study to take an interest in the potentially conflicting interpretations of such underinformative sentences was Rips (1975). Rips investigated how participants make category judgments by using sentence verification tasks with materials like some congressmen are politicians. He examined the effect of the quantifier's interpretation by running two studies, one in which participants were asked to treat some as some and possibly all and another in which they were asked to treat some as some but not all. This comparison demonstrated that the participants given the some but not all instructions in one experiment responded more slowly than those given the some and possibly all instructions in the other. Despite these indications, Rips modestly hedged when he concluded that “of the two meanings of Some, the informal meaning may be the more difficult to compute” (italics added). His reaction is not uncommon: many colleagues share the intuition that the pragmatic interpretation seems more natural. In any case, this is an initial finding in favor of the Relevance account.

Surprisingly, this finding has not led to any follow-up experiments. We consider four reasons for this. First, Rips's (1975) investigation was only incidentally concerned with pragmatic issues, so it did not put a spotlight on this very interesting finding. Second, until recently, linguistic–pragmatic issues have not been central to traditional cognitive investigations (see Noveck, 2001). Third, skeptics might point out that Rips's effect relies on data collected across two experiments that were ultimately not comparable. It could be argued that his result is due to sampling bias, because participants were not allocated randomly to the two instruction conditions; also, the experiment that requested a logical interpretation (some and possibly all) included five types of sentences whereas the experiment that requested a pragmatic interpretation included four. Finally, a task requiring participants to apply certain kinds of interpretations is arguably artificial and does not necessarily capture what occurs under more natural circumstances.

We now turn to the four experiments in the paper. We investigate responses to underinformative categorical statements like some elephants are mammals as we compare the Default Inference and Relevance theory accounts of scalar inference onset. In Experiment 1, we replicated Rips (1975, Experiments 2 & 3) in one overarching procedure to address some of the concerns mentioned earlier. Furthermore, we make comparisons between the underinformative sentences and control sentences that were not made in Rips's original experiment. Experiment 2 uses the same paradigm as Experiment 1 but changes the presentation of the sentences and the response options so that correct responses to the two sentences that make up the most critical comparison require the same response key. Experiment 3 was similar to Experiment 1, except that it did not provide precise instructions about the way one ought to treat some. All three of these experiments allow us to compare the Default Inference account and Relevance theory. According to the Default Inference model, a response prompting an implicature should be faster than one that requires its cancellation. In contrast, Relevance theory predicts that the minimal meaning of some allows for an immediate treatment of a statement with no need for an implicature, and that the implicature is produced when participants apply more effort to treating the weak quantifier. The final experiment is a direct test of Relevance theory with this paradigm: participants made the same judgments as in Experiments 1 and 3, but we manipulated the time available for responding. A reduction in processing time was expected to reduce the possibility of producing the scalar inference.
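
To make the paradigm's critical trial concrete, here is a minimal sketch (the condition labels are our own; the actual materials were in French) of how a True/False key press on the underinformative sentence Some elephants are mammals would be scored under each instruction condition:

    # The underinformative (T1) sentence "Some elephants are mammals" is
    # true on the logical reading ("some and possibly all") and false on
    # the pragmatic reading ("some but not all"), since in fact all
    # elephants are mammals.
    T1_TRUTH = {"logical": True, "pragmatic": False}

    def score_trial(condition, response):
        # A True/False key press is correct if it matches the truth value
        # of T1 under the instructed interpretation of "some".
        return response == T1_TRUTH[condition]

    print(score_trial("logical", True))     # correct under Logical instructions
    print(score_trial("pragmatic", False))  # correct under Pragmatic instructions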


Experiment 1

Experiment 1 presents categorical sentences and asks participants to provide True/False judgments. Examples of the six types of sentences included are shown in Table 1, translated from the French. Sentences referred to as T1 are the underinformative statements described before. In one experimental session, participants were instructed to interpret the quantifier some to mean some and possibly all, which we refer to as the Logical condition. In another session, they were told to interpret some to mean some but not all, which we refer to as the Pragmatic condition.

Experiment 2

A potential criticism of Experiment 1 is that the pragmatic effect might be due to a response bias, because the correct response to T1 sentences in the Logical-instructions condition is to say “True” while under Pragmatic instructions the correct response is to say “False.” If one supposes that people are slower at rejecting a sentence than confirming it, then this alternative explanation predicts an advantage for Logical responses over Pragmatic responses. One response to this criticism is to arrange the presentation of the sentences and the response options so that correct responses under both interpretations require the same response key, which is the approach taken in this experiment.

Experiment 3

This experiment uses the same paradigm as Experiment 1; however, we provide neither explicit instructions nor feedback about the way to respond to T1 sentences. Instead, we expect participants' responses to reflect equivocality toward these types of sentences—some saying false and some true. This means that we should have two groups of responses: one in which the inference is drawn (T1 Pragmatic responses) and another in which there is no evidence of the inference (T1 Logical responses). We can then compare the response times of these two groups.

Experiment 4

According to Relevance theory, inferences are neither automatic nor arrive by default. Rather, they are cognitive effects that are determined by the situation and, if they do manifest themselves, ought to appear costly compared to the very same sentences that do not prompt the inference. In Relevance terminology, all other things being equal, the manifestation of an effect (i.e., the inference) ought to vary as a function of the cognitive effort required. If an addressee (in this case, a participant) is allowed less time to process an utterance, less effort can be applied and the inference ought to be produced less often.

General discussion

The experiments presented in this paper were designed to compare two competing accounts of how scalar inferences are generated. Participants were asked to evaluate statements that could be interpreted in one of two ways: either by treating the quantifier some in a logical way and not attaching any inference, or by drawing a scalar inference and treating some to mean some but not all. The theories under consideration make different predictions regarding the length of time required to make these judgments.

Conclusion

This work largely validates distinctions made by Grice nearly a half-century ago by showing that a term like some has a logical reading and a pragmatic one. This study focused on the pragmatic reading that results from a scalar inference. It does not appear to be general and automatic. Rather, as outlined by Relevance theory, such an inference occurs in particular situations as an addressee makes an effort to render an utterance more informative.

References

  • Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision.

  • Carston, R. (2002). Thoughts and utterances.

  • Chierchia, G. (2004). Scalar implicatures, polarity phenomena and the syntax/pragmatics interface.

  • Chierchia, G., Crain, S., Guasti, M. T., Gualmini, A., & Meroni, L. (2001). The acquisition of disjunction: Evidence for a grammatical view of scalar implicature.

  • Clark, H. H. (1973). The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior.

  • Clark, H. H., & Chase, W. G. (1972). On the process of comparing sentences against pictures. Cognitive Psychology.

  • Grice, H. P. (1989). Studies in the way of words.

  • Horn, L. R. (1984). Toward a new taxonomy for scalar inference.

  • Howell, D. C. (1997). Statistical methods for psychology.

  • Just, M. A., & Carpenter, P. A. (1971). Comprehension of negation with quantification. Journal of Verbal Learning and Verbal Behavior.

  • Lea, R. B. (1995). On-line evidence for elaborative logical inferences in text. Journal of Experimental Psychology: Learning, Memory, and Cognition.

  • Lea, R. B., et al. (2002). The effect of negation on deductive inferences. Journal of Experimental Psychology: Learning, Memory, and Cognition.

Support for the majority of this work came from a post-doctoral grant from the Centre National de la Recherche Scientifique (France) to the first author as part of an Action Thematique et Incitative grant awarded to the second author. The first author is presently supported by NIMH Grant 41704, awarded to Professor G.L. Murphy of New York University. Versions of this paper were presented at the First International Workshop on Current Research in the Semantics–Pragmatics Interface (Michigan State University, 2003). The authors wish to express their gratitude to Dan Sperber, Jean-Baptiste van der Henst, Nausicaa Pouscoulous, Gregory Murphy, Jennifer Wiley, Robyn Carston, and three anonymous reviewers whose comments improved the paper.
