Many of our everyday activities involve rules that prescribe or prohibit certain behaviours. Think for example of a sign in the park that says ‘keep off the grass’ or the code of conduct for a company. A paradigmatic activity involving rules is playing a game. The rules of games can be transient, formulated on the spot, and leave room for interpretation, as is often the case when children play together, or can be fixed, codified in a rule book, and written with the express intention of leaving no room for interpretation, as for example in the ‘FIDE Laws of Chess’ rule book. Wittgenstein (1953) proposed that we can think of language in terms of language-games. Regardless of whether one thinks the analogy of language and games holds up, many philosophers and linguists assume that language is like a game at least in the sense that it is a normative activity subject to rules (e.g., Itkonen 2008; Hacker 2014). However, theorists of language disagree about the role that rules play in linguistic behaviour.

On what Matthews (2003) calls the received view, linguistic behaviour is governed by rules that are represented in the cognitive system of individuals (e.g. Fodor 2008). On such a view, learning to speak revolves around forming mental representations of the rules that constitute a particular language. In opposition to the received view, proponents of embodied cognition argue that mental representations do not exist.Footnote 1 Embodied cognition aims to show how behaviour can be ‘regular without being regulated’ (Gibson 1979, p. 215) by showing how cognitive structures emerge from a history of sensorimotor interactions of an organism with its environment (Varela et al. 1991). In line with embodied cognition, people have sought to redescribe language in terms of embodied action, for example in terms of a catalyst of cognition (Verbrugge 1985), a system of replicable constraints (Rączaszek-Leonardi 2009; Rączaszek-Leonardi and Kelso 2008), attentional actions (van den Herik 2018), or in terms of future attractors in dynamic systems (Thibault 2011). However, the role that linguistic rules play in an embodied account of linguistic behaviour is currently underexplored.

In this paper, I start from the ecological-enactive approach to cognition,Footnote 2 a prominent variety of embodied cognition, and show a possible way in which this approach could account for the normativity of language. Linguistic rules are understood in a sense which lies much closer to the everyday use of the word ‘rule’: namely as prescriptions, formulated in language, that normatively structure linguistic practices, where rules are formulated and enforced in interaction. In particular, I put forward as a hypothesis the thesis that metalinguistic reflexivity is constitutive of linguistic normativity. By this I mean that the normativity of language requires metalinguistic practices, by means of which the properties of language behaviour can be (re)negotiated.

It is often assumed that this constitutive role of reflexivity runs into a regress objection. If language is indeed subject to rules, and these rules are formulated in public language, it seems that one must already understand language in order to learn language (Sellars 1954; Searle 1995). In this paper, I argue that the regress objection can be evaded from an ecological-enactive perspective by proposing a conceptual account of how a child learns language by first learning what I dub first-order linguistic skills, and then retrospectively interpreting her own behaviour and that of others in normative metalinguistic terms, i.e., as being guided by rules, by relying on second-order linguistic skills.Footnote 3 Based on this account, I propose to see linguistic rules as resources: they are available to participants in order to (re)negotiate properties of situated language behaviour and thereby reorganize linguistic practices. The account developed in this paper thus allows us to understand the constitutive role of metalinguistic reflexivity for linguistic normativity without falling prey to the regress objection.

The paper unfolds as follows. In Section 1, I define the thesis that Metalinguistic Reflexivity is Constitutive of linguistic normativity (MRC) and briefly discuss some empirical considerations. In Section 2, I motivate MRC by giving three arguments in its favour. Section 3 is dedicated to discussing the prima facie devastating regress objection to MRC as discussed by Searle (1995, p. 72). In Section 4, I propose an alternative response to the regress objection on ontogenetic timescales by relying on the theoretical resources from the ecological-enactive approach to cognition. In this Section, I also flesh out the concept of a linguistic rule and sketch the outlines of a view of rules as resources.

1 Defining the constitutive role of metalinguistic reflexivity

In this first Section, I define the thesis which I put forward as a hypothesis in this paper, that Metalinguistic Reflexivity is Constitutive of linguistic normativity (MRC). I first define metalinguistic reflexivity, after which I explain what I mean by constitutivity. I then make a distinction between ontogenetic and sociogenetic MRC. This Section ends with some empirical considerations.

1.1 Metalinguistic reflexivity

Linguistic behaviour is marked by reflexivity: we effortlessly shift between talking about other things and talking about talking itself (Taylor 2000). This ability to ‘turn language on itself’, as Davidson (1979) describes it, means that language itself is often an object of discussion and attention, not just in poetry and philosophy, but also in everyday situations. Take the following examples:

(1) Do as you are told!

(2) What did you say yesterday?

(3) Could you repeat that last word?

(4) I didn’t understand a word he said.

(5) What do you mean by that?

(6) English is easy to learn.

(7) This book is incomprehensible.

(8) That conversation got us nowhere.

This reflexivity of language is expressed in metalinguistic activities in which we attribute properties to language, the people that use it, and the actions that we can accomplish through its use (Agha 2007, p. 17). A metalinguistic practice consists of recurrent metalinguistic activities, and taking part in these metalinguistic activities requires metalinguistic skills on the part of the participants. I use the term metalinguistic reflexivity to refer to the reflexivity of metalinguistic practices and the skills required to participate in them.

Metalinguistic reflexivity has received remarkably little attention in theoretical reflections on language. One reason for this is that metalinguistic reflexivity is often understood as an optional extension of linguistic behaviour (cf. van den Herik 2017). Here I briefly introduce the optional extension view as a contrast to the constitutive view I discuss next.

On the optional extension view, language is mostly experienced as a transparent medium that enables one to directly engage with what is meant. Innate mental capacities such as a language acquisition device (Chomsky 1965) or a language of thought (Fodor 1975) are surmised to support language ‘acquisition’ and linguistic performance outside the scope of our awareness. On such a view, metalinguistic reflexivity is often described in terms of metalinguistic awareness (e.g. Cazden 1974; Hakes 1980; Van Kleeck 1982; McDaniels and Cairns 1996). This awareness develops only after a child has acquired knowledge of language, as it is thought to rely both logically and developmentally on this knowledge. Metalinguistic awareness is often operationalised in terms of explicit judgements of more technical aspects of language, such as grammaticality, reference, and ambiguity (Myhill and Jones 2015). This leads to the idea that metalinguistic abilities concern decontextualised language, ‘divorced from its utility for communication’ (Cairns 2015, p. 271), and therefore play no role in everyday language behaviour.

These accounts of metalinguistic awareness are couched in individualist terms, a consequence of the ever-popular methodological individualism in the cognitive sciences. My perspective is different in that, in line with the ecological-enactive approach, I do not locate metalinguistic reflexivity in individuals’ cognitive systems, but instead conceive of it as a form of social action. This does not mean that cognitive mechanisms are irrelevant. Of course, certain skills are required. But in the first instance, these skills are social skills that are expressed in interaction with others. Moreover, on this perspective, metalinguistic activities are situated and often do pertain to contextualised language behaviour. Although judgments of grammaticality, reference and ambiguity are metalinguistic activities on my understanding, they are a subset of a much broader class of metalinguistic activities.

The optional extension view has been criticised because it proposes a naïve realism with respect to metalinguistic categories (Cowley 2011; Taylor 1990; Harris 1998). Love (1990) proposes a distinction between first-order language behaviour, which he defines as the ‘real-time, contextually determined process of investing behaviour or the products of behaviour (vocal, gestural or other) with semiotic significance’ (Love 2004, p. 53) and second-order cultural constructs, which arise from metalinguistic reflection on this first-order language behaviour. Although first-order language behaviour is certainly influenced by second-order or metalinguistic reflection, it cannot be reduced to it. The problem, according to Love, is that linguists have taken these cultural constructs, which are the result of reflections on language, and treated them as the cause of language behaviour by reifying them and locating them in the cognitive system of individuals. This in turn makes metalinguistic behaviour a mere reflection of the reified cultural constructs, leading to the optional extension view (Love 2017). If you want to explain linguistic behaviour, so argues Love, you have to start at first-order language behaviour, not at the second-order constructs. In this paper, I take Love’s recommendation to heart, while at the same time foregrounding the normative structuring force of second-order linguistic behaviour.

1.2 Constitutivity

As opposed to the optional extension view, the constitutive view gives metalinguistic reflexivity a constitutive explanatory role. De Jaegher, Di Paolo, and Gallagher (2010, p. 443) define the situation in which a phenomenon occurs as ‘the collection of past and present events, processes and relations that are observed with a phenomenon’. Some of these events, processes, and relations will figure in an explanation of the phenomenon, in which case they are explanatory factors. According to De Jaegher, Di Paolo and Gallagher, these explanatory factors can have one of three different explanatory roles. First, variations in contextual factors produce variations in the target phenomenon. Second, enabling factors are causally necessary for the phenomenon to occur. This means that the absence of an enabling factor in a given situation prevents the phenomenon from occurring. However, the absence of an enabling factor does not imply that the phenomenon cannot occur, as another enabling factor might take its place. For example, learning to write could be enabled by a pencil, but it could also be enabled by a pen. If no pencils are available, a pen will do. Finally, a factor is constitutive if it is a necessary part of the phenomenon. As De Jaegher, Di Paolo, and Gallagher (2010, p. 443) explain, what ‘exact role an element plays in X [the phenomenon] depends on how one chooses to describe and observe X’. This means that the claim that a given factor is constitutive of a phenomenon is a conceptual claim: if our concepts by means of which we describe a phenomenon change, what we take to be constitutive also changes (Di Paolo 2016).

Take the phenomenon of someone speaking. We might describe this event as the production of sounds. In that case, the explanation could be that ‘articulatory patterns’ are constituted by flexibly assembled coordinative structures of articulators (Kelso et al. 1984). However, if we describe the same event as an utterance of meaningful words, a totally different explanation would be in order that would involve very different constitutive elements. On this description, the flexibly assembled coordinative structures referenced earlier could be merely enabling, as linguistic communication can just as well take place in a different modality, as for example in signed languages.

For the purposes of this paper, linguistic normativity is the phenomenon. The explanatory factor under investigation is metalinguistic reflexivity. The claim is that metalinguistic reflexivity is constitutive of linguistic normativity. In other words, MRC means that metalinguistic reflexivity is a necessary part of the collection of events, processes, and relations that are observed in relation to the phenomenon of linguistic normativity. In the next Section, I will motivate this constitutive role. But first I want to make a distinction between variants of MRC.

1.3 Sociogenetic and ontogenetic MRC

So far, I have understood metalinguistic reflexivity as referring both to metalinguistic practices and the skills required to participate in them. We can however be more precise when we make a distinction between those two aspects. First, sociogenetic MRC is the thesis that metalinguistic practices are constitutive of the linguistic normativity inherent in linguistic practices. Second, ontogenetic MRC is the thesis that a child’s metalinguistic skills are constitutive of the development of a sensitivity to linguistic normativity. In other words, only when a child has acquired the relevant metalinguistic skills can her linguistic actions be justifiably described as being subject to linguistic rules. These two theses are interrelated but also have a certain independence.

On the one hand, sociogenetic MRC does not imply ontogenetic MRC. For example, one could argue that although metalinguistic practices are constitutive of the normativity of linguistic practices, an individual could develop the skills required to competently participate in these linguistic practices, and thus be subject to justifiable normative assessment, without developing the skills required to participate in metalinguistic practices. This view would be analogous to the division of linguistic labour proposed by Putnam (1975). On his account, the word gold refers to gold because some speakers in a community have a way of recognising gold – they have criteria for distinguishing gold from other substances. However, not everybody has to know these criteria in order to be able to use the word gold to refer to gold. The fact that gold refers to gold can thus be explained by the abilities of a subset of a community, and need not be explained in terms of individual knowledge.

On the other hand, ontogenetic MRC does imply sociogenetic MRC. The reason is obvious: ontogenetic MRC requires that children learn metalinguistic skills. But one cannot learn the skills required to participate in metalinguistic practices if those practices do not exist. In this paper, I shall foreground ontogenetic MRC, and only briefly discuss the implications of this ontogenetic account for sociogenetic MRC at the end of Section 4.

1.4 Empirical considerations

MRC has empirical implications. If MRC is true, then wherever we find linguistic behaviour that is justifiably characterizable as involving linguistic normativity, metalinguistic reflexivity must be involved. Some linguists have argued that written language plays an important role in practices of reflection (e.g. Goody 1977; Ong 1982; Love 1990). Following this argument, one might hypothesise that oral cultures will lack metalinguistic practices. However, ethnographic fieldwork has shown that oral cultures have rich metalinguistic practices (e.g., Feldman 1991). Therefore, it seems reasonable to assume that wherever there is normative linguistic behaviour, metalinguistic reflexivity is involved. Yet, for MRC to be true, it is not enough that there is in actual fact no instance of linguistic normativity in the absence of metalinguistic reflexivity; we also need to establish conceptually that there could not be linguistic normativity without metalinguistic reflexivity. Empirical evidence thus cannot decide whether MRC is true.

There is, however, a different use of empirical evidence: it can show the possibility of linguistic practices that are radically different from English linguistic practices, with radically different normative structures. This in turn protects us from the ethnocentric fallacy, which Taylor (2010, p. 490) describes as the idea that ‘the reflexive linguistic distinctions which our culture applies in evaluating and characterizing communicational behavior must also be applied – and if not explicitly, then implicitly – by the members of every culture.’

Entertaining these alternative conceptualisations of language may appear highly counter-intuitive. The root of this counter-intuitiveness is that our means of conceptualising language by means of metalinguistic practices are not ‘outside’ language, but form an integral part of it. In this sense, language is a special kind of activity. Love (2003, p. 88; cf. Harris 1998, pp. 26–28) employs a comparison with musical notation. Things such as misunderstandings and differences in opinion as to how the musical notation relates to musical performance have to be negotiated in language, for there is no meta-notation: musical notation is not reflexive. In the case of language, however, there is no recourse to another medium to resolve disputes. This means that we lack a method of neutral comparison of different ways of conceptualising language. The illusion can therefore arise that our (academic) English ways of metalinguistically understanding language, in terms of talking about things (reference), truth, words and sentences, and so on, are the only possible ways to conceptualise language (Lucy 1993).

Empirical evidence, in the form of ethnographic descriptions of (meta)linguistic practices, provides a way of getting a feel for the constitutive nature of metalinguistic reflexivity by seeing that metalinguistic categories from other linguistic communities do not apply to English as spoken for example in England.Footnote 4 For example, for the Ilongot,Footnote 5 all speech falls in one of three categories (Rosaldo 1973): straight speech (qube:nata qupu), crooked speech (qambaqan), and invocatory speech (nawnaw). Straight speech is everyday, ordinary speech, and is used for gossiping, chatting, or exchanging news. Invocatory speech is ‘characterized by frequent and often hyperbolic metaphors, redundant rhythms, and stereotyped lines’ (Ibid., p. 197), for example long spells used to coerce gods and spirits, although one can also nawnaw children, which is to lecture them, or to coerce them by telling them lies. Children and adults have to be persuaded differently in the egalitarian Ilongot society. Crooked speech is a subtle mode of speaking which enables one to ‘hide’ from their words, ‘a distancing of the speaker from his words’ (Ibid., p. 198) by using devices such as metaphor and qualification. For the Ilongot, all speech can be categorised using this tripartite distinction. At the same time, from the descriptions of these three kinds of speech, it is clear that this distinction does not apply to everyday English speech. English simply has no identifiable form of speech that is used both to coerce gods and children.Footnote 6

The Ilongot’s metalinguistic practices have normative implications. For example, for the Ilongot, one should use ‘crooked speech’ in convincing adults, whereas for convincing children ‘invocatory speech’ should be used, a prescription that makes no sense in an English context. A second normative difference concerns the criteria for learning language. In western lay linguistic understanding, it is usually assumed that learning language involves learning words. A child has ‘learned a word’ when she can reliably produce a recognisable vocalisation in the correct circumstances (we will return to language learning, and an ecological alternative for this picture, in Section 4). This lay linguistic view is materialised for example in children’s ‘word books’ that feature pictures of everyday objects and animals that are labelled with the corresponding word.

The Ilongot see tuydek, or commands, as the primary speech act, and maintain that ‘children learn to speak by learning tuydek’ (Rosaldo 1982, p. 209). This perspective on language and language learning means that the Ilongot do not dissociate language from other forms of behaviour in the way English speakers often do: ‘knowing how to speak itself was virtually identical to knowing how and when to act’ (Ibid.), and thus the lay criteria for learning tuydek involve much more than merely producing the right vocalisation in the right circumstances.

2 Motivating the constitutive role of metalinguistic reflexivity

A prima facie reason for thinking that metalinguistic reflexivity is constitutive in explaining linguistic normativity is the fact that metalinguistic activities often have a prescriptive function (Taylor 2000, 2013; Harris 1998). To talk about language (e.g. What does mosh-pit mean?), a linguistic action (e.g. I asked you something!), or some person (e.g. She’s always so talkative) in metalinguistic terms is to suggest treating it or her in a particular way. Of course, there is a continuum here. On the one hand, giving a definition for a new word is a wholly prescriptive act: there is no previous usage to conform to, and thus the definition sanctions a particular way of using the word. On the other hand, we could have a statement like the word red is in the dictionary, which is as descriptive as metalinguistic utterances get. In between these extreme cases there are intermediate cases. For example, an argument about the meaning of a word is not merely a factual disagreement about past usage of that word; its outcome partly determines how that word should be used in the future. In other words, metalinguistic practices have the potential to normatively structure our linguistic practices (I return to this structuring role in Section 4). In what follows, I give three arguments in favour of MRC.

First, metalinguistic reflexivity is required for discussing and (re)negotiating normative properties of first-order linguistic behaviour. Noë (2015, pp. 41–42)Footnote 7 argues that success in linguistic communication is fragile: every communicative context is unique, which entails the possibility of misunderstandings and other breakdowns. If the situation calls for it, we can rely on metalinguistic activities in order to (re)negotiate our metalinguistic understanding of what is going on in conversation, and for this we need to have recourse to a meta-conversation in which we can conceptualise what we are doing as we are engaging in linguistic behaviour.Footnote 8 Linguistic behaviour therefore cannot be reduced to merely regular behaviour or knowledge of conventions (cf. Davidson 1986). In the words of Noë (2015, p. 42), we can always ask ‘such basic questions as How to go on? What is right and what is wrong? What does he mean when he says that? Why did he say that? and so on.’

This argument works both on the level of practices and on the level of the individual. Take the example of word meanings understood as ‘correct’ ways of using words. If we were to lack the resources to discuss, criticise, and (re)negotiate ‘the meaning’ of words, this would entail that no criteria for ‘correctness’ for the use of words could be agreed upon, explained, changed, or enforced (Taylor 2000, p. 489). To have ‘correct’ uses of words requires a way of agreeing on what the correct use is, and this agreement can only be forged by means of metalinguistic reflexivity.

At the level of the individual, participating in these practices of discussing criteria for correct use is a criterion for being a member of the linguistic community. Negotiating a breakdown by explaining what you meant after saying something, for example, is one of the skills that we expect members of our linguistic community to have. This is the case not just for children first learning language, but continues to hold true later in life. If a student is able to reproduce sentences he has read in a philosophy paper, but is unable to explain what these sentences mean in his own words when we ask him to do so, we are unable to ascertain whether he understands them – although we shall have our suspicions.

Second, the possibility of metalinguistic attribution of understanding is crucial for monitoring the coordinative function of language. Any person who aims to coordinate with another person beyond the current situation necessarily relies on the possibility of a meta-conversation in which normative questions of understanding and agreement can be addressed whenever the need arises. If you ask me to perform a task in the future, and it is important to you that the task is performed, you will want to make sure that I have understood you correctly. If we lacked metalinguistic practices, we could not engage in this meta-conversation, and therefore could not do such things as check whether someone understood an instruction, or ask someone to explain what they mean when we did not understand them.

Third, a closely related normative function of metalinguistic reflexivity is that it enables us to hold each other accountable for our linguistic actions. In order to hold someone accountable for what they have done, we have to know what that person has done (Enfield and Sidnell 2017). This knowledge must be agreed upon between at least the person who holds another accountable and the person who is held accountable.Footnote 9 Therefore, a public medium must be available by means of which what a person has done can be discussed; the medium we use for this is language. In other words, without language there can be no social accountability. This argument holds also in the case of actions achieved by linguistic means – think for example of promises and other commitments. And thus, holding another person accountable for their linguistic actions constitutively requires metalinguistic reflexivity.

As an illustration, suppose I say ‘I’ll finish my revisions tomorrow’ to my colleague. If metalinguistically construed as a promise, I can be held accountable for finishing the revisions tomorrow. However, the speech act can also be construed as a prediction, or as an educated guess as to when I will find the time to finish the revisions. How the utterance is metalinguistically construed thus determines the commitment I make.Footnote 10 In everyday interaction, we need not always resort to explicit metalinguistic activities to construe our linguistic acts. However, for any number of reasons, such as the situation being ambiguous, or the stakes of finishing the task being very high, we can resort to explicit metalinguistic negotiation of the commitment. Besides this ongoing (re)negotiation of the commitment undertaken by performing a linguistic action, another reason why metalinguistic reflexivity is crucial for social accountability is that we require a way of reporting on past speech in order to hold people accountable now for what they said earlier. We need a means of referring back to that act to justify that someone else is accountable (‘But yesterday you said that you would finish the revisions today!’).

3 The regress objection

So far, I have defined MRC (Section 1) and I have given three reasons to motivate this thesis (Section 2). There is however a potentially defeating counterargument against MRC: the regress objection. In this Section, I introduce this regress objection based on Searle (1995) as well as his solution which is emblematic of the received view. In the next Section, I will develop an alternative response to this regress objection.

Searle (1995) starts from institutional facts – facts that are dependent on human agreement in institutions, such as money, property, research grants, and marriages. These institutional facts are constituted by rules of the form ‘X counts as Y (in context C)’. For example, submerging a newborn in water counts as baptising it (when done by a priest), and saying ‘sorry’ counts as apologising (if sincerely uttered). Searle (1995, p. 27) distinguishes between constitutive rules and regulative rules. Regulative rules regulate a pre-existing phenomenon. Searle gives the example of the rule ‘drive on the right side of the road’. The activity of driving is not constituted by this rule; it is perfectly possible to drive a vehicle without a rule being in place that regulates on which side of the road one should drive (in the case of certain vehicles, even the road is optional). Constitutive rules, on the other hand, ‘create the possibility of certain activities’ (Ibid.). Here Searle’s example is chess: if one does not follow the rules of chess, one does not play chess. Note that Searle’s concept of constitutive rules is stronger than the concept of constitutivity at play in this paper. I return to this issue in Section 4.

According to Searle (1995, pp. 69–70), institutional facts involve constitutive rules that require language. The reason for this is that the Y status of Xs needs to be marked, as the Y status goes beyond the physical characteristics of Xs. For example, a ball rolling between the goal posts counts as a goal in football (if the ball is in play). The status of this event as a goal can only be explained in terms of people doing things such as keeping track of the score. Moreover, in order to be effective in the interactions between people (e.g., determining the winner), the status has to be marked in a public way.

When Searle (1995, p. 60) gets to language, he encounters the regress objection. The reasoning is the following: ‘If institutional facts require language and language is itself an institution, then it seems language must require language, and we have either infinite regress or circularity.’ Note that in the case of language, the X in the constitutive rule ‘X counts as Y (in C)’ is linguistic. This means that the constitutive rules of language are metalinguistic in nature.

In order to avoid the regress objection, Searle grants language a special status. Where most institutional facts require language for their identification as they must be marked in a public way, language is ‘self-identifying’ according to Searle (1995, p. 73): ‘The child is brought up in a culture where she learns to treat the sounds that come out of her own and others’ mouths as standing for, or meaning something, or representing something.’ This in turn means that the child must have a special ability to learn language: ‘the capacity to attach a sense, a symbolic function, to an object that does not have that sense intrinsically is the precondition not only of language but of all institutional reality’ (Ibid., p. 75, emphasis added).Footnote 11 The explanation of institutional facts thus ultimately relies on ‘primitive prelinguistic psychological states’ (Ibid., p. 78) that in an important sense already contain what needs to be explained. For assuming that children can attach such a status means that they can already follow constitutive rules of the form ‘X counts as Y (in C)’ before they first learn language.

Based on Searle’s account, we can formulate the regress objection in terms of our earlier discussion as follows:

[1] Learning language starts out as learning to follow the rules of language.

[2] Learning to follow the rules of language requires being able to participate in metalinguistic practices that enable the formulation of these rules.

[3] Therefore, learning language in linguistic practices requires already being able to participate in metalinguistic practices.

[4] But being able to participate in metalinguistic practices means that you have already learned language.

[5] Therefore, learning language requires already having learned language.

Note that the regress argument thus sketched, if viable, provides a defeating argument against ontogenetic MRC. The reason is that the conclusion is obviously absurd, as it entails that language cannot be learned. Therefore, either assumption [1] or [2] has to be false. Searle’s solution is to accept [1] and argue against [2]: learning to follow rules, at least in the case of language, does not require being able to participate in the metalinguistic practices that enable the formulation of these rules. In order to avoid the potential infinite regress, Searle thus has to deny the force of his own argument in the case of language.

Searle’s response to the regress objection is exemplary of the ‘received view’ mentioned in the introduction of this paper, as the rules of language are thought to be represented in the cognitive system of an individual. The ecological-enactive approach eschews these mental representations, and thus cannot rely on this kind of solution. But before I formulate an alternative response to the regress objection, it is good to realise that it is not just a skepticism with respect to mental representations that forces us to abandon Searle’s solution.

More generally, the view that ‘rules’ can be located ‘inside’ individuals can be rejected on Wittgensteinian (1953) grounds. In the example from the introduction, the rule on the sign (‘keep off the grass’) has a normative status, and it gets this normative status from the fact that it can be enforced. When someone breaks the rule, the groundskeeper can remove them from the park, citing the rule as a reason. If, however, we now try to imagine a rule that is present only in the cognitive system or consciousness of an individual, it ceases to have any normative force, as it cannot be enforced. Moreover, since there is only one person with access to the rule, there cannot be a distinction between what seems to be right, or in agreement with the rule, and what is actually right, or in agreement with the rule (e.g. Itkonen 2008).

4 An alternative response to the regress objection

In order to provide an alternative response to the regress objection, it is important to make a distinction between mere regularities and rule-regulated regularities. As an example of the former, take a desire path.Footnote 12 As people walk across the grass, a path gets made as the grass gets damaged by the walking. This desire path, in turn, may guide future walking behaviour. If you are going roughly in the direction the desire path is going, you might end up walking on the desire path. However, there is no normativity involved in walking along the desire path. Although many people happen to walk on the desire path, and they do so because of material effects of this pattern in behaviour, there is no rule according to which they should do so. Sellars (1954) calls this kind of behaviour pattern governed. His solution to the regress objection consists in arguing that language is a complex form of behaviour that involves both pattern governed and rule following behaviour. In particular, learning language does not start out by learning rules, but is instead, initially at least, a form of pattern governed behaviour.

In this Section, I propose a conceptual account of ontogenetic MRC starting from the ecological-enactive approach to cognition that is in line with Sellars’ solution. I will do so by making a distinction between different kinds of linguistic skills. The main idea is to first account for regular communicative behaviour, which I will call first-order linguistic skills, and show how the addition of reflexive forms of communicative behaviour, or second-order communicative skills, enables a sensitivity to linguistic normativity by enabling a child to use regulative rules. Note that for the purposes of this paper, I only propose a conceptual framework for this ontogenetic story, which means that I leave fleshing out this story with reference to empirical literature for a later occasion. Initial steps in this direction are made by Taylor (2012, 2013), who discusses the role of metalinguistic reflexivity in language learning in general.

4.1 The ecological-enactive approach and regular communicative behaviour

According to Gibson (1979, p. 215), the question an approach to cognition has to answer is how behaviour can be regular without being regulated. The ecological approach to cognition he pioneered starts from two premises in order to account for this regularity: (1) organisms perceive affordances, which are possibilities for action, and (2) the environment of the organism is structured in such a way that these affordances are directly perceivable. The regular behaviour of organisms can be explained in terms of acting on perceived affordances. Affordances are relational: they depend both on the layout of that environment and on the skills of the organism (Rietveld and Kiverstein 2014; Chemero 2009). Organism and environment are thus mutually specifying. This means that an organism’s environment is not the physical environment, but its ecological niche. In the case of humans, the ecological niche ‘is shaped and sculpted by the rich variety of social practices humans engage in’ (Rietveld and Kiverstein 2014, p. 326). This shaping can be understood in line with the desire path example discussed at the beginning of this Section.

We are not sensitive to affordances in isolation. Instead, we are always open to a multitude of possibilities for action, which is called a field of affordances (Rietveld and Kiverstein 2014), and thereby enact, i.e., bring forth, our world. Not all affordances equally invite a person to act on them (Withagen et al. 2012). Which affordances are inviting, and the degree to which they are inviting, depends on a host of factors: what activity one is currently engaged in, one’s mood, the social setting, and so on.

On this ecological-enactive approach to cognition, communicative behaviour is in the first instance understood as directly acting on the field of affordances of others, by indicating particular affordances (Reed 1995; Baggs 2015). These affordances thereby become more inviting, and it becomes more likely that the person acts on these affordances. When applied to language learning, a child’s ‘first words’ do not function by standing for things; instead, they are social actions aimed at directing other people’s attention in order to do something together (Reed 1996). That is, the child learns to communicate in ‘the midst of “doing”’ (Bruner 1990, p. 70). The child is not a mere observer, nor a passive creature that requires conditioning, but an active participant in activities that recur time and time again, and it is in these activities that she starts ‘doing things with words’ (Rączaszek-Leonardi 2009, p. 170).

When we consider the intersubjective context in which the child starts out communicating, we see that her communicative behaviour is best characterised, not in terms of acquiring knowledge-that about language, but as coming to know-how to do things by talking (Taylor 2013, p. 317; van den Herik 2017; Simpson 2010; Hanna 2006). As Reed (1995, p. 2) argues, this means that the central question in language development is not how a child learns a language, understood as an abstract system of rules, but rather ‘how the child comes to enter the linguistic community’ by learning a repertoire of social skills.

The skills the child thus develops enable the child to engage in social actions aimed at manipulating joint attention. I have called these social actions attentional actions (van den Herik 2018).Footnote 13 What a child learns as she learns to say her first words is thus not referential knowledge (she does not learn that ‘ball’ stands for balls); rather, she learns a social skill, namely to indicate aspects of recurrent situations while engaging in joint activities with others. Attentional actions function like ostensive gestures: they direct attention, where attention is understood as the selective openness to affordances. This directing of attention is not a mute pointing. Attentional actions act as ‘operators of reminiscence’ (Bottineau 2010, p. 283), linking the present situation to previous situations and thereby suggesting a way of going on.

Although currently underexplored, I think this account of attentional actions is a promising perspective on linguistic behaviour beyond a child’s first words. Our linguistic activities continue to fulfil an ostensive role with respect to possibilities for action. Talking about some thing is to draw attention to that thing as a particular thing, and this can be understood on the model of attentional actions, even in the case of highly theoretical language use. Kukla (2017), for example, states that Haugeland, in a seminar, said that philosophy ‘was just a particularly sophisticated and elaborate form of ostension; we use philosophical discourse to direct one another’s attention to how things are’.Footnote 14

It is important to realise that communicative situations are very different from the perspective of the child who learns her first attentional actions than they are from the perspective of her caregivers. There is an asymmetry in the situation: whereas the caregiver can evaluate the child’s verbal behaviour in normative terms (as being (in)appropriate, (in)correct, and so on), and encourage or discourage aspects of the child’s communicative behaviour on the basis of these evaluations, the child initially lacks these normative, metalinguistic concepts. Therefore, we can call the child’s initial communicative skills, following Love’s distinction discussed in Section 1.1, first-order linguistic skills. Fortunately, in order to enter her linguistic community, the child does not need to have these normative concepts; she only has to attune to the patterns in behaviour of her community that enable her to guide others’ attention and have her attention be guided by the verbal behaviour of others.Footnote 15

4.2 Normative bootstrapping

As the child develops her initial verbal skills, or at least, skills that are taken to be verbal from the perspective of members of her linguistic community, she does not yet have the skills that enable her to consider whether her verbal behaviour is (in)correct or (in)appropriate. At this stage, although the child’s communicative behaviour is roughly in line with the patterns of communicative behaviour in her community, she does not yet follow rules. In order to be able to do this, according to the hypothesis developed in this paper, the child has to develop second-order linguistic skills, which she does by learning to engage in metalinguistic, reflexive communicative behaviour.

One more perceivable aspect of the child’s environment is the linguistic behaviour of herself and others. A child’s first forays into her community’s metalinguistic practices can thus be described in similar terms (see van den Herik in preparation for an account along these lines; cf. Taylor 2013). That is, the child’s reflexive communicative skills are in the first instance reflexive attentional actions: attentional actions that direct attention to other attentional actions. This means that the child’s initiation into her community’s metalinguistic practices need not be described in normative terms. All we need are systematic links between reflexive attentional actions and verbal aspects of the environment of the child. How these systematic links come about can be described non-representationally in terms of attentional actions.

The crucial insight, which I take from Brown (2006), is that the regularity without regulation of the child’s first-order communicative behaviour plays an important part in understanding how the child can make the shift from merely regular, or pattern governed behaviour to rule following behaviour. The key is that, as a result of her behaviour being regular and roughly in line with her community’s practices, the child is justified in applying the normative metalinguistic concepts to her own behaviour. Of course, she also learns to apply these concepts to the behaviour of others such as her caregivers, which is also in line with the communal patterns of behaviour.

This retroactive interpretation, a second-order skill, of her own first-order communicative behaviour in terms of rules enables the child to conceive of her own behaviour as being guided by rules before that was actually the case. As Brown argues, this retroactive interpretation of her own behaviour does not require backward causation, for the child’s new skills consist in interpreting her own past behaviour as being guided by rules, not in describing it as being guided by rules at the time. As soon as the child can interpret her own behaviour in these normative terms, she can also conceive of herself as being subject to the rules. The child can do so because she knows she can behave in accordance with the rules.

Note that the idea of retroactive interpretation does not entail a ‘light-bulb’ moment when the child all of a sudden makes the leap from merely regular behaviour to truly normative behaviour. The second-order or reflexive linguistic skills that enable the child to articulate and enforce linguistic rules can be acquired in a piecemeal fashion. But it does help us to explain how the child can justifiably see herself as being guided by rules in coming to grasp the normative concepts of her community.Footnote 16

4.3 Linguistic rules

Up until now, the discussion has been fairly abstract, but we are now in a position to flesh out the concept of linguistic rules in more detail. I do so by discussing two accounts of the role linguistic rules play in language. In this discussion, I highlight the regulative role that rules play in linguistic practice.

Following Sellars, Garner (2014)Footnote 17 takes linguistic behaviour to consist, in the first instance, in pattern governed behaviour. Yet, at the same time he thinks rules are important. Garner (2014, p. 112) defines rules as ‘explicit statements of regularities in language patterns that purport to specify the conditions that must be adhered to by language users.’ On this account, rules are not static facts, but contingent outcomes of processes of formulation, which is why Garner also refers to rules as formulas. Formulation is only one side of the story; rules also have to be implemented. For implementation, formulas are collected in formularies. For Garner, the primary formularies are grammars and dictionaries, but he also includes ‘the wide range of manuals giving guidance on topics as diverse as style, public speaking, effective communication, and the like, and (much more indirectly) iconic texts of literature, oratory, and, increasingly, the media’ (Ibid., p. 115).

Like Noë, as discussed in Section 2, Garner stresses that communication can never be taken for granted. For Garner, this means that language has to be predictable, in the sense that interacting individuals ‘can reliably presume that their own and others’ meanings can to a reliable extent be interpreted forwards from any moment in an interaction’ (Garner 2014, p. 117). Although in practice, people might not often consult formularies, their main role is in providing a guarantee that language is a predictable entity by assuring us that we are all speaking ‘the same language’ (Ibid., p. 118). For Garner, linguistic rules, in the first instance, play a sociological role in unifying nation states around the concept of an official, codified, language.

While I find much to agree with in Garner, I feel that reducing the role of rules in language to rule-books (‘formularies’) grants rules a much too limited role. In fact, I think that his definition of rules – ‘explicit statements of regularities in language patterns that purport to specify the conditions that must be adhered to by language users’ – allows for a much broader reading of the role of rules in linguistic behaviour. It is relevant here that the ongoing process of formulation, as described by Garner, is not limited to the writing of dictionaries and grammars. As Harris reminds us, the kind of linguistic inquiry that lies at the basis of writing dictionaries and grammars, ‘is conditional on the reflexivity of language’ (Harris 1998, p. 26). Garner’s discussion of rule-books can thus be seen as one possible manifestation of a more basic practice of articulating and enforcing rules.

In order to understand this broader role of linguistic rules, I now discuss Hacker’s account of linguistic rules for word meanings. Starting from the work of the later Wittgenstein (1953), Hacker (2014, §5) argues that explanations of meaning provide rules for the use of words: ‘They furnish us with standards for the correct use of the word explained. If “Oxford blue” is explained as: that colour, then anything which is that colour can correctly be said to be Oxford blue.’ Like Garner, Hacker foregrounds the importance of regularities in accounting for rules. The move from regularities to rules is effected by the recognition of a regularity by someone, who then takes that regularity to be a standard of correctness (or appropriateness, or adequacy). Viewed in this way, the pattern governed behaviour of calling a particular colour ‘Oxford blue’ can be normatively enforced by saying that this colour is called ‘Oxford blue’.

Practices of articulating rules, such as giving explanations of the meaning of words, do not leave first-order linguistic behaviour untouched. As mentioned earlier, an argument about the meaning of a word is not merely a factual disagreement about past usage of that word; its outcome partly determines how that word should be used in the future. Noë (2015) therefore calls these reflexive practices reorganisational practices. Our ability to reflect on first-order linguistic behaviour, recognise patterns in it, and then normatively enforce these patterns has a reorganising, or structuring, effect. Deviations from a pattern, which were mere deviations until the pattern was recognised and elevated to a rule, now become ‘incorrect’ or ‘inappropriate’, and therefore must be avoided.

Our second-order skills are not limited to enforcing regularities. Of course, once reflexive practices are in place, they can also be used to set up new patterns, for example in defining a neologism. But we can also question regularities, or doubt them, or argue against enforcing them, and so on. Rules articulated on the basis of reflexive second-order skills can thus feed back into first-order patterns in many ways, and thereby change our first-order behaviour. Given this reorganisational potential of reflexive second-order skills, our first-order linguistic behaviour will always carry the traces of metalinguistic activities.

Both first-order linguistic skills and second-order linguistic skills are forms of know-how (cf. Sellars 1971, §24). And usually, these skills go hand in hand, for if one knows-how to use a word, one is usually able to explain its meaning to someone else, either in words, or by giving examples, and so on. At the same time, distinguishing between them enables us to make sense of learning language without falling prey to the regress objection.

We can now translate this perspective on rules to the development of the child’s ‘first words.’ For the child that has yet to master second-order skills, these are not words yet, and she is also unable to provide rules for using these words by providing criteria for correct use. Only later will she acquire the reflexive metalinguistic skills that enable her to reflect on the regularities in her own and others’ verbal behaviour, and to treat these regularities as rules, i.e., as standards of correctness. The child will do so by providing criteria for correct use, for example in giving an ostensive definition, by telling other people that they are wrong, by giving a verbal definition, and so on. In doing so, the child’s relation to the regularities in her communicative behaviour changes. As she learns to participate in the game of providing and discussing criteria that distinguish correct from incorrect use of words, she gradually becomes sensitive to the normative dimension of language.

So far, we have discussed rules in terms of the meaning of words. This is only one example of linguistic rules. The present analysis can easily be extended to accommodate other kinds of linguistic rules, such as the conditions of appropriateness at play in the distinction between polite and non-polite ways of talking, or the syntactic rules at play in grammars, and how these rules are enforced by caregivers and in institutionalised schooling. Think for example of a parent telling a child to ‘say thank you’ when receiving a gift or a Dutch parent telling their child to use formal pronouns when talking to unfamiliar adults.

Not all second-order linguistic behaviour involves rules, in the sense of articulated regularities. Only when regularities are articulated, and standards of correctness, appropriateness, or adequacy are in play, can we say that second-order behaviour involves rules. When I ask you to repeat what you just said, for example, no regularity in behaviour is articulated by me. Second-order behaviour that involves rules is thus only a part of all second-order behaviour.

At this point, one might object that the current analysis of linguistic rules is only applicable to Western contexts, where the concept of a rule is in play. However, I do not think that practices like explaining the meaning of a word rely on having the concept of a rule. On the present analysis, the Ilongot parent that corrects a child because she incorrectly responds to a command (tuydek), or explains to the anthropologists that it is inappropriate for one adult male to convince another by lying to him (nawnaw), is giving a rule. The Ilongot recognises a regularity in first-order languaging behaviour, and normatively enforces this regularity, which is all that is required on the present account for counting as a rule.

4.4 Rules as resources

From the discussion of rules so far, a picture emerges according to which rules only apply in so far as we view ourselves as being subject to them, that is, in so far as we have the reflexive resources for enforcing them. This idea of normativity does not mean that anything goes, for our sensitivity to the normativity of rules is grounded in pattern-governed forms of behaviour. Here a comparison to Searle’s account is illuminating: even though I take rules to be constitutive of linguistic normativity, and linguistic normativity to be central to language, this claim is much weaker than Searle’s account of constitutive rules. In particular, the order of explanation is reversed when compared to Searle’s account. For Searle, and the received view in general, constitutive rules explain regularities in behaviour, because the rules represented in the cognitive system of individuals govern linguistic behaviour. On that view, learning language is modelled on learning chess, where one starts out by learning the basic rules that determine legal moves. The ecological-enactive approach, by contrast, conceives of learning language as developing communicative skills that are initially merely pattern-governed. On the ecological-enactive account, regularities enable the initial formulation of rules, which in turn can explain how some aspects of these regularities get to have a normative force as they are enforced in practices, leading to rule following rather than pattern governed behaviour. By thus distinguishing pattern governed and rule-following behaviour, we can explain how a child gradually enters the linguistic community, as Reed (1995, p. 2) says, ‘first as a marginal, and later as a central member.’

As Davidson (1986) argued, the interpretation of concrete, situated linguistic behaviour cannot be reduced to (knowledge of) rules for the use of linguistic forms. Understanding what someone says is always a situated affair, and draws as much on our knowledge of the world as on our linguistic knowledge, a conclusion that blurs the boundaries between linguistic and non-linguistic knowledge. The process of recognising regularities, and normatively enforcing these regularities as rules, does not lead to an inflexible repertoire of linguistic tools, which would be wholly unusable as it would lack the possibility to adapt to concrete situations. The formulation of rules and criteria for applying them is thus never complete, but remains a matter of ongoing determination and (re)negotiation (cf. Di Paolo et al. 2018).

Viewed in this way, linguistic rules and the criteria for applying them do not determine or govern linguistic behaviour. Pace Searle, our linguistic activities are thus not constituted by rules in the way that playing chess is constituted by rules (cf. Love 1999). If you do not follow the rules of chess, e.g., by making a diagonal move with a rook, in an important sense you do not play chess. But in language, this does not hold. Instead, what comes first, both on ontogenetic and sociogenetic timescales, is shared forms of regular communicative behaviour that can be understood in terms of attentional actions. Which rules apply in any given context is up to the participants themselves to decide. For example, by and large people’s linguistic behaviour is in line with some rules of grammar. However, under extreme stress or for comedic effect, someone might resort to single words in order to communicate. Although this breaks the rules of grammar, it does not entail that their behaviour is no longer linguistic. Other examples include using a word in a novel way, thereby breaking with regular usage, coming up with novel words, breaking spelling rules in textspeak, pronouncing a word in a strange way, inventing a new word on the spot, combining words from different languages in one sentence, or botching a turn of phrase.

Therefore, I propose to conceive of rules as regulative resources by means of which people can regulate their communicative interaction. This is perhaps obvious when applied to teaching situations, e.g. in institutional language education. But outside teaching situations, if the situation requires it, a recourse to articulated rules provides participants with ways of solving a breakdown in communication and negotiating the communicational import of utterances in an unfolding conversation. Here it is crucial to see that the reflexive second-order skills that we learn in order to navigate the normativity of language are practical skills, aimed at coordinating our behaviour with others. Take the practice of making promises. In everyday circumstances, people are not interested in the metaphysics of promises. But they are interested in whether someone sincerely made a promise, whether he will keep his promise, and what keeping a promise entails.

We can therefore compare linguistic rules with traffic rules (cf. Wittgenstein 1967, §440; cf. Edwards 1997). Traffic rules normatively regulate the behaviour of people as they partake in traffic and people often act in accordance with the traffic rules because they have been trained to do so. At the same time, traffic rules do not determine or govern what people do in traffic. A person might choose to ignore traffic rules because they are late or take pleasure in the sound the engine makes at 150 km/h. In the case of a breakdown of traffic behaviour, say an accident, the rules can be cited to ascertain who was at fault – ‘You should have given right of way!’ – and can thus be used to determine accountability. In the absence of traffic rules, no such judgement could be made. In his example of using the rule ‘drive on the right side of the road’ as a regulative rule, Searle remarked that the activity of driving does not depend on the rule. And in a way, we could have a form of driving without having traffic rules. But this practice would be something very different from traffic as we know it. While it is therefore true that there could be a kind of traffic without traffic rules, the addition of regulative rules introduces a normative dimension to traffic that it otherwise would not have. In other words, while traffic is not constituted by rules, but merely regulated by them, traffic qua normative activity is constituted by rules.

The same goes for language. Communicative behaviour qua first-order pattern-governed behaviour is not constituted by rules, but the normativity of language is. It is the addition of regulative rules, i.e., the reflexive recognition of regularities and the use of this reflexive recognition in determining criteria and standards of correctness, appropriateness, or adequacy, that makes our communicative behaviour into a normative practice.

So far we have approached the regress objection on ontogenetic timescales. At this point, it might be objected that the regress objection will return on sociogenetic timescales. However, the account developed in this paper enables us to tell a story on sociogenetic timescales as well. In particular, we can imagine that pure first-order communicative pattern-governed behaviour understood in terms of attentional actions can gradually transform into normative behaviour by means of the piecemeal development of reflexive skills. An example of such pure first-order, pattern-governed communicative behaviour is the alarm calling of vervet monkeys (Manser 2013; Seyfarth et al. 1980). Making the distinction between pattern-governed behaviour and rule-following behaviour, in combination with ecological-enactive theoretical tools for accounting for this pattern-governed behaviour, means that telling such a story is possible in principle.

At this point, one might wonder whether this purely pattern governed behaviour is ‘language’, or whether the normativity afforded by second-order reflexive skills is required for calling behaviour linguistic. Here I concur with Love (2017, p. 7), who maintains that where ‘we draw a line between the linguistic and the non-linguistic, if a line is needed at all, is not in my view an interesting question.’ The ecological-enactive approach foregrounds the continuity between pure first-order communicative behaviour and the highly complicated language use of philosophers by understanding both in terms of attentional actions, and it also shows the transformative potential of second-order reflexive skills.

5 Conclusion

In this paper, I have proposed and defended the hypothesis that metalinguistic reflexivity is constitutive of linguistic normativity. I have shown how metalinguistic practices can be used to articulate and negotiate standards of correctness, appropriateness, and adequacy. The meta-conversation afforded by metalinguistic practices enables us to reflexively recognise and normatively enforce patterns of communicative activities. A potential defeating counterargument against the constitutivity of metalinguistic reflexivity is the regress objection. We saw that at the root of this regress objection lies the idea that learning language requires learning to follow the constitutive rules of linguistic practices. On that idea, knowledge of the rules explains regularities in behaviour. On the ecological-enactive alternative I sketched, learning language can be understood as first learning regular communicative behaviour by developing first-order linguistic skills that can be understood in terms of attentional actions, and then learning the second-order skills that enable one to reflexively recognise these regularities and normatively enforce them. On this view, it is only because a child first behaves in regular ways that she can then interpret her own behaviour in normative metalinguistic terms, that is, as being guided by rules. Metalinguistic reflexivity enables regulation of communicative behaviour, and thereby constitutes linguistic normativity. The account developed in this paper thus allows us to understand how metalinguistic reflexivity constitutes linguistic normativity without falling prey to the regress objection.