1 What Is the Problem?

All human beings use media, whether in the form of gestures, speech, news programmes, websites, music, advertisements or traffic signs. The collaboration of all these media is essential for living, learning and sharing experiences. Understanding mediality is one of the keys to understanding meaning-making in human interaction, whether directly through the capacities of our bodies or with the aid of traditional or modern external devices.

Media can be understood as communicative tools constituted by interrelated features. All media are multimodal and intermedial in the sense that they are composed of multiple basic features and can be thoroughly understood only in relation to other types of media with which they share basic features. We do not have standard communication on one hand and multimodal and intermedial communication on the other. Therefore, basic research in multimodality and intermediality is vital for further progress in understanding mediality—the use of communicative media—in general. Intermediality is an analytical angle that can be used successfully for unravelling some of the complexities of all kinds of communication.

Scholars have been debating the interrelations of the arts for centuries. Now, in the age of mass media, electronic media and digital media, the focus of the argumentation has been broadened to the interrelations among media types in general. One important move has been to acknowledge fully the materiality of the arts: like other media, they depend on mediating substances. For this reason, the arts should not be isolated as something ethereal, but rather seen as aesthetically developed forms of media. Still, several of the issues discussed within the old interart paradigm are also highly relevant to multimodal and intermedial studies. One such classical locus of the interart debate concerns the relation between the arts of time and the arts of space. In the eighteenth century, Gotthold Ephraim Lessing famously argued in Laocoön that there are, or rather should be, clear differences between poetry and painting (1984 [1766]). Lessing’s core question of what implications spatiotemporal differences have for media remains acutely relevant today.

I believe it is equally important to highlight media differences and media similarities when trying to get a grip on multimodality and intermediality. Whereas earlier scholarship leaned towards emphasising differences, recent decades have shown a tendency to deconstruct media dissimilarities, not least through the writings of W. J. T. Mitchell (1986), who criticised ideologically grounded attempts to find clear boundaries between media types and particularly art forms. Other scholars, like Shlomith Rimmon-Kenan, have emphasised that media differences come in grades: ‘It seems to me that (1) most of the distinctions between media will turn out to be matters of degree rather than of absolute presence or absence of qualities; and (2) what is a constraint in one medium may be only a possibility in another’ (Rimmon-Kenan 1989: 161). I feel that this is a productive view that still needs to be developed methodically. I find it as unsatisfying to keep talking about ‘writing’, ‘film’, ‘performance’, ‘music’ and ‘television’ as if they were separate persons who can be married and divorced as it is to rest content with the belief that all media are always fundamentally blended in a hermaphroditical way.

In brief, one might say that the crucial ‘inter’ part of intermediality is a bridge, but what does it bridge over? If all media were fundamentally different, it would be hard to find any interrelations at all; if they were fundamentally similar, it would be equally hard to find something that is not already interrelated. However, media are both different and similar, and intermediality must be understood as a bridge between media differences that is founded on media similarities. The primary aim of this article is to shed light on precisely these differences and similarities in order to better understand intermedial relations.

I identify five problematic tendencies in the exploration of mediality, including what is known as multimodality and intermediality studies. Although these tendencies were stronger a decade ago when I published the initial version of ‘The Modalities of Media’ (Elleström 2010), and several scholars have proposed ways to tackle them, they still exist.

  1.

    Research is carried out without proper explanations of the concept of medium. Just as multimodality studies are often conducted without accurate definitions of mode, intermediality tends to be discussed without clear conceptions of the medium. I argue that if the concept of medium is not properly defined, one cannot expect to comprehend mediality and intermediality, which makes it difficult to integrate medium with mode and other related concepts. This is not only a terminological problem; on the contrary, it concerns the formation of conceptual frameworks capable of operating over large areas of communication.

  2.

    Only two media types at a time are compared. Following the traditions of interart studies, intermedial work has a strong tendency to compare no more than two media types at a time. Countless publications have focussed on word and image, word and music, film and literature, film and computer games, visual art and poetry and other constellations including two or perhaps three media types. While such studies are legitimate and may offer great insights, they usually delimit the field of vision in such ways that the outcomes are not helpful for analysing other forms of media interrelations. This results in a multitude of incompatible terms and concepts that blur the essential core features of media in general.

  3.

    Media in general are studied through concepts developed for language analysis. Twentieth-century research in the humanities has been strongly affected by the language-centred semiotics of Ferdinand de Saussure (2011 [1916]). Although Saussure has been seminal for understanding language better, his ideas have also, to some extent, harmed the conceptualisation of communication in general. This is because his concepts lack the capacity to explain anything other than the conventional aspects of signification, which Saussure explored in terms of arbitrariness of signs. This excludes core features of several media types. The strong bias in a lot of Western research towards trying to understand all kinds of communication in terms of language has been counterproductive, overall, and is still a major threat to a cross-disciplinary understanding of media properties. This is true even for the significant amount of research that clearly focuses on non-verbal aspects (multimodality research in the tradition of Kress and van Leeuwen 2001), although the field is currently moving towards a less language-centred approach (Bateman et al. 2017).

  4.

    Misleading dichotomies structure the arguments. Although advanced terminology and theoretical sophistication are not lacking, many researchers still use largely undefined and deeply ambiguous layman’s terms, such as ‘text’ and ‘image’, to describe the nature of media. Although such terms are indispensable for everyday use, and valuable for preliminary scholarly categorisations, they refer to notoriously vague concepts, which causes misunderstanding and confusion to become standard features of academic discussions. Attempts to create systematic and comprehensive methodologies and theoretical frameworks fail because the most basic concepts are not clearly delimited. For instance, the terms ‘text’ and ‘image’ may refer to media with fundamentally different material, spatiotemporal and sensorial features. Consequently, efforts to understand the relationship between so-called texts and images are doomed to fail, leaving us with nebulous and insufficient ideas of ‘mixtures’ of text and image unless more fine-grained explanations are made. Similarly, the ‘verbal’ vs. ‘visual’ media dichotomy is inadequate. Although it may be practical for upholding rough differences between some media types, it is actually confusing and counterproductive when trying to understand media similarities and differences in a deeper way. Because being visual is a sensorial trait and being verbal is a semiotic trait, it is pointless to oppose the two. Some media are verbal, others are not; some media are visual, others are not; and some media are both verbal and visual.

  5.

    Media traits are not distinguished from media perception and signification. Another recurring problem is the failure to distinguish between inherent media traits and the perception of those traits. This is understandable since it is, in practice, impossible to separate the two. Nevertheless, it is crucial to discriminate theoretically between the modes of existence of media and the perception of these modes in order to apprehend media differences and similarities. Although this is doubtless a slippery business, it is important to acknowledge that, for instance, the quality of time in a movie, understood as a mode of existence, is not the same as the time required to perceive a still photograph. Furthermore, time can be said to be present in many forms in the same medium. A still photograph, which does not have time as a mode of existence, can nevertheless represent temporal events. If one avoids taking notice of these intricacies, one is left with a featureless mass of only seemingly identical media that cannot be compared properly.

The goal of this article is to suggest solutions to these problems through the following means:

  1.

    A methodical elaboration of the concept of medium

  2.

    A systematic development of concepts that are applicable to all media types

  3.

    A multifaceted understanding of communication that is not anchored in linguistic concepts

  4.

    A fine-grained manner of conceptualising the multitude of media traits beyond standard formulae

  5.

    A nuanced investigation of the relations among basic media traits, perception and signification

I hope that fulfilling this objective will make it possible to understand better what media borders are and how they can be crossed, how one can comprehend the concept of multimodality in relation to intermediality, what it means to combine and integrate different media and how it is possible for different media types to communicate similar things.

My suggested conceptual solutions are not the only ones available. However, to keep my lines of argument as clear as possible, I refrain from engaging in excessive critique of other positions. Furthermore, my ambition is not to propose anything like a complete model for analysing communication; instead, the objective is to scrutinise precisely intermedial relations. Understanding such interrelations may be vital for various forms of investigations, and, depending on the aims and goals of those investigations, the concepts and principles that I propose here must be complemented with other research tools.

The term ‘medium’ is widely employed, and it would be pointless to try to find a straightforward definition that covers all the various notions that lurk behind the different uses of the word. Dissimilar notions of medium and mediality are at work within different fields of research, and there is no reason to interfere with these notions as long as they fulfil their specific tasks. Instead, I will circumscribe a concept that is applicable to the issue of human communication. However, a brief definition of medium would only capture fragments of the whole conceptual web and could be counterproductive. Instead, I will try to form a model (which actually constitutes a conglomerate of several models) that preserves the term ‘medium’ and still qualifies its use in relation to the different aspects of the conceptual web of mediality. Thus, the concept of medium can be divided into several deeply entangled concepts in order to cover the many interrelated aspects of mediality.

The core of this differentiation consists of setting apart four media modalities that may be helpful for analysing media products. A media product is a single physical entity or phenomenon that enables inter-human communication. Media products can be analysed in terms of four types of traits: material, spatiotemporal, sensorial and semiotic traits. I call these categories of traits media modalities. During the last decades, the notion of multimodality has gained ground (Kress and van Leeuwen 2001; Bateman 2008; Kress 2010; Seizov and Wildfeuer 2017), stemming from social semiotics, education, linguistics and communication studies. Although my notion of media modalities is inspired by this research tradition, it differs significantly in ways that will become evident. Likewise, I am strongly influenced by the research field of intermediality, which has its historical roots in aesthetics, philosophy, semiotics, comparative literature, media studies and interart studies (for details, see Clüver 2007, 2019; Rajewsky 2008). These research traditions have been decisive for how I have come to circumscribe the various aspects of mediality.

As my arguments unfold, I will distinguish among media products, technical media of display and media types (basic media types and qualified media types). Basic and qualified media types are categories of media products, whereas technical media of display are the physical entities needed to realise media products and hence media types. Consequently, the term ‘medium’, when used without specifications, generally refers to all of these media aspects.

Thus, various media aspects are not groups of media. Instead, they are complementary, interwoven, theoretical aspects of what constitutes mediality. Accordingly, the wide concept of medium that I will present in this article comprises several intimately related yet divergent notions that I will distinguish terminologically. I believe that multimodality and intermediality cannot be fully understood without grasping the fundamental conditions of every single media product, and these conditions constitute a complex network of both physical qualities of media and various cognitive and interpretive operations performed by the media perceivers. For my purpose, media definitions that deal only with the physical aspects of mediality are too narrow, as are media definitions that only emphasise the social construction of communication. Instead, I will emphasise the critical meeting of the physical, the perceptual, the cognitive and the social.

2 What Are Media Products and Communicating Minds?

2.1 A Medium-Centred Model of Communication

The starting point of this investigation of media interrelations consists of an examination of the concept of media product, which is the core of all further elaborations in this study. To delineate the concept of a media product properly and thoroughly, it is necessary to have a developed model of human communication that is devised for highlighting precisely the notion of medium (Elleström 2018a, b, c). Although I have designed my model to scrutinise primarily human communication, it is at least partly applicable to communication among other animals as well. It consists of what I take to be the smallest and fewest possible entities of communication and their essential interrelations. If one of these entities or interrelations is removed, communication is no longer at hand; thus, the model is irreducible. I submit that three indispensable and interconnected entities can be discerned:

  1.

    Something being transferred

  2.

    Two separate places between which the transfer occurs

  3.

    An intermediate stage that makes the transfer possible

These three entities of communication have been circumscribed in various ways in established and influential communication models. In the following, I refer to some of these classical models (from linguistics, media and communication studies and cultural studies) to anchor my concepts in well-known communication paradigms and make clear the many ways in which I depart from the standard concepts. Although it is debatable, I have kept the traditional concept of transfer because I think it is part and parcel of the concept of communication. While the term ‘transfer’ may have misleading associations with material things being moved around, one can hardly avoid the deep experiential similarity between sharing and transferring material and mental entities—as in human communication. These issues will be continuously scrutinised in the ensuing discussions.

Roman Jakobson used the term ‘message’ to capture the first entity, ‘something being transferred’, but did not delineate the notion underlying his term (Jakobson 1960). Wilbur Schramm vacillated between two incompatible arguments: that there is no such thing as an entity being transferred, and that the transferred entity is a ‘message’—not ideas or thoughts (Schramm 1971). Stuart Hall was also rather vague when he implied that ‘meaning’ is transferred in communication. Instead of clearly stating that communication is about transferring meaning, he emphasised that ‘meaning structures 1’ and ‘meaning structures 2’ may differ; there are degrees of ‘symmetry’ and degrees of ‘understanding’ and ‘misunderstanding’ (Hall 1980: 131). In other words, if there is transfer of meaning in communication, this involves transformation of meaning. This contention is certainly feasible.

While the second entity, ‘two separate places between which the transfer occurs’, arguably consists of two units, they can only be outlined in relation to each other. Jakobson’s terms were ‘addresser’ and ‘addressee’, but Schramm preferred ‘communicator’ and ‘receiver’. Finally, Hall avoided outlining the two separate places between which the transfer occurs as persons; in fact, he avoided pointing to such places at all. However, his notion that ‘meaning structures’ are to some extent transferred implies that such meaning structures indeed need to be located at places that are capable of holding ‘meaning’—which must be understood as the minds of human beings, given that human communication is at stake.

The third entity, ‘an intermediate stage that makes the transfer possible’, has also been conceptualised differently. Jakobson’s ‘contact’ notably incorporates both a material and a mental aspect; it was described as ‘a physical channel and psychological connection between the addresser and the addressee’ (1960: 353). Schramm used the term message to represent not only the transferred entity, but also the intermediate stage of communication (he seems to understand the message as something that is both ‘transferred’ and ‘transferred through’). Importantly, however, Schramm described the transmitting message not only as a material entity—such as ‘a letter’—but also as ‘a collection of signs’, thus indicating the capacity of the material to produce mental significance through signs (1971: 15). Hall also emphasised the semiotic nature of the intermediate stage of communication. His term for this entity was ‘meaningful’ discourse; however, his terminology is generally rather incoherent, resulting in uncertainty about the more precise nature of the intermediate stage.

Regarding the first entity of communication, ‘something being transferred’, there is certainly a point in Schramm’s notion that no ideas or thoughts are transferred in communication. As Hall indicated, transfer of meaning is likely to entail a change of meaning; this modification may be only slight or more radical. Nevertheless, I claim that communication models cannot do without the notion of something being transferred. If there is no correlation at all between input and output, there is simply no communication, given the foundational idea that to communicate is ‘to share’; thus, a concept of communication without the notion of something being transferred is nonsensical. However problematic it may be, the notion of something being transferred must be retained and painstakingly scrutinised, instead of being avoided. To begin with, I think it is clear that one cannot confine the transferred units or features to distinct and consciously intended conceptions, and perhaps not even to ‘ideas’ as Schramm understands them.

My suggestion is to use the term ‘cognitive import’ to refer to those mental configurations that are the output and input of communication (thus, ‘import’ should not be understood here in contrast to ‘export’). The notion that I want to capture with this term is closely related to notions captured by terms such as ‘meaning’, ‘significance’ and ‘ideas’, although ‘cognitive import’ is perhaps less burdened with connotations that a term such as ‘meaning’ seems to have difficulty shedding. Meaning is often understood as a rather rigid concept of verbal, firm, definable or even logical sense. Instead, cognitive import should be understood as a broad notion that also includes vague, fragmentary, undeveloped, intuitive, ambiguous, non-conceptual and pragmatically oriented meaning that is relevant to a wide range of media types and communicative situations. It is imperative to emphasise that although cognitive import is always a result of mind-work, cognition is embodied and not always possible to articulate using language; hence, according to my proposed model, communication cannot be reduced to simply communication of verbal or verbalisable significance.

The second entity, ‘two separate places between which the transfer occurs’, is usually construed as two persons. However, this straightforward notion is not precise enough for my purposes. Because it is imperative to be able to connect mind and body to different entities of the communication model, it is also essential to avoid crude notions such as that of Jakobson’s addresser–addressee and Schramm’s communicator–receiver. These notions give the impression that the transfer necessarily occurs between two persons consisting of minds and bodies and with a third, separate, intermediate object in the middle, so to speak, an intermediate object in the form of a ‘message’ that is essentially disconnected from the communicating persons. It is better to follow Hall’s implicit idea that communication occurs between sites that are capable of holding ‘meaning’. Warren Weaver’s description of communication as something that occurs between ‘one mind’ and ‘another’ is simple and to the point (Weaver 1998 [1949]).

My suggestion is to use the terms ‘producer’s mind’ and ‘perceiver’s mind’ to refer to the mental places in which cognitive import appears. First, there are certain mental configurations in the producer’s mind, and then, following the communicative transfer, there are mental configurations in the perceiver’s mind that are at least remotely similar to those in the producer’s mind. The term ‘mind’ should generally be understood as denoting (human) consciousness that originates in the brain and is particularly manifested in perception, emotion, thought, reasoning, will, judgment, memory and imagination. The term ‘mental’ refers to everything relating to the mind. The term ‘cognition’ should be understood as representing those mental processes that are involved in gaining knowledge and comprehension, including, among other higher-level functions of the brain, thinking, remembering, problem-solving, planning and judging. However, even though the mind and its cognition are founded on cerebral processes, mental activities are in no way separated from the rest of the body. On the contrary, I subscribe to the idea that the mind is profoundly embodied—formed by experiences of corporeality (Johnson 1987).

Most of the researchers that I refer to here have recognised, either explicitly or implicitly, that the third entity, ‘an intermediate stage that makes the transfer possible’, is in some way material. As stated succinctly in a more recent publication, any act of communication ‘is made possible by some form of concrete reification of the message, which, at its most elementary level, must abide by physical laws to exist and take shape’ (Bolchini and Lu 2013: 398). Furthermore, Schramm and Hall clearly discussed the intermediate stage in terms of signs. In line with this, I suggest that the intermediate entity connecting two minds with each other is always in some way material, understood broadly as consisting of physical entities or phenomena, although it clearly cannot be conceptualised only in terms of materiality. As it connects two minds in terms of a transfer of cognitive import, it must be understood as materiality having the capacity to trigger certain mental responses.

My suggestion is to use the term ‘media product’ to refer to the intermediate stage that enables the transfer of cognitive import from a producer’s to a perceiver’s mind (what Irina O. Rajewsky called ‘medial configuration’ (2010)). As the bodies of these two minds may well be used as instruments for the transfer of cognitive import, they have potential to attain the function of media products. I propose that a media product may be realised by either non-bodily or bodily matter (including matter emanating directly from a body), or a combination of the two. This means that the producer’s mind may, for instance, use either non-bodily matter (say, paper) or her own body and its immediate extensions (moving arms and sound produced by the vocal cords) to realise media products such as printed texts, gestures and speech. Furthermore, the perceiver’s body may be used to accomplish media products; for instance, the producer may realise a painting on the perceiver’s skin or push her gently to communicate the desire that she move a little. Additionally, other bodies, such as the bodies of actors, may be used as media products.

In contrast to influential scholars such as Marshall McLuhan, who conceptualised media as the ‘extensions of man’ in general (McLuhan 1994 [1964]), I define media products as ‘extensions of mind’ in the context of inter-human communication. Thereby, I avoid the classical distinction in communication studies between mediated and interpersonal communication—communication that needs and communication that supposedly does not need mediation. This distinction has been criticised because of practical difficulties in upholding it (see Rice 2017). I avoid it also because of the theoretical and more profound obstacle of thinking about interpersonal communication as not being mediated (it would be absurd to consider interpersonal communication independent of media capacities and media limitations). The only thing that justifies such a distinction is that so-called interpersonal communication is entirely dependent on specific (but not fundamentally different) forms of media products, namely, those that rely on the producer and perceiver’s human bodies and their immediate extensions instead of external devices.

2.2 Media Products

Given that being a media product must be understood as a function rather than an essential property, virtually any material existence can be used as one, including not only solid objects but also all kinds of physical phenomena that can be perceived by the human senses. In addition to those forms of media products that are more commonly categorised as such (like written texts, songs, scientific diagrams, warning cries and road signs), there is an endless row of forms of physical objects, phenomena and actions that can function as media products, given that they are perceived in situations and surroundings that encourage interpretation in terms of communication. These include nudges, blinks, coughs, meals, ceremonies, decorations, clothes, hairstyles and make-up. In addition, dogs, wine bottles and cars of certain makes, sorts and designs may well function as media products to communicate the embracing of certain values or simply wealth, for instance. Within the framework of a trial, surveillance camera footage and spoken word testimony from witnesses both function as media products, as do fingerprints, DNA samples and bloodstains presented by the prosecutor—because they are drawn into a communicative situation.

Because the function of being a media product is initially triggered by the producer’s mind, media products can be said to be produced by the producer’s mind. As I define these concepts, producing a media product does not necessarily mean fabricating it materially. Fingerprints presented in a criminal trial are evidently produced by the prosecutor not in the sense that she materially fabricates them, but in the sense that she gives them a communicative function by placing them in the context of the trial.

It may also be the case that someone uses an ‘old’ media product, produced by someone else, to communicate. For instance, one could play a recorded love song, written and sung by others, to communicate love to someone special on a certain occasion. In this way, the recorded song, which already has the function of a media product, is appropriated, so to speak, and given a more specific and partly new communicative function. Like the fingerprints (disregarding other differences), the recorded love song is not fabricated by the (new) producer’s mind, but rather exposed and given a (new) communicative function.

Given this conceptualisation, it is pointless to try to distinguish between physical existences that are and that are not actual media products. Instead, it is important to have a clear notion of the properties of physical existences that confer the function of media products on them. Clearly, these properties, which I will investigate in the following, are in no way self-evidently present. Perceiving something as a media product is a question of being attentive to certain kinds of phenomena in the world. As humans have been able to communicate with each other for thousands and thousands of years, this attention is partly passed on by heredity, but it is also deeply formed by cultural factors and the experience of navigating within one’s present surroundings. Knowledge of musical performance traditions, for example, leads to specific attention to certain details while others may be ignored; thus, accidental noises and random gestures may be sifted out as irrelevant for the musical communication and not part of the media product. Practical experience of the environment normally makes us pay attention to what happens on the screen of a television set rather than to its backside. However, if the television set is used in an artistic installation, or if a repair person tries to explain why it does not work by way of pointing to certain gadgets, it may be the backside that should be selected for attention in order to achieve the function of a media product.

Thus, media products are cultural entities that depend on social praxis; media products and their basic characteristics are (more or less) delimited units formed by (often shared) selective attention on sensorially perceptible areas of communication that are believed to be relevant for achieving communication in a certain context. This means that there is no such thing as a media product ‘as such’. I argue that not even a written text is a media product in itself; it is only when its function of transferring cognitive import among minds is realised that it can be conceptualised as a media product. The archaeologist who inspects the marks on a bone and believes that they are caused by accidental scraping is not involved in communication. If the archaeologist believes that the marks are some sort of letters in an unknown language, she may be engaged in elementary communication to the extent that she understands a communicative intent. If the marks are eventually deciphered, communication that is more complex may result. If the deciphering actually turns out to be mistaken, the belief that communication occurred is an illusion. Of course, border cases like these could also be exemplified by everyday interaction among people who may or may not be mistaken about the significance of all kinds of movements, glances and sounds.

McLuhan suggestively argued that not only the spoken word, the photograph, comics, the typewriter and television are media, but also money, wheels and axes (1994 [1964]: 24). In relation to that, I argue that whereas nothing is a media product as such, virtually everything can attain the function of a media product. In that sense, money, wheels and axes may also function as media products, although they do not actually do so as regularly as spoken words and photographs.

2.3 Elaborating the Communication Model

I will now display my communication model in the form of a visual diagram (Fig. 1.1) and explain some of its implications. Construing this diagram from left to right, the act of communication starts with certain cognitive import in the producer’s mind. Consciously or unconsciously, the producer forms a media product, which may be taken in by some perceiver. Thus, the media product makes possible a transfer of cognitive import from the producer’s mind to the perceiver’s mind. This is certainly not a transfer in the strong sense that the cognitive import as such passes through the media product (which lacks consciousness), but in the sense that there is, ultimately, cognitive import in the perceiver’s mind that bears some resemblance to the cognitive import in the producer’s mind.

Fig. 1.1 A medium-centred model of communication (Elleström 2018a: 282). The diagram represents the transfer of cognitive import between the producer's mind, the media product and the perceiver's mind, with the act of production placed between the producer's mind and the media product and the act of perception placed between the media product and the perceiver's mind.

The visual diagram contains the three entities of communication circumscribed above:

  1. Something being transferred: cognitive import

  2. Two separate places between which the transfer occurs: producer’s mind and perceiver’s mind

  3. An intermediate stage that makes the transfer possible: media product

Additionally, the visual diagram displays four essential interrelations among these entities:

  1. An act of production ‘between’ the producer’s mind and media product

  2. An act of perception ‘between’ the media product and the perceiver’s mind

  3. Cognitive import ‘inside’ the producer’s mind and the perceiver’s mind

  4. A transfer of cognitive import ‘through’ the media product

I will now elaborate on these interrelations, especially the fourth one. I submit that the notion of media product, and the question of how cognitive import may be transferred through a media product, is essential for understanding communication.

The first interrelation, ‘an act of production “between” the producer’s mind and media product’, is always initiated by the producer’s mind and always, to begin with, effectuated by the producer’s body. Sometimes, this primary bodily act will immediately result in a media product. For instance, when one person begins talking to another person who is standing beside her, the speech emanating from the vocal cords constitutes a media product that reaches the perceiver directly. At other times, the primary bodily act is linked to subsequent stages of production, and the primary bodily act can be connected to a broad range of actions and procedures before a media product comes to be present for a perceiver. For instance, talking through a telephone often requires manual handling of the telephone in addition to the activation of the user’s vocal cords, and always requires constructed, technological devices that are suitable to transmit the initial speech to another place, in which the actual media product is constituted, that is, the speech that can be heard by the perceiver. Similarly, a child drawing a picture for her father who is sitting at the same kitchen table only has to perform, in principle, one primary bodily act in order to create a media product that is immediately available for the perceiver. However, if the father is in another place, additional stages of actions and procedures must be added: the drawing may be posted and physically relocated, or scanned and emailed, after which it appears in a slightly transformed way as a media product that is realised by a computer screen. Thus, the act of production may be simple and direct, as well as complex and indirect. It may also include stages of storage.

There is an abundance of devices for the production and storage of media products. Although involved in mediality, and often called media of production and media of storage, I prefer not to call them media, in order to keep the terminology clear. Thus, cameras are technical devices of production (with the capacity to register light chemically or physically) that can be said to be attached, more or less distantly, to technical devices of display with various properties, such as silver-plated sheet copper, photographic paper or a screen (a computer screen or a display on the camera itself). Book pages are technical devices of storage and technical devices for the display of visual sensory configurations. In contrast, because they quickly disappear, sound waves generated by vocal cords do not store sensory configurations but only display them.

The second interrelation, ‘an act of perception “between” the media product and the perceiver’s mind’, is always initiated by the perceiver’s sense organs and always, to some extent, followed by and entangled with interpretation. Interpretation should be understood as all kinds of mental activities that somehow make sense of the sensory input; these activities may be both conscious and unconscious and are no doubt already present in a basic way when the sense impressions are initially processed. Thus, compared to the potentially extensive act of production, the act of perception is brief and quickly channelled into interpretation, which of course occurs in the perceiver’s mind. Nevertheless, the type, quality and form of sensory input provided by the media product, and actually taken in by the perceiver’s sense organs, are crucial for the interpretation formed by the perceiver’s mind.

For the moment, I will only comment briefly upon the third interrelation among the entities of communication, ‘cognitive import “inside” the producer’s mind and the perceiver’s mind’. One cannot state, without intricate implications, that there is a certain amount of confinable cognitive import inside a mind, and it is undoubtedly difficult to judge the actual extent of similarity between the two amounts of cognitive import in the two minds. Deciding this in a more precise way is probably beyond the reach of known scientific research methods. However, I find the notion that the transferred cognitive import is only one part of the producer’s and the perceiver’s minds unproblematic. The cognitive import is ‘inside’ the minds, in the sense that it is closely interconnected with a multitude of other cognitive entities and processes and, ultimately, with the total sum of mental activities in general that surrounds it.

The fourth interrelation, ‘a transfer of cognitive import “through” the media product’, is central for my arguments. Until now, I have only described the media product simply as the entity of communication that enables a transfer of cognitive import from a producer’s mind to a perceiver’s mind—a material entity that has the capacity of triggering mental response. However, to give a somewhat more detailed account of this notion, the very capacity itself must be scrutinised.

Of course, the transfer of cognitive import is only partly comparable to other transfers, such as the transfer of goods between two cities by train. The cognitive import transfer is not a material transfer but a mental transfer aided by materiality. In one respect it can be compared to teleportation, which is the transfer of energy or matter between two points without traversing the intermediate space: the cognitive import is indeed transferred between two points (two minds), and, contrary to the transfer of goods, it does not traverse the intermediate space. Nevertheless, as the transfer depends on the media product, it is reasonable to say that the cognitive import goes ‘through’ the media product. Actually, the media product is neither a neutral object of material transfer, like a freight car, nor an intermediate space without effect, as in teleportation; it constitutes a crucial stage of transition, not only transmission. As Beate Schirrmacher suggested to me in personal communication, the transfer of cognitive import ‘through’ the media product might alternatively be described as ‘a chain of interactions’ involving producer’s mind, media product, perceiver’s mind and everything in between.

Explaining this in some detail requires attention to the whole spectrum, from the material to the mental. My angle for coping with this challenge is to suggest that all media products can be analysed in terms of four kinds of basic traits. As already noted, I call these categories media modalities (Elleström 2010). I will describe these modalities briefly to prepare the ground for further elaboration of the communication model and then come back to them in a lengthier discussion later in the article.

The first three modalities are the material modality, the spatiotemporal modality and the sensorial modality. Media products are all material in the sense that they may be, for instance, solid or non-solid, or organic or inorganic, and comparable traits like these belong to the material modality. All media products also have spatiotemporal traits, which means that such products that do not have at least either spatial or temporal extension are inconceivable; hence, the spatiotemporal modality consists of comparable media traits such as temporality, stasis and spatiality. Furthermore, media products must reach the mind through at least one sense. Hence, sensory perception is the common denominator of the media traits belonging to the sensorial modality—media products may be visual, auditory and tactile and so forth.

Of course, these kinds of traits are not unknown to communication researchers. For instance, Hall discussed the two sensory channels of television (1980), David K. Berlo highlighted all five external senses (1960), and Schramm at least briefly mentioned that ‘a message has dimensions in time or space’ (1971: 32). However, a thorough understanding of the conditions for communication requires systematic attention to all modalities. It is clear that cognitive import of any sort cannot be freely communicated by any kinds of material, spatiotemporal and sensorial traits. For instance, to use some blatant examples, complex assertions cannot easily be transferred through the sense of smell, and it is more difficult to effectively transfer detailed series of visual events through a static media product than through a temporal media product.

The fourth modality is the semiotic modality. Whereas the semiotic traits of a media product are less palpable than the material, spatiotemporal and sensorial traits, and in fact are entirely derived from them, they are equally essential for realising communication. The sensory configurations of a media product do not transfer any cognitive import until the perceiver’s mind comprehends them as signs. In other words, the perceived sensory configurations are meaningless until one understands them as representing something through unconscious or conscious interpretation. This is to say that all objects and phenomena that act as media products have semiotic traits, by definition. By far the most successful effort to define the basic ways in which to create meaning in terms of signs has been Charles Sanders Peirce’s foundational trichotomy icon, index and symbol.

Understanding this trichotomy requires us to comprehend an even more foundational semiotic trichotomy: the three sign constituents. In brief, Peirce held that signs, often called representamens, stand for objects—this relationship results in interpretants in the perceiver’s mind: ‘A sign, or representamen, is something which stands to somebody for something in some respect or capacity’. This means that the representamen stands for an object in some respects and thus ‘creates in the mind of that person’ an interpretant (Peirce 1932: CP2.228 [c.1897]). This entails that signs are not pre-existing static items, but rather dynamical functions established by relational constituents that exist only in interaction with each other. Signification is a mental process, although both representamens and objects may be connected to external elements or phenomena; however, the interpretant is entirely in the mind. I would argue that my notion of cognitive import created in the perceiver’s mind in communication is a vital example of Peirce’s notion of interpretants resulting from signification.

Hence, a media product can be understood as an assemblage of representamens that, due to their material, spatiotemporal and sensorial traits, together with contextual factors, represent certain objects (that are available to the perceiver), thus creating interpretants (cognitive import) in the perceiver’s mind.

Peirce defined his three central sign types based on some fundamental cognitive abilities that make representamen–object relationships possible. Icons stand for (represent) their objects based on similarity, indices do so based on contiguity, and symbols rely on habits or conventions (1932: CP2.247–249 [c.1903]; Elleström 2014a: 98–113). I take iconicity, indexicality and symbolicity to be the main media traits within the semiotic modality, which is to say that no communication occurs unless cognitive import is created through at least one of the three sign types. Iconicity, indexicality and symbolicity are simply indispensable for semiosis, and they work because of our capacity to perceive similarities and contiguity and to form habits.

I use the term ‘semiosis’ here to denote the widest and least strict notion of sign activity and sign use, where signs are always to be understood as results of interpretation—not inherent qualities of objects or phenomena. ‘Semiosis’ is a catch-all term for everything that involves signs, which may be applied when there is no need for precision. Peirce himself only used the term sporadically, without ever giving it a prominent or particularly specific place in his vocabulary (something close to a definition can be found in 1934: CP5.484 [c.1907]). Briefly, I take signification to be the process of meaning creation. While signification is always a mental process, it may also include material aspects; for instance, the mind may perceive physical qualities through media products. Representation should be understood more specifically as representamens triggering the presence of objects in the mind; thus, representation is a core part of signification.

Again, processes of signification are not unknown in communication research. Among the scholars quoted in this article, Schramm clearly related to some basic semiotic features. For instance, he accurately noted that ‘it is just as meaningful to say that B [the receiver] acts on the signs [the message], as that they act on B’ (1971: 22). Indeed, the mind of the perceiver is very active in construing the signs of the media product. In addition, Hall spoke in terms of semiotics, albeit with a distinct linguistic bias. Peirce’s semiotic framework is fruitful because it incorporates sign types that work far outside of the linguistic domain, dominated by symbolicity in the form of verbal language.

Furthermore, I wish to emphasise the notion that a semiotic perspective must be combined with a material perspective. Communication is equally dependent on the material, spatiotemporal, sensorial and semiotic modalities. What one takes to be represented objects called forth by representamens (objects such as persons, things, events, actions, feelings, ideas, desires, conditions and narratives) are results of both the basic features of the media product as such (the mediated material, spatiotemporal and sensorial traits) and of cognitive activity, connected to surrounding factors, resulting in representation. While signification is ultimately about mind-work, in the case of communication this mind-work is fundamentally dependent on the physical appearance of the media product, although some representation is clearly more closely tied to the appearance of the medium, whereas other representation is more a result of interpretation, and hence of the context of the perceiving mind.

As with material, spatiotemporal and sensorial traits, the semiotic traits of a media product offer certain possibilities and set some restrictions. Obviously, cognitive import of any sort cannot be freely created based on just any sign type. For instance, the iconic signs of music can represent complex feelings and motional structures that are largely inaccessible to the symbolic signs of written text; conversely, written symbolic signs can represent arguments, and the appearance of visual objects, with much greater accuracy than auditory icons. Flagrant examples like these are only the tip of the iceberg in terms of the (in)capacities of signs based on similarity, contiguity and habits or conventions, respectively. Therefore, the semiotic traits of the medium make possible—but also delimit—the communicative transfer of cognitive import through a media product.

In line with this proposal, it is appropriate to bring the notion of noise into the discussion. Many researchers engaged in communication of meaning have picked up Claude E. Shannon’s (1948) idea that signal disturbances in communication can be conceptualised as noise. The basic phenomenon of disruptions that occur on the way from the producer’s mind to the perceiver’s is clearly relevant to the transfer of cognitive import. For instance, Schramm noted that noise is ‘anything in the channel other than what the communicator puts there’ (1955: 138). As an example, speech can be disturbed by other sounds, and a motion picture can be disrupted because of material decay or censorship. Noise in this sense occurs both in the act of production and in the act of perception. My visual model of communication (Fig. 1.1) shows this noise as disruptions in the arrow representing transfer of cognitive import—both before and after the transfer through the media product—reflecting the unsatisfactory conditions of production and perception.

The problem with the notion of noise when applied to communication of meaning, or cognitive import, is that it might imply that the complete absence of noise would bring about complete transfer of cognitive import, as it does in the technical transmission of computable data. This is clearly not the case. The technological notion of noise is simply not sufficient to understand communication of cognitive import. According to Hall, ‘distortions’ or ‘misunderstandings’ are also due to, among other things, ‘the asymmetry between the codes of “source” and “receiver” at the moment of transformation into and out of the discursive form’ (1980: 131).

This contention is definitely a step in the right direction in terms of offering a more complex notion of possible disruptions in the communication of cognitive import. However, it does not provide a more complete view of restraining factors in the transfer of cognitive import. It is also important to emphasise that creators of media products generally do not have access to, or do not master, more than a few media types. Consequently, they are often unable to form media products that have the capacity to create cognitive import in the perceiver’s mind that is similar to the cognitive import in their own mind. Therefore, I argue that important restraining factors of communication are found in the material, spatiotemporal, sensorial and semiotic traits of the media products.

Many exceedingly complex factors are clearly involved when the perceiver’s mind forms cognitive import. Furthermore, as Mary Simonson has accurately noted, media products are sometimes ‘envisioned and created precisely so that they will likely not transmit meanings and ideas in a straightforward way’ (2020: 4). My proposed model highlights one particular cluster of crucial factors: media products have partly similar and partly dissimilar material, spatiotemporal, sensorial and even semiotic traits, and the combination of traits to a large extent—although certainly not completely—determines what kinds of cognitive import can be transferred from the producer’s mind to the perceiver’s mind. Songs, emails, photographs, gestures, films and advertisements differ in various ways concerning their material, spatiotemporal, sensorial and semiotic traits and hence can only transfer the same sort of cognitive import to a limited extent. Figure 1.1 shows this communicative restriction as disruptions in the arrow representing transfer of cognitive import as it passes through the media product.

2.4 Communicating Minds

Outlining only the fewest possible entities of communication and their essential interrelations, my suggested model of communication (Fig. 1.1) is irreducible but certainly expandable. I have already fleshed it out by suggesting various ways of conceptualising the notion of media product in some detail. I will now also sketch a more multifaceted comprehension of communicating minds: the minds of the producer and the perceiver and their interrelations.

The minimal level of complexity consists of simply one mind producing a single media product of which another perceiving mind makes sense. This, I believe, is the core of human communication. In actual communicative situations, however, the perceiver’s mind is often also a producer’s mind. Based on the cognitive import generated by an initial media product, the perceiver becomes a producer in terms of creating another media product (of the same or another kind) that reaches an additional perceiver’s mind, thereby forming new cognitive import that is more or less similar to that in earlier producers’ minds. Hence, a communicative chain is formed. When the communicative chain involves the initial producer and perceiver constantly changing roles and forming new media products (of the same or another kind), we have two-way communication. The creation of new media products in two-way communication is often conceptualised as feedback that may result in the creation of cognitive import that is either only slightly or significantly developed. Communicative chains that are uni- and bidirectional may be combined in a multitude of ways.

Furthermore, media products are often produced or perceived by several minds. For instance, a motion picture is normally both produced and perceived by more than one mind. While the minds of scriptwriters, directors, actors and many others combine to create the motion picture, the audience consists of a multitude of perceiving minds. In contrast, a plenary talk is, as a rule, produced by one mind but perceived by many. An unsuccessful theatre performance may be produced by many minds but perceived (from an off-stage position) by only one.

Another level of complexity consists of the case when perceivers take in their own media product. Although I would not say that pure thinking is communication (as suggested by Berlo [1960: 31]), perception of one’s own media product created earlier may mean that the mind tries to construe cognitive import on the basis of the media product rather than on the memory of what one had in mind on the occasion of production. In this case, a transfer of cognitive import actually occurs through a media product from one mind to another, in the sense that the mind, when perceiving the media product, is in a different state than it is during production. The effort of writing a scholarly text is a good example of this sort of internal communication: communication sometimes fails when one cannot understand the words one has written just the day before.

Of course, one can also combine this level of complexity with others, as in the case of interactive video games. Such games are normally constructed and designed by several minds, but the point here is that the actual media products (the many realised sensory configurations that are mediated by screens and sounding loudspeakers each time the game is being played) are also created by the players. Accordingly, we have a kind of communication involving several producing minds that have created certain frames for interaction and resulting consequences (when designing the game), one or several producing minds that create the actual media product in their interaction with the evolving media product (when playing the game) and one or several perceiving minds that are actually the same as those minds that interact with and hence produce the media product: the specific realisation of the possibilities of the video game. Naturally, additional minds that are not co-producers (i.e. an audience) may also perceive this media product.

The notion of the producer’s mind and perceiver’s mind may well be simple, but it is certainly not reductive. On the contrary, it offers a solid basis for analysing all kinds of communicative complexities. While the examples above do not exhaust the intricacies, they may hint at the many complicated ways in which producers’ and perceivers’ minds may be positioned in various communicative circumstances.

In addition to developing the basic notion of transfer of cognitive import between two separate minds, I will now also elaborate on the notion of cognitive import in the producer’s and especially the perceiver’s mind. As the irreducible model of communication only states that cognitive import is transferred between minds, it is appropriate to suggest not only a way of understanding how it is formed by basic media traits (which was done in the section on media modalities), but also a way of comprehending how it is moulded by surrounding factors. In addition to its innate basic capacity to perceive and interpret mediated qualities, the mind is inclined to form cognitive import based on acquired knowledge, experiences, beliefs, expectations, preferences and values: preconceptions that are largely shaped by culture, society, geography, history and various communities in the mind’s surroundings. This concept is immensely important for the outcome of communication. The perceiver’s mind acts upon the perceived media product on the basis of both its hardwired cognitive capacities and its attained predispositions. Evidently, the cognitive import that was stored in the mind before the media product was perceived has a significant effect, to varying degrees, on the new cognitive import formed by communication.

This widely recognised fact has been extensively theorised in various ways. Jakobson discussed it in terms of ‘a context [that is] seizable by the addressee, and either verbal or capable of being verbalized’ (1960: 353). While context is important for all kinds of communication, I think it is a mistake—even for a restricted focus on verbal communication—to say that the context must be verbalisable in order for it to be relevant. Hall distinctly emphasised the ‘social relations of the communication process as a whole’ and the ‘frameworks of knowledge’ (1980: 129–130) and discussed them in detail. The research area of hermeneutics has minutely scrutinised these and other issues that are central to the formation of meaning in a broad context.

Here I will only suggest a complementary semiotic way of circumscribing how surrounding factors form cognitive import in communication. Although the focus is on the perceiver’s mind, the suggested basic principles are also relevant for the formation of cognitive import in the producer’s mind.

I have already established that the representamens that initiate semiosis in communication come from sensory perception of media products. One perceives configurations of sound, vision, touch and so forth that are created or brought out by someone and understood to signify something; they make objects (in the Peircean sense) present to the perceiver’s mind and result in interpretants based on the representamen–object relation. These interpretants, and interpretants resulting from further chains of semiosis, constitute the cognitive import being transferred in communication. The objects emerge from earlier perceptions, sensations and notions that are stored in the perceiver’s mind, either in long-term or short-term memory that may also cover ongoing communication. ‘Earlier’ could be a century before or a fraction of a second before.

In semiotic terms, the stored mental entities may be direct perceptions from outside of communication, interpretants from semiosis outside of communication, interpretants from semiosis in earlier communication or interpretants from semiosis in ongoing communication. This is to say that objects of semiosis always require ‘collateral experience’ (Peirce 1958: CP8.177–185 [1909]; cf. Bergman 2009) that may derive both from within and without ongoing communication. In other words, collateral experience may be formed by semiosis inside the spatiotemporal frame of the communicative act or stem from other earlier involvements with the world, including former communication as well as direct experience of the surrounding existence.

In line with this twofold origin of collateral experience, I distinguish between two utterly entwined but dissimilar areas in the mind of the perceiver of media products: the intracommunicational and the extracommunicational domains. This distinction emphasises a difference between the formation of cognitive import in ongoing communication and what precedes and surrounds it (related but divergent distinctions in cognitive psychology have been proposed by Brewer [1987: 187]). I also find it appropriate to make a corresponding distinction between intracommunicational and extracommunicational objects, both of which are formed by collateral experience from their respective domains.

The extracommunicational domain should be understood as the background area in the mind of the perceiver of media products. It comprises everything with which the perceiver is already familiar. As it is a mental domain, it does not consist of the world as such but rather of what the perceiver believes and knows through perception and semiosis. The perceiver’s stored experiences not only consist of raw perceptions, such as foundational sensations of being a body that physically interacts with a spatiotemporal surrounding, but also of perceptions that have been contemplated and processed by the mind through semiosis. This involves estimations and evaluations of encounters with people, societies and cultures that are consciously or unconsciously accepted, put in doubt or rejected. It involves shared experiences and ideas, cultural norms and common beliefs, but also more individual understandings, impressions and values—all of which are well known to be crucial factors for the outcome of communication.

The extracommunicational domain includes experiences of what one presumes to be more objective states of affairs (dogs, universities, music and statistical relations), what one presumes to be more subjective states of affairs (states of mind related to individual experiences) and everything in between. Thus, it is actually formed in one’s mind not only through semiosis and immediate external perception but also through interoception, proprioception and mental introspection. Hence, the extracommunicational/intracommunicational domain distinction is different from exterior/interior to the mind, world/individual, material/mental and objective/subjective.

Vital parts of the extracommunicational domain are constituted by perception and interpretation of media products. Therefore, former communication is very much part of what precedes and surrounds ongoing communication. Together, non-communicative and communicative prior experiences form ‘a horizon of possibilities’, to borrow an expression from Marie-Laure Ryan (1984: 127). The extracommunicational domain is the reservoir from which entities are selected to form new constellations of objects in the intracommunicational domain.

In contrast to the extracommunicational domain, the intracommunicational domain is the foreground area in the mind of the perceiver of media products. It is formed by one’s perception and interpretation of the media products that are present in the ongoing act of communication. It is based on both extracommunicational objects, emanating from the extracommunicational domain, and intracommunicational objects, arising in the intracommunicational domain, that together result in interpretants making up a salient cognitive import in the perceiver’s mind. However, the intracommunicational domain is largely mapped upon the extracommunicational domain. Rehashing Ryan’s ‘principle of minimal departure’ (1980: 406), I argue that one construes the intracommunicational domain as being the closest possible to the extracommunicational domain and allows for deviations only when they cannot be avoided. In other words, familiar ideas and experiences are not questioned until it is necessary to do so.

As the intracommunicational domain is formed by communicative semiosis, it can be called a virtual sphere. The virtual should not be understood in opposition to the actual, but as something that has the potential to have real connections to the extracommunicational—to be truthful (Elleström 2018b). Therefore, I define the virtual as a mental sphere, created by communicative semiosis and consisting of cognitive import formed by represented objects.

A virtual sphere can consist of anything from a brief thought triggered by a few spoken words, a gesture or a quick glance at an advertisement, to a scientific theory or a complex narrative formed by hours of reading books or watching television (Elleström 2019). Ultimately, everything that is possible to think may be part of a virtual sphere.

Depending on the degree of attention to the media products, the borders of a virtual sphere do not necessarily have to be clearly defined. As communication is rarely flawless, a virtual sphere may be exceedingly incomplete or even fragmentary. It may also comprise what one apprehends as clashing ideas or inconsistent notions. As virtual spheres result from communication, they are, by definition, shareable among minds to some extent.

The coexistence of intracommunicational and extracommunicational objects results in a possible double view on virtual spheres. From one point of view, they form self-ruled spheres with a certain degree of experienced autonomy; from another point of view, they are always exceedingly dependent on the extracommunicational domain. The crucial point is that intracommunicational objects cannot be created ex nihilo; they are completely derived from extracommunicational objects. This is because one cannot grasp anything in communication without the resource of extracommunicational objects. Even the most fanciful narratives require recognisable objects in order to make sense (cf. Bergman 2009: 261). To be more precise: intracommunicational objects are always in some way parts, combinations or blends of extracommunicational objects. To be even more exact, intracommunicational objects are parts, combinations or blends of interpretants resulting from representation of extracommunicational objects.

It is possible to represent, say, griffins (which, to the best of our knowledge, exist only in virtual spheres) because of one’s acquaintance with extracommunicational material objects such as lions and eagles that one can easily combine. A virtual sphere may even include notions such as a round square, consisting of two mutually exclusive extracommunicational objects that together form an odd intracommunicational object. Literary characters such as Lily Briscoe in Virginia Woolf’s novel To the Lighthouse are composite intracommunicational objects consisting of extracommunicational material and mental objects that stem from the world as one knows it. You cannot imagine Lily Briscoe unless you are familiar with notions such as walking, talking and eating; what it means to refer to persons with certain names; what women and men, adults and children are; what it means to love and to be bored; and what artistic creation is. In addition, more purely mental extracommunicational objects can be modified or united into new mental intracommunicational objects. Objects such as familiar emotions can be combined into novel intracommunicational objects consisting of, say, conflicts between or blends of emotions that one perceives as unique although one is already acquainted with the components.

The question then arises: if all intracommunicational objects are ultimately derived from extracommunicational objects, why do we often experience virtual spheres as having a certain degree of autonomy? This is because we may perceive them, in part or in whole, as new gestalts that disrupt the connection to the extracommunicational domain. This happens when we do not immediately recognise the new composites of extracommunicational objects. The reason why they are not being re-cognised is that they have not earlier been cognised in the particular constellation in which they appear in the virtual sphere. Several such disruptions lead to greater perceived intracommunicational domain autonomy. Even though intracommunicational objects are entirely dependent on extracommunicational objects, they can be said to emerge within the intracommunicational domain.

Having described the interrelations between the intracommunicational and the extracommunicational domains in some detail, I will now present an overview with the aid of a visual diagram (Fig. 1.2). Whereas the intracommunicational domain simply consists of one virtual sphere, the extracommunicational domain consists of two rather different elements: on the one hand, other virtual spheres, and, on the other hand, what I propose to call the perceived actual sphere. This means that, from the point of view of a virtual sphere, there are three more or less distinct spheres: the virtual sphere itself, other virtual spheres and the perceived actual sphere.

Fig. 1.2 Virtual sphere, other virtual spheres and perceived actual sphere (Elleström 2018b: 432)

[Diagram: two concentric circles. The inner circle is the intracommunicational domain, containing the virtual sphere. The surrounding outer circle is the extracommunicational domain, containing the perceived actual sphere and other virtual spheres. Together they form the world as one knows it.]

The perceived actual sphere consists of extracommunicational, immediate and presented material and mental objects beyond the realm of communication that the perceiving mind is acquainted with. ‘Perceived’ shall be understood in a broad sense to include exteroception, interoception and proprioception, joined by mental introspection and semiosis based on perception of the actual sphere. ‘Immediate and presented’ shall be understood in contrast to communication: the perceived actual sphere does not consist of mediated representations formed by media products brought out by minds and their extensions, but is immediately present to us. Note that immediately present does not mean that the perceived actual sphere is independent of the mediating mental mechanisms that connect sensation to perception or the complicated mediating functions that connect perception to the external world.

The other virtual spheres consist of extracommunicational, already mediated and represented material and mental objects that the perceiving mind is acquainted with. As virtual spheres are thoroughly semiotic, these objects are always made out of former interpretants. The other virtual spheres result from communication and comprise mediated representations formed by media products brought out by minds and their extensions.

Hence, the virtual sphere consists of extracommunicational, immediate and presented material and mental objects from the perceived actual sphere + extracommunicational, already mediated and represented material and mental objects from other virtual spheres + intracommunicational, mediated and represented material and mental objects that emerge within the virtual sphere.

Together, the intra- and extracommunicational domains constitute the world as one knows it, which corresponds to what Siegfried J. Schmidt called actuality, ‘our world of experience’; ‘we have to postulate a strict separation between reality, which is cognitively inaccessible but has to be presupposed as existing at least for logical reasons, and actuality, which is constructed by the real brain’ (Schmidt 1994: 499). Hence, everything outside of these domains—the unknown—corresponds to what Schmidt referred to as the cognitively inaccessible reality.

Like all schematic representations, this model is intended to provide an overview of an intricate state of affairs. Nevertheless, it not only points to mental areas that are fundamentally different in certain respects, but also reveals their complex interrelations. Thus, one must emphasise that every virtual sphere, from the point of view of that sphere, is intracommunicational, and is therefore composed of objects that are derived from itself (to the extent that parts, combinations and blends of extracommunicational objects may be understood as distinct), as well as from other virtual spheres and the perceived actual sphere. This comprises a mise-en-abyme: intracommunicational virtual spheres are formed by perceived actual spheres and by other extracommunicational virtual spheres that are, in turn, formed by perceived actual spheres and by other extracommunicational virtual spheres, ad infinitum.

Adding the diagram in Fig. 1.2 to the diagram in Fig. 1.1 might give a sense of how one may expand the irreducible model of communication in terms of a more complex understanding of the transferred cognitive import. In brief, the totality of the intracommunicational and extracommunicational domains in Fig. 1.2 (the outer circle) is equivalent to the perceiver’s whole mind in Fig. 1.1 (the outer circle). The intracommunicational domain, comprising the virtual sphere in Fig. 1.2 (the inner circle), consists of the cognitive import in the perceiver’s mind according to Fig. 1.1 (the inner circle). This virtual sphere is not only formed by the perception and interpretation of the specific traits of the media products that are present in the ongoing act of communication, as emphasised in Fig. 1.1. It is simultaneously based on a combination of extracommunicational and intracommunicational objects that, together, result in interpretants making up salient cognitive import in the perceiver’s mind, as demonstrated in Fig. 1.2. In other words, the cognitive import in the perceiver’s mind, bringing about a virtual sphere, is formed by both ongoing experience of the particular traits of the media product and the general collateral experiences of all sorts in the perceiver’s mind.

The extent to which cognitive import may be shared between the producer’s and the perceiver’s minds is undoubtedly partly determined by how much the extracommunicational domain of the perceiver’s mind overlaps with the extracommunicational domain of the producer’s mind (understood as the background area in the mind of the producer of media products). This conclusion corresponds well with established views on the importance of shared experiences and knowledge for successful communication.

3 What Is a Technical Medium of Display?

3.1 Media Products and Technical Media of Display

At this stage of the account, it is necessary to introduce a delicate but sometimes vital distinction between media products and technical media of display. I have stated that media products are physical entities or processes that are necessary for communication because they interconnect minds. More precisely, I should also emphasise that being a media product is a function that requires some sort of perceptible physical phenomenon to come into existence. I call these physical items or phenomena technical media of display (cf. Jürgen E. Müller’s distinction between ‘technical conditions’ and ‘media products’ [1996: 23]).

I choose the term ‘technical’ to attach to one of the meanings of the Greek word téchnē: practical skill and the methods employed in producing something. Accordingly, technical media of display should be understood as entities that realise media products; they distribute sensory configurations with a communicative function. Terms such as ‘technical media of distribution’, ‘dissemination’ or ‘presentation’ would all be accurate. The cumbersome term ‘technical media of display of sensory configurations’ is perhaps the most precise one for my purpose.

I define a technical medium of display as any object, physical phenomenon or body that mediates sensory configurations in the context of communication; it realises and displays the entities that we construe as media products. Technical media of display are those perceptible physical items and processes that, when used in a communicative context, acquire the function of media products. Strictly speaking, this means that when the same physical items and processes are not used in a communicative context, they are not technical media of display.

My definition of the notion of technical medium of display is narrower than that of ‘physical media’ circumscribed, for instance, by Claus Clüver (2007: 30). Devices used for the realisation of media products, but not tools used only for the production or storage of media products, are technical media of display. The brush and the typewriter are tools for production that are normally separated from the material manifestations of media products and are, as such, not normally technical media of display according to my definition, although they count as physical media in Clüver’s sense (2007). For the same reason, a computer hard disk—a device for storage—is not routinely a technical medium in the sense that I emphasise here. The video camera is partly a tool for production and partly a device for the realisation of media products (if it includes a screen for film display), so it can be habitually seen as a technical medium of display. A guitar, which can produce and realise musical sound simultaneously, also often works as a technical medium of display if one considers its immediate extensions in the form of sound waves. Some physical existences, such as ink on paper, may both store and display sensory configurations and thus work as technical media if present in communicative situations. Such pieces of paper can mediate sensory configurations that we understand to be, say, written words, whereas a pen, which can only produce and not display written words, is not, in its role as producer of writing, a technical medium of display.

Technical media of display clearly exist in diverse forms. I have already suggested that media products can be realised by either bodily or non-bodily matter. From the perspective of the producer’s mind, being situated in a human body, this means that there are external technical media (extra-bodily materialities such as clay, screens, ink on paper, sound waves from loudspeakers or just about anything chosen from the surroundings, including other bodies) and there are internal technical media (the producer’s body in its entirety, parts of it or physical phenomena emanating directly from it, such as a voice). All forms of external and internal technical media of display can be combined with each other in countless ways.

Regarding external technical media of display, any perceptible physicality can be used in the function of a media product. A stone and a tree branch lying on the ground are only a stone and a branch. However, if someone picks them up and uses them to intimidate somebody else (to communicate threat) or to manufacture sculptures (to communicate something aesthetic), they become technical media of display—physical entities with a communicative function, the function of being media products.

Harold A. Innis (1950) emphasised the importance of technical media such as stone, clay, papyrus and paper for the historical development of communication—more specifically writing—and society at large. More modern technical media of display include electronic screens and sound waves produced by loudspeakers. Thus, very different kinds of physical entities may act as external technical media of display and realise media products. They may simply be at hand in the environment of the producer’s mind and body (like directing a waiter’s attention to an empty glass to communicate the desire to be given a new drink) or they may be more or less crafted with a communicative purpose (like using a piece of paper to display the words ‘one more beer, please’). They may also be internal and consist of corporeal actions and immediate extensions of the body (like a movement of hand and arm imitating the act of drinking or a voice saying ‘one more beer, please’).

These examples do not in any way exhaust the many possible modes of existence for technical media. For instance, one may note that items that are manufactured for producing media products, not displaying them, may actually be used as technical media of display in certain circumstances. A pen, which is not a technical medium of display in its role as a producer of writing, may become a technical medium of display if, say, it is placed in a shop window in order to indexically communicate the notion that pens are for sale in the shop.

The distinction between a media product and the technical medium of display is clearly theoretical rather than a distinction between two different kinds of material entities. On the contrary, the physical technical medium is a prerequisite for the existence of a media product, and, in a communicative situation, the perceiver identifies only one level of presence: the perceived sensory configurations emanating from some physical existence. However, the distinction is needed in order to demonstrate the difference—and mutual interdependence—between, for example, what one construes as a piece of music (a media product) and the sound waves emanating from a music audio system (a technical medium of display). Confronted with the famous question in William Butler Yeats’s poem ‘Among School Children’—‘How can we know the dancer from the dance?’—the distinction allows us to give two different but fully compatible answers. On one hand, the dancer and the dance are inseparable in the sense that they are the same material entity occupying physical space and time. On the other hand, they are two different things. Whereas the dancer is a body acting as a technical medium of display, the dance is a function of the material body—a media product.

Although this distinction is sometimes hard to grasp, it often aligns well with everyday parlance and thinking. Allow me to illustrate this further. Some technical media of display, such as audio systems, are well fitted to be reused many times. This is also the case for a technical medium such as a television set (which actually consists of two kinds of technical media of display: a screen that emits photons and loudspeakers that set the air into pulsation) that may realise several different media products (many television programmes). A communicating human body may be conceptualised in a similar fashion. When moved in certain ways and in certain circumstances, the body mediates certain sensory configurations and realises what one understands as gestures (media products). As long as the memory of these gestures is kept in the producer’s mind, similar gestures can be performed by the same technical medium of display—the body—thus creating a large number of equivalent media products. Of course, the same body may also be used for realising a multitude of different media products. Conversely, many types of technical media of display can realise a media product such as a television programme; not only television sets but also, for instance, laptop computers, which also consist of a screen and loudspeakers.

On the other hand, because of their physical qualities, some technical media of display tend to be used only once or a few times. A marble block being cut to a certain form mediates certain sensory configurations and realises a sculpture and can usually be reused only a limited number of times. As the block not only displays but also stores the sculpture, the reuse of the technical medium of display implies the destruction of the initial media product.

However, common language does not always provide words to properly describe the distinction between technical media of display and media products. This is because of the boundless and dynamic nature of human communication: for reasons of mental economy, only the most common and salient media products are categorised and given names. An example can be given through the communicative acts performed by the thirsty person discussed above. The movement of the person’s hand and arm is used as a technical medium with which to realise what is commonly known as a gesture, a kind of media product. The paper is used as a technical medium for realising a media product that may be called, for instance, a written note. The raised empty glass, however, resists being described in ordinary language; one may say that ‘glass’ or ‘a glass’ is used as a technical medium, but what kind of a media product does it realise? This is not clear. Nevertheless, the media product is there, whether there is a proper term to denote it or not.

All these observations call for some discussion regarding duplication of media products. According to my definition, the concept of media product implies that every single display through a technical medium constitutes a specific media product. This display may last for a very short time (a cry of warning, for instance), for a very long time (such as a rock painting) or anything in between. In any case, the display of such media products can be repeated in various ways. Several cries of warning can be heard, several rock paintings can be seen, and some of these are very similar. In some cases, the similarity between media products is so detailed that it is more than reasonable to think that they are ‘the same’. When I watch the movie Fantasia, I believe that it is the same movie that I saw some years ago, having the same title and being identical in virtually all details, although it was then displayed on the screen in a movie theatre and not on the screen of my television set.

However, the two realisations of Fantasia are not the same media product. On a theoretical level, it is important to be able to acknowledge that every display of a media product is unique, even though several media products may be extremely similar indeed—like the thousands of copies of operating instructions for a certain kind of toaster. On a pragmatic level, however, it is efficient to operate with the notion of sameness. Life outside the domain of scholarly writing would become very difficult to handle if we did not recognise that different people, at different times, located at different places, may actually watch ‘the same television programme’, such as a specific episode of Monty Python’s Flying Circus. However, people actually perceive different media products that are virtually indistinguishable but slightly different when it comes to qualities such as the size and resolution of the moving images and the quality of the sound—differences that may or may not affect how cognitive import is construed.

Under theoretical pressure, the ‘sameness’ of different actual displays becomes diffuse and problematic. Are my toaster operating instructions, covered in coffee stains and almost illegible, the same media product as your unblemished copy? As they can hardly communicate the same cognitive import (understanding how to handle the toaster), I would say not. If I argued that two unstained copies of the operating instructions are the same, the obscure question arises: how many stains or torn pages are required to render them different? In the end, the question of sameness becomes a somewhat metaphysical question. Therefore, strictly speaking, different media products may only be the same in the respect that they are very similar. Although different media products are never ontologically the same, they may be thought of as being ‘the same’ in many other important respects. One could perhaps say that very similar media products are variations of an abstract but recognisable communicational composition that may be reproduced more or less efficiently.

3.2 Mediation and Representation

As postulated above, media products are the entities through which cognitive import is transferred among minds in communication. Such products require technical media of display in order to be realised. Different forms of technical media of display have different capacities to mediate sensory configurations and make them present to the perceivers’ minds, which has consequences for the outcome of communication. The perception of media products is also deeply entangled with cognitive operations, resulting from the encounter with the sensory configurations. These perceptual and cognitive functions can be broadly described as interpretation, and more specifically analysed in terms of signification.

As this complex process of transfer of cognitive import from a producer’s mind to a perceiver’s mind involves both material and mental aspects, I find it helpful to distinguish between two profoundly interrelated but nevertheless discernible basic facets of the communicative process: mediation and representation. Mediation is the display of sensory configurations by the technical medium (and hence also by the media product) that are perceived by human sense receptors in a communicative situation. It is a presemiotic phenomenon that should be understood as the physical realisation of entities with material, spatiotemporal and sensorial qualities—and semiotic potential. For instance, one may hear a sound. Representation is a semiotic phenomenon that should be understood as the core of signification, which I delimit to how humans create cognitive import in communication. When a perceiver’s mind forms sense of the mediated sensory configurations, sign functions are activated and representation is at work. For instance, the heard sound may be interpreted as a voice uttering meaningful words.

To say that a media product represents something is to say that it triggers a certain type of interpretation. This interpretation may be more or less hardwired in the media product and the manner in which a person perceives it with her or his senses, but it never exists independently of the cognitive activity in the perceiver’s mind. When something represents, it calls forth something else; the representing entity makes something else—the represented—present in the mind. In terms of Charles Sanders Peirce’s foundational notions, this means that a sign or representamen stands for an object. Peirce’s third sign constituent, the interpretant, can be understood as the mental result of the representamen–object relation (see, for instance, 1932: CP2.228 [c. 1897]). As stated earlier, one may further understand my notion of cognitive import created in the perceiver’s mind in communication as an example of Peirce’s notion of interpretant—and of course, the concept of interpretation has everything to do with the semiotic idea of interpretants in signification.

Representation, the very essence of semiosis, occurs constantly in our minds when we think without having to be prompted by sensory perceptions. However, it is also triggered by external stimuli; in this context, focusing on external stimuli resulting from mediation is appropriate. Thus, although representation also occurs in pure thinking and in the perception of things and phenomena that are not part of mediation, I delimit the account of representation to the creation of cognitive import based on mediated sensory configurations—stimuli picked up by our sense receptors in communicative situations. My contention is that all media products represent in various ways as soon as sense is attributed to them; or, in other words, when they are attributed sense, they become media products. Hence, one can understand media products as assemblages of representamens that, due to their mediated material, spatiotemporal and sensorial traits—and because of collateral experience in the intra- and extracommunicational domains—represent certain objects, thus creating interpretants (cognitive import) in the perceiver’s mind. It is through representation, and more broadly signification, that virtual spheres are created in the perceiver’s mind. Hence, according to my terminology, the idea of non-representative media products is self-contradictory.

My current emphasis is on the notion that basic encounters with media have both a presemiotic and a semiotic side. Whereas the concept of mediation highlights the material realisation of the media product, made possible by a technical medium of display, the concept of representation highlights the semiotic conception of the medium. Although mediation and representation are clearly entangled in complex ways, it is vital to uphold a theoretical distinction between them. This theoretical distinction is helpful in analysing complex communicative relations and processes. In practice, however, mediation and representation are deeply interrelated. Every representation is based on the distinctiveness of a specific mediation. Furthermore, some types of mediation facilitate certain types of representation and render other types of representation impossible; different kinds of mediation have different kinds of semiotic potential. As an obvious example, vibrating air emerging from the vocal cords and lips that is perceived as sound but not words is well suited for the iconic representation of bird song, whereas such sounds cannot possibly form a detailed, three-dimensional iconic representation of a cathedral. However, distinctive differences among mediations are frequently more subtle and less easily spotted without close and systematic examination.

4 What Are Media Modalities, Modality Modes and Multimodality?

4.1 Multimodality and Intermediality

To facilitate such systematic examination of mediality, I will now expand on what I have already introduced as the four modalities of media. This requires a brief discussion of the two research fields of multimodality and intermediality. Although they focus on similar issues, cross-references between these two interrelated research fields are rare. Nevertheless, Mikko Lehtonen combined the notions of intermediality and multimodality two decades ago when, in an article in a journal of media and communication studies, he accurately stated, ‘multimodality always characterises one medium at a time. Intermediality, again, is about the relationships between multimodal media’ (Lehtonen 2001: 75; cf. also the rewarding discussions in Fornäs 2002). Although Lehtonen used the concepts in different and not very developed ways, compared to the framework that I have sought to elaborate here, I subscribe to the basic idea that intermediality is about the relationship between media having a multitude of vital traits, or modes.

Nevertheless, it is not evident how this notion should be operationalised. The term ‘medium’ simply means ‘middle’, ‘interspace’ and so forth, and the term can justifiably be used in an abundance of different ways. The term ‘modality’ is related to ‘mode’, and these terms are also, for good reason, widely employed in different fields. A ‘mode’ is a way to be or to do things. Just like ‘medium’, the term ‘mode’ can be, has been and should be used to stand for different notions in diverse contexts. Therefore, certain ways of using terms such as ‘modality’ and ‘mode’ need not compete or be in conflict with very different ways of using them. However, in trying to form a terminologically and conceptually coherent research branch, it is essential to interrelate terms as well as concepts in lucid ways.

In the context of media studies and linguistics, ‘multimodality’ sometimes refers to the combination of, say, text, image and sound, and sometimes to the combination of sense faculties (the auditory, the visual, the tactile and so forth). Thus, multimodality has been defined as ‘the use of two or more of the five senses for the exchange of information’ (Granström et al. 2002: 1). The idea that multimodality is the combination of several human (primarily external) senses is also widespread in research areas such as medicine, psychology and cognitive science. However, the field of multimodality itself uses less clear-cut definitions. Gunther Kress and Theo van Leeuwen (2001) understood a mode or modality as any semiotic resource, in a broad sense, that produces meaning in a social context: the verbal, the visual, language, text, image, music, sound, gesture, narrative, colour, design, taste, speech, touch, plastic and so on. While this approach to multimodality has some pragmatic advantages, it produces a rather indistinct set of modes that are hard to compare and correlate since they overlap in many ways (Kress and van Leeuwen 2001: vii, 3, 20, 22, 25, 28, 67, 80; Kress and van Leeuwen 2006: 46, 113, 177, 214). Despite recent suggestions for systematic analysis of multimodality (Bateman et al. 2017), the fundamental notion of multimodality remains circumscribed rather haphazardly by researchers attaching to the Kress and van Leeuwen tradition. However, Kress’s book Multimodality (2010) circumscribed the notion of mode more firmly within a frame of social semiotics (Chap. 5, ‘Mode’), and the selection of what might constitute modes is narrower than in earlier publications. On the other hand, Kress emphasised the distinctiveness of modes such as images and writing.

By emphasising the distinctiveness of modes, Kress’s notion of multimodality comes close to the view that media types are inherently different. Earlier efforts to describe relations among different media generally started with precisely the same conceptual units that we also find in multimodal research—image, music, text, film, language (verbal media) and visuality (visual media)—presuming that it is appropriate to compare these entities. The indistinctness of such comparisons is confusing if one treats the compared units as fundamentally different media with little or nothing in common.

In contrast to such views, Mieke Bal has convincingly demonstrated that ‘word’ and ‘image’ are interrelated and integrated in complex ways (1991). W. J. T. Mitchell is another scholar who has successfully criticised this mode of thinking by importantly pointing to the way in which media types (more specifically art forms) that are generally seen as opposites actually share various traits (1986). However, Mitchell’s use of traditional dichotomies such as text vs. image and verbal vs. pictorial makes it difficult to grasp the nature of the similarities of media. Meanwhile, most other scholars working with similar issues have continued to operate with the dichotomy of verbal vs. visual media types. This is problematic because of what I would describe as the modal incommensurability of the two notions: whereas the verbal is a variation of the symbolic, in Peirce’s sense, and hence a semiotic property, the visual belongs to the domain of sense perception. Hence, the two notions belong to different categories of media traits, different modalities, and are not fit to form a dichotomy—just as there is little point in contrasting blue cars with fast cars.

As long as such obscurities continue, it remains unclear how to understand notions such as multimodal and intermedial and how they are related. Generally, ambiguities remain even in the most qualified scholarly publications (see, for instance, Moser 2007a, b). The fuzziness of concepts termed ‘media’ and ‘mode’ also remains in a central research area such as communication studies (as demonstrated in Parks 2017).

It is no wonder, then, that the discourses on media and modalities tend to be either separated or mixed up. Why bother to combine, or to keep apart, notions that seem to be fuzzy in rather similar ways? There are many media types, which might be the same as saying that there are many modes of communication. In ordinary situations, a language use that simply equates ‘media’, ‘modalities’ and ‘modes’ is unproblematic. However, I think it is a good idea to separate the meanings of ‘medium’, ‘modality’ and ‘mode’ to make it possible to differentiate between intermediality and multimodality in the way that Lehtonen proposed, namely, to see intermediality as ‘the relationships between multimodal media’ (2001: 75).

To the best of my knowledge, there is nothing in the etymology of the words ‘medium’, ‘modality’ and ‘mode’, or in their established uses, that clearly determines how they should be interrelated. Therefore, I see it as my task to raise a theoretical construction and propose how to use these central terms in relation to each other.

My starting point is the idea that media are both similar and different and that media cannot be compared without clarifying which aspects are relevant to the comparison and how these aspects can relate to each other. Therefore, I propose a model that starts not with the units of established media forms, or with efforts to distinguish between specific types of intermedial relations between these recognised media, but with the basic categories of features, qualities and aspects of all media. As already explained briefly, I propose to think in terms of media modalities—types of media traits. The modalities are the indispensable cornerstones of all forms of media, integrating physicality, perception and cognition. Separately, these modalities constitute complex fields of research and are not related to the established media types in any definitive way. However, they are crucial in efforts to describe the character of every single media product. They are all familiar for research, even though their interactions have not been accounted for systematically. As stated earlier, I call them the material modality, the spatiotemporal modality, the sensorial modality and the semiotic modality, and they are found on a scale ranging from the material to the mental. The first three modalities are presemiotic and concern mediation. The semiotic modality concerns representation or, more broadly, signification: how the mediated sensory configurations come to signify cognitive import in the perceiver’s mind and form a virtual sphere.

Scholars constantly describe and define media based on one or more of these modalities. However, this is not always sufficient, because all media are necessarily realised in the form of all four modalities. Therefore, I argue that all four of them should be considered. In this respect, there is a fundamental difference between my approach and the systematic, often hierarchic but simplistic classifications and divisions of the arts, the aesthetic media types, which were put forward from the eighteenth century and well into the twentieth century (see Munro 1967: 157–208). Nevertheless, the roots of thinking in terms of media modalities go way back in time. An important early thinker who saw things clearly was Moses Mendelssohn, who built a typology with the aid of distinctions such as ‘natural’ versus ‘arbitrary’ signs, ‘the sense of hearing’ versus ‘the sense of sight’ and signs that are represented ‘successively’ versus ‘alongside one another’ (1997 [1757]: 177–179). The typology is sketchy but instructive since Mendelssohn clearly realised that the borders of the arts ‘often blur into one another’ (1997 [1757]: 181).

Much later, the systematic thinking of the linguist Roman Jakobson came close to the idea of media modalities. He discussed and interrelated the five external senses, spatiality and temporality, as well as Peirce’s sign trichotomy icon, index and symbol (1971a, b, c). Jakobson also made important but undeveloped efforts to put this in the context of ‘communication systems’, albeit with language as the undisputed centre and measure (1971c). This linguistic bias implies that Jakobson thought of communication at large as ‘systems’, which I believe gives a warped picture of the wealth of communication that occurs outside the boundaries of systems. Another reason for his failure to achieve a nuanced overview of communication is the common tendency to reason in terms of false dichotomies. A question such as ‘What is the essential difference between spatial and auditory signs?’ (1971b: 340), contrasting a spatiotemporal and a sensorial mode, offers a tilted starting point for investigating signs in communication.

Similar tilted starting points are detectable in Jiří Veltruský’s comparison of artistic media forms (1981). In Veltruský’s account, it remains unclear what the ‘material’ of an art form is. According to the author, materials can be divided into the ‘auditory and visual’; the material of music is said to be ‘tones’ and the material of literature is said to be ‘language’. Furthermore, the material of literature is supposed to oscillate ‘between materiality and immateriality’ (1981: 110). Although this categorisation is representative, it is not at all illuminating. The category of material is untenable since it includes media traits that cannot be treated as equals: tones, language and even the immaterial. Tones must be seen as related primarily to the sensorial modality, whereas language must be understood in semiotic terms; however, spoken language actually also consists of some sorts of tones. What the immaterial material is, I do not know.

Mitchell came closer than Veltruský to the idea of media modalities. In one publication, he discussed ‘four basic ways in which we theoretically differentiate texts from images’. Three of these ways are ‘perceptual mode (eye versus ear)’, ‘conceptual mode (space versus time)’ and ‘semiotic medium (natural versus conventional signs)’ (1987: 3). Although limited to a comparison of texts and images, this description contains three of the media modalities in their embryonic forms. Moving from text and image to the more specific media types poetry and painting, Mitchell also argued that ‘there is no essential difference between poetry and painting, no difference, that is, given for all time by the inherent natures of the media, the objects they represent, or the laws of the human mind’ (1987: 2–3). Although it is important not to exaggerate the differences between media, I would say that it is fully possible ‘to give a theoretical account of these differences’ (1987: 2), essential or not, which Mitchell doubted.

Later interesting discussions of these issues, including actual efforts to systematise several of those media traits that I categorise in modalities, are found in publications by Helen C. Purchase (1999) and Eli Rozik (2010). However, although constantly recurring, the material, the spatiotemporal, the sensorial and the semiotic types of media traits tend to be fused and mixed up in fundamental ways. Perhaps the most common mistake in these discussions is to confuse the notions of visual and iconic: whereas the visual is about using a specific sense faculty (whether this is connected to iconic, indexical or symbolic signs), the iconic is semiosis based on similarity (whether this similarity can be seen, heard, felt or otherwise sensed) (see Elleström 2016).

4.2 Media Modalities and Modes

In 2010, I published the first version of this article: ‘The Modalities of Media: A Model for Understanding Intermedial Relations’ (Elleström 2010). In that piece, I introduced a distinction between two levels to facilitate and sharpen methodical descriptions and analyses of media products. On one hand, there are the types of traits that are common for all media products, without exception; on the other hand, there are the specific traits of particular media products or types of media products. To make the distinction transparent, I call the former modalities and the latter modes. In brief, then, media modalities are categories of basic media traits, and media modality modes (or simply media modes or modality modes) are basic media traits.

I have argued that there are four media modalities, four types of basic media modes. For something to acquire the function of a media product, it must be material in some way, understood as a physical matter or phenomenon. Such a physical existence must be present in space and/or time for it to exist; it needs to have some sort of spatiotemporal extension. It must also be perceptible to at least one of our senses, which is to say that a media product has to be sensorial. Finally, it must create meaning through signs; it must be semiotic. This adds up to the material, spatiotemporal, sensorial and semiotic modalities. It follows from the definition of a media product as the intermediate entity that enables the transfer of cognitive import from a producer’s to a perceiver’s mind, where a virtual sphere is created, that no media products or media types can exist unless they have at least one mode of each modality.

The modalities should be understood as categories of related media modes that are basic in the sense that all media products have traits belonging to all four modalities. All media products appear as specific combinations of particular modes of the four media modalities. A certain media product must be realised through at least one material mode (as, say, a solid or non-solid object), at least one spatiotemporal mode (as three-dimensionally spatial and/or temporal), at least one sensorial mode (as visual, auditory or audiovisual) and at least one semiotic mode (as mainly iconic, indexical or symbolic). Hence, the four media modalities form an indispensable skeleton upon which all media products are built.

By ‘modalities’, I thus mean the four necessary categories of media traits ranging from the material to the mental, and by ‘modes’ I mean the specific media traits categorised in modalities. I do not define entities such as ‘text’, ‘music’, ‘gesture’ or ‘image’ as modalities or modes; in the following section of this article, I will instead explain them in terms of media types.

As emphasised, three of the four modalities are presemiotic, which means that they cover media modes that are involved in signification—the creation of cognitive import in the perceiver’s mind—although they are not semiotic qualities in themselves. Thus, the material, spatiotemporal and sensorial modalities are not asemiotic; they are presemiotic, meaning that the modes that they cover are bound to become part of the semiotic as soon as communication is established. The presemiotic media modes concern the fundamentals of mediation, which is to say that they are necessary conditions for any media product to be realised in the outer world by a technical medium of display, and hence for any communication to be brought about. All four modalities obviously depend strongly on each other—just as the modes may be entangled with each other in several ways, depending on the character of the media product.

With the aid of this theoretical framework, basic media differences and media similarities can be pinpointed. Crucial divergences and fundamental parallels can be highlighted among all conceivable sorts of media—existing and yet to be devised—which provides a firm ground for understanding, describing and interpreting the most elementary media interrelations. Of course, I can only hint here at the complexity of the innumerable interrelations that can be derived from the four modalities and their modes.

The material modality is a category of material media modes. All media products are material, or more broadly physical, which makes them perceptible and hence accessible to the perceiver’s mind in various ways. However, distinctions can be made among material properties in ways that may overlap. I discern at least two vital ways of distinguishing material modes. As described in physics, there are different states of matter, four of which are relevant for everyday life: a media product may be solid or in the form of liquid, gas or plasma. As examples, consider a solid road sign made of painted metal, liquid water used in an art installation, gas in the form of vibrating air (sound waves) produced by vocal cords and plasma in a television screen or other device for communicative display. Another way of distinguishing material modes is to separate organic and inorganic matter. For instance, whereas an outstretched arm with a pointing finger is an organic media product, a tailor’s dummy is an inorganic media product. This is a biological rather than physical distinction, although the two are equally relevant for everyday life.

The spatiotemporal modality is a category of spatiotemporal media modes. As they consist of physical matter, all media products have spatiotemporal properties and can therefore be grasped by human minds. Following well-established models in physics, the three spatial dimensions and the temporal dimension can be considered as a unit. Thus, space and time form a four-dimensional spatiotemporal entity consisting of width, height, depth and time. Although all media products actually exist in such a four-dimensional world, the relevant properties of media products (those properties that, because of selective attention, perform the function of a media product) may be more restricted. I argue that media products must have at least one and may have up to four spatiotemporal modes.

However, these modes cannot be freely combined: to perceive space with the senses, at least two spatial dimensions are required. This means that the only conceivable monomodal spatiotemporality would be exclusively temporal media products. Speech or song emanating from a single point might be considered as instances of media products that are only temporal, although I think it is reasonable to state that even such media products have some rudimentary spatial qualities. Tracing media modes is seldom a question of definitely affirming or dismissing them. Nevertheless, it is important to discern differences. Thus, temporality, a mode of the spatiotemporal modality, is an aspect of songs, speeches, gestures and dance, but not of stills and most sculptures. Whereas a photograph has only two dimensions (width and height), a sculpture has three spatial dimensions (width, height and depth). A dance and a mobile sculpture have four dimensions (width, height, depth and time). Dance performances and political speeches have a beginning, an extension and an end situated in the dimension of time, while a photograph, as long as it exists, simply exists. If you close your eyes or block your ears in the middle of a performance or a speech, you miss something and cannot grasp the spatiotemporal form in its entirety. If you close your eyes while looking at a photograph, you miss nothing and the spatial form remains intact. In these respects, there are distinct and relevant spatiotemporal differences among media products and media types, even though the presence or absence of certain modes may sometimes be disputed.

All media products, like all objects and phenomena, are necessarily perceived in time and space before they create cognitive import in the perceiver’s mind. Semiosis is also a spatiotemporal phenomenon. However, because media products are constituted only by parts of the physical surroundings that are chosen for selective attention in acquiring a communicative function, this does not rule out the actual media differences. Also, some media types, such as visual, verbal (symbolic) signs on a flat but static surface (such as printed texts), are conventionally decoded in a fixed sequence, which makes them second-order temporal, so to speak: sequential but not actually temporal, because the physical matter of the media products does not change in time.

The sensorial modality is a category of sensorial media modes. All media products have sensorial properties in the sense that their materiality, somehow existing in time and space, must be perceived by one or more of our senses to reach the mind and trigger semiosis. Media products simply do not exist unless they are grasped by the senses. We usually think about the five external sense faculties of humans, which I here describe as the five main modes of the sensorial modality: seeing, hearing, feeling, tasting and smelling. The visible is a mode of signboards, gestures, films, websites and tattoos. The audible is a mode of instrumental music, recited poetry, films, radio weather forecasts and the shouting of salespeople in the street. Communication can also be accomplished by how the surface of a gift feels, how a meal tastes or how a flower smells.

Still, there are other human senses, described in terms such as interoception (sensing the internal state of the body) and proprioception (sensing body position and self-movement), and these senses may be relevant for human communication and vital for the perception of media products, especially when the human body itself is used as a media product. Someone who physically makes someone else lose her balance by pushing her may communicate threat, which is perceived by sight and touch but also by the perceiver’s proprioception—the perceiver’s body constituting the media product.

The semiotic modality is a category of semiotic media modes. While the material, spatiotemporal and sensorial modalities form the framework for explaining the presemiotic processes of mediation, the semiotic modality is the frame for understanding representation. All media products are semiotic because if the sensory configurations with material, spatiotemporal and sensorial properties do not represent anything, they have no communicative function, which means that there is no media product and no virtual sphere in the perceiver’s mind. Hence, all objects and phenomena that act as media products have semiotic traits, by definition.

Whereas the semiotic traits of media products are less palpable than the presemiotic ones and are in fact largely derived from them—because different kinds of mediation have different kinds of semiotic potential—they are equally essential for realising communication. The mediated sensory configurations of a media product do not transfer any cognitive import until the perceiver’s mind comprehends them as signs. In other words, the sensations are meaningless until they are understood to represent something through unconscious or conscious interpretation.

Although the sensory configurations have no meaning in themselves, the process of interpretation begins in the act of perception. Conception does not come after perception; rather, all our perceptions are results of the endeavours of an interpreting, meaning-seeking mind. The moment we become aware of a visual sensation, for instance, the sensation is already meaningful at a basic level because meaning-making already starts in the unconscious apprehension and arrangement of what is perceived by the sense receptors. Meaning-making continues in the more or less conscious acts of creating sensible patterns in the intracommunicational domain and relevant connections to the extracommunicational domain.

These observations are not valid only for the perception of media products. The world at large is meaningless in itself; its significance is the result of interpreting minds—perceiving and conceiving subjects situated in social circumstances—attributing import to states of affairs, actions, occurrences, natural objects and artefacts. Following Peirce, meaning can be described as the result of sign functions, and although there are no signs until some interpreter has attributed significance to something, it is possible to distinguish between different sorts of signs.

Earlier, it was common to distinguish between conventional signs and natural signs. Peirce’s most important trichotomy—icon, index and symbol—attaches to this division even though it avoids the slightly misleading idea that some signs exist ‘in nature’. It is far beyond the scope of this study to account for all of Peirce’s complex semiotic ideas, so I simply state that I follow his specific idea that signs result from mental activity based on, as I would have it, certain cognitive capacities.

As noted, Peirce defined the three sign types in terms of the representamen–object relationship. Icons stand for (represent) their objects on the ground of similarity, indices do so on the ground of contiguity, often described as ‘real connections’, and symbols operate on the ground of less durable habits or stronger conventions (see, for instance, 1932: CP2.303–304 [1902], CP2.247–249 [c.1903]; Elleström 2014a: 98–113). I regard perceiving similarity and contiguity and forming habits as fundamental cognitive abilities. I also take iconicity, indexicality and symbolicity to be the three main semiotic media modes; no communication can occur unless cognitive import is created in the perceiver’s mind through at least one of the three sign types—icons, indices and symbols.

This sign division is also echoed in research branches that do not engage in semiotics. During the twentieth century, it was common to distinguish between different but complementary ways of thinking. Some cognitive functions have been said to be mainly directed by ‘pictorial representations’, whereas others have been understood to mainly rely on ‘propositional representations’. The pictorial is more concrete and related to perceiving similarity and contiguity, while the propositional is more abstract and related to forming habits. Brain research has shown that the two ways of thinking can largely be located in the two cerebral hemispheres. Cognitive science involves an almost universal dichotomy—cognition based on similarity and cognition based on rules—although there are different opinions regarding their interrelations and dominance (Sloman and Rips 1998).

I suggest three terms to denote the processes of iconic, indexical and symbolic representation. Although these terms are widely used for different purposes in diverse contexts, they fit the rationale of this study. Hence, I propose calling iconic representation depiction, referring to indexical representation as deiction, and denoting the process of symbolic representation with the term description. The manner in which I use these three terms makes their significance both broader and narrower than in many other contexts; I annex them only to be able to efficiently distinguish verbally among the three main types of signification.

Depiction, deiction and description are not mutually exclusive; as modes of the other modalities, they are often (perhaps even always) combined to create multimodal media, that is, media that are both visual and auditory, spatial and temporal, iconic and indexical and so forth. According to Peirce, who stressed that the determinate aspects of all signs are ‘in the mind’ of the interpreter, the three modes of signification are always mixed, but often one of them can be said to dominate (1932: CP2.228 [c.1897]). In most written, verbal texts, the symbolic sign functions of the letters and words dominate the signification process. In instrumental music and all kinds of visual still images (such as drawings, figures, tables and photographs), iconic signs generally dominate, although photographs also have an important indexical character. Depictions in music and visual still images differ, of course, since the musical representamens are auditory and perhaps mainly represent motions, emotions, bodily experiences and cognitive structures, while the visual representamens of still images can effortlessly represent a broad range of objects from many areas. Nevertheless, all of these iconic sign functions are based on similarity. I am well aware of the lack of consensus when it comes to the question of musical meaning, but my point is that no matter how one defines the semiotic character of a media type, it must include semiotic particularities that are sometimes at least partly media-specific. Music and visual still images simply do not communicate in the same way.

As I have already stressed, a semiotic perspective must be combined with a presemiotic perspective. Communication is equally dependent on the presemiotic media modalities and the semiotic modality. Represented objects called forth by representamens are results of both the basic features of the media product as such and of semiotic activity situated in a social context. While signification is ultimately about mind-work, in the case of communication this mind-work is dependent on the physical appearance of the media product. However, some representation is clearly more closely tied to the appearance of the medium, whereas other representation is more a result of interpretation, and hence the setting of the perceiver’s mind.

Thus, the spatiotemporal, the sensorial, the material and the semiotic modes together form the specific character of all media products, and generally also media types as they are circumscribed at certain periods. Traditional sculpture is three-dimensional, solid and non-temporal. It is primarily perceived visually, but it also has tactile qualities that can be understood as part of its defining qualities. Generally, the iconic sign function dominates. An animated movie, as we understand this media type today, with its moving images and evolving sounds, is temporal. It is mediated by a flat surface with visual qualities combined with sound waves. The images are primarily iconic, and they lack the specific indexical character of images produced by ordinary movie cameras. The sound generally consists of voices, sound effects and music: the musical sounds, but often also much of the voice qualities, are very much iconic, while the parts of the voices that one can discern as language are mainly interpreted as habitual signs. Printed advertisements, as they are normally understood, have a solid, two-dimensional, non-temporal materiality and are perceived by the eye. Most of them gain their meaning through verbal symbols combined with iconicity in the visual form of their elements, including the verbal symbols. Printed advertisements that contain readable words are sequential but not temporal as such, although the conventions of language make it necessary to read the letters and words in a certain order to make sense. As already emphasised, the presemiotic and semiotic modes of a media product offer certain possibilities and set some restrictions: any kind of cognitive import cannot be freely created based on just any media type.

The concept of media modalities that I have outlined here roughly supports ideas about media always containing other media (McLuhan 1994 [1964]: 8, 305) and media always being mixed media: ‘the very notion of a medium and of mediation already entails some mixture of sensory, perceptual and semiotic elements’ (Mitchell 2005: 257, 260; cf. Mitchell 1994: 95, 2005: 215, 350). However, the concept of media modalities also accounts, in some detail, for how media are differently entangled in each other, and in which respects media may not be contained by or mixed with other media: different media necessarily share the four basic modalities, but they have the modes of the modalities only partly or not at all in common. There are media similarities and media dissimilarities and media are mixed, or multimodal, in dissimilar ways.

All media are multimodal in that they must have at least one mode from each modality. Most media are also multimodal in the sense that they have several modes from the same modality: they may be materially multimodal, having both solid and liquid modes, for instance. They may be spatiotemporally multimodal, being both two-dimensionally spatial and temporal, for example. They may be sensorially multimodal, being dependent on being both seen and heard. They may be semiotically multimodal, for instance, by forming cognitive import through icons and indices as well as symbols. Because signification requires at least some degree of activity of all three sign types, all media are probably semiotically multimodal. Some media, such as computer games and theatre, are multimodal on the level of all four modalities.

The four media modalities are categories of basic media traits. However, the traits that they cover, the various modes, are not isolated, self-sufficient traits. Therefore, the proposed model offers no simple, mechanical way of checking off the modality modes, one after another, but it instead suggests a method of minutely investigating the features of various media and ways of analysing and interrelating them. This is a more detailed and specific way of outlining media multimodality compared to multimodality understood as the combinations of socially constructed entities such as writing, music and gesture. However, the model of media modalities does not in any way exclude the social aspect of communication, which I have already accounted for in the discussions of how communicating minds are formed, and which will return in the following section on media types.

5 What Are Media Types?

5.1 Basic and Qualified Media Types

Having outlined the concepts of media modalities, modality modes and multimodality, I can now suggest a way of thinking about media types. Reasoning in terms of types can involve several pitfalls. Nevertheless, it is virtually impossible to navigate in one’s material and mental surrounding without categorising objects and phenomena; otherwise, everything would be difficult to grasp and to explain. Categorisation brings about borders—or at least border zones—and borders should always be disputed. The area of communication is no exception: it is unavoidable to categorise media into types, and it is not evident how these categorisations should be made.

What, then, does one categorise in communication? I suggest that a central element for categorisation in this broad area is the media product, understood here as a single entity in contrast to types of media. Whereas media products are individual communicative entities, media types are clusters of media products. In everyday discourse, and in this article (unless otherwise specified), the term ‘medium’ may refer to an individual media product as well as a media type. More specifically, ‘a talk’ and ‘a photograph’ refer to specific media products, and ‘talk’ and ‘photography’ refer to types of media.

Despite the complex nature of media products, it is fully possible to categorise them in various ways. A discussion of media categorisation requires that proper attention be paid to the basic qualities of media products, understood as physical intermediate entities that enable transfer of cognitive import between at least two minds, resulting in a virtual sphere in the perceiver’s mind. This involves qualities that must be understood as being situated within the range from the purely material to the purely mental. I have already described these traits that involve physical properties as well as cognitive processes in terms of media modalities.

In the end, each media product is unique. However, thinking species such as humans feel the need to categorise things in order to navigate the world and communicate efficiently. This leads to the categorisation of media products, and, as is often the case with classification in general, our media categories are usually quite fluid. Nonetheless, thinking in terms of media modalities is helpful for understanding media differences and similarities and hence for understanding how media can be categorised. This is not the whole story, though. Some categorisations are more solid and stable than others are because they depend on partly dissimilar factors. There are simply different types of media categories.

That is why I find it helpful to work with the two complementary notions of basic media types and qualified media types—two types of media types. People sometimes pay attention to the most basic features of media products and classify them according to their most salient material, spatiotemporal, sensorial and semiotic properties. For instance, people sometimes think in terms of still images (most often understood as tangible, flat, static, visual and iconic media products). This is what I call a basic medium (a basic type of media product), and it is relatively solid because of its perennial fundamental traits. Basic media types are categories of media products grounded on basic media modality modes.

However, when such a basic classification is not enough to capture more specific media properties, we qualify the definition of the media type that we are after and add criteria that lie beyond the basic media modalities. We also include all kinds of aspects about how we produce, situate, use and evaluate media products in the world. We tend to talk about a media type as something that has certain functions or that we use in a certain way at a certain time and in a certain cultural and social context. Qualified media types are simply categories of media products grounded not only on basic media modality modes but further qualified.

For instance, we may want to delimit the focus to still images that are handmade by very young people—children’s drawings. This is what I call a qualified medium (a qualified type of media product), and it is more indefinite than the basic medium of a still image, simply because the added specific criteria are vaguer than those captured by the media modalities. It may be difficult to agree upon what a handmade drawing actually is: Should drawings made on computers or scribbles on the wall be included? When does a child actually become a young adult? The notion of childhood varies significantly among cultures and changes over time, not to mention the individual differences in maturity. Therefore, the limits of qualified media are bound to be ambivalent, debated and changed much more than the limits of basic media are.

Because processes of categorisation are multifaceted, serve different purposes and often involve vague terminology, the distinction between basic media types and qualified media types is not always clear in actual media classifications. Also, because the modes of the modalities are not always easily isolated entities, there is no definite set of basic media types. There is also an abundance of basic media that we have no terms for at all, which makes explaining and discussing them a cumbersome exercise. In fact, everyday language only covers a few rudimentary media types. Here I think of the terms ‘text’ and ‘image’ that, in various terminological constellations, come close to standing for several related basic media types.

If ‘text’ is defined as any media type primarily based on (verbal) symbols, it becomes possible to discern variations such as ‘auditory text’ (consisting of sound waves in air or possibly water or some other gas or liquid that are heard in a temporal flow), ‘tactile text’ (consisting of solid, three-dimensional signs on a surface that does not evolve in time) and various forms of ‘visual text’ (consisting of, say, non-organic or organic materials in two or three spatial dimensions that are either temporal or not). Likewise, if ‘image’ is defined as any media type primarily based on icons, it is possible to differentiate between basic media types such as ‘auditory image’ (consisting of sound waves that are heard in a temporal flow and resulting not primarily in verbal symbols but in icons), ‘tactile image’ (consisting of solid, three-dimensional signs on a surface that does not evolve in time) and several forms of ‘visual image’ like ‘visual still image’ (non-temporal) and ‘visual moving image’ (temporal) in various material appearances.

Because of the almost infinite possible modal combinations, we must accept that some basic modal groupings are commonly distinguishable at a certain time and in a certain culture, and that the future may hold new habits and technical solutions that make novel basic media types relevant. For example, imagine a basic media type consisting of organic materiality in the form of a liquid that is perceived as both a spatial extension and a temporal flow, which can be both seen and felt and which produces mainly iconic meaning. Assuming that a technical medium of display capable of realising media products with such traits was invented and grew popular, we might expect an increasing need for a term to represent such a basic media type.

Categorising media products in basic media types is about categorising what are considered the relevant features of all perceived sensory configurations and how they trigger semiosis. We have observed that something becomes a media product because it attains a communicative function in mediating between several minds, but not all traits of the mediating physical entity or process are involved in the communicative function. The selective attention of the perceiver’s mind, which is often formed by social praxis, decides what material, spatiotemporal and sensorial qualities of certain parts of the physical entity or process become involved in signification, resulting in a virtual sphere. When we perceive a standard book page, we usually ignore its slightly three-dimensional features and think of it as a flat surface; we also look at it in certain ways rather than try to taste it or listen to it.

Hence, basic media types—such as inorganic, flat, static, visual texts—are really categorisations of salient traits that enable communication in certain ways, not simply of objectively existing traits of physical items or occurrences. This becomes apparent especially considering the semiotic modality. Although they are based on the presemiotic modes, it is the semiotic modes that fulfil the communicative function of the media product, and different sign types, different forms of representation—belonging to different basic media types—may well result from similar forms of mediation depending on different forms of expectation and interpretation. For instance, when trying to make sense of certain inscriptions on an old monument, exactly the same visual, ornamental configurations can be understood either as icons representing natural objects or abstract ideas on the ground of perceived similarity or as symbols representing names or places on the ground of conventions.

As noted above, it is often insufficient to consider only the media modalities when seeking to understand how media products are categorised. One must also consider their communicative functions in societies and a world of constant change. In addition to basic media types, there are qualified media types, which depend on history, culture and communicative purposes. They include classes such as lectures, music, television programmes, news articles, visual art, Morse Code messages, sign language and email. Although they are normally based on one or several basic media types, and may therefore have a certain degree of stability, their defining features are formed by fluctuating conventions. My understanding of qualified media types comes fairly close to how other scholars have defined media at large: ‘“medium” could be defined in a moderately broad sense as a conventionally distinct means of communication, specified not only by particular channels (or one channel) of communication but also by the use of one or more semiotic systems serving for the transmission of cultural “messages”’ (Wolf 1999: 35–36); ‘what we identify as a specific “medium”—as well as what we consider “natural” about and how we perceive and use both traditional and new media—are shaped by a wide variety of factors, ranging from physical material, technological infrastructure, means of access, social conventions, media habits, preferences of communication partners, and institutional structures’ (Rice 2017: 536).

One could say that the dependence of qualified media types on basic media types moderates the potentially radical changes of qualified media types. Although societies, technologies, cultures, values, habits and communicative expectations change, there is often a natural resistance towards complete metamorphoses of qualified media types. For instance, few would find a point in letting a qualified medium such as music be developed in such a way that its basic presemiotic modal qualities (sound evolving in time) were counted out. Likewise, one would hardly accept a qualified media type such as surveillance video to include media products that do not contain temporally evolving visual iconicity. Whereas painting is a qualified medium because expected aesthetic qualities are to be presented within certain social and artistic frames that are bound to undergo changes, its expected modal traits are relatively stable and provide a useable starting point for discussing the limits of the media type. For instance, few would accept that a media product that cannot be seen is a painting and if it is strongly three-dimensional, rather than two-dimensional, a strong case could be made for it being a relief rather than a painting.

By the same principle—qualified media types depending on basic media types—there are categorisations that are often understood to form single qualified media types, whereas they might be seen as several interrelated qualified media. I argue that literature as art is preferably treated as at least two qualified media types: literature that one sees (reads) and literature that one hears. Of course, visual (written) and auditory literature are deeply entangled; we constantly transform the auditory to the visual and vice versa when we write down literature and read it out loud but still expect the different media products to function in roughly the same way. Hence, the qualifying processes are partly similar for the two qualified media, but they are still significantly different in certain respects since they are based on at least two different basic media.

Thus, qualified media types often contain more solid cores of basic media types, which partly justifies the much debated idea of medium specificity and the controversial notion that there are sometimes also essential differences between qualified media types. Whereas many scholars load their revolvers when they hear the word ‘essential’ (because media qualities that are described as essential are often just social constructions), I think that similarities and dissimilarities among qualified media types in terms of basic presemiotic and semiotic features can be said to be essential. Most people in most cultures now understand a qualified medium such as film to be a combination of visual, predominantly iconic signs (images) displayed on a flat surface and sound in the form of icons (as music), indices (sounds that are contiguously related to visual events in the film) and symbols (as speech), all expected to develop in a temporal dimension. The combination of these features is no doubt a historically determined social construction of what we call the medium of film, but given these qualifications of the medium, it has a certain essence.

Because qualified media types are cultural conceptions that are created, perceived and defined by human minds, there are no media types ‘as such’ and therefore no independent essences of qualified media ‘as such’. However, once we agree that, for pragmatic reasons, it is meaningful to say that there are dissimilar media types, essential presemiotic and semiotic modes are inscribed into these conventionally defined qualified media. It would be nonsensical to argue that a static collection of visual symbols (letters and words) displayed on book pages or a screen actually constituted a film. This is because there are essential dissimilarities on a basic level between our conceptions of written literature and film. A century ago, the two qualified media were construed slightly differently, so the essential dissimilarities between what was then called written literature and film were slightly different; the same terms were used to refer to somewhat different qualified media types.

However, it is not always possible to trace cores of basic media in qualified media. A qualified media type such as popular science is so broadly conceived that it can be realised by all kinds of presemiotic and semiotic modes as long as scientific ideas are communicated in a way that is not too complicated. Whereas such qualified media types are vague in terms of modality modes, they may well be precise in terms of communicative functions.

Furthermore, not all media products are regularly categorised. As we have noted, there is an abundance of variations of media products, especially considering that any physical item or phenomenon may be drawn into communication and acquire the function of media product, but only the most institutionalised types of media products are clearly categorised as qualified media types. This is the case for non-professionals as well as scholars. Thus, there are several kinds of media products that we normally do not categorise in qualified media types. For example, certain television programmes are readily understood as instances of the nature documentary qualified media type; however, when using an empty glass to communicate the desire to get more beer, it is unclear to what type of qualified medium such a glass might belong. Although not urgent, this problem should be noted.

5.2 The Contextual and Operational Qualifying Aspects

The grounds on which media types are qualified can be divided into at least two main aspects. The first is the origin and delimitation of media in specific historical, cultural and social circumstances. This can be termed the contextual qualifying aspect and involves forming media types on the grounds of historically and geographically determined practices, discourses and conventions. We tend to think about a media type as a cluster of media products that one begins to use in a certain way, or that gains certain qualities, at a certain time and in a certain cultural and social context. This is in line with Joseph Garncarz’s notion that media must be seen ‘not only as textual systems, but as cultural and social institutions’ (1998: 253). Visual art, Morse Code messages, sign language and email are not eternal media types, although they could be neatly described in terms of media modalities—they appear, they perhaps eventually disappear, and they are fully intelligible only in certain shared circumstances.

Sometimes it is more or less radical technological developments, such as the invention of new materials or forms of reproduction, that quickly trigger the genesis of what one takes to be new qualified media types (as is the case with various forms of so-called digital media). It may also be the case that new technology only slowly gives rise to new qualified media types. It has been argued that ‘cinema’ did not become ‘cinema’ the day the technique was invented (Gaudreault and Marion 2002). It took a while before a sufficient number of media products, created through cinematographic techniques, were original and characteristically similar enough to be thought of as a new media type. Eventually, two notions came to be attached to the same term: ‘cinema’ as a set of techniques and ‘cinema’ as a qualified media type developed within the frames of, but not determined by, the technological aspects. Video presents a similar case. First, a set of technical devices for the production, storage and distribution of media products was launched, and only later did these devices give birth to a qualified medium with certain communicative qualities (Spielmann 2008 [2005]). Sometimes it is instead media products based on old techniques that are seen as a new qualified media type when they are adopted in new contexts, as when photographs are exhibited at galleries and museums and come to be seen as photographic art.

The second of the two qualifying aspects is the general purpose, use and function of media, which may be termed the operational qualifying aspect. This aspect encompasses construing media types on the ground of claimed or expected communicative tasks. Although communication is generally a goal-driven activity, the goals may be very different, so it is natural to associate individual media products with other familiar media products that are known to have certain purposes and functions. Therefore, media products tend to be categorised to enhance understanding of what they could or should achieve. This means that such classification is not only descriptive but also prescriptive; it may deeply influence how a media product acts on the perceiver’s mind. Here, I can hint at only a few of the myriad existing communicative functions.

On an overarching level, media products can be thought of as more private or more official; there is a difference between how secluded communication is expected to work compared to communication with open access for everybody. This is why the idea of a category of mass media (often referred to as simply ‘media’) is so widespread. It is a common evaluation that one’s more private affairs are preferably communicated among a limited group of people that one trusts, whereas some media types are capable of reaching large groups of people and are therefore suited for communicating things of more general interest. In this way, the media types under the umbrella term ‘mass media’ are qualified operationally. However, media types are also qualified contextually. So, even though the distinction between private and mass media has never been sharp, we have seen in the last few years how the boundary has become increasingly blurred in so-called social media, where private and even intimate matters are commonly communicated openly and at least potentially accessible to a mass audience. Although still useful for most people, the distinction between private and mass media types will clearly continue to be debated and modified.

On a more specific level, crossing the fragile border between private and mass communication, media products may be claimed or expected to bond, create trust or share affections among people. We think in terms of caresses, consolations, promises, gifts and acts of courtesy. Although it may feel unusual to think of these things as media products, they are precisely such intermediate entities that enable transfer of cognitive import among minds, and we categorise them according to their claimed or expected communicative functions. Similarly, media products may have main functions to warn, threaten or frighten.

It is also common for media products to be claimed or expected to communicate various forms of truthfulness. Although it is not always clearly detectible in terms of how we categorise media products, this kind of purpose and use probably permeates a majority of media types (with the obvious exception of decidedly misleading communication). Qualified media types that are maintained to communicate news—television news, articles in newspapers, public announcements on streets and town squares and perhaps even gossip—are mainly expected to be truthful regarding factual and weighty recent events and their interconnections. Qualified media types called documentaries are largely construed on the purpose and function of representing truthfully and in some detail the interconnections of a specific set of persons and events in the past or in the present. There is also a multitude of media types that overtly function to educate, inform, instruct, train, provide wisdom and the like—media types that can be circumscribed in terms of various forms of expected truthfulness. Similarly, artistic media types, even those that are termed ‘fiction’, are expected to communicate truthfully, albeit in ways that are partly different from those media types mentioned previously. Art is generally claimed and believed to communicate general rather than particular truthfulness, for instance, not necessarily what a living person with a certain name said, did and felt in a specific place on a particular date, but rather what many people are likely to say, do and feel under certain circumstances.

Other forms of claimed or expected communicative functions that steer the construction of qualified media types include entertaining and aesthetic qualities. A performer would not produce stand-up comedy if her or his performance was not at all amusing; videogames need to be pleasurable to some degree to be regarded as games; movies that fail to be scary in an engaging way are not likely to be seen as horror movies; and jokes that are not funny for anyone are not really jokes—or at best they are failed jokes. Disregarding the obvious difficulty of distinguishing art from entertainment (which is perhaps not really necessary), artistically qualified media types such as music, dance, calligraphy, poetry and architecture are construed on the assumption that to deserve to be included in these art forms, the media products must fulfil certain aesthetic standards. Although this view has been contested in various ways, it remains a central factor for most people.

One can highlight the importance of the operational qualifying aspect with a comparison of dance, gesture and so-called body language. Although dance is generally considered an art form that is governed by aesthetic standards, it is closely related to and dependent on gesture and body language—media types that are also seen as part of everyday practical communication. All three media types are probably among the most perennial and widespread forms of communication (less dependent on the contextual qualifying aspect), and they are virtually inseparable in terms of modality modes. The primary modes involved in dance, as well as in gesture and body language, are organic and solid materiality (the human body), all four spatiotemporal dimensions and visuality. Semiotically, I believe that all three media types are equally dependent on icons (signification based on similarity with elements, chains of events and ideas), indices (signification based on contiguity with entities and developments in the body’s external surrounding as well as emotional and cognitive processes within the body itself) and symbols (signification based on habits—both personal habits and collective conventions). Therefore, the difference between dance on the one hand and gesture and body language on the other remains to be found in the operational qualifying aspect. Whereas dance is supposed to fulfil certain current aesthetic criteria in order for it to be accepted as such, the same does not apply for gesture and body language.

All of these particular qualifying aspects can exist side by side, and they may well overlap. As we have seen in some of the examples, the contextual and operational qualifying aspects often interact. As Jürgen E. Müller (2008a, b, 2010; cf. Bignell 2019) emphasised, the communicative functions of a media type often arise, become gradually accepted or disappear at certain moments in history and in certain socio-cultural circumstances. The qualifying aspects are, precisely, aspects of the multifaceted mechanisms that lie behind categorisations of media products, and it is probably feasible to split these aspects into three, four or even more specific aspects.

It is impossible to avoid noticing the relativity of most qualified media types. Sometimes, a qualified media type may also seem to contain several more finely restricted media types. These more limited qualified media types might be referred to as qualified submedia types, or simply submedia. The concept of a submedium is effectively the same as most notions of genre. In other words, a genre is a qualified media type that is qualified also within the frames of an overarching qualified medium: a submedium. However, some genres, such as Western novels and Western movies, being subtypes of novels and movies, attach to each other across the borders of qualified media types and exist as twin submedia.

In the end, it is probably always possible to add criteria to make further distinctions among qualified media types (cf. Ettlinger 2015). Because qualification and requalification of media types are bound to continue as long as humans exist and are able to communicate, total agreement is utopian and unnecessary. Consequently, my ambition here is not to argue in favour of certain ways of circumscribing particular qualified media types, but rather to highlight the general mechanisms behind basic and qualified categorisations of media products.

5.3 Technical Media of Display, Basic Media Types and Qualified Media Types

Having explained the concepts of basic and qualified media types as different forms of categorisation of media products, I will now clarify the relation between technical media of display and basic and qualified media types. I have defined technical media of display as any objects, physical phenomena or bodies that mediate sensory configurations in the context of communication; they realise and display the entities that acquire the function of media products. Thus, every technical medium of display can be described according to the range of basic media it can and cannot realise—or, more precisely, which presemiotic modes it is more or less fit to mediate. One could also argue that different technical media of display can realise basic media types more or less completely and successfully. However, strictly speaking, it is a contradiction in terms to say that a basic media type may be realised only in parts; if one or several modality modes are missing, it is actually another basic medium, and one must think in terms of media being transformed. This line of thinking is ultimately self-evident, considering that basic media types are categories of media products and media products are functions of sensory configurations mediated by technical media of display.

Given that every technical medium of display can only realise certain basic media types, it follows that they can also only realise certain qualified media types. This is because many qualified media are construed on cores of basic media and are therefore dependent on particular technical media of display. One can only realise a theatre performance by a combination of technical media such as human bodies, some form of indoor or outdoor area and props. A television set, which displays a feature film very well (apart from the size of the screen), is only capable of partly realising a theatre performance: the three-dimensional spatiality, complex corporeality and multisensoriality of the theatre are reduced to a flat screen and a concentrated source of sounds—which means that it is not really theatre that one sees and hears on the television, but theatre transformed to something else.

Since the existence of certain technical media of display is a facet of every historical moment and cultural space, several qualified media types are more or less strongly dependent on specific technical media having a socially determined existence (the contextual qualifying aspect). Technical media of display inevitably also play a crucial part in the forming of the general purpose, use and function of media (the operational qualifying aspect). An oil painting can be described as a qualified medium characterised not only by certain modality modes but also by unique aesthetic qualities linked to the technical medium of oil colour, which was invented and developed at a certain time and in a certain cultural context. Similarly, qualified media types such as computer games are inconceivable without the resource of recently invented technology, and more specifically, they depend on electronic screens as technical media of display, which have only existed relatively recently.

This historical and functional closeness between physical existents (technical media of display) and qualified ways of categorising media (qualified media types) explains why the same term is often used to represent both, which sometimes creates confusion. We have already noted that ‘cinema’ (technologies for producing but also displaying cinema) did not become ‘cinema’ (a qualified media type) the day the technology was invented. Likewise, the term ‘photography’ can refer to devices and techniques for production, to several technical media of display (paper in books and magazines, electronic screens, t-shirts and even cakes), or to one or several qualified media types (photography as documentation or as art).

On the other hand, some qualified media types are broadly conceived and not so determined by specific technical media of display. The way that sculpture is usually conceived means it can be realised by all technical media of display that can mediate solid, three-dimensionally spatial and visual materiality, which includes technical media such as bronze, stone, plaster, plastics, sand, ice and metal. This allows for a larger variety of individual media products within the same media category.

6 What Are Media Borders and Intermediality?

6.1 Identifying and Construing Media Borders

With a deeper understanding of the multimodal nature of media products as well as their categorisation in media types, it is now possible to return to the issue of intermedial relations. For good reason, scholars have argued that intermediality is a result of constructed media borders being trespassed. Indeed, nature does not give any definite media borders, which means that it is not evident what intermedial relations are. Werner Wolf emphasised that media borders are created by conventions and defined intermediality as a relation ‘between conventionally distinct media of expression or communication: this relation consists in a verifiable, or at least convincingly identifiable, direct or indirect participation of two or more media in the signification of a human artefact’ (Wolf 1999: 37). Christina Ljungberg stressed the performative aspect of border crossings, arguing that intermediality is something that sometimes ‘happens’, an effect of unconventional ways of performing medial works (Ljungberg 2010).

However, there are at least two kinds of media borders. As we have seen, media differ partly because of modal dissimilarities and partly because of divergences concerning the qualifying aspects of media, and the conventionality and performativity of media borders are mainly a facet of the qualifying aspects (Rajewsky drew a similar conclusion [2010]). Intermedial relations between basic media types such as visual moving images and visual still images can be relatively clearly described within the framework of the four modalities, whereas intermedial relations between qualified media types such as auditory literature and music largely also rely on the two qualifying aspects.

In the first case, the border between the two basic media (visual moving image and visual still image) lies in the spatiotemporal modality, since still images are spatial, whereas moving images are both spatial and temporal. In the second case, the border between the two qualified media (auditory literature and music) is partly modal in character and partly qualified in character. It is modal because of differences in the semiotic modality: all auditory literature is primarily (but not exclusively) symbolic, and music is primarily (but not exclusively) iconic. It is qualified because the boundaries between what one counts as auditory literature and music largely depend on different communicative ambitions and expectations. A reading of a poem that is reasonably close to the sound of ordinary speech is generally considered to be literature, whereas a singing performance of the same poem counts as music. However, there are many performance variants between the literary and the musical that cannot be clearly classified as either auditory literature or music since there is no definite border to be crossed. Instead, there is a border zone that is located differently in different periods and cultures. The classification is sometimes simply a question of whether the poem is performed within the frames of a poetry event or a musical concert. However, this cultural and aesthetic ambiguity of the difference between auditory literature and music is clearly linked to the semiotic modality. Even a neutral reading of a poem has some iconic potential, and what one takes to be the increasing musicality of a more varied, rhythmic and melodic reading is, in fact, strongly linked to increased iconicity.

Thus, I subscribe to the idea that the borders between what I refer to as qualified media types are largely relative. Boris Eikhenbaum’s brief comment from nearly a century ago about the media types that we call art forms remains relevant today: ‘None of the arts are fully bound entities, since syncretic tendencies are inherent in each of them; the whole point is in their inter-relationship, in the grouping of elements under one sign or another’ (1973 [1926]: 124–125). I also believe that Mitchell’s later contention that there are no ‘essential’ differences between media that are ‘given for all time by the inherent natures of the media, the objects they represent, or the laws of the human mind’ (1987: 2–3) is broadly correct—if we consider the qualifying aspects of media types. However, it is also the case that several qualified media types have indispensable cores of basic media types, which means that once a community has formed these qualified media types on the ground of contextual and operational qualifications, and as long as they are of service, they may differ ‘essentially’ regarding modality modes, from other qualified media types. As long as we think that a weather forecast on the radio is something that we have to hear and a printed newspaper article is something that we have to see, there will be an ‘essential’ difference between sensorial modes of these two qualified media types.

In brief, then, the classification of basic media types is relatively stable, whereas the classification of qualified media types is relatively unstable. It follows from this that media borders can be stronger and weaker; in other words, media borders can be understood to be both identified and construed, depending on whether one considers basic media borders or qualified media borders.

6.2 Crossing Media Borders

One might understand the crossing of media borders as the phenomenon that one can classify a particular media product in different ways. For instance, one might categorise a certain three-dimensional, solid artefact as both an artistic sculpture and an object for religious adoration, which means that it, in a broad sense, bridges over qualified media borders. This is possible because the processes of categorising media products in qualified ways are largely open-ended, overlapping and changing.

However, one might also understand the crossing of media borders in a narrow sense as bridging over basic media borders. To explain this, it is important to consider the cross-modal cognitive capacities of the human mind, which no doubt evolved to make it possible to cope with a multimodal world. Practically all media borders can be bridged over to some extent, although certainly not completely, through these cross-modal cognitive capacities. They are central for mediality as such and indispensable for understanding intermedial relations.

Within a semiotic framework, cross-modal cognitive capacities refer to the abilities to create cross-modal representations. In the context of communication, these abilities explain the important phenomenon that meaning-making often goes beyond the media product’s actual presemiotic modality modes. For instance, a visual, two-dimensional and static image may represent something that is perceived to be both three-dimensionally spatial and temporal, such as a deer running in the forest. Whereas we perceive only two actual dimensions with our eyes, we perceive (or rather construe) virtual third and fourth spatiotemporal dimensions in our mind. Similarly, we regularly construe virtual materialities and sensory perceptions. A relief on a temple wall that is actually made of stone may be understood to represent a living organism such as a lion, which means that the representation crosses the border between non-organic and organic materiality. When studying a musical score, we only actually perceive visual configurations, but we understand them to represent auditory patterns: virtual sound is construed in our minds. All of these virtualities, these represented objects that are made present to our minds through signs in communication, result from semiotic activity: iconicity, indexicality and symbolicity. Thus, virtual spheres are partly made of interpretants resulting from cross-modal representation.

Cross-modal representation in communication involves a difference between the presemiotic modality modes of the media product and the material, spatiotemporal and sensorial traits of the virtual sphere that it represents, which requires cross-modal cognitive capacities. In single-modal representation in communication, the material, spatiotemporal and sensorial modes of the media product are akin to the traits of the virtual sphere that it represents (such as a solid, visual, two-dimensional, static image representing a solid, visual, flat and unchanging object). This is arguably less cognitively demanding.

The term ‘cross-modal’ is used in various ways in a multitude of research areas. In the context of communication, it usually refers to connections among the external senses (see, for instance, Brochard et al. 2013). However, in line with the concept of media modalities, cross-modal here means the linking of all forms of different presemiotic modes within the same media modality. More specifically, cross-modality should be understood here as cross-material, cross-spatiotemporal and cross-sensorial representation through iconicity, indexicality or symbolicity. For instance, solid media products may represent non-solid objects, static media products may represent temporal objects, and auditory media products may represent visual objects, in each case through iconicity, indexicality or symbolicity. Importantly, this means that dissimilar basic media types can partly represent the same objects. For instance, the notion of a running dog (a solid, organic, spatiotemporal and largely visual and auditory object) can be represented by a variety of different basic media types, not just solid, organic, spatiotemporal and visual or auditory media. This is what I mean when I state that cross-modal cognitive capacities can bridge over basic media borders: our minds are, to some extent, capable of leaping from mode to mode in the act of representation.

The functions of icons, indices and symbols—iconicity, indexicality and symbolicity—may be simple and straightforward as well as complex and sometimes difficult to grasp. All three sign types may cross the boundaries of what Peirce called the representamen, in the respect that something visual can represent something tactile, something static can represent something temporal and so forth. However, cross-modal representation may also mean that something material represents something mental. Our minds’ capacity to connect the experience of concrete objects and phenomena with the experience of thinking, feeling, perceiving and imagining is fundamental for our ability to communicate cognitive import. Whereas a visual circle may be an icon for a material, concrete object such as the sun, it may also work as an icon for mental, abstract phenomena such as harmony, satisfaction or eternity because of a perceived similarity between the visual form and the cognitive notions. A visual circle may also function as an index for the earlier presence of a material object like a pen or a brush that actually created the circle. Similarly, it could be understood as an indexical sign for the mental act of wanting to draw a circle: there is a real connection between the producer’s intention and the realised circle. The pen was there, but also the idea was there. Finally, a visual circle may be understood as a symbol, a sign based on habits, such as the letter O. In English, the written letter O signifies symbolically in at least two different ways. On one hand, it stands for a certain kind of sound (or rather a group of related sounds), and sound is a material phenomenon that we perceive with our external senses. On the other hand, the letter O stands for something abstract and conceptual in the sense that it represents a linguistic function—to form meaningful words—that can only be realised in conjunction with other letters.

Although abundantly present in all three sign types, cross-modal representation is perhaps most noteworthy in iconicity (Ahlner and Zlatev 2010; Elleström 2017). The ability to perceive cross-modal similarities is a remarkable cognitive capacity. While similarities are most clearly perceived among visual and auditory phenomena, respectively (a photograph of a boat clearly looks like a boat and a skilled whistler is able to sound just like a blackbird), similarities can be established across material, spatiotemporal and sensorial borders—and between the material and the mental. This is because mode-specific dissimilarities of details can be disregarded and similarity can be perceived on higher, more abstract and cross-modal levels. For example, visual traits may depict auditory or cognitive phenomena, and static structures may depict temporal phenomena. Hence, graphs may depict both changing pitch and altering financial status. Similarly, a variety of media types can depict similar ideas and concepts, such as the notion of speed, because they are abstracted from a broad range of sensory perceptions of different materialities and also mental experiences.

Initially, the purpose of my account of material, spatiotemporal and sensorial modes was to clarify the basic properties of media products working as representamens. However, as I have just demonstrated, it is clear that the modalities can also be used to characterise the objects of media products—what they represent, what they call forth in the mind of the perceiver in creating a virtual sphere. While represented objects such as abstract concepts may have an almost purely cognitive character, objects that are made present to the mind in signification may also be more or less concrete and physical. A painting of a face represents a face because the features of the painting are similar to the features of actual, physical faces as they are stored as recollections in our minds (Elleström 2014a). Hence, media products have certain material, spatiotemporal and sensorial modes, and, similarly, the objects that they depict, deict or describe may have either the same or other material, spatiotemporal and sensorial modes—or they may have a cognitive nature.

6.3 Intermediality in a Narrow and a Broad Sense

Given that media types and media borders are of various sorts and have different degrees of stability, it follows that media interrelations are multifaceted. Therefore, it may be helpful to provide some elementary divisions regarding the general nature of media interrelations. I first postulate that mediality is everything pertaining to media in communication. Intramediality concerns all types of relations among similar media types, and intermediality involves all types of relations among dissimilar media types. However, considering that there are (at least) two kinds of media borders, there are (at least) two ways of understanding media interrelations, making the classes intramediality and intermediality broader or narrower.

The term ‘intramedial’ is commonly used to refer to slightly different conceptions depending on how the notion of medium is circumscribed (see, for instance, Rajewsky 2002: 12). This is the case also for ‘intermedial’. Here, I follow the distinctions that I have recently expounded and suggest that media interrelations can be intramedial in a broad and in a narrow sense. Intramediality in a broad sense regards relations among (media products belonging to) similar basic media types, and intramediality in a narrow sense regards relations among (media products belonging to) similar qualified media types. Similarly, I suggest that media interrelations can be intermedial in a broad sense and in a narrow sense. Intermediality in a broad sense regards relations among (media products belonging to) dissimilar qualified media types, and intermediality in a narrow sense regards relations among (media products belonging to) dissimilar basic media types.

Thinking of intramedial and intermedial relations in a narrow and a broad sense is useful for disentangling the intricate notion of crossing media borders. To avoid confusion, it is recommended to keep the intramediality and intermediality classes together, which entails combining one broad and one narrow notion. Intramediality in a broad sense (meaning relations among similar basic media types) belongs together with intermediality in a narrow sense (meaning relations among dissimilar basic media types). Intramediality in a narrow sense (meaning relations among similar qualified media types) belongs together with intermediality in a broad sense (meaning relations among dissimilar qualified media types).

Elaborating on intermediality, it can be concluded more specifically that intermedial relations in a narrow sense are relations among (media products belonging to) dissimilar basic media types, that is, relations among media types based on different modality modes. This involves transgressing relatively strong media borders when moving between them. Intermedial relations in a broad sense, on the other hand, are relations among (media products belonging to) dissimilar qualified media types, including cases where no differences in modality modes are present. Because several qualified media types are based on the same modality modes, they belong to the same basic media type, and their interrelations are intermedial only in a broad sense. This involves transgressing relatively weak media borders when moving between them. For instance, the two media types written poetry and scholarly article are clearly qualified in different ways, although they are both typically understood to consist of visual, static and mainly symbolic signs on a flat and generally solid surface. Whereas the interrelation between written poetry and scholarly article is intermedial in a broad sense, it is not intermedial in a narrow sense. Sections of poetry can normally be seamlessly incorporated into scholarly articles (and vice versa) without modifying modality modes.

Thus, intermedial relations in a narrow sense are largely a question of ‘finding’ or identifying media borders between dissimilar basic media types. Intermedial relations in a broad sense are more a question of ‘inventing’ or construing media borders between dissimilar qualified media types based on similar basic media types. As the mechanisms for classifying media products into media types are anything but clear-cut, it is often not evident how to apply this seemingly straightforward distinction between different forms of media interrelations. However, the division of intermedial relations into a narrow and a broad sense offers a methodical way of considering the intricate nature of intermediality.
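The four classes distinguished in this section can be summarised as a small decision rule. The following Python sketch is only an illustration under strong simplifying assumptions: a basic media type is reduced to a set of modality-mode labels, a qualified media type to a name, and ‘similar’ is treated as strict equality, although the classification of media products is, as stated, anything but clear-cut. The class `MediaType`, the function `relations` and all mode labels are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaType:
    """Toy stand-in for a media type; every value is illustrative."""
    name: str         # label of the qualified media type
    modes: frozenset  # simplified modality modes, standing in for the basic media type


def relations(a: MediaType, b: MediaType) -> set:
    """Return the senses in which the relation between a and b is intra- or intermedial."""
    same_basic = a.modes == b.modes      # assumption: 'similar' means 'equal'
    same_qualified = a.name == b.name    # assumption: 'similar' means 'equal'
    labels = set()
    # Basic media borders: intramedial (broad) pairs with intermedial (narrow).
    labels.add("intramedial (broad)" if same_basic else "intermedial (narrow)")
    # Qualified media borders: intramedial (narrow) pairs with intermedial (broad).
    labels.add("intramedial (narrow)" if same_qualified else "intermedial (broad)")
    return labels


# Simplified mode sets; the semiotic labels are assumptions made for the example.
text_modes = frozenset({"solid", "flat", "visual", "static", "symbolic"})
image_modes = frozenset({"solid", "flat", "visual", "static", "iconic"})

poetry = MediaType("written poetry", text_modes)
article = MediaType("scholarly article", text_modes)
photo = MediaType("documentary photography", image_modes)

print(sorted(relations(poetry, article)))
print(sorted(relations(poetry, photo)))
```

Under these assumptions, written poetry and scholarly article come out as intermedial in a broad sense only, whereas written poetry and documentary photography are intermedial in both senses. The strict equality test is the weakest part of the sketch: it leaves no room for the border zones and historically shifting qualifications discussed above.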

7 What Are Media Integration, Media Transformation and Media Translation?

7.1 Heteromediality and Transmediality

Media interrelations are multifaceted. I now wish to add another viewpoint on media interrelations, to be placed on top of the ones already discussed. I suggest distinguishing between a synchronic and a diachronic perspective on media interrelations. Having a synchronic perspective means considering how media features appear at a certain moment. Having a diachronic perspective means considering how media features appear in relation to preceding and possibly subsequent media. Evidently, these two perspectives are analytical outlooks; I do not suggest using them to categorise media products. All media products can be investigated from both a synchronic and a diachronic perspective. While there is no doubt that certain media products are remarkably apt for diachronic analysis, no media products exist that cannot be treated in terms of diachronicity with some profit.

I propose calling the synchronic perspective on media interrelations heteromediality. With references to Mitchell (1994) and Elleström (2010), Jørgen Bruhn defined heteromediality as ‘the multimodal character of all media and, consequently, the a priori mixed character of all conceivable texts’ (2010: 229). I think this is an apt description of how media exist from a synchronic perspective. For me, the term ‘heteromediality’ refers to the general concept that all media products and media types, having partly similar and partly dissimilar basic presemiotic modes, overlap and can be described in terms of amalgamation of material properties and abilities for activating mental capacities that can be understood as various sign functions. This implies that media products and media types can only be properly understood in relation to each other. In my view, heteromediality, the synchronic perspective on media interrelations, is equally relevant for intra- and intermedial relations. It is the fundamental condition for mediality as such.

I also propose calling the diachronic perspective on media interrelations transmediality. Transmediality has been widely discussed and defined in various but fairly consistent ways. For instance, Irina O. Rajewsky circumscribed transmediality in terms of phenomena that are not media-specific, such as parody (Rajewsky 2002). I propose a very broad delineation of transmediality to match the comprehensive concept of heteromediality. For me, the term ‘transmediality’ refers to the general concept that media products and media types can, to some extent, mediate equivalent sensory configurations and represent similar objects (in Peirce’s sense of the notion); in other words, they may communicate comparable things (Elleström 2014b: 11–20). This means that there may be transfers in time among media. Even though multitudes of more or less different media products and media types are used, communication can be grasped as a succession of interconnected representations, chains of overlapping virtual spheres. Clearly, transmediality, the diachronic perspective on media interrelations, cannot be properly understood without profoundly comprehending heteromediality, the synchronic perspective on media interrelations. Like heteromediality, transmediality is relevant for both intramedial and intermedial relations. However, because of the complicated nature of media differences, transmediality in intermedial relations will be discussed separately and receive more attention. In these discussions, intermediality means intermediality in a narrow sense (relations among dissimilar basic media types), and intramediality means intramediality in a broad sense (relations among similar basic media types). This is because I want to focus specifically on the role of media modalities.

Heteromediality concerns the combination and integration of media products and basic or qualified media types. How can media be understood, analysed and compared in terms of the combination and integration of modality modes and qualifying aspects? This viewpoint emphasises an understanding of media as coexisting modality modes, media products and media types. Therefore, (intramedial and intermedial) heteromediality can also be called media integration.

Intermedial transmediality concerns transfer and transformation of media products and basic or qualified media types. How can the transfer and transformation of cognitive import represented by different forms of media be adequately comprehended and described? This viewpoint emphasises an understanding of media involving temporal gaps among modality modes, media products and media types—either actual gaps in terms of different times of genesis or gaps in the sense that the perceiver construes the import of a medium based on previously known media. Because media differences bring about inevitable transformations, intermedial transmediality can also be called media transformation.

Intramedial transmediality concerns the translation of media products and basic or qualified media types. I use the term ‘translation’ to adhere to the common idea that translation involves transfer of cognitive import among similar forms of media, such as translating written verbal language from Chinese to English. Therefore, intramedial transmediality can be broadly referred to as media translation.

7.2 Media Integration

As stated, the synchronic perspective on media interrelations, heteromediality, is foundational for comprehending mediality as such, and there is little point in distinguishing between intramedial and intermedial heteromediality. It is imperative to emphasise both the notion of combination and the notion of integration, stressing that sharing and combining media properties always entails integrating them to some degree. That is why I also refer to heteromediality as media integration. Compared to other intermediality scholars, I more strongly emphasise that there is a floating scale between combination and integration and avoid stricter divisions. For instance, Hans Lund made a heuristic distinction between three kinds of word–picture relations: combination, integration and transformation (1992 [1982]: 5–9). Claus Clüver distinguished between multimedia texts (separable texts), mixed-media texts (weakly integrated texts) and intermedia texts (fully integrated texts) (2007: 19).

The core of heteromediality consists of the multimodal character of media products, as explained in some detail in the earlier sections of this article. Every media product is made of a combination of media modality modes, generally including several modes from at least some of the modalities. Consequently, it is fair to say that media products consisting of many different modes are integrated or even mixed already as single media products, as Mitchell emphasised (1994). However, it is vital to note that media types are modally mixed or integrated in very different ways, allowing different kinds of media integrations with other media types composed of dissimilar modal mixtures.

Heteromediality also involves the combination and integration of different media products (that are already integrated on a more basic level). The circumstances under which a person is motivated to decide that she or he is dealing with ‘one’ media product rather than ‘several’ media products are rarely evident. Therefore, it may be that one and the same act of communication can be accurately analysed as consisting of one highly multimodal media product as well as of several thoroughly integrated media products. For instance, two people engaged in face-to-face communication both continually produce temporal, auditory and visual sensory configurations with a multitude of other modality modes, using their bodies and their immediate extensions, and perhaps other items, as technical media of display to realise a stream of communication. As the two minds give each other feedback, the continuous communication is nevertheless segmented in turn-taking to a certain degree; there are moments of relative silence or immobility on one side or another, after which something at least partly new is produced.

Although it may be impossible to determine exactly when or where one media product ends and another one begins, it is reasonable to think that each communicating mind in a case like this produces several media products rather than one. Similarly, one may experience that there is a certain autonomy in what one sees and hears. In other words, gestures and body language might (or might not) be perceived as media products that are not fully integrated with speech, because we are all familiar with hearing speech without seeing gestures and body language, and vice versa. However, these mental mechanisms of perceiving either single or several media products are certainly affected not only by the representing sensory configurations but also by the represented objects. The more successfully a single coherent virtual sphere is created, the more one is probably inclined to say that the media products are deeply integrated or actually constitute a single media product forming one perceptual gestalt. This means that, in each communicative situation, such as when one encounters a multitude of impressions during a lecture involving a variety of educational aids, it may be an open question whether one is guided by the disparity of material, spatiotemporal, sensorial or semiotic modes and feels that one encounters several combined and more or less integrated media products, or rather perceives a single total and highly multimodal media product. In any case, the heteromedial perspective offers theoretical tools for disentangling the interrelations.

Media types are categories of media products, which means that it may be an equally open question whether we are dealing with a weak combination or a strong integration of several basic or qualified media types, or in fact just a single highly multimodal, inclusive media type. This is because media categorisations are subjective and follow pragmatic communicative incitements rather than systematic rules. Nevertheless, it is clear that highly multimodal media types, compared to less multimodal media types, are more often perceived as combinations and integrations of several media types, most likely because one is used to experiencing and thinking of the various parts separately.

For instance, a qualified media type such as documentary photography can be said to be grounded on the basic media type materially solid, visual and flat still images. Similarly, a qualified media type such as animated cartoons for children might be said to be grounded on a single broad basic media type that is materially both solid and in gas form, spatiotemporally consisting of time and at least two spatial dimensions, sensorially audiovisual and semiotically dominated by icons as well as indices and symbols. However, it is probably more enlightening to think in terms of an integration of several basic media types (that can actually be perceptually separated). These are, on one hand, materially solid, visual and flat moving images and, on the other hand, what might briefly be described as auditory text (verbally symbolic, temporal sounds that are heard) and non-verbal sounds (iconic and indexical, temporal sounds that are heard).

Theatre, to take another example, potentially combines and integrates a multitude of basic media types; almost anything can be brought into a scene and made part of the performance. The aesthetic aspects of these combinations and integrations of basic media are part of how many people understand and define theatre as a qualified media type. Each basic medium has its own modal characteristics, and when combined and integrated according to certain communicative ambitions and expectations, the result is known as ‘theatre’. Theatre consists of different kinds of materialities—which are both profoundly spatial and temporal, appeal to both the eye and the ear and produce meaning by way of all kinds of signs—and it is contextually and operationally qualified in several ways. Therefore, theatre could be described as a profoundly multimodal qualified medium that is susceptible to intermedial analysis. It makes sense to say that it not only integrates several basic media, but also several qualified media; one may recognise parts of a theatre performance as, say, music, architecture, gesture, dance and speech. However, it might be an overstatement that ‘theatre is a hypermedium that incorporates all arts and media’ (Chapple and Kattenbelt 2006: 20; cf. Kattenbelt 2006: 32) because once the different media types are integrated, they become something else: the qualified medium of theatre.

To compare, one could argue that the pop song (here narrowly understood as something that one listens to without access to live performance) is a qualified medium that combines the two basic media types of auditory text (verbal symbols that are heard in a temporal flow) and auditory image (icons that are heard in a temporal flow). The consequences of combining and integrating these two basic media are not as far-reaching as the combination of several basic media in theatre. Auditory text and auditory images have the same materiality: sound waves that are taken in by the organs of hearing. Their way of being fundamentally temporal, but also to a certain degree spatial, is similar. The difference between auditory text and auditory image is clearly in the semiotic modality: whereas signification in auditory texts is mainly based on symbols and grounded on habits, signification in auditory images is mainly based on icons and grounded on similarity.

However, an unqualified combination and integration of these two basic media types is not enough to produce a pop song. Normally, both the auditory text and the auditory image need to have certain qualities that confer on them not only the value of ‘lyrics’ and ‘music’ but also of ‘pop lyrics’ and ‘pop music’. The qualities of qualified media types become even more qualified when aspects of qualified submedia types, or simply genres, are involved. We usually consider the lyrics produced by the singer to be music in themselves, as is the sound produced by the instruments. Consequently, the integration of the two basic media in a pop song is deep, since the two media types are virtually identical when it comes to three of the four modalities. Concerning the fourth modality, the semiotic, it is perfectly normal to integrate the symbolic and the iconic sign-processes in the interpretation of both lyrics and music. Whereas literary texts are generally more saliently symbolic, and music is generally more saliently iconic, the combination and integration of lyrics and music stimulates the perceiver to find iconic aspects in the text and to realise the symbolic facets of the music.

Compared to theatre, the basic media of pop songs are strongly integrated because of their identical sensory configurations, which may make it seem that they are actually based on one basic media type and constitute one qualified submedium rather than an integration of several submedia. On the contrary, because of its strongly multimodal character, theatre might be seen as comprising several integrated basic and qualified media types rather than just one.

7.3 Media Transformation

As noted, the diachronic perspective on media interrelations, transmediality, is relevant for both intermedial and intramedial relationships. It covers all kinds of actual and potential diachronic media interrelations. This goes beyond the general field of media history, that is, the study of how media types evolve throughout the centuries (we find this narrower sense of a diachronic perspective on media in, for instance, Rajewsky 2005: 46–47). Regarding the diachronic perspective on intermedial relationships among dissimilar media (which I comprehend here as intermedial in a narrow sense: relations among dissimilar basic media types), I find it imperative to emphasise both the notion of transfer, indicating that identifiable represented characteristics are actually or potentially relocated among media (the narrative of a comic strip can be clearly recognised in a movie), and the notion of transformation, stressing that transfers among different media always entail changes (the narrative in the movie can hardly be identical to the one in the comic strip). For the sake of brevity, however, I refer to this perspective simply as media transformation; thus, media transformation equals intermedial transmediality.

Just as a combination of media products and media types involves grades of integration, transfer of cognitive import among media products and media types involves transformation, to different degrees. The human body, a technical medium of display, perfectly realises a solo dance or a gesture. In order to communicate something similar to the dance or the gesture, the technical medium of a television screen will work quite well, a printed still image will do the job less well, and the sound emitted by a radio will only be able to realise media products that are radically altered, although they may still be able to create recognisable virtual spheres. This depends on the dissimilar modal capacities of the various technical media of display, suitable for realising different basic media types. Therefore, when the transfer of cognitive import among media is restricted by the modal capacities of the technical media of display, or when the technical media allow of modal expansion (in brief, when the transfer brings about more or less radical modal changes), it can be described as transformation.

More specifically, transmediality generally involves the idea that different media products (belonging to the same or dissimilar media types) may trigger the same or similar cognitive import; they may create the same or at least similar virtual spheres. Therefore, it is only a short step from the idea that virtual spheres may be transmedial, to varying degrees, to recognising that cognitive import can be transferred among similar or different kinds of media. When inserting a temporal perspective, it often makes sense to acknowledge not only that similar cognitive import is or may be signified by various media, but also that parts of or even whole virtual spheres that are similar enough to be recognised may recur after having appeared in another medium. Thus, transmediality involves actual or potential transfers of cognitive import not only among minds (which is the indispensable core of communication as such), but also among media, that is, among minds perceiving different media.

When describing how a media product is perceived and construed in prompting specific cognitive import forming a particular virtual sphere, it is convenient to simply refer to its characteristics. I used the term ‘compound media characteristics’ earlier to represent the concept that media products and media types bring into being individual or typical cognitive import that forms specific (types of) virtual spheres in the perceiver’s mind (Elleström 2014b). The term includes the word ‘compound’ to avoid mixing up the material, spatiotemporal and sensorial media traits that represent (the presemiotic modality modes) and the multifaceted characteristics that are represented. Therefore, it might be clearer to instead use the term ‘represented media characteristics’ or simply ‘media characteristics’, while recalling that ‘media characteristics’ refers to the represented cognitive import.

Represented media characteristics include everything that one might think of. They may be concrete or abstract and they may be conceived in terms of form or content: animals, persons, minds, structures, stories, rhythms, compositions, explanations, contrasts, themes, motifs, ideas, events, interrelations, moods and so forth. Some of the things and phenomena that media represent have material, spatiotemporal and sensorial traits. However, all things that media represent, in the broad sense of making them present to the perceiver’s mind, are media characteristics.

The advantage of sometimes using the term ‘represented media characteristics’ instead of simply ‘cognitive import’ is that it emphasises the specificity of what certain media products or media types represent. Certain media characteristics are attached to particular media products and some are attributed to particular basic and qualified media types. Ultimately, though, ‘represented media characteristics’ means the same as ‘specific cognitive import created by the perceiver’s mind in communication’. The point here is that represented media characteristics are more or less transmedial, meaning that they can be more or less successfully transferred among different media products or even different basic and qualified media types (Elleström 2014b: 39–45). This largely, but certainly not solely, depends on the present or absent modality modes of the involved media.

Returning to specifically intermedial transmediality, I distinguish between two forms of media transformation (intermedial transmediality). The first is transmediation (repeated representation of media characteristics by a different form of medium, such as a person orally communicating the same story as a computer game), and the second is media representation (representation of another medium of a different type, such as a written review that describes the performance of a piece of music).

Transmediation, in which another kind of medium again represents some media characteristics, can more precisely be described in terms of my previous distinction between intracommunicational and extracommunicational domains. The intracommunicational domain consists of the virtual sphere, that is, represented cognitive import. The extracommunicational domain consists of the perceived actual sphere and other virtual spheres: cognitive import stemming from previous representations in earlier communication. Transmediation occurs when already represented objects from other virtual spheres, created by other media types, become part of a virtual sphere; this is the same as saying that media characteristics are represented again by another form of medium. For instance, the people in a newspaper photograph or the visual actions in a film may be described by spoken words; a musical score may be performed by a musician; the oral statements of a witness may be written down; a story and characters in a theatrical play may be adapted to a movie; the gist of a scientific account may be rendered into a visual diagram; and written alphabetical text may be transformed to Braille writing. Even the recipe in a cookbook being realised as a meal communicating, for instance, affection, contrasts or the sense of a certain season of the year, can be understood in terms of transmediation.

Examples of media representation, in which a medium represents another medium of a different kind, are dialogues, gestures or photographs being heard and seen in a film; a scholarly treatise discussing media interrelations; pictures of drawings on a website; a song about love letters; and a written article in a magazine describing social media. If a written article in a magazine not only describes social media in general but also, say, events that have already been communicated on social media, we have media representation and transmediation. The two types of media transformation are not in any way mutually exclusive; on the contrary, they often coexist. Furthermore, they include not only transformations among specific media products but also among qualified media types and between media products and qualified media types. Filmic qualities in a written article in a magazine are a case of transmediation from the qualified medium of film to a specific media product. The artistic genre ekphrasis is generally defined as poems representing paintings, which is a case of qualified submedia representing other qualified media and normally includes transmediation of media characteristics from painting to poem.

I want to emphasise that it is not necessarily the technical medium of display that ‘forces’ the transformations in media transformation. Naturally, media transformations may also result from communicative choices to take advantage of the modal possibilities offered by the target medium. In the classical example of novels being adapted to films, modal differences between the two qualified media types clearly make it necessary to alter many things; however, transmediations of this kind also offer possibilities for creative choices and voluntary transformations that are desirable. In this case, transmediation can be seen as a possibility rather than a problem. In other cases, such as transmediations among statements, written reports and footage from surveillance cameras in criminal trials, transmediation is definitely a problem rather than a creative opportunity; judges rarely appreciate inventive new versions of earlier media characteristics.

Obviously, there are many kinds of media transformation. These sometimes involve fairly clear and complete relations between media products, such as when a particular newspaper article is evidently recognisable in its online version (albeit with fewer words and added animations and hyperlinks), or when a specific novel can be identified as the source of a feature film (although the narrative has been abridged and sound and visual iconicity have been added). It is sometimes rather a question of less definitive and fragmentary media characteristics that travel among media products and media types, such as when musical form is traced in a short story, when visual characteristics associated with comic strips can be said to have found their way to a television commercial, or when certain formal media characteristics of literature are transmediated to dance (cf. Aguiar and Queiroz 2015).

As demonstrated in the section on media borders, transfer of media characteristics over modal borders is often possible despite essential presemiotic and semiotic dissimilarities among media. This is not least because our brains have cross-modal abilities; they can make meaningful transmissions between, say, visual and auditory information, or spatial and temporal forms of presentation. This allows for media characteristics being more or less transmedial. Hence, the fact that there are fundamental or even essential media dissimilarities does not preclude shared representational capacities and the transfer of media characteristics among dissimilar media. Over thirty years ago, Dudley Andrew noted that in order to explain how different sign systems can represent entities that are approximately the same (such as narratives), ‘one must presume that the global signified of the original is separable from its text’ (Andrew 1984: 101). This is no doubt true, especially if one relativises the proposition and adds that the represented media characteristics are to some extent separable from the representing sensory configurations. Represented objects are ultimately cognitive entities in our minds, and these entities can be made present by different kinds of signs, although media differences will always ensure that they are not completely similar when represented again by another kind of medium.

7.4 Media Translation

Although I have discussed transmediality primarily within the frames of intermediality (in a narrow sense), the diachronic perspective on media interrelations is relevant also for intramedial relations (which I comprehend here as intramedial in a broad sense: relations among similar basic media types, which may actually involve dissimilar qualified media types). I refer to intramedial transmediality as media translation. I choose this term because ‘translation’ attaches to the common notion of translation as transfer among verbal languages. Hence, media translation is an extension of this idea to include transmediality among all forms of similar media types, not just media types based on verbal language. Much of what I have said about media transformation is also applicable to media translation, with the obvious difference that whereas media transformation involves dissimilar media types, media translation involves similar media types, which makes media translation somewhat less complicated to grasp. Nevertheless, basic media transformation categories such as transmediation and media representation have their equivalents in media translation. Intramedial transmediation would then include phenomena such as cover versions of pop songs, remakes of feature films, rephrased oral statements and translations of menus from Spanish to English. Intramedial media representation could include dinner talks mentioning any form of speech, paintings representing other paintings, television shows discussing television shows in general or specific television programs and news articles referring to themselves. However, a lengthy discussion of media translation would not add much to what I have already concluded regarding media transformation.

8 What Is the Conclusion?

How can media be circumscribed within the realm of communication and how can media interrelations be conceptualised? These questions have been at the heart of this article from start to end. The incompatibility of many of the suggested answers in the past is largely caused by the shifting approaches of different scholars and research traditions. Technological features, as well as modal and qualifying aspects, have been emphasised in diverse and often exclusive ways in the efforts to find slim and efficiently operable definitions of the concept of medium. Jürgen E. Müller emphasised this problem several decades ago (1996: 81–83). One alternative has been to lean on conceptions of media that are open-ended and mind triggering but difficult to handle analytically, such as McLuhan’s (1994 [1964]). The advantage of working with a set of entangled and complementary concepts—media product, technical medium of display, media modalities and modes and basic and qualified media types—is that such a conglomerate of concepts sets certain parameters at the same time as it incorporates most of the actual comprehensions of mediality. Therefore, I have tried to offer an array of interrelated analytical perspectives that may be used for careful analysis of media interrelations, without strictly compartmentalising media products and their interrelations.

Although I have provided a few detailed accounts of media and their interrelations, my overview requires a more exhaustive elaboration and exemplification. I have offered a model for understanding media and intermedial relations, and the point of models is precisely to put aside specific details to make possible a view that is more generally valid. Therefore, I hope that the model may also offer a starting point for methodical analyses in the service of various research questions attaching to mediality at large and more specifically media interrelations.

In a certain sense, the presented model is bottom-up in nature. Instead of beginning with a small selection of established media types and their traits and interrelations, which is the usual scholarly methodology, it is founded on observations of all kinds of media, leading to a broad but firm definition of the concept of media product and an explanation of media modalities that are shared by all media products and hence also media types. Hence, the conceptual framework can properly deal with any individual media product even if it is found outside of established and well-researched areas of communication. The model can also account for the plain but central fact that media products and media types are both similar and different. While there are four media modalities that underlie all conceivable media, each modality encloses several modes that vary among media products and media types. However, these modality modes are not always easily detectable properties; rather, they are found on a scale from physical traits to perception, cognition and interpretation.

The existence of several modality modes belonging to different media modalities means that the concept of media multimodality can be comprehended in various ways. In the broadest sense, a media product or a media type is multimodal if it combines, for instance, solid materiality, temporality, visuality and iconicity; in this respect, all media are definitely multimodal because they must be realised by at least one mode of each modality. In a more restricted sense, media multimodality means that a media product or media type includes several modes of the same modality. In this specific sense, there are material multimodality (multimateriality), spatiotemporal multimodality (multispatiotemporality), sensorial multimodality (multisensoriality) and semiotic multimodality (multisemioticity). Considering this narrower sense of multimodality, all media are at least slightly multimodal because the modality modes are generally either overlapping or mutually dependent in complex ways that I have only hinted at.

However, I have demonstrated in more detail the ways in which the concepts of media products, technical media of display, media modalities, modality modes, multimodality and basic and qualified media types make it possible to properly delineate concepts such as mediality, media borders, intramediality, intermediality, heteromediality and transmediality. Taking the intricacy of the many aspects of mediality into account, intermediality could actually be described as ‘media intermultimodality’. As argued, I think it is worth viewing intermediality as a complex set of relations among media that are more or less multimodal in various ways, although I hesitate to use the cumbersome term ‘media intermultimodality’. Nevertheless, the concept that it stands for has proven fertile (see Lavender 2014).

Multimodality is vital for mediality, and although an intramedial perspective is necessary for understanding many communicative phenomena, an intermedial perspective is essential for grasping the intricate field of mediality at large—because crossing media borders is the rule rather than the exception in communication. Because of their ubiquity and complexity, I do not think it is possible to circumscribe a specific corpus of multimodal media products or intermedial relations, although I find many of the scholarly systems of intermedial ‘works’ and ‘relations’ valuable (cf. the enlightening overview of intermedial positions and issues in Rajewsky 2005). Intermedial relations can only be pinned down to a certain extent and intermedial analysis cannot live without its twin sister, intermedial interpretation.

While intermediality is certainly about specific intermedial relations, it is also, and perhaps primarily, about studying all kinds of media with an awareness of media differences and similarities. As stressed by Jørgen Bruhn (2010), what makes intermedial studies important is that they offer insights into the nature of all media, not only a selection of peripheral media. Although the objects of intermedial studies may well be, for instance, media that have been categorised as ‘intermedial’ or ‘multimodal’, they may also be what have been taken to be (for the moment) ‘normal’ media. The outcome of the studies depends less on the objects of investigation than on the way the studies are performed. The ambition of the model that I have here outlined, first presented in an initial form a decade ago (Elleström 2010), is that it continues to offer helpful tools for careful analysis and interpretation of all forms of media interrelations, regardless of the inducements and goals of the investigations.