Is there an optimum of realism in computer-generated instructional visualizations?

Skulmowski, Alexander

doi:10.1007/s10639-022-11043-2

Open access
Published: 18 April 2022

Volume 27, pages 10309–10326, (2022)

Education and Information Technologies

Alexander Skulmowski (ORCID: orcid.org/0000-0002-1682-021X)

Abstract

Realistic visualizations are abundantly used in digital education. However, the use of realism is still thought to risk a cognitive overload due to excessive details. Moreover, it is still not precisely known whether there is an optimal level of realism that benefits learners the most. In two experimental studies, different versions of anatomical visualizations were compared regarding their effects on retention performance and the subjective cognitive load experienced during learning. In Experiment 1 (n = 73), four visualizations with minor variations in the geometry and shading of the model featured in the visualizations were used. Although neither the level of detail in the geometry nor the realism of the shading resulted in significant differences, a detailed model with simplified shading elicited the highest retention scores descriptively. In Experiment 2 (n = 156), a schematic visualization was compared with an “idealized” model featuring only simplified shading and a highly realistic rendering. The most realistic version elicited the highest retention scores, but also the highest cognitive load ratings. Taken together, the results suggest that the optimal level of realism might lie on the more realistic end of the spectrum for learning tasks focused on the memorization of shapes that are assessed using image-based tests.

1 Introduction

How realistic and detailed should visualizations be to enable effective learning? This question has been a rather controversial issue for the field of digital education. While some claim that realistic computer-generated visualizations contribute little towards performance or can even turn out to be a burden (Scheiter et al., 2009; Smallman & St. John, 2005), some recent results suggest that at least under certain circumstances, realism can be of value to learners (e.g., Skulmowski, 2022; Skulmowski & Rey, 2021). However, current technology provides us with the necessary tools to create visualizations with a wide array of options concerning the level of realism. Only a few investigations into the effects of different realism degrees have been published (e.g., Brucker et al., 2014; Huk et al., 2010; Imhof et al., 2011; Skulmowski, 2022; Skulmowski & Rey, 2021). This lack of research poses a problem for instructors wishing to use computer-generated visualizations, as it is hard to estimate the effectiveness of (different) realistic visualizations beforehand and knowledge regarding the optimal level of realism could facilitate instructional design using such material. In order to provide insights into the effects of realism levels in visualizations on learning and cognitive load, two experimental studies were conducted on the basis of a fine-grained model of realism in computer graphics described in the following section.

1.1 Geometry, shading, and realism as the basis of computer-generated imagery

Although different definitions and classifications of pictorial realism have been proposed over the years, many of them do not allow a comprehensive description of the possibilities that modern software used in the creation of three-dimensional (3D) computer-generated imagery offers (Skulmowski et al., 2021). Such software typically requires users to follow a certain series of steps when creating digital visualizations. The main components of content creation have been summarized in the geometry, shading, rendering (GSR) model (Skulmowski et al., 2021) described in the rest of this paragraph. First, models need to be created as virtual counterparts of real objects to populate the 3D scene. These models are created from polygonal structures either by drawing the individual polygons one at a time, from simple primitives that can be refined, or using automated methods such as 3D scanning or photogrammetry (Nebel et al., 2020; Skulmowski et al., 2021). Either way, the end result is a geometrical mesh with more or less polygonal detail. Thus, geometry is the first factor to consider as a contributor to realism (Skulmowski et al., 2021). This geometry can then receive material properties such as color, highlights, and bumps in the shading stage (Skulmowski et al., 2021). Lastly, virtual lights and a camera need to be positioned in the scene that determine how the models will be rendered. In addition, there are several rendering options, such as generating drawing-like contours. Different realism degrees that can be achieved using the individual dimensions of the GSR model are shown in Fig. 1. Naturally, the question arises whether these differences in realism will have an impact on learners. In the following section, previous research on the effects of varying these dimensions to create more or less realistic instructional visualizations is presented.

1.2 Research on the effects of different levels of realism

The most comprehensive comparisons between visualizations featuring different levels of realism have been conducted by Dwyer (e.g., Dwyer, 1968a, b) using material such as photographs of organs, of plastic models, drawings featuring a low or high level of detail, and other forms of visualization. Several of these studies were designed to test the notion of a "realism continuum," i.e., a hypothesized link between the degree of realism and learning performance (e.g., Dwyer, 1967, 1969). Dwyer (1969) concludes that this supposed correlation between realism and learning performance cannot be found empirically. However, Dwyer (1969) found some benefits for more realistic visualizations in image-based tests while other tests did not reveal advantages of realism (for discussions of this aspect, see Nebel et al., 2020; Skulmowski et al., 2021).

Surprisingly, only a few studies investigating the effects of controlled variations of realism on learning using computer-generated imagery are currently available. Most studies only contrast two levels of realism, usually a “schematic” version featuring a contour outline filled with solid colors, often including minimal shading, and a “realistic” version with details, accurate materials, and believable rendering (e.g., Huk et al., 2010; Menendez et al., 2020, 2022; Scheiter et al., 2009; Skulmowski, 2022; Skulmowski & Rey, 2020, 2021). As summarized by Skulmowski et al. (2021), we can broadly distinguish between different learning objectives in instructional realism research: (1) knowledge regarding surfaces, (2) understanding processes, and (3) applying abstract knowledge. From a brief overview of selected studies, Skulmowski et al. (2021) conclude that, as a general rule, realism may be most useful for gaining knowledge of shapes (i.e., in tasks such as anatomy learning), without a clear effect pattern when learning about processes, and often with negative effects on acquiring abstract knowledge. As the focus of this paper lies on the learning of anatomical shapes, the remainder of this section will be dealing with this aspect.

One effect that was found in a study on the usefulness of realistic visualizations in relation to the level of realism used in retention tests is that learning with a detailed visualization, compared to a schematic drawing-like image, appears to be particularly beneficial when an equally realistic image is used in the test (Skulmowski & Rey, 2021). The results of that study were interpreted to indicate that learning with a realistic visualization may only pay off if learners need to apply their knowledge to a task involving realistic visuals (Skulmowski & Rey, 2021).

One of the few studies using more than two levels of realism was conducted by King (1986). In that study, participants of various age groups used visualizations consisting of abstract shapes, simplified drawings, or a mixture of photographs and more detailed drawings. The stimuli utilized for the learning tasks either showed a person, an animal, or an object. While retention performance was higher for the stimuli presented in the styles at the ends of the realism spectrum (abstract and realistic) in an immediate test, performance was highest for the images presented in a medium level of realism in a delayed test after one week.

A systematic investigation of the effects of 3D visualizations on credibility was published by Zanola et al. (2009). In their study, a sketch-like rendering of a city, a realistic rendering with plain geometry and simple shading, and a view of that city with more detailed geometry and shading (including small shapes such as windows in houses) were presented. The study revealed that the more realistic visualizations elicited higher credibility ratings. However, learning performance was not a part of that investigation.

In sum, the existing research on the effects of different levels of realism does not seem to provide a clear answer to the question of whether there is an optimal degree of realism. Judging from the reviewed literature, a higher degree of realism does not appear to be detrimental, but certain conditions need to be met in order for realistic visualizations to unleash their full potential for learning. Furthermore, the existing literature does not allow us to pinpoint a specific level of realism as the optimum.

1.3 Cognitive load and the realism paradox

Negative effects of realism have been explained in reference to an assumed or measured cognitive load that the demands of realistic details are thought to entail (e.g., Scheiter et al., 2009; for a discussion, see Skulmowski et al., 2021). Many studies were conducted in the theoretical framework of cognitive load theory (Sweller et al., 1998, 2019). This theory divides the information that learners are presented with into an intrinsic and an extraneous component (Sweller et al., 2019). For effective learning to take place, the aim is to enable learners to devote as much of their cognitive capacity to the intrinsic cognitive load, the actual content they need to learn (Sweller et al., 1998). Poor instructional design can hinder learning through filling learners’ cognitive capacity with extraneous cognitive load (Sweller et al., 1998) and the design of instructional visualizations is particularly prone to such problems (for an overview, see Renkl & Scheiter, 2017). In the cognitive model of learning with realistic visualizations, the GSR components of realistic visualizations are thought to contribute towards a perceptual load, consisting of demands on learners to work with very detailed visualizations containing many visual elements (Skulmowski et al., 2021). Does the solution to the aforementioned issues arising from realism lie in simply removing details that are deemed “unnecessary”? The answer to this question is not as straightforward as one would hope. Recent studies revealed that the relationship between realism and cognitive load cannot be adequately described as a simple negative correlation. For instance, it was found that when schematic and realistic components are combined in a display, some realistic visualizations can raise the overall extraneous load, but the retention performance for the realistic parts can turn out to be higher as well (Skulmowski & Rey, 2020; see also Koc & Topu, 2022, for a related finding on high cognitive load during learning with 3D visualizations). This counterintuitive finding has been named the realism paradox (Skulmowski & Rey, 2020) and has raised concerns over the explanatory power of (extraneous) cognitive load in the context of digital learning (Skulmowski & Xu, 2022). Another recent study found that in a virtual reality learning task, extraneous cognitive load was positively correlated with learning results (Tugtekin & Odabasi, 2022), offering additional evidence for the claim that in virtual and realistic environments, high perceptual demands may be perceived as demanding, but that these environments can still be beneficial for learning. It has been argued that the effects of realism need to be analyzed with the desired impact on cognitive processing and the mode of assessment in mind (Skulmowski & Xu, 2022). In some cases, letting learners invest more effort may actually be a better preparation for a later test than oversimplifying the learning task (Skulmowski, 2021; Skulmowski & Xu, 2022).

As an interim summary, it is generally assumed that unnecessary cognitive load that may be introduced by irrelevant details in realistic visualizations can be an obstacle for learners. However, some studies imply that lowering cognitive load may not be the ideal strategy for the design of realistic visualizations, as it may be the case that the higher cognitive demands of realistic visualizations actually have a positive effect on retention performance.

1.4 Idealization as the best of both worlds?

One aspect of realism that is in need of a closer examination is the perceptual load associated with it. In a number of papers, realism has been analyzed in terms of geon theory (Biederman, 1985, 1987; for discussions in the context of realistic visualizations, see Nebel et al., 2020; Skulmowski et al., 2021; Skulmowski & Rey, 2018). Geon theory holds that one of the steps in human visual perception is to mentally simplify the visual information in the field of view by treating objects not as the highly complex structures they may be, but rather as (a combination of) geometric primitives, such as boxes and cylinders (Biederman, 1985). Thus, a tree can be visually processed as a brown cylinder with a (slightly deformed) green sphere on top when seen from afar. In this stage, details such as the ridges in the tree bark or discolorations of individual leaves are not (yet) focused on. Several educational fields attempt to use this aspect of visual perception to their advantage. For instance, several books on learning to draw, in particular those concerned with (artistic) anatomy, use geometric primitives as proxies for the complex shapes of the human body. A particularly noteworthy example of this strategy is the work of George Bridgman (e.g., Bridgman, 1973), in whose instructional books parts of the body such as the arms, legs, and the neck are presented as simple geometric shapes. These basic shapes can be rather easily drawn in correct perspective and then serve as the base for further refinement, for example by subdividing the cylindrical shapes of the arms into smaller primitives approximating the forms of their muscles. Based on the popularity of this approach in art instruction, segmenting the shapes of objects into more idealized and prototypical shapes may be a promising approach for the design of computer-generated instructional visualizations.

The use of idealized 3D forms as a potential optimum between too simplified and too detailed instructional visualizations has not yet been studied extensively using computer-generated visualizations. Somewhat related comparisons have been undertaken by Dwyer (1968a) when drawings and photographs of the heart were compared with photographs of a plastic model, featuring an idealized, smooth shape without irregularities that may confuse novice learners. The results of that study do not indicate a clear advantage of the different visualization types over a mere oral presentation across all tests. However, in an identification test, a detailed shaded drawing elicited higher scores than the photographs of the model. By contrast, Dwyer (1969) found in a related study that learning using photographs of a plastic model was more effective in an identification test than learning with drawings or photographs of real anatomical structures. As these results do not offer clear guidance on the issue of idealization in visualizations and since the studies could not make use of computer-generated imagery yet, there is a gap concerning this aspect in the literature.

In sum, an idealized mode of visualization that uses simplified 3D models while avoiding the use of too many irregular details on the one side and a too abstract, drawing-like presentation on the other could prove to be the optimal way of designing 3D instructional visualizations and should be investigated empirically.

1.5 The present studies

Based on the described theoretical models and empirical results, two experiments were conducted. The first study was designed to assess whether variations in the geometry and shading of a model lead to differences in retention performance and subjective (extraneous) cognitive load. The second study compares an idealized rendering and a more detailed realistic version with a schematic visualization.

2 Experiment 1

Previous studies suggest that learners can be overburdened by irrelevant details (e.g., Scheiter et al., 2009), but since other studies demonstrated that an oversimplification may not be a good alternative (e.g., Skulmowski, 2021; Skulmowski & Rey, 2018), an experiment was conducted to assess whether a medium level of realism can prove effective for learning an anatomical structure. In this study, the two components geometry and shading of the GSR model are experimentally manipulated. The visualization used in the study featured combinations of an idealized or a detailed geometry, combined with either simplified or realistic shading. An interaction effect was hypothesized in which the two “medium” combinations (idealized geometry and realistic shading; detailed geometry and simplified shading) should lead to better retention scores than the two extreme conditions featuring a strongly simplified or a highly detailed rendering. Furthermore, the study assessed whether an idealized geometry and a simplified shading can enhance learning. Concerning (extraneous) cognitive load, based on the realism paradox presented above (Skulmowski & Rey, 2020), it was hypothesized that a higher level of realism in both factors leads to higher cognitive load scores for each factor.

2.1 Methods

Participants and design

Based on an assumed effect size of η_p² = 0.10 (in line with previous research finding even larger effects, e.g., Skulmowski & Rey, 2021) and a power of 0.80, a target sample size of 73 participants was computed using G*Power (Version 3.1.9.2; Faul et al., 2009). The 2 × 2 between-subjects design consists of the factors shape (idealized vs. detailed) and shading (simplified vs. realistic).
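The sample size computation can be approximated outside G*Power. The following is a minimal sketch, not the author's procedure: it assumes a fixed-effects ANOVA test of a single effect with one numerator degree of freedom, four groups, and α = 0.05 (none of which are stated explicitly above), converts η_p² into Cohen's f² = η_p² / (1 − η_p²), and uses the noncentrality parameter λ = f² · N. The function name `required_n` is hypothetical.

```python
from scipy import stats


def required_n(eta_p2, power=0.80, alpha=0.05, df_num=1, n_groups=4):
    """Smallest total N whose power for one ANOVA effect reaches `power`.

    Assumptions (not taken from the article): alpha = 0.05, one numerator
    degree of freedom, and noncentrality lambda = f^2 * N.
    """
    f2 = eta_p2 / (1.0 - eta_p2)  # Cohen's f squared from partial eta squared
    n = n_groups + 2              # start with a positive denominator df
    while True:
        df_den = n - n_groups
        # Critical F value under the null hypothesis
        crit = stats.f.ppf(1.0 - alpha, df_num, df_den)
        # Power = probability of exceeding it under the noncentral F
        achieved = stats.ncf.sf(crit, df_num, df_den, f2 * n)
        if achieved >= power:
            return n, achieved
        n += 1


if __name__ == "__main__":
    n, achieved = required_n(0.10)
    print(f"total N = {n}, achieved power = {achieved:.3f}")
```

Under these assumptions, the search should end at or very close to the target of 73 participants reported above; a different α or test family would shift the result.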

In order to be included in the final analysis, participants needed to fulfill certain criteria. As described below in detail, participants were asked regarding the participation requirements at the beginning of the web-based study. They were eligible to enter the regular data collection if they passed these requirements. At the end of the study, they were asked two quality control questions that needed to be answered appropriately (see below for an explanation) in order for their participation to be counted as completed and they also needed to reach the last page of the study. A total of 75 eligible participants completed the study before data collection was stopped. Only the data of the originally planned 73 participants was considered in the analyses, although the pattern does not change with the additional two participants included.

In the study, 62 female and 11 male students aged between 18 and 30 years participated for partial course credit in a lecture on Digital Education held at a university of education in Germany. This lecture was open to students in the teacher training courses at the undergraduate and graduate level. Randomization to one of the four experimental groups was achieved through block randomization and resulted in nearly equal group sizes (n_{Group 1} [idealized shape + simplified shading] = 17, n_{Group 2} [idealized shape + realistic shading] = 18, n_{Group 3} [detailed shape + simplified shading] = 19, n_{Group 4} [detailed shape + realistic shading] = 19).
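Block randomization over the four cells of the 2 × 2 design can be illustrated as follows. This is a sketch under assumptions: the block size (one block per full set of four conditions), the seed handling, and the function name `block_randomize` are illustrative choices and are not taken from the study, which was run in SoSci Survey.

```python
import random
from collections import Counter
from itertools import product

# The two between-subjects factors of Experiment 1
SHAPES = ("idealized", "detailed")
SHADINGS = ("simplified", "realistic")
CONDITIONS = list(product(SHAPES, SHADINGS))  # four cells


def block_randomize(n_participants, seed=None):
    """Assign participants to conditions in shuffled blocks of four.

    Each block contains every condition exactly once, so group sizes
    among assigned participants never differ by more than one.
    """
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        block = CONDITIONS[:]   # fresh copy of all four cells
        rng.shuffle(block)      # random order within the block
        assignments.extend(block)
    return assignments[:n_participants]


if __name__ == "__main__":
    counts = Counter(block_randomize(73, seed=1))
    for condition, size in counts.items():
        print(condition, size)
```

In this sketch, group sizes differ by at most one. The slightly larger spread in the reported group sizes would be consistent with assignment taking place before some participants dropped out or were excluded, although the text does not spell this out.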

Materials

The study utilized four different visualizations of the lung of which one was presented to each participant during the learning stage (see Fig. 2). Blender (Version 3.0.0) was used for the creation of all renderings in both experiments. The model featured a smooth and idealized geometry without smaller details in two of the conditions (see Fig. 2a, b) that was contrasted with a bumpy and irregular, more detailed geometry (see Fig. 2c, d). This model was either presented with a solid and simplified shading with a plastic-like surface (see Fig. 2a, c) or a realistic shading involving several material layers (such as a more irregularly colored texture, glossiness, and bump mapping) that can be seen in Fig. 2b and d. Based on the idea that a schematic visualization was the option that would lead to the least biased result, a drawing-like visualization with minimal shading and a contour outline was produced for the learning test (see Fig. 2e). The idea behind this choice was that this style was sufficiently different from all four versions shown during the learning phase that it would not give a specific advantage to one of the four groups based solely on the realism level of the visualization used for testing (see also Skulmowski & Rey, 2018). For each correct answer, participants were awarded one point for their total score, with a maximum of 18. The retention test had a reliability of McDonald’s ω = 0.77.

During the study, the extraneous cognitive load survey items developed by Klepsch et al. (2017) were used in a modified form (as in Skulmowski & Rey, 2020) that asked participants regarding their learning experience using the visualizations rather than the entire learning task as in the original items. The survey had a reliability of ω = 0.87.

Procedure

The procedure of both experiments in this paper was similar to the one described by Skulmowski and Rey (2020). After providing informed consent, a page was presented on which participants were asked to respond to questions regarding their age range, their prior knowledge, their native language, their currently used device, and whether they had participated in the study before. Only if they were native speakers of German, had no or little knowledge of lung anatomy, used a PC or laptop rather than a device with a small screen to participate, and had not already participated could they enter the regular study. They then received the instruction that they would be presented with visualizations of the left and right lung and had 60 s to memorize the names, shapes, and locations of the parts. Then, they were directed to a filler task in which they were asked to sort the 16 German federal states according to their number of day schools within the time limit of 60 s. The following retention test consisted of a page in which the left and right lung were displayed, labeled with letters (see Fig. 2e). Below the two images, participants were asked to select the correct part name for each lettered component from drop-down menus. As there were labeled parts in the test images that had not been presented with a label in the learning phase, participants were asked to select the option “NOT LEARNED” for these components. There was no time limit for this task and participants were reminded not to use additional resources for answering the tests. As the test was presented relatively soon after the learning phase, the test is mainly focused on short-term memory. Following the test page, they were asked to indicate their gender and course of study along with two quality control questions. These questions asked participants whether they were strongly distracted during the learning task and whether they experienced a major technical difficulty (as in Skulmowski & Rey, 2020). Both of these questions needed to be answered with a negative response in order to proceed to the final pages and in order for datasets to count as complete. Both studies in this article were conducted using SoSci Survey (Leiner, 2021).
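The scoring rule of the retention test (one point per correct drop-down answer) can be sketched as below. The part names and letters are hypothetical placeholders, and the sketch assumes that correctly choosing “NOT LEARNED” for a component that was unlabeled during learning also earns a point; the text does not state this explicitly.

```python
NOT_LEARNED = "NOT LEARNED"


def score_retention(responses, answer_key):
    """Return the number of lettered components answered correctly.

    `responses` and `answer_key` map component letters to part names.
    Missing responses simply earn no point.
    """
    return sum(
        1
        for letter, correct in answer_key.items()
        if responses.get(letter) == correct
    )


if __name__ == "__main__":
    # Hypothetical three-item excerpt; the actual test had 18 items.
    key = {"A": "Trachea", "B": "Upper lobe", "C": NOT_LEARNED}
    answers = {"A": "Trachea", "B": "Lower lobe", "C": NOT_LEARNED}
    print(score_retention(answers, key))
```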

2.2 Results

In the analyses of both studies, the assumptions of analysis of variance (ANOVA) procedures were tested using the Shapiro–Wilk test (calculated on the model residuals) and Levene’s test. If one or more of these assumptions were violated as indicated by a significant deviation, a nonparametric ANOVA was conducted using aligned rank transformation (Fawcett & Salter, 1984) instead of a parametric ANOVA.
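The decision rule described above can be sketched with SciPy. This is an illustration under assumptions rather than the author's analysis script: residuals are taken as deviations from the cell means of the full factorial model, α = 0.05 is assumed, and SciPy's `levene` defaults to centering on the median, which may differ from the variant used in the article. The aligned rank transformation itself is not implemented here.

```python
from statistics import mean

from scipy import stats


def needs_nonparametric(cells, alpha=0.05):
    """Return True if normality of residuals or variance homogeneity fails.

    `cells` maps a condition label to the list of scores in that cell.
    """
    # Residuals of the full factorial model: score minus its cell mean
    residuals = [
        score - mean(scores)
        for scores in cells.values()
        for score in scores
    ]
    _, shapiro_p = stats.shapiro(residuals)        # normality of residuals
    _, levene_p = stats.levene(*cells.values())    # homogeneity of variance
    return bool(shapiro_p < alpha or levene_p < alpha)


if __name__ == "__main__":
    # Made-up scores for illustration only
    cells = {
        "idealized/simplified": [4, 5, 5, 6, 6, 6, 7, 7, 8, 5],
        "idealized/realistic": [5, 6, 6, 7, 7, 7, 8, 8, 9, 6],
        "detailed/simplified": [6, 7, 7, 8, 8, 8, 9, 9, 10, 7],
        "detailed/realistic": [5, 5, 5, 5, 5, 5, 5, 5, 5, 60],
    }
    print(needs_nonparametric(cells))
```

If the function returns True, the procedure described above would switch from the parametric ANOVA to the aligned rank transformation variant.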

Extraneous load

As a Shapiro–Wilk test assessing the normality of the residuals of the parametric ANOVA of the extraneous cognitive load data indicated a violation of this assumption, a nonparametric ANOVA was conducted. No significant main effect or interaction was found, all ps ≥ 0.269 (see Fig. 3a for the untransformed data).

Retention

An ANOVA computed using the retention scores did not result in significant main effects or an interaction, all ps ≥ 0.390 (see Fig. 3b).

3 Experiment 2

The first study did not reveal the hypothesized advantage of a medium level of realism. However, a few tendencies could be observed on the descriptive level of the data. In line with previous research (Skulmowski & Rey, 2020), a higher level of realism in the shape and shading dimensions resulted in slight increases in extraneous cognitive load. Concerning retention performance, the more detailed shape combined with simplified shading elicited the highest average retention score among all four combinations. This descriptive result could be taken as an indication that certain combinations of the GSR dimensions in the middle of the realism spectrum could indeed be favorable. In addition, the benefits of a higher level of realism may have become clearer if a realistic visualization had been used in the test (Skulmowski & Rey, 2021). In order to achieve more definitive results, a second study was conducted using visualizations with even stronger differences concerning their level of realism. Furthermore, the learning test in this study utilized a realistic visualization rather than a schematic drawing as in the first experiment. It was hypothesized that an idealized realistic visualization and a detailed realistic rendering result in higher extraneous cognitive load ratings than a schematic visualization. Concerning retention performance, the idealized and realistic version should lead to higher scores than a schematic version.