According to YouTube, more than one billion learning-related videos are viewed every day (YouTube, 2021), and increasing numbers of specialized learning channels are published on the platform (e.g., Tadbier & Shoufan, 2021). Given the magnitude of videos about virtually everything found online and the increasing number of Internet-connected devices in classrooms, it is unsurprising that online videos are widely used in both formal and informal learning (e.g., Bergdahl et al., 2020; Dyosi & Hattingh, 2018; European Commission, 2019; Fleck et al., 2014; Rosenthal, 2018). A representative survey of more than 800 German adolescents (Rat für Kulturelle Bildung, 2019) found that approximately 50% of students considered YouTube “important” or “very important” for their schoolwork. These students reported that they used YouTube to review learning content not understood in the classroom, find information and explanations when completing homework, prepare for tests and exams, and as a general source of knowledge.

Among different types of online learning videos, video blogs, also known as vlogs, are channels on video-sharing websites in which people regularly publish streaming videos presenting some information. The vlog creators are known as vloggers or YouTubers, as they mostly publish their vlogs on YouTube. Vlogs provide information on a wide range of content, from bloggers’ personal lives to fashion trends, news, reviews of commercial products or political opinions. There is also an emerging community of YouTubers intending to spread knowledge related to educational subjects, such as academic topics (e.g., physics, informatics, psychology), cultural content (e.g., book reviews), or procedural knowledge (also known as tutorials, e.g., how to repair an electronic device or use a particular software program). Vlogs, including those focused on educational issues, are extremely popular, particularly among young people, and they represent a large share of all videos on YouTube. Their popularity may be because they simulate face-to-face interaction, foster direct feedback through comments, favor consumer engagement, and provide room for criticism and emotional communication (Burgess & Green, 2018).

The current study aims to examine the extent to which video blogs can be efficient learning materials for secondary school students. The remainder of the manuscript is organized as follows. First, we review the literature on the use of videos as learning materials, with a particular focus on online videos from video-streaming websites such as YouTube. Second, based on the shallowing hypothesis (Annisette & Lafreniere, 2017), we review previous studies on the detrimental effect of digital materials on learning. Third, we discuss the benefits of notetaking as a learning technique. Finally, we present an experimental study designed to test our hypotheses in a sample of secondary school students (14-15 years old) and discuss the results and their educational implications.

1 Students’ comprehension of educational streaming videos

Although educational streaming videos are a moderately novel way of communicating information, the use of videos for learning is definitely not new. The idea of using instructional films for learning is nearly as old as the history of motion pictures itself and has been examined by researchers since the early 1920s (Wehberg, 1938). However, today’s highly digitalized and Internet-connected classroom has changed the use of videos for learning, from teacher-controlled one-way activities delivered through linear TV, VHS and DVDs with the students as passive “recipients” of information to the use of self-selected, interactive and user-generated videos from the online video-sharing platform YouTube or other more specialized digital pedagogical platforms (Köster, 2018). The interactive features of online videos, such as those found on YouTube, seem to be important for student learning. One line of research demonstrates better learning from reading print compared to watching video, but this research has typically presented videos in broadcast mode with no possibility for the viewer to pause or review parts of the video (Hardway et al., 2018; Merkt et al., 2011). Today’s online video platforms offer the viewer a greater possibility to control the presentation (e.g., pause, review, fast forward), making it more comparable to reading, where the reader can reread sentences and paragraphs, search for keywords, and skim part of the text. Hence, in newer studies where learners have this type of control in the instructional video condition, no difference in comprehension outcome has been found between video and reading (e.g., Burin et al., 2021; List, 2018; Merkt et al., 2011; Salmerón et al., 2020; Tarchi et al., 2021). In their recent review, Noetel et al. (2021) found that combining viewer-controlled videos with traditional teaching resulted in strong improvements in students’ learning.

The availability of the burgeoning amount of information that platforms such as YouTube provide and the popularity of vlogs among youngsters is arguably important for the extensive use of videos in students’ learning. However, other advantages of the use of videos for learning have been proposed. One such advantage is that learning from videos, compared to reading, does not require the same range of basic language skills to comprehend the learning content (e.g., Schwan, 2017). Hence, videos might give the learner a greater possibility to allocate attention and processing capacity to the content itself, rather than the reading processes required to extract and construct meaning from text (e.g., word decoding). Hence, it has been suggested that videos (and other types of multimedia materials) can provide comprehension support, particularly for struggling students (Castek et al., 2009; Henry et al., 2012).

Although online videos have potential as materials for formal learning, communicating information through them may also come at a price because students consume Internet content mainly for entertainment purposes (Malamud et al., 2019). Early research suggested that learners’ (incorrect) perception of video as less demanding than text could influence the effort and depth of their processing and, consequently, their learning (e.g., Kozma, 1991; Salomon, 1984). These early studies also indicated that processing was particularly desultory when watching videos for entertainment purposes (e.g., Krendl & Watkins, 1983). Today, students are extensive consumers of online videos as entertainment, and the algorithmic system that underlies YouTube secures a steady flow of personalized content. According to Lupinacci (2021), our interaction with this never-ending stream of recommendations and suggestions on YouTube (and other social media platforms) can favor a mindset where one scrolls through digital content without paying much attention. The possible consequences of such trifling processing of digital media for learning have been addressed by the shallowing hypothesis (Annisette & Lafreniere, 2017). This hypothesis suggests that people typically process digital learning materials more shallowly or superficially because their use of digital media, often involving quick interactions driven by immediate rewards, promotes a habit of mind that is nonbeneficial for performing more challenging tasks that require sustained attention, such as deep learning. This phenomenon could affect learners’ metacognitive monitoring of their own level of comprehension. In this regard, student calibration is the most studied of the measures reflecting the use of metacognitive skills when learning.

2 Students’ processing of educational streaming videos

Metacognitive calibration can be defined as the outcome of self-regulated learning processes that represents the accuracy of the evaluation of one’s performance (Pieschl, 2009). Previous research has generally found that students tend to be overconfident of their performance (Stone, 2000). In a series of studies comparing undergraduates’ reading of identical printed and digital texts, Ackerman and colleagues (e.g., Ackerman & Goldsmith, 2011; Ackerman & Lauterman, 2012) examined not only text comprehension outcomes but also students’ metacognitive calibration of their level of comprehension. For example, Ackerman and Goldsmith (2011) asked participants to predict their performance on subsequent comprehension tests after reading some texts. They found that when reading digital texts, students tended to overestimate their comprehension to a greater extent than when reading printed versions (Exp. 1). As a likely consequence of this overestimation, students also spent less time reading and achieved poorer comprehension when reading digital texts (Exp. 2). More inaccurate judgment of their actual level of comprehension, as well as less investment of time and effort in comprehending digital versus printed texts, seem consistent with the shallowing hypothesis (Annisette & Lafreniere, 2017).

However, we have been unable to find studies directly examining the shallowing hypothesis in relation to comprehension of Internet videos. A related issue is cyberloafing (e.g., Durak, 2020) or cyberslacking (e.g., O’Neill et al., 2014), which refers to the use of the Internet and information technologies for entertainment and personal purposes during school or work hours. Several studies indicate that students can become distracted by the entertainment opportunities provided by YouTube and other digital media and that this distraction can detrimentally affect their learning (Bergdahl et al., 2020; Durak, 2020; Klobas et al., 2018, 2019). In their study of upper secondary students, Bergdahl et al. (2020) found a significant correlation between low grades and the frequency of unauthorized nonacademic use of YouTube when in class. In contrast, high-performing students developed strategies (self-regulatory skills) that enabled them to use digital technologies in productive ways, resisting the nonacademic use of YouTube in the classroom.

The link between the perception of videos as less demanding and actual learning remains unclear. To our knowledge, no study has linked students’ processing of video and text (e.g., attention to task, metacognitive monitoring) and its output (i.e., comprehension). As both video and text blogs are digital materials, one could argue that both will be processed in a shallow way. Nevertheless, most adolescents tend to access streaming videos for entertainment purposes, even if they may use them for educational purposes on some occasions. By contrast, because students use textual materials for learning at school on a daily basis, we expect this superficial processing to be less pronounced with text blogs. Thus, learning from video blogs may result in shallower processing and lower comprehension than reading text blogs.

3 Notetaking to support deep processing and learning from streaming videos

The expected superficial processing of blogs may partially depend on students’ abilities, on the activities they engage in while consuming the blogs, and on their interaction with the content. First, the question arises as to what activities can increase student processing and comprehension of blogs. Notetaking while learning is a particularly effective technique to foster deep processing, as it helps students encode relevant information while learning (Peverly & Wolf, 2019). In addition, notetaking may help students to focus their attention on the learning task. For example, Wong and Lim (2021) conducted two studies with undergraduate students that demonstrated that students instructed to take notes recalled more information from a video lecture than the control group. Critically, Study 2 showed that the group that took notes mind-wandered less during the learning session.

Regarding the effects of notetaking on learning, a meta-analysis of studies comparing notetaking and no-notetaking groups found a small positive effect size on student learning outcomes (Kobayashi, 2005). A relevant moderator for this effect was the study medium, indicating that notetaking was useful when learning from text (Mean ES = 0.27) but not from audiovisual materials (a category that included films, videotaped lectures, and live lectures; Mean ES = −0.02). Kobayashi argued that, in contrast to audiovisual materials, text easily allows students to alternate between processing the learning material and taking notes. However, as we discussed above, current online video platforms provide the viewer greater control over the presentation pace. Recent studies using video lectures as learning materials have reported positive medium-size effects of notetaking on students’ recall (Wong & Lim, 2021), particularly when students do not possess much prior knowledge on the topic (Kane et al., 2017). Accordingly, we may expect notetaking to be useful for both text blogs and video blogs.

In addition, students’ processing of blogs may also be supported by their comprehension skills, as good comprehenders may better identify main ideas and integrate those in a more coherent representation. The extent to which such skills and notetaking interact remains largely unexplored. In a recent systematic review of studies examining the effects of notetaking, Jansen et al. (2017) identified studies analyzing the interaction between this learning technique and individual differences. Of the 26 studies included in their review, only seven studies considered the potential moderating role of individual differences, and most of those (n = 4) focused on the influence of working memory capacity. Specifically, Jansen et al. (2017) concluded that students with a high working memory capacity tend to take high quality notes and thus may benefit from notetaking. Conversely, students with low working memory may experience fewer benefits from taking notes. In one of the few studies that specifically examines comprehension skills, Bui and McDaniel (2015) analyzed the effect of undergraduate reading comprehension on the quality of notetaking and learning. Better comprehenders included a higher percentage of idea units in their notes and scored higher on the comprehension questions. However, as all participants in their study took notes, the researchers could not test the interaction between reading comprehension skills and notetaking on learning outcomes. In sum, previous evidence suggests that notetaking may be a useful learning technique, particularly for skilled students. In the present study, we further investigate the possible interaction between blog format, notetaking, and student reading comprehension, on not only comprehension outcomes but also indicators of processing effort such as students’ perceived on-task attention and metacognitive calibration of their level of comprehension.

4 The present study

In this study, we examined secondary students’ perceived on-task attention, metacognitive monitoring of their level of comprehension, and comprehension outcomes to test the extent to which video blogs are as suitable as text blogs for learning. In addition, we investigated whether notetaking could interact with format and influence students’ codification and recall of information. We presented students with two texts and two videos on different topics in the form of Internet blog entries. After each blog entry, students indicated their level of perceived attention, predicted their performance on subsequent comprehension questions and then answered the questions.

Based on the shallowing hypothesis (Annisette & Lafreniere, 2017) and considering notetaking as a profitable learning technique (Peverly & Wolf, 2019), whose benefits may be greater for more skilled students (Bui & McDaniel, 2015; Jansen et al., 2017), the present study was built on the following research questions: RQ1) Is learning from video blog entries detrimental for secondary students’ perceived on-task attention, metacognitive calibration, and/or content comprehension compared to learning from text blog entries?; if so, RQ2) can notetaking help secondary students overcome the detrimental effect of learning from video blog entries?; and RQ3) do benefits from notetaking depend on students’ reading comprehension? We expected greater student distraction (Hypothesis 1a) and higher overconfidence in their comprehension (Hypothesis 1b) when watching the video blogs than when reading the text blogs. Consequently, students’ comprehension outcomes would be poorer in the video blog condition (Hypothesis 1c). In addition, we explored whether notetaking confirmed or refuted the expected shallow processing of video blogs. Half of the students were asked to take notes while reading/watching each blog entry. Although we expected that notetaking would improve students’ on-task attention, calibration and comprehension, regardless of blog format (Hypotheses 2a, 2b, and 2c), we also hypothesized that such improvement would be greater in the video blog condition (Hypothesis 2d). This expectation is based on the fact that, as anticipated in Hypotheses 1a, 1b, and 1c, we expected that students would process video blogs in a shallower way (Annisette & Lafreniere, 2017). Last, regarding the role of students’ reading comprehension, we predicted that the positive effect of notetaking would be moderated by students’ reading comprehension skills. Specifically, we expected improvements due to notetaking to be higher in more skilled students (Hypothesis 3).

5 Method

5.1 Participants

One hundred eighty-eight students from 9th (n = 120; 50.8% girls, 49.2% boys) and 10th (n = 68; 50% girls, 50% boys) grade without any known learning disabilities participated in the study. Participants were from two different high schools. School A is a public high school located in a middle-class neighborhood in downtown Alicante, Spain, a mid-size city with approx. 330,000 inhabitants. School A has one group for each of 9th and 10th grade (n = 61). School B is a public high school located in a middle-class neighborhood in downtown Gandía, Spain, a small-size city with approx. 73,000 inhabitants. School B has two groups for each of 9th and 10th grade (n = 127). In both schools, students regularly used tablets and computers during their lessons. Both the students and their parents or legal guardians signed an informed consent form. APA ethical standards and the guidelines of the Helsinki Protocol were followed in conducting the study. Twenty participants could not complete the entire task due to time limitations, and they were thus excluded from the final sample, which consisted of 168 participants (Mage = 14.55, 52.3% girls, 47.6% boys).

5.2 Materials

Blog entries

We developed four texts (402-416 words) on different topics related to scientific or social issues (i.e., feminism as a social movement, children’s use of social media, management of nuclear waste, radiation from mobile phones). They were written in the first person, the voice in which bloggers often address their readers. Each text blog was converted verbatim into a video blog (2:05-2:35 min long) in which the author shared the entry in front of a camera. Blog entries were presented within a webpage (see Fig. 1). An example blog entry from the study, Mobile phones and cancer, is provided in Appendix 1.

Fig. 1 Example of the video and text versions of a blog entry

5.3 Measures

Reading comprehension

Students’ reading comprehension was measured with a subtest from PROLEC-SE-R, a widely used Spanish standardized test of reading skills with adequate reliability and validity (Cuetos et al., 2016). The original test was validated with a sample of 1254 high-school students from different regions of Spain. Results indicated that the reading comprehension subtest presents acceptable internal reliability (Cronbach’s α = .76). As evidence for different forms of validity, scores on the reading comprehension subtest discriminate well among high-school levels (d = .76), and positively correlate with average school grades, r = .25.

This measure consists of two expository texts. Each text is accompanied by 10 open-ended questions, including both literal and inferential questions. Literal questions refer to single ideas at the surface level of texts, whereas inferential questions require integrating pieces of information separated in the text or linking text content with prior knowledge. The participants did not have access to the texts when answering the questions, and there was no time limit on this measure. The score on this measure was the number of questions answered correctly.

The internal reliability in our sample was examined using the omega coefficient based on a polychoric-transformed correlation matrix. This coefficient has been shown to be more appropriate for dichotomous items than alpha (i.e., correct/incorrect answers; Gadermann et al., 2012). The results indicated good reliability for the two texts (ω = .83 for the first text and ω = .80 for the second text).

Comprehension of blog entries

Students’ comprehension of each blog entry was measured by four multiple-choice questions covering literal and inferential comprehension. Response options were constructed following the guidelines proposed by Ozuru et al. (2007): For each question, response options included the target and three different distractors: near-miss (an idea located in the text that conceptually taps the target answer), thematic (a plausible answer but containing common misconceptions), and unrelated distractor (an extremely improbable answer or one inconsistent with the text content). The following is a literal question from the text Mobile phones and cancer (see text in Appendix 1): The hands-free system of mobile phones causes radio frequency energy to: a) stop being transmitted to the outside; b) hardly reach the user’s brain; c) increase its danger due to the microwave effect; d) be transformed into X-rays. The correct response is b), as it paraphrases the following sentence in the text: Exposure drops dramatically when phones are moved away from the head. The following is an inferential question from the same text: Regarding the relationship between mobile phones and cancer, from the studies described in the blog it is concluded that we must: a) reduce the use of mobile phones to prevent brain tumors; b) make shorter calls with mobile phones to be healthy; c) calmly use mobile phones without risk to health; d) make emergency calls in the event of radiation. The correct option is c), as it is the only option that synthesizes the evidence showing that there is no established relationship between mobile phone use and cancer.

All questions were piloted in a group of 9th-10th-grade students (N = 33) who did not participate in the main study. The reliability of this measure was acceptable (ω = .79).

Perceived on-task attention

We used an adapted version of the mindwandering questionnaire developed by Sanchez and Naylor (2018) to assess students’ perceived on-task attention while reading/watching the blog entries. After each blog entry, students answered three questions on a scale from 1 (“Almost never”) to 6 (“Almost always”), such as “While watching the video/reading the text, how often did you notice that you were thinking about things other than the text/video?”. The reliability of this questionnaire was questionable (ω = .63).

Metacognitive calibration

After reading/watching each blog entry, but prior to accessing the comprehension questions, students predicted the number of questions they thought they would answer correctly. Students’ calibration for each blog entry comprehension was calculated by subtracting the number of correct answers from their predictions. Negative values in this measure indicated underestimation, and positive values reflected overconfidence in participants’ predictions of performance, with possible values ranging from −4 to 4.
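
The calibration index described above can be sketched in a few lines. This is a minimal illustration, not the authors' scoring script: the function name `calibration_index` and the example responses are hypothetical, and the only facts taken from the text are the subtraction (prediction minus correct answers) and the four questions per blog entry.

```python
def calibration_index(predicted_correct: int, actual_correct: int, n_questions: int = 4) -> int:
    """Return prediction minus actual score.

    Positive values indicate overconfidence, negative values underestimation,
    and zero indicates perfect calibration.
    """
    for name, value in (("predicted_correct", predicted_correct),
                        ("actual_correct", actual_correct)):
        if not 0 <= value <= n_questions:
            raise ValueError(f"{name} must be between 0 and {n_questions}")
    return predicted_correct - actual_correct


# Hypothetical (predicted, actual) pairs for one student across four blog entries.
responses = [(4, 2), (3, 3), (1, 3), (4, 0)]
indices = [calibration_index(p, a) for p, a in responses]
print(indices)  # [2, 0, -2, 4]
```

With four questions per entry, the extreme cases (predicting 4 and scoring 0, or the reverse) give the bounds of 4 and −4 stated above.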

Students’ notes

The quality of students’ notes in the notetaking condition was categorized and quantified following Gil and colleagues (2001) based on the degree of transformation of the information (i.e., literal, elaborate, and erroneous ideas) and content relevance (i.e., important ideas or secondary details). Specifically, we considered a) literal ideas as those including claims from blog entries copied verbatim or mechanically paraphrased, b) elaborate ideas as those ideas re-elaborated or inferred by the students, c) important ideas as those ideas referring to central ideas in blog entries, d) secondary details as ideas referring to surface details in blog entries, and e) erroneous ideas as claims representing an incorrect interpretation of the ideas from the blog entries. Examples of students’ notes for each category can be seen in Appendix 2 (Table 4). The third and fourth authors coded the notes, yielding 74.5% interrater agreement (Cohen’s κ = .65). Disagreements between the two raters were resolved through discussion.
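
For readers unfamiliar with the two agreement statistics reported above, the following sketch shows how percent agreement and Cohen's κ relate. It assumes each idea unit receives a single category label from each rater; the labels and ratings below are invented for illustration and are not the study's data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters assigned the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    labels = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[label] * counts_b[label] for label in labels) / n ** 2
    if p_expected == 1:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for six idea units.
rater_a = ["literal", "literal", "elaborate", "erroneous", "literal", "elaborate"]
rater_b = ["literal", "elaborate", "elaborate", "erroneous", "literal", "literal"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(round(agreement, 3), round(kappa, 3))
```

Because κ discounts the agreement that two raters would reach by chance, it is lower than raw percent agreement whenever chance agreement is above zero, which is the pattern in the values reported above.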

Covariates

We measured participants’ selective and sustained attention capacity using the Perception of Differences test - Revised (CARAS-R; Thurstone & Yela, 2014), validated in Spain in a sample of 12,190 students (Cronbach’s α = 0.91), and previously used to measure students’ attention (e.g., Crespo-Eguilaz et al., 2006). In addition, after reading or watching each blog entry, students reported their perceived comprehensibility of the blog entry and their situational interest in the content of the entry on a 5-point Likert-scale.

5.4 Procedure

Each student completed the study in two group sessions in their regular classrooms. In the first session, student comprehension skills and attention capacity were measured. The experimental task was performed in the second session. Students were told that they had to learn material on different topics from four webpages to answer questions about the content. Students in the notetaking condition were additionally told that they had to write down what they considered important and that they could check their notes when answering the comprehension questions. They first performed a practice trial in which they read/watched a blog entry (shorter than the experimental trials) and answered only two questions. Then, they read two text entries and watched two video entries on either a tablet or a desktop computer. When watching the video entries, the students could control the video playback. The format and the presentation order were counterbalanced across blog entries and students. After reading/watching each blog entry, students reported their interest in its content and their perceived comprehensibility. They then predicted their performance on the comprehension questions and answered the questions. Last, they completed the mindwandering questionnaire. All these tasks were completed in a printed booklet, in which the students in the notetaking condition also took their notes. Each group of students was guided through the tasks by two experimenters to ensure their understanding.
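
The text does not specify the exact counterbalancing scheme, so the following is only one possible sketch. It assumes full crossing of topic order with the choice of which two topics appear as video, and a simple cyclic assignment of students to schedules; the short topic labels and function names are hypothetical.

```python
from itertools import combinations, permutations

# Short, hypothetical labels for the four blog topics named in the Materials section.
TOPICS = ("feminism", "social media", "nuclear waste", "mobile phones")


def build_schedules(topics=TOPICS, n_video=2):
    """Cross every topic order with every choice of which topics are shown as video."""
    schedules = []
    for order in permutations(topics):
        for video_topics in combinations(topics, n_video):
            schedules.append(
                [(topic, "video" if topic in video_topics else "text") for topic in order]
            )
    return schedules


def schedule_for(student_index, schedules):
    """Assign schedules cyclically so that they are used about equally often."""
    return schedules[student_index % len(schedules)]


schedules = build_schedules()
print(len(schedules))  # 144 = 24 topic orders x 6 video/text splits
print(schedule_for(0, schedules))
```

Under this assumed scheme, every topic appears equally often in each format and in each serial position across the full set of schedules. Balance is only approximate when the number of students is not a multiple of 144.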

5.5 Data analyses

We examined the effects of format and notetaking on students’ blog entry comprehension, perceived on-task attention, and metacognitive calibration, and we also tested whether these effects interacted with students’ reading comprehension skills. Thus, we fitted two linear mixed-effects (LME) models for each dependent measure with blog format, notetaking, and student reading comprehension as fixed effects: Model 1 included their main effect terms, and Model 2 included their interaction term. Additionally, given that students’ perceived on-task attention correlated with their score in the CARAS-R test (i.e., selective and sustained attention capacity; see Results), this latter variable was also included as a covariate (fixed effect) in the two models for perceived on-task attention. Based on the results from model fitting comparisons using the ‘anova’ function from the ‘stats’ package v.4.0.2 for R (R Core Team, 2020), students and students’ classrooms were included as random effects in all cases (random intercept, fixed slope), and blog entry topics were controlled as fixed effects.

Last, the effect of blog entry format and student reading comprehension on the quality of students’ notes was examined by performing two LMEs for each of the note quality indicators, including format, student reading comprehension, and blog entry topic as fixed effects (Model 1: main effect terms; Model 2: interaction term), and students and students’ classroom as random effects (fixed slopes) in all cases.

These analyses were performed using the ‘lmer’ function from the ‘lme4’ package v.1.1-23 for R (Bates et al., 2015). When appropriate, post hoc analyses were conducted using the ‘emtrends’ function from the ‘emmeans’ package v.1.5.5-1 for R (Lenth, 2020). Following Cohen et al. (2013), quantitative predictors were centered prior to being included in the models.
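
Centering a quantitative predictor is a simple transformation. The sketch below assumes grand-mean centering (subtracting the sample mean), which is a common reading of this recommendation but is not spelled out in the text; the function name and scores are hypothetical.

```python
from statistics import fmean


def center(values):
    """Subtract the sample mean so that zero corresponds to an average score."""
    mean = fmean(values)
    return [value - mean for value in values]


# Hypothetical reading comprehension scores for five students.
reading_scores = [12, 15, 9, 18, 16]
centered = center(reading_scores)
print(centered)  # [-2.0, 1.0, -5.0, 4.0, 2.0]
```

After centering, the coefficient of any other predictor that interacts with reading comprehension describes its effect for a student with an average reading score rather than for a student with a score of zero.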

6 Results

Four participants were excluded due to outlier values in the score on the comprehension questions (±2SD from the sample mean). Thus, the sample finally included in the analyses consisted of 164 participants. Descriptive statistics for each dependent variable are shown in Table 1. Variables were normally distributed once outliers were deleted (kurtosis and skewness values were within the ±2 range; George & Mallery, 2010).
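
The ±2 SD exclusion rule can be illustrated as follows. This sketch assumes a single pass using the sample standard deviation (n − 1 denominator) computed on the full sample before exclusion; the text does not state these details, and the scores are invented.

```python
from statistics import fmean, stdev


def within_sd_bounds(scores, n_sd=2.0):
    """Flag each score as inside (True) or outside (False) mean +/- n_sd * SD."""
    mean = fmean(scores)
    sd = stdev(scores)  # sample standard deviation (n - 1 denominator)
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return [lower <= score <= upper for score in scores]


# Hypothetical total comprehension scores for ten students.
scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 2]
keep = within_sd_bounds(scores)
retained = [score for score, ok in zip(scores, keep) if ok]
print(retained)  # the score of 2 falls below the lower bound and is dropped
```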

Table 1 Mean and SD for each dependent variable in each experimental condition and skewness and kurtosis for each dependent variable in each format condition for the entire sample

With respect to student reading comprehension and the covariates, all mean values were similar across the notetaking groups, all ts < 1, except that the scores in the CARAS attention test were higher in the group who took notes, t(163) = 2.23, p < .03. We further explored the correlations between the covariates and the dependent variables. As shown in Table 2, participants’ situational interest and perceived comprehensibility of blog entry content correlated with their perceived on-task attention and scores on the blog entry comprehension questions. Furthermore, participants’ scores on the CARAS attention test correlated with their perceived on-task attention. Thus, as mentioned earlier, scores in the CARAS test were included as covariates in the model examining differences in participants’ perceived on-task attention.

Table 2 Pearson correlations between covariates and dependent variables

Last, regarding the students’ notes per blog entry, they consisted of 5.11 important ideas on average (SD = 3.50), 2.28 secondary details (SD = 2.36), and 0.10 erroneous ideas (SD = 0.32), representing on average 71.29% (SD = 23.59), 25.60% (SD = 22.03) and 3.11% (SD = 13.13) of the total number of ideas included in the notes (M = 7.49, SD = 5.25), respectively. Students included on average 30.60% (SD = 21.51) of the important ideas in the blog entries and 18.15% (SD = 19.36) of the secondary details. They rarely paraphrased the ideas that they annotated (M% = 8.45, SD = 18.69), and their notes were 45.14 words long per blog entry (SD = 47.39).

6.1 Effects of blog entries format and notetaking on perceived on-task attention

Regarding students’ distraction, we expected an increase with video blogs (Hypothesis 1a) and a decrease with notetaking (Hypothesis 2a). Thus, we analyzed two LME models including students’ scores on the mind-wandering questionnaire as the criterion variable. Model 1 included the main effect terms of blog entry format, notetaking, and student reading comprehension. Model 2 included their interaction term. Students’ mind wandering did not vary as a function of blog format, t(487.99) = 1.35, p = .18. In addition, there was no main effect of notetaking, t < 1, and student reading comprehension was not a significant predictor of their perceived on-task attention, t = −1.22, p = .22. The interaction between format and notetaking and that between notetaking and students’ reading comprehension were not significant (all ps > .08) (Fig. 2). In sum, these results did not support Hypotheses 1a and 2a on the effects of video blogs and notetaking on students’ distraction.

Fig. 2

Plot for the interaction term between blog entry format, notetaking, and student reading comprehension on students’ scores in the mind-wandering questionnaire. Students and students’ classrooms were controlled as random effects (fixed slope), and blog entry topics were controlled as fixed effects. Note. Higher values on the mind-wandering questionnaire indicate lower on-task attention

6.2 Effects of blog format and notetaking on metacognitive calibration

For students’ metacognitive calibration, we predicted a decrease with video blogs (Hypothesis 1b) and an increase with notetaking (Hypothesis 2b). Students’ predictions of their performance on the blog entry comprehension questions were generally slightly overconfident, as the sample mean score for calibration was significantly higher than zero (i.e., perfect calibration), M = 0.23, SD = 0.95, t(164) = 3.27, p < .01. Students were slightly more overconfident when watching the videos than when reading the texts (see Table 2). Nevertheless, and contrary to Hypotheses 1b and 2b, the results from Model 1, which included participants’ calibration index as the criterion variable and format, notetaking, and reading comprehension as fixed factors, showed no main effect of notetaking or blog format, both ts < 1, and student reading comprehension was not a significant predictor, t(153.49) = −1.45, p = .15. Moreover, the model including the interaction term between these variables (Model 2) revealed no interaction effects between these three variables on students’ calibration, all ts < 1.30, ps > .19 (see Fig. 3). Thus, differences in students’ calibration index across the four experimental conditions were not statistically significant.
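The overconfidence check reported above is a one-sample t test of the calibration index against zero. The following minimal sketch shows that computation on hypothetical data. It assumes the index is the predicted score minus the actual score, which this section does not spell out, and the function names are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def calibration_index(predicted, actual):
    """Assumed definition: prediction minus performance (positive = overconfident)."""
    return [p - a for p, a in zip(predicted, actual)]


def one_sample_t(values, mu0=0.0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(values) - mu0) / standard_error, n - 1


# Hypothetical predicted and actual numbers of correct answers for six students.
predicted = [3, 2, 4, 3, 1, 4]
actual = [2, 2, 3, 4, 1, 2]

index = calibration_index(predicted, actual)
t_value, df = one_sample_t(index)
print(index, round(t_value, 3), df)
```

A positive t statistic indicates that predictions exceeded performance on average; the p value would then be read from the t distribution with the returned degrees of freedom.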

Fig. 3

Model’s plot for the interaction term between blog format, notetaking, and students’ reading comprehension on students’ calibration index. Students and students’ classrooms were controlled as random effects (fixed slope), and blog entry topics were controlled as fixed effects

6.3 Effects of blog format and notetaking on blog entry comprehension

Regarding students’ comprehension, we predicted a decrease due to the use of video blogs (Hypothesis 1c) and an increase with notetaking (Hypothesis 2c). However, contrary to those predictions, the results from Model 1, which included participants’ scores on blog entry comprehension as the dependent variable and the main effect terms of format, notetaking, and students’ reading comprehension, showed no effect of format or notetaking, both ts < 1. In contrast, students’ reading comprehension was a significant and positive predictor, t(149.87) = 6.83, p < .001.

Model 2 included the interaction term between format, notetaking, and students’ reading comprehension on blog entry comprehension. These interaction effects were relevant to test the expected interactions between blog format and notetaking (Hypothesis 2d) and between reading comprehension and notetaking (Hypothesis 3). Two-way interactions showed no significant interaction between format and notetaking, t < 1, and a significant interaction effect of notetaking and reading comprehension on blog entry comprehension, t(295.60) = 2.75, p < .01. This two-way interaction was qualified by a significant three-way interaction between format, notetaking, and reading comprehension, t(488.01) = 2.48, p = .01.

We thus performed post hoc analyses to further examine the interaction between both experimental factors and students’ reading comprehension. Tukey-corrected pairwise comparisons of the estimated marginal means of the regression slopes of student reading comprehension on blog entry comprehension across the experimental conditions showed that the only significant difference appeared between the notetaking conditions in the case of text entry comprehension, estimate: 2.08, SE = 0.78, t(296) = 2.70, p = .04. The remaining post hoc comparisons, which yielded nonsignificant results, indicated no difference between the regression slopes of reading comprehension across formats in any of the notetaking conditions, as well as no difference across notetaking conditions within the video entry condition, all ts < 2.03, ps > .16. In other words, as shown in Fig. 4, when reading the text entries, poor comprehenders who took notes seemed to outperform poor comprehenders who did not, which was not the case when watching videos.

Fig. 4

Model’s plot for the interaction term between blog format, notetaking, and reading comprehension on students’ blog entry comprehension. Students and students’ classrooms were controlled as random effects (fixed slope), and blog entry topics were controlled as fixed effects

To further examine differences across format and notetaking conditions in terms of student reading comprehension, we followed the procedure for testing group mean differences at a particular point of a continuous variable in regression analyses proposed by Cohen et al. (2013). We constructed four additional mixed models in which the variable students’ reading comprehension was rescored. In two of the models, we recentered each individual score in this variable at 2SD and 1SD below the sample mean. In the other two models, we recentered the individual scores at 1SD and 2SD above the sample mean. Thus, the intercept of the students’ reading comprehension regression slope on the scores for blog entry comprehension in each model was set at -2SD, -1SD, 1SD, and 2SD from the sample average in reading comprehension, allowing us to reproduce the equation estimates for very poor, poor, good, and very good comprehenders in each experimental group (see Fig. 5). In addition, estimates for average comprehension were those from the original model (i.e., the model including original values of the variable reading comprehension).
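The logic of this recentering procedure can be illustrated with a deliberately simplified sketch. It fits one ordinary least-squares line per notetaking group instead of the mixed models reported here, and all data and names (`fit_line`, `gap_at`) are hypothetical. Shifting the predictor so that zero falls at a chosen point turns each intercept into the predicted score at that point, so the difference between intercepts is the group difference for a student at that level of reading comprehension.

```python
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def gap_at(point, x, y_notes, y_no_notes):
    """Predicted score difference (notes minus no notes) at `point`.

    The predictor is recentered at `point`, so each intercept equals the
    predicted score for a student whose original score is `point`.
    """
    shifted = [xi - point for xi in x]
    intercept_notes, _ = fit_line(shifted, y_notes)
    intercept_no_notes, _ = fit_line(shifted, y_no_notes)
    return intercept_notes - intercept_no_notes


# Hypothetical reading comprehension scores and blog entry comprehension (%).
reading = [2, 4, 6, 8, 10]
score_notes = [64, 68, 72, 76, 80]
score_no_notes = [48, 56, 64, 72, 80]

m, s = mean(reading), stdev(reading)
for label, point in [("-2SD", m - 2 * s), ("-1SD", m - s), ("mean", m),
                     ("+1SD", m + s), ("+2SD", m + 2 * s)]:
    print(label, round(gap_at(point, reading, score_notes, score_no_notes), 2))
```

In these made-up data the advantage of notetaking shrinks as reading comprehension increases, which mirrors the direction of the pattern described in the text but not its estimates or standard errors.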

Fig. 5

Predicted scores on text entry (top panel) and video entry (bottom panel) comprehension for very poor, poor, good, and very good comprehenders (-2SD, -1SD, +1SD, and +2SD from the sample mean in students’ reading comprehension). Note. **p < .01; *p < .05; ns: not significant

Differences between notetaking conditions were significant only for poor and very poor comprehenders. On the one hand, the models predicted that the percentage of correct answers in the text entry comprehension questions of two hypothetical students with poor and very poor reading comprehension would increase by 11.79 and 20.24 points, respectively, if they took notes, SE = 5.06, t(43.53) = 2.33, p = .02, and SE = 7.26, t(120.44) = 2.79, p < .01. On the other hand, hypothetical students with average, good, and very good reading comprehension would score similarly regardless of notetaking, all ts < 1.89, ps > .06. The predicted scores on the video entry comprehension questions were similar, regardless of notetaking and student reading comprehension, all ts < 0.54, ps > .59 (see Fig. 5).

Additionally, differences in estimated comprehension outcomes for poor and very poor readers across formats and notetaking conditions (i.e., higher scores for text entry than for video entry comprehension in the notetaking condition and lower scores for text entry than for video entry comprehension in the no notetaking condition) were not significant in any case, all ts < 1.39, ps > .17. Thus, the enhancing effect of notetaking on text entry comprehension among poor readers was driven by not only increased scores among those who took notes but also decreased scores among those who did not, compared to their outcomes in video entry comprehension (see Fig. 4). In sum, Hypotheses 2d (i.e., higher benefit from notetaking when watching the video blogs) and 3 (i.e., greater benefit from notetaking for skilled students) were not confirmed. Instead, notetaking benefited only students low in reading comprehension and only when reading the text blog entries.

6.4 Differences in students’ notes across blog entry formats

To further understand the lack of interaction between notetaking and blog format on comprehension (not supporting Hypothesis 2d) and the unexpected interaction between reading comprehension, notetaking, and blog format (not supporting Hypothesis 3), we analyzed students’ notes in depth (see Table 3 for descriptive data on each of the note quality indicators segregated by blog entry format). We examined the effect of blog entry format and students’ reading comprehension on each indicator of the quality of students’ notes (i.e., total number and proportion of important ideas, secondary details, errors, and literal and elaborate ideas). Sample values were not normally distributed in any of the cases, so they were normalized and grand-mean centered using the ‘normalize’ function from the ‘BBmisc’ package v.1.10 for R (Bischl, 2016). The results indicated that students’ reading comprehension significantly and positively predicted the number of literal ideas [estimate: 0.06, SE = 0.03, t(49.10) = 2.12, p = .04] and negatively predicted the proportion of erroneous ideas included in their notes [estimate: -0.04, SE = 0.02, t(169.00) = −2.27, p = .02]. Moreover, none of the models showed a significant main effect of blog entry format on any indicator of students’ note quality, all ts < 1.72, ps > .08, except for the number of important ideas. Students included a higher number of important ideas in the notes they took from video entries than they did in the notes from text entries [estimate: 0.27, SE = 0.11, t(123.81) = 2.46, p = .02]. There was no interaction effect of format and students’ reading comprehension on any note quality indicator (all ts < 1.10, ps > .27).
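As a rough illustration of the normalization step, the sketch below standardizes a variable by subtracting its mean and dividing by its sample standard deviation. This assumes the ‘normalize’ function was used with its default "standardize" method, which the text does not state. The Python function, the handling of constant input, and the data are hypothetical stand-ins for the R call.

```python
from statistics import mean, stdev


def standardize(values):
    """Center on the mean and scale to unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    if s == 0:
        # Choice made for this sketch only: a constant variable maps to zeros.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Hypothetical numbers of important ideas in five sets of notes.
important_ideas = [2, 5, 5, 7, 11]
z_scores = standardize(important_ideas)
print([round(z, 3) for z in z_scores])
```

Standardizing in this way changes only the location and scale of a variable, so the resulting estimates are expressed in standard deviation units while the shape of the distribution stays the same.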

Table 3 Means and SDs (in parentheses) for each note quality indicator for each blog format

To further examine the association between the number of important ideas included in the notes and students’ comprehension of blog entries, we performed two LME models, including this quality indicator, blog entry format and students’ reading comprehension as fixed factors (Model 1: main effect terms; Model 2: interaction term), and student, student classroom, and blog entry topic as random factors (fixed slopes). The results showed that the number of important ideas was not a significant predictor of blog entry comprehension or its interactions with format and students’ reading comprehension, all ts < 1.

7 Discussion

The present study contributes to the literature on digital literacy by studying for the first time students’ processing of video and text blogs (e.g., attention to task, metacognitive monitoring) and its output (i.e., comprehension). We also tested the influence of notetaking as a means to overcome the expected shallower processing of video blogs, taking into account students’ reading comprehension. Our first research question concerned whether learning from video blog entries would be detrimental to secondary students’ perceived on-task attention, metacognitive calibration, and/or content comprehension compared to learning from text blog entries. Contrary to our expectations based on the shallowing hypothesis (Annisette & Lafreniere, 2017), students’ processing (i.e., on-task attention, calibration) and performance (i.e., content comprehension) were equivalent for video blogs and text blogs. Regarding RQ2 (Can notetaking help secondary students overcome the detrimental effect of learning from video blog entries?), in addition to the fact that notetaking did not exert any main effect on students’ processing or performance, this learning technique was not necessary to overcome the effect of blog format on content comprehension. Last, findings in relation to RQ3 (Do benefits from notetaking vary with students’ reading comprehension?) indicated that, unexpectedly, notetaking only improved comprehension for low skilled comprehenders reading text entries. Paradoxically, students’ notes from video entries included a higher number of important ideas than those from text entries. However, this indicator of note quality did not predict students’ blog entry comprehension. We next discuss the implications of these findings.

7.1 Video blogs, notetaking, and students’ processing

Our results showed that the indicators of students’ processing efforts of blog entries (i.e., on-task attention, metacognitive calibration) were not significantly affected by blog entry format and notetaking. Overall, those patterns of results do not support the shallowing hypothesis (Annisette & Lafreniere, 2017), which suggests that students create a superficial habit of mind when interacting with digital media. Specifically, the lack of differences between video and text formats with respect to on-task attention contrasts with the findings of previous studies on cyberloafing or cyberslacking that suggest that students can become distracted by the entertainment opportunities provided by Internet streaming videos (Bergdahl et al., 2020; Durak, 2020; Klobas et al., 2018, 2019). A possible explanation for this discrepancy may be contextual factors. The fact that we used a restricted learning environment, which did not allow students to use the video feed to access additional content, may help explain why students did not become more distracted with video blogs. This finding is consistent with recent research reporting no difference in on-task attention between high school students reading texts in print or on tablets offline (Salmerón et al., 2021).

Regarding metacognitive calibration, students’ predictions were slightly overconfident, similar to what research on this variable has traditionally found (Ackerman & Goldsmith, 2011, Exp. 1; Ackerman & Lauterman, 2012). The lack of calibration differences between blog formats adds to the evidence indicating that students processed video and text entries similarly. Although this aspect was not a main focus in our study, calibration and on-task attention scores were not related (see also Delgado & Salmerón, 2021). It has been proposed that the relation between calibration and on-task attention when reading may be bidirectional: either distraction would prevent students from accurately judging their current level of understanding, or overconfidence could release mental resources that could be dedicated to distracting activities (Smallwood & Schooler, 2006). Our pattern of results cannot shed light on this issue.

Finally, the fact that notetaking did not exert a major influence on students’ processing of video compared with text entries is unsurprising, given that our prediction on the role of such activity was based on the expectation that video entries would encourage shallower processing. As this expectation did not prove true, we could not expect that notetaking would specifically stimulate students when learning from video entries. Nevertheless, the fact that notetaking had no overall effect on students’ processing challenges the view that this activity supports in-depth processing (Kobayashi, 2005; Peverly & Wolf, 2019). We cannot attribute the lack of effects in our study to the instruction used, to a lack of training, or to the grade level of our students. We used a procedure similar to what Kobayashi (2005) coded in his meta-analysis as neutral instructions, e.g., instructions to take notes as usual, without providing a particular training or scaffold. Our participants corresponded to the coding “lower schooling” (as opposed to “higher schooling” or undergraduate studies; Kobayashi, 2005). For both categories, effect sizes were positive, with small to medium sizes. As previous research has focused mostly on the outcome or habits of notetaking, our pattern of results suggests a need to further explore online processing of learning materials to better understand the effects of notetaking on learning.

7.2 Video blogs, notetaking, and students’ comprehension

As was the case for the blog processing indicators, comprehension measures did not differ across blog entry formats or notetaking conditions, which does not support the shallowing hypothesis (Annisette & Lafreniere, 2017) and other related warnings in previous literature about the possible detrimental effects of online videos as sources of distraction (e.g., Bergdahl et al., 2020; Durak, 2020; Klobas et al., 2018, 2019; Lupinacci, 2021). Similar to previous research with instructional videos, the use of video blogs resulted in comprehension similar to that of equivalent text information (e.g., Burin et al., 2021; List, 2018; Merkt et al., 2011; Salmerón et al., 2020; Tarchi et al., 2021). Future research should explore the potential limits of this equivalence. More complex learning activities, such as integrating multiple perspectives from different documents, may exceed students’ capabilities to comprehend video blogs. For instance, in a recent study, Salmerón et al. (2020) found that when primary school students had to learn from two Internet videos providing opposite views in favor of or against bottled water, they included a lower number of inferences in their summaries compared to those learning from two equivalent texts. Based on our results, we conclude that online videos created by so-called YouTubers can be as profitable as texts for learning in secondary education, at least when used in learning scenarios with a single document.

As was the case for processing indicator measures, notetaking had no main influence on students’ comprehension, a pattern that is at odds with previous meta-analytic evidence showing small- to medium-sized positive effects (Kobayashi, 2005). As we introduced notetaking using a procedure comparable to other previous studies (see previous subsection), we can reasonably conclude that our students were not particularly proficient in taking notes. Indeed, participants copied the ideas from the blog entries mostly verbatim (more than 90% of the included ideas were identified as literal). As previous research has shown, elaborate notes promote better learning outcomes than literal notes do (Gil et al., 2011; Haynes et al., 2015; Mueller & Oppenheimer, 2014). Thus, the notes produced by the students in our study were not particularly insightful for understanding blog entry content. This finding could also explain why, contrary to our expectations and to previous findings (Bui & McDaniel, 2015; Jansen et al., 2017), students with high reading comprehension in our study did not benefit from notetaking. Indeed, students’ reading comprehension positively predicted only the number of literal ideas in their notes, not the number of elaborate ideas.

Finally, an unpredicted pattern emerged from our findings, namely, students with poor reading comprehension could benefit from taking notes, but only when learning from text blog entries, not videos. Given the unexpected nature of this pattern, we can only speculate about what it means. As students could refer to their notes while answering the comprehension questions, the benefits of notetaking may be a consequence of students using notes as a substitute for their own memory. Alternatively, as suggested in early research on learning from videos (e.g., Kozma, 1991; Salomon, 1984), high school students in our study could have perceived videos as less demanding than texts, leading them to rely on their notes only in the more challenging format. However, as we did not measure perceived processing effort, such an interpretation should be considered with caution.

Overall, our results suggest that while video blogs can be as supportive of learning tasks as text blogs, students may require specific instruction to benefit from notetaking when learning from videos (Kobayashi, 2005).

7.3 Limitations and future research

Our study is not without limitations. First, although we controlled for a variety of individual differences (i.e., reading comprehension skills, selective and sustained attention capacity, perceived comprehensibility, and situational interest), we did not measure other aspects related to digital and general literacy, such as basic computer skills or oral comprehension. Regarding the first, to watch video blogs strategically, students require a certain level of computer skills (e.g., Merkt & Schwan, 2014). For example, to restudy a complex part of the video, they need to pause it and go backwards until they reach the beginning of the part they want to watch again. Regarding the second, although oral comprehension underlies reading comprehension (e.g., Silva & Cain, 2015), we cannot rule out the possibility that including it as a predictor of students’ blog entry comprehension would have resulted in the same interaction effect between this variable and notetaking for video entry comprehension as the one we found in the case of text entries and students’ reading comprehension. In other words, notetaking could benefit students low in oral comprehension when watching video entries, as it did for struggling readers in the text entry condition in our study. Future research could explore this possibility.

Second, the texts and videos used in our study were rather brief. It can be suggested that there is more room for cognitive processes involved in the in-depth processing of information to make a difference when learning from longer texts and/or when learning from multiple texts and videos on the same topic (Salmerón et al., 2020). In addition, the texts and videos in our study were verbatim copies of each other. Thus, differences between learning from text blogs and video blogs should be further explored by not only using longer documents but also comparing informal vs. academic texts/videos conveying identical content.

Third, students’ annotations in our study were mostly literal idea units, so we were unable to explore the influence of elaborate notes. As noted, elaborate notes foster learning to a greater extent than literal notes do (Gil et al., 2011; Haynes et al., 2015; Mueller & Oppenheimer, 2014). Thus, future studies could further explore the relationship between notetaking and the format of the learning materials by training students in taking elaborate notes prior to the learning tasks.

Fourth, the fact that the students in our study could refer to their notes while answering the comprehension questions did not allow us to fully test the influence of notetaking on the depth of processing, as students could rely on their notes as mere substitutes for memorizing efforts. Therefore, future research should explore differences in the consequences of notetaking on comprehension across blog formats by not allowing students to refer to their notes.

Finally, in interpreting our results, we should keep in mind that we used a closed scenario, where students had to learn from a set of previously selected blogs. This is similar to learning activities where teachers provide the learning materials. Future research could examine whether our conclusion that students’ performance is equivalent for video blogs and text blogs generalizes to open scenarios, such as searching the Internet to learn about a particular topic. Aspects such as students’ preferences for particular sources, such as their favorite blogger, may determine which documents they access and how much time they spend with them. This, in turn, may affect their learning. In this line, Pardi et al. (2020) report a study where a group of undergraduate students were allowed to freely search the Internet to learn about the formation of thunderstorms and lightning. Results showed that while time spent on mostly textual webpages was associated with higher learning, time spent on videos was associated with lower learning. In sum, we cannot take for granted that videos are efficient learning materials in open scenarios.

8 Conclusion and educational implications

The present study provides relevant insights into the use of multimedia materials for learning purposes. On the one hand, our results indicate that short video blogs are as suitable as text blogs for learning, so they can be used to cover academic curricula. Ultimately, the implementation of video blogs in the classroom depends on the extent to which teachers perceive that such an innovation justifies the change (Cuban, 1986). In this regard, we call for video designs based on previous evidence on the use of multimedia in education (e.g., Mayer, 2020).

On the other hand, it seems that students low in reading comprehension do not benefit from notetaking when learning from video blogs as they do when learning from text blogs. Although further research should clarify this finding in relation to students’ oral comprehension and by training students in annotating elaborate notes, educational practices including notetaking as a learning technique should be careful when deciding when and with whom to use it. Our results suggest that when learning from online content, notetaking might not help struggling readers when they learn from video blogs, in contrast to the case of text blogs.