
Do Students Learn Better With Immersive Virtual Reality Videos Than Conventional Videos? A Comparison of Media Effects With Middle School Girls

Volume 3, Issue 3: Fall 2022. DOI: 10.1037/tmb0000082

Published on Jul 21, 2022

Abstract

This article presents two studies comparing the effects of educational immersive virtual reality (IVR) versus traditional videos on conceptual knowledge and self-efficacy. Learning was measured through multiple-choice questions assessing conceptual knowledge and open-ended questions assessing knowledge understanding, knowledge application, and knowledge creation, based on Bloom’s taxonomy of learning objectives. In Experiment 1, 53 eighth-grade students from an all-girls school learned about humans’ impact on the ocean through either 360° videos, using a virtual reality (VR) headset, or through traditional videos, using a computer monitor. Measures were taken before and right after treatment. In Experiment 2, 139 sixth-to-eighth grade students from the same school used the same instructional material and equipment from Experiment 1, and measures were taken four times (before treatment, right after two treatment sessions, and 5 weeks after treatment). Also, we measured learning agency and investigated its mediation role between condition and self-efficacy. The groups did not differ on the multiple-choice scores assessing conceptual knowledge in Experiment 1. In Experiment 2, participants in the IVR group scored higher for knowledge creation than participants in the desktop group, but not knowledge understanding or knowledge application. The IVR group scored higher on self-efficacy than the desktop group in Experiment 1, but not in Experiment 2. Finally, learning agency mediated the relationship between condition and self-efficacy in Experiment 2, indicating a possible mechanism underlying immersion effects on self-efficacy. Results are discussed in light of cognitive sciences and their implication for learning in immersive virtual reality.

Keywords: immersive virtual reality, learning, self-efficacy, education

Funding: This research was partially supported by a National Science Foundation grant (AISL # 1907050) and by the Knut och Alice Wallenbergs Stiftelse #20170440.

Disclosures: The authors declare that they have no conflicts of interest to disclose.

Data Availability: Data and analytic methods are available at https://osf.io/mpk2j/.

Open Science Disclosures:
The data are available at https://osf.io/mpk2j/.
The experimental materials are available at https://osf.io/mpk2j/.

Correspondence concerning this article should be addressed to Anna Carolina M. Queiroz, Department of Communication, Stanford University, 450 Jane Stanford Way, Room 411, Stanford, CA 94305, United States. Email: acmq@stanford.edu



Since the first virtual reality (VR) head-mounted displays (HMDs) were built in the 1960s (Sutherland, 1968), there has been an interest in using immersive virtual reality (IVR) technology for learning (Dede, 2010). Yet over the last decade, IVR’s commercial availability has increased considerably (Parong & Mayer, 2018) and is now becoming affordable to the general public. As a result, more attention is now being drawn to possible IVR applications in schools and research on the impact of IVR within the context of learning (Blascovich & Bailenson, 2011; Mado et al., 2022; Parong & Mayer, 2018). IVR is a genre of technology that takes many forms, and there is great variance in how scholars define the medium. In this article, IVR videos are defined as 360° videos created from real images captured through 360° cameras and watched through a VR headset. Traditional videos refer to 2D format videos that are watched through a computer monitor. When in IVR, the headset blocks stimuli other than the digital content and allows the user to visually explore through head movements. In contrast, traditional videos limit the viewer’s point of view. Hence, we ran two experiments comparing the effects of watching videos either in IVR or on a computer monitor on conceptual knowledge and self-efficacy (i.e., the feeling of being able to learn). We expect participants in IVR to perceive that they have higher control of the learning process (namely, learning agency) than participants in the desktop condition. Thus, we hypothesized that participants in the IVR condition would score higher for self-efficacy than participants in the desktop condition (H1). We also hypothesized that this relationship between condition and self-efficacy would be mediated by learning agency (H2). This study also aimed to answer the following research questions: How do IVR videos, compared to traditional videos, impact conceptual learning (RQ1), and, compared to traditional videos, what are the long-term effects of IVR videos on conceptual learning (RQ2) and self-efficacy (RQ3)? In the following sections, we discuss previous findings which support these hypotheses and guide the research questions, and also define the type of technology, content, and learning that past research has focused on.

Immersive Virtual Reality Videos and Computer-Generated Graphics Content

IVR can vary depending on the equipment, set up, and digital content used. The two main categories related to content are either computer-generated graphics (CG) or 360° videos. In this article, a 360° video displayed in a VR headset will be called IVR video, the same term used by Li et al. (2017). This term is also used to distinguish it from other 360° videos displayed on other screens such as a computer monitor. The main difference between IVR videos and CG-based IVR is that CG content typically has lower photographic realism and allows greater interactivity than IVR videos. In CG environments, users can usually choose avatars, select objects, and translate their head and hand position in the virtual environment while it is refreshed according to their body movement. On the other hand, IVR videos are created using real images captured with a 360° camera, which are then mapped to the inside of a sphere and surround the user seamlessly, bringing higher visual fidelity than CG content. Typically, IVR videos only update as a function of users’ head movements, as opposed to hand tracking or room-scale IVR, where people walk around physically. Both categories are immersive but vary in terms of visual fidelity and interactivity.

In academia, VR has traditionally been implemented via computer graphics that enable head translations. However, over the past few years, one can make the argument that using VR headsets to watch 360° videos has become common, perhaps as much so as computer graphics. For example, YouTube 360 is one of the most utilized educational applications by parents (Mado et al., 2022). Moreover, in the corporate training space, 360° videos are the most common use of VR given the need to scale up to millions of users (Bailenson, 2020). Finally, one of the venues where VR reaches the public most readily is at events such as film festivals and museums, and 360° videos are quite common in these contexts (Jun et al., 2020).

Compared to a traditional video (2D) streamed on a computer monitor, the IVR videos are more immersive and interactive. Because IVR videos place the user in the middle of a digital sphere and refresh the digital content in real-time according to the user’s head movements, IVR videos allow the user to feel immersed within the scene and as one who controls where to examine the scene over time (Sundar et al., 2017).

Immersion and Presence

When comparing the psychological effects of VR headsets to computer monitors, it is important to highlight the difference between two critical concepts in the IVR research field: immersion and presence. Immersion is “the objective level of sensory fidelity produced by a virtual reality system” (Laha et al., 2013, p. 529). In other words, “the more that a system delivers displays (in all sensory modalities) and tracking that preserves fidelity in relation to their equivalent real-world sensory modalities, the more it is immersive” (Slater, 2003, p. 1). The degree of immersion varies based on hardware aspects and the number of senses activated by the technology (Makransky & Petersen, 2021). Two important attributes used to quantify immersion are visual fidelity (e.g., field of view and resolution) and tracking coverage (e.g., head movement tracking; Slater, 2003).

IVR systems usually rely on HMDs that block visual stimuli around the user, replacing it with a digital visual projection inside the HMD, creating a virtual environment. HMDs are considered fully immersive, as the virtual environment surrounds the user view (Rebelo et al., 2012). The system tracks the user’s head movement and updates the virtual environment accordingly, which provides a 360° area of viewable digital content. Therefore, HMDs provide higher levels of immersion than computer monitors, along with a wider field of view and user’s head movement tracking—two critical features of IVR systems (Slater, 2003). Consequently, the unique affordance of IVR systems of immersing users in virtual environments elicits feelings of presence. Presence is a subjective experience and “refers to the phenomenon of behaving and feeling as if we are in the virtual world created by computer displays” (Sanchez-Vives & Slater, 2005, p. 332). More specifically, when experiencing presence, “your perceptual, vestibular, proprioceptive, and autonomic nervous systems are activated in a way similar to that of real-life in similar situations” (Slater, 2003, p. 2). Previous research in this area has found that increased immersion increases feelings of presence (Sanchez-Vives & Slater, 2005).

Types of Learning

There are generally three domains of learning: cognitive, affective, and psychomotor (Krathwohl & Anderson, 2009). For the cognitive domain, learning can be categorized as conceptual or procedural (Anderson et al., 2001). Conceptual learning connects the most essential elements that have already been organized and explained (Anderson et al., 2001; Krathwohl, 2002). It involves knowledge of classification, categorization, theories, models, structures, principles, and generalizations. This type of knowledge relates to the explanation of what is known and has been learned. Procedural learning refers to the use of criteria and methods in solving problems. It encompasses knowledge of specific techniques, skills, and methods, as well as the perception of how and when to use a particular procedure (Anderson et al., 2001; Krathwohl, 2002).

Most of the studies investigating the effects of VR headsets on learning used CG-based IVR (Jensen & Konradsen, 2018; Queiroz et al., 2018). Reviews from those studies suggest that IVR, when compared to other learning methods, positively impacts the affective (e.g., feelings and emotions) and psychomotor (e.g., motor skills) domains of learning. Results from Queiroz et al. (2018) indicated that students using IVR reported higher affective outcomes, such as greater confidence and more satisfaction in learning compared to other settings. All studies reviewed by Jensen and Konradsen (2018) investigating learners’ attitudes toward IVR reported that students perceived IVR as useful, exciting, and engaging. This pattern holds for studies examining IVR video as well as CG-based IVR.

Among studies comparing the effects on cognitive learning of CG-based IVR to other media, literature shows varied results. While some studies showed higher learning gains for IVR than other media (Alhalabi, 2016; Webster, 2015), some showed the opposite (Dede et al., 2000; Parong & Mayer, 2018). In addition, some studies showed no differences between conditions (Allcoat et al., 2021; Makransky et al., 2019). Among the studies comparing IVR videos to other media, there are studies that show that IVR was more effective for learning (Rupp et al., 2019; Walshe & Driver, 2019) or report no difference between conditions (Lee et al., 2017; Harrington et al., 2018); and to the best of our knowledge, none of the studies reported more significant learning gains when using other media compared to IVR.

One of the reasons indicated by scholars for lower conceptual learning when using CG-based IVR compared to other media is that the rich and interactive CG environment may cause cognitive overload and exceed the learner’s capabilities to process the conceptual information (Mayer, 2017; Moreno & Mayer, 2002; Parong & Mayer, 2018). Roussou and Slater (2017) compared the impacts of interactive versus passive CG-based IVR in evoking and sustaining conceptual change and found that passive IVR favored sustained conceptual change. These findings point out the need for further research about IVR’s role in learning, particularly when considering different levels of interaction (e.g., IVR videos vs. CG-based IVR) and its use at scale.

In summary, research comparing IVR videos to other media shows that IVR videos were more effective for learning than other media (Rupp et al., 2019; Walshe & Driver, 2019) or shows no difference between conditions (Lee et al., 2017; Harrington et al., 2018). Because there is a lot to learn about how immersion influences learning, our first research question (RQ) is how watching IVR videos on a headset or computer monitor will impact conceptual learning (RQ1).

Cognitive Objectives of Learning

Bloom (1956) proposed a hierarchical taxonomy emphasizing the cognitive objectives of learning. Mayer et al. (2001) reviewed Bloom’s taxonomy and proposed six levels of cognitive dimensions, in ascending order: remember, understand, apply, analyze, evaluate, and create. These levels are described as follows: remember refers to information recall, while understand refers to comprehension and the ability to translate what was learned to students’ own words; apply relates to the ability to use the knowledge in new situations that were not taught directly; analyze relates to the ability to break down information and discuss its elements; evaluate refers to the ability to make judgments about something, using some criteria; and create constitutes the highest cognitive level and refers to inductive thinking and the ability to create based on what the student already knows. This taxonomy is particularly useful when planning and evaluating learning activities, helping to assess lower and higher cognitive processes involved in IVR. According to Jensen and Konradsen’s (2018) review on IVR and learning, no study has investigated the effects of IVR on each higher level proposed by Bloom’s taxonomy so far (e.g., apply and create), indicating the need for this kind of investigation and the novelty of the present study.

Self-Efficacy

Self-efficacy, a concept coined by Bandura (1977) in light of the social learning theory, is defined as “people’s judgments of their capabilities to organize and execute courses of action required to attain designated types of performances” (Bandura, 1986, p. 391). In other words, self-efficacy refers to people’s beliefs that they can succeed in a specific task, topic, or environment.

Perceived self-efficacy determines how much effort people will put into a particular activity and how long they will persist in succeeding at that activity (Bandura, 1977). Usher and Pajares (2008) suggested that people who perceive themselves as being able to learn and master some activity will face challenges more positively and be more persistent at mastering that activity than people who perceive themselves as incapable of attempting or mastering the same activity.

Self-efficacy has been considered a predictor of students’ academic performance and career choices (Maddux, 2016; Zimmerman, 1995). Also, it has been shown to affect students’ career choices by influencing the type of environments and activities they engage with because people tend to avoid environments they believe they will not be able to cope with. Thus, several research studies have focused on understanding self-efficacy in educational settings (Graham et al., 2005; Schunk, 1995; Schunk & DiBenedetto, 2016).

According to Bandura (1997), the four main sources of students’ self-efficacy are (a) enactive mastery experiences, (b) vicarious experiences (observation), (c) social persuasions, and (d) physiological states. The author states that enactive mastery experiences (i.e., performance accomplishments) produce the highest and most generalized increase in perceived self-efficacy, as they provide genuine and immediate evidence that students can succeed on the task at hand (Bandura, 1997). After receiving that evidence and interpreting the results, students tend to develop beliefs about their performance, impacting students’ self-efficacy. Although middle and high school girls tend to have higher grades in STEM subjects (Science, Technology, Engineering, and Math; Voyer & Voyer, 2014), stereotypes and self-confidence may prevent them from pursuing STEM careers (O’Dea et al., 2018). In this context, Gibbons and Borders (2010) suggest that students make their career decisions between sixth and eighth grade, and intervention programs targeting girls’ STEM self-efficacy have been shown to be more effective before eighth grade (Cho et al., 2009).

Many studies highlighted the role of teachers, parents, and schoolmates on students’ self-efficacy (Schunk, 1995; Usher & Pajares, 2008). However, because of the recent increased technological adoption by students both inside and outside of schools, research has started to focus on the bidirectional relationship between technology and self-efficacy (Huang & Mayer, 2019; Kuo et al., 2014; Makransky, Mayer, et al., 2020; Meyer et al., 2019; Petersen et al., 2020; Sun & Rueda, 2011). More specifically, given IVR’s potential to tap into students’ affective dimensions of learning, more attention has been drawn toward understanding the effects of IVR on self-efficacy (Bandura, 1986; Pekrun, 2006; Plass & Kaplan, 2016; Zimmerman, 2000).

Findings from studies investigating the effects of CG-based IVR and IVR videos on self-efficacy indicate a positive effect (Makransky, Andreasen, et al., 2020; Makransky et al., 2019; Makransky, Mayer, et al., 2020; Meyer et al., 2019). Makransky et al. (2019) compared the motivational and cognitive effects of an IVR simulation, a desktop simulation (both using CG environments), and a conventional text-based safety manual for laboratory safety training. Although results indicated no significant difference on immediate retention tests between conditions, they showed a significant difference in motivation, enjoyment, and self-efficacy between IVR and the text condition. Participants in the IVR condition reported higher scores on these measures.

Francis et al. (2020) compared the effects of IVR videos to traditional lectures in self-efficacy among preclinical physician assistant students when learning about the operating room. Their results showed that participants in both groups reported higher self-efficacy after treatment than at pretest. Still, participants in the IVR video condition reported significantly higher self-efficacy than participants that attended the lecture (Francis et al., 2020).

Although studies comparing VR to computer monitors on learning show varied results, ranging from positive to negative effects of VR on learning, results of studies targeting self-efficacy usually converge, indicating a positive effect of immersion. Considering these previous studies, our first hypothesis is that participants in the IVR condition will score higher for self-efficacy than participants in the desktop condition immediately after treatment (H1).

Sense of Agency

Although the literature shows consistent results with positive effects of immersion on self-efficacy, the mechanisms underlying these effects are still unclear. We hypothesize that this effect is due to the sense of agency (the feeling of making things happen; Tapal et al., 2017) provided by the headset use. When watching IVR video, the user is virtually placed in the center of a sphere surrounded by digital content. The users can visually explore the virtual content by moving their heads, and the content is refreshed accordingly. Movement and abstract actions are preceded by the intention to act (David et al., 2008), which is based on people’s impression of the world and how the world responds to their actions (Triberti & Riva, 2016). Moreover, the brain is constantly analyzing body and environmental information to create predictive models about events and the actions’ impact on these events. These predictions can be confirmed or not: the greater the congruence between the action and the predicted consequence, the greater the sense of agency becomes (David et al., 2008).

This shows that watching IVR videos is different from watching a video on a computer monitor. People watching IVR videos can engage in motor actions of exploration and perceive themselves as able to cause changes to the environment, while people using a computer monitor can only explore the video images from the camera angle shown on the monitor. When focusing on learning, the more someone feels that they can make things happen, control their learning, and experience positive emotions about their ability to learn, the greater their feeling of self-efficacy (Bandura, 1997).

A qualitative study comprising two case studies investigating VR use for high schoolers’ science learning reported that students considered agency an essential aspect of the learning experience with this technology (McGivney, 2021). In one of the case studies, participants watched IVR videos at home using cardboard VR viewers. Data from the interviews and focus groups indicated that participants felt engaged because they could explore and control where to look in the virtual environment. We speculate that watching IVR videos positively affects self-efficacy as it allows action on the environment.

Since self-efficacy has been found to affect academic performance and influence professional choices, Multon et al. (1991) argued that finding new ways to support self-efficacy for educational, professional, and personal growth is crucial. Thus, understanding how teaching methods and in-classroom tools impact self-efficacy plays an important role in improving students’ performance. Focusing on IVR in particular, given its recent and increasing adoption in formal and informal learning environments, it is necessary to understand its role in fostering self-efficacy. Moreover, it is important to note that most studies assessing the relationship between IVR and self-efficacy have only used CG virtual environments. There is still a lot to learn about the relationship between IVR videos and self-efficacy.

Finally, although studies have reported positive effects of immersion on self-efficacy, the mechanism underlying this effect is still not clear. Makransky and Petersen (2021) proposed a theoretical framework describing the process of learning in IVR. In their framework, control factors, such as degree, immediacy, and mode of control, are the most important predictors of agency. In turn, a sense of agency seems to influence users’ engagement and sense of autonomy in VR experiences (McGivney, 2021). Thus, we predict that learning agency (i.e., the feeling that one can control how and what to learn) will mediate the relationship between the conditions and self-efficacy immediately after treatment (H2). In other words, the increased immersion and visual exploration enabled by the VR headset will positively influence the participants’ feeling of being able to control what they learn (learning agency), which will in turn have positive effects on how they feel that they are able to learn (self-efficacy).
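The mediation model in H2 can be illustrated with a product-of-coefficients calculation. The sketch below is only an illustration under stated assumptions: the eight data points are made-up toy numbers (not data from these studies), the function and variable names are hypothetical, and this section does not describe the estimation procedure actually used. Path a is the effect of condition on learning agency, path b is the effect of learning agency on self-efficacy while controlling for condition, and a × b is the indirect (mediated) effect.

```python
def centered_cross_product(u, v):
    """Sum of products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))


def simple_slope(x, y):
    """Ordinary least squares slope of y regressed on a single predictor x."""
    return centered_cross_product(x, y) / centered_cross_product(x, x)


def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y regressed on x1 and x2 (intercept handled by centering)."""
    s11 = centered_cross_product(x1, x1)
    s22 = centered_cross_product(x2, x2)
    s12 = centered_cross_product(x1, x2)
    s1y = centered_cross_product(x1, y)
    s2y = centered_cross_product(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Toy data (NOT study data): 0 = desktop condition, 1 = IVR condition.
condition = [0, 0, 0, 0, 1, 1, 1, 1]
agency_noise = [-1, 1, -1, 1, -1, 1, -1, 1]
efficacy_noise = [1, 1, -1, -1, 1, 1, -1, -1]
agency = [2 + 1.5 * x + e for x, e in zip(condition, agency_noise)]
efficacy = [1 + 0.5 * x + 2 * m + e
            for x, m, e in zip(condition, agency, efficacy_noise)]

path_a = simple_slope(condition, agency)                  # condition -> agency
direct, path_b = two_predictor_slopes(condition, agency, efficacy)
indirect = path_a * path_b                                # mediated effect
total = simple_slope(condition, efficacy)                 # condition -> efficacy

print(f"a={path_a:.2f} b={path_b:.2f} indirect={indirect:.2f} "
      f"direct={direct:.2f} total={total:.2f}")
```

With least squares estimates, the total effect equals the direct effect plus the indirect effect, so a large indirect share is what one would expect to see if learning agency carries most of the effect of condition on self-efficacy. In practice, the indirect effect is usually accompanied by a confidence interval (for example, from bootstrapping), which this sketch omits.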

Long-Term Effects of IVR for Learning

Before implementing IVR in formal education at scale, it is essential to understand its long-term effects. In a study by Stepan et al. (2017), participants learned about the ventricular system and brain using IVR or online textbooks. Learning and motivation measures were taken before, right after, and 8 weeks after the treatment sessions, and there was no significant difference in the learning assessment between conditions over time. To the best of our knowledge, this is the first study to measure the long-term effects of IVR videos on learning, and only a few have investigated the long-term effects of CG-based IVR. This lack of evidence limits the predictions about IVR videos’ long-term effects on learning and opens avenues to exploratory investigations. Experiment 2 investigates IVR and traditional videos’ long-term effects on conceptual learning and self-efficacy through the following research questions: Compared to traditional videos, what are the long-term effects of IVR videos on conceptual learning (RQ2) and self-efficacy (RQ3)?

We conducted two studies to compare the effects of educational IVR and traditional videos on conceptual knowledge, measured through multiple-choice questions and open-ended questions (the open-ended questions were built following Bloom’s taxonomy of learning objectives and assessed knowledge understanding, knowledge application, and knowledge creation), and self-efficacy. In Experiment 1, 53 eighth-grade students from an all-girls school learned about humans’ impact on the ocean through either IVR videos, using a virtual reality headset, or through traditional videos, using a computer monitor. In Experiment 2, 139 sixth-to-eighth grade students from the same school used the same instructional material and equipment from Experiment 1, and measures were taken four times (before treatment, right after two treatment sessions, and 5 weeks after treatment).

Experiment 1

Method

The Stanford University institutional review board approved procedures and materials in this study. Participation was voluntary, and written parental consent forms and assent forms were obtained from all participants and their parents/legal guardians. Researchers worked with the participants’ science teacher to define the logistics of the experiment and the wording of the questionnaires used. Power analysis with a small effect size (Cohen’s f = 0.20) and a significance level of 0.05, powered at 0.80 (Cohen, 1992), suggested a total of 52 participants (N = 52) would be required. The entirety of the study occurred before participants covered any learning material in their coursework related to topics investigated in this study—namely, ocean acidification and coral reefs.

Participants

The study was run in an all-girls school in the United States, given an existing relationship between the researchers and the science teachers, which was critical given the logistics of multiple sessions over time. Also, we chose an all-girls school because we were specifically interested in boosting self-efficacy in science learning and working with girls was important both theoretically (Cho et al., 2009) and for improving society (Gibbons & Borders, 2010). An initial sample of 55 female eighth-grade students answered the pretest using a MacBook Air computer, and two students failed to complete some part of the study and were excluded from analyses. The final sample consisted of 53 female participants.

Materials and Apparatus

The instructional material consisted of two 6-min long videos: The Crystal Reef and Coral Compass, developed by the Virtual Human Interaction Lab at Stanford University. We used immersive 360° videos in the IVR condition to keep the experimental and control conditions as similar as possible. To investigate the influence of immersion on the learning outcomes, we decided that IVR videos would allow the two conditions to be similar in terms of time, pace, and content, while varying the level of immersion. The development of each video took years and was the result of interdisciplinary collaborations with marine scientists as domain experts and learning scientists. Moreover, compared to many 360° videos, these were designed to minimize distraction (i.e., avoiding voiceover and textual cues during important visual events) and motion sickness (i.e., no movement of the camera to disturb the vestibular system). Both videos were official selections at the Tribeca Film Festival, and each has been used in hundreds of classrooms, museums, and other informal learning institutions across the planet. They are available at the following links: Coral Compass (https://stanfordvr.com/coralcompass/); The Crystal Reef (https://stanfordvr.com/the-crystal-reef/).


Both videos depict how human actions have been negatively affecting the ocean. The Crystal Reef focuses on ocean acidification and how carbon dioxide (CO2) from human emissions negatively impacts the ocean. The video presents a female scientist diving into the Mediterranean Sea, in a site where natural underwater vents spew CO2 into the water. The high concentration of CO2 in the water decreases the water pH, resulting in more acidic water and reduced biodiversity. Coral Compass focuses on how human activities have impacted the coral reefs in Palau, a small island in the Western Pacific. A male scientist narrates the video. It presents how tourism and land practices have affected the coral reefs’ health and shows how the country’s government has been acting to reduce the negative impact on the ocean. Screenshots of these videos can be seen in Figure 1 and Figure 2.

Figure 1

Coral Compass Video Images
Note. (a) Tourists diving in the coral reef in Palau. (b) Researchers and Palauan policymakers discussing conservation measures. (c) Coral reefs affected by sediments from bad land practices.

Figure 2

The Crystal Reef Video Images
Note. CO2 = carbon dioxide. (a) Healthy coral reef. (b) Volcanic vents releasing CO2 into the water. (c) Unhealthy coral reef with reduced biodiversity.

Estimated field of view, resolution, visual exploration, and occlusion of the surrounding environment are presented in Table 1. The HMD used in this study is considered more immersive than the computer used as the HMD has a wider field of view (110°, against approximately 30° on a computer monitor) and a larger area of visual exploration (360° area of exploration, against 13.3″ on the computer monitor; Cummings & Bailenson, 2016). Also, the HMD offers complete occlusion of the physical environment and tracks participants’ head movements, placing them in the center of a virtual sphere that projects the video in 360°, contributing to increased immersion. Depending on which condition participants were randomly assigned to, they either watched both videos on the screen of a 13.3″ MacBook Air (desktop condition) or on a Lenovo Mirage Solo VR headset (IVR condition).

Table 1

Devices Features

| Feature | HMD (Lenovo Mirage Solo) | Computer monitor (MacBook Air) | Relevant differences influencing immersion |
| --- | --- | --- | --- |
| Field of view | 110° | ~30° | A wider field of view increases immersion (Cummings & Bailenson, 2016). |
| Resolution | 2,560 × 1,440 | 2,560 × 1,600 | |
| Area for visual exploration (field of regard) | 360° | 13″ screen | The 360° area of exploration refreshed according to HMD users’ head movements allows them to act on the digital content (Sundar et al., 2017). |
| Occlusion of physical classroom | Complete | 11.97″ × 8.36″ | The physical environment occlusion HMD provides increases the immersion and the feeling of being part of the scene (Kim et al., 2017). |

Note. HMD = head-mounted displays.
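The approximate 30° field of view listed for the computer monitor can be sanity-checked with basic trigonometry. The sketch below is an illustration, not the authors' calculation: it derives the screen width from the 13.3″ diagonal and the 2,560 × 1,600 resolution reported above, and it assumes a viewing distance of 21 inches, a value that does not appear in the article and was chosen here only because it is a plausible laptop viewing distance. The function names are hypothetical.

```python
import math


def screen_width_inches(diagonal_in, pixels_wide, pixels_high):
    """Physical width of a screen given its diagonal and pixel aspect ratio."""
    return diagonal_in * pixels_wide / math.hypot(pixels_wide, pixels_high)


def horizontal_fov_degrees(width_in, viewing_distance_in):
    """Horizontal angle subtended by a flat screen viewed head-on from its center line."""
    return math.degrees(2 * math.atan(width_in / (2 * viewing_distance_in)))


# ASSUMPTION: 21 inches is a hypothetical viewing distance, not a value from the article.
ASSUMED_VIEWING_DISTANCE_IN = 21.0

width = screen_width_inches(13.3, 2560, 1600)
fov = horizontal_fov_degrees(width, ASSUMED_VIEWING_DISTANCE_IN)
print(f"screen width: {width:.2f} in, horizontal field of view: {fov:.1f} degrees")
```

Under this assumed distance the screen subtends roughly 30° horizontally, which is consistent with the estimate in Table 1. Sitting closer or farther changes the angle, which is presumably why the table reports the value as approximate.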

Participants in both conditions listened to the audio using headphones and were exposed to the same content (i.e., words and images). The desktop versions of each IVR video were created using a screen capture software (Fraps) from a person going through the IVR experience. The length and sequence of the experience were the same for both conditions. While participants in the IVR condition could turn their heads and have the digital content refreshed accordingly, participants in the desktop condition could only see what was on the computer screen, and no movement on the camera angle was possible. We chose this condition over the interactive desktop video condition to simulate what people typically do in classrooms as a control condition.

Design and Procedure

Participants were invited by their science teacher to participate in this study. Participants who agreed to participate and received consent from their parents answered an online questionnaire investigating their knowledge, self-efficacy, concerns, and opinions about ocean acidification during their science class. Three weeks later, the researchers visited the school to conduct the treatment sessions, during which participants were randomly assigned to the IVR condition (n = 28) or the desktop condition (n = 25).

In the IVR condition, participants watched the IVR video versions of The Crystal Reef and Coral Compass using a Lenovo Mirage Solo VR headset and headphones. In contrast, participants in the desktop condition watched the traditional versions of the videos on a MacBook Air laptop with headphones in a separate room. The treatment session and posttest took place during a 90-min science class period. To account for order effects, half of the participants in each condition watched The Crystal Reef first while the other half started with Coral Compass.

To mitigate fatigue, after participants watched the first video, the researcher asked them to remove the headset and headphones (IVR condition) or headphones only (desktop condition) and answer three questions assessing presence, three open-ended questions, and three multiple-choice questions assessing learning on a laptop computer. When participants finished that survey, the researchers helped participants in the IVR condition put the headset on again and started playing the second video. After the second video, participants again answered the three questions assessing presence, three open-ended questions, three multiple-choice questions assessing learning, three additional questions about the environmental concern, knowledge about ocean acidification, and twelve questions to measure self-efficacy. In this article, we will present and discuss the multiple-choice and open-ended learning assessment measures and presence and self-efficacy measures to focus on the effects of IVR videos on learning and self-efficacy.

Finally, after all students completed the experiment, they were debriefed about the study, and researchers answered students’ questions.

Measures

Presence

In order to confirm the difference in immersion between the two conditions, presence was used as a manipulation check. Three items assessing feelings of presence using a 5-point Likert scale (1 = not at all, 5 = extremely) were adapted from Nowak and Biocca (2003). These items were as follows: “To what extent did you feel like you were inside the virtual experience?” “To what extent did you feel immersed in the virtual experience?” and “How much did it feel as if you visited another place?” Participants answered these three questions during the postquestionnaire immediately after watching each of the videos. Reliability for the presence questionnaire was ω = 0.96.

Open-Ended Learning Assessment Questions

Six open-ended questions were created to assess conceptual learning based on the critical thinking theory and revision of Bloom’s taxonomy. They focused on knowledge understanding, knowledge application, and knowledge creation (Krathwohl & Anderson, 2009; Mayer et al., 2001). These questions are intended to evaluate how immersion would affect learning according to each cognitive level. The questions assessing understanding measured how they remembered the information they received in the videos. The questions focusing on knowledge application required them to apply the information they received in new contexts or purposes, and the questions assessing knowledge creation prompted participants to propose solutions for environmental problems. For example, one of the questions assessing knowledge creation was as follows: “Propose strategies to increase engagement of the general public with ocean acidification.” These questions can be found in Appendix A.

Two researchers, blind to condition, used a rubric developed according to Saxton et al. (2012) to score each answer on a scale from 0 to 5 points. The average rate of agreement between researchers was 89.27% at pretest and 88% at posttest. When the researchers initially disagreed on a score, they discussed it until they reached a consensus, and the score that both agreed upon was used. The six questions were grouped according to their corresponding Bloom taxonomy level (knowledge understanding, knowledge application, and knowledge creation), with two questions making up each composite.
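The article does not state how the rate of agreement was computed. One plausible reading is simple percent agreement, that is, the share of answers on which both raters assigned the same rubric score. The sketch below illustrates that reading only; the function name and the scores are hypothetical and are not data from the study.

```python
def percent_agreement(rater_a, rater_b):
    """Share of responses (in %) on which two raters gave the identical rubric score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of responses")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)


# Hypothetical 0-5 rubric scores for eight responses (illustrative only).
scores_a = [3, 2, 5, 0, 4, 1, 3, 2]
scores_b = [3, 2, 4, 0, 4, 1, 3, 1]

agreement = percent_agreement(scores_a, scores_b)
print(f"{agreement:.1f}% agreement")  # the raters match on 6 of 8 responses
```

Percent agreement does not correct for agreement expected by chance, which is one reason the consensus step described above matters for the final scores.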

Multiple-Choice Learning Assessment

Four questions about ocean acidification and coral reefs were adapted from the International Ocean Literacy Survey (Fauville et al., 2019), and a marine biologist expert created two questions about Palauan government actions regarding coral reefs. These questions aimed to evaluate participants’ conceptual knowledge about the topic. They are not aligned with Bloom’s taxonomy of learning objectives. Participants were given a point for every correct answer and received zero points for incorrect answers. Participants’ final score was the sum of all of their correct answers. For example, one of the questions was: “Select the land practice that can greatly help protect the coral reefs in Palau,” and the participant had to select the correct answer among four options.
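The scoring rule described above (one point per correct answer, zero otherwise, summed over items) can be sketched as follows. The answer key, item labels, and responses are hypothetical. Because the tables report multiple-choice values between 0 and 1, the sketch also shows the proportion of correct answers, on the assumption that the summed score was rescaled that way.

```python
def score_multiple_choice(responses, answer_key):
    """Number of correct answers: 1 point per match with the key, 0 otherwise."""
    return sum(1 for item, correct in answer_key.items() if responses.get(item) == correct)


# Hypothetical key for six items (four adapted items plus two expert-written items).
answer_key = {"q1": "b", "q2": "d", "q3": "a", "q4": "c", "q5": "a", "q6": "b"}
responses = {"q1": "b", "q2": "a", "q3": "a", "q4": "c", "q5": "d", "q6": "b"}

total = score_multiple_choice(responses, answer_key)
proportion = total / len(answer_key)  # assumed rescaling to the 0-1 range
print(total, round(proportion, 2))
```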

Self-Efficacy and Motivation to Learn Science

Twelve questions from Tuan et al. (2005) and Pekrun et al. (2005, 2011) were adapted, using a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree), to investigate the effects of immersion on self-efficacy. For example, one of the questions was “Whether the science content is difficult or easy, I am sure that I can understand it.” Negatively framed questions were reverse-coded. Reliability for self-efficacy and motivation to learn was ωpre = 0.79 and ωpost = 0.85. The model fit was marginal, root-mean-square error of approximation (RMSEA) = 0.10, 90% confidence interval (CI) [0.06, 0.15], root-mean-square of residuals (RMSR) = 0.13, χ2(53) = 85.33, p < .01. The exact wording of the questions in the self-efficacy questionnaire can be found in Appendix A.
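Reverse-coding on a 5-point scale maps 1 to 5, 2 to 4, and so on, so that higher values always indicate higher self-efficacy. A minimal sketch follows, assuming the scale score is the mean of the items after reverse-coding (the article does not say whether a mean or a sum was used); item names and values are hypothetical.

```python
def reverse_code(score, scale_min=1, scale_max=5):
    """Flip a Likert response so that the scale direction is reversed."""
    if not scale_min <= score <= scale_max:
        raise ValueError("score outside the Likert range")
    return scale_max + scale_min - score


def scale_mean(item_scores, reversed_items):
    """Mean of all items after reverse-coding the negatively framed ones."""
    recoded = [
        reverse_code(value) if name in reversed_items else value
        for name, value in item_scores.items()
    ]
    return sum(recoded) / len(recoded)


# Hypothetical responses; "se2" stands in for a negatively framed item.
answers = {"se1": 4, "se2": 2, "se3": 5}
self_efficacy = scale_mean(answers, reversed_items={"se2"})
print(round(self_efficacy, 2))
```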

The data and analytic methods are publicly available at https://osf.io/mpk2j/ (Queiroz et al., 2021).

Results and Discussion

All the analyses were carried out using RStudio Version 1.1.463, except for power analyses, for which we used the G*power application. Analysis of covariance (ANCOVA) models predicting each dependent variable of interest with condition as predictor were run. The models were controlled for the pretest scores of the learning (RQ1) and self-efficacy (H1 and RQ1) measures, to address the significant difference found between conditions at pretest for these variables, multiple-choice learning assessment: t(50.69) = 3.16, p = .003, d = 0.86, 95% CI [0.29, 1.42]; self-efficacy: t(44.49) = 2.27, p = .028, d = 0.63, 95% CI [0.08, 1.19]. Means and SDs for each variable are shown in Table 2.

Table 2

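Means and Standard Deviations at Pretest and Posttest

| Measure | Desktop (n = 25) pretest M | Desktop pretest SD | Desktop posttest M | Desktop posttest SD | IVR (n = 28) pretest M | IVR pretest SD | IVR posttest M | IVR posttest SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Multiple-choice | 0.37 | 0.17 | 0.75 | 0.21 | 0.54 | 0.20 | 0.87 | 0.15 |
| Understanding | 0.96 | 0.58 | 2.46 | 1.03 | 0.88 | 0.44 | 2.57 | 0.74 |
| Application | 0.80 | 0.74 | 2.12 | 0.94 | 0.86 | 0.56 | 2.20 | 0.81 |
| Creation | 1.54 | 0.64 | 3.04 | 0.98 | 1.45 | 0.44 | 3.10 | 0.83 |
| Self-efficacy | 3.20 | 0.57 | 3.12 | 0.71 | 3.52 | 0.43 | 3.65 | 0.50 |
| Presence | | | 2.21 | 0.79 | | | 4.05 | 0.58 |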

For the models investigating learning (RQ1), self-efficacy pretest scores were included because self-efficacy has been shown to impact future task performance (Bandura, 1986). For the model investigating self-efficacy (H1), given that self-efficacy has been shown to be influenced both by previous self-efficacy beliefs and by current task performance (Bandura, 1986; Bernacki et al., 2015), the learning scores at posttest were included in the model. Finally, each model was tested for multicollinearity using variance inflation factors. While there is no consensus on the variance inflation factor value that indicates a concern about multicollinearity (Vatcheva et al., 2016), values below five usually suggest the absence of multicollinearity. Predictors from all models testing H1 and investigating RQ1 showed variance inflation factor values lower than two, suggesting no multicollinearity. Tables with the summary of the models we ran are available in the Supplemental Materials.
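For a predictor j, the variance inflation factor is 1 / (1 − R²), where R² comes from regressing predictor j on the other predictors. When a model has only two continuous predictors, that R² is simply their squared correlation. The sketch below covers only this two-predictor case and uses made-up values; the models reported here contained more predictors, for which a regression-based computation would be needed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical covariate values (illustrative only, not study data).
pretest_learning = [1, 2, 3, 4, 5, 6]
pretest_self_efficacy = [3, 1, 2, 6, 4, 5]

vif = vif_two_predictors(pretest_learning, pretest_self_efficacy)
print(round(vif, 2))
```

A value of 1 means the predictors are uncorrelated; the value grows without bound as the correlation approaches ±1.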

A significant difference between conditions for presence was found at posttest, t(43.55) = −9.59, p < .001, d = −2.69, 95% CI [−3.43, −1.93], with participants in the IVR condition scoring higher than participants in the desktop condition after treatment. This indicates that the manipulation of immersion was successful.
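The Welch test statistic, its Welch-Satterthwaite degrees of freedom, and Cohen's d can be approximated from summary statistics alone. The sketch below uses the presence values listed in Table 2, read here as desktop M = 2.21, SD = 0.79, n = 25 and IVR M = 4.05, SD = 0.58, n = 28. It assumes that d was computed with the pooled standard deviation, which the article does not state.

```python
import math


def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch t, Welch-Satterthwaite df, and pooled-SD Cohen's d from group summaries."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2          # squared standard errors per group
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    return t, df, d


# Desktop first, IVR second, so negative values mean higher presence in IVR.
t, df, d = welch_from_summary(2.21, 0.79, 25, 4.05, 0.58, 28)
print(f"t({df:.2f}) = {t:.2f}, d = {d:.2f}")
```

Because the inputs are rounded to two decimals, the output is close to, but not identical with, the reported values (t near −9.6, df near 43.7, d near −2.7).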

After controlling for the self-efficacy pretest differences, a significant, medium-sized effect of condition was found on self-efficacy posttest scores, F(1, 46) = 6.62, p = .013, ηp2 = 0.13, 90% CI [0.02, 0.28], with participants in the IVR condition reporting higher self-efficacy than those in the desktop condition after treatment. Post hoc power analysis indicated that this test achieved 0.79 power. These results support H1.
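Partial eta squared can be recovered from an F statistic and its degrees of freedom as ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this standard identity to the self-efficacy result above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F(1, 46) = 6.62 for the effect of condition on self-efficacy in Experiment 1.
eta_p2 = partial_eta_squared(6.62, 1, 46)
print(round(eta_p2, 2))
```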

One possible explanation for this finding is that feelings of greater control of the situation can contribute to the higher self-efficacy perception in IVR (Johnson-Glenberg, 2018; Hite et al., 2019). Also, a sense of agency and the intention to act have been shown to involve causal efficacy (David et al., 2008; Schlosser, 2012) and may have contributed to increased self-efficacy in the IVR condition. While we have no formal way to test this mechanism with the current data, we hypothesize that acting on the environment while using a headset to watch the IVR video could have increased participants’ predictions about their actions and the environment responses. As these predictions were confirmed throughout the experience, they increased the sense of agency, which was positively associated with self-efficacy. However, more studies are needed to investigate this association between a sense of agency and self-efficacy when using IVR.

In addition, this positive impact of IVR on self-efficacy may also be due to the positive effects that IVR has shown in affective aspects of learning such as engagement and enjoyment (Bailenson et al., 2008; Makransky & Lilleholt, 2018; Parong & Mayer, 2018). These aspects are known to influence self-efficacy positively (Bandura, 1995) and therefore may mediate the positive impact of IVR on self-efficacy. However, this study has not focused on these variables, and future studies should consider their relationship with self-efficacy and learning in IVR video contexts.

Given that the students had never experienced IVR videos in the classroom, the novelty of IVR may have been positively associated with self-efficacy. In addition, more attention was given to the students in the IVR condition: The researchers spent a few minutes before each session explaining to participants how to use the VR headset and helping them put it on. This procedure could have impacted their feelings of appreciation. As physiological states are one of the sources of self-efficacy (Bandura, 1997), those factors may have positively influenced students’ physiological states, which could have been associated with their self-efficacy perception. These possibilities should be considered in future research.

The models investigating the effects of treatment on learning (RQ1) showed no significant effect of condition on any variable assessing learning, multiple-choice assessment: F(1, 49) = 0.03, p = .855, ηp2 < 0.01, 90% CI [0, 0.04]; knowledge understanding: F(1, 49) = 0.06, p = .814, ηp2 = 0.001, 90% CI [0, 0.05]; knowledge application: F(1, 49) = 0.01, p = .915, ηp2 < 0.001, 90% CI [0, 0.02]; or knowledge creation: F(1, 49) = 0.07, p = .797, ηp2 = 0.001, 90% CI [0, 0.05]. This finding suggests that more immersion did not lead to better learning in this experiment and corroborates previous studies that compared conceptual learning outcomes between IVR videos and desktops and found no significant difference in learning between these media (Harrington et al., 2018).
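An ANCOVA test of condition can be viewed as a comparison of two nested regressions: one with the covariate only and one with the covariate plus condition. The sketch below illustrates that logic with a single covariate and simulated data. It is not the authors' analysis, which was run in R and included additional covariates; every name and number in it is hypothetical.

```python
import random


def ols_rss(rows, y):
    """Residual sum of squares of a least-squares fit (rows already contain the intercept)."""
    k = len(rows[0])
    # Normal equations (X'X) beta = X'y, solved by Gauss-Jordan elimination.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def ancova_condition_f(condition, pretest, posttest):
    """F test for condition, controlling for pretest, via nested model comparison."""
    full = [[1.0, c, p] for c, p in zip(condition, pretest)]
    reduced = [[1.0, p] for p in pretest]
    rss_full, rss_reduced = ols_rss(full, posttest), ols_rss(reduced, posttest)
    df_error = len(posttest) - 3                      # intercept, condition, covariate
    ss_effect = rss_reduced - rss_full
    f_value = ss_effect / (rss_full / df_error)
    eta_p2 = ss_effect / rss_reduced                  # SS_effect / (SS_effect + SS_error)
    return f_value, df_error, eta_p2


# Simulated data with a built-in condition effect (illustrative only).
rng = random.Random(1)
condition = [float(i % 2) for i in range(60)]         # 0 = desktop, 1 = IVR
pretest = [rng.gauss(3.3, 0.5) for _ in range(60)]
posttest = [1.0 + 0.7 * p + 0.6 * c + rng.gauss(0, 0.3) for p, c in zip(pretest, condition)]

f_value, df_error, eta_p2 = ancova_condition_f(condition, pretest, posttest)
print(f"F(1, {df_error}) = {f_value:.2f}, partial eta squared = {eta_p2:.2f}")
```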

Experiment 2

In the second experiment, the sample was enlarged to 139 participants from sixth to eighth grade at the same school as in Experiment 1. A matched random assignment (Cozby et al., 1977) was used to avoid significant differences between conditions at pretest. Questionnaires measuring learning agency were added to this study to investigate possible mechanisms involved in the immersion effects on self-efficacy. The self-efficacy questionnaire used the complete version from Tuan et al. (2005). Also, the treatment sessions were increased to two, in which participants watched one video per session, and measures were taken after each session. Finally, participants took a delayed posttest, 5 weeks after the treatment sessions, to evaluate the long-term effects of the intervention. Consequently, in this experiment, measures were taken at four different times: pretest, postfirst (5 weeks after pretest), postsecond (1 or 2 days after postfirst), and delayed posttest (5 weeks after postsecond). The Experiment 2 procedure is shown in Figure 3.

Figure 3

Experiment 2 Procedure
Note. IVR = immersive virtual reality.

Method

Participants

Participants were recruited from the same middle school as in Experiment 1. Participation was voluntary. The initial sample consisted of 164 female students from sixth to eighth grade (M = 7, SD = 0.83), aged 11 to 14 years (M = 12.27, SD = 0.91). Twenty-five students failed to complete some part of the study and were excluded from the analysis. The final sample consisted of 139 participants.

Materials and Apparatus

The instructional materials were the same as in Experiment 1: the videos The Crystal Reef and Coral Compass. Participants in the IVR condition used the same VR headsets as in Experiment 1. Participants in the desktop condition watched the same traditional versions of the videos and used the same computer model as in Experiment 1.

Design and Procedure

Similar to Experiment 1, all participants answered a pretest questionnaire on a laptop computer during their science class. This questionnaire had the same questions used in Experiment 1 to assess learning (multiple-choice and open-ended). The questionnaire also included 18 questions assessing self-efficacy (from Tuan et al., 2005) and five questions assessing learning agency (developed by the authors). We used the complete version of the self-efficacy questionnaire from Tuan et al. (2005) in Experiment 2, aiming to increase the questionnaire reliability and model fit (Stanley & Edwards, 2016). Reliability was good (ωpre = 0.85, ωpostfirst = 0.87, ωpostsecond = 0.90, ωdelayed = 0.89), as well as the model fit, RMSEA = 0.06, 90% CI [0.036, 0.077], RMSR = 0.05, χ2(139) = 148.66, p < .01.

Five weeks later, the researchers visited the school to conduct the treatment sessions. Participants were assigned to either IVR (n = 73) or desktop condition (n = 66), after completing the pretest, through random matched assignment (Cozby et al., 1977).
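The article does not describe which variables were used for matching or how the matching was carried out. One common way to implement matched random assignment is to rank participants on a pretest score, form pairs of adjacent participants, and randomly split each pair across the two conditions. The sketch below illustrates that approach under those assumptions; identifiers and scores are hypothetical.

```python
import random


def matched_random_assignment(pretest_scores, seed=None):
    """Pair participants with adjacent pretest scores and randomize within each pair."""
    rng = random.Random(seed)
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    assignment = {}
    for start in range(0, len(ranked), 2):
        conditions = ["IVR", "desktop"]
        rng.shuffle(conditions)                       # random order within the pair
        # With an odd sample, the last participant gets a randomly chosen condition.
        for participant, cond in zip(ranked[start:start + 2], conditions):
            assignment[participant] = cond
    return assignment


# Hypothetical pretest scores (illustrative only).
scores = {"p1": 0.17, "p2": 0.50, "p3": 0.33, "p4": 0.67, "p5": 0.33, "p6": 0.83, "p7": 0.50}
assignment = matched_random_assignment(scores, seed=7)
print(assignment)
```

Pairing before randomizing keeps the pretest distributions of the two groups close, which is the purpose stated for this design choice.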

Each treatment session took place during a 45-min science class. Half of the participants in each condition watched the video The Crystal Reef in the first treatment session, while the other half watched Coral Compass. Researchers helped participants in the IVR condition put the headsets on, start the videos, and remove the headsets after participants had finished watching the video. The two treatment sessions were 1 to 2 days apart and followed the same procedure, in which participants watched one of the videos in the first session (postfirst) and the other video in the second session (postsecond). After watching each video, participants used a computer to answer the same questions used in Experiment 1, assessing presence and learning (multiple-choice and open-ended), and also answered questions assessing self-efficacy (from Tuan et al., 2005) and learning agency.

Five weeks after the treatment sessions, all students answered the delayed posttest questionnaire during a 45-min science class. The same questions used in the pretest were used at the delayed posttest. After participants completed this questionnaire, they were debriefed about the study, and a researcher answered students’ questions.

Measures

As mentioned before, items assessing presence (manipulation check; ωpostfirst = 0.92, ωpostsecond = 0.94) and learning (multiple-choice and open-ended) were the same as in Experiment 1.

Learning Agency

To investigate participants’ subjective perception of agency related to learning and visual exploration of the digital content, questions were built on theories of agency, efficacy, and embodiment (Baker et al., 2003; David et al., 2008; Gonzalez-Franco & Peck, 2018; Johnson-Glenberg, 2018; Judge et al., 2002; Schlosser, 2012; Spengler et al., 2009). Five questions using a 5-point Likert scale were created (ωpre = 0.43, ωpostfirst = 0.60, ωpostsecond = 0.61 and ωdelayed = 0.63). For example, one of the questions was as follows: “I learn more when I am the one selecting what I am going to learn” (1 = strongly disagree, 5 = strongly agree; for the complete questionnaire, see Appendix B).

Self-Efficacy and Motivation to Learn Science

In order to investigate the effects of condition on self-efficacy, 18 questions from Tuan et al. (2005) were adapted to assess participants’ self-efficacy and motivation to learn (ωpre = 0.85, ωpostfirst = 0.87, ωpostsecond = 0.90, and ωdelayed = 0.89) using a 5-point Likert scale. Negatively framed questions were reverse-coded. The exact wording of the questions in the self-efficacy questionnaire can be found in Appendix B.

Results and Discussion

Welch two-sample t tests were run to compare the pretest scores between conditions and the presence scores between conditions at postfirst and postsecond. ANCOVA models were run to test H1 and to investigate the research questions. Similar to Experiment 1, the models predicting treatment effects on self-efficacy (H1 and RQ3) included condition as the predictor. They controlled for the previous self-efficacy score, the learning scores at the time being analyzed, and the grade participants were attending. These models also included learning agency at the time being analyzed. The models predicting learning after treatment (RQ1) included condition as the predictor and controlled for learning scores at pretest, previous self-efficacy scores, and grade. The models investigating long-term effects of treatment on learning (RQ2) included condition as the predictor. They controlled for grade and for the learning scores and self-efficacy at the second treatment session. Each model was tested for multicollinearity using variance inflation factors. In all models, all variables showed variance inflation factor values below two, indicating the absence of multicollinearity. Tables with the summaries of the models we ran are available in the Supplemental Materials. Descriptive statistics for each dependent variable at each measurement time are shown in Table 3.

Table 3

Means and Standard Deviations at Pretest, Postfirst, and Postsecond Tests

| Variable | Pretest M | Pretest SD | Postfirst M | Postfirst SD | Postsecond M | Postsecond SD | Delayed M | Delayed SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Desktop (n = 66) | | | | | | | | |
| Multiple-choice | 0.40 | 0.20 | 0.70 | 0.30 | 0.78 | 0.32 | 0.64 | 0.24 |
| Understanding | 0.93 | 0.64 | 2.07 | 1.24 | 2.32 | 0.99 | 1.80 | 0.87 |
| Application | 0.71 | 0.61 | 1.94 | 1.76 | 1.92 | 0.98 | 1.61 | 0.87 |
| Creation | 1.61 | 0.80 | 2.02 | 0.83 | 1.86 | 0.88 | 1.75 | 0.65 |
| Self-efficacy | 3.83 | 0.42 | 3.75 | 0.49 | 3.73 | 0.53 | 3.73 | 0.50 |
| Learning agency | 3.56 | 0.50 | 3.33 | 0.41 | 3.32 | 0.44 | 3.45 | 0.56 |
| Presence | | | 2.20 | 0.90 | 2.17 | 0.92 | | |
| IVR (n = 73) | | | | | | | | |
| Multiple-choice | 0.41 | 0.20 | 0.73 | 0.30 | 0.77 | 0.29 | 0.63 | 0.22 |
| Understanding | 0.93 | 0.52 | 2.03 | 0.99 | 2.34 | 0.97 | 1.62 | 0.93 |
| Application | 0.91 | 0.70 | 2.08 | 1.14 | 2.18 | 1.00 | 1.75 | 0.86 |
| Creation | 1.55 | 0.80 | 2.27 | 0.93 | 2.26 | 0.93 | 1.84 | 0.67 |
| Self-efficacy | 3.92 | 0.46 | 3.92 | 0.44 | 3.90 | 0.47 | 3.81 | 0.52 |
| Learning agency | 3.55 | 0.49 | 3.46 | 0.43 | 3.42 | 0.38 | 3.58 | 0.54 |
| Presence | | | 3.80 | 0.80 | 4.02 | 0.63 | | |

Note. IVR = immersive virtual reality; SD = standard deviation.

No significant differences between conditions were found at pretest for the variables measured, as shown in Table 4. Also, there were no significant differences between conditions for age, t(134.83) = 0.301; p = .764; d = 0.05, 95% CI [−0.28, 0.38], and grade they were in, t(134.66) = 0.203; p = .840; d = 0.03, 95% CI [−0.30, 0.37].

Table 4

Means and Standard Deviations at Pretest, t-Test Results Between Conditions, and Corresponding Effect Size

| Variable | IVR M | IVR SD | Desktop M | Desktop SD | t | p | Cohen’s d | 95% CI LL | 95% CI UL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Multiple-choice | 0.41 | 0.20 | 0.40 | 0.20 | 0.34 | 0.735 | 0.06 | −0.28 | 0.39 |
| Self-efficacy | 3.92 | 0.46 | 3.83 | 0.42 | 1.19 | 0.235 | 0.20 | −0.13 | 0.54 |
| Learning agency | 3.55 | 0.49 | 3.56 | 0.50 | −0.15 | 0.882 | −0.03 | −0.36 | 0.31 |

Note. IVR = immersive virtual reality; CI = confidence interval; SD = standard deviation; LL = lower limit; UL = upper limit.

A significant difference between conditions was found for presence at postfirst, t(130.87) = −11.01, p < .001, d = −1.88, 95% CI [−2.28, −1.48], and postsecond, t(112.95) = −13.705, p < .001, d = −2.37, 95% CI [−2.80, −1.93], with participants in the IVR condition reporting higher feelings of presence than participants in the desktop condition after treatment. These results indicate that the treatment manipulation was successful.

No significant effect of the condition was found on self-efficacy at postfirst, F(1, 129) = 2.37, p = .126, ηp2 = 0.02, 90% CI [0, 0.07], and postsecond, F(1, 129) = 0.01, p = .757, ηp2 < 0.001, 90% CI [0, 0.3], thus not supporting H1. Bernacki et al. (2015) investigated the stability and change of high school students’ self-efficacy during learning math, measuring self-efficacy four times while students were solving math problems. They found that self-efficacy varied over the course of the learning task. Their results showed that compared to the first self-efficacy measure taken, participants reported higher self-efficacy at the second time measured but lower at the third and fourth time measured, which is aligned with our findings. Bandura (1986) stated that learners perceive their efficacy based on past evidence of their performance completing a task. We speculate that their overall self-efficacy perception influenced participants’ self-efficacy scores at pretest. However, after treatment sessions 1 and 2, self-efficacy could have been influenced by their performance in the learning assessments, given that they answered the self-efficacy questionnaire after answering the learning assessment. Also, they could have compared their answers at pretest and the videos’ information and realized possible mistakes made at pretest, contributing to a decrease in self-efficacy after the treatment sessions.

The models investigating the immediate effects of treatment on learning (RQ1) showed no condition effects on the multiple-choice assessment at postfirst or postsecond, postfirst: F(1, 133) = 0.10, p = .758, ηp2 < 0.001, 90% CI [0, 0.03], postsecond: F(1, 133) = 0.25, p = .617, ηp2 < 0.001, 90% CI [0, 0.02]. The analyses on each cognitive process revealed a significant effect of condition on knowledge creation at postsecond, F(1, 133) = 17.15, p < .001, ηp2 = 0.05, 90% CI [0.01, 0.13]. Post hoc power analysis showed that this test achieved 0.77 power. This is a novel finding in the VR research field and corroborates Bell and Fogler’s (1995) theoretical statement that virtual reality can address higher cognitive levels of learning, such as analysis and evaluation. Also, because IVR videos are less distracting and impose less cognitive load to interact with than CG-based IVR experiences, they may free cognitive resources that support higher cognitive levels of learning and conceptual learning.

Although there were no effects of condition on self-efficacy, we followed Rucker et al.’s (2011) recommendation to run a mediation analysis when the relationship is of research interest. Causal mediation analysis using nonparametric bootstrap confidence intervals with the percentile method was run over postfirst and postsecond scores, with condition as the independent variable, learning agency as the mediator, self-efficacy as the outcome, and learning as a control. Results showed that learning agency mediated the relationship between condition and self-efficacy, average causal mediation effect (ACME) = 0.031, p = .014, 95% CI [0, 0.06]; average direct effect (ADE) = 0.11, p = .050, 95% CI [0, 0.22], supporting H2. Although we investigated agency as it relates to learning specifically, this result is aligned with the cognitive affective model of immersive learning theoretical framework proposed by Makransky and Petersen (2021), in which agency would be an IVR affordance between immersion and self-efficacy.
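The logic of a bootstrap mediation test can be illustrated with the simpler product-of-coefficients approach: a is the effect of condition on the mediator, b is the effect of the mediator on the outcome with condition held constant, and a × b is the indirect effect. For linear models without an interaction term, this product coincides with the ACME. The sketch below is a simplified stand-in, not the causal mediation procedure used here (it omits the learning control), and it runs on simulated data with hypothetical names throughout.

```python
import random


def indirect_and_direct(x, m, y):
    """Indirect effect a*b and direct effect c' from the fits m ~ x and y ~ x + m."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    smm = sum((b - mean_m) ** 2 for b in m)
    sxm = sum((a - mean_x) * (b - mean_m) for a, b in zip(x, m))
    sxy = sum((a - mean_x) * (c - mean_y) for a, c in zip(x, y))
    smy = sum((b - mean_m) * (c - mean_y) for b, c in zip(m, y))
    a_path = sxm / sxx                                # slope of m on x
    det = sxx * smm - sxm ** 2
    b_path = (sxx * smy - sxm * sxy) / det            # slope of y on m, given x
    c_prime = (smm * sxy - sxm * smy) / det           # slope of y on x, given m
    return a_path * b_path, c_prime


def percentile_bootstrap_ci(x, m, y, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect, resampling participants."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            indirect_and_direct([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])[0]
        )
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated data with a built-in indirect effect of 0.3 * 0.5 = 0.15 (illustrative only).
rng = random.Random(42)
condition = [float(i % 2) for i in range(300)]        # 0 = desktop, 1 = IVR
agency = [3.3 + 0.3 * c + rng.gauss(0, 0.4) for c in condition]
efficacy = [2.0 + 0.5 * a + 0.1 * c + rng.gauss(0, 0.4) for a, c in zip(agency, condition)]

acme, ade = indirect_and_direct(condition, agency, efficacy)
ci_low, ci_high = percentile_bootstrap_ci(condition, agency, efficacy)
print(f"indirect = {acme:.3f}, direct = {ade:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

An interval for the indirect effect that excludes zero is the usual evidence for mediation, and, as in the analysis above, it can occur even when the total effect of condition on the outcome is not significant.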

The analyses investigating the long-term effects of treatment on learning (RQ2) revealed no effect of condition at delayed posttest on the multiple-choice learning assessments, F(1, 133) = 2.04, p = .155, ηp2 = 0.12, 90% CI [−0.04, 0.28], or the Bloom’s taxonomy levels, knowledge understanding: F(1, 133) = 1.38, p = .242, ηp2 < 0.01, 90% CI [0, 0.06]; knowledge application: F(1, 133) = 0.89, p = .346, ηp2 < 0.01, 90% CI [0, 0.05]; knowledge creation: F(1, 133) = 1.15, p = .284, ηp2 < 0.01, 90% CI [0, 0.01]. Moreover, the analyses to investigate long-term effects of intervention on self-efficacy (RQ3) showed no significant effect of condition at delayed posttest, F(1, 133) = 0.06, p = .800, ηp2 < 0.001, 90% CI [0, 0.02]. While no long-term effects of IVR videos on self-efficacy are reported in the literature, this finding is aligned with Stepan et al.’s (2017) study that compared CG-based IVR and computer-based materials for neuroanatomy learning. In their study, no significant difference for the learning assessment was found between conditions 8 weeks after treatment.

General Discussion

The present research investigated the effects of watching IVR videos or traditional videos on self-efficacy and conceptual learning. Two studies were carried out, and measures for learning, self-efficacy, learning agency, and presence were taken at multiple time points.

Experiment 1 showed a significant effect of condition on self-efficacy. Although this effect was not found in Experiment 2, a mediation effect of learning agency between condition and self-efficacy was found. The possibility of visually exploring the video in the IVR condition may have contributed to increasing participants’ self-efficacy in the IVR condition in Experiment 1. Aligned with this finding, previous studies investigating the effects of visual exploration and active learning on self-efficacy have shown positive impacts of exploratory behavior on self-efficacy (Hardy et al., 2014, 2019). Self-efficacy toward STEM among middle-school students has been shown to be an important predictor of career choices (Gibbons & Borders, 2010). Although more research is needed on the relationship between immersion and self-efficacy, the positive effect found in Experiment 1 indicates that IVR can potentially enhance self-efficacy in science learning among middle-school girls. The mediation effect of learning agency between condition and self-efficacy found in Experiment 2 indicates that this may be a plausible mechanism underlying the positive effects of immersion on self-efficacy. However, more research is needed to understand this effect better.

When investigating learning, there were no significant differences between conditions on the multiple-choice question scores in Experiments 1 and 2, indicating that more immersion did not lead to better conceptual learning. IVR videos were not better than 2D videos in three out of the four measures of learning (multiple-choice questions, knowledge understanding, and knowledge application). Hence, there is no strong evidence for the instructional effectiveness of IVR videos.

Notably, in Experiment 2, watching IVR videos showed a positive effect on a higher cognitive level of learning, knowledge creation, compared to using a computer monitor. Although the effect size was small, it was interesting to find a difference in knowledge creation just by varying the immersion level, given that the narrative and visuals were the same between conditions. This novel finding brings experimental evidence for the theoretical statement from Bell and Fogler (1995) that virtual reality would be capable of addressing higher cognitive levels of learning. This novelty may also be explained by the fact that IVR videos were used instead of CG-based IVR. IVR videos usually add less extraneous load than CG-based IVR, and they may have freed cognitive resources that enable the enhancement of higher cognitive levels of learning. Given the small effect size of this finding, there is a need for future studies replicating these results of IVR videos’ effects on Bloom’s taxonomy of learning objectives.

In light of the cognitive load theory, high levels of interactivity and constant animations, usually present in CG-based IVR, can add extraneous load to the learning material and hinder learning (Fiorella & Mayer, 2018; Parong & Mayer, 2018). That can be inferred by the fact that no study reporting lower learning gains when using IVR videos compared to traditional videos was found in the literature, while some studies using CG-based IVR have reported lower learning gains than traditional videos (Dede et al., 2000; Parong & Mayer, 2018). In this sense, IVR videos may maximize agency while minimizing cognitive load and distraction.

In addition, the medium itself cannot solely account for the learning outcomes (Clark, 1994), and it is essential to highlight the role the design and content of the instructional material play in learning. In CG, it is possible to develop almost any type of experience. For example, it is possible to create an experience in which objects do not fall when dropped. Also, CG allows users to select, move, and manipulate objects in the virtual environment. This affordance of CG is often explored to develop entertainment content, such as in games. In 360° videos, the experience is produced using real images captured with a 360° camera, and the user interacts with the content through head movements. This makes development cheaper and faster than CG, given that no 3D models or interaction feedback other than camera field of view need to be developed (Violante et al., 2019). Thus, 360° videos seem useful in creating educational materials or documentaries quickly and at scale. Mayer (2008) points out the importance of developing multimedia learning experiences that reduce extraneous cognitive processing (processing information that is not related to the learning goal) and favor essential and generative cognitive processing (selecting relevant information and integrating it with prior knowledge). Fiorella and Mayer (2018) provided guidelines on what is effective in terms of content, design, and learner attributes when developing instructional videos. For example, they reported the positive effects of segmenting the video and using different camera viewpoints on learning complex content.

Coral Compass and The Crystal Reef were chosen for the present studies because they were developed by experts in the VR, education, and marine fields to ensure that they followed scientific instructional material standards. Both videos were proven effective for learning for IVR and desktop conditions, as a significant increase in learning was shown in both conditions after treatment in both studies. Also, flat versions of the IVR videos were used in the desktop condition to avoid a confounding effect usually found in studies comparing videos to lectures or video to animated multimedia materials (Meyer et al., 2019).

Finally, our findings suggest that the effects found on self-efficacy and conceptual learning were short-term, as they were not found 5 weeks after the treatment sessions. These findings align with previous studies investigating the effects of immersion on affective domains of learning. They are important in light of a changing society in which technology use is increasing rapidly and devices continue to augment the level of digital immersion.

Limitations and Future Directions

Although this study yielded relevant findings, it has some limitations. First, even though participants were randomly assigned to each condition, there was a significant difference between conditions for the variables measured at pretest in Experiment 1. A statistical method that accounts for pretest differences was selected in order to analyze the data properly. However, this does not rule out the possibility that the pretest differences interfered with the posttest results. A matched random assignment and a larger sample were successfully used in Experiment 2 to address this shortcoming of Experiment 1. Second, we did not measure participants' prior experience with VR; hence, novelty effects could have influenced the results. Third, participants were female students from a school situated in an affluent neighborhood. No race, ethnicity, or socioeconomic status data were collected, limiting the generalization of the results. These findings should be interpreted with care when considering other populations.
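As an illustration of the matched random assignment mentioned above, the following is a minimal sketch, not the procedure used in Experiment 2. It assumes that participants are matched on a single pretest score and assigned in pairs; the function name, the input format, and the handling of an unpaired participant are hypothetical.

```python
import random


def matched_random_assignment(pretest_scores, seed=None):
    """Assign participants to "IVR" or "desktop" within pretest-matched pairs.

    pretest_scores: dict mapping participant id -> pretest score
                    (hypothetical input format, assumed for illustration).
    Returns a dict mapping participant id -> condition label.
    """
    rng = random.Random(seed)
    # Rank participants so that neighbors have similar pretest scores.
    ranked = sorted(pretest_scores, key=lambda pid: pretest_scores[pid])
    assignment = {}
    # Walk through the ranking two at a time and randomize within each pair.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)
        assignment[pair[0]] = "IVR"
        assignment[pair[1]] = "desktop"
    # With an odd number of participants, the last one is assigned at random.
    if len(ranked) % 2 == 1:
        assignment[ranked[-1]] = rng.choice(["IVR", "desktop"])
    return assignment
```

Because each pair contributes one participant to each condition, the two groups end up with similar pretest distributions, which is the property that simple random assignment failed to deliver in Experiment 1.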

Assessing learning at multiple times through multiple-choice and open-ended questions has some inherent limitations. Although multiple-choice questions are considered standard practice in learning assessments, they may introduce some inaccuracy because students are extensively trained in answering these kinds of questions. Some students' scores can be affected by their ability to identify the most probable correct answer. To reduce this impact, we also used open-ended questions to assess learning. However, even open-ended questions have limitations, mainly because they rely on subjective scoring and can induce fatigue. An objective rubric was developed, and blind score ratings were used in both studies to reduce the subjectivity of this assessment. Also, the treatment was split into two 45-min sessions in Experiment 2 in an attempt to reduce fatigue. Although the focus of this investigation was the comparison between conditions, and both conditions were subject to similar testing effects driven by the multiple-test study design, a testing effect could have affected the learning results over time.

In this research, participants answered the questions assessing learning before the questions assessing self-efficacy. Future studies should investigate whether the order in which the learning and self-efficacy questions are presented at posttest makes a difference in participants' self-efficacy, given that learning performance has been shown to affect self-efficacy.

To address these limitations, future studies should recruit a more diverse population, include multiple IVR exposures, and use multiple methods to assess learning to improve the generalizability of these findings. Also, future studies should investigate the associations between interactivity, conceptual learning, and self-efficacy. A future study could compare watching a 360° video through a headset with watching it on a desktop computer on which the participant's eye movements are tracked and the point of view is updated so that the image stays centered on the gaze direction. This way, participants in the desktop condition would have a similar form of interaction to participants in the IVR condition. Moreover, future studies should consider the unique aspects of both IVR videos and CG-based IVR environments, such as interactivity and realism, and their effects on cognitive processes, and should investigate how interactivity that goes beyond head movements relates to learning and self-efficacy. Finally, these studies should investigate the long-term effects of IVR video exposure, for example by measuring variables months after the treatment sessions, to contribute to the practical implications of IVR videos in education.

Appendix A

Experiment 1 and 2—Open-Ended Questions

  1. Explain what you know about how human activities change ocean chemistry.

  2. Describe ways that could help reduce ocean acidification.

  3. Propose strategies to increase engagement of the general public with ocean acidification.

  4. Palau’s coral reefs are considered to be one of the “Seven Underwater Wonders of the World.” What actions is the Palau government taking to reduce damages to the coral reefs?

  5. How could the Palauan government’s actions protect the coral reefs be applied to other environmental issues, for example, the fast erosion of South California beaches?

  6. Considering the Palauan government’s efforts to protect the coral reefs, think about one issue from your community that needs immediate attention. Propose an action plan that could help address this issue.

Experiment 1 and 2—Multiple-Choice Questions

  1. What is the main cause of ocean acidification?

    • Chemical spills in the ocean

    • Acid rain

    • Absorption of carbon dioxide by the ocean

    • Warmer ocean temperatures

  2. Corals, shellfish, and other marine organisms use the carbon dissolved in the ocean to:

    • build shells.

    • breathe underwater.

    • regulate body temperature.

    • assist in reproduction.

  3. Which of the following is a result of human-caused carbon dioxide emissions?

    • Ocean salinity and the frequency of oil spills are increasing.

    • Coral reefs are degrading, and the diversity of ocean life is decreasing.

    • The frequency of oil spills is increasing, and coral reefs are degrading.

    • The diversity of ocean life is decreasing, and ocean salinity is increasing.

  4. How are carbon dioxide emissions affecting the ocean?

    • Carbon and salt are increasing in the ocean.

    • Phytoplankton and salt are increasing in the ocean.

    • Phytoplankton and temperatures are increasing in the ocean.

    • Carbon and temperatures are increasing in the ocean.

  5. Which sentence best describes how tourism in Palau is damaging the coral reefs?

    • Tourism in Palau is responsible for a great part of carbon dioxide emissions in the world.

    • Tourists in Palau throw more trash into the ocean than most tourists in other places.

    • Tourists are damaging the corals with their fins while diving.

    • Acid rain is increasing in Palau due to tourism’s carbon dioxide emissions.

  6. Select the land practice that can greatly help to protect the coral reefs in Palau.

    • Place taro farms in strategic areas.

    • Eliminate the mangrove forests that run along the coastline.

    • Replace the mangrove forests with farms.

    • Avoid sediment traps along coastline.

Experiment 1—Self-Efficacy Questionnaire

Participants were asked to rate their agreement with each statement below on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree):

  1. I think that I can be proud of what I know about this subject.

  2. I think that learning science is important because I can use it in my daily life.

  3. No matter how much effort I put in, I cannot learn science. (Reverse-coded)

  4. It is important to have the opportunity to satisfy my own curiosity when learning science.

  5. Whether the science content is difficult or easy, I am sure that I can understand it.

  6. I am not confident about understanding difficult concepts. (Reverse-coded)

  7. I study more than required because I enjoy it so much.

  8. When science activities are too difficult, I give up or only do the easy parts. (Reverse-coded)

  9. I am willing to participate in this science course because the content is exciting.

  10. The subject scares me since I don’t fully understand it. (Reverse-coded)

  11. I am so happy about the progress I made that I am motivated to continue studying.

  12. This subject is so enjoyable that I am motivated to do extra readings about it.
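The reverse-coded items above can be handled as in the following minimal sketch. It assumes the conventional reversal for a 5-point scale (6 minus the response) and a scale score computed as the mean of the 12 items; the aggregation rule and the function name are assumptions for illustration, not the scoring procedure reported by the authors.

```python
# Item numbers marked "(Reverse-coded)" in the Experiment 1 questionnaire above.
REVERSE_CODED_EXP1 = {3, 6, 8, 10}


def score_self_efficacy(responses, reverse_items=REVERSE_CODED_EXP1, scale_max=5):
    """Return the mean item score after reversing the reverse-coded items.

    responses: list of integer ratings (1..scale_max), ordered by item number.
    Averaging the items is an assumption made for this sketch.
    """
    adjusted = []
    for item_number, value in enumerate(responses, start=1):
        if not 1 <= value <= scale_max:
            raise ValueError(f"Item {item_number}: rating {value} is out of range")
        if item_number in reverse_items:
            # On a 1-5 scale, a rating of 1 becomes 5, 2 becomes 4, and so on.
            value = scale_max + 1 - value
        adjusted.append(value)
    return sum(adjusted) / len(adjusted)
```

With this convention, a participant who strongly agrees with every positively worded item and strongly disagrees with every reverse-coded item receives the maximum score of 5.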

Appendix B

Experiment 2—Self-Efficacy Questionnaire

Participants were asked to rate their agreement with each statement below on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree):

  1. Whether the science content is difficult or easy, I am sure that I can understand it.

  2. I am not confident about understanding difficult science concepts. (Reverse-coded)

  3. I am sure that I can do well on science tests.

  4. No matter how much effort I put in, I cannot learn science. (Reverse-coded)

  5. When science activities are too difficult, I give up or only do the easy parts. (Reverse-coded)

  6. During science activities, I prefer to ask other people for the answer rather than think for myself. (Reverse-coded)

  7. When I find the science content difficult, I do not try to learn it. (Reverse-coded)

  8. I think that learning science is important because I can use it in my daily life.

  9. I think that learning science is important because it stimulates my thinking.

  10. In science, I think that it is important to learn to solve problems.

  11. In science, I think it is important to participate in inquiry activities.

  12. It is important to have the opportunity to satisfy my own curiosity when learning science.

Experiment 2—Learning Agency Questionnaire

Participants were asked to rate their agreement with each statement below on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree):

  1. I can control my academic performance.

  2. I learn more when I am the one selecting what I am going to learn.

  3. My choices about what to study influence my learning outcomes.

  4. I prefer other people to tell me what I have to study rather than deciding on my own.

  5. Being able to choose where to focus my attention when watching a video is important to me.

Supplemental Materials

https://doi.org/10.1037/tmb0000082.supp


Received July 14, 2021
Revision received May 2, 2022
Accepted May 8, 2022