Abstract

There has been growing interest in using virtual reality (VR) as a solution for many of the challenges facing distance education, such as fostering a sense of connectedness with classmates. However, implementing VR in distance education has its share of challenges, such as hardware accessibility and a scarcity of content that matches curricula. In this exploratory, mixed-methods study, we examined 19 students’ use of head-mounted displays to meet with classmates inside social VR. For 4 weeks, students worked together in small groups on various tasks inside a virtual environment. We present quantitative results on attitudes foundational to fostering ideal learning environments. Entitativity (“group-ness”), enjoyment, realism, and presence did not change over time, likely due to a small sample size resulting from technical difficulties in collecting data. We present qualitative observations on instructors’ and students’ experiences across time and with VR use, and how these may inform curriculum development. First, it is critical to provide ample training time to allow students to grow accustomed to the medium before investigating how response to VR changes over time. Without learning how to use VR first, students cannot learn inside VR. Second, we discuss task type and content considerations within and outside of VR and provide recommendations on how to reduce cognitive load and encourage social interaction. Third, we address technological and social issues that are likely to arise. Overall, we focus on ways to create a sense of connectedness and reduce psychological distance, and on challenges that may prevent meaningful interactions from taking place.

Keywords: virtual reality, distance education, networked VR, collaborative learning, social VR

Supplemental materials: https://doi.org/10.1037/tmb0000094.supp

Acknowledgments: The authors would like to thank Brandon Lyons for his help with this study.

Disclosures: The authors have no conflicts of interest to disclose.

Data Availability: Our anonymized data and R code are available on Open Science Framework (https://osf.io/k6nzt/).

Open Science Disclosures: The data are available at https://osf.io/k6nzt/ and https://osf.io/amguj/.

Correspondence concerning this article should be addressed to Eugy Han, Department of Communication, Stanford University, 450 Jane Stanford Way, Building 120, Stanford, CA 94305, United States [email protected]

Universities have been offering distance education, a method of using one or more technologies to deliver instruction to students who are separated from the instructor, for decades (Seaman et al., 2018). Yet, until the spring of 2020, the majority of university students took part in the classroom experience in person (Eom & Ashill, 2016). In early 2020, the COVID-19 global pandemic forced universities to migrate from physical settings to digital ones. While research regarding best practices for distance education has been ongoing, a digital migration of such a scale is unprecedented, earning it its label, “crisis learning” (Almaiah et al., 2020; Mishra et al., 2020).

The rapid shift to distance education introduced new challenges, such as Zoom fatigue (Bailenson, 2021) and the need to increase student involvement and engagement (Nambiar, 2020). Even before this time of crisis learning, the results on the efficacy, quality, and student satisfaction of distance education were mixed (Castro & Tumibay, 2021; Deslauriers et al., 2019). Distance education has been plagued by high dropout rates and reports of students feeling isolated and disconnected (Angelino et al., 2007). Similarly, distance education has been shown to increase psychological distance among students, between students and instructors, and between students and the course material (Neter et al., 1993).

Given that learning has both cognitive and social dimensions and is most effective when there is a strong sense of community, such feelings of isolation and disconnect can be detrimental (Rovai, 2002; Rovai & Wighting, 2005). However, this sense of community can be challenging to create in many modalities used for distance education. The literature suggests that instructors should increase both the quantity and quality of interactions to foster an ideal online learning environment (Moller, 1998; Rovai & Wighting, 2005). Previous research showing that virtual reality (VR) provides unique affordances that increase student enjoyment, motivation, deeper learning, and long-term retention of information (Kavanagh et al., 2017) raises the possibility of using VR in education to fill these needs. At the same time, it is important to consider the cognitive capacity required to use VR effectively. Because VR presents vast quantities of information directly to the users’ senses, it can be challenging for users to absorb this flow of information, and the brain becomes a “bottleneck in the communication system” (Biocca & Nowak, 2001).

The present study examines the ability of VR to address the limitations of distance education, particularly with respect to engagement. We focus on the use of social, networked VR as the classroom. Our goals are to understand how key factors in creating ideal learning environments change over time and with VR use. In this exploratory study, we use quantitative and qualitative approaches to address challenges associated with using VR for learning and suggest ways to mitigate these limitations to allow for the unique affordances of VR to be leveraged in distance education.

Background

Affordances of VR in Distance Education

In the past, VR has been used in a myriad of educational contexts, including teaching languages (e.g., American Sign Language, Quandt, 2020; culturally relevant physical interaction to learn languages, Cheng et al., 2017), training (e.g., surgical simulator, Huang et al., 2016; flight simulator, Page, 2000), collaborative design/tasks (e.g., Halabi, 2020; Schroeder et al., 2001), and science (climate change, Markowitz et al., 2018; science, technology, engineering and mathematics, Bogusevschi et al., 2020). The unique properties offered by VR have made it attractive as a medium for learning. Of particular interest are collaborative virtual environments (CVEs), which are networked, computer-generated simulations of environments that allow multiple individuals to meet and interact in 3D space. CVE technology tracks each individual’s movement and behavior, renders them via avatars, and updates them as they change. CVEs provide affordances that make them ideal for supporting group interactions and fostering the sense of connectedness that is crucial for students.

Immersion

One example of such an affordance of VR and CVEs is immersion, the “objective measure of the extent to which [a] system presents a vivid virtual environment while shutting out physical reality” (Cummings & Bailenson, 2016). Winn (1993) posits that immersion in VR allows individuals to construct knowledge from direct experiences, rather than descriptions of them. Immersion is typically operationalized as “presence.” This aspect of presence supports the constructivist approach to learning, which holds that people construct meaning and knowledge through experience (Dewey, 1986). Additionally, with this sense of presence, individuals can engage in situated learning, which has been shown to be more relevant and successful than learning out of context (Brown et al., 1989; Lave & Wenger, 1991). Furthermore, in CVEs, where instructors and students share the same virtual space, meaning can be negotiated among participants (Winn, 1993).

Spatial Navigation

Another unique affordance provided by VR is its spatial navigation. In a virtual environment (VE), individuals can navigate 3D space (change position and orientation), which can help them understand the proximity and distance of others (Benford et al., 1994). Such a spatial system affords cues for individuals to know where others’ attention is focused, which allows them to engage in smooth turn-taking group conversations (Buxton, 1992). This spatial framework lets individuals engage in more spatially oriented tasks such as drawing, dancing, pointing to, moving around, and scaling objects—activities that provide multiple shared viewpoints, similar to those observed in interactions in the physical world (Churchill & Snowdon, 1998).

Avatars

Avatars provide embodied visualizations of communication partners and are often used as a means to enrich the user experience and trust (Steptoe et al., 2010). The presence of an avatar has been shown to increase social presence, “the sense of being with another” (Biocca et al., 2003), to have a significant impact on cognitive load (Pan & Steed, 2019; Steed et al., 2016), and to enhance trust (Gefen & Straub, 2004; Hassanein & Head, 2007). While trust between students and avatar instructors has not received proper attention from researchers (Chae et al., 2016), there has been growing interest within online learning in incorporating avatar instructors to address the psychological distance that may exist between interactants who are physically separated (Chae et al., 2016; Gunawardena & McIsaac, 2013). Additionally, the myriad of nonverbal cues that avatars provide, such as gaze, head and hand orientation, and other body language, give rise to new possible social interactions (Loomis et al., 1999).

Social Dynamics of VR

Having a sense of community is an integral part of learning, and this is where VR learning may provide the most support. Wenger (1998) describes learning as the process of becoming part of a community of practice. Over time and through multiple interactions, individuals engage in activities, take part in discussions, and help one another to establish relationships that enable them to learn together. Without a feeling of community, students are likely to be anxious, disconnected, and unwilling to take the risks involved in learning (Wegerif, 1998). In describing the social dimension of the effectiveness of online learning, which is another form of distance education, Wegerif (1998) argues that, in an ideal educational environment, students can cross a social threshold and begin to feel part of a community. Such a sense of community can be fostered if instructors set certain guidelines and structures in their classrooms (McInnerney & Roberts, 2004), which may be easier in immersive VR. However, presenting information in VR requires a different approach from traditional in-person, video, or computer-based learning. For example, while head-mounted display (HMD) viewing and presenting produced a higher sense of presence compared with a desktop display, there was a less clear impact on social presence (Yoshimura & Borst, 2021). Consequently, it is important for instructors to develop and follow guidelines for enhancing social interactions unique to VR.

Time

According to dynamic systems theory, complex human behaviors and activities emerge as different components of a system influence and change one another over time (see, e.g., Newman & Newman, 2020; Thelen, 2005). At the crux of this theory is the dimension of time. The state of a dynamic system depends on its state at a previous point in time. Consequently, drawing inferences based on a single time point or a few time points may not provide a comprehensive view of how meaningful patterns emerge.

This theory extends to individuals using VR. Previous longitudinal research on VR use has shown that time matters in VR (e.g., Bailenson & Yee, 2006; Khojasteh & Won, 2021; Moustafa & Steed, 2018). While VR is becoming increasingly widespread and more readily accessible, it is still a novel medium for many of the individuals who may benefit from using VR. Thus, having multiple repeated measures collected at multiple time points can allow us to understand how response to VR changes with time and use. Within a longitudinal framework, we can identify intra-individual (within-person) and interindividual (between-person) changes and rates of change across time.

Potential Challenges of VR Learning

VR and CVEs not only increase engagement with content, but they also increase connection with “others,” which, as aforementioned, is necessary in developing learning communities (Lear et al., 2010; Martin & Bolliger, 2018). However, while immersive media experiences can enhance feelings of presence and engagement, instructors should be wary of how they structure learning experiences and guide students’ attention. This is true of any sort of multimedia used for learning. The cognitive theory of multimedia learning suggests that when the brain is engaged in multiple tasks or different streams of information, each of those taxes the limited resources needed to process new information (Mayer & Moreno, 2003). Meaningful learning requires people to carefully attend to details of information that is being presented. This cannot happen when cognitive capacity is overloaded (Mayer & Moreno, 2003), which may be the case for students who feel presence and engagement within the virtual world (Makransky et al., 2019; Makransky & Lilleholt, 2018).

Overview of Study

This study takes a longitudinal approach to investigate how groups of students interact using ENGAGE, a social VR platform that focuses on collaboration, education, and training. In addition to evaluating factors such as presence, which has been of interest in the immersive learning space (e.g., Makransky & Petersen, 2021), we evaluate students’ experience with learning and forming a community within the classroom through measuring entitativity (Rydell & McConnell, 2005), or “group-ness,” the degree to which a collection of individuals is perceived as a single entity (Campbell, 1958).

While we reported five hypotheses in our preregistration (Open Science Framework, https://osf.io/zxf5u), we could not explore most of them due to the sample size limitations resulting from pragmatic issues surrounding the course and technical difficulties in collecting data. We were able to examine one quantitative hypothesis, which was modified to focus on the prediction of presence, as we did not specifically test how students were accustomed to the HMDs:

Hypothesis: Students’ sense of presence in a virtual environment will increase over time.

As with learning, described above, the level of comfort within VR is likely to predict presence, which should increase over time as students become accustomed to the environment. Through quantitative data analysis, we investigate this hypothesis, along with how other measures change across time and use. Through qualitative student responses and observational notes from instructors, we address the limitations of the quantitative results, review individual development over time, identify issues that can arise in implementing social VR in classrooms, and consider how these issues may affect researchers wanting to investigate VR in the realm of education. The findings of this study uncover challenges that arise in integrating HMDs in remote classrooms and provide recommendations on how to resolve these challenges. This article provides suggestions for instructors and students considering using networked VR in the classroom, with a focus on both training and infrastructure needs.

Method

Participants

Participants were 19 students (11 male, 8 female) in a distance education course on new technology at a public university who consented to participate and to come to campus to pick up an HMD. The participants received extra credit for completing the questionnaires. Participants were between 20 and 24 years old (M = 21.11, SD = 1.05) and identified as White (n = 16), multiracial (n = 2), and Hispanic (n = 1). Most participants had never used VR before the class (n = 12). Of the 19 participants, 13 attended all four sessions; the remainder attended three sessions (n₃ = 4), two sessions (n₂ = 1), or one session (n₁ = 1). Participants were divided into five groups of three or four individuals. This study was approved by the University’s Institutional Review Board.

Hardware and VR Equipment

Each student was provided with a Pico Neo 2 Eye headset, which they used in their personal environment. The Pico Neo 2 Eye headset is a standalone HMD with 3,840 × 2,160 resolution (1,920 × 2,160 per eye), a 101° horizontal field of view, and a 75 Hz refresh rate. The headset has six-degrees-of-freedom (6DOF) inside-out tracking via two integrated front cameras. It comes with left- and right-hand controllers, both of which are also tracked in 6DOF. Prior to the sessions, a tutorial was provided on how to set up the HMD in the physical space, and students were guided through the setup in a training session. Students were advised to use the seated option.

Virtual Environment

ENGAGE, a collaborative platform hosted in the virtual classroom environment, is one of the few social VR platforms whose primary purpose is educational. ENGAGE users are presented as human avatars, which they can customize to match their physical appearance. ENGAGE offers features such as presenting media content, writing on a blackboard/whiteboard, transporting to various environments, taking notes, filling out forms, walking/teleporting, chatting with voice, sitting in seats, clapping, and adding 3D objects into the environment (Figure 1). The training and four weekly ENGAGE sessions were held in a hub environment. The hub was set up with giant portals that connected students to separate environments, in which group members could gather to complete tasks and hold discussions. The environments were different for each task. Environments include a lecture room, a hospital room, a pier, a beach island, a spaceship, and Mars.

Procedure

Every week for 4 weeks (5 weeks in total, including the training week), students used their HMDs to meet in ENGAGE during the designated class time. All students convened in a central location within the VE (hub), where they were briefed on that week’s task. In Weeks 1–3, the tasks were premade experiences that were available on ENGAGE. These tasks came with their respective instructions. In Week 4, the task was free form and allowed more room for creativity. Students then went into their assigned sessions and completed the tasks for 15–20 min. During the last 5 min of each session, group members moved to a separate virtual discussion room, where their avatars sat around a table to discuss their experience. After the students completed their tasks in the VE, they were asked to fill out a Qualtrics questionnaire asking about their experience.

Weekly Tasks

The tasks we selected were offered as part of ENGAGE’s activity- and lesson-based content. The tasks varied each week and allowed for a range of interactions with the group members, the environment, and educational content. After each “active experience” students went to a room to discuss their tasks as a group. During the training and weekly ENGAGE sessions, a videoconferencing window was open for an instructor to assist students with technological issues. Another instructor was in the VE providing guidance on how to use the ENGAGE interface and the controllers to navigate around the environment. After completing the tasks and discussion, students were asked to fill out a Qualtrics questionnaire about their experience.

Training Week

There was a training week in which students were briefed on how to navigate the virtual environment and the tools, use the controllers, and familiarize themselves with the interface.

Week 1

There were two tasks centered on assembling. Students worked together as a group. Each member took on the role of a helper, a doer, or an observer. Students collaborated to piece together parts of a skeleton in a lecture hall and an engine in a hub.

Week 2

There were two tasks centered on puzzle-solving. In the first task, students learned how to deliver a baby in a hospital room. As in the first week, each student took on a role as a trainer, a trainee, or an observer. In the second task, students learned the physics of trajectory to shoot a cannon. Students had to calculate how to accurately hit a ship by manipulating the angle at which the cannon was fired.

Week 3

There was one task centered on in-site visiting. Students went on a field trip to a prehistoric era to learn about dinosaurs. Students walked around a beach island full of animated dinosaur species as an informational video played in the background.

Week 4

There was one task centered on creativity and free-flow movement. Students built a room based on a theme of their choice. Students could base the room on any location of their choice and use any of the 3D objects available in ENGAGE.

Measures

Multiple aspects of individuals’ attitudes and behavior were measured at the start of the study, and after the training session and each of the four VR sessions (weekly questionnaire). Scale reliability (Cronbach α) was computed based on all measurement items.
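To make the scoring and reliability computations concrete, the following is a minimal Python sketch, not the authors' R code. It scores a scale as the mean of its items and computes Cronbach's α from the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of summed scores). The function names and the ratings are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def scale_score(ratings):
    """Score one respondent's scale as the mean of their item ratings."""
    return mean(ratings)


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item ratings."""
    k = len(responses[0])  # number of items in the scale
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in responses]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 7-point ratings from four respondents on a three-item scale.
ratings = [
    [6, 7, 6],
    [5, 5, 6],
    [7, 7, 7],
    [4, 5, 4],
]

print([round(scale_score(r), 2) for r in ratings])
print(round(cronbach_alpha(ratings), 2))
```

Sample (n − 1) variances are used for both the items and the summed scores; because α depends only on their ratio, using population variances throughout would give the same value.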

Individual Differences Measures

Computer Self-Efficacy

Individual ratings for computer self-efficacy were obtained at the start of the study. Computer self-efficacy was measured using three items adapted from Torkzadeh et al. (2006). Sample items, each answered using a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree), included the following: “I am able to learn and use most computer programs” and “With the right training and tools, I could do almost anything with computers and technology.” Individual computer self-efficacy scores were calculated as the mean of the three items (Cronbach’s α = 0.85), with higher scores indicating higher computer self-efficacy (M = 5.72, SD = 0.98).

Access to Technological Infrastructure

Individual ratings for access to technological infrastructure were obtained at the start of the study. Access to technological infrastructure was measured using three items adapted from Nowak and Watt (2022). Individuals were asked if they had access to the internet, high-speed internet, and a quiet place to study on a 5-point Likert scale (1 = never, 5 = always). Individual access to technological infrastructure scores were calculated as the mean of the three items (Cronbach’s α = 0.81), with higher scores indicating more access to technological infrastructure (M = 4.51, SD = 0.58).

Weekly Repeated Measures

Entitativity

Individual ratings for entitativity were obtained after each weekly VR session, as well as after the training session. Entitativity was measured by eight items adapted from Rydell and McConnell (2005) using a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree). Sample items include “How strongly bonded do you think the members of your group are?” and “To what extent do you believe that members of your group were affected by the behaviors of other members?” Individual entitativity scores for each week were calculated as the mean of the eight items (Cronbach’s α = 0.88), with higher scores indicating greater entitativity.

Enjoyment

Individual ratings for enjoyment in the virtual environment and tasks were obtained after each weekly VR session, as well as after the training session. Enjoyment was measured using four items created for the study using a 5-point Likert scale (1 = not at all, 5 = extremely). Sample items include “How much fun did you have completing the tasks with your group members?” and “How interesting was completing the tasks with your group members?” Individual enjoyment scores for each week were calculated as the mean of the five items (Cronbach’s α = 0.94), with higher scores indicating greater enjoyment.

Realism

Realism refers to the extent to which the virtual surroundings and its components appeared real. Individual ratings for realism were obtained after each weekly VR session. Realism was measured using five items adapted from Nowak (2013) using a 5-point Likert scale (1 = not at all, 5 = extremely). Sample items include “seemed real” and “seemed naturalistic.” Individual realism scores for each week were calculated as the mean of the five items (Cronbach’s α = 0.86), with higher scores indicating greater perceived realism.

Presence

One of the dimensions of presence is spatial presence, which refers to the feeling of actually “being there” in the VE. Individual ratings for the spatial dimension of presence were obtained after each weekly VR session. Presence was measured using five items adapted from Aymerich-Franch et al. (2012) using a 5-point Likert scale (1 = none at all, 5 = very much). Sample items include “I felt surrounded by the virtual world I saw and heard” and “I felt the virtual world was like the real world.” Individual presence scores for each week were calculated as the mean of the five items (Cronbach’s α = 0.92), with higher scores indicating greater perceived presence.

Data Analysis

Individual differences in how each of the dependent variables changed over time were examined using linear growth models with time-invariant covariates (Grimm et al., 2016). Small sample sizes and the very small between-group variance suggested use of a two-level structure with the repeated-measures nested within individuals. Specifically, the weekly repeated measures of entitativity, enjoyment, realism, and presence outcomes were each modeled as:

\text{outcome}_{ti} = b_{1i} + b_{2i} \times \text{week}_{ti} + u_{ti},

where outcomeₜᵢ is the repeated-measures variable at time t for individual i; parameter b₁ᵢ represents the intercept, the predicted score at t = 0, for individual i; parameter b₂ᵢ represents the rate of change for a one-unit change in week for individual i; and uₜᵢ is the time-specific residual score at time t for individual i.

Individual differences are simultaneously modeled as a function of the time-invariant covariates, such as computer self-efficacy and access to technological infrastructure:

b_{1i} = \beta_{01} + \beta_{11} \times \text{computer\ self-efficacy}_{1i} + d_{1i},

b_{2i} = \beta_{02} + \beta_{12} \times \text{computer\ self-efficacy}_{1i},

where β₀₁ and β₀₂ represent the expected intercept and slope when the time-invariant covariate, computer self-efficacy₁ᵢ (or access to technological infrastructure), is equal to 0; parameters β₁₁ and β₁₂ indicate the relation between the time-invariant covariate and the individual intercepts and slopes; and d₁ᵢ is the residual between-person difference in the intercept that was not explained by the time-invariant covariate, with variance σ²_d₁. Our preliminary models also included d₂ᵢ, the residual between-person differences in the slope; this term was dropped because the random slopes had a correlation of −1 with the random intercepts.
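Substituting the two person-level equations into the repeated-measures equation gives a single composite form. This is only a restatement of the equations above, not an additional model; it makes explicit that β₁₂ acts as the interaction between the covariate and week:

```latex
\text{outcome}_{ti} = \beta_{01} + \beta_{11} \times \text{computer\ self-efficacy}_{1i} + \left(\beta_{02} + \beta_{12} \times \text{computer\ self-efficacy}_{1i}\right) \times \text{week}_{ti} + d_{1i} + u_{ti}
```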

All models were fit to the data in R using the lme4 library (Bates et al., 2015) and visualized using the ggplot2 library (Wickham, 2016). Incomplete data were treated as missing at random. Statistical significance was evaluated at α = .05. Our anonymized data and R code will be made available on Open Science Framework.
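The two-level structure described above can be illustrated by simulating data from it. The sketch below is a Python stand-in written for illustration only; it is not the authors' R code and does not fit a model. The function name, parameter values, and covariate values are all hypothetical.

```python
import random


def simulate_growth(n_people, weeks, beta01, beta11, beta02, beta12,
                    sd_intercept, sd_residual, covariate, seed=0):
    """Generate (person, week, outcome) rows from the random-intercept growth model."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_people):
        x = covariate[i]                       # time-invariant covariate for person i
        d1 = rng.gauss(0.0, sd_intercept)      # d_1i: between-person intercept residual
        b1 = beta01 + beta11 * x + d1          # person-specific intercept b_1i
        b2 = beta02 + beta12 * x               # person-specific slope b_2i (no random slope)
        for week in weeks:
            u = rng.gauss(0.0, sd_residual)    # u_ti: time-specific residual
            rows.append((i, week, b1 + b2 * week + u))
    return rows


# Hypothetical parameter values, chosen only to show the shape of the output.
data = simulate_growth(
    n_people=3, weeks=[0, 1, 2, 3, 4],
    beta01=3.5, beta11=0.2, beta02=0.05, beta12=-0.08,
    sd_intercept=0.5, sd_residual=0.3,
    covariate=[-1.0, 0.0, 1.0],
)
for person, week, outcome in data[:5]:
    print(person, week, round(outcome, 2))
```

Each person contributes one row per week, which is the long format that mixed-model software typically expects. The slope has no person-specific residual, mirroring the decision to drop the random slope.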

Results

Quantitative Findings

Descriptive statistics summarizing the longitudinal data for entitativity, enjoyment, realism, and presence are presented in Table 1. Individual trajectories of each of the repeated-measures outcomes over time with the means overlaid are presented in Figure 2. Results of the linear growth models indicated that none of the repeated-measures outcomes changed significantly across weeks (ps > .29). The hypothesis that students’ presence would increase over time was not supported.

**Table 1**
Summary of Means and Standard Deviations (in Parentheses) of Repeated Measures Across 4 Weeks

| Repeated measure (scale; N = 18) | Week 0 (N = 17) | Week 1 (N = 17) | Week 2 (N = 16) | Week 3 (N = 16) | Week 4 (N = 16) |
| --- | --- | --- | --- | --- | --- |
| Entitativity (1–7) | 3.44 (0.53) | 3.41 (0.62) | 3.36 (0.67) | 3.53 (0.50) | 3.41 (0.52) |
| Enjoyment (1–5) | 3.58 (0.97) | 3.63 (1.00) | 3.72 (0.91) | 3.50 (0.59) | 3.86 (0.62) |
| Realism (1–5) | NA | 2.55 (0.64) | 2.69 (0.89) | 2.85 (0.72) | 2.52 (0.72) |
| Presence (1–5) | NA | 3.66 (0.82) | 3.65 (0.92) | 3.72 (0.93) | 3.45 (1.01) |

**Figure 2**
Dependent Variables Over Time
*Note*. Panels A–D show individual trajectories (raw data) and means over time for each of the four outcome variables.

When individual differences in computer self-efficacy and access to technological infrastructure were included as time-invariant covariates in the models, there was evidence that access to technological infrastructure was related to differences in rates of change in entitativity, with entitativity increasing over time, β₀₁ = 3.42, p < .01; β₀₂ = 0.21, p = .049. Similarly, differences in computer self-efficacy were related to differential change in realism over time. For a prototypical student with an average computer self-efficacy score, realism started at β₀₁ = 2.6, p < .01, and remained relatively stable over time, β₀₂ = 0.047, p = .15. Meanwhile, a student with a higher computer self-efficacy score had a more pronounced decrease in realism over time, β₁₂ = −0.079, p = .02. There was no evidence that between-person differences in access to technological infrastructure were related to differences in rates of change in realism or presence (ps > .48), or that differences in computer self-efficacy were related to differences in rates of change in entitativity or presence (ps > .74).
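As a worked illustration of how the realism coefficients combine, the sketch below computes the implied weekly slope as β₀₂ + β₁₂ × covariate. It assumes that computer self-efficacy was centered at the sample mean, which the reference to a student with an average score suggests but the text does not state. The one-point difference and the function name are hypothetical.

```python
def conditional_slope(beta02, beta12, covariate_centered):
    """Weekly rate of change implied for a given (centered) covariate value."""
    return beta02 + beta12 * covariate_centered


# Reported realism estimates: beta02 = 0.047, beta12 = -0.079.
slope_average = conditional_slope(0.047, -0.079, 0.0)  # student at the mean (assumed centering)
slope_higher = conditional_slope(0.047, -0.079, 1.0)   # hypothetical student one point above the mean

print(round(slope_average, 3), round(slope_higher, 3))
```

Under these assumptions, the slope moves from slightly positive for the average student to slightly negative for a student one point higher in computer self-efficacy, which matches the direction described in the text.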

Qualitative Findings

Our fieldwork included participant observations made by the two course instructors and a collection of open-ended responses from students after every session. Our primary goal was to gain a contextual understanding of students’ experiences to see how they may inform instructors in designing their networked classrooms for learning. We draw upon the abductive analysis approach, an ethnographic methods model that encourages researchers to take an iterative approach to cases, starting with a broad theoretical base and developing creative, novel theories based on observations. The abductive analysis approach takes into consideration the position of the researcher and encourages radical thinking about the relation between data and theory building (Timmermans & Tavory, 2012). The responses were examined by the first author and analyzed to iterate on key themes. We offer examples of first-person observations to provide recommendations on how to integrate VR in distance learning and explain how responses from students may inform the quantitative findings reported above.

Learning How to Use VR Before Learning With VR

Qualitative analyses of open-ended responses suggest that a single training session is insufficient, and that there is a very slow learning curve to using VR. Ample time should be provided for students to adjust to the medium and learn how to use the technology before any learning can occur.

The open-ended responses reveal the challenges students faced and show that these technological difficulties greatly reduced the quality of the overall experience, which is consistent with previous research on computer self-efficacy predicting success. In the first 2 weeks, the reports were related to difficulty in learning the technology, whereas in the third and fourth weeks, the reported challenges were related to the task itself. The lack of practice-based training and the high level of interactivity in the tasks of the first 2 weeks were evident in students’ frustration with the HMD and controllers.

P8 “A disadvantage was that we are not used to using the technology, which made it more difficult to communicate. I think once we know how to use it it will be more fun and beneficial.”

P2 “I feel that some of the exercises weren’t so much [of] a disadvantage but they were just kind of confusing and frustrating to get working.”

P7 “[HMDs] are not accessible to all and the barrier of entry and knowing what you are doing is high.”

This is further supported by the number of students who experienced difficulties such as being kicked out of the ENGAGE environment (Figure 3). Of the students who attended the first session, 47% were kicked out of the virtual environment. This percentage decreased over time, to 29% in the second and third sessions and 25% in the fourth session.

**Figure 3**
Percentage of Students Who Were Kicked Out of the Social VR Environment Over Time
*Note*. VR = virtual reality. The total number of students who attended varied each week (n₁ = 17, n₂ = 18, n₃ = 17, n₄ = 16).
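To make the scale of these percentages concrete, the short sketch below back-computes the approximate number of students each percentage implies, using the attendance figures in the Figure 3 note. The function names are illustrative, and the resulting counts are estimates derived from rounded percentages, not values reported in the study.

```python
# Back-compute approximate head counts from the percentages and attendance
# reported for Figure 3. The counts are estimates, not values from the study.

# Attendance per session (from the Figure 3 note) and reported kick-out rates.
attended = {1: 17, 2: 18, 3: 17, 4: 16}
kicked_out_pct = {1: 47, 2: 29, 3: 29, 4: 25}


def estimate_counts(attended, pct):
    """Return the nearest whole number of students implied by each percentage."""
    return {s: round(attended[s] * pct[s] / 100) for s in attended}


def implied_pct(count, n):
    """Percentage, rounded to a whole number, that a count represents out of n."""
    return round(100 * count / n)


estimated = estimate_counts(attended, kicked_out_pct)
for session, count in estimated.items():
    n = attended[session]
    print(f"Session {session}: ~{count} of {n} students "
          f"({implied_pct(count, n)}% when recomputed)")
```

Because the published percentages are rounded, a recomputed percentage may differ slightly from the reported one, so these counts should be read only as rough orders of magnitude: with fewer than 20 students per session, each student accounts for roughly 5 to 6 percentage points.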

The most frequently reported reasons for being kicked out were software related, such as waiting for updates, lagging, and disconnecting while switching between applications or sessions. Network connectivity problems, such as a weak internet connection or crashes, were also common reasons for being kicked out during a session. Three students also reported forgetting to charge their HMDs, which caused the devices to die in the middle of a session. Other reasons included being interrupted by physical world distractions and accidentally stepping outside of the setup boundary.

P7 “I had technical difficulties due to my own error in not fully charging the headset which led it to die in the middle of our group activity but luckily I was able to sign back in via my laptop and finish the activity.”

P9 “I forgot to charge [the HMD]. It was my fault. But I was in the VR room for most of it.”

In addition to a large percentage of the class being kicked out in the early weeks, we observed that students who attended spent time in the first week changing their avatar’s visual representation (Figure 4). This number dropped in the second week, suggesting that students need time at the beginning of a course to adjust not only to the medium but also to the classroom environment. This additionally highlights the importance of avatar design instruction, as students are unlikely to change their avatar as time passes. Early decisions about how students are instructed matter.

**Figure 4**
Percentage of Students Who Changed Their Avatar Visual Representation

These observations suggest that we cannot fully examine the potential for VR to be used for learning content until the students learn how to use the technology. Considering this slow learning curve, we suggest that training center on practice-based activities in which students practice simple actions such as moving, teleporting, grasping, and creating. We also suggest that earlier assignments be more open and less focused, allowing the students to walk around and observe.

Limiting Overload of Sessions

We reduced the number of tasks from two to one starting in Week 3, following feedback from students after Week 2. Not only were students affected by simulator sickness from exceeding the recommended number of minutes spent inside VR, but they also noted that doing two tasks took longer, that they felt rushed, and that they wanted to spend more time in the first experience. Given the recommended time limits for being immersed in VR and that both the VR medium and the ENGAGE platform were novel for most students, it is important to recognize that students are spending time and cognitive capacity remembering how to navigate the virtual environment. Consequently, we recommend minimizing the number of tasks early in students’ experience with VR and their VR learning process. This is in line with findings reported by Makransky et al. (2019) that, while immersive VR leads to higher levels of presence, it also leads to higher cognitive load and less learning.

Instructor 1 “[Students could not] focus on learning the math formulas for the boat shooting exercise because they were having difficulty navigating. This means we [cannot] really examine the potential for VR to be used for learning content until students learn VR. [It is] like that saying about education. Until 3rd grade students are learning to read. After 3rd grade, students can use reading to learn. Our students in week 2 are still learning [how] to VR so they [are not] yet prepared to use VR to learn.”

In addition to providing more training to allow students to grow accustomed to the medium and learning platform, we suggest that the early assignments provide ample opportunities for students to practice navigation including walking around, exploring freely, talking with their classmates, and viewing their environment.

Instructor 1 “Save the tasks requiring them to manipulate objects for later when they are more comfortable in the environment. Possibly have 2 weeks of training and have one session where you task them with moving from one place to another, sitting down, making something, moving something, etc.”

Instructor 2 “We should have multiple training sessions or have these training in person where everyone can see my HMD screen. Because verbal explaining [does not] seem to be enough, a lot of students forget or [do not] pay attention.”

Another way to reduce overload is to reduce moving between environments and sessions. While having students break off into smaller groups may be effective for certain types of tasks or discussions, we can take advantage of VR’s unique affordance of spatialized audio to address this limitation without forgoing small group activities. Additionally, if having smaller groups is not crucial to the structure of the course, we recommend forming larger groups so that the number of instructors or assistants closely matches the number of groups. This is to reduce the number of disruptions caused by moving between virtual environments in a session, to avoid losing students in the transitory stages, and to ensure that students have a point of contact in VR when technological mishaps occur.

Instructor 2 “I think it will be easier if we had one classroom and had everyone move together. This concept of breakout rooms is not going to work unless there is a centralized way of sending announcements and bringing everyone back. It is very inconvenient and confusing for both the instructors and students to leave and join new sessions. Maybe we can use the spatial sound feature to divide people into their smaller groups within the same session, and avoid using portals and leaving to other sessions.”

Providing Technological Support Within and Outside of VR

Those wishing to help students have positive learning experiences using VR should recognize the need for technical support, which includes ensuring computer self-efficacy and troubleshooting. Students in their homes have to take responsibility for ensuring that they have a sufficient internet connection, that they will not be interrupted or disrupt those in their surrounding environment (e.g., family members, housemates), and that the power does not go out. Students also have to remember to charge their HMDs before joining the session. To ensure that students can participate in the class activities when any of the above fail, instructors should provide alternative ways for students to take part, such as through desktop VR or a mobile-based application.

Additionally, to minimize the chances of students losing their way in VR, we recommend providing clear instructions on where to go and how to get there. These instructions should be left online throughout the class, with an instructor present to monitor a chat or videoconference call to help any students. We also recommend reducing the number of separate experiences students have during a given class and having students meet in one, continuous environment. In hosting an additional channel for assistance, we suggest using text communication, or coordinating the audio both within and outside the HMD, to ensure that there is no aural overlap or echoing.

P7 “When my headset disconnected it was slightly disheartening and I was unsure what I missed when I arrived back in the group as they had moved on from the prior exercise but once again that was due to my own error.”

Finally, we note that instructors should be prepared to adjust for the different bandwidths associated with different platforms. Each platform operates differently in terms of how it kicks a user out, how it represents and renders avatars (e.g., after a certain number of users are present in a session, some platforms may begin representing avatars differently, in low poly, for rendering purposes; see Figure 5), and how it handles audio input and output (e.g., how certain voices are prioritized or suppressed). Instructors should therefore test how the platform hosts groups of different sizes. The lower and upper limits of a platform’s bandwidth should be evaluated before bringing a group of students into the virtual environment, to ensure that the platform can handle larger groups and that students do not feel isolated or lost when they are kicked out or represented differently from their peers.

**Figure 5**
Users Represented as Either High-Poly, Full-Bodied Avatars (Left) or Low-Poly, Half-Bodied Avatars (Right)
*Note*. Avatars and their features (e.g., facial structure, skin color) are rendered differently depending on what device is being used to access the platform and how many users are present in the session. Screenshots reprinted with permission from ENGAGE.

Choosing Tasks for Inside and Outside of VR

Weekly tasks were selected based on two factors: what content and activities were available on the ENGAGE platform, and whether they leveraged the unique properties of VR (e.g., spatiality, travel). In Weeks 1, 2, and 4, the tasks were collaborative, hands-on, and centered on building. In Week 3, the task was centered on exploring a place that would have otherwise been challenging to travel to in person. Several students commented on how they felt about the different tasks and whether they enjoyed them or found them frustrating or challenging, which suggests that task type could have played a key role in shaping their experience and, ultimately, the measured outcomes.

On Week 2’s tasks:

P13 “I think that the first activity we did today with learning how to care for a newborn baby in a hospital was not executed as well as it should’ve been. [Group member’s name] and I struggled to interact with the baby, and we did not have enough instructions to figure out how to pick up the baby, care for it, etc. We ended up going over the time twice because we did not know what to do. I think that doctors/nursing simulators are not as effective as actually being in the situation in real life. On the other hand, the second activity about the cannons and math was actually fun, even though I did not understand how to do the math. It allowed you to be able to do something (throw cannons at a boat) that you would not be able to do in real life.”

P16 “[I] think that the ability to show me the practical use of the math equation definitely helped to cement it in my mind.”

What some students viewed as an advantage for VR-based learning, others reported as a disadvantage. On Week 3’s tasks, for instance, we note how the same experience resulted in different sentiments. This echoes the importance of how tasks are selected, and how they suit different students differently.

P13 “I actually really enjoyed this activity today. It felt like a more intense version of an IMAX movie that you would watch at a museum or aquarium, which I really enjoyed. I do think that although the experience of learning was very submersive, I wish that there had been an interactive activity after the film with the HMD.”

P11 “The virtual world in our lesson today was so cool and realistic that I was drawn to exploring the world [rather] than actually paying attention to what the lesson was teaching me. [It] was a little distracting.”

Similarly, how instructors plan for class outside of VR is also critical. In the present study, after each “active experience,” students were asked to discuss the tasks in a separate virtual room. Students lost their connection or their way to the separate virtual room, and this extra discussion added to the time they spent inside VR. It would have been more beneficial for activities such as these to occur outside of VR. We recommend that instructors plan not only for activities that take place inside of VR but also for those that take place outside of it.

In a similar vein, we had students meet on a videoconference call prior to the VR sessions. We frequently waited 10–15 min for everyone to arrive before sending them to groups. Although we provided credit based on attendance, only 68% of the students attended every class. We suspect that the technological issues students faced played a role in attendance. Our recommendation is to demonstrate flexibility and leave time at the beginning for everyone to arrive, set up, and troubleshoot before heading into VR.

Perceived Effectiveness and Excitement of VR in Classrooms

Common themes in perceived effectiveness and excitement toward using HMDs in classrooms emerged in open-ended responses. Several students mentioned the positive role that HMDs played in working together as a group, and optimism that their learning experiences would improve with further experience.

P7 “A major advantage was the hands-on experience even though we are miles away. [It] really made the group feel more like a group doing work together.”

P4 “A major advantage is the ability to engage ‘physically’ in a discussion through playing with certain aspects of group work. I think HMDs add a level of interactivity in group work [].”

P18 “I think the head-mounted displays added an interesting dynamic to class, and made us more interactive as a group.”

After the end of the course, instructors received additional feedback from a student regarding their experience with the HMD.

P “As a student who struggles with the structure, repetitiveness, and lack of creativity/engagement in our school system, I am very intrigued by new ideas as to how we can reform education. During the pandemic especially, it has been rough having to learn mostly online, but this class is a perfect example as to how a shift in learning, like using HMDs, creates a sense of excitement and an eagerness to go to class.”

Another theme that emerged was the ability to focus on the presented material and be free of physical world distractions.

P5 “I feel like the major advantage of learning this way is that it fully immerses you in the content and makes it so that you have little to no distraction … like your phone.”

P9 “[The HMD is] interactive and that keeps you focused. You cannot daydream or doze off when you have the headsets on. You’re always paying attention to what you’re learning.”

In relation to the fourth session’s task of building a creative room, students touched on the potential for using HMDs to promote creativity in an educational setting.

P13 “I really enjoyed being able to just do our own things today and really explore the ENGAGE world … The building process is something that is very similar to Minecraft, so I think a lot of individuals would feel comfortable doing something like that in the classroom.”

P10 “I thought today when we created our own world, the head-mounted displays gave us the freedom to really express our creativity. My experience with my group was definitely enhanced by the head-mounted displays as opposed to if we had to just meet on a screen.”

P18 “I think we were the most engaged in the virtual format. We were collaboratively building the world and seeing the results in real time.”

Such social interactions and engagement show that being together matters, especially during distance learning. Here, other students served as content for the course, and the tasks served as facilitators for these group interactions. To foster the sense of community that is essential in distance learning, allowing time for students to interact with one another is critical.

The Need to Examine Long-Term and Individual-Level Changes

There were no statistically significant differences or consistent trends observed across time in our weekly measures, though this study may be underpowered to confirm this (Han et al., 2022). However, when the data are unaggregated and viewed at the week level and individual level (Figure 2), we see high variation in how each student perceived their experience in a particular session. This suggests that, while an aggregated measure of how factors such as presence change over time is valuable in understanding the use of VR in classrooms, there is great variation in students’ experience in each session and on an individual level.

Discussion

Our present study elaborates on the unforeseen challenges instructors and researchers may come across in implementing HMDs and social VR environments in distance learning settings, but most of our findings are consistent with previous research and predictions. We found that perceived enjoyment of the tasks and interaction did not increase over time at a statistically significant level. The other repeated-measure variables (entitativity, realism, and our hypothesized outcome, presence) also did not increase significantly over time. Individual difference predictors were not significantly related to change over time. We turn to the qualitative observations to argue that 4 weeks was not sufficient to draw a conclusion about what patterns may emerge, given that more training was required and the first 2 weeks were spent helping students learn how to use VR.

Additionally, through direct observations made by instructors and qualitative analyses of responses from students, we gather a series of recommendations for developing an efficient class design and considerations for VR education researchers. First, we emphasize the finding that students must learn how to use VR in order to learn in VR. Without a level of comfort and self-efficacy in using an HMD to navigate around a social VR environment, students are not able to take full advantage of what educational VR has to offer. The elements of social, self, and spatial presence and of interactivity that make VR an attractive medium for learning may not be meaningful if students are struggling with the technology during lessons.

In addition to sufficiently training the instructors (e.g., Kavanagh et al., 2017), we recommend emphasizing a strong technical support system in the first few sessions to help students of all computer self-efficacy levels adjust to the medium. These first sessions should incorporate opportunities for students to acquire the skills needed to have a successful experience in VR. Additionally, some form of real-time support channel, such as a text-based chat or a persistent videoconferencing call, should be in place for students to fall back on if they run into issues in VR. We recommend minimizing the chances of students getting lost in VR and reducing the feeling of helplessness that may occur.

In line with other longitudinal work in VR (e.g., Khojasteh & Won, 2021), this study supports the finding that different types of tasks yield different effects (e.g., on entitativity). In conjunction with our earlier suggestion of providing sufficient support and VR technology-learning opportunities, we recommend that tasks early in the course rely less on interactivity and engagement. Tasks that are more challenging and take longer to learn should be reserved for after students have experience in VR. This gives students time to experiment with the controllers and software interface without feeling burdened by having to learn how to use VR and learn in VR simultaneously.

Furthermore, depending on the class size, if dividing the students into groups is a component of the course, instructors and researchers should decide how to make this division effectively and do everything they can to encourage students to attend every session, which was a challenge despite giving credit for participation. Depending on the instructors’ or researchers’ goals in splitting the students into smaller groups, three factors should be considered: fostering a sense of group entitativity; limiting the group size to maintain a level of interactivity and intimacy between the students, the instructors, and the course material; and allotting available resources and time to assist all groups.

The limitations of this study are similar to those frequently observed in classroom-based longitudinal VR studies. Our small sample size, which was determined by the number of students who were enrolled in the course, resulted in our statistical analyses being underpowered. While many of the trends in our data were not statistically supported, they could be measurable with a larger sample size.

Considering our finding that students must first learn how to use VR before meaningful outcomes can be observed, we recognize that 4 weeks of sessions with only 1 week of training were not sufficient for students to grow comfortable with the technology. The technology continued to feel novel. More research is needed to determine the minimum number of weeks required for students to feel confident in their ability to use the technology.

Additionally, we note that there was great variation in the type of tasks students completed every week, which may explain why the collected measures did not change over time. Given this lack of consistency in task type and given that the tasks are central to students’ perceptions of their experience, it is unclear whether the metrics accurately measure change over time. Future research should evaluate what the desired learning outcomes are and how learning is defined, and consider what tasks students can engage in sustainably and meaningfully over time.

Last, there are challenges that are unique to running remote experiments. For instance, we are unable to ensure that all equipment is fully charged and does not run out of battery in the middle of a session, that internet connectivity is stable, or that research aides are readily available to troubleshoot any software- and hardware-related difficulties. Such uncontrollable factors increase variance in the data. We note that one possible way to minimize the variance introduced by these factors is to set systems and guidelines in place for both instructors and students to follow.


Prerequisites for Learning in Networked Immersive Virtual Reality