Volume 4, Issue 1: Spring 2023. Special Collection: Learning in Immersive Virtual Reality. DOI: 10.1037/tmb0000098
Mixed-reality simulations (MRSs) create new opportunities for responsive teacher learning. This experimental study examined, starting from baseline Simulation 0 (Sim0) across three MRSs, the effect of differentiated coaching treatment in Simulation 1 (Sim1), and personalized practice in Simulation 2 (Sim2) on teachers’ provision of high-information feedback to avatar students as measured by the mean of feedback frequency weighted by quality. Motivational-supportive factors of MRS, including teacher perceptions of cognitive load, realism, and task-value were also examined. Participants were teachers at different career points (i.e., preservice: n = 69; early career: n = 66; and in-service: n = 68). Block randomization by grade level taught (elementary and secondary) and assessment of current feedback practices achieved pretreatment equivalence. Quantitative analyses revealed coaching significantly increased teachers’ use of high-information feedback. Teacher-selected personalized practice also impacted feedback. For example, the Sim2 decision to continue where the Sim1 action stopped was significantly related to decreases in clarify and value feedback and increases in pushing student thinking. Overall, differentiated and personalized professional development (PD) through MRS was valued by educators across the career trajectory and led to a complex evolution in teacher practices. However, the effects of the MRS were moderated by teaching experience. Teachers reported the MRS felt realistic and valuable; however, early career teachers’ perceptions were significantly lower than their preservice and in-service peers, especially under the coaching condition. In-service teachers reported greater cognitive load than less experienced teachers. Our discussion explores transforming teacher education across a career trajectory through assessing teaching expertise, differentiating PD, and personalizing practice.
Keywords: mixed-reality simulations, experimental design, teacher education, personalized learning, individualized instruction
Action Editor: Richard Mayer was the action editor for this article.
Acknowledgments: The authors acknowledge Olivia Barbieri, Abbey Judd, Nicole Nash, John Tournas, Sophie Rich, Harvard Graduate School of Education, the Agile Teacher Lab, Sean Glazebrook, Aminisha Ferdinand, KIDsmART, and the participating teacher education programs and school districts for supporting this research.
Funding: This research was supported by Reach Every Reader, a grant through the Chan Zuckerberg Initiative.
Disclosures: Chris Dede serves on the advisory board of Mursion, the technology used for the mixed-reality simulations examined in this article. No competing interests exist for any of the other authors.
Data Availability: Contact authors for possible access to replication data set. All measurement tools, professional development, and codebooks are available in the Appendices.
Correspondence concerning this article should be addressed to Rhonda Bondie, School of Education, Hunter College, 695 Park Avenue, Office 217, New York, NY 10065, United States.
Children bring a wide range of individual differences to learning that teachers must perceive and respond to during daily lessons. Toward this goal, education leaders have called for the redesign of P–12 education to include differentiation, where teachers adjust tasks in response to perceived student learning needs (Bondie & Zusho, 2018), and personalization, where students are offered opportunities to determine and then pursue relevant learning (van Drie et al., 2020). In contrast to these teaching and learning goals, most teacher professional development (PD) continues to employ a one-size-fits-all approach (Hertberg-Davis & Brighton, 2006), which as Gabriel (2010) suggests, “inherently ensures that some teachers … are not … challenged [to] their full potential” (p. 86). In fact, research has consistently illustrated how our current educator PD is ineffective, with combined average effects of PD being largely negligible on student achievement (Garet et al., 2007; Guskey, 2002; Yoon et al., 2007). To reach improved outcomes for both students and teachers, teacher education must be transformed (Borko et al., 2010) and designed to support career-long professional learning (Papay & Laski, 2018) that meets the varied needs of teachers who serve students with diverse needs in different contexts. Immersive technology offers an invitation to transform PD into differentiated and personalized learning experiences that both reflect contemporary learning goals for P–12 students and respond to the individual learning needs of teachers.
One type of immersive experience in a virtual environment is mixed-reality simulation (MRS; see Appendix A, for description). Operated by a human simulation specialist who puppets avatars in virtual environments, MRS offers new opportunities for educators to learn and practice teaching techniques in a consequence-free classroom. The use of MRS is aligned with proponents of practice-based teacher education, who identify teaching practice as key to strengthening teacher performance and as necessary for transfer of professional learning into daily teaching practices (Loewenberg Ball & Forzani, 2009). Salient features of MRS are well suited to promote best practices in teacher PD. For example, in line with Lindeman’s (1926) seminal research, simulation design can leverage qualities of deliberate practice (Ericsson, 1996), including opportunities to adjust teaching strategies with an explicit goal of responding to student differences, receiving immediate feedback from avatar students and a coach, and repeating practice. This can include useful variations such as instant changes of the teaching context and location, for example, moving from a room with one student to a group of students or a whole class (Bondie et al., 2021). MRS also holds opportunities for self-evaluation, self-direction, and applications to the adult’s current context (Knowles, 1984). Applying all these precepts, MRS has the potential to provide effective PD and teacher capacity building. Further, teaching in a virtual classroom addresses a common critique that educator PD is detached from the realities of day-to-day classroom experiences (Hill et al., 2020).
Although a growing body of research points to the effectiveness of MRS in promoting teacher learning (e.g., Howell & Mikeska, 2021; Hudson et al., 2019; Judge et al., 2013), studies have not explored how the unique affordances of MRS technology may be used to transform traditional PD by providing differentiated and personalized teacher learning (Bondie & Dede, 2021). Further, Korthagen (2017) critiques a linear cognitive approach to transferring theory into practice and calls for a new type of PD that attends to the motivations and individual strengths of teachers. More research is needed that examines teacher responses to learning through MRS that may in turn impact learning and transfer to daily practices. Literature reviews (Bondie et al., 2021; Ersozlu et al., 2021) have identified at least four factors that restrict our current understanding of teacher learning through MRS:
Research studies with small samples limited to educators at one common career stage (e.g., preservice teachers in university programs preparing to teach, early career teachers participating in induction programs, or in-service teachers who are currently working in schools as teachers).
The use of a single “one-size-fits-all” approach to PD and coaching provided within MRSs.
Measurement of teaching practices without attention to quality of the teaching practice or responsiveness of interactions between the teacher and avatar students.
Limited analysis of the MRS experience from the teacher’s perspective (e.g., cognitive load, relevance, and evaluation of learning).
Accordingly, the present study was designed to address these gaps in the literature by testing the same MRS with a large sample of teachers who were at different points in their career trajectories and in different contexts. In addition, the study engaged participants in three MRSs designed to respond to teacher individual learning needs, including a baseline Simulation 0 (Sim0), a treatment of differentiated coaching versus self-reflection in Simulation 1 (Sim1) that was determined through an assessment of teacher current practices, and personalized practice in Simulation 2 (Sim2) that offered teachers an opportunity to control the type of practice they experienced within the MRS.
Specifically, we investigated the extent to which preservice, early career, and in-service teachers’ provision of high-information feedback, the dependent variable, changed given repeated exposure to MRS with and without a treatment of differentiated coaching as an independent variable. In addition, we explored how personalized practice in Sim2 impacted teachers’ provision of high-information feedback, specifically examining the frequency of low-information feedback and of four high-information feedback types (i.e., clarify, value, correct, and push thinking). Finally, furthering Korthagen’s (2017) call for PD that includes the experience of the individual as a key factor in professional learning, we examined teachers’ reported feelings of realism, task-value, and cognitive load during the MRS experience. This study extends the existing literature by engaging a large sample of preservice, early career, and in-service teachers and testing how the affordances of MRS may be used to provide PD that is responsive to individual teacher learning needs.
A literature review aligned to the study’s goals illustrates how this study used MRSs to build on previous research and explore innovations in teacher education. We begin by exploring the evidence base related to teachers’ provision of high-information feedback. Then we turn our attention to the factors this study leverages to transform traditional teacher PD into experiences that are more responsive to individual teacher learning needs. We explore differentiated coaching, elements of motivationally supportive PD, and teacher learning across a career trajectory. Taken together, the literature review provides the foundation of this study and illuminates new paths in using MRSs in teacher education.
The dependent variable in this study was high-information feedback, a daily teaching practice that is highly desirable for P–12 teachers because of its positive impact on student achievement (Hattie & Timperley, 2007; Wisniewski et al., 2020). Research on PD underscores the importance of teacher learning aligned to evidence-based teaching strategies that have demonstrable impact on student learning outcomes. In this study, we examined teacher feedback in response to student comprehension of a nonfiction text. Previous research has found teacher feedback to be a high-leverage practice that is aligned to formative assessment professional standards (Danielson, 2007) and is strongly associated with student achievement (Hattie & Timperley, 2007). More specifically, Wisniewski et al. (2020) identified the relationship between feedback effectiveness and the amount of information contained in teacher feedback. For example, teacher feedback that includes actionable information such as, “You successfully identified a type of equipment from the passage, but your answer needs a key word that is in the question,” has been found to be more effective on student learning outcomes than low-information feedback (e.g., “good job”).
Research on feedback further suggests that there are distinct types of high-information feedback that promote student learning (Perkins, 2003). For example, teachers can provide feedback that clarifies and then values the student perspective by prompting students to elaborate (e.g., “Say more about your idea”) and articulating a specific quality the teacher noticed in the student response (“I am impressed by your careful attention to the meaning of the verb in the question that you answered”; Ritchhart & Perkins, 2008). Teachers can also offer corrective feedback, such as when teachers offer concerns that identify a problem (e.g., “the question is asking for the name of a piece of equipment”), and suggestions for improvement (e.g., “reread your article to find the specific name of the equipment that scientists used”). Finally, teachers can provide feedback that pushes students’ thinking (e.g., “describe your strategy”).
These different types of high-information feedback serve different purposes in promoting student learning and pose varying levels of technical difficulty for teachers in learning how and when to offer each type. For example, to clarify a student response, teachers could reply routinely to every student with a memorized question such as, “What makes you say that?” regardless of the initial student response. However, offering a value statement requires acknowledging specific information the teacher heard in a student response. Providing concerns and suggestions to support a student in correcting a misunderstanding may require teacher content knowledge and careful analysis of a student response. Given the observation of cognitive scientists that “learning is the consequence of thinking,” one could argue that the ultimate goal of teacher feedback would be to push student thinking (Ritchhart & Perkins, 2008). Therefore, feedback that prompts students’ thinking might carry the highest value on a continuum or hierarchy starting with clarifying student thinking, valuing student perspective, correcting student responses, and ending with pressing student thinking. Perkins (2003) organizes this hierarchy into a ladder of feedback (i.e., clarify, value, correct, and push thinking) that encourages teachers to offer all four types of feedback to each student in succession moving up the ladder.
In this study, we operationalized Perkins’ (2003) ladder of feedback by creating the dependent variable of high-information feedback (see Appendix E, for codebook examples). We counted the frequency of each type of high-information feedback and weighted the four types based on a hierarchy of difficulty. We then created a weighted mean high-information feedback score by dividing the sum of the weighted high-information feedback frequencies by the total number of teacher feedback utterances. The weighted mean provides a nuanced reflection of teachers’ provision of high-information feedback during the MRSs, allowing researchers to examine overall changes in teachers’ use of high-information versus low-information feedback and, more specifically, the exact types of high-information feedback that changed or did not change across the three MRS exposures. This information can then be used to provide further PD that is more specific to individual teacher needs.
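The weighted mean described above can be illustrated with a minimal sketch. The weights shown here (0 for low-information feedback, rising from 1 to 4 up the ladder) are hypothetical placeholders chosen only for illustration; the weights actually used in the study are not stated in this section, and the function and label names are likewise assumptions.

```python
from collections import Counter

# ASSUMED weights for illustration only; the study's actual weights
# are not given in this section.
ASSUMED_WEIGHTS = {
    "low_information": 0,
    "clarify": 1,
    "value": 2,
    "correct": 3,
    "push_thinking": 4,
}


def weighted_feedback_mean(utterance_codes, weights=ASSUMED_WEIGHTS):
    """Sum of weighted feedback frequencies divided by total feedback utterances."""
    if not utterance_codes:
        raise ValueError("at least one coded feedback utterance is required")
    counts = Counter(utterance_codes)
    # Multiply each type's frequency by its weight, then sum across types.
    weighted_sum = sum(weights[code] * n for code, n in counts.items())
    # Low-information utterances add 0 to the sum but still count in the denominator.
    return weighted_sum / len(utterance_codes)


# One hypothetical teacher's coded utterances from a single simulation.
example_codes = ["low_information", "clarify", "clarify", "value", "push_thinking"]
example_score = weighted_feedback_mean(example_codes)
```

Under these assumed weights, a teacher who gives only low-information feedback scores 0, and the score rises both when a larger share of utterances is high-information and when those utterances sit higher on the ladder.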
Feedback is not only central to student learning but also critical in enhancing teacher learning outcomes. Indeed, understanding the type, amount, and individualization of feedback needed to enhance teacher learning and apply learning to daily practices is a worthy goal. To that end, one of the key independent variables of this study was the treatment of differentiated coaching versus a control of self-reflection. Coaching is a strategy that has been found to be effective in changing instructional practices and raising student achievement (Kraft et al., 2018). In fact, research has demonstrated that coaching has a larger effect on achievement than most PD interventions (Hart & Nash, 2021). Although models of coaching vary, a primary role of a coach is to help teachers develop teaching expertise by providing job-embedded feedback, meaning the coaching happens within a context of teaching in a school environment. Similar to the role that coaches play in sports, instructional coaches often “serve as an expert ‘other’ who observes teachers from the sidelines, evaluates teachers’ strengths and weaknesses, identifies areas for focused improvement, and develops individualized strategies to promote development” (Cohen et al., 2020, p. 212). In short, following an assessment of teachers’ current strengths and needs, coaches provide timely and specific feedback so that teachers can make positive adjustments to their teaching.
Gawande (2011), a surgeon reflecting on his own professional growth, summarizes the importance of coaching across professional fields: “No matter how well trained people are, few can sustain their best performance on their own” (p. 44). Gawande’s claim was tested by Cohen et al. (2020), who compared teacher performance in MRSs given a standardized coaching treatment versus a control of individual self-reflection time and found that coaching significantly improved teacher feedback in response to avatar students’ comprehension of a fiction text. This study builds on Cohen et al.’s study by using a similar block randomized design. However, this study pushes the field forward by providing a treatment that is responsive to individual teachers instead of a standardized one-size-fits-all coaching treatment. Specifically, we differentiated the focus of the coaching treatment and written self-reflection prompts (see Appendix D) based on an assessment of teachers’ current use of high-information feedback. Further responding to individual teacher needs, Sim2 included personalized practice where teachers could choose to restart, continue, or try a new task for their last simulation. The differentiated coaching and personalized practice are novel features of MRSs aligned with motivationally supportive PD.
Research on PD points to the important role of teacher motivation, that is, teachers are more likely to gain from the experience when they feel like what they are learning is relevant, useful, interesting, and important—dimensions of what motivational psychologists refer to as task-value (Wigfield & Eccles, 2000). Another central aspect of motivation is expectancies; when individuals perceive that they can accomplish a task successfully (i.e., report higher levels of self-efficacy), they are much more likely to move to action. PD may not be designed with these motivational principles in mind—shortcomings that could potentially be addressed through differentiated and personalized PD using MRS.
Differentiation and personalization are proposed alternatives to the one-size-fits-all approach to teaching P–12 students through adult learners. Although sharing common elements, differentiation and personalization differ in important respects that bear on the design of the present study. In line with the National Education Technology Plan (U.S. Department of Education, Office of Educational Technology, 2017), we make the following distinction: differentiation refers to instruction that is designed by a teacher to remove barriers and extend learning in an effort to meet diverse needs of students (Bondie et al., 2019). On the other hand, personalization refers to instruction that is pursued by learners based on their own interests and goals (U.S. Department of Education, Office of Educational Technology, 2017). Therefore, differentiation involves teaching that is adjusted based on the teacher’s perception of learner strengths and needs, necessitating a mechanism for teachers to assess current strengths and skills that are needed for an upcoming lesson (Tomlinson, 2014). By contrast, personalization gives autonomy to the learner by prompting the learner to reflect on their needs and goals and then choose a productive next step for their learning (U.S. Department of Education, Office of Educational Technology, 2017).
From a motivational standpoint, both differentiated and personalized instruction should promote a learner’s sense of motivation and expectancies for success because instruction is tailored to learning strengths and needs. However, personalized learning affords more agency to the learner since it allows the learner (not the teacher) to set learning goals and select appropriate next steps (Pape, 2021). To the extent that research on motivation points to enhanced engagement under conditions of increased agency and control (Ryan & Deci, 2017), it stands to reason that personalized experiences may lead to enhanced motivation and, ultimately, learning. Whereas differentiated instruction may enhance specific elements of task-value, notably relevance and utility, personalized learning is likely to enhance other dimensions of task-value, specifically attainment (importance of the task) and intrinsic interest value. These findings suggest that personalized learning has affordances for enhanced learning that are distinct from differentiated instruction. More research is needed to determine the possible benefits of these different approaches to promoting teacher learning that is tailored to individual learning needs at different career stages.
Research further points to how MRSs can both facilitate and thwart motivation to learn. On the positive end, MRSs provide teachers a consequence-free space for trial and error, a space for deliberate and repeated practice critical to the development of expectancies for success and automaticity (Bautista & Boone, 2015). However, recent studies have also illuminated the complexities of how heightened affect in immersive environments may create barriers to learning (Parong & Mayer, 2021). Specifically, questions have been raised as to whether MRS environments are perceived to be cognitively taxing, thereby decreasing teacher motivation to engage in the simulation and thwarting potential positive effects of MRS. One potential avenue to ameliorating cognitive load is repeated exposure. Through multiple MRS exposures, teachers may gain automaticity, which in turn could lessen cognitive load and heighten teachers’ motivation to engage. Exploring teacher perception of cognitive load and affective responses during MRS learning experiences, therefore, may inform designs for more optimal learning experiences.
As noted above, coaching is another motivational PD tool that could also reduce perceptions of cognitive load and enhance overall teacher motivation. Indeed, drawing on motivational theories, Hart and Nash (2021) propose one of the primary purposes of coaching is to enhance teacher motivation. They state (p. 7):
“If teachers feel they do not have a voice in their schools or autonomy in their classrooms, empowering coaching practices can provide them with voice and autonomy. If teachers have lost a sense of self-efficacy, empowering coaching practices can rebuild self-efficacy by supporting teachers in identifying targeted goals and providing concrete evidence of progress.”
Although coaching can be motivationally supportive for teacher learning, constraints such as time (e.g., having enough time for coaching conversations or the length of time passing between coaching and the opportunity to practice with students) and finding locations for coaching and practicing complicate coaching in real classrooms. However, MRS holds potential for addressing the time and location constraints of face-to-face coaching. For example, instant changes of location can facilitate learning by immediately providing opportunities to apply new learning. In this study, MRS enabled teachers to move instantly from a coaching session to practicing a teaching technique with one avatar student and then return to a coaching session before moving to a larger group of students for more challenging practice. In addition to location, time can be altered (e.g., enabling teachers to erase time to start over, freeze time to return to a moment and continue, or fast forward time to move on to another task). The technology affordances of MRS provide new opportunities to explore coaching variants (Gibbons & Cobb, 2017) that illuminate what types of coaching support work for whom and under what conditions.
Although core teaching practices, such as providing feedback, are learned throughout a teaching career, professional learning experiences are typically organized and separated by three distinct stages of the teaching career: preservice (i.e., courses taken prior to working as a teacher often based in university certification programs), early career (i.e., required workshops through school districts during the initial years of teaching), and in-service (i.e., ongoing development for working teachers provided by a variety of organizations, universities, and teacher employers). This separation may be a function of convenience, in terms of the times that teachers are available for PD. However, the separation may also be in line with research on cognition that has commonly recognized differences between novices and experts. Experts are typically characterized as having a well-developed knowledge base, which allows them to discern more precise information, further leading to better organization and integration of knowledge. This ultimately results in experts being able to better remember, reason, and solve problems than novices (National Research Council, 2000). The research on expertise suggests that there may be differences in how MRSs can be designed to respond to the differing needs of relatively novice teachers (e.g., preservice and early career teachers) as compared to more experienced, in-service teachers.
For example, preservice teachers may need to develop automaticity in teaching skills through repeated practice and rehearse interactions with students, making MRS an ideal medium for PD (Pankowski & Walker, 2016; Peterson-Ahmad, 2018). In contrast, early career teachers, in their first 3 years of teaching, are independently managing a classroom in a new school context for the first time. For early career teachers, PD must fit into a broader range of teacher skills that include family communication, materials management, and leading extracurricular activities. While early career teachers may need more practice teaching, they may also feel overwhelmed with the independence and larger role of classroom teaching (Wiseman, 2021). In MRSs, early career teachers may look for avatar students to mirror the exact challenges that they are experiencing in their classrooms. In comparison, in-service teachers, with more than 3 years of experience, often report receiving little feedback or coaching to develop their teaching. Experienced teachers tend to use more instructional routines than novice teachers (Borko & Livingston, 1989). Therefore, accomplished teachers may find exploring new teaching practices with their students challenging given their current level of success and student expectations for a routine practice. MRSs for experienced teachers offer a burden-free environment for risk-taking and innovating with new teaching practices and responses. Although previous studies using MRS have measured preservice teacher growth in high-leverage content-specific instructional practices (Howell & Mikeska, 2021), fewer studies have explored MRS with early career teachers (Wiseman, 2021) and in-service teachers (Dieker et al., 2017) or compared the learning of teachers at different career points using the same MRS. Mikeska et al. (2021) called for the field to examine how deliberate variations within simulations may be responsive to teacher learning needs and use simulations to develop hypothetical learning trajectories for developing teaching practices.
Despite individual differences, throughout their career, teachers need ongoing professional learning that supports the integration of new knowledge into daily teaching practices and offers systematic opportunities to reflect on teaching (Korthagen, 2017). MRS provides both a flexible virtual learning environment and a consistent assessment tool that can be used across a career trajectory to measure growth of teaching practices-in-action. Engaging teachers across a career trajectory in a common MRS is an important step in exploring the extent that MRS can respond to teacher learning needs throughout their career and document how teaching expertise is developed over time.
This experimental study was designed with explicit attention to the research on effective PD using a simulation scenario that challenged teachers to provide feedback to a group of five avatar students who held partial understandings of a nonfiction informational text (National Governors Association, 2010). Specifically, we leveraged different MRS features to promote teacher learning (i.e., provision of high-information feedback) over three MRS iterations: Sim0 served as a baseline; Sim1 used block randomization to assign teachers to a differentiated coaching treatment versus a control of differentiated self-reflection prompts that focused teachers’ attention on increasing one type of high-information feedback based on an assessment of current teacher feedback practices; and Sim2 was personalized to promote teachers’ sense of agency in directing their own professional learning and motivation. Building on Cohen et al. (2020), we explored whether the effects of the simulations can be further enhanced through differentiated oral coaching versus written self-reflection prompts and whether the effects of the MRS were consistent among a sample of preservice, early career, and in-service teachers. Because elementary and secondary teacher preparation requirements differ, and similar to Cohen et al. (2020), we randomized treatment assignment within blocks defined by grade level (i.e., elementary or secondary) and preexisting level of feedback quality, as demonstrated by an assessment of current teacher feedback practices prior to treatment. This blocking was intended to prevent severe imbalances between the differentiated coaching treatment group and the self-reflection control group. A secondary goal of the study was to explore teachers’ impressions of the MRS experience, specifically to what extent they reported the simulations to be cognitively taxing, realistic, and valuable. 
In short, our experimental study breaks new ground in teacher education by providing differentiated PD and personalized practice across a series of MRS exposures comparing teachers at different career points.
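The block randomization described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' actual procedure: the two-level feedback-quality split ("lower" and "higher"), the function name, the field names, and the handling of odd-sized blocks are all hypothetical.

```python
import random
from collections import defaultdict
from itertools import product


def block_randomize(teachers, seed=0):
    """Assign teachers to coaching or self-reflection within blocks.

    Blocks are defined by grade level and baseline feedback level.
    The feedback-level categories are ASSUMED for illustration.
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for teacher in teachers:
        key = (teacher["grade_level"], teacher["feedback_level"])
        blocks[key].append(teacher["id"])

    assignment = {}
    for key in sorted(blocks):
        ids = list(blocks[key])
        rng.shuffle(ids)
        n_coaching = len(ids) // 2
        # ASSUMPTION: in an odd-sized block, a coin flip decides which arm
        # receives the extra teacher.
        if len(ids) % 2 == 1 and rng.random() < 0.5:
            n_coaching += 1
        for position, teacher_id in enumerate(ids):
            arm = "coaching" if position < n_coaching else "self_reflection"
            assignment[teacher_id] = arm
    return assignment


# Hypothetical roster: two teachers in each of the four blocks.
example_teachers = [
    {"id": f"T{i}", "grade_level": grade, "feedback_level": level}
    for i, (grade, level, _) in enumerate(
        product(("elementary", "secondary"), ("lower", "higher"), range(2))
    )
]
example_assignment = block_randomize(example_teachers, seed=42)
```

Randomizing within each block keeps the two arms balanced on grade level and baseline feedback quality, which simple randomization of the whole sample does not guarantee.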
This study was guided by three main research questions. Our primary research question (RQ1) examined whether differentiated oral coaching versus differentiated written self-reflection prompts over repeated practice in an MRS differentially affected preservice, early career, and in-service teachers’ provision of high-information feedback. In line with Cohen et al. (2020), we hypothesized that the MRS would enable teachers to increase their provision of high-information feedback (i.e., feedback that clarifies, values, corrects, or prompts thinking), and that teachers who received coaching might benefit more from the experience than those randomly assigned to a control group given written self-reflection prompts. We further hypothesized that the effect of the MRS would vary across preservice, early career, and in-service teachers given different levels of teaching experience related to feedback. For example, we expected that experienced teachers would provide high-information feedback with the greatest frequency and preservice teachers with the lowest frequency. Related to this research question, we also explored how specific feedback types changed across the simulations (RQ1a). We hypothesized that teachers in the coaching treatment would increase the type of feedback that was the focus of their differentiated coaching session.
Another aim of the study (RQ2) was to explore PD that is responsive to individual teacher learning needs. More specifically, RQ2 explored the impact of the assigned teacher learning need based on the assessment of current feedback practices on teachers’ provision of high-information feedback during Sim1 and the personalized practice decision on teachers’ provision of high-information feedback during Sim2. To better understand differentiated instruction in teacher PD, in RQ2a, we examined the specific types of high-information feedback given by teachers in Sim1 separated by their assigned feedback focus (i.e., teacher learning needs identified through an assessment of current practices) and the treatment and control groups. This analysis enabled us to explore changes in the provision of high-information feedback in relation to teacher practices prior to the differentiated coaching treatment. We hypothesized that the assigned feedback type needed in the teacher’s practice (i.e., the focus of the differentiated treatment) would align with an increase in the use of that specific type of feedback in Sim1. To explore the impact of personalized practice (RQ2b), we examined the provision of high-information feedback in Sim2 separated by teachers’ personalized practice decisions (i.e., restart the simulation, continue where the action in Sim1 had stopped, or move to a new task) and the treatment and control groups. This analysis enabled us to explore the interactions between the treatment, personalized decision, and provision of specific types of high-information feedback. We hypothesized that all three personalized practice decisions would be associated with increases in high-information feedback. 
In summary, RQ2 explored how teachers’ assignment to a specific differentiated condition based on the assessment of current practices was correlated with teachers’ provision of specific types of low- and high-information feedback (RQ2a) and how teachers’ decisions to restart, continue, or move on to a new task were correlated with their provision of specific types of high-information feedback (RQ2b).
Finally, our third research question (RQ3) explored teachers’ reactions to the overall MRS experience. Specifically, we explored the extent to which teachers at different points in their career trajectory found the MRS to be cognitively taxing, motivating (i.e., valuable), and realistic, and whether these trends varied by condition (i.e., coaching vs. self-reflection). We anticipated that teachers would generally find the MRS experience to be favorable; however, we questioned whether these patterns differed by teaching experience and condition.
Participants were 203 teachers (76% female) including suburban preservice, rural early career, and urban experienced teachers participating in PD. Preservice teachers (n = 69) were participating in field experience courses through five suburban university teacher education programs. Early career teachers (n = 66) were participating in a rural school district’s required monthly PD induction program designed for teachers with fewer than 3 years of teaching experience within their school district. In-service teachers (n = 68) were in an urban or suburban setting engaged in PD of their own choosing offered through an organization supporting ongoing teacher education or elective university courses. Table 1 describes the participants in our sample, the majority of whom were White women (57%). Participants were almost evenly divided by school level with 100 elementary and 103 secondary teachers. Our PD was embedded into existing courses held by each program during the 2020–2021 school year and was required by the school system or programs; however, participating in our research was optional. Our study took place during the COVID pandemic, during which teachers experienced frequent changes from in-person to online teaching. Our MRSs served as required field experiences within university courses for preservice teachers who were not allowed in schools for student teaching. Early career teachers were required to participate in our PD as part of their school district’s induction program. These PD sessions took place during the school day. The experienced teachers were engaged in PD through a professional organization that offered a selection of courses that took place outside of the school day. All teachers engaged in similar PD with variations in terms of duration and time elapsed between sessions consistent with school and university schedules.
The controlled environment of the MRS enabled us to consistently and accurately measure teachers’ use of high-information feedback within the dynamic real-life contexts of how teachers engage in PD across a career trajectory.
Table 1. Description of Sample by Sex, School Level, and Race
Our simulation scenario challenged teachers to provide high-information feedback to a group of five avatar students who held different partial understandings of a nonfiction informational text (National Governors Association, 2010; see Appendix C for initial avatar responses). The evidence-based teaching practice of providing high-information feedback has been associated with increased student learning (Hattie & Timperley, 2007) and is measured on teacher evaluation rubrics used throughout a teacher’s career (Danielson, 2007). Although teachers provide feedback to students daily, the practice is difficult to master because feedback is dependent on spontaneous interactions (i.e., listening and responding to students). Given the value, difficulty, frequency of use, and reliance on interactions, PD focused on providing high-information feedback was well suited both for MRS and for teacher learning throughout a career. An excerpt from a New York Times article discussing the discovery of water on the moon was selected for the avatar student task because of research on the increased benefits of nonfiction texts for students in terms of content knowledge and vocabulary acquisition (Kim et al., 2021) and teachers’ use of nonfiction texts across subjects and grades (National Governors Association, 2010).
The researchers, simulation specialists, and classroom teachers collaboratively developed and rehearsed the scripts over 3 months. For each avatar, we developed a full background life including experiences, likes and dislikes, and reasons for the response to the nonfiction text. The simulation employs improvisation where the simulation specialist puppets the avatars to respond to the teacher; however, the improvisation is governed by if-then statements for each avatar. So, if the teacher responds with a specific feedback type, then the avatar will say or do a scripted action (see Appendix B, for sample actor responses). Therefore, the simulations are responsive, but standardized. In addition, each teacher experienced three standardized challenges occurring at about 15 s, 2 min, and 5 min into the simulation where an individual student interrupted the teacher with a challenge regarding why an answer is wrong. Simulation specialists rehearsed regularly, watched videos of each other, and met weekly throughout the study to maintain consistency across simulation specialists. In addition, we balanced the number of MRS completed by each simulation specialist for each differentiated feedback type to minimize the impact of simulation specialist differences on teacher outcomes.
Figure 1 displays the sequence of MRSs used in this study illustrating three distinct innovations for teacher PD, specifically, assessment of current practices, differentiated intervention, and personalized practice.
All participants engaged in four MRSs as part of PD (see Appendix E, for Data Collection Timeline Table E1). The orientation simulation was the only simulation completed in a large group and was designed to orient teachers to the virtual environment. The following three simulations (i.e., Sim0, Sim1, and Sim2) were completed as individuals. Figure 1 highlights three important design features of this professional learning experience in the blue boxes: (a) assessment of current feedback practices, (b) differentiated intervention (i.e., coaching vs. self-reflection) to increase teachers’ provision of high-information feedback, and (c) personalized practice. These design features are explained below. Following the MRS exposures, participants completed a postsimulation survey exploring feelings of realism, task-value, and cognitive load.
For all teachers, the PD was conducted online through Zoom and began with an orientation session that included an overview of the study, informed consent, and an orientation MRS, where teachers engaged in the virtual environment as a large group. The orientation MRS invited teachers to ask the avatars what they liked to read outside of school. During this time, participants were challenged to discover what avatar students can and cannot do in the virtual classroom. Finally, participants completed a survey of current teaching practices that included demographics and scheduled a baseline simulation.
The baseline (Sim0) asked teachers to provide a group of five avatar students with feedback given their responses to the first discussion question related to the nonfiction text. Following the baseline simulation, participants engaged as a whole group in PD where they received brief instruction on the four types of high-information feedback (see Appendix B; Perkins, 2003) and then completed a written assessment of their current feedback practices (Appendix C). Researchers used the written assessment of current practices to assign teachers to a specific type of high-information feedback (i.e., clarify, value, correct, or push thinking) needed in their teaching practices.
To ensure pretreatment equivalence, researchers used block randomization based on the written assessment of current practices (i.e., clarify, value, correct, or push thinking) and school level (i.e., elementary or secondary) to assign each teacher to a condition of differentiated oral coaching or written self-reflection (see Figure 2). Oral coaching was differentiated to focus teachers’ attention on one type of high-information feedback that was assigned by researchers based on the assessment of current feedback practices. As shown in Figure 2, more teachers were assigned to focus on value and correct feedback types, indicating that fewer teachers needed help with clarifying student understanding. These findings also indicate that, based on the assessment of current practices, fewer teachers were ready to provide feedback that pushed students’ thinking.
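The block randomization described above can be illustrated with a minimal sketch. This is not the study's actual procedure or software: the function name `block_randomize`, the teacher identifiers, and the seed are hypothetical, and the sketch assumes that each block is one combination of assigned feedback focus and school level, with teachers alternated between the two conditions after shuffling within the block.

```python
import random
from collections import defaultdict

CONDITIONS = ("coaching", "self-reflection")

def block_randomize(teachers, seed=0):
    """Assign teachers to conditions, balancing within each block.

    teachers: list of (teacher_id, feedback_focus, school_level) tuples.
    A block is one (feedback_focus, school_level) combination (assumption).
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for teacher_id, focus, level in teachers:
        blocks[(focus, level)].append(teacher_id)

    assignment = {}
    for key in sorted(blocks):
        members = blocks[key][:]
        rng.shuffle(members)
        # Randomize which condition receives the extra teacher in odd-sized blocks.
        offset = rng.randrange(2)
        for i, teacher_id in enumerate(members):
            assignment[teacher_id] = CONDITIONS[(i + offset) % 2]
    return assignment

# Hypothetical roster: 40 teachers spread evenly over 8 blocks.
focuses = ["clarify", "value", "correct", "push thinking"]
levels = ["elementary", "secondary"]
teachers = [(f"T{i:03d}", focuses[i % 4], levels[(i // 4) % 2]) for i in range(40)]
assignment = block_randomize(teachers, seed=42)
```

Within every block the two conditions differ in size by at most one teacher, which is what keeps feedback focus and school level balanced across conditions before treatment.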
The introduction scene to Sim1 began with all teachers meeting an adult host who introduced the simulation and modeled the four types of high-information feedback (Table 2). First, all participants received a compliment valuing a specific strength in one type of feedback demonstrated on the teacher’s assessment of current practices. Then a goal was set with the teacher to work on offering students the next type of feedback in the hierarchy during Sim1. Teachers participated for 5 min in their assigned condition (i.e., differentiated oral coaching or written self-reflection) immediately followed by Sim1.
Table 2. Differentiated Feedback Types Hierarchy
Expected change in teacher feedback
Clarify correct answer
Clarify and value student perspective
Value student perspective
Offer concerns and suggestions
Identify both the reason correction is needed and suggestions for improving the student response
Push student thinking
Oral coaching consisted of direct instruction on the assigned type of feedback. Teachers were then instantly transported to a room with one student to practice giving feedback and afterward returned to debrief the practice with the coach prior to Sim1.
Similar to Cohen et al. (2020), in contrast to the directive oral coaching, the control condition offered teachers 5 min of time for self-reflection prior to Sim1. In the self-reflection condition, teachers were asked to take 5 min to plan using differentiated reflection prompts (see Appendix D). In addition, the avatar host asked the teachers in the self-reflection group to open a pdf file of their current assessment of feedback practices to support their planning. Teachers could stop the self-reflection time and start the simulation if they were ready to begin before 5 min had elapsed.
Following Sim1, all teachers returned to an empty classroom with an avatar host. They completed a written self-reflection survey starting with an open-ended question, “How did that go?” and identified the types of feedback that they offered and did not provide. Following the survey, the coach asked participants to choose a point in the lesson that they would most like to practice. Teachers could restart from the beginning as if the first time never happened, continue where they left off, or ask a new question. Following Sim2, all participants debriefed with the coach and completed a post-MRS reflection survey.
The main dependent variable for the study was high-information feedback, which was calculated using a weighted mean of high-information feedback provided to avatar students during each 8-min simulation. The dependent variable, high-information feedback, was created by qualitatively coding each teacher feedback utterance by the type of teacher intention along the feedback hierarchy (e.g., low-information feedback, clarify student responses, value student perspective, correct student response, and push student thinking). The rigorous process for qualitatively coding MRS transcripts and our codebook are described in Appendix E. Once all 16,707 teacher feedback utterances from the 609 transcripts were coded, we then multiplied the frequency of each type of feedback by the assigned difficulty level in offering that type of feedback (e.g., low-information feedback = 1, clarifying student responses = 2, valuing student perspective = 3, offering concerns and suggestions = 4, and pushing student thinking = 5) and divided by the total number of feedback utterances that took place during each MRS exposure to create a weighted mean of high-information feedback for each individual teacher for Sim0, Sim1, and Sim2.
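As a worked illustration of the weighted mean described above, the following sketch applies the stated difficulty weights (1 through 5) to one transcript. The utterance counts are invented for illustration only, and the names `WEIGHTS` and `weighted_feedback_mean` are hypothetical rather than taken from the study's analysis code.

```python
# Difficulty weights as stated in the text.
WEIGHTS = {
    "low_information": 1,
    "clarify": 2,
    "value": 3,
    "correct": 4,        # offering concerns and suggestions
    "push_thinking": 5,
}

def weighted_feedback_mean(counts):
    """Sum of (frequency x weight) divided by the total number of utterances."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("transcript contains no coded feedback utterances")
    return sum(WEIGHTS[code] * n for code, n in counts.items()) / total

# Hypothetical transcript with 27 coded utterances.
example_counts = {
    "low_information": 15,
    "clarify": 5,
    "value": 3,
    "correct": 2,
    "push_thinking": 2,
}
score = weighted_feedback_mean(example_counts)  # 52 / 27, about 1.93
```

Because low-information feedback carries a weight of 1, a score near 1 indicates a transcript dominated by low-information feedback, and the score rises as more utterances fall higher in the hierarchy.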
Teachers’ perceptions of how realistic the MRS experience was were assessed via a two-item index, to which teachers responded immediately after Sim2. The index included the items, “The MRS experience felt real” and “The avatar students behaved like real students,” which were averaged to create the scale. These items were assessed on a 5-point Likert scale where 1 = strongly disagree and 5 = strongly agree. The Cronbach’s α for this index was .86, indicating strong internal consistency.
In addition to the items related to realism, teachers were also asked to respond to items assessing their overall task-value for the MRS experience. The scale included the following five items, which were assessed on the same 5-point Likert scale and averaged to create the scale: (1) The MRS was a useful learning tool; (2) The experience was beneficial for me; (3) I would recommend the experience to others; (4) The MRS was relevant to my program of study; (5) The MRS was relevant for future professional practice. The Cronbach’s α for this scale was .93, indicating excellent internal consistency.
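For readers less familiar with the internal consistency statistic reported for the realism and task-value scales, the sketch below computes Cronbach’s α from item-level ratings using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The ratings shown are hypothetical and are not the study's data; the function names are illustrative.

```python
from statistics import mean, variance

def cronbach_alpha(item_scores):
    """item_scores: one list of k item ratings per respondent."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    columns = list(zip(*item_scores))                 # one tuple per item
    item_var_sum = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

def scale_scores(item_scores):
    """Average the items for each respondent, as done for both scales."""
    return [mean(row) for row in item_scores]

# Hypothetical responses to a two-item index on a 5-point Likert scale.
realism_items = [[5, 5], [4, 4], [3, 4], [2, 2], [4, 5], [1, 2]]
alpha = cronbach_alpha(realism_items)
realism = scale_scores(realism_items)
```

Values of α closer to 1 indicate that the items vary together across respondents, which is the sense in which the reported values of .86 and .93 indicate strong internal consistency.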
Cognitive load was assessed using one item, measured on a 10-point rating slider that asked teachers to slide the indicator to the level of cognitive load experienced during the final simulation. The slider defined “Very high cognitive load is when a task drains most or all of your mental energy quickly” and “Very low cognitive load is when a task has a minimal effect on your mental energy.”
Throughout the research process, steps were taken to ensure validity. Simulations were timed ensuring that each teacher had about 8 min of teaching time in each simulation. The simulation specialist training and steps for consistency are described in the Method section. To analyze the MRS transcripts, 16 research assistants were trained and certified and then coded transcripts as the data were collected over 18 months. Appendix E provides a detailed description of the coding process and links to the full codebook. On average, our coders scored an 83.5% exact match for all codes in the transcripts that were coded by two independent coders. When examining each type of feedback (e.g., low information, clarify, value, correct, and think), coders scored 86.4% at low-information feedback, 82.1% at clarifying, 70.1% at the valuing student perspective, 71.7% at correction, and 82.8% at pressing student thinking. Of the 609 transcripts analyzed in this study, 14% were double-coded. Our research team split each teacher utterance in the transcripts into the smallest unit of meaning based on the codebook so that each feedback utterance received only one code. The average number of teacher feedback utterances per transcript was 27, and the consistency of the number of lines coded among double-coded transcripts was 93.4%. In addition to double coding, the team consensus coded one transcript biweekly.
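The exact-match agreement reported above can be sketched as follows. The code labels and the two coders' sequences are hypothetical. The per-code breakdown additionally assumes that agreement for a feedback type is computed among the utterances to which the first coder assigned that type; the source does not state which denominator was used.

```python
from collections import defaultdict

def exact_match_rate(codes_a, codes_b):
    """Proportion of utterances given the same code by both coders."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("coders must code the same non-empty set of utterances")
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def per_code_agreement(codes_a, codes_b):
    """Agreement per code, conditional on coder A's code (an assumption)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for a, b in zip(codes_a, codes_b):
        totals[a] += 1
        hits[a] += int(a == b)
    return {code: hits[code] / totals[code] for code in totals}

# Hypothetical double-coded transcript excerpt (one code per utterance).
coder_a = ["low", "low", "clarify", "value", "correct", "think", "low", "clarify"]
coder_b = ["low", "low", "clarify", "correct", "correct", "think", "low", "value"]

overall = exact_match_rate(coder_a, coder_b)   # 6 of 8 utterances match
by_code = per_code_agreement(coder_a, coder_b)
```

Exact-match agreement does not correct for chance agreement, so it is best read alongside the distribution of codes.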
Research Question 1: Does teachers’ provision of high-information feedback change as a result of coaching and as a result of repeated (differentiated and personalized) exposures to the MRS?
The primary research question (RQ1) focused on exploring how teachers’ provision of high-information feedback changed over the course of three simulations, including a baseline (i.e., Sim0) prior to PD, differentiated coaching (treatment) versus self-reflection (control) prior to Sim1, and personalized practice during Sim2. We examined patterns among teaching experience (i.e., preservice, early career, or in-service) and performance across multiple exposures of MRSs.
A repeated-measures analysis of variance (ANOVA), 3 (teaching experience) × 2 (condition), was conducted on teachers’ provision of high-information feedback (i.e., the weighted mean dependent variable) over the three MRS exposures. Results revealed a large main effect for simulation, F(2, 196) = 42.25, p < .001, partial η² = .30, with statistically significantly higher scores observed for the latter two simulations (Sim1: M = 2.03; Sim2: M = 2.00) compared to the baseline simulation (Sim0: M = 1.76). This suggests that feedback in the form of coaching and/or self-reflection had a positive effect overall.
There was also a statistically significant main effect for experience, F(2, 197) = 14.60, p < .001, partial η² = .13, whereby scores for the preservice (M = 1.95) and in-service (M = 2.04) teachers differed significantly from the early career teachers (M = 1.80). No significant main effects were observed for condition; the overall scores of those who received coaching versus self-reflection did not differ from each other, F(1, 197) = 3.02, p = .08, partial η² = .02. These main effects, however, were qualified by two statistically significant interactions. First, we found a significant interaction effect between simulation and coaching, F(2, 196) = 6.36, p = .002, partial η² = .06. As shown in Figure 3, we saw greater growth in high-information feedback in the coaching condition than the self-reflection condition. This growth is most pronounced between the baseline and the first simulation where the differentiated coaching occurred.
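As a consistency check on the effect sizes reported above, partial η² can be recovered from an F statistic and its degrees of freedom through the identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the reported values; the function name is illustrative, and this is a reader's check rather than part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)

# Reported tests: (label, F, df_effect, df_error)
reported = [
    ("simulation", 42.25, 2, 196),
    ("experience", 14.60, 2, 197),
    ("condition", 3.02, 1, 197),
    ("simulation x coaching", 6.36, 2, 196),
]
effect_sizes = {
    label: round(partial_eta_squared(f, df1, df2), 2)
    for label, f, df1, df2 in reported
}
# effect_sizes reproduces the reported .30, .13, .02, and .06
```

The recovered values agree with the reported effect sizes to two decimal places.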
As shown in Figure 4, we explored the treatment and control conditions by teaching experience. There was a significant three-way interaction between simulation and coaching by teaching experience, F(4, 394) = 2.57, p = .04, partial η² = .03. Specifically, it appears that the coaching condition had a more pronounced effect on both the preservice and in-service teachers, rather than the early career teachers. Figure 4 shows that all teachers in the coaching condition experienced greater gains in high-information feedback between the baseline and first simulation. Preservice teachers in the control condition of self-reflection decreased their use of high-information feedback in Sim1; however, they increased high-information feedback in the personalized practice condition in Sim2.
Given the findings showing increased use of high-information feedback, we examined how the specific feedback types (i.e., low information, clarify student response, value student, correct, and prompt thinking) changed across the three simulations for each teaching experience level (see Figure 5). The purple bar on Figure 5 illustrates a decrease in low-information feedback for both preservice and in-service teachers across the three simulations. However, we observed a different pattern among early career teachers who increased their provision of low-information feedback from Sim0 to Sim1 and then decreased low-information feedback frequency during Sim2.
Figure 5 also illustrates the increase in high-information feedback (i.e., red, green, blue, and orange) versus low-information feedback (i.e., purple). The height of the bars in Figure 5 indicates the number of feedback utterances that students received. We see a decrease in the amount of total feedback (i.e., bar height) and an increase in higher information feedback (i.e., red, green, blue, and orange) for preservice and in-service teachers. This shows, across the three exposures, that preservice and in-service teachers provided less feedback with greater information. Early career teachers increased the amount of total feedback utterances (i.e., height) and also increased the low-information feedback. Across the three simulations, for participants in all groups, low-information feedback was provided most frequently. However, the provision of feedback that valued the student perspective and pushed student thinking increased for teachers in all groups.
Research Question 2: How does teachers’ provision of feedback relate to assigned differentiation and personalized decisions?
We believe that this study is the first of its kind to use the MRS affordances to provide a differentiated treatment condition tailored to individual teachers’ perceived needs. The coaching treatment focused on a specific type of feedback that was assigned based on feedback skills observed in a written assessment of current practices prior to Sim1. For example, for Sim1, based on the results of the initial assessment, some teachers were directed to work on providing feedback that pressed for thinking (i.e., the highest form of feedback), whereas others were directed to focus on providing feedback that values the student perspective. For Sim2, teachers were given the opportunity to personalize their learning. Specifically, they were asked to select for themselves (without researcher input) the type of practice that they thought would best support their learning. Teachers were given three options: restart, continue, or try a new question.
Given the lack of previous research examining how MRSs can respond to learner differences, we were especially interested in understanding whether or not these decisions made by the researcher (Sim1, with and without coaching; RQ2a) and the participants themselves (Sim2; RQ2b) had intended or unintended consequences. To that end, we explored whether there was a relationship between the assigned differentiation and the frequency of specific feedback types that teachers provided to the avatars during Sim1 (RQ2a). For RQ2a, we hypothesized that if the results were as expected, we would observe positive correlations between the assigned differentiated feedback focus (i.e., clarify, value, correct, or think) and the frequency of the feedback type that matched that focus. Similarly, we explored whether relationships emerged between teachers’ personalized decisions to restart, continue, or try a new simulation and the type of feedback they gave the avatars in Sim2 (RQ2b). Again, we hypothesized that we would observe positive correlations between teachers’ decisions and high-order feedback (i.e., press for thinking) if teachers made desirable decisions.
Table 3 displays the results of point-biserial correlations for RQ2a. We correlated teachers’ assigned differentiated feedback focus (i.e., clarify, value, correct, or think) with their frequency of feedback types (i.e., low information, clarify, value, correct, and think) during Sim1. Table 3 displays the correlation analyses for the total sample, as well as broken down by conditions of coaching (treatment) and self-reflection (control). In examining the correlations, one can draw the following conclusions. First, in line with our hypothesis, in the sample as a whole, we see that those who were assigned to focus on pressing for thinking did provide more thinking-related feedback to the avatar students. However, we also see trends suggesting that the assigned differentiation did not always result in intended outcomes. For example, we see that, overall, those who were assigned to focus on clarifying understanding were more likely to provide feedback related to correction, whereas those who were assigned to focus on correcting misunderstanding were less likely to provide feedback related to thinking.
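To make the point-biserial analysis concrete, the sketch below correlates a dichotomous indicator (1 = assigned to a given feedback focus, 0 = assigned to any other focus) with a frequency count, using the standard formula r_pb = ((M1 − M0) / s) × √(n1 × n0 / n²), where s is the population standard deviation of the frequencies. The dummy coding and the data are assumptions made for illustration; they are not the study's data.

```python
from math import sqrt
from statistics import mean, pstdev

def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 indicator and a numeric variable."""
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    n1, n0, n = len(group1), len(group0), len(values)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups must contain at least one observation")
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("values have zero variance")
    return (mean(group1) - mean(group0)) / spread * sqrt(n1 * n0 / n ** 2)

# Hypothetical data: 1 = assigned to focus on pushing thinking.
assigned_think = [1, 1, 1, 0, 0, 0, 0, 0]
think_frequency = [6, 4, 5, 2, 3, 1, 2, 1]   # thinking-related utterances in Sim1

r_pb = point_biserial(assigned_think, think_frequency)  # about 0.89
```

A positive coefficient means that teachers assigned to the focus provided that feedback type more often than teachers assigned elsewhere; the point-biserial coefficient is numerically identical to a Pearson correlation computed on the 0/1 indicator.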
Table 3. Relationship Between Assigned Differentiated Coaching and Frequency of Feedback Type for Total Sample and by Condition for Simulation 1
Column group: Frequency of feedback type for Simulation 1. Row groups: Total sample (N = 203); Self-reflection (n = 104); Coaching (n = 99).
Note. * p < .05. ** p < .01. *** p < .001.
In examining trends broken down by the coaching treatment and self-reflection control conditions, we observe more statistically significant relations in the coaching condition than in the self-reflection condition, indicating that coaching did impact the frequency of specific types of feedback. For example, teachers who were directed to focus on clarifying feedback did provide more clarifying feedback to the avatars; similarly, teachers who were asked to focus on providing feedback on correction provided more corrective feedback, and teachers who were asked to focus on providing feedback on thinking also provided more feedback related to press for thinking. However, we also see that teachers who were prompted to focus on clarifying actually provided more low-information feedback. Taken together, these findings indicate that the assigned differentiated instruction did prompt teachers to emphasize specific types of high-information feedback; however, the changes were sometimes not consistent with our expectations.
For RQ2b, point-biserial correlations were conducted to explore the relationship between personalized practice (i.e., the teacher’s decision to restart, continue, or ask students a new question) and the provision of a specific type of feedback (i.e., low, clarify, value, correct, think; RQ2b—see Table 4).
Overall, we found different patterns in the types of feedback offered to avatar students based on whether a teacher decided to restart, continue, or ask a new question. Specifically, we observed a positive relationship between the decision to continue and press for thinking, regardless of whether teachers were in the coaching or self-reflection condition. We also found that teachers who decided to restart the simulation were more likely to provide low-information feedback, especially in the self-reflection condition, or corrective feedback, for those who received coaching. Teachers in the coaching condition who decided to ask for a new question were more likely to provide corrective feedback and less likely to provide feedback that pressed for thinking, whereas teachers in the self-reflection condition were more likely to provide feedback that clarified or valued students’ perspectives. Taken together, the decision to restart, continue, or ask a new question may be indicative of differing degrees of teachers’ metacognitive awareness of their skill to provide high-information feedback. Ultimately, teachers’ decision to continue was correlated with a greater press for student thinking than teachers who decided to restart or ask for a new question. This pattern is logical because teachers did not start from the beginning, but rather built on the feedback and revised student responses accomplished in the previous simulation (Sim1; Table 4).
Table 4. Relationship Between Teacher Decision and Frequency of Feedback by Type for Total Sample and by Condition for Simulation 2
Column group: Frequency of feedback type for Simulation 2. Row groups: Total sample (N = 203); Self-reflection (n = 104); Coaching (n = 99).
Note. * p < .05. ** p < .01. *** p < .001.
Research Question 3: How did teachers perceive the MRS experience?
Building on Korthagen’s (2017) literature review that points to the importance of gathering data about teachers’ perceptions, our final research question explored teachers’ perceptions of how real and valuable the MRS experience was based on their treatment condition (i.e., whether or not teachers received coaching or self-reflection) and also by their teaching experience (i.e., preservice, early career, or in-service). Specifically, we ran a 2 (coaching vs. self-reflection) × 3 (teaching experience) multivariate analysis of variance on the dependent variables of realism and task-value. Overall, there was a significant multivariate effect for experience, F(4, 382) = 8.17, p < .001, partial η² = .08, which was qualified by a statistically significant interaction between coaching and experience, F(4, 382) = 4.04, p = .003, partial η² = .04. Further examination of the univariate effects indicated that there was a statistically significant difference in task-value based on experience, F(2, 191) = 13.33, p < .001, partial η² = .12, with early career teachers reporting statistically significantly lower task-value (M = 3.82, SD = 1.06) compared to preservice (M = 4.37, SD = .78) and in-service (M = 4.48, SD = .59) teachers. No main effects of the independent variables were observed for realism.
Moreover, as displayed in Figure 6 and Figure 7, respectively, results revealed that average ratings of both task-value and realism varied not only by teaching experience but also by condition. Specifically, we see that teachers’ ratings of task-value were higher and relatively comparable under the condition of self-reflection, but for early career teachers, especially, their ratings of task-value were much lower than preservice and in-service teachers under the coaching condition. In fact, we see a slight increase in teachers’ ratings of task-value for preservice and in-service teachers with coaching.
As for realism, we again see a difference; however, this time, it is the preservice teachers who show a different pattern. Specifically, preservice teachers’ ratings of realism were higher under the coaching condition as compared to the self-reflection condition. By comparison, we see that both early career and in-service teachers’ ratings of realism were higher under the self-reflection condition than the coaching condition.
Finally, we also conducted an ANOVA to investigate whether teachers’ perceptions of their cognitive load varied by teaching experience and by condition. No main effects were observed for condition; however, findings revealed a statistically significant effect of teaching experience, F(2, 176) = 7.84, p < .001, partial η² = .08, with in-service teachers reporting the highest cognitive load (M = 7.02, SD = 1.42), followed by early career teachers (M = 6.20, SD = 1.99), and preservice teachers (M = 5.82, SD = 1.71). Post hoc analyses indicated that these means were all statistically significantly different from each other. Overall, findings related to RQ3 suggest that with means greater than the midpoint of the scale, teachers found the MRS experience to be mostly realistic, valuable, and low in cognitive load; however, paralleling the findings observed in RQ1, the early career teachers did not respond to the experience as favorably as preservice and in-service teachers.
This study begins to transform teacher education by using contemporary technologies to differentiate and personalize professional learning. However, Makransky and Petersen (2021) warn that “it is not the medium of IVR [immersive virtual reality] that causes more or less learning, but rather that the instructional method used in an IVR lesson will be specifically effective if it facilitates the unique affordances of the medium” (p. 940). In addition to leveraging MRSs’ capacity to instantly change locations and manipulate time, our study included three novel elements designed to respond to individual teacher learning needs: an assessment of current practices to identify a needed focus for PD, differentiated coaching to develop teaching practices, and personalized practice to increase teacher autonomy toward pursuing self-directed professional learning goals.
Although researchers are generating knowledge about how MRS can be used to develop desired teaching practices (e.g., Hudson et al., 2019; Judge et al., 2013; Pankowski & Walker, 2016; Peterson-Ahmad, 2018), questions remain about what MRS features improve learning outcomes, the conditions under which MRS is and is not effective, and the kinds of cognitive and affective processing that occur during MRS. For example, previous MRS research has continued to rely on a one-size-fits-all approach to simulations. However, contemporary learning goals for P–12 students and their teachers call for instruction that is responsive to individual learning needs. Kavanagh et al. (2020) have called for more research “on the characteristics of designs for teacher professional education that prepare professionals to enact practice that is thoughtfully adaptive and responsive to students’ in-the-moment thinking.” To fill this gap, this study used a blocked random assignment design to examine the impact of a differentiated coaching treatment and personalized practice on teachers’ provision of high-information feedback.
Our findings suggest that teachers at different career points found the MRS experience to be realistic and valuable, and for the most part, low in cognitive load. However, early career teachers did not respond to the intervention as favorably as did the preservice and in-service teachers. We observed growth in teachers’ provision of high-information feedback over repeated exposures to the MRS, particularly under a condition of coaching over self-reflection, a finding that is in line with previous research (Cohen et al., 2020). We also found that patterns of growth were not the same across the three groups of teachers.
Although we observed growth, teachers, regardless of experience and condition, relied on low-information feedback and were less likely to press for student thinking. However, we did observe a positive correlation between the coaching condition and feedback focused on thinking, suggesting that coaching may support teachers in developing feedback skills to press for student thinking. Yet, specific types of high-information feedback such as acknowledging the student perspective and pushing student thinking were provided with lower frequencies, even with differentiated PD focused on providing that type of feedback. Thus, developing teachers’ ability to provide high-information feedback that values students’ perspectives and pushes them to think critically remained a growth opportunity for teachers.
Building on Cohen et al. (2020), this study confirms the importance of coaching in the MRS environment. Indeed, our findings suggest coaching to be a critical lever in teacher learning; however, our findings also suggest that in some cases, coaching had some unintended consequences that warrant further attention. Although we found that a teacher’s assigned focus did lead to meaningful changes in specific feedback types, this still did not lead to the magnitude of change expected, nor did the growth follow a monotonic trend.
For example, clarify was designed to help teachers learn the correct answer to the simulated student task if teachers had provided the incorrect answer in the assessment activity. This should have translated to teachers providing more clarifying feedback; however, we observed a negative correlation between a focus on clarifying feedback and valuing the student perspective. It may be that coaching focused on deepening teacher content knowledge led to teachers discovering the correct answer and, in turn, new knowledge of the correct answer led to greater attention on correcting responses and less value on exploring student ideas. These results suggest that clarifying coaching had unexpected but logical results. Our findings suggest that, despite our desire to avoid “one-size-fits-all” PD by differentiating teachers’ focus on a demonstrated need, determining how to assess current teacher practices and then provide optimal PD remains a puzzle. Further, we found that both early career and in-service teachers who received coaching found the simulation to be less realistic. This may reflect teachers’ real-life experiences of having minimal opportunities for coaching, focusing on one aspect of a teaching practice, or practicing with an individual student before trying a teaching practice with a full class. Further research might explore the qualities of MRS that make the virtual environment feel real for teachers at different career points.
Our findings also indicate that teachers seemed to select a logical personalized practice option. For example, we noticed that teachers who struggled to provide high-information feedback in Sim1 often restarted and were then able to improve their feedback with a third attempt. Similarly, teachers who felt that the student responses had been fully developed often chose to ask a new question. Experienced teachers often chose to continue, which enabled them to build on the learning accomplished in Sim1 and further press student thinking. Interestingly, many experienced teachers chose to further student learning by continuing the action from Sim1, which led to opportunities to push student thinking using feedback statements higher on the feedback hierarchy. More preservice teachers chose to restart the simulation to develop their own automaticity, which also placed them back at the beginning of the feedback hierarchy. Future studies might explore requiring novice teachers to continue to ensure they experience the opportunity to push student thinking and the satisfaction teachers may feel when students are deeply learning through teacher–student interactions. In addition, future studies might explore the extent to which personalized practice decisions were made based on the avatar student learning or teachers’ desire to work on an aspect of their teaching and the impact on teacher learning of that perspective. Clearly, our results suggest that there is much more to be learned about the impact of personalized practice and differentiated teacher learning.
This study used both the controlled MRS environment for consistent measurement of teaching practices and the technology affordances of a virtual environment, including instant changes to location and the manipulation of time, to examine adjustable approaches to teacher professional learning. The MRSs used in this study responded to individual teacher learning needs through differentiated coaching (Sim1) and personalized practice (Sim2)—a first of its kind in this regard. The results of the study provide theoretical, methodological, and practical implications for the field of teacher education and the use of MRS to promote teacher application of evidence-based teaching practices, such as offering high-information feedback to all students.
These findings have several theoretical, methodological, and practical implications. First, from a theoretical perspective, our study contributes to the theory and research on feedback. Hattie and Timperley’s (2007) model of feedback suggests that effective feedback must provide the learner with direction about the goals for learning, the processes to achieve set goals, and possible next steps. The generally positive effect of coaching in this study provides further support for their model, considering that the role of coaches was to provide this kind of effective feedback. Specifically, teachers in the coaching condition were given a targeted goal aligned to their needs and were provided continuous feedback on ways to improve. The positive effect of coaching continued under the personalized practice condition, such that teachers in the coaching condition mostly decided to continue the MRS, which ultimately resulted in a higher frequency of higher order feedback focused on thinking.
Research on feedback also points to the important distinction between low- and high-information feedback (Wisniewski et al., 2020). Specifically, Wisniewski and his colleagues observed that feedback is generally more effective when teachers provide more information. Our study builds on this understanding by highlighting specific types of high-information feedback, namely feedback that clarifies students’ understandings, values the students’ perspectives, provides a rationale and actions to correct responses, and presses for thinking. To this end, this study adds to the conceptualization of feedback as “a complex and differentiated construct that includes many different forms with, at times, quite different effects on student learning” (Wisniewski et al., 2020, p. 13).
In addition, there are several important methodological implications of this study. This study measured the specific intention in teacher interactions with students when offering feedback versus an overall rubric score or a frequency count of a more general teaching behavior. By qualitatively coding each teacher feedback statement offered to students, we can measure teaching in terms of the specific type of high-information feedback (i.e., clarify, value, correct, and think), offering insight into the quality of teacher interactions with students and responsiveness of teachers to student needs. This measurement will transform expectations for teaching quality from the presence of a teaching behavior without attention to the impact on and interactions with students to measuring the qualities of teacher–student interactions.
The study also contributes to our understanding of conducting research that includes both researcher-driven decisions and teacher decisions. For research interventions to live a robust life in classrooms beyond research studies, teacher participants must feel agency and relevance. Our study addresses the challenges of refining interventions so that participants have agency in their learning. We demonstrated measurement methods that allowed for both participant agency and rigorous research.
We modeled in our PD design and MRS script effective feedback practices. For example, we began the PD with an assessment of current strengths and needs. Then in the MRS, the coach provided all teachers with a value statement regarding a strength in feedback practices that researchers identified in the participants’ assessment of current practices. Identifying the teaching practice that the new learning from PD will build upon may increase teachers’ perception of the utility of PD and ultimately application of PD learning in daily teaching practices. In addition, recognizing teachers’ strengths may increase teachers’ feelings of value. Differentiating PD and offering personalized learning options do not require the use of MRS and could easily be replicated and further examined in teacher education.
Our study also demonstrates the possibilities of PD that seeks to identify the strengths and learning needs of adult teacher participants and then respond specifically by valuing teacher strengths and focusing on individual teacher needs. Our work adds to the growing understanding that individuals respond differently to interventions and those differences should not be ignored. In addition, the study shows that one teaching task can be relevant and challenging for teachers in different contexts and with varying levels of experience. Although early career teachers rated the experience lower than their peers, their ratings, along with those of all study participants, showed strong value of the MRS as PD from the teacher’s perspective. This suggests that one MRS could be used longitudinally to support teachers in recognizing the development of teaching expertise across many years.
To that end, the selection of the avatar student task and learning scenario is critical in creating MRSs that feel relevant for teachers across grades, in different teaching contexts, and at different points in their career. We advise future studies to implement the MRS task with real students, who are in the age ranges served by the teachers that the MRS is intended to support. Responses from real students should be incorporated into the MRS script. An MRS built around a robust student task and an essential evidence-based teaching practice can provide P–12 teachers with the opportunity to return to the MRS throughout their career to see changes in their teaching practices and ability to respond to individual students on their feet as learning unfolds.
There are a number of practical implications of this study. Perhaps most important is the significance of designing motivationally supportive PD that recognizes and values individual learners as professionals with unique goals, strengths, and needs. Moreover, our findings suggest that, by leveraging the affordances of MRS, it is possible to design PD that is differentiated and personalized to meet teachers’ needs. The findings also underscore the importance of sustaining teachers’ motivation to learn by creating safer spaces for practice, trials, and failure with the aid of a supportive coach. The MRS environment may be suited to provide this type of space.
We found that teachers’ experience in the MRS environment was not uniform. Similar to previous studies exploring novice and expert learning (e.g., Borko & Livingston, 1989; Manning & Payne, 1996), we found that novice and experienced teachers experienced MRSs differently. Overall, our findings showed more positive benefits among the preservice and in-service teachers as compared to the early career teachers. This suggests that it may be important to differentiate PD to suit the varying needs of teachers at different stages of the career trajectory. Further research should explore the cognitive load burden of experienced teachers in MRS. Taken together, these findings suggest that more research is needed to better understand how teaching expertise develops over time and the optimal conditions for developing teachers with different needs and levels of experience.
Most importantly, the findings suggest the utility of incorporating MRSs in teacher education programs across the career trajectory. Teachers often complain of “… too much theory and not enough practice” in teacher education (Lampert, 2010, p. 23). In response, teacher educators are increasingly using MRS to provide a consequence-free virtual classroom environment for teachers to develop teaching skills as an alternative to practicing with real children (American Association of Colleges of Education, 2020). Teacher educators and researchers should use and further examine key MRS design features, such as the ability to instantly transport a teacher to a room with one student for practice prior to participating in the full simulation. In addition, more studies should examine time manipulation within the MRS to create options for personalized practice, including time being erased (restart), frozen (continue), or fast forwarded (ask a new question). These opportunities for differentiated and personalized coaching, rehearsals, and scaffolded transfer of theoretical learning into teaching practices (i.e., working with one student prior to trying a practice with a larger group) are not possible in real school environments, making MRS a unique tool in teacher education.
Our study was designed to fit into the established systems and time structures for PD at each teacher experience level. For example, preservice teachers engaged in our MRSs as part of a college course, early career teachers as part of their required induction program, and experienced teachers as a selected course from a PD organization. Although the context and the PD that participants engaged in varied, the MRSs provided a consistent and reliable means to measure teacher growth. Each aspect of the PD and MRS experiences could fit into a 45-min time period. We designed our study to work well within the limitations and practical realities of working with schools across different contexts with a large number of teachers.
Although this study had a large sample size sufficient for the quantitative analyses used, when dispersed among both the conditions and differentiated treatment, the sample size was reduced. Given the complex outcome of differentiated PD, future studies may consider assigning teachers to focus on one of two different feedback types, instead of one of four feedback types. Further, our study was limited by only having one group of participants within each teaching experience level. Therefore, our results are not generalizable beyond the sample population. Further, because the large number of participants scheduled multiple individual MRSs, even with three simulation specialists, data collection took place throughout a school year. Teachers may have experienced different amounts of stress during the school year that could have impacted their learning in the MRS. In addition, different amounts of time elapsed between the baseline (Sim0) and the interventions (Sim1 and Sim2) for individual participants. These time differences may have resulted in greater effort to remember the student task in the virtual environment and the high-information feedback types, which in turn may have impacted teacher performance in the MRSs. Future studies should explore the impact of elapsed time and different times of the school year on teacher learning in MRSs. The majority of study participants and research team identify as White women, reflecting the current demographics in the field of education (Ingersoll et al., 2021); this limited the perspectives brought to and amplified in our study. Future studies may recruit a more diverse research team and engage participants from schools and programs with greater ethnic and cultural diversity. Future studies should directly explore teachers’ perspectives regarding the challenges of virtual representations with diverse avatar students by one simulation specialist.
In addition, variations in the simulation specialists’ performance that could impact the teacher’s provision of high-information feedback were anticipated and minimized in the procedures of our study. For example, simulation specialists sometimes used inconsistent improvised dialog by adding a word or two to the standardized challenge statement that enabled teachers to reply with high-information feedback or gave longer student phrases that may have impacted the amount of time teachers had for feedback opportunities during the simulations. Actions taken to prevent and minimize the impact of simulation specialist variation included balancing simulation specialists among the treatment and control conditions and across the differentiated feedback focus. The simulation specialists rehearsed regularly, provided standardized lines at specific times during each simulation, and kept simulations to roughly the same length. In addition, an online form was used for simulation specialists to record errors or other events that happened during simulations as well as moments of important discoveries. These events were reviewed by researchers and the simulation specialists at weekly meetings to maintain consistency in the MRSs throughout the school year and among the three simulation specialists. Future studies might specifically examine actions to minimize simulation specialist variations.
Finally, our methods may have limited our understanding of our data. This study only examined the research questions using quantitative analyses. We may have missed nuances and complexities that qualitative data may have provided. Further, assumptions of value were given to the different types of feedback (e.g., low information with a weight of zero and push thinking carrying the highest weight of the frequency of thinking utterances multiplied by five) to create the weighted mean. These values might not generalize across different contexts and cultures. Finally, although the MRS environment facilitated measurement of how teachers are able to change their teaching given coaching and multiple trials at a task, we were not able to follow teachers into their classrooms due to COVID-19 to determine the extent to which practice in MRS impacted teaching with real students. These limitations were largely mitigated through the experimental design, block randomization, large sample size, procedures to ensure accurate data collection, and rigorous coding of the simulation transcripts, which together create a strong study that extends the field of teacher education into new directions. Future studies may tackle factors that limited our analyses and ability to generalize beyond our sample population.
Taken together, our findings illustrate the complexity of teaching that is responsive to individual learner needs. This study extends the literature on MRS design and teacher PD in important ways and raises new questions. Specifically, more research is needed to understand the effectiveness of differentiated PD on growth in teaching practices. Further studies may build on the approach of assessing current teaching practices and differentiating PD interventions and compare the impact of differentiated or personalized PD with a standardized approach. Additionally, future studies might examine the equitable distribution of high-information feedback across students and the relationship between the needs that students demonstrate in their responses and the feedback teachers provide.
Future studies might also explore teachers’ motivational feelings of efficacy, autonomy, and control under the condition of personalized practice. Although the present study suggests that most teachers found the overall experience to be valuable, we did not test whether teachers’ feelings of motivation would be enhanced more in the personalized practice condition. Future research should also explore teachers’ reactions to the opportunity to personalize their experience. At our debrief meetings, our simulation specialists often commented that some teachers struggled to make the personalized practice decision and sought advice from the simulation specialist. Future studies might examine MRS with and without teacher supports for reflecting on past teaching to set goals for future teaching.
Questions also remain about the use of MRS as PD across a career trajectory and the context of PD through MRS that is most beneficial for teachers. In our study, the sample of early career teachers differed in important ways from the sample of preservice and in-service teachers. However, it is unclear if the findings are attributable to career stage, logistical factors, or another unexamined factor. For example, the early career school district induction program took place during the school day. Therefore, the PD sessions were shorter and had to be adjusted as the pandemic changed school schedules. In contrast, the preservice and in-service teachers were not juggling the PD during a workday; instead, their PD took place in the evening, outside of the school day. Future studies might examine the specific needs of early career teachers and the impact of the context of PD, perhaps comparing PD that takes place during and after the school day.
Effective teacher learning is important for every school system in the United States, given the large size of the teaching occupation and the need for PD that might support greater satisfaction and retention in the field as well as improved student learning (Ingersoll et al., 2021). Evidence from this study suggests that teachers along a continuum of experience found MRS-based PD to be relevant and valuable, indicating that MRS may provide a means to close the persistent gap between educator PD and improvements to daily teaching practices (Hill et al., 2020; Lampert, 2010). Indeed, teachers at different career points found the MRS experience to be valuable and grew in their provision of high-information feedback. Our study provides an example of how essential daily teaching practices can be developed throughout a career. Future studies should continue to examine the same MRS with teachers at different points in career trajectories to provide new insights into how teaching expertise develops over time. Continued research could lead to more efficient and effective preparation and ongoing learning for educators.
Further, our evidence highlights possible challenges that teachers may face when left to change their teaching practices on their own through self-reflection, even with guiding prompts. Given the isolation that many classroom teachers experience throughout their career and the expectation that, following most PD, teachers will integrate new practices on their own through self-reflection, future studies may focus on examining ways to engage teachers in some type of coaching through MRS or other online tools.
Our results suggest that the technology features of MRS can be used to both differentiate PD and personalize professional practice, offering a promising means for transforming PD from a standard design for an “average” teacher to responding to the varied learning needs of teachers. In addition, future studies should examine assessment of teacher current practices and offering personalized practice decisions through MRS and PD generally.
Mikeska et al. (2021) called for future studies to link teacher learning more rigorously through MRS and actual student learning. Future studies might leverage the standardized environment of a virtual classroom to measure how MRS can be used to examine teaching practices in relation to equitable learning opportunities for avatar students within the virtual classroom. In addition, future studies should examine not only the teaching practices but also the quality of the interactions among teachers and all students in the virtual classroom and then the transfer of quality interactions to real classrooms. Although PD providers have struggled to link PD to classroom practices and student learning (Yoon et al., 2007; Zeichner, 2010), future studies may use standardized measurement tools such as preservice teacher observation and classroom observation rubrics to facilitate exploring links of teacher learning to student learning.
Parallel to providing equitable opportunities for students with diverse learning needs is recognizing and responding to the diverse adult learning needs of educators. Beck and Kosnik (2017) illustrate this relationship as they articulate the urgent need to personalize and differentiate teacher learning as an approach to teaching that immediately makes teachers more effective and sets them on a path of observing and responding to individual student differences through their teaching. Ensuring that all students are learning every day requires well-equipped teachers who are engaged in learning themselves throughout their career. The controlled, easily accessible environment of MRS provides hope in the future of differentiated and personalized PD that furthers career-long teacher learning linked to equitable learning for all students.
The MRSs were implemented using Mursion software. The simulation specialists were actor–coaches trained in both acting and instructional coaching who also worked as arts integration specialists in P–12 schools. Under the direction of the principal investigator, the simulation specialists and research team collaborated to develop the MRS designed to help teachers provide high-quality feedback. The simulation specialist puppeted the avatar coach and students and changed the virtual location to adjust the number of students from an individual, to practice a feedback strategy, to a group of five students. Participants accessed the MRS using Zoom (Figure A1). The simulation specialist used a script with standard lines and improvisation grids using At—If—Then—So statements. For example, at the first question, if the teacher does not state the purpose, then the avatar Dev asks, “Why am I in this group?” So, the teacher has a chance to establish a teaching goal. Each avatar has an established set of variances aligned with the assigned task (Table A1).
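To make the structure of an At—If—Then—So statement concrete, the following Python sketch encodes the Dev example above as a small rule object. This is only an illustration of the four-part logic: the class name, field names, the `stated_purpose` flag, and the lookup function are hypothetical and do not describe the Mursion software or the simulation specialists' actual script format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ImprovRule:
    """One At-If-Then-So statement from an improvisation grid (hypothetical encoding)."""
    at: str                                        # moment in the script
    condition: Callable[[Dict[str, bool]], bool]   # "If": test on observed teacher behavior
    then: str                                      # "Then": line the avatar delivers
    so: str                                        # "So": opportunity created for the teacher


def select_rule(rules: List[ImprovRule], moment: str,
                observed: Dict[str, bool]) -> Optional[ImprovRule]:
    """Return the first rule whose moment matches and whose condition holds."""
    for rule in rules:
        if rule.at == moment and rule.condition(observed):
            return rule
    return None


# The example given in the text, with an assumed "stated_purpose" observation flag.
grid = [
    ImprovRule(
        at="first question",
        condition=lambda obs: not obs.get("stated_purpose", False),
        then='Dev asks, "Why am I in this group?"',
        so="The teacher has a chance to establish a teaching goal.",
    )
]

rule = select_rule(grid, "first question", {"stated_purpose": False})
if rule is not None:
    print(rule.then)
```

Keeping the condition separate from the avatar line mirrors the idea that improvisation is triggered by what the teacher does or omits, while the delivered line itself stays standardized.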
Examples of How Avatar Students Vary Regarding the MRS Assigned Task
Significant interest in space, reads space magazines and watches videos on YouTube. Confident in his prior knowledge. Large amount of factual and accurate prior knowledge about space. Prior knowledge obscures his ability to focus on text.
“I already know all about space …”
Interest in space insofar as it relates to her interest in saving the Earth. Prior knowledge about Earth, basic schema for understanding space (i.e., names of planets, qualities of certain planets, understanding of orbits).
“Orbiter means it goes around the moon, right?”
Note. MRS = mixed-reality simulations.
Feedback is one of the most powerful tools that teachers use to promote student learning. However, not all students receive high-quality feedback that extends critical thinking skills. Often, in the rush to speak to every student or finish on time, teachers overlook the perspectives and strengths students bring to tasks.
Teachers engaged in professional development designed to learn how to respond to student understanding using the Ladder of Feedback (Perkins, 2003; https://wakelet.com/wake/MN1X1mnhSzbSxKxbFQHaj). Unique to this feedback strategy was eliciting the student’s perspective prior to offering feedback and then valuing from the student’s perspective strengths in the work. Teaching expertise was offered to the student by teachers first identifying the concern or problem in the work and then offering suggestions. This approach helped students identify areas for improvement in their work and then apply suggestions to make revisions. Finally, the ladder of feedback drew teacher attention to providing academic press by prompting metacognition, transfer, and generalization. Cultural awareness can be promoted by teachers seeking and listening for student explanations of strategies and frames of reference as one aspect of teaching practices that further equitable learning opportunities. While not complete or sufficient, the high-frequency teaching practice focus of this simulation is aligned with culturally relevant education approaches and cognitive science (Perkins, 2003).
After learning the teaching strategy, teachers prepared for the simulation through this module (https://wakelet.com/wake/kRQuK8fbrJYrQRjOukZVk).
Teachers accessed the simulation individually using Zoom (see video link, https://youtu.be/ROXOF92DHUk). Three scenes were used: an empty classroom with a coach, an individual student sitting at a table (coaching only), and a classroom with five students sitting at a table (Figure B1).
During Sim1, three standardized student avatar lines challenged teachers: (1) Ethan raises his hand at 0:15 and says, “Excuse me, I shouldn’t be in this group. I already know all about space.” (2) Savannah/Jasmine raises her hand at 2:30 and says, “I (still) don’t understand why my answer is wrong.” (3) Harrison takes the focus at the 5:00 mark and says, “I went to NASA, so I know that scientists use a lot of equipment like oxygen tanks, and masks, and suits.”
Three simulation specialists provided the MRS and were balanced among the randomization blocks to reduce the differences among simulation specialists. We used several strategies to reduce variance among simulation specialists, including monthly meetings, watching recordings of each other, and discussing implementation challenges.
All participants completed the following Qualtrics survey as part of professional development to learn how to provide students with high-information feedback. For teacher participants, this activity served as preparation for the mixed-reality simulations Sim1 and Sim2 and served as an assessment of teacher current feedback practices for researchers.
Based on researchers’ evaluation of the current practices (Appendix C), all participants were assigned one type of high-information feedback to focus on providing to students during Sim1. Participants who were randomly selected for the control group were shown one self-reflection slide (below) aligned with their assigned feedback focus.
The avatar host said: You can use this slide (share screen) and your preparation activity to prepare for the simulation. I am going to leave this slide up and you will have 5 min to plan before we go into the classroom. If you are ready before 5 min, simply say “start simulation” and begin teaching.
Data Collection Took Place Over a 9-Month Period
Teaching practices survey
Teacher efficacy score
Baseline Simulation 0
Feedback frequency, type, duration
Assigned differentiated PD feedback focus—clarify, value, correct, think
coaching versus self-reflection
Feedback frequency, type, duration
Feedback frequency, type, duration
Post simulation survey
Likert scale, open text responses
Note. MRS = mixed-reality simulations; PD = professional development.
To ensure accuracy of the qualitative coding, a detailed codebook was developed using grounded theory and its process of constant comparison to understand the intention of each feedback utterance offered to avatar students (Strauss & Corbin, 1997). When initial codes from the baseline simulations were organized into categories, researchers recognized relationships among the emerging categories to the literature on teacher feedback, specifically, Perkins’ (2003) ladder of feedback. Perkins’ taxonomy was adopted as the framework for the coding scheme, and each type of feedback was assigned a quantitative value (100–500).
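One plausible reading of the quality-weighted feedback score can be sketched in Python. Only two weights are stated in the article: low-information feedback is weighted zero, and push-thinking utterances are multiplied by five. The intermediate weights for clarify, value, and correct, the category keys, and the choice to divide by the total number of utterances are illustrative assumptions, not the authors' documented procedure.

```python
from typing import Dict

# Weights per feedback type. Only "low_information" (0) and "think" (5) come from
# the article; the other three values are placeholder assumptions.
FEEDBACK_WEIGHTS: Dict[str, int] = {
    "low_information": 0,
    "clarify": 2,   # assumed
    "value": 3,     # assumed
    "correct": 4,   # assumed
    "think": 5,
}


def weighted_feedback_score(counts: Dict[str, int]) -> float:
    """Sum of frequency x weight, divided by the total number of utterances.

    The denominator is an assumption; the article describes the outcome only as
    the mean of feedback frequency weighted by quality.
    """
    unknown = set(counts) - set(FEEDBACK_WEIGHTS)
    if unknown:
        raise ValueError(f"Unknown feedback types: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    weighted_sum = sum(FEEDBACK_WEIGHTS[kind] * n for kind, n in counts.items())
    return weighted_sum / total


# Hypothetical utterance counts for one teacher in one simulation.
example_counts = {"low_information": 6, "clarify": 3, "value": 2, "correct": 1, "think": 1}
print(weighted_feedback_score(example_counts))
```

Under this reading, a teacher who relies mostly on low-information feedback scores near zero even with many utterances, whereas a few push-thinking utterances raise the score substantially.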
As the study progressed, researchers were trained to code simulation transcripts through an online course and completed training transcripts before attempting a coding certification quiz. To become certified, coders needed a score of at least 80% on the exact match metric with codes of the finest grain size or at least 80% accuracy on adjacent matches within each broader category. Coders participated in biweekly norming meetings and double-coded 20% of the total transcript corpus, with all measures monitored for exact match to ensure consistency of coding. The frequency of teacher feedback utterances defined in the codebook (Table F1) was then used throughout the quantitative analysis.
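The certification rule can be illustrated with a short Python sketch. It assumes that fine-grained codes are integers whose hundreds band (100 to 500) identifies the broader feedback category, and it interprets an "adjacent match" as a coder's code falling in the same broader category as the answer key. Both are assumptions made for illustration; the example codes and function names are hypothetical and are not taken from the study's codebook.

```python
from collections import defaultdict
from typing import Dict, List


def broad_category(code: int) -> int:
    """Map a fine-grained code (e.g., 320) to its assumed broader category (300)."""
    return code // 100 * 100


def exact_match_rate(coder: List[int], key: List[int]) -> float:
    """Share of utterances where the coder's fine-grained code equals the key."""
    if not key or len(coder) != len(key):
        raise ValueError("coder and key must be non-empty and of equal length")
    return sum(c == k for c, k in zip(coder, key)) / len(key)


def adjacent_match_rates(coder: List[int], key: List[int]) -> Dict[int, float]:
    """Per broader category in the key, share of codes placed in that same category."""
    if not key or len(coder) != len(key):
        raise ValueError("coder and key must be non-empty and of equal length")
    hits: Dict[int, int] = defaultdict(int)
    totals: Dict[int, int] = defaultdict(int)
    for c, k in zip(coder, key):
        totals[broad_category(k)] += 1
        if broad_category(c) == broad_category(k):
            hits[broad_category(k)] += 1
    return {cat: hits[cat] / totals[cat] for cat in totals}


def is_certified(coder: List[int], key: List[int], threshold: float = 0.80) -> bool:
    """Certified if exact match meets the threshold, or every category's adjacent rate does."""
    if exact_match_rate(coder, key) >= threshold:
        return True
    return all(rate >= threshold for rate in adjacent_match_rates(coder, key).values())


# Hypothetical quiz: the coder differs from the key on two fine-grained codes.
key_codes = [110, 210, 220, 310, 410, 510, 520, 120, 230, 320]
coder_codes = [110, 220, 220, 310, 410, 510, 510, 120, 230, 320]
print(is_certified(coder_codes, key_codes))
```

In this hypothetical quiz both disagreements stay within the correct broader category, so the coder meets either criterion.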
Example of Codebook (https://docs.google.com/document/d/19SrvS_HYwRoHsRl1hbG0ZTYz6_mLpaMdTkL0dbt-xAo/edit)
Example teacher quotes
There you go.
Explain and clarify
You’re right. Equipment and tools, the equipment is another word for a tool. So, when we have something that’s the same word for a tool, we would call it a synonym. So, they’re synonyms of each other.
Value student perspective
I love that you cited where you found it in the text, paragraph 2.
Teacher expertise—concern and suggestion
Savannah, you successfully identified a type of equipment from the passage, but your answer does not match one of the key words in the question which is “discover.” By looking for words in the text similar to discover, such as “investigate,” you could find a better answer that suits the whole question instead of just part of it.
Academic Press—learning beyond task
A good strategy to help you find the best answer for comprehension questions is to look for the subject and the actions in the text that have the same meaning as those in the question. Let’s try that by looking for synonyms for an action word—discover.