Volume 2, Issue 4. DOI: 10.1037/tmb0000046
Women face pervasive biases in science, technology, engineering, and math (STEM), and games may be one avenue through which biases can be reduced. We tested whether embodying a woman scientist in virtual reality (VR) leads to more positive attitudes toward women in STEM. We also examined the effect of revealing the scientist character’s gender earlier or later in the game based on previous work indicating that a later reveal may lead to greater identification with the character. Undergraduate men (N = 96) played a physicist in a VR game in which they were randomly assigned to a man or a woman avatar whose gender they saw earlier or later in the game. Compared to participants in the man scientist condition, participants in the woman scientist condition felt more positively about women and viewed the category of woman as more overlapping with the category of scientist; however, they viewed their own scientist character more negatively. Furthermore, in both avatar conditions, participants viewed the scientist character as less competent after playing. In addition, there were no effects of the early versus late reveal on attitudes toward women scientists or toward the scientist character. Finally, there were no effects of game conditions on implicit biases, perceptions of the climate for women in STEM, stereotype endorsement, or game enjoyment. Together, this study suggests that VR interventions may decrease some negative attitudes toward women in STEM but are not a panacea for the pervasive biases against women in STEM.
Keywords: women in STEM, gender biases, virtual reality, games
Acknowledgments: This research was supported by the National Science Foundation [DRL-1420036 and 1462063].
Disclosures: The authors report no conflict of interest.
Data Availability: The preregistration, data, materials, and supplemental information are available on OSF: https://osf.io/p572h/.
Open Science Disclosures:
The data are available at https://osf.io/yudwg/.
The experiment materials are available at https://osf.io/2dq8h/.
The preregistered design is available at https://osf.io/3xte2/.
Correspondence concerning this article should be addressed to Gili Freedman, Department of Psychology, St. Mary’s College of Maryland, 18952 E Fisher Rd, St. Mary’s City, MD 20686, United States email@example.com
The underrepresentation of women in science, technology, engineering, and math (STEM) fields is a pressing problem (National Science Foundation, 2019). At all levels of postsecondary education, men outnumber women in STEM fields such as physics and computer science, and the representation of women decreases from undergraduate to graduate degrees (National Science Foundation, 2019). Research has shown that this underrepresentation stems not from a lack of interest or ability (Hill et al., 2010) but from a larger, more pervasive cause: biases against women. Different forms of media have been used in interventions designed to reduce biases against women in STEM, including videos (Moss-Racusin et al., 2018; Pietri et al., 2017, 2019) and board games (Cundiff et al., 2014; Freedman, Seidman, et al., 2018; Shields et al., 2011; Zawadzki et al., 2012, 2014).
A method that has shown some promise for reducing gender biases is the use of games: Games provide a space in which individuals can more safely grapple with difficult social issues (Bessarabova et al., 2016; Flanagan & Kaufman, 2016; Kaufman & Flanagan, 2015). Virtual reality (VR) also has similar advantages and is potentially able to promote even higher levels of involvement, but the use of VR for improving attitudes toward women in STEM is relatively untested. The present study tests whether embodying a woman scientist in VR leads to a more positive attitude toward women scientists.
Both men and women can hold biases about women in STEM, and changing attitudes among both genders is important. However, previous research suggests that men and women may react differently to messages about gender bias, such that women have a greater recognition of bias (Freedman, Green, et al., 2018; Moss-Racusin et al., 2015), and men are more skeptical about gender bias research (Handley et al., 2015). Undergraduate men tend to nominate other men as the most knowledgeable members of the class (Grunspan et al., 2016). Furthermore, men may be more likely than women to perceive the categories of scientist and woman as less overlapping (Carli et al., 2016). Changing attitudes among men is important for a variety of reasons: Men may serve in decision-making roles such as hiring committees, and among students, the attitudes of men in science classes can contribute to creating a supportive or a hostile environment for women (Logel et al., 2009). However, reducing gender biases can be difficult.
One form of gender bias that men may hold (either consciously or unconsciously) is that women do not belong in science. For example, STEM fields like physics and computer science are seen as highly masculine (e.g., Cheryan et al., 2017; Kelly, 1985; Nosek et al., 2002, 2009). Research on prototype matching (e.g., McPherson et al., 2018) suggests that individuals may have a prototype of the typical scientist. For example, when asked to draw a scientist, participants often produce images of men in laboratories (Chambers, 1983). Exposure to women scientists may help change these prototypes, and VR may provide a particularly vivid and engaging form of exposure through embodiment.
Individuals may also hold stereotypes regarding women’s ability to succeed in STEM. The stereotype content model suggests that groups are broadly judged on the dimensions of warmth and competence (Cuddy et al., 2009; Fiske et al., 2007). One form of bias that women scientists may encounter is what Eagly and Mladinic (1994) termed the “women are wonderful” effect (p. 13). According to this theory, women are viewed positively because they are considered to have characteristics associated with warmth, such as being nice or nurturing. However, the downside of this “positive” stereotype is that women’s competence is diminished, and women are seen as unsuited for stereotypically masculine professions. For example, when individuals evaluate the personality traits of men, women, and scientists, they see men and scientists as sharing greater overlap in traits as compared to women and scientists (Carli et al., 2016).
Interventions that directly target these prototypes and stereotypes may evoke defensiveness or reactance. Our study draws on two theoretical perspectives that suggest that VR may help overcome these obstacles. First, the embedded design perspective (Flanagan & Kaufman, 2016; Kaufman & Flanagan, 2015) suggests that games are more effective at shifting beliefs when the manipulations are subtle. The current game draws on the principle of obfuscation, which means that the main purpose of the game (bias reduction) is disguised. Participants are focused on solving the puzzles rather than on the gender of their avatar. The game also uses the principle of distancing by creating a fictional context, which should increase engagement and reduce reactance.
Second, embodied VR experiences can be helpful in allowing people to take the perspective of others, which can then lead to greater empathy and helping (e.g., Ahn et al., 2013; Herrera et al., 2018). VR headsets provide an immersive visual and auditory environment; importantly, players can feel as though they are the character through seeing from the character’s visual perspective and controlling the avatar’s movements through hand tracking. In the present study, a mirror placed within the game highlights that embodiment. Embodying a woman scientist in the game might evoke feelings of connection or similarity with the character, which could lead to more positive feelings about women scientists. Indeed, the idea of becoming a character and leaving oneself behind has been explored across media. For example, research on narratives has shown that connecting with characters in stories can increase empathy toward stigmatized groups (e.g., Carpenter et al., 2019; Johnson et al., 2013; Mazzocco et al., 2010). Individuals who are able to take the perspective of a member of a stigmatized group become more sympathetic and less prejudiced toward that group.
Similarly, research on experience-taking has demonstrated that when individuals are immersed in a story, their attitudes toward others can shift. Experience-taking is similar to identification, and refers to “the imaginative process of spontaneously assuming the identity of a character in a narrative and simulating that character’s thoughts, emotions, behaviors, goals, and traits as if they were one’s own” (Kaufman & Libby, 2012, p. 1). For example, participants who read about gay or Black characters showed more favorable sexuality and race-related attitudes when the story led to more experience-taking (Kaufman & Libby, 2012).
In some previous studies, a key detail that allows for greater identification is letting the reader feel connected to the character before revealing that the character is a member of an outgroup (e.g., a different gender). When participants knew early in the story that the character was an outgroup member, there was less identification and more negative attitudes about the outgroup (Kaufman & Libby, 2012). Similarly, when adolescent boys played a game featuring women characters, they identified more with the characters and displayed fewer stereotypical gender role associations when the gender identity of the characters was revealed later, as compared to earlier, in the game (Kaufman et al., 2019).
In the case of women scientists, a late reveal of a gender identity might have the additional benefit of raising participants’ awareness of their own bias (e.g., Freedman, Green, et al., 2018). Specifically, if individuals playing a scientist character assume that the character is a man, finding out that they are in fact embodying a woman scientist may make them more conscious of their own stereotypes. We suggest that this recognition of bias may be particularly influential if the gender reveal is done in a way that does not evoke defensiveness. For example, if the gender reveal occurs incidentally and later during the gameplay, it may have a more positive effect.
Although, to our knowledge, previous research has not yet used VR to shift men’s attitudes about women in STEM, two studies have examined the effects of VR on women in STEM and have produced mixed results. In one study, women who inhabited a computer scientist’s office (vs. a humanities office) in VR and who were highly identified with STEM anticipated less stereotype threat and indicated more interest in STEM (Starr et al., 2019). However, in another VR study, embodying a man or woman avatar did not prevent the negative effects of engaging in an interaction with a sexist instructor on math performance (Chang et al., 2019).
In the present study, participants play a VR game as a successful scientist (randomly assigned as either a man or woman avatar), whose work has led to an important breakthrough in physics. Throughout the game, they solve puzzles to advance, allowing them to demonstrate competence within the virtual world. In puzzle games, player engagement is maximized by increasing puzzle difficulty as player ability increases (Linehan et al., 2014). The puzzles in the VR game were designed to increase in difficulty over the course of each level. This embodiment experience is designed to address both representation (showing a woman in a male-dominated field) and competence stereotypes. The embedded design principles of the game are intended to increase engagement and avoid reactance, and playing as a woman avatar should encourage perspective-taking.
Thus, we hypothesized that playing the game with a woman avatar might decrease both explicit stereotypes and negative implicit associations between women and science. Specifically, we hypothesized that in the Woman Avatar condition compared to the Man Avatar condition, participants would rate women as more competent at posttest (Hypothesis 1a) and have a stronger implicit association between women and science (Hypothesis 1b). Additionally, we compared perceptions of different social groups on their warmth and competence. We hypothesized that participants who played as a woman would show less of a discrepancy in their warmth/competence adjective ratings for women and for scientists at posttest, compared to those with a man avatar (Hypothesis 2). Although our game did not directly address the climate for women in STEM, as an exploratory measure, we also examined whether the VR experience would affect players’ perceptions of how welcoming STEM fields are for women.
In the present study, participants were randomly assigned to a man or a woman scientist avatar and they learned the gender of that avatar either early or late in the game. Based on the experience-taking literature, we hypothesized that the character would be perceived as more competent at posttest in the Late Reveal, Woman Avatar condition than in the Early Reveal, Woman Avatar condition (Hypothesis 3). Similarly, we hypothesized that participants would rate women as more competent at posttest compared to pretest, and this would be especially true in the Late Reveal compared to Early Reveal condition (Hypothesis 4). We also expected that participants in the Late Reveal, Woman Avatar condition would show less of a discrepancy in their adjective ratings for women and for scientists at posttest compared to participants in the Man Avatar conditions and the Early Reveal conditions, and that the same pattern would emerge for stereotypes and implicit attitudes (Hypothesis 5).
Additionally, as noted above, one potential benefit of games is that they may raise gender issues in a nonthreatening way, and thus may be less likely to evoke reactance (Kaufman & Flanagan, 2015). The effectiveness of a gender bias game may be related to participants’ emotional responses. Therefore, as exploratory measures, we also examined players’ enjoyment of the game and their postgame emotions. Although ideally participants would enjoy the game equally in both conditions, a late reveal of the avatar gender could create a backlash, where players might feel angry about being “fooled” (e.g., Freedman, Seidman, et al., 2018).
In sum, we propose that VR might be a useful tool to both raise awareness among men about their possible biases against women in STEM and to shift expectations about the prototypical scientist. We used a pretest/posttest design to examine whether playing the role of a woman physicist in a VR game can shift attitudes about women in STEM.
Prior to running the present study, we conducted a pilot study to refine the game and test for effects related to when the character’s gender was revealed. In the pilot study (N = 97 undergraduate men; data available on OSF), all participants played as a woman scientist but half saw themselves early and half saw themselves later in the game. Participants viewed the scientist character as less competent after playing but only viewed the scientist as warmer in the early reveal condition (Freedman et al., 2021). These results indicated that the early reveal may be reinforcing gender stereotypes, and that participants needed to feel more competent in the scientist role. Thus, the game was modified to increase feelings of competence by giving players less direction as the game progresses and emphasizing the main character’s expertise.
In the present study, participants learned either early or later in the game about their character’s gender and were randomly assigned to a man or a woman scientist avatar. We hypothesized that participants in the Woman Avatar condition would show an increase in positive attitudes about women and women in STEM compared to participants in the Man Avatar condition. We also hypothesized that the increase in positive attitudes would be particularly strong in the Late Reveal condition. The preregistration, data, and materials are available on OSF: https://osf.io/p572h/. Procedures were approved by the Institutional Review Board.
The game was developed by an experienced design team using the Unity 3D engine. A multistage, iterative process was used to develop game levels, the game script, and the voiceover (see Supplemental Material for a description of the game and level design process).
Participants were recruited via word of mouth, campus-wide emails, and posters. The study description stated that they would play a 30-min VR game. Participants were compensated with a $15 Amazon gift card. A power analysis in G*Power with a small-medium effect size (f = .2), for a 2 × 2 × 2 mixed ANOVA with 95% power, showed a target sample size of 112 participants. To account for participants failing the attention checks, the goal was to recruit 125 participants, and 126 were recruited. Of these, 19 were excluded for not remembering the avatar’s gender and 11 were excluded for not remembering when they saw the avatar. Thus, the final sample was 96 participants (95 cisgender men, 1 transgender man; M age = 19.79, SD age = 1.63; 7.3% African American or Black, 1.0% Arab/Middle Eastern, 31.3% Asian, Asian American, or Asian Canadian, 3.1% Hispanic/Latino, 47.9% White, 7.3% Multiracial, 2.1% other).
After providing consent, participants were given two documents (a physics Ph.D. diploma and press clipping describing Dr. Smith’s work; see Figure 1) about Dr. Alex Smith, the main character. These documents established Dr. Smith’s scientific credentials and status as a successful scientist. In both documents, the character is referred to by name only; personal pronouns are not used. Thus, at the start of the game, participants were not aware of Dr. Smith’s gender. Participants completed the pretest measures after they viewed these documents and then began playing.
Participants were tested individually in an alcove that was curtained on the open side, with an approximately 10 ft by 10 ft play area. Participants wore an off-the-shelf, wired, original model HTC VIVE headset, with room tracking capability from base stations mounted at the corners of the alcove (see Supplemental Material). Participants took on the role of Dr. Alex Smith, a scientist studying multiple dimensions and building a teleportation device. In this 20-min game, participants learn there are multiple dimensions and need to set up a transmitter by finding objects from the multiple dimensions. The players were given puzzles to solve with the help of a mysterious voice: “Jordan.” As the game continued, Jordan gave progressively less direction, allowing players to feel competent in solving the problems.
After putting on the VR headset, participants found themselves in Dr. Smith’s laboratory (see Figure 2) and followed instructions to pick up a beaker and place it on a device labeled “transmitter.” The beaker vanished as the transmitter activated, and an earpiece appeared in a device labeled “receiver.” Participants followed instructions coming from the earpiece to put it in their ear, at which point they were introduced to Jordan, a mysterious fan of Dr. Smith’s work.
Jordan referred to Dr. Smith’s expertise and work at several points (e.g., “Your theories on Entanglement were correct!”) and sent a pair of glasses to the participants through the receiver. The glasses made several translucent red objects appear, and Jordan explained that these objects were in another dimension, and that the participants needed to be able to see them to build a beacon. Participants followed Jordan’s instructions and solved several puzzles in which they obtained objects for the beacon. The final object they needed was a battery, found in a back room. Plugging the battery into the beacon completed the level, and a portal opened to another level where they solved more puzzles. A video of the game (in the early reveal woman avatar condition) can be viewed at https://youtu.be/kbeYzNASGLY.
Participants were randomly assigned to the early or late reveal condition and the man or woman avatar condition. In the early reveal condition, when the participants entered the back room in the first level, they were able to see their character in a mirror. In the late reveal condition, the mirror was in a room that the participants encountered at the end of the third level. For the avatar gender conditions, the game avatars were designed to be clearly distinguishable as either a man or a woman (e.g., through hairstyle, body shape), but without sexualizing the avatars (see Figure 3). We specifically selected avatars that would match each other and the low polygon count (e.g., cartoony) style of the VR game environments (see Supplemental Material).
Participants completed a set of self-report measures before and after playing the game (see Table 1 and Table 2): The Positive and Negative Affect Scale (PANAS; Watson et al., 1988) and adjective ratings of Dr. Smith, men, women, successful scientists, successful historians, and children (Carli et al., 2016). The goal of the adjectives measure is to assess the amount of overlap between perceptions of women and perceptions of scientists (the other entities are used as filler items). Participants rated Dr. Smith and these groups on warmth-related traits (good natured, warm, sincere, tolerant) and competence-related traits (competent, intelligent, independent, competitive, confident) on a 1 (not at all) to 7 (very) scale. A composite score for warmth was created with the four warmth adjectives (Dr. Smith pretest α = .64; Dr. Smith posttest α = .77); however, the competence-related adjectives showed low reliability (pretest α = .45). Thus, we created a composite score for competence using only the two most relevant adjectives (competent, intelligent; Spearman–Brown reliability: pretest = .72, posttest = .81).
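The two reliability indices reported above can be illustrated with a minimal sketch. It assumes item-level ratings are available as plain lists, one list per adjective; the function names and the example ratings are hypothetical and are not taken from the study materials or data. The sketch shows the Spearman–Brown coefficient for a two-item composite, 2r / (1 + r), alongside Cronbach's alpha for a k-item composite.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 ratings")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r):
    """Spearman-Brown reliability of a two-item composite with inter-item correlation r."""
    return 2 * r / (1 + r)


def sample_variance(values):
    """Unbiased sample variance (n - 1 in the denominator)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of k lists, each holding one rating per participant."""
    k = len(items)
    n = len(items[0])
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_variance_sum = sum(sample_variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / sample_variance(totals))


if __name__ == "__main__":
    # Hypothetical 1-7 ratings from six participants (illustrative only).
    competent = [7, 6, 5, 6, 7, 4]
    intelligent = [7, 5, 5, 6, 6, 4]
    r = pearson_r(competent, intelligent)
    print("two-item Spearman-Brown:", round(spearman_brown(r), 2))
    print("two-item Cronbach's alpha:", round(cronbach_alpha([competent, intelligent]), 2))
```

For two items with equal variances the two indices coincide; otherwise alpha is the lower of the two, which is one reason the Spearman–Brown coefficient is often preferred for two-item scales.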
Table 1. Intercorrelations, Means, and Standard Deviations for Posttest Measures
1. PANAS positive
2. PANAS negative
3. Game enjoyment
4. Perceived positive climate
5. Stereotype endorsement
6. IAT d score
Note. PANAS = positive and negative affect scale; IAT = implicit association test. *p < .05. **p < .005.
Participants also completed a set of measures at posttest only, to reduce suspicion about the purpose of the game. For specific emotional responses to the game, participants indicated how playing the game made them feel on a 1 (not at all) to 7 (very) scale for the following emotions: Angry, depressed, frustrated, relieved, happy, amused, empowered, entertained. Game enjoyment was assessed through four items (e.g., “I enjoyed the game”) rated on a 1 (strongly disagree) to 7 (strongly agree) scale (α = .69).
Table 2. Means and Standard Deviations of Perceived Competence and Warmth for Dr. Smith, an Average Woman, and an Average Scientist at Pretest and Posttest for Participants in the Man and Woman Avatar Conditions
Dr. Smith competence
Average woman competence
Successful scientist competence
Dr. Smith warmth
Average woman warmth
Successful scientist warmth
Next, participants completed the Gender-Science IAT (Greenwald et al., 1998) in Qualtrics using iatgen (Carpenter et al., 2018). In this IAT, participants are presented with pairings of science with male or female and liberal arts with male or female. A d score is calculated based on response latencies for selecting the correct pairing. A positive d value for this IAT indicates that participants were quicker to associate female with science.
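The logic of a latency-based d score can be sketched as follows. This is a deliberately simplified illustration, not the scoring procedure applied by iatgen, which typically includes additional steps (for example, trial exclusions, error handling, and combining practice and test blocks) that are omitted here. The function name, parameter names, and latencies are hypothetical. The sign convention follows the description above: positive values indicate faster responses when female is paired with science.

```python
from statistics import mean, stdev


def simplified_iat_d(female_science_rts, male_science_rts):
    """Simplified d score from response latencies in milliseconds.

    female_science_rts: latencies from the block pairing female with science.
    male_science_rts:   latencies from the block pairing male with science.
    Returns the difference in mean latency divided by the standard deviation
    of all trials from both blocks combined (the "inclusive" SD).
    """
    all_trials = list(female_science_rts) + list(male_science_rts)
    inclusive_sd = stdev(all_trials)
    # Slower responses in the male + science block than in the female + science
    # block yield a positive score (quicker to associate female with science).
    return (mean(male_science_rts) - mean(female_science_rts)) / inclusive_sd


if __name__ == "__main__":
    # Hypothetical latencies for one participant (illustrative only).
    female_science = [600, 700, 800]
    male_science = [800, 900, 1000]
    print(round(simplified_iat_d(female_science, male_science), 2))
```

Dividing by the participant's own latency variability puts scores on a comparable scale across participants who differ in overall response speed.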
To measure explicit attitudes toward women in STEM, participants completed two questionnaires. The first questionnaire was a set of 8 items about their perceptions of the climate in STEM for Dr. Smith (e.g., how likely it would be for Dr. Smith to be respected by authority figures; α = .88) rated on a 1 (very unlikely) to 7 (very likely) scale. The second measure was a modified version of the stereotype endorsement scale (Schmader et al., 2004) rated on a 1 (strongly disagree) to 7 (strongly agree) scale (α = .84) with items such as, “It is possible that men have more physics ability than do women.”
Finally, participants provided demographic information and answered two attention check questions (gender of Dr. Smith; when they saw Dr. Smith in the mirror).
To test the effect of the Reveal and Avatar conditions on perceptions of the average woman’s competence, a 2 (Reveal) × 2 (Avatar Gender) × 2 (Time: Pretest, Posttest) mixed ANOVA was conducted. There was a significant interaction of Time and Avatar Gender, F(1, 89) = 9.78, p = .002, ηp² = .10. As predicted (Hypothesis 1a), in the Woman Avatar condition, perceptions of the average woman’s competence increased from pretest to posttest, mean difference = 0.28, SE = .10, p = .006, 95% CI [0.09, 0.48], but there was no change in the Man Avatar condition, mean difference = −0.16, SE = .10, p = .118, 95% CI [−0.35, 0.04] (see Figure 4). Contrary to predictions (Hypothesis 4), there were no interactions with Reveal (all p > .70).
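For readers who want to check the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative only; the function name is hypothetical, and the example uses the Time × Avatar Gender interaction reported above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Time x Avatar Gender interaction on the average woman's competence:
    # F(1, 89) = 9.78
    effect_size = partial_eta_squared(9.78, 1, 89)
    print(f"{effect_size:.2f}")  # prints 0.10
```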
To test how much overlap participants perceived between the competence of the average woman and that of a successful scientist, we ran a 2 (Reveal: Early, Late) × 2 (Avatar Gender: Man, Woman) × 2 (Time: Pretest, Posttest) × 2 (Role: Woman, Scientist) mixed ANOVA on perceived competence. This analysis found a significant three-way interaction of Time, Role, and Avatar Gender, F(1, 82) = 4.99, p = .028, ηp² = .06 (see Supplemental Material for lower order interactions and main effects). When the interaction was broken down by Avatar Gender, there was no interaction of Time and Role for participants in the Man Avatar condition, F(1, 42) = .04, p = .853, ηp² = .001, but the interaction was significant for participants in the Woman Avatar condition, F(1, 42) = 9.93, p = .003, ηp² = .19. As predicted (Hypothesis 2), participants in the Woman Avatar condition saw more overlap between the average woman and a successful scientist at posttest compared to pretest. Specifically, participants in the Woman Avatar condition rated the average woman as more competent at posttest (M = 5.43, SD = .94) than at pretest, M = 5.15, SD = 1.01; mean difference = .28, SE = .12, p = .024, 95% CI [.04, .52], but unexpectedly, rated successful scientists as less competent at posttest (M = 6.49, SD = .60) than at pretest, M = 6.66, SD = .47; mean difference = −.17, SE = .05, p = .002, 95% CI [−.28, −.07]. Contrary to predictions (Hypotheses 3 to 5), Reveal did not affect perceptions (all p > .15).
To test how much overlap participants perceived between the warmth of the average woman and that of a successful scientist, we ran a 2 (Reveal: Early, Late) × 2 (Avatar Gender: Man, Woman) × 2 (Time: Pretest, Posttest) × 2 (Role: Woman, Scientist) mixed ANOVA on perceived warmth. Our hypothesis that participants who played as a woman would show less of a discrepancy in their warmth adjective ratings for women and for scientists at posttest, compared to those with a man avatar, was not supported: There was no interaction of Avatar Gender, Role, and Time, F(1, 82) = 0.24, p = .628, ηp² = .003. Also contrary to predictions (Hypothesis 5), Reveal did not affect perceptions of warmth (all p > .40).
A 2 (Reveal: Early, Late) × 2 (Avatar Gender) × 2 (Time: Pretest, Posttest) mixed ANOVA on the perceived competence of Dr. Smith did not find the predicted interaction (Hypothesis 3) between Reveal, Avatar Gender, and Time, F(1, 92) = 0.20, p = .659, ηp² = .002. However, there was an interaction between Reveal and Time, F(1, 92) = 8.52, p = .004, ηp² = .09, such that participants in the Early Reveal condition showed a decrease in their perceptions of Dr. Smith’s competence from pretest (M = 6.58, SD = .76) to posttest (M = 5.87, SD = 1.05), mean difference = −0.71, SE = 0.16, p < .001, 95% CI [−1.04, −0.39], but participants in the Late Reveal condition did not show a decrease from pretest (M = 6.10, SD = .77) to posttest (M = 6.08, SD = 1.05), mean difference = −0.01, SE = 0.18, p = .935, 95% CI [−0.36, 0.34]. This interaction qualified a main effect of Time, F(1, 92) = 9.24, p = .003, ηp² = .09, such that participants perceived Dr. Smith to be more competent at the pretest (M = 6.36, SD = 0.80) than at the posttest (M = 5.97, SD = 1.05). There were no other significant main effects or interactions (all p > .05).
A 2 (Reveal) × 2 (Avatar Gender) × 2 (Time: Pretest, Posttest) mixed ANOVA on the perceived warmth of Dr. Smith did not find an interaction between Reveal, Avatar Gender, and Time, F(1, 92) = .001, p = .972, ηp² < .001, and there were no other significant main effects or interactions (all p > .08).
A set of 2 (Reveal) × 2 (Avatar Gender) between-subjects ANOVAs on IAT d scores, climate for women in STEM, stereotype endorsement, and game enjoyment found no significant main effects or interactions, contrary to Hypothesis 1b and Hypothesis 5 (all p > .10; see Table 3; see OSF page for full results).
Table 3. Means and Standard Deviations of the Posttest Only Dependent Variables in the Man and Woman Avatar Conditions
IAT d score
Climate for women in STEM
To examine whether there were gender biases in how individuals viewed Dr. Smith before and after they discovered Dr. Smith’s gender, we conducted exploratory paired samples t-tests for the Man Avatar and Woman Avatar conditions. In the Man Avatar condition, there was no difference in ratings of Dr. Smith’s competence from pretest (M = 6.27, SD = 0.93) to posttest, M = 6.10, SD = 1.00, t(47) = −0.86, p = .394, d = 0.17. In the Woman Avatar condition, ratings of Dr. Smith’s competence decreased from pretest (M = 6.45, SD = 0.64) to posttest, M = 5.83, SD = 1.09, t(47) = −4.02, p < .001, d = 0.66.
A 2 (Reveal) × 2 (Avatar Gender) × 2 (Time) mixed ANOVA on the positive affect PANAS scores found a main effect of Time, F(1, 90) = 80.45, p < .001, ηp² = .47, such that participants felt more positive emotions after playing the game (M = 35.51, SD = 7.18) compared to before the game (M = 30.54, SD = 7.40). There were no other significant effects or interactions (all p > .07). For negative affect, a significant Reveal × Time interaction emerged, F(1, 92) = 4.02, p = .048, ηp² = .04; participants in the Early Reveal condition showed a significant decrease in negative affect from pretest (M = 13.52, SD = 3.71) to posttest, M = 12.33, SD = 2.57; mean difference = −1.20, SE = 0.48, p = .015, 95% CI [−2.16, −0.24], but participants in the Late Reveal condition did not differ from pretest (M = 12.98, SD = 2.91) to posttest, M = 13.18, SD = 4.78; mean difference = 0.23, SE = 0.53, p = .662, 95% CI [−0.81, 1.28].
Although VR holds significant promise for changing attitudes, the current results suggest some limitations when addressing social biases: Although playing a woman physicist improved men’s perceptions of women’s competence generally, players still judged the woman scientist character as less competent. Participants who saw themselves as a woman physicist showed less bias against women by thinking the average woman was more competent and that there was less of a discrepancy in competence between women and scientists. The game also improved participants’ positive affect across conditions. However, playing the game as a woman physicist also made participants think the physicist character was less competent, and there were no effects of the game on the IAT, stereotype endorsement, or beliefs about the climate for women in STEM. Furthermore, the timing of the gender reveal did not affect gender attitudes. Taken together, the present study provides evidence that playing as a woman scientist in VR may help shift some broad attitudes about women and scientists but may be less successful at changing implicit attitudes or specific attitudes about embodied scientist characters.
These results provided some promising suggestions that gameplay may have helped shift players’ prototypes of women and scientists, as well as improving perceptions of the competence of women generally. However, the results also suggested that our combination of perspective-taking and embedded design gameplay may have had an unintended effect. Specifically, one possible explanation for the game reducing perceptions of character competence is that the game character was faced with puzzles to solve during gameplay. Therefore, the character’s level of competence was in part under the control of the player; even though all players succeeded in solving the game, if a player found the challenges difficult, they might have blamed the avatar/character. These results provide important insights into the complexities of merging gameplay and VR perspective-taking. Furthermore, embodying a particular type of avatar may not always be the best way to create change. Perhaps an alternative approach would be to have the woman scientist character serve as an in-game guide to the player, thus more strongly demonstrating competence.
Additionally, previous research suggested that a late reveal of a stigmatized identity was beneficial for experience-taking in narratives; readers were more accepting of characters if they already had gotten to know them through the story (Kaufman & Libby, 2012). Here, the effects were different: Participants in the Early Reveal condition showed less negative affect after playing, whereas participants in the Late Reveal condition did not. Although the magnitude of these effects was small, future research should be careful about potential backfire effects. A violation of expectations may feel more threatening when it relates to one’s virtual “self” compared to a story character.
In contrast to Kaufman and Libby’s (2012) findings, some studies examining the disclosure of gay/lesbian identity in real interactions showed a benefit of disclosing a stigmatized identity early rather than late in an interaction (Dane et al., 2015; MacInnis & Hodson, 2015). Although the context of those studies (a closeness-inducing interaction with a fellow undergraduate) was quite different from that of the present study, these effects might translate to a VR context. Further work is needed to clarify the optimal timing of identity disclosures across situations.
In the present study, gameplay did not affect other variables such as perceptions of the climate for women in STEM. However, the game focused on only a single character (the player) being guided by a mysterious voice. Thus, the player/character did not actually encounter discrimination, differential treatment, or other effects specifically related to gender bias. These elements may need to be more directly incorporated into a virtual world in order to shift attitudes on these topics.
As women are still underrepresented in many STEM fields, it is important to examine how to shift men’s views about women in STEM. Furthermore, although both men and women hold gender biases against women in STEM (e.g., Moss-Racusin et al., 2012), some research suggests that men may be less likely to understand the influence that bias experiences can have on women scientists (Freedman, Green, et al., 2018), more likely to view gender bias intervention research negatively (Handley et al., 2015), and less likely to see women and scientists as having overlapping traits (Carli et al., 2016). Thus, the present research focused on men’s biases. However, it will be important for future research to examine how VR interventions may influence women’s beliefs about women in STEM and about STEM fields more generally. For example, it is possible that enabling women to inhabit the role of a woman scientist in VR may make women feel more positively about entering a STEM field.
The study was also somewhat underpowered due to the exclusion criteria. Although we intentionally recruited more individuals than the power analysis indicated, more participants had to be excluded than initially anticipated. Due to time and COVID constraints, and the limits of the population at our college, we were unable to recruit more participants. Thus, in future research, it will be important to ensure higher-powered samples. Furthermore, we did not collect tracking data in this study, as we did not have a priori predictions about the ways in which participants would interact with the world or the amount of time they would spend on specific tasks. However, one potential variable of interest in future research would be how long participants spent in front of the mirror. Participants who interacted more with the mirror may have felt the effects of the manipulation more strongly.
Additionally, the present study focused on the gender match or mismatch of the avatars with participants and did not vary the avatar’s race: The avatars in the game were both White. Given the underrepresentation and pervasive intersectional biases against women of color and particularly Black women in STEM (e.g., Charleston et al., 2014; Ireland et al., 2018), it will be crucial for future research to consider how VR affects biases against women scientists from different ethnic and racial backgrounds (Vignola et al., 2019). The mixed findings from the present study along with the mixed findings from research on racial biases in VR (Banakou et al., 2016; Behm-Morawitz et al., 2016; Groom et al., 2009; Hasler et al., 2017; Peck et al., 2013) indicate that more work is needed to identify the conditions under which VR is a useful medium for gender and race-related bias interventions.
Finally, future research on using VR to shift attitudes would benefit from data on the process by which attitudes can change in VR. For example, including measures of transportation, presence, and perspective-taking for individuals in the VR environment could help determine which aspects of VR immersion are most beneficial for shifting attitudes. Future research could also assess individual differences; for example, VR may be especially useful for encouraging positive attitudes among individuals who are dispositionally lower in concern for others (Ahn et al., 2013).
The present research indicates that VR interventions for decreasing men’s biases against women in STEM hold both promise and potential pitfalls. Specifically, playing the role of a woman scientist within an immersive environment may help shift attitudes about the degree of overlap in competence-related traits between the categories of woman and scientist. Yet, playing as a woman scientist also made participants think more negatively of their scientist character. Thus, it will be important to consider how VR immersion may both positively and negatively shift gender biases.
Although we assigned participants to one of two gender conditions, we recognize that gender is not a binary construct. Future research should consider approaching gender from a more nuanced and inclusive perspective.