Volume 2, Issue 1. DOI: 10.1037/tmb0000034
Virtual Reality (VR) has been touted as an effective empathy intervention, with its most ardent supporters claiming it is “the ultimate empathy machine.” We aimed to determine whether VR deserves this reputation, using a random-effects meta-analysis of all known studies that examined the effect of virtual reality experiences on users’ empathy (k = 43 studies, with 5,644 participants). The results indicated that many different kinds of VR experiences can increase empathy; however, there are important boundary conditions to this effect. Subgroup analyses revealed that VR improved emotional empathy, but not cognitive empathy. In other words, VR can arouse compassionate feelings but does not appear to encourage users to imagine other people’s perspectives. Further subgroup analyses revealed that VR was no more effective at increasing empathy than less technologically advanced empathy interventions such as reading about others and imagining their experiences. Finally, more immersive and interactive VR experiences were no more effective at arousing empathy than less expensive VR experiences such as cardboard headsets. Our results converge with existing research suggesting that different mechanisms underlie cognitive versus emotional empathy. It appears that emotional empathy can be aroused automatically when witnessing evocative stimuli in VR, but cognitive empathy may require more effortful engagement, such as using one’s own imagination to construct others’ experiences. Our results have important practical implications for nonprofits, policymakers, and practitioners who are considering using VR for prosocial purposes. In addition, we recommend that VR designers develop experiences that challenge people to engage in empathic effort.
Keywords: empathy, virtual reality, meta-analysis, perspective-taking
Acknowledgments: Alison Jane Martingano is now at the National Human Genome Research Institute, NIH. Sara Konrath was funded by a grant from the Corporation for National and Community Service while writing this manuscript (17REHIN002). Alison Jane Martingano’s dissertation research, of which this study formed a part, was funded by the Zolberg Foundation for Migration and Mobility.
Conflict of Interest: The authors have no known conflicts to disclose.
Open Science Disclosures: The data are available at https://osf.io/ezpxu/?view_only=d85bf39a933448c5addf8fd2c631cfa6. The preregistered design and analysis plan is accessible at https://aspredicted.org/yw59d.pdf.
Interactive Content: Becoming Homeless: an immersive virtual reality experience from Stanford University’s Virtual Human Interaction Lab; The Displaced: a 360° video documentary by The New York Times exploring the global refugee crisis through the stories of three children.
Correspondence concerning this article should be addressed to Alison Jane Martingano, National Human Genome Research Institute, NIH, 31 Center Drive, Bethesda, MD 20892, United States. Email: firstname.lastname@example.org
Virtual reality (VR) has been touted as a way to promote empathy, by helping people virtually experience what it is like to be in someone else’s situation (Milk, 2015). In 2016, VR giant Oculus released their “VR for Good” initiative to incentivize designers to create prosocial content (Matney, 2016). Not to be outdone by their leading competitor, HTC VIVE announced their $10 million “VR for Impact” program in 2017 (HTC VIVE, 2017). In a viral TED talk on the topic, VR developer Chris Milk hailed VR as “the ultimate empathy machine” because it promises to help people understand and feel for others in situations that they might find hard to imagine (Milk, 2015). Although Milk likely spoke in hyperbole, many national and international charities have collaborated with technology companies to use VR in their fundraising campaigns (Amnesty International, 2017; International Rescue Committee, 2016; Médecins Sans Frontières, 2016; UNICEF USA, 2015).
Much of the enthusiasm for, and financial investment in, VR came before empirical support for its effectiveness at increasing prosocial traits and behavior was available. However, there is theoretical and experimental precedent for similar empathy interventions. Scholars describe empathy as a muscle, and as such it should be capable of growth and even regeneration (Konrath et al., 2011). Following this logic, a variety of empathy training programs have been designed to explicitly teach empathy. In fields such as medicine, where such programs are used regularly, they generally have positive effects (g = 0.63; Teding van Berkhout & Malouff, 2016, for meta-analysis). Moreover, less explicit interventions have been shown to also lead to modest improvements in empathy, including engaging with a variety of art (Kou et al., 2020), such as reading fiction (Dodell-Feder & Tamir, 2018; Mumper & Gerrig, 2017) or practicing drama (Goldstein & Winner, 2012). However, empathic improvements following these interventions are not always universal, often being constrained to a particular type of empathy (e.g., Goldstein & Winner, 2012). As such, VR may also find its effectiveness constrained to only one aspect of empathy.
Indeed, there are reasons to be skeptical of VR as an “empathy machine.” Several pieces of recent research find that VR fails to promote empathy in controlled experimental settings (e.g., Gehlbach et al., 2015; Barreda-Ángeles et al., 2020; Jones & Sommer, 2018). Nevertheless, findings are mixed, with some empirical research supporting the connection (e.g., Herrera et al., 2018; Ingram et al., 2019; Kalyanaraman et al., 2010; Kleinsmith et al., 2015). Given these mixed results, we conducted a meta-analysis to determine the effect of different types of VR on different types of empathy.
The effectiveness of VR may depend on the type of empathy being measured. Empathy is a multidimensional construct which includes both the ability to understand what other people are feeling (cognitive empathy) and feelings of care and concern in response (emotional empathy; Davis, 1983). These two types of empathy have been proposed since at least the 18th century, when Adam Smith (1759) differentiated between one’s emotional reactions to others and the ability to recognize emotional states free of emotional arousal. Much more recently, cognitive and emotional empathy have been proposed as a dual process system (Yu & Chou, 2018). In psychology, dual process theory describes how mental states, such as empathy, can arise as a result of both an automatic, unconscious process and an explicit, conscious process.
There is growing evidence that emotional empathy is fast, automatic, and occurs spontaneously (Neumann & Strack, 2000). Even in infants, simply witnessing the suffering of another person triggers this automatic emotional response (Sagi & Hoffman, 1976). On the other hand, cognitive empathy is a more deliberate skill first learned around 3–5 years old, when children realize that other people think and feel in ways that differ from themselves (Theory of Mind; Flavell, 1999). Cognitive empathy develops into a more advanced mentalizing capacity with age (Gweon & Saxe, 2013) that requires attention and effort to decipher the thoughts and feelings of another person (Roßnagel, 2000). If people are distracted by a concurrent processing task, they are less able to imagine another person’s perspective (Davis et al., 1996). For an excellent review on how executive functioning is related to various mentalizing tasks, see Launay et al. (2015). Altogether these results support a dual process model of empathy, indicating that cognitive empathy is aroused by more conscious and effortful mental processes, whereas emotional empathy is automatic and requires fewer mental resources.
In line with a dual process model, improvements in cognitive empathy appear to occur after people consciously engage in effortful mentalizing. For example, reading fiction, which requires deciphering characters’ intentions and motives, leads to improvements in cognitive empathy (Dodell-Feder & Tamir, 2018; Mumper & Gerrig, 2017). In addition, acting, which presents a challenge to the actor to simulate the mind of their characters, leads to improvements in cognitive, but not emotional, empathy (Goldstein et al., 2009). A dual process model of empathy would question the efficacy of VR for increasing cognitive empathy because VR is unlikely to promote effortful mentalizing. If VR experiences present the thoughts and feelings of others explicitly, there may be no need for users to engage in such mental effort, and VR is subsequently less likely to produce increases in cognitive empathy. In this way, VR is acting as a type of “Hot Media” that users need not actively engage with because its message is given without their participation (McLuhan, 1964).
Although VR may be limited at increasing cognitive empathy, VR may be adept at arousing emotional empathy. Emotional empathy does not require people to engage in mentalizing effort but can be triggered automatically by the types of vivid emotional scenes typically found in VR experiences.
The multidimensional nature of empathy, and the relative effort required to arouse each type, may explain why research investigating the effectiveness of VR for increasing empathy has found inconsistent results. The success or failure of VR to elicit empathy may depend upon the type of empathy the researchers were measuring, as well as the type of VR being used.
The effectiveness of VR may depend on the type of VR hardware and software being used, which create different degrees of immersion and interactivity.
The majority of research and theory on VR and empathy has considered the impact of VR environments administered through head mounted display units (HMDs). These delivery devices block out noise and visual input from the real world and replace it with perceptual input from a virtual environment. As users turn their heads, the system is responsive and the sensory input changes accordingly. These features afford users an immersive virtual experience. Other delivery devices, such as desktop VR, which uses a normal computer screen, can be less immersive.
The extent to which any particular VR experience is immersive or interactive also varies based upon the specifications of the specific software. For example, 1000 Cut Journey allows viewers to become Michael Sterling, a Black man, and encounter racism as they try to complete everyday activities (Cogburn et al., 2018). Users can interact with the experience by opening doors and picking up objects using a controller. Other experiences, such as the 360° video Clouds over Sidra, are less interactive. Clouds over Sidra puts viewers inside a Syrian refugee camp and follows a day in the life of 12-year-old Sidra (Arora & Milk, 2015). Participants cannot interact with the experience but become immersed as they watch aspects of Sidra’s life unfold around them.
More immersive experiences are typified by spatialized sound, stereoscopic visuals, greater image resolution, and a high update rate. Immersion can also be altered by design choices and restrictions, with sensory experiences that are truer to real life likely being more immersive. Realism is normally achieved by using real-world footage over computer-generated footage; however, computer-generated footage can vary extensively in the degree to which it is realistic. Finally, some VR designers have begun incorporating additional senses into their experiences, such as smell or touch, to increase immersion.
Interactivity primarily involves the extent to which objects in the VR environment can be manipulated (Steuer, 1992). Computer generated VR experiences can allow users to pick up objects, open doors, and even communicate with humanoid avatars. These interactions generally require hand-held devices but advances in voice recognition and motion detection may mean that more natural interactions will become commonplace in the near future. Beyond manipulating content, other aspects of the experience can offer more or less interactivity (Sundar, 2004). For example, VR experiences can differ in the extent they allow the user agency in directing the trajectory of experience (e.g., plot or scene changes).
The extent to which experiences are immersive and/or interactive may influence the engagement of users. However, these technological affordances (interactivity and immersion) are distinct from the psychological engagement they produce (Evans et al., 2017).
More immersive and interactive environments have been associated with creating a heightened feeling of presence in users (see Cummings & Bailenson, 2016, for review; Vashisht & Chauhan, 2017). Presence is the “perceptual illusion of nonmediation” (Lombard & Ditton, 1997), where a user fails to acknowledge the existence of the VR environment and responds as if it were not there. More simply put, the user has a feeling of truly “being there” in the virtual environment (Ahn et al., 2013; van Loon et al., 2018). Recent research suggests that the feeling of presence mediates the influence of VR on empathy (Barreda-Ángeles et al., 2020). Therefore, VR experiences that are more immersive and more interactive could be more effective at yielding empathic outcomes in users. However, there are reasons to be skeptical of the connection between presence and empathy. For example, researchers have found that head-mounted displays may trigger (spatial) presence but have no real effect on narrative engagement (in which empathy plays an important role; Pressgrove & Bowman, 2020).
In addition to creating a sense of presence, more immersive and interactive environments may also result in increased feelings of embodiment. Developments in motion and voice detection have led to a tighter coupling of body and machine which may trigger feelings of body ownership in users (Biocca, 1997). VR allows users to see and hear as if they were experiencing someone else’s point of view in the real world, in other words, to have an “embodied experience” (Ahn et al., 2013). Moreover, in some cases the VR experience is specifically designed to produce a body swap illusion where users are deliberately given a virtual body that is different from their own, that can be controlled in real time, to prompt perspective-taking (Ahn et al., 2016). Perspective taking, or imagining others’ experiences, has been found to enhance empathic concern (compassion) toward others (Batson, 2011; Batson et al., 1997). However, there are critical differences between perspective taking and virtual embodied experiences. Perspective taking requires significant cognitive resources (Lin et al., 2010; Roßnagel, 2000) and also requires sufficient motivation (Gehlbach et al., 2015).
The impact of VR on empathy could theoretically derive from engagement driven by presence or embodiment. A recent mini meta-analysis of seven studies that induced either presence or embodiment using VR, found that these VR experiences led to increases in cognitive empathy, but not emotional empathy (Ventura et al., 2020). The small number of studies in this meta-analysis speaks to how rarely researchers measure the psychological engagement triggered by VR. It is therefore possible that various types of VR do not trigger psychological engagement to a sufficient degree to arouse empathy. This is a particularly important research question because cheaper and more easily accessible types of VR tend to be less immersive and interactive. Indeed, to an overwhelming degree, when charitable organizations turn to VR, they use simple 360° documentary style footage often administered through desktop VR or cardboard VR headsets (Amnesty International, 2017; International Rescue Committee, 2016; Médecins Sans Frontières, 2016; UNICEF USA, 2015). These less immersive and interactive VR experiences may be less likely to trigger feelings of presence or embodiment, and subsequently empathy. Identifying whether such experiences are useful for arousing empathy is therefore of critical importance.
Unfortunately, measurements of psychological engagement cannot be achieved from descriptions of a VR experience alone. However, the technological affordances of different types of VR experiences (how interactive and immersive they are) can be determined from descriptions of the hardware and software. Therefore, in this meta-analysis we compared VR experiences that are more or less immersive and interactive. Although measuring psychological engagement is beyond the scope of this meta-analysis, it is presented here as a theoretical explanation for why different types of VR are expected to have different effects on empathy.
We conducted a meta-analysis of all known studies investigating the relationship between virtual reality and empathy. Studies were included if they used any type of virtual reality experience and any quantitative measure of empathy. We first determined the size of the overall effect of VR on empathy, and then determined whether the size of these effects depended upon the type of empathy measured and the type of VR used. All hypotheses were pre-registered at AsPredicted.com (#13614) unless otherwise specified.
First, based upon a dual process model of empathy, we expected VR to promote emotional, but not cognitive, empathy, because VR requires minimal mentalizing effort. We operationalized type of empathy by dichotomous coding of the empathy measures into emotional versus cognitive.
Next, we expected that empathy would increase if the type of VR promoted greater psychological engagement. The nature of this psychological engagement (i.e., heightened presence or embodiment) is beyond the scope of this analysis, since such mediating measures were rarely included in the available studies. However, ideas about psychological engagement drive our expectations. Type of VR is operationalized in four ways: categorically, based on delivery device (e.g., HMDs are assumed to be more engaging than desktop VR); continuously, based on duration in minutes (longer experiences are assumed to promote deeper engagement); and continuously, based on separate codings of immersive features and of interactive features. More interactive and immersive experiences were expected to lead to greater improvements in empathy. As an exploratory measure we also report how many senses the VR experience simulated, with more senses assumed to create a more immersive and engaging environment. Number of senses was not pre-registered as a moderator.
We also expected more engaging experiences to lead to longer lasting and more generalizable empathic improvements. In other words, empathic improvements should persist at follow-up and should spill over to other groups not targeted by the VR intervention. In addition, we expected that the effectiveness of VR for arousing empathy may depend on the type of control group used, with more engaging control groups yielding a smaller apparent effect of VR.
We report several exploratory moderators, that were not pre-registered: the demographics of the participants tested and the topic of the VR experience (e.g., refugees, disabled people). These moderators are of interest because previous research has found that empathy (and social cognition more broadly) differs in clinical populations (Baron-Cohen & Wheelwright, 2004), across genders (Davis, 1983), and cultures (Chopik et al., 2017), and is directed less toward certain stigmatized groups compared to non-stigmatized groups (Harris & Fiske, 2006).
In addition, we report exploratory moderators regarding the research designs included: use of control group, empathy measure used, and research quality. These moderators were included to ensure that the apparent efficacy of VR for improving empathy is not an artifact of experimental procedure.
This meta-analysis serves both theoretical and practical purposes. VR is a useful tool for better understanding the nature of empathy because it requires minimal mentalizing effort. If emotional empathy is automatic and cognitive empathy is deliberate, then VR should only arouse emotional empathy. In addition, this meta-analysis can provide practical advice regarding the type of VR experiences most adept at increasing empathy in users. This could be applied to a variety of educational and organizational settings.
We conducted a systematic literature review followed by a random effects meta-analysis.
We used a two-step search process to try to find as many eligible studies as possible. First, we conducted a database search of Web of Science and PsycINFO using the following terms: (“Virtual” OR “X Reality” OR “Augmented Reality” OR “360 Degree Media” OR “Avatar” OR Simulat* OR Immers* OR “Mixed Reality”) AND (Empath* OR Sympath* OR “Theory of Mind” OR “Emotional Contagion” OR “Mimicry” OR “Emotional Resonance” OR “Perspective Taking” OR “Mentalizing” OR “Oneness” OR “Psychological overlap”). We also searched WorldCat Dissertations and Theses.
Second, we reviewed reference sections of all articles identified via database searches, and also performed citation forward checking to locate other relevant publications. In an attempt to obtain unpublished data, we also contacted the first authors of eligible articles to request further work. When otherwise eligible articles had missing data, we contacted the authors to request it. Finally, we also made public calls for unpublished data on both social science and technology forums and listservs.
For a study to be included it had to meet three inclusion criteria.
Expose participants to a virtual reality technology. There are a wide variety of virtual experiences and technologies that can be considered virtual reality including immersive virtual environments (IVE) administered through HMDs or projection domes, augmented reality that adds a layer of virtual experience onto the real world via a smartphone or tablet, window to the world experiences on desktop computers, as well as haptic gloves, telepresence controllers, spatialized surround sound, and even newly developed scent masks. In order to acknowledge the large variety of experiences and technologies captured under the common parlance of “virtual reality” we define VR as any computer technology that virtually simulates one or more senses (auditory, visual, olfactory, gustatory, and/or tactile simulations). Although we recognize that more stringent definitions of virtual reality exist, maintaining a broad definition of VR allows for the comparison of technological features that may be more (or less) effective at increasing empathy.
Employ a quantitative measure of empathy. We were interested in studies that measured cognitive and/or emotional empathy. Cognitive empathy was defined as understanding the mental states of others and emotional empathy was defined as having an emotional reaction to the mental states of others.
Study design allows for the calculation of an effect size of VR. Studies were not included if they only compared one type of virtual reality intervention to another. We excluded these studies because they could not answer our research question, since we were examining the effect of virtual reality, and it would be unclear which VR condition would be considered the control group.
A total of 5,073 articles were obtained through databases and other sources, after removing duplicates (see Figure 1). The abstracts of these articles were screened, and clearly irrelevant articles were excluded. Full texts of 223 possibly relevant articles were obtained. From these full texts, 177 articles were excluded for the following reasons: 55 articles were excluded because they did not use virtual reality technology (e.g., used physical methods of simulation such as a blindfold), and 48 were excluded because they did not use an appropriate study design (six review articles, nine study proposals, 33 lacking inferential data). Note that studies were not excluded for using a correlational design, but no studies of this nature were found. A further 39 studies were excluded because they compared different types of VR experiences to each other, 26 were excluded because they did not measure empathy, five articles were excluded because they reported duplicate data included in another publication, and four were excluded because they investigated VR experiences in tandem with another empathy-enhancing technique, which introduced a confound that prevented the isolation of the unique effect of VR.
We searched the included articles for reported effect sizes or descriptive data that would allow for their calculation. We contacted authors who discussed collecting relevant data but did not report it or did not do so in sufficient detail to allow for the calculation of effect sizes, in an attempt to obtain this missing data. If unsuccessful, and reasonable approximations of missing values could not be made (see Table S1) these studies were also excluded. This process led to the additional exclusion of three articles.
Overall, we obtained 43 usable articles (32 published journal articles, two published conference articles, seven dissertations, and two other unpublished sources), containing a total of 122 usable effect sizes (see Table S3).
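The screening counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; the variable names are illustrative and not part of the original analysis, while the numbers are copied from the text.

```python
# Counts copied from the screening description; names are illustrative only.
full_texts_assessed = 223

full_text_exclusions = {
    "did not use VR technology": 55,
    "inappropriate study design": 48,
    "compared VR experiences only to each other": 39,
    "did not measure empathy": 26,
    "duplicate data": 5,
    "VR combined with another empathy technique": 4,
}

# Breakdown of the 48 design-related exclusions.
design_exclusions = {
    "review articles": 6,
    "study proposals": 9,
    "lacking inferential data": 33,
}

# Articles dropped later because effect sizes could not be calculated.
excluded_for_missing_data = 3

included_by_source = {
    "published journal articles": 32,
    "published conference articles": 2,
    "dissertations": 7,
    "other unpublished sources": 2,
}

excluded_at_full_text = sum(full_text_exclusions.values())
included = full_texts_assessed - excluded_at_full_text - excluded_for_missing_data

print("Excluded at full-text stage:", excluded_at_full_text)
print("Design exclusions add up:",
      sum(design_exclusions.values()) == full_text_exclusions["inappropriate study design"])
print("Included articles:", included)
print("Source breakdown adds up:", sum(included_by_source.values()) == included)
```

Under these counts the full-text exclusions, the design sub-breakdown, and the final number of included articles are mutually consistent.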
The present study investigated seven pre-registered moderators to determine if they explained variation in the effect of VR on empathy, and eight additional exploratory moderators were chosen based upon the nature of the studies found during the literature search. All moderators were hand coded based upon the contents of each article, supplementary materials and, where necessary, after contacting the original authors. When information regarding study methodology was not described in the article in sufficient detail to allow coding, the study was excluded from that moderator analysis. When the moderator required subjective judgment in order to code, we used two independent coders in order to establish inter-rater reliability. In case of disagreements, the lead coder’s judgments were used in subsequent analyses, but the overall trend of results was the same regardless of which coder’s judgments were used.
Studies were coded to indicate which type of empathy was measured, either cognitive empathy (understanding the mental states of others) or emotional empathy (having an emotional reaction to the mental states of others). Studies that used a combined measure of cognitive and emotional empathy (e.g., Bryant Index of Empathy, Kiersma-Chen Empathy Scale, and Empathy Quotient) were not included in this analysis but were included when calculating the overall effect size. Two independent coders (authors Alison Jane Martingano & Sara Konrath) rated empathy type with a high level of agreement (κ = 0.98).
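For readers unfamiliar with the agreement statistic, Cohen’s κ compares the observed agreement between two coders with the agreement expected by chance. The sketch below is a generic Python implementation; the function name and the four example ratings are hypothetical and are not data from this meta-analysis.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical codes to the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)

    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2

    if expected == 1.0:
        return 1.0  # Both raters used a single identical category throughout.
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for four empathy measures (not real study data).
coder_1 = ["emotional", "emotional", "cognitive", "cognitive"]
coder_2 = ["emotional", "emotional", "cognitive", "emotional"]
kappa = cohens_kappa(coder_1, coder_2)
print(kappa)
```

In this toy example the coders agree on three of four items (observed agreement 0.75) against a chance agreement of 0.50, which gives κ = 0.5.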
We coded the type of delivery device that was used to administer the VR experience. Immersive virtual environments included Head-Mounted Displays such as Oculus Rift and projection VR systems (e.g., Dome, VR Cave), which aim to block out stimuli from the real world and replace it with virtual content. Non-immersive virtual environments included experiences that augment people’s sensory environment but do not fully replace it, such as desktop display on laptops or tablets, or audio only experiences.
The length of the virtual experience (in minutes) was coded as a continuous variable. When a range was provided by the authors (e.g., 5–10 min), we recorded the midpoint (e.g., 7.5 min).
Studies were coded on the level of immersion afforded by the technology used to display each VR experience. Immersion is regarded as a quality of the technology, that is, the technological capacity of a medium to create and sustain a vivid virtual experience, while shutting out physical reality (Slater & Wilbur, 1997). In other words, immersion is an objective and descriptive measure of the extent that a particular medium is able to replace physical perceptual input with virtual perceptual input and engage multiple sensory modalities. Studies were rated on seven immersive features adapted from a previous meta-analysis on immersive technology (Cummings & Bailenson, 2016), namely, tracking level, stereoscopy, image quality, field of view, sound quality, update rate, and photorealism. Studies were rated on each feature as high (two points), low (one point), or absent (zero points). An overall immersion rating was calculated for each study as a percentage of the total possible score excluding features that could not be coded. Two independent coders (authors Alison Jane Martingano & Fernanda Hererra) rated immersion levels with a high level of agreement (all κ > 0.7).
Studies were coded on the extent to which users could actively interact with and control the virtual environment. Studies were rated on five interactive features, namely, gaze direction, limb movement, mobility, physical manipulation, and agency. Studies were rated on each feature as high interactivity (two points), low interactivity (one point), or absent (zero points). Features that could not be coded from the descriptions provided by authors were rated as N/A. An overall interactivity rating was calculated for each study as a percentage of the total possible score excluding features that could not be coded. Two independent coders (authors Alison Jane Martingano & Fernanda Hererra) rated interactivity levels with a high level of agreement (all κ > 0.7).
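The percentage scoring used for both immersion and interactivity can be sketched as follows. This Python snippet is an illustration only: the function name and the example ratings are hypothetical, and `None` stands in for a feature rated N/A.

```python
def affordance_score(ratings):
    """Percentage of the maximum possible score across codable features.

    `ratings` maps each feature to 2 (high), 1 (low), 0 (absent),
    or None when the feature could not be coded (N/A).
    """
    coded = [value for value in ratings.values() if value is not None]
    if not coded:
        return None  # Nothing could be coded, so no score is defined.
    max_possible = 2 * len(coded)  # Two points per codable feature.
    return 100.0 * sum(coded) / max_possible


# Hypothetical interactivity ratings for one study (not real data).
interactivity = {
    "gaze direction": 2,
    "limb movement": 1,
    "mobility": 0,
    "physical manipulation": None,  # Not described by the authors.
    "agency": 1,
}
score = affordance_score(interactivity)
print(score)
```

Here four features are codable, so the maximum is eight points; a total of four points yields a score of 50%. Excluding uncodable features from the denominator keeps sparsely described studies from being scored as less interactive merely because of missing detail.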
Studies were coded to determine the target of participants’ empathy, as either having the same group membership as the one depicted in the VR experience (e.g., measuring empathy toward refugees after viewing an experience about a refugee), a different group membership (e.g., measuring empathy toward refugees after viewing an experience about a homeless person), or using a generalized measure of empathy that was not group specific.
Studies were coded as either having an immediate measure of empathy or a delayed measure (e.g., after days or weeks).
We recorded the type of control groups used. Any controls used in two or more studies were subsequently analyzed as a subgroup, namely, perspective-taking instructions, reading text, reading text combined with perspective-taking instructions, video, real-life, and no-treatment (placebo) controls. Studies without control groups were omitted from this analysis.
We recorded the number of human senses (vision, audition, taste, touch, and olfaction) that were simulated as part of the virtual experience.
We coded basic demographic variables of the participants (age, gender, and location) and created subgroups for non-clinical and clinical populations. It transpired that all clinical samples consisted of adults or children with Autism Spectrum Disorder (ASD). The mean age of each sample and its gender distribution (% males) were coded as continuous variables.
We recorded the specific issue/group that VR interventions were designed to increase empathy toward, for exploratory purposes. Any topic targeted by two or more studies was subsequently analyzed as a subgroup. This included: children, people in poverty, the elderly, refugees and immigrants, people with physical health issues, people with mental health issues, domestic violence victims, victims of bullying, and non-human environmental issues (e.g., pollution, animal rights).
Studies were coded as either including a control group (both within-subjects and between-subjects control groups were included) or not using a control group (pre-post measurements only).
We recorded the specific scales used to measure empathy in each study, for exploratory purposes. Any scales used in two or more studies were subsequently analyzed as a subgroup.
Studies were evaluated for their quality and experimental rigor using a mixed criteria approach based upon the Study Design and Implementation Assessment Device (DIAD) approach (Valentine & Cooper, 2008). We created 20 coding criteria that addressed the four types of validity outlined in the DIAD (construct, internal, external, and statistical). These coding criteria were directly observable in the study methodology, therefore maintaining the benefits of an objective methods-description approach while still assessing threats to validity. We summed sub-scores for each type of validity to produce an overall quality score. Two independent coders (authors Alison Jane Martingano & Sara Konrath) rated research quality with a high level of agreement (all κ > 0.7).
We conducted a random effects meta-analysis using Comprehensive Meta-Analysis V3 software (Borenstein et al., 2006) to determine the overall effect of VR interventions on empathy. Effect sizes indicate the difference between baseline empathy (pretest or control condition) and empathy levels following a virtual reality experience. Positive effects mean that virtual reality increases empathy, while negative effects mean that virtual reality decreases empathy.
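As an illustration of the effect size described above, the sketch below computes a standardized mean difference for a simple between-subjects comparison, with positive values indicating higher empathy after VR. It uses the textbook pooled-standard-deviation formula and the usual large-sample variance approximation; it is not a description of how the meta-analysis software computes effects, pre-post designs would additionally need the pretest-posttest correlation, and all numbers are hypothetical.

```python
import math

def cohens_d(mean_vr, sd_vr, n_vr, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference (VR minus comparison) and its sampling variance."""
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n_vr - 1) * sd_vr**2 + (n_ctrl - 1) * sd_ctrl**2) / (n_vr + n_ctrl - 2)
    d = (mean_vr - mean_ctrl) / math.sqrt(pooled_var)
    # Large-sample approximation to the variance of d for two independent groups.
    variance = (n_vr + n_ctrl) / (n_vr * n_ctrl) + d**2 / (2 * (n_vr + n_ctrl))
    return d, variance

# Hypothetical study: VR group scores 0.4 points higher on a scale with SD = 1.
d, v = cohens_d(5.2, 1.0, 40, 4.8, 1.0, 40)
```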
When deriving the overall effect of VR, this meta-analysis applied a conservative approach via three general principles. First, when a study employed an experimental design with multiple virtual reality interventions, these were collapsed into one intervention group. This prevents an artificial inflation of N by including the control group only once in the meta-analysis, thus giving a conservative estimate of the precision of the overall effect. Second, when a study employed multiple control conditions these were collapsed together to provide one comparison group, for the same reason. Third, when multiple measures of empathy were reported, we used their average in the meta-analysis. This is so that the analysis does not assign more weight to studies with multiple outcome measures, and so that it does not overestimate the precision of the overall effect by assuming these measures are independent, when they are likely to be positively correlated.
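The first two principles involve collapsing several arms into a single group. One standard way to do this from summary statistics is sketched below, assuming each arm reports its sample size, mean, and standard deviation; the function name and values are hypothetical, and the exact procedure used for each study may have differed.

```python
import math

def combine_groups(groups):
    """Collapse several arms, each given as (n, mean, sd), into one (n, mean, sd)."""
    n_total = sum(n for n, _, _ in groups)
    mean_total = sum(n * m for n, m, _ in groups) / n_total
    # Total sum of squares = within-arm SS + between-arm SS around the combined mean.
    ss_within = sum((n - 1) * sd**2 for n, _, sd in groups)
    ss_between = sum(n * (m - mean_total) ** 2 for n, m, _ in groups)
    sd_total = math.sqrt((ss_within + ss_between) / (n_total - 1))
    return n_total, mean_total, sd_total

# Two hypothetical VR arms from one study, merged into a single intervention group.
n, m, sd = combine_groups([(20, 5.0, 1.0), (20, 6.0, 1.0)])
```

The combined standard deviation (about 1.11 here) is larger than either arm's because it also absorbs the difference between the arm means.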
For moderator analyses, results were aggregated within each subgroup of the moderating variable. For example, a study that contained one measure of emotional empathy and two measures of cognitive empathy would be treated as having two outcomes: a composite measure of cognitive empathy and a single measure of emotional empathy.
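The example above can be sketched as follows, assuming that aggregation within a subgroup is a simple unweighted average of effect sizes. That is an assumption made for illustration (the text says only that results were aggregated), and the effect sizes are invented.

```python
from collections import defaultdict
from statistics import mean

def composite_by_type(outcomes):
    """Average one study's effect sizes within each empathy type.

    `outcomes` is a list of (empathy_type, d) pairs from a single study.
    """
    grouped = defaultdict(list)
    for empathy_type, d in outcomes:
        grouped[empathy_type].append(d)
    return {empathy_type: mean(ds) for empathy_type, ds in grouped.items()}

# One emotional-empathy measure and two cognitive-empathy measures (hypothetical values).
study = [("emotional", 0.40), ("cognitive", 0.10), ("cognitive", 0.20)]
composites = composite_by_type(study)
```

The study then contributes two outcomes to the moderator analysis: the single emotional effect and the cognitive composite.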
Analyses were conducted using a random-effects model (DerSimonian & Laird, 1986). Analyses were performed using Cohen’s d, with weighted averages of effect sizes and 95% confidence intervals (CIs). Heterogeneity tests were conducted using I² and Q statistics. Moderator analyses were conducted with mixed-effects models.
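For readers unfamiliar with the random-effects model, the sketch below shows the DerSimonian-Laird estimator in its usual textbook form: inverse-variance weights, Cochran's Q, a moment estimate of the between-study variance (tau squared), and the resulting pooled effect with a 95% CI and I². It is an illustrative implementation with hypothetical inputs, not the code or software used for the reported analyses.

```python
import math

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2 estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }

# Three hypothetical studies with equal sampling variances.
result = dersimonian_laird([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
```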
Data used in this meta-analysis are publicly available via the Open Science Framework: https://osf.io/ezpxu/?view_only=d85bf39a933448c5addf8fd2c631cfa6
VR has a significant positive impact on empathy, with an overall standardized difference in means of 0.43, 95% CI [0.31, 0.55], z = 6.93, p < .001. This mean effect size is moderate, with VR treatment groups improving almost half a standard deviation on empathy measures. However, the dispersion of effects around this mean is substantial, and greater than would be expected by random variation, Q(50) = 379.10, p < .001; I² = 86.81%. Moderator analyses revealed that in some situations, VR had a strong effect on empathy, and in others, it was trivial or absent.
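The heterogeneity figures above are linked by the standard definition of I² as the share of Q that exceeds its degrees of freedom. The short check below applies that definition to the reported Q and df; the function name is ours.

```python
def i_squared(q, df):
    """Percentage of total variation across studies attributed to heterogeneity."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100

# Reported values: Q(50) = 379.10.
i2 = i_squared(379.10, 50)
```

This gives about 86.81, matching the reported I².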
Subgroup analyses revealed that VR improved emotional empathy significantly more than cognitive empathy, Q(1) = 8.03, p = .005. Indeed, VR appeared to have no significant impact on cognitive empathy, d = 0.08, p = .23. This suggests that VR technology may lend itself to arousing empathic feelings, d = 0.33, p < .001, but not to improving understanding of others’ mental states (see Figure 2).
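The Q statistics for subgroup comparisons are evaluated against a chi-square distribution with the stated degrees of freedom. As a rough check, the sketch below computes the upper-tail probability using only the standard library (closed-form series for integer degrees of freedom); the function name is ours, and the reported Q values are rounded, so the resulting p values should match the text only approximately.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with positive integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        m = df // 2
        return math.exp(-y) * sum(y**k / math.factorial(k) for k in range(m))
    # Odd df: complementary error function plus a finite correction series.
    m = (df - 1) // 2
    tail = math.erfc(math.sqrt(y))
    series = sum(y ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return tail + math.exp(-y) * series

p_empathy_type = chi2_sf(8.03, 1)  # emotional vs. cognitive empathy
p_topic = chi2_sf(6.44, 8)         # topic of the VR experience
```

Both values agree with the reported p = .005 and p = .598 to within rounding.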
The moderating effect of empathy type appears robust. Exploratory multiple moderator analyses did not find a reduction in the size of this effect when research quality, number of senses, immersion, or interactivity was taken into account (see Table S2).
To our surprise, more immersive delivery devices that used head-mounted displays did not have a significantly larger effect on empathy than non-immersive delivery devices that ran on a normal desktop computer or headphones, Q(1) = 1.77, p = .183 (see Table 1). In addition, meta-regression analyses found that greater levels of immersion and interactivity, coded continuously, did not yield greater increases in empathy (p > .05; see Table 2). The duration of the VR experience also did not appear to influence its effectiveness (p = .109). VR experiences included in this analysis varied from 1 min to 1 hr in duration. The null effect of type of VR challenges the importance of immersion and interactivity for creating prosocial VR experiences. Instead, these results suggest that less expensive, less technologically advanced, and presumably less engaging VR experiences are just as effective at eliciting empathy.
Table 1. Summary Statistics for Subgroup Moderator Analyses
Type of empathy [Q(1) = 8.03, p = .005]
Delivery device [Q(1) = 1.77, p = .183]
    Immersive virtual environments
    Non-immersive virtual environments
Target of empathy [Q(2) = 1.68, p = .431]
Timeframe [Q(1) = 1.98, p = .167]
Control group [Q(5) = 10.13, p = .072]
    Reading and perspective-taking instructions
    No treatment control
Participants [Q(1) = 4.53, p = .033]
    Non-autism spectrum disorder
    Autism spectrum disorder
Nationality [Q(2) = 5.58, p = .062]
    Other (Australia, Korea, and Taiwan)
Topic [Q(8) = 6.44, p = .598]
    Refugees and immigrants
Use of control group [Q(1) = 3.48, p = .062]
Empathy measure [Q(7) = 10.86, p = .054]
    Kiersma-Chen empathy scale
    Venn diagram circles task
Table 2. Summary Statistics for Continuous Moderator Analyses
Number of senses
The empathic impact of VR is equally powerful toward task-specific and generalized targets, Q(2) = 1.68, p = .431, suggesting that empathy does transfer beyond the specific content of the VR experience. For example, a VR experience about a single child refugee is likely to increase empathy toward all child refugees. However, we were unable to examine whether the improvements in empathy generalize to different targets entirely because this construct was measured in only one research study.
The positive effects of VR on empathy appear to persist over time. Seven studies examined participants’ empathy levels after a delay, ranging from 1 week to 8 weeks. Empathy levels were not significantly different at follow-up compared to immediately following the experience, Q(1) = 1.91, p = .167. This suggests that the positive impact of VR does not diminish over time. However, it is worth noting that this analysis is underpowered, given the paucity of follow-up research, and therefore should be interpreted with caution.
The effectiveness of VR does appear to depend on the type of control group to which it is compared, Q(6) = 13.45, p = .036. VR was found to be more effective than no treatment, d = 0.44, p < .001, video, d = 0.50, p = .002, and perspective-taking instructions, d = 0.42, p = .016, control groups, but was only marginally more effective when compared to reading about others, d = 0.30, p = .053, and not significantly more effective than reading combined with perspective-taking instructions, d = 0.10, p = .536, or witnessing others in real life, d = −0.09, p = .613. Given the cost of VR technology, these results suggest that in some situations, less expensive, non-technological interventions may be just as effective at eliciting empathy as VR.
In line with our other results, but contrary to expectations, greater sensory immersion did not produce greater improvements in empathy (p = .173).
Subgroup analyses revealed that VR significantly improved empathy for clinically healthy populations, d = 0.39, p < .001, as well as those with autism spectrum disorder, d = 0.96, p < .001. The positive effect of VR was significantly larger among autistic populations, Q(1) = 4.53, p = .033, possibly due to the lower baseline levels of empathy associated with this population. We performed all other moderator analyses using only studies with non-clinical samples, which is a more conservative strategy, to ensure that the effect of VR would not be artificially inflated in the general population. Participants’ national origin did not appear to influence the effectiveness of VR. There was no significant difference in empathy enhancement among studies conducted with participants from North America, Europe, and other locations, Q(2) = 5.58, p = .062. There were also no significant effects of age or gender (all p > .05).
Included in this meta-analysis were VR experiences promoting empathy toward a variety of different groups including children, the elderly, refugees, victims of domestic violence and bullying as well as those who suffer from a variety of mental and physical ailments. There were no significant differences in empathy enhancement toward these different groups, Q(8) = 6.44, p = .598.
There was no solid evidence that the effect of VR on empathy was an artifact of research design, although there were several marginal effects. There was a marginal change in effect size estimate depending on whether the research used a control group or not (p = .062) and what specific empathy measures were used (p = .054), but no significant effect of overall research quality (p = .389).
A funnel plot of all included studies showed minor evidence of asymmetry, indicating a possibility of publication bias (see Figure S1). In other words, studies finding a significant effect of VR on empathy may be more likely to be published, and therefore included in our meta-analysis. However, we would need to find thousands more “null” studies in order for the overall effect to become nonsignificant (fail safe N = 3,444; Rosenthal, 1994). Nevertheless, it is still possible that the estimated effect size may be inflated by publication bias and therefore, in order to take an exceedingly conservative approach, we utilized the trim-and-fill method to make reasonable assumptions about possible missing data (Duval & Tweedie, 2000). Recalculating the average effect of VR using this method reduced the estimated effect size from 0.43 to 0.28. In this meta-analysis, the smallest two studies with the largest effect sizes were both conducted with autistic populations (plotted on the far right). A population difference, rather than a reluctance to publish nonsignificant findings, may therefore explain this asymmetry.
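The fail-safe N cited above answers the question of how many unpublished studies averaging a null result would be needed to render the combined effect nonsignificant. A minimal sketch of Rosenthal's formula follows, assuming each study is summarized by a standard normal deviate (z score) and a one-tailed criterion of .05; the z scores are hypothetical and are not the values from this meta-analysis.

```python
def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N for a set of study-level z scores.

    Solves sum(z) / sqrt(k + N) = z_alpha for N, the number of additional
    studies with z = 0 needed to bring the combined z down to the criterion.
    """
    k = len(z_scores)
    return max(0.0, sum(z_scores) ** 2 / z_alpha**2 - k)

# Four hypothetical studies.
example = fail_safe_n([2.0, 2.5, 1.5, 3.0])
```

With these four invented studies, roughly 26 additional null studies would be required.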
Based on this meta-analysis, we conclude that a wide variety of virtual reality experiences can increase empathy, but that these effects are constrained to improvements in emotional empathy, rather than cognitive empathy. The rush of emotions elicited on behalf of virtual victims does not appear to translate into an improvement in understanding their experiences. This may be because VR leaves so little to the imagination that users do not have an opportunity to practice mentally recreating what it is like to be in an unfamiliar situation. This argument, based on a dual process model of empathy, posits that cognitive empathy, unlike emotional empathy, requires the deliberate engagement of mentalizing effort in order to be aroused. Because VR experiences present the thoughts and feelings of others explicitly, there may be no need for users to engage in such mental effort, and VR may consequently be less likely to produce increases in cognitive empathy. In other words, VR makes it easy to feel for others, but it does not challenge us to think for ourselves about others’ perspectives.
Alternatively, VR may not yield improvements in cognitive empathy because the demands it places on users may be too challenging (Bowman, 2019). Users may not be able to attend to and/or process the simultaneous sensory, motor, emotional and cognitive demands of VR because of a limited processing capacity (Fisher et al., 2018). This explanation also assumes that cognitive empathy requires sufficient mental resources, but argues that VR overloads, rather than under-stimulates, this ability.
Another possibility is that VR empathy interventions may need to be much longer (e.g., several months) before they impact cognitive empathy. Although we found no moderating influence of VR duration on empathy, all of the studies in this meta-analysis that were conducted with non-clinical samples used a single session of VR. The differences in duration we analyzed, therefore, were a matter of minutes rather than months. As cognitive empathy is a skillset learned over several years (Gweon & Saxe, 2013), it may not be enhanced with a single-session VR experience.
Our results showing that VR increases only emotional empathy appear to contradict a recent meta-analysis finding that VR increases cognitive empathy. However, we believe that the results of Ventura et al. (2020) were unique to the specific limited types of VR experiences they included (k = 13). The meta-analysis reported here (122 effect sizes from 43 studies) includes a much wider variety of VR experiences, including less expensive and commonly available Desktop VR and VR administered through cardboard headsets. In this way, we believe that our data complement rather than contradict this previous work and help give a more well-rounded picture of the potential utility of VR.
To our surprise, more immersive and interactive types of VR, which are known to elicit higher levels of psychological engagement, did not have a larger effect on empathy. This unexpected result may be greeted with cautious optimism: charitable organizations need not invest in highly immersive and interactive experiences in order to trigger emotional empathy in would-be donors.
Despite the null effect on cognitive empathy, the improvement in emotional empathy following VR suggests that tempered enthusiasm for VR is warranted. The empathy generated by watching the suffering of one individual appears to generalize to people in similar situations. In addition, this effect appears to persist over time. The effect of VR on emotional empathy is not large, d = 0.33, but is comparable to other interventions such as reading fiction (d = 0.15, Dodell-Feder & Tamir, 2018) or direct perspective taking instructions (ds range from 0.12 to 1.0; Myers et al., 2014).
In addition, although not an aim of this meta-analysis, it appears that VR is particularly well suited to generate empathy in people on the Autism spectrum, d = 0.97. Our data therefore support the therapeutic use of VR empathy interventions with these clinical populations. Although it is not possible to determine why participants with Autism benefitted most from VR, one possibility is that they had more to gain from the intervention. Previous research suggests that VR is more effective for people with lower levels of dispositional empathy (Ahn et al., 2013), and people with Autism often struggle with cognitive empathy skills (Baron-Cohen & Wheelwright, 2004). However, the duration of the VR experiences used with this population was also much longer, and often involved multiple sessions, which offers an alternative explanation for their superior improvements.
Whether pre-existing levels of empathy influence how participants respond to VR more broadly remains an empirical question unexplored by this meta-analysis. However, our data on gender may offer some insight into this question. Women consistently report higher levels of dispositional empathy than men (Davis, 1983), yet gender was not found to be a moderating factor. Future work should investigate the importance of pre-existing empathy levels for VR interventions.
A dual-process model of empathy suggests that certain types of empathy require more conscious effort than others (Martingano, 2020; Yu & Chou, 2018). The results of this meta-analysis support this distinction, demonstrating that emotional empathy can be passively, and automatically, aroused by virtually watching the suffering of others, whereas cognitive empathy cannot.
By conceptualizing empathy under this dual process system, it is possible to unite two generally distinct bodies of research that have emerged over the last few decades that tacitly assume empathy is a product of either automatic or deliberate processes. Theorizing around emotional empathy grew from the discovery of mirror neurons in humans. The neural profile of emotional empathy is thought to be specifically located within the automatic mirror system (Iacoboni, 2008, 2009; Rizzolatti & Craighero, 2004). Behavioral data also support the automaticity of emotional empathy, demonstrating that it can occur rapidly (Dimberg et al., 2000), even outside of conscious awareness (Neumann & Strack, 2000). Emotional empathy appears present in humans from infancy: newborn babies cry in response to the cries of another infant more than to other sounds that are equally loud and startling (Sagi & Hoffman, 1976). This fierce emotional reaction perhaps best typifies the automaticity of emotional empathy.
On the other hand, many researchers’ operationalization of empathy is slower and more deliberate, which instead captures cognitive empathy. Many experimental manipulations to encourage cognitive empathy explicitly instruct participants to take the perspective of another, often encouraging them to close their eyes in order to visualize the other’s plight and mentally transpose themselves into the situation (Batson & Ahmad, 2009, for review). These experiments assume that a conscious and resource intensive process is required to elicit empathy. Researchers have also demonstrated that people fail to understand the mental states of others when distracted (Gilovich et al., 2000).
VR provides an excellent tool for investigating the dual process model of empathy because it relieves users of the burden of mentalizing. Therefore, the emotional aspects of empathy that improve following VR experiences can be assumed to occur with little conscious effort on the part of the user. On the other hand, VR does not improve cognitive aspects of empathy, which supports the idea that this type of empathy requires deliberate mentalizing effort.
When choosing what kind of empathy intervention to use, non-profits, policymakers, teachers, and practitioners must consider what empathic benefits an intervention will yield and weigh these against its associated costs. This meta-analysis provides useful information relevant to both of these considerations.
First, this research supports the assertion that cognitive and emotional empathy are aroused via different mechanisms; therefore, a single-pronged approach to arousing empathy is likely to fall short in at least one domain. Although unidimensional empathy interventions can improve prosocial behaviors (Lopez & Snyder, 2009), which type of empathy is more beneficial depends on the type of prosocial behavior one hopes to elicit. For example, fundraising campaigns may find that VR experiences are more than sufficient for their purposes, assuming they can capitalize on the rush of empathic emotions aroused with a well-placed donation bucket or web-link.
Second, our meta-analysis revealed that VR does not create substantive improvements in empathy beyond those that can be achieved with less expensive and less technologically advanced methods. Unsurprisingly, studies that compared VR to real-life scenarios did not find VR to be more effective at eliciting empathy. The same was true for studies that used reading and perspective-taking instructions as their control group. Therefore, although VR experiences are an important addition to the current toolbox of empathy interventions, their considerable cost and specialized emotional effects may limit their general usefulness.
Tremendous advances in computer technology over the last decade have made it possible to achieve hyper-realistic VR simulations. The combination of powerful graphics, high-resolution head-mounted displays, motion-sensing technologies, and high-fidelity surround sound allows users to be immersed in, and interact with, virtual worlds in an unprecedented manner. Designers have focused on these features as a way to increase users’ feelings of presence and/or embodiment and in turn their empathy. However, contrary to expectations, this meta-analysis did not support this assumption: Greater levels of immersion and interactivity did not yield greater increases in empathy.
If VR creators wish to increase empathy, they may need to go beyond making the experience realistic. For example, VR experiences could ask users to reflect on how a virtual person is thinking or feeling by asking them to predict what they might do next or explain why they acted as they did. These kinds of explicit cognitive interventions could be built into the experience as choice-points or augment existing experiences as narrator prompts. Regardless of their exact nature, the aim is to encourage users to use their own imagination to build upon the virtual environment they experience. Another way to prompt cognitive empathy would be to include users in the design of a personalized virtual experience, for example, by allowing them to build their own computer-generated representation of a refugee camp or cancer hospital. VR designers may be able to challenge users’ biased or inaccurate perceptions by limiting their design choices: users may have to place 30 hospital beds within a limited building size or feed 100 people with limited food. By making users actively involved in the creation of a virtual world, designers would require them to engage their own imagination.
Given the different mechanisms by which cognitive and emotional empathy are aroused, VR designers wishing to improve both aspects of empathy are likely to need a two-pronged approach. These dualistic experiences must both provide enough explicit emotional information to prompt users’ automatic emotional empathy and be complex and ambiguous enough that users feel they need to engage their own cognition. It remains possible that these two aims are not compatible, and that improvements in one aspect necessitate reductions in the other. If this is the case, VR designers may have to prioritize what type of empathy they wish their experience to yield.
Like all meta-analyses, the quality of this research was dictated by both the number and nature of the studies available for inclusion. Through an exhaustive database search, as well as attempts to acquire unpublished datasets, we obtained 43 pieces of original research for inclusion in this meta-analysis, yielding 122 effect sizes. These studies often contained only a small number of participants each, but the power of meta-analytic techniques comes from combining these studies to obtain an overall larger sample size, in this case, of 5,644 participants. However, unlike the main effect, for moderator analyses the N is much closer to the number of studies, rather than the number of participants (Hempel et al., 2013). To understand the power of our moderator analyses it is important to inspect the confidence intervals, which contain all necessary information about the precision of the effect. Importantly, the moderator analyses for immersion and interactivity show confidence intervals of [−0.01, 0.00] and [−0.01, 0.01] respectively, indicating that even with a limited N we had the power to precisely estimate this moderator effect and conclude with confidence that immersion and interactivity did not moderate the improvement of empathy overall.
The studies included in this meta-analysis varied substantially in effect size, and this variance remained significant even within subgroups, indicating the existence of multiple moderators of the effect of VR on empathy. Although we investigated several exploratory moderators based on the nature of the studies available to us, we were unable to explore all moderators and important possibilities went unexplored, such as the content of the experience. In particular, this meta-analysis only included studies which measured empathy as an outcome variable, and as a result, VR content was most likely prosocial in nature. Since psychological research finds that the type of media content (i.e., aggressive vs. prosocial) affects whether outcomes are aggressive or prosocial (Greitemeyer & Mügge, 2014), we would expect that other content, for example violent content, might not have this positive effect on empathy and may even lead to more aggressive outcomes.
Like much psychological research, the research included in this meta-analysis was predominantly conducted on participants who were Western, educated, and from industrialized, rich, and democratic (WEIRD) countries. This is particularly pertinent as there are documented differences in empathic traits cross-culturally (Chopik et al., 2017), and so it is important to investigate possible cultural differences in the efficacy of VR empathy interventions. Sadly, despite a concerted push for more representative sampling in psychology (Henrich et al., 2010) it appears that psychological research has been slow to respond (Rad et al., 2018). We hope that as VR becomes more commonplace, we will see more rigorous experimental work done cross culturally, as well as correlational studies investigating the overall relationship between VR use and trait empathy.
Overall, we used a conservative approach throughout this meta-analysis, taking care to ensure that we did not artificially inflate the mean effect size nor its precision. Therefore, we feel confident in concluding that VR has a significant positive effect on emotional empathy. Given our modern world, it might be reasonable to suggest that people need not invest their own mental effort into empathizing and can simply arouse empathy automatically through these graphic and immersive technologies. Indeed, research suggests that people actively avoid engaging in the effortful process of empathizing when given a choice (Cameron et al., 2019). Perhaps we do not need to teach ourselves how to take someone else’s perspective when we can simply slip on a VR headset?
However, as authors we heavily caution against this approach for three reasons. First, reliance on VR would prevent people from fostering their own cognitive empathy skills, perhaps rendering them unable to empathize with anyone who is not presented in 3D high definition with surround sound technology. Far from increasing empathy toward others around the globe, VR could restrict empathy toward only those we can see, albeit virtually. Second, emotional empathy may be associated with distress and burnout (Kyer, 2020). However, research finds that cognitive empathy is associated with lower stress hormones during a laboratory stressor task (Ho et al., 2014), suggesting cognitive empathy, but not emotional empathy, is a buffer against burnout. Third, research suggests that working harder may be a key component in motivating compassion (Olivola & Shafir, 2013). We argue that, like many other worthwhile skills, cognitive empathy appears to require effort and virtual reality does not offer an easy shortcut.