A Randomized Controlled Trial Assessing the Efficacy of a Virtual Reality Biofeedback Video Game: Anxiety Outcomes and Appraisal Processes

Volume 2, Issue 2. DOI: 10.1037/tmb0000028

Published on Aug 05, 2021

Abstract

This study assessed the efficacy of a virtual reality biofeedback video game (DEEP) in reducing anxiety symptoms. In addition, changes in engagement and cognitive appraisals including self-efficacy, locus of control, and threat-challenge appraisals were measured and it was explored how these factors related to anxiety regulation. Undergraduates with elevated anxiety symptoms (N = 112) were randomly assigned to four training sessions with DEEP or a smartphone-guided breathing application. Trait anxiety was measured at screening, pretest, posttest, and 3 months later. State anxiety was assessed before and after each session. In addition, engagement and appraisals were assessed in each session. Participants in both conditions showed a significant decrease in trait anxiety symptoms from pre- to posttest and this decrease remained stable at follow-up. Furthermore, all participants decreased in state anxiety from pre- to postsession, except for DEEP sessions that included exposure. DEEP users increased in self-efficacy and observed resources to cope throughout the training. In addition, DEEP users felt more engaged in initial sessions than those who used the control application, but their engagement decreased toward the final session. In contrast, participants in the control group showed no change in appraisals nor engagement. Taken together, results demonstrate the potential of digital interventions such as biofeedback games and guided relaxation applications as anxiety regulation tools and show that self-efficacy and threat-challenge appraisals in particular are potential mechanisms of change in biofeedback interventions.

Keywords: anxiety, virtual reality, biofeedback, video games, randomized controlled trial

Supplemental materials: https://doi.org/10.1037/tmb0000028.supp

Acknowledgments: We thank Owen Harris, Niki Smit (Monobanda), Andy Mooney (Paradoxical Recordings), and Bryan Duggan (DIT School of Computing) for sharing the prototype of DEEP that was tested in the present study. In addition, we thank our colleagues at the GEMH lab for their support. Furthermore, we thank Laurie Buitenhuis for her pilot work on the exposure environment. Finally, we gratefully acknowledge the assistance of Marloes Polman, Bas Kooijman, Deniz Çetin, Fernando Jibaja Du Bois, Jamy ten Tije, Kim Verwaaijen, Kim Badenbroek, Rüya Akdag, Flavia Spagnuolo, Ilaria Corti, Ilze Thoonen, and Maaike van Heumen during data collection.

Data Availability Statement: The methods and procedure are made publicly available and the data-set can be accessed upon request via the DANS EASY repository (doi: https://doi.org/10.17026/dans-zbd-bhbe).

Ethics: Our study was approved by the Ethics Committee Social Science (id: ECSW2016-2208-412) of the Radboud University and complies with the APA ethical standards.

Preregistration: The study design and research questions of the present study were preregistered at the Netherlands Trial Register (id: NL6635).

Conflicts of Interest: The authors tested the efficacy of the game DEEP and there was no financial incentive during this period of research. However, it is not certain that this will remain the case in future collaborations with the DEEP development team after the public release of the game.

Sources of Financial Support: This work was supported by the Netherlands Organisation of Scientific Research (NWO) Creative Industry grant [314-99-115], the Creative Industries Fund NL (Stimuleringsfonds), and the Netherlands Organisation for Health and Research Development (ZonMw) [912-15-207].

Prior Dissemination of Data: Preliminary results were presented as part of the International Convention of Psychological Science in Paris on March 9, 2019. These same preliminary results were presented at the Health by Technology conference in Groningen, the Netherlands on May 17, 2019 and in a webinar for Immerse UK on May 15, 2020.

Open Science Disclosures:
Disclaimer: Interactive content is included in the online version of this article.

Correspondence concerning this article should be addressed to Joanneke Weerdmeester, Developmental Psychopathology, Behavioural Science Institute, Radboud University, Thomas van Aquinostraat 4, 6525 GD, Nijmegen, Gelderland, The Netherlands. Email: joanneke.weerdmeester@ru.nl


Anxiety is commonly experienced, whether it is the feeling of our heart beating in our chest right before an important job interview or ruminating about an upcoming deadline. While anxiety is an adaptive emotional experience, it can be debilitating when it increases in intensity and becomes difficult to regulate to the point where it develops into an anxiety disorder (American Psychiatric Association [APA], 2013). Together with depression, anxiety disorders are the most frequently diagnosed mental health problems (Costello et al., 2003; Kessler et al., 2012; Wittchen et al., 2011; World Health Organization, 2019) with recent numbers pointing to approximately 260 million individuals suffering worldwide and 37 million in Europe (Global Burden of Disease Study [GBD], 2017; World Health Organization, 2019).

Fortunately, evidence-based anxiety treatments are available, with Cognitive Behavioural Therapy (CBT) recognized as the most effective among them (Bandelow et al., 2017). CBT mainly focuses on changing maladaptive cognitive and behavioral patterns (Beck, 2011). However, somatic complaints and issues with physiological regulation are also common among people suffering from anxiety (APA, 2013; Bekhuis et al., 2015; Crawley et al., 2014; Ginsburg et al., 2006; Kujanpää et al., 2017). These physiological difficulties can be addressed by biofeedback training and may fill a gap that most CBT approaches miss.

Biofeedback training monitors physiological changes, such as changes in heart rate, breathing, or brain activity (Schwartz & Andrasik, 2017). These changes are represented visually (e.g., by using moving graphs or simple animations) to increase interoceptive awareness (i.e., awareness of internal signals), a core element of emotion regulation (Gross, 2002; Kever et al., 2015). In addition, conditioning and reinforcement techniques are used to support effective physiological regulation, for instance by sounding a pleasant tone when a person’s heart rate has reached an optimal level (Gilbert & Moss, 2003; Hammond, 2005, 2007; Lehrer et al., 2000; Schwartz & Andrasik, 2017).

While biofeedback is an effective treatment for stress and anxiety (Goessl et al., 2017; Richardson & Rothstein, 2008; Schoenberg & David, 2014; Tolin et al., 2020), there are some challenges. Biofeedback training is usually provided in a neutral or relaxed setting, which may limit the transfer of self-regulation skills to stressful situations where they are most needed (Parnandi et al., 2014). Also, biofeedback training mostly involves hundreds of trials, which is a huge time commitment. Furthermore, these trials are often repetitive and rely on unintuitive visual feedback (e.g., raw physiological signals), which can make it hard to stay engaged, especially for youth (Dedeepya et al., 2014; Parnandi et al., 2014).

In recent years, promising advancements in conventional biofeedback interventions have been made, including wearable biosensors (Piwek et al., 2016; Yetisen et al., 2018) and game-based paradigms, which promise to make biofeedback more accessible, engaging, and potentially more effective (Weerdmeester, Van Rooij, Engels, et al., 2020). The present study focuses on DEEP (Explore Deep Ltd, 2015), a video game that combines breath-based biofeedback with virtual reality (VR) to provide players with an immersive, relaxing underwater world with interactive visuals to engage participants long enough to learn regulation skills (van Rooij et al., 2016). Players wear a belt that measures the expansion of their diaphragm. Players’ inhalations are translated to up- and forward forces and their exhale triggers a speed boost. This breath-based mechanic incentivizes the use of diaphragmatic breathing, an effective anxiety regulation technique (Chen et al., 2017; Paul et al., 2007). DEEP also includes visualizations that mirror players’ breathing such as an expanding and contracting circle in the players’ field of view and plants that change in illumination (i.e., getting brighter with the inhale and dimmer with the exhale). Finally, DEEP includes a high-stress training environment based on exposure therapy, an evidence-based technique where people are exposed to stressful stimuli while training relaxation skills (Parker et al., 2018).

Trailer showing the DEEP prototype that was tested in the current study

The primary aim of our study was to assess DEEP’s efficacy as a game-based biofeedback intervention for anxiety. We compared DEEP with Paced Breathing, a phone-based guided breathing application (Trex LLC, 2015). Guided relaxation and breathing applications are increasingly used to manage stress and anxiety (Blackburn & Goetter, 2020; Plaza et al., 2013; Swan, 2012). However, we posit that the effectiveness of guided breathing exercises can be further increased by using a VR biofeedback game such as DEEP for three reasons. First, the operant learning principles that are at the core of biofeedback interventions (i.e., direct feedback and reinforcement), facilitate the internalisation of self-regulation techniques (Gilbert & Moss, 2003; Hammond, 2005; Schwartz & Andrasik, 2017). Second, providing biofeedback training in a video game context creates a more immersive and engaging training environment where self-regulation can be repeatedly practiced and consequently mastered (Granic et al., 2014). Third, providing practice in a stressful VR environment increases the likelihood that the learned self-regulation skills will transfer to real-life stressful situations (Bouchard et al., 2012; Driskell & Johnston, 1998; Maples-Keller et al., 2017; Zafar et al., 2018).

Most biofeedback intervention studies solely focus on measuring outcomes. As a result, little is known about possible mechanisms that may contribute to these outcomes (Tolin et al., 2020; Wheat & Larkin, 2010), even though identifying mechanisms of change is crucial to strengthen and personalize interventions (Kazdin, 2014). In a recent review, we identified techniques (e.g., feedback and reinforcement) and processes (e.g., changes in interoceptive awareness and physiological regulation) that have previously been linked to the efficacy of biofeedback training (Weerdmeester, Van Rooij, Engels, et al., 2020). We identified cognitive appraisals (i.e., how we interpret and evaluate certain situations or feelings) that may explain or contribute to the effectiveness of biofeedback interventions, especially those that are aimed at anxiety regulation including self-efficacy, locus of control, and threat-challenge appraisals.

Self-efficacy is the belief in one’s ability to complete certain tasks or deal with certain situations (Bandura, 1977). Low self-efficacy relates to high levels of anxiety and predicts the development of anxiety disorders (Mathews et al., 2016; Niditch & Varela, 2012; O’Neal & Cotten, 2016) and improvements in self-efficacy mediate anxiety treatment outcomes (Gallagher et al., 2013). Locus of control is another form of appraisal that has been related to the experience of anxiety. Self-efficacy and locus of control are closely related but distinct concepts of self-evaluation (Bandura, 2006). While self-efficacy is a judgment of capability, locus of control is concerned with outcome contingencies, specifically, the belief that outcomes of events are under one’s own control (internal locus) versus controlled by forces beyond one’s control (external locus) (Rotter, 1966). Typically, an internal compared to external locus of control relates to lower levels of anxiety and locus of control also mediates anxiety and depression treatment outcomes (Cloitre et al., 1992; Warmerdam et al., 2010; Weems & Silverman, 2006). Finally, the extent to which stressful situations are appraised as a threat versus a challenge is linked to anxiety levels. Reappraising physiological arousal as helpful (a challenge) rather than harmful (a threat) has been shown to lead to more adaptive responses to stressful situations (Jamieson et al., 2013).

DEEP may address all three of these appraisals. First, DEEP provides players with performance feedback by linking their breathing to movement in the game (i.e., slow and deep diaphragmatic breathing makes the player move forward, shallow breathing halts the player). Players can therefore monitor how successful they are at regulating their breathing. As the experience of mastery is at the core of self-efficacy (Bandura, 1994), receiving direct feedback on players’ success or failure to regulate breathing can influence and change players’ self-efficacy over time. Indeed, some evidence shows that (VR-based) biofeedback (Blum et al., 2019; Rokicki & Holroyd, 1997; Teufel et al., 2013) can increase self-efficacy. Second, DEEP is entirely controlled by the players’ breathing. Players can therefore observe that it is their own actions (i.e., their attempts to regulate their breathing) that directly influence the game environment (e.g., objects that illuminate in synch with their breathing) and their movement, as opposed to being controlled by external factors. Seeing and feeling that they themselves control their experience and progress may therefore influence their locus of control. Finally, repeated practice with regulating their breath and especially applying breathing techniques in DEEP’s exposure environment may facilitate a shift to viewing stressful situations as a positive challenge to overcome, rather than a threat. Thus, by providing continuous feedback on participants’ mastery and control of their breath and providing an immersive and embodied context where they can safely practice self-regulation in stressful environments, we posit that biofeedback video games like DEEP influence self-efficacy, locus of control and threat-challenge appraisals.

Design and Hypotheses

The primary aim of our study was to assess the efficacy of a game-based biofeedback intervention (DEEP) in reducing anxiety symptoms. Our secondary aim was to assess changes in engagement and cognitive appraisals including self-efficacy, locus of control, and threat-challenge appraisals and to explore how these factors relate to anxiety regulation. This was meant as a first step to identify whether these factors may serve as potential mechanisms of change in biofeedback interventions. Undergraduate students with elevated anxiety symptoms were randomly assigned to either four training sessions with DEEP or four sessions with the control application, a commercially available smartphone application (Paced Breathing; Trex LLC, 2015) that provides guided breathing exercises without the use of biofeedback, VR, or game design. The sessions were divided into two phases. Participants in the control group got the same training in both phases. DEEP participants trained in a relaxing environment in the first phase (sessions one and two) and in an exposure environment in the second phase (sessions three and four). This allowed us to separately assess the impact of introducing exposure. Trait anxiety was measured at screening (2 weeks before the first training session), pretest (directly before the first training session), posttest (directly after the last training session), and 3 months after the last training session. Furthermore, self-reported state anxiety was assessed before and after each training session and cognitive appraisals and engagement were assessed at the end of each session.

We had five hypotheses. First, based on the shared focus of both applications on deep diaphragmatic breathing, we expected a decrease in trait anxiety symptoms from pre- to posttest for all participants and we expected this decrease to be maintained at the 3-month follow-up. Furthermore, we expected participants in the DEEP condition to show more pronounced improvements than those in the control condition because of DEEP’s combination of biofeedback mechanics, VR, game design, and exposure. Second, in the first training phase we expected a decrease in pre- to post-training state anxiety and the magnitude of decrease was again expected to be stronger for DEEP participants. Third, in the second phase, we expected a decrease in pre- to post-training state anxiety in the control condition, whereas anxiety was expected to increase in the DEEP condition given the exposure environment. Fourth, we expected changes in cognitive appraisals and engagement to be more pronounced in DEEP compared to the control group, due to the feedback that DEEP players receive on mastery of diaphragmatic breathing, observations of self-made progress in the game, and the practice that players receive in a stressful exposure environment. In contrast, the control application does not provide feedback on the users’ mastery, does not change based on the users’ breathing, and only provides practice in a neutral setting. Furthermore, changes in engagement were particularly expected in the DEEP group as DEEP was explicitly designed to be an entertaining video game, whereas the control application was not. Finally, in both conditions and training phases we expected appraisal and engagement scores to be positively associated with decreases in pre- to post-training state anxiety, based on previously established links between these appraisals (e.g., higher self-efficacy) and improved anxiety treatment outcomes. Moreover, as engagement is vital to stay motivated and to benefit from the breathing techniques, we expected higher engagement to be related to stronger anxiety changes. As we expected the training elements of DEEP (i.e., biofeedback, VR, game design, and exposure) to more strongly influence cognitive appraisals and engagement compared to the control application, we expected the associations between these appraisals and pre–post training changes to be more pronounced in the DEEP condition.

Method

Participants

Undergraduate students (N = 993) from the Radboud University and HAN University of Applied Sciences were screened using the Dutch Depression Anxiety and Stress Scale (DASS-21; de Beurs et al., 2001). Only those participants with severity scores above the normal cut-off for anxiety (≥8) and/or stress (≥15) were selected for participation. As anxiety often coincides with depressive symptoms (Cameron, 2017), we did not treat the existence of elevated scores on depression as an exclusion criterion. Participants (N = 112) were randomized to the experimental or control condition (see Table 1 for descriptives and Figure 1 for a flow-diagram of the study). Participants were recruited through the university’s Participation System and received 4.5 credit points or 45€ gift certificates for their participation. Our study was preregistered at the Netherlands Trial Register (id: NL6635) and approved by the Ethics Committee Social Science (id: ECSW2016-2208-412).

Table 1

Baseline Differences in Participant Characteristics Between Conditions at Time of Screening

| Characteristic | Control (n = 55) | DEEP (n = 57) | t/χ² |
|---|---|---|---|
| Age | 20.84 (2.42) | 20.58 (2.51) | 0.552 |
| Trait anxiety^a | | | |
|   Screening | 48.04 (8.29) | 47.78 (9.42) | 0.153 |
|   Pretest | 47.86 (8.55) | 48.73 (10.67) | −0.462 |
| DASS scores^b | | | |
|   Depression | 5.76 (4.23) | 6.39 (4.19) | −0.782 |
|   Anxiety | 5.95 (3.31) | 5.54 (2.98) | 0.675 |
|   Stress | 8.13 (3.04) | 8.98 (3.27) | −1.436 |
| Gaming weekday | | | 4.743 |
|   0 hr | 5 (7%) | 6 (8%) | |
|   <1 hr | 20 (27%) | 26 (35%) | |
|   1–2 hr | 12 (16%) | 5 (7%) | |
|   2–3 hr | 1 (1%) | 0 (0%) | |
|   3–4 hr | 0 (0%) | 0 (0%) | |
|   >4 hr | 0 (0%) | 0 (0%) | |
| Gaming weekend | | | 6.412 |
|   0 hr | 1 (1%) | 3 (4%) | |
|   <1 hr | 15 (20%) | 22 (29%) | |
|   1–2 hr | 12 (16%) | 8 (11%) | |
|   2–3 hr | 5 (7%) | 3 (4%) | |
|   3–4 hr | 4 (5%) | 1 (1%) | |
|   >4 hr | 1 (1%) | 0 (0%) | |
| Breathing exercises | | | 4.786 |
|   Never | 27 (25%) | 29 (26%) | |
|   Monthly | 10 (9%) | 18 (16%) | |
|   Weekly | 15 (14%) | | |
|   Daily | 2 (2%) | 1 (1%) | |
| Gender identity | | | 0.000 |
|   Male | 5 (5%) | 6 (5%) | |
|   Female | 50 (45%) | 51 (46%) | |
|   Own description | 0 (0%) | 0 (0%) | |
| Mother tongue | | | 4.303 |
|   Dutch | 41 (37%) | 51 (46%) | |
|   German | 11 (10%) | 5 (4%) | |
|   Other | 3 (3%)^c | 1 (1%)^d | |
| Anxiety disorder | | | 2.056 |
|   Yes | 3 (4%) | 6 (8%) | |
|   No | 51 (46%) | 48 (43%) | |
| Type of disorder | | | 4.309 |
|   GAD | 1 (1%) | 2 (2%) | |
|   OCD | 0 (0%) | 1 (1%) | |
|   PD | 1 (1%) | 1 (1%) | |
|   SAD | 1 (1%) | 2 (2%) | |
|   SP | 0 (0%) | 0 (0%) | |
|   PTSD | 0 (0%) | 1 (1%) | |


Note. Values represent numbers and percentage of total sample or means and standard deviations. GAD = generalized anxiety disorder; OCD = obsessive-compulsive disorder; PD = panic disorder; SAD = social anxiety disorder; SP = specific phobia; PTSD = post-traumatic stress disorder.
a Based on the State-Trait Anxiety Inventory. b Raw sum scores, need to be multiplied by two for severity ratings. c Friesian, Bulgarian, Serbian. d Turkish.


Power Analysis

Our preregistered sample size (N = 200) was based on a power calculation (Faul et al., 2007) for an independent samples t-test with d = .40 (small to moderate effect; Cohen, 1992), α = 0.05, and power = .80. However, we ended up using linear mixed-effects analyses to better fit our design and hypotheses. We collected data from 112 participants (Figure 1), which allowed us to detect small to moderate effects in our mixed-effects analyses (smallest being d = .32) with 80% power (calculations based on Twisk, 2004). Our sample size at screening (N = 112), posttest (N = 97), and follow-up (N = 77) allowed us to detect medium to large effect sizes (smallest being d = 0.53) with t-tests (Champely, 2020).
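As a rough check on the preregistered sample size, the required number of participants per group for a two-sided independent-samples t-test can be approximated with the normal distribution. The sketch below is an illustration only and is not the procedure of the power software cited above; the function name `n_per_group` and the use of the normal approximation are assumptions. An exact calculation based on the t distribution gives a slightly larger n, which is in line with the preregistered total of about 200.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per group for a two-sided independent-samples t-test.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2.
    This slightly underestimates the exact t-based result.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


if __name__ == "__main__":
    n = n_per_group(0.40)
    print(f"about {n} per group, {2 * n} in total")
```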

Materials

Screening

The Dutch (de Beurs et al., 2001) DASS-21 (Lovibond & Lovibond, 2007) was used to assess symptoms of depression (“I felt downhearted and blue”), anxiety (“I felt I was close to panic”), and stress (“I found it difficult to relax”) that participants experienced over the last week. Each sub-scale had seven items which were answered on a 4-point scale (0 = never to 3 = almost always). In addition, participants were asked about their age, gender identity, native language, and whether they were diagnosed with an anxiety disorder. Participants also estimated the number of hours they spent playing video games on week- and weekend-days and how often they used breathing exercises.

Trait-State Anxiety

Trait and state-assessments of anxiety were measured using the Dutch State-Trait Anxiety Inventory (STAI-DY; van der Ploeg, 1985; van der Ploeg et al., 2000). Half of the items (20) measured state anxiety levels (e.g., “I feel jittery”) which were answered on a 4-point scale (1 = not at all to 4 = very much so) and the other half of the items (e.g., “I feel nervous and restless”) measured trait anxiety (1 = almost never to 4 = almost always).

Self-Efficacy

Participants indicated their self-efficacy with regard to the use of the applications on a 7-point scale (1 = do not agree to 7 = strongly agree) using the three items of the competence subscale (e.g., “I felt very capable and effective”) of the Player Experience of Need Satisfaction questionnaire (Johnson et al., 2018; Rigby & Ryan, 2007).

Locus of Control

Participants rated how much internal control they experienced while using the application (1 = not at all to 11 = extremely) with the four items of the self-agency subscale (e.g., “To what extent did you feel that you could influence what was happening in this situation?”) of the Appraisal Questionnaire (Ellsworth & Smith, 1988).

Threat Challenge

Participants rated four items about perceived task demands (e.g., “This task was demanding”) and four items for perceived coping resources (“I had the abilities to perform well”) on a 7-point scale (1 = completely disagree to 7 = completely agree). These items were selected based on previous research (Mendes et al., 2007) and changed to the past tense.

Application Enjoyment and Follow-up Questions

Participants rated their engagement using the seven items (e.g., “I thought this was a boring activity”) of the Intrinsic Motivation Inventory (Center for Self-determination Theory, 2019; McAuley et al., 1989) on a 7-point scale (1 = not at all true to 7 = very true). In the final session all participants assigned a grade to their overall experience with their application (1–10) and DEEP players rated the statement “I felt nauseous while using the application” on a 7-point scale (1 = not at all to 7 = very much so). At the 3-month follow-up participants rated the statements “I would like to use this application in the future” and “I would recommend this application to a friend” on a 4-point scale (1 = completely disagree to 4 = completely agree) and answered the statement “In the past 3 months I have received professional mental health support” with either “yes” or “no.”

Figure 1

Flow-Diagram of Study Depicting Number of Participants at Each Time-Point


Procedure

Screening and Randomization

Participants completed the 15-min online screening questionnaire and those who met the eligibility criteria were invited to participate in the training. Participants with severe scores on anxiety (>14), stress (>25), or depression (>20) were contacted and provided with information about resources for mental health support. Participants were randomly assigned to DEEP (experimental condition) or a phone-based guided breathing application (control condition). The training sessions commenced 2 weeks after screening. The full procedure is displayed in Figure 2.
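The screening rules above can be sketched as a small set of checks. This is a hedged illustration, not the authors' screening script: the function names are hypothetical, and it assumes that all cut-offs apply to DASS-21 severity scores, that is, raw subscale sums multiplied by two as described in the note to Table 1.

```python
def dass_severity(raw_sum: int) -> int:
    """Convert a DASS-21 raw subscale sum to a severity score (raw sum times two)."""
    return 2 * raw_sum


def is_eligible(anxiety_raw: int, stress_raw: int) -> bool:
    """Eligible when anxiety severity >= 8 and/or stress severity >= 15."""
    return dass_severity(anxiety_raw) >= 8 or dass_severity(stress_raw) >= 15


def needs_support_information(anxiety_raw: int, stress_raw: int,
                              depression_raw: int) -> bool:
    """Severe scores: anxiety > 14, stress > 25, or depression > 20."""
    return (
        dass_severity(anxiety_raw) > 14
        or dass_severity(stress_raw) > 25
        or dass_severity(depression_raw) > 20
    )


if __name__ == "__main__":
    # Hypothetical respondent: raw sums of 6 (anxiety), 8 (stress), 5 (depression).
    print(is_eligible(6, 8), needs_support_information(6, 8, 5))
```

Depression is deliberately absent from `is_eligible`, because elevated depression scores were not used as an exclusion criterion.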

Figure 2

Image Depicting the Study Procedure With All Time-Points and Measurements

Training Sessions

Trait anxiety was assessed before the start of the first session and directly after the final session. In each of the four training sessions participants used their respective application for 10 min. State anxiety was measured before and immediately after each training session. Cognitive appraisals and engagement were assessed at the end of each session. Finally, participants reported how much they liked their training after the final session. The first and final session lasted for 1 hr, and the second and third lasted for 30 min.

Control Condition

Participants in the control condition trained with the Paced Breathing (Trex LLC, 2015) application using a Samsung Galaxy Note 4 which provides a guided breathing exercise with visual and auditory instructions. A bar on the screen moves up and down with a corresponding tone which rises and falls in pitch indicating when to inhale and exhale. Participants were asked to breathe through their diaphragm and match their breathing to the instructions. The breathing pattern was set to six breaths per minute with an inhale/exhale ratio of 0.42 (3 s inhale, 7 s exhale) which was selected based on previous research (Van Diest et al., 2014).

Experimental Condition

Participants in the DEEP (Explore Deep Ltd, 2015) condition wore an HTC Vive VR headset and the DEEP controller belt which measured the expansion of their diaphragm. In DEEP, players influence their movement as up- and forward forces are applied with each inhale and a subtle speed boost is applied with each exhale. Visual feedback is provided by an expanding and contracting circle (i.e., getting bigger with the inhale and smaller with the exhale) in the players’ field of vision as well as plants that change in illumination according to the inhalation or exhalation.

Plants in the environment mirror the players’ diaphragmatic breathing, changing in size and/or illumination with each inhale and exhale as measured by the controller belt

An interactive 360 degree video which gives an impression of the DEEP environment.

Disclaimer: to create this video some elements (e.g., the breathing circle and some creatures) had to be left out and the video quality is lower than the actual VR experience

In the first training phase participants trained in a relaxing environment and in the second training phase they trained in an exposure environment. The design of the exposure environment was inspired by appraisal theory, which posits that anxiety is particularly experienced in uncertain or unpredictable situations and where people feel like they can exert little control (Roseman, 2001). DEEP’s general environment is spacious and calming. In contrast, in the exposure environment tense music plays in the background and players have to move through a dark, narrow, winding, and ominous cave system. At the start of the cave, players can still see much of what lies ahead as nearby plants illuminate with their breathing. However, these plants are gradually removed the further players move into the cave, leaving them in increasing darkness. Players can then only light their way with a dim flashlight that illuminates during deep and calm exhales (Figure 3).

Figure 3

Screenshots of the DEEP Relaxation Environment (Left) During Inhalation (Above) and Exhalation (Below) and the Exposure Environment (Right) During Inhalation (Above) and Exhalation (Below)

Video depicting the DEEP exposure environment as tested in the current study

Three-Month Follow-up

Three months after the final session, participants completed a 15-min online questionnaire which included a trait anxiety assessment and questions regarding their experience with the application. They were also asked if they received professional mental health support between the end of the final session and the follow-up so this could be controlled for in the analyses.

Data Analysis

Analyses were performed using R (R Core Team, 2019). Outliers were winsorized to the next value that was not an outlier (within 3 standard deviations) to retain statistical power and attenuate bias resulting from elimination (Ghosh & Vogt, 2012). Baseline differences were assessed with independent-samples t-tests and chi-square tests (see Table 1). The first four hypotheses were tested with linear mixed-effects (LME) models using the lmer function of the lme4 package (Bates et al., 2015) with restricted maximum likelihood estimation to account for non-normal distributions and missing data. P-values were determined with the lmerTest package (Kuznetsova et al., 2017). Effect sizes were calculated using the equations of Westfall et al. (2014). Post-hoc tests were performed using the emmeans and contrast function from the basic R package with a Tukey multiple comparison correction.
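The winsorizing rule described above can be sketched as follows. The analyses themselves were run in R, so this Python version is only an illustration under stated assumptions: the function name `winsorize_3sd` is hypothetical, the sample standard deviation is used, and the mean and standard deviation are computed on the full variable including the outliers.

```python
from statistics import mean, stdev


def winsorize_3sd(values, k=3.0):
    """Replace values beyond k SDs from the mean with the nearest non-outlying value."""
    m = mean(values)
    s = stdev(values)  # sample SD (assumption; requires at least two values)
    lower, upper = m - k * s, m + k * s
    inliers = [v for v in values if lower <= v <= upper]
    low_cap, high_cap = min(inliers), max(inliers)
    # Inliers already lie between the caps, so only outliers are changed.
    return [min(max(v, low_cap), high_cap) for v in values]


if __name__ == "__main__":
    # Hypothetical scores with one extreme value.
    scores = [10] * 10 + [12] * 9 + [100]
    print(winsorize_3sd(scores)[-1])
```

Unlike deleting the extreme case, this keeps every participant in the data set, which is the stated reason for choosing winsorizing.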

The LME model for hypothesis one included trait anxiety as dependent variable (DV) and a fixed and random (per participant) intercept and slope for the interaction between time (screening/pretest/posttest/follow-up) and condition (DEEP/Control). However, as this model did not converge, the final model only included a random intercept and slope for time. The interaction between time and professional help received between posttest and follow-up (yes/no) was added as a (fixed) control variable. For the second and third hypotheses, a separate LME model was tested for each session with state anxiety as DV, a fixed intercept and slope for the interaction between time (before training/after training) and condition, and a random intercept for time. The fourth hypothesis was tested with an LME model for each of the following DVs: self-efficacy, locus of control, threat-challenge ratio (resources score divided by demand score), engagement, perceived demands, and perceived resources. These models included a fixed intercept and slope for the interaction between time (session one/session two/session three/session four) and condition and a random (per participant) intercept and slope for time. Due to convergence failures the final models for locus of control and engagement only included a random intercept for time.
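The threat-challenge ratio used as a DV above is a simple quotient. A minimal sketch, assuming that the resources and demand scores are item means on the 1 to 7 scale (the text does not state whether means or sums were used; with equal numbers of items the ratio is the same either way). The function name is hypothetical.

```python
from statistics import mean


def threat_challenge_ratio(resource_items, demand_items):
    """Resources score divided by demand score (item means assumed)."""
    # Items are rated 1-7, so the demand score is never zero.
    return mean(resource_items) / mean(demand_items)


if __name__ == "__main__":
    # Hypothetical ratings for one participant in one session.
    print(threat_challenge_ratio([6, 6, 5, 7], [3, 3, 3, 3]))
```

A ratio above 1 means that perceived coping resources exceed perceived task demands.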

For the fifth hypothesis, engagement and appraisal scores in the first training phase (sessions one and two) were combined into one score; the same was done for scores in the second phase (sessions three and four). Furthermore, a difference score (post minus pre) for state anxiety was calculated for each session and then combined into one score per training phase. Correlations were then calculated between the anxiety difference scores, engagement, and appraisal scores for each training phase per condition. Scores were separately calculated per phase to assess how training in the relaxed environment versus the exposure environment of DEEP influenced these correlations. Finally, t-tests were performed with condition as independent variable for final evaluative grade, desire to use the app in the future, and application recommendation as DVs.
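The phase-level scores and correlations described above can be sketched in Python as follows. This is an illustration only, under two assumptions: the two session scores of a phase are combined by averaging (the text says only that they were "combined"), and the correlation is a Pearson correlation. All function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def phase_change(pre_a, post_a, pre_b, post_b):
    """Average post-minus-pre anxiety change over the two sessions of a phase."""
    return mean([post_a - pre_a, post_b - pre_b])


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: (pre1, post1, pre2, post2) state anxiety per participant.
sessions = [(50, 40, 45, 41), (48, 44, 46, 44), (55, 38, 50, 42), (42, 41, 40, 40)]
self_efficacy = [6.0, 4.5, 7.5, 3.0]  # hypothetical phase-level appraisal scores

changes = [phase_change(*s) for s in sessions]
r = pearson_r(changes, self_efficacy)
print(changes, round(r, 2))
```

Because the difference score is post minus pre, a negative correlation means that higher appraisal scores go together with larger decreases in anxiety.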

Results

No significant baseline differences were found between conditions (see Table 1). Figure 1 shows the attrition rates.

Hypothesis 1: Changes in Trait Anxiety Symptoms

In line with our hypothesis, there was a significant change in trait anxiety symptoms over time, F(1, 74) = 12.08, p < .001, b = −3.09, se = 0.89, d = .32. Anxiety symptoms decreased from pre- to posttest, p = .019, and marginally from screening to posttest, p = .054. Furthermore, trait anxiety scores at the 3-month follow-up did not differ from those at posttest, p = .677, but were lower compared to the screening, p = .003, and pretest, p = .001. Contrary to our hypothesis, changes in trait anxiety were not more pronounced in the DEEP group compared to the control group, F(1,74) = 1.13, p = .292, b = 0.61, se = 0.58, d = .06 (Figure 4). Furthermore, there was a significant interaction between time and help received, F(1, 74) = 5.08, p = .027, b = 1.42, se = 0.63, d = .15, which revealed that participants who received professional help (n = 23) between posttest and follow-up had higher anxiety scores on all time-points, compared to those who did not receive help. Furthermore, participants that did not receive help decreased in anxiety over the course of the training, whereas those who received help did not.

Figure 4

Changes in Trait Anxiety Symptoms Over Time

Hypothesis 2: Pre–Post State Anxiety in the First Training Phase

In line with our hypothesis, a significant time effect was revealed for session one, F(1, 105) = 67.82, p < .001, b = −12.90, se = 1.57, d = 1.30, as well as for session two, F(1, 100) = 12.75, p < .001, b = −3.35, se = 0.94, d = .41, showing a decrease in state anxiety from before to after training (Figure 5). Contrary to our hypothesis, no interactions with condition were found for session one, p = .468, d = .16, nor session two, p = .273, d = .18, meaning that the decrease in state anxiety was equivalent in both conditions.

Figure 5

Changes in State Anxiety From Before to After Training in Each Session Per Condition

Hypothesis 3: Pre–Post State Anxiety in the Second Training Phase

A significant time and condition interaction was found for session three, F(1, 96) = 15.51, p < .001, b = 5.38, se = 1.37, d = .60, showing that participants in the control condition significantly decreased in state anxiety from before to after training, p < .001, whereas participants in the DEEP condition did not change, p = .615. In session four, a significant time-effect was revealed, F(1, 94) = 16.93, p = .049, b = −3.74, se = 0.91, d = .38, showing a decrease in state anxiety from before to after training. No condition differences were found, p = .210, d = .16 (Figure 5). These results partly confirm our hypothesis as participants in the control group decreased in state anxiety but no anxiety increase was found in the DEEP group. Instead, participants’ anxiety remained stable in the third session and decreased in the final session.

Hypothesis 4: Changes in Appraisals and Engagement

Self-Efficacy

A significant time and condition interaction was found, F(1, 99) = 13.98, p < .001, b = 1.29, se = 0.35, d = .24. Participants in the DEEP condition reported lower self-efficacy in session one compared to those in the control condition, p = .009. Furthermore, DEEP players showed an increase in self-efficacy over time, reporting higher self-efficacy in the second and last session compared to the first session, p < .001. In the control condition no differences were found between sessions (all p > .5) (Figure 6). These results are in line with our expectation that changes in self-efficacy would be more pronounced in the DEEP group.

Figure 6

Trajectories of Change in Cognitive Appraisals and Engagement Per Condition

Locus of Control

A significant time and condition interaction was found, F(1, 299) = 9.08, p = .003, b = −1.60, se = 0.53, d = .20. Participants in the DEEP condition felt less internal control in the third session compared to the control condition, p = .005. Furthermore, in the DEEP condition, participants felt less internal control in session three compared to session one, p = .042, and session two, p < .001. In the control condition no changes in internal control were found (all p > .3) (Figure 6). These results are in line with our expectation that changes in locus of control would be more pronounced in the DEEP group.

Threat Challenge

The threat-challenge ratios were lower in the DEEP condition compared to the control condition, F(1, 104) = 5.75, p = .02, b = −0.93, se = 0.39, d = .48. This group difference seemed particularly driven by perceived coping resources and the changes therein, F(1, 94) = 5.59, p = .02, b = 0.16, se = 0.07, d = .13. Perceived resources in session one were lower in the DEEP condition compared to the control condition, p = .01. Furthermore, DEEP players’ perceived resources increased between the first session and the third, p = .008, and last session, p = .003. In contrast, no differences were found across all the sessions for the control group (all p > .4). Results are consistent with our expectations that changes in threat-challenge appraisals would be more pronounced in the DEEP, compared to the control group.

Engagement

A significant time and condition interaction was found, F(1, 294) = 10.42, p = .001, b = −1.22, se = 0.38, d = .13, revealing that participants in the DEEP condition were more engaged in the first session compared to those in the control condition, p = .006. However, DEEP players were less engaged in the last session compared to the first session, p = .002. In contrast, in the control condition no engagement changes were found between sessions (all p > .5), which is in line with our expectation that engagement changes would be more pronounced in the DEEP group (Figure 6).

Hypothesis 5: Relationship Between Anxiety Regulation, Appraisals, and Engagement

Self-Efficacy

In both conditions the higher participants’ self-efficacy was in the first training phase, the stronger their decrease in anxiety was from before to after training. In the second phase, this same relationship was found, but only in the control condition (Figure 7).

Locus of Control

In both training phases and both conditions, the more internal control participants felt the stronger their decrease in anxiety was from before to after training (see Figure 7).

Threat Challenge

In both conditions and both training phases, the higher participants’ threat-challenge ratios were, the stronger their decrease was in anxiety from before to after training (Figure 7).

Engagement

In both training phases, the more engaged participants were, the more they decreased in anxiety from before to after training. However, this relationship was only found in the control group (see Figure 7).

Figure 7

Correlograms of the First (Sessions One and Two) and Second Training Phase (Sessions Three and Four) for the DEEP and Control Condition Displaying the Relationships Between Pre–Post Training Difference Scores of Anxiety (ANX) and Post-Training Engagement (ENG), Locus of Control (LOC), Self-Efficacy (SE), and Threat-Challenge Ratios (TC)
Note. Numbers represent the correlation values. Larger dots represent stronger correlations. Non-significant correlations (p > .05) are crossed out.

The fact that higher scores on all appraisals and engagement were related to stronger pre–post decreases in anxiety was in line with our hypothesis. However, our expectation that all these relationships would be more pronounced in the DEEP condition was not supported.

Application Enjoyment

Participants who used DEEP gave a similar evaluative grade as those who used the control application (7 out of 10) and both were equally motivated to use their application in the future. However, participants in the DEEP condition were marginally more likely to recommend the application to others. Reported nausea for DEEP players was low (2 out of 7; Table 2).

Table 2

Differences in Application Enjoyment and Follow-Up Questions Between Conditions


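| Variables | Control M | Control SD | DEEP M | DEEP SD | t | df | p | gᵃ |
|---|---|---|---|---|---|---|---|---|
| Final grade | 7.23 | 1.42 | 6.98 | 1.35 | 0.90 | 93 | .370 | 0.25 |
| Nausea | | | 2.18 | 1.68 | | | | |
| Recommended | 2.32 | 0.88 | 2.69 | 0.83 | −1.87 | 73 | .066 | 0.43 |
| Future use | 2.38 | 0.89 | 2.55 | 0.81 | −0.88 | 72 | .383 | 0.20 |

| Variables | Control n | Control % | DEEP n | DEEP % | χ² | df | p | g |
|---|---|---|---|---|---|---|---|---|
| Help received | | | | | 0.60 | 72 | .440 | 0.24 |
| No | 28 | 76 | 26 | 65 | | | | |
| Yes | 9 | 24 | 14 | 35 | | | | |

ᵃ Effect size (Hedges’ g).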
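For readers unfamiliar with the effect size in Table 2, a common way to compute Hedges’ g for two independent groups is sketched below in Python. The article does not state which variant of the formula was used, and the table does not report group sizes, so the example uses hypothetical numbers rather than the study’s data.

```python
from math import sqrt


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with the usual small-sample correction."""
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd              # Cohen's d with a pooled SD
    correction = 1 - 3 / (4 * (n1 + n2) - 9)  # approximate bias correction
    return d * correction


# Hypothetical groups: means 10 vs. 8, SD 2 in both, 20 participants each.
print(round(hedges_g(10, 2, 20, 8, 2, 20), 3))
```

The correction factor shrinks Cohen's d slightly, and it matters most when the groups are small.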

Discussion

The primary aim of this study was to assess the efficacy of a VR biofeedback video game (DEEP) in reducing anxiety symptoms. Our second aim was to assess changes in engagement and cognitive appraisals including self-efficacy, locus of control, and threat-challenge appraisals and explore how these factors relate to anxiety regulation.

As expected, all participants showed a decrease in trait anxiety symptoms from before to after the training and this decrease was maintained at 3-month follow-up. Contrary to our expectation, DEEP did not outperform the control application. These results seem to indicate that the additional elements in DEEP (i.e., biofeedback, VR, game design) did not result in improvements above and beyond the guided breathing exercise provided by the control application. Instead, the fact that both groups practiced diaphragmatic breathing may have already been enough to help all participants to better manage their anxiety as this type of breathing has been shown to be an effective anxiety regulation technique (Chen et al., 2017; Paul et al., 2007). Our results are in line with previous studies where gamified biofeedback applications were found to be equally effective in promoting relaxation as guided relaxation applications (Shih et al., 2019; Zafar et al., 2018). However, Blum et al. (2019) found that a paced breathing exercise combined with VR-based biofeedback was more enjoyable and led to a greater increase in relaxation self-efficacy compared to a standard paced breathing exercise. These findings suggest that combining game-based biofeedback with paced breathing instructions may outperform either method alone.

Anxiety Improvements in the First Training Phase

In line with our expectations, participants decreased in state anxiety from before to after using the applications in the first two sessions, showing that all participants were able to effectively regulate their anxiety during this phase of training. Yet, despite the added biofeedback, VR, exposure, and game mechanics, DEEP players did not show a stronger decrease in anxiety. While biofeedback can indeed facilitate anxiety regulation (Goessl et al., 2017; Richardson & Rothstein, 2008; Tolin et al., 2020) and while VR and game design may enhance immersion, engagement and skill retention (Bouchard et al., 2012; Granic et al., 2014), the inclusion of these additional aspects also made the training environment of DEEP more complex in comparison to the control application. To show improvements above and beyond the active control condition that also focused on breathing, DEEP players may have needed more time to internalize and benefit from these added training elements. Indeed, the control group seemed to have benefited equally in the present research context: They were encouraged, given attention, and explicitly taught effective breathing techniques to manage their anxiety. It is likely that outside this research context, this rudimentary protocol would not have resulted in the same adherence, given that many digital mental health interventions still suffer from low engagement and fidelity (Scholten & Granic, 2019). Future research will need to establish whether simple breathing apps outside controlled laboratory settings actually yield the same benefits as a VR biofeedback game like DEEP.

Anxiety Improvements in the Second Training Phase: Testing the Impact of Exposure

With no exposure training, participants in the control condition showed the expected drop in anxiety in both training phases. In contrast, the second phase of the DEEP training was designed to test the impact of exposure, which we expected to increase players’ anxiety, especially in the third session. Instead, DEEP players’ anxiety levels in the third session remained stable and decreased in the fourth session. This could indicate that the exposure environment was not intense enough to increase participants’ anxiety. Alternatively, the fact that anxiety remained stable could still indicate that participants were affected by the environment. Recall that DEEP players showed a clear decrease in anxiety in the first training phase which took place in a relaxed setting; participants’ stable levels of anxiety in this first exposure session could indicate that they had to exert more effort to downregulate their anxiety compared to the first training phase. Furthermore, participants spent five of the 10-min session in the exposure environment. It is possible that participants initially increased in anxiety but were then able to partially down-regulate their anxiety toward the end of the session, but our pre–post measures were not fine-grained enough to pick up these within-session changes. Measures of physiological reactivity and regulation across all biofeedback training sessions may provide a better way to pinpoint processes of change underlying exposure training. Interestingly, anxiety decreased in the final exposure session which may indicate that after the first exposure session, players learned to successfully regulate their anxiety in a threatening environment as was the intended purpose. Alternatively, participants may have been less affected when they returned to the exposure area for the second time as the familiarity may have diminished the uncertainty and unpredictability that the area was designed to evoke. We therefore recommend including a variety of exposure areas in future iterations of DEEP or slightly altering the environment each time the player enters it, to make it less predictable.

Changes in Appraisals

Self-Efficacy

Participants in the DEEP condition reported lower self-efficacy in the first session compared with those in the control condition. However, DEEP players showed an increase in self-efficacy over the course of the training, whereas those who used the control application did not. Furthermore, higher self-efficacy was overall found to be related to a stronger pre–post decrease in anxiety, although for DEEP players this was particularly pronounced in the first phase of the training. The increase in self-efficacy in DEEP players is in line with previous literature showing the same improvements as a result of biofeedback training (Rokicki & Holroyd, 1997; Teufel et al., 2013). These results are also in line with Blum et al. (2019) who found that a VR biofeedback app resulted in increases in self-efficacy. As self-efficacy is often derived from mastery experiences (i.e., failures and successes), we had expected changes in self-efficacy to be more pronounced in DEEP players as they were able to judge their performance, whereas those who used the control application could not. Specifically, DEEP provides clear feedback about players’ mastery of diaphragmatic breathing, as breathing in this manner is the only way players can progress in the game. In fact, DEEP’s biofeedback mechanics may also explain why players initially reported lower self-efficacy in comparison to the control group as DEEP players immediately notice when they do not breathe well, whereas users of the control app do not receive any feedback on their breathing. The fact that a lack of mastery was only observable by participants in the DEEP group combined with the fact that DEEP’s training environment was more complex may therefore explain the initial lower self-efficacy among DEEP players.
These different experiences in terms of self-efficacy between DEEP and the control group are also important because higher self-efficacy has been consistently linked to lower anxiety and positive treatment outcomes (Gallagher et al., 2013; Mathews et al., 2016; Niditch & Varela, 2012). Thus, DEEP players’ increase in self-efficacy over the course of the training could indicate that self-efficacy may be a potential mechanism of change in DEEP and biofeedback interventions more generally.

Locus of Control

DEEP players decreased in internal control from the first to the second phase of the training, whereas those who used the control application did not change their locus of control. While changes in locus of control seemed more pronounced in DEEP players as was expected, it is likely that this change resulted from the introduction of exposure in the second training phase. Specifically, the exposure environment was designed to increase players’ anxiety and increased anxiety has been consistently linked to lower internal control (Cloitre et al., 1992). Given this change in locus of control with the introduction of exposure, we suggest a further investigation of locus of control fluctuations in biofeedback interventions, particularly when players are provided with practice in stressful environments. Our results also revealed that the more internal control participants felt, the stronger their decrease in anxiety was, which is in line with previous literature linking internal locus of control to lower anxiety and positive treatment outcomes (Cloitre et al., 1992; Warmerdam et al., 2010; Weems & Silverman, 2006). This relationship between anxiety and control was found in both conditions, an unexpected finding. Thus, our data were not able to clearly support locus of control as a potential mechanism of change in biofeedback interventions. Instead, this feeling of control may be important to target in the design of (digital) anxiety regulation tools in general.

Threat-Challenge Appraisals

In both conditions, participants seemed to appraise the training as challenging rather than threatening, meaning that they felt they had enough resources to cope with the demands of each app. Furthermore, these challenge appraisals were related to stronger decreases in anxiety, which is in line with previous literature relating challenge appraisals to more adaptive stress responses (Jamieson et al., 2013). These findings make sense, given that both apps were meant to be relaxing. However, we did observe some group differences. Specifically, in the first session DEEP players reported lower coping resources compared to those who used the control app. It may be that DEEP is a more complex training environment and therefore was viewed as more demanding compared to the control application. Furthermore, DEEP players perceived their coping resources to have increased with each session, whereas these appraisals remained stable in the control condition. Because DEEP players can track their progress and because DEEP provides them with training in both a relaxing and a stressful exposure environment, it is likely that throughout the training participants re-evaluated whether they had enough resources to cope with the changing demands. In contrast, the control application remained the same in its content and instructions so there was little reason for participants to reappraise their experience throughout the training. It would be particularly interesting in future research studies to explore how perceived demands and resources would change with more extensive practice in stressful environments and how this would affect appraisals of stressful situations encountered in daily life.

Engagement

Participants in the DEEP group were more engaged in the first phase of the training compared to those in the control condition. However, DEEP players became less engaged near the final session, whereas engagement in the control condition remained stable. The fact that DEEP players were initially more engaged is likely because DEEP, unlike the control application, was purposefully designed to be engaging and immersive. This could in fact also explain participants’ later decrease in engagement as they may have formed higher expectations of DEEP (e.g., more levels or areas) which may not have been met by the current prototype. Our results suggest that game-based biofeedback interventions may be more engaging than more conventional relaxation applications, but need to be optimized to facilitate long-term engagement. Finally, as we could not determine a clear relationship between engagement and anxiety regulation, we recommend that this be further examined in future biofeedback research.

Strengths and Limitations

There are several strengths to the present study that can be highlighted. First, we used a Randomized Controlled Trial (RCT) design with an active control condition, the gold standard in testing the efficacy of interventions. Second, this design allowed us to compare DEEP’s efficacy to a widely used guided breathing application. Third, compared to previous pilot studies (van Rooij et al., 2016; Weerdmeester et al., 2017), we recruited individuals with elevated anxiety symptoms to assess DEEP’s efficacy as a tool for highly anxious individuals. Furthermore, in contrast to the majority of biofeedback studies (Wheat & Larkin, 2010) and intervention studies in general (Kazdin, 2014), we took a first step at exploring several cognitive appraisal processes that may function as mechanisms of change in biofeedback interventions. Finally, in contrast to the majority of digital intervention studies, we included a follow-up assessment several months after the training ended (Hollis et al., 2017).

There were some limitations to our study. First, while the exposure environment in DEEP gave us insight into the effect of introducing exposure in biofeedback training, this made it difficult to directly compare the change trajectories of the two conditions. For future studies we recommend using a dismantling study design to examine the benefit of exposure. In this type of design participants are randomly assigned to receive either all or only some components of a treatment to identify how much a specific component adds to the magnitude of change (Levin et al., 2012; Papa & Follete, 2015). Second, as we did not include a passive control group, it is unclear how the patterns of change would have compared to a group that received no intervention at all. Third, while the decrease in day-to-day anxiety symptoms over the course of the training was statistically significant, participants’ level of symptoms was still above the moderate range (Kayikcioglu et al., 2017; Spielberger et al., 1983) after the training. Thus, while it is promising to see changes in something as relatively stable as trait anxiety (Julian, 2011), we expect that more intensive training is needed to achieve more ecologically significant improvements. Fourth, there was a small number of participants who sought additional treatment after the final session and who seemed to benefit less from the training. Thus, it may be important for future research to explore whether biofeedback training could augment more conventional therapies or provide a bridge for those seeking additional help. Fifth, our sample size was sufficient to reliably detect small to medium effect sizes with linear mixed models and medium to large effect sizes with t-tests. Our sample was therefore likely underpowered for the small effect sizes which were found for the group comparisons in application enjoyment as well as changes in (trait) anxiety, which must therefore be interpreted with caution. 
Sixth, the present study assessed participants’ appraisals such as their perceived self-efficacy in relation to their experience with the applications. However, as constructs like self-efficacy are domain-specific (Pajares & Schunk, 2005) we cannot be certain whether participants’ self-efficacy regarding their general ability to regulate their breathing or anxiety was influenced as well. Seventh, as the present study only included self-report measures, it is possible that while participants felt less anxious, their physiological arousal remained unchanged. Similarly, while participants experienced an increase in self-efficacy, their actual competence in regulating their breathing may not have increased. Alternatively, while we found no group differences in self-reported anxiety, there may still have been differences in physiological arousal. We recommend that future studies include self-report measures as well as physiological measures (e.g., heart rate variability and breathing) to gain further insight into the effectiveness of breathing-based biofeedback games. In addition, while we were able to show how cognitive appraisals changed over the course of the training, we were not yet able to validate these appraisals as mechanisms of change as no causal linkages were determined. An important next step for future studies is therefore to assess the extent to which appraisal changes mediate the change in anxiety regulation in biofeedback interventions. Finally, our sample was rather homogenous, consisting primarily of female-identifying undergraduates who spent relatively few hours playing video games and had little experience with relaxation exercises. Therefore caution should be taken with generalizing to diverse populations. 
Anxiety is more prevalent in women (Mclean & Anderson, 2009; Mclean et al., 2011; Seedat et al., 2009; Wenjuan et al., 2020) and cognitive and affective symptoms are often more dominant, resulting in higher scores on questionnaires that primarily focus on these symptoms compared to somatic symptoms (Mclean et al., 2011). If our sample had included more men, they may have had lower anxiety scores. Men may have also benefited from the training to a different degree as the applications in our study focus on regulating physiology, rather than feelings or cognitions. Furthermore, if our sample had a broader distribution of gaming experience and experience with breathing exercises, we could have examined whether this difference in experience would have influenced the effectiveness of the training. For future research on game-based biofeedback interventions, we suggest studying a sample with more variability in gender as well as gaming and breathing exercise experience.

Conclusion and Implications

The present study showed that highly anxious students who trained with either DEEP, a biofeedback video game, or a phone-based guided breathing application decreased in their trait anxiety symptoms from before to after the training, a decrease which remained stable until 3 months after the training concluded. Furthermore, participants in both groups showed decreases in feelings of anxiety from before to after they trained with their respective applications. These results highlight the potential for digital applications to teach anxiety regulation skills. In addition, we showed that playing DEEP in particular elicited changes in self-efficacy, locus of control, and threat-challenge appraisals over the course of the training and found that these appraisals were related to changes in anxiety. These results show the importance of targeting and assessing appraisals in biofeedback interventions. An important next step for future research is to determine the extent to which changes in cognitive appraisals mediate outcomes in biofeedback training. This information together with our current findings and suggestions can be used to optimize engagement, facilitate self-regulation, and promote healthy appraisal processes in a way that helps individuals to manage their anxiety more effectively.

Supplemental Materials

https://doi.org/10.1037/tmb0000028.supp

