Volume 1, Issue 2. DOI: 10.1037/tmb0000021
As we approach a time in which social robots will be used in home and healthcare settings, there is a critical need to research robot behaviors that can increase user acceptance and comfort. The use of humor by physicians during patient interactions is associated with a number of positive patient health outcomes. However, no research to date has examined the effect of humor on user outcomes when used by a healthcare robot. This study examined the use of humor by a healthcare robot in a scripted interaction in a simulated medical setting. An experiment was conducted with 91 healthy participants (73 female, mean age 25 years). Participants were randomly allocated to interact with either a humorous or neutral robot in a flu vaccination scenario. Perceptions of the robot were assessed using the Godspeed questionnaire, an empathy questionnaire, and Asch personality scale, at two time points (before and after the interaction). Participant laughing was observed during the interaction. Repeated measures between group ANOVA showed that robot use of humor resulted in significantly greater perceptions of the robot’s likeability and safety. The humorous robot was also rated as having significantly more empathy, and a more sociable personality. Participants in the humorous condition were also more likely to rate the robot’s personality as happy, talkative, and frivolous, compared to participants in the neutral condition. Significantly more participants laughed during the interaction with the humorous robot, than with the neutral robot. Together, these findings suggest that the use of humor by a healthcare robot may increase positive user evaluations. This research has implications for both theory and clinical applications.
Keywords: healthcare robot, human–robot interaction, humor, user perceptions, user behaviors
Acknowledgments and Disclosure:
This research was supported in part by a research grant from the Technology Innovation Program funded by the Ministry of Trade, Industry & Energy (MI, Korea). We have no conflicts of interest to disclose.
Data availability statement:
De-identified data that support the findings of this study are available on reasonable request from the corresponding author (EB). The data are not publicly available due to information that could compromise the privacy of research participants.
Interactive content is included in the online version of this article.
Correspondence concerning this article should be addressed to Elizabeth Broadbent, Department of Psychological Medicine, Faculty of Medical and Health Sciences, University of Auckland, Private Bag 92019, Auckland, 1023, New Zealand. Email: email@example.com
The use of social robots in home and healthcare environments is nearing reality (Broadbent et al., 2018a; Fiorini et al., 2017; Wilson et al., 2019; Yamada et al., 2019; Zsiga et al., 2013). Although much research focuses on the technical aspects of healthcare robots, it is imperative that robots are also considered from a behavioral aspect, in order to ensure both functionality and acceptability (Walters et al., 2008).
The implementation of robots in healthcare settings constitutes a special social context due to patient vulnerability, the discussion of potentially sensitive information, and an underlying power imbalance between patients and healthcare providers (Ha et al., 2010; Nimmon & Stenfors-Hayes, 2016). Patients must trust healthcare providers and form a connection in order to encourage adherence and open communication. Research and theory on physician communication skills suggests that both verbal and nonverbal behaviors are vital to successful physician–patient interactions and associated patient health outcomes (Beck et al., 2002; Griffith et al., 2003).
Research examining the individual components of physician–patient communication is limited. A potential explanation for this is the difficulty associated with manipulating just one individual aspect of a doctor’s behavior, in order to examine its effects on patient outcomes in the real world. Instead, research has examined the effects of general physician communication skills on patient outcomes, showing good effects (Beck et al., 2002; Stewart et al., 1999; Travaline et al., 2005).
Recently, it was proposed that the theory of doctor–patient communication could be applied to robot–patient communication (Broadbent et al., 2018b). Unlike humans, robots can be programmed to manipulate one particular variable and keep all the others identical. Therefore, robots provide excellent research models to examine individual components of communication. Research in this area may not only inform robotic behaviors but also theory in doctor–patient communication. The next sections discuss physician communication behaviors and relate these to social robotics research. The rationale for the research, aims, and hypotheses are then presented.
Effective physician communication is a fundamental aspect of the medical consultation: it is key to building rapport with patients and central to the delivery of appropriate healthcare (Ha et al., 2010; Travaline et al., 2005). Effective physician communication refers to the ability of a physician to create rapport, relay appropriate medical information, and facilitate patient involvement in treatment options and health-related decisions (Ha et al., 2010; Simpson et al., 1991; Stewart et al., 1999; Travaline et al., 2005). In physician–patient literature, effective physician communication is determined by patients’ perceptions of a physician’s behavior or “bed-side manner” (Ha et al., 2010).
Effective physician communication has been found to decrease patient anxiety and psychological distress (Butow et al., 1996; Stewart et al., 1999), facilitate patient understanding of medical information (Travaline et al., 2005), and increase patient satisfaction (Ha et al., 2010). In a meta-analysis of 127 studies, effective physician communication was found to have a significant measurable impact on patient adherence to medications (Zolnierek & DiMatteo, 2009).
One verbal behavior associated with effective physician communication is the use of humor (Bennett, 2003). It is pertinent to study humor because it is commonly used in primary and secondary level medical care (59%, Phillips et al., 2018), and it is associated with a number of positive patient outcomes, including perceptions of physician empathy. As described in the next sections, humor has the capacity to facilitate rapport and communication, empower patients, reduce distress, and increase perceptions of physician empathy. The ability of humor to reduce patient distress and embarrassment is particularly relevant for healthcare settings where patients are often required to discuss potentially distressing subjects or undertake potentially embarrassing health assessments (e.g., breast or prostate exams for cancer screening).
Humor is defined by Phillips et al. (2018) as “a statement made with the intent to make others in the room laugh or react positively and to which a positive response is elicited” (p. 270). Humor is first and foremost a communication tool which, according to Lynch (2002), is an “essential part of what it is to be human” (p. 423). Verbal or conversational humor is a complex multifaceted construct. It includes (but is not limited to) subtypes such as jokes (statements comprising a buildup and punch-line), puns (humorous statements that may be interpreted in two different ways), sarcasm (a biting or sharp statement with a humorous undertone), and anecdotes (a humorous story relating personally to the speaker or the lives of others) (Dynel, 2009).
Witticisms are another humor subtype. Witticisms are context bound (i.e., associated with a specific situation or conversation), clever, woven into conversations, and are generally used in “nonhumorous conversational environments” (Dynel, 2009, p. 1287; Norrick, 1984). Witticisms are often used as a way of reducing anxiety or to bring attention to a particular issue (Wiener, 2015).
In an overview of research looking at humor in social interactions, humor was found to increase perceptions of social competence, intimacy, and trust (Hampes, 2010). Other research has found the use of humor to increase perceptions of confidence and competence, which in turn is found to increase perceptions of social status (Bitterly et al., 2017). Humor is linked to higher cognitive and emotional intelligence (Greengross & Miller, 2011; Willinger et al., 2017) as well as social likeability, with individuals who use humor regularly in social interactions reporting less loneliness and better social networks (Wanzer et al., 1996).
The use of humor in healthcare environments is slowly gaining attention as a useful therapeutic tool for the facilitation of positive patient health outcomes (Bennett, 2003). Humor is generally associated with laughter, which has been shown to reduce psychological distress, reported pain, and blood pressure readings, as well as to increase immune function (Borins, 1995; Calman, 2001; Hassed, 2001).
Medical professionals have long used humor as a way in which to decrease a patient’s physical or emotional distress (Lynch, 2002). Research exploring interactions between medical staff and patients supports the appropriate use of humor to increase patient communication and satisfaction, reduce patient anxiety, and create meaningful connections between clinicians and patients (Berger et al., 2004; Dean & Major, 2008; Demjén, 2016).
In studies involving children, humor interventions using clowns in hospital environments were found to decrease pediatric psychological distress and increase positive emotions (Dionigi et al., 2014; Fernandes & Arriaga, 2010; Vagnoli et al., 2005). In patients with schizophrenia, humor interventions have been found to result in significant decreases in negative psychological symptoms such as anxiety and depression (Cai et al., 2014).
Although it may seem that humor is not suited to the often serious nature of interactions between patients and healthcare professionals, research indicates otherwise. Dean and Major (2008) examined interactions between healthcare professionals and patients in intensive care units and palliative care facilities, concluding that, in conjunction with professional skill and compassion, humor was able to provide meaning to healthcare interactions that was too significant to be ignored. Not only was humor found to facilitate emotional coping and relationships between staff and patients, it was also found to assist patients in reducing distress, expressing frustration, and reducing embarrassment associated with personal tasks that required staff assistance (e.g., showering and use of the toilet).
Research has found that the appropriate use of humor can increase patient perceptions of healthcare professionals’ empathy (Dean & Major, 2008; Demjén, 2016; Hampes, 2001). Researchers theorize that this operates through the facilitation of rapport, patient communication, patient empowerment, and reduction of patient distress. As put by Dean and Major (2008): “Delivered with sensitivity and caring, humor is effective, not necessarily because of its content, but because it conveys empathy and recognizes the dignity of the individual” (p. 1092).
The study of humor within social robotics is an encouraging yet limited area of research (Mirnig et al., 2016). Research has examined human responses to a robot’s use of verbal humor (Bechade et al., 2016), nonverbal humor, and physical comedy (Mirnig et al., 2016; Wendt & Berg, 2009) and has also explored the robot recognition of human humor (Bertero & Fung, 2016; Fung et al., 2016) and laughter (Devilliers et al., 2015) during human–robot interactions. Researchers have even explored participant responses to robot “stand-up” comedy (Tay et al., 2016).
Despite the growing interest in robot humor, only a handful of studies have specifically considered the effect of robot humor on user perceptions. In one such study, the addition of joke-telling to a robot’s conversation about building amenities improved user perceptions of the robot’s speaking style, personality expression, emotional expression, extroversion, and overall interaction enjoyability (Niculescu et al., 2013). In other work, jokes were rated funnier by participants when delivered by a robot as compared to text form (Sjoberg & Araki, 2009). In this same study, adding an observing robot that either laughed or “booed” after each joke told by the performing robot also increased ratings of funniness, compared to a quietly observing robot.
It has been proposed that a robot’s behaviors might influence users’ perceptions of its personality and that this in turn might shape social responses and expectations (Goetz et al., 2003). Goetz et al. provided evidence that matching robot personality to a task can influence outcomes; a playful robot resulted in longer interactions for a jelly bean sorting task, whereas a more serious robot resulted in longer interactions in an exercise task. As suggested here, the context in which humor is used is important and can influence whether or not humor can enhance outcomes. This is one reason why it is important to specifically test the effects of robot humor in a healthcare context.
Although research has begun to examine the use of humor by social robots, the unique settings in which healthcare robots operate mean that these findings do not necessarily generalize to healthcare. Therefore, an experiment was designed to examine the effect of a robot’s humor on user outcomes in a healthcare scenario. A scenario was chosen in order to limit confounds and due to current technological limitations in robotics. The use of scenarios in healthcare research is not uncommon (Nazione et al., 2019).
The healthcare robot in this experiment used “witticism” humor. This humor subtype was chosen due to its use by humans in reducing anxiety, and its ability to bring attention to an issue while still appearing “clever” (Dynel, 2009, p. 1287; Norrick, 1984; Wiener, 2015).
The aim of this research was to investigate whether the use of humor by a healthcare robot could affect user outcomes. The research question was as follows: Will the use of humor by a healthcare robot affect user perceptions of the robot, its empathy and personality, and increase user laughing? Our primary hypothesis was that the use of humor by “EveR” would result in increased participant ratings of the robot’s likeability, intelligence, safety, animacy, anthropomorphism, and empathy, compared to the neutral condition. Our secondary hypotheses were that the use of humor by the “EveR” robot would result in more positive personality ratings of the robot and increase participant laughing compared to the neutral condition.
A power analysis was conducted using the computer program “G*Power” (Faul et al., 2007) using 2 × 2 repeated measures between factors ANOVA, with an alpha error probability of .05, power of .90, and a moderate effect size (f = .30). Effect size was based on the moderate effect sizes found by Niculescu et al. (2013) (d = 0.6), and Sjoberg and Araki (2009) (d = 0.6) on the effect of robot humor on participant perceptions of robot personality, humor, and overall interaction enjoyability. Analysis revealed a total sample size of 90 participants was needed.
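The step from the reported d = 0.6 to the f = .30 entered into the power analysis can be made explicit with the standard two-group conversion f = d/2. The following is a minimal sketch, assuming two equally sized groups; the function name is illustrative and is not part of G*Power.

```python
def cohens_d_to_f(d: float) -> float:
    """Convert Cohen's d to Cohen's f, assuming two equally sized groups."""
    return d / 2.0


# d = 0.6, as reported for the two earlier robot-humor studies
f = cohens_d_to_f(0.6)
print(f"f = {f:.2f}")  # f = 0.30
```

The sketch covers only the effect size conversion; the sample size itself depends on the noncentral F distribution and is not reproduced here.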
Recruitment was undertaken via emails delivered to the University of Auckland’s students and staff. To be eligible, participants needed to be at least 16 years of age and fluent in English. The study was approved by the University of Auckland’s Human Participants Ethics Committee on September 3, 2019 for a period of 3 years (approval number 023487).
The robot used in this study was the “EveR-4” robot (see Figure 1). The EveR-4 robot is an android-type “female” robot, designed and created by robotic engineers at the Korean Institute of Industrial Technology. The EveR-4 robot is the fourth version in the EveR series. The EveR-4 was chosen due to its ability to “speak” and respond to questions, as well as its ability to demonstrate smiling, due to over thirty different “facial” motors covered by a silicone “skin.” The EveR-4 is able to “recognize” human faces and will move its head in order to maintain eye contact with users. Android-type robots like the EveR-4 have been used in previous experiments examining robots in receptionist or healthcare roles (Hashimoto et al., 2007; Ido et al., 2002; Yamada et al., 2019). Due to their human-like appearance, android-type robots are thought to facilitate realistic communication with users (Hashimoto et al., 2007). The EveR-4 robot is also more similar to a real physician than a machine-like robot such as Baxter (Ju et al., 2014), or an animal robot like Paro (Bemelmans et al., 2005), and is, therefore, more appropriate for this kind of study.
A between-subjects, repeated measures, randomized design was chosen for this study. The study was undertaken at the University of Auckland’s Newmarket campus in Auckland, New Zealand. All questionnaires and forms were provided to participants in paper form. In the supplementary material, Figure S1 shows a diagram of the experimental room setup.
On the day of the experiment, participants met with the lead researcher who explained the purpose of the study, answered any questions, and obtained written informed consent. Participants then completed a baseline questionnaire which recorded demographic information (age, gender, culture, employment, and education) and previous history of engagement with robots.
All participants took part in the initial interaction in which they were asked to imagine that they had gone to their local general practitioner’s (GP’s) clinic with the purpose of determining if they should enroll in the clinic as a patient. The purpose of this initial interaction was to reduce novelty effects associated with “meeting” a robot for the first time. Participants were provided with a script to use during the initial interaction. Participants were advised that they would be speaking to “Jane” the “nurse robot” who would answer their questions about the clinic. Following the initial interaction, participants were asked to complete a questionnaire (time point one).
Once the questionnaire was completed, participants were provided with another script and asked to imagine that they had decided to join the GP’s clinic where Jane “worked,” and would now be attending an appointment with Jane in order to ask her questions about the influenza (“flu”) virus and to book in for a flu vaccine. In this second interaction, Jane provided information about the flu, tips on how to avoid catching the flu, and assisted the participant in booking an appointment for a flu vaccine.
A health interaction involving a “flu” vaccination was chosen for the study because most participants would be familiar with it. The study took place during the New Zealand flu season. Each year, a number of concrete health initiatives encourage individuals to protect themselves from catching influenza by having an influenza vaccination before “flu season.” These initiatives include patient funding, advertisements, workplace initiatives, and educational materials.
Immediately prior to this second interaction, participants were randomized to either the neutral or humor condition. Randomization was undertaken by a separate researcher using a computer algorithm to generate randomization codes, which were then stored separately in envelopes, each marked with a participant’s identification number. Immediately prior to the second interaction, the lead researcher opened an envelope and selected either the neutral or humor condition based on the randomization code contained within. The allocated interaction was run through a computer located in the experiment room.
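One way such an allocation could be generated is sketched below. The article does not describe the algorithm, so the balanced-shuffle scheme, the seed, the function name, and the identifier format are all assumptions made for illustration only.

```python
import random


def make_allocation(participant_ids, seed=None):
    """Assign each ID to 'neutral' or 'humor' with near-equal group sizes.

    Hypothetical scheme: build a balanced list of condition labels,
    shuffle it, and pair it with the participant IDs.
    """
    rng = random.Random(seed)
    n = len(participant_ids)
    labels = ["neutral", "humor"] * (n // 2)
    if n % 2:  # odd sample size: the extra slot goes to a random condition
        labels.append(rng.choice(["neutral", "humor"]))
    rng.shuffle(labels)
    return dict(zip(participant_ids, labels))


# 91 illustrative IDs; each code would be sealed in that participant's envelope
ids = [f"P{i:03d}" for i in range(1, 92)]
allocation = make_allocation(ids, seed=2019)
```

Generating all codes in advance, by someone other than the person running the sessions, keeps the allocation concealed until the envelope is opened.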
During the interactions, the lead researcher remained in the room, moving to sit behind a computer screen. The computer screen was large and positioned in a way that obscured the researcher from participant sight. The researcher remained silent during all interactions. Though obscured from sight, the researcher was able to see participants via a camera that displayed the participants’ image on the computer screen. This allowed the researcher to discreetly observe and code participant laughing behaviors during the second interaction. In order to establish inter-rater reliability, a second researcher was present for 10% (N = 9) of these interactions. The second researcher was introduced to relevant participants as a study assistant and sat to the side of participants, out of direct eyeline, discreetly observing and recording laughing behaviors.
Following the second interaction, participants were given another questionnaire to complete (time point two). The questionnaires were identical at time point one and time point two. See Figure 2 for the procedural outline.
The robot in the current study used “witticisms” during conversation as a form of humor. As discussed above, witticisms are context bound, clever, and woven into conversation (Dynel, 2009, p. 1287; Norrick, 1984). This form of humor was chosen as it would allow the robot to use humor related to the flu vaccine and medical context in which the scenario was taking place. Witticism was also chosen as it is seen as “clever,” and the researchers in the current study did not want to use a form of humor that may negatively influence perceptions of the robot’s intelligence. Finally, the use of witticism humor allowed the conversation to flow without the breaks that would occur if the robot used other forms of humor, for example, the telling of jokes or humorous riddles.
The script for the humorous interaction between participant and robot is shown in Figure 3, with bolded italic sentences indicating humorous statements. The control condition used the identical script, but without the humorous statements. The robot’s other behaviors, including smiling, were identical across conditions. The three statements are witty because they are context bound, clever, and woven into conversation.
As humor is a subjective construct, the scripts in both conditions were checked prior to the start of this experiment to assess perceived humor. A convenience sample of 10 participants was used. Participants observed both a “humorous” and a “neutral” interaction, demonstrated by the lead author and a “chat-bot.” The order of the interactions demonstrated was randomized. Participants were asked if they found the statements used by the chat-bot to be humorous or not, if the humor was appropriate or not, and to rate the frequency of humor as either: too much, just right, or too little. None of the participants found the neutral interaction to be humorous. All participants (N = 10/10) rated the “humorous” interaction as humorous and all participants reported the humorous statements in this condition to be appropriate. Sixty percent of participants reported that “too much” humor was used, while 40% reported humor frequency to be “just right” in the humor condition. Based on these findings, the decision was made to remove one of the four humorous statements from the script.
The “humorous/humorless” item of the personality measure was analyzed as a manipulation check. A chi-square test showed that more participants in the humor group rated the robot as humorous (n = 41/46) than in the neutral group (n = 7/45), χ2(1, N = 91) = 49.40, p < .001.
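The reported statistic is consistent with an uncorrected Pearson chi-square on the 2 × 2 table of counts. The sketch below assumes that every participant answered the item, so that the remaining 5 (humor) and 38 (neutral) participants chose “humorless”; the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: humor group, neutral group; columns: rated humorous, rated humorless
chi2 = chi_square_2x2(41, 5, 7, 38)
print(round(chi2, 2))  # 49.4
```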
The use of humor in social situations has been found to increase perceptions of competence, confidence (Bitterly et al., 2017), and likeability (Wanzer et al., 1996). In clinical contexts, humor has been found to decrease the symptoms of psychological distress, including anxiety (Cai et al., 2014; Dean & Major, 2008; Lynch, 2002). In studies exploring the use of humor in human–computer interactions, computer humor was found to increase user perceptions of the computer’s “naturalness” and “flexibility” as well as making the computer seem “more humanlike” when it failed (Mulder & Nijholt, 2002).
In order to determine if similar perceptions would be influenced by the use of humor by the robot in the current study, the Godspeed questionnaire was utilized (Bartneck et al., 2009). The Godspeed questionnaire is a standardized measure developed to assess the five key aspects of human perceptions during human–robot interactions: (a) likeability, (b) intelligence, (c) perceived safety (a measure of user anxiety, agitation, and tiredness), (d) animacy, and (e) anthropomorphism. The Godspeed questionnaire is arguably one of the most used questionnaires in social robotics research, with citations numbering over 150 as of 2014 (Weiss & Bartneck, 2015). Empirical studies have demonstrated the reliability and validity of each individual dimension of the Godspeed questionnaire (Bartneck et al., 2009). The Godspeed questionnaire was chosen not only due to its utility as a standard measure in social robotics but also as it specifically measures perceptions of robot safety, which is important in healthcare.
Reliability analysis was undertaken of the total Godspeed questionnaire (from time point two) with Cronbach’s alpha revealed to be .94 (indicating very good internal consistency). Reliability analyses were also undertaken for each of the separate dimensions in the Godspeed questionnaire, revealing a Cronbach’s alpha of .90 for likeability, .86 for intelligence, .69 for safety, .79 for animacy, and .83 for anthropomorphism. Participant responses for items in each of the five dimensions were summed together in order to give a total score for each dimension at each of the two time points.
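For readers unfamiliar with the statistic, Cronbach’s alpha compares the sum of the individual item variances with the variance of the summed score. The sketch below is a minimal illustration using made-up responses, not study data; the function and variable names are assumptions.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses from four participants on three items (not study data)
toy = [[1, 2, 1], [3, 3, 4], [4, 5, 4], [5, 5, 5]]
dimension_totals = [sum(r) for r in toy]  # summed score per participant
alpha = cronbach_alpha(toy)
```

Because the formula subtracts a variance ratio from 1, alpha can fall below zero when items within a scale are negatively correlated.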
Due to the importance of empathy in physician–patient interactions (Riess, 2010), and the positive relationship between humor and empathy (Dean & Major, 2008; Demjén, 2016; Hampes, 2001), the current study assessed perceptions of the robot’s empathy. To assess perceptions of the robot’s empathy, we used the total score from a combination of seven questions from the McGill Friendship Questionnaire (Mendelson & Aboud, 1999), five questions from the Consultation and Relational Empathy measure (CARE measure) (Mercer et al., 2004), and one question specifically for this study “Jane makes a good healthcare receptionist.” An adaption of the McGill Friendship Questionnaire was used in previous robotics research investigating perceived robot empathy (Leite et al., 2013). The CARE measure was designed to assess patient perceptions of clinician empathy and has been found to be both reliable and valid across a number of clinical settings (Bikker et al., 2015; Wirtz et al., 2011). All of the items were scored on Likert scales from 0 “strongly agree” to 5 “strongly disagree.” In the analysis phase, all items were reverse coded so that higher scores represented higher empathy. The 13 items on the scale were summed at time point one and at time point two, in order to create a total empathy score at each of the two time points. A reliability analysis of the empathy scale was undertaken (at time point two), revealing a Cronbach’s alpha of .90 (indicating very good internal consistency).
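The reverse coding and summation described above can be sketched as follows. The responses are invented for illustration and the function names are assumptions; only the 0 to 5 response range and the 13-item total come from the text.

```python
SCALE_MAX = 5  # items were scored from 0 ("strongly agree") to 5 ("strongly disagree")


def reverse_code(score, scale_max=SCALE_MAX):
    """Flip a 0..scale_max rating so that higher values mean higher empathy."""
    return scale_max - score


def empathy_total(raw_scores):
    """Sum the reverse-coded items into a single empathy score."""
    return sum(reverse_code(s) for s in raw_scores)


# Hypothetical responses to the 13 items from one participant (not study data)
raw = [0, 1, 1, 0, 2, 1, 0, 0, 1, 2, 1, 0, 1]
total = empathy_total(raw)  # possible range is 0 to 65 for 13 items
```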
To assess participant perceptions of the robot’s personality, an adaption of Asch’s personality scale (Asch, 1946) was used, similar to Broadbent et al. (2013). Participants were asked to select the descriptor they most agreed with from each of 18 different pair-choice items. Asch’s personality scale was chosen not only as it measured user perceptions of the robot’s humor (pair-choice item “humorous vs. humorless”) but also due to the measurement of other user perceptions recognized as intuitively important to the implementation of humor in healthcare robots (e.g., “sociable vs. unsociable” and “warm vs. cold”). Asch’s personality measure has been used in both psychological and social robotics research (Broadbent et al., 2013; Mann et al., 2015; Williams & Bargh, 2008). Items were scored as 0 and 1, with lower scores representing more positive personality traits.
We calculated the same three factors reported by Broadbent et al. (2013) for each time point. The first factor was “sociable” (including items sociable–unsociable, popular–unpopular, imaginative–hard headed, warm–cold, humorous–humorless, and good natured–irritable), Cronbach’s alpha was .68 at time point two. The second factor was amiable (including items good looking–unattractive, happy–unhappy, humane–ruthless, and generous–ungenerous), but Cronbach’s alpha was −.38. The third factor was trustworthy (including persistent–unstable, wise–shrewd, and honest–dishonest); however, there was zero variance in the honest item at time two, and the remaining two items were negatively correlated (Cronbach’s alpha −.67). Due to the unreliable nature of the amiable and trustworthy factors, only the sociable factor was analyzed. In addition, individual items are reported in Appendix A.
Participants were discreetly observed during the second interaction with the robot in order to code laughing behaviors. A “laugh” was defined as an audible laughing sound accompanied by a smile (i.e., lifting of the corners of the mouth and cheeks). A second researcher was present for 10% (N = 9) of interactions. Observation reports from both researchers were compared, showing 100% agreement regarding participant laughing behaviors. Laughing was coded as 1 (present) or 0 (absent).
The Godspeed questionnaire was analyzed using 2 × 5 × 2 ANOVA with time point and dimension (repeated measures), and condition (between groups). Empathy was analyzed using a 2 × 2 ANOVA including time point (repeated measures) and condition (between groups). For the sociable personality factor, 2 × 2 ANOVA including time point (repeated measures) and condition (between groups) was conducted. Fisher’s exact tests were conducted for each individual personality item at time one (Table A1) and time two (Table A2) in Appendix A. A Fisher’s exact test was used to determine whether there was a difference in laughing behaviors between the neutral and humor groups, as laughing was a dichotomous variable.
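For the dichotomous outcomes, a two-sided Fisher’s exact test can be computed directly from the hypergeometric distribution. The sketch below is a minimal illustration rather than the software routine used in the study; the counts are hypothetical, and the two-sided rule shown (summing all tables no more likely than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def weight(x):
        # Number of ways to obtain x in the top-left cell given the margins
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / total_tables


# Hypothetical counts (not study data): rows are groups, columns are yes/no
p = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p, 4))  # 0.035
```

Working with integer counts until the final division avoids floating-point ties when deciding which tables are as extreme as the observed one.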
The majority of participants were female (N = 73/91). Participants identified as New Zealand European (N = 26), Maori (N = 3), Chinese (N = 26), Korean (N = 3), Indian (N = 12), and “Other” (N = 21). Most participants had received a secondary school level education (N = 65), followed by postgraduate level (N = 15), undergraduate level (N = 7), diploma level (N = 2), and trade certificate level (N = 2). Employment ranged from students (N = 64), to part time employees (N = 16), full time employees (N = 8), and those who were currently unemployed (N = 3). The mean age of participants was 25.03 years (SD = 11.06), with a minimum age of 17 and a maximum age of 63. The majority of participants (N = 73/91) had never interacted with any kind of robot in the past. Analyses revealed no significant differences between participants in the neutral (N = 45) and humor (N = 46) groups in regards to age [t(88) = −9.24, p = .358], gender [χ2(1, N = 91) = 0.34; p = .607], education [χ2(4, N = 91) = 2.34; p = .674], employment [χ2(3, N = 91) = 4.05; p = .256], or history of robot interaction [χ2(2, N = 91) = 1.062; p = .588].
Table 1 shows the mean values and SDs for the five Godspeed dimensions for each condition at the two time points.
Multivariate tests indicated significant effects for time, F(5, 85) = 21.93, p < .001, partial eta squared = .56, and for time by condition, F(5, 85) = 3.24, p = .010, partial eta squared = .16. Univariate test results for the five dimensions of the Godspeed questionnaire are reported below.
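The reported effect sizes can be checked against the F statistics, because partial eta squared equals (F × df effect) / (F × df effect + df error). The sketch below applies that identity to the two multivariate tests; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Multivariate effect of time, F(5, 85) = 21.93
print(round(partial_eta_squared(21.93, 5, 85), 2))  # 0.56
# Multivariate time-by-condition effect, F(5, 85) = 3.24
print(round(partial_eta_squared(3.24, 5, 85), 2))   # 0.16
```

The same relation applies to any F test with known degrees of freedom.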
There was a main effect of time, such that the robot was perceived as more likeable at time point two than at time point one, F(1, 89) = 24.82, p < .001, partial eta squared = .22. There was a significant time by condition interaction effect, F(1, 89) = 7.74, p = .007, partial eta squared = .08; at time point two, participants in the humor condition saw the robot as more likeable than did participants in the neutral condition.
There was a main effect of time, such that the robot was perceived as more intelligent at time point two than at time point one, F(1, 89) = 24.12, p < .001, partial eta squared = .21. There was no significant time by condition interaction effect F(1, 89) = 0.60, p = .441, partial eta squared = .007.
There was a main effect of time, such that the robot was perceived as safer at time point two than at time point one, F(1, 89) = 8.31, p = .005, partial eta squared = .09. There was a significant time by condition interaction effect F(1, 89) = 5.19, p = .025, partial eta squared = .06; participants in the humor condition saw the robot as safer than did participants in the neutral condition at time point two.
There was a main effect of time, such that the robot was perceived as more animate at time point two than at time point one, F(1, 89) = 71.91, p < .001, partial eta squared = .45. There was a significant time by condition interaction effect F(1, 89) = 5.24, p = .024, partial eta squared = .06; perceptions of animacy increased more in the humor condition compared to the neutral condition. Animacy scores were lower at time point one for the humor group compared with the neutral group, and this may account for the interaction effect.
There was a main effect of time, such that the robot was perceived as more anthropomorphic at time point two than at time point one, F(1, 89) = 49.91, p < .001, partial eta squared = .36. There was no significant time by condition interaction effect F(1, 89) = 0.00, p = .989, partial eta squared = .00.
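As a reader-side check, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to a selection of the univariate Godspeed results reported above. The function name `partial_eta_squared` is an illustrative choice, and because the published F values are themselves rounded, agreement is only expected to two decimal places.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, reported partial eta squared) taken from the text; all are F(1, 89).
reported = {
    "likeability, time": (24.82, 0.22),
    "likeability, time x condition": (7.74, 0.08),
    "intelligence, time": (24.12, 0.21),
    "safety, time": (8.31, 0.09),
    "safety, time x condition": (5.19, 0.06),
    "animacy, time": (71.91, 0.45),
    "animacy, time x condition": (5.24, 0.06),
    "anthropomorphism, time": (49.91, 0.36),
}

for label, (f_value, eta_reported) in reported.items():
    eta_computed = partial_eta_squared(f_value, df_effect=1, df_error=89)
    print(f"{label}: computed {eta_computed:.3f}, reported {eta_reported:.2f}")
```

The same identity applies to the multivariate tests, using 5 and 85 degrees of freedom.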
Table 2 shows the mean values and SDs for perceived robot empathy for each condition at each time point.
There was a significant main effect of time on empathy, F(1, 89) = 63.36, p < .001, partial eta squared = .42; both conditions saw the robot as having higher empathy at time two. There was also a significant condition by time interaction, F(1, 89) = 5.60, p = .020, partial eta squared = .06. The robot was seen as more empathetic in the humor condition than in the neutral group at time two.
Table 3 shows the mean values and SDs for the sociable personality factor for the humor and neutral conditions at time points one and two. ANOVA revealed a main effect of time, such that the robot was perceived as more sociable at time point two than at time point one, F(1, 89) = 87.69, p < .001, partial eta squared = .50. There was also a significant time by condition interaction effect, F(1, 89) = 40.27, p < .001, partial eta squared = .31; participants in the humor condition reported more sociable perceptions of the robot compared to participants in the neutral condition at time point two.
Appendix A shows the results of tests of the individual personality items at time point one and time point two. There were no significant differences between conditions at time point one (Table A1). At time point two, participants in the humor condition were more likely to rate the robot as happy, humorous, sociable, talkative, warm, popular, frivolous, and imaginative than those in the neutral condition (Table A2). The other items at time point two were not significantly different between conditions.
A Fisher’s exact test [χ²(1, N = 91) = 20.45; p < .001] revealed a statistically significant difference between the humor and neutral conditions in laughter. Significantly more participants laughed during the interaction with the humorous robot (N = 21, 46%) than with the neutral robot (N = 2, 4%; Cramér’s V = .47).
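The laughter comparison can be reproduced approximately from the counts given in the text. The sketch below uses only the Python standard library and is not the authors' analysis code. It assumes that every participant was coded for laughter, so the non-laughing counts (25 of 46 in the humor group, 43 of 45 in the neutral group) are inferred by subtraction rather than reported. Under that assumption, an uncorrected Pearson chi-square on the 2 × 2 table matches the reported statistic, and Cramér’s V follows as the square root of χ²/N. The function names are illustrative.

```python
from math import comb, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def cramers_v_2x2(chi2, n):
    """Cramer's V for a 2 x 2 table (equivalent to the phi coefficient)."""
    return sqrt(chi2 / n)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p: sum of the probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: humor, neutral. Columns: laughed, did not laugh.
# 21 and 2 are reported; 25 and 43 are inferred from group sizes (assumption).
table = (21, 25, 2, 43)

chi2 = pearson_chi2_2x2(*table)
v = cramers_v_2x2(chi2, sum(table))
p_exact = fisher_exact_two_sided(*table)
print(f"chi2 = {chi2:.2f}, Cramer's V = {v:.2f}, Fisher p < .001: {p_exact < 0.001}")
```

Fisher’s exact test itself yields only a p value; the accompanying χ² statistic and Cramér’s V come from the chi-square computation on the same table.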
This study investigated whether the use of humor by a robot in a medical scenario affected user perceptions of the robot’s likeability, intelligence, safety, animacy, anthropomorphism, empathy, and personality. The study also looked at the effect of robot humor on user behavior, specifically that of laughing. The results showed that the use of robot humor significantly increased users’ perceptions of the robot’s likeability and safety. Although humor significantly increased perceptions of robot animacy, this was likely due to differences at time one. The use of robot humor did not affect perceptions of the robot’s intelligence or anthropomorphism, and this may be due to the robot’s human-like appearance and the knowledge it demonstrated in both conditions. Robot humor also increased user ratings of the robot’s empathy, as well as user perceptions of the sociable aspects of its personality. The humorous robot was also seen as happier, more frivolous, and more talkative than the neutral robot. In addition, robot humor was found to increase user laughing behavior.
These findings support the theory of robot–patient communication, which posits that aspects of robot communication can affect patient outcomes (Broadbent et al., 2018). The results of the current study support the use of robot humor within healthcare contexts and suggest that humor increases robot likeability and reduces user distress (indexed by the safety dimension of the Godspeed questionnaire). The positive effect of robot humor on user perceptions of its likeability supports previous research in human social interactions, demonstrating the role of humor in increasing perceptions of likeability (Wanzer et al., 1996).
The study also found that robot humor increased user perceptions of the robot’s empathy, within a medical scenario. These results support research showing that appropriate physician humor is associated with increased patient perceptions of physician empathy (Dean & Major, 2008; Demjén, 2016; Hampes, 2001). Researchers in physician–patient interactions theorize that humor can influence empathy perceptions through the facilitation of rapport, as well as patient communication, empowerment, and the reduction of psychological distress. Our findings support mechanisms of reduced distress and rapport (through the safety and likeability dimensions of the Godspeed questionnaire).
The findings of the study further suggest that humor has a positive impact on perceived robot personality. This supports previous research showing that the use of humor by a social robot increased user ratings of the robot’s expression of personality (Niculescu et al., 2013), and positive personality traits (Bitterly et al., 2017; Hampes, 2010). Our results extend these findings to a medical context and suggest that the effects of humor apply mostly to sociable aspects of a robot’s personality, as indexed by the sociable factor of the personality scale. The effects of humor on the happy and talkative items of the personality scale could also be seen to relate to sociable personality traits. The effects of humor on ratings of the frivolous–serious item of personality should be further investigated as to whether this is reflecting perceptions of humor or something else.
Although preliminary, the findings suggest clinical implications for the incorporation of humor into healthcare robots to improve user perceptions of the robot.
As with many experimental studies, there are limitations that need addressing. First, most of the participants were students and relatively young. This may reduce the generalizability of these results to an older population.
Second, due to technological limitations, we used a scripted interaction in a simulated medical environment, which limits ecological validity. Had the technology been available for participants to converse freely with the robot, and had a patient population been used within a natural setting, the results of this study might have differed. The interaction we used concerned the flu vaccine, and other types of conversation might produce different results.
Third, the use of humor meant that the robot spoke more words in the humor condition, compared to the neutral condition (69 words). The resulting small increase in interaction time for the humor condition may have had an impact on participant perceptions. This difference in dialogue length is common with other human–robot interaction studies (e.g., Niculescu et al., 2013; Siino et al., 2008).
Fourth, due to financial constraints, the robot used in the current study was a “head-only” robot. Participants’ self-reported perceptions may have differed had the robot been a “full-bodied” humanoid. On this note, participants were asked two open-ended questions about what they liked most and least about their interactions with the robot. Although participants mentioned the robot’s voice, content of speech, eye gaze, movements, and facial expressions, none mentioned that the robot was only a head, which suggests that this was generally accepted.
The use of humor by a healthcare robot increased user perceptions of its likeability, safety, empathy and sociability, and increased user laughter. Robot programmers may therefore wish to incorporate humor into the development of healthcare robots. As the use of healthcare robots in the future will certainly involve interactions with patients, researchers should consider replication of this research, utilizing a patient sample, within a medical environment, with different conversational topics.
Copyright © the Author(s) 2020
Received December 12, 2019
Revision received August 11, 2020
Accepted August 15, 2020