Volume 3, Issue 1: Spring 2022. Special Collection: Technology in a Time of Social Distancing. DOI: 10.1037/tmb0000055
This preregistered experiment examines the impact of three nonverbal cues displayed through video conference screenshots (i.e., gaze direction, distance between the face and the camera, camera angle) on impression formation. Actors in video conference screenshots each portrayed one of 18 nonverbal cue configurations that manipulated gaze (at the camera, on-screen, or off-screen), camera distance (close or far), and camera angle (high, eye-level, or low). Study participants (N = 3,892) rated the actors on nine interpersonal dimensions (e.g., likeability). Findings showed significant effects of gaze and camera angle on impression formation: gaze on-camera was positively associated with likeability, social presence, and interpersonal attraction, and high camera angles increased interpersonal attraction and decreased threat perceptions compared to low angles. Although the actors’ distance in relation to the camera did not affect impression formation, the interaction between distance and gaze was positively associated with threat judgment and social presence such that faces closer to the camera and maintaining direct gaze were rated as more socially present and threatening than the other three conditions. Finally, participants’ gender also played an important role as women, regardless of actors’ nonverbal behaviors and demographics, reported higher likeability judgments and lower threat judgments than men. These results contribute to the body of knowledge concerning nonverbal behavior in video conferences and how it compares with and differs from face-to-face interaction. Moreover, through the use of video conference screenshots, these results inform video conference users about how their nonverbal behaviors might affect how they are perceived by others.
Acknowledgments: We are thankful for assistance in this research from Jet Toner, Tobin Asher, Sunny Liu, Carlyn Strang, and the 16 actors shown in the video conference screenshots.
Funding: This work was supported by the National Science Foundation under Grants IIS-1800922 and CMMI-1840131; Knut och Alice Wallenberg under Grant 20170440.
Disclosures: The authors declare that they have no conflict of interest.
Author Contributions: Géraldine Fauville, Anna C. M. Queiroz, and Mufan Luo contributed equally to this work.
Data Availability: Our hypotheses and research questions were preregistered at https://osf.io/tyaqh. The screenshots used in this study are available at https://osf.io/7prhz/ along with the dataset and the analyses.
Correspondence concerning this article should be addressed to Géraldine Fauville, Department of Education, Communication and Learning, University of Gothenburg, Pedagogen hus B, Läroverksgatan 15, 405 30 Göteborg, Sweden email@example.com
As social interactions have increasingly transitioned from in-person to virtual settings, it is essential to understand how we make social impressions on others via this medium. Asch (1946, p. 258) described the critical human ability to form impressions as follows: “This remarkable capacity to understand something of the character of another person is a precondition of social life.” Building on Asch’s work (1946), Ambady and Rosenthal (1993) demonstrated that it takes very little time for humans to form interpersonal impressions. These authors argued that “thin slices” of behavior are all it takes for humans to be able to form accurate impressions of others. Scholars have argued that the time necessary to form first impressions can be very short (Ambady & Rosenthal, 1993; Olivola & Todorov, 2010; Todorov & Porter, 2014). People can judge trustworthiness from faces in less than 100 ms of exposure (Todorov et al., 2009; Willis & Todorov, 2006). Even when faces are rendered almost invisible, individuals seem to be able to make decisions concerning attractiveness, suggesting that these impressions barely require the individuals to be consciously aware of the faces (Olson & Marshuetz, 2005).
Computer-mediated communication may filter some cues out of social interactions. For example, in video conferences, nonverbal cues are either restricted or changed, and many researchers have highlighted effects of eye contact (Hinds, 1999) or the sense of increased social distance (Bradner & Mark, 2002). In early theorizing about the impact of computer-mediated communication on nonverbal cues, the cues-filtered-out approach posits that the more nonverbal cues available in a medium the more positive the interactions and outcomes (see Walther & Parks, 2002). This assumption was challenged by the realization that people can adapt their social processing to different kinds and types of cues (e.g., Walther, 1992). This leads to more nuanced alternatives recognizing that communicators can compensate for some of the structural limitations if they are provided with time and motivation (Walther, 1996; Walther et al., 2015).
What happens to impression formation when we see and hear each other through video on a regular basis? Visual cues in video conferences, such as camera angle and a partner’s distance to the screen, may have large perceptual consequences on impression development.
Nonverbal cues have been classified in three domains (Harrigan, 2005): gaze, which involves movements and direction of the eyes in visual interaction, proxemics, which refers to our perception and structuring of interpersonal and environmental space, and kinesics, defined as actions and positions of the body, head, and limbs. In this paper, we use independent variables that serve as proxies for these three domains in order to investigate how nonverbal behaviors can influence impression formation.
The concept of gaze is related to the direction of someone’s eyes (Kleinke, 1986) and provides essential cues in social interaction (Edwards & Bayliss, 2019; Mason et al., 2005). Someone’s gaze helps their interlocutor decipher their attitude, emotion, and behavior (Baron-Cohen, 1994). For example, joy, love, or anger, namely approach-oriented emotions, tend to be characterized by direct gaze, while avoidance-oriented emotions such as sorrow, disgust, or embarrassment seem to lead to more averted gaze (Argyle & Cook, 1976; Fehr & Exline, 1987; Kleinke, 1986). Prior research has shown that a person who engages in a high rather than a low level of direct gaze is perceived by their interlocutors as more attentive and intelligent than someone who tends to avoid gaze (Kleinke, 1986; LeCompte & Rosenfeld, 1971; Wheeler et al., 1979). Gaze also plays a role in cooperation since a higher level of direct gaze has been shown to encourage the interlocutor to interact more by giving longer responses, leading to increased efficiency of basic cognitive operations (Driver et al., 1999). In the context of interpersonal attraction, Kleinke et al. (1975) showed that male subjects found female researchers who gazed at them 90% of the time significantly more attentive than females who gazed 10% of the time. Regulating one’s gaze is also used to modify one’s intimacy with another person such that direct eye gazing expresses intimacy and avoiding eye contact leads to reduced intimacy (Ellsworth & Ross, 1975).
Gaze has also been studied in relation to technology and media. For example, Ilicic and Brennan (2020) examined the impacts of direct eye contact on feelings of connectedness with celebrities. The findings showed that participants who saw social media posts where the celebrity was making eye contact expressed a greater connection with the celebrity and rated them as more authentic than when the celebrity was averting eye contact. In another study (de Wolf & Li, 2020), participants were asked to imagine working in a company and watched a video. In the video, an employee walked in the hallway while a telepresence robot displaying the face of a remote colleague traveled in the opposite direction. When their paths crossed, they greeted each other. Four conditions were tested, crossing gaze (present or averted) with greeting (present or absent). After watching the video, the participants rated their social impression of the remote colleague higher in the gaze and the greeting conditions. In this way, de Wolf and Li (2020) demonstrated that even when mediated by technology, gaze plays a crucial role in impression formation.
In video conferences, Fullwood (2007) showed that participants perceived each other as less likable and intelligent during video conferences than face-to-face interactions. He argued that this difference is the result of fewer visual signals such as eye gaze. One can also point out that the evolution of the video conference technology with improved resolution and decreased latency might have an impact on the social interaction taking place. As described by Bailenson (2021), during video conferences, since the gaze is disrupted by the camera, it feels like everyone is constantly staring at each participant. In this way, all the communication and impression formation cues provided by the gaze of the interlocutor during face-to-face interaction become hard to interpret and rely upon.
In this study, in order to explore the impact of gaze on impression formation, the actors in the video conference screenshots will either be looking directly at the camera, at the screen or off the screen.
Research on proxemics involves studying how humans perceive, use, and structure space between each other in social interaction (Hall, 1966). Hall (1973) argued that different interpersonal distances convey different meanings. He defined four regions of interpersonal distance, namely public (between 7.6 and 3.7 m), social (between 3.7 and 1.2 m), personal (between 1.2 and .46 m), and intimate (less than .46 m). Several factors tend to make these regions shift such as a person’s gender, personality, culture, and the social context (see Lewis et al., 2017 for further details; Burgoon et al., 2002). Various strategies exist to address an interpersonal space invasion. One of the most straightforward strategies is to move away from the intruder (Felipe & Sommer, 1966). Argyle and Dean (1965) argued for the equilibrium theory between gaze and interpersonal distance as they stated that when proximity is too great, one can compensate for this uncomfortable intimacy through gaze averting. According to Felipe and Sommer (1966) confronting the invader is an unusual strategy although in the case of aircraft traveling, Lewis et al. (2017) identified it as one of the various strategies passengers would deploy to restore comfortable distance. Proxemics can also influence interpersonal impression formation. For example, Patterson and Sechrest (1970) found that when the distance between the actor and the participant was between 1.2 and 2.4 m, there was a negative correlation between distance and the impression formation (e.g., friendliness, aggressiveness, dominance, and extraversion).
In face-to-face settings, interlocutors standing close to each other will appear larger than if standing further away from each other. Similarly, the size of digital images influences interpersonal dynamics. Detenber and Reeves (1996) studied the impact of seeing small or large digital images on emotions, and found that participants rated the large images as more arousing than the smaller ones. When watching videos on screens of different sizes, Reeves et al. (1999) found that larger screens led to greater levels of attention (as indicated by heart rates) and arousal (as indicated by skin conductance) than the medium and small screens. It has also been argued that proxemic behavior can be activated in virtual reality as participants stayed further away from a virtual agent who stared at them compared to a virtual agent averting their gaze (Bailenson et al., 2001, 2003). Moreover, Llobera et al. (2010) showed that physical arousal could be triggered in participants approached by virtual characters.
In this study, as a proxy to proxemics, we will focus on how large the heads of the actors in the video conference screenshots appear, with either large faces (for actors sitting close to the camera) or small faces (for actors sitting further away from the camera). In this way, our manipulation strategy aligns with prior research in this area (e.g., Detenber & Reeves, 1996; Reeves et al., 1999).
Kinesics can be defined as the movement and positions of the body, head, and limbs (Harrigan, 2005), which can play an important role in impression formation. For example, Ellis (1994) found a positive association between height, dominance, and social status for both sexes. Height presents other advantages in the workplace, with physical height positively correlated with measures of social esteem, leadership, performance, and professional success after controlling for sex, weight, and age (see Judge & Cable, 2004 for a review).
In terms of kinesics in media, the orientation of the body in cinematography can be modified by the camera angle. The term “camera angle” is widely used in cinematography to convey the position of the camera in relation to the subject. High-camera angle refers to a shot where the camera is placed higher than the subject and thus looks down on them. Low-camera angle refers to a shot where the camera is positioned lower than the subject and thus looks up at the subject. Film studies have shown that camera angles can convey interpersonal meaning. For example, a camera positioned above the eye-level of the actors (high-camera angle) diminishes the status of the actors and communicates vulnerability (Boorstin, 1991). The camera angle may also influence perceptions of trustworthiness. Baranowski and Hecht (2018) found that actors were rated as most trustworthy when filmed at eye-level. Similarly, McCain et al. (1977) argued that higher angles consistently produced higher credibility ratings than low angle shots and their findings suggested that perceived competence, composure, and sociability were enhanced for speakers with higher camera angles.
In video conferences, the position of the camera in relation to the participant will influence the camera angle. Huang et al. (2002) investigated the impact of the camera angle during video conferences on participants’ influence. Participants worked individually on a problem before negotiating a common solution with another participant through a video conference where one participant’s camera was at a low angle while the other was at a high angle. The influence was measured through the difference between the individual and group solution. Findings showed that participants with a low angle camera had more influence on group tasks than the participant with the high angle camera.
In this study, in order to explore the impact of kinesics on impression formation, the actors in the video conference screenshots will position their camera at a high-angle, eye-level angle or low-angle.
Drawing on prior literature, the present research examines how impression formation is influenced by three aspects of nonverbal cues in the video conference context: gaze, camera distance, and angle. Our hypotheses and research questions were preregistered at https://osf.io/tyaqh. Our first prediction concerned gaze. As discussed earlier, eye contact can promote the impression of attention, intelligence (Kleinke, 1986; LeCompte & Rosenfeld, 1971; Wheeler et al., 1979), and cooperation (Driver et al., 1999). Moreover, on social media individuals express a greater connection with celebrities who make eye contact in their online posts (Ilicic & Brennan, 2020). Since, in video conferences, the eyes of the participants and the camera are not at the same place, a participant wishing to simulate eye contact with an interlocutor will have to stare at the camera. Hence, we hypothesize the following:
H1: Likeability judgments, social presence, and interpersonal attraction will be the highest when the actor looks at the camera, will be intermediate when the actor looks at the screen and will be lowest when the actor looks away from the computer.
Our next set of predictions focuses on the camera angle as a type of kinesic cue that can affect interpersonal perceptions. Giannetti (2012) described how low camera angles are used to make the person appear tall and dominant. Hence, we hypothesize the following:
H2: Threat judgment will be lower in the Camera Angle High condition than in the Camera Angle Low condition.
We also examined how the camera angle may affect perceptions of trustworthiness. Movie actors are rated as most trustworthy when filmed from eye-level (Baranowski & Hecht, 2018). We examine whether this effect also takes place in video conferencing and hypothesize the following:
H3: Trustworthiness will be higher for Camera Angle Eye-Level condition than for Camera Angle High and Low conditions.
We also examined how camera angle can affect likeability. Prior research suggested that higher camera angles lead to enhanced perceived competence, composure, and sociability compared to low camera angles (e.g., McCain et al., 1977). Hence, we hypothesize the following:
H4: Likeability judgments and interpersonal attraction will be higher for Camera Angle High condition than for Camera Angle Low condition.
Our last prediction focused on the camera distance. Patterson and Sechrest (1970) demonstrated that perceptions of friendliness, extraversion, dominance, and aggressiveness were negatively correlated with interpersonal distance. Åhs et al. (2015) also demonstrated that greater proximity led to enhanced defensive response. Khan and McGaughey (1977) also found a main effect of proximity on interpersonal attraction between participants of different genders. While arguing that threat and likeability are similarly modulated by distance can seem counter-intuitive, Khan and McGaughey (1977) suggested that these correlations can be modulated by the meaning and interpretation of such an interpersonal approach and by whether the approach might foster or challenge one’s goal. Given that it is unclear how a close versus far interpersonal distance will be interpreted when viewing video conference screenshots, we formulated the following hypotheses:
H5: (a) Threat and (b) likeability judgment will be higher in Distance Close condition compared to Distance Far condition.
Burgoon (1978) argued that people use different types of nonverbal cues to regulate an appropriate level of interpersonal immediacy. For example, people increase their interpersonal distance from interactants who engage them in mutual gaze. Based on the equilibrium theory, Aiello (1977, p.122) postulated that “eye contact functions to regulate the comfort of an interaction and is also a response to the degree of interaction comfort; further, comfortable interaction distances promote eye contact and, more importantly, uncomfortable distances diminish it.” Previous work in virtual reality has shown an interaction between gaze and interpersonal distance, with virtual humans who are both close and maintaining mutual gaze as particularly influential (Bailenson et al., 2001). We intend to explore this interaction with video conference screenshots.
RQ1: How will gaze, camera distance, and angle jointly affect impression formation in video conferencing?
Finally, as gender and race may affect interpersonal impressions in video conferences, we intend to explore their role in impression formation.
RQ2: How will gender and race of the participants and the actors impact impression formation when viewing video conference screenshots?
Between March 19 and May 6, 2021, we recruited participants through news media outlets which published a story concerning the work done by the authors on Zoom Fatigue (Fauville et al., 2021a, 2021b) and included a link to this study. A convenience sample of 4,750 individuals provided informed consent and completed the online survey. Data analyses included only responses from participants who passed the attention check question.
Although we planned to have around 1,199 participants, we received 3,892 complete and valid responses in the period that the survey was open to the public. Although high-powered studies usually present a smaller effect magnitude, they are also less noisy and more realistic (Vasishth et al., 2018). Moreover, high-powered studies reduce Type 2 error (Lakens, 2014) and seem to enhance replicability, as they reduce the exaggerated effect sizes usually found in low-powered studies (Camerer et al., 2018). Hence, we included all the responses in the analyses and reported the effect size for all statistical tests.
The data analyses included responses from 3,892 participants, with 66% of women (n = 2,560), 32% of men (n = 1,261), 0.9% identifying as neither woman nor man (n = 36), and 0.9% declining to answer (n = 35). The age ranged between 18 and 99 years old (M = 43.5, SD = 13.1). The distribution of races was: 2.2% of African or African-American or Black (n = 85), 7.3% of Asian or Asian-American (n = 283), 5.7% of Hispanic or Latinx (n = 221), 0.2% of Indigenous or Native American (n = 6), 0.6% of Middle Eastern (n = 25), 0.1% of Native Hawaiian or Pacific Islander (n = 4), 74% of White (n = 2,887), 4.9% of participants identifying with more than one race (n = 191), 3.8% declined to answer (n = 149), and 1.1% identified as a race not listed (n = 41).
Since this study aims to understand how individuals form interpersonal impressions when viewing video conference screenshots, this study measured dependent variables that have been widely used in interpersonal communication and perception research (see Table 1 for the wording of the questions along with the mean, SD, and Cronbach’s αs).
As video conferences give an increased importance to the faces of the participants while hiding the rest of their body, measuring how individuals perceive the social traits of these faces is central to the impression formation. Social perception of faces was measured in the following seven dimensions validated by previous face perception work (Todorov et al., 2013)—attractiveness, competence, dominance, extroversion, likeability, threat, and trustworthiness. Todorov et al. (2013) argued for highly correlated dimensions and developed judgment models by clustering specific dimensions. Drawing on this judgment model, likeability judgment includes the dimensions of attractiveness, competence, extroversion, likeability, trustworthiness whereas the threat judgment includes dominance and threat dimensions. Finally, the social perception of trustworthiness was also analyzed on its own to test H3.
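The grouping of dimensions into the two judgments can be illustrated with a minimal sketch. It assumes that each composite is the unweighted mean of its dimension ratings; the text lists which dimensions belong to each judgment but does not state the aggregation rule, so the averaging, the function name, and the example ratings are all hypothetical.

```python
from statistics import mean

# Dimension groupings as listed in the text (after Todorov et al., 2013).
LIKEABILITY_DIMS = ("attractiveness", "competence", "extroversion",
                    "likeability", "trustworthiness")
THREAT_DIMS = ("dominance", "threat")


def composite_scores(ratings):
    """Return the judgment composites for one participant.

    `ratings` maps each of the seven dimensions to a 1-5 Likert rating.
    Averaging the items is an illustrative assumption, not a documented step.
    """
    return {
        "likeability_judgment": mean(ratings[d] for d in LIKEABILITY_DIMS),
        "threat_judgment": mean(ratings[d] for d in THREAT_DIMS),
        # Trustworthiness is also kept on its own, since it is analyzed separately for H3.
        "trustworthiness": ratings["trustworthiness"],
    }


# Hypothetical ratings for a single participant.
example = {"attractiveness": 3, "competence": 4, "extroversion": 2,
           "likeability": 4, "trustworthiness": 3, "dominance": 2, "threat": 1}
scores = composite_scores(example)
```

Under these assumptions, trustworthiness contributes both to the likeability composite and to its own single-item analysis, which mirrors the description above.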
Video conferences offer a virtual alternative when face-to-face encounters are not possible. Many scholars have investigated how the use of technologies for social interaction impacts social presence, or the sense of being with another (see Oh et al., 2018 for a review). Social presence was adapted from a well-established scale called Networked Minds Measures of Social Presence (Biocca et al., 2001; Harms & Biocca, 2004). The items addressed co-presence (e.g., “I would be aware of the person’s presence”) and attention allocation (e.g., “I would feel like the person pays close attention to me”).
Table 1. Survey Items With Means, SDs, and Cronbach’s αs
Social perception of faces
I think this person is attractive
I think this person is competent
I think this person is extrovert
I think this person is likable
I think this person is trustworthy
I think this person is dominant
I think this person is threatening
If I had a video conference with this person …
I would be aware of the person’s presence.
I would feel that the person is aware of my presence.
The person would catch my attention.
I think I would catch the person’s attention.
I would feel like the person pays close attention to me.
I would pay close attention to the person.
I think the person would remain focused on me throughout our interaction.
I would remain focused on the person throughout our interaction.
I would enjoy completing a task with this person.
I would have fun completing a task with this person.
I would like to interact with this person again.
It would be interesting to complete tasks with this person.
I would like this person.
I would get along with this person.
I would enjoy a casual conversation with this person again.
I would think this person is friendly.
The person you see in the screenshot is one of our researchers. Would you be willing to receive emails from this person for future research studies?
Note. All items are measured on a 5-point Likert-scale ranging from 1 = “Not at all,” 2 = “Slightly,” 3 = “Moderately,” 4 = “Very” to 5 = “Extremely” except for the behavioral measure with “yes” and “no” answer.
Nonverbal cues not only influence impression formation but can reflect someone’s attitude toward the interlocutor such as with interpersonal attraction (Herrera et al., 2020). Interpersonal attraction was measured by items addressing task attraction (e.g., “I would enjoy completing a task with this colleague”) and social attraction (e.g., “I would like this person”) (Davis & Perkowitz, 1979; McCroskey & McCain, 1974).
At the end of the survey, the participants were told that the actor shown in the video conference screenshot was a researcher and were asked if they would like to be contacted by this person for future studies. We operationalized this binary variable as a behavioral measure of social attraction.
A 3 (gaze: camera, screen, off-screen) × 2 (camera distance: close, far) × 3 (camera angle: high, eye-level, low) between-subject design was employed (see Table 2 for a description of nonverbal cues and associated conditions). Participants were randomly assigned to one of the 18 conditions. Each participant first answered demographic questions about age, gender, and race. The participants were then asked to look at a screenshot of an actor in one of the 18 conditions. They were prompted to imagine that they were on a video conference call with the actor and were asked to indicate their perceptions of this person.
To create a database of video conference screenshots, we used a stimulus sampling approach gathering screenshots from 16 actors (two actors from each combination of gender [woman and man] and race [African-American, Asian, Latinx, and White]). The 16 actors were informed of the goal of the study and agreed to the use of their image before taking the photographs. During a video conference, the actors were instructed to iteratively raise or lower their computer (camera angle), to sit closer or further away from the screen (camera distance) and to look at points on or off screen (gaze). A screenshot was taken for each of the 18 conditions. This resulted in a database of 288 screenshots. Rather than strictly controlling the backgrounds of the screenshots, we chose to ask the actors to capture the background they normally used in video conferences. While this results in variation in background content across pictures, the benefit of that variance in a stimulus sampling design is increased generalizability. In a typical stimulus sampling design study, a series of stimuli are presented to participants who are asked to provide some rating based on their judgments (i.e., rating of personality). The stimuli are typically visual images such as pictures, audio or video clips generated by other people. In this way, as explained by Wickham et al. (2021), both the participant and the stimuli constitute a sample of a larger population. Figure 1 illustrates a sample of the screenshots. These screenshots are available for other researchers and for educational purposes at https://osf.io/zrk9m/.
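The size of the stimulus database follows directly from crossing the design factors with the actor pool. The sketch below enumerates both; the level labels come from the text, while the actor identifiers and variable names are hypothetical placeholders.

```python
from itertools import product

GAZE = ("camera", "screen", "off-screen")
DISTANCE = ("close", "far")
ANGLE = ("high", "eye-level", "low")

# 3 x 2 x 3 = 18 nonverbal cue configurations.
CONDITIONS = list(product(GAZE, DISTANCE, ANGLE))

GENDERS = ("woman", "man")
RACES = ("African-American", "Asian", "Latinx", "White")
ACTORS_PER_CELL = 2

# Hypothetical actor identifiers: (gender, race, index within the cell).
ACTORS = [(g, r, i) for g, r in product(GENDERS, RACES)
          for i in range(1, ACTORS_PER_CELL + 1)]

# One screenshot per actor per condition.
SCREENSHOTS = list(product(ACTORS, CONDITIONS))

print(len(CONDITIONS), len(ACTORS), len(SCREENSHOTS))  # 18 16 288
```

Because every actor appears in every condition, actor identity is fully crossed with the manipulated cues.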
The use of still images in lieu of videos provides the experimental advantage of allowing better experimental control along with more precise measurements of gaze direction, camera angle, and distance between the screen and the actor. While this study focuses on impression formation during video conferences, we described above how previous scholars have argued that impression formation requires as little time as 100 ms and even happens when faces are barely visible (Olson & Marshuetz, 2005; Todorov et al., 2009; Willis & Todorov, 2006). In this way, our decision to use screenshots as a way to simulate video conferences and trigger impression formation is supported by both technical and theoretical arguments. Based on the wide literature on impression formation, we argue that our findings from screenshots will apply to video conferences as well.
Table 2. Independent Variables and Conditions
We modeled each dependent variable with fixed effects of camera angle, gaze, and distance, their interactions, and two demographic variables (i.e., race and gender), along with a random intercept for actors. We used the “lmerTest” package in R, which reports Satterthwaite approximations for degrees of freedom and p values. We followed up significant main effects of angle, gaze, and distance, as well as significant interactions, with pairwise comparisons of the fitted means using the “emmeans” package, to locate the significant differences among the levels of each factor. Bonferroni adjustment and the Kenward-Roger degrees-of-freedom method were used for multiple comparisons for both camera angle and gaze. For this purpose, the “emmeans” package draws on the “pbkrtest” package (Parametric Bootstrap Methods for Tests in Linear Mixed Models).
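To make the multiple-comparison step concrete, the sketch below applies a Bonferroni adjustment by hand: each unadjusted p value is multiplied by the number of comparisons in the family and capped at 1. It is a plain-Python illustration with hypothetical p values, not the “emmeans” implementation. A factor with three levels yields three pairwise contrasts.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each p value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


levels = ("camera", "screen", "off-screen")
pairs = list(combinations(levels, 2))  # three pairwise contrasts

# Hypothetical unadjusted p values, one per contrast (not taken from the study).
raw_p = [0.56, 0.0004, 0.00001]
adjusted = dict(zip(pairs, bonferroni(raw_p)))
```

A cap of this kind is one way a clearly non-significant contrast can end up being reported as p > .99.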
Table 3 reports the effect sizes for the significant effects of the independent variables on each dependent variable investigated.
Table 3. Effect Sizes (ηp², 90% CIᵃ) of the Significant Effects (p < .05)
.005 [0, .01]
.002 [0, 0]
.009 [0, .01]
.002 [0, 0]
.005 [0, .01]
.03 [.02, .04]
.008 [0, .01]
.04 [.03, .06]
.04 [.03, .05]
.12 [.11, .14]
.009 [0, .02]
.003 [0, .01]
.006 [0, .01]
Distance × Gaze
.004 [0, .01]
.004 [0, .01]
ᵃ A 90% CI is reported instead of 95% in accordance with Lakens (2013).
We first hypothesized that likeability judgments, social presence, and interpersonal attraction would be the highest in the Gaze Camera condition, intermediate in the Gaze Screen condition, and the lowest in the Gaze Off-Screen condition. Means and standard deviations of each variable for each Gaze condition are shown in Table 4.
Table 4. Means and Standard Deviations of Likeability Judgment, Social Presence, and Interpersonal Attraction by Gaze
The main effect of gaze was significant on likeability judgment, F(2, 3776) = 62.77, p < .001, ηp² = .03, 90% CI [.02, .04]. Pairwise contrasts showed that the Gaze Off-Screen condition led to lower likeability judgment than the Gaze Screen condition, t(3777) = −9.37, p < .001, d = −.37, 95% CI [−.45, −.29], and the Gaze Camera condition, t(3778) = −9.94, p < .001, d = −.39, 95% CI [−.47, −.32]. There was no significant difference between the Gaze Camera and Gaze Screen conditions, t(3778) = .58, p > .99, d = .023, 95% CI [−.05, .10]. H1 was confirmed for likeability judgments.
The effect of gaze was significant on social presence, F(2, 3776) = 262.57, p < .001, ηp² = .12, 90% CI [.11, .14]. Pairwise comparisons showed that participants in the Gaze Camera condition reported higher social presence than participants in the Gaze Screen condition, t(3779) = 4.1, p < .001, d = .16, 95% CI [.08, .24], and the Gaze Off-Screen condition, t(3779) = 21.57, p < .001, d = .86, 95% CI [.78, .94]. There was also a significant difference between the Gaze Screen and Gaze Off-Screen conditions, with the former scoring significantly higher, t(3778) = 17.47, p < .001, d = .70, 95% CI [.61, .77]. Hence, H1 was confirmed for social presence.
The effect of gaze was significant on interpersonal attraction, F(2, 3776) = 71.26, p < .001, ηp² = .04, 90% CI [.03, .05]. Pairwise comparisons showed that participants in the Gaze Screen condition scored the highest for interpersonal attraction, followed by the Gaze Camera condition, with no significant difference between these two conditions, t(3778) = .57, p > .99, d = .02, 95% CI [−.05, .10]. Participants in the Gaze Off-Screen condition scored lower than participants in the Gaze Camera condition, t(3778) = −10.0, p < .001, d = −.40, 95% CI [−.48, −.32], and the Gaze Screen condition, t(3778) = −10.57, p < .001, d = −.42, 95% CI [−.50, −.34], supporting the interpersonal attraction component of H1.
No significant difference between the Gaze Camera and Gaze Screen conditions was found for likeability judgment or interpersonal attraction, indicating similar outcomes whether the actor looks at the camera or at the screen.
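The partial eta squared values reported for the gaze effects can be recovered from each F statistic and its degrees of freedom, using ηp² = (F × df_effect) / (F × df_effect + df_error). The Python sketch below is a reader-side consistency check under that standard identity, not the authors' analysis code; the function name and the dictionary layout are illustrative assumptions.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from a reported F statistic and its dfs."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# F statistics for the main effect of gaze, copied from the text above.
reported_gaze_effects = {
    "likeability judgment": (62.77, 2, 3776),
    "social presence": (262.57, 2, 3776),
    "interpersonal attraction": (71.26, 2, 3776),
}

for outcome, (f_value, df_effect, df_error) in reported_gaze_effects.items():
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{outcome}: partial eta squared = {eta:.2f}")
```

Rounded to two decimals, the recovered values (.03, .12, and .04) agree with the effect sizes reported for the three outcomes.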
We hypothesized that participants in the Camera Angle High condition would report lower threat judgment than those in the Camera Angle Low condition.
There was a significant effect of camera angle on threat judgment, F(2, 3776) = 17.85, p < .001, ηp² = .009, 90% CI [0, .01], with participants in the Camera Angle High condition (M = 2.08, SD = .49) reporting lower threat judgment than those in the Camera Angle Low condition (M = 2.19, SD = .57), t(3777) = −5.4, p < .001, d = −.22, 95% CI [−.29, −.14]. However, the difference between the Camera Angle High and Camera Angle Eye-Level (M = 2.10, SD = .50) conditions was not significant, t(3777) = −.44, p > .99, d = −.02, 95% CI [−.09, .06].
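As a rough check on the size of the high versus low camera angle contrast, Cohen's d can be approximated from the reported means and standard deviations. The sketch below assumes equal group sizes and a simple pooled standard deviation; the reported d values presumably come from model-based estimates, so a small discrepancy is expected. The function name is illustrative.

```python
import math


def cohens_d_equal_n(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Approximate Cohen's d from two group means and SDs, assuming equal group sizes."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Threat judgment: Camera Angle High (M = 2.08, SD = .49) vs. Low (M = 2.19, SD = .57).
d_high_vs_low = cohens_d_equal_n(2.08, 0.49, 2.19, 0.57)
print(f"Approximate d (high vs. low): {d_high_vs_low:.2f}")
```

Under these assumptions the approximation comes out near −.21, close to the reported d = −.22.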
We hypothesized that trustworthiness would be higher for the Camera Angle Eye-Level condition than for the Camera Angle High and Low conditions. Although the effect of camera angle on trustworthiness was significant, F(2, 3776) = 3.29, p = .041, ηp² = .002, 90% CI [0, 0], no significant difference was found between conditions. Trustworthiness scores in the Camera Angle Eye-Level condition (M = 2.75, SD = .94) were not significantly different from scores in the Camera Angle Low condition (M = 2.66, SD = .91), t(3778) = 2.18, p = .09, d = .09, 95% CI [.01, .16], or in the Camera Angle High condition (M = 2.73, SD = .93), t(3778) = −.09, p > .99, d = −.003, 95% CI [−.08, .07]. These results did not support H3.
We hypothesized that likeability judgment and interpersonal attraction would be greater in the Camera Angle High condition than in the Camera Angle Low condition. Means and standard deviations of likeability judgment and interpersonal attraction for the camera angle conditions are shown in Table 5.
Means and Standard Deviations of Likeability Judgment and Interpersonal Attraction by Camera Angle Condition
Although camera angle had a significant effect on likeability judgment, F(2, 3776) = 3.13, p = .043, ηp² = .002, 90% CI [0, 0], no significant difference was found between the scores in the Camera Angle High condition and the Camera Angle Low condition, t(3778) = 2.26, p = .07, d = .09, 95% CI [.01, .17]. H4 was not supported in terms of likeability judgment.
The main effect of camera angle on interpersonal attraction was significant, F(2, 3778) = 10.54, p < .001, ηp² = .005, 90% CI [0, .01]. Participants in the Camera Angle High condition reported significantly higher interpersonal attraction than participants in the Camera Angle Low condition, t(3778) = 4.1, p < .001, d = .16, 95% CI [.08, .24]. Hence, H4 was confirmed in relation to interpersonal attraction. Taken together, these results partially supported H4.
We hypothesized that threat and likeability judgments would be higher when the camera is close to the actor than when it is far. Because the main effect of distance was not significant for likeability judgment, F(1, 3776) = .20, p = .66, ηp² ≤ .001, 90% CI [0, 0], or for threat judgment, F(1, 3776) = 3.32, p = .07, ηp² ≤ .001, 90% CI [0, 0], H5 was not supported. The findings for the five hypotheses are summarized in Table 6.
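Overview of the Hypotheses and Results
H1. Gaze on likeability judgment, social presence, and interpersonal attraction. Prediction: Camera > screen > off-screen. Result: supported.
H2. Camera angle on threat judgment. Prediction: Low > high. Result: supported.
H3. Camera angle on trustworthiness. Prediction: Eye-level > high and low. Result: not supported.
H4. Camera angle on likeability judgment and interpersonal attraction. Prediction: High > low. Result: partially supported (interpersonal attraction only).
H5. Distance on threat and likeability judgments. Prediction: Close > far. Result: not supported.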
For this research question, we extended the investigation beyond the tested hypotheses to examine how gaze, distance to the camera, and camera angle jointly affected impression formation when viewing video conference screenshots. There was a significant interaction effect between distance and gaze on threat judgment, F(2, 3776) = 5.45, p = .004, ηp² = .003, 90% CI [0, .01], and on social presence, F(2, 3779) = 7.71, p < .001, ηp² = .004, 90% CI [0, .01].
Follow-up analyses revealed a simple effect of distance on threat judgment in the Gaze Camera condition (close: M = 2.23, SD = .63; far: M = 2.13, SD = .53), t(3777) = 3.75, p < .001, d = .21, 95% CI [.10, .32], but not in the Gaze Screen (close: M = 2.12, SD = .52; far: M = 2.12, SD = .52) or Gaze Off-Screen conditions (close: M = 2.06, SD = .44; far: M = 2.09, SD = .48), p > .05. This result suggests that a closer distance leads to a higher level of threat judgment only when the actor is looking at the camera. As for social presence, the significant Gaze Camera > Screen > Off-Screen ordering was maintained when the distance was close (Camera: M = 3.36, SD = .82; Screen: M = 3.16, SD = .83; Off-Screen: M = 2.57, SD = .74), but not when the distance was far. Looking at the camera when the distance was far (M = 3.10, SD = .84) did not yield significantly greater social presence than looking at the screen (M = 3.05, SD = .85), t(3779) = 1.39, p = .493, d = .08, 95% CI [−.03, .19].
Finally, we explored how gender and race of the participants and the actors impact impression formation when viewing video conference screenshots.
A significant effect of gender was found on behavior, F(1, 3788) = 12.27, p = .0005, ηp² = .003, 90% CI [0, .01], likeability judgment, F(1, 3776) = 7.39, p = .007, ηp² = .002, 90% CI [0, .01], and threat judgment, F(1, 3776) = 35.59, p < .001, ηp² = .009, 90% CI [0, .02]. Women were significantly more willing to receive an email from the actor, t(3787) = 3.5, p < .001, d = .12, 95% CI [.05, .19], and reported higher likeability judgment, t(3778) = 2.7, p = .006, d = .09, 95% CI [.03, .16], and lower threat judgments than men, t(3777) = −5.98, p < .001, d = −.20, 95% CI [−.27, −.14].
There was also a significant effect of race on behavior, F(9, 3787) = 2.77, p = .003, ηp² = .006, 90% CI [0, .01]. Comparisons between races showed that Asians were less willing to receive an email from the actor than Latinx, t(3785) = −3.87, p = .005, d = −.35, 95% CI [−.53, −.17], and Whites, t(3786) = −3.63, p = .013, d = −.23, 95% CI [−.35, −.11].
This study investigated how nonverbal cues displayed in video conference screenshots can affect impression formation. Participants were presented with a randomly selected video conference screenshot of an actor that varied in gaze, distance from the camera, and camera angle. Participants were asked to imagine attending a video conference with the actor shown in the screenshot and to rate the actor on social perception, interpersonal attraction, and social presence. Results showed that looking at the camera increased social presence, likeability judgments, and interpersonal attraction compared to looking at the screen or off-screen. Overall, gaze had larger effects on various dimensions of impression formation than distance to the camera and camera angle, although the effects were generally small. This finding suggests that gaze can be a powerful nonverbal cue in video conferences that may overshadow other cues such as camera angle and distance to the camera. In particular, gaze had a medium-sized effect on social presence, suggesting the power of gaze in predicting the perceived presence of a partner in the video conference. Gaze also had a small effect on interpersonal attractiveness and trustworthiness.
While these effect sizes are very small, seemingly small effects can quickly accumulate over time. Many people spend a lot of time on video conferences, meeting many people over the course of a week, a month or a year. Funder and Ozer (2019, p. 156) argued that “an effect-size r of .05 indicates an effect that is very small for the explanation of single events but potentially consequential in the not-very-long run.” In other words, the effect of gaze on trustworthiness is small and will not have much impact on how people perceive you in a single video conference. Over many video conferences, however, our findings suggest that gaze will influence how you are perceived by others. Moreover, even small effect sizes are impressive when the manipulations between conditions are relatively subtle as they were in this study.
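To place the gaze effects on the correlation scale used in the Funder and Ozer (2019) quotation, Cohen's d can be converted to r with the standard formula r = d / sqrt(d² + 4). This is a minimal sketch that assumes equal group sizes, which is only an approximation for this design; the function name is illustrative, and the d values are the social presence contrasts reported in the results.

```python
import math


def d_to_r(d: float) -> float:
    """Convert Cohen's d to a correlation r, assuming two groups of equal size."""
    return d / math.sqrt(d ** 2 + 4)


# Social presence contrasts for gaze, taken from the results section.
contrasts = {
    "Gaze Camera vs. Gaze Screen": 0.16,
    "Gaze Camera vs. Gaze Off-Screen": 0.86,
}

for label, d in contrasts.items():
    print(f"{label}: d = {d:.2f}, r = {d_to_r(d):.2f}")
```

Under the equal-group-size assumption, d = .16 corresponds to r of about .08 and d = .86 to r of about .40, so even the smallest significant gaze contrast exceeds the r = .05 threshold cited in the quotation.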
These findings confirm the importance of gaze direction in interpersonal impression formation when viewing video conference screenshots. Previous research has demonstrated the importance of gaze direction in communicating attentiveness, intelligence (Kleinke, 1986; LeCompte & Rosenfeld, 1971; Wheeler et al., 1979), authenticity (de Wolf & Li, 2020), and cooperation (Driver et al., 1999) in face-to-face settings. In video conferences, the perception of gaze is distorted by the technological settings as inconsistency occurs between the direction of the partners’ gaze and the remote partner’s perception of eye contact (Fuchs et al., 2014). Because the camera is not collocated with the image of another person’s face on the screen, if a participant looks at the screen in order to gaze at their partner, this person may see the participant as averting eye contact. Alternatively, if the participant stares directly at the camera, they will lose access to the visual cues and the partner will, in this case, interpret the situation as if the participant was making eye contact. This lack of mutual gaze synchrony has long been considered a barrier to virtual communication (Olson & Olson, 2000).
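The geometry behind this mismatch can be sketched with basic trigonometry: when a person looks at the partner's on-screen face instead of the camera, their gaze is offset from the camera axis by roughly atan(offset / viewing distance). The numbers below (a camera 8 cm above the on-screen eyes, a viewer 60 cm from the display) are hypothetical values chosen only for illustration and are not measurements from this study.

```python
import math


def gaze_offset_degrees(camera_offset_cm: float, viewing_distance_cm: float) -> float:
    """Angle between looking at the camera and looking at a point on the screen."""
    return math.degrees(math.atan2(camera_offset_cm, viewing_distance_cm))


# Hypothetical setup: camera 8 cm above the partner's on-screen eyes,
# viewer seated 60 cm from the display.
offset = gaze_offset_degrees(8.0, 60.0)
print(f"Apparent gaze offset: {offset:.1f} degrees")
```

With these illustrative values the offset is roughly 7.6 degrees. Under this simple model, the offset shrinks as the on-screen face moves closer to the camera or as the viewer sits farther from the display.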
In addition, as discussed earlier, camera angle has been a common strategy to convey interpersonal meaning. For example, a low camera angle is used to make the actor appear tall and dominant (Giannetti, 2012), while a high camera angle diminishes the status of the actor and communicates vulnerability (Boorstin, 1991). Consistent with this line of research, we observed a higher level of threat judgment when the camera angle was low than when it was high. However, inconsistent with prior research showing that an eye-level camera angle can facilitate trust among participants (Baranowski & Hecht, 2018), our data showed no difference in trustworthiness between the eye-level and the low or high camera angles. Contrary to past studies showing positive associations between high camera angles and perceived competence, composure, and sociability (e.g., McCain et al., 1977), this study revealed null results for the camera angle effects on interpersonal impressions when viewing video conference screenshots, such as likeability, interpersonal attraction or social presence.
Finally, this study found null results for the effect of distance on likeability and threat, which is inconsistent with previous research in the context of face-to-face communication (Åhs et al., 2015; Khan & McGaughey, 1977; Patterson & Sechrest, 1970). Khan and McGaughey (1977) argued that the association between position and likeability can be moderated by the meaning and the interpretation of the interpersonal interaction. The prompt shared with the participants did not specify the nature of this interaction beyond the fact that the screenshot represented a video conference with a colleague. This lack of clear context might explain why distance did not influence likeability and threat judgments.
For exploratory purposes, we first examined how nonverbal cues can jointly affect impression formation. Our data revealed a significant interaction between distance and gaze on threat judgment such that being close to the camera was perceived as more threatening when a partner's gaze was at the camera. This result suggests that an actor whose face was large, simulating being close to the camera, while staring right at it might signal a violation of equilibrium theory. As argued by Argyle and Dean (1965), in order to keep intimacy at a comfortable level, individuals can compensate for an uncomfortable proximity with gaze aversion. This is what happens in elevators, where people are forced to stand too close to each other and compensate for this unwanted intimacy by averting eye contact. This finding conceptually replicates Bailenson et al. (2001), who examined the interaction between distance and gaze in immersive virtual reality and demonstrated that people stayed farther away from virtual agents who stared at them than from virtual agents who avoided their gaze, and also demonstrated a higher social presence in the gazing condition compared to the avoidance condition. Finally, we found gender differences in likeability and threat judgment during video conferences, controlling for nonverbal cues, with women reporting higher likeability judgment and lower threat judgments than men.
As an increasing part of our social interactions takes place in video conferences, it is important to understand how nonverbal cues in video conferences can shape impression formation among participants. Through the use of video conference screenshots as a proxy for video conference interaction, our research highlights the impact of gaze, camera angle, and distance from the camera on various dimensions of impression formation. Results suggest that looking at the camera instead of the screen when possible, avoiding large eye movements toward objects off-screen, and using a high camera angle may improve interpersonal impressions.
Our approach of using screenshots of actors rather than actual interactions during video conferences may limit the generalizability of our findings. Future research can control nonverbal cues in actual video conferences to test the robustness of the findings. One can also wonder how the percentage of time in which actors engage in these nonverbal behaviors (e.g., gazing off-screen or positioning themselves far away from the camera) might influence the impressions formed by the other participants. Moreover, this study explored two distances from the camera and three angles, which represent a small portion of the existing range. In future studies, we will explore a wider range of camera angles and distances from the camera.