Dogs (Canis familiaris) Gaze at Our Hands: A Preliminary Eye-Tracker Experiment on Selective Attention in Dogs

Simple Summary

Dogs seem to communicate with humans successfully by extracting social information from body signals such as hand signs. However, less is known regarding how dogs visually attend to human body signals, including hand signs. The objective of this pilot study was to reveal dogs' social visual attention tuned to inter-species communication with humans by comparing gazing patterns toward the whole bodies of humans, dogs, and cats. Pictures showing humans with or without hand signs, dogs, and cats were presented on a liquid crystal display monitor, and the gazing behaviors of subject dogs toward these pictures were recorded by an eye-tracking device. The subjects gazed at human limbs more frequently than at limbs within conspecific and cat images, where attention was focused on the head and body. Furthermore, gaze toward hands was greater in the human hand sign photos relative to photos where human hand signs were not present. These results indicate that dogs have an attentional style specialized for human non-verbal communication, with an emphasis placed on human hand gestures.

Abstract

Dogs have developed a social competence tuned to communication with humans and acquire social information from body signals as well as facial expressions. However, less is known regarding how dogs shift attention toward human body signals, specifically hand signs. Comparison among the visual attentional patterns of dogs toward the whole bodies of humans, conspecifics, and other species will reveal dogs' basic social competences and those specialized for inter-species communication with humans. The present study investigated dogs' gazing behaviors in three conditions: viewing humans with or without hand signs, viewing conspecifics, and viewing cats. Digital color photographs were presented on a liquid crystal display monitor, and subject dogs viewed the images while their eyes were tracked.
Results revealed that subjects gazed at human limbs more than limbs within conspecific and cat images, where attention was predominately focused on the head and body. Furthermore, gaze toward hands was greater in the human hand sign photos relative to photos where human hand signs were not present. These results indicate that dogs have an attentional style specialized for human non-verbal communication, with an emphasis placed on human hand gestures.


Introduction
Dogs' visual cognition has long been a popular research topic. During their long history of domestication, dogs have developed social skills tuned to humans (for a review, see [1]). For example, in an assessment of visual social cognition, Pitteri et al. [2] described that dogs can discriminate between [...]
Subject table footnotes: a The dogs who passed the calibration procedure and participated in the experiment. b The dogs who did not pass the calibration procedure and were omitted from the experiment. c Cavalier King Charles Spaniel × mixed. d The breed was not identified because the dog was a shelter dog and was assumed to be a golden retriever. e Chihuahua × Dachshund.

Apparatus and Experimental Setup
The apparatus included an eye-tracking system (model ETL-300-HD, ISCAN, USA), which consisted of two computers, two color displays, and an eye-tracking camera. One computer and display were used to run the eye-tracking system and record eye movement data. The other computer and display (43.2 × 32.4 cm, 1600 × 1200 px; model S2133-HBK, Eizo Nanao, Ishikawa, Japan) were used for stimulus presentation. The camera obtained images of each subject's head and eyes, emitting infrared light in order to track the subjects' monocular pupil and corneal reflection at 60 Hz.
All assessments were conducted in an experimental room at the School of Veterinary Medicine, Kitasato University, by one operator and two handlers (Figure 1). During the experiment, the dogs were released from the leash and could move around the room freely. The handlers guided the dogs to lie down on the floor in front of the presentation display; once the dogs complied, they were rewarded with a treat. The distance between each dog's head and the display was approximately 50 cm. During the calibration and experimental presentation, each dog was positioned between the two handlers, who stayed close to the dog. One of the handlers instructed each dog to put his/her jaw on a chinrest. The other handler lightly held the dog's body if needed. The operator, in the same room, controlled the two computers. During the experimental procedure, the owners were absent from the room.

Stimuli
Digital color photographs comprising three stimulus species (humans, dogs, and cats) were used in the experiment (Figure 2). For each species, 12 images per subject were presented. All pictures were 1600 × 1200 px and consisted of single, whole-body images of an individual on a gray background. The human pictures consisted of 3 novel individuals with a hand sign, 3 novel individuals without a hand sign, 3 familiar individuals with a hand sign, and 3 familiar individuals without a hand sign. A familiar person was defined as someone who had spent time with the subject dog for longer than one day, including the owners, operator, and handlers. Hand signs used commonly with the subjects ("hand", "sit", or "stay") were depicted. All human pictures were front facing, upright, and laterally symmetric, except for a hand showing a sign, to avoid visuospatial attentional bias [19]. Each human model wore a white laboratory coat for consistency and to avoid color distraction. The human sex was counterbalanced. All dog/cat pictures were novel to the subject dogs and were taken from a front/lateral side with a natural posture. Dog and cat breeds varied across the images. We constructed 6 sets of stimuli, comprising 6 pictures each, with at least one picture from each stimulus species.
Figure 1. Subject dogs lay down in front of an eye-tracking camera and display. During calibration and the experimental session, the subject put his/her jaw on a purpose-designed chin rest.


Experimental Procedures
First, a 5-point calibration was conducted for each subject. Five black crosses were presented at the center, upper left, upper right, bottom left, and bottom right of the monitor on a gray background. A handler pointed to the crosses in that order using a treat. When the handler confirmed that the subject gazed at the cross, a vocal cue was provided to the operator in order to record the calibration point. This was repeated for the remaining 4 crosses. The calibration was then checked with a validation procedure: the handler pointed to the crosses again, and the operator confirmed through the eye-tracker system that the gaze points fell on the crosses. The calibration was repeated until all gaze points were validly recorded.
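To illustrate the mapping such a calibration establishes, the sketch below fits a per-axis linear transform from raw pupil coordinates to screen coordinates by least squares. This is a simplified stand-in, not the ISCAN system's actual algorithm, and all coordinate values are hypothetical:

```python
def fit_axis(raw, screen):
    """Least-squares fit of screen = gain * raw + offset for one axis."""
    n = len(raw)
    mean_r = sum(raw) / n
    mean_s = sum(screen) / n
    cov = sum((r - mean_r) * (s - mean_s) for r, s in zip(raw, screen))
    var = sum((r - mean_r) ** 2 for r in raw)
    gain = cov / var
    return gain, mean_s - gain * mean_r

# Hypothetical raw pupil x-coordinates recorded while the dog fixated the
# five calibration crosses, paired with the known cross x-positions (px):
raw_x = [310, 120, 500, 118, 502]    # center, UL, UR, LL, LR
scr_x = [800, 100, 1500, 100, 1500]
gain_x, offset_x = fit_axis(raw_x, scr_x)

def raw_to_screen_x(rx):
    """Map a raw horizontal pupil coordinate to a screen x-coordinate."""
    return gain_x * rx + offset_x
```

The vertical axis would be fitted the same way, and the validation step then checks that the mapped gaze points fall on the crosses.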
After the calibration, the experimental session began. Each subject performed 6 sessions of the task. Each session began with a blank gray screen. When the subject gazed anywhere on the screen (not limited to any specific point, to avoid biasing the first fixation point analysis described below), one of the handlers provided a vocal cue for the operator to start the stimulus presentation. A picture was then presented for 1.5 s, followed by a blank screen for 0.5 s. This routine was repeated six times and progressed automatically by a computer program assembled for this experiment. Image presentation order was randomized. After each session was completed, the subject was rewarded with a treat, regardless of whether the dog had gazed at the screen. Each subject was then allowed to move freely around the room to avoid stress or habituation. Before the next session started, the subject was guided to lie down at the same position and the calibration was conducted again.
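The trial timing described above (a 1.5 s picture followed by a 0.5 s blank, six pictures in random order) can be sketched as a presentation schedule; the picture names are hypothetical placeholders:

```python
import random

STIM_S, BLANK_S = 1.5, 0.5  # stimulus and inter-stimulus intervals (s)

def build_session(pictures, seed=None):
    """Return (picture, onset_s) pairs for one session: the pictures in
    random order, each shown for 1.5 s followed by a 0.5 s blank."""
    rng = random.Random(seed)
    order = list(pictures)
    rng.shuffle(order)
    schedule, t = [], 0.0
    for pic in order:
        schedule.append((pic, t))
        t += STIM_S + BLANK_S
    return schedule

sched = build_session(["human1", "human2", "dog1", "dog2", "cat1", "cat2"],
                      seed=1)
```

A six-picture session thus spans 12 s of presentation time, with the last onset at 10 s.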
This work complied with the laws of Japan, and the experiment was approved by the President and Institutional Animal Care and Use Committee of Kitasato University (Approval no. 15-067).

Data Analyses
We divided each picture into several features (areas of interest; AOI) to quantitatively analyze the subjects' gaze patterns. For the analysis comparing the three species and that comparing human familiarity, the human/dog/cat within each picture was divided into four AOIs: head, body, buttocks, and limbs (Figure 2a-c). For human pictures, the "body" was the upper third of the area from the neck to the feet, because the dividing bounds between body parts were obscured by the white lab coat. The middle third was included in the buttocks, and the bottom third and arms were included in the "limbs" AOI. For dog and cat pictures, the buttocks included the tail, and the forelimbs and hindlimbs were combined into the limbs AOI. For the analysis comparing stimuli with and without a hand sign, only the gaze at hands was incorporated (Figure 2d). To avoid errors in gaze estimation, the AOI frame was drawn larger than the actual outline (approximately 50 pixels on the edge). From the monocular gaze data, the number of fixations and total fixation durations within each AOI per presentation were calculated (Figure 3). For each picture, the AOI to which the first fixation was directed (hereafter, first gazed AOI) was also analyzed: for each AOI, the number of pictures in which the first fixation was directed to it was counted.
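A minimal sketch of how fixation counts and total durations per AOI could be computed from fixation data, assuming rectangular AOIs expanded by the ~50 px margin mentioned above (the study traced AOIs around the actual body outlines, so rectangles and all coordinates here are simplifications):

```python
def in_aoi(x, y, aoi, margin=50):
    """True if a gaze point falls inside the AOI rectangle, expanded by
    the margin on each edge to absorb gaze-estimation error."""
    left, top, right, bottom = aoi
    return (left - margin <= x <= right + margin and
            top - margin <= y <= bottom + margin)

def summarize_fixations(fixations, aois):
    """Count fixations and sum durations per AOI.
    fixations: list of (x, y, duration_s); aois: name -> (l, t, r, b).
    Fixations outside every AOI are ignored."""
    counts = {name: 0 for name in aois}
    durations = {name: 0.0 for name in aois}
    for x, y, dur in fixations:
        for name, rect in aois.items():
            if in_aoi(x, y, rect):
                counts[name] += 1
                durations[name] += dur
                break  # assign each fixation to at most one AOI
    return counts, durations

# Hypothetical AOIs and fixations for one 1600 x 1200 px picture:
aois = {"head": (700, 100, 900, 300), "limbs": (600, 800, 1000, 1100)}
fixations = [(800, 200, 0.2), (650, 900, 0.3), (660, 850, 0.1), (100, 100, 0.5)]
counts, durations = summarize_fixations(fixations, aois)
```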
The fixation number and total fixation duration for each AOI were analyzed using a Generalized Linear Mixed Model (GLMM) (lmer, lme4 library, freeware package R, Version 2.14.2; R Development Core Team, 2012). The models were constructed using a Poisson distribution for fixation number data, as these were non-negative count data. A Gaussian distribution was used for analyzing total fixation duration, as these were non-negative continuous data [20]. One data point indicated a fixation number or total fixation duration for each picture for each subject. Stimulus species (human, dog, or cat) and AOI (head, body, buttocks, limbs) were set as fixed factors. For the stimulus familiarity model, familiarity (familiar or novel) and AOI were set as fixed factors. In the hand sign analysis, the hand sign condition (with or without hand signs) and AOI were set as fixed factors. Subject was included as a random factor to control for individual differences. The models with and without the target fixed factor were compared based on the Akaike Information Criterion (AIC [20,21]). Model significance was tested using a likelihood ratio test. For multiple comparisons of the fixed factors, models that contained every combination of the fixed factors in the different groups were also compared. The number of pictures categorized into each first gazed AOI was analyzed using a chi-square test; if significant, a standardized residual post-hoc analysis was conducted to investigate any difference from chance frequency.
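The model-comparison logic can be illustrated with the AIC formula (AIC = 2k − 2 ln L) and a likelihood-ratio test. The log-likelihood values below are hypothetical, not the study's fitted models, and the erfc-based p-value is valid only for a 1-df comparison:

```python
import math

def aic(log_lik, k):
    """Akaike Information Criterion: AIC = 2k - 2 ln L; lower is better."""
    return 2 * k - 2 * log_lik

def lr_test_p_1df(loglik_full, loglik_reduced):
    """p-value of a likelihood-ratio test for nested models differing by
    one parameter (1 df): the chi-square(1) survival function equals
    erfc(sqrt(stat / 2))."""
    stat = 2 * (loglik_full - loglik_reduced)
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical fits: full model (5 parameters) vs. reduced model (4).
full_ll, reduced_ll = -80.0, -83.0
aic_full, aic_reduced = aic(full_ll, 5), aic(reduced_ll, 4)
p = lr_test_p_1df(full_ll, reduced_ll)  # lower AIC -> full model preferred
```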

Species
The most gazed AOI differed between the stimulus species in the fixation number analysis. The fixation number and total fixation duration within each AOI for all three stimulus species are shown in Figure 4a,b, respectively. In the fixation number analysis, the models that included both the stimulus species and AOI showed significantly lower AICs than the models that did not include these factors (Table 2). In the fixation duration analysis, the models that included only the AOI showed significantly lower AICs than the model that included both the stimulus species and AOI. For multiple comparisons of both the fixation number and total fixation duration within the human pictures, the models that placed the "limbs" AOI in a different group from the other AOIs had the lowest AICs (164.4 and 16.08). For the dog pictures, the models that placed the "head" AOI in a different group from the other AOIs showed the lowest AICs (169.5 and 14.85). For the cat pictures, the models that placed the "head" and "body" AOIs in a different group from the other AOIs showed the lowest AICs (155.5 and 51.11). The number of pictures categorized into each first gazed AOI is shown in Figure 4c. In all stimulus species, the number of pictures in each first gazed AOI differed significantly (human: χ² = 10.00, p = 0.019; dog: χ² = 14.00, p = 0.003; cat: χ² = 15.00, p = 0.002). For the human pictures, the number of pictures where the first gazed AOI was the "body" was significantly less (p = 0.04), and the "limbs" significantly greater (p = 0.003), than chance. For the dog pictures, the first gazed AOI was the "head" significantly more often (p = 0.006), and the "limbs" significantly less often (p = 0.02), than chance. For the cat pictures, the first gazed AOI was the "body" significantly more often (p = 0.001), and the "buttocks" significantly less often (p = 0.014), than chance.
These results suggest that the subjects predominantly oriented toward the "limbs" in human pictures, "head" in dog pictures, and "head" and "body" in cat pictures.
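The first gazed AOI analysis above can be sketched as a chi-square goodness-of-fit test against equal chance frequencies, followed by standardized residuals for the cell-wise post-hoc check; the observed counts below are hypothetical, not the study's data:

```python
import math

def chi_square_gof(observed):
    """Chi-square goodness-of-fit against equal (chance) frequencies,
    plus standardized residuals (O - E) / sqrt(E * (1 - 1/k)) for the
    cell-wise post-hoc comparison against chance."""
    n = sum(observed)
    k = len(observed)
    expected = n / k
    stat = sum((o - expected) ** 2 / expected for o in observed)
    residuals = [(o - expected) / math.sqrt(expected * (1 - 1 / k))
                 for o in observed]
    return stat, residuals

# Hypothetical first-gaze counts for four AOIs (head, body, buttocks, limbs):
stat, residuals = chi_square_gof([10, 2, 4, 4])
```

A residual beyond roughly ±1.96 marks a cell that deviates from chance at the 5% level.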

Human Familiarity
For human pictures, gaze behavior did not differ between familiar and novel individuals. The fixation number and total fixation duration in each AOI for familiar and novel human pictures are shown in Figure 5a,b, respectively. Models that included both familiarity and AOIs did not show a significantly lower AIC than the model that only included the AOIs (Table 2). The number of pictures categorized into each first gazed AOI is shown in Figure 5c. For both the familiar and novel pictures, the number of pictures did not significantly differ between the first gazed AOIs (familiar: χ² = 4.54, p = 0.21; novel: χ² = 7.33, p = 0.062). These results suggest that familiarity did not affect the dogs' gaze behavior.
Table 2 footnote: a AIC = Akaike Information Criterion, an index used to compare the fitted models; the model with the lower AIC is preferred.


Hand Sign
The subject dogs' gaze behavior differed between the human pictures with and without a hand sign. The fixation number and total fixation duration to the hands in the pictures with and without a hand sign are shown in Figure 6a,b, respectively. For fixation number, the models that included AOIs had significantly lower AICs than models that did not include AOIs (Table 2), although the AICs did not differ significantly in the total fixation duration analysis. The number of pictures in which the "hands" versus other body parts were gazed at first is shown in Figure 6c. For pictures with a hand sign, this number did not differ significantly between the "hands" and the other body parts (χ² = 1.33, p = 0.248). For pictures without a hand sign, the number of pictures in each first gazed AOI differed significantly (χ² = 9.31, p = 0.002): the number of pictures where the "hands" were gazed at first was significantly less (p = 0.002), and gaze toward other parts of the body significantly greater (p = 0.002), than chance. These results suggest that the dogs gazed at the hands more when a hand sign was present.

Discussion
Dogs' gaze behavior differed depending on the species presented. The fixation number analyses revealed that the dogs gazed more toward human limbs, while gazing more toward dog heads and toward cat heads and bodies. The first gazed AOI analysis showed similar results for the human and dog pictures. For the cat pictures, subjects first gazed toward the body. The notable gaze toward human limbs relative to the dog and cat pictures suggests that dogs have acquired an attentional specialization for non-verbal human limb gestures that differs from the attentional pattern they show toward non-human species. In contrast to these analyses, the total fixation duration did not differ among species, suggesting that the subjects repeatedly directed shorter gazes toward the AOIs with higher fixation counts, but longer gazes toward the other AOIs. A fixed stare is interpreted as an expression of threat, while gaze alternation is used in cooperative and affiliative interactions [22,23]. The AOIs with higher fixation counts, such as human limbs, dog heads, and cat heads, might be affiliative signals for dogs. Conversely, when viewing other dogs, the head and face may provide more communicative information. Contrary to our expectations, the subjects gazed toward other dogs' tails less frequently than the other AOIs. A dog's tail movement can be modulated by the dog's emotional state [24]. Although this modulation may be unintentional, dogs are sensitive to other dogs' tail expressions [25]. However, the present study used still images that lacked information regarding tail movement, which may be why we failed to observe significant attention toward tails. It should also be noted that our AOIs differed in size, as it was our intent to present pictures depicting natural objects. Interestingly, the subjects gazed more toward the smaller AOIs (e.g., dog head) than the larger AOIs (e.g., dog limbs), indicating that gaze behavior was non-random and contingent on AOI saliency.
In other words, the dogs changed their gaze behavior depending on which features were most visually attractive within the particular species presented.
For pictures depicting familiar and novel humans, the fixation number, total fixation duration to each AOI, and first gazed AOIs did not differ significantly. These results indicate that dogs did not change their gaze behavior as a function of stimulus familiarity. There are two potential explanations for this. One possibility is that dogs simply do not discriminate between familiar and unfamiliar humans. Alternatively, dogs might discriminate familiarity, but this does not impact their general gaze behavior. Adachi et al. [26] demonstrated that pet dogs have a cross-modal (voice-face) representation of their owners. Somppi et al. [13] also tested whether dogs gaze differently toward personally familiar pictures and found that familiar faces attract attention more than unfamiliar ones. This evidence indicates that dogs are able to discriminate novel from familiar human faces. However, in our study, the human stimulus features may have impacted performance. For instance, the images were relatively small, and each human model wore the same white lab coat. This may have made it difficult for the dogs to discriminate the novel from the familiar. Hence, differences in gaze patterns over the entirety of a human image within novel and familiar stimuli should be tested in future studies, particularly with a head-mounted eye-tracker: Williams et al. [27] examined gaze patterns toward real people or life-sized images projected on a screen (see also [28]).
The fixation number varied significantly as a function of the presence of a human hand sign. Here, the dogs gazed more toward human hands when a hand sign was present than when it was not. This was also the case for the first gazed AOI. However, total fixation duration did not differ significantly between the hand sign conditions, although a trend-level tendency in line with the other gaze analyses was observed. These results suggest that a dog's attention is enhanced when non-verbal hand signals are present, indicating that hand signs serve as a reliable communication aid for pet dogs. Humans and chimpanzees gaze more at human faces than at other body parts [6], and Hattori et al. [29] revealed that humans and chimpanzees gaze less at human hands even when the hands convey intentional signs. The discrepancy between these results on humans and chimpanzees and our results on dogs could be due to the long phylogenetic distance between dogs and these primates. During the process of domestication, dogs might have acquired specialized attentional patterns to discern social information from a human hand sign.
This study aimed to clarify dogs' attentional patterns to the signals produced by human hands by comparing visual gaze toward still pictures of humans, dogs, and cats. The subjects were pet dogs, because any potential influence of pre-training for tasks other than this experiment had to be excluded. This limited the number and breeds of the subject dogs, but the present study could still provide useful insights into the dogs' domestication process. Following previous eye-tracking studies in dogs, this study used still pictures as experimental stimuli, which favorably highlighted dogs' selective attention to human hand signs. However, human bodily signals are usually expressed with movement, which the still images lacked. Beyond the information presented in the experimental stimuli, body movement might be important for dogs when communicating with humans. Further experiments with movie stimuli may enable deeper discussion of dogs' visual cognition specialized for communication with humans.
The present experiment revealed that dogs gazed most notably toward human limbs, and that this gaze differed depending on the presence of a hand sign. It appears that dogs have acquired an attentional style that differs between human and non-human species and that enables the collection of information for adequately communicating with the people living in their community. Thus, the present study empirically demonstrates that a dog's cognitive abilities can be adjusted to accommodate life with humans.