Eye Tracking Study of Social Intensity on Social Orientation of Autistic Children

Previous studies indicate that the social disorders of autistic children result mainly from impaired social attention. Within the social attention of autistic children, social orientation and joint attention are particularly important, and the influence of social intensity and ecological validity on them is worthy of further study. This study used realistic paintings with moderate ecological validity as experimental materials, designed an isolated individual scene and a social interaction scene, and explored the impact of social interaction on the social orientation of autistic children. In the scene without social interaction, the attention patterns of autistic children and typically developing children were the same, whereas in the scene with social interaction the attention patterns of autistic children were abnormal. The eye tracking data showed that the gaze processing of autistic children was not as smooth as that of typically developing children. Compared with cartoons and other social scenes with low ecological validity, realistic paintings better preserve the proportions of real scenes. Moreover, they reduce the complexity of information, which cannot be done in real scenes. The findings of this study provide support for the training and education of autistic children. Intervention with realistic paintings is conducive to transferring what autistic children learn to real scenes.


Introduction
Autism spectrum disorder (ASD) is a neurodevelopmental disorder. The incidence of ASD has been on the rise since Kanner first described the symptoms in the 1940s [1]. According to the ASD report issued by the Centers for Disease Control and Prevention (CDC) in 2021, one in every 44 children in the United States has ASD, and the prevalence is independent of race [2]. ASD has received increasing attention from society and has become an active research field.
The Diagnostic and Statistical Manual of Mental Disorders (DSM-5), published in 2013, put forward two major symptoms of ASD: social communication and social interaction defects, and restricted and repetitive behavior patterns and interests or activities. Among them, social disorder is one of the core disorders of ASD. Studies have found that these social disorders are mainly caused by impaired social attention [3].
Social attention refers to the processing and treatment of social information, including awareness of people and things in social situations, social orientation, visual search, and joint attention [4]. Social orientation refers to the ability of individuals to autonomously perceive and observe social stimuli occurring in the environment, such as characters, faces, eyes, and body postures [5]. Infants, preschool children, and adults all have social attention orientation. Compared with typical developing (TD) children, autistic children have certain impairments in their social attention and orientation [6].
For non-social stimuli, such as buildings and objects, autistic children and TD children have similar cognitive behaviors [7]. As for social stimuli, such as faces, eyes, and body movements, autistic children and TD children have different visual attention patterns and cognitive performances [8][9][10][11][12].
Experiments with human faces, pioneered by Yarbus [13], found that TD individuals had an attentional preference for faces [14], while autistic individuals tended to show "face avoidance" [15]. The two groups differ in facial perception and facial gaze, as well as in their processing strategies for the internal feature areas of the face. In experiments with human face images, TD individuals prefer to look at the eyes, while autistic individuals often show a viewing preference of "more mouths and fewer eyes" [16]. This phenomenon also exists when autistic individuals observe human faces in natural situations [16]. Some researchers believe that when gazing at a face, autistic individuals adopt a "simple mode", using parts of the brain to process the face [17], while TD individuals use a "social mode" [18].
Studying a single face strips away the character's body and other biological information, as well as non-biological information such as objects in the real scene, and so cannot simulate the actual living environment well. To address this problem, researchers have used stimulus materials with social context. The results show that, in pictures containing people and objects, autistic individuals have a low gaze preference for people and gaze longer at objects [18][19][20]. Yarbus' research shows that people look at the eyes first when a human face is displayed alone. However, the eyes are not selected first when the face and the body are present at the same time. Compared with TD individuals, autistic individuals pay less attention to faces in the scene and more attention to non-social parts (such as body shadows) [20]. In social situations, autistic children spend less time looking at the eye area and more time looking at the body area, and they show a lower preference for social cues (gaze direction) than TD children [21].
Observers follow social cues such as head orientation [22], gestures [23], and other directional cues [24], and direct their attention to the same direction or target [25,26]. This process is called joint attention [27]. Social orientation is a prerequisite for joint attention. Studies have suggested that the impaired joint attention of autistic children is related to their ability to orient to social stimuli [5].
According to studies, the factors that affect the social attention of autistic children include the familiarity with the characters in the stimulus material, the tasks, the age characteristics of the scene, the social intensity of the scene, and the ecological validity.
People generally recognize familiar faces faster and more accurately than the faces of strangers [14]. There is no significant difference between autistic children and TD children in gaze patterns on familiar faces, whereas on unfamiliar faces the gaze patterns of autistic children differ significantly from those of TD children [28]. TD children process acquaintances and strangers differently. In contrast, autistic individuals show no obvious difference in gaze patterns between familiar and unfamiliar faces [29]. Studies by Yarbus, Birmingham, and others have shown that the distribution of social attention is highly task-dependent. The age of the characters in the scene also affects the gaze patterns of autistic children. Autistic children spend a larger proportion of viewing time on the face, eye, and mouth areas of interest in children's scenes than in adult scenes, while TD children show the opposite pattern. Compared with adult scenes, autistic children tend to look at faces and backgrounds in children's scenes [30].
Studies have shown that both autistic individuals and TD individuals pay more attention to scenes with people than to scenes without people [31]. When people are present, the intensity of social information can be measured by activities and social behaviors. Some scholars also use the number of people in the scene to control the intensity of social information, which in essence controls the complexity of the social behavior in the scene. The essential difference between a one-person scene and a multi-person scene is that the one-person scene contains no social behavior, so the intensity of social information is weak. In that case, autistic individuals and TD individuals show similar social attention performance [32,33]. Birmingham [25] manipulated the activities in complex real-world scenes and the presence or absence of social behavior (1 person vs. 3 people). The findings demonstrate that, as the intensity of social information increases, TD adults gaze more at the eyes in the scene. The research of Rigby et al. on autistic adults showed that as social behavior increased (the number of people in the scene), gaze on the face gradually decreased and gaze on the body and on the background area outside the person gradually increased [34].
The ecological validity of the scene also affects the social attention of the observer [35,36]. The closer a scene is to real life, the higher its ecological validity. Current research on visual search favors real scenes in order to improve the ecological validity of experiments. However, real scenes contain a large amount of complex information, which makes it difficult for autistic children to process social stimuli. Researchers have therefore simplified real scenes and used cartoon scenes and stick figure scenes as stimulus materials. The social attention performance of autistic children is better with such materials, but the depictions are exaggerated (for example, an increased proportion of eye information in faces, exaggerated facial expressions, and simplified body images), which is not conducive to transfer to real scenes. Realistic painting is a representational art form. It conforms to the observer's visual experience by reproducing external objects, while simplifying irrelevant cues such as the background to reduce the information complexity of the scene. Its ecological validity lies between that of real scenes and that of cartoon and stick figure scenes.
When faced with complex scenes of high ecological validity, TD individuals have a gaze preference for characters, especially their faces [25]. In real social scenes, as social intensity increases, TD children fixate more on the mouth than on the eyes [37], while the opposite conclusion is drawn in cartoon and stick figure scenes. Yarbus et al. had adults view the realistic painting "Unexpected Return" [13] and found that as social intensity increased, the observers' fixation on the eyes of the characters also increased. Some experimental studies have shown that the social attention deficit of autistic children may be related to the ecological validity of the stimulus [38,39]. Delphine et al. [40] found that autistic children used different processing strategies for cartoon faces and real faces. When Van der Geest et al. [39] presented cartoon social scenes to autistic children, they found that the children's gaze patterns were similar to those of TD children, paying more attention to cartoon characters and gazing at objects. As the complexity of the scene increased, attention to the characters increased and gaze on the background did not change significantly. This difference in results may arise because cartoons and other stimulus materials with low ecological validity are less realistic than people in real life, have lower information complexity, and are less socially threatening to autistic individuals. However, there are few studies on realistic painting scenes, whose ecological validity lies between that of real scenes and cartoon scenes. For autistic individuals, realistic painting scenes play a transitional role in transferring to real scenes during intervention and therefore play an important role in research on them.
This paper adopts realistic painting scenes with medium ecological validity, controls the social intensity by whether the scene contains social interaction (isolated individuals vs. social interaction), and examines the influence of social intensity on the social orientation and joint attention of autistic children in realistic paintings.

Participants
A total of 32 autistic children aged 3-6 were recruited from a rehabilitation institution, and 22 TD children aged 3-6 were recruited from a kindergarten in Wuhan. Some participants were unable to complete the experiment because of inattention, repeated head turning, or becoming emotional, which resulted in incomplete viewing of the stimulus materials or a severe lack of experimental data. A gaze sample rate below 70% was considered a severe lack of data, and data from such participants were not included in the analysis. As a result, 13 autistic children and 3 TD children were excluded. In the end, 19 autistic children (experimental group, 14 boys and 5 girls, mean age = 4.51 years) and 19 TD children (control group, 10 boys and 9 girls, mean age = 4.45 years) were valid participants. The two groups were matched on chronological age.
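The exclusion rule described above can be illustrated with a minimal sketch. The participant IDs and gaze sample proportions below are hypothetical, and the function name is introduced only for illustration; the only detail taken from the text is the 70% threshold.

```python
def valid_participants(sample_rates, threshold=0.70):
    """Return the IDs whose proportion of valid gaze samples is at least the threshold."""
    return [pid for pid, rate in sample_rates.items() if rate >= threshold]

# Hypothetical IDs and proportions, for illustration only.
rates = {"asd_01": 0.91, "asd_02": 0.64, "td_01": 0.70, "td_02": 0.83}
print(valid_participants(rates))
```

Here a participant at exactly 70% is kept, since the text excludes only gaze sample rates below 70%.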
Before the experiment, all autistic children (1) met the ASD criteria of the DSM-5 and (2) had been double-blindly diagnosed by two expert physicians in children's developmental behavior. All TD children were assessed with the Developmental Scale for Children Aged 0-6 years [41] to evaluate their developmental behavioral level, and all obtained a developmental quotient of more than 80 points (the reference range of developmental quotient: above 130 is excellent, 110-129 is good, 80-109 is moderate, 70-79 is critically low, and less than 70 is mental retardation).
After parental interviews and clinical observations, participants in both groups were confirmed to be free of respiratory diseases, childhood schizophrenia, epilepsy, and other organic brain diseases, and to have normal (or corrected-to-normal) vision and normal intelligence. None had participated in similar experiments before, and none had seen the experimental materials.
Privacy protection agreements were signed with the institution and the kindergarten. Informed consent was signed with the parents to protect the privacy of the children participating in the experiment. Only data about the participants' completion of the experimental tasks were collected, and they were collected anonymously. No personally identifiable information or portraits of participants were involved.

Design
A two-factor mixed (repeated-measures) design was used in this experiment, with participant category (ASD vs. TD) as the between-subjects factor (autistic children as the experimental group and TD children as the control group) and sociality (isolated individual vs. social interaction) as the within-subject factor. When analyzing the eye tracking data by region, a further within-subject factor (area of interest) was added.
Data analysis methods used in the experiments included:
• A 2 (subject category: ASD vs. TD) × 2 (sociality: isolated individual vs. social interaction) mixed analysis of variance (ANOVA) on fixation count and fixation duration, used to explore whether the social attention of the two groups differed across scenes.
• A 2 (subject category: ASD vs. TD) × 2 (sociality: isolated individual vs. social interaction) × 4 (area of interest: F, B, BG, and ACT) three-factor repeated-measures analysis of variance, used to further explore whether the attended areas of the two groups differed across social scenes.
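To make the structure of the 2 × 2 mixed ANOVA concrete, the following is a minimal sketch in plain Python, not the SPSS procedure used in the study. It relies on the fact that, with two groups and two conditions, each F ratio equals a squared t statistic computed on participant means or on difference scores. The function name and the toy data are hypothetical, and the condition effect uses unweighted group means, which matches the usual output when the groups are the same size, as they are here.

```python
from statistics import mean, variance

def mixed_anova_2x2(group_a, group_b):
    """F ratios for a 2 (group) x 2 (condition) mixed design.

    Each group is a list of (condition_1, condition_2) scores, one pair per
    participant. All three F ratios have (1, N - 2) degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    df_error = n_a + n_b - 2
    inv_n = 1 / n_a + 1 / n_b

    def pooled_var(xs, ys):
        return ((len(xs) - 1) * variance(xs) + (len(ys) - 1) * variance(ys)) / df_error

    # Between-subjects effect: compare participant means across the two groups.
    m_a = [(x + y) / 2 for x, y in group_a]
    m_b = [(x + y) / 2 for x, y in group_b]
    f_group = (mean(m_a) - mean(m_b)) ** 2 / (pooled_var(m_a, m_b) * inv_n)

    # Within-subject effects: work on difference scores (condition 2 - condition 1).
    d_a = [y - x for x, y in group_a]
    d_b = [y - x for x, y in group_b]
    s2_d = pooled_var(d_a, d_b)
    f_condition = ((mean(d_a) + mean(d_b)) / 2) ** 2 / (s2_d * inv_n / 4)
    f_interaction = (mean(d_a) - mean(d_b)) ** 2 / (s2_d * inv_n)
    return {"group": f_group, "condition": f_condition,
            "interaction": f_interaction, "df": (1, df_error)}

# Hypothetical scores (isolated scene, interaction scene), for illustration only.
asd = [(1.0, 2.0), (2.0, 4.0), (3.0, 3.0)]
td = [(4.0, 4.0), (5.0, 7.0), (6.0, 7.0)]
print(mixed_anova_2x2(asd, td))
```

With 19 participants per group, the error degrees of freedom are 36, which agrees with the F(1,36) values reported in the results.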

Stimuli
To prevent interference from children's training or previous experience, stimuli that children could understand but had not seen before were selected. Paintings that represent common scenes of daily life, such as one person pouring water or two people talking, were good candidates. First, 10 realistic paintings depicting scenes of daily life, each with 1 person or 2 people, were selected as candidate materials. Then, 5 art specialists with a master's degree or above were invited as assessors to score the paintings on a Likert scale for familiarity, interactivity, emotional intensity, and other aspects. According to the scores, paintings with different numbers of people that were unfamiliar, interactive, emotionally neutral, and free of exaggerated posture were selected to represent the isolated individual scene and the social interaction scene, respectively.
Finally, two paintings were selected as the experimental stimuli for this study. As shown in Figure 1, the selected pictures were "The Maid Pouring Milk" and "A Lady and Her Maid" by the Dutch painter Vermeer. Adobe Photoshop CS6 was used to resize each picture to a width of 800 pixels while maintaining the aspect ratio of the original picture.
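The resizing step amounts to simple proportional arithmetic, sketched below. The source dimensions are hypothetical, since the original image sizes are not reported; only the 800-pixel target width comes from the text.

```python
def resized_height(orig_width, orig_height, target_width=800):
    """Height in pixels after scaling to target_width with the aspect ratio kept."""
    return round(orig_height * target_width / orig_width)

# Hypothetical source dimensions, for illustration only.
print(resized_height(2000, 2250))  # 900
```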

Apparatus
The experimental materials were presented to the participants on an all-in-one computer through Microsoft Visual Studio software; the touch screen resolution was 1920 × 1080. A Tobii Eye Tracker 5 (Tobii, Stockholm, Sweden) recorded the participants' eye tracking data while they viewed the experimental materials. The sampling frequency of the eye tracker was 90 Hz and its gaze angle was 40 × 40 degrees.

Procedure
The experiment was conducted in a quiet classroom of the special education institution and of the kindergarten. Each session required one participant and two operators. One operator controlled the computer program and made sure that doing so did not interfere with the participant's viewing. The other operator explained the experimental process. Before the experiment started, the operator played with the children long enough to gain their trust. TD children were brought into the experimental room by the operator and did not feel nervous because of the trust established beforehand. For autistic children, a teacher they were familiar with acted as one of the operators to reduce their anxiety. Parents were not involved in the experiment. Each participant was tested individually, and a free-viewing paradigm was adopted. The experimental procedure is shown in Figure 2.


• The participants sat in front of the computer with their eyes level with the center of the screen. The distance between their eyes and the screen was about 50-70 cm. They were then required to gaze at 7 target points on the screen in sequence (a central point, then three peripheral points, then another set of three peripheral points), staring at each point until it disappeared to complete the calibration.
• After calibration, the participants were reminded to watch the experimental materials on the screen. First, a small red "+" appeared on a black screen and was held for 500 ms to attract the participants' attention. Second, the stimulus materials were presented in random order, each for 5 s. Between materials, a black screen with a red "+" in the middle was shown for 500 ms to attract the participants' attention. This sequence was repeated until the end of the experiment.
• After the experiment started, no instructions were given. The experiment adopted the free-viewing paradigm and did not require any task from the participants.
• The entire experiment took about 2 min.
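The presentation sequence in the steps above can be sketched as a simple schedule builder. This is an illustrative outline rather than the presentation program used in the study: the function name, the tuple layout, and the stimulus labels are assumptions, while the 500 ms fixation screen, the 5 s stimulus duration, and the random order come from the text.

```python
import random

FIXATION_MS = 500    # red "+" on a black screen
STIMULUS_MS = 5000   # each stimulus is shown for 5 s

def build_schedule(stimuli, seed=None):
    """Return (event, name, onset_ms, duration_ms) tuples for one session."""
    order = list(stimuli)
    random.Random(seed).shuffle(order)   # random presentation order
    schedule, t = [], 0
    for name in order:
        schedule.append(("fixation", "+", t, FIXATION_MS))
        t += FIXATION_MS
        schedule.append(("stimulus", name, t, STIMULUS_MS))
        t += STIMULUS_MS
    return schedule

# Hypothetical stimulus labels, for illustration only.
for event in build_schedule(["isolated_scene", "interaction_scene"], seed=1):
    print(event)
```

Passing a seed makes the random order reproducible, which can help when a session has to be reconstructed from logs.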

Analysis Indicators
Data cleaning mainly dealt with missing values. The missing values in the data were filled with the mean of the data for the same group of participants (ASD or TD).
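Group-mean imputation of this kind can be sketched as follows. The data layout (a dictionary of per-group lists with `None` marking a missing value) and the numbers are hypothetical; the source states only that missing values were replaced by the mean of the same participant group.

```python
from statistics import mean

def impute_group_mean(values_by_group):
    """Replace None entries with the mean of the observed values in the same group."""
    filled = {}
    for group, values in values_by_group.items():
        observed = [v for v in values if v is not None]
        group_mean = mean(observed)
        filled[group] = [group_mean if v is None else v for v in values]
    return filled

# Hypothetical fixation counts, for illustration only.
data = {"ASD": [10, None, 14], "TD": [20, 22, None]}
print(impute_group_mean(data))
```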
The data collected from the Tobii Eye Tracker 5 were imported into OGAMA 5.1 (open-source software) and segmented by area of interest (AOI), and the fixation count and fixation duration were analyzed. Statistical analysis was then performed using IBM SPSS 27 to examine their differences.
The AOIs were defined in OGAMA, as shown in Figure 3. Section F is the face AOI (the person's face), section B is the body AOI (the person's body), section ACT is the activity AOI (the area toward which the person's line of sight is directed), and section BG is the background AOI (the area excluding the characters and the activity AOI).
The eye tracking indicators used in the experiment included:
• Fixation count (FC): the total number of fixations made by the participant in the target area. With the Tobii Eye Tracker 5 data, a gaze that stayed in the target area for more than 100 ms was counted as one fixation.
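As a rough illustration of the 100 ms rule, the sketch below counts fixations from a sequence of gaze samples that have already been labeled with an AOI. It is a dwell-based simplification, not the fixation detection that OGAMA performs; the label sequence is hypothetical, and the 90 Hz sample duration follows from the apparatus description.

```python
from itertools import groupby

SAMPLE_MS = 1000 / 90    # one sample at 90 Hz lasts about 11.1 ms
MIN_FIXATION_MS = 100    # a stay longer than 100 ms counts as one fixation

def fixation_counts(aoi_labels):
    """Count runs of consecutive samples in one AOI that last more than 100 ms."""
    counts = {}
    for aoi, run in groupby(aoi_labels):
        duration = sum(1 for _ in run) * SAMPLE_MS
        # None marks samples that fall outside every AOI or are missing.
        if aoi is not None and duration > MIN_FIXATION_MS:
            counts[aoi] = counts.get(aoi, 0) + 1
    return counts

# Hypothetical per-sample AOI labels, for illustration only.
labels = ["F"] * 12 + ["B"] * 5 + ["F"] * 10 + [None] * 20 + ["ACT"] * 15
print(fixation_counts(labels))
```

In this toy sequence the 5-sample stay on the body lasts about 56 ms and is therefore not counted.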

Overall Fixation Count
To explore the influence of social interaction in the scene on the processing depth of autistic children and TD children, the fixation counts of the two groups were analyzed with a 2 (subject category: ASD vs. TD) × 2 (sociality: isolated individual vs. social interaction) mixed analysis of variance (ANOVA). Following Mauchly's test of sphericity, corrected results are reported; degrees of freedom given in decimal form are those corrected by the Greenhouse-Geisser method. The same procedure applies to the analyses that follow.
The overall fixation counts of the two groups of children in the different scenes are shown in Table 1. The main effect of subject category was significant, F(1,36) = 10.500, p = 0.003 < 0.01. For fixation count over the entire picture, the interaction between subject category and scene sociality was significant, F(1,36) = 8.520, p = 0.006 < 0.01. The interaction diagram is shown in Figure 4.
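For readers unfamiliar with the Greenhouse-Geisser correction, the sketch below computes the epsilon estimate from the covariance matrix of the repeated measures; the degrees of freedom of the within-subject tests are then multiplied by this value. It is a single-group illustration in plain Python with hypothetical data, whereas SPSS works from the pooled within-group covariance in a mixed design, so it should not be read as a reproduction of the reported analysis.

```python
def greenhouse_geisser_epsilon(scores):
    """Epsilon estimate; scores has one row per participant, one column per level."""
    n, k = len(scores), len(scores[0])
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sample covariance matrix of the k within-subject levels.
    cov = [[sum((row[i] - col_means[i]) * (row[j] - col_means[j]) for row in scores) / (n - 1)
            for j in range(k)] for i in range(k)]
    # Double-center the covariance matrix (remove row, column, and grand means).
    row_means = [sum(cov[i]) / k for i in range(k)]
    grand = sum(row_means) / k
    centered = [[cov[i][j] - row_means[i] - row_means[j] + grand for j in range(k)]
                for i in range(k)]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centered for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)

# Hypothetical scores for 4 participants at 3 within-subject levels.
print(greenhouse_geisser_epsilon([[1, 2, 4], [2, 1, 7], [3, 5, 5], [4, 3, 9]]))
```

Epsilon lies between 1/(k - 1) and 1. With only two levels it is always 1, so the correction matters only for factors with more levels, such as the four-level AOI factor.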

Fixation Duration
To explore whether social interaction affects children's fixation duration, and whether the two groups of children differ in attention and processing level in social interaction scenes, the overall fixation duration was analyzed. To further explore children's attention distribution patterns, the fixation duration in the different areas of interest was also analyzed.

Overall Fixation Duration
A 2 (subject category: ASD vs. TD) × 2 (sociality: isolated individual vs. social interaction) repeated-measures analysis of variance was performed on the fixation duration over the entire picture. The overall fixation duration of the two groups of children in the different scenes is shown in Table 2. There was no interaction between subject category and scene sociality, F(1,36) = 0.833, p = 0.367 > 0.05. The main effect of subject category was significant, regardless of whether the scene contained social interaction, F(1,36) = 9.714, p = 0.004 < 0.01: the fixation duration of TD children was longer than that of autistic children. The main effect of the within-subject factor (sociality) was also significant, F(1,36) = 5.751, p = 0.022 < 0.05: the fixation duration in the isolated individual scene was longer than in the social interaction scene.

Fixation Duration of AOI
To explore the difference in attention distribution and processing mode between autistic children and TD children in realistic painting scenes, a 2 (subject category: ASD vs. TD) × 2 (sociality: isolated individual vs. social interaction) × 4 (AOI: F, B, BG, and ACT) three-factor repeated-measures analysis of variance was performed. The fixation duration in each AOI for autistic children and TD children in the different scenes is shown in Table 3. The main effect of AOI was significant, F(3,34) = 8.728, p < 0.001. The interaction between AOI and scene sociality was significant, F(3,34) = 9.437, p < 0.001. The interaction between AOI and subject category was significant, F(3,34) = 3.378, p = 0.029 < 0.05. The three-way interaction between subject category, scene sociality, and AOI was significant, F(3,34) = 3.993, p = 0.015 < 0.05. A simple interaction effect analysis was therefore carried out. In the isolated individual scene, the interaction between subject category and AOI was not significant, F(3,108) = 0.53, p = 0.665 > 0.05. In the social interaction scene, the interaction between subject category and AOI was significant, F(3,108) = 4.23, p = 0.007 < 0.01. The interaction diagram is shown in Figure 5. Further simple effect analysis showed that, in the social interaction scene, subject category had a significant effect only in the face AOI, F(1,36) = 20.02, p < 0.001, where the fixation duration of TD children was longer than that of autistic children. There were no significant group differences in the fixation duration of the other AOIs.

Figure 6. Fixation duration of autistic children and TD children in different scenes (* indicates a significant difference in means using pairwise comparisons, p < 0.05). There is a significant difference between the two groups in the face AOI in the social interaction scene. For autistic children, there is no significant difference in the fixation duration of any AOI between the different social scenes. TD children show significant differences between the different social scenes in the fixation duration of both the face AOI and the activity AOI.
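The within-group scene comparisons reported above are paired comparisons, since each child contributes a fixation duration for both the isolated individual scene and the social interaction scene. The following is a minimal sketch of how such a t statistic could be computed, assuming a paired-samples t-test was used (the text reports only t and p). The function name `paired_t` and the durations are hypothetical illustrations, not the study's data.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, degrees of freedom) for a paired-samples t-test.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the per-child differences
    first - second and sd is the sample standard deviation of d.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical face-AOI fixation durations in ms, one entry per child.
isolated = [400.0, 500.0, 450.0, 550.0]
social = [600.0, 800.0, 550.0, 750.0]

t_value, df = paired_t(isolated, social)
print(f"t({df}) = {t_value:.3f}")
```

With the isolated scene as the first argument, a negative t means shorter fixation in the isolated individual scene, which matches the sign convention of the face-AOI result reported for TD children.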

Discussion
To visualize the attention distribution of autistic children and TD children when watching different social scenes, OGAMA software was used to generate eye tracking heat maps and eye movement trajectories. The heat maps used three colors: red, orange, and green. The darker (redder) the color, the longer the fixation duration and the higher the fixation count of the subject in that area. The lighter the color, the lower the fixation count and the shorter the fixation duration, as shown in Table 4. The eye trajectory chart recorded the order of eye movements, from which the gaze patterns of autistic children and TD children were analyzed. The first fixation point appeared randomly, which was related to the transition between pictures, as shown in Table 5.
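The two quantities encoded by the heat maps, fixation count and fixation duration per area, can be derived from a list of fixations by testing which AOI each fixation falls in. The sketch below is illustrative only: the rectangular AOI coordinates, the fixation tuples, and the function name `summarize_fixations` are hypothetical, and it does not reproduce OGAMA's own computation.

```python
from collections import defaultdict

# Hypothetical rectangular AOIs in screen pixels: (left, top, right, bottom).
# The labels follow the paper's AOIs (F = face, B = body, ACT = activity);
# anything outside them is counted as background (BG).
AOIS = {
    "F": (100, 50, 200, 150),
    "B": (80, 150, 220, 400),
    "ACT": (220, 200, 350, 300),
}


def summarize_fixations(fixations, aois=AOIS):
    """Return {aoi: (fixation count, total duration in ms)}.

    `fixations` is an iterable of (x, y, duration_ms) tuples. A fixation is
    assigned to the first AOI whose rectangle contains it, else to "BG".
    """
    counts = defaultdict(int)
    durations = defaultdict(float)
    for x, y, duration_ms in fixations:
        label = "BG"
        for name, (left, top, right, bottom) in aois.items():
            if left <= x < right and top <= y < bottom:
                label = name
                break
        counts[label] += 1
        durations[label] += duration_ms
    return {name: (counts[name], durations[name]) for name in counts}


# Made-up fixations for one trial, not data from the study.
trial = [(150, 100, 300.0), (160, 90, 250.0), (250, 250, 400.0), (10, 10, 120.0)]
summary = summarize_fixations(trial)
print(summary)
```

Real AOIs for faces and bodies are usually irregular polygons rather than rectangles; rectangles are used here only to keep the containment test short.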

In the isolated individual scene, there was no significant difference in the overall fixation count between the two groups of children. Although the overall fixation duration of TD children was longer than that of autistic children, there was no significant difference in the attention distribution across the AOIs. The fixation preferences of TD children and autistic children were the same, with two eye-tracking hot spots: the face AOI and the gaze direction AOI. In the eye trajectory diagram, both TD children and autistic children first noticed the social stimuli (faces) and then followed social cues to produce joint attention. The difference was that, after establishing joint attention, TD children focused on the objects on which the characters were acting, whereas autistic children, in addition to the objects in the characters' hands, also searched for objects on the table and on the wall. This showed that both autistic children and TD children had social orientation and joint attention, but autistic children had difficulty detecting the intentions of the characters in the scene in time and needed more revisiting and searching, while the gaze process of TD children was relatively smooth. When there was no social interaction in the scene, the social orientation and joint attention of autistic children in the realistic painting scene were consistent with research results from real scenes [42]. Autistic children noticed the same stimuli as TD children but processed the information in different ways [43].

In the social interaction scene, the overall fixation count, overall fixation duration, and fixation duration on the face AOI of TD children were greater than those of autistic children. A t-test supported the hypothesis that autistic children were less likely than TD children to look at faces when the scene contained social interaction [18]. When there was social interaction in the scene, the processing depth, attention, and social orientation of autistic children differed from those of TD children. Consistent with previous experimental results [44], there was no significant difference between the two groups in fixation duration on the characters' body AOI or the background AOI. This indicated that the two groups processed the characters' bodies and the backgrounds of social scenes in similar ways, regardless of the ecological validity of the scenes. The eye-tracking hot spot of TD children was the face AOI, while that of autistic children was concentrated in the middle area where the two characters' eyes met. Compared with TD children, autistic children tended to focus on different aspects of the same situation [45,46]. Scenes with higher ecological validity might inhibit their attention to the characters. In the eye movement trajectories, the gaze of autistic children followed no regular pattern and included fewer fixations; most of their fixation trajectory fell on the body and on irrelevant background information. The gaze trajectory of TD children formed a triangle around the two characters and the communication area between them, and, compared with autistic children, they had a higher fixation count in the face AOI. When there was social interaction in the scene, the social orientation and joint attention abilities of autistic children were greatly reduced.
The two characters in the scene pictures used in this study were communicating with each other, and the transmission of social information relied mainly on social stimuli such as faces. TD children focused more on the faces of the characters after briefly searching the middle area between the two characters, while autistic children kept searching and looking at the middle area between the two characters. Judging from the heat map and the trajectory map, the processing ability of autistic children was abnormal at this point.
Studies have shown that, in real social scenes, the overall processing of faces by autistic children differs from that of TD children [20]. In contrast to the results of this study with realistic painting materials, no significant difference in gaze between autistic children and TD children was found in cartoon scenes, with or without social interaction [47]. The difference between cartoon scenes and real scenes might be related to the exaggerated expressions of cartoons. Since the characters in the realistic paintings were depicted as closely to real situations as possible, conclusions highly consistent with those from real scenes could be drawn. That is to say, TD children had a greater ability to process human faces than autistic children in realistic paintings with medium ecological validity.
The overall fixation duration in the isolated individual scene was longer than that in the social interaction scene. As the sociality of the scene increased, the information complexity of the scene also increased, which might lead to a decrease in interest and attention and a corresponding decrease in the degree of processing. However, for autistic children, there was no significant difference between the two types of scenes in the fixation duration of each AOI. For TD children, after social interaction was added to the scene, the fixation duration of the face AOI increased, but the fixation duration of the activity AOI decreased. This was consistent with Yarbus's conclusions from normal adults viewing "Unexpected Return" [13], indicating that the attention distribution pattern of 3- to 6-year-old TD children in social scenes was basically the same as that of adults.
In general, TD children had more regular eye movements and a smoother gaze, and their eye movements centered on the social stimuli in the scene. The eye movement trajectory of autistic children was disordered: their gaze often left the social stimuli and shifted to non-social stimuli. Moreover, they differed from TD children in understanding intentions. Some researchers have pointed out that children's visual preferences and their reduced gaze on the entire scene and on faces would have a significant impact on the social cues obtained during development [45], which might promote the separation of their social cognitive skills and face perception abilities. Interventions for autistic children should therefore address social orientation and intention understanding. Training autistic children to focus more on faces during social interactions could improve their social attention ability.

Conclusions
The main conclusions of this study are as follows:

• In the isolated individual scene, the visual processing and attention distribution of autistic children and TD children are consistent.
• In scenes with social interaction, the processing depth, attention to the scene, and social orientation ability of autistic children are different from those of TD children.
• When adding social interaction to the scene, TD children pay more attention to the faces of the characters and pay less attention to the activity area, while autistic children have no such difference.
• TD children have a smoother gaze trajectory toward the scene. Their ability to deal with social stimuli and understanding of characters' intentions is stronger than that of autistic children.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethic Institutional Review Board of Central China Normal University (protocol code CCNU-IRB-202101011a and date of approval 13 January 2021).

Informed Consent Statement:
Informed consent was obtained from the parents/caregivers of all children involved in the study. No personally identifiable information or portraits of children were collected or used in the study.