This project used previously collected data from a larger, ongoing study investigating whether walking gait differences between autistic and non-autistic children can help estimate the likelihood of autism when cultural factors are considered. In the broader ongoing study, participants are video-recorded walking and complete standardized assessments of motor skills, while parents complete a lab-specific questionnaire on autism, motor skills, communication skills, and cultural factors, as well as a standardized questionnaire on social responsiveness. The present study analyzed a subset of the data from this broader study, focusing on standardized assessments of balance, social responsiveness, and cultural factors.
2.1. Participants
To estimate the sample size needed for this exploratory study, a power analysis was conducted using G*Power software, version 3.1.9.7 (Buchner et al., 2017). The planned multiple linear regression model had one dependent variable (i.e., the Movement Assessment Battery for Children-2 (MABC-2) balance subscale standard scores) and eight total predictors (i.e., the Social Responsiveness Scale-2 (SRS-2) overall standard score, SRS-2 Social Communication Index, SRS-2 Restrictive and Repetitive Behaviors Score, age, gender/sex, socioeconomic (SES) level, number of known languages, and race/ethnicity). With an alpha level of 0.05, power of 0.80, and a moderate effect size of 0.25, the minimum total sample size needed was 34. As such, the currently available sample size of 44 participants from the broader ongoing study was considered appropriate for this exploratory study.
To be included in the broader research study, participants had to be between 5 and 12 years of age, be able to walk independently, and not be taking any medications that would impact movement (Doyle & McDougle, 2012). To better understand the relationship between motor skills and autism specifically, participants representing a wide range of neurotypes were recruited for this study. This is especially important because nearly 91% of autistic children report co-occurring psychological conditions (Mosner et al., 2019), a rate much higher than the general population (Micai et al., 2023). As such, autistic and non-autistic participants with co-occurring conditions (e.g., attention-deficit/hyperactivity disorder (ADHD), anxiety disorders, obsessive-compulsive disorder (OCD)) were included. Of the 44 participants included in this study, 16 children had a formal diagnosis of autism, with 9 reporting co-occurring conditions of ADHD, OCD, anxiety disorder, asthma, dysgraphia, fine motor developmental delay, sensory processing disorder, SATB2-associated syndrome, and precocious puberty. Of the remaining 28 non-autistic children, 6 reported conditions of ADHD, sensory processing disorder, anxiety disorder, auditory processing disorder, non-verbal learning disability, ocular motor aphasia, and dyslexia. The other 22 non-autistic participants reported typical development with no known co-occurring conditions.
The Autism Diagnostic Observation Schedule, 2nd edition (ADOS-2) was not used for inclusionary purposes because it is known to have race and gender/sex biases (D’Mello et al., 2022; Kalb et al., 2022), and because it is possible for children with other diagnoses, such as developmental language disorder, to also score within the autistic range (Leyfer et al., 2008). Historically, autistic female participants have been excluded from autism research when the ADOS was used as a confirmatory measure for diagnosis, so parent-reported diagnosis was instead the basis for recruitment of autistic participants (D’Mello et al., 2022). To get an estimate of each individual’s autistic characteristics, the Social Responsiveness Scale, 2nd edition (SRS-2) was administered to all participants (autistic and non-autistic; Constantino, 2012). The SRS-2 is a reliable questionnaire that includes separate male and female forms, and its estimates of autism characteristics align with autism diagnostic criteria (Frazier et al., 2013). Participants varied in gender/sex, ethnic origin, cultural identity, and languages spoken in the home. All participants came from a Midwestern region of the United States.
2.3. Procedure
All recruitment and experimental procedures were implemented in accordance with the approved university Institutional Review Board (IRB) protocols. Recruitment of participants for the broader gait study was done largely through the distribution of flyers throughout the community. Support letters from autism day schools, community programs, and clinics local to Northern Illinois University (NIU) were obtained to aid in the distribution of such flyers. Information about the study was then shared through schoolwide e-mails sent to families; flyers handed out to children at their schools, distributed at community events, posted in public spaces such as libraries, and given to local clinicians; and databases available to the research team.
Once participants agreed to join the study, caregivers were provided with a consent form. A research assistant explained each section of the consent form, checked for understanding, and asked whether the family had any questions. If the caregiver agreed to participate, they signed the consent form. Then, the child participants were asked to provide assent prior to participating in the study. For all child participants, the child’s assent was collected using a social story format, which is a supportive way to share information about upcoming events with autistic children to ease transitions and enhance understanding. Both caregivers and children were informed that participation in the study was voluntary and that they could withdraw at any time.
After consent and assent were given, the demographic and cultural factor survey was completed by the caregiver on a password-protected tablet or laptop computer that was provided to them within the research lab. The expected completion time for the survey was approximately 10 min. Then, the SRS-2 was administered to the caregiver by a trained graduate assistant. The SRS-2 is a paper and pencil questionnaire with an expected completion time of approximately 15 min.
While the caregiver completed the survey and SRS-2, the child completed the Balance subtest items of the MABC-2. The completion time of the Balance subtest was approximately 10 min. The graduate assistant administering the MABC-2 recorded the results on the test form. In total, the combination of all three assessments collected for this study took approximately 25 min to complete. Upon completion of the session, the children were given a small toy as compensation for their time.
2.4. Descriptive Statistics
To gather an overall picture of the participants in this study, descriptive statistics were calculated to identify the mean, standard deviation (SD), minimum, and maximum values of the participants’ age, socioeconomic status (SES; as estimated by the caregivers’ total years of education), MABC-2 Balance standard score, SRS-2 T-scores, SRS-2 SCI scores, and SRS-2 RRB scores using SPSS (Version 26.0; IBM Corp., 2019). Descriptive statistics regarding the number of participants of each gender/sex, ethnic origin category, cultural identity, number of languages used, and which languages were used were also calculated. The key descriptive statistics for all participants together are summarized in Table 1. Mann–Whitney U tests indicated that the autistic and non-autistic groups did not differ significantly in age (p = 0.171) or SES (p = 0.213) but differed, as expected, in SRS-2 scores (p < 0.001) and MABC-2 Balance performance (p = 0.025).
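As an illustration of the group comparison described above, the following is a minimal sketch of a two-sided Mann–Whitney U test using SciPy. The age values and variable names are hypothetical and are not the study data; the sketch only shows the form of the test, not the reported results.

```python
from scipy.stats import mannwhitneyu

# Hypothetical ages in years for two small groups (NOT the study data).
autistic_age = [6, 7, 8, 9, 10, 11, 12, 8]
non_autistic_age = [5, 6, 7, 7, 8, 9, 10, 12, 6, 5]

# Two-sided test: do values in one group tend to be larger than in the other?
u_stat, p_value = mannwhitneyu(
    autistic_age, non_autistic_age, alternative="two-sided"
)
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
```

A non-significant p-value here would indicate only that no group difference was detected, not that the groups are equivalent.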
2.4.1. Age
The mean age of participants was 8.23 years (SD = 2.25 years, ranging from 5 to 12). Of the 16 autistic participants, the mean age was 8.81 years (SD = 2.17) and ranged from 6 to 12 years. Of the 28 non-autistic participants, the mean age was 7.89 years (SD = 2.27) and ranged from 5 to 12 years.
2.4.2. Gender/Sex
Of the 44 total participants, 27 participants identified as male, 15 as female, one as nonbinary, and one did not report preferred gender pronouns. Of the 16 autistic participants, more identified as males than any other gender/sex (13 males, one female, one nonbinary, one did not report). Of the 28 non-autistic participants, 14 identified as males and 14 identified as females.
2.4.3. Socioeconomic Status (SES)
The average years of education was 17.45 for Caregiver 1 (SD = 6.41, ranging from 0 to 33 years) and 13.41 for Caregiver 2 (SD = 5.88, ranging from 0 to 23 years). Of the 16 autistic participants, the average years of education was 15.75 years for Caregiver 1 (SD = 3.57) and 14.06 years for Caregiver 2 (SD = 4.40). Of the 28 non-autistic participants, the average years of education was 18.43 years for Caregiver 1 (SD = 7.47) and 13.04 years for Caregiver 2 (SD = 6.63). In other words, the caregivers of both the autistic and the non-autistic children had, on average, some education past high school.
2.4.4. SRS-2 T-Scores
SRS-2 Total T-scores of 59 or lower are considered “within normal limits” according to the administration manual. Across all children (both autistic and non-autistic combined), the average SRS-2 overall T-score was 61 (SD = 14, ranging from 39 to 90). Of the 16 participants previously diagnosed with autism, the average SRS-2 overall T-score was 72 (SD = 11, ranging from 52 to 90), which indicates significant autistic characteristics. Of the 28 non-autistic participants, the average SRS-2 overall T-score was 54 (SD = 11, ranging from 39 to 79), which does not indicate significant autistic characteristics. It is worth noting that some of the autistic children scored below the autistic range on the SRS-2, and that some non-autistic children scored within the autistic range.
The two DSM-5 Compatible Scales on the SRS-2, namely the Social Communication and Interaction (SCI) and the Restrictive and Repetitive Behaviors (RRB) subscales, were also analyzed. The SRS-2 SCI average T-score across all children (both autistic and non-autistic combined) was 59.80 (SD = 13.45, ranging from 39 to 90). Of the 16 participants previously diagnosed with autism, the average SRS-2 SCI T-score was 70.69 (SD = 10.80, ranging from 52 to 90). Of the 28 non-autistic participants, the average SRS-2 SCI T-score was 52.74 (SD = 10.62, ranging from 39 to 76). The average SRS-2 RRB T-score across all children was 64.11 (SD = 16.03, ranging from 41 to 90). Of the 16 participants previously diagnosed with autism, the average SRS-2 RRB T-score was 77.19 (SD = 10.04, ranging from 54 to 90). Of the 28 non-autistic participants, the average SRS-2 RRB T-score was 56.64 (SD = 13.93, ranging from 41 to 88).
2.4.5. MABC-2 Balance Scores
The MABC-2 uses standardized scaled scores to compare performance to the normative data. A standard scaled score of 10 is the mean (SD = 3), with scores from 7 to 13 considered within the typical range, scores of 6 and below considered below the typical range, and scores above 13 considered above the typical range. The average MABC-2 Balance standard score was 6.50 (SD = 4.15, ranging from 1 to 16), which is considered below the typical range. Of the 16 autistic participants, the average MABC-2 Balance standard score was 4.65 (SD = 3.36, ranging from 1 to 11), which is considered below the typical range. Of the 28 non-autistic children, the average MABC-2 Balance standard score was 7.57 (SD = 4.23, ranging from 1 to 16), which is considered within the typical range.
2.4.6. Ethnic Origin
Of the participants who only selected one ethnic origin, 28 selected white, six selected Hispanic, and three selected Black. Of the participants who selected multiple ethnic origins, three selected Hispanic and white, three selected Asian or Pacific Islander and white, and one selected American Indian or Alaskan Native and white.
2.4.7. Cultural Identity
The following information was not included in the final regression model because of conceptual similarity to ethnic origin and number of languages used but is provided to give a more detailed background of the study participants. Nine participants did not provide a cultural identity. Of the participants who reported one cultural identity, 20 identified as U.S. American, one identified as Midwestern American, one identified as Mexican, one identified as Catholic, and one identified as Christian. Of the participants who provided more than one cultural identity, four identified as U.S. American and Mexican, one identified as U.S. American and Greek and Italian, one identified as U.S. American and Caribbean and African American, one identified as U.S. American and Roman Catholic, two identified as U.S. American and German, one identified as Mexican/Hispanic and white, and two identified as U.S. American and Polish and Catholic.
2.4.8. Number of Languages
Thirty-nine participants reported that they used one language, and five participants used two languages. Of the 16 autistic participants, 15 used one language, and one used two languages. Of the 28 non-autistic participants, 24 used one language, and four used two languages. For participants who did not report the number of languages used or heard in the home, a default value of one language was assigned, on the assumption that everyone uses or has been exposed to at least one language.
2.4.9. Languages Used
This information is provided to give a more detailed picture of the participants in the study but was not included in the final regression model because of the conceptual similarity and the high statistically significant correlation with the number of languages used. Of the 16 autistic participants, 15 used English and 1 did not respond. Of the 28 non-autistic participants, 22 used English, 2 used German and English, 2 used Spanish, and 2 did not respond.
2.5. Statistical Analyses
In this project, we aimed to address two research questions: Does the relationship between motor abilities and autistic characteristics still stand when cultural differences are considered? Which, if any, cultural factors moderate the relationship between motor abilities and autistic characteristics the most? To answer these questions, we conducted a regression analysis from which we extracted regression coefficients with 95% confidence intervals, estimates of model fit (R and R²), and tests of statistical significance for the independent variables (t-values and p-values). Independent variable coefficients with a p-value of less than 0.05 were considered statistically significant. Analyses were conducted in Python 3.12 (Google Colab).
2.5.1. Establishing the Final Set of Independent Variables
First, a Pearson’s bivariate correlation analysis was used to identify any potential collinearity of independent variables for the later planned regression analysis. When any independent variables conceptually captured similar information and were significantly correlated with one another (i.e., p < 0.05), one representative variable was selected for inclusion in the planned regression analysis. Using this approach, the SRS-2 T-score (which is a combination of the SCI and RRB subscale scores) had a strong, positive relationship with the SRS-2 SCI score (r = 0.989, p < 0.001) and the SRS-2 RRB score (r = 0.912, p < 0.001). Because all three SRS-2 scores conceptually reflected a participant’s autistic traits and were highly correlated, only the SRS-2 Overall T-score was included in the regression analysis. Similarly, C1 years of education had a moderate, positive relationship with C2 years of education (r = 0.541, p < 0.001), and both were conceptually intended to estimate participants’ SES level. Therefore, only the C1 years of education was used in the regression model. Finally, cultural identity had a moderate, positive relationship with the number of languages spoken in the home (r = 0.599, p < 0.001). This finding, although not anticipated, was perhaps unsurprising, as many of the participants who reported using two or more languages at home also reported two or more cultural identities that were conceptually related to their languages. For example, two participants identified as U.S. American and German and reported using two languages (English and German) at home. As such, only the number of languages was used in the regression model.
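The statistical half of this screening step can be sketched as follows. This is an illustrative example on simulated data, not the study's code: the column names, the simulated values, and the helper `flag_correlated_pairs` are all hypothetical. It flags every pair of candidate predictors whose Pearson correlation is significant at p < 0.05.

```python
from itertools import combinations

import numpy as np
from scipy.stats import pearsonr

# Simulated stand-in data (NOT the study data); names are hypothetical.
rng = np.random.default_rng(0)
n = 44
srs_total = rng.normal(60, 14, n)
data = {
    "srs_total": srs_total,
    # Built to correlate strongly with srs_total, like a subscale would.
    "srs_sci": srs_total + rng.normal(0, 2, n),
    "age": rng.uniform(5, 12, n),
}

def flag_correlated_pairs(columns, alpha=0.05):
    """Return (name_a, name_b, r, p) for every pair with p < alpha."""
    flagged = []
    for a, b in combinations(columns, 2):
        r, p = pearsonr(columns[a], columns[b])
        if p < alpha:
            flagged.append((a, b, float(r), float(p)))
    return flagged

for a, b, r, p in flag_correlated_pairs(data):
    print(f"{a} vs {b}: r = {r:.3f}, p = {p:.3g}")
```

The code covers only the statistical criterion. Deciding whether two flagged variables also capture conceptually similar information, and which one to retain, remains a judgment made by the analyst.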
After eliminating the conceptually similar and significantly correlated variables, the final set of six independent variables (IVs) for the planned multiple regression analysis predicting MABC-2 balance scores (dependent variable) included: the SRS-2 Overall T-scores (IV1), age (IV2), gender/sex (IV3), ethnic origin (IV4), SES level based on Caregiver 1’s years of education (IV5), and number of languages (IV6).
2.5.2. Variable Preparation and Coding
To keep predictors on a common scale and make coefficients easier to read, the continuous variables, including age (years), SRS-2 Overall T-score, Caregiver 1 education (years), and number of languages spoken, were standardized to z-scores (mean 0, SD 1) before modeling. This standardization puts predictors measured on different scales onto the same metric, making effects comparable; in the regression, each coefficient for a z-scored predictor represents the expected change in MABC-2 Balance score associated with a 1-SD increase in that predictor (e.g., if β = −3.00 for SRS-2_z, a one-SD higher SRS-2 is associated with a 3-point lower Balance score, holding other variables constant).
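The standardization and its effect on coefficient interpretation can be illustrated with a short sketch. The values below are hypothetical, and the use of the sample SD (`ddof=1`) is an assumption, since the text does not state which SD formula was applied.

```python
import numpy as np

def zscore(values, ddof=1):
    """Standardize to mean 0 and SD 1; ddof=1 (sample SD) is an assumption."""
    x = np.asarray(values, dtype=float)
    return (x - x.mean()) / x.std(ddof=ddof)

# Hypothetical values (NOT the study data).
age_years = [5, 6, 8, 9, 12]
balance = [9, 8, 7, 5, 4]

age_z = zscore(age_years)

# Slope per year of age versus slope per 1-SD of age.
slope_raw = np.polyfit(age_years, balance, 1)[0]
slope_z = np.polyfit(age_z, balance, 1)[0]

# The z-scored slope equals the raw slope multiplied by the SD of age.
print(slope_raw, slope_z, slope_raw * np.std(age_years, ddof=1))
```

Standardizing a predictor therefore rescales its coefficient without changing the fitted values or the significance test for that predictor.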
Because several gender/sex categories had very small counts, gender/sex was grouped as male (reference) vs. non-male (female, non-binary, or not reported) and entered as a dummy variable (Gender_nonMale = 1 for non-male, 0 for male). Using this coding, the data comprise 27 male and 17 non-male participants; the Gender_nonMale coefficient is interpreted as the adjusted mean difference in MABC-2 Balance scores for non-male vs. male (reference). That is, a positive value indicates higher MABC-2 Balance scores among non-male participants, holding all other predictors constant.
Because participants could report more than one heritage, ethnic origin was represented with three yes/no dummy variables: white, Hispanic, and Other heritage. “Other heritage” pools smaller groups in the sample (e.g., Black, Asian/Pacific Islander, American Indian/Alaska Native, and multi-heritage except Hispanic/white). These heritage indicators are not mutually exclusive (e.g., a child can be “yes” for both Hispanic and white). All three were included in the model at the same time and no reference group was selected. Across the sample (N = 44), 35 children endorsed white heritage, 10 endorsed Hispanic heritage, and 7 endorsed another (pooled) heritage; these counts are not mutually exclusive. In the regression, each heritage coefficient represents the adjusted mean difference in MABC-2 Balance scores for children with that heritage versus those without it (controlling for the other predictors and heritage dummies); for multi-heritage children (e.g., Hispanic and white), the implied effect is the sum of the relevant coefficients.
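A minimal sketch of the two coding schemes is given below. Only `Gender_nonMale` is a name taken from the text; the record layout, the other field names, and the rule that any selection other than white or Hispanic counts as "other" are assumptions made for illustration.

```python
def code_gender(gender):
    """Return 0 for male, 1 for any other response (including not reported)."""
    return 0 if gender == "male" else 1

def code_heritage(selected):
    """Multi-label indicators; a child can be 1 on more than one of them."""
    selected = set(selected)
    # Assumption: anything other than white/Hispanic is pooled as "other".
    other = selected - {"white", "Hispanic"}
    return {
        "white": int("white" in selected),
        "Hispanic": int("Hispanic" in selected),
        "other": int(bool(other)),
    }

# Hypothetical records (NOT the study data).
records = [
    {"gender": "male", "heritage": ["white"]},
    {"gender": "female", "heritage": ["Hispanic", "white"]},
    {"gender": None, "heritage": ["Asian or Pacific Islander", "white"]},
]

coded = [
    {"Gender_nonMale": code_gender(r["gender"]), **code_heritage(r["heritage"])}
    for r in records
]
for row in coded:
    print(row)
```

Because the heritage indicators are not mutually exclusive, the second record is coded 1 on both `white` and `Hispanic`, which is why no reference category is needed.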
2.5.3. Regression Model
We fit an ordinary least squares (OLS) multiple linear regression with MABC-2 Balance score as the dependent variable and eight predictors: Age (z), SRS-2 Overall T (z), Caregiver 1 education (z), number of languages (z), Gender_nonMale (dummy: 1 = non-male; 0 = male), and Ethnic origin_heritage dummy variables (white, Hispanic, Other heritage; yes/no, not mutually exclusive).
For each predictor, we report coefficients (β; unstandardized for dummy variables and standardized for z-scored continuous variables) with 95% confidence intervals and t-tests with p-values. We also report model R² and adjusted R². For z-scored predictors (age, SRS-2 score, Caregiver 1 years of education, and number of languages), β is the expected change in MABC-2 Balance for a 1-SD increase in that predictor (holding others constant). For the dummy variable Gender_nonMale, β is the adjusted mean difference between the nonMale group and the reference Male group. For the Ethnic origin_heritage dummy variables, coefficients reflect the association of having that heritage. Because heritage indicators can co-occur (e.g., Hispanic and white), they are not contrasts among mutually exclusive categories.
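The quantities reported for the model (coefficients, 95% confidence intervals, t-tests, R², and adjusted R²) can be computed as in the following sketch. It is written with NumPy and SciPy on simulated data with only three predictors; the helper `ols`, the variable names other than `Gender_nonMale`, and the simulated effect sizes are hypothetical, and this is not the authors' analysis code.

```python
import numpy as np
from scipy import stats

def ols(X, y, names, alpha=0.05):
    """Least-squares fit with an intercept; returns term table, R2, adj. R2."""
    X = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = n - k  # residual degrees of freedom
    sigma2 = resid @ resid / df
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    t = beta / se
    p = 2 * stats.t.sf(np.abs(t), df)
    half = stats.t.ppf(1 - alpha / 2, df) * se  # CI half-width
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1 - (1 - r2) * (n - 1) / df
    table = {
        nm: {"beta": b, "ci": (b - h, b + h), "t": tv, "p": pv}
        for nm, b, h, tv, pv in zip(["Intercept"] + names, beta, half, t, p)
    }
    return table, r2, adj_r2

# Simulated stand-in data (NOT the study data).
rng = np.random.default_rng(1)
n = 44
srs_z = rng.standard_normal(n)
age_z = rng.standard_normal(n)
gender_nonmale = rng.integers(0, 2, n).astype(float)
balance = 7 - 2 * srs_z + rng.normal(0, 1.5, n)

X = np.column_stack([srs_z, age_z, gender_nonmale])
table, r2, adj_r2 = ols(X, balance, ["SRS2_z", "Age_z", "Gender_nonMale"])
for name, row in table.items():
    lo, hi = row["ci"]
    print(f"{name}: b = {row['beta']:.2f} [{lo:.2f}, {hi:.2f}], p = {row['p']:.3f}")
print(f"R2 = {r2:.3f}, adjusted R2 = {adj_r2:.3f}")
```

In this simulation the coefficient for `SRS2_z` reads as the expected change in the balance score per 1-SD increase in SRS-2, while the coefficient for `Gender_nonMale` reads as an adjusted mean difference between the two groups.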
2.5.4. Correlation and Multicollinearity Analysis
Because several cultural indicators were conceptually and empirically interrelated, we conducted correlation and multicollinearity diagnostics to ensure model validity. Intercorrelations among all predictors were examined (reported in Appendix A.1), with particular attention to cultural variables. Variance inflation factors (VIF) were calculated for all predictors in the baseline model. VIF values below 5 indicate acceptable multicollinearity, values between 5 and 10 suggest moderate concern, and values above 10 indicate severe multicollinearity requiring remediation (Bobbitt, 2020; O’Brien, 2007). All VIFs in our baseline model were below 1.6, indicating that multicollinearity did not materially distort regression estimates (see Appendix A.2).
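For reference, the VIF of predictor j is 1 / (1 − R²_j), where R²_j comes from regressing predictor j on the remaining predictors. The sketch below implements that definition directly with NumPy on simulated data; the helper `vif` and the data are hypothetical. The statsmodels package also provides a `variance_inflation_factor` function for the same purpose.

```python
import numpy as np

def vif(X):
    """VIF per column: regress column j on the others plus an intercept."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
        out.append(1 / (1 - r2))
    return out

# Simulated predictors (NOT the study data).
rng = np.random.default_rng(3)
a = rng.standard_normal(44)
b = rng.standard_normal(44)
c = a + 0.1 * rng.standard_normal(44)  # nearly a copy of a

vif_independent = vif(np.column_stack([a, b]))
vif_collinear = vif(np.column_stack([a, b, c]))
print(vif_independent)
print(vif_collinear)
```

Adding the near-duplicate column pushes the VIFs of the two overlapping predictors far above the threshold of 10, which is the pattern the diagnostic is meant to detect.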
2.5.5. Independence of Observations
For the multiple regression model, the independence of observations was checked using the Durbin–Watson statistic, which tests for autocorrelation in the residuals. Values between 1.5 and 2.5 were considered to indicate sufficient independence. The Durbin–Watson statistic calculated for this model was 2.26, indicating that the residuals were sufficiently independent to run the multiple regression model.
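The Durbin–Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals, so it ranges from 0 to 4, with values near 2 suggesting no first-order autocorrelation. A minimal sketch follows; the function names and example residuals are hypothetical, and the 1.5 to 2.5 band is the criterion stated above.

```python
import numpy as np

def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def is_sufficiently_independent(dw, low=1.5, high=2.5):
    """Apply the acceptance band used in this study."""
    return low <= dw <= high

# Perfectly alternating residuals show strong negative autocorrelation.
print(durbin_watson([1, -1, 1, -1]))      # 3.0
# The value reported for the baseline model falls inside the band.
print(is_sufficiently_independent(2.26))  # True
```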
2.5.6. Check for Linearity
Furthermore, to check that a linear relationship was present between the dependent variable (MABC-2 balance subscale scores) and the primary predictive variable (SRS-2 Overall T-score), a scatterplot was created in Microsoft Excel and visually inspected for linearity (Figure 1). The scatterplot revealed a linear relationship.
2.5.7. Moderation Analyses
To test whether cultural factors moderate the association between autistic characteristics and balance, we conducted interaction analyses following recommendations for multiple testing correction (Holm, 1979). Specifically, we tested six pre-specified interactions between SRS-2 Overall T-score (standardized) and each cultural moderator: Gender/sex (nonMale vs. Male; dummy-coded); ethnic origin heritage indicators (white, Hispanic, Other; multi-label binary coding as described above); socioeconomic status (Caregiver 1 years of education; standardized); and number of languages spoken in the home (standardized).
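Each interaction is the product of the standardized SRS-2 score and one moderator, added to a model that already contains both main effects. The sketch below shows this for a single moderator and computes the resulting change in R². It uses simulated data and only two baseline predictors for brevity; the helper `r_squared`, the variable names, and the simulated effect sizes are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R2 of a least-squares fit of y on X plus an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

# Simulated stand-in data (NOT the study data).
rng = np.random.default_rng(2)
n = 44
srs_z = rng.standard_normal(n)
gender_nonmale = rng.integers(0, 2, n).astype(float)
balance = (
    7 - 2 * srs_z + 1.5 * srs_z * gender_nonmale + rng.normal(0, 1.5, n)
)

# Baseline model: main effects only.
baseline = np.column_stack([srs_z, gender_nonmale])
# Moderation model: baseline plus the SRS-2 x moderator product term.
interaction = srs_z * gender_nonmale
with_interaction = np.column_stack([baseline, interaction])

delta_r2 = r_squared(with_interaction, balance) - r_squared(baseline, balance)
print(f"Delta R2 = {delta_r2:.3f}")
```

Because the two models are nested, the change in R² cannot be negative; whether it is large enough to matter is what the significance test on the interaction coefficient addresses.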
Each interaction was tested in a separate regression model that included all baseline predictors (SRS-2, Age, Gender/sex, ethnic origin heritage indicators, SES, and number of languages) plus one SRS-2 × Moderator interaction term. For each test, we report the unstandardized coefficient (β), 95% confidence interval, p-value, change in explained variance (ΔR²) relative to the baseline model, and the model R². To control the family-wise error rate across these six pre-specified tests, we applied Holm–Bonferroni sequential correction with α = 0.05 (Holm, 1979). This method adjusts p-values to account for multiple comparisons while maintaining greater statistical power than simple Bonferroni correction.
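The Holm–Bonferroni procedure sorts the p-values from smallest to largest, multiplies the smallest by the number of tests m, the next by m − 1, and so on, and then enforces that adjusted values never decrease along that order. A minimal pure-Python sketch is shown below; the function name and the six p-values are hypothetical and are not results from this study. The statsmodels function `multipletests` with `method="holm"` offers an equivalent adjustment.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # keep values monotone
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values for six interaction tests (NOT study results).
raw = [0.004, 0.030, 0.020, 0.500, 0.200, 0.012]
adj = holm_adjust(raw)
significant = [p < 0.05 for p in adj]
print([round(p, 3) for p in adj])  # [0.024, 0.09, 0.08, 0.5, 0.4, 0.06]
print(significant)
```

In this example four of the six raw p-values fall below 0.05, but only the smallest remains significant after adjustment.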
Additionally, we fit an exploratory omnibus model including all six interactions simultaneously (reported in Appendix A.3). This model is treated as exploratory given power constraints with N = 44 and 14 predictors. We calculated variance inflation factors (VIF) for this model to assess multicollinearity among interaction terms.
A sensitivity analysis was conducted to determine detectable effect sizes. With N = 44, eight baseline predictors and one interaction term, α = 0.05, and power = 0.80, we were adequately powered to detect moderate-to-large interaction effects (f² ≥ 0.24) but likely underpowered for small-to-moderate effects (f² < 0.15). Accordingly, moderation findings should be interpreted as preliminary evidence requiring replication in adequately powered samples.
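One way to approximate this kind of sensitivity analysis is to compute the power of the F test for a single added term from the noncentral F distribution. The sketch below is an assumption-laden illustration, not the authors' procedure: the function name is hypothetical, and it uses Cohen's noncentrality convention, λ = f² × (df_num + df_den + 1). Other software uses λ = f² × N instead, so the output need not reproduce the thresholds reported above.

```python
from scipy import stats

def interaction_power(f2, n=44, n_predictors=9, alpha=0.05):
    """Power of the F test for one added term (numerator df = 1)."""
    df_num = 1
    df_den = n - n_predictors - 1  # residual df of the full model
    # Assumption: Cohen's convention for the noncentrality parameter.
    nc = f2 * (df_num + df_den + 1)
    f_crit = stats.f.ppf(1 - alpha, df_num, df_den)
    # Probability that a noncentral F variate exceeds the critical value.
    return stats.ncf.sf(f_crit, df_num, df_den, nc)

for f2 in (0.05, 0.15, 0.24, 0.50):
    print(f"f2 = {f2:.2f}: power = {interaction_power(f2):.2f}")
```

Power rises with f² for a fixed sample size, which is why small interaction effects are the ones most likely to go undetected at N = 44.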