Hindfoot and ankle pathologies are very common, especially among athletes, and can provoke chronic pain and instability in the ankle; in 75% to 80% of patients, the injury involves an isolated rupture of the anterior talofibular ligament [1]. Moreover, instability of the subtalar joint is a potential source of chronic lateral instability of the hindfoot [2]. These injuries can also result in functional disability or limitation of daily activities and, if not managed correctly, can have severe consequences [3].
The American Orthopaedic Foot and Ankle Society (AOFAS) scale is commonly used to evaluate foot and ankle complaints, both in the general population and in athletes. It has four classification systems, according to the anatomical region considered: the hindfoot-ankle, the midfoot, the first metatarsophalangeal joint, and the lesser metatarsophalangeal joints [6]. The AOFAS scale consists of nine elements divided into three sections: pain, function, and alignment [5].
Materials and Methods
The review protocol was registered at the International Prospective Register of Systematic Reviews (PROSPERO: CRD 42022320280) before the identification of articles and data extraction.
Design
A systematic review was conducted of the methodological quality of cross-cultural adaptations of the AOFAS scale.
Search Strategy
The study selection process was based on the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) statement for systematic reviews [17]. The search covered five databases (PubMed, Scopus, CINAHL, PEDro [Physiotherapy Evidence Database], and PROSPERO) from inception to January 2023.
The following search terms were used, together with the operators “OR” and “AND”: surveys and questionnaires [MeSH Terms], reliability, validity, cross-cultural adaptation, psychometric properties, American Orthopaedics Foot and Ankle Score, AOFAS, hindfoot [MeSH Terms], ankle [MeSH Terms].
Inclusion and Exclusion Criteria
Studies were included if they evaluated the reliability and validity of the corresponding version of the AOFAS Ankle-Hindfoot Scale in patients older than 18 years; addressed cross-cultural adaptations of the AOFAS scale; and reported measurement properties based on the COSMIN (COnsensus-based Standards for the selection of health Measurement INstruments) criteria: structural validity, internal consistency, reliability, measurement error, hypothesis testing for construct validity, cross-cultural validity/measurement invariance, criterion validity, and responsiveness. Studies published in languages other than English or Spanish were excluded.
Data Extraction
Article titles and abstracts were read and all relevant full-text articles were then extracted and read for selection. If information was missing or uncertain, the authors were contacted for clarification. The following data were obtained: full title (original questionnaire or cross-cultural adaptation), author, year of publication, language, population used for the validation process, mean participant age, and results.
Quality Appraisal
The updated COSMIN checklist was used to evaluate the methodological quality of the studies performed to investigate the measurement properties of each patient-reported outcome measure [18].
This standard can be used both to assess the methodological quality of studies of patient-reported outcome measures [19] and to compare the measurement properties of various instruments in a systematic review [20]. Each of the properties observed is rated as positive (“+”), negative (“−”) or indeterminate (“?”). Measurement properties are considered with respect to four domains: reliability, validity, responsiveness, and interpretability. Each property contains various items evaluated on a 4-point Likert scale as very good, adequate, doubtful, or inadequate.
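The item-level ratings described above are usually combined into a single quality rating per measurement property. The following minimal sketch assumes the "worst score counts" principle of the COSMIN Risk of Bias checklist, under which the lowest item rating determines the overall rating; the function name and the example ratings are hypothetical and are not taken from the reviewed studies.

```python
# Ratings ordered from best to worst, following the 4-point scale named in the text.
RATING_ORDER = ["very good", "adequate", "doubtful", "inadequate"]


def overall_quality(item_ratings):
    """Return the worst rating among the items of one measurement property.

    Assumption: the "worst score counts" rule is applied, so one poorly
    rated item caps the overall rating of the property.
    """
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # The rating with the highest index in RATING_ORDER is the worst one.
    return max(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for one reliability study.
example_items = ["very good", "adequate", "doubtful", "very good"]
print(overall_quality(example_items))
```

Under this rule, a property with mostly "very good" items is still rated "doubtful" if a single item is doubtful, which is one reason why overall ratings can be low.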
Study Selection
Two blinded reviewers (J.G.-M. and S.S.-M.) evaluated all of the studies obtained. Discrepancies in the process were resolved by discussion and mutual accord, assisted by the intervention of the third and fourth reviewers (A.M.-R. and P.C.-G.). No meta-analysis was performed due to the heterogeneity of the dimensions and outcomes included in these studies.
Results
The initial search identified 185 potential studies, 138 of which were duplicates among the different databases. The remaining 47 studies were examined, and a further 41 were discarded because they did not meet the inclusion criteria (30 were not based on COSMIN criteria and 11 were in languages other than Spanish or English). Ultimately, six full-text articles, all of which met the inclusion criteria, were analyzed. The flowchart for the article selection process (Fig. 1) is based on the PRISMA statement for systematic reviews [21].
Figure 1.
PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) flow diagram. COSMIN, COnsensus-based Standards for the selection of health Measurement INstruments.
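The counts reported for the selection process can be checked with simple arithmetic. The short sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Counts reported in the Results section.
identified = 185
duplicates = 138
excluded = {
    "not based on COSMIN criteria": 30,
    "language other than English or Spanish": 11,
}

# Records remaining after duplicate removal.
screened = identified - duplicates

# Studies remaining after applying the exclusion reasons.
total_excluded = sum(excluded.values())
included = screened - total_excluded

print(f"Screened after duplicate removal: {screened}")
print(f"Excluded: {total_excluded}")
print(f"Included in the review: {included}")
```

The result agrees with the text: 47 studies screened, 41 excluded, and six included.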
Study Characteristics
The AOFAS scale is a self-administered instrument in which the patient answers a series of questions, after which the health professional provides complementary information based on objective data [5]. Various cross-cultural adaptations of this questionnaire have been made (Table 1). The studies considered in this review included 588 participants (51.5% male and 48.5% female). The participants’ mean age was 46.7 years.
Table 1.
Characteristics of the American Orthopaedic Foot and Ankle Society Ankle-Hindfoot Scale
Measurement Properties
The measurement properties of the studies are summarized in Tables 2 and 3. The validity of the articles was checked by reference to COSMIN. In this process, all of the articles were characterized by two reviewers (A.B.O.-A. and M.O.-R.) who were blinded to the authorship details and to each other’s opinions. The analysis covered the original scale and the Italian, Dutch, Danish, and Persian versions.
Table 2.
Summary of Measurement Properties
Table 3.
Summary of COSMIN Ratings
The Dutch version for ankle fractures presented the best overall rating, with four positive results: internal consistency (Cronbach α = 0.70–0.95), reliability (intraclass correlation coefficient > 0.7), construct validity (evidence from factor analysis to confirm the study hypotheses), and responsiveness (minimal important change, 6.6). The Danish version and the Dutch version for hindfoot fractures also recorded positive results for reliability, construct validity, and responsiveness. On the other hand, all of the cultural adaptations considered obtained indeterminate or missing data for structural validity, measurement error, and cross-cultural validity.
Methodological Quality According to Measurement Properties
In addition to the above, we evaluated the methodological quality of the best-rated patient-reported outcome measures using COSMIN criteria to classify their quality as very good, adequate, doubtful, or inadequate [23]. These details are shown in Table 4. In the context of the low overall score, the Dutch AOFAS-Ankle scale had the best score, obtaining positive evaluations for four of the eight items evaluated: internal consistency, reliability, construct validity, and responsiveness. The Italian and Persian versions scored positively for only the reliability criterion. Every version produced negative results for criterion validity, and none presented structural validity.
Table 4.
Methodological Quality Scores for Each Patient-Reported Outcome Measure (PROM)
Discussion
The aim of this study was to review the cross-cultural adaptations of the AOFAS scale for hindfoot and ankle injuries to assess the methodological quality presented. The AOFAS scale has been cross-culturally adapted into Italian [10], Dutch [12], Persian [13, 14], and Danish [6].
The results of the present analysis reflect a generally low methodological quality in the different versions of the AOFAS scale; only the Dutch AOFAS-Ankle adaptation obtained a very good, adequate, or doubtful rating for half of the measurement properties considered.
The mean age of the participants in these studies ranged from 32.1 years in the Persian version [13] to 60.38 years in the Italian version [10]. The number of participants ranged from 50 (36 women and 14 men) in the Italian version to 142 (75 men and 67 women) in the Dutch AOFAS-Ankle instrument [12]. The predominance of male participants in some studies, and of females in others, means that the study populations are heterogeneous in this respect, which may have influenced the results obtained.
A noteworthy finding is that only the Danish study and the corresponding Dutch adaptation focused on patients with a fracture of the ankle [6, 12]. The other Dutch version of the AOFAS scale [12] included patients with hindfoot fractures, distinguishing between two areas of the foot (separating calcaneal fractures [n = 82; 72.6%] from talar fractures [n = 36; 31.9%]). Moreover, many of these fractures were treated nonoperatively (n = 72; 73.6%). The other cross-cultural adaptations did not distinguish between the anatomical regions affected. The Iranian version [14] included, in addition to fractures, cartilage injuries (n = 31; 31%), tendon injuries (n = 21; 21%), ligament injuries (n = 17; 17%), old trauma (n = 9; 9%), deformities (n = 10; 10%), and plantar fasciitis (n = 12; 12%). The Italian version [10], on the other hand, included fractures of the calcaneus or malleolus, and the Persian version [13], in addition to fractures of the calcaneus and ankle, included talar osteochondral defect, ankle sprain, Achilles tendon rupture, and ankle ganglion cyst.
The COSMIN classification reveals the generally low methodological quality of the cross-cultural adaptations made of the AOFAS scale. The most highly rated version, the Dutch adaptation, achieved only four positive ratings, with an indeterminate value for cross-cultural validity, structural validity, and measurement error and a negative rating for criterion validity. The remaining versions also obtained negative or indeterminate values for the latter criterion, which suggests that if this factor is influenced by other conditions affecting the leg/foot, it can be difficult to determine the degree of injury. In view of the indeterminate value obtained, the same conclusion can be drawn for the cross-cultural validity criterion because for each of these adaptations there is no guarantee that the instrument accurately measures function, pain, and alignment. The reliability criterion was the only one for which a positive value was obtained by all of the versions. For internal consistency, the Danish and Dutch-Hindfoot versions obtained negative values (for the Dutch version, Cronbach α = 0.585). In this respect, however, statistical calculation was not possible for the pain and alignment subscales because only one item was included. The function subscale produced a value of 0.863, which would indicate a positive level of internal consistency. Nevertheless, most of the versions obtained a negative value in this respect, and in no case was it possible to calculate the Cronbach α value for all of the subscales.
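The reason Cronbach α cannot be obtained for a single-item subscale follows from its formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), which is undefined when the number of items k is 1. The sketch below is a minimal illustration using the Python standard library; the function name and the item scores are hypothetical and do not reproduce data from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(items):
    """Compute Cronbach alpha for one subscale.

    items: list of lists, one inner list of scores per item, with
    respondents in the same order in every inner list.
    Returns None when the subscale has fewer than two items, because
    the k / (k - 1) term is then undefined.
    """
    k = len(items)
    if k < 2:
        return None
    n = len(items[0])
    if n < 2 or any(len(scores) != n for scores in items):
        raise ValueError("each item needs the same number (>= 2) of respondents")
    # Total score per respondent across all items of the subscale.
    totals = [sum(respondent) for respondent in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical scores for a three-item subscale and a one-item subscale.
function_items = [[10, 7, 4, 10, 7], [8, 8, 4, 8, 4], [5, 5, 0, 5, 3]]
pain_items = [[40, 30, 20, 40, 30]]

print(cronbach_alpha(function_items))
print(cronbach_alpha(pain_items))  # None: a single item has no internal consistency
```

Returning `None` for a one-item subscale is a design choice made here to mirror the situation described in the text, where no value could be calculated for the pain and alignment subscales.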
A ceiling effect was detected in two of the five cross-cultural adaptations analyzed, the two Dutch versions [12], with values near 15%, 16.2%, and 17.7%, which indicates that the responses of these questionnaires should be more specific and direct to better classify the participants. No floor effect was found in the versions analyzed, having a maximum effect of 1.9% in the cross-cultural adaptation to Persian performed by Sayyed-Hosseinian et al [13].
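Floor and ceiling effects are typically expressed as the percentage of participants obtaining the lowest or highest possible score. The sketch below illustrates the calculation under two assumptions that are not stated in the text: a score range of 0 to 100 points and a 15% cutoff, a commonly cited criterion in the measurement literature. The scores are hypothetical.

```python
def floor_ceiling_effects(scores, min_score=0, max_score=100):
    """Return (floor %, ceiling %) for a list of total scores.

    Assumption: the instrument is scored from min_score to max_score;
    the defaults of 0 and 100 are illustrative.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    floor_pct = 100 * sum(s == min_score for s in scores) / n
    ceiling_pct = 100 * sum(s == max_score for s in scores) / n
    return floor_pct, ceiling_pct


# Assumed cutoff: an effect is flagged when more than 15% of participants
# obtain the extreme score.
THRESHOLD_PCT = 15.0

# Hypothetical total scores for ten participants.
scores = [100, 100, 85, 90, 72, 60, 100, 45, 88, 95]
floor_pct, ceiling_pct = floor_ceiling_effects(scores)

print(f"Floor: {floor_pct:.1f}% (flagged: {floor_pct > THRESHOLD_PCT})")
print(f"Ceiling: {ceiling_pct:.1f}% (flagged: {ceiling_pct > THRESHOLD_PCT})")
```

In this hypothetical sample, three of ten participants reach the maximum score, so a ceiling effect would be flagged, whereas no participant is at the minimum.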
Regarding methodological quality, the overall rating for the Dutch (ankle fractures) version of the AOFAS scale was very low [12] because it only scored “very good” for internal consistency and criterion validity. The same result was obtained by two other instruments, the Dutch (hindfoot) [12] and Danish [6] versions. These obtained worse results than the Dutch (ankle fracture) version for content validity and patient-reported outcome measure development, and so the latter was considered to have better overall methodological quality.
The main strength of the present study is the rigorous methods applied in the systematic review, which included a peer-blinded quality assessment using a standard method, COSMIN, and the comprehensive identification of available studies and versions of the AOFAS scale. Among its limitations, this study is subject to the heterogeneity of the participants included in the studies selected for analysis and to the considerable variation in their mean age.
Further studies, using robust methods, should be conducted to examine the cross-cultural adaptations made of the AOFAS scale, preferably with a larger sample population and an even male-to-female balance among the participants. Furthermore, the AOFAS scale should be correlated with a similar instrument, such as the Self-reported Foot and Ankle Score, to obtain better criterion validity data, and structural validity should be thoroughly analyzed in future validations and cross-cultural adaptations of the AOFAS scale.