Reliability of the Foot Posture Index and Traditional Measures of Foot Position

Evans, Angela M.; Copper, Alexander W.; Scharfbillig, Rolf W.; Scutter, Sheila D.; Williams, Marie T.

doi:10.7547/87507315-93-3-203

by Angela M. Evans *, Alexander W. Copper, Rolf W. Scharfbillig, Sheila D. Scutter and Marie T. Williams

School of Physiotherapy and Podiatry, and Podiatry Research Group, University of South Australia, Adelaide, South Australia, Australia

* Author to whom correspondence should be addressed.

J. Am. Podiatr. Med. Assoc. 2003, 93(3), 203-213; https://doi.org/10.7547/87507315-93-3-203

Published: 1 May 2003

Abstract

Repeatable measures are essential for clinicians and researchers alike. Both need baseline measures that are reliable, as intervention effects cannot be accurately identified without consistent measures. The intrarater and interrater reliability of the new Foot Posture Index and current podiatric measures of foot position were assessed using a same-subject, repeated-measures study design across three age groups. The Foot Posture Index total score showed moderate reliability overall, demonstrating better reliability than most other current measures, although navicular height (normalized for foot length) was the single most reliable measure in adults. None of the tested measures exhibited adequate reliability in young children. Because most measures demonstrated less-than-desirable reliability, they need to be interpreted with caution when repeated measures are involved.

During the past 15 years, several studies have examined the reliability of clinical measures of the foot. The reliability of these measures has received due attention, as clinicians and researchers have questioned the value of the measures on which interventions are often based. Repeatable measures are essential for clinicians and researchers alike. Both need baseline measures that are reliable so that intervention effects can be accurately assessed. It is essential that measures with demonstrable reliability are adopted for use in clinical and research settings.

Podiatric physicians the world over embraced the Root paradigm of biomechanics until the late 1980s, when critical inquiry of these clinically accepted measures began. [1-4] Developed in the 1970s, the work of Root et al [5] became the fundamental principles for undergraduate podiatric medicine students for at least 20 years. The more recent questioning of these principles has resulted in both healthy debate and investigations into the reliability of these measures. [6-10]

Some studies [11-13] have supported the reliability of traditional foot measures as described by Root et al. [5] Intratester reliability has been consistently found to be stronger than intertester reliability. [7,11,14] However, closer scrutiny of these results reveals large error estimates associated with measures, which reduces confidence in measurement precision. [8,15] Overall, the reliability of the traditional Root measures of foot posture has been found to be lacking. Some individual measures, for example, navicular height, display acceptable reliability, [7,10] whereas the two “stalwart” measures—resting calcaneal stance position and neutral calcaneal stance position—are less reliable, with the latter involving the elusive concept of the subtalar joint neutral position. [1,10,15-17] The subtalar joint neutral position is the crux of the Root theories, about which all feet are said to optimally function and from which all measurements are made. However, this concept has also been a significant barrier to reliability, lacking definition [1] as a starting point for measuring the foot. [10] The unacceptable reliability of the neutral calcaneal stance position is a critical finding and must be appreciated by podiatric physicians who rely on this measure for orthotic prescription.

It is important to distinguish between intrarater and interrater reliability. Intrarater reliability is the consistency of an individual examiner’s repeated measurements, and interrater reliability refers to the consistency of measurements between examiners measuring the same subjects. A measurement found to be reliable for one examiner repeatedly (intrarater) will not necessarily be reliable for all examiners (interrater), which limits the usefulness of such measures for other examiners. Interrater reliability provides greater confidence in the measures and indicates that other examiners would have found similar results. [18]

The absence of acceptable reliability for several common clinical measures has posed a problem for podiatric physicians. However, to date there has been no real alternative presented for clinical and research settings. How can the foot be best assessed and “measured” if many traditional measures are demonstrably unreliable? A variety of observational assessment scales have been developed to try to address this issue. Most scales acknowledge the triplanar nature of the foot and include scaled observations in all three cardinal planes. [11,19,20] The most recent scale to be developed is the Foot Posture Index (FPI) (AC Redmond, unpublished booklet, 2000). The FPI is a system for observing and rating foot posture features that includes eight criteria that sum to a total score. The FPI criteria require the examiner to observe and rate foot morphology in all cardinal planes, ultimately scaling it along the customary continuum of pronated to supinated. The reliability of this index has not yet been established. The aim of this study was to determine and compare the interrater and intrarater reliability of the FPI and selected traditional measures of foot position.

Methods

A same-subject repeated-measures design was used to determine both intrarater and interrater reliability. Three groups of subjects were recruited: 29 children (aged 4 to 6 years; 13 girls and 16 boys), 30 adolescents (aged 8 to 15 years; 12 girls and 18 boys), and 30 adults (aged 20 to 50 years; 15 women and 15 men). Mean group ages, overall and by gender, were as follows: children, 4.8 years (girls, 5.0 years; boys, 4.6 years); adolescents, 10.9 years (girls, 10.6 years; boys, 11.1 years); and adults, 35.6 years (women, 37.4 years; men, 33.4 years).

Participants were asymptomatic; consenting individuals were sampled by convenience from a kindergarten, a school, and a university campus. Individuals with a history of foot surgery were excluded. To measure interrater reliability, children were assessed by three examiners each and adolescents and adults were assessed by four examiners each. The examiners had 11 to 15 years of clinical experience. Ethical approval was obtained from the University of South Australia Human Research and Ethics Committee and from the school or kindergarten ethics bodies, as applicable. Informed parental consent was obtained for all participating children.

The clinical foot measures examined were the FPI and a selection of traditional measures of foot position, described in the following sections.

Foot Posture Index

The FPI, as described by Redmond (unpublished booklet, 2000), consists of eight specific criteria: talar head palpation, supralateral and infralateral malleolar curvature, Helbing’s sign, calcaneal frontal plane position, prominence in the region of the talonavicular joint, congruence of the medial longitudinal arch, congruence of the lateral border of the foot, and abduction and adduction of the forefoot on the rearfoot.

A 1-hour training session was held to familiarize the examiners with the FPI and the rating of each of the eight criteria before the first testing session. Each FPI criterion was scored on a 5-point scale (range, –2 to +2) by each examiner. [21] Total scores were then calculated and used to determine the intrarater and interrater reliability of the examiners with respect to assessment of foot posture.
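The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming integer ratings and a fixed count of eight criteria as stated in the text; the function name `fpi_total` and the example ratings are hypothetical and are not taken from the FPI booklet.

```python
from typing import Sequence

FPI_CRITERIA = 8               # number of criteria described in the text
SCORE_MIN, SCORE_MAX = -2, 2   # 5-point scale for each criterion


def fpi_total(scores: Sequence[int]) -> int:
    """Sum eight criterion scores (each -2..+2) into an FPI total score."""
    if len(scores) != FPI_CRITERIA:
        raise ValueError(f"expected {FPI_CRITERIA} scores, got {len(scores)}")
    for s in scores:
        if not (isinstance(s, int) and SCORE_MIN <= s <= SCORE_MAX):
            raise ValueError(f"score {s!r} is outside {SCORE_MIN}..{SCORE_MAX}")
    return sum(scores)


# Hypothetical ratings for one foot (illustrative values only).
example_ratings = [1, 1, 0, 1, 2, 1, 0, 1]
print(fpi_total(example_ratings))  # 7
```

With eight criteria each bounded by -2 and +2, the total necessarily lies between -16 and +16.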

Traditional Measures of Foot Position

Measures of foot position commonly used by podiatric physicians were also assessed for intrarater and interrater reliability. This assessment allowed for analysis of each individual traditional measure and for comparison with the newly developed FPI.

The procedures for the assessment of each traditional measure are described in the following subsections. Before assessment, subjects were asked to stand and to walk for three steps and then stop with their feet in a position of standard support.

Navicular Height. The examining podiatric physician palpated the medial midfoot to locate the navicular tuberosity and marked the site with a pen. The height of this mark from the floor was then measured and recorded. Measurements were taken using a ruler marked in millimeters. The test was repeated for the other foot. To standardize the measure of navicular height, foot length was used as the denominator, and the measure is thus reported as normalized navicular height.

Navicular Drop. The examining podiatric physician palpated the medial midfoot to locate the navicular tuberosity and marked this site with a pen. The examiner then placed the subject’s foot in the neutral calcaneal stance position and asked the subject to maintain this position. The height of the mark on the foot from the floor was then measured and recorded (N1). The subject was then asked to relax the foot (resting calcaneal stance position). The height of the mark on the foot from the floor was then measured and recorded (N2). Measurements were taken using a ruler marked in millimeters. The navicular drop (ND) was then calculated as N1 – N2 = ND. The test was then performed on the other foot.
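The two navicular calculations described above reduce to simple arithmetic. The sketch below assumes all heights and the foot length are recorded in millimeters; the function names and the sample readings are hypothetical.

```python
def navicular_drop(n1_mm: float, n2_mm: float) -> float:
    """ND = N1 - N2: height in the neutral position minus height when relaxed."""
    return n1_mm - n2_mm


def normalized_navicular_height(height_mm: float, foot_length_mm: float) -> float:
    """Navicular height divided by foot length, giving a unitless ratio."""
    if foot_length_mm <= 0:
        raise ValueError("foot length must be positive")
    return height_mm / foot_length_mm


# Hypothetical readings for one foot (illustrative values only).
print(navicular_drop(48.0, 41.0))                           # 7.0
print(round(normalized_navicular_height(45.0, 250.0), 3))   # 0.18
```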

Measurements of the resting and neutral calcaneal stance positions were taken using a degree-incremented tractograph (MDM Manufacturing, Oklahoma City, Oklahoma) with the patient in the same position as for measuring navicular height and navicular drop.

Resting Calcaneal Stance Position. The osseous medial and lateral borders of the calcaneus were palpated and visually bisected. The degree-incremented tractograph was then used to measure the angle of the visualized calcaneal bisection line to the supporting surface. This was recorded as the resting calcaneal stance position.

Neutral Calcaneal Stance Position. This measure was taken with the subject standing as mentioned previously. This time the subject’s subtalar joint was placed in the neutral position. The neutral subtalar joint position was approximated by talar head palpation for congruency. Once the subtalar joint was placed in the neutral position, the visualized calcaneal bisection was measured, as with resting calcaneal stance position, and recorded as the neutral calcaneal stance position (in degrees).

Forefoot-to-Rearfoot Measurement. The only open-kinetic-chain (nonweightbearing) foot measure assessed was the relationship between the forefoot and rearfoot plantar planes. This measure was included because of its value in defining foot type, particularly in terms of closed-kinetic-chain compensation. [22]

The subtalar joint was held in the neutral position (located by talar head palpation) with the fourth and fifth metatarsals loaded in the direction of dorsiflexion to the point of first resistance. The alignment of the forefoot (first to fifth metatarsal plane only) was observed in relation to the bisection of the calcaneus. Using a tractograph, one arm was placed across the plantar aspect of the metatarsals, and the arm was opened until it was perpendicular to the visualized bisection of the calcaneus. This gave a measure of forefoot varus or valgus in degrees.

Procedure

This reliability study was conducted on three samples: adults aged 20 to 50 years, adolescents aged 8 to 15 years, and children aged 4 to 6 years. The procedure used for adults, adolescents, and children was the same.

A research assistant was responsible for randomization of examination times, reception of subjects, numbering of subjects, and collection of examination sheets to ensure anonymity and confidentiality.

Subjects entered the examination area and stood on a raised platform. To ensure consistency of the stance position across all testing sessions, a template outline of the feet was made for each subject. Color-coded data-collection sheets (examiner-specific and subject-numbered) were placed next to each limb of each subject. Each examiner performed all measures on each limb of each subject, with no subject’s limbs being examined consecutively by any examiner. Nonconsecutive limb examination was adopted so that the results from the first foot would not bias the rater’s examination of the second foot. Once the data sheets were completed, the examiners placed them in a central collection box. The subjects remained in their positions while the examiners rotated at random between subjects, repeating the procedure until all examiners had performed all weightbearing measures on all subjects. Any pen marks used for measuring were removed by the examining podiatric physician at the end of testing.

Subjects then moved to the plinths, where they lay prone. The examiners measured the forefoot-to-rearfoot relationship in turn, again examining individual subjects’ limbs nonconsecutively. At completion of these measures, these examination sheets were placed in the collection box. For intrarater reliability, subjects were reexamined no less than 4 hours later.

Data Management and Analysis

Data were entered and all analyses were performed using constructed data sets in SPSS version 10 (SPSS Science, Chicago, Illinois) and Microsoft Excel 2000 (Microsoft Inc, Redmond, Washington) software packages.

The FPI assessments yielded categorical (ordinal) data for each individual criterion. Summation of the eight criteria to a total score resulted in continuous (interval) data. Traditional measures yielded only continuous (interval and ratio) data. Paired t-tests were applied to check for systematic differences between repeated measures for each rater.
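As a rough illustration of the paired t-test mentioned above, the following sketch computes the t statistic and degrees of freedom for one rater's two sessions using only the Python standard library. The session values are hypothetical. Obtaining a P value additionally requires the t distribution, which a statistics package would normally supply.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(first: Sequence[float], second: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for paired observations."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical FPI totals from one rater in two sessions (illustrative only).
session_1 = [4, 6, 5, 7, 3]
session_2 = [5, 6, 6, 8, 3]
t_stat, df = paired_t(session_1, session_2)
print(round(t_stat, 3), df)
```

A t statistic far from zero suggests a systematic shift between sessions rather than random scatter.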

To determine both intrarater and interrater reliability, categorical data (FPI criterion data) were analyzed using the nonparametric statistical analysis of Spearman’s rank correlation (ρ). Interrater analysis of FPI criteria was performed by calculating ρ for each examiner pair in each session and then averaging the results across the examiner pairs (adults and adolescents: four pairs and sessions; children: three pairs and sessions).
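The criterion-level analysis can be sketched as follows, assuming ties are handled with average ranks (the usual convention for ordinal scores with few categories). The text does not state how examiner pairs were selected, so this sketch simply averages over every possible pair, which is an assumption. All function names and ratings are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises if either input has no variation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_pairwise_rho(ratings: Dict[str, Sequence[int]]) -> float:
    """Average rho over every pair of examiners (pair choice is an assumption)."""
    return mean(spearman_rho(ratings[a], ratings[b])
                for a, b in combinations(ratings, 2))


# Hypothetical scores for one FPI criterion on six limbs (illustrative only).
scores = {
    "rater_1": [0, 1, 2, -1, 1, 0],
    "rater_2": [0, 1, 1, -1, 2, 0],
    "rater_3": [1, 1, 2, 0, 1, -1],
}
print(round(mean_pairwise_rho(scores), 2))
```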

To determine both intrarater and interrater reliability for continuous data (FPI total scores and traditional measures), parametric statistical analyses were used. To determine intrarater agreement, two approaches were used: intraclass correlation coefficients (ICCs) were calculated (model [3,1] based on two-way analysis of variance, mixed effect with consistency) and the standard error of measurement (SEM) with 95% limits of agreement was determined for each rater. To determine average intrarater ICCs, a form of the standardized (z) score was used. Individual rater ICC (r) values were transformed to z scores, the resulting z scores were averaged, and the average z score was then transformed back to an r value. Intraclass correlation coefficients (model [2,k] based on two-way analysis of variance, random effect with absolute agreement) were calculated to determine interrater agreement, and the mean SEMs with 95% limits of agreement were determined across raters. The ICC, widely used for reliability analyses, reflects both correlation and agreement and provides a single index among two or more ratings, which was a requirement of this study. Calculating ICCs also made the results comparable with those of other studies. The SEM was calculated to enhance the clinical application of results. The SEM displays measurement error in the units in which original measurement occurred and hence is related more directly to the clinical setting. Each SEM was calculated with 95% limits of agreement, which contain the measures expected to fall within two standard deviations above and below the mean of the difference scores.
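For readers who want to see how the two ICC forms relate to the two-way analysis of variance, the sketch below computes ICC(3,1) and ICC(2,k) from the usual mean squares for a complete subjects-by-raters table. It is not the SPSS procedure used in the study. The averaging step assumes that the "form of the standardized (z) score" is the Fisher z transformation (inverse hyperbolic tangent), which the text does not state explicitly. Function names and the ratings table are illustrative.

```python
import math
from statistics import fmean
from typing import Sequence, Tuple

Table = Sequence[Sequence[float]]  # rows = subjects (limbs), columns = raters


def mean_squares(ratings: Table) -> Tuple[float, float, float]:
    """Return (MS subjects, MS raters, MS error) from a two-way ANOVA."""
    n, k = len(ratings), len(ratings[0])
    grand = fmean(x for row in ratings for x in row)
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    return ss_rows / (n - 1), ss_cols / (k - 1), ss_error / ((n - 1) * (k - 1))


def icc_3_1(ratings: Table) -> float:
    """ICC(3,1): two-way mixed effects, consistency, single measurement."""
    k = len(ratings[0])
    ms_r, _, ms_e = mean_squares(ratings)
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e)


def icc_2_k(ratings: Table) -> float:
    """ICC(2,k): two-way random effects, absolute agreement, mean of k raters."""
    n = len(ratings)
    ms_r, ms_c, ms_e = mean_squares(ratings)
    return (ms_r - ms_e) / (ms_r + (ms_c - ms_e) / n)


def average_icc(values: Sequence[float]) -> float:
    """Average coefficients via Fisher's z (assumed): atanh, mean, tanh."""
    return math.tanh(fmean(math.atanh(v) for v in values))


# A small illustrative table: 6 subjects rated by 4 raters.
table = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_3_1(table), 2))  # 0.71
print(round(icc_2_k(table), 2))  # 0.62
```

In this table the raters differ systematically in level, so the consistency form (3,1) is higher than the absolute-agreement form (2,k), even though the latter averages over four raters.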

Acceptable levels of reliability were defined, acknowledging that such limits are essentially arbitrary. However, if adopted by convention, such definitions provide useful “benchmarks” for discussion. [23] Intraclass correlation coefficient values greater than 0.75 indicated good reliability, 0.50 to 0.75 indicated moderate reliability, and less than 0.50 represented poor reliability. [18]
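The benchmarks above translate directly into a small classification helper. This is a sketch only; the function name is hypothetical, and the example values are interrater ICCs reported elsewhere in this article.

```python
def classify_icc(icc: float) -> str:
    """Label an ICC using the benchmarks adopted in this study."""
    if icc > 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Interrater ICCs quoted in the Results and Discussion sections.
print(classify_icc(0.76))  # good     (normalized navicular height, adults)
print(classify_icc(0.55))  # moderate (navicular drop, children)
print(classify_icc(0.47))  # poor     (navicular drop, adolescents)
```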

Results

Paired t-tests were applied to check for possible systematic differences between repeated measures for each rater. A few were statistically significant (P < .05) in the children’s study (FPI, raters 1 and 2; navicular height, rater 1; and neutral calcaneal stance position, rater 3) and adults’ study (FPI, rater 4; navicular height, rater 4; and forefoot-to-rearfoot relationship, rater 2). No paired-samples tests were statistically significant in the adolescents’ study.

Descriptive statistics were used for each of the measures to indicate the range of foot types in each of the study groups. The children’s study provided the following mean (range) values for each measure: FPI, 6.65 (–1 to +14); navicular height, 31.91 mm (20 to 42 mm); navicular drop, 6.23 mm (0 to 15 mm); resting calcaneal stance position, 4.02° everted (15° everted to 3° inverted); neutral calcaneal stance position, 0.79° inverted (5° everted to 10° inverted); and forefoot-to-rearfoot relationship, 3.43° varus (1° valgus to 11° varus).

In the adolescents’ study, mean (range) values for each measure were as follows: FPI, 5.43 (–4 to +11); navicular height, 39.40 mm (30 to 55 mm); navicular drop, 6.23 mm (0 to 15 mm); resting calcaneal stance position, 3.36° everted (10° everted to 7° inverted); neutral calcaneal stance position, 1.62° inverted (4° everted to 15° inverted); and forefoot-to-rearfoot relationship, 3.15° varus (4° valgus to 15° varus).

In the adults’ study, mean (range) values for each measure were as follows: FPI, 4.98 (–2 to +13); navicular height, 44.68 mm (29 to 64 mm); navicular drop, 7.21 mm (0 to 20 mm); resting calcaneal stance position, 1.88° everted (8° everted to 7° inverted); neutral calcaneal stance position, 1.79° inverted (4° everted to 10° inverted); and forefoot-to-rearfoot relationship, 2.01° varus (13° valgus to 20° varus).

Homogeneity of continuous data for each leg was analyzed to determine whether single-leg data could be pooled (Table 1). From this analysis (using the type [1,1] ICC in its most conservative form), it can be seen that the data were suitable to be pooled. The slight aberrance of navicular height in the children’s data and of resting calcaneal stance position in both the adolescents’ and adults’ data is acknowledged. Pooling had the effect of doubling the sample size, that is, data could reasonably be analyzed on an individual limb basis rather than on the basis of subject numbers. Hence, after this analysis, the children’s study included 3 examiners, 29 subjects, and 58 limbs and the adolescents’ and adults’ studies included 4 examiners, 30 subjects, and 60 limbs.

The children’s study provided less reliable results overall (Table 2). The three examiners involved all demonstrated good intrarater reliability results with the FPI. One examiner demonstrated good intrarater reliability for resting calcaneal stance position (ICC = 0.79), and another examiner demonstrated good intrarater reliability for navicular drop (ICC = 0.89). None of the examiners demonstrated good intrarater reliability for normalized navicular height, neutral calcaneal stance position, or forefoot-to-rearfoot relationship (ICC < 0.75). Interrater results were all moderate to poor. The FPI SEM for raters averaged 1.3 points (±2.6 points), which approximates 4% (±7.5%) of the 33-point FPI scale. The mean SEMs for traditional measures were large, especially for navicular drop (2.5 mm or 16.7% [±32.7%]; normal range, 10 to 15 mm). These results suggest that young children’s feet are less reliably assessed (especially between raters) than are those of older children and adults.
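The relative error sizes quoted above can be approximated by expressing each SEM as a percentage of a reference range. The sketch assumes that the FPI figure is taken against the 33-point scale and that the navicular drop figure is taken against the 15-mm upper bound of the stated normal range; the second assumption is inferred from the reported 16.7% and is not stated in the text. The ± percentages are not reproduced here.

```python
def percent_of_range(error: float, reference: float) -> float:
    """Express a measurement error as a percentage of a reference range."""
    if reference <= 0:
        raise ValueError("reference range must be positive")
    return 100.0 * error / reference


FPI_SCALE_POINTS = 33        # total scores -16..+16 inclusive
NAVICULAR_DROP_REF_MM = 15   # assumed reference: upper bound of the normal range

print(round(percent_of_range(1.3, FPI_SCALE_POINTS), 1))       # 3.9
print(round(percent_of_range(2.5, NAVICULAR_DROP_REF_MM), 1))  # 16.7
```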

In the adolescents’ study, all examiners achieved good intrarater reliability for the FPI and normalized navicular height (ICC ≥ 0.75) (Table 3). Two examiners achieved good intrarater reliability for forefoot-to-rearfoot relationship, one examiner achieved good intrarater reliability for resting calcaneal stance position, and one examiner achieved good intrarater reliability for neutral calcaneal stance position. Interrater reliability was moderate for FPI, normalized navicular height, forefoot-to-rearfoot relationship, and resting calcaneal stance position and poor for navicular drop and neutral calcaneal stance position. The FPI SEM was similar to that found in the children’s study. Again, navicular drop and forefoot-to-rearfoot relationship showed large measurement errors. Rater 4 showed the greatest error and the least intrarater reliability for forefoot-to-rearfoot relationship.

In the adults’ study, all examiners achieved good intrarater reliability for normalized navicular height and forefoot-to-rearfoot relationship (ICC > 0.75) (Table 4). Three examiners achieved good intrarater reliability for the FPI, and two examiners achieved good reliability for navicular drop, but no examiner achieved good intrarater reliability for resting calcaneal stance position or neutral calcaneal stance position. Interrater analyses found that normalized navicular height displayed good reliability between examiners (ICC = 0.76). Interrater reliability was moderate for forefoot-to-rearfoot relationship and the FPI and poor for navicular drop, resting calcaneal stance position, and neutral calcaneal stance position. The FPI SEM was similar to that found in the younger subject studies; navicular drop showed larger measurement error, as did resting calcaneal stance position. Rater 4 showed the greatest measurement error for forefoot-to-rearfoot relationship.

Analysis of the FPI criterion data revealed that Helbing’s sign was the least reliable criterion across all subjects, with Spearman’s ρ interrater reliability of 0.17, 0.16, and 0.33 for the children, adolescent, and adult groups, respectively. Interrater reliability values were generally poor (Table 5). The FPI criterion analysis across age groups revealed that criteria 4, 5, and 6 showed moderate interrater reliability (except criterion 4 in the adolescent group), whereas the other criteria showed poor reliability.

Discussion

The aim of this study was to determine the intrarater and interrater reliability of a variety of the traditional measures of foot position and the newly developed FPI.

Except for young children, it was apparent that normalized navicular height was the most reliable single measure across subjects and between examiners and therefore is useful to include in examination of the older foot. This finding agrees with that of Weiner-Ogilvie and Rome, [10] who also found small intrarater and interrater differences in navicular height measures. However, it is difficult to compare the findings of this study with those of Weiner-Ogilvie and Rome, as their reliability analysis involved mean difference percentage agreement rather than the ICC. Navicular drop, the counterpart measure of normalized navicular height, largely showed poor interrater reliability (ICCs: children, 0.55; adolescents, 0.47; adults, 0.46). The SEM ranged from 1.8 to 3.5 mm, with a mean of approximately 2.5 mm, which represents an error size of 17% of the normal navicular drop range (10 to 15 mm). This finding is in agreement with that of Picciano et al, [3] who also reported poor interrater reliability of navicular drop (ICC = 0.57, SEM = 2.7 mm), but their study involved two inexperienced testers. Mueller et al [2] reported good reliability for navicular drop (ICC = 0.78 to 0.83, SEM = 1.7 to 2.1 mm), but this study involved only one rater and hence cannot be generalized.

The resting calcaneal stance position and the neutral calcaneal stance position were moderately to poorly reliable between examiners and in most cases also least reliable for repeated measures by the same rater. The SEM ranged from 0.9° to 6.2° for resting calcaneal stance position and from 0.2° to 2.1° for neutral calcaneal stance position. Applying these findings to the clinical setting where both measures are within a small range, the error component contributes approximately 20% to these scores. Findings for these traditional measures are in agreement with some previous studies. [6,8,24] Although rater 2 showed good reliability for resting calcaneal stance position in the children’s and adolescents’ studies, examination of the raw data showed that this rater used fewer values when rating adult subjects (ie, 0, 5, and 10), which falsely enhanced reliability results. This was an interesting finding of this reliability study, and numerical preference by raters has been reported elsewhere. [25] Unfortunately, it potentially confounds the data for adult subjects and hence is a limitation of this study.

The interrater reliability of the forefoot-to-rearfoot relationship varied markedly across the study populations (ICCs: children, 0.28; adolescents, 0.53; adults, 0.70). The results were compared with those of Somers and Hanson, [24] who found interrater reliability (ICC) of 0.38 for goniometric assessment and 0.81 for visual assessment. Diamond and Delitto [11] examined right and left limbs separately using goniometric assessment and found interrater (ICC) results for the forefoot-to-rearfoot relationship of 0.58 (right) and 0.77 (left). Although the levels of reliability are similar to those found in the older subject groups of the present study, it is not appropriate to compare these study results because homogeneity was not uniform. The SEM of the forefoot-to-rearfoot relationship was approximately 2° across age groups, which is high for the range of the measure but favorable compared with other measures.

Interrater reliability results (Table 6) offer the more generalizable and thus more useful results for the wider podiatric profession, with intrarater results being pertinent only to individual examiners. Less reliable measures display a larger error range and hence need to be interpreted within the size of the expected range of scores. It is important that generally reliable, low-error measures are developed and used for comparison of intervention effects with baseline findings. It is also important that unreliable measures are identified and discarded when a suitable alternative for subject assessment and orthotic prescription becomes available. Unreliable measures for which there is no current alternative must be used with great caution when repeated measures are planned. Error can potentially account for much of the measurement range. This is clearly the case for resting calcaneal stance position and neutral calcaneal stance position in this and other studies. [6,8] Use of the FPI, by clinicians and researchers alike, should be considered in light of the error estimates and results of comparison with other measures. The validity of this tool is also yet to be verified.

Results from children were very different from those of the other populations. This was not surprising to the three raters, who found the experience of examining young children to be very different from that of the older subjects. The young children moved more, turned to look and talk, and had to be repeatedly repositioned on their templates. Although each rater displayed good intrarater reliability for the FPI, these results cannot be generalized for other raters, as interrater reliability was only moderate. The average intrarater ICCs (z transformed) showed greatest variation in the children’s study, ranging from 0.137 (neutral calcaneal stance position) to 0.808 (FPI). None of the measures tested exhibited adequate reliability across raters in the children, which implies that young children’s feet need to be examined differently than those of adolescents and adults and that rater skills across age groups cannot be assumed.

Results for the adolescent group showed good intrarater reliability for all raters for the FPI, the only study sample to do so. Interrater reliability of the FPI was improved compared with that found in both adults and children. Average intrarater ICCs (z transformed) were more consistent in this age group, ranging from 0.559 (neutral calcaneal stance position) to 0.916 (FPI). Raters 1, 2, and 3 were more practiced in the methodology protocol for the adolescent group, with that group being the last of the three age groups to be examined. Rater 4 was very familiar with the FPI at commencement of the study.

The adult group was the first subject group to be examined, and for raters 1 to 3 this was their first foray with the FPI. It is possible that practice would improve the interrater reliability of the FPI, as it improved in the morphologically similar but later examined adolescent group. The surprising result from this study was the good intrarater and moderate interrater reliability of the forefoot-to-rearfoot relationship measure, as raters were previously least confident about their consistency of this measure. The average intrarater ICCs (z transformed) were similar to those of the adolescent group.

Several notable points arise from the descriptive data. The average FPI value for each study group was within the range attributed to a normal foot posture (+2 to +7). [21] Although all studies involved subjects with supinated and pronated feet, there were fewer extremes of either foot type than could potentially occur in the clinical setting. Hence the reliability of the FPI in extremely supinated or pronated foot types has not been adequately assessed in this study. Not surprisingly, the average navicular height measure increased with age and concurrent increase in foot size and possibly medial arch development. Average navicular drop, resting calcaneal stance position, neutral calcaneal stance position, and forefoot-to-rearfoot relationship measures were relatively static across all age groups, as were the ranges of these measures, except for the adult forefoot-to-rearfoot relationship range. This is perhaps a more surprising finding, as traditional theorists of foot development have described decreasing resting calcaneal stance position eversion and forefoot-to-rearfoot varus with increasing age. Although this was a finding of this study, it is acknowledged that the sample sizes are too small to provide normative trends of developing foot morphology.

In this study, the type (3,1) and (2,k) ICCs were used in their classic forms for reliability studies as described by Shrout and Fleiss. [26] Use of the type (2,k) ICC enabled comparison with the results of other reliability studies. [2,7,11,13,24] There is dissent within reliability studies regarding use of the ICC. [26-28] Specifically, Bland and Altman [28] report concerns about the ICC’s lack of discrimination between correlation and agreement. Data that correlate do not necessarily agree, and low ICC values do not allow for discernment between poor correlation and poor agreement (clearly, this is not problematic for high ICC values). Pearson’s correlation and the t-test allow for distinction of correlation from agreement but give two separate results of reliability, as opposed to the single ICC value.
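The distinction between correlation and agreement can be made concrete with a small constructed example. In the sketch below, two hypothetical raters differ by a constant offset: their scores correlate perfectly, yet they never agree. The data and function name are illustrative only.

```python
import math
from statistics import mean
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical angle readings: rater B always reads 5 units higher than rater A.
rater_a = [2, 4, 6, 8, 10]
rater_b = [value + 5 for value in rater_a]

r = pearson_r(rater_a, rater_b)
mean_difference = mean(b - a for a, b in zip(rater_a, rater_b))

print(round(r, 3))       # 1.0 -> perfect correlation
print(mean_difference)   # 5   -> systematic disagreement
```

A paired t-test on these data would flag the constant offset that the correlation coefficient alone conceals, which is why the two statistics are reported together when the ICC is not used.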

The FPI criterion analysis was performed to investigate the reliability of the eight individual components that sum to the FPI total score. Clearly, a consistently unreliable criterion may reduce the overall reliability of the FPI total score, so identification of this criterion would potentially allow for improvement in FPI reliability. This analysis is ongoing. The FPI combines the scores of the eight individual foot measures (criteria) to give a total score of –16 to +16, with negative scores (less than –2) representing abnormally supinated foot types and positive scores (greater than +12) representing abnormally pronated foot types. The normal foot is believed to score 2 to 7. [21] The summative nature of the FPI has limitations. The summed score is potentially ambiguous, as criteria making up the FPI total score actually reflect different characteristics of foot type, which are not all equal. Therefore, the same FPI total score may be obtained for quite different criterion findings and quite different foot types (Table 7). Ambiguity is a common problem with summative scales. [18] Despite this ambiguity, many summative rating systems are in common use, for example, the SF-36 Health Status Survey [29] and the Foot Health Status Questionnaire. [30] Within such scales it is possible to separate sections to examine smaller “domains.” The FPI has this potential, but further analysis of the various domains, for example, rearfoot versus forefoot segments, is required. Until such analysis is complete, ambiguity will remain a limitation of the FPI.
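The ambiguity of a summed score is easy to demonstrate. The two criterion profiles below are hypothetical (they are not taken from Table 7) and are listed in an arbitrary criterion order, yet they yield the same FPI total despite describing quite different patterns of findings.

```python
# Two hypothetical sets of eight criterion scores, each on the -2..+2 scale.
profile_a = [2, 2, 2, 1, 0, 0, 0, 0]   # high scores concentrated in four criteria
profile_b = [1, 1, 1, 1, 1, 1, 1, 0]   # mild scores spread across seven criteria

total_a = sum(profile_a)
total_b = sum(profile_b)

print(total_a, total_b)        # 7 7
print(profile_a == profile_b)  # False: same total, different findings
```

Reporting the criterion scores alongside the total, or analyzing subsets of criteria as separate domains, would preserve the information that the sum discards.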

An additional limitation of the FPI is that it examines static foot posture by observation of the foot but does not indicate joint ranges of motion. Used alone, it therefore cannot differentiate between rigid and flexible flat feet. Consequently, raters would still need to conduct a range-of-motion assessment (using unreliable techniques) to classify foot type more comprehensively.

This study was limited to asymptomatic subjects and therefore does not cover the complete clinical spectrum of foot types and conditions. In clinical practice it would be expected that a larger proportion of patients would present with foot type extremes. Such an increase in the range of foot types (increasing the range of observed values) may enhance the reliability of the measures.
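Why a wider range of foot types may raise reliability coefficients can be sketched with a simplified model in which the ICC is the share of total variance attributable to true differences between subjects. This is an assumed one-way simplification, not the analysis used in this study, and the standard deviations below are hypothetical.

```python
def icc_from_spread(sd_between_subjects, sd_measurement_error):
    """Simplified ICC: between-subject variance over total variance."""
    var_between = sd_between_subjects ** 2
    var_error = sd_measurement_error ** 2
    return var_between / (var_between + var_error)

# Same hypothetical measurement error, two different subject spreads.
narrow_range = icc_from_spread(sd_between_subjects=2.0, sd_measurement_error=2.0)
wide_range = icc_from_spread(sd_between_subjects=4.0, sd_measurement_error=2.0)

print(round(narrow_range, 2), round(wide_range, 2))  # 0.5 0.8
```

Under this model the raters are no more precise in the second case; the coefficient rises only because the subjects differ more from one another.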

Conclusion

The FPI showed moderate reliability across the three age groups studied. The FPI total score demonstrated better reliability than most other measures. Intrinsically, the FPI, as a summative scale, is potentially ambiguous, which could limit its usefulness as a clinical or research tool. Further analysis of FPI criterion reliability is required.

As a single clinical measure, normalized navicular height was the most reliable, exhibiting good reliability in the adult population. However, reliability was only moderate in young children, and its use in this population must be questioned.

Finally, although the reliability of the FPI and traditional measures (excluding normalized navicular height in the adult group) was in most cases barely acceptable for rigorous protocol, one should not abstain from measuring a phenomenon because of less-than-desirable reliability. If no suitable and reliable alternative method is available, one must use what is available, interpreting results with due caution.

Acknowledgments

Alison Dalli, BAppSc, University of South Australia, and Anne-Maree Keenan, MAppSc, University of Western Sydney–Macarthur, for their invaluable assistance during the first phase of this trial; Anthony Redmond, MSc, University of Western Sydney–Macarthur, for generous collaboration and assistance with this study; and Brenton Dansie, PhD, University of South Australia, for assistance with statistical analysis.

References

Elveru RA, Rothstein JM, Lamb RR: Methods for taking subtalar joint measurements: a clinical report. Phys Ther 68: 678, 1988.
Mueller MJ, Host JV, Norton BJ: Navicular drop as a composite measure of excessive pronation. JAPMA 83: 198, 1993.
Picciano AM, Rowlands MS, Worrell T: Reliability of open and closed kinetic chain subtalar joint neutral positions and navicular drop test. J Orthop Sports Phys Ther 18: 553, 1993.
Keenan A-M: “Understanding Midtarsal Joint Function: Fact and Fallacy,” in Conference Proceedings: 17th Australian Podiatry Conference, Vol 1, p 107, Australasian Podiatry Council, Melbourne, 1996.
Root ML, Orien WP, Weed JH: Normal and Abnormal Function of the Foot, Clinical Biomechanics Corp, Los Angeles, 1977.
Freeman AC: A study of the inter-tester and intra-tester reliability in the measurement of resting calcaneal stance position and neutral calcaneal stance position. Aust Podiatrist (June): 10, 1990.
Sell KE, Verity TM, Worrell TW, et al: Two measurement techniques for assessing subtalar joint position: a reliability study. J Orthop Sports Phys Ther 19: 162, 1994.
Menz HB, Keenan A-M: Reliability of two instruments in the measurement of closed chain subtalar joint positions. The Foot 7: 194, 1997.
Weiner-Ogilvie S, Rendall GC, Abboud RJ: Reliability of open kinetic chain subtalar joint measurement. The Foot 7: 128, 1997.
Weiner-Ogilvie S, Rome K: The reliability of three techniques for measuring foot position. JAPMA 88: 381, 1998.
Diamond J, Delitto A: Reliability of a diabetic foot evaluation. Phys Ther 69: 797, 1989.
Smith-Oricchio K, Harris BA: Interrater reliability of subtalar neutral, calcaneal inversion and eversion. J Orthop Sports Phys Ther 12: 10, 1990.
Johnson SR, Gross MT: Intrarater reliability, interexaminer reliability, and mean values for nine lower limb skeletal measures in healthy Naval midshipmen. J Orthop Sports Phys Ther 25: 253, 1997.
Astrom M, Arvidson T: Alignment and joint motion in the normal foot. J Orthop Sports Phys Ther 22: 216, 1995.
Keenan A-M: A clinician’s guide to the practical implications of the recent controversy of foot function. Australas J Podiatr Med 31: 87, 1997.
Menz HB: Clinical hindfoot measurement: a critical review of the literature. The Foot 5: 57, 1995.
Pierrynowski MR, Smith SB, Mlynarczyk JH: Proficiency of foot care specialists to place the rearfoot at subtalar neutral. JAPMA 86: 217, 1996.
Portney LG, Watkins MP: Foundations of Clinical Research: Applications to Practice, Prentice Hall Health, Upper Saddle River, NJ, 2000.
Rose GK, Welton EA, Marshall T: The diagnosis of flat foot in the child. J Bone Joint Surg Br 67: 71, 1985.
Dahle LK, Mueller M, Delitto A, et al: Visual assessment of foot type and relationship of foot type to lower extremity injury. J Orthop Sports Phys Ther 14: 70, 1991.
Redmond AC, Burns J, Crosbie J, et al: An initial appraisal of the validity of a criterion based, observational clinical rating system for foot posture [abstract]. J Orthop Sports Phys Ther 31: 160, 2001.
Michaud TC: Foot Orthoses and Other Forms of Conservative Foot Care, Thomas C. Michaud, Newton, MA, 1997.
Landis JR, Koch GG: The measurement of observer agreement for categorical data. Biometrics 33: 159, 1977.
Somers DL, Hanson JA: The influence of experience on the reliability of goniometric and visual measurement of forefoot position. J Orthop Sports Phys Ther 25: 192, 2001.
Low JL: The reliability of joint measurement. Physiotherapy 62: 227, 1976.
Shrout PE, Fleiss JL: Intraclass correlation: uses in assessing inter-rater reliability. Psychol Bull 86: 420, 1979.
Rosner B: Statistical methods in ophthalmology: an adjustment for the intraclass correlation between eyes. Biometrics 38: 105, 1982.
Bland JM, Altman DG: Statistical methods for assessing agreement between two measures of clinical measurement. Lancet 1: 307, 1986.
Garratt AM, Ruta DA, Abdalla MI, et al: The SF-36 Health Survey Questionnaire: an outcome measure suitable for routine use within the NHS. BMJ 306: 1440, 1993.
Bennett PJ, Patterson C, Wearing S, et al: Development and validation of a questionnaire designed to measure foot-health status. JAPMA 88: 419, 1998.

Table 1. Data Analyzed for Differences Between the Left and Right Limbs: Type (1,1) ICC (95% CI)

Table 2. Children’s Study: Intrarater and Interrater Reliability for the Three Raters (n = 58)

Table 3. Adolescents’ Study: Intrarater and Interrater Reliability for the Four Raters (n = 60)

Table 4. Adults’ Study: Intrarater and Interrater Reliability for the Four Raters (n = 60)

Table 5. Interrater Reliability for the FPI Criteria for Each of the Three Study Populations

Table 6. Interrater Reliability (ICCs) for Each Age Group

Table 7. FPI Scores for Two Nonidentical Foot Types
