1. Introduction
Total hip arthroplasty (THA) is recognized as a successful and reliable intervention, usually performed to relieve pain or improve hip mobility and stability [
1,
2]. Obtaining the correct center of rotation, appropriate orientation of implant components, appropriate femoral offset, and equal leg lengths is critical to restoring optimal hip biomechanics through this procedure [
3,
4,
5,
6].
In the era of increased healthcare services marketing, patient satisfaction is often used as a surrogate to gauge the overall success of medical practice [
7]. Patient satisfaction rates for primary THA have historically been greater than 90% [
8]. Nevertheless, recent multi-center studies using validated tools still find that 8–11% of patients who undergo THA report postoperative dissatisfaction [
7,
9,
10,
11]. Postoperative leg length discrepancy (LLD) is often the basis for this dissatisfaction and has been found to occur in up to 26% of THA cases [
4,
12]. Data published by the Joint Commission on Accreditation of Healthcare Organizations (JCAHO) reveal that LLD may account for as much as 4.7% of all medical errors and contribute significantly to increased patient morbidity [
13]. Therefore, it is unsurprising that LLD following THA is a leading cause of litigation against orthopedic surgeons [
14].
The precise boundary between acceptable and unacceptable leg length discrepancy (LLD) levels remains unclear [
15,
16,
17]. However, it is widely agreed that LLD magnitudes exceeding 1.5 cm can have adverse effects on patients, including sciatic, femoral, or peroneal nerve palsies, chronic lower back pain, hip dislocation, the need for a shoe lift, or gait disorders [
18]. Studies have also linked significant LLD to higher rates of postoperative trochanteric pain and suspected aseptic loosening. Additionally, compensatory gait abnormalities resulting from underlying LLD can lead to degenerative arthritis of the lower extremities and lumbar spine [
19,
20]. These factors likely contribute to overall patient dissatisfaction [
13].
When assessing LLD, the discrepancy must be determined to be true (anatomical) or apparent (functional) [
21]. True discrepancies are those in which an actual bony asymmetry exists between the head of the femur and the ankle mortise. Functional LLDs occur as physiological responses to altered biomechanics and can be caused by soft-tissue contractures, spinal deformities, or other processes that lead to pelvic obliquity. Recognition of functional LLD is essential because this is what the patient perceives. Recent studies have demonstrated that apparent LLD, not anatomical LLD, better predicts poor physical performance and reduced mobility after THA [
4,
21,
22,
23,
24]. In light of this, the American Academy of Orthopaedic Surgeons recommends that, in addition to measuring true and apparent limb lengths before performing THA, abduction, adduction, and flexion contractures should be assessed and quantified. Clinically, most discrepancies result from a combination of true and apparent differences [
25,
26].
Clinical assessment of true limb length can be conducted by measuring the distance between the anterior superior iliac spine (ASIS) and the medial malleolus. Similarly, apparent length can be measured from the xiphisternum or umbilicus to the medial malleoli of the ankle. Of these two “direct” methods, the measurement of true length is arguably more reliable than apparent length [
25]. However, neither method comprehensively assesses the effects of soft-tissue contractures or pelvic obliquity. A more reliable clinical technique for measuring LLD is to level the pelvis of a standing patient by placing blocks of known height under their shorter limb. This is called the “indirect” method [
23,
26]. Although these clinical methods are easy, safe, and non-invasive means of assessing limb length, they are comparatively less precise and reliable than radiographic techniques. Studies have also found that clinical techniques correlate poorly with LLD measurements on plain radiographs [
27,
28,
29]. They are also prone to significant errors due to potential angular deformities, limb girth differences, difficulties distinguishing bony landmarks, and differences in the relative positions of the limbs to the pelvis [
28,
30]. For these reasons, clinical methods are not recommended for most patients’ initial or serial LLD evaluation. Despite this, clinical techniques are still considered to be helpful when used as screening tools [
31].
The most comprehensive way to evaluate most patients for LLD is full-length anteroposterior (AP) computed radiography of both lower extremities with the patient standing. The pelvis is typically leveled by placing a small lift under the shorter leg. However, full-length radiographs can be inconvenient due to the need for weight-bearing and the potential for magnification and parallax errors [
23]. Before, during, and after surgery, AP pelvic radiographs that include a view of the proximal femur are the most convenient and commonly used images for measuring LLD [
32]. Other methods for determining LLD include CT, 3D ultrasonography, 3D biplanar low-dose X-ray devices (LDX), and magnetic resonance imaging (MRI). CT has the advantage of high inter-observer reliability compared to plain radiography; however, its use is limited due to cost, radiation exposure, and the inability to perform the scan with the patient standing erect [
31,
33].
Pelvic radiographs are often obtained pre-operatively for the general purpose of templating. Preoperative templating allows surgeons to plan and evaluate aspects of upcoming THA procedures. Not only does templating allow better prediction of the size of the prosthesis required, but it also aids in achieving appropriate offset and limb length equality and in minimizing intraoperative complications [
34,
35]. The general technique for assessing LLD on a pelvic radiograph involves measuring the perpendicular distance from a bisecting line that passes through the lower edges of the teardrops or the ischial tuberosities to the tip of the lesser trochanters or the center of the femoral heads. The perpendicular distance is measured on both sides, and the difference is the LLD [
36]. Several studies have demonstrated that, depending on which pelvic and femoral landmarks are used for measurement, the resulting LLD can vary significantly and may be unreliable, with poor intra- and inter-observer agreement [
32,
37]. Pelvic malpositioning due to fixed flexion deformities, external rotation, and adduction/abduction contractures can further reduce the precision and reliability of the LLD obtained by this method. Potential magnification errors introduced by the pelvic radiographs must also be considered [
38]. Although there have been attempts to standardize patient positioning during X-rays to address these factors, the fundamental dependency of current LLD measurement methods on reference points that are based on two separate and independent segments (the pelvis and femur) produces a situation in which adjustment of leg position results in unwanted movement of the pelvis in the sagittal and coronal planes and vice-versa [
38,
39,
40]. Variations in where the bisecting line is drawn for measurement on the pelvis and how surgeons choose given landmarks on the pelvis and femur further reduce the reliability of this measurement method. These variations are significant enough to cause false interpretations of the presence or absence of LLD [
25,
41].
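The conventional measurement described above can be illustrated with a minimal Python sketch. It assumes that landmarks are supplied as (x, y) image coordinates already converted to millimetres; the function names, the coordinate convention, and the right-minus-left sign convention are illustrative assumptions, not details taken from this study.

```python
import math


def perpendicular_distance(point, line_a, line_b):
    """Distance from `point` to the infinite line through `line_a` and `line_b`."""
    (px, py), (ax, ay), (bx, by) = point, line_a, line_b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("pelvic reference points must be distinct")
    # Magnitude of the 2D cross product divided by the line length.
    return abs(dx * (py - ay) - dy * (px - ax)) / length


def conventional_lld(pelvic_right, pelvic_left, femoral_right, femoral_left):
    """LLD as the difference of the two perpendicular distances.

    The pelvic points define the reference line (e.g. both teardrops or both
    ischial tuberosities); the femoral points are the chosen femoral landmark
    on each side (e.g. lesser trochanters). Sign convention (assumed here):
    right distance minus left distance.
    """
    d_right = perpendicular_distance(femoral_right, pelvic_right, pelvic_left)
    d_left = perpendicular_distance(femoral_left, pelvic_right, pelvic_left)
    return d_right - d_left
```

Because four landmarks enter the calculation (two pelvic, two femoral), an error in locating any one of them propagates into the reported LLD, which is the dependency discussed in the paragraph above.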
This study aims to determine the relationship between the most commonly used methods for measuring LLD on pelvic radiographs and to compare how malpositioning during radiography affects these measurements. We also explore the effectiveness of using trigonometric principles to measure LLD. Our proposed technique requires identifying only a single anatomical landmark on each side. Unlike traditional methods, it allows LLD to be determined regardless of the positions of the pelvis and femurs.
4. Discussion
Considering the variability of methods used for measuring LLD and the potentially discordant results this could yield in clinical and research settings [
57], this study sought to investigate the reliability and correlation between conventional methods to highlight the necessity for more uniformity within the orthopedic community. We also endeavored to determine the effects of changes in pelvic and femoral positioning on LLD measurements and propose an alternative method with advantages over current techniques.
In comparing techniques, we found that certain methods were more reliably reproducible than others. Evaluation of inter-observer variability demonstrated that the most reliable methods were those that referenced the inter-ischial line (IIL-LT) or inter-obturator foramina (IOF-LT) to the lesser trochanter. Methods that referenced the femoral heads were substantially reliable but less reliable than those mentioned above. The use of the inter-teardrop line (ITL-LT) was found to be the least reliable method.
Evaluating equivalence between conventional methods revealed that most correlated poorly with one another. Considering 10 mm as the threshold for clinical significance, we found that each method resulted in significantly different reporting rates for a percentage of captured cases. Reporting rates ranged from 3.28% (IOF-LT) to 14.75% (ITL-FH). Linear regression analysis further illustrated the lack of correlation between methods.
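The two comparisons described above, the rate of cases flagged at a clinical threshold and the correlation between methods, can be sketched as follows. This is a minimal illustration and not the statistical procedure used in the study: the function names are hypothetical, the choice to flag cases whose absolute LLD is at or above the threshold is an assumption, and the numbers in the usage lines are made up for demonstration and are not study data.

```python
import math


def reporting_rate(lld_values_mm, threshold_mm=10.0):
    """Percentage of cases whose absolute LLD meets or exceeds the threshold."""
    if not lld_values_mm:
        raise ValueError("at least one measurement is required")
    flagged = sum(1 for v in lld_values_mm if abs(v) >= threshold_mm)
    return 100.0 * flagged / len(lld_values_mm)


def pearson_r(xs, ys):
    """Pearson correlation between paired measurements from two methods."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 cases")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements (mm) of the same four cases by two methods.
method_a = [2.0, -11.0, 4.5, 10.0]
method_b = [1.0, -6.0, 7.0, 12.5]
print(reporting_rate(method_a))  # 50.0 (two of four cases reach 10 mm)
print(pearson_r(method_a, method_b))
```

Two methods can correlate reasonably well and still disagree on which individual cases cross the threshold, which is why both quantities are worth inspecting.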
When considering the effects of misalignment on LLD measurement, we found that certain methods were more sensitive to changes in limb positioning, whereas others were more sensitive to changes in pelvic tilt. Two methods referencing the lesser trochanter, one to the inter-ischial line (IIL-LT) and the other to the inter-obturator foramina (IOF-LT), were superior in minimizing errors related to postural changes. When subjected to alterations in pelvic and femoral alignment, these two methods resulted in similar patterns of variability. They were sensitive enough to detect LLD variations originating from acetabular and femoral components. On the other hand, the method that referenced the inter-teardrop line to the lesser trochanter (ITL-LT) was found to be overly sensitive to changes in pelvic rotation, producing inaccurate LLD values with high variability when subjected to postural changes. Regarding techniques that used the center of the femoral heads as the femoral reference, we found they could not detect LLD variations unless the discrepancies being evaluated were a direct result of alterations in the alignment of the acetabular cup. We hypothesize that this finding may not necessarily reflect an innate inadequacy of the respective measuring techniques but may instead reflect the limitations of the computer model used in our study. Our model imposed pre-defined LLD values (0, 10, 20 mm) on variations in three-dimensional posture through joint movement in the sagittal, coronal, and transverse planes. In doing so, the model may not have had to significantly alter the positions of the acetabular cup prostheses to produce the pre-defined LLD or may not have altered cup positions in a plane discernible on the simulated radiographs. As a result, the acetabular cup prostheses would have remained in a state that too closely resembles the natural anatomical centers of the joints when projected onto 2D radiographs.
Ultimately, the conventional methods that reference the femoral heads to measure LLD were not sensitive enough to detect these 3-D alterations on the simulated 2-D radiographs.
Using trigonometric techniques to measure LLD, we found a high correlation between reported values, regardless of the femoral or pelvic landmark chosen for reference. However, as seen with conventional methods, trigonometric techniques varied in their ability to identify clinically significant LLD depending on the landmark used for measurement. Compared to traditional methods, which require the identification of two landmarks on each side, the trigonometric method requires only a single landmark on each side to be identified. This yields a practical advantage by eliminating potential sources of variation in landmark selection. Despite this, the trigonometric technique does have limitations. The most significant is that the method should ideally be used to measure LLD on standing, not supine, AP pelvic radiographs. This is due to dependence on an adequately aligned horizontal plane. Lateral pelvic tilt, caused by patients shifting while lying supine, could alter the positions of the reference points and lead to inaccurate measurements. Similarly, using this method on radiographs produced by improperly aligned imaging devices would also produce false measurements.
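The dependence on a properly aligned horizontal plane can be made concrete with a small sketch. The formula for the trigonometric technique is not given in this section, so the code below is only one plausible reading, stated as an assumption: LLD is taken as the vertical offset between the same landmark on each side, computed as the inter-landmark distance multiplied by the sine of the angle between the connecting line and the image horizontal. All names and the coordinate convention are hypothetical.

```python
import math


def trigonometric_lld(landmark_right, landmark_left):
    """Assumed form: LLD = d * sin(theta).

    d is the distance between the bilateral landmarks and theta is the angle
    of the line joining them relative to the image horizontal. Coordinates
    are (x, y) in millimetres; this is an illustrative reading, not the
    authors' published formula.
    """
    (xr, yr), (xl, yl) = landmark_right, landmark_left
    d = math.hypot(xl - xr, yl - yr)
    if d == 0:
        raise ValueError("landmarks must be distinct")
    theta = math.atan2(yl - yr, xl - xr)
    return d * math.sin(theta)


def tilt_artifact(landmark_distance_mm, tilt_deg):
    """Spurious LLD reported for truly level landmarks when the image
    horizontal is off by `tilt_deg` degrees."""
    return landmark_distance_mm * math.sin(math.radians(tilt_deg))


# Hypothetical example: landmarks 100 mm apart and truly level.
print(trigonometric_lld((0.0, 0.0), (100.0, 0.0)))  # 0.0
print(tilt_artifact(100.0, 3.0))  # roughly 5.2 mm of apparent LLD
```

Under this reading, only one landmark per side is needed, but any lateral tilt of the patient or the imaging device is converted directly into apparent LLD, consistent with the limitation described above.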
Our findings challenge assertions made by previous studies. For instance, in assessing the variability of LLD measurements on plain radiographs, Kjellberg et al. [
36] concluded that the method that referenced the inter-teardrop line to the lesser trochanter (ITL-LT) had excellent inter-observer reliability [
36]. We found this method to be the least reliable of all conventional techniques. The discordance between the two studies may be partially explained by differences in sample size, statistical analysis techniques, and the fact that Kjellberg et al. [
36] did not compare the abovementioned technique to any other conventional measurement method.
A study by Meermans et al. [
37] compared LLD measured by the same conventional methods on preoperative AP pelvic radiographs with LLD values obtained from corresponding AP full-length radiographs. After analyzing the correlation of the average LLDs obtained from each set of corresponding radiographs, they concluded that methods that referenced the inter-teardrop line were more accurate than those that referenced the inter-ischial line. Regarding femoral landmarks used, they also concluded that methods referencing the femoral heads were more reliably reproducible than those referencing the lesser trochanters [
37]. Our findings dispute both of these conclusions. We found that methods that referenced the inter-ischial line (IIL-LT) or inter-obturator foramina (IOF-LT) to the lesser trochanter were the most accurate. This was due to their superiority in minimizing errors related to pelvic and limb mispositioning and a higher sensitivity in quantifying LLD originating from acetabular and femoral components. Furthermore, although we concede that methods that referenced the femoral head centers (ITL-FH and IIL-FH) had substantial inter-observer reliability, our results demonstrated that methods referencing the inter-ischial line (IIL-LT) or inter-obturator foramina (IOF-LT) to the lesser trochanter were still marginally more reliable. The disparities between the two studies’ results may be partially attributed to the different means by which the ‘true’ LLD values used as a reference for assessing the accuracy and reliability of conventional methods were obtained. While we used a computer model to impose a precise LLD value as determined by computations in 3D space, Meermans et al. [
37] obtained their values using the less accurate method of measuring LLD from hip center to ankle center on AP full-length radiographs. Other factors contributing to these disparities may include differences in statistical methods and the fact that our study was performed using postoperative, not preoperative, radiographs. Additionally, unlike our study, Meermans et al. [
37] did not consider the effects of underlying pelvic tilt on LLD, choosing to exclude from their study patients who exhibited pelvic tilt related to soft-tissue contractures or spinal deformities.
Another study by Heaver et al. [
38] assessed the intra- and inter-observer reliability of various LLD measurement techniques on postoperative AP pelvic radiographs. Using radiographs of a synthetic pelvis and femur, they investigated the effects of pelvic positioning on LLD variability. The methods considered in their study included those referencing four pelvic landmarks (inter-teardrop line, inter-ischial line, inter-obturator foramina, and inferior sacroiliac joint) and two femoral landmarks (medial point of the lesser trochanter and tip of the greater trochanter). They concluded that measurements that referenced the inter-ischial line to the lesser trochanter (IIL-LT) were most reliably reproducible and least distorted by pelvic positioning [
38]. Our results support these findings, as they corroborate that this method (IIL-LT) is highly reliable and less affected by pelvic mispositioning than other conventional methods. However, unlike Heaver et al. [
38], we found that the technique referencing the inter-obturator foramina and the lesser trochanter (IOF-LT) was nearly as reliable and unaffected by changes in pelvic alignment as the IIL-LT method. Heaver et al. [
38] did not assess references to the center of the femoral heads. Therefore, no comparisons to our results can be made in this regard.
Some authors have stated that the use of the inter-teardrop line should be preferred to the inter-ischial line, claiming that the inter-ischial line is generally an inferior landmark [
36,
37,
45,
58,
59]. This conflicts with the findings of our study. Upon reviewing the citations used in these publications, we found a chain of references originating from a single study conducted by Goodman et al. [
60], which found the ilio-ischial line, not the inter-ischial line, to be too poorly defined for reliable detection and measurement of acetabular migration on AP roentgenography [
60]. In light of this, we propose using the inter-ischial line as a well-demarcated pelvic landmark.
This study has several limitations. First, we used only a small number of radiographs (n = 73), although the high inter-observer reliability of the methods we used (refer to
Table 1) supports our approach. Previous studies had similar limitations: the survey by Kuroda et al. was based on 48 cases [
44], the study by Meermans et al. [
37] involved 52 cases, and the study by Edeen et al. [
17] included 68 cases. In all these cases, additional data would improve observer reliability and further demonstrate how errors in identifying landmarks can affect the results.
Second, the scope of this study was confined to robotic-assisted THA cases. Nevertheless, our findings of poor correlation between measurement methods remain applicable to other THA surgical techniques.
One of our co-authors has demonstrated that robotic-assisted THA procedures are highly reliable in facilitating proper acetabular cup placement [
61]. Others have shown that postoperative LLD after robotic-assisted THA is lower and less variable than after manually performed THA [
62]. In light of these advantages and the increasing popularity of robotic-assisted THA in clinical and research use [
63], we consider the confines of this procedure to be an appropriate scope for this study.
Finally, this study focused solely on analyzing LLD measurement variation resulting from patient misalignment during radiograph positioning and did not consider other influential factors such as parallax and magnification errors.