Model-Based Roentgen Stereophotogrammetric Analysis Using Elementary Geometrical Shape Models: Reliability of Migration Measurements for an Anatomically Shaped Femoral Stem Component

Elementary Geometrical Shape (EGS) models present an alternative approach to detect in vivo migration of total hip arthroplasty components using model-based Roentgen Stereophotogrammetric Analysis (mbRSA). However, the applicability of this approach to an irregular-shaped femoral stem and its reliability have not been proven so far. The aim of this study is to assess the effect of multiple raters and of an anatomically shaped femoral stem design on the resulting implant-to-bone migration results. The retrospective analysis included 18 clinical cases of an anatomically shaped stem with 10-year RSA follow-ups. Three raters repeatedly measured all RSA follow-ups to evaluate rater equivalence and intra-rater reliability. The results showed equivalence between different raters for mbRSA using EGS models (mbRSA-EGS), which reduced the investigation of rater reliability to intra-rater reliability. In all in-plane migration measurements, mbRSA-EGS showed good intra-rater reliability and small intra-rater variability (translation: <0.15 mm; rotation: <0.18 deg). However, the reliability was worse in the out-of-plane measurements, especially the cranial-caudal rotation (intra-rater variability: 0.99–1.81 deg). Overall, mbRSA-EGS can be an alternative to surface models when the in-plane migration of the femoral stem (e.g., implant subsidence for loosening prediction) is of greater research interest than other directions.


Introduction
Total Hip Arthroplasty (THA) is considered an effective treatment to restore function after hip joint failure caused by: arthritis [1] (including osteoarthritis, rheumatoid arthritis, traumatic arthritis, and arthritis caused by other reasons, e.g., the microbleeds in joint capsule of congenital afibrinogenemia [...]
This EGS approach enables us to create an individual model for each applied implant size for migration detection within a clinical study. In contrast, the mbRSA approach using CAD/RE models requires one CAD/RE model for each implant size and design variation, as the model is matched with the actual implant shape [18]. Exceptions exist if, for a design variation, a reduced contour selection is possible [19]. For instance, a THA system with seven different sizes and three variations of the caput-collum-diaphyseal (CCD) angle requires 21 RE models in total. If each RE model costs approximately EUR 200-250, the total cost of the RE models amounts to EUR 4200-5250. The EGS model saves this cost, as it only requires the corresponding geometry of the implants. The generality of the EGS model enables its wide application to various types and sizes of implants [20].
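The cost estimate for RE models can be retraced with a few lines of arithmetic. The following is only an illustrative sketch using the figures quoted in the text; the variable names are chosen here for illustration.

```python
# Illustrative cost estimate for reverse-engineered (RE) models,
# using the example figures quoted in the text.
sizes = 7                        # implant sizes in the example THA system
ccd_variants = 3                 # variations of the CCD angle
cost_per_model_eur = (200, 250)  # approximate lower/upper price per RE model

n_models = sizes * ccd_variants
total_low, total_high = (n_models * cost for cost in cost_per_model_eur)

print(f"{n_models} RE models, EUR {total_low}-{total_high} in total")
```

An EGS model avoids this per-variant cost because it only needs the basic implant geometry.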
At present, the EGS model is mostly applied to regular-shaped femoral stems (with a conically shaped distal part) [13,16]. However, irregular shapes have been introduced into a large number of femoral stem designs for different reasons [21]: a rectangular cross-sectional design for strong rotational stability, a conical design with multiple splines for primary fixation, an anatomical design to achieve maximum contact, and a short stem design without a long distal part to preserve bony tissue. The application of the EGS model to those irregular-shaped femoral stems has not yet been proven.
The development of computer-assisted RSA approaches has automated most of the cumbersome manual procedures, thereby reducing unnecessary sources of error [22]. Migration analysis using RSA software still involves some user interactions (Figure 2). One example is implant contour detection, which requires the rater to select the correct contour. However, not all implant contours are clearly displayed on the X-ray image. The clarity of the implant contour can be affected by the image quality, thickness of soft tissue, density of implant material, etc. For instance, the contour of a ceramic femoral head is more blurred compared to metal implants (Figure 2c). In addition to selecting the implant contour, the rater needs to decide which part of the distal stem is the required conical shape when applying the EGS model to irregular-shaped femoral stems. The current RSA approach and the available software package for analysis are not able to standardize this procedure. These user interactions may affect the resulting implant or bone rigid body definition, thereby influencing the reliability of the migration measurement. Information on the reliability of the RSA methodology is lacking, and this kind of analysis is needed, given that user interactions may affect migration results. Therefore, a rater reliability RSA study was designed to investigate whether the EGS model can be reliably applied to one kind of irregularly shaped design: an anatomically shaped stem.
The aim of this study is to answer the following two questions by investigating the rater reliability: (i) which directions of migration measurement have acceptable reliability when applying mbRSA with the EGS model; (ii) for which directions of measurement the reliability may be greatly affected. These questions were addressed in the following steps:

1. Rater equivalence: the equivalence between raters was evaluated first. Raters were considered equivalent when the inter-rater difference was so small that measurements from different raters could be regarded as equivalent to a rater's own repeated measurements.
2. Intra-rater reliability: if the rater equivalence is acceptable, the investigation of rater reliability can be reduced to the evaluation of intra-rater reliability (following the definition of rater equivalence above).
   • Intra-rater reliability of marker-based RSA and mbRSA-EGS.
   • Whether the intra-rater variability of mbRSA-EGS can be accepted compared with marker-based RSA and the upper limits of RSA accuracy.

Materials and Methods
RSA data of primary THA were taken from a previous study (ethical registration number: 1.077) [23] and analysed retrospectively. The available data offer the opportunity to analyse long-term implant migration with both the marker-based RSA and the mbRSA-EGS method.

Image Acquisition and Analysis
RSA examinations were performed with a uniplanar RSA set up. Patients were positioned supine beneath the two X-ray sources, which were focused on the hip joint from above with an intersection angle of approximately 40 deg. A calibration box (RSA BioMedical Innovations AB, Umeå, Sweden) was placed under the X-ray table at a vertical distance of 140 cm from the X-ray sources. Patients underwent the first reference RSA examination within the first postoperative week, and received RSA follow-ups at 3 months, 6 months, 1 year, 2 years, 5 years, and 10 years.

Patient Cohort-Inclusion Criteria
The patient cohort received a cemented THA system consisting of an anatomically shaped femoral stem (Lubinus SP II, Waldemar Link GmbH, Hamburg, Germany) with three visible, additionally attached tantalum markers, combined with a ceramic ball head (BIOLOX® forte, CeramTec GmbH, Plochingen, Germany) with a diameter of 28 mm. All available cases of the previous study were reviewed. Only cases with a cemented polyethylene acetabular cup (LINK® IP Acetabular Cup, Waldemar Link GmbH, Hamburg, Germany), a complete follow-up series of RSA images, and images that could be analysed by both the marker-based RSA and mbRSA-EGS methods were included. Exclusion criteria were: marker occlusion, unacceptable conditions, and a metal acetabular cup component (the femoral head projection can be occluded by a metal cup, which makes analysis with EGS models impossible). Finally, n = 18 cases were included.

Measurement and Analysis Protocol
RSA analyses of femoral stem components were performed with a commercially available software package (MBRSA 4.1, RSA Core, Leiden, The Netherlands). During the analysis, the condition number (≤100) and the rigid body error (≤0.35 mm) were continuously monitored against standard thresholds, following the recommended RSA guidelines for standardized analysis procedures [11,24]. Migration was calculated based on a reference point, which represents the center of gravity of a rigid body. After aligning the reference rigid body (e.g., the bone rigid body) between two follow-up time points, translation was calculated as the difference in the location of the migrating rigid body (e.g., the location of the reference point of the implant), and rotation as the difference in the orientation of the migrating rigid body. The rigid body orientation can be estimated by several shape matching methods when using an RE/CAD model [12] or an EGS model [13]. To verify the quality of the image calibration procedure, standard thresholds for calibration errors (translation ≤0.05 mm, focus error ≤0.5 mm) were used for image analysis [24]. Migration in six degrees of freedom was calculated using rigid body kinematics with respect to a global coordinate system. The calibration box defined translation along the medial-lateral (x) and cranial-caudal (y) axes as in-plane motion, and translation along the anterior-posterior axis (z) as out-of-plane implant-to-bone motion (migration). Rotation around the anterior-posterior axis (Rz) described in-plane motion, and rotation around the medial-lateral (Rx) and cranial-caudal (Ry) axes described out-of-plane implant-to-bone motion.
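The rigid body calculation described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of the MBRSA software: it assumes that each rigid body pose is given as a 3×3 rotation matrix and a reference point, that the bone rigid body is aligned with the global axes at the reference examination, and that rotations are reported as Euler angles in a z-y-x sequence. The function names and example values are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def apply(a, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_z(deg):
    """Rotation about the anterior-posterior (z) axis, i.e., an in-plane rotation."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def relative_pose(bone_R, bone_p, implant_R, implant_p):
    """Implant pose expressed in the bone rigid body frame (aligns the bone)."""
    Rt = transpose(bone_R)
    rel_p = apply(Rt, [implant_p[i] - bone_p[i] for i in range(3)])
    return matmul(Rt, implant_R), rel_p

def migration(bone0, implant0, bone1, implant1):
    """Translation (mm) and rotation (deg) of the implant relative to the bone
    between a reference examination (0) and a follow-up (1)."""
    R0, p0 = relative_pose(*bone0, *implant0)
    R1, p1 = relative_pose(*bone1, *implant1)
    translation = [p1[i] - p0[i] for i in range(3)]
    Rm = matmul(R1, transpose(R0))  # change in implant orientation
    # Assumed Euler convention: Rm = Rz * Ry * Rx
    rx = math.degrees(math.atan2(Rm[2][1], Rm[2][2]))
    ry = math.degrees(-math.asin(max(-1.0, min(1.0, Rm[2][0]))))
    rz = math.degrees(math.atan2(Rm[1][0], Rm[0][0]))
    return translation, [rx, ry, rz]

# Hypothetical example: the bone is static, the stem subsides 0.1 mm along y
# and rotates 1 deg about z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
bone0 = (identity, [0.0, 0.0, 0.0])
implant0 = (identity, [10.0, 20.0, 5.0])
bone1 = (identity, [0.0, 0.0, 0.0])
implant1 = (rot_z(1.0), [10.0, 19.9, 5.0])

translation, rotation = migration(bone0, implant0, bone1, implant1)
print("in-plane:     x, y, Rz =", translation[0], translation[1], rotation[2])
print("out-of-plane: z, Rx, Ry =", translation[2], rotation[0], rotation[1])
```

Expressing the implant pose in the bone frame at both time points removes any change in patient positioning between examinations, so only implant-to-bone motion remains.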
Three independent raters participated in this study, two of whom had two years of experience in RSA projects (rater 1 and rater 3) and one of whom had half a year of experience in RSA analysis (rater 2). Each rater carried out RSA analyses with the marker-based RSA and mbRSA-EGS methods according to the standard analysis protocol in the user manual (MBRSA 4.1, RSA Core, Leiden, The Netherlands). When applying mbRSA-EGS, the raters themselves chose which conical portion of the contour of the distal stem to analyse. After all RSA radiographs had been analysed once by both methods, the raters took a one-week break. This process was repeated until each pair of RSA radiographs had been analysed three times by each rater (Figure 3). For an individual rater, the calibration of an RSA radiographic image pair, once achieved, was kept unchanged for the remainder of the analysis sequence of that image pair for both RSA methods (marker-based RSA and mbRSA-EGS). However, for each repeated analysis (three times with the same images) and for each rater, the calibration and analysis were performed anew. During the analyses, raters were allowed to revise an analysis when any procedure deviated from the defined analysis protocol. However, they were not allowed to revise an analysis based only on a suspicion of migration. Until all analyses were completed, raters had no information on the migration data from previous studies or any other source.

Statistics
The coefficient of individual agreement (CIA) was used to assess rater equivalence [25]. It was adapted from the individual bioequivalence criterion (IBC) of the Food and Drug Administration (FDA) guideline of 2001 [26]. Rater equivalence is given when the inter-rater difference is so small that measurements from different raters can be considered equivalent to a rater's own repeated measurements. The threshold of the CIA was adapted from the IBC bound recommended by the FDA, 2.495, which corresponds to a CIA of 0.445 [25]. Equivalence was considered acceptable if the CIA was greater than 0.445. The 95% confidence interval was estimated with a bootstrapping method [25]. The intraclass correlation coefficient (ICC) was used to assess the intra-rater reliability. An ICC of less than 0.40 was considered "poor" agreement, from 0.40 to 0.59 "acceptable", from 0.60 to 0.74 "good", and from 0.75 to 1.00 "excellent" [27]. The intra-rater variability was calculated as the within-group mean square (WMS) [28]. An F-test was used to determine whether the intra-rater variability of mbRSA-EGS differed significantly from that of marker-based RSA (normality was tested with the Kolmogorov-Smirnov test). A chi-square test for the variance was used to determine whether the intra-rater variability was significantly less than the upper limit of RSA measurement accuracy. Consistent with previous studies [29,30], the upper limit of RSA accuracy (0.5 mm for translation, 1.15 deg for rotation) was used as the threshold. In addition, implants with a translation of more than 0.15 mm within two years were considered to have a higher risk of loosening according to the literature [31]. Therefore, 0.15 mm was used as an additional threshold for the intra-rater variability, considering the ability of mbRSA-EGS to predict loosening of the femoral stem. A significance level of 0.05 was used in all the tests mentioned above.
All statistical analyses were performed by R (R Foundation, Vienna, Austria) [32].
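The reliability measures described above can be illustrated with a short sketch. It is written in Python rather than R and is not the analysis code of this study. Two assumptions are made that the text does not state: the ICC is computed in its one-way random-effects form from the between-group and within-group mean squares, and the CIA threshold is obtained from the IBC bound via CIA = 1 / (1 + IBC / 2), a relation that reproduces the quoted correspondence between 2.495 and 0.445. The example data are hypothetical.

```python
def mean_squares(groups):
    """One-way ANOVA mean squares.
    groups: one list per measured target (e.g., one RSA follow-up),
    each holding the k repeated measurements of a single rater."""
    n, k = len(groups), len(groups[0])
    grand = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    bms = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    wms = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n * (k - 1))
    return bms, wms  # wms is the within-group mean square (WMS)

def icc_oneway(groups):
    """Assumed ICC form: one-way random effects, single measurement."""
    bms, wms = mean_squares(groups)
    k = len(groups[0])
    return (bms - wms) / (bms + (k - 1) * wms)

def classify_icc(icc):
    """ICC categories as listed in the text; values between the stated
    ranges (e.g., 0.595) are assigned to the lower category."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "acceptable"
    if icc < 0.75:
        return "good"
    return "excellent"

def cia_from_ibc(ibc_bound):
    """Assumed relation between the IBC bound and the CIA threshold."""
    return 1.0 / (1.0 + ibc_bound / 2.0)

# Hypothetical repeated measurements (mm) of one rater: 3 targets x 3 repeats.
repeats = [[1.0, 1.1, 0.9], [2.0, 2.1, 1.9], [3.0, 2.9, 3.1]]
bms, wms = mean_squares(repeats)
icc = icc_oneway(repeats)
print(f"WMS = {wms:.3f}, ICC = {icc:.3f} ({classify_icc(icc)})")
print(f"CIA threshold for IBC bound 2.495: {cia_from_ibc(2.495):.3f}")
```

A small WMS relative to the between-target variation yields an ICC close to 1, which is why a measurement direction with little true migration can show a low ICC even when the absolute intra-rater variability is small.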

Results
The migration results of each of the three raters showed similar migration patterns for the investigated femoral stem. In cranial-caudal translation, the femoral stem showed a clear trend of subsidence within the first two years after surgery (mean migration: 0.05 mm/year), and then stabilized from the second year to the 10th year (mean migration: 3.28 × 10⁻⁵ mm/year).
All migration measurements of mbRSA-EGS showed significant rater equivalence (for all six measurements, the lower confidence bound was greater than 0.445). For marker-based RSA, four of the six measurements were found to have rater equivalence, with three of them showing significant equivalence (Table 1). None of the measurements showed significant inequivalence (an upper confidence bound lower than 0.445). Better intra-rater reliability was found in the in-plane measurements: 15 measurements had an ICC within the range of 0.75 to 1.00 ("excellent"), and the other three measurements were within the range of 0.60 to 0.74 ("good") (Table 2). In contrast, among the out-of-plane measurements, the worst ICC values were found in the cranial-caudal rotation measurements with mbRSA-EGS (from 0.11 to 0.30, "poor"). Furthermore, the other out-of-plane measurements showed slightly worse ICC values than the in-plane measurements: five measurements were within the range of 0.75 to 1.00 ("excellent"), three within the range of 0.60 to 0.74 ("good"), three within the range of 0.40 to 0.59 ("acceptable"), and one measurement showed a "poor" result. Both RSA methods showed lower intra-rater variability in the in-plane measurements (translation: 0.07-0.15 mm, rotation: 0.07-0.18 deg). Compared with marker-based RSA, the intra-rater variability of mbRSA-EGS was significantly higher (p < 0.05 for all measurements). However, the intra-rater variability of all in-plane measurements was significantly below the upper limits of RSA accuracy (0.5 mm, 1.15 deg). Five of the six in-plane translation measurements of mbRSA-EGS showed significantly lower intra-rater variability than the threshold of 0.15 mm (Table 3). For the out-of-plane measurements, the largest intra-rater variability was found in the cranial-caudal rotation measurements (0.99-1.81 deg) (Table 3).
This large intra-rater variability of mbRSA-EGS even leads to clear deviations in the mean cranial-caudal rotation results between raters when compared with marker-based RSA (Figure 4). Furthermore, rater 3 was found to have a larger intra-rater variability (1.81 deg) in this rotation measurement than the other two raters (0.99 and 1.27 deg) (Figure 4b).

Discussion
The reliability of the marker-based RSA and mbRSA-EGS approaches and their application to an irregular-shaped femoral stem were assessed for the first time from clinical data. The results of the analyses are encouraging, suggesting that RSA can deliver reliable and valid migration data in a clinical setting.
The results revealed that both marker-based RSA and mbRSA-EGS have acceptable rater equivalence for the migration measurement of the anatomically designed femoral stem (Table 1). Therefore, according to the definition of the CIA, the measurements of different raters can be regarded as repeated measurements of the same rater. Thus, the intra-rater reliability was further explored. For in-plane migration measurements, mbRSA-EGS showed an intra-rater reliability as good as that of the gold standard marker-based RSA, with 66.7% of the measurements having "excellent" reliability and 33.3% of the measurements having "good" reliability (Table 2). The intra-rater variability of the in-plane migration of mbRSA-EGS (<0.15 mm, <0.18 deg) was much lower than the upper limits of RSA accuracy (0.5 mm, 1.15 deg) (Table 3).
Research question (i) can thus be answered: the rater reliability of in-plane migration measurements when applying the EGS model is acceptable. Moreover, the in-plane translation, i.e., subsidence of the stem, has clinical value in predicting future aseptic loosening [29,33]. Systematic reviews and meta-analyses showed that femoral stem subsidence was associated with long-term aseptic loosening [31]. Additionally, the results for the investigated THA design showed that the intra-rater variability of this in-plane translation measurement was significantly less than 0.15 mm (the threshold for implants at risk of loosening), which means that the EGS model also has considerable value for predicting loosening. However, it is known that the accuracy and precision of the mbRSA method in general depend on the prosthesis design. The conclusions therefore cannot be generalized to every THA system [18].
In contrast, the intra-rater reliability of the out-of-plane migration measurements was worse than that of the in-plane measurements, especially in the cranial-caudal rotation measurements ("poor" reliability). The intra-rater variability of the cranial-caudal rotation measurements also exceeded 1.15 deg (Table 3). One of the reasons for this large variance is considered to be a limitation of the working principle of the pose-estimation technique. It has been demonstrated that mbRSA using CAD or RE models performed less accurately than marker-based RSA in the cranial-caudal rotational measurements of a femoral stem implant [14]. The implant projection contour does not change much with a slight rotation around this longitudinal axis, which provides too little information for migration measurement.
Research question (ii) can thus be answered: the rater reliability of the out-of-plane migration measurement (cranial-caudal rotation) was impaired when applying the EGS model. The mismatch between the EGS model and the actual stem shape also played an important role in the poor reliability of this rotation measurement [16]. During the analyses, raters found that the distal part of this anatomically shaped stem was the most difficult part to determine. Raters 1 and 2 tended to choose a longer contour of the distal stem, while rater 3 chose a shorter contour. As the selection of a shorter contour may provide less information about the stem axis, it led to a larger intra-rater variability as well as a worse ICC for rater 3 (Figure 4), especially in the cranial-caudal rotation measurement. It is worth noting that the raters' choice of contour length (rater 1 ≥ rater 2 > rater 3) did not follow their experience in RSA analysis (rater 1 ≈ rater 3 > rater 2); the results therefore indicate that the choice of contour length, rather than rater experience, was associated with the reliability of the out-of-plane migration measurement. It is recommended to clearly define the standard operating protocol for mbRSA-EGS, including the region of interest (length and position), in order to establish a standardized template for the contour cone segment that represents the femoral stem component. Considering that two of the three virtual markers in the EGS model depend on the position of the stem central axis, the results suggest that the irregular shape of the investigated stem has an impact on the reliability of mbRSA-EGS. On the other hand, these results support that choosing a longer stem contour could help to improve the reliability impaired by the mismatch between the EGS model and the actual irregular shape of the investigated stem. Additionally, this situation can be improved when mbRSA-EGS is applied to stem implants whose shape matches the available EGS models (such as cones or cylinders). One study showed better measurement precision of mbRSA-EGS on a hip stem with a strictly conical shape (precision of the cranial-caudal rotation measurement: 0.614 deg) [13]. For other irregular stem designs, a proof-of-concept study is recommended in advance of the clinical application of mbRSA-EGS.
These results showed that user interaction can affect the reliability of some migration measurements, especially the choice of stem contour length when using the EGS stem model. In general, the reliability of mbRSA-EGS is more sensitive to user interaction than that of marker-based RSA, which should be carefully considered before applying it to clinical implant migration measurements. When measuring the out-of-plane migration of an irregular-shaped stem with mbRSA-EGS, it is advisable to validate the measurement accuracy in advance by in vitro experiments or double examination, or to choose other validated methods such as marker-based RSA or RSA using CAD/RE models. In addition, the EGS stem model treats the head and stem as one rigid body, whereas the CAD/RE model considers the stem only. This can violate the definition of a rigid body when head-taper motion exists in the actual clinical situation, and may consequently cause deviations in translation measurements. Additional attention should be paid to possible head-taper motion when applying the EGS stem model [17].
A uniplanar RSA set up was used, which is the common set up for hip implant migration measurement [34][35][36]. The results of this study showed that the reliability of out-of-plane migration measurements was inferior to that of in-plane migration measurements, which can be a common limitation of the uniplanar calibration set up [14]. The bi-planar set up overcomes this limitation but is rarely used for hip implant migration measurement due to its design. It is much more common for knee and ankle implant migration measurements [37,38].
The RSA method is gaining importance in the approval and pre-clinical testing process of new orthopaedic implants. Pre-clinical testing is essential to assess the safety and efficacy of new implant designs, coatings, materials, etc., not only for implants in the orthopaedic area but also within dentistry [39]. However, pre-clinical testing results sometimes do not correlate with clinical results and observations (laboratory environment versus real-life application). Because of the high accuracy and precision of RSA, only small patient cohorts are necessary to investigate the effect of changes in implant design, new bone cements, or additional implant coatings on implant fixation [12,40]. Recently, the importance of RSA as a tool for pre-clinical testing and for the stepwise introduction of new orthopaedic implant designs has been increasingly valued [40][41][42]. Therefore, it is necessary to validate the reliability of RSA methods.
The classical marker-based RSA or mbRSA approach offers the opportunity to measure in vivo implant fixation [43,44] and to investigate, e.g., new THA designs or coatings, degenerative changes [45][46][47], joint kinematics [48] and pathomechanisms [49], and bony fusion [50]. Additionally, it offers the opportunity to investigate the effect of material connections within total joint arthroplasty, such as the head-taper junction [17]. The application of mbRSA eliminates the additional costs and corresponding risks caused by additional markers on the implant. The CAD/RE model can match the actual shape of the implant precisely and has been proven to be as accurate as marker-based RSA [14,20]. However, there are certain difficulties in obtaining CAD models. Implant manufacturers are often reluctant to share their CAD models (which represent sensitive construction data) for commercial reasons. An RE model is easier to obtain than a CAD model. However, applying RE models to a large number of implants with different sizes or variants (a common situation in clinical studies) can result in considerable expense. The EGS model is a potential alternative when a CAD/RE model is not available [51].

Conclusions
The in-plane migration of the investigated anatomically shaped femoral stem can be reliably measured by mbRSA-EGS. Considering the value of femoral stem subsidence (in-plane translation) for loosening prediction, the EGS model can be used as an alternative when a CAD/RE model is not available. In addition, the EGS model is a lower-cost alternative to the CAD/RE model. However, it is worth noting that the mismatch between the EGS model and the actual stem shape may significantly affect the reliability of out-of-plane migration measurement, in particular the cranial-caudal rotation. The CAD/RE model is the better choice when the out-of-plane migration is of greater research significance.
Funding: This research received no external funding.