1. Introduction
Total knee arthroplasty (TKA) is a well-established surgical treatment for end-stage osteoarthritis and other degenerative knee disorders, providing effective pain relief and functional improvement [1]. The long-term success of TKA largely depends on the durability of the knee liner, commonly manufactured from ultra-high-molecular-weight polyethylene (UHMWPE), which functions as the primary bearing surface between the femoral and tibial components [2,3]. Despite advances in implant design and material processing, polyethylene liner degradation remains a major contributor to implant failure and revision surgery [4].
Polyethylene liners are subject to multiple damage mechanisms, including wear, delamination, oxidation, creep deformation, and fatigue cracking. Retrieval studies have identified characteristic surface features such as scratches, pitting, embedded debris, and fatigue striations, reflecting the combined influence of micromotion, third-body particles, and oxidative processes on in vivo degradation [5]. Similar damage patterns observed across different joint implants indicate that polymer wear is governed by common tribological principles [6]. Non-destructive analysis of retrieved liners is therefore essential for understanding in vivo wear behaviour and improving implant performance. However, many existing wear assessment methods rely on subjective visual scoring or global surface inspection, offering limited spatial resolution and poor reproducibility. These approaches often fail to localise damage to specific anatomical regions, potentially overlooking clinically relevant stress concentrations. To address these limitations, the present study introduces a quadrant-based surface characterization framework that systematically maps wear features across anatomically defined zones of retrieved knee liners. When combined with computational image analysis using MATLAB and Python, this approach reduces observer bias and enables a more objective, reproducible quantification of wear patterns.
Recent advances in smart knee implant technologies, including sensor-based monitoring of load and kinematics, further highlight the importance of spatially resolved wear analysis [7]. Integrating localised retrieval-based characterization with such data-driven approaches may improve understanding of polyethylene degradation mechanisms and support future advancements in implant design, surgical alignment, and long-term clinical outcomes [8].
6. Failure of the Knee Liner in Total Knee Arthroplasty
Knee liner failure impacts patient outcomes, implant performance, and healthcare resources. The most important effects are highlighted below.
6.1. Clinical Implications
Liner wear typically first manifests as persistent pain secondary to inflammation and irritation of the joint. Further deterioration of the liner may result in reduced mobility, difficulty with activities of daily living, and a decline in functional scores compared to primary TKA. Loss of joint congruency can result in instability, subluxation, or dislocation, and most cases eventually require complex revision surgery with higher complication rates.
6.2. Mechanical and Structural Effects
Wear particles can induce osteolysis, reducing the strength of the surrounding bone, which could lead to implant loosening. Complete liner failure may result in metal-to-metal contact and accelerate component wear. Malalignment at the time of the initial surgery also leads to increased stress on the liner, accelerating degradation.
6.3. Biological Effects
Polyethylene debris may trigger an immune response that causes progressive bone loss around the implant. If severe wear exposes metal surfaces, the resulting metallosis may cause tissue damage, pain, and further instability.
6.4. Psychological and Emotional Impacts
Anxiety and depression can occur due to chronic pain and reduced independence. Activity limitations and social withdrawal further reduce quality of life.
6.5. Socioeconomic Impacts
Revision procedures are much more expensive than primary TKAs and involve longer recoveries. Patients miss work or are less productive, and as revision cases continue to climb, so do demands on health resources.
6.6. Impact on Surgical Practice and Implant Design
Surgical techniques have been refined through experience with liner failures, with emphasis on accurate alignment and soft tissue balancing. Improvements in polyethylene processing, particularly the introduction of highly cross-linked materials, have reduced wear. Regular follow-up and imaging remain important for the prevention of major complications.
8. Microscopic and Quantitative Study of Wear-Induced Failure in Knee Implant Liners
Five ultra-high-molecular-weight polyethylene (UHMWPE) knee liners, KL-1 through KL-5, were retrieved during total knee arthroplasty revision surgeries, having been removed from patients because of complications or failures. The aim of this research was to assess the degree of surface damage on each liner using visual and microscopic techniques. The damage features examined were scratches, pits, and delamination marks, all of which contribute to implant failure in the long run [33]. This analysis provided an overview of the mechanical and material degradation of the knee liners following implantation. The liner dimensions are listed in Table 4.
The liners varied in the extent of their wear. KL-1 and KL-2 had moderate wear, with visible scratches and small pits in several zones. KL-3 had minimal wear, with slight polishing and barely any visible defects, suggesting that it experienced low mechanical stress or was better aligned. KL-4 and KL-5 were heavily damaged at the surface, with deep scratches, delamination, and heavy pitting [34]. Microscopic imaging of KL-1 and KL-2 confirmed the scratches and abrasions seen in naked-eye observations. These variations called for a more detailed analysis of the failure mechanisms.
The quantitative distribution of surface damage features across the nine predefined anatomical quadrants (Q1–Q9) for each retrieved knee liner is reported in Table 5. The table gives the total counts of scratches, pits, and overall defects, enabling direct comparison of damage extent and spatial localisation among the five samples. This quantitative assessment supports objective evaluation of wear severity and complements the visual and mechanistic analysis presented earlier.
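The quadrant-wise tally described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the study: the 3 × 3 row-by-row layout of Q1–Q9, the image size, and the defect coordinates are all assumptions made for the example, since the text does not specify how the quadrants are laid out on the liner.

```python
from collections import Counter

def quadrant_of(x, y, width, height):
    """Return the quadrant label (Q1..Q9) of a point on a 3 x 3 grid.

    Assumption: quadrants are numbered row by row from the top-left corner.
    """
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("point lies outside the liner image")
    col = min(int(3 * x / width), 2)
    row = min(int(3 * y / height), 2)
    return f"Q{row * 3 + col + 1}"

def count_by_quadrant(defects, width, height):
    """Tally defects per quadrant; `defects` is a list of (x, y, kind)."""
    counts = {f"Q{i}": Counter() for i in range(1, 10)}
    for x, y, kind in defects:
        counts[quadrant_of(x, y, width, height)][kind] += 1
    return counts

# Hypothetical defect centroids (in pixels) on a 900 x 600 image
defects = [
    (50, 40, "scratch"),
    (450, 300, "pit"),
    (460, 310, "scratch"),
    (880, 590, "pit"),
]
counts = count_by_quadrant(defects, width=900, height=600)
for quadrant, tally in counts.items():
    print(quadrant, dict(tally))
```

Keeping a separate counter per quadrant means the scratch, pit, and overall totals for a liner can all be derived from the same structure.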
Figure 10.
Defects observed under digital and optical microscope.
Figure 11.
Defects observed under digital and optical microscope.
A qualitative classification of surface damage features observed across the five retrieved UHMWPE knee liners is presented in Table 6, focusing on the relative severity of scratches and pits. In addition to total counts, the table identifies the presence of deep scratches and clustered pits, which are indicative of advanced wear and fatigue-related damage. This qualitative assessment complements the quantitative defect counts by highlighting differences in damage intensity and complexity among the liners, thereby offering further insight into the progression of wear mechanisms associated with implant design and in vivo loading conditions.
9. Analysis of the Data
The distribution of scratches and pits across the nine anatomical quadrants (Q1–Q9) for all retrieved knee liners is represented in Figure 12 as a box plot, highlighting clear variations in damage localisation and severity among samples. Liners KL-4 and KL-5 demonstrate consistently higher defect counts across multiple quadrants, indicating extensive surface deterioration and non-uniform load distribution. In contrast, KL-3 exhibits minimal defect counts across most quadrants, suggesting favourable alignment or reduced mechanical demand. Figure 13 presents the mean defect values and corresponding statistical analysis obtained using JMP, providing a comparative overview of wear severity across liners. The observed trends confirm that liners with higher defect densities also exhibit greater variability in damage, reinforcing the relationship between surface wear accumulation and increased risk of mechanical failure.
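To standardise comparison of wear between the five samples, the wear severity index (WSI) was calculated and reported in Table 7. The formula used was as follows:

\[ \mathrm{WSI}_i = \frac{D_i}{D_{\max}} \times 100\% \]

where D_i is the total number of defects (scratches + pits) counted on liner i and D_max is the highest total defect count among the five liners.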
This standardised measure allowed each liner to be ranked according to its relative degree of surface damage (Figure 14). KL-5 recorded the highest WSI at 100%, followed closely by KL-4 with the second-highest WSI; these two liners therefore experienced the most wear. KL-3 had a much lower WSI and was the least worn sample. The WSI rankings were cross-checked against the quadrant-wise defect distributions plotted in JMP Pro 18.
The WSI is intended as a relative comparative index for ranking damage severity across retrieved liners, rather than as an absolute clinical threshold, particularly given the limited sample size and absence of outcome-based calibration.
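As an illustration of how such a relative index can be computed, the sketch below normalises each liner's total defect count by the largest total in the cohort. This normalisation is an assumption inferred from the values quoted in the Conclusions (KL-5 at 100%, KL-3 at about 15%); the defect totals are those stated there, while the function name is hypothetical.

```python
def wear_severity_index(total_defects):
    """Return a WSI (%) per liner, normalised by the largest defect total.

    Assumption: WSI = 100 * (defects on liner) / (defects on worst liner).
    """
    worst = max(total_defects.values())
    if worst == 0:
        raise ValueError("no defects recorded on any liner")
    return {liner: 100.0 * n / worst for liner, n in total_defects.items()}

# Total scratches + pits per liner, as stated in the Conclusions
totals = {"KL-1": 18, "KL-2": 15, "KL-3": 5, "KL-4": 30, "KL-5": 33}

wsi = wear_severity_index(totals)
ranking = sorted(wsi, key=wsi.get, reverse=True)  # most worn first
for liner in ranking:
    print(f"{liner}: {wsi[liner]:.1f}%")
```

Because the index is scaled to the worst liner in the set, adding or removing a sample changes every value, which is one reason it is suited to ranking within a cohort rather than to absolute thresholds.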
The scratch and pit densities were calculated by first counting the total number of scratches and pits in each knee liner quadrant (Q1 to Q9); the counts obtained are reported in Table 8. The individual numbers of scratches and pits are presented in graphical form in Figure 15. The total scratches and total pits for each sample are also included in Table 8.
To determine the scratch density, the total number of scratches for each sample was divided by the total number of defects (scratches + pits). Similarly, the pit density was determined by dividing the total number of pits by the total number of defects per sample. These values give the proportion of surface damage attributable to scratches and to pits, respectively, which is useful for understanding the wear pattern and failure mechanism of the knee liners.
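The density calculation described above amounts to two divisions per liner. The sketch below shows it for a single sample; the per-quadrant counts are hypothetical values chosen for illustration and are not taken from Table 8.

```python
def damage_fractions(scratches_by_quadrant, pits_by_quadrant):
    """Return (scratch density, pit density) for one liner.

    Each density is the share of all defects (scratches + pits)
    contributed by that defect type, so the two values sum to 1.
    """
    total_scratches = sum(scratches_by_quadrant)
    total_pits = sum(pits_by_quadrant)
    total_defects = total_scratches + total_pits
    if total_defects == 0:
        return 0.0, 0.0  # an undamaged liner has no meaningful ratio
    return total_scratches / total_defects, total_pits / total_defects

# Hypothetical counts for quadrants Q1..Q9 of one liner
scratches = [2, 1, 0, 3, 4, 1, 0, 1, 0]
pits = [0, 1, 0, 1, 2, 0, 0, 0, 0]

scratch_density, pit_density = damage_fractions(scratches, pits)
print(f"scratch density = {scratch_density:.2f}, pit density = {pit_density:.2f}")
```

Since the two values are complementary, reporting either one is sufficient to characterise the balance between scratching and pitting on a liner.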
This quantitative and microscopic failure analysis yielded important data on the wear of UHMWPE knee liners in TKA; the details for each liner are given in Table 9. The most worn liners, KL-4 and KL-5, showed focal damage in highly stressed areas, possibly indicating that cyclic loading, inadequate fixation, or implant misalignment contributed to their damage. Conversely, KL-3, which had the least damage, may have benefited from superior material performance or surgical alignment. These findings highlight the importance of material longevity, ideal implant design, and surgical precision. Collectively, quadrant analysis, digital microscopy, and severity indexing provide an integrated platform for assessing and subsequently improving orthopaedic implant performance.
While the research provides important information regarding surface wear and failure modes of UHMWPE knee liners in TKA, certain limitations need to be mentioned. First, the sample was limited to five knee liners (KL-1 to KL-5), which may not represent the entire spectrum of failure modes or implant types. In addition, patient-related variables such as age, activity level, implant age, body mass index (BMI), and surgical alignment information were unavailable, limiting the correlation of wear patterns with clinical outcomes [35]. Only visual and microscopic surface analyses were performed; no mechanical tests such as hardness, tensile strength, or fatigue resistance were carried out. Furthermore, imaging was limited to 20× magnification and may have missed nano-scale wear features or early-stage oxidation effects [36]. Although quadrant-based damage mapping was used, the identification of scratches and pits remains subject to observer interpretation and variability. In future, multi-modal (mechanical and chemical) analysis, studies of larger cohorts, and patient metadata would enable a more comprehensive understanding of implant performance and failure risk.
10. Comparison with Tool Analysis
The analysis of the data integrates manual and automated approaches to evaluate wear patterns, damage distribution, and severity across the retrieved UHMWPE knee liners. Table 10 provides a direct comparison of defect counts and wear severity indices obtained from manual assessment and tool-based analyses, establishing the level of agreement between methods. Figure 16 illustrates the overall distribution of all detected defects across the liners, while Figure 17 presents a quadrant-wise representation of scratch counts highlighting spatial variations in surface damage.
The comparison between manual and tool-based scratch counts is illustrated in Figure 18, demonstrating close agreement between methods with minor variations across the retrieved knee liners.
A combined graphical representation of wear metrics is shown in Figure 19, in which the teal line represents the manual scratch count, the orange line indicates the tool-based scratch count, the purple line corresponds to the manual wear severity index (WSI), and the green line represents the tool-based WSI. In addition, the red line denotes manual severity classification, while the blue line represents tool-based severity classification, enabling direct comparison of quantitative counts and qualitative severity trends across liners.
A comparative summary of scratch counts obtained through manual assessment and automated analysis using Python and MATLAB is provided in Table 11, allowing direct evaluation of agreement and variability between methods across the retrieved knee liners. The results highlight differences in detection sensitivity and consistency among the analytical approaches, supporting the reliability of computational methods as effective alternatives to manual counting for surface wear assessment.
Manual Counting:
Manual counting requires visual observation of the knee liners and manual tallying of the scratches and pits observed in the images. It depends heavily on the observer's attention and consistency; hence, this technique is prone to variability and human error. The approach is basic, inexpensive, and reasonable for small data sets; however, it is extremely time-consuming and therefore impractical for large numbers of images. Its major advantage is that it allows direct, simple visual observation, but the subjective nature of the method may compromise the precision and reproducibility of the results.
MATLAB Analysis:
MATLAB offers an excellent environment both for image processing and data analysis, allowing one to perform automated detection for both scratches and pits by using specific toolboxes. It uses a variety of algorithms, including thresholding, edge detection, and feature extraction, to observe wear patterns in knee liners. MATLAB efficiently processes large volumes of data with uniform and reproducible results. While it requires expertise in programming to set up and fine-tune analysis, it is highly regarded for its excellent performance and precision, especially when handling challenging image-processing applications.
Python Analysis:
Python 3.13 offers considerable flexibility in image processing and data analysis, using libraries such as OpenCV for feature identification. Its main advantages are that it is open source and integrates easily with other data analysis tasks. A number of tools are available for automating wear pattern detection in knee liners, and these can be used and modified for specific needs. Python is highly scalable and able to handle large amounts of data efficiently, but it does require knowledge of several libraries and can execute code more slowly than MATLAB unless optimised.
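To make the thresholding and feature extraction steps concrete, the following pure-Python sketch labels dark blobs in a tiny synthetic greyscale image and classifies each as a scratch or a pit. It is a simplified stand-in and not the pipeline used in this study: the threshold of 100, the elongation cut-off of 3, and the bounding-box classification rule are all assumptions. In practice, OpenCV functions such as `cv2.threshold` and `cv2.connectedComponentsWithStats` would perform the equivalent steps on real micrographs.

```python
from collections import deque

def label_defects(image, threshold=100, elongation=3.0):
    """Threshold a greyscale image and classify each dark blob.

    Returns a list of (kind, pixel_count) tuples, where kind is "scratch"
    for elongated blobs and "pit" for compact ones (bounding-box ratio).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or image[r0][c0] >= threshold:
                continue
            # Breadth-first flood fill over 8-connected dark pixels
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and not seen[nr][nc]
                                and image[nr][nc] < threshold):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            height = max(p[0] for p in pixels) - min(p[0] for p in pixels) + 1
            width = max(p[1] for p in pixels) - min(p[1] for p in pixels) + 1
            ratio = max(height, width) / min(height, width)
            kind = "scratch" if ratio >= elongation else "pit"
            found.append((kind, len(pixels)))
    return found

# Synthetic 6 x 8 image: 200 = intact surface, 50 = dark defect pixel
B, D = 200, 50
image = [
    [B, B, B, B, B, B, B, B],
    [B, D, D, D, D, D, B, B],  # horizontal scratch, 1 x 5 pixels
    [B, B, B, B, B, B, B, B],
    [B, B, B, B, B, D, D, B],  # compact pit, 2 x 2 pixels
    [B, B, B, B, B, D, D, B],
    [B, B, B, B, B, B, B, B],
]
defects = label_defects(image)
print(defects)
```

The bounding-box ratio is a deliberately crude shape descriptor: a diagonal scratch has a nearly square bounding box and would be labelled a pit, so a real pipeline would need an orientation-independent measure such as a fitted ellipse or minimum-area rectangle.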
Comparison:
Both MATLAB and Python offer automated solutions for wear analysis, which makes them much faster, more consistent, and more scalable than manual counting. MATLAB is particularly useful where image processing must be performed with great precision, and it is often preferred in academic and industrial settings because of its specialised toolboxes. Python, on the other hand, is an open-source option that can easily be adapted for a range of analyses. The choice between them therefore depends on the preference, budget, and specific needs of the user. While MATLAB may be an ideal solution for those who have access to its resources, Python is a cost-effective alternative that offers versatility in analysis pipelines.
The comparison of wear assessment methods is further illustrated through graphical and statistical evaluation. Figure 20 and Figure 21 visualise differences and similarities in scratch counts obtained from manual assessment and automated analyses, highlighting overall trends and method-dependent variability across the retrieved knee liners. These visual comparisons are supported by the statistical analysis presented in Table 12, which quantifies the level of agreement between manual, Python-, and MATLAB-based methods using correlation and paired statistical testing. Together, these results demonstrate that automated tools show strong concordance with manual evaluation, particularly for MATLAB, reinforcing the reliability of computational approaches for wear quantification.
The analysis involves comparing manual wear data with data from Python and MATLAB, using the Pearson correlation and paired t-test. Below is an explanation of the values and what they indicate.
1. Pearson Correlation:
The Pearson correlation measures the strength and direction of the linear relationship between two datasets. The values range from −1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 indicating no correlation.
Manual vs. Python: the correlation of 0.68 suggests a moderate positive correlation between the manual counts and the Python-generated counts. While there is a general relationship between the two methods, some differences remain, which might be due to subjectivity in manual counting or to algorithmic differences in the Python analysis.
Manual vs. MATLAB: the higher Pearson correlation of 0.88 between the manual counts and MATLAB-generated counts indicates a strong positive correlation. This suggests that MATLAB's automated counting is closely aligned with the manual counting results, possibly due to the more defined image-processing techniques in MATLAB compared to Python.
2. Paired t-test (t-statistic):
The paired t-test compares the means of two related groups to determine whether there is a statistically significant difference between them. The t-statistic is the mean of the paired differences divided by the standard error of those differences.
Manual vs. Python: a t-statistic of −0.92 indicates that the mean difference between the manual and Python counts is small relative to its standard error. Since the t-statistic is close to 0, the wear counts from both methods are, on average, quite similar, but not identical.
Manual vs. MATLAB: a t-statistic of 0 means that the mean paired difference between the manual and MATLAB counts is zero. Counts for individual liners may still differ, provided the differences cancel out, but on average MATLAB's automated analysis closely mirrors the manual observations.
3. Paired t-test (p-value):
The p-value helps to determine whether the observed differences are statistically significant. A p-value less than 0.05 typically indicates a statistically significant difference.
Manual vs. Python: a p-value of 0.41 means that there is no statistically significant difference between the manual and Python counts (Figure 22). Because this value is greater than 0.05, the observed difference could plausibly have arisen by chance.
Manual vs. MATLAB: a p-value of 1 indicates that the mean difference between the manual and MATLAB-generated counts is zero (Figure 22). This is consistent with the wear counts produced by MATLAB being very similar to those produced manually, although with only five liners the test has limited power to detect small differences.
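The two statistics discussed above can be reproduced with the Python standard library, as in the sketch below. The counts are hypothetical and chosen so that the paired differences cancel; they are not the values in Table 11 or Table 12. The example shows that a t-statistic of 0 reflects a zero mean difference and does not require the two sets of counts to be identical.

```python
import math
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def paired_t(x, y):
    """Paired t-statistic: mean difference divided by its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error

# Hypothetical scratch counts for five liners (not the study's data)
manual = [12, 10, 4, 20, 22]
tool = [13, 9, 4, 21, 21]  # differs liner by liner, but differences sum to 0

r = pearson_r(manual, tool)
t = paired_t(manual, tool)
print(f"Pearson r = {r:.3f}, paired t = {t:.2f} (df = {len(manual) - 1})")
```

Obtaining the p-value requires the Student's t distribution with n − 1 degrees of freedom, which the standard library does not provide; `scipy.stats.ttest_rel` returns both the t-statistic and the p-value if SciPy is available.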
11. Discussion
The wear behaviour of the five retrieved knee liners (KL-1 through KL-5) was evaluated by using both raw and filtered datasets, providing insight into how polyethylene liners respond to varying mechanical stresses and loading conditions over time. Differences in wear magnitude and distribution across the liners reflect the combined influence of the alignment, loading history, and implant usage.
KL-1 demonstrated a moderate variability in wear without a pronounced peak, suggesting relatively steady loading conditions with gradual stress accumulation [37]. This wear pattern is consistent with a reasonably well-functioning implant under typical daily activities; however, the progressive nature of the wear indicates that continued exposure to mechanical stress may eventually compromise the liner's integrity. Clinically, such gradual degradation may manifest as slowly increasing pain or reduced joint comfort over time.
KL-2 exhibited a more pronounced wear pattern when compared to KL-1, indicating exposure to higher stresses or increased loading [38]. This may be related to elevated activity levels or suboptimal alignment, both of which are known to accelerate polyethylene wear. Clinically, increased wear of this nature has been associated with a higher risk of instability, inflammation due to debris generation, and earlier functional decline if not addressed.
KL-3 showed minimal wear with a smooth and stable damage profile, suggesting either optimal implant alignment or reduced mechanical demand. This liner's wear characteristics indicate favourable load distribution and material performance, which may translate clinically to improved joint stability, reduced pain, and enhanced long-term durability. Such patterns are consistent with a lower revision risk and better functional outcomes.
In contrast, KL-4 demonstrated the highest wear variability and severity, which is indicative of high-stress loading, possible malalignment, or excessive mechanical demand [39]. Severe localised wear in such cases is clinically relevant, as it may contribute to joint instability, increased polyethylene debris generation, and accelerated osteolysis, ultimately increasing the likelihood of early implant failure and revision surgery.
KL-5 displayed moderate wear with less fluctuation than KL-4, suggesting more uniformly distributed stresses overall, though localised damage was still evident in high-stress regions [40]. Clinically, this pattern may correspond to acceptable short-term function but increased long-term risk for focal damage-related complications if mechanical conditions persist.
Overall, the findings demonstrate that knee liner wear is strongly influenced by mechanical factors such as stress magnitude, load distribution, and alignment, in addition to the intrinsic properties of the polyethylene material [41]. Localised wear patterns identified through quadrant-based analysis highlight regions that may be more susceptible to damage and clinically relevant failure mechanisms. These observations emphasise the importance of precise surgical alignment, appropriate implant selection, and balanced load transfer to ensure long-term implant durability and favourable patient outcomes.
11.1. Study Limitations
This study has several limitations. The small sample size (n = 5) limits statistical power and restricts the generalizability of the findings to broader patient populations and implant designs. Additionally, mechanical testing and chemical characterization of the liners were not performed; therefore, the conclusions are based on surface-level damage assessment, rather than direct measurements of material properties such as hardness, oxidation, or fatigue resistance. The absence of detailed patient-specific data, including activity level, body mass index, implantation duration, and clinical outcome scores, further limits the direct correlation between wear patterns and patient outcomes.
11.2. Future Directions
Future studies should apply the proposed quadrant-based wear assessment framework to larger, multi-centre retrieval cohorts to improve statistical robustness and enable standardised comparisons across implant designs and clinical settings. Integration of this spatial wear-mapping approach with emerging sensor-embedded smart knee implants offers significant potential. Combining in vivo measurements of loading, alignment, and joint kinematics with post-retrieval wear localization could enable direct correlation between mechanical exposure and damage progression, supporting improved implant design, patient-specific risk assessment, and long-term performance monitoring.
12. Conclusions
The study introduced a nine-quadrant (Q1–Q9) mapping system across the medial and lateral regions of the retrieved knee liners to evaluate localised surface damage. Each quadrant was analysed for scratches, pits, burnishing marks, cracks, and delamination, and the total counts were used to determine the damage density for each liner. The results showed clear differences among the five retrieved liners. KL-3 had the lowest number of surface defects, with only five total scratches and pits, representing a wear severity index of about 15 percent. KL-1 and KL-2 showed moderate damage, with 18 and 15 total defects, respectively, corresponding to mid-range severity between 45 and 55 percent. KL-4 and KL-5 recorded the highest levels of surface damage, with 30 and 33 total defects, giving them wear severity indices close to 90 and 100 percent. These data indicate that scratches were the most frequent form of damage, followed by pits and burnishing marks, which became more pronounced in the highly stressed zones. The overall pattern revealed that as the number of scratches and pits increased, the damage density also increased, suggesting an association between surface deterioration and risk of mechanical failure. The findings indicate that differences in alignment, loading, and material response influence the extent of surface wear. KL-1 and KL-2 exhibited moderate wear, while KL-3 was the least affected and was likely well aligned. KL-4 and KL-5 experienced the greatest deterioration, suggesting higher stress concentration and possible malalignment. As the damage density increased, the likelihood of cracking, delamination, and structural failure also rose. These results emphasise the importance of precise implant alignment, proper load distribution, and careful material selection to improve the durability of knee liners.
Ongoing patient monitoring and refinement in surgical technique remain essential for reducing wear and extending the lifespan of total knee arthroplasty components.