How Can Artificial Intelligence Identify Knee Osteoarthritis from Radiographic Images with Satisfactory Accuracy?: A Literature Review for 2018–2024

Touahema, Said; Zaimi, Imane; Zrira, Nabila; Ngote, Mohamed Nabil

doi:10.3390/app14146333

by Said Touahema ¹,²,*, Imane Zaimi ³, Nabila Zrira ¹ and Mohamed Nabil Ngote ¹,⁴

¹ ADOS Team (Equipe Aide à la Décision et Optimisation des Systèmes), LISTD Laboratory, Ecole Nationale Supérieure des Mines de Rabat, Rabat 10000, Morocco

² Ministry of Health and Social Protection, Provincial Ministerial Administration of El Kelaa des Sraghna, El Kelaa des Sraghna 43000, Morocco

³ Multidisciplinary Research Laboratory for Science, Technology and Society, Department of Computer Engineering and Mathematics, Higher School of Technology, Khenifra, Sultan Moulay Slimane University, Beni Mellal 23000, Morocco

⁴ Institut Supérieur d’Ingénierie et Technologies de Santé, Faculté de Médecine Abulcasis des Sciences de la Santé, Rabat 10000, Morocco

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(14), 6333; https://doi.org/10.3390/app14146333

Submission received: 11 June 2024 / Revised: 11 July 2024 / Accepted: 17 July 2024 / Published: 20 July 2024

(This article belongs to the Special Issue Advances in Machine Learning for Healthcare Applications)

Abstract

Knee osteoarthritis is a chronic, progressive disease that can advance rapidly to severe stages. Reliable and accurate diagnosis, combined with preventive lifestyle modifications implemented before irreversible damage occurs, can effectively protect patients from becoming inactive. Artificial intelligence continues to play a pivotal role in computer-aided diagnosis with increasingly convincing accuracy, particularly in identifying the severity of knee osteoarthritis according to the Kellgren–Lawrence (KL) grading scale. The primary objective of this literature review is twofold. Firstly, it aims to provide a systematic analysis of the current literature on the main artificial intelligence models used recently to predict the severity of knee osteoarthritis from radiographic images. Secondly, it constitutes a critical review of the different methodologies employed and the key elements that have improved diagnostic performance. Ultimately, this study demonstrates that the considerable success of artificial intelligence systems will reinforce healthcare professionals’ confidence in the reliability of machine learning algorithms, facilitating more effective and faster treatment for patients afflicted with knee osteoarthritis. To achieve these objectives, a qualitative and quantitative analysis was conducted on 60 original research articles published between 1 January 2018 and 15 May 2024.

Keywords:

artificial intelligence (AI); knee osteoarthritis (KOA); machine learning (ML); deep learning (DL); Kellgren–Lawrence (KL); joint space narrowing (JSN)

1. Introduction

Osteoarthritis is a chronic degenerative joint disease characterized by progressive erosion of articular cartilage. It mainly affects women, people with obesity, and the elderly; globally, 18% of women and 9.6% of men over the age of 60 suffer from osteoarthritis [1]. According to the World Health Organization (WHO), osteoarthritis affects over 50% of the population over the age of 65 [2] and more than 250 million patients worldwide (4%) [3]. From an economic point of view, osteoarthritis carries a huge burden in terms of direct costs (hospitalization, diagnosis, and therapy) and indirect costs (loss of working days and productivity). It costs 186 billion US dollars per year in the United States [4]. Looking ahead, the National Center for Chronic Disease estimates that 78.4 million American adults will be affected by 2040 [5].

Knee osteoarthritis (KOA) is marked by joint space narrowing, the formation of osteophytes, sclerosis, and bone deformation that can be observed on X-ray images. Osteoarthritis typically affects weight-bearing joints, including the knees, hips, and feet, and knee osteoarthritis is one of its most prevalent forms: approximately 83% of all patients with osteoarthritis suffer from knee osteoarthritis. It is defined as a loss of articular cartilage, the protective tissue of the knee joint, which prevents the bones from rubbing against each other and facilitates movement. KOA is generally diagnosed using conventional radiography, Computed Tomography (CT) scans, and Magnetic Resonance Imaging (MRI). Although MRI is a popular method, standard radiography is still considered the gold standard for the assessment of knee osteoarthritis and is widely used by doctors for initial examinations. This study presents a literature review of the use of AI in knee osteoarthritis diagnosis. The objective is to identify common design principles, core technologies, and major challenges to be overcome to achieve reliable and accurate models. Furthermore, the results of this assessment will be used to propose a strategic roadmap for researchers, students, and healthcare professionals. This roadmap will guide the transition to the most accurate ML and DL models by making available the techniques used and the databases employed. To achieve this objective, the following research questions were formulated:

RQ1 What are the principal models currently employed to design diagnostic models for knee osteoarthritis based on radiographic images?

RQ2 What are the potential factors for enhancing the accuracy of traditional machine learning (ML) and deep learning (DL) models developed in recently published research articles?

RQ3 What are the principal challenges and prospective avenues for further research?

The remainder of this document is organized as follows: Section 2 outlines the methodology employed to conduct the literature review and evaluate the study results, to identify the design principles employed in the selected articles. Section 3 presents the results of the literature review in both qualitative and quantitative terms. It also outlines the main traditional machine learning (ML) and deep learning (DL) models employed to assess the severity of knee osteoarthritis. This section also discusses the databases utilized and the methodologies employed for ROI detection, as well as the techniques applied to enhance diagnostic accuracy. Additionally, it offers a comparative analysis of the major approaches employed. Section 4 presents a critical discussion of the main findings, identifying challenges and potential avenues for future research, as well as the limitations of the present study. Finally, in line with the results of the literature review and discussion, Section 5 presents the answers to the questions posed and the main conclusions drawn.

2. Methods

This literature review was conducted according to the guidelines outlined in the flowchart in Figure 1, which describes the various stages of the literature review. A search string was constructed based on the fundamental concepts associated with AI, to diagnose knee osteoarthritis based on radiographic images. To achieve this, we ensured that at least one element from each of the categories listed in Table 1 was present, employing a combination of OR and AND operators. The search string was then adapted to each of the three platforms included in this study, namely Google Scholar, MDPI, and ScienceDirect. The search was conducted between 5 January 2023 and 15 May 2024, encompassing academic research reviewed and then published between 2018 and May 2024. The search terms were included in the abstract, title, or keywords of the articles. The articles were published in peer-reviewed journals or conferences and written in the English language. The documents obtained were then collated, and duplicates were removed. Following the identification phase, an initial selection was made according to the criteria listed in Table 2. The selection of SC3, SC4, and SC5 was made by reading the title, abstract, and keywords to exclude publications that did not focus on the technical aspects and discrepant methodologies of AI-based approaches. Finally, all other articles were subjected to a more detailed full-text analysis based on the eligibility criteria described in Table 3. Following the flowchart guidelines and the steps described in this section, a total of 870 articles were found on the selected platforms corresponding to the search string in Table 2, of which 60 were selected for further evaluation based on the selection criteria listed in Table 3. 
To ensure the quality and reliability of the publications included in the analysis, as well as to better delineate recent research focused on the application of AI solutions in the diagnosis of knee osteoarthritis, only accessible journal articles were considered for this analysis.
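
The way such a search string combines OR and AND operators can be sketched as follows. This is only an illustration: the term lists below are hypothetical placeholders (the actual categories are those of Table 1), and `build_query` is a name introduced here, not a tool used in the review.

```python
def build_query(categories):
    """Join terms with OR inside each category and join categories with AND."""
    groups = []
    for terms in categories:
        quoted = ['"' + term + '"' for term in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical term lists, for illustration only.
example_categories = [
    ["artificial intelligence", "deep learning", "machine learning"],
    ["knee osteoarthritis"],
    ["radiograph", "X-ray"],
]

query = build_query(example_categories)
print(query)
```

Requiring one term from every category is what the AND between the parenthesized groups expresses; each platform's own query syntax may then need small adaptations.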

For each of the eligible articles, basic publication information was collected, including (1) issue title, (2) authors, (3) abbreviations and keywords, (4) year, and (5) journal title. Based on these elements, the following sections attempt to answer the research questions listed in Section 1:

For RQ1 “What are the principal models currently employed to design diagnostic models for knee osteoarthritis based on radiographic images?”, data from eligible publications include (1) research topic, (2) materials and methods used by the authors, and (3) metric evaluation.
For RQ2 “What are the potential factors for enhancing the accuracy of traditional machine learning (ML) and deep learning (DL) models developed in recently published research articles?”: (1) Textual descriptions (terms in the full text of the publication) highlighting the design choices made by the authors.
For RQ3, “What are the principal challenges and prospective avenues for further research?”: (1) the novelty of the research, (2) the experiments that yielded the best and worst results for each approach, and (3) limitations and discussion.

3. Results

Computer-aided diagnosis has shown enormous potential in a wide range of classification and segmentation applications. ML and DL models, in particular, generally require a large amount of labeled data to achieve correct generalization and avoid over-fitting. However, data availability remains a major challenge to train and validate the developed models with sufficient and appropriate data.

3.1. Data Availability

Automatic diagnosis systems based on artificial intelligence, particularly methods based on supervised learning (SL), are often confronted with the challenge of the limited availability of labeled databases of sufficient size. In studies involving the diagnosis of knee osteoarthritis based on radiographic images, the Osteoarthritis Initiative (OAI) and the Multicenter Osteoarthritis Study (MOST) are the two most frequently cited datasets. Both the OAI and MOST studies were approved by the Institutional Review Board (IRB) of the coordinating center, the University of California, San Francisco, as well as the IRBs of the collaborating and clinical centers. The OAI is a longitudinal observational study conducted by the US National Institutes of Health (NIH) on both men and women over a ten-year period. The dataset comprises knee images of 4796 participants aged 45–69 years, examined by X-ray, MRI, and other means during nine follow-up exams over a period of 96 months. The dataset comprises 4446 X-ray images of knees acquired at a fixed angle of 10° and labeled according to the KL scale by Boston University. The objective of the Osteoarthritis Initiative (OAI) is to facilitate the conduct of new research on osteoarthritis, with the aim of developing novel treatments and diagnostic tools. The MOST dataset comprises lateral and posterior–anterior (PA) views of knee X-ray images of men and women aged 50 to 79. It contains 2920 radiographic images acquired over 84 months and labeled according to KL grades, which are not part of the OAI. The posterior–anterior (PA) view of knee images was acquired at 5°, 10°, and 15°. OAI is funded by the National Institutes of Health (NIH) in a public–private partnership with Merck Research Laboratories, Novartis Pharmaceuticals Corporation, GlaxoSmithKline, and Pfizer, Inc. Similarly, MOST was financed by the NIH in cooperation with Felson, Torner, Lewis, and Nevitt. 
Other studies have used pre-prepared datasets accessible on the Kaggle platform (12–14), while numerous others have employed local databases. Table 4 provides a summary of the various datasets utilized in the literature, along with the means of accessing them.

3.2. ROI Detection/Segmentation

Image segmentation is an essential task in image processing, allowing similar pixels to be identified and organized into homogeneous regions. Therefore, the selection of appropriate segmentation techniques and parameters is a challenge to achieving high-quality results. Wahyuningrum et al. [4] used the Active Shape Model (ASM) for ROI identification. In this approach, 43 anatomical points used to describe the shape of the area of interest were manually placed in each training image to specify the femoral and tibial forms. Then, the shape model was generated using principal component analysis (PCA).

In 2019, Brahim et al. [12] employed a semi-automated segmentation method to extract the ROIs of tibial trabecular bone from X-ray images. Initially, four points on the tibial edge were manually marked. Subsequently, three ROIs of a uniform size of 128 × 128 pixels were extracted by connecting the anatomical markers. A vertical adjustment was then applied automatically, with the upper limits of the medial and lateral compartments positioned below the smallest point of the tibial edge. Finally, the horizontal position of the ROI was defined as the center between the ends of the knee.

Nguyen et al. [13] applied the BoneFinder software to segment the edges of the femur and tibia in each bilateral knee radiograph. Both the contrast and histogram intensity were then normalized, and the DICOM image was converted to eight bits. The resulting image was centered on a 110 × 110 mm region and resized to 300 × 300 pixels, giving a pixel spacing of 0.37 mm. BoneFinder was also used by Tiulpin and Saarakkala [14] for ROI extraction from the right and left knees. In this method, each knee image was rotated to align the tibial plateau horizontally. Histogram clipping and global contrast normalization were then applied to the resulting images. Finally, the image was resized to 310 × 310 pixels (0.45 mm resolution) using bilinear interpolation.
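
The arithmetic behind the pixel spacing, and the general idea of histogram clipping followed by conversion to eight bits, can be sketched as below. This is a minimal illustration, not the preprocessing code of the cited studies: the percentile bounds and the function names are assumptions introduced here.

```python
def pixel_spacing_mm(field_of_view_mm, size_px):
    """Physical size of one pixel after resizing a square region."""
    return field_of_view_mm / size_px


def clip_and_scale_to_8bit(pixels, low_pct=5, high_pct=99):
    """Clip intensities to percentile bounds (assumed values), then rescale to 0..255."""
    ordered = sorted(pixels)
    n = len(ordered)
    lo = ordered[int((n - 1) * low_pct / 100)]
    hi = ordered[int((n - 1) * high_pct / 100)]
    if hi == lo:
        # Flat image: nothing to stretch.
        return [0 for _ in pixels]
    scaled = []
    for p in pixels:
        clipped = min(max(p, lo), hi)
        scaled.append(round(255 * (clipped - lo) / (hi - lo)))
    return scaled


# A 110 mm region resized to 300 pixels gives roughly 0.37 mm per pixel.
print(round(pixel_spacing_mm(110, 300), 2))
```

Clipping before rescaling keeps a few extreme intensities (for example, markers or implants) from compressing the useful gray-level range of the bone and joint region.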

In 2020, BoneFinder was employed in another study by Bayramoglu et al. [15] to implement an adaptive segmentation method for the detection of regions of interest (ROIs). In this approach, the femoral and tibial regions were over-segmented into distinct subregions using superpixels and simple linear iterative clustering (SLIC), in which k-means clustering is applied jointly in the spatial and intensity domains. Starting from k equally spaced cluster centers, each pixel of the image was assigned to the nearest cluster center, and the cluster centers were updated. This process was repeated until convergence, after which each superpixel was assigned the average color and position of its associated input pixels. The Local Binary Patterns (LBP) descriptor was employed to identify the most informative region, where the binary classification of knee osteoarthritis (OA and non-OA) based on logistic regression (LR) achieved optimal results. This study demonstrates that the external median side of the femur and tibia, where osteophytes are typically observed, represents the most informative regions. LBP was selected after a comparison of multiple descriptors using a 5-fold cross-validation approach and a total of 9012 radiographic knee images from the Osteoarthritis Initiative (OAI) and 3644 radiographic knee images from MOST.
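
To make the LBP descriptor concrete, the sketch below computes the basic 8-neighbour LBP code of a 3 × 3 patch and a 256-bin histogram over an image given as a list of rows. It assumes the plain LBP variant with a "greater than or equal to the center" test; the cited study may have used a different variant or neighbourhood.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch given as three rows of intensities."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a 2D image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The histogram of a candidate region is the kind of feature vector that can then be passed to a classifier such as logistic regression.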

One of the popular algorithms for detecting and locating objects is You Only Look Once (YOLO), which was introduced in 2015 and has since gone through progressively more accurate and faster iterations. Chen et al. [16] used YOLOv2 to detect the knee joint from radiographic images. In this study, the image size was adapted to the model input (256 × 320). The model was trained on 5782 images, validated on 826 images, and tested on 1656 images collected from the OAI dataset. The number of anchor boxes was varied between 1 and 6, using k-means clustering of the knee bounding boxes. The best performance was achieved with a single anchor box and a weight decay of 0.0005, with a recall of 0.922 against 0.91 without weight decay. Knee joints in the test dataset were detected without false positives or false negatives. Note that 1526 (92.2%) knee joints have IoU values greater than 0.75, and 1654 (99.9%) knee joints have IoU values greater than 0.50.
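
The intersection over union (IoU) used to judge these detections can be computed as in the following sketch. The box format `(x_min, y_min, x_max, y_max)` and the function name are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height of the overlap are zero when the boxes do not intersect.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Share of the 1656 test knees above each IoU threshold, as reported above.
share_above_075 = 1526 / 1656
share_above_050 = 1654 / 1656
```

A detection is usually counted as correct when its IoU with the annotated box exceeds a chosen threshold, which is why the results are reported at both 0.50 and 0.75.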

Yunus et al. [17] employed YOLOv2 in a system designed to identify knee osteoarthritis. The proposed model is a combination of an Open Neural Network Exchange (ONNX) model and YOLOv2. The YOLOv2-ONNX model comprises 31 layers, with YOLOv2 employed in conjunction with the pre-established architecture of ONNX. In this work, the ONNX model comprises 24 layers rather than the originally proposed 35. The model comprises two element-wise affine layers, seven convolutional layers, six batch normalization layers, three max-pooling layers, four activation layers, and two rectified linear unit layers. The YOLOv2-ONNX model achieved an IoU of 0.96 and an mAP of 0.98. Table 5 provides an overview of the principal methodologies employed in the detection and localization of ROIs.

3.3. Knee Osteoarthritis Detection

Recent literature shows that deep learning and traditional machine learning can reliably assess knee osteoarthritis severity according to the Kellgren–Lawrence (KL) scale. This semi-quantitative scoring system ranges from grade 0 (KL0, no osteoarthritis) to grade 4 (KL4, severe osteoarthritis). It has been accepted by the World Health Organization (WHO) as a reference for epidemiological studies on osteoarthritis since 1961. Knee osteoarthritis classification is mainly associated with joint space narrowing (JSN), osteophytes, sclerosis, and bone extremity deformation. JSN is measured between the femur and the tibia in the medial compartment. It is associated with articular cartilage and subchondral bone sclerosis and is widely used as the main indicator of knee osteoarthritis. Figure 2 shows the stages of knee osteoarthritis severity according to the KL scoring system with joint space narrowing estimation as mentioned in [6]. Other scoring systems have been used to assess the severity of KOA, such as the Osteoarthritis Research Society International (OARSI) atlas [14] and the International Knee Documentation Committee (IKDC) scoring system [9]. Tian et al. [25] used the Visual Analogue Scale (VAS) and Hospital for Special Surgery (HSS) scores. Norman et al. [26] used the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) and the Knee injury and Osteoarthritis Outcome Score (KOOS). Other studies have developed their KOA detection models using both the KL and WOMAC scoring systems [27] or by merging KL grades: KL0 and KL1 grouped in a single category (no OA or normal) [13,28,29]; KL2, KL3, and KL4 grouped as abnormal for binary classification [14,15,20,30,31,32]; and KL3 and KL4 grouped as severe osteoarthritis [24,29,33,34] for 3-class and 4-class classification.
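
The grade groupings described above amount to a simple relabeling of the five KL grades. The sketch below shows the binary grouping (KL ≥ 2 as osteoarthritis) and one possible 3-class grouping; the class names, and the choice of keeping KL2 as its own middle class, are assumptions for illustration, since individual studies define their classes differently.

```python
def kl_to_binary(kl_grade):
    """Binary label: KL0-KL1 as no OA, KL2-KL4 as OA."""
    if kl_grade not in (0, 1, 2, 3, 4):
        raise ValueError("KL grade must be an integer from 0 to 4")
    return "OA" if kl_grade >= 2 else "no OA"


def kl_to_three_class(kl_grade):
    """Assumed 3-class grouping: {KL0, KL1}, {KL2}, {KL3, KL4}."""
    if kl_grade not in (0, 1, 2, 3, 4):
        raise ValueError("KL grade must be an integer from 0 to 4")
    if kl_grade <= 1:
        return "normal"
    if kl_grade == 2:
        return "intermediate"  # hypothetical name for the middle class
    return "severe"


labels = [kl_to_binary(g) for g in range(5)]
```

Because results obtained with merged grades are not directly comparable with 5-class results, the grouping used by each study has to be kept in mind when comparing reported accuracies.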

3.3.1. Deep Learning Models

In recent years, the use of deep learning methods for disease detection has attracted the attention of the research community due to their effectiveness in solving complex problems. In the field of deep learning, Convolutional Neural Networks (CNNs) have transformed visual analysis. Their capability to extract intricate patterns and features from images has made CNNs crucial for tasks such as image classification. Tri Wahyuningrum et al. [35] developed an end-to-end supervised Deep Convolutional Neural Network (DCNN) for automatic diagnosis of KOA. The model was built with five convolution blocks, and its hyperparameters were optimized. Each block includes a ReLU layer and a max-pooling layer to extract the distinctive characteristics of joint space narrowing and to evaluate osteophytes. A flatten layer is then followed by a fully connected layer, and a dropout layer (0.1) is used to avoid over-fitting. Finally, the model was trained with the Adam optimizer at a learning rate of 0.001, using cross-entropy as the loss function. Training ran for 30 epochs with a batch size of 100 images. The proposed model achieved an accuracy of 77.24%.

Chen et al. [16] noted that when knee osteoarthritis is classified using a cross-entropy (CE) loss, two neighboring grades may be confused because cross-entropy disregards the ordinal proximity of the classes. To solve this problem, an adjustable ordinal loss was applied to various DenseNet, VGG, ResNet, and InceptionV3 models. The proposed adjustable ordinal loss is built on a square matrix of penalty weights between the predicted grade j and the real grade i, with i and j between 0 and 4 (a 5 × 5 matrix), in which the penalty weight of each KL grade to itself is set to 1 and the weights increase as the grades become more distant. Applying this ordinal loss slightly improved classification accuracy and reduced the mean absolute error (MAE). VGG19 achieved the highest accuracy, 69.6% using ordinal loss and 69.3% using cross-entropy. When the region of interest (ROI) was automatically detected, the accuracy of VGG19-ordinal was improved to 70.4%.
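
The idea of penalizing distant misclassifications more than neighboring ones can be illustrated with the sketch below. It is not the exact loss of [16]: the linear form of the penalty matrix, the `step` parameter, and the way probabilities are combined are assumptions chosen only to show the principle.

```python
def penalty_matrix(num_grades=5, step=1.0):
    """Penalty weights: 1 on the diagonal, growing with the distance |i - j| (assumed linear)."""
    return [[1.0 + step * abs(i - j) for j in range(num_grades)]
            for i in range(num_grades)]


def ordinal_penalty_loss(probs, true_grade, weights):
    """Illustrative loss: probability mass placed on wrong grades, weighted by distance."""
    loss = 0.0
    for j, p in enumerate(probs):
        # For the true grade, the error is the missing probability mass.
        error = 1.0 - p if j == true_grade else p
        loss += weights[true_grade][j] * error
    return loss


weights = penalty_matrix()
near_miss = ordinal_penalty_loss([0.0, 0.0, 0.0, 1.0, 0.0], 2, weights)  # predicts KL3 for KL2
far_miss = ordinal_penalty_loss([1.0, 0.0, 0.0, 0.0, 0.0], 2, weights)   # predicts KL0 for KL2
```

With plain cross-entropy both mistakes above would cost the same; with distance-dependent weights, the KL0 prediction for a KL2 knee is penalized more than the KL3 prediction.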

In their study, Dalia et al. [18] employed the VGG16 model for the preliminary identification of knee osteoarthritis according to the Kellgren and Lawrence (KL) score. The model was selected following training and evaluation of VGG16-BN, ResNet152, DenseNet201, and VGG16 using 8892 knee X-ray images from the OAI dataset. In this study, the dataset was randomly split into a 7:1:2 ratio corresponding to training, validation, and test sets. The region of interest (ROI) was then detected using the You Only Look Once (YOLOv5) algorithm. The optimal results were achieved through the transfer learning of VGG16, which involved the addition of two fully connected layers (FC) with 4096 units and a third FC layer with 1000 units, followed by a softmax layer. VGG16 achieved an accuracy of 69.8%, while DenseNet-201 and VGG16-BN yielded 68.5% and 68.2%, respectively.
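
A random 7:1:2 split of this kind can be sketched as follows. The seed, the rounding rule, and the function name are assumptions; the cited study does not state how ties or remainders were handled.

```python
import random


def split_indices(n_images, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle image indices and cut them into training, validation, and test subsets."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    n_train = int(n_images * ratios[0])
    n_val = int(n_images * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test subset
    return train, val, test


train_idx, val_idx, test_idx = split_indices(8892)
```

When both knees or several visits of the same patient are present, splitting by patient rather than by image is generally advisable to avoid leakage between subsets.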

Jain et al. [36] proposed an automated system to identify knee osteoarthritis from radiographic images, based on the High-Resolution Network (HRNet) in conjunction with a Convolutional Block Attention Module (CBAM). In this study, the cross-entropy loss function was replaced by an ordinal loss function based on an ordinal matrix, as illustrated in [16]. The aim of this ordinal loss function is to penalize the misclassification of a distant KL score more than that of a neighboring score. The diverse array of layers within the HRNet framework enables the integration of both high- and low-resolution data, as well as the concurrent processing of multiple resolutions. The approach was developed by collecting 8260 radiographic knee images from the OAI dataset, which were divided into a training set (70%), a validation set (10%), and a test set (20%). The proposed model was trained for 30 epochs using a batch size of 24 images and stochastic gradient descent (SGD). The approach yielded an average accuracy of 71.74%.

Sivakumari and Vani [37] employed a deep convolutional neural network, AlexNet, to predict knee osteoarthritis. In this approach, 7434 knee radiographic images were obtained from the Kaggle platform and divided into training and testing datasets, with a ratio of 70% to 30%, respectively. The model comprises five convolution layers. The initial convolutional layer comprises 96 filters, the second layer 256 filters, and a block of three convolutional layers with 384 filters, separated by a max-pooling layer. The fifth convolution layer is followed by a third max-pooling layer and two fully connected layers (FC) with 4096 neurons, which are then followed by a softmax layer. The model achieved an accuracy of 98%.

Thomas et al. [38] built DenseNet169 from scratch with 169 layers to diagnose knee osteoarthritis severity according to the KL classification system using radiographic images. In this model, each pair of layers was connected in such a way that contours and primitive shapes could be used directly in the fully connected layer. The final layer was modified to have five outputs, corresponding to the KL grading system. In this study, 40,280 X-ray images were collected and augmented from the Osteoarthritis Initiative (OAI) dataset and divided into three subsets for model training, validation, and testing. The training set consisted of 32,116 images, the validation set included 4074 images, and the testing set comprised 4090 images. The model achieved an F1 score of 0.866 and an accuracy of 0.872.

Tiulpin and Saarakkala [14] developed a method based on the combination of two convolutional neural networks to diagnose the existence of knee osteoarthritis (KL ≥ 2). Each model comprises a first convolutional part that has been pre-trained on ImageNet and a second part consisting of seven independent fully connected layers (FC), corresponding to the KL grade and six OARSI grades (femoral osteophytes, tibial osteophytes, and narrowing of joint space for lateral and medial compartments, respectively). The initial convolutional block of the first model was constructed using ResNet50 following the evaluation of ResNet18 and ResNet34. In contrast, the initial convolutional block of the second model was developed using ResNeXt, and the squeeze–excitation block (SE) was incorporated into the second part. Moreover, the global average pooling technique was employed to establish a connection between the two models (ResNet50-SE and ResNeXt-SE). Both models were trained on 19,704 X-ray images from the OAI dataset and tested on 11,743 X-ray images from the MOST dataset using the Adam optimizer and a dropout of 0.5 inserted before each fully connected (FC) layer. In the initial epoch, the FC layer of each model was trained with a learning rate of 0.01. In the second and third epochs, a learning rate of 0.001 was employed, and subsequently, the learning rate was set to 0.0001 in the fourth epoch. This method achieved an average accuracy of 0.98 for binary classification of knee osteoarthritis from X-ray images, as well as Cohen kappa coefficients of 0.79, 0.84, 0.94, 0.83, 0.84, and 0.90 for the six OARSI grades, respectively.

In 2018, Tiulpin et al. [39] employed a deep Siamese network model comprising two branches of ResNet34. The model was trained on 18,376 images from the MOST dataset, then validated on 2957 images, and finally tested on 5960 images from the OAI dataset. In this approach, the image was divided into lateral and medial compartments, with each part of the image serving as the input for a branch of the model. Each branch comprises three blocks, each made up of a convolution layer, a max-pooling layer, a rectified linear unit (ReLU) layer, and a batch normalization layer (Conv-BN-ReLU), followed by a global average pooling layer, two (Conv-BN-ReLU) blocks, and a softmax layer. The method yielded a quadratic kappa coefficient of 0.83 and an average precision of 66.71%.
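
The quadratic kappa coefficient reported here measures agreement between predicted and reference grades while penalizing disagreements by the squared distance between grades. A minimal sketch of the standard quadratic weighted Cohen's kappa is given below; the function name and the handling of the degenerate single-class case are assumptions.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes=5):
    """Quadratic weighted Cohen's kappa for integer grades 0..num_classes-1."""
    n = len(y_true)
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(num_classes))
                  for j in range(num_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        # Every label falls in one class; agreement is treated as perfect here.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement.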

Yang et al. [11] developed an automatic diagnostic model of KOA, based on RefineDet, that can be used on a portable device. This model is composed of two connected modules: the anchor refinement module (ARM), which is used for ROI localization, and the object detection module (ODM), which is used for refinement and classification of the KOA severity. To facilitate the sharing of information between the two modules, a transfer connection block (TCB) was incorporated. A dataset of 2579 posterior–anterior (PA) view radiographic images of knees collected from the General Hospital of the People’s Liberation Army in China was utilized to train the proposed approach. In this study, the images were captured using an iPhone fixed at a distance of 40 cm. Furthermore, the pre-trained model was constructed using 2499 training images, 263 validation images, and 941 test images. The model was trained using a batch size of 128 images, a momentum of 0.9, a weight decay of 5 × 10⁻⁴ with the Adam optimizer, and a learning rate of 2 × 10⁻². The automated diagnostic model for knee osteoarthritis achieved an accuracy of 95.7%.

In 2020, Zhang et al. [23] applied ResNet34 with a convolutional block attention module (CBAM) for the automatic classification of knee osteoarthritis. The model was trained using a large dataset acquired and augmented from the OAI dataset. This comprised 38,232 training images, 5422 validation images, and 10,986 test images. The kernel size of the final average pooling layer of ResNet34 was modified from 7 to 28, and the linear layer was altered to 512 × 5. CBAM was implemented at the end of each residual block of ResNet34, and cross-entropy was applied as the loss function to train the model. The accuracy of ResNet34 was enhanced by the incorporation of CBAM, resulting in an improvement from 73.75% to 74.81%. Concurrently, the mean-squared error (MSE) was reduced from 0.39 to 0.36.

3.3.2. Hybrid Models

The combination of deep learning and traditional machine learning models for the identification of knee osteoarthritis and the integration of other clinical data have yielded variable results. Nonetheless, the integration of feature extractors has emerged as a valuable experimental approach. Brahim et al. [12] employed an automated computer-aided diagnostic approach for the detection of knee osteoarthritis, utilizing predictive modeling based on multivariate linear regression (MLR) and the independent component analysis approach (ICA) to extract features and circumvent the issue of limited datasets. The resulting characteristics were employed for the fully automated classification task, which was conducted using Naive Bayes and Random Forest classifiers. The proposed system achieved an accuracy of 82.98%, a sensitivity of 87.15%, and a specificity of 80.65%.
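
Accuracy, sensitivity, and specificity, as reported for this system, are derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts used in the example are invented for illustration and are not taken from [12].

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall on OA), and specificity (recall on non-OA)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented counts, for illustration only.
example = binary_metrics(tp=8, fp=1, tn=9, fn=2)
```

Reporting sensitivity and specificity alongside accuracy matters for screening tasks, where missing a diseased knee and flagging a healthy one have different costs.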

Abedin et al. [40] employed Elastic Net (EN) and Random Forest (RF) to develop a predictive model for knee osteoarthritis. This model was constructed using patient assessment data (i.e., signs and symptoms of the knees and medication use) and a convolutional neural network (CNN). The model comprises four convolutional layers. Each convolutional layer is followed by a batch normalization layer (BN), a rectified linear unit (ReLU) activation layer, and a max-pooling layer. The final max-pooling layer is followed by a fully connected layer (FC) comprising 1024 neurons and a softmax layer with five outputs, representing the five KL grades of KOA severity. To prevent over-fitting, a dropout layer with a rate of 0.25 is included after the final convolutional layer, and a dropout layer with a rate of 0.5 is applied after the fully connected layer. The use of 100 trees in this work yielded the lowest root-mean-square error (RMSE) in the validation set. The Elastic Net and Random Forest regression models yielded a global RMSE of 0.974 and 0.943, respectively.

Gornale et al. [6] compared the performance of K-Nearest Neighbors (K-NN) and Decision Tree classifiers in knee osteoarthritis prediction on their own dataset, known as Medical-expert, which consists of 1650 knee radiographic images collected from various hospitals and diagnostic centers in India. In this study, the region of interest (ROI) was segmented using Sobel and Prewitt edge detectors, which yielded the best results. The Active Shape Model (ASM) with a 3 × 3 mask and the Otsu method were employed to extract the cartilage area. Subsequently, an algorithm was employed to assess cartilage thickness based on pixel density. Both models were trained on 50% of the dataset and tested on the remaining 50%. The decision tree model achieved a global accuracy of 95.09%, whereas the K-NN model yielded a global accuracy of 99.81%.
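
The Otsu method mentioned above selects the gray-level threshold that maximizes the between-class variance of the image histogram. The following sketch implements that standard criterion for 8-bit intensities given as a flat list; it is a generic illustration, not the pipeline of [6].

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance (class 0: values <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    count0, sum0 = 0, 0
    for t in range(levels):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so no valid split at this level
        mean0 = sum0 / count0
        mean1 = (sum_all - sum0) / count1
        # Proportional to the between-class variance.
        score = count0 * count1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


threshold = otsu_threshold([10] * 50 + [200] * 50)
```

Pixels at or below the returned threshold form one class and the rest form the other, which gives a binary mask from which a region such as the joint space can be measured.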

In 2021, Bayramoglu et al. [41] developed multiple models, trained either individually or in groups, to predict patellofemoral osteoarthritis (PFOA vs. non-PFOA). All models were trained on 18,436 lateral-view knee X-ray images from the MOST dataset. The models were developed using an end-to-end deep convolutional neural network and a trained Gradient Boosting Machine (GBM) classifier, which were both informed by clinical data, including age, sex, body mass index (BMI), the Western Ontario and McMaster Universities Arthritis Index (WOMAC) total score, and the KL grade of tibiofemoral osteoarthritis. In this study, Faster R-CNN was employed to identify the region of interest around the patella. The Deep-CNN model comprises three blocks of convolution layers, trained over more than 20 epochs using a stochastic gradient descent algorithm. Each convolution layer is followed by a batch normalization (BN) layer, a max-pooling layer, and a rectified linear unit (ReLU) activation function. The third block is followed by two fully connected layers, separated by a dropout layer with a rate of 0.5. The model, which combined GBM, CNN predictions of tibiofemoral osteoarthritis (KL score), and clinical data, achieved an average precision of 0.862 and an area under the curve (AUC) of 0.958. The model constructed using only the GBM classifier and clinical data achieved an average precision of 0.472 and an AUC of 0.806.
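Average precision and AUC, the two metrics reported for [41], are both computed from ranked prediction scores rather than from hard labels. The sketch below gives simple reference implementations, assuming binary labels (1 = PFOA, 0 = non-PFOA); the labels and scores are hypothetical, and tied scores are not treated specially in the average-precision function.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("at least one positive case is required")
    return precision_sum / hits


# Hypothetical labels and model scores (illustration only).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUC = {roc_auc(labels, scores):.3f}, AP = {average_precision(labels, scores):.3f}")
```

Average precision is the more demanding of the two when positives are rare, which may explain why the clinical-data-only model in [41] keeps a fair AUC (0.806) but a much lower AP (0.472).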

In 2022, Ahmed and Mstafa [30] developed two methods for detecting knee osteoarthritis, designated Deep Hybrid Learning (DHL-I and DHL-II), which combine deep learning and machine learning. A pre-trained deep learning model and the Principal Component Analysis (PCA) algorithm were employed to extract the requisite information, which was then classified in a binary or multi-class setting (3, 4, and 5 classes) using a Support Vector Machine (SVM) classifier. A total of 9786 radiographic images of knees from the OAI dataset were utilized in this study, with 80% allocated for training and 20% for testing. Two deep learning models were used: one built from scratch and one based on transfer learning. The model comprises five blocks, each comprising a 2D convolution layer and a ReLU layer. The first three blocks use 32 filters and are each followed by an average pooling layer and a dropout layer with a rate of 0.25. The fourth block, comprising 64 filters, is followed by a max-pooling layer and a dropout layer with a rate of 0.5, while the fifth block, comprising 128 filters, is followed by a max-pooling layer and a dropout layer with a rate of 0.75. The model head comprises a flatten layer and a fully connected (FC) layer, which is followed by a dropout layer with a rate of 0.5 and a softmax layer. The deep learning models were trained with a batch size of 256 images using the Adam optimizer (β1 = 0.9, β2 = 0.999), a learning rate of 0.001, and binary cross-entropy and categorical cross-entropy for binary and multi-class classification, respectively. The SVM classifier was constructed with optimal values of gamma (0.0001) and a penalty parameter (1000). The DHL-I model was built from scratch and trained for 50 epochs to predict the severity of knee osteoarthritis in five distinct classes. The DHL-II model was constructed using transfer learning and trained for 20 epochs to classify knee osteoarthritis into two, three, or four severity classes. The DHL-I model achieved an accuracy of 74.57%, while the DHL-II model yielded accuracies of 90.8%, 87%, and 88% for binary, 3-class, and 4-class classification, respectively.
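To make the reported optimizer settings concrete, the sketch below performs a single Adam update on one scalar weight with β1 = 0.9, β2 = 0.999, and a learning rate of 0.001. The stabilizing constant eps = 1e-8 is an assumption (it is not stated above), and the weight and gradient values are hypothetical.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; eps is an assumed default."""
    m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Hypothetical starting weight and gradient; t=1 is the first training step.
theta, m, v = adam_step(theta=1.0, grad=0.5, m=0.0, v=0.0, t=1)
print(f"updated weight: {theta:.6f}")
```

On the first step the bias-corrected ratio has magnitude close to 1, so the weight moves by roughly the learning rate regardless of the gradient's scale.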

In a more recent study, Alshamrani et al. [42] proposed the use of transfer learning models based on sequential convolutional neural networks (CNNs), Residual Neural Network (ResNet-50), and Visual Geometry Group (VGG16) for the classification of osteoarthritis from knee radiographs. The models were trained using 3836 radiographs obtained from Kaggle. The VGG16 model demonstrated a training accuracy of 99% and a testing accuracy of 92%. Mohammed et al. [32] proposed a binary and 3-class classification system for the objective determination of the severity of knee osteoarthritis (KOA) from X-ray images. The approach employed six pre-trained deep neural network (DNN) models: VGG19, VGG16, MobileNetV2, ResNet101, DenseNet121, and InceptionResNetV2. The models were trained and tested on a total of 9786 knee images from the Osteoarthritis Initiative (OAI) dataset. The ResNet101 model exhibited the highest accuracy, achieving 89%. The other principal methodologies employed in recent years are presented in Table 6 in descending order of their efficacy in identifying knee osteoarthritis from radiographic images.

3.3.3. Comparative Analysis of Reviewed Methods and Techniques

This section presents a comparative analysis of the diagnostic approaches used in the cited papers to diagnose knee osteoarthritis. The classification of knee osteoarthritis is conducted in two stages. The first stage involves processing the input images, with or without detection of a region of interest (ROI). The second stage entails automatically extracting features from the region of interest and classifying the severity of knee osteoarthritis according to the KL grades, using traditional machine learning algorithms and deep learning methods.

A. In the initial phase of the knee osteoarthritis identification process, studies that did not require ROI localization used databases of readily available images of individual knees as input to the proposed model. In contrast, studies based on ROI detection employed manual selection or additional deep learning models for localization before the classification process. Both types of studies, with and without ROI detection, pre-process the input images by resizing them according to the requirements of the models used.

-: As demonstrated in Table 6, the methods that employed ROI detection exhibited a slight advantage over the other methods, with the highest accuracy observed in [43] at 99.81%. In this study, the edges of the knee bone are first detected and then filtered to suppress noise without losing essential image information. In the second step, the region of interest (ROI) is identified based on pixel density. Finally, a traditional machine learning K-nearest neighbor (KNN) classifier is used to classify knee osteoarthritis.
-: In [44], the manual selection of the ROI before the classification of knee osteoarthritis did not yield satisfactory results. In this approach, the lowest accuracy among studies employing ROI detection was obtained with an average multi-class accuracy of 61.71%.
-: A brief analysis of Table 6 allows us to conclude that approaches that did not employ ROI detection achieved performances comparable to those obtained with methods developed with ROI localization. The best performances were obtained in [28] with a weighted kappa coefficient of 0.99 and MAE of 0.0256 using a set of deep learning models. The approach used in [45] exhibited the lowest performance, with a balanced accuracy of 64.13 ± 0.88. This approach employed a Siamese network and a semi-supervised learning technique.
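The studies compared above report different metrics (accuracy, weighted kappa, MAE, balanced accuracy), which makes direct comparison difficult. The sketch below shows how three of them are computed for ordinal KL grades, assuming grades encoded as integers 0 to 4 and quadratic weights for kappa (the weighting scheme used in [28] is not stated here); the sample labels are hypothetical.

```python
def confusion(y_true, y_pred, k):
    """k x k confusion matrix with true grades as rows."""
    matrix = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def quadratic_weighted_kappa(y_true, y_pred, k=5):
    """Agreement corrected for chance, penalizing errors by squared grade distance."""
    n = len(y_true)
    obs = confusion(y_true, y_pred, k)
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            observed += weight * obs[i][j]
            expected += weight * row[i] * col[j] / n
    return 1.0 - observed / expected


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred, k=5):
    """Mean per-class recall over the classes present in y_true."""
    obs = confusion(y_true, y_pred, k)
    recalls = [obs[i][i] / sum(obs[i]) for i in range(k) if sum(obs[i]) > 0]
    return sum(recalls) / len(recalls)


# Hypothetical true and predicted KL grades (illustration only).
y_true = [0, 1, 2, 3, 4, 2, 3, 1]
y_pred = [0, 1, 2, 3, 4, 3, 3, 0]
kappa = quadratic_weighted_kappa(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"kappa={kappa:.4f} MAE={mae:.2f} balanced accuracy={bal_acc:.2f}")
```

In this toy example plain accuracy is 0.75, balanced accuracy is 0.80, and weighted kappa is about 0.93, which illustrates why figures expressed in different metrics should not be ranked against each other without care.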

B. In the second phase of feature extraction and classification of knee OA, the most successful techniques relied on additional feature-extraction methods, including cartilage thickness estimation in [6] and Hu invariant moments in [43].

-: The traditional machine learning models KNN and XGB performed well as feature extractors and classifiers of knee osteoarthritis, giving the best results among the compared models, with accuracies of 99.81% and 99.43% in [43] and [46], respectively.
-: Table 6 reveals that a common methodology is to train several models and then either select the best-performing one or adopt the result of the ensemble. The DenseNet and ResNet deep learning models are the most commonly adopted in the proposed approaches, their performance being compared with that of other models. However, they did not achieve the best results overall. The DenseNet model achieved its best performance in the study [34], using DenseNet169, with an accuracy of 95.93%, while the ResNet model achieved its best performance in [47], using ResNet152V2, with an accuracy of 95.88%.
-: Among the deep learning models developed for knee OA classification, MobileNet achieved the best performance in [47] when used as a stand-alone model (98.36% accuracy), while the lowest performance was observed in [44] using Deep Siamese ResNet34, with an accuracy of 61.71%.

Table 6. Updated artificial intelligence systems to detect knee osteoarthritis severity and progression from radiographs.

References	Methodology	Architecture	With/Without ROI	Classes	Dataset	Sample Size	Metric
Gornale S.S. et al. [43]	▪ Multi-class classification using traditional ML models and Hu’s invariant moments.	◦ KNN and Decision Tree	With	5	[6]	2000	Accuracy (KNN/DT) = 99.80%/95.75% (Medical expert-I) and 98.65%/95.4% (Medical expert-II)
Fatema et al. [46]	▪ Multi-class classification using optimal features and traditional ML models.	◦ DT, RF, KNN, GB, XGB	With	5	Mendeley and Kaggle	8660	Best model (XGB): precision: 99.43%
Tariq et al. [28]	▪ Multi-class classification using an ensemble of deep learning models.	◦ DenseNet161, DenseNet121, ResNet34, VGG19.	Without	5	OAI	9786	Best model (Ensemble): Weighted kappa = 0.99, MAE = 0.0256
Raza A. et al. [48]	▪ Multi-class classification using traditional ML models and 10-fold cross-validation.	◦ KNN, SVM, Gaussian Naive Bayes, Decision Tree, Random Forest, and XGBoost	Without	4	Mendeley	5778	Best model (XGBoost and ensemble model) Accuracy: 98.90%
Gornale S.S. et al. [49]	▪ Multi-class classification using DL model and Gradient conjugate.	◦ Artificial neural networks (ANN) + Gradient conjugate/Gradient descent/Quasi-Newton	With	5	[6]	1650	Accuracy: 98.7% (surgeon-1), 98.2% (surgeon-2)
Kalpana V et al. [47]	▪ Multi-class classification using DL models.	◦ DenseNet, EfficientNetB7, Inception, MobileNet, NASNet, ResNet152V2, VGG19, Xception	Without	5	Local dataset	8250	Best accuracy (MobileNet): 98.36%
Touahema S. et al. [50]	▪ MeedKnee: Multi-class classification using DL model.	◦ Xception	Without	5	OAI	5000	Accuracy: 97.20%
Wani and Saini [22]	▪ Multi-class classification using DL models and adjustable ordinal loss	◦ VGG, DenseNet, ResNet, and Inception V3	With	5	OAI	1656	Best model (VGG19): Accuracy = 96.7%, MAE = 0.344
Salama et al. [51]	▪ Multi-class classification using DL model and JSW annotation.	◦ U-Net	With	5	OAI	100	Accuracy: 96.3%
El-Ghany et al. [34]	▪ Multi-class and binary classification using DL models.	◦ DenseNet169, Xception, ResNet50, DenseNet121, InceptionResNetV2, InceptionV3	Without	2/3	OAI	8891/6224	Best model (DenseNet169) Accuracy: 95.93% (3-class), 93.78% (2-class)
Anitha et al. [52]	▪ Multi-class classification and joint space detection using DL model.	◦ RCNN	With	5	Kaggle	167	Accuracy: 95%
Al-rimy et al. [33]	▪ Binary and Multi-class classification using DL model and the gradual cross-entropy (GCE) loss.	◦ DenseNet169	Without	2/5	OAI	4982	Accuracy: 0.9408 (2-class), 0.9179 (3-class), 0.6274 (5-class)
Ruikar et al. [53]	▪ OACnet: Multi-class classification using DL model.	◦ Deep neural network and handcrafted feature engineering (joint space narrowing, bone spur, sclerosis, and deformation)	With	5	OAI	9492	Accuracy (DNN-OCnet) = 83.74%, accuracy (DNN-Ocnet + Handcrafted features) = 92.7%
Olsson et al. [8]	▪ Multi-class classification using DL model.	◦ ResNet35	Without	5	Local dataset	6403	AUC > 0.80 for the 5 grades KL—Sensitivity > 92% except for grade KL4 with 84% and a specificity between 61% and 88%
Yoon et al. [24]	▪ MediAI-OA: Binary and multi-class classification using JSN rate assessment.	◦ HRNet (JSN), NASNet	With	2/5	OAI	45,003	Accuracy: 92% (2-class), 83% (4-class)
Ahmed and Mohammed [54]	▪ Multi-class classification using Convolutional Neural Networks (CNNs).	◦ VGG16, VGG19, ResNet50	Without	5	Mendeley	1650	Best validation accuracy ResNet50: 91.51%
M. and Goswami. [55]	▪ Multi-class classification using CNN model and image sharpening process.	◦ Inception-Resnet-v2	With	5	OAI	8260	Accuracy: 91.03%
Yunus et al. [17]	▪ Multi-class classification using a hybrid model and feature extraction with LBP.	◦ Darknet53, Alexnet, ROI localization: YOLOv2 ONNX, Final classification: SVM, KNN	With	5	OAI	3795	Accuracy = 90.6%, Precision = 85%, Sensitivity = 91%
Nguyen Huu et al. [20]	▪ Binary classification using the CNN model.	◦ VGG16	With	2	OAI	2874	Accuracy = 89%
Yong et al. [56]	▪ Multi-class classification using DL model and ordinal regression.	◦ Ordinal regression module (ORM) applied on VGG, ResNet, DenseNet, ResNext, GoogLeNet, and Mobilenet.	Without	5	OAI	4130	Best model (DenseNet161): ACCMacro = 88.09%
Abdo et al. [29]	▪ Binary and multi-class classification using DL model.	◦ DNN	Without	2/3	OAI/[6]	9737/1650	3D-CNN (binary classification) (OAI-I/OAI-II): Accuracy = 85.50%/83.%
Yildirim and Mutlu [57]	▪ Multi-class classification using DL model and textural-based feature extraction.	◦ Darknet53 combined with HOG and LBP	Without	5	Kaggle	1650	Accuracy: 83.6%
Bhat and Suhasini [58]	▪ Multi-class classification using a hybrid model.	◦ Feed Forward Neural Network (FFNN), combination of Deep Belief Networks (DBN) and Restricted Boltzmann Machine (RBM), and Multi SVM.	With	5	Local dataset	126	Average precision: FFNN: 76%, DBN-RBM: 83%, Multi SVM: 74.67%
Wang Yu et al. [21]	▪ Multi-class classification based on a two-step classification strategy with DL models and High-pass filter.	◦ VGG, ResNet-50	With	5	OAI	8892	Average accuracy: 81.41%
Bayramoglu et al. [15]	▪ Binary classification (OA, no OA) using traditional ML model and texture descriptors.	◦ Logistic Regression combined with texture descriptors: FD, Shannon Entropy, Gabor, Haralick, Wavelet, Tamura, and Local Binary Pattern (LBP)	With	2	OAI/MOST	9012/3644	AUC of 0.840 (OA: 0.825, no OA: 0.852), AP = 0.804 (0.786, 0.820).
Riad et al. [31]	▪ Binary classification using traditional ML model and texture analysis approach.	◦ KNN, SVM, Radial Basis Function (RBF) kernels	With	2	OAI	688	Accuracy 80.38%
Bonakdari et al. [59]	▪ KOA structural progressors using a traditional ML model combined with gender, serum biomarkers, age, BMI, and inflammatory factors.	◦ KNN, Random Forest, Decision Tree, Extreme Learning Machine, and SVM.	Without	2	OAI/Naproxen cohort	677/44	Accuracy: >80% using SVM
Norman et al. [26]	▪ Multi-class classification using DL model and demographic variables.	◦ DenseNet after 15 epochs (DN15) and DenseNet with demographic input vector after 8 epochs: age, sex, and race (DenseNet-DEM-8)	With	4	OAI	39,593	Sensitivity/Specificity: No OA, Mild OA, Moderate OA, Severe OA: 83.7%/86.1%, 70.2%/83.8%, 68.9%/97.1%, and 86.0%/99.1%, respectively.
Pi et al. [60]	▪ Multi-class classification using an ensemble of DL models and mix voting algorithm.	◦ DenseNet-161, EfficientNet-b5, EfficientNet-V2-s, RegNet-Y-8GF, ResNet-101, ResNeXt, WideResNet-50-2, ShuffleNet-V2-x2-0	Without	5	OAI	8260	Accuracy: 76.93%
Kwon et al. [7]	▪ Multi-class classification using DL model and gait analysis data.	◦ Inception-ResNet-v2 using gait analysis for features extraction with NCA, support vector machine (SVM) classifier, and cubic kernel function.	Without	5	Local dataset	215	Sensitivity = 0.70, precision = 0.76, F1-score = 0.71
Wahyuningrum et al. [3]	▪ Multi-class classification using DL model and LSTM.	◦ VGGNet, ResNet, DenseNet combined with LSTM	With	5	OAI	5148	Average accuracy (VGG16-LSTM) = 75.28%
Nguyen et al. [13]	▪ Binary classification using deep supervised and semi-supervised learning.	◦ Deep Siamese Neural Network ResNet34	With	2	OAI	39,902	Best result (BA): Semixup (SSL): 71.0 ± 0.8%, SL without Semixup: 70.9 ± 0.8%.
Wang Yifan et al. [1]	▪ Multi-class classification using a novel learning scheme, Estimating Label Confidence, and hybrid loss function.	◦ ResNet34 and DenseNet121	With	5	OAI	8302	Mean accuracy (DenseNet121/ResNet34) = 70.13%/68.32%.
Tiulpin et al. [27]	▪ KOA prediction and progression using traditional ML model and clinical data.	◦ Se-resnext101 32 × 4d/Se-resnet50/Inceptionv4. Final classification with GBM.	With	3	OAI/MOST	4928/3918	AP (CNN + clinical data) = 70%, AP (se-resnext50 32 × 4d) alone = 63%, and 68% when combined with clinical data
Hu et al. [61]	▪ Predicting KOA longitudinal progression during 4 years using an adversarial evolving neural network (A-ENN).	◦ ResNet18, VGG19/ResNet50/ViT	Without	5	OAI	3294	Best accuracy: A-ENN(VGG19): 62.7% and 64.6%, 63.9%, 63.2%, 61.8%, and 60.2% for progression 12-month, 24-month, 36-month, and 48-month, respectively.
Raisuddin et al. [45]	▪ Multi-class classification using Deep Active Learning using Consistency Regularization (CR).	◦ SSL deep Siamese VGG	Without	5	OAI	9003	Balanced accuracy = 64.13 ± 0.88
Cueva et al. [44]	▪ Multi-class classification using computer-assisted diagnostic (CAD) based on the DL model.	◦ Deep Siamese ResNet34	With	5	[6]/Local dataset	9182/376	Average multi-class accuracy: 61.71%.

4. Discussion

In this literature review, 60 research articles were selected based on the above criteria. As shown in Figure 3, 15 of these articles were published in 2022, while only five of the selected articles were published in 2024, as the survey was closed on 15 May 2024. Of the 60 articles reviewed, 56 dealt with the classification of knee osteoarthritis according to KL grade, while only 4 focused on monitoring disease progression. In response to the first research question (RQ1), the main methodologies developed in the reviewed papers are based on supervised transfer learning using a variety of traditional machine learning algorithms and deep learning models. The reported result is the performance of the most efficient model or of a combination of models. In the proposed approaches, ResNet models were used in 24 articles, while DenseNet and Visual Geometry Group (VGG) models were applied in 14 and 13 articles, respectively. In terms of conventional machine learning approaches, Logistic Regression (LR) and Random Forest (RF) are the main methods used in the literature. Furthermore, we find that 50 articles used deep learning models (83%), and only 10 articles preferred traditional machine learning algorithms (17%). In addition, we find that Grad-CAM is the most widely used visualization method in the literature, followed by the saliency map (Figure 3). The developed methods are dominated by supervised transfer learning with 58 articles, compared to only two papers that used a semi-supervised learning technique [13,45]. From Table 3, it can be concluded that the OAI dataset is the most used database, followed by the MOST dataset, since these two databases are the only large collections of labeled images available.

With regard to the second question (RQ2), as shown in Table 6, the traditional machine learning model developed in [43] achieved the highest accuracy, with an average accuracy of 99.80% using an active contour algorithm for ROI segmentation and KNN for knee OA classification. Among the methods developed without ROI detection, the most accurate method, developed in [28], achieved a weighted kappa coefficient of 0.99 and a mean absolute error (MAE) of 0.0256. A brief analysis indicates that ROI segmentation has a minimal impact on the accuracy of the methods developed. One plausible explanation is that, while ROI segmentation enhances the detection of joint space narrowing (JSN), it can also conceal certain signs of knee osteoarthritis, such as osteophytes and sclerosis, which limits the accuracy gain of the model employed for feature extraction and classification. Similarly, the use of demographic variables or clinical data (gender, age, serum biomarkers, body mass index (BMI), inflammatory factors, gait analysis data, Western Ontario and McMaster Universities Arthritis Index (WOMAC) total score) did not yield satisfactory results, with an accuracy of approximately 80% [7,26,27,41]. However, the analysis in this study indicates that hyperparameters, feature extraction, and optimal weights are critical factors with the potential to enhance model accuracy. It is important to note that performance on internal validation data is consistently superior to that obtained in external validation using entirely different image datasets. Furthermore, Table 3 reveals that studies comparing binary and multi-class classification generally achieved slightly higher accuracies with binary classification [24,29,30,33]. Researchers tend to employ a binary classification (OA, non-OA) or a multi-class classification with only four KL grades instead of five, combining the first two grades, KL0 and KL1. This is because the distinction between the first two grades is generally very challenging, even for experienced medical specialists, who may mistake grade KL0 for KL1 and vice versa.
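The label simplifications described above amount to a remapping of the KL grades before training. The sketch below illustrates both variants; the function names are hypothetical, and the binary cut-off at KL ≥ 2 is an assumption based on a commonly used definition of radiographic OA, not a rule taken from the reviewed studies.

```python
# Four-class scheme: KL0 and KL1 share class 0, KL2-KL4 shift down by one.
KL_TO_MERGED = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3}


def merge_kl0_kl1(grades):
    """Map five KL grades to four classes by merging KL0 and KL1."""
    return [KL_TO_MERGED[g] for g in grades]


def to_binary(grades, oa_threshold=2):
    """Map KL grades to non-OA (0) / OA (1); the threshold is an assumption."""
    return [int(g >= oa_threshold) for g in grades]


original = [0, 1, 2, 3, 4, 1, 0]
print(merge_kl0_kl1(original))
print(to_binary(original))
```

Since the remapping removes the hardest decision boundary (KL0 versus KL1), accuracies reported on four-class or binary tasks are not directly comparable with five-class results.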

As shown in Figure 4, the main guidelines proposed in the research articles can be summarized in the following steps: (i) image collection; (ii) image processing; (iii) ROI detection/segmentation; (iv) feature extraction; (v) feature selection; (vi) classification and progression; and (vii) visualization. The knee OA classification process begins with database selection and image pre-processing to eliminate noise and improve image quality. The second phase consists of manual, automatic, or semi-automatic detection of the ROI. Despite its importance in improving diagnostic accuracy, this phase is not used in some studies, where the first phase is followed directly by the third phase of feature extraction. The final phase consists of diagnosing knee osteoarthritis using a classification model to identify the severity of knee osteoarthritis according to the Kellgren and Lawrence (KL) score. In light of the preceding analysis, the reply to the last research question (RQ3) can be divided into three categories of AI challenges for future research in knee osteoarthritis identification:

i.: Dataset preparation: Selecting a suitable database is the first step toward an accurate diagnosis of knee osteoarthritis from radiographic images. Using mixed and balanced datasets with external validation is a first step towards reliable training. The classification of knee osteoarthritis can be simplified by merging the KL0 and KL1 grades into a single category [13,15,20,26,48]. Although KL1 has no real significance, keeping it as a separate class reduces the model’s accuracy, as it is challenging to differentiate it from KL0.
ii.: Model selection: In addition to preparing a suitable database, the type of model chosen, hyperparameters, feature extraction and selection, ROI localization, and JSN quantification are the most important factors in effectively improving the accuracy of the trained model. In this context, the deployment of different versions of YOLO for real-time ROI detection and localization is becoming increasingly prevalent.
iii.: Visualization: The reliability of a diagnosis is no longer based solely on its accuracy; it also depends on the visualization of the affected areas. The orientation of new research towards a precise and justified diagnosis is analogous to the practice of specialist doctors, who base their final diagnosis on well-defined symptoms. This approach could therefore be used to confirm the degree of knee osteoarthritis by visualizing the key elements that have contributed to the identification of this condition. This would provide greater confidence in the developed approach.

Finally, this study has some limitations. Firstly, the models compared were not tested on comparable databases, since the size and nature of the datasets used during training and validation differ from one article to another. Secondly, the studies reporting low performance did not indicate whether model training had converged, bearing in mind that, if it had not, increasing the number of epochs could improve their predictions. Thirdly, although the majority of evaluation methods used were based on mean accuracy, several studies used other evaluation criteria such as mean F1 score, mean recall, and mean precision [7,38], making the results difficult to compare directly. Lastly, the Scopus index was not considered in the selection of reviewed papers, which would have enhanced the findings of this literature review. Nevertheless, this survey provides a comprehensive update for doctors and researchers, offering a global perspective on the direction of scientific research and a valuable quantitative and qualitative analysis for all scientists. This study is not only a summary of the different methods used in recent years but also a guide and a comparative analysis of the different approaches used, highlighting the techniques that have helped improve the performance of artificial intelligence in diagnosing knee osteoarthritis from radiographic images.

5. Conclusions

A literature review of 60 research articles collected from the Google Scholar, MDPI, and ScienceDirect platforms was conducted according to a dedicated flowchart designed to improve the quality of research and elucidate the current research landscape on diagnosing knee osteoarthritis using artificial intelligence based on radiographic images. By answering the three research questions listed above, this study has identified the main methodologies developed in the recent literature, highlighted the main factors that have improved the performance of the artificial intelligence models used, and clarified the main challenges and likely future research avenues. This study demonstrated the trend in the recent literature towards supervised transfer learning using a set of artificial intelligence models for the classification or monitoring of knee osteoarthritis progression. In recent literature, approaches that merge KL0 and KL1 into a single class have significantly improved the performance of existing approaches. This survey confirmed that automatic ROI detection is a factor capable of improving the performance of artificial intelligence models, showing a slight performance superiority over approaches developed without ROI localization. This study raised the challenge of distinguishing the KL1 grade from the KL0 grade and of selecting the artificial intelligence model capable of providing optimal performance, pointing out that the traditional machine learning models KNN and XGB are the best classifiers of knee osteoarthritis. In addition, this survey showed that, despite the difficulties, artificial intelligence (AI) tools were able to prove their efficacy as a reliable resource for doctors in diagnosing knee osteoarthritis based on radiographic images. Furthermore, this work has outlined possible future directions for research, pointing out the need not only for an accurate diagnosis but also for one that is justified. To this end, future research can focus on naming the precise symptoms of knee osteoarthritis that led to the diagnostic result. In this way, the diagnosis made by artificial intelligence comes ever closer to that of doctors, who rely on detailed symptoms, thereby minimizing diagnostic errors. Ultimately, this study can be a resource for researchers and physicians, presenting the latest approaches and techniques used in the diagnosis of knee osteoarthritis with artificial intelligence, and can help developers build reliable research leads by easing the transition to successful model deployments, thus reducing the time needed to select and experiment with other AI solutions.

Author Contributions

Conceptualization, S.T., I.Z. and N.Z.; validation, I.Z. and N.Z.; formal analysis, I.Z., N.Z. and M.N.N.; supervision, I.Z., N.Z. and M.N.N.; writing—original draft preparation, S.T.; writing—review and editing, S.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The BoneFinder software cited in this literature review is available online: https://bone-finder.com/ (accessed on 25 September 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AI	Artificial intelligence
CNNs	Convolutional neural networks
DCNN	Deep convolutional neural network
KL	Kellgren and Lawrence
KOA	Knee Osteoarthritis
OAI	Osteoarthritis Initiative
TL	Transfer learning
ReLU	Rectified linear unit
ROI	Region of interest
RMSE	Root-Mean-Squared Error
SGD	Stochastic gradient descent
A-ENN	Adversarial evolving neural network
DAL	Deep active learning
LSTM	Long short-term memory
ORM	Ordinal regression module
HRNet	High-resolution network
CBAM	Convolutional block attention module
KNN	K-Nearest neighbors
DT	Decision tree
RF	Random forest
RQ	Research question
EC	Eligibility criteria
MAE	Mean absolute error
SC	Screening criteria
OARSI	Osteoarthritis Research Society International
WOMAC	Western Ontario and McMaster Universities Arthritis Index
IKDC	International Knee Documentation Committee
KOOS	Knee injury and Osteoarthritis Outcome Score
FFNN	Feed Forward Neural Network

References

Wang, Y.; Bi, Z.; Xie, Y.; Wu, T.; Zeng, X.; Chen, S.; Zhou, D. Learning From Highly Confident Samples for Automatic Knee Osteoarthritis Severity Assessment: Data From the Osteoarthritis Initiative. IEEE J. Biomed. Health Inform. 2022, 26, 1239–1250. [Google Scholar] [CrossRef] [PubMed]
Dhami, B.S.; Mehra, M.; Singh, R.P. VGG16 Based Knee Osteoarthritis Grading Using X-ray Images. IJRASET 2022, 10, 678–683. Available online: https://www.ijraset.com/research-paper/vgg16-based-knee-osteoarthritis-grading (accessed on 5 January 2023). [CrossRef]
Wahyuningrum, R.T.; Anifah, L.; Eddy Purnama, I.K.; Hery Purnomo, M. A New Approach to Classify Knee Osteoarthritis Severity from Radiographic Images based on CNN-LSTM Method. In Proceedings of the IEEE 10th International Conference on Awareness Science and Technology (ICAST), Morioka, Japan, 23–25 October 2019; pp. 1–6. Available online: https://ieeexplore.ieee.org/document/8923284 (accessed on 5 January 2023).
Wahyuningrum, R.T.; Purnama, I.K.E.; Verkerke, G.J.; van Ooijen, P.M.A.; Purnomo, M.H. A novel method for determining the Femoral-Tibial Angle of Knee Osteoarthritis on X-ray radiographs: Data from the Osteoarthritis Initiative. Heliyon 2019, 6, e04433. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
Statistics Adopted from the National Center for Chronic Disease Prevention and Health Promotion. Available online: https://archive.cdc.gov/#/details?q=https://www.cdc.gov/arthritis/data_statistics&start=0&rows=10&url=https://www.cdc.gov/media/releases/2017/p0307-arthritis-climbing.html (accessed on 10 August 2023).
Gornale, S.S.; Patravali, P.U.; Hiremath, P.S. A Comprehensive Digital Knee X-ray Image Dataset for the Assessment of Osteoarthritis. JSM Biomed. Imaging Data Pap. 2020, 6, 1012. Available online: https://www.academia.edu/79637490/A_Comprehensive_Digital_Knee_X_ray_Image_Dataset_for_the_Assessment_of_Osteoarthritis (accessed on 25 September 2023).
Kwon, S.B.; Han, H.-S.; Lee, M.C.; Kim, H.C.; Ku, Y.; Ro, D.H. Machine Learning-Based Automatic Classification of Knee Osteoarthritis Severity Using Gait Data and Radiographic Images. IEEE Access 2020, 8, 120597–120603. Available online: https://ieeexplore.ieee.org/document/9130657 (accessed on 5 January 2023).
Olsson, S.; Akbarian, E.; Lind, A.; Razavian, A.S.; Gordon, M. Automating classification of osteoarthritis according to Kellgren-Lawrence in the knee using deep learning in an unfiltered adult population. BMC Musculoskelet. Disord. 2021, 22, 844.
Schwartz, A.J.; Clarke, H.D.; Spangehl, M.J.; Bingham, J.S.; Etzioni, D.A.; Neville, M.R. Can a Convolutional Neural Network Classify Knee Osteoarthritis on Plain Radiographs as Accurately as Fellowship-Trained Knee Arthroplasty Surgeons? J. Arthroplast. 2020, 35, 2423–2428.
Tiulpin, A.; Melekhov, I.; Saarakkala, S. KNEEL: Knee Anatomical Landmark Localization Using Hourglass Networks. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–28 October 2019; Available online: https://ieeexplore.ieee.org/document/9022083 (accessed on 5 January 2023).
Yang, J.; Ji, Q.; Ni, M.; Zhang, G.; Wang, Y. Automatic assessment of knee osteoarthritis severity in portable devices based on deep learning. J. Orthop. Surg. Res. 2022, 17, 540. Available online: https://josr-online.biomedcentral.com/articles/10.1186/s13018-022-03429-2 (accessed on 5 January 2023).
Brahim, A.; Jennane, R.; Riad, R.; Janvier, T.; Khedher, L.; Toumi, H.; Lespessailles, E. A decision support tool for early detection of knee OsteoArthritis using X-ray imaging and machine learning: Data from the OsteoArthritis Initiative. Comput. Med. Imaging Graph. 2019, 73, 11–18.
Nguyen, H.H.; Saarakkala, S.; Blaschko, M.B.; Tiulpin, A. Semixup: In- and Out-of-Manifold Regularization for Deep Semi-Supervised Knee Osteoarthritis Severity Grading From Plain Radiographs. IEEE Trans. Med. Imaging 2020, 39, 4346–4356. Available online: https://ieeexplore.ieee.org/document/9169719 (accessed on 25 September 2023).
Tiulpin, A.; Saarakkala, S. Automatic Grading of Individual Knee Osteoarthritis Features in Plain Radiographs using Deep Convolutional Neural Networks. Diagnostics 2020, 10, 932.
Bayramoglu, N.; Tiulpin, A.; Hirvasniemi, J.; Nieminen, M.T.; Saarakkala, S. Adaptive segmentation of knee radiographs for selecting the optimal ROI in texture analysis. Osteoarthr. Cartil. 2020, 28, 941–952.
Chen, P.; Gao, L.; Shi, X.; Allen, K.; Yang, L. Fully automatic knee osteoarthritis severity grading using deep neural networks with a novel ordinal loss. Comput. Med. Imaging Graph. 2019, 75, 84–92.
Yunus, U.; Amin, J.; Sharif, M.; Yasmin, M.; Kadry, S.; Krishnamoorthy, S. Recognition of Knee Osteoarthritis (KOA) Using YOLOv2 and Classification Based on Convolutional Neural Network. Life 2022, 12, 1126.
Dalia, Y.; Bharath, A.; Mayya, V.; Sowmya Kamath, S. DeepOA: Clinical Decision Support System for Early Detection and Severity Grading of Knee Osteoarthritis. In Proceedings of the IEEE 5th International Conference on Computer, Communication and Signal Processing (ICCCSP), Chennai, India, 24–25 May 2021; pp. 250–255. Available online: https://ieeexplore.ieee.org/document/9465522 (accessed on 25 September 2023).
Nagaraj, K.; Jeyakumar, V. A Study on Comparative Analysis of Automated and Semiautomated Segmentation Techniques on Knee Osteoarthritis X-ray Radiographs. In Proceedings of the International Conference on ISMAC in Computational Vision and Bio-Engineering 2018 (ISMAC-CVB); Lecture Notes in Computational Vision and Biomechanics. Pandian, D., Fernando, X., Baig, Z., Shi, F., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 1655–1666.
Nguyen Huu, P.; Nguyen Thanh, D.; le Thi Hai, T.; Chu Duc, H.; Pham Viet, H.; Nguyen Trong, C. Detection and Classification Knee Osteoarthritis Algorithm using YOLOv3 and VGG16 Models. In Proceedings of the IEEE 7th National Scientific Conference on Applying New Technology in Green Buildings (ATiGB), Da Nang, Vietnam, 11–12 November 2022; pp. 31–36. Available online: https://ieeexplore.ieee.org/document/9984096 (accessed on 25 September 2023).
Wang, Y.; Li, S.; Zhao, B.; Zhang, J.; Yang, Y.; Li, B. A ResNet-based approach for accurate radiographic diagnosis of knee osteoarthritis. CAAI Trans. Intell. Technol. 2022, 7, 512–521. Available online: https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cit2.12079 (accessed on 5 January 2023).
Wani, Z.M.; Saini, D.S. Deep Neural Network-based Knee Osteoarthritis Grading Using X-rays. IJRASET 2022, 10, 1293–1299.
Zhang, B.; Tan, J.; Cho, K.; Chang, G.; Deniz, C.M. Attention-based CNN for KL Grade Classification: Data from the Osteoarthritis Initiative. In Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), Iowa City, IA, USA, 3–7 April 2020; pp. 731–735. Available online: https://ieeexplore.ieee.org/document/9098456 (accessed on 5 January 2023).
Yoon, J.S.; Yon, C.-J.; Lee, D.; Lee, J.J.; Kang, C.H.; Kang, S.-B.; Lee, N.-K.; Chang, C.B. Assessment of a novel deep learning-based software developed for automatic feature extraction and grading of radiographic knee osteoarthritis. BMC Musculoskelet. Disord. 2023, 24, 869.
Tian, X.; Han, C.; Wang, J.; Tan, Y.; Zhu, G.; Lei, M.; Ma, S.; Hu, Y.; Li, S.; Chen, H.; et al. Distal tibial tuberosity high tibial osteotomy using an image enhancement technique for orthopedic scans in the treatment of medial compartment knee osteoarthritis. Comput. Methods Programs Biomed. 2020, 191, 105349.
Norman, B.; Pedoia, V.; Noworolski, A.; Link, T.M.; Majumdar, S. Applying Densely Connected Convolutional Neural Networks for Staging Osteoarthritis Severity from Plain Radiographs. J. Digit. Imaging 2019, 32, 471–477.
Tiulpin, A.; Klein, S.; Bierma-Zeinstra, S.M.A.; Thevenot, J.; Rahtu, E.; van Meurs, J.; Oei, E.H.G.; Saarakkala, S. Multimodal Machine Learning-based Knee Osteoarthritis Progression Prediction from Plain Radiographs and Clinical Data. Sci. Rep. 2019, 9, 20038. Available online: https://www.nature.com/articles/s41598-019-56527-3 (accessed on 5 January 2023).
Tariq, T.; Suhail, Z.; Nawaz, Z. Knee Osteoarthritis Detection and Classification Using X-rays. IEEE Access 2023, 11, 48292–48303. Available online: https://ieeexplore.ieee.org/document/10126092 (accessed on 8 January 2024).
Abdo, A.A.; El-Tarhouni, W.; Abdulsalam, A.F.; Altajori, A.B. Estimating the severity of knee osteoarthritis using Deep Convolutional Neural Network based on Contrast Limited Adaptive Histogram Equalization technique. In Proceedings of the 2022 International Conference on Engineering & MIS (ICEMIS), Istanbul, Turkey, 4–6 July 2022; pp. 1–6. Available online: https://ieeexplore.ieee.org/document/9914285 (accessed on 5 January 2023).
Ahmed, S.M.; Mstafa, R.J. Identifying Severity Grading of Knee Osteoarthritis from X-ray Images Using an Efficient Mixture of Deep Learning and Machine Learning Models. Diagnostics 2022, 12, 2939.
Riad, R.; Jennane, R.; Brahim, A.; Janvier, T.; Toumi, H.; Lespessailles, E. Texture analysis using complex wavelet decomposition for knee osteoarthritis detection: Data from the osteoarthritis initiative. Comput. Electr. Eng. 2018, 68, 181–191.
Mohammed, A.S.; Hasanaath, A.A.; Latif, G.; Bashar, A. Knee Osteoarthritis Detection and Severity Classification Using Residual Neural Networks on Preprocessed X-ray Images. Diagnostics 2023, 13, 1380.
Al-Rimy, B.A.S.; Saeed, F.; Al-Sarem, M.; Albarrak, A.M.; Qasem, S.N. An Adaptive Early Stopping Technique for DenseNet169-Based Knee Osteoarthritis Detection Model. Diagnostics 2023, 13, 1903.
El-Ghany, S.A.; Elmogy, M.; El-Aziz, A.A.A. A fully automatic fine-tuned deep learning model for knee osteoarthritis detection and progression analysis. Egypt. Inform. J. 2023, 24, 229–240.
Tri Wahyuningrum, R.; Yasid, A.; Jacob Verkerke, G. Deep Neural Networks for Automatic Classification of Knee Osteoarthritis Severity Based on X-ray Images. In Proceedings of the 8th International Conference on Information Technology ICIT 2020: IoT and Smart City, Xi’an, China, 25–27 December 2020; ACM: New York, NY, USA, 2021; pp. 110–114. Available online: https://dl.acm.org/doi/10.1145/3446999.3447020 (accessed on 5 January 2023).
Jain, R.K.; Sharma, P.K.; Gaj, S.; Sur, A.; Ghosh, P. Knee Osteoarthritis Severity Prediction using an Attentive Multi-Scale Deep Convolutional Neural Network. Multimed. Tools Appl. 2023, 83, 6925–6942.
Sivakumari, T.; Vani, R. Implementation of AlexNet for Classification of Knee Osteoarthritis. In Proceedings of the 2022 7th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 22–24 June 2022; pp. 1405–1409. Available online: https://ieeexplore.ieee.org/document/9835835 (accessed on 5 January 2023).
Thomas, K.A.; Kidziński, Ł.; Halilaj, E.; Fleming, S.L.; Venkataraman, G.R.; Oei, E.H.G.; Gold, G.E.; Delp, S.L. Automated Classification of Radiographic Knee Osteoarthritis Severity Using Deep Neural Networks. Radiol. Artif. Intell. 2020, 2, e190065. Available online: https://pubs.rsna.org/doi/10.1148/ryai.2020190065 (accessed on 5 January 2023).
Tiulpin, A.; Thevenot, J.; Rahtu, E.; Lehenkari, P.; Saarakkala, S. Automatic Knee Osteoarthritis Diagnosis from Plain Radiographs: A Deep Learning-Based Approach. Sci. Rep. 2018, 8, 1727. Available online: https://www.nature.com/articles/s41598-018-20132-7 (accessed on 5 January 2023).
Abedin, J.; Antony, J.; McGuinness, K.; Moran, K.; O’Connor, N.E.; Rebholz-Schuhmann, D.; Newell, J. Predicting knee osteoarthritis severity: Comparative modeling based on patient’s data and plain X-ray images. Sci. Rep. 2019, 9, 5761.
Bayramoglu, N.; Nieminen, M.T.; Saarakkala, S. Automated detection of patellofemoral osteoarthritis from knee lateral view radiographs using deep learning: Data from the Multicenter Osteoarthritis Study (MOST). Osteoarthr. Cartil. 2021, 29, 1432–1447. Available online: https://www.oarsijournal.com/article/S1063-4584(21)00835-9/fulltext (accessed on 5 January 2023).
Alshamrani, H.A.; Rashid, M.; Alshamrani, S.S.; Alshehri, A.H. Osteo-NeT: An Automated System for Predicting Knee Osteoarthritis from X-ray Images Using Transfer-Learning-Based Neural Networks Approach. Healthcare 2023, 11, 1206.
Gornale, S.S.; Patravali, P.U.; Hiremath, P.S. Automatic Detection and Classification of Knee Osteoarthritis Using Hu’s Invariant Moments. Front. Robot. AI 2020, 7, 591827.
Cueva, J.H.; Castillo, D.; Espinós-Morató, H.; Durán, D.; Díaz, P.; Lakshminarayanan, V. Detection and Classification of Knee Osteoarthritis. Diagnostics 2022, 12, 2362.
Raisuddin, A.M.; Nguyen, H.H.; Tiulpin, A. Deep Semi-Supervised Active Learning for Knee Osteoarthritis Severity Grading. In Proceedings of the IEEE 19th International Symposium on Biomedical Imaging (ISBI), Kolkata, India, 28–31 March 2022; pp. 1–5. Available online: https://ieeexplore.ieee.org/document/9761668 (accessed on 5 January 2023).
Fatema, K.; Rony, M.A.H.; Azam, S.; Muckta, M.S.H.; Hasan, M.Z.; Jonkman, M. Development of an automated optimal distance feature-based decision system for diagnosing knee osteoarthritis using segmented X-ray images. Heliyon 2023, 9, e21703.
Kalpana, V.; Kumar, G.H. Evaluating the efficacy of deep learning models for knee osteoarthritis prediction based on Kellgren-Lawrence grading system. e-Prime Adv. Electr. Eng. Electron. Energy 2023, 5, 100266.
Raza, A.; Phan, T.-L.; Li, H.-C.; Van Hieu, N.; Nghia, T.T.; Ching, C.T.S. A Comparative Study of Machine Learning Classifiers for Enhancing Knee Osteoarthritis Diagnosis. Information 2024, 15, 183. Available online: https://www.mdpi.com/2078-2489/15/4/183 (accessed on 15 May 2024).
Gornale, S.S.; Patravali, P.U.; Hiremath, P.S. Detection of Osteoarthritis in Knee Radiographic Images using Artificial Neural Network. Int. J. Innov. Technol. Explor. Eng. IJITEE 2019, 8, 2429–2434. Available online: https://www.ijitee.org/portfolio-item/L30111081219/ (accessed on 5 January 2023).
Touahema, S.; Zaimi, I.; Zrira, N.; Ngote, M.N.; Doulhousne, H.; Aouial, M. MedKnee: A New Deep Learning-Based Software for Automated Prediction of Radiographic Knee Osteoarthritis. Diagnostics 2024, 14, 993.
Salama, A.; Rahouma, K.; Mansour, F.E. Knee osteoarthritis automatic detection using U-Net. IJ-AI 2024, 13, 2122. Available online: https://ijai.iaescore.com/index.php/IJAI/article/view/23241 (accessed on 15 May 2024).
Anitha, R.; Archana, M.; Aswini, R.; Christabell Smylin, P. Deep Learning Based Knee Osteoarthritis Detection and Classification. IJARSCT 2024, 4, 230–235.
Ruikar, D.; Kamble, P.; Ruikar, A.; Houde, K.; Hegadi, R. DNN-Based Knee OA Severity Prediction System: Pathologically Robust Feature Engineering Approach. SN Comput. Sci. 2022, 4, 58. Available online: https://dl.acm.org/doi/10.1007/s42979-022-01476-4 (accessed on 5 January 2023).
Huthaifa, A.A.; Emad, A.M. Detection and Classification of The Osteoarthritis in Knee Joint Using Transfer Learning with Convolutional Neural Networks (CNNs). Iraqi J. Sci. 2022, 63, 5058–5071. Available online: https://ijs.uobaghdad.edu.iq/index.php/eijs/article/view/6054 (accessed on 20 January 2023).
M, G.K.; Goswami, A.D. Automatic Classification of the Severity of Knee Osteoarthritis Using Enhanced Image Sharpening and CNN. Appl. Sci. 2023, 13, 1658.
Yong, C.W.; Teo, K.; Murphy, B.P.; Hum, Y.C.; Tee, Y.K.; Xia, K.; Lai, K.W. Knee osteoarthritis severity classification with ordinal regression module. Multimed. Tools Appl. 2021, 81, 41497–41509.
Yildirim, M.; Mutlu, H.B. Automatic detection of knee osteoarthritis grading using artificial intelligence-based methods. Int. J. Imaging Syst. Technol. 2024, 34, e23057. Available online: https://onlinelibrary.wiley.com/doi/10.1002/ima.23057 (accessed on 15 May 2024).
Bhat, A.Y.; Suhasini, A. Automated Detection For The Severity Of Knee Osteoarthritis from Plain Radiographs Using Machine Learning Methods. Int. J. Sci. Technol. Res. 2019, 8, 2277–8616. Available online: https://www.ijstr.org/final-print/sep2019/Automated-Detection-For-The-Severity-Of-Knee-Osteoarthritis-From-Plain-Radiographs-Using-Machine-Learning-Methods.pdf (accessed on 5 January 2023).
Bonakdari, H.; Jamshidi, A.; Pelletier, J.-P.; Abram, F.; Tardif, G.; Martel-Pelletier, J. A warning machine learning algorithm for early knee osteoarthritis structural progressor patient screening. Ther. Adv. Musculoskelet. Dis. 2021, 13, 1759720X21993254. Available online: https://journals.sagepub.com/doi/10.1177/1759720X21993254 (accessed on 5 January 2023).
Pi, S.-W.; Lee, B.-D.; Lee, M.S.; Lee, H.J. Ensemble deep-learning networks for automated osteoarthritis grading in knee X-ray images. Sci. Rep. 2023, 13, 22887. Available online: https://www.nature.com/articles/s41598-023-50210-4 (accessed on 8 January 2024).
Hu, K.; Wu, W.; Li, W.; Simic, M.; Zomaya, A.; Wang, Z. Adversarial Evolving Neural Network for Longitudinal Knee Osteoarthritis Prediction. IEEE Trans. Med. Imaging 2022, 41, 3207–3217.

Figure 1. Flowchart of survey inclusions and exclusions for the literature review.

Figure 2. Radiological stages of knee osteoarthritis according to cartilage thickness [6] and the Kellgren–Lawrence (KL) grading system.

Figure 3. Coverage of all articles included in the survey: (a) Number of research articles by year of publication; (b) Breakdown of articles by methods adopted; (c) Distribution of visualization methods employed.

Figure 4. Flowchart of the most common procedures developed in the reviewed articles to predict the classification and progression of knee osteoarthritis based on X-ray images.

Table 1. Global search keywords suitable for each of the online databases.

Category 1	Category 2
Knee osteoarthritis	Automated
Artificial intelligence	Detection
Deep learning	Progression
Machine learning
X-ray images
Radiographic images
Plain radiograph

Table 2. Inclusion criteria for the filtering stage of the literature review.

ID	Filtering Criteria
SC1	The article must include at least the title, abstract, source, year, and DOI.
SC2	Must be in English.
SC3	Abstract must discuss the implementation of artificial intelligence in knee osteoarthritis diagnosis.
SC4	The article must have been published between January 2018 and May 2024.
SC5	Should not be just a review, survey, preprint, or roadmap.
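
The screening criteria in Table 2 can be read as a simple record filter. The following is a minimal sketch under stated assumptions, not the procedure the authors actually ran: the `Record` fields, the keyword list used to approximate SC3, and the `passes_screening` helper are all hypothetical, and SC3 in practice requires a human reading of the abstract. The date window uses the 1 January 2018 to 15 May 2024 range given in the abstract.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    """Hypothetical bibliographic record exported from a database search."""
    title: Optional[str]
    abstract: Optional[str]
    source: Optional[str]
    published: Optional[date]
    doi: Optional[str]
    language: str
    article_type: str


# Assumption: crude keyword proxy for SC3; a reviewer would judge this manually.
AI_TERMS = ("artificial intelligence", "deep learning", "machine learning")
EXCLUDED_TYPES = {"review", "survey", "preprint", "roadmap"}
WINDOW_START = date(2018, 1, 1)
WINDOW_END = date(2024, 5, 15)


def passes_screening(r: Record) -> bool:
    # SC1: title, abstract, source, year, and DOI must all be present.
    if not all([r.title, r.abstract, r.source, r.published, r.doi]):
        return False
    # SC2: English only.
    if r.language.lower() != "english":
        return False
    # SC3: abstract mentions knee osteoarthritis and an AI term (proxy only).
    text = r.abstract.lower()
    if "knee osteoarthritis" not in text or not any(t in text for t in AI_TERMS):
        return False
    # SC4: published inside the review window.
    if not (WINDOW_START <= r.published <= WINDOW_END):
        return False
    # SC5: reviews, surveys, preprints, and roadmaps are excluded.
    return r.article_type.lower() not in EXCLUDED_TYPES
```

A record that fails any single criterion is dropped before the eligibility phase; the exclusion criteria of that phase are not modeled here.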

Table 3. Exclusion criteria for the eligibility phase of the literature review.

ID	Eligibility Criteria
EC1	Does not address the implementation of artificial intelligence in KOA diagnosis.
EC2	Is not based on X-ray images.
EC3	Full text not accessible.

Table 4. Description of the most commonly used datasets.

Dataset	Description	Labeled Images	Availability
OAI	Osteoarthritis Initiative (OAI) dataset: multicenter, ten-year observational study of 4796 participants (men and women, age = 45–79 years)	4446 (KL and OARSI)	https://nda.nih.gov/oai/ (accessed on 5 January 2023)
MOST	Multicenter Osteoarthritis Study (MOST): PA and lateral knee radiographs of 2026 participants (men and women, age = 50–79 years)	2920 (KL)	https://most.ucsf.edu (accessed on 5 January 2023)
Gornale et al. [6]	Fixed-flexion digital knee X-ray images collected from various hospitals in Karnataka, India (known as Medical-expert)	1650 (KL)	contact authors
Kwon et al. [7]	728 limbs with gait analysis data from 364 participants (men and women, age ≥ 20 years), collected at Seoul National University Hospital (Korea) from 2013 to 2017	215 (KL)	contact authors
Olsson et al. [8]	6103 radiographic exams collected from Danderyd University Hospital, Stockholm, Sweden	6403 (KL)	contact authors
Schwartz et al. [9]	Fixed-flexion PA knee X-ray images collected from an outpatient clinic at a large academic joint arthroplasty practice in Arizona, USA (from 2016 to 2019)	4755 (IKDC)	contact authors
Tiulpin et al. [10]	81 subjects from Oulu University Hospital, Finland	370 (KL)	ClinicalTrials.gov, ID: NCT02937064
Yang et al. [11]	2378 participants (men and women, age ≥ 40 years) at the Chinese People’s Liberation Army General Hospital, Beijing, China, collected from January 2020 to January 2021	2579 (KL)	contact authors

Table 5. Overview of the main ROI detection methods used in the selected articles.

References	Method	Dataset (X-ray Images)	Metric
Dalia et al. [18]	YOLOv5	400 X-ray images from OAI	Recall = 93%
Gornale et al. [6]	Active contour algorithm + Otsu + Sobel and Prewitt	Local dataset: 1650 X-ray images	Not available
Nagaraj and Jeyakumar [19]	Fuzzy C-means, k-means clustering, Center rectangle, Region-Based Active Contour, and Seed Point Selection	Local dataset of 25 X-ray images	Best models: Center rectangle, Seed point selection: SP = 100%, SE = 0%
Nguyen Huu et al. [20]	YOLOv3 + Gaussian and Canny filters	OAI	Train accuracy = 97%, mAP = 97%, IoU = 85%
Bayramoglu et al. [15]	BoneFinder + Adaptive ROI Segmentation using superpixel labeling and k-means clustering	OAI + MOST	Not available
Tiulpin et al. [10]	KNEEL: Knee Anatomical Landmark Localization with hourglass networks	370 X-ray images from Oulu University Hospital, Finland	Precision (resolution = 2.5) = 93.48% ± 0.44
Wang et al. [21]	JC-Regnet based on VGG + AMSGrad	OAI	Accuracy = 96%, recall = 92%
Wani and Saini [22]	YOLOv2 (Image size: 256 × 320 pixels)	OAI	Mean Jaccard index = 0.858, recall = 92.2%
Zhang et al. [23]	ResNet18 using global average pooling layer	OAI	IoU = 0.86
Yoon et al. [24]	HRNet: Detection of knee position, RetinaNet	OAI	Not available
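
Several entries in Table 5 report intersection over union (IoU, also called the Jaccard index) and recall for the detected knee ROI. As an illustration of how such figures are typically obtained, the following is a minimal sketch under stated assumptions: boxes are axis-aligned `(x_min, y_min, x_max, y_max)` tuples, each image has one ground-truth ROI, and a detection counts as a hit when its IoU reaches a threshold. The function names and the 0.5 threshold are hypothetical and are not taken from any of the reviewed articles.

```python
from typing import Optional, Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union (Jaccard index) of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(
    preds: Sequence[Optional[Box]],
    truths: Sequence[Box],
    threshold: float = 0.5,  # assumption: a commonly used cut-off
) -> float:
    """Fraction of ground-truth ROIs matched by a prediction with IoU >= threshold.

    preds[i] is the detector output for image i, or None if nothing was detected.
    """
    if not truths:
        return 0.0
    hits = sum(
        1 for p, t in zip(preds, truths)
        if p is not None and iou(p, t) >= threshold
    )
    return hits / len(truths)
```

Because the reviewed studies use different thresholds, datasets, and box conventions, the values in the Metric column are not directly comparable with one another.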

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Touahema, S.; Zaimi, I.; Zrira, N.; Ngote, M.N. How Can Artificial Intelligence Identify Knee Osteoarthritis from Radiographic Images with Satisfactory Accuracy?: A Literature Review for 2018–2024. Appl. Sci. 2024, 14, 6333. https://doi.org/10.3390/app14146333

