Review

Applications of Artificial Intelligence and Machine Learning in Spine MRI

1 Department of Diagnostic Imaging, National University Hospital, 5 Lower Kent Ridge Rd, Singapore 119074, Singapore
2 Department of Diagnostic Radiology, Yong Loo Lin School of Medicine, National University of Singapore, 10 Medical Drive, Singapore 117597, Singapore
3 National University Spine Institute, Department of Orthopaedic Surgery, National University Health System, 1E Lower Kent Ridge Road, Singapore 119228, Singapore
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Bioengineering 2024, 11(9), 894; https://doi.org/10.3390/bioengineering11090894
Submission received: 27 July 2024 / Revised: 1 September 2024 / Accepted: 1 September 2024 / Published: 5 September 2024
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Spine Research)

Abstract

Diagnostic imaging, particularly MRI, plays a key role in the evaluation of many spine pathologies. Recent progress in artificial intelligence and its subset, machine learning, has led to many applications within spine MRI, which we sought to examine in this review. A literature search of the major databases (PubMed, MEDLINE, Web of Science, ClinicalTrials.gov) was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The search yielded 1226 results, of which 50 studies were selected for inclusion. Key data from these studies were extracted. Studies were categorized thematically into the following: Image Acquisition and Processing, Segmentation, Diagnosis and Treatment Planning, and Patient Selection and Prognostication. Gaps in the literature and the proposed areas of future research are discussed. Current research demonstrates the ability of artificial intelligence to improve various aspects of this field, from image acquisition to analysis and clinical care. We also acknowledge the limitations of current technology. Future work will require collaborative efforts in order to fully exploit new technologies while addressing the practical challenges of generalizability and implementation. In particular, the use of foundation models and large-language models in spine MRI is a promising area, warranting further research. Studies assessing model performance in real-world clinical settings will also help uncover unintended consequences and maximize the benefits for patient care.

Graphical Abstract

1. Introduction

The spine is an important site of pathology and can be affected by a variety of conditions, including degenerative, neoplastic, infectious, traumatic and inflammatory demyelinating diseases. Diagnostic imaging often plays a key role in the diagnosis of spinal diseases. In addition, imaging is also vital for planning treatments such as surgery and minimally invasive procedures as it allows for the localization and quantification of underlying pathologies [1].
Various imaging modalities can be used in spine imaging. Radiographs often provide initial assessment of symptoms that may be attributed to a spinal pathology, such as neck or back pain, radiculopathy, or myelopathy. They are a cost-efficient and widely available diagnostic tool that can provide rapid assessment of spinal alignment, fractures, and degenerative changes. Erosive changes can also be detected and may suggest the presence of neoplasms or underlying infection, albeit with a relatively low sensitivity. Radiographs also offer a relatively low-cost method for the dynamic assessment of spinal instability [1,2]. Computed tomography (CT) provides a superior delineation of complex spinal anatomy, which can be challenging to accurately assess using radiographs. In the setting of trauma, CT is the modality of choice for evaluating fractures and dislocations in the cervical spine as it allows for the rapid imaging of patients who may have significant traumatic injuries and be in an unstable clinical condition. It also allows for a good visualization of the cortical bone [3]. CT scans can also be used for pre-operative planning as certain pathologies, such as ossification of the posterior longitudinal ligament (OPLL), are readily visualized [4].
While radiographs and CT remain important imaging tools, magnetic resonance imaging (MRI) has surpassed both in the range of pathologies it can depict. MRI can evaluate bone marrow signal, allowing for the accurate detection of pathologies that alter normal marrow, such as fractures or contusions, neoplastic disease, or infection. In addition, MRI provides superior evaluation of soft tissue structures, such as the intervertebral discs, spinal ligaments, spinal cord, and surrounding dural and epidural spaces [5,6,7]. MRI has thus become widely recognized as the preferred modality for evaluating many spinal pathologies. CT myelography is an alternative modality for assessing the spinal cord and neural foramina. However, it requires the injection of contrast material into the spinal canal via lumbar puncture, making it more invasive and less widely used [8]. It is typically reserved for cases with MRI contraindications, such as patients with incompatible pacemakers.
Despite its many advantages, an important limitation of MRI is its relatively long acquisition times. To accommodate a growing number of scans, more time-efficient MRI pulse sequences have been developed. However, there is often a trade-off between diagnostic quality and time savings, resulting in faster sequences with lower resolution or tissue contrast [9]. More recent developments such as parallel imaging and compressed sensing have partially mitigated this [10,11], but scan time remains a pertinent issue. In addition, interpreting these MRIs can be a tedious and time-consuming process for the reporting radiologist. Each spinal level must be carefully examined for evidence of pathology. Additionally, there is significant interobserver variability in evaluating the severity of observed pathology [12]. The lack of standardized grading systems, particularly in the cervical and thoracic spine, further complicates this process. Finally, certain spinal pathologies present diagnostic challenges due to overlapping imaging characteristics. Differentiating between various types of spinal neoplasms or infections can be particularly difficult, especially for inexperienced radiologists, potentially impacting subsequent treatment decisions.
Artificial intelligence (AI) has been increasingly explored as a solution to many of these challenges, with widespread applications across medicine. Machine learning (ML) is a subset of AI that utilizes a combination of algorithms and statistical models to make predictions on new data [13,14,15]. Deep learning (DL) is a further subset of ML that has garnered significant interest in recent years. Compared to other types of ML, DL algorithms are generally more complex, requiring larger amounts of data and computational power. Most DL algorithms in radiology are ‘supervised’, i.e., trained on labeled datasets. Using labels provided by human readers, the DL model learns to identify patterns in the data, and its performance is then evaluated on a separate test/validation dataset [14,15]. The number of applications of AI and ML in radiology has increased significantly over time, now spanning areas such as image interpretation, scan protocolling, and workflow optimization. Additionally, many of the challenges in spine imaging are not unique. Various AI techniques have been successfully applied across a broad spectrum of tasks in radiology and medicine, including endoscopic image analysis and image feature fusion and enhancement [16,17,18]. Many of these advancements have helped inform innovations in spine imaging AI.
While several studies have examined the use of AI in specific aspects of spine imaging, our goal is to provide a comprehensive overview of the full spectrum of use cases in spine MRI by examining the available literature on a wide variety of applications. Additionally, we aim to identify gaps in the existing literature and propose areas for future research in the field.

2. Materials and Methods

Given the anticipated large number of studies that would be retrieved, a scoping review was performed to adequately represent the breadth and depth of the current literature.

2.1. Literature Search Strategy

We performed a literature search of the major databases (PubMed, MEDLINE, Web of Science, ClinicalTrials.gov) on 8 February 2024, according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The following medical subject headings (MeSH) and keywords were utilized: (“artificial intelligence” OR “AI” OR “machine learning” OR “ML” OR “deep learning” OR “DL”) AND (“spine” OR “spinal cord” OR “cord” OR “vertebra” OR “vertebral column” OR “spinal column” OR “intervertebral disc” OR “intervertebral disk”) AND (“MRI” OR “MR” OR “magnetic resonance imaging”). Limits were applied to include only English-language studies from the past eight years.

2.2. Study Screening and Selection Criteria

A two-stage screening process was used. Studies were first screened independently by two authors (A.L. and W.O.) by title and abstract. A full-text review was then performed for all potentially eligible studies. Any disagreements at either stage were resolved by a third author (J.T.P.D.H.).
The inclusion criteria were as follows: studies on the use of AI or ML on MRI images focusing on spine-related applications, English studies, and studies performed on human subjects. The exclusion criteria were as follows: non-original research (for example, review articles, editorial correspondence), unpublished work, conference abstracts, and case reports. Studies that primarily focused on other imaging modalities (for example, radiographs, CT, or nuclear medicine imaging) or other body regions were excluded.

2.3. Data Extraction and Reporting

The selected studies were extracted and compiled onto a spreadsheet using Microsoft Excel Version 16.81 (Microsoft Corporation, Redmond, WA, USA). The following data were extracted:
  • Study details: authorship, year of publication, and journal name;
  • Application and primary outcome measure;
  • Study characteristics: sample size, spine region studied, and MRI sequences used;
  • Artificial intelligence technique used;
  • Key results and conclusion.
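For illustration only, one row of such an extraction spreadsheet could be represented as a simple record structure. This is a hypothetical sketch; the field names and example values are ours, not taken from the actual spreadsheet:

```python
from dataclasses import dataclass, field

@dataclass
class ExtractedStudy:
    """One row of a review extraction spreadsheet (illustrative field names only)."""
    authors: str
    year: int
    journal: str
    application: str            # application and primary outcome measure
    sample_size: int
    spine_region: str           # e.g. "cervical", "lumbar"
    mri_sequences: list = field(default_factory=list)
    ai_technique: str = ""      # e.g. "CNN (ResNet50)"
    key_results: str = ""

# hypothetical example row
row = ExtractedStudy(
    authors="Example et al.", year=2023, journal="Example Journal",
    application="Lumbar stenosis grading", sample_size=100,
    spine_region="lumbar", mri_sequences=["sagittal T1", "axial T2"],
    ai_technique="CNN", key_results="High agreement with reference standard",
)
```

A structured record like this makes downstream tabulation (e.g., counts per theme or per spine region) straightforward.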

3. Results

3.1. Search Results

Our initial literature search identified 1226 studies, which were screened according to the specified criteria. Of these, 149 studies outside the date range, 44 of an incorrect article type, and 2 non-English-language studies were initially excluded. This left 1031 studies for full-text screening, from which 50 studies were included in the present review (see Figure 1 for a detailed flowchart). The studies are summarized in Table 1. Given the heterogeneity of the included studies, a formal meta-analysis could not be meaningfully performed.
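As a sanity check, the screening counts above are internally consistent:

```python
# PRISMA-style screening counts reported in this review
identified = 1226
excluded_date = 149        # outside the date range
excluded_type = 44         # incorrect article type
excluded_language = 2      # non-English language

full_text_screened = identified - excluded_date - excluded_type - excluded_language
included = 50

print(full_text_screened)  # 1031 studies proceeded to full-text screening
```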
We classified the included studies based on the following themes: Image Acquisition and Processing (6/50, 12%), Segmentation (8/50, 16%), Diagnosis (27/50, 54%), Treatment Planning, Patient Selection and Prognosis (6/50, 12%), and Others (3/50, 6%) (Figure 2). We further sub-divided the Diagnosis theme into Degenerative (13/50, 26%), Neoplastic Diseases (7/50, 14%), Infection (3/50, 6%), Trauma (2/50, 4%), and Spondyloarthropathy (2/50, 4%).

3.2. Image Acquisition and Processing

AI has demonstrated promise in the area of image acquisition and processing, with multiple studies demonstrating the ability of deep learning (DL)-assisted acquisition and reconstruction techniques to reduce image acquisition times while maintaining image quality similar to conventional protocols. Kashiwagi et al. (2022) studied an ultrafast cervical spine MRI protocol (sagittal T1-, T2-weighted, short-tau inversion recovery (STIR), and axial T2*-weighted sequences) using a convolutional neural network (CNN)-based reconstruction that reduces the matrix size, oversampling rate, and number of excitations, and applies a noise-reduction algorithm. Scan quality was rated by three neuroradiologists, who graded various degenerative changes including central canal stenosis, foraminal stenosis, and disc degeneration. Compared with a conventional MRI protocol, the DL-based reconstruction technique reduced scan time from 12 min 54 s to 2 min 57 s (a reduction of 9 min 57 s, or 77%), with high levels of agreement (κ = 0.60–0.98) between the protocols [50]. In another study, Awan et al. (2024) evaluated a lumbar spine MRI DL-accelerated protocol (sagittal T1-, T2-weighted, STIR, and axial T2-weighted sequences) that was 57% faster than a conventional protocol (287 s versus 654 s). This protocol employs an unrolled variational network and neural networks to reduce the number of signal averages needed while preserving high image fidelity. The DL-accelerated protocol demonstrated non-inferiority for the assessment of foraminal and spinal canal stenosis, nerve compression, and facet arthropathy. However, there was increased artifact perception in the DL group. The authors proposed that further work could focus on other pathologies, such as spinal cord evaluation, to ensure the broad applicability of such protocols across various clinical scenarios [22].
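The time savings reported by Kashiwagi et al. can be verified with simple arithmetic:

```python
# scan times reported by Kashiwagi et al. (2022)
conventional_s = 12 * 60 + 54   # 12 min 54 s
accelerated_s = 2 * 60 + 57     # 2 min 57 s

saved_s = conventional_s - accelerated_s
percent_faster = round(100 * saved_s / conventional_s)

print(divmod(saved_s, 60))      # (9, 57) -> 9 min 57 s saved
print(percent_faster)           # 77
```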
Such protocols promise to generate significant time- and cost-savings for radiology departments. In addition, reducing examination time would potentially improve patient comfort, especially for those who may not be able to fully cooperate with long examination times due to pre-existing medical conditions or claustrophobia.
AI can also be applied to generate synthetic MRI sequences. In a multi-center trial, Tanenbaum et al. (2023) used existing sagittal T1- and T2-weighted MRI images to generate STIR images, the preferred sequence for assessing certain pathologies such as vertebral fractures and infection. The authors demonstrated that acquired and synthetic STIR sequences were diagnostically equivalent; five radiologists (four subspecialists and one general radiologist) had similar interobserver agreement for both the conventional and AI-generated sequences in the detection of three findings (prevertebral fluid collections, fracture-related bone edema, and posterior soft tissue/ligamentous injury) against the reference standard. Additionally, synthetic images had a higher mean image quality score. Nonetheless, the authors acknowledged that artifacts in the input images could potentially affect synthetic image quality [42]. Thus, while synthetic MRI sequences could also help reduce scan times, further work is necessary to better understand the effects of MRI artifacts, such as metal or susceptibility artifacts, on such algorithms.
Furthermore, certain DL models such as generative adversarial networks (GANs) have been successfully deployed to generate MRI-like images from CT data, and vice versa [41,59]. Gotoh et al. (2022) utilized a conditional GAN (pix2pix) to generate synthetic T2-weighted MRI images from lumbar spine CTs. They achieved a modest peak signal-to-noise ratio of 18.4 dB, although, on qualitative evaluation by two radiologists, there was no significant difference in the image quality with conventional MRI images [59]. These models can be potentially useful for patients who have contraindications to MRI, such as non-MRI compatible implants.
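The peak signal-to-noise ratio used to evaluate such synthetic images compares a generated image against a reference via the mean squared error. A minimal sketch, assuming 8-bit image intensities:

```python
import numpy as np

def psnr(img: np.ndarray, ref: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between an image and its reference."""
    mse = np.mean((img.astype(float) - ref.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# toy example: a uniform error of 25.5 gray levels gives exactly 20 dB
ref = np.zeros((8, 8))
img = np.full((8, 8), 25.5)
print(psnr(img, ref))  # 20.0
```

Higher values indicate closer agreement with the reference; the 18.4 dB reported by Gotoh et al. is modest by this measure, consistent with the authors' own characterization.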

3.3. Segmentation

Segmentation comprised the second highest proportion of studies reviewed, and many early studies concentrated on this application. Various regions of the spine, including the vertebrae, intervertebral discs, and spinal cord, have been studied. Recently, more sophisticated models have been employed to achieve higher accuracy across a broader range of tasks. Mohanty et al. (2023) demonstrated a novel segmentation technique that initially divides the spinal cord into different regions. A combination of multiple mask regional CNNs (MRCNNs) is then applied to each spinal cord segment, providing a higher overall accuracy of 99% compared to accuracies of 81–96% for the other models tested (a CNN, a deep neural network, and statistical parametric mapping) [45]. Newer models are also able to segment a larger number of structures, improving granularity. Yilizati-Yilihamu et al. (2023) employed a SAFNet-based model to segment 17 unique spinal structures, overcoming issues posed by intra- and inter-class differences across a range of spinal levels. This method extracted low-, mid-, and high-level features from MRI images, which were processed separately before being concatenated. The model achieved an overall mean Dice score of 80% against a radiologist, surpassing other models whose Dice scores ranged from 75–79%, with 3D UNet performing the worst. However, SAFNet exhibited relatively poor segmentation of certain structures: while it had high Dice scores for most vertebrae, it struggled to accurately depict the borders of L5 and the sacral intervertebral discs [36]. The sacrum’s unique shape and poorly formed intervertebral discs may have contributed to these difficulties. In contrast, models like 3D DeepLabv3 and ResUNet, which employ superior boundary detection techniques, were more successful in these challenging areas. Despite these limitations, SAFNet demonstrated better generalization and overall performance, making it a robust choice for segmentation tasks.
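The Dice score used throughout these segmentation studies measures the overlap between a predicted mask and a reference mask: twice the intersection divided by the sum of the two mask sizes. A minimal implementation:

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / total

# toy 1-D "masks": 3 of the 4 predicted voxels overlap the 4 true voxels
pred = np.array([1, 1, 1, 1, 0, 0])
truth = np.array([0, 1, 1, 1, 1, 0])
print(dice_score(pred, truth))  # 2*3 / (4+4) = 0.75
```

The same formula applies unchanged to 2D slices or 3D volumes, since only voxel counts enter the calculation.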
Interpreting MRIs with spinal abnormalities presents a significant challenge for accurate segmentation due to distorted anatomy and altered relationships between normal structures. To address this, several models have been developed for segmentation in specific clinical scenarios, such as spinal cord trauma. Masse-Gignac et al. (2023) employed an attention-gated U-net to segment injured spinal cords. The attention-gating mechanism helped the architecture focus on the most relevant features while reducing the number of feature maps, achieving a high Dice score of 0.95 even in the presence of distorted segmentation boundaries [35].
Additionally, segmentation has been expanded beyond normal anatomical structures to include pathology itself. For instance, Lemay et al. (2021) trained a cascaded neural network for segmentation of intramedullary tumors across different spinal regions. The study included 343 patients with various tumors (namely, astrocytomas, ependymomas and hemangioblastomas) and utilized T2-weighted and T1-weighted post-contrast images. A Dice score of 62% was achieved for segmentation of the tumor itself compared to radiologists’ segmentation, with a higher Dice score of 77% when the tumor cavity and edema were also included. Compared to a single model architecture (a 3D U-net), the cascaded architecture demonstrated increased Dice scores of 30% for edema, and 5% for tumor and tumor cavity [61]. The segmentation of pathology is potentially useful in clinical practice to allow more accurate quantification and post-treatment follow-up.

3.4. Diagnosis

There has been considerable interest in using AI for diagnostic applications in spine MRI, and this represented the largest proportion of studies in our review. To provide more focus, we have further categorized the studies based on the type of disease examined.

3.4.1. Degenerative Disease

Degenerative disease along the spine is found in a sizeable proportion of all MRIs performed. Multiple studies have investigated the use of AI models in the detection and classification of degenerative pathologies. These mainly focused on the cervical [20,38,63] and lumbar [33,47,64,68] spine, given the relatively higher incidence in these regions compared to the semi-rigid thoracic spine [1]. Models have also been utilized to allow for the detection of specific pathologies on MRI, such as OPLL [25], which is typically assessed on CT.
A number of studies have focused on identifying the sites of spinal cord or nerve compression. Merali et al. (2021) trained a CNN (ResNet50) to classify cervical spine MRIs for the presence or absence of cord compression on axial T2-weighted images, achieving a high AUC of 0.94. While this model could be used to quickly classify patients with high accuracy, the authors noted that a more precise model that stratifies the severity of spinal cord compression (for example, partial versus circumferential compression, with the latter being more severe) would offer greater clinical utility [63].
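The AUC values reported in these diagnostic studies have a simple probabilistic reading: the chance that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. A brute-force sketch via the Mann-Whitney U statistic, with made-up scores:

```python
def auc(pos_scores, neg_scores):
    """AUC computed as the fraction of (positive, negative) pairs ranked
    correctly; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# hypothetical model scores for compressed vs. normal levels
pos = [0.9, 0.8, 0.4]
neg = [0.5, 0.3, 0.2]
print(auc(pos, neg))  # 8 of 9 pairs ranked correctly, about 0.889
```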
In a study by Hallinan et al. (2021), the authors examined the ability of a CNN-based model to perform automated grading of lumbar spinal stenosis at different regions of interest. The model was trained on axial T2-weighted and sagittal T1-weighted sequences and achieved high levels of agreement compared with the reference standard (an expert radiologist with 31 years of experience). Its performance was comparable to that of subspecialist radiologists for dichotomous grading at the central canal (κ = 0.96 versus 0.98 for radiologists) and lateral recesses (κ = 0.92 versus 0.92–0.95 for radiologists) but slightly lower at the neural foramina (κ = 0.89 versus 0.94–0.95 for radiologists) [64]. In a follow-up study, Lim et al. (2022) assessed whether this model could enhance radiologist performance. The authors evaluated the performance of eight radiologists (comprising subspecialists, general radiologists, and in-training radiologists) with and without DL model assistance. They found that DL model assistance generated significant time savings (reduced interpretation time by 76–203 s, p < 0.001), with the greatest benefit for in-training radiologists. DL-assisted readers had improved or similar performance compared to the baseline [53]. Such studies that assess the real-world impact of DL models are useful in identifying the areas of greatest benefit and potential problems. In this context, using AI alongside radiologists during image interpretation has the potential to enhance both the efficiency and consistency of reporting by reducing variability in their assessments.
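The agreement statistics (κ) cited above are Cohen's kappa, which corrects raw agreement between two raters for the agreement expected by chance. A minimal sketch:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa between two raters' categorical labels:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# toy dichotomous gradings (0 = no stenosis, 1 = stenosis)
model = [0, 0, 1, 1]
reader = [0, 1, 1, 1]
print(cohens_kappa(model, reader))  # (0.75 - 0.5) / (1 - 0.5) = 0.5
```

By convention, values above roughly 0.8 indicate almost-perfect agreement, which is why the κ values of 0.89 to 0.96 above are read as comparable to subspecialist performance.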
Other studies have focused on specific degenerative pathologies, such as intervertebral disc degeneration. Liawrungrueang et al. (2023) trained a CNN (YOLOv5) to classify lumbar discs on sagittal T2-weighted images using the Pfirrmann classification system [69], a widely used system for communicating the severity of disc degeneration and destruction. Compared to a musculoskeletal radiologist, the model achieved accuracies of more than 95% [47]. Recent work by Xie et al. (2024) employed a combined model using MedSAM followed by radiomics analysis to perform Pfirrmann grading for degenerate cervical discs. The model was trained on sagittal T1- and T2-weighted images and achieved an AUC of 0.95 on a test set, compared against an orthopedic radiologist. It demonstrated the highest accuracy of 90% when using T1- and T2-weighted images in combination (versus 81–86% when trained on either sequence alone) [20]. The ability to classify degenerative pathologies using standardized grading systems can allow for the rapid identification of cases with more severe disease. The use of established criteria would also help facilitate communication among different specialists.

3.4.2. Neoplastic Diseases

Neoplastic disease can affect different structures along the spine including the vertebral column and epidural space with risk of spinal cord compression, potentially leading to significant disability. Bone neoplasms include metastases, myeloma, and primary neoplastic lesions [70]. Additionally, neoplasms can involve the thecal sac/dura and spinal cord. Several models have been utilized to address diagnostic challenges in spine oncology [32,49,51,65], facilitating the distinction between different pathologies with overlapping imaging characteristics. For example, Zhuo et al. (2022) employed the MultiResUNet and DenseNet121-based models to differentiate demyelinating disease from neoplasms (namely ependymoma and astrocytoma) on sagittal T2-weighted MRI alone, without contrast-enhanced sequences. Scans were evaluated by seven neuroradiologists, with the model achieving high accuracies of 79–96% (AUC 0.85–0.99), which was similar or superior to the performance of the neuroradiologists (accuracies of 67–97%). Accuracy for differentiating between the types of demyelinating diseases (multiple sclerosis versus neuromyelitis optica spectrum disorders) was the lowest. This pipeline could potentially be useful for cases where intravenous gadolinium contrast is contraindicated, for example, in patients with renal impairment. However, the authors also performed lesion segmentation and noted relatively poor Dice scores of 0.50–0.58 for the segmentation of demyelinating lesions (versus Dice scores of 0.77–0.80 for neoplasms), suggesting that further work in this area is necessary before AI-based quantification can be applied to clinical practice [49].
Other models have been applied to evaluate complications resulting from neoplastic diseases. For instance, Liu et al. (2023) used a Two-Stream Compare and Contrast Network (TSCCN) model to differentiate between benign and malignant vertebral compression fractures on sagittal T1-weighted and T2-weighted fat-saturated images, a common diagnostic dilemma. In clinical practice, malignant fractures (usually due to metastases) may require surgical management, and the primary malignancy must also be sought. In their study, all cases of malignant fractures were confirmed histologically (total of 14 cancer types). The model achieved higher accuracies of 90–96% (highest accuracy using both MRI sequences in combination) relative to clinical radiologists (accuracies of 81–90%). The TSCCN model does not require the manual segmentation of fractures and allows for the rapid identification of malignant fractures. The authors, however, acknowledge that the generalizability of the model may be limited for other cancer types not included in the study [46].
Additionally, in routine practice, radiologists also evaluate the extent of spinal cord compression resulting from neoplastic disease as it helps guide management and identify patients at risk of neurologic compromise. To this end, Hallinan et al. (2022) used a CNN to grade the severity of metastatic epidural spinal cord compression (MESCC) using the Bilsky classification on axial T2-weighted images. Compared against an experienced musculoskeletal radiologist as the reference standard, the model achieved almost-perfect agreement for dichotomous grading on internal validation (κ = 0.92) and external testing (κ = 0.94). It was also compared to three clinicians, who had similar performance (κ range = 0.94–0.98). This could be used to identify patients with severe cord compression for prompt specialist review [54]. In a separate study, the authors also demonstrated the feasibility of automated MESCC grading (normal/low/high-grade) on a matched set of contrast-enhanced CT images that had corresponding MRIs. The CT model had high agreement (κ = 0.87–0.91) with the expert and was superior to two radiologists (κ = 0.73–0.82). This would potentially allow for even earlier diagnosis on staging CT scans, which are routinely performed for patients with cancer [71,72,73].

3.4.3. Infection

MRI is the modality of choice for evaluating spondylodiscitis, allowing for an accurate diagnosis, characterization of the extent of infection, and assessment of complications. However, infections can present a diagnostic challenge as degenerative or inflammatory lesions may exhibit similar MRI findings.
Mukaihata et al. (2023) developed an algorithm to differentiate pyogenic spondylitis from Modic endplate changes, a common diagnostic dilemma. Using a CNN backbone, the authors assessed the model’s performance on sagittal T1-, T2-weighted, and STIR images against a radiologist and specialist orthopedic and spine surgeons. The model demonstrated comparable performance to the clinicians and had a high AUC (0.94–0.95) [48]. Additionally, MRI can be useful in suggesting the likely causative organism for spine infections, helping guide treatment and follow-up. Several studies have applied AI to this effect [37,69]. Wang et al. (2023) evaluated a combined model to predict whether Brucella or tuberculous spondylitis was more likely using sagittal T1-, T2-, T2-weighted fat-saturated, and axial T2-weighted sequences. Various AI models were used to assess images against the reference standard (defined by clinical and microbiological diagnostic criteria). A random forest model achieved the highest AUC of 0.95, higher than a support vector machine (AUC 0.90–0.94) [37]. Such models are potentially useful as the choice of antimicrobial therapy and management strategy differs significantly between these conditions.

3.4.4. Trauma

MRI is often used in the assessment of traumatic injuries, providing detailed visualization of soft tissues, including the spinal cord, as well as the vertebrae, and aiding in the detection of subtle injuries and fractures crucial for accurate diagnosis and treatment planning. Wang et al. (2024) demonstrated that CNNs (YoloV7 and ResNet50) may be used to detect acute vertebral fractures. In their study, sagittal STIR images were used, with the model demonstrating a high accuracy of 98% (sensitivity of 98%, specificity of 97%) against assessments by spine surgeons. While performance on an external dataset was slightly poorer, it remained relatively high at 92% (sensitivity of 93%, specificity of 92%) [21].
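The sensitivity and specificity figures quoted in these studies follow the standard confusion-matrix definitions, illustrated here with made-up counts rather than any study's data:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true fractures detected
    specificity = tn / (tn + fp)   # fraction of normal scans correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# hypothetical counts for a fracture-detection model
sens, spec, acc = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
print(sens, spec, acc)  # 0.9 0.8 0.85
```

Note that accuracy alone can mislead when classes are imbalanced, which is why the studies above report sensitivity and specificity separately.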
In another application, Jo et al. (2023) developed a two-step algorithm (Attention U-net and Inception-ResNet-V2) for the diagnosis and localization of posterior ligamentous complex injury in patients with thoracolumbar fractures on midsagittal T2-weighted fat-saturated images, which can be particularly difficult for inexperienced readers. Assessment by two experienced musculoskeletal radiologists was used as the reference standard. The algorithm demonstrated comparable performance (AUC 0.93 on internal testing, 0.92 on external testing) to a musculoskeletal radiologist (AUC 0.93), and higher performance than a radiology trainee (AUC 0.83). In addition, they showed that the performance of the radiology trainee was significantly improved when aided by the model (improved from AUC 0.83 to AUC 0.92) [28]. Such models can be used to improve diagnostic confidence for junior readers.

3.4.5. Spondyloarthropathy

MRI plays an important role in the diagnosis and monitoring of spondyloarthropathies, such as ankylosing spondylitis, given its increased sensitivity over conventional techniques like radiographs and CT. It can identify lesions in the pre-clinical stage of the disease and guide decisions on the use of disease-modifying drugs. Tas et al. (2023) demonstrated the use of a multi-stage CNN-based model (termed “ASNet”) in the diagnosis of ankylosing spondylitis with high accuracy (96–100%) on both non-contrast (axial and coronal STIR) and contrast-enhanced T1-weighted MRI sequences. All included ankylosing spondylitis patients had a clinico-radiological diagnosis and were on follow-up with a rheumatologist. The authors achieved higher accuracies compared to previous similar studies which used ResNet and U-net models (accuracies of 88–92%). Of note, the authors demonstrated the highest accuracy with non-contrast images (99% on coronal and 100% on axial images), which may obviate the need for intravenous contrast in the future [34]. This could improve diagnosis for patients with contraindications such as contrast medium allergy or impaired renal function. In another sample of 330 patients with axial spondyloarthritis, Lin et al. (2024) employed a UNet-based model to detect inflammatory lesions on sagittal STIR images, using combined assessment by an experienced radiologist and rheumatologist as the ground truth. The DL model demonstrated similar results (sensitivity 80%, specificity 88% on a per-image basis) to a radiologist with four years’ experience (sensitivity 82%, specificity 87%) [23].

3.5. Treatment Planning, Patient Selection, and Prognostication

Another growing application of AI is its use in patients who are being evaluated for specific treatments. Models have been developed to predict outcomes for patients undergoing various spinal procedures or surgeries, such as lumbar disc surgery [31], lumbar nucleoplasty [44], and cervical spine surgery [56,60]. Of note, Goedmakers et al. (2021) employed three CNNs (VGGNet19, ResNet19, and ResNet50) to predict which patients would develop adjacent segment disease (on clinical and radiologic follow-up) after undergoing surgery for cervical radiculopathy (anterior discectomy and fusion). Conventionally, the prediction of this relatively common complication relies on subjective clinical assessment. The authors used sagittal T2-weighted images and demonstrated a higher accuracy of 95% (using ResNet50) compared to 58% by the clinicians (a neurosurgeon and a neuroradiologist). This model could provide useful prognostic information and guide decisions on patient selection, although future work could also account for other variables such as patient demographics and surgical technique [60].
Apart from surgery, other treatments can also be analyzed using predictive algorithms. Chen et al. (2023) investigated an ML-based radiomics algorithm to evaluate sagittal T1-, T2-weighted, STIR, and axial T2-weighted MRIs for radiotherapy prognostication. Follow-up data on tumor progression were classified into “progressive disease” and “non-progressive disease” groups based on established tumor response criteria. The clinical model achieved an AUC of 0.73 (based on features such as multiplicity of tumors, Bilsky score, symptoms) whereas a combined clinical–radiomics model had an improved AUC of 0.83. Although this study was relatively small, with only 52 lesions in the progressive disease group, and benefits over the conventional model were relatively modest, it shows the potential applicability of radiomics models to assist radiation oncologists in treatment selection for difficult cases [30].

3.6. Others

A variety of other applications exist. These include various non-interpretative tasks such as vetting MRI requests. Alanazi et al. (2022) compared the performance of experienced radiographers to various AI models in determining whether a lumbar spine MRI request was indicated or not. A random forest model was found to achieve the highest area under the curve of 0.99 [52]. Further models have also been applied to tasks such as processing radiology reports. For instance, Jujjavarapu et al. applied various natural language processing methods to analyze lumbar spine radiograph and MRI reports. High accuracy was achieved (AUC 0.96 with n-grams) for the identification of 26 radiologic findings. The authors also showed reliable extraction of potentially clinically important findings (AUC 0.95 with document embeddings). Such models could be employed to facilitate early clinical review for patients with time-critical pathology [58].
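The n-gram representation underlying such natural language processing methods can be sketched in a few lines. The following is an illustrative example only, assuming a hypothetical list of critical-finding phrases; it shows the feature-extraction step, whereas a real system like that of Jujjavarapu et al. would feed n-gram features into a trained classifier.

```python
def word_ngrams(text, n):
    """Return the list of word n-grams (as tuples) from free text."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical phrases that would trigger early clinical review.
CRITICAL_BIGRAMS = {("cauda", "equina"), ("cord", "compression")}

def flag_report(report_text):
    """Flag a report if any critical bigram appears in it."""
    return any(bg in CRITICAL_BIGRAMS for bg in word_ngrams(report_text, 2))

report = "Severe L4-L5 stenosis with early cauda equina compression."
print(flag_report(report))  # True
```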

3.7. AI and ML Techniques

The reviewed studies employed various AI and ML techniques, with CNNs being the most frequently used, appearing in studies from the earliest years to the most recent. CNNs are versatile and have been applied across a wide range of tasks, including segmentation, detection, and classification. In more recent years, advanced CNN architectures (such as ResNet, DenseNet, and YOLO) have been developed. These architectures incorporate new mechanisms to overcome several limitations of conventional models (including the vanishing gradient problem and overfitting), allowing for deeper networks and better performance [21,25,27,28,33,34,45].
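One such mechanism, the residual ("skip") connection introduced by ResNet, can be sketched conceptually: the block learns only a residual transform that is added to its input, so when the transform is near zero the block defaults to the identity, easing the optimization of very deep stacks. The scalar example below is a deliberately minimal illustration, not an implementation of any reviewed model.

```python
def residual_block(x, f):
    """A residual block outputs f(x) + x. The additive identity path
    gives gradients a direct route through the block, mitigating the
    vanishing gradient problem in very deep networks."""
    return f(x) + x

# Hypothetical learned transform standing in for a small convolutional layer.
f = lambda x: 0.1 * x

print(residual_block(2.0, f))  # 0.1 * 2.0 + 2.0 = 2.2
```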
U-nets are another common technique, being particularly favored in image segmentation tasks for their precision and efficiency. In later studies, U-net variants (such as 3D U-net, Attention U-net, and MultiRes U-net) were employed to further enhance their capabilities [19,28,49,57]. For instance, 3D U-nets enable the segmentation of volumetric data like MRI scans, preserving spatial information across slices and leading to greater accuracy.
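Segmentation models such as these are typically evaluated with overlap metrics, most commonly the Dice similarity coefficient. A minimal sketch on hypothetical flattened binary masks:

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks
    (flattened lists of 0/1): 2 * |A intersect B| / (|A| + |B|)."""
    inter = sum(a * b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2.0 * inter / total if total else 1.0

# Hypothetical flattened masks: model prediction vs. expert annotation.
pred  = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 0]
print(dice(pred, truth))  # 2 * 2 / (3 + 2) = 0.8
```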
Some studies also implemented ensemble models, in which the outputs of multiple individual models are aggregated to produce results superior to those of each model alone. Common ensemble models in this review include Random Forest and boosting models (XGBoost, AdaBoost), which were used to improve performance on advanced classification tasks such as predicting outcomes after spinal surgery or stereotactic radiotherapy [30,31,39,44].
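The aggregation step of an ensemble can be illustrated by a simple majority vote over base-model predictions. This is a conceptual sketch only: methods like Random Forest and boosting additionally train their members on resampled or reweighted data and may weight the votes.

```python
from collections import Counter

def majority_vote(predictions):
    """Aggregate one prediction per base model into a single label
    by taking the most common vote."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical per-model predictions for one patient's surgical outcome.
votes = ["good outcome", "good outcome", "poor outcome"]
print(majority_vote(votes))  # "good outcome"
```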
Overall, there was a noticeable transition from simple, single-model approaches to more sophisticated models over time. Hybrid and ensemble models became increasingly common, reflecting the need for more robust models capable of effectively handling complex tasks.

4. Discussion

AI and ML in spine MRI have the potential to address several shortcomings of conventional technology and assessment. However, there remain important gaps and limitations that need to be addressed and studied.
As previously alluded to, one of the major challenges that modern radiology departments face is the significant time required to perform MRIs, despite ongoing advances in MRI technology, acceleration techniques, and pulse sequences. The ability of DL reconstructions to significantly reduce imaging time has the potential to improve scanner utilization and patient comfort [74]. These are pertinent issues, given the increasing demand for advanced imaging such as MRI and the longer examination times compared to other imaging modalities. Many DL reconstruction algorithms promise minimal to no degradation of image quality, ensuring high diagnostic accuracy, and some are already commercially available. In particular, synthetic MRI sequences generated from CT images could allow clinicians to leverage the speed of CT with the superior soft tissue resolution of MRI, facilitating more accurate diagnoses for patients who cannot undergo MRI (e.g., patients with MRI-incompatible implants or claustrophobia) [59]. Conversely, synthetic CT sequences generated from MRI can eliminate radiation exposure. However, there are limitations to the current DL technology. Reconstruction algorithms, especially those used for denoising, can exaggerate artifacts and may cause instability in output images, potentially leading to small lesions being overlooked [74,75].
Advances in spine segmentation have served as an important foundation on which more complex applications can be built. Most recently, models developed to segment pathology have given rise to new clinical applications [61]. The segmentation of other diseases such as neoplasms could also lead to more objective and accurate monitoring for treatment response. Further work in this area is necessary to determine the impact on patient outcomes.
With advances in diagnostics, interpretative tasks conventionally performed by trained radiologists can be augmented by AI. An important area of potential impact is increased efficiency and productivity leading to time savings. Given the ever-increasing radiology workloads, AI could potentially be used to reduce interpretation times and reader fatigue, allowing radiologists to focus on more complex cases and patient care [76]. Additionally, AI tools may supplement radiologists by improving their diagnostic accuracy. AI augmentation has been shown to improve radiologists’ performance, particularly those with less experience [28,53]. Another area of particular interest is the synergy between radiomics and deep learning. This field involves quantitative image analysis, offering more precise and accurate lesion characterization or classification than what is possible by human readers alone [20,77]. Additionally, AI models, such as those applied to spondyloarthropathy [34], have the potential to reduce the need for intravenous gadolinium contrast, which is commonly used to enhance diagnostic quality in MRI scans. This reduction could lead to significant cost savings and minimize the potential risks of gadolinium toxicity.
In the field of treatment planning, patient selection, and prognostication, predictive algorithms have shown promise in allowing for the better anticipation of patient outcomes and complications. Spinal surgery and interventions carry significant risks and should be offered to patients who are likely to benefit the most. While existing clinical decision support and predictive tests are available, these often lack consensus and can have conflicting evidence [78,79]. AI systems that accurately predict outcomes can improve patient care and resource optimization.
However, despite widespread optimism about the purported benefits of these AI technologies, there are important limitations and potential areas for further study, which we will address in the next sections.

4.1. Generalizability

Generalizability refers to the ability of an AI model to perform its intended function on a new set of data that was not part of the model development process [80]. While models may exhibit high levels of accuracy on test sets, developing a generalizable model presents unique challenges. Ethical, legislative, and practical concerns often lead to the development of models based on patient data from a single healthcare institution or country. Variability in institutional imaging protocols, MRI equipment, and pulse sequences introduces significant challenges to consistent AI model performance. In addition, variations in treatment approaches further complicate the development of generalizable models. These factors collectively limit the performance of AI models when applied in different settings [81,82].
Several strategies can be employed to improve generalizability. Firstly, ensuring the availability of data from varied populations is crucial. For instance, Xu et al. (2023) used a large training dataset to develop an AI model for thyroid nodule classification on ultrasound. They utilized data from 10,023 patients across 208 hospitals and 12 equipment vendors, achieving a high AUC of 0.90. The use of scans from a heterogeneous patient population was cited as an important factor in the model’s strong performance [83]. Large medical image datasets, including RadImageNet, MedPix, and CheXpert, have been made available in recent years. Additionally, the Radiological Society of North America (RSNA) and the American Society of Neuroradiology (ASNR) recently launched a publicly available dataset of cases annotated by 50 expert radiologists across eight institutions, with the goal of encouraging the development of AI tools for degenerative lumbar spine MRIs [84]. Improving the availability of diverse, high-quality data can overcome some of the challenges posed by limited diversity in training datasets, potentially resulting in more robust models.
Secondly, techniques such as transfer learning can be employed. Transfer learning, which includes domain adaptation, involves making modifications to a model in order to improve its performance on previously unseen tasks [85]. Xuan et al. (2023) employed transfer learning on CNNs (YOLOv3, YOLOv5, PP-YOLOv2) that were pre-trained on general image datasets. These models were then further trained on sagittal T2-weighted MRI images labeled by experienced spinal surgeons for the evaluation of various features (including disc bulges and spondylolisthesis). The resulting model had a higher accuracy (98%) than three spine doctors (accuracies ranging from 70 to 88%) [86]. Similar approaches could be applied to other spine AI models, ensuring their validity when applied in varied settings.
Thirdly, “stress testing” involves evaluating an AI model under varied or extreme conditions to identify potential weak points [81]. In radiology, this may include modifying the input image by rotating, cropping, or adjusting the brightness. Such tests help simulate clinical variability. Santomartino et al. (2024) recently evaluated a bone age prediction algorithm on external images before and after applying transformations to simulate real-world variations (for example, rotating or flipping the image, altering brightness and contrast). The algorithm performed well on the external dataset with a mean absolute difference of 6.9 months and 16.2% clinically significant errors (CSEs) when compared to radiologists. However, its performance significantly worsened when tested on the altered images; when the image resolution was altered, the mean absolute difference increased to 118.3 months. This process helped demonstrate the important pitfalls of the model [87]. Stress testing allows for the simulation of real-world variations in image quality that can significantly impact model performance.
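The image perturbations used in stress testing can be sketched as simple transformations of the input array. The pure-Python versions below operate on a tiny 2D "image" (nested lists of pixel intensities) purely for illustration; they are not the pipeline used by Santomartino et al.

```python
def flip_horizontal(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def adjust_brightness(img, delta):
    """Shift every pixel by delta, clamped to the 0-255 range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in img]

# A tiny hypothetical image; each perturbed copy would be re-fed to the
# model to check whether its predictions remain stable.
img = [[10, 20],
       [30, 40]]
perturbed = [flip_horizontal(img), rotate_90(img), adjust_brightness(img, 50)]
```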
As more AI models become commercially available, it is crucial that they are rigorously validated before they can be applied to new healthcare settings, ensuring their accuracy and safety for patient use.

4.2. Implementation

Implementing AI in clinical practice involves overcoming several hurdles. Most recently, a multi-society statement was released by several radiological organizations (ACR, CAR, ESR, RANZCR, RSNA) that provided guidance on the application of AI tools, from development to implementation and use [88]. The statement highlighted the need for rigorous evaluation to ensure patient safety and supported the integration of AI into existing healthcare information technology (IT) infrastructure. Other authors have emphasized the importance of using vendor-neutral platforms, which can streamline algorithms from multiple developers and facilitate end-user access to AI-generated results [89]. Addressing these concerns will ensure that AI algorithms are reliable and effective.
A key challenge in healthcare is maintaining patient confidentiality while leveraging large volumes of medical data for AI training and deployment. Privacy concerns arise due to the sensitive nature of medical data, and the risk of data breaches or misuse could compromise patient trust and legal compliance. Designing AI systems that protect patient identities through techniques like data anonymization and encryption is essential [90].
The use of AI in healthcare settings such as radiology also raises ethical concerns. Patient well-being, equity, privacy, and dignity should be prioritized. Responsible use of AI includes ensuring that patient data are properly regulated and kept secure. AI models may also exaggerate pre-existing biases in healthcare, particularly the effects of selection bias. Such bias is inadvertently introduced when algorithms are applied to patient data that differ from the data on which they were trained, where the incidence of various pathologies may differ. For example, an AI model trained primarily on data from urban hospitals may underperform when used in rural settings, where the prevalence of certain conditions and patient demographics differ significantly. Even within a single health system, AI systems may inadvertently introduce bias against minority populations due to unrepresentative training data. This can potentially lead to less accurate predictions for certain groups and increase healthcare disparity. Minimizing bias is crucial to ensuring fairness and accuracy [88,90,91,92,93,94].
Trust is another key factor in the successful implementation of AI models into clinical care. Radiologists’ trust in AI has been identified as one of the most common barriers to its adoption [95]. Many commonly used algorithms operate as “black box” solutions, making it difficult to fully understand how conclusions are derived. Building patient trust in AI is also important for its role to be widely accepted in patient care. Clear communication about how AI works and its benefits can foster this trust. Involving end-users and patients in the AI development process and addressing their concerns is also vital [96]. For example, saliency maps (heat maps) can be used to improve model explainability. These are visual representations that highlight the parts of an image that are most relevant to a DL model’s predictions. Brima and Atemkeng (2024) used various saliency methods (GradCAM, ScoreCAM, XRAI) on datasets comprising brain tumor MRI scans and COVID-19 chest radiographs. Employing both qualitative visual assessment and quantitative metrics (Accuracy and Softmax Information Curves), they showed that saliency maps can offer accurate representations of a model’s decision-making process [97].
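Gradient-based methods such as GradCAM require access to a trained network's internals; a simpler, model-agnostic relative (not one of the methods in the cited study) is occlusion sensitivity, which masks each region in turn and records how much the model's score drops. The sketch below uses a toy scoring function as a stand-in for a trained model.

```python
def occlusion_saliency(img, score_fn, baseline=0):
    """For each pixel, replace it with a baseline value and record the
    drop in the model's score; large drops mark salient regions."""
    base_score = score_fn(img)
    saliency = []
    for i, row in enumerate(img):
        sal_row = []
        for j, _ in enumerate(row):
            occluded = [r[:] for r in img]  # deep-enough copy of the image
            occluded[i][j] = baseline
            sal_row.append(base_score - score_fn(occluded))
        saliency.append(sal_row)
    return saliency

# Toy stand-in for a trained model: the score depends only on the
# bottom-right pixel, so only that pixel should appear salient.
score_fn = lambda m: m[1][1]
img = [[5, 1],
       [2, 9]]
print(occlusion_saliency(img, score_fn))  # [[0, 0], [0, 9]]
```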
Ongoing research and close collaboration between researchers, healthcare providers, and regulatory bodies is necessary to ensure the smooth implementation of AI in routine practice. Establishing robust regulatory frameworks and continuously monitoring AI systems will be essential to addressing any emerging issues and maintaining high standards of patient care.

4.3. Study Limitations

While this review provides a comprehensive analysis of the current applications of AI and ML in spine MRI, several limitations should be acknowledged. Firstly, the heterogeneity of the included studies and lack of inferential statistics limit the robustness and ability to draw generalized conclusions. Differences in study designs, patient populations, MRI protocols, and models make it difficult to directly compare outcomes between studies. Nonetheless, we sought to provide a comprehensive overview across the range of applications that have been studied in this field, with the aim of highlighting current trends and identifying key gaps in the literature which could serve as targets for further research. Secondly, a majority of the studies reviewed were retrospective in nature, which limits the ability to assess the real-world applicability of these AI tools in prospective clinical settings. Another notable limitation is the lack of standardized evaluation metrics across studies, making it challenging to objectively compare the performance of different AI models. Finally, the review did not consider the potential biases that could be introduced by AI models, such as those related to patient demographics, which could impact the fairness and equity of AI-driven healthcare solutions.

4.4. Proposed Areas of Future Research

Firstly, the exploration of foundation AI models in healthcare presents an exciting opportunity. These models are trained on varied and much more extensive datasets than the conventional models, making them adaptable for a wide range of tasks. Furthermore, they have the potential to integrate multiple data types to provide comprehensive insights. However, there are several limitations, including the increased complexity in training and validating these generalist models [98,99]. Thus, further work is necessary to evaluate potential applications in spine MRI.
One such application is the development of comprehensive models for image interpretation. Such models have already been developed and validated for applications like interpreting chest radiographs, where the detection of multiple pathologies is possible [100,101,102]. However, in this application, many existing models focus on detecting a single pathology or differentiating between a small number of pathologies. Currently, many of these models still require radiologist input for image interpretation. Comprehensive AI models that can detect and classify a wide range of pathologies—including degenerative disease, fractures and vertebral malalignment, marrow signal abnormality, spinal cord abnormalities and incidental extra-spinal findings—will significantly enhance the diagnostic process. These advanced models could also integrate information from the patient’s electronic medical records. Multidisciplinary collaboration among computer scientists, radiologists, and clinicians is necessary to identify areas where clinically relevant models can provide the greatest benefit for patients [103].
Another area where foundation models can improve patient care is through large language models (LLMs). Privacy-protecting LLMs (PP-LLMs), which can manage large volumes of healthcare data while ensuring patient privacy and security, are particularly promising. In radiology, LLMs have already been employed for a range of tasks. For example, they have been used to determine the most appropriate imaging protocol for different types of scans [104], generate radiology reports, and summarize reports [105,106]. Spine MRI reports are a suitable area for LLM assistance as they are typically structured by spinal level, which can make the reporting process tedious and time-consuming. LLMs could automate much of this work, generating consistent and accurate reports more efficiently than manual methods. In addition, LLMs have the potential to enhance the understanding of radiological findings by both referring clinicians and patients. LLMs can help translate complicated medical jargon into language that is accessible to patients, helping them to better understand their diagnosis [107,108]. As the performance of LLMs continues to improve, their range of applications in healthcare is likely to expand. Future advancements could include more varied diagnostic tasks. In particular, recent multimodal LLMs, which can interpret both text and images, hold promise in radiology, with the potential to aid in tasks such as identifying errors in radiology reports [109]. Such algorithms could be developed for spine imaging, helping to ensure patient safety.
A final key area of future research is the evaluation of real-world applications of AI, particularly its impact on diagnostic accuracy, productivity, and patient outcomes. Studies like those by Lim et al. (2022), where the radiologist’s performance in interpreting the lumbar spine MRI was assessed with and without a DL model, provide valuable insight. The largest improvements in time and accuracy were seen for in-training and general radiologists, with subspeciality radiologists achieving the least productivity gains [53]. Studies on the application of AI systems for other diseases have helped demonstrate unexpected or unintended consequences. A recent study by Yu et al. (2024) examined the impact of AI assistance on 140 radiologists who were tasked with interpreting chest radiographs. The authors found that the impact of AI assistance on radiologist performance was variable, and that AI errors significantly impacted treatment outcomes [110]. Similar studies will be useful in evaluating other interpretative tasks, such as spine imaging, to understand the influence of AI more holistically.

5. Conclusions

Our review has highlighted the significant potential that AI and ML have in revolutionizing spine MRI by addressing important challenges in image acquisition, diagnostic accuracy, and treatment planning. We have examined key advancements in AI including the development of DL reconstruction algorithms that allow for faster image acquisition, and models that allow for improved diagnostic performance. We also acknowledge the limitations of current AI technology. Future work will require collaborative efforts to fully exploit new technologies and address practical challenges related to generalizability and implementation.

Author Contributions

Conceptualization, methodology, validation, data curation, writing—review and editing, A.L., W.O., A.M., Y.H.T., W.C.T., S.W.D.L., X.Z.L. and J.T.P.D.H.; software, resources, A.L., W.C.T., S.W.D.L., J.J.H.T., N.K. and J.T.P.D.H.; writing—original draft preparation, A.L., W.O. and W.C.T.; project administration, A.L., W.O., Y.H.T., A.M. and J.J.H.T.; supervision, funding acquisition, J.T.P.D.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was directly funded by the Ministry of Health/National Medical Research Council, Singapore under the NMRC Clinician Innovator Award (CIA). The grant was awarded for the project titled “Deep learning pipeline for augmented reporting of MRI whole spine” (Grant ID: CIAINV23jan-0001, MOH-001405, J.T.P.D.H.).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kim, G.-U.; Chang, M.C.; Kim, T.U.; Lee, G.W. Diagnostic Modality in Spine Disease: A Review. Asian Spine J. 2020, 14, 910–920. [Google Scholar] [CrossRef] [PubMed]
  2. Leone, A.; Guglielmi, G.; Cassar-Pullicino, V.N.; Bonomo, L. Lumbar Intervertebral Instability: A Review. Radiology 2007, 245, 62–77. [Google Scholar] [CrossRef] [PubMed]
  3. Blackmore, C.C.; Mann, F.A.; Wilson, A.J. Helical CT in the Primary Trauma Evaluation of the Cervical Spine: An Evidence-Based Approach. Skelet. Radiol. 2000, 29, 632–639. [Google Scholar] [CrossRef]
  4. Selopranoto, U.S.; Soo, M.Y.; Fearnside, M.R.; Cummine, J.L. Ossification of the Posterior Longitudinal Ligament of the Cervical Spine. J. Clin. Neurosci. 1997, 4, 209–217. [Google Scholar] [CrossRef]
  5. Hartley, K.G.; Damon, B.M.; Patterson, G.T.; Long, J.H.; Holt, G.E. MRI Techniques: A Review and Update for the Orthopaedic Surgeon. J. Am. Acad. Orthop. Surg. 2012, 20, 775–787. [Google Scholar] [CrossRef]
  6. Alyas, F.; Saifuddin, A.; Connell, D. MR Imaging Evaluation of the Bone Marrow and Marrow Infiltrative Disorders of the Lumbar Spine. Magn. Reson. Imaging Clin. N. Am. 2007, 15, 199–219. [Google Scholar] [CrossRef] [PubMed]
  7. Henninger, B.; Kaser, V.; Ostermann, S.; Spicher, A.; Zegg, M.; Schmid, R.; Kremser, C.; Krappinger, D. Cervical Disc and Ligamentous Injury in Hyperextension Trauma: MRI and Intraoperative Correlation. J. Neuroimaging 2020, 30, 104–109. [Google Scholar] [CrossRef]
  8. Landman, J.A.; Hoffman, J.C., Jr.; Braun, I.F.; Barrow, D.L. Value of Computed Tomographic Myelography in the Recognition of Cervical Herniated Disk. AJNR Am. J. Neuroradiol. 1984, 5, 391–394. [Google Scholar]
  9. Runge, V.M.; Richter, J.K.; Heverhagen, J.T. Speed in Clinical Magnetic Resonance. Investig. Radiol. 2017, 52, 1–17. [Google Scholar] [CrossRef]
  10. Nölte, I.; Gerigk, L.; Brockmann, M.A.; Kemmling, A.; Groden, C. MRI of Degenerative Lumbar Spine Disease: Comparison of Non-Accelerated and Parallel Imaging. Neuroradiology 2008, 50, 403–409. [Google Scholar] [CrossRef]
  11. Gao, T.; Lu, Z.; Wang, F.; Zhao, H.; Wang, J.; Pan, S. Using the Compressed Sensing Technique for Lumbar Vertebrae Imaging: Comparison with Conventional Parallel Imaging. Curr. Med. Imaging Rev. 2021, 17, 1010–1017. [Google Scholar] [CrossRef] [PubMed]
  12. Hajiahmadi, S.; Shayganfar, A.; Askari, M.; Ebrahimian, S. Interobserver and Intraobserver Variability in Magnetic Resonance Imaging Evaluation of Patients with Suspected Disc Herniation. Heliyon 2020, 6, e05201. [Google Scholar] [CrossRef]
  13. SITNFlash. The History of Artificial Intelligence. Science in the News. Available online: https://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/ (accessed on 16 June 2024).
  14. European Society of Radiology (ESR). What the Radiologist Should Know about Artificial Intelligence—An ESR White Paper. Insights Imaging 2019, 10, 44. [Google Scholar] [CrossRef] [PubMed]
  15. Noguerol, M.; Paulano-Godino, T.; Martín-Valdivia, F.; Menias, M.T.; Luna, C.O. Strengths, Weaknesses, Opportunities, and Threats Analysis of Artificial Intelligence and Machine Learning Applications in Radiology. J. Am. Coll. Radiol. 2019, 16 Pt B, 1239–1247. [Google Scholar] [CrossRef]
  16. Khan, S.A.; Hussain, S.; Yang, S. Contrast Enhancement of Low-Contrast Medical Images Using Modified Contrast Limited Adaptive Histogram Equalization. J. Med. Imaging Health Inform. 2020, 10, 1795–1803. [Google Scholar] [CrossRef]
  17. Khan, S.A.; Khan, M.A.; Song, O.-Y.; Nazir, M. Medical Imaging Fusion Techniques: A Survey Benchmark Analysis, Open Challenges and Recommendations. J. Med. Imaging Health Inform. 2020, 10, 2523–2531. [Google Scholar] [CrossRef]
  18. Nouman Noor, M.; Nazir, M.; Khan, S.A.; Song, O.-Y.; Ashraf, I. Efficient Gastrointestinal Disease Classification Using Pretrained Deep Convolutional Neural Network. Electronics 2023, 12, 1557. [Google Scholar] [CrossRef]
  19. Zhu, Y.; Li, Y.; Wang, K.; Li, J.; Zhang, X.; Zhang, Y.; Li, J.; Wang, X. A Quantitative Evaluation of the Deep Learning Model of Segmentation and Measurement of Cervical Spine MRI in Healthy Adults. J. Appl. Clin. Med. Phys. 2024, 25, e14282. [Google Scholar] [CrossRef] [PubMed]
  20. Xie, J.; Yang, Y.; Jiang, Z.; Zhang, K.; Zhang, X.; Lin, Y.; Shen, Y.; Jia, X.; Liu, H.; Yang, S.; et al. MRI Radiomics-Based Decision Support Tool for a Personalized Classification of Cervical Disc Degeneration: A Two-Center Study. Front. Physiol. 2023, 14, 1281506. [Google Scholar] [CrossRef] [PubMed]
  21. Wang, Y.-N.; Liu, G.; Wang, L.; Chen, C.; Wang, Z.; Zhu, S.; Wan, W.-T.; Weng, Y.-Z.; Lu, W.W.; Li, Z.-Y.; et al. A Deep-Learning Model for Diagnosing Fresh Vertebral Fractures on Magnetic Resonance Images. World Neurosurg. 2024, 183, e818–e824. [Google Scholar] [CrossRef]
  22. Awan, K.M.; Goncalves Filho, A.L.M.; Tabari, A.; Applewhite, B.P.; Lang, M.; Lo, W.-C.; Sellers, R.; Kollasch, P.; Clifford, B.; Nickel, D.; et al. Diagnostic Evaluation of Deep Learning Accelerated Lumbar Spine MRI. Neuroradiol. J. 2024, 37, 323–331. [Google Scholar] [CrossRef]
  23. Lin, Y.; Chan, S.C.W.; Chung, H.Y.; Lee, K.H.; Cao, P. A Deep Neural Network for MRI Spinal Inflammation in Axial Spondyloarthritis. Eur. Spine J. 2024. ahead of print. [Google Scholar] [CrossRef]
  24. Kowlagi, N.; Kemppainen, A.; Panfilov, E.; McSweeney, T.; Saarakkala, S.; Nevalainen, M.; Niinimäki, J.; Karppinen, J.; Tiulpin, A. Semiautomatic Assessment of Facet Tropism from Lumbar Spine MRI Using Deep Learning: A Northern Finland Birth Cohort Study. Spine 2024, 49, 630–639. [Google Scholar] [CrossRef] [PubMed]
  25. Qu, Z.; Deng, B.; Sun, W.; Yang, R.; Feng, H. A Convolutional Neural Network for Automated Detection of Cervical Ossification of the Posterior Longitudinal Ligament Using Magnetic Resonance Imaging. Clin. Spine Surg. 2024, 37, E106–E112.
  26. Kim, D.K.; Lee, S.-Y.; Lee, J.; Huh, Y.J.; Lee, S.; Lee, S.; Jung, J.-Y.; Lee, H.-S.; Benkert, T.; Park, S.-H. Deep Learning-Based k-Space-to-Image Reconstruction and Super Resolution for Diffusion-Weighted Imaging in Whole-Spine MRI. Magn. Reson. Imaging 2024, 105, 82–91.
  27. Liu, G.; Wang, L.; You, S.-N.; Wang, Z.; Zhu, S.; Chen, C.; Ma, X.-L.; Yang, L.; Zhang, S.; Yang, Q. Automatic Detection and Classification of Modic Changes in MRI Images Using Deep Learning: Intelligent Assisted Diagnosis System. Orthop. Surg. 2024, 16, 196–206.
  28. Jo, S.W.; Khil, E.K.; Lee, K.Y.; Choi, I.; Yoon, Y.S.; Cha, J.G.; Lee, J.H.; Kim, H.; Lee, S.Y. Deep Learning System for Automated Detection of Posterior Ligamentous Complex Injury in Patients with Thoracolumbar Fracture on MRI. Sci. Rep. 2023, 13, 19017.
  29. Vitale, J.; Sconfienza, L.M.; Galbusera, F. Cross-Sectional Area and Fat Infiltration of the Lumbar Spine Muscles in Patients with Back Disorders: A Deep Learning-Based Big Data Analysis. Eur. Spine J. 2024, 33, 1–10.
  30. Chen, Y.; Qin, S.; Zhao, W.; Wang, Q.; Liu, K.; Xin, P.; Yuan, H.; Zhuang, H.; Lang, N. MRI Feature-Based Radiomics Models to Predict Treatment Outcome after Stereotactic Body Radiotherapy for Spinal Metastases. Insights Imaging 2023, 14, 169.
  31. Saravi, B.; Zink, A.; Ülkümen, S.; Couillard-Despres, S.; Wollborn, J.; Lang, G.; Hassel, F. Clinical and Radiomics Feature-Based Outcome Analysis in Lumbar Disc Herniation Surgery. BMC Musculoskelet. Disord. 2023, 24, 791.
  32. Haim, O.; Agur, A.; Gabay, S.; Azolai, L.; Shutan, I.; Chitayat, M.; Katirai, M.; Sadon, S.; Artzi, M.; Lidar, Z. Differentiating Spinal Pathologies by Deep Learning Approach. Spine J. 2024, 24, 297–303.
  33. Zhang, W.; Chen, Z.; Su, Z.; Wang, Z.; Hai, J.; Huang, C.; Wang, Y.; Yan, B.; Lu, H. Deep Learning-Based Detection and Classification of Lumbar Disc Herniation on Magnetic Resonance Images. JOR Spine 2023, 6, e1276.
  34. Tas, N.P.; Kaya, O.; Macin, G.; Tasci, B.; Dogan, S.; Tuncer, T. ASNET: A Novel AI Framework for Accurate Ankylosing Spondylitis Diagnosis from MRI. Biomedicines 2023, 11, 2441.
  35. Masse-Gignac, N.; Flórez-Jiménez, S.; Mac-Thiong, J.-M.; Duong, L. Attention-Gated U-Net Networks for Simultaneous Axial/Sagittal Planes Segmentation of Injured Spinal Cords. J. Appl. Clin. Med. Phys. 2023, 24, e14123.
  36. Yilizati-Yilihamu, E.E.; Yang, J.; Yang, Z.; Rong, F.; Feng, S. A Spine Segmentation Method Based on Scene Aware Fusion Network. BMC Neurosci. 2023, 24, 49.
  37. Wang, W.; Fan, Z.; Zhen, J. MRI Radiomics-Based Evaluation of Tuberculous and Brucella Spondylitis. J. Int. Med. Res. 2023, 51, 3000605231195156.
  38. Niemeyer, F.; Galbusera, F.; Tao, Y.; Phillips, F.M.; An, H.S.; Louie, P.K.; Samartzis, D.; Wilke, H.-J. Deep Phenotyping the Cervical Spine: Automatic Characterization of Cervical Degenerative Phenotypes Based on T2-Weighted MRI. Eur. Spine J. 2023, 32, 3846–3856.
  39. Cai, J.; Shen, C.; Yang, T.; Jiang, Y.; Ye, H.; Ruan, Y.; Zhu, X.; Liu, Z.; Liu, Q. MRI-Based Radiomics Assessment of the Imminent New Vertebral Fracture after Vertebral Augmentation. Eur. Spine J. 2023, 32, 3892–3905.
  40. Waldenberg, C.; Brisby, H.; Hebelka, H.; Lagerstrand, K.M. Associations between Vertebral Localized Contrast Changes and Adjacent Annular Fissures in Patients with Low Back Pain: A Radiomics Approach. J. Clin. Med. 2023, 12, 4891.
  41. Roberts, M.; Hinton, G.; Wells, A.J.; Van Der Veken, J.; Bajger, M.; Lee, G.; Liu, Y.; Chong, C.; Poonnoose, S.; Agzarian, M.; et al. Imaging Evaluation of a Proposed 3D Generative Model for MRI to CT Translation in the Lumbar Spine. Spine J. 2023, 23, 1602–1612.
  42. Tanenbaum, L.N.; Bash, S.C.; Zaharchuk, G.; Shankaranarayanan, A.; Chamberlain, R.; Wintermark, M.; Beaulieu, C.; Novick, M.; Wang, L. Deep Learning-Generated Synthetic MR Imaging STIR Spine Images Are Superior in Image Quality and Diagnostically Equivalent to Conventional STIR: A Multicenter, Multireader Trial. AJNR Am. J. Neuroradiol. 2023, 44, 987–993.
  43. Küçükçiloğlu, Y.; Şekeroğlu, B.; Adalı, T.; Şentürk, N. Prediction of Osteoporosis Using MRI and CT Scans with Unimodal and Multimodal Deep-Learning Models. Diagn. Interv. Radiol. 2024, 30, 9–20.
  44. Chiu, P.-F.; Chang, R.C.-H.; Lai, Y.-C.; Wu, K.-C.; Wang, K.-P.; Chiu, Y.-P.; Ji, H.-R.; Kao, C.-H.; Chiu, C.-D. Machine Learning Assisting the Prediction of Clinical Outcomes Following Nucleoplasty for Lumbar Degenerative Disc Disease. Diagnostics 2023, 13, 1863.
  45. Mohanty, R.; Allabun, S.; Solanki, S.S.; Pani, S.K.; Alqahtani, M.S.; Abbas, M.; Soufiene, B.O. NAMSTCD: A Novel Augmented Model for Spinal Cord Segmentation and Tumor Classification Using Deep Nets. Diagnostics 2023, 13, 1417.
  46. Liu, B.; Jin, Y.; Feng, S.; Yu, H.; Zhang, Y.; Li, Y. Benign vs Malignant Vertebral Compression Fractures with MRI: A Comparison between Automatic Deep Learning Network and Radiologist’s Assessment. Eur. Radiol. 2023, 33, 5060–5068.
  47. Liawrungrueang, W.; Kim, P.; Kotheeranurak, V.; Jitpakdee, K.; Sarasombath, P. Automatic Detection, Classification, and Grading of Lumbar Intervertebral Disc Degeneration Using an Artificial Neural Network Model. Diagnostics 2023, 13, 663.
  48. Mukaihata, T.; Maki, S.; Eguchi, Y.; Geundong, K.; Shoda, J.; Yokota, H.; Orita, S.; Shiga, Y.; Inage, K.; Furuya, T.; et al. Differentiating Magnetic Resonance Images of Pyogenic Spondylitis and Spinal Modic Change Using a Convolutional Neural Network. Spine 2023, 48, 288–294.
  49. Zhuo, Z.; Zhang, J.; Duan, Y.; Qu, L.; Feng, C.; Huang, X.; Cheng, D.; Xu, X.; Sun, T.; Li, Z.; et al. Automated Classification of Intramedullary Spinal Cord Tumors and Inflammatory Demyelinating Lesions Using Deep Learning. Radiol. Artif. Intell. 2022, 4, e210292.
  50. Kashiwagi, N.; Sakai, M.; Tsukabe, A.; Yamashita, Y.; Fujiwara, M.; Yamagata, K.; Nakamoto, A.; Nakanishi, K.; Tomiyama, N. Ultrafast Cervical Spine MRI Protocol Using Deep Learning-Based Reconstruction: Diagnostic Equivalence to a Conventional Protocol. Eur. J. Radiol. 2022, 156, 110531.
  51. Chen, K.; Cao, J.; Zhang, X.; Wang, X.; Zhao, X.; Li, Q.; Chen, S.; Wang, P.; Liu, T.; Du, J.; et al. Differentiation between Spinal Multiple Myeloma and Metastases Originated from Lung Using Multi-View Attention-Guided Network. Front. Oncol. 2022, 12, 981769.
  52. Alanazi, A.H.; Cradock, A.; Rainford, L. Development of Lumbar Spine MRI Referrals Vetting Models Using Machine Learning and Deep Learning Algorithms: Comparison Models vs. Healthcare Professionals. Radiography 2022, 28, 674–683.
  53. Lim, D.S.W.; Makmur, A.; Zhu, L.; Zhang, W.; Cheng, A.J.L.; Sia, D.S.Y.; Eide, S.E.; Ong, H.Y.; Jagmohan, P.; Tan, W.C.; et al. Improved Productivity Using Deep Learning-Assisted Reporting for Lumbar Spine MRI. Radiology 2022, 305, 160–166.
  54. Hallinan, J.T.P.D.; Zhu, L.; Zhang, W.; Lim, D.S.W.; Baskar, S.; Low, X.Z.; Yeong, K.Y.; Teo, E.C.; Kumarakulasinghe, N.B.; Yap, Q.V.; et al. Deep Learning Model for Classifying Metastatic Epidural Spinal Cord Compression on MRI. Front. Oncol. 2022, 12, 849447.
  55. Suri, A.; Jones, B.C.; Ng, G.; Anabaraonye, N.; Beyrer, P.; Domi, A.; Choi, G.; Tang, S.; Terry, A.; Leichner, T.; et al. Vertebral Deformity Measurements at MRI, CT, and Radiography Using Deep Learning. Radiol. Artif. Intell. 2022, 4, e210015.
  56. Zhang, M.-Z.; Ou-Yang, H.-Q.; Liu, J.-F.; Jin, D.; Wang, C.-J.; Ni, M.; Liu, X.-G.; Lang, N.; Jiang, L.; Yuan, H.-S. Predicting Postoperative Recovery in Cervical Spondylotic Myelopathy: Construction and Interpretation of T2*-Weighted Radiomic-Based Extra Trees Models. Eur. Radiol. 2022, 32, 3565–3575.
  57. Hwang, E.-J.; Kim, S.; Jung, J.-Y. Fully Automated Segmentation of Lumbar Bone Marrow in Sagittal, High-Resolution T1-Weighted Magnetic Resonance Images Using 2D U-NET. Comput. Biol. Med. 2022, 140, 105105.
  58. Jujjavarapu, C.; Pejaver, V.; Cohen, T.A.; Mooney, S.D.; Heagerty, P.J.; Jarvik, J.G. A Comparison of Natural Language Processing Methods for the Classification of Lumbar Spine Imaging Findings Related to Lower Back Pain. Acad. Radiol. 2022, 29 (Suppl. S3), S188–S200.
  59. Gotoh, M.; Nakaura, T.; Funama, Y.; Morita, K.; Sakabe, D.; Uetani, H.; Nagayama, Y.; Kidoh, M.; Hatemura, M.; Masuda, T.; et al. Virtual Magnetic Resonance Lumbar Spine Images Generated from Computed Tomography Images Using Conditional Generative Adversarial Networks. Radiography 2022, 28, 447–453.
  60. Goedmakers, C.M.W.; Lak, A.M.; Duey, A.H.; Senko, A.W.; Arnaout, O.; Groff, M.W.; Smith, T.R.; Vleggeert-Lankamp, C.L.A.; Zaidi, H.A.; Rana, A.; et al. Deep Learning for Adjacent Segment Disease at Preoperative MRI for Cervical Radiculopathy. Radiology 2021, 301, E446.
  61. Lemay, A.; Gros, C.; Zhuo, Z.; Zhang, J.; Duan, Y.; Cohen-Adad, J.; Liu, Y. Automatic Multiclass Intramedullary Spinal Cord Tumor Segmentation on MRI with Deep Learning. NeuroImage Clin. 2021, 31, 102766.
  62. Liu, J.; Wang, C.; Guo, W.; Zeng, P.; Liu, Y.; Lang, N.; Yuan, H. A Preliminary Study Using Spinal MRI-Based Radiomics to Predict High-Risk Cytogenetic Abnormalities in Multiple Myeloma. Radiol. Med. 2021, 126, 1226–1235.
  63. Merali, Z.; Wang, J.Z.; Badhiwala, J.H.; Witiw, C.D.; Wilson, J.R.; Fehlings, M.G. A Deep Learning Model for Detection of Cervical Spinal Cord Compression in MRI Scans. Sci. Rep. 2021, 11, 10473.
  64. Hallinan, J.T.P.D.; Zhu, L.; Yang, K.; Makmur, A.; Algazwi, D.A.R.; Thian, Y.L.; Lau, S.; Choo, Y.S.; Eide, S.E.; Yap, Q.V.; et al. Deep Learning Model for Automated Detection and Classification of Central Canal, Lateral Recess, and Neural Foraminal Stenosis at Lumbar Spine MRI. Radiology 2021, 300, 130–138.
  65. Maki, S.; Furuya, T.; Horikoshi, T.; Yokota, H.; Mori, Y.; Ota, J.; Kawasaki, Y.; Miyamoto, T.; Norimoto, M.; Okimatsu, S.; et al. A Deep Convolutional Neural Network with Performance Comparable to Radiologists for Differentiating between Spinal Schwannoma and Meningioma. Spine 2020, 45, 694–700.
  66. Gaonkar, B.; Beckett, J.; Villaroman, D.; Ahn, C.; Edwards, M.; Moran, S.; Attiah, M.; Babayan, D.; Ames, C.; Villablanca, J.P.; et al. Quantitative Analysis of Neural Foramina in the Lumbar Spine: An Imaging Informatics and Machine Learning Study. Radiol. Artif. Intell. 2019, 1, 180037.
  67. Kim, K.; Kim, S.; Lee, Y.H.; Lee, S.H.; Lee, H.S.; Kim, S. Performance of the Deep Convolutional Neural Network Based Magnetic Resonance Image Scoring Algorithm for Differentiating between Tuberculous and Pyogenic Spondylitis. Sci. Rep. 2018, 8, 13124.
  68. Jamaludin, A.; Kadir, T.; Zisserman, A. SpineNet: Automated Classification and Evidence Visualization in Spinal MRIs. Med. Image Anal. 2017, 41, 63–73.
  69. Pfirrmann, C.W.; Metzdorf, A.; Zanetti, M.; Hodler, J.; Boos, N. Magnetic Resonance Classification of Lumbar Intervertebral Disc Degeneration. Spine 2001, 26, 1873–1878.
  70. Kumar, N.; Tan, W.L.B.; Wei, W.; Vellayappan, B.A. An Overview of the Tumors Affecting the Spine-inside to Out. Neuro-Oncol. Pract. 2020, 7 (Suppl. S1), i10–i17.
  71. Hallinan, J.T.P.D.; Zhu, L.; Zhang, W.; Kuah, T.; Lim, D.S.W.; Low, X.Z.; Cheng, A.J.L.; Eide, S.E.; Ong, H.Y.; Muhamat Nor, F.E.; et al. Deep Learning Model for Grading Metastatic Epidural Spinal Cord Compression on Staging CT. Cancers 2022, 14, 3219.
  72. Hallinan, J.T.P.D.; Zhu, L.; Zhang, W.; Ge, S.; Muhamat Nor, F.E.; Ong, H.Y.; Eide, S.E.; Cheng, A.J.L.; Kuah, T.; Lim, D.S.W.; et al. Deep Learning Assessment Compared to Radiologist Reporting for Metastatic Spinal Cord Compression on CT. Front. Oncol. 2023, 13, 1151073.
  73. Hallinan, J.T.P.D.; Zhu, L.; Tan, H.W.N.; Hui, S.J.; Lim, X.; Ong, B.W.L.; Ong, H.Y.; Eide, S.E.; Cheng, A.J.L.; Ge, S.; et al. A Deep Learning-Based Technique for the Diagnosis of Epidural Spinal Cord Compression on Thoracolumbar CT. Eur. Spine J. 2023, 32, 3815–3824.
  74. Kiryu, S.; Akai, H.; Yasaka, K.; Tajima, T.; Kunimatsu, A.; Yoshioka, N.; Akahane, M.; Abe, O.; Ohtomo, K. Clinical Impact of Deep Learning Reconstruction in MRI. Radiographics 2023, 43, e220133.
  75. Antun, V.; Renna, F.; Poon, C.; Adcock, B.; Hansen, A.C. On Instabilities of Deep Learning in Image Reconstruction and the Potential Costs of AI. Proc. Natl. Acad. Sci. USA 2020, 117, 30088–30095.
  76. Hsu, W.; Hoyt, A.C. Using Time as a Measure of Impact for AI Systems: Implications in Breast Screening. Radiol. Artif. Intell. 2019, 1, e190107.
  77. Avanzo, M.; Wei, L.; Stancanello, J.; Vallières, M.; Rao, A.; Morin, O.; Mattonen, S.A.; El Naqa, I. Machine and Deep Learning Methods for Radiomics. Med. Phys. 2020, 47, e185–e202.
  78. Willems, P.; de Bie, R.; Öner, C.; Castelein, R.; de Kleuver, M. Clinical Decision Making in Spinal Fusion for Chronic Low Back Pain. Results of a Nationwide Survey among Spine Surgeons. BMJ Open 2011, 1, e000391.
  79. Fairbank, J.; Frost, H.; Wilson-MacDonald, J.; Yu, L.-M.; Barker, K.; Collins, R.; Spine Stabilisation Trial Group. Randomised Controlled Trial to Compare Surgical Stabilisation of the Lumbar Spine with an Intensive Rehabilitation Programme for Patients with Chronic Low Back Pain: The MRC Spine Stabilisation Trial. BMJ 2005, 330, 1233.
  80. Azad, T.D.; Zhang, Y.; Weiss, H.; Alamin, T.; Cheng, I.; Huang, B.; Veeravagu, A.; Ratliff, J.; Malhotra, N.R. Fostering Reproducibility and Generalizability in Machine Learning for Clinical Prediction Modeling in Spine Surgery. Spine J. 2021, 21, 1610–1616.
  81. Eche, T.; Schwartz, L.H.; Mokrane, F.-Z.; Dercle, L. Toward Generalizability in the Deployment of Artificial Intelligence in Radiology: Role of Computation Stress Testing to Overcome Underspecification. Radiol. Artif. Intell. 2021, 3, e210097.
  82. Huisman, M.; Hannink, G. The AI Generalization Gap: One Size Does Not Fit All. Radiol. Artif. Intell. 2023, 5, e230246.
  83. Xu, W.; Jia, X.; Mei, Z.; Gu, X.; Lu, Y.; Fu, C.-C.; Zhang, R.; Gu, Y.; Chen, X.; Luo, X.; et al. Chinese Artificial Intelligence Alliance for Thyroid and Breast Ultrasound. Generalizability and Diagnostic Performance of AI Models for Thyroid US. Radiology 2023, 307, e221157.
  84. RSNA. Lumbar Spine Degenerative Classification AI Challenge. Available online: https://www.rsna.org/rsnai/ai-image-challenge/lumbar-spine-degenerative-classification-ai-challenge (accessed on 12 July 2024).
  85. Kim, H.E.; Cosa-Linan, A.; Santhanam, N.; Jannesari, M.; Maros, M.E.; Ganslandt, T. Transfer Learning for Medical Image Classification: A Literature Review. BMC Med. Imaging 2022, 22, 69.
  86. Xuan, J.; Ke, B.; Ma, W.; Liang, Y.; Hu, W. Spinal Disease Diagnosis Assistant Based on MRI Images Using Deep Transfer Learning Methods. Front. Public Health 2023, 11, 1044525.
  87. Santomartino, S.M.; Putman, K.; Beheshtian, E.; Parekh, V.S.; Yi, P.H. Evaluating the Robustness of a Deep Learning Bone Age Algorithm to Clinical Image Variation Using Computational Stress Testing. Radiol. Artif. Intell. 2024, 6, e230240.
  88. Brady, A.P.; Allen, B.; Chong, J.; Kotter, E.; Kottler, N.; Mongan, J.; Oakden-Rayner, L.; Pinto Dos Santos, D.; Tang, A.; Wald, C.; et al. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement from the ACR, CAR, ESR, RANZCR and RSNA. J. Am. Coll. Radiol. 2021, 18, 710–717.
  89. Kim, B.; Romeijn, S.; van Buchem, M.; Mehrizi, M.H.R.; Grootjans, W. A Holistic Approach to Implementing Artificial Intelligence in Radiology. Insights Imaging 2024, 15, 22.
  90. Suran, M.; Hswen, Y. How to Navigate the Pitfalls of AI Hype in Health Care. JAMA 2024, 331, 273–276.
  91. Geis, J.R.; Brady, A.; Wu, C.C.; Spencer, J.; Ranschaert, E.; Jaremko, J.L.; Langer, S.G.; Kitts, A.B.; Birch, J.; Shields, W.F.; et al. Ethics of Artificial Intelligence in Radiology: Summary of the Joint European and North American Multisociety Statement. Insights Imaging 2019, 10, 101.
  92. Jaremko, J.L.; Azar, M.; Bromwich, R.; Lum, A.; Alicia Cheong, L.H.; Gilbert, M.; Laviolette, F.; Gray, B.; Reinhold, C.; Cicero, M.; et al. Canadian Association of Radiologists White Paper on Ethical and Legal Issues Related to Artificial Intelligence in Radiology. Can. Assoc. Radiol. J. 2019, 70, 107–118.
  93. Plackett, B. The Rural Areas Missing out on AI Opportunities. Nature 2022, 610, S17.
  94. Celi, L.A.; Cellini, J.; Charpignon, M.-L.; Dee, E.C.; Dernoncourt, F.; Eber, R.; Mitchell, W.G.; Moukheiber, L.; Schirmer, J.; Situ, J.; et al. Sources of Bias in Artificial Intelligence That Perpetuate Healthcare Disparities-A Global Review. PLoS Digit. Health 2022, 1, e0000022.
  95. Eltawil, F.A.; Atalla, M.; Boulos, E.; Amirabadi, A.; Tyrrell, P.N. Analyzing Barriers and Enablers for the Acceptance of Artificial Intelligence Innovations into Radiology Practice: A Scoping Review. Tomography 2023, 9, 1443–1455.
  96. Borondy Kitts, A. Patient Perspectives on Artificial Intelligence in Radiology. J. Am. Coll. Radiol. 2023, 20, 243–250.
  97. Brima, Y.; Atemkeng, M. Saliency-Driven Explainable Deep Learning in Medical Imaging: Bridging Visual Explainability and Statistical Quantitative Analysis. BioData Min. 2024, 17, 18.
  98. Moor, M.; Banerjee, O.; Abad, Z.S.H.; Krumholz, H.M.; Leskovec, J.; Topol, E.J.; Rajpurkar, P. Foundation Models for Generalist Medical Artificial Intelligence. Nature 2023, 616, 259–265.
  99. Hafezi-Nejad, N.; Trivedi, P. Foundation AI Models and Data Extraction from Unlabeled Radiology Reports: Navigating Uncharted Territory. Radiology 2023, 308, e232308.
  100. Seah, J.; Tang, C.; Buchlak, Q.D.; Holt, X.G.; Wardman, J.B.; Aimoldin, A. Effect of a Comprehensive Deep-Learning Model on the Accuracy of Chest X-ray Interpretation by Radiologists: A Retrospective, Multireader Multicase Study. Lancet Digit. Health 2021, 3, e496–e506.
  101. van Beek, E.J.R.; Ahn, J.S.; Kim, M.J.; Murchison, J.T. Validation Study of Machine-Learning Chest Radiograph Software in Primary and Emergency Medicine. Clin. Radiol. 2023, 78, 1–7.
  102. Niehoff, J.H.; Kalaitzidis, J.; Kroeger, J.R.; Schoenbeck, D.; Borggrefe, J.; Michael, A.E. Evaluation of the Clinical Performance of an AI-Based Application for the Automated Analysis of Chest X-rays. Sci. Rep. 2023, 13, 3680.
  103. Hayashi, D. Deep Learning for Lumbar Spine MRI Reporting: A Welcome Tool for Radiologists. Radiology 2021, 300, 139–140.
  104. Gertz, R.J.; Bunck, A.C.; Lennartz, S. GPT-4 for Automated Determination of Radiological Study and Protocol Based on Radiology Request Forms: A Feasibility Study. Radiology 2023, 307, e230877.
  105. Beddiar, D.-R.; Oussalah, M.; Seppänen, T. Automatic Captioning for Medical Imaging (MIC): A Rapid Review of Literature. Artif. Intell. Rev. 2023, 56, 4019–4076.
  106. Sun, Z.; Ong, H.; Kennedy, P. Evaluating GPT4 on Impressions Generation in Radiology Reports. Radiology 2023, 307, e231259.
  107. Ayers, J.W.; Poliak, A.; Dredze, M. Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum. JAMA Intern. Med. 2023, 183, 589–596.
  108. Kuckelman, I.J.; Yi, P.H.; Bui, M.; Onuh, I.; Anderson, J.A.; Ross, A.B. Assessing AI-Powered Patient Education: A Case Study in Radiology. Acad. Radiol. 2024, 31, 338–342.
  109. Wu, J.; Kim, Y.; Keller, E.C.; Chow, J.; Levine, A.P.; Pontikos, N.; Ibrahim, Z.; Taylor, P.; Williams, M.C.; Wu, H. Exploring Multimodal Large Language Models for Radiology Report Error-Checking. arXiv 2023, arXiv:2312.13103.
  110. Yu, F.; Moehring, A.; Banerjee, O.; Salz, T.; Agarwal, N.; Rajpurkar, P. Heterogeneity and Predictors of the Effects of AI Assistance on Radiologists. Nat. Med. 2024, 30, 837–849.
Figure 1. PRISMA flowchart showing the two-step study screening process. Adapted from PRISMA Group, 2020.
Figure 2. Key themes identified through the literature search representing potential areas where advances in artificial intelligence and machine learning can improve the field of spine MRI.
Table 1. Summary of Selected Studies.
| No | Study Title | Authorship | Year of Publication | Journal Name | Application and Primary Outcome Measure | Sample Size * | Spine Region Studied | MRI Sequences Used | Artificial Intelligence Technique Used | Key Results and Conclusion |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A quantitative evaluation of the deep learning model of segmentation and measurement of cervical spine MRI in healthy adults | Y Zhu et al. [19] | 2024 | J Appl Clin Med Phys | Segmentation of cervical spine structures (subarachnoid space area/diameter, spinal cord area/diameter, anterior and posterior extra-spinal space) | 160 | Cervical | Sagittal T1w, T2w, axial T2w | 3D U-net | No comparative statistics |
| 2 | MRI radiomics-based decision support tool for a personalized classification of cervical disc degeneration: a two-center study | J Xie et al. [20] | 2024 | Front Physiol | Classification of cervical disc degeneration (Pfirrmann grading) | 435 | Cervical | Sagittal T1w, T2w | MedSAM | Disc segmentation Dice 0.93. Random forest overall performance AUC 0.95, accuracy 90%, precision 87% |
| 3 | A deep-learning model for diagnosing fresh vertebral fractures on magnetic resonance images | Y Wang et al. [21] | 2024 | World Neurosurg | Detection of fresh vertebral fractures | 716 | Whole spine | Midsagittal STIR | YoloV7, ResNet50 | Accuracy 98%, sensitivity 98%, specificity 97%. External dataset accuracy 92% |
| 4 | Diagnostic evaluation of deep learning accelerated lumbar spine MRI | KM Awan et al. [22] | 2024 | Neuroradiol J | Comparison of deep learning accelerated protocol to conventional protocol for neural stenosis and facet arthropathy | 36 | Lumbar | Sagittal T1w, T2w, STIR, axial T2w | CNN | Non-inferior in all aspects; however, reduced signal-to-noise ratio and increased artifact perception. Interobserver variability κ = 0.50–0.76 |
| 5 | A deep neural network for MRI spinal inflammation in axial spondyloarthritis | Y Lin et al. [23] | 2024 | Eur Spine J | Detection of inflammatory lesions on STIR sequence for patients with axial spondyloarthritis | 330 | Whole spine | Sagittal STIR | U-net | AUC 0.87, sensitivity 80%, specificity 88%, comparable to a radiologist. True positive lesion Dice 0.55 |
| 6 | Semi-automatic assessment of facet tropism from lumbar spine MRI using deep learning: a Northern Finland birth cohort study | N Kowlagi et al. [24] | 2023 | Spine (Phila Pa 1976) | Measurement of facet joint angles | 490 | Lumbar (L3/4 to L5/S1) | Axial T2w | U-net | Dice 0.93, IOU 0.87 |
| 7 | A convolutional neural network for automated detection of cervical ossification of the posterior longitudinal ligament using magnetic resonance imaging | Z Qu et al. [25] | 2023 | Clin Spine Surg | Detection of ossification of posterior longitudinal ligament | 684 | Cervical | Sagittal MRI | ResNet | Accuracy 93–98%, AUC 0.91–0.97. ResNet50 and ResNet101 had higher accuracy and specificity than all human readers |
| 8 | Deep learning-based k-space-to-image reconstruction and super resolution for diffusion-weighted imaging in whole-spine MRI | DK Kim et al. [26] | 2024 | Magn Reson Imaging | K-space-to-image reconstruction for whole spine DWI in patients with hematologic and oncologic diseases | 67 | Whole spine | Axial single-shot echo-planar DWI | CNN | Higher diagnostic confidence scores and overall image quality |
| 9 | Automatic detection and classification of Modic changes in MRI images using deep learning: intelligent assisted diagnosis system | G Liu et al. [27] | 2024 | Orthop Surg | Detection and classification of Modic endplate changes | 168 | Lumbar | Median sagittal T1w and T2w | Single shot multibox detector, ResNet18 | Internal dataset: accuracy 86%, recall 88%, precision 85%, F1-score 86%, interobserver κ = 0.79 (95%CI 0.66–0.85). External dataset: accuracy 75%, recall 77%, precision 78%, F1-score 75%, interobserver κ = 0.68 (95%CI 0.51–0.68) |
| 10 | Deep learning system for automated detection of posterior ligamentous complex injury in patients with thoracolumbar fracture on MRI | SW Jo et al. [28] | 2023 | Sci Rep | Detection of posterior ligamentous complex injury in patients with acute thoracolumbar fractures | 500 | Thoracic and lumbar | Midline sagittal T2w | Attention U-net and Inception-ResNetv2 | AUC 0.92–0.93 (vs. 0.83–0.93 for radiologists) |
| 11 | Cross-sectional area and fat infiltration of the lumbar spine muscles in patients with back disorders: a deep learning-based big data analysis | J Vitale et al. [29] | 2023 | Eur Spine J | Segmentation of lumbar paravertebral muscles and correlation with age | 4434 | Lumbar | Axial T2w | U-net | Higher cross-sectional area in males (p < 0.001). Positive correlation between age and total fat infiltration (r = 0.73, p < 0.001), negligible negative correlation between cross-sectional area and age (r = −0.24, p < 0.001) |
| 12 | MRI feature-based radiomics models to predict treatment outcome after stereotactic body radiotherapy for spinal metastases | Y Chen et al. [30] | 2023 | Insights Imaging | Prediction of treatment outcome after stereotactic body radiotherapy for spine metastasis | 194 | Whole spine | Sagittal T1w, T2w, STIR, axial T2w | Multiple (including AdaBoost, XGBoost, RF, SVM) | Combined model AUC 0.83, clinical model AUC 0.73 |
| 13 | Clinical and radiomics feature-based outcome analysis in lumbar disc herniation surgery | B Saravi et al. [31] | 2023 | BMC Musculoskelet Disord | Combination of radiomics features and clinical features to predict lumbar disc herniation surgery outcomes | 172 | Lumbar | Sagittal T2w | Multiple (including XGBoost, Lagrangian SVM, RF, radial basis function neural network) | Accuracy 88–93% (vs. 88–91% for clinical features alone) |
| 14 | Differentiating spinal pathologies by deep learning approach | O Haim et al. [32] | 2024 | Spine J | Differentiation of spinal lesions into infection, carcinoma, meningioma and schwannoma | 231 | Whole spine | Variable (T2w, T1w post-contrast) | Fast.ai | Accuracy 78% (validation), 93% (test) |
| 15 | Deep learning-based detection and classification of lumbar disc herniation on magnetic resonance images | W Zhang et al. [33] | 2023 | JOR Spine | Detection and classification of lumbar disc herniation according to the Michigan State University classification | 1115 | Lumbar | Axial T2w | Faster R-CNN, ResNeXt101 | Internal dataset: detection IOU 0.82, classification accuracy 88%, AUC 0.97, interclass correlation 0.87. External dataset: detection IOU 0.70, classification accuracy 74%, AUC 0.92, interclass correlation 0.79 |
| 16 | ASNET: a novel AI framework for accurate ankylosing spondylitis diagnosis from MRI | NP Tas et al. [34] | 2023 | Biomedicines | Prediction of ankylosing spondylitis diagnosis on MRI | 2036 | Sacroiliac joints | Axial, coronal STIR, coronal T1w post-contrast | DenseNet201, ResNet50, ShuffleNet | Accuracy 100%, recall 100%, precision 100%, F1-score 100% |
| 17 | Attention-gated U-Net networks for simultaneous axial/sagittal planes segmentation of injured spinal cords | N Masse-Gignac et al. [35] | 2023 | J Appl Clin Med Phys | Segmentation of the spinal cord in patients with traumatic injuries | 94 | All (mainly cervical) | Sagittal T2w | U-Net | Dice 0.95 |
| 18 | A spine segmentation method based on scene aware fusion network | EE Yilizati-Yilihamu et al. [36] | 2023 | BMC Neurosci | Segmentation of lumbar spine MRI into individual vertebrae and discs by level | 172 | Lumbar | Sagittal MRI | Scene-Aware Fusion Network (SAFNet) | Dice 0.79–0.81 (average 0.80) |
| 19 | MRI radiomics-based evaluation of Tuberculous and Brucella spondylitis | W Wang et al. [37] | 2023 | J Int Med Res | Differentiation of tuberculous spondylitis from Brucella spondylitis, and culture-positive from culture-negative tuberculous spondylitis | 190 | Whole spine | Sagittal T1w, T2w, fat suppressed | RF, SVM | SVM AUC 0.90–0.94, RF AUC 0.95 |
| 20 | Deep phenotyping the cervical spine: automatic characterization of cervical degenerative phenotypes based on T2-weighted MRI | F Niemeyer et al. [38] | 2023 | Eur Spine J | Classification of cervical spine into degenerative phenotypes based on disc and osteophyte configuration | 873 | Cervical | Sagittal MRI | 3D CNN | Disc κ = 0.55–0.68, disc displacement κ = 0.58–0.74, disc space narrowing κ = 0.65–0.72, osseous abnormalities κ = 0.18–0.49 |
| 21 | MRI-based radiomics assessment of the imminent new vertebral fracture after vertebral augmentation | J Cai et al. [39] | 2023 | Eur Spine J | Evaluation of risk of new vertebral fracture after vertebral augmentation | 168 | Lumbar | T2w | Multiple (logistic regression, RF, SVM, XGBoost) | AUC 0.90–0.93, superior to clinical features alone (p < 0.05) |
| 22 | Associations between vertebral localized contrast changes and adjacent annular fissures in patients with low back pain: a radiomics approach | C Waldenberg et al. [40] | 2023 | J Clin Med | Detection of adjacent-level annular fissure based on vertebral changes on MRI | 61 | Lumbar | Sagittal T1w, T2w, discography, CT | Multilayer perceptron, RF, K-nearest neighbor | Accuracy 83%, sensitivity 97%, specificity 28%, AUC 0.76 |
| 23 | Imaging evaluation of a proposed 3D generative model for MRI to CT translation in the lumbar spine | M Roberts et al. [41] | 2023 | Spine J | Generation of 3D CT from sagittal MRI data | 420 | Lumbar | Sagittal T1w | 3D cycle-GAN | Measurements in sagittal plane <10% relative error, axial plane up to 34% relative error |
| 24 | Deep learning-generated synthetic MR imaging STIR spine images are superior in image quality and diagnostically equivalent to conventional STIR: a multicenter, multireader trial | LN Tanenbaum et al. [42] | 2023 | AJNR | Validation of synthetic STIR images created from T1w and T2w | 93 | Whole spine | Sagittal T1w, T2w, STIR | CNN | No significant difference between synthetic and acquired STIR; higher image quality for synthetic STIR (p < 0.0001) |
| 25 | Prediction of osteoporosis using MRI and CT scans with unimodal and multimodal deep-learning models | Y Kucukciloglu et al. [43] | 2024 | Diagn Interv Radiol | Prediction of osteoporosis on lumbar spine MRI and CT against DEXA scans | 120 | Lumbar | Sagittal T1w, CT, DEXA | CNN | Accuracy 96–99% |
| 26 | Machine learning assisting the prediction of clinical outcomes following nucleoplasty for lumbar degenerative disc disease | PF Chiu et al. [44] | 2023 | Diagnostics (Basel) | Prediction of pain improvement after lumbar nucleoplasty for degenerative disc disease | 181 | Lumbar | Axial T2w | Multiple (SVM, light gradient boosting machine, XGBoost, XGBRF, CatBoost, iRF) | Improved RF: accuracy 76%, sensitivity 69%, specificity 83%, F1-score 0.73, AUC 0.77 |
| 27 | NAMSTCD: a novel augmented model for spinal cord segmentation and tumor classification using deep nets | R Mohanty et al. [45] | 2023 | Diagnostics (Basel) | Segmentation of spinal cord regions and tumor types | 5000 images | Whole spine | Not mentioned | Multiple (Mask Regional CNN (MRCNNs), VGGNet 19, YoLo V2, ResNet 101, GoogleNet) | Classification accuracy 99% (versus 81–96% for other models) |
| 28 | Benign vs. malignant vertebral compression fractures with MRI: a comparison between automatic deep learning network and radiologist's assessment | B Liu et al. [46] | 2023 | Eur Radiol | Differentiation of benign and malignant vertebral compression fractures | 209 | Whole spine | Median sagittal T1w, T2w fat suppressed | Two-stream compare and contrast network (TSCCN) | AUC 0.92–0.99, accuracy 90–96% (higher than radiologists), specificity 94–99% (higher than radiologists) |
| 29 | Automatic detection, classification, and grading of lumbar intervertebral disc degeneration using an artificial neural network model | W Liawrungrueang et al. [47] | 2023 | Diagnostics (Basel) | Classification of lumbar disc degeneration (Pfirrmann grading) | 515 | Lumbar | Sagittal T2w | YOLOv5 | Accuracy > 95%, F1-score 0.98 |
| 30 | Differentiating magnetic resonance images of pyogenic spondylitis and spinal Modic change using a convolutional neural network | T Mukaihata et al. [48] | 2023 | Spine (Phila Pa 1976) | Differentiation of Modic changes from pyogenic spondylitis on MRI | 100 | Whole spine | Sagittal T1w, T2w, STIR | CNN | AUC 0.94–0.95, higher accuracy than clinicians (p < 0.05) |
| 31 | Automated classification of intramedullary spinal cord tumors and inflammatory demyelinating lesions using deep learning | Z Zhuo et al. [49] | 2022 | Radiol Artif Intell | Differentiation of cord tumors from demyelinating lesions | 647 | Whole spine | Sagittal T2w | MultiResU-net, DenseNet121 | Test cohort Dice 0.50–0.80, accuracy 79–96%, AUC 0.85–0.99 |
| 32 | Ultrafast cervical spine MRI protocol using deep learning-based reconstruction: diagnostic equivalence to a conventional protocol | N Kashiwagi et al. [50] | 2022 | Eur J Radiol | Validation of an ultrafast cervical spine MRI protocol | 50 | Cervical | Sagittal T1w, T2w, STIR, axial T2*w | CNN | κ = 0.60–0.98, individual equivalence index 95% CI < 5% |
| 33 | Differentiation between spinal multiple myeloma and metastases originated from lung using multi-view attention-guided network | K Chen et al. [51] | 2022 | Front Oncol | Differentiation of multiple myeloma lesions from metastasis on MRI | 217 | Whole spine | T2w, T1w post-contrast (3 planes) | Multi-view attention-guided network (MAGN), ResNet50, Class Activation Mapping | Accuracy 79–81%, AUC 0.77–0.78, F1-score 0.67–0.71 |
| 34 | Development of lumbar spine MRI referrals vetting models using machine learning and deep learning algorithms: comparison models vs. healthcare professionals | AH Alanazi et al. [52] | 2022 | Radiography (Lond) | Vetting of MRI lumbar spine referrals for valid indications | 1020 | Lumbar | Nil | SVM, logistic regression, RF, CNN, bi-directional long short-term memory (Bi-LSTM) | RF AUC 0.99, CNN AUC 0.98 (outperforming radiographers) |
| 35 | Improved productivity using deep learning-assisted reporting for lumbar spine MRI | DSW Lim et al. [53] | 2022 | Radiology | Evaluation of time savings and accuracy for AI-assisted MRI lumbar spine reporting | 25 | Lumbar | Sagittal T1w, axial T2w | CNN, ResNet101 | Reduced interpretation time (p < 0.001), improved or equivalent interobserver agreement with DL assistance |
36Deep learning model for classifying metastatic epidural spinal cord compression on MRIJ Hallinan et al. [54]2022Front OncolClassification of metastatic vertebral and epidural disease (Bilsky classification)247ThoracicAxial T2wResNet50Internal dataset: κ = 0.92–0.98, external dataset: κ = 0.94–0.95
37Vertebral deformity measurements at MRI, CT, and radiography using deep learningA Suri et al. [55]2021Radiol Artif IntellMeasurement of vertebral deformity on MRI, CT and radiographs1744Whole spineSagittal T1w, T2w, CT, radiographsNeural networkVertebral measurement mean height percentage error 1.5–1.9% ± 0.2–0.4, lumbar lordosis angle mean absolute error 2.3–3.6°
38Predicting postoperative recovery in cervical spondylotic myelopathy: construction and interpretation of T2*-weighted radiomic-based extra trees modelsMZ Zhang et al. [56]2022Eur RadiolPrediction of recovery rate after cervical spondylotic myelopathy surgery based on MRI and clinical features151CervicalT2w, T2*wThreshold selection, collinearity removal, tree-based feature selectionAUC 0.71–0.81 (vs. conventional clinical and radiologic models AUC 0.40–0.55)
39Fully automated segmentation of lumbar bone marrow in sagittal, high-resolution T1-weighted magnetic resonance images using 2D U-NETEJ Hwang et al. [57]2022Comput Biol MedSegmentation of normal and pathological bone marrow on MRI lumbar spine100LumbarSagittal T1wU-net3D, Grow-cutHealthy subjects Dice 0.91–0.96, diseased subjects Dice 0.83–0.95
40A comparison of natural language processing methods for the classification of lumbar spine imaging findings related to lower back painC Jujjavarapu et al. [58]2022Acad RadiolClassification of spine MRI and radiograph reports into 26 findings871LumbarMRI, radiographsElastic-net logistic regressionAUC 0.96 for all findings (n-grams), AUC 0.95 for potentially clinically important findings
41Virtual magnetic resonance lumbar spine images generated from computed tomography images using conditional generative adversarial networksM Gotoh et al. [59]2022Radiography (Lond)Generation of virtual MRI images from CT22LumbarMRIConditional GANNo significant difference between virtual and conventional MRI, except in visualization of spinal canal structure. Peak signal-to-noise ratio 18.4 dB
42Deep learning for adjacent segment disease at preoperative MRI for cervical radiculopathyCMW Goedmakers et al. [60]2021RadiologyPrediction of adjacent segment disease after anterior cervical discectomy and fusion surgery on pre-operative MRI344CervicalSagittal T2wCNNAccuracy 95% (vs. 58% for clinicians), sensitivity 80% (vs. 60%), specificity 97% (vs. 58%)
43Automatic multiclass intramedullary spinal cord tumor segmentation on MRI with deep learningA Lemay et al. [61]2021Neuroimage ClinSegmentation of three common spinal cord tumors343Whole spineSagittal T2w, T1w post-contrastU-netDice 0.77 (all abnormal signal), 0.62 (tumour alone), true positive detection > 87% (all abnormal signal)
44A preliminary study using spinal MRI-based radiomics to predict high-risk cytogenetic abnormalities in multiple myelomaJ Liu et al. [62]2021Radiol MedPrediction of high-risk cytogenic abnormalities in multiple myeloma based on MRI248 lesionsWhole spineSagittal T1w, T2w, T2w fat suppressedLogistic regressionAUC 0.86–0.87, sensitivity 79%, specificity 79%, PPV 75%, NPV 82%, accuracy 79%
45A deep learning model for detection of cervical spinal cord compression in MRI scansZ Merali et al. [63]2021Sci RepDichotomous spinal cord compression for cervical spine289CervicalAxial T2wCNNAUC 0.94, sensitivity 88%, specificity 89%, F1-score 0.82
46Deep learning model for automated detection and classification of central canal, lateral recess, and neural foraminal stenosis at lumbar spine MRIJ Hallinan et al. [64]2021RadiologyGrading of lumbar spinal canal, lateral recess and neural foraminal stenosis446LumbarSagittal T1w, axial T2wCNNRecall 85–100%, dichotomous classification κ-range = 0.89–0.96 (vs. 0.92–0.98 for radiologists)
47A deep convolutional neural network with performance comparable to radiologists for differentiating between spinal schwannoma and meningiomaS Maki et al. [65]2020Spine (Phila Pa 1976)Differentiation of meningioma from schwannoma84Whole spineSagittal T2w, T1w post-contrastCNNAUC 0.87–0.88, sensitivity 78–85% (vs. 95–100% for radiologists), specificity 75–82% (vs. 26–58%), accuracy 80–81% (vs. 69–82%)
48Quantitative analysis of neural foramina in the lumbar spine: an imaging informatics and machine learning studyB Gaonkar et al. [66]2019Radiol Artif IntellSegmentation and statistical modelling of lumbar neural foraminal area1156LumbarSagittal T2wSVM, U-netDice 0.63–0.68 (neural foramen), 0.84–0.91 (disc)
49Performance of the deep convolutional neural network based magnetic resonance image scoring algorithm for differentiating between Tuberculous and pyogenic spondylitisK Kim et al. [67]2018Sci RepDifferentiation of pyogenic from Tuberculous spondylitis161Whole spineAxial T2wDeep CNNAUC 0.80 (vs. 0.73 for radiologists, p = 0.079)
50SpineNet: automated classification and evidence visualization in spinal MRIsA Jamaludin et al. [68]2017Med Image AnalDetection and classification of multiple abnormalities (Pfirrmann grading, disc narrowing, endplate defects, marrow changes, spondylolisthesis, central canal stenosis)2009Lumbar spineT2w sagittalCNNPfirmann inter-rater κ = 0.69–0.81, overall accuracy 74%
Artificial intelligence (AI), magnetic resonance imaging (MRI), T1-weighted (T1w), T2-weighted (T2w), T2*-weighted (T2*w), area under the curve (AUC), segment anything model (SAM), convolutional neural network (CNN), short-tau inversion recovery (STIR), intersection over union (IOU), diffusion-weighted imaging (DWI), random forest (RF), support vector machine (SVM), computed tomography (CT), generative adversarial network (GAN), dual-energy X-ray absorptiometry (DEXA). * Numbers refer to patients unless stated otherwise.
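Many of the studies tabulated above report segmentation overlap as a Dice coefficient and agreement (inter-rater, or model vs. radiologist) as Cohen's κ. As a minimal, illustrative sketch of how these two metrics are computed (not drawn from any of the cited studies' code):

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    # Convention: two empty masks agree perfectly
    return 2.0 * intersection / denom if denom else 1.0

def cohens_kappa(labels_a, labels_b, n_classes):
    """Chance-corrected agreement between two sets of categorical labels."""
    conf = np.zeros((n_classes, n_classes))
    for i, j in zip(labels_a, labels_b):
        conf[i, j] += 1  # build the confusion (agreement) matrix
    n = conf.sum()
    p_observed = np.trace(conf) / n
    # Expected agreement by chance, from the marginal label frequencies
    p_expected = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / n**2
    return (p_observed - p_expected) / (1 - p_expected)
```

For grading tasks such as Pfirrmann or Bilsky classification, κ is typically computed per class or in a weighted form; the unweighted version above is the simplest case.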
Lee, A.; Ong, W.; Makmur, A.; Ting, Y.H.; Tan, W.C.; Lim, S.W.D.; Low, X.Z.; Tan, J.J.H.; Kumar, N.; Hallinan, J.T.P.D. Applications of Artificial Intelligence and Machine Learning in Spine MRI. Bioengineering 2024, 11, 894. https://doi.org/10.3390/bioengineering11090894
