Systematic Review

Performance of Commercial Deep Learning-Based Auto-Segmentation Software for Prostate Cancer Radiation Therapy Planning: A Systematic Review

by Curtise K. C. Ng 1,2
1 Curtin Medical School, Curtin University, GPO Box U1987, Perth, WA 6845, Australia
2 Curtin Medical Research Institute (Curtin MRI), Faculty of Health Sciences, Curtin University, GPO Box U1987, Perth, WA 6845, Australia
Information 2025, 16(3), 215; https://doi.org/10.3390/info16030215
Submission received: 28 January 2025 / Revised: 6 March 2025 / Accepted: 7 March 2025 / Published: 11 March 2025

Abstract

To date, no systematic review has focused on the benefits and issues of commercial deep learning-based auto-segmentation (DLAS) software for prostate cancer (PCa) radiation therapy (RT) planning, even though NRG Oncology has underscored the need for one. This article’s purpose is to systematically review the performance of commercial DLAS software products for PCa RT planning and the associated evaluation methodology. A literature search was performed using electronic databases on 7 November 2024. Thirty-two articles were included as per the selection criteria. They evaluated 12 products (Carina Medical LLC INTContour (Lexington, KY, USA), Elekta AB ADMIRE (Stockholm, Sweden), Limbus AI Inc. Contour (Regina, SK, Canada), Manteia Medical Technologies Co. AccuContour (Jian Sheng, China), MIM Software Inc. Contour ProtégéAI (Cleveland, OH, USA), Mirada Medical Ltd. DLCExpert (Oxford, UK), MVision.ai Contour+ (Helsinki, Finland), Radformation Inc. AutoContour (New York, NY, USA), RaySearch Laboratories AB RayStation (Stockholm, Sweden), Siemens Healthineers AG AI-Rad Companion Organs RT, syngo.via RT Image Suite and DirectORGANS (Erlangen, Germany), Therapanacea Annotate (Paris, France), and Varian Medical Systems, Inc. Ethos (Palo Alto, CA, USA)). Their results illustrate that the DLAS products can delineate 12 organs at risk (abdominopelvic cavity, anal canal, bladder, body, cauda equina, left (L) and right (R) femurs, L and R pelvis, L and R proximal femurs, and sacrum) and four clinical target volumes (prostate, lymph nodes, prostate bed, and seminal vesicle bed) with clinically acceptable outcomes, reducing delineation time by 5.7–81.1%. Although NRG Oncology has recommended that each clinical centre perform its own DLAS product evaluation prior to clinical implementation, such evaluation seems more important for AccuContour and Ethos due to the methodological issues of the respective single studies, e.g., the small datasets used.

1. Introduction

Globally, prostate cancer (PCa) is the fourth most common cancer overall and the second most common cancer type in males [1,2,3]. About 80% of PCa is localised [4]. Common treatments for localised and regional PCa include active surveillance, radical prostatectomy, and radiation therapy (RT) with and without hormone therapy (androgen deprivation therapy) [4,5]. Recent literature reviews and meta-analysis show that radical prostatectomy and RT are equally effective for treating localised PCa [4,5,6]. However, the effectiveness of PCa RT depends on the accurate segmentation of tumours (clinical target volumes (CTVs)) as well as adjacent healthy tissues (organs at risk (OARs)) such as the rectum on medical images during the treatment planning process, so as to minimise side effects, e.g., faecal incontinence and blood in stool [5,7,8,9,10].
Traditionally, the OARs and CTVs segmentation in PCa RT treatment planning is handled by radiation therapists (RTTs) and/or radiation oncologists (ROs) based on established contouring guidelines [11,12]. However, this manual process is time-consuming and subjective, resulting in notable inter- and intra-observer variations and hence affecting the PCa RT treatment outcomes [13,14,15]. To address these issues, auto-segmentation approaches (e.g., intensity analysis, shape modelling, atlas-based, etc.) have emerged in recent decades [10,15,16,17]. Given the recent popularity of deep learning (DL) in medical imaging, commercial companies have shifted their focus to develop DL-based auto-segmentation (DLAS) software for PCa RT planning. Examples of such software include Carina Medical LLC INTContour (Lexington, KY, USA), Limbus AI Inc. Contour (Saskatchewan, Canada), Mirada Medical Ltd. DLCExpert (Oxford, UK), MVision.ai Contour+ (Helsinki, Finland), Radformation Inc. AutoContour (New York, NY, USA), RaySearch Laboratories AB RayStation (Stockholm, Sweden), Siemens Healthineers AG AI-Rad Companion Organs RT (Erlangen, Germany), and Therapanacea Annotate (Paris, France) [17,18,19,20].
Over the past five years, several review papers that cover the DLAS in PCa RT planning have been available. Their review areas include DLAS architectures/approaches (e.g., convolutional neural network (CNN), etc.) [12,14,15,17], model training and testing dataset sources (such as private and public) [12,14], dataset types (e.g., magnetic resonance imaging (MRI), computed tomography (CT), etc.) [12,13,15], dataset sizes such as 69 patients [12,13,14,15,21], structures segmented (e.g., seminal vesicles (SVs), rectum, etc.) [12,14,15,17,21], reference contour sources (such as at least three ROs) [14,21], contouring guidelines (e.g., Radiation Therapy Oncology Group (RTOG), etc.) [15], evaluation metrics (such as Dice similarity coefficient (DSC)), and model performance (e.g., efficiency, geometric accuracy, etc.) [12,14,15,17]. Most of these review articles were published in 2024, highlighting the significance and urgency of this topic [12,15,17,21]. However, the aforementioned articles, except for two, were narrative reviews and covered a range of disease sites such as breast, lung, and prostate [12,13,17,21]. As yet, there is only one DLAS systematic review focused on the PCa RT planning. Nevertheless, it was published in 2020 and no studies on commercial DLAS software were included [14].
The latest narrative reviews (published in 2024) on the DLAS for RT planning, including one by the working group of the influential body NRG Oncology, have underscored the importance of reviewing commercial DLAS software products. This is because such reviews can enable the clinical community to understand the strengths and weaknesses of the commercial products when considering clinical implementation [12,21]. As yet, there is no systematic review focusing on the benefits and issues of the commercial DLAS software for the PCa RT planning. A previous critical review on the DLAS for RT planning of multiple cancers covered four studies (out of 12 included papers) about performance evaluation of three commercial DLAS products for the PCa RT planning (Carina Medical LLC INTContour, Limbus AI Inc. Contour and Siemens Healthineers AG AI-Rad Companion Organs RT). However, because that review paper mixed studies of in-house and commercial DLAS models, its findings are less relevant to the clinical community due to potential barriers to accessing and implementing the in-house models [15]. The absence of a systematic review that focuses on the commercial DLAS products for the PCa RT planning and evaluates the methods used to generate the evidence would hinder clinical adoption of such technology and limit the realisation of its potential benefits [22]. The purpose of this article is to systematically review original studies to answer the question: “What are the performances of commercial DLAS software products for the PCa RT planning and methods for their performance evaluation?”

2. Materials and Methods

This systematic review was conducted according to the PICO model (patient/population: PCa patients; intervention: PCa RT planning with the use of commercial DLAS software; comparison: DLAS versus standard practice, i.e., manual contouring; and outcome: performance of structure segmentation) and the preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines [22,23,24,25,26]. Four major steps, namely literature search, article selection, data extraction, and data synthesis, were involved [22,23,24,25].

2.1. Literature Search

Seven electronic journal databases including Institute of Electrical and Electronics Engineers (IEEE) Xplore, PubMed, ScienceDirect, Scopus, SpringerLink, Web of Science, and Wiley Online Library were used for searching the literature on the performance of the commercial DLAS software for the PCa RT planning on 7 November 2024. The search statement: “Commercial” AND (“Deep Learning” OR “Artificial Neural Network”) AND (“Segmentation” OR “Delineation” OR “Contouring”) AND “Prostate Cancer” AND (“Radiotherapy” OR “Radiation Therapy”) was used. These keywords were based on the focus of the review. The search did not have any restrictions on publication year [13,14,22].
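The Boolean structure of the search statement can be sketched in code. The snippet below is only an illustration of how the keyword groups combine (OR within a concept, AND between concepts); the function name `build_search_statement` and the `CONCEPT_GROUPS` list are hypothetical helpers, and each database may require its own field tags or syntax that are not shown here.

```python
def build_search_statement(concept_groups):
    """Join synonyms with OR inside a group and join the groups with AND."""
    clauses = []
    for terms in concept_groups:
        quoted = [f'"{term}"' for term in terms]
        if len(quoted) == 1:
            # A single keyword needs no parentheses.
            clauses.append(quoted[0])
        else:
            clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Keyword groups taken from the search statement in Section 2.1.
CONCEPT_GROUPS = [
    ["Commercial"],
    ["Deep Learning", "Artificial Neural Network"],
    ["Segmentation", "Delineation", "Contouring"],
    ["Prostate Cancer"],
    ["Radiotherapy", "Radiation Therapy"],
]

search_statement = build_search_statement(CONCEPT_GROUPS)
print(search_statement)
```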

2.2. Article Selection

The selection of articles was handled by a single reviewer with more than 20 years’ experience in conducting literature reviews. Article selection criteria are shown in Table 1 [22,23,24,25].
The selection criteria in Table 1 were determined based on the purpose of this review which included appraisal of methods of original studies on evaluating the performance of DLAS software for the PCa RT planning [22,23,24,25,27]. For systematic reviews of DL that covered the study methodology appraisal, the conference proceedings were considered not suitable due to their lower academic rigour in general [22,27,28]. The detail of article selection is illustrated in Figure 1 [22,23,24,25]. The main processes included article deduplication and screening of paper titles, abstracts, and full texts as per the selection criteria. Each non-duplicate paper in the search results was retained unless a decision on its exclusion could be reached. Reference lists of all included papers were checked for identification of additional articles (i.e., backward reference searching) [22,23,24,29,30].

2.3. Data Extraction and Synthesis

Data extraction was conducted based on a form derived from one narrative review [12] and one critical review on the DLAS for different RT plannings [15], one systematic review of the DLAS for the PCa RT planning [14], and another systematic review focusing on commercial DLAS software for breast cancer RT planning [22]. The following data were extracted from every included paper: name and country of author; publication year; software product name and version (such as Siemens Healthineers AG AI-Rad Companion Organs RT VA20); DLAS architecture (e.g., 3-dimensional U-Net, etc.); study design (either prospective or retrospective); any involvement of multiple centres; patient/population (such as postprostatectomy PCa patients); any model finetuning; source (e.g., private: 1 Dutch centre, public: USA The Cancer Imaging Archive, etc.) and size (such as 84 patients) of dataset for development (training of model) and evaluation (testing) of software; any sample size calculation for evaluation and external testing (model evaluation with dataset different from that for its development and obtained from different source); imaging modality for DLAS (e.g., MRI, etc.); OARs (such as rectum) and CTVs (e.g., prostate, SVs, etc.) delineated; source of reference contour, i.e., ground truth (such as four expert ROs); contouring guidelines (e.g., Australia and New Zealand Faculty of Radiation Oncology Genito-Urinary Group (FROGG)) used; performances of delineation regarding geometric accuracy (expressed by metrics, e.g., DSC, etc.) for the DLAS and manual segmentation (in terms of inter-observer variation (IOV), i.e., the average performance of multiple observers), and any differences between them; subjective accuracy assessment (such as percentage of CTVs and OARs that needed no and minor adjustments); evaluation of efficiency (e.g., percentage of average segmentation time reduction, etc.); and dosimetric impact (such as DLAS and manual segmentation not having any clinically relevant dose difference for OARs) [12,14,15,22].
For papers that evaluated both DL- and atlas-based auto-segmentation software products or covered multiple cancer types, only data about the DLAS for the PCa RT planning were included. When more than one set of evaluation results were provided by a study, only mean (or median if mean unavailable) performance findings of the best model determined through the external testing and/or expert (instead of clinical) contours were extracted for the facilitation of software comparison. In addition, the following strategies for data synthesis were used when applicable: 1. calculating mean values for figures from multiple PCa patient demographic subgroups and ROs; 2. overall percentage calculation based on individual percentage figures for unusable CTVs and OARs contours as well as those requiring no and minor adjustments; and 3. calculating percentages based on reported absolute values of average segmentation time reduction, and delineated structure volume differences between manual contouring and DLAS [22,23,24,25]. The DLAS delineation time included time for data transfer, auto-segmentation, and contour review and correction [12,21]. Since the included papers used various performance evaluation approaches, resulting in a high study heterogeneity, meta-analysis was not performed due to limited value [22,27,31,32]. The quality of all included papers was assessed based on revised checklist for artificial intelligence in medical imaging (CLAIM) 2024 [33,34,35,36].
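The first and third synthesis strategies reduce to simple arithmetic, which can be sketched as follows. This is a minimal illustration, not the review's actual workflow: the function names and all numeric inputs are hypothetical, and the percentage is assumed to be computed relative to the manual contouring time.

```python
def percent_time_reduction(manual_minutes, dlas_minutes):
    """Percentage reduction of delineation time relative to manual contouring."""
    if manual_minutes <= 0:
        raise ValueError("manual_minutes must be positive")
    return (manual_minutes - dlas_minutes) / manual_minutes * 100.0


def mean_of_subgroups(values):
    """Unweighted mean of figures reported for several subgroups or observers."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


# Hypothetical study reporting absolute average times per patient.
# The DLAS time is assumed to include data transfer, auto-segmentation,
# and contour review and correction.
manual_time = 30.0   # minutes (made-up value)
dlas_time = 12.0     # minutes (made-up value)
reduction = percent_time_reduction(manual_time, dlas_time)

# Hypothetical DSC values reported separately by three radiation oncologists.
mean_dsc = mean_of_subgroups([0.84, 0.86, 0.88])

print(round(reduction, 1), round(mean_dsc, 2))
```

An unweighted mean is used here for simplicity; a mean weighted by subgroup size would be an equally plausible reading when subgroup sizes are reported.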

3. Results

Thirty-two articles were included in this review as per the selection criteria (Figure 1). The performances of the commercial DLAS software products evaluated by the selected studies are shown in Table 2 and Table 3 [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68]. These studies covered a total of 12 products, namely, Carina Medical LLC INTContour [37,38,39,65,67], Elekta AB ADMIRE (Stockholm, Sweden) [40], Limbus AI Inc. Contour [41,42,43,44,45,46,66], Manteia Medical Technologies Co. AccuContour (Jian Sheng, China) [67], MIM Software Inc. Contour ProtégéAI (Cleveland, OH, USA) [47,48,67], Mirada Medical Ltd. DLCExpert [49,50,51,68], MVision.ai Contour+ [52,53,54,55,66,68], Radformation Inc. AutoContour [68], RaySearch Laboratories AB RayStation [56,57,65,68], Siemens Healthineers AG AI-Rad Companion Organs RT [58,60], syngo.via RT Image Suite [58,59] and DirectORGANS [61], Therapanacea Annotate [62,63,68], and Varian Medical Systems, Inc. Ethos (Palo Alto, CA, USA) [64].
DSC was the most common metric (used in 28 out of 32 (87.5%) papers) for evaluating the geometric accuracy of the delineated structures (Table 2) [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,55,57,58,59,60,61,63,65,66,67,68]. A total of 26 out of these 28 (92.9%) articles used the DLAS software to delineate the OARs. Collectively, 20 OARs as well as two rectal sparing structures (balloon and spacer) were covered [37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,53,55,57,58,59,60,61,63,66,67,68]. Their mean/median DSC values were: abdominopelvic cavity (0.94) [59]; anal canal (0.70–0.74) [45,63]; bladder (0.88–0.99) [37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,55,57,58,59,60,61,63,66,67,68]; body (external contour) (0.99) [59]; cauda equina (0.75) [68]; left (L) (0.68–0.99) and right (R) femoral heads (0.69–0.99) [37,38,39,41,42,44,46,48,49,50,51,52,55,57,59,66,67,68]; L (0.78–0.92) and R femurs (0.78–0.92) [45,63]; L (0.91) and R pelvis (0.90) [63]; L (0.95) and R proximal femurs (0.96) [61]; large (0.49–0.76) and small bowels (0.30–0.76) [67,68]; penile bulb (0.39–0.73) [37,38,44,51,52,55,63,67]; penile root (0.54–0.71) [68]; rectum (0.51–0.95) [37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,53,55,57,58,59,60,61,63,66,67,68]; sacrum (0.89) [63]; sigmoid (0.52–0.77) [68]; rectal balloon (0.89) [67]; and spacer (0.52–0.84) [51,67]. The mean/median DSC for the OARs contoured by the DLAS software ranged from 0.30 (small bowel) [67] to 0.99 (bladder [42,43], body [59], and L [42,59] and R femoral heads [42]).
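For readers less familiar with the metric, the DSC of two contours A and B is 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical). The sketch below computes it for two structures represented as sets of voxel coordinates. The representation, the function name, and the toy coordinates are assumptions for illustration only; the included studies typically work with image masks in their own software.

```python
def dice_similarity_coefficient(reference_voxels, auto_voxels):
    """DSC = 2 * |A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    reference = set(reference_voxels)
    auto = set(auto_voxels)
    if not reference and not auto:
        # Convention assumed here: two empty contours agree perfectly.
        return 1.0
    overlap = len(reference & auto)
    return 2.0 * overlap / (len(reference) + len(auto))


# Toy example: four-voxel reference contour versus four-voxel auto-contour,
# sharing three voxels.
reference_contour = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
auto_contour = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}

dsc = dice_similarity_coefficient(reference_contour, auto_contour)
print(dsc)  # 2 * 3 / (4 + 4) = 0.75
```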
Based on the common DSC cutoff value, at least 0.70 for indicating clinically acceptable DLAS performance [22,44,45,51,60,68,69], 12 (abdominopelvic cavity [59], anal canal [45,63], bladder [37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,55,57,58,59,60,61,63,66,67,68], body [59], cauda equina [68], L and R femurs [45,63], L and R pelvis [63], L and R proximal femurs [61], and sacrum [63]) out of 20 (60.0%) OARs were able to meet this standard (Table 2). With the same criterion (DSC ≥ 0.70) [22,44,45,51,60,68,69], at least 80.0% of the OARs contoured by the following seven DLAS products were considered clinically acceptable: Carina Medical LLC INTContour (four (bladder, L and R femoral heads, and rectum) out of five (80.0%) OARs) [37,38,39,67], Limbus AI Inc. Contour (seven (anal canal, bladder, L and R femoral heads, L and R femurs, and rectum) out of eight (87.5%) OARs) [41,42,43,44,45,46,66], Manteia Medical Technologies Co. AccuContour (all (four) OARs (bladder, L and R femoral heads, and rectum)) [67], MIM Software Inc. Contour ProtégéAI (all (four) OARs (bladder, L and R femoral heads, and rectum)) [47,48,67], Siemens Healthineers AG AI-Rad Companion Organs RT/syngo.via RT Image Suite/DirectORGANS (all (eight) OARs (abdominopelvic cavity, bladder, body, L and R femoral heads, L and R proximal femurs, and rectum)) [58,59,60,61], Therapanacea Annotate ART-Plan (13 (anal canal, bladder, bowels, L and R femoral heads, L and R femurs, L and R pelvis, penile bulb and root, rectum, and sacrum) out of 14 (92.9%) OARs) [63,68], and Radformation Inc. AutoContour (six (bladder, bowels, cauda equina, L and R femoral heads, and rectum) out of seven (85.7%) OARs) [68].
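The count of acceptable OARs can be reproduced from the DSC ranges reported above. The sketch below assumes that a structure is counted as clinically acceptable only when the lowest reported mean/median DSC is at least 0.70; this reading is an interpretation, not a rule stated explicitly in the text, but it yields the same 12 of 20 structures. Single values are entered as a range with equal bounds.

```python
DSC_CUTOFF = 0.70

# (lowest, highest) mean/median DSC per OAR, as reported in the Results.
OAR_DSC_RANGES = {
    "abdominopelvic cavity": (0.94, 0.94),
    "anal canal": (0.70, 0.74),
    "bladder": (0.88, 0.99),
    "body": (0.99, 0.99),
    "cauda equina": (0.75, 0.75),
    "L femoral head": (0.68, 0.99),
    "R femoral head": (0.69, 0.99),
    "L femur": (0.78, 0.92),
    "R femur": (0.78, 0.92),
    "L pelvis": (0.91, 0.91),
    "R pelvis": (0.90, 0.90),
    "L proximal femur": (0.95, 0.95),
    "R proximal femur": (0.96, 0.96),
    "large bowel": (0.49, 0.76),
    "small bowel": (0.30, 0.76),
    "penile bulb": (0.39, 0.73),
    "penile root": (0.54, 0.71),
    "rectum": (0.51, 0.95),
    "sacrum": (0.89, 0.89),
    "sigmoid": (0.52, 0.77),
}

# Assumed rule: the lower bound of the range must reach the cutoff.
acceptable = sorted(
    name for name, (low, _high) in OAR_DSC_RANGES.items() if low >= DSC_CUTOFF
)
share = 100.0 * len(acceptable) / len(OAR_DSC_RANGES)

print(len(acceptable), len(OAR_DSC_RANGES), share)
```

Under this rule the rectum and femoral heads fall outside the acceptable set because at least one study reported a mean/median DSC below 0.70, even though most studies reported much higher values.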
Around 60.0% of the included papers evaluated the DLAS products for delineating the CTVs by the metric, DSC (Table 2) [37,38,39,40,41,42,45,46,48,51,52,55,57,59,60,63,65,67,68]. A total of 11 products were evaluated, and they were: Carina Medical LLC INTContour [37,38,39,65,67], Elekta AB ADMIRE [40], Limbus AI Inc. Contour [41,42,45,46], Manteia Medical Technologies Co. AccuContour [67], MIM Software Inc. Contour ProtégéAI [48,67], Mirada Medical Ltd. DLCExpert [51,68], MVision.ai Contour+ [52,55,68], Radformation Inc. AutoContour [68], RaySearch Laboratories AB RayStation [57,65,68], Siemens Healthineers AG AI-Rad Companion Organs RT [60] and syngo.via RT Image Suite [59], and Therapanacea Annotate ART-Plan [63,68]. Five CTVs were covered in total [37,38,39,40,41,42,45,46,48,51,52,55,57,59,60,63,65,67,68]. Their mean/median DSC values were 0.80 (lymph nodes) [52], 0.74–0.91 (prostate) [37,39,40,41,42,45,46,48,51,52,55,57,59,60,63,65,67,68], 0.73–0.75 (prostate bed) [38,39], 0.73 (SV bed) [38], and 0.30–0.83 (SVs) [37,41,46,48,51,52,55,60,63,67,68]. All (11) products were able to delineate the prostate with the clinically acceptable outcome (DSC ≥ 0.70) [37,39,40,41,42,45,46,48,51,52,55,57,59,60,63,65,67,68]. Also, the performance of Carina Medical LLC INTContour and MVision.ai Contour+ for delineating lymph nodes [52], prostate bed [38,39], and SV bed [38] was deemed acceptable. Nonetheless, highly variable segmentation outcomes are noted for SVs [37,41,46,48,51,52,55,60,63,67,68].
In addition, Table 2 shows that only 18.8% (6/32) of the studies compared the performances between DLAS and manual contouring (expressed in IOV) [37,41,43,46,61,65]. Nevertheless, five (83.3%) out of these six studies revealed that the DLAS performance at least matched that of the manual segmentation (represented in IOV), and it outperformed the observers occasionally [37,41,46,65].
Table 3 illustrates that more than 90.0% (29/32) of the papers evaluated the DLAS products with more clinically meaningful approaches (subjective accuracy, efficiency, and/or dosimetric impact assessment) [37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,59,60,61,62,63,64,65,66,67,68]. Except for Walker et al.’s [49] and Wang et al.’s [51] studies on Mirada Medical Ltd. DLCExpert, 18 (90.0%) out of 20 articles showed that a large proportion of the structures contoured by Carina Medical LLC INTContour [37,38,39,65,67], Limbus AI Inc. Contour [42,43,45,46,66], Manteia Medical Technologies Co. AccuContour [67], MIM Software Inc. Contour ProtégéAI [48,67], Mirada Medical Ltd. DLCExpert [50], MVision.ai Contour+ [54,66], RaySearch Laboratories AB RayStation [65], Siemens Healthineers AG AI-Rad Companion Organs RT [60] and syngo.via RT Image Suite [59], Therapanacea Annotate ART-Plan [62,63], and Varian Medical Systems, Inc. Ethos [64] required no or minor adjustments. Thus, the DLAS contributed to substantial efficiency gain for the PCa RT planning [41,43,44,45,48,49,50,51,52,54,55,59,60,61,66,68]. The average delineation time reduction per patient reported by 16 out of the 32 (50.0%) papers ranged from 5.7% (Siemens Healthineers AG AI-Rad Companion Organs RT [60]) to 99.8% (Therapanacea Annotate ART-Plan [68]) [41,43,44,45,48,49,50,51,52,54,55,59,60,61,66,68]. However, when excluding the articles not reporting the contour review and adjustment time [41,55], and those merely indicating the review and correction time [45,49,50,54,68], as per the NRG Oncology recommendation [21], the average contouring time reduction range became 5.7% (Siemens Healthineers AG AI-Rad Companion Organs RT [60])–81.1% (Limbus AI Inc. Contour [66]). For the DLAS dosimetric impact, only seven out of the 32 (21.9%) papers investigated this [37,44,45,53,56,61,64]. All but two of these articles indicated no clinically relevant (> 5%) or statistically significant dose difference to the delineated structures (except anal canal) between the manual contouring and DLAS [37,45,53,56,64].
Table 4 shows the included articles’ characteristics. CT images were used for the DLAS software evaluation in 29 out of the 32 (90.6%) studies. However, cone beam CT (CBCT) [39,46,64] and MRI [63] were also used in several (four) papers. Two of these studies finetuned the commercial software products (Carina Medical LLC INTContour and Therapanacea Annotate ART-Plan) originally designed for CT to handle the CBCT [39] and MRI image segmentation in 2024 [63]. The other two employed the models already trained with the CBCT images by the companies (Limbus AI Inc. Contour v1.5.0-D2 and Varian Medical Systems, Inc. Ethos) [46,64]. In total, the DLAS model finetuning was conducted in eight out of the 32 (25.0%) articles [37,38,39,51,53,63,65,67].
Nearly 80.0% (25/32) of the included papers were published in the previous three years (Table 4) [37,38,39,44,45,46,48,49,50,51,53,54,55,56,57,58,59,60,61,62,63,65,66,67,68]. However, these articles had some methodological weaknesses, e.g., only two (6.3%) prospective [42,44] and four (12.5%) multi-centre studies [42,49,52,58], none with sample size calculation for testing dataset [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68], 84.4% (27/32) of the papers with fewer than 100 patients for software testing [37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,54,55,56,59,60,61,62,63,64,65,66,68] (even four studies with ≤ 10 patients [44,46,55,62]), etc. Nevertheless, 27 out of the 32 (84.4%) articles performed the external testing [38,39,40,41,42,43,44,45,47,48,49,50,52,53,54,55,57,58,59,60,61,62,64,65,66,67,68]. In addition, the model testing datasets of the included articles (except Radici et al. not specifying the source [46]) were obtained from 14 different countries [37,38,39,40,41,42,43,44,45,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68] on four continents: Asia [47,52,55,60,68], Australia [50], Europe [40,44,45,48,49,52,53,54,56,58,59,61,62,63,66], and North America [37,38,39,41,42,43,51,57,64,65,67]. The mix of the merits and deficiencies of these papers resulted in their quality scores between 40.0% and 72.0% (mean: 52.1% and median: 53.0%) [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68]. Figure 2 shows the number of included articles meeting each revised CLAIM 2024 criterion [36].

4. Discussion

This paper is the first systematic review about the commercial DLAS software performance for PCa RT planning, covering 12 products and 32 articles [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68]. Thus, it advances Almeida et al.’s [14] DLAS systematic review on the PCa RT planning, published in 2020, which covered 28 papers on non-commercial DL models, and Matoska et al.’s [15] critical review, published in 2024, which included 12 articles about CTV DLAS for PCa with only four studies evaluating three commercial products (Carina Medical LLC INTContour, Limbus AI Inc. Contour, and Siemens Healthineers AG AI-Rad Companion Organs RT). Another strength of this review is that about 80.0% of the included papers were published in the previous three years. Therefore, it can provide more relevant and recent findings to inform clinical practice (Table 2, Table 3 and Table 4) [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68].
As per Table 2 (geometric accuracy findings), the commercial DLAS products were able to delineate 12 OARs (abdominopelvic cavity [59], anal canal [45,63], bladder [37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,55,57,58,59,60,61,63,66,67,68], body [59], cauda equina [68], L and R femurs [45,63], L and R pelvis [63], L and R proximal femurs [61], and sacrum [63]), and four CTVs (prostate [37,39,40,41,42,45,46,48,51,52,55,57,59,60,63,65,67,68], lymph nodes [52], prostate bed [38,39], and SV bed [38]) with the clinically acceptable outcomes (DSC ≥ 0.70) [22,44,45,51,60,68,69]. The performance of the commercial products covered in this review for the prostate (CTV) segmentation (DSC: 0.74–0.91) [37,39,40,41,42,45,46,48,51,52,55,57,59,60,63,65,67,68] is comparable to the findings of Almeida et al.’s [14] systematic review (DSC: 0.65–0.95) and Matoska et al.’s [15] critical review (DSC: 0.70–0.90). For the other CTVs as well as OARs, neither of those review papers reported the respective DSC findings except for penile bulb with a DSC value of 0.72 [14,15], which is within the corresponding DSC range of 0.39–0.73 illustrated in Table 2 [37,38,44,51,52,55,63,67]. The unsatisfactory DSC results for penile bulb can be attributed to its small size and the low soft tissue contrast of CT for its visualisation [14,38,52,55,67]. The small size of a structure and the low contrast of CT also explain the low DSC values of SVs and penile root in some settings, respectively [46,51,68]. The suboptimal DSC values of SVs as well as other structures including L and R femoral heads, large and small bowels, rectum, and sigmoid could also be due to the variation in delineation guidelines for the DLAS model development and evaluation [41,49,52,57,67].
The DSC was the most common metric for determining the geometric accuracy of structures delineated by the DLAS software in the included studies (Table 2) [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,55,57,58,59,60,61,63,65,66,67,68]. This is consistent with the findings of Ng’s [22] systematic review on the commercial DLAS software for breast cancer RT planning because the DSC facilitates the software performance comparison [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,55,57,58,59,60,61,63,65,66,67,68]. Since each geometric accuracy metric has its advantages and disadvantages, more than 80% (26/32) of the included articles employed multiple geometric accuracy metrics to complement the weaknesses of individual ones [37,38,39,41,42,44,45,46,47,48,49,50,51,52,53,55,56,57,58,60,61,63,65,66,67,68]. However, discussion of their strengths and weaknesses is not within the scope of this review, and it is available elsewhere [21,71,72]. Nevertheless, all geometric metrics are considered less clinically meaningful because the main purpose of employing the DLAS is to automatically contour OARs and CTVs that meet the local clinicians’ requirements for providing effective RT treatments, leading to segmentation time reduction as a result of minimal manual adjustments for the contours [12,21,37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,59,60,61,62,63,64,65,66,67,68].
Table 2 and Table 3 show that more than four fifths (26/32) of the included studies conducted geometric accuracy as well as subjective accuracy, efficiency, and/or dosimetric evaluations, and hence they were able to meet the NRG Oncology recommendation for more thorough assessment of the DLAS product performance. However, none of these papers used all evaluation approaches stated in Table 2 and Table 3, which is not ideal [12,21,37,38,39,41,42,43,44,45,46,48,49,50,51,52,53,55,56,59,60,61,63,64,65,66,67,68]. A good example for the comprehensive evaluation is illustrated in Almberg et al.’s [73] study on the evaluation of RaySearch Laboratories AB RayStation DLAS performance for breast cancer RT [12,21,22]. Nevertheless, these studies’ findings in Table 3 match the corresponding ones in Table 2, in general [12,21,37,38,39,41,42,43,44,45,46,48,49,50,51,52,53,55,56,59,60,61,63,64,65,66,67,68]. Thus, nine DLAS products, Carina Medical LLC INTContour [37,38,39,65,67], Limbus AI Inc. Contour [41,42,43,44,45,46,66], Manteia Medical Technologies Co. AccuContour [67], MIM Software Inc. Contour ProtégéAI [47,48,67], MVision.ai Contour+ [52,53,54,55,66,68], RaySearch Laboratories AB RayStation [56,57,65,68], Siemens Healthineers AG AI-Rad Companion Organs RT/syngo.via RT Image Suite/DirectORGANS [58,59,60,61], Therapanacea Annotate ART-Plan [62,63,68], and Varian Medical Systems, Inc. Ethos [64], should be useful for PCa RT planning in clinical settings.
Table 4 illustrates that a quarter of studies (all of which were published in the last three years) evaluated the commercial products (Carina Medical LLC INTContour [37,38,39,65,67], Mirada Medical Ltd. DLCExpert [51], MVision.ai Contour+ [53], and Therapanacea Annotate ART-Plan [63]) finetuned by additional datasets. Except for Carina Medical LLC INTContour, which provides a model finetuning function to end users (clinicians) [39], the other three products require their developers to do so through collaboration arrangements [51,53,63]. Model finetuning is an effective strategy for addressing the aforementioned issue of the commercial DLAS products, mismatch of contouring guidelines used by the developers and end users. Through re-training (finetuning) the commercial models with the end users’ datasets, the finetuned models would be able to meet their needs, resulting in better geometric and subjective accuracy, efficiency, and dosimetric outcomes [22,37,53,63,65]. For example, Duan et al. [65] used 57 intact PCa patients’ CT images to finetune the Carina Medical LLC INTContour original model. Their finetuned model outperformed the original one in terms of DSC, Hausdorff distance (HD), 95-percentile HD (HD95), 98-percentile HD (HD98), mean surface distance (MSD) and surface DSC (sDSC), and experienced ROs regarding HD95 and HD98 (p < 0.05) with no statistically significant differences for DSC, HD, MSD, and sDSC.
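The distance-based metrics named above can be illustrated with a small sketch. It treats each contour as a list of surface points and assumes one common convention: HD is the larger of the two directed maximum nearest-point distances, and HD95 replaces each maximum with the 95th percentile (nearest-rank method). Definitions of HD95 differ between tools, and the included studies may have used other variants, so the function names and conventions here are assumptions rather than the studies' implementations.

```python
import math


def directed_distances(source_points, target_points):
    """Distance from every source point to its nearest target point."""
    return [
        min(math.dist(p, q) for q in target_points)
        for p in source_points
    ]


def nearest_rank_percentile(values, percent):
    """Percentile by the nearest-rank method (percent in (0, 100])."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100.0))
    return ordered[rank - 1]


def hausdorff_distance(points_a, points_b, percent=100.0):
    """Symmetric HD (percent=100) or percentile HD such as HD95 (percent=95)."""
    forward = directed_distances(points_a, points_b)
    backward = directed_distances(points_b, points_a)
    return max(
        nearest_rank_percentile(forward, percent),
        nearest_rank_percentile(backward, percent),
    )


# Toy 2D contours: B is A shifted by 1 unit, plus one outlying point.
contour_a = [(i, 0) for i in range(20)]
contour_b = [(i, 1) for i in range(19)] + [(19, 6)]

hd = hausdorff_distance(contour_a, contour_b)
hd95 = hausdorff_distance(contour_a, contour_b, percent=95.0)
print(hd, hd95)
```

In this toy case the single outlying point dominates HD, whereas HD95 reflects the 1-unit shift shared by the other points, which is why percentile variants are often reported alongside HD.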
Another important role of DLAS model finetuning is to enable automatic MRI and CBCT image segmentation to meet the increasing clinical demand for online adaptive RT (ART). Between treatment fractions, the prostate and SVs can move and deform. Hence, structures need to be re-contoured on new MRI/CBCT images acquired by linear accelerators during treatment sessions, which is less practical with manual segmentation [39,46,63,64]. It is noted that all DLAS products covered in this review except Varian Medical Systems, Inc. Ethos were originally trained with CT images only [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68]. Although CT and CBCT images appear similar, the image contrast of CBCT is lower than that of CT. Without finetuning, this greatly affects the DLAS model performance [46]. In contrast, MRI has better soft tissue contrast, which is good for DLAS [52]. However, the image characteristics of CT and MRI are very different, so model finetuning is again essential [63].
The quality scores of all included studies ranged from 40.0% to 72.0% (mean: 52.1% and median: 53.0%) (Table 4) [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68]. They are similar to those (range: 44.0–65.0%, mean: 53.9%, and median: 53.0%) of Ng’s [22] systematic review on commercial DLAS software for breast cancer RT planning, and comparable to the CLAIM-based quality scores of the articles included in Sivanesan et al.’s [34] (median: 57.0%) and Bhandari et al.’s [35] (range: 23.8–73.8% and mean: 47.6%) literature reviews on AI in medical imaging and MRI. Also, the issues of the papers included in this review are similar to those identified in the reviews by Ng [22], Sivanesan et al. [34], and Bhandari et al. [35], such as the absence of a sample size calculation in all studies and limited disclosure of dataset collection details to assure no test set contamination [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,74].
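As an illustration of the sample size calculation noted above as missing from the included studies, the sketch below estimates how many test patients a paired comparison (e.g., the DSC of a DLAS contour versus a manual contour on the same patients) might need. It is a normal-approximation sketch under stated assumptions, not a method taken from any of the reviewed papers. The function name `paired_sample_size` and the example inputs (a minimum DSC difference of 0.02 and a standard deviation of paired differences of 0.05) are hypothetical values chosen for illustration only.

```python
import math
from statistics import NormalDist


def paired_sample_size(min_difference, sd_of_differences, alpha=0.05, power=0.80):
    """Approximate number of paired cases for a two-sided paired comparison.

    Uses n = ((z_{1-alpha/2} + z_{power}) * sd / delta) ** 2, rounded up.
    The normal approximation slightly underestimates the t-test-based value
    for small n, so the result is best read as a lower bound.
    """
    if min_difference <= 0 or sd_of_differences <= 0:
        raise ValueError("min_difference and sd_of_differences must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    n = ((z_alpha + z_power) * sd_of_differences / min_difference) ** 2
    return math.ceil(n)


# Hypothetical inputs: detect a mean DSC difference of 0.02 when the paired
# differences have a standard deviation of 0.05.
print(paired_sample_size(0.02, 0.05))  # 50
```

Under these assumed inputs, about 50 paired cases would be needed, which is double the largest dataset (≤ 25 patients) used in the single studies of the two products singled out in this review for further local evaluation.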
According to NRG Oncology [21], each clinical centre should perform its own evaluation of commercial DLAS software prior to implementation due to potential mismatch of delineation guidelines employed for software development and local clinical practice. However, such evaluation seems more important for Manteia Medical Technologies Co. AccuContour [67] and Varian Medical Systems, Inc. Ethos [64]. This is because they were only tested with small datasets (≤ 25 PCa patients) in single studies, implying that the testing results might not be generalizable [15,64,67]. Also, Moazzezi et al. [64] only collected the data retrospectively, and Duan et al. [67] did not specify whether their data were collected retrospectively or prospectively.
There are two major limitations in this systematic review. First, a single author with more than 20 years of experience in conducting literature reviews selected the papers and extracted and synthesised the data [22,23,24,25,30]. However, as per a methodological systematic review, this is considered appropriate when these processes are handled by an experienced single reviewer [22,23,24,25,30,75,76]. In addition, any potential bias could be further addressed by following the PRISMA guidelines [26] and by using a data extraction form based on four review papers [12,14,15,22] and CLAIM [33,36]. Second, only articles written in English were included, which might affect the comprehensiveness of this review [22,23,24,25,30]. Nevertheless, 32 papers on DLAS for PCa RT were covered in this review. This number is larger than those of Almeida et al.’s [14] and Matoska et al.’s [15] review articles. Furthermore, this paper included studies from four continents [37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68].

5. Conclusions

This systematic review covers 12 commercial DLAS products for PCa RT planning, namely Carina Medical LLC INTContour, Elekta AB ADMIRE, Limbus AI Inc. Contour, Manteia Medical Technologies Co. AccuContour, MIM Software Inc. Contour ProtégéAI, Mirada Medical Ltd. DLCExpert, MVision.ai Contour+, Radformation Inc. AutoContour, RaySearch Laboratories AB RayStation, Siemens Healthineers AG AI-Rad Companion Organs RT, syngo.via RT Image Suite and DirectORGANS, Therapanacea Annotate, and Varian Medical Systems, Inc. Ethos. As per the geometric accuracy evaluation findings of the included papers, collectively, these products can delineate 12 OARs (abdominopelvic cavity, anal canal, bladder, body, cauda equina, L and R femurs, L and R pelvis, L and R proximal femurs, and sacrum) and four CTVs (prostate, lymph nodes, prostate bed, and SV bed) with clinically acceptable outcomes. These geometric accuracy results match the respective subjective and dosimetric assessment findings. They result in an average delineation time reduction per patient of 5.7–81.1%. In addition, this review’s findings reveal that Therapanacea Annotate ART-Plan is capable of delineating 13 OARs as well as the prostate with clinically acceptable outcomes. This contributes to a minimal need for structure contour adjustment and a significant segmentation time reduction. It also has model finetuning potential. Although NRG Oncology has recommended that each clinical centre should perform its own evaluation of commercial DLAS software prior to implementation due to potential mismatch of delineation guidelines employed for software development and local clinical practice, such evaluation seems more important for Manteia Medical Technologies Co. AccuContour and Varian Medical Systems, Inc. Ethos because of the methodological issues of the single studies that investigated them.
In the future, comprehensive evaluations (including geometric accuracy, subjective, efficiency, and dosimetric assessments) with traditional power calculations to determine the external testing dataset size and prospective data collection from multiple centres should be conducted. Also, further studies on commercial model finetuning and reviews on non-commercial DLAS software are encouraged. It is expected that this review’s results will assist clinical centres in selecting commercial DLAS products for further evaluation prior to implementation.

Funding

This work received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Prostate Cancer Statistics. Available online: https://www.wcrf.org/preventing-cancer/cancer-statistics/prostate-cancer-statistics/ (accessed on 15 January 2025).
  2. Cancer in Men: Prostate Cancer is #1 for 118 Countries Globally. Available online: https://www.cancer.org/research/acs-research-news/prostate-cancer-is-number-1-for-118-countries-worldwide.html (accessed on 15 January 2025).
  3. Prostate Cancer: 3. Epidemiology and Aetiology. Available online: https://uroweb.org/guidelines/prostate-cancer/chapter/epidemiology-and-aetiology (accessed on 15 January 2025).
  4. Wasim, S.; Park, J.; Nam, S.; Kim, J. Review of current treatment intensification strategies for prostate cancer patients. Cancers 2023, 15, 5615.
  5. Hekman, L.; Barrett, A.; Ross, D.; Palaganas, E.; Giridhar, P.; Elumalai, T.; Pragathee, V.; Block, A.M.; Welsh, J.S.; Harkenrider, M.M.; et al. A systematic review of clinical trials comparing radiation therapy versus radical prostatectomy in prostate cancer. Clin. Genitourin. Cancer 2024, 22, 102157.
  6. Numakura, K.; Kobayashi, M.; Muto, Y.; Sato, H.; Sekine, Y.; Sobu, R.; Aoyama, Y.; Takahashi, Y.; Okada, S.; Sasagawa, H.; et al. The current trend of radiation therapy for patients with localized prostate cancer. Curr. Oncol. 2023, 30, 8092–8110.
  7. Fassia, M.K.; Balasubramanian, A.; Woo, S.; Vargas, H.A.; Hricak, H.; Konukoglu, E.; Becker, A.S. Deep learning prostate MRI segmentation accuracy and robustness: A systematic review. Radiol. Artif. Intell. 2024, 6, e230138.
  8. Nayagam, R.D.; Selvathi, D. A systematic review of deep learning methods for the classification and segmentation of prostate cancer on magnetic resonance images. Int. J. Imaging Syst. Technol. 2024, 34, e23064.
  9. Khan, Z.; Yahya, N.; Alsaih, K.; Al-Hiyali, M.I.; Meriaudeau, F. Recent automatic segmentation algorithms of MRI prostate regions: A review. IEEE Access 2021, 9, 97878–97905.
  10. Kalantar, R.; Lin, G.; Winfield, J.M.; Messiou, C.; Lalondrelle, S.; Blackledge, M.D.; Koh, D.M. Automatic segmentation of pelvic cancers using deep learning: State-of-the-art approaches and challenges. Diagnostics 2021, 11, 1964.
  11. Wu, C.; Montagne, S.; Hamzaoui, D.; Ayache, N.; Delingette, H.; Renard-Penna, R. Automatic segmentation of prostate zonal anatomy on MRI: A systematic review of the literature. Insights Imaging 2022, 13, 202.
  12. De Biase, A.; Sijtsema, N.M.; Janssen, T.; Hurkmans, C.; Brouwer, C.; van Ooijen, P. Clinical adoption of deep learning target auto-segmentation for radiation therapy: Challenges, clinical risks, and mitigation strategies. BJR Artif. Intell. 2024, 1, ubae015.
  13. Isaksson, L.J.; Summers, P.; Mastroleo, F.; Marvaso, G.; Corrao, G.; Vincini, M.G.; Zaffaroni, M.; Ceci, F.; Petralia, G.; Orecchia, R.; et al. Automatic segmentation with deep learning in radiotherapy. Cancers 2023, 15, 4389.
  14. Almeida, G.; Tavares, J.M.R.S. Deep learning in radiation oncology treatment planning for prostate cancer: A systematic review. J. Med. Syst. 2020, 44, 179.
  15. Matoska, T.; Patel, M.; Liu, H.; Beriwal, S. Review of deep learning based autosegmentation for clinical target volume: Current status and future directions. Adv. Radiat. Oncol. 2024, 9, 101470.
  16. Ng, C.K.C.; Leung, V.W.S.; Hung, R.H.M. Clinical evaluation of deep learning and atlas-based auto-contouring for head and neck radiation therapy. Appl. Sci. 2022, 12, 11681.
  17. Erdur, A.C.; Rusche, D.; Scholz, D.; Kiechle, J.; Fischer, S.; Llorián-Salvador, Ó.; Buchner, J.A.; Nguyen, M.Q.; Etzel, L.; Weidner, J.; et al. Deep learning for autosegmentation for radiotherapy treatment planning: State-of-the-art and novel perspectives. Strahlenther. Onkol. 2025, 201, 236–254.
  18. Sun, Z.; Ng, C.K.C. Artificial intelligence (enhanced super-resolution generative adversarial network) for calcium deblooming in coronary computed tomography angiography: A feasibility study. Diagnostics 2022, 12, 991.
  19. Sun, Z.; Ng, C.K.C. Finetuned super-resolution generative adversarial network (artificial intelligence) model for calcium deblooming in coronary computed tomography angiography. J. Pers. Med. 2022, 12, 1354.
  20. Leung, V.W.S.; Ng, C.K.C.; Lam, S.K.; Wong, P.T.; Ng, K.Y.; Tam, C.H.; Lee, T.C.; Chow, K.C.; Chow, Y.K.; Tam, V.C.W.; et al. Computed tomography-based radiomics for long-term prognostication of high-risk localized prostate cancer patients received whole pelvic radiotherapy. J. Pers. Med. 2023, 13, 1643.
  21. Rong, Y.; Chen, Q.; Fu, Y.; Yang, X.; Al-Hallaq, H.A.; Wu, Q.J.; Yuan, L.; Xiao, Y.; Cai, B.; Latifi, K.; et al. NRG Oncology assessment of artificial intelligence deep learning-based auto-segmentation for radiation therapy: Current developments, clinical considerations, and future directions. Int. J. Radiat. Oncol. Biol. Phys. 2024, 119, 261–280.
  22. Ng, C.K.C. Performance of commercial deep learning-based auto-segmentation software for breast cancer radiation therapy planning: A systematic review. Multimodal Technol. Interact. 2024, 8, 114.
  23. Ng, C.K.C. Artificial intelligence for radiation dose optimization in pediatric radiology: A systematic review. Children 2022, 9, 1044.
  24. Ng, C.K.C. Diagnostic performance of artificial intelligence-based computer-aided detection and diagnosis in pediatric radiology: A systematic review. Children 2023, 10, 525.
  25. Ng, C.K.C. Generative adversarial network (generative artificial intelligence) in pediatric radiology: A systematic review. Children 2023, 10, 1372.
  26. PRISMA Statement. Available online: https://www.prisma-statement.org/ (accessed on 17 January 2025).
  27. Aggarwal, R.; Sounderajah, V.; Martin, G.; Ting, D.S.W.; Karthikesalingam, A.; King, D.; Ashrafian, H.; Darzi, A. Diagnostic accuracy of deep learning in medical imaging: A systematic review and meta-analysis. NPJ Digit. Med. 2021, 4, 65.
  28. Ma, L.; Liu, Y.; Zhang, X.; Ye, Y.; Yin, G.; Johnson, B.A. Deep learning in remote sensing applications: A meta-analysis and review. ISPRS J. Photogramm. Remote Sens. 2019, 152, 166–177.
  29. Shah, K.A.; Ng, C.K.C. Workplace violence in medical radiation science: A systematic review. Radiography 2024, 30, 440–447.
  30. Ng, C.K.C. A review of the impact of the COVID-19 pandemic on pre-registration medical radiation science education. Radiography 2022, 28, 222–231.
  31. Vasey, B.; Ursprung, S.; Beddoe, B.; Taylor, E.H.; Marlow, N.; Bilbro, N.; Watkinson, P.; McCulloch, P. Association of clinician diagnostic performance with machine learning-based decision support systems: A systematic review. JAMA Netw. Open. 2021, 4, e211276.
  32. Imrey, P.B. Limitations of meta-analyses of studies with high heterogeneity. JAMA Netw. Open. 2020, 3, e1919325.
  33. Mongan, J.; Moy, L.; Kahn, C.E., Jr. Checklist for artificial intelligence in medical imaging (CLAIM): A guide for authors and reviewers. Radiol. Artif. Intell. 2020, 2, e200029.
  34. Sivanesan, U.; Wu, K.; McInnes, M.D.F.; Dhindsa, K.; Salehi, F.; van der Pol, C.B. Checklist for artificial intelligence in medical imaging reporting adherence in peer-reviewed and preprint manuscripts with the highest Altmetric Attention Scores: A meta-research study. Can. Assoc. Radiol. J. 2023, 74, 334–342.
  35. Bhandari, A.; Scott, L.; Weilbach, M.; Marwah, R.; Lasocki, A. Assessment of artificial intelligence (AI) reporting methodology in glioma MRI studies using the Checklist for AI in Medical Imaging (CLAIM). Neuroradiology 2023, 65, 907–913.
  36. Tejani, A.S.; Klontzas, M.E.; Gatti, A.A.; Mongan, J.T.; Moy, L.; Park, S.H.; Kahn, C.E. Jr; CLAIM 2024 Update Panel. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): 2024 update. Radiol. Artif. Intell. 2024, 6, e240300.
  37. Duan, J.; Bernard, M.; Downes, L.; Willows, B.; Feng, X.; Mourad, W.F.; St Clair, W.; Chen, Q. Evaluating the clinical acceptability of deep learning contours of prostate and organs-at-risk in an automated prostate treatment planning process. Med. Phys. 2022, 49, 2570–2581.
  38. Hobbis, D.; Yu, N.Y.; Mund, K.W.; Duan, J.; Rwigema, J.M.; Wong, W.W.; Schild, S.E.; Keole, S.R.; Feng, X.; Chen, Q.; et al. First report on physician assessment and clinical acceptability of custom-retrained artificial intelligence models for clinical target volume and organs-at-risk auto-delineation for postprostatectomy patients. Pract. Radiat. Oncol. 2023, 13, 351–362.
  39. Tegtmeier, R.C.; Kutyreff, C.J.; Smetanick, J.L.; Hobbis, D.; Laughlin, B.S.; Toesca, D.A.S.; Clouser, E.L.; Rong, Y. Custom-trained deep learning-based auto-segmentation for male pelvic iterative CBCT on c-arm linear accelerators. Pract. Radiat. Oncol. 2024, 14, e383–e394.
  40. Jenkins, A.; Mullen, T.S.; Johnson-Hart, C.; Green, A.; McWilliam, A.; Aznar, M.; van Herk, M.; Vasquez Osorio, E. Novel methodology to assess the effect of contouring variation on treatment outcome. Med. Phys. 2021, 48, 3234–3242.
  41. Wong, J.; Fong, A.; McVicar, N.; Smith, S.; Giambattista, J.; Wells, D.; Kolbeck, C.; Giambattista, J.; Gondara, L.; Alexander, A. Comparing deep learning-based auto-segmentation of organs at risk and clinical target volumes to expert inter-observer variability in radiotherapy planning. Radiother. Oncol. 2020, 144, 152–158.
  42. Wong, J.; Huang, V.; Wells, D.; Giambattista, J.; Giambattista, J.; Kolbeck, C.; Otto, K.; Saibishkumar, E.P.; Alexander, A. Implementation of deep learning-based auto-segmentation for radiotherapy planning structures: A workflow study at two cancer centers. Radiat. Oncol. 2021, 16, 101.
  43. Zabel, W.J.; Conway, J.L.; Gladwish, A.; Skliarenko, J.; Didiodato, G.; Goorts-Matthews, L.; Michalak, A.; Reistetter, S.; King, J.; Nakonechny, K.; et al. Clinical evaluation of deep learning and atlas-based auto-contouring of bladder and rectum for prostate radiation therapy. Pract. Radiat. Oncol. 2021, 11, e80–e89.
  44. Radici, L.; Ferrario, S.; Borca, V.C.; Cante, D.; Paolini, M.; Piva, C.; Baratto, L.; Franco, P.; La Porta, M.R. Implementation of a commercial deep learning-based auto segmentation software in radiotherapy: Evaluation of effectiveness and impact on workflow. Life 2022, 12, 2088.
  45. Hoque, S.M.H.; Pirrone, G.; Matrone, F.; Donofrio, A.; Fanetti, G.; Caroli, A.; Rista, R.S.; Bortolus, R.; Avanzo, M.; Drigo, A.; et al. Clinical use of a commercial artificial intelligence-based software for autocontouring in radiation therapy: Geometric performance and dosimetric impact. Cancers 2023, 15, 5735.
  46. Radici, L.; Piva, C.; Casanova Borca, V.; Cante, D.; Ferrario, S.; Paolini, M.; Cabras, L.; Petrucci, E.; Franco, P.; La Porta, M.R.; et al. Clinical evaluation of a deep learning CBCT auto-segmentation software for prostate adaptive radiation therapy. Clin. Transl. Radiat. Oncol. 2024, 47, 100796.
  47. Urago, Y.; Okamoto, H.; Kaneda, T.; Murakami, N.; Kashihara, T.; Takemori, M.; Nakayama, H.; Iijima, K.; Chiba, T.; Kuwahara, J.; et al. Evaluation of auto-segmentation accuracy of cloud-based artificial intelligence and atlas-based models. Radiat. Oncol. 2021, 16, 175.
  48. Palazzo, G.; Mangili, P.; Deantoni, C.; Fodor, A.; Broggi, S.; Castriconi, R.; Ubeira Gabellini, M.G.; Del Vecchio, A.; Di Muzio, N.G.; Fiorino, C. Real-world validation of artificial intelligence-based computed tomography auto-contouring for prostate cancer radiotherapy planning. Phys. Imaging Radiat. Oncol. 2023, 28, 100501.
  49. Walker, Z.; Bartley, G.; Hague, C.; Kelly, D.; Navarro, C.; Rogers, J.; South, C.; Temple, S.; Whitehurst, P.; Chuter, R. Evaluating the effectiveness of deep learning contouring across multiple radiotherapy centres. Phys. Imaging Radiat. Oncol. 2022, 24, 121–128.
  50. Gibbons, E.; Hoffmann, M.; Westhuyzen, J.; Hodgson, A.; Chick, B.; Last, A. Clinical evaluation of deep learning and atlas-based auto-segmentation for critical organs at risk in radiation therapy. J. Med. Radiat. Sci. 2023, 70, 15–25.
  51. Wang, Y.; Boyd, G.; Zieminski, S.; Kamran, S.C.; Zietman, A.L.; Miyamoto, D.T.; Kirk, M.C.; Efstathiou, J.A. A pair of deep learning auto-contouring models for prostate cancer patients injected with a radio-transparent versus radiopaque hydrogel spacer. Med. Phys. 2023, 50, 3324–3337.
  52. Kiljunen, T.; Akram, S.; Niemelä, J.; Löyttyniemi, E.; Seppälä, J.; Heikkilä, J.; Vuolukka, K.; Kääriäinen, O.S.; Heikkilä, V.P.; Lehtiö, K.; et al. A deep learning-based automated CT segmentation of prostate cancer anatomy for radiation therapy planning-A retrospective multicenter study. Diagnostics 2020, 10, 959.
  53. Elisabeth Olsson, C.; Suresh, R.; Niemelä, J.; Akram, S.U.; Valdman, A. Autosegmentation based on different-sized training datasets of consistently-curated volumes and impact on rectal contours in prostate cancer radiation therapy. Phys. Imaging Radiat. Oncol. 2022, 22, 67–72.
  54. Strolin, S.; Santoro, M.; Paolani, G.; Ammendolia, I.; Arcelli, A.; Benini, A.; Bisello, S.; Cardano, R.; Cavallini, L.; Deraco, E.; et al. How smart is artificial intelligence in organs delineation? Testing a CE and FDA-approved deep-learning tool using multiple expert contours delineated on planning CT images. Front. Oncol. 2023, 13, 1089807.
  55. Miura, H.; Ishihara, S.; Kenjo, M.; Nakao, M.; Ozawa, S.; Kagemoto, M. Evaluation of the accuracy of automated segmentation based on deep learning for prostate cancer patients. Med. Dosim. 2025, 50, 91–95.
  56. De Kerf, G.; Claessens, M.; Raouassi, F.; Mercier, C.; Stas, D.; Ost, P.; Dirix, P.; Verellen, D. A geometry and dose-volume based performance monitoring of artificial intelligence models in radiotherapy treatment planning for prostate cancer. Phys. Imaging Radiat. Oncol. 2023, 28, 100494.
  57. Kanwar, A.; Merz, B.; Claunch, C.; Rana, S.; Hung, A.; Thompson, R.F. Stress-testing pelvic autosegmentation algorithms using anatomical edge cases. Phys. Imaging Radiat. Oncol. 2023, 25, 100413.
  58. Marschner, S.; Datar, M.; Gaasch, A.; Xu, Z.; Grbic, S.; Chabin, G.; Geiger, B.; Rosenman, J.; Corradini, S.; Niyazi, M.; et al. A deep image-to-image network organ segmentation algorithm for radiation treatment planning: Principles and evaluation. Radiat. Oncol. 2022, 17, 129.
  59. Pera, Ó.; Martínez, Á.; Möhler, C.; Hamans, B.; Vega, F.; Barral, F.; Becerra, N.; Jimenez, R.; Fernandez-Velilla, E.; Quera, J.; et al. Clinical validation of Siemens’ Syngo.via automatic contouring system. Adv. Radiat. Oncol. 2023, 8, 101177.
  60. Yamauchi, R.; Itazawa, T.; Kobayashi, T.; Kashiyama, S.; Akimoto, H.; Mizuno, N.; Kawamori, J. Clinical evaluation of deep learning and atlas-based auto-segmentation for organs at risk delineation. Med. Dosim. 2024, 49, 167–176.
  61. Berenato, S.; Williams, M.; Woodley, O.; Möhler, C.; Evans, E.; Millin, A.E.; Wheeler, P.A. Novel dosimetric validation of a commercial CT scanner based deep learning automated contour solution for prostate radiotherapy. Phys. Med. 2024, 122, 103339.
  62. Künzel, L.A.; Nachbar, M.; Hagmüller, M.; Gani, C.; Boeke, S.; Wegener, D.; Paulsen, F.; Zips, D.; Thorwarth, D. Clinical evaluation of autonomous, unsupervised planning integrated in MR-guided radiotherapy for prostate cancer. Radiother. Oncol. 2022, 168, 229–233.
  63. Nachbar, M.; Lo Russo, M.; Gani, C.; Boeke, S.; Wegener, D.; Paulsen, F.; Zips, D.; Roque, T.; Paragios, N.; Thorwarth, D. Automatic AI-based contouring of prostate MRI for online adaptive radiotherapy. Z. Med. Phys. 2024, 34, 197–207.
  64. Moazzezi, M.; Rose, B.; Kisling, K.; Moore, K.L.; Ray, X. Prospects for daily online adaptive radiotherapy via ethos for prostate cancer patients without nodal involvement using unedited CBCT auto-segmentation. J. Appl. Clin. Med. Phys. 2021, 22, 82–93.
  65. Duan, J.; Tegtmeier, R.C.; Vargas, C.E.; Yu, N.Y.; Laughlin, B.S.; Rwigema, J.M.; Anderson, J.D.; Zhu, L.; Chen, Q.; Rong, Y. Achieving accurate prostate auto-segmentation on CT in the absence of MR imaging. Radiother. Oncol. 2025, 202, 110588.
  66. Bordigoni, B.; Trivellato, S.; Pellegrini, R.; Meregalli, S.; Bonetto, E.; Belmonte, M.; Castellano, M.; Panizza, D.; Arcangeli, S.; De Ponti, E. Automated segmentation in pelvic radiotherapy: A comprehensive evaluation of ATLAS-, machine learning-, and deep learning-based models. Phys. Med. 2024, 125, 104486.
  67. Duan, J.; Vargas, C.E.; Yu, N.Y.; Laughlin, B.S.; Toesca, D.S.; Keole, S.; Rwigema, J.C.M.; Wong, W.W.; Schild, S.E.; Feng, X.; et al. Incremental retraining, clinical implementation, and acceptance rate of deep learning auto-segmentation for male pelvis in a multiuser environment. Med. Phys. 2023, 50, 4079–4091.
  68. Doolan, P.J.; Charalambous, S.; Roussakis, Y.; Leczynski, A.; Peratikou, M.; Benjamin, M.; Ferentinos, K.; Strouthos, I.; Zamboglou, C.; Karagiannis, E. A clinical evaluation of the performance of five commercial artificial intelligence contouring systems for radiotherapy. Front. Oncol. 2023, 13, 1213068.
  69. Van Dijk, L.V.; Van den Bosch, L.; Aljabar, P.; Peressutti, D.; Both, S.; Steenbakkers, R.J.H.M.; Langendijk, J.A.; Gooding, M.J.; Brouwer, C.L. Improving automatic delineation for head and neck organs at risk by deep learning contouring. Radiother. Oncol. 2020, 142, 115–123.
  70. Gay, H.A.; Jin, J.Y.; Chang, A.J.; Ten Haken, R.K. Utility of normal tissue-to-tumor a/b Ratio when evaluating isodoses of isoeffective radiation therapy treatment plans. Int. J. Radiat. Oncol. Biol. Phys. 2012, 85, e81–e87.
  71. Lin, H.; Xiao, H.; Dong, L.; Teo, K.B.; Zou, W.; Cai, J.; Li, T. Deep learning for automatic target volume segmentation in radiation therapy: A review. Quant. Imaging Med. Surg. 2021, 11, 4847–4858.
  72. Mackay, K.; Bernstein, D.; Glocker, B.; Kamnitsas, K.; Taylor, A. A review of the metrics used to assess auto-contouring systems in radiotherapy. Clin. Oncol. 2023, 35, 354–369.
  73. Almberg, S.S.; Lervåg, C.; Frengen, J.; Eidem, M.; Abramova, T.M.; Nordstrand, C.S.; Alsaker, M.D.; Tøndel, H.; Raj, S.X.; Wanderås, A.D. Training, validation, and clinical implementation of a deep-learning segmentation model for radiotherapy of loco-regional breast cancer. Radiother. Oncol. 2022, 173, 62–68.
  74. Chai, Y. Letter to the editor regarding the article “comparison of transfer learning models in pelvic tilt and rotation measurement in pediatric anteroposterior pelvic radiographs”. J. Imaging Inform. Med. 2024, 37, 1259–1260.
  75. Waffenschmidt, S.; Knelangen, M.; Sieben, W.; Bühn, S.; Pieper, D. Single screening versus conventional double screening for study selection in systematic reviews: A methodological systematic review. BMC Med. Res. Methodol. 2019, 19, 132.
  76. Sun, Z.; Ng, C.K.C.; Dos Reis, C.S. Synchrotron radiation computed tomography versus conventional computed tomography for assessment of four types of stent grafts used for endovascular treatment of thoracic and abdominal aortic aneurysms. Quant. Imaging Med. Surg. 2018, 8, 609–620.
Figure 1. Preferred reporting items for systematic reviews and meta-analyses flow diagram for this systematic review. DLAS, deep learning-based auto-segmentation; IEEE, Institute of Electrical and Electronics Engineers.
Figure 2. Number of included papers meeting each revised CLAIM 2024 criterion. Radial axis: number of papers. Angular axis: CLAIM 2024 criterion.
Table 1. Selection criteria for articles.
Inclusion Criteria
  • Peer-reviewed, original article
  • With evaluation of commercial deep learning-based auto-segmentation software performance for prostate cancer patients’ radiation therapy planning
  • Written in English
Exclusion Criteria
  • Commentary
  • Conference proceeding
  • Editorial
  • Grey literature
  • Non-peer-reviewed paper (such as those on arXiv platform)
  • Opinion
  • Perspective
  • Review
Table 2. Geometric accuracy of commercial deep learning-based auto-segmentation (DLAS) software products for prostate cancer radiation therapy planning.
| Software Name and Version | Author, Year and Country | Geometric Accuracy: DLAS | Geometric Accuracy: IOV | Geometric Accuracy: DLAS vs. IOV |
| --- | --- | --- | --- | --- |
| Carina Medical LLC | | | | |
| INTContour | Duan et al. (2022), USA [37] | Mean DSC, HD95 (mm), MSD (mm), sDSC, HD90 (mm), HD98 (mm), recall, precision, and VR: prostate (0.83, 5.3, 2.1, 0.75, 4.5, 6.1, 0.84, 0.82, and 1.05); SVs (0.72, 5.7, 2.0, 0.81, 4.8, 6.9, 0.76, 0.70, and 1.09); bladder (0.93, 4.6, 1.3, 0.91, 3.6, 5.7, 0.94, 0.92, and 1.02); L (0.96, 2.4, 0.6, 0.97, 1.8, 2.9, 0.95, 0.98, and 0.97) and R femoral heads (0.97, 2.2, 0.6, 0.97, 1.8, 2.8, 0.96, 0.97, and 0.99); penile bulb (0.53, 6.2, 2.6, 0.28, 5.4, 7.1, 0.55, 0.56, and 1.01); rectum (0.85, 7.6, 2.0, 0.82, 5.4, 10.0, 0.83, 0.88, and 0.94); all structures (0.83, 5.3, 2.1, 0.75, 4.5, 6.1, 0.84, 0.82, and 1.05) | Mean DSC, HD95 (mm), MSD (mm), sDSC, HD90 (mm), HD98 (mm), recall, precision, and VR: all structures (0.80, 6.1, 2.2, 0.78, 5.0, 6.9, 0.93, 0.72, and 1.32) | DLAS outperforming ROs with statistically significantly different precision and VR (p < 0.001) and ns for DSC, HD95, MSD, sDSC, HD90, and HD98 |
| | Hobbis et al. (2023), USA [38] | Median DSC, HD95 (mm), and MSD (mm) for model finetuned by 60 cases (90 for CTVs): prostate and SV beds (0.73, 11.3, and 7.3); bladder (0.98, 2.0, and 0.7); L (0.98, 1.8, and 0.5) and R femoral heads (0.97, 1.1, and 0.6); penile bulb (0.54, 5.7, and 2.7); rectum (0.92, 5.7, and 1.6) | NA | NA |
| | Tegtmeier et al. (2024), USA [39] | Mean DSC, HD95 (mm), MSD (mm), and PVD: prostate (0.91, 2.3, 1.0, and 2.2%); prostate bed (0.75, 9.5, 3.2, and −14.9%); bladder (0.96, 3.4, 1.0, and 0.3%); L and R femoral heads (0.96, 1.1, 0.3, and 4.9%); rectum (0.90, 3.4, 0.8, and 12.0%) | NA | NA |
| Elekta AB | | | | |
| ADMIRE v3.4 | Jenkins et al. (2021), UK [40] | Mean DSC: prostate (0.81) | NA | NA |
| Limbus AI Inc. | | | | |
| Contour v1.0.22 | Wong et al. (2020), Canada [41] | Mean DSC and HD95 (mm): prostate (0.79 and 6.7); SVs (0.64 and 6.0); bladder (0.97 and 3.2); L and R femoral heads (0.91 and 7.1); rectum (0.78 and 12.1) | Mean DSC and HD95 (mm): prostate (0.83 and 5.3); SVs (0.63 and 5.9); bladder (0.96 and 3.0); L and R femoral heads (0.91 and 7.0); rectum (0.79 and 11.3) | DLAS outperforming ROs for bladder and L and R femoral heads with statistically significantly different DSC (p < 0.005–0.018) and ns for DSC and HD95 of SVs and rectum and HD95 of L and R femoral heads |
| | Wong et al. (2021), Canada [42] | Median DSC and HD95 (mm): prostate (0.90 and 4.3); bladder (0.99 and 0.6); L (0.99 and 1.3) and R femoral heads (0.99 and 1.3); rectum (0.95 and 3.0) | NA | NA |
| Contour v1.0.18 | Zabel et al. (2021), Canada [43] | Mean DSC: bladder (0.99); rectum (0.95) | Mean DSC: bladder (0.99); rectum (0.97) | ns |
| Contour v1.5.0 | Radici et al. (2022), Italy [44] | Mean DSC, DCOM (mm), and PVD: bladder (0.89, 2.7, and 9.1%); L (0.95, 1.7, and −6.4%) and R femoral heads (0.92, 0.8, and −4.7%); penile bulb (0.39, 3.3, and 0.7 cm³); rectum (0.77, 5.0, and 11.3%) | NA | NA |
| Contour v1.0.18 | Hoque et al. (2023), Italy [45] | Mean DSC, HD (mm), and PVD: prostate (0.80, 15.4, and 12.0%); anal canal (0.70, 5.4, and 33.0%); bladder (0.94, 4.1, and 3.0%); L (0.78, 18.1, and 50.0%) and R femurs (0.78, 18.0, and 49.0%); rectum (0.83, 14.2, and 15.0%) | Mean DSC, HD (mm), and PVD: prostate (0.91, 5.9, and 9.0%); anal canal (0.91, 3.1, and 12.0%); bladder (0.98, 3.1, and 2.0%); L (0.98, 2.0, and 3.0%) and R femurs (0.98, 2.5, and 3.0%); rectum (0.97, 4.8, and 6.0%) | NA |
| Contour v1.5.0-D2 | Radici et al. (2024), Italy [46] | Mean DSC, DCOM (mm), and PVD for CT/CBCT: prostate (0.74, 6.1, and −11.5%/0.83, 2.5, and −11.8%); SVs (0.59, 6.8, and −40.0%/0.70, 3.9, and −21.0%); bladder (0.89, 3.0, and −9.0%/0.90, 2.2, and −2.0%); L and R femoral heads (0.96, <2.0, and −2.0%/0.96, <2.0, and −5.0%); rectum (0.81, 5.7, and −13.0%/0.86, 3.8, and −15.0%) | Median DSC, DCOM (mm), and PVD: prostate (0.83, 2.5, and −18.5%); SVs (0.71, 2.0, and −26.0%); bladder (0.90, 1.9, and −12.4%); L and R femoral heads (0.93, 2.0, and −4.0%); rectum (0.81, 6.5, and −22.0%) ¹ | DLAS based on CBCT outperforming ROs for L and R femoral heads with statistically significantly different median DSC and DCOM (p < 0.01–0.02) and ns for other structures |
| MIM Software Inc. | | | | |
| ProtégéAI v0.9 | Urago et al. (2021), Japan [47] | Median DSC, HD (mm), MDA (mm), and PVD: bladder (0.95, 6.1, 0.9, and 6.2%); rectum (0.87, 7.9, 1.2, and 17.6%) | NA | NA |
| Contour ProtégéAI v1.1.2 | Palazzo et al. (2023), Italy [48] | Median DSC and HD (mm): prostate (0.82 and 12.0); SVs (0.66 and 12.8); bladder (0.94 and 6.6); L and R femoral heads (0.82 and 112.8); rectum (0.82 and 22.1) | Median DSC and HD (mm): prostate (0.83 and 10.9); SVs (0.73 and 8.5); bladder (0.93 and 5.9); L and R femoral heads (0.65 and 216.1); rectum (0.81 and 18.3) | NA |
| Mirada Medical Ltd. | | | | |
| DLCExpert | Walker et al. (2022), UK [49] | Median DSC and MDA (mm): bladder (0.88 and 1.9); L (0.92 and 1.6) and R femoral heads (0.91 and 1.7); rectum (0.67 and 4.7) | NA | NA |
| | Gibbons et al. (2023), Australia [50] | Median DSC and HD (mm): bladder (0.96 and 12.8); L and R femoral heads (0.98 and 6.8); rectum (0.87 and 9.6) | NA | NA |
| | Wang et al. (2023), USA [51] | Mean DSC and MDA (mm) for model 1/2: prostate (0.84 and 1.8/0.85 and 1.7); SVs (0.60 and 2.4/0.62 and 2.3); bladder (0.91 and 1.4/0.95 and 0.8); L (0.94 and 0.8/0.96 and 0.5) and R femoral heads (0.95 and 0.7/0.96 and 0.5); penile bulb (0.66 and 2.2/0.65 and 2.2); rectum (0.81 and 2.3/0.84 and 1.9); spacer (0.52 and 2.9/0.84 and 0.9) | Mean DSC and MDA (mm) for model 2: prostate (0.83 and 2.0); SVs (0.62 and 2.3); bladder (0.95 and 0.9); L (0.96 and 0.5) and R femoral heads (0.96 and 0.5); penile bulb (0.63 and 2.3); rectum (0.84 and 1.9); spacer (0.83 and 1.0) | NA |
| MVision.ai | | | | |
| Contour+ | Kiljunen et al. (2020), Estonia, Finland, and Singapore [52] | Mean DSC, HD95 (mm), sDSC, and PVD: prostate (0.82, 6.1, 0.38, and −31.6%); lymph nodes (0.80, 14.7, 0.39, and 5.2%); SVs (0.72, 7.1, 0.52, and −11.1%); bladder (0.93, 3.3, 0.68, and −1.7%); L (0.68, 25.0, 0.22, and −37.9%) and R femoral heads (0.69, 24.7, 0.22, and −30.5%); penile bulb (0.51, 7.7, 0.33, and −6.8%); rectum (0.84, 11.4, 0.58, and −9.6%) | Mean DSC: prostate (0.83); lymph nodes (0.76); SVs (0.77); bladder (0.91); penile bulb (0.64); rectum (0.75) | NA |
Contour+ v1.2.1Olsson et al. (2022), Finland and Sweden [53]Mean DSC, HD (mm), and PVD: rectum (0.89, 4.7, and 3.7%)NANA
Contour+ v1.2.5Miura et al. (2024), Japan [55]Mean DSC, HD95, and PVD: prostate (0.86, 2.6, and −12.8%); SVs (0.80, 2.6, and −2.1%); bladder (0.96, 1.2, and −2.1%); L (0.98, 1.2, and −4.1%) and R femoral heads (0.97, 1.5, and −4.7%); penile bulb (0.64, 2.6, and −1.2%); rectum (0.92, 2.4, and 5.9%)NANA
RaySearch Laboratories AB
RayStation v11BDe Kerf et al. (2023), Belgium [56]Mean sDSC and local sDSC: bladder (0.98 and 0.97); anorectum (0.98 and 0.93)NANA
RayStation v9BKanwar et al. (2023), USA [57]Mean DSC, HD95, and MSD (mm) for normal/variants: prostate (0.81, 6.3, and 4.4/0.75, 10.4, and 5.0); bladder (0.95, 3.3, and 2.2/0.87, 8.3, and 3.5); L and R femoral heads (0.87, 14.6, and 2.5/0.83, 13.7, and 3.0); rectum (0.63, 29.5, and 6.9/0.51, 33.4, and 9.4)NANA
Siemens Healthineers AG
AI-Rad Companion Organs RT VA20/syngo.via RT Image Suite VB50Marschner et al. (2022), Germany and USA [58]Mean DSC, HD95 (mm), DCOM (mm), MSD (mm), PVD, RMSD (mm), sensitivity, specificity, JCI, DI, GMI, and left, right, anterior, posterior, superior, and inferior boundaries (mm): bladder (0.88, 6.7, 4.1, 1.8, 9.4%, 3.1, 0.93, 0.99, 0.81, 0.13, 0.07, 0.1, 2.9, 2.0, −1.9, −1.8, and −0.3); rectum (0.79, 10.8, 8.9, 2.5, 9.0%, 4.6, 0.84, 0.99, 0.67, 0.22, 0.16, 2.5, −0.9, 5.2, 1.7, −8.7, and 7.0)NANA
syngo.via RT Image Suite VB40Pera et al. (2023), Germany and Spain [59]Mean DSC: prostate (0.87); abdominopelvic cavity (0.94); bladder (0.95); body (0.99); L (0.99), and R femoral heads (0.98); rectum (0.90)NANA
AI-Rad Companion Organs RT VA30Yamauchi et al. (2024), Japan [60]Median DSC, HD (mm), and MDA (mm): prostate (0.78, 8.2, and 2.4); SVs (0.68, 7.3, and 1.3); bladder (0.94, 7.6, and 1.2); rectum (0.76, 27.1, and 4.6)NANA
DirectORGANS VA30Berenato et al. (2024), Germany and UK [61]Median DSC and MSD (mm): bladder (0.95 and 0.8); L (0.95 and 1.0) and R proximal femurs (0.96 and 0.8); rectum (0.89 and 1.2)Median DSC and MSD (mm): bladder (0.97 and 0.5); L (0.97 and 0.5) and R proximal femurs (0.97 and 0.5); rectum (0.92 and 0.9)Observer outperforming DLAS for all structures with statistically significant difference (p = 0.00)
Therapanacea
Annotate ART-Plan v1.8.3Nachbar et al. (2024), France and Germany [63]Median DSC, HD95 (mm), sDSC, and APL (mm): prostate (0.86, 5.0, 0.90, and 402); SVs (0.77, 4.4, 0.88, and 321); anal canal (0.74, 5.6, 0.75, and 0.0); bladder (0.97, 2.7, 0.97, and 101); L (0.92, 4.6, 0.99, and 123) and R femurs (0.92, 4.7, 0.98, and 157); L (0.91, 2.5, 0.96, and 539) and R pelvis (0.90, 3.3, 0.95, and 1010); penile bulb (0.73, 5.6, 0.93, and 119); rectum (0.91, 6.9, 0.98, and 225); sacrum (0.89, 4.8, 0.95, and 191)NANA
Varian Medical Systems, Inc.
EthosMoazzezi et al. (2021), USA [64]Mean PVD: CTVs (4.5%)NANA
Carina Medical LLC and RaySearch Laboratories AB
INTContour and RayStation Duan et al. (2025), USA [65]Median DSC, HD (mm), HD95 (mm), MSD (mm), sDSC, and HD98 (mm) for INTContour original/RayStation original/INTContour finetuned model: prostate (0.77, 13.4, 8.0, 2.7, 0.64, and 9.5/0.78, 15.2, 8.3, 2.6, 0.64, and 10.3/0.82, 10.7, 6.3, 2.1, 0.78, and 7.4)Median DSC, HD (mm), HD95 (mm), MSD (mm), sDSC, and HD98 (mm): prostate (0.93, 11.0, 8.0, 1.0, 0.89, and 9.3)Finetuned DLAS model outperforming ROs with statistically significantly different HD95 and HD98 (p < 0.05) and ns for DSC, HD, MSD, and sDSC
Limbus AI Inc. and MVision.ai
Contour v1.7.0-B3 and Contour+ v1.2.2 Bordigoni et al. (2024), Italy and Sweden [66]Median DSC, HD (mm), and DAP for Limbus/MVision.ai: bladder (0.93, 12.4, and 93.4%/0.94, 11.4, and 94.0%); L (0.97, 7.9, and 98.6%/0.97, 10.6, and 97.4%) and R femoral heads (0.97, 10.6, and 96.8%/0.97, 8.2, and 97.9%); rectum (0.87, 18.6, and 66.3%/0.87, 19.6, and 62.5%)NANA
Carina Medical LLC, Manteia Medical Technologies Co., and MIM Software Inc.
INTContour, AccuContour, and Contour ProtégéAIDuan et al. (2023), USA [67]Mean (finetuned model)/median (original models) DSC, HD95 (mm), MSD (mm), sDSC, HD90 (mm), and HD98 (mm) for deidentified original model 1/2/3/INTContour finetuned model: prostate (0.73, 8.8, 3.5, 0.56, 7.7, and 10.1/0.80, 6.5, 2.6, 0.69, 5.3, and 7.9/0.76, 9.4, 3.4, 0.57, 8.0, and 10.1/0.82, 6.6, 2.4, 0.73, 5.5, and 7.7); SVs (0.30, 13.0, 4.8, 0.39, 11.1, and 14.6/0.38, 11.3, 4.5, 0.46, 9.6, and 13.0/0.32, 11.6, 4.5, 0.46, 9.9, and 13.6/0.48, 10.0, 3.8, 0.5, 8.5, and 11.3); balloon (NA/NA/NA/0.89, 5.9, 1.9, 0.87, 3.8, and 8.6); bladder (0.96, 1.9, 0.7, 0.97, 2.1, and 3.5/0.96, 2.3, 0.8, 0.97, 1.9, and 4.4/0.96, 3.2, 1.0, 0.95, 2.4, and 4.8/0.96, 5.0, 1.5, 0.96, 4.1, and 6.0); L (0.95, 3.6, 0.8, 0.93, 2.1, and 6.0/0.78, 64.3, 13.6, 0.75, 49.5, and 70.9/0.95, 3.8, 1.0, 0.90, 3.2, and 4.1/0.92, 9.6, 2.0, 0.90, 6.4, and 12.2) and R femoral heads (0.94, 4.8, 0.9, 0.91, 3.2, and 6.6/0.71, 79.4, 20.4, 0.70, 67.1, and 87.7/0.87, 8.4, 2.4, 0.77, 6.1, and 11.1/0.92, 9.1, 1.9, 0.90, 6.0, and 11.7); large (NA/NA/NA/0.49, 36.9, 12.5, 0.41, 30.7, and 41.4) and small bowels (NA/NA/NA/0.30, 40.7, 19.9, 0.30, 36.7, and 44.1); penile bulb (NA/NA/NA/0.47, 6.9, 3.1, 0.17, 6.1, and 7.5); rectum (0.84, 10.1, 3.0, 0.65, 7.7, and 12.3/0.86, 7.6, 2.6, 0.66, 6.1, and 10.4/0.79, 9.0, 3.3, 0.55, 8.5, and 9.8/0.92, 5.4, 1.5, 0.89, 3.8, and 7.4); spacer (NA/NA/NA/0.84, 4.5, 1.7, 0.79, 3.9, and 5.3)NANA
Mirada Medical Ltd., MVision.ai, Radformation Inc., RaySearch Laboratories AB, and Therapanacea
DLCExpert v2.6.4.47181, Contour+ v1.2.1, AutoContour v1.0.25.0, RayStation v12.0.0.932, and Annotate v1.10.0Doolan et al. (2023), Cyprus and Germany [68]Median DSC, HD (mm), sDSC, and APL (mm) for DLCExpert/MVision Contour+/Radformation AutoContour/RayStation/Therapanacea Annotate: prostate (0.86, 7.8, 0.36, and 3226/0.89, 6.9, 0.45, and 2957/0.87, 7.0, 0.34, and 3591/0.85, 7.8, 0.33, and 3415/0.91, 6.9, 0.48, and 3063); SVs (0.76, 9.0, 0.43, and 1862/0.83, 8.0, 0.57, and 1489/0.75, 8.8, 0.48, and 1817/NA/0.82, 7.9, 0.59, and 1404); bladder (0.95, 16.2, 0.62, and 7044/0.97, 6.3, 0.74, and 5342/0.97, 6.9, 0.70, and 6886/0.95, 7.7, 0.60, and 8203/0.97, 5.7, 0.77, and 4753); bowels (0.59, 75.5, 0.06, and 118000/0.75, 76.5, 0.13, and 103000/0.73, 69.2, 0.10, and 113000/NA/0.76, 55.6, 0.17, and 105000); cauda equina (NA/NA/0.75, 30, 0.39, and 3416/NA/NA); L (0.91, 16.1, 0.61, and 5586/0.91, 16.9, 0.66, and 4724/0.90, 17.2, 0.64, and 5242/0.89, 25.3, 0.59, and 5887/0.90, 18.8, 0.70, and 4067) and R femoral heads (0.89, 18.4, 0.59, and 5624/0.91, 16.0, 0.65, and 4957/0.91, 15.9, 0.62, and 5511/0.88, 27.8, 0.51, and 6887/0.90, 18.6, 0.68, and 4068); penile root (0.54, 11.9, 0.23, and 794/0.68, 11.0, 0.36, and 800/0.66, 11.4, 0.35, and 758/NA/0.71, 8.9, 0.39, and 698); rectum (0.87, 15.0, 0.53, and 3802/0.91, 11.6, 0.59, and 3322/0.88, 15.6, 0.57, and 3425/0.87, 19.0, 0.53, and 3565/0.83, 26.0, 0.57, and 3630); sigmoid (NA/0.77, 26.3, 0.55, and 2798/NA/NA/0.52, 47.1, 0.37, and 2950)NANA
Strolin et al. [54] and Künzel et al. [62] did not conduct any geometric accuracy evaluation. 1 Mean values not reported. APL, added path length; CBCT, cone beam computed tomography; CT, computed tomography; CTV, clinical target volume; DAP, distance-to-agreement portion; DCOM, displacement of centre of mass; DI, discordance index; DSC, Dice similarity coefficient; GMI, geographical miss index; HD, Hausdorff distance; HD90, 90-percentile Hausdorff distance; HD95, 95-percentile Hausdorff distance; HD98, 98-percentile Hausdorff distance; IOV, inter-observer variation; JCI, Jaccard conformity index; L, left; MDA, mean distance to agreement; MSD, mean surface distance; NA, not available; ns, no significant difference; PVD, percentage volume difference; R, right; RMSD, residual mean surface distance; RO, radiation oncologist; sDSC, surface Dice similarity coefficient; SVs, seminal vesicles; VR, volume ratio; VS, versus.
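The two overlap and distance metrics that recur most often in Table 2, DSC and HD, can be illustrated with a minimal sketch. It assumes each contour is represented as a set of voxel coordinates on a toy 2D grid; the function names and the example contours are hypothetical, and the included studies may instead compute distances on surface points, in millimetres, or as percentile variants such as HD95.

```python
from math import dist


def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty contours
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff(a, b):
    """Symmetric (maximum) Hausdorff distance between two non-empty point sets."""
    def directed(p, q):
        # Largest distance from any point in p to its nearest point in q
        return max(min(dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))


# Hypothetical toy contours: a 4 x 4 reference and the same shape shifted by one voxel
reference = {(x, y) for x in range(4) for y in range(4)}
auto = {(x, y) for x in range(1, 5) for y in range(4)}

dsc = dice(reference, auto)      # 12 shared voxels out of 16 + 16
hd = hausdorff(reference, auto)  # in voxel units
print(f"DSC = {dsc:.2f}, HD = {hd:.1f}")
```

In this toy case a one-voxel shift lowers DSC to 0.75 while HD is only one voxel, which shows why the reviewed studies tend to report overlap and distance metrics together.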
Table 3. Subjective accuracy, efficiency, and dosimetric evaluation results of commercial deep learning-based auto-segmentation (DLAS) software products for prostate cancer radiation therapy planning.
| Software Name and Version | Author, Year, and Country | Subjective Evaluation Results | Efficiency Evaluation Results | Dosimetric Evaluation Results |
| --- | --- | --- | --- | --- |
| Carina Medical LLC | | | | |
| INTContour | Duan et al. (2022), USA [37] | No/minor corrections required for 95.7%/4.3% of structures; no unusable contours | NA | No statistically significant difference in doses to OARs except bladder between DLAS and manual contouring, and DLAS bladder dose being lower |
| | Hobbis et al. (2023), USA [38] | No/minor corrections required for 36.0%/51.0% of OARs and 17.0%/37.0% of CTVs; unusable OARs (1.0%) and CTVs contours (2.0%) delineated by original model | NA | NA |
| | Tegtmeier et al. (2024), USA [39] | Mean contour score: prostate (4.4), prostate bed (3.0), bladder (4.6), L and R femoral heads (4.0), and rectum (3.9) out of 5.0 (1 and 5 indicating unusable and no corrections required), respectively | NA | NA |
| Limbus AI Inc. | | | | |
| Contour v1.0.22 | Wong et al. (2020), Canada [41] | NA | Mean time reduction/patient: 98.1% (manual: 21.3 min vs. DLAS: 0.4 min) 1 | NA |
| | Wong et al. (2021), Canada [42] | Mean contour score: prostate (2.8), SVs (2.1), bladder (1.4), L and R femoral heads (1.6), and rectum (1.7) out of 5.0 (1 and 5 indicating minimal and significant corrections required); mean user satisfaction score: CTV (4.1) and OARs (4.6) out of 5.0 (1 and 5 representing poor and high satisfactions), respectively | NA | NA |
| Contour v1.0.18 | Zabel et al. (2021), Canada [43] | No corrections required for 89.0% and 76.0% of bladder and rectum contours, respectively | Mean time reduction/patient: 55.6% (manual: 15.3 min vs. DLAS: 6.8 min) | NA |
| Contour v1.5.0 | Radici et al. (2022), Italy [44] | NA | Mean time reduction/patient: 18.0% (4.0 min) | No clinically relevant difference in doses to L and R femoral heads between DLAS and manual contouring; other OARs with relevant dosimetric differences |
| Contour v1.0.18 | Hoque et al. (2023), Italy [45] | No/minor contour corrections required: prostate (0.0%/65.0%), anal canal (25.0%/75.0%), bladder (5.0%/75.0%), L (65.0%/35.0%) and R femurs (65.0%/30.0%), and rectum (90.0%/10.0%); unusable contours: prostate (5.0%) and R femur (5.0%) | Mean time reduction/patient: 72.1% (manual: 23.0 min vs. DLAS: 6.4 min) 2 | No statistically significant difference in doses to all structures except anal canal between DLAS with contour corrections and manual contouring |
| Contour v1.5.0-D2 | Radici et al. (2024), Italy [46] | No corrections required for all OARs except rectum | NA | NA |
| MIM Software Inc. | | | | |
| Contour ProtégéAI v0.9 | Urago et al. (2021), Japan [47] | NA | NA | NA |
| Contour ProtégéAI v1.1.2 | Palazzo et al. (2023), Italy [48] | No/minor corrections required for 10.0%/77.5% of structures; no unusable contours | Mean time reduction/patient: 75.6% (manual: 20.5 min vs. DLAS: 5.0 min) | NA |
| Mirada Medical Ltd. | | | | |
| DLCExpert | Walker et al. (2022), UK [49] | Mean contour scores for centre 1/2: bladder (4.5/3.6), L (1.8/1.3) and R femoral heads (1.5/1.2) and rectum (5.3/3.6) out of 7.0 (1–3 and 7 indicating no corrections required and gross error)/5.0 (1–2 and 5 representing no corrections required and unusable), respectively | Mean time reduction/patient: 30.0% (manual: 19.5 min vs. DLAS: 13.7 min) 2 | NA |
| | Gibbons et al. (2023), Australia [50] | Median contour score: bladder (2.0), L (1.5) and R femoral heads (1.0) and rectum (2.0) out of 4.0 (1 and 4 indicating no and major corrections required) | Mean time reduction/patient: 65.4% (manual: 10.7 min vs. DLAS: 3.7 min) 2 | NA |
| | Wang et al. (2023), USA [51] | Mean contour score for model 1/2: prostate (2.7/2.2), SVs (3.3/2.4), bladder (2.3/1.3), L (1.1/1.0) and R femoral heads (1.1/1.0), penile bulb (3.4/2.4), rectum (3.0/2.1), and spacer (3.6/1.3) out of 4.0 (1 and 4 indicating no/minor corrections required and unusable), respectively | Mean efficiency gain score for model 1/2: 2.79/2.20 out of 4.00 (1.00–1.75: nearly complete; 1.76–2.50: substantial; 2.51–3.25: meaningful; 3.26–4.00: no efficiency gain) | NA |
| MVision.ai | | | | |
| Contour+ | Kiljunen et al. (2020), Estonia, Finland, and Singapore [52] | NA | Mean time reduction/patient: 44.4% (manual: 27.0 min vs. DLAS: 15.0 min) | NA |
| Contour+ v1.2.1 | Olsson et al. (2022), Finland and Sweden [53] | NA | NA | No clinically relevant difference in dose to OAR between DLAS and manual contouring |
| | Strolin et al. (2023), Italy [54] | Mean contour score: 4.8 out of 5.0 (1 and 5 indicating unusable and no corrections required), respectively | Median time reduction/patient: 53.0% (manual: 25.8 min vs. DLAS: 12.1 min) 2 | NA |
| Contour+ v1.2.5 | Miura et al. (2024), Japan [55] | NA | Estimation of time reduction/patient: 90.0% (manual: 30.0 min vs. DLAS: 3.0 min) 1 | NA |
| RaySearch Laboratories AB | | | | |
| RayStation v11B | De Kerf et al. (2023), Belgium [56] | NA | NA | No clinically relevant difference in doses to OARs between DLAS and manual contouring |
| Siemens Healthineers AG | | | | |
| syngo.via RT Image Suite VB40 | Pera et al. (2023), Germany and Spain [59] | No/minor contour corrections required for 53.4%/42.0% of structures; no unusable contours | Mean time reduction/patient: 76.4% (manual: 34.7 min vs. DLAS: 8.2 min) | NA |
| AI-Rad Companion Organs RT VA30 | Yamauchi et al. (2024), Japan [60] | Mean contour score: prostate (3.3), SVs (3.5), bladder (3.7), and rectum (3.8) out of 4.0 (1 and 4 indicating major and no/minor corrections required) | Mean time reduction/patient: 5.7% (manual: 14.6 min vs. DLAS: 13.8 min) | NA |
| DirectORGANS VA30 | Berenato et al. (2024), Germany and UK [61] | NA | Median time reduction/patient: 41.3% (manual: 25.9 min vs. DLAS: 15.2 min) | No statistically significant difference in dose to bladder between DLAS and manual contouring; other OARs with statistically significant differences |
| Therapanacea | | | | |
| Annotate ART-Plan v1.7.1 | Künzel et al. (2022), Germany [62] | No/minor corrections required for 54.0%/24.0% of OARs and 30.0%/36.0% of CTVs; no unusable OARs but 4.0% of unusable CTVs contours | NA | NA |
| Annotate ART-Plan v1.8.3 | Nachbar et al. (2024), France and Germany [63] | Mean contour score: prostate (2.0), SVs (1.5), anal canal (1.0), bladder (1.3), L (1.0) and R femurs (1.0), L (1.0) and R pelvis (1.2), penile bulb (1.1), rectum (1.2), and sacrum (1.0) out of 4.0 (1 and 4 indicating no corrections required and unusable), respectively | NA | NA |
| Varian Medical Systems, Inc. | | | | |
| Ethos | Moazzezi et al. (2021), USA [64] | No/minor corrections required for 4.0% of structures/70.0%, 88.0%, and 90.0% of CTVs, bladder, and rectum, respectively | NA | CTV coverage (D98 > 95%): 100%; no clinically relevant difference in doses to bladder and rectum between DLAS and manual contouring |
| Carina Medical LLC and RaySearch Laboratories AB | | | | |
| INTContour and RayStation | Duan et al. (2025), USA [65] | No corrections required for 69.0% of prostate contours | NA | NA |
| Limbus AI Inc. and MVision.ai | | | | |
| Contour v1.7.0-B3 and Contour+ v1.2.2 | Bordigoni et al. (2024), Italy and Sweden [66] | No/minor contour corrections required for 80.0%/20.0% (Limbus) and 60.0%/0.0% (MVision.ai) of structures; no unusable contours | Median time reduction/patient for Limbus/MVision.ai: 81.1%/80.2% (manual: 90.0 min vs. DLAS: 17.0/17.8 min) | NA |
| Carina Medical LLC, Manteia Medical Technologies Co., and MIM Software Inc. | | | | |
| INTContour, AccuContour, and Contour ProtégéAI | Duan et al. (2023), USA [67] | No and minor corrections required for 80.0% of structures; unacceptable contours (unusable and major corrections required) delineated by finetuned model: 20.0% | NA | NA |
| Mirada Medical Ltd., MVision.ai, Radformation Inc., RaySearch Laboratories AB, and Therapanacea | | | | |
| DLCExpert v2.6.4.47181, Contour+ v1.2.1, AutoContour v1.0.25.0, RayStation v12.0.0.932, and Annotate v1.10.0 | Doolan et al. (2023), Cyprus and Germany [68] | NA | Mean time reduction/patient for DLCExpert/MVision Contour+/Radformation AutoContour/RayStation/Therapanacea Annotate: 82.4%/99.3%/89.8%/87.6%/99.8% (manual: 42.0 min vs. DLAS: 7.4/0.3/4.3/5.2/0.1 min) 2 | NA |
Jenkins et al. [40], Kanwar et al. [57], and Marschner et al. [58] did not conduct any subjective accuracy, efficiency, or dosimetric evaluation. 1 DLAS contouring time did not cover time required for contour corrections. 2 DLAS contouring time only covered time needed for contour adjustments. CTV, clinical target volume; D98, dose received by 98% of structure; L, left; min, minutes; NA, not available; OAR, organ at risk; R, right; SVs, seminal vesicles; vs., versus.
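The time reduction percentages in Table 3 appear to follow the usual relative-difference arithmetic, (manual − DLAS)/manual × 100. The sketch below is an illustration under that assumption rather than the method stated by any included study; the function name is hypothetical, and the inputs are the mean times tabulated above for three studies.

```python
def time_reduction_percent(manual_min, dlas_min):
    """Relative contouring time saved by DLAS, as a percentage of manual time."""
    if manual_min <= 0:
        raise ValueError("manual contouring time must be positive")
    return (manual_min - dlas_min) / manual_min * 100


# Mean times per patient (minutes) taken from Table 3
zabel = round(time_reduction_percent(15.3, 6.8), 1)      # Zabel et al. (2021) [43]
palazzo = round(time_reduction_percent(20.5, 5.0), 1)    # Palazzo et al. (2023) [48]
kiljunen = round(time_reduction_percent(27.0, 15.0), 1)  # Kiljunen et al. (2020) [52]
print(zabel, palazzo, kiljunen)
```

These three inputs reproduce the tabulated 55.6%, 75.6%, and 44.4%. A few other rows differ from this calculation by a fraction of a percentage point, which may reflect rounding of the reported times or averaging of per-patient reductions.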
Table 4. Characteristics of studies on commercial deep learning-based auto-segmentation (DLAS) software products for prostate cancer (PCa) radiation therapy planning.
| Author, Year, and Country | DLAS Architecture | Study Design | Multi-Centre | Patient/Population | Model Finetuning | Training Dataset Source | Training Dataset Size (Number of Patients) | Testing Dataset Source | Testing Dataset Size (Number of Patients) | Sample Size Calculation | External Testing | Modality | Reference Contour Source | Contouring Guidelines | Article Quality (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Carina Medical LLC INTContour | | | | | | | | | | | | | | | |
| Duan et al. (2022), USA [37] | 3D CNN U-Net | Retrospective | No | PCa patients | Yes | Private: 1 USA centre | 84 | Private: 1 USA centre | 23 | No | No | CT | 1 RO with final review by 2 ROs with >10-year experience | RTOG-0815 | 72 |
| Hobbis et al. (2023), USA [38] | 3D U-Net | Retrospective | No | Postprostatectomy PCa patients | Yes | Private: 1 USA centre for finetuned models | 120 for original model; 30, 60, and 90 for 3 finetuned models | Private: 1 USA centre | 49 | No | Only for original model | CT | 6 expert ROs | FROGG | 58 |
| Tegtmeier et al. (2024), USA [39] | 3D CNN U-Net | Retrospective | No | Intact and postprostatectomy PCa patients | Yes | Private: 2 USA centres | 84 for original model; 116 for finetuned model (72 intact and 44 postprostatectomy) | Private: 1 USA centre | 50 (25 intact and 25 postprostatectomy) | No | Only for original model | CBCT | 2 physicists and 2 physicians | ESTRO and FROGG | 60 |
| Elekta AB ADMIRE v3.4 | | | | | | | | | | | | | | | |
| Jenkins et al. (2021), UK [40] | NA | Retrospective | No | Intermediate- and high-risk PCa patients | No | NA | NA | Private: 1 UK centre | 232 | No | Yes | CT | 1 RO | NA | 44 |
| Limbus AI Inc. Contour | | | | | | | | | | | | | | | |
| Wong et al. (2020), Canada [41] | CNN U-Net | Retrospective | No | PCa patients without bilateral hip implants/rectal spacer | No | Public: USA TCIA | 328 | Private: 1 Canadian centre | 20 | No | Yes | CT | At least 2 expert ROs | RTOG | 56 |
| Wong et al. (2021), Canada [42] | CNN U-Net | Prospective | Yes | PCa patients | No | Public: USA TCIA | 328 | Private: 2 Canadian centres | 71 | No | Yes | CT | 1 RO | NA | 60 |
| Zabel et al. (2021), Canada [43] | CNN U-Net | Retrospective | No | PCa patients with 1 unilateral hip implant case | No | Public: USA TCIA | NA | Private: 1 Canadian centre | 15 | No | Yes | CT | 3 ROs | RTOG | 49 |
| Radici et al. (2022), Italy [44] | CNN U-Net | Prospective | No | PCa patients | No | Public: USA TCIA | At least hundreds | Private: 1 Italian centre | 3 | No | Yes | CT | 4 expert ROs | RTOG | 56 |
| Hoque et al. (2023), Italy [45] | CNN U-Net | Retrospective | No | PCa patients without bilateral hip implants/rectal spacer | No | Public: USA TCIA | 328 | Private: 1 Italian centre | 20 | No | Yes | CT | 1 RO with >10-year experience | ESTRO | 58 |
| Radici et al. (2024), Italy [46] | CNN U-Net | Retrospective | No | PCa patients | No | Public and private | NA | NA | 10 | No | NA | CT and CBCT | 4 expert ROs | NA | 42 |
| MIM Software Inc. Contour ProtégéAI | | | | | | | | | | | | | | | |
| Urago et al. (2021), Japan [47] | U-Net | NA | No | PCa patients | No | Private: multi-centres over the world | 500–1000 | Private: 1 Japanese centre | 21 | No | Yes | CT | 3 ROs | NA | 40 |
| Palazzo et al. (2023), Italy [48] | U-Net | Retrospective | No | Intermediate- and high-risk PCa patients | No | Private: multi-centres | NA | Private: 1 Italian centre | 20 | No | Yes | CT | 2 ROs with >10-year experience | NA | 44 |
| Mirada Medical Ltd. DLCExpert | | | | | | | | | | | | | | | |
| Walker et al. (2022), UK [49] | CNN | Retrospective | Yes | PCa patients | No | Private: 1 Dutch centre | 437 | Private: 3 UK centres | 61 | No | Yes | CT | 1 RT/RO | CHHiP and RTOG | 49 |
| Gibbons et al. (2023), Australia [50] | CNN | Retrospective | No | PCa patients without hip implant/rectal spacer | No | Private: 1 Dutch centre | 437 | Private: 1 Australian centre | 30 | No | Yes | CT | 1 expert RO | NA | 47 |
| Wang et al. (2023), USA [51] | 2D CNN U-Net | Retrospective | No | Intermediate- and high-risk PCa patients | Yes | Private: 1 USA centre | 135 | Private: 1 USA centre | Model 1 (24) and Model 2 (64) | No | No | CT | 3 expert ROs | RTOG-0815 | 47 |
| MVision.ai Contour+ | | | | | | | | | | | | | | | |
| Kiljunen et al. (2020), Estonia, Finland, and Singapore [52] | Encoder-decoder-based CNN | Retrospective | Yes | PCa patients without any prostatectomy and femoral implant | No | Private: 3 centres | 900 | Private: 1 Estonian, 4 Finnish, and 1 Singaporean centres | 30 (5 from each centre) | No | Yes | CT | 3–4 dosimetrists, physicists, ROs and RTTs | NA | 42 |
| Olsson et al. (2022), Finland and Sweden [53] | Encoder-decoder-based CNN | Retrospective | No | PCa patients | Yes | Private: <40 centres for original model and 1 Swedish centre for finetuned model | 891 for original model; 325 for finetuned model | Private: 1 Swedish centre | 299 | No | Only for original model | CT | 1 researcher supervised by 1 senior RO | ESTRO and RTOG | 60 |
| Strolin et al. (2023), Italy [54] | Encoder-decoder-based CNN | Retrospective | No | PCa patients | No | Private: <40 centres | 891 | Private: 1 Italian centre | 20 | No | Yes | CT | 1 senior and 2 junior ROs with >10-year and 3-month experiences, respectively | ESTRO and RTOG | 56 |
| Miura et al. (2024), Japan [55] | Encoder-decoder-based CNN | NA | No | PCa patients without any iodine spacer, rectal balloon, and hip implant | No | Private: <40 centres | 891 | Private: 1 Japanese centre | 10 | No | Yes | CT | 1 5-year-experienced RO with final review by 2 ROs with >20-year experience | NA | 53 |
| RaySearch Laboratories AB RayStation | | | | | | | | | | | | | | | |
| De Kerf et al. (2023), Belgium [56] | NA | Retrospective | NA | PCa patients | No | Private: Belgian Iridium Network centre(s) | NA | Private: Belgian Iridium Network centre(s) | 50 | No | No | CT | 1 experienced RO | NA | 40 |
| Kanwar et al. (2023), USA [57] | U-Net | Retrospective | No | PCa patients with normal anatomy and variations | No | NA | NA | Private: 1 USA centre | 131 (19 normal and 112 variants) | No | Yes | CT | 1 RO | In-house protocol | 58 |
| Siemens Healthineers AG AI-Rad Companion Organs RT/syngo.via RT Image Suite/DirectORGANS | | | | | | | | | | | | | | | |
| Marschner et al. (2022), Germany and USA [58] | U-Net variant | NA | Yes | Prostate and cervical cancer patients | No | Private: multi-centres | 784 | 1 German centre | 102 | No | Yes | CT | 1 experienced RO | RTOG | 56 |
| Pera et al. (2023), Germany and Spain [59] | U-Net variant | NA | No | PCa patients | No | Private: multi-centres in Asia, Europe, and North and South America | At least hundreds | Private: 1 Spanish centre | 35 | No | Yes | CT | 1 expert RTT with final review by 1 RO | NA | 44 |
| Yamauchi et al. (2024), Japan [60] | U-Net variant | Retrospective | No | PCa patients with and without rectal spacer | No | Private: multi-centres in Europe and America | NA | Private: 1 Japanese centre | 30 (15 with spacer) | No | Yes | CT | 6 expert ROs | RTOG | 49 |
| Berenato et al. (2024), Germany and UK [61] | U-Net variant | Retrospective | No | Intermediate-risk PCa patients without any bilateral hip implants | No | NA | NA | Private: 1 UK centre | 20 (3 with single hip implant) | No | Yes | CT | 1 fully trained observer | RTOG | 58 |
| Therapanacea Annotate ART-Plan | | | | | | | | | | | | | | | |
| Künzel et al. (2022), Germany [62] | NA | Retrospective | No | Intermediate-risk PCa patients | No | NA | NA | Private: 1 German centre | 10 | No | Yes | CT | NA | NA | 40 |
| Nachbar et al. (2024), France and Germany [63] | 3D CNN U-Net | Retrospective | No | PCa patients | Yes | Private: 1 German centre | 47 | Private: 1 German centre | 20 | No | No | MRI | 1 RO | NA | 58 |
| Varian Medical Systems, Inc. Ethos | | | | | | | | | | | | | | | |
| Moazzezi et al. (2021), USA [64] | CNN U-Net | Retrospective | No | Intermediate-risk PCa patients without nodal involvement | No | Private: multi-centres in Americas, Asia, Australia, and Europe | Hundreds | Private: 1 USA centre | 25 | No | Yes | CBCT | 1 medical physicist with final review by 1 RO | NA | 53 |
| Carina Medical LLC INTContour and RaySearch Laboratories AB RayStation | | | | | | | | | | | | | | | |
| Duan et al. (2025), USA [65] | 3D U-Net | Retrospective | No | Intact PCa patients | Yes | Private: 1 USA centre for INTContour | 84 for original model; 57 for finetuned model | Private: 1 USA centre | 37 for INTContour and RayStation original models; 54 for finetuned model | No | Only for original models | CT | Experienced ROs | ESTRO | 67 |
| Limbus AI Inc. Contour and MVision.ai Contour+ | | | | | | | | | | | | | | | |
| Bordigoni et al. (2024), Italy and Sweden [66] | CNN U-Net | Retrospective | No | PCa patients | No | NA | NA | Private: 1 Italian centre | 20 | No | Yes | CT | 1 expert RO | NA | 51 |
| Carina Medical LLC INTContour, Manteia Medical Technologies Co. AccuContour, and MIM Software Inc. Contour ProtégéAI | | | | | | | | | | | | | | | |
| Duan et al. (2023), USA [67] | 3D U-Net | NA | No | Intact PCa patients with and without iodine spacer/rectal balloon | Yes | Private: 1 USA centre for INTContour model finetuning | 100 | Private: 1 USA centre | 20 for INTContour, AccuContour, and ProtégéAI original models; 115 for INTContour finetuned model | No | Only for original models | CT | 1 dosimetrist and 6 expert ROs | NA | 53 |
| Mirada Medical Ltd. DLCExpert, MVision.ai Contour+, Radformation Inc. AutoContour, RaySearch Laboratories AB RayStation, and Therapanacea Annotate | | | | | | | | | | | | | | | |
| Doolan et al. (2023), Cyprus and Germany [68] | NA | Retrospective | No | PCa patients | No | NA | NA | Private: 1 Cypriot centre | 20 | No | Yes | CT | 3 ROs with >10-year experience | Gay et al.’s [70] guidelines | 47 |
2D, 2-dimensional; 3D, 3-dimensional; CBCT, cone beam computed tomography; CHHiP, Conventional or Hypofractionated High-Dose Intensity Modulated Radiotherapy in Prostate Cancer; CNN, convolutional neural network; CT, computed tomography; ESTRO, European Society for Radiotherapy and Oncology; FROGG, Australia and New Zealand Faculty of Radiation Oncology Genito-Urinary Group; MRI, magnetic resonance imaging; NA, not available; RO, radiation oncologist; RTOG, Radiation Therapy Oncology Group; RTT, radiation therapist; TCIA, The Cancer Imaging Archive.
