Article

Improving Prostate Cancer Segmentation on T2-Weighted MRI Using Prostate Detection and Cascaded Networks

1 Department of Bioinformatics and Mathematical Biology, Alferov University, 194021 St. Petersburg, Russia
2 Department of Electronic Instruments and Devices, Saint Petersburg Electrotechnical University “LETI”, 197022 St. Petersburg, Russia
3 Higher School of Cyber-Physical Systems & Control, Peter the Great St. Petersburg Polytechnic University, 195251 St. Petersburg, Russia
* Author to whom correspondence should be addressed.
Algorithms 2026, 19(1), 85; https://doi.org/10.3390/a19010085
Submission received: 17 December 2025 / Revised: 15 January 2026 / Accepted: 17 January 2026 / Published: 19 January 2026
(This article belongs to the Special Issue AI-Powered Biomedical Image Analysis)

Abstract

Prostate cancer is one of the most lethal cancers in the male population, and accurate localization of intraprostatic lesions on MRI remains challenging. In this study, we investigated methods for improving prostate cancer segmentation on T2-weighted pelvic MRI using cascaded neural networks. We used an anonymized dataset of 400 multiparametric MRI scans from two centers, in which experienced radiologists had delineated the prostate and clinically significant cancer on the T2 series. Our baseline approach applies 2D and 3D segmentation networks (UNETR, UNET++, Swin-UNETR, SegResNetDS, and SegResNetVAE) directly to full MRI volumes. We then introduce additional stages that filter slices using DenseNet-201 classifiers (cancer/no-cancer and prostate/no-prostate) and localize the prostate via a YOLO-based detector to crop the 3D region of interest before segmentation. Using Swin-UNETR as the backbone, the prostate segmentation Dice score increased from 71.37% for direct 3D segmentation to 76.09% when using prostate detection and cropped 3D inputs. For cancer segmentation, the final cascaded pipeline—prostate detection, 3D prostate segmentation, and 3D cancer segmentation within the prostate—improved the Dice score from 55.03% for direct 3D segmentation to 67.11%, with an ROC AUC of 0.89 on the test set. These results suggest that cascaded detection- and segmentation-based preprocessing of the prostate region can substantially improve automatic prostate cancer segmentation on MRI while remaining compatible with standard segmentation architectures.

Graphical Abstract

1. Introduction

Prostate cancer is one of the leading causes of cancer-related deaths in men [1]. One of the factors responsible for this high mortality rate is the difficulty of diagnosing the disease in its early stages. The classic method for detecting cancer is prostate-specific antigen (PSA) monitoring. If the PSA level increases, the patient is referred for additional testing, including multiparametric MRI (mpMRI). mpMRI is performed in several modes, and the results are then sent to a radiologist for analysis. A dedicated PI-RADS scale has been developed for MRI-based cancer diagnosis; specialists use this scale to verify detected changes in the prostate gland and determine their clinical significance [2].
When prostate cancer is detected in the late stages of disease, MRI is used to plan surgical interventions. Advances and developments have been made in the use of artificial intelligence software for surgical planning. For example, AI can be used for 3D reconstruction of the prostate, and these data can be used in planning tumor-removal surgery [3,4]. However, detecting cancer in the early stages of development is challenging because healthy and diseased tissues are often marked by poor contrast.
When considering the application of AI in medicine more broadly, it is worth noting that numerous studies conducted during the COVID-19 pandemic addressed classification and segmentation problems [5,6]. In the analysis of lung diseases, the primary emphasis was placed on CT scans and radiographs. Among the systems employed for analyzing CT scans of the lungs, the work of Jin, C., Chen, W., Cao, Y., et al. [7] is noteworthy. These authors obtained an ROC AUC > 0.90 when analyzing CT scans for lung damage caused by COVID-19 and pneumonia (viral and non-viral) and visualized the regions to which the neural network responded using Grad-CAM [8].
Across all areas of medicine in which AI is utilized, three fundamental image processing tasks are most widely conducted: classification, detection, and segmentation. A classification model assumes the presence of an object that must be assigned to one of the established classes. A prime example is the detection of pathology, such as broken ribs in CT scans [9,10], or research conducted during the pandemic [11].
The aim of detection is to locate an object of a specified class in an image and construct a frame containing the object. In medicine, detection algorithms are used when it is necessary to precisely localize a detected object. Information about the object’s localization potentially enables not only the establishment of the object’s presence but also the ability to obtain more precise data, such as assigning a broken rib to the right or left side of the body [10]. The use of detection has also been explored in detecting gastric cancer in endoscopic images [12].
Segmentation facilitates pixel-by-pixel division of an image into classes. In medicine, neural networks employed for segmentation are primarily used to identify pathologies (for example, brain pathologies) or to identify organ boundaries in CT or MRI scans [13,14].
For most solutions, a single neural network architecture specific to the problem domain is sufficient. However, solving complex problems may require integrated approaches, for example, the use of multiple neural networks, including networks for specific tasks, such as image registration [15].
In our work, we aim to identify and describe an optimal method for segmenting prostate cancer on pelvic MRI. The primary objective is to segment the prostate and the cancer within its boundaries. We used anonymized data from a dataset containing 400 scans. The data were collected at two medical institutions: the Mariinsky Hospital and the Petrov Research Institute. The average age of the patients included in the dataset was 57.3 years, and the average PI-RADS v2.1 score was 4 points. For all MRI scans, specialists with over 10 years of radiology experience mapped the prostate and the tumor.
Among related studies, we highlight the work of Wei C. et al. [14], who employed a two-stage approach for cancer segmentation: accurate prostate segmentation in the first stage, followed by cancer segmentation within the refined prostate region in the second. They show that the two-stage approach improves cancer segmentation Dice scores by 0.07 for U-Net and 0.18 for attention U-Net relative to a one-stage approach in which cancer segmentation is performed without preliminary selection of the prostate region. In another example [16], the authors combined three parametric series of multiparametric MRI and obtained results comparable to those of an experienced radiologist. A further study comparing trained models with experienced specialists is [17], in which neural networks trained on more than 10,000 MRI scans collected from different sites and devices were evaluated against experienced clinicians; the results indicate that, given a sufficient amount of training data, neural networks can approach the quality of specialists’ work. In addition, the authors of [18] emphasize the relevance of this research direction and the trend toward translating AI from abstract research into commercial medical applications.
In biomedical image analysis, clean masks of the prostate and intraprostatic cancer on MRI are a basic step on which many clinical tools rely. We treat this situation as an algorithm design problem and use a small, modular cascade by first detecting the prostate, then segmenting the gland, and finally segmenting the cancer inside it. By focusing the model on a narrow region of interest, the pipeline reduces background noise and false positives, provides more stable results across scanners and sites, and outputs masks that are easy for clinicians to check and reuse.

2. Materials and Methods

2.1. Image Segmentation

Semantic segmentation is an important approach in medical image analysis, involving the delineation of regions of interest such as individual organs or tumors. The essence of the method is to assign a label to each pixel based on its context relative to the surrounding environment. Numerous neural network architectures for segmenting medical images now exist, ranging from the classic U-Net, with its U-shaped encoder–decoder structure, to Transformer-based designs such as Swin-Unet, which incorporate a self-attention mechanism. With the transition from 2D medical image analysis to multi-slice data (such as CT and MRI), architectures for fully 3D analysis have been introduced. These architectures capture context not only within a 2D slice but also across the 3D image volume, enabling a more accurate determination of the boundaries of organs or tumor inclusions, since the spatial relationship between slices is considered [19,20,21,22,23,24,25,26,27,28,29].
In our study, we used two segmentation settings. In the first, we split 3D MRI data into separate slices and forwarded them to a network for 2D slice segmentation. UNETR, UNET++, and Swin-UNETR were investigated as neural networks for segmentation. In the second case, training was carried out on loaded 3D data for the segmentation of MRI as a whole. The following architectures were used as trained networks: Swin-UNETR, UNETR, SegResNetDS, and SegResNetVAE.
The main metric for segmentation models was the calculation of the Dice–Sørensen (Dice) coefficient. This coefficient quantifies the overlap between two areas. For the task of 2D segmentation, the coefficient is calculated between the true mask of the object (ground truth) and the neural network’s prediction (Equation (1)). When implementing the metric in 3D segmentation, additional summation is introduced over all slices with at least one of the masks, and the average coefficient for a series of images is calculated (Equation (2)).
$\mathrm{dice} = \dfrac{2\,|X \cap Y|}{|X| + |Y|}$ (1)

$\mathrm{dice} = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n} \dfrac{2\,|X_i \cap Y_i|}{|X_i| + |Y_i|}$ (2)
where X and X i are the predicted mask for 2D segmentation and the i-th predicted mask for 3D segmentation, respectively; Y and Y i are the true mask for 2D segmentation and the i-th true mask for 3D segmentation, respectively; and n is the number of 3D image slices that contain at least one mask. An example of calculating this metric is shown in Figure 1 below.
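As a minimal NumPy sketch (function names are ours), the two formulas can be implemented as follows; slices where both masks are empty are skipped in the 3D average, matching Equation (2):

```python
import numpy as np

def dice_2d(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice-Sorensen coefficient between two binary masks (Equation (1))."""
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0

def dice_3d(pred_vol: np.ndarray, truth_vol: np.ndarray) -> float:
    """Slice-averaged Dice over slices where at least one mask is non-empty (Equation (2))."""
    scores = [
        dice_2d(p, t)
        for p, t in zip(pred_vol, truth_vol)
        if p.any() or t.any()
    ]
    return float(np.mean(scores)) if scores else 1.0
```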

2.2. Image Detection

Object detection in an image involves determining the location of an object of a given class if it is present in the image. The location of an object is determined by enclosing the object in certain boundaries, most often in a bounding box. Detection is widely applied in medical imaging to highlight areas of interest for a variety of tasks, for example, an area containing an organ or a tumor. The development of detection architectures has progressed from basic convolutional networks, such as R-CNN, to more complex ones, such as YOLO in its various versions [30,31]. An example of determining the prostate area in an image is shown in Figure 2.
In our work, we relied on YOLOv9 to solve the detection problem. When choosing this version of the YOLO model, we were guided by its GPL-3.0 licensing, which enables this pipeline to be used not only in research but also for commercial purposes, should such needs arise. Using other YOLO releases as the detection model might improve detection quality; however, their AGPL-3.0 licensing imposes limitations on commercial use.
A built-in validation set was used to assess the quality of the trained model, including the mAP50–95 metric. This metric averages the area under the Precision–Recall curve over IoU thresholds from 0.5 to 0.95 (in steps of 0.05).

2.3. Image Classification

Classification assigns objects to predefined classes. In medicine, classification networks are typically used to identify images with pathologies (for example, the presence or absence of a pathology on an X-ray) or as a routing step (for example, to determine which part of the body an X-ray image corresponds to before sending it to the appropriate analysis service). A full history of this field is beyond the scope of this paper [32,33,34,35,36,37,38,39]; in brief, fully connected and convolutional neural networks established the basic principles, which later grew into architectures such as ResNet, DenseNet, and the Transformer.
To assess the quality of the model, standard metrics were used (the corresponding list is provided in Section 2.4), in addition to the Area Under the Receiver Operating Characteristic Curve (ROC AUC). The ROC curve is constructed by plotting sensitivity against 1 − specificity at different thresholds for assigning an object to the target class, and the area under it summarizes how accurately the model performs classification. Importantly, ROC AUC can also be used to evaluate segmentation models. To achieve this, a Dice threshold is set at which segmentation is considered successful (class 1), with everything below the threshold classified as class 0; the ROC curve thresholds then correspond to the activation threshold for image pixels. In our experiments, a Dice score of 0.5 was chosen as the threshold for successful segmentation. An approach in which the Dice coefficient is used to construct binary quality labels, from which the ROC AUC metric is then calculated, can be found in [40,41,42,43]. In our case, the threshold of 0.5 reflects the consideration that if a model’s final Dice exceeded 0.5, the model was considered capable of finding the correct patterns for localizing the target class and providing a minimal description of it, indicating that the chosen training strategy was sound.
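The Dice-thresholding scheme described above can be sketched in plain NumPy. The per-case confidence scores would come from the model and are assumed here; the AUC is computed with the Mann–Whitney rank formulation rather than a library call:

```python
import numpy as np

def binary_labels_from_dice(dice_scores, threshold=0.5):
    """Per-case labels: 1 if the segmentation counts as successful (Dice >= threshold)."""
    return (np.asarray(dice_scores) >= threshold).astype(int)

def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney rank statistic (ties get average ranks)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    order = scores.argsort()
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    for s in np.unique(scores):          # average ranks over tied scores
        tied = scores == s
        ranks[tied] = ranks[tied].mean()
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```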

2.4. Additional Metrics

To comprehensively assess the quality of the trained segmentation models, we used additional metrics, as listed in Table 1. When calculating the metrics, a threshold of Dice ≥ 0.5 was set to classify the segmentation as “correct segmentation,” which corresponded to “1”.

2.5. Dataset

As a dataset for training the networks and assessing the quality of the results obtained, we used a closed dataset collected from two hospitals:
  • Mariinsky Hospital—183 MRI scans of the pelvis.
  • Petrov Research Institute—217 MRI scans of the pelvis.
In total, the dataset contained 400 anonymized pelvic mpMRI scans from men aged 25–62 years. The scans were acquired on MRI machines with a magnetic field strength of 1.5 T, with slice thicknesses of 2.5 and 3 mm. For training, we standardized the dataset by resampling each volume to 256 × 256 × 128 voxels using trilinear interpolation.
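The standardization step could look like the following PyTorch sketch; the helper name and the (slices, height, width) axis order are our assumptions:

```python
import torch
import torch.nn.functional as F

def resample_volume(volume: torch.Tensor, target=(128, 256, 256)) -> torch.Tensor:
    """Resample a (D, H, W) MRI volume to a fixed grid with trilinear interpolation.

    The paper's 256 x 256 x 128 target corresponds here to
    (slices=128, height=256, width=256) in torch's (D, H, W) ordering.
    """
    v = volume.float().unsqueeze(0).unsqueeze(0)   # -> (N=1, C=1, D, H, W)
    v = F.interpolate(v, size=target, mode="trilinear", align_corners=False)
    return v.squeeze(0).squeeze(0)                 # -> target shape
```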
Each scan was reviewed by three specialists who identified signs of clinically significant prostate cancer of different stages. In addition, the T2 series of each scan was annotated by radiologists to delineate the prostate area and prostate cancer. All scans in the dataset were annotated by each specialist, and annotation agreement was verified using the Dice coefficient. If discrepancies arose (Dice coefficient < 0.95 between two annotations), a group discussion was held, resulting either in adjustments to the existing annotations or, in the absence of consensus, exclusion of the scan from the dataset. The final Dice coefficient among the annotations of all physicians was 0.983. One of the three masks, selected randomly, was used to generate the dataset.
We used the T2 series of each scan to train neural networks since they are the clearest and most easily interpreted by humans and are most suitable for analysis by neural network algorithms.
All data were obtained in the DICOM medical data storage format. This format is the standard for storing MRI data (pixel array) and annotations (DICOM tags) in a single file. The MRI annotation includes a wide range of tags, encompassing patient data, device data, and study data. All tags are editable, meaning that additional information can be added to specific tags while analyzing a scan, and unnecessary information can be removed if required. The data anonymization process involves removing tags with direct and indirect patient and study information, including their IDs.
The data received were anonymized by the hospitals prior to transmission. We converted the data to the NIfTI storage format, as this format is widely used for training artificial intelligence on medical data, particularly in MRI tasks. The conversion used only the image data, without transferring any auxiliary information from the DICOM files.
The division of all data into training, validation, and test sets was performed using the built-in tools of the MONAI framework with a fixed random seed in the following ratios:
  • For segmentation (2D and 3D): train—0.7, val—0.2, test—0.1;
  • For detection: train—0.85, test—0.15;
  • For classification: train—0.7, val—0.2, test—0.1.
Note that the split into the different sets (training, validation, and test) was performed once in the specified proportions. Since the study involved combining models across methods, the test portion was kept the same for all three training types to prevent possible data leakage among the models, with the exception that for detection, the validation portion was eliminated and redistributed to the training and test sets in proportions of 0.15 and 0.05, respectively (yielding the 0.85/0.15 split above). All splits were performed at the study level, such that all slices from a given mpMRI examination were assigned to the same set (train, validation, or test) for every task.
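A study-level split of this kind can be sketched in plain Python as a stand-in for the MONAI partitioning (the seed and function name are illustrative); splitting by study ID ensures every slice of one examination lands in a single set:

```python
import random

def split_studies(study_ids, ratios=(0.7, 0.2, 0.1), seed=42):
    """Deterministic study-level train/val/test split with a fixed seed."""
    ids = sorted(study_ids)          # canonical order before shuffling
    rng = random.Random(seed)        # fixed seed -> reproducible split
    rng.shuffle(ids)
    n = len(ids)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]
```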

2.6. Hardware and Software Setup

The models were trained using the MONAI v.1.3.2 [44] and PyTorch v.2.4.0 [45] frameworks. The server hardware on which training and quality control of the trained models were performed is shown in Table 2. The operating system was Windows Server 2019 Standard.

2.7. Method

Our main goal was to achieve accurate segmentation of prostate cancer. However, during the study, we found that simply using a neural network for cancer segmentation did not produce acceptable results on our dataset; we therefore began searching for methods to improve prostate cancer segmentation. Based on the progress of the entire study and the results obtained, we can divide the research process into three stages: preliminary, main, and final.
During the preliminary stage, we used neural networks for prostate cancer segmentation, in addition to an approach that first applied a classifier trained to separate slices into two classes: “with cancer” and “without cancer.” In the first case, we employed two approaches to cancer segmentation: segmentation of 2D data and segmentation of 3D data. In the second case, we first created a new dataset, retaining only those slices labeled by the classifier as containing cancer. The new dataset was then used to train neural networks for 2D and 3D segmentation. A schematic of the final algorithm, using a single study from this stage as an example, is shown in Figure 3.
To improve the results of the preliminary stage, we chose to use only the prostate region for cancer segmentation, which required us to extract only the prostate region and then train the model on the new data. The main phase of our study was based on prostate-region extraction and consisted of three stages. In the first stage, we used neural segmentation networks to extract the prostate from 2D and 3D data from the entire dataset, without first creating a new dataset.
To improve the results, we revisited the concept from the preliminary stage: feeding only slices containing the prostate to the segmentation networks. To this end, we trained a neural network for binary classification (presence/absence of the prostate in a slice) and then created a new dataset. However, at this stage, we modified the rule for creating the new dataset: we included all slices in the range from the lowest to the highest slice number classified as containing the prostate. This approach is justified by the fact that the prostate is a continuous organ, thus eliminating possible classifier errors in the midsection. The new dataset was split according to the proportions specified in Section 2.5 and used for 2D and 3D segmentation. A schematic representation of the algorithm using a single series as an example is shown in Figure 4.
Since some segmentation errors were found in the segmentation results, we further refined the target region and used a neural network for detection as a preprocessing step. This network detected the prostate and enclosed it in a bounding box. To avoid cropping out any segment of the prostate (if the detection was inaccurate) and to ensure uniform data size (across the study), we calculated the maximum coordinates among all frames (maximum height and width) and increased them by 5% in each direction.
We then cropped the images, thereby creating a parallelepiped smaller than the original MRI image but still containing the target region with the prostate. Segmentation training was also performed on the new dataset. A schematic representation of the algorithm using a single series as an example is shown in Figure 5.
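The frame-expansion and cropping step above can be sketched as follows (hypothetical helper names; per-slice boxes are assumed to be in pixel coordinates):

```python
import numpy as np

def union_box_with_margin(boxes, img_hw, margin=0.05):
    """Union of per-slice detection boxes, expanded by `margin` on each side.

    boxes: list of (x1, y1, x2, y2) tuples across all slices of one study.
    Returns integer crop coordinates, rounded outward and clipped to the image.
    """
    b = np.asarray(boxes, dtype=float)
    x1, y1 = b[:, 0].min(), b[:, 1].min()
    x2, y2 = b[:, 2].max(), b[:, 3].max()
    dx, dy = margin * (x2 - x1), margin * (y2 - y1)   # 5% of box size by default
    h, w = img_hw
    return (int(max(0, x1 - dx)), int(max(0, y1 - dy)),
            int(min(w, np.ceil(x2 + dx))), int(min(h, np.ceil(y2 + dy))))

def crop_volume(volume, box):
    """Crop a (D, H, W) volume to the shared box: a smaller parallelepiped."""
    x1, y1, x2, y2 = box
    return volume[:, y1:y2, x1:x2]
```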
The final stage involved using the best prostate segmentation method to generate a modified dataset containing only the prostate region and training a cancer segmentation network with the best performance from the preliminary stage. A schematic representation of the entire final stage algorithm is shown in Figure 6.
It is also important to note that augmentations were used during training, specifically tools for random rotation by 90°, random reflection, and random intensity and contrast changes. For the prostate, additional post-processing was performed to retain the largest segment found and fill any gaps within the segment.

3. Experiments

3.1. Preliminary Stage: Training 2D and 3D Cancer Segmentation Networks

During this stage, we trained neural networks for 2D cancer segmentation. The training dataset consisted of MRI scans. Three architectures were chosen as training neural networks: UNETR, UNET++, and Swin-UNETR. The configuration parameters of the neural networks used, the training configuration, the metric used, and the error function for this and all other neural networks are presented in Tables S1 and S2.
To train neural networks for segmentation on 3D data (entire MRI scans), we chose four architectures: UNETR, Swin-UNETR, SegResNetDS, and SegResNetVAE. The training configurations and training graphs are presented below.
At this stage, we also trained the classification network. DenseNet-201 was chosen as the training architecture. After training DenseNet-201 (Figure 7 and Figure 8), we created a modified dataset for 2D and 3D segmentations. The 2D dataset consisted of slices from the MRI scans that the classification network classified as prostate cancer. For the 3D dataset, we created “stripped” MRI scans, which included only those slices labeled as prostate cancer, in addition to one slice before and one slice after the classified sequence. This expansion was used to reduce the risk of classifier error.
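The rule for building a "stripped" scan can be sketched as follows, treating the classifier-positive slices as one run from the first to the last positive slice and adding one slice of context on each side (an interpretation of the rule above; the function name is ours):

```python
def stripped_slice_range(cancer_flags, n_slices):
    """Slice range kept in a 'stripped' scan.

    cancer_flags: per-slice booleans from the cancer/no-cancer classifier.
    Returns (start, stop) for volume[start:stop], or None if no positive slice.
    """
    positives = [i for i, flag in enumerate(cancer_flags) if flag]
    if not positives:
        return None
    start = max(0, min(positives) - 1)        # one extra slice before the run
    stop = min(n_slices, max(positives) + 2)  # one extra slice after the run
    return start, stop
```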
Since the 2D dataset contained fewer slices after applying the classifier, but the slices themselves remained unchanged, we did not retrain the 2D segmentation networks and only tested them on the new dataset. However, the networks for 3D segmentation were retrained, because the data had essentially changed (the number of slices, and therefore the third dimension of the input tensor, had changed); the same set of networks was used for retraining. The ROC AUC results on the test set were 0.80 for cancer and 0.86 for the prostate. The training process is presented below.

3.2. Main Stage: Training 2D and 3D Prostate Segmentation Networks

To train the networks for prostate segmentation, we chose to use the same architecture as in the preliminary stage. Prostate masks were used as data. The training dataset and the training process itself were prepared in a similar manner to the preliminary stage.
To enable prostate segmentation training on the entire MRI dataset (3D data), we also selected the same architectures for 3D prostate cancer segmentation. The training dataset and the training process itself were also prepared in a similar manner to the preliminary stage.
The next step at this stage was to replicate the experience of using a classifier to restrict uninformative slices for segmentation. As in the preliminary stage, the DenseNet-201 neural network was chosen as the classifier. The configuration and training parameters are presented in Table S3.
After training the classifier, we updated the datasets for training 2D and 3D segmentation. The generation rule for the updated datasets was almost identical to that described in the preliminary stage, except that the new 3D data comprised all slices from the minimum to the maximum slice number classified as containing the prostate. This choice is motivated by the anatomy of the prostate: it is a continuous organ that is not divided into two or more parts, unlike cancer lesions, which can occur in multiple areas of the prostate gland.
Since the prostate segmentation task parallels the prostate cancer segmentation task, we were again required to retrain the 3D segmentation networks, as the new 3D data were smaller than the original. The network configuration and training process are presented below. We achieved a slight improvement in the main Dice metric: using the classifier meant feeding the network fewer slices, resulting in fewer false positives. The next step involved applying detection instead of classification, as this would minimize the number of irrelevant inclusions and structures in the data passed to segmentation.
As a working approach, we developed the following algorithm: submitting the MRI scan for detection, obtaining bounding box coordinates, finding the maximum coordinates (for each of the four box vertices), incrementing each by 5%, rounding up, and cropping the original MRI scan to the resulting parallelepiped. In this approach, increasing the frame size (by 5% on each side) was necessary to avoid the occurrence of random errors in detection. Next, it was necessary to resize all of the obtained images to a uniform size, as the resulting frames were smaller than the original image in all cases. The standard image size of 128 × 128 × 24 (in the case of 3D data) was chosen, and resizing was performed using PyTorch tools.
The popular and powerful YOLO architecture was used to train the detection network. The final result on the test set was as follows: mAP50: 0.85 and mAP50–95: 0.56. The training parameters and standard set of post-training metrics are listed in Table 3 and Figure 9.
Two architectures were chosen as trainable neural networks for working with 2D slices: UNET++ and Swin-UNETR. For training on entire MRI scans, we chose Swin-UNETR and SegResNetDS.
Among all of the trained neural networks for segmentation, the Swin-UNETR architecture demonstrated the best results in the preliminary and main stages. Plots for the Dice metric, in addition to plots for the error function, are presented below: for 2D segmentation in Figure 10 and Figure 11, respectively, and for 3D segmentation in Figure 12 and Figure 13.
Based on the post-training metrics (all metrics are presented in the following section), the best prostate segmentation results were shown by the combination of first detection and then segmentation using the trained Swin-UNETR network.

3.3. Final Stage: Dataset Formation and Segmentation Training

The final stage consisted of preparing the dataset and training the neural network for cancer segmentation on the new data. To generate the dataset, we used the best method from the main stage, namely, a combination of preliminary detection and subsequent 3D segmentation using the Swin-UNETR network. For training, we selected the architecture with the best 2D and 3D cancer segmentation performance from the preliminary stage, which remained Swin-UNETR in both cases. The training process is presented below (Figure 14 and Figure 15).

4. Results

In this study, the best prostate cancer segmentation performance was achieved with the final-stage pipeline using 3D cancer segmentation. The ROC AUC plots (Figure 16) for the trained segmentation models used in the final stage are presented below.
An example of segmentation of cancer foci by the final 2D cancer segmentation model is shown in Figure 17.
An example of segmentation of cancer foci by the final 3D cancer segmentation model is shown in Figure 18.
A summary of all metrics for each trained segmentation network is provided in Table S4. It does not include the classifier metrics or the YOLO detection network, as these networks were trained as single instances for their tasks. The Swin-UNETR models demonstrated the best results for both 2D and 3D segmentation. These results can be explained by the architecture’s self-attention mechanism, which is likely well suited to the problems under consideration. However, it is also worth testing the applicability of this architecture to other data and other medical tasks.

5. Discussion

In this study, we explored methods for improving prostate cancer segmentation results. Although conducting cancer classification on slices for further segmentation did not yield significant improvements, this aspect prompted the idea of using the prostate region for cancer segmentation. This approach led to an analysis of conventional prostate segmentation and two ways to improve this process: classification and detection. The results show that using detection to isolate the prostate region can increase the accuracy of organ segmentation, which can conceptually be transferred to other applied segmentation tasks.
A potential bottleneck of this approach to segmentation is the loss of algorithmic flexibility in handling variable data. During our study, we encountered eight MRI scans that contained significant deviations in contrast values relative to the rest of the dataset. These scans produced the worst prostate and cancer segmentation results (Figure 19), even though prostate detection did not materially affect bounding-box localization. One of the future directions of our research will be to determine methods to address this issue.
The scans with contrast issues were all obtained from a single medical institution. A brief review revealed that two tags, window width (WW) and window level (WL), were omitted from the original DICOM files; these tags define the contrast ratio used when displaying the scans. Since they were missing during conversion from DICOM to NIfTI format, this could be one source of the problematic data. The omission might have been caused by an error during scan generation or data anonymization. A practical solution is to verify the presence of these tags and substitute default values if they are missing.
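The proposed check can be sketched as follows. This is a minimal illustration that assumes defaults derived from the pixel-intensity range when WindowCenter (0028,1050) and WindowWidth (0028,1051) are absent; a plain dict stands in for the DICOM header, and with a library such as pydicom the check on a real dataset would look similar.

```python
import numpy as np

def ensure_window_tags(tags, pixels):
    """Fill in WindowCenter/WindowWidth with intensity-range defaults
    when the tags are missing (assumed fallback, not a DICOM mandate)."""
    lo, hi = float(pixels.min()), float(pixels.max())
    tags = dict(tags)
    tags.setdefault("WindowCenter", (lo + hi) / 2.0)   # default level
    tags.setdefault("WindowWidth", max(hi - lo, 1.0))  # default width
    return tags

def apply_window(pixels, center, width):
    # Standard linear DICOM windowing to a [0, 1] display range.
    lo = center - width / 2.0
    return np.clip((pixels - lo) / width, 0.0, 1.0)

px = np.array([[0.0, 200.0], [400.0, 800.0]])
tags = ensure_window_tags({}, px)
disp = apply_window(px, tags["WindowCenter"], tags["WindowWidth"])
```

Running this check before DICOM-to-NIfTI conversion would prevent the missing tags from silently distorting the converted contrast.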
A starting point for addressing this issue could be an extended set of augmentation tools during data preprocessing. In our case, two options are possible: processing only those eight scans, for example using external tools for automatic contrast adjustment, or increasing the dataset's resilience to contrast outliers, for example by applying augmentations that introduce random changes in brightness and contrast with high probability and magnitude during training. In specific cases, scans with contrast defects can be adjusted manually by editing the WW and WL tag values in the DICOM files (before conversion to NIfTI format). However, this approach is impractical at scale, as contrast may depend, among other things, on the MRI machine settings. These limitations favor addressing the issue through automated data processing or by increasing the model's resilience to data variability.
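A high-probability, high-magnitude intensity augmentation of the kind suggested above could look like the sketch below. The probability and magnitude values are hypothetical, and the input is assumed to be normalized to [0, 1]; libraries such as MONAI provide comparable dictionary transforms (e.g., RandAdjustContrastd, RandScaleIntensityd, RandShiftIntensityd).

```python
import numpy as np

def random_intensity_aug(img, rng, prob=0.9,
                         gamma_range=(0.5, 2.0),
                         scale_range=(0.7, 1.3),
                         shift_range=(-0.3, 0.3)):
    """Random contrast/brightness augmentation with high probability and
    magnitude (assumed values), for images normalized to [0, 1]."""
    out = img.astype(float)
    if rng.random() < prob:                   # non-linear (gamma) contrast
        out = np.clip(out, 0.0, 1.0) ** rng.uniform(*gamma_range)
    if rng.random() < prob:                   # linear contrast around mean
        m = out.mean()
        out = (out - m) * rng.uniform(*scale_range) + m
    if rng.random() < prob:                   # brightness shift
        out = out + rng.uniform(*shift_range)
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
img = np.linspace(0.0, 1.0, 64).reshape(8, 8)
aug = random_intensity_aug(img, rng)
```

Applying such transforms during training exposes the model to the kind of contrast outliers observed in the eight problematic scans.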
Another issue we encountered, which prompted us to use prostate segmentation for data refinement, was abrupt data changes introduced by the classifier. The problem arose when two cancer lesions were separated by sections of healthy prostate tissue: when generating a modified dataset after the cancer classifier, sharp boundaries appeared within the segmented data, which was especially noticeable in 3D segmentation. In our work, we avoided this problem by transferring the prostate region instead; however, the issue may arise in other segmentation applications that use modified datasets.
During post-processing of the detection results, we took the maximum coordinates of the bounding boxes across slices and then expanded them by 5%. This value was chosen heuristically for the following reasons. First, a smaller margin reduces the likelihood of correcting a potential detection error, whereas a larger margin can degrade segmentation by capturing a large amount of non-target information. Second, a 5% expansion unifies the data for segmentation: even the slice with the largest prostate area then contains some non-target context, similarly to the other slices. If the bounding box is expanded excessively, muscle tissue structures may be included in sufficient quantity to cause false positives in segmentation; this risk must be considered when forming the dataset. In addition, excessive non-target information tends to complicate training, a negative side effect of enlarging the box. The combination of these factors led us to expand the box by 5% on each side; however, this value is not universal and leaves room for further refinement, both for this task and for other tasks where such an approach can be used.
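The box post-processing described above, union over slices followed by a 5% margin clipped to the image bounds, can be sketched as follows; the (x0, y0, x1, y1) box convention is an assumption for illustration.

```python
import numpy as np

def union_and_expand(boxes, img_h, img_w, margin=0.05):
    """Take per-slice detector boxes (x0, y0, x1, y1 in pixels) for one
    study, form their union over slices, and expand each side by
    `margin` (5% here, as in the text), clipped to the image bounds."""
    b = np.asarray(boxes, dtype=float)
    x0, y0 = b[:, 0].min(), b[:, 1].min()   # union: tightest box covering
    x1, y1 = b[:, 2].max(), b[:, 3].max()   # every per-slice detection
    dx, dy = margin * (x1 - x0), margin * (y1 - y0)
    return (max(x0 - dx, 0.0), max(y0 - dy, 0.0),
            min(x1 + dx, img_w - 1.0), min(y1 + dy, img_h - 1.0))

# Two slices of one study; the union is (90, 110, 210, 230).
box = union_and_expand([(100, 120, 200, 220), (90, 110, 210, 230)],
                       img_h=312, img_w=312)
```

Using one expanded box per study, rather than per-slice boxes, is what keeps the cropped 3D volume rectangular and consistent across slices.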
It is also worth noting that the data used in this study were obtained from two medical institutions, which does not rule out the possibility of correlations specific to these datasets. The described algorithm requires further validation using external data, including testing the applicability of the approach to other organs or tasks, where the area of interest can be clarified and the results further improved.
Beyond technical performance, the proposed pipeline has several potential clinical applications. Robust prostate and cancer segmentation on MRI could support personalized care by assisting radiologists in lesion localization, enabling more precise targeting of biopsies and focal treatments, and providing quantitative tumor volumes for risk stratification and treatment response assessment at the individual patient level. In radiotherapy, accurate organ and tumor masks are also essential for automated contouring and dose planning; improving segmentation quality may therefore translate into more consistent, patient-specific treatment plans. Prospective studies are required to confirm these benefits in real-world clinical workflows.
One of the main areas for further research will be testing the feasibility of using a combination of mpMRI series to improve the accuracy of segmentation and of the entire algorithm. Specifically, we will examine the feasibility of combining two series into one, as well as training multiple models (each on a separate series) and using their ensemble.

6. Conclusions

In this study, we explored a cascaded approach to improve prostate cancer segmentation on T2-weighted pelvic MRI. Starting from direct 2D and 3D segmentation of cancer, we incrementally refined the input data by incorporating prostate-focused preprocessing steps: slice-level classification, YOLO-based prostate detection, and 3D prostate segmentation. On a dataset of 400 annotated MRI scans from two clinical centers, the best prostate segmentation performance was obtained with a combination of prostate detection and 3D Swin-UNETR segmentation (Dice 76.09%). Using this refined prostate region as input for 3D cancer segmentation increased the cancer Dice coefficient from 55.03% for direct 3D Swin-UNETR segmentation to 67.11%, with an ROC AUC of 0.89, demonstrating that isolating the prostate region meaningfully improves cancer segmentation quality.
Concurrently, the proposed solution increases pipeline complexity, as it relies on several sequential networks for classification, detection, and segmentation. The approach may be sensitive to variations in image acquisition: in our dataset, a small subset of scans with markedly different contrast characteristics showed the poorest segmentation performance despite accurate detection, highlighting robustness as an important limitation. In future work, we will focus on handling such heterogeneous datasets, assessing generalization on additional external cohorts, and investigating whether the cascaded strategy can be simplified or adapted to multiparametric inputs while preserving or further improving prostate and cancer segmentation accuracy.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/a19010085/s1. Table S1: Configuration parameters of each segmentation neural network; Table S2: Optimizer and training parameters; Table S3: Configuration and training parameters for the classification networks; Table S4: Final metrics of the models.

Author Contributions

Conceptualization, N.N. and N.S.; methodology, N.N.; software, N.S.; validation, R.D., N.N. and N.S.; formal analysis, N.S.; investigation, N.N.; resources, N.N.; data curation, N.N.; writing—original draft preparation, N.N. and R.D.; writing—review and editing, N.N. and R.D.; visualization, N.N.; supervision, N.S. and R.D.; project administration, N.S.; funding acquisition, R.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Example of Dice calculation: (a) true (red) and predicted (yellow) masks; (b) mask overlap zone, corresponding to |X⋂Y|; (c) total mask zone, corresponding to |X| + |Y|.
Figure 2. Example of prostate detection using YOLO.
Figure 3. Schematic of the preliminary-stage algorithm using one series as an example.
Figure 4. Schematic representation of the algorithm using the prostate classification network in the main stage, using a single series as an example.
Figure 5. Schematic representation of the algorithm using the prostate detection network in the main stage.
Figure 6. Schematic representation of the final algorithm for the final stage.
Figure 7. Training graphs of the DenseNet-201 neural network (cancer): (a) dependence of the error on the training epoch; (b) dependence of the AUC on the training epoch.
Figure 8. Training graphs of the DenseNet-201 neural network (prostate): (a) dependence of the error on the training epoch; (b) dependence of the AUC on the training epoch.
Figure 9. Standard set of YOLO metrics after training.
Figure 10. Graphs of Dice change during training of the 2D Swin-UNETR network in the preliminary and main stages: (a) Dice change from epoch to epoch on the training set; (b) Dice change from epoch to epoch on the validation set.
Figure 11. Graphs of DiceCELoss change during training of the 2D Swin-UNETR network in the preliminary and main stages: (a) DiceCELoss change from epoch to epoch on the training set; (b) DiceCELoss change from epoch to epoch on the validation set.
Figure 12. Graphs of Dice changes during training of the 3D Swin-UNETR network in the preliminary and main stages: (a) Dice change from epoch to epoch on the training set; (b) Dice change from epoch to epoch on the validation set.
Figure 13. Graphs of DiceCELoss changes during training of the 3D Swin-UNETR network in the preliminary and main stages: (a) DiceCELoss change from epoch to epoch on the training set; (b) DiceCELoss change from epoch to epoch on the validation set.
Figure 14. Graphs of Dice changes during the training of the Swin-UNETR network at the final stage: (a) Dice change from epoch to epoch on the training set; (b) Dice change from epoch to epoch on the validation set.
Figure 15. Graphs of DiceCELoss changes during Swin-UNETR network training at the final stage: (a) DiceCELoss change from epoch to epoch on the training set; (b) DiceCELoss change from epoch to epoch on the validation set.
Figure 16. AUC metric for Swin-UNETR networks trained in the final stage.
Figure 17. Segmentation of cancer by the final 2D segmentation model.
Figure 18. Segmentation of cancer by the final 3D segmentation model.
Figure 19. Examples of scans with contrast outliers and segmentation errors.
Table 1. Description of the metrics used.
Accuracy: shows the overall correctness of the segmentation.
Precision: shows the proportion of true positive pixels among all pixels classified as positive.
Recall: shows the proportion of correctly identified positive pixels among all true positive pixels.
Specificity: shows the ability to correctly identify negative pixels.
F1: based on precision and recall; provides a balanced measure of segmentation accuracy.
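An illustrative pixel-wise computation of these metrics for binary masks is sketched below; the example masks are hypothetical.

```python
import numpy as np

def segmentation_metrics(pred, true):
    """Compute the Table 1 metrics pixel-wise for binary masks."""
    pred, true = pred.astype(bool), true.astype(bool)
    tp = np.sum(pred & true)    # correctly predicted positive pixels
    tn = np.sum(~pred & ~true)  # correctly predicted negative pixels
    fp = np.sum(pred & ~true)   # false positives
    fn = np.sum(~pred & true)   # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

# Ground-truth square and a prediction shifted by one row.
true = np.zeros((4, 4)); true[1:3, 1:3] = 1
pred = np.zeros((4, 4)); pred[0:2, 1:3] = 1
m = segmentation_metrics(pred, true)
```

Note that for binary masks the F1 score coincides with the Dice coefficient 2|X⋂Y|/(|X| + |Y|) illustrated in Figure 1.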
Table 2. Server configuration for training.
CPU: Intel(R) Xeon(R) Silver 4214R
GPU: NVIDIA RTX A6000
RAM: 320 GB DDR4
Table 3. YOLO configuration.
img_size: 312 × 312
in_channels: 1
output_channels: 1
feature_size: 48