A Convolutional Neural Network-Based Auto-Segmentation Pipeline for Breast Cancer Imaging

Abstract: Medical imaging is crucial for the detection and diagnosis of breast cancer. Artificial intelligence and computer vision have rapidly become popular in medical image analyses thanks to technological advancements. To improve the effectiveness and efficiency of medical diagnosis and treatment, significant efforts have been made in the literature on medical image processing, segmentation, volumetric analysis, and prediction. This paper presents the development of a prediction pipeline for breast cancer studies based on 3D computed tomography (CT) scans. Several algorithms were designed and integrated to classify the suitability of the CT slices. The selected slices from patients were then further processed in the pipeline. This was followed by data generalization and volume segmentation to reduce the computation complexity. The selected input data were fed into a 3D U-Net architecture in the pipeline for analysis and volumetric predictions of cancer tumors. Three types of U-Net models were designed and compared. The experimental results show that Model 1 of U-Net obtained the highest accuracy at 91.44% with the highest memory usage; Model 2 had the lowest memory usage with the lowest accuracy at 85.18%; and Model 3 achieved a balanced performance in accuracy and memory usage, which is a more suitable configuration for the developed pipeline.


Introduction
In recent years, a convergence of multiple factors has led to a worsening global shortage of radiologists. Examples of such factors include the rising demand for medical imaging, the coronavirus disease pandemic, and global aging. Over the past decade, artificial intelligence (AI) has been rapidly adopted in different industrial sectors. Similar to the ways humans carry out complex problem solving and decision making, AI is now able to learn from relevant datasets for various applications [1,2].
In the healthcare industry, radiology was one of the first medical disciplines to utilize computer vision (CV) for its medical applications [3]. As a subfield of AI, CV has a rich history spanning decades of research work to enable computers to meaningfully interpret visual stimuli. The development of CV has recently been accelerated by three main factors: (1) the massive improvement in computer processing capability; (2) the rise of big data with the amassment and storage of large amounts of data; and (3) the increased contributions from machine learning algorithm research. CV systems can be used to analyze medical images such as CT scans, X-rays, and magnetic resonance images (MRIs). Convolutional neural networks (CNNs) are one of the mainstream methods in the field of medical image segmentation due to their performance [4][5][6][7]. Breast cancer is one of the most common cancers in women worldwide. An MRI segmentation model with a CNN was developed for the volumetric measurement of breast cancer [4]. Pre-trained CNN models were used for feature extraction in the task of detecting breast cancer in mammography images [5]. A dual-modal CNN was introduced to analyze both ultrasound (US) images and shear-wave elastography for the prediction of breast cancer [8]. A Regional Convolutional Neural Network (R-CNN) was utilized for MRI analysis to detect breast tumors [9]. Another R-CNN-based framework was reported for the MRI analysis and detection of breast cancer pathological lesions [10]. A deep learning-based tumoral radiomics model for CT images was introduced to predict the pathological complete response [11]. A CNN model was used for the delineation of the clinical target volumes (CTVs) in CT images of breast cancers in radiotherapy [6]. It was observed that the performance of CNN-based segmentation of CTVs in CT images of breast cancer was better than that of the manual process [12]. A CNN was presented for the classification and detection of cancerous tumors in CT scans [13]. These systems may be able to detect cancerous tumors with certain precision for small patches of anomalous tumor segments. Radiologists can fail in their process of cancer identification [14,15], depending on their level of expert knowledge and experience. CV systems are complementary tools that can be used to support the work of radiologists and reduce diagnostic times.
Carrying out volumetric predictions of breast cancers in 3D CT scans is a challenging task that depends on the experience of the radiologist. It would be beneficial to have an automatic framework using AI models for accurate volumetric predictions of tumors.
In this research, a CNN-based prediction pipeline is proposed for the volumetric prediction of breast cancers in 3D CT scans. The prediction pipeline consists of a suitable and accurate 3D CNN model, which is trained on CT scan datasets. As a type of CNN model, the U-Net architecture is capable of obtaining good precision with little training data. The CT scans of a total of 347 patients were provided by the National Cancer Centre, Singapore (NCCS) for the pipeline's development. It should be noted that not all data are suitable for model training. The proposed pipeline is able to perform data pre-processing and manipulation to automatically select the proper data. The architecture of the U-Net in the pipeline was crafted and configured, and experiments were conducted to evaluate the performance of the pipeline outputs.
The main contributions of this paper are as follows: (1) A prediction pipeline is proposed, utilizing a 3D U-Net architecture for image segmentation in 3D CT scan images. The developed pipeline consists of a series of algorithms for data pre-processing to generalize and normalize the CT scan data. A 3D U-Net architecture is customized to cater to the requirements of accurate segmentation and volumetric prediction of breast tumors. The designed pipeline can become a complementary supporting tool for radiologists to increase the productivity and efficiency of their jobs. (2) For the U-Net architecture in the developed pipeline, a hybrid Tversky-cross-entropy loss function is utilized, which combines the advantages of the binary cross-entropy (BCE) loss function and the Tversky focal loss. A Nesterov-accelerated adaptive moment estimation (Nadam) optimization algorithm is leveraged to achieve better optimization performance. (3) Three types of 3D U-Net architecture models are designed and compared in this research. Their performance is evaluated based on the Dice coefficient metric [16].
The organization of the remaining parts of the paper is as follows. Section 2 introduces the relevant background knowledge. Section 3 presents our methodology and design. Section 4 discusses the experimental results. Section 5 concludes the research.

Background Knowledge

CNN and U-Net
The CNN is a variant of deep learning artificial neural networks. It is well suited to pattern recognition and image analysis in CV. The architecture of a CNN consists of convolution layers [17,18]. These layers help identify patterns in images through a set of filters, detecting edges and lines in earlier layers and then identifying whole shapes and objects in later layers. Apart from the convolution layers, the activation layers, pooling layers, and fully connected layers are also important for a CNN [19]. Over the years, various types of CNN architectures have been reported to cater to different applications. Some of the commonly seen architectures are shown in Table 1.
Table 1. Various CNN architectures in the literature.

• Used for handwritten digit recognition.
• Suffered from the vanishing gradient problem.
• Used on large-scale image datasets.
• Uses the Rectified Linear Unit as the activation function, with a batch size of 128.
• Fewer parameters than AlexNet but with better performance.
• Deeper architecture due to 1 × 1 convolutions and global average pooling.
• Can take a large input image of 224 × 224-pixel size.
• Can be used for tasks of natural language processing.
• Computationally efficient to match the computation power of Graphics Processing Units (GPUs).

MobileNets [29]
• Used depth-wise convolutions to apply a single filter to each channel.
• Introduced two hyperparameters: a width multiplier and a resolution multiplier.
• Can work on mobile devices for mobile and embedded vision applications, such as object detection, face recognition, etc.

U-Net [30,31]
• Used for semantic segmentation to address the challenge of limited medical data with annotations.
• Consists of a contracting path and an expansive path (i.e., an encoder and a decoder).
• Designed to work with fewer training images yet yield favorable precision and computational efficiency.
U-Net is a type of deep CNN architecture suitable for biomedical image analysis [7,30]. U-Net-based architectures are among the most widely used structures in the field of medical image segmentation, such as breast tumor image segmentation, as they can work with small training datasets yet produce accurate segmentation results [32,33]. The U-Net architecture derives its U-shape from the sequentially arranged encoder and decoder modules [30,31,34]. The encoder consists of convolutional layers, a batch normalization (BN) function, activation layers, and max pooling layers to realize the unique features in a given image. The decoder combines the encoded spatial and feature information through up-convolutions and concatenations to produce a high-resolution image. This image provides the localized information needed for semantic segmentation.
The configurations of the original U-Net architecture can be fine-tuned with different arrangements of the layers in the architecture. While designing a U-Net from scratch is possible, minor improvements to the existing architecture can be a quicker alternative, as observed in similar works [32].

Image Segmentation
Image segmentation is an image processing technique used to identify specific objects in an image [11,35,36]. Images can be divided into various partitions known as segments. Each segment is analyzed and assigned some values. There are three common categories of segmentation tasks: binary segmentation, active contour segmentation, and semantic segmentation. Binary segmentation utilizes a threshold method, where a histogram of unique pixel values is extracted. A suitable threshold value is then chosen to derive the resultant output as a binary segmented image [33]. For active contour segmentation, a boundary is first initialized around the object of interest, and it automatically moves towards the object. The object is marked via the difference in pixel values, using a check iteration algorithm. The final boundary gives the segmented image [37].
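As a minimal sketch of the threshold method described above (the image values and the threshold are hypothetical, not from the paper's pipeline), binary segmentation can be written in NumPy:

```python
import numpy as np

def binary_segment(image: np.ndarray, threshold: float) -> np.ndarray:
    """Threshold-based binary segmentation: pixels above the threshold
    become foreground (1), the rest background (0)."""
    return (image > threshold).astype(np.uint8)

# Hypothetical 4 x 4 "image" with a bright object in one region.
img = np.array([[10, 12, 11, 10],
                [11, 13, 200, 210],
                [10, 12, 205, 198],
                [11, 10, 12, 11]], dtype=np.float32)

mask = binary_segment(img, threshold=100.0)
# The four bright pixels are segmented as foreground.
```

In practice, the threshold would be chosen from the histogram of pixel values, as the text describes.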
When dealing with more complicated images, semantic segmentation can be used, which involves labelling each pixel of an image with a specific class [38]. For example, if there are three unique classes, namely Human, Bicycle, and Background, every object of the same class is labelled with the same pixel values. There is no distinction between different humans within the same class.

Proposed Structure of the Volumetric Prediction Pipeline
There are three axes, i.e., x, y, and z, in the 3D CT scans of each patient. The breast cancer CT scans contain a stack of images, depending on the spatial locations. The number of slices in the coronal plane and sagittal plane is the same for each patient, while the number of slices in the axial plane (i.e., the z-axis) differs between patients. For example, the scans of every patient have dimensions of 512 × 512 × n, where n varies for every patient, as shown in Figure 1. This is due to the nature of CT scanning machines. However, this can affect the U-Net model training, as it typically requires training data with equal spatial dimensions. It is thus necessary to derive suitable data by performing data pre-processing on the CT slices before feeding them into the proposed prediction pipeline.
The structure of the volumetric prediction pipeline is shown in Figure 2. It consists of a series of algorithms that filter the CT scan slices step by step, from the input data to the 2D and 3D prediction visualization outputs.


Detailed Data Pre-Processing Steps

The Python programming language is utilized in this research. Python libraries such as TensorFlow and Keras are used to assist in building the U-Net architecture. The computer used in the experiments includes an Intel Xeon W-2265 12-core processor, 264 GB of RAM, and four Nvidia Quadro RTX 5000 GPUs, each with 16 GB of GDDR6 memory, for model training.
The CT scans of 347 patients were provided by NCCS in the format of nearly raw raster data (.nrrd). The data are split into training, validation, and testing sets in the ratio of 60:20:20, corresponding to 210, 60, and 77 patients, respectively. The "pynrrd" and "matplotlib" Python libraries are utilized for processing the CT scans and visualizing the data. The CT scans of each patient form a 3D volume and are typically anisotropic. To gain a better understanding of the number of slices per patient in the axial plane, an algorithm is designed to extract the number of slices in the axial plane (Figure 3). It inspects all the data and iteratively counts the slices along the z-axis for each patient. The final count of slices in the z-axis per patient is the number of slices in the axial plane. The extracted number of slices per patient is shown in Figure 4.
The training, validation, and testing of the U-Net model require the CT scan images with their respective masks. However, the dataset of CT scan images from the 347 patients has masks for only some of the slices. Therefore, an algorithm is developed to iteratively record the slice indexes with corresponding masks for each patient; its flow chart is shown in Figure 5. The slice indexes are appended to the mask list of each patient. Some sample outputs derived by the algorithm are shown in Figure 6. For example, there are a total of 122 slices for patient 1, but only six slices (indexes 85-90) contain masks. We also note that one of the 347 patients has no slices with masks. As such, only the slices of the remaining 346 patients are used in the proposed pipeline.
It is observed from the output of the slices with masks that the number of masked slices in the axial plane differs between patients. To resolve this issue, an algorithm is developed to classify the patients based on a threshold value of the slice indexes in the z-axis. The threshold value is set to 96, and a fixed spatial size of 512 × 512 × 96 is utilized in the proposed pipeline. This ensures the same range of spatial locations along the z-axis for the CT scan slices, with data consistency across all patients. This means that the slices with masks in the range of the 1st to 96th indexes will be used in model training, validation, and testing.
As such, the 346 patients are classified into three categories using another algorithm, shown in Figure 7, according to the indexes of their slices with masks: (1) The list with all indexes ≤ 96: all indexes of the slices with masks are less than or equal to 96. As shown in Figure 6, most patients, e.g., patients 1, 2, 4, 5, and 7, satisfy this condition. (2) The list with all indexes > 96: all indexes of the slices with masks are greater than 96. As shown in Figure 6, patient 6 and patient 8 are examples. (3) The overlap list: the indexes of the slices with masks straddle the threshold of 96, i.e., some indexes are less than or equal to 96 and the others are greater than 96. As shown in Figure 6, patient 3 is the only patient that satisfies this condition.
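The three-way classification above can be sketched as a small helper function (the patient index lists here are hypothetical stand-ins for the cases in Figure 6; the paper specifies the algorithm only as the flow chart in Figure 7):

```python
def classify_patient(mask_indexes, threshold=96):
    """Classify a patient by the axial slice indexes that carry masks:
    'below' if all indexes <= threshold, 'above' if all > threshold,
    'overlap' if the indexes straddle the threshold."""
    if all(i <= threshold for i in mask_indexes):
        return "below"
    if all(i > threshold for i in mask_indexes):
        return "above"
    return "overlap"

# Hypothetical patients mirroring the three cases described in the text.
patients = {
    "patient_1": [85, 86, 87, 88, 89, 90],  # all indexes <= 96
    "patient_6": [101, 102, 103],           # all indexes > 96
    "patient_3": [94, 95, 96, 97, 98],      # straddles the threshold
}
labels = {p: classify_patient(idx) for p, idx in patients.items()}
```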
Besides the indexes of the slices with masks for each patient, another important parameter is the thickness of the CT scans. In general, there are two thicknesses, namely 3 mm and 5 mm. The distribution of the scan thickness of the 191 patients is shown in Table 3, where the CT scans of 127 patients have a thickness of 3 mm. Since there are more patients with a 3 mm scan thickness, they are used for the initial model training. The patient indexes of these 127 patients are shown in Table 4.


Data Normalization

For every patient, there are two sets of 512 × 512 × 96-pixel matrices, one for the CT scan images and the other for the mask labels, as shown in Figure 8. The values in the CT scan image matrices vary from −1000 to 5000, while the values in the mask label matrices vary between 0 and 1. Thus, the values in the CT scan image matrices need to be normalized to the range of 0 to 1. CT scans quantize medical images using Hounsfield Units (HUs) as the pixel value. By convention, the HU of air is −1000 and the HU of water is 0. By normalizing the CT scans to a certain range of HU, different organs or tissues can be contrasted for analysis [39]. In this research, the values of the CT scan images are normalized as shown in Equation (1).
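Equation (1) itself is not reproduced in this text. A plausible min-max formulation consistent with the stated value range of −1000 to 5000 is sketched below (an assumption, not necessarily the paper's exact equation):

```python
import numpy as np

# Value range stated for the CT scan matrices (an assumed min-max choice).
HU_MIN, HU_MAX = -1000.0, 5000.0

def normalize_hu(volume: np.ndarray) -> np.ndarray:
    """Min-max normalize Hounsfield Unit values to the range [0, 1]."""
    v = np.clip(volume, HU_MIN, HU_MAX)
    return (v - HU_MIN) / (HU_MAX - HU_MIN)

# Air (-1000 HU) maps to 0.0, water (0 HU) to ~0.167, the maximum to 1.0.
sample = np.array([-1000.0, 0.0, 5000.0])
out = normalize_hu(sample)
```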
Utilizing the 3D volumes in model training requires a large amount of GPU resources. Heavy computations are needed to process a CT scan image with dimensions of 512 × 512 × 96, and double the computation memory is required if the label matrix is included. To resolve this issue, the "Patchify" library is utilized, which divides a big 3D volume into smaller partitions for training, as shown in Figure 9. A total of 225 cuboids with dimensions of 64 × 64 × 96 are segmented from the original 3D volume of each patient. The computation complexity can thus be reduced significantly, and the model training becomes more manageable. However, two issues arise from the above processing: (1) Not all patients have 96 slices; for example, the scan dimensions of a few patients are 512 × 512 × 87.
(2) There are many empty cuboids without masks (i.e., tumor labels).If empty cuboids are fed into the training model, the loss function will become erratic, resulting in inaccurate model predictions.
To resolve issue (1), only patients having 96 slices in the axial plane are chosen. For issue (2), an algorithm is developed to select and process only the cuboids where masks are present. The flow chart to voxelize the training data is shown in Figure 10; it ensures that only the cuboids with tumor labels present are fed into the training model. The Python TensorFlow library is utilized for model building and training. The model.fit method accepts several formats of data, such as NumPy arrays, TensorFlow tensors, dictionary mappings, or tf.data dataset objects. In this study, NumPy arrays are used as inputs for the model training. For the CT scans, with 225 cuboids per patient, 450 NumPy arrays are generated and stored in the computer memory (i.e., 225 for the CT scan image matrix and 225 for the mask label matrix). Following this method, the 3D volumes of all patients are converted into smaller cuboids, and the cuboids where mask labels exist are selected. The generated NumPy arrays are saved as hdf5 files to avoid repeated processing in every new session.
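The paper performs this tiling with the Patchify library; the NumPy sketch below reproduces the same idea, assuming a sliding step of 32 voxels in x and y (which yields the 15 × 15 = 225 cuboid positions per patient mentioned above) and keeping only cuboids whose mask contains tumor labels:

```python
import numpy as np

def voxelize(volume, mask, patch=(64, 64, 96), step=32):
    """Split a 512x512x96 volume into 64x64x96 cuboids (overlapping,
    step 32 in x and y) and keep only those whose mask cuboid contains
    at least one tumor voxel, addressing issue (2) above."""
    kept = []
    px, py, pz = patch
    for x in range(0, volume.shape[0] - px + 1, step):
        for y in range(0, volume.shape[1] - py + 1, step):
            img_c = volume[x:x + px, y:y + py, :pz]
            msk_c = mask[x:x + px, y:y + py, :pz]
            if msk_c.any():  # skip empty cuboids without tumor labels
                kept.append((img_c, msk_c))
    return kept

vol = np.zeros((512, 512, 96), dtype=np.float32)
msk = np.zeros((512, 512, 96), dtype=np.uint8)
msk[100:110, 100:110, 40:45] = 1  # hypothetical small tumor label
cuboids = voxelize(vol, msk)      # only cuboids overlapping the label survive
```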

Set-up of 3D U-Net Architecture in the Pipeline
The 3D U-Net architecture is chosen for the proposed pipeline due to its favorable results with a small number of training samples. The proposed configuration of the 3D U-Net architecture is shown in Figure 11. The input images with dimensions of 64 × 64 × 96 are fed into the 3D U-Net model. In the encoder, each layer applies convolutions with filters of 3 × 3 × 3 kernel size, followed by a BN function and the activation function, and finally a max pooling operation with a 2 × 2 × 2 kernel size and a stride of 2 in each dimension. The activation function is either a sigmoid or a softmax function. The number of filters in the first encoder layer is set to 32, and the number of filters doubles at each subsequent layer until reaching the bottommost layer.

In the decoder, each layer first combines the spatial and feature information through up-sampling with a 2 × 2 × 2 kernel size and a stride of 2, together with concatenations of the segmentation feature maps from the corresponding layer in the encoder. Next, it conducts convolutions with a 3 × 3 × 3 kernel size, followed by a BN function and the activation function in each layer. In the decoder, the number of filters is halved at each successive layer until reaching the topmost layer, where the last convolution is conducted with a 1 × 1 × 1 kernel size.
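The filter-doubling and spatial-halving scheme can be traced with simple arithmetic; the sketch below assumes four encoder levels, an illustrative choice (the exact depth is given in Figure 11):

```python
def encoder_shapes(input_dims=(64, 64, 96), base_filters=32, levels=4):
    """Track (spatial size, filter count) down the contracting path:
    each 2x2x2 max pooling with stride 2 halves every spatial dimension,
    and the number of filters doubles at the next level."""
    shapes = []
    h, w, d = input_dims
    f = base_filters
    for _ in range(levels):
        shapes.append(((h, w, d), f))
        h, w, d = h // 2, w // 2, d // 2  # 2x2x2 max pooling, stride 2
        f *= 2                            # filters double per level
    return shapes

# Level 0: (64, 64, 96) with 32 filters; level 3: (8, 8, 12) with 256 filters.
shapes = encoder_shapes()
```

The decoder mirrors this table in reverse, halving the filter count while up-sampling the spatial dimensions.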
The Python Keras library is utilized to implement the 3D U-Net architecture in the pipeline. There are three primary building blocks for the U-Net architecture, namely the convolution block, encoder block, and decoder block. Within these building blocks, various parameters can be fine-tuned to improve the model fitting in accordance with the pre-processed data; this is known as hyper-parameter tuning. The code implementation of the 3D U-Net model is shown in Figure 12.


U-Net Performance Metrics
The Dice coefficient (i.e., F1-score) is the harmonic mean of the precision and recall values, which is a commonly used performance metric in machine learning applications. It compares the pixel-wise similarity between the prediction and the ground truth. The calculation is shown in Equation (2):

Dice = 2TP / (2TP + FP + FN),    (2)

where TP denotes true positives, FP false positives, and FN false negatives.
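As a small illustration of the metric on binary masks (the function name and toy arrays are for demonstration only):

```python
# Dice coefficient (F1-score) computed from TP, FP, FN counts, per Equation (2).
import numpy as np

def dice_coefficient(y_true, y_pred, eps=1e-7):
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    # eps guards against division by zero on empty masks.
    return (2 * tp + eps) / (2 * tp + fp + fn + eps)
```

For example, with ground truth `[1, 0, 1, 0]` and prediction `[1, 1, 0, 0]`, there is one TP, one FP, and one FN, giving a Dice score of 0.5.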

Hybrid U-Net Loss Function
Loss functions in machine learning help a model find an optimal solution; the goal is to minimize the difference between the predictions and the ground truth. Typical loss functions in image segmentation include Jaccard loss, Dice loss, Binary Cross-Entropy (BCE) loss, and Binary Focal loss.
BCE loss is effective in measuring the difference between the actual and predicted image regions. However, BCE loss assumes an approximately balanced class distribution. Its effectiveness may become inadequate when a severe class imbalance exists, for example, in the detection of small tumors in medical images.
Tversky focal loss can tackle the issue of class imbalance within the dataset [40]. As such, a customized loss function, a hybrid Tversky-cross-entropy loss, is designed for the proposed pipeline. It leverages the strong convergence capability of the BCE loss and the ability of the Tversky focal loss to handle class imbalance. The implementation of the hybrid Tversky-cross-entropy loss function is shown in Figure 13.
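Figure 13 gives the paper's Keras implementation; the sketch below expresses the same idea in plain NumPy. The Tversky parameters (alpha, beta), focal exponent gamma, and mixing weight w are assumed illustrative defaults, not the paper's tuned values.

```python
# NumPy sketch of a hybrid Tversky-cross-entropy loss: a weighted sum of
# BCE (strong convergence) and focal Tversky loss (class-imbalance handling).
import numpy as np

def tversky_index(y_true, y_pred, alpha=0.7, beta=0.3, eps=1e-7):
    # alpha weights false negatives, beta weights false positives.
    tp = np.sum(y_true * y_pred)
    fn = np.sum(y_true * (1 - y_pred))
    fp = np.sum((1 - y_true) * y_pred)
    return (tp + eps) / (tp + alpha * fn + beta * fp + eps)

def hybrid_tversky_ce_loss(y_true, y_pred, gamma=0.75, w=0.5, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    focal_tversky = (1.0 - tversky_index(y_true, y_pred)) ** gamma
    return w * bce + (1 - w) * focal_tversky
```

A perfect prediction drives both terms toward zero, while a completely wrong prediction is penalized heavily by the BCE term and by a Tversky index near zero.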

Hybrid Optimizer for the U-Net in the Pipeline
Selecting a suitable optimizer (i.e., optimization algorithm) is crucial for effective backpropagation of the weight updates in neural networks. The gradient descent optimizer uses a single step size (i.e., learning rate) for all neural nodes during backpropagation, while the Adaptive Moment Estimation (Adam) optimizer uses different step sizes for different neural nodes, resulting in quicker convergence. However, the quick convergence can be hindered when the gradient becomes flat, resulting in convergence at local minima. In this case, momentum can be incorporated to add inertia to the gradient descent process, which is known as Nesterov momentum. By combining Adam with Nesterov momentum, the Nadam optimization algorithm is able to achieve better optimization performance [41]. The Nadam optimizer is therefore leveraged for the 3D U-Net in the proposed pipeline.
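As an illustration of the update rule only (the pipeline itself uses the Keras `Nadam` optimizer), a minimal NumPy sketch of a Nadam-style step on a toy quadratic might look like the following; the hyper-parameters are typical defaults, not the paper's settings.

```python
# Nadam-style update (Adam with Nesterov momentum) minimizing f(x) = x^2.
import numpy as np

def nadam_minimize(grad_fn, x0, lr=0.05, beta1=0.9, beta2=0.999,
                   eps=1e-8, steps=1000):
    x, m, v = float(x0), 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g       # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g   # second-moment estimate
        m_hat = m / (1 - beta1 ** t)          # bias correction
        v_hat = v / (1 - beta2 ** t)
        # Nesterov look-ahead: mix the corrected momentum with the current
        # gradient before the Adam-style per-parameter scaling.
        m_nes = beta1 * m_hat + (1 - beta1) * g / (1 - beta1 ** t)
        x -= lr * m_nes / (np.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = x^2 (gradient 2x) starting from x = 5.
x_min = nadam_minimize(lambda x: 2 * x, 5.0)
```

The look-ahead term is what distinguishes Nadam from plain Adam: the update anticipates where the momentum is heading rather than reacting to it one step late.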

Experiment Results and Discussions
In the experiments, three different models of the 3D U-Net architecture in the pipeline are compared, and the experiment results of each model are evaluated. The configurations of the three models are presented in turn.

Model 1 Configuration and Experiment Results
The 3D volume of each patient is divided into 225 cuboids. For Model 1, all cuboids per patient were used in the model training. Due to memory constraints, only the data from a small number of patients could be included in the model training. Model 1 is trained using the data of thirty-five patients and validated using the data of six patients. The configurations of the pipeline Model 1 are shown in Table 5. The Dice coefficient results of Model 1 and its validation results are shown in Figure 14a, with 300 epochs on the x-axis. The loss metric results of Model 1 are shown in Figure 14b. For Model 1, the 2D prediction visualizations for the CT scans of two sample patients, patient 15 and patient 16, are displayed in Figures 15 and 16, respectively. It can be seen that the prediction results are close to the ground truth masks. For Model 1, the 3D volumetric prediction results for the scans of patients 15 and 16 are displayed in Figures 17 and 18, respectively. High similarity can be observed between the masks and the prediction results, which demonstrates the good prediction performance of Model 1.

Model 2 Configuration and Experiment Results
In Model 2, only cuboids containing the mask label information were included in the training, while the background cuboids were removed. The total number of cuboids in training is thus significantly reduced, allowing more patient data to fit into this model. Model 2 is trained using the data of 49 patients and validated using the data of 14 patients. The configurations of Model 2 are shown in Table 6. The Dice coefficient results of Model 2 and its validation results are shown in Figure 19a, with 300 epochs on the x-axis. The loss metric results of Model 2 are shown in Figure 19b. For Model 2, the 2D prediction visualizations for the CT scans of another two sample patients, patient 271 and patient 272, are displayed in Figures 20 and 21, respectively. It is observed that the predicted results are close to the ground truth masks, but there are several false predictions in the results.

Model 3 Configuration and Experiment Results
Model 3 was trained with a similar concept to Model 2. However, for every mask cuboid, an additional four background cuboids (two in front, two behind) were added to provide relevant context information for the model. Model 3 serves as a hybrid between Model 1 and Model 2: it utilizes fewer cuboids than Model 1, but more cuboids than Model 2. The configurations of Model 3 are shown in Table 7. The Dice coefficient results of Model 3 and its validation results are shown in Figure 24a, with 300 epochs on the x-axis. The loss metric results of Model 3 are shown in Figure 24b. For Model 3, the 2D prediction visualizations for the CT scans of two sample patients, patient 271 and patient 272, are displayed in Figures 25 and 26, respectively. A high similarity between the masks and predicted results is observed, with no false predictions. For Model 3, the 3D volumetric predictions for the CT scans of patients 271 and 272 are displayed in Figures 27 and 28, respectively. While some false prediction results can still be observed, this is an improvement over the results of Model 2.

Discussions and Comparisons of Three Models
For the three models presented in Sections 4.1-4.3, comparisons of the experiment results are shown in Table 8.
As seen from Table 8, even though Model 1 produces the highest Dice score at 91.44%, it utilizes the most memory and the longest training time compared to the others.
Model 2 uses the least memory. Its Dice score is 85.18%, with many false predictions due to the removal of all background cuboids from the training data. Model 3 provides a better balance between favorable predictions and memory usage: it is trained with cuboids containing true masks plus additional surrounding background cuboids, which provides more background information to the model while preventing the high memory usage of Model 1.
The prediction accuracies of Models 1-3 in the pipeline are compared with three pre-trained models. The Segmentation Models (SMs) library, a high-level Python library of neural networks for image segmentation, provides pre-trained models; three of them, namely InceptionV3 [42], MobileNetV2 [43], and InceptionResNetV2 [44], were explored. The prediction accuracies of these models are shown in Table 9. It is observed that Model 1 in the pipeline performs better than the three pre-trained models. Eight U-Net-based models [4,13,45-48] and one transformer-based model [49] reported in the literature were also compared with the proposed models in this paper. The Dice scores of these nine models, each reported on its own dataset, are shown in Table 9. The Dice score is up to 91.74% for the SegPC 2021 cell segmentation dataset [45]. With the ISIC 2018 skin lesion segmentation dataset, the Dice score is up to 86.94% [47]. The Dice score is up to 81.96% for the Synapse multi-organ CT segmentation dataset [49]. Using an MRI dataset for breast cancer, the Dice score is up to 89.4% [4]. A Dice score of about 94% was obtained for CT scan images using a 10-fold cross-validation method, which is higher than that of the proposed Model 1 at 91.44%; the 10-fold cross-validation approach is able to derive better segmentation accuracy.

Conclusions
An automatic volumetric prediction pipeline is developed for 3D CT scans of breast cancer in this paper. Various data pre-processing techniques are utilized to derive high-quality data to feed into the model. Due to the nature of CT scans, not all data can be used directly without pre-processing. The developed pipeline consists of several algorithms to select suitable CT scan slices for model training and testing. The pipeline divides the 3D volumes into smaller segments to reduce memory usage in the computation. It then normalizes the 3D volumes and analyzes the depths of the 3D CT scan slices.
Next, the data are fed into the 3D U-Net architecture in the pipeline. The 3D U-Net is designed to cater to the requirements of 3D image segmentation and volumetric prediction. Additional efforts are made to optimize memory usage and computation loads by reducing the number of training cuboids and fine-tuning parameters in the U-Net model. A hybrid Tversky-cross-entropy loss function is employed in the pipeline, and the Nadam optimization algorithm is utilized to achieve favorable optimizer performance.
Three U-Net models with different configurations are implemented in this study. The performance of each model is discussed, with a few samples of prediction visualization illustrated. Among the three models, Model 3, which includes partial contextual information around the masked areas, obtains the most balanced performance in accuracy and memory usage for the CT scan image segmentation tasks. The experiment results are also compared with three pre-trained models and nine models reported in the literature. Comparable Dice scores are achieved by the proposed pipeline.

Limitations and Future Works
There are two limitations to the model training performed in the current work: (1) only CT scan slices with a thickness of 3 mm are utilized, and (2) only patient data with all masks in the range of the 1st-96th slices are utilized. With these two limitations, the amount of training data is reduced from 347 patients to approximately 50 patients. The essence of a machine learning model is that more training data provide better generalization, which leads to better prediction on unseen data.
As future work, to tackle the first limitation, sampling can be performed on CT scans with thicknesses of both 3 mm and 5 mm to increase the amount of suitable input data. For the second limitation, a two-stage algorithm can be explored, in which volume segmentation on large cuboid dimensions is conducted first, followed by breaking down each large cuboid into a number of smaller cuboids on which volume segmentation is then conducted. Other types of loss functions and performance metrics can be explored as well, and k-fold cross-validation between models can be developed to further improve model accuracy.
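Such a k-fold scheme over patients could be sketched, for instance, as follows (patient ids, the value of k, and the random seed are all illustrative):

```python
# K-fold splits over patient ids using only NumPy: each patient appears in
# exactly one validation fold and in the training set of all other folds.
import numpy as np

def kfold_patient_splits(patient_ids, k=10, seed=0):
    ids = np.array(list(patient_ids))
    rng = np.random.default_rng(seed)
    rng.shuffle(ids)                      # randomize before splitting
    folds = np.array_split(ids, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val
```

Splitting at the patient level (rather than the cuboid level) avoids leaking cuboids of the same patient between training and validation sets.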

(1) An algorithm is developed to extract the number of slices in the axial plane for each patient.
(2) The model training in the pipeline requires tumor labels, i.e., masks, to be assigned to each slice. The slices without masks are filtered out from the training. Another algorithm is designed to identify which slices of each patient have associated masks; the indexes of such slices are recorded for each patient.
(3) To ensure consistent training and better prediction performance, the pipeline requires the same range of slice spatial locations along the z-axis for all patients. The patients are classified into three categories based on a threshold value of the slice indexes on the z-axis. Only the slices from the patient category whose slice indexes are all smaller than or equal to the threshold value are selected for further processing.
(4) The selected patient category is further classified into two groups based on the CT scan thickness.
(5) The values of the CT scan image matrices are normalized into the range of 0 to 1.
(6) The 3D volumes with large dimensions require a large amount of GPU resources in the model training. They are divided by an algorithm into a set of cuboids with smaller dimensions to reduce computation complexity.
(7) A 3D U-Net architecture is customized and trained for the image segmentation tasks with a hybrid loss function and a hybrid optimizer.
(8) The 2D and 3D prediction results are visualized.
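Step (3) above could be sketched as follows. The function name and data layout are hypothetical, and the paper classifies patients into three categories, whereas this sketch only separates selected from rejected patients.

```python
# Classify patients by whether all of their mask-bearing slice indexes
# fall at or below a threshold (96 in the paper's pipeline).
def classify_patients(mask_slices_by_patient, threshold=96):
    """mask_slices_by_patient: dict mapping patient id -> list of 1-based
    slice indexes that contain tumor masks."""
    selected, rejected = [], []
    for pid, idxs in mask_slices_by_patient.items():
        if idxs and max(idxs) <= threshold:
            selected.append(pid)   # all masks within the 1st-96th slices
        else:
            rejected.append(pid)   # masks beyond threshold, or no masks
    return selected, rejected
```

Patients with no mask slices at all are rejected here, consistent with step (2)'s filtering of slices without masks.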

Figure 2 .
Figure 2. Structure of the proposed prediction pipeline.

Figure 3 .
Figure 3. Flow chart to extract number of slices per patient in the axial plane.

Figure 4 .
Figure 4. Output of extracted number of slices per patient.


Figure 5 .
Figure 5. Flow chart to record the slice indexes with the presence of masks.

Figure 6 .
Figure 6. Sample outputs indicating the patient slices with the presence of masks.


Figure 7 .
Figure 7. Flow chart to classify patients according to the indexes of mask slice presence. With this algorithm, the CT scans of the 346 patients are classified into three categories, as shown in Table 2. In the pipeline, only the patients whose indexes of the slices with masks are all ≤ 96 are utilized. Thus, only 191 patients are available for the model training, validation, and testing.

The image intensity values are clipped and normalized as follows: specify v_min = −1000 and v_max = 5000; set v = v_min for v < −1000 and v = v_max for v ≥ 5000; then normalize v_norm = (v − v_min)/(v_max − v_min), so that all values fall in the range of 0 to 1. Utilizing the 3D volumes in the model training requires a large amount of GPU resources. Heavy computations are needed to process a CT scan image with the dimensions of 512 × 512 × 96, and double the computation memory is required if the label matrix is included. To resolve this issue, the "Patchify" library is utilized, which essentially divides a big volume into smaller partitions for training, as shown in Figure 9. A total of 225 cuboids with the dimensions 64 × 64 × 96 are segmented from the original 3D volume of one patient. The computation complexity can thus be reduced significantly, and the model training becomes more manageable. However, two issues arise from the above processing: (1) Not all patients have 96 slices; for example, the scan dimensions of a few patients are 512 × 512 × 87. (2) There are many empty cuboids without masks (i.e., tumor labels). If empty cuboids are fed into the training model, the loss function will become erratic, resulting in inaccurate model predictions.
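The clipping/normalization and the volume division can be illustrated with a scaled-down NumPy sketch. The paper uses the Patchify library on 512 × 512 × 96 volumes with 64 × 64 × 96 cuboids; the dimensions below are reduced proportionally so the example stays small while still yielding 225 overlapping cuboids.

```python
# Clip intensities to [v_min, v_max], normalize to [0, 1], then divide
# the 3D volume into overlapping cuboids via a strided sliding window.
import numpy as np

def clip_and_normalize(vol, vmin=-1000.0, vmax=5000.0):
    vol = np.clip(vol, vmin, vmax)          # clamp out-of-range values
    return (vol - vmin) / (vmax - vmin)     # scale into [0, 1]

def extract_cuboids(vol, size, step):
    # sliding_window_view creates a zero-copy view of all windows;
    # the stride slicing picks every step-th window along each axis.
    windows = np.lib.stride_tricks.sliding_window_view(vol, size)
    windows = windows[::step[0], ::step[1], ::step[2]]
    return windows.reshape((-1,) + tuple(size))

# Scaled-down demo: a 128 x 128 x 24 volume split into 16 x 16 x 24
# cuboids with step 8 gives 15 x 15 = 225 cuboids, mirroring the paper.
vol = clip_and_normalize(np.random.uniform(-2000, 6000, (128, 128, 24)))
cuboids = extract_cuboids(vol, (16, 16, 24), (8, 8, 24))
```

With the paper's actual dimensions, a 64 × 64 cuboid footprint stepped by 32 across a 512 × 512 plane gives (512 − 64)/32 + 1 = 15 positions per axis, i.e., 15 × 15 = 225 cuboids per patient.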

Figure 9 .
Figure 9. Segmented 3D volume using the Patchify library. The segmented cuboids are shown in red boxes. To resolve issue (1), only patients having 96 slices in the axial plane are chosen. For issue (2), an algorithm is developed to only select and process the cuboids where the masks are present. The flow chart to voxelize the training data is shown in Figure 10. It ensures that only the cuboids with tumor labels present are fed into the training model. The Python TensorFlow library is utilized for model building and training. The model.fit method accepts several formats of data, such as NumPy arrays, TensorFlow tensors, dictionary mappings, or tf.data dataset objects. In this study, NumPy arrays are used as inputs for the model training. For the CT scans, with 225 cuboids per patient, 450 NumPy arrays are generated and stored in the computer memory (i.e., 225 for the CT scan image matrix, and 225 for the mask label matrix). Following this method, the 3D volumes of all patients are converted into smaller cuboids, and the cuboids where the mask labels exist are selected. The generated NumPy arrays are saved as hdf5 files to avoid repeated processing in every new session.
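The mask-based cuboid selection described above can be sketched as follows (array names are illustrative; the paper's exact flow is given in Figure 10):

```python
# Keep only the image/mask cuboid pairs whose mask contains at least one
# labeled (nonzero) voxel, discarding empty background cuboids.
import numpy as np

def select_mask_cuboids(image_cuboids, mask_cuboids):
    # Boolean index: True where the mask cuboid has any tumor label.
    keep = np.array([m.any() for m in mask_cuboids])
    return image_cuboids[keep], mask_cuboids[keep]
```

This is the filtering used for Model 2; Model 3's variant would additionally retain a few neighboring background cuboids around each kept mask cuboid for context.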

Figure 11 .
Figure 11. Three-dimensional U-Net architecture developed in the proposed pipeline.

Figure 12 .
Figure 12. The code implementation of the 3D U-Net in the pipeline.


Figure 13 .
Figure 13. Implementation of the hybrid Tversky-cross-entropy loss function.

Figure 14 .
Figure 14. Performance and loss function results of Model 1. (a) The Dice coefficients in the training (blue) and validation (orange). (b) The loss metrics in the training (blue) and validation (orange).



Figure 15 .
Figure 15. Two-dimensional prediction result for patient 15 in Model 1. From left to right: scans, masks, and predictions, respectively.

Figure 16 .
Figure 16. Two-dimensional prediction result for patient 16 in Model 1. From left to right: scans, masks, and predictions, respectively.


Figure 19 .
Figure 19. Performance and loss function results of Model 2. (a) The Dice coefficients in the training (blue) and validation (orange). (b) The loss metrics in the training (blue) and validation (orange).


Figure 20 .
Figure 20. Prediction results for patient 271 of Model 2. From left to right: scans, masks, and prediction results.

Figure 21 .
Figure 21. Prediction results for patient 272 of Model 2. From left to right: scans, masks, and prediction results. For Model 2, the 3D volumetric predictions for the CT scans of patients 271 and 272 are displayed in Figures 22 and 23, respectively. A large number of false predictions are observed in the results. This could be caused by the total removal of the background cuboids in the model training.


Figure 24 .
Figure 24. Performance and loss function results of Model 3. (a) The Dice coefficients in the training (blue) and validation (orange). (b) The loss metrics in the training (blue) and validation (orange).


Figure 25 .
Figure 25. Prediction results for patient 271 of Model 3. From left to right: scans, masks, and prediction results.

Figure 26 .
Figure 26. Prediction results for patient 272 of Model 3. From left to right: scans, masks, and prediction results.


Table 2 .
Obtained distributions of three categories of patients.

Table 3 .
Distribution of patient scan thickness.

Table 4 .
The 127 patients with 3 mm CT scan thickness.


Table 5 .
Configurations of the pipeline Model 1.

Table 6 .
Configurations of Model 2 in the experiment.


Table 7 .
Configurations of Model 3 in the experiment.


Table 8 .
Brief comparisons of three models in the experiment.

Table 9 .
Comparisons of three models in the experiment with three pre-trained models and nine models reported in the literature.