In-Series U-Net Network to 3D Tumor Image Reconstruction for Liver Hepatocellular Carcinoma Recognition

Cancer is one of the most common diseases worldwide. Quantitative biomarkers extracted from standard-of-care computed tomography (CT) scans can support a robust clinical decision tool for the diagnosis of hepatocellular carcinoma (HCC). Current clinical methods typically demand a high expenditure of time and resources. To improve the current clinical diagnosis and therapeutic procedure, this paper proposes a deep learning-based approach, called Successive Encoder-Decoder (SED), to assist in the automatic interpretation of liver lesion/tumor segmentation from CT images. The SED framework consists of two different encoder-decoder networks connected in series. The first network removes unwanted voxels and organs and extracts the liver location from CT images. The second network uses the result of the first network to further segment the lesions. For practical purposes, the predicted lesions on individual CT slices were extracted and reconstructed as 3D images. The experiments conducted on 4300 CT images drawn from the LiTS dataset and a hospital dataset demonstrate that the proposed SED method achieved Dice scores of 0.92 for liver segmentation and 0.75 for tumor prediction.


Introduction
According to the health data and statistics of the World Health Organization (WHO), hepatocellular carcinoma (HCC) is one of the most common cancers in the world and causes a large number of deaths every year [1]. The detection of lesions, as well as the estimation of their size and number, is still widely performed by visual inspection of computed tomography (CT) [2] and magnetic resonance (MR) images in clinical examination, which can be subjective. The high variability of tumors forces reliance on the operator's judgment, making the diagnosis susceptible to misinterpretation. In radiomics studies [3], all observations underline the need for automatic and reliable tools dedicated to tumor segmentation in order to finely characterize liver cancer. However, automatic segmentation [4][5][6][7][8] of liver tumors is challenging not only due to the highly variable shape of liver tumors but also because of the similar intensity values of the nearby liver parenchyma.
Image segmentation is the process of dividing a digital image into multiple segments. It is a classic problem in image processing and computer vision and is widely used in medical imaging research. In the early years, many algorithms were proposed to fulfill this task. More recently, few-shot learning [52] trains a model using only a small number of samples, and this kind of training achieves excellent results compared with other training methods when very little data is available.
ED has become the most popular network structure for segmentation over the past few years. Most works adopt a single ED for the segmentation task for two reasons: (a) efficient implementation and (b) end-to-end training. However, using a single ED for accurate segmentation usually requires a large amount of training data, and the performance is sensitive to the network architecture. In the case of CT imaging, the images are monochromatic and the pixel values of different tissues and organs are highly similar. This property further increases the difficulty for the network to identify lesions in the absence of sufficient training data. Under such circumstances, a stratified strategy that partitions the tumor prediction task into multiple fragments, with a single network dealing with each fragment, is a viable solution. Theoretically, this approach can reduce the difficulty of model learning and further improve the quality of segmentation.
Based on this assumption, we propose a two-stage segmentation approach, called Successive Encoder-Decoder (SED), for automatic liver tumor segmentation from CT images. The SED consists of two independent encoder-decoders, SED-1 and SED-2, which perform different segmentation tasks. The purpose of SED-1 is to localize the liver, while the purpose of SED-2 is to predict the liver lesions based on the region of interest (ROI) obtained by SED-1. More specifically, SED-1 excludes the tissues other than the liver, while SED-2 focuses on the preserved liver region to precisely extract the tumor location. SED-1 can be regarded as a pre-processing step for SED-2, which ensures that SED-2 does not segment non-liver tissues. In terms of network composition, two different EDs were adopted for SED-1 and SED-2: U-Net serves as the main architecture of SED-1 to localize the liver. Tumor segmentation, on the other hand, is a more challenging task due to the irregular distribution of tumors within the liver. Thus, dense U-Net [34] was used in this project as the main network of SED-2 to achieve more efficient contextual information extraction. Regarding the training of SED, SED-1 and SED-2 must be trained independently using CT images with liver ground truths and tumor ground truths, respectively. For this purpose, we built a liver CT dataset consisting of LiTS images [53] with a total of 4300 CT images. The experiments conducted on this dataset demonstrate the liver lesion segmentation performance in both quantitative and qualitative analyses. The SED segmentation results of adjacent slices are also visualized as 3D images, which can assist surgeons in rapidly identifying the location, shape, and size of tumors, further improving the quality of surgical treatment.

The Overview of SED
As shown in Figure 1, the SED consists of two stages: Liver localization (Stage 1) and tumor extraction (Stage 2). Stage 1 uses SED-1 to exclude unwanted voxels and organs and produces a liver mask that indicates the location of the liver in the CT image. Once the liver mask is obtained, the original CT image is multiplied with the mask to produce the liver image, which is then used as the input for Stage 2. Stage 2 then uses SED-2 to extract the lesion (tumor) from the liver image. The clinical images used in this study were collected under institutional review board approval (IRB code: 201801581B0).
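The two-stage flow can be sketched as below. Here `sed1` and `sed2` are hypothetical stand-ins for the trained networks, and tiny thresholded arrays replace real CT slices; this is a minimal sketch of the pipeline, not the actual implementation.

```python
import numpy as np

def run_sed_pipeline(ct_image, sed1, sed2, threshold=0.5):
    """Two-stage SED inference sketch: localize the liver, then extract the tumor.

    sed1 / sed2 are placeholders for the trained encoder-decoders; each is
    assumed to map an image to a per-pixel probability map of the same shape.
    """
    # Stage 1: predict a liver probability map and binarize it into a mask.
    liver_mask = (sed1(ct_image) >= threshold).astype(ct_image.dtype)
    # Multiply the original CT with the mask so only liver voxels remain.
    liver_image = ct_image * liver_mask
    # Stage 2: segment the tumor inside the preserved liver region.
    tumor_mask = (sed2(liver_image) >= threshold).astype(np.uint8)
    return liver_mask, tumor_mask

# Toy stand-ins: "liver" = bright half of the image, "tumor" = a very bright spot.
ct = np.zeros((4, 4))
ct[2:, :] = 0.6
ct[3, 3] = 0.9
liver_net = lambda img: (img > 0.5).astype(float)   # fake SED-1
tumor_net = lambda img: (img > 0.8).astype(float)   # fake SED-2
liver, tumor = run_sed_pipeline(ct, liver_net, tumor_net)
```

The key point is the multiplication step between the stages: SED-2 never sees voxels outside the predicted liver mask.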

SED-1: Liver Localization Network
U-Net was adopted as the main architecture of SED-1, which is shown in Figure 2. The upper part of SED-1 is the encoder network, responsible for feature extraction. The encoding process of U-Net consists of five scaling levels. Each level performs two convolutions and one pooling operation at a specific resolution. When passing through a pooling layer, a down-sampling operation reduces the image size. To preserve more feature information, the number of feature maps output by the convolution layers is doubled after each down-sampling operation.

On the other hand, the lower part is the decoder network, which performs deconvolution and up-pooling. The purpose of the decoder is to restore the high-level feature maps obtained by the encoder to an output image with the same resolution as the input. It is worth mentioning that, after each up-sampling operation, the feature maps of the same level in the encoder are concatenated to the up-sampled feature maps through skip connections (gray dashed lines). This design ensures that the restored feature maps contain more low-level features, thus improving the final segmentation result. Table 1 shows the definition of SED-1.
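The encoder's bookkeeping (resolution halved, channel count doubled at each level) can be tabulated with a short helper. The starting channel count of 64 is an assumption borrowed from the classic U-Net; Table 1 holds the actual SED-1 definition.

```python
def unet_encoder_shapes(height=256, width=256, base_channels=64, levels=5):
    """Feature-map (H, W, C) after the convolutions at each encoder level.

    Each level doubles the channel count of the previous one, and the
    2x2 pooling between levels halves the spatial resolution.
    """
    shapes = []
    h, w, c = height, width, base_channels
    for _ in range(levels):
        shapes.append((h, w, c))          # shape after this level's convolutions
        h, w, c = h // 2, w // 2, c * 2   # pooling halves H and W, channels double
    return shapes

shapes = unet_encoder_shapes()
# Five levels for a 256 x 256 input, from (256, 256, 64) down to (16, 16, 1024).
```

The decoder mirrors this table in reverse, which is why the skip connections can concatenate feature maps of matching resolution.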

SED-2: Tumor Extraction Network
Relative to the liver area, tumors are tiny structures that are difficult to detect due to the variability of their appearance, fuzzy boundaries, uneven densities, and irregular shapes and sizes. In this case, a more powerful encoder-decoder is required to localize the tumor. Therefore, SED-2 adopted FC-DenseNet [39] (also called Dense U-Net) as the basic architecture for accurate tumor extraction. FC-DenseNet is an improved version of ED networks based on U-Net. The architecture is shown in Figure 3. As can be seen, the overall architecture consists of three types of modules: Dense block (DB), transition down (TD), and transition up (TU). DB is a core module developed in [40], which utilizes dense connections between all layers so that each layer can use the feature maps of all previous layers. This design promotes feature propagation, enables efficient feature re-use, and mitigates the vanishing gradient problem. Since each layer contains the output information of all previous layers, fewer feature maps need to be computed, which reduces the computational complexity. As a result, the use of dense blocks can improve the performance of feature extraction. TD is a down-sampling operation used during the encoding process, while TU is an up-sampling operation performed by transposed convolution during the decoding process. During decoding, FC-DenseNet performs feature concatenation through skip connections (gray dashed lines) to ensure that the restored feature maps have more low-level features. Table 2 shows the architecture of SED-2.
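The dense-connectivity channel arithmetic can be illustrated as follows; the input width and growth rate below are illustrative values chosen for the example, not those of Table 2.

```python
def dense_block_channels(in_channels, num_layers, growth_rate):
    """Channel count seen by each layer of a dense block.

    Every layer receives the concatenation of the block input and all
    previous layers' outputs, and itself emits `growth_rate` feature maps
    that are concatenated onto the running stack for later layers.
    """
    per_layer_inputs = []
    c = in_channels
    for _ in range(num_layers):
        per_layer_inputs.append(c)  # channels this layer sees as input
        c += growth_rate            # its output is appended for all later layers
    return per_layer_inputs, c      # per-layer input widths, final stack width

# Illustrative block: 48 input channels, 4 layers, growth rate 12.
per_layer, total = dense_block_channels(in_channels=48, num_layers=4, growth_rate=12)
```

Note that each layer only has to *compute* `growth_rate` new feature maps; the rest of its input is reused from earlier layers, which is the source of the efficiency claimed above.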

Loss Function
Loss function is used to evaluate the difference between the output and the target (ground truth) in the training of a deep neural network. Choosing an appropriate loss function is essential to the effectiveness of the model. In the training of SED, Dice Loss [23] was adopted as the loss function to optimize the network parameters of both SED-1 and SED-2, instead of cross entropy. Dice Loss is calculated from the Dice Coefficient, which measures the similarity between two samples. The Dice Coefficient is expressed by

Dice(A, B) = 2|A ∩ B| / (|A| + |B|),    (1)

where A denotes the network output, whose value at each pixel is the probability of belonging to the target, and B denotes the binary mask (ground truth). Here |A ∩ B| is the number of intersecting pixels of A and B, and |A| and |B| are the total numbers of pixels of A and B, respectively, so that Dice Loss = 1 − Dice(A, B). The range of the Dice Coefficient is [0, 1]; the prediction result is more similar to the target as the value approaches 1.
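A minimal NumPy version of the Dice Coefficient and Dice Loss described above; the small epsilon added for numerical stability is an implementation detail not stated in the text.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for a probability map and a binary mask."""
    intersection = np.sum(pred * target)
    return (2.0 * intersection) / (np.sum(pred) + np.sum(target) + eps)

def dice_loss(pred, target):
    """Dice Loss = 1 - Dice; this is the quantity minimized during training."""
    return 1.0 - dice_coefficient(pred, target)

# Tiny example: prediction covers two pixels, the target covers one of them.
pred = np.array([[1.0, 1.0], [0.0, 0.0]])
target = np.array([[1.0, 0.0], [0.0, 0.0]])
score = dice_coefficient(pred, target)  # 2*1 / (2+1), about 0.667
```

With soft (probabilistic) predictions the same formula stays differentiable, which is why it can serve directly as a training loss.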

Liver CT Dataset
The CT dataset in this project builds on the LiTS dataset [53], a publicly accessible benchmark dataset for the tumor segmentation challenge. It contains 8000 CT images provided by clinical institutions around the world. From the LiTS dataset, 3900 images with 512 × 512 resolution containing the ground truth of liver and tumor locations were selected and combined with 400 images from Kaohsiung Chang Gung Memorial Hospital (KCGMH) to build the experimental CT dataset. The ground-truth maps of the KCGMH images were annotated and plotted by KCGMH radiologists: a total of three physicians participated in the study, discussing each lesion section before the mask file was created. The 4300 images were randomly split into 4000 training images and 300 test images. Following [10], the intensity values of all images were truncated to the range of [−100, 200] HU to remove irrelevant details and enhance contrast.
To fit the size of the model and avoid the GPU memory limitation, all images were downscaled to 256 × 256 resolution to improve computational efficiency. In fact, the 256 × 256 resolution is sufficient to clearly illustrate the segmentation results of the liver and tumor.
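The preprocessing above (HU truncation, then downscaling from 512 × 512 to 256 × 256) might look as follows in NumPy. The rescaling to [0, 1] and the 2×2 block-averaging downscale are assumptions for illustration; the text does not specify how the normalization and resizing were implemented.

```python
import numpy as np

def preprocess_ct(slice_hu, hu_min=-100, hu_max=200):
    """Truncate intensities to [-100, 200] HU and rescale to [0, 1].

    The [0, 1] normalization is an assumed network-input convention;
    the paper only specifies the HU truncation window.
    """
    clipped = np.clip(slice_hu, hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)

def downscale_half(img):
    """Naive 2x downscale by 2x2 block averaging (e.g. 512x512 -> 256x256)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# Air (-500 HU) and bone (1000 HU) saturate at the window edges.
raw = np.array([[-500.0, -100.0], [50.0, 1000.0]])
norm = preprocess_ct(raw)
```

Clipping to a narrow soft-tissue window before normalizing is what gives the liver and lesions most of the dynamic range of the network input.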

Training Method, Environment, and Parameter Setting
Each training image consists of a raw CT, a liver mask, and a tumor mask. For training SED-1, the input and output are paired data in the form [raw CT, liver mask]. For training SED-2, the input and output are paired data in the form [raw CT × liver mask, tumor mask], where the raw CT is multiplied element-wise by the liver mask. The two networks were trained independently.
For SED-1, the number of epochs and the batch size were set to 50 and 16, respectively. The initial learning rate was set to 10^-4 and then reduced by 10% every two epochs. For SED-2, the number of epochs and the batch size were 100 and 4, respectively. The initial learning rate was also set to 10^-4 and then multiplied by e^-0.9 every two epochs (exponential decay). Both SED-1 and SED-2 adopted the ADAM optimizer for updating the network parameters. Before training, 20% of the training images were randomly selected as the validation set; there is only a slight difference between the testing and validation datasets. After each epoch, the model was validated once, and the model with the lowest validation loss throughout the training process was retained and designated as the final model for testing.
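The two learning-rate schedules can be sketched as plain functions of the epoch index. Interpreting "every two epochs" as one discrete decay step per two completed epochs is an assumption about the exact step boundaries.

```python
import math

def sed1_lr(epoch, base_lr=1e-4):
    """SED-1 schedule: reduce the rate by 10% every two epochs."""
    return base_lr * (0.9 ** (epoch // 2))

def sed2_lr(epoch, base_lr=1e-4):
    """SED-2 schedule: multiply by e^-0.9 every two epochs (exponential decay)."""
    return base_lr * math.exp(-0.9 * (epoch // 2))

lr1 = sed1_lr(4)  # after two decay steps: 1e-4 * 0.9^2
lr2 = sed2_lr(2)  # after one decay step:  1e-4 * e^-0.9
```

Note how much more aggressive the SED-2 schedule is: e^-0.9 ≈ 0.41 per step versus 0.9 for SED-1, matching the longer, more careful 100-epoch tumor training.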

Evaluation Metrics
The performance of the proposed SED was evaluated using the following metrics: Accuracy (ACC), Intersection over Union (IoU), Dice Similarity Coefficient (DSC), and Area Under the ROC Curve (AUC). These metrics can be computed from four measures: TP (true positive), TN (true negative), FP (false positive), and FN (false negative). The accuracy is expressed by

ACC = (TP + TN) / (TP + TN + FP + FN).

The IoU and DSC are defined by

IoU = TP / (TP + FP + FN),    DSC = 2TP / (2TP + FP + FN).

Finally, the AUC is obtained from a ROC curve. For each test result, a ROC curve can be created by plotting the true positive rate TP / (TP + FN) against the false positive rate FP / (FP + TN) at different threshold settings. In the following section, the averaged values of the four metrics over the 300 test images are reported.
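The counting-based metrics above reduce to a few one-line functions; the confusion counts in the example are illustrative numbers, not results from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN): fraction of correct pixels."""
    return (tp + tn) / (tp + tn + fp + fn)

def iou(tp, fp, fn):
    """IoU = TP / (TP + FP + FN): overlap over union of prediction and target."""
    return tp / (tp + fp + fn)

def dsc(tp, fp, fn):
    """DSC = 2TP / (2TP + FP + FN): the Dice similarity coefficient."""
    return 2 * tp / (2 * tp + fp + fn)

def tpr(tp, fn):
    """True positive rate, the y-axis of the ROC curve."""
    return tp / (tp + fn)

def fpr(fp, tn):
    """False positive rate, the x-axis of the ROC curve."""
    return fp / (fp + tn)

# Illustrative confusion counts for one test image.
tp_, tn_, fp_, fn_ = 80, 900, 10, 10
acc = accuracy(tp_, tn_, fp_, fn_)
```

The example also shows why ACC is a weak segmentation metric: the large TN count dominates it, while IoU and DSC ignore TN entirely.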

Tumor Segmentation Results
This section conducts both qualitative and quantitative studies of the proposed SED. Three state-of-the-art methods, U-Net [19], C-UNet [27], and ResNet [29], were selected for comparison of tumor segmentation capability. Table 3 tabulates the tumor extraction results of all methods in ACC, IoU, DSC, and AUC. It can be seen that the proposed SED has the best overall performance in all metrics. In ACC, all methods achieved values above 0.9, because both positive (tumor) and negative (background) samples are counted in ACC; thus, ACC is not an ideal metric for segmentation, where the positive and negative samples are unbalanced. On the contrary, IoU, DSC, and AUC are better metrics for the segmentation task. The results show that SED significantly outperformed U-Net, ResNet, and C-UNet, reaching 0.87, 0.75, and 0.95 in IoU, DSC, and AUC, respectively. The ROC curves for all methods are shown in Figure 4, plotted from multiple (TPR, FPR) pairs calculated at different threshold values. Figure 5 shows both the liver localization and tumor segmentation results of SED on eight selected CT samples. It can be seen that SED provided satisfactory results for both tasks. Comparing Figure 5d with Figure 5b, SED-1 preserved almost the entire liver region, except for small regions where the boundaries were difficult to define. According to the results, the DSC value of SED-1 was 0.92, which implies that the liver segmentation task can be well handled by a single U-Net. For tumor extraction, SED-2 also achieved ideal results. Comparing Figure 5g with Figure 5e, most of the tumor regions were successfully captured by SED-2. Although the results indicate that the proposed SED still generates a few tiny FP and FN parts, whose contours are ambiguous, low-contrast, and not clearly visible, SED still provides remarkable tumor prediction capability.
For comparison, Figure 6 shows the tumor segmentation results of U-Net, C-UNet, ResNet, and the proposed SED. It is apparent that SED outperformed the other methods. SED utilizes a two-stage stratified strategy to segment the liver and tumor successively, whereas U-Net and ResNet are one-stage end-to-end approaches, in which the rules and features of the tumor cannot be effectively learned with limited training samples. Further, it should be noted that C-UNet also adopts a stratified strategy to segment the liver and tumor separately. However, FPs and FNs can be observed in Figure 6e, which indicates that the corresponding tumor segmentation results are not as good as those of SED. Conversely, the segmentation result of SED contains fewer segmentation errors, indicating that the primary parts of the tumor were predicted more accurately.
Figures 7 and 8 show the comparison of IoU and accuracy for the four cases among U-Net, ResNet, C-UNet, and the proposed SED. It can be seen that the proposed SED has higher IoU values and accuracy for Cases 3 and 4. Further, the recognition rate of the SED model is excellent for tumor segmentation of extremely tiny particles (Cases 1 and 2) and irregular shapes (Case 3). In particular, the SED model has an excellent recognition rate for the multi-sided irregular shape in Case 3, compared to U-Net, where the TP portion is oversized, and ResNet, where the TP is undersized.
In the evaluation of the generalization capability of the model, Segnet [6] was selected as a control group: it is a well-known and widely used segmentation network designed to be efficient in terms of both memory and computational time, it is often used for scene and larger-scale recognition tasks, and its architecture has many similarities to U-Net. Segnet and U-Net, which share the same encoder-decoder framework, were added to evaluate whether the two models could effectively segment the liver tumor region. The Dice Score was used to compare the two models with the Expanded Densely U-net (EDU), and the Dice Coefficients calculated by Equation (1) are shown in Figure 9. For each model, the training Dice Score fluctuated with the training data and was observed over approximately 20 epochs throughout the training process. The Dice Score results show a similar trend for the EDU and U-Net networks, which indicates that both could predict the tumor region accurately.

The proposed SED was also compared with five selected methods proposed in the ISBI 2017 challenge [53], as shown in Table 4. In this comparison, only the LiTS dataset was used, under the same challenge rules. The proposed SED achieved 0.75 in DSC, while the other models achieved between 0.64 and 0.7 according to the challenge report, which validates the SED.
In the field of image segmentation, prediction is usually treated as a per-pixel classification. However, the model does not directly output a 0-or-1 classification; it outputs a probability map. Thus, a Sigmoid is added at the end of the model for each category, and the AUC can be used to analyze such probability maps. Further, the AUC serves as an indicator of the overall performance of the model: the more convex the ROC curve is towards the (0, 1) point, the better the overall performance. It can be roughly interpreted as follows: 1. AUC = 0.5 (no discrimination); 2. 0.7 ≤ AUC < 0.8 (acceptable discrimination); 3. 0.8 ≤ AUC < 0.9 (excellent discrimination); 4. 0.9 ≤ AUC ≤ 1.0 (outstanding discrimination). The comparison of the generalization capabilities of the different models is shown in Table 5. As can be seen, the performance of Segnet in ACC, IoU, DSC, and AUC is relatively poor, while the proposed SED shows a significant improvement in IoU and DSC, indicating its superior generalization capability. Training performance using a mix of LiTS and KCGMH data with a total of 4000 images is shown in Figure 10. Excellent learning results (curve fitting) were obtained after training for 100 epochs with the 4000 randomly mixed LiTS and KCGMH images. Among the 300 randomly mixed test images, the results showed an IoU of 0.70 and an ACC of 0.88 after eliminating the data of 5 patients with abnormal recognition due to burned liver or abnormal edema.
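A threshold-sweep sketch of how an AUC can be computed from a sigmoid probability map. A perfectly separable toy example is used here; in practice a library routine would normally be used instead of this hand-rolled integration.

```python
import numpy as np

def roc_auc(probs, labels, num_thresholds=101):
    """Sweep thresholds over a probability map, collect (FPR, TPR) pairs,
    and integrate the ROC curve with the trapezoidal rule."""
    fprs, tprs = [], []
    for t in np.linspace(0.0, 1.0, num_thresholds):
        pred = probs >= t
        tp = np.sum(pred & (labels == 1))
        fn = np.sum(~pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        tn = np.sum(~pred & (labels == 0))
        tprs.append(tp / (tp + fn))
        fprs.append(fp / (fp + tn))
    # Sort the curve points left to right, then integrate trapezoids.
    pts = sorted(zip(fprs, tprs))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts[:-1], pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return float(area)

# Perfectly separable probabilities should give AUC = 1.0.
probs = np.array([0.9, 0.8, 0.2, 0.1])
labels = np.array([1, 1, 0, 0])
auc = roc_auc(probs, labels)
```

This makes the interpretation above concrete: the sweep traces the curve from (0, 0) to (1, 1), and a curve hugging the (0, 1) corner maximizes the integrated area.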

3D Visualization
Reconstruction of a three-dimensional (3D) tumor contour from two-dimensional (2D) segmentation results can serve as an additional tool to aid clinical practice. With the advancement of medical imaging equipment, the slice spacing and pixel size have gradually decreased, enabling improved 2D contour stitching. In this study, Photoshop CC 2018 was used to achieve the 3D volume reconstruction of the liver tumor. In the 3D reconstruction process, the 3D contours are composed of the curved surfaces formed by adjacent 2D tumor segmentation maps. Figure 11 shows 3D reconstruction results of liver tumors from the SED-generated tumor segmentation maps. Each row presents an individual case; for each case, 15 slices were used with an interval of 1 mm. The results show that the 3D visualization reconstructed from the SED segmentation maps can clearly represent the size, shape, and relative position of the tumor regions, from which the volume of the tumors can be estimated. Further, the translucent 3D images facilitate the physicians' interpretation. Figure 12 shows the reconstructed transparent 3D view of the liver region. The 3D image created by this method can be adjusted in translucency and color for better clinical contrast. The image can be made closer to reality, and the orientation, position, and size of the images can be freely adjusted. Further, the distance and location in the 3D view can be rendered more realistically because the original DICOM images contain the thickness information of each slice. Figure 13 illustrates the reconstructed transparent 3D view of the liver tumors. The 3D image reconstructed by the proposed method can help doctors rapidly capture the size and dimensions of the tumors. In addition, the reconstructed 3D tumor image can be manually rotated into any direction, position, and orientation for various perspectives. With this method, the 3D view of the tumors can be presented more concretely and specifically.
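The volume estimation mentioned above amounts to stacking the adjacent 2D masks into a voxel grid and counting voxels. The in-plane pixel spacing below is an assumed value; in practice both it and the slice thickness are read from the DICOM header, as noted in the text.

```python
import numpy as np

def stack_slices(masks, pixel_spacing_mm=1.0, slice_interval_mm=1.0):
    """Stack adjacent 2D tumor masks into a 3D volume and estimate its size.

    pixel_spacing_mm is an assumed in-plane spacing for this sketch; real
    DICOM headers carry the actual pixel spacing and slice thickness.
    """
    volume = np.stack(masks, axis=0)             # (slices, H, W) binary grid
    voxel_mm3 = pixel_spacing_mm ** 2 * slice_interval_mm
    tumor_mm3 = float(volume.sum()) * voxel_mm3  # voxel count * voxel volume
    return volume, tumor_mm3

# 15 slices at 1 mm intervals, as in Figure 11; tiny 4x4 toy masks here.
masks = [np.zeros((4, 4), dtype=np.uint8) for _ in range(15)]
for m in masks[5:10]:
    m[1:3, 1:3] = 1                              # a 2x2 tumor on 5 slices
volume, tumor_mm3 = stack_slices(masks)
```

The surface rendering itself (the curved surfaces between adjacent contours) is a separate step, but the voxel grid produced here is the input any such rendering or volume estimate works from.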


Conclusions
With the advent of the era of artificial intelligence, computer-based automated medicine as a diagnostic aid will be one of the future trends. An appropriate algorithm running on a computer can assist surgeons in quickly identifying lesion areas, reduce labor costs, and further improve medical services. Following this trend, many deep learning-based segmentation algorithms have been proposed for medical image processing. However, most of the existing methods adopt only a single encoder-decoder (ED) as the main network architecture, which limits performance. In this paper, a two-stage liver tumor segmentation framework, called SED, was proposed for the automatic prediction of hepatocellular carcinoma from CT imaging. SED consists of two independent and successive encoder-decoders: the first localizes the liver region through a classical ED network, while the second performs accurate tumor segmentation through a stronger ED network. The results showed that the proposed two-stage SED method provided satisfactory liver localization and tumor segmentation performance in both quantitative and qualitative analyses, with liver segmentation and tumor prediction reaching Dice scores of 0.92 and 0.75, respectively. To validate the segmentation performance of the proposed SED, experiments were conducted on 4300 liver CT images composed of the LiTS and KCGMH datasets. Among the 300 randomly mixed test images, the results showed an IoU of 0.70 and an ACC of 0.88 after eliminating the data with abnormal recognition due to burned liver or abnormal edema. The 3D visualization images generated from the 2D segmentation results of SED can indeed provide more realistic estimates of tumor shape and location.