Detection of Liver Tumour Using Deep Learning Based Segmentation with Coot Extreme Learning Model

Systems for medical analytics and decision making that make use of multimodal intelligence are of critical importance in the field of healthcare. Liver cancer is one of the most frequent types of cancer, and early identification is crucial for effective therapy. Liver tumours share the same brightness and contrast characteristics as their surrounding tissues, and their irregular shapes, which vary with cancer stage and tumour type, are a serious concern. Tumour segmentation in the liver has two main phases: identifying the liver, and then segmenting the tumour itself. Conventional interactive segmentation approaches, however, require a large number of user interactions, whereas recently proposed CNN-based interactive segmentation approaches are constrained by poor performance on liver tumour images. This research provides a unique Deep Learning-based Segmentation with Coot Extreme Learning Model approach that achieves highly efficient results and detects tumours in publicly available liver image data. Specifically, the study refines an initial segmentation with a small number of additional user clicks, incorporating inner boundary points through the proposed geodesic distance encoding method. Finally, classification is carried out using an Extreme Learning Model, whose parameters are optimally chosen by means of the Coot Optimization Algorithm (COA). On the 3D-IRCADb1 dataset, the research evaluates the segmentation quality metrics DICE and accuracy, finding improvements over existing approaches in both liver and tumour segmentation.


Introduction
Smart healthcare systems, in particular those that make use of IoMT technology, have benefited from recent advances in machine learning and computer communication [1,2]. A multimedia-based medical diagnostic system is one of the major demands in the medical healthcare business. Radiologists and doctors can benefit even more from the solutions provided by these intelligently built diagnostic systems [3]. Liver cancer ranks as the sixth most common form of cancer worldwide. The liver is a large glandular organ situated inside the human abdomen. Cirrhosis and acute and chronic liver diseases are on the rise as a direct result of changing lifestyles, and as a consequence of these underlying conditions, liver cancer has become the most prevalent form of the disease in many regions [4]. More therapy choices are available when the cancer is detected early. One existing deep learning pipeline covers image generation, localization, and segmentation: synthetic images are generated by a GAN, features from the input images are extracted using ResNet-50, and YOLO-v3 is utilized to localize small liver tumours. Finally, the segmentation task is completed by using a pre-trained InceptionResNet-v2 model. The performance of that method was simulated on the publicly available 3D-IRCADb-01 dataset; however, it focused only on segmentation and did not perform classification of the tumours.
A more effective method of liver and tumour segmentation from CT images is proposed by Ashreetha, B. [17], who suggests using Gabor Features (GF) in conjunction with three different machine learning algorithms: Random Forest (RF), Support Vector Machine (SVM), and Deep Neural Network (DNN). There should not be any inconsistencies or differences in the GF-generated texture data between different slices of the same organ. First, an assortment of Gabor filters is used to extract features at the pixel level. Second, liver segmentation is performed by extracting the liver from an abdominal CT image using three distinct classifiers. Finally, the segmented liver image is subjected to the tumour segmentation classifiers. All of the aforementioned classification techniques have been successfully applied to pixel-wise segmentation problems, and the Gabor filter is a good approximation of human visual system (HVS) perception.
The primary focus of Ayalew, Y.A., [18] was the application of a deep learning approach to segment the liver and tumours from abdominal CT scan images, with the goal of reducing the time and effort required to diagnose liver cancer. The algorithm's foundation is the standard UNet design; however, this research reduced the number of filters used in each block and introduced batch normalisation and a dropout layer after every block along the contracting path. The algorithm achieved a Dice score of 0.96 for liver segmentation, 0.74 for liver tumour segmentation, and 0.63 for tumour segmentation directly from the abdominal CT scan image. These results represent improvements of 0.11 for tumour segmentation and 0.01 for liver segmentation.
Zheng, R., et al. [19] apply a 4D deep learning model that takes advantage of 3D memory to the problem of lesion segmentation. The proposed deep learning approach uses the four-dimensionality of DCE MRI to aid liver tumour segmentation. A shallow U-Net-based 3D CNN module was utilised to extract 3D spatial features from the DCE phases, and a 4-layer C-LSTM network module was employed to exploit the temporal information across the phases. Networks trained using multi-phase DCE images and multi-contrast images, which capture the dynamic nature of tissue imaging features, are better able to learn the characteristics of HCC. The suggested model outperformed the 3D U-Net model, the RA-UNet model, and the other models used in the ablation study on both internal and external test sets for liver tumour segmentation, with a Dice score of 0.825 ± 0.077. The suggested model performs similarly to the state-of-the-art nnU-Net model, which has shown to be superior in a variety of segmentation tests, while greatly outperforming it in prediction time.
A fully automatic approach to liver segmentation from CT scans is proposed by Araújo et al. [20]. The suggested method consists of four primary processing steps and was tested on the 131 CT volumes of the LiTS database. The average results were 95.45% sensitivity, 99.86% specificity, 95.64% Dice coefficient, 8.28% volumetric overlap error, 0.41% relative volume difference, and 26.60 mm Hausdorff distance.
For better efficiency in diagnosing liver tumours, Rela, M. [21] adopts a novel model. Histogram equalisation and filtering are used on benchmark and manually gathered datasets. In addition, this research work uses adaptive thresholding in conjunction with level set segmentation to carry out liver segmentation. After the liver is segmented, the tumour is segmented using an improved U-Net-based deep learning method. On top of that, the GW-CTO algorithm is adopted based on a multi-objective function, because the length of the feature vector increases the difficulty of network training. To further improve upon these carefully chosen features, the GW-CTO algorithm is applied to a Deep Neural Network. With a learning percentage of 85%, the resulting DNN outperformed PSO-HI-DNN by 2.4%, O-SHO-HI-DNN by 5.2%, CTO-HI-DNN by 4.3%, and GWO-HI-DNN by 4.3% in accuracy.
In [22], Liu et al. offer an AI-based approach for segmenting CT images of liver tumours. K-means clustering (KMC), an AI-based approach, was proposed in this paper and compared to the region growing (RG) technique. Using the Child-Pugh classification system, 120 patients with liver tumours at Research Hospital in Chandigarh, India, were divided into two groups: grade A (58 cases) and grade B (62 cases). The experiments showed that liver tumours had low density on plain CT scans and moderate enhancement during PVP. Liver metastasis was shown to be detectable by CT more often than hepatocellular carcinoma. Results reveal that lipiodol chemotherapy emulsion (LCTE) has a favourable deposition effect in patients with a rich blood supply type (accounting for 53.14%) and a poor blood supply type (accounting for 25.73%).
Using CT scan data from the 3D-IRCADb01 dataset, Sabir, M.W. [23] developed a densely connected ResU-Net architecture. ResU-Net's fundamental feature, the residual block combined with the U-Net architecture, allows more information to be extracted from the input data than a standard U-Net network can. Image pre-processing techniques including data augmentation, Hounsfield unit windowing, and histogram equalisation were applied before measuring ResU-Net's efficiency. The ResU-Net system with residual connections achieved a DSC value of 0.97, which was much higher than the performance of state-of-the-art systems for liver tumour detection.

Proposed System
Figure 1 shows the working flow of the proposed model.

Dataset Selection
The proposed approach employs the 3D-IRCADb1 dataset [24], a repository of anonymized medical images with expertly drawn boundaries around the organs of interest. The DICOM format is used for all images and masks. We used the CT scans of the 20 patients found in the 3D-IRCADb1 database, in 75% of which tumours are present. Each image is 512 × 512 pixels. The number of slices ranges from 74 to 260, and the slice thickness from 1.25 to 4 mm. Table 1 lists all the details. Each scan segment has a corresponding liver mask. Tumour masks, however, are not shared between tissues and can be found in their own dedicated folders; the tumour masks from the scans of different tissue types were therefore merged into a single folder for computational convenience. A sample of 17 patients is used as the training set for the proposed approach, whereas the remaining three patients serve as the test set. Due to the complexity of the liver and its malignancies, this dataset was chosen for training and evaluation: it contains many tumours that cannot be seen by the naked eye, and the tumours have a contrast similar to the liver; more specifically, their HU values are practically identical (Figure 2).
Because of these challenges, we decided to investigate this dataset using deep learning techniques.
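As a rough illustration of this data handling, the sketch below loads a DICOM series into a volume and merges per-tissue tumour mask folders into one binary volume. It is a minimal sketch assuming the pydicom package; the folder-matching pattern (`*tumor*`) is a hypothetical stand-in for the dataset's actual directory names.

```python
import glob
import os

import numpy as np
import pydicom


def load_dicom_series(folder):
    """Read all DICOM slices in a folder and stack them into a 3D volume."""
    files = sorted(glob.glob(os.path.join(folder, "*")))
    slices = [pydicom.dcmread(f) for f in files]
    # Sort by the through-plane position so slices stack in anatomical order
    # (assumes the ImagePositionPatient tag is present, as in CT series).
    slices.sort(key=lambda s: float(s.ImagePositionPatient[2]))
    return np.stack([s.pixel_array for s in slices]).astype(np.int16)


def merge_tumour_masks(mask_root):
    """Union all per-tissue tumour mask folders into a single binary volume."""
    merged = None
    for folder in glob.glob(os.path.join(mask_root, "*tumor*")):
        vol = load_dicom_series(folder) > 0
        merged = vol if merged is None else (merged | vol)
    return merged
```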

Data Augmentation
Data inadequacy is a major obstacle for deep learning models applied to medical data: the datasets do not contain enough data points to satisfy the needs of data-hungry deep learning algorithms. Therefore, data augmentation is a popular and viable option for expanding the sample size. When augmentation is required, an augmentation API is used; this Python package makes it easy to apply enhancements to data. To find the most versatile additions, we analysed all of them and settled on four: 90° rotation, transposition, horizontal flip, and vertical flip. Due to the limited size of the dataset, there is a possibility of the network overfitting, so selected online augmentation strategies are utilised to prevent this. As a first step, the training data is partitioned into training and validation sets. Next, the augmentation methods are applied to each slice along with its corresponding ground truth (Figure 3).
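The following sketch illustrates the four chosen transforms applied jointly to a slice and its ground truth. It assumes the Albumentations package as the augmentation API (the exact package is not named in this section), with stand-in arrays in place of real data.

```python
import numpy as np
import albumentations as A

ct_slice = np.random.randint(0, 255, (512, 512), dtype=np.uint8)  # stand-in slice
gt_mask = np.zeros((512, 512), dtype=np.uint8)                    # stand-in mask

# The four transforms named in the text; p=0.5 applies each with 50% chance.
augment = A.Compose([
    A.Transpose(p=0.5),
    A.RandomRotate90(p=0.5),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
])

# The same spatial transform is applied to the slice and its ground-truth
# mask, keeping the pair aligned.
augmented = augment(image=ct_slice, mask=gt_mask)
aug_slice, aug_mask = augmented["image"], augmented["mask"]
```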

Figure 2. A sample image with a very minute tumour (not visible to the human eye) and its mask.

Figure 3. Effect of the augmentation procedures (transpose, random 90° rotation, vertical flip, horizontal flip) on the original image.


Data Normalization
The primary goal of data normalisation is to convert a dataset's numeric columns to a common scale while maintaining the range variability of the original data. Data normalisation is not mandatory in ML, but it becomes important when the data features span a wide range of values. Through the use of data normalisation, we can ensure that each row only stores the data it needs and eliminate any potential for data duplication; the normalised data takes up far less room than the alternative. Memory is the key worry for data collection and storage, and data normalisation is employed to lessen storage needs within a flexible database architecture. If the same piece of information is stored more than once, data redundancy is committed. Rescaling values that lie on multiple scales to a theoretically shared scale, frequently before averaging, is what is meant by the term "data normalisation". First, we assume that $EF_{st}$ is a data attribute, and then we define $ef_1$ and $ef_2$ as the minimum and maximum normalised values, respectively. Equation (1) displays the mathematical form of data normalisation:

$$EF_{st}^{normalized} = ef_1 + \frac{(EF_{st} - EF_{min})(ef_2 - ef_1)}{EF_{max} - EF_{min}} \tag{1}$$

In Equation (1), the attribute of the data to be standardised is denoted by $EF_{st}$, the normalised data by $EF_{st}^{normalized}$, the maximum attribute value connected to each record by $EF_{max}$, and the minimum attribute value connected to each record by $EF_{min}$.
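A minimal NumPy sketch of Equation (1), assuming a per-slice min-max rescaling into a target range:

```python
import numpy as np


def normalize(ef, ef1=0.0, ef2=1.0):
    """Min-max normalisation following Equation (1): rescale attribute EF_st
    from [EF_min, EF_max] into the target range [ef1, ef2]."""
    ef_min, ef_max = ef.min(), ef.max()
    return ef1 + (ef - ef_min) * (ef2 - ef1) / (ef_max - ef_min)


hu_slice = np.random.uniform(-1000, 400, (512, 512))  # stand-in CT slice in HU
normalized = normalize(hu_slice)                       # values now in [0, 1]
```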

Segmentation
The segmentation process has two distinct phases. To begin, the user clicks a few times in the vicinity of the target object's border in the input image (i.e., its inner margin points). From these points, a loose bounding box can be inferred and used to crop the input image. Our proposed EGD transform takes the cropped image as input and uses the user-supplied inner margin points to generate a cue map. The cropped input image and the cue map are fed into a CNN to produce a rough segmentation. In the second phase, the user clicks to highlight mis-segmented areas, and we apply Information Fusion and Graph Cuts (IF-GC) to refine the result. The refinement step can be repeated as many times as necessary during testing to obtain a satisfactory end product. Our approach requires only a limited amount of training data and can be used immediately to segment new objects without the need for additional annotations or fine-tuning.

Inner Margin Points
Interactive cues such as scribbles, bounding boxes, and their combinations are commonly used, but due to variations between individuals and imaging protocols, it can be challenging and time-consuming to identify precise extreme points in medical images, increasing the workload on the user. We propose using inner margin points as the user interaction to work around these restrictions: the user needs only a few clicks to completely enclose the target area. For organs with complex and irregular shapes, these points can provide more shape information at the cost of only a few additional clicks. In addition, it can be challenging for users to place clicks accurately on the object boundary, even during testing; it is easier to adjust inaccurate clicks by moving them slightly inwards, closer to the object boundary. A transform of the exponentialized geodesic distances between these locations along the interior border can provide a good approximation of the saliency map of the target object. As a result, inner margin points may be useful for directing CNNs as they deal with a wider variety of unseen objects.
Every object's inner margin points were simulated automatically during training using the ground truth mask and an edge detector. The inner margin points are chosen using two criteria: first, these points need to be positioned within the object and close to its edge; second, the entire object should fit within a relaxed bounding box computed from these locations. As a result, we use a two-stage simulation to mimic user interactions with a training image. (1) A small number of points are picked near the extremes of the target object on the ground truth boundary to make sure the relaxed bounding box encompasses the entire region of interest. Then, to provide further insight into the target's overall form, we pick n random points from the remaining border points.
(2) In step 2, we move all the points from step 1 towards the inner side of the border to simulate genuine user clicks, which may not be entirely accurate. Because users are expected to position the inner margin points inside the border, we shifted the simulated points inwards to reflect this requirement. Then, the determined bounding box is expanded outward by a few pixels/voxels to incorporate the surrounding area. In the testing phase, the user must supply the inner margin points in a way that complies with the aforementioned conditions: the user's interactions determine a loose bounding box, which is then enlarged somewhat to incorporate further context.
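This is a simplified sketch of the two-stage simulation, not the authors' exact procedure: SciPy's binary erosion stands in for the inward shift, and the parameters `n_random` and `shrink` are illustrative.

```python
import numpy as np
from scipy import ndimage


def simulate_inner_margin_points(mask, n_random=3, shrink=2):
    """Simulate user clicks just inside the object border (two-stage scheme).
    Stage 1: extreme points of the boundary (so the relaxed box covers the
    object) plus n random boundary points for extra shape information.
    Stage 2: erode the mask first so all points sit slightly inside the edge,
    mimicking slightly inaccurate user clicks."""
    inner = ndimage.binary_erosion(mask, iterations=shrink)
    boundary = inner & ~ndimage.binary_erosion(inner)
    ys, xs = np.nonzero(boundary)
    # Extreme points (top, bottom, left, right) of the inner boundary.
    extremes = [(ys[np.argmin(ys)], xs[np.argmin(ys)]),
                (ys[np.argmax(ys)], xs[np.argmax(ys)]),
                (ys[np.argmin(xs)], xs[np.argmin(xs)]),
                (ys[np.argmax(xs)], xs[np.argmax(xs)])]
    idx = np.random.choice(len(ys), size=n_random, replace=False)
    randoms = list(zip(ys[idx], xs[idx]))
    return extremes + randoms


def relaxed_bbox(points, margin=5):
    """Loose bounding box around the clicks, expanded by a few pixels."""
    ys, xs = zip(*points)
    return (min(ys) - margin, min(xs) - margin,
            max(ys) + margin, max(xs) + margin)


mask = np.zeros((64, 64), dtype=bool)
mask[20:44, 16:48] = True                 # stand-in object
clicks = simulate_inner_margin_points(mask)
box = relaxed_bbox(clicks)
```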

Exponentialized Geodesic Distance Transform
Coding user interactions efficiently is essential for CNN-based interactive approaches. An ideal encoding technique would incorporate the image context and work seamlessly with convolutional neural networks (CNNs) without needing any custom-designed parameters. Although desirable, these benefits are not all shared by existing interaction encoding approaches such as the Euclidean distance transform, the Gaussian heatmap, iso-contours, or the geodesic distance transform. We suggest a context-aware solution to this issue: a hybrid of the geodesic distance transform and an exponential transform.
Let $S_s$ be the collection of simulated inner margin points at the training stage or user-provided inner margin points at the test stage. For a pixel/voxel $i$ in the input image $I$, the geodesic distance from $i$ to $S_s$ is

$$D_{geo}(i, S_s, I) = \min_{j \in S_s,\; p \in \mathcal{P}(i,j)} \int_0^1 \left\| \nabla I\big(p(n)\big) \cdot v(n) \right\| dn,$$

where $p$ is a feasible path parameterized by $n \in [0, 1]$, $\mathcal{P}(i,j)$ is the set of all paths between pixels/voxels $i$ and $j$, and $v(n) = p'(n)/\|p'(n)\|$ is the tangent unit vector of the path. The EGD is then obtained by exponentializing this distance, $EGD(i, S_s, I) = e^{-D_{geo}(i, S_s, I)}$. In this paper, the EGD is described for scalar images, but it may easily be generalised to vector-valued images. Specific instances of cue maps created by applying the various encoding techniques to the same inner margin points show that EGD-based cue maps give better contrast between the foreground and its context. This means the EGD can provide the CNN with a better initial segmentation result by supplying additional shape, location, and context information.
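To make the transform concrete, the sketch below computes a discrete EGD cue map with a Dijkstra search in which each step between neighbouring pixels costs their intensity difference, a simplified stand-in for the gradient integral above; production implementations typically use faster raster-scan approximations.

```python
import heapq

import numpy as np


def exponentialized_geodesic_distance(image, seeds):
    """EGD cue map: Dijkstra over the pixel grid where stepping between
    4-neighbours costs the intensity difference (a discrete form of the
    gradient integral), followed by an exponential transform."""
    h, w = image.shape
    dist = np.full((h, w), np.inf)
    heap = []
    for (y, x) in seeds:                      # seed set S_s has distance 0
        dist[y, x] = 0.0
        heapq.heappush(heap, (0.0, y, x))
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                nd = d + abs(float(image[ny, nx]) - float(image[y, x]))
                if nd < dist[ny, nx]:
                    dist[ny, nx] = nd
                    heapq.heappush(heap, (nd, ny, nx))
    return np.exp(-dist)                      # cue map in (0, 1], 1 at seeds


img = np.random.rand(64, 64)                  # stand-in cropped image
cue = exponentialized_geodesic_distance(img, seeds=[(32, 32), (20, 40)])
```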

Early Segmentation Constructed on Cue Map and CNN
In this research, we tackle the problem of dealing with both seen and unseen objects in a wide variety of image formats by developing a universal and efficient framework. For this reason, our system is not restricted to any one particular CNN architecture; we employ 2D-UNet and 3D-UNet with certain modifications to demonstrate its potential. To strike a better balance between performance, memory cost, and time consumption, we halve the number of feature channels and swap out the batch normalisation layers for instance normalisation layers, which adapt better to various image types. As mentioned in Section 3.4.1, all inner margin points and relaxed bounding boxes are simulated automatically during training. To feed the image into the CNN, all the inner margin points are turned into a cue map. In the testing phase, the user is requested to input inner margin points for a target, after which the CNN provides a preliminary segmentation result. We then use a refinement stage that fuses information from the first segmentation with further user clicks to fix any mis-segmentation, as described in Section 3.4.4.
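As an illustration of the input convention, the following PyTorch fragment shows an assumed first block of such a modified U-Net: the image and the cue map are stacked as a 2-channel input, and instance normalisation replaces batch normalisation. The channel count (32) is illustrative.

```python
import torch
import torch.nn as nn

# A stand-in for the first block of the modified 2D-UNet: the point shown is
# the 2-channel input (image + EGD cue map) and InstanceNorm2d in place of
# batch normalisation.
first_block = nn.Sequential(
    nn.Conv2d(in_channels=2, out_channels=32, kernel_size=3, padding=1),
    nn.InstanceNorm2d(32),
    nn.ReLU(inplace=True),
)

image = torch.randn(1, 1, 256, 256)     # cropped input image
cue_map = torch.randn(1, 1, 256, 256)   # EGD cue map from the user clicks
features = first_block(torch.cat([image, cue_map], dim=1))
```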

Refinement Based on Information Fusion between the Preliminary Segmentation and Extra User Clicks
Refining a preliminary segmentation is crucial for interactive segmentation based on deep learning. Traditional approaches either need to fine-tune the model for a particular image or require an extra model for refinement. These refinement methods demand a lot of resources, especially time and memory, and they are not readily usable on unseen objects. Some works automatically refine a CNN's prediction with a CRF, but these CRF-based refinement approaches were not intended for interactive segmentation. In contrast to existing approaches, we present a unique strategy for information fusion between the original segmentation and further user interactions that yields improved generalisation to unseen objects without the need for extensive re-training.
During the refinement phase, the user is prompted to click on areas of the image that were incorrectly segmented as foreground or background. We re-apply the proposed EGD transform to generate two more interaction-derived cue maps, which efficiently encode the new interactions: $E^f$ and $E^b$ are EGD-derived cue maps of the user's foreground and background clicks, respectively. Note that in the refinement stage we do not simply recycle the original EGD map from the first phase; instead, the new EGD maps are computed from the combination of the initial and refinement clicks. The values of $E^f$ and $E^b$ fall in the range $[0, 1]$. Let $P^f$ and $P^b$ denote the initial foreground and background probability maps obtained by the CNN, and let $i$ be a pixel/voxel in the input image $I$. We propose fine-tuning $P^f$ and $P^b$ in accordance with $E^f$ and $E^b$ using an information fusion technique: when pixel $i$ is near the refinement clicks, we want to automatically emphasise $E^f$ and $E^b$; otherwise, $P^f$ and $P^b$ are left unaltered. The user-refined probabilities of pixel $i$ belonging to the foreground and background are defined as

$$R^f_i = \alpha_i E^f_i + (1 - \alpha_i) P^f_i, \qquad R^b_i = \alpha_i E^b_i + (1 - \alpha_i) P^b_i,$$

for a weighting factor $\alpha_i \in [0, 1]$ that adapts dynamically: when $i$ is near the clicks, $\alpha_i$ is close to 1.0. If no refinement input is received for the foreground (background), a fixed value is used for the corresponding cue term. Let $C^f$ denote the foreground clicks and $C^b$ the background clicks, so the full click set is $C = C^f \cup C^b$; the user-supplied label of a clicked pixel is $c_i = 1$ if $i \in C^f$ and $c_i = 0$ if $i \in C^b$. For the final, refined segmentation, we solve a conditional random field (CRF) that incorporates both $R^f$ and $R^b$:

$$\min_{y} \; \sum_i \phi(y_i) + \lambda \sum_{i,j} \psi(y_i, y_j), \quad \text{subject to: } y_i = c_i \;\; \forall i \in C,$$

where $\phi$ and $\psi$ are the unary and pairwise energy terms, respectively, and $\lambda$ specifies a relative weight between $\phi$ and $\psi$. Here $\phi(y_i) = -\left[\, y_i \log r_i + (1 - y_i) \log(1 - r_i) \,\right]$, where $r_i$ signifies the value of pixel $i$ in $R^f$, and $y_i = 1$ if $i$ belongs to the foreground and 0 otherwise. The pairwise term is

$$\psi(y_i, y_j) = [\, y_i \neq y_j \,] \cdot \exp\!\left(-\frac{(I_i - I_j)^2}{2\sigma^2}\right) \cdot \frac{1}{dist_{ij}},$$

where $I_i$ and $I_j$ denote pixels $i$ and $j$ in image $I$, $dist_{ij}$ is the Euclidean distance between pixels/voxels $i$ and $j$, and $\sigma$ is a control parameter. The CRF problem expressed in Equation (8) is shown to be submodular and solvable by Graph Cut using the max-flow/min-cut algorithm.
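The fusion rule can be sketched in a few lines; here the EGD cue map itself is reused as a plausible distance-driven choice for $\alpha_i$ (an assumption for illustration rather than the paper's stated weighting).

```python
import numpy as np


def fuse(p, e, alpha):
    """Information fusion: near the refinement clicks (alpha -> 1) the cue
    map dominates; far away (alpha -> 0) the CNN probability is kept."""
    return alpha * e + (1.0 - alpha) * p


p_f = np.random.rand(256, 256)   # CNN foreground probability map P^f
e_f = np.random.rand(256, 256)   # EGD cue map E^f of the foreground clicks

# EGD is 1 at the clicks and decays with geodesic distance, so it can serve
# directly as a distance-driven weight alpha_i.
r_f = fuse(p_f, e_f, alpha=e_f)
```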

Gradient Computation
The following equation is used to determine the gradient orientation at each pixel in the image:

$$\theta(x, y) = \tan^{-1}\!\left(\frac{dy}{dx}\right) \tag{15}$$

Dividing the Input Image into Cells and Blocks
The gradient image generated at the end is divided into 105 blocks by tiling it into 8 × 8-pixel cells and sliding a 16 × 16-pixel window across the cells. This window covers four adjacent cells at a time, and the cells in each group of four are combined to form a block. The block sections created by this HOG-style partitioning of the 64 × 128-pixel image are essential for further procedures, including feature extraction. Figure 4 presents sample images for the ground truth and liver segmentation, and Figure 5 presents the tumour segmentation.
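A short sketch of the gradient computation and the cell/block tiling; the central-difference gradients and the 2 × 2-cell blocks with an 8-pixel stride are assumptions consistent with the counts quoted above (7 × 15 = 105 blocks for a 64 × 128-pixel image).

```python
import numpy as np


def gradient_orientation(img):
    """Per-pixel gradients via central differences, then Equation (15)."""
    dy, dx = np.gradient(img.astype(float))
    theta = np.arctan2(dy, dx)        # orientation at each pixel
    magnitude = np.hypot(dx, dy)
    return theta, magnitude


img = np.random.rand(128, 64)         # stand-in 64 x 128-pixel window
theta, mag = gradient_orientation(img)

# Tile into 8x8-pixel cells, then slide a 16x16 window (2x2 cells) with an
# 8-pixel stride: (64/8 - 1) * (128/8 - 1) = 7 * 15 = 105 blocks.
blocks = [(y, x) for y in range(0, 128 - 16 + 1, 8)
                 for x in range(0, 64 - 16 + 1, 8)]
assert len(blocks) == 105
```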


Construction of the Histogram of Oriented Gradients Using a Selective Number of Histogram Bins
A histogram of the gradient directions is built for each block. To achieve this, the orientation angle of each pixel casts a vote for placement into one of a set number of histogram bins. Using more bins extracts more specific orientation information from the image, but generates more features as a result.
Different sections of the image use a different number of histogram bins to minimise the feature size while still preserving the critical information in the feature. For regions that could be part of a human figure, a higher number of histogram bins is used to extract characteristics, whereas a smaller number of bins is employed for the remaining regions.
An average image is built from 739 positive training samples to locate possible human body parts. The shaded blocks' features are extracted using the larger number of histogram bins, whereas the remaining blocks' characteristics are extracted using the smaller number. In practice, the optimal values for the high and low numbers of bins are found by trial and error.
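A sketch of the selective binning idea; the concrete bin counts (9 and 4) are illustrative placeholders for the values found by trial and error.

```python
import numpy as np


def block_histogram(theta, mag, n_bins):
    """Orientation histogram of one block: each pixel votes into a bin chosen
    by its angle, weighted by its gradient magnitude."""
    bins = np.linspace(-np.pi, np.pi, n_bins + 1)
    hist, _ = np.histogram(theta, bins=bins, weights=mag)
    return hist


# Selective binning: blocks flagged as likely foreground get finer bins.
N_BINS_HIGH, N_BINS_LOW = 9, 4   # illustrative high/low bin counts


def block_features(theta_blk, mag_blk, is_salient):
    n_bins = N_BINS_HIGH if is_salient else N_BINS_LOW
    return block_histogram(theta_blk, mag_blk, n_bins)
```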

Classification Using Extreme Learning Machine
An ELM is a neural network that dispenses with gradient-based training. Whereas gradient learning methods require frequent iterations to tune the network parameters, ELM merely requires a random initialization of the connection weights between the input and hidden layers and of the bias parameters in the hidden layer prior to training. Once the hidden layer's neuron count is determined, a unique optimal solution can be obtained. This type of learning algorithm speeds up the learning process and decreases the amount of time spent analysing data.
In the construction of the ELM, for $N$ training samples $(x_j, t_j)$ the network with $L$ hidden neurons is modelled as

$$\sum_{i=1}^{L} \beta_i\, g\big(W_i \cdot x_j + b_i\big) = o_j, \qquad j = 1, \ldots, N,$$

where $g(x)$ is the activation of the hidden neuron, $W_i = [w_{i,1}, w_{i,2}, \ldots, w_{i,n}]^T$ is the input weight vector, $b_i$ is the hidden bias, and $\beta_i$ is the output weight. Specifically, there exist proper $\beta_i$, $W_i$, $b_i$ that can satisfy Equation (4). This formula can be represented compactly as

$$H\beta = T,$$

where $H$ is the hidden layer's output matrix, $\beta$ is the hidden layer's output weight matrix, and $T$ is the expected output. Finding the optimal values for $\hat{w}_i$, $\hat{b}_i$, $\hat{\beta}_i$ is equivalent to solving the minimal cost function $\|H\beta - T\|$, which is what is required to train the network.
The aforementioned ELM is exactly the universal classifier that is needed here. Both binary and multiclass classification are possible with ELM. The learning pace can be drastically increased without the necessity for iterative learning. Although ELM has universal approximation capability, it requires a considerable number of hidden nodes to ensure good generalization performance, which makes it prone to overfitting; preventing overfitting is therefore an urgent problem for ELM. As regularization methods for fully connected networks, Dropout [25] and DropConnect [26] can effectively prevent overfitting [27]. The steps of the ELM algorithm are as follows [28].
Step 1: Given a training set $\psi = \{(x_i, t_i)\;|\;x_i \in \mathbb{R}^n,\; t_i \in \mathbb{R}^m,\; i = 1, \ldots, N\}$, an activation function $G(x)$, and the number of hidden neurons $L$.
Step 2: Arbitrarily assign the values of the input weights $w_i$ and the biases $b_i$. The optimal weight solution is identified by COA, as described in Section 3.5.1.
Step 3: Compute the hidden layer output matrix $H$.
Step 4: Compute the output weight $\beta$: $\beta = H^{\dagger} T$, where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the matrix $H$. Figure 6 presents the sample label output for an input image obtained from the ELM classifier; the binary output "0" or "1" produced by ELM labels the input image, as shown in Figure 6.
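A compact NumPy sketch of the basic ELM training in Steps 1-4, with a sigmoid as the assumed activation; in the proposed model the random draw of W and b would be replaced by the COA search.

```python
import numpy as np


def train_elm(X, T, L, seed=0):
    """Basic ELM: random input weights/biases, hidden output matrix H, then
    the output weights in closed form via the Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W = rng.standard_normal((L, n))           # input weights W_i
    b = rng.standard_normal(L)                # hidden biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))  # sigmoid activation g(x)
    beta = np.linalg.pinv(H) @ T              # beta = H^dagger T
    return W, b, beta


def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return H @ beta


# In the proposed model, W and b are not purely random: COA searches for the
# weight configuration that minimises the training error (Section 3.5.1).
X = np.random.rand(100, 20)                        # 100 samples, 20 features
T = (np.random.rand(100, 1) > 0.5).astype(float)   # stand-in binary labels
W, b, beta = train_elm(X, T, L=64)
pred = (predict_elm(X, W, b, beta) > 0.5).astype(int)
```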

Figure 6. Sample label output of the ELM classifier for an input image.

Coot Optimization Algorithm
In [29], a new optimization algorithm called the COA is introduced, and its design is based on the behaviour of coot birds. COA makes an effort to imitate the flock's behaviour as a whole. A few coots floating on the water's surface are in charge of guiding the flock. From what we can tell, they exhibit four distinct behaviours: random wandering, moving in chains, shifting their positions relative to the group leaders, and guiding the pack to the best possible location. These behaviours cannot be implemented without a mathematical model.
Initial conditions include the generation of a random population of coots. Suppose we have a D-dimensional problem that needs solving. Using Equation (20), we can construct a population of N coots.
Based on the upper bounds UB and lower bounds LB calculated for each dimension, Equation (20) generates a uniform distribution of coot positions in the search space; the coots cannot move beyond these bounds. A specified fitness function, as shown in Equation (21), is applied to this initial random population as well.
First, a random location is generated using Equation (22) to represent the unpredictable behaviour of coots. Second, we calculate the coot's new location using Equation (23).
In Equation (23), RN2 is a random number between zero and one. Both A and B are calculated using Equation (24), where T(i) is the current iteration and IterMax is the maximum number of iterations allowed. To accomplish chain movement, the average position of two coots is used to bring one closer to the other, as shown in Equation (25).
Coots also choose a leader coot and follow it using Equation (26), where N_L is the parameterized number of leaders and Lind is the leader index. To go along with this, we also have the concept of a probability, denoted by p. Finally, Equation (27) gives the rules for updating the leadership positions.
In Equation (27), gBest is the current global best position, π is 3.14, and R3 and R4 are random numbers in the interval [0, 1]. The COA's pseudocode is provided in Algorithm 1 [30]:

Algorithm 1. Pseudocode of the COA.
…
4. Random selection of leaders from the coots.
5. Calculate the fitness of coots and leaders.
6. Find the best coot or leader as the global optimum.
7. While the end criterion is not satisfied:
8.  If rand < P:
9.   R, R1, and R3 are random vectors along the dimensions of the problem.
…
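A simplified, self-contained sketch of the COA loop covering the four behaviours; the update coefficients and branching are paraphrased from Equations (22)-(27) rather than copied from them, so treat the details as illustrative.

```python
import numpy as np


def coa(fitness, dim, n_coots=20, n_leaders=4, iters=100,
        lb=-1.0, ub=1.0, p=0.5):
    """Simplified Coot Optimization sketch: random wandering, chain movement,
    following a leader, and updating leaders toward better positions."""
    rng = np.random.default_rng(0)
    coots = rng.uniform(lb, ub, (n_coots, dim))
    leaders = rng.uniform(lb, ub, (n_leaders, dim))
    g_best = min(np.vstack([coots, leaders]), key=fitness).copy()
    for t in range(iters):
        a = 1 - t / iters                          # decaying A/B-style factor
        for i in range(n_coots):
            lead = leaders[i % n_leaders]
            if rng.random() < p:                   # random wandering
                q = rng.uniform(lb, ub, dim)
                coots[i] += a * rng.random() * (q - coots[i])
            elif i > 0:                            # chain movement
                coots[i] = 0.5 * (coots[i] + coots[i - 1])
            else:                                  # follow a leader
                r = rng.random(dim)
                coots[i] = lead + 2 * r * np.cos(2 * np.pi * rng.random()) \
                           * (lead - coots[i])
            coots[i] = np.clip(coots[i], lb, ub)
            if fitness(coots[i]) < fitness(lead):  # leadership update
                leaders[i % n_leaders] = coots[i].copy()
        best = min(np.vstack([coots, leaders]), key=fitness)
        if fitness(best) < fitness(g_best):
            g_best = best.copy()
    return g_best


# Example: minimise the sphere function.
print(coa(lambda x: float(np.sum(x ** 2)), dim=5))
```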

Dice Similarity Coefficient
DSC, or Dice similarity coefficient, is commonly used to compute the similarity between two samples. In this research work, this performance measure determined the overlap between two binary masks. It can be defined mathematically as twice the size of the overlap between the two segmentations divided by the total size of the two objects. The range of DSC is from 0 (no overlap) to 1 (perfect overlap). DSC is calculated using the following equation:

$$DSC = \frac{2\,|A \cap B|}{|A| + |B|}$$

Jaccard Similarity Coefficient
JSC evaluates the binary masks precisely. It is defined as the ratio of similarity and diversity of the samples used in experimentation; in mathematical terms, it is the ratio of the intersection of two binary masks to their union. JSC is calculated according to the equation given below:

$$JSC = \frac{|A \cap B|}{|A \cup B|}$$

Accuracy
Accuracy is one of the most significant performance measures that determine the efficiency and effectiveness of any model. Accuracy represents the proportion of correctly classified samples out of the total number of samples [31].

Symmetric Volume Difference
SVD provides the deviation of the segmented images from the ground truth; a value of zero represents an optimal segmentation. SVD is calculated from the Dice similarity coefficient as:

$$SVD = 1 - DSC$$
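The overlap metrics above reduce to a few lines of NumPy; this is a minimal sketch operating on binary masks.

```python
import numpy as np


def dsc(a, b):
    """Dice similarity coefficient: 2|A n B| / (|A| + |B|)."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())


def jsc(a, b):
    """Jaccard similarity coefficient: |A n B| / |A u B|."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union


def svd_metric(a, b):
    """Symmetric volume difference: 1 - DSC (0 means perfect overlap)."""
    return 1.0 - dsc(a, b)


pred = np.random.rand(64, 64) > 0.5     # stand-in predicted mask
gt = np.random.rand(64, 64) > 0.5       # stand-in ground-truth mask
print(dsc(pred, gt), jsc(pred, gt), svd_metric(pred, gt))
```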

Sensitivity
The properly identified proportion of true positives is measured using sensitivity [32].

Specificity
The correctly identified proportion of true negatives is measured using specificity [32].

The existing technique GW-CTO [21] was implemented in our system, and the averaged results are given in Table 2. Based on the accuracy analysis, the proposed model achieved 93%, whereas the existing GW-CTO method achieved 92%. Additionally, the proposed model obtained a Dice score of 77.11%, higher than the existing technique's 67%. GW-CTO achieved 56% Jaccard, 70% specificity, and 64.8% sensitivity, whereas the proposed model achieved 67.8% Jaccard, 79.16% specificity, and 76.03% sensitivity. To test the generalizability of the proposed model, its effectiveness was also evaluated on another dataset, Silver07, which has a lateral resolution of [0.56, 0.8] mm and a z-axis resolution of [1, 3] mm. All sizes of tumours, metastases, and cysts are represented, and all scans were acquired in the central venous phase with contrast-enhancing agents. There are a total of 30 sets in the dataset: 20 for training and 10 for testing. Table 3 presents the comparative analysis of the proposed model with the existing technique on this second dataset. In the analysis of accuracy, the proposed model achieved 92% and the existing model 91%; the better performance stems from the proposed model's focus on liver and tumour segmentation using the various models described above. When tested with the Dice score, the existing technique reached 70.7% and the proposed model 77.54%. The sensitivity and specificity of the proposed model are 77.51% and 80.36%, whereas the existing technique achieved only 66.6% and 73.5%.

Evaluation metrics
Accuracy: "ratio of the observation of exactly predicted to the whole observations". This is exposed in Equation (31).
T accuracy = (Tr p + Tr n ) Tr p + Tr n + Fa p + Fa n (31) Sensitivity: "the number of true positives, which are recognized exactly". Se = Tr p Tr p + Fa n (32) Specificity: "the number of true negatives, which are determined precisely".
F1 score: It is distinct as the "harmonic mean between precision and recall. It is used as a statistical measure to rate performance".
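These confusion-matrix metrics, together with the MCC defined in the next subsection, can be computed from the four counts as in this minimal sketch.

```python
import numpy as np


def confusion_metrics(pred, gt):
    """Accuracy, sensitivity, specificity, F1, and MCC from the four
    confusion-matrix counts (Equations (31)-(33) and the MCC formula)."""
    tp = np.sum((pred == 1) & (gt == 1))
    tn = np.sum((pred == 0) & (gt == 0))
    fp = np.sum((pred == 1) & (gt == 0))
    fn = np.sum((pred == 0) & (gt == 1))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return accuracy, sensitivity, specificity, f1, mcc


pred = np.random.randint(0, 2, 1000)   # stand-in predictions
gt = np.random.randint(0, 2, 1000)     # stand-in ground truth
print(confusion_metrics(pred, gt))
```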

MCC: a "correlation coefficient computed by four values":

$$MCC = \frac{Tr_p \times Tr_n - Fa_p \times Fa_n}{\sqrt{(Tr_p + Fa_p)(Tr_p + Fa_n)(Tr_n + Fa_p)(Tr_n + Fa_n)}}$$

When compared with existing techniques such as RF, SVM, DNN-GF [17], and HI-DNN [21], the proposed model achieved better performance. The existing techniques were implemented on our dataset and system, and the averaged results are given in Table 4. Initially, without the segmentation process, all techniques identified tumours with poor performance: the proposed model reached 88% accuracy, 92% specificity, and 92% NPV, while the existing techniques achieved roughly 89% NPV, 69% to 87% specificity, and 86% accuracy. Figures 7 and 8 present the graphical analysis of the proposed model against the existing techniques. In the next analysis, all techniques were tested with segmentation, as reported in Table 5. When segmentation is included, even the existing techniques perform better: they reach roughly 92% to 95% accuracy, while the proposed model achieves 96%. The reason for the better performance is that the kernels are optimally selected using COA and the segmentation of the liver is considered; the existing techniques used no optimization model for parameter selection and therefore performed worse. The FPR is highest for RF, while the other existing techniques have 0.03 and the proposed model has 0.01. Figures 9 and 10 present the graphical analysis for the various metrics, and the ROC analysis of the proposed model is shown in Figure 11.


Cross-Validation Analysis
Here, the analysis is carried out with various k-fold cross-validations. Table 6 presents the analysis of the proposed model against existing techniques under different cross-validation settings and different ratios of training and testing images. It shows that the proposed model achieves nearly 96% to 98% accuracy for the 80-20 split, the 90-10 split, and cross-validation, but only 93.45% accuracy when training is 70% and testing is 30%; the splits therefore play an important role in classification accuracy. The existing techniques HI-DNN and DNN-GF achieved roughly 93% to 96% accuracy for the 90-10 split, 80-20 split, and cross-validation, but only 91% accuracy for the 70-30 split. For the analysis of coot optimization in the proposed model, Table 7 provides the experimental analysis.

The computational cost of the proposed model is lower (0.71 GB) than that of the existing methods, and it also requires less memory (237.89 MB). In contrast, the existing RF model has a computational cost of 1.18 GB and uses 289.11 MB of memory to process the identification. This leads to poor classification performance; the main reason is that a trained RF requires considerable memory for storage, owing to the need to retain the information of several hundred individual trees. In addition, the training and testing times of the various algorithms are reported in Table 8. These experiments clearly show that the proposed model has lower training and testing times on the input liver images. Among all techniques, RF has the highest training and testing times due to its hundreds of trees, as explained above for Table 6. The other existing models, DNN-GF and HI-DNN, need nearly 2300 s of training time and 14 s for testing an input image.

Table 9 compares the proposed model to the existing techniques on the Silver07 dataset with regard to segmentation. In the accuracy analysis, the existing ML models achieved roughly 78.5% to 83%, DNN-GF achieved 80%, the HI-DNN model achieved 83%, and the proposed model achieved 87.09%. The reason for the better performance is that the ELM's weight parameters are optimally selected by COA, whereas the existing techniques did not consider a parameter selection operation. In the FPR analysis, RF has 0.12, SVM 0.18, DNN-GF and HI-DNN 0.06, and the proposed ELM 0.05. The sensitivity and specificity of the proposed model reached nearly 99%, whereas the existing models achieved roughly 80-87% (RF), 73-81% (SVM), and 93% to 98% (HI-DNN).

The importance of COA in the ELM was tested on both datasets, as provided in Table 10. On the first dataset (3D-IRCADb1), the proposed model achieved 87% accuracy without COA and 96% with COA, which demonstrates the importance of COA in the ELM's parameter selection. Likewise, without COA, the proposed ELM achieved 68% F-measure, 83% sensitivity, 96% specificity, and 95% precision; with COA, the same ELM achieved 96% F-measure and sensitivity, and 98% specificity and precision. On the second dataset, the ELM without COA obtained 81% accuracy, 75% F1-score, 95% sensitivity, 98% specificity, and 94% precision; with COA, it achieved 87% accuracy, 84% F1-score, and 99% sensitivity, specificity, and precision.
In the FPR analysis, the proposed model has 0.17 on the first dataset and 0.19 on the second dataset without COA; with COA, the same ELM achieved an FPR of 0.15 on both datasets.

Conclusions
We present a deep learning-based interactive system with high accuracy for segmenting tumours from liver images. A new context-aware approach was presented to encode user interactions and direct the CNN towards a good initial segmentation. In addition to the encoding technique, we also proposed a powerful refinement strategy to boost the precision of the segmentation outcomes. Segmentation is an important goal for models built using deep learning, and this framework is intended to help achieve that goal. The results of our experiments on a broad variety of input images demonstrate that our inner margin points and EGD framework provide superior accuracy and efficiency compared with existing learning-based interactive segmentation tools. In addition, the suggested framework generalizes very well to liver tumour identification; as an annotation tool, it could be used to produce segmentation masks quickly and precisely for various objects. As future work, the classification accuracy could be improved by introducing a deep learning model for classification in place of the machine learning model.