Enhancing Skin Disease Segmentation with Weighted Ensemble Region-Based Convolutional Network †

Abstract: Skin diseases are a prevalent and diverse group of medical conditions that affect a significant portion of the global population. A critical challenge is the difficulty of accurately diagnosing certain skin conditions, as many diseases share similar symptoms or appearances. In this paper, we propose a Weighted Ensemble Region-based Convolutional Network (WERCNN) that combines a Mask Region-based Convolutional Neural Network (Mask R-CNN) with the weighted average ensemble technique to enhance segmentation performance. A skin disease image dataset obtained from Kaggle is used for segmentation. The Mask R-CNN is trained on this dataset of dermatological images, and the weighted average ensemble model is used to optimize the weights of the Mask R-CNN model. Performance is evaluated with accuracy, precision, recall, specificity, and F1-score, for which the method achieves 94.7%, 93.6%, 93.9%, 92.6%, and 93.7%, respectively. The WERCNN performs strongly in accurately segmenting the affected regions of skin images, providing valuable insights to dermatologists for diagnosis and treatment planning.


Introduction
The accurate and timely identification of skin diseases is crucial for effective diagnosis, treatment planning, and patient care [1]. Traditional methods for skin disease diagnosis rely heavily on manual inspection and assessment by dermatologists, which can be time-consuming, subjective, and prone to inter-observer variability. In recent years, various approaches have transformed the field of medical image analysis, offering the potential to automate and enhance skin disease diagnosis [2]. One of the primary challenges in skin disease segmentation lies in the complexity and diversity of skin conditions. Different diseases exhibit distinct visual patterns, textures, and morphological characteristics, making it essential for segmentation models to be adaptive and robust [3]. Deep learning techniques allow these models to learn and extract meaningful features from skin images, enabling them to generalize well across different diseases and variations [4].
Deep learning models, particularly convolutional neural networks (CNNs), have demonstrated exceptional performance in various image analysis tasks, including image classification, object detection, and segmentation [5]. Skin disease segmentation, in particular, is a critical task in medical image analysis, where the goal is to precisely delineate the affected regions in skin images. By providing pixel-level masks, segmentation enables a more granular understanding of the disease extent and location, facilitating better treatment decisions and disease monitoring [6]. In this study, we explore the application of deep learning, specifically the Mask Region-based Convolutional Neural Network (Mask R-CNN) architecture, to the segmentation of skin diseases [7]. A Mask R-CNN is an extension of the Faster R-CNN object detection model that can generate both bounding boxes and instance-level segmentation masks. By utilizing the inherent strengths of a Mask R-CNN, we aim to attain accurate and fine-grained segmentation of various skin diseases [8].
To further improve the segmentation performance, we employ a weighted average ensemble of multiple Mask R-CNN models [9]. The ensemble approach combines the outputs of individual models with different characteristics, allowing us to capture diverse patterns and features present in the data. By aggregating the predictions of multiple models, we anticipate a more robust and reliable skin disease segmentation system [10].
Skin disease may be the fourth leading cause of non-fatal disease worldwide, yet research efforts and funding do not match the relative burden of skin diseases. Skin and subcutaneous diseases grew by 46.6% between 1990 and 2022, with 21 to 87% of skin diseases reported in 13 pediatric-specific datasets. Identifying skin disease is therefore necessary, and for this reason a novel method, WERCNN, is proposed. A weighted average (or weighted sum) ensemble is an ensemble machine learning approach that combines the predictions from multiple models, where the contribution of each model is weighted in proportion to its capability or skill. It can enhance model performance by introducing diversity to reduce variance. The major contributions of this work are as follows:
• This study contributes to the advancement of automated skin disease diagnosis by utilizing state-of-the-art deep learning techniques.
• The proposed weighted average ensemble approach combines the outputs of multiple Mask R-CNN models. This ensemble strategy capitalizes on the diverse strengths of individual models, resulting in improved segmentation performance and robustness across different skin diseases and variations.
• The proposed ensemble methodology can be easily extended to incorporate new models or adapt to diverse datasets. Its versatility makes it applicable to various clinical scenarios, providing a flexible solution for accurate skin disease segmentation.
The rest of the paper is structured as follows. Section 2 reviews related work on skin disease image segmentation. Section 3 presents the proposed methodology for skin disease segmentation. The experimental analysis is presented in Section 4. Finally, Section 5 concludes the paper.

Literature Survey
Khouloud et al. [11] employed convolutional neural networks (CNNs), including U-Net and DeepLab, which was developed by Samanta et al. [12], while Khan et al. [10] used Mask R-CNN. These architectures have emerged as popular choices for their ability to learn hierarchical features from skin images, leading to precise, pixel-level segmentation of affected regions. Comparative analyses have been conducted to assess the performance, computational efficiency, and applicability of these architectures for medical image analysis.
The studies by Cao et al. [13] have focused on expanding and curating datasets that cover a wide spectrum of skin conditions, patient demographics, and imaging modalities to ensure the generalizability and robustness of segmentation models such as those utilized by Wibowo et al. [14]. Ethical considerations have been emphasized as well, with efforts to identify and mitigate potential biases in AI models to avoid disparities in skin disease diagnosis. Ensuring transparency, accountability, and privacy in AI-driven diagnostic tools has been explored by Anand et al. [15].
Ravi et al. [16] developed ensemble methods, such as weighted averaging, which have gained popularity for combining the outputs of multiple models to improve segmentation accuracy and reliability. Transfer learning and domain adaptation techniques have been employed to take models pre-trained on large-scale datasets and fine-tune them on specialized skin disease datasets, effectively overcoming the challenge of limited annotated data.

Materials and Methods/Methodology
Segmentation offers the advantage of removing unwanted detail from the skin image, such as air, and allows various tissues, such as bone and soft tissue, to be isolated. Medical image segmentation is used to identify important areas in medical images and videos. When training computer vision models for healthcare purposes, image segmentation is a cost-effective and time-efficient method of labeling and annotation that can improve accuracy and outputs. The images used in this analysis are sourced from a dataset of skin disease images. A Mask R-CNN is employed to segment the skin disease regions in the dataset. The Mask R-CNN performs pixel-level segmentation on the detected regions of the dataset images and can accommodate multiple classes and overlapping regions. The Mask R-CNN can be highly efficient, but it does not work well on large datasets because of weight-optimization issues. To overcome these issues, a weighted average ensemble model is employed, which can enhance segmentation performance on the skin disease image dataset. The images first undergo preprocessing to prepare them for analysis. Next, the images are segmented. The resulting images are then classified as either segmented or non-segmented. Finally, the predicted output is obtained. The overall workflow is shown in Figure 1.

Preprocessing
To ensure accurate results, the input data undergo preprocessing. This is necessary because the captured skin images may contain unwanted elements such as noise, poor background, and low illumination [17]. By applying preprocessing techniques to the raw data, the proposed method can achieve better accuracy. The preprocessing techniques involved are resizing, augmentation, normalization, and standardization.
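The four preprocessing steps named above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the paper does not specify the target size, the augmentation types, or the scaling constants, so the function names, the nearest-neighbour resizing, the horizontal flip, and the 4 × 4 output size are all assumptions chosen for brevity.

```python
from statistics import mean, pstdev


def resize_nearest(img, out_h, out_w):
    """Resize a 2-D grayscale image (list of rows) by nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def normalize(img, max_value=255.0):
    """Scale pixel intensities to the range [0, 1] (assumes 8-bit input)."""
    return [[p / max_value for p in row] for row in img]


def standardize(img):
    """Shift and scale pixels to zero mean and unit variance."""
    pixels = [p for row in img for p in row]
    mu, sigma = mean(pixels), pstdev(pixels)
    sigma = sigma or 1.0  # guard against a constant image
    return [[(p - mu) / sigma for p in row] for row in img]


def flip_horizontal(img):
    """Simple augmentation (assumed here): mirror the image left to right."""
    return [row[::-1] for row in img]


def preprocess(img, size=(4, 4)):
    """Resize, normalize, then standardize one image."""
    return standardize(normalize(resize_nearest(img, *size)))


# Hypothetical 2 x 2 input image, used only to exercise the functions.
raw = [[0, 255], [255, 0]]
prepared = preprocess(raw)
augmented = flip_horizontal(prepared)
```

In practice an image library would handle resizing and augmentation; the pure-Python version is only meant to make the order and effect of the steps explicit.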

Skin Image Segmentation Using WERCNN
Image segmentation is the process of analyzing an image and dividing it into different parts based on pixel characteristics. This technique supports image processing and detailed analyses, such as separating foreground from background. After the preprocessing stage, the skin images are segmented using a Weighted Ensemble Region-based Convolutional Network (WERCNN). For better performance, image segmentation can be customized to specific circumstances. The Mask R-CNN and the weighted average ensemble method (WAEM) are explained below. These two techniques are combined to segment skin disease images.
Mask R-CNN: A Mask R-CNN has significant potential in the field of skin-related diseases because it is an advanced computer vision model [18]. A Mask R-CNN is an extension of Faster R-CNN and is widely used for detecting and segmenting areas of interest in medical images, including dermatological scans and photographs. It can identify and isolate regions of potential concern with high accuracy. Mask R-CNN adds a branch to the Faster R-CNN structure to predict the object mask. The network achieves pixel-to-pixel alignment between input and output, which leads to favorable results [19]. During the training stage, a multi-task loss is defined, which in the standard Mask R-CNN formulation is given by
$$K = K_{cls} + K_{box} + K_{mask} \tag{1}$$
where $K_{cls}$ indicates the classification loss, and $K_{box}$ denotes the bounding-box loss. Here, $K_{cls}$ and $K_{box}$ are identical to those used in the underlying Faster R-CNN detector. With this definition of $K_{mask}$, the network can generate a distinct mask for each class, without any conflict or overlap between them. For an RoI, $K_{cls}$ is determined as follows:
$$K_{cls}(q, v) = -\log q_v \tag{2}$$
where $v$ indicates the object class, and $q = (q_0, \ldots, q_k)$ indicates the probability distribution over the classes. The regression loss of the bounding box for an RoI, summed over the box components $j$, is expressed as
$$K_{box}(x, u) = \sum_{j} \mathrm{smooth}_{K_1}(x_j - u_j) \tag{3}$$
Here, $u = (u_w, u_i, u_y, u_x)$ indicates the true bounding-box regression target of an RoI, and $x$ denotes the predicted bounding-box regression for class $v$. The smooth $K_1$ function in the above equation is expressed as
$$\mathrm{smooth}_{K_1}(z) = \begin{cases} 0.5\,z^2, & |z| < 1 \\ |z| - 0.5, & \text{otherwise} \end{cases} \tag{4}$$
The binary loss function is denoted by $K_{mask}$ and is determined by computing the average binary cross-entropy. Here, the output for every RoI has dimension $L n^2$, where $L$ indicates the number of masks produced by the mask branch, each of resolution $n \times n$. $K_{mask}$ is determined by
$$K_{mask} = -\frac{1}{n^2} \sum_{1 \le a, b \le n} \left[ P_{ab} \log Q^{v}_{ab} + (1 - P_{ab}) \log\left(1 - Q^{v}_{ab}\right) \right] \tag{5}$$
where $P$ indicates the true mask, while $Q^{v}$ represents the predicted mask for the RoI in class $v$.
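The loss terms $K_{cls}$, $K_{box}$, and $K_{mask}$ described above can be illustrated with a short sketch. It assumes the standard Mask R-CNN definitions (negative log-likelihood for classification, smooth L1 for box regression, and average binary cross-entropy for the mask); the function names and the small clipping constant `eps` are illustrative choices, not taken from the paper.

```python
import math


def smooth_l1(z):
    """Smooth L1 penalty: quadratic near zero, linear elsewhere."""
    return 0.5 * z * z if abs(z) < 1 else abs(z) - 0.5


def cls_loss(probs, true_class):
    """Classification loss: negative log-probability of the true class."""
    return -math.log(probs[true_class])


def box_loss(pred_box, true_box):
    """Bounding-box loss: smooth L1 summed over the box components."""
    return sum(smooth_l1(p - t) for p, t in zip(pred_box, true_box))


def mask_loss(true_mask, pred_mask, eps=1e-7):
    """Mask loss: average binary cross-entropy over all mask pixels."""
    total, count = 0.0, 0
    for t_row, p_row in zip(true_mask, pred_mask):
        for t, p in zip(t_row, p_row):
            p = min(max(p, eps), 1 - eps)  # keep log() finite
            total += t * math.log(p) + (1 - t) * math.log(1 - p)
            count += 1
    return -total / count


def multitask_loss(probs, true_class, pred_box, true_box, true_mask, pred_mask):
    """Sum of the classification, box, and mask terms for one RoI."""
    return (cls_loss(probs, true_class)
            + box_loss(pred_box, true_box)
            + mask_loss(true_mask, pred_mask))
```

Only the mask of the true class enters the mask term, which is why `mask_loss` receives a single predicted mask rather than one per class.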
The L masks predicted by this model correspond to each of the L key point types (e.g., various skin diseases). The detection algorithm is trained to identify the exact location of key points on an object or patient in a single pass. Note that person detection and key point detection are completely separate processes and do not rely on one another. By analyzing this information, the movement and position of body organs can be determined accurately, which allows any signs of discomfort or disease in the patient's body to be identified promptly. The movements of a particular body part are monitored closely to assess any discomfort a patient may be experiencing at the coordinates (y, x). The distance between key point positions can be determined with the following equation:
$$E = \sqrt{\left(l_{y_i} - l_{y_{i-1}}\right)^2 + \left(l_{x_i} - l_{x_{i-1}}\right)^2} \tag{6}$$
Equation (6) gives the Euclidean distance of a key point $l$ between consecutive frames $i$ and $i-1$. Here, $(l_{y_i}, l_{x_i})$ and $(l_{y_{i-1}}, l_{x_{i-1}})$ denote the coordinates (y, x) of the key point in frames $i$ and $i-1$, respectively.
Eng. Proc. 2023, 59, 49
$$S = \begin{cases} 1, & E \ge e_q \\ 0, & \text{otherwise} \end{cases} \tag{7}$$
Here, the value of $S$ thresholds the distance between key points. In this work, $e_q$ is the threshold value used to keep watch on the specific areas of a patient's skin condition that are considered critical. To accomplish this objective, the Euclidean distance is examined for every key point $j$ of the patient's organs, where $j$ ranges from 1 to a maximum value $m$. The key components of human skin, $B$, refer to the essential parts of a patient's anatomy, of which there are $m$.
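The distance test on key points can be sketched in a few lines. The sketch assumes that each key point is a (y, x) pair, that the same key points are listed in the same order for frames i − 1 and i, and that a single threshold e_q applies to all of them; the function names are hypothetical.

```python
import math


def keypoint_distance(prev_yx, curr_yx):
    """Euclidean distance of one key point between consecutive frames."""
    return math.hypot(curr_yx[0] - prev_yx[0], curr_yx[1] - prev_yx[1])


def exceeds_threshold(distance, e_q):
    """Indicator S: 1 if the distance reaches the threshold e_q, else 0."""
    return 1 if distance >= e_q else 0


def movement_flags(prev_points, curr_points, e_q):
    """Apply the distance test to every key point j = 1, ..., m."""
    return [exceeds_threshold(keypoint_distance(p, c), e_q)
            for p, c in zip(prev_points, curr_points)]
```

For example, a key point that moves from (0, 0) to (3, 4) has travelled a distance of 5, so it is flagged for any threshold up to 5.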
$Cond_{patient}$ describes the patient's condition, and $S_s$ indicates the threshold time, which can differentiate between regular movements and those caused by a medical condition. Using this condition, R-CNN models can precisely distinguish and delineate different skin conditions, which enables early and accurate diagnosis, treatment planning, and improved disease monitoring.
Weighted Ensemble Average Method: The weighted average method extends the unweighted average by multiplying the CNN outputs by different weight values [20]. Furthermore, the weight values sum to one.
In Equation (10), for the weighted average approach, $\beta$ signifies the weight value that multiplies the vector $\vec{x}$, and $t$ signifies the number of ensemble CNN models; that is, the ensemble output is the weighted sum $\sum_{i=1}^{t} \beta_i \vec{x}_i$ with $\sum_{i=1}^{t} \beta_i = 1$. In this work, the weighted average ensemble (WAE) prediction is computed using a variety of parameter weightings. The simple average ensemble (SAE) differs in that all parameters are given equal weights. Equation (11) provides the formula for WAE, where $S(i)$, $X_i$, $s_i(i)$, and $Q$ signify SAE output. Similarly, $Q_i$ is calculated using Equation (12), where $DC_t$ is the performance efficiency of the $t$-th single model. The classification outcomes in the majority of research papers are based on a single model. According to a large body of research, an ensemble model can perform better than a single model, because a single model cannot extract all of the features in the dataset. Researchers therefore use a variety of models to improve performance. The optimal weighted average ensemble method is applied to multi-class classification in this study. This ensemble technique improves on the average ensemble method, in which each model contributes equally to the prediction.
In the weighted average ensemble method, each model is assigned a weight that determines its contribution. Additionally, this study uses grid search to optimize the allocated weights.
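A weighted average ensemble with grid-searched weights can be sketched as follows. The paper does not state which score the grid search maximizes or how fine the grid is, so the pixel accuracy criterion, the 0.5 decision threshold, the step count, and all function names here are assumptions made for illustration.

```python
from itertools import product


def weighted_average(masks, weights):
    """Pixel-wise weighted average of per-model probability masks."""
    height, width = len(masks[0]), len(masks[0][0])
    return [[sum(wt * m[r][c] for wt, m in zip(weights, masks))
             for c in range(width)]
            for r in range(height)]


def pixel_accuracy(prob_mask, true_mask, threshold=0.5):
    """Fraction of pixels whose thresholded prediction matches the truth."""
    pairs = [(p >= threshold, bool(t))
             for p_row, t_row in zip(prob_mask, true_mask)
             for p, t in zip(p_row, t_row)]
    return sum(pred == truth for pred, truth in pairs) / len(pairs)


def weight_grid(n_models, steps=10):
    """Yield every weight tuple on a regular grid whose entries sum to one."""
    for combo in product(range(steps + 1), repeat=n_models):
        if sum(combo) == steps:
            yield tuple(k / steps for k in combo)


def grid_search_weights(masks, true_mask, steps=10):
    """Return the first grid point with the highest validation accuracy."""
    best_weights, best_score = None, -1.0
    for weights in weight_grid(len(masks), steps):
        score = pixel_accuracy(weighted_average(masks, weights), true_mask)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Hypothetical outputs of two models on a 2 x 2 validation mask.
truth = [[1, 0], [1, 0]]
model_a = [[0.9, 0.6], [0.8, 0.2]]  # wrong on the top-right pixel
model_b = [[0.3, 0.1], [0.7, 0.3]]  # wrong on the top-left pixel
best_weights, best_score = grid_search_weights([model_a, model_b], truth)
```

In this toy case each model alone misclassifies one pixel, while a suitable mixture classifies all four correctly, which is the effect the ensemble is meant to exploit. The grid grows quickly with the number of models, so a coarse step is advisable for larger ensembles.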

Proposed Weighted Ensemble Region-based Convolutional Network (WERCNN) for skin disease segmentation:
Image segmentation precisely delineates the areas containing a skin lesion and improves the accuracy of skin image analysis. For segmentation, a WERCNN is proposed, which combines the Mask R-CNN and the weighted average ensemble method. The Mask R-CNN performs pixel-level segmentation and can accommodate multiple classes and overlapping regions in the skin disease images. A weighted average ensemble is used to optimize the weights. The weighted average ensemble enhances the accuracy and performance of the segmentation process, especially for complex and noisy problems. It also mitigates overfitting and reduces the spread, or dispersion, of the model's predictions and performance. The process of the WERCNN is shown in Figure 2. First, the images are preprocessed and then segmented using the Mask R-CNN. The parameters are initialized and used to build the Mask R-CNN. Next, the model's weights are loaded and optimized with a weighted average ensemble model, which improves the Mask R-CNN's weights. If the weights are optimized, a segmented output is produced; if not, the process repeats.

Results
The proposed method aims to segment skin diseases within a time limit and uses the skin disease image dataset as its source of images. The WERCNN method is compared with existing methods, namely the Convolutional Neural Network (CNN) [11], full-resolution Convolutional Neural Network (FrCN) [21], Support Vector Machine (SVM) [22], and Adaptive Neuro-Fuzzy Classifier (ANFC) [23]. The comparison is based on standard performance metrics, i.e., F1-score, recall, precision, specificity, and accuracy [24,25], to determine the performance of each method. In this section, the experimental setup, the parameters used for the method, the performance metrics [26], and the datasets are explained.
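The five metrics used in the comparison can be computed from confusion counts as sketched below. The sketch assumes that the counts are taken pixel by pixel between a binary predicted mask and a binary ground-truth mask; the paper does not state the level at which its counts are taken, and the function names are illustrative.

```python
def confusion_counts(pred_mask, true_mask):
    """Count true/false positives and negatives over all pixels."""
    tp = fp = tn = fn = 0
    for p_row, t_row in zip(pred_mask, true_mask):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def segmentation_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, specificity, and F1 from the counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": f1_score(precision, recall),
    }
```

As a consistency check, `f1_score(0.936, 0.939)` evaluates to about 0.937, which agrees with the F1-score of 93.7% reported for the proposed method alongside its precision of 93.6% and recall of 93.9%.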
The results demonstrate that the proposed WERCNN outperforms these existing methods, achieving higher performance scores. Table 1 presents a detailed comparison of the proposed method with the existing methods based on various performance measures. The superior performance of the WERCNN indicates its effectiveness in skin disease segmentation compared to the other methods evaluated. The CNN attains an accuracy of 89.8%, a precision of 85.7%, a recall of 86.7%, a specificity of 89.0%, and an F1-score of 91.5%. The FrCN attains an accuracy of 90.7%, a precision of 86.8%, a recall of 88.8%, a specificity of 90.8%, and an F1-score of 89.7%. The SVM and ANFC attain accuracies of 87.7% and 88.4%, precisions of 90.8% and 87.9%, recalls of 82.7% and 83.8%, specificities of 84.7% and 85.7%, and F1-scores of 88.1% and 89.1%, respectively. The proposed WERCNN attains higher values than the other methods: 94.7% accuracy, 93.6% precision, 93.9% recall, 92.6% specificity, and 93.7% F1-score.
The accuracy analysis of the proposed WERCNN model is shown in Figure 3a. The figure shows that the proposed approach attains a higher accuracy than the other methods: 94.7% for the proposed WERCNN, compared with 89.8% for the CNN, 90.7% for the FrCN, 87.7% for the SVM, and 88.4% for the ANFC. Figure 3b depicts the precision analysis for the proposed WERCNN method, which achieves a higher precision of 93.6%.
In Figure 4b, the specificity measure is used to compare the existing methods with the proposed WERCNN for image segmentation. Additionally, Figure 4c illustrates the F1-score analysis, with the WERCNN achieving a higher F1-score of 93.7% compared to the CNN (91.5%), FrCN (89.7%), SVM (88.1%), and ANFC (89.1%). Overall, the proposed WERCNN method demonstrates superior performance in skin disease image segmentation when compared to the existing methods, as evident from the higher values of precision, recall, specificity, and F1-score. These results indicate that the WERCNN is an effective approach for accurately segmenting skin disease images.

Conclusions
The proposed Weighted Ensemble Region-based Convolutional Network (WERCNN) is developed for effective skin image segmentation. First, input images are gathered from the skin disease image dataset and preprocessed. Next, the Mask R-CNN segments the skin disease images, and the weights of the Mask R-CNN are optimized using the weighted average ensemble model. The results show that the proposed WERCNN method achieves superior effectiveness in image segmentation, with an accuracy of 94.7%, an F1-score of 93.7%, a recall of 93.9%, a precision of 93.6%, and a specificity of 92.6%. In the future, our research aims to extend the WERCNN method to detect other types of skin diseases and to improve skin lesion detection accuracy. Additionally, we plan to explore real-time skin disease detection and early prediction techniques to enhance the practical applications of our research. The WERCNN could also be applied to identify other types of diseases, such as cancer, brain tumors, diabetes, and tuberculosis.

Figure 1. Workflow of the proposed model.

Table 1. Comparison of the proposed WERCNN.