Development of an Artificial Intelligence Model to Recognise Construction Waste by Applying Image Data Augmentation and Transfer Learning

The demand for categorising technology that requires minimum manpower and equipment is increasing because a large amount of waste is produced during the demolition and remodelling of structures. Considering the latest trend, applying an artificial intelligence (AI) model for automatic categorisation is the most efficient method. However, it is difficult to apply this technology because research has focused only on general domestic waste. Thus, in this study, we delineate the process for developing an AI model that differentiates between various types of construction waste. In particular, solutions to the difficulty of collecting learning data, which is common in AI research in specialised fields, were also considered. The Fréchet Inception Distance method was used to determine an appropriate level of augmentation, the amount of learning data was increased two to three times through this augmentation, and the resulting improvement in the performance of the AI model was checked.


Introduction
The construction industry is a pivotal player in the national economy in terms of gross domestic product and employment. According to the World Bank statistics [1], the construction industry is responsible for approximately 24.7% of the gross domestic product on average globally. Similarly, the construction industry plays a key role in the South Korean economy, accounting for approximately 26.8% of the gross domestic product in 2019 [2]. Additionally, the construction industry provided approximately two million jobs, accounting for approximately 7.5% of the overall employment in all manufacturing sectors in South Korea [2].
Behind the positive role of this industrial sector in the national economy, it has been pointed out that this industry not only consumes a large amount of natural resources and energy, but also emits a large amount of greenhouse gases (GHGs) through the production of various building materials and the operation of buildings and facilities over their entire life cycle. According to the Intergovernmental Panel on Climate Change report, the construction industry consumes approximately 40% of the total global energy and accounts for approximately 30% of the overall GHG emissions per annum [3]. Additionally, the construction industry generates a vast amount of construction and demolition waste, thereby contributing a significant portion of the overall waste generated globally [4][5][6]. In South Korea, construction and demolition waste represent approximately 50% of the total

Waste Management
According to Tam and Tam [15], an incentive policy with a gradual increase in benefits would be an effective approach to encourage employees to participate in waste reduction activities. On the other hand, Lu and Yuan [11] suggested that detailed regulations on waste management at construction sites are essential for successfully reducing construction waste. While waste management through incentive policies and regulations would be an effective method from a short-term perspective, the reduction of waste through recycling would make it possible to decrease waste generation and achieve a circular economy [16][17][18]. According to Edwards [16], recycling, which would be an effective strategy for waste minimisation, would reduce the demand for new resources, reduce transportation and production energy costs, and prevent land loss for landfills. Previous studies claimed that automation systems for recycling would be a potential solution for sorting and classifying waste [19][20][21]. For example, Picon et al. [19] adopted hyperspectral images for sorting non-ferrous metal waste from electric and electronic equipment. Their proposed system achieved approximately 98% accuracy in classifying waste, thereby making it possible to replace the existing manual sorting procedures. Similarly, Aleena et al. [22] proposed an automatic waste segregator using inductive proximity sensors and robotic arms for classifying solid waste into three main categories: metallic, organic, and plastic.
Likewise, on-site automated waste separation and classification is an essential function for recycling construction and demolition waste in the construction industry. For example, Xiao et al. [23] proposed an online construction waste classification system, which used industrial cameras to capture the region of the objects and hyperspectral cameras to obtain spectral information to classify the waste materials into concrete, rubber, black brick, wood, plastic, and brick. Similarly, Hollstein et al. [6] developed a new compact hyperspectral camera, which could overcome the existing problems of hyperspectral imagers, for automatic construction waste sorting. Although hyperspectral imaging has several advantages for automated construction and demolition waste classification, it also has problems, such as a high initial investment cost and insufficient robustness of optical sensors. Recently, advances in computer vision-based object detection and classification techniques have provided potential solutions for automatic construction and demolition waste classification [5,20,21][24][25][26][27].

Convolutional Neural Network (CNN)
Convolutional neural networks (CNNs) are widely adopted models for classifying objects in images in various fields, such as medical diagnosis, autonomous driving, facial recognition, and so forth [25][26][27][28]. CNNs are applied to various fields in the construction industry, such as structural health monitoring and prediction, health and safety monitoring on a construction site, workplace assessment, and activity recognition of construction workers for predicting hazards [29][30][31][32]. Zhang et al. [30] proposed a posture recognition method that used deep CNN-based 3D ergonomic posture recognition to enhance the health and safety of construction workers. Additionally, several studies attempted to adopt this model to predict structural safety. Deng et al. [33] developed a CNN-based model for predicting the compressive strength of recycled concrete by learning deep features of the water-cement ratio, recycled coarse aggregate replacement ratio, recycled fine aggregate replacement ratio, fly ash replacement ratio, and their combinations. Cha et al. used CNN in a vision-based approach for detecting cracks in concrete images [25]. In this research, the test results of crack detection using the CNN model showed better performance compared to the conventional edge detection methods. Gopalakrishnan et al. [34] used a deep CNN model to detect the pavement distress from digitised pavement surface images. In this research, the authors applied the VGG-16 deep CNN model, which yielded the best performance compared to other machine learning classifiers. Similarly, Dung [35] proposed a fully convolutional network-based concrete crack detection and density evaluation method, which showed an accuracy rate of more than 90% for concrete surface crack detection. 
Although CNN has established itself as the core of machine learning technology and is expanding the scope of applications in the construction industry, studies on the classification of construction and demolition waste using the CNN method are relatively scarce.
Since a deep learning model called AlexNet won the ImageNet Large Scale Visual Recognition Challenge championship in 2012, CNNs have become the mainstream image recognition model among different computer vision algorithms. Vision-based object detection is a technology that recognises certain objects directly from image data without any programs or commands [36][37][38]. Object recognition and detection technology have progressed from just determining the existence of an object to distinguishing the location and category of an object. The application of CNN models for waste management is divided into two major approaches in the research domain: (1) creating and validating the viability of the dataset, and (2) applying CNN algorithms to classify waste into various categories and verifying and comparing the performance of different algorithms to explore the best approaches.
The TrashNet dataset, which was released by Yang and Thung [39] in 2016, is one of the most frequently used datasets for training on waste images. They applied a support vector machine (SVM) and a CNN to classify the trash images into six categories: glass, paper, cardboard, plastic, metal, and trash. The test results showed that the SVM and CNN models achieved accuracies of 63% and 22%, respectively. In this study, the authors found that it would be possible to classify various types of trash into predefined categories using machine learning and computer vision algorithms. Furthermore, they pointed out that although the accuracy rate of this study was relatively low, continuously growing the dataset would improve the accuracy of trash classification using machine learning and computer vision algorithms. Similarly, Proença and Simões [40] introduced an open image dataset containing photos of litter taken in various environments. In this dataset, the pictures were manually labelled and segmented in accordance with a hierarchical taxonomy to train and evaluate object detection algorithms. All the images were labelled with objects and backgrounds so that litter could be detected in various contexts, such as grass, road, and underwater. According to Liang and Gu [26], existing artificial intelligence-based waste classification methods only deal with single-label waste classification rather than multiple stacked wastes, as in real-world situations. To overcome such problems and to enhance the applicability of waste classification systems, they suggested a multi-label waste classification model that would detect and localise several types of waste in images. Furthermore, they established a new dataset, which contained more than 56,000 images in four categories, and improved the efficiency of learning. The results of their study showed that the F1 score for assessing multi-label waste classification was approximately 96% and the average precision score was approximately 82%.

Comparison of Artificial Intelligence Models
Along with building new datasets for waste classification, several studies have compared the performance of different CNN algorithms. With the development of computer technology, there is a growing interest in developing optimised AI models to yield better performance. For example, Ahmad et al. [41] tried to improve the reliability and accuracy of waste classification by combining state-of-the-art deep learning algorithms. The authors proposed a method that combined multiple deep learning models using a feature and score-level fusion method named double fusion. In previous studies, one of the most common difficulties in training on images for recognising objects was identifying them at various positions. Wang et al. [42] classified plastic bottles with different positions and colours during the recycling process on a conveyor belt. The ReliefF algorithm was applied to select the colour features of recycled bottles, and the colour was identified using SVM. The accuracy of the colour recognition of the recycled bottles was 94.7%. Additionally, research on waste classification has attempted to apply various newly proposed image detection and classification algorithms to enhance its capability for practical implementation. Adedeji and Wang [24] suggested a waste classification system that could classify different components of waste. The purpose of this system was to minimise the human intervention needed to separate waste in sorting facilities, thereby reducing the harmful influence on humans. The system was developed using a 50-layer residual network (ResNet), which is a CNN algorithm used to classify waste materials. The accuracy of the proposed model was 87% for the dataset.
The speed of object detection and classification is an essential factor in real-time waste classification. De Carolis et al. [43] proposed YOLO TrashNet by applying YOLOv3 for real-time waste detection in video streams. The suggested method would not only help alleviate the labour-intensive task of waste reporting in a city, but also contribute to the goal of a smart city. YOLOv3 is a CNN composed of 106 layers. The first 53 layers form the Darknet-53 network, which is used as a feature extractor and was pre-trained on ImageNet, allowing deep transfer learning. The successive 53 layers allow object detection at 3 scales (small, medium, and large objects). Moreover, an important feature of YOLOv3 is the use of anchor boxes, which are predetermined by applying the k-means clustering algorithm to the training set. This improvement allows for faster and more stable network training. In this research, the authors trained the last 53 layers of YOLOv3 using their own dataset and called the resulting neural network YOLO TrashNet.
Previous studies indicate that most research on waste classification concerns municipal solid waste segregation rather than construction and demolition waste classification. Although research on the classification of construction and demolition waste using deep neural networks has been increasing, it remains relatively rare compared to municipal solid waste classification.

Development Procedure
Developers generally follow the process shown in Figure 1 to prepare an AI model that recognises objects. This process is in line with the guidebook on establishing datasets for AI learning published by the National Information Society Agency, an organisation affiliated with the Ministry of Science and ICT (Information and Communication Technology) of South Korea, which, unlike existing research methods, makes quality evaluation of datasets mandatory [44,45]. There are several reasons for publishing the guidebook at the government level. First, as the amount of learning data increases, inappropriate learning data are more likely to be included in the dataset, leading to more cases in which models are not trained properly. Second, there have been frequent cases of development failure in which the model produced inaccurate results because developers were insufficiently trained or skilled and the dataset was therefore not verified properly. Third, datasets have been deformed arbitrarily through augmentation without a specific standard, such that even the developer could not identify the created data, which were then included in the dataset without additional verification. The first two issues can be solved by acquiring skilled manpower, but the last one requires an adequate program.

Constructing the Dataset and Selecting the Learning Model
According to the "Enforcement decree of the wastes control act" in South Korea, construction waste is divided into 18 categories to enhance the recycling rate. Among these categories, the research team collected image data on five typical types of construction waste, which constitute a major proportion of the total construction waste [46]. The five types, namely concrete, brick, lumber, board, and mixed waste, as shown in Table 1, were selected in order of the amount emitted at construction sites. The data were labelled for segmentation during processing, and the prepared data were used for transfer learning with the YOLACT model. The backbone of the YOLACT model was ResNet-50, which was assumed to be capable of real-time segmentation with little computation, enabling operation on on-site computers or edge computers. Real time is defined here with reference to Closed-Circuit Television (CCTV), which usually runs at under 30 fps in daily life and on construction sites. The YOLACT model is expected to operate at 30 fps if there are no network problems [14]. In this study, we established two hypotheses. The first concerns the processing and labelling of the learning data, which, unlike the images of objects on clean backgrounds used in existing research, contain various objects: even when various objects are mixed in the background, the model performance is expected not to differ if the designated object is accurately segmented. The second hypothesis is that the performance of the AI network changes according to the quantity and quality of the learning data. The remaining sections of this chapter deal with these hypotheses regarding labelling and the performance of the AI model.

Constructing the Learning Dataset
The images used for learning comprised 500 images taken directly at the waste dump site of a semiconductor manufacturing facility construction site and 288 images acquired by web crawling. The collected source data were cropped to 512 × 512 pixels, with a size of approximately 100 kB, considering the Graphics Processing Unit (GPU) memory (Nvidia RTX 3080, NVIDIA, Santa Clara, CA, USA). Since the YOLACT model is based on instance segmentation, each image was segmented in polygonal shapes using the "LabelMe" programme, as shown in Figure 2 [47]. The time consumed by the labelling tasks for the images of construction waste in each category is shown in Table 1. The labelling tasks required 22-32 h with at least two workers at any time to complete the learning datasets. When calculated using Equation 1, which expresses the level of difficulty, the work index was 9-11 for data collection and processing and 4-6 for labelling, an efficiency rate of 60% compared to the previous research [48]. Since similar objects, such as concrete and brick, are difficult to differentiate based on colour and shape, additional time was required for sorting. Therefore, the work index of brick and concrete was low compared to other categories because the labelling difficulty was high.

$$\text{Work index} = \frac{\text{Total amount of data}}{\text{Degree of input manpower} \times \text{Work hours}} \qquad (1)$$
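
As a minimal sketch, the work index can be computed as below, assuming Equation 1 is read as the total amount of data divided by the product of input manpower and work hours (a reading consistent with a higher index indicating easier work). The function name `work_index` and the example values are hypothetical and are not taken from Table 1.

```python
def work_index(total_data: int, manpower: float, work_hours: float) -> float:
    """Work index = total amount of data / (input manpower x work hours).

    Assumed reading of Equation 1: a higher value means more data were
    processed per person-hour, i.e. the task was easier.
    """
    if manpower <= 0 or work_hours <= 0:
        raise ValueError("manpower and work_hours must be positive")
    return total_data / (manpower * work_hours)


# Hypothetical example: 200 labelled images, 2 workers, 25 hours of work.
print(work_index(200, 2, 25))  # 4.0
```

Under this reading, categories that are hard to label (more hours for the same number of images) obtain a lower index.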

Optimal Data Labelling Method
Transfer learning of the YOLACT model on the labelled dataset was verified to proceed normally, as discussed in Section 3.3. Based on these results, this section describes the parametric study conducted to find the optimal labelling method.
If images with a clean background and a single object are collected as learning data, workers experience less confusion during labelling and a robust AI model can be created for the intended purpose. However, images satisfying such conditions are difficult to obtain. On the other hand, images or video clips that can be collected easily may contain a variety of objects that are unnecessary for learning. In addition, constructing a learning dataset requires considerable time and cost to remove unnecessary objects from each image and to label the objects necessary for learning. Thus, an appropriate method is required to reduce the time and manpower needed to create a suitable learning dataset.
Numerous earlier studies collected a significant amount of learning data for classification only, and using these data appropriately could decrease the time needed to collect learning data. However, because the possibility of applying such data to the latest AI methods has not been verified, researchers tend to avoid using them. Consequently, previous learning datasets are simply stored and eventually treated as digital waste. Thus, it is necessary to verify the extent to which such data can be used, and a parametric study was conducted by categorising four cases. Table 2 summarises the results of the labelling and the optimal instance segmentation method for construction waste. In Case A, a pixel labelling method was applied to pictures of the designated waste taken on a clean background, whereas in Case B, the designated waste was individually labelled on pictures taken at the dumpsite (see Figure 3). Both cases differentiated the designated wastes well, but in Case B, the algorithm tended not to recognise some wastes when several types of waste were mixed in the image. The parts that could not be recognised were hidden behind other wastes or had different colours and shapes from the previously learned data. This was attributed to a lack of learning data. Case C built the dataset by simultaneously labelling two to five classes in the pictures taken at the dumpsite, whereas Case D labelled one class per image, thereby increasing the overall dataset quantity. Consequently, even if several types of waste were mixed in an image, classification was possible when the learning data were formed with accurate labelling. In Case C, the model was unable to recognise the pixel boundary of the classified class. However, when the amount of data increased in Case D, this phenomenon seemed to disappear. Thus, the amount of learning data was important in terms of AI recognition.
Moreover, the work index was 2.39 for Case D and 1.73 for Case C, indicating a lower level of difficulty for Case D. Case A was similar to the data collected to develop existing classification models. Considering the model learning results, such existing data could be used by the latest AI model.

Result of Learning
The research team finally concluded that Case D, which showed the highest mean Average Precision (mAP) amongst all cases, was suitable for developing the waste classification model, and performed transfer learning with an increased number of learning images. In Case D, the total number of images was 788, as shown in Table 3. The results are shown in Figure 4, and several aspects of the finally re-classified network merit discussion. Unlike ordinary objects, wastes have a very atypical shape, and in the case of concrete waste, the colour, texture, and shape are similar to those of a brick. Moreover, as found in previous research, the shape of concrete is somewhat similar to that of sand and broken brick. Hence, the learning model categorised cement brick crumbs as concrete waste. Timber wastes take the form of rectangular lumber, plywood, and pallets, but are irregularly fragmented once they become waste. Moreover, the shape of plywood is the same as that of board waste. Therefore, the collection and refinement steps had to consider various situations, as shown in Table 4. This evidence shows the importance of the refinement step, and it can be observed that developing an AI model simply by increasing the data quantity without quantitative evaluation is difficult. This is a limitation of transfer learning, as the problem arises from differences in the categories and labelling used for the development of the previous model. However, this problem can be solved by re-planning the AI model for the characteristics of the desired object.

Quantitative Evaluation Method for Learning Data using the Fréchet Inception Distance (FID) Technique
Re-classifying the YOLACT model verified that the accuracy and recognition rate depend on the quantity and quality of the learning data. Thus, to improve the efficiency of research and development, increasing the amount of learning data through automated augmentation is the most appropriate solution. This section describes the quantitative evaluation of the augmentation level using the FID technique and the learning results obtained by increasing the learning data with this technique.

Fréchet Inception Distance (FID) Technique
AI is a concept designed to mimic human intelligence. Therefore, objects that are difficult for people to differentiate in an image would also be difficult for AI to recognise. In particular, some construction wastes are similar not only in colour but also in shape, e.g., concrete and cement brick. However, objects of the same class that are completely different in shape, such as pallets and rectangular lumber, also exist. The colour of the photographed image may change owing to the amount of light on site, and the resolution may drop depending on the performance of the camera.
Because of these variables, it is necessary to check whether the data were learned properly and the model was built well. This is called the quality evaluation of the model. Before a quantitative method was developed, model performance was checked manually by humans. With this approach, human subjectivity affects the model evaluation, and when the amount of data exceeds what a human can review, the standard can become ambiguous in the middle of the evaluation. To solve this issue, a program using the FID score, which quantitatively assesses the model, was developed. This technique uses a pretrained inception model trained to classify 1000 labels on ImageNet. Here, the inception model is assumed to differentiate the characteristics of ordinary objects properly, and only the part that extracts the 2048 output features is used, rather than the full model [6]. The evaluation equation of FID is shown in Equation 2.
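$$\mathrm{FID} = \lVert m - m_w \rVert_2^2 + \mathrm{Tr}\left(C + C_w - 2\,(C\,C_w)^{1/2}\right) \qquad (2)$$
where m indicates the average attributes of the real data, C refers to the attribute covariance of the real data, m_w is the average attribute of the fake data, and C_w is the attribute covariance of the fake data.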
Figure 5 shows the input and output images evaluated through FID, which assumes as a prerequisite that the features follow a Gaussian distribution; the smaller the difference between the two distributions, the better the performance. The inception score is another index for evaluating AI model performance, but it is not currently in use because real data are not used in the evaluation and scores are given only to fake images. Even for a fake image, the image used for the evaluation should be meaningful to assess the model performance properly. In contrast, FID evaluates against real images, so all images are meaningful, and all data are assessed individually rather than through conditional probability. Thus, the gap between the distribution of the outputs and that of the real input images is calculated, and the model performance is considered good when the value is small. Although exact agreement of the probability distributions is ideal, it is impossible in reality. Additionally, if these values are analysed together with mAP, the level of performance change per learning entity can be assessed quantitatively. An advantage of this technique is that the algorithm can be customised by replacing the inception model if a better AI model for extracting the output features is available. However, this technique is sensitive to noise and thus has clear limits for evaluation. This issue occurs chronically in image research and is related to the colour temperature and radiation intensity. It can be addressed by evaluating multiple images using sufficient pictures and videos.
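
The FID computation can be sketched as follows, assuming the 2048-dimensional Inception features have already been extracted and are supplied as two arrays (one row per image). The sketch uses NumPy only; the function name `frechet_distance` and the synthetic 4-dimensional features in the example are illustrative assumptions, not the program used in this study. It evaluates the squared distance between the two mean vectors plus Tr(C + C_w − 2(C C_w)^{1/2}), obtaining the trace of the matrix square root from the eigenvalues of C C_w.

```python
import numpy as np


def frechet_distance(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two feature sets.

    real_feats, fake_feats: arrays of shape (n_images, n_features),
    with n_features >= 2.
    """
    m = real_feats.mean(axis=0)            # mean of real features
    m_w = fake_feats.mean(axis=0)          # mean of fake (augmented) features
    c = np.cov(real_feats, rowvar=False)   # covariance of real features
    c_w = np.cov(fake_feats, rowvar=False) # covariance of fake features

    # Tr((C C_w)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of C C_w, which are real and non-negative in theory;
    # small negative or imaginary parts are numerical noise.
    eigvals = np.linalg.eigvals(c @ c_w)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = float(np.sum((m - m_w) ** 2))
    return mean_term + float(np.trace(c) + np.trace(c_w) - 2.0 * trace_sqrt)


# Synthetic example with 4-dimensional stand-in features.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
shifted = real + np.array([3.0, 0.0, 0.0, 0.0])  # same covariance, shifted mean
print(frechet_distance(real, real))     # close to 0
print(frechet_distance(real, shifted))  # close to 9 (squared mean shift)
```

Identical feature sets give a distance near zero, and a pure shift of the mean contributes its squared length, which matches the statement that smaller values indicate closer distributions.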

Susceptibility Level of Re-Classified Model Due to Noise, Colour Change, and Others
To enhance the AI model described in Section 3, additional learning is required. Thus, the amount of learning data was planned to be increased three times through the augmentation of each image. The augmentation techniques were added noise, a blur effect, and hue and saturation changes, and 50 learning images from the 5 super-categories were augmented to select a proper level of change. The resulting image FID is shown in Figure 6. The augmentation was implemented with the Python library imgaug.

Noise Change
The addition of noise is expected to reflect the effect of image resolution and size. Although AI learning equipment has improved, the trend is towards learning ever larger amounts of data. Therefore, it is necessary to decrease the size of the learning data, and noise is inevitable in this case. However, excessive noise distorts the target object, and unintended errors, such as spots or marks on the image, may be labelled, thereby ruining the learning data.
Furthermore, noise depends on the performance of the collecting device: with old devices, it may not be possible to collect many images, or the images may contain noise caused by a deteriorated image sensor.
Therefore, verification is necessary, and this study examined the proper level by categorising noise into five steps. Gaussian noise was applied by loading the image, generating noise according to the Gaussian function, and combining it with the original image. Each noise step denotes the number of times Gaussian noise, sampled once per pixel from a normal distribution, is overlaid on the image. Table 5 shows the results when the data were learned above the appropriate level: the model performance decreases when the noise exceeds the 100 times step. The 100 times noise, as shown in Figure 7, is considerable to the naked eye, but does not seem to have a significant impact on the accuracy of the learning data.

Blur Effect
The blur effect is related to the focus of the collected image. When collecting data, out-of-focus images may occur owing to manpower or equipment problems, and with this evaluation, such data could still be utilised. Gaussian blur was used for the blur effect, and the steps were classified using sigma values. As shown in Table 6, the blur effect lowered the model performance when sigma exceeded 2. Human eyesight could differentiate objects up to sigma 6, as shown in Figure 8; nevertheless, labelling and using such data raises concerns about a decrease in the model performance.
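
The noise and blur augmentations described above can be sketched as follows. The study used the imgaug library; this sketch instead uses NumPy only, so it is an illustration of the two operations rather than the authors' implementation. The function names and the `scale` parameter (standard deviation of the noise in 8-bit intensity units) are assumptions, and the mapping between the noise steps in Table 5 and `scale` is not specified in the text.

```python
import numpy as np


def add_gaussian_noise(image: np.ndarray, scale: float, rng=None) -> np.ndarray:
    """Add zero-mean Gaussian noise, sampled independently per pixel value."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, scale, size=image.shape)
    noisy = image.astype(np.float64) + noise
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur over the height and width axes."""
    if sigma <= 0:
        return image.copy()
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = image.astype(np.float64)
    for axis in (0, 1):  # blur rows, then columns
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")  # repeat border pixels
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Example on a random stand-in for a 512 x 512 RGB learning image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)
noisy = add_gaussian_noise(img, scale=10.0, rng=rng)
blurred = gaussian_blur(img, sigma=2.0)
print(noisy.shape, blurred.shape)
```

Each augmented image can then be passed through the FID evaluation to decide whether the chosen level keeps it close enough to the original data.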

Hue and Saturation
A parametric study on hue and saturation changes makes it possible to investigate the circumstances under which the data were collected. Especially outdoors, the overall colour of the obtained image changes depending on the amount of sunshine and the time at which the picture is taken, and these effects can be verified through FID. For the changes, the image was converted from the source colour space to HSV, the H (hue) and S (saturation) channels were extracted, the colour shift was applied at the set hue angle, and the image was finally converted back to the original colour space. In Figure 9, as observed by human vision, the image appears black and white when the hue reaches −60, and the image loses its original colour at saturation 20. The performance evaluation of the model verified that the accuracy dropped drastically when the hue was below −20 and the saturation was over 20, as shown in Table 7. Thus, the AI model appears to depend primarily on colour information.
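
The conversion steps described above (to HSV, shift of the H and S channels, back to the original colour space) can be sketched with the Python standard library as follows. This is an illustration only: the study used imgaug, and the scaling assumed here (hue shift in degrees, saturation shift on a 0 to 255 scale) is an assumption that may differ from the units behind the values in Table 7. The function names are hypothetical.

```python
import colorsys


def shift_hue_saturation(rgb, hue_deg=0.0, sat_delta=0.0):
    """Shift hue (degrees) and saturation (0-255 scale) of one RGB pixel."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)       # RGB -> HSV
    h = (h + hue_deg / 360.0) % 1.0              # hue is an angle, so it wraps
    s = min(1.0, max(0.0, s + sat_delta / 255.0))  # saturation is clamped
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)    # HSV -> RGB
    return tuple(int(round(c * 255)) for c in (r2, g2, b2))


def augment_image(pixels, hue_deg=0.0, sat_delta=0.0):
    """Apply the shift to an image given as rows of (R, G, B) tuples."""
    return [
        [shift_hue_saturation(p, hue_deg, sat_delta) for p in row]
        for row in pixels
    ]


# Tiny 1 x 2 example image: a red pixel and a grey pixel.
image = [[(255, 0, 0), (128, 128, 128)]]
print(augment_image(image, hue_deg=120))
```

Grey pixels have zero saturation, so a hue shift leaves them unchanged, whereas strongly coloured pixels change class-relevant colour cues, which is in line with the sensitivity to colour reported above.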

Final Learning Results
The results of learning with the learning data quantitatively added according to the abovementioned results are shown in Figure 10. By increasing the amount of learning data two to three times, a maximum increase of 16% in the mAP was verified. This result was obtained by tripling the learning data (2364 images) through augmentation with the 100 times noise filter and Gaussian blur at sigma 2. On the other hand, the dataset with changed saturation showed a decline in performance. Moreover, the proposed model seems insusceptible to changes in brightness, but is affected by noise or blur; thus, the results can be utilised in data acquisition for developing a model to recognise construction waste.
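
The dataset sizes behind these comparisons can be checked with a short sketch. It assumes that each augmentation adds one augmented copy of every original image, so that two augmentations triple the 788 original images to 2364; this reading is consistent with the reported total but is an assumption, and the identifiers below are hypothetical.

```python
def augmented_size(n_original: int, n_augmented_copies: int) -> int:
    """Dataset size when each augmentation adds one copy of every image."""
    return n_original * (1 + n_augmented_copies)


N_ORIGINAL = 788  # images in Case D

# Dataset configurations and the augmentations assumed to be added to each.
configs = {
    "Original": [],
    "Origin + Noise 100": ["noise_100"],
    "Origin + Blur 2": ["blur_2"],
    "Origin + Hue -20": ["hue_-20"],
    "Origin + Saturation 20": ["saturation_20"],
    "Origin + Noise 100 + Blur 2": ["noise_100", "blur_2"],
}

sizes = {name: augmented_size(N_ORIGINAL, len(augs)) for name, augs in configs.items()}
for name, size in sizes.items():
    print(f"{name}: {size} images")
```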

Conclusion
Transfer learning was applied to an AI model to differentiate between five types of construction waste, and differentiation was successful through transfer learning of the AI model using segmentation. There were some situations in which certain categories could not be recognised, but these could be solved by developing data quality assessment methods and refinement techniques.
1. Refinement techniques need to be advanced so that situations affecting the model function are listed from the data collection step onwards, rather than merely labelling objects.
2. Labelling was impossible without professional knowledge owing to the characteristics of construction waste. Additionally, supervisors were required to manage the refined data because many objects could not be differentiated during labelling.
3. Data collected while classification techniques were mainstream can be re-used for an instance segmentation model.
4. For image data with complicated backgrounds, precisely labelling one category per image seems to enhance the model performance and decrease resource consumption compared with labelling several categories in one image.
5. It was verified that increasing the amount of data indiscriminately worsened the quality of the model. Furthermore, it was necessary to apply quantitative augmentation to the learning data in each category.
6. To develop an AI model that recognises construction waste, the less blur and noise the collected data contain, the better the performance. Although brightness, such as sunlight, does not have much impact, it seems better to avoid collecting data at times such as sunrise or sunset, which affect image colour.
7. By increasing the amount of data through augmentation using transfer learning, it was verified that mAP increased by 16%. However, the AI model needs to be redesigned to reflect the characteristics of construction waste if the required model performance cannot be achieved.
This study highlights the importance of data augmentation and transfer learning for the efficient utilisation of artificial intelligence datasets. In particular, it would be possible to train artificial intelligence models using a small number of images, since the data augmentation method presented in this study is a useful technique that changes image values without requiring additional pictures to be taken in various environments. Furthermore, the data augmentation methods suggested in this study would be applicable not only to construction waste, but also to other image-based artificial intelligence models.
Dataset configurations compared (figure labels): Original (number of images ×1); Origin + Noise 100 (×2); Origin + Blur 2 (×2); Origin + Hue −20 (×2); Origin + Saturation 20 (×2); Origin + Noise 100 + Blur 2 (×3).

Data Availability Statement:
The data used to support the results in this article are included within the paper. In addition, some of the data in this study are supported by the references mentioned in the manuscript. For any queries regarding the data, the data of this study are available from the corresponding author upon request.

Conflicts of Interest:
The authors declare no conflict of interest.