AI in Thyroid Cancer Diagnosis: Techniques, Trends, and Future Directions

There has been a growing interest in creating intelligent diagnostic systems to assist medical professionals in analyzing and processing big data for the treatment of incurable diseases. One of the key challenges in this field is detecting thyroid cancer, where advancements have been made using machine learning (ML) and big data analytics to evaluate thyroid cancer prognosis and determine a patient's risk of malignancy. This review paper summarizes a large collection of articles related to artificial intelligence (AI)-based techniques used in the diagnosis of thyroid cancer. Accordingly, a new classification was introduced to classify these techniques based on the AI algorithms used, the purpose of the framework, and the computing platforms used. Additionally, this study compares existing thyroid cancer datasets based on their features. The focus of this study is on how AI-based tools can support the diagnosis and treatment of thyroid cancer, through supervised, unsupervised, or hybrid techniques. It also highlights the progress made and the unresolved challenges in this field. Finally, the future trends and areas of focus in this field are discussed.


I. INTRODUCTION

A. Background
The adoption of artificial intelligence (AI) in healthcare has become a pivotal development, profoundly reshaping the landscape of medical diagnosis, treatment, and patient care. AI's exceptional capabilities, including pattern recognition, predictive analytics, and decision-making skills, enable the development of systems that can analyze complex medical data at a scale and precision beyond human capacity [1]. This, in turn, augments early disease detection, facilitates accurate diagnoses, and aids personalized treatment planning. Moreover, AI-driven predictive models can forecast disease outbreaks, enhance the efficiency of hospital operations, and significantly improve patient outcomes [2]. Additionally, AI has the potential to democratize healthcare by bridging the gap between rural and urban health services and making high-quality care more accessible. Hence, the importance of AI in healthcare is profound and will continue to grow as technology advances, leading to more sophisticated applications and better health outcomes for patients worldwide [3], [4].
Cancer, a leading cause of death, affects various parts of the body, as depicted in Fig. 1 (a). Among the various types, thyroid carcinoma stands out as one of the most commonly occurring endocrine cancers globally [5], [6]. Concerns are mounting over the escalating incidence of thyroid cancer and its associated mortality. Research indicates that thyroid cancer incidence is higher in women aged 15-49 years (ranked fifth globally) compared to men aged 50-69 years [7], [8], [9].
According to existing global epidemiological data, the rapid growth of abnormal thyroid nodules (TN) is driven by an accelerated increase in genetic cell activity. This condition can be categorized into four primary subtypes: papillary carcinoma (PTC) [10], follicular carcinoma (FTC) [11], anaplastic carcinoma (ATC) [12], and medullary carcinoma (MTC) [13]. Influential factors such as high radiation exposure, Hashimoto's thyroiditis, psychological and genetic predispositions, as well as advancements in detection technology, can contribute to the onset of these cancer types. These conditions might subsequently lead to chronic health issues, including diabetes, irregular heart rhythms, and blood pressure fluctuations [14], [15], [16]. Although the quantity of cancer cells is a significant indicator of thyroid carcinoma, obtaining results is often time-consuming due to the requirement to observe cell appearance. Thus, the detection and quantification of cell nuclei are considered crucial biomarkers for assessing cancer cell proliferation.
The use of computer-aided diagnosis (CAD) systems for analyzing thyroid cancer images has grown significantly in popularity in recent years. These systems, renowned for enhancing diagnostic accuracy and reducing interpretation time, have become invaluable tools in the field. Among these technologies, radiomics, when used in conjunction with ultrasonography (US) imaging, has become widely accepted as a cost-effective, safe, simple, and practical diagnostic method in clinical practice. Endocrinologists frequently conduct US scans in the 7-15 MHz range to identify thyroid cancer and evaluate its anatomical characteristics. The American College of Radiology has formulated a Thyroid Imaging, Reporting, and Data System (ACR TI-RADS) that classifies thyroid nodules into six categories based on attributes like composition, echogenicity, shape, size, margins, and echogenic foci. These classifications range from normal (TIRADS-1) to malignant (TIRADS-6) [17], [18], [19]. Several open-source applications are available for assessing these thyroid cancer features [20], [21]. However, the identification and differentiation of nodules continue to present a challenge, as they rely largely on the personal experience and cognitive abilities of radiologists. This is due to the poor quality of captured images and the similarities among US images of benign thyroid nodules, malignant thyroid nodules, and lymph nodes.
Moreover, ultrasonography imaging is often a time-intensive and stressful procedure, which can result in inaccurate diagnoses. Misclassifications among normal, benign, malignant, and indeterminate cases are common [22], [23], [24], [25], [26], [27]. For a more precise diagnosis, a fine-needle aspiration biopsy (FNAB) is typically conducted. However, FNAB can be an uncomfortable experience for patients, and a specialist's lack of experience can potentially lead to benign nodules being misdiagnosed as malignant, not to mention the additional financial burden [28], [29] (refer to Fig. 1 (b)). The primary challenge in distinguishing between benign and malignant nodules resides in the selection of their characteristics. Numerous studies have explored the characterization of conventional US imaging for various types of cancers, including retinal cancer [30], [31], breast cancer [32], [33], blood cancer [34], [35], and thyroid cancer [36], [37]. However, these methods still fall short when it comes to the accurate classification of thyroid nodules.
The incorporation of AI technology plays a pivotal role in reducing subjectivity and enhancing the accuracy of pathological diagnoses for various intractable diseases, including those affecting the thyroid gland [38], [39]. This enhancement is achieved through improved interpretation of ultrasonography images and faster processing times. Machine learning (ML) and deep learning (DL) have surfaced as potential solutions for automating the classification of thyroid nodules in applications such as US, fine-needle aspiration (FNA), and thyroid surgery [40], [41]. This potential has been underscored in numerous studies, such as [42], [43], [44], [39], [45], [46]. Furthermore, there are ongoing studies examining the use of this innovative technology for cancer detection, where its effectiveness hinges on the volume of data and the precision of the classification process.
The motivation to write a review on "Artificial Intelligence for Detecting Thyroid Carcinoma" stems from the increasing prevalence of thyroid cancer, a significant endocrine malignancy where early and accurate detection is pivotal for patient outcomes. As technological advancements in AI and machine learning burgeon, their integration into medical diagnostics, spanning imaging, pathology, and genomics, offers potential improvements in detection accuracy and efficiency.
Traditional thyroid carcinoma diagnostic methods, like fine-needle aspiration biopsies, sometimes present inconclusive results; AI promises less invasive alternatives with possibly superior precision. Such a review would amalgamate insights from the intersection of computer science, radiology, pathology, and endocrinology, propelling multidisciplinary collaboration. It would also spotlight AI's clinical implications, guiding clinicians in leveraging its capabilities for patient care, while delineating future research directions. Furthermore, this review would underscore the economic and healthcare benefits, from cost savings to reduced waiting times. At the same time, it is imperative to address AI's inherent challenges, including data privacy and ethical considerations, ensuring its balanced integration into healthcare. In essence, the review would offer a comprehensive panorama of AI's current and potential role in thyroid carcinoma detection, benefitting both researchers and medical practitioners.

B. Our contribution
This review provides a comprehensive examination of the application of artificial intelligence (AI) methods in detecting thyroid cancer. The objective of AI-based analysis in the medical field is increasingly shifting towards enhancing diagnostic accuracy, and this review aims to illustrate this trend, particularly in thyroid cancer detection. We first provide an overview of the existing frameworks and delve into the specifics of various AI techniques. These include supervised learning methods, like DL, artificial neural networks, traditional classification, and probabilistic models, as well as unsupervised learning methods, such as clustering and dimensionality reduction. We also explore ensemble methods, including bagging and boosting. Recognizing the importance of quality datasets in AI applications, we scrutinize several thyroid cancer datasets, addressing their features, as well as the feature selection and extraction methods used in various studies. We then outline the standard assessment criteria used to evaluate the performance of AI-based thyroid cancer detection methods. These range from classification and regression metrics to statistical metrics, computer vision metrics, and ranking metrics. Finally, we discuss future research directions, emphasizing areas that require more attention to overcome existing barriers and improve the use and deployment of thyroid cancer detection solutions. In conclusion, we underscore the potential of AI in advancing thyroid cancer detection while also noting the need for continuous critical evaluation to ensure its responsible and effective use.
All in all, the principal contributions of our paper are as follows  I.

C. Roadmap
The rest of this paper is organized as follows. Section II provides an overview of existing frameworks utilized in this field and discusses their respective advantages and limitations. Section III presents various thyroid cancer datasets used in AI-based analyses, explaining their relevance and uniqueness. Section IV addresses features, discussing the feature extraction and selection methods in AI models used for thyroid cancer detection. Section V outlines the standard assessment criteria used to evaluate the performance of these models. An actual instance of AI-based thyroid cancer detection is presented in Section VI to provide a real-world context for the theoretical aspects discussed earlier. Section VII offers a critical analysis and discussion of the challenges, limitations, and areas for improvement in the current approaches. Section VIII proposes potential future research directions, highlighting areas where further exploration and innovation can lead to advancements in AI-based thyroid cancer detection. Section IX concludes the paper by summarizing the main findings and discussions.

II. OVERVIEW OF EXISTING FRAMEWORKS
This section showcases the various AI-based methods utilized for diagnosing thyroid gland (TG) cancers. Fig. 2 presents a proposed categorization of AI-based thyroid cancer diagnosis techniques.

A. Objective of AI-based analysis (O)
This article focuses on the application of AI in thyroid cancer detection. To better understand the purpose behind each framework, it is crucial to identify the objective of each approach.
O1. Classification: Thyroid carcinoma classification refers to the categorization of thyroid cancers based on their histopathological features, clinical behavior, and prognosis. There are several types of thyroid carcinomas, each of which has distinct characteristics. The primary categories include: (i) Papillary Thyroid Carcinoma (PTC): The most common type, accounting for about 80% of all thyroid cancers. PTC tends to grow very slowly, but it often spreads to lymph nodes in the neck. Despite this, it is usually curable with treatment; (ii) Follicular Thyroid Carcinoma (FTC): The second most common type, FTC can invade blood vessels and spread to distant parts of the body, but it is less likely to spread to lymph nodes; (iii) Medullary Thyroid Carcinoma (MTC): This type of thyroid cancer starts in the thyroid's parafollicular cells, also called C cells, which produce the hormone calcitonin. Elevated levels of calcitonin in the blood can indicate MTC; and (iv) Anaplastic Thyroid Carcinoma (ATC): A very aggressive and rare form of thyroid cancer, ATC often spreads quickly to other parts of the neck and body. It is difficult to treat.
The classification of thyroid carcinomas is crucial in determining the most effective course of treatment for each patient. Various factors such as tumor size, location, and the patient's age and overall health are also taken into consideration when forming a treatment plan. Advances in AI and machine learning are helping to automate and improve the accuracy of thyroid carcinoma classification, with many models trained to classify tumors based on medical images or genetic data. The approach reported by Liu et al. [56] incorporates a support vector machine (SVM) for cancer detection. Similarly, Zhang et al. [57], [58] propose deep neural network (DNN) based strategies for separating and categorizing benign and malignant thyroid nodules in ultrasound imagery. Furthermore, the bidirectional long short-term memory (Bi-LSTM) model presented by Chen et al. [59] demonstrates notable accuracy in classifying thyroid nodules. These classification systems constitute structured hierarchies instrumental in organizing knowledge and workflows in the specific domain of thyroid cancer.
O2. Segmentation: Segmentation of thyroid carcinoma refers to the process of identifying and delineating the region of an image that corresponds to a thyroid tumor. The goal of segmentation is to separate the areas of interest, in this case the thyroid tumor, from the surrounding tissues in the medical images. This can be done manually by an expert radiologist, or it can be automated using machine learning algorithms [60], [61]. Segmentation is a crucial step in medical image analysis because it helps to accurately determine the location, size, and shape of the tumor, which are vital parameters for diagnosis, treatment planning, and prognosis prediction. A variety of methods can be used to perform image segmentation, including thresholding, edge detection, region-growing methods, and more complex machine learning and DL techniques.
In the case of thyroid carcinoma, segmentation can be challenging due to the high variability in the appearance and shape of the tumors, their proximity to other structures in the neck, and the presence of noise or artifacts in the images. Therefore, robust and reliable segmentation algorithms are needed to ensure accurate and consistent results. AI methods, including convolutional neural networks (CNN) and the U-Net architecture, are being increasingly used for thyroid carcinoma segmentation because of their ability to learn and generalize from large amounts of data, thus improving the accuracy and reliability of the segmentation process.
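Of the classical segmentation methods listed above, thresholding is the simplest to illustrate. The following is a minimal sketch of Otsu's thresholding, which is not taken from any of the reviewed studies. It assumes an 8-bit grayscale image stored as nested lists; the function names and the toy image (a bright region on a dark background) are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes between-class variance (Otsu's method)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    n_bg, sum_bg = 0, 0  # pixel count and intensity sum for levels <= t
    for t in range(levels):
        n_bg += hist[t]
        sum_bg += t * hist[t]
        n_fg = total - n_bg
        if n_bg == 0:
            continue  # no background pixels yet
        if n_fg == 0:
            break     # no foreground pixels left
        mean_bg = sum_bg / n_bg
        mean_fg = (sum_all - sum_bg) / n_fg
        # Quantity proportional to the between-class variance.
        score = n_bg * n_fg * (mean_bg - mean_fg) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def segment(image):
    """Return (threshold, mask), where mask is 1 for pixels brighter than the threshold."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return t, [[1 if p > t else 0 for p in row] for row in image]


# Hypothetical 4x4 toy image, not real ultrasound data.
toy_image = [
    [10, 12, 200, 210],
    [12, 10, 210, 200],
    [20, 10, 12, 10],
    [10, 12, 10, 20],
]
threshold, mask = segment(toy_image)
```

A global threshold of this kind is sensitive to the noise and artifacts mentioned above, which is one reason learned models such as U-Net are preferred in practice.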

O3. Prediction: The prediction of thyroid carcinoma involves the use of various diagnostic tools, tests, and techniques, often employing machine learning models, to anticipate the probability of a patient developing thyroid cancer. This predictive analysis can be based on several factors, including but not limited to: (i) Genetic predisposition: Individuals with a family history of thyroid cancer are at a higher risk; (ii) Gender and age: Thyroid cancer is more common in women and people aged between 25 and 65; (iii) Radiation exposure: Exposure to high levels of radiation, especially during childhood, increases the risk of developing thyroid cancer; and (iv) Diet and lifestyle: Lack of iodine in the diet and certain lifestyle factors may contribute to an increased risk. In a medical context, prediction does not necessarily mean a certain future outcome; rather, it points to an increased risk or likelihood based on current data and predictive models. For thyroid carcinoma, predictive tools and tests are typically used in conjunction with each other to achieve more accurate results. For instance, machine learning algorithms can be trained on historical medical data to predict the likelihood of a nodule being benign or malignant, aiding in early detection and more effective treatment planning. Various studies have proposed methods to predict thyroid cancer. For instance, in [62], the authors used an artificial neural network (ANN) and logistic regression to make predictions. Another study [63] details the creation of a predictive model using a convolutional neural network (CNN) to analyze 10,068 microscopic thyroid cancer images from South Asian populations.

[Table: Ref | Year | PPY | TCDS | AIA | Open challenges | TCDA | RDLA | PP | Future directions | XAI | EFC-AI | RL | PS | IoMIT | RS; first row: [47], 2021]

B. Supervised learning (SL)
Supervised learning is a method of machine learning where an algorithm is trained to classify or predict a condition based on labeled data, which in this case is medical data related to thyroid cancer. The aim of supervised learning is to differentiate between the different forms of thyroid cancer through the use of annotated data and examples. For example, this data can include ultrasound images, radiomic features, genetic markers, patient demographics, or any other information that may be relevant to the diagnosis or prognosis of thyroid cancer. The labeled data would indicate whether each instance corresponds to a case of thyroid cancer or not, or it may provide more detailed labels such as the stage of the cancer or the type of thyroid carcinoma.
In a classification setting, the supervised learning algorithm could be trained to distinguish between benign and malignant thyroid nodules based on certain characteristics extracted from medical imaging data. The labels in the training data would specify whether each nodule is benign or malignant. After training, the algorithm can then be used to classify new, unlabeled nodules. Similarly, a regression-based supervised learning algorithm might be trained to predict the progression or the prognosis of thyroid cancer based on various patient-specific features. The labels here would correspond to a continuous outcome variable, such as the survival time of the patient or a measure of disease progression. It is important to note that the performance of these methods relies heavily on the quality and quantity of the available data. The more accurate and comprehensive the data, the better the algorithm will perform in predicting or classifying new instances. Additionally, supervised learning models in healthcare, including thyroid carcinoma detection, need to be validated on separate test datasets and in real-world clinical settings to ensure their robustness and reliability [64], [65].
1) Deep learning (DL): DL is a subset of machine learning and artificial intelligence that is based on artificial neural networks with representation learning. It can automatically learn, generate, and improve representations of data by employing large neural networks with many layers, hence the term "deep" learning. In thyroid cancer, DL has been deployed to perform different tasks, including: (i) Image classification: DL algorithms like convolutional neural networks (CNNs) can be trained to classify thyroid ultrasound images. For instance, they can differentiate between benign and malignant nodules based on their shape, texture, and other characteristics [66], [67], [68]. This approach can significantly reduce the time and effort required for manual interpretation, thus aiding in the early detection and treatment of thyroid cancer; (ii) Pathological analysis: DL can also be utilized to analyze histopathological or cytopathological slide images, helping in the detection and classification of cancerous cells; (iii) Genomic data analysis: With the advent of genomic medicine, DL models can be employed to analyze genetic variations that may predispose individuals to thyroid cancer; (iv) Radiomics: DL models can be used to extract high-dimensional data from radiographic images, allowing for more precise and personalized treatment planning; and (v) Predictive analysis: Using electronic health records and other patient data, DL models can be used to predict the likelihood of a patient developing thyroid carcinoma, allowing for preventive measures to be taken if necessary. Fig. 3 illustrates the different classifications of thyroid cancer using DNN.
D1. Denoising autoencoder (DAE): DAEs can be beneficial for thyroid carcinoma classification by effectively learning representations from ultrasound or histopathological images. A DAE is a specific type of artificial neural network trained to reconstruct input data, often used for the purposes of dimensionality reduction or feature learning. The process for utilizing DAEs for thyroid carcinoma classification generally follows these steps: (i) preprocessing, (ii) noisy input creation, (iii) DAE training, (iv) feature extraction, and (v) classification. In [69], the authors implemented six autoencoder algorithms in the training process for papillary thyroid carcinoma (PTC) classification, including fixing weights and fine-tuning the network. The encoding layers and the complete autoencoder were used to embed the network. Another study [70] employed DAEs and stacked denoising autoencoders to extract features and identify informative genes in thyroid cancer.
D2. CNN: CNNs are a class of DL models that have shown extraordinary performance in various image processing and analysis tasks, including the classification of medical images. CNNs are especially adept at processing grid-like data, such as an image, where spatial relationships between the pixels are crucial to understanding the image content. The past few years have seen considerable effort invested in developing CNN-based methodologies for detecting thyroid cancer, especially for the automated identification and classification of nodules in ultrasound imagery [71]. The ConvNet model, a widely adopted framework within the neural network realm, emphasizes the use of convolution operations over matrix multiplications [72]. Various CNN architectures, such as LeNet [73], AlexNet [74], VGG [68], ResNet [75], GoogLeNet [76], SqueezeNet [77], and DenseNet [78], are distinguished by their incorporation of key components including convolutional, pooling, and fully connected layers.
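To make the convolutional and pooling components named above concrete, the following minimal sketch applies one convolution, a ReLU activation, and one max-pooling step to a tiny image. It is an illustration in plain Python rather than code from any cited architecture; the function names, the edge-detecting kernel, and the toy image are hypothetical, and the fully connected layer is omitted for brevity.

```python
def conv2d(image, kernel):
    """Valid (unpadded) 2-D cross-correlation, the operation used in CNN conv layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Element-wise rectified linear unit."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a square window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [max(feature_map[i + a][j + b] for a in range(size) for b in range(size)) for j in cols]
        for i in rows
    ]


# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to dark-to-bright transitions.
kernel = [[-1, 1], [-1, 1]]

feature_map = relu(conv2d(image, kernel))  # 4x4 map, large where the edge is
pooled = max_pool2d(feature_map)           # 2x2 map after downsampling
```

In a trained CNN the kernel values are learned from data rather than fixed by hand, and many such kernels are stacked across layers.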
In a study conducted by [79], the potential of CNN models to prognosticate thyroid cancer was explored using 131,731 ultrasound images taken from 17,627 thyroid cancer patients. Another research effort [80] employed VGG16, Inception, and Inception-ResNet models to differentiate malignant tissues within a set of 451 thyroid images from the DDTI dataset. To mitigate the challenge of data scarcity, the images were augmented before classification. A comparison of DCNN diagnostic performance with that of expert radiologists in distinguishing thyroid nodules (TN) within ultrasound images was carried out by [81], involving a test set of 15,375 TN ultrasound images. They utilized CNNE1 and CNNE2 models, derived from DCNN, for differentiating between malignant and benign TNs. The study [82] proposed a CNN-based DL technique for detecting and classifying TN and breast nodules, with the results contrasted against those from ultrasound imaging. Table II presents a summary of recent CNN-based thyroid cancer classification contributions.
D3. Recurrent neural network (RNN): RNNs are a class of artificial neural networks where connections between nodes form a directed graph along a sequence, thus enabling them to use their internal state (memory) to process variable-length sequences of inputs. This unique feature makes RNNs particularly suitable for tasks where temporal dependencies are essential, such as time-series analysis, language translation, and speech recognition. In the context of thyroid carcinoma classification, RNNs can be utilized to analyze sequential or time-dependent data, such as the development of a patient's clinical signs over time, the evolution of a tumor seen in a series of medical images, or changes in the gene expression related to the progression of thyroid cancer. For instance, in the study by Chen et al. (2017) [93], the authors propose a hierarchical RNN approach for classifying thyroid nodules (TN) based on historical ultrasound reports. This hierarchical RNN [...]
D4. Restricted Boltzmann machine (RBM): RBMs were initially introduced [94] in 1986 under the name "Harmonium," but the concept of a "restricted" Boltzmann machine was developed by Geoffrey Hinton and his students in the mid-2000s. RBMs have a layer of visible units and a layer of hidden units, but no connections within layers; this is the restriction in their name. Each node in one layer is connected to every node in the other layer. The lack of intra-layer connections simplifies the learning process. The work by Vairale et al. [95] presents an application of RBMs to develop a personalized fitness recommendation system tailored for individuals diagnosed with thyroid conditions. RBMs are a particular class of generative artificial neural networks characterized by a bi-directional architecture, which operates in an unsupervised manner. This structure comprises a visible layer containing binary variables and a hidden layer, also populated with binary variables, with connections running only between the two layers. The learning process within RBMs is primarily conducted through statistical analysis.
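The practical consequence of having no intra-layer connections is that the units of one layer are conditionally independent given the other layer, so each layer can be sampled in a single pass. The sketch below shows this for a binary RBM using the standard sigmoid conditionals. It is a generic illustration, not the implementation of [95]; all names, weights, and sizes are hypothetical, and training (for example by contrastive divergence) is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b_hid):
    """p(h_j = 1 | v) for every hidden unit; W[i][j] links visible i to hidden j."""
    return [
        sigmoid(b_hid[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(b_hid))
    ]


def visible_probs(h, W, b_vis):
    """p(v_i = 1 | h) for every visible unit."""
    return [
        sigmoid(b_vis[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        for i in range(len(b_vis))
    ]


def sample(probs, rng):
    """Draw independent binary states from per-unit probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


def gibbs_step(v, W, b_vis, b_hid, rng):
    """One visible -> hidden -> visible sampling pass."""
    h = sample(hidden_probs(v, W, b_hid), rng)
    v_new = sample(visible_probs(h, W, b_vis), rng)
    return v_new, h


# Hypothetical RBM with 3 visible and 2 hidden units.
W = [[2.0, -1.0], [0.5, 0.5], [1.0, 0.0]]
b_vis = [0.0, 0.0, 0.0]
b_hid = [0.0, 0.0]
rng = random.Random(0)
v_next, h_sample = gibbs_step([1, 0, 1], W, b_vis, b_hid, rng)
```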
D5. Generative adversarial network (GAN): This type of ML network is composed of two distinct models: a generator and a discriminator. The generator maps a random input vector to an output in the data space, while the discriminator serves as a binary classifier that evaluates both input data from the training set and output data from the generator. GANs have gained widespread use in the diagnosis of diseases, including thyroid nodules (TN) [96], [97].
2) Artificial neural networks (ANN): ANNs are defined as a class of information processing systems comprised of interconnected non-linear elements known as neurons. These networks have proven to be effective in addressing complex issues, as they have the ability to store and retrieve information. ANNs can be grouped into several types, described below.

A1. Extreme learning machine (ELM): The ELM model features a layer of hidden nodes whose input weights are assigned at random. The weights connecting the hidden nodes to the outputs are learned in a single step, resulting in a more efficient learning process compared to other models. The ELM has been proven to be an effective method in the diagnosis of thyroid disease (TD), as evidenced in several studies such as [98], [99], [100], and [101].
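The single-step learning described above can be sketched as follows: the hidden-layer weights are drawn at random and left fixed, and only the output weights are obtained by solving one regularized least-squares problem. This is a minimal illustration under stated assumptions and not the method of [98]-[101]: the sigmoid activation, the small ridge term, the tiny Gaussian-elimination solver, and the synthetic two-feature data are all hypothetical choices.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def hidden_layer(X, W, bias):
    """Sigmoid activations of the random hidden layer for every sample."""
    return [
        [
            1.0 / (1.0 + math.exp(-(sum(x[i] * W[i][j] for i in range(len(x))) + bias[j])))
            for j in range(len(bias))
        ]
        for x in X
    ]


def elm_train(X, y, n_hidden=10, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    d = len(X[0])
    # Random input weights and biases: never updated after this point.
    W = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(d)]
    bias = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_layer(X, W, bias)
    # Single step: solve (H^T H + ridge * I) beta = H^T y for the output weights.
    A = [
        [sum(h[j] * h[k] for h in H) + (ridge if j == k else 0.0) for k in range(n_hidden)]
        for j in range(n_hidden)
    ]
    rhs = [sum(h[j] * t for h, t in zip(H, y)) for j in range(n_hidden)]
    return W, bias, solve(A, rhs)


def elm_predict(X, model):
    W, bias, beta = model
    return [sum(hj * bj for hj, bj in zip(h, beta)) for h in hidden_layer(X, W, bias)]


# Synthetic, well-separated two-class data (not clinical data).
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.1, 0.2],
     [0.9, 1.0], [1.0, 0.9], [0.8, 0.9], [0.9, 0.8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = elm_train(X, y)
predicted = [1 if score > 0.5 else 0 for score in elm_predict(X, model)]
```

Because no iterative weight updates are needed, training cost is dominated by one linear solve, which is the efficiency advantage the paragraph refers to.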
A2. Multilayer perceptron (MLP): MLP represents a category of feedforward networks where data is processed from the input layer through to the output layer. Each layer in this network comprises a varying number of neurons. Rao et al. [102] introduced an innovative approach for thyroid nodule classification, utilizing an MLP with a backpropagation learning algorithm. In their model, the MLP included four neurons in the input layer, three neurons in each of the ten hidden layers, and a single neuron in the output layer. Hosseinzadeh et al. [103] conducted a separate study with the objective of improving the accuracy of thyroid disease (TD) diagnosis through MLP networks. The study compared its findings with existing literature on thyroid cancer classification and found MLP networks to be superior. Isa et al. [104] explored activation functions within MLP networks. Their goal was to identify the optimal activation function for accurate classification of incurable diseases such as TD and breast cancer. The study evaluated multiple activation functions, including logarithmic, sigmoid, neural, sinusoidal, hyperbolic tangent, and exponential functions. The research found the neural function to be the most effective for TD classification, using the backpropagation algorithm as the training algorithm. This result was further corroborated by Mourad et al. [105].
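As an illustration of the feedforward structure, the sketch below builds a network with the layer sizes reported for Rao et al. [102] (four inputs, ten hidden layers of three neurons, one output) and runs a single forward pass. Only the layer sizes come from the text; the sigmoid activation, the random weight initialization, and the input values are assumptions, and backpropagation training is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_mlp(layer_sizes, seed=0):
    """Create one (weights, biases) pair per layer transition, with random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Propagate one input vector from the input layer to the output layer."""
    activations = list(x)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Layer sizes as described in the text: 4 inputs, ten hidden layers of 3, 1 output.
SIZES = [4] + [3] * 10 + [1]
net = init_mlp(SIZES)
# Hypothetical, already-normalized feature vector.
output = forward(net, [0.2, 0.5, 0.1, 0.9])
```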

A3. Radial basis function (RBF): In [106], ML is applied to the classification of TN using MLP and RBF networks. The RBF-based model is found to outperform the MLP in terms of the structural classification of thyroid nodules. This approach highlights the effectiveness of radial basis functions in approximating functions, classifying, and predicting time series data, especially in the diagnosis of thyroid cancer.

3) Traditional classification (TCL):
T1. k-nearest neighbors (KNN): The KNN algorithm is a non-parametric supervised machine learning method used for regression and classification. It predicts the label of a new sample from the k training samples closest to it. In a study conducted by Chandel et al. [107], the KNN method was applied to classify thyroid disease based on TSH, T4, and goiter parameters. Liu et al. [108] also employed the fuzzy k-nearest neighbor (FKNN) approach to differentiate between hyperthyroidism, hypothyroidism, and normal cases. There is a growing interest in larger datasets for future research, as noted in [109].
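The KNN decision rule can be sketched in a few lines: compute the distance from the query to every training sample, keep the k closest, and take a majority vote. The example below is a generic illustration and not the pipeline of [107] or [108]; the Euclidean distance, the function name, and the two synthetic features are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Synthetic two-feature samples (hypothetical standardized measurements).
train_X = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

label_a = knn_predict(train_X, train_y, (0.15, 0.10))
label_b = knn_predict(train_X, train_y, (0.95, 0.90))
```

Because KNN stores the training set and defers all computation to prediction time, its cost grows with dataset size, which is relevant to the interest in larger datasets noted above.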
T2. Support vector machines (SVM): The SVM is a machine learning method used for classification and regression tasks. In a study published in [110], an SVM approach was proposed for differentiating benign from malignant thyroid nodules (TN) by utilizing 98 TN samples (82 benign and 16 malignant). Another study in [111] employed six SVMs to classify nodular thyroid lesions by selecting the most important textural characteristics. The authors reported that the proposed method classified the lesions correctly. In [112], a Generalized Discriminant Analysis and Wavelet Support Vector Machine system (GDA-WSVM) was introduced for diagnosing TN, consisting of feature extraction, classification, and testing phases.

T3. Decision trees (DT): DT learning is a data mining method that uses a predictive model for decision-making, where the output values are represented by the leaves and the input variables are represented by the branches. This approach has been applied to uncover underlying thyroid diseases, as demonstrated in various studies such as [113], [114], [115], and [116].

T4. Logistic regression (LR): LR is a widely used binomial regression model in machine learning. In [117], the LR model was used to determine the specific characteristics of thyroid microcarcinoma (TMC) in 63 patients, based on the combination of contrast-enhanced ultrasound (CEUS) and conventional US values. Another study, conducted in northern Iran and reported in [118], applied LR to analyze 33,530 cases of thyroid cancer.
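For readers unfamiliar with the model, LR passes a linear combination of the inputs through the sigmoid function to obtain a probability for the positive class. The following minimal sketch fits a one-feature LR model by gradient descent on the log-loss. It is a generic illustration, not the analysis of [117] or [118]; the learning rate, the number of epochs, and the synthetic data are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y = 1 | x) = sigmoid(w * x + b) by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_proba(x, w, b):
    """Estimated probability of the positive class for feature value x."""
    return sigmoid(w * x + b)


# Synthetic one-feature data: larger values belong to the positive class.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
low_risk = predict_proba(1.0, w, b)
high_risk = predict_proba(3.5, w, b)
```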

4) Probabilistic models (PM):
P1. Bayesian networks (BN): In computer science and statistics, a BN is a probabilistic graphical model that represents a set of random variables and their conditional dependencies through a directed acyclic graph. It has been used to study various diseases, as shown in [119], [120], and [121].

C. Unsupervised learning (USL)
In AI and computer science, unsupervised learning involves analyzing data without pre-existing labels or annotations. It aims to uncover the underlying structures in the unlabeled data. Unlike supervised learning, which uses labeled data to calculate a success score, unsupervised learning lacks this labeling, making it difficult to assess the accuracy of the results. While unsupervised learning algorithms can perform more complex tasks compared to supervised ones, they can also be more unpredictable, adding unintended categories and introducing noise instead of structure. Despite these challenges, unsupervised learning remains a valuable tool for exploring AI, as it enables the discovery of patterns and relationships in data that might not be immediately apparent [122], [123].
1) Clustering (C): The purpose of this method is to segment a set of thyroid cancer data into homogeneous groups that possess similar characteristics, making it easier to classify unlabeled datasets into benign and malignant. This detection approach has gained significant attention in various medical studies for its simplicity, including in the detection of DNA copy number changes [124], breast cancer recognition [125], cancer gene detection [126], skin cancer diagnosis [127], and brain tumor detection [128]. Additionally, clustering can help identify cancer without precise definitions [129]. The clustering technique was used in [130] to identify factors that impact the normal functioning of the thyroid gland (TG), and DBSCAN and PCA were applied to manage the clusters and reduce dimensionality. An automated clustering system for thyroid diagnosis was developed in [131] to prescribe the appropriate drugs for hyperthyroid, hypothyroid, and normal cases. The efficiency of fuzzy clustering for thyroid and liver datasets from the UCI repository was analyzed in [132], where the FPCM and PFCM algorithms were applied and compared.
C1. K-means (KM): The KM method is a data partitioning technique formulated as a combinatorial optimization problem. It is commonly used in unsupervised learning, where observations are separated into k groups. In [133], the authors explore the use of Artificial Neural Networks (ANN) and an improvised k-means for normalizing raw data. The study used thyroid data from the UCI repository containing 215 instances.
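The partitioning into k groups can be sketched with Lloyd's algorithm, the standard iterative procedure for k-means. This is a generic illustration, not the improvised variant of [133]; the two-dimensional points are invented, and initializing the centroids with the first k points is an assumption made to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: alternate between assigning points and updating centroids."""
    centroids = list(points[:k])  # deterministic init: first k points (an assumption)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments

# Hypothetical two-feature observations forming two well-separated groups
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.2, 7.9), (7.8, 8.1)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
```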

C2. Entropy-based (EB):
In [134], a parameter-free calculation framework named DeMine was developed to predict miRNA regulatory modules (MRMs). DeMine is a three-step method based on information entropy. Firstly, the miRNA regulation network is transformed into a synergistic miRNA-miRNA network. Then, miRNA clusters are detected by maximizing the entropy density of the target cluster. Finally, the co-regulated mRNAs are integrated into the corresponding clusters to form the final MRMs. The proposed method not only provides improved accuracy but also identifies more miRNAs as potential tumor markers for tumor diagnosis.
2) Dimensionality reduction (DR): DR is a machine-learning method that transforms data from a high-dimensional space into a lower-dimensional space. This technique is popular for classification due to its cost-effectiveness and ability to eliminate unnecessary data patterns and minimize redundancy. For instance, DR was used to diagnose Thyroid Disease (TD) using cytological images [135].

R1. Principal component analysis (PCA):
PCA is a multivariate statistical method that transforms variables into a reduced set of uncorrelated variables. This approach reduces the number of variables and minimizes redundant information while preserving the relationships between the data as much as possible. PCA has been widely used in cancer detection and classification of benign and malignant thyroid cells. For example, in [136], PCA was utilized to select the optimal set of wavelet coefficients from the application of the Dual-Tree Complex Wavelet (DTCW) transform on noisy thyroid images, which were then classified using Random Forest (RF). In [137], PCA was applied to data from 399 patients with three types of thyroid carcinoma (papillary, follicular, and undifferentiated) in Morocco, enabling classification based on factors such as sex, age, type of carcinoma, and region.
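To make the transformation concrete, the sketch below extracts the first principal component by power iteration on the sample covariance matrix and projects the centered data onto it. This is one simple way to compute PCA, chosen here to avoid external libraries; it is not the procedure of [136] or [137], the data are invented, and the code assumes the data are not constant.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the leading principal direction and the 1-D projected scores."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Hypothetical, perfectly correlated two-feature data (second feature = 2 x first)
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, scores = first_principal_component(rows)
print(direction)
```

Because the two features are perfectly correlated, a single component captures all of the variance, which is the redundancy reduction that PCA exploits.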

D. Ensemble methods (EM)
Ensemble methods are commonly employed to address the complexity of cancer data and achieve higher accuracy in detection. This method involves dividing the data into subgroups and applying multiple machine learning techniques to each subgroup simultaneously, then synthesizing the results to make a final diagnosis. By combining multiple models, the ensemble method aims to produce an optimal predictive model for thyroid cancer detection. This approach has been shown to be effective in various studies, such as [138], where the authors emphasize the importance of ensemble methods in achieving a more comprehensive understanding of the data and improving the accuracy of the diagnosis.
1) Bagging (B): In the realm of thyroid cancer screening, Bagging is an ensemble learning technique utilized to improve the accuracy and stability of ML algorithms. This algorithm operates by reducing variance and avoiding overfitting and can be applied to a variety of methods, particularly decision trees. The purpose of Bagging is to enhance the performance of weak classifiers in the field of thyroid cancer screening applications.

B1. Bootstrap aggregation (BA):
The Bootstrap Aggregating technique is a widely utilized ensemble method aimed at improving the accuracy of Machine Learning algorithms, particularly for the purposes of classification, regression, and variance reduction. In [139], this approach was employed for diagnosing thyroid abnormalities.
B2. Feature bagging (FB): In [140], Feature Bagging (FB) is introduced as a method of ensemble learning with the goal of minimizing the correlation between the individual models in the ensemble. FB achieves this by training the models on a randomly selected subset of features, instead of all features in the dataset. The method is applied to distinguish between benign and malignant thyroid cancer cases [141].
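The two bagging variants above can be combined in one short sketch: each ensemble member is trained on a bootstrap sample of the rows (bootstrap aggregation) and on a random subset of the features (feature bagging), and the members vote. The 1-nearest-neighbor base learner, the synthetic data, and the parameter names are assumptions made for illustration and do not correspond to [139], [140], or [141].

```python
import math
import random
from collections import Counter

def nearest_label(X, y, query):
    """Base learner: 1-nearest-neighbor prediction."""
    best = min(range(len(X)), key=lambda i: math.dist(X[i], query))
    return y[best]

def bagged_predict(X, y, query, n_models=25, n_features=1, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    d = len(X[0])
    votes = []
    for _ in range(n_models):
        # Bootstrap aggregation: sample rows with replacement
        rows = [rng.randrange(len(X)) for _ in range(len(X))]
        # Feature bagging: each model sees only a random subset of features
        feats = rng.sample(range(d), n_features)
        Xb = [[X[i][f] for f in feats] for i in rows]
        yb = [y[i] for i in rows]
        votes.append(nearest_label(Xb, yb, [query[f] for f in feats]))
    # Aggregate the individual predictions by majority vote
    return Counter(votes).most_common(1)[0][0]

# Synthetic, well-separated two-feature data (illustrative only)
X = [(1 + 0.1 * i, 2 + 0.1 * i) for i in range(10)] + \
    [(6 + 0.1 * i, 7 + 0.1 * i) for i in range(10)]
y = ["benign"] * 10 + ["malignant"] * 10

print(bagged_predict(X, y, (1.2, 2.3)))
```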
2) Boosting (O): Boosting is a family of meta-algorithms used in supervised learning to reduce bias and variance and to enhance the performance of weak classifiers by combining them into a strong classifier.
O1. AdaBoost: In the study by Pan et al. [142], an AdaBoost-based method was used to diagnose thyroid nodules using the standard UCI dataset. The random forest and PCA techniques were employed for classification purposes and to maintain data variability, respectively.
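How boosting turns weak classifiers into a strong one can be sketched with a textbook AdaBoost over one-dimensional decision stumps. This is a generic illustration, not the pipeline of [142] (which combines AdaBoost with random forest and PCA); the data, the stump form, and the threshold grid (value ± 0.5, suitable for the integer-spaced toy inputs) are assumptions.

```python
import math

def stump_predict(x, threshold, sign):
    """Weak classifier: predicts `sign` above the threshold and `-sign` below."""
    return sign if x > threshold else -sign

def adaboost_fit(X, y, n_rounds=3):
    n = len(X)
    weights = [1.0 / n] * n  # start with uniform sample weights
    values = sorted(set(X))
    thresholds = [v - 0.5 for v in values] + [values[-1] + 0.5]
    ensemble = []
    for _ in range(n_rounds):
        # Pick the stump with the lowest weighted error
        best = None
        for t in thresholds:
            for s in (1, -1):
                err = sum(w for w, xi, yi in zip(weights, X, y)
                          if stump_predict(xi, t, s) != yi)
                if best is None or err < best[0]:
                    best = (err, t, s)
        err, t, s = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, s))
        # Increase the weight of misclassified samples, then renormalize
        weights = [w * math.exp(-alpha * yi * stump_predict(xi, t, s))
                   for w, xi, yi in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(x, t, s) for alpha, t, s in ensemble)
    return 1 if score > 0 else -1

# Toy labels that no single stump can separate (+1, +1, -1, -1, +1, +1)
X = [1, 2, 3, 4, 5, 6]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost_fit(X, y, n_rounds=3)
print([adaboost_predict(ensemble, x) for x in X])
```

On this toy set the best single stump misclassifies two of the six samples, whereas the weighted vote of three stumps classifies all of them correctly.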
O2. Gradient tree boosting (XGBoost): In [143], the XGBoost algorithm was introduced as a fast and efficient implementation of gradient-boosted decision trees (GTB). Since its introduction, the XGBoost algorithm has been applied to a range of research topics, including civil engineering [144], time-series classification [145], sport and health monitoring [146], and ischemic stroke readmission [147].
For thyroid cancer detection, the authors in [148] used XGBoost to diagnose benign and malignant thyroid nodules, as a solution to the challenge of obtaining accurate diagnoses with DL models when a large-scale dataset is unavailable.
Table III provides a summary of research frameworks for the detection of benign and malignant thyroid cancer, including the category, classifier, detected disease, dataset, objective, and quantifiable metrics used. This table helps to categorize AI methods used for thyroid cancer detection and highlights the current key applications.

III. THYROID CANCER DATASETS
In the field of thyroid carcinoma research, a number of datasets have been developed to facilitate the validation of ML algorithms and models. This is especially important because the creation of such datasets is a major challenge in the area of endocrine ML. In this section, we present an overview of the most significant thyroid databases, which offer a set of standards for evaluating the performance of learning methods and assist in the diagnosis and monitoring of complicated diseases.
• Waikato Environment for Knowledge Analysis (WEKA): The WEKA software, which was created at the University of Waikato using Java, is an open-source tool intended for pattern recognition and data analysis tasks such as preprocessing, classification, clustering, correlation, regression, feature selection, and data visualization.
• ThyroidOmics: This is a dataset developed by the Thyroid Working Group of the CHARGE Consortium that aims to examine the underlying factors and consequences of TD using various omics techniques such as genomics, epigenomics, transcriptomics, proteomics, and metabolomics. The dataset consists of the results of the discovery stage of the genome-wide association analysis (GWAS) meta-analysis for thyrotropin (TSH), free thyroxine (FT4), increased TSH (hypothyroidism), and decreased TSH (hyperthyroidism) as reported in [175] and [176].
• Thyroid Disease Data Set (TDDS): This dataset, used for classification with artificial neural networks (ANN), features 3772 training instances and 3428 testing instances, with a combination of 15 categorical and 6 real attributes. The three classes defined in this dataset are normal (not hypothyroid), hyperfunction, and subnormal functioning [177].
• KEEL Thyroid Dataset: The KEEL dataset provides a set of benchmarks to evaluate the effectiveness of various learning methods. This dataset includes several types of classification, such as standard, multi-instance, imbalanced data, semi-supervised classification, regression, time series, and unsupervised learning, which can be used as reference points for performance analysis [178].

IV. FEATURES
This section presents the most widely used techniques for feature extraction and selection in the classification process. The main objective is to identify a subset of relevant features that positively impact classification accuracy while eliminating irrelevant variables.
A. Feature selection methods (FS)
FS1. Information Gain (IG):
Information Gain (IG) is a straightforward method for classifying thyroid cancer features. This method evaluates the likelihood of having cancer by comparing the entropy before and after the examination. Typically, a higher gain value corresponds to a lower entropy after the examination. IG has been used extensively in several applications for the diagnosis of cancerous diseases, such as in filtering informative genes for precise cancer classification [199], selecting breast cancer treatment factors based on the entropy formula [200], analyzing and classifying medical data of breast cancer [201], reducing the dimensionality of genes in multi-class cancer microarray gene expression datasets [202], and filtering irrelevant and redundant genes of cancer [199]. In [203], IG is utilized as a feature selection technique to eliminate redundant and irrelevant symptoms in datasets related to diabetes, breast cancer, and heart disease. Additionally, the IG-SVM approach, combining Information Gain and Support Vector Machine, has been employed, and its results served as input for the LIBSVM classifier [199].
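The comparison of entropy before and after the examination can be sketched as follows, using the standard Shannon entropy and a discrete feature. The feature values and labels are invented for illustration ("M" for malignant, "B" for benign).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

labels = ["M", "M", "B", "B"]
informative = [1, 1, 0, 0]    # hypothetical feature that separates the classes
uninformative = [1, 0, 1, 0]  # hypothetical feature unrelated to the classes

print(information_gain(informative, labels), information_gain(uninformative, labels))
```

A feature selection step would rank the features by this gain and keep the highest-scoring ones.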

FS2. Correlation-based feature selection (CFS):
The CFS is a technique frequently used for evaluating the correlation between different cancer features. In various studies, the CFS algorithm has been integrated into attribute selection methods for improved classification, such as in [204], where it was applied to thyroid, hepatitis, and breast cancer data from the UCI ML repository. In [121], the authors proposed a hybrid method that combined learning algorithm tools and feature selection techniques for disease diagnosis. The CFS was utilized in [205] for feature selection in microarray datasets to minimize the data's dimensionality and identify discriminatory genes. A hybrid model incorporating the CFS and Binary Particle Swarm Optimization (BPSO) was proposed in [206] to classify cancer types and was applied to 11 benchmark microarray datasets. The CSVM-RFE, which involves the CFS, was used in [207] to reduce the number of cancer features and eliminate irrelevant ones. In [172], the authors utilized CFS-based feature selection techniques to identify key RNA expression features.

FS3. Relief (R):
The Relief algorithm (RA) is an effective feature selection method that assigns each feature a score reflecting how well it differentiates between classes. This technique calculates the weight of various features based on the correlation between cancer attributes. In [208], a feature selection method based on the Relief algorithm was proposed as a means of improving efficiency.
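A minimal sketch of the classic Relief scoring rule is given below: for each sample, a feature gains weight when it differs on the nearest sample of another class (the "miss") and loses weight when it differs on the nearest sample of the same class (the "hit"). This is the textbook binary-class formulation, not necessarily the variant of [208]; the data are invented, features are assumed to be scaled to [0, 1], and every sample is visited once instead of being drawn at random.

```python
import math

def relief_weights(X, y):
    """Classic Relief: reward features that differ on the nearest miss
    and penalize features that differ on the nearest hit."""
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: math.dist(X[i], X[j]))
        miss = min(misses, key=lambda j: math.dist(X[i], X[j]))
        for f in range(d):
            weights[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / n
    return weights

# Hypothetical data: feature 0 separates the classes, feature 1 is noise
X = [(0.0, 0.3), (0.1, 0.9), (0.2, 0.5), (0.9, 0.4), (1.0, 0.8), (0.8, 0.6)]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
print(weights)
```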

FS4. Consistency-Based Subset Evaluation (CSE):
The study in [209] presents a hybrid classification model for breast cancer, which is based on dividing cancer data into single-class
For instance, in [136], PCA was applied to the dual-tree complex wavelet (DTCW) transform to select the optimum features of Thyroid Cancer. In [137], PCA was proposed as a tool for classifying different thyroid cancer subtypes such as papillary, follicular, and undifferentiated. The implementation of PCA and Linear Discriminant Analysis was also explored in [210] for classifying Raman spectra of different thyroid cancer subtypes. Finally, in [211], the authors utilized PCA on cDNA microarray data to uncover the biological basis of breast cancers.

FE2. Texture description (TD):
Texture analysis is a commonly used method for extracting relevant information in the classification, segmentation, and prediction of Thyroid Cancer. There are numerous texture analysis techniques in the literature, including wavelet transform, binary descriptors, and statistical descriptors. The discrete wavelet transform, in particular, has received significant attention for its ability to effectively decorrelate data. Many studies have utilized wavelets for thyroid cancer detection, such as in [212], where wavelet techniques were employed to identify cancer regions in thyroid, breast, ovarian, and prostate tumors. In [213], texture information was used to diagnose TN malignancy through a 2-level 2D wavelet transform. Other works exploring this area can be found in [214] and [215].

FE3. Active contour (AC):
The active contour, first introduced by Kass, Witkin, and Terzopoulos in 1987, is a dynamic structure primarily used in image processing. There are several approaches for solving the problem of contour segmentation using a deformable curve model, which has seen numerous applications in the field of Thyroid Cancer detection, as demonstrated in [216], [217], and [218].

FE4. Local binary patterns (LBP):
Local Binary Patterns (LBP) are features employed in computer vision to recognize textures or objects in digital images. LBP has been utilized to detect Thyroid Cancer in [214]. The combination of LBP and DL has also been proposed to classify benign and malignant thyroid nodules in [219] and [220].
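As an illustration of how an LBP texture feature is formed, the sketch below computes the basic 8-neighbor LBP code of a pixel: each neighbor is compared with the center, and the comparison results are packed into one byte. The neighbor ordering (clockwise from the top-left) and the toy 3 × 3 patch are assumptions; published variants differ in ordering, radius, and the use of uniform patterns.

```python
def lbp_code(image, r, c):
    """Basic 8-neighbor LBP code of pixel (r, c); neighbors are read
    clockwise starting from the top-left corner."""
    center = image[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbor at least as bright as the center sets the corresponding bit
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (the texture descriptor)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Toy gray-level patch (illustrative values only)
patch = [[1, 2, 3],
         [8, 5, 4],
         [7, 6, 9]]
print(lbp_code(patch, 1, 1))
```

The resulting histogram is the feature vector that would typically be passed to a classifier.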
FE5. Gray-level co-occurrence matrix (GLCM):
The Gray-Level Co-occurrence Matrix (GLCM) is a matrix that represents the distribution of values of pixels that occur together at a specified offset in an image. In [221], GLCM is used to extract features to differentiate between different types of Thyroid Cancer. In [222], the differences between an individual with Hashimoto's thyroiditis-associated papillary thyroid carcinoma and one with Hashimoto's thyroiditis alone were investigated based on GLCM comparison.

FE6. Independent component analysis (ICA):
In Independent Component Analysis (ICA), information is gathered into a set of contributing features for the purpose of feature extraction. ICA is utilized to separate multivariate signals into their individual components. In [223], ICA is used to extract 29 attributes as independent and useful features for classifying data into either hypothyroid or hyperthyroid using a Support Vector Machine (SVM).
The DL-based feature methods used in the diagnosis of thyroid cancer are summarized in Table VI.

V. STANDARD ASSESSMENT CRITERIA
In this section, we examine the most commonly utilized standard parameters for evaluating the identification of Thyroid Diseases (TD). These criteria serve as a measure of the effectiveness of the methods used. Selecting the right metric is crucial when evaluating the performance of machine learning models. Numerous metrics have been proposed to evaluate machine learning models in various applications. Here, we present a summary of popular metrics that are considered suitable for assessing the performance of AI algorithms applied in the detection of Thyroid Cancer (see Tables VII, VIII, and IX).

A. Classification and Regression Metrics
Table VII presents an outline of classification and regression metrics used in evaluating AI-based thyroid cancer detection frameworks.

B. Statistical Metrics
Table VIII depicts a summary of statistical metrics used in assessing AI-based thyroid cancer detection schemes.

Ref. | Year | Classifier | Features | Contributions
[224] | 2017 | KNN | FC/IG | Avoids data redundancy and reduces computation time. The kNN deals with the missing data, and the ANFIS is provided with the resultant data as input.
[225] | 2017 | SVM | FC/CFS | Extracts the geometric and moment features, which are then classified by several kernels of the SVM classifier.
[226] | 2022 | CNN | FE/PCA | The influence of unbalanced serum Raman data on the prediction results was minimized by using an over-sampling algorithm. PCA then reduced the data's dimension before classification using RF and Adaptive Boosting.

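Metric | Mathematical formula | Description
Accuracy (ACC) | $ACC = \frac{TP + TN}{TP + TN + FP + FN}$ | It gives the percentage of correct predictions out of the total number of positive and negative predictions.
Specificity (SPE) | $SPE = \frac{TN}{TN + FP}$ | It is the ratio of correctly predicted negative samples to the total negative samples.
Sensitivity (SEN) | $SEN = \frac{TP}{TP + FN}$ | It measures the proportion of real positive cases that are predicted as positive.
Mean Squared Error (MSE) | $MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$ | It is the average of the square of the difference between the original values and the predicted values.
Here TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively; $y_i$ and $\hat{y}_i$ are the original and predicted values, and N is the number of samples.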
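The classification and regression metrics described above can be computed directly from confusion-matrix counts and paired values, as in the following sketch. The counts are hypothetical (they are merely sized like a 98-nodule test set with 16 malignant and 82 benign cases) and do not reproduce any reported result.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SEN": tp / (tp + fn),  # true positive rate
        "SPE": tn / (tn + fp),  # true negative rate
    }

def mean_squared_error(actual, predicted):
    """Average squared difference between original and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical counts: 16 malignant (positive) and 82 benign (negative) nodules
m = classification_metrics(tp=14, tn=78, fp=4, fn=2)
print({name: round(value, 3) for name, value in m.items()})
print(round(mean_squared_error([1, 0, 1], [0.9, 0.2, 0.6]), 3))
```

With strongly imbalanced classes, accuracy alone can look favorable even when sensitivity is poor, which is why the three values are usually reported together.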

Metric | Mathematical formula | Description
Standard deviation (SD) | $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ | It is a measure of the amount of variation or dispersion in a set of data.
Correlation (Corr) | - | It describes the degree of association or relationship between two or more variables.
Cohen's kappa | $k = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)}$ | It measures the degree of concordance between two evaluators, relative to chance.

Metric | Mathematical formula | Description
Peak Signal to Noise Ratio (PSNR) | - | It measures the ratio of the maximum possible power of a signal to the power of the noise that affects the fidelity of its representation.
Structural Similarity Index (SSIM) | - | It evaluates the similarity between two images or videos by comparing their luminance, contrast, and structural information.
Visual Information Fidelity (VIF) | - | It evaluates the quality of a reconstructed or compressed image or video compared to the original signal. It measures the amount of visual information preserved in the processed image or video, taking into account the spatial and frequency characteristics of the image.
Cross-Correlation (NCC) | $NCC = \frac{1}{N} \sum_{i,j} \frac{(I(i, j) - \mu_I)(\hat{I}(i, j) - \mu_{\hat{I}})}{\sigma_I \sigma_{\hat{I}}}$ | It measures the similarity between two images (or videos) by subtracting the mean value of each signal from the signal itself. Then, the signals are normalized by dividing them by their standard deviation. Finally, the cross-correlation between the two normalized signals is calculated.
Structural Content (SC) | - | A higher value of SC shows that the image is of poor quality.
Weight Peak Signal to Noise Ratio (WPSNR) | - | It takes into account the image texture [230].
Noise Visibility Function (NVF) | - | -
Visual Signal to Noise Ratio (VSNR) | $VSNR = 10 \log_{10}\left(\frac{C^2(I)}{VD^2}\right)$ | Where C(I) is the RMS contrast of the original image I and VD is the visual distortion [231]. It is based on the specified thresholds of distortions in the image based on the computing of contrast thresholds and wavelet transform. If the distortions are lower than the threshold, the VSNR is perfect.
Normalized Absolute Error (NAE) | - | It evaluates the accuracy of an ML model's predictions. It measures the difference between the predicted values and the actual values, as a proportion of the range of the actual values.
Laplacian Mean Square Error (LMSE) | $LMSE = \frac{\sum_{i,j} [L(I(i, j)) - L(\hat{I}(i, j))]^2}{\sum_{i,j} [L(I(i, j))]^2}$ | Where L(I(i, j)) is the Laplacian operator. It is a variant of the Mean Square Error (MSE) computed on Laplacian-filtered images, so that it emphasizes differences at edges.
Here I and $\hat{I}$ denote the original and processed images, $\mu$ and $\sigma$ their means and standard deviations, and N the number of pixels.

M1. The mean reciprocal rank (MRR):
The MRR is a statistical measure for evaluating the mean reciprocal rank of results for a sample of queries [233]:
$MRR = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{rank_i}$
where |Q| is the number of queries and $rank_i$ refers to the rank position of the first relevant document for the i-th query.
M2. The discounted cumulative gain (DCG): The DCG is used to measure the ranking quality [234].
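The two ranking metrics can be sketched in a few lines. MRR follows the definition above (the reciprocal of the rank of the first relevant result, averaged over queries). For DCG, the sketch assumes the common formulation in which the relevance at rank i is divided by log2(i + 1); other variants exist, and the input values are invented.

```python
import math

def mean_reciprocal_rank(first_relevant_ranks):
    """Average of 1/rank over queries, where rank is the position
    of the first relevant result for each query."""
    return sum(1.0 / r for r in first_relevant_ranks) / len(first_relevant_ranks)

def dcg(relevances):
    """Discounted cumulative gain: relevance at rank i is divided by log2(i + 1)
    (a common formulation, assumed here)."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(relevances, start=1))

# Three hypothetical queries whose first relevant result sits at ranks 1, 2, and 4
print(round(mean_reciprocal_rank([1, 2, 4]), 4))
# Hypothetical graded relevance of one ranked result list
print(dcg([1, 0, 1]))
```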

VI. EXAMPLE OF THYROID CANCER DETECTION USING AI
To illustrate how thyroid cancer has been considered in the literature and how AI can be used to detect types of cancer, we present in the following a simple example of TD classification. Pattern recognition is the process of training a neural network to assign the correct target classes to a set of input patterns. Once trained, the network can be used to classify patterns. In this section, we present an example of thyroid cancer classification as benign, malignant, and normal based on a set of features specified according to the TIRADS. In this example, the dataset (7200 samples) is selected from the UCI Machine Learning Repository [235]. This dataset can be used to create a neural network that classifies patients referred to a clinic as normal, hyperfunction, or subnormal functioning. The Thyroid Inputs (TI) and Thyroid Targets (TT) are defined as: (i) TI: a 21 × 7200 matrix consisting of 7200 patients characterized by 15 binary and 6 continuous patient attributes. (ii) TT: a 3 × 7200 matrix of 7200 associated class vectors defining which of three classes each input is assigned to. Classes are represented by a 1 in rows 1, 2, or 3: (1) Normal, not hyperthyroid. (2) Hyperfunction. (3) Subnormal functioning.
In this network, the data is divided into 5040, 1080, and 1080 samples used for training, validation, and testing, respectively. The network is trained to reduce the error between its outputs and the thyroid targets, or until it reaches the target goal. If the error on the validation data stops decreasing, training is halted. The test data is then used to predict the target values and thus provides an independent measure of how well the network has learned. For this example, 10 hidden-layer neurons are used for 21 inputs and 3 outputs. After simulation of the model, the percent error was 5.337%, 7.407%, and 5.092% for training, validation, and testing, respectively. Overall, the network recognized 94.4% of the samples, and the overall error rate was 5.6%. The confusion matrix and the ROC metric are illustrated in Fig. 5. Fig. 6 illustrates an example of thyroid segmentation (TS) in ultrasound images using k-means (3 clusters have been chosen for this example), which is one of the most commonly used clustering techniques.
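The figures reported in this example can be cross-checked with simple arithmetic. The sketch below assumes that the 5040/1080/1080 partition corresponds to a 70%/15%/15% split of the 7200 samples (the ratios are inferred from the sizes, not stated in the text) and recovers the overall error rate as the size-weighted combination of the three reported percent errors.

```python
n_total = 7200
# Assumed split ratios, inferred from the reported subset sizes
splits = {"training": 0.70, "validation": 0.15, "testing": 0.15}
sizes = {name: round(n_total * frac) for name, frac in splits.items()}

# Percent errors reported in the text for each subset
percent_error = {"training": 5.337, "validation": 7.407, "testing": 5.092}

# Number of misclassified samples implied by each percent error
misclassified = {
    name: round(sizes[name] * percent_error[name] / 100) for name in sizes
}

overall_error = 100 * sum(misclassified.values()) / n_total
overall_accuracy = 100 - overall_error

print(sizes)
print(misclassified)
print(round(overall_error, 1), round(overall_accuracy, 1))
```

The result agrees with the overall 5.6% error rate and 94.4% recognition rate quoted above.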

VII. CRITICAL ANALYSIS AND DISCUSSION
As we delve into the core of this paper, it is essential to critically assess and discuss the multitude of facets associated with the application of AI in thyroid carcinoma detection. While the promise of AI has been well-articulated in existing literature, a more nuanced perspective is needed to fully understand its impact on healthcare, both positive and negative. In this section, we undertake a critical analysis of the effectiveness of AI models for thyroid carcinoma detection. Moving beyond the optimistic numbers, we will question the robustness of these models in real-world clinical settings and discuss their role in the broader context of clinical decision-making. Furthermore, we explore the potential biases in AI models, understanding how they might inadvertently perpetuate existing inequities in healthcare. A comparative assessment of AI-based and traditional diagnostic methods will provide deeper insights into their relative effectiveness. Moving on, acknowledging the challenges to the implementation of AI tools in healthcare, we delve into the infrastructural, regulatory, and cultural barriers that might hinder their widespread adoption. Lastly, we underscore the crucial role of interdisciplinary collaboration in ensuring the successful integration of AI into healthcare.
The DL-based feature methods used in the diagnosis of thyroid cancer are detailed in Table X.
The Effectiveness of AI Models: The reported accuracy, sensitivity, and specificity of AI models in the literature may vary widely based on the dataset used, the quality of the data, and the methodology employed. AI models' effectiveness in a controlled experimental environment may not reflect their performance in a real-world clinical setting. Factors like noise in the data, incomplete data, and changing clinical conditions can dramatically influence the outcome. Therefore, it is crucial to scrutinize the model's robustness and reliability under various conditions.

A. Limitations and open challenges
Despite the success of AI tools in thyroid cancer diagnosis, their limitations hinder the development of effective solutions, make their application costly, and limit their diffusion. For instance, to achieve precise thyroid cancer detection, it is crucial to gather and store all relevant data in one place. Then, algorithms must be developed to identify all forms of thyroid cancer. Every thyroid cancer dataset includes a set of training images, test images, nodule plans, and classifications of nodule characteristics of diverse sizes [260]. The datasets must be regularly updated using MRI, CT scans, X-rays, and clinically obtained scans to assess thyroid conditions, and they should also include demographic information such as gender and age. Additionally, it is important to establish a unified and centralized database accessible to all medical centers to test, validate, and apply AI algorithms to existing data [261]. The remaining limitations and open challenges can be summarized as follows:
• Insufficient clean data and accuracy: The lack of comprehensive and annotated datasets regarding the incidence and spread of cancer, specifically thyroid cancer, is a major hindrance to accurate cancer diagnoses and efficient treatment. Medical statistics often do not properly record the number of deaths caused by thyroid cancer, making data collection and validation challenging [262]. This results in a limited amount of data, typically collected from one center, due to the absence of a dedicated thyroid cancer clinical database shared among institutions. The accuracy of AI algorithms in diagnosing thyroid cancer is also limited by the small number of available labeled cases for clinical outcomes [263]. Researchers acknowledge that a large amount of data is necessary for a neural network to yield accurate results, but caution must be taken with regard to the data added during the learning phase, as it can introduce noise.
• Thyroid gland imaging: In the diagnostic evaluation of thyroid cancer, computed tomography (CT) and magnetic resonance imaging (MRI) are available options, but they are not considered the preferred methods due to their high cost and unavailability in certain cases [51]. Instead, ultrasound is commonly used as an alternative to physical exams, radioisotope scans, or fine-needle aspiration biopsies. During an ultrasound examination, the doctor is able to assess the activity of the gland by observing the echo of the node and determining its echogenicity, size, limits, and the presence of calcifications. However, the results obtained from ultrasound tests are not always accurate enough to differentiate between benign and malignant nodes, and the images obtained can be more prone to noise [264].
• The number of DL layers: Choosing the right DL algorithm is crucial in addressing various issues, particularly those related to thyroid cancer diagnosis. Due to the close similarity between benign and malignant tumors, as well as between tumors and other types of lymphocytes, it is challenging to differentiate between them accurately [265]. To achieve this, a significant increase in the number of layers for feature extraction may be required. However, this results in a longer processing time, especially when dealing with large amounts of data, which can impact the timeliness of the diagnosis for cancer patients [50].
• The computation cost and space: In the field of algorithms, time complexity is a metric that assesses the computational cost of an algorithm: it predicts the time it will take to run the algorithm by counting the number of basic operations it performs as a function of the size of the input. Time complexity is typically expressed in big-O notation, for example O(n), where n represents the size of the input, measured in terms of the number of bits required to represent it [266]. Researchers in the AI field, especially those working on thyroid cancer or other types of cancer diagnosis, face the challenge of finding algorithms that are both highly accurate and efficient in terms of processing time. They aim to develop algorithms that can analyze vast amounts of data quickly while still providing accurate results. Moreover, the volume of data used in these algorithms can sometimes exceed the available storage space [50].
• Imbalanced dataset: The distribution of cancer elements within categories related to thyroid tissue cells is often uneven, as these cells often make up a minority of the total tissue cell dataset. As a result, the dataset is highly imbalanced, consisting of both cancer cells and normal cells. This unbalanced distribution of features in cancer cell detection datasets often results in the suboptimal performance of AI algorithms used for the detection [267].
• Sparse labels: Labeling is a crucial aspect of thyroid cancer detection, specifically for distinguishing between normal and abnormal cancer cells. However, the process can be time-consuming and costly, and the number of available labels is limited. This scarcity results in inconsistent decisions and can negatively impact the accuracy of AI algorithms, which heavily rely on labeled data. This can eventually undermine the trust and credibility of this type of application [267].
• The volume of data: At present, with the advancement in technology, especially in the field of thyroid cancer diagnosis, and the growing volume of medical and patient data, researchers are facing challenges in suggesting algorithms that can effectively handle a limited number of samples, noisy samples, unannotated samples, sparse samples, incomplete samples, and high-dimensional samples. This requires AI algorithms that are highly efficient and capable of processing vast amounts of data exchanged between healthcare providers and patients or among specialist physicians [268].
• The error-susceptibility: Despite AI being self-sufficient, it is still susceptible to errors. For instance, training an algorithm with thyroid cancer datasets to diagnose cancerous regions can result in biased predictions if the training sets are biased. This can lead to a series of incorrect results that may go unnoticed for an extended period. Once they are detected, identifying and correcting the source of the problem can be a time-consuming process [269].
• The data form: Despite the numerous advancements in the use of AI for thyroid cancer detection, several limitations persist and pose a challenge to its progress. With the growing demand for various medical imaging technologies that result in vast amounts of data needed for AI algorithms, coordinating and organizing this information has become a daunting task. This can largely be attributed to the absence of proper labeling, annotation, or segmentation of the data, making it difficult to manage effectively [270].
• Unexplainable AI: The utilization of AI in the medical field can sometimes yield results that are unclear and lack proper justification, known as a "black box".This leaves doctors unsure about the accuracy of the results and may lead to erroneous decisions and treatments for patients with thyroid cancer.Essentially, AI can behave like a black box and fail to provide understandable explanations for its outputs [271].• Lack of cancer detection platform: One of the major barriers to detecting various cancers, particularly thyroid cancer, is the limited availability of platforms for reproducing and examining previous results.This shortage represents a significant weakness and hinders the comparison of AI algorithm performance, making it challenging to improve their efficacy [155].The presence of online platforms with comprehensive data sets, cuttingedge algorithms, and expert recommendations is vital in aiding doctors, researchers, developers, and specialists to make informed decisions with a low margin of error.Such platforms also provide a crucial supplement to clinical diagnoses by allowing for more comprehensive experimentation and comparison [272].
• The digitization and loss data: Digitization of medical records has become a necessity, particularly in the realm of cancer diagnosis, due to the widespread adoption of various technologies such as Whole Slide Images (WSIs).WSIs, which serve as digital versions of glass slides, facilitate the application of AI techniques for pathological analysis [273].Despite its benefits, digitization in the medical field is confronted with certain limitations, such as the risk of significant information loss during quantification and inaccuracies that may arise from data compression utilized in autoencoder algorithms.Hence, it is crucial to be mindful in selecting the right digitization technology to preserve the information and maintain the originality of the data [274], [275].• The Contrast: The absence of sufficient contrast in the tissues neighboring the thyroid gland (TG) complicates the process of accurately analyzing and diagnosing thyroid cancer.
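
As an illustration of one common mitigation for the imbalanced-dataset challenge listed above, the following minimal Python sketch computes inverse-frequency class weights that can be used to weight a training loss. It is not taken from any of the cited works; the function name and the toy label counts are hypothetical, and the weighting rule (number of samples divided by the number of classes times the class count) is the usual "balanced" heuristic.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight}, with weight = n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so that errors on them contribute
    more to a weighted training loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy labels: 90 benign (0) and 10 malignant (1) samples.
labels = [0] * 90 + [1] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

With these toy counts, the minority (malignant) class receives a weight nine times larger than the majority class, which is the intended effect of the heuristic.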

VIII. FUTURE RESEARCH DIRECTIONS
This section highlights the future trajectory of AI in thyroid carcinoma detection, discussing emerging trends and technologies together with their ethical implications, including issues related to data privacy, accountability, and equity. It focuses on promising research trends that are expected to have a major effect on enhancing thyroid cancer detection in the future.

A. Explainable Artificial Intelligence (XAI)
The use of artificial intelligence (AI) systems in decision-making is crucial, but these systems can be complex and difficult to understand. To address this issue, the field of explainable AI (XAI) has emerged, which aims to provide transparency in AI models. The need for XAI is especially pressing in health applications, where the interpretation of results is crucial. The use of XAI has been demonstrated in the analysis of incurable diseases affecting the TG, as seen in several studies such as [276], [277], [278], [279], [280]. The difference between AI and XAI is illustrated in Fig. 7. In [167], the authors present an XAI model for the detection of thyroid cancer, which improves the confidence of medical practitioners in the predictions. Unlike traditional AI algorithms, XAI models provide evidence to support their conclusions and avoid the limitations of "black box" algorithms. By using XAI, clinicians can make more informed decisions with greater confidence.
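
To make the idea of "providing evidence" concrete, the following minimal Python sketch illustrates occlusion-based attribution, one simple model-agnostic explanation technique. It is only an illustration under stated assumptions: the cited works may rely on other XAI methods, and the linear scorer `toy_model`, its weights, and the input vector are hypothetical stand-ins for a trained classifier.

```python
def occlusion_attribution(predict, x, baseline=0.0):
    """Attribute a prediction to each input feature.

    Each feature is replaced in turn by `baseline`, and the resulting drop
    in the model score is recorded as that feature's attribution.
    """
    base_score = predict(x)
    attributions = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline
        attributions.append(base_score - predict(occluded))
    return attributions


def toy_model(features):
    """Hypothetical linear scorer standing in for a trained classifier."""
    weights = [0.5, 0.0, 2.0]
    return sum(w * f for w, f in zip(weights, features))


attributions = occlusion_attribution(toy_model, [1.0, 1.0, 1.0])
print(attributions)
```

In this toy case the third feature receives the largest attribution and the second receives none, which mirrors the weights of the hypothetical scorer.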

B. Edge, fog, and cloud computing for implementation
The edge network combines edge computing and AI so that AI-based algorithms are processed near the source of data [281]. This allows for better performance and lower costs for applications that require heavy information processing, and it reduces the need for long-distance communication between the patient and the doctor. The proximity of the information and storage capabilities to the end-user in the health sector allows for direct and immediate access [282]. To further enhance performance, the detection of thyroid cancer in edge networks relies on fog computing, a decentralized computing architecture located between the cloud and the data-producing devices. This architecture allows for the flexible placement of computing and storage resources in logical locations, improving performance [283]. To ensure its proper operation, the AI-based thyroid cancer detection system uses cloud computing as an access point. This guarantees that the stored data, servers, databases, networks, and programs are accessible to and shared among specialized doctors, as long as the system is connected to the Internet. Such a hybrid system has proven to be effective for medical applications, including the detection of thyroid cancer, as seen in various studies including [284], [285], [286], [287], [288], [289], [290], [291], [292], [293], [294].

C. Reinforcement learning (RL)
RL, a subfield of ML, allows agents to make decisions in interactive environments through trial and error, observation, and learning (as depicted in Fig. 8). In recent years, there has been significant interest in using RL to detect incurable diseases and to provide explanations that aid medical decision-making. For example, reinforcement learning is used in [295] to classify cancer data, and deep reinforcement learning is used in [296] to segment lymph node sets. In the latter work, the authors generate pseudo-ground truths using RECIST-slices and achieve simultaneous optimization of lymph node bounding boxes through the interaction between a segmentation network and a policy network.
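
The trial-and-error principle can be illustrated with the standard tabular Q-learning update, sketched below in Python. This is a generic textbook rule and not the method of [295] or [296]; the state names, action names, reward, learning rate `alpha`, and discount factor `gamma` are all hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical two-state table with two actions per state.
q_table = {
    "s0": {"a": 0.0, "b": 0.0},
    "s1": {"a": 1.0, "b": 0.0},
}
new_value = q_update(q_table, "s0", "a", reward=1.0, next_state="s1")
print(new_value)
```

The agent observes a reward after acting, compares it with its current estimate, and moves the estimate toward the observed outcome; repeating this over many interactions is what is meant by learning from trial and error.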

D. Transfer learning (TL)
TL is a valuable solution to the overfitting and precision challenges faced by diagnosis systems [297], [298], [299]. This technique leverages the knowledge stored while solving one problem to address a different but related problem, which reduces the training time and the volume of data required [298], [268]. Its use in the diagnosis of the thyroid gland (TG) is demonstrated in Fig. 9. In [154], the authors tackle the challenge of capturing appropriate features of benign and malignant nodules using CNNs. They transfer the knowledge learned from natural data to an ultrasound (US) image dataset to produce hybrid semantic deep features. The transfer learning technique has also been successfully applied to classify thyroid nodule (TN) images in [160]. Other related works can be found in [300], [250], [301], [158], [302].

E. Panoptic segmentation (PS)
Accurately separating objects with diverse and overlapping appearances remains a challenge, particularly in the medical field. To address this, many researchers have proposed approaches for a comprehensive and cohesive segmentation of various details [303], [304]. The focus has been on PS, which combines both instance and semantic segmentation to identify and separate objects. In semantic segmentation, the goal is to classify each pixel into specific classes, while in instance segmentation, the focus is on segmenting individual object instances. AI has been incorporated into this model through supervised or unsupervised instance segmentation learning, making it well-suited for medical applications (Fig. 10). This has been demonstrated in works such as [305], [306].
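
The combination of the two outputs can be sketched as follows. This minimal Python example merges a per-pixel semantic map and a per-pixel instance map into a single panoptic map by encoding each pixel as class id times a label divisor plus instance id. It is an illustrative encoding under stated assumptions, not the pipeline of [305] or [306]; the value of `LABEL_DIVISOR`, the class ids, and the toy maps are hypothetical.

```python
LABEL_DIVISOR = 1000  # Assumed upper bound on the number of instances per class.


def to_panoptic(semantic, instance):
    """Merge per-pixel class ids and instance ids into one panoptic id map."""
    return [
        [s * LABEL_DIVISOR + i for s, i in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic, instance)
    ]


def from_panoptic(panoptic):
    """Recover the semantic and instance maps from a panoptic id map."""
    semantic = [[p // LABEL_DIVISOR for p in row] for row in panoptic]
    instance = [[p % LABEL_DIVISOR for p in row] for row in panoptic]
    return semantic, instance


# Hypothetical 2x3 image: class 0 = background, class 1 = nodule.
semantic_map = [[0, 1, 1], [0, 1, 1]]
instance_map = [[0, 1, 2], [0, 1, 2]]  # Two distinct nodules of the same class.
panoptic_map = to_panoptic(semantic_map, instance_map)
print(panoptic_map)
```

Under this encoding, two nodules that share the same semantic class still receive different panoptic ids, which is the property that distinguishes panoptic from purely semantic segmentation.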

F. Internet of medical imaging things (IoMIT)
The Internet of Medical Things (IoMT) has recently gained widespread attention in the medical field, as it seeks to enhance healthcare delivery and reduce treatment costs through the exchange of health data between patients and doctors using connected devices with wireless communication (Fig. 11). One example of this integration can be found in [307], which proposes an AI-based solution for the early detection of thyroid cancer in the IoMT, using a CNN to improve the differentiation between benign and malignant nodules. Other relevant studies related to the IoMIT have also been conducted, such as [308] and [309].

G. 3D thyroid cancer detection (3D-TCD)
Conventional 2D ultrasound is widely used for diagnosing thyroid nodules, but its static images may not accurately reflect the structure of a nodule. Hence, 3D ultrasound has gained attention, as it provides a more comprehensive view of the lesion by reconstructing its features and enables better differentiation between diagnoses [310]. With the ability to examine complex growth patterns, margins, and shapes from multiple angles and levels, 3D ultrasound can provide a more accurate evaluation of the morphological features of thyroid nodules than 2D images. This has been confirmed through comparative studies between 3D and 2D ultrasound images [311], [312], [313].

H. AI in Thyroid Surgery (AI-TS)
In light of the challenges faced in surgical procedures, the use of AI-powered robots in surgical practice is becoming increasingly essential. AI has the potential to address numerous clinical issues by analyzing and sharing massive amounts of data to support decisions with a level of accuracy comparable to that of healthcare professionals [314]. Companies are incorporating AI into surgical practice by training AI-based systems and providing robots that assist surgeons in operating rooms, supply surgical materials, handle contaminated materials and medical waste, remotely monitor patients, and collect and organize patient data such as electronic medical records, vital signs, laboratory results, and video footage [315]. As such, it is important for surgeons to have a strong understanding of AI in order to grasp its impact on healthcare. While AI-powered robotic surgery may still be some time away, collaboration across various fields can accelerate AI's capabilities and improve surgical care [316], [317], [318], [319], [320], [321], [322], [323].

I. Wavelet-based AI
Recently, wavelet transforms, specifically first- and second-generation wavelets, have gained recognition for their ability to detect various forms of cancer, especially when integrated with AI. This combination has become crucial in the medical field, providing doctors and surgeons with a tool to diagnose diseases accurately, efficiently, and quickly [324], [325]. The proposed method is based on pre-processing the dataset through the discrete wavelet transform (DWT) and then evaluating the performance of AI in classifying different types of tumors that can affect organs in the body (as explained in Fig. 12). This model holds great potential for the detection of thyroid cancer, and researchers are encouraged to test the different wavelets available in the literature to further improve its effectiveness [326].
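
As a minimal illustration of the DWT pre-processing step, the following Python sketch implements one level of the orthonormal Haar wavelet transform and its inverse on a 1D signal. The Haar wavelet is chosen here only because it is the simplest case; the cited works may use other wavelets, and the function names and the toy signal are hypothetical.

```python
import math


def haar_dwt(signal):
    """One-level orthonormal Haar DWT of an even-length 1D signal.

    Returns (approximation, detail) coefficient lists, each half as long
    as the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt and return the reconstructed signal."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


# Hypothetical row of pixel intensities.
row = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_dwt(row)
reconstructed = haar_idwt(approx, detail)
print(approx, detail, reconstructed)
```

The approximation coefficients summarize the coarse structure of the signal and the detail coefficients capture local variation; either set, or both, can then be passed to a classifier as features.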

J. Learning with reduced data
One of the hurdles in implementing AI in the medical sector is acquiring adequate data and annotations. The capability of AI to make an accurate diagnosis while minimizing the need for labeled data is therefore crucial [327]. This can be achieved through various learning methods such as semi-supervised learning, supervised learning, unsupervised learning, or alternative approaches that require a smaller quantity of annotated data (Fig. 13) [328].

K. Recommender systems (RSs)
The abundance of data collected from online medical platforms and electronic health records can make it challenging for thyroid cancer patients to access relevant and accurate information [329]. The high cost of healthcare data also makes it difficult for doctors to track patients and manage a large patient volume with various treatment options. Given these challenges, RSs have been proposed to improve decision-making in healthcare and ease the workload for both patients and oncologists [330], [331]. The use of RSs in digital health provides personalized recommendations, accurate analysis of big data, and stronger privacy protection through integration with artificial intelligence and machine learning technologies [332], as depicted in Fig. 14.

L. Federated learning (FL)
FL has become very popular in the field of healthcare applications [333]. The surrounding conditions greatly affect human health and cause negative effects on the economy. Diseases of the thyroid gland are among the most common health problems and have recently become noticeable among various groups of society. ML can play a vital role in such medical conditions, as the collected data can be exploited to train an ML model that can predict critical conditions. Because patient data held at different medical centers should be handled privately, the FL setup is the natural choice for such applications, as depicted in Fig. 15. In [334], the authors compared the performance of FL against five conventional deep learning models (VGG19, ResNet50, ResNext50, SE-ResNet50, and SE-ResNext50) for analyzing thyroid cancer datasets and detecting thyroid cancer.
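
The aggregation step at the heart of an FL setup can be sketched with the widely used federated averaging rule, in which a server averages the model parameters received from each centre, weighted by the local dataset sizes, so that raw patient data never leaves the centre. The Python sketch below is an assumption-laden illustration and not necessarily the aggregation used in [334]; the function name, the two-parameter "models", and the dataset sizes are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(n_params)
    ]


# Hypothetical parameters from two medical centres after local training.
centre_params = [[1.0, 2.0], [3.0, 4.0]]
centre_sizes = [100, 300]  # Number of local training samples per centre.
global_params = federated_average(centre_params, centre_sizes)
print(global_params)
```

In this toy case the centre holding three times as many samples pulls the global parameters three times as strongly toward its own values.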

M. Generative chatbots
Most recently, the realm of artificial intelligence has witnessed significant advancements, particularly in the development of generative chatbots and large language models such as the GPT variants [335]. These state-of-the-art models, trained on vast amounts of data, are adept at generating human-like text and engaging in coherent conversations, going beyond mere predefined responses. As their capability has expanded, so too has their potential for application across various domains, with healthcare being one of the most prominent. In the healthcare sector, these sophisticated models are being explored for patient engagement, preliminary symptom checks, providing health-related information, and even assisting professionals with medical research and data analysis [3]. The integration of such technology holds the promise of streamlining healthcare processes, enhancing patient experience, and augmenting the capabilities of healthcare professionals, albeit with the necessary precautions and ethical considerations in place [336].
Using generative chatbots or models like ChatGPT to diagnose thyroid cancer (or any medical condition) directly would be inappropriate and potentially dangerous. However, they can be incorporated into healthcare settings in auxiliary roles [337]. For instance, chatbots can gather preliminary information from patients, including their symptoms, family history, and lifestyle habits. This data can provide a better understanding of the patient's concerns before they meet a healthcare professional. Chatbots can also be programmed to provide information about thyroid cancer, such as risk factors, symptoms, and preventive measures [338]. Patients can learn about the disease and its potential signs, which allows them to approach healthcare providers if they notice any matching symptoms. In addition, while chatbots cannot replace professional diagnostic tools, they can be designed to guide users through a series of questions that could highlight potential risk factors or symptoms, encouraging them to consult a medical professional for a more comprehensive evaluation [339].
Once a diagnosis has been made, chatbots can provide patients with information on treatment options, side effects, and diet recommendations, and can answer frequently asked questions. Additionally, they can (i) remind patients to take their medications, attend follow-up appointments, or perform regular self-examinations or monitoring, (ii) offer support in terms of relaxation techniques, provide resources for further psychological support, or even just offer a nonjudgmental "listening ear", and (iii) assist doctors and other healthcare professionals by providing instant information about thyroid cancer, recent research, or treatment options, acting as a dynamic reference tool [340].

IX. CONCLUSION
In this research, a comprehensive overview of Deep Neural Networks (DNNs) has been presented, highlighting their growing use in recent years due to their high accuracy compared to other methods. A range of algorithms and training structures has been described, including their advantages and limitations. DNNs have been shown to play a critical role in various real-world applications, characterized by their generalizability and noise tolerance.
However, there are still challenges to be addressed before DNNs can be widely used in the detection of thyroid cancer. One such challenge is the lack of clean datasets and platforms. It is crucial to make such data available in order to develop efficient and robust cancer detection models that can identify more advanced types of cancer. In the future, more research effort should be put into overcoming these problems and improving the quality of thyroid cancer detection.
The study also highlights the need for further research in the area of thyroid cancer detection and identification, especially considering the accuracy that health specialists desire. Researchers are increasingly exploring the detection of various cancers in two or three dimensions, but the lack of mastery of different geometric transformations and of two- or three-dimensional databases hinders the process of diagnosing incurable diseases. The development of new methods to recognize different volumes of cancerous nodules is crucial to achieving speed in treatment and accuracy in diagnosis, as well as enabling early epidemiological surveillance and reducing the death rate.
New technologies such as Explainable AI, Edge Computing, Reinforcement Learning, Panoptic Segmentation, and Recommender Systems have opened up new avenues for research in the field of thyroid cancer detection and have greatly assisted clinicians in the early diagnosis process by reducing the time for detection and preserving patient privacy. Future work will focus on further exploring the contributions of these technologies to drive a paradigm shift in the field of cancer detection, by developing advanced and secure technologies, such as telehealth, for detecting thyroid cancer while preserving patient privacy.

Fig. 1. (a) The different kinds of cancer and (b) thyroid cancer detection methods.

Fig. 2. Taxonomy of the thyroid cancer detection schemes based on AI.

Fig. 3. General DL system for thyroid cancer detection and classification.

Fig. 4. Example of six samples for each class from the DDTI dataset.
• Precision (P): $P = \frac{TP}{TP + FP} \times 100\%$. Measures the proportion of true positive predictions out of all the positive predictions made by the model.
• F1 Score (F1): $F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}$. It is the harmonic mean of the precision and sensitivity of the classification.
• Error Rate (ER): $ER = \frac{FN + FP}{TP + FN + FP + TN} \times 100\%$. It is equivalent to 1 minus accuracy.
• Root mean square error (RMSE): $RMSE = \sqrt{1 - (ER)^2} \times SD$. It is the standard deviation of the prediction error between the training and testing datasets; a lower value indicates a better classifier.
• Negative predictive value (NPV): $NPV = \frac{TN}{TN + FN}$. It is the proportion of negative results in diagnostic tests that are true negatives; a higher value indicates a more accurate diagnosis.
• Jaccard similarity index (JSI): $JSI = \frac{|A \cap B|}{|A \cup B|} = \frac{TP}{TP + FP + FN}$. It was proposed by Paul Jaccard to gauge the similarity and diversity of samples.
• Fallout or false positive rate (FPR): $FPR = \frac{FP}{FP + TN} = 1 - SP$. Measures the proportion of negative samples that are incorrectly classified as positive by the model.
• Volumetric Overlap Error (VOE): $VOE = \frac{FP + FN}{TP + FP + FN}$. Evaluates the agreement between the segmented region and the ground truth region. VOE measures the mismatch between the two regions and is defined as one minus the ratio of the volume of their intersection to the volume of their union.
It is the average of the difference between the original values and the predicted values.
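
The confusion-matrix metrics listed above can be computed directly from the four counts TP, FP, FN, and TN. The following minimal Python sketch is given for illustration only: the function name and the example counts are hypothetical, recall is taken as TP / (TP + FN), and the values are returned as fractions rather than percentages.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute classification and overlap metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # Also called sensitivity.
    return {
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "error_rate": (fn + fp) / (tp + fn + fp + tn),
        "npv": tn / (tn + fn),
        "jaccard": tp / (tp + fp + fn),
        "fpr": fp / (fp + tn),
        "voe": (fp + fn) / (tp + fp + fn),
    }


# Hypothetical counts for a benign/malignant classifier on 100 nodules.
metrics = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that the Jaccard index and the VOE computed this way always sum to 1, which follows from their definitions.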

$\frac{1}{1 + \delta_{bloc}^{2}}$, where $\delta_{bloc}$ is the luminance variance. It estimates the texture content in the image.

Fig. 5. An example of the confusion matrix and ROC metric for thyroid cancer classification.

Fig. 6. Example of thyroid segmentation based on the k-means method.

Fig. 11. Example of an AI-based hybrid network system for thyroid cancer detection.

Fig. 12. Applications of wavelet-based AI in the detection of thyroid cancer.
• Scrutiny of several thyroid cancer datasets, addressing their features, feature selection, and extraction methods used in various studies.
• Outline of standard assessment criteria used to evaluate the performance of AI-based thyroid cancer detection methods, encompassing classification and regression metrics, statistical metrics, computer vision metrics, and ranking metrics.
• Critical analysis and discussion highlighting limitations, hurdles, current trends, and open challenges in the field.
• Discussion of future research directions, emphasizing areas requiring more attention to overcome existing barriers and improve thyroid cancer detection solutions.
• Emphasis on the potential of AI in advancing thyroid cancer detection while advocating continuous critical evaluation for responsible and effective use.
Additionally, the principal contributions of the proposed review compared to other existing surveys are summarized in Table

TABLE I
THE MAJOR CONTRIBUTIONS OF THE PROPOSED REVIEW ON THYROID CANCER CLASSIFICATION IN COMPARISON WITH OTHER RELATED WORKS.

TABLE II
SUMMARY OF CNN RESEARCH CONDUCTED IN THE DIAGNOSIS OF THYROID CANCER.
Abbreviations: Number of patients (NP); Number of males (NM); Number of females (NF); Number of nodules (NN); Number of benign nodules (NBN); Number of malignant nodules (NMN).

is composed of three layers, with each layer incorporating an individually trained Long Short-Term Memory (LSTM) network. The study's findings indicate that the hierarchical RNN model surpasses basic models in terms of computational efficiency, control accuracy, and robustness, making it an effective tool for diagnosing TN. These advantageous attributes stem from the inherent memory mechanisms of RNNs, which allow them to remember previous states through feedback loops. This memory capability renders RNNs a popular choice for applications in cancer detection.

D4. Restricted Boltzmann Machine (RBM): A Restricted Boltzmann Machine (RBM) is a type of artificial neural network and a generative stochastic model. It was first introduced by Paul Smolensky

TABLE III
SUMMARY OF RESEARCH FRAMEWORKS CONDUCTED IN THE DETECTION OF THYROID CANCER BENIGN AND MALIGNANT.

• Gene Expression Omnibus (GEO): The GEO database is a genomics repository that follows the guidelines of the Minimum Information About a Microarray Experiment (MIAME). This database is designed to store gene expression datasets, arrays, and sequences, and provides researchers with access to a vast collection of experiment results, gene profiles, and platform records in GEO [179].
• The Surveillance, Epidemiology, and End Results database (SEER): The creators of this dataset aim to supply a collection of clinical characteristics from thyroid carcinoma patients, which includes 34 details such as age, gender, lymph nodes, etc.
• Digital Database Thyroid Image (DDTI): The DDTI dataset was developed with the support of Universidad Nacional de Colombia, CIM@LAB, and Instituto de Diagnostico Medico (IDIME). It serves as a valuable resource for researchers and new radiologists looking to develop algorithm-based computer-aided diagnosis systems for thyroid nodule analysis. The dataset comprises 99 cases and 134 images, with each patient's data stored in an XML file format [180]. Fig. 4 provides an illustration of six samples from each of the thyroid carcinoma tissue types in the DDTI dataset.
• The National Cancer Data Repository (NCDR): The NCDR serves as a resource for healthcare and research with the goal of capturing all recorded cases of cancer in England. This data is sourced from the Office for National Statistics [181].
• Carcinoma of the thyroid (TNM8): A dataset was created for the purpose of reporting pathologies of thyroid resection specimens associated with carcinoma. The data does not include core needle biopsy specimens or metastasis to the thyroid gland. The dataset also does not encompass NIFTP (Non-invasive Follicular Thyroid Neoplasm with Papillary-like Nuclear Features), tumors of uncertain malignant potential (UMP), thyroid carcinomas originating from struma ovarii, carcinomas originating in thyroglossal duct cysts, sarcomas, or lymphomas.
• Prostate, Lung, Colorectal, and Ovarian (PLCO) dataset: The National Cancer Institute (NCI) supports the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, aimed at examining the direct factors that contribute to cancer in both men and women. The trial has records of 155,000 participants, and all studies regarding thyroid cancer incidence and mortality can be found within it [182].
In Table IV, we present examples of public and private thyroid cancer datasets used in thyroid cancer detection. The strengths and weaknesses of AI-based thyroid cancer detection techniques are summarized in Table V.

TABLE IV
EXAMPLES OF PUBLIC AND PRIVATE THYROID CANCER DATASETS USED IN THYROID CANCER DETECTION.

TABLE V
A SUMMARY OF AI-BASED THYROID CANCER DETECTION TECHNIQUES, INCLUDING THEIR STRENGTHS AND WEAKNESSES.
Table IX portrays a summary of computer vision metrics used in assessing AI-based thyroid cancer detection schemes.

TABLE VI
SUMMARY OF FEATURE EXTRACTION METHODS BASED ON DL CONDUCTED IN THE DIAGNOSIS OF THYROID CANCER.

TABLE VII
SUMMARY OF CLASSIFICATION AND REGRESSION METRICS USED IN EVALUATING AI-BASED THYROID CANCER DETECTION SCHEMES.

TABLE VIII
SUMMARY OF STATISTICAL METRICS USED IN ASSESSING AI-BASED THYROID CANCER DETECTION SCHEMES.

TABLE IX
SUMMARY OF COMPUTER VISION METRICS USED IN ASSESSING AI-BASED THYROID CANCER DETECTION SCHEMES.