Article

Cross-Domain Approach for Automated Thyroid Classification Using Diff-Quick Images

1 Faculty of Mathematics, Mechanics and Informatics, VNU University of Science, 334 Nguyen Trai Street, Thanh Xuan District, Hanoi 100000, Vietnam
2 L2TI Laboratory, University Sorbonne Paris Nord, 93430 Villetaneuse, France
3 The 108 Military Central Hospital, Hanoi 100000, Vietnam
4 Université de Lorraine, CNRS, CRAN, 54000 Nancy, France
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2025, 13(13), 2191; https://doi.org/10.3390/math13132191
Submission received: 21 April 2025 / Revised: 30 May 2025 / Accepted: 20 June 2025 / Published: 4 July 2025

Abstract

Classification of thyroid images into Bethesda categories using Diff-Quick stained images can assist in diagnosing thyroid cancer. This paper proposes a cross-domain approach that modifies a deep learning network originally designed to classify X-ray images so that it can classify stained thyroid images. Diff-Quick stained images are large and of high quality, and the tiny cells they contain carry characteristics essential for diagnosis; resizing must therefore preserve these characteristics. Accordingly, this paper also investigates and evaluates the performance of different interpolation methods, namely linear and cubic interpolation. Experimental results on a private dataset show that the proposed approach achieves promising performance in thyroid image classification.

1. Introduction

Thyroid cancer occurs worldwide and is particularly common in women; accurate and early diagnosis is therefore essential to help increase the life expectancy of patients. The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) [1,2] is an international standard that guides cytological evaluation. It classifies thyroid cytology, often derived from Diff-Quick stained fine-needle aspiration (FNA) samples, into six diagnostic categories, ranging from Category I (Nondiagnostic) to Category VI (Malignant). Accurate classification of thyroid cytology images relies heavily on recognizing a complex set of morphological features, including moderately to highly dense cellularity, architectural alterations such as crowded sheets, microfollicles, and dispersed cells, and nuclear atypia such as chromatin clearing, nuclear grooves, and membrane irregularities. This paper focuses on the B4, B5, and B6 categories, which are associated with a high risk of malignancy. These categories represent critical diagnostic points, and their accurate classification is decisive in surgical decision-making.
In early studies of cancer classification, models like Naive Bayes and its weighted variant (W-NB) [3] achieved high accuracy on datasets with predefined features. However, these approaches are unsuitable for raw histopathological images where features must be learned directly from pixel-level data.
In histopathological image classification, traditional machine learning methods, such as Support Vector Machines (SVMs), follow a two-stage pipeline of feature extraction and classification. Descriptors like Local Binary Patterns (LBP) [4], the Gray-Level Co-occurrence Matrix (GLCM) [5], and Oriented FAST and Rotated BRIEF (ORB) [6] have been widely used to extract structural or texture-based patterns. In 2016, Spanhol et al. introduced the BreaKHis dataset [7] and showed that SVMs trained on these descriptors could achieve up to 85.1% accuracy for binary breast cancer classification.
In practice, cancer diagnosis is often not a binary classification but requires distinguishing between multiple closely related pathological stages. Using breast histology images, the BACH 2019 challenge [8] introduced a four-class classification task (Normal, Benign, In Situ Carcinoma, Invasive Carcinoma). The authors observed that traditional machine learning methods, which relied on handcrafted features (e.g., nuclear morphology, texture) and classifiers like SVMs, had difficulty recognizing the slight differences between intermediate cancer stages. To address this, deep learning approaches, especially convolutional neural networks (CNNs), were used to automatically learn rich, multi-scale features from raw data. The top-performing CNN-based methods achieved classification accuracies of up to 87% on the four-class task, significantly outperforming traditional pipelines.
While effective, classical image analysis methods are not suitable for capturing the complex and varied characteristics of Diff-Quick stained thyroid cytology images. This motivates more advanced approaches such as deep learning, which has achieved remarkable results in general computer vision and other applications [9,10] and, more recently, in medical image analysis [11,12,13]. For Diff-Quick stained thyroid cytology images in particular, deep learning methods have proven prominent in segmentation [14], classification [15], and recognition [16], across a wide variety of data covering different diseases in large quantities (more details in Section 2).
Therefore, in this research, we apply a cross-domain (domain transfer) strategy: we start from a successful lung disease classification model and adapt it to Diff-Quick images. In more detail, we adapt and modify this model to classify thyroid cancer images into the three Bethesda stages B4, B5, and B6. We adapt the model by finding a suitable depth so that the characteristics of tiny cells are not lost, and we modify several operators of the original lung disease classification model so that cell features are extracted more effectively. The motivation for using a cross-domain approach, adapting a lung disease classification model named DarkCovidNet [17] to thyroid cytology, is architectural suitability and transferable characteristics between the two imaging domains. Although the anatomical focus differs, both chest X-ray images and Diff-Quick stained thyroid images require models that can extract subtle texture-based and morphological patterns from complex, high-resolution grayscale or stain-based inputs. DarkCovidNet, initially designed for efficient lung disease classification, uses a lightweight convolutional architecture that performs well on relatively small medical datasets like ours. We do not transfer pre-trained weights but adopt the architectural design, modifying it to better handle the small, densely packed cells in thyroid cytology. This makes it a practical and empirically effective starting point for adaptation to our task.
In addition, this research evaluates the performance of different interpolation methods, linear and cubic interpolation, in the image resizing (pre-processing) stage. We conduct a thorough evaluation on a private dataset collected from the 108 Military Central Hospital of Vietnam and labeled by pathologists, which supports confidence in the performance of the proposed approach.
The paper is organized as follows: Section 2 reviews deep learning models for histopathological image classification and explains why the DarkCovidNet model [17] is used as a starting point. Section 3 presents this paper's contributions, including the new architecture of the DarkCovidNet-based model and how the dataset was collected and labeled. Section 4 reports the experimental results, and the conclusion is presented in Section 5.

2. Deep Learning Models for Histopathological Image Classification

Nowadays, with the achievements of machine learning and deep learning, advanced algorithms have been researched and developed to effectively support pathologists who diagnose diseases by examining cells and other components under a microscope (see Table 1).
Table 1 summarizes several approaches to cell classification problems. Recent advances in deep learning for histopathological image classification have brought significant improvements. Without Adaptation [18] achieved 63.15% accuracy on Circus images but struggled to capture intricate morphological features, making it less suitable for Diff-Quick data. ResNet and Vision Transformer [19] demonstrated moderate performance on the TCGA dataset (72.11% and 73.50%, respectively), showing potential for feature extraction but requiring fine-tuning to handle overlapping nuclei and high cell density in Diff-Quick images. HIPT-BERT [20], which combines hierarchical transformers and BERT for contextual understanding, significantly improved accuracy (82.06%) but required substantial computational resources, limiting its practicality in resource-constrained environments. Semi-supervised methods like FixMatch (83.34%) and MixMatch (88.35%) [19] deal effectively with limited labeled datasets, a common issue in cytology; however, their reliance on high-quality pseudo-labels can be problematic when initial predictions are inaccurate. Slideflow [21] integrated real-time whole-slide visualization, achieving 90.20% accuracy, but its computational demands make it challenging for large-scale deployment. With adaptive stain separation and contrastive learning, CLASS-M [19] achieved 92.13% accuracy, showing strength in addressing variations in stain intensity and structure, yet it depends heavily on extensive labeled datasets. Finally, SHISRCNet [22], which focuses on super-resolution and classification for low-resolution breast cancer images, attained the highest accuracy of 97.82% on the BreakHis dataset, demonstrating the potential of super-resolution techniques in enhancing classification performance.
As indicated in Table 1, the best-performing deep learning model is SHISRCNet [22] on the BreakHis dataset. However, in our experience evaluating it on a self-constructed dataset collected from the 108 Military Central Hospital, Vietnam, this model does not work well. Thus, we pursued a different direction: instead of developing a model specific to Diff-Quick images, we adapted a successful lung disease classification model to classify thyroid cytology images following the Bethesda system, a widely accepted system for classifying cell abnormalities in medical research.
The model we decided to build on is DarkCovidNet, a variant reported to be about three percent more efficient than the original DarkNet architecture [17]. To adapt this model to Diff-Quick images, in which thyroid tumor cells are small, we derived from it a new model with fewer layers that is better suited to extracting the characteristics of small/tiny cells across the images.
Specifically, the proposed adapted model employs fewer layers and filters, with filter counts ranging from eight to thirty-two, and includes nineteen convolutional layers and five pooling layers. Each DarkNet layer consists of a convolutional layer followed by BatchNorm and LeakyReLU operations; in the multi-layer (three-Conv) modules, this setup is repeated three times in succession. Batch normalization standardizes the inputs and offers additional benefits, such as reducing training time and improving model stability. An average-pooling (Avgpool) layer down-samples the feature maps, and a Softmax layer produces the model's final output, a probability distribution over the three classes.
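To make this layer composition concrete, the following PyTorch sketch shows one way to write the DarkNet-style unit (convolution, then BatchNorm, then LeakyReLU) and the three-layer variant described above; the function names, default kernel sizes, and LeakyReLU slope of 0.1 are illustrative assumptions rather than the exact configuration of the published model.

```python
import torch.nn as nn

def dark_layer(in_ch, out_ch, kernel_size=3):
    """One DarkNet-style unit: Conv -> BatchNorm -> LeakyReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )

def triple_dark_layer(in_ch, out_ch):
    """Three units in succession (3x3 -> 1x1 -> 3x3), as in the multi-layer modules."""
    return nn.Sequential(
        dark_layer(in_ch, out_ch, 3),
        dark_layer(out_ch, in_ch, 1),
        dark_layer(in_ch, out_ch, 3),
    )
```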

3. Cross-Domain Approach for Automated Thyroid Classification

3.1. Constructing the Database Being Diff-Quick Stained Images

The process of constructing the DQ image database is authorized under the regulations of the research project of Vietnam National University, Hanoi. First, doctors collect thyroid cell samples from patients to assess the cancer process based on the Bethesda scale. After collecting the biopsy samples, the cells are taken to the staining process. The Diff-Quik staining process is fundamental for producing detailed cytological images [23]. This procedure starts with air-drying the smear followed by fixation in “Diff-Quik” Fixative (or methanol) for 30 s/drain. Next, it is stained with “Diff-Quik” solution II for 30 s/drain, then counterstained with Diff-Quik solution I for 30 s/drain and rinsed in tap water to remove excess stain. Finally, the sections are dehydrated in absolute alcohol, cleared, and mounted with a coverslip using a resinous medium. This results in high-quality images that clearly delineate cellular and tissue structures for an accurate diagnostic interpretation.
After the standardized staining process is completed, doctors and medical imaging specialists capture and examine the cell samples under an Olympus DP27 microscope with magnifications ranging from 10 to 40 times. The next step is data labeling. This step aims to assess the impact level of the patient’s cells based on the Bethesda scale, ranging from Level 3 to Level 6, to diagnose whether the patient requires surgical intervention. The labeling is conducted independently under the supervision of an expert with over 20 years of experience and five other doctors with 5 to 10 years of experience. The doctors label the samples based on the characteristics and features of the cells according to the specifications and standards of the Bethesda scale. The results include 195 samples in B3, 279 in B4, 412 in B5, and 211 in B6, with each image having a variable size ranging from approximately 1024 × 768 to 1224 × 960. Specifically, this research focuses on the three main scales: B4, B5, and B6. The reason is that these stages represent the progression of the disease, where patients experience more severe conditions and more significant difficulties in cancer treatment.

3.2. Pre-Processing Diff-Quick Stained Images

Diff-Quick stained images are typically large and of high quality, and the tiny cells they contain carry essential diagnostic characteristics. Using images at their original size makes training a deep learning model very slow, so the input of a deep learning-based model is usually 256 × 256, whereas Diff-Quick images range from about 1024 × 768 to 1224 × 960. Resizing from these sizes down to 256 × 256 or 512 × 512 can therefore lose necessary information at the cell level. Thus, in this paper, we evaluate how well interpolation methods maintain significant cell-level characteristics when resizing Diff-Quick thyroid cytology images. Two interpolations are evaluated: linear and cubic interpolation.
Linear interpolation and cubic interpolation are two common methods for resizing images, particularly useful for Diff-Quick stained images, where preserving cell-level details is critical. Linear interpolation works by estimating the intensity of a new pixel based on the four closest pixels in the original image. It calculates the new pixel’s intensity as a weighted average of these neighboring pixel values, where the weights depend on how close the new pixel is to each neighbor. Mathematically, the intensity I ( x , y ) of a pixel at position ( x , y ) in the resized image is given by
$$I(x, y) = \sum_{i=0}^{1} \sum_{j=0}^{1} w_x(i) \cdot w_y(j) \cdot I\big(\lfloor x' \rfloor + i,\; \lfloor y' \rfloor + j\big),$$

where $(x', y')$ is the corresponding position in the original image obtained from the scaling factors $s_x$ and $s_y$; $\lfloor x' \rfloor, \lfloor y' \rfloor$ are the top-left neighboring pixel coordinates in the original image; $w_x(i) = 1 - |x' - (\lfloor x' \rfloor + i)|$ and $w_y(j) = 1 - |y' - (\lfloor y' \rfloor + j)|$ are the weights based on the horizontal and vertical distances, respectively; and $I(\lfloor x' \rfloor + i, \lfloor y' \rfloor + j)$ is the intensity value of a neighboring pixel in the original image.
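As a minimal illustration of this rule, the NumPy sketch below evaluates one output pixel from its four neighbors; the function name and the toy 4 × 4 image are ours, and a real pipeline would use a library routine instead.

```python
import numpy as np

def bilinear_pixel(img, x_src, y_src):
    """Bilinear estimate of intensity at fractional position (x_src, y_src)."""
    x0, y0 = int(np.floor(x_src)), int(np.floor(y_src))   # top-left neighbor
    value = 0.0
    for i in range(2):
        for j in range(2):
            wx = 1.0 - abs(x_src - (x0 + i))               # horizontal weight
            wy = 1.0 - abs(y_src - (y0 + j))               # vertical weight
            value += wx * wy * img[y0 + j, x0 + i]
    return value

img = np.arange(16, dtype=float).reshape(4, 4)
print(bilinear_pixel(img, 1.5, 2.25))   # weighted average of the 4 neighbors: 10.5
```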
In general, linear interpolation is particularly effective for small resizing tasks, balancing speed and accuracy well. However, cubic interpolation, also called bicubic interpolation in two-dimensional image resizing, provides a smoother and more accurate result compared to linear interpolation. It calculates the intensity of a new pixel by considering the values of 16 neighboring pixels (arranged in a 4 × 4 grid) around the target pixel in the original image. The intensity I ( x , y ) of a pixel at position ( x , y ) in the resized image is given by
$$I(x, y) = \sum_{i=-1}^{2} \sum_{j=-1}^{2} w(i, \alpha)\, w(j, \beta)\, I\big(\lfloor x' \rfloor + i,\; \lfloor y' \rfloor + j\big),$$

where $(x', y')$ and $(\lfloor x' \rfloor, \lfloor y' \rfloor)$ are the corresponding position in the original image and the top-left pixel of the $4 \times 4$ grid, respectively; $\alpha = x' - \lfloor x' \rfloor$ and $\beta = y' - \lfloor y' \rfloor$ are the fractional parts of the scaled coordinates, indicating the relative position of the target pixel; $w(i, \alpha)$ and $w(j, \beta)$ are cubic weight functions determining the contribution of neighboring pixels based on their distances; and $I(\lfloor x' \rfloor + i, \lfloor y' \rfloor + j)$ are the intensity values of the neighboring pixels in the original image.
The cubic weight function $w(i, \alpha)$ (and analogously $w(j, \beta)$) is defined as follows, with $a$ a constant that controls the smoothness of the interpolation:

$$w(i, \alpha) = \begin{cases} (a + 2)\,|\alpha|^{3} - (a + 3)\,|\alpha|^{2} + 1, & \text{if } |\alpha| \le 1, \\ a\,|\alpha|^{3} - 5a\,|\alpha|^{2} + 8a\,|\alpha| - 4a, & \text{if } 1 < |\alpha| \le 2, \\ 0, & \text{otherwise.} \end{cases}$$
Compared to linear interpolation, cubic interpolation produces smoother results by minimizing sharp edges and distortions. It is therefore well suited to resizing applications requiring high-quality outputs, such as Diff-Quick stained images, where preserving fine details is essential. Although cubic interpolation is computationally more expensive than linear interpolation, owing to the larger neighborhood and the complexity of the weight calculations, preserving the characteristics of small tumors matters more for diagnosis than saving computation time.
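In practice, both resampling schemes are available in standard imaging libraries. The snippet below, which assumes OpenCV and an illustrative file name, downsizes a slide image to 256 × 256 with linear and with cubic interpolation so the two outputs can be compared side by side, as in Figure 1.

```python
import cv2

# Load a Diff-Quick stained image (file name is illustrative).
img = cv2.imread("thyroid_sample.png")               # e.g., a 1224 x 960 input

# Resize to the network input size with the two interpolation schemes.
linear = cv2.resize(img, (256, 256), interpolation=cv2.INTER_LINEAR)
cubic = cv2.resize(img, (256, 256), interpolation=cv2.INTER_CUBIC)

cv2.imwrite("resized_linear.png", linear)
cv2.imwrite("resized_cubic.png", cubic)
```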
Figure 1 presents the results of applying two interpolation methods on the original image with the size 1224 × 960 to obtain a smaller size of 256 × 256.
Figure 1 shows the results of linear and cubic interpolation applied to a cropped region. The red boxes highlight an area of particular interest where interpolation quality directly affects the perceptual clarity of fine cellular structures.
In the Linear Interpolation result, the red-marked region reveals visible artifacts—most notably, unnatural banding or striping patterns. These are caused by the inherent limitations of 2D linear interpolation, which independently interpolates across horizontal and vertical axes. As a result, it fails to capture the smooth gradient transitions in diagonal or curved regions, leading to a stepped or misaligned appearance that deviates from the true structure. Conversely, the Cubic Interpolation result exhibits significantly smoother and more continuous transitions in the same region. By leveraging information from a broader neighborhood of pixels (4 × 4), cubic interpolation better preserves both edge continuity and subtle morphological variations, producing a more faithful reconstruction of cellular boundaries.
This qualitative comparison confirms that cubic interpolation is more suitable for resizing cytological images where fine detail preservation is crucial, particularly near curved membranes and textured areas.

3.3. Cross-Domain Approach for Diff-Quick Images Classification

The DarkCovidNet model [17] performs excellently in classifying lung conditions from chest X-ray images. However, while well suited to chest X-rays, the structure of the DarkCovidNet model is overly deep and computationally demanding for Diff-Quick stained thyroid cytology data. We therefore adapted this model for automated thyroid classification and implemented targeted architectural modifications based on theoretical considerations and empirical observations.
This section presents more details on adapting and modifying this successful model to classify cell images based on the Bethesda system using Diff-Quick stained images. Figure 2 presents the flowchart of the proposed model. The preprocessing stage is described in detail in Section 3.2. Here, we focus on modifying a model that can classify thyroid carcinoma.
The transition from classifying lung diseases to classifying thyroid cancer requires modifications to the model so that it aligns with the new data type and classification criteria. The essential modifications we propose are as follows:
1.
The initial convolutional layer’s feature maps were increased from 8 to 16 to enhance the early-stage capture of subtle, small-scale cellular details typical of thyroid cytology images.
2.
Second, instead of using two initial single-layer convolutions, the second convolutional operation was replaced by a more expressive three-layer convolutional module (consisting of a 3 × 3 convolution, a 1 × 1 convolution, and another 3 × 3 convolution), enabling the network to capture richer spatial and structural patterns earlier in the network. Additionally, the deeper convolutional modules operating at higher channel counts (128 and 256 filters) and associated bottleneck layers were removed to reduce model complexity and redundancy. Specifically, the revised network retains only two multi-layer convolutional modules (from 16 to 32 channels and 32 to 64 channels, respectively).
3.
After these modules, a single standard convolutional layer (with 128 filters) is introduced, rather than employing further complex convolutional blocks or bottleneck convolutions. The number of max-pooling layers (five) was preserved, maintaining the downsampling necessary to efficiently aggregate spatial features, resulting in a final spatial resolution of 16 × 16 with 128 feature channels.
4.
Lastly, the flattened and fully connected layers were adjusted to match the new feature map sizes, ensuring the model’s final output remains consistent.
Empirical analyses supported these architectural decisions. For instance, layer activation visualizations indicated redundancy in certain deeper convolutions, and gradient flow analyses identified candidate layers for removal due to low parameter updates. Comparative experiments demonstrated that these simplifications and reductions in unnecessary convolutional depth improved model efficiency and reduced overfitting without compromising classification performance. Thus, these modifications collectively enhance the model’s ability to generalize well on Diff-Quick stained images, providing better performance and efficiency for the classification task. The improved model is robust and capable of handling the new data characteristics, leading to more accurate and reliable results.
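For concreteness, the sketch below assembles these modifications into a small PyTorch module, reusing the dark_layer and triple_dark_layer helpers sketched in Section 2. It is an assumption-laden sketch, not the exact published network: kernel sizes, padding, and the exact placement of pooling are not fully specified in the text, and the pooling count here is chosen so that a 256 × 256 input actually yields the 16 × 16 × 128 feature map described above.

```python
import torch
import torch.nn as nn

class ThyroidEDCN(nn.Module):
    """Sketch of the adapted DarkCovidNet-style classifier for B4/B5/B6."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            dark_layer(3, 16),                # wider first layer: 8 -> 16 filters
            nn.MaxPool2d(2),                  # 256 -> 128
            triple_dark_layer(16, 32),        # first multi-layer module
            nn.MaxPool2d(2),                  # 128 -> 64
            triple_dark_layer(32, 64),        # second multi-layer module
            nn.MaxPool2d(2),                  # 64 -> 32
            dark_layer(64, 128),              # single standard 128-filter convolution
            nn.MaxPool2d(2),                  # 32 -> 16, giving a 16 x 16 x 128 map
        )
        # Softmax is applied implicitly by the cross-entropy loss during training.
        self.classifier = nn.Linear(128 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))
```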
The loss function optimized during training is the categorical cross-entropy, defined as

$$\mathrm{loss} = -\frac{1}{N} \sum_{n=1}^{N} \sum_{c=0}^{C-1} y_{n,c}\, \log \frac{\exp(\hat{X}_{n,c})}{\sum_{i=0}^{C-1} \exp(\hat{X}_{n,i})},$$

where $y_{n,c}$ is the one-hot label, $\hat{X}_{n,c}$ is the predicted logit of sample $n$ for class $c$, $N$ is the minibatch size, and $C$ is the number of classes.
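Because the models are implemented in PyTorch (Section 4), this objective corresponds to the library's standard cross-entropy over raw logits; the following minimal example uses made-up tensor shapes for a batch of 16 samples and C = 3 classes.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()                 # log-softmax + negative log-likelihood

logits = torch.randn(16, 3, requires_grad=True)   # N = 16 samples, C = 3 classes (B4, B5, B6)
targets = torch.randint(0, 3, (16,))              # integer class labels
loss = criterion(logits, targets)
loss.backward()                                   # gradients for the optimizer step
```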

4. Results

The dataset for this research was obtained from the 108 Military Central Hospital. Doctors labeled the data according to Bethesda scale types B4, B5, and B6. The sizes of the images are not uniform because they were captured at different focal lengths and with different devices. Therefore, the data were standardized to 256 × 256 using cubic and/or linear interpolation before being input to the deep learning model for cell classification over the three labels B4, B5, and B6. Figure 3 presents examples of the three Bethesda types, B4, B5, and B6, corresponding to the images from left to right. In general, the B4 image contains small purple cells that are not as dark as those in B5. In B6 in particular, the cells are dark, interspersed, and overlapping, forming a large region.
Our experiments are designed to assess the influence of the interpolation method used for image resizing and of the modified DarkCovidNet model. To this end, we structured the experiments as follows. The first configuration, L-DCN, resizes the original dataset using linear interpolation and trains the original DarkCovidNet model. The second, C-DCN, resizes images with cubic interpolation and trains the original DarkCovidNet model on this dataset. The third, L-EDCN, combines linear interpolation with the improved DarkCovidNet model. The last, C-EDCN, combines cubic interpolation with the improved DarkCovidNet model.
Furthermore, to ensure the fairness and robustness of the experiments, the four configurations use identically sized training, validation, and test sets, with a 70:15:15 ratio. Specifically, in the B4 category, we used 195 images for training, 42 for validation, and 42 for testing. For B5, the split includes 288 training images, 62 validation images, and 62 test images. Similarly, in the B6 category, we assigned 148 images for training, 32 for validation, and 32 for testing.
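A class-wise 70:15:15 split such as the one above can be reproduced with scikit-learn's stratified splitting; the snippet below is only an illustration, with placeholder file names and labels standing in for the real slide collection.

```python
from sklearn.model_selection import train_test_split

# Placeholder data: paths and 0/1/2 labels standing in for B4/B5/B6 images.
paths = [f"img_{i}.png" for i in range(100)]
labels = [i % 3 for i in range(100)]

# First split off 30%, then halve it into validation and test sets.
train_p, rest_p, train_y, rest_y = train_test_split(
    paths, labels, test_size=0.30, stratify=labels, random_state=42)
val_p, test_p, val_y, test_y = train_test_split(
    rest_p, rest_y, test_size=0.50, stratify=rest_y, random_state=42)

print(len(train_p), len(val_p), len(test_p))   # 70, 15, 15
```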
Precision, Recall, and F1 Score are used to measure the performance of the experiments. Precision measures the proportion of actual positive results out of all the test's positive results, while Recall measures the proportion of true positive results out of all positive cases. The F1 Score balances the two. True Positives are correctly identified cases, True Negatives are correctly identified non-cases, False Positives are non-cases incorrectly identified as cases, and False Negatives are cases incorrectly identified as non-cases. An F1 Score close to one indicates good classification results.
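Formally, writing TP, FP, and FN for the per-class counts of true positives, false positives, and false negatives just described, these metrics are computed as

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$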
Through experiments, the hyperparameters for both the original and improved DarkCovidNet models were set as follows: a learning rate of 3 × 10⁻³, a batch size of 16, and 60 epochs. In addition, both models use the RAdam optimizer with β₁ = 0.9 and β₂ = 0.999. The construction, training, and evaluation processes were all performed using the PyTorch deep learning framework, version 2.2.1. The models were run on an Apple MacBook Pro with an M2 Pro chip and 19-core GPU (Apple Inc., Cupertino, CA, USA). Table 2 shows the results for the various configurations.
The results in Table 2 were obtained on the 108 Military Central Hospital dataset. To comprehensively evaluate cell classification performance on the Bethesda scale, we constructed baseline models using established deep learning architectures, namely MobileNetV2 [24], ResNet18 [25], and SHISRCNet. This choice ensures a fair comparison between lightweight and deeper models with different architectural complexities.
This table demonstrates the significant improvements made by the enhanced DarkCovidNet model, particularly when coupled with cubic interpolation (C-EDCN). Specifically, the original DarkCovidNet (DCN) model achieved F1 Scores ranging from approximately 0.62 to 0.65 across B4, B5, and B6. Compared to the other methods, these results are lower and less effective because the model has not been fine-tuned on the new dataset. The C-EDCN model outperforms the L-DCN and C-DCN configurations across all three categories (B4, B5, B6) in terms of Precision, Recall, and F1 Score. For instance, in the B4 category, the C-EDCN model obtains a Precision of 0.83, Recall of 0.79, and F1 Score of 0.81, markedly higher than those obtained with linear interpolation and the original DarkCovidNet model (L-DCN).
To establish a fair baseline and consistent evaluation, we applied the same preprocessing pipeline used for our proposed model, including image resizing via interpolation, and all baseline models were trained under identical experimental conditions. Specifically, we created two variants of each baseline, one using linear interpolation and one using cubic interpolation for image resizing. These variants enabled a consistent and controlled comparison with the proposed C-EDCN model. In addition, all models were implemented in PyTorch and trained exclusively on the private dataset collected from the 108 Military Central Hospital, using a uniform data split of 70% for training, 15% for validation, and 15% for testing. No external pre-training was applied. The training configurations were standardized across all models, with a learning rate of 3 × 10⁻³, a batch size of 16, and 60 epochs, optimized using the Rectified Adam (RAdam) optimizer with parameters β₁ = 0.9 and β₂ = 0.999. All input images were resized to 256 × 256 pixels. The only variation in preprocessing was the interpolation method; no additional data augmentation or preprocessing techniques were introduced.
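For reference, this shared configuration can be expressed in a few lines of PyTorch. The sketch below uses random tensors in place of the real dataset and a trivial stand-in model, so only the optimizer, hyperparameters, and loop structure reflect the setup described above.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data standing in for the resized 256 x 256 slide images.
images = torch.randn(64, 3, 256, 256)
labels = torch.randint(0, 3, (64,))
train_loader = DataLoader(TensorDataset(images, labels), batch_size=16, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 256 * 256, 3))   # stand-in classifier
criterion = nn.CrossEntropyLoss()
optimizer = optim.RAdam(model.parameters(), lr=3e-3, betas=(0.9, 0.999))

for epoch in range(60):                      # 60 epochs, batch size 16
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```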
The results in Table 2 show that the C-EDCN model is the only model that achieves consistent and high performance in all three scales (B4, B5, B6). Specifically, C-EDCN achieved F1-scores of 0.81 at B4, 0.81 at B5, and 0.82 at B6, demonstrating outstanding stability across classes, especially for the two more difficult classes B5 and B6.
MobileNetV2 is often considered a compact model, but it did not perform competitively in our experiments. Its F1 Scores remained low and inconsistent across all three classes, particularly under linear interpolation—with values ranging from 0.55 to 0.59 for B4, 0.60 to 0.63 for B5, and 0.62 to 0.65 for B6. These results indicate that MobileNetV2’s architecture may not be well suited to the specific visual characteristics of Diff-Quick-stained cytology images.
C-ResNet18 achieves high Precision (0.80) and Recall (0.94) for class B4, with an F1 Score of up to 0.86; however, its performance in the other two classes drops significantly (only 0.73 in B5 and 0.71 in B6). These results indicate that the deeper residual architecture is highly effective at capturing the complex morphological patterns present in B4, but this depth comes at a cost. The reduced performance in B5 and B6 suggests that ResNet18 has difficulty generalizing to more variable patterns, illustrating the trade-off between model capacity and class-wise balance in deeper networks.
SHISRCNet, developed initially as a joint super-resolution and classification model for low-resolution breast cancer images, was adapted in this study by excluding the super-resolution component and retraining only the classification branch on our high-resolution cytology dataset. Despite its architectural clarity and prior histopathological success, SHISRCNet achieved only moderate results, with F1-scores ranging from 0.67 to 0.71. These findings suggest that the model is less capable of handling the more irregular and variable patterns found in Diff-Quick-stained cytological images.
The experimental analysis also shows that the choice of image interpolation method significantly affects classification performance. In all tested configurations, cubic interpolation consistently outperformed linear interpolation, which is crucial for accurate cytological classification.
Furthermore, examining model complexity, ResNet18, MobileNetV2, and SHISRCNet have 11.7, 2.22, and 0.62 million parameters, respectively. While SHISRCNet (0.62 million parameters) is the lightest of these three baselines, it still contains significantly more parameters—approximately 2.8 times—than our proposed C-EDCN model, which has only 0.22 million. This underscores C-EDCN’s efficiency in model size, making it suitable for practical cytopathological applications, particularly in scenarios with limited computational resources.
Overall, these findings confirm the effectiveness of the proposed C-EDCN model. Its performance is marked not only by high F1 Scores but also by inter-class stability and balance between evaluation metrics. Importantly, it achieves this without requiring excessive model complexity or parameter numbers. The comparison highlights the C-EDCN model’s robustness, particularly with cubic interpolation, in maintaining the essential characteristics of DQ-stained images, leading to more accurate and reliable classification results.

5. Conclusions

In this paper, we proposed a cross-domain approach to classify thyroid images automatically. In more detail, we adapted and enhanced the DarkCovidNet model, originally designed for X-ray classification, to classify thyroid Bethesda categories from Diff-Quick stained images. We constructed a new dataset to evaluate the performance of the proposed models, and we evaluated the effectiveness of various interpolation methods for resizing images, concluding that cubic interpolation is particularly suitable for Diff-Quick stained images because it preserves small, crucial details.
While our dataset is clinically valuable and representative of real-world Diff-Quick stained thyroid cytology images, a potential limitation is that it was collected from a single institution. Although this ensures consistency in staining and imaging protocols, it may not fully capture the diversity present across different clinical settings. Additionally, the sample size, while sufficient for our experimental evaluation, could be expanded in future studies to enhance the generalizability and robustness of the proposed model. In future work, we plan to collect data from diverse sources and to develop models that learn richer tumor characteristics directly from the images.

Author Contributions

Conceptualization, T.-H.D. and V.-D.N.; methodology, T.-H.D. and H.L.; validation, H.L. and M.-H.H.D.; formal analysis, T.-H.D.; investigation, H.L.; data curation, V.-D.N. and T.-H.D.; writing—original draft preparation, T.-H.D., H.L. and M.-H.H.D.; writing—review and editing, T.-H.D.; visualization, M.-H.H.D.; supervision, T.-H.D.; editing, review and revision P.D.; project administration, T.-H.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by research project QG.23.71 of Vietnam National University, Hanoi.

Data Availability Statement

The datasets analyzed during the current study are available from the corresponding author, Thanh-Ha Do, and her colleague, Van-De Nguyen, upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cibas, E.S.; Ali, S.Z. The 2017 Bethesda system for reporting thyroid cytopathology. Thyroid 2017, 27, 1341–1346. [Google Scholar] [CrossRef]
  2. Pusztaszeri, M.; Brimo, F.; Wang, C.; Sekhon, H.; Al-Nourhji, O.; Fischer, G.; Zeman-Pocrnich, C. The Bethesda System for Reporting Thyroid Cytopathology: Summary Guidelines of the Canadian Society of Cytopathology. 2019. Available online: https://cytopathology.ca/wp-content/uploads/2019/07/Bethesda-system-memo.pdf (accessed on 15 April 2025).
  3. Kharya, S.; Soni, S. Weighted naive bayes classifier: A predictive model for breast cancer detection. Int. J. Comput. Appl. 2016, 133, 32–37. [Google Scholar] [CrossRef]
  4. Pietikäinen, M. Image analysis with local binary patterns. In Proceedings of the Image Analysis: 14th Scandinavian Conference, SCIA 2005, Joensuu, Finland, 19–22 June 2005; Proceedings 14. Springer: Berlin/Heidelberg, Germany, 2005; pp. 115–118. [Google Scholar]
  5. De Siqueira, F.R.; Schwartz, W.R.; Pedrini, H. Multi-scale gray level co-occurrence matrices for texture description. Neurocomputing 2013, 120, 336–345. [Google Scholar] [CrossRef]
  6. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571. [Google Scholar]
  7. Spanhol, F.A.; Oliveira, L.S.; Petitjean, C.; Heutte, L. A Dataset for Breast Cancer Histopathological Image Classification. IEEE Trans. Biomed. Eng. 2016, 63, 1455–1462. [Google Scholar] [CrossRef]
  8. Aresta, G.; Araújo, T.; Kwok, S.; Chennamsetty, S.S.; Safwan, M.; Alex, V.; Marami, B.; Prastawa, M.; Chan, M.; Donovan, M.; et al. Bach: Grand challenge on breast cancer histology images. Med. Image Anal. 2019, 56, 122–139. [Google Scholar] [CrossRef]
  9. Caicedo, J.C.; Lazebnik, S. Active Object Localization with Deep Reinforcement Learning. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015. [Google Scholar]
  10. Long, N.T.; Huong, T.T.; Bao, N.N.; Binh, H.T.T.; Le Nguyen, P.; Nguyen, K. Q-learning-based distributed multi-charging algorithm for large-scale WRSNs. Nonlinear Theory Its Appl. IEICE 2023, 14, 18–34. [Google Scholar] [CrossRef]
  11. Kazerouni, A.; Aghdam, E.K.; Heidari, M.; Azad, R.; Fayyaz, M.; Hacihaliloglu, I.; Merhof, D. Diffusion Models for Medical Image Analysis: A Comprehensive Survey. arXiv 2022, arXiv:2211.07804. [Google Scholar]
  12. Shin, H.C.; Roth, H.R.; Gao, M.; Lu, L.; Xu, Z.; Nogues, I.; Yao, J.; Mollura, D.; Summers, R.M. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans. Med Imaging 2016, 35, 1285–1298. [Google Scholar] [CrossRef]
  13. Kim, H.E.; Cosa-Linan, A.; Santhanam, N.; Jannesari, M.; Maros, M.E.; Ganslandt, T. Transfer learning for medical image classification: A literature review. BMC Med. Imaging 2022, 22, 69. [Google Scholar] [CrossRef]
  14. Tao, S.; Guo, Y.; Zhu, C.; Chen, H.; Zhang, Y.; Yang, J.; Liu, J. Highly efficient follicular segmentation in thyroid cytopathological whole slide image. In Precision Health and Medicine; Springer: Berlin/Heidelberg, Germany, 2019; pp. 149–157. [Google Scholar]
  15. Slabaugh, G.; Beltran, L.; Rizvi, H.; Deloukas, P.; Marouli, E. Applications of machine and deep learning to thyroid cytology and histopathology: A review. Front. Oncol. 2023, 13, 958310. [Google Scholar] [CrossRef]
  16. Dov, D.; Kovalsky, S.Z.; Feng, Q.; Assaad, S.; Cohen, J.; Bell, J.; Henao, R.; Carin, L.; Range, D.E. Use of machine learning–based software for the screening of thyroid cytopathology whole slide images. Arch. Pathol. Lab. Med. 2022, 146, 872–878. [Google Scholar] [CrossRef]
  17. Ozturk, T.; Talo, M.; Yildirim, E.A.; Baloglu, U.B.; Yildirim, O.; Acharya, U.R. Automated detection of COVID-19 cases using deep neural networks with X-ray images. Comput. Biol. Med. 2020, 121, 103792. [Google Scholar] [CrossRef]
  18. Yang, H.; Chen, C.; Jiang, M.; Liu, Q.; Cao, J.; Heng, P.A.; Dou, Q. Dltta: Dynamic learning rate for test-time adaptation on cross-domain medical images. IEEE Trans. Med. Imaging 2022, 41, 3575–3586. [Google Scholar] [CrossRef]
  19. Zhang, B.; Manoochehri, H.; Ho, M.M.; Fooladgar, F.; Chong, Y.; Knudsen, B.S.; Sirohi, D.; Tasdizen, T. CLASS-M: Adaptive stain separation-based contrastive learning with pseudo-labeling for histopathological image classification. arXiv 2023, arXiv:2312.06978. [Google Scholar]
  20. Sengupta, S.; Brown, D.E. Automatic Report Generation for Histopathology images using pre-trained Vision Transformers. arXiv 2023, arXiv:2311.06176. [Google Scholar]
  21. Dolezal, J.M.; Kochanny, S.; Dyer, E.; Ramesh, S.; Srisuwananukorn, A.; Sacco, M.; Howard, F.M.; Li, A.; Mohan, P.; Pearson, A.T. Slideflow: Deep learning for digital histopathology with real-time whole-slide visualization. BMC Bioinform. 2024, 25, 134. [Google Scholar] [CrossRef]
  22. Xie, L.; Li, C.; Wang, Z.; Zhang, X.; Chen, B.; Shen, Q.; Wu, Z. Shisrcnet: Super-resolution and classification network for low-resolution breast cancer histopathology image. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Vancouver, BC, Canada, 8–12 October 2023; Lecture Notes in Computer Science. Springer: Cham, Switzerland, 2023. [Google Scholar]
  23. Silverman, J.F.; Frable, W.J. The use of the Diff-Quik stain in the immediate interpretation of fine-needle aspiration biopsies. Diagn. Cytopathol. 1990, 6, 366–369. [Google Scholar] [CrossRef]
  24. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
  25. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
Figure 1. Comparison zooms of Diff-Quick-stained images: original, resized with linear and cubic interpolation. The red box indicates the zoomed-in region shown in different interpolations.
Figure 2. The architecture of the proposed model.
Figure 3. Three types of detection. (a) B4; (b) B5; (c) B6.
Table 1. The performance of some deep learning models on classification tasks across different databases.
Method | Dataset | Accuracy
Without Adaptation [18] | Circus image | 63.15%
ResNet [19] | TCGA | 72.11%
ResNet (Freeze) [19] | TCGA | 73.17%
Vision Transformer (Freeze) [19] | TCGA | 73.17%
Vision Transformer [19] | TCGA | 73.50%
HIPT-BERT [20] | GTEx | 82.06%
FixMatch [19] | TCGA | 83.34%
MixMatch [19] | TCGA | 88.35%
Slideflow [21] | TCGA | 90.20%
CLASS-M [19] | TCGA | 92.13%
SHISRCNet [22] | BreakHis | 97.82%
Table 2. Result obtained using DarkCovidNet on the 108 Military Central Hospital datasets. Bold values are the best results.
Model | B4 (Precision / Recall / F1) | B5 (Precision / Recall / F1) | B6 (Precision / Recall / F1)
L-DCN | 0.62 / 0.67 / 0.64 | 0.60 / 0.72 / 0.65 | 0.90 / 0.44 / 0.60
C-DCN | 0.70 / 0.68 / 0.69 | 0.65 / 0.70 / 0.67 | 0.65 / 0.70 / 0.67
L-EDCN | 0.75 / 0.72 / 0.73 | 0.70 / 0.75 / 0.72 | 0.80 / 0.70 / 0.75
C-EDCN | 0.83 / 0.79 / 0.81 | 0.80 / 0.82 / 0.81 | 0.83 / 0.81 / 0.82
L-SHISRCNet | 0.70 / 0.70 / 0.70 | 0.65 / 0.72 / 0.68 | 0.75 / 0.60 / 0.67
C-SHISRCNet | 0.72 / 0.70 / 0.71 | 0.68 / 0.72 / 0.70 | 0.70 / 0.72 / 0.71
L-ResNet18 | 0.75 / 0.91 / 0.82 | 0.62 / 0.74 / 0.68 | 0.83 / 0.62 / 0.71
C-ResNet18 | 0.80 / 0.94 / 0.86 | 0.64 / 0.84 / 0.73 | 0.88 / 0.59 / 0.71
L-MobileNetV2 | 0.70 / 0.45 / 0.55 | 0.50 / 0.72 / 0.59 | 0.78 / 0.65 / 0.71
C-MobileNetV2 | 0.74 / 0.50 / 0.59 | 0.54 / 0.71 / 0.61 | 0.80 / 0.70 / 0.74
