Article

Enhanced HoVerNet Optimization for Precise Nuclei Segmentation in Diffuse Large B-Cell Lymphoma

by Gei Ki Tang 1, Chee Chin Lim 1,2,*, Faezahtul Arbaeyah Hussain 3,4, Qi Wei Oung 1,5, Aidy Irman Yajid 3,4, Sumayyah Mohammad Azmi 3,4 and Yen Fook Chong 2

1 Faculty of Electronic Engineering and Technology, Universiti Malaysia Perlis, Arau 02600, Perlis, Malaysia
2 Sport Engineering Research Centre, Universiti Malaysia Perlis, Arau 02600, Perlis, Malaysia
3 Department of Pathology, Hospital Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
4 Department of Pathology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
5 Advanced Communication Engineering (ACE) Centre of Excellence, Universiti Malaysia Perlis, Arau 02600, Perlis, Malaysia
* Author to whom correspondence should be addressed.
Diagnostics 2025, 15(15), 1958; https://doi.org/10.3390/diagnostics15151958
Submission received: 22 June 2025 / Revised: 14 July 2025 / Accepted: 25 July 2025 / Published: 4 August 2025
(This article belongs to the Special Issue Artificial Intelligence-Driven Radiomics in Medical Diagnosis)

Abstract

Background/Objectives: Diffuse Large B-Cell Lymphoma (DLBCL) is the most common subtype of non-Hodgkin lymphoma and demands precise segmentation and classification of nuclei for effective diagnosis and disease severity assessment. This study aims to evaluate the performance of HoVerNet, a deep learning model, for nuclei segmentation and classification in CMYC-stained whole slide images and to assess its integration into a user-friendly diagnostic tool. Methods: A dataset of 122 CMYC-stained whole slide images (WSIs) was used. Pre-processing steps, including stain normalization and patch extraction, were applied to improve input consistency. HoVerNet, a multi-branch neural network, was used for both nuclei segmentation and classification, particularly focusing on its ability to manage overlapping nuclei and complex morphological variations. Model performance was validated using metrics such as accuracy, precision, recall, and F1 score. Additionally, a graphic user interface (GUI) was developed to incorporate automated segmentation, cell counting, and severity assessment functionalities. Results: HoVerNet achieved a validation accuracy of 82.5%, with a precision of 85.3%, recall of 82.6%, and an F1 score of 83.9%. The model demonstrated strong performance in differentiating overlapping and morphologically complex nuclei. The developed GUI enabled real-time visualization and diagnostic support, enhancing the efficiency and usability of DLBCL histopathological analysis. Conclusions: HoVerNet, combined with an integrated GUI, presents a promising approach for streamlining DLBCL diagnostics through accurate segmentation and real-time visualization. Future work will focus on incorporating Vision Transformers and additional staining protocols to improve generalizability and clinical utility.

1. Introduction

Accurate segmentation and classification of nuclei are crucial in the analysis of Diffuse Large B-Cell Lymphoma (DLBCL) tissue images [1]. Proper identification and quantification of tumor cells can significantly impact diagnosis and treatment planning. Traditional methods of nuclei segmentation and classification often struggle with challenges such as tissue heterogeneity, staining variability, and complex cellular interactions, leading to slow, labor-intensive processes prone to errors. These challenges highlight the need for more efficient, automated approaches to enhance diagnostic accuracy.
Deep learning, particularly convolutional neural networks (CNNs), has revolutionized the field of medical image analysis. CNNs are proficient at learning intricate patterns from data, making them highly effective in segmenting and classifying nuclei, even under difficult conditions. These networks excel in extracting relevant features such as nucleus shape, size, and texture, enabling accurate tissue analysis. However, while CNNs have shown great promise, existing techniques still face limitations in handling complex cases, such as overlapping nuclei and varying tissue morphologies, especially in diseases like DLBCL.
HoVerNet, an advanced deep learning architecture, offers a promising solution to these challenges. This method employs multi-branch processing to perform both segmentation and classification simultaneously, making it highly effective for the analysis of clustered, heterogeneous nuclei. HoVerNet’s ability to extract precise morphological features, such as nuclear area, perimeter, and shape, allows for more accurate tumor cell quantification. It also opens the door to the development of new prognostic markers and improvements in diagnostic accuracy, addressing the gaps left by traditional methods.
This research aims to leverage HoVerNet’s capabilities to enhance the diagnosis, subtyping, and severity assessment of DLBCL. By addressing the current limitations in medical imaging techniques, this approach promises to improve the precision and efficiency of DLBCL analysis, aiding pathologists in making better-informed clinical decisions. The innovation lies in combining deep learning with tissue image analysis to create a more automated, accurate, and clinically useful framework for cancer diagnosis.

2. Literature Review

Image patches are small, square regions extracted from larger medical images for tasks such as region of interest (ROI) identification, feature extraction, and AI algorithm applications. They enhance analysis by focusing on specific areas, such as structures and textures, with patch sizes varying from single pixels to predefined windows. Basu et al. [2] captured 500 DLBCL and non-DLBCL tissue images at 40× magnification using microscope-based cameras, while El Hussein et al. [3] analyzed 256 × 256 patches from digitally stained H&E slides of CLL, aCLL, and RT cases. Wójcik et al. [4] standardized 37,665 H&E images of DLBCL lymph nodes to 448 × 448 pixels, and Li et al. [5] captured 400× magnification images from 500 labeled DLBCL tissue sections. Swiderska-Chadaj et al. [6] digitized 42 H&E DLBCL slides for external validation, and Bándi et al. [7] extracted annotated patches from six tissue types using WSIs. Shankar et al. [8] analyzed classic Hodgkin lymphoma, mantle cell lymphoma, and DLBCL cores at 40× magnification, while Swiderska-Chadaj et al. [9] derived 512 × 512 patches at 5× magnification for training. Perry et al. [10] applied a self-supervised phase on FFPE H&E-stained biopsy WSIs of aggressive B-cell lymphoma, dividing 20× or 40× images into 384 × 384 patches for analysis.
Pre-processing prepares image data for model input, reducing training time and improving inference. Techniques include orientation, resizing, grayscale conversion, and exposure adjustments to enhance image quality and feature extraction. Hamdi et al. [11] applied Gaussian filters, Laplacian filters, color normalization, and Gradient Vector Flow for feature extraction. Vrabac et al. [12] used tissue microarrays for cell nucleus extraction from H&E-stained images, while Basu et al. [2] developed attention map transformers and feature fusion for DLBCL classification. Blanc-Durand et al. [13] employed resampling, padding, cropping, and adaptive thresholding on PET and CT data to extract tumor heterogeneity features. Ferrández et al. [14] used Gaussian filtering, metabolic tumor volume, and standard uptake value metrics to analyze tumor dissemination. El Hussein et al. [3] annotated ROIs and measured nuclear contour and hull areas, while Graham et al. [15] applied Otsu thresholding, color adjustments, and textural feature extraction. Ferrández et al. [16] and Mohlman et al. [17] utilized normalization, filtering, max-pooling, ReLU operations, and edge detection in pre-processing workflows. Other studies [8,9,10,18,19,20,21] applied normalization, machine learning algorithms, filtering, and feature selection for enhanced image quality and accurate DLBCL classification.
Deep learning, especially CNNs, is highly effective in processing and classifying medical images, such as distinguishing healthy and cancerous cells in DLBCL. HoVerNet excels in nuclei segmentation and classification by integrating segmentation and classification branches, leveraging nuclear pixel distances, and enabling complex pathology analysis. Vrabac et al. [12], El Hussein et al. [3], Wójcik et al. [4], and Graham et al. [15] applied or trained HoVerNet for various tasks, with training epochs ranging from 50 to 800. Hamdi et al. [11] achieved superior performance using MobileNet-VGG-16 with an AUC of 99.43% and 99.8% accuracy on 15,000 H&E-stained WSIs. Blanc-Durand et al. [13] and Ferrández et al. [14,16] utilized 3D U-Net with Adam optimization for PET/CT scans, while Swiderska-Chadaj et al. [6] and Bándi et al. [7] also employed U-Net. Additionally, Basu et al. [2] and Vrabac et al. [12] used DenseNet-201 and ResNet-50, respectively, with optimized training parameters for DLBCL analysis. A study evaluating 17 CNN models (including VGG16) across three hospitals reported average patch-level diagnostic accuracy between 87% and 96%, demonstrating strong performance in DLBCL vs. non-DLBCL differentiation [5].
However, there are still challenges in nuclei segmentation and classification for DLBCL that need to be addressed. Many studies do not generalize across staining types such as CMYC and H&E, which limits their use. Most research also does not focus on integrating these tools into real clinical workflows or electronic medical records (EMRs), which are important for practical use. Scalability and real-time analysis are also not well explored, as many methods are not designed for large-scale use. This study addresses these gaps by demonstrating HoVerNet's performance on CMYC-stained images and by providing a simple GUI for practical use.

3. Materials and Methods

A total of 122 digital WSIs of DLBCL, comprising 61 MYC+ and 61 MYC− slides, were collected from the Department of Pathology, Hospital Universiti Sains Malaysia (Hospital USM). These WSIs were scanned at 40× magnification using the Motic EasyScan Pro digital slide scanner. The collection and use of these images followed the ethical guidelines outlined in the Declaration of Helsinki. Approval was obtained from the Jawatankuasa Etika Penyelidikan Manusia Universiti Sains Malaysia (JEPeM-USM), under the reference USM/JEPeM/22110749, ensuring compliance with ethical and legal standards. Only CMYC-stained images were used in this study due to the availability and standardization of these samples within the dataset. While this ensures consistency in analysis and training, it limits the model's generalizability to other staining protocols, such as H&E. Figure 1 shows examples of CMYC-stained whole slide images.

3.1. Lossless Image Compression

Lossless image compression reduces the image file size without losing important information, preserving data quality for analysis [22]. The process begins by loading the image with the 'PIL' library [23]; the image is opened and identified using the 'Image.open()' function. The image is then resized with the 'resize()' function, which takes two parameters: the new size of the image and the resampling filter. The new size is obtained by dividing both the width and the height by a resize factor. The resampling filter used is 'Image.LANCZOS', which produces high-quality results by smoothing the image while retaining sharpness and detail; the kernel is nonzero only within the interval (−1 < x < 1). After resizing, the image is saved to the specified output path using the 'save()' function. This step preserves the nuclei features needed for segmentation and classification while significantly reducing the computational requirements for model training and inference. The Lanczos function is defined as in (1).
$$L(x) = \begin{cases} \dfrac{\sin(\pi x)\,\sin(\pi x)}{\pi^{2}x^{2}}, & \text{if } |x| < 1 \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$
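As an illustrative sketch of this step (the file paths are placeholders, and the resize factor of 4 is an assumption that matches the dimension reduction reported in Section 3.2):

```python
from PIL import Image

def compress_wsi(input_path: str, output_path: str, resize_factor: int = 4) -> None:
    """Downscale a WSI export with the LANCZOS resampling filter (Section 3.1)."""
    image = Image.open(input_path)                    # load and identify the image
    new_size = (image.width // resize_factor,         # divide both dimensions
                image.height // resize_factor)        # by the resize factor
    resized = image.resize(new_size, Image.LANCZOS)   # high-quality resampling
    resized.save(output_path)                         # write to the output path
```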

3.2. Image Patches

Image patching is a crucial technique for localizing ROI, extracting features, reducing computational complexity, and enhancing accuracy in analyzing DLBCL images [24]. By breaking down these images into smaller patches, specific ROIs like DLBCL nuclei can be localized and analyzed. These patches allow the computation of various attributes such as color, texture, gradients, and shapes. Analyzing patches rather than the entire image reduces computational complexity, making segmentation and classification algorithms more manageable. Additionally, focusing on smaller regions allows algorithms to better handle variations in intensities, shapes, or textures, potentially leading to more accurate segmentation results. The formula for image patching is expressed as in (2). Figure 2 depicts the flow process of image patches. The original WSI, with dimensions of 24,444 × 38,899 pixels, undergoes lossless image compression, reducing its dimensions to 6111 × 9724 pixels. Following compression, the resized image is divided into square patches during the image patching process, with each patch measuring 256 × 256 pixels.
$$K = (M - W + 1) \times (N - H + 1) \qquad (2)$$
where M × N are the image dimensions in pixels, W × H are the patch dimensions, and K is the number of patches obtainable with a sliding window of unit stride.
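A minimal sketch of the non-overlapping tiling described above, naming each patch by its pixel coordinates (the file-naming scheme shown is illustrative):

```python
import os
from PIL import Image

def extract_patches(image_path: str, out_dir: str, patch_size: int = 256) -> int:
    """Tile a compressed WSI into non-overlapping 256 x 256 patches, naming each
    file by its (x, y) origin in the source image (Section 3.2)."""
    image = Image.open(image_path)
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    for top in range(0, image.height - patch_size + 1, patch_size):
        for left in range(0, image.width - patch_size + 1, patch_size):
            box = (left, top, left + patch_size, top + patch_size)
            image.crop(box).save(os.path.join(out_dir, f"x{left}_y{top}.png"))
            count += 1
    return count
```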

3.3. Normalization

Normalization is a crucial preprocessing step in medical image analysis, particularly for nuclei segmentation and classification. It enhances image quality, reduces distortions, and adjusts pixel intensity values to a consistent range [25], typically between 0 (black) and 1 (white). Normalization involves identifying and outlining the cell containing ROI within the image. This ensures better feature extraction by mitigating noise from lighting or staining variations. For grayscale images, normalization is applied to a single channel, while for RGB images, it is applied to all three channels. This step improves model stability, generalization, and accuracy, especially when dealing with dark, unevenly illuminated patches or datasets with diverse staining techniques. The normalization process is described as in (3).
$$z_i = \frac{x_i - \min(x)}{\max(x) - \min(x)} \qquad (3)$$
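A minimal NumPy sketch of Equation (3), applied per patch (and to all three channels of an RGB patch):

```python
import numpy as np

def min_max_normalize(patch: np.ndarray) -> np.ndarray:
    """Rescale pixel intensities to [0, 1] as in Equation (3)."""
    patch = patch.astype(np.float32)
    lo, hi = patch.min(), patch.max()
    if hi == lo:                       # constant patch: avoid division by zero
        return np.zeros_like(patch)
    return (patch - lo) / (hi - lo)
```

For 8-bit images whose intensities span the full 0–255 range, this reduces to the simple division by 255 used in Section 4.1.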

3.4. CNN Architecture

The dataset was split into 80% training, 10% testing, and 10% validation subsets after patch extraction, not by whole slide images. This ensured a diverse and balanced distribution of input images for the CNN models, providing sufficient data for model optimization while retaining samples for unbiased evaluation and fine-tuning (Figure 3). HoVerNet was chosen because it is specifically designed for both nuclei segmentation and classification, which is important for DLBCL analysis [4,11,15]. Its encoder-decoder structure with residual units improves feature extraction and segmentation accuracy [11]. The up-sampling branches further enhance classification by precisely predicting nucleus types, enabling the network to handle clustered and morphologically diverse nuclei in DLBCL WSIs. This study enhances the original HoVerNet framework by adapting it for CMYC-stained WSIs of DLBCL, addressing morphological and staining variations not considered in the original model. The architecture is modified for binary classification of MYC+ and MYC− nuclei, aligned with clinical relevance in lymphoma assessment. Model performance is optimized through systematic hyperparameter tuning, and a diagnostic scoring method based on MYC expression is introduced to estimate disease severity. Based on the literature, HoVerNet performs better than U-Net and ResNet for this task. U-Net is effective for segmentation but does not include classification, making it less suitable for identifying nucleus types [15]. ResNet is good at general image classification but lacks the features needed for detailed segmentation of overlapping nuclei, which is common in DLBCL tissue images. A multi-class deep learning study using VGG16 successfully distinguished between benign nodes, DLBCL, Burkitt lymphoma, and small lymphocytic lymphoma with 95% accuracy [26]. These findings from previous studies guided the decision to use HoVerNet and to compare it with VGG16 in this work. A GUI was developed to enable real-time segmentation, cell counting, and severity visualization, supporting practical integration into clinical workflows.
Figure 4 illustrates the HoVerNet architecture [11]. The model was trained for 800 epochs to ensure comprehensive learning of complex patterns in DLBCL datasets [4]. The model was implemented using PyTorch version 2.2.0, with minor modifications to support binary classification of MYC+ and MYC- nuclei. The training used the Adam optimizer with a learning rate of 1 × 10−4 and a batch size of 32 and 64. The binary cross-entropy loss function was applied to optimize performance for binary classification. Patch size was 256 × 256 pixels as described in Section 3.2. This choice was guided by the high complexity of the data, requiring extended training to capture intricate features [4]. Early stopping, a common technique to prevent overfitting, was considered but not implemented in this study. Instead, overfitting was managed by closely monitoring validation loss and using batch size adjustments.
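As a rough sketch of this training configuration, the following PyTorch code shows the split, loss, and optimizer settings described above; the tiny stand-in network and synthetic tensors are illustrative placeholders, not the actual HoVerNet implementation:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-ins for the extracted patches: 100 RGB 256 x 256 images
# with binary MYC+/MYC- labels. The real inputs come from Section 3.2.
images = torch.rand(100, 3, 256, 256)
labels = torch.randint(0, 2, (100,)).float()
dataset = TensorDataset(images, labels)

# 80/10/10 split applied after patch extraction, as described above.
n = len(dataset)
n_train, n_test = int(0.8 * n), int(0.1 * n)
train_set, test_set, val_set = random_split(
    dataset, [n_train, n_test, n - n_train - n_test])

# Tiny CNN head standing in for the (much deeper) HoVerNet classifier.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))

criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy on raw logits
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

for epoch in range(2):  # the study trained for up to 800 epochs
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x).squeeze(1), y)
        loss.backward()
        optimizer.step()
```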

3.5. Nuclei Counting and Diagnosis of Severity Level of Cells

After segmenting and classifying nuclei using HoVerNet in DLBCL images, nuclei counting is proposed. The segmented image identifies and labels each nucleus based on its type: normal cells are smaller (average 36.7 pixels), appearing dark blue; positive cancer cells are larger (average 73.5 pixels), lighter blue; negative cancer cells are also large (average 73 pixels), with a lighter blue hue. Cell counting involves using connected component labeling algorithms to distinguish and count these nucleus types. The counts are validated against manual or other automated methods. Subsequently, totals of abnormal (negative and positive) and all cells (normal, negative, and positive) are calculated. Percentages of negative and positive cells relative to total cells are then computed using a formula as in (4).
$$\text{Percentage} = \frac{\text{Negative Count} + \text{Positive Count}}{\text{Total Count}} \times 100\% \qquad (4)$$
Disease severity is assessed based on nucleus characteristics such as size, shape, color, and abnormalities, with each cell assigned a severity score. Cases in which abnormal cells exceed 40% of the total are classified as 'Severe'; otherwise, they are 'Mild'. This process aids in diagnosing and monitoring disease progression in DLBCL. The 40% threshold was determined through consultation with clinical pathologists and reflects a typical cutoff observed in high-grade cases of DLBCL where MYC-positivity exceeds this level [27].
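A sketch of the counting and grading logic, assuming one binary mask per predicted class; SciPy's connected-component labelling stands in here for whichever implementation was used:

```python
import numpy as np
from scipy import ndimage

def count_and_grade(masks: dict, threshold: float = 40.0):
    """Count nuclei per class via connected-component labelling and grade severity.
    `masks` maps 'normal', 'positive', and 'negative' to binary masks."""
    counts = {}
    for cls, mask in masks.items():
        _, num = ndimage.label(mask)   # each connected blob counts as one nucleus
        counts[cls] = num
    abnormal = counts["negative"] + counts["positive"]
    total = abnormal + counts["normal"]
    percentage = 100.0 * abnormal / total if total else 0.0  # Equation (4)
    severity = "Severe" if percentage > threshold else "Mild"
    return counts, percentage, severity
```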

3.6. Graphic User Interface (GUI)

The data collection and preparation begin with a GUI where patient details like Patient ID and Year of Data are input to associate image analysis with the correct patient records. Once validated, the user selects an image file (PNG, JPG, JPEG, BMP, GIF) for analysis. The selected image and its patches are normalized to enhance contrast and displayed for inspection. Nuclei detection follows, converting images to grayscale if needed and applying binary thresholding for nuclei segmentation (‘Nuclei Pixel Branch’) visualized on the GUI.
Gradient computation (‘HoVer Branch’) highlights textural patterns and directional changes, aiding detailed feature analysis. Nuclei are segmented and classified based on predefined criteria, visualized to show categorization into ‘Negative Cells’, ‘Positive Cells’, and ‘Normal Cells’. Cell counting follows, tallying each type and computing total counts. The proportion of negative and positive cells is used to determine disease severity (‘Severe’ or ‘Mild’). Results, including counts, percentages, and severity assessments, are displayed on the GUI for comprehensive analysis. A ‘Finish’ button concludes analysis, confirming the completion of cell counting.
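A condensed Tkinter skeleton of this workflow is sketched below; the 'run_analysis' callback is a hypothetical placeholder for the normalization, segmentation, counting, and severity steps described above:

```python
import tkinter as tk
from tkinter import filedialog, messagebox

def run_analysis(path: str) -> None:
    # Placeholder for the pipeline: normalization, nuclei segmentation,
    # classification, counting, and severity assessment.
    messagebox.showinfo("Analysis", f"Analyzing {path} ...")

root = tk.Tk()
root.title("DLBCL Nuclei Segmentation and Classification")

tk.Label(root, text="Patient ID:").grid(row=0, column=0, sticky="e")
patient_id = tk.Entry(root); patient_id.grid(row=0, column=1)
tk.Label(root, text="Year of Data:").grid(row=1, column=0, sticky="e")
year = tk.Entry(root); year.grid(row=1, column=1)

def select_image() -> None:
    if not patient_id.get() or not year.get():   # validate patient details first
        messagebox.showerror("Error", "Please enter Patient ID and Year of Data.")
        return
    path = filedialog.askopenfilename(
        filetypes=[("Images", "*.png *.jpg *.jpeg *.bmp *.gif")])
    if path:
        run_analysis(path)

tk.Button(root, text="Select Image", command=select_image).grid(row=2, columnspan=2)
root.mainloop()
```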

4. Results and Discussion

4.1. Image Pre-Processing

Lossless image compression substantially reduced the dimensions of the DLBCL WSIs while maintaining image quality. Using the LANCZOS filter, the resizing process preserved a resolution of 96 dpi and a bit depth of 24 bits, ensuring clarity and color fidelity. This reduction effectively minimizes computational overhead, facilitating more efficient downstream processing. Following compression, image patching yielded 34,000 distinct 256 × 256-pixel patches. Each patch was uniquely named based on its coordinates within the original image and systematically stored. Visualization of these patches revealed diverse pathological features, improving granularity for deep learning applications. This structured patch extraction enhances the deep learning model's ability to focus on specific tissue regions within DLBCL images, optimizing feature learning for classification and analysis. Normalization of the extracted patches further improved consistency by scaling pixel intensity values to the [0,1] range, achieved through division by 255. This transformation mitigated variability in raw pixel values, leading to improved model stability and faster convergence during training. Standardization plays a crucial role in optimizing deep learning performance, ensuring more accurate and efficient analysis of DLBCL images.

4.2. Performance Evaluation of HoVerNet Optimization Results

Before feeding the images into HoVerNet and VGG16, the dataset is divided into training, testing, and validation sets (Table 1).
HoVerNet consists of three branches: the ‘Nuclei Pixel Branch,’ the ‘HoVer Branch,’ and the ‘Nuclei Classification Branch.’ The process starts with normalized patched images as input. These images are processed to detect nuclei, producing a binary image where nuclei pixels are set to one value (blue), and non-nuclei pixels are set to another value (red), as shown in Figure 5b. The output from the ‘Nuclei Pixel Branch’ is then overlaid on the normalized patched image, creating a composite image. In this overlay, the nuclei are distinctly highlighted against the background, making them easier to visually identify and analyze.
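As a simplified, illustrative stand-in for this detection step, a normalized grayscale patch can be thresholded into a binary nuclei mask (the 0.5 cutoff is an assumption, not the trained network's behavior):

```python
import numpy as np

def binarize(gray_patch: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binary thresholding of a normalized grayscale patch: True = nuclei pixel."""
    return gray_patch < threshold  # nuclei stain darker than the background
```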
The overlay image is then projected onto the horizontal and vertical axes to create horizontal and vertical images, respectively. This process introduces an additional branch called the ‘HoVer Branch’ (Figure 6). These projections provide useful summaries of the spatial distribution of the nuclei in the image. For example, a horizontal projection provides information on how the nuclei are distributed from top to bottom, while a vertical projection provides information on how they are distributed from left to right.
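As a sketch, such projections can be computed by summing a binary nuclei mask along each image axis:

```python
import numpy as np

def axis_projections(nuclei_mask: np.ndarray):
    """Project a binary nuclei mask onto the image axes to summarize the
    spatial distribution of nuclei pixels (cf. Figure 6)."""
    horizontal = nuclei_mask.sum(axis=1)  # one value per row
    vertical = nuclei_mask.sum(axis=0)    # one value per column
    return horizontal, vertical
```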
After defining the architecture, an instance of the HoVerNet model was created and compiled using the Adam optimizer and binary cross-entropy loss function, suitable for binary classification tasks. The model was initially trained for 800 epochs with batch sizes of 32 and 64, following the approach of Wójcik et al. [4], who reported achieving the highest F1 score of 0.939 for similar tasks.
Figure 7a depicts the training process of the HoVerNet model over 800 epochs with a batch size of 32. The model achieved perfect training accuracy of 100% and an exceptionally low training loss of 2.21 × 10−4. However, validation and testing accuracy were notably lower, at 69.63% and 71.26%, with higher validation and testing losses at 24.7998 and 23.0890, respectively. These results indicate overfitting, as the model performed well on training data but failed to generalize effectively to unseen data. Figure 7b focuses on the training dynamics for a batch size of 32 over a zoomed-in epoch range of 20–60. While the training accuracy was consistently 100%, validation performance remained unstable, showing fluctuations in accuracy and higher validation losses. This pattern indicates overfitting, as the model was overly tuned to the training data and struggled to generalize to unseen data.
Figure 8a shows the training and validation performance when using a batch size of 64. The model achieved 100% training accuracy, with an even lower training loss of 1.07 × 10−7. Validation and testing accuracies improved to 74.68% and 75.75%, respectively, with a reduced validation loss of 12.4109 and a testing loss of 11.9979. The larger batch size resulted in more stable training, with fewer parameter updates per epoch, enabling better generalization to unseen data.
Figure 8b provides a closer analysis of training with a batch size of 64 between epochs 20 and 60. A sharp decline in training accuracy occurred at epoch 45 due to adjustments in model parameters to balance training and validation losses. After this drop, the training accuracy quickly recovered and stabilized at 100%. Validation accuracy showed steady improvement, and validation losses decreased over time, demonstrating the model's ability to generalize effectively. However, after epoch 45, validation accuracy gradually decreases as the number of epochs increases. This indicates that HoVerNet achieves better performance and generalization when trained for fewer than 45 epochs.
Further investigation into the optimal number of epochs (15, 20, 25, 30, 35, 45, and 50) aimed to maximize validation and testing accuracies while minimizing losses (Table 2). For batch size 32, the model reached 100% training accuracy by epoch 35, with validation and testing accuracy peaking at 82.78% and 83.67%, respectively. Beyond this, accuracy declined, indicating overfitting. At epoch 100, training accuracy decreased to 98.72%, with further drops in validation and testing accuracies, suggesting underfitting and potential learning rate issues.
The optimal epoch for a batch size of 32 for the HoVerNet and VGG16 models is 35 (Figure 9), where validation and testing accuracies were highest, balancing effective generalization and minimizing overfitting. This iterative approach allowed for fine-tuning the model’s training dynamics and enhancing its ability to generalize effectively. Based on the confusion matrix in Figure 10, the HoVerNet model correctly identified 1450 MYC+ instances and 1394 MYC- instances, resulting in a high true positive (TP) and true negative (TN) count. However, it misclassified 250 MYC- instances as MYC+ (FP) and 306 MYC+ instances as MYC- (FN).
Based on Table 2 and Table 4, VGG16 achieved higher classification accuracy compared to HoVerNet. However, VGG16 is a standard CNN not inherently designed for nuclei segmentation tasks. It performs well in distinguishing image-level classes but lacks the ability to accurately delineate individual nuclei, especially in cases of overlapping structures. In contrast, HoVerNet is specifically optimized for nuclear instance segmentation and classification, making it more adept at handling the complex morphological variations and overlapping nuclei commonly seen in histopathological images of DLBCL. Although its overall accuracy is lower than VGG16, HoVerNet offers superior segmentation granularity and biological interpretability, which are essential for meaningful diagnostic assessment in clinical practice.
These results are tabulated in Table 3, with an overall accuracy of approximately 83.7%, indicating that the model is effective in its predictions. The precision, which measures the accuracy of positive predictions, is about 85.3%, while the recall, reflecting the model's ability to identify true positive cases, stands at 82.6%. The specificity, or true negative rate, is 84.8%, highlighting the model's proficiency in correctly identifying negative instances. Additionally, the F1 score, a balance between precision and recall, is around 83.9%, underscoring the model's robustness in handling both MYC+ and MYC− classification.
Apart from that, when the Adam optimizer with a batch size of 64 is used, the HoVerNet model achieves a training accuracy of 100% across all epochs (Table 4), indicating that it has perfectly learned the training dataset. However, this does not translate to the validation and testing sets, where the accuracy is significantly lower, suggesting that the model is overfitting to the training data. In the early epochs, there is a gradual increase in both validation and testing accuracies, peaking at 83.25% and 84.46%, respectively, at epoch 25 (Figure 11). This could be the model’s sweet spot, where it has learned enough patterns to generalize well to unseen data. Beyond this point, there is a noticeable decline in performance on the validation and testing sets, with the lowest accuracy observed at epoch 100, dropping to 74.79% and 74.53%, respectively. This decline could be due to the model becoming too specialized in the training data features, which do not represent the broader patterns needed for new data.
The model correctly identified 1500 MYC+ instances and 1372 MYC− instances, resulting in high true positive (TP) and true negative (TN) counts. However, it misclassified 200 MYC− instances as MYC+ (FP) and 328 MYC+ instances as MYC− (FN) (Figure 12). These counts correspond to an overall accuracy of approximately 84.5% (Table 5), demonstrating the model's reliability in predicting MYC+ and MYC− cases. The precision of 88.2% reflects the model's accuracy in predicting positive cases, while the recall of 82.1% indicates the true positive rate. The specificity, at 87.3%, shows the model's effectiveness in correctly identifying negative cases. The F1 score of approximately 85.0% balances precision and recall, emphasizing the model's robustness in classification tasks.
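These values follow directly from the confusion-matrix counts in Figure 12:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = \frac{1500 + 1372}{3400} \approx 84.5\%, \qquad \text{Precision} = \frac{TP}{TP + FP} = \frac{1500}{1700} \approx 88.2\%$$

$$\text{Recall} = \frac{TP}{TP + FN} = \frac{1500}{1828} \approx 82.1\%, \qquad \text{Specificity} = \frac{TN}{TN + FP} = \frac{1372}{1572} \approx 87.3\%, \qquad F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \approx 85.0\%$$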
Comparing batch sizes of 32 and 64 shows significant differences in performance metrics and execution times. Although both batch sizes achieve perfect training accuracy, a batch size of 64 yields higher validation and testing accuracy with lower losses, and it reaches these results in fewer epochs (25 vs. 35). Additionally, training time per epoch is slightly shorter for a batch size of 64 (62 s vs. 63 s), and validation and testing times are also faster, taking 7 s compared to 8 s for batch size 32. These findings suggest that a batch size of 64 provides better efficiency and faster convergence.
After model training, simultaneous nuclei segmentation and classification is performed. This process is defined as the last branch in HoVerNet, known as the 'Nuclei Classification Branch.' HoVerNet contains an encoder-decoder structure that captures both high-level and low-level features in the images, which helps in accurately segmenting an image. Abnormal cells (positive and negative cells) are classified in red, while normal cells are classified in green (Figure 13).

4.3. Nuclei Counting and Diagnosis of Severity Level of the Cells

In the diagnosis of cellular abnormalities, the severity of cell changes can be categorized into two levels: mild and severe. Based on the nuclei count results generated via automated analysis, it was found that the dataset primarily consists of normal cells, positive cells, and negative cells. Specifically, in one of the MYC+ examples, 'U58-20-2 B042920 CMYC', there were 8371 negative cells and 16,381 positive cells identified, contributing to a total of 24,752 abnormal cells. In contrast, 9971 cells were classified as normal. The analysis encompassed a total of 34,723 cells. The percentage of abnormal cells, consisting of both negative and positive types, amounted to 71.28% of the total cell population. This high percentage categorizes the slide as severe, indicating a substantial presence of abnormal cellular characteristics. Table 6 tabulates the results of cell counting and the severity level for each MYC+ slide, and Table 7 tabulates the corresponding results for the MYC− slides.
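Applying Equation (4) to this slide gives:

$$\text{Percentage} = \frac{8371 + 16{,}381}{34{,}723} \times 100\% = \frac{24{,}752}{34{,}723} \times 100\% \approx 71.28\%$$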

4.4. Graphic User Interface (GUI)

The Tkinter-based application for DLBCL nuclei segmentation and classification allows users to input patient information, upload an image, visualize image patches, and analyze cells. Initially, a main window was created for inputting patient data. If the patient’s ID and year are not filled in, an error message appears. Once an image is selected, it is displayed in the window with an option to view patch images. Users can normalize the image, convert it to grayscale, and create a binary image highlighting the nuclei. Gradient images, showing horizontal and vertical gradients, can also be displayed.
The application further segments and classifies nuclei, enabling cell counting. It calculates the number of negative, positive, and normal cells, their percentages, and the severity based on these percentages. Results are displayed, and a ‘Finish’ button concludes the process, providing a diagnostic tool for medical professionals. This application streamlines the process of analyzing DLBCL nuclei, making it efficient and user-friendly for medical use. Figure 14 shows the overview of the GUI for DLBCL diagnosis. As an example of the GUI in action, one clinical slide was analyzed through the full pipeline. The system detected 4554 negative cells, 438 positive cells, 4114 normal cells, and a total of 9136 cells. It identified 54.64% as abnormal and classified the case as ‘Severe,’ demonstrating the practical functionality and diagnostic potential of the application.

5. Conclusions

This study analyzed 122 digital high-magnification WSIs of DLBCL using advanced pre-processing and the HoVerNet deep learning model. The model successfully classified nuclei, automated cell counting, and assessed disease severity. These improvements make pathology workflows faster and more accurate. A GUI was also developed, making it easier for pathologists to use the system.
However, there are some limitations. HoVerNet's complex multi-branch architecture led to overfitting, as evidenced by the sharp performance decline after epoch 45. This overfitting suggests that the model was overly complex for the dataset, performing well on training data but poorly on unseen data. Future work should include optimizing regularization through dropout and data augmentation, integrating attention mechanisms to improve feature selection and reduce overfitting, and exploring alternative architectures such as Vision Transformers. By addressing these issues, the system can improve DLBCL diagnosis and make pathology workflows faster, more accurate, and easier to use in different medical settings.

Author Contributions

Conceptualization, G.K.T. and C.C.L.; Methodology, G.K.T.; Software, G.K.T.; Validation, F.A.H., A.I.Y., and S.M.A.; Formal analysis, C.C.L. and F.A.H.; Investigation, G.K.T.; Resources, F.A.H., A.I.Y., and S.M.A.; Data curation, C.C.L.; Writing—original draft preparation, G.K.T.; Writing—review and editing, G.K.T., C.C.L., and Q.W.O.; Visualization, Q.W.O. and Y.F.C.; Supervision, C.C.L.; Project administration, C.C.L. and F.A.H.; Funding acquisition, C.C.L. and F.A.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Higher Education (MoHE), Malaysia, through the Fundamental Research Grant Scheme (FRGS) under grant number FRGS/1/2023/ICT02/UNIMAP/02/3. Additional appreciation is extended to Universiti Sains Malaysia for financial support through the RU Top Down grant 1001/PPSP/8070016.

Institutional Review Board Statement

This research was conducted in accordance with the Declaration of Helsinki and approved by the Jawatankuasa Etika Penyelidikan Manusia Universiti Sains Malaysia (JEPeM-USM) on 2 April 2023. (Approval Reference: USM/JEPeM/22110749).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors acknowledge the support of the Ministry of Higher Education (MoHE) Malaysia through the Fundamental Research Grant Scheme (FRGS) under grant number FRGS/1/2023/ICT02/UNIMAP/02/3. Additionally, appreciation is extended to Universiti Sains Malaysia for financial support through the RU Top Down grant 1001/PPSP/8070016.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AIArtificial Intelligence
CNNConvolutional Neural Network
DLBCLDiffuse Large B-Cell Lymphoma
GUIGraphic User Interface
ROIRegion of Interest
WSIWhole Slide Image

References

  1. Lymphoma Research Foundation. Diffuse Large B-Cell Lymphoma—Lymphoma Research Foundation. Available online: https://lymphoma.org/understanding-lymphoma/aboutlymphoma/nhl/dlbcl/ (accessed on 31 July 2024).
  2. Basu, S.; Agarwal, R.; Srivastava, V. Deep discriminative learning model with calibrated attention map for the automated diagnosis of diffuse large B-cell lymphoma. Biomed. Signal Process. Control 2022, 76, 103728. [Google Scholar] [CrossRef]
  3. El Hussein, S.; Chen, P.; Medeiros, L.J.; Hazle, J.D.; Wu, J.; Khoury, J.D. Artificial intelligence-assisted mapping of proliferation centers allows the distinction of accelerated phase from large cell transformation in chronic lymphocytic leukemia. Mod. Pathol. 2022, 35, 1121–1125. [Google Scholar] [CrossRef]
  4. Wójcik, P.; Naji, H.; Simon, A.; Büttner, R.; Bożek, K. Learning Nuclei Representations with Masked Image Modelling. arXiv 2023, arXiv:2306.17116. [Google Scholar]
  5. Li, D.; Bledsoe, J.R.; Zeng, Y.; Liu, W.; Hu, Y.; Bi, K.; Liang, A.; Li, S. A deep learning diagnostic platform for diffuse large B-cell lymphoma with high accuracy across multiple hospitals. Nat. Commun. 2020, 11, 6004. [Google Scholar] [CrossRef]
  6. Swiderska-Chadaj, Z.; Hebeda, K.M.; van den Brand, M.; Litjens, G. Artificial intelligence to detect MYC translocation in slides of diffuse large B-cell lymphoma. Virchows Arch. 2021, 479, 617–621. [Google Scholar] [CrossRef]
  7. Bándi, P.; Balkenhol, M.; van Ginneken, B.; van der Laak, J.; Litjens, G. Resolution-agnostic tissue segmentation in whole-slide histopathology images with convolutional neural networks. PeerJ 2019, 7, e8242. [Google Scholar] [CrossRef]
  8. Shankar, V.; Yang, X.; Krishna, V.; Tan, B.T.; Rojansky, R.; Ng, A.Y.; Valvert, F.; Briercheck, E.L.; Weinstock, D.M.; Natkunam, Y.; et al. LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype. arXiv 2023, arXiv:2311.09574. [Google Scholar]
  9. Swiderska-Chadaj, Z.; Hebeda, K.; van den Brand, M.; Litjens, G. Predicting MYC translocation in HE specimens of diffuse large B-cell lymphoma through deep learning. Med. Imaging 2020, 11320, 1132010. [Google Scholar]
  10. Perry, C.; Greenberg, O.; Haberman, S.; Herskovitz, N.; Gazy, I.; Avinoam, A.; Paz-Yaacov, N.; Hershkovitz, D.; Avivi, I. Image-Based Deep Learning Detection of High-Grade B-Cell Lymphomas Directly from Hematoxylin and Eosin Images. Cancers 2023, 15, 5205. [Google Scholar] [CrossRef]
  11. Hamdi, M.; Senan, E.M.; Jadhav, M.E.; Olayah, F.; Awaji, B.; Alalayah, K.M. Hybrid Models Based on Fusion Features of a CNN and Handcrafted Features for Accurate Histopathological Image Analysis for Diagnosing Malignant Lymphomas. Diagnostics 2023, 13, 2258. [Google Scholar] [CrossRef]
  12. Vrabac, D.; Smit, A.; Rojansky, R.; Natkunam, Y.; Advani, R.H.; Ng, A.Y.; Fernandez-Pol, S.; Rajpurkar, P. DLBCL-Morph: Morphological features computed using deep learning for an annotated digital DLBCL image set. Sci. Data 2021, 8, 135. [Google Scholar] [CrossRef]
  13. Blanc-Durand, P.; Jégou, S.; Kanoun, S.; Berriolo-Riedinger, A.; Bodet-Milin, C.; Kraeber-Bodéré, F.; Carlier, T.; le Gouill, S.; Casasnovas, R.O.; Meignan, M.; et al. Fully automatic segmentation of diffuse large B cell lymphoma lesions on 3D FDG-PET/CT for total metabolic tumour volume prediction using a convolutional neural network. Eur. J. Nucl. Med. Mol. Imaging 2021, 48, 1362–1370. [Google Scholar] [CrossRef]
  14. Ferrández, M.C.; Golla, S.S.V.; Eertink, J.J.; de Vries, B.M.; Wiegers, S.E.; Zwezerijnen, G.J.C.; Pieplenbosch, S.; Schilder, L.; Heymans, M.W.; Zijlstra, J.M.; et al. Sensitivity of an AI method for [18F]FDG PET/CT outcome prediction of diffuse large B-cell lymphoma patients to image reconstruction protocols. EJNMMI Res. 2023, 13, 88. [Google Scholar] [CrossRef]
  15. Graham, S.; Vu, Q.D.; Raza, S.E.A.; Azam, A.; Tsang, Y.W.; Kwak, J.T.; Rajpoot, N. HoVer-Net: Simultaneous Segmentation and Classification of Nuclei in Multi-Tissue Histology Images. arXiv 2018, arXiv:1812.06499. [Google Scholar] [CrossRef]
  16. Ferrández, M.C.; Golla, S.S.V.; Eertink, J.J.; de Vries, B.M.; Lugtenburg, P.J.; Wiegers, S.E.; Zwezerijnen, G.J.C.; Pieplenbosch, S.; Kurch, L.; Hüttmann, A.; et al. An artificial intelligence method using FDG PET to predict treatment outcome in diffuse large B cell lymphoma patients. Sci. Rep. 2023, 13, 13111. [Google Scholar] [CrossRef]
  17. Mohlman, J.S.; Leventhal, S.D.; Hansen, T.; Kohan, J.; Pascucci, V.; Salama, M.E. Improving Augmented Human Intelligence to Distinguish Burkitt Lymphoma from Diffuse Large B-Cell Lymphoma Cases. Am. J. Clin. Pathol. 2020, 153, 743–759. [Google Scholar] [CrossRef]
  18. Farinha, F. Artifact Removal & Biomarker Segmentation. Francisco Farinha. 2020. Available online: https://franciscofarinha.ca/project/eece571t/ (accessed on 28 November 2024).
  19. Jiang, C.; Chen, K.; Teng, Y.; Ding, C.; Zhou, Z.; Gao, Y.; Wu, J.; He, J.; He, K.; Zhang, J. Deep learning-based tumour segmentation and total metabolic tumour volume prediction in the prognosis of diffuse large B-cell lymphoma patients in 3D FDG-PET images. Eur. Radiol. 2022, 32, 4801–4812. [Google Scholar] [CrossRef]
  20. Steinbuss, G.; Kriegsmann, M.; Zgorzelski, C.; Brobeil, A.; Goeppert, B.; Dietrich, S.; Mechtersheimer, G.; Kriegsmann, K. Deep learning for the classification of non-hodgkin lymphoma on histopathological images. Cancers 2021, 13, 2419. [Google Scholar] [CrossRef]
  21. Lisson, C.S.; Lisson, C.G.; Mezger, M.F.; Wolf, D.; Schmidt, S.A.; Thaiss, W.M.; Tausch, E.; Beer, A.J.; Stilgenbauer, S.; Beer, M.; et al. Deep Neural Networks and Machine Learning Radiomics Modelling for Prediction of Relapse in Mantle Cell Lymphoma. Cancers 2022, 14, 2008. [Google Scholar] [CrossRef]
  22. Witzig, T.; Vose, S.Y.; Zelenetz, R.S.; Abramson, B.M.; Advani, C.N. An international validation study of the International Prognostic Index for diffuse large B-cell lymphoma. Blood 2015, 126, 2265–2268. [Google Scholar]
  23. Alizadeh, A.; Eisen, S.W.; Davis, R.E.; Ma, I.C.; Lossos, S.L. The influence of the tumor microenvironment on the molecular characteristics of diffuse large B-cell lymphoma. Cell 2011, 145, 559–570. [Google Scholar]
  24. Ahluwalia, A.S.; Patel, S.P.; Mushtaq, M.S. Impact of race on the outcome of diffuse large B-cell lymphoma: A population-based study. Cancer 2017, 123, 1600–1609. [Google Scholar]
  25. Schmitz, N.D.; Winer, L.F.; Lugtenburg, M.T.; Trneny, S.M. Transplantation strategies in diffuse large B-cell lymphoma. Leuk. Lymphoma 2018, 59, 1953–1965. [Google Scholar]
  26. El Achi, H.; Belousova, T.; Chen, L.; Wahed, A.; Wang, I.; Hu, Z.; Kanaan, Z.; Rios, A.; Nguyen, A.N.D. Automated Diagnosis of Lymphoma with Digital Pathology Images Using Deep Learning. arXiv 2018, arXiv:1811.02668. [Google Scholar]
  27. Swerdlow, S.H.; Campo, E.; Harris, N.L.; Jaffe, E.S.; Pileri, S.A.; Stein, H.; Thiele, J. WHO Classification of Tumours of Haematopoietic and Lymphoid Tissues, 4th ed.; WHO: Geneva, Switzerland, 2017; Volume 2. [Google Scholar]
Figure 1. Examples of CMYC-stained whole slide images. (a) MYC+ whole slide image; (b) MYC− whole slide image.
Figure 2. Flow process of image patches.
Figure 3. Training, validation, and testing scheme.
Figure 4. HoVerNet architecture [11].
Figure 5. Nuclei Pixel Branch of HoVerNet. (a) Normalized Patched Image; (b) Nuclei Pixel Branch; (c) Overlaid Nuclei Image.
Figure 6. HoVer Branch of HoVerNet. (a) Horizontal Map Channel; (b) Vertical Map Channel.
Figure 7. Accuracy-loss graph for 800 epochs with Adam optimizer (Batch size 32). (a) Full accuracy-loss graph for epochs 0–800; (b) Zoomed analysis for epochs 20–60.
Figure 8. Accuracy-loss graph for 800 epochs with Adam optimizer (Batch size 64). (a) Full accuracy-loss graph for epochs 0–800; (b) Zoomed analysis for epochs 20–60.
Figure 9. Accuracy-loss graph of epoch 35 with Adam optimizer (Batch size 32).
Figure 10. Confusion matrix of epoch 35 with Adam optimizer (Batch size 32).
Figure 11. Accuracy-loss graph of epoch 25 with the Adam optimizer (Batch size 64).
Figure 12. Confusion matrix of epoch 25 with the Adam optimizer (Batch size 64).
Figure 13. Simultaneous nuclei segmentation and classification (Nuclei Classification Branch of HoVerNet).
Figure 14. Overview of GUI for DLBCL diagnosis.
Table 1. Dataset Distribution.

Class | Training (80%) | Testing (10%) | Validation (10%)
MYC+ | 13,600 | 1700 | 1700
MYC− | 13,600 | 1700 | 1700
Total | 27,200 | 3400 | 3400
Table 2. Accuracies and losses for each epoch with Adam optimizer (Batch size 32).

Epochs | HoVerNet Train Acc (%) | Train Loss | Val Acc (%) | Val Loss | Test Acc (%) | Test Loss | VGG16 Train Acc (%) | Train Loss | Val Acc (%) | Val Loss | Test Acc (%) | Test Loss
15 | 99.79 | 2.25 × 10−2 | 80.37 | 1.8699 | 81.22 | 1.747 | 98.89 | 2.86 × 10−2 | 92.29 | 0.2787 | 93.62 | 0.2328
20 | 99.99 | 4.47 × 10−4 | 80.01 | 2.6568 | 80.37 | 2.5758 | 92.30 | 1.23 × 10−2 | 97.48 | 0.379 | 93.11 | 0.1654
25 | 99.97 | 1.90 × 10−3 | 75.00 | 2.7742 | 75.94 | 2.5074 | 93.80 | 9.8 × 10−3 | 94.73 | 0.0354 | 95.32 | 0.2298
30 | 99.85 | 7.60 × 10−3 | 78.48 | 1.8445 | 78.21 | 1.782 | 100.00 | 2.09 × 10−1 | 94.92 | 0.105 | 98.67 | 0.1346
35 | 100.00 | 7.86 × 10−9 | 82.78 | 2.9870 | 83.67 | 2.7953 | 100.00 | 2.42 × 10−2 | 100.00 | 0.0272 | 99.87 | 0.0529
45 | 100.00 | 6.03 × 10−9 | 80.66 | 5.6762 | 81.01 | 5.4998 | 97.65 | 4.97 × 10−2 | 98.54 | 0.0401 | 98.87 | 0.3312
50 | 100.00 | 1.27 × 10−7 | 78.86 | 4.1112 | 80.75 | 3.5631 | 97.47 | 7.56 × 10−2 | 91.56 | 0.4981 | 92.44 | 0.4114
100 | 98.72 | 6.16 × 10−2 | 76.47 | 3.4293 | 78.45 | 3.1200 | - | - | - | - | - | -
800 | 100.00 | 2.21 × 10−4 | 69.63 | 24.7998 | 71.26 | 23.089 | - | - | - | - | - | -

Bold: Best accuracy and loss results for HoVerNet and VGG16.
Table 3. Testing model performance results with Adam optimizer (Batch size 32).

Epochs | Accuracy | Loss | F1 Score (MYC+) | F1 Score (MYC−) | Recall (MYC+) | Recall (MYC−) | Precision (MYC+) | Precision (MYC−)
15 | 81.22% | 1.7470 | 0.81 | 0.81 | 0.82 | 0.80 | 0.81 | 0.82
20 | 80.37% | 2.5758 | 0.81 | 0.80 | 0.82 | 0.78 | 0.79 | 0.82
25 | 75.94% | 2.5074 | 0.76 | 0.76 | 0.76 | 0.75 | 0.76 | 0.76
30 | 78.21% | 1.7820 | 0.79 | 0.78 | 0.79 | 0.78 | 0.78 | 0.79
35 | 83.67% | 2.7953 | 0.84 | 0.83 | 0.85 | 0.82 | 0.83 | 0.85
45 | 81.01% | 5.4998 | 0.81 | 0.81 | 0.82 | 0.80 | 0.80 | 0.82
50 | 80.75% | 3.5631 | 0.81 | 0.80 | 0.82 | 0.79 | 0.80 | 0.82
100 | 78.45% | 3.1200 | 0.79 | 0.78 | 0.79 | 0.77 | 0.78 | 0.79
800 | 71.26% | 23.0890 | 0.72 | 0.71 | 0.74 | 0.69 | 0.70 | 0.72

Bold: Best testing model results with Adam optimizer (Batch size 32).
Table 5. Testing model performance results with the Adam optimizer (Batch size 64).

Epochs | Accuracy | Loss | F1 Score (MYC+) | F1 Score (MYC−) | Recall (MYC+) | Recall (MYC−) | Precision (MYC+) | Precision (MYC−)
15 | 83.43% | 1.7470 | 0.83 | 0.84 | 0.82 | 0.85 | 0.84 | 0.83
20 | 83.87% | 2.5758 | 0.84 | 0.84 | 0.82 | 0.86 | 0.85 | 0.83
25 | 84.46% | 2.5074 | 0.85 | 0.84 | 0.85 | 0.84 | 0.84 | 0.85
30 | 83.34% | 1.7820 | 0.83 | 0.84 | 0.82 | 0.84 | 0.84 | 0.83
35 | 79.92% | 2.7953 | 0.79 | 0.78 | 0.79 | 0.77 | 0.78 | 0.79
45 | 81.87% | 5.4998 | 0.82 | 0.82 | 0.82 | 0.81 | 0.82 | 0.82
50 | 82.52% | 3.5631 | 0.82 | 0.83 | 0.82 | 0.83 | 0.83 | 0.82
100 | 74.53% | 3.1200 | 0.75 | 0.74 | 0.76 | 0.73 | 0.74 | 0.76
800 | 75.75% | 11.9979 | 0.72 | 0.71 | 0.74 | 0.69 | 0.70 | 0.72

Bold: Best testing model performance results with Adam optimizer (Batch size 64).
Table 6. Results of nuclei counting and severity level (MYC+).

MYC+ Slide | Negative Cells | Positive Cells | Abnormal Cells (Negative + Positive) | Normal Cells | Total Number of Cells | Percentage of Abnormal Cells | Severity Level
Slide 1 | 22,992 | 45,887 | 68,879 | 10,537 | 79,416 | 86.73% | Severe
Slide 2 | 32,246 | 64,706 | 96,952 | 41,867 | 138,819 | 69.84% | Severe
Slide 3 | 9941 | 13,819 | 23,760 | 2464 | 26,224 | 90.60% | Severe
Slide 4 | 20,481 | 59,398 | 79,879 | 29,734 | 109,613 | 72.87% | Severe
Slide 5 | 8272 | 43,783 | 52,055 | 11,561 | 63,616 | 81.83% | Severe
Slide 6 | 7815 | 25,173 | 32,988 | 8754 | 41,742 | 79.03% | Severe
Slide 7 | 9646 | 21,653 | 31,299 | 12,799 | 44,098 | 70.98% | Severe
Slide 8 | 10,556 | 83,276 | 93,832 | 49,288 | 143,120 | 65.56% | Severe
Slide 9 | 5882 | 37,853 | 43,735 | 17,402 | 61,137 | 71.54% | Severe
Slide 10 | 22,351 | 27,618 | 49,969 | 8896 | 58,865 | 84.89% | Severe
Slide 11 | 45,300 | 85,022 | 130,322 | 40,412 | 170,734 | 76.33% | Severe
Slide 12 | 30,211 | 60,296 | 90,507 | 17,320 | 107,827 | 83.94% | Severe
Slide 13 | 42,312 | 97,284 | 139,596 | 14,376 | 153,972 | 90.66% | Severe
Slide 14 | 8181 | 31,021 | 39,202 | 10,183 | 49,385 | 79.38% | Severe
Slide 15 | 13,868 | 41,825 | 55,693 | 18,468 | 74,161 | 75.10% | Severe
Slide 16 | 15,857 | 38,767 | 54,624 | 21,462 | 76,086 | 71.79% | Severe
Slide 17 | 43,121 | 111,433 | 154,554 | 82,127 | 236,681 | 65.30% | Severe
Slide 18 | 10,431 | 25,887 | 36,318 | 7456 | 43,774 | 82.97% | Severe
Slide 19 | 14,947 | 43,166 | 58,113 | 21,990 | 80,103 | 72.55% | Severe
Slide 20 | 5392 | 17,398 | 22,790 | 7557 | 30,347 | 75.10% | Severe
Slide 21 | 14,158 | 36,934 | 51,092 | 8229 | 59,321 | 86.13% | Severe
Slide 22 | 5416 | 17,768 | 23,184 | 8314 | 31,498 | 73.60% | Severe
Slide 23 | 4592 | 13,478 | 18,070 | 7838 | 25,908 | 69.75% | Severe
Slide 24 | 7293 | 21,152 | 28,445 | 11,631 | 40,076 | 70.98% | Severe
Slide 25 | 34,745 | 79,172 | 113,917 | 8726 | 122,643 | 92.89% | Severe
Slide 26 | 13,401 | 36,050 | 49,451 | 7341 | 56,792 | 87.07% | Severe
Slide 27 | 8235 | 25,323 | 33,558 | 10,095 | 43,653 | 76.87% | Severe
Slide 28 | 20,544 | 59,581 | 80,125 | 32,069 | 112,194 | 71.42% | Severe
Slide 29 | 9387 | 26,953 | 36,340 | 15,638 | 51,978 | 69.91% | Severe
Slide 30 | 13,542 | 71,789 | 85,331 | 29,630 | 114,961 | 74.23% | Severe
Slide 31 | 24,745 | 64,878 | 89,623 | 29,901 | 119,524 | 74.98% | Severe
Slide 32 | 16,236 | 44,826 | 61,062 | 17,115 | 78,177 | 78.11% | Severe
Slide 33 | 4463 | 13,836 | 18,299 | 2084 | 20,383 | 89.78% | Severe
Slide 34 | 17,357 | 39,852 | 57,209 | 18,079 | 75,288 | 75.99% | Severe
Slide 35 | 5732 | 9607 | 15,339 | 6564 | 21,903 | 70.03% | Severe
Slide 36 | 9369 | 47,242 | 56,611 | 17,007 | 73,618 | 76.90% | Severe
Slide 37 | 17,905 | 37,926 | 55,831 | 31,110 | 86,941 | 64.22% | Severe
Slide 38 | 14,547 | 40,557 | 55,104 | 16,964 | 72,068 | 76.46% | Severe
Slide 39 | 4324 | 11,052 | 15,376 | 5101 | 20,477 | 75.09% | Severe
Slide 40 | 26,063 | 55,054 | 81,117 | 24,916 | 106,033 | 76.50% | Severe
Slide 41 | 7484 | 16,649 | 24,133 | 10,326 | 34,459 | 70.03% | Severe
Slide 42 | 7866 | 14,009 | 21,875 | 13,662 | 35,537 | 61.56% | Severe
Slide 43 | 20,173 | 55,099 | 75,272 | 7534 | 82,806 | 90.90% | Severe
Slide 44 | 4446 | 34,115 | 38,561 | 17,049 | 55,610 | 69.34% | Severe
Slide 45 | 27,344 | 59,600 | 86,944 | 33,972 | 120,916 | 71.90% | Severe
Slide 46 | 11,228 | 29,264 | 40,492 | 10,727 | 51,219 | 79.06% | Severe
Slide 47 | 10,077 | 15,108 | 25,185 | 4165 | 29,350 | 85.81% | Severe
Slide 48 | 8017 | 22,901 | 30,918 | 13,648 | 44,566 | 69.38% | Severe
Slide 49 | 3877 | 8746 | 12,623 | 5948 | 18,571 | 67.97% | Severe
Slide 50 | 3470 | 11,946 | 15,416 | 5835 | 21,251 | 72.54% | Severe
Slide 51 | 4508 | 12,607 | 17,115 | 6261 | 23,376 | 73.22% | Severe
Slide 52 | 2972 | 6214 | 9186 | 4737 | 13,923 | 65.98% | Severe
Slide 53 | 4398 | 14,407 | 18,805 | 8139 | 26,944 | 69.79% | Severe
Slide 54 | 11,109 | 22,543 | 33,652 | 13,157 | 46,809 | 71.89% | Severe
Slide 55 | 3914 | 7173 | 11,087 | 3768 | 14,855 | 74.63% | Severe
Slide 56 | 462 | 906 | 1368 | 510 | 1878 | 72.84% | Severe
Slide 57 | 5614 | 19,043 | 24,657 | 11,951 | 36,608 | 67.35% | Severe
Slide 58 | 8371 | 16,381 | 24,752 | 9971 | 34,723 | 71.28% | Severe
Slide 59 | 3802 | 16,347 | 20,149 | 6059 | 26,208 | 76.88% | Severe
Slide 60 | 7106 | 17,638 | 24,744 | 11,159 | 35,903 | 68.92% | Severe
Slide 61 | 19,229 | 23,164 | 42,393 | 6501 | 48,894 | 86.70% | Severe
Table 7. Results of nuclei counting and severity level (MYC−).

MYC− Slide | Negative Cells | Positive Cells | Abnormal Cells (Negative + Positive) | Normal Cells | Total Number of Cells | Percentage of Abnormal Cells | Severity Level
Slide 1 | 56,504 | 8645 | 65,149 | 32,641 | 97,790 | 66.62% | Severe
Slide 2 | 35,648 | 4031 | 39,679 | 19,774 | 59,453 | 66.74% | Severe
Slide 3 | 2916 | 302 | 3218 | 2315 | 5533 | 58.16% | Severe
Slide 4 | 2588 | 199 | 2787 | 961 | 3748 | 74.36% | Severe
Slide 5 | 8696 | 1931 | 10,627 | 2007 | 12,634 | 84.11% | Severe
Slide 6 | 3601 | 956 | 4557 | 1415 | 5972 | 76.31% | Severe
Slide 7 | 24,462 | 5745 | 30,207 | 13,455 | 43,662 | 69.18% | Severe
Slide 8 | 797 | 134 | 931 | 118 | 1049 | 88.75% | Severe
Slide 9 | 20,262 | 8061 | 28,323 | 4403 | 32,726 | 86.55% | Severe
Slide 10 | 34,785 | 6192 | 40,977 | 33,168 | 74,145 | 55.27% | Severe
Slide 11 | 32,948 | 10,217 | 43,165 | 22,996 | 66,161 | 65.24% | Severe
Slide 12 | 93,441 | 17,878 | 111,319 | 55,783 | 167,102 | 66.62% | Severe
Slide 13 | 35,909 | 3699 | 39,608 | 17,225 | 56,833 | 69.69% | Severe
Slide 14 | 15,980 | 4151 | 20,131 | 5310 | 25,441 | 79.13% | Severe
Slide 15 | 1021 | 339 | 1360 | 451 | 1811 | 75.10% | Severe
Slide 16 | 3314 | 1302 | 4616 | 1888 | 6504 | 70.97% | Severe
Slide 17 | 64,382 | 21,119 | 85,501 | 33,092 | 118,593 | 72.10% | Severe
Slide 18 | 21,015 | 4313 | 25,328 | 10,204 | 35,532 | 71.28% | Severe
Slide 19 | 51,199 | 19,372 | 70,571 | 24,435 | 95,006 | 74.28% | Severe
Slide 20 | 18,002 | 5969 | 23,971 | 7430 | 31,401 | 76.34% | Severe
Slide 21 | 5881 | 947 | 6828 | 2617 | 9445 | 72.29% | Severe
Slide 22 | 2188 | 785 | 2973 | 906 | 3879 | 76.64% | Severe
Slide 23 | 5789 | 2511 | 8300 | 2828 | 11,128 | 74.59% | Severe
Slide 24 | 15,352 | 6276 | 21,628 | 7459 | 29,087 | 74.36% | Severe
Slide 25 | 7340 | 562 | 7902 | 3467 | 11,369 | 69.50% | Severe
Slide 26 | 2162 | 321 | 2483 | 923 | 3406 | 72.90% | Severe
Slide 27 | 4438 | 1335 | 5773 | 1877 | 7650 | 75.46% | Severe
Slide 28 | 2025 | 811 | 2836 | 978 | 3814 | 74.36% | Severe
Slide 29 | 6019 | 2591 | 8610 | 2999 | 11,609 | 74.17% | Severe
Slide 30 | 5126 | 1780 | 6906 | 1303 | 8209 | 84.13% | Severe
Slide 31 | 24,382 | 8136 | 32,518 | 12,405 | 44,923 | 72.39% | Severe
Slide 32 | 3815 | 1069 | 4884 | 1769 | 6653 | 73.41% | Severe
Slide 33 | 11,538 | 1313 | 12,851 | 4145 | 16,996 | 75.61% | Severe
Slide 34 | 14,633 | 4623 | 19,256 | 8386 | 27,642 | 69.66% | Severe
Slide 35 | 16,317 | 6983 | 23,300 | 13,901 | 37,201 | 62.63% | Severe
Slide 36 | 9236 | 2774 | 12,010 | 2381 | 14,391 | 83.45% | Severe
Slide 37 | 35,403 | 19,724 | 55,127 | 26,021 | 81,148 | 67.93% | Severe
Slide 38 | 83,447 | 25,691 | 109,138 | 39,139 | 148,277 | 73.60% | Severe
Slide 39 | 21,831 | 7242 | 29,073 | 11,372 | 40,445 | 71.88% | Severe
Slide 40 | 35,587 | 10,932 | 46,519 | 22,024 | 68,543 | 67.87% | Severe
Slide 41 | 4086 | 1748 | 5834 | 2622 | 8456 | 68.99% | Severe
Slide 42 | 9026 | 5636 | 14,662 | 8233 | 22,895 | 64.04% | Severe
Slide 43 | 3411 | 341 | 3752 | 1374 | 5126 | 73.20% | Severe
Slide 44 | 436 | 193 | 629 | 82 | 711 | 88.47% | Severe
Slide 45 | 30,762 | 12,022 | 42,784 | 19,626 | 62,410 | 68.55% | Severe
Slide 46 | 26,310 | 6969 | 33,279 | 12,769 | 46,048 | 72.27% | Severe
Slide 47 | 53,547 | 8855 | 62,402 | 41,625 | 104,027 | 59.99% | Severe
Slide 48 | 1183 | 522 | 1705 | 597 | 2302 | 74.07% | Severe
Slide 49 | 23,258 | 10,960 | 34,218 | 15,169 | 49,387 | 69.29% | Severe
Slide 50 | 33,040 | 12,507 | 45,547 | 13,232 | 58,779 | 77.49% | Severe
Slide 51 | 58,803 | 21,507 | 80,310 | 28,718 | 109,028 | 73.66% | Severe
Slide 52 | 8243 | 4250 | 12,493 | 5973 | 18,466 | 67.65% | Severe
Slide 53 | 37,223 | 16,113 | 53,336 | 16,287 | 69,623 | 76.61% | Severe
Slide 54 | 31,589 | 12,352 | 43,941 | 21,653 | 65,594 | 66.99% | Severe
Slide 55 | 43,116 | 14,657 | 57,773 | 31,525 | 89,298 | 64.70% | Severe
Slide 56 | 28,500 | 10,627 | 39,127 | 19,956 | 59,083 | 66.22% | Severe
Slide 57 | 24,878 | 12,060 | 36,938 | 10,893 | 47,831 | 77.23% | Severe
Slide 58 | 54,651 | 22,020 | 76,671 | 39,186 | 115,857 | 66.18% | Severe
Slide 59 | 4762 | 1432 | 6194 | 1441 | 7635 | 81.13% | Severe
Slide 60 | 17,059 | 7693 | 24,752 | 9971 | 34,723 | 71.28% | Severe
Slide 61 | 4328 | 664 | 4992 | 4144 | 9136 | 54.64% | Severe
Table 4. Accuracies and losses for each epoch with the Adam optimizer (Batch size 64).

Epochs | HoVerNet Train Acc (%) | Train Loss | Val Acc (%) | Val Loss | Test Acc (%) | Test Loss | VGG16 Train Acc (%) | Train Loss | Val Acc (%) | Val Loss | Test Acc (%) | Test Loss
15 | 100.00 | 4.45 × 10−5 | 82.99 | 1.7802 | 83.43 | 1.5937 | 95.79 | 0.1348 | 91.95 | 0.5142 | 94.81 | 0.0821
20 | 100.00 | 4.54 × 10−6 | 82.96 | 2.1558 | 83.87 | 1.8724 | 96.67 | 0.2579 | 93.63 | 0.3122 | 98.63 | 0.2196
25 | 100.00 | 2.90 × 10−7 | 83.25 | 2.2399 | 84.46 | 2.0036 | 100.00 | 0.0286 | 99.98 | 0.1098 | 99.67 | 0.0112
30 | 100.00 | 7.34 × 10−8 | 82.49 | 2.8881 | 83.34 | 2.5491 | 99.97 | 0.2991 | 99.54 | 0.5364 | 98.59 | 0.2234
35 | 100.00 | 6.83 × 10−7 | 79.42 | 3.9730 | 79.92 | 3.6404 | 97.94 | 0.3948 | 96.88 | 0.4409 | 96.14 | 0.1807
45 | 100.00 | 1.08 × 10−7 | 80.51 | 2.9528 | 81.87 | 2.7966 | 96.25 | 0.7266 | 93.68 | 0.7345 | 94.51 | 0.3142
50 | 100.00 | 4.79 × 10−7 | 81.84 | 4.0712 | 82.52 | 3.7927 | 96.48 | 0.7349 | 98.37 | 0.9213 | 99.43 | 0.2517
100 | 100.00 | 5.31 × 10−5 | 74.79 | 5.0253 | 74.53 | 4.7763 | - | - | - | - | - | -
800 | 100.00 | 1.07 × 10−7 | 74.68 | 12.4109 | 75.75 | 11.9979 | - | - | - | - | - | -

Bold: Best accuracy and loss results for HoVerNet and VGG16.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
