Article

Automated Segmentation and Morphometric Analysis of Thioflavin-S-Stained Amyloid Deposits in Alzheimer’s Disease Brains and Age-Matched Controls Using Weakly Supervised Deep Learning

1 Department of Cell Biology and Molecular Medicine, University of Szeged, 6720 Szeged, Hungary
2 Department of Medical Physics and Informatics, University of Szeged, 6720 Szeged, Hungary
3 Department of Pathology, University of Szeged, 6725 Szeged, Hungary
4 Department of Clinical Neuroscience, Karolinska Institute, 171 77 Stockholm, Sweden
5 Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore 636921, Singapore
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(15), 7134; https://doi.org/10.3390/ijms26157134
Submission received: 14 June 2025 / Revised: 19 July 2025 / Accepted: 21 July 2025 / Published: 24 July 2025

Abstract

Alzheimer’s disease (AD) involves the accumulation of amyloid-β (Aβ) plaques, whose quantification plays a central role in understanding disease progression. Automated segmentation of Aβ deposits in histopathological micrographs enables large-scale analyses but is hindered by the high cost of detailed pixel-level annotations. Weakly supervised learning offers a promising alternative by leveraging coarse or indirect labels to reduce the annotation burden. We evaluated a weakly supervised approach to segment and analyze thioflavin-S-positive parenchymal amyloid pathology in AD and age-matched brains. Our pipeline integrates three key components, each designed to operate under weak supervision. First, robust preprocessing (including retrospective multi-image illumination correction and gradient-based background estimation) was applied to enhance image fidelity and support training, since weakly supervised models rely more heavily on image features than on detailed labels. Second, class activation maps (CAMs), generated by a compact deep classifier (SqueezeNet), were used to identify and coarsely localize amyloid-rich parenchymal regions from patch-wise image labels, serving as spatial priors for subsequent refinement without requiring dense pixel-level annotations. Third, a patch-based convolutional neural network (U-Net) was trained on synthetic data generated from micrographs based on CAM-derived pseudo-labels via an extensive object-level augmentation strategy, enabling refined whole-image semantic segmentation and generalization across diverse spatial configurations. To ensure robustness and unbiased evaluation, we assessed the segmentation performance of the entire framework using patient-wise group k-fold cross-validation, explicitly modeling generalization across unseen individuals, which is critical in clinical scenarios. Despite relying on weak labels, the integrated pipeline achieved strong segmentation performance, with an average Dice similarity coefficient of ≈0.763 and a Jaccard index of ≈0.639, widely accepted metrics for assessing segmentation quality in medical image analysis. The resulting segmentations were also visually coherent, demonstrating that weakly supervised segmentation is a viable alternative in histopathology, where acquiring dense annotations is prohibitively labor-intensive and time-consuming. Subsequent morphometric analyses of automatically segmented Aβ deposits revealed differences related to size, structural complexity, and global geometry across brain regions and cognitive status. These findings confirm that deposit architecture exhibits region-specific patterns and reflects underlying neurodegenerative processes, thereby highlighting the biological relevance and practical applicability of the proposed image-processing pipeline for morphometric analysis.

1. Introduction

Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that primarily affects individuals in late middle age or old age and is characterized by a heterogeneous phenotype involving mixed proteinopathies [1,2,3,4]. Although the exact molecular mechanisms of AD remain unclear, its diagnosis relies on characteristic histopathological features. Abnormal alterations observed in AD include extracellular amyloid-β (Aβ) deposits (such as senile, neuritic plaques), neurofibrillary tangles resulting from hyperphosphorylated tau protein aggregation, and cerebral amyloid angiopathy involving the vasculature. Tau pathology in AD progresses predictably across six stages (Braak stages), starting in the transentorhinal and hippocampal regions before spreading to neocortical areas [1,2]. Heterogeneous Aβ plaque forms, derived from amyloid precursor proteins [5,6,7], are also a key AD hallmark and progress in a similar hierarchical pattern (Thal phases), starting in the isocortex and spreading to allocortical, subcortical regions, and eventually to the brainstem [4,8,9,10]. Thus, the reliable detection, segmentation, and quantification of these Aβ plaques in brain tissue are essential for understanding disease progression and evaluating potential interventions. Aβ plaques exhibit considerable morphological diversity, ranging from diffuse, loosely structured deposits with varying size, amorphous shape, and blurred borders, to late-stage, compact fibrillar forms, like mature dense-cored plaques containing central fibrillar Aβ cores with or without a surrounding halo (“core-only” or “burned-out” type). Classification schemes are largely subjective and primarily based on morphological features. Plaques develop progressively, although the precise temporal sequence, influence of regional factors, and their roles in neurodegeneration remain unclear [4,11,12,13,14,15].
Fluorescent amyloid probes such as thioflavin-S (ThioS) detect Aβ plaques by binding misfolded protein aggregates via β-sheet intercalation (without sequence specificity) in both parenchymal and vascular amyloid pathology; morphology-based exclusion is commonly used to distinguish these components for qualitative and quantitative assessment [16,17,18,19]. Hereafter, “parenchymal amyloid deposits” refers to plaques and other microscopically detectable pre-amyloid-like structures. ThioS staining is a widely used histological method for labeling Aβ aggregates, yielding intense green-yellow fluorescence (peak 455 nm) under epifluorescence microscopy due to its β-sheet affinity [16]. A previous study also explored the clinical relevance of delivering an aerosolized ThioS probe for in vivo imaging of Aβ deposits in the retina using mouse models, demonstrating its potential for early detection of AD, with the advantage of easy accessibility without the need to cross the blood–brain barrier [20]. Despite its effectiveness, automated analysis of ThioS-labeled tissue micrographs remains technically challenging: high-resolution fluorescence imaging often introduces artifacts such as uneven illumination and background fluorescence [21,22], while ThioS protocols using high dye concentrations, short incubation times, and solvents may cause non-specific staining (e.g., of vessels) and dye loss from low-affinity sites, leading to underdetection of subtle amyloid [16,17,23]. Furthermore, the heterogeneity in the size, shape, and distribution of amyloid plaques across stages and brain regions makes robust segmentation particularly difficult. These issues compromise the reliability and reproducibility of conventional image analysis pipelines, hindering both segmentation and downstream morphometric quantification for neuropathology.
Advances in machine learning and computational pathology, especially convolutional neural networks (CNNs), are addressing limitations in medical image analysis, shifting focus from classification to detection and segmentation for localizing pathological features and delineating anatomical boundaries. CNN architectures like Visual Geometry Group, U-Net, Residual Network, Densely Connected Convolutional Network, Inception, MobileNet, etc., have proven highly effective for these tasks [24]. These models have demonstrated their utility across imaging modalities such as radiological (e.g., X-ray, computed tomography, magnetic resonance imaging), endoscopic, microscopic (e.g., histopathology), ultrasound, ophthalmologic, and other domain-specific medical images, supporting clinical diagnostics [24,25]. Weakly supervised object localization/semantic segmentation (WSOL/WSSS) methods reduce reliance on dense manual annotations by learning from coarse or indirect labels (e.g., holistic image-level class labels), enabling scalable biomedical image analysis with strong segmentation performance. Various studies have demonstrated the applicability of WSOL/WSSS methods to a wide range of tasks, including class activation mapping (CAM)-based techniques [26] for segmenting and analyzing biomedical images [27,28,29,30,31,32,33,34,35,36,37,38,39,40,41]. These approaches improve the ability of models to learn from sparse or weak annotations, contributing to scalable, high-performance biomedical image analysis tools. Moreover, these techniques also play a significant role in enhancing model interpretability. By providing visual insights into which regions of an image influence predictions, CAMs clarify decision-making, which is essential for understanding complex biomedical images [42,43,44,45,46,47,48,49,50,51]. This interpretability is particularly valuable in domains like pathology, where understanding model output reasoning is crucial for clinical application. While similar techniques have been used in AD research and amyloid pathology [52,53,54,55], this study is, to the best of our knowledge, the first to apply a WSSS framework to ThioS-stained amyloid plaques using high-resolution epifluorescence micrographs from human autopsy tissue sections.
To address limitations in current workflows and enable reproducible, automated analysis in AD research, we introduce a scalable, annotation-efficient framework for Aβ-plaque detection and morphometry that integrates established histological methods with modern weakly supervised machine learning strategies.

2. Results

2.1. Evaluation of the SqueezeNet Classifier for Parenchymal Amyloid Detection

To localize parenchymal amyloid pathology, we first trained a fully supervised binary image classifier based on the SqueezeNet architecture, incorporating a simple bypass structure and a Thresholded Average Pooling (TAP) layer in place of the conventional Global Average Pooling (GAP). The model was trained on patch-level image labels indicating the presence or absence of ThioS-positive parenchymal deposits, enabling downstream CAM extraction for localization tasks. These input patches were derived from histological micrographs of AD patients’ ThioS-stained brain tissue sections. Importantly, the classification model was not used for direct segmentation but served as the basis for extracting CAMs, which were later utilized in the training of separate segmentation models. Accordingly, this sub-section focuses on evaluating the classification performance as a critical preliminary step in the overall WSSS pipeline.
To assess generalization across individuals, model performance was evaluated through patient-wise 7-fold cross-validation (i.e., group k-fold), with each fold corresponding to a distinct patient. In each iteration, the model was trained on six folds and tested on the remaining one, ensuring that each subject served as the test case once. This evaluation strategy, designed for practical scenarios, simulates real-world generalization to unseen individuals while avoiding data leakage and subject-specific overfitting. Within each training run, 10% of the training data (randomly sampled from the six training folds) was held out as a development set to monitor performance and mitigate overfitting (Figure 1). The model achieving the lowest development (validation) loss across a maximum of 800 training epochs was retained and subsequently evaluated on the corresponding test fold. Evaluation metrics were computed for each fold, and the average results across all folds are summarized in Supplementary Table S1. The SqueezeNet classifier achieved a mean accuracy of 97.94%, precision of 98.53%, recall of 96.99%, and F1-score of 97.74%, demonstrating high reliability in distinguishing image patches with or without parenchymal amyloid structures. Supported by consistent and precise predictions, the classifier reliably captured image-level features of parenchymal amyloid pathology under full supervision, enabling CAM-based localization cues for the subsequent segmentation stage, as confirmed by classification results and visual inspection (Figure 2).
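For illustration, the patient-wise group k-fold protocol described above can be sketched with scikit-learn's GroupKFold; the library choice, variable names, and placeholder data below are ours, not the paper's actual implementation:

```python
# Minimal sketch of patient-wise 7-fold cross-validation with a 10% dev split.
# `features`, `labels`, and `patient_ids` are hypothetical placeholders: one
# entry per image patch, with `patient_ids` encoding the donor of each patch.
import numpy as np
from sklearn.model_selection import GroupKFold, train_test_split

n = 700
features = np.random.rand(n, 8)                 # placeholder patch features
labels = np.random.randint(0, 2, size=n)        # patch-level binary labels
patient_ids = np.repeat(np.arange(7), n // 7)   # 7 donors -> 7 groups

gkf = GroupKFold(n_splits=7)                    # one donor per fold
for fold, (train_idx, test_idx) in enumerate(
        gkf.split(features, labels, groups=patient_ids)):
    # Hold out 10% of the training data as a development set for model selection.
    tr_idx, dev_idx = train_test_split(train_idx, test_size=0.10, random_state=fold)
    # Training/evaluation calls are placeholders for the SqueezeNet routines:
    # model = fit(features[tr_idx], labels[tr_idx],
    #             dev=(features[dev_idx], labels[dev_idx]))
    # report(model, features[test_idx], labels[test_idx])
    print(f"fold {fold}: {len(tr_idx)} train, {len(dev_idx)} dev, {len(test_idx)} test")
```

Because splitting is by donor rather than by patch, no patient contributes data to both training and testing, which is what prevents the subject-specific leakage discussed above.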

2.2. Evaluation of the U-Net for Parenchymal Amyloid Segmentation

To enable pixel-level segmentation of parenchymal Aβ deposits in full-size, ThioS-stained micrographs, the U-Net model was trained on CAM-derived binary pseudo-labels generated by the previously trained SqueezeNet, distinguishing foreground (FG; amyloid structures) from background (BG; all other tissue or image regions). Although training was conducted under weak supervision, the final evaluation was performed against manually annotated ground-truth (GT) masks that were never used during training.
For each fold in the 7-fold patient-wise cross-validation, CAMs were extracted from the corresponding SqueezeNet models; they were then postprocessed into pseudo-masks via two-level thresholding, followed by extensive object-level augmentation: FG objects were inpainted out and randomly reinserted, modifying boundaries and spatial context to expand the training distribution. Notably, the development set for each fold also originated from CAM-based pseudo-masks and underwent identical on-the-fly augmentations, ensuring that performance metrics reflected predictions on fully processed inputs, rather than raw (i.e., thresholded) CAMs. Only the test set, which contained the full images, relied on true manually annotated GT masks, kept strictly isolated from the training data, and processed in a patch-based manner.
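The two-level thresholding step can be sketched as follows; the thresholds 0.7/0.3 and the function name are illustrative assumptions, not the paper's exact settings:

```python
# Sketch: convert a CAM into a three-valued pseudo-mask. High-confidence
# activations become FG, low activations BG, and the uncertain band in between
# is flagged so it can be inpainted out before object-level augmentation.
import numpy as np

def cam_to_pseudomask(cam: np.ndarray, t_fg: float = 0.7, t_bg: float = 0.3) -> np.ndarray:
    """Return a mask with 1 = foreground, 0 = background, 255 = uncertain."""
    cam = (cam - cam.min()) / (np.ptp(cam) + 1e-8)   # rescale CAM to [0, 1]
    mask = np.full(cam.shape, 255, dtype=np.uint8)   # start as 'uncertain'
    mask[cam >= t_fg] = 1                            # confident amyloid regions
    mask[cam <= t_bg] = 0                            # confident background
    return mask
```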
The U-Net models were trained fold-wise, maintaining strict patient-level separation between training and testing data. For each of the seven folds, the model was trained on six folds and tested on the remaining one. During training, 10% of the data was allocated to a development set solely for tracking stable loss and accuracy trends (Figure 3), while model snapshots were saved exclusively at the end of the 30th epoch, regardless of the development set performance.
Across the seven folds, the segmentation model achieved an average Dice similarity coefficient of 0.763, Jaccard index of 0.639, recall of 0.721, and precision of 0.877. Pixel-wise accuracy (PA) and specificity were high at 0.990 and 0.997, respectively; however, due to the strong FG-BG imbalance inherent in amyloid segmentation, these metrics are less informative. The model also yielded a false positive rate of 0.003, a false discovery rate of 0.123, and a false negative rate of 0.279 (Supplementary Table S2).
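For clarity, all of the reported overlap and error-rate metrics follow from the pixel-wise confusion counts of a binary prediction against the GT mask; a compact self-contained sketch (function name ours):

```python
# Sketch of the reported segmentation metrics computed from binary masks.
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)    # foreground pixels correctly predicted
    fp = np.sum(pred & ~gt)   # background predicted as foreground
    fn = np.sum(~pred & gt)   # missed foreground
    tn = np.sum(~pred & ~gt)  # background correctly predicted
    eps = 1e-8                # guard against empty masks
    return {
        "dice":        2 * tp / (2 * tp + fp + fn + eps),
        "jaccard":     tp / (tp + fp + fn + eps),
        "recall":      tp / (tp + fn + eps),
        "precision":   tp / (tp + fp + eps),
        "pixel_acc":   (tp + tn) / (tp + tn + fp + fn + eps),
        "specificity": tn / (tn + fp + eps),
        "fpr":         fp / (fp + tn + eps),
        "fdr":         fp / (fp + tp + eps),
        "fnr":         fn / (fn + tp + eps),
    }
```

The strong FG-BG imbalance noted above is visible directly in these formulas: pixel accuracy and specificity are dominated by the huge tn term, which is why Dice and Jaccard are the more informative summaries here.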
Taken together, our results show that dense segmentation of parenchymal amyloid structures can be effectively approximated under weak supervision and object-level augmentation, without manual pixel-level annotations. Despite relying solely on CAM-derived pseudo-labels, the model maintained robust performance across folds against human-annotated GT (Figure 4). This demonstrates a viable and scalable strategy for generating reliable histopathology maps from image-level labels using CAMs, object-level image synthesis, and robust data augmentation, with minimal manual annotation effort.

2.3. Morphometric Profiling of Amyloid Plaques via Particle Analysis

To assess morphological characteristics of ThioS-stained amyloid deposits across brain regions and cognitive statuses, we used a final segmentation model trained on the entire available dataset to generate binary masks for detailed particle analysis. Unique deposits were extracted as connected components, followed by morphometric measurements (Supplementary Data) with ImageJ (version 1.47) and the FracLac plugin (for details, see https://imagej.net/ij/plugins/fraclac/FLHelp/Introduction.htm; accessed on 15 February 2025).
The measured parameters were then analyzed across groups based on the donor’s brain region and dementia status. Principal Component Analysis (PCA) was conducted on 19 morphological parameters of the deposits, revealing three principal components (PCs) that together explained over 85% of the total variance in the dataset (Figure 5A). Component loadings reflect standardized coefficients of the original variables, indicating their relative contribution to each PC (Supplementary Table S3).
The first component (PC1) primarily loaded on overall size-related descriptors, including deposit area (0.315), perimeter (0.287), compact area (0.274), hull area (0.304), and mean radius (0.291). This component captures the overall deposit size and extent. Integrated density (0.314), while intensity-based, is closely tied to area (total signal), thus aligning with size-related parameters. PC2 was mainly associated with shape complexity metrics, dominated by solidity (0.395), circularity (0.372), fractal dimension (0.313), and lacunarity (−0.370). Diffuseness index (−0.310), though intensity-based, reflects the spatial spread of intensity and contributes to morphological complexity. PC3 was driven by global geometric features, particularly span ratio (0.520), max/min radii (0.469), and hull circularity (−0.511), characterizing overall elongation and spatial arrangement. Group-wise comparisons of component scores revealed significant regional and cognitive-status-dependent differences. One-way analysis of variance (ANOVA) followed by Tukey’s post hoc test indicated that all three components differed significantly between the groups (PC1: F = 7.29, p < 0.001; PC2: F = 33.72, p < 0.001; PC3: F = 35.82, p < 0.001) (Figure 5B, Supplementary Tables S4 and S5). PC1 scores were significantly higher in the parietal + dementia and temporal + dementia groups compared to parietal + non-dementia (p < 0.001), suggesting an overall increase in deposit size in the parietal cortex of individuals with dementia. PC2 scores were elevated in parietal + dementia relative to all other groups (all p < 0.05), reflecting increased texture irregularity and complexity. Interestingly, the strongest contrast was observed between parietal + dementia and temporal + dementia (p < 0.001), while PC2 scores were significantly lower in temporal + dementia compared to all other groups (p < 0.001), suggesting region-specific morphological divergence in the presence of dementia. PC3 scores displayed robust group differences, with parietal + dementia consistently scoring higher than all other groups (all p < 0.001), pointing toward expanded radial structure and elongation of deposits. The temporal + dementia group also showed elevated scores compared to both groups without dementia (all p < 0.01), although scores remained lower than those observed in the parietal + dementia group (p < 0.001). Within each brain region, groups with dementia showed significantly higher scores than the corresponding groups without dementia (all p < 0.001), reinforcing the relevance of PC3 in capturing dementia-associated morphological traits.
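The statistical workflow of this paragraph (standardization, PCA, one-way ANOVA, Tukey's post hoc test) can be sketched as follows; the data, group labels, and exact library calls are illustrative placeholders rather than the study's actual analysis scripts:

```python
# Sketch: PCA on standardized morphometric descriptors, then one-way ANOVA
# with Tukey's HSD on the component scores, per principal component.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 19))   # placeholder: 500 deposits x 19 parameters
group = rng.choice(["par+dem", "par", "temp+dem", "temp"], size=500)

scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(X))

for i in range(3):
    samples = [scores[group == g, i] for g in np.unique(group)]
    F, p = f_oneway(*samples)                    # one-way ANOVA per component
    print(f"PC{i + 1}: F = {F:.2f}, p = {p:.3g}")
    print(pairwise_tukeyhsd(scores[:, i], group))  # all pairwise group contrasts
```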
Altogether, these findings suggest that distinct components of deposit morphology (PC1, PC2, and PC3) differ significantly across both brain region and cognitive status, supporting the hypothesis that deposit architecture is not only regionally specific but also reflects underlying neurodegenerative processes. However, due to the limited number of donor cases, these PCA and group-wise comparison results should be interpreted with caution and are primarily intended to serve a demonstrative and exploratory purpose.

3. Discussion

3.1. SqueezeNet-Based CAM for Weakly Supervised Localization

In this study, SqueezeNet was used as the backbone for a fully supervised image-level classification task and as the base of the WSOL framework, enabling the identification of parenchymal amyloid deposits from histopathological micrographs. To improve coarse localization, the architecture was complemented with a TAP layer, allowing effective extraction of CAMs using image-level labels alone.
While CNNs are typically optimized to maximize accuracy, smaller models that retain strong performance offer practical benefits, particularly in resource-constrained environments. SqueezeNet, a compact, modular CNN architecture for classification tasks known for achieving performance comparable to AlexNet with a ~50× reduction in parameter count [56], was therefore used to extract CAMs. To reduce complexity and parameter count, SqueezeNet employs ‘fire modules’ that favor 1 × 1 filters over 3 × 3 filters and reduce the size of layers containing 3 × 3 filters by limiting their input channels. Delayed downsampling preserves spatial detail by maintaining larger activation maps in deeper layers, while the global pooling layer further aids regularization, enhances spatial robustness, and strengthens the connection between feature maps and categories [56], making SqueezeNet inherently suitable for CAM.
In standard CNN classifiers, convolutional layers act as feature extractors, and the final feature representations, obtained by vectorizing the last convolutional layer feature maps, are typically classified using fully connected layers with softmax or sigmoid activation [57]. As an alternative, global pooling layers convert activation maps into fixed-size vectors by aggregating spatial information across feature maps. This reduces overfitting and enhances generalization while preserving the convolutional structure and improving interpretability by linking feature maps to categories [57]. Despite potential trade-offs in accuracy and task-limited applicability [58], global pooling is especially valuable in domains like medical imaging, where transparent and explainable predictions are essential. The most common global pooling operations are GAP and Global Max Pooling (GMP). GAP is often preferred for localization tasks as it averages the entire feature map, maximizing the output score when the object is fully activated, making it effective for capturing the full extent of an object [26]. However, it may suppress highly activated regions, potentially affecting classification performance [59]. In contrast, GMP focuses on the strongest activation, making it better suited for identifying the most discriminative regions, as lower-activation areas do not influence the result. Studies related to the CAM technique have explored similar approaches using GMP [60] and log-sum-exp pooling [61]. Additional global pooling operations have been developed to address emerging challenges, such as Global Average of Top-K Max-Pooling [62], Top-GAP [63], Deep Generalized Max-Pooling [64], and TAP [59].
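To make the GAP/TAP distinction concrete, a minimal PyTorch sketch is given below; the threshold value is illustrative, and the exact TAP formulation of [59] may differ in detail:

```python
# Sketch of Thresholded Average Pooling (TAP) versus conventional GAP.
# TAP averages only activations above a threshold, so strong responses are
# not diluted by large inactive regions, while low activations are ignored.
import torch

def thresholded_avg_pool(fmap: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """fmap: (N, C, H, W) -> (N, C); average over pixels with activation > tau."""
    keep = (fmap > tau).float()
    pooled = (fmap * keep).sum(dim=(2, 3)) / keep.sum(dim=(2, 3)).clamp(min=1.0)
    return pooled

x = torch.rand(2, 512, 35, 35)    # e.g., final convolutional feature maps
gap = x.mean(dim=(2, 3))          # Global Average Pooling over all pixels
tap = thresholded_avg_pool(x)     # TAP: average over supra-threshold pixels only
print(gap.shape, tap.shape)       # torch.Size([2, 512]) in both cases
```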
Generalized CAM methods, like Grad-CAM [58], Grad-CAM++ [65], Score-CAM [66], Poly-CAM [67], and HR-CAM [68], explore the decision-making of an arbitrary CNN. Grad-CAM uses output gradients to weight the feature maps, offering broad applicability across computer vision tasks without requiring specialized architectures [58], while Grad-CAM++, using scaling factors derived from higher-order gradients, improves localization when multiple class instances are present [65]. Still, gradient-based methods are computationally intensive, while mapping resolution also depends on the last convolutional layer, and resolution-enhancing solutions may reduce transparency, especially when involving multiple layers, making global pooling a more common choice in WSOL [59]. We opted for a simpler CAM method with a TAP layer, as the primary goal was to train a binary image classifier to detect parenchymal amyloid structures, rather than solving more complex computer vision tasks. Notably, SqueezeNet, being fully convolutional, requires no major architectural changes (e.g., removing fully connected, convolutional, or pooling layers) to apply CAM and achieve appropriate mapping resolution, making it particularly well-suited for this purpose.
Although the focus here is on evaluating classification performance, architectural and pooling choices were driven by the downstream need for spatial localization. Conventional CAM approaches, such as those based on GAP, are known for their ease of implementation but often suffer from insufficient activation spread [58]. Replacing GAP with TAP allowed us to retain high activation specificity without discarding lower-activation regions that could still carry clinically meaningful information. This was particularly relevant given the subtle and variable nature of ThioS staining, where pathology could also manifest in diffuse or weakly marked regions.
Notably, direct quantitative evaluation of the CAM-derived pseudo-labels (e.g., via an ablation study) was not performed. Both patch-level assessment and full-image reconstruction would require substantial additional complexity (e.g., postprocessing or architectural modifications) while offering only limited interpretability, owing to conceptual and technical constraints: the binary-labeled, patch-based dataset structure, partial tile overlap, the relative scaling of CAM activations, two-level thresholding, and resolution mismatch. Together, these factors introduce FG-BG distribution bias and redundancy and complicate intensity alignment (i.e., seamless integration), particularly in handling negative (plaque-free) regions.
Our results confirmed that SqueezeNet with skip connections (Simple Bypass) and a TAP layer learned discriminative representations from a relatively small and heterogeneous dataset (Figure 2), demonstrating robust classification in subject-wise 7-fold cross-validation and suggesting good generalization to unseen individuals (Supplementary Table S1). This is a critical requirement in medical applications, where variability across patients often exceeds that within samples from a single subject. Avoiding gradient-based CAM methods (e.g., Grad-CAM, Grad-CAM++) was motivated by the need for computational simplicity and architectural compatibility. While these methods can improve localization, they require backpropagated gradients during inference, limiting real-time or large-scale use. In contrast, the TAP-based solution yielded heatmaps without incurring these computational costs or architectural dependencies. However, as with all weakly supervised methods, one major limitation is the trade-off between localization accuracy and classification performance. Despite optimizing the pooling strategy, the final CAMs remained limited in spatial resolution due to the downsampling inherent in deep convolutional layers. While we partially mitigated this by increasing the input resolution (572 × 572) and leveraging SqueezeNet’s late downsampling strategy, the resulting activation maps still lacked precise boundary definition, a known bottleneck of CAM-based localization. Therefore, these CAMs were used as coarse initializations for the subsequent segmentation stage, rather than as final segmentation outputs.
Our model achieved strong classification performance metrics, including a mean accuracy of 97.94%, precision of 98.53%, recall of 96.99%, and an F1-score of 97.74%, highlighting its effectiveness in distinguishing image patches with versus without parenchymal amyloid pathology. Notably, these results were validated through patient-wise cross-validation, which confirmed the model’s robustness and generalizability across patients. This cross-validation approach is particularly valuable in ensuring that the model performs consistently and accurately, even when exposed to previously unseen patients. The findings suggest that the classifier is not only reliable in controlled experimental settings but also applicable to real-world histopathological scenarios, where variability between patients is inevitable.
In summary, the SqueezeNet + TAP framework provided a practical and efficient solution to generate interpretable activation maps suitable for guiding downstream weakly supervised segmentation stages and refinement with minimal annotation needs, while simultaneously achieving competitive classification performance. This phase established the groundwork for refinement and object-level morphometric analysis in the subsequent stages of our pipeline.

3.2. U-Net Segmentation with Advanced Object-Level Augmentation

Our findings demonstrate that weak supervision, when combined with targeted object-level augmentation, can effectively train deep segmentation models to delineate parenchymal amyloid deposits. Despite relying on training labels derived from weakly annotated images, the U-Net model generalized well to manually annotated GT across folds. Our results highlight the capacity of synthetic training data, generated via modular object–BG recombination, to capture real biological structures and refined boundaries with high fidelity.
WSSS with image-level labels aims to generate dense pseudo-masks using classification models, typically via CAMs that highlight semantically relevant image regions. These initial regions are then refined using postprocessing techniques and/or transformed into pseudo-masks to train segmentation networks, though their inherent coarseness poses challenges for learning under noisy/imperfect supervision [69,70]. Segmenter CNNs often struggle with precision due to architectural constraints like spatial invariance and smoothing effects, exacerbated by class imbalance, uncertainty, and overfitting on label noise, resulting in oversmoothed or inaccurate class transitions. Boundary recovery, applied after classification, localization, and segmentation, refines coarse predictions by sharpening edge pixels (i.e., class transitions), improving accuracy without altering semantics, typically through postprocessing that enforces local coherence and aligns predictions with anatomical contours [70]. Superpixel-based methods partition images into perceptually coherent regions to refine CNN segmentations by sharpening edges, either as postprocessing or auxiliary training cues; however, their effectiveness depends on manually selected granularity, limiting scalability [70,71]. Conditional Random Fields also serve as postprocessing modules [33] or integrated differentiable layers [72], improving localization and spatial smoothness but requiring separate optimization or adding computational overhead with increased model complexity. Multiscale architectures fuse shallow and deep features to enhance boundary detail [32], and boundary-aware loss functions guide training to emphasize pixel-level accuracy at class transitions [73,74,75]. Rather than relying on a single noisy proposal, some WSSS methods exploit multiple annotation candidates as joint multilabel supervision, enabling the model to capture shared semantics and reduce overfitting [76]. Soft labeling, originally designed for classification, has been adapted to image segmentation to handle ambiguity, inter-class confusion, observer variability, and poorly defined regions, and to prevent overfitting on noisy labels. Beyond simple uniform or Gaussian-based label smoothing, segmentation methods apply soft labels in a spatially structured manner (e.g., dilated masks, spatially varying smoothing, or geodesic distance-based weighting) to better reflect spatial structure and image context [77,78,79,80]. Our approach likewise focuses on refining coarse pseudo-masks by training a U-Net with object-level augmentation and contour-perturbation-based image synthesis.
U-Net, a widely used fully convolutional network architecture introduced by Ronneberger et al. [75], was originally designed for biomedical image segmentation and is now applied across various domains. It excels in several modalities, including electron microscopy, magnetic resonance imaging, and computed tomography scans, for identifying tissues, tumors, and anatomical structures, and has also been successfully applied to general image segmentation tasks beyond medical applications [81,82,83]. Designed to handle the limited annotated data common in medical imaging, it uses a symmetric encoder–decoder architecture to enable precise segmentation. The encoder captures contextual information through repeated unpadded 3 × 3 convolutions and 2 × 2 max pooling, gradually downscaling spatial dimensions and expanding feature depth. The decoder reverses the encoding steps by applying transposed convolutions for upsampling and concatenates cropped feature maps from the encoder via skip connections to recover fine spatial details. Cropping is required because the unpadded convolutions reduce feature map size, and it ensures that encoder and decoder features align correctly. The network concludes with a 1 × 1 convolution to produce pixel-wise class probabilities, generating accurate segmentation masks while avoiding border artifacts [75,81].
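A minimal PyTorch sketch of this encoder–decoder structure follows. For brevity it uses padded 3 × 3 convolutions (so no cropping is needed at the skip connections), whereas the original U-Net [75] uses unpadded convolutions and crops encoder features before concatenation; depth and channel counts are also reduced here:

```python
# Tiny U-Net-style network illustrating the encoder/decoder with skip
# connections and a final 1x1 convolution producing pixel-wise class logits.
import torch
import torch.nn as nn

def double_conv(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.enc1, self.enc2 = double_conv(1, 64), double_conv(64, 128)
        self.bottleneck = double_conv(128, 256)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = double_conv(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = double_conv(128, 64)
        self.head = nn.Conv2d(64, n_classes, 1)   # 1x1 conv -> pixel-wise logits

    def forward(self, x):
        e1 = self.enc1(x)                                      # encoder level 1
        e2 = self.enc2(self.pool(e1))                          # encoder level 2
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))    # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))   # skip connection
        return self.head(d1)

logits = TinyUNet()(torch.rand(1, 1, 128, 128))   # one grayscale patch
print(logits.shape)                                # torch.Size([1, 2, 128, 128])
```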
Despite the exceptional performance of modern deep learning models in computer vision, data augmentation remains essential to improve robustness, generalization, and data efficiency. Acting as a powerful regularizer, augmentation helps mitigate overfitting, address domain shift (distribution inconsistencies), and alleviate issues like data scarcity, class imbalance, and limited representativeness by generating high-quality synthetic/virtual training examples, enhancing dataset volume, diversity, and realism, and making models more resilient to real-world variability [84,85]. Beyond classification, it plays an essential role in detection, localization, and segmentation, particularly in high-stakes, annotation-heavy domains like medical imaging and microscopic analysis, where manual labeling is costly, time-consuming, and expertise-bound. Far from being auxiliary, augmentation is now a core component of modern deep learning pipelines [85,86,87]. Within this context, transformation-based input space augmentations can be broadly divided into classical and advanced strategies. Classical methods include geometric transforms (e.g., rotation, flipping, translation, shearing, scaling, cropping, affine/projective warping, non-linear deformations) and non-geometric manipulations (e.g., noise injection, filtering, and photometric/color space adjustments like brightness/contrast changes or jittering) [85,86]. Advanced strategies include region- and pixel-level mixing or erasure, alongside data-driven, meta-learning, and automated techniques that mark the frontier of current augmentation research [84,85,86,87]. Image-mixing augmentation blends one or more samples and their labels either pixel-wise (e.g., weighted averaging) or patch-wise [88]. A foundational method, MixUp [89], interpolates between two images and labels using a mixing ratio from a beta distribution, inspiring variants like Manifold Mixup [90] and patch-based approaches such as CutMix [91]. Later refinements improve patch boundaries, exploit saliency or activation maps, and mix one or multiple images via novel strategies, enhancing generalization and robustness against noise or adversarial data across tasks like classification, localization, segmentation, and weakly supervised methods (WSOL, WSSS) [88].
To tackle challenges in weakly supervised learning and semantic segmentation, advanced augmentations manipulate object-level and contextual features to generate more diverse and informative samples, reduce overfitting, and improve robustness. ClassMix [92] blends unlabeled images using masks from network predictions while respecting object boundaries; Context Decoupling [93] separates objects from BGs to emphasize object-specific cues; and LCAMix and HSMix [94,95] employ superpixels (homogeneous regions) to maintain local structure and contour integrity in medical image mixing, countering isotropy. Object-aware methods like Cut-Paste-and-Learn (instance detection) [96] extract and store objects and BG scenes in data banks, apply geometric transforms, and blend inserted objects using Poisson or Gaussian smoothing to reduce boundary artifacts, while Simple Copy-Paste [97] simplifies this by randomly pasting objects between images without modeling visual context, addressing data needs and aiding long-tail category balancing, with possible semi-supervised learning integration. ObjectAug [98] addresses object–BG connectivity, class imbalance, and boundary augmentation by applying classical augmentations to separated objects using semantic labels and filling the resulting gaps with image inpainting; KeepMask and KeepMix [99] vary BGs while preserving FGs to avoid overfitting in medical segmentation; and TumorCP [100] uses randomized Copy-Paste with Gaussian blur, demonstrating the effectiveness of inter-patient augmentation. Soft-Copy and Soft-Paste (Soft-CP) [101] improve medical segmentation with an object-blending method that smooths object edges via erosion/dilation and adjusts relevance based on distance; unlike Poisson or Gaussian smoothing, this preserves lesion structure without distorting medical information, while object- and image-level transforms applied to lesions and BGs further enhance diversity.
ROI contour modification has previously been explored in radiomics, not for segmentation itself, but to improve the robustness of quantitative imaging feature extraction under delineation uncertainty [102,103], in response to two critical challenges: the limited size and class imbalance of datasets, and the observer variability in manual ROI delineations that undermines feature reliability. To address these issues, recent studies have introduced geometric perturbation-based augmentation pipelines operating directly on ROI masks, offering an alternative to conventional feature-level resampling, applying stochastic geometric transformations such as erosion, dilation (volume adaptation), translation, and particularly contour randomization to simulate interobserver variability and evaluate feature stability and discriminative power. By introducing clinically plausible boundary variations without the need for multiple expert annotations, these ROI perturbations mimic natural delineation inconsistencies and uncertainty, augment sample size and diversity, enhance feature robustness, improve the generalizability of radiomic classification models, and aid the reliable recognition of clinically relevant patterns [102,103].
Inspired by these prior object-level strategies, we adapted, at the object level, the concept of synthetic contour generation commonly used in texture segmentation tasks, including medical imaging applications such as hematoxylin/eosin lymphoma mosaics [104]. Specifically, we randomize the keypoints of simple, rounded borders extracted from high-confidence CAM regions, while preserving their internal core structure. These transformed objects are then embedded into novel BG scenes using Soft-CP, which mimics natural-looking blending, while uncertain regions are inpainted to suppress noise. This process generates synthetic yet anatomically plausible FG-BG transitions and shape variations. Our method mitigates the model’s tendency to learn overly smooth boundaries caused by upscaled CAM artifacts by exposing it to diverse contour alterations during training, thereby promoting better generalization to unseen real spatial configurations at inference. Separation of images into discrete FG and BG components enabled combinatorial augmentation, vastly expanding training diversity and enhancing the network’s capacity to learn both fine-grained contours and broader contextual cues. This is reflected in the model’s strong precision (0.877) and recall (0.721), suggesting a robust ability to differentiate between deposit and non-deposit regions.
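The two core operations of this strategy, contour keypoint randomization and soft, distance-weighted pasting in the spirit of Soft-CP [101], can be sketched as follows. The jitter amplitude, feather width, and kernel sizes are illustrative assumptions, not the study's actual parameters:

```python
# Sketch: perturb an object's boundary keypoints (keeping an eroded core
# intact), then blend the object into a new background with a soft edge.
import cv2
import numpy as np

rng = np.random.default_rng(42)

def perturb_contour(mask: np.ndarray, jitter: float = 3.0) -> np.ndarray:
    """Randomly displace contour keypoints of a binary mask; preserve the core."""
    contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    out = np.zeros_like(mask, dtype=np.uint8)
    for c in contours:
        pts = c.reshape(-1, 2).astype(np.float64)
        pts += rng.normal(scale=jitter, size=pts.shape)   # jitter boundary keypoints
        cv2.fillPoly(out, [pts.round().astype(np.int32)], 1)
    core = cv2.erode(mask.astype(np.uint8), np.ones((5, 5), np.uint8))
    return out | core                                      # keep internal structure

def soft_paste(bg: np.ndarray, obj: np.ndarray, mask: np.ndarray,
               feather: int = 5) -> np.ndarray:
    """Blend `obj` into `bg` with alpha decaying over `feather` px at the edge."""
    dist = cv2.distanceTransform(mask.astype(np.uint8), cv2.DIST_L2, 3)
    alpha = np.clip(dist / feather, 0, 1)                  # soft transition zone
    return (alpha * obj + (1 - alpha) * bg).astype(bg.dtype)
```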
Despite the use of aUF and object-level augmentations favoring the rare FG class, the recall (0.721) and FNR (0.279) values indicate that some deposits, such as extremely faint or fragmented structures, remained undetected; this is not unexpected, as the model was never exposed to manually annotated data. Another factor contributing to the mild under-segmentation is the consistent separation of deposits from adjacent vascular elements, since the GT masks, produced via ImageJ automation and manual editing, were less refined in this regard. Post hoc visual inspection confirmed that many unsegmented areas corresponded to embedded or directly contacting blood vessels, or to ambiguous structures that are difficult to delineate despite efforts to exclude them during annotation, suggesting that missing regions were not due to model bias but rather reflect challenges in the GT itself, such as imprecise vessel–deposit boundaries. Furthermore, the strict CAM thresholds may have excluded some ambiguous or borderline features. The model’s exceptionally low false positive rate (FPR = 0.003) confirms its ability to exclude BG and vasculature, enabling parenchyma-specific amyloid detection. This focused identification supports accurate isolation of the parenchymal Aβ fraction, facilitating in-depth analysis of plaque pathology and supporting research into targeted therapies.
The model’s performance is validated by an average Dice score of 0.763 and a Jaccard index of 0.639, indicating accurate amyloid deposit segmentation, particularly impressive given the limited supervision. Patient-wise cross-validation further confirms its robustness, showing strong generalization to unseen cases and reliable discrimination of relevant structures in complex histological environments.
Taken together, these findings validate the core premise of our pipeline: that accurate segmentation of histological pathology can be achieved through a carefully designed augmentation framework, even when GT annotations are indirect. By leveraging CAMs not only as weak labels but as anchors for object-centric augmentation, we demonstrate a scalable strategy for training robust segmentation models in data-limited medical imaging scenarios.

3.3. Morphometric Profiling of Amyloid Plaques via Particle Analysis

The final stage of our analysis delved into the multifaceted morphometric characterization of ThioS-stained amyloid plaques, providing a nuanced perspective on their size, shape complexity, geometric symmetry, and staining intensity. By leveraging an extensive suite of 19 parameters extracted from segmented deposits, we explored structural variations across brain regions and cognitive status, revealing latent morphological patterns through dimensionality reduction and statistical comparison. PCA revealed three principal axes of morphometric variation (PC1–3), each capturing distinct subsets of deposit morphology with biologically interpretable architectural traits.
PC1, which captured the largest share of morphometric variance, predominantly reflected differences in plaque size, suggesting that spatial extent is the most salient morphological feature distinguishing amyloid deposits and potentially influencing their biological impact. Among the morphometric parameters, several size-related features, such as deposit and core area, perimeter, and convex hull-based descriptors, showed consistent co-loading on PC1, supporting a dominant size-related component. Notably, these traits were log-transformed to account for the known log-normal distribution of amyloid plaque sizes in AD, which likely reflects volume-proportional growth patterns [14,105,106]. In line with this trend, integrated density, representing cumulative staining intensity, also aligned with this axis, linking fluorescence accumulation to plaque size, as larger deposits typically accumulate a greater amount of fluorescence signal. Collectively, these parameters capture the overall size and spatial extent of deposits, dimensions that, as revealed in the results, were elevated in the groups with dementia, particularly in the parietal region, suggesting an increased plaque burden associated with dementia. This aligns with previous neuropathological findings linking cortical amyloid load to advanced disease stages and cognitive decline [4].
PC2 was dominated by structural complexity and spatial heterogeneity metrics, shaped by non-linear descriptors such as fractal dimension and lacunarity, alongside shape compactness measures like circularity and solidity. Fractal dimension, a scale-invariant measure quantifying self-similarity and organization, provides insights into structural complexity and is commonly applied to both cellular [107,108,109,110] and non-cellular structures, such as Aβ deposits in AD [11,111], which in this study was positively weighted in PC2. Thus, our findings are partly consistent with previous reports of increased fractal dimension in advanced stages of dementia [11], particularly given the PC2 increase observed in the parietal region. However, the decrease observed in the temporal lobe may point to region-specific differences. It is also important to note that our efforts to exclude vascular structures associated with plaques during segmentation may have influenced this metric, as such structures are likely to affect the apparent structural complexity captured by fractal analysis. In contrast, lacunarity was represented with a strong negative weight in PC2, which increased in the parietal region in the group with dementia, while it decreased in the temporal region. This regional variability may reflect both discontinuities in diffuse, looser structures and differences in subregional patterns within more complex plaque types. In the current analysis, roughness, reflecting deposit boundary irregularity, was negatively weighted in PC2, while circularity and solidity were strongly positively weighted, as their higher values indicate compactness, suggesting denser, morphologically simpler plaques. Notably, circularity has previously been used to classify ThioS-positive plaques by morphology [112]. Given that PC2 increased in the parietal region of dementia cases and decreased in the temporal region, this pattern may indicate a regional shift toward more compact and geometrically regular plaques in the parietal cortex in dementia, whereas plaques in the temporal cortex may retain more irregular, fragmented morphologies. Importantly, the diffuseness index, while technically an intensity-derived metric, quantifying the proportion of weakly stained areas relative to the total deposit area [113], loaded negatively onto this axis, indicating its role in capturing the degree of spatial spread and lack of compactness. PC2 scores were consistently highest in the dementia parietal group and lowest in the dementia temporal group, reflecting regionally divergent shifts in plaque morphology. These shifts involved not only size-related increases but also opposing trends in parameters associated with structural irregularity, compactness, and shape complexity. Such contrasting patterns may point to region-specific pathological trajectories, potentially shaped by differential plaque maturation dynamics, tissue organization, or local microenvironmental influences.
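Since fractal dimension carries much of PC2's loading, a simplified box-counting sketch is given below for concreteness; FracLac's actual algorithm differs (it typically scans multiple grid positions and also derives lacunarity from box-mass variation), so this is an illustration of the measure, not the plugin:

```python
# Sketch: box-counting fractal dimension of a binary deposit mask.
# Count boxes of decreasing size that contain foreground, then fit the slope
# of log(count) against log(1/box_size); the slope estimates the dimension.
import numpy as np

def box_counting_dimension(mask: np.ndarray) -> float:
    mask = mask.astype(bool)
    n = 2 ** int(np.floor(np.log2(min(mask.shape))))
    mask = mask[:n, :n]                        # crop to a power-of-two square
    sizes, counts = [], []
    size = n
    while size >= 2:
        # Any-pool the mask into (n/size, n/size) boxes of side `size`.
        boxes = mask.reshape(n // size, size, n // size, size).any(axis=(1, 3))
        sizes.append(size)
        counts.append(boxes.sum())
        size //= 2
    slope, _ = np.polyfit(np.log(1 / np.array(sizes)),
                          np.log(np.array(counts) + 1e-8), 1)
    return slope    # estimated fractal dimension
```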
PC3 captured global geometric aspects of plaque morphology, emphasizing spatial symmetry and elongation as key characteristics. Elevated span ratio and max/min radius values reflected more anisotropic, elongated plaque shapes, while low hull circularity values reflected departures from global circular symmetry. Although less commonly assessed in traditional neuropathology (despite prior observations of region-specific patterns in plaque elongation [52]), these features offered additional insight into plaque morphotypes. Notably, PC3 was significantly elevated in both groups with dementia, driven by increases in elongation measures and decreases in symmetry, as it was positively weighted by span ratio and max/min radius and negatively by hull circularity, suggesting a morphological shift toward more elongated and asymmetrical plaques in dementia. This pattern may reflect underlying pathological processes affecting plaque growth dynamics, interactions with the vasculature, or microenvironmental factors that promote anisotropic deposition processes in the diseased brain.
From a biological perspective, the observed morphometric shifts may reflect region-specific tissue architecture, differential plaque maturation stages, or distinct mechanisms of amyloid aggregation and clearance. By applying a joint statistical modeling approach, these multifactorial influences may have been partially disentangled, allowing for the characterization of both global patterns and group-specific morphotypes. Importantly, our analysis underscores the value of incorporating non-traditional descriptors, such as fractal and convex hull-based parameters, which capture subtle structural features beyond conventional size metrics. These advanced measures may provide sensitive readouts for stratifying plaque subtypes or monitoring disease progression.
While the number of analyzed deposits was high, the relatively small number of donor cases (n = 7) warrants extreme caution in interpreting group-level trends. All statistical examinations presented in this study must therefore be considered strictly exploratory and overinterpretation at this scale would be fundamentally misleading. The dataset’s narrow scope inherently precludes deeper population-level inference. The main objective of these analyses is not to establish definitive, robust biomarker-level associations, but to illustrate the technical feasibility, practical applicability, and scalability of our segmentation framework. In this light, our morphometric analysis should be regarded primarily as demonstrative, with severely limited interpretability in the context of disease progression. The heterogeneity of cortical pathology is well-documented, and such a small and non-representative donor sample inevitably limits the generalizability and statistical reliability of any observed patterns.
While the consistent region- and dementia-associated patterns across all three PCs are promising and support the utility of high-dimensional morphometric profiling as a complementary tool in neuropathological and imaging-based studies, their statistical robustness remains fundamentally constrained by the dataset size. These findings are intended to motivate, rather than conclude, and larger, more representative cohorts, including multi-region studies, will be essential to validate, refine, and expand these proof-of-concept analyses and explore their true translational potential.

4. Materials and Methods

4.1. Brain Sections

For the WSSS and subsequent morphometric analysis of amyloid pathology using ThioS staining, 7–8 μm thick paraffin-embedded brain tissue sections were used [114]. The samples from parietal and temporal cortices (or subfields of these areas) of elderly subjects without dementia and with AD were obtained from the Netherlands Brain Bank (Nederlandse Hersenbank/Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, The Netherlands). The samples were collected from donors for whom, or from whom, written informed consent for a brain autopsy had been obtained, together with permission for the use of tissue samples and the anonymous use of clinical information (Project 598/2009). The tissue samples were supplemented with the following clinical/pathological information: autopsy code, age, gender, diagnosis, Braak stage, post-mortem delay, cerebrospinal fluid pH, brain mass, Apolipoprotein E, and brain area/region; some of these are included in Supplementary Table S6.
Out of the various brain regions examined, only sections from the temporal and parietal lobes provided sufficient quantity and quality for morphometric analysis. Comparable images could not be obtained during the initial examinations from other available regions (thalamus, subthalamus, locus coeruleus, pons, and hippocampus) and from samples of subjects in Braak 0 and Braak 1 stages, due to the absence of well-identifiable plaques or ThioS-positive parenchymal amyloid structures. For each case, a single section from the superior temporal gyrus (temporal lobe) and one from the superior parietal gyrus (parietal lobe) were analyzed, both from subjects with and without dementia, resulting in a total of fourteen sections from seven individuals (two Braak 2 (no dementia), one Braak 5, and four Braak 6 (with dementia)).

4.2. ThioS Staining and Epifluorescence Microscopy for Amyloid Detection

ThioS is a widely used, non-specific fluorescent histological dye for detecting various forms of tissue amyloid and pre-amyloid deposits, including amyloid plaques [18]. It binds non-selectively to the characteristic β-sheet conformations of amyloid fibrils and other proteins and emits green fluorescent light [16,17,18]. The sections were deparaffinized, rehydrated, and used for fluorescence microscopic histochemistry. Briefly, the sections (7–8 μm) were first treated three times for 15 min each with cyclohexane isomers of xylene, washed in absolute ethanol for 3 × 2 min, and subjected to a descending ethanol series (1 min each in 96%, 70%, and 50% alcohol, 3 min in total), then placed in distilled water (minimum 1 min) and stored there until staining. The ThioS staining protocol was carried out based on the method of Krutsay [23]. The deparaffinized sections were placed in a 1 g/10 mL solution of the ThioS staining mixture (Reanal, Budapest, Hungary) in distilled water and left at room temperature for 10 min, ensuring that the tissue samples were completely immersed. The dissolved staining solution was filtered several times through filter paper before each use. This was followed by a short alcohol differentiation and a rinse in distilled water. The sections were coverslipped with a water-based mounting medium (Vectashield, Vector Laboratories, Peterborough, UK) and examined under a fluorescence microscope. To prevent fading of the samples and photolysis of the dye, the ThioS solution and the stained slides were protected from light and stored in a cool (4 °C), dark place until the microscopic examination, which was always carried out within two weeks.
Amyloid aggregates, ranging from well-developed senile plaques to smaller, diffuse extracellular deposits were detected using a Leica DMLB epifluorescence microscope (Leica Microsystems CMS GmbH, Wetzlar, Germany). A 40× apochromatic objective lens with a 0.75 numerical aperture and a lens system corrected for an infinite tube length and 0.17 mm-thick cover glass was used (40×/0.75 ∞/0.17). Green-channel, high-resolution digital images of the affected regions were recorded using a Leica DFC7000 T CCD camera and LAS X Leica Application Suite X computer software (version 3.9.28093.0) (Leica Microsystems CMS GmbH, Wetzlar, Germany) with a specified 45 ms exposure time. Efforts were made during the image acquisition process to systematically divide tissue sections on the slide into sub-areas, aiming to minimize image overlaps. Micrographs were saved as TIFF files with a resolution of 72 dpi (1920 × 1440 pixels) and labeled with scale bars.

4.3. Preprocessing Fluorescent Micrographs: Uneven Illumination, Background Estimation, and Annotation Preparation

Correcting for non-uniform intensity in ThioS-stained micrographs was essential for reliable quantitative measurements. The illumination gradient introduces variations that can interfere with downstream image processing steps [115]: when fluorescence intensity decreases radially from the optical axis, the detected intensity of objects such as amyloid deposits varies across image regions [21]. Similarly, BG estimation is a critical step in both conventional and automated fluorescence microscopic analysis, essential for detecting the photophysical signals of fluorophore-labeled objects, as BG noise from sources such as out-of-focus fluorescence, detector noise, stray light, and autofluorescence often contributes to the detected signal [116]. Both effects propagate to later stages of the workflow, influencing segmentation, morphological analysis, and intensity measurements: the illumination gradient causes regional intensity changes, while BG noise introduces global discrepancies.

To correct intensity nonuniformity in fluorescence microscopy, flat-field correction (white referencing) is commonly used, removing both illumination gradients and lens contaminants. This method requires a uniformly fluorescent reference image (e.g., an empty field of view or free fluorescent dye), which is used to normalize experimental micrographs [21,115,117]. However, obtaining an accurate reference image is challenging, and experimental conditions are difficult to reproduce precisely even with the same microscope settings, illumination intensity, and exposure time, owing to device anomalies, sample preparation processes, and the behavior of fluorescent stains in the chemical environment of real samples [115]. As an alternative, retrospective correction methods estimate illumination bias directly from the acquired images. A simple approach, pseudo-flat-field correction, smooths each image with a large-kernel filter and normalizes by the smoothed version; it assumes a uniform signal distribution and does not correct for dust specks [115]. More robust multi-image approaches generate an illumination correction function (ICF, vignetting function) by averaging multiple images, followed by median filtering to refine the estimate [21,115,117,118,119]. Our approach, adapted from these prior robust methods, applies retrospective multi-image illumination correction by converting fluorescence images to 8-bit grayscale and computing the ICF as the average of 45 individually selected micrographs containing mostly clear tissue areas from multiple experimental subjects, captured at random positions. The averaged image was refined using Gaussian blur and median filtering, preserving only the larger gradients needed for correction; each image was then corrected by dividing by the ICF and rescaling with its average pixel value.
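The following sketch illustrates this multi-image correction scheme with OpenCV/NumPy; the filter kernel sizes and file handling are illustrative assumptions rather than the study's exact settings:

```python
# Sketch: build an illumination correction function (ICF) by averaging many
# mostly-empty reference micrographs, smooth it to keep only large gradients,
# then flatten each image by dividing by the ICF and rescaling.
import cv2
import numpy as np

def build_icf(reference_paths: list[str]) -> np.ndarray:
    acc = None
    for p in reference_paths:                    # e.g., the 45 selected micrographs
        img = cv2.imread(p, cv2.IMREAD_GRAYSCALE).astype(np.float32)
        acc = img if acc is None else acc + img
    icf = acc / len(reference_paths)             # mean image = raw ICF estimate
    icf = cv2.GaussianBlur(icf, (0, 0), sigmaX=51)   # keep only coarse gradients
    icf = cv2.medianBlur(icf, 5)                     # refine the estimate
    return icf / icf.mean()                          # normalize around 1.0

def correct_illumination(img: np.ndarray, icf: np.ndarray) -> np.ndarray:
    corrected = img.astype(np.float32) / icf         # flatten the field
    corrected *= img.mean() / corrected.mean()       # restore average intensity
    return np.clip(corrected, 0, 255).astype(np.uint8)
```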
Standard BG correction methods based on intensity thresholding [120] are sensitive to changes in the FG intensity distribution. They require a bimodal intensity histogram for reliable BG estimation, but in fluorescence imaging, overlapping FG and BG distributions complicate this approach, especially for amyloid structures with varying intensities and low signal-to-noise ratios. Morphological methods based on structuring elements (e.g., the rolling-ball method) [121] depend on object-size-related parameters and do not estimate BG distributions, making them unsuitable for the wide size range and locally variable structures of ThioS-positive objects. Instead of relying on a single BG value, the Silver Mountain Operator (SMO) [22] improves BG representation by directly restoring an unbiased (though not complete) BG distribution from the image. It selects representative BG pixels through a robust statistical procedure, leveraging the absence of local spatial correlation between them. The method involves simple mathematical operations and an averaging kernel, with filter size as its only parameter. In essence, the algorithm computes a moving average of the normalized intensity gradients after optional smoothing and then derives the length of these averaged vectors [22].
For SMO, a 7-pixel averaging window was used without prior smoothing, and the 5th percentile threshold was chosen. BG extraction was applied after illumination correction, using only the unmasked regions of the corrected images: saturated pixels (≥255) were excluded, and empty-slide areas (≤5) were also masked. The median of the BG distribution extracted from the masked images was subtracted from the entire unmasked image, centering the histogram of the BG regions around zero, while empty-slide regions exhibited extreme negative values, as expected. The images were then rescaled to a 0–255 grayscale range and saved as 8-bit TIFF files for further processing (Figure 6). All correction procedures were implemented using standard Python (version 3.10) libraries such as NumPy and OpenCV [122,123,124] and the SMO Python package (https://github.com/maurosilber/smo, accessed on 15 February 2025) [22], executed within the Google Colaboratory cloud environment (https://colab.research.google.com, accessed on 15 February 2025) [125].
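For clarity, the SMO steps can also be re-implemented directly from their description; the sketch below follows the parameters stated in the text (7-pixel window, no prior smoothing, 5th percentile), but the published SMO package [22] should be preferred as the tested reference implementation.

```python
import numpy as np
from scipy import ndimage

def smo_map(img, size=7):
    """Magnitude of the moving average of unit-normalized intensity
    gradients; low values indicate spatially uncorrelated (BG) pixels."""
    gy, gx = np.gradient(img.astype(np.float64))
    norm = np.hypot(gx, gy) + 1e-12
    ax = ndimage.uniform_filter(gx / norm, size=size)
    ay = ndimage.uniform_filter(gy / norm, size=size)
    return np.hypot(ax, ay)

def subtract_background(img, size=7, percentile=5):
    """Mask saturated (>=255) and empty-slide (<=5) pixels, select BG pixels
    below the SMO percentile, and subtract the median of their intensities."""
    valid = (img > 5) & (img < 255)
    smo = smo_map(img, size=size)
    thr = np.percentile(smo[valid], percentile)
    bg_median = np.median(img[valid & (smo <= thr)])
    return img.astype(np.float64) - bg_median
```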
Following the intensity adjustments, 1607 corrected ThioS-stained fluorescence micrographs (1920 × 1440 pixels) were sampled using ImageJ (version 1.47) [126]. A square region of interest (ROI) of 800 × 800 pixels, chosen based on the typical size range and density of plaques, was manually placed on selected regions, then cropped and saved for further analysis with patch-level binary annotations (1 for the presence of parenchymal pathology, 0 for its absence). Vascular elements and residual artifacts were present in both groups, and although non-overlapping regions were prioritized, complete non-overlap was impractical due to image size and plaque positioning. The dataset included 2 subjects without dementia and 5 with dementia, each contributing two sections from two distinct brain regions. Positive and negative image patches were systematically extracted from the full-size micrographs of each subject's brain sections: positive patches were selected from regions containing visible ThioS-positive parenchymal amyloid pathology, while negative patches were sampled from areas lacking parenchymal deposits, which could include BG tissue or vascular amyloid pathology. This patch-level sampling strategy ensured that both classes were represented within each section, allowing for balanced and anatomically diverse training data. The final training dataset, consisting of 2978 positive and 3350 negative image patches for parenchymal pathology (a total of 6328 cropped images), was prepared for deep learning-based image classification. In addition, pixel-wise binary segmentation masks were created on the full images (1920 × 1440) through extensive manual segmentation; ImageJ tools such as thresholding, morphological operations, and filtering were used to simplify the process, but it still required substantial manual effort, including visual inspection, object selection, and adjustments, making reliable results extremely costly, time-consuming, and labor-intensive to achieve. These segmentation masks were used for the final WSSS evaluation.

4.4. Implementation of SqueezeNet-Based CAM for Weakly Supervised Localization

Deep learning-based medical image segmentation typically relies on costly dense pixel-level labels. Weakly supervised pipelines instead exploit image-level labels, most commonly via the CAM method, but struggle to delineate lesion boundaries accurately. CAM highlights the subregions of an image that contribute most to the prediction by computing a weighted sum of the final convolutional feature maps, with weights derived from the classification layer following a global pooling operation. GAP condenses each feature map into a scalar, forming a pooled vector that is combined with class-specific weights to generate class scores. The learned class-specific weights are applied directly to the spatially informative feature maps to produce CAMs, which are then upsampled, typically via bilinear interpolation, into heatmaps that highlight areas linked to each class prediction and enable the identification of discriminative image regions with respect to classification outcomes [26]. However, this method provides only coarse-grained localization due to the limited mapping resolution of the feature maps, and the global pooling layer imposes architectural constraints by having to directly precede the final classification layer, often necessitating substantial modifications [26,59].
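A minimal sketch of CAM generation for one class follows; extraction of the feature maps from the trained network is assumed to happen elsewhere, and the per-image max normalization and negative-weight clamping (both detailed below) are included for completeness.

```python
import numpy as np
import cv2

def compute_cam(feature_maps, class_weights, out_size=(572, 572),
                clamp_negative=True):
    """feature_maps: (H, W, C) activations of the last conv layer;
    class_weights: (C,) classifier weights for the target class."""
    w = np.maximum(class_weights, 0) if clamp_negative else class_weights
    cam = np.tensordot(feature_maps, w, axes=([2], [0]))  # weighted sum -> (H, W)
    cam = cv2.resize(cam.astype(np.float32), out_size,
                     interpolation=cv2.INTER_LINEAR)      # bilinear upsampling
    return cam / (cam.max() + 1e-12)                      # per-image max -> [0, 1]
```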
Among CAM-based WSOL methods, we adopted the approach of [59] for its simplicity and effective handling of global pooling issues, specifically using the TAP layer and negative weight clamping (excluding their percentile-based thresholding). TAP averages only activations above a threshold, balancing the broad coverage of GAP with the focused precision of GMP to improve spatial localization. Clamping negative weights to zero during CAM generation suppresses BG signals found in less discriminative object parts without adding training complexity (unlike erasing methods) or noise [59]. A key limitation of CAM localization is the low spatial (mapping) resolution of the final convolutional feature maps. To address this and enable CAM-based localization, ref. [26] removed convolutional, pooling, and fully connected layers in models such as AlexNet-GAP, VGGnet-GAP, and GoogLeNet-GAP, increasing the mapping resolution (227 × 227 → 13 × 13 and 224 × 224 → 14 × 14, instead of 6 × 6 and 7 × 7), though this may reduce classification performance [58]. SqueezeNet, by delaying downsampling, achieves a similar resolution by default (224 × 224 → 13 × 13; in our case, 572 × 572 → 34 × 34). We applied only minor changes, replacing GAP with TAP and adding a sigmoid activation for binary classification. To better preserve resolution and align with the U-Net-based refinement [75], the 800 × 800 input images were resized to 572 × 572 and normalized to [0, 1]. The SqueezeNet authors proposed three variants (Vanilla, Simple Bypass, and Complex Bypass) and found Simple Bypass most effective: adding skip connections around fire modules 3, 5, 7, and 9 lets the network learn residual functions and bypass bottlenecks, boosting accuracy and regularization without increasing model size; we adopted this variant for our task [56].
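A TAP layer can be expressed compactly in Keras; the sketch below averages only activations above a fixed threshold and falls back to GAP where nothing exceeds it. The threshold value is an assumption, not the setting used in this study.

```python
import tensorflow as tf

class ThresholdedAveragePooling2D(tf.keras.layers.Layer):
    """Per-channel average of activations above `threshold` (TAP, [59])."""
    def __init__(self, threshold=0.1, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold

    def call(self, x):                        # x: (batch, H, W, C)
        mask = tf.cast(x > self.threshold, x.dtype)
        total = tf.reduce_sum(x * mask, axis=[1, 2])
        count = tf.reduce_sum(mask, axis=[1, 2])
        gap = tf.reduce_mean(x, axis=[1, 2])  # fallback: plain GAP
        return tf.where(count > 0, total / tf.maximum(count, 1.0), gap)
```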
The SqueezeNet model, extended with a TAP layer, was trained from scratch on cropped patches derived from histopathological micrographs and corresponding binary labels indicating the presence or absence of parenchymal amyloid deposits. Training used the root mean square propagation (RMSprop) optimizer with a fixed learning rate of 5 × 10⁻⁶ and a batch size of 64, over 800 epochs, minimizing the binary cross-entropy loss (BC), commonly used in binary classification tasks to measure the difference between the true labels (y) and the predicted probabilities (p):

BC = −(1/n) Σ_{i=1}^{n} [y_i log(p_i) + (1 − y_i) log(1 − p_i)].

The best-performing model was retained based on the lowest validation loss across all epochs.
Data augmentation comprised standard geometric transformations (rotation, shearing, zooming, flipping), complemented by mixup regularization, and was applied on the fly during training. Mixup coefficients (λ) were sampled from a symmetric Beta distribution, Beta(α = 8.0, β = 8.0), when inputs belonged to the same class, and from Beta(α = 0.2, β = 0.2) when combining samples across classes. Although inter-class interpolation can improve generalization [89], strong inter-class mixing was suboptimal in our setting due to the presence of pathologically relevant but weakly staining regions: mixing positive signals with non-pathological samples produced attenuated images that visually resembled naturally weak positives yet received disproportionately reduced soft labels, causing ambiguity in supervision (an effect similar to manifold intrusion); a sketch of this sampling scheme is given after the metric list below. Model classification performance was assessed using patient-wise 7-fold cross-validation, where each fold corresponded to a distinct subject. This evaluation strategy was chosen to prevent data leakage across subjects and to simulate real-world scenarios where predictions must generalize to unseen individuals. Following the training cycles, average performance metrics across the seven folds were computed and considered representative of overall model performance. For each fold, the following standard classification metrics were calculated from the confusion matrix elements derived from image-level predictions (TPcls: true positives, TNcls: true negatives, FPcls: false positives, FNcls: false negatives):
  • Accuracy, defined as (TPcls + TNcls)/(TPcls + TNcls + FPcls + FNcls), measures the overall proportion of correctly classified image patches, regardless of class. It provides a general indication of the model’s reliability in distinguishing between positive and negative instances.
  • Precision, defined as TPcls/(TPcls + FPcls), quantifies the proportion of true positive predictions among all patches predicted as positive, indicating how many of the identified amyloid-positive regions were correctly detected.
  • Recall, calculated as TPcls/(TPcls + FNcls), measures the capacity of the model to correctly identify all truly positive instances; i.e., the proportion of actual amyloid-containing patches that were successfully detected.
  • F1-score expresses the trade-off between precision and recall through their harmonic mean, computed as 2TPcls/(2TPcls + FPcls + FNcls), and proves especially informative in imbalanced classification contexts.
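The class-dependent mixup sampling described above (Beta(8, 8) within class, Beta(0.2, 0.2) across classes) can be sketched as follows; the pairing logic and array handling around it are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng()

def mixup_pair(x1, y1, x2, y2):
    """Blend two patches: Beta(8, 8) for same-class pairs (strong
    interpolation), Beta(0.2, 0.2) for cross-class pairs (near-binary)."""
    alpha = 8.0 if y1 == y2 else 0.2
    lam = float(rng.beta(alpha, alpha))
    x = lam * x1 + (1.0 - lam) * x2   # blended image
    y = lam * y1 + (1.0 - lam) * y2   # soft label
    return x.astype(np.float32), y
```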
Additionally, during each cross-validation iteration, 10% of the training data, randomly sampled from the combined six training folds, was set aside as a development (validation) set. This subset served to monitor training progress and to detect signs of overfitting. Model checkpoints were saved throughout the 800 training epochs, and for each iteration, the model state achieving the lowest validation loss on the development set was selected for final evaluation on the corresponding held-out test subject. CAMs were subsequently extracted from the final convolutional layers to enable spatial localization of discriminative features without requiring major architectural modifications (Figure 6). The model and the training pipeline were implemented in Python using TensorFlow, along with additional libraries for data preprocessing, augmentation, and training [122,123,124,127], and the high-level computations were performed on an NVIDIA GeForce RTX 4090 GPU with 24 GB of memory.

4.5. Implementation of U-Net Segmentation with Advanced Object-Level Augmentation

For the final image segmentation task, the U-Net architecture was chosen for its effectiveness with limited data. To improve generalization, extensive object-level augmentation strategies were applied during training, as robust augmentation is key to optimal performance [75,81]. CAMs from amyloid-positive images (34 × 34, via SqueezeNet) were upsampled to 572 × 572 and normalized to the [0, 1] interval using per-image maxima [26]; minima defaulted to 0 due to negative weight clamping [59].
Pixel-level pseudo-masks and corresponding image patches were created using a simple two-step thresholding: CAM values > 0.5 defined the FG, from which unique objects together with their local context were extracted based on contours (cv2.findContours) [124] and saved into an object-level image bank. To synthesize BG-only scenes, CAM values > 0.2 were dilated (25-pixel circular structuring element) to expand buffer zones, removed from the images, and the resulting irregular holes were inpainted using the Shift-Map algorithm (OpenCV 4.2.0) [124], which reconstructs missing regions by finding dominant patch offsets and optimizing a Markov random field energy function [128], without training additional deep learning models [98,129]. The completed images were placed into a unified BG bank together with the original BG set. This approach maximizes environmental diversity and contextual variation for object insertion, supporting the creation of large-scale synthetic data [96,98,100,101,129,130]. The modular pipeline thus decouples images into object and BG components, enabling on-the-fly object-level augmentation: during U-Net training, 0–5 randomly selected object instances were inserted into these BG scenes, addressing class imbalance. Segmentation masks were handled analogously, allowing combinatorial data synthesis and augmenting the boundaries between objects and BGs for increased diversity and precise control over object-boundary transitions [96,100,101].
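The two banks can be constructed along the following lines; the Shift-Map inpainting is exposed through OpenCV's contrib xphoto module, whose mask convention (non-zero = valid pixels) is assumed here, and the context padding is an illustrative choice.

```python
import cv2
import numpy as np

def extract_objects(image, cam, fg_thr=0.5, pad=16):
    """Cut each CAM-defined object with some local context into a bank."""
    fg = (cam > fg_thr).astype(np.uint8)
    contours, _ = cv2.findContours(fg, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    bank = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        x0, y0 = max(x - pad, 0), max(y - pad, 0)
        bank.append((image[y0:y + h + pad, x0:x + w + pad].copy(),
                     fg[y0:y + h + pad, x0:x + w + pad].copy()))
    return bank

def make_background(image, cam, bg_thr=0.2, radius=25):
    """Dilate CAM > bg_thr, remove it, and inpaint the holes (Shift-Map)."""
    hole = (cam > bg_thr).astype(np.uint8)
    kernel = cv2.getStructuringElement(
        cv2.MORPH_ELLIPSE, (2 * radius + 1, 2 * radius + 1))
    hole = cv2.dilate(hole, kernel)
    dst = np.zeros_like(image)
    cv2.xphoto.inpaint(image, (1 - hole).astype(np.uint8), dst,
                       cv2.xphoto.INPAINT_SHIFTMAP)
    return dst
```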
To further enhance the diversity of object boundaries, contour perturbation was integrated into the object-level augmentation pipeline for training the U-Net on CAM-derived pseudo-masks. Specifically, contour synthesis was applied to the thresholded pseudo-object masks prior to cut-and-paste augmentation. When cutting out these object regions from their local BG patches, the synthesized contours were used both to update the masks and to crop the corresponding grayscale image regions.
Our method builds upon Restorable Contour Synthesis [131], originally designed to produce synthetic segmentation masks with restorable GT approximations, whose potential for augmentation was already suggested: the average of multiple such synthetic contours remains a faithful approximation of the original, enabling robust augmentation while maintaining structural plausibility [131]. We modified the perturbation mechanism of the polygonal keypoint approximation for better structural integrity and increased variability. Instead of injecting Gaussian noise scaled to keypoint distances in arbitrary directions [131], displacement was constrained to directions perpendicular to the line connecting the two neighbors of the current point, which approximates the local contour tangent. This avoids self-intersections and maintains coherent shape outlines even with larger global noise parameters (standard deviation). In practice, keypoints were sampled at gaps of 5–10 pixels for the polygonal approximation, with the gap size directly scaling the magnitude of the noise applied to the points.
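A compact version of this tangent-constrained perturbation is sketched below; the noise-to-gap scaling factor is an illustrative assumption.

```python
import numpy as np

def perturb_contour(points, gap=7, noise_scale=0.4, rng=None):
    """points: (N, 2) closed contour. Keypoints sampled every `gap` pixels
    are displaced only perpendicular to the local tangent, i.e., the line
    connecting their two neighbors, avoiding self-intersections."""
    rng = rng or np.random.default_rng()
    keypts = points[::gap].astype(np.float64)        # polygonal approximation
    tangent = np.roll(keypts, -1, axis=0) - np.roll(keypts, 1, axis=0)
    tangent /= np.linalg.norm(tangent, axis=1, keepdims=True) + 1e-12
    normal = np.stack([-tangent[:, 1], tangent[:, 0]], axis=1)
    offsets = rng.normal(0.0, noise_scale * gap, size=(len(keypts), 1))
    return keypts + offsets * normal                 # perturbed polygon
```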
To ensure natural integration, the Soft-CP blending strategy [101] was applied during insertion: object mask edges were smoothed through erosion and dilation operations (5 iterations of each in this work), and pixel values were weighted by their distance from the contour boundary to ensure a seamless transition. However, the dilated and eroded pixel areas of the mask were ignored during training (similar to a trimap), and, unlike the original Soft-CP method [101], the blending was applied only to the input grayscale image, without using the soft masks as GT.
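The blending step can be sketched as follows; the 3 × 3 kernel and the simple normalized distance weighting are simplifying assumptions standing in for the exact Soft-CP weighting.

```python
import cv2
import numpy as np

def soft_paste(bg, obj, mask, top_left, iterations=5):
    """Paste `obj` (grayscale) into `bg` with a softened boundary; returns
    the composite, the eroded GT mask, and the trimap-like ignore band."""
    kernel = np.ones((3, 3), np.uint8)
    dil = cv2.dilate(mask, kernel, iterations=iterations)
    ero = cv2.erode(mask, kernel, iterations=iterations)
    alpha = cv2.distanceTransform(dil, cv2.DIST_L2, 3)  # distance to boundary
    alpha = np.clip(alpha / (alpha.max() + 1e-12), 0.0, 1.0)
    y, x = top_left
    h, w = mask.shape
    roi = bg[y:y + h, x:x + w].astype(np.float64)
    blended = alpha * obj.astype(np.float64) + (1.0 - alpha) * roi
    bg[y:y + h, x:x + w] = blended.astype(bg.dtype)
    ignore = (dil > 0) & (ero == 0)   # excluded from the loss during training
    return bg, ero, ignore
```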
The U-Net model and the training process were implemented using standard Python libraries [122,124] and TensorFlow [127]. Training used the adaptive moment estimation (Adam) optimizer (learning rate: 0.001, β₁ = 0.9, β₂ = 0.999, ε = 1 × 10⁻⁷) [132], minimizing an asymmetric unified focal loss function (aUF) [133]. This loss generalizes distribution-based, region-based, and compound losses (e.g., cross-entropy, Dice, and Combo) into a unified framework to address class imbalance, with fewer hyperparameters obtained by grouping functionally equivalent terms, while its asymmetric design promotes FG enhancement and BG suppression [133]. aUF has three hyperparameters: λ, which controls the relative weighting between the two constituent loss components; δ, which adjusts the balance between negative and positive examples; and γ, a focusing parameter for simultaneous BG suppression and rare-class enhancement. In this study, we set λ = 0.5, δ = 0.6, and γ = 0.5. The asymmetric unified focal loss function is defined as
aUF = λ·maF + (1 − λ)·maFT,

where maF is the modified asymmetric focal loss and maFT is the modified asymmetric focal Tversky loss:

maF = −(δ/N) Σ_i y_{i:r} log(p_{t,r}) − ((1 − δ)/N) Σ_{c≠r} (1 − p_{t,c})^γ log(p_{t,c}),

maFT = Σ_{c≠r} (1 − mTI) + Σ_{c=r} (1 − mTI)^{1−γ},

where N is the total number of pixels; y and p are the GT labels and predicted values; p_t is the probability of predicting the GT class; i, c, and r are the pixel, class, and rare-class indices; and mTI is the modified Tversky index:

mTI = Σ_{i=1}^{N} p_{0i} g_{0i} / (Σ_{i=1}^{N} p_{0i} g_{0i} + δ Σ_{i=1}^{N} p_{0i} g_{1i} + (1 − δ) Σ_{i=1}^{N} p_{1i} g_{0i}),

where p_{0i} and p_{1i} are the predicted FG and BG probabilities, and g_{0i} and g_{1i} are the corresponding GT indicators (1 if true, 0 otherwise) [133]. The loss function was adapted to the binary segmentation task, with modifications to exclude ignored pixels from the calculations (a sketch of this binary variant is given below). Training was conducted over 30 epochs with a batch size of 16, using augmented pseudo-masks derived from the CAM models as GT. During testing and inference, the full-size micrographs were reflected at the borders to avoid empty regions at the edges; the images were then divided into overlapping tiles, each resized to the input dimensions for prediction, and the tile predictions were reassembled into a segmentation mask for the full micrograph. Postprocessing steps, including thresholding (0.5), hole filling, and size-based exclusion (<3500 pixels) of smaller particles, were applied, while objects touching or intersecting the image boundaries were excluded during particle extraction (Figure 6).
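For the binary case with ignored pixels described above, the loss can be sketched as follows, treating the FG as the rare class and using the stated hyperparameters; this is a simplified reference version, not the exact implementation used here.

```python
import tensorflow as tf

def auf_loss(y_true, y_pred, valid, lam=0.5, delta=0.6, gamma=0.5, eps=1e-7):
    """y_true: binary GT; y_pred: sigmoid FG probabilities;
    valid: 1 where a pixel counts, 0 for ignored (trimap) pixels."""
    y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
    n = tf.reduce_sum(valid) + eps

    # modified asymmetric focal loss: no focal term for the rare (FG) class
    ce_fg = -y_true * tf.math.log(y_pred)
    ce_bg = -(1.0 - y_true) * tf.pow(y_pred, gamma) * tf.math.log(1.0 - y_pred)
    maf = tf.reduce_sum(valid * (delta * ce_fg + (1.0 - delta) * ce_bg)) / n

    # modified asymmetric focal Tversky loss (soft TP/FP/FN/TN counts)
    tp = tf.reduce_sum(valid * y_pred * y_true)
    fp = tf.reduce_sum(valid * y_pred * (1.0 - y_true))
    fn = tf.reduce_sum(valid * (1.0 - y_pred) * y_true)
    tn = tf.reduce_sum(valid * (1.0 - y_pred) * (1.0 - y_true))
    mti_fg = tp / (tp + delta * fp + (1.0 - delta) * fn + eps)
    mti_bg = tn / (tn + delta * fn + (1.0 - delta) * fp + eps)
    maft = (1.0 - mti_bg) + tf.pow(1.0 - mti_fg, 1.0 - gamma)

    return lam * maf + (1.0 - lam) * maft
```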
Rather than exhaustively retraining baseline or ablation variants of the segmentation pipeline, we focused on the per-fold evaluation of the complete framework using independent pixel-level GT annotations of entire images. This choice reflects the considerable computational demand of our group k-fold cross-validation setup and several dataset-specific factors, including severe class imbalance, distributional distortions from patch resampling, a large proportion of negative patches in the original patch set, redundancy due to tile overlap, and inconsistencies from semi-automated annotations, all of which complicate fair comparison and baseline construction under conventional dense segmentation settings; we therefore considered this evaluation strategy more appropriate for demonstrating the viability and practical applicability of our weak supervision pipeline. Segmentation performance was evaluated using patient-wise 7-fold cross-validation, where each fold represented a distinct subject. Importantly, the same subject-wise fold partitions established during the classification step were retained to ensure methodological consistency. Specifically, CAMs produced by the TAP-SqueezeNet classifier were used to generate pseudo-labels for training the U-Net segmentation model, and this mapping respected the original fold assignments, ensuring that CAMs derived from a given subject were used only within their respective training folds. This design maintained strict data separation across subjects and allowed a direct correspondence between classification-derived supervision and subsequent segmentation learning. In each cross-validation iteration, the U-Net was trained on six folds and evaluated on the remaining held-out fold. Performance metrics were computed independently for each fold, and final results were reported as the average across all seven folds. Segmentation performance was quantified using a set of standard pixel-level metrics derived from the confusion matrix elements computed over segmentation masks (TPseg: true positives; TNseg: true negatives; FPseg: false positives; FNseg: false negatives), including both overlap-based and error-based measures (a helper computing the overlap metrics is sketched after the list):
  • Dice coefficient, defined as 2TPseg/(2TPseg + FPseg + FNseg), quantifies the spatial overlap between predicted and GT segmentation masks. It reflects both precision and recall and is widely used in biomedical image segmentation to assess agreement.
  • Jaccard index (also known as Intersection over Union), given by TPseg/(TPseg + FPseg + FNseg), provides a stricter measure of overlap than the Dice coefficient. It evaluates the proportion of shared positive predictions relative to the union of predicted and actual positives.
  • Recall, calculated as TPseg/(TPseg + FNseg), measures the ability of the model to correctly detect all amyloid-positive pixels.
  • Pixel-level accuracy (PA), computed as (TPseg + TNseg)/(TPseg + TNseg + FPseg + FNseg), assesses the overall correctness of predictions across all pixels, including both FG and BG. However, in cases of strong class imbalance, where BG pixels vastly outnumber FG pixels, accuracy may be inflated and fail to reflect true segmentation performance.
  • Specificity, calculated as TNseg/(TNseg + FPseg), reflects the model’s ability to correctly identify BG (non-deposit) pixels, reducing the likelihood of over-segmentation.
  • Precision, defined as TPseg/(TPseg + FPseg), indicates the fraction of true deposit pixels among all those classified as deposits.
  • Negative Predictive Value (NPV), calculated as TNseg/(TNseg + FNseg), indicates the proportion of correctly classified BG pixels among all BG predictions.
  • False Positive Rate (FPR), given by FPseg/(FPseg + TNseg), quantifies the proportion of BG pixels incorrectly labeled as deposits.
  • False Discovery Rate (FDR), calculated as FPseg/(FPseg + TPseg), indicates the proportion of incorrect deposit predictions among all positive predictions.
  • False Negative Rate (FNR), defined as FNseg/(FNseg + TPseg), represents the fraction of actual deposit pixels that were missed by the model.
  • False Omission Rate (FOR), defined as FNseg/(FNseg + TNseg), measures the proportion of missed FG pixels among all predicted BG pixels.
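A compact helper for the overlap-based metrics above is sketched here for a single predicted/GT mask pair (boolean arrays); the error rates follow the same pattern from the four counts.

```python
import numpy as np

def overlap_metrics(pred, gt):
    """pred, gt: boolean masks of equal shape."""
    tp = np.sum(pred & gt)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return {
        "dice":           2 * tp / (2 * tp + fp + fn + 1e-12),
        "jaccard":        tp / (tp + fp + fn + 1e-12),
        "recall":         tp / (tp + fn + 1e-12),
        "precision":      tp / (tp + fp + 1e-12),
        "specificity":    tn / (tn + fp + 1e-12),
        "pixel_accuracy": (tp + tn) / (tp + tn + fp + fn),
    }
```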
In each cross-validation round, a randomly selected 10% subset of the training folds was used as a development set to monitor training dynamics. This set underwent the same data augmentation pipeline as the training data to ensure the presence of synthetic boundary variations, rather than relying on raw thresholded CAMs. The development set was used exclusively to track stable training trends; model checkpointing was not based on its performance, and the model saved at the 30th epoch was consistently used for evaluation. We implemented the model and training procedures in Python with TensorFlow, along with other libraries for preprocessing, augmentation, and training [122,123,124,127]. High-level computations were carried out on an NVIDIA GeForce RTX 4090 GPU with 24 GB of memory.

4.6. Implementation of Morphometric Profiling of Amyloid Plaques via Particle Analysis

Morphometric analysis of parenchymal amyloid deposits was performed in subjects with and without dementia, focusing on the parietal and temporal cortices. Compact and diffuse subregions were delineated using intensity thresholds at the 80th and 50th percentiles of pixel intensities within the predicted mask (from the final model) of each collected deposit, following prior work on calculating the diffuseness index from a single histological stain [113]. For each deposit, two binarized images based on these thresholds were generated and used alongside the corresponding grayscale regions for object-specific morphometric analysis. In total, 19 parameters were measured using ImageJ and the FracLac plugin for ImageJ (https://imagej.nih.gov/ij/docs/guide/user-guide.pdf and https://imagej.net/ij/plugins/fraclac/FLHelp/Introduction.htm; accessed on 15 February 2025) [107,108,126]. While 7 of these parameters have previously been examined in the context of amyloid pathology [11,14,105,106,112,113], the remaining 12 have primarily been applied in general structural–morphometric studies of biological tissues, offering novel perspectives on the microstructural organization of amyloid plaques. The extracted morphometric parameters were categorized into four major groups: (1) log-transformed, size-related metrics, including deposit area, compact core area, perimeter, and convex hull metrics such as hull area, hull perimeter, bounding circle diameter, maximum span, and mean radius; (2) structural complexity descriptors, including fractal dimension, lacunarity, roughness, circularity, and solidity; (3) geometric and symmetry features, including span ratio, max/min radii ratio, and hull circularity; (4) intensity-based characteristics, including mean gray value, integrated density, and diffuseness index. These categories collectively provide insights into the geometry, structural complexity, morphological variability, and staining characteristics of amyloid deposits.
To capture size-related variation, several deposit features were assessed and log-transformed (base 10) to achieve more symmetric distributions for PCA and to account for the log-normal size distribution of Aβ plaques in AD, which is consistent with a stochastic, volume-proportional growth model around a log-scale mean, likely driven by a porous, sponge-like structure in equilibrium with its surroundings. This cross-sectional pattern, supported by histology and 3D reconstructions, reflects the true volumetric properties of near-isotropic objects [14,105,106]. Deposit and compact area, determined by the two predefined threshold values, were calculated from FG pixel counts in the binarized images and converted to square micrometers, while the perimeter was computed as the length of the composite contour outlining the deposit area (in micrometers). Integrated density, defined as deposit area × mean gray value, was also log-transformed for PCA, as it reflects size-dependent total intensity within the deposit area. The size-related parameters were further expanded with convex hull-based measurements. The convex hull is the smallest convex polygon enclosing the deposit area, and several parameters were derived from it: hull area (in square micrometers), as well as hull perimeter, diameter of the bounding circle (the smallest circle enclosing the hull), maximum span (the greatest distance between two points on the convex hull), and mean radius (the average distance from the center of mass of the hull to an outer point), all measured in micrometers.
To further characterize the spatial structure of plaques, we analyzed size-invariant morphometric descriptors of internal architecture. Fractal dimension and lacunarity were computed to quantify the spatial complexity and heterogeneity of the binarized amyloid deposits. Fractal dimension was estimated using a box-counting approach, where the slope of a log–log regression between the number of FG pixels within boxes and the scanning box size was calculated. Lacunarity assessed the distribution of gaps and discontinuities within the deposit structure, with higher values indicating greater heterogeneity and lower values suggesting uniformity. Both measures were obtained using the FracLac plugin for ImageJ [107,108], applying the binary scanning method on thresholded images with a series of grids of varying sizes. The “White Scan Background” option was fixed, and the grid calibers were determined by applying a power series for the box-size scale. Measurements were performed on 12 randomized grid placements, and the results were averaged [107,108]. In addition, the following parameters, though derived from size-related measurements, capture size-invariant aspects of structural complexity and irregularity: roughness (the ratio of perimeter to hull perimeter) reflects border jaggedness; circularity (4π × [deposit area]/[perimeter]²) quantifies similarity to a perfect circle; solidity (the ratio of deposit area to hull area) indicates compactness; and the diffuseness index (calculated as [deposit area − compact area]/[deposit area], following [113]) represents the proportion of weakly stained regions.
Furthermore, the global geometry and overall shape of deposits were quantified using convex hull-based parameters that capture spatial symmetry and elongation. These included the span ratio (the ratio of the major to minor axis of the convex hull), the max/min radius ratio (distance ratios from the center of mass of the hull to boundary points), and hull circularity (calculated as 4π × hull area/[hull perimeter]²), collectively providing a comprehensive representation of plaque shape variability.
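For a single deposit, the hull-based and size-invariant descriptors above can be computed with OpenCV as sketched below (pixel units; the micrometer conversion and the FracLac-based fractal and lacunarity measures are omitted).

```python
import cv2
import numpy as np

def shape_descriptors(mask, compact_mask):
    """mask, compact_mask: binary arrays for the deposit and its compact core."""
    cnts, _ = cv2.findContours(mask.astype(np.uint8),
                               cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    c = max(cnts, key=cv2.contourArea)
    hull = cv2.convexHull(c)
    area = float(mask.sum())
    compact_area = float(compact_mask.sum())
    perim = cv2.arcLength(c, closed=True)
    hull_area = cv2.contourArea(hull)
    hull_perim = cv2.arcLength(hull, closed=True)
    return {
        "roughness":        perim / (hull_perim + 1e-12),
        "circularity":      4 * np.pi * area / (perim ** 2 + 1e-12),
        "solidity":         area / (hull_area + 1e-12),
        "diffuseness":      (area - compact_area) / (area + 1e-12),
        "hull_circularity": 4 * np.pi * hull_area / (hull_perim ** 2 + 1e-12),
    }
```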
Subsequently, a comparative statistical analysis was performed on the deposit-level morphometric parameters across the two cortical lobes (parietal and temporal) and the presence or absence of dementia, yielding four experimental groups: (1) parietal + non-dementia, (2) parietal + dementia, (3) temporal + non-dementia, and (4) temporal + dementia. Prior to dimensionality reduction, all parameters were standardized using z-score transformation (StandardScaler) [134]. Where a measured parameter contained zero values (compact area), a constant of 1 was added uniformly across all observations to enable log-transformation [135,136]. PCA [134] was then performed to reduce the dimensionality of the dataset and to identify dominant patterns in the morphometric data. The PCs accounting for the highest proportion of explained variance were selected for group-wise comparisons: the average values of the main PCs were compared across the four groups using ANOVA, and significant differences were further investigated with a Tukey post hoc test for pairwise group comparisons. All statistical analyses were carried out using Python (libraries: scikit-learn, pandas, scipy, matplotlib) [122,134,137,138,139] and R (stats) [140], ensuring consistency and reproducibility.
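The statistical workflow can be sketched as follows; `features` (deposits × parameters, already log-transformed where applicable) and the group labels are assumed to exist, and `scipy.stats.tukey_hsd` requires SciPy ≥ 1.8.

```python
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def compare_groups(features, groups, n_components=3):
    """z-score, project onto the leading PCs, then run one-way ANOVA and
    Tukey's HSD on PC1 across the four lobe-by-dementia groups."""
    z = StandardScaler().fit_transform(features)
    pcs = PCA(n_components=n_components).fit_transform(z)
    samples = [pcs[groups == g, 0] for g in np.unique(groups)]
    f_stat, p_anova = stats.f_oneway(*samples)
    tukey = stats.tukey_hsd(*samples)   # pairwise post hoc comparisons
    return f_stat, p_anova, tukey
```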

5. Conclusions

In this study, we introduced a weakly supervised image analysis pipeline that enables accurate, robust, and interpretable segmentation of ThioS-positive Aβ plaques in human AD brain tissue without requiring labor-intensive pixel-level annotations. Our approach uniquely integrates robust preprocessing, the generation of CAM-derived spatial priors from a lightweight SqueezeNet classifier trained solely on patch-wise image labels, and a U-Net-based segmentation refinement step using synthetic, pseudo-labeled data with extensive object-level augmentation, without relying on manual delineations.
Importantly, the resulting models generalized well to unseen subjects’ original whole-image data under patient-wise group k-fold cross-validation, demonstrating strong alignment with manual GT. They successfully identified biologically meaningful, disease-relevant patterns by separating parenchymal amyloid pathology from vascular structures and staining artifacts, a distinction that ThioS staining protocols alone cannot reliably achieve due to high background noise and non-specific binding properties. Despite a limited donor pool, intrinsic boundary ambiguities, marked structural heterogeneity in deposit morphology, and variable staining patterns, our pipeline consistently captured plaque-specific semantic features while effectively excluding irrelevant tissue components.
Beyond significantly reducing annotation burden, our weakly supervised segmentation framework enables automated, object-level identification and detailed morphometric profiling of individual amyloid plaques, effectively distinguishing them from other pathological elements, confounding objects, and background staining. This approach circumvents the need for manual selection and overcomes the inherent limitations of the staining procedure itself. Our pipeline thus facilitates large-scale, precise histopathological assessment in AD, as demonstrated by morphometric analyses revealing region- and cognitive status-associated variations in plaque size, structural complexity, and geometry. The ability to extract such biologically meaningful per-object features from specimens demonstrates the practical utility of our method and its potential to contribute to mechanistic insights into AD pathology; however, these morphometric findings should be interpreted with caution due to the limited number of donor cases.
All experiments in this study were conducted exclusively on ThioS-stained cortical tissue micrographs. However, the modular design of our pipeline, including intensity normalization, preprocessing, model architectures, CAM-based localization, and patch-wise training, is not inherently stain-specific. With suitable adjustments, the method can be adapted to other contexts, although different target applications involve their own challenges and trade-offs, requiring tailored adaptations of the framework to address their specific objectives and limitations. The framework could be extended to broader donor cohorts and brain-region coverage, additional fluorescent staining protocols (e.g., Congo red), different histopathological targets (e.g., Lewy body pathology in Parkinson’s disease), transgenic animal models, in vivo approaches, and potentially alternative imaging modalities, including 3D datasets, wherever detailed manual annotations remain scarce.
Altogether, our results demonstrate that carefully designed weak supervision can yield segmentation models that are not only computationally efficient, but also capable of uncovering clinically relevant and biologically interpretable morphological signatures. This positions our pipeline as a scalable, annotation-efficient, and adaptable tool for digital neuropathology in data-constrained clinical settings.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/ijms26157134/s1.

Author Contributions

Conceptualization, K.G. and G.B.; methodology, K.G., G.B., T.N. and J.T.; software, G.B.; validation, G.B.; formal analysis, G.B. and T.N.; resources, K.G., G.B., T.N., J.T., L.T. and B.G.; data curation, G.B. and T.N.; writing—original draft preparation, G.B.; writing—review and editing, K.G.; visualization, G.B.; supervision, K.G.; project administration, K.G.; funding acquisition, K.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by grants from the Ministry of National Resources (GINOP 2.3.2-15-2016-00030 and 2.3.2-15-2016-00034) through the European Union Cohesion Fund. At the time of the experiments, GB was a PhD student at the Theoretical Medicine Doctoral School, Albert Szent-Györgyi Medical School, University of Szeged.

Institutional Review Board Statement

Handling of human brain samples was performed according to the guidelines of the Committee on Human Experimentation of University of Szeged, Albert Szent-Györgyi Medical School (Szeged, Hungary). Tissue samples from parietal and temporal cortices (or subfields of these areas) of elderly subjects without dementia (=Braak stage 2) and with AD (=Braak stages 5, 6) were obtained from the Netherlands Brain Bank (NBB; Project 598/2009), Netherlands Institute for Neuroscience, Amsterdam.

Informed Consent Statement

The samples were collected from donors for or from whom written informed consent for a brain autopsy, the use of tissue samples and permission for anonymous use of clinical information were obtained (Project 598/2009).

Data Availability Statement

The data are contained in the article and its Supplementary Files.

Acknowledgments

During the preparation of this manuscript/study, the authors used the computer programs and analysis tools mentioned in Section 4 for the purposes indicated in the text. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study, data collection and analysis, the decision to publish or the preparation of the manuscript.

Abbreviations

AD: Alzheimer’s disease
Adam: adaptive moment estimation
ANOVA: analysis of variance
Aβ: amyloid-β
BG: background
CAM: class activation maps/mapping
CNNs: convolutional neural networks
FDR: false discovery rate
FG: foreground
FNcls/seg: false negatives for classification/segmentation tasks
FNR: false negative rate
FOR: false omission rate
FPcls/seg: false positives for classification/segmentation tasks
FPR: false positive rate
GAP: global average pooling
GMP: global max pooling
GT: ground truth
ICF: illumination correction function
mTI: modified Tversky index
NPV: negative predictive value
PA: pixel-level accuracy
PCA: principal component analysis
PCs: principal components
RMSprop: root mean square propagation
ROI: region of interest
SMO: Silver Mountain Operator
Soft-CP: soft-copy and soft-paste
TAP: thresholded average pooling
ThioS: thioflavin-S
TNcls/seg: true negatives for classification/segmentation tasks
TPcls/seg: true positives for classification/segmentation tasks
WSOL: weakly supervised object localization
WSSS: weakly supervised semantic segmentation
aUF: asymmetric unified focal loss
BC: binary cross-entropy loss
maF: modified asymmetric focal loss
maFT: modified asymmetric focal Tversky loss

References

  1. Braak, H.; Braak, E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol. 1991, 82, 239–259.
  2. Arendt, T.; Stieler, J.; Ueberham, U. Is sporadic Alzheimer’s disease a developmental disorder? J. Neurochem. 2017, 143, 396–408.
  3. DeTure, M.A.; Dickson, D.W. The neuropathological diagnosis of Alzheimer’s disease. Mol. Neurodegener. 2019, 14, 32.
  4. Walker, L.C. Aβ Plaques. Free Neuropathol. 2020, 1, 31.
  5. Cole, S.L.; Vassar, R. The Alzheimer’s disease beta-secretase enzyme, BACE1. Mol. Neurodegener. 2007, 2, 22.
  6. Jankovska, N.; Olejar, T.; Matej, R. Extracellular protein aggregates colocalization and neuronal dystrophy in comorbid Alzheimer’s and Creutzfeldt-Jakob disease: A micromorphological pilot study on 20 brains. Int. J. Mol. Sci. 2021, 22, 2099.
  7. Lauritzen, I.; Pardossi-Piquard, R.; Bauer, C.; Brigham, E.; Abraham, J.D.; Ranaldi, S.; Fraser, P.; St-George-Hyslop, P.; Le Thuc, O.; Espin, V.; et al. The β-secretase-derived C-terminal fragment of βAPP, C99, but not Aβ, is a key contributor to early intraneuronal lesions in triple-transgenic mouse hippocampus. J. Neurosci. 2012, 32, 16243–16255.
  8. Serrano-Pozo, A.; Frosch, M.P.; Masliah, E.; Hyman, B.T. Neuropathological alterations in Alzheimer disease. Cold Spring Harb. Perspect. Med. 2011, 1, a006189.
  9. Hardy, J.; Selkoe, D.J. The amyloid hypothesis of Alzheimer’s disease: Progress and problems on the road to therapeutics. Science 2002, 297, 353–356.
  10. Thal, D.R.; Rüb, U.; Orantes, M.; Braak, H. Phases of A beta-deposition in the human brain and its relevance for the development of AD. Neurology 2002, 58, 1791–1800.
  11. Pirici, D.; Van Cauwenberghe, C.; Van Broeckhoven, C.; Kumar-Singh, S. Fractal analysis of amyloid plaques in Alzheimer’s disease patients and mouse models. Neurobiol. Aging 2011, 32, 1579–1587.
  12. Thal, D.R.; Capetillo-Zarate, E.; Del Tredici, K.; Braak, H. The development of amyloid beta protein deposits in the aged brain. Sci. Aging Knowl. Environ. 2006, 6, re1.
  13. Bussière, T.; Bard, F.; Barbour, R.; Grajeda, H.; Guido, T.; Khan, K.; Schenk, D.; Games, D.; Seubert, P.; Buttini, M. Morphological characterization of Thioflavin-S-positive amyloid plaques in transgenic Alzheimer mice and effect of passive Abeta immunotherapy on their clearance. Am. J. Pathol. 2004, 165, 987–995.
  14. Armstrong, R.A. Beta-amyloid plaques: Stages in life history or independent origin? Dement. Geriatr. Cogn. Disord. 1998, 9, 227–238.
  15. Röhr, D.; Boon, B.; Schuler, M.; Kremer, K.; Hoozemans, J.; Bouwman, F.H.; El-Mashtoly, S.F.; Nabers, A.; Großerueschkamp, F.; Rozemuller, A.; et al. Label-free vibrational imaging of different Aβ plaque types in Alzheimer’s disease reveals sequential events in plaque development. Acta Neuropathol. Commun. 2020, 8, 222.
  16. MacKeigan, T.P.; Morgan, M.L.; Stys, P.K. Quantitation of tissue amyloid via fluorescence spectroscopy using controlled concentrations of thioflavin-S. Molecules 2023, 28, 4483.
  17. Sun, A.; Nguyen, X.V.; Bing, G. Comparative analysis of an improved thioflavin-S stain, Gallyas silver stain, and immunohistochemistry for neurofibrillary tangle demonstration on the same sections. J. Histochem. Cytochem. 2002, 50, 463–472.
  18. Rajamohamedsait, H.B.; Sigurdsson, E.M. Histological staining of amyloid and pre-amyloid peptides and proteins in mouse tissue. Methods Mol. Biol. 2012, 849, 411–424.
  19. Wilcock, D.M.; Gordon, M.N.; Morgan, D. Quantification of cerebral amyloid angiopathy and parenchymal amyloid plaques with Congo red histochemical stain. Nat. Protoc. 2006, 1, 1591–1595.
  20. Barton, S.M.; To, E.; Rogers, B.P.; Whitmore, C.; Uppal, M.; Matsubara, J.A.; Pham, W. Inhalable thioflavin S for the detection of amyloid beta deposits in the retina. Molecules 2021, 26, 835.
  21. Piccinini, F.; Lucarelli, E.; Gherardi, A.; Bevilacqua, A. Multi-image based method to correct vignetting effect in light microscopy images. J. Microsc. 2012, 248, 6–22.
  22. Silberberg, M.; Grecco, H.E. Robust and unbiased estimation of the background distribution for automated quantitative imaging. J. Opt. Soc. Am. A 2023, 40, C8–C15.
  23. Krutsay, M. Patológiai Technika; Medicina: Budapest, Hungary, 1999; ISBN 963 242 434 4.
  24. Yang, R.; Yu, Y. Artificial convolutional neural network in object detection and semantic segmentation for medical imaging analysis. Front. Oncol. 2021, 11, 638182.
  25. Gao, Y.; Jiang, Y.; Peng, Y.; Yuan, F.; Zhang, X.; Wang, J. Medical image segmentation: A comprehensive review of deep learning-based methods. Tomography 2025, 11, 52.
  26. Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2921–2929.
  27. Han, C.; Lin, J.; Mai, J.; Wang, Y.; Zhang, Q.; Zhao, B.; Chen, X.; Pan, X.; Shi, Z.; Xu, Z.; et al. Multi-layer pseudo-supervision for histopathology tissue semantic segmentation using patch-level classification labels. Med. Image Anal. 2022, 80, 102487.
  28. Din, N.U.; Yu, J. Training a deep learning model for single-cell segmentation without manual annotation. Sci. Rep. 2021, 11, 23995.
  29. Nishimura, K.; Ker, E.; Bise, R. Weakly supervised cell instance segmentation by propagating from detection response. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2019; Shen, D., Ed.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2019; Volume 11764.
  30. Ramananda, S.H.; Sundaresan, V. Class activation map-based weakly supervised hemorrhage segmentation using Resnet-LSTM in non-contrast computed tomography images. arXiv 2023, arXiv:2309.16627.
  31. Li, Z.; Xia, Y. Deep reinforcement learning for weakly-supervised lymph node segmentation in CT images. IEEE J. Biomed. Health Inform. 2021, 25, 774–783.
  32. Fu, J.; Lu, T.; Zhang, S.; Wang, G. UM-CAM: Uncertainty-weighted multi-resolution class activation maps for weakly-supervised fetal brain segmentation. Pattern Recognit. 2023, 160, 111204.
  33. Liu, Y.; Lian, L.; Zhang, E.; Xu, L.; Xiao, C.; Zhong, X.; Li, F.; Jiang, B.; Dong, Y.; Ma, L.; et al. Mixed-UNet: Refined class activation mapping for weakly-supervised semantic segmentation with multi-scale inference. Front. Comput. Sci. 2022, 4, 1036934.
  34. Sampaio, V.; Cordeiro, F. Improving mass detection in mammography images: A study of weakly supervised learning and class activation map methods. In Proceedings of the 36th SIBGRAPI Conference on Graphics, Patterns and Images, Rio de Janeiro, Brazil, 17–20 October 2023; pp. 139–144.
  35. Li, Y.; Yu, Y.; Zou, Y.; Xiang, T.; Li, X. Online easy example mining for weakly-supervised gland segmentation from histology images. arXiv 2022, arXiv:2206.06665v3.
  36. SukeshAdiga, V.; Dolz, J.; Lombaert, H. Manifold-driven attention maps for weakly supervised segmentation. arXiv 2020, arXiv:2004.03046v1.
  37. Kim, J.; Kim, H.J.; Kim, C.; Lee, J.H.; Kim, K.W.; Park, Y.M.; Kim, H.W.; Ki, S.Y.; Kim, Y.M.; Kim, W.H. Weakly-supervised deep learning for ultrasound diagnosis of breast cancer. Sci. Rep. 2021, 11, 24382.
  38. Kushwaha, A.; Gupta, S.; Bhanushali, A.; Rai Dastidar, T. Rapid training data creation by synthesizing medical images for classification and localization. arXiv 2023, arXiv:2308.04687.
  39. Zhang, A.; Shailja, S.; Borba, C.; Miao, Y.; Goebel, M.; Ruschel, R.; Ryan, K.; Smith, W.; Manjunath, B.S. Automatic classification and neurotransmitter prediction of synapses in electron microscopy. Biol. Imaging 2022, 2, e6.
  40. Liu, D.; Zhang, D.; Song, Y.; Zhang, F.; O’Donnell, L.; Huang, H.; Chen, M.; Cai, W. Unsupervised instance segmentation in microscopy images via panoptic domain adaptation and task re-weighting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020.
  41. Curti, N.; Merli, Y.; Zengarini, C.; Giampieri, E.; Merlotti, A.; Dall’Olio, D.; Marcelli, E.; Bianchi, T.; Castellani, G. Effectiveness of semi-supervised active learning in automated wound image segmentation. Int. J. Mol. Sci. 2023, 24, 706.
  42. Arvaniti, E.; Fricker, K.; Moret, M.; Rupp, N.; Hermanns, T.; Fankhauser, C.; Wey, N.; Wild, P.; Rüschoff, J.; Claassen, M. Automated Gleason grading of prostate cancer tissue microarrays via deep learning. Sci. Rep. 2018, 8, 12054.
  43. Liu, H.; Wang, L.; Nan, Y.; Jin, F.; Wang, Q.; Pu, J. SDFN: Segmentation-based deep fusion network for thoracic disease classification in chest X-ray images. Comput. Med. Imaging Graph. 2019, 75, 66–73.
  44. Meng, M.; Zhang, M.; Shen, D.; He, G. Differentiation of breast lesions on dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) using deep transfer learning based on DenseNet201. Medicine 2022, 101, e31214.
  45. Sulaiman, A.; Anand, V.; Gupta, S.; Rajab, A.; Alshahrani, H.; Al Reshan, M.; Shaikh, A.; Hamdi, M. Attention based UNet model for breast cancer segmentation using BUSI dataset. Sci. Rep. 2024, 14, 22422.
  46. Dornadula, B.; Geetha, S.; Anbarasi, L.; Kadry, S. A survey of COVID-19 detection from chest X-rays using deep learning methods. Int. J. Data Warehous. Min. 2022, 18, 1–16.
  47. Ullah, I.; Ali, F.; Shah, B.; El-Sappagh, S.; Abuhmed, T.; Park, S. A deep learning based dual encoder-decoder framework for anatomical structure segmentation in chest X-ray images. Sci. Rep. 2023, 13, 791.
  48. Wei, J.W.; Wei, J.W.; Jackson, C.R.; Ren, B.; Suriawinata, A.A.; Hassanpour, S. Automated detection of celiac disease on duodenal biopsy slides: A deep learning approach. J. Pathol. Inform. 2019, 10, 7.
  49. Nguyen Thi Phuong, H.; Shin, C.-S.; Jeong, H.-Y. Finding the differences in capillaries of taste buds between smokers and non-smokers using the convolutional neural networks. Appl. Sci. 2021, 11, 3460.
  50. Richmond, D.; Jost, A.P.; Lambert, T.J.; Waters, J.C.; Elliott, H. DeadNet: Identifying phototoxicity from label-free microscopy images of cells using deep ConvNets. arXiv 2017, arXiv:1701.06109v1.
  51. Irmak, E. A novel implementation of deep-learning approach on malaria parasite detection from thin blood cell images. Electrica 2021, 21, 216–224.
  52. Chen, Z.; Zheng, W.; Pang, K.; Xia, D.; Guo, L.; Chen, X.; Wu, F.; Wang, H. Weakly supervised learning analysis of Aβ plaque distribution in the whole rat brain. Front. Neurosci. 2023, 16, 1097019.
  53. Tang, Z.; Chuang, K.V.; DeCarli, C.; Jin, L.W.; Beckett, L.; Keiser, M.J.; Dugger, B.N. Interpretable classification of Alzheimer’s disease pathologies with a convolutional neural network pipeline. Nat. Commun. 2019, 10, 2173.
  54. Müller, D.; Röhr, D.; Boon, B.D.C.; Wulf, M.; Arto, T.; Hoozemans, J.J.M.; Marcus, K.; Rozemuller, A.J.M.; Großerueschkamp, F.; Mosig, A.; et al. Label-free Aβ plaque detection in Alzheimer’s disease brain tissue using infrared microscopy and neural networks. Heliyon 2025, 11, e42111.
  55. Chabrun, F.; Dieu, X.; Doudeau, N.; Gautier, J.; Luque-Paz, D.; Geneviève, F.; Ferré, M.; Mirebeau-Prunier, D.; Annweiler, C.; Reynier, P. Deep learning shows no morphological abnormalities in neutrophils in Alzheimer’s disease. Alzheimer’s Dement. 2021, 13, e12146.
  56. Iandola, F.N.; Moskewicz, M.W.; Ashraf, K.; Han, S.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. arXiv 2016, arXiv:1602.07360v4.
  57. Lin, M.; Chen, Q.; Yan, S. Network in network. arXiv 2014, arXiv:1312.4400v3.
  58. Selvaraju, R.R.; Das, A.; Vedantam, R.; Cogswell, M.; Parikh, D.; Batra, D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. 2020, 128, 336–359.
  59. Bae, W.; Noh, J.; Kim, G. Rethinking class activation mapping for weakly supervised object localization. In Computer Vision–ECCV 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2020; Volume 12360, pp. 618–634.
  60. Oquab, M.; Bottou, L.; Laptev, I.; Sivic, J. Is object localization for free?–Weakly-supervised learning with convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 685–694.
  61. Pinheiro, P.H.O.; Collobert, R. From image-level to pixel-level labeling with convolutional networks. arXiv 2015, arXiv:1411.6228v3.
  62. Dogan, Y. A new global pooling method for deep neural networks: Global average of top-k max-pooling. Trait. Du Signal 2023, 40, 577–587.
  63. Nieradzik, L.; Stephani, H.; Keuper, J. Top-GAP: Integrating size priors in CNNs for more interpretability, robustness, and bias mitigation. arXiv 2024, arXiv:2409.04819v1.
  64. Christlein, V.; Spranger, L.; Seuret, M.; Nicolaou, A.; Král, P.; Maier, A.K. Deep generalized max pooling. In Proceedings of the International Conference on Document Analysis and Recognition, Sydney, NSW, Australia, 22–26 September 2019; pp. 1090–1096.
  65. Chattopadhyay, A.; Sarkar, A.; Howlader, P.; Balasubramanian, V.N. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Lake Tahoe, NV, USA, 12–15 March 2018; pp. 839–847.
  66. Wang, H.; Wang, Z.; Du, M.; Yang, F.; Zhang, Z.; Ding, S.; Mardziel, P.; Hu, X. Score-CAM: Score-weighted visual explanations for convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020.
  67. Englebert, A.; Cornu, O.; Vleeschouwer, C.D. Poly-CAM: High resolution class activation map for convolutional neural networks. Mach. Vis. Appl. 2022, 35, 89.
  68. Shinde, S.; Chougule, T.; Saini, J.; Ingalhalikar, M. HR-CAM: Precise localization of pathology using multi-level learning in CNNs. arXiv 2019, arXiv:1909.12919v1.
  69. Shen, W.; Peng, Z.; Wang, X.; Wang, H.; Cen, J.; Jiang, D.; Xie, L.; Yang, X.; Tian, Q. A survey on label-efficient deep image segmentation: Bridging the gap between weak supervision and dense prediction. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 9284–9305.
  70. Zhang, R.; Li, G.; Wunderlich, T.; Wang, L. A survey on deep learning-based precise boundary recovery of semantic segmentation for images and point clouds. Int. J. Appl. Earth Observ. Geoinform. 2021, 102, 102411.
  71. Kwak, S.; Hong, S.; Han, B. Weakly supervised semantic segmentation using superpixel pooling network. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
  72. Zheng, S.; Jayasumana, S.; Romera-Paredes, B.; Vineet, V.; Su, Z.; Du, D.; Huang, C.; Torr, P.H. Conditional random fields as recurrent neural networks. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1529–1537.
  73. Bressan, P.O.; Junior, J.M.; Martins, J.A.C.; De Melo, M.J.; Gonçalves, D.N.; Freitas, D.M.; Ramos, A.P.M.; Furuya, M.T.G.; Osco, L.P.; De Andrade Silva, J.; et al. Semantic segmentation with labeling uncertainty and class imbalance applied to vegetation mapping. Int. J. Appl. Earth Observ. Geoinform. 2022, 108, 102690.
  74. Kervadec, H.; Bouchtiba, J.; Desrosiers, C.; Granger, E.; Dolz, J.; Ayed, I.B. Boundary loss for highly unbalanced segmentation. Med. Image Anal. 2021, 67, 101851.
  75. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015; Navab, N., Hornegger, J., Wells, W., Frangi, A., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2015; Volume 9351, pp. 234–241.
  76. Guo, R.; Sun, X.; Chen, K.; Zhou, X.; Yan, Z.; Diao, W.; Yan, M. JMLNet: Joint multi-label learning network for weakly supervised semantic segmentation in aerial images. Remote Sens. 2020, 12, 3169.
  77. Islam, M.; Glocker, B. Spatially varying label smoothing: Capturing uncertainty from expert annotations. In Information Processing in Medical Imaging. IPMI 2021; Feragen, A., Sommer, S., Schnabel, J., Nielsen, M., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2021; Volume 12729, pp. 677–688.
  78. Kats, E.; Goldberger, J.; Greenspan, H. Soft labeling by distilling anatomical knowledge for improved MS lesion segmentation. In Proceedings of the IEEE 16th International Symposium on Biomedical Imaging (ISBI), Venice, Italy, 8–11 April 2019; pp. 1563–1566.
  79. Vasudeva, S.A.; Dolz, J.; Lombaert, H. GeoLS: An intensity-based, geodesic soft labeling for image segmentation. J. Mach. Learn. Biomed. Imaging 2025, 2, 120–134.
  80. Dang, T.; Nguyen, H.; Tiulpin, A. SiNGR: Brain tumor segmentation via signed normalized geodesic transform regression. arXiv 2024, arXiv:2405.16813v4.
  81. Jiangtao, W.; Ruhaiyem, N.I.R.; Panpan, F. A comprehensive review of U-Net and its variants: Advances and applications in medical image segmentation. IET Image Process. 2025, 19, e70019.
  82. Solopov, M.; Chechekhina, E.; Kavelina, A.; Akopian, G.; Turchin, V.; Popandopulo, A.; Filimonov, D.; Ishchenko, R. Comparative study of deep transfer learning models for semantic segmentation of human mesenchymal stem cell micrographs. Int. J. Mol. Sci. 2025, 26, 2338.
  83. Ibragimov, A.; Senotrusova, S.; Markova, K.; Karpulevich, E.; Ivanov, A.; Tyshchuk, E.; Grebenkina, P.; Stepanova, O.; Sirotskaya, A.; Kovaleva, A.; et al. Deep semantic segmentation of angiogenesis images. Int. J. Mol. Sci. 2023, 24, 1102.
  84. Zhong, Y.; Zhou, W.; Wang, Z. A survey of data augmentation in domain generalization. Neural Process. Lett. 2025, 57, 34.
  85. Wang, Z.; Wang, P.; Liu, K.; Wang, P.; Fu, Y.; Lu, C.; Aggarwal, C.C.; Pei, J.; Zhou, Y. A comprehensive survey on data augmentation. arXiv 2025, arXiv:2405.09591v3.
  86. Kumar, T.; Brennan, R.; Mileo, A.; Bendechache, M. Image data augmentation approaches: A comprehensive survey and future directions. IEEE Access 2024, 12, 187536–187571.
  87. Ma, J.; Hu, C.; Zhou, P.; Jin, F.; Wang, X.; Huang, H. Review of image augmentation used in deep learning-based material microscopic image segmentation. Appl. Sci. 2023, 13, 6478.
  88. Lewy, D.; Mańdziuk, J. An overview of mixing augmentation methods and augmentation strategies. Artif. Intell. Rev. 2022, 56, 2111–2169.
  89. Zhang, H.; Cissé, M.; Dauphin, Y.; Lopez-Paz, D. Mixup: Beyond empirical risk minimization. arXiv 2018, arXiv:1710.09412v2.
  90. Verma, V.; Lamb, A.; Beckham, C.; Najafi, A.; Courville, A.C.; Mitliagkas, I.; Bengio, Y. Manifold mixup: Learning better representations by interpolating hidden states. arXiv 2019, arXiv:1806.05236v7.
  91. Yun, S.; Han, D.; Chun, S.; Oh, S.J.; Yoo, Y.; Choe, J. CutMix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6022–6031.
  92. Olsson, V.; Tranheden, W.; Pinto, J.; Svensson, L. ClassMix: Segmentation-based data augmentation for semi-supervised learning. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 1368–1377.
  93. Su, Y.; Sun, R.; Lin, G.; Wu, Q. Context decoupling augmentation for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 6984–6994.
  94. Sun, D.; Dornaika, F.; Charafeddine, J. LCAMix: Local-and-contour aware grid mixing based data augmentation for medical image segmentation. Inf. Fusion 2024, 110, 102484. [Google Scholar] [CrossRef]
  95. Sun, D.; Dornaika, F.; Barrena, N. HSMix: Hard and soft mixing data augmentation for medical image segmentation. Inf. Fusion 2025, 115, 102741. [Google Scholar] [CrossRef]
  96. Dwibedi, D.; Misra, I.; Hebert, M. Cut, paste and learn: Surprisingly easy synthesis for instance detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1310–1319. [Google Scholar] [CrossRef]
  97. Ghiasi, G.; Cui, Y.; Srinivas, A.; Qian, R.; Lin, T.; Cubuk, E.D.; Le, Q.V.; Zoph, B. Simple copy-paste is a strong data augmentation method for instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 2917–2927. [Google Scholar] [CrossRef]
  98. Zhang, J.; Zhang, Y.; Xu, X. ObjectAUG: Object-level data augmentation for semantic image segmentation. In Proceedings of the International Joint Conference on Neural Networks, Shenzhen, China, 18–22 July 2021; pp. 1–8. [Google Scholar] [CrossRef]
  99. Liu, X.; Ono, K.; Bise, R. A data augmentation approach that ensures the reliability of foregrounds in medical image segmentation. Image Vis. Comput. 2024, 147, 105056. [Google Scholar] [CrossRef]
  100. Yang, J.; Zhang, Y.; Liang, Y.; Zhang, Y.; He, L.; He, Z. TumorCP: A simple but effective object-level data augmentation for tumor segmentation. arXiv 2021, arXiv:2107.09843v1. [Google Scholar] [CrossRef]
  101. Dai, P.; Dong, L.; Zhang, R.; Zhu, H.; Wu, J.; Yuan, K. Soft-CP: A credible and effective data augmentation for semantic segmentation of medical lesions. arXiv 2022, arXiv:2203.10507v1. [Google Scholar] [CrossRef]
  102. Zwanenburg, A.; Leger, S.; Agolli, L.; Pilz, K.; Troost, E.G.C.; Richter, C.; Löck, S. Assessing robustness of radiomic features by image perturbation. Sci. Rep. 2019, 9, 614. [Google Scholar] [CrossRef] [PubMed]
  103. Lo Iacono, F.; Maragna, R.; Pontone, G.; Corino, V. A novel data augmentation method for radiomics analysis using image perturbations. J. Imaging Inform. Med. 2024, 37, 2401–2414. [Google Scholar] [CrossRef]
  104. Soares, L.D.A.; Côco, K.F.; Ciarelli, P.M.; Salles, E.O.T. A class-independent texture-separation method based on a pixel-wise binary classification. Sensors 2020, 20, 5432. [Google Scholar] [CrossRef] [PubMed]
  105. Hyman, B.T.; West, H.L.; Rebeck, G.W.; Buldyrev, S.V.; Mantegna, R.N.; Ukleja, M.; Havlin, S.; Stanley, H.E. Quantitative analysis of senile plaques in Alzheimer disease: Observation of log-normal size distribution and molecular epidemiology of differences associated with apolipoprotein E genotype and trisomy 21 (Down syndrome). Proc. Natl. Acad. Sci. USA 1995, 92, 3586–3590. [Google Scholar] [CrossRef] [PubMed]
  106. Nilsson, L.N.; Bales, K.R.; DiCarlo, G.; Gordon, M.N.; Morgan, D.; Paul, S.M.; Potter, H. Alpha-1-antichymotrypsin promotes beta-sheet amyloid plaque deposition in a transgenic mouse model of Alzheimer’s disease. J. Neurosci. 2001, 21, 1444–1451. [Google Scholar] [CrossRef] [PubMed]
  107. Karperien, A.L.; Jelinek, H.F.; Buchan, A.M. Box-counting analysis of microglia form in schizophrenia. Alzheimer’s disease and affective disorder. Fractals 2008, 16, 103–107. [Google Scholar] [CrossRef]
  108. Karperien, A.; Jelinek, H.F.; Milosevic, N.T. Lacunarity analysis and classification of microglia in neuroscience. In Proceedings of the 8th European Conference on Mathematical and Theoretical Biology, European Society for Mathematical and Theoretical Biology, London, UK, 11–15 July 2011. MS#88. [Google Scholar] [CrossRef]
  109. Fernández-Arjona, M.; Grondona, J.M.; Granados-Durán, P.; Fernández-Llebrez, P.; López-Ávalos, M.D. Microglia morphological categorization in a rat model of neuroinflammation by hierarchical cluster and principal components analysis. Front. Cell. Neurosci. 2017, 11, 235. [Google Scholar] [CrossRef] [PubMed]
  110. Stojić, D.; Radošević, D.; Rajković, N.; Milošević, N.T. 2D images of astrocytes in the human principal olivary nucleus: Monofractal analysis of the morphology. J. Biosci. Med. 2021, 9, 38–48. [Google Scholar] [CrossRef]
  111. Amin, E.; Elgammal, Y.M.; Zahran, M.A.; Abdelsalam, M.M. Alzheimer’s disease: New insight in assessing of amyloid plaques morphologies using multifractal geometry based on Naive Bayes optimized by random forest algorithm. Sci. Rep. 2023, 13, 18568. [Google Scholar] [CrossRef] [PubMed]
  112. Puntambekar, S.S.; Moutinho, M.; Lin, P.B.; Jadhav, V.; Tumbleson-Brink, D.; Balaji, A.; Benito, M.A.; Xu, G.; Oblak, A.; Lasagna-Reeves, C.A.; et al. CX3CR1 deficiency aggravates amyloid driven neuronal pathology and cognitive decline in Alzheimer’s disease. Mol. Neurodegener. 2022, 17, 47. [Google Scholar] [CrossRef] [PubMed]
  113. Meilandt, W.J.; Ngu, H.; Gogineni, A.; Lalehzadeh, G.; Lee, S.H.; Srinivasan, K.; Imperio, J.; Wu, T.; Weber, M.; Kruse, A.J.; et al. Trem2 deletion reduces late-stage amyloid plaque accumulation, elevates the Aβ42:Aβ40 ratio, and exacerbates axonal dystrophy and dendritic spine loss in the PS2APP Alzheimer’s mouse model. J. Neurosci. 2020, 40, 1956–1974. [Google Scholar] [CrossRef] [PubMed]
  114. Gulyas, B.; Sovago, J.; Gomez-Mancilla, B.; Jia, Z.; Szigeti, C.; Gulya, K.; Schumacher, M.; Maguire, R.P.; Gasparini, F.; Halldin, C. Decrease of mGluR5 receptor density goes parallel with changes in enkephalin and substance P immunoreactivity in Huntington’s disease: A preliminary investigation in the postmortem human brain. Brain Struct. Funct. 2015, 220, 3043–3051. [Google Scholar] [CrossRef] [PubMed]
  115. Singh, S.; Bray, M.A.; Jones, T.R.; Carpenter, A.E. Pipeline for illumination correction of images for high-throughput microscopy. J. Microsc. 2014, 256, 231–236. [Google Scholar] [CrossRef] [PubMed]
  116. Waters, J.C.; Wittmann, T. Concepts in quantitative fluorescence microscopy. Methods Cell Biol. 2014, 123, 1–18. [Google Scholar] [CrossRef] [PubMed]
  117. Likar, B.; Maintz, J.B.; Viergever, M.A.; Pernus, F. Retrospective shading correction based on entropy minimization. J. Microsc. 2000, 197, 285–295. [Google Scholar] [CrossRef] [PubMed]
  118. Leong, F.J.; Brady, M.; McGee, J.O. Correction of uneven illumination (vignetting) in digital microscopy images. J. Clin. Pathol. 2003, 56, 619–621. [Google Scholar] [CrossRef] [PubMed]
  119. Lee, S.C.; Bajcsy, P. Intensity correction of fluorescent confocal laser scanning microscope images by mean-weight filtering. J. Microsc. 2006, 221, 122–136. [Google Scholar] [CrossRef] [PubMed]
  120. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66. [Google Scholar] [CrossRef]
  121. Sternberg, S.R. Biomedical image processing. Computer 1983, 16, 22–34. [Google Scholar] [CrossRef]
  122. van Rossum, G. Python reference manual. In Department of Computer Science; Centrum Wiskunde & Informatica: Amsterdam, The Netherlands, 1995; ISSN 0169-118X. [Google Scholar]
  123. Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362. [Google Scholar] [CrossRef] [PubMed]
  124. Bradski, G. The OpenCV Library. Dr. Dobb’s J. Softw. Tools 2000, 25/11, 122–125. [Google Scholar]
  125. Bisong, E. Google colaboratory. In Building Machine Learning and Deep Learning Models on Google Cloud Platform; Apress: Berkeley, CA, USA, 2019. [Google Scholar] [CrossRef]
  126. Schneider, C.A.; Rasband, W.S.; Eliceiri, K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 2012, 9, 671–675. [Google Scholar] [CrossRef] [PubMed]
  127. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv 2016, arXiv:1603.04467v2. [Google Scholar] [CrossRef]
  128. He, K.; Sun, J. Statistics of patch offsets for image completion. In Computer Vision–ECCV 2012. Lecture Notes in Computer Science; Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; Volume 7573. [Google Scholar] [CrossRef]
  129. Park, S.; Choi, Y.; Hwang, H. SACuP: Sonar Image Augmentation with Cut and Paste based DataBank for semantic segmentation. Remote Sens. 2023, 15, 5185. [Google Scholar] [CrossRef]
  130. Brorsson, E.; Åkesson, K.; Svensson, L.; Bengtsson, K. ECAP: Extensive cut-and-paste augmentation for unsupervised domain adaptive semantic segmentation. In Proceedings of the IEEE International Conference on Image Processing, Abu Dhabi, UAE, 27–30 October 2024; pp. 610–616. [Google Scholar] [CrossRef]
  131. Guan, S.; Samala, R.K.; Kahaki, S.M.M.; Chen, W. Restorable synthesis: Average synthetic segmentation converges to a polygon approximation of an object contour in medical images. In Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation, Santa Fe, NM, USA, 17–19 March 2024; pp. 77–80. [Google Scholar] [CrossRef]
  132. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2017, arXiv:1412.6980v9. [Google Scholar] [CrossRef]
  133. Yeung, M.; Sala, E.; Schönlieb, C.-B.; Rundo, L. Unified focal loss: Generalising Dice and cross entropy-based losses to handle class imbalanced medical image segmentation. Comput. Med. Imaging Graph. 2022, 95, 102026. [Google Scholar] [CrossRef] [PubMed]
  134. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. SciKit-Learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  135. West, R.M. Best practice in statistics: The use of log transformation. Ann. Clin. Biochem. 2022, 59, 162–165. [Google Scholar] [CrossRef] [PubMed]
  136. Pither, J. Tutorials for BIOL202: Introduction to Biostatistics. Available online: https://ubco-biology.github.io/BIOL202/index.html (accessed on 15 February 2025).
  137. McKinney, W. Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; van der Walt, S., Millman, J., Eds.; SciPy.org: Austin, TX, USA, 2010; Volume 445, pp. 56–61. [Google Scholar] [CrossRef]
  138. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 2020, 17, 261–272, Erratum in Nat. Methods 202017, 352. [Google Scholar] [CrossRef] [PubMed]
  139. Hunter, J.D. MatPlotLib: A 2D Graphics environment. Comput. Sci. Eng. 2007, 9, 90–95. [Google Scholar] [CrossRef]
  140. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2012; ISBN 3-900051-07-0. Available online: https://www.R-pr-oject.org/ (accessed on 15 February 2025).
Figure 1. Monitoring performance metrics across patient-wise 7-fold cross-validation of the SqueezeNet model. (A) Binary cross-entropy (BC) loss and (B) accuracy curves are shown for both training and development sets. Each color represents a different cross-validation iteration in which a distinct subject (test fold) was excluded from training. Within each iteration, the remaining data (from the other six folds) were split into 90% training and 10% development sets. Darker shades indicate the training sets, while lighter/transparent shades of the same color correspond to their respective development (validation) sets, used to monitor performance during training. For visualization purposes, the curves were smoothed with a 25-epoch moving average, which highlights overall trends while minimizing short-term fluctuations.
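The patient-wise splitting described above can be reproduced in a few lines. Below is a minimal sketch (not the authors' code) using scikit-learn's GroupKFold; the array names `patches`, `labels`, and `patient_ids` are illustrative assumptions, and the 25-epoch moving average used for the plotted curves is included for completeness.

```python
# Minimal sketch: patient-wise k-fold split plus curve smoothing.
import numpy as np
from sklearn.model_selection import GroupKFold, train_test_split

def patientwise_folds(patches, labels, patient_ids, n_folds=7, dev_frac=0.10, seed=0):
    """Yield (train, dev, test) index sets; each test fold holds out one unseen subject group."""
    gkf = GroupKFold(n_splits=n_folds)
    for trainval_idx, test_idx in gkf.split(patches, labels, groups=patient_ids):
        # Split the remaining data into 90% training / 10% development indices.
        train_idx, dev_idx = train_test_split(
            trainval_idx, test_size=dev_frac, random_state=seed,
            stratify=labels[trainval_idx])
        yield train_idx, dev_idx, test_idx

def moving_average(curve, window=25):
    """Smooth a per-epoch metric curve with a simple moving average."""
    kernel = np.ones(window) / window
    return np.convolve(curve, kernel, mode="valid")
```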
Figure 2. Qualitative evaluation of class activation maps (CAMs) derived from the Thresholded Average Pooling (TAP)-SqueezeNet binary classifier (final model trained on the complete dataset). The figure displays four pairs of alternating columns: (A,C,E,G) original input image patches (left) and (B,D,F,H) the corresponding heatmaps (right) overlaid on the original images. Heatmaps were generated using negative weight-clamping, upsampled to the input image size (572 × 572 pixels), and scaled to the [0, 1] interval. The color bar on the right represents heatmap intensity, ranging from low (blue) to high (red) values, using the "jet" colormap. These maps highlight regions where the model identifies features associated with the presence of parenchymal amyloid structures and were subsequently used to guide segmentation model training. Scale bar: 100 μm.
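As a rough illustration of the heatmap generation described above, the following NumPy/OpenCV sketch derives a CAM by clamping negative classifier weights to zero, channel-weighting the final convolutional features, upsampling to 572 × 572 pixels, and min–max scaling to [0, 1]. The inputs `features` and `weights` are assumed names, not identifiers from the original implementation.

```python
# Minimal sketch of CAM generation with negative weight-clamping.
import cv2
import numpy as np

def clamped_cam(features, weights, out_size=(572, 572)):
    """features: last conv activations (h, w, c); weights: positive-class head weights (c,)."""
    w = np.clip(weights, 0.0, None)                    # clamp negative weights to zero
    cam = np.tensordot(features, w, axes=([2], [0]))   # weighted sum over channels -> (h, w)
    cam = cv2.resize(cam, out_size, interpolation=cv2.INTER_LINEAR)  # upsample to input size
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)                    # scale to the [0, 1] interval

def jet_overlay(image_bgr, cam, alpha=0.4):
    """Overlay the heatmap on the input image using the 'jet' colormap."""
    heat = cv2.applyColorMap((cam * 255).astype(np.uint8), cv2.COLORMAP_JET)
    return cv2.addWeighted(image_bgr, 1.0 - alpha, heat, alpha, 0.0)
```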
Figure 3. Monitoring of performance metrics of the U-Net segmentation model across patient-wise 7-fold cross-validation. (A) Asymmetric unified focal loss (aUF) and (B) pixel-wise accuracy (PA) curves are shown for both training and development sets. Each color represents a different cross-validation iteration in which a distinct subject (test fold) was excluded from training. Within each iteration, the remaining data (from the other six folds) were split into 90% training and 10% development sets. Darker shades indicate the training sets, while lighter/transparent shades of the same color correspond to their respective development (validation) sets, used to monitor performance during training.
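For readers unfamiliar with the aUF loss, the sketch below gives a simplified binary TensorFlow version: a weighted sum of an asymmetric focal cross-entropy (focal modulation applied only to the background class) and an asymmetric focal Tversky term (focal exponent applied only to the rare foreground class). The parameter values are illustrative defaults, not the settings used in this study, and the code is a sketch of the general formulation rather than the authors' implementation.

```python
# Simplified binary sketch of an asymmetric unified focal loss (illustrative parameters).
import tensorflow as tf

def asymmetric_unified_focal_loss(weight=0.5, delta=0.6, gamma=0.5):
    def loss(y_true, y_pred):
        eps = tf.keras.backend.epsilon()
        y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
        # Asymmetric focal cross-entropy: focal down-weighting on background pixels only.
        ce_fg = -delta * y_true * tf.math.log(y_pred)
        ce_bg = -(1.0 - delta) * tf.pow(y_pred, gamma) * (1.0 - y_true) * tf.math.log(1.0 - y_pred)
        l_af = tf.reduce_mean(ce_fg + ce_bg)
        # Asymmetric focal Tversky: focal exponent on the rare foreground class only.
        tp = tf.reduce_sum(y_true * y_pred)
        fn = tf.reduce_sum(y_true * (1.0 - y_pred))
        fp = tf.reduce_sum((1.0 - y_true) * y_pred)
        ti = (tp + eps) / (tp + delta * fn + (1.0 - delta) * fp + eps)
        l_aft = tf.pow(1.0 - ti, 1.0 - gamma)
        return weight * l_af + (1.0 - weight) * l_aft
    return loss

# Usage (hypothetical): model.compile(optimizer="adam", loss=asymmetric_unified_focal_loss())
```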
Figure 4. Qualitative evaluation of U-Net binary foreground–background (FG-BG) segmentation results (final model trained on the complete dataset). Three sets of columns are shown: full-size input images (A,D); corresponding postprocessed predictions (generated patch-wise) overlaid on the input images (B,E); and segmentation outputs overlaid on ground-truth (GT) masks using transparent blending (C,F). FG pixels/segments in the GT are marked in red, predicted FG in yellow, and their overlap appears in orange, indicating agreement. Upon closer inspection, small isolated red and yellow regions can be seen, mostly near vessels, clearance zones, and unidentifiable objects. The model accurately captures diverse plaque morphologies and structural features while effectively distinguishing them from vascular regions. Despite their high fluorescence intensity, vascular regions are correctly classified as BG, thereby separating parenchymal pathology from the vasculature and BG tissue features based on their semantic differences, which ThioS staining alone cannot distinguish. Scale bar: 100 μm.
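The red/yellow/orange agreement overlay used in panels (C,F) can be produced with simple boolean masking. A minimal sketch, assuming boolean masks `gt` and `pred` matching a BGR input image, is shown below.

```python
# Minimal sketch of the GT-vs-prediction agreement overlay.
import numpy as np

RED, YELLOW, ORANGE = (0, 0, 255), (0, 255, 255), (0, 165, 255)  # BGR colors

def agreement_overlay(image, gt, pred, alpha=0.5):
    """image: (h, w, 3) uint8 BGR; gt, pred: (h, w) boolean masks."""
    out = image.astype(np.float32).copy()
    for mask, color in [(gt & ~pred, RED),      # GT foreground missed by the model
                        (pred & ~gt, YELLOW),   # predicted foreground absent from GT
                        (gt & pred, ORANGE)]:   # agreement between GT and prediction
        out[mask] = (1 - alpha) * out[mask] + alpha * np.array(color, np.float32)
    return out.astype(np.uint8)
```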
Figure 5. Principal component analysis (PCA) and group-wise comparisons reveal major axes of morphological variation in ThioS-stained plaques and highlight region- and dementia-specific differences. (A) Scree plot showing the variance explained by each principal component (bars) and the cumulative variance (line) across the 19 extracted morphometric features. The first three principal components (PCs) account for over 85% of the total variance, with PC1 alone explaining more than 50%. (B) Group-wise comparisons of the first three PCs, representing size-related (PC1), structural complexity (PC2), and geometric symmetry (PC3) properties. Bar plots show the mean ± SEM of PC scores per plaque, grouped by region (parietal = solid, temporal = striped) and cognitive status (dementia = dark, non-dementia = light). Grouping is based on the donor subject. Significant differences across groups are denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001), revealing consistently higher PC scores in the parietal + dementia group across all components. These results suggest that plaques in individuals with dementia, particularly in the parietal cortex, are larger, more structurally complex, and geometrically more irregular, although regional differences are also apparent.
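The scree analysis in panel (A) corresponds to the standard explained-variance output of a PCA fit. A minimal scikit-learn sketch, assuming a `features` matrix with 19 morphometric descriptors per plaque, follows.

```python
# Minimal sketch of the scree analysis behind panel (A).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def scree(features):
    """features: (n_plaques, 19) matrix of morphometric descriptors."""
    pca = PCA().fit(StandardScaler().fit_transform(features))
    explained = pca.explained_variance_ratio_   # bar heights of the scree plot
    cumulative = np.cumsum(explained)           # cumulative variance line
    return explained, cumulative
```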
Figure 6. Overview of the three-stage image-processing and training pipeline for weakly supervised amyloid plaque segmentation. (A) First stage: preprocessing includes illumination correction using an Illumination Correction Function (ICF) and background (BG) estimation via the Silver Mountain Operator (SMO), improving input image quality for downstream tasks. Representative examples show raw (left) vs. corrected (right) image tiles. (B) Second stage: image-level labeled data are cropped, resized, and augmented on the fly to train a TAP-SqueezeNet binary classifier for the presence or absence of parenchymal amyloid pathology. Class activation maps (CAMs) are generated via negative weight-clamping, resized, scaled, and thresholded. FG objects are cropped together with their local BG, BG images are restored via image inpainting, and both are saved into databanks for downstream use. (C) Third stage: segmented object masks and inpainted BGs are augmented and combined using Soft-Copy and Soft-Paste (Soft-CP) and contour synthesis, with the corresponding trimaps serving as pseudo-labels. These synthetic examples are used to train a U-Net segmentation model optimized with an asymmetric unified focal loss (aUF). Final inference employs an overlap-tile strategy with border mirroring to produce full-image segmentation masks. This pipeline enables automated learning of spatial features that distinguish parenchymal pathology from vasculature and BG, while accounting for high morphological variability and weak supervision constraints. Scale bar: 100 μm (applicable to all magnifications, with proportional scaling).
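The overlap-tile inference with border mirroring mentioned in the third stage can be sketched as follows. This assumes a Keras-style `model` returning a full-resolution probability map per tile; the tile and margin values are illustrative rather than the study's exact configuration.

```python
# Minimal sketch of overlap-tile inference with mirrored borders.
import numpy as np

def overlap_tile_predict(model, image, tile=572, margin=92):
    """image: (h, w, ch) array; returns an (h, w) foreground probability map."""
    step = tile - 2 * margin                     # un-mirrored core written per tile
    # Mirror-pad so every tile has valid context, including at the image borders.
    padded = np.pad(image, ((margin, tile), (margin, tile), (0, 0)), mode="reflect")
    h, w = image.shape[:2]
    out = np.zeros((h + step, w + step), np.float32)
    for y in range(0, h, step):
        for x in range(0, w, step):
            patch = padded[y:y + tile, x:x + tile]
            prob = model.predict(patch[None], verbose=0)[0, ..., 0]
            # Keep only the central core of each tile; the margins are discarded.
            out[y:y + step, x:x + step] = prob[margin:margin + step, margin:margin + step]
    return out[:h, :w]
```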