Article

Non-Targeted Screening Method for Detecting Temporal Shifts in Spectral Patterns of Methicillin-Resistant Staphylococcus aureus and Post Hoc Description of Peak Features

1 QuoData GmbH, 01309 Dresden, Germany
2 Institute of Nutritional Science, University of Potsdam, Arthur-Scheunert-Allee 114-116, 14558 Nuthetal, Germany
3 QuoData GmbH, 14195 Berlin, Germany
4 Bundesamt für Verbraucherschutz und Lebensmittelsicherheit, 13347 Berlin, Germany
* Author to whom correspondence should be addressed.
Microorganisms 2026, 14(1), 104; https://doi.org/10.3390/microorganisms14010104
Submission received: 12 October 2025 / Revised: 9 December 2025 / Accepted: 17 December 2025 / Published: 3 January 2026
(This article belongs to the Special Issue Advanced Antimicrobial Susceptibility Testing and Detection)

Abstract

Non-targeted methods (NTMs) using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) show promise in bacterial resistance detection, yet temporal variations in spectral features pose significant challenges. These proteomic patterns, which characterize bacterial phenotypes and pathological functions, may vary over time due to bacterial adaptation, virulence, or resistance mechanisms, resulting in large prediction uncertainties and potentially degrading NTM performance. We present a comprehensive screening method to detect temporal changes in MALDI-TOF spectral patterns, demonstrated using methicillin-resistant and -susceptible Staphylococcus aureus (MRSA/MSSA) isolates collected over several years. Our approach combines convolutional neural networks (CNNs) with statistical methods, including significance testing, kernel density estimation, and receiver operating characteristics for dataset shift detection. We employ Gradient-weighted Class Activation Mapping (Grad-CAM) for post hoc feature description, enabling biochemical characterization of temporal changes. This analysis reveals crucial insights into the dynamic relationship between spectral data patterns over time, addressing key challenges in developing robust NTMs for routine applications.

1. Introduction

Non-targeted methods (NTMs) using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) are a promising set of methods that can be used for bacterial species identification [1], sub-species or strain differentiation [2], and metabolic profiling [3], among others. By measuring the mass-to-charge ratio of ions produced from bacterial samples, such methods provide proteomic fingerprints in the form of spectra, which can be used (a) to compare a query against a database of reference spectra or (b) for supervised learning techniques, the latter of which we exploit here to detect changes in spectral patterns of antibiotic-resistant bacterial strains. Throughout this manuscript, we use NTMs to denote the complete analytical workflow, i.e., from sample preparation through spectral acquisition to machine learning-based classification [4].
The use of machine learning methods with MALDI TOF spectra has been a “well-trodden path” for some time now [5,6]. These computational methods excel at learning intricate spectral patterns that reflect biochemical variations associated with antibiotic resistance, offering promising NTMs for distinguishing methicillin-resistant Staphylococcus aureus (MRSA) from its susceptible counterpart (MSSA) [6,7]. Recent multicenter studies demonstrate rapid MRSA screening capabilities with performance comparable to conventional methods [8], whilst recent reviews highlight the importance of MALDI-TOF MS for both species identification and resistance profiling in veterinary contexts, particularly for bovine mastitis surveillance [9]. Advanced approaches incorporating uncertainty estimation further enhance the reliability of resistance predictions [10]. This enhanced analytical capability could significantly improve diagnostic precision, ultimately enabling more informed decision-making.
The current literature on studying antimicrobial resistance (AMR) using MALDI TOF is very broad and deep, but with conflicting evidence [11]. Recent comprehensive reviews emphasize that MALDI-TOF MS plays a crucial role in AMR surveillance across the food production chain, from livestock to food products [12,13,14], whilst highlighting challenges of dataset shift in clinical machine learning applications [12]. While several groups have demonstrated promising results in MRSA/MSSA discrimination [7,15,16,17], other investigations highlight insufficiency and uncertainty in this approach as a rapid diagnostic method [18,19,20]. These disparate outcomes likely stem from the inherent complexity of spectral data collection and analysis. Despite standardized sample preparation and analytical measurement methods (e.g., instrument configurations), spectra acquired from different (a) locations and (b) time points frequently pose significant challenges for integration into a unified predictive model. Focusing on the latter, one of the main practical challenges of NTMs is that the peak features of the spectra may change over time due to various factors such as genetic mutations [21], environmental conditions [22], experimental protocols, or instrument settings [19]. Recent validation studies confirm that ML models for AMR prediction experience significant performance degradation over time, with accuracy declining by 10–25% within 18 months without retraining [23]. These changes may affect the performance and reliability of NTMs for bacterial identification and AMR detection, as well as the interpretation and comparison of the results across different studies and laboratories [7,24]. Most recently, this topic of data heterogeneity has been studied by Park and colleagues, where they report the effects of MRSA-MSSA class imbalance and the effects of different sample preparation and laboratory environments for measurements [25]. 
This highlights the need to understand and account for the temporal changes in spectral peak features and understand how they impact method performance.
Temporal changes in MALDI-TOF data manifest through multiple, interrelated mechanisms. Studies often employ potentially confusing terminology such as (a) concept drift, (b) covariate shift, and (c) feature shift [26,27]. In practice, (a) arises when new resistance patterns emerge or bacterial subgroups form; (b) when hospital-trained models face veterinary samples; and (c) when specific spectral patterns lose their discriminatory power. These terms, while theoretically distinct, often create confusion because their practical manifestations frequently overlap [28,29]. Rather than formally categorizing these phenomena, this work examines their collective impact through three interconnected aspects (see Figure 1A). First, the emergence of new resistance patterns could introduce novel spectral signatures, potentially compromising existing classification models. Second, shifts in bacterial population structure, where different resistance patterns become more prevalent over time, might affect model performance through changing class distributions. Third, specific peaks of interest (corresponding to specific protein complexes) may exhibit drift or instability, due to either instrumental factors or genuine biological changes in the bacterial proteome. Understanding these aspects is crucial for long-term implementation, as changes in any dimension could impact NTM reliability.
Through analysis of MALDI-TOF spectral data from MRSA and MSSA isolates collected over several years, this paper addresses the challenge of temporal variation in spectral features. Our proposed method comprises three main steps, each targeting the key questions of resistance patterns, population changes, and spectral stability. First, we detect temporal changes using convolutional neural networks (CNNs) trained on the complete dataset over the entire time span (Figure 1B), followed by a sequential approach to assess the significance of temporal differences in decision scores. Second, we conduct period-specific CNN training and classification to observe changes within MRSA. These shifts may not directly influence MRSA vs. MSSA classification but are essential for understanding intra-MRSA variations. By independently classifying different periods, we aim to further uncover changes in spectral patterns and target labels. This second step helps to detect dataset shifts using the decision scores (Figure 1C), supported by statistical procedures such as visualizing kernel density estimation (KDE) distributions and receiver operating characteristic (ROC) curves (Figure 1D). Lastly, we describe which sets of peaks contribute to temporal differences, using post hoc feature description. Here, Gradient-weighted Class Activation Mapping (Grad-CAM) [30] is employed to identify relevant m/z ranges for each year (Figure 1E). Extracting suitable peak features that contribute to the discrimination of a spectrum is not only important for the interpretability and performance of the models but also helps detect biochemical changes. Post hoc feature description helps find distinctive patterns and features that lead to a classification decision. Ultimately, this comprehensive analysis enhances our understanding of the dynamic relationship between MRSA and MSSA across various years.

2. Materials and Methods

2.1. Spectral Dataset

MALDI-TOF MS data were collected between 2008 and 2021 at the Bundesamt für Verbraucherschutz und Lebensmittelsicherheit using a Bruker Microflex LT instrument. The dataset comprises spectra from 440 isolates of Staphylococcus aureus. Sample preparation followed standard protocols [15,31]. Briefly, bacteria were cultured on Columbia sheep blood agar plates for 18–24 h at 33–37 °C, transferred using a sterile loop to the MALDI target plate, overlaid with α-cyano-4-hydroxycinnamic acid matrix solution, and air-dried. Spectra were acquired in the mass range of 2000–20,000 Da. The raw spectra were baseline-corrected with a rolling median filter using custom scripts. Methicillin resistance status was confirmed genotypically by detection of the mecA gene using polymerase chain reaction (PCR). Isolates positive for mecA were classified as MRSA; mecA-negative isolates were classified as MSSA.

2.2. Convolutional Neural Network (CNN) Model Training and Decision Scores

Two sets of CNN model training were performed—(i) to classify MRSA and MSSA and (ii) to classify time periods. For both, the CNN architecture comprised three convolutional layers interspersed with max-pooling layers, implemented via Keras and TensorFlow [32,33]. To address class imbalance across years, weighted training using Keras’s class weights function was applied, with weights inversely proportional to class frequencies, meaning that years and classes with fewer spectra have higher weights and vice versa. This way, the model gives more importance to the underrepresented years and classes and learns to generalize better across the data.
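The inverse-frequency weighting described above can be illustrated with a minimal sketch. The helper name `inverse_frequency_weights`, the toy labels, and the normalization (total count divided by number of classes times class count, the common "balanced" heuristic) are assumptions for illustration; the source does not state the exact weighting formula that was used.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {label: weight}, with weight inversely proportional to label frequency.

    Assumed normalization: n_samples / (n_classes * count), so that a perfectly
    balanced dataset would give every label a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical toy labels: one rare and one frequent class
labels = ["MRSA"] * 2 + ["MSSA"] * 6
weights = inverse_frequency_weights(labels)
print(weights)  # the rare class receives the larger weight
```

A dictionary of this kind, keyed by integer class indices, is the form accepted by the `class_weight` argument of the Keras `Model.fit` method.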
The nested cross-validation (NCV) approach utilized a calibration set consisting of MRSA and MSSA spectra in duplicates [34,35]. For each internal validation fold, the calibration set was divided into training and testing subsets. Adhering to the NCV framework, CNN models were trained on the training subset and validated with the test subset. Cross-validation splitting was performed at the isolate level (analogous to GroupKFold), ensuring all duplicate measurements from any given isolate remained in the same fold. This prevents data leakage between training and testing sets and ensures valid estimation of between-isolate and within-isolate variance components. The CNN employed gradient descent optimization to minimize the loss function by iteratively adjusting the learnable parameters through backpropagation until convergence at each layer. The output from the CNN comprised a standardized logit-transformed probability value which was used for decision score computation [34]. The classification decision hinged on a threshold, set to zero in this context, such that positive decision scores indicated class A, while negative scores indicated class B.
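The isolate-level splitting and the zero-threshold decision rule can be sketched as follows. This is an illustrative stand-in written with the standard library only, not the authors' code: the function names and toy isolate identifiers are hypothetical, and the standardization applied to the logit in the source is omitted because its details are not given here.

```python
import math
import random


def isolate_level_folds(isolate_ids, n_folds=5, seed=0):
    """Assign a fold index to every spectrum so that all spectra
    of one isolate land in the same fold (no leakage across folds)."""
    unique_ids = sorted(set(isolate_ids))
    random.Random(seed).shuffle(unique_ids)
    fold_of_isolate = {iso: k % n_folds for k, iso in enumerate(unique_ids)}
    return [fold_of_isolate[iso] for iso in isolate_ids]


def logit(p, eps=1e-6):
    """Logit transform of a predicted probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


# Hypothetical duplicate measurements of three isolates
isolate_ids = ["iso_a", "iso_a", "iso_b", "iso_b", "iso_c", "iso_c"]
folds = isolate_level_folds(isolate_ids, n_folds=3)

# A probability of 0.5 maps to a score of 0, i.e. the decision threshold
score = logit(0.9)
predicted_class = "A" if score > 0 else "B"
```

With an unstandardized logit, the threshold of zero corresponds to a predicted probability of 0.5.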

2.3. Performance Characteristics

Dual approaches were employed to evaluate the classifier’s performance. First, a confusion matrix quantified classification outcomes through true/false positives (TPs/FPs) and negatives (TNs/FNs). Second, the decision scores were analyzed to decompose variance components, providing deeper insights into classification reliability.
For binary non-targeted methods, the evaluation framework integrated both discrete classification metrics and continuous decision scores. Following Uhlig et al. [36,37], let $Z_{ij}$ denote the decision score for spectrum $j$ of isolate $i$. The model for classification based on quantitative scores is as follows:
$$Z_{ij} = \mu_{r(i)} + \alpha_i + \beta_{ij} \tag{1}$$
$$\text{where } r(i) = \begin{cases} 1 & \text{if isolate } i \text{ belongs to type A} \\ -1 & \text{if isolate } i \text{ belongs to type B} \end{cases} \tag{2}$$
Here, $\mu_{r(i)}$ represents the mean score for the respective class (A or B), $\alpha_i$ captures the isolate-specific (random) variation of isolate $i$, and $\beta_{ij}$ accounts for the (random) deviation of the decision score for spectrum $j$ of isolate $i$. Assuming homogeneous variance within classes, two-way random effects analysis of variance (ANOVA) was used to estimate the variance components.
$$\begin{aligned} \mathrm{Var}(\alpha_i;\ i \in A) &= \sigma_{1,A}^2 \quad \text{and} \quad \mathrm{Var}(\alpha_i;\ i \in B) = \sigma_{1,B}^2 \\ \mathrm{Var}(\beta_{ij};\ i \in A) &= \sigma_{2,A}^2 \quad \text{and} \quad \mathrm{Var}(\beta_{ij};\ i \in B) = \sigma_{2,B}^2 \end{aligned} \tag{3}$$
Based on the above variance components, the classification variances were calculated as follows:
$$\sigma_{\text{classification},A}^2 = \sigma_{1,A}^2 + \sigma_{2,A}^2, \qquad \sigma_{\text{classification},B}^2 = \sigma_{1,B}^2 + \sigma_{2,B}^2 \tag{4}$$
The smaller the classification variances, the better the classification power. The discrimination power $DP$ of the NTM, derived from the concept of Fisher’s discrimination index, was then calculated as:
$$DP = \frac{\mu_{1} - \mu_{-1}}{\sigma_{\text{classification},A} + \sigma_{\text{classification},B}} \tag{5}$$
The $\sigma_{\text{classification}}$ terms in the denominator were replaced by the estimated standard deviations $s_{\text{classification}}$ evaluated from the decision scores. Calculation of variance components was performed according to ISO 5725-2 [38], with PROLab Plus v2025.11.6.0 statistical software (QuoData GmbH, Germany). A $DP$ above 2 indicates good discrimination between the classes. If $DP$ is between 1 and 2, reasonable discrimination is possible, and if $DP$ is less than 1, discrimination is poor.
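To make the variance decomposition concrete, the following sketch estimates the between-isolate and repeatability variances for each class and combines them into a discrimination power. It assumes a balanced design (the same number of replicate spectra per isolate) and a simple one-way random-effects estimator per class; the source relies on PROLab Plus and ISO 5725-2, so this is an approximation of the idea rather than a reproduction of that software. All names and toy scores are hypothetical.

```python
from math import sqrt
from statistics import mean


def variance_components(scores):
    """scores: list of per-isolate replicate lists for ONE class (balanced design).

    Returns (grand mean, between-isolate variance, repeatability variance).
    """
    n_iso = len(scores)
    n_rep = len(scores[0])
    iso_means = [mean(reps) for reps in scores]
    grand_mean = mean(iso_means)  # equals the overall mean for balanced data
    ss_within = sum((z - m) ** 2 for reps, m in zip(scores, iso_means) for z in reps)
    ss_between = n_rep * sum((m - grand_mean) ** 2 for m in iso_means)
    ms_within = ss_within / (n_iso * (n_rep - 1))
    ms_between = ss_between / (n_iso - 1)
    var_repeat = ms_within
    # Negative estimates are truncated at zero
    var_isolate = max(0.0, (ms_between - ms_within) / n_rep)
    return grand_mean, var_isolate, var_repeat


def discrimination_power(scores_a, scores_b):
    """Difference of class means divided by the sum of classification SDs."""
    mean_a, v1_a, v2_a = variance_components(scores_a)
    mean_b, v1_b, v2_b = variance_components(scores_b)
    s_class_a = sqrt(v1_a + v2_a)
    s_class_b = sqrt(v1_b + v2_b)
    return (mean_a - mean_b) / (s_class_a + s_class_b)


# Hypothetical duplicate decision scores for three isolates per class
scores_a = [[1.2, 0.8], [0.9, 1.1], [1.5, 1.3]]
scores_b = [[-1.0, -1.2], [-0.7, -0.9], [-1.4, -1.0]]
dp = discrimination_power(scores_a, scores_b)
print(round(dp, 2))
```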

2.4. Significance Testing and t-Values

The sequential t-test analysis was performed on the decision scores to assess significant differences between time periods ($\tau$). At each sequential step, two time periods, $\tau_1$ and $\tau_2$, were compared. If no significant differences were found, the periods were combined for the subsequent steps.
To allow for unequal variances of the decision scores, the t-value for the two time spans, $\tau_1$ and $\tau_2$, was calculated as the difference in the estimated means ($\hat{\mu}$) divided by the standard error of that difference, where $s^2$ and $n$ denote the sample variance and the number of decision scores in each period:
$$t = \frac{\hat{\mu}_{\tau_2} - \hat{\mu}_{\tau_1}}{\sqrt{\dfrac{s_{\tau_2}^2}{n_{\tau_2}} + \dfrac{s_{\tau_1}^2}{n_{\tau_1}}}} \tag{6}$$
ROC curves were plotted and the area under the curve (AUC) was evaluated. A large AUC value suggests that the two time periods can be very well distinguished, with a value close to 1 signifying near perfect discrimination.
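A minimal sketch of the two statistics used for comparing time periods is given below: the unequal-variance t-value and the AUC. The AUC is computed here through its rank interpretation (the probability that a score from the second period exceeds a score from the first, with ties counted as one half), which is an implementation choice of this sketch; function names and toy scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(scores_t1, scores_t2):
    """t-value for two periods with possibly unequal variances."""
    se = sqrt(variance(scores_t2) / len(scores_t2)
              + variance(scores_t1) / len(scores_t1))
    return (mean(scores_t2) - mean(scores_t1)) / se


def auc(scores_t1, scores_t2):
    """Area under the ROC curve for separating period 2 from period 1."""
    wins = sum((b > a) + 0.5 * (b == a) for a in scores_t1 for b in scores_t2)
    return wins / (len(scores_t1) * len(scores_t2))


# Hypothetical decision scores from two periods
period_1 = [0.0, 1.0, 2.0]
period_2 = [3.0, 4.0, 5.0]
t_value = welch_t(period_1, period_2)
auc_value = auc(period_1, period_2)
```

An AUC near 0.5 means the two periods cannot be told apart from their scores, whereas values approaching 1 indicate a clear temporal break.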

2.5. Feature Extraction Using Grad-CAM

Grad-CAM is a feature extraction and explainability technique based on neural network dissection, which aims to visualize and help interpret the internal representations learned by deep networks [30]. It uses the gradients of the output class with respect to the feature maps of the last convolutional layer of the model to produce a coarse localization map of the input [39]. Feature maps are the output of a convolutional layer in a neural network and consist of a set of activation values for each filter (or kernel) applied to the input [40]. Figure 1E shows how Grad-CAM evaluation involved four key steps: First, the prediction output was obtained from the CNN in the form of the decision score $S_R$ for the class of interest (e.g., MRSA). Next, the gradient of the score $S_R$ with respect to the feature map $A_{\text{convolution layer},\,l}$ of the last convolutional layer was evaluated. This was followed by global average pooling of the gradients to obtain the weights $\alpha_R^k$ for each feature map. These weights represented the importance of each feature map for the MRSA class. Each feature map $A_{\text{convolution layer},\,l}$ was then multiplied by its corresponding weight $\alpha_R^k$ and summed to obtain the Grad-CAM activation map $L_R$. This map has the same dimension as the feature map $A_{\text{convolution layer},\,l}$. Lastly, the Grad-CAM activation map $L_R$ was resized to match the size of the input spectrum and overlaid on the original spectrum with a color map. The resulting spectrum shows the regions that the CNN focuses on when predicting the MRSA class.
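The pooling, weighting, and resizing steps can be sketched for a one-dimensional spectrum as follows. The sketch assumes the feature maps and their gradients have already been extracted from the network by an automatic-differentiation framework, so both are passed in as plain lists. The ReLU applied to the weighted sum follows the original Grad-CAM formulation and is not stated in the description above; the function name and toy values are hypothetical.

```python
def grad_cam_1d(feature_maps, gradients, input_length):
    """Grad-CAM map for one spectrum.

    feature_maps, gradients: K lists of length L (last convolutional layer).
    Returns a list of length input_length.
    """
    # Global average pooling of the gradients: one weight per feature map
    weights = [sum(g) / len(g) for g in gradients]
    n_pos = len(feature_maps[0])
    # Weighted sum of feature maps, followed by ReLU (assumption, see lead-in)
    cam = [
        max(0.0, sum(w * fmap[x] for w, fmap in zip(weights, feature_maps)))
        for x in range(n_pos)
    ]
    if n_pos == 1 or input_length == 1:
        return [cam[0]] * input_length
    # Linear interpolation up to the resolution of the input spectrum
    resized = []
    for i in range(input_length):
        pos = i * (n_pos - 1) / (input_length - 1)
        lo = int(pos)
        hi = min(lo + 1, n_pos - 1)
        frac = pos - lo
        resized.append((1 - frac) * cam[lo] + frac * cam[hi])
    return resized


# Hypothetical toy layer with two feature maps of length three
feature_maps = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
gradients = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]
cam_map = grad_cam_1d(feature_maps, gradients, input_length=5)
```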

2.6. Peak Features Changing over Time

To identify the specific peak features that change over the years, we performed a one-way ANOVA on the quantitative Grad-CAM intensities, with year as the factor. For each feature, we calculated the F-value, which can be used to rank features by how strongly their importance varies between years. In this way, we can test whether there are significant differences among the yearly mean Grad-CAM values of each peak. Peaks with significant differences are the ones that have changed over the years and can be considered potential biomarkers or indicators of temporal variation.
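The per-feature F-value computation can be sketched as follows. The data layout (a dictionary mapping each year to a list of Grad-CAM vectors) and the function names are assumptions made for illustration; the toy numbers are not taken from the study.

```python
from statistics import mean


def f_value(groups):
    """One-way ANOVA F statistic; groups is a list of value lists (one per year).

    Assumes at least two groups and non-zero within-group variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean([v for g in groups for v in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def f_values_per_bin(cam_by_year):
    """cam_by_year: {year: [Grad-CAM vector per spectrum]} -> F-value per m/z bin."""
    years = sorted(cam_by_year)
    n_bins = len(cam_by_year[years[0]][0])
    return [
        f_value([[cam[b] for cam in cam_by_year[y]] for y in years])
        for b in range(n_bins)
    ]


# Hypothetical Grad-CAM values: bin 0 changes with year, bin 1 does not
cam_by_year = {
    2015: [[1.0, 0.0], [2.0, 0.1], [3.0, 0.2]],
    2017: [[2.0, 0.2], [3.0, 0.0], [4.0, 0.1]],
    2019: [[5.0, 0.1], [6.0, 0.2], [7.0, 0.0]],
}
f_by_bin = f_values_per_bin(cam_by_year)
```

Bins with a large F-value are candidates for features whose importance shifts between years; bins with values near zero are stable.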

3. Results

3.1. Identifying Temporal Changes Using Classification Decision Scores for MRSA and MSSA

To address the key objective of identifying spectral shifts, first the CNNs were trained using all available spectral data for MRSA and MSSA, i.e., from 2008 to 2021, employing the 5-fold NCV technique. Samples from each year were evenly distributed across all folds to ensure an equal representation of years. Internal validation decision scores were evaluated and no spectra for external validation were used as the aim was not to develop an NTM to predict MRSA or MSSA, but rather to gain insights into the varying spectral patterns of MRSA and MSSA that affect their classification. In doing so, one can detect if there are changes in the population with different spectral patterns.
Figure 2 shows a Youden plot with decision scores for distinguishing MRSA and MSSA. Each point on the plot represents an isolate, with decision scores for duplicate measurements on the two axes. Considering a decision threshold of 0, a confusion matrix can be constructed with 14 false positives and 14 false negatives, corresponding to a true positive rate (sensitivity) of 91% and a false positive rate (1 − specificity) of about 3.3%. From the Youden plot, two direct observations emerge. First, for MSSA samples (spanning from 2008 to 2017), the classification remains predominantly accurate, i.e., the majority of the points lie in the third quadrant. Second, the decision scores of replicate samples display considerable variability, as evidenced by their deviation from the line of identity.
To examine the presence of “sub-populations” within MRSA, the point clouds are drawn with different marker shapes according to the known year label. MRSA samples from 2008 to 2015 (red squares) and 2021 (red upper triangles) are farthest from the MSSA samples of 2008 to 2017 (blue circles), which implies that they can be classified more distinctly. The point cloud distribution additionally indicates that MRSA samples from 2017 (red diamonds) and 2019 (red lower triangles) are notably close to the MSSA samples. This suggests that MRSA samples from these years are difficult to separate clearly from MSSA. There is also an indication that 2019 is similar to 2017, but 2021 is different. However, further differentiation, e.g., comparing 2021 with other years, is not possible because few data points are available.
Such a result prompts the question of how one can identify evolving patterns or spot temporal variations within the data over time. In order to identify whether differences in subgroups are significant, we performed a sequential t-test procedure.

3.2. Sequential Significance Testing of Decision Scores

Following the observations from the Youden plot, sequential significance testing was performed to systematically identify temporal groupings. Figure 3 illustrates the step-by-step approach. The sequential testing began with step 1, comparing two pairs of years with minimal data points: 2019 versus 2021 (denoted by red 1) revealed statistically significant differences, while 2008 versus 2009 (marked as green 2) showed no significant differences. Following this initial analysis, 2008 and 2009 were combined for subsequent comparisons. Throughout the figure, green-shaded boxes indicate pairs of periods where decision scores show no statistical differences, while red-shaded boxes highlight periods with statistically significant differences.
Step 2 revealed that decision scores from 2011 and 2013 showed no significant differences. Proceeding to step 3, the comparison of the combined 2011 + 2013 period with 2015 also showed no statistical differences, enabling the formation of larger temporal groups. Further analysis demonstrated that decision scores from 2017 and 2019 were statistically similar, while the data from 2008 to 2015 maintained statistical consistency throughout the analysis.
The sequential testing culminated in step 6, which definitively identified three distinct temporal periods: 2008–2015, 2017–2019, and 2021. Final statistical tests confirmed significant differences between these three periods. The sequential analysis provides strong evidence that MRSA isolates from 2017 and 2019 exhibit similar characteristics but differ significantly from other periods. While this methodology effectively addresses the presence of changes in spectral patterns, it is important to note that these statistical differences cannot be directly interpreted as qualitative differences in MRSA isolate behavior.
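The merge-if-not-significant logic can be sketched in code. Note that this is a simplified greedy variant that walks through the periods in chronological order and compares each one with the most recent group; the order of comparisons in Figure 3 differs, so the sketch illustrates the principle rather than reproducing the reported procedure. The critical value of 2, the function names, and the toy scores are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Unequal-variance t-value for two lists of decision scores."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


def merge_periods(periods, t_crit=2.0):
    """periods: chronologically ordered list of (label, scores).

    Adjacent periods are pooled when |t| stays below t_crit.
    """
    merged = [(periods[0][0], list(periods[0][1]))]
    for label, scores in periods[1:]:
        prev_label, prev_scores = merged[-1]
        if abs(welch_t(prev_scores, scores)) < t_crit:
            # No significant difference: pool the two periods
            merged[-1] = (prev_label + "+" + label, prev_scores + list(scores))
        else:
            # Significant difference: start a new temporal group
            merged.append((label, list(scores)))
    return merged


# Hypothetical decision scores for four sampling years
periods = [
    ("2008", [1.0, 1.1, 0.9, 1.2]),
    ("2009", [1.05, 0.95, 1.1, 1.0]),
    ("2017", [0.1, 0.2, 0.0, 0.15]),
    ("2019", [0.12, 0.05, 0.2, 0.1]),
]
groups = merge_periods(periods)
group_labels = [label for label, _ in groups]
```

Because several tests are run in sequence, a real application would also need to consider how the repeated testing affects the overall error rate.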

3.3. Finding Breaks by Year-Wise Training

The sequential significance testing helped identify three time periods from the decision scores. To further validate this result, several CNNs were trained to classify pairs of time periods, separately for MRSA and MSSA. Figure 4 shows (A) rug plots of the decision scores, illustrating their distribution, and (B) ROC curves, illustrating the discriminatory ability of the models in terms of false-positive rate (FPR) and true-positive rate (TPR). Calculating the t-values using Equation (6), we obtain large t-values when comparing MRSA for periods (a) 2015 versus 2017 (value of 6.9) and (b) 2017 versus 2019 + 2021 (value of 4.67). With a critical t-value of 2, the differences in the decision scores can be considered significant for these two comparisons. The ROC curves also show high discrimination for these two comparisons, with AUC values of 0.9 and 0.83, respectively. For the other periods, the AUC value is around 0.5.
Likewise, MSSA for the period 2015 versus 2017 also shows some evidence of differences in decision scores with a relatively high t-value of 3.69 and an AUC of 0.75. This finding is particularly intriguing as antibiotic resistance-associated proteins should theoretically not influence MSSA spectra. Several potential explanations warrant further investigation: the presence of borderline resistance cases that are still classified as MSSA, regional and temporal variations in MSSA subtypes across different geographical regions, differences in measurement conditions during spectral acquisition, or variations in isolate processing protocols that could affect bacterial viability and proteome expression.
The combination of t-values and ROC-AUC metrics provides complementary evidence: t-values establish statistical significance of temporal differences, while ROC-AUC values quantify classification performance. These metrics jointly support the temporal breaks identified in the previous section.

3.4. Tracking Features over Time

3.4.1. Feature Extraction Heatmaps

We proceeded to neural network dissection with Grad-CAM to address the question of whether there are changes in specific peaks. The left panels of Figure 5A show the baseline-corrected spectra (black line) overlaid with the Grad-CAM heatmaps, presenting one representative spectrum from each year. These heatmaps quantitatively illustrate the contribution of each spectral region to the model’s predictions, where warmer colors indicate stronger contributory features. For instance, in the spectrum in the first row, multiple specific bands demonstrate high influence on the classification. Notably, across all spectra, regions corresponding to m/z 2000–4000 consistently show high importance, indicated by prominent dark red bands. Several studies also report the presence of discriminatory peaks in that range [6,11,41].
By bringing together the Grad-CAM results for all the spectra, a consolidated heatmap is constructed in Figure 5B. In this heatmap, each row of pixels corresponds to a spectrum and each column corresponds to a mass-to-charge ratio. Once again, the color intensity represents the classification importance derived from Grad-CAM analysis, with warmer colors indicating higher significance. The heatmap reveals distinct patterns of CNN focus across different years, suggesting temporal variations in MRSA spectra.
Overall, Grad-CAM demonstrates its utility as a sophisticated visualization technique for interpreting CNN classification behavior, offering insights into the network’s learned features and their relationship to output classes across varying inputs. It can also be used for development and optimization of NTMs. The heatmap primarily provides a qualitative or global perspective of the network-learned spectral features, and detailed year-to-year comparisons may only be partially feasible through visual inspection.

3.4.2. Detecting Specific Peaks Sensitive to Temporal Changes

To quantitatively evaluate the year-to-year changes in the Grad-CAM features, a one-way ANOVA of the Grad-CAM feature values was performed with year as the factor. High F-values mean that the peak regions show large variation in importance between the years. Six exemplary spectral regions with relatively high F-values are plotted in Figure 6. The overlap of features learned by the models with recognized resistance patterns reported in the literature is a promising result. Further work is warranted here in terms of biochemical characterization of the proteins responsible for the peaks.

3.5. Classification Performance Based on Different Time Periods

Following the identification of temporal breaks in 2017, we investigated classification performance by analyzing pre-2017 and post-2017 periods separately. Figure 7 presents Youden plots for MRSA classification across these distinct periods: (A) 2008–2015 and (B) 2017–2019. The 2008–2015 period demonstrated a TPR, or sensitivity, of 99% and a very low FPR (<<1%), i.e., very high specificity. This represents a substantial improvement over the non-temporally segregated analysis presented in Section 3.1. The repeatability standard deviations for MRSA and MSSA are 0.404 and 0.285, respectively, with classification standard deviations of 0.419 and 0.308, respectively. Using Equation (5), $DP$ is calculated to be 2.75. Recall that values above 2 indicate good discrimination between the classes. The 2017–2019 period showed markedly different characteristics, with sensitivity decreasing to 52% (FPR of 11%). Furthermore, the variation between the replicates has clearly increased (spread of the red and blue points), with repeatability standard deviations for MRSA and MSSA of 0.815 and 0.729 and classification standard deviations of 0.815 and 0.821. The resulting $DP$ of 1.22 indicates that the performance characteristics of the NTM for distinguishing MRSA from MSSA have significantly deteriorated, necessitating a revision and update of the NTM [4]. This observation aligns with recent validation studies demonstrating similar temporal performance degradation in MALDI-TOF-based AMR prediction models [23].
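As a rough arithmetic check (ours, not part of the reported analysis), the discrimination power can be related back to the classification standard deviations quoted above, using $DP = (\mu_{1} - \mu_{-1}) / (s_{\text{classification},A} + s_{\text{classification},B})$. The difference of the class means is not reported in this section, so the lines below back-calculate the value implied by the published $DP$; this should be read as a consistency check under that assumption only.

```latex
\begin{align*}
\text{2008--2015:}\quad & s_A + s_B = 0.419 + 0.308 = 0.727,
  & DP \times (s_A + s_B) &= 2.75 \times 0.727 \approx 2.00 \\
\text{2017--2019:}\quad & s_A + s_B = 0.815 + 0.821 = 1.636,
  & DP \times (s_A + s_B) &= 1.22 \times 1.636 \approx 2.00
\end{align*}
```

Both periods imply a mean score difference of roughly 2, which suggests that the drop in $DP$ from 2.75 to 1.22 is driven mainly by the larger classification standard deviations rather than by a smaller separation of the class means.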

4. Discussion

Proteomic patterns, such as protein expression, modification, and interaction, are widely used to characterize the biological and pathological functions of bacteria. However, these patterns may vary over time due to various processes, such as bacterial growth, adaptation, virulence, or resistance. In this paper, we present a study of the temporal changes in proteomic patterns of MRSA and propose a screening method. We use a large sample of MALDI TOF data collected over several years. We propose and apply statistical methods and machine learning techniques to find suitable features and track their changes over the years. We also discuss the implications of our results and how the identified peaks can help with biochemical characterization as well as the prospects for future studies with emerging technologies.
We use CNNs to learn cross-year feature representations that can capture the biochemical variations associated with antibiotic resistance or susceptibility. We evaluate our method on a large dataset of mass spectrometry data collected from 2008 to 2021. Quantitative decision scores from CNNs were evaluated and used not only in performance characterization but also to evaluate the changes over the years. From the decision scores, variance components are estimated, and from these the precision parameters, such as the repeatability standard deviation and the classification standard deviation, are calculated; these help to describe the distribution of points and their separation from each other. Detecting shifts in data over years using quantitative decision scores is an important novelty of this work, complementing recent approaches employing unsupervised information geometric projections for early identification of dataset shift patterns [42]. Such quantitative monitoring approaches would not have been possible merely with qualitative binary outcomes. Feature descriptions are important for three main reasons. First, they ensure that the method works “as expected”; feature extraction can help one understand if the model identifies features that are “logical” or “relevant”, or if the model relies on arbitrary patterns in the data to make a classification. Second, they help one understand the reason for the result, as one can understand why the model classified the sample as MRSA, and, more importantly, one can determine whether the features have changed over the years. Lastly, they help develop targeted methods or improved NTMs. Neural network dissection methods like Grad-CAM are a type of post hoc feature description method that analyzes the activations of the hidden layers of a network and tries to find meaningful associations between the units in those layers and the input features or the output classes.
We apply the Grad-CAM method to detect the relevant mass ranges for each year and compare them with the temporal breaks detected by statistical methods. We show that our approach can provide useful insights for biochemical analysis, as well as potentially accounting for the possible confounding factors of sample selection, isolate preparation, and measurement circumstances.
Our methodological approach differs from classical MALDI-TOF AMR studies in several fundamental aspects. First, whilst most investigations focus on cross-sectional discrimination at single time points [8,9,10], we explicitly address temporal validation, evaluating whether models remain valid as bacterial populations and resistance patterns evolve. This temporal perspective is essential for clinical deployment but remains rare in the literature [33,39]. Second, unlike approaches seeking singular biomarker peaks indicative of specific resistance mechanisms [31], our CNN-based pattern recognition captures complex multi-peak signatures that may prove more robust to subtle instrumental drift whilst remaining interpretable through Grad-CAM analysis. Third, our quantitative decision score framework enables rigorous variance component decomposition and significance testing impossible with binary classification outcomes alone [27,28]. This statistical foundation allows us to distinguish genuine biological/epidemiological shifts from measurement variability, a critical capability for antimicrobial surveillance programs where false-positive drift detection could trigger unnecessary model retraining.
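The two statistical ingredients named above, variance component decomposition and significance testing on decision scores, can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact procedure of this study: each isolate is assumed to have duplicate decision scores, the variance components come from a one-way random-effects model, the between-isolate standard deviation is used as a stand-in and is not necessarily the classification standard deviation reported here, and a permutation test on the difference in means replaces whichever test was actually applied. All function names and numbers are hypothetical.

```python
import math
import random
from statistics import mean
from typing import Sequence, Tuple


def precision_from_duplicates(pairs: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (repeatability SD, between-isolate SD) from duplicate scores."""
    n = len(pairs)
    # Within-isolate (repeatability) variance from duplicate differences.
    var_repeat = sum((a - b) ** 2 for a, b in pairs) / (2 * n)
    isolate_means = [(a + b) / 2 for a, b in pairs]
    grand_mean = mean(isolate_means)
    var_of_means = sum((m - grand_mean) ** 2 for m in isolate_means) / (n - 1)
    # The variance of the isolate means still contains var_repeat / 2.
    var_between = max(0.0, var_of_means - var_repeat / 2)
    return math.sqrt(var_repeat), math.sqrt(var_between)


def permutation_p_value(scores_a: Sequence[float], scores_b: Sequence[float],
                        n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation test on the difference in mean decision score."""
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up duplicate decision scores for two collection periods.
early = [(0.90, 0.88), (0.85, 0.87), (0.92, 0.90), (0.80, 0.84), (0.88, 0.86), (0.91, 0.93)]
late = [(0.60, 0.64), (0.55, 0.53), (0.70, 0.66), (0.58, 0.60), (0.62, 0.66), (0.50, 0.54)]

s_r, s_between = precision_from_duplicates(early + late)
p = permutation_p_value([(a + b) / 2 for a, b in early],
                        [(a + b) / 2 for a, b in late])
print(f"repeatability SD={s_r:.3f}, between-isolate SD={s_between:.3f}, p={p:.4f}")
```

Comparing the between-period difference with the repeatability standard deviation is what allows a shift to be separated from ordinary measurement scatter. Neither quantity can be estimated from binary MRSA/MSSA calls.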
Traditional accuracy-based performance metrics (where accuracy = (TP + TN)/(TP + FP + TN + FN)) are commonly reported in the literature describing new NTMs, but they can be misleading when the data are extremely imbalanced and one class considerably dominates the other(s). A model that simply predicts the majority class attains a high accuracy value, which creates a false sense of superior NTM performance. With imbalanced class structures, accuracy should therefore be interpreted with caution or, ideally, not be used at all. Quantitative decision scores should be used instead, as done here, from which sensitivity and specificity can be evaluated.
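A small numerical illustration of this point follows. The class sizes, decision scores, and thresholds are invented for the example and are not taken from the study data, and `screening_metrics` is a hypothetical helper. Label 1 stands for MRSA (minority class) and label 0 for MSSA.

```python
from typing import Dict, Sequence


def screening_metrics(labels: Sequence[int], scores: Sequence[float],
                      threshold: float) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity at a decision-score threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Imbalanced toy set: 95 MSSA (label 0) and 5 MRSA (label 1).
labels = [0] * 95 + [1] * 5

# Case 1: a model that scores every spectrum low, i.e. always predicts MSSA.
trivial = screening_metrics(labels, [0.1] * 100, threshold=0.5)

# Case 2: informative scores with a threshold chosen on the score scale.
scores = [0.1] * 90 + [0.35] * 5 + [0.4] * 4 + [0.2]
tuned = screening_metrics(labels, scores, threshold=0.3)

print(trivial)
print(tuned)
```

In this toy case the majority-class predictor reaches the higher accuracy while detecting no MRSA at all, whereas the thresholded scores detect four of the five MRSA isolates. Only the sensitivity and specificity values make that difference visible.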
The work presented here is strictly a screening method and offers no confirmatory procedure, because so far it provides no information on the biochemical nature of the changed patterns or extracted features. Further biochemical analysis is therefore needed to identify and describe the proteins responsible for the peaks, and the underlying causes of the changes need further investigation. One possible cause is differences in how the isolates were sampled, which can also affect the appearance of the spectra. Nevertheless, the approach is pragmatic and powerful and can be implemented with simple tools, as shown in this work, and the integration of such screening approaches into decision support systems has shown promise for accelerating antimicrobial stewardship [43]. Overall, such a screening method makes it possible to detect changes in the data readily and to generate hypotheses for subsequent confirmation.

5. Conclusions

Data shifts and changes in samples over time are a major challenge for any NTM used in species typing and AMR programs, and for other analytical sciences as well. This work provides tools for handling such scenarios. In this paper, we propose a non-targeted screening method to detect changes in peak features in mass spectrometry data over the years. We have shown that identifying temporal breaks, selecting datasets strategically, and adapting models make it possible to maintain classification accuracy across different time periods.
Performance evaluation of NTMs is essential, not only for trusting the results but also for gaining wider acceptance in routine use. We show how performance evaluation can be carried out using the underlying decision scores obtained from the CNN models. In this work, several data experiments were conducted with MRSA and MSSA MALDI-TOF spectra. To investigate the impact of changing samples, different combinations of the available data were assembled to test hypotheses. NTM development becomes considerably harder when the class of interest (here MRSA) is represented by only a limited pool of unique samples and its features also change over time. This situation is to be expected in practice, and this work therefore describes considerations and options for tackling these challenges. The methodological aspects presented here (sequential significance testing of decision scores, period-specific model training for drift detection, and Grad-CAM-based feature tracking) provide a generalizable framework for the surveillance of temporal changes in bacterial populations. This is particularly relevant for AMR monitoring programs, where spectral pattern shifts may indicate the emergence of novel resistance mechanisms, clonal replacements, or methodological drift. Early detection of such changes enables timely model updating, maintaining diagnostic accuracy for antibiotic stewardship. The approaches are readily applicable beyond S. aureus to other priority pathogens for which MALDI-TOF-based AMR prediction is being developed. Lastly, the approaches described, when provided in a suitable environment, add to the toolbox for developing and validating superior methods of measurement.
Given the documented performance degradation of ML models over time [23], future work should focus on automating the described screening method for spectral drift detection and on implementing real-time model updates to ensure reliable clinical diagnostics. This aligns with broader initiatives in medical microbiology to establish standardized protocols and interdisciplinary collaboration frameworks for AI implementation [44].

Author Contributions

Conceptualization, K.N., S.U. and S.K.; methodology, K.N. and S.U.; software, K.N. and V.S.M.; validation, K.N., S.U., V.S.M., K.H. and K.F.; formal analysis, K.N.; investigation, K.N., V.S.M., K.F. and K.H.; resources, S.U., U.S., H.K. and P.G.; data curation, K.N., U.S. and H.K.; writing—original draft preparation, K.N.; writing—review and editing, K.N., S.U., V.S.M., K.F., K.H. and S.K.; visualization, K.N.; supervision, S.U. and S.K.; project administration, S.K.; funding acquisition, P.G. and S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The MALDI-TOF spectral data analyzed during this study are available from the authors upon reasonable request.

Acknowledgments

In preparing this manuscript, the authors employed OpenAI GPT-5 to assist with proofreading and grammar; the authors subsequently reviewed and revised the output. KN extends thanks to Harshadrai M. Rawel for valuable conversations regarding wet-lab analytical measurements and MALDI-TOF protocols and to Kirsten Simon for her support and assistance during the study.

Conflicts of Interest

Authors Kapil Nichani, Steffen Uhlig, Victor San Martin, Karina Hettwer and Kirstin Frost were employed by the company QuoData GmbH. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wolters, M.; Rohde, H.; Maier, T.; Belmar-Campos, C.; Franke, G.; Scherpe, S.; Aepfelbacher, M.; Christner, M. MALDI-TOF MS Fingerprinting Allows for Discrimination of Major Methicillin-Resistant Staphylococcus Aureus Lineages. Int. J. Med. Microbiol. 2011, 301, 64–68.
  2. Pérez-Sancho, M.; Vela, A.I.; Horcajo, P.; Ugarte-Ruiz, M.; Domínguez, L.; Fernández-Garayzábal, J.F.; De La Fuente, R. Rapid Differentiation of Staphylococcus aureus Subspecies Based on MALDI-TOF MS Profiles. J. Vet. Diagn. Investig. 2018, 30, 813–820.
  3. Złoch, M.; Pomastowski, P.; Maślak, E.; Monedeiro, F.; Buszewski, B. Study on Molecular Profiles of Staphylococcus Aureus Strains: Spectrometric Approach. Molecules 2020, 25, 4894.
  4. Nichani, K.; Uhlig, S.; Stoyke, M.; Kemmlein, S.; Ulberth, F.; Haase, I.; Döring, M.; Walch, S.G.; Gowik, P. Essential Terminology and Considerations for Validation of Non-Targeted Methods. Food Chem. X 2023, 17, 100538.
  5. Li, L.; Garden, R.W.; Sweedler, J.V. Single-Cell MALDI: A New Tool for Direct Peptide Profiling. Trends Biotechnol. 2000, 18, 151–160.
  6. Tang, R.; Luo, R.; Tang, S.; Song, H.; Chen, X. Machine Learning in Predicting Antimicrobial Resistance: A Systematic Review and Meta-Analysis. Int. J. Antimicrob. Agents 2022, 60, 106684.
  7. Weis, C.V.; Jutzeler, C.R.; Borgwardt, K. Machine Learning for Microbial Identification and Antimicrobial Susceptibility Testing on MALDI-TOF Mass Spectra: A Systematic Review. Clin. Microbiol. Infect. 2020, 26, 1310–1317.
  8. Yong, D.; Park, J.S.; Kim, K.; Seo, D.; Kim, D.-C.; Kim, J.-S.; Park, J.-M. Rapid Screening of Methicillin-Resistant Staphylococcus aureus Using MALDI-TOF MS and Machine Learning: A Randomized, Multicenter Study. Anal. Chem. 2025, 97, 15667–15675.
  9. Touaitia, R.; Ibrahim, N.A.; Touati, A.; Idres, T. Staphylococcus Aureus in Bovine Mastitis: A Narrative Review of Prevalence, Antimicrobial Resistance, and Advances in Detection Strategies. Antibiotics 2025, 14, 810.
  10. Corvelo Benz, N.; Miranda, L.; Chen, D.; Sattler, J.; Borgwardt, K. Conformal Prediction with Knowledge Graphs for Reliable Antimicrobial Resistance Detection with MALDI-TOF Mass Spectra. J. Comput. Biol. 2025, 15578666251396558.
  11. Sauget, M.; Valot, B.; Bertrand, X.; Hocquet, D. Can MALDI-TOF Mass Spectrometry Reasonably Type Bacteria? Trends Microbiol. 2017, 25, 447–455.
  12. Silva, G.F.D.S.; Barcellos Filho, F.N.; Wichmann, R.M.; Da Silva Junior, F.C.; Chiavegatto Filho, A.D.P. Strategies for Detecting and Mitigating Dataset Shift in Machine Learning for Health Predictions: A Systematic Review. J. Biomed. Inform. 2025, 170, 104902.
  13. Elbehiry, A.; Marzouk, E. From Farm to Fork: Antimicrobial-Resistant Bacterial Pathogens in Livestock Production and the Food Chain. Vet. Sci. 2025, 12, 862.
  14. Elbehiry, A.; Abalkhail, A. Spectral Precision: Recent Advances in Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry for Pathogen Detection and Resistance Profiling. Microorganisms 2025, 13, 1473.
  15. Østergaard, C.; Hansen, S.G.K.; Møller, J.K. Rapid First-Line Discrimination of Methicillin Resistant Staphylococcus Aureus Strains Using MALDI-TOF MS. Int. J. Med. Microbiol. 2015, 305, 838–847.
  16. Yu, J.; Tien, N.; Liu, Y.-C.; Cho, D.-Y.; Chen, J.-W.; Tsai, Y.-T.; Huang, Y.-C.; Chao, H.-J.; Chen, C.-J. Rapid Identification of Methicillin-Resistant Staphylococcus Aureus Using MALDI-TOF MS and Machine Learning from over 20,000 Clinical Isolates. Microbiol. Spectr. 2022, 10, e00483-22.
  17. Kim, J.-M.; Kim, I.; Chung, S.H.; Chung, Y.; Han, M.; Kim, J.S. Rapid Discrimination of Methicillin-Resistant Staphylococcus Aureus by MALDI-TOF MS. Pathogens 2019, 8, 214.
  18. Paskova, V.; Chudejova, K.; Sramkova, A.; Kraftova, L.; Jakubu, V.; Petinaki, E.A.; Zemlickova, H.; Neradova, K.; Papagiannitsis, C.C.; Hrabak, J. Insufficient Repeatability and Reproducibility of MALDI-TOF MS-Based Identification of MRSA. Folia Microbiol. 2020, 65, 895–900.
  19. Lasch, P.; Fleige, C.; Stämmler, M.; Layer, F.; Nübel, U.; Witte, W.; Werner, G. Insufficient Discriminatory Power of MALDI-TOF Mass Spectrometry for Typing of Enterococcus Faecium and Staphylococcus Aureus Isolates. J. Microbiol. Methods 2014, 100, 58–69.
  20. Lasch, P.; Wahab, T.; Weil, S.; Pályi, B.; Tomaso, H.; Zange, S.; Kiland Granerud, B.; Drevinek, M.; Kokotovic, B.; Wittwer, M.; et al. Identification of Highly Pathogenic Microorganisms by Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry: Results of an Interlaboratory Ring Trial. J. Clin. Microbiol. 2015, 53, 2632–2640.
  21. Vestergaard, M.; Frees, D.; Ingmer, H. Antibiotic Resistance and the MRSA Problem. Microbiol. Spectr. 2019, 7, 10–1128.
  22. Viboud, G.; Asaro, H.; Huang, M.B. Use of Matrix-Assisted Laser Desorption Ionization Time of Flight (MALDI-TOF) to Detect Antibiotic Resistance in Bacteria: A Scoping Review. Am. J. Clin. Pathol. 2024, 161, 317–328.
  23. Wiesmann, N.; Enders, D.; Westendorf, A.; Koch, R.; Schaumburg, F. Prediction of Antimicrobial Resistance from MALDI-TOF Mass Spectra Using Machine Learning: A Validation Study. J. Clin. Microbiol. 2025, 63, e01186-25.
  24. Kannan, E.P.; Gopal, J.; Muthu, M. Analytical Techniques for Assessing Antimicrobial Resistance: Conventional Solutions, Contemporary Problems and Futuristic Outlooks. TrAC Trends Anal. Chem. 2024, 178, 117843.
  25. Park, Y.; Weig, M.; Noll, C.; Bader, O.; Hauschild, A.-C. Effect of Data Heterogeneity in Clinical MALDI-TOF Mass Spectra Profiles on Direct Antimicrobial Resistance Prediction Through Machine Learning. bioRxiv 2024.
  26. Kull, M.; Flach, P. Patterns of Dataset Shift. 2014. Available online: https://www.semanticscholar.org/paper/Patterns-of-dataset-shift-Kull-Flach/aa49eb379d55fd4c923f47efcd61b2090f58e54f (accessed on 5 December 2025).
  27. Zhang, H.; Singh, H.; Ghassemi, M.; Joshi, S. “Why Did the Model Fail?”: Attributing Model Performance Changes to Distribution Shifts. arXiv 2023, arXiv:2210.10769.
  28. Moreno-Torres, J.G.; Raeder, T.; Alaiz-Rodríguez, R.; Chawla, N.V.; Herrera, F. A Unifying View on Dataset Shift in Classification. Pattern Recognit. 2012, 45, 521–530.
  29. Gama, J.; Žliobaitė, I.; Bifet, A.; Pechenizkiy, M.; Bouchachia, A. A Survey on Concept Drift Adaptation. ACM Comput. Surv. 2014, 46, 1–37.
  30. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-Cam: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626.
  31. Werner, G.; Fleige, C.; Feßler, A.T.; Timke, M.; Kostrzewa, M.; Zischka, M.; Peters, T.; Kaspar, H.; Schwarz, S. Improved Identification Including MALDI-TOF Mass Spectrometry Analysis of Group D Streptococci from Bovine Mastitis and Subsequent Molecular Characterization of Corresponding Enterococcus Faecalis and Enterococcus Faecium Isolates. Vet. Microbiol. 2012, 160, 162–169.
  32. Chollet, F. keras, GitHub. 2015. Available online: https://github.com/keras-team/keras (accessed on 5 December 2025).
  33. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A System for Large-Scale Machine Learning. arXiv 2016, arXiv:1605.08695.
  34. Nichani, K.; Uhlig, S.; Colson, B.; Hettwer, K.; Simon, K.; Bönick, J.; Uhlig, C.; Kemmlein, S.; Stoyke, M.; Gowik, P.; et al. Development of Non-Targeted Mass Spectrometry Method for Distinguishing Spelt and Wheat. Foods 2022, 12, 141.
  35. Vabalas, A.; Gowen, E.; Poliakoff, E.; Casson, A.J. Machine Learning Algorithm Validation with a Limited Sample Size. PLoS ONE 2019, 14, e0224365.
  36. Uhlig, S.; Nichani, K.; Stoyke, M.; Gowik, P. Validation of Binary Non-Targeted Methods: Mathematical Framework and Experimental Designs. bioRxiv 2021.
  37. Uhlig, S.; Nichani, K.; Colson, B.; Hettwer, K.; Simon, K.; Uhlig, C.; Stoyke, M.; Steinacker, U.; Becker, R.; Gowik, P. Performance Characteristics and Criteria for Non-Targeted Methods. In Proceedings of the Eurachem Workshop, Tartu, Estonia, 20–21 May 2019.
  38. ISO 5725-2:2025; Accuracy (Trueness and Precision) of Measurement Methods and Results—Part 2: Basic Method for the Determination of Repeatability and Reproducibility of a Standard Measurement Method. International Organization for Standardization: Geneva, Switzerland, 2025.
  39. Draelos, R.L.; Carin, L. Use HiResCAM Instead of Grad-CAM for Faithful Explanations of Convolutional Neural Networks. arXiv 2020, arXiv:2011.08891.
  40. Zeiler, M.D.; Fergus, R. Visualizing and Understanding Convolutional Networks. In Proceedings of the Computer Vision—ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Part I 13. Springer: Berlin/Heidelberg, Germany, 2014; pp. 818–833.
  41. Shah, H.N.; Rajakaruna, L.; Ball, G.; Misra, R.; Al-Shahib, A.; Fang, M.; Gharbia, S.E. Tracing the Transition of Methicillin Resistance in Sub-Populations of Staphylococcus Aureus, Using SELDI-TOF Mass Spectrometry and Artificial Neural Network Analysis. Syst. Appl. Microbiol. 2011, 34, 81–86.
  42. Fernández-Narro, D.; Ferri, P.; Gutiérrez-Sacristán, A.; García-Gómez, J.M.; Sáez, C. Unsupervised Characterization of Temporal Dataset Shifts as an Early Indicator of AI Performance Variations: Evaluation Study Using the Medical Information Mart for Intensive Care-IV Dataset. JMIR Med. Inform. 2025, 13, e78309.
  43. Lin, T.-H.; Chung, H.-Y.; Jian, M.-J.; Chang, C.-K.; Perng, C.-L.; Chang, F.-Y.; Chen, C.-W.; Shang, H.-S. Accelerating Antimicrobial Stewardship: An AI-CDSS Approach to Combating Multidrug-Resistant Pathogens in the Era of Increasing Resistance. Clin. Chim. Acta 2025, 574, 120336.
  44. Greutmann, M.; Borgwardt, K.; Brüningk, S.; Franzeck, F.; Giske, C.G.; Green, A.G.; Guerrero-López, A.; Ip, M.; Jutzeler, C.; Kahles, A.; et al. ESCMID Workshop: Artificial Intelligence and Machine Learning in Medical Microbiology Diagnostics. Microbes Infect. 2025, 105562.
Figure 1. Schematic of the non-targeted screening method. (A) Different perspectives for data shift. (B) CNN training and testing, (C) Youden plot for decision scores, (D) illustrative KDE and ROC curve and significance testing, (E) Grad-CAM for feature description.
Figure 2. Decision scores of MRSA vs. MSSA as a Youden plot. Each point represents one bacterial isolate, with coordinates defined by decision scores from duplicate MALDI-TOF measurements (measurement 1 on x-axis and measurement 2 on y-axis). Blue circles show MSSA from 2008 to 2017. Red squares show MRSA from 2008 to 2015. Red diamonds show MRSA from 2017. Red lower triangles show MRSA from 2019 and red upper triangles from 2021. Line of identity is shown as a dotted line.
Figure 3. Sequential stepwise testing for temporal differences in MRSA decision scores. This schematic illustrates the iterative process of comparing and merging time periods based on their decision scores. In step one, pairs of periods with the smallest sample sizes were compared. Green-shaded boxes indicate periods with statistically similar scores that were merged for subsequent analysis, while red-shaded boxes show periods with statistically different scores that remained separate. When two comparisons occurred simultaneously within the same step, they are labeled as “1” and “2”. Through six iterative steps of comparison and merging, three distinct temporal periods emerged with significantly different decision score distributions: 2008–2015, 2017–2019, and 2021.
Figure 4. Distinction between the different time periods, separate for MRSA and MSSA, depicted in rows. (A) Rug plots with kernel density estimation curves. Plotted are the mean and standard deviation. (B) ROC curves with AUC values for the classification.
Figure 5. Grad-CAM analysis revealing temporal variations in spectral regions. (A) Representative MALDI-TOF mass spectra from each collection year (black lines, baseline-corrected) overlaid with Grad-CAM activation heatmaps, shown for selected spectra. Color intensity (blue to red) quantifies each spectral region’s contribution to MRSA classification decisions, with red indicating high importance. (B) Consolidated heatmap aggregating Grad-CAM importance across all spectra (rows) and m/z positions (columns). Color intensity in each row represents that region’s importance for classifying that specific spectrum. Distinct horizontal banding patterns within temporal blocks (years shown on the left) reveal how discriminatory spectral features change over time. The heterogeneity of patterns within each year also indicates substantial inter-isolate variation.
Figure 6. Temporal evolution of Grad-CAM feature importance for six exemplary m/z ranges showing significant year-to-year variation. For each panel, the m/z range and ANOVA F-statistic are shown (a higher F indicates greater temporal variation). Solid lines represent mean Grad-CAM importance values per year; dashed lines show 95% confidence intervals calculated from all spectra within each year.
Figure 7. Youden plot for discrimination of MRSA and MSSA with split time periods. (A) MSSA from 2008 to 2017 (blue dots) and MRSA from 2008 to 2015 (red squares). (B) MSSA from 2008 to 2017 (blue dots) and MRSA from 2017 (red diamonds) and from 2019 (red lower triangles).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Nichani, K.; Uhlig, S.; San Martin, V.; Hettwer, K.; Frost, K.; Steinacker, U.; Kaspar, H.; Gowik, P.; Kemmlein, S. Non-Targeted Screening Method for Detecting Temporal Shifts in Spectral Patterns of Methicillin-Resistant Staphylococcus aureus and Post Hoc Description of Peak Features. Microorganisms 2026, 14, 104. https://doi.org/10.3390/microorganisms14010104
