Excitation–Emission Fluorescence Spectroscopy Combined with Machine Learning for Biomedical Diagnostics: A Systematic Review

Pérez Hincapié, Melissa; Arana, Victoria A.; García-Alzate, Roberto; Lozano-Arias, Daisy; Trilleras, Jorge

doi:10.3390/sci8070148

Open Access Systematic Review

by Melissa Pérez Hincapié ¹, Victoria A. Arana ¹, Roberto García-Alzate ¹, Daisy Lozano-Arias ² and Jorge Trilleras ³,*

¹ Grupo de Investigación Ciencias, Educación y Tecnología—CETIC, Laboratorio de Metabolómica, Biosanitaria y Ambiental, Universidad del Atlántico, Carrera 30 No. 8–49, Puerto Colombia 081007, Colombia

² Grupo de Investigación en Ciencias Básicas y Clínicas—GIBAC, Facultad de Ciencias de la Salud, Fundación Universitaria San Martín, Puerto Colombia 081007, Colombia

³ Grupo de Investigación en Compuestos Heterocíclicos, Facultad de Ciencias Básicas, Universidad del Atlántico, Carrera 30 No. 8–49, Puerto Colombia 081007, Colombia

* Author to whom correspondence should be addressed.

Sci 2026, 8(7), 148; https://doi.org/10.3390/sci8070148

Submission received: 21 April 2026 / Revised: 22 June 2026 / Accepted: 24 June 2026 / Published: 27 June 2026

(This article belongs to the Section Chemistry Science)

Abstract

Excitation–emission matrix (EEM) fluorescence spectroscopy, when combined with machine learning algorithms, has emerged as a promising tool for non-invasive biomedical diagnosis. This systematic review analyzes recent advances in integrating EEM with chemometric techniques and machine learning models for the detection of infectious diseases, cancer, and neurological and metabolic disorders, as well as for monitoring bioactive compounds and hormonal contaminants. The review examines multivariate approaches alongside spectral preprocessing strategies, highlighting their ability to resolve overlapping signals and extract relevant information from complex biological matrices. The reviewed studies report high sensitivity, specificity, and accuracy values across diverse biological matrices and disease targets, supporting the scalability and versatility of this diagnostic platform. A critical evaluation of methodological approaches is also provided, identifying common pipeline-level challenges and drawing a constructive distinction between proof-of-concept studies, which establish the discriminative potential of EEM spectral data, and studies aimed at clinical validation. This distinction helps contextualize reported performance and guides future research design. Future perspectives focus on the development of open-access spectral databases, portable devices, standardized preprocessing protocols, and the integration of deep learning and explainable artificial intelligence, all of which represent concrete pathways toward the clinical translation of EEM-based diagnostics. This review confirms the value of EEM spectroscopy coupled with machine learning as a versatile, scalable, and high-impact platform for biomedical diagnostics, with significant potential for applications in public health and personalized medicine.

Keywords: EEM fluorescence spectroscopy; machine learning; chemometrics; biomedical diagnostics; spectral preprocessing; multivariate analysis

1. Introduction

The early and accurate diagnosis of diseases is a persistent challenge in the biomedical field. Delays in clinical results increase the social and economic burden associated with illnesses. Therefore, the search for sensitive and non-invasive diagnostic techniques has become increasingly relevant in recent years. Fluorescence spectroscopy has been used to characterize biomolecules and tissues in various biomedical fields owing to its high sensitivity, speed, and compatibility with physiological conditions [1]. Excitation–emission matrix (EEM) fluorescence spectroscopy is a promising technique that provides a complete molecular fluorescence profile of an analyzed sample. Key endogenous fluorophores, such as aromatic amino acids (tryptophan and tyrosine), metabolic coenzymes (NADH and FAD), vitamins (riboflavin), structural proteins (collagen and elastin), and advanced glycation end products (AGEs), act as molecular fingerprints [2,3]. Fluorescence spectroscopy captures the excitation and emission characteristics of these fluorophores simultaneously, and the resulting EEM contains valuable information about biological composition and interactions. However, analyzing EEM data can be complex because of its high dimensionality and the presence of signal interference [4].

The integration of chemometric and machine learning techniques has aided pattern extraction, classification, and prediction from complex datasets [5]. Machine learning algorithms can estimate probabilities from diverse datasets quickly and effectively, offering solutions to various aspects of precision medicine, including diagnostics, patient phenotyping, and personalized treatment strategies [6]. Combining precision medicine with approaches such as machine learning can reduce healthcare costs and improve the effectiveness of personalized treatments [7]. The combination of EEM spectroscopy with machine learning algorithms has demonstrated significant potential in various biomedical applications [8,9], including the detection of biomarkers, disease diagnosis, treatment monitoring, and tissue analysis [10,11,12,13].

The purpose of this review article is to provide a state-of-the-art overview of the use of EEM fluorescence spectroscopy coupled with chemometrics and machine learning for biomedical applications. The review will first cover the basic principles of the spectroscopic technique and the foundations of machine learning, with a focus on the most commonly used algorithms. This will be followed by an analysis of technological advances, methodologies, and results from the included research, as well as a discussion of the main challenges and future perspectives in this field.

2. Materials and Methods

This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to ensure a transparent and reproducible process for literature selection [14]. A comprehensive search was performed in three major scientific databases (Scopus, Web of Science, and PubMed), which were selected because of their broad coverage in the chemical, biological, and biomedical sciences. The search strategy focused on studies related to EEM spectroscopy applied to the diagnosis, detection, and monitoring of chronic, oncological, metabolic, and infectious diseases. Eligible studies incorporated multivariate analysis, chemometric methods, and/or machine learning algorithms. The following Boolean search string was used to query all three databases: (“fluorescence spectroscopy” OR “excitation–emission matrix” OR “EEM spectroscopy”) AND (“machine learning” OR “chemometric” OR “multivariate analysis”) AND (“diagnosis” OR “detection” OR “diagnostic” OR “biomedical”).

2.1. Inclusion and Exclusion Criteria

Studies were included if they were published in English or Spanish up to May 2025 and reported potential applications in clinical or diagnostic contexts. Eligible publications explicitly applied chemometric multivariate analysis and/or machine learning algorithms for data classification or quantification. This review considered original research articles, systematic reviews, and methodological studies.

Studies were excluded if they were unrelated to biomedical applications, including purely theoretical work, techniques without clinical implications, or research conducted in non-biomedical domains. Publications without full-text availability were also excluded, as this prevented an adequate evaluation of methodological quality, statistical analyses, and reported results. In addition, conference proceedings were excluded because they were often not subjected to full peer review and lacked sufficient methodological rigor and scientific validation.

2.2. Study Selection Process

The selection process followed the PRISMA 2020 flow diagram for new systematic reviews that include database and register searches (Figure 1) and is reported in the four steps below.

Identification: A comprehensive search of the three major scientific databases (Scopus, Web of Science, and PubMed) retrieved 696 records. After removing 388 duplicates using Mendeley Reference Manager and manual screening, 308 unique articles remained.

Screening: Titles were screened, and 156 studies clearly irrelevant to the scope of this review (e.g., environmental applications, methodological studies without biomedical applications) were excluded. A total of 152 articles were retained for further evaluation.

Eligibility: Full-text articles were sought for the 152 retained studies; 56 reports could not be retrieved. The remaining 96 articles were assessed for eligibility, and 73 were excluded because the spectroscopic technique employed was not specifically EEM or because no machine learning was applied. Twenty-three articles were retained for qualitative analysis.

Inclusion: After a comprehensive full-text assessment, 23 studies fully satisfied all inclusion criteria and were incorporated into the qualitative analysis. The main reasons for exclusion at this stage included insufficiently described methodologies, lack of model validation, and research objectives misaligned with the aims of this review.
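
The record counts reported in the four steps above can be checked with simple arithmetic. The short sketch below is only an illustrative consistency check and is not part of the authors' workflow; the variable names are hypothetical.

```python
# Record counts as reported in the study selection process above.
identified = 696          # records retrieved from Scopus, Web of Science, and PubMed
duplicates = 388          # removed with Mendeley Reference Manager and manual screening
excluded_by_title = 156   # clearly outside the scope of the review
not_retrieved = 56        # full texts that could not be obtained
excluded_full_text = 73   # not EEM-specific or without machine learning

unique_records = identified - duplicates
retained_after_screening = unique_records - excluded_by_title
assessed_for_eligibility = retained_after_screening - not_retrieved
included = assessed_for_eligibility - excluded_full_text

print(unique_records, retained_after_screening, assessed_for_eligibility, included)
```

Each intermediate value matches the figure quoted in the corresponding step (308, 152, 96, and 23).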

3. Fundamentals of Excitation–Emission Matrix Fluorescence Spectroscopy and Machine Learning

3.1. Basic Principles of EEM Fluorescence Spectroscopy

Fluorescence spectroscopy is a structural analysis technique based on the ability of certain compounds to absorb light at a specific wavelength (excitation) and emit it at a longer wavelength (emission). Figure 2 illustrates this process through a Jablonski diagram, which represents the electronic energy levels and transitions involved in fluorescence. When a molecule absorbs a photon, it is excited from the ground state (S₀) to a higher electronic state (S₁ or S₂). Following excitation, the molecule rapidly relaxes within the excited state, losing energy through non-radiative processes, such as internal conversion and vibrational cascading, until it reaches the lowest vibrational level of S₁. From this relaxed excited state, the molecule returns to the ground state by emitting a photon, a process known as fluorescence. Because energy is lost during vibrational relaxation, the emitted photon has lower energy (longer wavelength) than the absorbed photon, a phenomenon known as the Stokes shift. The fluorescence lifetime provides valuable information about the molecular environment and interactions of the fluorophore [15].
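
The energy argument behind the Stokes shift can be made explicit. The relations below are standard photophysics rather than results from the reviewed studies, and the wavelengths used (excitation near 280 nm and emission near 350 nm, roughly typical of tryptophan in aqueous solution) are illustrative assumptions.

```latex
E = \frac{hc}{\lambda}, \qquad
\Delta\tilde{\nu} = \frac{1}{\lambda_{\mathrm{ex}}} - \frac{1}{\lambda_{\mathrm{em}}}

% Illustrative values: \lambda_{ex} = 280 nm, \lambda_{em} = 350 nm
\Delta\tilde{\nu}
  = \frac{10^{7}}{280}\,\mathrm{cm^{-1}} - \frac{10^{7}}{350}\,\mathrm{cm^{-1}}
  \approx 35{,}714\,\mathrm{cm^{-1}} - 28{,}571\,\mathrm{cm^{-1}}
  \approx 7143\,\mathrm{cm^{-1}}
```

Because photon energy is inversely proportional to wavelength, the positive wavenumber difference corresponds to the energy dissipated non-radiatively before emission.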

Excitation–emission spectroscopy involves the recording of multiple emission spectra that result from variable excitation wavelengths, thereby producing a three-dimensional data matrix. This matrix relates the excitation wavelengths, emission wavelengths, and the intensity of each measurement and is known as the fluorescent profile of a compound. This technique is highly advantageous, as it allows the detection of multiple fluorophores in a single assay. This is useful not only in complex biological samples, such as plasma, serum, saliva, or tissues, but also in other samples, such as water [16], soil [17], petroleum [18], food [19], and repellents [20].
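
To make the structure of an EEM concrete, the following minimal sketch builds a synthetic matrix for two hypothetical fluorophores. The wavelength ranges, peak positions, band widths, and Gaussian band shapes are all assumptions chosen for illustration and do not come from any reviewed study.

```python
import numpy as np

# Wavelength axes in nm (illustrative ranges).
ex = np.arange(250, 451, 10)   # excitation wavelengths
em = np.arange(300, 601, 5)    # emission wavelengths

def gaussian(axis, center, width):
    """Unit-height Gaussian band used as a stand-in for a real spectrum."""
    return np.exp(-0.5 * ((axis - center) / width) ** 2)

# Two hypothetical fluorophores: (excitation peak, emission peak, concentration).
fluorophores = [(280, 350, 1.0), (350, 450, 0.6)]

# Each fluorophore contributes concentration * (excitation profile x emission profile).
eem = np.zeros((ex.size, em.size))
for ex_peak, em_peak, conc in fluorophores:
    eem += conc * np.outer(gaussian(ex, ex_peak, 15), gaussian(em, em_peak, 25))

# Locate the strongest signal in the matrix.
i, j = np.unravel_index(np.argmax(eem), eem.shape)
print(eem.shape, ex[i], em[j])
```

Rows index excitation wavelengths and columns index emission wavelengths, so a single measurement per sample already contains one emission spectrum for every excitation setting.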

Signal overlap can occur in EEM fluorescence spectra. To counteract this, methods for signal preprocessing based on mathematical algorithms have been developed to reliably and reproducibly extract information of scientific interest. In recent years, the use of EEM fluorescence spectroscopy in biomedical applications has expanded because the technique offers high sensitivity, can detect subtle changes in a sample’s composition, and is compatible with physiological conditions without the need for specialized markers [21,22,23,24].

3.2. Data Characteristics and Challenges

Biomedical studies often involve many samples. The data are typically arranged as three-way arrays (data cubes) in which one dimension corresponds to the samples and the other two to the excitation and emission wavelengths. The high dimensionality of these data necessitates methodologies that facilitate analysis, which has driven the combination of spectroscopic techniques with multivariate methods [25]. Although EEM fluorescence spectroscopy has not yet been widely adopted in clinical contexts, previous studies have shown that chemometric approaches can extract valuable information from complex spectroscopic signals. For example, multivariate models such as Parallel Factor (PARAFAC) Analysis, a multilinear decomposition method for higher-order data arrays, and Alternating Penalty Trilinear Decomposition (APTLD), a chemometric algorithm for decomposing three-way data, have been applied to UV-Vis spectroscopy and voltammetry data to study the inhibition of AGEs in bovine serum albumin. While this study did not use EEM, it highlights the potential of chemometrics to resolve overlapping signals and monitor molecular processes in biofluids [26].
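
One common way to bring two-way multivariate methods to such data is to unfold each EEM into a single row. The sketch below uses random numbers as a placeholder data cube (an assumption, not real spectra) and applies PCA through the singular value decomposition, only to show the shapes involved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data cube: 30 samples x 21 excitation x 61 emission wavelengths.
n_samples, n_ex, n_em = 30, 21, 61
cube = rng.random((n_samples, n_ex, n_em))

# Unfold each EEM into one long row so that two-way methods can be applied.
unfolded = cube.reshape(n_samples, n_ex * n_em)

# PCA through the singular value decomposition of the column-centered matrix.
centered = unfolded - unfolded.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
scores = U * s                      # sample coordinates on the components
loadings = Vt                       # one row per component, length n_ex * n_em
explained = s**2 / np.sum(s**2)     # fraction of total variance per component

print(unfolded.shape, scores.shape, explained[:3])
```

Any loading row can be reshaped back to `(n_ex, n_em)` for inspection as a pseudo-EEM. Unfolding discards the explicit three-way structure, which is one motivation for trilinear methods such as PARAFAC.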

In chemometrics, research has focused on developing mathematical adjustments for the pretreatment of signals acquired via spectroscopic techniques. For instance, Corcoran used a simulation experiment to generate synthetic EEM data from known dye spectra by adding a Rayleigh scattering matrix and applying optimized binary filters to simulate compressive detection. This type of modeling presents a potential solution for spectral overlaps often observed in biological samples with multiple fluorophores. Furthermore, simulated Rayleigh scattering allows the evaluation of optical interference in detection [27]. This evidence supports the growing interest in applying EEM fluorescence spectroscopy, which generates rich three-dimensional data, in combination with machine learning tools to advance more sensitive, interpretable, and automated diagnostic systems.
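
The idea of adding a scattering contribution to a synthetic EEM can be illustrated as follows. This is a simplified sketch and not a reproduction of the cited simulation: the dye parameters, the Gaussian ridge used for first-order Rayleigh scatter, its width, and the masking step are all assumptions.

```python
import numpy as np

ex = np.arange(250, 451, 10.0)   # excitation wavelengths in nm (illustrative)
em = np.arange(300, 601, 5.0)    # emission wavelengths in nm (illustrative)
EX, EM = np.meshgrid(ex, em, indexing="ij")

# Fluorescence of one hypothetical dye (excitation 350 nm, emission 450 nm).
fluor = np.exp(-0.5 * ((EX - 350) / 20) ** 2) * np.exp(-0.5 * ((EM - 450) / 30) ** 2)

# First-order Rayleigh scatter: a narrow ridge where emission equals excitation.
scatter_width = 5.0   # nm, assumed ridge width
rayleigh = 2.0 * np.exp(-0.5 * ((EM - EX) / scatter_width) ** 2)

measured = fluor + rayleigh

# Simple handling: mask the band around the scatter line before modeling.
mask = np.abs(EM - EX) <= 3 * scatter_width
cleaned = np.where(mask, np.nan, measured)

print(np.nanmax(measured), np.nanmax(cleaned))
```

In the unmasked matrix the scatter ridge dominates the intensity scale, whereas after masking the largest remaining value belongs to the fluorescence band. Masked entries then need to be treated as missing or interpolated by the downstream model.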

3.3. Basic Principles of Machine Learning

The three-dimensional nature of EEM data, where each sample is represented by a matrix of fluorescence intensities across combinations of excitation and emission wavelengths, defines both the analytical richness and the methodological complexity of this approach. The high dimensionality of EEM matrices, the presence of spectral overlaps, and the multilinear structure that relates each measured intensity to the contributions of individual fluorophores are features that directly condition the choice of preprocessing strategies, dimensionality reduction methods, and classification or regression algorithms. Machine learning and chemometric methods are therefore not merely complementary tools for EEM analysis; their selection and implementation must be guided by the specific structural properties of three-way spectral data. Machine learning involves algorithms and statistical models that enable computers to identify patterns and make predictions from data without explicit programming. By processing large volumes of data, models can be built to identify underlying patterns with optimized parameters using training data or experience. Models can be predictive, descriptive, or both [28]. Machine learning is the scientific study of algorithms and statistical models used by computers to perform specific tasks, such as data mining, image processing, and predictive analysis [29].
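
The multilinear structure referred to in this subsection is usually written as a trilinear model. The notation below is a standard textbook form and is given here as an assumption about what is meant; it holds approximately for dilute samples in which inner-filter effects and scattering are negligible.

```latex
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk}
```

Here $x_{ijk}$ is the intensity of sample $i$ at excitation wavelength $j$ and emission wavelength $k$, $a_{if}$ is proportional to the concentration of fluorophore $f$ in sample $i$, $b_{jf}$ and $c_{kf}$ are its excitation and emission profiles, $F$ is the number of fluorophores, and $e_{ijk}$ is the residual.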

Machine learning algorithms applicable to biomedical diagnostics are classified into three main categories: unsupervised learning, supervised learning, and reinforcement learning (Figure 3). Unsupervised learning does not require prior knowledge of the correct categorization, making it valuable for exploratory analysis and pattern discovery [30]. Supervised learning uses labeled data to build functions that map input features to output variables, thereby enabling predictions for new data [31]. In reinforcement learning, an agent learns to make sequential decisions by interacting with an environment and receiving rewards or penalties to optimize a long-term objective without requiring labeled training data [32]. Although reinforcement learning is important conceptually, its applications to static diagnostic tasks are limited; therefore, this review focuses primarily on unsupervised and supervised approaches applicable to EEM spectroscopy-based diagnostics.

3.3.1. Data Preprocessing

Data preprocessing is a critical initial step that directly influences the quality and reliability of machine learning results. Raw spectroscopic data often contains measurement noise, systematic artifacts, baseline shifts, and variations in signal intensity that can obscure true patterns or introduce spurious correlations. Preprocessing transforms raw data into a clean, standardized format suitable for analysis, improving model training efficiency and reducing the risk of misleading conclusions. Common preprocessing methods for spectroscopic data include:

-: Normalization is a fundamental preprocessing step that rescales input features with different units and magnitudes to a standard range, typically between zero and one. This transformation prevents variables with larger scales from disproportionately influencing machine learning model outcomes and is particularly beneficial for linear models operating on small datasets, where empirical evidence shows it can improve classification accuracy from 61% to over 91% and increase the Matthews correlation coefficient (MCC) from 23% to 83% [33]. In the context of EEM fluorescence spectroscopy, normalization constitutes a critical stage of the data preprocessing workflow, alongside baseline correction and noise filtering, that enables the effective integration of high-dimensional spectral matrices with machine learning algorithms for biomedical diagnostic applications [34].
-: Baseline correction addresses systematic low-frequency spectral distortions caused by scattering, fluorescence background, and detector drift, which obscure true analyte signals and compromise quantitative analysis. Effective correction must disentangle analyte signals from background interference without distorting peak morphology, a challenge addressed through physical and statistical models, such as polynomial fitting and asymmetric least squares, as well as adaptive algorithms, including penalized splines and iterative optimization methods. Parametric approaches, such as piecewise polynomial fitting (PPF) and B-spline fitting (BSF), offer computational simplicity and localized control to avoid overfitting but require careful parameter tuning; high-order polynomials risk introducing oscillations, whereas low-order fits may underperform with complex baselines. Adaptive methods, such as asymmetrically reweighted penalized least squares (arPLS), improve robustness by dynamically adjusting weights based on the distribution of residuals, thereby suppressing baseline drift while preserving spectral peaks. However, arPLS remains prone to overfitting faint peaks near the baseline and may misclassify low-intensity signals as baseline variations under high-noise conditions. The choice between parametric and adaptive approaches depends on spectral complexity: simple baselines are well-handled by parametric models, whereas complex fluorescence backgrounds and overlapping interferences demand adaptive or data-driven algorithms. Regardless of the strategy chosen, effective baseline correction is essential for preserving spectroscopic feature fidelity and ensuring that downstream machine learning models operate on chemically meaningful data rather than distortion-dominated signals [35].
-: Noise filtering constitutes a critical preprocessing strategy in machine learning pipelines, particularly in medical domains, where training datasets frequently contain mislabeled or anomalous instances. The multi-class saturation filter, grounded in the saturation property of the training data, operates iteratively to identify and remove examples whose elimination reduces the hypothesis complexity, as measured by the complexity of the least complex hypothesis value. An empirical evaluation across eight UCI medical datasets (University of California, Irvine, USA) demonstrated that classifiers trained on noise-filtered data consistently achieved higher mean prediction accuracy than those trained on unfiltered datasets, with statistically significant improvements observed in seven of eight domains for both inductive learning by logic minimization (ILLM) and C4.5 algorithms. Notably, the application of noise elimination prior to model training yielded relative information score improvements of approximately 3% in the diagnosis of rheumatic diseases, surpassing the performance of built-in noise-handling mechanisms, such as the CN2 significance test. These results underscore the value of preprocessing-based noise elimination as a complementary strategy to algorithmic noise tolerance, while also highlighting that excessive filtering may reduce predictive performance, necessitating careful calibration of the noise sensitivity parameter according to the training set size [36].
-: Centering is a fundamental preprocessing transformation that adjusts the data so that the mean of each feature becomes zero, thereby eliminating bias related to the location of the data in feature space and allowing machine learning models to focus on the relationships and patterns within the data. This step is particularly critical for algorithms sensitive to the scale and location of the data, such as principal component analysis (PCA) and various regression techniques, as it ensures that the first principal component aligns with the direction of maximum variance. Importantly, centering only shifts the location of the data without affecting its variance or the shape of its distribution; therefore, the total variance, expressed as the trace of the covariance matrix, remains unchanged after the transformation. In iterative optimization algorithms, such as gradient descent, centering accelerates convergence by removing bias from the optimization landscape while reducing numerical instability during high-dimensional computations. For multi-class datasets, centering each class around its own centroid, rather than applying a global centering strategy, maximizes between-class variance while minimizing within-class variance, thereby directly influencing the Fisher discriminant ratio as a measure of class separability. As a standard practice in data preprocessing pipelines, centering is commonly applied as a preliminary step in standardization, where the data are subsequently scaled by dividing by the standard deviation to form Z-score normalization, making features comparable regardless of their original units or scales [37].
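
The scaling operations in this list (min-max normalization, centering, and Z-score standardization) can be sketched in a few lines. The feature matrix below is random placeholder data with deliberately different scales, an assumption made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical feature matrix: 50 samples x 4 features on very different scales.
X = rng.normal(loc=[0.0, 10.0, 100.0, 1000.0],
               scale=[1.0, 5.0, 20.0, 300.0],
               size=(50, 4))

# Min-max normalization: rescale every feature to the range [0, 1].
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Centering: subtract the column means so that each feature has mean zero.
X_centered = X - X.mean(axis=0)

# Z-score standardization: centering followed by division by the standard deviation.
X_z = X_centered / X.std(axis=0, ddof=1)

# Centering moves the data but leaves the total variance unchanged.
trace_before = np.trace(np.cov(X, rowvar=False))
trace_after = np.trace(np.cov(X_centered, rowvar=False))
print(np.isclose(trace_before, trace_after))
```

The final comparison illustrates the statement that the trace of the covariance matrix is unaffected by centering, while the Z-scored features end up with unit standard deviation.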

These preprocessing steps are not independent; their order and combination significantly influence the downstream results. For instance, applying baseline correction before normalization may yield different results than reversing the order. Careful validation and documentation of preprocessing choices are essential to ensure reproducibility and appropriate interpretation of findings.
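
The effect of step order can be demonstrated on a toy spectrum. The sketch below uses a basic dense asymmetric least squares baseline (a simpler relative of the arPLS method named above, not arPLS itself) together with min-max normalization; the smoothing parameter `lam`, the asymmetry `p`, and the synthetic signal are assumed values.

```python
import numpy as np

def als_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Asymmetric least squares baseline (dense version for short spectra)."""
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)      # second-difference operator
    penalty = lam * (D.T @ D)                # smoothness penalty
    w = np.ones(n)
    for _ in range(n_iter):
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        w = np.where(y > z, p, 1.0 - p)      # points above the fit get a small weight
    return z

def minmax(v):
    """Rescale a vector to the range [0, 1]."""
    return (v - v.min()) / (v.max() - v.min())

x = np.linspace(0.0, 1.0, 200)
signal = np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)   # one emission band
y = signal + 0.5 + 1.0 * x                        # plus a sloping background

# Order 1: baseline correction, then normalization.
corrected_then_scaled = minmax(y - als_baseline(y))

# Order 2: normalization, then baseline correction.
scaled = minmax(y)
scaled_then_corrected = scaled - als_baseline(scaled)

print(corrected_then_scaled.max(), scaled_then_corrected.max())
```

In the first order the band is rescaled to unit height after the background is removed, whereas in the second order the background inflates the range used for scaling and the corrected band comes out smaller. Both pipelines are legitimate, but they hand different numbers to the downstream model.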

3.3.2. Unsupervised Learning

Unsupervised learning techniques enable models to identify patterns, structures, or groupings within datasets without relying on predefined labels or prior constraints. These models explore intrinsic data characteristics to extract meaningful information, supporting decision-making without expert knowledge. This ability to learn directly from data allows the discovery of hidden subgroups, non-linear relationships, and latent dimensions that are often missed by supervised approaches. Unsupervised learning is particularly valuable when working with noisy, heterogeneous datasets or when labels are subjective and potentially biased. By uncovering unexpected structures, these models generate new scientific hypotheses and insights [38]. Unsupervised learning models are grouped into clustering and dimensionality reduction categories:

Clustering algorithms: These algorithms classify items into groups based on similarity, identifying hidden structures and natural subgroups.

-: k-Means clustering is a fundamental unsupervised learning algorithm that partitions multidimensional data into k distinct clusters by minimizing the total within-cluster variance, thereby identifying natural groupings and latent structures in complex datasets without requiring prior class labels. Formally, the algorithm minimizes the sum of squares error (SSE) through an iterative procedure that assigns each observation to the nearest centroid and recalculates cluster centroids based on updated membership until no further reassignment is possible, making it computationally tractable for large, high-dimensional datasets. A well-recognized limitation of k-means is its susceptibility to convergence at local rather than global optima, a consequence of sensitivity to initial seed selection; this can be mitigated through multiple random restarts or rational initialization strategies, such as hierarchical pre-clustering via Ward’s method, which has shown superior recovery performance in comparative studies. The determination of the optimal number of clusters, k, remains a critical methodological decision, addressable through algorithmic approaches that test for Gaussian distribution assumptions, graphical inspection of the SSE curve across values of k, or quantitative indices, including the Calinski–Harabasz criterion, the Davies–Bouldin index, and the gap statistic. Preprocessing decisions also critically influence cluster quality: standardization by range rather than z-score has been recommended as the most effective normalization strategy, and variable weighting or selection procedures, such as variable selection for k-means (VS-KM), can substantially reduce noise and improve the discriminative structure of the resulting partition. 
Additional methodological safeguards include outlier detection through jackknife-based influence measures and trimmed k-means variants, as well as extensions to alternative metric spaces, including k-medians and k-harmonic means, that offer improved robustness in the presence of outliers or complex data geometries [39]. When applied to spectroscopic data, k-means serves as an exploratory tool for hypothesis generation, identifying sample subgroups that may reflect distinct biochemical states or disease phenotypes, although the results should be validated through complementary supervised methods and clinical correlation before drawing biological conclusions.
-: Hierarchical clustering is an unsupervised grouping procedure that systematically reduces n mutually exclusive subsets to a single group by iteratively merging pairs of subsets whose union produces the smallest increase in the chosen objective function. This approach is particularly suited for large-scale studies (n > 100), in which obtaining a globally optimal solution for a fixed number of groups is computationally prohibitive. At each step, all possible k(k − 1)/2 pairs of active subsets are evaluated, and the merger associated with the optimal objective function value is accepted without modifying the previously formed groups. The objective function, which may reflect any criterion selected by the investigator, such as the SSE, quantifies the information loss associated with each merger, providing a principled basis for evaluating grouping quality. Critically, this procedure does not require the number of groups to be specified in advance; instead, the objective function values computed across all stages from n to one group provide quantitative clues for selecting the operationally appropriate number of clusters. The complete hierarchical structure, together with the incremental loss estimates at each merging stage, enables a multi-resolution inspection of data organization and the identification of meaningful subgroup boundaries [40].
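
Both clustering procedures described above can be sketched with NumPy alone. The code below is a didactic implementation under stated assumptions: k-means uses random restarts (rather than Ward-based initialization) to reduce the risk of local optima, the hierarchical routine uses the increase in SSE as its objective function, and the data are three synthetic, well-separated groups. Function names are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_restarts=10, n_iter=100, seed=0):
    """Lloyd's algorithm with random restarts; returns the lowest-SSE solution."""
    rng = np.random.default_rng(seed)
    best_sse, best_labels, best_centroids = np.inf, None, None
    for _ in range(n_restarts):
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Assign every observation to its nearest centroid.
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Recompute centroids; keep the old one if a cluster became empty.
            new_centroids = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
            if np.allclose(new_centroids, centroids):
                break
            centroids = new_centroids
        sse = np.sum((X - centroids[labels]) ** 2)
        if sse < best_sse:
            best_sse, best_labels, best_centroids = sse, labels, centroids
    return best_labels, best_centroids, best_sse

def ward_merge_costs(X):
    """Agglomerative merging that always joins the pair with the smallest SSE increase."""
    clusters = [[i] for i in range(len(X))]
    costs = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = X[clusters[a]], X[clusters[b]]
                na, nb = len(ca), len(cb)
                # Increase in SSE caused by merging clusters a and b.
                inc = na * nb / (na + nb) * np.sum((ca.mean(axis=0) - cb.mean(axis=0)) ** 2)
                if best is None or inc < best[0]:
                    best = (inc, a, b)
        inc, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        costs.append(inc)
    return costs

# Three synthetic groups of 10 points each (placeholder data).
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(10, 2)) for c in centers])

labels, centroids, sse = kmeans(X, k=3, n_restarts=50)
costs = ward_merge_costs(X)
print(sse, costs[-3:])
```

The sequence of merge costs plays the role of the objective function values mentioned above: a sharp jump in the last few merges suggests how many groups the data support. The pairwise search is cubic in the number of samples, so this version is suitable only for small illustrations.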

Dimensionality Reduction Algorithms: These transform data into lower-dimensional spaces while preserving the relevant structure, thereby enabling visualization, compression, and noise reduction.

-: PCA is a dimensionality reduction technique that transforms high-dimensional datasets into a reduced set of uncorrelated variables, termed principal components (PCs), which are defined directly from the data rather than established a priori, making PCA an inherently adaptive method for exploratory analysis. Mathematically, each PC is obtained as a linear combination of the original variables by solving an eigenvalue problem on the covariance or correlation matrix, where the coefficients of these linear combinations are known as loadings and the projected values for each observation as scores. Because PCA operates without distributional assumptions, it is broadly applicable to numerical data of diverse types. When the original variables differ in units of measurement, correlation matrix PCA, which standardizes variables prior to analysis, is the preferred approach, as it prevents variables with larger scales from disproportionately influencing the components. The proportion of total variance explained by the retained PCs serves as the standard criterion for evaluating the quality of any low-dimensional representation, enabling a principled trade-off between dimensionality reduction and information retention. Complementary visualization tools, such as biplots, further enhance interpretability by simultaneously representing observations and variables in the reduced space, facilitating the identification of structure and relationships within complex datasets [41].
-: t-Distributed Stochastic Neighbor Embedding (t-SNE) is a non-linear, non-parametric dimensionality reduction technique that maps high-dimensional data into a two- or three-dimensional representation by converting pairwise Euclidean distances into conditional probabilities that reflect the similarity between data points. Unlike linear approaches, t-SNE employs a Student’s-t distribution with one degree of freedom in the low-dimensional space, rather than a Gaussian, to model these similarities, which provides two critical advantages: it strongly repels dissimilar data points that are incorrectly placed in close proximity and avoids infinite repulsive forces that would otherwise destabilize the optimization. This design directly addresses the crowding problem inherent to earlier SNE formulations, in which the insufficient area of the low-dimensional map caused moderately dissimilar points to collapse toward the center, obscuring natural cluster boundaries. The optimization of the cost function is further supported by early exaggeration, a technique that temporarily amplifies the joint probabilities in the high-dimensional space during initial iterations, encouraging the formation of tight, well-separated clusters that can subsequently reorganize into a globally coherent structure. As a result, t-SNE is capable of simultaneously preserving local neighborhood relationships and revealing global structure at multiple scales, a property that distinguishes it from techniques such as Sammon mapping, isomap, and locally linear embedding. 
However, its computational and memory complexity of O(n²) makes direct application impractical for datasets substantially exceeding 10,000 data points; this limitation is addressed by a random walk variant that computes pairwise affinities for a subset of landmark points by integrating over all paths through a neighborhood graph constructed from the full dataset, thereby incorporating structural information from undisplayed data points into the final visualization [42].
-: Parallel factor analysis (PARAFAC) is a trilinear decomposition method that simultaneously fits multiple two-way arrays, or slices of a three-way array, in terms of a common set of factors with differing relative weights in each slice, representing a generalization of the bilinear factor analysis model to three-way data. Applied to excitation–emission matrix spectroscopy, this trilinear structure naturally accommodates the three-way array formed by samples, excitation wavelengths, and emission wavelengths, decomposing it into a set of latent factors, each associated with a distinct spectral component. A defining advantage of PARAFAC over conventional two-way methods, such as PCA, is its intrinsic axis property: when each factor exhibits sufficiently distinct patterns of variation across the three modes of the data, the orientation of factor axes is uniquely determined by minimizing residual error alone, eliminating the rotational indeterminacy that requires an arbitrary separate rotation phase in traditional factor analysis. This uniqueness is guaranteed when each factor displays distinct proportional patterns of variation across the levels of each mode, such that no two factors can be exchanged or recombined without reducing the overall fit of the model. Solution reliability is typically assessed through fit diagnostics, including R-squared, stress, and mean squared error, and confirmed via split-half validation, in which replication of essentially the same factor structure across independent subsamples indicates that the recovered axes reflect genuine systematic patterns in the data rather than artifacts. When the proportional structure of the data is violated or factors exhibit insufficient variation, degenerate solutions characterized by two or more highly negatively correlated factors may arise; these can be addressed by constraining factors to orthogonality in one mode or by employing indirect fitting through PARAFAC, which is generally immune to such degeneracies. 
Extensions of the basic model include PARAFAC2, which permits oblique factors while maintaining consistency constraints on interfactor angles across covariance matrices, and the PARATUCK family of models, which combines the intrinsic axis capabilities of PARAFAC with the greater structural generality of Tucker three-mode factor analysis [43].

These unsupervised methods are particularly valuable for EEM spectroscopy because they handle the three-dimensional structure of EEM matrices directly, resolve spectral overlaps, extract interpretable components, reduce dimensionality while preserving chemical meaning, enable both exploratory analysis and quantitative prediction, and facilitate the integration of spectroscopic data with machine learning.
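
As a compact illustration of the trilinear structure described above, the PARAFAC model for a stack of EEMs can be written element-wise as shown below. The notation is an assumption introduced here for illustration only: $x_{ijk}$ denotes the fluorescence intensity of sample $i$ at emission wavelength $j$ and excitation wavelength $k$, $F$ is the number of components, $a_{if}$, $b_{jf}$, and $c_{kf}$ are the score, emission loading, and excitation loading of component $f$, and $e_{ijk}$ is the residual.

```latex
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk},
\qquad i = 1,\dots,I, \quad j = 1,\dots,J, \quad k = 1,\dots,K
```

Under this model, each component contributes the same pair of spectral profiles to every sample, scaled only by a sample-specific score; this fixed proportionality across slices is what allows the factor axes to be identified without a separate rotation step.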

3.3.3. Supervised Learning

Supervised learning uses labeled training data to build models that predict outcomes for new, unlabeled data. In biomedical diagnostics, this approach is crucial for developing classification models that distinguish between disease states. Supervised learning differs fundamentally from unsupervised learning in that it requires explicit labels indicating the correct categorization. The typical workflow involves splitting data into training and test sets, training the model on labeled training data, and evaluating performance on independent test data to assess generalization.

In practice, a training dataset is first defined to teach the model to recognize patterns. The most suitable algorithm is selected according to the data type and analytical objective, and the model is trained. Once trained, its performance is evaluated using an independent test dataset. If the results are unsatisfactory, the process iterates by adjusting the parameters, improving the data, or selecting another algorithm. Once the model meets the predefined quality criteria, it is approved as the final classifier and is ready for implementation. This iterative approach helps ensure that the model is robust, accurate, and useful for classification tasks in real-world scenarios [44]. The classification algorithms used in supervised learning include the following:

-: Logistic regression (LR) is a statistical method introduced for the analysis of binary sequences, where the outcome of each observation takes one of two forms, such as success or failure, or presence or absence of a condition. The model is formulated through the logistic law, which expresses the probability of a binary outcome as a function of one or more predictor variables via the logit transformation: the log-odds of the outcome are modeled as a linear combination of the predictors, ensuring that estimated probabilities remain bounded within the interval [0, 1]—a fundamental constraint that linear regression cannot guarantee. The parameter β associated with each predictor has a well-defined interpretation: when the probability of the outcome is small, β approximates the fractional change in that probability per unit increase in the predictor; when the probability of the complementary outcome is small, β approximates the corresponding fractional change in that quantity. The regression coefficients can be estimated through maximum likelihood or minimum logit χ², two asymptotically equivalent approaches that extend naturally to multiple predictor variables through standard iterative or non-iterative multiple regression calculations [45]. Owing to these properties, LR has become widely used in biomedical research for predicting whether a set of conditions will or will not result in disease or a clinically relevant outcome, with the probability of class membership derived directly from a linear combination of the available features. However, its intrinsic linear nature implies a reduced predictive capacity when the relationship between predictors and the outcome is non-linear, which represents a fundamental limitation in datasets where complex interactions among variables are present [46].
-: Support vector machines (SVMs) are supervised machine learning algorithms grounded in statistical learning theory, originally developed for binary classification and subsequently extended to multi-class problems. Their core operating principle consists of identifying an optimal decision boundary that maximizes the separation between classes, a property that confers notable generalization capacity even in complex classification scenarios. When data are not linearly separable, SVMs employ kernel functions to project the input features into higher-dimensional spaces, where class discrimination becomes feasible, thereby allowing the model to capture non-linear relationships between variables. These characteristics have made SVMs particularly well-suited for biomedical applications, where they have been applied for over two decades to tasks such as clinical diagnosis, disease prognosis, biomarker identification, and prediction of treatment outcomes, owing to their high precision and robustness in handling high-dimensional data. Moreover, SVMs have been integrated into personalized medicine frameworks, where their capacity to simultaneously process genomic, demographic, and clinical information supports the identification of patient subgroups and the prediction of individualized therapeutic responses. Although SVMs have limitations related to computational cost and sensitivity to hyperparameter selection, the development of improved variants has substantially broadened their applicability to complex, often imbalanced datasets encountered in real-world biomedical research [47,48].
-: Decision trees are supervised machine learning classifiers that assign class labels to data items by recursively partitioning the feature space through a hierarchical series of questions. Each internal node evaluates a feature-based condition and directs items along branches toward terminal leaf nodes, where class assignments are made. The quality of each partition is measured using impurity criteria, such as entropy or the Gini index, and the algorithm selects the question that minimizes the weighted average impurity of the resulting subsets, thereby maximizing information gain at each split. To prevent overfitting, tree complexity is controlled either by early stopping or by post-construction pruning, in which internal nodes are collapsed into leaves when doing so reduces the classification error on held-out examples. Decision trees are flexible enough to handle both real-valued and categorical features simultaneously, as well as datasets with missing values. Furthermore, ensemble strategies, which aggregate predictions from multiple trees trained on bootstrapped subsets of the data using random feature subsets, can reduce the chance that a new example will be misclassified by avoiding commitment to a single tree [49].
-: Random Forest (RF) is an ensemble machine learning method that generates a collection of decision trees, each trained on a random subset of the available data and a random selection of input features, and combines their individual outputs to produce a final prediction. The use of bootstrap aggregation ensures that each tree is built on a different portion of the training data, which introduces diversity among the trees and reduces the correlation between them. The generalization error of an RF converges to a stable limit as the number of trees increases, a property derived from the law of large numbers, which explains why the algorithm does not overfit as the ensemble grows larger. The accuracy of the model depends on two fundamental properties: the strength of the individual trees and the degree of correlation between them, such that forests composed of stronger and less correlated trees consistently yield lower generalization error. A particularly valuable property of RF is its resistance to noise in the output labels, as the algorithm does not concentrate weight on misclassified instances in the same manner as boosting methods, making it more robust under real-world data conditions. The algorithm is applicable to both classification and regression tasks, offers a built-in estimation of variable importance through measures that quantify how much each feature contributes to reducing prediction error across the trees, and can effectively handle datasets with missing values and class imbalance without requiring extensive preprocessing. Furthermore, out-of-bag samples, which are the observations not used in constructing each individual tree, serve as an internal validation set that provides unbiased estimates of the generalization error without the need for a separate test dataset. 
These characteristics collectively make RF one of the most efficient and widely adopted algorithms in machine learning, with demonstrated utility across diverse domains, including health applications such as disease diagnosis and patient outcome prediction [50,51].
-: Naïve Bayes is one of the most efficient and widely validated probabilistic classifiers for disease prediction, occupying a prominent place among the supervised learning algorithms applied to biomedical diagnosis. Based on Bayes’ theorem, this algorithm estimates the posterior probability of a given condition from the independent contribution of multiple predictor variables, assuming conditional independence among features, an assumption that, despite its simplicity, yields remarkably competitive predictive performance. Its structure can be represented as a directed acyclic graph, in which a single parent node is connected to all predictive variables, and within each node, a variable can take multiple values associated with a specific probability. A systematic review encompassing 23 studies and 53,725 patients demonstrated that Naïve Bayesian networks achieved superior predictive performance in most disease categories evaluated, with 80% of the studies reporting accuracy values above 75%, and more than half exceeding an area under the receiver operating characteristic (ROC) curve of 0.80, a threshold classified as having good discriminative ability. Across diverse clinical scenarios, including brain disease, cancer, diabetic kidney disease, and cardiovascular conditions, the classifier consistently outperformed or performed comparably to other algorithms, such as LR, SVMs, and neural networks. Beyond its predictive accuracy, Naïve Bayes supports clinical decision-making by merging multiple patient characteristics into a unified probability estimate, aiding physicians in both diagnosis and prognosis. This performance across heterogeneous disease domains aligns with the broader recognition that machine learning methods enhance diagnostic precision and facilitate the integration of complex multivariate data into actionable clinical insights [12,52].
-: k-Nearest Neighbor (KNN) is a supervised, non-parametric machine learning method, meaning that it does not impose any assumptions on the underlying distribution of the data. Its operating principle is grounded in the premise that observations located close to one another tend to belong to the same category; thus, the class of a new data point is determined by a majority vote among its k nearest neighbors in the feature space. A recognized advantage of this algorithm is its simplicity and suitability for multi-class classification problems, particularly when the dataset is not large. In the large-sample limit, the probability of error of the single nearest-neighbor classifier is bounded above by twice the Bayes probability of error, implying that at least half of the classification information available in an infinite sample set is contained in the nearest neighbor. In practice, the selection of the number k of neighbors represents the most critical step in the training process, as small values yield unstable models, whereas k = 5 has been shown to perform reliably across a wide variety of datasets. Unlike other machine learning algorithms, KNN does not construct an explicit function during training; instead, it retains observations in memory and draws upon them when classifying new data points. Neighborhood computation can be performed using distance metrics, such as Euclidean or Manhattan distance, and search algorithms, including the k-d tree and ball tree, allowing the computational complexity to be reduced as a function of the number of training samples and the dimensionality of the feature space [53,54].
-: Artificial Neural Networks (ANN) are computational systems that attempt to simulate, in a simplified manner, the organization of biological neural networks, borrowing from neurophysiological knowledge of neurons and their interconnections. The basic computational unit of an ANN is an artificial neuron, which receives weighted inputs $x_{i}$ scaled by adjustable parameters $w_{i}$, computes their sum $z = \sum w_{i} x_{i}$, and passes the result through a non-linear activation function to produce an output $y = f_{N}(z)$. Multiple such units are organized into an architecture comprising an input layer, one or more hidden layers, and an output layer; the number of hidden layers and nodes in each layer are critical factors that directly influence modeling accuracy and the convergence of learning. Training an ANN consists of iteratively adjusting the connection weights to minimize a defined error function between the actual and desired outputs. The back-propagation algorithm, proposed by Rumelhart, Hinton, and Williams in 1986, provided an effective solution for setting weights in multilayer networks, extending the representational capacity of ANN to problems of essentially no bound in complexity. By virtue of this architecture, ANN can solve complex, mathematically ill-defined, non-linear, and stochastic problems through simple computational operations, with a self-organizing feature that allows them to generalize across a wide range of problems without reprogramming. In supervised learning, the network learns the correlation between input data and desired outcomes and, after training, can generate appropriate outputs in response to new inputs, a property known as generalization. 
Deep learning extends this principle through architectures composed of multiple processing layers that learn hierarchical representations of data, with each layer transforming its input into progressively more abstract representations; this approach has demonstrated breakthroughs in image recognition, speech recognition, drug discovery, and genomics [55,56,57].
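
To make the majority-vote principle of KNN described in the list above concrete, the following minimal sketch classifies a new observation from stored training data. It is an illustration under stated assumptions rather than an implementation taken from any reviewed study: the function name `knn_predict`, the two-feature toy data (standing in, for example, for two component scores per sample), and the class labels are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training observation
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # KNN fits no explicit model: it simply keeps the k closest stored points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per sample, two classes
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
           (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["healthy", "healthy", "healthy",
           "disease", "disease", "disease"]

print(knn_predict(train_X, train_y, (0.2, 0.2), k=3))
print(knn_predict(train_X, train_y, (0.7, 0.8), k=3))
```

With two classes, an odd value of k avoids tied votes; in this sketch a tie would be resolved in favor of the label encountered first among the nearest neighbors, which is an arbitrary choice.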

Together, these supervised learning algorithms provide flexible solutions for diverse classification and regression tasks, with algorithm selection depending on the data characteristics and problem requirements [58]. In the specific context of EEM-based diagnostics, this selection is additionally conditioned by the structural properties of spectral data: the high dimensionality of unfolded EEM matrices favors methods robust to the curse of dimensionality, such as SVMs and regularized discriminant approaches, while the availability of physically meaningful dimensionality reduction through PARAFAC or Tucker3 decomposition allows the use of simpler classifiers operating on a reduced set of chemically interpretable scores. The interplay between these data-driven and model-driven dimensionality reduction strategies is a defining feature of the EEM–machine learning pipeline and is discussed in detail in the workflow section (Section 3.3.5).
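
The dimensionality issue mentioned above can be illustrated with a short sketch of unfolding, in which each sample's EEM is flattened into a single row so that conventional two-way classifiers can be applied. The function name `unfold_eem` and the toy intensities are hypothetical, and the layout (one list of emission intensities per excitation wavelength) is an assumption made only for this example.

```python
def unfold_eem(eem_stack):
    """Flatten samples x excitation x emission into samples x (excitation * emission)."""
    unfolded = []
    for eem in eem_stack:            # one EEM per sample
        row = []
        for excitation_row in eem:   # emission intensities at one excitation wavelength
            row.extend(excitation_row)
        unfolded.append(row)
    return unfolded


# Hypothetical toy stack: 2 samples, 2 excitation and 3 emission wavelengths
eem_stack = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [[0.5, 1.5, 2.5], [3.5, 4.5, 5.5]],
]

X = unfold_eem(eem_stack)
print(len(X), len(X[0]))  # number of samples, number of unfolded variables
```

The number of columns equals the product of the numbers of excitation and emission wavelengths, so the variable count grows quickly with spectral resolution while the number of samples stays fixed.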

3.3.4. Model Evaluation Metrics

Evaluating the performance of a machine learning model requires the selection of metrics that are appropriate for the type of analytical problem being addressed. In classification tasks, the most commonly reported metrics are derived from the confusion matrix, which summarizes the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) produced by the model on a test set. Sensitivity (SENS), also referred to as the true positive rate or recall, measures the proportion of positive cases correctly identified by the model and is defined according to Equation (1).

$$\mathrm{SENS}\ (\%) = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \times 100 \tag{1}$$

Specificity (SPEC), or true negative rate, reflects the model’s ability to correctly identify negative cases (Equation (2)).

$$\mathrm{SPEC}\ (\%) = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} \times 100 \tag{2}$$

These two metrics are primary measures of diagnostic test performance and have been widely used to characterize the operating characteristics of both clinical and spectroscopic classification systems [59]. The overall classification accuracy (ACC) expresses the proportion of all samples that were correctly classified (Equation (3)).

$$\mathrm{ACC}\ (\%) = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}} \times 100 \tag{3}$$

However, accuracy alone can be misleading when class sizes are unbalanced, as a model that consistently predicts the majority class may still achieve high accuracy without any true discriminative ability. In such cases, the Matthews correlation coefficient (MCC) provides a more reliable and informative summary of classification quality, as it accounts for all four entries of the confusion matrix and has been shown to be superior to both accuracy and the F1 score for binary classification evaluation [60]. MCC is defined according to Equation (4):

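$$\mathrm{MCC} = \frac{(\mathrm{TP} \times \mathrm{TN}) - (\mathrm{FP} \times \mathrm{FN})}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}} \tag{4}$$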

MCC values range from −1 to +1, where +1 indicates perfect classification, 0 corresponds to random prediction, and −1 reflects complete misclassification. The area under the ROC curve (AUC) is also widely used in biomedical applications, as it evaluates model discriminative ability across all possible classification thresholds by plotting the true positive rate against the false positive rate, with values approaching 1.0 indicating superior performance [60]. No single measure fully captures model quality in isolation, and reporting a complementary set of figures of merit is considered best practice for robust performance evaluation in biomedical classification problems [61].
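
As a worked illustration of Equations (1)–(4), the sketch below computes the four confusion-matrix metrics and shows how accuracy can remain high on an imbalanced test set while MCC reveals the absence of discriminative ability. The function name `classification_metrics` and the counts are hypothetical; setting MCC to 0 when its denominator vanishes is a common convention adopted here as an assumption, not something prescribed by the reviewed studies. The function also assumes that both classes are present in the test set.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return SENS, SPEC, ACC (in %) and MCC from confusion-matrix counts."""
    sens = 100.0 * tp / (tp + fn)                   # Equation (1)
    spec = 100.0 * tn / (tn + fp)                   # Equation (2)
    acc = 100.0 * (tp + tn) / (tp + tn + fp + fn)   # Equation (3)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: MCC is reported as 0 when the denominator is zero
    mcc = ((tp * tn) - (fp * fn)) / denom if denom > 0 else 0.0  # Equation (4)
    return {"SENS": sens, "SPEC": spec, "ACC": acc, "MCC": mcc}


# Hypothetical, roughly balanced test set
balanced = classification_metrics(tp=40, tn=45, fp=5, fn=10)

# Hypothetical imbalanced test set; the model always predicts "negative"
majority_only = classification_metrics(tp=0, tn=95, fp=0, fn=5)

print(balanced)
print(majority_only)
```

In the second case the accuracy is 95% even though no positive sample is detected, whereas the sensitivity and MCC are both zero.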

For regression and quantification tasks, model performance is assessed using error-based metrics. The root mean square error of cross-validation (RMSECV) (Equation (5)) and root mean square error of prediction (RMSEP) (Equation (6)) quantify the average deviation between the predicted and reference values during cross-validation and external prediction, respectively [62]:

$$\mathrm{RMSECV} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}} \tag{5}$$

$$\mathrm{RMSEP} = \sqrt{\frac{1}{n_{\mathrm{pred}}} \sum_{i=1}^{n_{\mathrm{pred}}} (y_{i} - \hat{y}_{i})^{2}} \tag{6}$$

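where $y_{i}$ are the reference values, $\hat{y}_{i}$ are the model predictions, $n$ is the number of samples, and $n_{\mathrm{pred}}$ is the number of samples in the external prediction set. The coefficient of determination (R²) expresses the proportion of variance in the response variable explained by the model (Equation (7)), where $\overline{y}$ denotes the mean of the reference values [63]: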

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \overline{y})^{2}} \tag{7}$$

These metrics have been applied in spectroscopy-based diagnostic studies to evaluate multivariate classification models built on biological fluid samples, wherein reporting a complementary set of figures of merit, including sensitivity, specificity, accuracy, F-score, and AUC, is considered essential for assessing discriminant ability on unknown test samples [64]. Together, this set of metrics provides a comprehensive framework for assessing model performance across diverse analytical objectives encountered in EEM fluorescence spectroscopy applications.
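
The error-based metrics in Equations (5)–(7) can be sketched in a few lines. RMSECV and RMSEP share the same formula and differ only in which predictions are supplied (cross-validated predictions versus predictions for an external set), so a single helper is used for both. The function names and the reference and predicted values below are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error; gives RMSECV or RMSEP depending on the predictions supplied."""
    n = len(y_true)
    return math.sqrt(sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, following Equation (7)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference concentrations and model predictions
y_ref = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]

error = rmse(y_ref, y_hat)
r2 = r_squared(y_ref, y_hat)
print(round(error, 4), round(r2, 4))
```

For these toy values the squared residuals sum to 0.10 and the total sum of squares is 5, giving an error of about 0.158 and an R² of 0.98.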

3.3.5. Machine Learning Workflow

The machine learning workflow consists of a series of systematically ordered stages that guide the design, training, evaluation, and deployment of predictive or descriptive models (Figure 4). Although the specific structure may vary depending on the nature of the dataset and analytical objective, certain fundamental stages are common to all machine learning pipelines. The first stage involves defining the analytical objective, which determines whether the problem requires the classification of samples into discrete categories, regression for the quantification of continuous responses, or unsupervised exploration of latent data structure. This distinction directly conditions the choice of algorithm, design of the training and test sets, and performance metrics selected for model evaluation. The subsequent stage encompasses data acquisition and characterization, in which the sample set must be designed to ensure sufficient size, class balance, and representativeness of the target population. In the context of EEM fluorescence spectroscopy, this stage also involves verifying that the spectral variables carry chemically informative content relevant to the analytical objective, typically assessed through preliminary exploratory analysis prior to model construction.

Once the data are acquired, preprocessing transforms raw spectral measurements into a clean, standardized format suitable for analysis, as described in Section 3.3.1. Common operations include baseline correction, normalization, noise filtering, and mean centering, each of which addresses specific sources of systematic distortion that could otherwise propagate into the model and compromise its interpretability. It is worth noting that the order in which these steps are applied is not arbitrary; different sequences can yield substantially different results, and careful documentation of preprocessing choices is essential for reproducibility and meaningful comparisons across studies.
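
The dependence on preprocessing order can be shown with a deliberately simple sketch, assuming two hypothetical spectra that share the same shape and differ only in overall intensity. The helper names and data are illustrative, and row-wise unit-norm scaling stands in for whichever normalization a given study applies.

```python
import math


def unit_norm_rows(spectra):
    """Scale each spectrum (row) to unit Euclidean norm."""
    out = []
    for row in spectra:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm for v in row])
    return out


def mean_center_columns(spectra):
    """Subtract the across-sample mean from each spectral variable (column)."""
    n = len(spectra)
    means = [sum(col) / n for col in zip(*spectra)]
    return [[v - m for v, m in zip(row, means)] for row in spectra]


# Hypothetical spectra: identical shape, second sample twice as intense
spectra = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]

normalize_then_center = mean_center_columns(unit_norm_rows(spectra))
center_then_normalize = unit_norm_rows(mean_center_columns(spectra))

print(normalize_then_center)
print(center_then_normalize)
```

Normalizing first removes the intensity difference, so centering leaves essentially zeros; centering first preserves the difference, and normalization then turns the two samples into vectors of opposite sign. The same two operations therefore produce very different model inputs depending on their order.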

Following preprocessing, an appropriate algorithm is selected according to the nature of the data and the analytical objective, drawing from the unsupervised and supervised methods described in Section 3.3.2 and Section 3.3.3. For exploratory tasks, such as identifying latent groupings or reducing the dimensionality of EEM matrices prior to classification, unsupervised methods, including PCA, PARAFAC, and hierarchical clustering, are commonly applied as the first step. When the objective is the classification or quantification of labeled data, supervised algorithms, such as LR, SVMs, RF, and ANN, are employed. In both cases, the model is trained on a defined subset of the available data, and its parameters are iteratively optimized to minimize a defined error or loss function. The partitioning of data into training, validation, and test sets is a critical methodological decision because it directly affects the estimates of generalization performance and the risk of overfitting.

Model evaluation is performed using independent data that were not involved in the training, applying the metrics described in Section 3.3.4 according to the problem type. For classification tasks, sensitivity, specificity, accuracy, MCC, and AUC provide complementary perspectives on model performance, particularly when class sizes are unbalanced. For regression and quantification tasks, RMSECV, RMSEP, R², and the predictive squared correlation coefficient (Q²) are standard figures of merit. Validation strategies, such as k-fold cross-validation, leave-one-out cross-validation, and split-half validation, the latter being particularly common in PARAFAC modeling of EEM data, help assess how well the model generalizes to new samples. The use of the Kennard–Stone algorithm for sample selection helps ensure that the training and test sets are representative of the full data distribution, thereby reducing the risk that model performance reflects peculiarities of a particular subset rather than the true discriminative capacity. When the performance is unsatisfactory, the workflow returns iteratively to earlier stages to adjust the preprocessing parameters, select alternative algorithms, or expand the dataset before proceeding to the final evaluation.
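
A minimal sketch of Kennard–Stone sample selection is given below, assuming Euclidean distance and the usual formulation of the procedure: the two most distant samples are selected first, and the sample whose nearest already-selected neighbor is farthest away is then added repeatedly. The function name `kennard_stone` and the one-dimensional toy scores are hypothetical.

```python
import math


def kennard_stone(X, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone procedure."""
    n = len(X)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]

    # Step 1: start with the two samples that are farthest apart
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [i0, j0]
    remaining = [i for i in range(n) if i not in selected]

    # Step 2: add the sample whose closest selected sample is farthest away
    while len(selected) < n_select:
        best = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical one-dimensional scores for five samples
X = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]
train_idx = kennard_stone(X, n_select=3)
test_idx = [i for i in range(len(X)) if i not in train_idx]
print(train_idx, test_idx)
```

Here the extremes (0 and 10) are chosen first and the mid-range sample (5) is added next, so the training set spans the data range while the two remaining samples form the test set.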

The final stage of the workflow involves interpretation of the model outputs in relation to the underlying chemical and biological information. In EEM spectroscopy applications, this means linking the spectral regions identified as the most informative by the model to specific fluorophores and their associated biological processes. Tools such as PCA biplots, PARAFAC spectral profiles, SVM recursive feature elimination, and RF variable importance scores allow the analyst to identify which excitation–emission combinations contribute the most to class discrimination or quantitative prediction. Fluorophores such as NADH, tryptophan, collagen, and advanced glycation end products occupy characteristic regions of the EEM landscape, and their differential expression across sample classes provides a biochemically grounded basis for model interpretation. This interpretive step is not merely descriptive; it serves as an internal consistency check that connects statistical performance to chemically meaningful patterns, strengthening confidence in the model and supporting its potential translation to clinical or analytical applications.

4. Biomedical Applications of EEM Fluorescence Spectroscopy Coupled with Machine Learning

The integration of EEM fluorescence spectroscopy with machine learning and chemometric algorithms has been explored across a broad range of biomedical contexts, from oncological and neurological disorders to infectious disease diagnosis and therapeutic drug monitoring, reflecting the growing application of multivariate analysis models to spectroscopic data in recent years [65,66,67,68,69] (Figure 5).

Table 1 summarizes the studies included in this review, organized by disease category and reporting the biological matrix, preprocessing strategies, classification or regression models applied, and key validation metrics. The following subsections discuss the methodological approaches and diagnostic performance reported for each application area, with attention to the analytical strategies that have demonstrated the strongest clinical potential.

4.1. Cancer Detection

The early detection of cancer is an increasing area of interest for alternative biomedical diagnostics. In this context, EEM fluorescence spectroscopy combined with chemometric analysis has shown considerable promise. Researchers have applied EEM fluorescence spectroscopy to detect prostate cancer through spectral analysis of urine samples, aiming to develop a preliminary screening model that is non-invasive, rapid, and reliable. A total of 69 urine samples were analyzed, including 46 from patients with histologically confirmed prostate cancer and 23 from healthy donors. The data were processed using PARAFAC, which resolved the complex overlap of the spectral bands and extracted chemically interpretable components. Four major fluorophores were identified: pteridines, pyridoxic acid, NADH (both free and protein-bound), and flavins. Significant differences were observed between the healthy and cancerous groups. Higher concentrations of pteridines and pyridoxic acid were found in subgroups of cancer patients, whereas free NADH and flavins were reduced in cancer-positive samples. These findings align with previous studies linking altered pteridine biosynthesis to neoplastic processes and associating pyridoxic acid, a catabolite of vitamin B6, with reduced vitamin B6 levels in patients with cancer. The PARAFAC-extracted scores were then used as input variables in linear discriminant analysis (LDA) and partial least squares discriminant analysis (PLS-DA) classification models. During calibration, all cancer cases were correctly classified, with misclassifications observed in only four healthy samples, which displayed atypical spectral landscapes and warrant further investigation. In the prediction phase, the models achieved a sensitivity of 94.5% and specificity of 89.7%, reinforcing the potential of this methodology as a non-invasive diagnostic tool. 
Despite these highly promising results, the authors emphasized that the limited sample size and lack of representativeness precluded broad generalization [70].

In another study, researchers conducted a pilot investigation to assess the feasibility of EEM fluorescence spectroscopy for the detection of precancerous cervical lesions as a preliminary step toward a multicenter phase II trial. The study enrolled 58 women aged over 18 years, who completed questionnaires and underwent medical history evaluation, physical examination, and pan-colposcopy. Spectral measurements were collected from one to three cervical sites using a fiber-optic probe, producing excitation–emission fluorescence matrices. Spectral data were analyzed using multivariate algorithms, including LDA, LR, and composite-spectrum-based classification models. These models correctly identified normal squamous tissue in 99% of cases but showed poor accuracy (7%) in the classification of columnar tissue, a limitation attributed to factors such as inflammation, human papillomavirus (HPV) infection, tissue structural variability, and instrumental artifacts. Incorporating additional excitation wavelengths (380 and 460 nm) improved diagnostic sensitivity and specificity, and a composite algorithm based on measurements at 337, 380, and 460 nm was proposed to enhance tissue discrimination, achieving a sensitivity of 86% and specificity of 74% in the diagnostic mode [71].

The limitations observed in the spectral discrimination of columnar epithelium, largely attributable to morphological and biochemical factors, underscore the need for more robust approaches that integrate multiple excitation wavelengths with multivariate analysis. In this regard, it has been demonstrated that the combination of targeted excitation strategies with statistical modeling can identify early neoplastic changes in skin tissue, even before morphological alterations become apparent. Researchers applied EEM fluorescence spectroscopy combined with multivariate statistical analysis to a murine model of skin carcinogenesis induced by DMBA/TPA. Autofluorescence signals were recorded weekly for 15 weeks from treated animals, acetone controls, and blank groups using a fiber-optic-coupled spectrofluorometer. Excitations ranged from 280 to 460 nm, with emissions between 300 and 750 nm. Spectral changes were detected as early as the first week of tumor induction, preceding visible morphological alterations. Three discriminant analysis levels (DA-1, DA-2, DA-3) were employed, enabling the classification of progressive tissue transformation stages ranging from inflammation and dysplasia to invasive squamous cell carcinoma. The integration of spectral variables from multiple wavelengths (280, 320, 340, and 410 nm) improved classification accuracy by 11.6% compared to univariate analysis. Intensity ratios between fluorophores, such as tryptophan, collagen, NADH, and porphyrins, were found to be particularly discriminative [72].

These findings highlight the sensitivity of EEM fluorescence spectroscopy for detecting biochemical alterations linked to neoplastic processes. A similar rationale has been applied to the non-invasive detection of colorectal cancer (CRC) based on plasma fluorescence. A hierarchical framework integrating fluorescence spectroscopy with SVMs was proposed to classify samples into three categories: CRC, adenomas, and non-malignant findings. Given the high dimensionality of the spectral data (>12,000 variables), feature selection was performed using an SVM–Recursive Feature Elimination (RFE) algorithm. The hierarchical classifier consisted of two stages: a binary SVM trained to detect CRC samples and a one-class SVM trained exclusively on healthy samples to detect spectral deviations associated with non-malignant conditions. The results demonstrated a sensitivity of 86.4% and specificity of 95.2% for CRC detection, with an AUC of 0.933, while the use of a reduced number of wavelengths selected through SVM-RFE optimized the spectral acquisition time, enhancing clinical feasibility [73].

The search for efficient and clinically viable spectral methods has driven the development of alternative approaches. In one study, three-dimensional fluorescence spectra (3D-FS) of human blood plasma were combined with Tchebichef moments (TM) and PLS-DA. This hybrid model extracted robust features resistant to noise and spectral overlap, allowing the accurate classification of CRC, adenomas, and non-malignant findings from 225 plasma samples. Model performance was validated using ROC curve analysis and cross-validation, achieving accuracy of 84% and an AUC of 0.94 for CRC detection, with sensitivity of 88% and specificity of 85%. Compared with other chemometric approaches, TM-PLS-DA demonstrated a strong balance between diagnostic accuracy and computational efficiency [74].

A complementary perspective on CRC screening was provided by a metabonomic study that analyzed EEM spectra from 308 blood plasma samples (including 77 patients with CRC, 77 with adenomas, 77 with other non-malignant diagnoses, and 77 with no findings) using PARAFAC, followed by PLS-DA. This approach extracted scores from three parallel PARAFAC models built under different dilution conditions, pooling 19 spectral variables as classifier inputs. Ten-fold cross-validation yielded a sensitivity of 0.74, specificity of 0.71, and AUC of 0.69 for CRC versus all controls. Performance improved when comparing CRC with specific subgroups: the AUC reached 0.76 when compared with other non-malignant findings and 0.75 against no-finding controls. Although these metrics are moderate, they are comparable to those of established biomarkers, such as carcinoembryonic antigen (CEA), and the study’s strength lies in demonstrating that autofluorescence in plasma carries metabolic information that is discriminative for early-stage CRC [8].

Research has also expanded toward gynecological malignancies. A study explored the potential of autofluorescence for the early detection of endometrial cancer (EC), the most prevalent gynecological malignancy in developed countries. The autofluorescence of biological fluids enables the detection of endogenous fluorescent metabolites, such as vitamins, coenzymes, structural proteins, and tryptophan-derived catabolites. Three-dimensional synchronous fluorescence spectroscopy, measured in a constant wavelength mode (CWM) across 118 blood serum samples (73 EC, 45 healthy controls), was combined with machine learning classifiers, including RF, SVMs, LR, and stochastic gradient descent (SGD). Following class balancing via Synthetic Minority Oversampling Technique (SMOTE) and spectral region selection, the best-performing model (LR at Δλ = 120 nm) achieved a sensitivity of 0.94, specificity of 0.89, and AUC of 0.94. The fluorescence intensity ratios associated with NADH, FAD, and tryptophan metabolism were identified as the most discriminative features, providing a biochemically interpretable basis for classification [75].

4.2. Neurological and Metabolic Diseases

Neurodegenerative disorders represent a major clinical challenge, and EEM fluorescence spectroscopy has recently emerged as a promising non-invasive diagnostic tool, particularly for Alzheimer’s disease. The applicability of this technique was evaluated using dried plasma from 230 individuals, including 83 patients with Alzheimer’s disease and 147 healthy controls. The excitation and emission ranges were set at 250–350 nm and 300–850 nm, respectively, and the spectral data were processed using two second-order algorithms, PARAFAC–Quadratic Discriminant Analysis (QDA) and Tucker3-QDA, implemented in the MATLAB 2022 environment (MathWorks Inc., Natick, MA, USA), with a Kennard–Stone split of 162/34/34 samples for the training, validation, and test sets. Both classifiers performed strongly: PARAFAC-QDA achieved a sensitivity of 83.33%, specificity of 100%, and an F2-score of 86.21%, whereas Tucker3-QDA yielded a sensitivity of 91.67%, specificity of 95.45%, and an F2-score of 91.67%. The overall accuracy was 94.12%, with an MCC of 0.87. The most relevant wavelengths for classification, tentatively associated with NADH, mitochondrial dysfunction, and β-amyloid aggregation, reinforced the study’s biological plausibility. These findings indicate that EEM spectroscopy, when combined with second-order multivariate algorithms, has the potential to become an accessible and cost-effective tool for the early diagnosis of Alzheimer’s disease [24].
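
The Kennard–Stone partitioning mentioned above can be illustrated with a minimal sketch. It assumes Euclidean distances on the raw feature vectors and uses a hypothetical five-sample toy dataset; the function name `kennard_stone` and all values are illustrative and are not taken from the cited study, which may have applied the algorithm to scores or with a different distance.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return indices of n_select rows of X chosen by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select and remaining:
        # For each candidate, distance to its nearest already-selected sample.
        min_d = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the candidate that lies farthest from the selected set.
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return selected

# Toy example: five one-dimensional "spectra" (hypothetical values).
X_toy = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
train_idx = kennard_stone(X_toy, 3)
test_idx = [k for k in range(len(X_toy)) if k not in train_idx]
print(train_idx, test_idx)
```

Because the rule favors samples at the edges of the data cloud, the training set spans the observed spectral variability, while the held-out samples tend to lie in its interior; this is worth keeping in mind when interpreting test-set performance.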

4.3. Analysis of Bioactive Compounds and Hormonal Contaminants

A multivariate spectrofluorometric methodology was proposed for the simultaneous determination of two synthetic hormones, the estrogen ethinylestradiol (EE) and the progestin norgestimate (NOR), which are widely used in contraceptive formulations and are recognized for their endocrine-disrupting potential. The strategy relied on steady-state fluorescence spectra obtained from the interaction of the analytes with human serum albumin (HSA) and bovine serum albumin (BSA). These spectral variations were modeled using partial least squares (PLS) regression, which effectively overcame the limitations of univariate methods in terms of spectral shifts and overlapping signals. The developed PLS models demonstrated excellent analytical performance, with detection limits of 1.6 × 10⁻⁸ M for EE and 2.4 × 10⁻⁷ M for NOR, correlation coefficients above 0.9949, and prediction errors below 6%. In comparison, univariate regression methods yielded errors exceeding 40%, underscoring their poor capacity to manage spectral interferences [76].

In addition to its applications in oncology and neurodiagnostics, fluorescence spectroscopy has also proven valuable in the analysis of functionally relevant biomolecules, such as enzymes. A chemometric approach was presented for the detection and quantification of extracellular lipase isolated from Serratia marcescens. EEM spectra were acquired over excitation and emission ranges of 200–350 and 200–800 nm, respectively, generating a three-dimensional data cube that was subsequently analyzed using multivariate regression. A radial basis function artificial neural network (RBF-ANN) optimized by a genetic algorithm (GA) was implemented, enabling spectral region selection and improving the predictive capacity. The GA-RBF-ANN model achieved a Q² of 0.919, markedly outperforming N-PLS. Additionally, PARAFAC analysis decomposed the spectra into trilinear components, identifying three major fluorophores in the protein fractions. One component showed a strong correlation with lipase activity, suggesting that fluorescence could serve as an indirect indicator of enzymatic functionality without external markers or destructive procedures [77].
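
The trilinear structure that PARAFAC assumes can be made concrete with a short sketch. It builds a synthetic three-way array from hypothetical score and loading matrices and then recovers the scores of a new EEM by least-squares projection onto fixed loadings. All names (`A`, `B`, `C`, `project_scores`) and dimensions are assumptions for illustration; fitting the loadings themselves (for example by alternating least squares) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_ex, n_em, n_comp = 6, 8, 12, 3

# Hypothetical non-negative factor matrices (not taken from any study).
A = rng.uniform(0.1, 1.0, size=(n_samples, n_comp))  # sample scores
B = rng.uniform(0.1, 1.0, size=(n_ex, n_comp))       # excitation loadings
C = rng.uniform(0.1, 1.0, size=(n_em, n_comp))       # emission loadings

# Trilinear model: X[i, j, k] = sum_f A[i, f] * B[j, f] * C[k, f]
X = np.einsum("if,jf,kf->ijk", A, B, C)

def project_scores(eem, B, C):
    """Least-squares scores of one EEM (n_ex x n_em) on fixed loadings B, C."""
    n_comp = B.shape[1]
    # Column f of Z is the vectorized outer product of B[:, f] and C[:, f].
    Z = np.stack(
        [np.outer(B[:, f], C[:, f]).ravel() for f in range(n_comp)], axis=1
    )
    scores, *_ = np.linalg.lstsq(Z, eem.ravel(), rcond=None)
    return scores

# A new sample with known scores is recovered from its EEM alone.
true_scores = np.array([0.5, 0.2, 0.9])
new_eem = np.einsum("f,jf,kf->jk", true_scores, B, C)
print(np.round(project_scores(new_eem, B, C), 6))
```

In this noise-free setting the projection returns the generating scores; with real spectra the recovered scores are estimates whose quality depends on how well the fixed loadings describe the new sample.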

The versatility of EEM-based methods extends to therapeutic drug monitoring, where fluorescence and multivariate analysis support the detection of bioactive compounds in complex biological matrices. An innovative method was developed for the simultaneous detection and quantification of carbamazepine (CBZ) and its epoxide metabolite (CBZ-EP) in blood serum and pharmaceutical formulations. Given the widespread clinical use of CBZ for epilepsy and psychiatric disorders, accurate monitoring of its plasma concentration is crucial to ensure therapeutic efficacy and safety. The method combines solid-support fluorescence on nylon membranes with chemometric models, including PARAFAC, Self-Weighted Alternating Trilinear Decomposition (SWATLD), and N-PLS, achieving precise signal discrimination in the presence of spectral interferences. After methanol evaporation and hexane dilution, CBZ signals, which are otherwise weak in solutions, were recovered by the nylon support, thereby enabling simultaneous quantification of both analytes. Trilinear models (PARAFAC and SWATLD) outperformed bidimensional PLS-1, demonstrating good precision and concordance with reference methods within the clinically relevant CBZ range of 4–12 µg mL⁻¹ [78].

The capacity of solid membranes to preserve fluorescence signals has also been applied to the detection of environmental contaminants in biological matrices. Researchers have developed a direct method for the determination of monohydroxylated metabolites of polycyclic aromatic hydrocarbons (OH-PAHs) in urine using room-temperature EEM fluorescence spectroscopy on octadecyl (C18) membranes, with the data processed by N-PLS coupled with Residual Bilinearization (RBL) and by Multivariate Curve Resolution–Alternating Least-Squares (MCR-ALS). The results demonstrated that both algorithms yielded comparable performance, with recoveries of 96.2–105.6% (N-PLS/RBL) and 98.3–105.6% (MCR-ALS), and limits of detection of 0.018–0.047 ng/mL. These approaches highlight how solid-support EEM combined with multi-way models enables both therapeutic drug monitoring and environmental biomarker detection in a single analytical framework [79].

The versatility of trilinear models, such as PARAFAC, has also been demonstrated in the direct analysis of antibiotics in complex biological matrices. A methodology has been proposed for the simultaneous quantification of fluoroquinolones (ciprofloxacin, ofloxacin, and norfloxacin) in unprocessed urine samples using EEM fluorescence spectroscopy modulated by a dual pH gradient. This approach introduced controlled spectral variability, enabling the resolution of overlapping signals through four-way multivariate calibration. The methodology yielded relative prediction errors below 10% and limits of detection of 0.028–0.035 mg/L, without requiring chemical pretreatment or chromatographic separation. The Elliptical Joint Confidence Region (EJCR) test passed at 95% confidence for all analytes, confirming accurate and unbiased quantification [80].

In line with this approach, a method employing EEM fluorescence spectroscopy coupled with PARAFAC and N-PLS has been reported for the determination of doxorubicin in human plasma. Doxorubicin, a widely used chemotherapeutic agent, exhibits a complex spectral profile due to its interactions with plasma components. Both algorithms demonstrated satisfactory performance: PARAFAC achieved an RMSECV of 0.060 µg/mL with a mean relative error of 4.4%, while N-PLS yielded an RMSECV of 0.045 µg/mL with a mean error of 2.0%, and both achieved a limit of detection of 0.032 µg/mL. The spectral profiles extracted by PARAFAC enabled the specific identification of doxorubicin, even under conditions of severe spectral overlap, reinforcing EEM spectroscopy combined with multi-way analysis as a powerful tool for direct plasma analysis without extensive pretreatment [81].

Following this rationale, a method was developed for the simultaneous quantification of dabigatran etexilate and its active metabolite dabigatran in urine, comparing three chemometric models: Self-Weighted Alternating Normalized Residue Fitting (SWANRF), MCR-ALS, and Unfolded Partial Least-Squares (U-PLS) coupled with RBL. All models achieved accurate quantification with limits of detection of 0.7–2.5 ng/mL. SWANRF and MCR-ALS both passed the EJCR test at 95% confidence in both matrices, demonstrating unbiased recovery. U-PLS/RBL, however, showed deviation from the ideal EJCR point in urine, reflecting its lower robustness against the complex interferences and background variability of that matrix. These findings underscore that model selection must account for both predictive accuracy and matrix-specific analytical reliability [82].

The adaptability of EEM fluorescence spectroscopy also extends to biosecurity applications. Using laboratory-acquired EEM spectra from four protein biotoxins and ten innocuous proteins in solid powder form, the impact of noise contamination was assessed through the peak signal-to-noise ratio (PSNR). The study employed PCA, RF, and multilayer perceptron (MLP) classifiers in combination with spectral descriptors derived from differential transform (DT), Fourier transform (FT), and wavelet transform (WT). All classification schemes achieved 100% accuracy for noise-free spectra. Under noise conditions (PSNR = 20), EEM–WT retained perfect classification using only 15% of the spectral variables, whereas EEM–FT dropped to approximately 60%. These results position EEM spectroscopy as a robust platform for the non-invasive identification of protein biotoxins under real-world noise conditions [83].
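
For readers unfamiliar with the noise metric, the following sketch computes a peak signal-to-noise ratio between a clean and a noisy matrix. It assumes the conventional decibel definition, 10·log10(peak² / MSE), with the peak taken from the clean matrix; the exact definition and units used in the cited study are not stated here, and the function name `psnr_db` and the toy values are hypothetical.

```python
import numpy as np

def psnr_db(clean, noisy):
    """Peak signal-to-noise ratio in decibels between two EEMs of equal shape."""
    clean = np.asarray(clean, dtype=float)
    noisy = np.asarray(noisy, dtype=float)
    mse = np.mean((clean - noisy) ** 2)
    if mse == 0:
        return float("inf")  # identical matrices: no noise
    peak = np.max(np.abs(clean))
    return float(10.0 * np.log10(peak ** 2 / mse))

# Toy 2 x 2 "EEM" with a peak of 1.0 and an error of 0.1 in every cell.
clean = np.array([[0.0, 1.0], [0.5, 0.5]])
noisy = clean + np.array([[0.1, -0.1], [0.1, -0.1]])
print(round(psnr_db(clean, noisy), 2))
```

Under this definition, a value near 20 corresponds to a root-mean-square error of roughly one tenth of the peak intensity.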

This versatility is further exemplified by a method for the direct determination of trans-resveratrol (RVT) in human plasma, without the need for prior extraction or separation. The method combines EEM fluorescence spectroscopy (excitation 280–360 nm; emission 380–550 nm) with PARAFAC and the standard addition method, thereby overcoming matrix effects arising from RVT–protein interactions. Models with four factors explained more than 99.90% of the total variance, and recoveries between 94.0% and 110.0% were achieved within the range of 0.10–5.00 µg mL⁻¹, with limits of detection of 0.002–0.013 µg mL⁻¹. Despite the strong influence of individual matrix effects, which made performance metrics sample-specific, this approach represents a rapid and cost-effective alternative to conventional HPLC for direct RVT analysis in biofluids [84].
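
The standard addition step can be sketched as a straight-line fit of the analyte response against the amount spiked into the sample, with the unknown concentration read from the magnitude of the x-intercept. In a PARAFAC-based workflow the response would typically be the score of the analyte component; here a synthetic linear response is used, and the function name and all numbers are hypothetical.

```python
import numpy as np

def standard_addition(added_conc, signal):
    """Estimate the analyte concentration in the unspiked sample."""
    slope, intercept = np.polyfit(added_conc, signal, deg=1)
    # The magnitude of the x-intercept of the fitted line is the estimate.
    return intercept / slope

added = np.array([0.0, 1.0, 2.0, 3.0])  # spiked amounts (hypothetical units)
signal = 2.0 * (0.5 + added)            # synthetic response, true value 0.5
estimate = float(standard_addition(added, signal))
print(round(estimate, 3))
```

Because the calibration line is built inside each sample, the slope already reflects that sample's matrix effect, which is consistent with the sample-specific performance metrics noted above.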

This potential was further explored through the development of a non-invasive, label-free spectrofluorimetric methodology for evaluating cell viability, with applications in biomedical contexts, such as drug screening, biocompatibility testing, tissue engineering, and pharmacodynamic studies. Using EEM spectra and U-PLS regression, A375 melanoma cells exposed to UV radiation were characterized, and an approximately 20% reduction in viability was observed, which was independently confirmed using the MTT assay. The correlation between viability values obtained by EEM and MTT was nearly perfect (R² = 0.986), and statistical tests (t-test and F-test) indicated no significant differences between the two methods. This strategy offers substantial advantages over traditional destructive methods, enabling real-time cellular monitoring in both clinical and experimental settings [85].

The applicability of EEM-based cell monitoring was further demonstrated in a study quantifying the cytotoxic effect of oxaliplatin on two skin cell lines, A375 melanoma and HaCaT keratinocytes, treated at concentrations ranging from 5 to 1000 µM. PARAFAC decomposed the spectral data into five interpretable components corresponding to tryptophan, NADH/NADPH, FAD, riboflavin, and pyridoxine. Multiple linear regression (MLR) on PARAFAC scores achieved R² = 0.997 (A375) and R² = 0.979 (HaCaT) relative to MTT reference values, with Tukey post hoc tests confirming statistically significant group separation at p ≤ 0.05 for most dose pairs. This study extends the cell viability monitoring approach to chemotherapy dose–response characterization, illustrating the potential of EEM combined with PARAFAC as a label-free alternative to traditional cytotoxicity assays [23].

4.4. Diagnosis of Infectious Diseases

EEM spectroscopy has emerged as a powerful tool in biomedical diagnostics, particularly for the rapid and accurate detection of infectious diseases. This approach captures the unique spectral signatures of pathogenic agents and, when integrated with multivariate classification and dimensionality reduction techniques, enables the discrimination between resistant and susceptible bacterial strains.

An innovative methodology has been proposed that integrates EEM spectroscopy with chemometric techniques and supervised classification algorithms (LDA, QDA, and SVM), in combination with dimensionality reduction approaches such as PCA, GA, and the Successive Projections Algorithm (SPA). When applied to spectral data from Escherichia coli and Klebsiella pneumoniae samples, the developed models achieved 100% sensitivity, specificity, and accuracy in classifying antimicrobial resistance versus susceptibility. This level of performance highlights the potential of EEM combined with machine learning as a significantly faster alternative in clinical laboratories to conventional antimicrobial susceptibility testing, which typically requires 1–3 days to generate results [86].

A novel strategy for the rapid identification of pathogenic bacteria was reported, employing EEM spectroscopy in combination with a convolutional neural network (CNN) classifier and the environmentally sensitive fluorescent dye 2-(4′-dimethylaminophenyl)-3-hydroxyflavone (DMAF). The interaction of DMAF with the bacterial cell envelope generates unique spectral signatures for each species without the requirement of specific antibodies or molecular markers. This work characterized spectral dynamics from 595 EEMs across eight bacterial species and reported identification accuracy of 85.8% at the species level and 98.3% for Gram classification, with a limit of detection of approximately 10⁷ CFU/mL. Spectral analysis revealed that structural differences in bacterial cell envelopes significantly influenced fluorescence responses, accounting for misclassifications within the same Gram group [87].

The application of EEM spectroscopy has also been extended to viruses. A study reported the use of EEM fluorescence combined with chemometric analysis for the classification of blood serum samples as positive or negative for coronavirus disease 2019 (COVID-19) infection, confirmed by Reverse Transcription Polymerase Chain Reaction (RT-PCR). EEM spectra were analyzed using PARAFAC to extract the underlying fluorescent components, which were attributed to tryptophan, tyrosine, NADPH, and advanced glycation end products (AGEs). Several classification models were tested, including Soft Independent Modeling of Class Analogy (SIMCA), Data-Driven SIMCA (DD-SIMCA), PCA-DA, and PLS-DA, to discriminate between positive and negative samples. The best-performing model was PARAFAC-DD-SIMCA with autoscaling (3 PCs), achieving a sensitivity of 0.98, specificity of 0.96, and an error rate of 5%. The best-performing unfolded model was PCA-DA (8 PCs), achieving sensitivity and specificity of 0.94 each. These findings suggest that combining fluorescence spectral data with multivariate analysis enables effective discrimination of serological samples, offering promise as a rapid screening tool, particularly in resource-limited settings [88].

A recent study proposed a strategy for the classification of respiratory viruses using EEM fluorescence integrated with machine learning algorithms. Spectral datasets were acquired from eight inactivated viruses, including severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), influenza A virus, parainfluenza virus, rhinovirus, adenovirus 7, rotavirus, and inactivated poliovirus vaccine. A key contribution was the introduction of a novel fast Fourier transform–wavelet transform (FFT–WT) data fusion strategy, which improved the classification accuracy from 45% (no transformation) to 75%. SVM consistently outperformed RF across datasets, and confusion matrix analyses confirmed the successful differentiation of SARS-CoV-2, coronaviruses, rhinoviruses, poliovirus (PIV), and adenovirus 7 (AdV-7) using the best fusion [89].

Collectively, these studies underscore that EEM spectroscopy, when integrated with multivariate analysis and machine learning algorithms, constitutes a robust platform for the differential diagnosis of infectious diseases. Because it spans both bacterial and viral pathogens, this platform has significant potential for clinical applications, epidemiological surveillance, and public health monitoring.

5. Discussion

5.1. Performance and Validity of Current Approaches

The reviewed studies show that EEM fluorescence spectroscopy, combined with chemometric methods and machine learning, can achieve competitive diagnostic performance, with reported sensitivities between 74% and 100%, specificities between 71% and 100%, and AUC values between 0.69 and 0.94. These figures are promising; however, they must be interpreted with careful attention to the methodological context, given that they are highly dependent on sample size, class balance, and the validation strategy employed. A fundamental distinction that conditions this interpretation is the difference between studies aimed at demonstrating proof of concept and those aspiring to clinical validation. The former seek to establish that EEM spectral data contain discriminative information relevant to a diagnostic question; the latter require evidence that this discriminability generalizes to independent populations under realistic conditions. Several of the reviewed studies explicitly occupy the first category: Mustorgi et al. [70] define their work as “a proof-of-concept research” providing “preliminary evidence”. Others present results in clinical language that their sample sizes and validation designs cannot yet fully support. Recognizing this distinction is essential for an accurate reading of the field’s current state.

The best results were obtained when the spectral signals were clear and the samples were well controlled. Costa et al. [86] achieved 100% accuracy in the classification of bacterial resistance using simple bacterial suspensions in phosphate buffer, a result that owes much to the favorable signal-to-noise ratio of the experimental conditions and should not be read as a measure of performance on the full clinical problem. Similarly, Henry et al. [87] achieved 99.8% classification accuracy in cross-validation for Gram status in monocultures; however, this figure dropped to 85.8% when evaluated on a genuinely independent test set collected in separate measurement sessions. This gap between cross-validation and independent test performance is not an anomaly but an expected consequence of datasets in which within-session variability is lower than between-session variability, and it illustrates why cross-validation alone can yield systematically optimistic estimates, particularly when sample sizes are small.
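
One way to keep between-session variability out of the training folds is to hold out whole measurement sessions rather than individual spectra. The sketch below is a minimal illustration of that idea using only the standard library; the session labels and the function name `leave_one_session_out` are hypothetical and do not describe the procedure of any reviewed study.

```python
from collections import defaultdict

def leave_one_session_out(session_ids):
    """Yield (session, train_indices, test_indices), one held-out session at a time."""
    by_session = defaultdict(list)
    for idx, session in enumerate(session_ids):
        by_session[session].append(idx)
    for held_out in sorted(by_session):
        test_idx = by_session[held_out]
        train_idx = [
            i for s, idxs in by_session.items() if s != held_out for i in idxs
        ]
        yield held_out, sorted(train_idx), test_idx

# Hypothetical session labels for six spectra measured on three days.
sessions = ["day1", "day1", "day2", "day2", "day3", "day3"]
splits = list(leave_one_session_out(sessions))
for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)
```

With this grouping, no spectrum from the held-out session contributes to model fitting, so the resulting estimate is closer to what an independent measurement campaign would show.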

In contrast, studies using plasma face more challenging conditions because biological variability arising from diet, hydration, medication, and comorbidities can produce dispersed signals within each diagnostic group. Nevertheless, some models have yielded promising results under these conditions: the Tucker3-QDA model for Alzheimer’s disease achieved 94.12% accuracy and a Matthews Correlation Coefficient (MCC) of 0.87 [24], and the DD-SIMCA model for COVID-19 achieved a sensitivity of 0.98 and specificity of 0.96 [88]. Both studies employed Kennard–Stone sampling to ensure representative training and test sets and reported complementary metrics alongside sensitivity and specificity, including MCC or error rate, a practice that remains uncommon across the reviewed literature but is essential for an honest assessment of model quality.

The validation strategy is a critical determinant of reported performance. Across the 23 studies reviewed, approaches ranged from leave-one-out cross-validation [77,81] to k-fold cross-validation [83,85,86], stratified train/test splits [8,74,88], and external test sets [70,87,89]. Only a minority of studies employed a genuinely independent external validation group. When external test sets were reported, performance consistently declined relative to cross-validation estimates, confirming that cross-validation alone tends to produce optimistic results, particularly in small datasets. The adoption of prospective external validation as the primary benchmark for clinical translation would substantially benefit the field.

Sample size was the most widespread limitation: 17 of 23 studies worked with fewer than 300 samples, and several with fewer than 100, reducing statistical power and population representativeness. Machine learning models trained on small, controlled clinical sets tend to overfit specific spectral features and may not generalize to more heterogeneous populations; this is especially critical in plasma and serum, where interindividual variability amplifies the within-class dispersion that few datasets adequately capture.

Preprocessing decisions add another significant source of heterogeneity: Rayleigh and Raman scatter removal, normalization, spectral region selection, and background subtraction alter the informational content of the final matrix, but their reporting is inconsistent. Some studies only excluded scatter regions [71], while others applied iterative algorithms for missing values [81] or asymmetric background subtraction [79]. The lack of standardized protocols makes it difficult to directly compare studies and limits the transferability of models between different instrumental setups.
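
As an illustration of the simplest of these steps, the sketch below masks first- and second-order Rayleigh scatter in a single EEM by setting a band around emission = excitation and emission = 2 × excitation to NaN. The band half-width of 10 nm, the function name `mask_scatter`, and the toy axes are assumptions; Raman scatter and any subsequent interpolation are not handled.

```python
import numpy as np

def mask_scatter(eem, ex_nm, em_nm, half_width_nm=10.0):
    """Return a copy of eem (n_ex x n_em) with Rayleigh scatter bands set to NaN."""
    ex = np.asarray(ex_nm, dtype=float)[:, None]
    em = np.asarray(em_nm, dtype=float)[None, :]
    first_order = np.abs(em - ex) <= half_width_nm
    second_order = np.abs(em - 2.0 * ex) <= half_width_nm
    masked = np.array(eem, dtype=float, copy=True)
    masked[first_order | second_order] = np.nan
    return masked

# Toy axes and a flat matrix (hypothetical values).
ex_nm = np.array([250.0, 300.0])
em_nm = np.array([300.0, 400.0, 500.0, 600.0])
eem = np.ones((2, 4))
masked = mask_scatter(eem, ex_nm, em_nm)
print(masked)
```

The mask depends only on the wavelength axes of the spectrum being processed, so reporting the band width used is enough for another laboratory to reproduce the step.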

5.2. Methodological Challenges in Machine Learning Applied to EEM Data

Beyond the general limitations discussed above, a more thorough examination of the reviewed studies reveals a set of recurrent pipeline-level challenges whose impact on reported performance has not been explicitly addressed in most of the included publications. These challenges are particularly consequential in EEM-based diagnostics because the combination of high-dimensional spectral data and small clinical datasets creates conditions in which even modest methodological inconsistencies can produce substantially inflated performance estimates.

A first and frequently overlooked source of bias is data leakage in the preprocessing stage. In the context of EEM spectroscopy, preprocessing operations fall into two fundamentally different categories with respect to leakage risk. Operations applied independently to each spectrum, such as Rayleigh and Raman scatter removal by interpolation, or background subtraction using a reference spectrum, do not introduce leakage because they do not depend on the composition of the full dataset. In contrast, operations that require statistics computed across the entire sample set, including mean centering, autoscaling, multiplicative scatter correction, and standardization, introduce leakage when applied before the train/test split, because information from the test set influences the transformation applied to the training data. Among the reviewed studies, this distinction was rarely acknowledged. Dos Santos et al. [24] applied scatter removal and multiplicative signal correction to all 230 samples prior to Kennard–Stone partitioning; Yin et al. [74] computed Tchebichef moments and performed outlier detection on the full dataset before division; and Zhang et al. [89] applied Z-score standardization globally before splitting their 40-sample dataset. The practical consequence is that the test set, which should represent genuinely unseen data, is contaminated by information from its own samples during the preprocessing step. A notable exception is Araujo et al. [88], who fitted the PARAFAC model exclusively on the training set and obtained test set scores by projection, the most explicit anti-leakage measure among the classification studies reviewed. 
A second, subtler form of leakage affects the estimation of the number of components in multilinear models: when PARAFAC or Tucker3 decompositions are performed on the full dataset prior to division, the resulting component structure, and therefore the scores used as classifier inputs, encodes information from the test samples even when the subsequent classification is performed correctly [8,24].
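
The ordering that avoids this kind of leakage can be sketched for autoscaling: the data are split first, the column means and standard deviations are estimated from the training partition only, and the same statistics are then reused for the test partition. The data, split indices, and function names below are hypothetical and serve only to show the order of operations.

```python
import numpy as np

def fit_autoscaler(X_train):
    """Column means and standard deviations estimated from training spectra only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid division by zero for constant variables
    return mean, std

def apply_autoscaler(X, mean, std):
    """Center and scale X with previously estimated statistics."""
    return (X - mean) / std

rng = np.random.default_rng(1)
X = rng.normal(loc=5.0, scale=2.0, size=(40, 6))  # hypothetical unfolded EEMs
train_idx, test_idx = np.arange(0, 30), np.arange(30, 40)

# Split first, then fit the scaling on the training partition alone.
mean, std = fit_autoscaler(X[train_idx])
X_train = apply_autoscaler(X[train_idx], mean, std)
X_test = apply_autoscaler(X[test_idx], mean, std)  # reuses training statistics

print(np.allclose(X_train.mean(axis=0), 0.0))
```

The same pattern applies to any transformation with fitted parameters, including the loadings of a multilinear decomposition: parameters are estimated on training samples and only applied to test samples.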

A second challenge is feature selection bias, which arises when the process of identifying discriminative spectral variables is performed using all available data rather than exclusively within the training partition. In Diagaradjane et al. [72], fluorescence intensity ratios were selected through a t-test applied to the complete dataset, including samples subsequently used for leave-one-out cross-validation, generating a set of input variables optimized for that specific sample collection rather than for the underlying population. This type of bias systematically inflates performance estimates because the classifier receives variables that have already been screened against the very samples it is being evaluated on. In Zhang et al. [89], the selection of the optimal transformation strategy among multiple combinations, including individual and fused applications of SNV, FFT, and wavelet transform, was evaluated by comparing performance on the test set, introducing an implicit selection bias that is especially problematic when the total sample size is 40. Among the reviewed studies, Soares et al. [73] implemented the most rigorous feature selection, applying SVM-RFE exclusively within the training partition using repeated 10-fold cross-validation, with the test set held out until the final evaluation.
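
The size of this effect can be illustrated on data that contain no class information at all. The sketch below generates pure noise, ranks variables by the absolute difference between class means (a simplification standing in for a t-test), and compares leave-one-out accuracy of a nearest-centroid classifier when the ranking uses all samples versus only the training fold. All names, sizes, and the classifier choice are assumptions made for illustration.

```python
import numpy as np

def top_features(X, y, k):
    """Indices of the k variables with the largest class-mean difference."""
    diff = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
    return np.argsort(diff)[-k:]

def nearest_centroid_predict(X_train, y_train, x_new):
    """Assign x_new to the class whose training centroid is closer."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    return int(np.linalg.norm(x_new - c1) < np.linalg.norm(x_new - c0))

def loo_accuracy(X, y, k, select_inside_fold):
    """Leave-one-out accuracy with feature selection inside or outside the loop."""
    n = len(y)
    global_feats = top_features(X, y, k)  # uses every sample, including test ones
    hits = 0
    for i in range(n):
        train = np.arange(n) != i
        if select_inside_fold:
            feats = top_features(X[train], y[train], k)
        else:
            feats = global_feats
        pred = nearest_centroid_predict(X[train][:, feats], y[train], X[i, feats])
        hits += int(pred == y[i])
    return hits / n

# Pure-noise data: no variable carries real class information.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2000))
y = np.array([0, 1] * 20)

biased = loo_accuracy(X, y, k=10, select_inside_fold=False)
honest = loo_accuracy(X, y, k=10, select_inside_fold=True)
print(f"selection on all data: {biased:.2f}; selection inside folds: {honest:.2f}")
```

Any accuracy well above chance in the first setting is produced by the selection step alone, which is why the selection has to be repeated inside every training fold.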

A third challenge concerns class imbalance and its effect on both model training and performance interpretation. When the ratio of positive to negative cases is markedly unequal, classifiers trained without corrective measures tend to favor the majority class, producing high overall accuracy that conceals poor performance on the clinically relevant minority class. The clearest illustration among the reviewed studies is provided by Nath et al. [71], where a ratio of 88 squamous to 14 columnar tissue sites produced a model with 99% correct classification in the majority class but only 7% in the minority, demonstrating that a high global accuracy can coexist with a diagnostically uninformative model. Araujo et al. [88] presented the most pronounced imbalance among the clinical classification studies (80 COVID-19 positive vs. 26 negatives, approximately 3:1), and addressed it indirectly through the choice of DD-SIMCA, which models a single target class and is therefore less sensitive to imbalance than discriminant classifiers by design. Among the reviewed studies, Svecová et al. [75] is the only work that applied a dedicated oversampling strategy, Synthetic Minority Oversampling Technique (SMOTE), within the cross-validation loop, evaluating different neighborhood configurations and documenting its effect on sensitivity and specificity. The use of class-weighted loss functions, as implemented by Soares et al. [73] through assignment of higher penalty to false negatives in the cancer class, constitutes an alternative approach that is both simple and methodologically sound. The remaining studies with imbalanced designs did not report corrective measures, which limits the interpretability of their reported metrics.
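
A short worked calculation shows how a high global accuracy can hide a weak minority class. It uses the site counts and per-class percentages quoted above for the cervical tissue study, rounded to whole sites; treating the percentages as applying to exactly these counts is an assumption, and the inverse-frequency class weights are shown only as one common corrective option.

```python
# Approximate counts implied by the figures quoted above: 88 squamous sites at
# about 99% correct and 14 columnar sites at about 7% correct. Rounded to whole
# sites; illustrative values, not taken directly from the paper.
n_squamous, n_columnar = 88, 14
correct_squamous = round(0.99 * n_squamous)
correct_columnar = round(0.07 * n_columnar)

total = n_squamous + n_columnar
overall_accuracy = (correct_squamous + correct_columnar) / total
recall_squamous = correct_squamous / n_squamous
recall_columnar = correct_columnar / n_columnar
balanced_accuracy = (recall_squamous + recall_columnar) / 2

# Inverse-frequency class weights: errors on the rarer class cost more.
weights = {
    "squamous": total / (2 * n_squamous),
    "columnar": total / (2 * n_columnar),
}

print(f"overall accuracy:  {overall_accuracy:.3f}")
print(f"balanced accuracy: {balanced_accuracy:.3f}")
print(weights)
```

Under these assumptions the overall accuracy stays above 0.85 while the balanced accuracy is close to 0.5, the value expected from a classifier with no discriminative ability between the two tissue types.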

A fourth challenge, closely related to the previous ones, is the selection and reporting of evaluation metrics. The diversity of metrics reported across the 23 reviewed studies is substantial and complicates cross-study comparisons. Most studies reported sensitivity and specificity as primary figures of merit, but fewer than half complemented these with metrics robust to class imbalance such as MCC, F-score, or AUC. Accuracy alone, reported as the primary outcome in several studies, is a misleading measure when classes are unbalanced, as it can be dominated by the majority class. Dos Santos et al. [24] provided one of the most complete and explicitly justified metric sets among the classification studies, reporting MCC, F₂-score, Youden’s index, likelihood ratios, and test effectiveness (δ), with explicit argumentation for weighting recall more heavily than precision given the clinical cost of false negatives in Alzheimer’s disease screening. This type of metric-to-context alignment, where the choice of figures of merit is driven by the clinical implications of each error type, represents a standard that the field would benefit from adopting more broadly. At the opposite end of the spectrum, studies reporting only global accuracy without per-class breakdown, such as Zhang et al. [89], where the modest absolute accuracy of 75% in an eight-class problem with five samples per class carries high uncertainty, provide limited information for assessing real diagnostic utility.
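
The figures of merit discussed here all follow from the four cells of a binary confusion matrix, as the sketch below shows. The counts are hypothetical: they were chosen for a 34-sample test set with an assumed 12 positives and 22 negatives so that the outputs match the Tucker3-QDA percentages quoted in this review, and they are not reported values. The function name `binary_metrics` is likewise illustrative.

```python
from math import sqrt

def binary_metrics(tp, fn, tn, fp, beta=2.0):
    """Common figures of merit from a 2x2 confusion matrix.

    Assumes every marginal total is non-zero; otherwise some ratios are undefined.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # F-beta weights recall beta times as heavily as precision.
    f_beta = (1 + beta**2) * precision * sensitivity / (beta**2 * precision + sensitivity)
    mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    youden = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_beta": f_beta,
        "mcc": mcc,
        "youden": youden,
    }

# Hypothetical counts (assumed 12 positives and 22 negatives).
m = binary_metrics(tp=11, fn=1, tn=21, fp=1)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Reporting the confusion matrix itself, rather than a selection of derived metrics, lets readers recompute whichever figure of merit suits their clinical context.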

Taken together, these pipeline-level challenges suggest that a meaningful fraction of the high sensitivity and specificity values reported in the literature reflects methodological artifacts rather than genuine discriminative capacity of the EEM signal. This does not imply that the underlying approach lacks promise; on the contrary, studies that implement rigorous pipelines, including Soares et al. [73] and Švecová et al. [75] for classification, and Goicoechea et al. [79] and Pagani et al. [80] for quantification, report competitive performance under conditions that can withstand critical scrutiny. The implication is rather that future work in this field should distinguish clearly between studies that demonstrate spectral discriminability as a proof of concept and studies that provide evidence of clinical generalizability. It should also adopt pipeline practices (strict preprocessing separation, within-training feature selection, explicit handling of class imbalance, and comprehensive metric reporting) that are commensurate with the claim being made.
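
Strict preprocessing separation can be illustrated with a minimal cross-validation sketch in which every data-dependent step is fitted on the training fold only. The scaler, the nearest-centroid classifier, and the synthetic two-feature data are stand-ins chosen for brevity; they are assumptions for illustration and do not reproduce the pipeline of any reviewed study.

```python
import random
import statistics


def fit_scaler(rows):
    """Column means and standard deviations estimated from the given rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return means, stds


def apply_scaler(rows, scaler):
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def fit_centroids(rows, labels):
    """Nearest-centroid model: one mean vector per class."""
    centroids = {}
    for lab in set(labels):
        members = [r for r, l in zip(rows, labels) if l == lab]
        centroids[lab] = [statistics.fmean(c) for c in zip(*members)]
    return centroids


def predict(row, centroids):
    def sq_dist(lab):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[lab]))
    return min(centroids, key=sq_dist)


def cross_validate(X, y, k=5, seed=0):
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        # Fit the scaler on the training fold only. Feature selection or
        # oversampling, if used, would also be fitted here and nowhere else.
        scaler = fit_scaler(X_train)
        model = fit_centroids(apply_scaler(X_train, scaler), y_train)
        X_test = apply_scaler([X[i] for i in test_idx], scaler)
        correct += sum(predict(r, model) == y[i]
                       for r, i in zip(X_test, test_idx))
    return correct / len(X)


# Synthetic, well-separated two-class data (illustration only).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(20)]
y = [0] * 20 + [1] * 20

cv_accuracy = cross_validate(X, y)
print(f"cross-validated accuracy: {cv_accuracy:.2f}")
```

The design point is that the held-out fold never influences the scaler or the model; fitting the scaler on the full dataset before splitting would be the kind of leakage the text warns against.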

5.3. Future Perspectives and Research Opportunities

To advance the clinical implementation of EEM with machine learning approaches, several priorities emerge from the current review. The creation of public standardized databases containing well-characterized EEM spectra with associated clinical and demographic metadata is essential. This would enable training of more generalizable models, facilitate cross-center validation, and support the reproducibility assessment that single-site studies cannot provide.

The application of deep learning architectures, particularly convolutional neural networks, which treat EEM matrices as spectral images and learn discriminative features hierarchically without requiring manual feature extraction, represents a direction of significant potential that remains largely underexplored in the biomedical EEM literature. Henry et al. [87] demonstrated the feasibility of a convolutional neural network applied directly to EEM data for bacterial identification, achieving competitive accuracy at the Gram status level; however, the full potential of deep learning for EEM-based diagnostics, including architectures designed for three-way tensor input such as those based on Tucker decomposition layers or attention mechanisms over excitation–emission surfaces, has not yet been systematically evaluated in clinical contexts. Alongside predictive performance, the interpretability of model decisions is an increasingly recognized requirement for clinical acceptance. Explainable artificial intelligence (XAI) methods, including gradient-weighted class activation mapping, SHAP values, and integrated gradients, can identify which excitation–emission regions drive individual classifications, providing a link between statistical output and biochemical mechanism that is essential for regulatory approval and clinical trust. The integration of XAI tools with EEM-based classifiers would also strengthen the biological plausibility of reported results by connecting discriminative spectral regions to known fluorophores and their associated pathophysiological processes.
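
As an illustration of how an attribution method maps a classifier output back to excitation–emission regions, the sketch below applies integrated gradients to a toy logistic model over a flattened 3 × 4 EEM. The model, its weights, the grid size, and the intensities are all hypothetical; a real application would use a trained network and an automatic-differentiation library rather than the analytical gradient used here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy logistic classifier over a flattened EEM."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def gradient(x, w, b):
    """Analytical gradient of the logistic output with respect to the input."""
    p = model(x, w, b)
    return [p * (1 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann approximation of the path integral of gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = gradient(point, w, b)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, total)]


# Hypothetical 3 excitation x 4 emission grid, flattened row by row.
n_ex, n_em = 3, 4
w = [0.0] * (n_ex * n_em)
w[1] = 2.0     # cell (excitation 0, emission 1) pushes toward the positive class
w[6] = -1.0    # cell (excitation 1, emission 2) pushes away from it
b = -0.5
x = [0.1, 0.9, 0.2, 0.1,
     0.3, 0.2, 0.5, 0.1,
     0.0, 0.1, 0.1, 0.2]
baseline = [0.0] * len(x)   # all-zero EEM as the reference input

attr = integrated_gradients(x, baseline, w, b)
top = max(range(len(attr)), key=lambda i: abs(attr[i]))
print(f"most influential cell: excitation {top // n_em}, emission {top % n_em}")
print(f"sum of attributions:   {sum(attr):.4f}")
print(f"output minus baseline: {model(x, w, b) - model(baseline, w, b):.4f}")
```

The last two printed values should agree closely, which reflects the completeness property of integrated gradients: the attributions sum to the difference between the model output for the sample and for the baseline.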

The development of portable EEM spectroscopy devices integrated with real-time classification algorithms presents an opportunity for point-of-care diagnostics. Miniaturized spectrometers capable of acquiring EEM data at reduced spectral resolution are already commercially available, which makes this a plausible emerging research direction, particularly for resource-limited settings where the cost and complexity of reference methods remain a barrier to early diagnosis.

Beyond instrumentation, methodological standardization is a critical priority. The development of community consensus guidelines for EEM data preprocessing, covering the order and scope of scatter removal, normalization, baseline correction, and spectral region selection, would substantially reduce the between-study variability that currently prevents meaningful meta-analytic comparisons of diagnostic performance. Ultimately, the potential of EEM with machine learning for real-time applications and personalized medicine is significant. The ability to dynamically monitor fluorescent biomarkers in biological fluids opens new possibilities for therapeutic monitoring, patient stratification, and clinical decision-making based on spectral data.
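
Why the order of preprocessing steps matters can be shown with a small sketch that masks first- and second-order Rayleigh scatter bands and normalizes an EEM to unit total intensity, in both possible orders. The wavelengths, intensities, and the 10 nm band half-width are hypothetical values chosen for illustration, and masking by a fixed band is only one of several scatter-handling strategies.

```python
def mask_scatter(eem, ex_wl, em_wl, width=10.0):
    """Copy of the EEM with first- and second-order Rayleigh bands set to None.

    eem[i][j] is the intensity at excitation ex_wl[i] and emission em_wl[j].
    """
    masked = []
    for i, ex in enumerate(ex_wl):
        row = []
        for j, em in enumerate(em_wl):
            first_order = abs(em - ex) <= width
            second_order = abs(em - 2 * ex) <= width
            row.append(None if first_order or second_order else eem[i][j])
        masked.append(row)
    return masked


def normalize_total(eem):
    """Scale the unmasked intensities so that they sum to one."""
    total = sum(v for row in eem for v in row if v is not None)
    return [[None if v is None else v / total for v in row] for row in eem]


# Hypothetical 2 x 4 EEM with strong scatter peaks (nm, arbitrary units).
ex_wl = [280.0, 300.0]
em_wl = [290.0, 350.0, 570.0, 600.0]
eem = [[500.0, 10.0, 300.0, 5.0],
       [400.0, 20.0, 8.0, 250.0]]

mask_then_norm = normalize_total(mask_scatter(eem, ex_wl, em_wl))
norm_then_mask = mask_scatter(normalize_total(eem), ex_wl, em_wl)

print("mask -> normalize:", mask_then_norm[0][1])
print("normalize -> mask:", norm_then_mask[0][1])
```

The same fluorescence value ends up on very different scales depending on whether the scatter intensity was included in the normalization total, which is one concrete source of the between-study variability discussed above.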

6. Conclusions

This systematic review demonstrates that EEM fluorescence spectroscopy, when coupled with machine learning algorithms, is a non-invasive, sensitive, and scalable diagnostic platform with applications in oncology, virology, neurodiagnostics, and therapeutic monitoring. The reviewed studies show that EEM combined with machine learning can extract relevant biochemical information from complex matrices without the need for markers or prior separation, which is a significant advantage over conventional techniques. The integration of multivariate models offers a clear advantage because they resolve spectral overlaps and provide interpretable component profiles that link diagnostic performance to specific endogenous fluorophores, such as NADH, tryptophan, flavins, and advanced glycation end products. These features make EEM spectroscopy an excellent platform for biofluid analysis.

However, the promising performance metrics reported across the reviewed studies must be contextualized within their methodological constraints. The majority of included studies operated with limited sample sizes, heterogeneous preprocessing protocols, and validation strategies that do not always guarantee independence between training and evaluation data. These conditions increase the risk that reported sensitivities and specificities reflect, at least in part, dataset-specific artifacts rather than genuine generalizable discriminability. A critical distinction that the field would benefit from adopting more explicitly is that between studies demonstrating proof of concept—establishing that EEM spectral data carry diagnostically relevant information—and studies providing evidence of clinical validity through prospective external validation in independent, representative populations. Both types of contribution are valuable, but they support different conclusions and should be communicated accordingly.

Future directions point toward automated, personalized, and accessible diagnostic systems with direct impact on clinical efficiency and health equity. The development of open-access spectral databases, portable acquisition devices, and standardized preprocessing protocols, together with the integration of deep learning and explainable artificial intelligence tools, represents the most promising avenue for translating the demonstrated proof-of-concept potential of EEM spectroscopy into clinically validated diagnostic applications. Achieving this translation will depend not only on algorithmic advances but also on the community’s commitment to larger and well-characterized datasets, rigorous and transparent pipelines, and prospective validation designs commensurate with the clinical claims being made.

Author Contributions

Conceptualization, V.A.A., R.G.-A., D.L.-A. and J.T.; methodology, V.A.A. and J.T.; investigation, M.P.H., V.A.A. and J.T.; writing – original draft preparation, M.P.H., V.A.A. and J.T.; writing – review and editing, R.G.-A., D.L.-A. and J.T.; project administration, J.T.; funding acquisition, V.A.A., R.G.-A. and J.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Science, Technology and Innovation (MINCIENCIAS)—Colombia, grant number 111691892296 (agreement 799–2023) and the Universidad del Atlántico.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The authors would like to thank the Universidad del Atlántico and Fundación Universitaria San Martín–sede Caribe. During the preparation of this manuscript, the authors used Claude Opus 4.7 (Anthropic) to assist in language editing and translation from Spanish to English. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ACC	Accuracy
AdV-7	Adenovirus 7
AGE/AGEs	Advanced Glycation End products
ANN	Artificial Neural Network
APTLD	Alternating Penalty Trilinear Decomposition
arPLS	Asymmetrically Reweighted Penalized Least Squares
AUC	Area Under the Receiver Operating Characteristic Curve
BSA	Bovine Serum Albumin
BSF	B-Spline Fitting
CBZ	Carbamazepine
CBZ-EP	Carbamazepine Epoxide
CEA	Carcinoembryonic Antigen
CFU	Colony Forming Units
CNN	Convolutional Neural Network
COVID-19	Coronavirus Disease 2019
CRC	Colorectal Cancer
CWM	Constant Wavelength Mode
DMAF	2-(4′-dimethylaminophenyl)-3-hydroxyflavone
DMBA	7,12-dimethylbenz[a]anthracene
DT	Differential Transform
EC	Endometrial Cancer
EE	Ethinylestradiol
EEM	Excitation–Emission Matrix
EJCR	Elliptical Joint Confidence Region
FAD	Flavin Adenine Dinucleotide
FN	False Negative
FP	False Positive
FT/FFT	Fourier Transform/Fast Fourier Transform
GA	Genetic Algorithm
HSA	Human Serum Albumin
HPV	Human Papillomavirus
ILLM	Inductive Learning by Logic Minimization
KNN	k-Nearest Neighbor
LDA	Linear Discriminant Analysis
LOD	Limit of Detection
LR	Logistic Regression
MCC	Matthews Correlation Coefficient
MCR-ALS	Multivariate Curve Resolution–Alternating Least-Squares
MLP	Multilayer Perceptron
MLR	Multiple Linear Regression
MOPS	3-(N-morpholino)propanesulfonic acid (buffer)
MTT	3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide
NADH	Nicotinamide Adenine Dinucleotide (reduced form)
NADPH	Nicotinamide Adenine Dinucleotide Phosphate (reduced form)
N-PLS	N-way Partial Least Squares
NOR	Norgestimate
PAH/OH-PAH	Polycyclic Aromatic Hydrocarbons/Hydroxylated PAH
PARAFAC	Parallel Factor Analysis
PBS	Phosphate-Buffered Saline
PC/PCA	Principal Component/Principal Component Analysis
PLS/PLS-DA	Partial Least Squares/Partial Least Squares Discriminant Analysis
PIV	Poliovirus
PPF	Polynomial Fitting
PRISMA	Preferred Reporting Items for Systematic Reviews and Meta-Analyses
PSNR	Peak Signal-to-Noise Ratio
Q²	Predictive Squared Correlation Coefficient
QDA	Quadratic Discriminant Analysis
R²	Coefficient of Determination
RBF	Radial Basis Function
REP%	Relative Prediction Error Percentage
RF	Random Forest
RMSECV	Root Mean Square Error of Cross-Validation
RMSEP	Root Mean Square Error of Prediction
RMS%RE	Root Mean Square Percent Relative Error
ROC	Receiver Operating Characteristic Curve
RT-PCR	Reverse Transcription Polymerase Chain Reaction
RVT	trans-Resveratrol
SARS-CoV-2	Severe Acute Respiratory Syndrome Coronavirus 2
SCC	Squamous Cell Carcinoma
SENS	Sensitivity
SGD	Stochastic Gradient Descent
SIL	Squamous Intraepithelial Lesion
SIMCA/DD-SIMCA	Soft Independent Modeling of Class Analogy/Data-Driven SIMCA
SMOTE	Synthetic Minority Oversampling Technique
SNE/t-SNE	Stochastic Neighbor Embedding/t-distributed SNE
SPEC	Specificity
SPA	Successive Projections Algorithm
SVMs/SVM-RFE	Support Vector Machines/SVM Recursive Feature Elimination
SSE	Sum of Squared Errors
SWANRF	Self-Weighted Alternating Normalized Residue Fitting
SWATLD	Self-Weighted Alternating Trilinear Decomposition
TN	True Negative
TM-PLS-DA	Tchebichef Moments Partial Least Squares Discriminant Analysis
3D-FS	Three-Dimensional Fluorescence Spectra
TP	True Positive
TPA	12-O-tetradecanoylphorbol-13-acetate
U-PLS/RBL	Unfolded Partial Least Squares/Residual Bilinearization
UV-Vis	Ultraviolet–Visible Spectroscopy
VS-KM	Variable Selection for K-Means
WT	Wavelet Transform

References

1. Cifuentes-Rodriguez, N.A.; Benavides-Cuestas, E.R.; Chacón-Chamorro, S.G.; Segura-Giraldo, B. Optical fluorescence spectroscopy using LED-type light in ex-vivo cervical tissues. Ing. Compet. 2023, 25, e-20912532.
2. Monici, M. Cell and tissue autofluorescence research and diagnostic applications. Biotechnol. Annu. Rev. 2005, 11, 227–256.
3. Suehiro, A.; Uchida, K.; Wakabayashi, I. Measurement of urinary advanced glycation end-products (AGEs) using a fluorescence assay for metabolic syndrome-related screening tests. Diabetes Metab. Syndr. Clin. Res. Rev. 2016, 10, S110–S113.
4. Li, Z.; Peleato, N. Comparison of dimensionality reduction techniques for cross-source transfer of fluorescence contaminant detection models. Chemosphere 2021, 276, 130064.
5. Sebastiani, M.; Vacchi, C.; Manfredi, A.; Cassone, G. Personalized Medicine and Machine Learning: A Roadmap for the Future. J. Clin. Med. 2022, 11, 4110.
6. Nilius, H.; Tsouka, S.; Nagler, M.; Masoodi, M. Machine learning applications in precision medicine: Overcoming challenges and unlocking potential. TrAC Trends Anal. Chem. 2024, 179, 117872.
7. Johnson, K.B.; Wei, W.; Weeraratne, D.; Frisse, M.E.; Misulis, K.; Rhee, K.; Zhao, J.; Snowdon, J.L. Precision Medicine, AI, and the Future of Personalized Health Care. Clin. Transl. Sci. 2021, 14, 86–93.
8. Lawaetz, A.J.; Bro, R.; Kamstrup-Nielsen, M.; Christensen, I.J.; Jørgensen, L.N.; Nielsen, H.J. Fluorescence spectroscopy as a potential metabonomic tool for early detection of colorectal cancer. Metabolomics 2012, 8, S111–S121.
9. Gomidze, N.; Kalandadze, L.; Khajishvili, M.; Nakashidze, O.; Jabnidze, I.; Jakobia, D.; Makharadze, K. Fluorescence spectroscopy as a novel tool in hematological diagnostics. APL Bioeng. 2025, 9, 026102.
10. Binson, V.A.; Thomas, S.; Subramoniam, M.; Arun, J.; Naveen, S.; Madhu, S. A Review of Machine Learning Algorithms for Biomedical Applications. Ann. Biomed. Eng. 2024, 52, 1159–1183.
11. Ahsan, M.M.; Luna, S.A.; Siddique, Z. Machine-Learning-Based Disease Diagnosis: A Comprehensive Review. Healthcare 2022, 10, 541.
12. Zhang, X.; Rahnavard, A.; Crandall, K.A. Machine learning enhances biomarker discovery: From multi-omics to functional genomics. Med. Res. Arch. 2025, 13.
13. Wang, X.; Ding, Q.; Groleau, R.R.; Wu, L.; Mao, Y.; Che, F.; Kotova, O.; Scanlan, E.M.; Lewis, S.E.; Li, P.; et al. Fluorescent Probes for Disease Diagnosis. Chem. Rev. 2024, 124, 7106–7164.
14. O’Dea, R.E.; Lagisz, M.; Jennions, M.D.; Koricheva, J.; Noble, D.W.; Parker, T.H.; Gurevitch, J.; Page, M.J.; Stewart, G.; Moher, D.; et al. Preferred reporting items for systematic reviews and meta-analyses in ecology and evolutionary biology: A PRISMA extension. Biol. Rev. 2021, 96, 1695–1722.
15. Lakowicz, J.R. Principles of Fluorescence Spectroscopy, 3rd ed.; Springer: New York, NY, USA, 2007.
16. Wang, F.; Li, J.; Wang, F.; Wang, H.; Zhou, T.; Zhang, X.; Liu, J.; Wang, Y.; Dong, W. Machine learning-based predictive modeling of HAAs concentration in secondary water supply system using UV–vis absorption and excitation-emission matrix (EEM) fluorescence spectroscopy. Microchem. J. 2025, 219, 114329.
17. Qin, X.-Q.; Yao, B.; Jin, L.; Zheng, X.-Z.; Ma, J.; Benedetti, M.F.; Li, Y.; Ren, Z.-L. Characterizing Soil Dissolved Organic Matter in Typical Soils from China Using Fluorescence EEM–PARAFAC and UV–Visible Absorption. Aquat. Geochem. 2020, 26, 71–88.
18. Geng, T.; Wang, Y.; Yin, X.L.; Chen, W.; Gu, H.W. A Comprehensive Review on the Excitation-Emission Matrix Fluorescence Spectroscopic Characterization of Petroleum-Containing Substances: Principles, Methods, and Applications. Crit. Rev. Anal. Chem. 2023, 54, 2827–2849.
19. Wisuthiphaet, N.; Zhang, H.; Liu, X.; Nitin, N. Detection of Escherichia coli Using Bacteriophage T7 and Analysis of Excitation-Emission Matrix Fluorescence Spectroscopy. J. Food Prot. 2024, 87, 100396.
20. Lemes, L.F.R.; Soares, F.L.F.; Nagata, N. Determination of DEET, Icaridin, and IR3535 in insect repellents using excitation-emission matrix (EEM) fluorescence spectroscopy and multiway calibration. Microchem. J. 2024, 206, 111601.
21. Siano, G.; Mora, S.; Schenone, A.; Giovanini, L. Sequential acquisition of fluorescence signals with changing fluorophore concentrations. Multivariate Curve Resolution with time measurements assistance. arXiv 2025, arXiv:2502.19431v1.
22. Costa Pereira, J.; Pais, A.A.C.C.; Burrows, H.D. Analysis of raw EEM fluorescence spectra—ICA and PARAFAC capabilities. Spectrochim. Acta A Mol. Biomol. Spectrosc. 2018, 205, 320–334.
23. Głowacz, K.; Skorupska, S.; Grabowska-Jadach, I.; Bro, R.; Ciosek-Skibińska, P. Excitation–Emission Matrix Fluorescence Spectroscopy Coupled with PARAFAC Modeling for Viability Prediction of Cells. ACS Omega 2023, 8, 15968–15978.
24. Dos Santos, R.F.; Paraskevaidi, M.; Mann, D.M.A.; Allsop, D.; Santos, M.C.D.; Morais, C.L.M.; Lima, K.M.G. Alzheimer’s disease diagnosis by blood plasma molecular fluorescence spectroscopy (EEM). Sci. Rep. 2022, 12, 16199.
25. Caputi, A.F.; Squeo, G.; Sikorska, E.; Silletti, R.; Noviello, M.; Pasqualone, A.; Summo, C.; Caponio, F. Feasibility of excitation-emission fluorescence spectroscopy in tandem with chemometrics for quantitation of trans-resveratrol in vine-shoot ethanolic extracts. J. Sci. Food Agric. 2025, 105, 1496–1507.
26. Abbasi, S.; Gharaghani, S.; Benvidi, A.; Rezaeinasab, M. New insights into the efficiency of thymol synergistic effect with p-cymene in inhibiting advanced glycation end products: A multi-way analysis based on spectroscopic and electrochemical methods in combination with molecular docking study. J. Pharm. Biomed. Anal. 2018, 150, 436–451.
27. Corcoran, T.C. Compressive Detection of Highly Overlapped Spectra Using Walsh–Hadamard-Based Filter Functions. Appl. Spectrosc. 2018, 72, 392–403.
28. Alpaydin, E. Introduction to Machine Learning, 3rd ed.; MIT Press: Cambridge, MA, USA, 2014.
29. Zhao, H.; Chen, Y.; Fu, X. Comparison of Machine Learning Based on Category Theory. J. Web Eng. 2023, 22, 41–54.
30. Harris, M.A.; McCoach, D.B. Classify with caution: An illustrative example using mixture models and machine learning. J. Res. Pers. 2025, 116, 104602.
31. Rashidi, H.H.; Tran, N.K.; Betts, E.V.; Howell, L.P.; Green, R. Artificial Intelligence and Machine Learning in Pathology: The Present Landscape of Supervised Methods. Acad. Pathol. 2019, 6, 2374289519873088.
32. Jayaraman, P.; Desman, J.; Sabounchi, M.; Nadkarni, G.N.; Sakhuja, A. A primer on reinforcement learning in medicine for clinicians. npj Digit. Med. 2024, 7, 337.
33. Sujon, K.M.; Hassan, R.B.; Towshi, Z.T.; Othman, M.A.; Samad, M.A.; Choi, K. When to Use Standardization and Normalization: Empirical Evidence from Machine Learning Models and XAI. IEEE Access 2024, 12, 135300–135314.
34. Dimas Pratama, Y.; Salam, A. Comparison of Data Normalization Techniques on KNN Classification Performance for Pima Indians Diabetes Dataset. J. Appl. Inform. Comput. 2025, 9, 693–706.
35. Yan, C. A Review on Spectral Data Preprocessing Techniques for Machine Learning and Quantitative Analysis. iScience 2024, 28, 112759.
36. Gamberger, D.; Lavrac, N.; Dzeroski, S. Noise Detection and Elimination in Data Preprocessing: Experiments in Medical Domains. Appl. Artif. Intell. 2000, 14, 205–223.
37. Dagal, I.; Harrison, A.; Ibrahim, A.-W.; Mbasso, W.F. Comprehensive Evaluation of Data Preprocessing and Visualization Techniques for Enhanced Classification and Sampling. Clust. Comput. 2025, 28, 476.
38. Tautan, A.M.; Andrei, A.G.; Smeralda, C.L.; Vatti, G.; Rossi, S.; Ionescu, B. Unsupervised learning from EEG data for epilepsy: A systematic literature review. Artif. Intell. Med. 2025, 162, 103095.
39. Steinley, D. K-means Clustering: A Half-century Synthesis. Br. J. Math. Stat. Psychol. 2006, 59, 1–34.
40. Ward, J.H. Hierarchical Grouping to Optimize an Objective Function. J. Am. Stat. Assoc. 1963, 58, 236–244.
41. Jolliffe, I.T.; Cadima, J. Principal Component Analysis: A Review and Recent Developments. Philos. Trans. R. Soc. A 2016, 374, 20150202.
42. Van der Maaten, L.; Hinton, G. Visualizing Data Using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
43. Harshman, R.A.; Lundy, M.E. PARAFAC: Parallel Factor Analysis. Comput. Stat. Data Anal. 1994, 18, 39–72.
44. Osisanwo, F.Y.; Akinsola, J.E.T.; Awodele, O.; Hinmikaiye, J.O.; Olakanmi, O.; Akinjobi, J. Supervised Machine Learning Algorithms: Classification and Comparison. Int. J. Comput. Trends Technol. 2017, 48, 128–138.
45. Cox, D.R. The Regression Analysis of Binary Sequences. J. R. Stat. Soc. B 1958, 20, 215–242.
46. Jovel, J.; Greiner, R. An Introduction to Machine Learning Approaches for Biomedical Research. Front. Med. 2021, 8, 771607.
47. Guido, R.; Ferrisi, S.; Lofaro, D.; Conforti, D. An Overview on the Advancements of Support Vector Machine Models in Healthcare Applications: A Review. Information 2024, 15, 235.
48. Khyathi, G.; Indumathi, K.P.; Jumana Hasin, A.; Lisa Flavin Jency, M.; Siluvai, S.; Krishnaprakash, G. Support Vector Machines: A Literature Review on Their Application in Analyzing Mass Data for Public Health. Cureus 2025, 17, e77169.
49. Kingsford, C.; Salzberg, S.L. What are decision trees? Nat. Biotechnol. 2008, 26, 1011–1013.
50. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
51. Salman, H.A.; Kalakech, A.; Steiti, A. Random Forest Algorithm Overview. Babylon. J. Mach. Learn. 2024, 2024, 69–79.
52. Langarizadeh, M.; Moghbeli, F. Applying Naive Bayesian Networks to Disease Prediction: A Systematic Review. Acta Inform. Med. 2016, 24, 364–369.
53. Cover, T.M.; Hart, P.E. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
54. Gupta, A.K.; Chakroborty, S.; Ghosh, S.K.; Ganguly, S. A machine learning model for multi-class classification of quenched and partitioned steel microstructure type by the k-nearest neighbor algorithm. Comput. Mater. Sci. 2023, 228, 112321.
55. Graupe, D. Principles of Artificial Neural Networks, 2nd ed.; World Scientific: Singapore, 2007.
56. Cho, H.S.; Leu, M.C. Artificial neural networks in manufacturing processes: Monitoring and control. IFAC Proc. Vol. 1998, 31, 529–537.
57. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
58. Uddin, S.; Khan, A.; Hossain, M.E.; Moni, M.A. Comparing different supervised machine learning algorithms for disease prediction. BMC Med. Inform. Decis. Mak. 2019, 19, 281.
59. Blakeley, D.D.; Oddone, E.Z.; Hasselblad, V.; Simel, D.L.; Matchar, D.B. Noninvasive carotid artery testing: A meta-analytic review. Ann. Intern. Med. 1995, 122, 360–367.
60. Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6.
61. Sokolova, M.; Japkowicz, N.; Szpakowicz, S. Beyond accuracy, F-score and ROC: A family of discriminant measures for performance evaluation. In AI 2006: Advances in Artificial Intelligence; Sattar, A., Kang, B.H., Eds.; Lecture Notes in Computer Science, vol. 4304; Springer: Berlin/Heidelberg, Germany, 2006; pp. 1015–1021.
62. Olivieri, A.C.; Faber, N.M.; Ferré, J.; Boqué, R.; Kalivas, J.H.; Mark, H. Uncertainty estimation and figures of merit for multivariate calibration (IUPAC Technical Report). Pure Appl. Chem. 2006, 78, 633–661.
63. Nagelkerke, N.J.D. A note on a general definition of the coefficient of determination. Biometrika 1991, 78, 691–692.
64. Zambrano, A.; Trilleras, J.; Arana, V.A.; Lima, K.M.G.; Neves, A.C.O.; Morais, C.L.M.; Romero, C.; Falconar, A.K.I.; Muñoz, B.S.; García, R.; et al. ATR-FTIR and multivariate analysis for differential diagnosis of dengue and leptospirosis: A feasibility study. Sci. Rep. 2025, 15, 34092.
65. Kheirollahpour, M.; Shokoufi, N.; Lotfi, M. The Potential of Optical Technologies in Early Virus Detection; Prospects in Addressing Future Viral Outbreaks. Crit. Rev. Anal. Chem. 2025, 56, 1198–1226.
66. Santos, M.C.D.; Mariz, J.V.M.; Silva, R.V.O.; Morais, C.L.M.; Lima, K.M.G. Clinical applications of spectroscopic techniques in conjunction with multivariate analysis in virus diagnosis. Biomed. Spectrosc. Imaging 2023, 10, 49–75.
67. Rumaling, M.I.; Chee, F.P.; Bade, A.; Hasbi, N.H.; Daim, S.; Juhim, F.; Duinong, M.; Rasmidi, R. Methods of optical spectroscopy in detection of virus in infected samples: A review. Heliyon 2022, 8, e10472.
68. Lv, R.; Wang, Z.; Ma, Y.; Li, W.; Tian, J. Machine Learning Enhanced Optical Spectroscopy for Disease Detection. J. Phys. Chem. Lett. 2022, 13, 9238–9249.
69. Escandar, G.M.; Damiani, P.C.; Goicoechea, H.C.; Olivieri, A.C. A review of multivariate calibration methods applied to biomedical analysis. Microchem. J. 2006, 82, 29–42.
70. Mustorgi, E.; Durante, C.; Malegori, C.; Greco, P.; Bartoletti, R.; Cocchi, M.; Casale, M. An analytical approach based on excitation-emission fluorescence spectroscopy and chemometrics for the screening of prostate cancer through urine analysis: A proof-of-concept study. Chemom. Intell. Lab. Syst. 2023, 234, 104752.
71. Nath, A.; Rivoire, K.; Chang, S.; West, L.; Cantor, S.B.; Basen-Engquist, K.; Adler-Storthz, K.; Cox, D.D.; Atkinson, E.N.; Staerkel, G.; et al. A pilot study for a screening trial of cervical fluorescence spectroscopy. Int. J. Gynecol. Cancer 2004, 14, 1097–1107.
72. Diagaradjane, P.; Yaseen, M.A.; Yu, J.; Wong, M.S.; Anvari, B. Autofluorescence characterization for the early diagnosis of neoplastic changes in DMBA/TPA-induced mouse skin carcinogenesis. Lasers Surg. Med. 2005, 37, 382–395.
73. Soares, F.; Becker, K.; Anzanello, M.J. A hierarchical classifier based on human blood plasma fluorescence for non-invasive colorectal cancer screening. Artif. Intell. Med. 2017, 82, 1–10.
74. Yin, B.; Mi, J.Y.; Zhai, H.L.; Zhao, B.Q.; Bi, K.X. An effective approach to the early diagnosis of colorectal cancer based on three-dimensional fluorescence spectra of human blood plasma. J. Pharm. Biomed. Anal. 2021, 193, 113757.
75. Švecová, M.; Blahova, L.; Kostolný, J.; Birkova, A.; Urdzik, P.; Marekova, M.; Dubayova, K. Enhancing endometrial cancer detection: Blood serum intrinsic fluorescence data processing and machine learning application. Talanta 2025, 283, 127083.
76. Hordge, L.Q.N.; McDaniel, K.L.; Jones, D.D.; Fakayode, S.O. Simultaneous determination of estrogens (ethinylestradiol and norgestimate) concentrations in human and bovine serum albumin by use of fluorescence spectroscopy and multivariate regression analysis. Talanta 2016, 152, 401–409.
77. Nemati, S.S.; Emadi, S.; Ghiasvand Mohammadkhani, L.; Kompany-Zareh, M.; Hasanzadeh, Z. Multivariate spectrofluorimetric detection of lipase isolated from Serratia marcescens in chromatographic fractions. Spectrochim. Acta A Mol. Biomol. Spectrosc. 2019, 222, 117137.
78. Escandar, G.M.; Gómez, D.G.; Mansilla, A.E.; De La Peña, A.M.; Goicoechea, H.C. Determination of carbamazepine in serum and pharmaceutical preparations using immobilization on a nylon support and fluorescence detection. Anal. Chim. Acta 2004, 506, 161–170.
79. Goicoechea, H.C.; Calimag-Williams, K.; Campiglia, A.D. Multi-way partial least-squares and residual bi-linearization for the direct determination of monohydroxy-polycyclic aromatic hydrocarbons on octadecyl membranes via room-temperature fluorescence excitation emission matrices. Anal. Chim. Acta 2012, 717, 100–109.
80. Pagani, A.P.; Ibañez, G.A. Four-way calibration applied to the processing of pH-modulated fluorescence excitation-emission matrices. Analysis of fluoroquinolones in the presence of significant spectral overlapping. Microchem. J. 2017, 132, 211–218.
81. Trevisan, M.G.; Poppi, R.J. Determination of doxorubicin in human plasma by excitation–emission matrix fluorescence and multi-way analysis. Anal. Chim. Acta 2003, 493, 69–81.
82. Wang, T.; Liu, Q.; Long, W.J.; Chen, A.Q.; Wu, H.L.; Yu, R.Q. A chemometric comparison of different models in fluorescence analysis of dabigatran etexilate and dabigatran. Spectrochim. Acta A Mol. Biomol. Spectrosc. 2021, 246, 118988.
83. Xu, J.; Xu, J.; Tong, Z.; Yu, S.; Liu, B.; Mu, X.; Du, B.; Gao, C.; Wang, J.; Liu, Z.; et al. Impact of different classification schemes on discrimination of proteins with noise-contaminated spectra using laboratory-measured fluorescence data. Spectrochim. Acta A Mol. Biomol. Spectrosc. 2023, 296, 122646.
84. Bernardes, C.D.; Poppi, R.J.; Sena, M.M. Direct determination of trans-resveratrol in human plasma by spectrofluorimetry and second-order standard addition. Talanta 2010, 82, 640–645.
85. Głowacz, K.; Skorupska, S.; Grabowska-Jadach, I.; Ciosek-Skibińska, P. Excitation-emission matrix fluorescence spectroscopy for cell viability testing in UV-treated cell culture. RSC Adv. 2022, 12, 7652–7660.
86. Costa, F.S.L.; Bezerra, C.C.R.; Neto, R.M.; Morais, C.L.M.; Lima, K.M.G. Identification of resistance in Escherichia coli and Klebsiella pneumoniae using excitation-emission matrix fluorescence spectroscopy and multivariate analysis. Sci. Rep. 2020, 10, 12994.
87. Henry, J.; Endres, J.L.; Sadykov, M.R.; Bayles, K.W.; Svechkarev, D. Fast and accurate identification of pathogenic bacteria using excitation-emission spectroscopy and machine learning. Sens. Diagn. 2024, 3, 1253–1262.
88. Araujo Gomes, G.J.; Beltrão, F.E.d.L.; Fragoso, W.D.; Lemos, S.G. Discrimination between COVID-19 positive and negative blood serum based on excitation-emission matrix fluorescence spectroscopy and chemometrics. Talanta 2024, 280, 126788.
89. Zhang, P.; Yang, Q.; Xu, X.; Feng, H.; Du, B.; Xu, J.; Liu, B.; Mu, X.; Wang, J.; Tong, Z. Fluorescence excitation-emission matrix spectroscopy combined with machine learning for the classification of viruses for respiratory infections. Talanta 2025, 286, 127462.

Figure 1. PRISMA flow diagram illustrating the process of study identification, screening, eligibility assessment, and inclusion in the systematic review.

Figure 2. Jablonski diagram illustrating electronic transitions and vibrational relaxation in fluorescence.

Figure 3. Classification of main machine learning approaches.

Figure 4. Machine learning workflow for EEM fluorescence spectroscopy data analysis.

Figure 5. Integrated analytical framework of EEM spectroscopy and machine learning for biomedical applications.

Table 1. Potential applications of EEM spectroscopy and machine learning for non-invasive biomedical diagnostics.

Disease Detection	Sample	Machine Learning Models	Results	Reference
Prostate cancer screening	Urine (69 samples; 46 cancer, 23 healthy); 95 EEMs	PARAFAC (4 factors) → LDA, PLS-DA; 70/30 split (Kennard–Stone)	- Sensitivity 94.5%, specificity 89.7% (cancer class, LDA and PLS-DA) - AUC 0.92 (PLS-DA)	[70]
Cervical precancer and cancer (SIL)	Cervical tissue (58 women, 92 measurement sites)	PCA → logistic discrimination; LDA (forward stepwise, F = 20)	- Squamous normal: 99% correct; concordance colposcopy vs. fluorescence: 69% - Sensitivity 86%, specificity 74% (diagnostic) - Sensitivity 75%, specificity 80% (screening)	[71]
Skin SCC early detection/staging	In vivo mouse skin (n = 40 experimental + 6 control + 6 blank); weekly EEMs over 15 weeks	Stepwise Fisher LDA; leave-one-out cross-validation	- DA-2 (280 nm): accuracy 74.3%, specificity 95.6% - DA-3 (280 + 320 + 340 + 410 nm): accuracy 82.9%, specificity 97.8%, sensitivity 77.5–87.5% - 11.6% accuracy improvement with multiple excitation wavelengths	[72]
Colorectal cancer (CRC) screening	Human blood plasma (299 individuals; 74 CRC, 75 healthy, 75 adenomas, 75 other)	Hierarchical SVM: Level 1 binary SVM (CRC vs. non-CRC); Level 2 one-class SVM	- CRC detection: sensitivity 0.8636, specificity 0.9516, AUC 0.933 - Non-malignant: sensitivity 0.6000, specificity 0.7955	[73]
Colorectal cancer (CRC) discrimination	Human blood plasma (74 CRC, 74 adenomas, 77 non-malignant, 4 outliers removed)	TM-PLS-DA (Venetian blinds 10-fold CV, 8 LVs)	- Accuracy 84% (test set); error rate 0.16 - Sensitivity: CRC 0.88, adenomas 0.85, non-malignant 0.80 - AUC: 0.94 (CRC), 0.90 (adenomas), 0.90 (non-malignant)	[74]
Colorectal cancer (CRC) early detection (metabonomic)	Human blood plasma (n = 308; 77 CRC, 77 adenomas, 77 other non-malignant, 77 no findings); diluted and undiluted	PARAFAC (3 EEM sets, 10 + 3 + 6 components) → pooled score matrix (19 vars) → PLS-DA (forward selection); also iPLS on raw unfolded; 10-fold CV	- CRC vs. all controls: sensitivity 0.74, specificity 0.71, AUC 0.69 - CRC vs. other non-malignant: sensitivity 0.79, specificity 0.73, AUC 0.76 - CRC vs. no findings: sensitivity 0.73, specificity 0.77, AUC 0.75	[8]
Endometrial cancer (EC) detection	Blood serum (n = 118; 73 EC, 45 healthy controls); diluted 10× in phosphate buffer pH 7.4; 3D synchronous fluorescence spectra (CWM, Δλ = 10–200 nm)	PCA + k-means clustering; PLS-DA; RF, SVM, LR, SGD classifiers; 10-fold CV × 100 repeats; grid search optimization	- Best (LR, optimized, Δλ = 120 nm): sensitivity 0.94, specificity 0.89, AUC 0.94 - RF: sensitivity 0.87, specificity 0.74, AUC 0.92 - Fluorescence ratios R300/330 and R360/490: AUC 0.88 and 0.91	[75]
Alzheimer’s disease (AD) detection	Human blood plasma (230 individuals: 83 AD, 147 controls); dried plasma on microscope slides; EEM (excitation 230–450 nm); Kennard–Stone split 162/34/34	PARAFAC-QDA (6 factors) and Tucker3-QDA; Kennard–Stone train/validation/test split	- PARAFAC-QDA: sensitivity 83.33%, specificity 100%, F2-score 86.21% - Tucker3-QDA: sensitivity 91.67%, specificity 95.45%, F2-score 91.67% - Global accuracy 94.12%, MCC 0.87	[24]
Estrogen quantification (EE and NOR) in serum albumin	BSA and HSA spiked with EE/NOR in MOPS buffer, pH 7.4	PLS regression (multivariate); univariate regression (comparison)	- PLS: R² ≥ 0.9949; RMS%RE for EE in BSA: 5.84% (310 K) - LOD: 1.6 × 10⁻⁸ M (EE), 2.4 × 10⁻⁷ M (NOR)	[76]
Lipase activity detection in chromatographic fractions	Gel filtration fractions (13 EEMs) from S. marcescens extracellular lipase	N-PLS (with region selection) → GA-RBF-ANN; PARAFAC (3 components)	- GA-RBF-ANN: Q² = 0.919, R² = 0.919 - PARAFAC: 83.52% variance explained, core consistency 84.08%	[77]
Therapeutic drug monitoring (CBZ and CBZ-EP)	Human serum (spiked and real patient samples); pharmaceutical preparations	PARAFAC, SWATLD, N-PLS (three-way EEM); PLS-1 (two-way emission)	- PARAFAC: RMSEP = 0.61 µg/mL (CBZ), 0.83 µg/mL (CBZ-EP), R² = 0.976 (CBZ) - Recoveries 104 ± 5% (CBZ), comparable to FPIA/HPLC references	[78]
OH-PAH metabolites in urine (PAH exposure biomarkers)	Synthetic urine spiked with four OH-PAH metabolites; C18 membrane SPE	N-PLS/RBL and MCR-ALS; calibration factors A = 4 (leave-one-out CV)	- N-PLS/RBL recoveries: 96.2–105.6% - MCR-ALS recoveries: 98.3–105.6% - LOD: 0.018–0.047 ng/mL	[79]
Fluoroquinolone quantification (CIP, OFLO, NOR) in urine	Spiked human urine (diluted 1/200) with interferents (salicylate, naproxen)	PARAFAC (four-way/third-order calibration); no constraints applied	- REP% = 3.3–7.4% across analytes and matrices - LOD: 0.028–0.035 mg/L - EJCR test passed at 95% confidence	[80]
Doxorubicin (DXR) monitoring in plasma	Human blood plasma (10 healthy volunteers, spiked 0.75–11.25 µg/mL)	PARAFAC (2 factors) → linear regression; N-PLS (2 factors, leave-one-out CV)	- PARAFAC: RMSECV = 0.060 µg/mL, mean error 4.4% - N-PLS: RMSECV = 0.045 µg/mL, mean error 2.0% - LD = 0.032 µg/mL (both methods)	[81]
Dabigatran etexilate/dabigatran quantification	Spiked human plasma and urine (7 calibration, 3 blank, 5 spiked prediction samples per matrix)	SWANRF (trilinear), MCR-ALS (bilinear), U-PLS/RBL (latent variables)	- SWANRF: recovery 100.4 ± 1.4% (DABE), 99.2 ± 2.2% (DAB) in plasma - LOD: 0.7–2.5 ng/mL - SWANRF and MCR-ALS pass EJCR at 95% in both matrices; U-PLS/RBL fails EJCR in urine	[82]
Proteinaceous biotoxin classification	14 protein samples in solid powder form (4 biotoxins + 10 harmless proteins); measured at 45°; λex 260–315 nm, λem 270–420 nm; 75/25 split	Four schemes: EEM→MLP, EEM→PCA→MLP, EEM→RF→MLP, EEM→RF→PCA→MLP	- Noise-free: ~100% classification across all schemes - PSNR = 20: EEM-WT retains 100%; EEM-RF-MLP: 100% with only 15% of features - EEM-FT drops to ~60% under noise	[83]
trans-Resveratrol (RVT) quantification in plasma	Human plasma (5 healthy volunteers, diluted 10×, buffered pH 5.0); spiked 0.10–5.00 µg/mL	PARAFAC + SOSAM (Second-Order Standard Addition Method); univariate linear regression	- Recovery: 94.0–110.0% (mean 102.6%) - LOD: 0.002–0.013 µg/mL - t-tests showed no significant differences (4/5 samples)	[84]
Cell viability assessment (UV-induced cytotoxicity)	A375 melanoma cells exposed to UV (λ = 365 nm); 72 EEMs (6 exposures × 12 replicates)	UPLS regression; 75/25 train/test split; Venetian blinds CV	- Train: R² = 0.950; Test: R² = 0.877 - MTT vs. EEM correlation: R² = 0.986 - t-test and F-test: all p > 0.05	[85]
Cell viability prediction (oxaliplatin cytotoxicity)	A375 (melanoma) and HaCaT (keratinocytes) treated with oxaliplatin (5–1000 µM); 144 EEMs (2 lines × 6 concentrations × 12 replicates)	PARAFAC (5 components) → MLR on PARAFAC scores; cross-validation; Tukey post hoc tests	- MLR on PARAFAC scores: R² = 0.997 (A375), R² = 0.979 (HaCaT) - PARAFAC resolved 5 fluorophores (tryptophan, NADH/NADPH, FAD, riboflavin, pyridoxine) - Significant group differentiation at p ≤ 0.05 for most dose pairs	[23]
Antimicrobial resistance (E. coli, K. pneumoniae)	Bacterial suspensions in phosphate buffer (75 samples; 5 replicates per species)	PCA → 2D-LDA, 2D-PCA-LDA/QDA/SVM; Unfolded: UPCA-QDA/SVM, USPA-QDA/SVM, UGA-QDA/SVM	- Best models (K. pneumoniae): UPCA-QDA, UGA-SVM, 2D-LDA → 100% sensitivity, specificity, accuracy, F-score	[86]
Pathogenic bacteria identification (species and Gram classification)	Bacterial suspensions (8 species: 4 G⁺/4 G⁻); DMAF fluorescent dye; 595 EEMs (cross-validation dataset)	Convolutional Neural Network (CNN, MATLAB); K-fold cross-validation; independent test set	- CV species: 98.2%; CV Gram: 99.8% - Independent test: 85.8% (species), 98.3% (Gram) - LOD ~10⁷ CFU/mL	[87]
COVID-19 (SARS-CoV-2) detection	Blood serum (n = 106; 80 RT-PCR positive, 26 negative); diluted 1:1000 in PBS pH 7.4; 40/66 split (Kennard–Stone)	PARAFAC → SIMCA/DD-SIMCA/PCA-DA/PLS-DA; Unfolded EEM → same classifiers; Single λ_ex 280 nm → SIMCA/PLS-DA	- Best: PARAFAC → DD-SIMCA (autoscaled, 3 PCs): Sens = 0.98, Spec = 0.96, ER = 0.05 - PARAFAC → PLS-DA (2 LVs): Sens = 0.94, Spec = 0.94, ER = 0.06 - Best unfolded: Unfolded → PCA-DA (8 PCs): Sens = 0.94, Spec = 0.94, ER = 0.06	[88]
Respiratory virus classification (8 virus types)	Inactivated virus solutions (8 types, 5 samples each, n = 40); 1.0 × 10⁷ PFU/mL; 28/12 train/test split	PCA → RF and SVM; mid-level data fusion (SNV-FFT, SNV-WT, FFT-WT, SNV-FFT-WT); K-fold CV	- Best individual (SVM + FFT or WT): ACC = 65%, F1 = 69% - Best fusion (SVM + FFT-WT): ACC = 75%, improvement 15.38% over individual transforms	[89]
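
The performance figures in Table 1 (sensitivity, specificity, accuracy, F-score, MCC) all derive from the four cells of a binary confusion matrix. The following minimal Python sketch shows how they relate. The function name `diagnostic_metrics` is illustrative, and the confusion-matrix counts are assumptions: they were back-calculated to be consistent with the percentages reported for the Alzheimer's disease entry [24] on a 34-sample test set, and are not stated in the table itself.

```python
from math import sqrt


def _ratio(num, den):
    """Safe division: return 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def diagnostic_metrics(tp, fn, tn, fp, beta=2.0):
    """Binary-classification metrics from confusion-matrix counts.

    tp / fn: diseased samples classified correctly / incorrectly
    tn / fp: control samples classified correctly / incorrectly
    beta:    weight of recall relative to precision in the F-score
    """
    sensitivity = _ratio(tp, tp + fn)   # recall, true-positive rate
    specificity = _ratio(tn, tn + fp)   # true-negative rate
    precision = _ratio(tp, tp + fp)     # positive predictive value
    accuracy = _ratio(tp + tn, tp + fn + tn + fp)
    b2 = beta ** 2
    f_beta = _ratio((1 + b2) * precision * sensitivity,
                    b2 * precision + sensitivity)
    # Matthews correlation coefficient: uses all four cells, so it stays
    # informative when the classes are imbalanced.
    mcc = _ratio(tp * tn - fp * fn,
                 sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_beta": f_beta,
        "mcc": mcc,
    }


# ASSUMED counts (12 AD + 22 controls = 34 test samples), chosen only
# because they reproduce the percentages listed in Table 1 for [24].
tucker3 = diagnostic_metrics(tp=11, fn=1, tn=21, fp=1)
parafac = diagnostic_metrics(tp=10, fn=2, tn=22, fp=0)

for name, m in (("Tucker3-QDA", tucker3), ("PARAFAC-QDA", parafac)):
    print(f"{name}: sensitivity {100 * m['sensitivity']:.2f}%, "
          f"specificity {100 * m['specificity']:.2f}%, "
          f"F2 {100 * m['f_beta']:.2f}%, MCC {m['mcc']:.2f}")
```

With these assumed counts, a single misclassified sample shifts sensitivity by more than 8 percentage points, which illustrates why metrics reported on small test sets are best read together with the underlying sample sizes.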

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
