1. Introduction
Despite ongoing efforts in basic and applied research, there is no curative treatment option for glioblastoma (GBM), the most common malignant primary brain tumor in adults. At present, the survival prognosis of patients remains as low as 5 to 15 months [
1]. Due to its intratumoral heterogeneity and plasticity, combating the tumor with individual pharmacological agents has proven to be extremely difficult, and its stem cell-like characteristics further enable it to develop resistance to chemotherapy, radiation, and immunotherapy [
2]. Although several prognostic factors influence the survival of patients with glioblastoma, including tumor location, MGMT promoter methylation, and age, the extent of surgical resection (EOR) is the only modifiable factor directly impacted by therapeutic intervention [
3]. Maximally safe tumor resection has been consistently shown to improve patients’ quality of life (QoL), as well as prolong progression-free survival (PFS) and overall survival (OS) [
4,
5,
6,
7]. Furthermore, Grabowski et al. [
8] and Gerritsen et al. [
9] have presented the remaining tumor volume as an additional and potentially more meaningful prognostic factor for PFS and OS than EOR.
The assumption that remaining neoplastic cells in the brain are a major cause of glioblastoma relapse is supported by studies on supramarginal resection (SMR), in which cytoreduction is extended beyond tumor borders detected by MRI. Several studies comparing SMR with gross total resection (GTR)—defined as the removal of all visible tumor tissue—have reported significantly improved progression-free survival (PFS) and overall survival (OS) following SMR [
7,
10,
11,
12]. However, while maximizing the extent of resection is usually the primary goal of glioblastoma surgery, preventing iatrogenic neurological deficits resulting from radical tissue removal is equally crucial [
9,
13].
To enhance visual demarcation of tumor tissue for improved precision during resection, exogenous 5-aminolevulinic acid (5-ALA) is used as a surgical adjunct. 5-ALA is an approved fluorescent marker for high-grade gliomas that selectively labels neoplastic cells. After oral administration 3–4 h before surgery, it is metabolized within the heme biosynthesis pathway to the fluorescent molecule protoporphyrin IX (PPIX), which accumulates intracellularly in malignant glioma cells due to altered metabolism [
14]. The use of 5-ALA in high-grade glioma is an established method, and contemporary neurosurgical microscopes are equipped with the necessary modalities for intraoperative fluorescence imaging [
13,
15].
PPIX fluorescence is excited by blue light around 405 nm and emits lower-energy light at approximately 635 nm, appearing pinkish red. Fluorescence-guided surgery (FGS) thus enables real-time visualization of the tumor independent of neuronavigation and brain shift [
14].
Previous studies showed that tumor regions revealed by 5-ALA probably even exceed the preoperatively detected MRI contrast-enhanced volume [
14,
16]. Its role in achieving a greater EOR in malignant gliomas and the associated benefits in PFS and OS is strongly supported by numerous studies [
17,
18,
19,
20,
21,
22,
23].
Despite these benefits, important limitations remain. Although the positive predictive value (PPV) of 5-ALA approaches 99%, the negative predictive value (NPV) ranges from 22% to 90%, indicating that tumor tissue may remain undetected [
14]. Reported specificity and sensitivity range between 89 and 100% as well as 83 and 87%, respectively [
24]. Clinically, most glioblastoma patients still develop recurrence near the resection cavity even after FGS [
7,
25]. Despite the use of fluorescent imaging, tumor cell residues were identified in more than 50% of biopsies taken from the tumor margin [
16].
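For reference, the detection metrics cited above derive from confusion-matrix counts as follows. This is a generic illustration with hypothetical counts, not data from the cited studies; note that PPV and NPV depend on the prevalence of tumor tissue among the sampled biopsies, whereas sensitivity and specificity do not.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute the standard detection metrics used to characterize
    5-ALA fluorescence from confusion-matrix counts (tp = fluorescent
    and tumor, fp = fluorescent but tumor-free, etc.)."""
    return {
        "ppv": tp / (tp + fp),            # positive predictive value
        "npv": tn / (tn + fn),            # negative predictive value
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

With hypothetical counts of tp = 90, fp = 1, tn = 20, fn = 30, the PPV is near 99% while the sensitivity is only 75%, mirroring the asymmetry reported in the literature.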
The fluorescence intensity in malignant glioma, and thus the ability to delineate tissue contaminated by tumor cells, is by no means homogeneous but is strongly related to tumor cell density and cell proliferation, among other factors [
14,
25]. In diffusely infiltrative tumor margins, fluorescence gradually fades, making tumor borders difficult to identify [
25]. In addition, interpatient variability and procedural factors—including the timing of 5-ALA administration, surgical workflow variations, and microscope parameters such as illumination intensity, working distance, angle of incidence, shadowing, and photobleaching—further influence fluorescence visibility [
14,
24,
26]. There is an apparent need for improved visualization of active tumor areas that exceeds the capabilities of basic 5-ALA fluorescence imaging in terms of sensitivity as well as objectivity [
14].
Most research attempts to meet the need for improved FGS imaging involve spectroscopic solutions and can be broadly divided into hyperspectral wide-field camera technologies and spectroscopic probe (SP)-based approaches. A review article on probe-based spectroscopy of 5-ALA-induced fluorescence is available from Gautheron et al. [
27]. The sensitivity of fluorescence detection with SPs substantially exceeds that of conventional FGS. SPs can detect fluorescence signals emanating from a tissue surface within the sub-millimeter range and additionally enable quantification of PPIX in the tissue, providing a more objective tissue assessment [
28]. However, probe-based approaches are limited to localized measurements rather than a comprehensive scan of the resection cavity, significantly reducing their practical utility.
Hyperspectral imaging (HSI) enables wide-field capture, encompassing the entire surgical field similarly to a surgical microscope. In their 2022 study, Lehtonen et al. [
29] demonstrated that HSI has a considerably lower detection threshold for PPIX fluorescence than visual assessment. Later that year, the group presented a surgical microscope-integrated HSI setup that allowed them to collect an annotated database of 52 hyperspectral images from glioma surgeries [
30]. In 2023, they published corresponding results for a multitissue (blood, nodules, glioma, dura, gray matter, reflections, vein) classification where they achieved an accuracy of 80% [
31].
Since the start of the Hyperspectral Imaging Cancer Detection (HELiCoiD) project in 2014, a research group at the Institute of Applied Microelectronics at the University of Las Palmas has been continuously involved in the ongoing development of intraoperative HSI imaging for intracranial procedures. During this time, they have not only collected a large amount of labeled data (over 890k labeled hyperspectral pixels) and created a repository, but they have also investigated different machine learning (ML) methods for segmenting the HSI images and classifying the hypercube pixels. In a recent paper from 2023, they presented a multiclass tissue classification method for HSI images that achieved an F1 score of 70.2 ± 7.9% for tumor tissue, normal tissue, blood vessels, and background differentiation [
32]. Livermore et al. [
33] as well as Jabarkheel et al. [
34] analyzed ex vivo tissue using Raman spectroscopy to differentiate between normal tissue and tumor tissue, achieving sensitivities of 96% and 91.3% and specificities of 99% and 81.2%, respectively. Despite the excellent detection rates, this approach examines tissue that has already been removed from the resection cavity. It could provide local reassurance about the resection progress, e.g., in tumor margins, but is not equivalent to real-time in vivo visualization.
In parallel, Black et al. [
35] have investigated the potential of hyperspectral fluorescence signatures as optical biomarkers for intraoperative tissue characterization (high-grade gliomas, non-glial primary tumors, radiation necrosis, miscellaneous, metastases). Random forest and multilayer perceptron models achieved average test accuracies of 84–87%, 96.1%, 86%, and 91%, respectively. Beyond tissue classification, they also focused on improving the quantitative interpretation of hyperspectral fluorescence signals. Deep learning-based approaches have been proposed to correct for heterogeneous optical and geometric tissue properties and to enable more accurate estimation of PPIX concentration from hyperspectral data [
36].
Similarly, Shen et al. have presented a method for high-throughput intraoperative assessment of ex vivo tumor specimens based on fluorescence staining using indocyanine green (ICG) in combination with deep convolutional neural network (CNN) analysis. They achieved over 90% sensitivity and over 80% specificity in differentiating tumor and non-tumor tissue [
37].
Among the numerous proposed solutions to improve intraoperative vision in glioma surgery, the predominant focus is placed on improving the imaging technology. This is immensely important, as it defines the basic ability to capture discriminative optical signals. Promising approaches, such as HSI in conjunction with ML, can additionally offer computational classification of the tissue in the image data but are largely still at the research stage. Recently launched microneurosurgical microscopes have primarily improved the visibility of the anatomy surrounding the tumor to facilitate the procedure under blue light. Thus, a surgeon’s subjective judgment of colors and intensities is still used to evaluate vague tumor borders and ambiguous fluorescence. In this work, we investigate a purely data-driven approach to improved visualization for 5-ALA-guided resections. Herein, we apply ML-based pixel classification to increase the detection sensitivity for weakly and ambiguously fluorescent areas in conventional Zeiss microscope-collected RGB images. The overarching aim of this study is to explore the potential and limitations of a data-driven approach to enhance fluorescence imaging in 5-ALA-aided glioblastoma resections. This involves a dual focus:
- 1.
Determining the sensitivity limits of neurosurgical microscope cameras in detecting PPIX fluorescence signals.
- 2.
Establishing a method for quantifying fluorescence intensity and converting it into a clinically useful visualization.
To address these objectives, we follow a systematic approach throughout this study. We begin by utilizing controllable and reproducible synthetic PPIX samples to analyze the detectability of fluorescence signals in images acquired with conventional neurosurgical microscopes. This controlled experimental setting enables a systematic evaluation of different machine learning strategies for identifying weak fluorescence signals. Furthermore, the synthetic samples allow us to develop a model for the quantitative assessment of fluorescence intensity and its conversion into a meaningful visualization. Finally, we demonstrate the potential of the proposed approach on a small dataset of real intraoperative images, illustrating its ability to reveal subtle fluorescence patterns that may remain difficult to discern by visual inspection alone.
2. Materials and Methods
2.1. Synthetic Sample Image Acquisition
As a first step toward developing a machine learning framework for improved fluorescence detection, synthetic fluorescent samples with clinically relevant PPIX concentrations were produced and analyzed to assess the separability of fluorescence signals at the pixel level. Although the optical properties of these samples differ from those of human tissue, they provide a controlled experimental setup with known ground truth, enabling fundamental insights into the detectability of fluorescence signals in images acquired with conventional surgical microscopes. In addition, the standardized nature of this dataset allows for a systematic comparison of machine learning models and facilitates the derivation of assumptions for their subsequent adaptation to real clinical data.
Liquid fluorescent samples were prepared with nine different PPIX (Sigma Aldrich, St. Louis, MO, USA) concentrations dissolved in 1 mL dimethyl sulfoxide (DMSO; Carl Roth, Karlsruhe, Germany): 5 µg/mL, 2 µg/mL, 1 µg/mL, 0.5 µg/mL, 0.2 µg/mL, 0.1 µg/mL, 0.05 µg/mL, 0.025 µg/mL, and 0.01 µg/mL. The selected concentration range was based on values reported for malignant human gliomas following oral 5-ALA administration [38,39]. In this context, the highest concentration of 5 µg/mL represents a moderate accumulation of PPIX in grade 4 gliomas, which can be expected to produce visible fluorescence under standard surgical conditions. The remaining samples were progressively diluted to approach the detection limits of conventional fluorescence-guided surgery (FGS) imaging, reaching up to a 500-fold reduction in PPIX concentration and resulting in fluorescence levels that are no longer visually detectable. Finally, a reference sample containing pure DMSO was prepared (see
Figure 1).
Sample images were acquired using the integrated camera of a microneurosurgical microscope (ZEISS KINEVO 900 S, Carl Zeiss Meditec AG, Jena, Thuringia, Germany) in fluorescence mode. For each image, three PPIX-containing samples and the reference sample were placed on a positioning template as depicted in
Figure 1(A1–A3). Since the blue light illumination source of the microscope generates an uneven optical power profile within the field of view [
26], a positioning template was used to obtain an approximately homogeneously illuminated area in each image: it aligned the samples within a defined area (the inner template circle) and thus ensured the same fluorescence excitation conditions for all samples.
2.2. Intraoperative Glioblastoma Image Acquisition
GBM images were acquired from intraoperative videos recorded during GBM fluorescence-guided resections using the microneurosurgical microscope in fluorescence mode. 5-ALA was administered to these patients 3–4 h before the start of surgery. Furthermore, images of the surgical site were collected from non-5-ALA-guided intracranial procedures, including oligodendroglioma, astrocytoma, and metastasis resections. A total of 15 frames were extracted from videos of GBM resection procedures from 6 different patients, and 10 frames from videos of non-fluorescence-guided procedures from 6 other patients. All procedures were performed according to standard practice using available assistive technology such as neuronavigation and ultrasound. No specifications were made for imaging system settings such as exposure intensity or microscope working distance. The aim was to reflect in the dataset the preferences and variability of the surgeons with regard to individually ideal visibility.
2.3. Data Preparation
The data preparation methodology was initially developed for the synthetic samples image dataset and later adapted for the intraoperative image dataset, applying uniformly to both datasets. Since the images in the datasets were provided in JPEG format, the primary focus of data preprocessing was the removal of compression artifacts, particularly block formations in dark image regions. To detect potential artifacts, a density function was approximated from the frequency distribution of pixel saturation values using a moving average filter. Saturation values whose frequency significantly exceeded the density function—by a factor greater than 1.5—were identified as likely stemming from quantization-related artifacts. Pixels exhibiting these saturation values were assigned a value of 0 in their respective color channels. These pixels were subsequently interpolated using a 9 × 9 median filter, after which the channels were reassembled into an RGB color image.
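The artifact-removal step described above can be sketched as follows. The 1.5 frequency threshold and the 9 × 9 median filter follow the text; the moving-average window size, the use of NumPy/SciPy, and the function names are our assumptions.

```python
import numpy as np
from scipy.ndimage import median_filter

def suspicious_levels(channel, window=5, factor=1.5):
    """Flag intensity levels whose frequency exceeds a moving-average
    density estimate by more than `factor`, i.e., levels likely caused
    by JPEG quantization (block formations in dark regions)."""
    hist = np.bincount(channel.ravel(), minlength=256).astype(float)
    kernel = np.ones(window) / window
    density = np.convolve(hist, kernel, mode="same")
    return np.where(hist > factor * np.maximum(density, 1e-9))[0]

def suppress_artifacts(rgb, window=5, factor=1.5, size=9):
    """Zero out flagged pixels per color channel, interpolate them with
    a size x size median filter, and reassemble the RGB image."""
    out = np.empty_like(rgb)
    for c in range(3):
        ch = rgb[..., c].copy()
        bad = np.isin(ch, suspicious_levels(ch, window, factor))
        ch[bad] = 0
        med = median_filter(ch, size=size)
        ch[bad] = med[bad]
        out[..., c] = ch
    return out
```

On an image containing a large constant-valued block against noisy background, the block's intensity level is flagged as artifact-like and its pixels are re-estimated from their neighborhoods.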
For the synthetic dataset, two non-overlapping regions of interest (ROIs) were defined in each depicted sample in all three synthetic sample images (see
Figure 1B). Herein, pixels from one ROI were used for analysis and ML training, while dissimilar pixels were extracted from the second ROI for testing and evaluation purposes. This procedure yielded between 3830 and 3850 pixels (i.e., individual data points) per ROI. Additionally, two ROIs per image were defined in the dark background and two ROIs were extracted from the positioning template markers (see
Figure 1B). Altogether, the synthetic dataset comprised 138,140 pixels from all ROIs, of which 69,087 pixels were allocated for training and 69,053 pixels for testing.
To construct the intraoperative dataset, regions of interest (ROIs) were manually defined in clearly fluorescent areas from five GBM images (see
Figure 1C). In total, 305,132 pixels were extracted from five images obtained from three patients (Patients 1–3) and used for model training.
For the non-fluorescent (negative) class, five images acquired during non-5-ALA-guided procedures were used in their entirety without ROI specification, resulting in 10,368,000 pixels. These images were captured at different time points during the procedures of Patients 7 and 8.
Model evaluation was performed on ten frames from fluorescence-guided surgery (FGS) recordings. Five of these images originated from Patients 1–3 but were taken at different stages of the procedure, where different tissue layers were exposed compared to those used for training. The remaining five images were obtained from three additional patients (Patients 4–6).
To further assess model behavior in the absence of fluorescence, five additional images from non-FGS procedures were included for evaluation, originating from Patients 9–12. A detailed overview of the composition of both datasets is provided in
Figure 2.
2.4. Qualitative t-SNE Analysis
To provide an initial qualitative assessment of the sensitivity of FGS imaging to PPIX fluorescence and an overview of data separability, the t-distributed stochastic neighbor embedding (t-SNE) algorithm was applied to the test data, including pixels from PPIX-containing and PPIX-free samples. t-SNE projections are mostly used to visualize high-dimensional data, but they also reveal local structures in data based on similarity measures. They indicate a fundamental class separability based on non-linear relationships in the data [
40]. In this study, t-SNE was utilized to evaluate class separability, offering insights into the expected performance of ML models under real-world hardware limitations. The algorithm was executed with a perplexity value of 200, using the Minkowski metric as the distance measure.
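A minimal sketch of this embedding step using scikit-learn is shown below; the perplexity of 200 and the Minkowski metric follow the text, while all remaining hyperparameters (initialization, random seed) are assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE

def tsne_embed(pixels, perplexity=200):
    """Project pixel feature vectors (e.g., RGB triplets) to 2-D with
    t-SNE using the Minkowski distance, for qualitative inspection of
    class separability between fluorescent and non-fluorescent pixels."""
    tsne = TSNE(
        n_components=2,
        perplexity=perplexity,
        metric="minkowski",
        init="random",
        random_state=0,
    )
    return tsne.fit_transform(pixels.astype(float))
```

Well-separated clusters in the resulting 2-D scatter indicate that a non-linear classifier can be expected to discriminate the classes.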
2.5. Model Development
To determine the sensitivity limits of neurosurgical microscope cameras for detecting PPIX fluorescence, a binary pixel-wise classification into fluorescent and non-fluorescent classes is required. However, complete annotation of intraoperative image data for training of a classification model is inherently infeasible. While clearly fluorescent areas can be annotated, vague fluorescence would require exhaustive, pixel-by-pixel manual assessments for reliable labeling—an impractical approach given the time and data volume required. Moreover, non-visible or ambiguous fluorescence cannot be annotated visually at all.
In contrast to the positive class, which is inherently challenging to annotate comprehensively, perfectly annotated negative class data—pixels depicting no fluorescence—can be generated in large quantities. This is achieved using intraoperative images acquired under blue light in neurosurgical procedures not involving glioma resection, where patients have not received 5-ALA. Against this background, we investigate, in addition to traditional classifiers, an anomaly detection approach in which the model focuses on extensively learning the negative class (or normal data), enabling it to identify abnormalities not encountered during training, specifically non-visible or ambiguous fluorescence.
Three traditional classifiers were investigated:
Support Vector Machine (SVM) configured with a radial basis function kernel (scale = 5);
Naïve Bayes (NB) utilizing a Gaussian probability distribution;
Neural Network (NN) implemented as a single fully connected layer with 10 neurons and ReLU activation.
These classifiers were trained using pixels from highly fluorescent and non-fluorescent synthetic samples. In parallel, three contrastive loss Variational Autoencoder (clVAE) models with varying β-values (β = 1, 2, and 3) were trained to identify fluorescent pixels as anomalies.
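The three traditional classifiers might be instantiated as follows. This is a scikit-learn sketch: the mapping of the stated kernel scale to the `gamma` parameter, the solver settings, and the iteration budget are assumptions, not the study's exact configuration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

def build_classifiers():
    """Return the three traditional pixel classifiers compared in this
    study: RBF-kernel SVM, Gaussian Naive Bayes, and a single-hidden-
    layer (10 neurons, ReLU) neural network."""
    return {
        "SVM": SVC(kernel="rbf", gamma=5.0),  # kernel scale mapped to gamma (assumption)
        "NB": GaussianNB(),
        "NN": MLPClassifier(hidden_layer_sizes=(10,), activation="relu",
                            max_iter=1000, random_state=0),
    }
```

Each model consumes per-pixel feature vectors and predicts the binary fluorescent/non-fluorescent label.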
2.5.1. Pixel Classification
Based on the considerations outlined above, only pixels from clearly fluorescent samples were used to train the ML models—specifically from samples with PPIX concentrations of 5 µg/mL, 2 µg/mL, and 1 µg/mL. The positive class thus comprised pixels exhibiting visible fluorescence, while the negative class included pixels from the fluorescence-free reference sample as well as background pixels. PPIX-containing samples without visible fluorescence were excluded from training and were only used during testing to evaluate the model’s ability to detect weak or non-visible fluorescence signals.
To create a balanced training set, 10,000 randomly selected pixels from each class were included, resulting in a total of 20,000 data points.
2.5.2. Anomaly Detection
The training set composition for the clVAE deviated from the training set of traditional classifiers. Unlike conventional classifiers, which rely on balanced training sets, the clVAE focuses on learning the negative class (non-fluorescent pixels) extensively. Thus, any amount of data from the negative class can be used without requiring balance with the positive class. For this model, 60,000 randomly selected pixels were extracted from ROIs within the non-fluorescent regions and assigned to the negative class (i.e., normal data). From this pool, 10,000 pixels were further selected at random and paired with 10,000 fluorescent pixels from the positive class (anomaly data) to create the dataset for contrastive training. Feature extraction was employed for dimensionality extension prior to training. The resulting set of 10 features, along with their respective computations, is detailed below:
[Table: the 10 extracted features and their computations (original image “Cancers 18 01125 i001” not reproduced here).]
The VAE architecture comprised a 10-dimensional input layer, a three-dimensional latent layer, and a corresponding 10-dimensional output layer. Training was conducted over 1000 epochs, with calculations performed on randomly selected mini-batches of size 256 within each epoch. The learning rate was progressively reduced from 0.001 to 0.0001 following an exponential decay schedule. The β-factor, which controls the balance between reconstruction accuracy and latent space regularization, was adjusted using a two-phase cyclic annealing algorithm spanning a total of 20 cycles. During the first half of each cycle, β increased from 0 to its maximum value following a sigmoidal progression. In the second half of each cycle, β remained constant at this maximum value. Three distinct VAE models were trained for anomaly detection, each with a different maximum β-factor: β = 1, β = 2, and β = 3. The Kullback–Leibler divergence was used as the anomaly score for normal and anomaly classification.
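The two-phase cyclic annealing of the KL weight described above can be sketched as a function of the epoch index; the steepness of the sigmoidal ramp is an assumption, as the text does not specify it.

```python
import numpy as np

def beta_schedule(epoch, total_epochs=1000, cycles=20, beta_max=1.0):
    """Two-phase cyclic annealing of the KL weight: within each cycle,
    the weight rises sigmoidally from ~0 to beta_max during the first
    half and stays constant at beta_max during the second half."""
    cycle_len = total_epochs // cycles          # 50 epochs per cycle here
    pos = (epoch % cycle_len) / cycle_len       # position in cycle, in [0, 1)
    if pos < 0.5:
        # map the first half-cycle [0, 0.5) onto the sigmoid range [-6, 6]
        return beta_max / (1.0 + np.exp(-(pos * 24.0 - 6.0)))
    return beta_max
```

With 1000 epochs and 20 cycles, each cycle spans 50 epochs: the weight is near zero at epoch 0, rises through the first 25 epochs, and is clamped to its maximum for the remaining 25.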
2.5.3. PPIX Quantification
In addition to detecting fluorescence, an objective quantification of signal intensity is essential for assessing the biological relevance and potential malignancy of tissue. To this end, a predictive model was developed to estimate the underlying PPIX concentration of each pixel, using the synthetic samples as a calibrated reference system with known fluorophore content. All training pixels from PPIX-containing samples, represented with their extended feature set (see
Section 2.5.2), were included to develop a robust quantification model for predicting the PPIX concentration corresponding to each pixel. To begin, the predictive power of individual features for concentration estimation was assessed using a random forest (RF) as an evaluation tool.
The fit quality of the RF was determined through out-of-bag (OOB) prediction, which provides a reliable measure of model performance without requiring a separate validation set. Feature importance scores were derived based on the OOB predictor importance estimates using permutation, highlighting the most influential features for PPIX concentration prediction. Subsequently, the strongest predictors identified by the RF were used to obtain a PPIX concentration prediction model using polynomial fitting.
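The feature-ranking and polynomial-fitting step might look as follows. Note that scikit-learn's impurity-based importances are used here as a stand-in for the OOB permutation importances described in the text, and all hyperparameters (number of trees, polynomial degree) are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def rank_features_and_fit(X, y, degree=3):
    """Rank features with a random forest (OOB R^2 as fit quality),
    then fit a polynomial mapping the strongest predictor to the known
    PPIX concentration of each pixel. Returns the OOB score, the index
    of the strongest predictor, and the fitted polynomial."""
    rf = RandomForestRegressor(n_estimators=200, oob_score=True,
                               random_state=0)
    rf.fit(X, y)
    best = int(np.argmax(rf.feature_importances_))
    coeffs = np.polyfit(X[:, best], y, deg=degree)
    return rf.oob_score_, best, np.poly1d(coeffs)
```

The returned polynomial then serves as the per-pixel concentration estimator, with the OOB score indicating how much concentration-related signal the features carry.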
Figure 3 visualizes the complete ML-model and quantification model development workflow including data preparation.
2.6. Model Transfer and Evaluation
Following the model comparison presented above, the β = 1 clVAE was selected for application to real intraoperative data. In this new training phase, 500,000 randomly selected pixels extracted from non-FGS surgery images were used as normal data, accompanied by 50,000 normal–anomaly pairs (see
Section 2.3 for details on the intraoperative dataset). The remaining hyperparameters were carried over from the training process designed for the labeled PPIX-sample dataset.
The model’s false positive rate was initially determined by evaluating pixels classified as anomalies within the five test set images from non-FGS interventions. Additionally, four experienced surgeons were invited to annotate regions in a selection of 10 images from FGS procedures wherever they could still visually discern fluorescence. The regions they annotated were compared with the areas identified as anomalies by the clVAE model to assess the model’s ability to detect subtle or non-visible fluorescence and to explore its potential for providing a more sensitive and objective tool for real-time surgical guidance in fluorescence-aided GBM resections. To facilitate this, a custom application was developed using the Matlab App Designer Tool (version R2024a, MathWorks, Natick, MA, USA), providing the surgeons with user-friendly annotation capabilities and saving annotations as logical masks. A screenshot of its graphical user interface (GUI) is shown in
Figure 4.
Surgeons were instructed to adjust their monitor brightness to optimize the visibility of fluorescence in the test images.
4. Discussion
The relationship between tumor cell burden and disease progression, coupled with the observation that residual glioma cells remain at the resection margins in approximately half of FGS cases, underscores the need for improved intraoperative imaging techniques capable of reliably delineating high-grade gliomas.
The new generation of microneurosurgical imaging systems has already introduced advancements in fluorescence modules. Leica Microsystems, for example, has introduced two new visualization modes within its GLOW400 AR application for 5-ALA-FGS, both based on multispectral imaging [
41]. The Anatomy view mode provides a clear vision of the surrounding anatomy and eliminates the need to switch between white light and blue light. The Highlighted Fluorescence View mode is designed to enhance the visibility of weak fluorescence against the background, although a formal evaluation of its effectiveness is currently lacking. The Orbeye exoscope from Olympus similarly enhances anatomical visibility through image processing, even under poor lighting conditions. For detecting fluorescent tumor areas, the Orbeye system demonstrated a sensitivity of 75% and a specificity of 80% [
42]. Suero Molina et al. [
16] have presented a further development of the 5-ALA fluorescence module in Zeiss surgical microscopes named BLUE 400 AR mode. Here, enhanced visibility in FGS mode is achieved through an improved filter system and fluorescence detection was found to have a sensitivity of 70.89% and a specificity of 97.37%.
While current and ongoing efforts focus on enhancing microneurosurgical imaging devices with advanced camera and image acquisition technologies, we propose augmenting existing systems with an ML-based approach for pixel-wise image assessment and real-time visualization.
In this study, we demonstrated the extensive potential of a VAE-based anomaly detection approach for more sensitive and objective fluorescence detection and visual indication. Experimental investigations incorporating synthetic fluorescent samples of known ground truth revealed that ML models could learn the correct patterns to identify weak or non-visible PPIX fluorescence in FGS imaging pixels, even when trained solely on data containing strong fluorescence signals. At a PPIX concentration of 0.1 µg/mL represented in a pixel, fluorescence was detected in about half of the test data samples, a concentration five times lower than that producing weak, faintly visible fluorescence.
Further, the fluorescent regions identified by the clVAE extended well beyond the areas annotated by the surgeons through visual assessment, often by a significant margin. Although pixel-wise ground truth was unavailable for the images used in this study, it is highly plausible that the areas detected by the clVAE reflect a learned fluorescence fingerprint. Beyond the experimental data, this is further supported by the expansion of the detected areas, which encompass not only the surgeon-annotated regions but also the fading fluorescence at their periphery, extending into well-illuminated tissue. Importantly, the detected areas differed markedly from the false positives found in images from non-fluorescence-guided procedures, which were typically small, scattered, and primarily associated with reflections on the tissue surface.
However, it is important to note that this paper presents a set of preliminary results and serves as a proof of concept. Consequently, the study has certain limitations, particularly due to the restricted dataset used. Another notable limitation is the model’s reliance solely on pixel-specific features as input data. Future iterations of this approach should incorporate a more extensive dataset to improve sensitivity by including additional feature types. Specifically, integrating local features from neighboring pixels and global features that capture contextual information from the entire image could provide valuable insights for ML models. This extended contextual data has the potential to significantly enhance model performance and reliability.
The evaluation of the clVAE model’s specificity yielded an excellent value exceeding 99.9%. However, the limited amount of test data constrains the validity of this result. False positives were primarily associated with specular reflections on wet tissue surfaces, a plausible finding: such reflections can produce high signal intensities and partial sensor saturation across color channels, particularly with strong red components that resemble fluorescence signals. Addressing these artifacts will be essential for clinical applicability. Approaches such as cost-sensitive learning or hard-negative mining may help the model better distinguish true fluorescence signals from reflection-induced artifacts.
The quantification model presented in this study, aimed at modeling fluorescence intensity and providing a corresponding visualization, demonstrates promising qualitative results. Typically, a high-intensity nucleus is observed, corresponding to the visible fluorescence in the original images, with a gradual decrease in intensity toward surrounding tissue. This pattern is consistent with the infiltrative growth behavior characteristic of high-grade glioma. The assumption that fluorescence intensity correlates with tumor cell density—and more specifically with the concentration of PPIX within the tissue volume—is intuitive. However, several factors, including exposure intensity, shadowing effects, the time elapsed since 5-ALA administration, and interpatient variability, cannot be inferred from pixel-level information alone. Consequently, the conclusions drawn from the model remain qualitative in nature.
Importantly, the displayed fluorescence intensity cannot be considered a direct surrogate for tumor cell density and should therefore not be interpreted as a recommendation for further tissue resection. Particularly in eloquent brain regions, surgical decision-making must remain guided by established functional and anatomical constraints, with ML-based fluorescence enhancement serving only as an additional source of information to support intraoperative assessment.
Despite these limitations, objective, data-driven classification methods hold significant potential for supporting intraoperative decision-making. This is particularly relevant in critical, eloquent brain areas, where cytoreduction requires precise and informed considerations to balance tumor removal with the preservation of healthy tissue.
Comparable clinical challenges exist in other oncologic disciplines where complete tumor resection or accurate tissue identification is crucial for patient prognosis. In gynecologic oncology, fluorescence-guided surgery is routinely used for sentinel lymph node mapping in endometrial, cervical, and vulvar cancer, typically employing indocyanine green (ICG). Similar applications are found in thoracic surgery for non-small cell lung cancer using near-infrared fluorescence imaging. While artificial intelligence is already widely applied in these fields for diagnostic imaging, radiomics-based staging, lymph node classification, and treatment planning, the direct AI-assisted interpretation of intraoperative fluorescence imaging remains limited. Current intraoperative applications are often constrained by challenges such as limited availability of well-annotated datasets and regulatory requirements for clinical decision-support systems [
43].
Nevertheless, recent studies have explored advanced optical imaging techniques combined with machine learning to address these limitations. In addition to hyperspectral imaging approaches that have been investigated particularly in neurosurgery, other modalities such as fluorescence lifetime imaging combined with ML models like SVM, random forests, and CNNs have shown promise for tumor detection in oral and oropharyngeal cancer [
44]. Likewise, AI-assisted analysis of ICG perfusion videos has demonstrated encouraging results in rectal cancer for the demarcation of tumor and healthy tissue [
45]. Beyond intraoperative imaging alone, multimodal AI approaches integrating radiological imaging, pathological features, and molecular or omics data are increasingly investigated to provide a more comprehensive representation of disease characteristics and may enable applications such as real-time pathological grading of gliomas [
37].
These developments highlight both the potential and the current limitations of machine learning in surgical decision support. While AI can enhance image interpretation, improve tissue classification, and integrate complex multimodal information, its clinical deployment remains limited by data availability and the need for robust validation. In this context, the presented approach aims to contribute to the emerging field of AI-assisted fluorescence-guided surgery by enabling sensitive fluorescence detection using standard RGB microscope images without requiring specialized imaging hardware.
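The anomaly-detection principle underlying the presented approach, flagging pixels that a model of normal (non-fluorescent) tissue fails to explain, can be illustrated without reproducing the clVAE itself. The following sketch uses a low-rank PCA model as a simplified stand-in: a background model is fitted to non-fluorescent pixels, and the reconstruction error then serves as an anomaly score, analogous in spirit to reconstruction-based scoring in a variational autoencoder. All function names and the choice of PCA are illustrative assumptions.

```python
import numpy as np

def fit_background_model(X_bg, k=2):
    """Fit a rank-k PCA model of non-fluorescent pixel features.

    X_bg: (n, d) feature matrix of background pixels.
    Returns the mean and the top-k principal directions.
    """
    mu = X_bg.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_bg - mu, full_matrices=False)
    return mu, Vt[:k]

def anomaly_score(X, mu, comps):
    """Reconstruction error under the background model.

    Pixels the model reconstructs poorly receive high scores and are
    flagged as anomalous, i.e., candidate fluorescence.
    """
    Z = (X - mu) @ comps.T          # project into the background subspace
    recon = Z @ comps + mu          # reconstruct from the subspace
    return np.linalg.norm(X - recon, axis=1)
```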
5. Conclusions and Future Perspectives
This study demonstrates the feasibility of using machine-learning-based anomaly detection to enhance the visualization of PPIX fluorescence in conventional RGB images acquired with surgical microscopes. Even with limited training data, the proposed approach was able to detect weak fluorescence signals beyond the limits of visual perception and to highlight extended regions potentially associated with infiltrative tumor tissue.
We propose this system as a lightweight software plugin for real-time fluorescence evaluation, designed for integration into existing operating room workflows. This approach has the advantage of requiring no major hardware modifications and could therefore enable faster and more cost-effective clinical adoption compared with emerging hardware-based solutions such as multispectral imaging systems.
In future work, larger and more diverse intraoperative datasets will be essential to further validate and refine the proposed models. Incorporating contextual image features, improving robustness against reflection artifacts, and systematically evaluating model performance across different surgical conditions will be important steps toward clinical translation. In addition, direct comparisons with spectroscopic imaging techniques should be performed to determine whether ML-based analysis of conventional RGB data can achieve comparable diagnostic performance.
Ultimately, integrating real-time, ML-assisted decision support into surgical microscopes could enhance intraoperative safety, support more complete tumor resections, improve workflow efficiency, and, eventually, contribute to better patient outcomes.