Evaluating Prognosis of Gastrointestinal Metastatic Neuroendocrine Tumors: Constructing a Novel Prognostic Nomogram Based on NETPET Score and Metabolic Parameters from PET/CT Imaging

Introduction: The goal of this study is to compare the prognostic performance of NETPET scores, based on gallium-68 DOTANOC (68Ga-DOTANOC) and fluorine-18 fluorodeoxyglucose (18F-FDG) Positron Emission Tomography-Computed Tomography (PET-CT), and PET-CT metabolic parameters in metastatic gastrointestinal neuroendocrine tumors (GI-NET), while constructing and validating a nomogram derived from dual-scan PET-CT. Methods: In this retrospective study, G1–G3 GI-NET patients who underwent 68Ga-DOTANOC and 18F-FDG PET scans were enrolled and divided into training and internal validation cohorts. Three grading systems were constructed based on NETPET scores and standardized uptake value maximum (SUVmax). LASSO regression selected variables for a multivariable Cox model, and nomograms predicting progression-free survival (PFS) and overall survival (OS) were created. The prognostic performance of these systems was assessed using time-dependent receiver-operating characteristic (ROC) curves, concordance index (C-index), and other methods. Nomogram evaluation involved calibration curves, decision curve analysis (DCA), and the aforementioned methods in both cohorts. Results: In this study, 223 patients (130 males; mean age ±  SD: 52.6 ± 12 years) were divided into training (148) and internal validation (75) cohorts. Dual scans were classified based on NETPET scores (D1–D3). Single 68Ga-DOTANOC and 18F-FDG PET-CT scans were stratified into S1-S3 and F1-F3 based on SUVmax. The NETPET score-based grading system demonstrated the best OS and PFS prediction (C-index, 0.763 vs. 0.727 vs. 0.566). Nomograms for OS and PFS exhibited superior prognostic performance in both cohorts (all AUCs > 0.8). Conclusions: New classification based on NETPET score predicts patient OS/PFS best. PET-CT-based nomograms show accurate OS/PFS forecasts.


Introduction
Neuroendocrine neoplasms (NENs) are rare, heterogeneous tumors that arise from the neuroendocrine cell system, show neuroendocrine differentiation and markers, and secrete various peptide hormones and biogenic amines [1]. NENs can occur throughout the body; the most common are gastroenteropancreatic NENs (GEP-NENs), followed by those in the lungs, and among GEP-NETs, gastrointestinal NETs (GI-NETs) are the most common [2]. Although the incidence is low (3.56 per 100,000 per year in the US), it has risen steadily in recent years [3]. The World Health Organization (WHO) classifies NENs into functional and non-functional categories based on hormonal secretion and the presence of hormone-related symptoms, and by pathological differentiation into well-differentiated neuroendocrine tumors (NETs) and poorly differentiated neuroendocrine carcinomas (NECs) [4].
NENs have a high metastatic potential: ~20% of patients present with distant metastases at initial diagnosis [5], and most patients develop distant metastases during the course of the disease [6]. Distant metastases correlate with reduced patient survival, and the heterogeneity of well-differentiated NETs with distant metastasis makes predicting their prognosis a significant clinical challenge.
The WHO grading system, currently the most utilized tool for predicting NEN prognosis, has long been a cornerstone in NEN diagnosis and treatment, significantly guiding clinical decisions [7,8]. However, tumor heterogeneity has complicated the assessment of tumor biology [9], prompting other approaches to prognostic evaluation, such as molecular imaging modalities that depict aggressive cell populations [10].
Positron Emission Tomography (PET) is one of the most widely used molecular imaging methods in clinical settings, utilizing tomographic techniques to map the three-dimensional distribution of positron-emitting radiotracers within the body. PET facilitates the noninvasive, quantitative evaluation of biochemical and physiological processes. It is often integrated with Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) technologies to create PET-CT and PET-MRI systems, respectively, allowing for the simultaneous acquisition of molecular metabolic information and anatomical details in a single scanning session [11]. This technology is commonly used for the diagnosis and staging of tumors, as well as for assessing the metabolic activity of tissues [12].
Utilizing various tracers, PET can capture distinct metabolic and biological characteristics of normal and tumor tissues. The most commonly used tracer is the 18F-labelled glucose analogue FDG, whose accumulation in tissues reflects glucose utilization [11]. Tumor activity is linked to the overexpression of GLUT glucose transporters and increased hexokinase activity, making FDG PET widely used in oncology for detection, staging, restaging, and assessing treatment response [13,14].
Another commonly used class of tracers is the 68Ga-DOTA-conjugated peptides. Their use in assessing tumors that express somatostatin receptors (SSTRs) rests on these compounds' high affinity for SSTRs [15-17]. Somatostatin (SST) is a small cyclic neuropeptide present in neurons and endocrine cells, with a high density in the brain, peripheral neurons, endocrine pancreas, and GI tract [11]. Since NENs originate from neuroendocrine cells and most express SSTRs, PET-CT using 68Ga-DOTA-conjugated peptides can effectively target and visualize them [18].
PET imaging has emerged as a prominent tool in NEN imaging and in guiding the selection of optimal systemic therapies [19,20]. Research indicates that 68Ga-DOTATATE PET/CT exhibits a sensitivity greater than 94% and a specificity greater than 92% for detecting NETs, both superior to conventional CT and MRI [21]. Well-differentiated NEN cells tend to exhibit more SSTRs on their membranes, making PET-CT with 68Ga-DOTA an accurate method for identifying well-differentiated NENs [22]. Positive results from these imaging techniques usually indicate lower tumor aggressiveness, a better prognosis, and greater suitability for Peptide Receptor Radionuclide Therapy (PRRT) [23,24]. Increased avidity on 18F-FDG PET, by contrast, indicates higher metabolic activity, suggesting aggressive biology and poorer prognosis; in such cases, PRRT becomes less applicable [25-27].
Many studies confirm a significant link between 18F-FDG and 68Ga-DOTA PET/CT metabolic parameters (e.g., SUVmax) and GEP-NET prognosis [25,28]. Recent research shows that combining both tracers in PET/CT scans improves prognostic insights for metastatic GEP-NET patients. The most notable example is the NETPET scoring system of Chan et al. [29], which has been found to independently correlate with prognosis and is widely recognized [30,31]. Previous studies have primarily investigated the association between individual assessment methods and patient prognosis, with few comparing the predictive performance of these methods. Additionally, they only demonstrated the relationship between PET-CT-related parameters or scores and NET prognosis without extensively utilizing them for prognostic prediction.
Therefore, the present study aims to compare the prognostic performance of NETPET scores and PET-CT metabolic parameters and to establish a novel prognostic tool by combining them with clinicopathological indicators.

Overall Survival and Progression-Free Survival for Patients
During follow-up, 68 (30.5%) patients died, and 195 (87.5%) patients experienced disease progression or death. The mean follow-up time was 27.2 ± 12.4 months (median: 26 months; range: 2-63 months). Survival analysis of PFS and OS was performed for the four grading systems using the Kaplan-Meier (KM) method. The D grading system showed the best discriminative ability for both OS and PFS. Median OS was not reached for D1 and was 48 months for D2 and 22 months for D3 (p < 0.001, shown in Figure 2a). Median PFS was 23 months for D1, 12 months for D2, and 6 months for D3 (p < 0.001, shown in Figure 2e). The F grading system also discriminated well for both OS and PFS. Median OS was 60 months for F1, 35 months for F2, and 24 months for F3 (p < 0.001, shown in Figure 2b), and median PFS was 18 months for F1, 11 months for F2, and 6 months for F3 (p < 0.001, shown in Figure 2f). The S grading system showed poor discriminative ability for both OS (p = 0.16, shown in Figure 2c) and PFS (p = 0.43, shown in Figure 2g). The WHO G grading system demonstrated only moderate discriminatory ability for both OS (p = 0.52, shown in Figure 2d) and PFS (p = 0.05, shown in Figure 2h).
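The Kaplan-Meier medians reported above, including a median that is "not reached", can be illustrated with a minimal sketch. This is not the authors' analysis code; the function names and the toy follow-up data are hypothetical, and the median is taken as the first event time at which the estimated survival probability falls to 0.5 or below.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    times  : follow-up time per patient (e.g., months)
    events : 1 if the event (death or progression) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        n_at_risk -= n_leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First event time with survival <= 0.5, or None if the median is not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy cohorts (months), not study data.
high_risk = kaplan_meier([6, 12, 12, 20, 30, 40], [1, 1, 0, 1, 0, 0])
low_risk = kaplan_meier([5, 10, 15, 20], [0, 1, 0, 0])
print(median_survival(high_risk), median_survival(low_risk))
```

In the second toy cohort the survival curve never drops to 0.5, so the function returns None, which corresponds to an unreached median such as the one reported for D1.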

Comparing the Prognostic Value of D Grade, F Grade, S Grade, and WHO Grading System
C-index calculations for OS and PFS prediction were conducted across the three models and the WHO grading system, revealing the highest values for D grade (D vs. F vs. S vs. WHO; OS: 0.763, 0.727, 0.566, 0.650; PFS: 0.724, 0.630, 0.556, 0.592). Additionally, D grade achieved the lowest AIC and the highest LR-test and R2 values for both OS and PFS, indicating superior model fit (shown in Table 2). ROC curves (shown in Supplementary Figure S5), AUC, NRI, and IDI analyses further demonstrated the enhanced overall predictive performance and clinical utility of D grade for OS and PFS (with D grade as the reference, NRI and IDI for the other gradings were <0, except for the FDG-based grading in 1-year OS, whose NRI was >0 but with a p-value >0.1; shown in Supplementary Table S1).
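For readers unfamiliar with the concordance index compared above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is an illustration under stated assumptions, not the software used in the study: the function name and toy data are hypothetical, a higher risk score (for example, a higher grade) is assumed to mean a worse prognosis, and pairs with tied follow-up times are simply skipped.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float], events: Sequence[int],
                    risk: Sequence[float]) -> float:
    """Fraction of comparable patient pairs whose risk ordering matches their outcome ordering.

    A pair (i, j) is comparable when patient i had an observed event strictly
    before patient j's follow-up time. Ties in the risk score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: months of follow-up, event flags, and a 3-level grade.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
grade = [3, 2, 2, 1]
print(harrell_c_index(times, events, grade))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported values (for example, 0.763 for D grade in OS) should be read.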

Construction and Validation of the Nomograms
The entire cohort was randomly divided into a training set and an internal validation set at a 2:1 ratio. Ultimately, 148 patients were incorporated into the training set for constructing prognostic models for OS and PFS, while 75 patients were included in the internal validation set to assess the predictive performance of these models. There were no statistically significant differences in baseline characteristics or in D, F, and S classifications between the two groups (all p > 0.05).
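A random 2:1 split of this kind can be sketched as follows. The seed, the function name, and the use of integer division for the training-set size are assumptions made for illustration; the text does not state how the split was implemented.

```python
import random
from typing import Iterable, List, Tuple


def split_two_to_one(patient_ids: Iterable[int], seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle patient IDs reproducibly and split them 2:1 into training and validation."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    rng.shuffle(ids)
    n_train = len(ids) * 2 // 3  # two thirds, rounded down (an assumption)
    return ids[:n_train], ids[n_train:]


train, validation = split_two_to_one(range(223))
print(len(train), len(validation))
```

With 223 patients, rounding the training share down gives 148 and 75, matching the cohort sizes reported above.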
Using LASSO regression and cross-validation in the training cohort (shown in Supplementary Figure S6a,b), five OS-predicting variables were identified: age, metastasis status, WHO G grade, D grade, and F grade. These variables informed prognostic nomograms for patients' 1-, 2-, and 3-year OS (shown in Figure 3a). Multivariate Cox regression analyses revealed D classification as independently and significantly associated with OS in both the training cohort and the internal validation cohort (shown in Table 3). Similarly, LASSO regression identified six PFS-predicting variables (shown in Supplementary Figure S6c,d): age, metastasis status, treatment modality, WHO G grade, D classification, and F classification. Nomograms for patients' 6-, 12-, and 18-month PFS were constructed (shown in Figure 3b). Multivariate Cox regression analyses revealed D classification as independently and significantly associated with PFS in both the training cohort and the internal validation cohort (shown in Table 4). The nomograms were then validated in both the training set and the internal validation set. In the training set, the nomograms exhibited a C-index of 0.810 (95% CI: 0.747-0.874) for OS prediction and 0.741 (95% CI: 0.692-0.789) for PFS prediction. In the internal validation set, the C-index was 0.849 (95% CI: 0.781-0.916) for OS prediction and 0.824 (95% CI: 0.778-0.871) for PFS prediction. The ROC curves for 1-, 2-, and 3-year OS predictions demonstrated an AUC of over 0.8 in both the training and internal validation sets (shown in Figure 4a-c,g-i). Similarly, the ROC curves for 6-, 12-, and 18-month PFS predictions showed an AUC of over 0.8 in both sets (shown in Figure 4d-f,j-l). Additionally, the calibration curves indicated a good fit between the nomogram predictions and the actual event occurrences (shown in Figure 5). Furthermore, the DCA demonstrated upward trends for both OS and PFS nomograms, with excellent separation from the baseline models. The DCA curves exhibited higher net benefits within the common threshold selection range, regardless of whether they were applied to the training cohort or the internal validation cohort (shown in Figure 6).
We evaluated the nomogram's prognostic efficacy against the D, F, and WHO classifications. The ROC curves showed that the nomograms had superior 1-, 2-, and 3-year OS and 6-, 12-, and 18-month PFS predictions in both the training and validation cohorts, with higher AUC values (shown in Figure 4). DCA revealed that the nomograms had the highest net benefit across most threshold probabilities (shown in Figure 6). Goodness of fit was also assessed, and the nomograms showed the lowest AIC and the highest C-index, R-squared, and LR test values for OS and PFS predictions in both cohorts (shown in Tables 5 and 6). Clinical applicability was assessed using NRI and IDI metrics, with the nomogram outperforming the other staging systems in both cohorts (shown in Supplementary Tables S2 and S3).
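The "net benefit" plotted in a decision curve can be illustrated with a minimal sketch. This is the standard binary-outcome form, net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability. Decision curve analysis for survival endpoints typically adjusts for censoring, which this sketch does not do. The function names and toy data are hypothetical.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: observed events and model-predicted event probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.1, 0.05]
print(net_benefit(y_true, y_prob, 0.5), net_benefit_treat_all(y_true, 0.5))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero by definition.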
In our final analysis, we divided the training and internal validation sets into several subgroups based on primary tumor location and treatment modality, which allowed us to validate the nomogram's efficacy across these diverse subgroups. The ROC curves indicate that the model consistently achieved AUCs of ≥0.75 for both OS and PFS across all primary-location subgroups (illustrated in Supplementary Figure S10), and the average C-index was 0.84 (as detailed in Supplementary Table S4). In the subgroups categorized by treatment modality, the ROC curves (illustrated in Supplementary Figure S11) demonstrate that the model's predictions for both OS and PFS remained robust, with AUCs of ≥0.79, and the average C-index was 0.80 (as detailed in Supplementary Table S5). This evidence supports the model's stability and applicability across varied clinical subgroups.

Discussion
Over the past two decades, medical imaging technology has advanced significantly, including remarkable progress in nuclear medicine. This is particularly evident in the context of NETs. As the majority of well-differentiated NETs express somatostatin receptors, PET-CT imaging with specialized radiotracers offers unique advantages [32]. Numerous studies have reported that PET-CT imaging using 68Ga-labeled somatostatin analogs (68Ga-SSA) as tracers exhibits exceptionally high specificity and sensitivity in the diagnosis of NETs [33-35]. Additionally, 68Ga PET/CT can serve as a valuable tool in guiding the management of Peptide Receptor Radionuclide Therapy (PRRT) and Somatostatin Analogues (SSA) [36]. Therefore, it plays a critical role in the comprehensive management of NETs. 18F-FDG is the most commonly used radiotracer in clinical PET-CT imaging. It is effective in detecting a wide range of tumors with high sensitivity to glucose metabolism and can reveal their metabolic activity, thereby indicating the malignancy level of the tumor [37,38]. In the context of NETs, 18F-FDG PET-CT imaging plays a crucial role in the diagnosis and detection of tumors with higher proliferative potential.
Not all NETs are suitable for PET/CT scans, as excessive use of nuclear medicine examinations may lead to radiation damage in patients and wasted resources. According to the consensus of the European Association of Nuclear Medicine (EANM) [39], 68Ga-DOTA-based PET/CT is only applicable for (1) NETs with an unknown primary site, (2) metastatic NETs, and (3) staging/restaging of NETs. Meanwhile, 18F-FDG-based PET/CT is limited to (1) neuroendocrine carcinoma, (2) G3 NETs, and (3) G1 and G2 NETs with confirmed negative somatostatin PET/CT uptake.
Although 68Ga-DOTANOC PET/CT and 18F-FDG PET/CT differ in their roles and scope of application in the diagnosis and treatment of NETs, numerous studies have demonstrated that the relationship between the two modalities should be viewed as complementary rather than competitive. Irfan Kayani et al. [40] collected imaging results from 38 patients diagnosed with NETs and analyzed the diagnostic efficacy of 68Ga-DOTATATE and 18F-FDG PET/CT scans individually as well as in combination. The findings indicated that the sensitivity and specificity of the combined dual-tracer scans were higher than those of either tracer alone in diagnosing NETs (dual-tracer sensitivity: 92%, 68Ga-DOTATATE PET/CT sensitivity: 82%, and 18F-FDG PET/CT sensitivity: 66%). Duygu Has Simsek et al. [31] investigated the relationship between the SUVmax of 68Ga-DOTATATE and 18F-FDG PET/CT and their correlation with histopathological findings and metastasis. The results revealed that the combined impact of 18F-FDG and 68Ga-DOTATATE PET/CT on treatment decision making was 59%. Furthermore, the dual-tracer scans could overcome the limitations of histopathological grading, particularly in intermediate-grade GEP-NETs.
In recent years, the prognostic implications of dual scans have increasingly gained attention. Among these, the NETPET scoring system, pioneered by Chan et al. [29], is the most widely accepted system at present. It primarily targets metastatic GEP-NET and stratifies patients' prognostic risk based on the comparison of SSTR and FDG uptake in lesions on whole-body PET-CT scans, along with the number of metastatic lesions. The system is divided into eight categories, from P1 to P5. Its prognostic stratification effectiveness has been validated in multiple centers [41] and has proven to be effective beyond GEP-NETs [42].
Recently, numerous studies have explored the prognostic implications of semi-quantitative or quantitative data derived from PET/CT scans. Amit Tirosh et al. [43] assessed the relationship between total 68Ga-DOTATATE-avid tumor volume (68Ga-DOTATATE TV) and progression-free survival (PFS), finding that the quartiles of 68Ga-DOTATATE TV were negatively correlated with PFS (p = 0.001) and disease-specific survival rates (p = 0.002). This demonstrated that 68Ga-DOTATATE TV values were associated with the prognosis of NET patients. Additionally, Akira Toriihara et al. [44] investigated the association between the SUVmax of the lesion with the highest 68Ga-DOTATATE uptake in each patient and PFS, revealing a significant correlation between the two. J. Zhang et al. [45] included 495 patients with metastatic GEP-NETs who underwent PRRT and investigated the relationship of 18F-FDG PET/CT SUVmax values and qualitative indicators with OS and PFS. The results revealed that the presence of 18F-FDG PET-positive lesions was an independent prognostic factor for NEN patients receiving PRRT. It is worth noting that whole-body PET-CT imaging offers unique advantages over other imaging modalities in detecting metastatic lesions throughout the body [46].
In this study, like most existing research, we utilized SUVmax as a semi-quantitative indicator for single PET/CT scans. A new grading system was developed based on SUVmax, incorporating an F grading system for 18F-FDG and an S grading system for 68Ga-DOTA. We compared F grading, S grading, and a combined dual-scan D grading using the KM method. The results demonstrated that both the combined 68Ga-DOTA and 18F-FDG PET/CT scans (median survival: D1 vs. D2 vs. D3, 60 months vs. 48 months vs. 22 months, as shown in Figure 1a; median PFS: D1 vs. D2 vs. D3, 23 months vs. 12 months vs. 6 months, as shown in Figure 1e) and the single 18F-FDG PET/CT scans (median survival: F1 vs. F2 vs. F3, 60 months vs. 35 months vs. 24 months, as shown in Figure 1b; median PFS: F1 vs. F2 vs. F3, 18 months vs. 11 months vs. 9 months, as shown in Figure 1f) provided good stratification for patients' OS and PFS. Furthermore, the KM curves indicated that higher D and F gradings are associated with worse prognosis. This is plausible, since a higher D grade indicates an increase in lesion count and in glucose uptake relative to somatostatin analog uptake, while a high F grade signifies strong glucose uptake by lesions, a marker of high tumor malignancy. The KM analysis suggests that, when evaluating PET/CT scans of NEN patients, a propensity for lesions to take up more glucose than somatostatin analog is an indicator of poor prognosis that warrants attention from clinicians.
These results are similar to those of Binderup et al. [25], who used SUVmax as a semi-quantitative indicator for 18F-FDG PET/CT to explore its relationship with prognosis and found that FDG SUVmax was significantly correlated with patient prognosis. In their research, the cutoff point for FDG-positive SUVmax was set at 4, while the maximum survival difference was found at a cutoff of 7 (p = 0.001), similar to our study's stratification of FDG SUVmax values (F1: 0-3.5; F2: 3.6-6.1; F3: ≥6.1).
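The FDG SUVmax stratification quoted above (0-3.5, 3.6-6.1, ≥6.1) can be written as a small lookup. This is only a sketch: the function name is hypothetical, the labels F1-F3 follow the F grading described for 18F-FDG, and the handling of boundary values (3.5 assigned to the lowest group, 6.1 to the highest) is an assumption, since the quoted ranges overlap at 6.1 and leave a gap between 3.5 and 3.6.

```python
def fdg_suvmax_grade(suvmax: float) -> str:
    """Map an 18F-FDG SUVmax value to a three-level grade using the quoted cut-offs."""
    if suvmax < 0:
        raise ValueError("SUVmax cannot be negative")
    if suvmax <= 3.5:      # quoted range 0-3.5
        return "F1"
    if suvmax < 6.1:       # quoted range 3.6-6.1; 6.1 itself goes to the top group (assumption)
        return "F2"
    return "F3"            # quoted range >= 6.1


for value in (2.0, 4.8, 9.3):
    print(value, fdg_suvmax_grade(value))
```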
We further compared the stratification and prediction performance of the three classifications for patient prognosis. The results revealed that the D classification provided the best stratification and prediction performance for both OS and PFS, followed by the F classification, both of which outperformed the WHO classification; the S classification performed the worst. Our findings are in line with those of Tina Binderup et al. [25], who observed that FDG positivity/negativity offered better prognostic risk stratification than histopathological grading. Hwan Lee et al. [47] also found that dual-tracer stratification based on SUVmax values could reflect G3 tumor characteristics and serve as an alternative to histopathological grading. This implies that, where feasible, dual-scan grading or single 18F-FDG PET/CT semi-quantitative parameters can serve as non-invasive prognostic factors whose performance is superior to that of histopathological grading, which requires an invasive approach.
In addition to pathological grading, there are other commonly used clinical prognostic markers for NETs, such as chromogranin A (CgA) and the liquid biopsy-based NETest. Similar to the NETPET scoring system, they can also indicate PFS or OS for patients [48-51]. In a study covering 152 patients with GEP-NETs, the NETest reached a 12-month PFS AUC value of 0.78, while CgA had an AUC of 0.73 [52]. In another study evaluating whether the NETest could identify the potential for disease recurrence after NET resection, the NETest had an accuracy of 94%, a specificity of 92%, and a sensitivity of 100% up to the exact time of recurrence or 24 months, whereas CgA had an accuracy of 57%, a specificity of 76%, and a sensitivity of 15%, failing to reach the threshold for accurate diagnosis [53]. In our study, the D grading system achieved a 12-month PFS AUC value of 0.81 in independent internal validation. Considering its non-invasive nature, it is comparable to the NETest and better than CgA, which indicates strong prognostic performance and clinical application potential. Additionally, the D grading system possesses strong interpretability, unlike CgA, which, despite demonstrating diagnostic and prognostic value across various diseases in clinical settings, has not always produced satisfactory results [54]. The D grading system therefore has the potential to replace CgA in the future.
In this study, a novel nomogram incorporating D and F gradings along with clinicopathological factors was developed to predict OS and PFS in patients with GI-NENLM. The nomogram's predictive performance and consistency were validated using an independent internal validation cohort, in which it outperformed D and F gradings as well as histological classifications in terms of prediction accuracy and clinical utility. Previous studies often utilized public data to predict surgery-related outcomes [55,56], whereas our research addresses the unmet need for prognostic models specific to metastatic GEP-NETs.
Previous studies have also designed prognostic models for NETs. In a study based on the SEER database targeting patients with GEP-NENs, the nomogram constructed for OS yielded C-indexes of 0.821 and 0.823 in the primary and validation groups, respectively [57]. Additionally, in a study focusing on patients with GEP-NENs and liver-limited metastasis, the nomogram for OS achieved C-indexes of 0.814, 0.826, and 0.789 in the training, internal validation, and external validation sets, respectively [58]. These results are closely aligned with ours (0.810 and 0.849), yet those studies did not investigate patients' PFS, which is crucial for a disease characterized by a relatively mild and chronic course, such as NETs. In our study, we also constructed a predictive model for patients' PFS and obtained favorable results. Furthermore, this is the first study to incorporate PET-CT visual scoring and related parameters into a clinicopathological factor model. Future research should further include circulating biomarkers to enhance the model's predictive capability. While this model demonstrates excellent predictive performance, it is important to consider the strong heterogeneity inherent in NETs when applying it in clinical settings. Consequently, the predictive model should be viewed as an adjunct tool, complementing clinical guidelines and the MDT approach. This integration enables more precise and flexible clinical applications, tailored to the unique context of each patient's case.
Additionally, we have established an online prediction platform (Figure 2) and a grading system based on nomogram scores to streamline clinicians' assessments and prognostic evaluations. In clinical usage, for example, a patient meeting the inclusion criteria may receive D, F, and S grades with the assistance of a nuclear medicine physician during treatment. By integrating these grades with pathological grading and other clinical circumstances, our predictive model can be used to assess the patient's prognosis. If a patient's pathological grade is G3, with D and F grades exceeding D2 and F2, respectively (indicating high FDG uptake by the tumor), the patient is likely to have a poor prognosis, with a 2-year survival rate of less than 50% and a high risk of disease progression in the near term. This would necessitate more aggressive clinical treatment and follow-up. Conversely, if the patient has better tumor differentiation (G1-G2) and nuclear medicine assessment grades of D1 and F1 (indicating stronger uptake of somatostatin analog by the tumor), the prognosis is likely to be better, with a 3-year survival rate reaching up to 90% and slow tumor progression, which may call for a more conservative approach to treatment and monitoring.
Our study has some limitations. It used single-center data from a small patient sample to develop the grading models and prognostic nomograms, so external validation is required. The high cost of PET-CT and the associated radiation exposure limited the number of patients. We assessed lesion sites using dual-scan PET-CT imaging with the NETPET scoring criteria; the D grading system based on the NETPET score may therefore be influenced by the quality of the imaging technology, the experience of operators, differences in imaging protocols, and the consistency of image interpretation standards. Because it involves visual assessment methods that may require expert interpretation, its practical application in assisting clinical decision making requires further research and validation, and it should be integrated with patients' specific clinical situations for personalized interpretation and application. For single-scan evaluation, we only used SUVmax, excluding other PET-CT parameters like Metabolic Volumetric Index (MVI) and Mean Parenchymal Volume (MPV). The retrospective nature of this study limits our control over baseline patient characteristics, and including only GI-NETs may limit the generalizability of the model. Despite these limitations, our conclusions remain significant. Future research should involve multi-center, large-scale, prospective studies combining deep learning and PET-CT radiomics for better prognostic guidance in GI-NET patients with metastases. Additionally, prospective studies should compare the prognostic accuracy of the NETPET score with that of circulating biomarkers and incorporate these markers into our prognostic model to enhance its potential and functionality.

Materials and Methods
We analyzed all patients pathologically diagnosed with well-differentiated (G1-G3) GI-NETs at The First Affiliated Hospital of Sun Yat-sen University between August 2016 and August 2021. Eligible participants (1) were aged 18-75 years; (2) had pathologically confirmed GI-NENs; (3) had evidence of distant metastasis supported by radiologic or pathologic data; and (4) had undergone concurrent PET-CT scans using 18F-FDG and 68Ga-DOTA radiotracers. We excluded individuals (1) classified as NEC per WHO pathological grading; (2) treated with PRRT within 1 year; (3) with an interval exceeding 30 days between PET-CT scans; (4) who died or experienced disease progression within 3 months; (5) with incomplete diagnostic, therapeutic, or radiologic information; or (6) with NENs originating from non-GI locations or with unknown primary sites. If a patient had undergone multiple rounds of PET-CT scans, the initial pair was selected for analysis.

PET/CT Imaging Information Acquisition and Analyses
All patients signed informed consent forms for PET-CT examinations. Patients were required to fast for 6 h before the 68Ga-DOTANOC and 18F-FDG PET-CT examinations. PET-CT imaging was performed with a Gemini GXL 16 PET scanner (Philips Healthcare, Andover, MA, USA). Between 111 and 185 MBq (3-5 mCi) of 68Ga-DOTANOC or 5.18 MBq (0.14 mCi)/kg of FDG was injected intravenously, and patients were then scanned continuously. Approximately 45-60 min after the injection, serial scanning was performed from head to mid-thigh. Following low-radiation-dose CT acquisition with a slice thickness of 5 mm, the PET acquisition was performed for 1.5 min per bed position for 7-8 beds using a slice thickness of 4 mm. CT-based attenuation correction of the emission data was employed. PET images were reconstructed with the Line of Response RAMLA algorithm. The interval between the 18F-FDG and 68Ga-DOTANOC PET-CT studies was at least 24 h.
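The activity figures above can be checked with a short unit conversion, using the definition 1 mCi = 37 MBq. The function names and the 70 kg example weight are hypothetical and are used only to illustrate the weight-based FDG dose of 5.18 MBq/kg.

```python
MBQ_PER_MCI = 37.0  # 1 mCi = 37 MBq by definition


def mbq_to_mci(activity_mbq: float) -> float:
    """Convert an activity from megabecquerels to millicuries."""
    return activity_mbq / MBQ_PER_MCI


def fdg_activity_mbq(weight_kg: float, mbq_per_kg: float = 5.18) -> float:
    """Weight-based 18F-FDG activity using the per-kilogram dose stated in the text."""
    return weight_kg * mbq_per_kg


# The 68Ga-DOTANOC range of 111-185 MBq corresponds to 3-5 mCi.
print(mbq_to_mci(111.0), mbq_to_mci(185.0))
# 5.18 MBq/kg corresponds to 0.14 mCi/kg.
print(round(mbq_to_mci(5.18), 2))
# Hypothetical 70 kg patient.
activity = fdg_activity_mbq(70.0)
print(round(activity, 1), round(mbq_to_mci(activity), 1))
```

For the hypothetical 70 kg patient this gives about 362.6 MBq, or about 9.8 mCi.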
After training in the visual evaluation and scoring of 68Ga-DOTANOC and 18F-FDG PET-CT images according to the NETPET scoring system [29,41], two nuclear medicine experts independently scored the patients in a blinded manner. In cases of discrepancy, Professor Zhao Wang, an expert with 30 years of experience in diagnosing and treating NETs, made the final decision. In detail, the NETPET score is classified into grades 0-5, comprising nine levels in total. Because of its small numbers, P0 is typically not included in the analysis. When a patient's lesions exhibit SSTRI uptake without FDG uptake, they are classified as grade P1. Conversely, lesions with FDG uptake but without SSTRI uptake are classified as grade P5. Grades P2-P4 represent lesions that have concurrent uptake of both SSTRI and FDG, without isolated FDG uptake. The more detailed classification requires counting the number of lesions that meet these criteria to define the specific grade. When selecting and comparing lesions, all lesions that have been identified as tumors should be included. The single lesion with the greatest FDG avidity relative to its SSTR uptake is selected as the reference lesion, and its FDG uptake is compared with its own SSTR uptake. When a patient has multiple records of PET-CT scans, the earliest one is selected. In short, this system considers not only the relative FDG and SSTRI uptake of tumor lesions but also the number of lesions, making it the most comprehensive scoring system for dual scanning to date. The PET/CT images were evaluated visually (qualitatively) and semi-quantitatively. Activity that exceeded background levels and could not be attributed to physiological activity was identified as tumor tissue. The evaluators were blinded to the findings of the structural imaging. Any non-physiological focus of 68Ga-DOTANOC uptake above background was considered abnormal. Likewise, on 18F-FDG PET/CT images, any non-physiological focus of 18F-FDG uptake greater than the background blood-pool activity or adjacent normal tissue was considered positive. The PET/CT images were combined with non-enhanced CT to obtain anatomical 3D imaging. Metastatic disease was categorized into liver metastases and extrahepatic metastases.
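The coarse logic of the grading rules described above (P1, P2-P4, P5) can be sketched as follows. This is a deliberately simplified illustration, not the full NETPET scheme: it assumes each lesion has already been read visually as SSTR-positive or not and FDG-positive or not, and it does not reproduce the sub-grading within P2-P4, which depends on lesion counts and on the relative uptake of the reference lesion.

```python
def coarse_netpet_group(lesions):
    """Map per-lesion visual reads to a coarse NETPET group.

    lesions: iterable of (sstr_positive, fdg_positive) boolean pairs,
             one pair per lesion identified as tumor (hypothetical input format).
    """
    lesions = list(lesions)
    any_fdg_without_sstr = any(fdg and not sstr for sstr, fdg in lesions)
    any_fdg = any(fdg for _, fdg in lesions)
    any_sstr = any(sstr for sstr, _ in lesions)

    if any_fdg_without_sstr:
        return "P5"      # FDG uptake without SSTRI uptake
    if any_fdg:
        return "P2-P4"   # every FDG-avid lesion also shows SSTRI uptake
    if any_sstr:
        return "P1"      # SSTRI uptake only, no FDG uptake
    return "P0"          # no uptake on either scan
```

The order of the checks matters: a single lesion with isolated FDG uptake is enough to place the patient in P5, regardless of how many other lesions are SSTR-positive.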
PET-CT imaging offers a variety of semi-quantitative metrics, including the Standardized Uptake Value (SUV), Metabolic Volumetric Index (MVI), and Mean Parenchymal Volume (MPV). The SUV of the primary tumor has been extensively investigated as a potential prognostic factor for survival [59,60]. The most frequently used metric is SUVmax, which represents the maximum SUV within the tumor and is considered a key indicator of tumor metabolic activity. A high SUVmax is often associated with increased tumor aggressiveness and poor prognosis, making it a valuable tool in evaluating cancer severity and potential outcomes [25,61]. Therefore, in this study, the SUVmax of both primary and metastatic lesions was calculated one hour after injection of the radiotracer when evaluating single 68Ga-DOTANOC and 18F-FDG PET-CT images. These measurements were then used as the respective semi-quantitative metabolic parameters for each scanning modality in the subsequent development of a prognostic model. All SUVmax values were calculated using the default body-weight normalization: SUV = tissue activity concentration (MBq/g) / [injected dose (MBq) / body weight (g)].
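The body-weight-normalized SUV and the resulting SUVmax can be written as two short functions. This is a minimal sketch under stated assumptions: tissue activity is given in kBq/mL, tissue density is taken as 1 g/mL, and decay correction (which scanner software normally applies) is omitted. The function and parameter names are illustrative.

```python
def suv_body_weight(tissue_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """SUV = tissue activity concentration / (injected dose / body weight).

    Assumes 1 mL of tissue weighs 1 g and ignores radioactive decay.
    """
    dose_kbq = injected_dose_mbq * 1000.0   # MBq -> kBq
    weight_g = body_weight_kg * 1000.0      # kg  -> g
    return tissue_kbq_per_ml / (dose_kbq / weight_g)


def suv_max(voxel_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """SUVmax: the largest voxel SUV inside a lesion's region of interest."""
    return max(
        suv_body_weight(v, injected_dose_mbq, body_weight_kg)
        for v in voxel_kbq_per_ml
    )
```

An SUV of 1 corresponds to tracer distributed uniformly through the body; for instance, 5 kBq/mL in a 70 kg patient injected with 350 MBq gives an SUV of 1.0.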

Treatment and Follow-Up
All patients' treatment plans were decided after discussion by our hospital's NET multidisciplinary team (MDT). Treatment modalities were categorized into surgical treatment (primary tumor resection performed) and medical treatment (primary tumor resection not performed). This study had two primary endpoints: overall survival (OS) and progression-free survival (PFS). OS was defined as the time between the initial diagnosis of the disease and death from any cause, or the date of the last follow-up for patients who were still alive. PFS was defined as the length of time during which a patient's disease did not progress or worsen after treatment, with progression assessed according to the Response Evaluation Criteria in Solid Tumors version 1.1 (RECIST 1.1) [66].
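In practice, each endpoint reduces to a pair (follow-up time, event indicator) per patient. The sketch below shows one way to derive that pair from dates; the function names are illustrative, and the conversion of days to months using an average month of 30.4375 days is an assumption, not something specified in the study.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # average month length; an illustrative convention


def time_to_event(start, event_date, last_follow_up):
    """Return (time_in_months, event) for one patient.

    event = 1 if the event (death for OS, progression for PFS) was observed,
    event = 0 if the patient was censored at the last follow-up.
    """
    if event_date is not None:
        end, event = event_date, 1
    else:
        end, event = last_follow_up, 0
    return (end - start).days / DAYS_PER_MONTH, event
```

For OS the start date is the initial diagnosis and the event is death from any cause; for PFS the event is RECIST 1.1 progression, and the appropriate start date would follow the study's own definition.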

Statistical Analyses
We evaluated the predictive performance and clinical utility of the D, S, and F grading systems for patients' PFS and OS in the overall cohort, comparing them with histological grading. We employed time-dependent ROC curves to assess accuracy and specificity, Harrell's C-index to assess discriminative power, and the net reclassification improvement (NRI) and integrated discrimination improvement (IDI) with Z tests to compare grading systems. For OS prediction, NRI and IDI calculations employed risk categories of 0-0.1, 0.1-0.4, 0.4-0.6, and 0.6-1. Likewise, for PFS prediction, the risk categories were 0-0.2, 0.2-0.5, 0.5-0.7, and 0.7-1. Lastly, we used the Akaike information criterion (AIC), R-squared, and likelihood ratio (LR) tests to determine the goodness of fit of the various models predicting PFS and OS.
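To make the discrimination measure concrete, Harrell's C-index can be computed directly from its pairwise definition. The following is a minimal, unoptimized Python sketch (quadratic in the number of patients); it ignores pairs with tied event times and is not the software implementation used in the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       follow-up time per patient
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score = higher predicted risk (e.g. grade 1, 2, 3)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only usable if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk failed earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. With a three-level grading system used as the risk score, many pairs have tied predictions and contribute 0.5 each, which is one reason coarse grades tend to yield lower C-indices than continuous scores.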
The overall cohort was randomly divided in a 2:1 ratio into a training set and an internal validation set. Baseline characteristics and clinicopathological features were compared between the two groups using t-tests or the Mann-Whitney U test for continuous variables and chi-squared tests or Fisher's exact test for categorical variables. Continuous variables were reported as either the mean with standard deviation (SD) or the median with interquartile range (IQR).
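A random 2:1 split of this kind can be sketched with the Python standard library. The seed value and function name below are arbitrary illustrative choices; the study does not report how its randomization was implemented.

```python
import random


def split_two_to_one(patient_ids, seed=2024):
    """Shuffle patient IDs and split them 2:1 into training and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = (2 * len(shuffled)) // 3
    return shuffled[:n_train], shuffled[n_train:]


train_ids, valid_ids = split_two_to_one(range(223))
```

With 223 patients, this rule yields 148 training and 75 validation patients, which matches the cohort sizes reported in the abstract.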
LASSO regression introduces L1 regularization to decrease model complexity, effectively preventing overfitting and facilitating variable selection. It has been widely adopted in building predictive models within the medical field [67,68]. In this study, LASSO-Cox regression was applied within the training set, targeting the OS and PFS endpoints, to screen all potential variables that could be prognostic factors. Specifically, cross-validation was employed, and variables were selected based on the lambda minimum criterion. By penalizing the model's coefficients, this approach helps identify the most relevant predictors, reducing the risk of overfitting and enhancing the model's predictive accuracy for patient outcomes.
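For reference, the LASSO-Cox estimate described above can be written as a penalized partial-likelihood problem. The notation is introduced here for illustration and is not taken from the study: x_i is the covariate vector of patient i, delta_i the event indicator, R(t_i) the set of patients still at risk at time t_i, and the 1/n scaling is one common convention that varies between software packages.

```latex
\hat{\beta}(\lambda)
  = \arg\min_{\beta}
    \left\{ -\frac{1}{n}\,\ell(\beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
\qquad
\ell(\beta)
  = \sum_{i:\,\delta_i = 1}
    \left[ x_i^{\top}\beta
           - \log \sum_{k \in R(t_i)} \exp\!\left( x_k^{\top}\beta \right) \right]
```

The lambda minimum criterion picks the value of lambda with the smallest cross-validated partial-likelihood deviance; variables whose coefficients remain non-zero at that value are carried forward into the Cox model.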
Univariate and multivariate Cox regression were used to examine associations with survival. Subsequently, nomograms predicting 1-, 2-, and 3-year OS and 6-, 12-, and 18-month PFS were established based on the factors in the multivariate survival risk model. To assess the nomograms' predictive ability, we applied the aforementioned evaluation methods, calibration curves, and decision curve analysis (DCA) in the training and internal validation sets independently. The predictive performance of the nomograms was also compared with that of the WHO histological grading system.
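Decision curve analysis summarizes clinical utility as net benefit across threshold probabilities. The sketch below uses the standard binary-outcome formula at a single fixed time horizon and ignores censoring; DCA for survival data requires censoring-adjusted estimates, so this is a simplified illustration rather than the procedure used in the study.

```python
def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    outcomes: 1 if the event occurred by the time horizon, else 0.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(predicted_risks, outcomes))
    true_pos = sum(1 for r, y in pairs if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in pairs if r >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A decision curve plots `net_benefit` against a range of thresholds alongside the treat-all and treat-none (net benefit 0) reference lines; a model is useful at thresholds where its curve lies above both.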
To further validate the model's generalizability and practicality, we performed subgroup analyses based on the primary site and treatment modality. We used time-dependent ROC curves and the C-index to conduct a sensitivity analysis of the model across the subgroups.

The Construction of a Portable Nomogram
To enhance clinical utility, we employed two strategies: (1) we developed an online prognostic app accessible via a URL or QR code, allowing users to input parameters for 1-, 2-, and 3-year OS or 6-, 12-, and 18-month PFS predictions; and (2) we calculated nomogram scores for each patient and stratified them into three risk groups (high, medium, low) for OS and PFS using X-tile software (version 3.6.1), enabling efficient prognosis assessment.
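Both strategies rest on the same Cox-model arithmetic: a linear predictor is turned into a survival probability at a given time, and a score is compared against two cutoffs. The Python sketch below illustrates this under stated assumptions; the coefficient names, baseline survival value, and cutoffs are hypothetical placeholders, not the fitted values from this study.

```python
import math


def linear_predictor(covariates, coefficients):
    """Sum of coefficient * covariate value over the model's variables."""
    return sum(coefficients[name] * value for name, value in covariates.items())


def survival_probability(baseline_survival_t, lp):
    """Cox model prediction: S(t | x) = S0(t) ** exp(lp)."""
    return baseline_survival_t ** math.exp(lp)


def risk_group(score, low_cutoff, high_cutoff):
    """Three-level stratification given two cutoffs (e.g. chosen with X-tile)."""
    if score < low_cutoff:
        return "low"
    if score < high_cutoff:
        return "medium"
    return "high"


# Hypothetical example values, for illustration only.
coefs = {"d_grade": 0.5, "liver_metastasis": 0.1}
lp = linear_predictor({"d_grade": 2, "liver_metastasis": 1}, coefs)
one_year_os = survival_probability(0.9, lp)
```

An online calculator would evaluate `survival_probability` once per time point (for example 1, 2, and 3 years for OS), each with its own baseline survival value.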

Conclusions
In summary, our study found that visual grading based on both 68Ga-DOTANOC and 18F-FDG PET-CT scans predicts patient prognosis better than the WHO histological grading and single scans. This approach has the potential to improve the prognostic risk stratification of well-differentiated (G1-G3) metastatic gastrointestinal neuroendocrine tumors within the WHO grading system. We developed prognostic nomograms based on PET-CT visual and semi-quantitative grading data, which demonstrate good predictive performance for both OS and PFS. Additionally, we designed and created an online dynamic nomogram and a new risk stratification system, enabling clinicians to quickly assess the prognosis of their target patients and guide subsequent treatments.

Figure 2. Kaplan-Meier curves for OS and PFS for the dual 18F-FDG and 68Ga-DOTANOC PET-CT visual grade (D grade), semiquantitative 18F-FDG PET-CT grade (F grade), and semiquantitative 68Ga-DOTANOC PET-CT grade (S grade). (a) Kaplan-Meier curves for OS stratified by D grade. (b) Kaplan-Meier curves for OS stratified by F grade. (c) Kaplan-Meier curves for OS stratified by S grade. (d) Kaplan-Meier curves for OS stratified by WHO grade. (e) Kaplan-Meier curves for PFS stratified by D grade. (f) Kaplan-Meier curves for PFS stratified by F grade. (g) Kaplan-Meier curves for PFS stratified by S grade. (h) Kaplan-Meier curves for PFS stratified by WHO grade.


2.4. Comparing the Prognostic Value of D Grade, F Grade, S Grade, and WHO Grading System
Figure S1. Patient selection flowchart.
Figure S2. The process of X-tile software for stratification of 18F-FDG SUVmax values.
Figure S3. The process of X-tile software for stratification of 68Ga-DOTANOC SUVmax values.
Figure S4. The process of X-tile software for stratification of NETPET score.
Figure S5. ROC curves for dual 18F-FDG and 68Ga-DOTANOC PET-CT visual grade (D grade) and examples of D grade, semiquantitative 18F-FDG PET-CT grade (F grade), and semiquantitative 68Ga-DOTANOC PET-CT grade (S grade) predicting 1-, 2-, 3-year OS and 6-, 12-, 18-month PFS in the overall cohort.
Figure S6. LASSO regression for OS and PFS. (a) LASSO regression for OS. (b) Cross-validation for LASSO regression for OS (lambda min). (c) LASSO regression for PFS. (d) Cross-validation for LASSO regression for PFS (lambda min).
Figure S7. The process of X-tile software for stratification of nomogram output scores for OS.
Figure S8. The process of X-tile software for stratification of nomogram output scores for PFS.
Figure S9. (a,b) Kaplan-Meier curves for the risk stratification scale of the nomogram for OS in the training and internal validation cohorts. (c,d) Kaplan-Meier curves for the risk stratification scale of the nomogram for PFS in the training and internal validation cohorts. (e,f) Kaplan-Meier curves for dual-scan visual grading for OS in the training and internal validation cohorts. (g,h) Kaplan-Meier curves for dual-scan visual grading for PFS in the training and internal validation cohorts. (i,j) Kaplan-Meier curves for the semiquantitative 18F-FDG PET-CT grading system for OS in the training and internal validation cohorts. (k,l) Kaplan-Meier curves for the semiquantitative 18F-FDG PET-CT grading system for PFS in the training and internal validation cohorts. (m,n) Kaplan-Meier curves for the WHO grading system for OS in the training and internal validation cohorts. (o,p) Kaplan-Meier curves for the WHO grading system for PFS in the training and internal validation cohorts.
Figure S10. (a-l) Time ROC curves by primary site. Time ROC curves for OS: (a) stomach subgroup in the modeling cohort; (b) small intestine subgroup in the modeling cohort; (c) colorectum subgroup in the modeling cohort; (d) stomach subgroup in the internal test cohort; (e) small intestine subgroup in the internal test cohort; (f) colorectum subgroup in the internal test cohort. Time ROC curves for PFS: (g) stomach subgroup in the modeling cohort; (h) small intestine subgroup in the modeling cohort; (i) colorectum subgroup in the modeling cohort; (j) stomach subgroup in the internal test cohort; (k) small intestine subgroup in the internal test cohort; (l) colorectum subgroup in the internal test cohort.
Figure S11. Time ROC curves by treatment. Time ROC curves for OS: (a) post-surgery subgroup in the modeling cohort; (b) no-surgery subgroup in the modeling cohort; (c) post-surgery subgroup in the internal test cohort; (d) no-surgery subgroup in the internal test cohort. Time ROC curves for PFS: (e) post-surgery subgroup in the modeling cohort; (f) no-surgery subgroup in the modeling cohort; (g) post-surgery subgroup in the internal test cohort; (h) no-surgery subgroup in the internal test cohort.

Table 1. Baseline characteristics of the training dataset and internal validation dataset.

Table 2. Efficiency of different grading systems for predicting OS and PFS.

Table 3. Uni- and Multivariate Cox Analyses for OS in Training Cohort and Internal Validation Cohort.


Table 4. Uni- and Multivariate Cox Analyses for PFS in Training Cohort and Internal Validation Cohort.

Table 6. Efficiency of Different Grading Systems for Predicting OS and PFS in the Internal Validation Cohort.

Table S2. NRI, IDI, and C-index of the Grading Systems for OS and PFS in Training Cohort.
Table S3. NRI, IDI, and C-index of the Grading Systems for OS and PFS in Internal Validation Cohort.
Table S4. C-index (95% CI) of Nomogram from Subgroup Analysis by Primary Site.
Table S5. C-index (95% CI) of Nomogram from Subgroup Analysis by Treatment.