Predicting Cellular Rejection of Renal Allograft Based on the Serum Proteomic Fingerprint

Kidney transplantation is an essential medical procedure that significantly enhances the survival rates and quality of life of patients with end-stage kidney disease. However, despite advancements in immunosuppressive therapies, allograft rejection remains a leading cause of organ loss. Notably, the prediction of cellular rejection processes relies primarily on biopsy analysis, which is not routinely performed due to its invasive nature. The present work evaluates whether the serum proteomic fingerprint, as acquired by Fourier Transform Infrared (FTIR) spectroscopy, can predict cellular rejection processes. We analyzed 28 serum samples, 17 without and 11 with cellular rejection processes, as determined by biopsy analysis. Leave-one-out cross-validation of a Naïve Bayes model enabled the prediction of cellular rejection processes with high sensitivity and specificity (AUC = 0.984). The serum proteomic profile was obtained in a high-throughput mode by a simple, rapid, and economical procedure, making it suitable for routine analyses and large-scale studies. Consequently, the current method shows high potential for predicting cellular rejection processes in clinical scenarios and should continue to be explored.


Introduction
Kidney transplantation is a key medical intervention that increases the survival rates of patients with end-stage kidney disease and improves their overall quality of life by removing the need for routine renal replacement therapies, such as dialysis, and for strict diets [1][2][3]. Moreover, the restoration of renal function has the added benefit of halting the progression of other debilitating conditions, such as cardiovascular disease. This multifaceted impact enables transplant recipients not only to regain their health but also to participate actively in more normal daily routines, thereby improving their overall wellbeing [4][5][6]. Although significant developments in immunosuppressive therapies and other medical interventions have enhanced graft and patient survival rates, allograft rejection remains a prominent contributor to allograft failure in transplantation medicine [7].
Over the years, serum creatinine testing has been the dominant method for monitoring the condition of transplanted kidneys due to its widespread availability and familiarity [8]. Nevertheless, nearly 20% to 30% of kidney transplant patients experience allograft loss within five years of the surgical procedure, underscoring the urgent need for more effective monitoring tools and interventions [9,10].
Numerous clinical studies have shown that serum creatinine is not a timely indicator of allograft dysfunction [11][12][13]. Identifying kidney injury early is imperative for safeguarding the health of the transplanted organ [12,14,15]. Physicians are aware that serum creatinine levels can be difficult to interpret during the initial stages of declining renal function and can be influenced by factors unrelated to the kidneys [13].
In addition, in cases of subclinical allograft injury, there is typically a swift decline in glomerular filtration, which the proximal tubules initially counterbalance through increased secretion [16]. This extended compensation phase keeps serum creatinine levels stable, creating the misleading impression that the transplanted organ is functioning normally even as the injury to the graft continues to progress. Besides this compensation, other factors can elevate serum creatinine levels, e.g., medications (corticosteroids and cimetidine, among others) that can inhibit creatinine secretion by the proximal tubules [17], resulting in increased serum creatinine levels without a corresponding decrease in the glomerular filtration rate. Consequently, relying on serum creatinine levels to detect kidney injury poses significant challenges. Without a more reliable and timelier marker of changes in renal function, physicians face the dilemma of a potentially delayed diagnosis or unnecessary invasive procedures, e.g., allograft biopsy, with all of its logistical complexities and associated risks.
For these reasons, allograft biopsy remains the gold standard for diagnosing allograft rejection, even though its highly invasive nature makes it unsuitable for routine use. Even the less frequent protocol biopsies, designed to monitor the allograft status, present challenges [18]. Biopsies carry a risk of infections or hematomas and, in extreme cases, can lead to nephrectomy or even death, with both outcomes occurring in approximately 1 out of every 1000 renal biopsies performed [19,20]. The specificity of the technique can be inconsistent, as it hinges on the subjective assessments of pathologists and is heavily dependent on their experience. Additionally, the reproducibility of the diagnosis is often limited [21]. Due to these factors, it is important to develop cost-effective, minimally invasive alternatives to predict rejection processes.
Accordingly, humoral rejection processes are routinely predicted by serum analysis of circulating antibodies [22], which may target human leukocyte antigens (HLAs), non-HLA antigens, and blood group antigens [23][24][25][26]. However, predicting T-cell-mediated rejection processes, also known as cellular rejection, using minimally invasive methods is particularly challenging. Cellular rejection occurs when T-lymphocytes are activated by antigen-presenting cells via direct, semi-direct, or indirect pathways. This leads to the infiltration of immune cells into the allograft, causing subsequent tissue damage [25][26][27]. As such, this rejection mechanism is generally diagnosed by biopsy analysis, which assesses the presence of the patient's immune cells in the allograft [28]. In some cases, a mixture of both antibody-mediated and cellular rejection may occur [26,29]; therefore, predicting only humoral rejection based on serum analysis can leave cellular rejection undetected, which may lead to a critical delay in starting adequate immunosuppressive therapy [30].
The present work aims to assess a novel method, based on rapid serum analysis, to predict cellular rejection events with high sensitivity and specificity.
Urine and blood proteomics have been explored for predicting both humoral and cellular rejections [15,31]. However, large-scale proteomics studies are needed to identify a set of peptides or proteins that can efficiently predict the mechanism of rejection across the very high diversity of patients found in real clinical scenarios [32].
Since conventional proteomics typically involves labor-intensive, time-consuming, and costly techniques, its application to large-scale studies may be significantly impaired. This limitation also applies to data derived from other omics, including transcriptomics and metabolomics, to name a few.
As an alternative to conventional proteomics, the whole protein molecular profile of serum can be acquired in a simple, rapid, and economical mode using Fourier Transform Infrared (FTIR) spectroscopy. In fact, the analysis of biofluids like serum through FTIR spectroscopy, when combined with machine learning algorithms, has proven effective in predicting various pathophysiological states with high sensitivity and specificity. This includes a range of diseases such as cancer, diabetes, and neurological conditions [33,34]. Owing to the technique's simplicity, speed, cost-effectiveness, and capability to operate in a high-throughput mode, based on plates with micro-wells [34], it is more readily adaptable to large-scale studies.
This study aims to assess whether the molecular fingerprint of whole serum proteins, as captured by FTIR spectroscopy, can be used to predict cellular rejection. To evaluate this, the serum spectrum between 1500 and 1700 cm⁻¹, from 21 patients, was considered, since it represents vibrations of the molecular bonds of the amide I and II bands of proteins, especially C=O and C-N stretching vibrations and N-H bending vibrations. Furthermore, the technique is highly sensitive to protein structural alterations [35,36]. To our knowledge, this exploratory study is the first application of this strategy to predict cellular rejection.

Results
The characteristics of the patients from whom the 28 biopsies were collected, 17 without cellular rejection processes and 11 associated with cellular rejection processes, are presented in Table 1. No statistically significant differences (p > 0.05) were found concerning age, sex, donor type (living or deceased donor), or whether the transplant was kidney-only or simultaneous kidney-pancreas.
The average serum spectra, after baseline correction and normalization, or based on the second derivative (Figure 1A-D), highlight some differences between patients with and without cellular rejection. Normalization minimizes the impact of sample quantity during the analysis, thereby emphasizing the differences in the biochemical composition of the samples. The second derivative enables band deconvolution, i.e., it enhances the band resolution and, consequently, extracts more information from the spectra. As expected, the second derivative spectra showed the differences between the average spectra of patients with and without cellular rejection processes more clearly than the non-derived spectra (Figure 1A-D).
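The two pre-processing steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not the Orange3 workflow used in the study: the 5-point window is an assumption (the text specifies only a second-degree polynomial), the function names are hypothetical, and the two points at each edge of the spectrum are simply dropped.

```python
import math

# Savitzky-Golay convolution weights for the second derivative of a
# quadratic fit over a 5-point window (the window size is an assumption).
SG2_WEIGHTS = (2 / 7, -1 / 7, -2 / 7, -1 / 7, 2 / 7)


def unit_vector_normalize(spectrum):
    """Scale a spectrum so that its Euclidean norm equals 1."""
    norm = math.sqrt(sum(a * a for a in spectrum))
    return [a / norm for a in spectrum]


def second_derivative(spectrum, spacing=1.0):
    """Savitzky-Golay second derivative; edge points are dropped."""
    half = len(SG2_WEIGHTS) // 2
    derived = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        value = sum(w * a for w, a in zip(SG2_WEIGHTS, window))
        derived.append(value / spacing ** 2)  # account for wavenumber step
    return derived
```

Because normalization divides every absorbance by the same factor, it removes differences caused by the amount of sample while preserving band shape. The derivative step then sharpens overlapping bands, since a quadratic trend collapses to a constant and narrow features stand out.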
The t-SNE scores (Figure 1E-H) indicate some degree of separation between samples with and without rejection. t-SNE is an unsupervised, non-linear dimensionality reduction technique that represents data based on their similarities, allowing for the identification of data patterns. However, spectral pre-processing using the second derivative did not clearly enhance data separation based on rejection status (Figure 1E-H). Therefore, the spectra were also represented as heatmaps, again based on diverse spectral pre-processing methods (Figure 2). Heatmaps based on normalized, baseline-corrected spectra highlight the whole amide I (between 1640 and 1665 cm⁻¹) and amide II (between 1530 and 1560 cm⁻¹) bands, while the second derivative highlights a higher number of spectral regions due to band deconvolution (Figure 2). Nonetheless, the heatmaps did not reveal a specific spectral region that differed markedly between the two populations.
HCA based on non-derived spectra presented low sensitivities or low specificities (i.e., <50%). HCA based on second derivative spectra performed better, but the results were still modest, yielding a sensitivity of 64% and a specificity of 76%, regardless of normalization.
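For readers less familiar with these metrics, the short sketch below shows how sensitivity and specificity follow from per-sample predictions. The label vectors are purely illustrative: the source does not report per-sample HCA assignments, and the counts used here (7 of 11 and 13 of 17 correct) were chosen only because they round to the reported 64% and 76% for groups of this size. The function name is hypothetical.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1  # rejection correctly detected
            else:
                fn += 1  # rejection missed
        else:
            if predicted == positive:
                fp += 1  # false alarm
            else:
                tn += 1  # stable graft correctly identified
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative labels only (1 = cellular rejection, 0 = no cellular rejection).
truth = [1] * 11 + [0] * 17
predicted = [1] * 7 + [0] * 4 + [0] * 13 + [1] * 4
sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

With only 11 rejection samples, a single misclassified sample changes the sensitivity by about 9 percentage points, which is one reason validation in a larger cohort matters.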
Since the multivariate unsupervised analysis failed to predict cellular rejection, supervised Naïve Bayes models were constructed (Table 2). These models perform probability-based classification according to Bayes' theorem. An LOOCV procedure was employed, and the performance of the models was evaluated. Among the pre-processing methods evaluated, the second derivative spectra yielded a high AUC value (0.885), attributable to the enhanced band resolution achieved by derivatives (Figure 1B). However, despite the high sensitivity observed (100%), the specificity was low (59%). To maintain the high sensitivity, i.e., the probability of correctly predicting samples with cellular rejection, and increase the specificity, i.e., the probability of correctly predicting samples without cellular rejection processes, we evaluated which regions of the spectra could improve the model's performance. To identify relevant features, i.e., bands in the 1500 to 1700 cm⁻¹ region, the information gain algorithm was applied to second derivative spectra, as this was the pre-processing method that led to the best Naïve Bayes model. The bands that contributed most to the prediction of cellular rejection were, in decreasing order, as follows: 1524, 1631, 1505, 1558, 1575, 1599, and 1673 cm⁻¹. The information gain was 0.60 and 0.46 for the first two wavenumbers, respectively, and 0.42 for each of the remaining wavenumbers.
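Information gain measures how much knowing a band's value reduces the uncertainty (entropy) about the rejection class. The sketch below is a minimal illustration under stated assumptions: each continuous band is discretized with a single median split, which is not necessarily the discretization applied by Orange3, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction after splitting one band at its median (assumption)."""
    threshold = sorted(values)[len(values) // 2]
    low = [y for v, y in zip(values, labels) if v < threshold]
    high = [y for v, y in zip(values, labels) if v >= threshold]
    conditional = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - conditional


# Upper bound for this cohort: class entropy of 11 rejection vs. 17 stable samples.
max_gain = entropy([1] * 11 + [0] * 17)
```

For a cohort of 11 and 17 samples, the class entropy is about 0.97 bits, so no band can exceed that value under this definition; the reported 0.60 for 1524 cm⁻¹ is a substantial fraction of it.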
A Naïve Bayes model was first built on the single band with the highest information gain, i.e., 1524 cm⁻¹, which resulted in both high sensitivity (100%) and specificity (82%) (Table 2). From there, several other Naïve Bayes models were constructed on sets with an increasing number of bands, added in order of information gain, as presented in Table 2. The set of bands that resulted in the best Naïve Bayes model was 1524, 1631, 1505, and 1558 cm⁻¹, where the first three bands present higher values in samples from patients with cellular rejection, and the last band (1558 cm⁻¹) presents higher values in samples from patients without cellular rejection processes (Figure 3). This model resulted in an AUC of 0.984.
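The combination of a Naïve Bayes classifier with leave-one-out cross-validation can be sketched as follows. This is a hedged illustration rather than the study's Orange3 pipeline: it assumes Gaussian likelihoods for the band intensities (Orange3 may handle continuous features differently), uses hypothetical function names, and runs on made-up toy values rather than real spectra.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate per-class prior, feature means, and feature variances."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        columns = list(zip(*rows))
        means = [sum(col) / len(rows) for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(y), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(label):
        prior, means, variances = model[label]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=log_posterior)


def loocv_predictions(X, y):
    """Train on all samples but one, predict the held-out sample, repeat."""
    predictions = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        predictions.append(predict(fit_gaussian_nb(X_train, y_train), X[i]))
    return predictions


# Toy single-band data (made-up values; 1 = rejection, 0 = no rejection).
X_toy = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y_toy = [0, 0, 0, 1, 1, 1]
toy_predictions = loocv_predictions(X_toy, y_toy)
```

With 28 samples, LOOCV fits 28 models, each tested on the one sample it never saw. Note that for an unbiased estimate the band selection itself would also need to be repeated inside each fold; whether this was done is not stated in the text.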

Discussion
Notably, blood urea nitrogen, the estimated glomerular filtration rate, and serum creatinine were not statistically different (p > 0.05) between the two groups of patients, highlighting the need for alternative biomarkers to predict cellular rejection processes. Therefore, it is crucial to develop minimally invasive methods to predict allograft cellular rejection that can be routinely applied, such as those based on blood analysis. These methods should be able to detect rejections between protocol biopsies and might even reduce the number of such biopsies in the future. Various studies suggest that the serum proteome holds promise for achieving this objective [15]. However, translating biomarker discovery from serum proteomics to clinical application can be challenging, given the complexities of conducting proteomic analyses in large-scale, international multi-center studies. An alternative with high potential for large-scale studies is FTIR spectroscopy combined with machine learning algorithms, which could enable the discovery of biomarkers of rejection with high sensitivity and specificity and, consequently, would be more easily translated to the clinic.
These models should be further validated using a larger and more diverse patient cohort that includes a mix of rejection processes as well as other pathophysiological processes, such as infections (renal or not). Nonetheless, the results achieved in this study are promising, especially when compared to findings from other researchers, such as those based on proteomics. Gwinner et al. [40] identified 14 peptides in urine that could predict cellular rejection, but with sensitivities and specificities of 0.66 and 0.47, respectively; Lim et al. [41] identified two proteins in urine exosomes that predicted cellular rejection with sensitivities and specificities of 0.64 and 0.73, respectively. Based on metabolomics, Blydt-Hansen et al. [42] developed a model, based on 134 metabolites present in urine, for predicting cellular rejection, with an AUC of 0.892. In this study, the model developed, which relies on just a few spectral bands, showed superior predictive performance (AUC = 0.984).
In the present work, we obtained the molecular fingerprint associated with whole proteins present in serum by FTIR spectroscopy. This procedure is straightforward, requiring only a basic serum dehydration step prior to spectral acquisition. The spectra were derived from a minimal sample volume (25 µL) on a multi-well plate. As a result, the analysis can be executed in an automatable high-throughput mode using just a drop of blood. This makes it suitable for application in large-scale studies for biomarker optimization and validation, thereby substantially increasing the likelihood of translating the biomarker into a clinical setting. Moreover, the routine analysis of such biomarkers would be considerably more cost-effective than analyzing a set of metabolites or proteins. Utilizing only a few spectral bands could also pave the way for affordable and portable equipment, allowing for routine analysis beyond the confines of a hospital laboratory.

Study Population
This retrospective study included data from 28 biopsies collected from 21 adult patients with a kidney allograft, based on non-protocol and protocol biopsies. Biopsy results were classified either as presenting a cellular rejection (n = 11) or non-cellular rejection, i.e., stable allografts (n = 17). Blood samples were collected just before graft biopsies. All patients gave their informed consent, and the study was approved by the hospital's (Centro Hospitalar Universitário de Lisboa Central-CHULC) Ethics Committee (number 454/2017 and number 1215/2022).
Considering the goal of the present study, renal biopsies were categorized into two broad groups: samples presenting 'cellular' and 'non-cellular' rejection processes. The 'cellular' category included the following rejection processes, in order of increasing severity: cellular borderline, cellular I, and cellular II. The 'non-cellular' category included the following processes: no alteration, acute tubular necrosis, previously treated humoral rejection, HIV nephropathy, IgA nephropathy recurrence, interstitial fibrosis and tubular atrophy, and polyomavirus-associated nephropathy (Table 1).
Demographic and clinical variables of the two groups of patients were compared using the Chi-square test and Student's t-test in GraphPad Prism version 8.0.2 for Microsoft Windows (GraphPad Software, San Diego, CA, USA).

MIR Spectra Acquisition
Briefly, 25 µL of pre-diluted serum (1 to 10 in Milli-Q water) from each serum sample was pipetted onto a 96-well Si plate and then dehydrated in a desiccator for 150 min under vacuum (Vacuubrand, ME 2, Wertheim, Germany). Spectral data were collected using an FTIR spectrometer (Vertex 70, Bruker, Billerica, MA, USA) equipped with an HTS-XT (Bruker) accessory. Each spectrum represented 64 coadded scans, with a 2 cm⁻¹ resolution, and was collected in transmission mode between 1500 and 1700 cm⁻¹. The first well of the 96-well plate did not contain a sample, and the corresponding spectrum was acquired and used as the background, as recommended by the HTS-XT manufacturer. All spectra used in the following sections were submitted to atmospheric compensation.

Spectra Pre-Processing and Processing
Spectra pre-processing by atmospheric compensation was conducted with OPUS® software, version 6.5 (Bruker, Ettlingen, Germany), while the remaining spectra pre-processing and processing analyses were conducted with Orange3 version 3.35.0 (Bioinformatics Lab., University of Ljubljana, Ljubljana, Slovenia). The following pre-processing methods were evaluated: baseline correction by the Rubber Band method, unit vector normalization, and second derivative spectra based on a Savitzky-Golay filter with a second-degree polynomial. Exploratory and unsupervised analyses, including t-distributed stochastic neighbor embedding (t-SNE), heatmaps, dot matrix plots, and hierarchical cluster analysis (HCA), were performed. Feature selection was conducted by an information gain algorithm. Supervised Naïve Bayes models were developed. The Leave-One-Out Cross-Validation (LOOCV) procedure was applied. The models' performances were assessed by the area under the receiver operating characteristic curve (AUC). The classification accuracy, F1 score, precision, sensitivity, and specificity were taken at the point on the ROC curve with the minimum distance to the upper-left corner of the unit square, which was treated as the optimal trade-off between sensitivity and specificity.
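The choice of operating point described above, the ROC point closest to the upper-left corner, can be made concrete with a short sketch. The function names and the toy scores are hypothetical; the scores stand in for the predicted probabilities of rejection that a classifier would output and are not data from this study.

```python
import math


def roc_points(scores, labels):
    """Return (threshold, FPR, TPR) for every candidate threshold; 1 = rejection."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def closest_to_corner(points):
    """Pick the point minimizing the Euclidean distance to (FPR=0, TPR=1)."""
    return min(points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))


# Toy classifier scores (made-up values).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0, 0]
best_threshold, best_fpr, best_tpr = closest_to_corner(
    roc_points(toy_scores, toy_labels)
)
```

The sensitivity at the selected point is the TPR and the specificity is 1 minus the FPR. This criterion weights false positives and false negatives equally, which is a design choice; a clinical application might prefer a threshold that favors sensitivity.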

Conclusions
A very good predictive model for kidney allograft cellular rejection was developed based on the molecular profile of whole proteins present in serum, as captured by FTIR spectroscopy (AUC = 0.984). The current analysis utilized just 25 µL of serum, making it feasible to perform the test with a mere drop of blood. The analysis was also conducted using a very simple procedure and high-throughput mode, enabling the method to be feasibly implemented in large-scale studies. Spectral bands linked to specific protein bond vibrations were identified, suggesting the potential for future applications using more cost-effective and even portable equipment for routine analyses outside the laboratory setting. As such, the current method holds significant promise for predicting cellular rejection processes and warrants further investigation.

Figure 1. Average serum spectra (between 1500 and 1700 cm⁻¹) from patients without (blue) and with (red) cellular rejection of the kidney allograft (A-D) and the corresponding t-SNE (E-H), respectively. The following spectra pre-processing methods were evaluated: (A,E) unprocessed raw data spectra; (B,F) normalized and baseline correction; (C,G) second derivative; (D,H) normalized second derivative.

Figure 2. Heatmap of serum spectra from patients without (upper part of the graphs) and with (lower part of the graphs) cellular rejection processes. The following spectra pre-processing methods were applied: (A) unprocessed raw data spectra; (B) normalized baseline correction; (C) second derivative; (D) normalized second derivative.

Figure 3. A dot matrix plot of the 4 most informative bands (x-axis) from the amide I and II region (1500-1700 cm⁻¹) selected by the information gain to discriminate serum samples from patients with and without cellular rejection.

Table 1. Characteristics of the 28 renal biopsies. Out of these, 17 did not show signs of cellular rejection processes, whereas 11 exhibited cellular rejection processes. The p-value from the statistical analysis comparing these two groups is also provided.

Table 2. Performance of LOOCV of Naïve Bayes models, considering diverse spectral pre-processing methods to predict cellular rejection.
