Targeted MRM Quantification of Urinary Proteins in Chronic Kidney Disease Caused by Glomerulopathies

Glomerulopathies with nephrotic syndrome that are resistant to therapy often progress to end-stage chronic kidney disease (CKD) and require timely and accurate diagnosis. Targeted quantitative urine proteome analysis by mass spectrometry (MS) with multiple-reaction monitoring (MRM) is a promising tool for early CKD diagnostics that could replace the invasive biopsy procedure. However, there are few studies regarding the development of highly multiplexed MRM assays for urine proteome analysis, and the two MRM assays for urine proteomics described so far demonstrate very low consistency. Thus, the further development of targeted urine proteome assays for CKD remains a pressing task. Herein, a BAK270 MRM assay previously validated for blood plasma protein analysis was adapted for urine-targeted proteomics. Because proteinuria associated with renal impairment is usually associated with an increased diversity of plasma proteins being present in urine, the use of this panel was appropriate. Another advantage of the BAK270 MRM assay is that it includes 35 potential CKD markers described previously. Targeted LC-MRM MS analysis was performed for 69 urine samples from 46 CKD patients and 23 healthy controls, revealing 138 proteins that were found in ≥2/3 of the samples from at least one of the groups. The results obtained confirm 31 previously proposed CKD markers. MRM analysis was combined with machine learning for data processing. As a result, a highly accurate classifier was developed (AUC = 0.99) that enables distinguishing between mild and severe glomerulopathies based on the assessment of only three urine proteins (GPX3, PLMN, and A1AT or SHBG).


Introduction
Chronic kidney disease (CKD) is a non-specific condition with clinical symptoms that can be caused by various reasons [1]. This socially significant pathology affects up to 13.4% of people [2] and carries a high risk of disability, requiring high-cost treatments such as hemodialysis, peritoneal dialysis, and kidney transplants [3,4]. At the same time, the severity of CKD manifestations does not always reflect the degree of renal damage [5] and may largely depend on the nature of the underlying nephropathy. In recent years, there has been a steady increase in the incidence of severe glomerular diseases, especially focal segmental glomerulosclerosis (FSGS), which has increased by 41% over the past 10 years, approaching diabetic nephropathy [6]. The portion of end-stage renal disease (ESRD) attributed to FSGS has increased 11-fold, from 0.2% to 2.3%, over a 21-year period. The annual incidence of membranous nephropathy ESRD also remains high [7]. Thus, timely and accurate diagnosis is extremely important for choosing an adequate therapy and preventing serious consequences.
Urine proteomics can reflect various pathophysiological changes in the body [8], and even more so for renal dysfunction [9]. The search for specific markers among proteins and peptides in the urine is a generally growing trend due to the high information content and good stability of urine as an object of analysis, as well as the absolutely non-invasive method of sample collection, which is extremely important for potential alternatives to the kidney biopsy procedure. Urinary protein or peptide markers have already been proposed for a large number of socially significant pathologies, including a number of cancers and cardiovascular diseases, in addition to disorders associated with renal dysfunction and CKD (as reviewed in [10]).
Mass spectrometric (MS) studies using numerous approaches have made the greatest contribution to obtaining information about changes in the urinary proteome and peptidome in CKD [10][11][12][13][14][15][16]. Nevertheless, of about 100 potential protein markers of different nephropathies described so far, only a few have demonstrated consistent reproducibility in at least two different studies: an increase in afamin was observed in membranous nephropathy (MN) [17][18][19]; an increase in antithrombin III was shown in IgA nephropathy (IgAN) [20,21]; an increase in α-1 anti-chymotrypsin was revealed in lupus nephritis (LN) [22,23]; and diabetic nephropathy (DN) was shown to be associated with an increase in zinc-α2-glycoprotein [24,25] and AMBP [25,26], as well as a decrease in transthyretin [23,24]. However, at the same time, changes in the levels of a number of common proteins, such as collagens, uromodulin, serum albumin, and alpha-1-antitrypsin (A1AT) [8], also remain the focus of much attention as important markers that can have an even greater diagnostic capacity than individual specific markers [11,13,15,16,18,25].
Thus, the development of a proteomic marker panel for distinguishing nephropathies remains highly relevant, so quantitative studies are of particular importance. Multiple-reaction monitoring (MRM) MS technology with internal stable-isotope-labeled standards (SISs) used for normalization has already been successfully applied for the development of quantitative assays with high specificity, precision, and robustness [27][28][29][30]. In particular, a recent targeted MRM-based analysis of 72 plasma samples from patients with CKD confirmed a significant association of the AMBP protein, beta-2-microglobulin (B2M), lysozyme C (LYZ), hemoglobin subunit beta (HBB), and pigment epithelium-derived factor (PEDF) with the estimated glomerular filtration rate (eGFR), and the estimation of absolute plasma concentrations of a combination of these markers provided a stronger association with the outcome of CKD than individual markers [31].
There have been few studies regarding the development of highly multiplexed MRM assays for urine proteome analysis. Percy et al. developed an MRM panel for quantitative analysis of 136 potential urinary protein biomarkers and successfully applied it for patients with prostate cancer [29]. A targeted urine proteome assay (TUPA) for quantification of 167 urinary proteins in kidney diseases was further proposed by Cantley et al. [30], and nine kidney transplant patients with immediate or delayed graft function were compared. The overlap between these two MRM assays is rather low: only 22 peptides are common among a total of 415 unique peptides (Percy et al.: 213; TUPA: 224). Thus, the further development of targeted urine proteome assays for CKD is a pressing task.
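The peptide counts quoted above can be checked with simple inclusion-exclusion. The short sketch below uses only the three numbers reported in the text; the variable names are illustrative.

```python
# Peptide counts as reported in the text for the two published urine MRM assays.
percy_peptides = 213   # Percy et al. panel
tupa_peptides = 224    # TUPA panel (Cantley et al.)
shared_peptides = 22   # peptides present in both panels

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
unique_peptides = percy_peptides + tupa_peptides - shared_peptides
overlap_percent = 100 * shared_peptides / unique_peptides

print(unique_peptides)            # 415
print(round(overlap_percent, 1))  # 5.3
```

In other words, only about one in twenty of the peptides used across the two assays is shared.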
In the present study, we considered a panel of 177 proteins and corresponding peptides from the BAK270 MRM assay, which was previously developed for the analysis of potential protein biomarkers in blood plasma [28]. Because proteinuria associated with renal impairment is usually associated with an increased diversity of plasma proteins being present in urine, we considered the use of this panel appropriate. Another advantage of the BAK270 MRM assay is that it includes 35 potential CKD markers described previously [10]. Targeted MRM MS analysis was performed for 46 CKD patients with glomerulopathies, including MN (n = 12), FSGS (n = 26), and minimal change disease (MCD; n = 8), as well as for healthy controls (n = 23). The results obtained confirm a number of previously proposed CKD markers. Combination of MRM analysis with machine learning for data processing allows for an even greater capacity to generate specific classifiers using proteomic marker panels [35,36]. However, neither MRM nor machine learning is yet widely used in the study of urinary proteomic markers, including CKD markers in particular. Therefore, a preliminary assessment of the effectiveness of using a combination of these approaches in relation to urinary proteomic markers was the rationale for the present study. As a result, a highly accurate classifier was developed that allows for distinguishing between mild and severe glomerulopathies based on the assessment of only three urine proteins.

Identification of Significantly Different Urine Proteins
In the present study, we considered a panel of 177 proteins and corresponding peptides from the BAK270 MRM assay, which was previously developed for the analysis of potential protein biomarkers in blood plasma [28]. Targeted proteomic analysis was performed for 69 urine samples from 46 CKD patients (see details in Materials and Methods, Section 4.1) and 23 healthy controls. Liquid chromatography-multiple reaction monitoring mass spectrometry (LC-MRM MS) analysis revealed 138 proteins that were found in ≥2/3 of the samples from at least one of the groups, which included 84 core proteins common to all of the groups (Figure 1, Supplementary Table S1). Comparison of our 138-protein panel with the 167 proteins (TUPA) proposed by Cantley et al. [30] and the panel of 136 urinary proteins developed by Percy et al. [29] revealed 49 and 69 common proteins, respectively, while 89 and 69 proteins were new in our MRM assay for CKD, respectively (Supplementary Table S2).
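The detection filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data layout, the protein and group names, and the reading of "core" as "meeting the ≥2/3 threshold in every group" are assumptions made for the example.

```python
def meets_threshold(values, num=2, den=3):
    """True if the protein was quantified (value is not None) in >= num/den of samples."""
    detected = sum(v is not None for v in values)
    # Integer comparison avoids floating-point issues at exactly 2/3.
    return detected * den >= len(values) * num

def filter_proteins(data):
    """data: {protein: {group: [value or None, ...]}} (hypothetical layout).

    Returns (retained, core):
      retained - proteins passing the threshold in at least one group
      core     - proteins passing the threshold in every group (assumed definition)
    """
    retained, core = [], []
    for protein, groups in data.items():
        passes = [meets_threshold(vals) for vals in groups.values()]
        if any(passes):
            retained.append(protein)
        if all(passes):
            core.append(protein)
    return retained, core

# Toy data with hypothetical protein names; None marks a missing value.
toy = {
    "PROT_A": {"control": [1.0, 2.0, None], "sFSGS": [3.0, 2.5, 4.0]},
    "PROT_B": {"control": [None, None, 0.5], "sFSGS": [1.0, None, 2.0]},
    "PROT_C": {"control": [None, None, None], "sFSGS": [None, 1.0, None]},
}
retained, core = filter_proteins(toy)
print(retained)  # ['PROT_A', 'PROT_B']
print(core)      # ['PROT_A']
```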
Pairwise correlation among the core proteins revealed a number of highly correlated groups, which should be taken into account (Figure 1B). A total of 40 core proteins were found to be statistically different, mainly between the control and at least one of the other groups, at an uncorrected p-value of <0.01 (Table 1); all of them passed the 10% FDR cutoff after the Benjamini-Hochberg multiple test correction, and 28 of them passed the 5% FWER cutoff after the Bonferroni-Holm correction. It is important to note that these proteins include 17 potential markers of CKD described previously (Table 1); hence, along with the other 23 significantly different core proteins, they can be used as non-specific markers of all considered glomerulopathies. The group of proteins including antithrombin-III (ANT3), plasminogen (PLMN), alpha-1-antitrypsin (A1AT), transthyretin (TTHY), and Ig gamma-1 chain C region (IGHG1) should be especially emphasized, since they revealed the greatest number of differences between groups (Figure 2). At the same time, proteins that did not show any differences (such as actin, apolipoprotein D, etc.) may also be important to monitor, in particular for simplified normalization, instead of normalization to the total protein concentration.
Figure caption fragment: Table 1; for other abbreviations, see Supplementary Table S1.
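For readers unfamiliar with the two corrections named above, the following sketch computes Benjamini-Hochberg (FDR) and Holm (FWER) adjusted p-values in plain Python. The p-values are made up for illustration and are not taken from Table 1; the function names are ours.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def holm(pvalues):
    """Holm (Bonferroni-Holm) adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    # Walk from the smallest p-value to the largest, enforcing monotonicity.
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, pvalues[i] * (m - rank)))
        adjusted[i] = running_max
    return adjusted

# Illustrative p-values only (not from the study).
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
bh_adj = benjamini_hochberg(pvals)
holm_adj = holm(pvals)

n_fdr = sum(q < 0.10 for q in bh_adj)     # pass the 10% FDR cutoff
n_fwer = sum(p < 0.05 for p in holm_adj)  # pass the 5% FWER cutoff
print(n_fdr, n_fwer)  # 4 2
```

The example shows why fewer proteins pass the FWER cutoff than the FDR cutoff: Holm controls the probability of any false positive, which is a stricter criterion than controlling the expected fraction of false positives.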
Preliminary hierarchical clustering of individual samples using significantly different core proteins separated 100% of the healthy controls from CKD patients and distinguished particular CKD groups such as mFSGS and sFSGS from one another well (Figure 3A). In general, samples from these three groups were part of the three main clusters that stand out on the heatmap. At the same time, MCD samples were mainly co-localized with mFSGS, while the MN group samples were distributed in a 1:2 ratio between the two FSGS clusters.
Dimension reduction methods such as t-SNE reliably separated the control group (Figure 3B) and made it possible to additionally distinguish samples with mild and severe nephropathic manifestations with just seven extracted principal components. In general, even a preliminary analysis easily distinguished the control group from the rest and, in addition, showed some differences between mild and severe manifestations of different nephropathies.
Table 1 footnotes: a - Seventeen proteins previously characterized as potential CKD markers; b - ten proteins that did not pass the 5% FWER cutoff after the Bonferroni-Holm correction but passed the 10% FDR cutoff after the Benjamini-Hochberg correction, like all other proteins in the table. The symbol "+" indicates a significant difference after the FDR adjustment (<0.1); "n.

Building of a Binary Classifier for Distinguishing Mild and Severe Nephropathies
Because of the rather small number of samples in each pathological group, a binary classifier was built for distinguishing samples with mild and severe manifestations of nephropathies. Feature selection was performed first: the most significant features were selected based on their p-values, Cohen's d effect sizes, and feature importance determined by the "decision tree" algorithm (Supplementary Table S3). To evaluate the latter, the model was trained on the entire dataset, increasing the minimum "tree" depth until reaching an AUC-ROC value of 1.0. The top five proteins from each approach were pooled, resulting in a set of 10 proteins: alpha-2-antiplasmin (A2AP), IGHG1, sex-hormone-binding globulin (SHBG), Ras GTPase-activating protein nGAP (NGAP), vitamin-D-binding protein (VTDB), attractin (ATRN), alpha-1-antitrypsin (A1AT), plasminogen (PLMN), glutathione peroxidase 3 (GPX3), and aortic smooth muscle actin (ACTA). Further estimation of the predictive power of each feature by a default logistic regression classifier using a fivefold cross-validation approach reduced the number of features, and the best results were achieved with PLMN, GPX3, A1AT, and SHBG (Supplementary Table S4). It is also noteworthy that the first two were also among the proteins that showed the greatest numbers of differences between the pathological groups (Figure 2).
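One of the selection criteria mentioned above, Cohen's d, can be sketched as follows, together with the pooling of top-ranked features from different criteria. The protein names and values are hypothetical, the pooled-standard-deviation form of d is an assumption (the text does not state which variant was used), and only two criteria and a top-two cutoff are shown for brevity.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical measurements: (severe group, mild group) per protein.
toy = {
    "PROT_X": ([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]),
    "PROT_Y": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),
    "PROT_Z": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
}
effect = {name: cohens_d(a, b) for name, (a, b) in toy.items()}

# Rank by absolute effect size and keep the top k (k = 2 here; the text uses 5).
ranked = sorted(effect, key=lambda name: abs(effect[name]), reverse=True)
top_by_d = set(ranked[:2])

# A second, hypothetical top list (e.g., from p-values); pooling is a set union,
# so proteins selected by several criteria are counted once.
top_by_p = {"PROT_Y", "PROT_Z"}
pooled = top_by_d | top_by_p
print(ranked)          # ['PROT_Y', 'PROT_X', 'PROT_Z']
print(sorted(pooled))  # ['PROT_X', 'PROT_Y', 'PROT_Z']
```

Because pooling is a union, three top-five lists can yield fewer than fifteen proteins, which is consistent with the ten-protein set reported in the text.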
Binary classifiers built using four machine learning algorithms suggest that the best ROC-AUC metrics can be achieved even with just three proteins: GPX3, PLMN, and A1AT or SHBG (Figure 4). Both of these three-protein sets may be highly effective for distinguishing samples from patients with mild and severe disease states.
Thus, analysis of just several core proteins, in addition to separating glomerulopathies from the control group, can confidently identify the severity of the disease. Nonetheless, accurate diagnosis still requires more specific markers for consideration.
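The ROC-AUC metric used to compare the classifiers has a simple interpretation: it is the probability that a randomly chosen sample from the positive class (here, severe disease) receives a higher score than a randomly chosen sample from the negative class (mild disease). The sketch below computes it directly from that definition; the scores and labels are invented for illustration and are unrelated to the study data.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 = positive class (e.g., severe), 0 = negative class (e.g., mild).
    Tied scores count as half a correctly ranked pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Illustrative classifier scores only.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = roc_auc(scores, labels)
print(auc)  # 0.75
```

An AUC of 1.0 means every severe sample is scored above every mild one, while 0.5 corresponds to random ranking.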

Perspective Potential Protein Markers Specific for CKD Glomerulopathies
If potential protein markers prevalent only in certain CKD groups (which are present in >2/3 of patients in at least one pathological state, but may be totally absent in the others) are also taken into consideration, additional proteins that can contribute to group discrimination can be revealed (Table 2). Eleven proteins can strengthen the separation of the control group. Nine proteins were increased in all glomerulopathies; however, the degree of this increase varied in different groups and was correlated with the severity of the pathology. Twenty proteins are common potential markers for sFSGS and MN and can reliably distinguish MN from mFSGS and MCD, which cannot be achieved as effectively using only the core proteins. Carboxypeptidase N catalytic chain (CBPN) and alpha-2-macroglobulin (A2MG) should be emphasized, since their levels are the highest in MN (Figure 5). They can also to some extent contribute to the distinction between MN and sFSGS, but 12 unique proteins are even more significant for distinguishing the sFSGS group from all others (Table 2). Separation of the MCD and mFSGS groups seems to be a more difficult task. However, these groups also demonstrate some proteomic differences: MCD patients show significantly higher levels of keratins (K1C10 and K1C9) among all of the pathological groups, as well as essentially higher levels of alpha-2-HS-glycoprotein (AHSG) and prothrombin (F2) than in mFSGS, while urine samples from mFSGS patients show higher levels of haptoglobin (HPT) and A2MG.
Table 2 (recovered rows; each of the five group columns gives the percentage of representation, with the median value in parentheses):
Phospholipid transfer protein (PLTP): 82.6 (−0.78); 12.5 (Nan); 18.2 (Nan); 73.3 (2.14); 50 (−1.34)
Pigment epithelium-derived factor (PEDF) a: 8.7 (Nan); 25 (Nan); 18.2 (Nan); 73.3 (10.2); 33.3 (Nan)
Vitamin-K-dependent protein S (PROS): 0 (Nan); 12.5 (Nan); 18.2 (Nan); 73.3 (5.18); 33.3 (Nan)
a - Thirteen potential markers related to CKD described previously. The most likely markers of one or more groups are marked with a gray background (taking into account the percentage of representation and the median value). "Nan": missing values that indicate concentrations that are too low to be quantified or completely absent in the sample.
The proteins listed in Table 2 include 11 previously described markers of CKD: AHSG, HPT, OSTP, F2, A2MG, ADIPO, CO3, CO9, K1C10, FGA, and PEDF. Therefore, the quantitative analysis performed in this study confirms their diagnostic potential. Further quantitative analysis of a significantly expanded number of samples could make it possible to use machine learning methods to create a more accurate classifier for diagnosing specific glomerulopathies.

Discussion
In the present study, we applied a targeted MRM assay previously validated for 270 blood plasma proteins for the quantitative analysis of urine proteins in glomerulopathies compared to healthy controls. Because proteinuria associated with renal impairment is usually associated with an increased diversity of plasma proteins being present in urine, we considered the use of this kit appropriate. Since the severity of proteinuria varies greatly, absolute quantification, which is usually applied for the analysis of blood samples, would not allow a correct comparison of urine samples. For this reason, the amount of total protein used for trypsinolysis and subsequent MS analysis was normalized in this study. Therefore, the obtained results reflect the relative characteristic changes in the urine proteomic contents. Moreover, the contents of a number of identified core proteins, which did not show significant differences in levels between groups (such as ACTA, GELS, APOD, DIAC, IGHM, etc.), can also be used for normalization instead of the total protein concentration.
The MRM assay kit used here includes 35 potential CKD markers described previously. Nineteen of them turned out to be among the core proteins with significantly different levels, mostly distinguishing all of the glomerulopathies from the control group (Table 1). Nevertheless, ANT3, PLMN, and A1AT also showed significant differences between mild and severe disease states. These results largely agree with those of multiple other studies, which showed a significant increase in alpha-1-antitrypsin levels in almost all nephropathies associated with CKD (as reviewed in [10]), while ANT3 and PLMN were previously shown to be increased only in IgAN and MN, respectively. At the same time, the present study revealed several other significantly differing core proteins, which had not been described previously as CKD markers. In particular, PZP, IPSP, and IGHG1 also underlined the differences between particular glomerulopathies; the first two are of particular interest, as they distinguish FSGS and MN, unlike any other core protein.
Interaction analysis between 40 proteins that are prevalent only in certain groups (Table 2) revealed many intersecting pathways (Supplementary Table S5). However, some interactions should be especially emphasized. In particular, proteins identified in all pathological groups (but not in the control group) were more involved in blood coagulation and fibrinolysis (A2MG, F2, C1IN), acute phase response (HPT, AHSG, F2), regulation of complement activation and defense response (A2MG, C1IN, F2, CBPN, AHSG), and posttranslational protein modification and negative regulation of response to wounding (OSTP, AHSG, A2MG, F2, C1IN) (Supplementary Table S5). At the same time, proteins that were prevalent only in severe cases (e.g., sFSGS and MN) were most associated with multiple processes of lipid metabolism (APOA4, APOM, APOC1, APOC3, PLTP, CO3, PHLD, ADIPO), including regulation of lipoprotein particle assembly, remodeling, and levels (Figure 6). Moreover, severe cases included proteins associated with complement activation and regulation (CO3, CO5, CO9) (Supplementary Table S5). While the proteins prevailing in the control samples did not form a specific physiological cluster, for the most part they were
Figure 6 (caption partially recovered): ... (Table 2), using the STRING method. The red cluster corresponds to different processes associated with lipid metabolism; the green cluster relates to immune processes; the blue cluster corresponds to extracellular matrix organization. The color of the font indicates the predominance of the protein in specific groups: blue, control; yellow, sFSGS; orange, sFSGS and MN; red, MCD, mFSGS, sFSGS, and MN. Abbreviations coincide with the data in Table 2. More complete information for all prevailing proteins is given in Supplementary Table S5.
Of the previously proposed markers mostly revealed in pathological groups in the present study, increased levels of AHSG and HPT in all glomerulopathies coincide well with the findings of other studies, while the increases in CERU, F2, A2MG, and ADIPO were previously shown mainly for IgAN (as reviewed in [10]), but this group was not considered in the present study. The presence of fibrinogen chains in severe cases also has some overlap with previous studies [11,15], including more recently published data on fibrinogen γ-chain as a potential marker of renal interstitial fibrosis in IgAN [37].
Increased levels of OSTP were found in all glomerulopathies, with no significant differences (Table 2). The same was true for AHSG, HPT, and F2, as well as for several proteins that have not previously been suggested as markers of CKD, including CBPN, KV401, and C1IN (Table 2). Of these, AHSG, CBPN, HPT, KV401, C1IN, and F2 can separate mild and severe diseases or even distinguish between sFSGS and MN.
Of the other previously described CKD urine markers, the decreased presence of CO3 in MCD, decreased presence of collagen in glomerulopathies, and decreased levels of FGA in MN (compared to sFSGS) are in good agreement with previously published data. The increase in PEDF in sFSGS revealed here seems particularly interesting, as it may correlate with its previously described increase in plasma [31]. Moreover, it should be noted that a number of new potential markers that could be used to distinguish between severe glomerulopathies (such as APOM, ITOH2, and LUM) were identified. However, the increased diversity of plasma proteins in the urine itself should correlate with the degree of deterioration of the renal function; therefore, it is hardly appropriate to consider many proteins as specific markers. In general, further studies using more balanced and urine-specific protein panels (including uromodulin, a variety of collagens, and previously proposed CKD markers) on larger collections of urine samples will facilitate the validation of particular CKD markers, as well as the creation of accurate differentiating panels.

Study Population
The study cohort consisted of 69 participants, including patients with confirmed diagnoses of MCD (n = 8), FSGS (n = 26), and MN (n = 12) (Table 3), as well as healthy controls (n = 23). Participants were recruited in the Nephrology Department of Sechenov First Moscow State Medical University (Moscow, Russia). Informed consent was obtained from all participants. Healthy controls (12 men and 11 women, aged 19 to 58 years) had no history of nephropathies, with normal kidney function and no proteinuria. The exclusion criteria for patients with nephropathies were as follows: active urinary infection, diabetes mellitus, obesity, severe arterial hypertension (≥160/≥100 mm Hg), liver disease, rheumatic systemic diseases, and stage 5 CKD ESRD. Obese patients and patients with long-term arterial hypertension were excluded from analysis to avoid secondary FSGS. Any rheumatic disease (e.g., lupus, systemic vasculitis, rheumatoid arthritis) was also an exclusion criterion. The patients' clinical parameters, including total blood protein, revealed no significant correlation with gender ( Figure S1).
The diagnostic groups (FSGS, MN) were determined by biopsy and immunohistochemistry. No patient was diagnosed with cancer. The MN group included primary aPLA2R-positive patients.
Patients with FSGS were additionally subdivided into mild (mFSGS) and severe (sFSGS) subgroups using a special index, taking into account the preservation of renal function and estimated glomerular filtration rate (eGFR), as well as the severity of proteinuria and steroid resistance of the nephrotic syndrome, as described previously [32]. The special index was calculated as follows: the first score was assigned depending on the level of eGFR, the second depending on the severity of proteinuria, and the third depending on the steroid resistance of the nephrotic syndrome (Table 4). Steroid resistance was defined as the absence of a decrease in proteinuria levels after 16 weeks of prednisolone therapy, or a decrease by less than 50% of the baseline level. The renal function was considered to be preserved if the eGFR (as determined by the CKD-EPI formula, eGFR CKD-EPI) was above 60 mL/min/1.73 m². If the total score was <3, patients were classified as mFSGS, while scores of ≥3 indicated sFSGS.
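The subgrouping rules stated above can be written down as a short sketch. The individual score values come from Table 4, which is not reproduced here, so the function below takes the three sub-scores as inputs rather than deriving them; all function and parameter names are illustrative.

```python
def is_steroid_resistant(baseline_proteinuria, proteinuria_after_16_weeks):
    """Resistant if proteinuria did not decrease after 16 weeks of prednisolone,
    or decreased by less than 50% of the baseline level."""
    decrease = baseline_proteinuria - proteinuria_after_16_weeks
    return decrease < 0.5 * baseline_proteinuria

def renal_function_preserved(egfr_ckd_epi):
    """Renal function counts as preserved if eGFR (CKD-EPI) is above
    60 mL/min/1.73 m^2."""
    return egfr_ckd_epi > 60

def classify_fsgs(egfr_score, proteinuria_score, steroid_score):
    """Sum the three sub-scores (assigned per Table 4, not reproduced here):
    total < 3 -> mFSGS, total >= 3 -> sFSGS."""
    total = egfr_score + proteinuria_score + steroid_score
    return "mFSGS" if total < 3 else "sFSGS"

# Hypothetical examples (proteinuria in arbitrary but consistent units).
print(is_steroid_resistant(6.0, 4.0))  # True: decreased by only one third
print(is_steroid_resistant(6.0, 2.0))  # False: decreased by two thirds
print(classify_fsgs(1, 1, 0))          # mFSGS
print(classify_fsgs(1, 1, 1))          # sFSGS
```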

Urine Sample Preparation for LC-MS
Ten milliliters of the middle portion of freshly collected morning urine was centrifuged at 3000 rpm for 15 min immediately after collection. The supernatant was aliquoted and stored at −20 • C.
Urine proteins were precipitated with ice-cold acetone as described previously [32]. In brief, 0.1 mL urine aliquots were quickly thawed, mixed with 0.5 mL of ice-cold acetone, and incubated overnight at −20 • C. The precipitate was centrifuged (20,000× g, 10 min) and dissolved in 50 µL of 8 M urea (200 mM Tris-HCL, pH 8.5). Urine samples of healthy participants were 10-fold concentrated before acetone precipitation using 3 kDa Amicon ® filters (Millipore, Germany). Protein concentration was measured with the BCA assay (Thermo Scientific, Waltham, MA, USA).
Before trypsinolysis, the samples (100 µg of total protein) were reduced with 5 mM dithiothreitol (30 min, +37 °C) and alkylated in the dark with 20 mM iodoacetamide (30 min). TPCK-treated trypsin (Worthington, Franklin, OH, USA) was added at an enzyme:protein ratio of 1:25, and hydrolysis was performed at +37 °C overnight. The reaction was quenched by adding formic acid up to 0.5%. The SIS peptide mixture was spiked into each sample, followed by desalting by solid-phase extraction using plates (Oasis HLB 96-well Microelution Plate, Waters, Taunton, MA, USA). The eluate was lyophilized and dissolved in 0.1% formic acid to a concentration of 0.5 mg/mL for further LC-MS/MS analysis.
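The quantities implied by this protocol can be checked with a short Python sketch. Two assumptions are not stated in the text: the 1:25 enzyme:protein ratio is read as a mass ratio (w/w), and peptide recovery through desalting is treated as complete when computing the reconstitution volume. The function names are hypothetical.

```python
def trypsin_mass_ug(protein_ug, protein_per_enzyme=25.0):
    """Trypsin mass for a 1:25 enzyme:protein ratio (assumed w/w)."""
    return protein_ug / protein_per_enzyme


def reconstitution_volume_ul(peptide_ug, target_mg_per_ml=0.5):
    """Volume of 0.1% formic acid needed; 1 mg/mL equals 1 ug/uL.

    Assumes no peptide loss during digestion and desalting.
    """
    return peptide_ug / target_mg_per_ml


trypsin = trypsin_mass_ug(100.0)          # 100 ug of total protein per sample
volume = reconstitution_volume_ul(100.0)  # target 0.5 mg/mL
```

Under these assumptions, a 100 µg sample calls for 4 µg of trypsin and about 200 µL of 0.1% formic acid.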
The normalization of the amount of total protein was performed before trypsinolysis and subsequent MS analysis, due to the significant variability in the total protein concentrations of the studied urine samples.

Targeted Quantitative LC-MS/MS Using Multiple-Reaction Monitoring (MRM) with Stable-Isotope-Labeled Peptide Standards (SISs)
Targeted quantitative LC-MS analysis was carried out using synthetic stable-isotope-labeled internal standards (SISs) and natural (NAT) synthetic proteotypic peptides for measuring the corresponding proteins in urine. The selected 270 SISs and NAT synthetic peptides had been previously validated for use in LC/MRM-MS experiments for blood plasma [38]. LC-MS parameters, such as the LC gradient and the MRM parameters (Q1 and MRM scans), were adapted and optimized based on previous studies [38]. The SIS peptide mixture was spiked into each urine sample at a balanced concentration, which was optimized in experiments with dilution of a series of urine samples with proteinuria. Standard curves were generated using NAT and SIS peptide standards with a pooled urine sample as a matrix, as previously described in detail for blood plasma analysis [38].
All samples were analyzed in duplicate by HPLC-MS using an ExionLC™ UHPLC system (Thermo Fisher Scientific, USA) coupled online with a SCIEX QTRAP 6500+ triple-quadrupole mass spectrometer (SCIEX, Toronto, ON, Canada). LC-MS parameters, such as the LC gradient and MRM parameters (Q1 and MRM scans), were adapted and optimized based on previous studies [39,40].
The loaded sample volume was 10 µL per injection. HPLC separation was carried out using an Acquity UPLC Peptide BEH column (C18, 300 Å, 1.7 µm, 2.1 mm × 150 mm) (Waters, USA) with gradient elution. Mobile phase A was 0.1% FA in water; mobile phase B was 0.1% FA in acetonitrile. LC separation was performed at a flow rate of 0.4 mL/min using a 53 min gradient from 2 to 45% of mobile phase B. Mass spectrometric measurements were carried out using the MRM acquisition method. The electrospray ionization (ESI) source settings were as follows: ion spray voltage 4000 V, temperature 450 °C, ion source gas 40 L/min. The corresponding transition list for MRM experiments, with retention time values and Q1/Q3 masses for each peptide, is available in Supplementary Table S6.
For quantitative analysis of the raw LC-MS/MS data, Skyline Quantitative Analysis software (version 20.2.0.343, University of Washington) was used [41,42]. To calculate the peptide concentrations in the measured samples, calibration curves were generated using 1/x²-weighted linear regression.
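As an illustration of 1/x² weighting, the following Python sketch fits a straight line by weighted least squares and back-calculates a concentration from a measured response. It is not the Skyline implementation. The function names, the use of the NAT/SIS peak-area ratio as the response, and the calibration points are assumptions made for the example.

```python
def fit_weighted_line(conc, response):
    """Weighted least-squares fit of response = slope * conc + intercept,
    with weights w = 1/conc^2 so that low calibrators are not swamped."""
    if any(x <= 0 for x in conc):
        raise ValueError("1/x^2 weighting requires positive concentrations")
    w = [1.0 / (x * x) for x in conc]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, conc))
    swy = sum(wi * y for wi, y in zip(w, response))
    swxx = sum(wi * x * x for wi, x in zip(w, conc))
    swxy = sum(wi * x * y for wi, x, y in zip(w, conc, response))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
    intercept = (swy - slope * swx) / sw
    return slope, intercept


def back_calculate(response, slope, intercept):
    """Concentration that corresponds to a measured response."""
    return (response - intercept) / slope


# Hypothetical calibration levels (arbitrary units) and NAT/SIS area ratios.
levels = [1.0, 5.0, 25.0, 125.0, 625.0]
ratios = [0.02 * x + 0.001 for x in levels]
slope, intercept = fit_weighted_line(levels, ratios)
```

The weights make each calibrator contribute according to its relative rather than absolute error, which is the usual motivation for 1/x² weighting when the curve spans several orders of magnitude.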

Data Analysis
Statistical analysis and data visualization were performed in Python (3.7.3) with the following packages: SciPy [43], Seaborn [44], Matplotlib [45], and Pandas [46]. Significant differences in protein concentrations between the patient groups were estimated using the Mann-Whitney U-test. Multiple testing was controlled with the Benjamini-Hochberg procedure (false discovery rate, FDR) and the Bonferroni-Holm method (familywise error rate, FWER) to limit false rejections of the null hypothesis (type I errors). Pearson's correlation coefficient was used to evaluate the correlation between features and build a correlation matrix. To build a Venn diagram (http://bioinformatics.psb.ugent.be/webtools/Venn/ (accessed on 7 April 2003)), all proteins that were present in ≥2/3 of the samples from at least one of the groups were considered. Only proteins identified in ≥70% of samples of any group were considered for binary classification, reducing the dataset from 138 to 58 features (Supplementary Table S3). Since missing values often represent low-abundance measurements, the "NaN" values were filled with values drawn from a Gaussian distribution using the Perseus software [47], with parameters of shift down = 0.4 and width = 0.2 of the mean value for each group. Heatmap hierarchical clustering and principal component analysis (PCA) with t-distributed stochastic neighbor embedding (t-SNE) were used for preliminary estimation of the differences between the studied groups and particular samples. Interaction analysis of physiological processes involving potential markers was performed using the STRING method with default parameters, using an available resource (https://string-db.org/ (accessed on 7 April 2003)).
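The two multiple-testing corrections named above can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the study. The function names and the example p-values are hypothetical, and the inputs are assumed to be per-protein p-values from the Mann-Whitney U-test (available, for instance, as `scipy.stats.mannwhitneyu`).

```python
def benjamini_hochberg(pvalues):
    """FDR-adjusted p-values (Benjamini-Hochberg step-up procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def holm(pvalues):
    """FWER-adjusted p-values (Bonferroni-Holm step-down procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    # Walk from the smallest p-value up, enforcing monotonicity.
    for rank, idx in enumerate(order, start=1):
        candidate = min(1.0, pvalues[idx] * (m - rank + 1))
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical per-protein p-values
fdr = benjamini_hochberg(raw)
fwer = holm(raw)
```

Holm-adjusted values are never smaller than the Benjamini-Hochberg ones for the same input, reflecting that FWER control is the stricter criterion.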

Machine Learning for Binary Classification
All machine learning models were taken from the Scikit-Learn package [48]. All data were Z-scored for normalization. Feature significance ranking was performed using p-values, Cohen's d effect size, and the "decision tree" algorithm (Supplementary Table S3). The predictive power of each feature was additionally estimated by a default logistic regression classifier using 5-fold cross-validation. The binary classifier was selected from four widely used machine learning algorithms suitable for small datasets: k-nearest neighbors (kNN), logistic regression (LR), random forest (RF), and support-vector machine (SVM) with a linear kernel. A 5-fold cross-validation approach, grid search (Supplementary Table S7), and all possible combinations of selected proteins were used to achieve the best performance (Supplementary Table S8).
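The exhaustive search over protein combinations can be sketched with Scikit-Learn as shown below. This is a simplified illustration under several assumptions rather than the study's pipeline. Only logistic regression with default settings is evaluated and grid search is omitted. Z-scoring is applied inside each cross-validation fold via a pipeline, which is a design choice of this sketch. The data are synthetic, and the protein names P1 to P5 and the function name are hypothetical.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def best_protein_combination(X, y, names, size=3, seed=0):
    """Score every feature combination of the given size by 5-fold CV ROC AUC."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    best_auc, best_names = -1.0, None
    for cols in combinations(range(X.shape[1]), size):
        # Z-score within each training fold, then fit a default LR model.
        model = make_pipeline(StandardScaler(), LogisticRegression())
        scores = cross_val_score(model, X[:, list(cols)], y, cv=cv, scoring="roc_auc")
        if scores.mean() > best_auc:
            best_auc = scores.mean()
            best_names = tuple(names[c] for c in cols)
    return best_names, best_auc


# Synthetic data: 40 samples, 5 features, one of which separates the classes.
rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = rng.normal(size=(40, 5))
X[:, 0] += 3.0 * y
names = ["P1", "P2", "P3", "P4", "P5"]
best_names, best_auc = best_protein_combination(X, y, names)
```

Swapping `LogisticRegression()` for `KNeighborsClassifier()`, `RandomForestClassifier()`, or `SVC(kernel="linear")` would extend the same loop to the other three algorithms. With 58 candidate features the number of three-protein combinations is 30,856, so the search remains tractable for small cohorts.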

Conclusions
The obtained results confirm that MRM-MS combined with machine learning is an effective tool for the development of classifiers for the accurate discrimination of different nephropathies related to CKD, as well as for the validation of candidate proteomic markers described previously. As a result, a highly accurate classifier was developed that enables distinction between mild and severe glomerulopathies based on the assessment of only three urine proteins.
Comparison of our 138-protein panel with the 167 proteins (TUPA) proposed by Cantley et al. [30] and the panel of 136 urinary proteins developed by Percy et al. [29] revealed 49 and 69 common proteins, respectively, while 89 and 69 proteins were new in our MRM assay for CKD, respectively. In the present study, 31 proteins previously suggested as potential CKD markers were validated as such.
Quantification of proteins prevalent in only specific pathologies can lead to a more accurate and earlier CKD diagnosis. The development of more specific classifiers still requires larger sample collections and a further increase in the number of potential CKD markers included in the MRM assay.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules28083323/s1, Figure S1: Correlation of different patient's parameters with morphology forms; Table S1: Proteins identified in the study; Table S2: Comparison with other urine MRM assays; Table S3: Proteins significant features selected based on their p-values, Cohen's d effect sizes, and feature importance; Table S4: The most important 10 core proteins; Table S5: STRING analysis of interaction between urine proteins prevalent in specific groups;