Development of a Biomarker Panel to Distinguish Risk of Progressive Chronic Kidney Disease

Chronic kidney disease (CKD) patients typically progress to kidney failure, but the rate of progression differs between patients, and progression may not occur at all. Current CKD screening methods are sub-optimal at predicting progressive kidney function decline. This investigation develops a model for predicting progressive CKD based on a panel of biomarkers representing the pathophysiological processes of CKD, kidney function, and common CKD comorbidities. Two patient cohorts are utilised: the CKD Queensland Registry (n = 418), termed the Biomarker Discovery cohort; and the CKD Biobank (n = 62), termed the Predictive Model cohort. Progression status is assigned with a composite outcome of a ≥30% decline in estimated glomerular filtration rate (eGFR) from baseline, initiation of dialysis, or kidney transplantation. Baseline biomarker measurements are compared between progressive and non-progressive patients via logistic regression. In the Biomarker Discovery cohort, 13 biomarkers differed significantly between progressive and non-progressive patients, while 10 differed in the Predictive Model cohort. From this, a predictive model, based on a biomarker panel of serum creatinine (sCr), osteopontin, tryptase, urea, and eGFR, was calculated via linear discriminant analysis. This model has an accuracy of 84.3% when predicting future progressive CKD at baseline, greater than eGFR (66.1%), sCr (67.7%), albuminuria (53.2%), or albumin-creatinine ratio (53.2%).


Introduction
Chronic kidney disease (CKD) is a major health and economic burden worldwide, including in Australia [1,2]. Irrespective of aetiology, many patients progress towards kidney failure requiring dialysis or kidney transplantation [3]. Currently, there are no clinically robust biomarkers to predict progressive CKD. Rather, clinicians rely on multiple longitudinal kidney measurements, such as estimated glomerular filtration rate (eGFR), albuminuria (Alb) and albumin-creatinine ratio (ACR).
A sample of 418 patients was ascertained from the CKD QLD Registry (termed the Biomarker Discovery cohort), a registry of patients with pre-terminal CKD who are known to specialist nephrology practices across Queensland, Australia, and who have records of associated clinical data. These patients were recruited via an opt-in consent model [10] (https://cre-ckd.centre.uq.edu.au/ CKD.QLD). Patients were included if ≥4 independent eGFR measurements were recorded over a minimum of 12 months during follow-up. This subset of patients was recruited from the Royal Brisbane and Women's Hospital (RBWH) between May 2011 and May 2015 and was followed until the date of kidney replacement therapy (dialysis/transplant), death, discharge, loss to follow-up, or the censor date of 30 June 2017.
A sample of 62 patients was ascertained from the CKD Biobank (termed the Predictive Model cohort), a repository of pre-terminal CKD patients, with associated clinical data and baseline biospecimens, who are known to specialist nephrology practices across Queensland, Australia and are recruited via a broad consent model [11] (https://cre-ckd.centre.uq.edu.au/project/nhmrc-ckd-biobank). Patients were included if ≥2 independent eGFR measurements were recorded during follow-up. This subset of patients was recruited from the Logan Hospital, Queensland, between November 2017 and October 2018, and was followed until kidney replacement therapy or censor date of 31 December 2019.

Outcome
The outcome for both cohorts was subsequent progressive CKD occurring during follow-up. This was defined by a composite outcome of a ≥30% decline in eGFR from baseline, initiation of dialysis, or kidney transplantation. A ≥30% decline in eGFR from baseline was chosen to represent a progressive decline in kidney function because the CKD Prognosis Consortium found it conferred a substantial risk of kidney failure in CKD with an eGFR ≥60 or <60 mL/min/1.73 m² [12].
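The composite outcome can be expressed as a short classification rule. The following is a minimal Python sketch for illustration only (the study's analysis was performed in R); the `PatientRecord` container and its field names are hypothetical and do not come from the study data.

```python
from dataclasses import dataclass
from typing import Sequence

# Percent decline in eGFR from baseline that counts as progression (from the outcome definition).
DECLINE_THRESHOLD = 30.0


@dataclass
class PatientRecord:
    """Hypothetical container; field names are illustrative assumptions."""
    egfr: Sequence[float]            # eGFR values in time order; egfr[0] is the baseline
    started_dialysis: bool = False
    received_transplant: bool = False


def max_percent_decline(egfr: Sequence[float]) -> float:
    """Largest percentage drop from baseline seen at any follow-up measurement."""
    baseline = egfr[0]
    if baseline <= 0:
        raise ValueError("baseline eGFR must be positive")
    # A negative result means eGFR never fell below baseline during follow-up.
    return max(((baseline - value) / baseline * 100.0 for value in egfr[1:]), default=0.0)


def is_progressive(patient: PatientRecord) -> bool:
    """Composite outcome: >=30% eGFR decline, dialysis, or transplantation."""
    return (
        max_percent_decline(patient.egfr) >= DECLINE_THRESHOLD
        or patient.started_dialysis
        or patient.received_transplant
    )
```

Any one of the three components is sufficient, so a patient who starts dialysis is classified as progressive regardless of the recorded eGFR trajectory.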

Biomarkers
A panel of 61 biomarkers (Table S1) was assessed in the Biomarker Discovery cohort. Laboratory data, including routine kidney function measurements from Queensland Health Pathology Services (QHPS) and taken during the clinical management of CKD patients, were sourced from Queensland Health integrated electronic Medical Record (ieMR) or other databases. A biomarker was included for analysis if it was measured at baseline eGFR measurement or ≤3 months prior to the baseline eGFR measurement in an individual patient and was measured in ≥50 patients. In the Biomarker Discovery cohort, eGFR was calculated by the 2009 CKD-EPI creatinine equation.
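For reference, the 2009 CKD-EPI creatinine equation can be sketched as below. The coefficients are those of the published 2009 equation; the function names, the choice of mg/dL as the input unit, and the µmol/L conversion helper are assumptions made for this illustration and are not taken from the study's code.

```python
def umol_per_l_to_mg_per_dl(scr_umol_l: float) -> float:
    """Convert serum creatinine from µmol/L to mg/dL (1 mg/dL = 88.4 µmol/L)."""
    return scr_umol_l / 88.4


def ckd_epi_2009(scr_mg_dl: float, age_years: float, female: bool, black: bool = False) -> float:
    """eGFR (mL/min/1.73 m^2) from serum creatinine (mg/dL), 2009 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scaling constant
    alpha = -0.329 if female else -0.411    # exponent applied when creatinine is below kappa
    ratio = scr_mg_dl / kappa
    egfr = 141.0 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.209 * 0.993 ** age_years
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr
```

For a fixed age and sex, a higher serum creatinine always yields a lower eGFR, which is why the two measurements carry closely related information.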

Statistical Analysis
Patient characteristics of progressive and non-progressive CKD patients were compared using an independent t-test or Mann-Whitney U-test, depending on whether continuous variables were distributed normally. A Chi-Square test of homogeneity was used to compare categorical variables. If a characteristic was found to differ between progressive and non-progressive patients, the relative risk was calculated.
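A relative risk and its confidence interval can be computed from a 2×2 table as sketched below. The text does not state how the confidence interval was derived; the log-transformation (Katz) method used here is an assumption, and the function and parameter names are hypothetical.

```python
import math
from typing import Tuple


def relative_risk(
    events_exposed: int,
    n_exposed: int,
    events_unexposed: int,
    n_unexposed: int,
    z: float = 1.96,  # normal quantile for a 95% interval
) -> Tuple[float, float, float]:
    """Return (relative risk, lower bound, upper bound) using the log method."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

Here "exposed" would correspond to patients with the characteristic of interest (for example, a follow-up time above the median) and "events" to patients classified as progressive.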
Biomarker concentrations were compared between progressive and non-progressive CKD patients using logistic regression. For each biomarker, the basic covariates of age, gender, kidney disease diagnosis, body mass index, and follow-up time were included in the logistic regression. For each individual biomarker, a covariate was dropped from the logistic regression if it did not significantly contribute to the model.
Several models for predicting progressive CKD, termed the DROP CKD models, were calculated from biomarker expression via linear discriminant analysis, a statistical approach for predicting the class membership of individuals. Biomarkers that were observed as differing between progressive and non-progressive CKD patients of the Predictive Model cohort were selected for inclusion in the development of these predictive models. Biomarker selection was via a step approach, which included (step-forward) or excluded (step-backward) a biomarker if doing so improved the accuracy of the model. If the inclusion or exclusion of a biomarker did not improve the accuracy of the predictive model, the selection process was stopped, and the biomarker panel of the previous step was confirmed as the final predictive model.
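The step-forward procedure can be sketched as a greedy loop with the stopping rule described above. This is an illustrative Python sketch rather than the study's R code: the `accuracy_of` callable is a hypothetical stand-in for fitting a linear discriminant model on a candidate panel and returning its accuracy, and choosing the single best addition at each step is an assumption, since the text does not specify how ties or competing candidates were handled.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def step_forward(
    candidates: Iterable[str],
    accuracy_of: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedily add the biomarker that most improves accuracy; stop when none does."""
    remaining = set(candidates)
    panel: FrozenSet[str] = frozenset()
    best = float("-inf")
    while remaining:
        # Score every one-biomarker extension of the current panel.
        scored = [(accuracy_of(panel | {marker}), marker) for marker in sorted(remaining)]
        top_score, top_marker = max(scored)
        if top_score <= best:
            break  # no candidate improves accuracy, so the previous panel is final
        panel, best = panel | {top_marker}, top_score
        remaining.remove(top_marker)
    return panel, best
```

A step-backward variant would start from the full candidate set and remove one biomarker per step under the same stopping rule. Because both procedures are greedy, they can settle on different panels, as happened with the two DROP CKD panels.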
Predictive models were calculated with and without the basic covariates of age, gender, kidney disease diagnosis, body mass index, and follow-up time, using both step-forward and step-backward approaches. Additionally, predictive models were calculated for the routine kidney measurements sCr, eGFR, Alb, and ACR. The accuracy of these predictive models, in terms of predicting future progression status (based on baseline biomarker expression), was compared between models and with the predictive models of the routine kidney measurements.
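To illustrate how an accuracy figure arises for a single routine measurement, the sketch below fits a two-class linear discriminant on one variable and scores its predictions. It is a simplified Python illustration, not the study's R code. Taking the priors from the class proportions is an assumption, and so is scoring the model on the same data used to fit it, because the text does not state how accuracy was evaluated. All function names are hypothetical.

```python
import math
from statistics import fmean
from typing import Dict, Sequence, Tuple

Model = Tuple[Dict[bool, float], float, Dict[bool, float]]  # (class means, pooled variance, priors)


def fit_lda_1d(values: Sequence[float], labels: Sequence[bool]) -> Model:
    """Fit two-class LDA on one biomarker; label True means progressive."""
    groups: Dict[bool, list] = {True: [], False: []}
    for value, label in zip(values, labels):
        groups[bool(label)].append(value)
    if not groups[True] or not groups[False]:
        raise ValueError("both classes must be present")
    n = len(values)
    means = {k: fmean(g) for k, g in groups.items()}
    # Pooled within-class variance: LDA assumes both classes share one variance.
    pooled_var = sum((v - means[k]) ** 2 for k, g in groups.items() for v in g) / (n - 2)
    if pooled_var <= 0:
        raise ValueError("pooled variance must be positive")
    priors = {k: len(g) / n for k, g in groups.items()}
    return means, pooled_var, priors


def predict_lda_1d(model: Model, value: float) -> bool:
    """Assign the class with the larger linear discriminant score."""
    means, var, priors = model

    def score(k: bool) -> float:
        return value * means[k] / var - means[k] ** 2 / (2 * var) + math.log(priors[k])

    return score(True) > score(False)


def accuracy(model: Model, values: Sequence[float], labels: Sequence[bool]) -> float:
    """Fraction of patients whose predicted class matches the observed class."""
    hits = sum(predict_lda_1d(model, v) == bool(y) for v, y in zip(values, labels))
    return hits / len(values)
```

With several biomarkers the same idea applies, but the class means become vectors and the pooled variance becomes a covariance matrix.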
Statistical analysis was performed using R (Version 3.3.1) with the packages "MASS" and "sm". The specific code for these packages is available online [14]. Statistical significance was assigned at p < 0.05.

Patient Characteristics at Baseline
Patients were retrospectively classified as progressive or non-progressive based on a composite outcome of a ≥30% decline in eGFR from baseline, initiation of dialysis, or kidney transplantation. Patients of the Biomarker Discovery cohort (n = 418) who were classified as progressive (n = 183) differed from those classified as non-progressive (n = 235) only in follow-up time (Table 1). Progressive patients were followed for a significantly longer time (4.6 ± 1.5 years) than non-progressive patients (3.9 ± 1.6 years; p < 0.0001). A longer follow-up time, ≥4.3 years (the median follow-up time), conferred a relative risk of 1.3 (95% CI [1.1, 1.7]) for progression compared to a shorter follow-up time of <4.3 years.

Assessing the Composite Outcome
The composite outcome was found to distinguish between progressive and non-progressive CKD patients by the maximum eGFR percentage decrease from baseline and the longitudinal trajectory of eGFR percentage change from baseline (Figure 1). Progressive patients of the Biomarker Discovery cohort experienced a greater maximum eGFR percentage decrease of 54.4 ± 13.8% compared to 17.6 ± 10.2% for non-progressive patients (p < 0.0001). This was also observed in the Predictive Model cohort with progressive patients experiencing a maximum eGFR percentage decrease of 50.9 ± 14.8% compared to 20.1 ± 7.4% for non-progressive patients (p < 0.0001).
Progressive and non-progressive patients of the Biomarker Discovery cohort demonstrated significantly different longitudinal trajectories of eGFR percentage change from baseline (F(3, 6609) = 961.8, p < 0.0001), with progressive patients experiencing a steep decline and non-progressive patients experiencing a shallow increase in the percentage change in eGFR from baseline. Within the Predictive Model cohort, both patient groups demonstrated downward longitudinal trajectories; however, these trajectories were significantly different (F(3, 454) = 98.87, p < 0.01). Progressive patients experienced a steep decline, while non-progressive patients experienced a shallower decline in the percentage change in eGFR from baseline.

DROP CKD-A Predictive Model
Whether created through a step-forward or step-backward approach, the DROP CKD models were more accurate at predicting progressive CKD based on baseline biomarker expression in the Predictive Model cohort than the traditional kidney function measurements sCr, eGFR, Alb or ACR alone (Figure 2). The step-forward approach selected a biomarker panel of sCr, osteopontin, tryptase, urea, and eGFR, which had an accuracy of 84.3%, or 83.3% when the basic covariates (age, body mass index, follow-up time, kidney disease diagnosis, and gender) were included. The step-backward approach selected a biomarker panel of bicarbonate, osteopontin, SCF, tissue factor, tryptase, urea, sCr, and eGFR with an accuracy of 86.3%. When including the basic covariates, the accuracy of the model created via the step-backward approach decreased to 81.3%. In comparison, the kidney measurements sCr, eGFR, Alb, and ACR had individual predictive accuracies of 67.7%, 66.1%, 53.2%, and 53.2%, respectively, and a combined predictive accuracy of 75.8%.
Figure 2. Linear discriminants of kidney measurements and DROP CKD models. Distinguishing Risk of Progressive CKD models, calculated via linear discriminant analysis, were more accurate than those calculated for the kidney measurements eGFR (A) and albuminuria (B) when predicting future progressive CKD at baseline in the Predictive Model cohort. eGFR and albuminuria conferred accuracies of 66.1% and 53.2%, respectively. The step-forward approach (C) calculated a predictive model with an accuracy of 84.3% with the biomarkers sCr, eGFR, osteopontin, tryptase, and urea. When including basic covariates (age, body mass index, follow-up time, kidney disease diagnosis, and gender), the step-forward approach (D) had an accuracy of 83.3%. The step-backward approach (E) calculated a predictive model with an accuracy of 86.3% with the biomarkers bicarbonate, osteopontin, SCF, tissue factor, tryptase, urea, sCr, and eGFR. When including basic covariates, the step-backward approach (F) had an accuracy of 81.25%. Frequency of linear discriminants was plotted by progressive (dashed, red) and non-progressive (solid, black) CKD. Greater separation of progressive and non-progressive distributions indicates greater accuracy when predicting progressive CKD by a predictive model. Abbreviations: Estimated glomerular filtration rate (eGFR), serum creatinine (sCr), stem cell factor (SCF).


Discussion
An investigative study of CKD patients was conducted to identify novel biomarkers of progressive CKD and to validate emerging biomarkers. The aim was to develop a model for accurately predicting progressive CKD. A discovery-based approach using the CKD QLD Registry, termed the Biomarker Discovery cohort, identified 13 biomarkers that differed in baseline expression between CKD patients who subsequently progressed and those who did not. Several of these biomarkers, in addition to emerging biomarkers identified in the literature, were also screened in the CKD Biobank cohort, termed the Predictive Model cohort, where 10 biomarkers were found to differ between progressive and non-progressive CKD patients (Figure 3). Predictive models, termed the DROP CKD models, were developed based on biomarker panels representing the pathophysiological processes of progressive CKD, traditional kidney function measurements, and common CKD comorbidities. These predictive models were more accurate at predicting progressive CKD than current traditional kidney function measurements.
The DROP CKD models were created to predict future progression events based on the expression of a selected biomarker panel at baseline. Several predictive models for progressive CKD have been created previously. The most robust appears to be the Kidney Failure Risk Equation, which was created in a cohort of >700,000 patients across the globe [9]. While promising, this model is lacking in several areas. Neither patients with eGFR category G1 and G2 CKD nor a definition of progressive kidney function decline as a ≥30% decline in eGFR from baseline were included in the construction of the Kidney Failure Risk Equation. This is problematic because G1 and G2 CKD patients are still at risk of experiencing a 'progressive' decline in kidney function, as shown in the research presented here and published previously [15,16]. Additionally, the Kidney Failure Risk Equation did not include novel or emerging biomarkers in its development. The research presented here is among the first to show that inclusion of novel and emerging biomarkers of progressive CKD, in addition to kidney function measurements and clinical information traditionally found in patient health records, improves prediction accuracy. Studies using the Scottish Diabetes Research Type 1 Bioresource and the Finnish Diabetic Nephropathy cohorts showed that biomarker panels that included KIM-1 and CD27 greatly improved accuracy [15,17]. Moreover, a novel urinary biomarker panel, termed CKD273, and its sub-panels, were more accurate at predicting progressive CKD in the lower grade eGFR categories [16,18].
The tissue injury biomarkers osteopontin and tissue factor (TF) were screened in the Predictive Model cohort. These biomarkers were increased in CKD patients classified as progressive. Osteopontin has received little attention as a biomarker of progressive CKD, but is known to be inversely correlated with eGFR [19,20]. Additionally, TF has not previously been associated with progressive CKD, but hypercoagulability is known to occur in CKD patients [21,22]. To our knowledge, this is the first study to associate increased osteopontin and TF levels with progressive CKD.
Despite being involved in a range of kidney diseases, mast cells have received little attention in the context of progressive CKD. SCF, a major regulator of mast cell activity, and the mast cell-specific protease tryptase, were screened in the Predictive Model cohort. This is the first study to record an association of SCF with progressive CKD, but considering increased mast cell activity has been observed in a range of kidney diseases [23][24][25][26][27][28][29][30][31], the SCF-progressive CKD association is perhaps unsurprising. Little research has been conducted into the association of tryptase with progressive CKD; however, the Renal Impairment in Secondary Care Study observed an association between elevated tryptase levels and progression towards kidney failure [32].
TNF-α and its soluble receptors sTNFR-I and sTNFR-II are major regulators of inflammation. While TNF-α was unchanged between progressive and non-progressive patients of the Predictive Model cohort, sTNFR-I and sTNFR-II were increased in progressive patients of this cohort. Similar observations have been shown previously. The TNF-α observation is at odds with that of the Chronic Renal Insufficiency Cohort Study, where plasma TNF-α was associated with a rapid reduction in kidney function [33]. In contrast nephropathy, increased sTNFR-I and sTNFR-II correlated with kidney function decline [34], and in type 1 and type 2 diabetes mellitus cohorts, they were associated with worse clinical outcomes [35][36][37][38]. Additionally, sTNFR-I was associated with an increased risk of progressive CKD in a community and CKD population [39,40].
Kidney measurements, such as sCr, urea, Alb, eGFR and ACR, have been studied extensively in the context of CKD and its progression [4]. In both cohorts used here, patients classified as progressive demonstrated increased sCr and urea, while eGFR was decreased. The protein creatinine ratio was increased in progressive CKD patients of the Biomarker Discovery cohort. However, increased kidney damage was not observed in patients classified as progressive in the Predictive Model cohort, with Alb and ACR being unchanged between patient groups. Although increased Alb is a known biomarker of CKD progression and has been used in other clinical tools for predicting progressive CKD [9], Alb has been observed to be less accurate at predicting CKD progression in early eGFR categories than in more advanced categories, and some patients with advanced eGFR categories have normoalbuminuria [16,41].
With reduced kidney function in progressive CKD patients, a reduction in the ability of the kidney to filter excess electrolytes from the blood is expected.
Haematocrit and haemoglobin, as biomarkers of anaemia, were reduced in progressive CKD patients of the Biomarker Discovery cohort. This observation was not unexpected, as anaemia is a known complication of CKD. The current results agree with previous research showing anaemia as an indicator of worse CKD outcomes [42]. The inverse association between kidney failure and haemoglobin levels was also observed in non-diabetic CKD and autosomal dominant polycystic kidney disease G2-G5 CKD patients [43,44]. Erythropoiesis-stimulating agents slowed progressive CKD in a non-dialysis dependent population [45].
Bicarbonate, a biomarker of metabolic acidosis, was reduced in both cohorts. Reduced bicarbonate has previously been associated with a higher risk of progressive CKD in several studies, including in children with glomerular disease, the AASK (African American Study of Kidney Disease and Hypertension) Study, and a CKD population sourced from a USA tertiary care centre [46][47][48]. The Modification of Diet in Renal Disease (MDRD) study showed that reduced serum bicarbonate levels were associated with an increased risk of kidney failure [49]. Chloride, another metabolic acidosis biomarker, was increased in progressive CKD patients, but only in the Biomarker Discovery cohort. Increased serum chloride was associated with lower baseline kidney function in the CKD-ROUTE (CKD Research of Outcomes in Treatment and Epidemiology) study [50]. In a G3-G4 CKD cohort, higher serum chloride levels were associated with worse kidney function decline, but not with a ≥30% decline in eGFR in a fully adjusted model [51].
As biomarkers of mineral and bone disease, measurements for phosphate and calcium were available in both cohorts, while parathyroid hormone was available only in the CKD QLD Registry. Previously, elevated phosphate levels have been associated with an increased risk of CKD progression and worse clinical outcomes in CKD cohorts [52,53], and with kidney failure in an autosomal dominant polycystic kidney disease cohort [46]. Calcium levels were decreased only in the progressive patients of the CKD QLD Registry cohort and were unchanged in the Predictive Model cohort. Furthermore, parathyroid hormone was elevated in progressive patients of the Biomarker Discovery cohort. To our knowledge, this is one of the first studies to investigate the association of calcium and parathyroid hormone with progressive CKD. Considering that kidney function decline is associated with deterioration of mineral homeostasis and disruption to tissue and circulating levels of phosphate, calcium and parathyroid hormone [54,55], the altered calcium and parathyroid hormone levels observed in progressive CKD patients are expected.
A composite outcome was used to define progressive CKD in both cohorts and was found to distinguish between patients who experienced progressive CKD and those who did not. This was based on a ≥30% decline in eGFR from baseline, initiation of dialysis, or receipt of a kidney transplant. In both cohorts, patients classified as progressive experienced a maximum eGFR percentage decline that was ~2-4 fold greater than those classified as non-progressive.
This investigation was limited in three aspects. Firstly, the DROP CKD models were constructed using a relatively small patient cohort. Secondly, several biomarkers that were associated with progression in the Biomarker Discovery cohort were not available from patient records in the Predictive Model cohort. Thirdly, the longer follow-up time of progressive patients in the Biomarker Discovery cohort and the more advanced eGFR category of progressive patients of the Predictive Model cohort conferred a slight risk of progressive CKD. Future studies accounting for these limitations are required to validate the screened biomarkers and biomarker panels.
The primary benefit of the approach of the DROP CKD models, and of other predictive models [9,15,16], is their ability to be integrated into the public medical infrastructure. These biomarkers are measured in biospecimens, such as venous blood or urine, that are collected via minimally invasive procedures by a phlebotomist, a position that does not require highly specialised training. Furthermore, with the addition of the relevant assay kits, pathology laboratories have the infrastructure required to screen for these proteomic biomarkers. A major hurdle to deploying predictive models of progressive CKD such as these would arise in rural and remote regions and in developing countries, where public medical infrastructure is often insufficient to support them.
While the DROP CKD model shows promising results, future efforts are required to develop a clinically useful tool for predicting progressive CKD. Efforts are needed in identifying novel biomarkers of progressive CKD and in using advanced statistical analysis in predictive model construction. Two high-throughput approaches for identifying novel biomarkers are proteomics and metabolomics. Previously, proteomics has been utilised in a large-scale study where 273 urinary biomarkers that differed between healthy controls and CKD patients were identified. These findings were subsequently used in studies attempting to create predictive CKD models [16,18]. Metabolomic studies in CKD patients have focused on the blood and have been performed in small clinical cohorts [56][57][58]. Finally, as an advanced statistical approach, machine learning is being adopted in clinical CKD research. It has been used in several studies, including with comorbidity data to predict kidney replacement therapy within 12 months of CKD diagnosis [59], and to create biomarker panels using kidney measurements, dyslipidaemia biomarkers, serum sodium, and C-reactive protein to determine progressive CKD [60].
The research presented here has identified several novel biomarkers and validated several emerging biomarkers of progressive CKD. It has contributed to the growing body of literature that supports the use of novel and emerging biomarkers of CKD progression to improve the accuracy of models for predicting progressive CKD. It also supports the benefit of building predictive models from biomarkers that represent the pathophysiological processes of progressive CKD, traditional kidney measurements, and common CKD comorbidities. However, to create a successful clinical tool for predicting progressive CKD, more biomarker research is required, and more sophisticated approaches need to be used in its creation.

Conflicts of Interest:
The authors declare no conflict of interest.