1. Introduction
Patients with End-Stage Renal Disease (ESRD) initiate dialysis when their intrinsic, residual renal function (RRF) is between 5 and 11 mL/min/1.73 m² [1,2]. Although RRF will be invariably lost within the first few years of dialysis initiation, sustaining this RRF is clinically important for the following reasons: low or declining RRF is associated with worse survival [3,4,5,6,7], worse phosphate control [8], increasing left ventricular hypertrophy [9], and poorer quality of life [3,10]. Hence existing dialysis guidelines [11,12] recommend the incorporation of RRF measurements into individualized patient prescriptions of treatment frequency and duration in order to achieve minimum dialysis adequacy targets.
In recent years, the paradigm of incremental dialysis [6,13,14], i.e., the gradual increase in the frequency of dialysis to match the RRF, has also been proposed. In this paradigm, patients with preserved RRF are initiated on twice-weekly dialysis, and then switched to thrice-weekly dialysis as RRF is lost. This practice has not been widely adopted even though it may be associated with neutral [15] or even improved survival, reduced hospitalizations and improved control of several biochemical parameters [16,17]. A major barrier to the safe implementation of incremental dialysis is the need to measure RRF in order to prevent underdialysis. This is a concern raised by regulators as well. In the United States, the Centers for Medicare & Medicaid Services explicitly considers twice-weekly dialysis to be inadequate in patients with urea clearance (UrCl, a proxy for RRF) lower than 2 mL/min. Current guidelines [18] thus suggest that RRF be measured by interdialytic urine collection and plasma sampling for urea and creatinine for the calculation of the relevant clearances. However, such collections are inconvenient for the patient [17,19,20,21], costly, and only partially adhered to, even in incentivized research settings [22].
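As a rough illustration of the clearance calculation that such a collection supports, the sketch below computes UrCl as the urinary urea excretion rate divided by the mean interdialytic plasma urea concentration. Approximating the mean plasma concentration by the average of the postdialysis value and the following predialysis value is an assumption of this sketch, as are the function name and the example numbers; none of them is taken from the cited guidelines.

```python
def urea_clearance_ml_min(urine_urea, urine_volume_ml, collection_min,
                          plasma_urea_post, plasma_urea_pre):
    """Residual urea clearance (mL/min) from an interdialytic urine collection.

    urine_urea, plasma_urea_post and plasma_urea_pre must share the same
    concentration units (e.g., mg/dL); the units cancel in the ratio.
    """
    if collection_min <= 0 or urine_volume_ml < 0:
        raise ValueError("collection time must be positive, volume non-negative")
    # Assumption: mean interdialytic plasma urea is approximated by the average
    # of the postdialysis value and the following predialysis value.
    mean_plasma = (plasma_urea_post + plasma_urea_pre) / 2.0
    if mean_plasma <= 0:
        raise ValueError("plasma urea must be positive")
    # Excretion rate has units of (concentration x mL) per minute.
    excretion_rate = urine_urea * urine_volume_ml / collection_min
    return excretion_rate / mean_plasma


# Hypothetical patient: 600 mL of urine collected over a 44 h interval.
example = urea_clearance_ml_min(
    urine_urea=400.0, urine_volume_ml=600.0, collection_min=44 * 60,
    plasma_urea_post=20.0, plasma_urea_pre=60.0,
)
print(round(example, 2))  # 2.27
```

With these made-up inputs the patient sits just above the 2 mL/min cutoff, which shows how sensitive the classification is to an incomplete collection: losing a fraction of the urine volume lowers the computed clearance proportionally.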
To overcome the shortcomings of urine collections, effort has shifted towards estimating UrCl without having to collect urine. These efforts leverage predialysis measurements of “middle molecules” such as beta-2 microglobulin (B2M) [19,21,23], cystatin C (CysC) [21,23] and beta trace protein (BTP) [19,21], either alone or in combination, to estimate UrCl. These data suggest that B2M is the single most predictive biomarker of the guideline-relevant cutoff of UrCl <2 mL/min. The addition of other middle molecule markers, e.g., CysC or BTP, only marginally improves the performance of RRF-estimating equations. Despite these encouraging results, these equations are not deployed in clinical practice because of the large variability in predicting the RRF of individual patients. This variable performance is thought to reflect the complex, multi-compartmental interdialytic kinetics of middle molecules and is particularly evident in patients with minimal or zero RRF [21].
In this study, we introduce a novel framework for the estimation of RRF, based on the population compartmental kinetic behavior of B2M and its removal during dialysis that have been meticulously modelled and meta-analyzed by our group [24,25]. This population kinetic model captures interindividual variability in the processes of generation, distribution and even elimination of B2M from the body. Using this model, we simulated a large cohort of patients with various levels of RRF receiving either hemodialysis (HD) or hemodiafiltration (HDF). Through these simulations, we were able to generate a very large database of B2M measurements and RRF in a realistic, physiologically correct manner. This database of B2M levels and RRF was then used to estimate a novel Population Kinetic equation for RRF (PK-RRF), which was subsequently validated in an external public dataset of real patients [21]. The incorporation of additional renal function biomarkers in our approach is straightforward under the simulator calibration framework [26,27] that is commonly applied to calibrate simulations against real-world measurements. This is an innovative direction towards the development of multi-biomarker models, which we explore in this manuscript. We assessed the performance of the resulting equation(s) against their ability to estimate UrCl using cross-validation. The clinical utility of these predictive models is quantified from a decision curve analysis/net-benefit perspective [28,29]. These analyses allow us to assess the safety of the RRF-estimating equations through a decision analytic framework, over the entire range of cost–benefit ratio valuations against the current standard policy of thrice-weekly dialysis.
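To make the net-benefit idea concrete, the following minimal sketch uses the standard decision curve analysis formula, net benefit = TP/n − (FP/n) · p_t/(1 − p_t), where p_t is the risk threshold. Treating "thrice-weekly dialysis for everyone" as a policy that flags every patient as having low UrCl is our reading of the comparison described above. The function name and the ten-patient dataset are hypothetical, and the sketch reports plain rather than standardized net benefit.

```python
def net_benefit(y_true, y_flag, threshold):
    """Net benefit of a policy that flags patients as having low UrCl.

    y_true:    1 if measured UrCl < 2 mL/min, else 0.
    y_flag:    1 if the policy treats the patient as having low UrCl.
    threshold: risk threshold p_t in (0, 1); p_t / (1 - p_t) is the
               exchange rate between false positives and true positives.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(y_true) != len(y_flag) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    tp = sum(1 for t, f in zip(y_true, y_flag) if t == 1 and f == 1)
    fp = sum(1 for t, f in zip(y_true, y_flag) if t == 0 and f == 1)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical data: true low-UrCl status and model-predicted risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
risk = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1, 0.3, 0.7, 0.35, 0.05]

p_t = 0.5
model_flags = [1 if r >= p_t else 0 for r in risk]   # flag if risk >= p_t
treat_all = [1] * len(y_true)                        # everyone thrice weekly

nb_model = net_benefit(y_true, model_flags, p_t)
nb_all = net_benefit(y_true, treat_all, p_t)
print(f"model: {nb_model:.2f}, treat-all: {nb_all:.2f}")
```

Repeating the calculation over a grid of thresholds traces out a decision curve; a prediction rule is useful at a given threshold when its net benefit exceeds both the treat-all policy and zero (the treat-none policy).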
4. Discussion
In this paper, we utilized a novel population compartmental kinetic framework to derive a set of equations for the prediction of RRF in patients undergoing conventional high-flux HD or on-line HDF. Our equations were derived entirely by combining computer simulations with advanced statistical modeling and had extremely high discrimination when applied to a human dataset of measurements of RRF. A clearance-based equation that utilized predialysis and postdialysis B2M measurements, patient weight, treatment duration and ultrafiltration had higher discrimination than an equation previously derived in humans. Furthermore, the derived equations appear to have acceptable clinical usefulness for a wide range of likelihood of having low (<2 mL/min) residual urea clearance. Our analyses of clinical utility suggest that these formulas can support decisions about the dialysis prescription that depend on patients having preserved RRF, e.g., incremental dialysis for those with preserved RRF, or even be used to increase the frequency and time on dialysis in those with low RRF.
Compartmental models for biomarker kinetics are familiar to nephrologists, since they have been used to quantify dialysis dose for decades [59,60,61,62,63,64]. These models are mathematical descriptions of the processes of generation, distribution, and elimination that determine the concentration of the biomarker of interest. A population viewpoint extends the kinetic approach by allowing interindividual variation in these parameters. This interindividual variation allowed us to simulate the relation between B2M, RRF and the impact of the dialytic regimen in a manner that generalized from the simulated patients to the real-world clinical data. The resulting equations, which were entirely derived in artificial datasets, thus had high discrimination when applied to data from actual patients. In fact, the performance of the PK-RRF equations rivalled the performance of equations derived entirely in human populations, e.g., AUC of 0.91 [21] and 0.84 [23]. Overall, our paper adds to the expanding literature showing that plasma levels of middle molecules in general, and B2M in particular [19,21,23], can predict the regulatory-relevant threshold of RRF >2 mL/min on a par with the performance of the clinically accepted, validated troponin assays in the diagnosis of acute coronary syndromes (AUC: 0.84–0.94) [65].
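The generation, distribution and elimination processes described above can be sketched as a two-compartment model integrated with a simple Euler scheme. This is only a toy illustration and not the population model of references [24,25]: every parameter value below is a placeholder chosen for plausibility, the compartment volumes are held constant (ultrafiltration is ignored), and the thrice-weekly, 4 h schedule is assumed.

```python
def predialysis_b2m(kr_l_min, weeks=6, dt_min=1.0):
    """Predialysis plasma B2M (mg/L) after the long interdialytic gap.

    kr_l_min: residual renal clearance of B2M in L/min.
    All other parameters are illustrative placeholders, not fitted values.
    """
    gen = 0.1          # generation rate, mg/min
    k_nr = 0.003       # non-renal clearance, L/min
    k_ic = 0.08        # intercompartment clearance, L/min
    k_d = 0.06         # dialyzer clearance while on dialysis, L/min
    v1, v2 = 4.5, 9.5  # perfused / non-perfused volumes, L (held constant)

    week = 7 * 24 * 60                               # minutes per week
    session_starts = (0, 2 * 24 * 60, 4 * 24 * 60)   # thrice weekly
    session_len = 240                                # minutes per session

    # Start both compartments at the steady state reached without dialysis.
    c1 = c2 = gen / (k_nr + kr_l_min)

    for step in range(int(weeks * week / dt_min)):
        t_week = (step * dt_min) % week
        on_dialysis = any(s <= t_week < s + session_len for s in session_starts)
        k_total = k_nr + kr_l_min + (k_d if on_dialysis else 0.0)
        exchange = k_ic * (c1 - c2)   # mg/min moving from compartment 1 to 2
        dc1 = (gen - k_total * c1 - exchange) / v1
        dc2 = exchange / v2
        c1 += dc1 * dt_min
        c2 += dc2 * dt_min
    # The loop ends exactly at the start of the next week's first session.
    return c1


for kr in (0.0, 0.002, 0.005):
    level = predialysis_b2m(kr)
    print(f"Kr = {kr * 1000:.0f} mL/min -> predialysis B2M = {level:.1f} mg/L")
```

Even in this simplified form, the predialysis concentration falls monotonically as residual clearance rises, which is the relation an RRF-estimating equation inverts. A population approach would additionally draw the generation rate, clearances and volumes from distributions rather than fixing them.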
The high performance of the PK-RRF equations must be attributed to the validity of the constructs used to derive them from first principles. These constructs include the bi-compartmental kinetics of B2M, the population distribution [24] of the kinetic parameters of B2M, the effects of dialytic clearance [25] and finally the relation between the RRF and the clinical measurement (UrCl) used as its proxy. Despite the high theoretical validity of our approach, translation of the derived equations to the real world should not be expected to be entirely free of complications. For example, the equations appear to have low precision when applied to the simulated development dataset (e.g., fewer than 15% of predicted clearances are within 0.5 mL/min of their simulated values). In the simulations, this low performance is driven mainly by combinations of kinetic parameters, e.g., “large” dialysis patients given “short” treatment times, that are unlikely to be observed in the real world. These unlikely combinations were an unavoidable consequence of relying on group average values for the various kinetic parameters of the dialysis regimens as reported in the papers of the clinical trials we used. Despite this possibly large amount of noise present in the source dataset, the overall pattern between residual renal function and B2M was learned successfully by the modeling methods, leading to a real-world performance that was substantially better than the one noted in the simulations (e.g., the proportions of predicted UrCl values that are within 0.5 and 2.0 mL/min of the measured value are on the order of 40% and 90%, respectively). Notwithstanding these encouraging observations, we feel there is still the need for calibration of the base and the clearance-based PK-RRF formulas. The degree of calibration required to predict RRF in a new dataset appears to be smaller than that required to adapt the high-quality Shafi equation to the same external dataset. Recalibration of the PK-RRF equations did not materially affect their extremely high discrimination but did seem to have a positive impact on the clinical usefulness as assessed by DCA, at least for the relatively high prevalence of patients with UrCl <2 mL/min in the Vilar dataset.
Consequently, we feel that the calibrated version of the base PK-RRF equation should be used over the uncalibrated version. However, the uncalibrated clearance-based PK-RRF equation appears to perform as well as the calibrated and multi-biomarker equations when the population-level prevalence of low UrCl is less than 50%, so either the calibrated or the uncalibrated version can be applied in future clinical studies. It should be noted that the Shafi rule also performed well in this setting, especially when the population prevalence of UrCl <2 mL/min was less than 30%.
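The precision figures quoted in this paragraph (the share of predicted clearances within 0.5 or 2.0 mL/min of the reference value) correspond to a simple tolerance-based accuracy metric. The sketch below shows one way to compute it; the function name and the five paired values are hypothetical and are not data from the study.

```python
def fraction_within(predicted, measured, tolerance):
    """Fraction of predictions within +/- tolerance of the measured value."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, m in zip(predicted, measured) if abs(p - m) <= tolerance)
    return hits / len(predicted)


# Hypothetical paired values of UrCl in mL/min.
measured = [0.0, 1.2, 2.5, 4.0, 3.1]
predicted = [0.4, 2.0, 2.3, 6.5, 4.9]

p05 = fraction_within(predicted, measured, 0.5)
p20 = fraction_within(predicted, measured, 2.0)
print(p05, p20)  # 0.4 0.8
```

A metric of this kind complements discrimination (AUC): an equation can rank patients well around the 2 mL/min cutoff while still missing the exact clearance by a clinically relevant margin.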
Despite the success of middle molecules in predicting RRF, a puzzling feature of the literature to date [19,21,23] is the marginal success of multi-biomarker equations in the field. This was also noted in our study, which showed unimpressive improvements in discrimination and precision when CysC or simultaneous urea and creatinine measurements were used to recalibrate the B2M-based PK-RRF. Although there have been concerns that the involvement of B2M in the inflammatory response may confound the relationship between RRF and B2M [41,66], our decision to consider variable rates of generation in our large-scale simulations probably allowed us to derive PK-RRF equations that are largely insensitive to variations in the generation rate. Hence, our formulas are unlikely to require additional biomarker measurements for robust performance. Despite these observations, we have noticed that some improvement in clinical utility may be derived by considering additional, readily available measurements, e.g., urea, creatinine or CysC. We also provided the analytical methodology to incorporate these biomarkers, as a flexible GP embedded in the recalibration framework. This opens the possibility of incorporating additional, promising biomarkers, e.g., BTP [41,42], in datasets that have simultaneously measured all the aforementioned biomarkers.
In developing our equations, we were motivated by the unmet need for measurements that can facilitate further research in the area of RRF preservation and/or implementation of incremental, individualized forms of dialysis in practice [13,43]. Research in both areas is impeded by the lack of alternative techniques to measure RRF that do not require urine collections [19,21,23,41]. Prospective observational [67] and retrospective propensity score-matched studies [15] have shown that an incremental approach to dialysis frequency is not inferior with respect to mortality and may be associated with improved quality of life. Considering the direct treatment cost differential, i.e., twice-weekly dialysis incurs two-thirds of the dialytic costs of thrice-weekly dialysis, a formula that can identify patients with relatively preserved RRF could have direct implications for both research and practice of “personalized dialysis”. However, this research should take place within the boundaries of existing regulations for the dialysis industry. In fact, one of the barriers to practicing incremental dialysis in the United States is a concern raised by regulators: the Centers for Medicare & Medicaid Services explicitly considers twice-weekly dialysis to be inadequate in patients with RRF lower than 2 mL/min. By providing a PK-RRF equation that can predict this threshold with high discrimination and positive Net Benefit across the entire spectrum of risk-benefit assessments and a range of prevalence of low UrCl, we feel that research in this space can proceed in an ethical and regulation-compliant manner.
Conversely, the proposed PK-RRF formulas could be used to identify individuals who would benefit from opting out of the thrice-weekly standard, for longer or more frequent, nocturnal or quotidian dialysis. If dialysis frequency is a major driver of the cost–benefit ratio in a clinical usefulness analysis, a six-times-per-week regimen would have a ratio that lies in the opposite direction from that of an incremental dialysis approach. Hence, our sensitivity analyses, which show that the PK-RRF formulas have positive SNB across the entire range of the cost–benefit ratios and for a wide range of population prevalence of a low residual renal clearance, suggest that these formulas should be preferred when selecting patients for more frequent dialysis. In the latter situation, the cost–benefit ratio is not driven solely by the increased financial and time burden of treatment frequency, but also by the potential negative effects on the preservation of residual renal function [68]. Hence, it becomes critical that the prediction tool used to identify such patients achieves a consistent performance across the entire cost–benefit ratio for small to medium prevalence of the low UrCl state. In the latter situation, most of the misguided predictions of the UrCl prediction rule will be among the patients with preserved renal function, who may be likely to lose this function faster if they are dialyzed more frequently, according to analyses in the Frequent Hemodialysis Network clinical trial dataset.
A few limitations of the developed PK-RRF equations should be kept in mind. First, their analytical complexity makes it unwieldy to write them down in the closed form of the simpler equations they outperform. This is an unavoidable price to pay for the high discrimination of the PK-RRF equation. Nevertheless, we provide these equations as software programs in the open-source R programming language and a web server in order to allow other investigators to replicate our results and to deploy them in other settings. Since these formulas were developed in simulations, the entire set of programs used to derive them can be (and has been) shared in its entirety to allow for transparent verification and subsequent replication. Second, the formulas were validated only in cross-sectional assessments and their use in repeated evaluations of the RRF of the same patient remains untested. This is an area of exploration in future studies. Third, the marginal improvement of the discrimination of the multi-biomarker models may reflect deficiencies in the biomarkers available for inclusion. In particular, urea, creatinine and cystatin C, the conventional serum biomarkers for estimating renal function in patients not on dialysis, exhibit large interdialytic variation in levels and thus may not provide the optimal additional biomarkers for patients receiving hemodialysis. Fourth, the source data did not include a subset of measurements in which the RRF rather than the UrCl was measured with gold standard techniques, i.e., iothalamate or iohexol clearance. Consequently, an important component of our modeling, i.e., the relation of B2M to the RRF, could only be validated indirectly, by comparing the model’s predictions against the proxy of the RRF, i.e., the UrCl.
We do not view this as a major limitation of any future applications, since a gold standard measurement of RRF is done very infrequently, if ever, in dialysis clinical practice and nearly all research to date in the space utilizes the UrCl. Fifth, our analyses also suggest the possibility of further improvement in the performance of these equations by incorporating measurements or proxies for the non-dialytic renal clearance or the generation rate. At present, proxies for these processes remain largely unknown, so the analyses in the appendix that use these values constitute a provocative thought experiment about the ultimate potential of the population kinetic approach. Finally, the recalibrated PK-RRF equations have an intermediate validation status since we did not have an additional dataset to test their performance. This limitation does not extend to the uncalibrated version whose discrimination and clinical usefulness can be externally validated.
In summary, we have used computer simulations, the population kinetic approach and advanced statistical modeling to develop equations that can predict RRF (as assessed by UrCl) in patients undergoing maintenance HD or on-line HDF. These equations exhibit high discrimination and clinical usefulness when validated against an external, public clinical dataset. Recalibrated versions of these equations were developed in a cross-validation setting and are available for clinical use as well. Future studies should validate these equations in repeated assessments of the same patients and explore the utility of the PK-RRF equations as research tools in the areas of preservation of RRF and incremental, personalized dialysis.