Predicting Kidney Failure, Cardiovascular Disease and Death in Advanced CKD Patients

Introduction Predicting the timing and occurrence of kidney replacement therapy (KRT), cardiovascular events, and death among patients with advanced chronic kidney disease (CKD) is clinically useful and relevant. We aimed to externally validate a recently developed CKD G4+ risk calculator for these outcomes and to assess its potential clinical impact in guiding vascular access placement. Methods We included 1517 patients from the European Quality (EQUAL) study, a European multicentre prospective cohort study of nephrology-referred advanced CKD patients aged ≥65 years. Model performance was assessed based on discrimination and calibration. Potential clinical utility for timing of referral for vascular access placement was studied with diagnostic measures and decision curve analysis (DCA). Results The model showed a good discrimination for KRT and “death after KRT,” with 2-year concordance (C) statistics of 0.74 and 0.76, respectively. Discrimination for cardiovascular events (2-year C-statistic: 0.70) and overall death (2-year C-statistic: 0.61) was poorer. Calibration was fairly accurate. Decision curves illustrated that using the model to guide vascular access referral would generally lead to less unused arteriovenous fistulas (AVFs) than following estimated glomerular filtration rate (eGFR) thresholds. Conclusion This study shows moderate to good predictive performance of the model in an older cohort of nephrology-referred patients with advanced CKD. Using the model to guide referral for vascular access placement has potential in combating unnecessary vascular surgeries.

overestimation of the risk of kidney failure. 5 Recently, a promising CKD G4þ risk calculator was developed by Grams et al. 6 (from here on referred to as the Grams model), which predicts the timing and occurrence of not only kidney failure, but also cardiovascular events and death in patients with advanced CKD. By predicting various adverse outcomes in a single model, this tool accounts for competing events and therefore has large potential to improve care for patients with advanced CKD. To our knowledge, the Grams model has only been externally validated for the outcome KRT and death before dialysis and has yet to be validated for cardiovascular disease (CVD) and overall death. 4,7,8 Before use, external validation of prediction models is crucial in order to determine the accuracy in new patients.
The Grams model could be used in various clinical scenarios, such as discussions on the best KRT modality going forward, if any), and the timely preparation for KRT. A report from the Kidney Disease: Improving Global Outcomes group suggested using the Grams model for timely referral for vascular access placement. 1 Obtaining a working vascular access in the form of an AVF (generally the preferred access for starting hemodialysis (HD) can take up to several months and timing is vital. 9,10 AVF placement that is too early is injudicious due to the potential of an unnecessary surgery and the risk of complications. Approximately 30% of patients who receive an AVF will not initiate dialysis within a 2year period. 11,12 On the other hand, late placement can also be harmful, especially when starting dialysis using a central venous catheter, because this is associated with an increased risk of thoracic central venous obstruction and sepsis. 13 Timing of AVF referral and placement is currently mainly determined by the physician's expertise. 9,14,15 The European Society for Vascular Surgery advise that a permanent vascular access should be created 3 to 6 months before HD initiation. Nevertheless, predicting the start of HD is not easy. Although more directive guidelines exist, these are generally based on eGFR and lack individualization. 16,17 In their 2019 update, the Kidney Foundation's Kidney Disease Outcomes Quality Initiative advised assessment for vascular access for patients with a $50% risk of KRT within 2 years or an eGFR of #15 ml/min per 1.73 m 2 , but added that this advice is solely based on expert opinion as empirical evidence is lacking. 18 The aim of this study is two fold. First, we aim to externally validate the complete Grams model in a nephrology-referred cohort of patients with advanced CKD. Second, we aim to assess the potential clinical impact of the Grams model for guiding timely KRT preparation, including vascular access placement.

Study Population
EQUAL study is an ongoing multicenter prospective cohort study among European nephrology-referred patients $65 years. 19 Patients were included in Germany, Italy, the Netherlands, Poland, Sweden, and the UK from 2012 on, when their eGFR first dropped below 20 ml/min per 1.73 m 2 , and were followed for 4 to 8 years or up to kidney transplantation. Patients with acute kidney injury or a history of KRT were excluded. Some patients' kidney function were higher than 20 ml/min per 1.73 m 2 at study baseline, as eligibility assessment took place earlier. For the current study, we restricted inclusion to patients with an eGFR between 10 and 30 ml/min per 1.73 m 2 at baseline, because this was considered a clinically relevant population for KRT prediction and conform to the Grams development study. Clinical characteristics and laboratory values were registered every 6 months. All patients gave written informed consent to participate. Overlap between the EQUAL cohort and the European CKD Prognosis Consortium (CKD-PC) cohort on which the Grams model was developed is possible for the Swedish patients included in EQUAL, but highly unlikely because EQUAL inclusion started 3 years after the formation of the CKD-PC cohort.

Outcomes
The Grams model predicts the risk of KRT initiation, CVD event and death, within 2 and 4 years. These outcomes are predicted in any possible sequence, for instance the risk of experiencing CVD followed by KRT initiation. For the main analysis of the current study, the following combined outcomes were validated: any KRT, any CVD event, death, and no event. In addition, the outcome of death was split into "death without KRT" and "death after initiation of KRT." These outcomes were considered most clinically relevant and ensured enough events in our data for precise validation. It is important to note that the calculated risks for combined outcomes no longer represent multinomial probabilities because they are not mutually exclusive; patients may experience various combinations of these outcomes. In accordance with the original model, CVD event was defined as a nonfatal stroke, heart failure, myocardial infarction, or coronary revascularization.

Predictors
Predictors in the Grams model include age, sex, race, history of CVD, current smoking status, systolic blood pressure, diabetes mellitus, eGFR (using the CKD-EPI equation), and urine albumin-to-creatinine ratio (ACR). These predictors were measured at the first study visit. If ACR was missing but protein-to-creatinine ratio was measured, the latter was converted to ACR using equations developed for this purpose. 20 Clinical Impact Projection For assessment of potential clinical utility, we assessed whether decision rules based on a predicted risk or eGFR can identify patients starting KRT within 1 year and, therefore, be useful to ensure timely KRT preparation. A DCA was employed, which illustrates the difference in impact of various decision-rules and prediction models. We assessed the potential clinical utility for the following predefined referral thresholds: a predicted 2year KRT risk of 20%, 30%, 40% and 50% (based on the Grams model), an eGFR <30, <20 or <15 ml/min per 1.73 m 2 , and the Kidney Foundation's Kidney Disease Outcomes Quality Initiative suggested guideline of a 2year KRT risk >50% and/or an eGFR <15 ml/min per 1.73 m 2 . [16][17][18] Every patient was assessed at the first study visit and we determined whether their predicted risk or eGFR met the different KRT preparation thresholds. More specifically, if patients reached the above mentioned thresholds, we observed whether they initiated KRT within 1 year; if so, the decision-rule was "correct" and AVF referral was appropriate (a true positive). If the patient did not initiate KRT within 1 year, due to slow progression, death or any other cause, the KRT preparation and AVF placement advice was considered unnecessary or "incorrect" at that time-point (a false positive). 16 By simplifying 2-year risk predictions into a decision rule (yes/no threshold), we could assess whether these predictions can also be used to guide decisions on KRT initiation within 1 year instead of 2 years. Patients who initiate KRT sooner within the 2-year timeframe generally also have a higher predicted risk of KRT compared to patients who initiate later within that window.

Statistical Analysis
Continuous baseline values are presented as means with standard deviations when normally distributed, or as medians with interquartile range. To describe the observed occurrence of outcomes, we used cumulative incidence functions and a stacked cumulative incidence plot to account for censoring and competing events. Missing data were assumed to be missing at random and a 10-fold multiple imputation was performed using the R-package (R core team 2021, version 3.5.3) "mice." 21 All predictors, outcomes and times to outcomes, along with various other patient characteristics were included in the imputation model and results were pooled according to Rubin's rules. 22,23 For every patient, the predicted risks for each outcome were calculated using the multinomial formulas provided by Grams et al. 6 (these multinomial formulas are also provided in our Supplemental Material). Model performance was assessed by discrimination and calibration. The model performance differs per predicted outcome and outcome grouping and was assessed separately for the various combined outcomes and original outcome trajectories. Discrimination determines how well a model can discriminate between patients who will have the outcome and those who will not and was calculated using a time-to-event C-statistic. 24 Harrel's Cstatistic was modified to account for competing events using the method proposed by Wolbers et al. 25 Calibration determines how well the absolute predicted risk corresponds with the observed risk, overall and in patient subgroups. 24 Observed risks were calculated using cumulative incidence functions, to take competing events and censoring into account. These cumulative incidences are not the same as multinomial probabilities, which would ideally have been used had follow-up been complete. 26 The calibration-in-the-large and calibration plots, including a smoothed locally weighted scatterplot smoothing line, were computed. 27,28 DCA was used to evaluate the potential clinical utility. This is a statistical method in which the relationship between harm-benefit ratios (also termed threshold probabilities) and the net benefit is plotted to ascertain the added value of different decision rules over the entire range of theoretical harm-benefit ratios (x-axis). The net benefit (y-axis) is calculated by subtracting the proportion of all patients who are false positive (unjustified AVF referral) from the proportion who are true-positive (correct AVF referral), weighting by each theoretical harm-benefit ratio. 29,30 For a low harm-benefit ratio, a false positive is given less weight compared to a higher harm-benefit ratio. The harmbenefit ratio is not set or calculated nor does the DCA tell us what the "correct" harm-benefit ratio is; the DCA plots the net benefit for all possible harm-benefit ratio's, starting at a scenario in which AVF referral and placement is only ever beneficial (a harm-benefit ratio which is almost zero) and ending at a scenario in which an AVF referral is far more harmful than beneficial. When comparing different referral guidelines, the guideline with the highest net benefit on the y-axis across the range of harm-benefit ratios would be considered most beneficial. The DCA is most useful for comparing net benefit of various decision threshold over a wide range of harm-benefit ratios. A more in-depth explanation of the DCA is included in the Supplemental Material (Decision curve analysis). The sensitivity, specificity, positive predictive value and negative predictive value were also calculated for each decision rule.

Sensitivity Analyses
As a sensitivity analysis, model performance was assessed for all 8 outcome sequences that the Grams model predicts, namely KRT initiation only, KRT initiation after CVD, CVD only, CVD after KRT initiation, death only, death after KRT initiation, death after CVD, and death after KRT initiation and CVD. To assess discrimination of these outcomes, areas under the receiver operating curve were computed and calibration was determined with absolute proportions. Timeto-event and competing risks could not be considered, because the predicted outcomes of the Grams model consist of a series of events and time-points and not all outcomes are competing with each other. We also compared potential clinical impact of the Grams model to the 2-year kidney failure risk equation (KFRE), an often used prediction model for KRT initiation, using DCA. [31][32][33] This study was performed according to the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis guidelines (see Supplementary Material for TRIPOD checklist). 34 Analyses were performed in R version 3.5.3.

Baseline Characteristics
In total, 1517 patients from EQUAL had an eGFR between 10 ml/min per 1.73 m 2 and 30 ml/min per 1.73 m 2 and were included in the study. The baseline characteristics of the EQUAL cohort used for validation are depicted in Table 1, and are compared to the reported characteristics of the development cohort of the Grams model: the CKD-PC cohort. 6 Compared to the CKD-PC cohort, the median age of patients from the EQUAL cohort was 4 years older. EQUAL included less black patients than the CKD-PC cohort (1.3% vs. 9.3%). EQUAL patients had worse kidney function compared to CKD-PC patients, shown by a lower eGFR and higher ACR.  Figure S1 for a flowchart and proportions of all outcomes). The competing risks of experiencing a first event are shown in Figure 1; the probability of not experiencing any event was 50% within 2 years and 25% within 4 years. Among the patients who died within 2 years, the large majority died without starting KRT (80.0% and 72.6% for 2 and 4 year horizons, respectively). Among the patients receiving KRT within 4 years, 72.7% started on HD, 23.5% on peritoneal dialysis and 3.8% received a pre-emptive transplantation. Out of the 310 patients starting HD, 45.4% started with a central venous catheter and 53.2% with a graft or fistula, and 1.4% with unknown vascular access type. These percentages were similar for patients starting KRT within 2 years.

Predictive Performance
The discrimination of the Grams model varied greatly between predicted outcomes and the C-statistics ranged from 0.59 to 0.76 (see Table 2). The best discrimination was seen for predicting KRT (2-year C-statistic 0.74, 4year C-statistic 0.73) and "Death after KRT" (2-year C-  Figure 2 illustrate the model's capability of discriminating between patients who will start KRT and those who will die without having started KRT. Though there is some overlap, most patients who died without KRT had a higher predicted risk of death than those with KRT and vice versa. Overall, the calibration of the model was fairly accurate (see Table 2 and Figure 3). For high-risk patients, the model underpredicted the risk of KRT. In general, the 4-year model had better calibration than the 2-year model. The risk of CVD, death, and "death without KRT" were particularly well calibrated in the 4-year model. The discrimination and calibration for all outcome sequences predicted by the Grams model is shown in Supplementary Table S1 and S2 and Supplementary Figure S2. In general, the discriminative ability and calibration of the 2-year and 4-year model ranged from moderate to good for the specific outcomes (area under the receiver operating characteristic curve -AUC-range 0.61-0.79), but due to few events these estimates are rather uncertain.

Clinical Utility and Projected Impact
In Figure 4a and 4b are shown the net benefit of using the Grams model to guide KRT preparations compared to an eGFR referral guideline and compared to the KFRE predictions. The decision curves should be read vertically, meaning that for any given harm-benefit ratio (xaxis) the guideline with the highest net benefit is most beneficial to patients and will result in the most beneficial ratio of justified and unnecessary referrals. If we believe all referrals to be completely harmless, the harmbenefit ratio is very small and the best strategy would  be to refer all patients for KRT preparations, including AVF placement. The appropriate harm-benefit ratio will differ for each patient and setting, because a patient's personal concerns and priorities play a part as well as individual characteristics, which increase the risk of complications surrounding different vascular access forms. 35,36 It is therefore more relevant to look at overall superiority of a particular guideline or model, rather than the specific harm-benefit ratios. From Figure 4 we can see that use of the Grams prediction model to guide referral is mostly superior to using the eGFR <15 and results in fewer unnecessary referrals. In Supplementary Figure S3, more eGFR referral thresholds are included in the DCA curve, which clearly demonstrates that eGFR <15 gives a higher net benefit compared to other eGFR thresholds. For situations in which the harmbenefit ratio is small (little harm is expected), a KRT risk >20 % shows the best net benefit; for a medium harm-benefit ratio a decision threshold of eGFR <15 ml/ min per 1.73 m 2 is superior; and in a setting in which the harm could be large, a KRT risk >40% has the highest net benefit. Which of these decision rules should be followed depends on the subjective estimate of the risk-benefit ratio for each individual patient.  Figure S3). In addition to the net benefit, several impact measures such as the sensitivity and specificity were calculated for the discussed guidelines and thresholds (Table 3). Compared to using eGFR-based guidelines, using risk thresholds would yield notably less referrals and considerably fewer unnecessary referrals (false positives). The combined guideline of 2-year KRT risk >50% and eGFR <15 ml/min per 1.73 m 2 would only yield 4 more referrals compared to simply referring all patients with an eGFR <15 ml/min per 1.73 m 2 and therefore has no added benefit over the simple eGFR guideline in our cohort of older patients. The number of false negatives is difficult to directly relate to clinical practice, because these patients may still be referred in time (according to the risk thresholds) at any subsequent doctor's visit. Similarly, the negative effects of a false positive may also be mitigated by delaying or deferring KRT preparations at subsequent clinical assessments.

DISCUSSION
In this study, we externally validated the Grams model in a European, older cohort of nephrology-referred patients with CKD stage 4. Model performance was assessed for the risk of KRT, CVD, and death within 2 years and 4 years. The discriminative ability was   year. These graphs should be read vertically; for any given harm-benefit ratio the guideline with the highest net benefit would result in the most beneficial ratio of correct referrals and incorrect referrals (given the weight that is given to a false positive compared to a true positive based on the harm-benefit ratio). For most harm-benefit ranges 2-year KRT risks predicted by the Grams prediction model have a higher net benefit than eGFR-based risks (a), the net benefit of the Grams and KFRE predictions are very similar, though the Grams model seems to be slightly more beneficial (b). eGFR, estimated glomerular filteration rate; KRT, kidney replacement therapy. reasonably good for predicting KRT but poorer for the outcomes CVD and death. Overall, the calibration was accurate, especially for the 4 year model. The potential clinical utility for decision making regarding KRT preparation (in particular vascular access placement) was investigated and compared to existing eGFR guidelines. Using the model to guide KRT preparation may be more effective than following the currently recommended eGFR thresholds. Depending on a patient's and their nephrologist's preferences and circumstances, preparation could be considered at a predicted KRT risk of more than 20%, more than 40% or an eGFR <15.
Considering results from previous studies, the prediction of death is generally less accurate than predicting KRT initiation. 37 In previous French validation studies, the Grams model was externally validated in a cohort of advanced CKD patients aged 75 years and older for the outcomes KRT and "death without KRT." 7,8 These studies reported a C-statistic of 0.64 and 0.65 for KRT and a C-statistic of 0.68 and 0.70 for "death without KRT" for the 2-year and 4-year models, respectively. Our DCA showed that, for most patients, using individualized risk thresholds has a higher potential clinical utility than using eGFR thresholds, though the eGFR <15 guideline also showed good performance. Particularly in older populations it is important not to solely rely on eGFR, because progression of kidney disease is often slow. Previous literature has also shown that relying on eGFR-based guidelines in older patients would lead to a high proportion of unnecessary vascular access referrals. 38,39 This study has a number of strengths. To our knowledge, this is the first study to externally validate the Grams model's predictive performance for all outcomes. We accounted for censoring and competing risks in the external validation. Another strength is the use of data from a current, international European cohort, which allows generalization of our results to various European countries. Moreover, we examined the potential clinical utility of using the model for decision making regarding KRT preparation, which is an important and often overlooked step toward implementation. 29,40 We explored the utility of multiple decision rules over a range of harm-benefit ratios. Our results should be seen in the light of a number of limitations. First, KRT only included patients who initiated dialysis or were transplanted (conforms to the definition used by Grams et al. 6 ). For prospective use of the model, it would be more informative to also include patients who opted for conservative care in the outcome definition. Second, for the DCA and standard diagnostic measures (such as sensitivity and specificity), censoring due to loss to follow-up was not considered. Because these analyses looked at a shorter time frame, this censoring was less of an issue, but nevertheless present. Furthermore, there was a considerable amount of missing baseline values for ACR. Some patients may delay dialysis start to allow time for vascular access maturation, this may have influenced the outcomes of our decision analysis as well. Finally, the clinical utility analysis is only a first step toward assessing the impact of such a model. The DCA only assesses optimal test diagnostic characteristics related to false positives. A change in practice from using eGFR guidelines to prognostic model probabilities would have broader clinical and potentially economic effects. The harm-benefit ratios are subjective and it is unrealistic to determine the appropriate harmbenefit ratio for individual patients in practice. We were only able to compare the clinical utility of risk thresholds to simple eGFR-based guidelines. These are not an accurate representation of current practice, where decisions are often based on more factors than eGFR, such as the rate of renal function decline and the physician's experience. Furthermore, the prediction model and decision rules were applied at baseline contrary to clinical practice, where the nephrologist is likely to consider whether KRT preparation is appropriate at each follow-up consultation. This study may have a number of clinical implications. Using a prediction model of KRT risk (such as the Grams model or KFRE) may provide clinicians with an extra tool to improve the timing of KRT preparations, including AVF placement. A 2-year predicted risk threshold of 20% or 40% seemed most beneficial, depending on the expected potential harms and benefits. Using prediction models for this timing may result in fewer patients unnecessarily undergoing vascular access surgery and may result in more patients initiating dialysis on their preferred vascular access type. Currently, a large proportion of patients undergoing pre-emptive vascular access placement will not start dialysis within a year and especially many older patients will never use their vascular access. 41,42 A study by Lee et al. 11 showed that in a cohort of patients aged 70 years and older, 33% of patients that received an AVF did not initiate KRT within 2 years. The Kidney Foundation's Kidney Disease Outcomes Quality Initiative recommended guideline of 2-year KRT risk >50% or eGFR <15 ml/min per 1.73 m 2 did not have added value compared to only using the eGFR guideline of <15 ml/min per 1.73 m 2 in our cohort, which performed well. Although the predicted risks could never replace health care professionals, they can aid in decision making by providing objective risk estimates. 43 Moreover, they are easy to calculate using the existing webtool (http://ckdpcrisk.org/lowgfrevents/). One could also argue that in older patients it may be beneficial to start dialysis with another vascular access type such as an arteriovenous graft or a central catheter. This would greatly decrease the number of unused AVFs but requires a careful consideration of all involved risks and there is currently an randomized controlled trial running to determine the best vascular access strategy in elderly HD patients. 44 Future studies may focus on recalibrating the Grams model to predict outcomes within a shorter time frame of 3, 6, 9 or 12 months, because this might be more informative and intuitive to use for decisions regarding KRT preparation. Predicting maturation time of an AVF may further improve this timing. In addition, it would be valuable to externally validate the Grams model in other populations such as younger and non-European patients as well. To simulate practice more closely, it is important to validate such models at every subsequent doctor visit for which a dynamic model may be more fitting. Finally, a clinical impact trial, in which physicians or patients are randomized to the use of a prediction model to augment decision-making will give the most insight into the clinical impact of these models and allow for a comprehensive comparison with current practice, in which many factors besides eGFR are taken into account.
In conclusion, this study provided a first external validation of the complete Grams model. The clinical utility of this model for the timing of vascular access placement was generally superior to existing eGFR guidelines and has potential in combating unnecessary vascular surgeries.

APPENDIX
List of EQUAL study investigators are listed in the appendix DISCLOSURE All the authors declared no competing interests.

ACKNOWLEDGMENTS
All patients and health professionals involved in the EQUAL study are greatfully acknowledged.
The work on this study by CLR was supported by a grant from the Dutch Kidney Foundation (20OK016). The work on this study by MvD was supported by grants from the Dutch Kidney Foundation (16OKG12 & 20OK016). Main funding for the EQUAL study was received from the European Renal Association-European Dialysis and Transplant Association and contributions from the Swedish Medical Association, the Stockholm County Council ALF Medicine and Center for Innovative research, the Italian Society of Nephrology, the Dutch Kidney Foundation, the Young Investigators grant in Germany, and the National Institute for Health Research in the United Kingdom.

Ethical Approval and Consent to Participate
The EQUAL study was approved by the medical ethics committee or institutional review boards (as appropriate) of all participating centers (main medical ethical committee approval obtained in the Amsterdam Medical Center, NL38874.018.11). Written informed consent was obtained from all patients.

Availability of Data
Data are not publicly available. Data from the EQUAL study may be requested with protocol and statistical analysis plan at the EQUAL publication committee (contact: n.c. chesnaye@amterdamumc.nl).

SUPPLEMENTARY MATERIAL
Supplementary File (PDF) TRIPOD Checklist. Multinomial formulas for predicted risk calculations. Decision curve analysis. Figure S1. Inclusion and follow-up flowchart Figure S2. Calibration plots for all possible outcomes Figure S3. Decision Curve comparing the clinical utility of Grams and eGFR thresholds for referring patients to the vascular surgeon for AVF formation Table S1. Discrimination of the 2-year and 4-year Grams model for all possible outcomes Table S2. Calibration of the 2-and 4-year Grams model validated in EQUAL