Population-Based Limits of Urine Creatinine Excretion

Introduction: The validity of a timed urine collection is typically judged by measurement of urine creatinine excretion, but prevailing limits may be unreliable. We sought to empirically derive population-based limits of excretion for evaluating the validity of a timed urine collection.
Methods: Covariate and 24-hour urine data were obtained from 3582 participants in the Chronic Renal Insufficiency Cohort (CRIC) study, 814 participants in the Modification of Diet in Renal Disease (MDRD) study, 1010 participants in the Jackson Heart Study (JHS), and 8536 participants in the Prevention of Renal Vascular End Stage Disease (PREVEND) study. Weight, height, age, sex, and serum creatinine concentrations were evaluated as potential predictors of urine creatinine excretion using the Akaike Information Criterion, R-squared values, and deviance. Bias and precision of the fitted models were assessed by analyses of residuals. Agreement between 24-hour creatinine clearance and 125I-iothalamate clearance was assessed before and after exclusion of potentially invalid urine samples.
Results: The best-fitting model to predict 24-hour urine creatinine excretion among the 9199 discovery cohort members included sex-specific terms for weight, height, and age (R-squared = 0.328). This model had a median bias of +4.3 mg creatinine/day (95% confidence interval −5.6, +13.3 mg/day) in 4599 validation cohort members, and 82% of observed values were within 30% of model-predicted values. Adding serum creatinine concentrations only marginally improved model precision but reduced bias in persons with advanced chronic kidney disease (CKD).
Conclusion: The limits of urine creatinine excretion derived here represent the most valid and representative data for appraising the adequacy of a timed urine collection.

Timed urine collections have wide-ranging applications in clinical medicine and research, including quantification of urine albumin excretion, detection of monoclonal proteins, workup of specific endocrine disorders, and measurement of kidney stone precursors. [1][2][3][4] Timed urine collections are also used to calculate creatinine clearance as a means of estimating the glomerular filtration rate (GFR), particularly in persons with advanced CKD. 5,6 A major barrier to interpreting the results obtained from a timed urine sample is the potential for error in the collection process, which often occurs outside the clinic or hospital setting. Potential errors in urine collection times are typically addressed by simultaneous measurement of urine creatinine excretion under the assumption that creatinine is produced at a constant rate by skeletal muscle under steady-state conditions. 7
Prevailing weight and sex-based limits of creatinine excretion used to judge the adequacy of a timed urine sample (e.g., 15 to 20 mg/kg per day in women and 20 to 25 mg/kg per day in men) may be unreliable. Studies to derive these limits lacked adequate sample size to reliably estimate lower and upper limits of creatinine excretion and included mostly white individuals despite potential differences in skeletal muscle mass by race or ethnicity. [8][9][10][11][12][13][14][15] Most prior studies are further limited by a relative paucity of persons with advanced CKD for whom serum creatinine-based estimation of GFR may be less reliable. 16,17
The goal of this study was to empirically derive and validate equations to predict lower and upper limits of urine creatinine excretion for evaluating the validity of a timed urine sample. To accomplish this goal, we analyzed 24-hour urine creatinine, anthropometric, demographic, and laboratory data from more than 13,000 participants in 4 cohort studies that included persons with and without kidney disease.
We indirectly evaluated the utility of the derived limits by comparing the agreement between 24-hour creatinine clearance and 125I-iothalamate clearance in a subset of participants before and after exclusion of potentially invalid samples that were outside the derived cutoff values.

Study Populations
We used data from 3582 participants in the CRIC study, 814 participants in the MDRD study, 1010 participants in the JHS, and 8536 participants in the PREVEND study who had 24-hour urine creatinine measurements available for analysis. The CRIC study is a multicenter US cohort study of 3939 patients with CKD; 3582 participants completed a 24-hour urine collection at baseline. 18,19 Exclusion criteria in CRIC included polycystic kidney disease, kidney transplantation, HIV disease, immunosuppression, multiple myeloma, and severe heart failure. The MDRD study was a cooperative randomized clinical trial of dietary protein restriction and blood pressure reduction among 840 patients 18 to 70 years of age with known CKD. 20 Persons with insulin-requiring diabetes mellitus or a body weight less than 80% or greater than 160% of predicted were excluded, and 24-hour urine samples were collected at the baseline visit following the run-in procedure and before randomization. The JHS is a community-based study of cardiovascular disease among 5302 African American adults residing in the Jackson, Mississippi metropolitan area. 21 A randomly selected subset of 1028 JHS participants provided a 24-hour urine sample within 1 week of the baseline exam, and collections were returned to the examination center for recording of total urine volume. 22 The PREVEND study is a prospective study of microalbuminuria and its associations with cardiovascular disease in persons 28 to 75 years of age residing in the city of Groningen (The Netherlands). 23 Persons with insulin-dependent diabetes mellitus were excluded. PREVEND study participants collected 2 consecutive 24-hour urine samples during 2 visits, separated by 3 weeks, to an outpatient clinic; we used the first collection for these analyses. Efforts to ensure complete 24-hour urine collections are described in Supplementary Table S2. 
Beginning with a total analytic sample of 13,942, we excluded 100 individuals who were missing information on height or weight, 25 individuals who had extreme values for height (<140 or >200 cm), and 19 individuals who had extreme values for weight (<40 or >180 kg), leaving a final analytic sample of 13,798, which served as the study population. For models that included serum creatinine concentrations, we further excluded 75 individuals who were missing creatinine data and 9 individuals who had extreme values (>7 mg/dl).

Measurements
Urine creatinine concentrations were measured using the Jaffe method on the Roche diagnostics platform in the CRIC study, a kinetic alkaline picrate assay on a Beckman Astra-8 platform (Beckman Instruments, Irvine, CA) in the MDRD study, a Vitros 950 or 250 analyzer (Ortho-Clinical Diagnostics, Raritan, NJ) in the JHS study, and an enzymatic method on a Roche Modular analyzer, using reagents and calibrators from Roche (Roche Diagnostics, Mannheim, Germany) in PREVEND. [23][24][25] CRIC and JHS urine samples were recalibrated as described in Supplementary Table S1. We adjusted models for individual study cohort with CRIC as the reference group. We estimated GFR from age, sex, self-reported race, and serum creatinine concentrations using the 2009 CKD Epidemiology Collaboration equation. 26 We calculated creatinine clearance as:
Creatinine clearance (ml/min) = (Urine creatinine × Urine volume)/(Serum creatinine)
where urine and serum creatinine are expressed in mg/dl and urine volume is expressed in ml/min.
All MDRD study participants and a randomly selected one-third of CRIC study participants completed 125I-iothalamate clearance measurements for determination of GFR (iGFR). Trained personnel in each study administered 125I-iothalamate and collected plasma and timed urine samples during 4 subsequent collection periods. In CRIC, the first collection period was excluded from the analyses. iGFR was calculated as the weighted average of iothalamate clearance during the collection. To facilitate comparison with creatinine clearance, we did not standardize iGFR to body surface area.
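The creatinine clearance formula above can be illustrated with a short calculation. The following is a minimal sketch, not the study's analysis code: the function names, the default 1440-minute collection time, and the example values are assumptions chosen for illustration. It converts a total collected volume to a flow rate in ml/min before applying the formula, and also shows how total creatinine excretion in mg/day follows from the same inputs.

```python
def creatinine_clearance_ml_min(urine_cr_mg_dl, urine_volume_ml,
                                serum_cr_mg_dl, collection_minutes=1440.0):
    """Creatinine clearance = (urine creatinine x urine flow) / serum creatinine.

    Urine and serum creatinine are in mg/dl, so their units cancel and the
    result carries the units of urine flow (ml/min).
    """
    if serum_cr_mg_dl <= 0 or collection_minutes <= 0:
        raise ValueError("serum creatinine and collection time must be positive")
    urine_flow_ml_min = urine_volume_ml / collection_minutes
    return urine_cr_mg_dl * urine_flow_ml_min / serum_cr_mg_dl


def creatinine_excretion_mg_day(urine_cr_mg_dl, urine_volume_ml,
                                collection_minutes=1440.0):
    """Total urine creatinine excretion scaled to a 24-hour (1440-minute) day."""
    if collection_minutes <= 0:
        raise ValueError("collection time must be positive")
    total_mg = urine_cr_mg_dl * urine_volume_ml / 100.0  # 100 ml per dl
    return total_mg * 1440.0 / collection_minutes


# Hypothetical collection: 1440 ml over 24 hours, urine creatinine 100 mg/dl,
# serum creatinine 1.0 mg/dl.
clearance = creatinine_clearance_ml_min(100.0, 1440.0, 1.0)
excretion = creatinine_excretion_mg_day(100.0, 1440.0)
```

Because clearance scales directly with the collected volume, an under-collected sample lowers both the calculated clearance and the measured creatinine excretion, which is why excretion limits are used as a check on the collection.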

Analysis
We randomly selected two-thirds of the study population to serve as the discovery cohort and one-third to serve as the validation cohort, stratified by categories of estimated GFR (0-29, 30-44, 45-59, 60-89, and ≥90 ml/min per 1.73 m²). Graphical assessment of 24-hour urine creatinine excretion values indicated a unimodal distribution with a right-sided tail. Scatterplots of urine creatinine excretion and the continuous predictors showed an increasing mean-variance relationship. To best capture this error distribution, we implemented a Gamma generalized linear model with log link, which yields wider prediction intervals for greater predicted mean urine creatinine values. This approach models differences between recalibrated CRIC study urine creatinine values and the other cohorts on a multiplicative rather than additive scale.
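The stratified two-thirds/one-third split described above could be sketched as follows. This is an illustrative sketch only: the record layout (dictionaries with an "egfr" key), the fixed random seed, and the function names are assumptions, and the study's actual randomization procedure is not described in this level of detail.

```python
import random
from collections import defaultdict


def egfr_category(egfr):
    """Map an estimated GFR (ml/min per 1.73 m2) to the strata named in the text."""
    if egfr < 30:
        return "0-29"
    if egfr < 45:
        return "30-44"
    if egfr < 60:
        return "45-59"
    if egfr < 90:
        return "60-89"
    return ">=90"


def stratified_split(records, discovery_fraction=2 / 3, seed=0):
    """Randomly assign records to discovery/validation within each eGFR stratum."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    strata = defaultdict(list)
    for record in records:
        strata[egfr_category(record["egfr"])].append(record)

    discovery, validation = [], []
    for category in sorted(strata):
        members = strata[category]
        rng.shuffle(members)
        n_discovery = round(len(members) * discovery_fraction)
        discovery.extend(members[:n_discovery])
        validation.extend(members[n_discovery:])
    return discovery, validation
```

Splitting within each stratum keeps the proportion of participants with low estimated GFR similar in both cohorts, which matters here because model bias was later examined by GFR category.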
Primary predictor variables were sex, weight (kg), height (cm), and age (years). We explored whether serum creatinine concentrations or self-reported race could improve performance of the anthropometric and demographic model. All models included dummy variables for sex and individual study cohort (MDRD, JHS, PREVEND) with recalibrated CRIC urine creatinine values serving as the reference group.
We investigated potential nonlinear relationships using polynomial, logarithmic, and inverse transformations. We tested for potential interactions among the demographic and anthropometric characteristics.
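For readers less familiar with this model class, the general structure of a Gamma generalized linear model with log link can be written out as below. This is a generic sketch under stated assumptions: the functions f_W, f_H, and f_A are placeholders for whatever transformations of weight, height, and age were selected, s_i is the sex indicator, and gamma_c is the cohort term with CRIC as the reference (gamma = 0). None of these symbols or fitted forms are taken from the paper.

```latex
% Generic Gamma GLM with log link; f_W, f_H, f_A are placeholder
% functional forms (assumptions), not the fitted terms from the study.
\begin{align}
Y_i &\sim \mathrm{Gamma}(\mu_i, \phi), \\
\log \mu_i &= \beta_0 + \beta_{\mathrm{sex}}\, s_i
            + f_W(\mathrm{weight}_i) + f_H(\mathrm{height}_i)
            + f_A(\mathrm{age}_i) + \gamma_{c(i)}, \\
\operatorname{Var}(Y_i) &= \phi\, \mu_i^{2}.
\end{align}
```

Two properties mentioned in the text follow directly from this form. Because the variance is proportional to the squared mean, the standard deviation grows with the predicted mean, giving wider prediction intervals at higher predicted excretion. Because the cohort term enters on the log scale, exp(gamma_c) multiplies the predicted mean, so cohort differences are multiplicative rather than additive.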
Sex-specific differences in the associations of weight and age with 24-hour urine creatinine excretion motivated construction of separate models among men and women. For models that included serum creatinine concentrations, we modeled this covariate using natural splines with knots at 0.7 mg/dl in women and 0.9 mg/dl in men at the median and 75th percentiles of each sex, based on empirical associations with urine creatinine excretion. We considered model performance to be improved by a decrease in the Akaike Information Criterion within sex-specific models with further consideration of changes in R-squared values, deviance, and graphical inspection of residual plots. We tested the precision and accuracy of the fitted models in the validation cohort using root mean squared errors, the median and interquartile range of residuals, and the proportion of observations within 30% of model-predicted values.
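The validation statistics named above can be computed from paired observed and predicted values. The sketch below is illustrative: the function name, the returned dictionary keys, and the sign convention (residual = observed minus predicted, so a negative median indicates overestimation by the model) are assumptions, although that sign convention matches how bias is reported in the Results.

```python
import math
import statistics


def validation_metrics(observed, predicted):
    """Root mean squared error, median and IQR of residuals, and the
    proportion of observations within 30% of the predicted value."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    residuals = [o - p for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    q1, _, q3 = statistics.quantiles(residuals, n=4)  # quartiles of residuals
    within_30 = sum(
        abs(o - p) <= 0.30 * p for o, p in zip(observed, predicted)
    ) / len(observed)

    return {
        "rmse": rmse,
        "median_bias": statistics.median(residuals),
        "iqr": (q1, q3),
        "p30": within_30,
    }


# Hypothetical values in mg creatinine/day, for illustration only.
metrics = validation_metrics([1000, 1200, 800, 1500], [1000, 1000, 1000, 1000])
```

The median bias summarizes systematic over- or underestimation, whereas the root mean squared error and the proportion within 30% describe how tightly observations cluster around the prediction.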
We computed prediction limits of 24-hour urine creatinine excretion using bootstrap-based prediction intervals based on covariates. We compared empirically derived model-based limits with prevailing sex and weight-based limits. To indirectly assess application of the prediction limits, we determined the agreement between 24-hour creatinine clearance and iGFR in a subset of participants before and after excluding urine samples with creatinine excretion values outside these limits.

Study Population
Among 9199 participants in the discovery cohort, the mean age was 52 ± 13 years; 4532 (49%) were women; and 1748 (19%) self-reported their race as Black (Table 1). The mean 24-hour urine creatinine excretion in the discovery cohort was 1338 ± 466 mg (by subgroups shown in Table 2). Estimated GFR was ≥60 ml/min per 1.73 m² in 67% of discovery cohort members, 30 to 59 ml/min per 1.73 m² in 24%, and <30 ml/min per 1.73 m² in 9%. Participant characteristics were similar in the discovery cohort and the validation cohort. Among 8515 participants in the PREVEND study who repeated their 24-hour urine collection an average of 3 weeks later, the median interindividual difference in urine creatinine excretion was 11% (interquartile range 5%, 22%).

Development of Prediction Models
In univariate analyses, sex was the strongest single predictor of 24-hour urine creatinine excretion based on the reduction in Akaike Information Criterion, followed by weight, height, and age. The functional relationships of weight and age with 24-hour urine creatinine excretion differed by sex, motivating construction of separate models in men and women. The functional forms selected are displayed in Supplementary Figure S1. In models that included sex-specific terms for only weight, analogous to current clinical assessment, the adjusted R-squared value was 0.275 (Table 3). The addition of best-fitting terms for height and age progressively reduced the Akaike Information Criterion and deviance and increased the R-squared value. No meaningful improvement in prediction was achieved by addition of a product term for weight and height or by the inclusion of self-reported Black race. A final anthropometric and demographic model that included sex-specific terms for weight, height, age, and individual study cohort had an R-squared value of 0.328. Higher serum creatinine concentrations were associated with progressively greater urine creatinine excretion up to threshold values of approximately 0.7 mg/dl and 0.9 mg/dl in women and men, respectively (Supplementary Figure S2). Thereafter, higher serum creatinine concentrations were associated with progressively lower urine creatinine excretion. The addition of serum creatinine modeled as a natural spline to fit these thresholds yielded small gains in prediction (Table 3).

Validation of Prediction Models
Among 4599 validation cohort members, the anthropometric and demographic model had a root mean squared error of 353 mg creatinine/day and median bias of +4.3 mg creatinine/day (95% confidence interval −5.6, +13.3 mg creatinine/day). Overall, 81.8% of the observed 24-hour urine creatinine measurements in the replication cohort were within 30% of those predicted by the model. Similar prediction statistics were observed for the anthropometric, demographic, and serum creatinine model as follows: root mean squared error of 345 mg creatinine/day, median bias [...]. The median difference between observed and model-predicted urine creatinine excretion in the replication cohort was low among men and women (Figure 1) and among participants who self-reported their race as Black or non-Black, indicating low bias by sex and race (Table 4). Nevertheless, the anthropometric and demographic model systematically overestimated urine creatinine excretion among persons with low estimated GFR (median difference −82 mg creatinine/day for estimated GFR <30 ml/min per 1.73 m²). This bias was reduced using the model that included serum creatinine concentrations.

Derivation and Indirect Evaluation of Model-Derived Prediction Limits
Applying clinical sex and weight-based limits of 20 to 25 mg creatinine/kg per day in men or 15 to 20 mg creatinine/kg per day in women to the replication cohort excluded 67% of the 24-hour urine samples. More conservative limits of 14 to 26 mg creatinine/kg per day in men or 11 to 20 mg creatinine/kg per day in women excluded 27% of the samples. Prediction intervals derived from model-based covariates were constructed to include 92.5% of replication cohort members (Table 5). For example, a 50-year-old woman who is 160 cm tall and weighs 65 kg would have a predicted 5% lower limit of 588 mg creatinine/day. Iothalamate clearance measurements of GFR (iGFR) were available for 714 validation cohort members (Supplementary Figure S3). The root mean squared error between 24-hour creatinine clearance and iGFR was 20.8 ml/min when all samples were included in the analysis (Table 6). The agreement between 24-hour creatinine clearance and iGFR improved upon excluding urine samples for which urine creatinine excretion was lower than model-predicted limits, suggesting potential under-collection (Supplementary Figure S4). In contrast, the exclusion of potentially over-collected samples improved agreement for only the highest 2 to 3% of samples.
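To make the comparison between prevailing and model-derived limits concrete, the sketch below converts the prevailing per-kilogram limits quoted above into mg/day for a given weight and classifies a collection against a pair of limits. The function names and the 700 mg/day observation are hypothetical; the only model-derived number used is the 588 mg/day lower limit quoted in the text for the example woman, and no upper model limit is assumed.

```python
# Prevailing limits from the text, in mg creatinine/kg per day.
PREVAILING_LIMITS_MG_PER_KG = {
    "female": (15.0, 20.0),
    "male": (20.0, 25.0),
}


def prevailing_limits_mg_day(sex, weight_kg):
    """Convert the prevailing per-kg limits to absolute mg/day limits."""
    low, high = PREVAILING_LIMITS_MG_PER_KG[sex]
    return low * weight_kg, high * weight_kg


def classify_collection(excretion_mg_day, lower, upper=float("inf")):
    """Flag a collection whose creatinine excretion falls outside the limits."""
    if excretion_mg_day < lower:
        return "possible under-collection"
    if excretion_mg_day > upper:
        return "possible over-collection"
    return "within limits"


# Example woman from the text: 65 kg; prevailing limits are 975 to 1300 mg/day.
low, high = prevailing_limits_mg_day("female", 65.0)

# Hypothetical observed excretion of 700 mg/day.
by_prevailing = classify_collection(700.0, low, high)
by_model_lower = classify_collection(700.0, 588.0)  # 588 mg/day is from the text
```

In this hypothetical case the same collection is flagged as a possible under-collection by the prevailing limits but is retained by the model-derived lower limit, which mirrors the much lower exclusion rate reported for the model-based intervals.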

DISCUSSION
Using data from 4 large cohort studies, including persons with and without kidney disease, we developed and validated models to predict lower and upper limits of urine creatinine excretion for assessing the validity of a timed urine sample. A model that included sex-specific terms for weight, height, and age had moderate precision and low bias except among persons with advanced CKD, who had systematically lower urine creatinine excretion than predicted. A second model that added serum creatinine concentrations reduced bias by low GFR. Importantly, the addition of self-described race to the models did not improve the prediction of urine creatinine excretion beyond that of sex, height, weight, and age. The agreement between 24-hour creatinine clearance and 125I-iothalamate clearance improved after excluding urine samples for which creatinine excretion was outside of model-based predicted limits. Applying prevailing sex and weight-based limits of 20 to 25 mg creatinine/kg per day in men or 15 to 20 mg creatinine/kg per day in women to the 24-hour urine samples in this study excluded two-thirds of the collections. Given limitations of existing methods and strengths of the current study, we suggest that the empirical limits derived here represent the most valid and representative data for appraising the utility of a timed urine collection.
The measurement of creatinine in a timed urine sample is motivated by its theoretically stable production from creatine in skeletal muscle and nearly  27,28 Under steady-state conditions, creatinine excretion in the urine should equal its production, which can be estimated from body size. Early equations to estimate urine creatinine excretion based on anthropometric characteristics were derived from small study populations with limited diversity and lacked a validation step. 29,30 More recent studies have derived and validated equations to predict the mean 24-hour urine creatinine excretion values in larger cohorts. [9][10][11] For example, a prediction model based on weight, age, and sex derived from 2466 individuals yielded similar bias and precision to those calculated here. 9 Nevertheless, this study did not quantify upper and lower limits of creatinine excretion for assessing the adequacy of a timed urine sample.
There is no gold-standard method to definitively determine the accuracy of a timed urine collection, which often takes place outside of the clinic or hospital setting. Previous studies have investigated the utility of ingested para-aminobenzoic acid, which is rapidly excreted in the urine. Improvement in para-aminobenzoic acid recovery was reported after excluding 24-hour urine samples for which urine creatinine excretion was lower than sex and weight-based limits. Nevertheless, this approach is limited by use of an ingested compound and the ability to detect only under-collected samples. Given the inherent lack of a gold standard, the adequacy of timed urine collections can be practically assessed using model-based population-derived limits, such as those determined here, similar to the approach used for many laboratory measurements. Such a strategy will represent a trade-off between accuracy on one hand and inclusiveness on the other and could be used to identify potential over-collections and under-collections. Based on the agreement between 24-hour creatinine clearance and direct measurements of GFR in this study, a reasonable balance was achieved by excluding the upper 2.5% and lower 5% of samples for a given sex, weight, height, and age.
Our study has several strengths. Discovery and validation analyses were conducted in a large and diverse study population that included persons with and without CKD. We calibrated urine creatinine measurements in the reference cohort to current isotope dilution mass-spectroscopy standards, promoting clinical application of the model-derived limits. We derived a second model specifically for individuals with advanced CKD or low muscle mass that included serum creatinine concentrations to account for potential bias that may occur in these individuals. An important limitation of this study is the residual variation in urine creatinine excretion after idealized modeling of sex, weight, height, and age, consistent with findings from previous studies. More comprehensive measurements of body composition using methods such as bioimpedance or imaging procedures could improve prediction but are impractical for clinical use. A second limitation is systematic overestimation of 24-hour urine creatinine excretion among persons with advanced CKD. This limitation was moderated by the addition of serum creatinine concentrations to the model. Limits derived from the serum creatinine model may be useful for evaluating timed urine samples in persons with advanced CKD. A third limitation is that comparisons between creatinine clearance and iGFR were limited to the CRIC and MDRD studies, which include only persons with established CKD.
In summary, we derived and validated prediction limits for 24-hour urine creatinine excretion in a large diverse study population based on readily obtainable anthropometric and demographic characteristics. Urine creatinine values from the derived equation are referenced to current laboratory standards. The application of model-derived prediction limits excluded far fewer 24-hour urine samples compared with accepted sex and weight-based limits. These considerations suggest application of the derived limits to assess the adequacy of timed urine collections in practice.

DISCLOSURE
BK reports personal fees from Reata Pharmaceutical outside of the submitted work. JI reports grants or contracts from the NIDDK and Baxter International, consulting fees from Bayer, Ardelyx, AstraZeneca, and Jnana Therapeutics, payment from American Society of Nephrology (ASN), support for travel and meetings from ASN and Kidney Disease: Improving Global Outcomes, participation in a Data Monitoring Committee for Sanifit International, and stocks from AlphaYoung. YC is an employee of Analysis Group. All the other authors declared no competing interests.

ACKNOWLEDGMENTS
The Chronic Renal Insufficiency Cohort study, Jackson Heart Study, and Modification of Diet in Renal Disease study were conducted by the CRIC, JHS, and MDRD Investigators and supported by the National Institute of Diabetes and Digestive and Kidney Diseases and National Heart Lung and Blood Institute. Data from these studies reported here were supplied by the NIDDK Central Repository and the National Heart Lung and Blood Institute Biolincc. This manuscript was not prepared in collaboration with Investigators of the JHS or MDRD study and does not necessarily reflect the opinions or views of the NIDDK or National Heart Lung and Blood Institute Central Repositories, or the NIDDK or National Heart Lung and Blood Institute.

SUPPLEMENTARY MATERIAL
Supplementary File (PDF)
Figure S1. 24-hour urine creatinine excretion for women and men by weight, height and age in the discovery cohort.
Figure S2. Association of serum creatinine concentrations with 24-hour urine creatinine excretion in the discovery cohort for women and men.
Figure S3. 24-hour creatinine clearance and iothalamate clearance measurements of GFR among replication cohort members who completed iothalamate testing in the discovery and replication cohorts for women and men.
Figure S4. Agreement between 24-hour creatinine clearance and iothalamate clearance measurements of GFR among replication cohort members who completed iothalamate testing.
Table S1. Calibration of Jackson Heart Study and Chronic Renal Insufficiency Cohort study urine creatinine measurements to isotope dilution mass-spectroscopy standards.
Table S2. Efforts by study cohort to ensure complete 24-hour urine collections.