Effective semiparametric models for time-varying risk factors associated with cardiovascular diseases in patients on dialysis
Substantial research links micro- and macroinflammatory states to atherosclerosis. A macroinflammatory state of particular interest is infection, and recent studies have shown an important association between infections and cardiovascular disease. My main NIH-funded research program builds generalized semiparametric regression models that relate important baseline covariates and longitudinal infection status to cardiovascular (CV) outcomes/events (e.g., myocardial infarction [MI], unstable angina, stroke or transient ischemic attack [TIA]) in U.S. patients on dialysis. Below is a summary of research projects supported since 2011 by my R01 grant from the National Institutes of Health (NIH).
Generalized semiparametric and parametric modeling approaches
Assessing the infection-CV hypothesis
Our studies based on United States Renal Data System (USRDS) data found a significantly increased incidence of CV events after infection, particularly during the first month after infection, for a broad spectrum of infections. We used the self-controlled case series (SCCS) method to control for both measured and unmeasured baseline confounders and to mitigate residual confounding. The SCCS method is a case-only generalized regression model in which outcome (e.g., CV) events arise from an underlying time-dependent Poisson cohort model that relates time-varying exposures (e.g., infections) to acute outcome events. We subsequently developed and applied a novel SCCS method that properly accounts for the fact that the exact time of infection onset is not known in USRDS hospitalization data. The results, after correcting for the bias due to measurement error, indicate that the incidence of CV events is 60% higher within the 30-day risk period following an infection than in the control time periods, providing stronger support for the infection-CV risk hypothesis in the dialysis population.
- Mohammed, S. M., Senturk, D., Dalrymple, L. S. and Nguyen, D. V. (2012) Measurement error case series models with application to infection-cardiovascular risk in older patients on dialysis. Journal of the American Statistical Association, 107, 1310-1323. PMCID: PMC3643015
- Mohammed, S. M., Dalrymple, L. S., Senturk, D. and Nguyen, D. V. (2013) Design considerations for case series models with exposure onset measurement error. Statistics in Medicine, 32(5):772-786. NIHMSID: NIHMS600077
- Mohammed, S. M., Dalrymple, L. S., Senturk, D. and Nguyen, D. V. (2013) Naive hypothesis testing for case series models with time-varying exposure onset measurement error: Inference for infection-cardiovascular risk in patients on dialysis. Biometrics, 69(2):520-952. NIHMSID: NIHMS600308
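For intuition, the conditional likelihood behind a basic SCCS analysis can be sketched in a few lines of Python. This is an illustration only: it assumes a single risk window per patient with exactly known exposure onset (so it omits the measurement-error correction developed in the papers above), ignores age and seasonal effects, and uses made-up toy counts. The function name `sccs_relative_incidence` and the tuple layout are hypothetical.

```python
import math

def sccs_relative_incidence(subjects, lo=-10.0, hi=10.0, tol=1e-10):
    """Estimate the relative incidence (risk window vs. control time).

    subjects: list of (n_risk, n_control, days_risk, days_control) per case.
    Conditional on a case's total event count, each event falls in the risk
    window with probability rho*e1 / (rho*e1 + e0), where rho = exp(beta).
    Assumes events occur in both risk and control time overall.
    """
    def score(beta):
        rho = math.exp(beta)
        total = 0.0
        for n1, n0, e1, e0 in subjects:
            p_risk = rho * e1 / (rho * e1 + e0)
            total += n1 - (n1 + n0) * p_risk  # observed minus expected
        return total

    # The score is strictly decreasing in beta, so bisection finds the MLE.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

# Toy data (hypothetical): 30-day risk window, 335 control days per case.
toy_cases = [(1, 1, 30, 335), (0, 2, 30, 335), (1, 0, 30, 335), (0, 1, 30, 335)]
rho_hat = sccs_relative_incidence(toy_cases)
print(round(rho_hat, 3))  # about 5.583, i.e. (2/30) / (4/335)
```

Because only cases contribute and each case serves as its own control, any covariate that is fixed within a patient cancels out of this likelihood, which is why the method controls for unmeasured baseline confounders.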
Patient-level factors associated with CV event onset and recurrence processes
The hurdle model and the zero-inflated Poisson model are useful for modeling hospitalizations based on USRDS data, or similar ongoing longitudinal databases, because of the excess zeros observed in counts of all-cause or cause-specific (e.g., infection- or CV-related) hospitalizations in the dialysis population. We have proposed modeling approaches to jointly examine patient-level factors associated with CV event onset (the probability of having a CV event) and recurrence rate (the subsequent rate of CV events given that a patient has already had a CV event). This approach allows us to tease out patient factors, such as baseline comorbidities, that contribute to CV event onset (likelihood) and/or recurrence. A methodological challenge we addressed is how to properly account for differential follow-up times in this simultaneous modeling. In an application to USRDS data, we found that for older patients (age 65 or older) on dialysis, a higher infection rate was associated with a significantly increased likelihood of experiencing CV events during the five-year study period (30.7% higher odds with each additional infection-related hospitalization per person-year; 95% CI for the odds ratio: 1.28-1.33), which is similar in magnitude to the effect of having coronary heart disease at the start of dialysis (33% higher odds; 95% CI for the odds ratio: 1.29-1.36).
- Senturk, D., Dalrymple, L. S., Mu, Y. and Nguyen, D. V. (2014) Weighted hurdle regression method for joint modeling of cardiovascular events likelihood and rate in the U.S. dialysis population. Statistics in Medicine, in press. NIHMSID: NIHMS600313
- Senturk, D., Dalrymple, L. S. and Nguyen, D. V. (2014) Functional linear models for zero-inflated count data with application to modeling hospitalizations in patients on dialysis. Statistics in Medicine, in press. NIHMSID: NIHMS600081
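To make the two-part structure of a hurdle model concrete, the sketch below fits an intercept-only version in plain Python. It is not the weighted hurdle method of the papers above: the onset part is a simple proportion that ignores follow-up time (precisely the issue the weighted approach is designed to address), and only the recurrence part uses follow-up, through a zero-truncated Poisson with mean `rate * years`. The function names and toy data are hypothetical.

```python
import math

def _bisect_decreasing(f, lo, hi, tol=1e-10):
    """Root of a strictly decreasing function f on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fit_hurdle(counts, years):
    """Intercept-only hurdle fit; returns (onset_prob, recurrence_rate).

    counts: number of CV events per patient (many zeros expected).
    years:  follow-up time per patient.
    Assumes at least one patient has more than one event, otherwise the
    truncated-Poisson rate estimate degenerates toward zero.
    """
    positive = [(y, t) for y, t in zip(counts, years) if y > 0]

    # Part 1 (onset): probability of clearing the hurdle, i.e. any event.
    onset_prob = len(positive) / len(counts)

    # Part 2 (recurrence): zero-truncated Poisson among patients with events.
    # Score in log(rate): sum of y_i - E[Y_i | Y_i > 0], decreasing in log(rate).
    def score(log_rate):
        rate = math.exp(log_rate)
        return sum(y - rate * t / -math.expm1(-rate * t) for y, t in positive)

    rate = math.exp(_bisect_decreasing(score, -10.0, 5.0))
    return onset_prob, rate

# Toy data (hypothetical): six patients with unequal follow-up.
onset, rate = fit_hurdle([0, 0, 1, 3, 2, 0], [0.5, 1.0, 2.0, 2.5, 1.0, 3.0])
print(onset, rate)
```

Separating the two parts is what allows a covariate to be associated with onset, recurrence, or both; in a regression version each part would carry its own linear predictor.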
Facility-level risk factors
Presently, there are over 5,000 dialysis facilities across the U.S., located in regions with diverse characteristics (e.g., socioeconomic status, % below poverty, median education, urbanicity). In a USRDS-based study we found that patients receiving hemodialysis in for-profit facilities had a 15% (95% CI: 13%-18%) higher relative rate of hospitalization compared with those in nonprofit facilities, a 37% (95% CI: 31%-44%) higher rate of hospitalization for heart failure or volume overload, and a 15% (95% CI: 11%-20%) higher rate of hospitalization for vascular access complications. Interestingly, a recent study documents differences in the allocation of patient care staff between for-profit and nonprofit dialysis facilities (e.g., the number of registered nurses per patient, or patient care staff composition, including licensed practical nurses, patient care technicians, social workers and dietitians).
- Dalrymple, L. S., Johansen, K. L., Romano, P. S., Chertow, G. M., Mu, Y., Grimes, B., Kaysen, G.A., Nguyen, D. V. (2014) Comparison of hospitalization between for-profit and nonprofit dialysis facilities. Clinical Journal of the American Society of Nephrology, 9: 73-81. PMCID: PMC3878699
Generalized varying coefficient models (GVCMs) for irregular, infrequent, unsynchronized and noise-contaminated (IIUN) longitudinal data
We advanced estimation and inference for GVCMs, including methods for predicting subject-specific mean response trajectories, by developing two approaches: discretized local maximum likelihood (ML) and functional data analysis tailored to large and small IIUN data, respectively.
Discretized local ML approach to multi-index GVCMs
Given the accumulating evidence for the infection-CV risk hypothesis, we developed a new method to elucidate how the risk of CV events (MI, stroke, TIA) changes over time for patients on dialysis, depending simultaneously on multiple key indices: (a) time since the start of dialysis, or vintage; (b) time since sentinel events, such as infection-related hospitalization, during the course of dialysis; and (c) age at the start of dialysis. Although CV events and infection-related hospitalizations from USRDS are IIUN, the number of subjects is large (>250,000 incident dialysis patients in our analysis). Thus, we developed an efficient discretized local ML estimation method for multi-index varying coefficient models (VCMs), tailored to large IIUN data such as USRDS. The proposed method also addresses follow-up truncated by death by adopting a partly conditional modeling approach, in which the CV event risk is characterized for the dynamic cohort of survivors. Applications to USRDS data show that infection-related hospitalization results in sustained increases in CV event risk among the dynamic cohort of survivors; for instance, even one year after the infection-related hospitalization, the CV event probability is still substantially higher than the CV risk at the start of dialysis, itself a time of high CV risk with respect to vintage. This pattern of CV risk dynamics, with respect to vintage and time since the pivotal initial infection-related hospitalization, holds for the vast majority of older patients starting dialysis (e.g., ages 65-90), although the difference in CV risks before and after the infection declines with increasing baseline age at dialysis.
- Estes, J., Nguyen, D. V., Dalrymple, L. S., Mu, Y. and Senturk, D. (2014) Cardiovascular event risk dynamics over time in older patients on dialysis: A generalized multiple-index varying coefficient model approach. Biometrics, in press. NIHMSID: NIHMS600080
- Senturk, D., Dalrymple, L. S., Mohammed, S. M., Kaysen, G. A. and Nguyen, D. V. (2013) Modeling time varying effects with generalized and unsynchronized longitudinal data. Statistics in Medicine, 32: 2971-2987. PMCID: PMC3702655
- Senturk, D., Ghosh, S. and Nguyen, D. V. (2014) Exploratory time varying lagged regression: Modeling association of cognitive and functional trajectories with expected clinic visits in older adults. Computational Statistics and Data Analysis, 73: 1-15. PMCID: PMC3890149
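The core idea of local ML estimation for a varying coefficient model can be illustrated with a kernel-weighted logistic regression evaluated at one target time. The sketch below is a simplified, single-index illustration under stated assumptions: it uses a local-constant fit with an Epanechnikov kernel and one covariate, and it omits the discretization, the multiple indices, and the partly conditional handling of death described above. The names `local_logistic` and `epanechnikov`, the bandwidth, and the data are all hypothetical.

```python
import math

def epanechnikov(u):
    """Kernel weight; zero outside the window |u| < 1."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def local_logistic(t0, times, x, y, bandwidth, n_iter=25):
    """Local ML estimate of (beta0(t0), beta1(t0)) in the model
    logit P(Y = 1 | t, x) = beta0(t) + beta1(t) * x.

    Observations are weighted by their distance from t0, and the weighted
    logistic log-likelihood is maximized by Newton-Raphson.
    Assumes both outcomes occur with positive weight at each level of x.
    """
    w = [epanechnikov((t - t0) / bandwidth) for t in times]
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for wi, xi, yi in zip(w, x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            r = wi * (yi - p)        # weighted score contribution
            v = wi * p * (1.0 - p)   # weighted information contribution
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data (hypothetical): x = 1 marks a prior infection-related hospitalization.
times = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
x = [0, 1, 0, 1, 0, 1, 0, 1]
y = [0, 1, 1, 0, 0, 1, 0, 1]
print(local_logistic(0.45, times, x, y, bandwidth=0.5))
```

Repeating the fit over a grid of target times traces out the coefficient functions; with very large cohorts, evaluating on a discretized grid rather than at every observed time is one way such a procedure can be kept computationally feasible.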
Functional data analysis (FDA) approach to GVCMs and application to the Comprehensive Dialysis Study (CDS)
Although discretization is an effective approach for large IIUN data, it is not designed for ultra-sparse, small IIUN data applications. For example, in the CDS, a prospective cohort study of end-stage renal disease (ESRD) patients who newly initiated dialysis between September 2005 and June 2007, longitudinal serum samples (e.g., C-reactive protein [CRP], pre-albumin and albumin) were collected on a subset of only 266 participants within the first two years from the start of dialysis. It is of interest to examine the age-varying relationship of sparse longitudinal biomarkers with the probability of infection-related hospitalization and to obtain predicted subject-specific infection-related hospitalization probability trajectories. In the paper below, we exploit the fact that although a single subject's trajectory carries little information because of high irregularity and infrequency, the moments of the underlying processes (used to represent/approximate the varying-coefficient functions of interest) can be accurately recovered by pooling information from all subjects using FDA. Similar VCM approaches were illustrated in our work on lagged associations (the association of an outcome with exposures at specific previous time points).
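The pooling idea can be illustrated with a small Python sketch that estimates the mean and covariance functions of a sparsely observed process by kernel smoothing across all subjects. This is a generic illustration of moment estimation for sparse functional data, not the estimator of any specific paper: it uses local-constant smoothing, a single hand-picked bandwidth `h`, and made-up data, and the function names are hypothetical.

```python
def kernel(u):
    """Epanechnikov kernel weight; zero outside |u| < 1."""
    return max(0.0, 0.75 * (1.0 - u * u))

def pooled_mean(t0, subjects, h):
    """Mean function at t0, smoothing over every observation of every subject.

    subjects: list of (times, values) pairs, a few points per subject.
    Assumes at least one observation lies within h of t0.
    """
    num = den = 0.0
    for times, values in subjects:
        for t, v in zip(times, values):
            w = kernel((t - t0) / h)
            num += w * v
            den += w
    return num / den

def pooled_cov(s0, t0, subjects, h):
    """Covariance surface at (s0, t0) from within-subject residual products.

    Assumes some subject has a pair of distinct observations near (s0, t0).
    """
    num = den = 0.0
    for times, values in subjects:
        resid = [v - pooled_mean(t, subjects, h) for t, v in zip(times, values)]
        for j, (tj, rj) in enumerate(zip(times, resid)):
            for k, (tk, rk) in enumerate(zip(times, resid)):
                if j == k:
                    continue  # skip the diagonal, which is inflated by noise
                w = kernel((tj - s0) / h) * kernel((tk - t0) / h)
                num += w * rj * rk
                den += w
    return num / den

# Toy data (hypothetical): biomarker values at 2-3 irregular times per subject.
toy = [([0.1, 0.5], [1.2, 0.9]),
       ([0.3, 0.9], [2.0, 1.7]),
       ([0.2, 0.6, 0.8], [0.4, 0.6, 0.3])]
print(pooled_mean(0.5, toy, h=0.5), pooled_cov(0.4, 0.6, toy, h=0.5))
```

No single subject here could support a curve fit, yet every subject contributes pairs of observations to the covariance estimate, which is what makes the moments recoverable from the pooled sample.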