Harvard Catalyst Profiles

Correlated data are common in health sciences research, including cancer research, where clustered, hierarchical (multi-level), and spatial data are often observed. Such data arise in various study designs, such as longitudinal studies, interventional studies, clinical trials, and disease mapping. The correlation may be due to a single outcome measured repeatedly over time, as in longitudinal studies; to multiple outcomes each measured one or more times, as in clinical trials involving multiple endpoints; to a hierarchical or nested membership relationship among units, as in interventional studies; or to geographic proximity, as in the estimation of disease maps.
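As a minimal illustration of how clustering induces correlation, the sketch below simulates a random-intercept model in which observations in the same cluster share an unobserved cluster effect. The function name, parameter values, and the use of NumPy are assumptions made for illustration; they are not part of the proposal.

```python
import numpy as np

def simulate_clustered(n_clusters=2000, cluster_size=2,
                       sigma_b=1.0, sigma_e=1.0, seed=0):
    """Simulate y_ij = b_i + e_ij (hypothetical random-intercept model).

    b_i is a cluster-level effect shared by all members of cluster i;
    e_ij is independent observation-level noise.
    """
    rng = np.random.default_rng(seed)
    b = rng.normal(0.0, sigma_b, size=(n_clusters, 1))             # shared effect
    e = rng.normal(0.0, sigma_e, size=(n_clusters, cluster_size))  # noise
    return b + e  # b broadcasts across the members of each cluster

y = simulate_clustered()

# Correlation between two members of the same cluster.
empirical_icc = np.corrcoef(y[:, 0], y[:, 1])[0, 1]

# Under this model the intraclass correlation is
# sigma_b^2 / (sigma_b^2 + sigma_e^2).
theoretical_icc = 1.0**2 / (1.0**2 + 1.0**2)

print(f"empirical: {empirical_icc:.3f}, theoretical: {theoretical_icc:.3f}")
```

The shared effect `b` is what makes observations within a cluster more alike than observations from different clusters; an analysis that treats all observations as independent ignores this dependence.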

The purpose of this proposal is to develop new mixed effects models for types of correlated data that are common in practice but cannot be analyzed using existing statistical models, such as correlated data that require nonparametric regression, involve measurement error, or consist of mixed discrete and continuous outcomes. The applicants will develop three new classes of mixed effects models: (1) generalized additive mixed models, which allow for flexible functional dependence of an outcome variable on covariates using nonparametric regression, while accounting for correlation among observations; (2) generalized linear (additive) mixed measurement error models, which allow outcomes and covariates to be measured with error, while accounting for correlation among observations; and (3) generalized linear (additive) mixed models for mixed discrete and continuous outcomes, which allow multiple outcomes (e.g., multiple endpoints in clinical trials) to have different forms.
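For orientation, one common way to write a generalized additive mixed model of the first kind is sketched below. The notation is an assumption introduced here, not the applicants' own: $y_{ij}$ is the $j$-th observation in cluster $i$, $g$ is a link function, the $f_k$ are unspecified smooth functions, and $b_i$ is a vector of random effects.

```latex
g\bigl(\mu_{ij}\bigr)
  = \beta_0 + f_1(x_{ij1}) + \cdots + f_p(x_{ijp}) + z_{ij}^{\top} b_i,
\qquad
\mu_{ij} = \mathrm{E}\bigl(y_{ij} \mid b_i\bigr),
\qquad
b_i \sim N\bigl(0,\, D(\theta)\bigr).
```

In this form the smooth functions $f_k$ carry the flexible, nonparametric dependence on covariates, while the random effects $b_i$ account for correlation among observations in the same cluster. Replacing each $f_k(x)$ with a linear term $\beta_k x$ gives a generalized linear mixed model.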

Maximum likelihood inference and Bayesian inference using Monte Carlo simulation methods will be developed for the proposed models. Simulation studies will be conducted to evaluate the performance of these methods. Efficient numerical algorithms and user-friendly statistical software will be developed, with the goal of disseminating the new models and methods to health sciences researchers. In collaboration with biomedical investigators, the applicants will apply the proposed models and methods to several accessible data sets from cancer research and other fields.

Funded by the NIH National Center for Advancing Translational Sciences through its Clinical and Translational Science Awards Program, grant number UL1TR002541.