
The estimation of R² and adjusted R² in incomplete data sets using multiple imputation

Research Area: Uncategorized
Year: 2009
Type of Publication: Article
Authors:
  • Harel, O.
Journal: Journal of Applied Statistics
Volume: 36
Number: 10
Pages: 1109-1118
The coefficient of determination, also known as R², is a common measure in regression analysis. Many scientists use R² and the adjusted R² on a regular basis. In most cases, researchers treat the coefficient of determination as an index of 'usefulness' or 'goodness of fit,' and in some cases they even treat it as a model selection tool. When the data are incomplete, most researchers and common statistical software use complete case analysis to estimate R², a procedure that might lead to biased results. In this paper, I introduce the use of multiple imputation for the estimation of R² and adjusted R² in incomplete data sets. I illustrate my methodology using a biomedical example.
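
The quantities named in the abstract can be illustrated with a minimal sketch. The abstract does not state how the estimates from the imputed data sets are combined, so the pooling rule below (averaging Fisher z-transforms of the square root of R² and back-transforming) is an assumption made for illustration, not a confirmed description of the paper's procedure. The function names `r_squared`, `adjusted_r_squared`, and `pool_r_squared` are hypothetical.

```python
import math


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def pool_r_squared(r2_values):
    """Combine R^2 values from several imputed data sets into one estimate.

    ASSUMPTION: pooling is done on the Fisher z scale of sqrt(R^2).
    Each value must lie in [0, 1); a value of exactly 1 makes atanh undefined.
    """
    z_values = [math.atanh(math.sqrt(r2)) for r2 in r2_values]
    z_bar = sum(z_values) / len(z_values)
    return math.tanh(z_bar) ** 2
```

In use, one would fit the same regression to each imputed data set, pass the resulting R² values to `pool_r_squared`, and then apply `adjusted_r_squared` with the sample size and number of predictors. Averaging on the transformed scale rather than averaging the raw R² values is a design choice of this sketch.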