Selection models

Quick summary

The selection model is one of the best-known classical statistical methods for analysing data with missing values under the MNAR (missing not at random) assumption (Diggle and Kenward 1994). It is based on a factorization of the joint likelihood of the measurement process and the missingness process. The marginal density of the measurement process describes how the complete data are generated, while the density of the missingness process conditional on the outcomes describes how the missing data are “selected” from the complete data. Like the shared parameter model, this is therefore a joint modelling approach: the two processes are linked through the response variable. Note that in selection models the response values themselves enter the model for the missingness process and/or dropout probability directly, whereas shared parameter models link the two processes through latent random effects. Users need to judge which approach suits the details of their project.
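The factorization described above can be written out as a short sketch. The symbols are assumptions introduced here for illustration, not notation taken from the macro: y_i is the complete response vector for subject i, r_i the missingness indicators, b_i a latent random effect, and theta and psi the parameters of the measurement and missingness models.

```latex
% Selection model: marginal measurement model times missingness given the outcomes
f(y_i, r_i \mid \theta, \psi) = f(y_i \mid \theta)\, f(r_i \mid y_i, \psi)

% Shared parameter model, for contrast: the two processes are linked through b_i
f(y_i, r_i \mid \theta, \psi) = \int f(y_i \mid b_i, \theta)\, f(r_i \mid b_i, \psi)\, f(b_i)\, db_i
```

In the first line the outcomes y_i appear directly in the missingness density, which is what distinguishes the selection model from the shared parameter form in the second line.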

The classic selection model assumes Non-Future Dependence (NFD): the dropout probability depends only on the last previously observed response and the current, possibly missing, response. This is a reasonable assumption and can significantly simplify the analysis. It is also common for different treatment arms to show different dropout behaviour, and this can be specified in the selection model approach.

The current macro models the response using a standard repeated measures model and models dropout using a logistic regression.
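As a rough illustration of the dropout part, the following Python sketch evaluates a logistic dropout model in which the probability of dropping out at a visit depends on the previous and the current response. The function name and the coefficients psi0, psi1 and psi2 are hypothetical; the macro's actual parameterization is not shown on this page and may differ.

```python
import math


def dropout_probability(y_prev, y_curr, psi0, psi1, psi2):
    """Probability of dropout at the current visit (hypothetical parameterization).

    logit P(dropout) = psi0 + psi1 * y_prev + psi2 * y_curr
    y_prev is the last observed response, y_curr the current (possibly
    unobserved) response.
    """
    eta = psi0 + psi1 * y_prev + psi2 * y_curr
    # Numerically stable inverse logit.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Example call with made-up coefficients.
p = dropout_probability(y_prev=1.2, y_curr=0.4, psi0=-2.0, psi1=0.5, psi2=0.8)
```

In this sketch, setting psi2 to zero removes the dependence on the current, possibly unobserved, response, so a nonzero psi2 is what makes the dropout mechanism MNAR. Arm-specific dropout could be represented by using a separate set of coefficients per treatment arm.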


Fitting a selection model with the current macro involves integration, and the code uses a built-in SAS nonlinear optimization function, which can be very slow. Convergence may not be achieved in extreme cases, so always check the log files.
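To give a feel for where the integration comes from, the sketch below computes the likelihood contribution of a dropout event in Python: because the current response is unobserved when a subject drops out, the dropout probability has to be averaged over its conditional distribution. This is not the macro's code. It assumes a normal conditional distribution for the unobserved response, a hypothetical logistic dropout model with coefficients psi0, psi1 and psi2, and a simple trapezoid rule in place of whatever integration method the macro uses.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def logistic(eta):
    """Numerically stable inverse logit."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def marginal_dropout_probability(y_prev, cond_mean, cond_sd,
                                 psi0, psi1, psi2,
                                 n_steps=2000, width=8.0):
    """Integrate P(dropout | y_prev, y) * N(y; cond_mean, cond_sd) over y.

    Uses the composite trapezoid rule on cond_mean +/- width * cond_sd.
    cond_mean and cond_sd describe the assumed conditional distribution of
    the unobserved current response given the observed history.
    """
    lower = cond_mean - width * cond_sd
    step = 2.0 * width * cond_sd / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        y = lower + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end points
        eta = psi0 + psi1 * y_prev + psi2 * y
        total += weight * logistic(eta) * normal_pdf(y, cond_mean, cond_sd)
    return total * step


# Example call with made-up values.
contribution = marginal_dropout_probability(
    y_prev=1.0, cond_mean=0.8, cond_sd=1.1, psi0=-2.0, psi1=0.3, psi2=0.6)
```

An integral of this kind has to be evaluated for every subject who drops out, at every iteration of the optimizer, which is one plausible reason why fitting is slow.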

Alternatively, selection models can be implemented via PROC MCMC, where the model specification and computation may be more straightforward.


Macros can be downloaded here: Selection Model_20120726
