Imputation on the Food, Nutrition and Environment Surveys 2007 and 2009 data
The Food Nutrition Environment Survey (FNES) is a survey of New Zealand early childhood centres and schools and of the food and nutritional services that they provide for their pupils. The 2007 and 2009 FNES surveys were managed by the Ministry of Health. Like other social surveys, the FNES suffers from unit and item non-response; in other words, it has missing data. In this thesis, we survey a wide variety of techniques for handling missing data and apply most of them to the FNES datasets. The thesis can be roughly divided into two parts. In the first part, we investigate the nature of missing data (i.e. missing data mechanisms) and the common imputation methods, using the Synthetic Unit Record File (SURF), which was developed by Statistics New Zealand for educational purposes. Comparing these imputation methods, we find that Bayesian Multiple Imputation (MI) is the preferred option for imputing missing data, in terms of reducing non-response bias and properly propagating imputation uncertainty. Because the samples selected for the 2007 and 2009 FNES surveys overlap, we found that Bayesian MI can be improved by incorporating the matched dataset. Hence, we propose a couple of new approaches that utilize the extra information from the matched dataset. We believe that adapting Bayesian MI to use this extra information is a preferable strategy for imputing the FNES missing data, because the matched dataset gives the imputation model greater predictive power.