We call a sample design that allows different patterns, or sets, of data items to be collected from different sample units a Split Questionnaire Design (SQD). SQDs can be used to accommodate constraints on respondent burden and to maximise survey design efficiency, commonly measured by the trade-off between survey cost and the accuracy of target estimates. This paper explores these issues in the setting where the data not collected under an SQD can be treated as Missing Completely At Random or Missing At Random, the targets are regression coefficients in a generalised linear model fitted to binary variables, and the targets are estimated by Maximum Likelihood. A key finding is that some respondents may contribute relatively little information about the regression coefficients; collecting all data items from these respondents can therefore be inefficient and may impose unnecessary burden. The paper illustrates how an SQD can exploit this finding, using Australia's NSW Population Health Survey.
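As a rough illustration of why respondents can differ in how much information they carry, the sketch below takes the simplest case: a logistic link with all data items observed. This is an assumption made for illustration, not the paper's own method, which deals with incomplete data. In this case respondent i contributes p_i (1 - p_i) x_i x_i^T to the Fisher information for the regression coefficients, so respondents whose predicted probability is close to 0 or 1 add very little. The function name `unit_information`, the design matrix and the coefficient values are all hypothetical.

```python
import numpy as np

def unit_information(X, beta):
    """Per-respondent Fisher information for a logistic regression.

    X    : (n, p) design matrix; include a column of ones for the intercept.
    beta : (p,) coefficient vector (assumed known here, for illustration).
    Returns an (n, p, p) array whose i-th slice is p_i (1 - p_i) x_i x_i^T.
    """
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))   # predicted P(y_i = 1)
    weight = prob * (1.0 - prob)               # Bernoulli variance of each unit
    # Outer product x_i x_i^T for every row, scaled by that row's weight.
    return weight[:, None, None] * X[:, :, None] * X[:, None, :]

# Hypothetical respondents: an intercept plus one covariate.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 6.0]])
beta = np.array([0.0, 1.0])

info = unit_information(X, beta)
total_info = info.sum(axis=0)   # information from the full sample

# With an intercept column, the (0, 0) entry of each slice is the weight
# p_i (1 - p_i), which shrinks as the predicted probability nears 0 or 1.
unit_weights = info[:, 0, 0]
```

In this toy example the third respondent has a predicted probability very close to 1, so its weight is far smaller than the other two. A design could in principle collect fewer items from such units, although which units are uninformative depends on the model and on which items are missing.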