University of Wollongong

Statistical learning in sample design

preprint
posted on 2024-11-15, 23:23 authored by Robert Clark
A well-designed sampling plan can greatly enhance the information that can be produced from a survey. Once a broad sample design is identified, specific design parameters such as sample sizes and selection probabilities need to be chosen. This is typically achieved using an optimal sample design, which minimises the variance of a key statistic or statistics, expressed as a function of design parameters and population characteristics, subject to a cost constraint. In practice, only imprecise estimates of population characteristics are available, but the effects of this variability are usually ignored. A general approach to sample allocation allowing for imprecise design data is proposed and evaluated. The approach is based on the availability of two sets of design data which can act as a check on each other. One application is to stratified sampling, where estimated stratum variances may be highly variable. Pooling strata into groups may reduce this variability, at the possible cost of some inefficiency. Proportional allocation, ignoring differences between stratum variances, could also be used. The new approach enables a data-driven compromise between all three. Simulation results based on real data show useful gains in a hypothetical farm survey, business survey and household survey of a subpopulation.
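As a rough illustration of the stratified sampling setting described in the abstract, the sketch below compares proportional allocation with the textbook variance-minimising (Neyman) allocation. It is not the method proposed in the paper. It assumes equal per-unit costs in every stratum, leaves allocations unrounded, and uses made-up stratum sizes and standard deviations; all function and variable names are hypothetical.

```python
def proportional_allocation(n, sizes):
    """Allocate n in proportion to stratum sizes N_h, ignoring variances."""
    total = sum(sizes)
    return [n * N_h / total for N_h in sizes]


def neyman_allocation(n, sizes, sds):
    """Allocate n in proportion to N_h * S_h (equal unit costs assumed)."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


def stratified_mean_variance(alloc, sizes, sds):
    """Variance of the stratified sample mean, with finite population correction."""
    N = sum(sizes)
    return sum(
        (N_h / N) ** 2 * S_h ** 2 / n_h * (1 - n_h / N_h)
        for n_h, N_h, S_h in zip(alloc, sizes, sds)
    )


# Illustrative inputs only; not taken from the paper.
sizes = [5000, 3000, 2000]        # stratum population sizes N_h
true_sds = [2.0, 5.0, 12.0]       # true stratum standard deviations S_h
estimated_sds = [4.0, 3.0, 6.0]   # imprecise design-data estimates of S_h
n = 500                           # total sample size

prop = proportional_allocation(n, sizes)
ney_true = neyman_allocation(n, sizes, true_sds)
ney_est = neyman_allocation(n, sizes, estimated_sds)

# Every allocation is evaluated against the true standard deviations.
for label, alloc in [("proportional", prop),
                     ("Neyman, true S_h", ney_true),
                     ("Neyman, estimated S_h", ney_est)]:
    print(label, [round(a, 1) for a in alloc],
          stratified_mean_variance(alloc, sizes, true_sds))
```

With the true standard deviations, the Neyman allocation cannot do worse than any other allocation of the same total size. An allocation computed from imprecise estimates carries no such guarantee, which is the situation the abstract addresses.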

History

Citation

Clark, Robert Graham, Statistical learning in sample design, Centre for Statistical and Survey Methodology, University of Wollongong, Working Paper 6-12, 2012, 25. http://ro.uow.edu.au/cssmwp/93 This has been subsequently published as: Clark, R. Graham. (2013). Sample design using imperfect design data. Journal of Survey Statistics and Methodology, 1 (1), 6-23.

Article/chapter number

6-12

Total pages

25

Language

English
