Estimating population counts for multidimensional tables from representative sample data, subject to known marginal population counts, is not only important in survey sampling but is also an integral part of standard methods for simulating area-specific synthetic populations (SPs). Generating a reliable SP requires tabulating multidimensional tables of agents' socio-demographic characteristics. In this paper we review the iterative proportional fitting procedure (IPFP) and the maximum likelihood (ML) method for estimating the cell counts in multidimensional tables subject to known population sub-tables. We also review two standard error estimators for ML and IPFP and investigate their performance in a simulation study that includes misspecified models, in which the sample and target populations differ systematically. The empirical results show that a simple adjustment can lead to more efficient estimates when table probabilities are low. The methods discussed in this paper, along with the standard error estimators, one of which is relatively new, are made freely available in the R package mipfp. As an illustration, the methods are applied to 2011 Australian census data for the Illawarra Region to obtain cell count estimates for the desired three-way table of age by sex by family type, subject to marginal tables for age by sex and for family type.
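To make the fitting step concrete, the following is a minimal sketch of IPFP for a three-way table constrained by one two-way margin and one one-way margin, the same shape as the age by sex by family type illustration. It is written in plain Python for self-containment and is not the mipfp API; the function name `ipfp_3way`, its parameters, and the seed and target values are all hypothetical and chosen only for illustration.

```python
def ipfp_3way(seed, target_ab, target_c, tol=1e-10, max_iter=1000):
    """Fit a 3-way table to a two-way margin over dimensions (0, 1)
    and a one-way margin over dimension 2 (hypothetical helper)."""
    n_a, n_b, n_c = len(seed), len(seed[0]), len(seed[0][0])
    # Work on a float copy so the seed table is left untouched.
    table = [[[float(v) for v in row] for row in plane] for plane in seed]

    for _ in range(max_iter):
        # Step 1: rescale each (a, b) slice to match the two-way margin.
        for a in range(n_a):
            for b in range(n_b):
                total = sum(table[a][b])
                if total > 0:
                    factor = target_ab[a][b] / total
                    table[a][b] = [v * factor for v in table[a][b]]

        # Step 2: rescale each c slice to match the one-way margin.
        for c in range(n_c):
            total = sum(table[a][b][c] for a in range(n_a) for b in range(n_b))
            if total > 0:
                factor = target_c[c] / total
                for a in range(n_a):
                    for b in range(n_b):
                        table[a][b][c] *= factor

        # Stop once the two-way margin still holds after step 2.
        err = max(
            abs(sum(table[a][b]) - target_ab[a][b])
            for a in range(n_a)
            for b in range(n_b)
        )
        if err < tol:
            break

    return table


# Illustrative (made-up) inputs: a 2 x 2 x 2 seed from a sample, and
# target margins that both sum to the same population total of 100.
seed = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
target_ab = [[10, 20], [30, 40]]
target_c = [45, 55]

fitted = ipfp_3way(seed, target_ab, target_c)
```

The sketch assumes a strictly positive seed and mutually consistent margins (equal totals). It covers only the point estimates of the cell counts, not the ML method or the standard error estimators.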