Abstract
Official statistics bureaus are periodically asked to estimate their country's population, which can be defined as the number of usual residents. A person is considered a usual resident of the Netherlands when they have lived there for longer than a year, or when they intend to reside there for longer than a year. For the Dutch Census, Statistics Netherlands makes use of the Population Register (PR). However, for numerous reasons, immigrants who have taken up residence in the Netherlands may not register and thus become undocumented immigrants. The PR alone is therefore not sufficient to estimate the number of usual residents: it undercovers the population of Dutch usual residents. One commonly used method to estimate population sizes is capture-recapture methodology. First, the PR is linked to two other registers. Then capture-recapture methodology, using a covariate that denotes residence duration, can be used to estimate the number of usual residents missed by all three registers. However, for the valid use of capture-recapture methodology, a set of assumptions has to be met. Additionally, practical issues such as missing data may occur. Such practical issues have to be resolved before one can estimate the number of Dutch usual residents via capture-recapture methodology. To that end, this thesis answers two central questions: 1) what is the effect of violated assumptions and missing data on the robustness of population size estimation via capture-recapture methodology, and 2) how can the information gained in 1) be used to achieve a trustworthy estimate of the undercoverage of usual residents in the Population Register in the Netherlands? To answer the first question, research has been conducted into the robustness of population size estimation via capture-recapture methodology when the following assumptions are violated: 1) independence of the inclusion probabilities of the registers, 2) no erroneous captures in the registers, and 3) perfect linkage of the units in the registers used. For the independence assumption, this research also investigated the robustness under independence conditional on fully and partially observed covariates.
Additionally, research has been conducted into the effect that missing data have on population size estimation, and most notably into how different methods of handling missing data differ in their effect on the resulting population size estimate. It has been found that the implied coverage of one register, given the other register, is important for the extent to which violated assumptions bias the population size estimate. Implied coverage plays an important role in this thesis because it cannot be ascertained from the data whether assumptions are violated, whereas implied coverage can be. The results obtained in answering the first question have been used to conduct research into the undercoverage of the PR of the Netherlands. It is concluded that, for reference date September 2010, the PR has an undercoverage of 0.5 to 1.1% of usual residents.
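The capture-recapture idea and the notion of implied coverage can be illustrated with a minimal sketch. It is deliberately simplified and is not the thesis's actual model: it assumes only two linked registers (the thesis uses three, plus a covariate), independent inclusion probabilities, and the classical Lincoln-Petersen estimator. The function name, the counts, and the reading of "implied coverage" as the share of one register's units that are also found in the other are all assumptions made for illustration.

```python
def two_register_estimate(n_both, n_only_a, n_only_b):
    """Lincoln-Petersen sketch for two linked registers A and B.

    n_both   -- units found in both registers after linkage
    n_only_a -- units found only in register A
    n_only_b -- units found only in register B
    Assumes independent inclusion, no erroneous captures, perfect linkage.
    """
    if n_both <= 0:
        raise ValueError("the estimator needs at least one unit in both registers")

    n_a = n_both + n_only_a          # size of register A
    n_b = n_both + n_only_b          # size of register B
    observed = n_both + n_only_a + n_only_b

    n_hat = n_a * n_b / n_both       # estimated population size
    missed = n_hat - observed        # estimated units missed by both registers

    return {
        "n_hat": n_hat,
        "missed": missed,
        # Implied coverage of A given B: share of B's units also in A
        # (illustrative definition, assumed here).
        "implied_coverage_a": n_both / n_b,
        "implied_coverage_b": n_both / n_a,
    }


# Hypothetical counts, not taken from the thesis.
result = two_register_estimate(n_both=900, n_only_a=90, n_only_b=10)
print(result)
```

With these made-up counts, register A's implied coverage is high (about 0.99), so the estimated number of missed units is small relative to the observed total. This is consistent with the finding above that the implied coverage governs how strongly a violated assumption can bias the estimate.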