Abstract
To prevent infectious diseases from spreading, it is often very valuable to know how, when, where and between whom transmissions occur: the dynamics of the disease. These dynamics can be quantified using observations of illness, such as time of symptom onset. A relatively new source of information is formed by
genetic data: DNA sequences of the pathogens. The sequences can inform us about the relationships between the different samples taken, and thus also about the dynamics of the disease. To optimize inference schemes, we should utilize both types of data simultaneously. This thesis presents three mathematical methods that combine epidemiological and genetic data to unravel infectious disease dynamics, and applies them to actual datasets to illustrate their use. First, the thesis shows how the notion of 'superspreading' (i.e. some infected individuals causing a disproportionately large number of secondary cases) can be quantified for tuberculosis using data on date of diagnosis, country of origin and genotype of the sampled bacterium. The genetic data here are of coarse resolution: although the population can be divided into groups of cases sharing the same genotype, no further distinction can be made within or among these groups. A mathematical model derived from branching processes allows for inference of heterogeneity in infectiousness based on the size distribution of these groups, indicating superspreading behaviour for tuberculosis. Second, a method is developed to identify cases of infectious diseases that are epidemiologically related, which allows for detection of local outbreaks and risk factors for local infection. The method builds on existing clustering methods that identify groups of related individuals, and extends them by specifically including genetic data. Although the data used here are still genotypes rather than sequences, there is now a clear measure of similarity between the genotypes, which allows for inter-group comparisons. The method is applied to a dataset on MRSA in Dutch hospitals, containing MLVA types, dates of sampling and geographical locations of the patients. The analysis shows marked differences between MLVA complexes, and suggests that a high proportion of infections are imported into the hospital.
Third, the thesis shows how the transmission tree of an outbreak can be estimated from epidemiological and genetic sequence data. The method is applied to an outbreak of avian influenza, and shows differences in infectiousness for different types of farms. A further analysis that includes data on uninfected farms shows a statistically significant correlation between wind direction and the direction of spread of the disease. Lastly, it is shown how to treat genetic data in a more sophisticated way, allowing for within-host genetic diversity. This approach shows the relationship between transmission tree reconstruction and phylodynamics. Combining genetic and epidemiological data in one analysis can be a powerful way to learn more about the spread of infectious diseases. As the technology that drives this field is moving fast, techniques such as those presented here have the potential to unravel infectious disease dynamics at unprecedented levels.