Cookies on this website

We use cookies to ensure that we give you the best experience on our website. If you click 'Accept all cookies' we'll assume that you are happy to receive all cookies and you won't see this message again. If you click 'Reject all non-essential cookies' only necessary cookies providing core functionality such as security, network management, and accessibility will be enabled. Click 'Find out more' for information on how to change your cookie settings.

The coalescent process describes how changes in the size or structure of a population influence the genealogical patterns of sequences sampled from that population. The estimation of (effective) population size changes from genealogies that are reconstructed from these sampled sequences is an important problem in many biological fields. Often, population size is characterised by a piecewise-constant function, with each piece serving as a population size parameter to be estimated. Estimation quality depends on both the statistical coalescent inference method employed, and on the experimental protocol, which controls variables such as the sampling of sequences through time and space, or the transformation of model parameters. While there is an extensive literature on coalescent inference methodology, there is comparatively little work on experimental design. The research that does exist is largely simulation-based, precluding the development of provable or general design theorems. We examine three key design problems: temporal sampling of sequences under the skyline demographic coalescent model, spatio-temporal sampling under the structured coalescent model, and time discretisation for sequentially Markovian coalescent models. In all cases we prove that (i) working in the logarithm of the parameters to be inferred (e.g. population size), and (ii) distributing informative coalescent events uniformly among these log-parameters, is uniquely robust. `Robust' means that the total and maximum uncertainty of our parameter estimates are minimised, and made insensitive to their unknown (true) values. This robust design theorem provides rigorous justification for several existing coalescent experimental design decisions, and leads to usable guidelines for future empirical or simulation-based investigations. Given its persistence among models, this theorem may form the basis of an experimental design paradigm for coalescent inference.

Original publication

DOI

10.1093/sysbio/syz008

Type

Journal article

Journal

Systematic biology

Publication Date

06/02/2019

Addresses

Department of Zoology, University of Oxford, Oxford, OX1 3SY, UK.