MOTIVATION: The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to model this additional variance appropriately to avoid inflating the false-positive rate.

RESULTS: We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams, and that it can analyse both rare and common variants with lower or more variable pool depths than the other methods.

AVAILABILITY: The package 'extraBinomial' is available at http://cran.r-project.org/.

CONTACT: chris.wallace@cimr.cam.ac.uk

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Online.

Original publication

DOI

10.1093/bioinformatics/bts553

Type

Journal article

Journal

Bioinformatics

Publication Date

15/11/2012

Volume

28

Pages

2898 - 2904

Keywords

Diabetes Mellitus, Type 1; Humans; Models, Statistical; Polymorphism, Single Nucleotide; Sequence Analysis, DNA