Phasing for Pedigrees when SNPs are Densely Packed

Bonnie Kirkpatrick, University of California, Berkeley

Pedigree analysis is a common method for identifying genes that may be responsible for a heritable disease. After identifying families where multiple people are affected by the disease of interest, medical researchers collect both DNA samples and pedigree information about the relationships between family members. The genetic data and pedigree are used to identify specific regions of the genome where the DNA of the affected people distinguishes them from the unaffected people.


Traditional methods of pedigree analysis were designed for genotype data taken from sparsely located sites on the same chromosome. As a result, these methods assume that pairs of adjacent sites have a non-zero recombination fraction. With the advent of dense genotyping (>100,000 SNPs per individual), new methods must be developed that efficiently account for the extremely low probability of recombination between closely adjacent sites.
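To give a rough sense of how small the recombination fraction between neighboring SNPs becomes at this density, the sketch below applies Haldane's map function, theta = 0.5 * (1 - exp(-2d)), with d in Morgans. The genome length of 3e9 bp, the uniform SNP spacing, and the rate of about 1 cM per Mb are illustrative assumptions, not values from the abstract, and the function names are hypothetical.

```python
import math


def haldane_recombination_fraction(distance_morgans: float) -> float:
    """Haldane's map function: theta = 0.5 * (1 - exp(-2 * d))."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


def prob_no_recombination(theta: float, num_sites: int) -> float:
    """Probability that a single meiosis has no recombination in any of the
    num_sites - 1 intervals between adjacent sites, treating intervals as
    independent (as Haldane's no-interference model does)."""
    return (1.0 - theta) ** (num_sites - 1)


# Illustrative assumptions (not from the abstract):
GENOME_BP = 3.0e9        # approximate human genome length in base pairs
NUM_SNPS = 100_000       # dense genotyping panel, evenly spaced
MORGANS_PER_BP = 1.0e-8  # roughly 1 cM per Mb

spacing_bp = GENOME_BP / NUM_SNPS
theta = haldane_recombination_fraction(spacing_bp * MORGANS_PER_BP)
block_intact = prob_no_recombination(theta, num_sites=10)

print(f"adjacent-SNP spacing: {spacing_bp:.0f} bp")
print(f"recombination fraction between neighbors: {theta:.2e}")
print(f"P(no recombination across 10 adjacent SNPs, one meiosis): {block_intact:.4f}")
```

Under these assumptions a block of ten adjacent SNPs is transmitted intact in the large majority of meioses, which is the regime the abstract is concerned with.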


We introduce a new algorithm for simultaneously inferring both founder haplotype frequencies and haplotypes of all individuals in a moderately sized pedigree. Our method scales well for pedigrees with multiple pairs of married founders and for a small number of closely adjacent SNPs. Haplotypes inferred by our method are more accurate than estimates based on the traditional models for pedigree analysis.
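The abstract does not spell out the algorithm, so the following is only a brute-force illustration of the underlying idea, not the authors' method: when recombination within a short block of SNPs is assumed to be zero, each parent passes on one intact haplotype, and this constraint alone can resolve phase that is ambiguous from a single genotype. The genotype coding (0, 1, 2 copies of the alternate allele per SNP) and both function names are assumptions made for this sketch.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """All unordered haplotype pairs (h1, h2) with h1[i] + h2[i] == genotype[i]."""
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites are fixed: genotype 0 -> allele 0, genotype 2 -> allele 1.
        h1 = [g // 2 for g in genotype]
        h2 = list(h1)
        # Heterozygous sites: try both assignments of the two alleles.
        for site, bit in zip(het_sites, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def phase_trio_without_recombination(father, mother, child):
    """Child phasings (paternal_hap, maternal_hap) in which each parent
    transmits one whole haplotype, i.e. no recombination inside the block."""
    solutions = set()
    for father_pair in compatible_haplotype_pairs(father):
        for mother_pair in compatible_haplotype_pairs(mother):
            for f_hap, m_hap in product(father_pair, mother_pair):
                if all(a + b == g for a, b, g in zip(f_hap, m_hap, child)):
                    solutions.add((f_hap, m_hap))
    return sorted(solutions)


# The child is heterozygous at all three SNPs, so its genotype alone
# admits four different phasings.
child = (1, 1, 1)
print(len(compatible_haplotype_pairs(child)))

# Adding parental genotypes under the zero-recombination assumption
# narrows the set of consistent phasings.
print(phase_trio_without_recombination((1, 2, 1), (0, 1, 0), child))
```

The enumeration grows exponentially in the number of heterozygous sites, so a sketch like this is only practical for a handful of closely adjacent SNPs, and it does not estimate founder haplotype frequencies at all.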

Abstract Author(s): Bonnie Kirkpatrick(1), Javier Rosa(2), Eran Halperin(3), Richard M. Karp(1,3)