BACKGROUND: Rare coding variants constitute an important class of human genetic variation, but are underrepresented in current databases that are based on small population samples. Recent studies show that variants altering amino acid sequence and protein function are enriched at low variant allele frequencies (2 to 5%), but because of insufficient sample size it is not clear whether the same trend holds for rare variants below 1% allele frequency.

RESULTS: The 1000 Genomes Exon Pilot Project has collected deep-coverage exon-capture data for roughly 1,000 human genes in nearly 700 samples. Although medical whole-exome projects are under way, this is still the deepest reported sampling of a large number of human genes with next-generation technologies. In keeping with the goals of the 1000 Genomes Project, we created effective informatics pipelines to process and analyze the data, and discovered 12,758 exonic SNPs, 70% of them novel and 74% below 1% allele frequency in the seven population samples we examined. Our analysis confirms that coding variants below 1% allele frequency show increased population specificity and are enriched for functional variants.

CONCLUSIONS: This study represents a large step toward detecting and interpreting low-frequency coding variation, clearly lays out technical steps for effective analysis of DNA capture data, and articulates functional and population properties of this important class of genetic variation.