MOTIVATION: Local ancestry analysis of genotype data from recently admixed populations (e.g. Latinos, African Americans) provides key insights into population history and disease genetics. Although methods for local ancestry inference have been extensively validated in simulations (under many unrealistic assumptions), no empirical study of local ancestry accuracy in Latinos exists to date. Hence, interpreting findings that rely on local ancestry in Latinos is challenging.
RESULTS: Here, we use 489 nuclear families from the mainland USA, Puerto Rico and Mexico in conjunction with 3204 unrelated Latinos from the Multiethnic Cohort study to provide the first empirical characterization of local ancestry inference accuracy in Latinos. Our approach for identifying errors does not rely on simulations but on the observation that local ancestry in families follows Mendelian inheritance. We measure the rate of local ancestry assignments that lead to Mendelian inconsistencies in local ancestry in trios (MILANC), which provides a lower bound on errors in the local ancestry estimates. First, we show that MILANC rates observed in simulations underestimate the rate observed in real data, and that MILANC varies substantially across the genome. Second, across a wide range of methods, we observe that loci with large deviations in local ancestry also show enrichment in MILANC rates. Therefore, local ancestry estimates at such loci should be interpreted with caution. Finally, we reconstruct ancestral haplotype panels to be used as reference panels in local ancestry inference and show that ancestry inference is significantly improved by incorporating these reference panels.
AVAILABILITY AND IMPLEMENTATION: We provide the reconstructed reference panels together with the maps of MILANC rates as a public resource for researchers analyzing local ancestry in Latinos at http://bogdanlab.pathology.ucla.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
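The idea behind MILANC can be illustrated with a minimal sketch. It assumes that each individual's local ancestry at a locus is given as an unordered pair of ancestry labels (one per haplotype), and it counts a trio-locus as inconsistent when the child's pair cannot be formed by taking one label from the mother and one from the father. The names `is_mendelian_consistent` and `milanc_rate`, the labels "EUR", "AFR" and "NAT", and the data layout are hypothetical and chosen for illustration only; the published analysis may define and aggregate the rate differently.

```python
from typing import Sequence, Tuple

# Assumption: a local ancestry call at one locus is an unordered pair of
# ancestry labels, one per haplotype, e.g. ("EUR", "NAT").
Call = Tuple[str, str]
Trio = Tuple[Sequence[Call], Sequence[Call], Sequence[Call]]  # child, mother, father


def is_mendelian_consistent(child: Call, mother: Call, father: Call) -> bool:
    """True if the child's pair can be built from one maternal and one paternal label."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def milanc_rate(trios: Sequence[Trio]) -> float:
    """Fraction of trio-locus calls that violate Mendelian inheritance.

    Illustrative aggregate over all trios and loci; not the authors' code.
    """
    inconsistent = 0
    total = 0
    for child, mother, father in trios:
        # Calls are assumed to be aligned by locus across the three individuals.
        for c, m, f in zip(child, mother, father):
            total += 1
            if not is_mendelian_consistent(c, m, f):
                inconsistent += 1
    return inconsistent / total if total else 0.0


# Toy example: one trio, two loci. At the second locus the child is called
# NAT/NAT although the mother carries no NAT haplotype there.
example_trio: Trio = (
    [("EUR", "AFR"), ("NAT", "NAT")],  # child
    [("EUR", "EUR"), ("EUR", "EUR")],  # mother
    [("AFR", "NAT"), ("NAT", "NAT")],  # father
)
rate = milanc_rate([example_trio])
```

An inconsistency shows that at least one of the three calls is wrong, but a consistent trio can still contain errors. This is why the abstract describes the rate as a lower bound on the error in the local ancestry estimates.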