LDpop: an interactive online tool to calculate and visualize geographic LD patterns.

T A Alexander, M J Machiela.

Abstract

BACKGROUND: Linkage disequilibrium (LD), the non-random association of alleles at different loci, defines population-specific haplotypes that vary by genomic ancestry. Assessing allele frequencies and LD patterns in a variety of ancestral populations helps researchers better understand population histories and improves genetic understanding of diseases in which risk varies by ethnicity.
RESULTS: We created an interactive web module which allows for quick geographic visualization of linkage disequilibrium (LD) patterns between two user-specified germline variants across geographic populations included in the 1000 Genomes Project. Interactive maps and a downloadable, sortable summary table allow researchers to easily compute and compare allele frequencies and LD statistics of dbSNP catalogued variants. The geographic mapping of each SNP's allele frequencies by population as well as visualization of LD statistics allows the user to easily trace geographic allelic correlation patterns and examine population-specific differences.
CONCLUSIONS: LDpop is a free and publicly available cross-platform web tool which can be accessed online at https://ldlink.nci.nih.gov/?tab=ldpop.

Keywords:  1000 Genomes Project; Genome-wide association; Geographical visualization; Linkage disequilibrium

Year:  2020        PMID: 31924160      PMCID: PMC6954550          DOI: 10.1186/s12859-020-3340-1

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

Linkage disequilibrium (LD)—the non-random association of alleles at different loci—defines population-specific haplotypes which vary by genomic ancestry [1]. Assessment of allelic frequencies and LD patterns from a variety of ancestral populations enables researchers to better understand population histories as well as improve genetic understanding of diseases in which risk varies by ethnicity. For example, genome-wide association studies (GWAS) identify germline variation associated with disease susceptibility but need to account for ancestry-specific differences in LD patterns when designing the study, analyzing markers and interpreting findings. While population geneticists have developed many datasets (e.g., 1000 Genomes Project, HapMap) [2, 3] and tools (e.g., Geography of Genetic Variants Browser) [4] to investigate differences in allelic frequencies by population group, to date no tool exists to easily explore and visualize LD patterns across 1000 Genomes population groups.

Implementation

LDpop is an online module designed to allow researchers to query LD patterns of two variants across ancestral populations of interest. LDpop estimates allele frequencies and measures of LD (D′ and R²) for each included population. The reference genetic data are from the 1000 Genomes Project Phase 3, which includes sequencing data for 2504 individuals in 26 ancestral populations grouped into 5 super populations (African, Ad-Mixed American, East Asian, European, and South Asian) [2]. The 1000G data are available for public download in variant call format (VCF) (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). LDpop is written in Python (2.7.15) and runs on a web-accessible virtual machine with a UNIX operating system. For each query variant, the genomic coordinates are retrieved from an indexed MongoDB database of dbSNP version 151, and the corresponding genotypes are then extracted from the phased 1000 Genomes Project VCF file using Tabix (0.2.5). LDpop uses the Google Maps API to produce the interactive geographic maps, placing each 1000 Genomes Project ancestral population at its latitude and longitude coordinates. The LDpop web page is programmed in HTML5 for cross-browser and cross-platform compatibility and is part of the larger LDlink collection of LD web tools [5, 6]. All code for LDpop is available from our GitHub repository: https://github.com/CBIIT/nci-webtools-dceg-linkage/.
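As an illustration of the calculation described above, the following is a minimal sketch of how allele frequencies, D′ and R² might be derived from phased genotypes at two biallelic variants within one population. It is not taken from the LDpop code base: the function names `phased_alleles` and `ld_stats` are hypothetical, the input is assumed to be the per-sample genotype fields of a phased VCF record (for example `0|1`), and the sketch is written for Python 3 rather than the Python 2.7 used by LDpop. It uses the standard definitions D = p(AB) - p(A)p(B), R² = D² / [p(A)(1 - p(A))p(B)(1 - p(B))], and D′ = |D| / Dmax.

```python
from collections import Counter


def phased_alleles(gt_fields):
    """Flatten phased VCF genotype fields such as '0|1' into a list of 0/1 alleles."""
    alleles = []
    for field in gt_fields:
        gt = field.split(":")[0]        # the genotype is the first colon-separated subfield
        left, right = gt.split("|")     # '|' marks a phased genotype
        alleles.extend([int(left), int(right)])
    return alleles


def ld_stats(hap_a, hap_b):
    """Return (freq_a, freq_b, d_prime, r2) for two equal-length lists of 0/1 alleles.

    Each position in the lists is one chromosome (haplotype) in the population.
    d_prime and r2 are None when either variant is monomorphic.
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and of equal length")

    counts = Counter(zip(hap_a, hap_b))
    p_a = sum(hap_a) / n                # frequency of the alternate allele at variant A
    p_b = sum(hap_b) / n                # frequency of the alternate allele at variant B
    p_ab = counts[(1, 1)] / n           # frequency of the haplotype carrying both alternates

    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return p_a, p_b, None, None     # LD is undefined for a monomorphic variant

    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    return p_a, p_b, abs(d) / d_max, (d * d) / denom


if __name__ == "__main__":
    # Toy genotypes for four samples, not real 1000 Genomes data.
    variant_a = phased_alleles(["0|1", "1|1", "0|0", "1|0"])
    variant_b = phased_alleles(["0|1", "1|1", "0|0", "1|0"])
    print(ld_stats(variant_a, variant_b))
```

Running the same function once per selected population, on the subset of samples belonging to that population, would give the per-population values that a tool of this kind reports.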

Results

LDpop takes as input two dbSNP reference SNP numbers (rsIDs), a selection of desired populations from the 1000 Genomes Project, and a choice of which LD statistic (D′ or R²) to report for the geographic mapping. LDpop supports queried dbSNP variants that are biallelic, including both single nucleotide polymorphisms (SNPs) and small insertions and deletions (indels). The user can specify any combination of individual populations, super populations, or all populations for the analysis. LDpop produces three geographic maps and one sortable, filterable table as output (Fig. 1).

For each queried variant, the allele frequency is calculated for every selected population, and the frequency percentage is plotted at the population's approximate geographic coordinates as a colored pin, with deeper blue indicating higher allele frequencies. This allows investigators to easily calculate and visualize changes in allele frequency across ancestral populations for each variant. An LD map is also produced, displaying the chosen LD statistic (D′ or R²) for the two query variants in every selected population. Each mapped data point is colored according to the gradient shown in the legend, with darker red signifying a higher degree of LD. All geographic mapping uses the Google Maps API for smooth and rapid performance.

The interactive summary table at the bottom of the page has a row for each selected 1000 Genomes Project population and displays the number of samples in each population, the allele frequencies for each variant, and the calculated LD values (D′ and R²). The table is sortable by column and has a search bar for quick navigation. It can also be downloaded as a text file for local storage and future data integration and analysis.
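The population selection described above can be pictured as expanding a mix of codes into a flat list of populations. The sketch below is an assumption about how such a step could look and is not LDpop's actual interface: the function name `expand_populations` and the special code `ALL` are hypothetical, while the three-letter codes are the standard 1000 Genomes Project Phase 3 population and super population codes.

```python
# Standard 1000 Genomes Project Phase 3 codes: 5 super populations, 26 populations.
SUPER_POPULATIONS = {
    "AFR": ["YRI", "LWK", "GWD", "MSL", "ESN", "ASW", "ACB"],
    "AMR": ["MXL", "PUR", "CLM", "PEL"],
    "EAS": ["CHB", "JPT", "CHS", "CDX", "KHV"],
    "EUR": ["CEU", "TSI", "FIN", "GBR", "IBS"],
    "SAS": ["GIH", "PJL", "BEB", "STU", "ITU"],
}


def expand_populations(selection):
    """Expand a list of codes into an ordered list of unique population codes.

    Each code may be 'ALL' (hypothetical shortcut for every population),
    a super population code, or a single population code.
    """
    known = [pop for pops in SUPER_POPULATIONS.values() for pop in pops]
    expanded = []
    for code in selection:
        code = code.strip().upper()
        if code == "ALL":
            members = known
        elif code in SUPER_POPULATIONS:
            members = SUPER_POPULATIONS[code]
        elif code in known:
            members = [code]
        else:
            raise ValueError("unknown population code: %s" % code)
        for pop in members:
            if pop not in expanded:     # keep first-seen order, drop duplicates
                expanded.append(pop)
    return expanded


if __name__ == "__main__":
    print(expand_populations(["EUR", "YRI"]))
```

Keeping the result as an ordered list without duplicates means a selection such as one super population plus one of its own member populations yields each population only once, so each would get a single map pin and a single table row.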
Fig. 1

Example of an LDpop interactive map and table. Selected tab displays a map of R² for rs3 and rs383 for all 1000 Genomes Project populations. Numeric data on sample size, allele frequency and LD measures are displayed in the table at the bottom of the screen capture.


Conclusions

LDpop is an online module designed to allow researchers to query LD patterns of two variants across ancestral populations of interest. It is designed to allow users to easily calculate and geographically visualize these LD patterns and changes in allele frequency across ancestral populations. This web tool is freely available and can be accessed at https://ldlink.nci.nih.gov/?tab=ldpop.
References

1.  Linkage disequilibrium in the human genome.

Authors:  D E Reich; M Cargill; S Bolk; J Ireland; P C Sabeti; D J Richter; T Lavery; R Kouyoumjian; S F Farhadian; R Ward; E S Lander
Journal:  Nature       Date:  2001-05-10       Impact factor: 49.962

2.  The International HapMap Project.

Authors: 
Journal:  Nature       Date:  2003-12-18       Impact factor: 49.962

3.  LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants.

Authors:  Mitchell J Machiela; Stephen J Chanock
Journal:  Bioinformatics       Date:  2015-07-02       Impact factor: 6.937

4.  LDassoc: an online tool for interactively exploring genome-wide association study results and prioritizing variants for functional investigation.

Authors:  Mitchell J Machiela; Stephen J Chanock
Journal:  Bioinformatics       Date:  2018-03-01       Impact factor: 6.937

5.  Visualizing the geography of genetic variants.

Authors:  Joseph H Marcus; John Novembre
Journal:  Bioinformatics       Date:  2017-02-15       Impact factor: 6.937

6.  An integrated map of genetic variation from 1,092 human genomes.

Authors:  Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal:  Nature       Date:  2012-11-01       Impact factor: 49.962

