
TOP-LD: A tool to explore linkage disequilibrium with TOPMed whole-genome sequence data.

Le Huang¹, Jonathan D Rosen², Quan Sun², Jiawen Chen², Marsha M Wheeler³, Ying Zhou⁴, Yuan-I Min⁵, Charles Kooperberg⁴, Matthew P Conomos⁶, Adrienne M Stilp⁶, Stephen S Rich⁷, Jerome I Rotter⁸, Ani Manichaikul⁷, Ruth J F Loos⁹, Eimear E Kenny¹⁰, Thomas W Blackwell¹¹, Albert V Smith¹¹, Goo Jun¹², Fritz J Sedlazeck¹³, Ginger Metcalf¹³, Eric Boerwinkle¹⁴, Laura M Raffield¹⁵, Alex P Reiner¹⁶, Paul L Auer¹⁷, Yun Li¹⁸.

Abstract

Current publicly available tools that allow rapid exploration of linkage disequilibrium (LD) between markers (e.g., HaploReg and LDlink) are based on whole-genome sequence (WGS) data from 2,504 individuals in the 1000 Genomes Project. Here, we present TOP-LD, an online tool to explore LD inferred with high-coverage (∼30×) WGS data from 15,578 individuals in the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. TOP-LD provides a significant upgrade compared to current LD tools, as the TOPMed WGS data provide a more comprehensive representation of genetic variation than the 1000 Genomes data, particularly for rare variants and in the specific populations that we analyzed. For example, TOP-LD encompasses LD information for 150.3, 62.2, and 36.7 million variants for European, African, and East Asian ancestral samples, respectively, offering 2.6- to 9.1-fold increase in variant coverage compared to HaploReg 4.0 or LDlink. In addition, TOP-LD includes tens of thousands of structural variants (SVs). We demonstrate the value of TOP-LD in fine-mapping at the GGT1 locus associated with gamma glutamyltransferase in the African ancestry participants in UK Biobank. Beyond fine-mapping, TOP-LD can facilitate a wide range of applications that are based on summary statistics and estimates of LD. TOP-LD is freely available online.

Year: 2022 PMID: 35504290 PMCID: PMC9247832 DOI: 10.1016/j.ajhg.2022.04.006

Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.043

Main text

Linkage disequilibrium (LD), i.e., the non-random association of alleles at different variant sites in a given population, is an important genetic phenomenon. Patterns of LD between genetic markers can be leveraged to gain insights in a variety of different applications, from population genetic research to disease association studies. With the growth of whole-genome sequencing (WGS) and high-throughput array and genotype imputation technologies, resources for calculating LD across populations have expanded to encompass multiple populations at variant sites with increasingly rare frequencies. Due to the centrality of LD in a host of applications, multiple tools exist for querying LD between genetic markers in different populations. The current most widely used LD lookup tools, HaploReg and LDlink, base their LD estimates on the 1000 Genomes data. Specifically, HaploReg uses phase 1 and LDlink uses phase 3 1000 Genomes data. Although the 1000 Genomes data contains LD information on >99% of genetic markers with minor allele frequency (MAF) > 1% in a variety of populations, there remains a dearth of publicly available information on LD between markers with MAF < 1%. We have created a new LD lookup tool (called “TOP-LD”), in the spirit of HaploReg and LDlink, that is based on deep (30×) WGS data from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program. Because the TOPMed data contain much larger sample sizes with greater depth of sequencing than the 1000 Genomes project, TOP-LD provides a significant upgrade in LD information availability, specifically by including single-nucleotide variants and small indels (referred to hereafter simply as “SNVs”) with MAF < 1% as well as structural variants (SVs). Here, we describe the data and methods that went into creating TOP-LD along with specific examples of how TOP-LD can provide essential information that is missed by HaploReg and LDlink.
We used TOPMed WGS data from the following four cohorts: BioMe Biobank (BioMe), the Multi-Ethnic Study of Atherosclerosis (MESA), the Jackson Heart Study (JHS), and the Women’s Health Initiative (WHI). We aimed to provide LD estimates for genetically homogeneous groups of individuals from one of the following four ancestral populations: European (EUR), African (AFR), East Asian (EAS), and South Asian (SAS). To select appropriate samples, we first inferred local and global ancestry for all participants in these four cohorts by using RFMix, with reference populations including five ancestral groups, namely African, Native American, East Asian, European, and South Asian. After local ancestry inference, we then retained only TOPMed samples with >90% estimated ancestry from a single population, as estimated via RFMix. We further removed related individuals by using a stringent kinship coefficient threshold of 2^−5.5 obtained via PC-Relate. This threshold of 2^−5.5 removes pairs of relatives up to and including the fourth degree. The final dataset included 1,335 unrelated individuals of African, 844 of East Asian, 13,160 of European, and 239 of South Asian ancestry for pairwise LD inference. Regarding variants, we started with all TOPMed freeze 8 polymorphic variants that passed quality control and retained multi-allelic variants or multiple entries at the same position, resulting in a total of 23.0–153.0 million SNVs in each of the ancestral groups (Figure 1A, Table S1).
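The sample-selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample IDs, ancestry fractions, and kinship values are invented, the helper names are hypothetical, and the real pipeline used RFMix and PC-Relate output rather than hand-written dictionaries. The expected kinship of 2^−(d+1) for degree-d relatives is the standard textbook value, included for orientation only; estimated kinship varies around it.

```python
# Illustrative sketch; sample IDs, field names, and values are hypothetical.

KINSHIP_THRESHOLD = 2 ** -5.5   # roughly 0.0221
MIN_ANCESTRY_FRACTION = 0.90


def expected_kinship(degree):
    """Expected kinship coefficient for degree-d relatives: 2^-(d+1)."""
    return 2.0 ** -(degree + 1)


def single_ancestry_label(fractions):
    """Return the population whose global ancestry fraction exceeds 90%, or None."""
    label, top = max(fractions.items(), key=lambda item: item[1])
    return label if top > MIN_ANCESTRY_FRACTION else None


def related_pairs(kinship):
    """Return the sample pairs whose estimated kinship exceeds the threshold."""
    return sorted(pair for pair, phi in kinship.items() if phi > KINSHIP_THRESHOLD)


samples = {
    "s1": {"AFR": 0.97, "EUR": 0.03},
    "s2": {"AFR": 0.60, "EUR": 0.40},   # admixed: no single ancestry above 90%
    "s3": {"EUR": 0.99, "AFR": 0.01},
}
kinship = {("s1", "s3"): 0.001, ("s1", "s4"): 0.031, ("s3", "s5"): 0.24}

labels = {sid: single_ancestry_label(fr) for sid, fr in samples.items()}
flagged = related_pairs(kinship)

print(f"threshold = {KINSHIP_THRESHOLD:.4f}")
for degree in range(1, 7):
    print(degree, round(expected_kinship(degree), 4))
print(labels)
print(flagged)
```

How one member of each flagged pair is chosen for removal is not described in the text, so the sketch stops at flagging the pairs.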

Figure 1

Number of variants included in TOP-LD

(A) Comparison of autosomal variants with HaploReg 4.0 by population. Blue bars on the left show the total number of autosomal variants in HaploReg 4.0. Green and red indicate common (MAF ≥ 1%) and uncommon (MAF < 1%) autosomal variants in TOP-LD. Note that HaploReg 4.0 provides LD for ASN (Asian) with no separate information for EAS and SAS. Therefore, we used the same 13.7 million ASN variants for comparison in both EAS and SAS.

(B) Number of autosomal variants in TOP-LD broken down by LD R2 threshold. The majority of the variants have at least one LD proxy with R2 ≥ 0.8.

(C) Number of chrX variants in TOP-LD broken down by LD R2 threshold.

(Note: LD information downloaded from HaploReg 4.0 does not contain chromosome X. Therefore, we compared TOP-LD with HaploReg 4.0 only for autosomal variants.)

We inferred LD separately within each of the four ancestral groups, for all pairs of variants within 1 Mb of each other, and retained LD pairs meeting a minimum R2 threshold of 0.2. The reported R2 between two variants is the squared Pearson correlation coefficient between their phased haplotypes, where phasing was performed with Eagle 2.4 for all polymorphic variants, similar to phasing of the freeze 5 data. No minimum minor allele count threshold was applied; that is, even singletons in our sample were included in LD calculations. We also report the direction of each association as either positive (+) or negative (−) on the basis of the sign of the Pearson correlation coefficient between the corresponding pair of reference (REF) alleles. In addition to R2, we also report D-prime statistics for each pair of variants meeting the R2 threshold of 0.2.

We filtered chromosome X to exclude the pseudo-autosomal regions: PAR1 (bp 10,001–2,781,479, GRCh38) and PAR2 (bp 155,701,383–156,030,895, GRCh38). Variants that were not coded as homozygous in the males were excluded from the LD calculations. We inferred LD for the remaining variants by using a total of 2F + M haplotypes, where F and M are the numbers of females and males, respectively.

The TOPMed structural variant (SV) call-set freeze 1 was merged with a reduced TOPMed SNV call-set where SNVs with MAF < 0.1% were filtered out before merging, and then the merged SV-SNV dataset was phased with Eagle2. SVs with >10% missingness were removed prior to phasing. For each ancestry group, we included 16.5–79K SVs (deletions, duplications, and inversions), with the majority being lower frequency (e.g., 7–69K with MAF < 1%) (Table 1). LD values were subsequently estimated as the squared Pearson correlation coefficient between the corresponding pair of phased alleles.
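As a minimal sketch of the statistics described above, the following Python computes R2, D′, and the sign of LD for one pair of biallelic variants from phased haplotypes coded 0/1. It is not the TOP-LD pipeline: the function names and the toy haplotypes are hypothetical, D′ uses the standard |D|/Dmax definition (the text does not spell out its formula), and the sign is taken from the covariance D, which is the same whether alleles are coded by REF or by ALT.

```python
# Illustrative sketch; function names and toy haplotypes are hypothetical.

def ld_stats(hap_a, hap_b):
    """Return (r2, d_prime, sign) for two variants observed on the same
    phased haplotypes, each coded as a list of 0/1 alleles."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and equal length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b                          # covariance of the two alleles
    var = p_a * (1 - p_a) * p_b * (1 - p_b)
    if var == 0:
        raise ValueError("both variants must be polymorphic")
    r2 = d * d / var                              # squared Pearson correlation
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    return r2, d_prime, "+" if d >= 0 else "-"


def chrx_haplotype_count(females, males):
    """Non-PAR chromosome X: two haplotypes per female, one per male."""
    return 2 * females + males


hap_a = [1, 1, 0, 0, 1, 0, 0, 0]
hap_b = [1, 1, 0, 0, 0, 0, 0, 0]
r2, d_prime, sign = ld_stats(hap_a, hap_b)
keep_pair = r2 >= 0.2                             # minimum R2 retained in TOP-LD
print(round(r2, 4), d_prime, sign, keep_pair)
print(chrx_haplotype_count(3, 2))
```

Because singletons were included, many rare-variant pairs behave like the toy example: D′ equals 1 whenever one allele occurs only on haplotypes carrying the other, while R2 stays below 1 unless the two allele frequencies match.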

Table 1

Summary of SVs by population

Population	Number of SVs	Number of SVs in LD w/SNVs (a)	Number of SVs with MAF < 0.01
EUR	79,004	16,301	69,011
AFR	44,859	15,151	27,978
SAS	16,511	10,392	7,292
EAS	20,789	7,498	12,902

(a) Number of SVs having at least one SNV LD tag with R2 ≥ 0.8.

TOPMed LD information was then loaded into the TOP-LD website, which is powered by a combination of MySQL, PHP, JavaScript, and Apache2 under the CloudSQL and Compute Engine of Google Cloud Platform. The web interface provides access to all precomputed LD estimates. Users can either paste variant(s) of interest or upload a file containing them, and can specify the population (East Asian, European, African, or South Asian) in which LD was estimated. In TOP-LD, markers are identified by rsID, chr:position, or chr:position:REF:ALT for SNVs, or by TOPMed variant names for SVs (in the format of DEL/DUP/INV_chr:startPosition-endPosition, for example, DEL_10:85001–97300). TOP-LD returns all variants within a pre-specified LD threshold (ranging from R2 values of 0.2 to 1.0) with the query variant. TOP-LD supports fast batch queries (Figure 2); querying a single variant takes ∼0.5 s, while a batch query of 500 variants takes ∼2.3 s. TOP-LD currently allows a maximum of 500 variants in one query.
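A user preparing a query list might pre-check identifiers and split long lists along these lines. This is a hypothetical client-side sketch, not TOP-LD's own parser: the regular expressions are assumptions inferred from the formats named above, including the optional `chr` prefix, the restriction to chromosomes 1–22 and X, and the acceptance of either a hyphen or an en dash in SV names.

```python
import re

# Hypothetical patterns inferred from the identifier formats described in the
# text; the website's actual validation rules may differ.
CHROM = r"(?:chr)?(?:[1-9]|1\d|2[0-2]|X)"
PATTERNS = [
    ("rsID", re.compile(r"^rs\d+$")),
    ("chr:position:REF:ALT", re.compile(rf"^{CHROM}:\d+:[ACGT]+:[ACGT]+$")),
    ("chr:position", re.compile(rf"^{CHROM}:\d+$")),
    # SV names: DEL/DUP/INV_chr:start-end (hyphen or en dash, U+2013)
    ("SV", re.compile(rf"^(?:DEL|DUP|INV)_{CHROM}:\d+[-\u2013]\d+$")),
]
MAX_VARIANTS_PER_QUERY = 500  # limit stated in the text


def classify_marker(marker):
    """Return the name of the first matching identifier format."""
    marker = marker.strip()
    for kind, pattern in PATTERNS:
        if pattern.match(marker):
            return kind
    return "unrecognized"


def split_into_queries(markers, size=MAX_VARIANTS_PER_QUERY):
    """Split a marker list into consecutive chunks of at most `size` items."""
    return [markers[i:i + size] for i in range(0, len(markers), size)]


examples = ["rs334", "22:24609759", "22:24649848:G:A", "DEL_10:85001-97300", "foo"]
kinds = [classify_marker(m) for m in examples]
chunks = split_into_queries([f"rs{i}" for i in range(1, 1201)])
print(kinds)
print([len(c) for c in chunks])
```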

Figure 2

Elapsed time (in seconds) for queries

The x axis represents the number of variants queried, and the y axis represents the elapsed time.

After submitting the query, the website auto-directs to a result page that contains two parts: LD information on the top panel and variant information on the bottom panel. The latter provides basic information for the queried variants, including position, marker name, alleles (REF and ALT), and minor allele frequency (MAF). Markers not in the database will have “none” for all fields except marker names. The LD panel displays related LD metrics, one pair of variants on each line, including R2, D′, and the sign of LD (measured between REF alleles of the two variants), along with marker name, marker position, alleles, and frequency for both variants in the pair (Figure 3). In addition, we provide the following pieces of information for SNVs from WGSA annotation: CADD score (phred-scaled), fathmm_XF_coding_or_noncoding classification, FANTOM5 enhancer annotations, gene name, and relative location to gene, as well as a link to GWAS catalog query results. For SVs, we provide a variety of annotations, including gene(s) overlapping the SV, the SV’s location relative to the gene, the gene’s pLI score, and overlapping candidate cis-regulatory regions (cCREs) from ENCODE SCREEN. The query results can be sorted, searched, copied, exported, and printed for further analyses.

Figure 3

An example query result

The result contains two parts. The top part, “LD information from AFR,” shows the LD information, where each line describes the LD between a query variant (rsID1) and one of its LD proxies (rsID2). The bottom part, “variant information from AFR,” shows basic information for each query variant. From the bottom part, we know that the user’s query includes four variants: rs334, rs8008208820, rs2462498, and rs12219304. Variants not included in LD calculation will have “none” records. For instance, rs8008208820 in this example query was not included in LD inference and therefore has no LD proxies in the top part, simply because there are no data for it. Records from SV inference are in blue and those from SNV data are in orange. Some variants may appear twice because they are included in both the SNV LD calculation and the SV LD calculation. In this example, rs12219304 appears twice, with MAF 0.0558 from the SNV source (second-to-last record, in orange) and MAF 0.0543 from the SV source (last record, in blue).

The TOP-LD tool leverages TOPMed WGS data, whose much larger sample size and high-depth sequencing yield LD information for a much larger number of variants compared to the 1000 Genomes Project. As shown in Figure 1A and Table S1, TOP-LD offers a 2.6- to 9.1-fold increase in variant coverage compared to other state-of-the-art resources such as HaploReg 4.0 or LDlink. For example, for the European population, TOP-LD includes 146.5 million autosomal SNVs, while HaploReg 4.0 or LDlink contains 16.1 million variants. Not surprisingly, the vast majority of the variants in TOP-LD that are not in 1000 Genomes, contributing to the up to 9.1× increase, are low frequency or rare. For example, out of the 146.5 million autosomal SNVs cataloged in the TOP-LD European population, 137.8 million have MAF < 0.01 (Figure 1A, Table S1). Most of the variants have LD proxies. For example, 115.1 million of the 146.5 million autosomal variants (78.6%) have at least one LD tag with R2 ≥ 0.8, and if we further relax the R2 threshold to 0.5 and 0.2, the number increases to 135.3 million (92.4%) and 143.5 million (98.0%), respectively (Figure 1B).

For chromosome X, we have included 6.5 million, 2.4 million, 1.3 million, and 760,000 variants for the European, African, East Asian, and South Asian populations, respectively (Table S1). Similar to the autosomal variants, the majority of these variants have at least one LD proxy with R2 ≥ 0.8: 5.1 million, European; 2.1 million, African; 1.2 million, East Asian; 690,000, South Asian (Figure 1C, Table S2).

To evaluate the consistency between TOP-LD estimates and those from HaploReg v4.1, we collected the set of overlapping variants based on rsID with MAF ≥ 0.05 for Europeans and Africans. This set of variants was further filtered such that the MAF values were within 10% of each other because large MAF differences would induce large LD differences. Figures S1 and S2 show a high level of agreement between TOP-LD and HaploReg v4.1 LD estimates (e.g., Pearson correlation = 0.972 and 0.962 for European and African chromosome 1, respectively). Similarly, comparison of the chromosome X TOP-LD estimates for females and males again shows a high level of consistency (Pearson correlation = 0.992 and 0.975 for the European and African populations, respectively) (Figures S3 and S4).

To demonstrate the utility of TOP-LD, we performed fine-mapping at the GGT1 locus on chromosome 22, which is known to be associated with gamma glutamyltransferase. We performed sequential conditional analysis with EPACTS by using individual-level data among 8,768 UK Biobank participants of African ancestry, following the same strategy and adjusting for the same covariates as in our previous work (Sun et al.). The sequential conditional analyses with individual-level data identified seven distinct signals at the GGT1 locus associated with gamma glutamyltransferase (Table 2). Because we used individual-level data for this conditional analysis, we considered these seven distinct signals to be the “working truth.”

Table 2

Summary statistics of the distinct working-truth signals at the GGT1 locus associated with gamma glutamyltransferase

Signal	Variant	Position (hg38)	Effect allele	Unconditional p value	p value conditional on previous signals (a)	Effect allele frequency
1	rs4049904	24609759	G	2.82e−61	N/A	10.27%
2	rs73404962	24598530	G	4.46e−29	2.00e−36	5.63%
3	rs743369	24588099	A	9.94e−36	7.51e−27	11.94%
4	rs6004193	24598329	C	4.23e−41	3.25e−19	18.27%
5	rs57719575	24609020	C	3.97e−38	1.98e−24	14.86%
6	rs3876101	24607291	A	2.66e−15	1.17e−13	35.45%
7	rs116161010	24585912	T	5.69e−17	7.70e−9	7.13%

(a) The p values are reported from the sequential conditional analysis. For example, we report the p value for rs73404962 conditional on rs4049904, the p value of rs743369 conditional on both rs4049904 and rs73404962, and so forth.

We then carried out fine-mapping analysis with the FINEMAP method by using only GWAS summary statistics from Sun et al. We applied FINEMAP with an LD reference either from TOP-LD or from the 1000 Genomes Project and assessed the performance by comparing the results with “working truth” established from the sequential conditional analysis of the individual-level data. FINEMAP produced 95% credible sets containing five variants when using either the 1000 Genomes (1000G) Project LD panel or the TOP-LD panel (see Table 3). However, the 1000G-based credible set contained only one of the seven signals from the “working truth” set. In contrast, the TOP-LD-based credible set contained three of the seven signals from the “working truth” set. In addition, because the lead variant from each conditional analysis (corresponding to each distinct signal) is selected somewhat arbitrarily, we also considered their LD proxies. When we considered any LD proxy (using a lenient R2 threshold of 0.2) of a variant in the working truth set, the 1000G-based results still only identified a single signal from the working truth, whereas the TOP-LD-based results identified four of the seven signals (Table 3).

Table 3

FINEMAP credible-set variants

		Variant 1	Variant 2	Variant 3	Variant 4	Variant 5
1000G reference	credible-set variant	rs4049904	rs147866692	rs570263050	rs115231893	22:24649848:G:A (hg38)
1000G reference	LD with working truth	1 (w/rs4049904 itself)	0.464 (w/rs4049904)	0.606 (w/rs4049904)	0.275 (w/rs4049904)	0.434 (w/rs4049904)
TOP-LD reference	credible-set variant	rs4049904	rs743369	rs57719575	rs2073397	rs5751902
TOP-LD reference	LD with working truth	1 (w/rs4049904 itself)	1 (w/rs743369 itself)	1 (w/rs57719575 itself)	0.83 (w/rs6004193)	0.51 (w/rs6004193)

The two five-variant credible sets provided by FINEMAP with either 1000G or TOP-LD as reference. For each credible-set variant, we list the corresponding variant (and the LD Rsq) from the working truth that has the highest LD.
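The counts quoted in the text can be reproduced from Tables 2 and 3 with a short tally. The sketch below is illustrative only: the function and variable names are hypothetical, and because Table 3 lists just the single best-tagged working-truth variant for each credible-set variant, the proxy count here is a simplification of "any LD proxy with R2 ≥ 0.2".

```python
# Working-truth lead variants (Table 2).
working_truth = [
    "rs4049904", "rs73404962", "rs743369", "rs6004193",
    "rs57719575", "rs3876101", "rs116161010",
]

# (credible-set variant, best-tagged working-truth variant, R2) from Table 3.
credible_sets = {
    "1000G": [
        ("rs4049904", "rs4049904", 1.0),
        ("rs147866692", "rs4049904", 0.464),
        ("rs570263050", "rs4049904", 0.606),
        ("rs115231893", "rs4049904", 0.275),
        ("22:24649848:G:A", "rs4049904", 0.434),
    ],
    "TOP-LD": [
        ("rs4049904", "rs4049904", 1.0),
        ("rs743369", "rs743369", 1.0),
        ("rs57719575", "rs57719575", 1.0),
        ("rs2073397", "rs6004193", 0.83),
        ("rs5751902", "rs6004193", 0.51),
    ],
}


def signals_recovered(rows, truth, min_r2=0.2):
    """Return (signals hit directly, signals hit directly or via an LD proxy)."""
    truth = set(truth)
    direct = {variant for variant, _, _ in rows if variant in truth}
    via_proxy = {tagged for _, tagged, r2 in rows if tagged in truth and r2 >= min_r2}
    return len(direct), len(via_proxy)


results = {ref: signals_recovered(rows, working_truth) for ref, rows in credible_sets.items()}
for ref, (direct, proxy) in results.items():
    print(f"{ref}: {direct} direct, {proxy} with proxies, of {len(working_truth)} signals")
```

With the 1000G reference all five credible-set variants tag the same signal (rs4049904), whereas the two additional TOP-LD variants both tag rs6004193, which adds a fourth signal.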

We also used TOP-LD to aid in the identification and prioritization of potentially causal structural variants at GWAS loci. For example, our recent association analysis with TOPMed data identified an African-specific (MAF = 0.129) variant rs28450540 associated with lower monocyte count (p = 3.65 × 10^−17). A query for LD tags via TOP-LD revealed a ∼600 bp deletion near S1PR3 in perfect LD (R2 = 1) with rs28450540 in the African population. We performed genome editing in monocytic and primary human HSPCs followed by xenotransplantation, which provides evidence that the deletion disrupts an S1PR3 monocyte enhancer, leading to decreased S1PR3 expression. These preliminary data from functional experiments suggest that the 600 bp deletion is most likely causal but would have been missed in standard association analysis with only SNVs. TOP-LD offers a simple and efficient approach to rescue such putative causal structural variants.

LD information, reflecting recombination, natural selection, and demographic history, has always been of intense interest in population genetics and complex trait association studies. LD information is also indispensable for a wide range of other applications, including GWAS follow-up and many summary-statistics-based inferences including fine-mapping, imputation of association summary statistics, construction of polygenic risk scores (PRSs), and interpretation and prioritization of GWAS results for further functional and clinical studies. TOP-LD significantly boosts the coverage of lower frequency variants by harnessing the power of high-coverage (∼30×) WGS data of over 15,000 individuals primarily of a single continental ancestry. We demonstrate the utility of TOP-LD in fine-mapping at the GGT1 locus and variant prioritization at the S1PR3 locus. The LD information provided by TOP-LD will facilitate a range of essential inferences for common and rare variation across a diverse range of populations.

References

1. Model-free Estimation of Recent Genetic Relatedness.

Authors: Matthew P Conomos; Alexander P Reiner; Bruce S Weir; Timothy A Thornton
Journal: Am J Hum Genet Date: 2016-01-07 Impact factor: 11.025

2. Variance component model to account for sample structure in genome-wide association studies.

Authors: Hyun Min Kang; Jae Hoon Sul; Susan K Service; Noah A Zaitlen; Sit-Yee Kong; Nelson B Freimer; Chiara Sabatti; Eleazar Eskin
Journal: Nat Genet Date: 2010-03-07 Impact factor: 38.330

3. Integrating common and rare genetic variation in diverse human populations.

Authors: David M Altshuler; Richard A Gibbs; Leena Peltonen; David M Altshuler; Richard A Gibbs; Leena Peltonen; Emmanouil Dermitzakis; Stephen F Schaffner; Fuli Yu; Leena Peltonen; Emmanouil Dermitzakis; Penelope E Bonnen; David M Altshuler; Richard A Gibbs; Paul I W de Bakker; Panos Deloukas; Stacey B Gabriel; Rhian Gwilliam; Sarah Hunt; Michael Inouye; Xiaoming Jia; Aarno Palotie; Melissa Parkin; Pamela Whittaker; Fuli Yu; Kyle Chang; Alicia Hawes; Lora R Lewis; Yanru Ren; David Wheeler; Richard A Gibbs; Donna Marie Muzny; Chris Barnes; Katayoon Darvishi; Matthew Hurles; Joshua M Korn; Kati Kristiansson; Charles Lee; Steven A McCarrol; James Nemesh; Emmanouil Dermitzakis; Alon Keinan; Stephen B Montgomery; Samuela Pollack; Alkes L Price; Nicole Soranzo; Penelope E Bonnen; Richard A Gibbs; Claudia Gonzaga-Jauregui; Alon Keinan; Alkes L Price; Fuli Yu; Verneri Anttila; Wendy Brodeur; Mark J Daly; Stephen Leslie; Gil McVean; Loukas Moutsianas; Huy Nguyen; Stephen F Schaffner; Qingrun Zhang; Mohammed J R Ghori; Ralph McGinnis; William McLaren; Samuela Pollack; Alkes L Price; Stephen F Schaffner; Fumihiko Takeuchi; Sharon R Grossman; Ilya Shlyakhter; Elizabeth B Hostetter; Pardis C Sabeti; Clement A Adebamowo; Morris W Foster; Deborah R Gordon; Julio Licinio; Maria Cristina Manca; Patricia A Marshall; Ichiro Matsuda; Duncan Ngare; Vivian Ota Wang; Deepa Reddy; Charles N Rotimi; Charmaine D Royal; Richard R Sharp; Changqing Zeng; Lisa D Brooks; Jean E McEwen
Journal: Nature Date: 2010-09-02 Impact factor: 49.962

4. Analyses of biomarker traits in diverse UK biobank participants identify associations missed by European-centric analysis strategies.

Authors: Yun Li; Laura M Raffield; Quan Sun; Misa Graff; Bryce Rowland; Jia Wen; Le Huang; Tyne W Miller-Fleming; Jeffrey Haessler; Michael H Preuss; Jin-Fang Chai; Moa P Lee; Christy L Avery; Ching-Yu Cheng; Nora Franceschini; Xueling Sim; Nancy J Cox; Charles Kooperberg; Kari E North
Journal: J Hum Genet Date: 2021-08-11 Impact factor: 3.755

5. Allelic Heterogeneity at the CRP Locus Identified by Whole-Genome Sequencing in Multi-ancestry Cohorts.

Authors: Laura M Raffield; Apoorva K Iyengar; Biqi Wang; Sheila M Gaynor; Cassandra N Spracklen; Xue Zhong; Madeline H Kowalski; Shabnam Salimi; Linda M Polfus; Emelia J Benjamin; Joshua C Bis; Russell Bowler; Brian E Cade; Won Jung Choi; Alejandro P Comellas; Adolfo Correa; Pedro Cruz; Harsha Doddapaneni; Peter Durda; Stephanie M Gogarten; Deepti Jain; Ryan W Kim; Brian G Kral; Leslie A Lange; Martin G Larson; Cecelia Laurie; Jiwon Lee; Seonwook Lee; Joshua P Lewis; Ginger A Metcalf; Braxton D Mitchell; Zeineen Momin; Donna M Muzny; Nathan Pankratz; Cheol Joo Park; Stephen S Rich; Jerome I Rotter; Kathleen Ryan; Daekwan Seo; Russell P Tracy; Karine A Viaud-Martinez; Lisa R Yanek; Lue Ping Zhao; Xihong Lin; Bingshan Li; Yun Li; Josée Dupuis; Alexander P Reiner; Karen L Mohlke; Paul L Auer
Journal: Am J Hum Genet Date: 2019-12-26 Impact factor: 11.025

6. Whole-genome sequencing in diverse subjects identifies genetic correlates of leukocyte traits: The NHLBI TOPMed program.

Authors: Anna V Mikhaylova; Caitlin P McHugh; Linda M Polfus; Laura M Raffield; Meher Preethi Boorgula; Thomas W Blackwell; Jennifer A Brody; Jai Broome; Nathalie Chami; Ming-Huei Chen; Matthew P Conomos; Corey Cox; Joanne E Curran; Michelle Daya; Lynette Ekunwe; David C Glahn; Nancy Heard-Costa; Heather M Highland; Brian D Hobbs; Yann Ilboudo; Deepti Jain; Leslie A Lange; Tyne W Miller-Fleming; Nancy Min; Jee-Young Moon; Michael H Preuss; Jonathon Rosen; Kathleen Ryan; Albert V Smith; Quan Sun; Praveen Surendran; Paul S de Vries; Klaudia Walter; Zhe Wang; Marsha Wheeler; Lisa R Yanek; Xue Zhong; Goncalo R Abecasis; Laura Almasy; Kathleen C Barnes; Terri H Beaty; Lewis C Becker; John Blangero; Eric Boerwinkle; Adam S Butterworth; Sameer Chavan; Michael H Cho; Hélène Choquet; Adolfo Correa; Nancy Cox; Dawn L DeMeo; Nauder Faraday; Myriam Fornage; Robert E Gerszten; Lifang Hou; Andrew D Johnson; Eric Jorgenson; Robert Kaplan; Charles Kooperberg; Kousik Kundu; Cecelia A Laurie; Guillaume Lettre; Joshua P Lewis; Bingshan Li; Yun Li; Donald M Lloyd-Jones; Ruth J F Loos; Ani Manichaikul; Deborah A Meyers; Braxton D Mitchell; Alanna C Morrison; Debby Ngo; Deborah A Nickerson; Suraj Nongmaithem; Kari E North; Jeffrey R O'Connell; Victor E Ortega; Nathan Pankratz; James A Perry; Bruce M Psaty; Stephen S Rich; Nicole Soranzo; Jerome I Rotter; Edwin K Silverman; Nicholas L Smith; Hua Tang; Russell P Tracy; Timothy A Thornton; Ramachandran S Vasan; Joe Zein; Rasika A Mathias; Alexander P Reiner; Paul L Auer
Journal: Am J Hum Genet Date: 2021-09-27 Impact factor: 11.043

7. Genetic analysis in European ancestry individuals identifies 517 loci associated with liver enzymes.

Authors: Raha Pazoki; Marijana Vujkovic; Benjamin F Voight; Kyong-Mi Chang; Mark R Thursz; Paul Elliott; Joshua Elliott; Evangelos Evangelou; Dipender Gill; Mohsen Ghanbari; Peter J van der Most; Rui Climaco Pinto; Matthias Wielscher; Matthias Farlik; Verena Zuber; Robert J de Knegt; Harold Snieder; André G Uitterlinden; Julie A Lynch; Xiyun Jiang; Saredo Said; David E Kaplan; Kyung Min Lee; Marina Serper; Rotonya M Carr; Philip S Tsao; Stephen R Atkinson; Abbas Dehghan; Ioanna Tzoulaki; M Arfan Ikram; Karl-Heinz Herzig; Marjo-Riitta Järvelin; Behrooz Z Alizadeh; Christopher J O'Donnell; Danish Saleheen
Journal: Nat Commun Date: 2021-05-10 Impact factor: 14.919

8. HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease.

Authors: Lucas D Ward; Manolis Kellis
Journal: Nucleic Acids Res Date: 2015-12-10 Impact factor: 16.971

9. Reference-based phasing using the Haplotype Reference Consortium panel.

Authors: Po-Ru Loh; Petr Danecek; Pier Francesco Palamara; Christian Fuchsberger; Yakir A Reshef; Hilary K Finucane; Sebastian Schoenherr; Lukas Forer; Shane McCarthy; Goncalo R Abecasis; Richard Durbin; Alkes L Price
Journal: Nat Genet Date: 2016-10-03 Impact factor: 38.330

10. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.

Authors: Daniel Taliun; Daniel N Harris; Michael D Kessler; Jedidiah Carlson; Zachary A Szpiech; Raul Torres; Sarah A Gagliano Taliun; André Corvelo; Stephanie M Gogarten; Hyun Min Kang; Achilleas N Pitsillides; Jonathon LeFaive; Seung-Been Lee; Xiaowen Tian; Brian L Browning; Sayantan Das; Anne-Katrin Emde; Wayne E Clarke; Douglas P Loesch; Amol C Shetty; Thomas W Blackwell; Albert V Smith; Quenna Wong; Xiaoming Liu; Matthew P Conomos; Dean M Bobo; François Aguet; Christine Albert; Alvaro Alonso; Kristin G Ardlie; Dan E Arking; Stella Aslibekyan; Paul L Auer; John Barnard; R Graham Barr; Lucas Barwick; Lewis C Becker; Rebecca L Beer; Emelia J Benjamin; Lawrence F Bielak; John Blangero; Michael Boehnke; Donald W Bowden; Jennifer A Brody; Esteban G Burchard; Brian E Cade; James F Casella; Brandon Chalazan; Daniel I Chasman; Yii-Der Ida Chen; Michael H Cho; Seung Hoan Choi; Mina K Chung; Clary B Clish; Adolfo Correa; Joanne E Curran; Brian Custer; Dawood Darbar; Michelle Daya; Mariza de Andrade; Dawn L DeMeo; Susan K Dutcher; Patrick T Ellinor; Leslie S Emery; Celeste Eng; Diane Fatkin; Tasha Fingerlin; Lukas Forer; Myriam Fornage; Nora Franceschini; Christian Fuchsberger; Stephanie M Fullerton; Soren Germer; Mark T Gladwin; Daniel J Gottlieb; Xiuqing Guo; Michael E Hall; Jiang He; Nancy L Heard-Costa; Susan R Heckbert; Marguerite R Irvin; Jill M Johnsen; Andrew D Johnson; Robert Kaplan; Sharon L R Kardia; Tanika Kelly; Shannon Kelly; Eimear E Kenny; Douglas P Kiel; Robert Klemmer; Barbara A Konkle; Charles Kooperberg; Anna Köttgen; Leslie A Lange; Jessica Lasky-Su; Daniel Levy; Xihong Lin; Keng-Han Lin; Chunyu Liu; Ruth J F Loos; Lori Garman; Robert Gerszten; Steven A Lubitz; Kathryn L Lunetta; Angel C Y Mak; Ani Manichaikul; Alisa K Manning; Rasika A Mathias; David D McManus; Stephen T McGarvey; James B Meigs; Deborah A Meyers; Julie L Mikulla; Mollie A Minear; Braxton D Mitchell; Sanghamitra Mohanty; May E Montasser; Courtney Montgomery; Alanna C Morrison; Joanne 
M Murabito; Andrea Natale; Pradeep Natarajan; Sarah C Nelson; Kari E North; Jeffrey R O'Connell; Nicholette D Palmer; Nathan Pankratz; Gina M Peloso; Patricia A Peyser; Jacob Pleiness; Wendy S Post; Bruce M Psaty; D C Rao; Susan Redline; Alexander P Reiner; Dan Roden; Jerome I Rotter; Ingo Ruczinski; Chloé Sarnowski; Sebastian Schoenherr; David A Schwartz; Jeong-Sun Seo; Sudha Seshadri; Vivien A Sheehan; Wayne H Sheu; M Benjamin Shoemaker; Nicholas L Smith; Jennifer A Smith; Nona Sotoodehnia; Adrienne M Stilp; Weihong Tang; Kent D Taylor; Marilyn Telen; Timothy A Thornton; Russell P Tracy; David J Van Den Berg; Ramachandran S Vasan; Karine A Viaud-Martinez; Scott Vrieze; Daniel E Weeks; Bruce S Weir; Scott T Weiss; Lu-Chen Weng; Cristen J Willer; Yingze Zhang; Xutong Zhao; Donna K Arnett; Allison E Ashley-Koch; Kathleen C Barnes; Eric Boerwinkle; Stacey Gabriel; Richard Gibbs; Kenneth M Rice; Stephen S Rich; Edwin K Silverman; Pankaj Qasba; Weiniu Gan; George J Papanicolaou; Deborah A Nickerson; Sharon R Browning; Michael C Zody; Sebastian Zöllner; James G Wilson; L Adrienne Cupples; Cathy C Laurie; Cashell E Jaquish; Ryan D Hernandez; Timothy D O'Connor; Gonçalo R Abecasis
Journal: Nature Date: 2021-02-10 Impact factor: 69.504
