Arthur Gilly1,2, Lorraine Southam1,3, Daniel Suveges1, Karoline Kuchenbaecker1, Rachel Moore1, Giorgio E M Melloni4, Konstantinos Hatzikotoulas1,2, Aliki-Eleni Farmaki5, Graham Ritchie1,6, Jeremy Schwartzentruber1, Petr Danecek1, Britt Kilian1, Martin O Pollard1, Xiangyu Ge1, Emmanouil Tsafantakis7, George Dedoussis5, Eleftheria Zeggini1,2.
Abstract
MOTIVATION: Very low-depth sequencing has been proposed as a cost-effective approach to capture low-frequency and rare variation in complex trait association studies. However, a full characterization of the genotype quality and association power for very low-depth sequencing designs is still lacking.
Year: 2019 PMID: 30576415 PMCID: PMC6662288 DOI: 10.1093/bioinformatics/bty1032
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Processing pipeline for the MANOLIS 1× data. Tools and parameters for the genotype refinement and phasing steps were selected after benchmarking 13 pipelines involving four different tools (see Section 4)
Fig. 2. Concordance and call rate for low-depth WGS genotypes. (a) Genotype (blue circles) and minor allele (yellow circles) concordance is computed for 1239 samples in MANOLIS (4× and 1×) against merged OmniExpress and ExomeChip data. Call rate is assessed for the refined (purple) and refined plus imputed (green) datasets. (b) Non-reference allele concordance (green circles) and PPV (fuchsia bars) are computed for 1127 MANOLIS samples with both 22× WGS and low-depth calls
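The two simplest quantities named in the Fig. 2 caption, call rate and genotype concordance, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes genotypes are encoded as alternate-allele counts (0, 1, 2) with `None` for a missing call, and that concordance is the fraction of identical genotypes among positions called in both datasets. The function names `call_rate` and `genotype_concordance` are hypothetical.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # assumed encoding: 0, 1 or 2 alternate alleles; None = no call


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of genotypes that are not missing."""
    if not genotypes:
        return float("nan")
    return sum(g is not None for g in genotypes) / len(genotypes)


def genotype_concordance(test: Sequence[Genotype], truth: Sequence[Genotype]) -> float:
    """Fraction of identical genotypes among positions called in both datasets.

    Assumed definition for illustration; the source does not spell out its formula.
    """
    pairs = [(a, b) for a, b in zip(test, truth) if a is not None and b is not None]
    if not pairs:
        return float("nan")
    return sum(a == b for a, b in pairs) / len(pairs)


# Toy example: low-depth calls compared against array genotypes for one sample.
low_depth = [0, 1, 2, None]
array = [0, 1, 1, 2]
print(call_rate(low_depth))
print(genotype_concordance(low_depth, array))
```

In the toy example, three of four positions are called, and two of the three jointly called genotypes agree.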
Fig. 3. Unique variants called by sequencing and imputed GWAS. Variants unique to either dataset, arranged by MAF bin. Both datasets are unfiltered apart from monomorphics, which are excluded. MAF categories: rare (MAF<1%), low-frequency (MAF 1–5%), common (MAF>5%)
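The MAF categories in the Fig. 3 caption can be written as a small Python helper. This is an illustrative sketch: the function name `maf_category` is hypothetical, MAF is assumed to be a proportion in (0, 0.5], and the boundary values are assigned as the caption implies (exactly 1% and exactly 5% both fall in the low-frequency bin). Monomorphic variants (MAF of 0) are rejected, mirroring their exclusion in the caption.

```python
def maf_category(maf: float) -> str:
    """Assign a variant to a MAF bin: rare (<1%), low-frequency (1-5%), common (>5%)."""
    if not 0.0 < maf <= 0.5:
        # Monomorphic variants (MAF == 0) are excluded; a MAF above 0.5 is not a minor allele frequency.
        raise ValueError(f"MAF must be in (0, 0.5], got {maf}")
    if maf < 0.01:
        return "rare"
    if maf <= 0.05:
        return "low-frequency"
    return "common"


for value in (0.005, 0.03, 0.2):
    print(value, maf_category(value))
```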
Fig. 5. Association signals in the 1× WGS and imputed GWAS at P<5×10^-7 for 57 quantitative traits in the 1225 samples with both imputed GWAS and low-depth WGS. Purple dots represent significant results in the imputed GWAS (a) and the 1× WGS (b) analysis. Orange dots, if present, denote the P-value of the same SNP in the other study. Blue dots represent the association P-value in a larger (n=1457) association study based on 22× WGS. Signals with a 22× WGS P-value above 5×10^-5 were considered as false-positives in both studies and excluded from the plot. Red dashes indicate the minimum P-value among all tagging SNVs in the other dataset (r^2>0.8). Absence of an orange dot and/or a red dash means that the variant was not present and/or no tagging variant could be found for that signal in the other study