Jasmine A McQuerry, Merry Mclaird, Samantha N Hartin, John C Means, Jeffrey Johnston, Tomi Pastinen, Scott T Younger.
Abstract
Clinical whole genome sequencing has enabled the discovery of potentially pathogenic noncoding variants in the genomes of rare disease patients with a prior history of negative genetic testing. However, interpreting the functional consequences of noncoding variants and distinguishing those that contribute to disease etiology remains a challenge. Here we address this challenge by experimentally profiling the functional consequences of rare noncoding variants detected in a cohort of undiagnosed rare disease patients at scale using a massively parallel reporter assay. We demonstrate that this approach successfully identifies rare noncoding variants that alter the regulatory capacity of genomic sequences. In addition, we describe an integrative analysis that utilizes genomic features alongside patient clinical data to further prioritize candidate variants with an increased likelihood of pathogenicity. This work represents an important step towards establishing a framework for the functional interpretation of clinically detected noncoding variants.
Year: 2022 PMID: 35534523 PMCID: PMC9085742 DOI: 10.1038/s41598-022-11589-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Identification of rare genetic variants from an undiagnosed rare disease cohort. (a) Gender composition of proband cohort. (b) Age distribution of proband cohort. (c) Prevalence of clinical features associated with selected patients. (d) Frequency of rare variants detected within 100 kb of a TSS. (e) Distance to nearest TSS for detected rare variants.
Figure 2. Profiling the regulatory capacity of genomic sequences using a massively parallel reporter assay. (a) Schematic of MPRA oligonucleotide library design. (b) Schematic of MPRA vector design and cloning. (c) Abundance of MPRA library elements corresponding to genomic sequences represented in the MPRA plasmid pool (pDNA, technical replicates) and expressed in HEK293T (cDNA, biological replicates). (d) Reproducibility of MPRA library element detection across biological replicates. (e) Regulatory activity of reference genome sequences profiled in MPRA (red = significant regulatory activity). (f) Fraction of reference genome sequences that display significant regulatory activity in MPRA. (g) Distance to nearest TSS for reference genome sequences that display significant regulatory activity.
Figure 3. Rare genetic variants alter the regulatory capacity of genomic sequences. (a) Distribution of expression differences between reference and variant alleles profiled in MPRA (red points denote variants with an expression difference |z-score| > 2). (b) Fraction of profiled rare variants that alter regulatory capacity of genomic sequences. (c) Distance to nearest TSS for variants that alter the regulatory capacity of genomic sequences (|z-score| > 2). (d) Heatmap showing regulatory capacity of all possible alleles corresponding to de novo variants that display altered regulatory capacity (|z-score| > 2). (e) Aggregated regulatory capacity of genomic sequences shown in (d) by allele type.
Figure 4. Integrative prediction of rare variant pathogenicity. (a) Significance analysis of HPO term overlap between probands and variant-associated transcription factors. (b) Schematic of transcription factor binding at candidate variant position. (c) MPRA analysis of candidate variant. (d) Volcano plot showing transcriptome-wide expression changes following CRISPRi-mediated repression of KDM5B. (e) Volcano plot showing transcriptome-wide expression changes following CRISPRi-mediated repression of variant site. (f) Significance analysis of overlapping expression changes following KDM5B or variant site repression. (g) MPRA analysis, transcription factor binding, and overlapping HPO terms for candidate variants.
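The Figure 3 captions flag variants whose expression difference between reference and variant alleles has |z-score| > 2. This record does not describe how the authors normalized counts or computed the z-scores, so the following is only a minimal sketch of one common way such a threshold might be applied. It assumes that activity is the log2 ratio of cDNA to pDNA counts and that z-scores are taken over the distribution of allele differences. The function names, the pseudocount, and the input layout are all hypothetical.

```python
import math
from statistics import mean, stdev


def log2_activity(cdna_count, pdna_count, pseudocount=1.0):
    """Assumed activity measure: log2 of expressed (cDNA) over plasmid (pDNA)
    abundance. The pseudocount is a hypothetical guard against zero counts."""
    return math.log2((cdna_count + pseudocount) / (pdna_count + pseudocount))


def flag_altered_variants(activities, threshold=2.0):
    """Return {variant_id: z_score} for variants with |z| > threshold.

    `activities` maps a variant id to (reference_activity, variant_activity).
    Z-scores are computed over all variant-minus-reference differences, which
    is an assumption and not necessarily the method used in the paper.
    """
    diffs = {vid: alt - ref for vid, (ref, alt) in activities.items()}
    mu = mean(diffs.values())
    sigma = stdev(diffs.values())  # sample standard deviation
    z_scores = {vid: (d - mu) / sigma for vid, d in diffs.items()}
    return {vid: z for vid, z in z_scores.items() if abs(z) > threshold}
```

Under these assumptions, a variant is reported only when its allele difference lies more than two standard deviations from the mean difference across all profiled variants, which mirrors the threshold stated in the captions without reproducing the authors' pipeline.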