Omar Soukarieh, Caroline Meguerditchian, Carole Proust, Dylan Aïssi, Mélanie Eyries, Aurélie Goyenvalle, David-Alexandre Trégouët.
Abstract
High-throughput sequencing (HTS) technologies are revolutionizing the research and molecular diagnosis landscape by allowing the exploration of millions of nucleotide sequences at an unprecedented scale. These technologies are of particular interest in the identification of genetic variations contributing to the risk of rare (Mendelian) and common (multifactorial) human diseases. So far, they have led to numerous successes in identifying rare disease-causing mutations in coding regions, but few in non-coding regions that include introns, untranslated (UTR), and intergenic regions. One class of neglected non-coding variations is that of 5'UTR variants that alter upstream open reading frames (upORFs) of the coding sequence (CDS) of a natural protein coding transcript. Following a brief summary of the molecular bases of the origin and functions of upORFs, we will first review known 5'UTR variations altering upORFs and causing rare cardiovascular disorders (CVDs). We will then investigate whether upORF-affecting single nucleotide polymorphisms could be good candidates for explaining association signals detected in the context of genome-wide association studies for common complex CVDs.
Keywords: Mendelian disease; genome wide association analysis (GWAS); non-coding mutations; open reading frame (ORF); polymorphism
Year: 2022 PMID: 35387445 PMCID: PMC8977850 DOI: 10.3389/fcvm.2022.841032
Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X
FIGURE 1. Different types of upstream open reading frames (upORFs) located in the 5′UTR of coding transcripts. The upper, middle, and lower panels show the position on coding transcripts of a fully upstream ORF (uORF), an overlapping ORF (uoORF), and an elongated coding sequence (eCDS), respectively. The start and stop codons associated with each upORF are indicated by green and red circles, respectively. AUG is the canonical start codon; UAA, UAG, and UGA are the stop codons. TSS, transcription start site; UTR, untranslated region; CDS, coding sequence.
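The three upORF types in the caption can be told apart from sequence alone: read codon by codon from the upstream start until the first in-frame stop, then compare the stop position and the reading frame with the canonical start. The Python sketch below illustrates this under stated assumptions. The function name `classify_uporf`, the 0-based coordinates, and the toy sequences are illustrative and not taken from the article; lengths count both the start and the stop codon, which is an assumed convention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA spelling of UAA, UAG, UGA


def classify_uporf(transcript, uatg_pos, cds_start):
    """Classify the ORF opened by an upstream ATG (hypothetical helper).

    transcript: 5'->3' transcript sequence (DNA or RNA alphabet)
    uatg_pos:   0-based index of the A of the upstream ATG
    cds_start:  0-based index of the A of the canonical ATG
    Returns (kind, length_nt); length includes start and stop codons.
    """
    seq = transcript.upper().replace("U", "T")
    if not 0 <= uatg_pos < cds_start:
        raise ValueError("the upstream start must lie 5' of the canonical start")
    if seq[uatg_pos:uatg_pos + 3] != "ATG":
        raise ValueError("no ATG at uatg_pos")

    # The upstream start shares the CDS frame when the distance is a multiple of 3.
    in_frame = (cds_start - uatg_pos) % 3 == 0

    # Walk codon by codon until the first in-frame stop codon.
    for i in range(uatg_pos + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            length = i + 3 - uatg_pos
            if i + 3 <= cds_start:
                return "uORF", length   # stop ends before the canonical ATG
            if in_frame:
                return "eCDS", length   # runs into the CDS in the same frame
            return "uoORF", length      # overlaps the CDS in another frame
    return "unterminated", None         # no in-frame stop in the given sequence


# Toy sequences (5'UTR + CDS), invented for illustration only.
print(classify_uporf("ATGAAATAAGGC" + "ATGGCTTAA", 0, 12))  # ('uORF', 9)
print(classify_uporf("GGATGCC" + "ATGAGGCTCTAA", 2, 7))     # ('uoORF', 9)
print(classify_uporf("ATGGCC" + "ATGGCTTAA", 0, 6))         # ('eCDS', 15)
```

In the second toy sequence the upstream ATG sits five nucleotides before the canonical start, so the out-of-frame read ends at a TGA that overlaps the canonical ATG and gives a 9-nt uoORF.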
Rare upSNVs in CVD-related diseases.
| Gene (orientation) | cDNA position | Predicted effect | Disease | Databases | Classification (ClinVar) | References |
|---|---|---|---|---|---|---|
| | NM_000518.5 c.-29G>A | uoORF (42 nts) | β-Thalassaemia | ClinVar | Pathogenic | |
| | NM_000492.3 c.-34C>T | uoORF (108 nts) | Disseminated bronchiectasis | HGMD, ClinVar | Conflicting interpretations of pathogenicity | |
| | NM_001114753.3 c.-142A>T | uoORF (270 nts) | Hereditary Haemorrhagic Telangiectasia | NA | NA | |
| | NM_001114753.3 c.-127C>T | uoORF (255 nts) | Hereditary Haemorrhagic Telangiectasia | HGMD | Pathogenic/Likely pathogenic | |
| | NM_001114753.3 c.-10C>T | uoORF (138 nts) | Hereditary Haemorrhagic Telangiectasia | HGMD | Likely pathogenic | |
| | NM_001114753.3 c.-9G>A | eCDS (+ 3 nts) | Hereditary Haemorrhagic Telangiectasia | HGMD | Conflicting interpretations | |
| | NM_001114753.3 c.-79C>T | uoORF (207 nts) | NA | ClinVar | Uncertain significance | NA |
| | NM_000313.4 c.-39C>T | uoORF (156 nts) | Protein S deficiency | NA | NA | |
| | NM_000132.4 c.-5A>G | uoORF (63 nts) | Hemophilia A | HGMD | NA | |
| | NM_021175.4 c.-25G>A | uAUG | Juvenile Hereditary Hemochromatosis | HGMD | NA | |
| | NM_000527.5 c.-22delC | uoORF (174 nts) | Familial Hypercholesterolaemia | ClinVar | Uncertain significance | |
uoORF, upstream overlapping open reading frame; eCDS, elongated coding sequence; nts, nucleotides; NA, not available.
*This variant is reported in ClinVar without any clinical annotation (
**No in-frame stop codon predicted.
Common upSNVs in GWAS loci for CVDs and associated traits.
| upSNV | Gene (orientation) | cDNA position | Genomic position (GRCh38.p13) | Predicted functional effect | GWAS lead SNPs | r2/D’ | References |
|---|---|---|---|---|---|---|---|
| rs1801020 | | NM_000505.4 c.-4C>T | chr5:177409531 | ACG>ATG; uoORF = 9 nts; uSTOP = TGA | rs1801020 | 1.0/1 | |
| rs492571 | | NM_001286491.2 c.-487A>G | chr15:43919075 | ATA>ATG; uORF = 39 nts; uSTOP = TAA | rs492571 | 1.0/1 | |
| rs75699653 | | NM_001353683.2 c.-491C>T | chr1:156902203 | ACG>ATG; uORF = 63 nts; uSTOP = TGA | rs12566888 | ∼0.00/−1 | |
| rs58852338 | SLC18A1 (−1) | NM_001135691.3 c.-276G>A | chr8:20181901 | GTG>ATG; uORF = 36 nts; uSTOP = TAG | rs55682243 | ∼0.00/−1 | |
| | | NM_019113.4 c.-173C>G | chr19:48756064 | ATC>ATG; uORF = 36 nts; uSTOP = TGA | rs838133 | 0.04/−1 | |
| | | NM_032556.6 c.-143C>T | chr2:113072596 | ACG>ATG; eCDS = 603 nts; uSTOP = TAG | rs6761276 | 0.06/0.74 | |
| rs35137994 | ANGPTL4 (1) | NM_139314.3 c.-140C>T | chr19:8364182 | ACG>ATG; eCDS = 1362 nts; uSTOP = TAG | rs116843064 | ∼0.00/−1 | |
| rs3131003; rs3815087 | | NM_014068.3 c.-199G>A; c.-94G>A | chr6:31125705; chr6:31125810 | GTG>ATG; uORF = 183 and 78 nts; same uSTOP = TGA | rs3094205 | 0.61/0.97; 0.03/0.26 | |
uoORF, upstream overlapping open reading frame; uORF, fully upstream open reading frame; nts, nucleotides; uSTOP, upstream stop codon; eCDS, elongated coding sequence.
*Pairwise linkage disequilibrium metrics (r
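The r2/D’ column reports pairwise linkage disequilibrium between each upSNV and the GWAS lead SNP. As a reminder of how the two metrics behave, the Python sketch below computes them from haplotype and allele frequencies using the standard definitions: D = p_AB − p_A·p_B, r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)), and D′ = D / D_max. The function name `ld_metrics` and the frequencies are invented for illustration; they are not data from the article.

```python
def ld_metrics(p_ab, p_a, p_b):
    """Return (r2, D') for two biallelic loci (hypothetical helper).

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")

    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D

    # D_max depends on the sign of D (standard Lewontin normalisation).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = d / d_max  # keeps the sign of D, so it lies in [-1, 1]
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return r2, d_prime


# Illustrative frequencies only: a rare allele (2%) that never shares a
# haplotype with a common allele (30%) at the second locus.
r2, d_prime = ld_metrics(p_ab=0.0, p_a=0.02, p_b=0.30)
print(round(r2, 3), round(d_prime, 2))
```

With these made-up frequencies D′ reaches −1 while r² stays below 0.01. This is why a near-zero r² combined with D′ = −1 is possible: a rare allele carried only on haplotypes bearing the other allele of the lead SNP shows complete D′, yet it predicts the lead SNP genotype poorly because the allele frequencies differ so much.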