Marie A Brunet, Sébastien Leblanc, Xavier Roucou.
Abstract
BACKGROUND: Recent technological advances have revealed thousands of functional open reading frames (ORFs) that have eluded reference genome annotations. These overlooked ORFs are found throughout the genome, in any reading frame of transcripts, mature or non-coding, and can overlap annotated ORFs in a different reading frame. The exploration of these novel ORFs in genomic datasets and of their role in genetic traits is hindered by a lack of software.
Keywords: Alternative ORF; Genome; NGS; OpenProt; SNP; Small ORF; Variant annotation
Year: 2022 PMID: 35965322 PMCID: PMC9375913 DOI: 10.1186/s13578-022-00871-x
Source DB: PubMed Journal: Cell Biosci ISSN: 2045-3701 Impact factor: 9.584
Fig. 1 OpenVar to annotate and explore the impact of genomic variants using deeper genome annotations. A OpenVar uses a deep genome annotation, OpenProt, to annotate genomic variants (yellow). It will annotate the effect on canonical ORFs (green) listed in reference annotations, and on alternative ORFs (red) listed in OpenProt. For example, a given variant may be a 3′UTR variant for a canonical ORF, but a missense variant for a downstream ORF. B Visualization of the genomic variants in the HEY2 gene from the SynMicDB and COSMIC datasets. The genomic position is indicated at the top (genome), with the HEY2 transcripts in blue (Ensembl: ENST00000368364.3 and ENST00000368365.5; respectively NCBI RefSeq: NM_012259.2 and XM_017010629.2), the canonical ORFs in green (UniProt: Q9UBP5 and Q5TF93; respectively Ensembl: ENSP00000357348.3 and ENSP00000357349.1; or NCBI RefSeq: NP_036391.1 and XP_016866118.1) and an alternative ORF in red (OpenProt: IP_145210). The variant tracks are below (SynMicDB and COSMIC variants), each grouping the variants per category of effect on the canonical proteins (upper side) and the alternative protein (flipped). The colour code for the effect of the variants is at the bottom of the figure. The black line on the variant tracks indicates the number of variants with the maximal impact between the canonical and the alternative ORFs.
Fig. 2 OpenVar reclassifies many low impact variants as high impact and highlights the role of non-canonical ORFs. A Bar chart of the relative proportion of variants in each impact category (modifier, low, moderate, high) when annotating the SynMicDB dataset with the Ensembl Variant Effect Predictor (VEP, in blue), Annovar (in red), SnpEff (in purple) or OpenVar (in green). B Bar chart of the relative proportion of variants in each impact category (modifier, low, moderate, high) when annotating the COSMIC catalog of variants for the HEY2 gene with the Ensembl Variant Effect Predictor (VEP, in blue), Annovar (in red), SnpEff (in purple) or OpenVar (in green). C Visualization of the relative impact of genomic variants in the HEY2 gene from the COSMIC dataset when annotated with the Ensembl Variant Effect Predictor (VEP), Annovar, SnpEff or OpenVar. The HEY2 transcript position is indicated at the bottom (ENST00000368364.3). The position of the canonical ORF is represented in light green (UniProt: Q9UBP5; which corresponds to Ensembl: ENSP00000357348.3 or NCBI RefSeq: NP_036391.1; transcript coordinates: 198-1212) with its functional domains in dark green (bHLH: IPR011598; and Orange: IPR003650). The position of the alternative ORF is represented in light red (OpenProt: IP_145210; transcript coordinates: 805-1237).
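The idea of reading one transcript position against two overlapping ORFs can be sketched in a few lines of Python. This is a minimal illustration, not OpenVar's implementation: the `Orf`, `locate` and `annotate` names are hypothetical, the transcript coordinates are the ones quoted in the Fig. 2 caption, and they are assumed to be half-open intervals (end exclusive), which is consistent with both ORF lengths being multiples of three. The positions queried are arbitrary examples, not variants from the datasets.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    start: int  # first coordinate of the ORF on the transcript
    end: int    # end coordinate, treated as exclusive (assumption)


# Transcript coordinates quoted in the Fig. 2 caption (ENST00000368364.3).
CANONICAL = Orf("Q9UBP5", 198, 1212)
ALTERNATIVE = Orf("IP_145210", 805, 1237)


def locate(pos: int, orf: Orf) -> str:
    """Describe a transcript position relative to a single ORF."""
    if pos < orf.start:
        return "5'UTR"
    if pos >= orf.end:
        return "3'UTR"
    offset = pos - orf.start
    codon = offset // 3 + 1  # 1-based codon number within the ORF
    base = offset % 3 + 1    # 1-based position within that codon
    return f"CDS codon {codon}, base {base}"


def annotate(pos: int) -> dict:
    """Report the same position against both ORFs."""
    return {orf.name: locate(pos, orf) for orf in (CANONICAL, ALTERNATIVE)}


if __name__ == "__main__":
    # Hypothetical positions chosen to show the three possible situations.
    for pos in (500, 1120, 1225):
        print(pos, annotate(pos))
```

Under these assumptions, a position such as 1225 falls in the 3′UTR of the canonical ORF but inside the coding sequence of the alternative ORF, which is the situation described in Fig. 1A. The sketch ignores strand, splicing and the actual base change, all of which a real annotator has to handle.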
Deep annotation of 4 genomic variants within the HEY2 gene with OpenVar
| Genomic variant | Canonical ORF (Q9UBP5): locus effect | Canonical ORF (Q9UBP5): protein effect | Canonical ORF (Q9UBP5): reported by common annotators | Canonical ORF (Q9UBP5): reported by OpenVar | Alternative ORF (IP_145210): locus effect | Alternative ORF (IP_145210): protein effect | Alternative ORF (IP_145210): reported by common annotators | Alternative ORF (IP_145210): reported by OpenVar |
|---|---|---|---|---|---|---|---|---|
| chr6:125,759,167 C > T | Stop gained | R127* | Yes | Yes | 5′UTR variant | – | No | Yes |
| chr6:125,759,503 C > T | Stop gained | R239* | Yes | Yes | 5′UTR variant | – | No | Yes |
| chr6:125,759,711 C > T | Missense | S308L | Yes | Yes | Stop gained | Q106* | No | Yes |
| chr6:125,759,780 G > T | Missense | G331V | Yes | Yes | Stop gained | G129* | No | Yes |
Common annotators include VEP, Annovar and SnpEff
An asterisk (*) symbolises a stop codon
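The "maximal impact between the canonical and the alternative ORFs" can be illustrated with the four HEY2 variants from the table. The sketch below is a hedged illustration rather than OpenVar's code: the `max_impact` helper is hypothetical, and the effect-to-impact mapping follows the usual VEP and SnpEff convention (stop gained is high, missense is moderate, UTR variants are modifier), which is assumed here to apply.

```python
# Impact categories from the Fig. 2 caption, ordered from least to most severe.
IMPACT_RANK = {"MODIFIER": 0, "LOW": 1, "MODERATE": 2, "HIGH": 3}

# Assumed mapping, following the usual VEP / SnpEff convention.
EFFECT_IMPACT = {
    "Stop gained": "HIGH",
    "Missense": "MODERATE",
    "5′UTR variant": "MODIFIER",
}

# (genomic variant, effect on canonical ORF, effect on alternative ORF),
# copied from the table.
ROWS = [
    ("chr6:125,759,167 C > T", "Stop gained", "5′UTR variant"),
    ("chr6:125,759,503 C > T", "Stop gained", "5′UTR variant"),
    ("chr6:125,759,711 C > T", "Missense", "Stop gained"),
    ("chr6:125,759,780 G > T", "Missense", "Stop gained"),
]


def max_impact(canonical_effect: str, alternative_effect: str) -> tuple:
    """Return (impact, source ORF) for the more severe of the two effects."""
    canonical = EFFECT_IMPACT[canonical_effect]
    alternative = EFFECT_IMPACT[alternative_effect]
    # Ties are attributed to the canonical ORF in this sketch.
    if IMPACT_RANK[alternative] > IMPACT_RANK[canonical]:
        return alternative, "alternative"
    return canonical, "canonical"


if __name__ == "__main__":
    for variant, canonical_effect, alternative_effect in ROWS:
        impact, source = max_impact(canonical_effect, alternative_effect)
        print(f"{variant}: {impact} (from the {source} ORF)")
```

With this mapping, the two missense variants move from moderate to high impact once the alternative ORF is considered, while the two stop-gained variants stay high through the canonical ORF alone.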