Irwin Jungreis1, Chase W Nelson2, Zachary Ardern3, Yaara Finkel4, Nevan J Krogan5, Kei Sato6, John Ziebuhr7, Noam Stern-Ginossar4, Angelo Pavesi8, Andrew E Firth9, Alexander E Gorbalenya10, Manolis Kellis11.
Abstract
At least six small alternative-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for these ORFs and their shorter isoforms, developed in consultation with the Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. We recommend calling the 39 codon Spike-overlapping ORF ORF2b; the 41, 57, and 22 codon ORF3a-overlapping ORFs ORF3c, ORF3d, and ORF3b; the 33 codon ORF3d isoform ORF3d-2; and the 97 and 73 codon Nucleocapsid-overlapping ORFs ORF9b and ORF9c. Finally, we document conflicting usage of the name ORF3b in 32 studies, and consequent erroneous inferences, stressing the importance of reserving identical names for homologs. We recommend that authors referring to these ORFs provide lengths and coordinates to minimize ambiguity caused by prior usage of alternative names.
Keywords: Accessory protein; Alternative reading frame; Nomenclature; ORF2b; ORF3b; ORF3c; ORF3d; ORF9a; ORF9b; Open reading frame; Overlapping ORF; SARS-CoV-2
Year: 2021 PMID: 33774510 PMCID: PMC7967279 DOI: 10.1016/j.virol.2021.02.013
Source DB: PubMed Journal: Virology ISSN: 0042-6822 Impact factor: 3.616
Fig. 1. Browser image of recommended names for overlapping ORFs. UCSC Genome Browser images showing our recommended names and the number of amino acids (below name) for overlapping ORFs (light green or pink background for ORFs whose codons are shifted 1 or 2 nucleotides, respectively, in the 3′ direction from those of the main ORF, white background). AUG (green) and stop codons (red) are shown in each of three positive-sense genomic reading frames. A. 5′ end of Spike ORF (S) containing ORF2b (39 codons). B. ORF3a containing ORFs 3c (41 codons), 3d (57 codons), 3d-2 (33 codons), and 3b (22 codons). The region homologous to SARS-CoV ORF3b, which overlaps the 3′ half of ORF3a and the 5′ end of the envelope protein ORF (E), is also shown (light blue background). Note that ORFs 3a, 3c, and 3d are in different reading frames (+0, +1, and +2, respectively), so the 59 nucleotide region common to all three could be a rare example of RNA translated in three different reading frames. C. Nucleocapsid ORF (N) containing ORFs 9b (97 codons) and 9c (73 codons). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
Recommended standard names. Recommended standard names for each of six ORFs overlapping S, ORF3a, or N, in 5′–3′ order, and the shorter isoform of ORF3d, with number of codons, coordinates, and a list of other names that have been used in previous publications or preprints. Codon counts and coordinates do not include the stop codon. Coordinates are with respect to the Wuhan-Hu-1 reference genome (NCBI: NC_045512.2). Frame +1 or +2 indicates that codons are shifted one or two nucleotides, respectively, in the 3′ direction from codons in the main (larger) ORF, which occupies frame +0.
| Recommended ORF name | Length (codons) | Coordinates | Frame relative to main (+0) ORF | Other names used (not recommended) | SARS-CoV homolog |
|---|---|---|---|---|---|
| ORF2b | 39 | 21744-21860 | +1 (S) | S.iORF1 | None |
| ORF3c | 41 | 25457-25579 | +1 (ORF3a) | ORF3h, 3a.iORF1, ORF3b | Unnamed |
| ORF3d | 57 | 25524-25694 | +2 (ORF3a) | ORF3b, ORF3c | None |
| ORF3b | 22 | 25814-25879 | +1 (ORF3a) | | 5′ end of ORF3b |
| ORF9b | 97 | 28284-28574 | +1 (N) | ORF9a, N.iORF1 | ORF9b |
| ORF9c | 73 | 28734-28952 | +1 (N) | ORF9b, ORF14 | ORF9c, ORF14 |
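The codon counts and frame assignments in the table follow from the coordinates alone, which makes them easy to check when citing an ORF by length and position. The following is a minimal sketch of that arithmetic, not part of the original article. The ORF coordinates are copied from the table; the main-ORF start positions in `MAIN_ORF_START` are an assumption taken from the NC_045512.2 annotation and do not appear in the table, and all function names are hypothetical.

```python
# Coordinates are 1-based and inclusive, and exclude the stop codon,
# as stated in the table caption. Values are copied from the table.
ORFS = {
    "ORF2b": (21744, 21860, "S"),
    "ORF3c": (25457, 25579, "ORF3a"),
    "ORF3d": (25524, 25694, "ORF3a"),
    "ORF3b": (25814, 25879, "ORF3a"),
    "ORF9b": (28284, 28574, "N"),
    "ORF9c": (28734, 28952, "N"),
}

# ASSUMPTION: first nucleotide of each main ORF's start codon, taken from the
# NC_045512.2 annotation. These values are not given in the table above.
MAIN_ORF_START = {"S": 21563, "ORF3a": 25393, "N": 28274}


def codon_count(start, end):
    """Number of codons in an inclusive interval that excludes the stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"interval {start}-{end} is not a whole number of codons")
    return length // 3


def relative_frame(start, main_start):
    """Shift (0, 1, or 2 nt in the 3' direction) relative to the main ORF's codons."""
    return (start - main_start) % 3


def shared_nt(a, b, include_stop=False):
    """Nucleotides shared by two (start, end) intervals; 0 if they do not overlap.

    With include_stop=True, each interval is extended by its 3 nt stop codon.
    """
    pad = 3 if include_stop else 0
    lo = max(a[0], b[0])
    hi = min(a[1], b[1]) + pad
    return max(0, hi - lo + 1)


def summarize():
    """Return {name: (codons, frame)} for every ORF listed in ORFS."""
    return {
        name: (codon_count(start, end), relative_frame(start, MAIN_ORF_START[main]))
        for name, (start, end, main) in ORFS.items()
    }


if __name__ == "__main__":
    for name, (codons, frame) in summarize().items():
        start, end, main = ORFS[name]
        print(f"{name}: {codons} codons, {start}-{end}, frame +{frame} ({main})")
```

Under these assumptions, `shared_nt(ORFS["ORF3c"][:2], ORFS["ORF3d"][:2], include_stop=True)` gives 59, which appears to correspond to the 59 nucleotide region described in the Fig. 1 legend when the ORF3c stop codon is counted; without the stop codon the shared span is 56 nucleotides.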
ORF3b studies. Thirty-two studies that use the name “ORF3b” but do not distinguish the 22 codon and 57 codon ORFs as separate entities. Information about what each study was investigating and how we determined the ORF referred to is provided in Supplementary Table 1.
| ORF called “ORF3b” in study | Recommended name | Study type | Studies (first author and citation) |
|---|---|---|---|
| 22 codon ORF | ORF3b | Genome report | |
| 22 codon ORF | ORF3b | Empirical | |
| 22 codon ORF | ORF3b | Review | |
| 57 codon ORF | ORF3d | Genome report | |
| 57 codon ORF | ORF3d | Empirical | |
| 57 codon ORF | ORF3d | Laboratory resource - sequence clone collection | |
| 57 codon ORF | ORF3d | Computational | |
| 57 codon ORF | ORF3d | Review | |
| Unclear | Unclear | Empirical | |
| Unclear | Unclear | Computational | |