Aditi Shukla, Rolf Hilgenfeld.
Abstract
Acquisition of new proteins by viruses usually occurs through horizontal gene transfer or through gene duplication, but another, less common mechanism is the use of completely or partially overlapping reading frames. A case of acquisition of a completely new protein through introduction of a start codon in an alternative reading frame is the protein encoded by open reading frame (orf) 9b of SARS coronavirus. This gene completely overlaps with the nucleocapsid (N) gene (orf9a). Our findings indicate that the orf9b gene features a discordant codon-usage pattern. We analyzed the evolution of orf9b in concert with orf9a using sequence data of betacoronavirus lineage b and found that orf9b, which encodes the overprinting protein, evolved largely independently of the overprinted orf9a. We also examined the protein products of these genomic sequences for their structural flexibility and found that, in contrast to earlier suggestions, a newly acquired, overlapping protein product need not be intrinsically disordered. Our findings contribute to characterizing the sequence properties of newly acquired genes that make use of overlapping reading frames.
Year: 2014 PMID: 25410051 PMCID: PMC7089080 DOI: 10.1007/s11262-014-1139-8
Source DB: PubMed Journal: Virus Genes ISSN: 0920-8569 Impact factor: 2.332
Fig. 1 (a) Schematic organization of the SARS-coronavirus genome, highlighting structural and accessory genes. The overprinting accessory genes are indicated below their overprinted mates. (b) Overlapping N and orf9b genes of SARS-CoV with their start-stop coordinates within the genome (“end” coordinates include the stop codon).
Correlation between the codon-usage pattern of the N gene and that of its overprinting internal gene, each compared with the non-overlapping coding regions of the same genome, for SARS-CoV and for other members of the genus Betacoronavirus (BCoV, MHV, and MERS-CoV)

| Virus | Protein | Number of amino-acid residues | Correlation coefficient (r) |
|---|---|---|---|
| SARS-CoV | Nucleocapsid | 422 | 0.62 |
| SARS-CoV | Orf9b | 98 | −0.01 |
| BCoV | Nucleocapsid | 448 | 0.67 |
| BCoV | Internal protein | 207 | −0.11 |
| MHV | Nucleocapsid | 455 | 0.66 |
| MHV | Internal protein | 136 | 0.00 |
| MERS-CoV | Nucleocapsid | 411 | 0.58 |
| MERS-CoV | Hypothetical internal protein | 112 | −0.13 |
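
A coefficient of this kind can be illustrated with a short sketch. It is not the authors' procedure: the record does not say how codon usage was quantified, so the code assumes that a codon-usage pattern is the relative frequency of each of the 64 codons and that r is the Pearson correlation between two such frequency vectors. The function names (`codon_frequencies`, `pearson_r`, `codon_usage_correlation`) are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 64 codons in a fixed order, so that two frequency vectors are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(seq):
    """Relative frequency of each codon, reading seq in frame from its first base."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)    # codons with ambiguous bases are skipped
    if total == 0:
        raise ValueError("sequence contains no complete, unambiguous codon")
    return [counts[c] / total for c in CODONS]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def codon_usage_correlation(gene, reference):
    """Assumed definition: Pearson r between the codon frequencies of two sequences."""
    return pearson_r(codon_frequencies(gene), codon_frequencies(reference))
```

Under these assumptions, `gene` would be the coding sequence of N or of its internal gene, and `reference` the concatenated non-overlapping coding regions of the same genome. A value near zero, as reported for the overprinting proteins, would mean that the gene's codon frequencies carry essentially no linear relationship to those of the rest of the genome.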
Synonymous and nonsynonymous substitutions in overlapping and non-overlapping regions of the SARS-CoV nucleocapsid gene and in the orf9b gene
| Gene region | Ka | Ks | ω = Ka/Ks |
|---|---|---|---|
| Nucleocapsid (overlapping part) | 0.41 | 0.73 | 0.56 |
| Nucleocapsid (non-overlapping part) | 0.37 | 0.59 | 0.62 |
| Orf9b | 0.53 | 0.43 | 1.23 |
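
The ω column is the ratio of the nonsynonymous rate Ka to the synonymous rate Ks. The sketch below recomputes it from the rounded Ka and Ks values in the table; the helper name `omega` is hypothetical. By the usual convention, ω below 1 is read as purifying selection and ω above 1 as a sign of positive selection.

```python
def omega(ka, ks):
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    return ka / ks


# Values copied from the table: (Ka, Ks, reported omega).
TABLE = {
    "Nucleocapsid (overlapping part)": (0.41, 0.73, 0.56),
    "Nucleocapsid (non-overlapping part)": (0.37, 0.59, 0.62),
    "Orf9b": (0.53, 0.43, 1.23),
}

for region, (ka, ks, reported) in TABLE.items():
    print(f"{region}: recomputed {omega(ka, ks):.3f}, reported {reported:.2f}")
```

Recomputing from the two-decimal inputs reproduces the reported ratios to within 0.01. The small difference for the non-overlapping part of N is presumably because the published ratio was derived from unrounded Ka and Ks values, which are not given here.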
Fig. 2 (a) Top: The 5′ ends of the SARS-CoV N and orf9b genes. At nucleotide no. 10 of the N gene, translation of the overlapping orf9b gene begins, resulting in a phase difference of +1 for this gene relative to the N gene. Bottom: Codon-site substitutions in the two genes. Three types of substitution have to be distinguished: N2/9b1, N3/9b2, and N1/9b3. (b) Variation of the three sets of nucleotides (in magenta), N1/9b3, N2/9b1, and N3/9b2, in relation to the amino-acid variations (in blue) in the overlapping nucleocapsid and orf9b proteins. In graphs 1, 3, and 5 (nucleotide variations), the x-axis represents the codon sites; in graphs 2 and 4, it represents the amino-acid residue number. Note that the N protein overlaps with orf9b between its residues 4 and 101; however, in graph 2, which represents the amino-acid variations in the N protein, the x-axis is calibrated from 1 to 98 to facilitate the comparison with orf9b. The y-axis represents entropy. The green dot indicates the one case of a synonymous N1/9b3 substitution that does not lead to an amino-acid exchange in the N protein because of the partial degeneracy of the first nucleotide position in a codon (AGA and CGA both code for Arg). The red dot indicates a case of a two-nucleotide difference, resulting from an N1/9b3 and an N2/9b1 substitution, that leads to an amino-acid exchange in the N protein. All bat betacoronaviruses of lineage b (with the exception of SL-CoV WIV1 [33]) have Lys at this position, whereas all civet and human SARS-CoV isolates as well as bat SL-CoV WIV1 have Pro (see text).
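
The site labels and the entropy axis in this caption can be illustrated with a minimal sketch. Two assumptions are made. First, "nucleotide no. 10" is treated as a zero-based offset from the first base of N, because that reading reproduces the stated pairing N2/9b1, N3/9b2, N1/9b3. Second, entropy is taken to be the Shannon entropy of an alignment column in bits; the caption does not state the logarithm base. The names `site_label` and `shannon_entropy` are hypothetical.

```python
from collections import Counter
from math import log2


def codon_position(offset):
    """Position (1, 2, or 3) within a codon, given a zero-based offset from the gene start."""
    return offset % 3 + 1


def site_label(n_offset, orf9b_start=10):
    """Label a nucleotide by its codon position in N and in orf9b.

    n_offset is the zero-based offset from the first base of the N gene;
    orf9b_start is the assumed zero-based offset at which orf9b begins.
    """
    if n_offset < orf9b_start:
        raise ValueError("position lies upstream of the overlap")
    n_pos = codon_position(n_offset)
    orf9b_pos = codon_position(n_offset - orf9b_start)
    return f"N{n_pos}/9b{orf9b_pos}"


def shannon_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return sum((c / total) * log2(total / c) for c in counts.values())


# The first three overlapping nucleotides cycle through the three site classes.
print([site_label(i) for i in (10, 11, 12)])   # ['N2/9b1', 'N3/9b2', 'N1/9b3']
```

With a +1 phase shift, the third codon position of orf9b coincides with the first codon position of N, so a substitution at an N1/9b3 site is often silent in orf9b but usually changes the N residue. The green-dot case described in the caption (AGA versus CGA) is the exception in which the change is silent in N as well.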
Fig. 3 (a) Structures of the SARS-CoV proteins investigated in this study, colored according to the B-factor averaged for each amino-acid residue (the color scheme is VIBGYOR, where violet depicts the minimum and red the maximum B-factor value). Left: NTD of the nucleocapsid protein (overall average B-factor 11.19 Å²; PDB code 2OFZ [58]). Right: Dimer of the orf9b protein (overall average B-factor 100.8 Å²; PDB code 2CME [59]). (b) Disorder prediction for the NTD of the SARS-CoV nucleocapsid protein and its overprinting counterpart, the orf9b protein, calculated using the program DisProt VSL2B [67]. The degree of order–disorder lies within the range of 0 (well ordered) to 1 (highly disordered).