Antoinette C van der Kuyl, Ben Berkhout.
Abstract
Viruses often deviate from their hosts in the nucleotide composition of their genomes. The RNA genome of the lentivirus family of retroviruses, including human immunodeficiency virus (HIV), contains, for example, an above-average percentage of adenine (A) nucleotides while being extremely poor in cytosine (C). Such a deviant base composition has implications for the amino acids encoded by the open reading frames (ORFs), both in the requirement for specific tRNA species and in the preference for amino acids encoded by, for example, A-rich codons. Nucleotide composition obviously affects the secondary and tertiary structure of the RNA genome and its biological functions, but it also influences phylogenetic analysis of viral genome sequences, and possibly the activity of the integrated DNA provirus. Over time, the nucleotide composition of the HIV-1 genome has been exceptionally conserved, varying by less than 1% per base position per isolate within each of groups M, N, and O during 1983-2009. This extreme stability of the nucleotide composition may be achieved by negative selection, perhaps conserving semi-stable RNA secondary structure, as reverse transcription would be significantly affected for a less A-rich genome, in which secondary structures are expected to be more stable and thus more difficult to unfold.
This review will discuss all aspects of the lentiviral genome composition, both of the RNA and of its derived double-stranded DNA genome, with a focus on HIV-1, the nucleotide composition over time, the effects of artificially humanized codons, and the contribution of immune system pressure to HIV nucleotide bias.
Year: 2012 PMID: 23131071 PMCID: PMC3511177 DOI: 10.1186/1742-4690-9-92
Source DB: PubMed Journal: Retrovirology ISSN: 1742-4690 Impact factor: 4.602
Average nucleotide composition of full-length lentivirus DNA genomes
| Virus | No. of genomes | A (% ± Std) | C (% ± Std) | G (% ± Std) | T (% ± Std) |
| --- | --- | --- | --- | --- | --- |
| HIV-1, group M, subtype A | 3 | 35.1 ± 0.2 | 18.1 ± 0.2 | 24.4 ± 0.3 | 22.4 ± 0.4 |
| HIV-1, group M, subtype B | 18 | 35.3 ± 0.2 | 18.1 ± 0.2 | 24.4 ± 0.2 | 22.3 ± 0.1 |
| HIV-1, group M, subtype C | 18 | 35.5 ± 0.2 | 18.0 ± 0.2 | 24.2 ± 0.1 | 22.3 ± 0.1 |
| HIV-1, group M, subtype D | 1 | 35.8 | 18.0 | 24.1 | 22.1 |
| HIV-1, group O | 4 | 35.0 ± 0.2 | 19.0 ± 0.1 | 23.8 ± 0.2 | 22.2 ± 0.2 |
| HIV-1, group N | 8 (gag-pol-env only) | 36.0 ± 0.2 | 17.6 ± 0.2 | 24.0 ± 0.1 | 22.2 ± 0.2 |
| HIV-1, group P | 1 | 33.9 | 18.5 | 24.6 | 22.7 |
| SIVchimpanzee | 7 | 35.3 ± 0.5 | 18.3 ± 0.2 | 23.8 ± 0.3 | 22.5 ± 0.2 |
| SIVgorilla | 4 | 34.6 ± 0.0 | 18.5 ± 0.1 | 24.6 ± 0.1 | 22.3 ± 0.2 |
| SIVmangabey | 3 | 34.0 ± 0.2 | 18.9 ± 0.2 | 25.1 ± 0.3 | 22.1 ± 0.4 |
| SIVgreen monkey | 9 | 33.6 ± 0.6 | 19.3 ± 0.7 | 25.0 ± 0.5 | 22.0 ± 0.3 |
| SIVmandrill | 3 | 34.6 ± 1.3 | 18.1 ± 1.8 | 24.5 ± 0.6 | 22.9 ± 1.1 |
| HIV-2 | 16 | 33.9 ± 0.3 | 20.4 ± 0.5 | 24.9 ± 0.3 | 20.7 ± 0.5 |
| pSIV (lemur endogenous lentivirus) | 1^a | 29.0 | 20.5 | 27.5 | 23.1 |
| | | | | | |
| EAIV | 25 | 35.7 ± 0.2 | 16.0 ± 0.3 | 22.0 ± 0.2 | 26.4 ± 0.4 |
| CAEV/Ovine lentivirus^b | 7/4 | 38.0 ± 0.5 | 15.7 ± 0.6 | 25.2 ± 0.4 | 21.1 ± 0.4 |
| CAEV subtype E | 2 | 33.9 ± 0.3 | 28.1 ± 0.0 | 28.1 ± 0.2 | 19.5 ± 0.0 |
| Maedi-visna virus | 6 | 37.2 ± 0.2 | 26.0 ± 0.1 | 26.0 ± 0.1 | 21.4 ± 0.0 |
| Jembrana disease virus | 1 | 31.7 | 20.1 | 26.4 | 21.9 |
| BIV | 1 | 31.8 | 21.2 | 23.8 | 23.2 |
| FIV cat/cougar | 4/14 | 38.0 ± 0.3 | 14.9 ± 0.1 | 22.0 ± 0.3 | 25.2 ± 0.3 |
| FIV Pallas’ cat/lion^c | 1/2 | 38.0 ± 0.1 | 13.7 ± 0.3 | 22.1 ± 0.3 | 26.2 ± 0.6 |
| RELIK (hare endogenous lentivirus)^d | 1 (gag-pol-env only) | 34.0 | 19.4 | 22.4 | 24.1 |
| ELVmpf (ferret endogenous lentivirus) | 1 | 33.6 | 20.0 | 23.6 | 22.8 |
a Consensus sequence of pSIVgml (gray mouse lemur, one proviral copy) and pSIVfdl (fat-tailed dwarf lemur, several proviral copies).
b Viruses labeled CAEV subtype E and Maedi-visna virus, respectively, differ significantly in nucleotide composition from other CAEV types, including viruses labeled ovine lentivirus.
c FIV isolated from cats and cougars is phylogenetically distinct from FIV found in Pallas’ cat and lions.
d Similar frequencies for rabbit endogenous retrovirus sequences.
Std = standard deviation.
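The per-virus averages in this table can be reproduced in principle by counting bases in each genome and then averaging across genomes. The following is a minimal sketch of that calculation, not the authors' actual pipeline. The function names `composition` and `summarize` are hypothetical, the toy sequences are invented for illustration, and the choice to ignore ambiguous symbols when computing percentages is an assumption.

```python
from collections import Counter
from statistics import mean, stdev


def composition(seq):
    """Return the percentage of A, C, G and T among the unambiguous bases of a sequence."""
    seq = seq.upper().replace("U", "T")  # treat RNA input as its DNA equivalent
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def summarize(seqs):
    """Mean and sample standard deviation of each base percentage across genomes."""
    comps = [composition(s) for s in seqs]
    summary = {}
    for b in "ACGT":
        values = [c[b] for c in comps]
        # A single genome has no standard deviation (compare the n = 1 rows above).
        sd = stdev(values) if len(values) > 1 else None
        summary[b] = (mean(values), sd)
    return summary


# Hypothetical toy sequences, for illustration only (not real isolates).
toy = ["AAAAGGTC", "AAAGGGTC", "AAAAGTTC"]
for base, (m, sd) in summarize(toy).items():
    print(f"{base}: {m:.1f} ± {sd:.1f}" if sd is not None else f"{base}: {m:.1f}")
```

In practice the input would be full-length genome sequences read from a database export rather than short strings; the arithmetic is unchanged.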
HIV-1 nucleotide composition of genome segments
| Genome segment | Length (nt) | A (%) | C (%) | G (%) | T/U (%) |
| --- | --- | --- | --- | --- | --- |
| LTR (U3-R-U5) | 635 | 25.0 | 24.4 | 27.2 | 23.3 |
| Gag-ORF | 1503 | 36.9 | 19.6 | 24.5 | 19.1 |
| Pol-ORF | 3012 | 38.9 | 16.5 | 22.8 | 21.9 |
| Env-ORF | 2571 | 34.7 | 17.1 | 24.0 | 24.3 |
| Vif-ORF | 579 | 36.1 | 18.0 | 24.0 | 21.9 |
| Vpr-ORF | 292 | 32.5 | 18.5 | 26.7 | 22.3 |
| Tat-ORF | 306 | 33.0 | 23.9 | 24.2 | 19.0 |
| Rev-ORF | 351 | 29.9 | 23.1 | 28.2 | 18.8 |
| Vpu-ORF | 249 | 38.6 | 11.7 | 26.5 | 23.3 |
| Nef-ORF | 621 | 30.6 | 21.3 | 28.2 | 20.0 |
a HXB2 reference strain (GenBank acc. no. K03455).
Nucleotide composition of the different HIV-1 subtypes (group M)
| 0.0008 | 0.003 | 0.01 | 0.03 | |
| 0.0002 | <0.0001 | <0.0001 | 0.0004 | |
| 0.03 | 0.03 | 0.007 | 0.12 | |
| <0.0001 | 0.70 | <0.0001 | 0.29 | |
| <0.0001 | 0.20 | <0.0001 | 0.22 | |
| <0.0001 | 0.02 | <0.0001 | 0.009 | |
| <0.0001 | 0.88 | <0.0001 | 0.04 | |
| 0.84 | 0.25 | 1.00 | 0.30 | |
| <0.0001 | 0.45 | <0.0001 | 0.004 |
Differences in nucleotide composition between HIV-1 subtypes were analysed using Student’s t-test.
Based on 41 subtype B, 62 subtype A, 55 subtype C, 38 subtype D, 18 subtype G and 32 recombinant CRF02_AG gag-pol-env sequences.
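For readers who want to see what such a comparison involves, the sketch below computes a two-sample Student's t statistic with pooled variance for the percentage of one nucleotide in two sets of genomes. It is an illustration under assumptions: the source does not state whether the test was pooled-variance or two-sided, the function name `student_t` is hypothetical, and the numbers are invented rather than taken from the table.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    se = sqrt(pooled * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("both samples have zero variance")
    return (mean(sample_a) - mean(sample_b)) / se, df


# Hypothetical %A values for two groups of genomes (not data from the table).
group_1 = [35.0, 35.2, 35.4]
group_2 = [36.0, 36.2, 36.4]
t, df = student_t(group_1, group_2)
print(f"t = {t:.3f}, df = {df}")
```

Turning the statistic into a p-value requires the cumulative t distribution; if SciPy is available, `scipy.stats.ttest_ind(group_1, group_2)` returns both the statistic and a two-sided p-value under the same equal-variance assumption.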
Nucleotide composition of the HIV-1 RNA genome over time (1983–2009)
| | 24 | A 36.5 | 35.8-36.9 | 38 | A 36.6 | 36.3-36.9 |
| | | C 17.5 | 17.3-17.8 | | C 17.5 | 17.0-17.7 |
| | | G 23.8 | 23.3-24.4 | | G 23.8 | 23.4-24.2 |
| | | U 22.3 | 21.8-22.6 | | U 22.1 | 21.8-22.5 |
| | 16 | A 36.7 | 36.5-37.0 | 25 | A 36.7 | 36.3-37.0 |
| | | C 17.4 | 17.1-17.7 | | C 17.4 | 17.1-17.7 |
| | | G 23.7 | 23.5-24.0 | | G 23.7 | 23.4-24.1 |
| | | U 22.2 | 22.0-22.4 | | U 22.3 | 22.1-22.5 |
| | 2 | A 35.7 | 35.6-35.8 | 6 | A 36.1 | 36.0-36.4 |
| | | C 17.7 | 17.6-17.9 | | C 17.5 | 17.2-17.9 |
| | | G 24.2 | 24.1-24.3 | | G 23.9 | 23.9-24.1 |
| | | U 22.3 | 22.2-22.3 | | U 22.2 | 22.0-22.4 |
| | 4 | A 35.4 | 35.0-35.8 | 3 | A 35.3 | 34.8-36.1 |
| | | C 18.7 | 18.6-18.8 | | C 18.8 | 18.3-19.1 |
| | | G 23.7 | 23.5-24.2 | | G 23.8 | 23.3-24.1 |
| | | U 22.2 | 21.9-22.4 | | U 22.1 | 21.9-22.3 |
| | 0 | NA | NA | 2 | A 34.1 | 33.9-34.4 |
| | | | | | C 18.4 | 18.3-18.5 |
| | | | | | G 24.5 | 24.4-24.6 |
| | | | | | U 22.4 | 22.2-22.6 |
a Gag-pol-env only.
b No significant differences in nucleotide composition were scored between groups over time (p>0.05, Student’s t-test).
c A shorter sequence of HIV-1 subtype A was analysed (approx. 8600 nt), as not many full-length genomes were available.
d Only genomes with ≤ 10 ambiguous nucleotides were used for the analysis.
e Seven of eight group N genomes contain ambiguous nucleotides (range 5–38).
f Only two full-length genomes of group P viruses are available from the Los Alamos Database, which contain 54 and 66 ambiguous nucleotides, respectively.
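Footnote d describes a quality filter (at most 10 ambiguous nucleotides per genome), and the table reports a value with a range for each nucleotide. A minimal sketch of how such a filter and summary could be combined is given below. The function names `count_ambiguous` and `percent_summary` are hypothetical, the toy sequences are invented, and computing percentages over unambiguous positions only is an assumption, since the source does not state the denominator.

```python
def count_ambiguous(seq):
    """Count symbols other than A, C, G, U or T (for example IUPAC codes such as N, R or Y)."""
    return sum(1 for b in seq.upper() if b not in "ACGUT")


def percent_summary(genomes, base, max_ambiguous=10):
    """Return (n, mean, minimum, maximum) of the percentage of `base`
    over genomes that pass the ambiguity filter, or None if none pass."""
    base = base.upper().replace("T", "U")
    values = []
    for genome in genomes:
        if count_ambiguous(genome) > max_ambiguous:
            continue  # too many ambiguous positions: exclude this genome
        rna = genome.upper().replace("T", "U")
        unambiguous = sum(rna.count(b) for b in "ACGU")
        if unambiguous:
            values.append(100.0 * rna.count(base) / unambiguous)
    if not values:
        return None  # would correspond to an "NA" cell
    return len(values), sum(values) / len(values), min(values), max(values)


# Hypothetical toy sequences, for illustration only (not real isolates).
# The third sequence has 13 ambiguous positions and is filtered out.
toy = ["AAAAGGUC", "AAAGGGUC", "AA" + "N" * 13]
print(percent_summary(toy, "A"))  # (2, 43.75, 37.5, 50.0)
```

Applying the same function to sequences grouped by sampling period would give one summary per period, which is the shape of the comparison the table presents.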