Itay Mayrose, Adi Stern, Ela O Burdelova, Yosef Sabo, Nihay Laham-Karam, Rachel Zamostiano, Eran Bacharach, Tal Pupko.
Abstract
BACKGROUND: Synonymous or silent mutations are usually thought to evolve neutrally. However, accumulating recent evidence has demonstrated that silent mutations may destabilize RNA structures or disrupt cis-regulatory motifs superimposed on coding sequences. Such observations suggest the existence of stretches of codon sites that are evolutionarily conserved at both the DNA-RNA and protein levels. Such stretches may point to functionally important regions within protein-coding sequences that do not necessarily reflect functional constraints on the amino-acid sequence. The HIV-1 genome is highly compact, and often harbors overlapping functional elements at the protein, RNA, and DNA levels. This superimposition of functions leads to complex selective forces acting on all levels of the genome and proteome. Considering the constraints on HIV-1 to maintain such a highly compact genome, we hypothesized that stretches of synonymous conservation would be common within its genome.
Year: 2013 PMID: 23914950 PMCID: PMC3750384 DOI: 10.1186/1471-2148-13-164
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
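Maximum log-likelihood (LL) values for the nine HIV-1 ORFs under the KaD-KsD, KaV-KsV, and KaV-KsC models
| ORF | LL (KaV-KsC) | LL (KaV-KsV) | LL (KaD-KsD) | p-value a | p-value b |
| --- | --- | --- | --- | --- | --- |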
| Env | −95,879.1 | −94,199.7 | −93,958.9 | < 10^-250 | < 10^-104 |
| Gag | −35,193.8 | −34,706.5 | −34,663.4 | < 10^-230 | < 10^-18 |
| Nef | −21,444.5 | −21,171 | −21,114.8 | < 10^-144 | < 10^-24 |
| Pol | −54,094.5 | −53,342.7 | −53,204 | < 10^-250 | < 10^-60 |
| Rev | −10,892.2 | −10,642.9 | −10,639 | < 10^-110 | 0.02 |
| Tat | −9,316.45 | −9,103.74 | −9,094.84 | < 10^-97 | < 10^-3 |
| Vif | −13,368.2 | −13,151.4 | −13,141 | < 10^-100 | < 10^-4 |
| Vpr | −7,158.27 | −7,084.55 | −7,082.61 | < 10^-34 | 0.14 |
| Vpu | −10,188.1 | −9,998.8 | −9,974.56 | < 10^-94 | < 10^-10 |
a Comparison of the KaD-KsD versus the KaV-KsC model.
b Comparison of the KaD-KsD versus the KaV-KsV model.
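The p-values in the last column can be reproduced approximately with a likelihood ratio test. The sketch below is illustrative only: this record does not state which test was used, so the chi-square reference distribution with two degrees of freedom is an assumption, as is the choice of the second and third LL columns as the null and alternative models. The function name `lrt_pvalue_df2` is hypothetical.

```python
import math

def lrt_pvalue_df2(ll_null, ll_alt):
    """Likelihood ratio test p-value, ASSUMING a chi-square reference
    distribution with 2 degrees of freedom (not stated in the source).

    For 2 degrees of freedom the chi-square survival function has the
    closed form exp(-x / 2), so no statistics library is needed.
    """
    stat = 2.0 * (ll_alt - ll_null)  # test statistic: twice the LL gain
    if stat <= 0.0:
        return 1.0  # the alternative model fits no better than the null
    return math.exp(-stat / 2.0)

# LL values copied from the Rev and Vpr rows (second and third LL columns).
rev_p = lrt_pvalue_df2(-10642.9, -10639.0)
vpr_p = lrt_pvalue_df2(-7084.55, -7082.61)
print(round(rev_p, 2), round(vpr_p, 2))  # prints: 0.02 0.14
```

Under these assumptions the Rev and Vpr rows give 0.02 and 0.14, which agrees with the tabulated values; the remaining rows fall below the reported bounds.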
Figure 1. A histogram portraying the HIV-1 proteome-wide distribution of inferred (A) Ka and (B) Ks values.
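Summary of stretches with significantly low Ks, ranked according to StretchFinder
| Rank | Sites | Overlapping ORF | Regulatory element | Description | |
| --- | --- | --- | --- | --- | --- |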
| 1 | 898-937 | | + | Contains the cPPT, the CTS and the DNA flap regions. The first two regions are involved in regulation of reverse transcription of the genome, and the third in translocation of the genome to the nucleus. | |
| 2 | 173-186 | + | | Overlaps Vpr 1–14. In this region of Vpr there is a conserved homo-oligomerization domain. | |
| 3 | 65-98 | + | | Overlaps Rev 18–52 (as well as Env 719–744), which includes the conserved homo-multimerization region and RNA-binding domain. | |
| 4 | 986-1003 | + | | Overlaps Vif 1-18 | |
| 5 | 88-99 | | + | Contains the PPT (a primer for reverse transcription) | |
| 6 | 41-52 | + | | Sites 47–52 overlap Rev 1–6. | |
| 7 | 728-744 | + | | Equivalent to region (3) (overlap of functional elements in Rev) | |
| 8 | 7-31 | + | | Overlaps Gag 442–448 and Gag 454-463 | |
| 9 | 488-496 | + | | Overlaps Protease 1-9 | |
| 10 | 1-21 | + | | Equivalent to region (4) (overlaps Pol 986–1003) | |
| 11 | 2-9 | + | | Overlaps Tat 49–56: conserved nuclear localization signal and RNA binding domain | |
| 12 | 535-541 | | + | RRE | |
| 13 | 2-5 | + | | Overlaps Vpu 56-60 | |
| 14 | 82-107 | | | ||
| 15 | 2-5 | | + | Four RNA loops (SL1-4) of the HIV-1 packaging signal. Gag 2–5 overlaps the fourth loop SL4. | |
| 16 | 570-608 | | + | RRE | |
| 17 | 74-80 | + | | Overlaps Env. | |
| 18 | 58-75 | + | | Overlaps Env 240-257 | |
| 19 | 939-954 | | + | Pol 939–947 forms the CTS end. | |
| 20 | 495-531 | | + | RRE | |
| 21 | 114-143 |
Regions which could not be fully accounted for (whether as part of a regulatory domain or as part of an overlapped region) are marked in bold.
Figure 2. Analysis of the pol 82–90 Ks-conserved region. (A) Ka and Ks values for each position in the protein. The pol 82–90 region is boxed. (B) Codon adaptation index analysis based on the human codon usage in highly expressed genes (see Methods). The pol 82–90 region is boxed. (C) RNA secondary structure prediction, including the flanking region (see text for details). Red ovals encircling triplets of nucleotides mark the beginning and end of pol 82–90 within the predicted region.
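For readers unfamiliar with the codon adaptation index (CAI) used in panel B, the sketch below shows the standard definition: each codon gets a weight equal to its count divided by the count of the most frequent synonymous codon in a reference gene set, and the CAI of a sequence is the geometric mean of those weights. The codon counts are made-up toy numbers rather than human codon usage, and the function names are hypothetical. The paper's actual procedure is described in its Methods, which are not part of this record.

```python
import math
from collections import defaultdict

def relative_adaptiveness(codon_counts, codon_to_aa):
    """Weight of each codon: its count divided by the count of the most
    frequent synonymous codon for the same amino acid."""
    max_per_aa = defaultdict(int)
    for codon, n in codon_counts.items():
        aa = codon_to_aa[codon]
        max_per_aa[aa] = max(max_per_aa[aa], n)
    return {c: n / max_per_aa[codon_to_aa[c]] for c, n in codon_counts.items()}

def cai(codons, weights):
    """Geometric mean of the codon weights over a sequence of codons.
    Assumes every weight is positive (no zero counts in the reference)."""
    logs = [math.log(weights[c]) for c in codons]
    return math.exp(sum(logs) / len(logs))

# Toy reference usage (hypothetical counts, NOT real human data).
codon_to_aa = {"GCC": "A", "GCT": "A", "AAG": "K", "AAA": "K"}
toy_counts = {"GCC": 40, "GCT": 20, "AAG": 30, "AAA": 15}
weights = relative_adaptiveness(toy_counts, codon_to_aa)

print(cai(["GCC", "AAG"], weights))  # both codons are the preferred ones
print(cai(["GCT", "AAA"], weights))  # both codons are used half as often
```

Computing the index over a sliding window of codons would give a per-position profile comparable in spirit to panel B, although the window size used in the paper is not given here.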
Figure 3. Schematic presentation of the NLR+GFP clone with wild-type and mutated sequence in the pol gene. Shown are the HIV-1 reading frames. Black and gray rectangles represent the long terminal repeats (LTRs) and the out-of-frame env gene, respectively. GFP represents the GFP-expression cassette under the control of an internal CMV promoter. The mutated nucleotides are underlined. The indicated mutations do not change the amino acid sequence of the encoded protein.
Figure 4. Gag expression and virion release of the NLR+GFP and NLR+GFPpolmut clones in transfected and infected cells. (A) HEK293T cells were transfected with plasmids expressing the VSV-G envelope, and NLR+GFP (WT) or two identical clones of NLR+GFPpolmut (mut1, mut2). Mock represents control cells transfected with no plasmid DNA. Gag precursor (Pr55gag) was detected in extracts of transfected cells (two days post transfection) by Western blotting using anti-capsid monoclonal antibody. Actin was used to control for protein levels in the samples. (B) Virions were purified from equal volumes of supernatants of cells in (A) and their levels were determined by detecting the capsid protein (p24), using Western blotting as above. (C) Equal amounts of virions from (B), normalized by RT activity, were used to infect naïve 293T cells and the newly infected cells were analyzed two days post infection as in (A). (D) Virions from supernatants of the cells indicated in (C) were analyzed as in (B). For (A-D), one representative Western blot is shown at the top of each panel and bars (bottom part) represent the average densitometry of the bands from three independent experiments.