Ronnie M Russell, Frederic Bibollet-Ruche, Weimin Liu, Scott Sherrill-Mix, Yingying Li, Jesse Connell, Dorothy E Loy, Stephanie Trimboli, Andrew G Smith, Alexa N Avitto, Marcos V P Gondim, Lindsey J Plenderleith, Katherine S Wetzel, Ronald G Collman, Ahidjo Ayouba, Amandine Esteban, Martine Peeters, William J Kohler, Richard A Miller, Sandrine François-Souquiere, William M Switzer, Vanessa M Hirsch, Preston A Marx, Alex K Piel, Fiona A Stewart, Alexander V Georgiev, Volker Sommer, Paco Bertolani, John A Hart, Terese B Hart, George M Shaw, Paul M Sharp, Beatrice H Hahn.
Abstract
Infection with human and simian immunodeficiency viruses (HIV/SIV) requires binding of the viral envelope glycoprotein (Env) to the host protein CD4 on the surface of immune cells. Although invariant in humans, the Env binding domain of the chimpanzee CD4 is highly polymorphic, with nine coding variants circulating in wild populations. Here, we show that within-species CD4 diversity is not unique to chimpanzees but found in many African primate species. Characterizing the outermost (D1) domain of the CD4 protein in over 500 monkeys and apes, we found polymorphic residues in 24 of 29 primate species, with as many as 11 different coding variants identified within a single species. D1 domain amino acid replacements affected SIV Env-mediated cell entry in a single-round infection assay, restricting infection in a strain- and allele-specific fashion. Several identical CD4 polymorphisms, including the addition of N-linked glycosylation sites, were found in primate species from different genera, providing striking examples of parallel evolution. Moreover, seven different guenons (Cercopithecus spp.) shared multiple distinct D1 domain variants, pointing to long-term trans-specific polymorphism. These data indicate that the HIV/SIV Env binding region of the primate CD4 protein is highly variable, both within and between species, and suggest that this diversity has been maintained by balancing selection for millions of years, at least in part to confer protection against primate lentiviruses. Although long-term SIV-infected species have evolved specific mechanisms to avoid disease progression, primate lentiviruses are intrinsically pathogenic and have left their mark on the host genome.
Keywords: CD4; balancing selection; parallel evolution; primate lentiviruses; trans-specific polymorphism
Year: 2021 PMID: 33771926 PMCID: PMC8020793 DOI: 10.1073/pnas.2025914118
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. Allelic diversity of CD4 in bonobos. (A) CD4 coding variants identified in wild bonobo populations. D1 domain variants of bonobos (P. paniscus, Pp; derived from exon 2 and 3 sequences) are compared to human and chimpanzee CD4, with dots indicating identity to the human reference. A single amino acid replacement (I83T) in the bonobo CD4 is boxed. A potential N-linked glycosylation site is indicated by asterisks. The ancestral chimpanzee (P. troglodytes; Pt) allele is shown for reference, with positions that are polymorphic in chimpanzees underlined. (B) Crystal structure of the HIV-1 gp120 envelope domain (black) bound to human CD4 (gray) (PDB ID code 4R2G), with polymorphic positions indicated for chimpanzees (blue) and bonobos (green). (C) CD4 allele frequency in wild bonobos. The number of tested individuals is indicated. (D) Effects of bonobo CD4 polymorphisms on SIV Env-mediated cell entry. The infectivity of pseudoviruses carrying different SIV Envs is shown for transiently transfected cells expressing human and bonobo CD4 variants and the cognate CCR5 receptors. Values are scaled relative to the human CD4 (set to 100%). Bars represent the average of three independent transfections, each performed in triplicate, with SDs shown (fold-changes of Env infectivity for different CD4 alleles are shown in ).
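The scaling described for panel D (human CD4 set to 100%, bars as the mean of independent transfections with SDs) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `summarize_infectivity` and all readout values are hypothetical, and it assumes that each transfection is first averaged over its triplicate wells and that the SD is taken across transfections.

```python
from statistics import mean, stdev

def summarize_infectivity(transfections, reference_transfections):
    """Return (mean %, SD %) of infectivity relative to a reference CD4.

    Each argument is a list of transfections; each transfection is a list
    of replicate readouts (e.g., fraction of infected cells per well).
    Assumption: replicates are averaged per transfection, and the SD is
    computed across transfections.
    """
    # Reference level: mean of the per-transfection means (defines 100%).
    ref = mean([mean(t) for t in reference_transfections])
    # Scale each transfection's mean to percent of the reference.
    scaled = [100.0 * mean(t) / ref for t in transfections]
    return mean(scaled), stdev(scaled)

# Hypothetical readouts: three transfections, each in triplicate.
human_cd4 = [[0.40, 0.50, 0.60], [0.50, 0.50, 0.50], [0.45, 0.50, 0.55]]
variant_cd4 = [[0.10, 0.10, 0.10], [0.15, 0.15, 0.15], [0.20, 0.20, 0.20]]

pct, sd = summarize_infectivity(variant_cd4, human_cd4)
print(f"variant infectivity: {pct:.1f}% of human CD4 (SD {sd:.1f})")
```

Scaling each transfection before averaging keeps the SD in the same percent units as the plotted bars.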
D1 domain genotyping of African primates
| Primate species | No. of individuals | No. of D1 variants | SIV infection | Nonsynonymous changes | Synonymous changes |
|---|---|---|---|---|---|
| Apes | | | | | |
| Chimpanzee | 544 | 9 | + | 6 | 1 |
| Bonobo | 100 | 2 | − | 1 | 0 |
| Western gorilla | 97 | 3 | + | 3 | 0 |
| Eastern gorilla | 25 | 2 | − | 3 | 0 |
| Cercopithecine monkeys | | | | | |
| L’Hoest’s monkey | 3 | 2 | + | 2 | 0 |
| Sun-tailed monkey | 2 | 3 | + | 2 | 1 |
| Tantalus monkey | 11 | 3 | + | 3 | 0 |
| Vervet monkey | 63 | 4 | + | 4 | 0 |
| Grivet | 26 | 3 | + | 3 | 0 |
| Green monkey | 53 | 4 | + | 3 | 0 |
| Malbrouck | 16 | 4 | + | 3 | 0 |
| Patas monkey | 5 | 1 | − | 0 | 0 |
| Red-tailed monkey | 4 | 2 | + | 2 | 1 |
| Mustached monkey | 15 | 11 | + | 8 | 0 |
| Greater spot-nosed monkey | 17 | 9 | + | 8 | 1 |
| Sykes’ monkey | 22 | 3 | + | 3 | 0 |
| Blue monkey | 7 | 3 | + | 3 | 0 |
| De Brazza’s monkey | 3 | 1 | + | 0 | 0 |
| Diana monkey | 5 | 3 | ? | 2 | 0 |
| Lesser spot-nosed monkey | 4 | 2 | ? | 2 | 0 |
| Crested mona monkey | 2 | 2 | − | 1 | 0 |
| Red-capped mangabey | 3 | 2 | + | 4 | 0 |
| Sooty mangabey | 5 | 2 | + | 2 | 0 |
| Mandrill | 7 | 2 | + | 1 | 1 |
| Chacma baboon | 10 | 1 | − | 0 | 0 |
| Olive baboon | 13 | 1 | − | 0 | 0 |
| Yellow baboon | 3 | 1 | − | 0 | 0 |
| Colobine monkeys | | | | | |
| Ugandan red colobus | 29 | 6 | + | 7 | 1 |
| Mantled guereza | 4 | 2 | + | 1 | 1 |
In addition to the within-species diversity shown here, D1 domain sequences were also obtained for single individuals of the following species: Preuss’s monkey (Allochrocebus preussi), Hamlyn’s monkey (Cercopithecus hamlyni), Lowe’s mona monkey (Cercopithecus lowei), Allen’s swamp monkey (Allenopithecus nigroviridis), Angolan talapoin (Miopithecus talapoin), Drill (Mandrillus leucophaeus), and Angola colobus (Colobus angolensis) (see for GenBank accession numbers).
Plus sign (+), naturally SIV infected; minus sign (−), not naturally SIV infected; question mark (?), insufficient sampling to determine SIV infection status.
D1 domain sequences were obtained from GenBank.
All or some D1 domain sequences were extracted from whole genome or RNA-seq datasets (24, 48, 68). Note that the number of D1 domain variants represents a minimum estimate, since alleles with low quality support were discarded.
Minimum number of individuals from 25 samples.
In addition to substitutions, a 3 bp deletion was observed in alleles of some red-capped and sooty mangabeys (Fig. 4A).
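As a quick consistency check, the per-species variant counts in the genotyping table can be tallied against the abstract's statement that 24 of 29 species are polymorphic, with up to 11 variants in a single species. The sketch below transcribes the "No. of D1 variants" column; it assumes that a species counts as polymorphic when more than one D1 variant was found, and the variable names are illustrative only.

```python
# Number of D1 variants per species, transcribed from the genotyping table.
D1_VARIANTS = {
    "Chimpanzee": 9, "Bonobo": 2, "Western gorilla": 3, "Eastern gorilla": 2,
    "L'Hoest's monkey": 2, "Sun-tailed monkey": 3, "Tantalus monkey": 3,
    "Vervet monkey": 4, "Grivet": 3, "Green monkey": 4, "Malbrouck": 4,
    "Patas monkey": 1, "Red-tailed monkey": 2, "Mustached monkey": 11,
    "Greater spot-nosed monkey": 9, "Sykes' monkey": 3, "Blue monkey": 3,
    "De Brazza's monkey": 1, "Diana monkey": 3, "Lesser spot-nosed monkey": 2,
    "Crested mona monkey": 2, "Red-capped mangabey": 2, "Sooty mangabey": 2,
    "Mandrill": 2, "Chacma baboon": 1, "Olive baboon": 1, "Yellow baboon": 1,
    "Ugandan red colobus": 6, "Mantled guereza": 2,
}

# Assumption: "polymorphic" means more than one D1 variant was observed.
polymorphic = [s for s, n in D1_VARIANTS.items() if n > 1]
monomorphic = sorted(s for s, n in D1_VARIANTS.items() if n == 1)
most_variable = max(D1_VARIANTS, key=D1_VARIANTS.get)

print(f"{len(polymorphic)} of {len(D1_VARIANTS)} species polymorphic")
print("Single-variant species:", ", ".join(monomorphic))
print("Most variants:", most_variable, D1_VARIANTS[most_variable])
```

The five single-variant species are the three baboons, the patas monkey, and De Brazza's monkey, several of which were sampled in small numbers.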
Fig. 2. Allelic diversity of CD4 in gorillas. (A) CD4 coding variants identified in western (G. gorilla gorilla; Ggg) and eastern (G. beringei graueri; Gbg) lowland gorillas. D1 domain variants in exon 2 are compared to the human CD4, with dots indicating sequence identity (exon 3 derived protein sequences were invariant). A potential N-linked glycosylation site is indicated by asterisks. Amino acid replacements are indicated in red, with allelic variants named based on the order of polymorphic amino acid residues. (B) Crystal structure of the HIV-1 gp120 envelope domain (black) bound to human CD4 (gray) (PDB ID code 4R2G) with polymorphic positions indicated for chimpanzees (blue) and gorillas (red). (C) CD4 allele frequencies in wild-living western (Upper) and eastern (Lower) gorillas. The number of tested individuals is indicated. (D) Effects of gorilla CD4 polymorphisms on SIV Env-mediated cell entry. The infectivity of pseudoviruses carrying the SIV Envs indicated is shown for transiently transfected cells expressing human and gorilla CD4 variants and the cognate CCR5 receptors. Values are scaled relative to human CD4 (set to 100%). Bars represent the average of two or three independent transfections, each performed in triplicate, with SDs shown. MT145, MB897, EK505, LB715, and GAB2 represent SIVcpzPtt strains from central chimpanzees, while TAN2 and BF1167 represent SIVcpzPts strains from eastern chimpanzees. (E) Protective effect of the N15 glycan. The percentage of infected cells bearing the AHSM allele was compared to a mutant (N15T) lacking the N15 glycan (fold-changes of Env infectivity for different CD4 alleles are shown in ). Individual SIV Envs are color coded as in D.
Fig. 3. Allelic diversity of CD4 in mustached monkeys. (A) CD4 coding variants identified in mustached monkeys. Mustached monkey D1 domain variants derived from both exons 2 and 3 (indicated on the bottom) are compared to one representative allele, with dots indicating identity to this reference (sequences are trimmed to the polymorphic region). Polymorphic positions are highlighted in red and their position is indicated on the top. Alleles 1 to 9 could be unambiguously inferred; the remaining alleles remain ambiguous because of polymorphisms in both exons 2 and 3, which could not be linked. For individual 10, permutations of exons 2 and 3 combinations resulted in one new allele (either 10a or 10b), which was paired with an allele already identified in other individuals. For individual 11, permutations resulted in either one new allele (11a) combined with an already known allele, or a combination of two new alleles (11b and 12?). An N-linked glycosylation site is indicated by asterisks. An arrow marks allele 2, which is the inferred ancestral allele. (B) Effects of mustached monkey CD4 polymorphisms on SIV Env-mediated cell entry. A heatmap displays the percentage of cells expressing the indicated CD4 variant that were infected by the corresponding SIV Env, averaged across two or three experiments each performed in triplicate (fold-changes of Env infectivity for different CD4 alleles are shown in ).
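The phase ambiguity described in panel A arises when an individual is heterozygous in both exon 2 and exon 3 and the two exons cannot be linked: two pairings of exon variants are then possible. The sketch below enumerates those pairings and counts how many alleles in each would be new relative to a set of known alleles. It is only an illustration of the combinatorics, not the authors' inference procedure; the function names and the exon labels (`E2x`, `E3p`, and so on) are hypothetical.

```python
def candidate_phasings(exon2, exon3):
    """Enumerate possible allele pairs for unlinked exon 2 / exon 3 genotypes.

    exon2 and exon3 are 2-tuples holding the two variants observed in one
    individual. Each returned frozenset is one possible pair of full-length
    alleles, written as (exon2_variant, exon3_variant) tuples.
    """
    (a, b), (c, d) = exon2, exon3
    # Two ways to link the exons; they collapse to one if either is homozygous.
    return {frozenset({(a, c), (b, d)}), frozenset({(a, d), (b, c)})}

def new_allele_counts(phasings, known_alleles):
    """For each phasing, count alleles absent from the known set (sorted)."""
    return sorted(
        sum(1 for allele in phasing if allele not in known_alleles)
        for phasing in phasings
    )

# Hypothetical example: one allele was already seen in other individuals.
known = {("E2x", "E3p")}
phasings = candidate_phasings(("E2x", "E2y"), ("E3p", "E3q"))
counts = new_allele_counts(phasings, known)
print(counts)  # one phasing needs 1 new allele, the other needs 2
```

This mirrors the situation described for individual 11, where one pairing combines a known allele with a single new one and the alternative pairing implies two new alleles.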
Fig. 4. CD4 diversity in African primate species. (A) D1 domain positions exhibiting intraspecies polymorphisms. For each primate species, the positions of polymorphic residues are shown, with the amino acid residues indicated. Dots indicate identity to the CD4 consensus, while dashes indicate deletions. Contact residues between HIV-1 Env and human CD4 are indicated in red. (B and C) Maximum-likelihood trees of G6PD (B) and CD4 D1 domain (C) nucleotide sequences from different guenon species (color coded). Bootstrap values of >90% are shown; the scale bars indicate 0.001 and 0.004 nucleotide substitutions per site, respectively.
Fig. 5. D1 domain glycosylation in primates. (A) PNGS found in the D1 domain of different primate species. PNGS positions are modeled onto the structure of the HIV-1 envelope trimer, with different protomers highlighted in pink, green, and gray, respectively, and the bound human CD4 shown in black (PDB ID code 5U1F). (B) Phylogeny of African primate species adapted from Springer et al. (69). The tree highlights the presence of glycans within each species, which are color coded as in A. Solid and striped squares indicate invariant and polymorphic glycans, respectively. The scale bar indicates estimated primate divergence times as previously reported (69).
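Potential N-linked glycosylation sites (PNGS) are conventionally recognized by the sequon N-X-S/T, where X is any residue except proline. A minimal scan for this motif might look like the sketch below. The function name `find_pngs` and the test sequence are hypothetical (the sequence is not CD4), and stricter definitions that also exclude proline immediately after the S/T are not applied here.

```python
import re

# N, then any residue except P, then S or T. The lookahead allows
# overlapping sequons (e.g., "NNSS") to be reported individually.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_pngs(protein_seq):
    """Return 1-based positions of the Asn in each N-X-S/T sequon (X != P)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein_seq.upper())]

# Hypothetical peptide: NKT (sequon), NPS (excluded by proline), NGS (sequon).
demo = "ANKTNPSGNGSA"
print(find_pngs(demo))  # [2, 9]
```

Running the same scan on each allele of an alignment would flag variants that gain or lose a sequon, such as a substitution of the Asn that removes the site.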