Thomas D Otto1,2, Aude Gilabert3, Thomas Crellen4,5, Ulrike Böhme4, Céline Arnathau3, Mandy Sanders4, Samuel O Oyola4,6, Alain Prince Okouga7, Larson Boundenga7, Eric Willaume8, Barthélémy Ngoubangoye7, Nancy Diamella Moukodoum7, Christophe Paupy3, Patrick Durand3, Virginie Rougeron3,7, Benjamin Ollomo7, François Renaud3, Chris Newbold4,9, Matthew Berriman4, Franck Prugnolle3,7.
Abstract
Plasmodium falciparum, the most virulent agent of human malaria, shares a recent common ancestor with the gorilla parasite Plasmodium praefalciparum. Little is known about the other gorilla- and chimpanzee-infecting species in the same (Laverania) subgenus as P. falciparum, but none of them are capable of establishing repeated infection and transmission in humans. To elucidate underlying mechanisms and the evolutionary history of this subgenus, we have generated multiple genomes from all known Laverania species. The completeness of our dataset allows us to conclude that interspecific gene transfers, as well as convergent evolution, were important in the evolution of these species. Striking copy number and structural variations were observed within gene families and one, stevor, shows a host-specific sequence pattern. The complete genome sequence of the closest ancestor of P. falciparum enables us to estimate the timing of the beginning of speciation to be 40,000-60,000 years ago followed by a population bottleneck around 4,000-6,000 years ago. Our data allow us also to search in detail for the features of P. falciparum that made it the only member of the Laverania able to infect and spread in humans.
Year: 2018 PMID: 29784978 PMCID: PMC5985962 DOI: 10.1038/s41564-018-0162-2
Source DB: PubMed Journal: Nat Microbiol ISSN: 2058-5276 Impact factor: 17.745
Figure 1. Overview of the dating of the evolution of the Laverania.
(a) Maximum likelihood tree of the Laverania based on the “Lav12sp” set of orthologues. All bootstrap values are 100. Coalescence based estimates of the timing of speciation events are displayed on nodes (MYA - million years ago), based on intergenic and genic alignments. (b) Multiple sequentially Markovian coalescent estimates of the effective population size (Ne) in the P. falciparum and P. praefalciparum population. Assuming our estimate of the number of mitotic events per year, a bottleneck occurred in P. falciparum 4,000-6,000 years ago. The y-axis shows the natural logarithm (Ln) of Ne. Bootstrapping (pale lines) was performed by randomly resampling segregating sites from the input 50 times.
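The bootstrapping step in panel (b) can be illustrated with a minimal sketch. It assumes that individual segregating sites are resampled with replacement; the caption does not state this, and the coalescent software used may resample larger blocks instead. The function name `bootstrap_sites` and the site positions are hypothetical.

```python
import random

def bootstrap_sites(sites, n_replicates=50, seed=0):
    """Draw n_replicates bootstrap samples from a list of segregating sites.

    Assumption: each replicate has the same size as the input and is drawn
    with replacement. This is an illustration, not the authors' pipeline.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        # Sample with replacement, then restore positional order
        sample = sorted(rng.choices(sites, k=len(sites)))
        replicates.append(sample)
    return replicates

# Hypothetical genomic positions of segregating sites
sites = [120, 455, 980, 1503, 2210, 3075]
replicates = bootstrap_sites(sites)
print(len(replicates), len(replicates[0]))
```

Each replicate would then be passed through the same population size inference as the original data, giving the spread shown by the pale lines.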
Figure 2. Overview of the analyses of core genes over all Laverania genomes.
(a) Summary of evolution of core genes. From outer to inner track: scatterplot of branch site test for each genome (see Supplementary Table 4 for P. falciparum data); per-species dN/dS values (0.5 < dN/dS < 2); orthologues, represented by vertical black lines under the chromosome track, with dots representing P. falciparum 3D7 var genes on the forward (blue) or reverse (red) strands, or var pseudogenes (black); average of the relative polymorphism (π) across species, with the underlying π for each species calculated from multiple strains (“Lav15st” dataset) and normalized by the average for that species; signatures of convergent evolution based on host-specific fixed differences analysis, with the chromosome 4 region that includes the Rh5 locus highlighted (black box). (b) Magnified view of the Rh5 region that is enriched with host-specific fixed differences. Convergent evolution analysis was performed using orthologues conserved across the Laverania. Filled circles represent the subset of differences that were fixed within all the isolates available (“Lav15st” set) and for which we could reject neutral evolution (for the gene list see Supplementary Table 5).
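The relative polymorphism track in panel (a) normalizes each species' π by that species' average and then averages across species. A minimal sketch of that arithmetic follows, assuming one π value per gene per species and identical gene order in every species; the function names and the numbers are hypothetical.

```python
def normalise_pi(pi_by_species):
    """Divide each per-gene pi value by the mean pi of its species."""
    normalised = {}
    for species, values in pi_by_species.items():
        mean_pi = sum(values) / len(values)
        normalised[species] = [v / mean_pi for v in values]
    return normalised

def average_across_species(normalised):
    """Average the normalised pi of each gene over all species.

    Assumption: every species lists the same genes in the same order.
    """
    columns = zip(*normalised.values())
    return [sum(col) / len(col) for col in columns]

# Hypothetical per-gene pi values for two species
pi_by_species = {
    "species_1": [0.001, 0.003],
    "species_2": [0.02, 0.02],
}
relative_pi = average_across_species(normalise_pi(pi_by_species))
print(relative_pi)
```

Normalizing first keeps a highly polymorphic species from dominating the cross-species average.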
Figure 3. Gene families in the Laverania.
Distribution of major multigene families including var and those that show significant copy number variation among lineages. Data from P. praefalciparum include the subtelomeric gene families from the two infecting genotypes. Assembly of P. billcollinsi is incomplete in the subtelomeres.
Figure 4. Clustering of Pir (Rifin and Stevor) protein families.
Graphical representation of similarity between all Pir proteins > 250 aa, coloured by species. A BLAST cut-off of 45% global identity was used (see Methods). More connected genes are more similar. Black circles highlight Clade A rifin proteins that cluster with Clade B rifin proteins.
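The similarity network described here can be sketched as a graph in which two proteins are linked when their identity meets the 45% cut-off, with clusters read off as connected components. This is an illustrative reading of the caption, not the authors' method; the function name, protein labels, and identity values are hypothetical.

```python
from collections import defaultdict

def cluster_by_identity(pairs, cutoff=45.0):
    """Group proteins into connected components of an identity graph.

    pairs: iterable of (protein_a, protein_b, percent_identity).
    An edge is kept only when percent_identity >= cutoff.
    """
    adjacency = defaultdict(set)
    for a, b, identity in pairs:
        # Register both proteins so unlinked ones form their own cluster
        adjacency[a]
        adjacency[b]
        if identity >= cutoff:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    clusters = []
    for node in adjacency:
        if node in seen:
            continue
        # Depth-first traversal collects one connected component
        stack = [node]
        component = set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        clusters.append(component)
    return clusters

# Hypothetical pairwise identities between four proteins
pairs = [
    ("rifin_1", "rifin_2", 60.0),
    ("rifin_2", "rifin_3", 50.0),
    ("rifin_3", "stevor_1", 30.0),
]
print(cluster_by_identity(pairs))
```

In this toy input the three rifin-like proteins form one cluster through chained links, while the stevor-like protein stays alone because its only hit falls below the cut-off.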
Figure 5. Evolution of var gene domains in the Laverania.
(a) Heatmap of numbers of var gene domains in each Laverania species. Duffy represents regions closest to the Pfam Duffy binding domain. CIDRn is a new domain discovered in this study in Clade A. Only domains from var genes longer than 2.5 kb were considered. Heatmap colours blue-yellow-white indicate decreasing copy numbers. (b) Graphical representation of similarity between domains, using domains from var genes longer than 2.5 kb. Domains are coloured by species and clustered by a minimum BLAST cut-off of 45% global identity. Larger circles denote var genes in the opposite orientation. (c) Maximum likelihood trees of the Acidic Terminal Sequence (ATS). Apparent ATS sequences from Clade A that cluster with Clade B are indicated (**).
Figure 6. Overview of the genomic evolution of the Laverania subgenus.
The values of polymorphism (π) within the species are indicated by triangles of different sizes at the ends of the tree branches, and the bottleneck in P. falciparum ~5,000 years ago is shown as a constricted branch width. Also shown are the gene transfers that occurred between certain Clade A and B species and the large genomic differences that accumulated in Clade B after its divergence from P. blacklocki.