Abstract
The genomic architecture of organisms, including nucleotide composition, can be highly variable, even among closely related species. To better understand the causes leading to structural variation in genomes, information on distinct and diverse genomic features is needed. Malaria parasites are known for encompassing a wide range of genomic GC-content and it has long been thought that Plasmodium falciparum, the virulent malaria parasite of humans, has the most AT-biased eukaryotic genome. Here, I perform comparative genomic analyses of the most AT-rich eukaryotes sequenced to date, and show that the avian malaria parasites Plasmodium gallinaceum, P. ashfordi, and P. relictum have the most extreme coding sequences in terms of AT-bias. Their mean GC-content is 21.21, 21.22 and 21.60 %, respectively, which is considerably lower than the transcriptome of P. falciparum (23.79 %) and other eukaryotes. This information enables a better understanding of genome evolution and raises the question of how certain organisms are able to prosper despite severe compositional constraints.
Keywords: AT-bias; GC-content; Plasmodium; genome evolution
Year: 2018 PMID: 29360019 PMCID: PMC5857377 DOI: 10.1099/mgen.0.000150
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
GC content (%) of the most AT-rich eukaryotes sequenced to date
| Species | Host | Transcripts | CDS | Non-CDS* | Genome |
|---|---|---|---|---|---|
| P. gallinaceum | Birds | 21.21 | 21.19 | 14.85 | 17.83 |
| P. ashfordi† | Birds | 21.22 | | | |
| P. relictum | Birds | 21.60 | 21.57 | 15.27 | 18.33 |
| | Primates | 22.42 | 22.44 | 12.78 | 18.21 |
| P. falciparum | Primates | 23.79 | 23.78 | 14.28 | 19.34 |
| | Rodents | 23.79 | 23.75 | 19.95 | 22.04 |
| | Rodents | 23.94 | 23.91 | 19.62 | 21.74 |
| | Primates | 24.07 | 24.06 | 13.72 | 19.26 |
| | Rodents | 24.70 | 24.66 | 20.62 | 22.89 |
| | Insects | | 24.83 | 16.64 | 18.78 |
| | Rodents | 25.58 | 25.53 | 21.25 | 23.62 |
| | Crustaceans | | 25.62 | 20.46 | 22.60 |
| | Ruminants | | 26.76 | 14.31 | 17.00 |
| | Insects | 27.42 | 27.36 | 24.40 | 25.27 |
| | | 27.42 | 27.41 | 14.40 | 22.44 |
| | | | 27.53 | 17.24 | 22.32 |
| | Mammals | 27.72 | 27.72 | 20.09 | 23.67 |
| | | | 27.74 | 19.10 | 22.94 |
| | Mammals | 27.78 | 27.78 | 21.50 | 25.02 |
| | Crustaceans | | 27.82 | 19.53 | 25.45 |
| | Insects | 27.74 | 27.84 | 21.92 | 23.21 |
| | Rodents | | 27.98 | 16.91 | 21.43 |
*Introns and intergenic sequences (non-coding).
†Data from this species are derived from a transcriptome assembly [30].
‡This species was previously known under the name Orpinomyces sp. C1A [57].
§This species was previously known under the name Brachiola algerae [37].
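The GC percentages in the table above are base-composition summaries of sets of sequences. The following is a minimal sketch of how such a value can be computed, not the author's pipeline. It assumes that GC-content is the share of G and C among unambiguous bases (symbols such as N are ignored) and that a mean is taken per sequence without length weighting. The function names and the toy sequences are hypothetical and chosen for illustration only.

```python
from typing import Iterable


def gc_percent(seq: str) -> float:
    """Return the GC-content (%) of one nucleotide sequence.

    Assumption: only A, C, G, T and U are counted; ambiguous symbols
    such as N are ignored in both numerator and denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, T or U bases")
    return 100.0 * gc / total


def mean_gc_percent(seqs: Iterable[str]) -> float:
    """Return the unweighted mean of per-sequence GC-content (%).

    Assumption: every sequence contributes equally, regardless of length.
    """
    values = [gc_percent(s) for s in seqs]
    if not values:
        raise ValueError("no sequences given")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy coding sequences, invented for illustration (not real data).
    toy_cds = ["ATGAAATTTTAA", "ATGGCATAA"]
    for s in toy_cds:
        print(s, round(gc_percent(s), 2))
    print("mean", round(mean_gc_percent(toy_cds), 2))
```

A length-weighted alternative would pool all bases before dividing. For sets with very unequal sequence lengths the two definitions can give noticeably different values, so it is worth checking which one a given study reports.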
Genome statistics of the Plasmodium species analysed
| Species | GC (%) CDS | Genome | Organellar genomes | Protein | Contigs* | Transcripts | CDS | Orthologs | Version |
|---|---|---|---|---|---|---|---|---|---|
| | 21.19 | 25.03 | Yes | 5307 | 154 | 5439 | 5307 | 5233 | 2017-01-09 |
| | 21.57 | 22.61 | Yes | 5178 | 514 | 5306 | 5178 | 5108 | 2017-01-09 |
| | 22.44 | 20.39 | Yes | 5286 | 833 | 5590 | 5774 | 5196 | 2016-06-16 |
| | 23.78 | 23.33 | Yes | 5460 | 16 | 5800 | 5734 | 5458 | 2015-06-18 |
| | 23.79 | 18.78 | Yes | 5067 | 21 | 5254 | 5094 | 5067 | 2017-01-09 |
| | 23.91 | 23.08 | Yes | 6091 | 16 | 6258 | 6094 | 6091 | 2016-10-27 |
| | 24.06 | 24.06 | Yes | 5769 | 372 | 6071 | 6012 | 5733 | 2015-06-18 |
| | 24.66 | 18.22 | No | 4954 | 49 | 5009 | 4954 | 4944 | 2014-06-17 |
| | 25.53 | 18.97 | Yes | 5217 | 16 | 5364 | 5217 | 5216 | 2015-06-18 |
| | 46.30 | 27.01 | Yes | 5552 | 2748 | 5631 | 5552 | 5550 | 2015-06-18 |
*Number of contigs/chromosomes making up the genome assembly, including organellar genome sequences. Example: the P. berghei genome assembly includes 14 nuclear chromosomes, one mitochondrial genome, one apicoplast genome, and five extra contigs with unplaced sequences [31].
Fig. 1. Comparative transcriptome GC-content of the eight eukaryotes with the most AT-rich genes. (a) Density GC curves of P. ashfordi (first row, in green), P. gallinaceum (second row, in blue), and P. relictum (third row, in turquoise). (b) Violin plot of transcriptome GC-content. The grey shaded area represents the bird-infecting malaria parasites and the dashed horizontal line shows the mean GC-content of P. ashfordi and P. gallinaceum (21.2 %). Pa, P. ashfordi; Pgal, P. gallinaceum; Prel, P. relictum; Pgab, P. gaboni; Pf, P. falciparum; Prei, P. reichenowi; Pb, P. berghei; Py, P. yoelii.
Fig. 2. GC-content by gene category in the seven eukaryotes with the most AT-rich genes and with genome sequences available. Points signify mean GC-percentages and horizontal lines delineate the 95 % confidence interval. The shaded area represents the bird-infecting malaria parasites. Pgal, P. gallinaceum; Prel, P. relictum; Pgab, P. gaboni; Pf, P. falciparum; Prei, P. reichenowi; Pb, P. berghei; Py, P. yoelii.
Fig. 3. Relative pairwise differences in amino acid proportions of predicted proteins in the genomes of P. gallinaceum versus P. falciparum (an AT-rich congeneric) and versus P. vivax (a GC-rich congeneric). The same comparison is made for P. relictum versus P. falciparum and versus P. vivax. Positive values (red bars) indicate a larger relative proportion of the denoted amino acids in the genomes of either P. gallinaceum or P. relictum. Note the differences in scale in the y-axes between the P. falciparum and the P. vivax comparison.