Julie Boisard, Evelyne Duvernois-Berthet, Linda Duval, Joseph Schrével, Laure Guillou, Amandine Labat, Sophie Le Panse, Gérard Prensier, Loïc Ponger, Isabelle Florent.
Abstract
Our view of the evolutionary history, coding and adaptive capacities of Apicomplexa, protozoan parasites of a wide range of metazoans, is currently strongly biased toward species infecting humans, as data on early diverging apicomplexan lineages infecting invertebrates are extremely limited. Here, we characterized the genome of the marine eugregarine Porospora gigantea, an intestinal parasite of lobsters, remarkable for the macroscopic size of its vegetative feeding forms (trophozoites) and its gliding speed, the fastest so far recorded for Apicomplexa. Two highly syntenic genomes, named A and B, were assembled. Similar in size (~9 Mb) and coding capacity (~5300 genes), the A and B genomes are 10.8% divergent at the nucleotide level, corresponding to 16-38 My of divergence time. Orthogroup analysis across 25 (proto)Apicomplexa species, including Gregarina niphandrodes, showed that A and B are highly divergent from all other known apicomplexan species, revealing an unexpected breadth of diversity. Phylogenetically, these two species branch as sisters to Cephaloidophoroidea, and thus expand the known crustacean gregarine superfamily. The genomes were mined for genes encoding proteins necessary for gliding, a key feature of apicomplexan parasites, currently studied through the molecular model called the glideosome. Sequence analysis shows that actin-related proteins and regulatory factors are strongly conserved within apicomplexans. In contrast, the predicted protein sequences of core glideosome proteins and adhesion proteins are highly variable among apicomplexan lineages, especially in gregarines. These results confirm the importance of studying gregarines to widen our biological and evolutionary view of apicomplexan species diversity, and to deepen our understanding of the molecular bases of key functions such as gliding, well known to allow access to the intracellular parasitic lifestyle in Apicomplexa.
Keywords: Apicomplexa; Comparative genomics; Genome assembly; Gliding; Marine gregarine; Phylogeny
Year: 2022 PMID: 35780080 PMCID: PMC9250747 DOI: 10.1186/s12864-022-08700-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 4.547
Fig. 1 Morphological characterization of Porospora cf. gigantea. A. Trophozoite stage (Tropho #8, Lobster #12) (scale bar = 100 μm). B. Zoom on A, showing trophozoite epimerite (scale bar = 10 μm). C. Rectal ampulla showing cysts in folds (Lobster #4) (scale bar = 1 mm). D. Isolated cyst (Cyst #4, Lobster #12) (scale bar = 50 μm). E. Broken cyst packed with gymnospores (Lobster #4) (scale bar = 10 μm). F. Section across a cyst illustrating radial arrangement of zoites in gymnospores (JS449 = Lobster #35) (scale bar = 2 μm). G., H. Zoom on intact and broken gymnospores showing zoites (Lobster #4) (scale bar = 1 μm). All images are scanning electron micrographs except F, which is a transmission electron micrograph. See also Fig. S1 and Tables S2, S3, S4, S5 and S6
Metrics of the genomes of P. cf. gigantea and a selection of 6 reference species. * Considering only genes with intron(s). See also Figs. S2, S3 and S5
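| species | P. cf. gigantea | P. cf. gigantea | G. niphandrodes | C. parvum | T. gondii | P. falciparum | C. velia | V. brassicaformis |
|---|---|---|---|---|---|---|---|---|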
| strain | A | B | na | IowaII | ME49 | 3D7 | CCMP2878 | CCMP3155 |
| nb of contigs/chromosomes | 787 | 934 | 355 | 8 | 435 | 14 | 5470 | 1006 |
| total length of assembly (bp) | 8,806,768 | 9,049,943 | 13,873,624 | 9,102,324 | 63,472,444 | 23,292,622 | 192,006,978 | 72,475,329 |
| mean length contigs/chromosomes (bp) | 11,190.3 | 9689.45 | 39,080.63 | 1,137,790.5 | 145,913.66 | 1,663,758.71 | 35,101.82 | 72,043.07 |
| GC content (%) | 54.3 | 54.3 | 53.8 | 30.2 | 52.4 | 19.3 | 49.1 | 58.1 |
| nb of protein coding genes | 5270 | 5361 | 6606 | 4020 | 8862 | 5602 | 30,604 | 23,412 |
| mean length of coding genes (bp) | 1438.2 | 1450.3 | 1392.6 | 1865.0 | 5602.9 | 2488.6 | 4507.6 | 2704.7 |
| nb of tRNA | 14 | 14 | 231 | 45 | 150 | 45 | 0 | 0 |
| nb of rRNA | 27 | 25 | 0 | 5 | 420 | 28 | 0 | 0 |
| nb of genes with intron(s) | 2957 | 2981 | 2390 | 575 | 6801 | 3010 | 21,895 | 22,163 |
| median length of the introns (bp) | 28 [27–30] | 28 [27–30] | 95 [56–145] | 65 [51–91] | 467 [322–632] | 140 [110–184] | 372 [273–520] | 81 [70–98] |
| mode of intron length (bp) | 28 | 28 | 37 | 44 | 55 | 121 | 320 | 74 |
| mean nb of introns per gene* | 1.8 | 1.8 | 1.4 | 1.8 | 5.9 | 2.9 | 5.4 | 7.9 |
| non-coding DNA (%) | 16 | 16 | 37 | 24 | 68 | 47 | 74 | 50 |
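
The "mean length contigs/chromosomes" row can be cross-checked against the two rows above it, since a mean sequence length is simply the total assembly length divided by the number of contigs or chromosomes. The following is a minimal sketch of that check, not part of the original analysis: the values are copied from the table and keyed by the "strain" row, while the function names and the 0.01 bp tolerance are assumptions made for illustration.

```python
# Values copied from the metrics table, keyed by the "strain" row.
# Each entry: (nb of contigs/chromosomes,
#              total length of assembly in bp,
#              reported mean length of contigs/chromosomes in bp)
ASSEMBLIES = {
    "A": (787, 8_806_768, 11_190.3),
    "B": (934, 9_049_943, 9_689.45),
    "na": (355, 13_873_624, 39_080.63),
    "IowaII": (8, 9_102_324, 1_137_790.5),
    "ME49": (435, 63_472_444, 145_913.66),
    "3D7": (14, 23_292_622, 1_663_758.71),
    "CCMP2878": (5470, 192_006_978, 35_101.82),
    "CCMP3155": (1006, 72_475_329, 72_043.07),
}


def mean_sequence_length(n_sequences, total_bp):
    """Mean contig/chromosome length: total length divided by sequence count."""
    return total_bp / n_sequences


def check_reported_means(assemblies, tolerance=0.01):
    """Return {strain: (computed, reported, agrees)} for every assembly.

    The tolerance (hypothetical choice) absorbs rounding of the reported
    values to one or two decimal places.
    """
    results = {}
    for strain, (n_sequences, total_bp, reported) in assemblies.items():
        computed = mean_sequence_length(n_sequences, total_bp)
        agrees = abs(computed - reported) <= tolerance
        results[strain] = (computed, reported, agrees)
    return results


if __name__ == "__main__":
    for strain, (computed, reported, ok) in check_reported_means(ASSEMBLIES).items():
        print(f"{strain:>9}: computed {computed:,.2f} bp, "
              f"reported {reported:,.2f} bp, match={ok}")
```

Under these assumptions, each reported mean should agree with the value recomputed from the total length and sequence count, which suggests the three rows are internally consistent.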
Fig. 2 Shared apicomplexan proteins. Distribution of the orthogroups among P. cf. gigantea A and B and 4 species of apicomplexans: the gregarine G. niphandrodes, the cryptosporidian C. parvum, the coccidian T. gondii and the hematozoan P. falciparum. Orthogroups only shared by P. cf. gigantea A and B are highlighted in green, whereas orthogroups shared by all species are highlighted in pink. Only bars with more than 20 orthogroups are shown. See also Table S1
Fig. 3 Phylogeny of Apicomplexa. Maximum likelihood phylogeny of apicomplexans as retrieved from a 312-protein dataset, merged from two previously published datasets [10, 11, 13]. The final concatenated alignment comprised 93,936 sites from 80 species. Bootstrap support values (n = 1000) followed by MrBayes posterior probabilities are shown on the branches. Black spots indicate 100/1 support. Porospora cf. gigantea A and B, sequenced in this study, are shown in bold. See also Figs. S7 and S8
Fig. 4 Comparative analysis of glideosome components. A. Table of presence/absence of genes encoding glideosome proteins, distributed into functional groups. Glideosome components have been described mainly in T. gondii and P. falciparum. Protein sequences were searched for in the genomes of both Porospora and a selection of representative species as well as in available gregarine transcriptomes. Green indicates the presence, while white indicates the absence, of an orthologous protein-encoding sequence. Light red refers to cases where only partial sequences have been retrieved. Violet indicates the presence of at least one member of a multigene protein family. * refers to the GAP45 3′ short conserved domain found in some gregarine species. All P. cf. gigantea orthologous proteins are detailed in Table S8. B. Schematic comparison of the canonical model of the glideosome and the elements found in P. cf. gigantea A and B. Missing proteins are shown with dotted lines
Fig. 5 Structures and molecular domains of candidate TRAP-like proteins in P. cf. gigantea A and B. See also Table S8