Jian Guo, Jianbo Jian, Lili Wang, Lijuan Xiong, Huiping Lin, Ziyi Zhou, Eva C Sonnenschein, Wenjuan Wu.
Abstract
Prototheca is the only chlorophyte alga known to cause clinically relevant opportunistic infections in humans and animals, collectively termed protothecosis. Most human cases are caused by Prototheca wickerhamii. To investigate the evolution of Prototheca and the genetic basis of its pathogenicity, the genomes of two P. wickerhamii strains, S1 and S931, were sequenced using Nanopore long-read and Illumina short-read technologies. The mitochondrial, plastid, and nuclear genomes were assembled and annotated, including a transcriptomic data set. The assembled nuclear genome was 17.57 Mb in 19 contigs for strain S1 and 17.45 Mb in 26 contigs for strain S931. Approximately 5,700 protein-coding genes were predicted in each strain, and more than 96% of them could be assigned a gene function. A total of 2,798 gene families were shared among the five currently available Prototheca genomes. In the phylogenetic analysis, the genus Prototheca fell in the same clade as A. protothecoides and diverged from Chlorella ~500 million years ago (Mya). A total of 134 expanded genes were enriched in several pathways, mostly metabolic pathways, followed by biosynthesis of secondary metabolites and RNA transport. Comparative analysis showed more than 96% consistency between the two strains sequenced here. At present, because the biology and pathogenicity of Prototheca are insufficiently understood, the diagnosis rate of protothecosis is much lower than the actual infection rate. This study provides in-depth insight into the genome sequences of two clinical isolates of P. wickerhamii, contributing to the basic understanding of this alga and to the exploration of future prevention and treatment strategies.
Keywords: Prototheca wickerhamii; algae; pathogenic; protothecosis; whole genome sequencing
Year: 2022 PMID: 35186789 PMCID: PMC8847788 DOI: 10.3389/fcimb.2022.797017
Source DB: PubMed Journal: Front Cell Infect Microbiol ISSN: 2235-2988 Impact factor: 5.293
Figure 1. Characteristics of the two P. wickerhamii genomes. Distribution of genomic features for strain S1 (red) and strain S931 (green). From inside to outside: (A) syntenic gene blocks, (B) GC content, (C) TE density, (D) gene number, (E) gene length, and (F) contigs (>500 kb).
Genome statistics of two newly sequenced and one published P. wickerhamii strains.
| Statistic | Strain S1 | Strain S931 | Strain ATCC 16529 |
|---|---|---|---|
| Assembly length (Mb) | 17.57 | 17.45 | 16.70 |
| Contig number | 19 | 26 | 21 |
| N50 contig (bp) | 1,639,047 | 1,406,360 | 1,578,614 |
| GC content (%) | 64.21 | 64.45 | 64.16 |
| Number of genes | 5,694 | 5,704 | 6,081 |
| Complete BUSCOs V5 (gene) | 88.40% | 86.90% | 79.70% |
| Repetitive DNA in genome assembly (%) | 3.11 | 2.49 | 2.25 |
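Two of the statistics in the table, N50 contig length and GC content, follow standard definitions that are easy to state in code. The sketch below is illustrative only: the function names `n50` and `gc_percent` are hypothetical, the toy inputs are invented, and the source does not say which tool produced the reported values.

```python
from typing import Iterable, List


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50: the length of the contig at which the running total,
    taken over contigs sorted from longest to shortest, first reaches at
    least half of the total assembly length."""
    lengths: List[int] = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return lengths[-1]  # not reached for non-empty input; kept for safety


def gc_percent(sequences: Iterable[str]) -> float:
    """Return GC content in percent over all unambiguous bases (A, C, G, T).
    Ambiguous symbols such as N are ignored (an assumption of this sketch)."""
    gc = 0
    total = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        total += sum(s.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy values, not taken from the assemblies in the table.
    print(n50([2, 3, 4, 5, 6]))
    print(gc_percent(["GGCC", "ATAT"]))
```

In practice the contig lengths and sequences would be read from the assembly FASTA file; that parsing step is omitted here to keep the example self-contained.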
Figure 2. Distribution of gene families in P. wickerhamii strain S1, P. wickerhamii strain S931, P. wickerhamii strain ATCC 16529, P. cutis, and P. stagnorum. Homologous genes in the five genomes were clustered into gene families. Numbers indicate unique and shared gene families in each genome.
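Counts of shared and unique gene families, as shown in a figure of this kind, reduce to set operations once each genome has been mapped to the set of family identifiers it contains. The following is a minimal sketch under that assumption: the function name `shared_and_unique`, the genome labels, and the family identifiers are all hypothetical, and the clustering step that assigns genes to families is not shown because the source does not describe it here.

```python
from typing import Dict, Set, Tuple


def shared_and_unique(
    families_by_genome: Dict[str, Set[str]]
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Return (families present in every genome,
    families found in exactly one genome, keyed by that genome)."""
    if not families_by_genome:
        raise ValueError("at least one genome is required")
    # Families present in all genomes: intersection of every set.
    shared = set.intersection(*families_by_genome.values())
    unique: Dict[str, Set[str]] = {}
    for name, families in families_by_genome.items():
        # Union of the families seen in every other genome.
        others: Set[str] = set().union(
            *(f for n, f in families_by_genome.items() if n != name)
        )
        unique[name] = families - others
    return shared, unique


if __name__ == "__main__":
    # Invented toy data; real input would come from an ortholog clustering run.
    toy = {
        "genome_A": {"fam1", "fam2", "fam3"},
        "genome_B": {"fam1", "fam2", "fam4"},
        "genome_C": {"fam1", "fam5"},
    }
    shared, unique = shared_and_unique(toy)
    print(len(shared), {name: len(f) for name, f in unique.items()})
```

Families shared by only some of the genomes (here `fam2`) fall into neither result; a full Venn diagram would need the intersection for each subset of genomes.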
Figure 3. A phylogenetic tree and gene family expansion and contraction among 13 species. Node labels represent node ages. Expansion and contraction of gene families are shown in green and red, respectively.
Figure 4. The KOG functional categories for “reduced virulence,” “unaffected pathogenicity,” and “loss of pathogenicity” in S1 and S931.