Victoria Mesa, Marc Monot, Laurent Ferraris, Michel Popoff, Christelle Mazuet, Frederic Barbut, Johanne Delannoy, Bruno Dupuy, Marie-Jose Butel, Julio Aires.
Abstract
Clostridium neonatale is a potential opportunistic pathogen recovered from faecal samples in cases of necrotizing enterocolitis (NEC), a gastrointestinal disease affecting preterm neonates. Although the C. neonatale species description and name validation were published in 2018, comparative genomics are lacking. In the present study, we provide the closed genome assembly of the C. neonatale ATCC BAA-265T (=250.09) reference strain with a manually curated functional annotation of the coding sequences. Pan-, core- and accessory genome analyses were performed using the complete 250.09 genome (4.7 Mb), three new assemblies (4.6–5.6 Mb), and five publicly available draft genome assemblies (4.6–4.7 Mb). The C. neonatale pan-genome contains 6840 genes, while the core-genome has 3387 genes. Pan-genome analysis revealed an 'open' state and genomic diversity. The strain-specific gene families ranged from five to 742 genes. Multiple mobile genetic elements were predicted, including a total of 201 genomic islands, 13 insertion sequence families, one CRISPR-Cas type I-B system and 15 predicted intact prophage signatures. Primary virulence classes including offensive, defensive, regulation of virulence-associated genes and non-specific virulence factors were identified. A tet(W/N/W) gene encoding a tetracycline resistance ribosomal protection protein and a 23S rRNA methyltransferase ermQ gene were identified in two different strains. Together, our results reveal the genetic diversity and plasticity of C. neonatale genomes and provide a comprehensive view of this species' genomic features, paving the way for the characterization of its biological capabilities.
Keywords: Clostridium neonatale; closed genome; comparative genomics; core-genome; necrotizing enterocolitis; pan-genome
Year: 2022 PMID: 35550024 PMCID: PMC9465065 DOI: 10.1099/mgen.0.000813
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
C. neonatale strains and genome data
NEC+, preterm infants with a diagnosis of necrotizing enterocolitis (NEC).
| Strain ID | Date of isolation (reference) | Sample type | Sequencing technology | Contigs/genome closure level | Contig N50 (bp) | Contig L50 | Genome size (kb) | G+C (%) | CDS | tRNAs | rRNAs | Repeat regions | Accession no. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 250.09 *, † | 2002 [ | Stool/NEC+ | PacBio HiSeq | 1 Closed | 4 753 394 | 1 | 4753 | 28.61 | 4356 | 85 | 33 | 275 | ERS7257048 (This study) |
| CC3_PB† | 2015 [ | Stool/NEC+ | PacBio HiSeq | 6 Contigs | 5 137 519 | 1 | 5675 | 28.60 | 5505 | 91 | 32 | 37 | ERS7257050 (This study) |
| LF22† | 2006 | Stool | HiSeq | 288 Contigs | 55 509 | 32 | 5598 | 28.67 | 5602 | 78 | 11 | 53 | ERS7257051 (This study) |
| CB12† | 2015 [ | Stool/NEC+ | HiSeq | 83 Contigs | 119 741 | 14 | 4612 | 28.38 | 4259 | 74 | 11 | 24 | ERS7257049 (This study) |
| NEC25 | 2011 [ | Stool/NEC+ | MiSeq | 3‡ Contigs | 2 546 558 | 1 | 4739 | 28.67 | 3932 | 77 | 28 | 221 | PRJEB26968 |
| NEC26 | 2012 [ | Stool/NEC+ | MiSeq | 3‡ Contigs | 2 546 622 | 1 | 4738 | 28.86 | 3808 | 77 | 28 | 218 | PRJEB26973 |
| NEC32 | 2012 [ | Stool/NEC+ | MiSeq | 3‡ Contigs | 2 546 635 | 1 | 4738 | 28.75 | 3891 | 77 | 28 | 220 | PRJEB27003 |
| NEC86 | 2010 [ | Stool/NEC+ | MiSeq | 3‡ Contigs | 2 546 783 | 1 | 4739 | 28.58 | 4304 | 77 | 28 | 230 | PRJEB26949 |
| C25 | 2012 [ | Stool samples | MiSeq | 3‡ Contigs | 2 546 563 | 1 | 4739 | 28.64 | 3937 | 77 | 28 | 228 | PRJEB26947 |
*Reference strain ATCC BAA-265T.
†Strains were isolated from independent individuals in different spatiotemporal settings.
‡Contigs containing ‘N’ stretches.
NEC+, strain isolated from a NEC case.
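The Contig N50 and Contig L50 columns in the table above follow the usual assembly-statistics definitions: with contigs sorted from longest to shortest, N50 is the length of the contig at which the running total first reaches half of the assembly size, and L50 is the number of contigs needed to get there. The following is a minimal Python sketch of that definition, not the tool used in the study; the function name `n50_l50` and the toy contig lengths are illustrative assumptions.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    ordered = sorted(contig_lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is N50, 'rank' is L50
            return length, rank


# Hypothetical contig lengths (bp); not taken from any assembly in the table.
toy_contigs = [900, 700, 400, 300, 200]
toy_n50, toy_l50 = n50_l50(toy_contigs)

# A closed genome is a single contig, so N50 equals the genome length and
# L50 is 1, as reported for strain 250.09 (4 753 394 bp).
closed_n50, closed_l50 = n50_l50([4_753_394])
```

For the toy list the total is 2500 bp, so the half-way point (1250 bp) is reached at the second contig, giving N50 = 700 and L50 = 2.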
Fig. 1. Pie chart comparing the automatic and the manually curated annotation of the 250.09 genome. (a) Automatic annotation, this study. (b) Manually curated annotation, this study. (c) Draft genome automatic annotation published in 2016 [11]. (d) Draft genome automatic annotation published in 2018 [8].
Fig. 2. Heatmap representing the degree of genome similarity based on ANI. The heatmap shows high similarity (dark red) to low similarity (blue) between the CDS of the nine C. neonatale genomes.
Fig. 3. C. neonatale circular genome comparison maps. The upper and lower figures show the BLAST comparison between genomes, with 250.09 or CC3_PB as the reference genome, respectively. Genome scale (Mbp) is indicated on the innermost circles. Origin of replication is positioned at 0 Mbp. Regions in white correspond to variable areas between genomes. The two outermost circles indicate locations of gene coding regions in the plus (circle 1) and minus (circle 2) strands. Rings 3–10 indicate BLAST comparison with the contigs of the different strain genomes. Circle 11 shows G+C skew in the plus (green) and minus (purple) strands. The innermost rings indicate G+C content (deviation from average).
Fig. 4. Pan- and core-genome analysis of C. neonatale strains. (a) Distribution frequency of the gene families within genomes. (b) Number of new genes added to each genome. (c) Flowerplot illustrating the core-genome size (flower centre), and unique genes for each strain (flower petals). (d) Cumulative curves showing the downward trend of the core gene families (in blue) and the upward trend of the pan gene families (in green) with the increase in the number of genomes. Power and exponential equations are noted.
Fig. 5. Functional categories of genes. (a) COG distribution. (b) COG distribution categories for the core, accessory and unique genome.
Fig. 6. Venn diagram representing the number of shared and unique orthologous protein clusters encoded by the 250.09, CC3_PB and LF22 genomes.
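The core, accessory and unique categories used in the pan-genome figures above can be illustrated with simple set logic once genes have been clustered into families. The sketch below is only an illustration under stated assumptions: it takes the clustering as given, treats a family as core if it occurs in every strain, unique if it occurs in exactly one strain, and accessory otherwise, and it uses hypothetical strain names and family labels rather than data from the study.

```python
from collections import Counter


def partition_pangenome(families_by_strain):
    """Split gene families into pan, core, accessory and per-strain unique sets.

    families_by_strain maps a strain name to the set of gene-family IDs
    found in that strain (the clustering step is assumed to be done).
    """
    n_strains = len(families_by_strain)
    # Count in how many strains each family is present.
    counts = Counter(
        fam for fams in families_by_strain.values() for fam in set(fams)
    )
    pan = set(counts)
    core = {fam for fam, c in counts.items() if c == n_strains}
    accessory = {fam for fam, c in counts.items() if 1 < c < n_strains}
    unique = {
        strain: {fam for fam in set(fams) if counts[fam] == 1}
        for strain, fams in families_by_strain.items()
    }
    return pan, core, accessory, unique


# Hypothetical toy input: three strains, seven gene families.
toy = {
    "s1": {"A", "B", "C", "X"},
    "s2": {"A", "B", "D"},
    "s3": {"A", "C", "D", "Y", "Z"},
}
pan, core, accessory, unique = partition_pangenome(toy)
```

In this toy case the pan-genome holds seven families, family `A` alone is core, `B`, `C` and `D` are accessory, and the unique sets correspond to the flower petals in a plot like Fig. 4(c).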
Characteristics of genomic islands
| | 250.09 | CC3_PB | LF22 | CB12 | C25 | NEC32 | NEC26 | NEC86 | NEC25 |
|---|---|---|---|---|---|---|---|---|---|
| No. of GIs | 21 | 25 | 32 | 18 | 22 | 21 | 22 | 19 | 21 |
| GI total length (kb) | 359 | 429 | 520 | 312 | 345 | 337 | 378 | 290 | 361 |
| Largest GI (kb) | 70 | 36 | 41 | 51 | 52 | 52 | 63 | 55 | 52 |
| GI mean length (kb) | 17 | 17 | 16 | 16 | 18 | 16 | 17 | 20 | 17 |
Only GIs >5 kb were considered.
GI, genomic island.
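As a quick cross-check, the per-strain GI counts in this table add up to the 201 genomic islands quoted in the abstract. The short Python sketch below reproduces that sum and, as an illustrative extra that is not reported in the source, expresses each strain's GI total length as a percentage of its genome size from the strain table. The variable names are assumptions, and the percentages cover only the GIs >5 kb listed here.

```python
# Per-strain values copied from the two tables in this record.
gi_counts = {
    "250.09": 21, "CC3_PB": 25, "LF22": 32, "CB12": 18, "C25": 22,
    "NEC32": 21, "NEC26": 22, "NEC86": 19, "NEC25": 21,
}
gi_total_kb = {
    "250.09": 359, "CC3_PB": 429, "LF22": 520, "CB12": 312, "C25": 345,
    "NEC32": 337, "NEC26": 378, "NEC86": 290, "NEC25": 361,
}
genome_kb = {
    "250.09": 4753, "CC3_PB": 5675, "LF22": 5598, "CB12": 4612, "C25": 4739,
    "NEC32": 4738, "NEC26": 4738, "NEC86": 4739, "NEC25": 4739,
}

# Total number of genomic islands across the nine genomes.
total_gis = sum(gi_counts.values())

# Strain carrying the most genomic islands.
most_gis = max(gi_counts, key=gi_counts.get)

# Derived value (not in the source): GI length as a percentage of genome size.
gi_share_pct = {
    strain: 100 * gi_total_kb[strain] / genome_kb[strain]
    for strain in gi_counts
}

print(f"total GIs: {total_gis}, most GIs: {most_gis}")
for strain, pct in gi_share_pct.items():
    print(f"{strain}: {pct:.1f}% of genome in GIs >5 kb")
```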