Eric Duchaud1, Tatiana Rochat1, Christophe Habib1,2, Paul Barbier1, Valentin Loux2, Cyprien Guérin2, Inger Dalsgaard3, Lone Madsen3, Hanne Nilsen4, Krister Sundell5, Tom Wiklund5, Nicole Strepparava6, Thomas Wahli7, Greta Caburlotto8, Amedeo Manfrin8, Gregory D Wiens9, Erina Fujiwara-Nagata10, Ruben Avendaño-Herrera11, Jean-François Bernardet1, Pierre Nicolas2.
Abstract
Flavobacterium psychrophilum, the etiological agent of rainbow trout fry syndrome and bacterial cold-water disease in salmonid fish, is currently one of the main bacterial pathogens hampering the productivity of salmonid farming worldwide. In this study, the genomic diversity of the F. psychrophilum species is analyzed using a set of 41 genomes, including 30 newly sequenced isolates. These were selected on the basis of available MLST data with the two-fold objective of maximizing the coverage of the species diversity and of allowing a focus on the main clonal complex (CC-ST10) infecting farmed rainbow trout (Oncorhynchus mykiss) worldwide. The results reveal a bacterial species harboring limited genomic diversity both in terms of nucleotide diversity, with ~0.3% nucleotide divergence inside CDSs in pairwise genome comparisons, and in terms of gene repertoire, with the core genome accounting for ~80% of the genes in each genome. The pan-genome nevertheless seems "open" according to the scaling exponent of a power law fitted to the rate of new gene discovery when genomes are added one by one. Recombination is a key component of the evolutionary process of the species, as seen in the high level of apparent homoplasy in the core genome. Using a Hidden Markov Model to delineate recombination tracts in pairs of closely related genomes, the average recombination tract length was estimated at ~4.0 Kbp and the typical ratio of the contributions of recombination and mutation to nucleotide-level differentiation (r/m) was estimated at ~13. Within CC-ST10, evolutionary distances computed on non-recombined regions and comparisons between 22 isolates sampled up to 27 years apart suggest a most recent common ancestor in the second half of the nineteenth century in North America, with subsequent diversification and transmission of this clonal complex coinciding with the worldwide expansion of rainbow trout farming.
To promote the development of tools for the genetic manipulation of F. psychrophilum, particular attention was also paid to plasmids. Their extraction and sequencing to completion revealed plasmid diversity that had remained hidden from classical plasmid profiling because of size similarities.
Keywords: Flavobacterium psychrophilum; aquaculture; clonal-complex; comparative genomics; fish-pathogenic bacteria; homologous recombination
Year: 2018 PMID: 29467746 PMCID: PMC5808330 DOI: 10.3389/fmicb.2018.00138
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Genome sequences included in the comparative analysis.
| Isolate | Number of contigs | Cumulated contig length (Kbp) | Geographic origin | Isolation source | Year of isolation | Reference or contact |
| --- | --- | --- | --- | --- | --- | --- |
| JIP 02/86 | 1 | 2,860 | France | Kidney | 1986 | Duchaud et al., |
| CSF259-93 | 1 | 2,901 | ID, USA | Spleen | 1993 | Wiens et al., |
| NCIMB 1947T | 1 | 2,716 | WA, USA | Kidney | n.a. | Wu et al., |
| DIFR 950106-1/1 | 1 | 2,736 | Denmark | Spleen | 1995 | Castillo et al., |
| FPG101 | 1 | 2,835 | Canada | n.a. | 2008 | Castillo et al., |
| MH1 | 1 | 2,848 | Chile | Skin | 2008 | Castillo et al., |
| PG2 | 1 | 2,851 | Chile | Skin | 2009 | Castillo et al., |
| 5 | 1 | 2,848 | Chile | Fish farm freshwater | 2013 | Castillo et al., |
| 3 | 1 | 2,805 | Chile | Fish farm freshwater | 2013 | Castillo et al., |
| VQ50 | 1 | 2,807 | Chile | Skin | 2006 | Castillo et al., |
| OSU THCO2-90 | 1 | 2,784 | OR, USA | Kidney | 1990 | Rochat et al., |
| KU 051128-10 | 179 | 2,572 | Japan | River water | 2005 | This study (EFN) |
| KU 060626-4 | 182 | 2,527 | Japan | Kidney | 2006 | This study (EFN) |
| KU 060626-59 | 204 | 2,518 | Japan | Skin lesion | 2006 | This study (EFN) |
| KU 061128-1 | 202 | 2,515 | Japan | River water | 2006 | This study (EFN) |
| IT02 | 43 | 2,648 | Italy | Spleen | 2011 | This study (GC) |
| IT09 | 46 | 2,716 | Italy | Spleen | 2012 | This study (GC) |
| NO004 | 58 | 2,774 | Norway | Fin | 1998 | This study (HN) |
| NO014 | 66 | 2,682 | Norway | Spleen | 2008 | This study (HN) |
| NO042 | 73 | 2,583 | Norway | Spleen | 2008 | This study (HN) |
| NO083 | 42 | 2,690 | Norway | Kidney | 2011 | This study (HN) |
| NO098 | 52 | 2,617 | Norway | Ovarian fluid | 2011 | This study (HN) |
| DK002 | 40 | 2,686 | Denmark | Kidney | 1990 | This study (ID) |
| DK150 | 66 | 2,775 | Denmark | Kidney | 1995 | This study (ID) |
| DK095 | 66 | 2,816 | Denmark | Skin | 2000 | This study (ID) |
| DK001 | 48 | 2,704 | Denmark | Spleen | 2009 | This study (ID) |
| FPC 840 | 274 | 2,516 | Japan | Kidney | 1987 | This study (J-FB) |
| FPC 831 | 226 | 2,600 | Japan | Peduncle lesion | 1990 | This study (J-FB) |
| LVDJ XP189 | 258 | 2,685 | France | Kidney | 1992 | This study (J-FB) |
| JIP 08/99 | 261 | 2,549 | France | Kidney | 1999 | This study (J-FB) |
| JIP 16/00 | 368 | 2,503 | France | n.a. | 2000 | This study (J-FB) |
| FRGDSA 1882/11 | 74 | 2,687 | France | n.a. | 2011 | This study (J-FB) |
| CH8 | 58 | 2,699 | Switzerland | Spleen | 2009 | This study (NS) |
| CH1895 | 65 | 2,748 | Switzerland | Skin | 2011 | This study (NS) |
| LM-01-Fp | 46 | 2,753 | Chile | Kidney | 2006 | This study (RA-H) |
| LM-02-Fp | 43 | 2,685 | Chile | Kidney | 2006 | This study (RA-H) |
| FI055 | 78 | 2,732 | Finland | Inner organs | 1996 | This study (TW) |
| FI056 | 41 | 2,694 | Finland | Inner organs | 1996 | This study (TW) |
| FI146 | 99 | 2,838 | Finland | Pond water | 2000 | This study (TW) |
| FI070 | 52 | 2,715 | Finland | Mouth | 2006 | This study (TW) |
| FI166 | 62 | 2,732 | Scotland | n.a. | 2007 | This study (TW) |
Number of contigs (≥ 1 Kbp, plasmids excluded).
Cumulated contig length.
The names of USA states are abbreviated.
Abbreviated species names for Gasterosteus aculeatus, Oncorhynchus mykiss, Oncorhynchus kisutch, Perca fluviatilis, Plecoglossus altivelis.
Initials of contact name for newly sequenced isolates: Erina Fujiwara-Nagata, Greta Caburlotto, Hanne Nilsen, Inger Dalsgaard, Jean-François Bernardet, Nicole Strepparava, Ruben Avendaño-Herrera, Tom Wiklund. n.a., no data available.
The spelling OSU THC02-90 has been used in several publications (e.g., Nicolas et al., .
Information on host fish for DIFR 950106-1/1 can be found in Madsen and Dalsgaard (.
S. trutta lacustris.
The exact date of isolation of the type strain NCIMB 1947.
VQ50 was isolated by R. Avendaño-Herrera in 2006 in Veterquímica (VQ).
Figure 1. Nucleotide diversity. (A) Distribution of pairwise nucleotide distances between genomes. (B) Tree representation of pairwise nucleotide distances obtained with the neighbor-joining method. (C) Folded nucleotide frequency spectrum for binary SNPs as computed on the 18 isolates obtained when taking only one representative per clonal complex. The spectrum is shown for three categories of sites, defined according to whether amino-acid polymorphism is detected in the corresponding codon: synonymous sites, non-synonymous sites, all sites. The theoretical spectrum under the idealized scenario of the standard coalescent with infinitely many sites is also shown for comparison.
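The folded spectrum of panel (C) can be illustrated with a minimal sketch. It assumes each alignment column is given as a string with one nucleotide per isolate; the function names and the input format are hypothetical and are not taken from the study. The reference curve uses the classical expectation for the folded spectrum under the standard coalescent with infinitely many sites, in which the class with minor-allele count i has weight (1/i + 1/(n-i)), halved when i = n-i.

```python
from collections import Counter


def folded_sfs(columns, n_samples):
    """Count binary SNPs by minor-allele count (1 .. n_samples // 2).

    `columns` is an iterable of strings, one character per isolate
    (hypothetical input format chosen for this sketch).
    """
    spectrum = [0] * (n_samples // 2 + 1)
    for column in columns:
        counts = Counter(column)
        # Keep only complete columns with exactly two alleles (binary SNPs).
        if len(column) != n_samples or len(counts) != 2:
            continue
        spectrum[min(counts.values())] += 1
    return spectrum[1:]  # drop the unused class 0


def expected_folded_sfs(n_samples):
    """Expected proportions per class under the standard neutral coalescent."""
    weights = []
    for i in range(1, n_samples // 2 + 1):
        w = 1.0 / i + 1.0 / (n_samples - i)
        if i == n_samples - i:
            w /= 2.0  # the two unfolded classes coincide
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    toy_columns = ["AAAT", "AATT", "AAAA", "ACGT", "TAAA"]  # toy data
    print(folded_sfs(toy_columns, 4))
    print(expected_folded_sfs(4))
```

Comparing the observed proportions with the expected ones per class is one simple way to see an excess or deficit of rare variants.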
Figure 2. Gene families. (A) Accumulation curves for the sizes of the core- and pan-genomes. Core-genome size, number of gene families in common; Pan-genome size, total number of gene families. Numbers are averaged over 10,000 randomized orders of genome addition. Only gene families with at least one member longer than 100 aa are represented. Colors distinguish curves for three sets of isolates: the 41 isolates considered in this study (blue), the 18 isolates obtained when taking only one representative per clonal complex (green), and the 22 isolates of CC-ST10 (red). (B) Fit of the power law to the number of new gene families discovered in the last genome added; α is the scaling exponent corresponding to the slope in this log-log representation. Same colors as in (A). (C) Gene frequency spectrum. Only one representative per CC is considered. (D) Distribution of gene-content distances between pairs of genomes.
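The fit of panel (B) can be sketched as an ordinary least-squares regression in log-log space. This is a minimal illustration, assuming the usual power-law form n(N) = κ·N^(-α) for the number of new gene families found when the N-th genome is added; the function name, the inputs, and the sign convention for α are assumptions, not the authors' code. Under the common convention, α ≤ 1 is read as an "open" pan-genome.

```python
import math


def fit_power_law(n_genomes, new_families):
    """Fit new_families ~ kappa * n_genomes ** (-alpha) by log-log least squares.

    Returns (kappa, alpha). Both inputs must contain strictly positive values.
    """
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(v) for v in new_families]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # alpha is reported as the negated slope (assumed sign convention).
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic counts generated from a known power law, for illustration only.
    sizes = list(range(2, 42))
    counts = [120.0 * n ** -0.7 for n in sizes]
    kappa, alpha = fit_power_law(sizes, counts)
    print(round(kappa, 3), round(alpha, 3))
```

In practice the counts fed to such a fit would be the averages over randomized genome orders described in panel (A).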
Figure 3. Whole-genome MLST. Neighbor-Joining tree built on pairwise distances between allele-type (AT) profiles. The distance is computed as the fraction of allele types that differ between two strains. Isolate name, year, host fish, and country of sampling are reported. The following abbreviations are used for the fish names: RbT for rainbow trout, CoS for coho salmon, AtS for Atlantic salmon, and BrT for brown trout. STs and AT-profiles corresponding to the classical seven-gene MLST scheme are also shown. CC-ST10 is highlighted by a light gray box.
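The distance used for this tree can be written down in a few lines. The sketch below assumes each AT-profile is a dictionary mapping a locus name to an allele-type identifier, with `None` for a locus that is absent or untyped; restricting the comparison to loci typed in both strains is an assumption of this sketch, since the caption does not say how missing loci are handled.

```python
def at_profile_distance(profile_a, profile_b):
    """Fraction of shared loci whose allele types differ between two strains."""
    shared = [
        locus
        for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    if not shared:
        raise ValueError("no locus typed in both profiles")
    differing = sum(1 for locus in shared if profile_a[locus] != profile_b[locus])
    return differing / len(shared)


if __name__ == "__main__":
    # Toy profiles with hypothetical locus names and allele-type identifiers.
    strain_1 = {"locus_1": 1, "locus_2": 3, "locus_3": 2, "locus_4": None}
    strain_2 = {"locus_1": 1, "locus_2": 4, "locus_3": 2, "locus_4": 7}
    print(at_profile_distance(strain_1, strain_2))
```

A full pairwise matrix built with this function is the kind of input a neighbor-joining implementation expects.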
Figure 4. Variable gene pool of CC-ST10. A total of 330 gene families are represented here. Black indicates presence and gray absence in a particular genome. Genomes are ordered vertically according to the tree built on AT-profiles. Genes are arranged horizontally according to their presence/absence profile as obtained by hierarchical clustering based on Manhattan distance and complete linkage (not shown). The six major profiles are represented by colored rectangles and numbered.
Figure 5. Allele-type map of CC-ST10. Protein coding gene loci are positioned along the genome according to gene coordinates in the genome of strain CSF259-93. Genomes are ordered vertically according to the tree built on AT-profiles as illustrated in the upper-left corner; the exact position of CC-ST10 is reported. Black indicates presence of a gene in more than one copy; white indicates absence. Allele-types in CC-ST10 are represented by different colors: light gray for the main allele-type and red to yellow for the others, with some local reallocations of colors to maximize the consistency of the color choice between adjacent loci. Allele-types not found in CC-ST10 are represented in dark gray.
Figure 6. SNPs and recombination tracts between two closely related isolates. The strains compared here are JIP 02/86 and CH1895. (Upper) Positions of the SNPs along the genome (using the genomic coordinates of CSF259-93 as reference). The SNP index is reset every 100 SNPs for this representation. Each dot represents one SNP between the two isolates. Colors distinguish two types of polymorphism: in blue, polymorphism observed only inside CC-ST10 (referred to as polymorphism of type P1 in the text); in red, polymorphism observed also outside of CC-ST10 (polymorphism of type P2). Areas in gray correspond to regions not covered by our alignments. (Lower) Probability of a recombination tract as computed with the HMM.
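The lower panel reports a per-position probability of being inside a recombination tract. The sketch below shows how such a posterior can be obtained with a generic two-state HMM (clonal background versus recombined tract) and the forward-backward algorithm. It is not the model used in the study: the states, the Bernoulli emissions, and all parameter values (`p_snp`, `p_enter`, `p_exit`) are assumptions chosen for illustration.

```python
def recombination_posterior(obs, p_snp=(0.001, 0.05), p_enter=0.001, p_exit=0.01):
    """Posterior probability of the 'recombined' state at each position.

    obs      -- list of 0/1 flags, 1 where the two genomes differ (SNP)
    p_snp    -- SNP probability per site in (clonal, recombined) state
    p_enter  -- per-site probability of entering a tract
    p_exit   -- per-site probability of leaving a tract (mean length 1/p_exit)
    """
    trans = [[1.0 - p_enter, p_enter], [p_exit, 1.0 - p_exit]]

    def emit(state, x):
        return p_snp[state] if x else 1.0 - p_snp[state]

    # Start from the stationary distribution of the two-state chain.
    init = [p_exit / (p_enter + p_exit), p_enter / (p_enter + p_exit)]

    # Forward pass, rescaled at each step to avoid numerical underflow.
    fwd = []
    prev = None
    for t, x in enumerate(obs):
        if t == 0:
            a = [init[s] * emit(s, x) for s in (0, 1)]
        else:
            a = [emit(s, x) * sum(prev[q] * trans[q][s] for q in (0, 1))
                 for s in (0, 1)]
        norm = a[0] + a[1]
        prev = [a[0] / norm, a[1] / norm]
        fwd.append(prev)

    # Backward pass, with the same kind of rescaling.
    n = len(obs)
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        b = [sum(trans[s][q] * emit(q, obs[t + 1]) * bwd[t + 1][q] for q in (0, 1))
             for s in (0, 1)]
        norm = b[0] + b[1]
        bwd[t] = [b[0] / norm, b[1] / norm]

    # Combine: the scaling constants are state-independent, so they cancel.
    posterior = []
    for f, b in zip(fwd, bwd):
        w0, w1 = f[0] * b[0], f[1] * b[1]
        posterior.append(w1 / (w0 + w1))
    return posterior


if __name__ == "__main__":
    # Toy sequence: a SNP-dense 1,000-site block between two SNP-free flanks.
    toy = [0] * 2000 + ([1] + [0] * 19) * 50 + [0] * 2000
    post = recombination_posterior(toy)
    print(round(post[500], 4), round(post[2510], 4))
```

With these toy settings the posterior is close to 1 inside the SNP-dense block and close to 0 in the flanks, which is the qualitative behavior the lower panel depicts.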
Figure 7. Recombination and mutation in divergence between closely related isolates. (A) Relationship between the amount of nucleotide changes arising from recombination (r) and from mutation (m). Each point represents a pair of closely related isolates. Colors highlight pairs within two groups of isolates in CC-ST10 according to the topology of the tree built on AT-profiles: blue for isolates from FI055 to DK002 (group B in the text) and red for isolates from MH1 to CH8 (group C in the text). The lines pass through the origin, with slopes equal to the median of r/m for all pairs (black), for pairs in group B (blue), and for pairs in group C (red). (B) Distribution of the estimated r/m ratio across pairs of isolates. Vertical lines correspond to the slopes of the lines in (A).
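The summary lines of panel (A) reduce to a median of per-pair ratios. A minimal sketch, assuming each pair of isolates is summarized by a tuple (r, m) of nucleotide changes attributed to recombination and to mutation; the function name and the toy numbers are hypothetical and are not data from the study.

```python
from statistics import median


def median_r_over_m(pairs):
    """Median of r/m across isolate pairs, skipping pairs with m == 0."""
    ratios = [r / m for r, m in pairs if m > 0]
    if not ratios:
        raise ValueError("no pair with a positive mutation count")
    return median(ratios)


if __name__ == "__main__":
    toy_pairs = [(130, 10), (240, 20), (70, 5)]  # toy (r, m) values
    slope = median_r_over_m(toy_pairs)
    # The summary line through the origin is then r = slope * m.
    print(slope)
```

Using the median rather than a regression slope keeps the summary robust to a few pairs with unusually large recombination tracts.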
Figure 8. Tree built on mutational distances within CC-ST10. (A) Neighbor-Joining tree for CC-ST10 built on the estimated number of mutations (m). (B) Tip-to-root distance as a function of isolate sampling year. The line represents the linear regression obtained after excluding the two isolates of North American origin (shown in red).
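The regression of panel (B) can be sketched as follows. Under a constant-rate assumption, the slope of tip-to-root distance against sampling year estimates the mutation rate per year, and the year at which the fitted line reaches zero distance gives a rough date for the root. Reading the root date off the x-intercept is a standard device and is an assumption here, not a description of the authors' exact procedure; the function name and toy values are hypothetical.

```python
def fit_root_to_tip(years, distances):
    """Least-squares fit of distance = rate * (year - root_year).

    Returns (rate, root_year), where root_year is the x-intercept of the line.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    root_year = -intercept / rate  # year at which the fitted distance is zero
    return rate, root_year


if __name__ == "__main__":
    # Synthetic, noise-free values generated from a known rate and root year.
    toy_years = [1986, 1995, 2005, 2013]
    toy_distances = [0.5 * (year - 1870) for year in toy_years]
    print(fit_root_to_tip(toy_years, toy_distances))
```

With a sampling window of only a few decades, the extrapolated root year is sensitive to small changes in the slope, so it is best read as an order-of-magnitude estimate.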
Figure 9. Plasmids found in our collection of isolates. (A) Physical maps of the 15 types of plasmids (pFP1 to pFP15), ordered by length. Colors indicate the main types of genes (see inset legend); pseudogenes and unknown short genes are represented in white. Three types of rep genes have been distinguished based on the substantial divergence of their amino-acid sequences. (B) Distribution of the 15 types of plasmids across the isolates. The Neighbor-Joining tree obtained on whole-genome MLST profiles is used here. n.a. stands for no data available. Asterisks (*) indicate the representatives that served to illustrate the different types of plasmids.