Pedro Seoane, Silvana T Tapia-Paniagua, Rocío Bautista, Elena Alcaide, Consuelo Esteve, Eduardo Martínez-Manzanares, M Carmen Balebona, M Gonzalo Claros, Miguel A Moriñigo.
Abstract
Probiotic microorganisms are of great interest in clinical, livestock and aquaculture settings. Knowledge of the genomic basis of probiotic characteristics can help explain why some strains of a species are pathogenic while others of the same species are probiotic. An automated workflow called TarSynFlow (Targeted Synteny Workflow) has therefore been developed to compare finished or draft bacterial genomes based on a set of proteins. When it was used to analyze the finished genome of the probiotic strain Pdp11 of Shewanella putrefaciens and the draft genomes of seven known non-probiotic strains of the same species obtained in this work, 15 genes were found to be exclusive to Pdp11. Their presence was confirmed by PCR using Pdp11-specific primers. Functional inspection of the 15 genes allowed us to hypothesize that Pdp11 underwent genome rearrangements spurred by plasmids and mobile elements. As a result, Pdp11 presents specific proteins for gut colonization, bile salt resistance and gut pathogen adhesion inhibition, which can explain some probiotic features of Pdp11.
Keywords: Bioinformatics; Cultured fish; Genomics; Probiotics; Shewanella putrefaciens; Synteny; Workflow
Year: 2019 PMID: 30842906 PMCID: PMC6397758 DOI: 10.7717/peerj.6526
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Table 1. Microbiological characterization of the pathogenic S. putrefaciens strains used in this study.
| Pathogenic strain | LD50 (cfu/g) | RAPD profile | Growth at 6% NaCl | Growth at 37 °C |
|---|---|---|---|---|
| SH4 | 3.4 × 10^6 | I | + | + |
| SH6 | 8.3 × 10^6 | ND | − | + |
| SH9 | 1.4 × 10^6 | II | + | + |
| SH12 | 2.8 × 10^6 | III | − | + |
| SH16 | 5.5 × 10^6 | III | + | − |
Figure 1. Flow chart describing the TarSynFlow workflow.
Solid lines represent analyses performed for every protein in a specific genome: green box-lines depict processes applied to proteins of genome A and orange box-lines depict those applied to proteins of the other genome (genome B). Boxes whose results are written to files are in solid blue, outlined in green or orange when they are genome-specific and with a thin blue line for comparative results. Dashed lines represent the comparison of protein IDs between the two genomes, with the line colour indicating whether the source is genome A or B. ‘CD-HIT’, ‘Blast’, ‘Prosplign’ and ‘Circos’ are in bold uppercase because they correspond to third-party software. ‘High profile’ refers to the filter that keeps only protein matches with protein coverage and identity ≥85%, while ‘Low profile’ refers to the filter that also keeps protein matches with protein coverage and identity between 45% and 85%. See text for further details.
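The two filters named in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not TarSynFlow code: the function name `classify_match` and the tier labels are hypothetical, and treating 45% as an inclusive lower bound is an assumption, since the caption only states "between 85% and 45%".

```python
from typing import Optional

HIGH_THRESHOLD = 85.0  # from the caption: coverage and identity >= 85%
LOW_THRESHOLD = 45.0   # from the caption; inclusive bound is an assumption


def classify_match(coverage: float, identity: float) -> Optional[str]:
    """Return the tier of a protein match (hypothetical helper).

    "high": kept by both the 'High profile' and 'Low profile' filters.
    "low":  kept only by the 'Low profile' filter.
    None:   discarded by both filters.
    """
    if coverage >= HIGH_THRESHOLD and identity >= HIGH_THRESHOLD:
        return "high"
    if coverage >= LOW_THRESHOLD and identity >= LOW_THRESHOLD:
        return "low"
    return None


# Illustrative values only, not data from the study.
for cov, ident in [(90.0, 95.0), (60.0, 90.0), (40.0, 99.0)]:
    print(cov, ident, classify_match(cov, ident))
```

Both coverage and identity must pass a threshold, so a match with very high identity but coverage below 45% is still discarded in this sketch.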
Figure 2. Synteny diagrams of the probiotic strain Pdp11 and the saprophytic strain SdM2 as CIRCOS output.
Data for synteny were obtained using Sibelia (A) with a synteny block length of 500 nt in order to generate results comparable with those of TarSynFlow (B). Exemplary scaffolds with sequence rearrangements are boxed with the same color in panels A and B for comparison.
Table 2. PCR primers designed to verify the probiotic-specific genes (identified by their UniProt ID), based on the sequence of Pdp11.
| UniProt ID | Primer | Sequence | Mt (°C) | Size (bp) | AnnT (°C) |
|---|---|---|---|---|---|
| E6XIZ3 | Z3-Pdp11-F | TCAGGGTCTTCGAATCTTCC | 59.9 | 1,344 | 53 |
| | Z3-Pdp11-R | AGAGCAGCACAGTCAAAGCA | 59.2 | | |
| E6XIZ2 | Z2-Pdp11-F | TTGCTGTTGTTGGTGGGTTA | 60.0 | 2,235 | 55 |
| | Z2-Pdp11-R | AGCGTTTAGCCGAACTTGAA | 60.0 | | |
| E6XG14 | G14-Pdp11-F | AACCGAGCAGTGCATTTTCT | 58.4 | 1,294 | 53 |
| | G14-Pdp11-R | CACACCGTCAGTTCCAAAAT | 59.8 | | |
| E6XG15 | G15-Pdp11-F | TGCATACCGCGAACTAAGTG | 58.9 | 1,252 | 55 |
| | G15-Pdp11-R | CAGATAAGCCATGAAGCAACA | 59.9 | | |
| E6XIZ5 | Z5-Pdp11-F | CCTGAAAACGCACCAAGTTT | 59.9 | 1,007 | 53 |
| | Z5-Pdp11-R | CAGCAGTAAAATGACGCAACA | 60.1 | | |
| O86914 | 6914-Pdp11-F | CAAACCCAATACGGTCCATC | 60.0 | 2,058 | 55 |
| | 6914-Pdp11-R | GCTGACCTTAGGCACTTTGC | 60.0 | | |
| E6XL69 | L69-Pdp11-F | CATCCAAAGGATTTAATTTAAGTGG | 60.1 | 575 | 53 |
| | L69-Pdp11-R | GTGATACCTAGGGCGACGAA | 59.2 | | |
| E6XIZ4 | Z4-Pdp11-F | GGTTACATCATATTCTCTGCATGAT | 59.0 | 585 | 53 |
| | Z4-Pdp11-R | GTAACTCCCCAATTGCAGAAA | 58.5 | | |
| E6XLE5 | LE5-Pdp11-F | GGCTTAACAATCACGCCAAT | 58.0 | 473 | 53 |
| | LE5-Pdp11-R | ATGTCCGGATGCTACAAAAA | 59.9 | | |
| Q8GJK1 | K1-Pdp11-F | TCGGTTACCATTTACTCTCAGC | 58.4 | 905 | 55 |
| | K1-Pdp11-R | GGAGATGTTTTTGTGTCGTGTT | 59.1 | | |
| Q6ZYR2 | R2-Pdp11-F | TGAGCCAACCCAATCTATCC | 59.8 | 1,085 | 55 |
| | R2-Pdp11-R | GTGGCAACCTCTTCTTGTCC | 60.0 | | |
| A4Y1U2 | U2-Pdp11-F | ACACCAGTTGGGCGATAAAA | 60.0 | 873 | 54 |
| | U2-Pdp11-R | ATCGGCAAGGTTTAAAAGCA | 59.7 | | |
| A4Y1U5 | U5-Pdp11-F | CCAGTCACCACACTCATTGG | 60.0 | 1,932 | 55 |
| | U5-Pdp11-R | GCTTATGAACGCACCCGTAT | 59.9 | | |
| A1KQX7 | X7-Pdp11-F | TACCTGGATGAAATGCGTCA | 55.1 | 500 | 57 |
| | X7-Pdp11-R | TCGTGTTTCGATAAGGCTGA | 55.1 | | |
| A4Y1U4 | U4-Pdp11-F | TCGACGATCATCATCTGAGAA | 59.8 | 575 | 54 |
| | U4-Pdp11-R | TTCAGCTGATGCATACCAAAG | 58.9 | | |
| A4YB89 | B89-Pdp11-F | GCCATCATAGGCGAGCTAAC | 60.2 | 900 | 54 |
| | B89-Pdp11-R | ATCAACTGCATGACAATAAAAACG | 59.8 | | |
Note:
The melting temperature (Mt) for every primer, as well as the amplicon size and the annealing temperature (AnnT) for every primer pair are given. F, Forward primer; R, Reverse primer.
Table 3. A5-miseq summary of sequencing and assembly data for the non-probiotic strains (NPSs) used in this study.
| Strain | Raw reads | Useful reads (%) | N50 (bp) | Genome size (bp) | %GC | Scaffold number | Completeness (%) |
|---|---|---|---|---|---|---|---|
| SH4 | 2,303,512 | 98.50 | 223,087 | 4,628,646 | 46.3 | 46 | 99.4 |
| SH6 | 2,047,622 | 98.28 | 259,802 | 5,022,912 | 45.3 | 44 | 99.5 |
| SH9 | 1,193,322 | 96.16 | 245,702 | 5,020,097 | 45.3 | 47 | 99.5 |
| SH12 | 1,650,318 | 98.03 | 160,200 | 4,628,973 | 46.3 | 58 | 99.4 |
| SH16 | 2,319,174 | 98.21 | 387,271 | 5,018,364 | 45.3 | 37 | 99.5 |
| SdM1 | 3,262,744 | 98.33 | 347,522 | 5,068,163 | 45.2 | 44 | 99.3 |
| SdM2 | 4,227,076 | 98.57 | 511,212 | 4,354,804 | 44.3 | 28 | 99.1 |
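For readers unfamiliar with the N50 column, the following minimal Python sketch computes N50 under its usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions; A5-miseq may apply its own conventions when reporting this value.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (usual definition)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the cumulative length past
        # half of the assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths given")


# Toy lengths, not data from the study: total is 32, half is 16,
# and the two longest scaffolds (8 + 8) already reach it.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```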
Table 4. Summary of protein matches revealed by TarSynFlow when Pdp11 was the test genome compared with the NPSs of Table 3.
| NPS name | NPS-specific | Pdp11-specific | Shared by Pdp11 & NPSs | Probably shared by Pdp11 & NPSs | Not assigned to any strain |
|---|---|---|---|---|---|
| SH4 | 79 | 90 | 1,930 | 1,219 | 1,330 |
| SH6 | 130 | 41 | 2,286 | 959 | 1,234 |
| SH9 | 130 | 41 | 2,285 | 960 | 1,234 |
| SH12 | 79 | 91 | 1,931 | 1,217 | 1,331 |
| SH16 | 130 | 41 | 2,286 | 959 | 1,234 |
| SdM2 | 305 | 29 | 2,333 | 946 | 1,200 |
| SdM1 | 116 | 43 | 2,277 | 964 | 1,238 |
| Common to all | 64 | 19 | 1,886 | 834 | 1,160 |
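One plausible reading of the "Common to all" row is a set intersection over the seven pairwise comparisons: a protein counts as Pdp11-specific overall only if it is Pdp11-specific in every comparison. The Python sketch below illustrates that reading. It is an assumption about the logic, not the code of the workflow, and the function name, strain labels and protein IDs are hypothetical placeholders.

```python
def common_to_all(specific_by_comparison):
    """Return the IDs that appear in every per-comparison set."""
    id_sets = [set(ids) for ids in specific_by_comparison.values()]
    if not id_sets:
        return set()
    return set.intersection(*id_sets)


# Hypothetical toy input: one set of test-genome-specific IDs per comparison.
toy = {
    "NPS_a": {"prot1", "prot2", "prot3"},
    "NPS_b": {"prot2", "prot3"},
    "NPS_c": {"prot2", "prot3", "prot4"},
}
print(sorted(common_to_all(toy)))
```

Under this reading, the intersection can never be larger than the smallest per-comparison count, which agrees with the table: the 19 Pdp11-specific proteins common to all comparisons do not exceed the minimum of 29 found against SdM2.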
Figure 3. Strain clustering based on differential proteins.
Proteins which are differentially present or absent in genomes were clustered by their pattern of presence and absence in the eight strains analyzed in this study. Green: the protein is present; red: the protein is absent.
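The idea of grouping proteins by their presence/absence pattern can be sketched as follows. This is a simplification for illustration only: it groups proteins that share an identical pattern, whereas the clustering method behind Figure 3 is not stated here. The function name, strain labels and protein IDs are hypothetical.

```python
from collections import defaultdict


def group_by_pattern(present_in, strains):
    """Group protein IDs by their presence/absence pattern across strains.

    present_in maps a protein ID to the set of strains carrying it;
    strains fixes the column order of the pattern.
    """
    groups = defaultdict(list)
    for protein, carriers in present_in.items():
        # True corresponds to "present" (green), False to "absent" (red).
        pattern = tuple(strain in carriers for strain in strains)
        groups[pattern].append(protein)
    return dict(groups)


# Hypothetical toy input, not data from the study.
strains = ["strain_a", "strain_b", "strain_c"]
toy_presence = {
    "prot1": {"strain_a"},
    "prot2": {"strain_a"},
    "prot3": {"strain_b", "strain_c"},
}
groups = group_by_pattern(toy_presence, strains)
for pattern, proteins in groups.items():
    print(pattern, proteins)
```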
Table 5. UniProt IDs specific to the probiotic strain Pdp11 and absent from the NPSs.
| UniProt ID | Protein length | Protein description | Gene ontology terms |
|---|---|---|---|
| E6XIZ3 | 360 | Bile acid/detergent exporter membrane fusion component, VexC | Membrane [GO:0016020]; transmembrane transport [GO:0055085] |
| E6XIZ2 | 1,011 | Bile acid/detergent exporter permease component, VexD | Integral component of membrane [GO:0016021]; transporter activity [GO:0005215] |
| E6XG14 | 353 | Undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase | Gram-negative-bacterium-type cell wall [GO:0009276]; integral component of plasma membrane [GO:0005887]; magnesium ion binding [GO:0000287]; manganese ion binding [GO:0030145]; phospho-N-acetylmuramoyl-pentapeptide-transferase activity [GO:0008963]; transferase activity, transferring glycosyl groups [GO:0016757]; UDP-N-acetylglucosamine-undecaprenylphosphate N-acetylglucosaminephosphotransferase activity [GO:0036380]; O antigen biosynthetic process [GO:0009243] |
| E6XG15 | 357 | Undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase | Gram-negative-bacterium-type cell wall [GO:0009276]; integral component of plasma membrane [GO:0005887]; magnesium ion binding [GO:0000287]; manganese ion binding [GO:0030145]; phospho-N-acetylmuramoyl-pentapeptide-transferase activity [GO:0008963]; transferase activity, transferring glycosyl groups [GO:0016757]; UDP-N-acetylglucosamine-undecaprenylphosphate N-acetylglucosaminephosphotransferase activity [GO:0036380]; O antigen biosynthetic process [GO:0009243] |
| E6XIZ5 | 242 | MltA-interacting MipA family protein | |
| O86914 | 829 | Trimethylamine-N-oxide reductase | Periplasmic space [GO:0042597]; electron carrier activity [GO:0009055]; molybdenum ion binding [GO:0030151]; trimethylamine-N-oxide reductase (cytochrome c) activity [GO:0050626]; trimethylamine-N-oxide reductase activity [GO:0009033] (EC 1.6.6.9) |
| E6XL69 | 105 | Putative uncharacterized protein | Integral component of membrane [GO:0016021] |
| E6XIZ4 | 111 | Putative uncharacterized protein | |
| E6XLE5 | 60 | Putative uncharacterized protein | |
| Q8GJK1 | 215 | HTH-type transcriptional regulator for conjugative element pMERPH | Sequence-specific DNA binding [GO:0043565]; regulation of transcription, DNA-templated [GO:0006355]; transcription, DNA-templated [GO:0006351] |
| Q6ZYR2 | 413 | Putative integrase | DNA binding [GO:0003677]; DNA integration [GO:0015074]; DNA recombination [GO:0006310] |
| A4Y1U2 | 206 | Resolvase, N-terminal domain | DNA binding [GO:0003677]; recombinase activity [GO:0000150] |
| A4Y1U5 | 1,026 | Transposase Tn3 family protein | transposase activity [GO:0004803]; transposition, DNA-mediated [GO:0006313] |
| A1KQX7 | 66 | Putative excisionase (Recombination directionality factor) | |
| A4Y1U4 | 103 | Plasmid stabilization system | |
| A4YB89 | 222 | Transposase | |
| A4L329 | 48 | TraG (Fragment) | |
| Q70IK8 | 194 | Putative transfer protein (Fragment) | |
| Q70IK5 | 70 | Putative conjugative transfer protein (Fragment) |
Notes:
Data are as output from the get_all_results.sh script for the comparative analysis of the seven TarSynFlow executions (one per NPS, using Pdp11 as the test genome).
Orthologue containing the tag “Fragment” within metadata.
Figure 4. Example of PCR amplification for three genomic sequences predicted to encode Pdp11-specific proteins.
MWM is the molecular weight marker; arrows indicate bands for 600 and 1,000 bp. Ctrl– indicates a negative control without DNA. In all three cases, amplification was obtained only in Pdp11, which confirms the in silico prediction that these genes are absent from the NPSs and that their absence is not an artefact of the draft nature of the NPS genomes.
Table 6. PCR validation of the sequences coding for the Pdp11-specific proteins of Table 5, using the primer pairs of Table 2.
| Protein ID / isolated strain | SH4 | SH6 | SH9 | SH12 | SH16 | SdM1 | SdM2 | Pdp11 |
|---|---|---|---|---|---|---|---|---|
| E6XIZ3 | − | − | − | − | − | − | − | + |
| E6XIZ2 | − | − | − | − | − | − | − | + |
| E6XG14 | − | − | − | − | − | − | − | + (96.73%) |
| E6XG15 | − | − | − | − | − | − | − | + (98.25%) |
| E6XIZ5 | − | − | − | − | − | − | − | + |
| O86914 | + | − | − | + | − | − | − | + |
| E6XL69 | − | − | − | − | − | − | − | + |
| E6XIZ4 | − | − | − | − | − | − | − | + |
| E6XLE5 | − | − | − | − | − | − | − | + (88.25%) |
| Q8GJK1 | − | − | − | − | − | − | − | + (99.30%) |
| Q6ZYR2 | − | − | − | − | − | − | − | + (99.71%) |
| A4Y1U2 | − | − | − | − | − | − | − | + (78.41%) |
| A4Y1U5 | − | − | − | − | − | − | − | + |
| A1KQX7 | − | − | − | − | − | − | − | + |
| A4Y1U4 | − | − | − | − | − | − | − | + |
| A4YB89 | − | − | − | − | − | − | − | + |
Note:
Presence of a fragment of the correct amplification size is denoted with + and absence is denoted with −. When the fragment was sequenced, the percent identity with the Pdp11 genome is included.