Gang Wang1, Qian Liu1, Zhangming Pei1, Linlin Wang1, Peijun Tian1, Zhenmin Liu2, Jianxin Zhao1,3,4, Hao Zhang1,3,5,6,7, Wei Chen1,3,6,8.
Abstract
Diverse CRISPR-Cas systems constitute an indispensable part of the bacterial adaptive immune system against viral infections. However, to escape from this immune system, bacteriophages have also evolved corresponding anti-defense measures. We investigated the diversity of CRISPR-Cas systems and the presence of prophages in the genomes of 66 Bifidobacterium pseudocatenulatum strains. Our findings revealed a high occurrence of complete CRISPR-Cas systems (62%, 41/66) in the B. pseudocatenulatum genomes. Subtypes I-C, I-U, and II-A were found to be widespread in this species. No significant association was found between the number of CRISPR spacers in a strain and the age of its host. Our analysis of prophages within B. pseudocatenulatum genomes revealed that prophage genes related to distinct functional modules were degraded to different degrees, indicating that these prophages are unlikely to enter the lytic cycle spontaneously. Further, the evolutionary analysis of the prophages revealed that they might be derived from different phage ancestors. Notably, self-targeting within B. pseudocatenulatum and anti-CRISPR (Acr) coding genes in prophages were observed. Overall, our results indicate that the competition between B. pseudocatenulatum and phages is a major driving factor for the genomic diversity of both partners.
Keywords: Bifidobacterium pseudocatenulatum; CRISPR-Cas systems; co-evolution; genomic diversity; prophage
Year: 2020 PMID: 32528454 PMCID: PMC7264901 DOI: 10.3389/fmicb.2020.01088
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
CRISPR-Cas systems present in B. pseudocatenulatum strains.
| Strain | Type-subtype | Repeat sequence | Repeat length | No. repeats | cas1 | cas2 | cas3 | cas9 |
| A13 | None | |||||||
| A14 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 63 | Y | Y | Y | |
| FAHBZ2M3 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 113 | Y | Y | Y | |
| FAHBZ9L5 | None | |||||||
| FAHWH24M2 | None | |||||||
| FFJND17M1 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 6 | Y | Y | Y | |
| FFJND7M3 | None | |||||||
| FFJNDD5M3 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 83 | Y | Y | Y | |
| FFJNDD6M2 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 91 | Y | Y | Y | |
| FGSYC11M1 | None | |||||||
| FGSYC12M4 | None | |||||||
| FGSYC13M1 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 114 | Y | Y | Y | |
| FGSYC18M1 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 104 | Y | Y | Y | |
| FGSYC36M3 | I-U | ATTCCTGAGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 38 | Y | Y | Y | |
| FGSYC39M1 | None | |||||||
| FGSYC3M2 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 99 | Y | Y | Y | |
| FGSYC43M1 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 83 | Y | Y | Y | |
| FGSYC4M2 | I-U | ATTCCTGAGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 49 | Y | Y | Y | |
| FGSYC5M4 | None | |||||||
| FGSYC6M1 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 59 | Y | Y | Y | |
| FGSYC76M7 | I-C | GTCGCTCCCCGCAAGGGGAGTGTGGATTGAAAT | 33 | 34 | Y | Y | Y | |
| FGSYC7M5 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 80 | Y | Y | Y | |
| FGSYC87M1 | None | |||||||
| FGSYC88M3 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 53 | Y | Y | Y | |
| FGSYC91M2 | None | |||||||
| FGSZY20M1 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 64 | Y | Y | Y | |
| FGSZY50M3 | I-U | ATTCCTGAGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 32 | Y | Y | Y | |
| FHNFQ13M2 | I-C | GTCGCTCCCCGCAAGGGGAGTGTGGATTGAAAT | 33 | 23 | Y | Y | Y | |
| FHNFQ3M1 | None | |||||||
| FHNXY15M2 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 90 | Y | Y | Y | |
| FHNXY46M4 | II-A | GTTTCAGATGCCTGTCAGATCAAAGACTTAGACCAC | 36 | 13 | Y | Y | ||
| FHuNMY10M3 | None | |||||||
| FHuNMY37M1 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 97 | Y | Y | Y | |
| FJLHD2M3 | None | |||||||
| FJLHD33M2 | I-C | GTCGCTCCCCGCAAGGGGAGTGTGGATTGAAAT | 33 | 31 | Y | Y | Y | |
| FJLHD45M1 | I-C | GTCGCTCCCCGCAAGGGGAGTGTGGATTGAAAT | 33 | 21 | Y | Y | Y | |
| FJLHD4M2 | None | |||||||
| FJSNT36M3 | None | |||||||
| FJSNT37M5 | None | |||||||
| FNMHLBE12M7 | None | |||||||
| FNXHL2M3 | I-U | ATTCCTGAGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 25 | Y | Y | Y | |
| FNXHL5M2 | II-A | GTTTCAGATGCCTGTCAGATCAAAGACTTAGACCAC | 36 | 13 | Y | Y | ||
| FNXYCHL12M2 | II-A | GTTTCAGATGCCTGTCAGATCAAAGACTTAGACCAC | 36 | 31 | Y | Y | ||
| FQHXN112M3 | None | |||||||
| FQHXN3M8 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 64 | Y | Y | Y | |
| FQHXN5M4 | None | |||||||
| FQHXN6M4 | None | |||||||
| FQHXN72M4 | I-U | ATTCCTGAGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 8 | Y | Y | Y | |
| FQHXN83M4 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 116 | Y | Y | Y | |
| FQHXN8M3 | None | |||||||
| FSCPS14M2 | I-U | ATTCCTGGGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 32 | Y | Y | Y | |
| FSDWF3M4 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 111 | Y | Y | Y | |
| FSHXXA2M9 | None | |||||||
| FXJKS15M4 | None | |||||||
| FXJWS24M3 | I-U | ATTCCTGAGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 48 | Y | Y | Y | |
| FXJWS49M33 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 81 | Y | Y | Y | |
| FYNDL22M6 | I-C | GTCACTCCCCGCAAGGGGAGTGTGGATTGAAAT | 33 | 16 | Y | Y | Y | |
| FYNLJ23M6 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 76 | Y | Y | Y | |
| FZJHZ1M1 | I-E | GTGTTCCCCGCATACGCGGGGATGATCCC | 29 | 168 | Y | Y | Y | |
| FZJHZD11M4 | None | |||||||
| HuNa38 | None | |||||||
| HuNan_2016 | II-A | GTTTCAGATGCCTGTCAGATCAAAGACTTAGACCAC | 36 | 47 | Y | Y | ||
| NT17 | I-U | ATTCCTGAGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 41 | Y | Y | Y | |
| U2 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 60 | Y | Y | Y | |
| V6 | I-C | GTCGCTCCCCGCAAGGGGAGTGTGGATTGAAAT | 33 | 22 | Y | Y | Y | |
| XZ28R1 | I-C | GTCGCTCCCCGCAAGGGGAGTGTGGATTGAAAT | 33 | 16 | Y | Y | Y |
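The occurrence figure quoted in the abstract can be re-derived from this table by tallying the Type-subtype column. The following is a minimal Python sketch, assuming the rows are available as pipe-delimited text exactly as printed above. The `ROWS` excerpt (five of the 66 rows) and the helper names `parse_rows`, `tally_subtypes`, and `repeat_length_mismatches` are illustrative and not part of the original study.

```python
from collections import Counter

# Illustrative excerpt: five rows copied from the table above, not the full set.
ROWS = """\
| A13 | None | |||||||
| A14 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 63 | Y | Y | Y | |
| FGSYC36M3 | I-U | ATTCCTGAGCTAATCAGCTCAGGACTTCATTGAGGA | 36 | 38 | Y | Y | Y | |
| FHNXY46M4 | II-A | GTTTCAGATGCCTGTCAGATCAAAGACTTAGACCAC | 36 | 13 | Y | Y | |
| FZJHZ1M1 | I-E | GTGTTCCCCGCATACGCGGGGATGATCCC | 29 | 168 | Y | Y | Y | |
"""


def parse_rows(text):
    """Split each pipe-delimited line into stripped cells, skipping blank lines."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) >= 2 and cells[0]:
            rows.append(cells)
    return rows


def tally_subtypes(rows):
    """Count strains per subtype; 'None' marks strains without a system."""
    return Counter(cells[1] for cells in rows)


def repeat_length_mismatches(rows):
    """Return strains whose stated repeat length differs from the sequence length."""
    bad = []
    for cells in rows:
        if cells[1] != "None" and len(cells[2]) != int(cells[3]):
            bad.append(cells[0])
    return bad


rows = parse_rows(ROWS)
counts = tally_subtypes(rows)
with_system = sum(n for subtype, n in counts.items() if subtype != "None")

print(counts)
print(f"{with_system}/{len(rows)} strains in this excerpt carry a CRISPR-Cas system")
print("Repeat length mismatches:", repeat_length_mismatches(rows))
```

Run against all 66 rows rather than this excerpt, the same tally should reproduce the 41/66 (62%) occurrence stated in the abstract.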
FIGURE 1. Phylogenetic tree based on the amino acid sequences of Cas proteins in B. pseudocatenulatum, aligned using the MUSCLE algorithm and constructed using UPGMA with 500 bootstrap replicates. Bootstrap values are shown on the nodes. The CRISPR-Cas subtypes are written on the right, and the groups are highlighted in different colors for each subtype. (A) Phylogenetic tree based on Cas1 amino acid sequences. (B) Phylogenetic tree based on Cas3 amino acid sequences.
FIGURE 2. Comparison of the occurrence of CRISPR-Cas subtypes between the Bifidobacterium genus (outer circle), B. longum (intermediate circle), and B. pseudocatenulatum (innermost circle).
FIGURE 3. Schematic representation of CRISPR-Cas systems in B. pseudocatenulatum. (A) Representative CRISPR-Cas locus architecture of B. pseudocatenulatum. Arrows of the same color represent the same cas genes, and the length of each arrow represents the length of the cas gene; the fence graphic represents the CRISPR loci, and the number above it represents the number of repeats. Long repeat-spacer arrays were shortened for simplicity, as indicated by a double line break. (B) Schematic diagram of incomplete CRISPR-Cas systems.
List of prophages found in B. pseudocatenulatum strains.
| Strains | Name | Location | Start | End | Size | ORF | GC content |
| A13 | Bpseuc_1 | Scaffold2 | 227637 | 244310 | 16674 | 27 | 61.69% |
| A13 | Bpseuc_2 | Scaffold8 | 175 | 36952 | 36778 | 40 | 59.27% |
| A13 | Bpseuc_3 | Scaffold9 | 1142 | 36643 | 35502 | 53 | 58.21% |
| A14 | Bpseuc_4 | Scaffold4 | 69117 | 85946 | 16830 | 25 | 61.59% |
| FAHBZ9L5 | Bpseuc_5 | Scaffold13 | 19 | 29389 | 29371 | 28 | 59.60% |
| FAHBZ9L5 | Bpseuc_6 | Scaffold18 | 2260 | 39492 | 37233 | 64 | 58.86% |
| FAHWH24M2 | Bpseuc_7 | Scaffold3 | 77613 | 99772 | 22160 | 29 | 61.63% |
| FAHWH24M2 | Bpseuc_8 | Scaffold3 | 99844 | 126153 | 26310 | 36 | 59.94% |
| FFJND17M1 | Bpseuc_9 | Scaffold15 | 19662 | 44844 | 25183 | 23 | 59.18% |
| FFJND17M1 | Bpseuc_10 | Scaffold8 | 8 | 37391 | 37384 | 58 | 63.57% |
| FGSYC11M1 | Bpseuc_11 | Scaffold11 | 68 | 39551 | 39484 | 59 | 58.16% |
| FGSYC11M1 | Bpseuc_12 | Scaffold8 | 6393 | 56749 | 50357 | 50 | 63.98% |
| FGSYC13M1 | Bpseuc_13 | Scaffold11 | 1110 | 17254 | 16145 | 26 | 54.89% |
| FGSYC13M1 | Bpseuc_14 | Scaffold15 | 1356 | 38694 | 37339 | 53 | 55.82% |
| FGSYC39M1 | Bpseuc_15 | Scaffold5 | 2440 | 24532 | 22093 | 40 | 58.63% |
| FGSYC39M1 | Bpseuc_16 | Scaffold5 | 36799 | 64284 | 27486 | 23 | 57.52% |
| FGSYC3M2 | Bpseuc_17 | Scaffold1 | 644814 | 659818 | 15005 | 22 | 58.58% |
| FGSYC43M1 | Bpseuc_18 | Scaffold17 | 610 | 18948 | 18339 | 22 | 55.45% |
| FGSYC43M1 | Bpseuc_19 | Scaffold18 | 258 | 19089 | 18832 | 34 | 59.56% |
| FGSYC43M1 | Bpseuc_20 | Scaffold6 | 98257 | 141985 | 43729 | 59 | 59.57% |
| FGSYC6M1 | Bpseuc_21 | Scaffold13 | 734 | 20588 | 19855 | 31 | 59.38% |
| FGSYC6M1 | Bpseuc_22 | Scaffold8 | 43823 | 69065 | 25243 | 22 | 59.16% |
| FGSYC76M7 | Bpseuc_23 | Scaffold11 | 34493 | 77744 | 43252 | 57 | 59.03% |
| FGSYC91M2 | Bpseuc_24 | Scaffold4 | 41764 | 59356 | 17593 | 27 | 61.75% |
| FGSZY20M1 | Bpseuc_25 | Scaffold15 | 812 | 14292 | 13481 | 25 | 58.67% |
| FJLHD2M3 | Bpseuc_26 | Scaffold3 | 19631 | 45491 | 25861 | 21 | 65.53% |
| FJLHD33M2 | Bpseuc_27 | Scaffold7 | 3832 | 39993 | 36162 | 51 | 59.68% |
| FJSNT37M5 | Bpseuc_28 | Scaffold12 | 11435 | 33102 | 21668 | 31 | 61.15% |
| FNXHL5M2 | Bpseuc_29 | Scaffold17 | 1037 | 36153 | 35117 | 56 | 58.43% |
| FNXYCHL12M2 | Bpseuc_30 | Scaffold15 | 66 | 19263 | 19198 | 35 | 57.29% |
| FQHXN112M3 | Bpseuc_31 | Scaffold11 | 10551 | 33925 | 23375 | 36 | 55.32% |
| FQHXN112M3 | Bpseuc_32 | Scaffold3 | 39 | 30984 | 30946 | 52 | 54.94% |
| FQHXN5M4 | Bpseuc_33 | Scaffold11 | 3043 | 35653 | 32611 | 51 | 58.72% |
| FQHXN5M4 | Bpseuc_34 | Scaffold6 | 97354 | 125807 | 28454 | 45 | 59.57% |
| FQHXN6M4 | Bpseuc_35 | Scaffold7 | 20186 | 45415 | 25230 | 22 | 59.14% |
| FQHXN72M4 | Bpseuc_36 | Scaffold6 | 41789 | 59417 | 17629 | 27 | 61.77% |
| FQHXN83M4 | Bpseuc_37 | Scaffold4 | 15022 | 55324 | 40303 | 42 | 56.86% |
| FQHXN83M4 | Bpseuc_38 | Scaffold4 | 55295 | 72648 | 17354 | 22 | 55.25% |
| FQHXN83M4 | Bpseuc_39 | Scaffold4 | 118715 | 143733 | 25019 | 22 | 55.95% |
| FQHXN8M3 | Bpseuc_40 | Scaffold2 | 106129 | 150800 | 44672 | 59 | 58.95% |
| FQHXN8M3 | Bpseuc_41 | Scaffold2 | 145017 | 162282 | 17266 | 25 | 57.72% |
| FQHXN8M3 | Bpseuc_42 | Scaffold9 | 8469 | 54895 | 46427 | 49 | 59.53% |
| FSHXXA2M9 | Bpseuc_43 | Scaffold9 | 1804 | 24122 | 22319 | 39 | 58.53% |
| FXJWS24M3 | Bpseuc_44 | Scaffold12 | 4191 | 26020 | 21830 | 29 | 55.45% |
| FXJWS24M3 | Bpseuc_45 | Scaffold4 | 61672 | 119564 | 57893 | 82 | 54.42% |
| FXJWS24M3 | Bpseuc_46 | Scaffold4 | 129990 | 175070 | 45081 | 47 | 56.27% |
| FYNDL22M6 | Bpseuc_47 | Scaffold3 | 265958 | 294283 | 28326 | 34 | 63.00% |
| FYNLJ23M6 | Bpseuc_48 | Scaffold9 | 63575 | 86482 | 22908 | 28 | 56.73% |
| FZJHZ1M1 | Bpseuc_49 | Scaffold9 | 1998 | 22072 | 20075 | 36 | 58.27% |
| HuNa38 | Bpseuc_50 | Scaffold11 | 23457 | 56669 | 33213 | 51 | 60.78% |
| HuNan_2016 | Bpseuc_51 | Scaffold17 | 1 | 26720 | 26720 | 47 | 63.22% |
| HuNan_2016 | Bpseuc_52 | Scaffold5 | 117353 | 154565 | 37213 | 52 | 60.48% |
| U2 | Bpseuc_53 | Scaffold12 | 4175 | 29437 | 25263 | 35 | 59.17% |
| U2 | Bpseuc_54 | Scaffold16 | 1708 | 20682 | 18975 | 34 | 59.60% |
| V6 | Bpseuc_55 | Scaffold6 | 85538 | 115164 | 29627 | 37 | 58.95% |
| XZ28R1 | Bpseuc_56 | Scaffold10 | 1210 | 39052 | 37843 | 70 | 54.92% |
| XZ28R1 | Bpseuc_57 | Scaffold10 | 21621 | 44575 | 22955 | 31 | 54.92% |
| XZ28R1 | Bpseuc_58 | Scaffold4 | 109934 | 127530 | 17597 | 26 | 61.73% |
| XZ28R1 | Bpseuc_59 | Scaffold5 | 3834 | 41268 | 37435 | 54 | 59.66% |
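The Size column in this table agrees with End - Start + 1, that is, inclusive 1-based coordinates, for the rows checked here. Some predictions on the same scaffold of the same strain also have overlapping coordinates (for example Bpseuc_40 and Bpseuc_41). The sketch below, a minimal illustration in Python, checks both properties on three rows copied from the table. The record layout and the function names `span_length` and `overlap_length` are assumptions made for this example, not part of the original analysis.

```python
# (strain, name, scaffold, start, end, size) copied from three rows of the table.
PROPHAGES = [
    ("A13", "Bpseuc_1", "Scaffold2", 227637, 244310, 16674),
    ("FQHXN8M3", "Bpseuc_40", "Scaffold2", 106129, 150800, 44672),
    ("FQHXN8M3", "Bpseuc_41", "Scaffold2", 145017, 162282, 17266),
]


def span_length(start, end):
    """Length of an interval with inclusive start and end coordinates."""
    return end - start + 1


def overlap_length(a, b):
    """Number of shared positions between two records, 0 if they are on
    different strains or scaffolds or do not overlap."""
    if a[0] != b[0] or a[2] != b[2]:
        return 0
    shared = min(a[4], b[4]) - max(a[3], b[3]) + 1
    return max(shared, 0)


# Check that each reported size matches the inclusive coordinate span.
for strain, name, scaffold, start, end, size in PROPHAGES:
    status = "ok" if span_length(start, end) == size else "MISMATCH"
    print(name, span_length(start, end), size, status)

# Report pairwise overlaps between predictions on the same scaffold.
for i, a in enumerate(PROPHAGES):
    for b in PROPHAGES[i + 1:]:
        shared = overlap_length(a, b)
        if shared:
            print(f"{a[1]} and {b[1]} share {shared} bp on {a[0]} {a[2]}")
```

Matching on strain as well as scaffold matters because scaffold names such as Scaffold2 repeat across strains.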
FIGURE 4. Association between CRISPR-Cas systems and prophages. (A) Comparison of the number of spacers in B. pseudocatenulatum strains with and without prophages, and comparison of the number of prophages in the presence and absence of CRISPR-Cas systems, using a two-tailed Student’s t-test. (B) Of the spacer sequences, 15.6% were mapped to selected prophages; the remaining spacers could not be matched to any of the integrated prophages or to the RefSeq_viral database. (C) CRISPR spacers targeting prophages in B. pseudocatenulatum strains. The heat-map represents spacers that matched the prophages in different B. pseudocatenulatum strains. The vertical axis represents the selected prophages. The horizontal axis represents the strains carrying CRISPR spacers that target prophages. The color scales represent the number of targeting events, with blue squares representing the absence of matches and red squares representing the highest number of targeting events. (D) Correlation between the number of spacers in CRISPR arrays and the number of matched prophages (n = 41, Spearman’s rank correlation coefficient r = 0.7156, P < 0.001).
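For readers unfamiliar with the statistic in the last panel, Spearman's rank correlation is the Pearson correlation computed on ranks, with tied values given their average rank. The following is a minimal pure-Python sketch of that definition. The input lists are made-up placeholder values, not the study's data, and the function names are chosen for this example only. It does not compute a P value.

```python
def average_ranks(values):
    """Assign 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values for illustration only (NOT taken from the study).
spacers_per_strain = [12, 30, 45, 45, 80, 110]
matched_prophages = [0, 2, 1, 3, 5, 4]
print(f"Spearman r = {spearman(spacers_per_strain, matched_prophages):.4f}")
```

In practice a statistics library would normally be used instead, since it also reports the significance level quoted in the caption.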
FIGURE 5. CRISPR spacers targeting prophages in bifidobacterial genomes. (A) B. pseudocatenulatum CRISPR spacers targeting prophages in other Bifidobacterium strains. The heat-map represents spacers that matched the prophages in different Bifidobacterium strains. The horizontal axis represents other Bifidobacterium strains with targeted prophages. The vertical axis represents the B. pseudocatenulatum strains carrying CRISPR spacers that target prophages. The color scales represent the number of targeting events, with blue squares representing the absence of matches and red squares representing the highest number of targeting events. (B) B. pseudocatenulatum CRISPR spacers targeting prophages in strains of the same species. The horizontal axis represents B. pseudocatenulatum strains that harbor prophages targeted by B. pseudocatenulatum CRISPR spacers. The vertical axis represents the B. pseudocatenulatum strains carrying CRISPR spacers that target prophages within B. pseudocatenulatum strains.
FIGURE 6. Pan-genome and COG comparison between prophages and the B. pseudocatenulatum genome. (A) Relative proportion of the number of genes in B. pseudocatenulatum and prophages. (B) Relative proportion of the number of bacterial COGs (BifCOGs) and prophage COGs (ProCOGs). (C) Abundance of the ProCOGs with identical predicted functions.
FIGURE 7. Preservation of genes within the prophages identified based on the genomic functional modules. (A) Prophage genes were subdivided into five functional modules, supported by a heatmap of the identified genes for each prophage. The prophage names are indicated on the right-hand margin of the heatmap, and the gene names are displayed at the bottom. (B) Abundance of individual functions identified within the prophages. The first column shows the number of prophages that encode a particular function listed in the second column, whereas the third column shows the relative percentages.
FIGURE 8. Phylogenetic tree based on the whole genome of prophages. MAFFT was used to perform multiple sequence alignment. The maximum likelihood method was used to construct the phylogenetic tree with 1000 bootstrap replicates. The outermost circle bar represents the number of prophages containing important types of viral functional proteins.