Alexandra E Briner, Gabriele Andrea Lugli, Christian Milani, Sabrina Duranti, Francesca Turroni, Miguel Gueimonde, Abelardo Margolles, Douwe van Sinderen, Marco Ventura, Rodolphe Barrangou.
Abstract
CRISPR-Cas systems constitute adaptive immune systems for antiviral defense in bacteria. We investigated the occurrence and diversity of CRISPR-Cas systems in 48 Bifidobacterium genomes to gain insights into the diversity and co-evolution of CRISPR-Cas systems within the genus and to investigate CRISPR spacer content. We identified the elements necessary for the successful targeting of, and interference with, foreign DNA in select Type II CRISPR-Cas systems, including the tracrRNA and target PAM sequence. Bifidobacterium species have a very high frequency of CRISPR-Cas occurrence (77%, 37 of 48). We found that many Bifidobacterium species have unusually large and diverse CRISPR-Cas systems that contain spacer sequences showing homology to foreign genetic elements such as prophages. A large number of CRISPR spacers in bifidobacteria show perfect homology to prophage sequences harbored in the chromosomes of other species of Bifidobacterium, including some spacers that self-target the chromosome. A correlation was observed between strains that lacked CRISPR-Cas systems and the number of times prophages in that chromosome were targeted by other CRISPR spacers. The presence of prophage-targeting CRISPR spacers and prophage content may shed light on evolutionary processes and strain divergence. Finally, elements of Type II CRISPR-Cas systems, including the tracrRNA and crRNAs, set the stage for the development of genome editing and genetic engineering tools.
Year: 2015 PMID: 26230606 PMCID: PMC4521832 DOI: 10.1371/journal.pone.0133661
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Occurrence of CRISPR-Cas systems in bifidobacteria.
| Strain | System Type | CRISPR Repeat Sequence | Repeat Length | Number of Repeats | | |
|---|---|---|---|---|---|---|
| DSM 22766 | I-E | GTGTTCCCCGCATGCGCGGGGATGATCCC | 29 | 81 | Y | Y |
| ATCC 15703 | I-C | GTCGCTCTCCTTACGGAGAGCATGGATTGAAAT | 33 | 86 | Y | Y |
| LMG 11039 | I-E | GTGTTCCCCGCACACGCGGGGATGATCCC | 29 | 172 | Y | Y |
| ATCC 25527 | I-E | GTTTGCCCCGCACAGGCGGGGATGATCCG | 29 | 32 | Y | Y |
| DSM 10140 | I-U | ATCTCCGAAGTCTCGGCTTCGGAGCTTCATTGAGGG | 36 | 19 | Y | Y |
| PRL2011 | I-E | GTGTTCCCCGCATCCGCGGGGATGATCC | 28 | 147 | Y | Y |
| DSM 23969 | None | none | | | | |
| LMG 13200 | II-A | GTTTCAGATGCCTGTCAGATCAATGACTTTGACCAC | 36 | 23 | Y | Y |
| DSM 22767 | I-C | GTCGCTCCCTTCACAGGGAGCGTGGATTGAAAT | 33 | 20 | Y | Y |
| DSM 19703 | II-C | CCAGTATATCAGAGGGGCTTTAGATTGAATTTGAAAC | 37 | 25 | Y | Y |
| LMG 10736 | I-E | GTGTTCCCCGCGCATGCGGGGATGATCCC | 29 | 157 | Y | Y |
| UCC2003 | I-C | GTCGATCCCCATCCGGGGAGCGTGGATTGAAAT | 33 | 48 | Y | Y |
| DSM 23973 | II-C | CAAGTCTATCAAGAAGGGTGAATGCTAATTCCCAAC | 36 | 13 | Y | Y |
| DSM 16992 | I-E | GTGTTCCCCGCATACGCGGGGATGATCCC | 29 | 15 | Y | Y |
| LMG 10510 | Undetermined | GTGCTCCTCGCAAGCGCGTGGACAACCCG | 29 | 20 | N | |
| LMG 18911 | None | none | | | | |
| LMG 23609 | I-C | GTCGCTCCCTCACGGGAGCGTGGATTGAAAT | 31 | 36 | Y | Y |
| LMG 10738 | I-E | AGTTGCCCCGCGTATGCGGGGATGATCCG | 29 | 93 | Y | Y |
| LMG 11045 | II-C | CAAGTTTATCAAGAAGGGTAGAAGCTAATTCCCAGT | 36 | 17 | Y | Y |
| | I-C | GTCGCTCTCCTCACGGAGAGCGTGGATTGAAAT | 33 | 81 | Y | Y |
| | Undetermined | TGTCCGATTCTCCAGAATCGGACA | 24 | 8 | N | |
| DSM 20093 | I-E | GTGCTCCCCGCAAGCGCGGGGATGATCC | 28 | 39 | Y | Y |
| LMG 11586 | None | none | | | | |
| LMG 11587 | None | none | | | | |
| DSM 21854 | None | none | | | | |
| ATCC 15697 | None | none | | | | |
| NCC2705 | None | none | | | | |
| LMG 21814 | Undetermined | GTCGCACCCCACTGGGGTGCGTGGATTGAAAT | 32 | 9 | N | |
| LMG 11591 | Undetermined | GTGCTCCCCACATAGGTGGGGATGAT | 26 | 4 | N | |
| LMG 11341 | II-A | GTTTCAGATGCCTGTCAGATCAAGGACCTAGACCAC | 36 | 87 | Y | Y |
| LMG 11592 | I-E | GTTTGCCCCGCACTCGCGGGGATGATCC | 29 | 149 | Y | Y |
| DSM 21395 | None | none | N | | | |
| DSM 27321 | Undetermined | ATTTCAATCCACGCTCTCCGTGAGGAGAGCGAC | 33 | 15 | Y | |
| DSM 20438 | I-C | GTCGCTCTCCTCATGGAGAGCGTGGATTGAAAT | 33 | 27 | Y | Y |
| LMG 11569 | None | none | | | | |
| LMG 11571 | I-E | GTTTGCCCCGCATGTGCGGGGATGATCCG | 29 | 112 | Y | Y |
| LMG 21775 | None | none | | | | |
| LMG 21816 | I-U | ATTGCGAAGCTTTACGCTTCGCAACTTCATTGAGGA | 36 | 20 | Y | Y |
| DSM 23975 | Undetermined | CCGAGGTTCCGCCCCGCTGAGGA | 23 | 13 | N | |
| LMG 21811 | I-E | GTGTTCCCCGCATGCGCGGGGATGATCCC | 29 | 67 | Y | Y |
| | Undetermined | ATGTCCGATTCTGCAGAATCGGACA | 25 | 16 | N | |
| LMG 14934 | None | none | | | | |
| DSM 23967 | I-C | GTCACCCTCCTCACGGAGGGTGCGGATTGAAAT | 33 | 31 | Y | Y |
| LMG 21589 | I-E | GTTTACCCCGCATGCGCGGGGATGATCCG | 29 | 66 | Y | Y |
| DSM 23968 | I-C | GTCGCCCCTCTCACGAGGGGCGTGGATTGAAAT | 33 | 51 | Y | Y |
| DSM 24849 | Undetermined | TTGGATGTGAGCGGCTGGAACACC | 24 | 6 | N | |
| LMG 11597 | I-C | GTCGCTCCCTCACGGGAGCGTGGATTGAAAT | 31 | 156 | Y | Y |
| LMG 21689 | Undetermined | TTGTGTGAGGATTTGCTCGCACA | 23 | 5 | N | |
| LMG 21395 | I-C | ATCGCTCCCCGTATGGGGAGCGTGAGTTGAAAT | 33 | 89 | Y | Y |
| JCM 7027 | I-U | ATTGCCGGGATTCAATTCCCGGCGCTTCATTGAGGG | 36 | 53 | Y | Y |
| | Undetermined | GTCGCTCTCCTTACGGAGAGCGTGGATTGAAAT | 33 | 11 | N | |
| JCM 13495T | I-U | ATTGCCAGAGTTATAAGCTCTGGCCTTCGTTGAGGA | 36 | 7 | Y | Y |
| | II-C | CAATCTTATCAAGAGGGTAGAAAGCTAATTCACAGC | 36 | 13 | Y | Y |
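Because the table lists both a repeat sequence and its length, the two can be cross-checked. The following is a minimal sketch using three rows copied from the table above; the function name `length_mismatches` is hypothetical, and a flagged row may simply reflect a character lost in transcription rather than an error in the underlying data.

```python
# (strain, repeat sequence, listed repeat length) copied from three table rows
rows = [
    ("PRL2011", "GTGTTCCCCGCATCCGCGGGGATGATCC", 28),
    ("LMG 13200", "GTTTCAGATGCCTGTCAGATCAATGACTTTGACCAC", 36),
    ("LMG 11592", "GTTTGCCCCGCACTCGCGGGGATGATCC", 29),
]


def length_mismatches(rows):
    """Return (strain, observed length, listed length) where the two disagree."""
    return [
        (strain, len(seq), listed)
        for strain, seq, listed in rows
        if len(seq) != listed
    ]


mismatches = length_mismatches(rows)
for strain, observed, listed in mismatches:
    print(f"{strain}: sequence has {observed} nt, table lists {listed}")
```

Of these three rows, only the LMG 11592 entry is flagged: the sequence as given here has 28 nucleotides, while the table lists 29.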
Fig 1. Clustering of Cas1 into distinct phylogenetic groups.
Cas1 protein sequences were aligned using the MUSCLE algorithm and used to generate a UPGMA tree to show the divergence of the different CRISPR-Cas systems. The system type and subtype are noted on the right. The (*) indicates B. moukalabense, which contained an “Undetermined” CRISPR-Cas system.
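To illustrate how UPGMA groups sequences, here is a minimal pure-Python sketch of average-linkage clustering. It is not the authors' pipeline: the leaf labels and pairwise distances are invented for illustration, and in practice the distances would be derived from the aligned Cas1 sequences.

```python
from itertools import combinations


def upgma(labels, dist):
    """Build a nested-tuple tree by UPGMA (average linkage).

    labels: leaf names; dist: dict mapping frozenset({a, b}) -> distance.
    """
    sizes = {name: 1 for name in labels}  # cluster -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        # Pick the closest pair among the current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance to the merged cluster is the size-weighted mean.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Hypothetical leaves and distances, for illustration only.
labels = ["I-E_a", "I-E_b", "I-C_a", "II-C_a"]
dist = {
    frozenset(("I-E_a", "I-E_b")): 0.1,
    frozenset(("I-E_a", "I-C_a")): 0.6,
    frozenset(("I-E_b", "I-C_a")): 0.6,
    frozenset(("I-E_a", "II-C_a")): 0.9,
    frozenset(("I-E_b", "II-C_a")): 0.9,
    frozenset(("I-C_a", "II-C_a")): 0.8,
}
tree = upgma(labels, dist)
print(tree)
```

With these toy distances, the two I-E leaves join first, the I-C leaf joins them next, and the II-C leaf attaches last, mirroring the kind of type-level grouping the figure describes.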
Fig 2. CRISPR-Cas locus architecture.
One representative of each unique CRISPR subtype illustrates the locus architecture of cas genes, CRISPR repeats, spacers and other system-specific components (e.g. tracrRNA). The signature gene for each subtype is colored in red (cas3 or cas9 for Type I and II, respectively). The universal cas1 and cas2 genes are colored in blue. Accessory genes are grey. The tracrRNA for Type II systems is shown in yellow. The direction of the arrows indicates the directionality of the coding sequences. The repeat-spacer array only shows the CRISPR repeats (black rectangles). Each operon is shown at a scale of 11,000 base pairs. Long repeat-spacer arrays were shortened for simplicity, as indicated by a double line break. Numbers under the arrays indicate the first and last spacer location, showing the size of the array.
Fig 3. CRISPR repeat-spacer array size distribution.
The graph shows the variability in the size of the repeat-spacer arrays, measured as the number of spacers in each array (from Table 1). The error bars show the range of the locus size.
Fig 4. Prophage targeting by CRISPR spacers.
The heat map displays Bifidobacterium CRISPR spacers that target prophage sequences harbored in bifidobacterial genomes. The horizontal axis lists hosts that contain prophages targeted by CRISPR spacers. The vertical axis lists strains containing CRISPR spacers that target a bifidobacterial prophage sequence. The color intensity represents the number of cross-targeting events, with red squares being high density (up to 10 targeting events) and white squares being single targeting events. Darker pink squares correspond to more cross-targeting events and lighter pink squares to fewer events. Blue squares represent the absence of CRISPR targeting. Hits shown in grey indicate self-targeting spacers that target prophage sequences in that particular chromosome, meaning both the spacer and the prophage target are in the same chromosome. The darker grey indicates four distinct spacers that target prophage sequences, while the lighter grey indicates a single self-targeting spacer.
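The counts behind such a heat map can be sketched as a tally of spacer hits per (spacer strain, prophage host) pair. The example below is an assumption-laden illustration: it uses exact substring matching on both strands rather than whatever homology search the study used, and the strain names and sequences are invented.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def targeting_counts(spacers, prophages):
    """Count perfect spacer matches for each (spacer strain, prophage host) pair.

    A pair whose two strain names are equal would be a self-targeting event.
    """
    counts = {}
    for strain, strain_spacers in spacers.items():
        for host, prophage in prophages.items():
            hits = sum(
                1
                for s in strain_spacers
                if s in prophage or reverse_complement(s) in prophage
            )
            if hits:
                counts[(strain, host)] = hits
    return counts


# Hypothetical toy data, not taken from the study.
spacers = {
    "strain_A": ["ACGTTGCA", "GGGCCCAT"],
    "strain_B": ["TTTTAAAC"],
}
prophages = {
    "strain_B": "CCACGTTGCATTATGGGCCCGG",
    "strain_A": "GGGTTTAAAACC",
}
counts = targeting_counts(spacers, prophages)
print(counts)
```

In this toy case, strain_A carries two spacers matching the strain_B prophage (one on each strand), and strain_B carries one spacer matching the strain_A prophage on the reverse strand.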
Fig 5. WebLogo predictions of PAMs.
The height of each letter represents the conservation of that nucleotide at each position in the 10-nt flank at the 3′ end of the protospacer. Hits from S1 Table were used to generate these WebLogos.
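Letter heights in a sequence logo are conventionally derived from the information content of each column. The sketch below computes this for a set of aligned flanks under simplifying assumptions: the flank sequences are invented and only 5 nt long, and the small-sample correction that WebLogo can apply is omitted.

```python
import math
from collections import Counter


def column_information(column):
    """Information content in bits of one DNA column (maximum log2(4) = 2)."""
    total = len(column)
    entropy = -sum(
        (n / total) * math.log2(n / total) for n in Counter(column).values()
    )
    return 2.0 - entropy


def letter_heights(flanks):
    """For each position, map nucleotide -> frequency times column information."""
    heights = []
    for column in zip(*flanks):
        info = column_information(column)
        total = len(column)
        heights.append({nt: (n / total) * info for nt, n in Counter(column).items()})
    return heights


# Hypothetical aligned flanks, for illustration only.
flanks = ["AGGTA", "TGGCA", "CGGAA", "GGGTA"]
heights = letter_heights(flanks)
for position, column_heights in enumerate(heights, start=1):
    print(position, column_heights)
```

A position where every flank carries the same nucleotide reaches the full 2 bits, while a position with all four nucleotides at equal frequency contributes nothing, which is why conserved PAM positions stand out in the logo.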
Fig 6. crRNA:tracrRNA duplex binds the target DNA sequence next to the PAM sequence.
Type II elements involved in Cas9 targeting and cleavage for B. bombi (A), B. bifidum (B), and B. merycicum (C) include the target protospacer (blue), the recognition PAM (purple), crRNA (green), and tracrRNA (red). The tracrRNA and crRNA form a duplex that induces the endonucleolytic activity of Cas9 to cleave foreign DNA. The PAM could not be determined for B. bifidum, so the PAM region on the target DNA strand is shown instead. The RNaseIII processing sites were inferred from the preliminary RNASeq data and previous characterization of RNaseIII activity [41, 42]; the darkest arrow is most likely the primary processing site and the lighter arrows are the secondary and tertiary processing sites. For B. bombi, the processing sites were based on the boundaries determined for the crRNAs using the RNASeq data. For B. bifidum, the processing sites were determined using the boundaries from the tracrRNA RNASeq data. The B. merycicum sites are based on the boundaries from the B. bifidum data; the tracrRNA sequences differed by only five nucleotides in the upper stem-bulge-lower stem region, meaning the processing sites are likely similar.