Lan Yang1, Weixun Li1, Obaroakpo Joy Ujiroghene1, Yang Yang1, Jing Lu1, Shuwen Zhang1, Xiaoyang Pang1, Jiaping Lv1.
Abstract
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are part of an adaptive immune system in bacteria and archaea that resists foreign genes through nuclease targeting. In this study, we analyzed 68 strains of the Lactobacillus casei group from the NCBI GenBank database and used bioinformatic tools to investigate the occurrence and diversity of CRISPR systems. A total of 30 CRISPR loci were identified in 27 strains. Three strains contained two loci at distinct genomic sites; the others contained only one CRISPR locus. Analysis of the direct repeat (DR) sequences showed that all DRs could form stable RNA secondary structures. The CRISPR spacers were diverse, and investigation of the spacer sequences revealed their origin and evolution. In addition, a large number of CRISPR spacers showed perfect homology to phage and plasmid sequences. Collectively, our results contribute to research on resistance in the L. casei group and provide new insight into the diversity and evolution of CRISPR/Cas systems.
Keywords: CRISPR system; Lactobacillus casei group; genotyping; phage; spacer
Year: 2020 PMID: 32322250 PMCID: PMC7156538 DOI: 10.3389/fmicb.2020.00624
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
CRISPR-Cas systems in the L. casei group.
| Lactobacillus casei groups | Strain | Assembly | Type-subtype | CRISPR direction | No. spacer | Repeat sequence |
| Casei | BL23 | GCA_000026485.1 | TypeII-A | - | 21 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | 12A | GCA_000309565.2 | None | | | |
| | W56 | GCA_000318035.1 | TypeII-A | - | 15 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | ATCC 393 | GCA_000829055.1 | None | | | |
| | LC5 | GCA_002192215.1 | None | | | |
| | CECT 9104 | GCA_900492555.1 | TypeII-A | - | 42 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | | | TypeI-C | + | 13 | ATTTCAATTCACGCAGTCACGTAGACTGCGAC |
| | | | TypeI-C | | 25 | ATTTCAATTCACGCAGTCACGTAGACTGCGAC |
| | A2-362 | GCA_000510825.1 | None | | | |
| | Z11 | GCA_001885295.1 | TypeII-A | + | 11 | GTTTTAGAAGGATGTTAAATCAATAAGGTTAAACCC |
| | MJA 12 | GCA_002091975.1 | None | | | |
| | YNF-5 | GCA_004123005.1 | None | | | |
| | DSM 20011 | GCA_001433735.1 | None | | | |
| Paracasei | ATCC 334 | GCA_000014525.1 | TypeI-E | - | 20 | GGATCACCCCCGCATGTGCGGGGAAAAC |
| | JCM 8130 | GCA_000829035.1 | None | | | |
| | Zhang | GCA_000019245.3 | TypeII-A | - | 16 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | 8700:2 | GCA_000155515.2 | TypeII-A | - | 20 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | BD-II | GCA_000194765.1 | TypeII-A | - | 21 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | LC2W | GCA_000194785.1 | TypeII-A | - | 21 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | LOCK919 | GCA_000418515.1 | TypeII-A | - | 11 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | N1115 | GCA_000582665.1 | None | | | |
| | CAUH35 | GCA_001191565.1 | None | | | |
| | L9 | GCA_001244395.1 | TypeII-A | - | 46 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | KL1 | GCA_001514415.1 | None | | | |
| | IIA | GCA_002079285.1 | TypeII-A | - | 35 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | TK1501 | GCA_002257625.1 | TypeII-A | - | 24 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | FAM18149 | GCA_002442835.1 | None | | | |
| | TMW 1.1434 | GCA_002813615.1 | TypeI-E | - | 130 | GGATCACCCCCGCATGTGCGGGGAAAAC |
| | | | TypeII-A | - | 22 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | HD1.7 | GCA_002865565.1 | TypeII-A | - | 32 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | HDS-01 | GCA_002902825.1 | TypeII-A | - | 30 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | EG9 | GCA_003177075.1 | TypeII-A | - | 17 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | Lpc10 | GCA_003199005.1 | TypeII-A | + | 22 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | LC355 | GCA_003268715.1 | TypeII-A | - | 52 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | | | TypeI-E | + | 65 | GTTTTCCCCGCACATGCGGGGGTGATCC |
| | ZFM54 | GCA_003627255.1 | TypeII-A | + | 18 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | 7112-2 | GCA_003957435.1 | TypeII-A | - | 21 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | IJH-SONE68 | GCA_003966835.1 | None | | | |
| | SRCM103299 | GCA_004141835.1 | None | | | |
| | LcY | GCA_000388095.2 | TypeII-A | - | 21 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | ATCC 25302 | GCA_000159495.1 | None | | | |
| | KL1-Liu | GCA_000827145.1 | None | | | |
| | 1316.rep1_LPAR | GCA_001062665.1 | None | | | |
| | 1316.rep2_LPAR | GCA_001062695.1 | None | | | |
| | 844_LCAS | GCA_001066565.1 | None | | | |
| | 275_LPAR | GCA_001076595.1 | None | | | |
| | 525_LPAR | GCA_001076935.1 | TypeII-A | - | 48 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | BM-LC14617 | GCA_001636215.1 | None | | | |
| | RI-210 | GCA_001981715.1 | None | | | |
| | RI-194 | GCA_001982085.1 | None | | | |
| | RI-195 | GCA_001982095.1 | None | | | |
| | CCC B1205 | | None | | | |
| | KMB_598 | GCA_003367655.1 | None | | | |
| | AM33-2AC | GCA_003434205.1 | TypeI-E | - | 31 | GTTTTCCCCGCACATGCGGGGGTGATCC |
| | DTA83 | GCA_003571925.1 | TypeII-A | + | 30 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | UBLPC-35 | GCA_003640765.1 | None | | | |
| | FAM18108 | GCA_003712245.1 | None | | | |
| | FAM18110 | GCA_003712265.1 | None | | | |
| | FAM18105 | GCA_003712275.1 | None | | | |
| | FAM18123 | GCA_003712325.1 | None | | | |
| | FAM18119 | GCA_003712385.1 | None | | | |
| | FAM18113 | GCA_003712395.1 | None | | | |
| | FAM18149 | GCA_003712485.1 | None | | | |
| | FAM18133 | GCA_003712525.1 | None | | | |
| | FAM18172 | GCA_003712585.1 | None | | | |
| | FAM6012 | GCA_003712745.1 | None | | | |
| | FAM8374 | GCA_003712825.1 | None | | | |
| | FAM8407 | GCA_003712835.1 | None | | | |
| | FAM6165 | GCA_003712875.1 | None | | | |
| | FAM18126 | GCA_003712925.1 | None | | | |
| | FAM3248 | GCA_003712935.1 | TypeI-E | + | 91 | GTTTTCCCCGCACATGCGGGGGTGATCC |
| | LcA | GCA_000400585.1 | TypeII-A | - | 21 | GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC |
| | DSM 20258 | GCA_001436485.1 | None | | | |
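The table lists two different repeat strings for the type I-E arrays. A minimal sketch, assuming nothing beyond the two sequences in the table, suggests they are the same repeat read from opposite strands. The function name `reverse_complement` is introduced here only for illustration and is not part of the study's pipeline.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# The two type I-E repeat strings, copied from the table.
IE_REPEAT_A = "GGATCACCCCCGCATGTGCGGGGAAAAC"
IE_REPEAT_B = "GTTTTCCCCGCACATGCGGGGGTGATCC"

SAME_REPEAT = reverse_complement(IE_REPEAT_B) == IE_REPEAT_A
print("Reverse complements of each other:", SAME_REPEAT)
```

Normalizing repeats to one strand before comparing them avoids counting a single repeat family twice.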
FIGURE 1. CRISPR-Cas systems in the L. casei group. (A) The number of CRISPR-Cas systems detected in L. casei group strains for each CRISPR-Cas type. (B) Schematic diagram of CRISPR-Cas systems in the L. casei group. (C) The length of the CRISPR repeats in each subtype. The y-axis is the length of the CRISPR repeat sequences in bases. (D) The number of CRISPR spacers in the CRISPR loci of each subtype. The y-axis is the number of CRISPR spacers.
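The quantities plotted in panels (C) and (D) can be approximated from the table itself. The sketch below is not the authors' analysis: it tallies the repeat lengths and spacer counts transcribed from the table rows by subtype. The names `REPEATS`, `SPACERS`, and `summarize` are illustrative, and the row counts refer to table rows, which may not map one-to-one onto the loci reported in the abstract.

```python
from statistics import median

# Repeat sequences per subtype, copied from the table.
REPEATS = {
    "I-C": ["ATTTCAATTCACGCAGTCACGTAGACTGCGAC"],
    "I-E": ["GGATCACCCCCGCATGTGCGGGGAAAAC", "GTTTTCCCCGCACATGCGGGGGTGATCC"],
    "II-A": ["GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC",
             "GTTTTAGAAGGATGTTAAATCAATAAGGTTAAACCC"],
}

# Spacer counts per table row, in table order.
SPACERS = {
    "I-C": [13, 25],
    "I-E": [20, 130, 65, 31, 91],
    "II-A": [21, 15, 42, 11, 16, 20, 21, 21, 11, 46, 35, 24,
             22, 32, 30, 17, 22, 52, 18, 21, 21, 48, 30, 21],
}


def summarize(repeats, spacers):
    """Summarize repeat lengths and spacer counts for each subtype."""
    summary = {}
    for subtype in sorted(spacers):
        counts = spacers[subtype]
        summary[subtype] = {
            "rows": len(counts),
            "repeat_lengths": sorted({len(r) for r in repeats[subtype]}),
            "min_spacers": min(counts),
            "median_spacers": median(counts),
            "max_spacers": max(counts),
        }
    return summary


SUMMARY = summarize(REPEATS, SPACERS)
for subtype, stats in SUMMARY.items():
    print(subtype, stats)
```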
FIGURE 2. Alignment of type I-E (A) and type II-A (B) repeat sequences by DNAMAN. Dark blue marks completely identical positions; variable nucleotide sites are marked in other colors.
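The kind of site-by-site variation highlighted in panel (B) can be illustrated with the two type II-A repeat strings from the table, which have the same length. This is a simple ungapped position-wise comparison, offered only as a sketch. It is not a reimplementation of DNAMAN, and `mismatch_positions` is a hypothetical helper name.

```python
def mismatch_positions(seq_a: str, seq_b: str) -> list[int]:
    """Return 1-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length for an ungapped comparison")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


# Type II-A repeats copied from the table: the common repeat and the Z11 variant.
COMMON_IIA = "GTCTCAGGTAGATGTCGAATCAATCAGTTCAAGAGC"
Z11_IIA = "GTTTTAGAAGGATGTTAAATCAATAAGGTTAAACCC"

VARIABLE_SITES = mismatch_positions(COMMON_IIA, Z11_IIA)
print("Variable sites:", VARIABLE_SITES)
print("Identical sites:", len(COMMON_IIA) - len(VARIABLE_SITES))
```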
FIGURE 3. Predicted DR secondary structures and minimum free energy (MFE) values for type I-C (A), type I-E (B,C), and type II-A (D,E).
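As a rough intuition for why a direct repeat can fold, the sketch below searches a repeat for its longest hairpin stem built from Watson-Crick pairs. It is a crude proxy only: it does not compute an MFE and does not reproduce whatever folding software produced the figure. The function name `longest_hairpin_stem` and the minimum loop size of 3 are assumptions made for illustration.

```python
# Watson-Crick base pairs in RNA (G-U wobble pairs are deliberately ignored).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_hairpin_stem(dna: str, min_loop: int = 3):
    """Return (stem_length, five_prime_start, three_prime_end), 0-based.

    The stem is the longest run of consecutive base pairs that closes a
    single hairpin with at least `min_loop` unpaired bases in the loop.
    """
    rna = dna.upper().replace("T", "U")
    best = (0, None, None)
    n = len(rna)
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            k = 0
            # Extend the stem inward while bases pair and the loop stays large enough.
            while (j - k) - (i + k) - 1 >= min_loop and (rna[i + k], rna[j - k]) in PAIRS:
                k += 1
            if k > best[0]:
                best = (k, i, j)
    return best


# One of the type I-E repeats from the table.
STEM = longest_hairpin_stem("GGATCACCCCCGCATGTGCGGGGAAAAC")
print("Longest stem (length, start, end):", STEM)
```

A dedicated folding tool would also score stacking energies, wobble pairs, and loop penalties, which this sketch ignores.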
FIGURE 4. Graphical representation of the type II CRISPR arrays from L. casei group strains. The repeats have been removed and only spacers are shown. Each unique spacer is shown as a square with a unique color combination, so identical spacers share the same colors; gray squares containing an “X” represent missing spacers. Strains are listed by CRISPR genotype, CRISPR array pattern, and strain name. The most recently acquired spacers are on the left and the earliest acquired spacers are on the right.
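Because new spacers are added at the left end and the oldest sit at the right end, two strains with a common origin tend to share spacers at the right end of their arrays. The sketch below counts that shared ancestral run. The spacer identifiers and strain names are hypothetical placeholders, not data from the study.

```python
def shared_ancestral_spacers(array_a, array_b) -> int:
    """Count consecutive identical spacers starting from the ancestral (right) end."""
    shared = 0
    for a, b in zip(reversed(array_a), reversed(array_b)):
        if a != b:
            break
        shared += 1
    return shared


# Hypothetical arrays, written newest (left) to oldest (right).
strain_x = ["s7", "s6", "s3", "s2", "s1"]
strain_y = ["s9", "s3", "s2", "s1"]

SHARED = shared_ancestral_spacers(strain_x, strain_y)
print("Shared ancestral spacers:", SHARED)
```

Here the two toy arrays share their three oldest spacers and then diverge, which is the pattern that groups strains into the same CRISPR genotype.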
FIGURE 5. Comparison of CRISPR type I spacers in the L. casei group. Each unique spacer sequence is shown as a unique color combination. Gray squares containing an “X” represent missing spacers. Spacers are displayed in order from the ancestral end (right) toward the most recently acquired spacers (left).
FIGURE 6. L. casei group CRISPR spacers targeting phages (A) and plasmids (B). The heatmaps show the number of CRISPR spacers that matched phages (A) and plasmids (B). The horizontal axis lists the strains that target phages (A) or plasmids (B). The vertical axis lists the phages (A) or plasmids (B) targeted by L. casei CRISPR spacers. The color scale represents the number of targeting events, with blue squares indicating no matches and red squares indicating the highest number of targeting events.
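The counts behind such a heatmap can be sketched as a strain-by-target table of perfect spacer matches. The example below uses exact substring search on both strands and is only an illustration: the study does not state its search tool here, and every sequence and name in the code is a made-up placeholder rather than real spacer, phage, or plasmid data.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_perfect_hits(spacers_by_strain, targets):
    """Map (strain, target) to the number of spacers with a perfect match on either strand."""
    counts = {}
    for strain, spacers in spacers_by_strain.items():
        for target_name, target_seq in targets.items():
            hits = sum(
                1
                for spacer in spacers
                if spacer in target_seq or reverse_complement(spacer) in target_seq
            )
            counts[(strain, target_name)] = hits
    return counts


# Hypothetical toy data, for illustration only.
TARGETS = {
    "phage_1": "AAACCGGTTTACGATCGGATTACA",
    "plasmid_1": "TTGACCTGAAGGCTTAACG",
}
SPACERS_BY_STRAIN = {
    "strain_A": ["ACGATCGGAT", "TTAAGCCTTC"],
    "strain_B": ["GGGGGGGGGG"],
}

HITS = count_perfect_hits(SPACERS_BY_STRAIN, TARGETS)
for key, value in sorted(HITS.items()):
    print(key, value)
```

Searching the reverse complement matters because a protospacer can lie on either strand of the phage or plasmid sequence.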