Harumi Mizuki, Yu Shimoyama, Taichi Ishikawa, Minoru Sasaki.
Abstract
INTRODUCTION: Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems are RNA-mediated adaptive immune systems that act against invading genetic elements such as phages or plasmids. CRISPR/Cas systems exist in nearly half of bacteria. Mycoplasma salivarium is a commensal species of the oropharynx. The American Type Culture Collection maintains five M. salivarium strains: ATCC 14277, 23064, 23557, 29803, and 33130. The genome sequence of ATCC 23064 revealed that it has an incomplete CRISPR/Cas system. However, the genome sequences of the remaining strains have not been analyzed.
Keywords: CRISPR RNA-guided endonuclease; CRISPR/Cas system; Cas9; Clustered regularly interspaced short palindromic repeats; Mycoplasma; Mycoplasma salivarium; de novo genome sequencing
Year: 2022 PMID: 34992734 PMCID: PMC8725752 DOI: 10.1080/20002297.2021.2008153
Source DB: PubMed Journal: J Oral Microbiol ISSN: 2000-2297 Impact factor: 5.474
Description of Mycoplasma salivarium strains used in this study
| Strain | Other designations | Note |
|---|---|---|
| ATCC 14277 | Buccal 1 | |
| ATCC 23064 | NCTC 10113, NBRC 14478, PG 20, H110 | type strain |
| ATCC 23557 | Manire A-2-B-3 | |
| ATCC 29803 | W | |
| ATCC 33130 | S9 | |
Sequences of the primers used for PCR amplification in this study
| Name | Sequence (5′→3′) |
|---|---|
| CRISPR FW1 | AGGTAGTTGTGTTTGATCCCACT |
| CRISPR RV1 | TTTTGCTGCATGCCCTTCAC |
| CRISPR FW2 | TGGCGAGAATCCGAAACTTA |
| CRISPR RV2 | TCGCGGTTAAATTTGCTACC |
| CRISPR FW3 | GGTAAGGTAGTTGTGTTTGATCCCAC |
| CRISPR RV3 | GCATGCCCTTCACGGTTAGA |
| CAS9 FW1 | CTTTCAAAACCACCGAAGGA |
| CAS9 RV1 | AGTTTCGGATTCTCGCCAAA |
| CAS9 RV2 | CTGCGGCTTGTATATGTTTCCC |
| CAS9 FW3 | AGTTTTGGCGGAATTTGGTA |
| CAS9 RV3 | CTTTTCACGTGCCAATTCAA |
| CAS1-CSN2 FW1 | CGCAAATTGTACCATTCAATGG |
| CAS1-CSN2 RV1 | TGCTACTCTGACATCGCCAT |
| RNC FW | TGTCCATCCACAATAACGCT |
| RNC RV | AGAGGGGATTGCAACTAAACA |
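As a quick sanity check on the primers listed above, the following sketch reports length, GC content, and a rough melting temperature for a few of them. The Wallace rule (2 °C per A/T, 4 °C per G/C) is an assumption introduced here for illustration only; the source does not state how the primers were designed or evaluated, and the function names are hypothetical.

```python
# Subset of primer sequences copied from the table above.
PRIMERS = {
    "CRISPR FW1": "AGGTAGTTGTGTTTGATCCCACT",
    "CRISPR RV1": "TTTTGCTGCATGCCCTTCAC",
    "CAS9 FW1": "CTTTCAAAACCACCGAAGGA",
    "RNC FW": "TGTCCATCCACAATAACGCT",
}


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C (Wallace rule, an assumption here)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_content(seq):.1f}%, Tm ~{wallace_tm(seq)} C")
```

The Wallace rule is only a coarse estimate for short oligonucleotides; a nearest-neighbor model would be the usual choice for actual primer design.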
Figure 1. PCR amplification of the CRISPR/Cas system within the genomic sequences of Mycoplasma salivarium strains. For ATCC 23064 and ATCC 29803, the genomic sequences were amplified using the primer pairs shown at both ends of the bars. For ATCC 14277, 23557, and 33130, the genomic sequences could not be amplified with any primer pair. Red bars indicate amplified PCR products that were sequenced via primer walking or capillary sequencing. Blue bars indicate PCR amplification alone, without sequencing. The black dash indicates the disruption in the gene sequence. PCR, polymerase chain reaction; ATCC, American Type Culture Collection; CRISPR, clustered regularly interspaced short palindromic repeats; Cas, CRISPR-associated.
Summary of the de novo genome assembly for Mycoplasma salivarium strains ATCC 14277, 23557 and 33130
| Element | ATCC 14277 | ATCC 23557 | ATCC 33130 | ATCC 23064 (NCTC 10113) * |
|---|---|---|---|---|
| Total sequence length (bp) | 718,941 | 718,986 | 736,914 | 728,347 |
| Number of sequences | 7 | 6 | 9 | 1 |
| Longest sequence (bp) | 551,285 | 702,908 | 503,089 | |
| N50 (bp) | 551,285 | 702,908 | 503,089 | |
| Gap ratio (%) | 0.0 | 0.0 | 0.0 | |
| GC content (%) | 26.5 | 26.5 | 26.5 | |
| Number of CDSs | 610 | 618 | 627 | 614 |
| Average protein length | 142.3 | 142.7 | 140.6 | |
| Coding ratio (%) | 91.7 | 90.6 | 90.7 | |
| Number of rRNAs | 3 | 3 | 3 | 3 |
| Number of tRNAs | 33 | 33 | 33 | 33 |
| Number of CRISPRs | 0 | 0 | 0 | 1 |
*Data from NCBI Reference Sequence: NZ_LR214938.2
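In all three de novo assemblies the N50 equals the longest sequence, which is expected whenever a single contig covers more than half of the total length. The sketch below shows one common way to compute N50. The total length and longest contig are taken from the ATCC 14277 column; the two smaller contig lengths are hypothetical placeholders chosen only so that the lengths sum to the reported total (the real assembly has seven sequences whose individual lengths are not given here).

```python
def n50(lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downward, first reaches half of the assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths supplied")


# 551,285 bp (longest) and 718,941 bp (total) come from the table;
# the other two values are hypothetical and only make the sum match.
example_lengths = [551_285, 100_000, 67_656]

print(sum(example_lengths))  # total assembly length
print(n50(example_lengths))  # N50
```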
Figure 3. Comparison between the whole genomes of M. salivarium ATCC 14277, ATCC 23557, ATCC 33130, and ATCC 23064 (NCTC 10113) strains, visualized using Easyfig. Vertical blocks between the sequences indicate regions of shared similarity based on BLAST analysis (red for matches in the same direction and blue for inverted matches). The comparison between the whole genome sequence of ATCC 23064 and the sequences of the CRISPR/Cas system in ATCC 29803 is shown as a yellow block. ATCC, American Type Culture Collection; BLAST, basic local alignment search tool; CRISPR, clustered regularly interspaced short palindromic repeats; Cas, CRISPR-associated.
Direct repeat consensus sequence and spacer sequence in ATCC 23064
| No. | Direct repeat consensus sequence (36 bp) | Spacer sequence (30 bp) |
|---|---|---|
| 1 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TTTCTTCTCCTGCTCCTGTTGGTTTTGCTC |
| 2 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TCATTTAATATAAAAAAAACAACAAGGAAA † |
| 3 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TCATTTAATATAAAAAAAACAACAAGGAAA † |
| 4 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TTACACAAGATATGATTAACAACCCAACAA |
| 5 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TTTATAATTACATCACATTCTTGACATATA |
| 6 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TTGACGCAAAAATTTATGGTAATATTCCAG |
| 7 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | GAAGACGTTTTAATATATTCTAAATATTCA ‡ |
| 8 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TAATTTTGTTGATATTCAATTTAATTTGAT |
| 9 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | GAAAAAAGGTAGAGTTAGCAGGACTAACAA |
| 10 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | CTCTCTAAAGAAAATGAATATTTGAGAAGC |
| 11 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | GCTGAACGTATCATTAGAAAACGTGCAAAA |
| 12 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | AACACAAGAAAACAACAAAGAATTACAGCT |
| 13 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | ATTATTGCTTTATTGATTGATATGAAGTAC § |
| 14 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TCATTAAAGCAACTTAATAGTTGTGATAAC ¶ |
| 15 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TTTAATATCTAACTAAGAAAAAGCGAGCAC |
| 16 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | GTAAACTAATTCTTATAATTTTCCTTTAAG †† |
| 17 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TAGAATAAGTATTATCTCAATCATTGTAAT ‡‡ |
| 18 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | GAAGACGTTTTAATATATTCTAAATATTCA ‡ |
| 19 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | ATTATTGCTTTATTGATTGATATGAAGTAC § |
| 20 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TCATTAAAGCAACTTAATAGTTGTGATAAC ¶ |
| 21 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TTGCAGCATTAACATTAACCATTGATGCTA |
| 22 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TAGAATAAGTATTATCTCAATCATTGTAAT ‡‡ |
| 23 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | GAAGACGTTTTAATATATTCTAAATATTCA ‡ |
| 24 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | ATTATTGCTTTATTGATTGATATGAAGTAC § |
| 25 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TCATTAAAGCAACTTAATAGTTGTGATAAC ¶ |
| 26 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TACTATAAAATTACCATCTCAACTTAAATT |
| 27 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | GTAAACTAATTCTTATAATTTTCCTTTAAG †† |
| 28 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | |
Spacer sequences marked with the same symbol are identical to one another: † (No. 2, 3), ‡ (7, 18, 23), § (13, 19, 24), ¶ (14, 20, 25), †† (16, 27), and ‡‡ (17, 22).
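The groups of identical spacers listed in the footnote can be recovered directly from the table. The following is a minimal sketch, with a hypothetical helper name, that indexes the 27 spacers of ATCC 23064 by sequence and reports which spacer numbers share a sequence.

```python
from collections import defaultdict

# Spacers 1-27 of ATCC 23064, copied in order from the table above.
SPACERS = [
    "TTTCTTCTCCTGCTCCTGTTGGTTTTGCTC",
    "TCATTTAATATAAAAAAAACAACAAGGAAA",
    "TCATTTAATATAAAAAAAACAACAAGGAAA",
    "TTACACAAGATATGATTAACAACCCAACAA",
    "TTTATAATTACATCACATTCTTGACATATA",
    "TTGACGCAAAAATTTATGGTAATATTCCAG",
    "GAAGACGTTTTAATATATTCTAAATATTCA",
    "TAATTTTGTTGATATTCAATTTAATTTGAT",
    "GAAAAAAGGTAGAGTTAGCAGGACTAACAA",
    "CTCTCTAAAGAAAATGAATATTTGAGAAGC",
    "GCTGAACGTATCATTAGAAAACGTGCAAAA",
    "AACACAAGAAAACAACAAAGAATTACAGCT",
    "ATTATTGCTTTATTGATTGATATGAAGTAC",
    "TCATTAAAGCAACTTAATAGTTGTGATAAC",
    "TTTAATATCTAACTAAGAAAAAGCGAGCAC",
    "GTAAACTAATTCTTATAATTTTCCTTTAAG",
    "TAGAATAAGTATTATCTCAATCATTGTAAT",
    "GAAGACGTTTTAATATATTCTAAATATTCA",
    "ATTATTGCTTTATTGATTGATATGAAGTAC",
    "TCATTAAAGCAACTTAATAGTTGTGATAAC",
    "TTGCAGCATTAACATTAACCATTGATGCTA",
    "TAGAATAAGTATTATCTCAATCATTGTAAT",
    "GAAGACGTTTTAATATATTCTAAATATTCA",
    "ATTATTGCTTTATTGATTGATATGAAGTAC",
    "TCATTAAAGCAACTTAATAGTTGTGATAAC",
    "TACTATAAAATTACCATCTCAACTTAAATT",
    "GTAAACTAATTCTTATAATTTTCCTTTAAG",
]


def repeated_spacers(spacers):
    """Return lists of 1-based spacer numbers that share an identical sequence."""
    positions = defaultdict(list)
    for number, seq in enumerate(spacers, start=1):
        positions[seq].append(number)
    return sorted(nums for nums in positions.values() if len(nums) > 1)


for group in repeated_spacers(SPACERS):
    print(group)
```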
Direct repeat consensus sequence and spacer sequence in ATCC 29803
| No. | Direct repeat consensus sequence (36 bp) | Spacer sequence (30 bp) |
|---|---|---|
| 1 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | AGTATAGTGGACGTTAATGCAAACCAAAAA |
| 2 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | AATTCATAGATGGTTGAACGTATAAAAAAG |
| 3 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | ATTATTATTGGTCATTTTCACGAAATAGAA |
| 4 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TCGGTCCCAGAAACTTGAATAGACAATTAA † |
| 5 | GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC | TCGGTCCCAGAAACTTGAATAGACAATTAA † |
| 6 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | GATAATCAAGCAAAAGATTAAGACAATTAC |
| 7 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | ATCCAACAATTATAAATATAACATCACCAG |
| 8 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | CAGACCATGCAGTTTCATTATTGTTTGGAC |
| 9 | GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC | CACCTTTAGGCTATGCACAAGGCTTTAAAA |
| 10 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TAACAGTAATTTCAATTATATATGATCTTT |
| 11 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | ATCTGGAGTACAAAAGATAGCATTAATTTA |
| 12 | GTTTTAGCGCTGTACAATATTTGAGTAAGTTATAAC | AAATAAGACTAGAAGAAAGAGAACAAGAGA |
| 13 | GTTTTAGCGCTGTACAATATTTGAGTAAGTTATAAC | TATTAGAACTACAAAAACTAAAAGAACACA |
| 14 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TTTCTAATTATTTTGCATCCAACTTTACAG |
| 15 | GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC | TAACATTCATTACACTATTAGATAACTCAA |
| 16 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | ATTTTGTTCATTATTTAAGATATTTAGATT |
| 17 | GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC | TCTAAAGCCTGTTTTTATATAAACTTACCT |
| 18 | GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC | TGTTATTTAGTCATTTTCTATTTGTATATT |
| 19 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | CATTTACTGGTTTATTGCCTTGTTTAACTA |
| 20 | GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC | ATTTAATAAAAAATACTTATATTGCGAATA |
| 21 | GTTTTAGCACTGTACAATATTTGAGTAAGCTATAAC | AACATTAACCAAATATATATGCAAATACTA ‡ |
| 22 | GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC | TGTAATTGTAGTTATGTTGTCTTCTCATAA § |
| 23 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | AACATTAACCAAATATATATGCAAATACTA ‡ |
| 24 | GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC | TGTAATTGTAGTTATGTTGTCTTCTCATAA § |
| 25 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TTAAAGAATATAAAACGCAAATTCCTAGTT |
| 26 | GTTTTAGCGCTGTACAATATTTGAAAC | ACAAGCATAAACAAGAAGTTTTAGAAGTTG |
| 27 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | TAGCAAAAGCAATTAAAAAACTAAATATTA |
| 28 | GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC | |
Red letters in the direct repeat sequences indicate mismatches with the direct repeat consensus sequence ‘GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC’. Spacer sequences marked with the same symbol are identical to one another: † (No. 4, 5), ‡ (21, 23), and § (22, 24).
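Because the red highlighting is not visible in plain text, the mismatching positions can be located by comparing each repeat with the consensus base by base. The sketch below does this for three repeats of ATCC 29803 (No. 5, 12, and 21) copied from the table; the function name is hypothetical, and repeats of a different length (such as No. 26) would need an alignment step that is not shown here.

```python
CONSENSUS = "GTTTTAGCGCTGTACAATATTTGAGTAAGCTATAAC"

# Three variant direct repeats copied from the ATCC 29803 table.
REPEATS = {
    5: "GTTTTAGTGCTGTACAATATTTGAGTAAGCTATAAC",
    12: "GTTTTAGCGCTGTACAATATTTGAGTAAGTTATAAC",
    21: "GTTTTAGCACTGTACAATATTTGAGTAAGCTATAAC",
}


def repeat_mismatches(consensus, repeat):
    """Return (1-based position, consensus base, repeat base) for each mismatch."""
    if len(consensus) != len(repeat):
        raise ValueError("sequences must have equal length for a direct comparison")
    return [
        (pos, c, r)
        for pos, (c, r) in enumerate(zip(consensus, repeat), start=1)
        if c != r
    ]


for number, repeat in REPEATS.items():
    print(number, repeat_mismatches(CONSENSUS, repeat))
```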
Potential protospacer sequences matched with the spacer sequence in ATCC 29803
| Spacer No. | Spacer sequence | Protospacer match (annotation) | 5′ flanking sequence | Potential protospacer sequence | 3′ flanking sequence |
|---|---|---|---|---|---|
| 9 | CACCTTTAGGCTATGCACAAGGCTTTAAAA | | ATATGTGAAT | CTTTAAAACCTTGTGCATAGCCTAAA | AGGGATGAGA |
| | | | TCTCATCCCT | CACCTTTAGGCTATGCACAAGGCTTTAAAA | ATTCACATAT |
| 12 | AAATAAGACTAGAAGAAAGAGAACAAGAGA | | CATTTGCCCA | AAATAAGATTAGAAGAAAGA | TGTTTGATAA |
| | | | TTATCAAACA | TCTCTTGTTCTCTTTCTTTTAGTCTTATTT | TGGGCAAATG |
| | | | TTATCAAACA | TCTCTTGTTCTCTTTCTTTTAGTCTTATTT | TGGGCAAATG |
| | | | CATTTGCCCA | AAATAAGATTAGAAGAAAGA | TGTTTGATAA |
Red letters in the potential protospacer sequence indicate the mismatches with the spacer sequences.
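Several protospacer rows in the table above appear to be given on the strand opposite to the spacer, so comparing them requires a reverse complement. The following sketch, with hypothetical helper names, checks one spacer 12 entry: it reverse-complements the listed protospacer and reports the 1-based positions at which it differs from the spacer.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mismatch_positions(a, b):
    """Return 1-based positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# Spacer 12 of ATCC 29803 and one of its listed protospacer matches.
spacer_12 = "AAATAAGACTAGAAGAAAGAGAACAAGAGA"
protospacer = "TCTCTTGTTCTCTTTCTTTTAGTCTTATTT"

same_strand = reverse_complement(protospacer)
print(same_strand)
print(mismatch_positions(spacer_12, same_strand))

# The flanking sequences of the paired rows are reverse complements too.
print(reverse_complement("TGTTTGATAA"))
```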
Figure 4. Prediction of the tracrRNA sequence of M. salivarium ATCC 29803. (a) The tracrRNA-coding sequence contained a 36 bp stretch with 88.9% sequence identity to a DR sequence (red letters). The tracrRNA-coding sequence lies within a region comprising this homologous part and the part on its 5′ side. (b) Secondary structure of a DR/tracrRNA hybrid, simulated by concatenating the RNA sequences of a DR and the tracrRNA and predicted using mfold. The predicted stem formed by a DR and the tracrRNA anti-repeat includes a lower stem (L), a bulge (B), and an upper stem (U). N: nexus stem-loop; T: terminator. The tracrRNA sequence was taken to terminate with a poly-uridine tract (UUU), using the secondary structure of the crRNA/tracrRNA hybrid of Mycoplasma gallisepticum S6 as a reference. (c) Simulation of the binding of a predicted crRNA to the tracrRNA. The repeat-derived sequence at the 3′ end of the crRNA was postulated to be 22 nucleotides long, based on a reference sequence of Streptococcus pyogenes crRNA. ATCC, American Type Culture Collection; crRNA, clustered regularly interspaced short palindromic repeats (CRISPR) RNA; tracrRNA, transactivating crRNA; DR, direct repeat.
Figure 5. Schematic representation of the arrangement of CRISPR/Cas system components in two M. salivarium strains, ATCC 29803 and ATCC 23064 (NCTC 10113). The CRISPR array components are indicated as follows: red rhombus, spacer; green rectangle, direct repeat; pink rectangle, leader sequence. The black flash indicates a disruption in the gene sequence. ATCC, American Type Culture Collection; CRISPR, clustered regularly interspaced short palindromic repeats; Cas, CRISPR-associated.
Figure 2. Agarose gel electrophoresis analysis of amplified PCR products. (a) Following amplification, the PCR products of the CRISPR sequences of five M. salivarium strains were analyzed via agarose gel electrophoresis: ATCC 14277 (lane 1), ATCC 23064 (lane 2), ATCC 23557 (lane 3), ATCC 29803 (lane 4), and ATCC 33130 (lane 5). MK: 1 kbp DNA marker. (b) PCR products generated by amplifying the cas1, cas2, and csn2 gene sequences of ATCC 14277 (lane 1), ATCC 23064 (lane 2), ATCC 23557 (lane 3), ATCC 29803 (lane 4), and ATCC 33130 (lane 5). MK: 1 kbp DNA marker. (c) PCR amplification of the cas9 gene regions of five strains using the primer pair CAS9 FW1 and CAS9 RV1 (lanes 1–5) or the primer pair CAS9 FW1 and CAS9 RV2 (lanes 6–10). The template DNA used for PCR was genomic DNA from ATCC 14277 (lanes 1 and 6), ATCC 23064 (lanes 2 and 7), ATCC 23557 (lanes 3 and 8), ATCC 29803 (lanes 4 and 9), and ATCC 33130 (lanes 5 and 10). MK: 1 kbp DNA marker. (d) PCR products generated by amplifying the rnc gene sequences of five strains: ATCC 14277 (lane 1), ATCC 23064 (lane 2), ATCC 23557 (lane 3), ATCC 29803 (lane 4), and ATCC 33130 (lane 5). MK: 100 bp DNA marker. PCR, polymerase chain reaction; ATCC, American Type Culture Collection; CRISPR, clustered regularly interspaced short palindromic repeats.