
Study the Features of 57 Confirmed CRISPR Loci in 38 Strains of Staphylococcus aureus.

Xihong Zhao1, Zhixue Yu1, Zhenbo Xu2.   

Abstract

Staphylococcus aureus is a foodborne pathogen that causes food contamination and food poisoning and seriously harms human health, agriculture and other hosts. Clustered regularly interspaced short palindromic repeats (CRISPR) are a recently discovered bacterial immune system that resists foreign genetic material such as phage DNA. This system inhibits the transfer of mobile genetic elements that match the CRISPR spacer sequences, thereby limiting the spread of drug-resistance genes between pathogens. In this study, 57 confirmed CRISPR loci were screened from 38 strains of S. aureus in the CRISPR database, and bioinformatics tools were used to investigate their structural features and potential functions. Most strains contained only one CRISPR locus; a few contained multiple, sparsely distributed loci. The loci consisted mainly of highly conserved direct repeat sequences and highly variable spacer sequences, together with polymorphic cas genes. Secondary structure analysis of the direct repeat RNA showed that all repeats can form a stable RNA secondary structure. A phylogenetic tree built from the spacer sequences showed that some strains were closely related, whereas others had clearly diverged during evolution. cas genes were found near only 4 of the 57 CRISPR loci identified.


Keywords:  CRISPR; Staphylococcus aureus; cas; direct repeat; food safety; spacer

Year:  2018        PMID: 30093886      PMCID: PMC6070637          DOI: 10.3389/fmicb.2018.01591

Source DB:  PubMed          Journal:  Front Microbiol        ISSN: 1664-302X            Impact factor:   5.640


Introduction

Vertebrates have long been known to resist and eliminate foreign pathogens and to form a highly effective secondary immune response that prevents re-invasion. Only with the discovery of clustered regularly interspaced short palindromic repeats (CRISPR) did researchers realize that prokaryotes also have an adaptive immune system analogous to that of animals (Morange, 2015). CRISPR evolved during the long struggle between bacteria and viruses as a powerful defense that removes invading viral genes. It works in three steps. First, a characteristic fragment (the proto-spacer) is taken from the invading foreign DNA and inserted into the CRISPR locus as a spacer. Second, when the phage invades again, the system uses the spacer to rapidly recognize and target the foreign DNA. Finally, with the participation of a Cas protein complex, the recognized foreign DNA is cleaved and the invasion is eliminated. This immune system, which is both acquired and heritable, is found in about 47% of bacteria and 84% of archaea, and it records both the evolutionary path of bacteria and their history of confrontation with foreign invaders such as phages (Lillestøl et al., 2006; Wakefield et al., 2015; Yang et al., 2015).

Staphylococcus aureus, an important human foodborne pathogen, is widely found in nature: in air, water, dust, and human and animal excrement (Zhao et al., 2016). It is the most common pathogen in human purulent infection; in addition to local purulent infection, it can cause pneumonia, pseudomembranous colitis, pericarditis, and even systemic infections such as sepsis (Larkin et al., 2009; Lindsay, 2014; Pérez-Montarelo et al., 2017). The pathogenicity of S. aureus mainly depends on the toxins and invasive enzymes it produces (Zhao X. H. et al., 2017). One of the most important toxins is enterotoxin, a protein toxin that can cause acute gastroenteritis. Enterotoxin withstands boiling at 100°C for 30 min without being destroyed, and the food poisoning it causes is characterized by vomiting and diarrhea (Zhang and Zhao, 2017; Zhang et al., 2017; Zhao Y. et al., 2017). S. aureus can currently be divided into five groups and 26 types by phage typing (Dua et al., 1982), most of which are highly adaptable to their environment. It is tempting to speculate that CRISPR, as a prokaryotic immune system, is closely related to this adaptability (Schröder et al., 2013).

A typical CRISPR-Cas system consists of a CRISPR locus, a series of cas genes, and a leader region. The CRISPR locus, one of the major components, is a series of tandem repeats separated by unique spacer sequences (Horvath and Barrangou, 2010; Koonin and Makarova, 2013). The spacers are highly specific, whereas the repeats within one CRISPR locus are almost identical (Guzina et al., 2017; Rossi et al., 2017). The direct repeat contains a palindrome that can form an RNA hairpin. Further study showed that this hairpin is the recognition and binding site of Cas proteins during interference (Wang et al., 2011). Spacer sequences were shown to be derived from previously encountered phages (Bolotin et al., 2005), and a small proportion come from the same bacterial genome, suggesting horizontal gene transfer between homologous species in the CRISPR system (Grissa et al., 2008; Rossi et al., 2017). In addition to the CRISPR locus, the CRISPR-Cas system includes a series of Cas proteins with multiple nuclease activities and a leader region with promoter function. CRISPR-Cas systems are generally divided into two classes based on their Cas proteins. Class 1 systems act through a multi-subunit Cas protein complex, whereas Class 2 systems require only a single Cas protein (Cas9 or Cpf1) in the crRNA effector complex. Class 1 includes Type I, Type III, and Type IV systems, and Class 2 includes Type II and Type V systems (Quan and Ye, 2017), which illustrates the complexity of the CRISPR system.

Traditional pathogen detection methods include PCR, bacterial identification media, and the latex agglutination test, and traditional bacterial typing methods include serotyping, phage typing, and drug resistance profile typing (Sabat et al., 2013; Zhong and Zhao, 2017, 2018; Wei et al., 2018; Zhao et al., 2018). These methods struggle to meet current requirements for accurate diagnosis, traceable typing, and epidemiological study of pathogens. For instance, foodborne pathogens that have entered the viable but non-culturable (VBNC) state cannot be detected by routine culture assays and therefore go undetected, posing a serious threat to food safety and human health (Zhao et al., 2014; Ding et al., 2017; Liu et al., 2017; Zhao X. et al., 2017). Identifying pathogens more accurately and rapidly at the genetic level is therefore a new direction for CRISPR systems in microbiology. Researchers have proposed applying the diversity and specificity of CRISPR spacers to genotyping, and such methods are well established in some bacteria. For instance, Shariat et al. (2013) proposed CRISPR-MVLST, a typing method that combines CRISPR with multi-virulence-locus sequence typing of bacterial marker genes. This method not only agreed closely with epidemiological data but also had higher resolution than pulsed-field gel electrophoresis, distinguishing highly clonal strains during pathogen outbreaks.

CRISPR research is also prominent as the basis of faster and simpler gene editing tools (Doudna and Charpentier, 2014; Zhang et al., 2014), which can be used in biotechnology and medicine and have great potential in gene and cell therapy (Maeder and Gersbach, 2016). CRISPR-Cas9, the third generation of gene editing technology, is used for genetic engineering of various organisms such as eukaryotes, bacteria and viruses (Penewit et al., 2018). As CRISPR technology continues to improve, replacing Cas9 with Cpf1 or xCas9 may provide more opportunities for these applications (Hu et al., 2018).

Given the diversity of CRISPR-Cas systems in different prokaryotes (Koonin et al., 2017; Wexler and Tajkarimi, 2017), researchers have studied these loci in various bacteria in recent years. Bioinformatics tools now allow researchers to expand their knowledge without laborious laboratory experiments, and CRISPR systems in several bacteria have been studied in this way (Hidalgo-Cantabrana et al., 2017; Negahdaripour et al., 2017; Tomida et al., 2017; Hao et al., 2018). Bioinformatics analysis of CRISPR in foodborne pathogens is crucial for assessing their potential evolution and predicting outbreaks, which is of great importance to food safety. Deciphering the CRISPR loci identified so far is significant for investigating the presence of CRISPR in different strains and the range of possible immune defenses, for exploring the effects of different CRISPR-Cas systems on the pathogenicity of S. aureus, for detecting foodborne pathogenic bacteria accurately and quickly, and for preventing and eliminating the foodborne infections caused by S. aureus. Systematic research on CRISPR structure therefore helps to explore functions of the CRISPR system beyond bacteriophage immunity.

In this study, 57 identified CRISPR loci from 38 strains of S. aureus were selected as subjects. Their structural characteristics and potential functional activity were analyzed with bioinformatics tools. The 57 loci were classified according to the phylogenetic similarity of their spacer and direct repeat sequences, and the spacers were checked for a phage or exogenous plasmid origin. The RNA secondary structures formed by the direct repeat sequences, and their stability, were predicted. Finally, the presence and distribution of cas genes near the CRISPR loci were analyzed. The aim is to provide a new strategy for resistance studies, genotyping, traceability analysis, and the prevention and control of the foodborne pathogen S. aureus in food safety.

Materials and methods

Data sources

The genomes of different S. aureus strains were retrieved from the National Center for Biotechnology Information (NCBI) nucleotide database (http://www.ncbi.nlm.nih.gov/) with default parameters, and their CRISPR loci were then searched with the CRISPR Finder server (E-value ≤ 0.001) (http://crispr.i2bc.paris-saclay.fr/Server/; last updated on May 9, 2017). The CRISPR loci found in the 38 strains of S. aureus were classified as either suspected or identified CRISPR loci. All 57 identified CRISPR loci were selected for study.

Analysis method

CRISPR loci were grouped according to the similarity of their consensus direct repeat (CDR) sequences. First, MEGA6.06 (https://www.megasoftware.net/) was used for multiple sequence alignment (MSA) of the direct repeats, and similar repeats were clustered into the same group. The RNA secondary structure and minimum free energy (MFE) of each direct repeat were then predicted with the RNAfold web server (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi), with the folding algorithm and output options left at their defaults. These predictions are based on the loop-based energy model and the dynamic programming algorithm (Zuker and Stiegler, 1981). Because a CRISPR locus may contain numerous spacers, principal component analysis (PCA) was used to select representative spacers from each locus. The selected spacers were imported into MEGA6.06 to construct a phylogenetic tree, from which the genotyping and phylogenetic relationships of the 38 strains of S. aureus were predicted. Highly homologous sequences were searched with NCBI BLASTN (default parameters, nr database, mismatches ≤ 3). Finally, cas genes near each CRISPR locus were searched in the CRISPRs database (http://crispr.i2bc.paris-saclay.fr/crispr/), and the aligned protein sequences were searched with BLASTP at NCBI (identity ≥ 90%, coverage > 90%). These results were used to describe the CRISPR system types and the distribution of cas genes at the identified CRISPR loci of the selected strains.
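The "mismatches ≤ 3" criterion applied to the BLASTN hits can be illustrated with a small sketch. This is not the BLASTN algorithm and not the authors' pipeline: it is an ungapped sliding-window comparison, and the function names `reverse_complement`, `count_mismatches` and `best_hit` are hypothetical names introduced here for illustration only.

```python
# Illustrative sketch only: the study used NCBI BLASTN, not this code.
# It shows what "at most 3 mismatches" means for an ungapped spacer match.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def best_hit(spacer, target, max_mismatches=3):
    """Slide the spacer along the target in both orientations.

    Returns (position, strand, mismatches) for the window with the fewest
    mismatches, or None if no window has at most max_mismatches.
    """
    spacer, target = spacer.upper(), target.upper()
    best = None
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for pos in range(len(target) - len(query) + 1):
            mm = count_mismatches(query, target[pos:pos + len(query)])
            if mm <= max_mismatches and (best is None or mm < best[2]):
                best = (pos, strand, mm)
    return best


if __name__ == "__main__":
    # Spacer NC_018608_1-N from Table 3; the target is a made-up sequence.
    spacer = "TCATCTTTCATGTCACTGATTAATTCATTTGTA"
    target = "GGG" + spacer + "CCC"
    print(best_hit(spacer, target))
```

A real search against the nr database needs an indexed, heuristic tool such as BLASTN; the exhaustive scan above is only practical for short, hand-picked targets.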

Results

CRISPR locus of Staphylococcus aureus in CRISPR database

The CRISPR database is a relational database implemented in MySQL 4.1 (http://www.mysql.com/). Its web services are based on Perl 5.8.8 (http://www.perl.org/) and the Linux operating system (Debian Sarge 3.1), run on the Apache 2.0 web server (http://www.apache.org/), and use additional modules to process sequences. The core application consists of two main programs: (a) CRISPR Finder, which detects CRISPRs and extracts them from the genome sequence, and (b) database tools that download prokaryotic genomes from the NCBI ftp site (ftp://ftp.ncbi.nih.gov/genomes/Bacteria), save the CRISPRs, and update them (Grissa et al., 2007a). At the time of this study, the CRISPR database had analyzed the genomes of 232 archaea with CRISPR Finder and obtained 870 CRISPR loci, 202 of which were convincing. In addition, 8,690 CRISPR loci had been obtained from the genomes of 6,786 bacterial species, of which 3,059 loci were identified. The database was searched for the CRISPR loci of 38 S. aureus strains. Among them, 57 CRISPR loci were confirmed and the others were suspected CRISPR loci (last updated: 9 May, 2017); the components of the 57 confirmed loci were then analyzed. As shown in Table 1, 22 strains of S. aureus contained only one CRISPR locus, 14 strains contained 2 CRISPR loci, and the other 2 strains contained 3 and 4 CRISPR loci. This number of loci per strain is low compared with other species, probably because the S. aureus CRISPR system has not been studied extensively enough, and some genuine CRISPR loci may be hidden among the large number of suspected CRISPR sequences. Each CRISPR locus contained between 1 and 15 spacers, mostly 25–31 bp long, and between 2 and 16 direct repeats, mostly 23–26 bp long.
Table 1

Statistics of the 57 CRISPR loci in 38 strains of Staphylococcus aureus.

Strain | Genbank ID | Source | CRISPR ID | Number of CRISPR | Number of spacers | CRISPR length | DR length | Spacer length
S. aureus 04_02981 | 387149188 | Nübel et al., 2010 | NC_017340_5, 10 | 2 | 1, 1 | 78, 80 | 23, 25 | 33, 31
S. aureus 08BA02176 | 404477334 | Golding et al., 2012 | NC_018608_1, 2 | 2 | 15, 2 | 1107, 183 | 36, 38 | 36, 34, 34, 36, 37, 34, 37, 35, 38, 37, 35, 36, 35, 33, 35, 34, 36
S. aureus GCF_000597965 | 749295051 | Parker et al., 2014 | NZ_CP007454_4, 5 | 2 | 1, 1 | 78, 80 | 33, 25 | 33, 31
S. aureus GCF_000695875 | 749295046 | Sabirova et al., 2014 | NZ_CP007690_7 | 1 | 1 | 78 | 23 | 33
S. aureus GCF_000815045 | 749198600 | Daum et al., 2015 | NZ_CP010295_6 | 1 | 1 | 78 | 23 | 33
S. aureus GCF_000815085 | 749203622 | Daum et al., 2015 | NZ_CP010296_6 | 1 | 1 | 78 | 23 | 33
S. aureus GCF_000815165 | 749193063 | Daum et al., 2015 | NZ_CP010298_6 | 1 | 1 | 78 | 23 | 33
S. aureus GCF_000969225 | 806462661 | Mcculloch et al., 2015 | NZ_CP011147_4, 10 | 2 | 1, 1 | 78, 80 | 23, 25 | 33, 31
S. aureus GCF_001021895 | 829615601 | Tenover and Goering, 2009 | NZ_CP007674_3, 8 | 2 | 1, 1 | 80, 78 | 26, 23 | 29, 33
S. aureus GCF_001045795 | 983310191 | Planet et al., 2015 | NZ_CP007672_5 | 1 | 1 | 78 | 23 | 33
S. aureus GCF_001278745 | 983466147 | Panesso et al., 2015 | NZ_CP012593_6, 12 | 2 | 1, 1 | 78, 80 | 23, 25 | 33, 31
S. aureus GCF_001281145 | 927544131 | Giraud et al., 2015 | NZ_CP010890_5, 10 | 2 | 1, 1 | 78, 80 | 23, 25 | 33, 31
S. aureus GCF_001457495 | 983361205 | Holmes et al., 2016 | NZ_LN831036_4 | 1 | 1 | 84 | 26 | 33
S. aureus GCF_001465755 | 983420869 | Bosch et al., 2016 | NZ_CP013621_1 | 1 | 3 | 199 | 31 | 25, 25, 26
S. aureus GCF_001558795 | 1344139377 | Giannuzzi et al., 1999 | NZ_CP014064_1, 4 | 2 | 1, 1 | 78, 80 | 23, 25 | 33, 31
S. aureus GCF_001594205 | 1008818213 | Aswani et al., 2016 | NZ_CP014791_5 | 1 | 2 | 133 | 23 | 32, 33
S. aureus GCF_001611345 | 1016065235 | Trouilletassant et al., 2016 | NZ_CP012978_1, 3, 4, 6 | 4 | 1, 2, 4, 1 | 80, 136, 250, 82 | 26, 24, 27, 26 | 29, 33, 32, 29, 29, 29, 29, 31
S. aureus GCF_001611385 | 1016064704 | Trouilletassant et al., 2016 | NZ_CP012974_3, 4, 9 | 3 | 1, 1, 2 | 78, 81, 136 | 23, 26, 25 | 33, 30, 31, 31
S. aureus GCF_001611425 | 1016068196 | Trouilletassant et al., 2016 | NZ_CP012970_1, 4 | 2 | 1, 1 | 80, 82 | 26, 26 | 29, 31
S. aureus GCF_001641025 | 1027722058 | Tatusova et al., 2013 | NZ_CP013957_4 | 1 | 1 | 80 | 26 | 29
S. aureus GCF_001725965 | 1065087359 | Lim et al., 2015 | NZ_CP012692_5, 12 | 2 | 1, 1 | 79, 80 | 24, 25 | 32, 31
S. aureus GCF_900092595 | 1045302382 | Maël et al., 2016 | NZ_LT598688_6 | 1 | 1 | 78 | 23 | 33
S. aureus subsp. aureus 11819-97 | 385780298 | Stegger et al., 2012 | NC_017351_4 | 1 | 1 | 84 | 26 | 33
S. aureus subsp. aureus 71193 | 386727822 | Uhlemann et al., 2012 | NC_017673_1 | 1 | 3 | 199 | 31 | 25, 25, 26
S. aureus subsp. aureus ACT-R 2 | 384863396 | Lindqvist et al., 2012 | NC_017343_3, 8 | 2 | 1, 1 | 78, 80 | 23, 25 | 33, 31
S. aureus subsp. aureus ED133 | 384546269 | Guinane et al., 2010 | NC_017337_9 | 1 | 1 | 78 | 24 | 31
S. aureus subsp. aureus GCF_000772025 | 755010342 | Lim et al., 2015 | NZ_CP009554_7 | 1 | 1 | 80 | 25 | 31
S. aureus subsp. aureus GCF_001296985 | 930161532 | Sabat et al., 2015 | NZ_CP010402_3 | 1 | 1 | 84 | 26 | 33
S. aureus subsp. aureus GCF_001515665 | 975875548 | Botelho et al., 2016b | NZ_CP012015_4 | 1 | 1 | 80 | 26 | 29
S. aureus subsp. aureus GCF_001515705 | 975875579 | Botelho et al., 2016a | NZ_CP012018_4 | 1 | 1 | 80 | 26 | 29
S. aureus subsp. aureus GCF_001515765 | 975883094 | Costa et al., 2013 | NZ_CP012012_3 | 1 | 1 | 80 | 26 | 29
S. aureus subsp. aureus HO 5096 0412 | 386829725 | Holden et al., 2013 | NC_017763_1 | 1 | 2 | 137 | 27 | 28, 29
S. aureus subsp. aureus JH1 | 150392480 | Kim et al., 2014 | NC_009632_4, 9 | 2 | 1, 1 | 78, 80 | 23, 25 | 33, 31
S. aureus subsp. aureus MSHR1132 | 379794527 | Holt et al., 2011 | NC_016941_1, 2 | 2 | 6, 4 | 469, 311 | 36, 23 | 36, 37, 37, 34, 36, 38, 48, 49, 51, 49
S. aureus subsp. aureus T0131 | 384868588 | Li et al., 2011 | NC_017347_9 | 1 | 1 | 78 | 23 | 33
S. aureus subsp. aureus USA300 | 87159884 | Diep et al., 2006 | NC_007793_6 | 1 | 1 | 78 | 23 | 33
S. aureus subsp. aureus VC40 | 379013365 | Sass et al., 2012 | NC_016912_6 | 1 | 1 | 78 | 23 | 33
S. aureus subsp. aureus Z172 | 554642795 | Chen et al., 2013 | NC_022604_4, 7 | 2 | 1, 2 | 80, 138 | 26, 28 | 29, 29, 28
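The columns of Table 1 are related by simple arithmetic: a locus with n spacers has n + 1 direct repeats, so its length should be close to (n + 1) × DR length plus the sum of the spacer lengths. The sketch below checks this for three rows of Table 1. The function name `expected_locus_length` is hypothetical, and the interpretation of the small differences (for example as a coordinate convention, or a truncated terminal repeat) is an assumption, not something stated in the source.

```python
from typing import Sequence


def expected_locus_length(dr_length: int, spacer_lengths: Sequence[int]) -> int:
    """Length of a repeat-spacer array with len(spacer_lengths) + 1 repeats."""
    n_repeats = len(spacer_lengths) + 1
    return n_repeats * dr_length + sum(spacer_lengths)


# (DR length, spacer lengths, tabulated CRISPR length), copied from Table 1.
ROWS = {
    "NZ_CP007690_7": (23, [33], 78),
    "NC_018608_1": (
        36,
        [36, 34, 34, 36, 37, 34, 37, 35, 38, 37, 35, 36, 35, 33, 35],
        1107,
    ),
    "NC_016941_1": (36, [36, 37, 37, 34, 36, 38], 469),
}

if __name__ == "__main__":
    for locus, (dr, spacers, tabulated) in ROWS.items():
        expected = expected_locus_length(dr, spacers)
        print(locus, "expected", expected, "tabulated", tabulated,
              "difference", expected - tabulated)
```

For these three rows the computed value exceeds the tabulated length by 1 bp; other rows may differ by slightly more.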

Repeat sequences

Because direct repeat (DR) sequences within one CRISPR locus are always highly similar or identical, the consensus direct repeat (CDR) sequence of each CRISPR locus was used for multiple sequence alignment. Based on the alignment, the 57 CRISPR loci in the 38 strains of S. aureus were divided into 25 groups. As shown in Table 2, each group consists of loci with the same DR sequence. To simplify multiple sequence comparison and clustering, one DR sequence from each group was chosen for homology analysis. The RNA secondary structure of the DR of each of the 25 groups was predicted, and its MFE recorded, with the RNAfold web server (Grissa et al., 2007b). In all groups, the RNA secondary structure was bound at both ends and formed a stem in the middle. As shown in Figure 1, the predicted stem length was 4 bp in groups 2, 14, 16, and 22; 3 bp in group 9; 6 bp in groups 19 and 24; 7 bp in groups 6 and 12; 8 bp in group 21; and 5 bp in the remaining groups. Because of differences between algorithms and uncertainty in the structures, Figure 1 uses color to indicate how likely each predicted structure is to form: red marks structures predicted to have a relatively high probability of forming, whereas green marks structures whose predicted probability of forming remained low even after algorithm optimization (Mathews, 2004). In addition, the stability and degree of conservation of a DR can be predicted from its RNA secondary structure diagram and MFE value.

Among the green groups, whose predictions have a lower probability and may carry systematic error, group 22 had the largest MFE value (−1 kcal/mol), which makes it the least stable not only among the green groups but among all the RNA secondary structures. Group 21 had the lowest MFE value (−13.2 kcal/mol) and the longest stem of all the RNA secondary structures, consistent with the theory that the stability of an RNA secondary structure is positively and roughly linearly correlated with stem length. Within the same cluster, stability is affected not only by stem length but also by the length of the repeat and its GC content. In general, repeats of the same length are more stable when their GC content is higher, and longer repeats may form more stable secondary structures. Overall, DR sequences with lower MFE values are more stable than those with higher MFE values.
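The relationship between GC content and stability described above can be checked directly on the repeat sequences. The sketch below computes the GC fraction of the group 21 and group 22 consensus repeats from Table 2 alongside their reported MFE values. It does not predict MFE (that requires a folding program such as RNAfold), and the two repeats differ in length, so the comparison is only indicative. The names `gc_content` and `DR_GROUPS` are introduced here for illustration.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Consensus repeat and reported MFE (kcal/mol), copied from Table 2.
DR_GROUPS = {
    21: ("ATGCAAGTTGGGGTGGGGCCCCAACA", -13.2),
    22: ("TATTCGATAACTACCCCGAAGAA", -1.0),
}

if __name__ == "__main__":
    for group, (seq, mfe) in sorted(DR_GROUPS.items()):
        print("group", group, "length", len(seq),
              "GC fraction", round(gc_content(seq), 3), "MFE", mfe)
```

In this pair, the repeat with the lower (more negative) MFE also has the higher GC fraction, in line with the trend stated in the text.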
Table 2

DR Sequence and RNA secondary structure MFE value statistics.

Group | DR Consensus | Number of CDR | Percentage (%) | Min Free Energy (kcal/mol)
1 | CAGCTTCTGTGTTGGGGCCCCGC | 8 | 14.03 | −5.2
2 | GATCGATAACTACCCCGAATAACAGGGGACGAGAAT | 1 | 1.75 | −7.8
3 | TGCAAGTTGGCGGGGCCCCAACA | 1 | 1.75 | −4.7
4 | TGTTGGGGCCCCGCCAACCTGCA | 8 | 14.03 | −5.5
5 | TTCTTTATGTTGGGGCCCCGCCAACT | 8 | 14.03 | −5.9
6 | TGTTGGGGCCCACACCCCAACTTGCA | 2 | 3.51 | −12
7 | TGCAAGTTGGCGGGGCCCCAACACAGAAGCT | 2 | 3.51 | −4.7
8 | TGCAAGTTGGCGGGGCTCCAACA | 1 | 1.75 | −4.5
9 | CAGCTTCTGTGTTGGGGCCCCGCC | 1 | 1.75 | −5.2
10 | TTCTCTATGTTGGGGCCCCGCCAA | 1 | 1.75 | −2.7
11 | TCTATGTTGGGGCCCCGCCAACTTG | 7 | 12.28 | −5.9
12 | TGTTGGGGCCCACACCCCAACTTGCA | 1 | 1.75 | −12
13 | TGCAAGTTGGCGGGGCCCCAACATAGA | 1 | 1.75 | −4.7
14 | GATCGATAACTACCCCGAAGAATAGGGGACGAGAAC | 1 | 1.75 | −7.8
15 | TGTTGGGGCCCCGCCAACCTGCA | 2 | 3.51 | −5.5
16 | ATTCGATAACTACCCCCGTAGAAGAGGGGACGAGAACT | 1 | 1.75 | −8.2
17 | CAAGTTGGCGGGGCCCCAACACAGA | 1 | 1.75 | −4.7
18 | TCTATGTTGGGGCCCCGCCAACTTG | 2 | 3.51 | −5.9
19 | TGTTGGGCCCCACCCCAACTTGCA | 1 | 1.75 | −8.3
20 | TGCAAGTTGGCGGGGCCCCAACATAG | 1 | 1.75 | −4.7
21 | ATGCAAGTTGGGGTGGGGCCCCAACA | 2 | 3.51 | −13.2
22 | TATTCGATAACTACCCCGAAGAA | 1 | 1.75 | −1
23 | TGCAAGTTGGCGGGGCCCCAATATAGA | 1 | 1.75 | −2.9
24 | TGCAAGTTGGCGGGGGCCCAACATAGA | 1 | 1.75 | −6.4
25 | TATGTTGGGGCCCCGCCAACTTGCA | 1 | 1.75 | −5.9
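The "Number of CDR" and "Percentage (%)" columns follow from counting how many loci share each consensus repeat. A minimal sketch of that bookkeeping is given below. The locus names in the example are made up, exact string identity is used as the grouping rule (an assumption; the study clustered repeats after alignment in MEGA), and `group_repeats` is a hypothetical helper name.

```python
from collections import Counter


def group_repeats(cdr_by_locus):
    """Group loci by identical consensus repeat.

    cdr_by_locus maps a locus ID to its consensus direct repeat.
    Returns (repeat, number of loci, percentage of all loci) tuples,
    largest group first, ties broken alphabetically by repeat.
    """
    counts = Counter(seq.upper() for seq in cdr_by_locus.values())
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(seq, n, round(100 * n / total, 2)) for seq, n in ranked]


if __name__ == "__main__":
    # Hypothetical locus IDs; the repeats are taken from Table 2.
    example = {
        "locus_a": "TGTTGGGGCCCCGCCAACCTGCA",
        "locus_b": "TGTTGGGGCCCCGCCAACCTGCA",
        "locus_c": "TATTCGATAACTACCCCGAAGAA",
    }
    for row in group_repeats(example):
        print(row)
```

With 57 loci in total, a group of one locus corresponds to 1.75%, a group of two to 3.51%, and a group of seven to 12.28%, matching the table; for a group of eight, 8/57 rounds to 14.04%, whereas the table lists 14.03%.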
Figure 1

Using DR sequences to generate 25 sets of RNA secondary structure prediction and MFE values.


Spacers

Multiple sequence alignment identified a total of 92 spacers in the 57 CRISPR loci of the 38 strains of S. aureus. The spacers at these loci appeared to be highly homologous, contrary to the diversity expected of a bacterial self-defense system; from another perspective, the strains seem to have experienced the same phage invasions or gene transfer events. After further MSA analysis of all the spacer sequences, the CRISPR loci were classified into 22 groups. The MSA also showed that no nucleotide was highly conserved across the spacer sequences of different CRISPR loci. Homology searches of the 92 spacers from the 57 CRISPR loci with NCBI BLAST showed that most spacers matched exogenous elements. Strikingly, 11 spacers from 2 strains (08BA02176 and MSHR1132) were highly homologous to exogenous phages or plasmids (Table 3). PCA was then applied to each of the 57 CRISPR loci after optimization based on the MSA data. During dimensionality reduction, the spacer components of each CRISPR locus were projected onto the same analysis plane, and the overall contribution of the principal components met the required precision. This simplification avoided the systematic error that uneven nucleotide sequence lengths would otherwise introduce when constructing the phylogenetic tree (Moore, 1981). The evolutionary relationships between strains were explored using differences in the acquisition and deletion of spacer sequences. A phylogenetic tree of the 57 groups of spacer components was constructed with MEGA6.06 (Figure 2), enabling homology analysis of the 57 selected CRISPR loci.
Table 3

Statistics of phage or plasmid highly homologous to spacers.

Strain | Genbank ID | CRISPR ID | Spacer ID | Sequence of spacer | Similar phage GI | Similar plasmid
Staphylococcus aureus 08BA02176 | 404477334 | NC_018608_1 | NC_018608_1-F | TAGAATGTTATTATCTAAGTGGTCGATGTATTCC | 735998225, 1352282635, 594138638, 1336442650, 1229407576 |
 | | | NC_018608_1-G | TCATACTAGCACCCCACTCTCTACTGAACAAGTATCA | 765348377 |
 | | | NC_018608_1-H | CTTAAAATCTAATTGCATTGTTATCAATTCCTTTA | 1188256656, 558695106, 558694899 |
 | | | NC_018608_1-K | TTTTCTTTAACTGTTTTTACTGCCCATTTAATAGT | 735998439, 525336474 |
 | | | NC_018608_1-M | AAGTTAACGGCATTACCTAATAAAAATATTTTAGG | 584590862, 1345606604, 1332563252, 695256149, 365189246, 365189224 |
 | | | NC_018608_1-N | TCATCTTTCATGTCACTGATTAATTCATTTGTA | | Plasmid SAP020A
 | | | NC_018608_1-O | GGTAATAGTTGCTCAATAGGTAATAAAACGTCGGT | | Plasmid pAYP1020
 | | NC_018608_2 | NC_018608_2-B | GATATACTCCTTTACCATGTATTAATTCTGGACCAC | 1220001744, 1188256199, 1220003875 |
Staphylococcus aureus subsp. aureus MSHR1132 | 379794527 | NC_016941_1 | NC_016941_1-D | GTTTTTCATAGTTAATCAATCCCTTTTCTTTTTT | 1192700659, 1102331716, 797192878, 410809112, 398255565, 670139430, 1215500049, 1188256881 |
 | | | NC_016941_1-E | TTAAATCTTTGATTGCTCTTAGCTCTAGTTATGTAT | 806933942, 1336445532, 1321071118, 1321070986, 940328084, 1168037548, 1072301026, 857291865 |
 | | | NC_016941_1-F | CACGCTGTAGTGAAGTATAGAAACGGCATGAGTACAAT | 1321071610, 589626950, 402761649, 514343602, 398256436, 215260398, 475990627, 456174244, 349732033, 302749846 |
Figure 2

Biological evolution tree generated by principal component spacer sequences. Evolutionary tree results are grouped based on evolutionary relationships. The numbers from 1 to 21 represent 21 groups. Strains in the same group indicate higher evolutionary similarity, and the closer the distance, the higher the affinity is. The evolutionary distance scale is 0.2.
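The idea of comparing loci by the spacers they have gained or lost can be illustrated with a much simpler measure than the MEGA tree used in the study: the Jaccard distance between the spacer sets of two loci. This is a swapped-in technique for illustration, not the authors' method, and the locus names and spacer labels below are hypothetical.

```python
def jaccard_distance(spacers_a, spacers_b):
    """1 minus the fraction of shared spacers among all spacers of two loci."""
    a, b = set(spacers_a), set(spacers_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def distance_matrix(loci):
    """Pairwise Jaccard distances for a dict of locus ID -> list of spacers."""
    names = sorted(loci)
    return {(x, y): jaccard_distance(loci[x], loci[y])
            for x in names for y in names}


if __name__ == "__main__":
    # Hypothetical loci; each spacer is represented by a short label.
    loci = {
        "locus_1": ["sp_a", "sp_b", "sp_c"],
        "locus_2": ["sp_b", "sp_c", "sp_d"],
        "locus_3": ["sp_x"],
    }
    for pair, dist in sorted(distance_matrix(loci).items()):
        print(pair, round(dist, 3))
```

A matrix of this kind could be passed to any distance-based tree-building method; loci that share most of their spacers end up close together, and loci with no spacers in common are at distance 1.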


Cas genes near CRISPR loci

In a CRISPR-Cas system, the cas1 and cas2 genes are essential elements of a normally active system and are located near the CRISPR locus. The cas1 and cas2 genes were therefore searched within 10,000 bp upstream and 10,000 bp downstream of all 57 CRISPR loci. Both core cas genes were found in only 3 strains: S. aureus 08BA02176 (CRISPR ID: NC_018608_1), S. aureus GCF_001611345 (CRISPR ID: NZ_CP012978_4), and S. aureus MSHR1132 (CRISPR IDs: NC_016941_1 and NC_016941_2); the other 35 strains carried neither gene. The cas genes in these three strains are described in Figure 3. The CRISPR locus NC_018608_1 of S. aureus 08BA02176 belongs to subtype I-C, and the related proteins nearby were endonuclease Cas1, integrase Cas2, helicase Cas3, protein Cas4, protein Cas5 and protein Cas7. The CRISPR locus NC_016941_1 of S. aureus MSHR1132 belongs to subtype III-A, and the CRISPR-related proteins nearby were nuclease Cas9, endonuclease Cas1, integrase Cas2 and protein Csn2. The CRISPR-associated proteins near MSHR1132 NC_016941_2 (subtype III-A) were protein Cas6, protein Cas10, protein Csm10, protein Csm4, endonuclease Cas1 and integrase Cas2. The CRISPR-related proteins near GCF_001611345 NZ_CP012978_4 (subtype III-A) were endonuclease Cas1, integrase Cas2, protein Cas6, protein Cas10, protein Csm2, and protein Csm6. These observations show that CRISPR loci of the same type with 4 or more flanking proteins share similar proteins.
Figure 3

The cas genes in vicinity of CRISPR loci. Cas genes were searched from 10,000 bp upstream to 10,000 bp downstream the CRISPR sequence. “Sasa” represents Staphylococcus aureus subsp. aureus. “S. aureus” represents Staphylococcus aureus.
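The ±10,000 bp search window can be expressed as a simple interval-overlap test on annotated gene coordinates. The sketch below is a generic illustration under stated assumptions: the coordinates are invented, `Gene` and `genes_in_flanks` are hypothetical names, and the study itself used the CRISPRs database rather than code of this kind.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based start coordinate on the genome
    end: int    # 1-based end coordinate (inclusive)


def genes_in_flanks(locus_start, locus_end, genes, flank=10_000):
    """Names of genes overlapping the locus plus `flank` bp on each side."""
    window_start = max(1, locus_start - flank)
    window_end = locus_end + flank
    return [g.name for g in genes
            if g.end >= window_start and g.start <= window_end]


if __name__ == "__main__":
    # Invented coordinates for illustration only.
    annotation = [
        Gene("cas1", 52_000, 52_900),
        Gene("cas2", 53_000, 53_300),
        Gene("unrelated_gene", 80_000, 81_000),
    ]
    nearby = genes_in_flanks(60_000, 61_107, annotation)
    has_core_genes = {"cas1", "cas2"} <= set(nearby)
    print(nearby, has_core_genes)
```

A locus is counted as having both core genes only when cas1 and cas2 both fall inside the window, which is the criterion applied to the 57 loci in the text.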


Discussion

Staphylococcus aureus is currently one of the major pathogenic microorganisms that cause infectious diseases in humans worldwide. In recent years, due to the widespread use of multiple antibiotics, various resistance genes (such as mec gene-encoding penicillin-binding protein) were horizontally transferred between Staphylococcus strains resulted in multi-drug resistance of S. aureus (Chuang et al., 2014; Lin et al., 2016; Miao et al., 2016). One study analyzed 370 CRISPR-Cas systems from 148 prokaryotic genomes and constructed a phylogenetic tree based on the cas1, cas2, cas3, and cas4 genes present in these systems. Surprisingly, the species that are supposed to be closely related to each other are on separate branches of the phylogenetic tree. Further analysis revealed 10 large plasmids with a CRISPR system, suggesting that the CRISPR-Cas system was a mobile genetic element and that many horizontal gene transfer events have occurred through binding transfer, allowing the system to spread among distantly related species (Borowski, 2017). In addition, Bikard et al. (2012) constructed a CRISPR locus targeting a drug resistance gene on the chromosome of the recipient bacteria and introduced the plasmid containing the CRISPR gene into the chromosome of the recipient bacteria, resulting in the death of the bacteria containing the drug resistance gene. Based on this, a mobile CRISPR element targeting a drug-resistant or virulence gene was designed to not only kill pathogens with the target genes but also prevent the spread of drug-resistant or virulence genes among bacteria. The development of this mobile CRISPR element as an antimicrobial agent will provide a new strategy for the control of foodborne pathogens. Some studies have found that about 40% of the bacterial genome contains the CRISPR locus (Hakim et al., 2016). Most of the CRISPR loci of prokaryotes are located on their chromosomes, rarely on plasmids (Sorek et al., 2008). 
The main reason for this phenomenon is that the CRISPR system could provide the scavenging effect on the foreign bacteriophage and plasmid through the targeting interference mechanism of RNA, which endowed prokaryotes with strong adaptability in the face of different evolutionary environments (Koonin and Wolf, 2015). If CRISPR exists on the plasmid, it will be detrimental to the heritability of this immunity. In this study, there were 16 kinds of S. aureus in all 38 kinds of S. aureus contains 2 or more CRISPR loci, and they may present different CRISPR system type. It directly proved that a bacterial genome containing multiple CRISPR loci. There were great distinctions in the number and characteristics of the CRISPR loci contained among different species, which was also a necessary condition for bacterial genotyping (Horvath et al., 2009). In general, direct repeats of a single CRISPR locus are extremely conserved; however, there exist in some modified nucleotides in direct repeats within different CRISPR loci. When a new spacer is formed, it always remove the internal spacer by homologous recombination between direct repeats to limit the size of the CRISPR. The terminal repeats were frequently observed to be polymorphic, which was well-explained by the frequent loss of spacer/repeats containing the last of the terminal repeats (Vahidi and Honda, 1991; Horvath et al., 2008). The position of the repeat appears to be consistent with the location of nearby cas gene. In addition, variations within repetitive sequences often occur throughout the CRISPR locus (Horvath et al., 2008). In the 38 strains of S. aureus, repeats of most strains were highly conserved and were located in groups 1, 4, and 5. Therefore, it could be inferred that these direct repetitive mutations were less likely to occur in other groups, suggesting that the presence of cas around the CRISPR locus in these clusters is probably less. 
Because direct repeats contain short palindromic sequences, the RNA secondary structures that these repeats may form during transcription were investigated. Such a structure could cooperate with the crRNA transcribed from the entire CRISPR sequence to form a bimodal structure that guides the Cas protein to the target site. Kunin et al. (2007) indicated that the stem-loop structure of some repeats may contribute to recognition-mediated contact between a spacer-targeted exogenous RNA or DNA and a Cas-encoded protein, suggesting that the stability of the RNA secondary structure may affect CRISPR function. In addition, the minimum free energy (MFE) values of the RNA secondary structures formed by all 25 different direct repeats were predicted. Comparison of these values showed that RNA secondary structures with lower MFE were more stable. The stability of the RNA secondary structure was found to depend on the length of the repeat, the number of base pairs, and the GC content of the stem. The spacer sequences were compared against the NCBI database to identify known genes highly homologous to them. Boch et al. (2009) indicated that spacer sequences derive from exogenous genetic elements and that CRISPR acquires spacers from newly invading exogenous DNA by some means. A homology survey of all the spacer sequences in this study using NCBI BLAST showed that most of the spacers of the S. aureus strains could be matched to foreign phages or plasmids in NCBI. In particular, 11 spacers from 2 strains of S. aureus showed high-scoring matches to exogenous phages or plasmids (as shown in Table 3). This strongly supports the view that spacers are derived from exogenous elements. To a certain extent, the larger the number of phages highly similar to the spacers, the more frequently phages have attacked the bacterium, and the more important the CRISPR system is for bacterial survival in an environment of phage invasion. 
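The dependence of stem stability on base-pair count and GC content noted above can be illustrated with a minimal sketch. It is not the MFE prediction used in this study: it only counts Watson-Crick pairs formed between the two ends of a repeat and reports GC content, as a crude proxy. The function names and the example sequence are hypothetical and do not come from the analyzed S. aureus loci.

```python
# Crude proxy for stem strength in a direct repeat (illustrative only;
# a thermodynamic folding tool is needed for real MFE values).

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def terminal_stem_length(seq: str, min_loop: int = 3) -> int:
    """Count consecutive Watson-Crick pairs formed by pairing the 5' end
    with the 3' end and moving inward, leaving at least `min_loop`
    unpaired bases inside the innermost pair."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    i, j = 0, len(rna) - 1
    pairs = 0
    while j - i - 1 >= min_loop and (rna[i], rna[j]) in WATSON_CRICK:
        pairs += 1
        i += 1
        j -= 1
    return pairs


# Hypothetical repeat-like sequence: a 3 bp GC stem around an AAAA loop.
example = "GGGAAAACCC"
stem = terminal_stem_length(example)
stem_gc = gc_content(example[:stem] + example[len(example) - stem:])
print(stem, gc_content(example), stem_gc)
```

A longer stem with a higher GC fraction would generally be expected to give a lower (more stable) MFE, but wobble pairs, internal loops, and stacking energies are ignored here.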
In addition, a phylogenetic tree of the selected strains was constructed from the spacer sequences to provide a reference for the genotyping of S. aureus. CRISPR-based typing is already well established in some bacteria, the most representative example being Salmonella typing (Liu et al., 2011; Fabre et al., 2012). Recent studies have even demonstrated that Helicobacter pylori can be typed using CRISPR-like sequences in combination with virulence genes. CRISPR-virulence typing was performed on 20 Helicobacter pylori strains and compared with the typing obtained by the random amplified polymorphic DNA (RAPD) technique; there was no difference in discriminatory power between CRISPR-virulence typing and RAPD typing (Bangpanwimon et al., 2017). Thus, CRISPR-based typing has broad application prospects in the investigation of foodborne pathogens. At present, however, data on the typing of foodborne pathogenic bacteria by this method are still insufficient, the databases are incomplete, and there is no uniform standard. Given the increasingly serious problem of S. aureus infection and the relatively few studies on CRISPR in S. aureus, extensive research on CRISPR typing of S. aureus is particularly urgent. Establishing a database to explore the occurrence, diversity, and activity of CRISPR in the relevant foodborne pathogens would help address persistent food safety problems (Nikki and Dudley, 2014). In this study, 92 spacers were found in the 38 S. aureus strains; only 9 strains contained two or more spacers, and the other strains contained only one. The 92 spacers had an average length of about 32 bp (shortest 25 bp, longest 51 bp), and the diversity of spacer length and sequence may affect the activity of the CRISPR system in the bacteria. Horvath et al. 
(2008) showed that longer CRISPR sequences may be more active than shorter ones. Di et al. (2014) indicated that CRISPR loci containing a larger number of 30 bp spacers were more active than loci containing a small number of 36 bp spacers. Therefore, the CRISPR loci in the strains selected here may be more active than those of previously studied strains with shorter spacers. Additionally, identical nucleic acid sequences were observed in the spacers of different strains, which suggests that these strains may have been attacked by closely related phages or that horizontal gene transfer has occurred. For the 57 CRISPR loci contained in the 38 S. aureus strains, the 10,000 bp upstream and the 10,000 bp downstream of each locus were searched for cas1 and cas2 genes. Three S. aureus strains met this criterion. In these three strains, which contained cas1 and cas2, the vicinity of the CRISPR sequence was then systematically searched for other cas genes. In S. aureus 08BA02176, the cas genes were located within the 10,000 bp downstream of the CRISPR sequence. In the other two strains (S. aureus GCF_001611345 and S. aureus subsp. aureus MSHR1132), however, the cas genes were located upstream of the CRISPR sequence, which suggests that the CRISPR-Cas system can be transferred between strains of the same species (Borowski, 2017). In summary, the CRISPR systems of the other S. aureus strains, in which cas1 and cas2 were not detected, were considered inactive, because when these key cas genes are inactivated or absent at a CRISPR locus, the bacterium loses both its resistance and its ability to integrate new spacer sequences. At present, the CRISPR systems of foodborne pathogenic bacteria have not received enough research attention. 
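The search of the 10,000 bp flanking each CRISPR locus for cas1 and cas2, described above, can be sketched as a simple interval check. This is an illustrative sketch under stated assumptions, not the pipeline used in this study: the `Feature` record, the function name, and all coordinates are hypothetical, and "upstream" and "downstream" are taken from genome coordinates only, ignoring strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A named genomic interval (hypothetical record; 1-based, inclusive)."""
    name: str
    start: int
    end: int


def cas_genes_near(locus, genes, flank=10_000, wanted=("cas1", "cas2")):
    """Return {gene name: side} for wanted genes lying within `flank` bp
    of the CRISPR locus. Side is judged by coordinate, not by strand."""
    lo = max(1, locus.start - flank)
    hi = locus.end + flank
    hits = {}
    for gene in genes:
        if gene.name not in wanted:
            continue
        if gene.end < lo or gene.start > hi:
            continue  # entirely outside the flanking window
        if gene.end < locus.start:
            hits[gene.name] = "upstream"
        elif gene.start > locus.end:
            hits[gene.name] = "downstream"
        else:
            hits[gene.name] = "overlapping"
    return hits


# Made-up coordinates for illustration only.
crispr = Feature("CRISPR_1", 50_000, 50_400)
annotated = [
    Feature("cas1", 45_000, 45_900),
    Feature("cas2", 46_000, 46_300),
    Feature("cas1", 90_000, 90_900),  # too far away to count
]
result = cas_genes_near(crispr, annotated)
print(result)

# A locus would be treated as having an intact adaptation module only if
# both cas1 and cas2 are found in its neighbourhood.
has_both = {"cas1", "cas2"} <= result.keys()
```

In practice the gene coordinates would come from a genome annotation file, and the strand of the locus would need to be taken into account before labeling a gene as upstream or downstream.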
In fact, in addition to its immune defense function, the CRISPR system has also been found to regulate bacterial virulence and to influence biofilm formation in foodborne pathogens. Zegans et al. (2009) found that the CRISPR system was involved in the lysogenic infection of Pseudomonas aeruginosa PA14 by bacteriophage DMS3, resulting in inhibition of P. aeruginosa biofilm formation and decreased swarming motility. Palmer and Gilmore (2010) indicated that multidrug-resistant Enterococcus strains usually lack the CRISPR system, and thus speculated that the CRISPR system may interfere with the acquisition of drug resistance genes. Lovley and Muktak (2010) showed that spacer 1 in the CRISPR system of Pelobacter carbinolicus is homologous to the bacterium's own histidyl-tRNA synthetase gene, which drove P. carbinolicus to selectively delete genes rich in histidine codons during evolution. As a result, P. carbinolicus diverged from other Geobacteraceae to form a new species. Babu et al. (2011) demonstrated that Cas1 (YgbT) in the CRISPR I-E system of E. coli K12 can act on ssDNA and on branched DNA substrates such as Holliday junctions, replication forks, and 5′-flap structures, cleaving them and thereby affecting recombination and repair of the host genome. These studies revealed that the CRISPR system still has many unknown functions in regulating bacterial physiological activities. Most current research focuses on bacterial genotyping, drug resistance, epidemiological studies, and traceability analysis. Therefore, in this study, S. aureus was selected as a representative foodborne pathogen, and the basic structure of the CRISPR systems contained in the 38 strains of S. aureus was analyzed by means of bioinformatics. The results of this bioinformatics analysis can provide a data reference for a broader range of CRISPR studies in S. aureus. 
At present, the CRISPR systems of many foodborne pathogens have still not been studied, which hinders genotyping based on large data sets and the investigation of the pathogenic mechanisms of foodborne pathogens. More extensive research on the CRISPR systems of foodborne pathogenic bacteria is therefore imperative. With the rapid development of CRISPR technology, from the CRISPR-Cas9 DNA-editing system to the discovery of the CRISPR-Cas13 RNA-editing system that is now replacing RNAi technology (Doudna and Charpentier, 2014; Abudayyeh et al., 2017), there is no doubt that the research potential and application value of CRISPR systems in various fields are still underestimated. It is believed that, as the structure of the CRISPR system is further revealed, it will change the research landscape in the life sciences.

Conclusion

Most strains of S. aureus contain only one CRISPR locus, and a few strains contain multiple, sparsely distributed loci. These loci mainly include highly conserved direct repeats and highly variable spacers, as well as polymorphic cas genes. In addition, all direct repeats can form stable RNA secondary structures, and spacer sequences have been shown to originate from exogenous phages or plasmids. Moreover, the specificity of spacer sequences can serve as a basis for accurate genotyping techniques. Three different CRISPR system subtypes were found in the 38 S. aureus strains, including 4 active CRISPR-Cas systems. Analysis and comparison of the CRISPR-Cas systems can help to understand the environmental adaptability of each S. aureus evolutionary lineage, which differ in pathogenicity. Bioinformatics analysis provides data support for bacterial typing, traceability analysis, and exploration of CRISPR functions other than immunity. The study of CRISPR provides new ideas for preventing the spread of resistance genes among S. aureus strains and for eliminating drug resistance genes.

Author contributions

XZ designed this study and wrote the manuscript. ZY performed the experiments and collected the data. XZ, ZY, and ZX revised the manuscript critically for important intellectual content. All authors read and approved the final manuscript.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  93 in total

1.  Repeats and subrepeats in the intergenic spacer of rDNA from the nematode Meloidogyne arenaria.

Authors:  H Vahidi; B M Honda
Journal:  Mol Gen Genet       Date:  1991-06

2.  Identification of a highly transmissible animal-independent Staphylococcus aureus ST398 clone with distinct genomic and cell adhesion properties.

Authors:  Anne-Catrin Uhlemann; Stephen F Porcella; Sheetal Trivedi; Sean B Sullivan; Cory Hafer; Adam D Kennedy; Kent D Barbian; Alex J McCarthy; Craig Street; David L Hirschberg; W Ian Lipkin; Jodi A Lindsay; Frank R DeLeo; Franklin D Lowy
Journal:  mBio       Date:  2012-02-28       Impact factor: 7.867

3.  Complete Genome Sequence of Staphylococcus aureus FCFHV36, a Methicillin-Resistant Strain Heterogeneously Resistant to Vancomycin.

Authors:  John Anthony McCulloch; Alessandro Conrado de Oliveira Silveira; Aline da Costa Lima Moraes; Paula Juliana Pérez-Chaparro; Manoella Ferreira Silva; Lara Mendes Almeida; Pedro Alves d'Azevedo; Elsa Masae Mamizuka
Journal:  Genome Announc       Date:  2015-08-13

4.  RefSeq microbial genomes database: new representation and annotation strategy.

Authors:  Tatiana Tatusova; Stacy Ciufo; Boris Fedorov; Kathleen O'Neill; Igor Tolstoy
Journal:  Nucleic Acids Res       Date:  2013-12-06       Impact factor: 16.971

5.  Complete Genome Sequence of Staphylococcus aureus MCRF184, a Necrotizing Fasciitis-Causing Methicillin-Sensitive Sequence Type 45 Staphylococcus Strain.

Authors:  Vijay Aswani; Bob Mau; Sanjay K Shukla
Journal:  Genome Announc       Date:  2016-05-12

Review 6.  Detection of Foodborne Pathogens by Surface Enhanced Raman Spectroscopy.

Authors:  Xihong Zhao; Mei Li; Zhenbo Xu
Journal:  Front Microbiol       Date:  2018-06-12       Impact factor: 5.640

7.  Efficient and Scalable Precision Genome Editing in Staphylococcus aureus through Conditional Recombineering and CRISPR/Cas9-Mediated Counterselection.

Authors:  Kelsi Penewit; Elizabeth A Holmes; Kathyrn McLean; Mingxin Ren; Adam Waalkes; Stephen J Salipante
Journal:  MBio       Date:  2018-02-20       Impact factor: 7.867

8.  Whole-Genome Sequence for Methicillin-Resistant Staphylococcus aureus Strain ATCC BAA-1680.

Authors:  Luke T Daum; Violet V Bumah; Daniela S Masson-Meyers; Manjeet Khubbar; John D Rodriguez; Gerald W Fischer; Chukuka S Enwemeka; Steve Gradus; Sanjib Bhattacharyya
Journal:  Genome Announc       Date:  2015-03-12

9.  CRISPR-Cas Systems Features and the Gene-Reservoir Role of Coagulase-Negative Staphylococci.

Authors:  Ciro C Rossi; Thaysa Souza-Silva; Amanda V Araújo-Alves; Marcia Giambiagi-deMarval
Journal:  Front Microbiol       Date:  2017-08-15       Impact factor: 5.640

10.  Evolved Cas9 variants with broad PAM compatibility and high DNA specificity.

Authors:  Johnny H Hu; Shannon M Miller; Maarten H Geurts; Weixin Tang; Liwei Chen; Ning Sun; Christina M Zeina; Xue Gao; Holly A Rees; Zhi Lin; David R Liu
Journal:  Nature       Date:  2018-02-28       Impact factor: 49.962

  14 in total

1.  Genome-wide correlation analysis suggests different roles of CRISPR-Cas systems in the acquisition of antibiotic resistance genes in diverse species.

Authors:  Saadlee Shehreen; Te-Yuan Chyou; Peter C Fineran; Chris M Brown
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2019-05-13       Impact factor: 6.237

2.  Formation and Control of the Viable but Non-culturable State of Foodborne Pathogen Escherichia coli O157:H7.

Authors:  Yanmei Li; Teng-Yi Huang; Congxiu Ye; Ling Chen; Yi Liang; Kan Wang; Junyan Liu
Journal:  Front Microbiol       Date:  2020-06-16       Impact factor: 5.640

3.  Pathogenomic Analysis of a Novel Extensively Drug-Resistant Citrobacter freundii Isolate Carrying a blaNDM-1 Carbapenemase in South Africa.

Authors:  Yogandree Ramsamy; Koleka P Mlisana; Daniel G Amoako; Mushal Allam; Arshad Ismail; Ravesh Singh; Akebe Luther King Abia; Sabiha Y Essack
Journal:  Pathogens       Date:  2020-01-31

4.  Reduction, Prevention, and Control of Salmonella enterica Viable but Non-culturable Cells in Flour Food.

Authors:  Yanmei Li; Tengyi Huang; Caiying Bai; Jie Fu; Ling Chen; Yi Liang; Kan Wang; Jun Liu; Xiangjun Gong; Junyan Liu
Journal:  Front Microbiol       Date:  2020-08-21       Impact factor: 6.064

5.  Genes Influencing Phage Host Range in Staphylococcus aureus on a Species-Wide Scale.

Authors:  Abraham G Moller; Kyle Winston; Shiyu Ji; Junting Wang; Michelle N Hargita Davis; Claudia R Solís-Lemus; Timothy D Read
Journal:  mSphere       Date:  2021-01-13       Impact factor: 4.389

6.  Characterization on gut microbiome of PCOS rats and its further design by shifts in high-fat diet and dihydrotestosterone induction in PCOS rats.

Authors:  Yanhua Zheng; Jingwei Yu; Chengjie Liang; Shuna Li; Xiaohui Wen; Yanmei Li
Journal:  Bioprocess Biosyst Eng       Date:  2020-03-10       Impact factor: 3.210

7.  "One-step" characterization platform for pathogenic genetics of Staphylococcus aureus.

Authors:  Yanmei Li; Yisen Qiu; Congxiu Ye; Ling Chen; Yi Liang; Teng-Yi Huang; Li Zhang; Junyan Liu
Journal:  Bioprocess Biosyst Eng       Date:  2020-10-28       Impact factor: 3.210

8.  Development of a Direct and Rapid Detection Method for Viable but Non-culturable State of Pediococcus acidilactici.

Authors:  Yu Guan; Kan Wang; Yang Zeng; Yanrui Ye; Ling Chen; Tengyi Huang
Journal:  Front Microbiol       Date:  2021-07-02       Impact factor: 5.640

9.  Milk microbial composition of Brazilian dairy cows entering the dry period and genomic comparison between Staphylococcus aureus strains susceptible to the bacteriophage vB_SauM-UFV_DC4.

Authors:  Vinícius da Silva Duarte; Laura Treu; Cristina Sartori; Roberto Sousa Dias; Isabela da Silva Paes; Marcella Silva Vieira; Gabriele Rocha Santana; Marcos Inácio Marcondes; Alessio Giacomini; Viviana Corich; Stefano Campanaro; Cynthia Canedo da Silva; Sérgio Oliveira de Paula
Journal:  Sci Rep       Date:  2020-03-26       Impact factor: 4.379

10.  Potent CRISPR-Cas9 inhibitors from Staphylococcus genomes.

Authors:  Kyle E Watters; Haridha Shivram; Christof Fellmann; Rachel J Lew; Blake McMahon; Jennifer A Doudna
Journal:  Proc Natl Acad Sci U S A       Date:  2020-03-10       Impact factor: 11.205
