Mehrdad Tajkarimi1, Hannah M Wexler1,2,3.
Abstract
Background: While CRISPR-Cas systems have been identified in bacteria from a wide variety of ecological niches, there are no studies describing CRISPR-Cas elements in Bacteroides species, the most prevalent anaerobic bacteria in the lower intestinal tract. Microbes of the genus Bacteroides make up ~25% of the total gut microbiome. Bacteroides fragilis comprises only 2% of the total Bacteroides in the gut, yet causes >70% of Bacteroides infections. The factors causing it to transition from benign resident of the gut microbiome to virulent pathogen are not well understood, but a combination of horizontal gene transfer (HGT) of virulence genes and differential transcription of endogenous genes is clearly involved. The CRISPR-Cas system is a multi-functional system described in prokaryotes that may be involved in control of both HGT and gene regulation.
Keywords: Bacteroides; CRISPR-Cas system; gut microbiome; immune defense; pathobiome; phage; virulence
Year: 2017 PMID: 29218031 PMCID: PMC5704556 DOI: 10.3389/fmicb.2017.02234
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Bacteroides fragilis strains used in this study.
| Appy Abscess | ASM2598v1 | NCTC Type Strain | ||
| Appy Abscess | ASM21083v1 | Privitera et al., | ||
| Appy Abscess | Bact_frag_HMW_615_V1 | Pumbwe et al., | ||
| Blood | DCMSKEJBY0001B2.0 | Sydenham et al., | ||
| Blood | DCMOUH0067B_1.0 | Sydenham et al., | ||
| Blood | Bact_frag_HMW_610_V1 | Sherwood et al., | ||
| Blood | Bact_frag_HMW_616_V1 | Pumbwe et al., | ||
| Blood | DCMOUH0017B2.0 | Sydenham et al., | ||
| Blood | DCMOUH0018B2.0 | Sydenham et al., | ||
| Blood | ASM105877v1 | Roach et al., | ||
| Blood | ASM992v1 | Kuwahara et al., | ||
| Blood | ASM78502v1 | Kalapila et al., | ||
| Blood | DCMOUH0042B1.0 | Sydenham et al., | ||
| Blood | DCMOUH0085B_1.0 | Sydenham et al., | ||
| Blood | ASM105875v1 | Roach et al., | ||
| Blood | ASM105874v1 | Roach et al., | ||
| ETBF | ASM59840v1 | Science, | ||
| ETBF | ASM59854v2 | Science, | ||
| ETBF | ASM59916v2 | Science, | ||
| ETBF | ASM59888v1 | Science, | ||
| ETBF | ASM59826v1 | Science, | ||
| ETBF | ASM59830v1 | Science, | ||
| ETBF | ASM59832v1 | Science, | ||
| ETBF | ASM59816v1 | Science, | ||
| ETBF | ASM59818v2 | Science, | ||
| ETBF | ASM59844v1 | Science, | ||
| ETBF | ASM59836v1 | Science, | ||
| ETBF | ASM59820v1 | Science, | ||
| ETBF | ASM59824v1 | Science, | ||
| ETBF | ASM59912v1 | Science, | ||
| ETBF | ASM59882v1 | Science, | ||
| ETBF | ASM59884v1 | Science, | ||
| ETBF | ASM59860v1 | Science, | ||
| ETBF | ASM59864v1 | Science, | ||
| ETBF | ASM59922v1 | Science, | ||
| ETBF | ASM59924v1 | Science, | ||
| ETBF | ASM59870v1 | Science, | ||
| ETBF | ASM59928v2 | Science, | ||
| ETBF | ASM59892v1 | Science, | ||
| ETBF | ASM59894v1 | Science, | ||
| ETBF | ASM59896v1 | Science, | ||
| ETBF | ASM59934v1 | Science, | ||
| ETBF | ASM59902v1 | Science, | ||
| ETBF | ASM59876v2 | Science, | ||
| ETBF | ASM59878v2 | Science, | ||
| ETBF | ASM60111v1 | Science, | ||
| ETBF | ASM69968v1 | Science, | ||
| ETBF | ASM59828v1 | Science, | ||
| ETBF | ASM59842v1 | Science, | ||
| ETBF | ASM59834v1 | Science, | ||
| ETBF | ASM59846v1 | Science, | ||
| ETBF | ASM59822v1 | Science, | ||
| ETBF | ASM59848v1 | Science, | ||
| ETBF | ASM59838v1 | Science, | ||
| ETBF | ASM59850v1 | Science, | ||
| ETBF | ASM59908v1 | Science, | ||
| ETBF | ASM59880v1 | Science, | ||
| ETBF | ASM59852v1 | Science, | ||
| ETBF | ASM59910v1 | Science, | ||
| ETBF | ASM59914v2 | Science, | ||
| ETBF | ASM59856v1 | Science, | ||
| ETBF | ASM59858v1 | Science, | ||
| ETBF | ASM59918v2 | Science, | ||
| ETBF | ASM59886v1 | Science, | ||
| ETBF | ASM59862v1 | Science, | ||
| ETBF | ASM59920v1 | Science, | ||
| ETBF | ASM59866v1 | Science, | ||
| ETBF | ASM59868v2 | Science, | ||
| ETBF | ASM59926v1 | Science, | ||
| ETBF | ASM59890v1 | Science, | ||
| ETBF | ASM59872v1 | Science, | ||
| ETBF | ASM59898v1 | Science, | ||
| ETBF | ASM59930v1 | Science, | ||
| ETBF | ASM59900v1 | Science, | ||
| ETBF | ASM59874v1 | Science, | ||
| ETBF | ASM59932v1 | Science, | ||
| ETBF | ASM59936v1 | Science, | ||
| ETBF | ASM59938v1 | Science, | ||
| ETBF | ASM59904v1 | Science, | ||
| ETBF | ASM59906v2 | Science, | ||
| ETBF | ASM60101v1 | Science, | ||
| ETBF | ASM60103v1 | Science, | ||
| ETBF | ASM60107v2 | Science, | ||
| ETBF | ASM60109v1 | Science, | ||
| ETBF | ASM60105v1 | Science, | ||
| ETBF | ASM169988v1 | Pierce and Bernstein, | ||
| ETBF | ASM169987v1 | Pierce and Bernstein, | ||
| ETBF | ASM169986v1 | Pierce and Bernstein, | ||
| ETBF | ASM96578v1 | Nikitina et al., | ||
| ETBF | ASM59814v1 | Science, | ||
| Feces | ASM158009v1 | HMP | ||
| Feces | Bact_frag_CL03T12C07_V1 | HMP | ||
| Feces | Bact_frag_CL05T12C13_V1 | HMP | ||
| Feces | ASM169269v1 | Russia | ||
| HMP | Bact_frag_CL07T12C05_V1 | HMP | ||
| HMP | ASM15701v1 | HMP | ||
| HMP | Bact_frag_CL03T00C08_V1 | HMP | ||
| HMP | Bact_frag_CL05T00C42_V1 | HMP | ||
| HMP | Bact_frag_CL07T00C01_V1 | HMP | ||
| Japan | ASM61342v1 | NBRP, Japan | ||
| Undernourished Malawian Children | 4g8B_assembly | Kau et al., | ||
| Undernourished Malawian Children | 2d2A_assembly | Kau et al., | ||
| ViR/MDR | ASM169535v1 | Soki et al., | ||
| ViR/MDR | ASM168221v1 | Soki | ||
| Fluid | ASM105489v1 | Roach et al., | ||
| Fluid | ASM105633v1 | Roach et al., | ||
| Fluid | ASM105486v1 | Roach et al., | ||
| ViR/MDR | ASM169369v1 | Soki et al., | ||
| ViR/MDR | BFBE1.1 | Risse et al., |
Reference Genome for the Human Microbiome Project (HMP).
Federal Research and Clinical Center of Physical-Chemical Medicine of FMBA of Russia.
National Bioresearch Project, Japan.
Soki et al. (unpublished), Comparative analysis of Division I and II Bacteroides fragilis strain genomes.
Figure 1. Dendrogram of B. fragilis isolates. The dendrogram is based on ANIb values generated for published B. fragilis genomes and is viewed using the Archaeopteryx software tool. B. fragilis strains isolated from blood are colored in red. (Although BF Cag 558 is clustered with the blood isolates, no source data are available for it.) All blood isolates (in red) are clustered together, apart from BF_DCMOUH0042B (this isolate also has a CRISPR pattern distinct from the other isolates). Isolates described as multidrug resistant or virulent (teal) are more widely scattered. ETBF isolates are scattered across the dendrogram. Several of the strains are emphasized in a larger black font because their spacer distribution was of particular interest (discussed in Figure 7).
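How a dendrogram can be derived from pairwise ANIb values may be easier to follow with a small sketch. It assumes that percent identity is converted to a distance as 1 − ANI/100 and that isolates are joined by average linkage (UPGMA); the caption does not state which clustering method was used, and the isolate labels and ANIb percentages below are invented for illustration only.

```python
from itertools import combinations


def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples.

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    """
    # Each cluster is (subtree, list of leaf names it contains).
    clusters = [(name, [name]) for name in names]

    def mean_distance(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the two clusters with the smallest mean leaf-to-leaf distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: mean_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical ANIb percentages (not taken from the study).
ani = {
    ("blood_1", "blood_2"): 99.6,
    ("blood_1", "etbf_1"): 98.7,
    ("blood_2", "etbf_1"): 98.8,
    ("blood_1", "fecal_1"): 98.5,
    ("blood_2", "fecal_1"): 98.4,
    ("etbf_1", "fecal_1"): 99.0,
}
# Assumed conversion: distance = 1 - ANI/100.
dist = {frozenset(pair): 1 - value / 100 for pair, value in ani.items()}

tree = upgma(["blood_1", "blood_2", "etbf_1", "fecal_1"], dist)
print(tree)
```

In this toy input the two blood-labelled isolates join first because they share the highest identity, which mirrors the kind of clustering the caption describes for the blood isolates.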
Figure 7. Binary representation of spacer distribution in Type IIIB CRISPR-Cas in blood isolates and other strains with repeat arrays in alternate neighborhoods. The CRISPRs are configured with the oldest spacer at the “top” of the array (i.e., spacer 1, at the left edge of the binary representation); thus the newest spacers are those at the right edge. (A) This is an excerpted panel of the full binary representation of spacer distribution in Supplementary Table 3B. Each unique spacer is represented by a red vertical bar. Isolates from blood are in a red font. A unique pattern of spacers is found in the atypical neighborhood. Four of the blood isolates have no other Type IIIB CRISPR-Cas array and no Type IIIB cas genes. There are no cas genes adjacent to the CRISPR array in the atypical gene neighborhood, but where there is also a Type IIIB CRISPR-Cas array in the typical neighborhood, it is possible that the genes can act in trans on the atypical array. This is not the case for the four blood isolates (in bold red font). Blood isolates Bacteroides sp. UW and BF-DCMSKEJBY18 have two additional spacers not seen in other isolates, but their protospacer targets could not be determined. A cutout of the dendrogram shown in Figure 1 is superimposed to show the phylogenetic relationship of the blood isolates. (B) Type IB spacer distribution in three strains of BF. BF-S13-L11 and BF-1007_1_F-7 are closely related phylogenetically while BF-3998_T-B-3_2 is at the opposite end of the phylogenetic tree; the more ancient part of the CRISPR (i.e., the first spacers) is highly conserved. BF-3998_T-B-3_2 has a longer array with more spacers at the newest edge, indicating that BF-3998_T-B-3_2 acquired more spacers. (C) Type IIIB spacer arrangements in two closely related strains of BF (see Figure 1). This pattern is consistent with the frequently seen homology of CRISPR arrays between highly related strains, with one strain losing two of the internal spacers. Another possible but less likely scenario is that BF-3774_T13 picked up two additional spacers that were not added at the leading edge but at an internal location.
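The binary representation and the internal-spacer comparison in panel (C) can be sketched as follows. This is only an illustration under stated assumptions: each array is given as a list of spacer identifiers ordered oldest first, and the strain labels and spacer identifiers (`strain_A`, `s1`, and so on) are hypothetical rather than taken from Supplementary Table 3B.

```python
def binary_matrix(arrays):
    """Build a presence/absence row per strain over all unique spacers.

    Columns follow the order in which spacers are first seen,
    so the oldest spacers end up toward the left.
    """
    columns = []
    for spacers in arrays.values():
        for spacer in spacers:
            if spacer not in columns:
                columns.append(spacer)
    rows = {
        strain: [1 if col in spacers else 0 for col in columns]
        for strain, spacers in arrays.items()
    }
    return columns, rows


def missing_spacers(shorter, longer):
    """Return spacers of `longer` absent from `shorter`.

    Returns None unless `shorter` is an ordered subsequence of `longer`,
    i.e. the two arrays are consistent with loss (or internal gain) only.
    """
    remaining = iter(longer)
    if not all(spacer in remaining for spacer in shorter):
        return None
    return [spacer for spacer in longer if spacer not in shorter]


# Hypothetical arrays, oldest spacer first.
arrays = {
    "strain_A": ["s1", "s2", "s3", "s4", "s5"],
    "strain_B": ["s1", "s4", "s5"],
}

columns, rows = binary_matrix(arrays)
for strain, row in rows.items():
    print(strain, "".join(str(bit) for bit in row))
print(missing_spacers(arrays["strain_B"], arrays["strain_A"]))
```

The comparison alone cannot tell whether the shorter array lost internal spacers or the longer one gained them; as the caption notes, that distinction rests on which scenario is considered more likely.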
Distribution of CRISPR-Cas Systems in B. fragilis.
| IB | 25 | Alternate | No | 5 | BF 1009 4 F 10 | IIC | 31,8,14 | BF 1007 1 F 10 | ||
| BF 1007 1 F 10 | IB | 15 | BF 3976t7 | ND | 5 | BF 1009 4 F 7 | IIC | 8,7,14,31 | BF 1007 1 F 3 | |
| BF 1007 1 F 3 | IB | 14 | BF 3976t8 | IIIB | 5 | BF 2 078382 3 | IIC | 19 | BF 1007 1 F 4 | |
| BF 1007 1 F 4 | IB | 14 | BF 1007 1 F 10 | IIIB | 15 | BF 2 F 2 4 | IIC | 6 | BF 1007 1 F 5 | |
| BF 1007 1 F 5 | IB | 14 | BF 1007 1 F 3 | IIIB | 15 | BF 2 F 2 5 | IIC | 4,5 | BF 1007 1 F 6 | |
| BF 1007 1 F 6 | IB | 14 | BF 1007 1 F 4 | IIIB | 15 | BF 20656 2 1 | IIC | 15 | BF 1007 1 F 7 | |
| BF 1007 1 F 7 | IB | 14 | BF 1007 1 F 5 | IIIB | 15 | BF 3 F 2 | IIC | 15,15,15 | BF 1007 1 F 8 | |
| BF 1007 1 F 8 | IB | 6,5,(12) | BF 1007 1 F 6 | IIIB | 15 | BF 320 | IIC | 5 | BF 1007 1 F 9 | |
| BF 1007 1 F 9 | IB | 14 | BF 1007 1 F 7 | IIIB | 15 | BF 321 | IIC | 7,6 | BF 1009 4 F 10 | |
| BF 1009 4 F 10 | Truncated IB | 9 | BF 1007 1 F 8 | ND | 15 | BF 322 | IIC | 5 | BF 1009 4 F 7 | |
| BF 1009 4 F 7 | Truncated IB | 9 | BF 1007 1 F 9 | IIIB | 15 | BF 34 F 2 13 | IIC | 6 | BF 2 F 2 4 | |
| BF 3 F 2 #6 | Truncated IB | 8 | BF 3397 N2 | IIIB | 34 | BF 3719 T6 | IIC | 15 | BF 2 F 2 5 | |
| BF 320 | IB | 20 | BF 3397 N3 | IIIB | 35 | BF 3725 D9 ii | IIC | 32 | BF 2 F 2 7 | |
| BF 321 | IB | 20 | BF 3397 T10 | ND | 28 | BF 3774 T13 | IIC | 14 | BF 20793 3 | |
| BF 322 | IB | 20 | BF 3397 T14 | IIIB | 35 | BF 3783N1 2 | IIC | 13 | BF 3 1 12 | |
| BF 3998T B 3 | Truncated IB | 20,1b, inferred | BF 3719A10 | IIIB | 31 | BF 3783N1 6 | IIC | 13 | BF 3 F 2 #6 | |
| BF 3998 T B 4 | ND | 3,1 | BF 3774 T13 | ND | 11 | BF 3783N1 8 | IIC | 13 | BF 3397 N2 | |
| BF DCMOUH0017B | Odd genes | 67 | BF 3783N1 2 | ND | 4 | BF 3783N2 1 | ND | 13 | BF 3397 T14 | |
| BF DCMSKEJBY0001B | IB | 25 | BF 3783N1 6 | IIIB | 6 | BF 3976T7 | ND | 13 | BF 34 F 2 13 | |
| BF HMW 610 | IB | 8,2 | BF 3783N1 8 | IIIB | 8 | BF 3986 N B 19 | IIC | 14 | BF 3774 T13 | |
| BF KLE1758 | Truncated IB | 4,2 | BF 3783N2 1 | ND | 5 | BF 3996 N B 6 | IIC | 9,5,3 | BF 3783N1 6 | |
| BF Korea 419 | Truncated IB | 8 | BF 3986 N B 22 | IIIB | 7 | BF 3998 T B 4 | IIC | 5,5 | BF 3783N1 8 | |
| BF NCTC 9343 | Truncated IB | 8 | BF 3986 N3 | IIIB | 15 | BF 3998T B 3 | IIC | 5 | BF 3783N2 1 | |
| BF S13 L11 | ND | 14 | BF 3986 T B 13 | IIIB | 7 | BF 638R | IIC | 29 | BF 3976T7 | |
| BF S14 | Truncated IB | 8 | BF 3986 T B 9 | IIIB | 7 | BF 86 5443 2 2 | IIC | 9 | BF 3986 N B 22 | |
| BF S38L5 | Truncated IB | 8 | BF 3986T B 10 | ND | 4 | BF BE1 1 | IIC | 11 | BF 3986 N3 | |
| BF YCH46 | IB | 8 | BF 3988 T1 | ND | 7 | BF DCMOUH0042B | IIC | 4 | BF 3986 T B 13 | |
| BF s38L3 | Truncated IB | 8 | BF 3988T B 14 | ND | 4,24 | BF DS 208 | IIC | 9 | BF 3986 T B 9 | |
| BF 3996 NB6 | IB | 20 | BF 884 | ND | 8 | BF DS 71 | IIC | 19 | BF 3986T B 10 | |
| BF 885 | ND | 17 | BF KLE1758 | IIC | 3 | BF 3988 T1 | ||||
| BF 894 | IIIB | 17 | BF NCTC 9343 | IIC | 26 | BF 3988T B 14 | ||||
| BF DCMOUH0018B | IIIB | 4,7 | BF S14 | IIC | 24 | BF A7 UDC12 2 | ||||
| BF DCMOUH0042B | IIIB | 17 | BF S23 R14 | IIC | 9,10,4 | BF B1 UDC16 1 | ||||
| BF DCMOUH0067B | Alternate | No | 4 | BF S23L17 | IIC | 14 | BF BE1 1 | |||
| BF-DCMSKEJBY0001B | Alternate | No | 5 | BF S23L24 | IIC | 14 | BF BF8 | |||
| BF DS 166 | ND | 12,5 | BF S24L15 | IIC | 8 | BF BOB25 | ||||
| BF DS 71 | ND | 1 | BF S24L26 | IIC | 8 | BF CL03T00C08 | ||||
| BF HMW 610 | Alternate | No | 2 | BF S24L34 | IIC | 8 | BF CL03T12C07 | |||
| BF J38 1 | ND | 17 | BF S38L3 | IIC | 19 | BF CL07T00C01 | ||||
| BF Korea 419 | IIIB | 15 | BF S38L5 | IIC | 19 | BF CL07T12C05 | ||||
| BF S13 L11 | ND | 5,9 | BF DCMOUH0018B | |||||||
| BF S14 | IIIB | 22 | BF DCMOUH0042B | |||||||
| BF S23 R14 | ND | 21 | BF DS 166 | |||||||
| BF S23L17 | IIIB | 21 | BF DS 71 | |||||||
| BF S23L24 | IIIB | 21 | BF HMW 616 | |||||||
| BF S36L11 | ND | 24,3 | BF I1345 | |||||||
| BF S36L12 | IIIB | 24,3 | BF JCM 11017 | |||||||
| BF S36L5 | IIIB | 23,3 | BF JIM10 | |||||||
| BF S38L3 | IIIB | 29 | BF KLE1758 | |||||||
| BF S38L5 | ND | 29 | BF Korea 419 | |||||||
| BF S6L3 | Alternate | YES | 1,3 | BF NCTC 9343 | ||||||
| BF S6L5 | ND | 1,3 | BF O:21 | |||||||
| BF S6L8 | ND | 2,4 | BF S14 | |||||||
| BF S6R5 | ND | 1,3 | BF S23L17 | |||||||
| BF s6R6 | IIIB | 1,3 | BF S23L24 | |||||||
| BF s6R8 | ND | 1,3 | BF S23 R14 | |||||||
| BF-S6L5 | Alternate | YES | BF S24L15 | |||||||
| BF-S6L8 | Alternate | YES | BF S24L26 | |||||||
| BF-S6R5 | Alternate | YES | BF S24L34 | |||||||
| BF-S6R6 | Alternate | YES | BF S36L11 | |||||||
| BF-S6R8 | Alternate | YES | BF S36L12 | |||||||
| BF-S36L11 | Alternate | YES | BF S36L5 | |||||||
| BF-S36L12 | Alternate | YES | BF S38L3 | |||||||
| BF-S36L5 | Alternate | YES | BF S38L5 |
Because many of the genomes are not yet assembled, the same CRISPR array may have been identified in different contigs. The length refers to the number of spacers in a particular CRISPR array.
RT-cas: Gene coding for Reverse-transcriptase Cas1 fusion protein.
ND: The contig on which the repeat array was found was very short and no adjacent cas genes could be identified.
Has cas2, cas1, cas4a, and cas7 only; missing the effector cas genes.
BF HMW 610 has large segments of N's in the midst of what appears to be a long, continuous repeat array.
Alternate gene neighborhood in the midst of polysaccharide and other metabolic genes.
Figure 2. CRISPR-Cas systems in B. fragilis. Cas genes found in the respective systems are listed. The canonical arrangement of the closest matching type according to Makarova et al. (2015) is represented by the smaller green arrows below the B. fragilis cas operon. T, Transposase gene. (A) Type IB: The traditional annotation servers identified genes coding for cas2, cas1, cas3, cas4a, and cas6 (TM1814). Three additional genes, classified by all the publicly available annotation sites as hypothetical proteins, have been classified as cas5, cas7, and cas8b6 genes by Makarova et al. (2015); their sequences are very divergent. (B) Type IB, truncated: These truncated CRISPR-Cas systems were located in the same neighborhood as the complete Type IB CRISPR-Cas systems. The cas genes include cas2, truncated cas1 (252 vs. 1014 bp), truncated cas5 (381 vs. 564 bp), and cas6. The cas4, cas7, cas8b6, and cas3 genes were missing in those strains, as were the 5′ and 3′ ends of cas1 and cas5, respectively. They occur in strains BF 1009 4 F 10, BF 1009 4 F 7, BF 3 F 2-6, BF 3998T B 3, BF KLE1758, BF Korea 419, BF NCTC 9343, BF S14, BF S38L5, and BF s38L3. (C) Type IB, BF DCMOUH17B: This array has only some of the genes in the typical Type IB systems. (D) Type IIIB: The BF cas1 gene in the Type IIIB system codes for a reverse transcriptase-Cas1 fusion protein. The other genes present, cmr2-6, are typical of Type IIIB CRISPR-Cas systems, although in the canonical operon the order is somewhat different. (E) Type IIC: The BF Type IIC system includes the canonical cas2, cas1, and csn1 (cas9) genes. Two additional genes coding for hypothetical proteins of unknown function are situated between cas2 and cas1.
Figure 3. Conserved gene neighborhoods of B. fragilis CRISPR-Cas arrays. Proteins coded for by genes are listed below. Vertical red lines represent the repeat array. Associated cas gene products are in bold. A representative strain is shown with the genes surrounding the CRISPR-Cas array. The neighborhoods are highly conserved within each CRISPR-Cas type. The most characteristic cas gene for each group (i.e., A: cas3, B: RT-cas1 fusion, and C: cas9) is underlined.
(A) Conserved gene neighborhood of Type IB CRISPR-Cas system in B. fragilis. Upstream genes: Hypothetical protein;hypothetical protein;Manganese transport protein MntH;Exodeoxyribonuclease III; hypothetical protein; hypothetical protein; hypothetical protein;Translation elongation factor LepA;putative Na+/H+ exchange protein;Na+/H+ antiporter NhaA type;hypothetical protein;DNA recombination protein RmuC; Methionine aminopeptidase; CRISPR-associated protein, TM1814 family ( protein;hypothetical protein;RNA pseudouridylate synthase BT0642; RNA methyltransferase, TrmA family;hypothetical protein;Pyruvate,phosphate dikinase;hypothetical protein; hypothetical protein;Thiamin-phosphate pyrophosphorylase;Sulfur carrier protein adenylyltransferase ThiF;Thiazole biosynthesis protein ThiH; hypothetical protein;Thiamin biosynthesis protein ThiC;Thiazole biosynthesis protein ThiG;Thiamin-phosphate pyrophosphorylase; Sulfur carrier protein ThiS; Superoxide dismutase [Fe];ATP-dependent DNA helicase UvrD/PcrA;Carboxynorspermidine decarboxylase, putative;hypothetical protein;
(B) Conserved gene Neighborhood of Type IIIB CRISPR-Cas system in B. fragilis. Upstream genes: Two-component system sensor histidine kinase; Two-component system response regulator; Outer membrane protein assembly factor YaeT precursor; ABC transporter permease; Probable ABC transporter permease; putative ABC transporter permease; putative ABC transporter permease; ABC transporter, permease protein; ABC transporter, permease protein; ABC transporter, permease protein; ABC transporter ATP-binding protein YvcR; Thiol:disulfide interchange protein; M. jannaschii predicted coding region MJ0978; protein of unknown function DUF88; hypothetical protein; hypothetical protein; hypothetical protein; hypothetical protein; CRISPR-associated RAMP Cmr2; CRISPR-associated RAMP Cmr3; CRISPR-associated RAMP Cmr4; CRISPR-associated RAMP Cmr5; CRISPR-associated RAMP Cmr6; Retron-type RNA-directed DNA polymerase RT-Cas1 fusion protein (underlined). CRISPR Repeat Array. Downstream genes: ISNCY family transposase; ISNCY family transposase; Aminotransferase class II, serine palmitoyltransferase like; Transcription regulator [contains diacylglycerol kinase catalytic domain]; Aspartyl-tRNA synthetase; hypothetical protein; N-carbamoylputrescine amidase/Aliphatic amidase AmiE Agmatine deiminase;Ferredoxin domain containing protein; YbbL ABC transporter ATP-binding protein; YbbM seven transmembrane helix protein; hypothetical protein; Possible glyoxylase family protein (Lactoylglutathione lyase); hypothetical protein; Hypothetical protein YbgI; Hypothetical protein; RND efflux system, outer membrane lipoprotein CmeC; RND efflux system, membrane fusion protein CmeA; RND efflux system, inner membrane transporter CmeB. 
In contrast, Gene Neighborhood of TYPE IIIB CRISPR Repeat Array in Blood Isolates (absence of Lipopolysaccharide biosynthesis protein; hypothetical protein;UDP-N-acetylglucosamine 2-epimerase; glycosyl transferase, group 1 family protein; hypothetical protein; mannosyltransferase B; UDP-N-acetylglucosamine 4,6-dehydratase; UDP-N-acetylglucosamine 2-epimerase; dTDP-4-dehydrorhamnose reductase; Glycosyl transferase, group 1 precursor; UDP-glucose 4-epimerase; Undecaprenyl-phosphate N-acetylglucosaminyl 1-phosphate transferase; TYPE IIIB REPEAT ARRAY in BF BLOOD ISOLATES; Downstream genes: Putative non-specific DNA-binding protein; hypothetical protein; Na+/H+-dicarboxylate symporter; 6-phosphogluconate dehydrogenase, decarboxylating; Glucose-6-phosphate 1-dehydrogenase; 6-phosphogluconolactonase, eukaryotic type; hypothetical protein;
(C) Gene Neighborhood of Type IIC CRISPR-Cas system in B. fragilis. Upstream gene neighborhood: Conserved hypothetical protein;putative transmembrane protein;putative transmembrane DNA mismatch repair-like protein;conserved hypothetical protein;putative urocanate hydratase;putative formimidoyltransferase-cyclodeaminase;putative imidazolonepropionase;putative formiminotransferase-cyclodeaminase;putative histidine ammonia-lyase;putative TetR transcriptional regulator;putative outer membrane efflux protein;putative membrane fusion protein transporter;putative transmembrane Acr-type transport protein;conserved hypothetical protein;putative transmembrane protein;conserved hypothetical protein;putative transmembrane polysaccharide modification protein;hypothetical protein;hypothetical protein;hypothetical protein; (underlined);conserved hypothetical protein (pseudogene); CRISPR REPEAT ARRAY. Downstream: Putative transmembrane protein;conserved hypothetical protein;putative transmembrane MotA/TolQ/ExbB proton channel family protein;conserved hypothetical protein;conserved hypothetical protein;putative TonB-family outer membrane receptor protein;conserved hypothetical protein;putative TPR-repeat family protein;putative ATP-binding component of ABC transporter;putative transmembrane protein;putative GntR family transcriptional regulator;conserved hypothetical protein;conserved hypothetical protein;putative ABC transporter transmembrane component.
Figure 4. Venn diagram of distribution of CRISPR-Cas systems among strains of B. fragilis. The Venn diagrams in Figure 4 include CRISPR-Cas repeats of a given type, whether full or truncated; whether or not there was a full set of adjacent cas genes is detailed in Table 2. (A) Distribution of all CRISPR arrays in BF S38L3, BF S38L5, BF S14. Orphan Type IB Type IIIB-9 Strains: BF 1007 1 F 4, BF 1007 1 F 3, BF Korea 419, BF 1007 1 F 6, BF 1007 1 F 7, BF 1007 1 F 9, BF 1007 1 F 8, BF 1007 1 F 10, BF 1007 1 F 5. Orphan Type IB Type IIC-5 Strains: BF 1009 4 F 7, BF KLE1758, BF 3-F-2 #6, BF NCTC 9343, BF 1009 4 F 10. Orphan Type IIC Type IIIB-10 Strains: BF 3783N1 8, BF 3783N2 1, BF 3976T7, BF DS 71, BF S23L17, BF S23L24, BF S23 R14, BF 3783N1 6, BF DCMOUH0042B, BF 3774 T13. Type IB Type IIIB-4 Strains: BF DCMSKEJBY0001B, BF S13 L11, BF HMW 610 Bacteroides sp. UW. Type IB Type IIC-6 Strains: BF 3996 N B 6, BF 321, BF 322, BF 3998T B 3, BF 320, BF 3998 T B 4. Orphan Type IB-1 Strain: BF YCH46. Type IIC Type IIIB-1 Strain: BF 3783N1 2. Orphan Type IIIB-20 Strains: BF 3986T B 10, BF DCMOUH0018B, BF S6L8, BF S36L12, BF 3986 T B 9, BF S6L3, BF DS 166, BF 3988T B 14, BF S6R8, BF S6L5, BF S36L5, BF 3986 N B 22, BF 3988 T1, BF S6R6, BF 3397 N2, BF S36L11, BF 3397 T14, BF 3986 T B 13, BF S6R5, BF 3986 N3. Orphan Type IIC-7 Strains: BF 34 F 2 13, BF 2 F 2 4, BF 2 F 2 5, BF S24L26, BF S24L34, BF S24L15, BF BE1 1. Type IB-1 Strains: BF DCMOUH0017B. Type IIIB-9 Strains: BF 3397 T10, BF 3976 T8, BF DCMOUH0067B, BF J38 1, BF 894, BF 885, BF 884, BF 3719A10, BF 3397 N3. Type IIC-8 Strains: BF 3719 T6, BF 3725 D9 ii, BF 2 078382 3, BF 20656 2 1, BF DS 208, BF 3986 N B 19, BF 638R, BF 86 5443 2 2. Orphan-16 Strains: BF BOB25, BF A7 UDC12 2, BF CL07T00C01, BF CL03T00C08, BF HMW 616, BF 2 F 2 7, BF 3 1 12, BF CL07T12C05, BF CL03T12C07, BF JCM 11017, BF JIM10, BF, BF8, BF I1345, BF 20793 3, BF B1 UDC16 1, BF O: 21. 
No CRISPR-Cas- 9 Strains: BF CL05T00C42, BF HMW 615, BF DCMOUH0085B, BF Ds 233, BF 3725 D9, BF CL05T12C13, BF J 143 4, BF 4g8B, BF 2d2A. (B) Distribution of CRISPR-Cas systems excluding orphan CRISPR arrays. Type IB Type IIC Type IIIB-3 Strains: BF S38L3, BF S38L5, BF S14. Type IB Type IIIB-13 Strains: BF 1007 1 F 4, BF DCMSKEJBY0001B, BF 1007 1 F 3, BF Korea 419, BF S13 L11, BF 1007 1 F 6, BF 1007 1 F 7, BF HMW 610 Bacteroides sp. UW, BF 1007 1 F 9, BF 1007 1 F 8, BF 1007 1 F 10, BF 1007 1 F 5. Type IB Type IIC-11 Strains: BF 3996 N B 6, BF 321, BF 1009 4 F 7, BF KLE1758, BF 3-F-2 #6, BF 322, BF 3998T B 3, BF 320, BF NCTC 9343, BF 1009 4 F 10, BF 3998 T B 4. Type IIC Type IIIB-11 Strains: BF 3783N1 8, BF 3783N2 1, BF 3976T7, BF DS 71, BF S23L17, BF 3783N1 2, BF S23L24, BF S23 R14, BF 3783N1 6, BF DCMOUH0042B, BF 3774 T13. Type IB-2 Strains: BF YCH46, BF DCMOUH0017B. Type IIIB-29 Strains: BF 3397 T10, BF 3986T B 10, BF DCMOUH0018B, BF 3976 T8, BF DCMOUH0067B, BF S6L8, BF J38 1, BF S36L12, BF 3986 T B 9, BF S6L3, BF DS 166, BF 3988T B 14, BF 894, BF S6R8, BF S6L5, BF S36L5, BF 3986 N B 22, BF 885, BF 3988 T1, BF 884, BF S6R6, BF 3397 N2, BF 3719A10, BF S36L11, BF 3397 N3, BF 3397 T14, BF 3986 T B 13, BF S6R5, BF 3986 N3. Type IIC-15 Strains: BF 34 F 2 13, BF 3719 T6, BF 3725 D9 ii, BF 2 078382 3, BF 20656 2 1, BF DS 208, BF 3986 N B 19, BF 2 F 2 4, BF 638R, BF 2 F 2 5, BF S24L26, BF 86 5443 2 2, BF S24L34, BF S24L15, BF BE1 1. No CRISPR-Cas (with adjacent BF CL05T00C42, BF BOB25, BF A7 UDC12 2, BF CL07T00C01, BF CL03T00C08, BF HMW 615, BF HMW 616, BF 2 F 2 7, BF 3 1 12, BF DCMOUH0085B, BF CL07T12C05, BF CL03T12C07, BF JCM 11017, BF Ds 233, BF JIM10, BF, BF8, BF 3725 D9, BF CL05T12C13, BF J 143 4, BF I1345, BF 4g8B, BF 2d2A, BF 20793 3, BF B1 UDC16 1, BF O: 21.
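The grouping behind a Venn diagram like panel (B) amounts to assigning each strain to the exact combination of CRISPR-Cas types it carries. The sketch below assumes one set of strain names per type as input; it uses only a small subset of the strains listed for panel (B), one per region, so the region sizes here do not reproduce the counts in the figure. The function name `venn_regions` is a hypothetical helper, not part of the study's workflow.

```python
def venn_regions(by_type):
    """Map each exact combination of types to the strains carrying it."""
    all_strains = set().union(*by_type.values())
    regions = {}
    for strain in sorted(all_strains):
        # The key lists every type whose set contains this strain.
        key = tuple(t for t in sorted(by_type) if strain in by_type[t])
        regions.setdefault(key, []).append(strain)
    return regions


# One example strain per region, taken from the panel (B) lists above.
by_type = {
    "IB": {"BF S38L3", "BF YCH46", "BF 321", "BF Korea 419"},
    "IIC": {"BF S38L3", "BF 638R", "BF 321", "BF DS 71"},
    "IIIB": {"BF S38L3", "BF 894", "BF Korea 419", "BF DS 71"},
}

regions = venn_regions(by_type)
for combo, strains in regions.items():
    print(" + ".join(combo), "->", strains)
```

Strains with no detected system fall outside every input set and would need to be tracked separately, as the caption does with its "No CRISPR-Cas" group.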
Figure 5. Relationship of B. fragilis DR (direct repeat) sequence to other sequences in the Direct Repeat Database. CRISPR Map was used to locate the consensus sequences of the four CRISPR-Cas types of B. fragilis. Based on placement within the map shown, Superclass, tentative taxonomy, Cas subtype and Sequence Family were determined (if available). Based on a detailed tree (not shown) of all of the DRs in the database, the closest phylogenetic neighbors were determined. BF1_IB: Superclass: A, Taxonomy: Bacteroidetes, Cas subtype: IB, Sequence Family: 2 (158 bacteria including 2 strains of BF). Nearest phylogenetic neighbors: Phormidium sp. (cyanobacteria living at temperatures of 45–60°C) and Pyrococcus yayanosii (strictly anaerobic, hyperthermophilic archaeon isolated from the deep sea); BF2_IIC: Superclass: -, Taxonomy: Bacteroidetes, Cas subtype IIC, Sequence family: 21 (23 bacteria including 2 DRs of B. fragilis strains). Nearest phylogenetic neighbors: Capnocytophaga, a gram-negative bacterium (Phylum Bacteroidetes, Family Flavobacteriaceae) normally found in the oropharyngeal tract of mammals and involved in pathogenesis of animal bite wounds and periodontal disease. Remarkably, Capnocytophaga carries cfxA and cepA, two β-lactamase genes found in Bacteroides species and responsible for β-lactam resistance in Bacteroides. Phylogenetic analysis indicated that the Cas9 protein was also closely related to the Cas9 proteins found in Capnocytophaga; another close match was to Fluviicola taffensis, a novel freshwater bacterium of the family Cryomorphaceae within the phylum “Bacteroidetes”; BF3_IIIB: Superclass: E; Taxonomy: Firmicutes, Cas subtype IIIA? (based on arrangement of the cas genes, we assigned this DR to CRISPR-Cas Type IIIB). Nearest phylogenetic neighbors: Saprospira grandis, a gram-negative, marine, multicellular, filamentous flexibacterium (phylum Bacteroidetes, Class: Sphingobacteria) known for devouring bacteria (and algae), and Methanococcus vannielii (Superkingdom Archaea, Phylum Euryarchaeota); both (particularly the latter) indicate that the CRISPR may have been horizontally transferred from a phylogenetically distant species; BF4_Orphan: Superclass: D, Taxonomy: Proteobacteria. Nearest phylogenetic neighbors: Fluviicola taffensis, a novel freshwater bacterium (Phylum Bacteroidetes, family Cryomorphaceae) and Ornithobacterium rhinotracheale (Phylum Bacteroidetes, family Flavobacteriaceae), a bacterium found worldwide that causes potentially fatal respiratory disease in poultry.
Figure 6. Consensus direct repeat sequences and predicted fold structure for CRISPR-Cas systems in B. fragilis. The structure is colored by base-pairing probabilities. For unpaired regions the color denotes the probability of being unpaired. A short bar denoting the base-pairing probability is included in the drawing. (A) Type IB Direct Repeat. The centroid secondary structure in dot-bracket notation has a minimum free energy of 0.10 kcal/mol. (B) Type IIIB Direct Repeat. The centroid secondary structure in dot-bracket notation has a minimum free energy of 0.10 kcal/mol. (C) Type IIC Direct Repeat. The centroid secondary structure in dot-bracket notation is very stable, with a minimum free energy of −5.30 kcal/mol. (D) Orphan Direct Repeat. The centroid secondary structure in dot-bracket notation has a minimum free energy of −0.90 kcal/mol.
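For readers unfamiliar with dot-bracket notation, the sketch below shows how such a string maps to base pairs: each "(" pairs with its matching ")", and "." marks an unpaired base. The example structure is hypothetical, since the actual direct repeat structures appear only in the figure, and `base_pairs` is an illustrative helper rather than a function from any folding package.

```python
def base_pairs(structure):
    """Return sorted (i, j) index pairs (0-based) for a dot-bracket string."""
    stack, pairs = [], []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % index)
            pairs.append((stack.pop(), index))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


# Hypothetical hairpin: a 4-bp stem closing a 3-nt loop.
example = "..((((...)))).."
pairs = base_pairs(example)
print(pairs)
print("paired bases:", 2 * len(pairs), "of", len(example))
```

A structure consisting only of dots yields no pairs, which is what a repeat with essentially no stable fold would look like in this notation.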