John F Heidelberg, William C Nelson, Thomas Schoenfeld, Devaki Bhaya.
Abstract
CRISPR arrays and associated cas genes are widespread in bacteria and archaea and confer acquired resistance to viruses. To examine viral immunity in the context of naturally evolving microbial populations, we analyzed genomic data from two thermophilic Synechococcus isolates (Syn OS-A and Syn OS-B′) as well as a prokaryotic metagenome and a viral metagenome derived from microbial mats in hot springs at Yellowstone National Park. Two distinct CRISPR types, distinguished by the repeat sequence, are found in both the Syn OS-A and Syn OS-B′ genomes. The genome of Syn OS-A contains a third CRISPR type with a distinct repeat sequence, which is not found in Syn OS-B′ but appears to be shared with other microorganisms that inhabit the mat. The CRISPR repeats identified in the microbial metagenome are highly conserved, while the spacer sequences (hereafter referred to as "viritopes" to emphasize their critical role in viral immunity) were mostly unique and had no high-identity matches when searched against GenBank. Searching the viritopes against the viral metagenome, however, yielded several matches with high similarity, some of which were within a gene identified as a likely viral lysozyme/lysin protein. Analysis of viral metagenome sequences corresponding to this lysozyme/lysin protein revealed several mutations, all of which translate into silent or conservative changes that are unlikely to affect protein function but may help the virus evade the host CRISPR resistance mechanism. These results demonstrate the varied challenges presented by a natural virus population, and support the notion that the CRISPR/viritope system must be able to adapt quickly to provide host immunity. The ability of metagenomics to track population-level variation in viritope sequences allows for a culture-independent method for evaluating the fast co-evolution of host and viral genomes and its consequences for the structuring of complex microbial communities.
Year: 2009 PMID: 19132092 PMCID: PMC2612747 DOI: 10.1371/journal.pone.0004169
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
CRISPR repeats in Synechococcus OS-A and OS-B′.
| Repeat Type | # Repeats | Repeat consensus | Start Position | End Position | CRISPR_ID |
| Syn OS-A (all arrays) | 97 | | | | |
| IA* | 42 | | 889207 | 892334 | NC_007775_2 |
| IB | 9 | | 1139963 | 1140602 | NC_007775_3 |
| IIA* | 12 | | 2557478 | 2559420 | NC_007775_9, NC_007775_10 |
| IIB | 7 | | 1260140 | 1260640 | NC_007775_4 |
| IIC | 2 | | 1860020 | 1860135 | NC_007775_5 |
| IID | 2 | | 1960215 | 1960325 | NC_007775_6 |
| IIE | 16 | | 2327972 | 2329201 | NC_007775_7 |
| III* | 7 | | 733236 | 733709 | NC_007775_1 |
| Syn OS-B′ (all arrays) | 125 | | | | |
| IA* | 9 | | 604640 | 605267 | NC_007776_4 |
| IB | 16 | | 1428062 | 1429246 | NC_007776_8 |
| IIA* | 16 | | 866037 | 867185 | NC_007776_5 |
| IID | 35 | | 156596 | 159176 | NC_007776_1 |
| IIF | 18 | | 515271 | 516591 | NC_007776_3 |
| IIG | 31 | | 2016367 | 2018657 | NC_007776_9 |
The table shows the CRISPR repeat type, the number of repeats, the repeat consensus sequence, the start and end positions of the array on the genome, and the CRISPR ID assigned by CRISPRdb.
From CRISPRdb [30]. All sequence symbols follow the IUPAC Nucleotide Symbol code. G Guanine; A Adenine; T Thymine; C Cytosine; R G or A; Y T or C; M A or C; K G or T; S G or C; W A or T; H A or C or T; B G or T or C; V G or C or A; D G or A or T; N G or A or T or C. The letter O represents a gap found in one or more of the sequences.
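As an illustration of how the IUPAC symbols listed above can be applied, the following minimal Python sketch checks whether a concrete sequence is compatible with a degenerate consensus. The function name `matches_consensus` and the example consensus `GTYRAC` are hypothetical and are not taken from the table; the gap symbol O is deliberately not handled and simply fails to match.

```python
# IUPAC nucleotide codes, as listed in the table footnote.
IUPAC = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT", "H": "ACT", "B": "GTC",
    "V": "GCA", "D": "GAT", "N": "GATC",
}


def matches_consensus(consensus: str, sequence: str) -> bool:
    """Return True if each base of `sequence` is allowed by the IUPAC
    symbol at the same position of `consensus`.

    Symbols outside the table (including the gap letter 'O') match nothing.
    """
    if len(consensus) != len(sequence):
        return False
    return all(
        base in IUPAC.get(symbol, "")
        for symbol, base in zip(consensus.upper(), sequence.upper())
    )


# Hypothetical consensus: Y allows T or C, R allows G or A.
print(matches_consensus("GTYRAC", "GTCGAC"))  # True
print(matches_consensus("GTYRAC", "GTAGAC"))  # False (A is not allowed by Y)
```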
Figure 1. Schematic location of CRISPR loci on the Syn OS-A and Syn OS-B′ genomes.
Bars indicate the relative positions of the Type I (red), Type II (green) and Type III (blue) CRISPR loci on the genomes of Syn OS-A (top) and Syn OS-B′ (bottom). Asterisks indicate CRISPRs with an associated cas operon. CRISPRs within syntenic genome blocks are connected with lines. The approximate location and direction of the mapped clone-mates of CRISPR-containing clones are shown as triangles colored by repeat type (as above). Two clones that mapped to a location on a reference genome that lacks a CRISPR array are shown within the genome box. The line below shows genome size (large ticks at 1 Mbp; small ticks at 0.5 Mbp).
Figure 2. Type I CRISPR regions.
A) CRISPR-IA region from Syn OS-A (top) and Syn OS-B′ (bottom). The cas gene cluster is indicated by a grey bracket. B) CRISPR-IB region. Gene identifiers are shown above or below each gene, excluding the GenBank locus tag prefix ‘CYA_’ (for Syn OS-A) or ‘CYB_’ (for Syn OS-B′). Genes are color-coded following the COG-based IMG convention (http://img.jgi.doe.gov; described in Figure 4), except that genes annotated as “hypothetical protein” are in grey, genes with no putative ortholog in the other genome are in red, and genes annotated as transposases are in black. Orthologous genes are indicated by yellow blocks. Detailed information on each gene is available in Table S3.
Figure 4. Comparison of Type III CRISPR locus to Roseiflexus RS-1 and Symbiobacterium.
Homologous regions from Syn OS-A, Syn OS-B′, Roseiflexus RS-1 and Symbiobacterium are displayed. Gene identifiers are shown above or below each gene, excluding the GenBank locus tag prefix ‘CYA_’ (for Syn OS-A), ‘CYB_’ (for Syn OS-B′), ‘RoseRS_’ (for Roseiflexus RS-1) or ‘STH’ (for Symbiobacterium thermophilum). Additional figure conventions are as described in Figure 2. Details about the genes can be found in Table S3.
Figure 3. Type II CRISPR regions.
Homologous regions from Syn OS-A and Syn OS-B′ in which one or both genomes have a Type II CRISPR locus are displayed. The CRISPR displayed is indicated by the panel letter (e.g. panel ‘A’ shows CRISPR-IIA, panel ‘B’ shows CRISPR-IIB, etc.). Figure conventions are as described in Figure 2. FS indicates a frameshift in the CDS. Additional details can be found in Table S3.
Virome sequences showing silent or conservative changes relative to the viritope Cluster_2_YMBCR81TF-SP-2 sequence.
| Sequence identifier | Nucleic Acid Sequence | % NAID | Predicted AA Sequence | % AASIM/AAID |
| Viritope | | | FTLKWEGGFVHH | |
| Y_1647173811 | | 95 | ............ | 100/100 |
| Y_1647170208 | | 95 | ............ | 100/100 |
| Y_1647175271 | | 92 | ............ | 100/100 |
| Y_1647164556 | | 92 | ............ | 100/100 |
| Y_1647162994 | | 92 | ........Y... | 100/91 |
| B_1647183951-RC | | 70 | ...R....Y.N. | 100/75 |
| B_1647183950 | | 70 | ...R....Y.N. | 100/75 |
| B_1647178644 | | 70 | ...R....Y.N. | 100/75 |
| B_1647170869 | | 70 | ...R....Y.N. | 100/75 |
| B_1647170868-RC | | 70 | ...R....Y.N. | 100/75 |
| B_1647166765 | | 70 | ...R....Y.N. | 100/75 |
| B_1647165544-RC | | 70 | ...R....Y.N. | 100/75 |
| P_1647167058 | | 86 | ............ | 100/100 |
| P_1647166181 | | 80 | ........Y.N. | 100/83 |
| P_1647164597 | | 80 | ........Y.N. | 100/83 |
| P_1647164475 | | 80 | ........Y.N. | 100/83 |
| P_1647164033 | | 80 | ........Y.N. | 100/83 |
| P_1647163511 | | 80 | ........Y.N. | 100/83 |
| P_1647173178 | | 75 | ........Y.N. | 100/83 |
| P_1647166871 | | 73 | ........Y.N. | 100/83 |
| P_1647162773 | | 73 | ........Y.N. | 100/83 |
| P_1647183575 | | 70 | ...R....Y.N. | 100/75 |
| P_1647183574 | | 70 | ...R....Y.N. | 100/75 |
Nucleic acid sequences: the viritope sequence is in line 1, and matching sequences from the virome are shown below it. Identical nucleotides are shown as a “.”. Nucleotides in regular and bold text indicate synonymous and non-synonymous changes, respectively, relative to the viritope sequence. Nucleotides in italics indicate SNPs that cannot be assigned a specific amino acid translation: the sequences were translated in the +3 frame, so the first base of the codon is missing. Sequence identifier prefixes indicate the method used to find each sequence: ‘Y’–virome sequence with high NAID to CRISPR_II_YMBCR81TF-SP-2; ‘B’–virome sequence with high NAID to CRISPR_II_YMIA938TF-SP-4/5; ‘P’–amino acid similarity to translations of ‘Y’ and ‘B’ sequences. ‘RC’ indicates that the sequence shown is the reverse complement of the database entry. The NAID of each virome sequence to the viritope is shown. Predicted amino acid (AA) sequences: the viritope and virome sequences were translated starting from nucleotide position 3. Percent similarity (SIM) and identity (ID) of the amino acids relative to the translated viritope are shown at the extreme right. Identical amino acids are shown as a “.”. AAs that differ from the viritope AA sequence are shown, and represent conservative changes.
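The dot notation and the identity column described above can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' procedure: the helper names `expand_dots` and `percent_identity` are hypothetical, and percent similarity (AASIM) is not computed because it would require a substitution matrix that the table note does not specify.

```python
def expand_dots(reference: str, row: str) -> str:
    """Replace each '.' in `row` with the residue at the same position
    of `reference`, giving the full sequence for that row."""
    if len(reference) != len(row):
        raise ValueError("row and reference must have equal length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


def percent_identity(reference: str, row: str) -> float:
    """Percentage of positions where the expanded row equals the reference."""
    expanded = expand_dots(reference, row)
    same = sum(a == b for a, b in zip(reference, expanded))
    return 100.0 * same / len(reference)


# Translated viritope and the four distinct AA patterns from the table.
VIRITOPE_AA = "FTLKWEGGFVHH"
for row in ("............", "........Y...", "........Y.N.", "...R....Y.N."):
    # int() truncates, which agrees with the AAID values listed in the table.
    print(expand_dots(VIRITOPE_AA, row), int(percent_identity(VIRITOPE_AA, row)))
```

With the table's values, one, two and three substitutions in the 12-residue peptide give 11/12, 10/12 and 9/12 identical positions, which truncate to the listed 91, 83 and 75.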
CRISPR viritopes with similarity to virome sequences.
| gnl|ti|1647165544-RC | |
| gnl|ti|1647183951-RC | |
| gnl|ti|1647183950 | |
| gnl|ti|1647178644 | |
| gnl|ti|1647166765 | |
| gnl|ti|1647170869 | |
| gnl|ti|1647170868-RC | |
| CRISPR_II_YMIA938TF-SP-5 | |
| CRISPR_II_YMIA938TF-SP-4 | |
| = = = = = = = = = = = | |
| gnl|ti|1647165465 | |
| CRISPR_III_CYNAC89TF-SP-5 | |
| = = = = = = = = = = = | |
| gnl|ti|1647175271 | |
| gnl|ti|1647164556 | |
| gnl|ti|1647173811 | |
| gnl|ti|1647170208 | |
| gnl|ti|1647162994 | |
| CRISPR_II_YMBCR81TF-SP-2 | |
| = = = = = = = = = = = | |
| gnl|ti|1647168791 | |
| gnl|ti|1647168790 | |
| CRISPR_I_CYPLU89TR-SP-2 | |
| CRISPR_I_CYPLU89TR-SP-3 | |
| CRISPR_II_CYPBU82TR-SP-5 | |
| CRISPR_II_YMJAO09TR-SP-3 | |
Segments of virome sequence that align to viritope sequences are displayed. Bold nucleotides indicate SNPs relative to the viritope sequence. RC indicates that the sequence shown is a reverse complement of the original virome sequence.
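To make the alignment conventions above concrete, the following Python sketch reverse-complements a read and reports the positions at which the best ungapped match differs from a viritope, that is, the SNPs that would be shown in bold. It is an illustration only: the function names and the toy sequences are hypothetical, and a simple sliding-window comparison is used here in place of whatever search tool produced the table.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_ungapped_hit(viritope: str, read: str):
    """Slide `viritope` along `read` without gaps.

    Returns (offset, mismatch_positions) for the window with the fewest
    mismatches, or None if the read is shorter than the viritope.
    Mismatch positions are 0-based indices into the viritope.
    """
    best = None
    for offset in range(len(read) - len(viritope) + 1):
        window = read[offset:offset + len(viritope)]
        mismatches = [i for i, (a, b) in enumerate(zip(viritope, window)) if a != b]
        if best is None or len(mismatches) < len(best[1]):
            best = (offset, mismatches)
    return best


# Toy data (hypothetical): the read carries one SNP relative to the viritope.
viritope = "ACGTTG"
read = "TTACGATGCC"
print(best_ungapped_hit(viritope, read))  # (2, [3])

# A read stored on the opposite strand is reverse-complemented first,
# which corresponds to the 'RC' suffix in the table.
stored = reverse_complement(read)
print(best_ungapped_hit(viritope, reverse_complement(stored)))  # (2, [3])
```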
Figure 5. Example of a putative gene encoding a lysozyme derived from the virome.
The locations of the CRISPR viritopes are highlighted. The yellow highlighted region matches CRISPR_II_YMBCR81TF-SP-2 and the bold highlighted region matches CRISPR_II_YMIA938TF-SP-4/5.