Sebastián Aguilar Pierlé, Michael J Dark, Dani Dahmen, Guy H Palmer, Kelly A Brayton.
Abstract
BACKGROUND: The Order Rickettsiales includes important tick-borne pathogens, from Rickettsia rickettsii, which causes Rocky Mountain spotted fever, to Anaplasma marginale, the most prevalent vector-borne pathogen of cattle. Although most pathogens in this Order are transmitted by arthropod vectors, little is known about the microbial determinants of transmission. A. marginale provides unique tools for studying the determinants of transmission, with multiple strain sequences available that display distinct and reproducible transmission phenotypes. The closed core A. marginale genome suggests that any phenotypic differences are due to single nucleotide polymorphisms (SNPs). We combined DNA/RNA comparative genomic approaches using strains with different tick transmission phenotypes and identified genes that segregate with transmissibility.
Year: 2012 PMID: 23181781 PMCID: PMC3542260 DOI: 10.1186/1471-2164-13-669
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. SNPs segregated with transmission status through whole genome comparison and targeted sequencing. A. Genome-wide comparison of the non-transmissible Florida strain (red) with the efficiently transmitted St. Maries strain (green) produced 9609 SNPs. From this list we subtracted SNPs that encode synonymous changes, leaving two types of SNPs that were further characterized: those that resulted in non-synonymous (NS) amino acid changes within ORFs and those located in putative promoter regions. These SNPs were then compared with the genome sequences of three tick-transmissible strains, and SNPs that consistently segregated with phenotype were retained. The remaining differences were then examined by targeted sequencing in two additional efficiently transmissible strains. B. A total of 9609 SNPs were found between the transmissible St. Maries strain and the non-transmissible Florida strain (SNPs). This comparison found 4498 non-synonymous SNPs (shown in black), 1630 SNPs within putative promoter regions (shown in dark grey), and synonymous SNPs (shown in light grey). Whole genome comparison with three transmissible strains allowed removal of 4127 non-synonymous SNPs and 1568 promoter SNPs from further consideration. Finally, targeted sequencing of 241 non-synonymous and 62 promoter SNPs in additional transmissible strains allowed retention of 35 NS and 14 promoter SNPs as candidate SNPs involved in tick transmission.
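The segregation step described in Figure 1 can be illustrated with a minimal sketch. It is not the authors' pipeline: the positions, alleles, and the strain names `strain_X` and `strain_Y` are hypothetical, and the rule used here (every additional transmissible strain must carry the St. Maries allele) is a simplifying assumption about what "segregated with phenotype" means.

```python
# Toy illustration of filtering SNPs that segregate with transmission status.
# All positions, alleles, and the strain names strain_X / strain_Y are hypothetical.
NON_TRANSMISSIBLE = "Florida"
REFERENCE_TRANSMISSIBLE = "St. Maries"


def segregating_snps(alleles, other_transmissible):
    """Return positions where Florida differs from St. Maries and every
    other transmissible strain carries the St. Maries allele (assumed rule)."""
    kept = []
    for pos, calls in sorted(alleles.items()):
        fl = calls[NON_TRANSMISSIBLE]
        stm = calls[REFERENCE_TRANSMISSIBLE]
        if fl == stm:
            continue  # not a SNP between the two reference strains
        if all(calls[strain] == stm for strain in other_transmissible):
            kept.append(pos)
    return kept


toy = {
    101: {"Florida": "A", "St. Maries": "G", "strain_X": "G", "strain_Y": "G"},
    202: {"Florida": "C", "St. Maries": "T", "strain_X": "C", "strain_Y": "T"},
    303: {"Florida": "T", "St. Maries": "T", "strain_X": "T", "strain_Y": "T"},
}
candidates = segregating_snps(toy, ["strain_X", "strain_Y"])
print(candidates)
```

In this toy input only position 101 is retained: position 202 is dropped because one transmissible strain shares the Florida allele, and position 303 is not a SNP between the two reference strains.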
Figure 2. Location of candidate SNPs on the Florida strain genome. This circular representation of the Florida genome shows annotated CDSs in light blue; the outer circle represents CDSs on the forward strand and the inner circle represents CDSs on the reverse strand. The 9609 SNPs found between the St. Maries and Florida strain genomes are shown in grey. The elements in light green are miscellaneous features annotated in the genome. The innermost circle shows the 49 candidate SNPs found through comparative genomics. Red bars show the position of candidate non-synonymous SNPs within CDSs. Dark blue bars show candidate SNPs found within putative promoter regions.
Figure 3. RNA-seq and qPCR confirm trends in transcriptional changes between strains that differ in their tick transmission status. A. Fold change in the transmissible St. Maries strain relative to the non-transmissible Florida strain for all promoter candidates, expressed on a log10 scale. Locus tags for all genes are given on the X axis. Blue bars show the results obtained after evaluating two biological replicates with RT-PCR. Red bars show the fold change obtained using RNA-seq analysis for the promoter candidates across two biological replicates. The asterisk indicates statistical significance at p < 0.05. B. Fold change in the transmissible St. Maries strain relative to the non-transmissible Florida strain, expressed on a log10 scale. The top 18 differentially transcribed genes identified through RNA-seq across two replicates and two statistical tests are shown with their fold changes. Red bars show results obtained with RNA-seq; blue bars show validation through qPCR. The asterisk indicates statistical significance at p < 0.01.
Reads mapped to from three sequencing platforms
| STM | 726,051 | 269,730 | 37.1 | 381.7 |
| FL | 1,018,447 | 326,440 | 32.0 | 396.5 |
| STM | 1,004,747 | 295,629 | 29.4 | 111.0 |
| FL | 2,043,607 | 577,284 | 28.2 | 111.2 |
| STM | 88,650,713 | 4,604,993 | 5.2 | 100 |
| FL | 81,507,967 | 3,845,853 | 4.7 | 100 |
STM: St. Maries strain.
FL: Florida strain.
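The column headers of the table above did not survive extraction. As a hedged consistency check, the sketch below assumes that the second and third columns are total and mapped read counts and that the fourth column is the percentage of reads mapped; those column meanings are an inference from the numbers, not something stated in the record.

```python
# Consistency check under ASSUMED column meanings:
# (strain, total reads, mapped reads, reported percent mapped).
rows = [
    ("STM", 726_051, 269_730, 37.1),
    ("FL", 1_018_447, 326_440, 32.0),
    ("STM", 1_004_747, 295_629, 29.4),
    ("FL", 2_043_607, 577_284, 28.2),
    ("STM", 88_650_713, 4_604_993, 5.2),
    ("FL", 81_507_967, 3_845_853, 4.7),
]


def percent_mapped(total_reads, mapped_reads):
    """Mapped reads as a percentage of total reads."""
    return 100.0 * mapped_reads / total_reads


# Largest absolute gap between the computed and the reported percentage.
max_gap = max(abs(percent_mapped(total, mapped) - reported)
              for _, total, mapped, reported in rows)
print(f"largest gap vs. reported percentage: {max_gap:.3f}")
```

Under that assumption every reported percentage agrees with the ratio of the two count columns to within 0.1 percentage points, which supports the reading but does not confirm it.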
Percentage of putative 5′ UTRs according to length
| 25.68 | 63.30 | 11.02 | |
| 27.15 | 48.57 | 24.28 |
5′ UTR within predicted CDS: this column lists cases where transcript mapping shows that the 5′ UTR and TSS lie within the previously predicted and annotated CDS, indicating that the previous annotation was incorrect.
Previously unannotated areas that exhibited high transcriptional activity
| 543 | hypothetical protein AmarV_01231 [Anaplasma marginale str. Virginia] | AM294 pep1 | AM259 thiD | 0.7 | |
| 644 | n/a | AM380 | AM382 | 23.6* | |
| 976 | DNA-binding protein HU [Anaplasma phagocytophilum HZ] | AM434 pdxJ | AM435 | 1.1 | |
| 441 | hypothetical protein AmarM_02282 [Anaplasma marginale str. Mississippi] | AM504 | tRNA-Asn-1 | 1.3 | |
| 335 | hypothetical protein PseS9_19739 [Pseudomonas sp. S9] | AM969 bioB | AM973 purL | 1.5 | |
| 577 | hypothetical protein AmarM_05569 [Anaplasma marginale str. Mississippi] | AM1214 polA | AM1216 | 6.9* |
Base pair positions spanned by the newly identified regions in the St. Maries genome.
*Statistically significant fold changes are indicated with an asterisk.
Figure 4. Newly identified transcriptionally active regions of the genome. Mapping of cDNA reads to the A. marginale genome allowed us to detect regions without previous annotation that exhibited transcriptional activity. A. The region of the St. Maries genome that spans bp 336042 to 336685. Three different gene identification algorithms did not detect a CDS that would span the length of the transcript. The top panel shows the six reading frames, which contain forty-five stop codons (shown as black bars). The bottom panel shows some of the mapped cDNA reads in green and red (indicating the direction of the read). The grey histogram under the reads represents read depth. This transcript was up-regulated in the St. Maries strain by a fold change of 23.7 at p < 1E-10. B. The region of the St. Maries genome that spans bp 1084944 to 1085520. The newly identified gene is found between genes polA (not shown) and AM1216. One ORF on the leading strand appears to span the length of this transcript and is shown as PUTATIVE_CDS in this figure. This new gene was up-regulated in the transmissible St. Maries strain by a fold change of 6.9 at p < 1E-10.
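The coordinates in this caption can be related to region lengths with a short calculation. The sketch assumes the positions are 1-based and inclusive, which is a common convention for genome annotations but is not stated in the record; the helper name `inclusive_length` is illustrative.

```python
def inclusive_length(start_bp, end_bp):
    """Length of a region given 1-based, inclusive coordinates (assumed convention)."""
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    return end_bp - start_bp + 1


region_a = inclusive_length(336042, 336685)    # panel A coordinates
region_b = inclusive_length(1084944, 1085520)  # panel B coordinates
print(region_a, region_b)  # 644 577
```

Under this assumption the two regions are 644 bp and 577 bp long, which matches two of the values in the first column of the table of previously unannotated regions.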
Figure 5. Whole genome comparison of transcriptional activity in the St. Maries and Florida strains. RPKM values are shown for 955 genes found in the Florida strain genome of A. marginale. RPKM values are shown on the Y axis, and features are arranged from left to right on the X axis as they appear in the genome. The normalized RPKM values were plotted for each strain. RPKM values for the transmissible St. Maries strain are shown in red in the upper part of the graph; values for the non-transmissible Florida strain are shown in light blue in the lower part of the graph. RPKM values for the Florida strain are plotted on the opposite side of the X axis for ease of comparison; they do not represent negative values. Ribosomal RNA (rRNA) genes were excluded from this comparison.
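For readers unfamiliar with the unit, RPKM (reads per kilobase of transcript per million mapped reads) follows the standard definition sketched below. The input numbers are hypothetical, and this is the generic formula rather than the specific normalization procedure the authors applied.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads (standard definition)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # Dividing by (length / 1e3) and (total / 1e6) is the same as multiplying by 1e9.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 1,000 bp gene in a library of 1,000,000 mapped reads.
example_rpkm = rpkm(500, 1_000, 1_000_000)
print(example_rpkm)  # 500.0
```

Because the value is scaled by both gene length and library size, RPKM allows genes of different lengths and libraries of different depths to be compared on one axis, as in this figure.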
Candidate genes involved in transmission phenotype segregated by polymorphisms and differential transcription
| AMF_433 | AM579 | Hypothetical protein | 1 | 0 | Gene | AP, ER, ECh | TM | |
| AMF_432 | AM579 | Hypothetical protein | 1 | 0 | Gene | AP, ER, ECh, ECa | - | |
| AMF_431 | AM580 | Hypothetical protein | 3 | 0 | Gene | AP, ER, ECh, ECa | TM | |
| AMF_430 | AM576 | Hypothetical protein | 22 | 0 | Gene | AP, ER, ECh, ECa | TM/DS | |
| AMF_429 | AM574 | Hypothetical protein | 15 | 0 | Gene | AP, ER, ECh | TM | |
| AMF_798 | AM1055 | Hypothetical protein | 9 | 0 | Gene | AP, ER, ECh, ECa | TM/SP | |
| AMF_879 | AM1165 | Hypothetical protein | 14 | 0 | Gene | - | TM | |
| AMF_878 | AM1164 | Outer membrane protein 4 | 2 | 0 | Gene | AP | TM/SP | |
| AMF_401 | AM540 | Hypothetical protein | 12 | 0 | Gene/Prom | - | TM | |
| AMF_258 | AM347 | Hypothetical protein | 19 | 0 | Gene | AP | TM | |
| AMF_474 | AM632 | Ribosome-associated inhibitor A | 1 | 1 | Promoter | ER, ECh, ECa | - | |
| AMF_553 | AM748 | NADH Dehydrogenase I chain J | 1 | 1 | Promoter | AP, ER, ECh, ECa | TM | |
| AMF_793 | AM1048 | Hypothetical protein | 77 | 2 | Gene/Prom | AP, ER, ECh, ECa | TM/SP | |
| AMF_1026 | AM1352 | Hypothetical protein | 18 | 5 | Gene/Prom | AP, ER, ECh | TM/DS | |
| AMF_051 | AM071 | Hypothetical protein | 5 | 1 | Gene | AP, ER, ECh, ECa | TM | |
| AMF_197 | AM265 | Hypothetical protein | 19 | 1 | Gene | AP, RB | TM | |
| AMF_264 | AM354 | Hypothetical protein | 21 | 2 | Gene | ER, ECh, ECa | TM/DS | |
| AMF_265 | AM356 | Hypothetical protein | 65 | 1 | Gene | AP | - | |
| AMF_269 | AM368 | Hypothetical protein | 43 | 1 | Gene | RB | TM/DS | |
| AMF_480 | AM644 | DNA gyrase B | 14 | 1 | Gene | AP, ER, ECh, ECa, RB, RC, RR | TM | |
| AMF_530 | AM712 | Hypothetical protein | 138 | 1 | Gene | - | TM | |
| AMF_518 | AM689 | Hypothetical protein | 20 | 1 | Gene | AP | - | |
| AMF_547 | AM742 | Hypothetical protein | 1 | 1 | Gene | AP, ER, ECh, ECa | DS | |
| AMF_613 | AM823 | Hypothetical protein | 17 | 4 | Gene | AP, ER, ECh, ECa | - | |
| AMF_703 | AM919 | Hypothetical protein | 1 | 1 | Gene | AP, ER, ECh, ECa, RB, RC, RR | TM | |
| AMF_762 | AM1001 | methionyl-tRNA synthetase | 23 | 1 | Gene | AP, ER, ECh, ECa, RB, RC, RR | TM/DS | |
| AMF_764 | AM1005 | aspartokinase | 13 | 1 | Gene | AP, ER, ECh, ECa | TM/DS | |
| AMF_824 | AM1091 | D-Ala-D-Ala carboxypeptidase | 33 | 6 | Gene | AP, ER, ECh, ECa, RB, RC, RR | TM/SP | |
| AMF_893 | AM1183 | lipoprotein-releasing transmembrane protein | 8 | 3 | Gene | ER, ECh, ECa | TM/DS | |
| AMF_1037 | AM345 | Hypothetical protein | 4 | 1 | Gene | - | - |
FL: locus tag in the Florida strain; St.M: locus tag in the St. Maries strain.
AP: Anaplasma phagocytophilum, ER: Ehrlichia ruminantium, ECh: Ehrlichia chaffeensis, ECa: Ehrlichia canis, RB: Rickettsia bellii, RC: Rickettsia conorii, RR: Rickettsia rickettsii.
BI: Bioinformatics; TM: transmembrane domain; SP: signal peptide; Prom: promoter; DS: deleterious substitution.