Ufuk Nalbantoglu, Khalid Sayood, Michael P Dempsey, Peter C Iwen, Stephen C Francesconi, Ravi D Barabote, Gary Xie, Thomas S Brettin, Steven H Hinrichs, Paul D Fey.
Abstract
Francisella tularensis subspecies tularensis consists of two separate populations A1 and A2. This report describes the complete genome sequence of NE061598, an F. tularensis subspecies tularensis A1 isolated in 1998 from a human with clinical disease in Nebraska, United States of America. The genome sequence was compared to Schu S4, an F. tularensis subspecies tularensis A1a strain originally isolated in Ohio in 1941. It was determined that there were 25 nucleotide polymorphisms (22 SNPs and 3 indels) between Schu S4 and NE061598; two of these polymorphisms were in potential virulence loci. Pulsed-field gel electrophoresis analysis demonstrated that NE061598 was an A1a genotype. Other differences included repeat sequences (n = 11 separate loci), four of which were contained in coding sequences, and an inversion and rearrangement probably mediated by insertion sequences and the previously identified direct repeats I, II, and III. Five new variable-number tandem repeats were identified; three of these five were unique in NE061598 compared to Schu S4. Importantly, there was no gene loss or gain identified between NE061598 and Schu S4. Interpretation of these data suggests there is significant sequence conservation and chromosomal synteny within the A1 population. Further studies are needed to determine the biological properties driving the selective pressure that maintains the chromosomal structure of this monomorphic pathogen.
Year: 2010 PMID: 20140244 PMCID: PMC2815774 DOI: 10.1371/journal.pone.0009007
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Genomic characteristics of F. tularensis subsp. tularensis NE061598.
| Length (bp) | 1892681 |
| GC Content (%) | 32.26 |
| Total Genes | 1850 |
| Protein Coding Genes | 1601 |
| Genes Assigned Function | 1185 |
| Hypothetical proteins | 416 |
| Disrupted ORFs | 201 |
| Large Duplicated Regions | 2 |
| Transposons (IS elements) | 75 |
| tRNA | 38 |
| rRNA | 10 |
| sRNA | 2 |
| Average Gene Length (nt) | 1068 |
| Percent Coding | 90.40% |
VNTR markers and their differences between Schu S4 and NE061598.
| VNTR | Repeat motif | Repeat size (nt) | Genomic Location | Repeat copy no., strain SCHU S4 | Repeat copy no., strain NE061598 |
| Ft-M1 | AAT | 3 | I (−76) | 3 | 3 |
| Ft-M2 | TAAATA | 6 | G (+12) | 4 | 5 |
| Ft-M3 | AATAAGGAT | 9 | G (+1401) | 25 | 20 |
| Ft-M4 | TTGTT | 5 | G (+55) | 3 | 3 |
| Ft-M5 | | 16 | I (−21) | 3 | 2 |
| Ft-M6 | | 21 | G (+1160) | 4 | 5 |
| Ft-M7 | | 16 | I (−21) | 4 | 4 |
| Ft-M8 | | 16 | I (−21) | 4 | 4 |
| Ft-M9 | | 16 | I (−21) | 4 | 9 |
| Ft-M10 | | 16 | I (−21) | 18 | 8 |
| Ft-M11 | | 10 | I (−113) | 5 | 5 |
| Ft-M12 | | 10 | I (−113) | 2 | 2 |
| Ft-M13 | | 12 | G (+1174) | 2 | 2 |
| Ft-M14 | TCATTA | 6 | G (+67) | 3 | 3 |
| Ft-M15 | ATACTT | 6 | G (+32) | 2 | 2 |
| Ft-M16 | | 10 | I (+551) | 2 | 2 |
| Ft-M17 | TATTTA | 6 | G (+484) | 3 | 3 |
| Ft-M18 | CATTAA | 6 | I (−52) | 4 | 4 |
| Ft-M19 | | 13 | I (−20) | 2 | 2 |
| Ft-M20 | | 12 | G (+1964) | 3 | 3 |
| Ft-M21 | TCAATTA | 7 | G (+586) | 3 | 4 |
| Ft-M22 | AAAAAT | 6 | G (+2254) | 2 | 2 |
| Ft-M23 | | 23 | I (+1864) | 2 | 2 |
| Ft-M24 | | 21 | I (−93) | 1 | 1 |
| Ft-M25 | GT | 2 | G (+525) | 5 | 5 |
| VNTR-1 | CAAAGACA | 8 | I (−392) | 1 | 3 |
| VNTR-2 | | 11 | I (−42) | 3 | 2 |
| VNTR-3 | GAAAATAA | 8 | G (+282) | 1 | 2 |
| VNTR-4 | | 16 | I (+22) | 2 | 3 |
| VNTR-5 | | 29 | I (−32) | 1 | 1 |
Ft-M1 through Ft-M25 are VNTR markers previously reported by Johansson et al. [7]. New VNTR polymorphisms identified in this study are listed as VNTR-1 through VNTR-5.
Repeat size is given in nucleotides.
“G” indicates that the repeat is located within an open reading frame (genic) whereas “I” indicates that the repeat is located within an intergenic region. Distance to predicted translation start site is indicated in nucleotides. “+” or “−” indicates that the translation start site is downstream or upstream of repeat motif, respectively (as reported by Johansson et al. [7]).
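The copy-number columns of the table above lend themselves to a quick programmatic comparison. The following is a minimal Python sketch, not part of the original study: the values are transcribed from the table, the names `VNTR_ROWS` and `differing_loci` are illustrative, and the length difference assumes that loci differ only by whole repeat units.

```python
# Rows transcribed from the VNTR table:
# (marker, repeat size in nt, copies in Schu S4, copies in NE061598).
# All names here are illustrative, not taken from the paper.
VNTR_ROWS = [
    ("Ft-M1", 3, 3, 3), ("Ft-M2", 6, 4, 5), ("Ft-M3", 9, 25, 20),
    ("Ft-M4", 5, 3, 3), ("Ft-M5", 16, 3, 2), ("Ft-M6", 21, 4, 5),
    ("Ft-M7", 16, 4, 4), ("Ft-M8", 16, 4, 4), ("Ft-M9", 16, 4, 9),
    ("Ft-M10", 16, 18, 8), ("Ft-M11", 10, 5, 5), ("Ft-M12", 10, 2, 2),
    ("Ft-M13", 12, 2, 2), ("Ft-M14", 6, 3, 3), ("Ft-M15", 6, 2, 2),
    ("Ft-M16", 10, 2, 2), ("Ft-M17", 6, 3, 3), ("Ft-M18", 6, 4, 4),
    ("Ft-M19", 13, 2, 2), ("Ft-M20", 12, 3, 3), ("Ft-M21", 7, 3, 4),
    ("Ft-M22", 6, 2, 2), ("Ft-M23", 23, 2, 2), ("Ft-M24", 21, 1, 1),
    ("Ft-M25", 2, 5, 5), ("VNTR-1", 8, 1, 3), ("VNTR-2", 11, 3, 2),
    ("VNTR-3", 8, 1, 2), ("VNTR-4", 16, 2, 3), ("VNTR-5", 29, 1, 1),
]


def differing_loci(rows):
    """Return (marker, copy difference, length difference in nt) per differing locus.

    Differences are NE061598 minus Schu S4. The length difference assumes
    whole-repeat gain or loss (copy difference times repeat size).
    """
    result = []
    for marker, size, schu_copies, ne_copies in rows:
        delta = ne_copies - schu_copies
        if delta != 0:
            result.append((marker, delta, delta * size))
    return result


if __name__ == "__main__":
    for marker, delta, nt in differing_loci(VNTR_ROWS):
        print(f"{marker}: {delta:+d} copies ({nt:+d} nt)")
```

Applied to the table as transcribed, the sketch reports 11 loci with differing copy numbers, which agrees with the count of repeat-sequence differences (n = 11) given in the abstract.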
Description of six local collinear blocks (LCBs) between NE061598 and Schu S4.
| LCB | Type | NE061598 Position | Schu S4 position |
| 1 | Conserved | 1-352156 | 1-352087 |
| 2 | Inversion | 352157-381876 | 381807-352088 |
| 3 | Conserved | 381877-1312701 | 381808-1312781 |
| 4 | Rearrangement | 1312702-1700690 | 1379901-1767877 |
| 5 | Rearrangement | 1700691-1767602 | 1307424-1374335 |
| 6 | Conserved | 1767603-1892681 | 1767671-1892775 |
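As an illustration of how the block boundaries above relate coordinates in the two genomes, the following is a minimal Python sketch. It is not the authors' method: it assumes a constant offset from the start of each block, so indels and repeat copy-number differences inside a block are ignored and the result is only approximate. The names `LCBS` and `schu_to_ne` are hypothetical. Because the Schu S4 ranges listed for LCBs 3 and 5 overlap, and positions 1374336 to 1379900 fall in no listed block, the function returns a list of candidates rather than a single value.

```python
# LCB boundaries transcribed from the table above:
# (LCB, type, NE061598 start, NE061598 end, Schu S4 start, Schu S4 end).
# For the inversion (LCB 2) the Schu S4 range is listed high-to-low,
# so its "start" is the coordinate paired with the NE061598 start.
LCBS = [
    (1, "Conserved", 1, 352156, 1, 352087),
    (2, "Inversion", 352157, 381876, 381807, 352088),
    (3, "Conserved", 381877, 1312701, 381808, 1312781),
    (4, "Rearrangement", 1312702, 1700690, 1379901, 1767877),
    (5, "Rearrangement", 1700691, 1767602, 1307424, 1374335),
    (6, "Conserved", 1767603, 1892681, 1767671, 1892775),
]


def schu_to_ne(pos):
    """Return [(LCB, approximate NE061598 position)] for a Schu S4 position.

    Assumption: constant offset within a block (no internal indels).
    An empty list means the position lies in no listed block.
    """
    hits = []
    for lcb, _kind, ne_start, _ne_end, schu_start, schu_end in LCBS:
        low, high = min(schu_start, schu_end), max(schu_start, schu_end)
        if low <= pos <= high:
            # Distance from the Schu S4 coordinate paired with ne_start;
            # abs() also covers the reversed orientation of the inversion.
            hits.append((lcb, ne_start + abs(pos - schu_start)))
    return hits


if __name__ == "__main__":
    print(schu_to_ne(1351129))
```

For Schu S4 position 1351129, the location of the FTT1323 SNP, the sketch returns 1744396 in LCB 5, which matches the NE061598 coordinate listed for that SNP in the polymorphism table. In blocks that contain indels or repeat copy-number differences, the estimate can be off by the accumulated length difference.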
Non-synonymous SNPs, synonymous SNPs, and indels discovered between NE061598 and Schu S4.
| Schu S4 position | NE061598 position | Nucleotide change | Type | ORF_ID | Product | Putative amino acid change |
| 157940 | 158036 | A/C | sSNP | FTT0144 | DNA-directed RNA polymerase subunit beta | SYN |
| 218776 | 218872 | G/A | iSNP | IGS | intergenic space or other non-protein-coding region | − |
| 262990 | 263086 | C/G | nSNP | FTT0249 | ferrous iron transport protein | T/R |
| 269208 | 269304 | C/T | nSNP | FTT0252 | 2-isopropylmalate synthase | S/F |
| 297337 | 297433 | C/T | sSNP | FTT0282 | Cytochrome O ubiquinol oxidase subunit I | SYN |
| 989503 | 989567 | T/– | deletion | IGS | intergenic space or other non-protein-coding region | |
| 1459387 | 1392208 | G/– | deletion | IGS | intergenic space or other non-protein-coding region | |
| 727330 | 727387 | A/G | nSNP | FTT0708 | major facilitator superfamily (MFS) transport protein | I/V |
| 753071 | 753128 | G/T | nSNP | FTT0729 | ABC transporter, membrane protein | G/W |
| 793639 | 793696 | C/T | sSNP | FTT0773 | 50S ribosomal protein L27 | SYN |
| 853540 | 853597 | C/A | nSNP | FTT0839 | hypothetical membrane protein | H/N |
| 920302 | 920367 | G/A | nSNP | FTT0912c | ribosomal large subunit methyltransferase J | L/F |
| 932205 | 932270 | T/C | iSNP | IGS | intergenic space or other non-protein-coding region | – |
| 1154882 | 1154948 | A/T | iSNP | IGS | intergenic space or other non-protein-coding region | – |
| 1223209 | 1223273 | T/C | nSNP | FTT1204c | hypothetical membrane protein | T/A |
| 1296176 | 1296067 | C/T | sSNP | FTT1273 | 50S ribosomal protein L13 | SYN |
| 1351129 | 1744396 | T/C | nSNP | FTT1323 | Methylase | L/S |
| 1419877 | 1352678 | C/T | nSNP | FTT1373 | 3-oxoacyl-[acyl carrier protein] synthase III | P/S |
| 1423162 | 1355963 | A/G | nSNP | FTT1377 | 3-oxoacyl-[acyl-carrier-protein] synthase II | S/G |
| 1525732 | 1458553 | G/A | sSNP | FTT1473c | Galactose-proton symporter, major facilitator superfamily (MFS) transport protein | SYN |
| 1700620 | 1633433 | C/T | sSNP | FTT1635 | cell division protein (post-translational processing & secretion) | SYN |
| 1738053 | 1670866 | T/C | iSNP | IGS | intergenic space or other non-protein-coding region | – |
| 1833651 | 1833583 | T/C | nSNP | FTT1744c | indolepyruvate decarboxylase | Y/C |
| 1540425 | 1473247 | –/A | insertion | IGS | intergenic space or other non-protein-coding region | – |
| 570431 | 570488 | T/C | iSNP | IGS | intergenic space or other non-protein-coding region | – |
Positions (Schu S4, NE061598): nucleotide number at which the SNP or indel is located in the Schu S4 and NE061598 genomes, respectively.
Nucleotide change: putative nucleotide substitution or indel in the Schu S4 and NE061598 genomes, respectively, as identified by genomic sequence comparison.
Type: type of polymorphism. sSNP, synonymous single nucleotide polymorphism; nSNP, non-synonymous single nucleotide polymorphism; iSNP, intergenic single nucleotide polymorphism.
ORF_ID: open reading frame (ORF) associated with the SNP or indel in the Schu S4 genome sequence. IGS, intergenic sequence.
Product: putative protein function of the associated ORF.
Putative amino acid change: amino acid change caused by the associated SNP or indel. SYN, synonymous.
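The totals quoted in the abstract (22 SNPs and 3 indels) can be cross-checked against the table with a short tally. The following Python sketch is illustrative only: the list `TYPES` is a transcription of the Type column in row order, and `tally` is a hypothetical helper name.

```python
from collections import Counter

# "Type" column transcribed from the polymorphism table, in row order.
TYPES = [
    "sSNP", "iSNP", "nSNP", "nSNP", "sSNP",
    "deletion", "deletion", "nSNP", "nSNP", "sSNP",
    "nSNP", "nSNP", "iSNP", "iSNP", "nSNP",
    "sSNP", "nSNP", "nSNP", "nSNP", "sSNP",
    "sSNP", "iSNP", "nSNP", "insertion", "iSNP",
]


def tally(types):
    """Return (per-type counts, total SNPs, total indels)."""
    counts = Counter(types)
    snps = counts["sSNP"] + counts["nSNP"] + counts["iSNP"]
    indels = counts["deletion"] + counts["insertion"]
    return counts, snps, indels


if __name__ == "__main__":
    counts, snps, indels = tally(TYPES)
    print(dict(counts))
    print(f"SNPs: {snps}, indels: {indels}")
```

On the table as transcribed, this gives 11 non-synonymous, 6 synonymous, and 5 intergenic SNPs plus 2 deletions and 1 insertion, for 22 SNPs and 3 indels in total.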
Figure 1. Genome rearrangement representation for the NE061598 and Schu S4 genomes.
Each of the local collinear blocks (LCBs) 1–6 is represented by a different color. Upside-down blocks (i.e., LCB2) lie on the reverse strand, indicating that an inversion has occurred. Note the rearrangements of LCB4 and LCB5.
Figure 2. Depiction of the genomic rearrangement between local collinear blocks 4 and 5 in NE061598 compared to Schu S4.
Direct repeats I (DRI) and II (DRII) are colored green in both panel A (Schu S4) and panel B (NE061598). DRIII, a segment of both DRI and DRII, is colored red. Note that DRIII is also found independently in LCB4. The initial 207 bp of DRI and DRII in Schu S4 is colored blue. Note that the genomic rearrangement resulted in the loss of this initial 207 bp region in DRI of NE061598.
Figure 3. Genome rearrangement representation for the NE061598, Schu S4, and FSC033 genomes.
Each of the local collinear blocks (LCBs) 1–10 is represented by a different color. Upside-down blocks (i.e., LCBs 3 and 9) lie on the reverse strand, indicating that an inversion has occurred. Each LCB is labeled above NE061598.
IS elements found in NE061598 compared to Schu S4.
| IS Elements | Number in NE061598 | Number in Schu S4 |
| ISFtu1 (IS630 family) | 50 | 50 |
| ISFtu2 | 16 | 16 |
| ISFtu3 (ISNCY family, ISHpal-IS1016) | 3 | 3 |
| ISFtu4 (IS982 family) | 1 | 1 |
| ISFtu5 (IS4 family) | 1 | 1 |
| ISFtu6 (IS1595 family) | 3 | 3 |
| ISSod13 | 1 | 1 |
| TOTAL | 75 | 75 |