
Large direct repeats flank genomic rearrangements between a new clinical isolate of Francisella tularensis subsp. tularensis A1 and Schu S4.

Ufuk Nalbantoglu, Khalid Sayood, Michael P Dempsey, Peter C Iwen, Stephen C Francesconi, Ravi D Barabote, Gary Xie, Thomas S Brettin, Steven H Hinrichs, Paul D Fey.

Abstract

Francisella tularensis subspecies tularensis consists of two separate populations A1 and A2. This report describes the complete genome sequence of NE061598, an F. tularensis subspecies tularensis A1 isolated in 1998 from a human with clinical disease in Nebraska, United States of America. The genome sequence was compared to Schu S4, an F. tularensis subspecies tularensis A1a strain originally isolated in Ohio in 1941. It was determined that there were 25 nucleotide polymorphisms (22 SNPs and 3 indels) between Schu S4 and NE061598; two of these polymorphisms were in potential virulence loci. Pulsed-field gel electrophoresis analysis demonstrated that NE061598 was an A1a genotype. Other differences included repeat sequences (n = 11 separate loci), four of which were contained in coding sequences, and an inversion and rearrangement probably mediated by insertion sequences and the previously identified direct repeats I, II, and III. Five new variable-number tandem repeats were identified; three of these five were unique in NE061598 compared to Schu S4. Importantly, there was no gene loss or gain identified between NE061598 and Schu S4. Interpretation of these data suggests there is significant sequence conservation and chromosomal synteny within the A1 population. Further studies are needed to determine the biological properties driving the selective pressure that maintains the chromosomal structure of this monomorphic pathogen.

Year:  2010        PMID: 20140244      PMCID: PMC2815774          DOI: 10.1371/journal.pone.0009007

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Francisella tularensis is a highly pathogenic gram-negative coccobacillus that is the causative agent of tularemia, commonly referred to as “rabbit fever.” The large majority of disease is ulceroglandular in nature and can be traced to contact with an infected host (e.g., rabbit or cat) or vector (e.g., tick or mosquito); however, more serious forms of disease such as pneumonic tularemia can be life-threatening, and F. tularensis is therefore considered a potential biowarfare agent. There are three recognized subspecies of F. tularensis: tularensis (commonly referred to as type A), holarctica (commonly referred to as type B), and mediasiatica, as well as a closely related species, F. novicida. These subspecies differ in their geographic distribution: F. tularensis subsp. holarctica is found throughout the northern temperate regions of both hemispheres, whereas subspecies tularensis is found primarily in North America. In addition, the population of F. tularensis subspecies tularensis consists of two major, geographically isolated clades, A1 and A2 [1], [2]. The A2 population has been isolated in the western United States, whereas the A1 population is found east of the Rocky Mountains, primarily in the Ozark mountain regions of Missouri, Oklahoma and Arkansas. The genomes of two F. tularensis subspecies tularensis A1 isolates (Schu S4 and FSC198) have recently been sequenced; FSC198 was isolated in Slovakia in 1986, whereas Schu S4, an often-utilized virulent laboratory strain, is a clinical isolate obtained in Ohio in 1941 [3], [4]. In addition, a draft sequence of a separate F. tularensis subsp. tularensis A1 isolate, FSC033, was also recently published [5]. FSC033 was isolated from a squirrel in Georgia, USA. Genomic comparisons between FSC198 and Schu S4 revealed remarkable sequence conservation; only eight SNPs and three variable-number tandem repeat (VNTR) differences were noted [3]. Chaudhri et al. [3] have suggested that the close similarity between FSC198 and Schu S4 indicates that the FSC198 strain may have derived from Schu S4. Preliminary analysis of a recent human clinical isolate of F. tularensis subsp. tularensis, obtained in 1998 in Nebraska, revealed characteristics distinguishing it from Schu S4 [6]. This presented an opportunity to further examine the genomic diversity within the A1 population, and therefore the complete sequence of an F. tularensis subspecies tularensis A1 isolate, NE061598, was determined. The genomes of the four A1 isolates that have been fully or partially sequenced (Schu S4, FSC198, NE061598 and FSC033) were compared in light of their temporal and spatial separation. This analysis demonstrated that the F. tularensis subsp. tularensis A1 population, as represented by these isolates, is highly clonal and displays a high degree of DNA sequence conservation and chromosomal synteny. The primary chromosomal differences between NE061598 and Schu S4/FSC198/FSC033 were due to rearrangements occurring between large direct repeats and insertion sequences.

Results

General Features

The genomic sequence of Francisella tularensis subsp. tularensis NE061598 (GenBank accession number CP001633; also available at http://bioinfo.unl.edu/NE061598genome) consists of a single circular chromosome of 1,892,681 base pairs (bp). General characteristics of the NE061598 genome are shown in Table 1. Using pulsed-field gel electrophoresis (PFGE), Kugeler et al. have demonstrated that the F. tularensis subsp. tularensis A1 population can be divided into at least two separate groups, A1a and A1b [2]. Previous PFGE analysis of NE061598 using both PmeI and BamHI suggested that it is subtype A1a (data not shown and [6]).
Table 1

Genomic characteristics of F. tularensis subsp. tularensis NE061598.

| Characteristic | Value |
| --- | --- |
| Length (bp) | 1892681 |
| GC Content (%) | 32.26 |
| Total Genes | 1850 |
| Protein Coding Genes | 1601 |
| Genes Assigned Function | 1185 |
| Hypothetical proteins | 416 |
| Disrupted ORFs | 201 |
| Large Duplicated Regions | 2 |
| Transposons (IS elements) | 75 |
| tRNA | 38 |
| rRNA | 10 |
| sRNA | 2 |
| Average Gene Length (nt) | 1068 |
| Percent Coding | 90.40% |

Comparison to the Other Type A1 Strains

The NE061598 genome sequence contains 65 bp more than the FSC198 sequence [3] and 94 bp fewer than the Schu S4 sequence [4]. Previous bioinformatic analysis of the FSC198 and Schu S4 genomes demonstrated that there were only eight single nucleotide polymorphisms (SNPs) and three VNTR differences between these two isolates [3]. Therefore, based on the known genomic similarity between Schu S4 and FSC198, NE061598 was compared with Schu S4 (GenBank accession number AJ749949; RefSeq accession number NC_006570). The regions of difference between Schu S4 and NE061598 were divided into two types: small tandem repeats (Table 2) and rearrangements (Table 3). The VNTRs listed in Table 2 accounted for the difference in size between the two isolates. Table 2 consists of known VNTR markers used previously for MLVA [6], [7] in addition to five newly identified tandem repeat loci (VNTR-1 through VNTR-5) discovered by comparing NE061598 and Schu S4. Only one of the five new VNTRs was found within an open reading frame.
Table 2

VNTR markers and their differences between Schu S4 and NE061598.

| VNTR marker (a) | Repeat motif | Repeat size (nt) (b) | Genomic location (c) | Repeat copy no., Schu S4 | Repeat copy no., NE061598 |
| --- | --- | --- | --- | --- | --- |
| Ft-M1 | AAT | 3 | I (−76) | 3 | 3 |
| Ft-M2 | TAAATA | 6 | G (+12) | 4 | 5 |
| Ft-M3 | AATAAGGAT | 9 | G (+1401) | 25 | 20 |
| Ft-M4 | TTGTT | 5 | G (+55) | 3 | 3 |
| Ft-M5 | TTTCTACAAATATCTT | 16 | I (−21) | 3 | 2 |
| Ft-M6 | TTGGTGAACTTTCTTGCTCTT | 21 | G (+1160) | 4 | 5 |
| Ft-M7 | TTTCTACAAATATCTT | 16 | I (−21) | 4 | 4 |
| Ft-M8 | TTTCTACAAATATCTT | 16 | I (−21) | 4 | 4 |
| Ft-M9 | TTTCTACAAATATCTT | 16 | I (−21) | 4 | 9 |
| Ft-M10 | TTTCTACAAATATCTT | 16 | I (−21) | 18 | 8 |
| Ft-M11 | AATTATAAAT | 10 | I (−113) | 5 | 5 |
| Ft-M12 | TAGCTTTTTT | 10 | I (−113) | 2 | 2 |
| Ft-M13 | CTCCAGGACCAA | 12 | G (+1174) | 2 | 2 |
| Ft-M14 | TCATTA | 6 | G (+67) | 3 | 3 |
| Ft-M15 | ATACTT | 6 | G (+32) | 2 | 2 |
| Ft-M16 | TAAAAGTAAG | 10 | I (+551) | 2 | 2 |
| Ft-M17 | TATTTA | 6 | G (+484) | 3 | 3 |
| Ft-M18 | CATTAA | 6 | I (−52) | 4 | 4 |
| Ft-M19 | TAAATTTCTCATA | 13 | I (−20) | 2 | 2 |
| Ft-M20 | ATTATTTTGATC | 12 | G (+1964) | 3 | 3 |
| Ft-M21 | TCAATTA | 7 | G (+586) | 3 | 4 |
| Ft-M22 | AAAAAT | 6 | G (+2254) | 2 | 2 |
| Ft-M23 | AAGTAGCATTGTCACGACCTCCT | 23 | I (+1864) | 2 | 2 |
| Ft-M24 | ATAAATTATTTATTTTGATTA | 21 | I (−93) | 1 | 1 |
| Ft-M25 | GT | 2 | G (+525) | 5 | 5 |
| VNTR-1 | CAAAGACA | 8 | I (−392) | 1 | 3 |
| VNTR-2 | TTTATATAAGT | 11 | I (−42) | 3 | 2 |
| VNTR-3 | GAAAATAA | 8 | G (+282) | 1 | 2 |
| VNTR-4 | TTCTACAAATATCTTT | 16 | I (+22) | 2 | 3 |
| VNTR-5 | AAAATGCCATCATATAGCCAAGATTTTAG | 29 | I (−32) | 1 | 1 |

a Ft-M1 through Ft-M25 are VNTR markers as previously reported by Johansson et al. [7]. New VNTR polymorphisms identified in this study are listed as VNTR-1 through VNTR-5.

b Indicates repeat size in nucleotides.

c “G” indicates that the repeat is located within an open reading frame (genic), whereas “I” indicates that the repeat is located within an intergenic region. Distance to the predicted translation start site is indicated in nucleotides. “+” or “−” indicates that the translation start site is downstream or upstream of the repeat motif, respectively (as reported by Johansson et al. [7]).

Table 3

Description of six local collinear blocks (LCBs) between NE061598 and Schu S4.

| LCB | Type | NE061598 position | Schu S4 position |
| --- | --- | --- | --- |
| 1 | Conserved | 1-352156 | 1-352087 |
| 2 | Inversion | 352157-381876 | 381807-352088 |
| 3 | Conserved | 381877-1312701 | 381808-1312781 |
| 4 | Rearrangement | 1312702-1700690 | 1379901-1767877 |
| 5 | Rearrangement | 1700691-1767602 | 1307424-1374335 |
| 6 | Conserved | 1767603-1892681 | 1767671-1892775 |
Compared to the published Schu S4 genome sequence, NE061598 had 25 polymorphisms (22 SNPs and 3 indels; Table 4). All SNP and indel differences were confirmed by repeat sequence analysis. Of the 22 confirmed SNPs, 6 were synonymous, 5 were intergenic, and 11 were non-synonymous. There were no SNPs in rRNA or tRNA genes. Petrosino et al. [8] have identified 268 virulence genes associated with F. tularensis. Comparing NE061598 to Schu S4, only two of the proposed virulence genes identified by Petrosino et al. [8] were found to have SNPs: a ferrous iron transport protein (FTT0249) and 2-isopropylmalate synthase (FTT0252). Both contain non-synonymous polymorphisms that result in a non-conservative amino acid substitution; it is unknown whether these mutations have any effect on protein function.
Table 4

Non-synonymous SNPs, synonymous SNPs, and indels discovered between NE061598 and Schu S4.

| Schu S4 position (a) | NE061598 position (a) | Nucleotide change (b) | Type (c) | ORF_ID (d) | Product (e) | Putative amino acid change (f) |
| --- | --- | --- | --- | --- | --- | --- |
| 157940 | 158036 | A/C | sSNP | FTT0144 | DNA-directed RNA polymerase subunit beta | SYN |
| 218776 | 218872 | G/A | iSNP | IGS | intergenic space or other non-protein-coding region | |
| 262990 | 263086 | C/G | nSNP | FTT0249 | ferrous iron transport protein [17] | T/R |
| 269208 | 269304 | C/T | nSNP | FTT0252 | 2-isopropylmalate synthase | S/F |
| 297337 | 297433 | C/T | sSNP | FTT0282 | Cytochrome O ubiquinol oxidase subunit I | SYN |
| 989503 | 989567 | T/– | deletion | IGS | intergenic space or other non-protein-coding region | |
| 1459387 | 1392208 | G/– | deletion | IGS | intergenic space or other non-protein-coding region | |
| 727330 | 727387 | A/G | nSNP | FTT0708 | major facilitator superfamily (MFS) transport protein | I/V |
| 753071 | 753128 | G/T | nSNP | FTT0729 | ABC transporter, membrane protein | G/W |
| 793639 | 793696 | C/T | sSNP | FTT0773 | 50S ribosomal protein L27 | SYN |
| 853540 | 853597 | C/A | nSNP | FTT0839 | hypothetical membrane protein | H/N |
| 920302 | 920367 | G/A | nSNP | FTT0912c | ribosomal large subunit methyltransferase J | L/F |
| 932205 | 932270 | T/C | iSNP | IGS | intergenic space or other non-protein-coding region | |
| 1154882 | 1154948 | A/T | iSNP | IGS | intergenic space or other non-protein-coding region | |
| 1223209 | 1223273 | T/C | nSNP | FTT1204c | hypothetical membrane protein | T/A |
| 1296176 | 1296067 | C/T | sSNP | FTT1273 | 50S ribosomal protein L13 | SYN |
| 1351129 | 1744396 | T/C | nSNP | FTT1323 | Methylase | L/S |
| 1419877 | 1352678 | C/T | nSNP | FTT1373 | 3-oxoacyl-[acyl carrier protein] synthase III | P/S |
| 1423162 | 1355963 | A/G | nSNP | FTT1377 | 3-oxoacyl-[acyl-carrier-protein] synthase II | S/G |
| 1525732 | 1458553 | G/A | sSNP | FTT1473c | Galactose-proton symporter, major facilitator superfamily (MFS) transport protein | SYN |
| 1700620 | 1633433 | C/T | sSNP | FTT1635 | cell division protein (post-translational processing & secretion) [18] | SYN |
| 1738053 | 1670866 | T/C | iSNP | IGS | intergenic space or other non-protein-coding region | |
| 1833651 | 1833583 | T/C | nSNP | FTT1744c | indolepyruvate decarboxylase | Y/C |
| 1540425 | 1473247 | –/A | insertion | IGS | intergenic space or other non-protein-coding region | |
| 570431 | 570488 | T/C | iSNP | IGS | intergenic space or other non-protein-coding region | |

a Nucleotide number at which the SNP or indel is located in the Schu S4 and NE061598 genomes, respectively.

b Putative nucleotide substitution or indel in the Schu S4 and NE061598 genomes, respectively, as identified by genomic sequence comparison.

c Type of nucleotide substitution. sSNP, synonymous single nucleotide polymorphism; nSNP, non-synonymous single nucleotide polymorphism; iSNP, intergenic single nucleotide polymorphism.

d Open reading frame (ORF) associated with the SNP or indel in the Schu S4 genome sequence. IGS, intergenic sequence.

e Putative protein function of the associated ORF.

f Amino acid change of the associated SNP or indel. SYN, synonymous.

Apart from the rearrangements and polymorphisms, the remaining differences in composition and length between the NE061598 and Schu S4 genomes were found to be due to differences in the VNTRs. VNTR analysis has been very useful in epidemiological and population analyses of Francisella [6], [7]. Of the eleven tandem repeats that differ in copy number between NE061598 and Schu S4, seven (FtM5, FtM9, FtM10, FtM21, VNTR-1, VNTR-2, and VNTR-4) occur in intergenic regions, and the remaining four (FtM2, FtM3, FtM6, and VNTR-3) are in coding regions (Table 2). Of these four, one repeat in the gene for a hypothetical protein (FtM2; FTT1800c [Schu S4] and NE6158_10490 [NE061598]) inserted two amino acids into the translated sequence. Another repeat in a gene for a hypothetical protein (VNTR-3; FTT0877c [Schu S4]) resulted in a premature stop codon in NE061598. An insertion of 7 amino acids was observed in an ATP-dependent DNA helicase protein in NE061598 compared to Schu S4 (FTT1395c [Schu S4] and NE61598_07740 [NE061598]). Lastly, one tandem repeat difference (FtM3) appeared to eliminate a premature stop codon in a pseudogene in Schu S4 (TPR repeat region protein; FTT0294 [Schu S4] and NE61598_0160 [NE061598]). This difference resulted in a deletion of the repeat NKDNKDNKD. Importantly, NE061598 does not encode any unique genes that are not found in Schu S4.
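The effect of a genic tandem repeat on the protein sequence can be illustrated with a short sketch. It translates tandem copies of the Ft-M3 motif from Table 2 using the standard genetic code. The reading frame is an assumption made for illustration (the frame of the repeat within the gene is not stated here), and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def repeat_peptide(motif: str, copies: int) -> str:
    """Peptide encoded by `copies` tandem copies of `motif`.

    Assumption: the first base of the motif is the first base of a codon.
    """
    return translate(motif * copies)


FTM3_MOTIF = "AATAAGGAT"  # Ft-M3 repeat motif from Table 2 (9 nt = 3 codons)

print(repeat_peptide(FTM3_MOTIF, 1))
print(repeat_peptide(FTM3_MOTIF, 3))
```

Under that frame assumption, one copy of the 9-nt motif encodes the tripeptide NKD, so the repeat unit in the text (NKDNKDNKD) corresponds to three tandem copies. Repeat sizes that are not a multiple of three, such as the 8-nt VNTR-3 motif, shift the reading frame instead, which is one way a copy-number change can introduce a premature stop codon.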

Chromosomal Rearrangements

In order to describe the chromosomal rearrangements between NE061598 and Schu S4, the genomes were divided into six local collinear blocks (LCBs) as shown in Table 3 and Figure 1. The initial division was performed using the genome rearrangement analysis tool SPRING (Sorting Permutation by Reversals and block-INterchanGes) [9]. These analyses demonstrated that the first, third and sixth LCBs are conserved, whereas the second LCB is inverted in NE061598 with respect to Schu S4. The fourth and fifth LCBs are rearranged (Table 3 and Figure 1). These data are consistent with a previous comparison of two type A strains of Francisella tularensis subsp. tularensis, WY96 (A2) and Schu S4 (A1), which demonstrated the presence of various genome rearrangements due to inversions and block rearrangements mediated by insertion sequences [10]. The remaining LCBs have flanking duplicated regions. Several insertion elements were also observed adjacent to the flanking regions of the LCBs (Table 3); these might promote further chromosomal rearrangements during strain divergence. For example, the second LCB is inverted between NE061598 and Schu S4. This inversion is hypothesized to be due to 2,969-bp flanking regions on each side of the inverted region that are reverse complements of each other. These flanking regions comprise one ISFtu2 and two additional ISFtu1 insertion sequence elements.
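The block description above can be read directly off the coordinates in Table 3. The following sketch is not the SPRING algorithm; it is a minimal illustration that sorts the blocks by their Schu S4 position and marks a block as inverted when its Schu S4 start coordinate exceeds its end coordinate. The function names are hypothetical.

```python
# LCB coordinates copied from Table 3:
# (LCB id, NE061598 start, NE061598 end, Schu S4 start, Schu S4 end)
LCBS = [
    (1, 1, 352156, 1, 352087),
    (2, 352157, 381876, 381807, 352088),
    (3, 381877, 1312701, 381808, 1312781),
    (4, 1312702, 1700690, 1379901, 1767877),
    (5, 1700691, 1767602, 1307424, 1374335),
    (6, 1767603, 1892681, 1767671, 1892775),
]


def signed_order(lcbs):
    """LCB ids in Schu S4 order; a negative id marks an inverted block."""
    ordered = sorted(lcbs, key=lambda b: min(b[3], b[4]))
    return [b[0] if b[3] <= b[4] else -b[0] for b in ordered]


def classify(lcbs):
    """Return (inverted ids, ids whose rank differs between the two genomes).

    Assumes the LCB ids follow the NE061598 order (1, 2, 3, ...).
    """
    order = signed_order(lcbs)
    inverted = [-x for x in order if x < 0]
    displaced = sorted(abs(x) for rank, x in enumerate(order, start=1) if abs(x) != rank)
    return inverted, displaced


print(signed_order(LCBS))  # [1, -2, 3, 5, 4, 6]
print(classify(LCBS))      # ([2], [4, 5])
```

The signed order reproduces the description in the text: LCB 2 is inverted and LCBs 4 and 5 have exchanged places, while LCBs 1, 3 and 6 keep their relative positions.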
Figure 1

Genome rearrangement representation for NE061598 and Schu S4 genomes.

Each local collinear block (LCB) 1–6 is represented by a different color. Upside-down blocks (i.e., LCB2) represent the location of the reverse strand, which means an inversion has occurred. Note the rearrangements of LCB4 and LCB5.

The rearrangements in LCBs 4 and 5 are most probably mediated by two large duplicated regions (DRI and DRII) previously discussed in the genome report comparing WY96 and Schu S4 [10]. These duplicated regions include the Francisella Pathogenicity Island (FPI) containing the iglABCD operon [11], which is required for intramacrophage growth. This operon is regulated by the transcription factor MglA, which has been shown to regulate a number of virulence factors [12]. These two regions (33,910 bp) occur at locations 1,374,336–1,408,246 (DRI) and 1,767,671–1,801,581 (DRII) in Schu S4. In addition, a 5358 bp segment of the duplicated regions, spanning the 208th through 5565th bases, is also duplicated at positions 1,307,425–1,312,781 in Schu S4. No structural alterations in the iglABCD operon were found in NE061598. The locations of DRI and DRII in both Schu S4 and NE061598 are shown in Figures 2A and 2B. In addition, DRIII (III, red) is shown, which contains the aforementioned 5358 bp segment of the duplicated regions [10]. Relating these regions to the LCBs noted in Figure 2, DRII is contained in LCB 6 while the other components are contained in LCBs 4 and 5. The rearrangement can be explained as an edit operation in which one block with a partially duplicated flanking region is replaced by another block having DRI as the flanking region (Figure 3). Consequently, DRII is conserved in NE061598 but the other regions have been transformed into partially duplicated regions. This genomic rearrangement results in the loss of the first 207 bp of DRI in NE061598 (Figure 2). Similar chromosomal changes mediated by these duplicated regions were also observed between Schu S4 and WY96 [10]. WY96 has a conserved copy of DRII and a copy lacking the first 207 bases, as in the NE061598 LCB5 region (Figure 3B). These duplicated regions were determined to be the most compositionally different segments of the genome using the Alien Hunter program [13].
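The relationship between the 5358 bp shared segment and the 207 bp that are lost can be checked with simple interval arithmetic. The sketch below assumes 1-based, inclusive offsets within the duplicated region, which is a convention chosen for illustration; the helper name is hypothetical.

```python
def inclusive_length(start: int, end: int) -> int:
    """Number of bases in a 1-based, inclusive interval."""
    return end - start + 1


# Offsets of the shared segment within the duplicated region, as quoted in the text.
SEGMENT_FIRST, SEGMENT_LAST = 208, 5565

segment_length = inclusive_length(SEGMENT_FIRST, SEGMENT_LAST)
leading_bases = SEGMENT_FIRST - 1  # bases of the duplicated region preceding the segment

print(segment_length)  # 5358
print(leading_bases)   # 207
```

Under this convention, the bases preceding the shared segment are exactly the initial 207 bp that the text describes as missing from DRI in NE061598.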
Figure 2

Depiction of genomic rearrangement between local collinear blocks 4 and 5 in NE061598 compared to Schu S4.

Direct repeats I (DRI) and II (DRII) are colored in green in both 3A (Schu S4) and 3B (NE061598). DRIII, a segment of both DRI and DRII, is colored in red. Note that DRIII is found independently in LCB4. The initial 207 bp of DRI and DRII in Schu S4 is colored in blue. Note that the genomic rearrangement resulted in the loss of this initial 207 bp region in DRI of NE061598.

Figure 3

Genome rearrangement representation for NE061598, Schu S4 and FSC033 genomes.

Each local collinear block (LCB) 1–10 is represented by a different color. Upside-down blocks (i.e., LCBs 3 and 9) represent the location of the reverse strand, which means an inversion has occurred. Each LCB is denoted above NE061598.

While it is known that IS elements are significantly involved in intrachromosomal rearrangement, only one rearrangement associated with insertion sequences was observed when comparing NE061598 to Schu S4. The most parsimonious transformation using the rearrangements and inversions of the collinear blocks involved an inversion of LCB2 and the edit process discussed in Figure 2.

Comparison of NE061598 and Schu S4 with the Draft Sequence of F. tularensis subsp. tularensis FSC033

Kugeler et al. have demonstrated that the F. tularensis subsp. tularensis A1b population is associated with higher mortality rates [2]. A prototype A1b isolate, FSC033, has recently been partially sequenced [2], [5]. In order to perform preliminary genomic comparisons between FSC033, NE061598 and Schu S4, the genomes were divided into 10 LCBs as described above (Figure 3). This analysis found that the only major difference between FSC033 and NE061598/Schu S4 was the rearrangement of LCB2 (Figure 3). The genomic organization of FSC033 surrounding DRI and DRII, as shown in Figures 1 and 2, was similar to the Schu S4 genomic arrangement. Although few significant differences in genomic synteny were observed between FSC033 (subtype A1b) and NE061598/Schu S4 (subtype A1a), SNP analysis detected 123 SNPs and 8 indels between NE061598 and FSC033.

Transposable Elements

Seven different types of IS elements (n = 75 in total) were found within NE061598 (Table 5). In addition to 50 ISFtu1 elements, NE061598 contains 16 ISFtu2 elements (of which one flanks the inverted LCB 2), three copies each of ISFtu3 and ISFtu6, and one copy each of ISFtu4, ISFtu5 and ISSod13. All of the insertion sequences found in NE061598 are also present in Schu S4.
Table 5

IS elements found in NE061598 compared to Schu S4.

| IS element | Number in NE061598 | Number in Schu S4 |
| --- | --- | --- |
| ISFtu1 (IS630 family) | 50 | 50 |
| ISFtu2 | 16 | 16 |
| ISFtu3 (ISNCY family, ISHpal-IS1016) | 3 | 3 |
| ISFtu4 (IS982 family) | 1 | 1 |
| ISFtu5 (IS4 family) | 1 | 1 |
| ISFtu6 (IS1595 family) | 3 | 3 |
| ISSod13 | 1 | 1 |
| TOTAL | 75 | 75 |

Discussion

Because of the remarkable sequence conservation between Schu S4 and FSC198 [3], it has been speculated that these two isolates may have the same origin. We therefore sequenced a separate virulent isolate of F. tularensis subsp. tularensis A1 and compared it with Schu S4 to evaluate sequence divergence over time. NE061598 was isolated in Nebraska in 1998 from the blood of a patient with ulceroglandular tularemia, Schu S4 was derived in 1941, and FSC198 was isolated in 1986. The availability of a recent, clinically virulent F. tularensis subsp. tularensis A1 isolate obtained in the midwestern United States provided the opportunity for an in-depth sequence comparison with other A1 isolates. Because of the significant temporal separation (57 years) between Schu S4 and NE061598, the sequence conservation between these two isolates was unexpected. Even though VNTR analysis yielded 11 distinct polymorphisms (see Table 2), analysis of the entire genome yielded only 25 additional SNPs/indels. The most significant differences detected were an inversion associated with LCB 2 and rearrangements associated with LCBs 4 and 5 (see Figures 1 and 2); these events were predicted to be mediated by IS element recombination (LCB 2) and by large duplicated regions (LCBs 4 and 5), respectively. Significantly, there was no net gain or loss of genes within the NE061598 genome relative to Schu S4. These data may suggest that the minimal differences observed in pulsed-field RFLP patterns of the F. tularensis subsp. tularensis A1 population are due to IS- or direct repeat-mediated rearrangements and not to the acquisition of new genes [1], [2], [6]. Furthermore, these data support the notion that this highly monomorphic pathogen [14] may have undergone a recent population bottleneck, which may be related to its specific host preference (e.g., lagomorphs, humans) and vectors (e.g., ticks). Further elucidation of the natural reservoir, hosts, and vectors of F. tularensis may lead to novel hypotheses regarding the selective pressure acting on this A1 population. Because of the lack of genetic diversity noted within the F. tularensis subsp. tularensis A1 population, phylogenetic and population structure analyses are problematic and biased, especially given the rapid evolution of VNTR loci and the lack of sensitivity of other methodologies [14], [15]. However, whole-genome SNP analysis has been successful at probing the population structure of highly monomorphic pathogens such as Bacillus anthracis and other highly virulent pathogens [14], [16]. A recent report using a variety of SNP analyses identified 11 subclades within F. tularensis subsp. holarctica [15]. Phylogenetic analysis suggested that F. tularensis subsp. holarctica originated in North America and was introduced multiple times into Eurasia. Further studies need to be performed to delineate the complicated population structure of F. tularensis subsp. tularensis A1 (both A1a and A1b) and its relationship to the F. tularensis subsp. tularensis A2 population. Data provided in our study may yield canonical SNPs that provide lineage- or strain-specific phylogeny within this subspecies. The utility of these unique SNPs will be evaluated using large repositories of F. tularensis subspecies. Lastly, our study suggests that the genomic organization of the A1a and A1b populations may not differ significantly; however, preliminary SNP/indel analysis provides evidence that the increased virulence observed with A1b strains may reside in specific nucleotide alterations and not in gene acquisition or loss.

Materials and Methods

Genome Sequencing of NE061598

The genome coverage determined at the end of the draft-sequencing phase was 11x and resulted in 19 contigs mapped into 12 scaffolds. The draft phase involved two clone libraries: one small-insert library (2200 bp average insert size) and one medium-insert library (6289 bp average insert size). Paired-end shotgun sequencing of these libraries produced 12218 and 13156 reads, respectively. During the finishing phase, seven transposon bomb libraries were created and sequenced to assist with repeat resolution. Four PCR shatter libraries were created and sequenced to assist with hard stops. An additional 528 primer-walk reads were generated as needed to address low-quality regions of the draft assembly. The final genome at the end of the finishing stage was a complete genome with no gaps consisting of 1892901 base pairs. The overall average error rate of the finished genome was less than one error in 100,000 bp. The total number of reads used in the final assembly was 25,531.

Annotation

The open reading frames (ORFs) of Schu S4 were extracted, and each ORF was searched for in the NE061598 chromosome using the standard Smith-Waterman algorithm [17]. Hits with greater than 98% identity were taken as initial annotations. Next, the NCBI annotation pipeline (http://www.ncbi.nlm.nih.gov/genome/guide/build.html) was employed, and any missed ORFs were extracted from the output of this pipeline. After eliminating the ORFs and overlapping genes that had already been recognized, protein BLAST searches were performed on the filtered predictions of the pipeline.
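The first annotation step can be sketched as a textbook Smith-Waterman local alignment followed by an identity filter. This is a minimal illustration, not the implementation used in the study: the scoring values (match 2, mismatch −1, linear gap −2), the definition of identity as matches divided by ORF length, and all function names are assumptions.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_query, aligned_target) for one best local alignment."""
    rows, cols = len(query) + 1, len(target) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero score is reached.
    i, j = best_pos
    aligned_q, aligned_t = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if query[i - 1] == target[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            aligned_q.append(query[i - 1]); aligned_t.append(target[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            aligned_q.append(query[i - 1]); aligned_t.append("-")
            i -= 1
        else:
            aligned_q.append("-"); aligned_t.append(target[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_t))


def percent_identity_over_orf(orf, chromosome):
    """Identical aligned positions as a percentage of the ORF length (assumed definition)."""
    if not orf:
        return 0.0
    _, aligned_orf, aligned_chr = smith_waterman(orf, chromosome)
    matches = sum(a == b for a, b in zip(aligned_orf, aligned_chr))
    return 100.0 * matches / len(orf)


def is_initial_annotation(orf, chromosome, threshold=98.0):
    """Keep a hit only if its identity is higher than the threshold (98% in the text)."""
    return percent_identity_over_orf(orf, chromosome) > threshold


toy_orf = "ATGGCTAAAGGTTAA"                      # hypothetical toy ORF
print(is_initial_annotation(toy_orf, "CCCC" + toy_orf + "GGGG"))
```

The full dynamic-programming matrix needs memory proportional to the product of the two sequence lengths, so this form is only practical for toy inputs; aligning every ORF against a 1.9 Mb chromosome would call for a banded or seeded variant.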

Insertion Sequence Element Mapping

Annotated insertion sequence elements that are specific to F. tularensis were detected in the NE061598 genome using Smith-Waterman alignment [17].

SNP Discovery

SNPs between Schu S4 and NE061598 were discovered using the SNPsFinder program of Los Alamos Laboratories (http://snpsfinder.lanl.gov/UsersManual/index.html). SNP predictions were then curated manually using BLAST (parameters: match, 1; mismatch, −4; gap existence and extension, −1).
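The kind of record reported in Table 4 (a position in each genome, the two alleles, and a type) can be derived from any gapped pairwise alignment. The sketch below is not SNPsFinder; it is a hypothetical helper that walks two pre-aligned sequences, with "-" marking gaps, and labels each difference relative to the first (reference) sequence.

```python
def list_variants(ref_aln: str, qry_aln: str):
    """Return (ref_pos, qry_pos, ref_base, qry_base, kind) for each difference.

    Positions are 1-based and count only non-gap characters. At a gap, the
    position reported for the gapped sequence is that of its last base seen.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = qry_pos = 0
    for r, q in zip(ref_aln, qry_aln):
        if r != "-":
            ref_pos += 1
        if q != "-":
            qry_pos += 1
        if r == q:
            continue
        if r == "-":
            kind = "insertion"   # base present only in the query
        elif q == "-":
            kind = "deletion"    # base present only in the reference
        else:
            kind = "SNP"
        variants.append((ref_pos, qry_pos, r, q, kind))
    return variants


# Toy alignment (hypothetical sequences), reference first.
for variant in list_variants("ACGTACGT", "ACCTA-GT"):
    print(variant)
```

With Schu S4 as the reference, this labeling matches the convention of Table 4, where "T/–" is a deletion and "–/A" an insertion in NE061598.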

Genome Rearrangement Discovery

In order to determine the local collinear blocks (LCBs), the SPRING tool [9] was utilized. The SPRING parameters for LCB discovery were as follows. Block search mode: reversals (inversions) plus block-interchange mode; minimum multi-MUM length: 21 bp (the closest integer to log2 of the average genome length, 1892 Kbp); minimum LCB length: 63 bp (3 x minimum multi-MUM length); chromosome type: linear. The boundaries of the rearrangements were further optimized using BLAST (expect threshold: 10; word size: 64; match score: 1; mismatch score: −4; gap existence and extension: −1) on the 10 Kb regions flanking the LCB ends.
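The two length parameters follow from the average genome length by simple arithmetic, which can be reproduced as below. The function name is hypothetical, and the sketch only mirrors the rule stated in the text; it does not call SPRING.

```python
import math


def lcb_length_parameters(average_genome_length_bp: int):
    """Return (minimum multi-MUM length, minimum LCB length) in bp.

    Rule taken from the text: the multi-MUM length is the integer closest to
    log2 of the average genome length, and the LCB length is three times that.
    """
    min_multi_mum = round(math.log2(average_genome_length_bp))
    return min_multi_mum, 3 * min_multi_mum


print(lcb_length_parameters(1_892_000))  # (21, 63)
```

For an average genome length of 1892 Kbp, log2 is approximately 20.85, which rounds to 21 bp and gives a minimum LCB length of 63 bp, matching the values quoted above.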

Pulsed-Field Gel Electrophoresis

Agarose embedded DNA was prepared and digested with PmeI and BamHI as previously described [18]. RFLP analysis was performed using Bionumerics software (Applied Maths).
References (10 of 18 shown)

1.  Interpolated variable order motifs for identification of horizontally acquired DNA: revisiting the Salmonella pathogenicity islands.

Authors:  Georgios S Vernikos; Julian Parkhill
Journal:  Bioinformatics       Date:  2006-07-12       Impact factor: 6.937

2.  Identification of common molecular subsequences.

Authors:  T F Smith; M S Waterman
Journal:  J Mol Biol       Date:  1981-03-25       Impact factor: 5.469

3.  Chromosome rearrangement and diversification of Francisella tularensis revealed by the type B (OSU18) genome sequence.

Authors:  Joseph F Petrosino; Qin Xiang; Sandor E Karpathy; Huaiyang Jiang; Shailaja Yerrapragada; Yamei Liu; Jason Gioia; Lisa Hemphill; Arely Gonzalez; T M Raghavan; Akif Uzman; George E Fox; Sarah Highlander; Mason Reichard; Rebecca J Morton; Kenneth D Clinkenbeard; George M Weinstock
Journal:  J Bacteriol       Date:  2006-10       Impact factor: 3.490

4.  Identification of MglA-regulated genes reveals novel virulence factors in Francisella tularensis.

Authors:  Anna Brotcke; David S Weiss; Charles C Kim; Patrick Chain; Stephanie Malfatti; Emilio Garcia; Denise M Monack
Journal:  Infect Immun       Date:  2006-09-25       Impact factor: 3.441

5.  The complete genome sequence of Francisella tularensis, the causative agent of tularemia.

Authors:  Pär Larsson; Petra C F Oyston; Patrick Chain; May C Chu; Melanie Duffield; Hans-Henrik Fuxelius; Emilio Garcia; Greger Hälltorp; Daniel Johansson; Karen E Isherwood; Peter D Karp; Eva Larsson; Ying Liu; Stephen Michell; Joann Prior; Richard Prior; Stephanie Malfatti; Anders Sjöstedt; Kerstin Svensson; Nick Thompson; Lisa Vergez; Jonathan K Wagg; Brendan W Wren; Luther E Lindler; Siv G E Andersson; Mats Forsman; Richard W Titball
Journal:  Nat Genet       Date:  2005-01-09       Impact factor: 38.330

6.  Worldwide genetic relationships among Francisella tularensis isolates determined by multiple-locus variable-number tandem repeat analysis.

Authors:  Anders Johansson; Jason Farlow; Pär Larsson; Meghan Dukerich; Elias Chambers; Mona Byström; James Fox; May Chu; Mats Forsman; Anders Sjöstedt; Paul Keim
Journal:  J Bacteriol       Date:  2004-09       Impact factor: 3.490

7.  Epidemiologic and molecular analysis of human tularemia, United States, 1964-2004.

Authors:  J Erin Staples; Kristy A Kubota; Linda G Chalcraft; Paul S Mead; Jeannine M Petersen
Journal:  Emerg Infect Dis       Date:  2006-07       Impact factor: 6.883

8.  SPRING: a tool for the analysis of genome rearrangement using reversals and block-interchanges.

Authors:  Ying Chih Lin; Chin Lung Lu; Ying-Chuan Liu; Chuan Yi Tang
Journal:  Nucleic Acids Res       Date:  2006-07-01       Impact factor: 16.971

9.  Comparative genomic characterization of Francisella tularensis strains belonging to low and high virulence subspecies.

Authors:  Mia D Champion; Qiandong Zeng; Eli B Nix; Francis E Nano; Paul Keim; Chinnappa D Kodira; Mark Borowsky; Sarah Young; Michael Koehrsen; Reinhard Engels; Matthew Pearson; Clint Howarth; Lisa Larson; Jared White; Lucia Alvarado; Mats Forsman; Scott W Bearden; Anders Sjöstedt; Richard Titball; Stephen L Michell; Bruce Birren; James Galagan
Journal:  PLoS Pathog       Date:  2009-05-29       Impact factor: 6.823

10.  Francisella tularensis in the United States.

Authors:  Jason Farlow; David M Wagner; Meghan Dukerich; Miles Stanley; May Chu; Kristy Kubota; Jeannine Petersen; Paul Keim
Journal:  Emerg Infect Dis       Date:  2005-12       Impact factor: 6.883

  10 in total

1.  Assessment of whole-genome mapping in a well-defined outbreak of Salmonella enterica serotype Saintpaul.

Authors:  P D Fey; P C Iwen; E B Zentz; A M Briska; J K Henkhaus; K A Bryant; M A Larson; R K Noel; S H Hinrichs
Journal:  J Clin Microbiol       Date:  2012-06-20       Impact factor: 5.948

2.  Genomic comparison between a virulent type A1 strain of Francisella tularensis and its attenuated O-antigen mutant.

Authors:  Thero Modise; Cheryl Ryder; Shrinivasrao P Mane; Aloka B Bandara; Roderick V Jensen; Thomas J Inzana
Journal:  J Bacteriol       Date:  2012-05       Impact factor: 3.490

3.  Francisella tularensis molecular typing using differential insertion sequence amplification.

Authors:  Marilynn A Larson; Paul D Fey; Amanda M Bartling; Peter C Iwen; Michael P Dempsey; Stephen C Francesconi; Steven H Hinrichs
Journal:  J Clin Microbiol       Date:  2011-05-25       Impact factor: 5.948

4.  Whole-Genome Relationships among Francisella Bacteria of Diverse Origins Define New Species and Provide Specific Regions for Detection.

Authors:  Jean F Challacombe; Jeannine M Petersen; La Verne Gallegos-Graves; David Hodge; Segaran Pillai; Cheryl R Kuske
Journal:  Appl Environ Microbiol       Date:  2017-01-17       Impact factor: 4.792

5.  Natural Selection in Virulence Genes of Francisella tularensis.

Authors:  Mark K Gunnell; Richard A Robison; Byron J Adams
Journal:  J Mol Evol       Date:  2016-05-13       Impact factor: 2.395

6.  Regulation of Francisella tularensis virulence.

Authors:  Shipan Dai; Nrusingh P Mohapatra; Larry S Schlesinger; John S Gunn
Journal:  Front Microbiol       Date:  2011-01-06       Impact factor: 5.640

7.  Microbial Consortium Associated with the Antarctic Marine Ciliate Euplotes focardii: An Investigation from Genomic Sequences.

Authors:  Sandra Pucciarelli; Raghul Rajan Devaraj; Alessio Mancini; Patrizia Ballarini; Michele Castelli; Martina Schrallhammer; Giulio Petroni; Cristina Miceli
Journal:  Microb Ecol       Date:  2015-02-24       Impact factor: 4.552

8.  Unusual large-scale chromosomal rearrangements in Mycobacterium tuberculosis Beijing B0/W148 cluster isolates.

Authors:  Egor A Shitikov; Julia A Bespyatykh; Dmitry S Ischenko; Dmitry G Alexeev; Irina Y Karpova; Elena S Kostryukova; Yulia D Isaeva; Elena Y Nosova; Igor V Mokrousov; Anna A Vyazovaya; Olga V Narvskaya; Boris I Vishnevsky; Tatiana F Otten; Viacheslav Iu Zhuravlev; Valery Y Zhuravlev; Peter K Yablonsky; Elena N Ilina; Vadim M Govorun
Journal:  PLoS One       Date:  2014-01-08       Impact factor: 3.240

9.  Identification of an Attenuated Substrain of Francisella tularensis SCHU S4 by Phenotypic and Genotypic Analyses.

Authors:  Julie A Lovchik; Douglas S Reed; Julie A Hutt; Fangfang Xia; Rick L Stevens; Thero Modise; Eileen M Barry; Terry H Wu
Journal:  Pathogens       Date:  2021-05-22

10.  Francisella tularensis Subtype A.II Genomic Plasticity in Comparison with Subtype A.I.

Authors:  Marilynn A Larson; Ufuk Nalbantoglu; Khalid Sayood; Emily B Zentz; Amanda M Bartling; Stephen C Francesconi; Paul D Fey; Michael P Dempsey; Steven H Hinrichs
Journal:  PLoS One       Date:  2015-04-28       Impact factor: 3.240

  10 in total
