Petra Möbius, Martin Hölzer, Marius Felder, Gabriele Nordsiek, Marco Groth, Heike Köhler, Kathrin Reichwald, Matthias Platzer, Manja Marz.
Abstract
Mycobacterium avium (M. a.) subsp. paratuberculosis (MAP), the etiologic agent of Johne's disease, affects cattle, sheep, and other ruminants worldwide. To decipher phenotypic differences between sheep and cattle strains (belonging to MAP-S [Type-I/III] and MAP-C [Type-II], respectively), comparative genome analysis needs data from diverse isolates originating from different geographic regions of the world. The current study presents the best assembled genome of a MAP-S strain to date: sheep isolate JIII-386 from Germany. One newly sequenced cattle isolate (JII-1961, Germany), four published MAP strains of MAP-C and MAP-S from the U.S. and Australia, and M. a. subsp. hominissuis (MAH) strain 104 were used for assembly improvement and comparisons. All genomes were annotated with BacProt, and the results were compared with the NCBI annotation. Corresponding protein-coding sequences (CDSs) were detected, as were CDSs predicted exclusively by either NCBI or BacProt. A new Shine-Dalgarno sequence motif (5'AGCTGG3') was extracted. Novel CDSs, including PE-PGRS family protein genes, and about 80 non-coding RNAs exhibiting high sequence conservation are presented. Previously found genetic differences between MAP types are partially revised. Four out of ten assumed MAP-S-specific large sequence polymorphism regions (LSPSs) are still present in MAP-C strains; new LSPSs were identified. Independently of the regional origin of the strains, the number of individual CDSs and single nucleotide variants confirm the strong similarity of MAP-C strains and show higher diversity among MAP-S strains. This study gives ambiguous results regarding the hypothesis that MAP-S is the evolutionary intermediate between MAH and MAP-C, but it clearly shows a higher similarity of MAP to MAH than to M. intracellulare.
Keywords: Johne’s disease; MAP-S; SNV/SNP; Shine-Dalgarno sequence; evolution of MAP-types; ncRNA; new LSPSs
Year: 2015 PMID: 26384038 PMCID: PMC4607514 DOI: 10.1093/gbe/evv154
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure: Overview of M. avium strains compared in this study and their assembly and annotation status. Strains of MAP-S (Type I/III): JIII-386, S397, CLIJ361 (red); strains of MAP-C (Type II): K-10, K-10′, MAP4, JII-1961 (green); MAH strain 104 (blue); M. tuberculosis strain H37Rv (brown, used for extended ncRNA annotation). Underlined, annotations available; scaff, scaffolds; con, contigs; full, finished genome. Pictograms describe host origin.
General Genome Features of the Different Mycobacteria Strains
Note.—The number of ORFs with a homologous sequence in NCBI (homologous ORFs) and additionally hypothetical ORFs, both predicted by BacProt, are provided. ncRNAs and riboswitches were annotated by homology search of Rfam (v.11.0) (Gardner et al. 2009) families using the GORAP pipeline (unpublished data), see Materials and Methods. For further information (fasta, gff, stk files), see supplementary tables S11, S14, S20, and S22, Supplementary Material online. chr, chromosome; scaff, scaffolds; con, contigs; N50, length of the shortest contig or scaffold such that this sequence and all longer ones together represent at least 50% of all bp in the assembly; ?, candidate, further analysis needed. TPP binds thiamin pyrophosphate (TPP) to regulate thiamin biosynthesis and transport (Winkler et al. 2002); Cobalamin binds adenosylcobalamin to regulate vitamin B12 (cobalamin) biosynthesis and transport (Nahvi et al. 2002); Glycine binds glycine to regulate glycine metabolism genes, including use of glycine as energy source (Mandal et al. 2004); SAM-IV binds S-adenosyl methionine (SAM) to regulate methionine as well as SAM biosynthesis/transport (Weinberg et al. 2007); SAH, recycling of S-adenosylhomocysteine (SAH), produced during SAM-dependent methylation reactions (Weinberg et al. 2007); pan, predicted riboswitch function, located in 5′-UTRs of genes encoding enzymes involved in vitamin pantothenate synthesis (Weinberg et al. 2010); pfl, predicted riboswitch function, consistently present in genomic locations corresponding to 5′-UTRs of protein-coding genes (Weinberg et al. 2010); ydaO–yuaA, genetic “off” switch for ydaO and yuaA genes, possibly triggered during osmotic shock (Barrick et al. 2004); ykoK, Mg-sensing riboswitch, controls expression of magnesium ion transport proteins (Barrick et al. 2004); ykkC–yxkD, upstream of ykkC and yxkD genes in Bacillus subtilis and related genes in other bacteria, function mostly unclear (Weinberg et al. 2010); ykkC-III, predicted riboswitch function, appears to regulate genes related to preceding motifs such as ykkC and yxkD (Weinberg et al. 2010); NA, not applicable.
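The N50 definition in the note can be illustrated with a minimal Python sketch. The function name `n50` and the example lengths are hypothetical and are not taken from the study; the code only restates the definition (the shortest sequence in the set of longest sequences that together cover at least half of the assembly).

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until at least half of
    # all bases in the assembly are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, not data from the study.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

For the example lengths the total is 1,500 bp; the two longest contigs (500 and 400 bp) together cover 900 bp, which is the first point at which half of the assembly is reached, so the N50 is 400.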
Figure: Genome comparison of K-10′ (top), JIII-386 (middle), and S397 (bottom) calculated with Mauve. Colored blocks connected by lines indicate homologous regions which are internally free from genomic rearrangements. White areas within blocks indicate sequence regions of lower similarity. Blocks below the center line are aligned reverse complementary. See detailed supplementary figure S10, Supplementary Material online.
Annotations Obtained from NCBI and Those Additionally Calculated Using BacProt Lead to an Extended Annotation for Each Investigated Mycobacterium avium (Last Column)
Note.—In the second lines (bold): only predicted ORFs with homology to genes with an assigned function in the NCBI annotation are shown (CDSs). Corresponding, ORFs identified by BacProt and NCBI originating from same positions in the genome; Start/End shifted, ORFs identified by BacProt and NCBI but with differences in length (only 5′ or 3′); NCBI/BacProt only, ORFs identified only by NCBI/BacProt; Extended, total number of ORFs (combination of NCBI + BacProt only). All *.gff files are provided in the supplementary tables S1, S11, and S12, Supplementary Material online.
(a) MAP strains for which no NCBI annotation is currently available; only BacProt results are shown.
(b) Lift-over annotation based on the NCBI annotation of MAP S397.
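The categories defined in the note above (Corresponding, Start/End shifted, NCBI only, BacProt only, Extended) can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: ORFs are represented as hypothetical `(start, end, strand)` tuples, "shifted" is taken to mean that exactly one of the two coordinates agrees on the same strand, and the function name `compare_annotations` is invented for this example.

```python
def compare_annotations(ncbi, bacprot):
    """Count ORF categories for two annotations given as (start, end, strand) tuples."""
    ncbi, bacprot = set(ncbi), set(bacprot)
    counts = {"corresponding": 0, "shifted": 0, "ncbi_only": 0}
    used = set()  # BacProt ORFs already paired with an NCBI ORF
    for orf in sorted(ncbi):
        start, end, strand = orf
        if orf in bacprot:
            # Identical coordinates in both annotations.
            counts["corresponding"] += 1
            used.add(orf)
            continue
        # Assumption: a shifted ORF shares the strand and exactly one boundary.
        partner = next(
            (b for b in sorted(bacprot - ncbi)
             if b not in used and b[2] == strand
             and (b[0] == start) != (b[1] == end)),
            None,
        )
        if partner is not None:
            counts["shifted"] += 1
            used.add(partner)
        else:
            counts["ncbi_only"] += 1
    counts["bacprot_only"] = len(bacprot - used)
    # Extended annotation: all NCBI ORFs plus ORFs found only by BacProt.
    counts["extended"] = len(ncbi) + counts["bacprot_only"]
    return counts


# Hypothetical coordinates, not data from the study.
ncbi_orfs = [(100, 400, "+"), (500, 900, "-"), (1000, 1300, "+")]
bacprot_orfs = [(100, 400, "+"), (500, 870, "-"), (2000, 2300, "+")]
print(compare_annotations(ncbi_orfs, bacprot_orfs))
```

In this toy input one ORF is identical in both annotations, one differs only at one end, one is NCBI only, and one is BacProt only, so the extended annotation contains four ORFs.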
Figure: Shine–Dalgarno sequence motifs of investigated M. avium strains and M. tuberculosis strain H37Rv. Detailed information and the motif of MAP strain K-10 (similar to K-10′) can be found in supplementary table S11, Supplementary Material online.
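As a simple illustration of working with the motif reported in the abstract (5'AGCTGG3'), the following Python sketch scans the region upstream of a start codon for exact matches. This is not the motif extraction performed in the study, which derived the motif from the data; here the motif is given and only looked up. The function name `upstream_motif_spacers`, the 20-nt window, the forward-strand restriction, and the toy sequence are all assumptions made for this example.

```python
MOTIF = "AGCTGG"  # motif reported in the abstract


def upstream_motif_spacers(genome, cds_start, motif=MOTIF, window=20):
    """Return spacer lengths (nt between motif end and start codon).

    genome    -- forward-strand sequence as a string
    cds_start -- 0-based index of the first base of the start codon
    window    -- number of upstream bases to scan (assumed value)
    """
    upstream = genome[max(0, cds_start - window):cds_start].upper()
    spacers = []
    pos = upstream.find(motif)
    while pos != -1:
        # Bases remaining between the end of the motif and the start codon.
        spacers.append(len(upstream) - (pos + len(motif)))
        pos = upstream.find(motif, pos + 1)  # allow overlapping matches
    return spacers


# Toy sequence, not data from the study: the motif sits 6 nt upstream of ATG.
toy_genome = "TTTT" + "AGCTGG" + "ACGTAC" + "ATGAAATAA"
print(upstream_motif_spacers(toy_genome, cds_start=16))
```

A genome-wide analysis would also need to handle the reverse strand and approximate matches, which this sketch leaves out.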
Figure: Phylogenetic reconstructions for all investigated M. avium strains based on sequence comparison of 790 corresponding CDSs at the nucleotide (A) and amino acid level (B) and of 70 corresponding ncRNAs (C). Mycobacterium tuberculosis strain H37Rv was used as an outgroup. Mycobacterium intracellulare (MI) strains were included as members of the MAC. Decimal numbers correspond to substitutions per site, and integers represent RAxML bootstrap values. Long branches are shortened. Detailed figures, all multiple sequence alignments, and tree representations in Newick format can be found in the supplementary figures S22–S26, Supplementary Material online. Strains of MAP-S, Type I: CLIJ361 and Type III: JIII-386, S397 (red); strains of MAP-C, Type II: K-10, K-10′, MAP4, JII-1961 (green); MAH strain 104 (blue); MI MOTT-64 and MI MOTT-02 (orange); and M. tuberculosis strain H37Rv (brown, used as outgroup) are shown.
Distribution of Ten LSPs in Mycobacterium avium Strains, Previously Described to be Present in MAP-S but Absent in MAP-C
| Name | Size (kb) | ORFs | MAP-C: K-10 | MAP-C: K-10′ | MAP-C: MAP4 | MAP-C: JII-1961 | MAP-S: JIII-386 | MAP-S: S397 | MAP-S: CLIJ361 | MAH: 104 |
|---|---|---|---|---|---|---|---|---|---|---|
| LSP | 9.01 | MAPs_15940–16060 | — | — | — | — | Full | Full | Full | Part |
| LSP | 6.65 | MAPs_46190–46270 | — | — | — | — | Full | Full | Full | Full |
| LSP | 3.78 | MAPs_14620–14660 | Full | Full | Full | Full | Full | Full | Full | Full |
| LSP | 3.63 | MAPs_46290–46320 | — | — | — | — | Full | Full | Full | Full |
| LSP | 3.47 | MAPs_17580–17610 | — | — | — | — | Full | Full | Full | Part |
| LSP | 3.0 | MAPs_40470–40500 | Full | Full | Full | Full | Full | Full | Full | — |
| LSP | 2.89 | MAPs_17640–17670 | — | — | — | — | Full | Full | Full | Full |
| LSP | 2.39 | MAPs_02730–02760 | Part | Part | Part | Part | Full | Full | Full | Full |
| LSP | 1.84 | MAPs_23120–23150 | Full | Full | Full | Full | Full | Full | Full | Full |
| LSP | 1.58 | MAPs_42460–42490 | Full | Full | Full | Full | Full | Full | Full | — |
Note.—Labels and locations according to Bannantine et al. (2012). LSP8 was only partially detected with an alignment length of 692 bp in all MAP-C strains. Homologous fasta sequences for LSP1–10 of MAP JIII-386 as well as further details and additional information about the distribution of 25 other LSPs (Semret et al. 2005; Alexander et al. 2009) can be found in the supplementary table S14 and , Supplementary Material online. Full, full-length hit; Part, partial hit.
(a) All ORFs comprised by the LSP are present but split on different contigs or genomic locations.
New LSP Regions and Extended and Revised Previously Described Regions, Present in MAP-S but Absent in MAP-C
| LSP (Genomic Region) | LSP | New LSP | Island Size (bp) (MAP-C negative) | Including MAPs | # ORFs | # ORFs (BacProt) | Present in MAP-S | Present in MAP-C | Present in MAH 104 |
|---|---|---|---|---|---|---|---|---|---|
| 34,377 | 31 | +5 | Yes | Not | Partly | ||||
| New (This Study) | 2 ORFs of LSP | LSP | 10,227 | MAPS_15870–15950 | 9 | +2 | Yes | Not | Not |
| Extended (This Study) | 23 ORFs of LSP | LSP | 24,150 | MAPS_15961–16180 | 22 | +3 | Yes | Not | Yes |
| 8 ORFs of LSP | |||||||||
| Extended (This Study) | LSP | 16,392 | 18 | +2 | Yes | Not | Yes | ||
| New (BacProt) | MAPs_46241–46242 | 1 | |||||||
| Extended (Previously | LSP | 16,015 | 12 | +5 | Yes | Not | Partly | ||
| New (BacProt) | MAPs_17690 | 1 | Yes | Not | Not | ||||
| LSP | LSP | 12,142 | MAPS_17580–17680 | 11 | +4 | Yes | Not | Yes | |
| LSP | 3,873 | MAPS_17690–17700 | 2 | +1 | Yes | Not | Not | ||
| Extended (This Study) | MAV-14 | 22 | +2 | Yes | Not | Yes | |||
| Revised (This Study) | LSP | — | — | Yes | Yes | Yes | |||
| LSP | — | — | Yes | Yes | Not | ||||
| LSP | — | — | Yes | Partly | Yes | ||||
| LSP | — | — | Yes | Yes | Yes | ||||
| LSP | — | — | Yes | Yes | Not | ||||
Note.—The number of novel ORFs, additionally predicted by BacProt and with no overlap against previously annotated MAPs ORFs, is listed. For further information about genomic positions of homologous ORFs (CDSs) in MAP JIII-386 and gene annotation, see supplementary table S17, Supplementary Material online. # ORFs, number of ORFs including homologous as well as hypothetical ORFs. In bold: new designation of LSPs and the included MAPs.
(a) See Bannantine et al. (2012).
(b) See Alexander et al. (2009).
(c) BacProt-assigned function.
Gene Cluster Comprising Seven K-10 ORFs Absent in S397 but Present in JIII-386 and CLIJ361
| ORF | Size (bp) | Description |
|---|---|---|
| MAP1432 | 1,490 | REP-family protein |
| MAP1433c | 1,745 | 3-oxosteroid 1-dehydrogenase |
| MAP1434 | 1,118 | Putative phthalate oxygenase |
| MAP1435 | 713 | Short chain dehydrogenase |
| MAP1436c | 782 | Putative oxidoreductase |
| MAP1437c | 986 | Hypothetical protein |
| MAP1438c | 983 | Probable lipase |
Note.—Table based on Bannantine et al. (2012). Homologous sequences of all ORFs were found on scaffold S02 in MAP JIII-386. For additional information and BLAST results, see supplementary table S16, Supplementary Material online.
(a) Partial hit (alignment length 1,484 bp) with mismatches.
(b) Involved in energy metabolism.
(c) Treated as hypothetical ORFs during analyses.
(d) Involved in degradation of macromolecules.