Arun N Prasanna, Sarika Mehra.
Abstract
Mycobacterium species cause a variety of infectious diseases in a range of hosts. Genome-based methods are used to understand how each pathogenic species has adapted to its unique niche. In this work, we compare pathogenic and non-pathogenic Mycobacterium genomes. Phylogenetic trees were constructed using the sequences of core orthologs, gene content, and gene order. The genome-based methods resolve inter-species evolutionary distances better than the conventional 16S-based tree. Phylogeny based on gene order highlights evolutionary characteristics distinct from those of the sequence-based methods, as illustrated by the shift in the relative position of M. abscessus. This difference in gene order among the Mycobacterium species is investigated further using a detailed synteny analysis. While rearrangements between some Mycobacterium genomes are local within synteny blocks, a few genomes possess global rearrangements across the genome. The study illustrates how a combination of different genome-based methods is essential for building a robust phylogenetic relationship between closely related organisms.
Year: 2013 PMID: 24015186 PMCID: PMC3756022 DOI: 10.1371/journal.pone.0071248
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Genome characteristics of mycobacterial pathogens and non-pathogens considered in this study.
| ATTRIBUTES |  |  |  |  |  |  |  |  |  |  |
|  | 4.4 | 5.1 | 5.5 | 3.27 | 5.63 | 6.64 | 6.99 | 5.74 | 5.62 | 6.49 |
|  | 4062 | 4991 | 5313 | 2770 | 4957 | 5541 | 6938 | 5551 | 5327 | 6136 |
|  | 4003 | 4941 | 5120 | 1605 | 4160 | 5423 | 6717 | 5460 | 5241 | 5979 |
|  | 65.6 | 64.1 | 69 | 57.8 | 65.5 | 65.7 | 67.4 | 68.4 | 67.9 | 67.8 |
|  | YES | YES | YES | YES | YES | YES | NO | NO | NO | NO |
|  | TB | Skin infections | Pulmonary disease | Leprosy | Buruli ulcer | Lymphangitis | STL | N.A | N.A | N.A |
|  | ∼24 h | 4–5 h | 3.4 h–7 d | 12–14 d | 36 h | 6–11 h | ∼2–3 h | 7–8 h | N.A | N.A |
|  | + | + | + | + | + | + | + | + | + | + |
|  | Bacilli | Bacilli | Bacilli | Bacilli | Bacilli | Bacilli | Bacilli | Bacilli | Bacilli | Bacilli |
|  | 8 | 0 | 143 | 1115 | 747 | 67 | 167 | 34 | 32 | 99 |
|  | 50 | 50 | 50 | 50 | 50 | 51 | 54 | 57 | 55 | 58 |
|  | NIL | 1 | NIL | NIL | 1 | 1 | NIL | 2 | 3 | NIL |
|  | 3617 | 4614 | 4521 | 1581 | 3635 | 4738 | 6157 | 5034 | 4790 | 5418 |
STL stands for Soft tissue Lesions.
Figure 1. Classification of mycobacterial genes into functional classes.
The genes are categorized into 16 role categories based on the TIGR classification system. AAB: amino acid biosynthesis; BSC: biosynthesis of cofactors; CEN: cell envelope; CEP: cellular processes; CIM: central intermediary metabolism; DME: DNA metabolism; EME: energy metabolism; FPM: fatty acid and phospholipid metabolism; HPR: hypothetical proteins; PFA: protein fate; PSY: protein synthesis; PPN: purine, pyrimidine and nucleotide metabolism; RFU: regulatory functions (regulatory functions + transcription); TBP: transport and binding proteins; UNK: unknown and unclassified proteins; MEE: mobile extrachromosomal elements.
Figure 2. Variation in the number of genes in each functional category with genome size.
The role category abbreviations are the same as in Figure 1.
Number of pair-wise orthologs between all genomes considered in this study.
|  |  |  |  |  |  |  |  |  |  |  |
|  | 2413 |  |  |  |  |  |  |  |  |  |
|  | 2466 | 2573 |  |  |  |  |  |  |  |  |
|  | 2837 | 3008 | 3692 |  |  |  |  |  |  |  |
|  | 1343 | 1269 | 1306 | 1353 |  |  |  |  |  |  |
|  | 1675 | 1909 | 1723 | 1965 | 1000 |  |  |  |  |  |
|  | 2117 | 2605 | 2212 | 2586 | 1145 | 2177 |  |  |  |  |
|  | 2130 | 2631 | 2225 | 2639 | 1140 | 2069 | 3536 |  |  |  |
|  | 2138 | 2589 | 2254 | 2651 | 1131 | 2112 | 3562 | 3717 |  |  |
|  | 2035 | 2518 | 2149 | 2499 | 1112 | 2022 | 3275 | 3561 | 4162 |  |
|  | 645 | 783 | 663 | 728 | 345 | 784 | 1092 | 969 | 946 | 908 |
|  | 26 | 30 | 29 | 32 | 16 | 31 | 47 | 35 | 38 | 34 |
|  | 3988 | 5120 | 4160 | 5423 | 1605 | 4920 | 6716 | 5460 | 5979 | 5241 |
Nomenclature: coelicolor stands for Streptomyces coelicolor A3(2) and coli refers to Escherichia coli K12.
For the other organisms, the abbreviations refer to the organisms listed in Table 1.
Normalized Gene Content between Mycobacterium species.
| Organism |  |  |  |  |  |  |  |  |  |  |
|  | - | 61 | 62 | 71 | 84 | 42 | 53 | 53 | 54 | 51 |
|  | 47 | - | 62 | 59 | 79 | 39 | 51 | 51 | 51 | 49 |
|  | 59 | 50 | - | 89 | 81 | 41 | 53 | 53 | 54 | 52 |
|  | 52 | 55 | 68 | - | 84 | 40 | 48 | 49 | 49 | 48 |
|  | 34 | 25 | 31 | 25 | - | 62 | 71 | 71 | 70 | 69 |
|  | 34 | 37 | 35 | 36 | 20 | - | 44 | 42 | 43 | 41 |
|  | 32 | 39 | 33 | 39 | 17 | 32 | - | 65 | 60 | 62 |
|  | 39 | 48 | 41 | 48 | 21 | 38 | 53 | - | 68 | 68 |
|  | 36 | 43 | 38 | 44 | 19 | 35 | 53 | 62 | - | 79 |
|  | 39 | 48 | 41 | 48 | 21 | 39 | 49 | 65 | 70 | - |
|  | 3988 | 5120 | 4160 | 5423 | 1605 | 4920 | 6716 | 5460 | 5979 | 5241 |
Upper triangular matrix: % gene content, defined as genes shared by two genomes / genes in the smaller genome.
Lower triangular matrix: % gene content, defined as genes shared by two genomes / genes in the larger genome.
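The two normalizations can be checked with a short calculation. The sketch below is illustrative only: the function name `gene_content` is hypothetical, and rounding to a whole percent is an assumption that happens to agree with the tabulated values. The inputs are taken from the tables above (2413 pair-wise orthologs shared between genomes with 3988 and 5120 genes).

```python
def gene_content(shared: int, genes_a: int, genes_b: int) -> tuple[int, int]:
    """Return (% of the smaller genome, % of the larger genome) that is shared.

    Rounding to a whole percent is an assumption made for this sketch.
    """
    smaller, larger = sorted((genes_a, genes_b))
    return round(100 * shared / smaller), round(100 * shared / larger)


# 2413 shared orthologs; genome sizes of 3988 and 5120 genes (from the tables above).
pct_of_smaller, pct_of_larger = gene_content(2413, 3988, 5120)
print(pct_of_smaller, pct_of_larger)  # 61 47
```

The first value corresponds to the upper triangular entry and the second to the lower triangular entry for the same pair of genomes.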
Figure 3. Distribution of core orthologs into various functional categories.
The bar graphs represent the number of core orthologs in each category based on the TIGR annotation of the genes in Mycobacterium leprae. The line plot shows the % conservation of each class with respect to total number of genes in that class in M. leprae.
Figure 4. Phylogenetic trees based on nucleotide and protein sequences.
The tree in (a) is based on the 16S rRNA sequence, whereas the tree in (b) is based on the concatenated protein sequences of 759 core orthologs. The 16S tree was built with the Fitch-Margoliash method, and the tree in (b) with the neighbour-joining method.
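As a rough illustration of the concatenation step behind tree (b), the sketch below joins per-family alignments into one sequence per genome and computes a simple pairwise distance. Everything here is an assumption made for illustration: the function names, the toy sequences, and the use of an uncorrected p-distance (the record does not state which distance measure or software was used). The tree-building step itself (neighbour-joining) is not shown.

```python
def concatenate(alignments: dict[str, dict[str, str]]) -> dict[str, str]:
    """Join aligned ortholog families (family -> {genome: aligned sequence})
    into one sequence per genome, taking families in sorted order."""
    genomes = sorted(next(iter(alignments.values())))
    return {g: "".join(alignments[f][g] for f in sorted(alignments)) for g in genomes}


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy alignments for two hypothetical core ortholog families in three genomes.
toy = {
    "fam1": {"A": "MKT-L", "B": "MKTAL", "C": "MRTAL"},
    "fam2": {"A": "GAVL", "B": "GAIL", "C": "GSIL"},
}
concat = concatenate(toy)
print(p_distance(concat["A"], concat["B"]))  # 0.125
print(p_distance(concat["A"], concat["C"]))  # 0.375
```

The resulting distance matrix would then be handed to a tree-building program.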
Figure 5. Phylogeny of mycobacteria based on gene content and synteny.
The tree in (a) is based on the number of orthologs shared between two genomes, with % similarity calculated relative to the smaller of the two genomes. The tree in (b) is based on the order of the core orthologs in each genome. The distance between two genomes is defined as the number of reversals required to convert the gene order of one genome into that of the other. Both trees were built with the Fitch-Margoliash method.
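To make the reversal distance concrete, here is a minimal brute-force sketch that finds the smallest number of segment reversals converting one gene order into another. It treats genes as unsigned (strand is ignored) and uses breadth-first search, which is only practical for a handful of genes; both choices are assumptions for illustration and not the method used in the study, where hundreds of core orthologs would call for a dedicated rearrangement algorithm.

```python
from collections import deque


def reversal_distance(source: tuple[int, ...], target: tuple[int, ...]) -> int:
    """Minimum number of segment reversals turning `source` into `target`.

    Unsigned gene orders, exhaustive breadth-first search (toy sizes only).
    """
    if sorted(source) != sorted(target):
        raise ValueError("both orders must contain the same genes")
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        order, steps = queue.popleft()
        if order == target:
            return steps
        n = len(order)
        for i in range(n - 1):
            for j in range(i + 2, n + 1):
                # Reverse the segment order[i:j] (at least two genes long).
                nxt = order[:i] + order[i:j][::-1] + order[j:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + 1))
    raise RuntimeError("target not reachable")  # cannot happen for valid input


print(reversal_distance((1, 2, 3, 4, 5), (1, 4, 3, 2, 5)))  # 1
print(reversal_distance((1, 2, 3), (3, 1, 2)))              # 2
```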
Figure 6. Synteny analysis of core orthologs in mycobacteria.
The order of blocks in a few selected mycobacteria is shown. Three representative rearrangement patterns (A–C) are shown for 22 blocks, where each block contains a minimum of 10 genes.
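One simple way to picture such blocks is as maximal runs of core orthologs that appear in the same (or exactly inverted) consecutive order in two genomes. The sketch below follows that idea; the strict-adjacency rule and the function name `synteny_blocks` are assumptions for illustration, since the block definition used in the study is not given here beyond the 10-gene minimum.

```python
def synteny_blocks(order: list[int], min_genes: int = 10) -> list[list[int]]:
    """Split `order` into maximal collinear runs and keep the long ones.

    `order` lists the reference-genome rank of each core ortholog, in the
    order the orthologs occur in the query genome. A run continues while
    ranks change by +1 (same orientation) or by -1 (inverted block).
    """
    blocks: list[list[int]] = []
    current: list[int] = []
    step = 0  # +1 or -1 once the run's orientation is known, 0 before that
    for rank in order:
        if current:
            diff = rank - current[-1]
            if diff in (1, -1) and step in (0, diff):
                step = diff
                current.append(rank)
                continue
            blocks.append(current)
        current, step = [rank], 0
    if current:
        blocks.append(current)
    return [b for b in blocks if len(b) >= min_genes]


# Toy query genome: one inverted segment between two collinear segments.
print(synteny_blocks([0, 1, 2, 6, 5, 4, 3, 7, 8], min_genes=2))
# [[0, 1, 2], [6, 5, 4, 3], [7, 8]]
```

With strict adjacency, a single inserted or missing ortholog splits a block, so a practical analysis would likely tolerate small gaps.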
Figure 7. Phylogenetic tree based on the order of syntenic blocks of core ortholog genes.
The number in brackets denotes the % micro-rearrangements with respect to M. smegmatis. Pathogens are shown in red and non-pathogens in green.
Genes identified as exclusive to pathogens.
| LOCUS | PRODUCT |  | SIGNIFICANCE OF GENE FAMILY |
|  | succinate-semialdehyde dehydrogenase ( | NGA (4.862), STREP (−3.023), DETA (−2.229) | MTB lacks KDH activity in the TCA cycle; hence, α-ketoglutarate cannot be converted to succinate. As an alternate pathway, kgd and |
|  | Conserved membrane protein | KCN (−4.291), CPM (−3.936), Sediment | These genes belong to the DedA protein family, which is involved in membrane homeostasis |
|  | Orotate phosphoribosyltransferase ( | STREP (6.698), PMA (5.205), DETA (−4.493), Starvation (−5.575) | These genes are involved in the reversible conversion of orotate to orotidine 5′-phosphate in the utilization of L-glutamine, essentially in pyrimidine metabolism. |
|  | Membrane protein ( | Iron (2.959), OA (−2.55), WT | In general, MmpS proteins act as a scaffold for coupled biosynthesis of lipid and transport machinery |
|  | Transcriptional regulator | DM+PMA (6.348), KCN+DETA (4.27), Rv Grown in macrophage (−4.014) | Rv1404 belongs to the MarR family of proteins. This gene acts as a repressor of two methyltransferases, Rv1403c and Rv1405c. Rv1404 was found to be an important regulator of genes in the response of MTB to acid shock |
|  | Glycosyl transferase | Reaeration of Rv under hypoxia (2.753), 3 mM ACE (−3.577), MEN (−6.26) | Glycosyltransferases are enzymes that synthesize oligosaccharides and lipopolysaccharides |
|  | Hypothetical protein cpsA | OA (6.453), LA (7.257), PMA (2.882), Rv | The role of the cpsA protein in MTB is unknown. However, in Streptococcus, this gene is involved in regulatory control of cell wall related processes and response to antimicrobial stress |
|  | Glycosyl transferase | Rv PMA+OA (−2.605), PMA (−3.774) | Rv3631 encodes a glycosyltransferase (ppgS). The catalytic activity of ppgS was found to be increased 40–50 fold by co-transcription of Rv3632. |
|  | Conserved membrane protein | SNG+CPM (9.742), Rv DCH (−7.865), AA (−6.174) | Rv3631 and Rv3632 are jointly involved in modification of the cell wall component arabinogalactan |
The values shown are log2 ratios of each condition with respect to its control. Data are from the TB database at www.tbdb.org [30].
Expansions: NGA - Nordihydroguaiaretic acid; STREP - Streptomycin; PMA - Palmitate; DM - Defined media; KCN - Potassium cyanide; DETA - Diethylenetriamine; LA - Linoleic acid; OA - Oleic acid; Rv - H37Rv strain; SNG - S-nitrosoglutathione; CPM - Chlorpromazine; ACE - Acetate; MEN - Menadione; DCH - Dicyclohexylcarboxamide; AA - Arachidonic acid; DOS - dosS/dosT mutant; CSP - Cell surface physiology; PS - Polysaccharide synthesis; CWP - Cell wall processing; KDH - alpha-ketoglutarate dehydrogenase; kgd - alpha-ketoglutarate decarboxylase.
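For readers less used to log2 ratios, the sketch below shows how such a value relates to a fold change. The function names and the example expression levels are hypothetical; only the definition (log2 of condition over control) comes from the table note above.

```python
import math


def log2_ratio(condition: float, control: float) -> float:
    """log2 of the expression level under a condition relative to its control."""
    return math.log2(condition / control)


def fold_change(log2_value: float) -> float:
    """Convert a log2 ratio back to a linear fold change."""
    return 2.0 ** log2_value


print(log2_ratio(8.0, 2.0))  # 2.0  (four-fold induction)
print(fold_change(-1.0))     # 0.5  (two-fold repression)
```

On this scale, a tabulated value of 4.862 corresponds to roughly a 29-fold increase over the control, and negative values indicate repression.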