Abstract
An alarming increase in tuberculosis (TB) caused by drug-resistant strains of Mycobacterium tuberculosis has created an urgent need for new antituberculosis drugs acting via novel mechanisms. Phylogenomic and comparative genomic analyses reviewed here reveal that the TB-causing bacteria comprise a small group of organisms that differ from all other mycobacteria in numerous respects. Comprehensive analyses of protein sequences from mycobacterial genomes have identified 63 conserved signature indels (CSIs), i.e., inserts and deletions, in important proteins that are distinctive characteristics of the TB-complex of bacteria. The identified CSIs provide a potential means for developing novel diagnostics as well as therapeutics for the TB-complex of bacteria, based on four key observations: (i) the CSIs exhibit a high degree of exclusivity towards the TB-complex of bacteria; (ii) earlier work on CSIs provides evidence that they play important or essential functions in the organisms for which they exhibit specificity; (iii) CSIs are located in surface-exposed loops of the proteins, which are implicated in mediating novel interactions; (iv) homologs of the CSI-containing proteins, or the CSIs in such homologs, are generally not found in humans. Based on these characteristics, it is hypothesized that high-throughput virtual screening for compounds that bind specifically to the CSIs (or CSI-containing regions), and thereby inhibit the cellular functions of the CSIs, could lead to the discovery of a novel class of drugs specifically targeting the TB-complex of organisms.
Keywords: comparative genomics; conserved signature indels; mycobacterial classification; mycobacterial genomes; novel drug targets; phylogenomics; protein structures and surface loops; tuberculosis-complex
Year: 2018 PMID: 30279355 PMCID: PMC6306742 DOI: 10.3390/ht7040031
Source DB: PubMed Journal: High Throughput ISSN: 2571-5135
Figure 1. A compressed tree showing the main clades of mycobacteria observed in phylogenomic trees and molecular markers that have been identified for different clades. The tree shown is based on 1941 core proteins from the genomes of 150 Mycobacteriaceae species [25]. The terms CSIs and CSPs refer to conserved signature indels and conserved signature proteins, respectively, which are specific for the species from the observed clades. Comprehensive analyses of genome sequences have led to division of the family Mycobacteriaceae (genus Mycobacterium) into five different genera as indicated here [25].
Figure 2. A compressed phylogenetic tree showing the main clades observed within the delimited genus Mycobacterium. The tree shown is based on 136 proteins commonly shared by members of the phylum Actinobacteria. The tree was constructed as described in earlier work [25], and the main species groupings observed are collapsed, except those from the M. tuberculosis-related group of bacteria. The group of species that is commonly referred to as the tuberculosis-complex is marked. All of the CSIs described in this work are specific for the tuberculosis-complex of bacteria.
Figure 3. Partial sequence alignments of the proteins (A) UDP-N-acetylenolpyruvoyl-glucosamine reductase (MurB) and (B) 3′-phosphoadenosine 5′-phosphosulfate reductase (CysH), containing conserved inserts of four amino acids (aa) and seven aa (boxed), respectively, which are uniquely found in the tuberculosis-complex of bacteria. The numbers 9/9 indicate that 9 sequences are available from the tuberculosis-complex of bacteria and that all 9 of them contain these CSIs. In contrast, these CSIs are lacking in the homologs from all other mycobacteria as well as other examined bacteria. The homologs of these proteins, or the CSI-containing regions of these proteins, are not found in humans. The dashes (-) in the sequence alignments show identity with the aa present on the top line. Mutational studies indicate that both of these proteins are essential for the growth of M. tuberculosis [45,46].
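The dash notation used in these alignments can be expanded mechanically. The following is a minimal Python sketch, not the authors' tooling: the function name `expand_identity_dashes`, the use of `.` as a gap symbol, and the toy sequences are all assumptions made for illustration and are not taken from the MurB or CysH alignments.

```python
def expand_identity_dashes(reference: str, row: str) -> str:
    """Replace each '-' in `row` with the residue in the same column of `reference`.

    Any other character (a differing residue, or the assumed gap symbol '.')
    is kept unchanged.
    """
    if len(reference) != len(row):
        raise ValueError("reference and row must have the same aligned length")
    return "".join(ref if ch == "-" else ch for ref, ch in zip(reference, row))


# Made-up 10-column example (hypothetical sequences).
top_line = "MKTAYIAKQR"
compact_row = "--S--....-"  # one substitution and a 4-column gap relative to the top line
print(expand_identity_dashes(top_line, compact_row))
```

Expanding the rows first makes it easier to compare the boxed insert columns across species, because every row then carries explicit residues rather than references to the top line.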
Figure 4. Partial sequence alignment of a LytR family transcriptional regulatory protein showing a 12 aa long deletion in a conserved region. This deletion is found only in members of the M. tuberculosis complex and is present in all of them except Mycobacterium canettii, which branches earlier than the other species from this group (Figure 2). The dashes (-) indicate identity with the aa present on the top line.
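The kind of group specificity described for this deletion, including an in-group exception such as M. canettii, could be checked along the following lines. This is a hedged sketch under stated assumptions: the gap symbol `.`, the helper names `has_deletion` and `csi_exceptions`, and the toy alignment are hypothetical, and the sequences are invented rather than taken from the LytR alignment.

```python
from typing import Dict, List, Set, Tuple

GAP = "."  # assumed gap symbol for this sketch


def has_deletion(seq: str, start: int, end: int) -> bool:
    """True if every alignment column in [start, end) is a gap."""
    return all(ch == GAP for ch in seq[start:end])


def csi_exceptions(
    alignment: Dict[str, str], in_group: Set[str], start: int, end: int
) -> Tuple[List[str], List[str]]:
    """Return (in-group species lacking the deletion, out-group species carrying it)."""
    missing = sorted(
        s for s in in_group if not has_deletion(alignment[s], start, end)
    )
    extra = sorted(
        s for s in alignment
        if s not in in_group and has_deletion(alignment[s], start, end)
    )
    return missing, extra


# Toy alignment with invented sequences; columns 3..6 hold the deletion.
alignment = {
    "M. tuberculosis": "ACD....KLM",
    "M. bovis": "ACD....KLM",
    "M. canettii": "ACDEFGHKLM",
    "M. marinum": "ACDEFGHKLM",
}
in_group = {"M. tuberculosis", "M. bovis", "M. canettii"}

missing, extra = csi_exceptions(alignment, in_group, 3, 7)
print("in-group species without the deletion:", missing)
print("out-group species with the deletion:", extra)
```

An empty second list corresponds to an indel that is exclusive to the group, while a non-empty first list flags exceptions within the group.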
Summary of conserved signature indels (CSIs) that are specific for the tuberculosis complex.
| Name | Gene Number | Figure Number | Ins/Del | Location | Mutational Results # |
|---|---|---|---|---|---|
| putative UDP-N-acetylenolpyruvoylglucosamine reductase (MurB) | Rv0482 | Figure 3A | 4aa Ins | 249–298 | Essential |
| putative 3′-phosphoadenosine 5′-phosphosulfate reductase (CysH) (PAPS reductase, thioredoxin dep) | Rv2392 | Figure 3B | 7aa Ins | 17–71 | Essential (growth defect) |
| transcriptional regulator, LytR family | Rv3840 | Figure 4 | 12aa Del | 50–87 | Non-essential |
| putative propionyl-CoA carboxylase beta chain 5 ACCD5 (PCCASE) | Rv3280 | | 1aa Del | 172–220 | Essential |
| O-succinylbenzoic acid-CoA ligase MenE * | Rv0542c | | 2aa Ins | 41–95 | Essential |
| ligase * | Rv3712 | | 4aa Ins | 180–234 | Essential |
| arabinosyltransferase EmbB * | Rv3795 | | 3aa Ins | 747–795 | Essential |
| GTPase Era | Rv2364c | | 1aa Ins | 225–283 | Essential (growth defect) |
| primosome assembly protein PriA | Rv1402 | | 3aa Ins | 609–655 | Essential |
| putative phospho-sugar mutase/MRSA homolog * | Rv3441c | | 3aa Ins | 43–102 | Essential |
| polyketide synthase Pks8 | Rv1662 | | 1aa Del | 539–586 | Non-essential |
| Glutamine-dependent NAD(+) synthetase | Rv2438c | | 1aa Del | 584–641 | Essential |
| ribonuclease E | Rv2444c | | 3aa Ins | 219–269 | Essential |
| putative folylpolyglutamate synthase protein (FolC) | Rv2447c | | 3aa Ins | 111–170 | Essential |
| DNA topoisomerase I TOPA (omega-protein) | Rv3646c | | 3aa Ins | 392–440 | Essential |
| metal cation transporting ATPase H | Rv0425c | | 1aa Del | 963–1014 | Non-essential |
| Acyltransferase * | Rv1565c | | 4aa Ins | 162–220 | Non-essential |
| α-amylase | Rv2471 | | 1aa Del | 428–477 | Non-essential |
| hypothetical protein IQ48_14915, partial | Rv0897c | | 3aa Ins | 257–306 | Non-essential |
| hypothetical protein CAB90_01059 * | Rv0938 | | 3aa Ins | 422–469 | Non-essential |
| transcriptional regulator * | Rv1186c | | 1aa Del | 406–457 | Non-essential |
| hypothetical protein IU12_21070 | Rv0008c | | 4aa Ins | 10–59 | Non-essential |
| hypothetical protein IU14_19860 | Rv0029 | | 2aa Del | 194–250 | Non-essential |
| membrane protein | Rv0051 | | 8aa Ins | 470–522 | Non-essential |
| hypothetical protein RN11_1864 * | Rv0094c | | 8aa Ins | 18–67 | Non-essential |
| transmembrane protein | Rv0188 | | 3aa Del | 18–55 | Non-essential |
| hypothetical protein ERS181347_00724 | Rv0209 | | 3aa Ins | 195–242 | Non-essential |
| conserved membrane protein | Rv0210 | | 3aa Ins | 10–59 | Non-essential |
| fructose-bisphosphate aldolase * | Rv0365c | | 4aa Ins | 144–193 | Non-essential |
| anti-sigma K factor | Rv0444c | | 1aa Ins | 147–206 | Non-essential |
| conserved protein of uncharacterised function, possibly exported | Rv0518 | | 3aa Ins | 36–84 | Non-essential |
| exonuclease V subunit α | Rv0629c | | 2aa Del | 109–159 | Non-essential |
| multidrug resistance protein EmrB | Rv0783c | | 3aa Del | 320–366 | Non-essential |
| Hypothetical protein ERS024213_05484 | Rv0789c | | 1aa Del | 39–91 | Non-essential |
| LuxR family transcriptional regulator | RVBD_0890c | | 1aa Del | 290–328 | Non-essential |
| polyprenyl-diphosphate synthase GrcC | Rv0989c | | 3aa Del | 205–251 | Non-essential |
| polyprenyl-diphosphate synthase GrcC | Rv0989c | | 1aa Del | 94–141 | Non-essential |
| cold-shock protein | Rv1253 | | 2aa Del | 220–275 | Non-essential |
| transcriptional regulator | Rv1358 | | 1aa Ins | 94–152 | Non-essential |
| hypothetical protein IQ40_04435 | Rv1359 | | 4aa Ins | 150–208 | Non-essential |
| esterase | Rv1497 | | 1aa Ins | 296–344 | Non-essential |
| hypothetical protein RN11_1864 * | Rv0094c | | 8aa Ins | 165–213 | Non-essential |
| DEAD/DEAH box helicase | Rv2092c | | 1aa Del | 579–624 | Non-essential |
| phosphoglycerate mutase | Rv2135c | | 1aa Del | 1–48 | Non-essential |
| hypothetical protein CAB90_02390 | Rv2137c | | 2aa Ins | 10–54 | Non-essential |
| putative glycerol-3-phosphate dehydrogenase * | Rv2249c | | 4aa Ins | 333–380 | Non-essential |
| GTP-binding protein LepA | Rv2404c | | 3aa Ins | 298–355 | Non-essential |
| type I restriction/modification system specificity determinant HsdS | Rv2761c | | 4aa Ins | 10–58 | Non-essential |
| hypothetical protein IQ38_12515, partial | Rv2762c | | 2aa Del | 42–81 | Non-essential |
| polyketide synthase * | Rv2940c | | 3aa Ins | 1311–1359 | Non-essential |
| lipase | Rv2970c | | 1aa Ins | 170–225 | Non-essential |
| secreted protein | Rv3054c | | 1aa Del | 13–61 | Non-essential |
| DNA polymerase IV * | Rv3056 | | 1aa Del | 208–254 | Non-essential |
| ATP-dependent DNA helicase * | Rv3202c | | 1aa Del | 378–426 | Non-essential |
| membrane protein | Rv3207c | | 1aa Ins | 207–256 | Non-essential |
| ATPase | Rv3220c | | 1aa Ins | 215–269 | Non-essential |
| DNA glycosylase * | Rv3297 | | 4aa Ins | 80–132 | Non-essential |
| hypothetical protein IQ47_16905, partial * | Rv3394c | | 3aa Ins | 78–123 | Non-essential |
| hydrolase * | Rv3400 | | 3aa Ins | 141–186 | Non-essential |
| hypothetical protein RN11_1864 * | Rv0094c | | 8aa Ins | 19–68 | Non-essential |
| acyl-CoA dehydrogenase FadE27 | Rv3505 | | 8aa Ins | 162–211 | Non-essential |
| oxidoreductase * | Rv3742c | | 11aa Del | 25–65 | Non-essential |
| hypothetical protein IQ42_20035 * | Rv3912 | | 2aa Del | 113–159 | Non-essential |
# Inferences about whether the genes encoding different proteins are essential or not required for in vitro growth of M. tuberculosis H37Rv are based on the results of Himar1-based transposon mutagenesis reported in the literature [45,46].
* Some exceptions are seen for these CSIs.
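For readers who want to work with the table programmatically, cells such as "4aa Ins" and "249–298" can be parsed into numbers before tallying. The following Python sketch is illustrative only: the helper names are hypothetical, and only the first three table rows (Rv0482, Rv2392, Rv3840) are transcribed as sample data.

```python
import re
from collections import Counter
from typing import Tuple

# First three rows of the table: (gene, indel, location, mutational result).
ROWS = [
    ("Rv0482", "4aa Ins", "249–298", "Essential"),
    ("Rv2392", "7aa Ins", "17–71", "Essential (growth defect)"),
    ("Rv3840", "12aa Del", "50–87", "Non-essential"),
]


def parse_indel(cell: str) -> Tuple[int, str]:
    """Split a cell like '4aa Ins' into (length in aa, 'Ins' or 'Del')."""
    match = re.fullmatch(r"(\d+)aa (Ins|Del)", cell.strip())
    if match is None:
        raise ValueError(f"unrecognized indel cell: {cell!r}")
    return int(match.group(1)), match.group(2)


def parse_location(cell: str) -> Tuple[int, int]:
    """Split a cell like '249–298' (en dash or hyphen) into (start, end)."""
    start, end = re.split(r"[–-]", cell.strip())
    return int(start), int(end)


def is_essential(cell: str) -> bool:
    """Treat 'Essential' and 'Essential (growth defect)' as essential."""
    return cell.startswith("Essential")


# Tally rows by indel type and essentiality.
summary = Counter(
    (parse_indel(indel)[1], is_essential(result))
    for _gene, indel, _location, result in ROWS
)
for (kind, essential), count in sorted(summary.items()):
    print(kind, "essential" if essential else "non-essential", count)
```

Extending `ROWS` with the remaining table entries would give the full breakdown of insertions versus deletions among essential and non-essential genes.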
Figure 5. Structural localization of the CSI in the MurB protein. (A) Resolved structure of the UDP-N-acetylenolpyruvoylglucosamine reductase (MurB) protein from M. tuberculosis (PDB ID: 5JZX) [72]. The four aa insertion is highlighted in red. (B) A close-up of the CSI region, with the M. tuberculosis protein colored green and a homology model of the same protein from Mycobacterium angelicum shown in cyan.
Figure 6. Structural location of the CSI in the CysH protein. (A) Homology model of the 3′-phosphoadenosine 5′-phosphosulfate reductase CysH protein from M. tuberculosis (based on PDB ID: 2GOY). The seven aa insertion is highlighted in red and boxed. (B) Resolved structure of the CysH protein from Pseudomonas aeruginosa (PDB ID: 2GOY). The region homologous to the insert is boxed. (C) A close-up of the CSI region in the aligned structures of the two proteins, with CSIs marked in red.