Michal Wozniak, Jerzy Tiuryn, Limsoon Wong.
Abstract
BACKGROUND: Development of drug resistance in bacteria causes antibiotic therapies to be less effective and more costly. Moreover, our understanding of the process remains incomplete. One promising way to learn how resistance is acquired is to use whole-genome comparative approaches to detect drug resistance-associated mutations.
Year: 2014 PMID: 25559874 PMCID: PMC4304204 DOI: 10.1186/1471-2164-15-S10-S10
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Schema of the GWAMAR pipeline. For a set of considered bacterial strains, the input data for GWAMAR consists of (i) a set of mutations; (ii) a set of drug resistance profiles; and (iii) a phylogenetic tree for the set of bacterial strains. Typically the set of mutation profiles is generated using eCAMBer, which is able to download the genome sequences and annotations for the set of bacterial strains, identify point mutations based on multiple alignments, and reconstruct the phylogenetic tree of the considered bacterial strains. The first step of GWAMAR is to compute binary mutation profiles for all point mutations. This step significantly reduces the number of genetic profiles considered. Finally, GWAMAR implements several statistical scores to associate drug resistance profiles with mutation profiles. These include mutual information, odds ratio, hypergeometric test, weighted support and the tree-generalized hypergeometric score (TGH). As a result, we obtain ordered lists of drug resistance associations, where the top-scoring associations are the most likely to be real.
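The tree-ignorant scores named in the Figure 1 caption can be illustrated with a small Python sketch. This is not GWAMAR's code: the function names, the rule that a strain counts as mutated when its residue differs from a reference residue, the skipping of strains with unknown phenotype (`None`), and the 0.5 pseudo-count in the odds ratio are all assumptions made for illustration. The tree-aware scores (weighted support and TGH) are not covered here.

```python
from math import comb, log2

def binarize(column, reference):
    """Binary mutation profile: 1 where a strain differs from the reference residue."""
    return tuple(int(residue != reference) for residue in column)

def contingency(mutation, resistance):
    """2x2 counts over strains whose phenotype is known (None means unknown)."""
    pairs = [(m, r) for m, r in zip(mutation, resistance) if r is not None]
    a = sum(1 for m, r in pairs if m and r)          # mutated, resistant
    b = sum(1 for m, r in pairs if m and not r)      # mutated, susceptible
    c = sum(1 for m, r in pairs if not m and r)      # wild type, resistant
    d = sum(1 for m, r in pairs if not m and not r)  # wild type, susceptible
    return a, b, c, d

def hypergeom_pvalue(a, b, c, d):
    """P(X >= a): chance of at least a resistant strains among the a+b mutated ones."""
    n, resistant, mutated = a + b + c + d, a + c, a + b
    upper = min(resistant, mutated)
    favourable = sum(
        comb(resistant, x) * comb(n - resistant, mutated - x)
        for x in range(a, upper + 1)
    )
    return favourable / comb(n, mutated)

def odds_ratio(a, b, c, d, pseudo=0.5):
    """Odds ratio with a pseudo-count (an assumption) to avoid division by zero."""
    return ((a + pseudo) * (d + pseudo)) / ((b + pseudo) * (c + pseudo))

def mutual_information(a, b, c, d):
    """Mutual information (in bits) between the two binary profiles."""
    n = a + b + c + d
    cells = ((a, a + b, a + c), (b, a + b, b + d), (c, c + d, a + c), (d, c + d, b + d))
    return sum((k / n) * log2(k * n / (row * col)) for k, row, col in cells if k)

# Toy data (made up): six strains, one alignment column, reference residue "D".
profile = binarize("DDGGDA", "D")
phenotype = [False, False, True, True, None, True]
counts = contingency(profile, phenotype)
print(profile, counts, hypergeom_pvalue(*counts))
```

Because many point mutations yield the same binary profile, scoring each distinct profile once is one plausible reading of how binarization reduces the number of profiles to consider.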
Figure 2. Example colorings for the TGH score. (A) An example of coloring ĉ induced by a given drug-resistance profile (large red nodes) and coloring induced by a given binary mutation profile (small orange nodes) for a flat tree. In this example and . (B) Another example of colorings ĉ and induced by the same pair of profiles but for a different tree. In this example and
20 top-scoring associations between drug-resistance profiles and point mutations in the case study on 173 fully sequenced M. tuberculosis strains.
| drug name | gene id | gene name | mutation | in TBDReaMDB | high confidence in TBDReaMDB | TGH score |
|---|---|---|---|---|---|---|
| Fluoroquinolones | Rv0006 | gyrA | D94G/A/H/N/Y | Y | Y | 14.1843430424 |
| Isoniazid | Rv1908c | katG | S315T/G/N | Y | Y | 9.04507605888 |
| Rifampicin | Rv0667 | rpoB | S450L | Y | Y | 8.60191917013 |
| Streptomycin | Rv0682 | rpsL | K43R | Y | Y | 8.32303955124 |
| Ethambutol | Rv3795 | embB | M306I/V/L | Y | Y | 8.24966301883 |
| Isoniazid | Rv1483 | fabG1 | C-15T | Y | Y | 5.8445976648 |
| Rifampicin | Rv0667 | rpoB | D435F/V/Y/G/A | Y | Y | 5.0402225732 |
| Streptomycin | Rv0682 | rpsL | K88R/M | Y | Y | 4.16354931535 |
| Ethambutol | Rv3795 | embB | E504G/D | N | N | 3.33103155053 |
| Pyrazinamide | Rv2043c | pncA | W68L | Y | Y | 2.7080502011 |
| Pyrazinamide | Rv2043c | pncA | H51P | Y | Y | 2.7080502011 |
| Rifampicin | Rv0667 | rpoB | H445D/Y/R | Y | Y | 2.52993515037 |
| Streptomycin | Rvnr01 | rrs | G1108C | N | N | 1.71691080314 |
| Ethambutol | Rv3795 | embB | D1024N | Y | N | 1.68763546921 |
| Ethambutol | Rv3795 | embB | D869G | N | N | 1.68763546921 |
| Ethambutol | Rv3795 | embB | A505T | N | N | 1.68763546921 |
| Fluoroquinolones | Rv0005 | gyrB | N538T | Y | Y | 1.68478734968 |
| Fluoroquinolones | Rv0006 | gyrA | S91P | Y | Y | 1.68478734968 |
| Fluoroquinolones | Rv0005 | gyrB | T539I | N | N | 1.68478734968 |
| Streptomycin | Rvnr01 | rrs | A1401G | Y | N | 1.28846347057 |
Each row corresponds to one association, whereas the consecutive columns describe: drug name, gene identifier, gene name, mutation, association presence in the TBDReaMDB database, status indicating if the association is categorized as high confidence in TBDReaMDB, TGH score.
List of sequenced genes and promoters available in the mtu_broad dataset.
| gene id | gene name | description | promoter sequenced? |
|---|---|---|---|
| Rv0005 | gyrB | DNA gyrase subunit B | yes |
| Rv0006 | gyrA | DNA gyrase subunit A | yes |
| Rv0341 | iniB | isoniazid inductible gene protein | yes |
| Rv0342 | iniA | isoniazid inductible gene protein | yes |
| Rv0343 | iniC | isoniazid inductible gene protein | yes |
| Rv0667 | rpoB | DNA-directed RNA polymerase beta chain | yes |
| Rv0682 | rpsL | 30S ribosomal protein S12 | yes |
| Rv1483 | fabG1 | 3-oxoacyl-[acyl-carrier protein] reductase | yes |
| Rv1484 | inhA | NADH-dependent enoyl-[acyl-carrier-protein] reductase | yes |
| Rv1694 | tlyA | cytotoxin--haemolysin | no |
| Rv1854c | ndh | NADH dehydrogenase | yes |
| Rv1908c | katG | catalase-peroxidase-peroxynitritase T | no |
| Rv2043c | pncA | pyrazinamidase/nicotinamidase | yes |
| Rv2245 | kasA | 3-oxoacyl-[acyl-carrier protein] synthase 1 | no |
| Rv2427Ac | oxyR' | hypothetical protein | no |
| Rv2428 | ahpC | alkyl hydroperoxide reductase C protein | yes |
| Rv2764c | thyA | thymidylate synthase | yes |
| Rv2981c | ddl | D-alanine-D-alanine ligase ddlA | no |
| Rv3423c | alr | alanine racemase | no |
| Rv3793 | embC | membrane indolylacetylinositol arabinosyltransferase | yes |
| Rv3794 | embA | membrane indolylacetylinositol arabinosyltransferase | yes |
| Rv3795 | embB | membrane indolylacetylinositol arabinosyltransferase | yes |
| Rv3854c | ethA | monooxygenase | yes |
| Rv3919c | gid | glucose-inhibited division protein B | yes |
| Rvnr01 | rrs | ribosomal RNA 16S | no |
| Rvnr02 | rrl | ribosomal RNA 23S | no |
20 top-scoring associations between drug-resistance profiles and point mutations in the case study for 1398 partially sequenced M. tuberculosis strains.
| drug name | gene id | gene name | mutation | in TBDReaMDB | high confidence in TBDReaMDB | TGH score |
|---|---|---|---|---|---|---|
| Fluoroquinolones | Rv0006 | gyrA | D94A/G/N/Y/H | Y | Y | 129.754964792 |
| Fluoroquinolones | Rv0006 | gyrA | A90G/V | Y | Y | 41.8967753922 |
| Streptomycin | Rv0682 | rpsL | K43R | Y | Y | 31.005838239 |
| Isoniazid | Rv1908c | katG | S315T/S/G/N/I/R | Y | Y | 27.1918713598 |
| Ethambutol | Rv3795 | embB | Q497P/R/K/H | Y | Y | 17.1681425414 |
| Streptomycin | Rv0682 | rpsL | K88T/R/Q/M | Y | Y | 16.2806822989 |
| Fluoroquinolones | Rv0005 | gyrB | N538K/T/D/S | Y | Y | 12.6368065275 |
| Rifampicin | Rv0667 | rpoB | H445P/D/R/Y/L/N/Q | Y | Y | 12.627849397 |
| Streptomycin | Rvnr01 | rrs | A1401G | Y | N | 9.60726487825 |
| Pyrazinamide | Rv2043c | pncA | T135P/A | Y | N | 9.35766011848 |
| Streptomycin | Rvnr01 | rrs | A514C | Y | Y | 8.96892262877 |
| Rifampicin | Rv0667 | rpoB | D435Y/V/H/G/A/N | Y | Y | 7.63431166207 |
| Fluoroquinolones | Rv0006 | gyrA | S91P | Y | Y | 7.57935978224 |
| Pyrazinamide | Rv2043c | pncA | T-11C/G | Y | Y | 6.76727069266 |
| Ethambutol | Rv3795 | embB | G406S/D/A/C | Y | Y | 6.32500852932 |
| Fluoroquinolones | Rv0006 | gyrA | D89G/N | N | N | 6.26814578901 |
| Pyrazinamide | Rv2043c | pncA | L120P/R | Y | N | 6.11085770664 |
| Streptomycin | Rvnr01 | rrs | C517T | Y | Y | 5.16411345885 |
| Ethambutol | Rv3795 | embB | D328Y/G/H | Y | N | 5.07901609928 |
| Pyrazinamide | Rv2043c | pncA | V139G/A/M/L | Y | Y | 5.05727324518 |
This dataset is provided by The Broad Institute. Each row corresponds to one association, whereas the consecutive columns describe: drug name, gene identifier, gene name, mutation, association presence in the TBDReaMDB database, status indicating if the association is categorized as high confidence in TBDReaMDB, TGH score.
Figure 3. Comparison of accuracy. Precision-recall curves comparing the different association scores implemented in GWAMAR. The left panel presents results for the mtu173 dataset; the right panel, for the mtu_broad dataset. Numbers in square brackets give the Area Under the Curve (AUC) for each score. In both case studies the tree-aware statistics (weighted support and TGH) achieve better performance than the tree-ignorant statistics.
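A precision-recall comparison of this kind can be sketched as follows. The sketch assumes that each score produces a ranked list of associations and that a gold-standard set of true associations is available (for instance, associations listed in TBDReaMDB). The function names, the step-wise area computation and the toy data are illustrative assumptions, not the authors' evaluation code.

```python
def precision_recall_points(scored, positives):
    """Return (recall, precision) after each rank of a score-sorted list.

    scored: list of (association, score) pairs.
    positives: non-empty set of gold-standard associations (assumed known).
    """
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    points, hits = [], 0
    for rank, (association, _) in enumerate(ranked, start=1):
        if association in positives:
            hits += 1
        points.append((hits / len(positives), hits / rank))
    return points

def area_under_curve(points):
    """Step-wise area under the precision-recall points (average-precision style)."""
    area, previous_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area

# Toy ranking (made up): four associations, two of them in the gold standard.
scored = [("a", 3.0), ("b", 2.0), ("c", 1.0), ("d", 0.5)]
gold = {"a", "c"}
curve = precision_recall_points(scored, gold)
print(curve)
print(round(area_under_curve(curve), 4))
```

Tied scores, which occur in the tables above, are kept in input order here because Python's sort is stable. A real evaluation would need an explicit tie-handling rule.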
Figure 4. Compensatory mutations. Distribution of putative compensatory mutations within the rpoA, rpoB and rpoC genes. The position of each mutation is indicated by a vertical line.