Ekaterina N Chernyaeva, Marina V Shulgina, Mikhail S Rotkevich, Pavel V Dobrynin, Serguei A Simonov, Egor A Shitikov, Dmitry S Ischenko, Irina Y Karpova, Elena S Kostryukova, Elena N Ilina, Vadim M Govorun, Vyacheslav Y Zhuravlev, Olga A Manicheva, Peter K Yablonsky, Yulia D Isaeva, Elena Y Nosova, Igor V Mokrousov, Anna A Vyazovaya, Olga V Narvskaya, Alla L Lapidus, Stephen J O'Brien.
Abstract
BACKGROUND: Tuberculosis (TB) poses a worldwide threat due to the spread of multidrug-resistant strains and deadly co-infections with human immunodeficiency virus. Large amounts of Mycobacterium tuberculosis whole-genome sequencing data are now being analyzed widely, yet no comprehensive online resource connects M. tuberculosis genome variants with geographic origin, drug resistance, or clinical outcome. DESCRIPTION: Here we describe the Genome-wide Mycobacterium tuberculosis Variation (GMTV) database (http://mtb.dobzhanskycenter.org), which catalogues genome variations of M. tuberculosis strains collected across Russia. GMTV integrates data from different sources on M. tuberculosis molecular biology, epidemiology, TB clinical outcome, year and place of isolation, and drug resistance profiles, and displays the variants across the genome using a dedicated genome browser. The GMTV database, which includes 1084 genomes and over 69,000 SNP or Indel variants, can be queried for M. tuberculosis genome variation and putative associations with drug resistance, geographical origin, and clinical stages and outcomes.
Year: 2014 PMID: 24767249 PMCID: PMC4234438 DOI: 10.1186/1471-2164-15-308
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Regions of the Russian Federation from which isolates were obtained. Source regions of the M. tuberculosis strain genomes included in the database are highlighted in pink.
Figure 2. Genotypes of isolates included in the GMTV database. Spoligotypes were identified using the conventional spacer oligonucleotide typing technique, and genetic clades were identified using SNP analysis.
Genome variants (SNPs and Indels) in the GMTV database, filtered at Q30
| Variant category | SNPs | Indels |
| --- | --- | --- |
| Overall quantity | 45655 | 23975 |
| In CDS | 39808 | 18537 |
| Nonsynonymous mutations in CDS | 24124 | - |
| Synonymous mutations in CDS | 13392 | - |
| Variations in STOP-codons in CDS | 684 | - |
| Frameshift mutations in CDS | - | 10993 |
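
As a quick consistency check on the table above, the following sketch sums the two Q30-filtered totals and computes what share of each variant class falls in coding sequences (CDS). It assumes the first numeric column holds SNP counts and the second holds Indel counts, which is inferred from the row labels rather than stated in the source. All variable names are illustrative.

```python
# Counts copied from the Q30-filtered table.
# Assumption: first numeric column = SNPs, second = Indels.
snps = {"total": 45655, "in_cds": 39808}
indels = {"total": 23975, "in_cds": 18537, "frameshift": 10993}

# Combined number of variants; the abstract reports "over 69,000".
total_variants = snps["total"] + indels["total"]

# Share of each variant class located in coding sequences.
snp_cds_share = snps["in_cds"] / snps["total"]
indel_cds_share = indels["in_cds"] / indels["total"]

# Share of coding-sequence Indels that cause a frameshift.
frameshift_share = indels["frameshift"] / indels["in_cds"]

print(f"Total variants: {total_variants}")
print(f"SNPs in CDS: {snp_cds_share:.1%}")
print(f"Indels in CDS: {indel_cds_share:.1%}")
print(f"Frameshift share of CDS Indels: {frameshift_share:.1%}")
```

The combined total of 69630 agrees with the "over 69,000 SNP or Indel variants" reported in the abstract.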
Figure 3. Size distribution of Indels in the genome. (A) Indel distribution without a quality threshold; (B) Indel distribution with quality threshold 30. One- and nine-nucleotide Indels are the most common among M. tuberculosis isolates in the GMTV database.
Figure 4. Output of an SNP query. The query selected SNPs in coding regions of M. tuberculosis isolates resistant to streptomycin and isoniazid, with quality threshold 30.
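
The kind of query described in the Figure 4 caption can be sketched as a simple filter. This is a minimal illustration and not the GMTV implementation: the `Variant` record, its fields, the `query_snps` helper, and the example records are all hypothetical and contain no GMTV data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; the field names are assumptions."""
    position: int            # genome coordinate
    kind: str                # "SNP" or "Indel"
    in_cds: bool             # True if the variant lies in a coding region
    quality: float           # variant quality score
    resistant_to: frozenset  # drugs the carrying isolate is resistant to


def query_snps(variants, drugs, min_quality=30.0):
    """Return coding-region SNPs at or above min_quality whose isolate
    is resistant to every drug in `drugs`."""
    required = frozenset(drugs)
    return [
        v for v in variants
        if v.kind == "SNP"
        and v.in_cds
        and v.quality >= min_quality
        and required <= v.resistant_to  # subset test: all drugs present
    ]


# Made-up records for illustration only.
examples = [
    Variant(1001, "SNP", True, 45.0, frozenset({"streptomycin", "isoniazid"})),
    Variant(2002, "SNP", True, 12.0, frozenset({"streptomycin", "isoniazid"})),
    Variant(3003, "SNP", False, 50.0, frozenset({"streptomycin", "isoniazid"})),
    Variant(4004, "Indel", True, 60.0, frozenset({"streptomycin", "isoniazid"})),
    Variant(5005, "SNP", True, 55.0, frozenset({"isoniazid"})),
]

hits = query_snps(examples, ["streptomycin", "isoniazid"])
print([v.position for v in hits])
```

Only the first record passes every condition. The others fail on quality, coding-region location, variant type, or resistance profile, respectively.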