Tiago Dos Vultos, Olga Mestre, Jean Rauzier, Marcin Golec, Nalin Rastogi, Voahangy Rasolofo, Tone Tonjum, Christophe Sola, Ivan Matic, Brigitte Gicquel.
Abstract
BACKGROUND: Mycobacterium tuberculosis complex species display relatively static genomes and 99.9% nucleotide sequence identity. Studying the evolutionary history of such monomorphic bacteria is a difficult and challenging task.
Year: 2008 PMID: 18253486 PMCID: PMC2211405 DOI: 10.1371/journal.pone.0001538
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Putative gene function and distribution of synonymous and non-synonymous SNPs, deletions and stop codons found in this study.
| sSNPs | nsSNPs | Deletions | Stop codons | Putative function |
|---|---|---|---|---|
| 2 | 2 | 0 | 0 | Probable DNA ligase |
| 2 | 6 | 0 | 0 | Possible ATP-dependent ligase |
| 3 | 5 | 1 | 0 | Possible ATP-dependent ligase |
| 2 | 5 | 0 | 0 | Possible ATP-dependent ligase |
| 0 | 0 | 0 | 0 | Probable single-strand binding protein |
| 1 | 5 | 2 | 0 | Probable exonuclease V |
| 5 | 7 | 0 | 0 | Probable exonuclease V |
| 1 | 5 | 0 | 0 | Possible ATP-dependent DNA helicase |
| 1 | 2 | 0 | 0 | Possible ATP-dependent DNA helicase II |
| 2 | 3 | 0 | 0 | Possible ATP-dependent DNA helicase II |
| 3 | 4 | 0 | 0 | Probable excinuclease ABC |
| 2 | 5 | 0 | 0 | Probable excinuclease ABC |
| 1 | 3 | 0 | 0 | Probable excinuclease ABC |
| 8 | 8 | 0 | 0 | Probable DNA polymerase I |
| 2 | 0 | 0 | 0 | Probable Holliday junction DNA helicase |
| 3 | 1 | 0 | 0 | Probable Holliday junction DNA helicase |
| 0 | 0 | 0 | 0 | Probable crossover junction endodeoxyribonuclease |
| 1 | 2 | 0 | 0 | Recombinase A |
| 0 | 2 | 0 | 0 | Repressor LexA |
| 4 | 7 | 0 | 0 | Probable exonuclease V |
| 3 | 3 | 0 | 0 | Probable DNA repair protein |
| 4 | 5 | 0 | 0 | Probable transcription repair coupling factor |
| 0 | 5 | 0 | 0 | Probable NADH pyrophosphatase |
| 2 | 3 | 0 | 0 | Probable thymidine phosphohydrolase |
| 1 | 6 | 0 | 0 | Probable DNA polymerase III |
| 2 | 2 | 0 | 0 | Probable DNA polymerase III |
| 1 | 1 | 0 | 0 | Regulatory protein |
| 1 | 2 | 0 | 0 | Possible DNA repair protein |
| 0 | 2 | 0 | 0 | Probable dUTPase |
| 3 | 1 | 0 | 0 | DNA repair protein |
| 2 | 4 | 1 | 1 | Probable endonuclease IV |
| 0 | 0 | 0 | 0 | Probable exodeoxyribonuclease III |
| 0 | 5 | 0 | 0 | DNA replication and repair protein |
| 1 | 2 | 0 | 0 | Probable DNA-3-methyladenine glycosylase I |
| 1 | 3 | 0 | 0 | Possible formamidopyrimidine-DNA glycosylase |
| 2 | 2 | 0 | 0 | Probable endonuclease VIII |
| 0 | 3 | 0 | 0 | Probable resolvase |
| 0 | 1 | 0 | 0 | Probable endonuclease III |
| 3 | 1 | 0 | 0 | Probable uracil-DNA glycosylase |
| 2 | 3 | 0 | 0 | Possible hydrolase MutT1 |
| 1 | 1 | 0 | 0 | Probable 8-oxo-dGTPase |
| 1 | 1 | 0 | 0 | Probable 8-oxo-dGTPase |
| 2 | 2 | 0 | 0 | Probable Nudix hydrolase |
| 1 | 1 | 0 | 0 | 6-O-methylguanine-DNA methyltransferase |
| 4 | 9 | 1 | 1 | Probable ada regulatory protein alkA |
| 0 | 0 | 0 | 0 | Possible 3-methyladenine DNA glycosylase |
| 0 | 2 | 0 | 0 | Probable formamidopyrimidine-DNA glycosylase |
| 1 | 2 | 0 | 0 | Probable adenine glycosylase |
| 1 | 1 | 0 | 0 | Probable recombination protein |
| 0 | 2 | 0 | 0 | Probable DNA polymerase IV |
| 1 | 5 | 0 | 1 | Possible DNA-damage-inducible protein F |
| 2 | 3 | 0 | 0 | Possible DNA polymerase |
| 2 | 2 | 0 | 0 | DNA polymerase III |
| 3 | 2 | 0 | 0 | Possible DNA-damage-inducible protein P |
| 1 | 1 | 0 | 0 | Probable restriction system protein |
| 0 | 1 | 2 | 0 | Possible DNA glycosylase |
List of oligonucleotides (5′-3′) used in this study.
The name of the target gene and the position of the oligonucleotide are followed by the oligonucleotide sequence. (f) denotes forward and (r) reverse oligonucleotides used for amplification and sequencing reactions. Oligonucleotides whose names end in a number were used for sequencing reactions.
DNA polymorphism and divergence data.
| Gene | | | | | |
|---|---|---|---|---|---|
| | 7 | 0.00014 | 0.462 | 7.406 | 31.414 |
| | 8 | 0.00094 | 0.934 | 2.780 | 10.491 |
| | 8 | 0.00021 | 0.399 | 0.731 | 6.132 |
| | 7 | 0.00022 | 0.290 | 5.225 | 5.497 |
| | 14 | 0.00066 | 0.991 | 1.781 | 4.115 |
| | 3 | 0.00037 | 0.182 | 2.577 | 2.794 |
| | 8 | 0.00028 | 0.638 | 0.424 | 1.981 |
| | 9 | 0.00022 | 0.818 | 0.346 | 1.389 |
| | 3 | 0.00016 | 0.086 | 1.130 | 1.156 |
| | 6 | 0.00012 | 0.151 | 0.935 | 0.948 |
| | 7 | 0.00010 | 0.216 | 0.840 | 0.839 |
| | 5 | 0.00028 | 0.215 | 0.809 | 0.811 |
| | 3 | 0.00015 | 0.065 | 0.755 | 0.763 |
| | 4 | 0.00003 | 0.065 | 0.712 | 0.712 |
| | 4 | 0.00011 | 0.065 | 0.696 | 0.696 |
| | 5 | 0.00004 | 0.108 | 0.556 | 0.550 |
| | 12 | 0.00052 | 1.412 | 0.464 | 0.495 |
| | 7 | 0.00026 | 0.461 | 0.068 | 0.463 |
| | 4 | 0.00017 | 0.129 | 0.380 | 0.386 |
| | 5 | 0.00006 | 0.130 | 0.383 | 0.381 |
| | 6 | 0.00012 | 0.130 | 0.373 | 0.376 |
| | 4 | 0.00027 | 0.129 | 0.380 | 0.371 |
| | 5 | 0.00004 | 0.087 | 0.364 | 0.364 |
| | 3 | 0.00005 | 0.043 | 0.352 | 0.352 |
| | 4 | 0.00004 | 0.086 | 0.350 | 0.346 |
| | 5 | 0.00059 | 0.407 | 0.242 | 0.247 |
| | 5 | 0.00014 | 0.252 | 0.201 | 0.195 |
| | 3 | 0.00010 | 0.065 | 0.189 | 0.187 |
| | 4 | 0.00008 | 0.086 | 0.131 | 0.130 |
| | 9 | 0.00030 | 0.464 | 0.151 | 0.129 |
| | 4 | 0.00018 | 0.167 | 0.134 | 0.127 |
| | 13 | 0.00031 | 1.005 | 0.397 | 0.115 |
| | 6 | 0.00030 | 0.290 | 0.104 | 0.097 |
| | 3 | 0.00056 | 0.340 | 0.086 | 0.075 |
| | 8 | 0.00040 | 0.435 | 1.324 | 0.059 |
| | 11 | 0.00042 | 0.731 | 0.195 | 0.051 |
| | 7 | 0.00018 | 0.399 | 0.139 | 0.023 |
| | 5 | 0.00035 | 0.418 | 0.071 | 0.015 |
| | 5 | 0.00057 | 0.689 | 0.076 | 0.013 |
| | 5 | 0.00061 | 0.465 | 0.039 | 0.011 |
| | 4 | 0.00042 | 0.334 | 0.061 | 0.010 |
| | 5 | 0.00025 | 0.356 | 0.026 | 0.005 |
| | 3 | 0.00011 | 0.065 | 0.000 | 0.000 |
| | 3 | 0.00005 | 0.043 | ------ | ------ |
| | 2 | 0.00008 | 0.064 | ------ | ------ |
| | 2 | 0.00009 | 0.064 | ------ | ------ |
| | 3 | 0.00010 | 0.065 | ------ | ------ |
| | 6 | 0.00013 | 0.151 | ------ | ------ |
| | 3 | 0.00022 | 0.312 | ------ | ------ |
| | 3 | 0.00044 | 0.204 | ------ | ------ |
| | 6 | 0.00049 | 0.460 | ------ | ------ |
| | 3 | 0.00083 | 0.486 | ------ | ------ |
The genes for which no Pi(a)/Pi(s) and Ka/Ks ratios could be determined are marked by -----.
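As a rough illustration of the quantities behind this table, the sketch below computes nucleotide diversity as the average number of pairwise differences per site, and shows how a ratio such as Pi(a)/Pi(s) becomes undefined when its denominator is zero. The source does not state why the ratios could not be determined for some genes, so the zero-denominator case is an assumption here; the function names and the toy alignment are hypothetical and are not data from the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None  # assumed reason for an undetermined ratio
    return numerator / denominator


# Hypothetical toy alignment; not sequences from the study.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGTACGT", "ACGAACGT"]
pi = nucleotide_diversity(toy_alignment)
print(pi)
print(safe_ratio(0.002, 0.0))
```

Applying `nucleotide_diversity` separately to non-synonymous and synonymous sites would give the two terms of a Pi(a)/Pi(s)-style ratio; classifying sites is outside the scope of this sketch.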
DNA polymorphism data on the control group of strains (3R genes).
| | |
|---|---|
| 0.00014 | 0.286 |
| 0.00050 | 1.143 |
| 0.00075 | 1.143 |
| 0.00027 | 0.286 |
| 0.00070 | 2.286 |
| 0.00066 | 2.190 |
| 0.00000 | 0.000 |
| 0.00012 | 0.286 |
| 0.00000 | 0.000 |
| 0.00050 | 1.048 |
| 0.00015 | 0.286 |
| 0.00029 | 0.857 |
| 0.00032 | 0.857 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00060 | 1.429 |
| 0.00044 | 0.286 |
| 0.00033 | 0.571 |
| 0.00065 | 1.143 |
| 0.00039 | 1.429 |
| 0.00121 | 1.143 |
| 0.00089 | 1.143 |
| 0.00144 | 1.429 |
| 0.00071 | 0.857 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00184 | 0.857 |
| 0.00000 | 0.000 |
| 0.00038 | 0.286 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00240 | 1.143 |
| 0.00112 | 0.857 |
| 0.00195 | 1.143 |
| 0.00039 | 0.286 |
| 0.00042 | 0.286 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00076 | 0.571 |
| 0.00057 | 0.286 |
| 0.00070 | 1.048 |
| 0.00033 | 0.286 |
| 0.00031 | 0.286 |
| 0.00000 | 0.000 |
| 0.00021 | 0.286 |
| 0.00087 | 1.143 |
| 0.00024 | 0.286 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00031 | 0.286 |
| 0.00000 | 0.000 |
The 3R genes were analyzed in M. bovis subsp. bovis AF2122/97 and M. tuberculosis CDC1551 (TIGR website, http://cmr.tigr.org), in M. microti and M. africanum (Sanger Institute, http://www.sanger.ac.uk), and in strains F11, C and Haarlem (Broad Institute, http://www.broad.mit.edu).
DNA polymorphism data on the control group of strains (housekeeping genes).
| | |
|---|---|
| 0.00050 | 0.222 |
| 0.00031 | 0.222 |
| 0.00026 | 0.222 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
| 0.00000 | 0.000 |
The housekeeping genes were analyzed in M. bovis subsp. bovis AF2122/97 and M. tuberculosis CDC1551 (TIGR website, http://cmr.tigr.org), in M. microti and M. africanum (Sanger Institute, http://www.sanger.ac.uk), and in strains F11, C and Haarlem (Broad Institute, http://www.broad.mit.edu).
Correlation of the locations of non-synonymous single nucleotide polymorphisms (nsSNPs) within genes, the amino acids they are predicted to encode, and predicted enzymatic signature motifs and active sites.
Figure 1. Average nucleotide diversity by gene class.
Averages were calculated from the results for the clinical strains, grouped by the class of 3R genes analyzed. Holliday junction resolving genes: 4407 nucleotides, 4 genes. SOS repair: 16893 nucleotides, 10 genes. NER genes: 18108 nucleotides, 5 genes. AP endonucleases: 1635 nucleotides, 2 genes. GO repair: 5850 nucleotides, 8 genes. Recombination-involved genes: 30567 nucleotides, 18 genes. Ligases: 6957 nucleotides, 2 genes. BER genes: 8328 nucleotides, 10 genes. Alkylation damage: 3216 nucleotides, 4 genes. RecBCD: 8307 nucleotides, 3 genes. RecFOR: 2568 nucleotides, 3 genes. Polymerases: 7857 nucleotides, 5 genes. Direct repair: 1989 nucleotides, 2 genes. Uracil-related repair: 1149 nucleotides, 2 genes.
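The caption does not say how per-gene values were combined into a class average. Since each class is reported with its total length in nucleotides, one plausible reading is a length-weighted mean of per-gene diversity. The sketch below assumes that reading; the function name and the example values are hypothetical and are not taken from the study.

```python
def length_weighted_mean(genes):
    """Return the length-weighted mean of per-gene nucleotide diversity.

    `genes` is a list of (length_in_nucleotides, diversity) pairs.
    """
    total_length = sum(length for length, _ in genes)
    if total_length == 0:
        raise ValueError("total gene length must be positive")
    # Weight each gene's diversity by its share of the class length.
    return sum(length * pi for length, pi in genes) / total_length


# Hypothetical class of two genes; these values are NOT from the study.
toy_class = [(1000, 0.0002), (3000, 0.0006)]
print(length_weighted_mean(toy_class))
```

An unweighted mean over genes would be the other obvious choice; the two differ whenever gene lengths within a class are uneven.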
Figure 2. (A) Phylogenetic network based on the total set of SNPs. This phylogenetic network was constructed using the median-joining algorithm with a final set of 252 SNPs characterized in 92 clinical strains of the Mycobacterium tuberculosis complex (MTC).
(B) Phylogenetic network based on the nsSNPs. This phylogenetic network was constructed using the median-joining algorithm with a final set of 163 nsSNPs characterized in 92 clinical strains of the MTC. (C) Phylogenetic network based on the sSNPs. This phylogenetic network was constructed using the median-joining algorithm with a final set of 89 sSNPs characterized in 92 clinical strains of the MTC. Deletions were excluded from the analysis. Clinical isolates are classified with a color code, according to their spoligotype-based family. Node sizes indicate the number of strains belonging to the same haplotype.
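The node sizes described above come from grouping strains that share an identical SNP profile. The sketch below shows only that grouping step, under the assumption that each strain is represented by a string of alleles at the SNP positions; it does not implement the median-joining algorithm itself, and the strain names and profiles are hypothetical.

```python
from collections import Counter


def haplotype_counts(snp_profiles):
    """Map each distinct SNP profile (haplotype) to the number of strains carrying it."""
    return Counter(tuple(profile) for profile in snp_profiles.values())


# Hypothetical strains and alleles; not data from the study.
toy_profiles = {
    "strain_1": "ACG",
    "strain_2": "ACG",
    "strain_3": "ATG",
    "strain_4": "GTG",
    "strain_5": "ATG",
}
counts = haplotype_counts(toy_profiles)
for haplotype, n_strains in counts.most_common():
    print("".join(haplotype), n_strains)
```

Each key of `counts` would correspond to one node in the network, and its count to the node size.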
Figure 3. Geographic origin of the haplotypes identified.
This phylogenetic network was constructed using the median-joining algorithm with a final set of 252 SNPs characterized in 92 clinical strains of the Mycobacterium tuberculosis complex (MTC). Deletions were excluded from the analysis. Geographical origin is classified with a color code. Node sizes indicate the number of strains belonging to the same haplotype.
Figure 4. Spoligotype-based unrooted tree of the strains analyzed.
This unrooted neighbor-joining tree was built with the Mega software on the same dataset as in Figure 1. The upper part of the tree describes Principal Genetic Group (PGG) 2 and 3 strains, and the lower part relates to PGG1. The spoligotypes are indicated next to the tree to show the excellent congruence. Clades are named according to SpolDB4 and to the recent SNP cluster group (SCG) nomenclature.
Figure 5. Site frequency spectrum of sSNPs and nsSNPs.
This spectrum summarizes the allele frequencies of the various mutations in the sample.
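A site frequency spectrum of this kind can be tabulated by counting, for each variable site, how many strains carry the variant allele, and then counting how many sites fall into each frequency class. The sketch below assumes variants are called against a single reference sequence and that all sequences are aligned to it; the function name and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def site_frequency_spectrum(alignment, reference):
    """Map k -> number of sites where exactly k sequences differ from the reference."""
    if any(len(seq) != len(reference) for seq in alignment):
        raise ValueError("all sequences must match the reference length")
    spectrum = Counter()
    for pos, ref_base in enumerate(reference):
        k = sum(1 for seq in alignment if seq[pos] != ref_base)
        if k > 0:  # skip sites with no variant allele
            spectrum[k] += 1
    return dict(sorted(spectrum.items()))


# Hypothetical reference and sequences; not data from the study.
toy_reference = "ACGT"
toy_alignment = ["ACGT", "TCGT", "TGGA", "ACGA", "ACGA"]
sfs = site_frequency_spectrum(toy_alignment, toy_reference)
print(sfs)
```

Running the function once on synonymous sites and once on non-synonymous sites would give the two spectra that a figure like this compares.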