P A Black1, M de Vos1, G E Louw1, R G van der Merwe1, A Dippenaar1, E M Streicher1, A M Abdallah2, S L Sampson1, T C Victor1, T Dolby3, J A Simpson3, P D van Helden1, R M Warren4, A Pain2.
Abstract
BACKGROUND: Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis.
Year: 2015 PMID: 26496891 PMCID: PMC4619333 DOI: 10.1186/s12864-015-2067-2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Genotypic characterisation of M. tuberculosis clinical isolates used for the investigation into genomic heterogeneity
| Isolate name | rpoB<sup>a</sup> | Spoligotype classification |
|---|---|---|
| R160 | Ser531Leu | LCC |
| R376 | Ser531Leu | Haarlem |
| R458 | Ser531Leu | Unknown/unique |
| R486 | Leu533Pro | Beijing |
| R631 | His526Tyr | Unknown/unique |
| R637 | Ser531Leu | Beijing |
| R641 | Leu533Pro | Beijing |
| R721 | Ser531Leu | Beijing |
| R912 | His526Tyr | EAI |
| R965 | Leu533Pro | Beijing |
| R966 | His526Tyr | Beijing |
| R1035 | Ser531Leu | LAM |
| R1415 | His526Tyr | Beijing |
LCC low copy clade, EAI East African Indian, LAM Latin American Mediterranean
<sup>a</sup>Amino acid change according to the Escherichia coli rpoB gene sequence
Validation of variants with a read frequency ranging between 20 and 100 % using targeted PCR and Sanger sequencing
| Gene<sup>a</sup> | Sanger chromatogram result | Read frequency (%) | Sanger result |
|---|---|---|---|
| | Single peak | 127/127 (100.0) | True |
| | Single peak | 152/154 (98.7) | True |
| | Single peak | 193/200 (96.5) | True |
| | Single peak | 129/134 (96.3) | True |
| | Single peak | 98/126 (77.8) | True |
| | Single peak | 102/142 (71.8) | True |
| | Double peaks | 157/222 (70.7) | True |
| | Double peaks | 117/159 (68.8) | True |
| | Double peaks | 119/181 (65.7) | True |
| | Double peaks | 96/155 (61.9) | True |
| | Double peaks | 84/145 (57.9) | True |
| | Double peaks | 55/95 (57.9) | True |
| | Double peaks | 74/129 (57.4) | True |
| | Double peaks | 33/61 (54.1) | True |
| Intergenic (1093238) | Double peaks | 89/168 (53.0) | True |
| | Double peaks | 57/113 (50.4) | True |
| | Double peaks | 59/123 (48.0) | True |
| | Double peaks | 48/104 (46.2) | True |
| | Double peaks | 54/130 (41.5) | True |
| | Double peaks | 52/128 (40.6) | True |
| | Double peaks | 18/45 (40.0) | True |
| | Double peaks | 34/89 (38.2) | True |
| | Double peaks | 36/95 (37.9) | True |
| | Double peaks | 72/198 (36.4) | True |
| | Double peaks | 42/119 (35.3) | True |
| | Double peaks | 50/151 (33.1) | True |
| | Double peaks | 47/147 (32.7) | True |
| | Double peaks | 38/119 (31.9) | True |
| | Double peaks | 30/106 (31.9) | True |
| | Double peaks | 44/138 (31.9) | True |
| | Single peak | 47/154 (30.5) | False |
| | Double peaks | 42/138 (30.4) | True |
| | Single peak | 37/126 (29.4) | False |
| | Double peaks | 61/202 (29.2) | True |
| | Single peak | 85/296 (28.7) | False |
| | Single peak | 34/120 (28.3) | False |
| | Single peak | 44/159 (27.7) | False |
| | Double peaks | 47/170 (27.6) | True |
| | Double peaks | 56/209 (26.8) | True |
| | Single peak | 54/205 (26.3) | False |
| | Double peaks | 41/158 (25.9) | True |
| | Double peaks | 66/263 (25.1) | True |
| | Double peaks | 52/239 (21.8) | True |
| | Single peak | 34/161 (21.1) | False |
| | Double peaks | 34/167 (20.4) | True |
| | Double peaks | 47/232 (20.3) | True |

<sup>a</sup>All variant positions and WGS results are listed in the supplementary data (Additional file 2: Table S2)
Fig. 1 Validation of variants identified by Illumina sequencing. Analysis of Sanger sequencing alignments and corresponding chromatograms was used to validate the presence of homo- and heterogeneous variants identified by Illumina sequencing. Variants present in either the sequencing file or in the chromatogram (Additional file 2: Table S2) were scored as true variants while sequences which remained wild-type were scored as false
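The validation results can also be summarised programmatically. The following Python sketch uses the reported read frequencies and Sanger results for the rows at or below 31.9 %, transcribed from the validation table above, and reports the highest read frequency at which a variant failed Sanger validation. The function and variable names are illustrative assumptions and are not part of the study's pipeline; the sketch does not attempt to reproduce how the authors chose their cut-off.

```python
# Reported read frequency (%) and Sanger validation result for the
# low-frequency rows of the validation table, in table order.
LOW_FREQUENCY_ROWS = [
    (31.9, True), (31.9, True), (31.9, True), (30.5, False),
    (30.4, True), (29.4, False), (29.2, True), (28.7, False),
    (28.3, False), (27.7, False), (27.6, True), (26.8, True),
    (26.3, False), (25.9, True), (25.1, True), (21.8, True),
    (21.1, False), (20.4, True), (20.3, True),
]


def highest_unvalidated_frequency(rows):
    """Return the highest read frequency scored False, or None if none failed."""
    failed = [freq for freq, validated in rows if not validated]
    return max(failed) if failed else None


def validated_fraction(rows, lower, upper):
    """Fraction of variants with lower <= frequency < upper scored True."""
    in_range = [validated for freq, validated in rows if lower <= freq < upper]
    if not in_range:
        return None  # no variants fall in this frequency range
    return sum(in_range) / len(in_range)


if __name__ == "__main__":
    print(highest_unvalidated_frequency(LOW_FREQUENCY_ROWS))
    print(validated_fraction(LOW_FREQUENCY_ROWS, 20.0, 31.0))
```

On the rows transcribed here, no variant with a reported read frequency above 30.5 % failed validation, whereas only 9 of the 16 variants reported between 20 and 31 % were confirmed.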
Variants identified in corresponding single colonies derived from different clinical isolates
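| Isolate | Single colony 1: Total<sup>a</sup> | Single colony 1: Fixed<sup>b</sup> | Single colony 1: Hetero<sup>c</sup> | Single colony 2: Total<sup>a</sup> | Single colony 2: Fixed<sup>b</sup> | Single colony 2: Hetero<sup>c</sup> | Single colony 3: Total<sup>a</sup> | Single colony 3: Fixed<sup>b</sup> | Single colony 3: Hetero<sup>c</sup> | Total variation |
|---|---|---|---|---|---|---|---|---|---|---|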
| R160 | 5 | 2 | 3 | 4 | 3 | 1 | - | - | - | 9 |
| R376 | 3 | 1 | 2 | 8 | 2 | 6 | 6 | 1 | 5 | 17 |
| R458 | 3 | 3 | 0 | 5 | 2 | 3 | 3 | 0 | 3 | 11 |
| R486 | 0 | 0 | 0 | 2 | 0 | 2 | - | - | - | 2 |
| R631 | 5 | 0 | 5 | 6 | 0 | 6 | 8 | 0 | 8 | 19 |
| R637 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| R641 | 9 | 9 | 0 | 3 | 0 | 3 | 6 | 0 | 6 | 18 |
| R721 | 0 | 0 | 0 | 4 | 0 | 4 | 0 | 0 | 0 | 4 |
| R912 | 4 | 4 | 0 | 6 | 3 | 3 | 7 | 7 | 0 | 17 |
| R965 | 4 | 4 | 0 | 1 | 1 | 0 | - | - | - | 5 |
| R966 | 5 | 0 | 5 | 1 | 0 | 1 | 6 | 0 | 6 | 12 |
| R1035 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| R1415 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
<sup>a</sup>Total number of variants unique between the corresponding single colonies
<sup>b</sup>Fixed variants are defined as having a read frequency of ≥ 70 %
<sup>c</sup>Heterogeneous variants are defined as having a read frequency < 70 % and ≥ 30 %
\- A third single colony was not available for the comparison
Fig. 2 Genetic diversity between parental populations and their corresponding single colonies. (a) The R721 parental population had 8 unique variants relative to all three colonies. All three colonies shared 6 variants that were unique to the colonies relative to the parental population, and each colony also harboured unique variants relative to the other colonies. (b) Similarly, the R912 parental population had 8 unique variants relative to all three colonies. The three colonies shared only 2 variants, and each colony also harboured unique variants relative to the other colonies. (c) For R965, the parental population had 1 unique variant relative to both colonies, while the colonies shared 4 variants that were unique relative to the parental population. (d) Similarly, the two colonies for R486 shared 4 variants that were unique relative to the parental population, and the R486 parental population had 1 unique variant relative to the colonies
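The comparison summarised in Fig. 2 amounts to set operations on variant calls. The sketch below is a minimal illustration in Python, assuming each population is represented as a set of variant positions; the function name and the example positions are hypothetical and are not taken from the study.

```python
def compare_population(parental, colonies):
    """Partition variants between a parental population and its single colonies.

    parental: iterable of variant positions called in the parental population.
    colonies: non-empty list of iterables, one per single colony.
    """
    parental = set(parental)
    colonies = [set(colony) for colony in colonies]
    in_any_colony = set().union(*colonies)
    in_all_colonies = set.intersection(*colonies)

    colony_only = []
    for index, colony in enumerate(colonies):
        # Variants seen in every other colony or in the parental population
        # are not unique to this colony.
        others = set().union(*(c for i, c in enumerate(colonies) if i != index))
        colony_only.append(colony - others - parental)

    return {
        # Present in the parental population but in none of the colonies.
        "parental_only": parental - in_any_colony,
        # Present in every colony but absent from the parental population.
        "shared_by_all_colonies": in_all_colonies - parental,
        # Present in exactly one colony and absent from the parental population.
        "colony_only": colony_only,
    }


if __name__ == "__main__":
    # Hypothetical positions, for illustration only.
    result = compare_population(
        parental={101, 202, 303},
        colonies=[{202, 404, 505}, {202, 404, 606}],
    )
    print(result)
```

Counting the elements of each returned set gives the kind of numbers reported per panel, for example the variants unique to the parental population versus those shared by all colonies.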
M. tuberculosis clinical isolates demonstrating a number of unique variants between rifampicin mono-resistant and MDR isolates during in vivo evolution of isoniazid resistance
| Patient | Isolate name | Phenotypic resistance | Collection date | rpoB<sup>c</sup> | katG | inhA promoter | Spoligotype classification | Fixed variants<sup>a</sup> | Heterogeneous variants<sup>b</sup> | Total variation |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R721 | Rifampicin mono | 22/10/2003 | Ser531Leu | - | - | Beijing | 1 | 8 | 9 |
| 1 | R807 | MDR | 19/05/2004 | Ser531Leu | Gly309Val | - | Beijing | 2 | 0 | 2 |
| 2 | R912 | Rifampicin mono | 15/09/2004 | His526Tyr | - | - | East African Indian (EAI) | 7 | 0 | 7 |
| 2 | R1210 | MDR | 12/08/2005 | His526Tyr | - | −15 | EAI | 1 | 5 | 6 |
<sup>a</sup>Fixed variants are defined as having a read coverage of ≥ 70 %
<sup>b</sup>Heterogeneous variants are defined as having a read coverage < 70 % and ≥ 30 %
<sup>c</sup>Amino acid change according to the Escherichia coli rpoB gene sequence
Isolate-specific variants identified in rifampicin mono-resistant and MDR M. tuberculosis isolates of patients 1 and 2
| Patient | Isolate | Locus | Amino acid change | Coverage of variant (%) | Gene description | Functional category<sup>c</sup> |
|---|---|---|---|---|---|---|
| Patient 1 | R721<sup>a</sup> | Rv0435c | I397L | 50 | Putative conserved ATPase | Cell wall and cell processes |
| | | Rv0435c | D395Y | 50 | Putative conserved ATPase | |
| | | Rv0668 | V1039A | 30 | DNA-directed RNA polymerase RpoC (RNA polymerase beta’ subunit) | Information pathways |
| | | Rv1850 | Q11K | 48 | Urease alpha subunit UreC (urea amidohydrolase) | Intermediary metabolism and respiration |
| | | Rv1850 | Q11R | 49 | Urease alpha subunit UreC (urea amidohydrolase) | |
| | | Rv3218 | Y174H | 66 | Conserved protein | Conserved hypotheticals |
| | | Rv2004c | Ins AAG | 43 | Conserved protein | Conserved hypotheticals |
| | | Rv3563 | Ins AC | 40 | Probable acyl-CoA dehydrogenase FadE32 | Lipid metabolism |
| | | Rv3696c | Ins AC | 73 | Probable glycerol kinase GlpK (ATP:glycerol 3-phosphotransferase) | Intermediary metabolism and respiration |
| | R807<sup>b</sup> | Rv1908c | G309V | 100 | Catalase-peroxidase-peroxynitritase T KatG | Virulence, detoxification, adaptation |
| | | Rv3696c | T91I | 80 | Probable glycerol kinase GlpK (glycerokinase) (GK) | Intermediary metabolism and respiration |
| Patient 2 | R912<sup>a</sup> | Rv1128c | G430S | 98 | Conserved hypothetical protein | Insertion sequences and phages |
| | | Rv2236c | L269S | 97 | Probable cobalamin biosynthesis transmembrane protein CobD | Intermediary metabolism and respiration |
| | | Rv2664 | H22Q | 99 | Hypothetical protein | Conserved hypotheticals |
| | | Rv2772c | E149* | 97 | Probable conserved transmembrane protein | Cell wall and cell processes |
| | | Rv2984 | P631A | 96 | Polyphosphate kinase PPK (polyphosphoric acid kinase) | Intermediary metabolism and respiration |
| | | Rv3391 | syn (248) | 99 | Possible multi-functional enzyme with acyl-CoA-reductase activity AcrA1 | Lipid metabolism |
| | | Rv3537 | syn (378) | 98 | Probable dehydrogenase | Intermediary metabolism and respiration |
| | R1210<sup>b</sup> | | −15 | 45 | | |
| | | Rv1484 | S94A | 70 | NADH-dependent enoyl-[acyl-carrier-protein] reductase InhA (NADH-dependent enoyl-ACP reductase) | Lipid metabolism |
| | | Rv1629 | syn (146) | 45 | Probable DNA polymerase I PolA | Information pathways |
| | | Rv2935 | C582R | 69 | Phenolpthiocerol synthesis type-I polyketide synthase PpsE | Lipid metabolism |
<sup>a</sup>Rifampicin mono-resistant
<sup>b</sup>MDR
<sup>c</sup>Functional category as classified by Tuberculist (http://genolist.pasteur.fr/TubercuList/ and http://tuberculist.epfl.ch/)
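The fixed and heterogeneous definitions given in the table footnotes (≥ 70 %, and < 70 % but ≥ 30 %) can be applied directly to the coverage values listed above. The Python sketch below is a minimal illustration; the function names are assumptions, and the coverage lists are transcribed from the rows for R721 and R807 in the table of isolate-specific variants.

```python
from collections import Counter

FIXED_MIN = 70.0          # footnote definition: fixed variants are >= 70 %
HETEROGENEOUS_MIN = 30.0  # footnote definition: heterogeneous is >= 30 % and < 70 %


def classify_variant(coverage_percent):
    """Label a variant by its read coverage (%) using the footnote thresholds."""
    if not 0.0 <= coverage_percent <= 100.0:
        raise ValueError("coverage must be between 0 and 100 %")
    if coverage_percent >= FIXED_MIN:
        return "fixed"
    if coverage_percent >= HETEROGENEOUS_MIN:
        return "heterogeneous"
    return "below cut-off"


def summarise(coverages):
    """Count fixed and heterogeneous variants; the total excludes sub-cut-off calls."""
    counts = Counter(classify_variant(c) for c in coverages)
    fixed = counts["fixed"]
    heterogeneous = counts["heterogeneous"]
    return {"fixed": fixed, "heterogeneous": heterogeneous, "total": fixed + heterogeneous}


# Coverage of variant (%) as listed for each isolate in the table above.
R721_COVERAGE = [50, 50, 30, 48, 49, 66, 43, 40, 73]
R807_COVERAGE = [100, 80]

if __name__ == "__main__":
    print(summarise(R721_COVERAGE))
    print(summarise(R807_COVERAGE))
```

For these two isolates the tallies agree with the fixed, heterogeneous and total variation counts reported for patient 1 (1, 8 and 9 for R721; 2, 0 and 2 for R807).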
Fig. 3 Heterogeneous positions identified across the whole population genomes relative to the M. tuberculosis H37Rv reference genome. (a) The rifampicin mono-resistant isolate (R721) for patient 1 shows numerous heterogeneous variants relative to M. tuberculosis H37Rv, while the follow-up MDR isolate (R807) has none. (b) For patient 2, the rifampicin mono-resistant isolate (R912) showed no heterogeneous variants relative to M. tuberculosis H37Rv, while the follow-up MDR isolate (R1210) had numerous heterogeneous variants. R912 shared 2 variants (Rv0667 and Rv0672) with R1210; each was present at 100 % in R912 but was a heterogeneous variant in R1210
Fig. 4 Proposed model for the effect of a selection bottleneck and random mutations on the population structure of M. tuberculosis clinical isolates. (a) A rifampicin mono-resistant clinical M. tuberculosis isolate where each cell comprising the population contains an rpoB mutation. Numerous other genetic mutations are present, thereby creating a diverse population structure. (b) Following the onset of treatment the genetic mutations in the population may change, and a spontaneous isoniazid resistance-causing mutation (for example in the katG gene or inhA promoter) is selected for and becomes dominant within the population. (c) Selective pressure of treatment results in the emergence of an isoniazid-resistant M. tuberculosis population where each cell contains a katG mutation. Numerous other genetic mutations are lost during the selection bottleneck, resulting in a loss of genetic diversity. (d) Subsequent replication cycles and population growth result in new genetic mutations arising within the population, allowing for new diversification, e.g. R1210. Each cell in this MDR population retains the rpoB and katG resistance-causing mutations. Key: x denotes an isoniazid resistance-causing mutation (katG)