Joshua B Richardson, Benjamin Evans, Patient P Pyana, Nick Van Reet, Mark Sistrom, Philippe Büscher, Serap Aksoy, Adalgisa Caccone.
Abstract
The trypanosome Trypanosoma brucei gambiense (Tbg) is a cause of human African trypanosomiasis (HAT) endemic to many parts of sub-Saharan Africa. The disease is almost invariably fatal if untreated and there is no vaccine, which makes monitoring and managing drug resistance highly relevant. A recent study of HAT cases from the Democratic Republic of the Congo reported a high incidence of relapses in patients treated with melarsoprol. Of the 19 Tbg strains isolated from patients enrolled in this study, four pairs were obtained from the same patient before treatment and after relapse. We used whole genome sequencing to investigate whether these patients were infected with a new strain, or if the original strain had regrown to pathogenic levels. Clustering analysis of 5938 single nucleotide polymorphisms supports the hypothesis of regrowth of the original strain, as we found that strains isolated before and after treatment from the same patient were more similar to each other than to other isolates. We also identified 23 novel genes that could affect melarsoprol sensitivity, representing a promising new set of targets for future functional studies. This work exemplifies the utility of using evolutionary approaches to provide novel insights and tools for disease control.
Keywords: Trypanosoma brucei gambiense; drug resistance; human African trypanosomiasis; melarsoprol; population genomics; whole genome sequencing
Year: 2016 PMID: 26834831 PMCID: PMC4721075 DOI: 10.1111/eva.12338
Source DB: PubMed Journal: Evol Appl ISSN: 1752-4571 Impact factor: 5.183
Details of Tbg isolates from Mbuji‐Mayi, Democratic Republic of the Congo, used in this study
| Intl. code | Patient number | Treatment outcome | Time of relapse (month) | Sampling point | Sample date | Sample source | Passage time (species, days) | Adapted to mice | Prior relapse | Treatment before inclusion | Treatment at inclusion |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MHOM/CD/IN | 15 | Cure | | BT‐RE | 30/05/05 | CSF | G36G14 | Y | N | M10 | |
| MHOM/CD/ST | 45 | Cure | | BT | 11/06/05 | CSF | S13 | Y | Y | M3, MN | E14 |
| MHOM/CD/IN | 57 | Relapse | 12 | AT | 18/07/06 | CSF | G49 | Y | N | M10 | |
| MHOM/CD/IN | 85 | Cure | | BT | 23/06/05 | CSF | G7 | Y | Y | M3 | E14 |
| MHOM/CD/IN | 141 | Cure | | BT | 27/07/05 | CSF | G9 | Y | Y | M10 | MN |
| MHOM/CD/IN | 146 | Relapse | 3 | BT | 28/07/05 | CSF | Mas38 | Y | N | M10 | |
| MHOM/CD/IN | 146 | Relapse | 3 | AT | 10/11/05 | Blood | Mas 4 | Y | N | M10 | |
| MHOM/CD/IN | 148 | Relapse | 3 | BT | 28/07/05 | CSF | G5 | Y | N | M10 | |
| MHOM/CD/IN RB/2006/14 | 148 | Relapse | 3 | AT | 5/01/06 | CSF | Mas 16 | Y | N | M10 | |
| MHOM/CD/IN RB/2006/06A | 163 | Relapse | 3 | AT | 22/11/05 | Blood | Mas 5 | Y | Y | M3 | MN |
| MHOM/CD/IN RB/2006/06A | 163 | Relapse | 3 | AT‐RE | 22/11/05 | Blood | Mas 5 | Y | Y | M3 | MN |
| MHOM/CD/IN RB/2006/21B | 340 | Relapse | 3 | AT | 1/03/06 | CSF | G11G7 | Y | N | M10 | |
| MHOM/CD/IN RB/2006/22A | 346 | Relapse | 3 | BT | 21/11/05 | Blood | G30 | N | N | M10 | |
| MHOM/CD/IN RB/2006/24B | 346 | Relapse | 3 | AT | 3/03/06 | CSF | G25G8 | Y | N | M10 | |
| MHOM/CD/IN RB/2006/24B | 346 | Relapse | 3 | AT‐RE | 3/03/06 | CSF | G25G8 | Y | N | M10 | |
| MHOM/CD/IN | 348 | Cure | | BT | 24/11/05 | CSF | G13 | Y | N | M10 | |
| MHOM/CD/IN RB/2006/16 | 349 | Relapse | 3 | BT | 25/11/05 | CSF | G8 | Y | N | M10 | |
| MHOM/CD/IN | 349 | Relapse | 3 | AT | 8/03/06 | Blood | G7G5 | Y | N | M10 | |
| MHOM/CD/IN RB/2008/34 | 378 | Cure | | BT | 14/01/06 | CSF | G21G9 | Y | N | M10 | |
International code (Intl. code): code of the stabilate. Patient number: number of the patient in the Mumba Ngoyi et al. (2010) study. Treatment outcome: treatment outcome of the patient in the Mumba Ngoyi et al. (2010) study. Time of relapse (month): period after treatment when relapse was observed. Sampling point: BT, before treatment; AT, after treatment at moment of relapse; BT‐RE/AT‐RE, relapse in a mouse carrying the BT or AT sample and treated with melarsoprol. Sample date: date when the blood or CSF specimen was taken and frozen in liquid nitrogen. Specimens stayed in liquid nitrogen until the first attempt to isolate the strain in Grammomys surdaster, Mastomys natalensis, or severe combined immunodeficient (SCID) Mus musculus. Sample source: blood or cerebrospinal fluid (CSF). Passage time (species, days): first letter(s) of the host name (G. surdaster, M. natalensis, or SCID M. musculus) followed by days of infection before subpassage or before cryostabilization. Adapted to mice: whether or not the strain was adapted to laboratory mice after isolation in G. surdaster or M. natalensis. Prior relapse: whether the patient was included in the Mumba Ngoyi et al. (2010) study as a patient who had already experienced a relapse. Treatment before inclusion: M3, classic 3‐course treatment with melarsoprol; M10, abridged 10‐day treatment with melarsoprol; MN, melarsoprol–nifurtimox combination therapy. Treatment at inclusion: as above, plus P8, 8 days of pentamidine; E14, 14 days of eflornithine.
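The passage-time codes described in the legend can be split into individual passages mechanically. The following is a minimal Python sketch, not part of the original study; the function name `parse_passage` is hypothetical, and the mapping of the prefixes `G`, `Mas`, and `S` to the three hosts is an assumption based on the legend and the codes that appear in the table.

```python
import re

# Assumed host prefixes, inferred from the legend and the table entries.
HOSTS = {
    "G": "Grammomys surdaster",
    "Mas": "Mastomys natalensis",
    "S": "SCID Mus musculus",
}

# One passage step: a host prefix, optional spaces, then days of infection.
_PASSAGE = re.compile(r"(Mas|G|S)\s*(\d+)")


def parse_passage(code):
    """Split a passage code such as 'G36G14' into (host, days) steps."""
    compact = code.strip()
    steps = [(HOSTS[host], int(days)) for host, days in _PASSAGE.findall(compact)]
    # Reject codes with leftover characters instead of silently dropping them.
    if _PASSAGE.sub("", compact).strip():
        raise ValueError(f"unrecognized passage code: {code!r}")
    return steps


print(parse_passage("G36G14"))
print(parse_passage("Mas 16"))
```

Under these assumptions, `G36G14` reads as two successive passages in G. surdaster of 36 and 14 days, and `Mas 16` as a single 16-day passage in M. natalensis.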
Figure 1. Overlap between single nucleotide polymorphisms (SNPs) identified by Genome Analysis Toolkit (GATK) Haplotype Caller and Samtools mpileup. Number of genomic positions containing SNPs according to Samtools mpileup (pink), GATK (purple), or both (red).
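The three counts shown in such an overlap diagram amount to set operations on the positions reported by each caller. The sketch below is illustrative only: the function name `overlap_counts` is hypothetical, the positions are toy values rather than data from the study, and reading the call sets out of the callers' output files is not covered.

```python
def overlap_counts(calls_a, calls_b):
    """Count positions unique to each caller and positions shared by both."""
    a, b = set(calls_a), set(calls_b)
    return {"only_a": len(a - b), "only_b": len(b - a), "both": len(a & b)}


# Toy (chromosome, position) pairs, not data from the study.
mpileup_calls = [("chr1", 100), ("chr1", 250), ("chr2", 40)]
gatk_calls = [("chr1", 250), ("chr2", 40), ("chr3", 7)]

counts = overlap_counts(mpileup_calls, gatk_calls)
print(counts)
```

Keying each SNP on the (chromosome, position) pair keeps identical coordinates on different chromosomes from being counted as shared.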
Figure 2. Cluster analysis of Tbg isolates. (A) Neighbor‐joining tree of Tbg isolates based on 5938 single nucleotide polymorphisms (SNPs), with 10 000 bootstrap replicates, obtained using the ape package in R. Bootstrap percentages are shown on the nodes. Strains in red come from patients for whom both a BT and an AT strain are part of the data set. 346AT‐RE was isolated from a mouse carrying 346AT that relapsed after being treated with melarsoprol. 1829 Aljo is a Tbg strain isolated from the DRC in the 1970s, included for comparison. (B) Strains grouped by k‐means clustering of SNP principal components (Jombart et al. 2010). For k values from 2 through 8, group membership for each strain is shown by color.
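The question behind the tree is whether a patient's AT strain sits closer to that patient's BT strain than to any other isolate. A minimal Python sketch of that comparison follows. It is not the authors' ape workflow: the allele-count distance, the function names, and the toy genotypes are all assumptions made for illustration.

```python
def allele_distance(g1, g2):
    """Mean per-SNP difference in alternate-allele counts, scaled to 0..1.

    Genotypes are assumed to be coded 0, 1 or 2 (copies of the alternate allele).
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must cover the same SNPs")
    return sum(abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))


def nearest_isolate(name, genotypes):
    """Return the other isolate with the smallest distance to `name`."""
    others = (k for k in genotypes if k != name)
    return min(others, key=lambda k: allele_distance(genotypes[name], genotypes[k]))


# Toy genotypes over four SNPs, not data from the study.
toy = {
    "146BT": [0, 1, 2, 0],
    "146AT": [0, 1, 2, 1],
    "348BT": [2, 1, 0, 0],
}
print(nearest_isolate("146AT", toy))
```

In this toy set the AT strain's nearest neighbor is the BT strain from the same patient, which is the pattern expected under regrowth of the original strain rather than infection with a new one.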
Figure 3. Discriminant analysis of principal components (DAPC) scatter plot of Tbg isolates. First two discriminant functions from the DAPC of Tbg isolates, obtained using the dapc function from the adegenet package in R. Group membership is shown by color, while strains isolated from the same patient share the same shape. Circles represent the 14 other Tbg isolates included in this study and 1829 Aljo. The shape and color of the four patient‐pair strains are detailed in the inset.
Previously identified candidate melarsoprol genes
| Gene number | Protein name/description | nsSNPs | Amino acid change | Predicted effect (PolyPhen2) | Predicted effect (SNAP) | 146 BT | 146 AT | 148 BT | 148 AT | 346 BT | 346 AT | 349 BT | 349 AT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tbg972.2.2440 | Trypanothione synthetase | No nsSNPs | | | | | | | | | | | |
| Tbg972.4.2470 | Multidrug resistance‐associated protein (MRPA) | C → G | T469R | Benign | Neutral | C/G | C/G | C/G | C/G | C/G | C/G | C/G | C/G |
| – | – | A → G | N752D | Benign | Neutral | A/G | A/G | A/G | A/G | A/G | A/G | A/G | A/G |
| Tbg972.5.40 | Adenosine transporter 1 (AT1) | No nsSNPs | | | | | | | | | | | |
| Tbg972.6.1170 | Aquaporin 1 | No nsSNPs | | | | | | | | | | | |
| Tbg972.8.5950 | Protein kinase | No nsSNPs | | | | | | | | | | | |
| Tbg972.9.80 | Hypothetical protein | No nsSNPs | | | | | | | | | | | |
| Tbg972.10.1740 | Hypothetical protein | G → T | C379F | Benign | Neutral | G/T | G/T | G/T | G/T | G/T | G/T | G/T | G/T |
| – | – | G → A | D507F | Possibly damaging | Neutral | G/G | G/G | G/G | G/A | G/G | G/G | G/G | G/G |
| Tbg972.10.2310 | Putative serine/threonine protein kinase | No nsSNPs | | | | | | | | | | | |
| Tbg972.10.14510 | Mapk11 homolog | No nsSNPs | | | | | | | | | | | |
| Tbg972.10.16540 | Aquaglyceroporin 2 (TbAQP2 homolog) | Chimeric allele with Tbg972.10.16560 | | | | | | | | | | | |
| Tbg972.10.16560 | Aquaglyceroporin 2 (TbAQP3 homolog) | Chimeric allele with Tbg972.10.16540 | | | | | | | | | | | |
| Tbg972.10.18650 | Hypothetical protein | No nsSNPs | | | | | | | | | | | |
| Tbg972.10.19640 | Mapk11 homolog | No nsSNPs | | | | | | | | | | | |
| Tbg972.11.450 | Upstream binding protein 1 (UBP1) | T → G | S204A | Unknown | Neutral | T/G | T/G | T/G | T/G | T/G | T/G | T/G | T/G |
Genetic variation found in the 14 genes previously known to affect melarsoprol sensitivity. Any nsSNPs observed are listed, along with the protein change caused by the nsSNP and its predicted effect on protein function. PolyPhen2: result based on the HumDiv data set, SNAP: results from the SNAP web interface. Genotypes of patient‐pair strains are listed in the final eight columns. Italicized genotypes indicate a heterozygous difference between strains isolated from the same patient. – Indicates same entry as line above.
Figure 4. Significant single nucleotide polymorphisms (SNPs) identified by DAPC. (A) Discriminant function values separating BT (red) and AT (blue) strains, obtained using the discriminant analysis of principal components function in the adegenet package in R. The Y axis represents discriminant function values, and the X axis represents density of individual strains. (B) Loading values of individual SNPs contributing to the discriminant function shown in A. SNPs are shown on the Y axis, and loading values on the X axis. Values above the line are significantly higher than expected by chance.
Novel candidate genes
| Gene number | Protein name/description | nsSNP | Protein change | Predicted effect (PolyPhen2) | Predicted effect (SNAP) | Bloodstream expression | 146 BT | 146 AT | 148 BT | 148 AT | 346 BT | 346 AT | 349 BT | 349 AT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tbg972.1.2860 | Hypothetical protein | A → G | N270D | Unknown | Neutral | Yes | | | | | | A/A | A/G | A/G |
| Tbg972.3.4800 | Hypothetical protein | T → C | L33P | Probably damaging | Non‐neutral | Unknown | | | | | | | T/T | T/T |
| – | – | C → T | L34F | Benign | Non‐neutral | – | | | | | | | C/C | C/C |
| – | – | C → G | Q35V | Benign | Non‐neutral | – | | | | | | | C/C | C/C |
| – | – | A → T | Q35V | – | – | – | | | A/A | A/A | | | A/A | A/A |
| – | – | C → G | L40V | Benign | Neutral | – | C/C | C/C | C/C | C/C | | | C/C | C/C |
| – | – | G → A | C43Y | Benign | Non‐neutral | – | | | G/G | G/G | | | | G/G |
| Tbg972.4.5180 | Hypothetical protein | A → G | V199A | Benign | Neutral | Yes | A/G | A/G | | | | A/G | | |
| Tbg972.5.370 | Hypothetical invariant surface glycoprotein | G → A | G141N | Benign | Neutral | No | | | A/A | A/A | A/A | A/A | A/A | A/A |
| – | – | G → A | G141N | – | – | – | | | | | A/A | A/A | A/A | A/A |
| Tbg972.5.6250 | Hypothetical protein | G → A | V16I | Failed | Neutral | Unknown | G/A | G/A | | | | | | |
| Tbg972.6.700 | Metacaspase 3 (MCA3) | C → T | V124I | Benign | Neutral | Yes | C/T | C/T | | | | | | |
| Tbg972.7.7504 | Hypothetical protein | G → A | L116F | Failed | Neutral | Unknown | | | | | | | G/G | G/G |
| Tbg972.7.8020 | Putative expression site associated gene (ESAG) | G → C | E49Q | Benign | Neutral | Yes | C/C | C/C | | | C/C | C/C | | |
| Tbg972.8.7620 | Hypothetical protein | T → C | F491S | Probably damaging | Non‐neutral | Yes | T/T | T/T | T/T | T/T | | | T/T | T/T |
| Tbg972.8.7910 | Amino acid transporter 1 | C → T | T219I | Benign | Neutral | No | C/C | C/C | | | | | | |
| Tbg972.9.9130 | Putative leucine‐rich repeat protein (LRRP) | C → T | P220L | Benign | Neutral | No | C/C | C/C | C/C | C/C | | | C/C | C/C |
| – | – | C → T | A358V | Benign | Neutral | – | C/C | C/C | | | | | C/C | C/C |
| – | – | T → G | F364C | Benign | Neutral | – | G/G | G/G | | | | | G/G | G/G |
| – | – | G → C | G1335R | Probably damaging | Non‐neutral | – | | | | | | | G/C | G/C |
| Tbg972.10.1150 | Hypothetical protein | T → C | S22G | Benign | Neutral | Yes | | | | | | | | |
| Tbg972.10.1360 | Putative serine carboxypeptidase III precursor | C → A | F409L | Possibly damaging | Non‐neutral | No | C/C | C/C | C/C | C/C | C/C | C/C | | |
| Tbg972.10.18680 | Hypothetical protein | T → G | F92C | Failed | Neutral | Yes | T/T | T/T | | | | | | |
| Tbg972.10.19330 | Hypothetical protein | C → G | A644G | Benign | Non‐neutral | Yes | | | | | | | | |
| Tbg972.10.19940 | Putative expression site associated gene (ESAG) | T → A | K395I | Benign | Neutral | Yes | | | | | A/A | A/A | A/A | A/A |
| Tbg972.11.1610 | Putative leucine‐rich repeat protein (LRRP) | T → G | C435G | Benign | Neutral | Yes | | | G/G | G/G | G/G | G/G | G/G | G/G |
| Tbg972.11.4390 | Hypothetical protein | C → T | R17W | Probably damaging | Non‐neutral | Yes | C/C | C/C | T/T | T/T | | | | |
| Tbg972.11.4390 | Hypothetical protein | A → G | Y38C | Benign | Non‐neutral | – | A/A | A/A | A/A | A/A | | | A/A | A/A |
| Tbg972.11.10190 | Hypothetical protein | G → A | G766S | Unknown | Neutral | Yes | G/G | | G/A | G/A | | | | |
| Tbg972.11.11950 | Hypothetical protein | C → T | P2365S | Unknown | Neutral | Yes | C/T | C/T | C/T | | | | | |
| Tbg972.11.12410 | Hypothetical protein | G → A | A321T | Benign | Neutral | Yes | G/G | G/G | | | | | | |
| Tbg972.11.16370 | Expression site‐associated gene 2 (ESAG2) | G → A | D351S | Possibly damaging | Neutral | Yes | G/A | G/A | | | | | | |
| – | – | A → G | D351S | – | – | – | A/G | A/G | | | | | | |
| – | – | G → C | G352A | Benign | Neutral | – | G/C | G/C | | | | | | |
| Tbg972.11.19250 | Expression site‐associated gene 4 (ESAG4) | T → A | S390T | Benign | Neutral | Yes | T/A | T/A | | | | | | |
Genetic variation found in 23 genes containing nsSNPs identified by our screen. Gene number: gene number in the DAL972 genome. Protein name/description: name of protein or brief description, if known. Observed nsSNPs are listed, along with the protein change caused by the nsSNP and its predicted effect on protein function. PolyPhen2: result based on the HumDiv data set. SNAP: results from the SNAP web interface. Bloodstream expression: whether the gene is expressed in the Tbg bloodstream form, based on data from Veitch et al. (2010). Genotypes of patient‐pair strains are listed in the final eight columns. Italicized genotypes indicate a heterozygous difference between strains isolated from the same patient. Bold italicized genotypes indicate a fixed, homozygous difference between strains isolated from the same patient. – Indicates same entry as line above.
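The two kinds of within‐patient difference named in the legend can be told apart directly from a BT/AT genotype pair. The sketch below is a hedged illustration rather than the authors' procedure: the function name `classify_pair` is hypothetical, and treating any difference that involves a heterozygous genotype as a "heterozygous difference" is an assumed reading of the legend.

```python
def classify_pair(bt, at):
    """Compare before-treatment and after-treatment genotypes such as 'G/A'."""
    bt_alleles = frozenset(bt.split("/"))
    at_alleles = frozenset(at.split("/"))
    if bt_alleles == at_alleles:
        return "same"
    # Both strains homozygous, but for different alleles.
    if len(bt_alleles) == 1 and len(at_alleles) == 1:
        return "fixed homozygous difference"
    # At least one strain is heterozygous at this site.
    return "heterozygous difference"


print(classify_pair("G/G", "G/A"))
print(classify_pair("C/C", "T/T"))
```

For example, a G/G genotype before treatment paired with G/A after relapse is classed as a heterozygous difference, whereas C/C paired with T/T would be a fixed homozygous difference.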
Figure 5. Single nucleotide polymorphisms (SNPs) in the Tbg972.3.4800 locus. Diagram showing the Tbg972.3.4800 locus, including the Tbg972.3.4800 open reading frame (ORF) and another putative ORF encoding a retrotransposon hot spot (RHS)‐like protein. Gray diamonds represent SNPs identified in our screen, and the blue diamond represents a SNP that eliminates a stop codon found in the DAL972 reference strain, allowing read‐through of the RHS‐like ORF.