Anita Lerch, Cristian Koepfli, Natalie E Hofmann, Camilla Messerli, Stephen Wilcox, Johanna H Kattenberg, Inoni Betuela, Liam O'Connor, Ivo Mueller, Ingrid Felger.
Abstract
BACKGROUND: Amplicon deep sequencing permits sensitive detection of minority clones and improves discriminatory power for genotyping multi-clone Plasmodium falciparum infections. New amplicon sequencing and data analysis protocols are needed for genotyping in epidemiological studies and drug efficacy trials of P. falciparum.
Keywords: Amplicon sequencing; HaplotypR software; Haplotype clustering; Malaria; Multi-clone infections; Plasmodium falciparum; SNP; cpmp; csp; msp2
Year: 2017 PMID: 29132317 PMCID: PMC5682641 DOI: 10.1186/s12864-017-4260-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Diversity of markers cpmp and csp based on 3411 genomes of the MalariaGEN dataset
| Marker | He^a | No. of SNPs | Fragment size^b (bp) | No. of haplotypes |
|---|---|---|---|---|
| cpmp | 0.930^c | 20^c | 383^c | 82 of 980^c,d |
| csp | 0.857 | 40 | 287 | 77 of 1323^d |
^a Expected heterozygosity
^b Fragment size without primer sequence
^c Trimming of reads in the experiments presented here reduced the variation (characteristics of a shorter cpmp fragment of 310 bp: He = 0.913, SNPs = 14, number of haplotypes = 47)
^d Of the 3411 genomes, only those with non-ambiguous SNP calls in the selected region were used
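The expected heterozygosity (He) reported in the table can be illustrated with a minimal sketch. It assumes the commonly used sample-size-corrected estimator, He = n/(n−1) · (1 − Σ p_i²), where p_i is the frequency of haplotype i among n observations; the record above does not state which estimator was applied, and the function name and toy input are hypothetical.

```python
from collections import Counter

def expected_heterozygosity(haplotypes):
    """Sample-size-corrected He = n/(n-1) * (1 - sum(p_i^2)).

    Assumption: this is the standard estimator, not necessarily the
    exact formula used in the study.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical toy sample, not data from the study.
toy = ["A", "A", "B", "C"]
print(round(expected_heterozygosity(toy), 3))  # 0.833
```

He approaches 1 when many haplotypes occur at similar frequencies and drops to 0 when a single haplotype is fixed.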
Fig. 1 Mismatch rate per nucleotide position derived from all samples sequenced for markers cpmp and csp. Each data point represents the mean mismatch rate observed across all reads of one sample at the respective nucleotide position. Red data points: control samples (P. falciparum culture strains); black data points: field samples; X-axis: nucleotide position in the sequenced fragment; Y-axis: mismatch rate with respect to the reference sequence (control samples: sequences of strains 3D7 and HB3; field samples: 3D7 sequence); dashed grey lines: SNPs with a mismatch rate of >0.5 in >1 sample; red dotted horizontal line: mismatch rate of 0.5; solid black vertical line: position of concatenation of forward and reverse reads
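The per-position mismatch rate plotted in Fig. 1, and the ">0.5 in >1 sample" criterion marked by the dashed lines, could be computed roughly as follows. This is a sketch under simplifying assumptions: reads are already aligned to the reference, have the same length as the reference, and contain no indels. The function names and toy sequences are hypothetical.

```python
def mismatch_rate_per_position(reads, reference):
    """Fraction of reads differing from the reference at each position."""
    if not reads:
        raise ValueError("no reads")
    mismatches = [0] * len(reference)
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("read length must match reference length")
        for i, (base, ref_base) in enumerate(zip(read, reference)):
            if base != ref_base:
                mismatches[i] += 1
    return [m / len(reads) for m in mismatches]

def candidate_snp_positions(rates_by_sample, threshold=0.5, min_samples=2):
    """Positions with mismatch rate > threshold in at least min_samples samples."""
    length = len(next(iter(rates_by_sample.values())))
    return [
        i for i in range(length)
        if sum(rates[i] > threshold for rates in rates_by_sample.values()) >= min_samples
    ]

# Hypothetical toy data, not sequences from the study.
reference = "ACGT"
samples = {
    "s1": ["ACGT", "AGGT", "AGGT"],
    "s2": ["AGGT", "AGGT"],
    "s3": ["ACGT", "ACTT"],
}
rates = {name: mismatch_rate_per_position(r, reference) for name, r in samples.items()}
print(candidate_snp_positions(rates))  # [1]
```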
Detectability of the minority clone in defined ratios of P. falciparum strains HB3 and 3D7
| HB3:3D7 ratio in mixture | 3D7^a | HB3^a | PCR artefacts | Background^b | Coverage | 3D7^a | HB3^a | PCR artefacts | Background^b | Coverage | Minimum coverage HaplotypR^c |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1:1 | 34.6 | 57.4 | 0.48 | 7.53 | 40,768 | 34.7 | 50.5 | 5.79 | 9.01 | 9009 | 6 |
| 1:10 | 75.6 | 16.4 | 0.40 | 7.59 | 13,037 | 76.1 | 10.1 | 5.63 | 8.08 | 3341 | 30 |
| 1:50 | 88.8 | 3.15 | 0.06 | 7.95 | 4953 | 82.7 | 2.88 | 6.23 | 8.16 | 14,711 | 150 |
| 1:100 | 90.9 | 1.53 | 0.36 | 7.26 | 13,311 | 83.5 | 2.25 | 5.41 | 8.88 | 11,975 | 300 |
| 1:500 | 90.8 | 0.48 | 0.27 | 8.44 | 5649 | 84.0 | 0.46 | 4.76 | 10.8 | 3508 | 1500 |
| 1:1000 | 91.5 | 0.23 | 0.03 | 8.26 | 3039 | 85.7 | 0.22 | 5.09 | 9.02 | 1807 | 3000 |
| 1:1500 | 92.5 | 0.11 | 0.48 | 6.94 | 55,887 | 86.3 | 0.08^d | 5.71 | 7.91 | 23,619 | 4500 |
| 1:3000 | 92.5 | 0.094 | 0.38 | 7.00 | 7417 | 85.0 | 0.04^d | 5.87 | 9.10 | 2318 | 9000 |
^a Percent of reads that cluster with 3D7 and HB3 reference sequences
^b Singleton reads and reads that failed to cluster with 3D7 or HB3 haplotypes
^c Theoretical minimum coverage required to detect the minority clone with software HaplotypR at default cut-off values
^d Haplotypes considered as noise by software HaplotypR (default cut-off: ≥3 reads per haplotype and a minority clone detection limit of 1:1000)
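The theoretical minimum coverage column follows from the 3-read cut-off: coverage must be high enough that the expected number of minority-clone reads reaches 3. The sketch below is one reading of that arithmetic, not the software's own code. It assumes the minority fraction is taken as 1/2 for the 1:1 mixture and approximated as 1/n for a 1:n mixture, which reproduces the tabulated values; `min_coverage` is a hypothetical helper.

```python
from fractions import Fraction
import math

def min_coverage(minority_fraction, min_reads=3):
    """Smallest coverage at which expected minority reads reach min_reads.

    Assumption: expected minority reads = coverage * minority_fraction.
    Exact fractions avoid floating-point rounding at the boundary.
    """
    fraction = Fraction(minority_fraction)
    if fraction <= 0:
        raise ValueError("minority fraction must be positive")
    return math.ceil(Fraction(min_reads) / fraction)

print(min_coverage(Fraction(1, 2)))     # 6
print(min_coverage(Fraction(1, 1000)))  # 3000
```

At ratios beyond the 1:1000 detection limit, higher coverage alone does not help under the default settings, because such haplotypes are treated as noise regardless of read count.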
Fig. 2 Simulation of minority clone detectability by bootstrapping for markers cpmp and csp. The cut-off for accepting a haplotype was a minimum coverage of 3 reads per haplotype and a minority clone detection limit of 1:1000. Samples were drawn from reads of defined mixtures of P. falciparum strains 3D7 and HB3. X-axis: dilution ratios of strains 3D7 and HB3; Y-axis: sampling size (number of draws from sequence reads) for each mixture of strains. Sampling was repeated 1000 times to estimate mean minority clone detectability
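The bootstrap described in the caption can be sketched as follows: draw a fixed number of reads from a mixture, check whether the minority haplotype passes both cut-offs, and repeat 1000 times. The sketch assumes sampling with replacement and reads the 1:1000 detection limit as a minimum within-sample frequency; the function name, parameters and toy read pool are hypothetical.

```python
import random

def detectability(reads, minority, sample_size, n_boot=1000,
                  min_reads=3, detection_limit=1000, seed=0):
    """Share of bootstrap samples in which `minority` passes both cut-offs."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(n_boot):
        sample = rng.choices(reads, k=sample_size)  # draw with replacement
        count = sample.count(minority)
        # Integer comparison: count / sample_size >= 1 / detection_limit
        if count >= min_reads and count * detection_limit >= sample_size:
            hits += 1
    return hits / n_boot

# Hypothetical read pool with a 1% minority clone, not data from the study.
pool = ["3D7"] * 990 + ["HB3"] * 10
for size in (50, 300, 3000):
    print(size, detectability(pool, "HB3", size))
```

Detectability rises with sampling size because the chance of drawing at least 3 minority reads grows with the number of draws.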
Fig. 3 Comparison of genotyping by length-polymorphic marker msp2 and amplicon sequencing of cpmp and csp. Raw data from length-polymorphism- and SNP-based genotyping for one P. falciparum-positive field sample. Left panel: Capillary electropherograms (CE) for msp2 nested PCR products (duplicate experiments); X-axis: fragment length, Y-axis: peak heights (arbitrary intensity units); size standards: red/orange peaks; 3D7-type msp2 genotypes: green peaks; FC27-type msp2 genotypes: blue peaks. Middle and right panels: Dendrograms derived from sequence reads of marker cpmp (middle) and csp (right); coloured lines represent membership to a specific, colour-coded haplotype; grey lines: sequence reads of PCR artefacts (later excluded by cut-off settings); line length: number of mismatches according to bar insert. Bottom panels: Read counts (n) and percentage of reads (%) per haplotype and final multiplicity call
Fig. 4 Frequency distribution of multiplicity of infection and allelic frequencies of cpmp, csp and msp2-CE. 37 P. falciparum-positive samples from PNG were analysed for the 3 markers cpmp (27 haplotypes), csp (4 haplotypes) and msp2-CE (25 genotypes). Pie charts represent the allelic frequency distribution for each marker in the 37 samples
Summary of genotyping results from three molecular markers analysed in 37 field samples
| Marker | He | Mean MOI | Number of SNPs^a | Number of haplotypes | Concordance of MOI |
|---|---|---|---|---|---|
| msp2-CE | 0.948 | 2.19 | NA | 25 | Reference |
| cpmp | 0.957 | 2.41 | 45 | 27 | 0.71^b (good) |
| csp | 0.574 | 1.54 | 10^c | 4 | 0.38^d (poor) |
^a With respect to the reference sequence of P. falciparum strain 3D7
^b Cohen’s Kappa (2 raters, weights = equal): z = 6.64, p-value = 3.04e-11
^c 4/10 SNPs are fixed within these 37 field samples
^d Cohen’s Kappa (2 raters, weights = equal): z = 4.48, p-value = 7.61e-6
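The MOI concordance values are Cohen's Kappa statistics with "weights = equal". A minimal sketch of a weighted Kappa is given below, assuming that "equal" denotes equally spaced (linear) agreement weights between ordered categories; this reading, the function name and the toy MOI calls are assumptions, and the z statistics and p-values from the footnotes are not reproduced.

```python
def linear_weighted_kappa(rater_a, rater_b):
    """Cohen's Kappa with linear agreement weights w_ij = 1 - |i-j|/(k-1).

    Categories are ranked by sorted order of the observed values, so the
    weight depends on rank distance, not on the numeric difference.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    categories = sorted(set(rater_a) | set(rater_b))
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    p_obs = p_exp = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1.0 - abs(i - j) / (k - 1)
            p_obs += weight * observed[i][j]
            p_exp += weight * row[i] * col[j]
    return (p_obs - p_exp) / (1.0 - p_exp)

# Hypothetical MOI calls from two markers, not data from the study.
moi_reference = [1, 2, 2, 3, 1]
moi_other = [1, 2, 3, 3, 1]
print(linear_weighted_kappa(moi_reference, moi_other))
```

Linear weights give partial credit when two markers disagree by a single clone, which suits an ordinal quantity such as MOI.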
Concordance of haplotype calls in replicates of 37 field samples
| | | | Passed cut-off^a |
|---|---|---|---|
| Haplotype frequency within sample ≥ 1% | | | |
| | | | |
| present in single replicate only | 0 | 0 | no |
| Haplotype frequency within sample < 1% | | | |
| | | | |
| present in both replicates, one at ≥ 3 reads^b and one at < 3 reads^b | 1^c | 0 | yes/no^d |
| present in single replicate at ≥ 3 reads^b | 17^e | 2 | yes/no^d |
| present in both replicates at < 3 reads^b | 1 | 0 | no |
| present in single replicate at < 3 reads^b | 10 | 5 | no |
Bold rows indicate haplotypes that passed cut-off criteria in both replicates
^a Default cut-off criteria for accepting a haplotype: ≥3 reads and a minority clone detection limit of 1:1000
^b Owing to default cut-off for haplotype call
^c Second replicate had too low coverage to detect ≥3 reads
^d Potential false haplotype calls, as only one replicate passed the cut-off criteria
^e In 2 instances the second replicate had too low coverage to detect the minority clone
Mean proportion of singleton or chimeric reads and indels detected in both field sample replicates
| Marker | Replicate 1: Singletons | Replicate 1: Indels | Replicate 1: Chimera | Replicate 2: Singletons | Replicate 2: Indels | Replicate 2: Chimera |
|---|---|---|---|---|---|---|
| csp | 11.55 | 3.78^a | 0.00 | 11.47 | 4.05^a | 0.00 |
| cpmp | 9.76 | 0.073^b | 0.631^c | 9.74 | 0.034^b | 0.130^c |
^a Marker csp: indels, replicate 1 versus 2; Student’s t-test: t = −1.336, df = 71.052, p-value = 0.1858
^b Marker cpmp: indels, replicate 1 versus 2; Student’s t-test: t = 1.3304, df = 71.94, p-value = 0.1876
^c Marker cpmp: chimera, replicate 1 versus 2; Student’s t-test: t = 2.3552, df = 55.4, p-value = 0.02208
Fig. 5 Bioinformatic analysis pipeline applied to highly multiplexed deep sequencing data