Timothy Robinson, Susana G Campino, Sarah Auburn, Samuel A Assefa, Spencer D Polley, Magnus Manske, Bronwyn MacInnis, Kirk A Rockett, Gareth L Maslen, Mandy Sanders, Michael A Quail, Peter L Chiodini, Dominic P Kwiatkowski, Taane G Clark, Colin J Sutherland.
Abstract
Naturally acquired blood-stage infections of the malaria parasite Plasmodium falciparum typically harbour multiple haploid clones. The apparent number of clones observed in any single infection depends on the diversity of the polymorphic markers used for the analysis, and the relative abundance of rare clones, which frequently fail to be detected among PCR products derived from numerically dominant clones. However, minority clones are of clinical interest as they may harbour genes conferring drug resistance, leading to enhanced survival after treatment and the possibility of subsequent therapeutic failure. We deployed new generation sequencing to derive genome data for five non-propagated parasite isolates taken directly from 4 different patients treated for clinical malaria in a UK hospital. Analysis of depth of coverage and length of sequence intervals between paired reads identified both previously described and novel gene deletions and amplifications. Full-length sequence data was extracted for 6 loci considered to be under selection by antimalarial drugs, and both known and previously unknown amino acid substitutions were identified. Full mitochondrial genomes were extracted from the sequencing data for each isolate, and these were compared against a panel of polymorphic sites derived from published or unpublished but publicly available data. Finally, genome-wide analysis of clone multiplicity was performed, and the number of infecting parasite clones estimated for each isolate. Each patient harboured at least 3 clones of P. falciparum by this analysis, consistent with results obtained with conventional PCR analysis of polymorphic merozoite antigen loci. We conclude that genome sequencing of peripheral blood P. falciparum taken directly from malaria patients provides high quality data useful for drug resistance studies, genomic structural analyses and population genetics, and also robustly represents clonal multiplicity.
Year: 2011 PMID: 21853089 PMCID: PMC3154926 DOI: 10.1371/journal.pone.0023204
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of sequence data, and numbers of potential SNP and indels, relative to 3D7 reference sequence.
| Isolate | Read length | Lanes | PE reads per lane (millions) | Cover. All, >0: median (mean) | Cover. All, >0: nuclear median (mean) | Cover.: Mito median (mean) | % genome, Cover. >0 | % genome, Cover. >4 | Q30 SNP (Indels) | Q30 % coding (unique) regions | Q20 SNP (Indels) | Q20 % coding (unique) regions | Q60 SNP (Indels) | Q60 % coding (unique) regions |
| OX001 | 54 | 2 | 14.4 | 3, (11, 16) | 3, 9 (11, 16) | 1071 (1158) | 68.2 | 45.4 | 27943 (1043) | 69.2 (70.0) | 29521 (1123) | 68.5 (69.1) | 24167 (785) | 71.0 (72.1) |
| OX003 | 54 | 2 | 13.9 | 6, 9 (11, 14) | 6, 9 (11, 14) | 939 (895) | 80.7 | 54.5 | 27093 (1546) | 65.4 (70.1) | 28538 (1644) | 64.5 (69.1) | 23699 (1249) | 67.6 (70.2) |
| OX005A | 76 | 2 | 32.6 | 90, 91 (98, 101) | 90, 91 (97, 100) | 1244 (1150) | 97.5 | 95.5 | 48329 (25763) | 31.4 (74.5) | 50442 (26419) | 31.3 (74.0) | 43059 (24239) | 31.9 (76.5) |
| OX005B | 76 | 1 | 30.0 | 126, 127 (115, 118) | 125, 127 (114, 117) | 1544 (1478) | 97.5 | 96.1 | 43753 (22624) | 33.8 (76.7) | 46258 (23208) | 33.4 (75.6) | 37010 (21180) | 34.5 (79.8) |
| OX006 | 76 | 1 | 25.1 | 122, 123 (115, 117) | 122, 123 (115, 117) | 1526 (1450) | 98.0 | 96.7 | 42985 (25580) | 32.5 (78.1) | 45863 (26687) | 32.3 (77.1) | 34307 (23834) | 32.5 (81.1) |
PE = paired end; Cover. = coverage; all refers to all positions; >0 refers to those positions with non-zero coverage; Mito = mitochondrial genome; unique = % of sliding 50-mer windows around each position that are unique; indels = insertions and deletions; Q20/30/60 equates to error rates of 1 in 100/1000/1000000 base pairs respectively.
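The correspondence between quality thresholds and error rates in the footnote follows the Phred scale, where a score Q corresponds to an error probability of 10^(-Q/10). A minimal sketch of that conversion is given below; it assumes the Q20/Q30/Q60 thresholds in the table are Phred-scaled, and the function name is illustrative rather than taken from the study.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred-scaled quality score Q to an error probability.

    Uses the standard definition Q = -10 * log10(P), i.e. P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


# The three thresholds used in the table above.
for q in (20, 30, 60):
    p = phred_to_error_probability(q)
    print(f"Q{q}: roughly 1 error in {round(1 / p):,} base pairs")
```

Raising the threshold from Q20 to Q60 therefore trades a smaller call set for a much lower expected error rate, which matches the falling SNP counts across the Q20, Q30 and Q60 columns.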
Figure 1. Identifying amplifications as areas of high MPS coverage.
Length of sequence intervals between paired reads (nt) and coverage (read frequency) are plotted against chromosome position and %GC content for two loci in isolates OX001 and OX006. pfgh1 displays high coverage consistent with amplification in OX006, but not OX001. pfef2 displays high coverage consistent with amplification in OX001, but not OX006. Loci of interest (red circles) are shown within 100 kb of genomic context. Red colouring within the read pile-ups signifies polymorphic sites within a read at which a non-reference allele is present (i.e. a SNP).
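The idea of reading an amplification from elevated coverage can be illustrated with a simple ratio of the mean depth across a locus to the median depth of the surrounding region. The sketch below is not the authors' procedure, which relied on inspecting coverage and paired-read interval plots; the function name, the toy depth values and the 2.0 cut-off are all hypothetical.

```python
from statistics import mean, median


def coverage_ratio(depths: list[int], start: int, end: int) -> float:
    """Mean depth over positions [start, end) divided by the median depth of all positions."""
    baseline = median(depths)
    if baseline == 0:
        raise ValueError("median depth is zero; the ratio is undefined")
    return mean(depths[start:end]) / baseline


# Hypothetical per-position read depths: a 10-position locus at three times
# the depth of its flanking sequence.
depths = [100] * 50 + [300] * 10 + [100] * 40

ratio = coverage_ratio(depths, 50, 60)
# Hypothetical cut-off chosen only for illustration.
is_candidate_amplification = ratio >= 2.0
print(ratio, is_candidate_amplification)
```

In real data the baseline would need to account for base composition, which is one reason the figure plots %GC content alongside coverage.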
Figure 2. Deletions in pfrbp2 homologues a and b appear as areas with inflated sequence intervals in four isolates.
Two isoforms of RBP2 are encoded by adjacent genes on chromosome 13, arranged head to head and transcribed in opposite directions. 60 kb around these genes are depicted for four isolates. Loci of interest (red arrows) have either a ∼600 bp deletion in the carboxy-terminal serine-rich domain of homologue b (red ellipses), or a ∼200–300 bp deletion in the low-complexity protein domain immediately upstream in both genes (blue ellipses). Y-axis depicts sequence interval between paired reads. X-axis depicts nucleotide coordinates along the chromosome, as in Figure 1.
Figure 3. Evidence of a major deletion at the right end of chromosome 3 in isolate OX005.
Paired reads across the whole of chromosome 3 are presented in pile-up view for two isolates, OX005 and OX006 (upper panel). Y-axis depicts sequence interval between paired reads, and X-axis gives chromosome coordinates as in Figures 1 and 2. A detailed view of ∼70 kb around the clag3.2 and clag3.1 loci is also shown for 4 isolates (lower panel). The locus between PFC0110w and PFC0120w is a degenerate var gene lacking a full-length ORF in 3D7 and other parasite sequences in the available databases.
Amino acids encoded at polymorphic codons of 6 P. falciparum loci likely to be under drug selection.
| Isolate | Pfatpase6 PFA0310c | Pfmrp1 PFA0590w | Pfmdr1 PFE1150w | Pfcrt MAL7P1.27 | Pfmrp2 PFL1410c | Pfnhe1 PF13_0019 | |||||||||||||||||||||
| CODONS: | 431 | 569 | 639 | 876 | 1466 | 86 | 184 | 496 | 649–654 | 1246 | 72–76 | 220 | 271 | 199 | 235–40–42 | 350 | 709 | 714 | 796 | 1527 | 1531 | 1373 | 173 | 203–4 | 878 | 950 | 1557 |
| 3D7 REF | E | N | G | I | K | N | Y | T | NDDNNN | D | CVMNK | A | Q | L | YQQ | T | Q | K | S | S | L | H | V | SD | T | V | F |
| OX001 Ghana | E | N | G | I | K | YN | F | T | NDDNNN | YD | CVMNK | A | Q | LV | YQQ | PT | QK | K | S | - | - | N | V | SD | T | V | S |
| OX003 Mozambique | E | K | G | I | K | N | F | I | NDDNNN DDNNNN | D | CVIET | S | E | V | YQQ | T | Q | I | AS | T | VI | H | A | SD | T | G | S |
| OX005A Ghana | E | N | G | I | K | N | F | T | NDNDNN YNDNNN | D | CVMNK | A | Q | V | YQQ YEE | T | Q | I | S | T | I | H | A | FY | I | V | S |
| OX005B Ghana | E | N | G | I | K | N | F | T | NDDDNN NNDNNN | D | CVMNK | A | Q | V | YQQ NEE | T | Q | I | S | T | I | H | A | FD | I | V | S |
| OX006 Kenya | K | N | D | VI | R | YN | YF | T | NNNNDD NDDNNN | YD | CVIET | S | E | V | YQQ NQE | T | Q | I | S | ST | LI | H | V | SD | T | G | S |
Genome sequence data was generated on the Solexa Illumina platform as described in Materials and Methods. Sequences aligning with the 6 loci shown in the reference sequence for P. falciparum (laboratory clone 3D7) were extracted (with quality criteria stated in the Methods), converted to FASTA format, translated and aligned in Clustal W. Sites exhibiting polymorphism among the 5 isolates are shown.
Shading: non-synonymous substitution relative to the reference sequence.
Multiplicity: where more than one base was called at any one position, the encoded amino acid with the most calls is displayed first. Haplotypes cannot be inferred from these data; for example, any or all of the combinations YFY, YFD, NFY or NFD may exist for pfmdr1 codons 86, 184 and 1246 in patient OX001.
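The ambiguity described in the multiplicity note can be made concrete by enumerating every combination of one called residue per codon. The sketch below uses the OX001 calls at pfmdr1 codons 86, 184 and 1246 from the table above; the function and variable names are illustrative only.

```python
from itertools import product

# Amino acids called at pfmdr1 codons 86, 184 and 1246 in isolate OX001,
# as listed in the table above (two residues at codons 86 and 1246).
calls_by_codon = {86: "YN", 184: "F", 1246: "YD"}


def candidate_haplotypes(calls: dict[int, str]) -> list[str]:
    """Return every combination of one called residue per codon, in codon order."""
    residues = [calls[codon] for codon in sorted(calls)]
    return ["".join(combo) for combo in product(*residues)]


haplotypes = candidate_haplotypes(calls_by_codon)
print(haplotypes)  # ['YFY', 'YFD', 'NFY', 'NFD']
```

The enumeration only lists what is possible. Short-read calls at separate codons do not show which of these combinations are actually present in the infection.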
SNP and inferred haplotypes in pfdhfr and pfdhps loci.
| SNP position and gene | OX001 | OX003 | OX005A | OX005B | OX006 |
| MAL4:755220 PFD0830w | T | T | T | T | T |
| MAL4:755243 PFD0830w | C | C | C | C | C |
| MAL4:755391 PFD0830w | A | A | A | A | A |
| MAL4:755558 PFD0830w | A | A | A | A | A |
| MAL8:550802 PF08_0095 | T | T | G | G | T |
| MAL8:550806 PF08_0095 | G | G | G | G | G |
| MAL8:551114 PF08_0095 | A | G | A | A | G |
| MAL8:551238 PF08_0095 | C | C | C | C | C |
Polymorphic nucleotide positions in MPS-derived P. falciparum mitochondrial genomes.
| Mitochondrial genome coordinates | ||||||||||||||||||||||||||||||||
| NT coord (Joy et al., ref. 30) | 74 | 204 | 701 | 766 | 776 | 837 | 964 | 1260 | 1284 | 1362 | 1371 | 1634 | 1687 | 1696 | 1754 | 1780 | 1938 | 2179 | 2387 | 2495 | 2645 | 3010 | 3070 | 3517 | 3558 | 3729 | 3858 | 3966 | 4184 | 4718 | 4720 | 4956 |
| Ref state (version 2.1.4) | A | C | | A | C | T | T | G | G | G | G | A | A | G | T | T | G | T | G | G | T | T | G | T | C | C | T | A | C | A | A | T |
| Coding gene | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| OX001 Ghana | A | C | T | A | C | T | T | G | G | G | G | A | A | | T | T | G | T | G | G | T | T | G | T | C | C | T | A | C | A | A | |
| OX003 Mozambique | A | C | T | A | C | T | T | G | G | G | G | A | A | G | T | T | G | T | | G | T | T | G | T | C | C | T | A | C | A | A | |
| OX005A Ghana | A | C | T | A | C | T | T | G | G | G | G | A | A | G | T | T | G | T | G | G | T | T | G | T | C | C | T | A | C | A | A | |
| OX005B Ghana | A | C | T | A | C | T | T | G | G | G | G | A | A | G | T | T | G | T | G | G | T | T | G | T | C | C | T | A | C | A | A | |
| OX006 Kenya | A | C | T | A | C | T | T | G | G | G | G | A | A | | T | T | G | T | G | G | T | T | G | T | C | C | T | A | C | A | A | |
Row 1: nucleotide coordinates from ref. 30.
Row 2: the additional “ATAT” insert at position 701 is not present in the reference sequence.
Row 3: intersection of polymorphic loci with protein-coding genes.
Clonal multiplicity estimated from polymorphic amplicon sizes in pfmsp1 and pfmsp2 PCR assays compared to estimates from MPS analysis.
| Number of alleles seen | OX001 Ghana | OX003 Mozambique | OX005A Ghana | OX005B Ghana | OX006 Kenya |
| pfmsp1 allelic families* | | | | | |
| K1 | 2 | 1 | 2 | 2 | 2 |
| MAD20 | 0 | 2 | 1 | 1 | 1 |
| RO33 | 2 | 1 | 0 | 0 | 1 |
| pfmsp2 allelic families* | | | | | |
| FC27 | 2 | 2 | 1 | 1 | 3 |
| IC/3D7 | 3 | 1 | 0 | 1 | 1 |
| Minimum number of genotypes from PCR analysis* | | | | | |
| Minimum number of genotypes from MPS analysis**, Method 1 | | | | | |
| Minimum number of genotypes from MPS analysis**, Method 2 | 4 | 4 | 3 | - | 5 |
*Each of the three allelic families occurs in a mutually exclusive manner in a single msp1 gene; similarly for the two allelic families of msp2. Thus the minimum number of genotypes is taken as the larger of the allele totals for the two genes.
**Highest minimum estimates of haplotype multiplicity (≥3) are shown for each isolate. Method 2 did not identify high multiplicity loci in isolate OX005B.
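The rule in the first footnote (sum the alleles seen per gene, then take the larger of the two totals) can be sketched as follows. The allele counts are copied from the table above, with K1, MAD20 and RO33 taken as the msp1 families and FC27 and IC/3D7 as the msp2 families. The results are what the stated rule implies for these counts, not figures quoted from the article, and all names in the code are illustrative.

```python
isolates = ["OX001", "OX003", "OX005A", "OX005B", "OX006"]

# Alleles seen per allelic family, one value per isolate in the order above.
msp1_counts = {
    "K1": [2, 1, 2, 2, 2],
    "MAD20": [0, 2, 1, 1, 1],
    "RO33": [2, 1, 0, 0, 1],
}
msp2_counts = {
    "FC27": [2, 2, 1, 1, 3],
    "IC/3D7": [3, 1, 0, 1, 1],
}


def allele_totals(counts: dict[str, list[int]]) -> list[int]:
    """Sum the alleles seen across all families of one gene, per isolate."""
    return [sum(per_isolate) for per_isolate in zip(*counts.values())]


def minimum_genotypes(gene_a: list[int], gene_b: list[int]) -> list[int]:
    """Per isolate, take the larger of the two per-gene allele totals."""
    return [max(a, b) for a, b in zip(gene_a, gene_b)]


minimum = minimum_genotypes(allele_totals(msp1_counts), allele_totals(msp2_counts))
for isolate, n in zip(isolates, minimum):
    print(f"{isolate}: at least {n} genotypes")
```

Under this rule every isolate comes out at three or more genotypes, in line with the abstract's statement that each patient harboured at least 3 clones.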