Sarah Auburn, Susana Campino, Taane G Clark, Abdoulaye A Djimde, Issaka Zongo, Robert Pinches, Magnus Manske, Valentina Mangano, Daniel Alcock, Elisa Anastasi, Gareth Maslen, Bronwyn Macinnis, Kirk Rockett, David Modiano, Christopher I Newbold, Ogobara K Doumbo, Jean Bosco Ouédraogo, Dominic P Kwiatkowski.
Abstract
Highly parallel sequencing technologies permit cost-effective whole genome sequencing of hundreds of Plasmodium parasites. The ability to sequence clinical Plasmodium samples, extracted directly from patient blood without a culture step, presents a unique opportunity to sample the diversity of "natural" parasite populations in high-resolution clinical and epidemiological studies. A major challenge to sequencing clinical Plasmodium samples is the abundance of human DNA, which may substantially reduce the yield of Plasmodium sequence. We tested a range of human white blood cell (WBC) depletion methods on P. falciparum-infected patient samples in search of a method displaying an optimal balance of WBC-removal efficacy, cost, simplicity, and applicability to low-resource settings. In the first of a two-part study, combinations of three different WBC depletion methods were tested on 43 patient blood samples in Mali. A two-step combination of Lymphoprep plus Plasmodipur best fitted our requirements, although moderate variability was observed in human DNA quantity. This approach was further assessed in a larger sample of 76 patients from Burkina Faso. WBC-removal efficacy remained high (<30% human DNA in >70% samples) and lower variation was observed in human DNA quantities. In order to assess the Plasmodium sequence yield at different human DNA proportions, 59 samples with up to 60% human DNA contamination were sequenced on the Illumina Genome Analyzer platform. An average ~40-fold coverage of the genome was observed per lane for samples with ≤30% human DNA. Even in low-resource settings, using a simple two-step combination of Lymphoprep plus Plasmodipur, over 70% of clinical sample preparations should exhibit sufficiently low human DNA quantities to enable ~40-fold sequence coverage of the P. falciparum genome using a single lane on the Illumina Genome Analyzer platform. This approach should greatly facilitate large-scale clinical and epidemiologic studies of P. falciparum.
Year: 2011 PMID: 21789235 PMCID: PMC3138765 DOI: 10.1371/journal.pone.0022213
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Frequency distribution of human DNA proportions post WBC-depletion.
a) Mali, b) Burkina Faso. Human DNA proportions are presented as the percentage human DNA in the total sample as estimated by quantitative real-time PCR (see methods). The Malian samples (a) represent a range of WBC-depletion methods. All Burkinabe samples (b) were processed using the Lymphoprep plus Plasmodipur approach.
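The "percentage human DNA in the total sample" can be illustrated with a minimal sketch. It assumes the qPCR assays yield separate human and parasite DNA quantities in the same units; the paper's actual calculation is described in its methods and is not reproduced here. The function names and the example values are hypothetical.

```python
from typing import Iterable


def human_dna_percent(human_qty: float, parasite_qty: float) -> float:
    """Return human DNA as a percentage of total (human + parasite) DNA.

    Assumption: both quantities come from qPCR and share the same units.
    """
    total = human_qty + parasite_qty
    if total <= 0:
        raise ValueError("total DNA quantity must be positive")
    return 100.0 * human_qty / total


def fraction_below(percentages: Iterable[float], threshold: float = 30.0) -> float:
    """Return the fraction of samples whose human DNA percentage is below threshold."""
    values = list(percentages)
    if not values:
        raise ValueError("at least one sample is required")
    return sum(1 for p in values if p < threshold) / len(values)


# Hypothetical quantities, for illustration only (not data from the study).
samples = [(1.0, 3.0), (0.5, 9.5), (6.0, 4.0), (2.0, 8.0)]
percents = [human_dna_percent(h, p) for h, p in samples]
print(percents)
print(fraction_below(percents))
```

The second helper mirrors the way the abstract summarises efficacy, as the share of samples falling under a 30% human DNA threshold.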
Figure 2. Distributions of human DNA proportions (% of total DNA) remaining after processing clinical blood samples from Mali with four different WBC-depletion methods.
LA = Lymphoprep plus Antibody (N = 8), P = Plasmodipur (N = 13), LP = Lymphoprep plus Plasmodipur (N = 9), LPA = Lymphoprep plus Plasmodipur plus Anti-HLA1 (N = 13).
Figure 3. Human DNA percentage post-purification with Lymphoprep plus Plasmodipur against parasitaemia in Burkina Faso.
The dashed line indicates the line of best fit. IRBC = Infected Red Blood Cell.
Figure 4. Square root of average coverage per base against human DNA proportion for 59 sequenced clinical samples.
Sequence read lengths were either 54 bp (grey spots) or 76 bp (black spots). Each dot represents a sample. Where a sample was sequenced on more than one lane, the average sequence coverage is presented. The median intra-sample standard deviation of the square root of average coverage was 0.91 (inter-quartile range 0.70–1.35).
Summary of sequencing coverage per lane across 59 clinical samples.
| % Human | Read Length (bp) | No. Samples (No. Lanes) | Average No. | Average Coverage | % Genome Covered: Min Read Depth 1 | % Genome Covered: Min Read Depth 5 | % Genome Covered: Min Read Depth 10 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| <1 | 54 | 10 (18) | 1006 | 43.19 | 91.11 | 75.26 | 64.78 |
| <1 | 76 | 1 (2) | 895 | 38.39 | 63.5 | 48.63 | 42.86 |
| 1–5 | 54 | 12 (20) | 924 | 47.16 | 92.42 | 80.39 | 67.15 |
| 1–5 | 76 | 1 (2) | 437 | 18.8 | 65.67 | 47.16 | 37.17 |
| 5–10 | 54 | 8 (13) | 1037 | 44.54 | 92.99 | 77.82 | 68.82 |
| 10–20 | 54 | 6 (11) | 754 | 32.36 | 81.29 | 61.93 | 51.45 |
| 10–20 | 76 | 1 (2) | 641 | 27.5 | 88.7 | 71.32 | 58.95 |
| 20–30 | 54 | 2 (4) | 629 | 26.98 | 92.52 | 73.7 | 60.51 |
| 20–30 | 76 | 1 (2) | 1136 | 48.74 | 96.44 | 86.87 | 80.22 |
| 30–40 | 54 | 3 (6) | 126 | 5.4 | 54.01 | 21.91 | 12.58 |
| 30–40 | 76 | 2 (2) | 246 | 10.56 | 61.82 | 36.18 | 24.59 |
| 40–50 | 76 | 7 (13) | 186 | 6.65 | 66.4 | 39.07 | 25.25 |
| 50–60 | 76 | 5 (11) | 155 | 6.93 | 56.88 | 30.9 | 21.6 |
Data are averaged across all sample lanes within the given ranges of human DNA quantities. On average, each sample was sequenced on 2 lanes.
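As a rough cross-check of the ~40-fold figure quoted in the abstract, the per-bin averages in the table can be pooled into a lane-weighted mean. This is a sketch only: the bin labels, lane counts, and average coverage values are copied from the table, while weighting by the number of lanes is an assumption based on the note that data are averaged across sample lanes, and may not match the authors' exact procedure.

```python
# (percent-human bin, read length in bp, number of lanes, average coverage),
# copied from the table; bin labels use ASCII hyphens for convenience.
ROWS = [
    ("<1", 54, 18, 43.19),
    ("<1", 76, 2, 38.39),
    ("1-5", 54, 20, 47.16),
    ("1-5", 76, 2, 18.8),
    ("5-10", 54, 13, 44.54),
    ("10-20", 54, 11, 32.36),
    ("10-20", 76, 2, 27.5),
    ("20-30", 54, 4, 26.98),
    ("20-30", 76, 2, 48.74),
    ("30-40", 54, 6, 5.4),
    ("30-40", 76, 2, 10.56),
    ("40-50", 76, 13, 6.65),
    ("50-60", 76, 11, 6.93),
]

# Bins at or below roughly 30% human DNA.
LOW_BINS = {"<1", "1-5", "5-10", "10-20", "20-30"}


def lane_weighted_coverage(rows):
    """Mean coverage across bins, weighting each bin by its lane count (assumed)."""
    total_lanes = sum(lanes for _, _, lanes, _ in rows)
    return sum(lanes * cov for _, _, lanes, cov in rows) / total_lanes


low = [row for row in ROWS if row[0] in LOW_BINS]
high = [row for row in ROWS if row[0] not in LOW_BINS]

print(f"<=30% human DNA: {lane_weighted_coverage(low):.1f}-fold")
print(f">30% human DNA:  {lane_weighted_coverage(high):.1f}-fold")
```

Under this weighting, the bins up to 30% human DNA pool to roughly 41-fold coverage per lane, in line with the abstract, while the bins above 30% pool to roughly 7-fold.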