Vincenzo A Ellis, Victor Kalbskopf, Arif Ciloglu, Mélanie Duc, Xi Huang, Abdullah Inci, Staffan Bensch, Olof Hellgren, Vaidas Palinauskas.
Abstract
BACKGROUND: Sequencing parasite genomes in the presence of host DNA is challenging. Sequence capture can overcome this problem by using RNA probes that hybridize with the parasite DNA and then are removed from solution, thus isolating the parasite DNA for efficient sequencing.
Keywords: Avian malaria; Haemosporida; Hybrid enrichment; Parasite genomics; Parasitemia
Year: 2022 PMID: 35906670 PMCID: PMC9336033 DOI: 10.1186/s13071-022-05373-w
Source DB: PubMed Journal: Parasit Vectors ISSN: 1756-3305 Impact factor: 4.047
BLAST results of unmapped reads against the refseq database for five samples representing the three lineages in this study (GRW11, GRW4, SGS1) and the two SGS1 lineage isolates (SGS1-A and SGS1-B)
| BLAST Taxon Hit | GRW11 (cc82) | GRW4 (51242) | SGS1-A (1309) | SGS1-A (1455) | SGS1-B (735) |
|---|---|---|---|---|---|
| Aves: Passeriformes | 6342 | 39,851 | 2258 | 14,009 | 1900 |
| Aves: Non-passerine orders | 322 | 3734 | 89 | 910 | 90 |
| Mammalia | 10 | 150 | 3 | 31 | 4 |
| Plasmodium | 0 | 0 | 0 | 0 | 0 |
| All hits | 16,478 | 82,689 | 9782 | 30,536 | 5467 |
| Total unmapped reads | 991,173 | 14,761,081 | 360,028 | 1,593,536 | 423,348 |
| No BLAST hit | 974,695 | 14,678,392 | 350,246 | 1,563,000 | 417,881 |
| All reads | 6,116,400 | 15,219,066 | 5,915,344 | 1,839,486 | 1,939,288 |
Two samples from the SGS1-A lineage isolate were BLASTed; they represent a high-parasitemia (1309) and a low-parasitemia (1455) sample. The numbers of reads with hits to birds (shown separately for Passeriformes and non-passerine orders), mammals (Mammalia), and Plasmodium (no unmapped reads had hits to Plasmodium) are shown. The “All hits” category is the sum of the reads in the aforementioned categories and all other unmapped reads that resulted in hits to other taxonomic categories (including unassigned taxa; data not shown). The number of unmapped reads that did not result in a BLAST hit (“No BLAST hit”), the total number of unmapped reads for each sample (“Total unmapped reads”), and the total number of sequenced reads for each sample (mapped and unmapped; “All reads”) are also shown. Sample name follows lineage name in parentheses.
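The relationships among the table rows can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are copied from the table above, while the dictionary layout, the `summarize` helper, and the derived "mapped" quantities (all reads minus unmapped reads) are assumptions introduced here, not part of the original analysis.

```python
# Read counts copied from the BLAST table above, keyed by sample name.
# The dictionary layout and field names are hypothetical.
samples = {
    "cc82":  {"all_hits": 16_478, "no_hit": 974_695,    "unmapped": 991_173,    "all_reads": 6_116_400},
    "51242": {"all_hits": 82_689, "no_hit": 14_678_392, "unmapped": 14_761_081, "all_reads": 15_219_066},
    "1309":  {"all_hits": 9_782,  "no_hit": 350_246,    "unmapped": 360_028,    "all_reads": 5_915_344},
    "1455":  {"all_hits": 30_536, "no_hit": 1_563_000,  "unmapped": 1_593_536,  "all_reads": 1_839_486},
    "735":   {"all_hits": 5_467,  "no_hit": 417_881,    "unmapped": 423_348,    "all_reads": 1_939_288},
}


def summarize(counts):
    """Derive per-sample quantities from the table rows (hypothetical helper)."""
    mapped = counts["all_reads"] - counts["unmapped"]
    return {
        # "Total unmapped reads" should equal "All hits" + "No BLAST hit".
        "consistent": counts["all_hits"] + counts["no_hit"] == counts["unmapped"],
        "mapped_reads": mapped,
        "mapped_fraction": mapped / counts["all_reads"],
        # Share of unmapped reads that produced any BLAST hit.
        "blast_hit_fraction": counts["all_hits"] / counts["unmapped"],
    }


summary = {name: summarize(counts) for name, counts in samples.items()}

for name, s in summary.items():
    print(f"{name}: consistent={s['consistent']}, "
          f"mapped={s['mapped_fraction']:.1%}, "
          f"unmapped with BLAST hit={s['blast_hit_fraction']:.1%}")
```

For every sample, "All hits" plus "No BLAST hit" equals "Total unmapped reads", so the table is internally consistent.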
Fig. 1 The percentage of targeted nucleotides as a function of depth of coverage at which those nucleotides were sequenced. Steeply sloping lines on the left-hand side of the graph suggest poor sequencing relative to lines with shallower slopes closer to the right-hand side of the graph. Each line represents a single sample, and line type and color correspond to the lineage isolates. The lineage SGS1 was represented by two isolates; SGS1-A was originally isolated from the host species Loxia curvirostra and SGS1-B from Passer domesticus.
Fig. 2 The percentage of nucleotides that the sequence capture probes were designed to capture that were sequenced at different depths of coverage (5×, 20×, 50×, 100×) in relation to parasitemia (percentage of infected red blood cells). Samples are from experimental infections of the lineage SGS1.
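The quantity plotted in Figs. 1 and 2 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that "sequenced at a depth of N×" means a depth of at least N at a targeted position, and both the `breadth_at_depths` function and the toy depth values are hypothetical.

```python
def breadth_at_depths(depths, thresholds=(5, 20, 50, 100)):
    """Percentage of targeted positions covered at or above each depth threshold.

    `depths` holds one per-base depth value per targeted nucleotide.
    Treating each threshold as "at least N×" is an assumption made here.
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    n = len(depths)
    return {t: 100.0 * sum(d >= t for d in depths) / n for t in thresholds}


# Toy per-base depths for ten targeted positions (invented for illustration).
toy_depths = [0, 3, 5, 12, 20, 48, 50, 75, 100, 250]
result = breadth_at_depths(toy_depths)

for threshold, percent in result.items():
    print(f">= {threshold}x: {percent:.1f}% of targeted positions")
```

Evaluating the same function over a fine grid of thresholds for one sample would give a curve of the kind drawn in Fig. 1, while evaluating it at the four fixed thresholds across samples of differing parasitemia corresponds to Fig. 2.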
Fig. 3 Examples of mapping depth over the first chromosome for low (0.1%; a), medium (7%; b), and high (56.4%; c) parasitemia samples, with the entire chromosome on the left and a region taken from the dashed box expanded on the right in IGV. *Depth extends to 4015. **Depth extends to 1758. ***Depth extends to 2711.
Fig. 4 Haplotype networks of lineages (DONANA05 is the lineage name of the reference genome; GRW11 is the lineage of sample cc82; GRW4 is the lineage of sample 51242; SGS1-A is lineage SGS1 represented by sample 1309; SGS1-B is lineage SGS1 represented by sample 735) for four genes (merozoite surface protein 8, MSP8, 1321 bp; cytochrome b, cyt b, 1151 bp; replication factor C subunit 1, RFC1, 2380 bp; rhoptry neck protein 3, RON3, 4558 bp). The number of substitutions separating haplotypes is presented on each branch of the network.
A distance matrix of number of nucleotide differences between lineages (DONANA05 is the lineage name of the reference genome; GRW11 is the lineage of sample cc82; GRW4 is the lineage of sample 51242; SGS1-A is lineage SGS1 represented by sample 1309; SGS1-B is lineage SGS1 represented by sample 735) computed from the concatenated alignment of 25 genes (Additional file 6: Table S3)
| | SGS1-A | SGS1-B | GRW11 | DONANA05 |
|---|---|---|---|---|
| SGS1-A | | | | |
| SGS1-B | 11 | | | |
| GRW11 | 19 | 22 | | |
| DONANA05 | 16 | 9 | 23 | |
| GRW4 | 666 | 663 | 675 | 660 |
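A distance matrix of this kind counts the alignment columns at which two sequences differ. The sketch below shows one way such counts could be computed; it is not the authors' method. The toy sequences, the function names, and the choice to skip columns containing gaps or ambiguous bases are all assumptions, since the source does not say how such columns were handled.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def count_differences(seq_a, seq_b):
    """Count aligned columns where two sequences carry different bases.

    Columns where either sequence has a gap or an ambiguous base are skipped;
    that handling is an assumption, not taken from the source.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def distance_matrix(alignment):
    """Return {(name_1, name_2): differences} for every pair of sequences."""
    return {
        (n1, n2): count_differences(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


# Toy alignment with invented names and sequences, for illustration only.
toy_alignment = {
    "lineage_X": "ACGTACGT",
    "lineage_Y": "ACGTACGA",
    "lineage_Z": "ACCTAC-T",
}
matrix = distance_matrix(toy_alignment)

for (n1, n2), diff in matrix.items():
    print(f"{n1} vs {n2}: {diff}")
```

For a concatenated alignment, the per-gene sequences of each lineage would be joined in the same gene order before being passed to `distance_matrix`.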