Rui Martiniano, Erik Garrison, Eppie R Jones, Andrea Manica, Richard Durbin.
Abstract
BACKGROUND: During the last decade, the analysis of ancient DNA (aDNA) sequence has become a powerful tool for the study of past human populations. However, the degraded nature of aDNA means that aDNA molecules are short and frequently mutated by post-mortem chemical modifications. These features decrease read mapping accuracy and increase reference bias, in which reads containing non-reference alleles are less likely to be mapped than those containing reference alleles. Alternative approaches have been developed to replace the linear reference with a variation graph which includes known alternative variants at each genetic locus. Here, we evaluate the use of variation graph software vg to avoid reference bias for aDNA and compare with existing methods.
Keywords: Ancient DNA; Reference bias; Sequence alignment; Variation graph
Year: 2020 PMID: 32943086 PMCID: PMC7499850 DOI: 10.1186/s13059-020-02160-7
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Sequence tube maps. Sequence tube maps [19] of a small region of the human genome with aDNA reads from the Yamnaya individual aligned with (a) bwa aln to a linear reference sequence and (b) vg map to a graph containing 1000 Genomes variants. The individual is heterozygous for both an indel (GTTTGAG/-) and a SNP (A/C) in this region, with insertion and alternate allele on the same haplotype. The two underlying haplotypes in this region are colored in gray, and red and blue lines indicate forward and reverse reads, respectively. None of the 6 reads across the insertion and only 2 of 12 reads across the SNP were mapped by bwa. Reads were locally realigned with vg map to the graph for the purpose of visualization
Fig. 2 Comparing vg graph, bwa aln, and bwa mem using simulated ancient DNA. Comparing bwa aln and vg map performance when aligning reads simulated from chromosome 11 of the Human Origins panel. Lines represent ordinary least squares (OLS) regression results for the allele/aligner conditions corresponding to their colors. (a) Comparison between vg graph and bwa aln -n 0.02. (b) Comparison between vg graph and bwa aln -n 0.01 -o 2. (c) Comparison of the mean percentage (and 95% CI) of mapped reads in simulated data by vg graph, bwa aln, and bwa mem using different alignment parameters and minimum mapping quality filtering thresholds. (d) Mean alternate allele fraction (and 95% CI) of simulated reads after alignment with the different methods and minimum mapping quality filtering thresholds. We also show results obtained after processing simulated data with two previously published workflows for addressing reference bias: modified reads (“modreads”) [10] and modified reference genome (“altref genome”) [15]
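The summary statistic in panel d can be illustrated with a minimal Python sketch. It assumes per-site counts of reads supporting the reference and alternate alleles are already available, and it uses a normal-approximation 95% CI; the caption does not state how the intervals were computed, so that choice, the function names, and the example counts are all hypothetical. At heterozygous sites, an unbiased aligner would be expected to give a mean alternate allele fraction near 0.5, and values below that indicate reference bias.

```python
import math
from statistics import mean, stdev


def alt_allele_fraction(ref_count, alt_count):
    """Fraction of reads at one site that carry the alternate allele."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("site has no covering reads")
    return alt_count / total


def mean_with_ci95(values):
    """Mean and normal-approximation 95% confidence interval (an assumption)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    m = mean(values)
    half_width = 1.96 * stdev(values) / math.sqrt(len(values))
    return m, m - half_width, m + half_width


# Hypothetical (ref, alt) read counts at four heterozygous sites.
site_counts = [(6, 4), (5, 5), (7, 3), (4, 6)]
fractions = [alt_allele_fraction(r, a) for r, a in site_counts]
m, lo, hi = mean_with_ci95(fractions)
print(f"mean alt fraction = {m:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Running the same calculation on read counts obtained after each aligner and mapping quality threshold would give one point per condition, comparable in spirit to the panel.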
Datasets analyzed in the present study
| Study | Samples | Coverage | UDG treatment | Sequencing strategy | Region |
|---|---|---|---|---|---|
| Damgaard et al. 2018 | 2 | 11.24-18.95x | Untreated | Whole-genome shotgun sequencing | Kazakhstan |
| Martiniano et al. 2016 | 9 | 0.54-1.63x | Untreated | Whole-genome shotgun sequencing | UK |
| Schiffels et al. 2016 | 10 | 0.47-7.86x | Partial UDG/USER | Whole-genome shotgun sequencing | UK |
| Posth et al. 2018 | 13 | 0.02-0.40x | Partial UDG | Target capture | South America |
Fig. 3 Downsampling a high-coverage aDNA sample. The comparative effect of downsampling on heterozygous variant calling following bwa aln and vg map alignment of reads from the ancient Yamnaya sample [26] with different parameters and mapping quality filtering thresholds, and including post-processing of bwa aln with the modified read filter [10]. (a) SNPs. (b) Indels (the modified read filter does not apply in this case)
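Downsampling of the kind described in this caption can be sketched as random subsampling of reads. The caption does not say which tool or procedure was used, so the Python function below is only an illustration under that assumption: `downsample`, its parameters, and the read names are hypothetical, and each read is kept independently with a fixed probability.

```python
import random


def downsample(reads, fraction, seed=0):
    """Keep each read independently with probability `fraction`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed makes the subset reproducible
    return [read for read in reads if rng.random() < fraction]


# Hypothetical read identifiers standing in for a high-coverage sample.
reads = [f"read_{i}" for i in range(1000)]
subset = downsample(reads, 0.25, seed=42)
print(f"kept {len(subset)} of {len(reads)} reads")
```

Repeating the subsampling at several fractions, then aligning and calling variants on each subset, would show how recovery of heterozygous sites changes with coverage.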
Fig. 4 Comparison between vg and bwa aln for indel detection. (a) Alternate allele observations at indels. (b) Comparison between vg graph and bwa aln in the detection of the CCR5 delta 32 deletion associated with HIV-1 resistance. Reads containing the deletion were mapped with vg in four ancient samples, but not with bwa