Hannane Mohammadi Nodehi1, Mohammad Amin Tabatabaiefar2,3, Mohammadreza Sehhati1.
Abstract
BACKGROUND: Careful design in the primary steps of a next-generation sequencing study is critical for obtaining successful results in downstream analysis.
Keywords: Chromosomes; high-throughput nucleotide sequencing; sequence analysis
Year: 2021 PMID: 34026589 PMCID: PMC8043119 DOI: 10.4103/jmss.JMSS_7_20
Source DB: PubMed Journal: J Med Signals Sens ISSN: 2228-7477
Figure 1. A sample quality score diagram. Quality boxplot obtained with the FaQCs tool for short reads simulated for the HiSeq 1000 at ×10 coverage
List of selected mapping tools and their basic features
| | BWA | Bowtie2 | Kart | Stampy | Novoalign |
|---|---|---|---|---|---|
| Version | 0.7.17 | 2.3.3.1 | 2.4.4 | 1.0.32 | 3.0802 |
| Mapping algorithm | BWT | FM-Index and BWT | Hash table and BWT | Improved hash table and SIMD | Hash table |
| Multithreading | Yes | Yes | Yes | No | No |
| Optimized read length (bp) | 4-200 | 4-5000k | 150-7k | 4-4k | 30-300 |
| Seed mismatches | Yes | Yes | Yes | (0.15 read length) | 8 |
| Indel | 8 | Yes | 5 | 30 | 7 |
| Gap | Yes | Yes | 5 | No | Yes |
| Alignment | Global | Global/local | Local | Global | Global |
| Mapping quality | 0-60 | 0-42 | 0-60 | 0-99 | 0-70 |
FM – Ferragina-Manzini; BWT – Burrows-Wheeler Transform; SIMD – Single instruction, multiple data; BWA – Burrows-Wheeler Aligner
Figure 2. Mapping accuracy for various aligners using data of HiSeq systems ((a) 10, (b) 20, and (c) 25) at ×10 coverage
Figure 3. Average runtime in minutes for mapping ×10 reads using the five selected aligners
Mapping error rate (%) of different aligners when aligning short reads with and without imported mutations, using three types of reference
| Aligner | Without SNV: WG | Without SNV: CTBR | SNV imported: WG | SNV imported: CTBR | Chr# |
|---|---|---|---|---|---|
| BWA | 3.60 | 1.31 | 4.2 | 1.84 | > 30 |
| Bowtie2 | 3.60 | 1.29 | 5.37 | 3.04 | > 30 |
| Stampy | 3.35 | 1.30 | 3.78 | 1.67 | > 20 |
# Reported value is the minimum error rate observed across all 24 chromosomes taken individually (chr1-chr22, chrX, and chrY). WG – Whole genome; CTBR – Customized target-based reference; SIMD – Single instruction, multiple data; BWA – Burrows-Wheeler Aligner
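To make the WG versus CTBR comparison in the table easier to read, the sketch below computes the relative drop in mapping error rate for each aligner. It uses only the WG and CTBR values listed above; the relative-reduction measure and the names `ERROR_RATES`, `relative_reduction`, and `summarize` are illustrative assumptions, not quantities or code reported by the authors.

```python
# Mapping error rates (%) copied from the table above, as (WG, CTBR) pairs.
# The derived percentages are illustrative arithmetic, not reported results.
ERROR_RATES = {
    "without_snv": {
        "BWA": (3.60, 1.31),
        "Bowtie2": (3.60, 1.29),
        "Stampy": (3.35, 1.30),
    },
    "snv_imported": {
        "BWA": (4.2, 1.84),
        "Bowtie2": (5.37, 3.04),
        "Stampy": (3.78, 1.67),
    },
}


def relative_reduction(wg: float, ctbr: float) -> float:
    """Percent drop in error rate when moving from the WG to the CTBR reference."""
    return 100.0 * (wg - ctbr) / wg


def summarize(rates=ERROR_RATES):
    """Return {condition: {aligner: relative reduction in %, rounded to 1 decimal}}."""
    return {
        condition: {
            aligner: round(relative_reduction(wg, ctbr), 1)
            for aligner, (wg, ctbr) in table.items()
        }
        for condition, table in rates.items()
    }


if __name__ == "__main__":
    for condition, table in summarize().items():
        for aligner, reduction in table.items():
            print(f"{condition:>12}  {aligner:<8} {reduction:5.1f}% lower error with CTBR")
```

The Chr# column is left out because it is given only as a lower bound (> 20 or > 30), so no exact ratio can be derived from it.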
Figure 4. Scheme of normalized mapping error in different chromosomes using SNV-imported data and the customized target-based reference for (a) BWA, (b) Bowtie2, and (c) Stampy
Figure 5. Scheme of normalized mapping error between different chromosomes (cross-chromosome mapping error) in all evaluated conditions using all tools