Steven W Wingett, Simon Andrews.
Abstract
DNA sequencing analysis typically involves mapping reads to just one reference genome. Mapping against multiple genomes is necessary, however, when the genome of origin requires confirmation. Mapping against multiple genomes is also advisable for detecting contamination or for identifying sample swaps which, if left undetected, may lead to incorrect experimental conclusions. Consequently, we present FastQ Screen, a tool to validate the origin of DNA samples by quantifying the proportion of reads that map to a panel of reference genomes. FastQ Screen is intended to be used routinely as a quality control measure and for analysing samples in which the origin of the DNA is uncertain or has multiple sources.
Keywords: Bioinformatics; Contamination; FastQC; Illumina; Metagenomics; NGS; QC; Sequencing
Year: 2018 PMID: 30254741 PMCID: PMC6124377 DOI: 10.12688/f1000research.15931.2
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. Graphical output from FastQ Screen after mapping a publicly available RNA-Seq sample (SRR5100711) against several reference genomes.
Reads either i) mapped uniquely to one genome only (light blue), ii) multi-mapped to one genome only (dark blue), iii) mapped uniquely to a given genome and mapped to at least one other genome (light red), or iv) multi-mapped to a given genome and mapped to at least one other genome (dark red). The reads represented by blue shading are significant because they align to one genome only; consequently, if they are observed in an unexpected genome, they suggest contamination.
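The four read categories in the legend can be illustrated with a minimal Python sketch. It is not FastQ Screen's own code: the function names, the category labels, and the input format (for each read, a dictionary giving the number of alignments found in each genome) are all assumptions made for illustration.

```python
from collections import Counter


def classify_read(hits):
    """Assign one read to a category for every genome it maps to.

    `hits` is a hypothetical input: a dict of genome name -> number of
    alignments found for this read (0 means the read did not map).
    """
    mapped = {genome: n for genome, n in hits.items() if n > 0}
    shared = len(mapped) > 1  # the read maps to more than one genome
    categories = {}
    for genome, n in mapped.items():
        unique = n == 1  # exactly one alignment within this genome
        if unique and not shared:
            categories[genome] = "unique_one_genome"        # light blue
        elif not shared:
            categories[genome] = "multi_one_genome"         # dark blue
        elif unique:
            categories[genome] = "unique_multiple_genomes"  # light red
        else:
            categories[genome] = "multi_multiple_genomes"   # dark red
    return categories


def summarise(reads):
    """Return, per genome, the percentage of all reads in each category."""
    total = len(reads)
    counts = {}
    for hits in reads:
        for genome, category in classify_read(hits).items():
            counts.setdefault(genome, Counter())[category] += 1
    return {
        genome: {cat: 100.0 * n / total for cat, n in counter.items()}
        for genome, counter in counts.items()
    }


# Made-up example data: four reads screened against two genomes.
example_reads = [
    {"human": 1, "mouse": 0},  # unique, human only
    {"human": 2, "mouse": 0},  # multi-mapped, human only
    {"human": 1, "mouse": 3},  # maps to both genomes
    {"human": 0, "mouse": 0},  # unmapped
]
summary = summarise(example_reads)
```

In this sketch only the two "one genome" categories (blue in the figure) would count as evidence for a specific genome, which is why a noticeable percentage in those categories for an unexpected species points towards contamination.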