Sheina B Sim, Renee L Corpuz, Tyler J Simmonds, Scott M Geib.
Abstract
BACKGROUND: Pacific Biosciences HiFi read technology is currently the industry standard for high-accuracy long-read sequencing and has been widely adopted by large sequencing and assembly initiatives to generate de novo assemblies of non-model organisms. Although adapter contamination filtering is routine in traditional short-read analysis pipelines, it has not been widely adopted for HiFi workflows.
Keywords: Adapter; Circular consensus sequencing; PacBio HiFi; Sequence data filtering
Year: 2022 PMID: 35193521 PMCID: PMC8864876 DOI: 10.1186/s12864-022-08375-1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Public SRAs and types of errors produced by each of the three assembly programs
Options for HiFiAdapterFilt
Fig. 1. Schematic of the types of errors found in assemblies made from unsanitized raw reads relative to their corresponding assemblies from filtered raw reads, in which all raw reads containing adapter sequences were removed. Five types of assembly errors were identified in the assemblies for the three taxa using three assembly programs: (A) errant insertions of adapter sequence in an otherwise contiguous contig with a near-exact homolog in the corresponding filtered assembly, (B) short (truncated) duplicate contigs containing adapter sequence that are collapsed into a single contig in the corresponding filtered assembly, (C) mis-joined chimeric sequences that represent different parts of two non-homologous contigs in the corresponding filtered assembly, (D) contigs containing an inverted duplicate adjacent to the adapter sequence, and (E) contigs containing tandem adapter sequences where the adjacent sequence is not present in the filtered assembly.
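The filtering step the caption refers to, removing every raw read that contains adapter sequence, can be illustrated with a minimal Python sketch. This is not the HiFiAdapterFilt implementation, whose method and options are not described here. It assumes exact substring matching on both strands, and the `ADAPTER` value is a made-up placeholder rather than a real PacBio adapter sequence. The function names are likewise hypothetical.

```python
# Minimal sketch: drop reads that contain an adapter sequence on either strand.
# ASSUMPTION: exact matching only; a real tool would likely tolerate mismatches.
# ASSUMPTION: ADAPTER is a hypothetical placeholder, not a real PacBio adapter.
ADAPTER = "ACGTTGCAAGCT"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_adapter(read, adapter=ADAPTER):
    """True if the adapter or its reverse complement occurs in the read."""
    read = read.upper()
    adapter = adapter.upper()
    return adapter in read or reverse_complement(adapter) in read


def filter_reads(reads, adapter=ADAPTER):
    """Split a {name: sequence} dict into (kept, removed) dicts."""
    kept, removed = {}, {}
    for name, seq in reads.items():
        target = removed if contains_adapter(seq, adapter) else kept
        target[name] = seq
    return kept, removed


if __name__ == "__main__":
    example_reads = {
        "read_forward_hit": "GGGGACGTTGCAAGCTCCCC",  # adapter, forward strand
        "read_reverse_hit": "TTTTAGCTTGCAACGTGGGG",  # adapter, reverse strand
        "read_clean": "GATTACAGATTACA",              # no adapter
    }
    kept, removed = filter_reads(example_reads)
    print("kept:", sorted(kept))
    print("removed:", sorted(removed))
```

Discarding the whole read, rather than trimming the adapter out, mirrors the caption's description of the filtered read sets. Whether trimming would be preferable in a given workflow is a separate question that this sketch does not address.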