Matthew T Parker, Katarzyna Knop, Geoffrey J Barton, Gordon G Simpson.
Abstract
Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools (https://github.com/bartongroup/2passtools), improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations.
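The filtering step between the two alignment passes can be illustrated with a minimal sketch. This is not the 2passtools implementation: the `Junction` record, the `filter_junctions` and `to_bed_lines` helpers, the read-support threshold, and the splice-motif whitelist are all hypothetical stand-ins for the alignment metrics and machine-learning-derived sequence information described in the abstract. It only shows the general shape of the idea: keep first-pass junctions that pass a filter, then write them out as guides for a second alignment pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """A splice junction seen in first-pass alignments (hypothetical record)."""
    chrom: str
    start: int         # 0-based intron start
    end: int           # intron end, exclusive
    strand: str
    read_support: int  # number of first-pass reads spanning the junction
    donor: str         # first two intron bases, already strand-oriented
    acceptor: str      # last two intron bases, already strand-oriented


# Assumed whitelist of common splice-site dinucleotide pairs. The paper uses
# learned sequence information instead; this set is only a simple stand-in.
COMMON_MOTIFS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}


def filter_junctions(junctions, min_support=2):
    """Keep junctions with a common motif and enough supporting reads."""
    kept = []
    for j in junctions:
        has_motif = (j.donor.upper(), j.acceptor.upper()) in COMMON_MOTIFS
        if has_motif and j.read_support >= min_support:
            kept.append(j)
    return kept


def to_bed_lines(junctions):
    """Format junctions as tab-separated BED-like lines, sorted by position."""
    ordered = sorted(junctions, key=lambda j: (j.chrom, j.start, j.end))
    return [
        f"{j.chrom}\t{j.start}\t{j.end}\t.\t{j.read_support}\t{j.strand}"
        for j in ordered
    ]


# Toy first-pass junctions (made-up coordinates, for illustration only).
first_pass = [
    Junction("chr1", 100, 250, "+", 12, "GT", "AG"),
    Junction("chr1", 103, 250, "+", 1, "TT", "AG"),  # likely alignment error
    Junction("chr1", 400, 900, "-", 5, "GC", "AG"),
]

guide = filter_junctions(first_pass)
guide_lines = to_bed_lines(guide)
```

In this toy input the low-support junction with an unusual donor site is dropped, and the two remaining junctions would serve as the guide set for realignment.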
Keywords: Gene expression; Long-read sequencing; Machine learning; Nanopore sequencing; RNA-seq; Spliced alignment; Splicing; Transcriptome assembly
Year: 2021; PMID: 33648554; PMCID: PMC7919322; DOI: 10.1186/s13059-021-02296-0
Source DB: PubMed; Journal: Genome Biol; ISSN: 1474-7596; Impact factor: 13.583