Katharina J Hoff, Alexandre Lomsadze, Mark Borodovsky, Mario Stanke.
Abstract
BRAKER is a pipeline for highly accurate and fully automated gene prediction in novel eukaryotic genomes. It combines two major tools: GeneMark-ES/ET and AUGUSTUS. GeneMark-ES/ET learns its parameters from a novel genomic sequence in a fully automated fashion; if extrinsic evidence is available, it uses that evidence for model refinement. From the protein-coding genes predicted by GeneMark-ES/ET, we select a set for training AUGUSTUS, one of the most accurate gene-finding tools, which, in contrast to GeneMark-ES/ET, integrates extrinsic evidence into the gene prediction step itself. The first published version, BRAKER1, integrated genomic footprints of unassembled RNA-Seq reads into both the training and the prediction steps. The pipeline has since been extended to integrate data on cross-species proteins mapped to the genome, and to use heterogeneous extrinsic evidence that combines RNA-Seq and protein alignments. In this book chapter, we briefly summarize the pipeline methodology and describe how to apply BRAKER in settings characterized by various combinations of external evidence.
Keywords: AUGUSTUS; BRAKER; Gene prediction; GeneMark-ES/ET; Genome annotation pipeline; Protein mapping to genome; Protein-coding genes; RNA-Seq reads
Year: 2019; PMID: 31020555; PMCID: PMC6635606; DOI: 10.1007/978-1-4939-9173-0_5
Source DB: PubMed; Journal: Methods Mol Biol; ISSN: 1064-3745