Gwo-Liang Chen (1), Yun-Juan Chang, Chun-Hway Hsueh.
(1) Department of Materials Science and Engineering, National Taiwan University, Taipei 10617, Taiwan and High-Performance Biological Computing, Roy J. Carver Biotechnology Center, The University of Illinois, Urbana, IL 61801, USA.
Abstract
MOTIVATION: Prokaryotic genome annotation has focused mainly on identifying all genes and their protein functions. However, <30% of the prokaryotic genomes submitted to GenBank contain partial repeat features of specific types, and none of the genomes contains complete repeat annotations. Deciphering all repeats in DNA sequences is an important and open task in genome annotation and bioinformatics. Hence, there is an immediate need for a tool capable of identifying the full spectrum of repeats in a whole genome.
RESULTS: We report the PRAP (Prokaryotic Repeats Annotation Program) software package, which automates the analysis of repeats in both finished and draft genomes. It aims to identify the full spectrum of repeats at the scale of the prokaryotic genome. Compared with the major existing repeat-finding tools, PRAP exhibits competitive or better results. The results are consistent with manually curated and experimental data. Repeats can be identified and grouped into families to define their relevant types. The final output is parsed into the European Molecular Biology Laboratory (EMBL)/GenBank feature table format for reading and display in Artemis, where it can be combined or compared with other genome data. It is currently the most complete repeat finder for prokaryotes and is a valuable tool for genome annotation.
AVAILABILITY: https://sites.google.com/site/prapsoftware/
CONTACT: hsuehc@ntu.edu.tw.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
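As an illustration of the output step described above, the following minimal Python sketch shows how a repeat could be written as a `repeat_region` entry in the GenBank feature table layout (feature key starting in column 6, location and qualifiers in column 22). This is not PRAP's own code: the `Repeat` class, the `feature_lines` function, the family labels, and the coordinates are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Repeat:
    """Hypothetical record for one repeat copy (not PRAP's data model)."""
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive
    strand: int   # +1 for forward, -1 for reverse
    family: str   # illustrative family label


def feature_lines(rep: Repeat) -> List[str]:
    """Format one repeat as GenBank-style feature table lines."""
    location = f"{rep.start}..{rep.end}"
    if rep.strand < 0:
        # Reverse-strand features are wrapped in complement(...)
        location = f"complement({location})"
    # Five leading spaces, key padded so the location starts in column 22
    key_line = " " * 5 + "repeat_region".ljust(16) + location
    # Qualifiers are indented to column 22 and start with a slash
    qualifier_line = " " * 21 + f'/rpt_family="{rep.family}"'
    return [key_line, qualifier_line]


# Made-up coordinates for two copies of the same hypothetical family
repeats = [
    Repeat(100, 250, +1, "family_1"),
    Repeat(900, 1050, -1, "family_1"),
]
print("\n".join(line for rep in repeats for line in feature_lines(rep)))
```

Grouping copies under a shared `/rpt_family` value is one plausible way to carry the family assignment into a viewer such as Artemis; the qualifiers PRAP actually emits may differ.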