Scott M Geib, Brian Hall, Theodore Derego, Forest T Bremer, Kyle Cannoles, Sheina B Sim.
Abstract
Background: One of the most overlooked, yet critical, components of a whole genome sequencing (WGS) project is the submission and curation of the data to a genomic repository, most commonly the National Center for Biotechnology Information (NCBI). While large genome centers or genome groups have developed software tools for post-annotation assembly filtering, annotation, and conversion into the NCBI's annotation table format, these tools typically require back-end setup and connection to a Structured Query Language (SQL) database and/or some knowledge of programming (Perl, Python) to implement. With WGS becoming commonplace, genome sequencing projects are moving away from the genome centers and into the ecology or biology lab, where fewer resources are present to support the process of genome assembly curation. To fill this gap, we developed software to assess, filter, and transfer annotations, and to convert a draft genome assembly and annotation set into the NCBI annotation table (.tbl) format, facilitating submission to the NCBI Genome Assembly database. This software has no dependencies, is compatible across platforms, and uses a simple command to perform a variety of simple and complex post-analysis, pre-NCBI submission WGS project tasks.

Findings: The Genome Annotation Generator (GAG) is a consistent and user-friendly bioinformatics tool that generates a .tbl file consistent with the NCBI submission pipeline.

Conclusions: The Genome Annotation Generator achieves the goal of providing a publicly available tool that will facilitate the submission of annotated genome assemblies to the NCBI. It is useful for any individual researcher or research group that wishes to submit a genome assembly of their study system to the NCBI.
Year: 2018 PMID: 29635297 PMCID: PMC5887294 DOI: 10.1093/gigascience/giy018
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Options for GAG
| Option | Type of function | Description |
|---|---|---|
| | Annotate | Adds functional annotations present in annotation file to |
| | Trim | Removes regions of genome indicated in |
| | Fix | Adds or corrects start and stop codon features to |
| | Fix | Removes any trailing ends from contig ends in assembly, updates |
| | Remove | Remove CDS shorter than <integer> |
| | Remove | Remove CDS longer than <integer> |
| | Remove | Remove exons shorter than <integer> |
| | Remove | Remove exons longer than <integer> |
| | Remove | Remove introns shorter than <integer> |
| | Remove | Remove introns longer than <integer> |
| | Remove | Remove genes shorter than <integer> |
| | Remove | Remove genes longer than <integer> |
| | Flag | Flag CDS shorter than <integer> |
| | Flag | Flag CDS longer than <integer> |
| | Flag | Flag exons shorter than <integer> |
| | Flag | Flag exons longer than <integer> |
| | Flag | Flag introns shorter than <integer> |
| | Flag | Flag introns longer than <integer> |
| | Flag | Flag genes shorter than <integer> |
| | Flag | Flag genes longer than <integer> |
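
The difference between the Remove and Flag rows can be illustrated with a minimal Python sketch. This is not GAG's own code: the `Feature` class and the `apply_length_filter` function are hypothetical names introduced here, and the sketch assumes that "shorter than" and "longer than" are strict comparisons on an inclusive, 1-based feature length.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical stand-in for an annotated feature (CDS, exon, intron, or gene)."""
    name: str
    kind: str
    start: int  # 1-based, inclusive (assumption)
    end: int    # 1-based, inclusive (assumption)
    notes: list = field(default_factory=list)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def apply_length_filter(features, kind, threshold, mode, action):
    """Apply a length filter to all features of one kind.

    mode:   "shorter" or "longer" (strict comparison, an assumption)
    action: "remove" drops matching features; "flag" keeps them and adds a note
    """
    if mode not in ("shorter", "longer"):
        raise ValueError("mode must be 'shorter' or 'longer'")
    if action not in ("remove", "flag"):
        raise ValueError("action must be 'remove' or 'flag'")

    kept = []
    for feat in features:
        if mode == "shorter":
            out_of_range = feat.length < threshold
        else:
            out_of_range = feat.length > threshold
        matches = feat.kind == kind and out_of_range

        if matches and action == "remove":
            continue  # drop the feature from the annotation set
        if matches:
            # keep the feature, but record why it was flagged
            feat.notes.append(f"{kind} {mode} than {threshold}")
        kept.append(feat)
    return kept
```

Under these assumptions, removing CDS shorter than 150 drops a 90 bp CDS from the returned list, while flagging with the same threshold keeps it and attaches a note for manual review. Features of other kinds pass through unchanged in both cases.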