Ting-Wen Chen, Ruei-Chi Richie Gan, Timothy H Wu, Po-Jung Huang, Cheng-Yang Lee, Yi-Ywan M Chen, Che-Chun Chen, Petrus Tang.
Abstract
BACKGROUND: Recent developments in high-throughput sequencing (HTS) technologies have made it feasible to sequence the complete transcriptomes of non-model organisms or metatranscriptomes from environmental samples. The challenge after generating hundreds of millions of sequences is to annotate these transcripts and classify them by their putative functions. Because many biological scientists lack the knowledge to install Linux-based software packages or maintain the databases used for transcript annotation, we developed an automatic annotation tool with an easy-to-use interface.
Year: 2012 PMID: 23281853 PMCID: PMC3521244 DOI: 10.1186/1471-2164-13-S7-S9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Flowchart of the FastAnnotator pipeline. After users upload sequences to the FastAnnotator server, three different processes, LAST-Blast2GO, PRIAM and domain identification, are executed to determine the gene ontology, enzyme and domain annotation of the submitted sequences. We implemented these three processes in parallel to accelerate the annotation procedure. After all of these annotation programs are completed, FastAnnotator presents GO terms, the best hits in the nr database, enzyme annotations and domain annotations together with a statistical report on the website. In addition to exploring the annotation results online, users can also download all of the annotation results as a zip file.
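The fan-out described in this caption (three independent annotation steps started together, with the report assembled only once all have finished) can be sketched as below. This is a minimal illustration, not FastAnnotator's actual code: the three step functions are hypothetical placeholders that return dummy records, and the use of a thread pool is an assumption about how such a pipeline could be driven.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical placeholders: a real pipeline would launch the external
# tools (LAST-Blast2GO, PRIAM, domain identification) at these points.
def run_go_annotation(fasta_path):
    return {"step": "go", "input": fasta_path}


def run_enzyme_annotation(fasta_path):
    return {"step": "enzyme", "input": fasta_path}


def run_domain_annotation(fasta_path):
    return {"step": "domain", "input": fasta_path}


def annotate(fasta_path):
    """Run the three annotation steps concurrently and wait for all of them."""
    steps = {
        "go": run_go_annotation,
        "enzyme": run_enzyme_annotation,
        "domain": run_domain_annotation,
    }
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        # Submit every step first so they run side by side.
        futures = {name: pool.submit(fn, fasta_path) for name, fn in steps.items()}
        # result() blocks until that step is done, so the combined
        # dictionary is returned only after all three have completed.
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    print(sorted(annotate("transcripts.fasta")))
```

Because the three steps do not depend on one another, the wall-clock time of such a design is bounded by the slowest step rather than by the sum of all three.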
Figure 2. Partial result of a GO annotation using FastAnnotator. The GO annotation results include horizontal bar charts in addition to a table of sequences and corresponding GO annotations. These three horizontal bar charts present the distribution of GO terms categorized as biological process, cellular components or molecular functions, which represent the three main categories for GO annotation. Because the GO terms have a hierarchical structure, FastAnnotator provides views of the distribution of GO terms at different levels. Users can select the level and obtain the overall distribution of the GO terms in real-time. For example, level 1 presents the roots (e.g., biological process, cellular components and molecular functions), and level 3 presents the GO terms that are 2 edges away from the roots.
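The level convention in this caption (roots at level 1, each edge away from a root adding one) can be illustrated with a small sketch. It rests on several assumptions: the term graph is a toy with illustrative names rather than real GO identifiers, a term reachable by several paths is assigned its shortest distance from a root, and each sequence is counted once per level-N term that it reaches through its annotations. How FastAnnotator resolves these cases is not stated in the source.

```python
from collections import Counter, deque

# Toy child -> parents map; names are illustrative, not real GO IDs.
PARENTS = {
    "biological_process": [],
    "metabolic_process": ["biological_process"],
    "cellular_process": ["biological_process"],
    "primary_metabolic_process": ["metabolic_process"],
    "cell_communication": ["cellular_process"],
}


def term_levels(parents):
    """Return {term: level}; roots are level 1, each edge adds 1."""
    children = {term: [] for term in parents}
    for term, term_parents in parents.items():
        for parent in term_parents:
            children[parent].append(term)
    levels = {term: 1 for term, term_parents in parents.items() if not term_parents}
    queue = deque(levels)
    while queue:
        term = queue.popleft()
        for child in children[term]:
            if child not in levels:  # first visit in BFS = shortest path
                levels[child] = levels[term] + 1
                queue.append(child)
    return levels


def terms_at_level(term, level, parents, levels):
    """Return the term itself or its ancestors that sit at the given level."""
    found, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if levels[current] == level:
            found.add(current)
        stack.extend(parents[current])
    return found


def distribution(annotations, level, parents):
    """Count, per level-N term, how many sequences map to it."""
    levels = term_levels(parents)
    counts = Counter()
    for terms in annotations.values():
        hits = set()
        for term in terms:
            hits |= terms_at_level(term, level, parents, levels)
        counts.update(hits)  # each sequence counted once per term
    return counts


if __name__ == "__main__":
    example = {
        "seq1": ["primary_metabolic_process"],
        "seq2": ["cell_communication", "metabolic_process"],
    }
    print(distribution(example, 2, PARENTS))
```

Rolling annotations up to ancestors in this way is what allows the same set of sequences to be summarized at a coarse level or a finer one.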
FastAnnotator results for five different organisms
| Organism | # of entries (total base) | Processing time | % of sequences with best hit | % of annotated sequence* |
|---|---|---|---|---|
|  | 4,456 | 1.25 h | 97.8% | 88.1% |
| Bacteria example+ | 1,219 | 21 min | 94.7% | 81.3% |
| Plant example+ | 39,914 | 16 h | 77.0% | 62.9% |
| Clam+ | 101,795 | 15.5 h | 24.5% | 20.4% |
| Amoeba EST | 13,814 | 3.5 h | 68.4% | 53.1% |
|  | 35,989 | 11 h | 93.7% | 42.0% |
+ RNA-Seq de novo assembly result.
* Percentage of sequences annotated with GO terms, enzymes or domains.