ANGLE: a sequencing errors resistant program for predicting protein coding regions in unfinished cDNA.

Kana Shimizu, Jun Adachi, Yoichi Muraoka.

Abstract

In the process of making full-length cDNA, predicting protein-coding regions helps both in the preliminary analysis of genes and in subsequent processing steps. However, unfinished cDNA contains artifacts, including many sequencing errors, which hinder the correct evaluation of coding sequences. Predictions for short sequences are especially difficult because they provide little information for evaluating coding potential. In this paper, we describe ANGLE, a new program for predicting coding sequences in low-quality cDNA. To achieve error-tolerant prediction, ANGLE uses a machine-learning approach that represents coding sequences more effectively by making maximal use of the limited information in input sequences. Our method utilizes not only codon usage but also protein structure information, which is difficult to use in stochastic model-based algorithms, and optimizes the use of limited information from a short segment when deciding coding potential, with the result that predictive accuracy does not depend on the length of the input sequence. The performance of ANGLE is compared with that of ESTSCAN on four datasets, each with a different error rate (one frame-shift error or one substitution error per 200-500 nucleotides), and on one dataset with no errors. ANGLE outperforms ESTSCAN by 9.26% in average Matthews correlation coefficient on the short-sequence dataset (< 1000 bases). On the long-sequence dataset, ANGLE achieves comparable performance.
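
For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how a Matthews correlation coefficient can be computed from coding/non-coding labels. It assumes per-nucleotide labels encoded as the characters "1" (coding) and "0" (non-coding); the abstract does not state the granularity at which the authors scored predictions, and the function names `confusion_counts` and `matthews_corrcoef` are hypothetical, not part of ANGLE or ESTSCAN.

```python
import math


def confusion_counts(true_labels, pred_labels):
    """Count TP, TN, FP, FN for two equal-length strings of '1'/'0' labels.

    Assumption for illustration: '1' marks a coding position, '0' a non-coding one.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label strings must have the same length")
    tp = tn = fp = fn = 0
    for t, p in zip(true_labels, pred_labels):
        if t == "1" and p == "1":
            tp += 1
        elif t == "0" and p == "0":
            tn += 1
        elif t == "0" and p == "1":
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy example: the predicted coding region is shifted by one position.
    truth = "0011111100"
    predicted = "0001111110"
    print(matthews_corrcoef(*confusion_counts(truth, predicted)))
```

The coefficient ranges from -1 (total disagreement) to +1 (perfect prediction), with 0 corresponding to chance-level agreement, which makes it less sensitive than raw accuracy to an imbalance between coding and non-coding positions.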

Year:  2006        PMID: 16960968     DOI: 10.1142/s0219720006002260

Source DB:  PubMed          Journal:  J Bioinform Comput Biol        ISSN: 0219-7200            Impact factor:   1.122


  27 in total

1.  Inferring bona fide transfrags in RNA-Seq derived-transcriptome assemblies of non-model organisms.

Authors:  Stanley Kimbung Mbandi; Uljana Hesse; Peter van Heusden; Alan Christoffels
Journal:  BMC Bioinformatics       Date:  2015-02-21       Impact factor: 3.169

2.  SMRT Sequencing Reveals Candidate Genes and Pathways With Medicinal Value in Cipangopaludina chinensis.

Authors:  Kangqi Zhou; Zhong Chen; Xuesong Du; Yin Huang; Junqi Qin; Luting Wen; Xianhui Pan; Yong Lin
Journal:  Front Genet       Date:  2022-06-16       Impact factor: 4.772

3.  Detection of Structural Variations and Fusion Genes in Breast Cancer Samples Using Third-Generation Sequencing.

Authors:  Taobo Hu; Jingjing Li; Mengping Long; Jinbo Wu; Zhen Zhang; Fei Xie; Jin Zhao; Houpu Yang; Qianqian Song; Sheng Lian; Jiandong Shi; Xueyu Guo; Daoli Yuan; Dandan Lang; Guoliang Yu; Baosheng Liang; Xiaohua Zhou; Toyotaka Ishibashi; Xiaodan Fan; Weichuan Yu; Depeng Wang; Yang Wang; I-Feng Peng; Shu Wang
Journal:  Front Cell Dev Biol       Date:  2022-04-13

4.  SMRT sequencing of the full-length transcriptome of the Rhynchophorus ferrugineus (Coleoptera: Curculionidae).

Authors:  Hongjun Yang; Danping Xu; Zhihang Zhuo; Jiameng Hu; Baoqian Lu
Journal:  PeerJ       Date:  2020-05-21       Impact factor: 2.984

5.  Characterization and Analysis of the Full-Length Transcriptomes of Multiple Organs in Pseudotaxus chienii (W.C.Cheng) W.C.Cheng.

Authors:  Li Liu; Zhen Wang; Yingjuan Su; Ting Wang
Journal:  Int J Mol Sci       Date:  2020-06-17       Impact factor: 5.923

6.  Analysis of transcripts and splice isoforms in red clover (Trifolium pratense L.) by single-molecule long-read sequencing.

Authors:  Yuehui Chao; Jianbo Yuan; Sifeng Li; Siqiao Jia; Liebao Han; Lixin Xu
Journal:  BMC Plant Biol       Date:  2018-11-26       Impact factor: 4.215

7.  SMRT sequencing of the full-length transcriptome of the white-backed planthopper Sogatella furcifera.

Authors:  Jing Chen; Yaya Yu; Kui Kang; Daowei Zhang
Journal:  PeerJ       Date:  2020-06-09       Impact factor: 2.984

8.  Screening non-coding RNAs in transcriptomes from neglected species using PORTRAIT: case study of the pathogenic fungus Paracoccidioides brasiliensis.

Authors:  Roberto T Arrial; Roberto C Togawa; Marcelo de M Brigido
Journal:  BMC Bioinformatics       Date:  2009-08-04       Impact factor: 3.169

9.  Molecular Functions of Long Non-Coding RNAs in Plants.

Authors:  Qian-Hao Zhu; Ming-Bo Wang
Journal:  Genes (Basel)       Date:  2012-03-08       Impact factor: 4.096

10.  Transcriptator: An Automated Computational Pipeline to Annotate Assembled Reads and Identify Non Coding RNA.

Authors:  Kumar Parijat Tripathi; Daniela Evangelista; Antonio Zuccaro; Mario Rosario Guarracino
Journal:  PLoS One       Date:  2015-11-18       Impact factor: 3.240
