FGGA-lnc: automatic gene ontology annotation of lncRNA sequences based on secondary structures.

Flavio E Spetale, Javier Murillo, Gabriela V Villanova, Pilar Bulacio, Elizabeth Tapia.

Abstract

The study of long non-coding RNAs (lncRNAs), transcripts longer than 200 nucleotides, is central to understanding the development and progression of many complex diseases. Unlike proteins, the functionality of lncRNAs is only subtly encoded in their primary sequence. Current in silico lncRNA annotation methods mostly rely on annotations inferred from interaction networks, but building these networks requires extensive experimental studies. In this work, we present a graph-based machine learning method called FGGA-lnc for the automatic gene ontology (GO) annotation of lncRNAs across the three GO subdomains. We build upon FGGA (factor graph GO annotation), a computational method originally developed to annotate protein sequences from non-model organisms. In the FGGA-lnc version, a coding-based approach is introduced to fuse primary sequence and secondary structure information of lncRNA molecules. As a result, lncRNA sequences become sequences over a higher-order alphabet, allowing supervised learning methods to assess individual GO-term annotations. Raw GO annotations obtained in this way are unaware of the GO structure and are therefore likely to be inconsistent with it. The message-passing algorithm embodied by factor graph models overcomes this problem. Evaluations of the FGGA-lnc method on lncRNA data from model and non-model organisms showed promising results, suggesting it as a candidate to satisfy the huge demand for functional annotations arising from high-throughput sequencing technologies.
© 2021 The Author(s).
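
The abstract does not spell out the coding scheme, so the following is only a minimal sketch of what "fusing" primary sequence and secondary structure into a higher-order alphabet could look like. It assumes the secondary structure is available as a dot-bracket string of the same length as the sequence (as produced by common RNA folding tools), and it pairs each nucleotide with its structural state to form one composite symbol. The function names `fuse` and `kmer_counts` and the toy hairpin are hypothetical and are not taken from FGGA-lnc.

```python
from collections import Counter


def fuse(sequence, structure):
    """Pair each nucleotide with its dot-bracket state to form one symbol.

    Assumption: `structure` is a dot-bracket string aligned position by
    position with `sequence`. This is an illustrative encoding, not the
    one defined in the paper.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    return [nt + st for nt, st in zip(sequence.upper(), structure)]


def kmer_counts(symbols, k=2):
    """Count overlapping k-mers over the fused alphabet."""
    return Counter(
        tuple(symbols[i:i + k]) for i in range(len(symbols) - k + 1)
    )


# Toy hairpin: three paired bases, a three-base loop, three paired bases.
seq = "GGGAAACCC"
struct = "(((...)))"

fused = fuse(seq, struct)
features = kmer_counts(fused, k=2)
```

With four nucleotides and three structural states, the fused alphabet in this sketch has up to 12 symbols. The resulting k-mer counts are the kind of fixed-length feature a supervised classifier for individual GO terms could consume; enforcing consistency with the GO hierarchy would be a separate step.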

Keywords:  gene ontology; lncRNA; prediction

Year: 2021; PMID: 34123354; PMCID: PMC8193470; DOI: 10.1098/rsfs.2020.0064

Source DB: PubMed; Journal: Interface Focus; ISSN: 2042-8898; Impact factor: 4.661


