Automated generation of gene summaries at the Alliance of Genome Resources.

Ranjana Kishore, Valerio Arnaboldi, Ceri E Van Slyke, Juancarlos Chan, Robert S Nash, Jose M Urbano, Mary E Dolan, Stacia R Engel, Mary Shimoyama, Paul W Sternberg, The Alliance of Genome Resources.

Abstract

Short paragraphs that describe gene function, referred to as gene summaries, are valued by users of biological knowledgebases for the ease with which they convey key aspects of gene function. Manual curation of gene summaries, while desirable, is difficult for knowledgebases to sustain. We developed an algorithm that uses curated, structured gene data at the Alliance of Genome Resources (Alliance; www.alliancegenome.org) to automatically generate gene summaries that simulate natural language. The gene data used for this purpose include curated associations (annotations) to ontology terms from the Gene Ontology, Disease Ontology, model organism knowledgebase (MOK)-specific anatomy ontologies and Alliance orthology data. The method uses sentence templates for each data category included in the gene summary in order to build a natural language sentence from the list of terms associated with each gene. To improve readability of the summaries when numerous gene annotations are present, we developed a new algorithm that traverses ontology graphs in order to group terms by their common ancestors. The algorithm optimizes the coverage of the initial set of terms and limits the length of the final summary, using measures of information content of each ontology term as a criterion for inclusion in the summary. The automated gene summaries are generated with each Alliance release, ensuring that they reflect current data at the Alliance. Our method effectively leverages category-specific curation efforts of the Alliance member databases to create modular, structured and standardized gene summaries for seven member species of the Alliance. These automatically generated gene summaries make cross-species gene function comparisons tenable and increase discoverability of potential models of human disease. In addition to being displayed on Alliance gene pages, these summaries are also included on several MOK gene pages.
© The Author(s) 2020. Published by Oxford University Press.
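
The two ideas in the abstract, grouping ontology terms under informative common ancestors and filling a sentence template per data category, can be illustrated with a minimal Python sketch. This is not the Alliance implementation: the toy ontology, the structural information-content measure, the greedy selection, the thresholds (`max_terms`, `min_ic`) and the template wording are all assumptions made for illustration.

```python
import math

# Toy ontology (hypothetical): child term -> list of parent terms.
PARENTS = {
    "biological process": [],
    "signaling": ["biological process"],
    "metabolic process": ["biological process"],
    "Wnt signaling": ["signaling"],
    "Notch signaling": ["signaling"],
    "lipid metabolism": ["metabolic process"],
}

# Hypothetical sentence templates, one per data category.
TEMPLATES = {
    "process": "Involved in {terms}.",
    "disease": "Used to study {terms}.",
}


def ancestors(term):
    """Return the term itself plus every term reachable via parent links."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def information_content(term):
    """Assumed structural IC: terms subsuming fewer terms score higher."""
    subsumed = sum(1 for t in PARENTS if term in ancestors(t))
    return -math.log(subsumed / len(PARENTS))


def group_terms(terms, max_terms=2, min_ic=0.5):
    """Greedily cover the terms with at most max_terms informative ancestors."""
    terms = set(terms)
    if len(terms) <= max_terms:
        return sorted(terms)
    # Candidates: the terms and their ancestors, minus overly generic ones.
    candidates = {a for t in terms for a in ancestors(t)
                  if information_content(a) >= min_ic}
    uncovered, chosen = set(terms), []
    while uncovered and len(chosen) < max_terms:
        def score(candidate):
            covered = sum(1 for t in uncovered if candidate in ancestors(t))
            # Prefer coverage, then specificity, then name (for determinism).
            return (covered, information_content(candidate), candidate)
        best = max(candidates, key=score)
        if score(best)[0] == 0:
            break
        chosen.append(best)
        uncovered -= {t for t in uncovered if best in ancestors(t)}
    return chosen


def summarize(category, terms):
    """Fill the category's sentence template with the grouped terms."""
    grouped = group_terms(terms)
    if len(grouped) > 1:
        text = ", ".join(grouped[:-1]) + " and " + grouped[-1]
    else:
        text = "".join(grouped)
    return TEMPLATES[category].format(terms=text)


annotations = ["Wnt signaling", "Notch signaling", "lipid metabolism"]
print(summarize("process", annotations))
```

In this sketch, the two signaling annotations collapse into their shared parent while the root term is excluded for carrying no information, which mirrors the trade-off the abstract describes between coverage of the original terms and summary length.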

Year: 2020; PMID: 32559296; PMCID: PMC7304461; DOI: 10.1093/database/baaa037

Source DB: PubMed; Journal: Database (Oxford); ISSN: 1758-0463; Impact factor: 3.451


