A Text Mining Pipeline Using Active and Deep Learning Aimed at Curating Information in Computational Neuroscience.

Matthew Shardlow, Meizhi Ju, Maolin Li, Christian O'Reilly, Elisabetta Iavarone, John McNaught, Sophia Ananiadou.

Abstract

The curation of neuroscience entities is crucial to ongoing efforts in neuroinformatics and computational neuroscience, such as those deployed in large-scale brain modelling projects. However, manually sifting through thousands of articles for new information about modelled entities is a painstaking and low-reward task. Text mining can help a curator extract relevant information from this literature in a systematic way. We propose the application of text mining methods to the neuroscience literature. Specifically, two computational neuroscientists annotated a corpus of entities pertinent to neuroscience, using active learning techniques to enable swift, targeted annotation. We then trained machine learning models to recognise the annotated entities. The entities covered are Neuron Types; Brain Regions; Experimental Values; Units; Ion Currents, Channels, and Conductances; and Model Organisms. We tested a traditional rule-based approach, a conditional random field, and a deep learning model for named entity recognition, and found that the deep learning model performed best. Our final results show that we can detect a range of named entities of interest to the neuroscientist with a macro-average precision, recall and F1 score of 0.866, 0.817 and 0.837, respectively. The contributions of this work are as follows: 1) We provide a set of Named Entity Recognition (NER) tools that detect neuroscience entities with performance above or similar to prior work. 2) We propose a methodology for training NER tools for neuroscience that requires very little training data to achieve strong performance and can be adapted to any sub-domain within neuroscience. 3) We provide a small corpus with annotations for multiple entity types, as well as annotation guidelines to help others reproduce our experiments.
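
The macro-averaged scores reported in the abstract can be illustrated with a minimal sketch. It assumes the usual definition of macro averaging, in which precision, recall and F1 are computed per entity type and then averaged with equal weight; the abstract does not spell out the exact procedure. The entity-type names, the counts, and the helper functions `prf` and `macro_average` are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# Hypothetical per-entity-type counts: (true positives, false positives, false negatives).
# Illustrative values only; these are NOT results from the paper.
counts: Dict[str, Tuple[int, int, int]] = {
    "NeuronType": (80, 10, 20),
    "BrainRegion": (90, 10, 10),
    "Unit": (50, 0, 50),
}


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 for one entity type (0.0 when undefined)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(per_type: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Unweighted mean of per-type precision, recall and F1."""
    scores = [prf(*c) for c in per_type.values()]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


macro_p, macro_r, macro_f1 = macro_average(counts)

# The harmonic mean of the macro precision and macro recall is a different quantity.
harmonic = 2 * macro_p * macro_r / (macro_p + macro_r)

print(f"macro P={macro_p:.3f} R={macro_r:.3f} F1={macro_f1:.3f}")
print(f"harmonic mean of macro P and macro R = {harmonic:.3f}")
```

Under this definition the macro F1 is the mean of the per-type F1 scores, so it need not equal the harmonic mean of the macro precision and macro recall. This is consistent with the reported figures: the harmonic mean of 0.866 and 0.817 is about 0.841, whereas the reported macro F1 is 0.837.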

Keywords:  Annotation; Conditional random field; Corpus; Data curation; Data mining; Deep learning; Named entity recognition; Text mining

Year:  2019        PMID: 30443819      PMCID: PMC6594987          DOI: 10.1007/s12021-018-9404-y

Source DB:  PubMed          Journal:  Neuroinformatics        ISSN: 1539-2791


