Extracting and modeling geographic information from scientific articles.

Elise Acheson, Ross S Purves.

Abstract

Scientific articles often contain relevant geographic information such as where field work was performed or where patients were treated. Most often, this information appears in the full-text article contents as a description in natural language including place names, with no accompanying machine-readable geographic metadata. Automatically extracting this geographic information could help conduct meta-analyses, find geographical research gaps, and retrieve articles using spatial search criteria. Research on this problem is still in its infancy, with many works manually processing corpora for locations and few cross-domain studies. In this paper, we develop a fully automatic pipeline to extract and represent relevant locations from scientific articles, applying it to two varied corpora. We obtain good performance, with full pipeline precision of 0.84 for an environmental corpus, and 0.78 for a biomedical corpus. Our results can be visualized as simple global maps, allowing human annotators to both explore corpus patterns in space and triage results for downstream analysis. Future work should not only focus on improving individual pipeline components, but also be informed by user needs derived from the potential spatial analysis and exploration of such corpora.

Year:  2021        PMID: 33406109      PMCID: PMC7787447          DOI: 10.1371/journal.pone.0244918

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


  10 in total

1.  Geographic searching for ecological studies: a new frontier.

Authors:  Jason W Karl; Jeffrey K Gillan; Jeffrey E Herrick
Journal:  Trends Ecol Evol       Date:  2013-05-21       Impact factor: 17.712

2.  A high-precision rule-based extraction system for expanding geospatial metadata in GenBank records.

Authors:  Tasnia Tahsin; Davy Weissenbacher; Robert Rivera; Rachel Beard; Mari Firago; Garrick Wallstrom; Matthew Scotch; Graciela Gonzalez
Journal:  J Am Med Inform Assoc       Date:  2016-01-17       Impact factor: 4.497

3.  EnvMine: a text-mining system for the automatic extraction of contextual information.

Authors:  Javier Tamames; Victor de Lorenzo
Journal:  BMC Bioinformatics       Date:  2010-06-01       Impact factor: 3.169

4.  Text mining for literature review and knowledge discovery in cancer risk assessment and research.

Authors:  Anna Korhonen; Diarmuid O Séaghdha; Ilona Silins; Lin Sun; Johan Högberg; Ulla Stenius
Journal:  PLoS One       Date:  2012-04-12       Impact factor: 3.240

5.  Knowledge-driven geospatial location resolution for phylogeographic models of virus migration.

Authors:  Davy Weissenbacher; Tasnia Tahsin; Rachel Beard; Mari Figaro; Robert Rivera; Matthew Scotch; Graciela Gonzalez
Journal:  Bioinformatics       Date:  2015-06-15       Impact factor: 6.937

6.  Extracting geographic locations from the literature for virus phylogeography using supervised and distant supervision methods.

Authors:  Davy Weissenbacher; Abeed Sarker; Tasnia Tahsin; Matthew Scotch; Graciela Gonzalez
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2017-07-26

7.  What's missing in geographical parsing?

Authors:  Milan Gritta; Mohammad Taher Pilehvar; Nut Limsopatham; Nigel Collier
Journal:  Lang Resour Eval       Date:  2017-03-07       Impact factor: 1.358

8.  Bi-directional Recurrent Neural Network Models for Geographic Location Extraction in Biomedical Literature.

Authors:  Arjun Magge; Davy Weissenbacher; Abeed Sarker; Matthew Scotch; Graciela Gonzalez-Hernandez
Journal:  Pac Symp Biocomput       Date:  2019

9.  World citation and collaboration networks: uncovering the role of geography in science.

Authors:  Raj Kumar Pan; Kimmo Kaski; Santo Fortunato
Journal:  Sci Rep       Date:  2012-11-29       Impact factor: 4.379

10.  Progenetix: 12 years of oncogenomic data curation.

Authors:  Haoyang Cai; Nitin Kumar; Ni Ai; Saumya Gupta; Prisni Rath; Michael Baudis
Journal:  Nucleic Acids Res       Date:  2013-11-12       Impact factor: 16.971
