Shreejoy J Tripathy, Judith Savitskaya, Shawn D Burton, Nathaniel N Urban, Richard C Gerkin.
Abstract
The behavior of neural circuits is determined largely by the electrophysiological properties of the neurons they contain. Understanding the relationships of these properties requires the ability to first identify and catalog each property. However, information about such properties is largely locked away in decades of closed-access journal articles with heterogeneous conventions for reporting results, making it difficult to utilize the underlying data. We solve this problem through the NeuroElectro project: a Python library, RESTful API, and web application (at http://neuroelectro.org) for the extraction, visualization, and summarization of published data on neurons' electrophysiological properties. Information is organized both by neuron type (using neuron definitions provided by NeuroLex) and by electrophysiological property (using a newly developed ontology). We describe the techniques and challenges associated with the automated extraction of tabular electrophysiological data and methodological metadata from journal articles. We further discuss strategies for how to best combine, normalize and organize data across these heterogeneous sources. NeuroElectro is a valuable resource for experimental physiologists attempting to supplement their own data, for computational modelers looking to constrain their model parameters, and for theoreticians searching for undiscovered relationships among neurons and their properties.
Keywords: API; database; electrophysiology; machine learning; metadata; natural language processing; neuroinformatics; text-mining
Year: 2014 PMID: 24808858 PMCID: PMC4010726 DOI: 10.3389/fninf.2014.00040
Source DB: PubMed Journal: Front Neuroinform ISSN: 1662-5196 Impact factor: 4.081
Figure 1. Illustration of workflow for obtaining electrophysiological information from the research literature.
Statistics of journals represented in the NeuroElectro database.
| Journal | Articles obtained | Validated | Not validated |
| --- | --- | --- | --- |
| J. Neurosci. | 19,002 | 104 | 560 |
| J. Neurophysiol. | 12,078 | 94 | 555 |
| J. Physiol. (Lond.) | 10,543 | 44 | 235 |
| Neuroscience | 3,035 | 14 | 205 |
| Eur. J. Neurosci. | 2,495 | 7 | 117 |
| Brain Res. | 3,017 | 7 | 146 |
| Neuron | 1,657 | 4 | 43 |
| Epilepsia | 463 | 2 | 23 |
| Neurosci. Lett. | 1,468 | 2 | 34 |
| Hippocampus | 208 | 2 | 10 |
Listing of journals and counts of articles downloaded (articles obtained), articles with published data tables containing neurophysiological information which has been manually validated by an expert curator (validated), and articles which likely contain information in a data table which has not yet been manually curated (not validated). Not validated articles are those which have at least four algorithmically assigned electrophysiological concepts within data tables.
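The rule in the note above can be sketched in a few lines of Python. This is a minimal illustration, not the NeuroElectro implementation: the `Article` class, its field names, and the `"below threshold"` label for articles that meet neither criterion are all hypothetical. Only the threshold of four concepts comes from the note.

```python
from dataclasses import dataclass

# Threshold stated in the table note: at least four algorithmically
# assigned electrophysiological concepts within data tables.
MIN_TABLE_CONCEPTS = 4


@dataclass
class Article:
    """Hypothetical record for one downloaded article."""
    title: str
    n_table_concepts: int    # concepts assigned algorithmically in data tables
    curator_validated: bool  # True once an expert curator has checked the tables


def curation_status(article: Article) -> str:
    """Assign an article to one of the table's curation categories."""
    if article.curator_validated:
        return "validated"
    if article.n_table_concepts >= MIN_TABLE_CONCEPTS:
        return "not validated"
    # Hypothetical label: the note does not name this remaining group.
    return "below threshold"
```

For example, `curation_status(Article("example", 5, False))` falls in the "not validated" group: it probably holds tabular data but has not yet been curated.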
Figure 2. Illustration of the sources within an article containing information relevant to neuron electrophysiological properties. Data on neuronal electrophysiological properties are presented within article figures and raw traces, sentences within the article text, and formatted data tables. The raw traces and example sentence are from van Brederode et al. (2011) and are reproduced with permission from The American Physiological Society; the data table is a constructed example. Colored text indicates electrophysiological concepts (red), neuron concepts (pink), or neurophysiological data (yellow).
Figure 3. Example data table illustrating mark-up and annotation of entities. (A) Example published data table containing neurophysiological information. The data table is from Pasquale et al. (1997) and is reproduced with permission from The American Physiological Society. (B) Same as (A), but semantically marked up with algorithmic and manually curated annotations. Markups in red and pink indicate electrophysiological and neuron type concepts, respectively, and yellow indicates extracted data measurements. Note that here the text strings “+/+” and “stg/stg” refer to the normotypic and manipulated conditions, respectively. Panels (A) and (B) are screenshots taken from the NeuroElectro web interface.
Figure 4. Example of human validation of algorithmically assigned content. All textual elements of a table are enhanced using HTML and JavaScript so that neuron or electrophysiological concepts can be assigned using drop-down menus. The example data table is from Pasquale et al. (1997) and is reproduced with permission from The American Physiological Society.
A partial listing of metadata attributes and extraction methodology.
| Metadata attribute / value | Extraction method | Regular expression | MeSH term |
| --- | --- | --- | --- |
| Species | MeSH term only | | |
| Rats | | | Rats |
| Mice | | | Mice |
| Guinea pigs | | | Guinea pigs |
| Electrode type | MeSH term + Regex | | |
| Patch-clamp | | “Whole cell” or “patch clamp” | Patch-clamp techniques |
| Sharp | | “Sharp electrode” | |
| Animal strain | MeSH term only | | |
| Fischer 344 | | | Rats, Inbred F344 |
| Long-Evans | | | Rats, Long-Evans |
| Sprague-Dawley | | | Rats, Sprague-Dawley |
| Wistar | | | Rats, Wistar |
| C57BL | | | Mice, Inbred C57BL |
| BALB C | | | Mice, Inbred BALB C |
| Preparation type | MeSH term + Regex | | |
| | | “Slice” or “ | |
| | | “ | |
| Cell culture | | “Culture” | Cell culture techniques |
| Model | | “Model” | Computer simulation |
| Junction potential | Regex | | |
| Not corrected | | “Not junction potential” | |
| Corrected | | “Junction potential” | |
| Recording temperature | Regex | | |
| Continuous value | | “Record… C” or “experiment C” | |
| Room temperature | | “Record room temperature” | |
| Animal age | Regex | | |
| Continuous value | | Find digits near: “P#-#” or “P#-P#” | |
Metadata attributes are extracted by combining PubMed Medical Subject Heading terms (MeSH terms) with custom regular expressions (Regex). The Regular expression column (or MeSH term column) lists the specific regular expressions (or MeSH terms) used to identify each metadata concept entity.
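As a rough illustration of how MeSH terms and regular expressions might be combined, the sketch below handles three attributes from the table: species (MeSH only), electrode type (MeSH plus regex), and animal age (regex only). The function name `extract_metadata`, the output keys, and the exact patterns are assumptions modeled on the table rows; they are not the expressions used by NeuroElectro.

```python
import re

# MeSH terms from the table, lower-cased for case-insensitive matching.
SPECIES_MESH = {"rats": "Rats", "mice": "Mice", "guinea pigs": "Guinea pigs"}

# Illustrative patterns modeled on the table's regular expression column.
PATCH_RE = re.compile(r"whole[\s-]cell|patch[\s-]clamp", re.IGNORECASE)
SHARP_RE = re.compile(r"sharp\s+electrode", re.IGNORECASE)
AGE_RE = re.compile(r"\bP(\d+)\s*(?:-|to)\s*P?(\d+)\b")  # "P14-21" or "P14-P21"


def extract_metadata(mesh_terms, methods_text):
    """Return a dict of metadata guesses for one article."""
    mesh = {term.lower() for term in mesh_terms}
    result = {}

    # Species: MeSH term only.
    species = [label for key, label in SPECIES_MESH.items() if key in mesh]
    if species:
        result["species"] = species[0]

    # Electrode type: MeSH term or regex match in the methods text.
    if "patch-clamp techniques" in mesh or PATCH_RE.search(methods_text):
        result["electrode_type"] = "Patch-clamp"
    elif SHARP_RE.search(methods_text):
        result["electrode_type"] = "Sharp"

    # Animal age: digits near a postnatal-day range.
    match = AGE_RE.search(methods_text)
    if match:
        result["animal_age_days"] = (int(match.group(1)), int(match.group(2)))
    return result
```

Calling `extract_metadata(["Rats"], "Whole-cell recordings were made from P14-P21 animals.")` yields a species, an electrode type, and an age range. Attributes with no matching term or pattern are simply left out of the result, which leaves room for a curator to fill them in.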
Figure 5. Accuracy of metadata assignment using automated methods alone. Error bars indicate 95% binomial confidence intervals.
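The caption does not say how the binomial confidence intervals were computed. One common choice is the Wilson score interval, sketched below with the Python standard library only. The choice of method and the function name `wilson_interval` are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a binomial proportion.

    Assumption: the source does not name its interval method;
    Wilson is used here only as a representative example.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    # Clamp to [0, 1] to absorb floating-point round-off at the extremes.
    return max(0.0, center - half), min(1.0, center + half)
```

With a made-up example of 45 correct assignments out of 50 articles, `wilson_interval(45, 50)` returns a lower and upper bound that bracket the observed accuracy of 0.9. The interval is asymmetric around 0.9 because the proportion is close to 1.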