
Data sharing and publishing in the field of neuroimaging.

Janis L Breeze, Jean-Baptiste Poline, David N Kennedy.

Abstract

There is growing recognition of the importance of data sharing in the neurosciences, and in particular in the field of neuroimaging research, in order to best make use of the volumes of human subject data that have been acquired to date. However, a number of barriers, both practical and cultural, continue to impede the widespread practice of data sharing; these include: lack of standard infrastructure and tools for data sharing, uncertainty about how to organize and prepare the data for sharing, and researchers' fears about unattributed data use or missed opportunities for publication. A further challenge is how the scientific community should best describe and/or reference shared data that is used in secondary analyses. Finally, issues of human research subject protections and the ethical use of such data are an ongoing source of concern for neuroimaging researchers. One crucial issue is how producers of shared data can and should be acknowledged and how this important component of science will benefit individuals in their academic careers. While we encourage the field to make use of these opportunities for data publishing, it is critical that standards for metadata, provenance, and other descriptors are used. This commentary outlines the efforts of the International Neuroinformatics Coordinating Facility Task Force on Neuroimaging Datasharing to coordinate and establish such standards, as well as potential ways forward to relieve the issues that researchers who produce these massive, reusable community resources face when making the data rapidly and freely available to the public. Both the technical and human aspects of data sharing must be addressed if we are to go forward.

Year:  2012        PMID: 23587272      PMCID: PMC3626511          DOI: 10.1186/2047-217X-1-9

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524


Background

With the worldwide push for more open science and data sharing [1], it is an ideal time to consider the current state of data sharing in neuroscience, and in particular neuroimaging research. A huge amount of neuroimaging data has been acquired around the world; a recent literature search on PubMed led to an estimate of 12 000 datasets or 144 000 scans (around 55 petabytes of data) over the past 10 years, but only a few percent of such data is available in public repositories. Over the past two years, the International Neuroinformatics Coordinating Facility (http://www.incf.org) has investigated barriers to data sharing through task force working groups and public workshops, and has identified a number of roadblocks, many of which are readily addressable, that impede researchers from both sharing and making use of existing shared data. These include a lack of simple tools for finding, uploading, and downloading shared data; uncertainty about how best to organize and prepare data for sharing; and concerns about data attribution. Many researchers are also wary of data sharing because of confusion about institutional human research subject protection and the ethical use of such data [2]. Several journals have played a key role in the trend toward making data available to reviewers or readers of a peer-reviewed paper. The Journal of Cognitive Neuroscience was a pioneer in this context, and while the project was probably too ambitious for the capacity of the tools and for the size of the team, the trend for data “on demand” has remained with several high-ranking journals. The requirement to share data and the infrastructure to support this data sharing present numerous associated technical difficulties and costs to the journal. Nonetheless, in the future it may be that both data and computational tools will be made available in some new form of ‘supplementary material’ or associated data warehouse to help track the shared data and its provenance.
A growing and crucial issue is how producers of shared data can and should be acknowledged by third parties who publish papers based on this data. Without such acknowledgement, very little data will ever be shared. A number of journals have launched a new type of article devoted to the description and/or publication of original datasets [3,4]. The benefit of using a publication to ‘mark’ a data release is that credit and reuse are fairly easily tracked with traditional citation and impact metrics. With its ability to host large datasets, GigaScience offers neuroimaging researchers another option to store and share their data, and provides such datasets with a digital object identifier.

Discussion

Given the challenges in carrying out controlled research with human subjects (it simply isn’t possible or ethical to treat people like monkeys), questions in human biomedical research are generally best studied with very large datasets. While meta-analyses offer a possible workaround by aggregating the published results of studies, this is obviously less desirable than working with the raw data themselves. To give one example, though functional neuroimaging research typically reports activation locations using coordinates, Salimi-Khorshidi and colleagues [5] recently showed poor consistency between a study using the original contrast maps and one derived from the coordinates alone. There is little doubt that more and more neuroimaging data will be shared. For example, increased attention to the importance of reproducible research [6] has helped to encourage researchers to make data and analysis tools available as supplementary material at the time of publication. Another impetus is the need for many projects to gather data and communicate with collaborators. Whenever the scientific questions require a large number of scans, longitudinal data, or a very specific patient population that cannot be recruited at one site, researchers from consortia need standard tools to share and curate data and computational tools. The INCF Neuroimaging Datasharing Task Force found that even where enthusiasm for data sharing exists, it is tempered by a number of technical issues that prevent the average neuroimaging researcher from participating fully in the data sharing community. In particular, standards, recommendations, and interoperable, easy-to-use tools for sharing are lacking. In an attempt to improve this situation, the group is working on four projects to be completed by the end of 2012. In brief, (1) a “One-Click Share Tool” will allow researchers to upload MRI data (in DICOM or NIFTI format) to a database hosted at INCF.
A quality control check will provide the uploader with feedback about their data; (2) building on previous efforts, a neuroimaging data description schema and common application programming interface (API) will facilitate communication among databases with different data models; (3) a mechanism to capture related data under a single container will be introduced; (4) metadata and the results of processing streams will be automatically stored to a database, including the previously described quality control workflows and any processed data and metadata. While the lack of tools is an obvious barrier, it is one that we feel can be readily addressed by efforts such as those of INCF and similar initiatives [7-9]. A greater challenge will be the current academic and funding framework in which most researchers exist, which equates career advancement with some count of peer-reviewed publications. Given this climate, it is a great step forward for the community that peer-reviewed journals are now offering an article type devoted to the description and publication of data, along with the recommendations of organizations such as DataCite. This follows similar journal initiatives to publish papers on software code and technology methods, and signifies a stronger valuation of the computational and technical work that makes up a large part of modern biomedical research. Data papers should describe in detail how the data were acquired and with which goals and constraints, assess their quality, explain how they have been reused and how they can or should be reused, and state how to get access, give feedback, and give credit. Datasets are technical and critical building blocks of science and should be recognized as such by high impact and heavy citation, ensuring that creators of data are appropriately acknowledged for their work.
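The items a data paper is expected to cover can be pictured as a simple checklist record. The sketch below is only an illustration under stated assumptions: the class name, field names, and example values are hypothetical and do not correspond to any INCF, DataCite, or journal schema. It shows how a submission might be checked for items that have been left blank.

```python
from dataclasses import dataclass, fields


@dataclass
class DataPaperDescriptor:
    """Hypothetical checklist of what a data paper should report.

    Field names are illustrative assumptions, not a published schema.
    """
    title: str
    acquisition: str            # how the data were acquired
    goals_and_constraints: str  # why, and under which constraints
    quality_assessment: str     # assessment of data quality
    prior_reuse: str            # how the data have already been reused
    recommended_reuse: str      # how they can or should be reused
    access: str                 # how to get access
    feedback: str               # how to give feedback
    credit: str                 # how to give credit (e.g. an identifier)


def missing_items(descriptor):
    """Return the names of checklist items that are empty or whitespace."""
    return [
        f.name
        for f in fields(descriptor)
        if not getattr(descriptor, f.name).strip()
    ]


# Placeholder values, invented purely for illustration.
example = DataPaperDescriptor(
    title="Example resting-state dataset",
    acquisition="3T MRI, single site",
    goals_and_constraints="Test-retest reliability; adults only",
    quality_assessment="",  # left blank on purpose
    prior_reuse="None yet",
    recommended_reuse="Method benchmarking",
    access="Public repository download",
    feedback="Contact the corresponding author",
    credit="Cite the data paper",
)

print(missing_items(example))
```

A check of this kind only confirms that each item has been addressed; judging whether the description is adequate would still fall to reviewers.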

Conclusions

The impact of widespread data sharing on our field should be enormous: it will provide better training opportunities to students by enabling them to work with large amounts of real data; it will alter our interpretation and understanding of the variability of brain function; it will lead to better reproducibility and stronger data analysis and interpretations; and it will lead to new methods and tools for analyzing massive datasets. While we encourage the field to make use of opportunities for data publishing, we realize that standards for metadata, provenance, and other descriptors are critical. INCF’s Task Force on Data Sharing looks forward to working with the community to converge on such standards. All tools and databases hosted by INCF are open-access and are without a doubt strengthened by community feedback.

Competing interests

Jean-Baptiste Poline sits on the INCF Governing Board.

Authors’ contributions

All authors contributed equally to the conception and writing of this manuscript. All authors read and approved the final manuscript.

1.  An invitation to reproducible computational research.

Authors:  David L Donoho
Journal:  Biostatistics       Date:  2010-07       Impact factor: 5.899

2.  More education, less administration: reflections of neuroimagers' attitudes to ethics through the qualitative looking glass.

Authors:  A A Kehagia; K Tairyan; C Federico; G H Glover; J Illes
Journal:  Sci Eng Ethics       Date:  2011-05-28       Impact factor: 3.525

3.  Meta-analysis of neuroimaging data: a comparison of image-based and coordinate-based pooling of studies.

Authors:  Gholamreza Salimi-Khorshidi; Stephen M Smith; John R Keltner; Tor D Wager; Thomas E Nichols
Journal:  Neuroimage       Date:  2008-12-31       Impact factor: 6.556

4.  A call for BMC Research Notes contributions promoting best practice in data standardization, sharing and publication.

Authors:  Iain Hrynaszkiewicz
Journal:  BMC Res Notes       Date:  2010-09-02

5.  Large-scale automated synthesis of human functional neuroimaging data.

Authors:  Tal Yarkoni; Russell A Poldrack; Thomas E Nichols; David C Van Essen; Tor D Wager
Journal:  Nat Methods       Date:  2011-06-26       Impact factor: 28.547

6.  Informatics and data mining tools and strategies for the human connectome project.

Authors:  Daniel S Marcus; John Harwell; Timothy Olsen; Michael Hodge; Matthew F Glasser; Fred Prior; Mark Jenkinson; Timothy Laumann; Sandra W Curtiss; David C Van Essen
Journal:  Front Neuroinform       Date:  2011-06-27       Impact factor: 4.081

7.  Derived Data Storage and Exchange Workflow for Large-Scale Neuroimaging Analyses on the BIRN Grid.

Authors:  David B Keator; Dingying Wei; Syam Gadde; Jeremy Bockholt; Jeffrey S Grethe; Daniel Marcus; Nicole Aucoin; Ibrahim B Ozyurt
Journal:  Front Neuroinform       Date:  2009-09-07       Impact factor: 4.081
