John A Borghi, Ana E Van Gulick.
Abstract
Neuroimaging methods such as magnetic resonance imaging (MRI) involve complex data collection and analysis protocols, which necessitate the establishment of good research data management (RDM). Despite efforts within the field to address issues related to rigor and reproducibility, information about the RDM-related practices and perceptions of neuroimaging researchers remains largely anecdotal. To inform such efforts, we conducted an online survey of active MRI researchers that covered a range of RDM-related topics. Survey questions addressed the type(s) of data collected, tools used for data storage, organization, and analysis, and the degree to which practices are defined and standardized within a research group. Our results demonstrate that neuroimaging data is acquired in multifarious forms, transformed and analyzed using a wide variety of software tools, and that RDM practices and perceptions vary considerably both within and between research groups, with trainees reporting less consistency than faculty. Ratings of the maturity of RDM practices from ad-hoc to refined were relatively high during the data collection and analysis phases of a project and significantly lower during the data sharing phase. Perceptions of emerging practices including open access publishing and preregistration were largely positive, but demonstrated little adoption into current practice.
Year: 2018 PMID: 30011302 PMCID: PMC6047789 DOI: 10.1371/journal.pone.0200562
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Participant titles and research areas.
| A. Participant title or role | Percent | B. Participant research area | Percent |
|---|---|---|---|
| Graduate Student | 24.31 | Cognitive Neuroscience | 55.56 |
| Assistant Professor | 21.53 | Clinical Neuroscience | 15.97 |
| Postdoctoral Fellow | 21.53 | Developmental Neuroscience | 5.56 |
| Associate Professor | 10.42 | Social Neuroscience | 5.56 |
| Professor | 9.03 | Behavioral Neuroscience | 2.08 |
| Research Associate/Scientist | 6.94 | Computational Neuroscience | 2.08 |
| Research Assistant | 2.08 | MRI Methods | 2.08 |
| Research Technician | 1.4 | Bio/Neuroinformatics | 1.39 |
| Staff Scientist | 1.4 | Sensory Systems Neuroscience | 1.39 |
| Other | 1.4 | Affective Neuroscience | 1.39 |
| | | Other | 6.9 |
Characteristics of study participants. A total of 144 neuroimaging researchers participated in our study, though not every participant gave a response for every question. Participants were split between trainees and faculty and between cognitive neuroscience and other research areas. All values listed are percentages.
Fig 1. Ratings of research practice maturity.
Average ratings of research practice maturity on a scale from 1 (ad-hoc) to 5 (refined) between (A) and within (B-D) three phases of an MRI research project (data collection, data analysis, and data sharing). (A) Participants rated their own practices as significantly more mature than those of the field as a whole during the data collection and analysis phases. Ratings of both individual and field maturity were significantly lower during the data sharing/publishing phase than during data collection and analysis. [Data collection: n = 131 (individual), 130 (field), data analysis: n = 118 (individual/field), data sharing: n = 116 (individual/field)]. Ratings of individual activities within each phase reflected a similar trend. (B) Practices related to the backup of raw data and securing of sensitive data were rated as highly mature during the data collection phase, while the documentation of file organization schemes (such as through a lab notebook or data dictionary) received the lowest rating [n = 132]. (C) Similarly, during the data analysis phase, the backup of analyzed data received the highest rating, while the documentation of decisions related to analytical pipelines and the use of computational tools received the lowest [n = 120]. (D) Activities described in the data sharing phase received lower ratings than those in previous phases [n = 116].
Research data management limits and motivations.
| Limit or motivation | Data Collection | Data Analysis | Data Sharing |
|---|---|---|---|
| Limits | | | |
| The amount of time it takes | 69.60 | 71.30 | 79.46 |
| Lack of best practices | 43.20 | 48.70 | 49.11 |
| Lack of incentives | 36.80 | 32.18 | 37.50 |
| Lack of knowledge/training | 32.80 | 40.87 | 41.07 |
| The financial cost | 17.60 | 8.70 | 22.32 |
| Other | 7.20 | 6.09 | 5.36 |
| Motivations | | | |
| Prevent loss of data | 100.00 | 85.83 | 78.57 |
| Ensure access for collaborators | 76.80 | 73.33 | 70.53 |
| Openness and reproducibility | 63.20 | 64.17 | 66.96 |
| Institutional data policy | 52.00 | 39.17 | 47.32 |
| Publisher/Funder Mandates | 35.20 | 28.33 | 41.96 |
| Availability of tools | 12.00 | 9.17 | 8.93 |
| Other | 3.20 | 3.3 | 0.0 |
Limits and motivations for RDM during the data collection, analysis, and sharing/publishing phases of a research project. All values listed are percentage of total participants. More than one response could be selected. For limitations, “Other” responses included changes in personnel, differences in expertise within a lab, differences in preferences between lab members, lack of top-down leadership, and concerns about future cost. For motivations, “Other” responses included ensuring continuity following personnel changes, keeping track of analyses, error prevention, and maximizing efficiency. [Data collection: n = 125 (limits/motivations), Data analysis: n = 115 (limits), 120 (motivations), Data sharing: n = 112 (limits/motivations)].
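Because more than one response could be selected and each phase has its own n, the percentages in a column can add up to more than 100. The following is a minimal sketch of that arithmetic, not the authors' analysis code; the function name `multi_select_percentages` and the example answers are hypothetical and are not taken from the survey data.

```python
from collections import Counter


def multi_select_percentages(responses):
    """Percent of respondents selecting each option.

    `responses` holds one set of selected options per participant who
    answered the question. Participants who skipped the question are left
    out, so the denominator is the per-question n, not the full sample.
    """
    n = len(responses)
    if n == 0:
        return {}
    # Count how many respondents chose each option.
    counts = Counter(option for selected in responses for option in selected)
    return {option: round(100 * count / n, 2) for option, count in counts.items()}


# Hypothetical answers from four participants (illustration only).
example = [
    {"time", "best practices"},
    {"time"},
    {"time", "incentives"},
    {"best practices"},
]
percentages = multi_select_percentages(example)
```

In this toy input the three options sum to 150 percent, which is why the column totals in the table are not expected to equal 100.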
Types of data collected.
| A. MRI Data | Percent | B. Non-MRI data | Percent | C. Study Information | Percent |
|---|---|---|---|---|---|
| Anatomical | 99.24 | Demographics | 97.0 | Acquisition parameters | 97.7 |
| Task-related | 98.54 | Behavioral data | 95.5 | Task information | 97.7 |
| Resting state | 80.30 | Questionnaires | 88.6 | Stimuli | 91.7 |
| Field map | 64.46 | Clinical data | 60.6 | Session information | 90.2 |
| Diffusion | 59.84 | Physiological data | 43.2 | Code (presentation) | 82.6 |
| Other | 11.36 | Genetic data | 30.3 | Code (data collection) | 71.2 |
| | | Other imaging | 25.0 | Other | 7.6 |
| | | Other | 3.8 | | |
Types of data collected: MRI data (A), non-MRI data (B), and study information (C). All values listed are percentages; multiple data types could be selected. (A) Common “Other” responses included spectroscopy, diffusion, blood flow, and MRS. (B) Common “Other” responses included motion tracking, neurophysiology measures, and hormones (saliva). (C) Common “Other” responses included scanner Quality Assurance data, information about the scanner itself, and consent forms.
Analysis software used.
| A. MRI-specific Software (top 10) | Percent | B. Non-MRI-specific Software | Percent |
|---|---|---|---|
| SPM | 71.67 | Matlab | 83.3 |
| FSL | 70.83 | R | 70.0 |
| Freesurfer | 50.00 | Excel | 60.0 |
| AFNI | 45.00 | SPSS | 51.7 |
| MRIcro/MRIcron | 45.00 | Python | 48.3 |
| Mango | 13.33 | SAS | 5.8 |
| CONN | 12.50 | JASP | 5.0 |
| OsiriX | 10.83 | Mathematica | 0.8 |
| Caret | 6.67 | Other | 5.8 |
| Brain Voyager | 6.67 | | |
Software used for analysis: MRI-specific software (the 10 most popular) and non-MRI-specific software. All values listed are percentages; more than one software tool could be selected in each section. Other MRI-specific software described included: Nipype (5.00%), custom code (4.17%), ANTS (4.17%), FMRIprep (2.50%), NiPy (1.17%), ITK-SNAP (1.17%), Connectome Workbench (1.17%), MRIQC (1.17%), CIVET, C-PAC, DPARSF, GIFT, ExploreDTI, CAT, SPHARM, TBSS, fidl, PLS, SamSrF, Vistasoft, and MedIRNIA. Other non-MRI-specific software described included Acknowledge, CIGAL, Fscan, Data Desk, Mplus, Octave, Stan, and Bash.
Reasons why data can and cannot be shared.
| A. Reasons for sharing data | Percent | B. Reasons for not sharing data | Percent |
|---|---|---|---|
| Transparency and openness | 55.14 | Data contains additional findings to publish | 50.43 |
| To enable reuse/reproducibility | 55.14 | Data contains sensitive information | 30.43 |
| To communicate my results | 49.53 | It would take too much time | 25.22 |
| To allow others to check my work | 47.66 | Supervisor doesn’t wish to share | 16.52 |
| Incentives (authorship, citations) | 21.50 | Format of data make sharing difficult | 15.62 |
| Mandated by funder/publisher | 20.56 | Don’t know how | 14.78 |
| Establish intellectual property | 1.87 | Data is proprietary, subject to IP | 1.74 |
| Other | 4.67 | Other | 7.82 |
| NA | 28.04 | Can share, but require citation | 29.57 |
| | | Can share, but require authorship | 9.57 |
More than one reason could be selected. Other reasons given include: Consent (5), laziness, afraid of mishandling, projects that are haphazard.
Important parts of data to preserve long term.
| Data to Preserve Long Term | Percent |
|---|---|
| Raw MRI data | 97.4 |
| Behavioral data | 94.8 |
| Demographic data | 90.5 |
| Task-related stimuli | 90.5 |
| MRI acquisition parameters | 88.8 |
| Code used for stimuli presentation | 84.5 |
| Questionnaire data | 81.9 |
| Code used for data collection | 81.0 |
| Notes/data about the scan session | 74.1 |
| Task-related information | 74.1 |
| Analyzed MRI data | 73.3 |
| Clinical or Medical data | 62.9 |
| Physiological data | 38.8 |
| Eye tracking/pupillometry data | 30.2 |
| Genetic/molecular data | 29.3 |
| Other neuroimaging data | 26.7 |
| Other | 76.7 |
Multiple data types could be selected. Overall, researchers want to preserve nearly all data long term. Other data types indicated to preserve include: code for analysis (3) and hormone information.
Fig 2. Adoption of emerging research practices.
Percent of participants who have adopted each emerging research practice (purple) and who plan to adopt it in the future (blue) [n = 100].