Reem Al-Jawahiri, Elizabeth Milne.
Abstract
Recently, there has been a move, encouraged by many stakeholders, towards generating big, open data in many areas of research. One area where big, open data is particularly valuable is research relating to complex heterogeneous disorders such as Autism Spectrum Disorder (ASD). The inconsistency of findings and the great heterogeneity of ASD necessitate the use of big and open data to tackle important challenges such as understanding and defining the heterogeneity and potential subtypes of ASD. To this end, a number of initiatives have been established that aim to develop big and/or open data resources for autism research. In order to provide a useful data reference for autism researchers, a systematic search for ASD data resources was conducted using the Scopus database, the Google search engine, and the pages on 'recommended repositories' by key journals, and the findings were translated into a comprehensive list focused on ASD data. The aim of this review is to systematically search for all available ASD data resources providing the following data types: phenotypic, neuroimaging, human brain connectivity matrices, human brain statistical maps, biospecimens, and ASD participant recruitment. A total of 33 resources were found containing different types of data from varying numbers of participants. A description of the data available from each resource, and links to each resource, are provided. Moreover, key implications are addressed and underrepresented areas of data are identified.
Keywords: ASD; Autism spectrum disorder; Biobanks; Data sharing; Databases; Heterogeneous disorders; Neuroimaging data; Open science; Phenotypic data; Subtyping
Year: 2017; PMID: 28097074; PMCID: PMC5237363; DOI: 10.7717/peerj.2880
Source DB: PubMed; Journal: PeerJ; ISSN: 2167-8359; Impact factor: 2.984
Search terms.
| Keywords | Database/Search engine | # of results on primary search |
|---|---|---|
| Autism AND (database OR databank OR repository) | Scopus | 1,050 results |
| Autism AND (database OR databank OR repository) | Google | ∼14,400,000 results |
| Autism AND data AND sharing | Scopus | 80 results |
| Autism AND data AND sharing | Google | ∼6,310,000 results |
| Autism AND (database OR databank OR repository) AND (EEG OR MEG) | Scopus | 19 results |
| Autism AND (database OR databank OR repository) AND (EEG OR MEG) | Google | ∼515,000 results |
| Autism AND (database OR databank OR repository) AND MRI | Scopus | 23 results |
| Autism AND (database OR databank OR repository) AND MRI | Google | ∼354,000 results |
| Autism AND (database OR databank OR repository) AND (genetic OR genome) | Scopus | 315 results |
| Autism AND (database OR databank OR repository) AND (genetic OR genome) | Google | ∼493,000 results |
| Autism AND (database OR databank OR repository) AND (phenotype OR phenotypic) | Scopus | 145 results |
| Autism AND (database OR databank OR repository) AND (phenotype OR phenotypic) | Google | ∼524,000 results |
Notes.
Only pages one to three (inclusive) were assessed.
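The search strings in the table follow a simple pattern: a base Boolean query, optionally narrowed by one modality or data-type term, plus one separate data-sharing query. The sketch below is only an illustration of that pattern and is not the authors' tooling; the names `BASE`, `MODIFIERS`, `build_queries`, and `scopus_summary` are hypothetical. The Scopus counts are copied from the table. Their sum is a raw total of primary-search hits, which may include the same record under several queries.

```python
# Base Boolean query shared by five of the six searches in the table.
BASE = "Autism AND (database OR databank OR repository)"

# Narrowing terms appended to the base query (empty string = base query alone).
MODIFIERS = [
    "",
    "(EEG OR MEG)",
    "MRI",
    "(genetic OR genome)",
    "(phenotype OR phenotypic)",
]

# Primary-search result counts reported for Scopus, in table order.
SCOPUS_HITS = [1050, 80, 19, 23, 315, 145]


def build_queries():
    """Return the six search strings in the order they appear in the table."""
    queries = [BASE if not m else f"{BASE} AND {m}" for m in MODIFIERS]
    # The data-sharing query is the second row of the table.
    queries.insert(1, "Autism AND data AND sharing")
    return queries


def scopus_summary():
    """Map each search string to its reported Scopus result count."""
    return dict(zip(build_queries(), SCOPUS_HITS))


if __name__ == "__main__":
    for query, hits in scopus_summary().items():
        print(f"{hits:>5}  {query}")
    print("raw total:", sum(SCOPUS_HITS))
```

Keeping the base query and the narrowing terms separate makes it easy to rerun the same search set later with an added modality term.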
A comprehensive list of ASD data resources.
| URL | Resource | Data type category | Data type | Number of participants with ASD |
|---|---|---|---|---|
| | National Database for Autism Research (NDAR) | Phenotypic, neuroimaging, genetic, omics | Phenotypic, neuroimaging, genetic, omics data | Over 80,203 participants (however this number includes the control participants of the ASD studies). |
| | Simons Foundation Autism Research Initiative (SFARI) | Phenotypic, neuroimaging, genetic | Phenotypic data, biospecimens, genetic data, neuroimaging data, participant recruitment (to recruit SSC families for additional studies) | Over 3,000 participants (SSC), over 200 participants (Simons VIP), 50,000 |
| | Autism Genetic Resource Exchange (AGRE) | Phenotypic, genetic, biospecimens | Phenotypic data, genetic data, biospecimens | Over 1,700 families with over 3,300 ASD participants. |
| | Interactive Autism Network (IAN) | ASD participant recruitment services | Phenotypic data, ASD participant recruitment services | Over 17,000 participants. |
| | Autism Spectrum Database-UK (ASD-UK) | ASD participant recruitment services | Phenotypic data, ASD participant recruitment services | Over 3,000 families. |
| | Autism BrainNet | BioBank | Postmortem brain and related biospecimens | Over 25 donations (since 2014). |
| | Autism Brain Imaging Data Exchange (ABIDE) | Neuroimaging | Resting state functional magnetic resonance imaging (R-fMRI), structural MRI, phenotypic data | 539 participants (ABIDE I), 487 participants (ABIDE II). |
| – | Australian EEG Database (AED) | Neuroimaging | EEG data | 50 participants. |
| | BrainMap | Human brain statistical maps | fMRI, PET, and structural coordinate-based results ( | 70 results/articles relevant to ASD functional data (search using BrainMapWeb). |
| | NeuroVault | Human brain statistical maps | Unthresholded statistical maps, parcellations, and atlases produced by MRI and PET studies | Five studies: 277, 60, 50, 13, 218 participants in each study. |
| | USC Multimodal Connectivity Database | Brain connectivity matrices | Brain connectivity matrices of fMRI and DTI | 42 (fMRI) participants, 51 (DTI) participants. |
| | Dryad | General data repository | lncRNA, MRI, metabolite, MEG | Four studies: two, 34, 12, and 13 participants respectively. |
| | FigShare | General data repository | Phenotypic, statistical, genetic data | – |
| | NIMH Repository and Genomics Resource (NIMH-RGR) | Biospecimens, genetic | Biospecimens (DNA samples and cell lines, Induced Pluripotent Stem Cell (iPSC) and Source Cells), GWAS, genomic sequences | Biospecimens: 4,793 families and 19,359 individuals of which 17,189 have DNA cell lines. Genome-Wide Association Studies (GWAS) Data: four studies (1,232 cases, 739 families, 943 families, 935 families). Sequence data (exome): 2,119 cases. |
| | Avon Longitudinal Study of Parents and Children (ALSPAC) | Phenotypic, clinical, biospecimens, genetic | Phenotypic, clinical, biospecimens, genetic (including GWAS, SNPs, VNTRs, in addition to sequence data from UK10K project available via EGA), ALSPAC data linked with data (e.g., routine health and social records) from external sources, bespoke data | 96 participants (as identified via follow up questionnaires completed by carers for when the proband was nine years old). |
| | Coriell BioRepositories (including Autism Research Resource) | BioBank | Cell cultures, DNA samples, and induced pluripotent stem cells | 158 ASD cases. |
| | NIH NeuroBioBank (NBB) | BioBank | Postmortem brain and related biospecimens | 64 ASD cases. 22 ASD suspected. |
| | Medical Research Council London Neurodegenerative Diseases Brain Bank | BioBank | Postmortem brain and spinal cord tissue | Four ASD cases. |
Notes.
Data-type is described in more detail in File S1.
These data are not yet available; according to the SFARI website, they are intended to be made available at a future date.
There is no website or portal for the AED resource; however, the data is available via email requests to aed@newcastle.edu.au.
The approximate number of ASD participants was found via email correspondence with aed@newcastle.edu.au.
Accurate information regarding the approximate number of participants with ASD is not readily available on the website, due to the nature of the search functionality.
Data specifically from ASD participants are not necessarily available in all the different data types described in this table; further specific enquiries directed to the ALSPAC team are therefore advised.
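Because many resources span several data-type categories, a reader looking for one kind of data has to scan the whole list. The sketch below shows one possible way to hold a few rows of the table as records and filter them by category. It is an assumption-laden illustration and not something provided with the review: the `Resource` class, the `RESOURCES` subset, and the `by_category` helper are hypothetical names, and only six rows are transcribed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One row of the resource table (hypothetical record layout)."""
    name: str
    categories: tuple   # data type categories, as listed in the table
    participants: str   # free-text participant count, copied verbatim


# A small subset of the table, transcribed for illustration only.
RESOURCES = [
    Resource("NDAR", ("Phenotypic", "neuroimaging", "genetic", "omics"),
             "Over 80,203 participants (includes controls)"),
    Resource("SFARI", ("Phenotypic", "neuroimaging", "genetic"),
             "Over 3,000 participants (SSC), over 200 participants (Simons VIP)"),
    Resource("AGRE", ("Phenotypic", "genetic", "biospecimens"),
             "Over 1,700 families with over 3,300 ASD participants"),
    Resource("ABIDE", ("Neuroimaging",),
             "539 participants (ABIDE I), 487 participants (ABIDE II)"),
    Resource("AED", ("Neuroimaging",), "50 participants"),
    Resource("NIH NeuroBioBank", ("BioBank",), "64 ASD cases. 22 ASD suspected."),
]


def by_category(category):
    """Return names of resources listing the category (case-insensitive)."""
    wanted = category.strip().lower()
    return [
        r.name
        for r in RESOURCES
        if wanted in (c.lower() for c in r.categories)
    ]


if __name__ == "__main__":
    print(by_category("neuroimaging"))
```

The participant counts are kept as text because the table mixes participants, families, cases, and donations, which cannot be compared as plain numbers.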
Genetics and omics data resources either from individuals with ASD or containing data relevant to the study of ASD.
| URL | Resource | Data type category | Data type | Notes |
|---|---|---|---|---|
| | MSSNG | Genetic/Genomic | Phenotypic, genomic (whole genome sequencing of blood DNA) | 10,000 participants. However, data from only 3,000 probands is currently available. |
| | Simons Foundation Autism Research Initiative Gene (SFARI Gene) | Gene Catalogue | Animal Model, Protein Interaction (PIN), Gene Scoring, CNV | An up-to-date, manually annotated reference set of ASD-linked genes. |
| | Autism Chromosome Rearrangement Database (ACRD) | Gene Catalogue | Genomic structural variation data: CNVs | A curated catalogue of structural variation related to ASD extracted from publicly available literature and unpublished data. |
| | Autism Knowledgebase (AutismKB) | Gene Catalogue | A collection of genes and variations associated with ASD with annotations | – |
| | National Center for Biotechnology Information (NCBI) | Genetics, omics | A collection of multiple resources: omics and sequencing data | – |
| | European Molecular Biology Laboratory (EMBL-EBI) | Genetics, omics | A collection of multiple resources: omics and sequencing data | – |
| | Universal Protein Resource (UniProt) | Protein sequences | Protein sequences and their annotations | Can be found among EMBL-EBI resources. 91 (reviewed) and 346 (unreviewed) protein records associated with ASD. |
| | The European Genome-phenome Archive (EGA) | Omics: Functional genomics | Interaction of genotype and phenotype (including data from UK10K project) | Can be found among EMBL-EBI resources. |
| | Biological General Repository for Interaction Datasets (BioGRID) | Omics | Genetic and protein interaction data | Resource that archives and disseminates genetic and protein interaction data. |
| | Global Proteome Machine Database (GPM DB) | Omics: Proteomics | Proteomics data from tandem mass spectrometry | Open-source system for analyzing, storing, and validating proteomics information derived from tandem mass spectrometry. |
| | PeptideAtlas | Omics: Proteomics | Peptide sequences, mapping: proteome information/data | A collection of peptides identified in a large set of tandem mass spectrometry proteomics experiments. |
| | DNA DataBank of Japan (DDBJ) | DNA and RNA sequences | DNA and RNA sequences | Annotated collection of all publicly available nucleotide sequences and their translated amino acid sequences. |
| | The Chromosome 7 Annotation Project | DNA sequences | DNA sequence and annotation of the entire human chromosome 7 | 84 cases. |
| | miRBase: the microRNA database | miRNA sequences | miRNA sequences and annotation | – |
| | Sullivan Lab Evidence Project (SLEP) | Genetics, omics | A collection of genes and variations associated with ASD with annotations | Findings from genome wide linkage (GWL), genome wide association (GWA), and microarray (MA) studies for ASD. |
Notes.
The resources listed in this table contain data either from individuals with ASD or data relevant to ASD research that is collected from non-affected individuals (e.g., from individuals with certain genetic profiles or syndromes related to ASD research).