Jana Blazkova, Sabri Boughorbel, Scott Presnell, Charlie Quinn, Damien Chaussabel.
Abstract
Compendia of large-scale datasets available in public repositories provide an opportunity to identify and fill current gaps in biomedical knowledge. But first, these data need to be readily accessible to research investigators for interpretation. Here, we make available a collection of transcriptome datasets relevant to HIV infection. A total of 2717 unique transcriptional profiles distributed among 34 datasets were identified, retrieved from the NCBI Gene Expression Omnibus (GEO), and loaded in a custom web application, the Gene Expression Browser (GXB), designed for interactive query and visualization of integrated large-scale data. Multiple sample groupings and rank lists were created to facilitate dataset query and interpretation via this interface. Web links to customized graphical views can be generated by users and subsequently inserted in manuscripts reporting novel findings, such as discovery notes. The tool also enables browsing of a single gene across projects, which can provide new perspectives on the role of a given molecule across biological systems. This curated dataset collection is available at: http://hiv.gxbsidra.org/dm3/geneBrowser/list.
Keywords: Big Data; Bioinformatics; HIV; Immune Response; Software; Transcriptomics
Year: 2016 PMID: 27134731 PMCID: PMC4838008 DOI: 10.12688/f1000research.8204.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. Sample source composition of the dataset collection.
Pie charts representing the numbers of datasets (a) or transcriptome profiles (b) for different cell types and tissues.
List of datasets constituting the collection, also available at http://hiv.gxbsidra.org/dm3/geneBrowser/list.
| Title | Platform | Number | Sample | Validation | GEO ID | Ref |
|---|---|---|---|---|---|---|
| | Illumina | 107 | Whole | N/A | | |
| | Affymetrix | 96 | PBMC | N/A | | |
| | Illumina | 77 | CD4+ | | | |
| | Affymetrix | 20 | CD4+ | | | |
| | Illumina | 79 | CD4+ | CD4, | | |
| | Affymetrix | 18 | CD4+ | | | |
| | Affymetrix | 42 | CD8+ | | | |
| | Affymetrix | 40 | CD4+ | | | |
| | Illumina | 72 | PBMC | | | |
| | Affymetrix | 3 | PBMC | | | |
| | Affymetrix | 8 | mDC | CD11c | | |
| | Affymetrix | 6 | GALT | N/A | | |
| | Affymetrix | 86 | Whole | N/A | | |
| | Affymetrix | 15 | Mono | | | |
| | Illumina | 44 | PBMC | | | |
| | Illumina | 24 | PBMC | | | |
| | Affymetrix | 8 | Brain | | | |
| | Affymetrix | 45 | PBMC | N/A | | |
| | Illumina | 202 | CD4+ | | | |
| | Illumina | 491 | Whole | N/A | | |
| | Illumina | 26 | Whole | N/A | | |
| | Illumina | 537 | Whole | N/A | | |
| | Illumina | 87 | PBMC | N/A | | |
| | Affymetrix | 13 | Adipose | N/A | | |
| | Affymetrix | 16 | Mono | | | |
| | Affymetrix | 52 | Lymph | | | |
| | Affymetrix | 17 | Tissues | | | |
| | Affymetrix | 72 | Brain | | | |
| | Affymetrix | 42 | Lymph | | | |
| | Illumina | 40 | CD4+ | | | |
| | Illumina | 86 | Mono | | | |
| | Illumina | 14 | Mono | | | |
| | Illumina | 47 | Whole | | | |
| | Illumina | 185 | Whole | | | |
Figure 2. Thematic composition of the dataset collection.
Word frequencies extracted from titles of the studies loaded into the GXB tool are depicted as a word cloud. The size of the word is proportional to its frequency.