Christian R Bauer, Carolin Knecht, Christoph Fretter, Benjamin Baum, Sandra Jendrossek, Malte Rühlemann, Femke-Anouska Heinsen, Nadine Umbach, Bodo Grimbacher, Andre Franke, Wolfgang Lieb, Michael Krawczak, Marc-Thorsten Hütt, Ulrich Sax.
Abstract
Electronic access to multiple data types, from generic information on biological systems at different functional and cellular levels to high-throughput molecular data from human patients, is a prerequisite of successful systems medicine research. However, scientists often encounter technical and conceptual difficulties that forestall the efficient and effective use of these resources. We summarize and discuss some of these obstacles, and suggest ways to avoid or evade them.

The methodological gap between data capture and data analysis is huge in human medical research. Primary data producers often do not fully apprehend the scientific value of their data, whereas data analysts may be ignorant of the circumstances under which the data were collected. Therefore, the provision of easy-to-use data access tools not only helps to improve data quality on the part of the data producers but is also likely to foster an informed dialogue with the data analysts.

We propose a means to integrate phenotypic data, questionnaire data and microbiome data with a user-friendly Systems Medicine toolbox embedded into i2b2/tranSMART. Our approach is exemplified by the integration of a basic outlier detection tool and a more advanced microbiome analysis (alpha diversity) script. Continuous discussion with clinicians, data managers, biostatisticians and systems medicine experts should serve to further enrich the functionality of toolboxes like ours, which are geared towards use by 'informed non-experts' but at the same time attuned to existing, more sophisticated analysis tools.
Keywords: data analysis; data integration; genomics; inflammation; microbiome; systems medicine
Year: 2017 PMID: 27016392 PMCID: PMC5428997 DOI: 10.1093/bib/bbw024
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. Summary statistics of the analyzed microbiome data [27]. Left panel: hierarchy of available data elements; right panel: basic statistics including distribution and overview of the current query result (screenshot from tranSMART 1.2 with Rausch 2011 data [27]). A colour version of this figure is available at BIB online: https://academic.oup.com/bib.
Figure 3. Visualization of alpha-diversity in tranSMART. The current cohort selection can also be grouped according to a categorical variable. The CONTROL and CD subcohorts were queried independently, and the resulting graphs were put next to each other manually. (A) Example of the individual Shannon indices of 46 samples. (B) Group-wise comparison of Shannon indices. For genotype labels, see text. Screenshots are from tranSMART 1.2 with sysINFLAME extensions and Rausch 2011 data [27]. A colour version of this figure is available at BIB online: https://academic.oup.com/bib.
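The Shannon index plotted in Figure 3 can be illustrated with a minimal Python sketch. This is not the sysINFLAME R script: the function name, the sample names and the count vectors are hypothetical, and the natural logarithm is an assumption, since the record does not state which base was used.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero counts.

    `counts` holds per-taxon read counts (e.g. per OTU) for one sample.
    The natural logarithm is an assumption made for this sketch.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no reads")
    # Taxa with zero reads contribute nothing to the sum and are skipped.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical OTU count vectors, one per sample (not data from the study).
samples = {
    "sample_even": [25, 25, 25, 25],
    "sample_skewed": [97, 1, 1, 1],
    "sample_single": [100, 0, 0, 0],
}

# One Shannon index per sample, as in panel (A) of the figure.
indices = {name: shannon_index(counts) for name, counts in samples.items()}

for name, h in indices.items():
    print(f"{name}: H = {h:.3f}")
```

An evenly distributed community yields the highest index for a given number of taxa, whereas a sample dominated by a single taxon yields an index close to zero.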
Figure 2. Visualization of outliers by means of an R script in tranSMART. An outlier of the age distribution is indicated with a label and red color on the right (Table 1). Screenshot from tranSMART 1.2 with sysINFLAME extension.
Table 1. Output of the sysINFLAME tranSMART extension, generated by an R script [32], to identify statistical outliers

| Variable | Outlier | Value | Test |
|---|---|---|---|
| Age | Pat13 | 60 | Grubbs |
| Age at onset | Pat05; Pat14 | 55.6; 27.7 | Grubbs |
| Height | No outlier | | Grubbs |
| Weight | No outlier | | Grubbs |
Note. This simple example includes a patient (‘Pat13’) whose age was the maximum in the cohort, which qualified them as an outlier. For age at onset, two outliers were identified, whereas neither height nor weight showed any outliers.
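The Grubbs test named in the table can be sketched in Python as follows. This is an illustration only, not the R script [32] used by the extension: the function names and the age values are hypothetical, and the critical value is passed in by the caller rather than derived from the t distribution. The value 2.290 used here is the tabulated two-sided critical value for n = 10 at a significance level of 0.05.

```python
import statistics


def grubbs_statistic(values):
    """Return (index, G) for the value farthest from the sample mean.

    G = max|x_i - mean| / s, where s is the sample standard deviation.
    """
    if len(values) < 3:
        raise ValueError("Grubbs test needs at least three observations")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all values are identical")
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return idx, abs(values[idx] - mean) / sd


def grubbs_outlier(values, critical_value):
    """Return (index or None, G); the index is set only if G exceeds the cutoff.

    `critical_value` must match the sample size and significance level
    (assumption: the caller looks it up or computes it separately).
    """
    idx, g = grubbs_statistic(values)
    return (idx if g > critical_value else None), g


# Hypothetical ages for ten patients (not data from the study).
ages = [31, 34, 29, 36, 33, 30, 35, 32, 34, 60]

outlier_idx, g = grubbs_outlier(ages, critical_value=2.290)
if outlier_idx is None:
    print(f"No outlier (G = {g:.3f})")
else:
    print(f"Outlier: value {ages[outlier_idx]} at index {outlier_idx} (G = {g:.3f})")
```

A single pass of the test flags at most one value; reporting two outliers for one variable, as for age at onset in the table, would require repeating the test after removing the first, which this sketch does not do.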
Challenges of systems medicine toolboxes and problem-solving as exemplified by the microbiome showcase described in the main text
| Challenge | Approach in the microbiome showcase |
|---|---|
| Internal standardization | Quality control and statistical tests |
| External data sources | Usage of RDP OTU identifiers as nomenclature for the microbial species |
| Storage of intermediate results | Not required here |
| Ontologies and external standardization | Use of accepted and standardized measures and compliance with the TMF recommendations [ |