Eugene Lin, Hsien-Yuan Lane.
Abstract
In light of recent advances in biomedical computing, big data science, and precision medicine, there is strong demand for algorithms in machine learning and systems genomics (MLSG) that, together with multi-omics data, can weigh probable phenotype-genotype relationships. MLSG software frameworks are widely used to analyze the hundreds of thousands of multi-omics data points produced by high-throughput technologies. In this study, we reviewed the MLSG software frameworks and their future directions with respect to multi-omics data analysis and integration. Our review focused on recent approaches and technical solutions for MLSG software frameworks that use multi-omics platforms.
Keywords: Genomics; Machine learning; Multi-omics; Pharmacogenomics; Single nucleotide polymorphisms; Systems genomics
Year: 2017 PMID: 28127429 PMCID: PMC5251341 DOI: 10.1186/s40364-017-0082-y
Source DB: PubMed Journal: Biomark Res ISSN: 2050-7771
Summary, strengths, and limitations of each machine learning and systems genomics (MLSG) software framework
| Software framework | Summary | Strength | Limitation |
|---|---|---|---|
| Model-based integration (MBI) | Multiple predictive models are generated from the various multi-omics data types; a final predictive model is then generated from those models. | Predictive models can be consolidated from various multi-omics data types, and each data type can be gathered from a different set of patients with the same phenotype. | It may be challenging to avoid overfitting. |
| Concatenation-based integration (CBI) | Data matrices of the different multi-omics data types are combined into one large input matrix; a predictive model is then generated from that matrix. | Once the large input matrix is formed, it is fairly easy to apply various machine learning methods for analyzing continuous or categorical data. | It may be challenging to combine the data types into one large input matrix. |
| Transformation-based integration (TBI) | Datasets for the various multi-omics data types are first converted into intermediate forms, which are united into a large input matrix; a predictive model is then generated from that matrix. | Unique variables such as patient identifiers can be used to link multi-omics data types and integrate a variety of continuous or categorical data values. | It may be challenging to transform the data into intermediate forms. |
Fig. 1. A flowchart for the model-based integration (MBI) software framework
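The MBI row of the table can be illustrated with a minimal sketch: one model is trained per data type, and a final prediction is derived from those models. Everything concrete here is an assumption made for illustration and is not taken from the reviewed frameworks: the nearest-centroid base model, the majority vote used as the final model, the function names, and the toy data. Note that each data type is trained on its own set of patients, as the table allows.

```python
from collections import Counter


def fit_centroids(rows, labels):
    """Hypothetical base model: mean feature vector (centroid) per phenotype."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(col) for col in zip(*members)]
        for label, members in grouped.items()
    }


def predict_centroid(centroids, row):
    """Return the phenotype whose centroid is closest to the row."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], row))


def fit_mbi(datasets):
    """Train one model per omics data type; datasets maps name -> (rows, labels)."""
    return {name: fit_centroids(rows, labels)
            for name, (rows, labels) in datasets.items()}


def predict_mbi(models, sample):
    """Assumed final model: majority vote over the per-data-type predictions."""
    votes = Counter(predict_centroid(models[name], row)
                    for name, row in sample.items())
    return votes.most_common(1)[0][0]


# Toy data (invented): each data type comes from a different patient set.
datasets = {
    "genotype": ([[0, 1], [0, 0], [2, 2], [2, 1]],
                 ["control", "control", "case", "case"]),
    "expression": ([[0.2], [0.1], [1.9]],
                   ["control", "control", "case"]),
    "methylation": ([[0.8, 0.7], [0.2, 0.1]],
                    ["control", "case"]),
}
models = fit_mbi(datasets)

new_patient = {"genotype": [2, 2], "expression": [1.5], "methylation": [0.7, 0.8]}
prediction = predict_mbi(models, new_patient)
print(prediction)
```

With an even number of data types the vote can tie, in which case `Counter.most_common` returns the label it encountered first; a weighted or learned combiner would be a reasonable alternative.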
Fig. 2. A flowchart for the concatenation-based integration (CBI) software framework
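The CBI row of the table can be sketched in a few lines: per-patient feature vectors from each data type are joined into one wide row, and a single model is fitted to the resulting matrix. The function names, the restriction to patients present in every data type, the 1-nearest-neighbour model, and the toy values are all assumptions chosen for illustration.

```python
def concatenate_omics(*omics_tables):
    """Build one large input matrix from several patient -> features tables.

    Only patients measured in every data type are kept (an assumption;
    imputation of missing data types would be another option).
    """
    shared = sorted(set.intersection(*(set(table) for table in omics_tables)))
    matrix = [[value for table in omics_tables for value in table[patient]]
              for patient in shared]
    return shared, matrix


def predict_1nn(train_rows, train_labels, query):
    """Hypothetical model: label of the closest training row (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, query)) for row in train_rows]
    return train_labels[dists.index(min(dists))]


# Toy data (invented): SNP genotype counts and one expression value per patient.
snp = {"p1": [0, 1], "p2": [2, 2], "p3": [0, 0], "p4": [2, 1]}
expression = {"p1": [0.2], "p2": [1.9], "p3": [0.1], "p4": [2.2], "p5": [1.0]}
phenotype = {"p1": "control", "p2": "case", "p3": "control", "p4": "case"}

patients, X = concatenate_omics(snp, expression)
y = [phenotype[p] for p in patients]

# A new patient must be described in the same concatenated feature order.
prediction = predict_1nn(X, y, [2, 2, 2.0])
print(patients, prediction)
```

Patient `p5` is dropped because it has no SNP data, which hints at the limitation noted in the table: forming the large matrix is the hard part, and in practice the data types would also need comparable scaling before being placed side by side.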
Fig. 3. A flowchart for the transformation-based integration (TBI) software framework
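The TBI row of the table can be sketched as follows: each data type is first converted into an intermediate form, the intermediate forms are united, and a prediction is made from the united matrix. The choices below are assumptions for illustration only: a patient-by-patient Gaussian similarity matrix as the intermediate form, element-wise averaging as the way of uniting the matrices, a most-similar-patient rule as the model, and the function names and toy data. Patient identifiers link the data types, as the table describes.

```python
import math


def similarity_matrix(table, patients, gamma=1.0):
    """Assumed intermediate form: patient-by-patient Gaussian similarity."""
    def rbf(a, b):
        return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))
    return [[rbf(table[p], table[q]) for q in patients] for p in patients]


def unite(matrices):
    """Unite the intermediate matrices by element-wise averaging (one possible choice)."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]


def predict_from_similarity(K, patients, labels, target):
    """Hypothetical model: copy the label of the most similar labelled patient."""
    i = patients.index(target)
    labelled = [j for j, p in enumerate(patients) if p in labels and j != i]
    best = max(labelled, key=lambda j: K[i][j])
    return labels[patients[best]]


# Toy data (invented); patient identifiers link the two data types.
patients = ["p1", "p2", "p3", "p4"]
snp = {"p1": [0, 1], "p2": [2, 2], "p3": [0, 0], "p4": [2, 1]}
expression = {"p1": [0.2], "p2": [1.9], "p3": [0.1], "p4": [2.2]}
labels = {"p1": "control", "p2": "case", "p3": "control"}  # p4 is unlabelled

K = unite([similarity_matrix(snp, patients),
           similarity_matrix(expression, patients)])
prediction = predict_from_similarity(K, patients, labels, "p4")
print(prediction)
```

Because every data type is reduced to the same patient-by-patient shape, continuous and categorical inputs can be mixed as long as a suitable similarity measure is defined for each, which is where the transformation step becomes difficult.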