Shaoke Lou, Tianxiao Li, Daniel Spakowicz, Xiting Yan, Geoffrey Lowell Chupp, Mark Gerstein.
Abstract
BACKGROUND: The pathogenesis of asthma is a complex process involving multiple genes and pathways. Identifying biomarkers from asthma datasets, especially those that include heterogeneous subpopulations, is challenging. Potentially, autoencoders provide ideal frameworks for such tasks as they can embed complex, noisy high-dimensional gene expression data into a low-dimensional latent space in an unsupervised fashion, enabling us to extract distinguishing features from expression data.
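To illustrate the idea of embedding noisy expression data into a low-dimensional latent space, the following is a minimal NumPy sketch of a one-hidden-layer denoising autoencoder. It is not the authors' implementation: the masking noise, sigmoid encoder, linear decoder, squared-error loss, plain gradient descent, and all names and defaults (`train_denoising_autoencoder`, `encode`, `n_hidden`, `noise_frac`, `lr`, `epochs`) are assumptions chosen for illustration. The input is assumed to be a samples-by-genes matrix scaled to [0, 1].

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_denoising_autoencoder(X, n_hidden=8, noise_frac=0.2,
                                lr=0.5, epochs=300, seed=0):
    """Fit a one-hidden-layer denoising autoencoder by gradient descent.

    X is a (n_samples, n_genes) array. All hyperparameters are
    illustrative assumptions, not values taken from the paper.
    Returns the learned parameters and the per-epoch reconstruction loss.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_genes = X.shape
    W_enc = rng.normal(0.0, 0.1, (n_genes, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, (n_hidden, n_genes))
    b_dec = np.zeros(n_genes)
    losses = []

    for _ in range(epochs):
        # Corrupt the input: zero out a random fraction of entries.
        mask = rng.random(X.shape) >= noise_frac
        X_noisy = X * mask

        # Forward pass: encode the corrupted input, decode linearly.
        H = sigmoid(X_noisy @ W_enc + b_enc)
        X_hat = H @ W_dec + b_dec

        # The loss compares the reconstruction with the CLEAN input.
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))

        # Backpropagate the mean squared error.
        grad_out = 2.0 * err / err.size
        grad_W_dec = H.T @ grad_out
        grad_b_dec = grad_out.sum(axis=0)
        grad_pre = (grad_out @ W_dec.T) * H * (1.0 - H)
        grad_W_enc = X_noisy.T @ grad_pre
        grad_b_enc = grad_pre.sum(axis=0)

        W_enc -= lr * grad_W_enc
        b_enc -= lr * grad_b_enc
        W_dec -= lr * grad_W_dec
        b_dec -= lr * grad_b_dec

    params = {"W_enc": W_enc, "b_enc": b_enc, "W_dec": W_dec, "b_dec": b_dec}
    return params, losses


def encode(X, params):
    """Map uncorrupted samples to their hidden-unit (latent) values."""
    return sigmoid(X @ params["W_enc"] + params["b_enc"])
```

After training, `encode(X, params)` gives one row of hidden-unit values per sample. In a setup like this, each column of `W_enc` holds the per-gene encoder weights of one hidden unit, which is the kind of quantity that can be ranked to find top-weighted genes.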
Keywords: Asthma; Asthma subtypes; Biomarker; Denoising autoencoder; Non-invasive
Year: 2020 PMID: 33059594 PMCID: PMC7560063 DOI: 10.1186/s12859-020-03785-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Denoising autoencoder model architecture
Fig. 2 Identification of clinically relevant hidden units. a Heatmap of hidden layer embeddings. The sidebar indicates the TEA cluster assignment of the sample. b Hidden units that are negatively correlated with the TEA cluster label (H26, H27). c Hidden units that are positively correlated with the TEA cluster label (H36, H38, H45)
Fig. 3 Annotation of hidden units. a Distribution of encoder weights of Hsig. The positive and negative sets show very different distribution patterns. b Gene set enrichment of the asthma pathway from KEGG (KEGG_ASTHMA) for H26 and H38. c Intersection between top weighted 200 genes of the positively (H26, H27) and negatively (H36, H38, H45) related hidden units (duplicates removed)
Fig. 4 Prediction of asthma severity and feature selection. a AUROC of the prediction of asthma severity using selected genes compared to randomly selected genes. b Importance of the selected genes. c GO term enrichment of the selected genes. d Selected genes in the context of a PPI network
Fig. 5 Prediction of FEV1/FVC ratio. a Spearman correlation between hidden unit values and pre-/post-treatment FEV1/FVC. b Plot of predicted value versus true value of pre-treatment FEV1/FVC using support vector machine regression with selected gene expression. c Plot of predicted value versus true value of post-treatment FEV1/FVC using support vector machine regression with selected gene expression