Lin Jiang, Chao Xue, Sheng Dai, Shangzhen Chen, Peikai Chen, Pak Chung Sham, Haijun Wang, Miaoxin Li.
Abstract
The driver tissues or cell types in which susceptibility genes initiate diseases remain elusive. We develop a unified framework to detect the causal tissues of complex diseases or traits according to the selective expression of disease-associated genes in genome-wide association studies (GWASs). The framework consists of three components that run iteratively to produce a converged prioritization list of driver tissues. It also outputs a list of prioritized genes as a byproduct. We apply the framework to six representative complex diseases or traits with GWAS summary statistics, which leads to the estimation of the lung as an associated tissue of rheumatoid arthritis.
Keywords: Disease driver-tissues; Gene-based association; Genome-wide association study; Susceptibility genes; Tissue-selective expression
Year: 2019 PMID: 31694669 PMCID: PMC6836538 DOI: 10.1186/s13059-019-1801-5
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 The diagram of the unified framework. The unified framework, DESE, consists of three components that run iteratively. The iteration continues until the p values of the estimated tissues become stable. DESE needs two input datasets: the tissue-selective expression profiles and GWAS summary statistics. It outputs estimated driver tissues and prioritized genes
Fig. 2 Pearson correlation of the tissues by original expression and selective expression. a The correlation by original TPM expression values at the transcript level. b The correlation by selective expression, measured by the robust-regression z-score of expression at the transcript level
Fig. 3 Driver tissues estimated by DESE and two existing methods in six representative complex diseases/traits. Note: Each row shows one disease/trait. The first, second, and third columns show the estimated driver tissues according to GTEx transcript-level, GTEx gene-level, and GEO gene-level selective expression, respectively. Each bar denotes the averaged -log10(p) based on four different measures of selective expression. The -log10(p) based on each selective expression measure is denoted by a line. The fourth column shows driver tissues estimated by Ongen et al.’s method, extracted from Supplementary Table 5 of their published paper [10]. The fifth column shows driver tissues estimated by the LDSC-SEG method, extracted from Supplementary Table 6 of their published paper [11]. The pink horizontal line denotes the significance level. The tissues are classified into 15 groups according to anatomy. The tissues are sorted by the averaged -log10(p) on the y axis in descending order. SCZ schizophrenia, BD bipolar disorder, CAD coronary artery disease, RA rheumatoid arthritis, TC total cholesterol
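The comparison in Fig. 2 (tissue-by-tissue Pearson correlation on original versus selective expression) can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the authors' pipeline: the data are simulated, the function names are hypothetical, and the selective-expression score is a simple per-feature median/MAD z-score swapped in for the robust-regression z-score, which this record does not define.

```python
import numpy as np

def tissue_correlation(expr):
    """Pearson correlation between tissues; expr has shape (features, tissues)."""
    return np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)

def median_mad_z(expr):
    """Stand-in selective-expression score (assumption): per-feature median/MAD z-score.

    This is NOT the robust-regression z-score referred to in the figure.
    """
    expr = np.asarray(expr, dtype=float)
    med = np.median(expr, axis=1, keepdims=True)
    mad = np.median(np.abs(expr - med), axis=1, keepdims=True)
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    return (expr - med) / (1.4826 * mad + 1e-12)

def mean_off_diagonal(corr):
    """Average correlation over all pairs of distinct tissues."""
    mask = ~np.eye(corr.shape[0], dtype=bool)
    return corr[mask].mean()

# Simulated data (invented for illustration): 200 features x 6 tissues.
# Each feature has a baseline level shared by all tissues plus tissue-specific noise.
rng = np.random.default_rng(0)
baseline = rng.lognormal(mean=2.0, sigma=1.0, size=(200, 1))
expr = baseline * rng.lognormal(mean=0.0, sigma=0.2, size=(200, 6))

raw_corr = tissue_correlation(expr)                # analogue of panel a
sel_corr = tissue_correlation(median_mad_z(expr))  # analogue of panel b

print("mean off-diagonal, original expression:", mean_off_diagonal(raw_corr))
print("mean off-diagonal, selective expression:", mean_off_diagonal(sel_corr))
```

In this toy setting the shared per-feature baseline makes all tissues look alike on the original scale, and removing it per feature lowers the between-tissue correlation. That is the general motivation for working with selective rather than raw expression.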
The enrichment statistical significance for different minimal expression cutoffs
| Cutoff | Schizophrenia: Brain-anterior cingulate cortex (BA24) | Schizophrenia: Brain-frontal cortex (BA9) | Schizophrenia: Brain-cortex | Bipolar disorder: Brain-cerebellar hemisphere | Bipolar disorder: Brain-cerebellum | Bipolar disorder: Brain-frontal cortex (BA9) |
|---|---|---|---|---|---|---|
| 0.01 | 5.3E−13 | 5.3E−13 | 1.8E−12 | 1.3E−9 | 7.3E−9 | 1.6E−6 |
| 0.5 | 9.3E−8 | 1.3E−7 | 7E−8 | 1.7E−5 | 1.9E−5 | 6.4E−3 |
| 1.0 | 3.8E−7 | 1.5E−6 | 7.1E−7 | 3.0E−5 | 5.0E−5 | 0.024 |

| Cutoff | Coronary artery disease: Artery-coronary | Coronary artery disease: Adrenal gland | Coronary artery disease: Ovary | Rheumatoid arthritis: Small intestine-terminal ileum | Rheumatoid arthritis: Lung | Rheumatoid arthritis: Spleen |
|---|---|---|---|---|---|---|
| 0.01 | 4.3E−6 | 1.7E−5 | 6.1E−6 | 5.5E−11 | 4.2E−9 | 7E−8 |
| 0.5 | 4.2E−4 | 8.9E−3 | 2.9E−5 | 6.3E−7 | 2.9E−8 | 6E−5 |
| 1.0 | 4.2E−3 | 4.7E−2 | 1.8E−4 | 1.2E−6 | 5.2E−8 | 1.7E−4 |

| Cutoff | Total cholesterol: Liver | Total cholesterol: Lung | Total cholesterol: Spleen | Height: Cells-transformed fibroblasts | Height: Heart-atrial appendage | Height: Lung |
|---|---|---|---|---|---|---|
| 0.01 | 6.9E−8 | 3.3E−5 | 3.5E−5 | 1.3E−11 | 6E−12 | 5.3E−11 |
| 0.5 | 9.1E−5 | 1E−2 | 2.6E−2 | 1.9E−2 | 3.7E−3 | 6.6E−5 |
| 1.0 | 9.7E−3 | 6.4E−3 | 5.5E−2 | 4.3E−2 | 5.5E−3 | 1.5E−4 |
Note: The p values of driver tissues were calculated according to the proposed robust-regression z-scores. For a given cutoff x, genes or transcripts having TPM < x in 40 or more tissues were excluded
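The exclusion rule in the table note (drop a gene or transcript when its TPM is below the cutoff x in 40 or more tissues) can be sketched as follows. This is a minimal illustration only: the function name `filter_low_expression`, the parameter `max_low_tissues`, and the toy matrix are hypothetical and not taken from the DESE software.

```python
import numpy as np

def filter_low_expression(tpm, cutoff, max_low_tissues=40):
    """Return a boolean mask over rows (genes or transcripts) to keep.

    A row is excluded when TPM < cutoff in `max_low_tissues` or more tissues,
    following the rule stated in the table note.
    """
    tpm = np.asarray(tpm, dtype=float)
    n_low = (tpm < cutoff).sum(axis=1)  # number of low-expression tissues per row
    return n_low < max_low_tissues

# Toy matrix (values invented for illustration): 3 genes x 50 tissues.
tpm = np.vstack([
    np.full(50, 5.0),                                      # expressed in every tissue
    np.concatenate([np.full(45, 0.0), np.full(5, 9.0)]),   # below cutoff in 45 tissues
    np.concatenate([np.full(39, 0.2), np.full(11, 3.0)]),  # below 0.5 in only 39 tissues
])

keep = filter_low_expression(tpm, cutoff=0.5)
filtered = tpm[keep]
print(keep)
```

Raising the cutoff (0.01, 0.5, 1.0 in the table) can only remove more rows, which is why the table reports the enrichment p values separately for each cutoff.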