Abstract
Linking phenotypes to specific gene expression profiles is an extremely important problem in biology, which has been approached mainly by correlation methods or, more fundamentally, by studying the effects of gene perturbations. However, genome-wide perturbations involve extensive experimental efforts, which may be prohibitive for certain organisms. On the other hand, the characterization of the various phenotypes frequently requires an expert's subjective interpretation, such as a histopathologist's description of tissue slide images in terms of complex visual features (e.g. 'acinar structures'). In this paper, we use Deep Learning to eliminate the inherent subjective nature of these visual histological features and link them to genomic data, thus establishing a more precisely quantifiable correlation between transcriptomes and phenotypes. Using a dataset of whole slide images with matching gene expression data from 39 normal tissue types, we first developed a Deep Learning tissue classifier with an accuracy of 94%. We then searched for genes whose expression correlates with features inferred by the classifier and demonstrated that Deep Learning can automatically derive visual (phenotypical) features that are well correlated with the transcriptome and therefore biologically interpretable. As we are particularly concerned with interpretability and explainability of the inferred histological models, we also developed visualizations of the inferred features and compared them with gene expression patterns determined by immunohistochemistry. This can be viewed as a first step toward bridging the gap between the level of genes and the cellular organization of tissues.
Year: 2020 PMID: 33237966 PMCID: PMC7688140 DOI: 10.1371/journal.pone.0242858
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The main analysis steps.
Fig 2. Whole slide image of thyroid sample GTEX-11NSD-0126.
(A) Whole slide. (B) Detail. (C) Tile of size 512x512.
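The caption above mentions tiles of size 512x512 cut from a whole slide image. As a minimal sketch of how such a tile grid could be laid out, the hypothetical helper `tile_origins` below lists the pixel origin of every full tile. Dropping partial tiles at the right and bottom edges, and the (column, row) indexing, are assumptions made for illustration and are not taken from the paper.

```python
def tile_origins(width, height, tile=512):
    """Return (col, row, x, y) for every full tile in a width x height image.

    Partial tiles at the right and bottom borders are dropped (an assumption).
    """
    return [
        (col, row, col * tile, row * tile)
        for row in range(height // tile)
        for col in range(width // tile)
    ]


# Toy example: a 1300 x 1100 pixel image holds a 2 x 2 grid of full tiles.
grid = tile_origins(1300, 1100)
print(len(grid))   # 4
print(grid[-1])    # (1, 1, 512, 512)
```

A real pipeline would typically also discard background tiles, which this sketch does not attempt.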
Convolutional neural network architectures used in this paper.
For ResNet34, the features considered involve only layers that are not “skipped over”, while for Inception_v3 they involve only elementary layers with non-negative outputs (ReLU, max/avg-pooling) that are not on the auxiliary branch.
| Network | Number of features | Number of trainable parameters | Reference |
|---|---|---|---|
| AlexNet | 2,816 | 57,163,623 | Krizhevsky, 2012 |
| VGG11 | 6,976 | 128,926,119 | Simonyan, 2014 |
| VGG13 | 7,360 | 129,110,631 | Simonyan, 2014 |
| VGG16 | 9,920 | 134,420,327 | Simonyan, 2014 |
| VGG16 layers: (0)Conv2d(3,64) (1)ReLU (2)Conv2d(64,64) (3)ReLU (4)MaxPool2d (5)Conv2d(64,128) (6)ReLU (7)Conv2d(128,128) (8)ReLU (9)MaxPool2d (10)Conv2d(128,256) (11)ReLU (12)Conv2d(256,256) (13)ReLU (14)Conv2d(256,256) (15)ReLU (16)MaxPool2d (17)Conv2d(256,512) (18)ReLU (19)Conv2d(512,512) (20)ReLU (21)Conv2d(512,512) (22)ReLU (23)MaxPool2d (24)Conv2d(512,512) (25)ReLU (26)Conv2d(512,512) (27)ReLU (28)Conv2d(512,512) (29)ReLU (30)MaxPool2d (31)Linear(25088,4096) (32)ReLU (33)Dropout(0.5) (34)Linear(4096,4096) (35)ReLU (36)Dropout(0.5) (37)Linear(4096,39) | | | |
| VGG19 | 12,480 | 139,730,023 | Simonyan, 2014 |
| VGG11_bn | 9,728 | 128,931,623 | Simonyan, 2014 |
| VGG13_bn | 10,304 | 129,116,519 | Simonyan, 2014 |
| VGG16_bn | 14,144 | 134,428,775 | Simonyan, 2014 |
| VGG19_bn | 17,984 | 139,741,031 | Simonyan, 2014 |
| ResNet34 | 28,992 | 21,304,679 | He, 2016 |
| Inception_v3 | 27,712 | 24,453,166 | Szegedy, 2016 |
| VGG16_1FC | 9,920 | 15,693,159 | This paper: same convolutional part as VGG16, but with a single fully connected layer: Conv(VGG16); Dropout(0.5); |
| VGG16_avg1FC | 10,432 | 14,734,695 | This paper: the convolutional part as VGG16, followed by an average pooling layer and a single fully connected layer: Conv(VGG16); |
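The VGG16 entries in the table can be checked against the layer listing with a short calculation. The sketch below assumes 3x3 convolution kernels with bias terms (the standard VGG configuration, not stated in the listing) and counts one feature per output channel of each of the 31 layers of the convolutional part (layers 0 to 30). That reading of "feature" is an interpretation, although it does reproduce the table's numbers. The names `VGG16_CONV`, `VGG16_FC`, `count_features` and `count_parameters` are hypothetical.

```python
# (in_channels, out_channels) for each Conv2d; "M" marks a MaxPool2d layer.
VGG16_CONV = [
    (3, 64), (64, 64), "M",
    (64, 128), (128, 128), "M",
    (128, 256), (256, 256), (256, 256), "M",
    (256, 512), (512, 512), (512, 512), "M",
    (512, 512), (512, 512), (512, 512), "M",
]
# (in_features, out_features) for each Linear layer; 39 tissue classes.
VGG16_FC = [(25088, 4096), (4096, 4096), (4096, 39)]
KERNEL = 3  # assumption: 3x3 kernels, as in standard VGG


def count_features(conv_cfg):
    """One feature per channel of every Conv2d, ReLU and MaxPool2d layer."""
    total = 0
    channels = 0
    for item in conv_cfg:
        if item == "M":
            total += channels        # pooling keeps the channel count
        else:
            channels = item[1]
            total += 2 * channels    # Conv2d output plus the following ReLU
    return total


def count_parameters(conv_cfg, fc_cfg, kernel=KERNEL):
    """Weights plus biases of all convolutional and linear layers."""
    convs = [c for c in conv_cfg if c != "M"]
    conv = sum(i * o * kernel * kernel + o for i, o in convs)
    fc = sum(i * o + o for i, o in fc_cfg)
    return conv + fc


print(count_features(VGG16_CONV))              # 9920
print(count_parameters(VGG16_CONV, VGG16_FC))  # 134420327
```

Both results agree with the VGG16 row (9,920 features and 134,420,327 trainable parameters), which supports the per-channel reading of the feature counts.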
Classification accuracies for image tiles.
| Network | accuracy (tiles) | |
|---|---|---|
| AlexNet | 77.02% | 74.76% |
| VGG11 | 80.18% | 77.49% |
| VGG13 | 80.77% | 78.19% |
| VGG16 | 80.35% | 78.00% |
| VGG19 | 79.61% | 76.93% |
| VGG11_bn | 83.24% | 80.74% |
| VGG13_bn | 83.94% | 81.56% |
| VGG16_bn | 84.24% | 81.46% |
| VGG19_bn | 84.44% | 81.54% |
| ResNet34 | 82.51% | 80.09% |
| Inception_v3 | 81.13% | 78.99% |
| VGG16_1FC | 82.66% | 79.82% |
| VGG16_avg1FC | 83.79% | 80.92% |
Classification accuracies for whole slides.
| Network | accuracy (slides) | |
|---|---|---|
| AlexNet | 91.82% | 90.42% |
| VGG11 | 93.33% | 92.22% |
| VGG13 | 93.94% | 92.81% |
| VGG16 | 92.73% | 93.71% |
| VGG19 | 93.33% | 93.11% |
| VGG11_bn | 93.94% | 92.81% |
| VGG13_bn | 93.03% | 92.51% |
| VGG16_bn | 93.64% | 92.22% |
| VGG19_bn | 93.94% | 92.81% |
| ResNet34 | 93.33% | 92.51% |
| Inception_v3 | 92.12% | 92.22% |
| VGG16_1FC | 93.33% | 93.11% |
| VGG16_avg1FC | 93.64% | 94.01% |
Numbers of significantly correlated gene-feature pairs for various network architectures.
Fixed correlation- and log gene expression thresholds are used: R = 0.8, E = 10. Numbers of unique genes, features and tissues involved are also shown (test dataset).
| Network | gene-feature pairs | unique genes | unique features | unique tissues |
|---|---|---|---|---|
| AlexNet | 424 | 55 | 112 | 11 |
| VGG11 | 1,175 | 70 | 204 | 12 |
| VGG13 | 2,149 | 70 | 307 | 13 |
| VGG16 | 2,176 | 74 | 346 | 13 |
| VGG19 | 1,227 | 71 | 321 | 14 |
| VGG11_bn | 34 | 19 | 7 | 3 |
| VGG13_bn | 59 | 22 | 17 | 6 |
| VGG16_bn | 51 | 20 | 11 | 4 |
| ResNet34 | 3 | 2 | 3 | 2 |
| Inception_v3 | 23 | 12 | 7 | 2 |
| VGG16_1FC | 2,714 | 69 | 363 | 13 |
| VGG16_avg1FC | 3,114 | 83 | 448 | 15 |
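To make the two thresholds concrete, here is a minimal sketch of how gene-feature pairs could be counted for a correlation threshold R and a log gene expression threshold E. It makes several assumptions that the text does not confirm: Pearson correlation is used, only positive correlations count, and a gene passes the expression filter when its maximum log expression over the samples reaches E. The function names and toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def significant_pairs(genes, features, r_min=0.8, e_min=10.0):
    """Return (gene, feature, r) for pairs passing both thresholds.

    genes:    gene name    -> log expression per sample
    features: feature name -> feature value per sample (same sample order)
    """
    pairs = []
    for gene, expr in genes.items():
        if max(expr) < e_min:          # assumed form of the expression filter
            continue
        for feat, values in features.items():
            r = pearson(expr, values)
            if r >= r_min:             # assumption: positive correlations only
                pairs.append((gene, feat, r))
    return pairs


# Toy data: four samples, two genes, two features named [layer]_[channel].
genes = {"GENE_A": [2.0, 3.0, 11.0, 12.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
features = {"29_1": [0.1, 0.2, 0.9, 1.0], "18_2": [1.0, 0.9, 0.2, 0.1]}
pairs = significant_pairs(genes, features)
print([(g, f) for g, f, _ in pairs])   # [('GENE_A', '29_1')]
```

The numbers of unique genes and unique features reported in the table correspond to the distinct first and second elements of such a pair list.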
Numbers of significantly correlated gene-feature pairs for various correlation- and log gene expression thresholds.
Numbers of unique genes, features and tissues involved are also shown (VGG16 network, test dataset).
| Correlation threshold | Log gene expression threshold | gene-feature pairs | unique genes | unique features | unique tissues |
|---|---|---|---|---|---|
| 0.8 | 10 | 2,176 | 74 | 346 | 13 |
| 0.75 | 10 | 4,308 | 115 | 647 | 22 |
| 0.7 | 10 | 7,984 | 213 | 1,146 | 28 |
| 0.75 | 8 | 9,624 | 312 | 926 | 26 |
| 0.8 | 7 | 8,055 | 365 | 576 | 18 |
| 0.75 | 7 | 15,062 | 535 | 1,046 | 31 |
| 0.7 | 7 | 27,671 | 995 | 1,947 | 36 |
Fig 3. Numbers of significantly correlated genes for the 31 layers of VGG16.
Significantly correlated genes were aggregated for all features (channels) belonging to a given layer (i.e. for all features of the form [layer]_[channel]). The correlated genes were broken down according to their tissues of maximal expression. Tissues are color-coded. Colors of the figure bars scanned bottom-up correspond to colors in the legend read top to bottom.
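The per-layer aggregation described in this caption can be sketched as follows: each feature name of the form [layer]_[channel] is split to recover its layer, and the correlated genes are then grouped by their tissue of maximal expression. The helper name `genes_per_layer` and the input layout are assumptions; the four example triples are taken from the gene table in this record.

```python
from collections import defaultdict


def genes_per_layer(pairs):
    """Count unique correlated genes per layer, broken down by tissue.

    pairs: iterable of (gene, feature, tissue_of_max_expression),
           with feature written as "[layer]_[channel]".
    """
    by_layer = defaultdict(lambda: defaultdict(set))
    for gene, feature, tissue in pairs:
        layer = int(feature.split("_")[0])
        by_layer[layer][tissue].add(gene)   # sets avoid double counting a gene
    return {
        layer: {tissue: len(gene_set) for tissue, gene_set in tissues.items()}
        for layer, tissues in by_layer.items()
    }


example = [
    ("PLG", "29_234", "Liver"),
    ("APOA5", "29_192", "Liver"),
    ("DDX4", "29_39", "Testis"),
    ("INS", "18_158", "Pancreas"),
]
counts = genes_per_layer(example)
print(counts[29])   # {'Liver': 2, 'Testis': 1}
print(counts[18])   # {'Pancreas': 1}
```

Each inner dictionary corresponds to one stacked bar of the figure, with one segment per tissue.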
Fig 6. Visualizations of select histological features.
The following features (of the form [layer]_[channel]) found correlated with specific genes are visualized on each row: row 1: 29_234 AGXT (r = 0.939) Liver GTEX-Q2AG-1126_9_7, row 2: 29_118 CALR3 (r = 0.829) Testis GTEX-11NSD-1026_5_27, row 3: 30_244 KLK3 (r = 0.828) Prostate GTEX-V955-1826_8_13, row 4: 20_137 CYP11B1 (r = 0.850) Adrenal Gland GTEX-QLQW-0226_25_9. Original image (column 1), guided backpropagation of the feature on the original image (column 2), synthetic image of the feature (column 3), immunohistochemistry image for the corresponding gene from the Human Protein Atlas (column 4).
Developmental and transcription regulation genes correlated with visual features.
| Gene | Gene name | Tissue with highest expression | Highest median tissue expression | Best correlated feature | r |
|---|---|---|---|---|---|
| FABP4 | Fatty acid-binding protein, adipocyte | Adipose—Visceral (Omentum) | 12.6 | 1_14 | 0.767 |
| CYP17A1 | Steroid 17-alpha-hydroxylase/17,20 lyase | Adrenal Gland | 12.4 | 20_467 | 0.805 |
| HSD3B2 | 3 beta-hydroxysteroid dehydrogenase/Delta 5—>4-isomerase type 2 | Adrenal Gland | 11.2 | 29_123 | 0.791 |
| STAR | Steroidogenic acute regulatory protein, mitochondrial | Adrenal Gland | 12.5 | 29_132 | 0.770 |
| TPM4 | Tropomyosin alpha-4 chain | Artery—Aorta | 9.1 | 19_468 | 0.767 |
| S100A6 | Protein S100-A6 | Artery—Aorta | 11.6 | 19_162 | 0.760 |
| | Transcriptional coactivator YAP1 | Artery—Aorta | 7.6 | 19_162 | 0.758 |
| MARVELD1 | MARVEL domain-containing protein 1 | Artery—Tibial | 7.8 | 19_162 | 0.764 |
| GNG12 | Guanine nucleotide-binding protein G(I)/G(S)/G(O) subunit gamma-12 | Artery—Tibial | 7.1 | 12_209 | 0.755 |
| | Zinc finger protein ZIC 4 | Brain—Cerebellum | 7.0 | 29_447 | 0.922 |
| | Neurogenic differentiation factor 2 | Brain—Cerebellum | 7.2 | 29_463 | 0.915 |
| | Neurogenic differentiation factor 1 | Brain—Cerebellum | 7.8 | 29_479 | 0.873 |
| CRTAM | Cytotoxic and regulatory T-cell molecule | Brain—Cerebellum | 7.2 | 29_479 | 0.868 |
| | Zinc finger protein ZIC 2 | Brain—Cerebellum | 7.9 | 29_463 | 0.854 |
| SLC12A5 | Solute carrier family 12 member 5 | Brain—Cerebellum | 7.5 | 27_140 | 0.850 |
| | Zinc finger protein ZIC 1 | Brain—Cerebellum | 8.1 | 29_479 | 0.844 |
| ELAVL3 | ELAV-like protein 3 | Brain—Cerebellum | 8.0 | 29_463 | 0.834 |
| PVALB | Parvalbumin alpha | Brain—Cerebellum | 9.1 | 29_463 | 0.781 |
| HPCAL4 | Hippocalcin-like protein 4 | Brain—Cerebellum | 7.5 | 29_463 | 0.781 |
| CPLX2 | Complexin-2 | Brain—Cerebellum | 8.7 | 27_140 | 0.762 |
| SPOCK2 | Testican-2 | Brain—Cerebellum | 8.6 | 27_148 | 0.761 |
| SLC1A2 | Excitatory amino acid transporter 2 | Brain—Cortex | 8.7 | 18_8 | 0.844 |
| GRIN1 | Glutamate receptor ionotropic, NMDA 1 | Brain—Cortex | 7.9 | 18_8 | 0.812 |
| HPCA | Neuron-specific calcium-binding protein hippocalcin | Brain—Cortex | 7.5 | 29_485 | 0.811 |
| PACSIN1 | Protein kinase C and casein kinase substrate in neurons protein 1 | Brain—Cortex | 8.5 | 29_463 | 0.810 |
| SLC17A7 | Vesicular glutamate transporter 1 | Brain—Cortex | 9.3 | 18_8 | 0.799 |
| | Calcium/calmodulin-dependent protein kinase type II subunit alpha | Brain—Cortex | 9.2 | 23_417 | 0.786 |
| | Dendrin | Brain—Cortex | 7.9 | 25_478 | 0.775 |
| CEND1 | Cell cycle exit and neuronal differentiation protein 1 | Brain—Cortex | 8.6 | 29_485 | 0.755 |
| KIF5A | Kinesin heavy chain isoform 5A | Brain—Cortex | 10.0 | 29_485 | 0.754 |
| NBL1 | Neuroblastoma suppressor of tumorigenicity 1 | Cervix—Ectocervix | 9.6 | 19_162 | 0.758 |
| CRTAP | Cartilage-associated protein | Cervix—Ectocervix | 8.1 | 12_223 | 0.756 |
| | High mobility group protein B1 | Cervix—Ectocervix | 7.4 | 19_468 | 0.753 |
| | E3 SUMO-protein ligase PIAS3 | Cervix—Endocervix | 7.0 | 19_468 | 0.764 |
| SPIN1 | Spindlin-1 | Cervix—Endocervix | 7.2 | 19_333 | 0.752 |
| PLXNB2 | Plexin-B2 | Cervix—Endocervix | 8.0 | 12_164 | 0.750 |
| | Homeobox protein Nkx-2.5 | Heart—Atrial Appendage | 7.0 | 30_214 | 0.903 |
| BMP10 | Bone morphogenetic protein 10 | Heart—Atrial Appendage | 8.3 | 29_313 | 0.855 |
| NPPA | Natriuretic peptides A | Heart—Atrial Appendage | 14.9 | 29_481 | 0.777 |
| NMRK2 | Nicotinamide riboside kinase 2 | Heart—Atrial Appendage | 9.8 | 30_72 | 0.751 |
| MYBPC3 | Myosin-binding protein C, cardiac-type | Heart—Left Ventricle | 10.8 | 23_270 | 0.783 |
| TNNI3 | Troponin I, cardiac muscle | Heart—Left Ventricle | 12.0 | 25_28 | 0.780 |
| TNNT2 | Troponin T, cardiac muscle | Heart—Left Ventricle | 11.6 | 25_31 | 0.759 |
| CSRP3 | Cysteine and glycine-rich protein 3 | Heart—Left Ventricle | 9.4 | 30_72 | 0.755 |
| AQP2 | Aquaporin-2 | Kidney—Cortex | 7.6 | 29_280 | 0.890 |
| UMOD | Uromodulin | Kidney—Cortex | 7.4 | 27_274 | 0.874 |
| PLG | Plasminogen | Liver | 8.7 | 29_234 | 0.959 |
| APOA5 | Apolipoprotein A-V | Liver | 8.8 | 29_192 | 0.954 |
| SERPINC1 | Antithrombin-III | Liver | 10.1 | 29_192 | 0.952 |
| HRG | Histidine-rich glycoprotein | Liver | 9.2 | 29_234 | 0.952 |
| F2 | Prothrombin | Liver | 9.4 | 29_192 | 0.949 |
| AHSG | Alpha-2-HS-glycoprotein | Liver | 10.7 | 29_192 | 0.946 |
| APCS | Serum amyloid P-component | Liver | 10.7 | 29_192 | 0.942 |
| BAAT | Bile acid-CoA:amino acid N-acyltransferase | Liver | 7.7 | 29_192 | 0.938 |
| ANGPTL3 | Angiopoietin-related protein 3 | Liver | 7.3 | 29_234 | 0.934 |
| APOA2 | Apolipoprotein A-II | Liver | 12.2 | 29_192 | 0.931 |
| CYP4A11 | Cytochrome P450 4A11 | Liver | 8.3 | 29_234 | 0.924 |
| CPB2 | Carboxypeptidase B2 | Liver | 8.5 | 29_192 | 0.922 |
| PROC | Vitamin K-dependent protein C | Liver | 7.9 | 29_234 | 0.919 |
| APOH | Beta-2-glycoprotein 1 | Liver | 11.7 | 29_234 | 0.919 |
| FGB | Fibrinogen beta chain | Liver | 12.6 | 29_192 | 0.896 |
| FGL1 | Fibrinogen-like protein 1 | Liver | 10.2 | 18_231 | 0.879 |
| G6PC | Glucose-6-phosphatase | Liver | 7.1 | 29_234 | 0.872 |
| FGA | Fibrinogen alpha chain | Liver | 12.2 | 30_192 | 0.858 |
| ASGR2 | Asialoglycoprotein receptor 2 | Liver | 8.8 | 30_192 | 0.854 |
| CRP | C-reactive protein | Liver | 12.6 | 29_192 | 0.852 |
| VTN | Vitronectin | Liver | 11.2 | 18_279 | 0.851 |
| FGG | Fibrinogen gamma chain | Liver | 12.2 | 30_192 | 0.829 |
| IGFBP1 | Insulin-like growth factor-binding protein 1 | Liver | 7.2 | 29_75 | 0.802 |
| APOB | Apolipoprotein B-100 | Liver | 8.5 | 30_438 | 0.784 |
| | Cyclic AMP-responsive element-binding protein 3-like protein 3 | Liver | 8.2 | 20_481 | 0.777 |
| CPS1 | Carbamoyl-phosphate synthase [ammonia], mitochondrial | Liver | 8.6 | 29_192 | 0.775 |
| ARG1 | Arginase-1 | Liver | 9.0 | 20_194 | 0.768 |
| SFTPB | Pulmonary surfactant-associated protein B | Lung | 12.2 | 29_383 | 0.789 |
| SCGB1A1 | Uteroglobin | Lung | 9.2 | 29_155 | 0.776 |
| STATH | Statherin | Minor Salivary Gland | 7.7 | 25_200 | 0.767 |
| | Myogenic factor 6 | Muscle—Skeletal | 7.2 | 25_409 | 0.872 |
| NEB | Nebulin | Muscle—Skeletal | 9.9 | 25_409 | 0.851 |
| XIRP2 | Xin actin-binding repeat-containing protein 2 | Muscle—Skeletal | 8.0 | 18_102 | 0.828 |
| RYR1 | Ryanodine receptor 1 | Muscle—Skeletal | 8.6 | 29_362 | 0.827 |
| TMOD4 | Tropomodulin-4 | Muscle—Skeletal | 8.6 | 30_16 | 0.827 |
| KLHL40 | Kelch-like protein 40 | Muscle—Skeletal | 7.8 | 25_409 | 0.824 |
| SMTNL1 | Smoothelin-like protein 1 | Muscle—Skeletal | 7.2 | 30_362 | 0.820 |
| TTN | Titin | Muscle—Skeletal | 8.7 | 18_136 | 0.810 |
| MYPN | Myopalladin | Muscle—Skeletal | 7.3 | 30_72 | 0.805 |
| LMOD3 | Leiomodin-3 | Muscle—Skeletal | 7.4 | 18_136 | 0.802 |
| MYLPF | Myosin regulatory light chain 2, skeletal muscle isoform | Muscle—Skeletal | 10.9 | 30_161 | 0.798 |
| LMOD2 | Leiomodin-2 | Muscle—Skeletal | 8.9 | 30_72 | 0.798 |
| NRAP | Nebulin-related-anchoring protein | Muscle—Skeletal | 10.0 | 18_136 | 0.794 |
| MYL2 | Myosin regulatory light chain 2, ventricular/cardiac muscle isoform | Muscle—Skeletal | 13.7 | 18_136 | 0.781 |
| MYLK2 | Myosin light chain kinase 2, skeletal/cardiac muscle | Muscle—Skeletal | 7.7 | 30_161 | 0.777 |
| TNNT1 | Troponin T, slow skeletal muscle | Muscle—Skeletal | 12.5 | 18_272 | 0.776 |
| TNNI1 | Troponin I, slow skeletal muscle | Muscle—Skeletal | 10.0 | 30_161 | 0.773 |
| MB | Myoglobin | Muscle—Skeletal | 13.1 | 30_72 | 0.772 |
| MYH7 | Myosin-7 | Muscle—Skeletal | 12.4 | 18_136 | 0.770 |
| KLHL41 | Kelch-like protein 41 | Muscle—Skeletal | 11.6 | 20_171 | 0.766 |
| ANP32B | Acidic leucine-rich nuclear phosphoprotein 32 family member B | Nerve—Tibial | 8.2 | 19_468 | 0.775 |
| CNTF | Ciliary neurotrophic factor | Nerve—Tibial | 8.3 | 29_36 | 0.769 |
| MXRA8 | Matrix remodeling-associated protein 8 | Nerve—Tibial | 8.5 | 19_468 | 0.751 |
| | Recombining binding protein suppressor of hairless-like protein | Pancreas | 9.5 | 30_467 | 0.935 |
| INS | Insulin | Pancreas | 10.8 | 18_158 | 0.913 |
| CELA2A | Chymotrypsin-like elastase family member 2A | Pancreas | 12.6 | 23_110 | 0.895 |
| CEL | Bile salt-activated lipase | Pancreas | 14.0 | 23_324 | 0.866 |
| REG3G | Regenerating islet-derived protein 3-gamma | Pancreas | 9.6 | 30_290 | 0.815 |
| FSHB | Follitropin subunit beta | Pituitary | 7.1 | 29_279 | 0.930 |
| | Pituitary-specific positive transcription factor 1 | Pituitary | 7.3 | 29_279 | 0.913 |
| GHRHR | Growth hormone-releasing hormone receptor | Pituitary | 8.9 | 29_279 | 0.900 |
| TSHB | Thyrotropin subunit beta | Pituitary | 9.2 | 29_436 | 0.869 |
| LHB | Lutropin subunit beta | Pituitary | 12.0 | 30_429 | 0.844 |
| MYO15A | Unconventional myosin-XV | Pituitary | 7.2 | 29_436 | 0.803 |
| PRL | Prolactin | Pituitary | 15.5 | 29_436 | 0.799 |
| TGFBR3L | Transforming growth factor-beta receptor type 3-like protein | Pituitary | 8.0 | 29_436 | 0.787 |
| GH1 | Somatotropin | Pituitary | 16.8 | 30_429 | 0.763 |
| COL22A1 | Collagen alpha-1(XXII) chain | Pituitary | 7.2 | 30_436 | 0.753 |
| KLK3 | Prostate-specific antigen | Prostate | 12.8 | 29_458 | 0.900 |
| KLK4 | Kallikrein-4 | Prostate | 9.4 | 29_430 | 0.821 |
| | Homeobox protein Hox-B13 | Prostate | 7.3 | 30_458 | 0.801 |
| KRT77 | Keratin, type II cytoskeletal 1b | Skin—Not Sun Exposed | 8.6 | 18_78 | 0.884 |
| FLG2 | Filaggrin-2 | Skin—Sun Exposed | 9.6 | 18_78 | 0.883 |
| LCE2C | Late cornified envelope protein 2C | Skin—Sun Exposed | 8.2 | 18_78 | 0.878 |
| LCE1A | Late cornified envelope protein 1A | Skin—Sun Exposed | 8.9 | 18_78 | 0.878 |
| CDSN | Corneodesmosin | Skin—Sun Exposed | 8.3 | 18_78 | 0.875 |
| LCE2B | Late cornified envelope protein 2B | Skin—Sun Exposed | 9.4 | 18_78 | 0.874 |
| LCE1B | Late cornified envelope protein 1B | Skin—Sun Exposed | 8.0 | 18_78 | 0.874 |
| LCE1C | Late cornified envelope protein 1C | Skin—Sun Exposed | 9.4 | 18_78 | 0.873 |
| LCE6A | Late cornified envelope protein 6A | Skin—Sun Exposed | 7.6 | 18_78 | 0.870 |
| LCE1F | Late cornified envelope protein 1F | Skin—Sun Exposed | 7.6 | 18_78 | 0.868 |
| LCE5A | Late cornified envelope protein 5A | Skin—Sun Exposed | 7.3 | 18_78 | 0.864 |
| LCE2D | Late cornified envelope protein 2D | Skin—Sun Exposed | 7.5 | 18_78 | 0.863 |
| LCE2A | Late cornified envelope protein 2A | Skin—Sun Exposed | 7.2 | 18_78 | 0.856 |
| DSC1 | Desmocollin-1 | Skin—Sun Exposed | 8.2 | 18_78 | 0.849 |
| KRT2 | Keratin, type II cytoskeletal 2 epidermal | Skin—Sun Exposed | 12.8 | 18_78 | 0.840 |
| FLG | Filaggrin | Skin—Sun Exposed | 9.0 | 18_78 | 0.835 |
| SERPINB7 | Serpin B7 | Skin—Sun Exposed | 7.4 | 18_78 | 0.830 |
| KRT10 | Keratin, type I cytoskeletal 10 | Skin—Sun Exposed | 14.6 | 18_78 | 0.826 |
| CASP14 | Caspase-14 | Skin—Sun Exposed | 9.4 | 18_78 | 0.820 |
| PSAPL1 | Proactivator polypeptide-like 1 | Skin—Sun Exposed | 7.4 | 22_265 | 0.812 |
| CST6 | Cystatin-M | Skin—Sun Exposed | 9.6 | 20_329 | 0.773 |
| ASPRV1 | Retroviral-like aspartic protease 1 | Skin—Sun Exposed | 9.6 | 18_78 | 0.772 |
| ALOX12B | Arachidonate 12-lipoxygenase, 12R-type | Skin—Sun Exposed | 7.4 | 25_227 | 0.757 |
| | Steroidogenic factor 1 | Spleen; Adrenal Gland | 8.0 | 27_423 | 0.780 |
| CD19 | B-lymphocyte antigen CD19 | Spleen | 7.9 | 27_389 | 0.759 |
| KCNE2 | Potassium voltage-gated channel subfamily E member 2 | Stomach | 8.3 | 29_287 | 0.782 |
| DDX4 | Probable ATP-dependent RNA helicase DDX4 | Testis | 7.5 | 29_39 | 0.951 |
| | Doublesex- and mab-3-related transcription factor B1 | Testis | 7.1 | 29_39 | 0.949 |
| SHCBP1L | Testicular spindle-associated protein SHCBP1L | Testis | 7.7 | 29_39 | 0.945 |
| CALR3 | Calreticulin-3 | Testis | 7.1 | 29_246 | 0.945 |
| ZPBP2 | Zona pellucida-binding protein 2 | Testis | 7.2 | 29_39 | 0.941 |
| SPEM1 | Spermatid maturation protein 1 | Testis | 7.5 | 29_246 | 0.939 |
| ACSBG2 | Long-chain-fatty-acid—CoA ligase ACSBG2 | Testis | 8.0 | 29_39 | 0.938 |
| SPATA19 | Spermatogenesis-associated protein 19, mitochondrial | Testis | 8.0 | 29_168 | 0.937 |
| RNF151 | RING finger protein 151 | Testis | 8.5 | 29_39 | 0.937 |
| FSCN3 | Fascin-3 | Testis | 7.2 | 29_246 | 0.933 |
| TCP11 | T-complex protein 11 homolog | Testis | 8.7 | 29_258 | 0.930 |
| ODF1 | Outer dense fiber protein 1 | Testis | 9.6 | 29_39 | 0.929 |
| IQCF1 | IQ domain-containing protein F1 | Testis | 7.8 | 29_246 | 0.926 |
| PRM3 | Protamine-3 | Testis | 7.0 | 29_246 | 0.926 |
| TDRG1 | Testis development-related protein 1 | Testis | 7.0 | 29_39 | 0.923 |
| SYCP3 | Synaptonemal complex protein 3 | Testis | 7.2 | 29_39 | 0.917 |
| CABS1 | Calcium-binding and spermatid-specific protein 1 | Testis | 7.9 | 29_73 | 0.915 |
| DKKL1 | Dickkopf-like protein 1 | Testis | 9.3 | 29_39 | 0.914 |
| RPL10L | 60S ribosomal protein L10-like | Testis | 7.5 | 29_246 | 0.908 |
| SYCE3 | Synaptonemal complex central element protein 3 | Testis | 7.9 | 29_39 | 0.906 |
| CAPZA3 | F-actin-capping protein subunit alpha-3 | Testis | 8.7 | 29_258 | 0.906 |
| GTSF1 | Gametocyte-specific factor 1 | Testis | 7.0 | 29_246 | 0.905 |
| CCDC42 | Coiled-coil domain-containing protein 42 | Testis | 7.0 | 29_73 | 0.903 |
| TXNDC2 | Thioredoxin domain-containing protein 2 | Testis | 7.1 | 29_246 | 0.898 |
| TPPP2 | Tubulin polymerization-promoting protein family member 2 | Testis | 8.2 | 29_73 | 0.888 |
| AKAP3 | A-kinase anchor protein 3 | Testis | 7.3 | 29_39 | 0.880 |
| TNP1 | Spermatid nuclear transition protein 1 | Testis | 13.1 | 29_39 | 0.874 |
| TSSK6 | Testis-specific serine/threonine-protein kinase 6 | Testis | 8.2 | 29_246 | 0.869 |
| | Protein maelstrom homolog | Testis | 7.3 | 29_39 | 0.864 |
| | T-complex protein 10A homolog 1 | Testis | 8.1 | 29_39 | 0.864 |
| CCIN | Calicin | Testis | 7.3 | 29_73 | 0.862 |
| PRAME | Melanoma antigen preferentially expressed in tumors | Testis | 7.3 | 30_39 | 0.860 |
| PRM1 | Sperm protamine P1 | Testis | 14.3 | 29_73 | 0.850 |
| PRM2 | Protamine-2 | Testis | 14.3 | 29_73 | 0.849 |
| GGN | Gametogenetin | Testis | 7.5 | 29_39 | 0.845 |
| ROPN1L | Ropporin-1-like protein | Testis | 9.0 | 29_39 | 0.844 |
| CABYR | Calcium-binding tyrosine phosphorylation-regulated protein | Testis | 8.0 | 29_39 | 0.840 |
| INSL3 | Insulin-like 3 | Testis | 8.9 | 29_39 | 0.802 |
| ACRBP | Acrosin-binding protein | Testis | 9.3 | 30_246 | 0.798 |
| PHOSPHO1 | Phosphoethanolamine/phosphocholine phosphatase | Testis | 7.3 | 29_246 | 0.782 |
| SPATA24 | Spermatogenesis-associated protein 24 | Testis | 7.5 | 29_73 | 0.778 |
| SPINK2 | Serine protease inhibitor Kazal-type 2 | Testis | 9.2 | 29_73 | 0.773 |
| PCSK4 | Proprotein convertase subtilisin/kexin type 4 | Testis | 8.2 | 30_39 | 0.773 |
| PRSS21 | Testisin | Testis | 7.0 | 30_168 | 0.773 |
| | Homeobox protein Nkx-2.1 | Thyroid | 8.5 | 29_202 | 0.870 |
| TG | Thyroglobulin | Thyroid | 12.3 | 30_93 | 0.869 |
| | Paired box protein Pax-8 | Thyroid | 10.4 | 29_499 | 0.802 |
| | Forkhead box protein E1 | Thyroid | 7.5 | 27_376 | 0.778 |
| TRIP6 | Thyroid receptor-interacting protein 6 | Uterus | 7.6 | 12_209 | 0.762 |
Genes with Gene Ontology ‘developmental process’ or ‘transcription regulator activity’ annotations are grouped by tissue and ordered by correlation. For each gene, only the best correlated feature is shown, in the form [layer]_[channel] (VGG16 architecture, test dataset; correlation threshold = 0.75, log gene expression threshold = 7). Genes with ‘transcription regulator activity’ are shown in red.
Fig 4. Visualizations of select histological features.
The following features (of the form [layer]_[channel]) found correlated with specific genes are visualized on each row: row 1: 8_55—CTRC (r = 0.85), row 2: 18_13—CUZD1 (r = 0.866), row 3: 23_324—CELA3B (r = 0.868), row 4: 30_467—AMY2A (r = 0.928). Original image (column 1), guided backpropagation of the feature on the original image (column 2), synthetic image of the feature (column 3), immunohistochemistry image for the corresponding gene from the Human Protein Atlas (column 4). All visualizations are for pancreas sample tile GTEX-11NSD-0526_32_5.
Fig 5. Visualizations of select histological features.
The following features (of the form [layer]_[channel]) found correlated with specific genes are visualized on each row: row 1: 22_405 NKX2-1 (r = 0.726) Thyroid GTEX-11NSD-0126_31_16, row 2: 27_343 TG (r = 0.801) Thyroid GTEX-11NSD-0126_31_16, row 3: 29_499 PAX8 (r = 0.802) Thyroid GTEX-11NSD-0126_31_16, row 4: 25_260 NEB (r = 0.831) Muscle—Skeletal GTEX-145ME-2026_39_19. Original image (column 1), guided backpropagation of the feature on the original image (column 2), synthetic image of the feature (column 3), immunohistochemistry image for the corresponding gene from the Human Protein Atlas (column 4).
Numbers of significant gene-feature pairs with indirect dependencies gene-tissue-feature (g-t-f), feature-gene-tissue (f-g-t), gene-feature-tissue (g-f-t).
Conditional independence tests were performed at α = 0.01 and at α = 0.05.
| Thresholds | gene-feature pairs | | | | | | |
|---|---|---|---|---|---|---|---|
| corr 0.7 | 27,671 | 12,442 | 10,133 | 5,068 | 1,486 | 931 | 647 |
| corr 0.8 | 2,176 | 1,495 | 1,254 | 37 | 12 | 1 | 1 |