Fabrizio Frasca, Matteo Matteucci, Michele Leone, Marco J Morelli, Marco Masseroli.
Abstract
BACKGROUND: Histone Mark Modifications (HMs) are crucial actors in gene regulation, as they actively remodel chromatin to modulate transcriptional activity: aberrant combinatorial patterns of HMs have been connected with several diseases, including cancer. HMs are, however, reversible modifications: understanding their role in disease would allow the design of 'epigenetic drugs' for specific, non-invasive treatments. Standard statistical techniques were not entirely successful in extracting representative features from raw HM signals over gene locations. On the other hand, deep learning approaches allow for effective automatic feature extraction, but at the expense of model interpretation.
Keywords: Epigenetics; Gene expression regulation; Histone modifications; Interpretability
Year: 2022 PMID: 35473556 PMCID: PMC9040271 DOI: 10.1186/s12859-022-04687-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
The 5 histone modifications considered in this study along with their known biological characterisation (Table from [10])
| Histone modification | Associated with | Functional category |
|---|---|---|
| H3K4me3 | Promoter regions | Promoter mark |
| H3K4me1 | Promoter/Enhancer regions | Regulating mark |
| H3K36me3 | Transcribed regions | Structural mark |
| H3K9me3 | Heterochromatin regions | Repressor mark |
| H3K27me3 | Polycomb repression | Repressor mark |
Fig. 1 Feature extraction stages for histone mark h over gene g: ChIP-seq raw and processed signals for h are in blue, while the input field for g is depicted in solid, thick grey. (1) Peak calling: ChIP-seq raw reads are fed as inputs to a peak calling algorithm to reliably extract read-enriched genomic regions. (2) Localisation: Genome-wide peak signals are localised within the defined genes’ input fields. (3) Extraction: The max peak enrichment value is extracted as h’s value into g’s feature vector
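The localisation and extraction stages in the caption above can be illustrated with a minimal Python sketch. It assumes peak calling has already produced a list of peaks with an enrichment score. The `Peak` and `InputField` types, the half-open overlap rule, and the default of 0.0 for a mark with no peak in the input field are hypothetical choices made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    """A called peak (hypothetical type): half-open interval [start, end)."""
    chrom: str
    start: int
    end: int
    enrichment: float


@dataclass(frozen=True)
class InputField:
    """The genomic window considered for one gene (hypothetical type)."""
    chrom: str
    start: int
    end: int


def max_peak_enrichment(peaks, field, default=0.0):
    """Localise peaks within the input field and return the max enrichment.

    Assumption: a peak counts as localised if it overlaps the field by at
    least one base; `default` is returned when no peak overlaps.
    """
    overlapping = [
        p.enrichment
        for p in peaks
        if p.chrom == field.chrom and p.start < field.end and p.end > field.start
    ]
    return max(overlapping, default=default)


def feature_vector(peaks_by_mark, field, marks):
    """Build one gene's feature vector: one max-enrichment value per mark."""
    return [max_peak_enrichment(peaks_by_mark.get(m, []), field) for m in marks]
```

With the five histone marks considered in the study passed as `marks`, each gene would be described by a five-element vector, one entry per mark.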
Fig. 2 Test AUROC scores for DeepChrome, AttentiveChrome and ShallowChrome. Results are reported on the 56 considered epigenomes, indicated with their respective REMC code (see Supplementary Table S1, Additional file 1, for the association between REMC codes and epigenomes). Results for AttentiveChrome have been manually reproduced by the authors of this manuscript. It was not possible to reproduce the result on epigenome E059, so no AttentiveChrome result for it is reported in the present figure. See Supplementary Section S8, Additional file 1, for further details
Aggregated statistics on the test results for DeepChrome, AttentiveChrome and ShallowChrome computed across the 56 considered epigenomes
| Statistic | DeepChrome | AttentiveChrome | ShallowChrome |
|---|---|---|---|
| Mean | 0.8008 | 0.8133 | 0.8737 |
| Median | 0.8009 | 0.8143 | 0.8829 |
| Max | 0.9225 | 0.9218 | 0.9196 |
| Min | 0.6854 | 0.7237 | 0.8084 |
Values reported for AttentiveChrome are those of the model configuration attaining the best result for each statistic. ShallowChrome statistics are computed over the mean test performance for each epigenome
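The two-level aggregation described in the note above (a mean per epigenome first, then summary statistics across epigenomes) can be sketched as follows. The function name and the input layout, a mapping from epigenome code to a list of test AUROC scores, are assumptions made for illustration and do not come from the paper.

```python
from statistics import mean, median


def aggregate_auroc(scores_by_epigenome):
    """Summarise test AUROC across epigenomes.

    Assumed input: {epigenome_code: [auroc_run_1, auroc_run_2, ...]}.
    Each epigenome is first reduced to its mean test score, then the
    four statistics are computed over those per-epigenome means.
    """
    per_epigenome = [mean(runs) for runs in scores_by_epigenome.values()]
    return {
        "Mean": mean(per_epigenome),
        "Median": median(per_epigenome),
        "Max": max(per_epigenome),
        "Min": min(per_epigenome),
    }
```

Averaging within each epigenome before aggregating means every epigenome contributes equally to the summary, regardless of how many runs it has.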
Fig. 3 Normalised regulative patterns extracted for gene PAX5 on epigenomes H1-hESC, GM12878 and K562; model bias is included to ease interpretation. Relative contributions to gene activation are in red when negative (repressors), in green when positive (activators) and in grey when found not to be statistically significant (Z-test p-value )
Fig. 4 Aggregated ranking visualisation for chromatin state groups across the 56 considered epigenomes