Fang Huang, Jiawei Shen, Qingli Guo, Yongyong Shi.
Abstract
BACKGROUND: Enhancers are tissue-specific distal regulatory elements that play vital roles in gene regulation and expression. Predicting and identifying enhancers is an important but challenging problem in bioinformatics. Existing computational methods, mostly single classifiers, can only predict transcriptional coactivator EP300-based enhancers and show low generalization performance.
Keywords: Algorithm; Enhancer; Hybrid classifier
Year: 2016 PMID: 28096768 PMCID: PMC5226099 DOI: 10.1186/s41065-016-0012-2
Source DB: PubMed Journal: Hereditas ISSN: 0018-0661 Impact factor: 3.271
Fig. 1 Overview of eRFSVM (different RF classifiers serve as base classifiers, and an SVM classifier serves as the main classifier)
Classifier testing on K562 and HeLa
| Classifiers | Precision (K562) | Precision (HeLa) | Recall (K562) | Recall (HeLa) | F-score (K562) | F-score (HeLa) | Accuracy (K562) | Accuracy (HeLa) |
|---|---|---|---|---|---|---|---|---|
| Gm12878 | 84.39 % | 13.22 % | 0.88 % | 3.92 % | 1.75 % | 6.05 % | 68.47 % | 99.18 % |
| hep | 83.00 % | 30.24 % | 5.04 % | 0.31 % | 9.50 % | 0.62 % | 69.52 % | 99.33 % |
| hesc | 84.00 % | 4.73 % | 3.66 % | 5.47 % | 7.01 % | 5.07 % | 69.19 % | 98.63 % |
| huvec | 81.25 % | 7.44 % | 6.34 % | 0.35 % | 11.76 % | 0.66 % | 69.79 % | 99.30 % |
| eRFSVM | 83.69 % | 15.35 % | 4.92 % | 0.38 % | 9.29 % | 0.75 % | 69.50 % | 99.28 % |
Fig. 2 ROC curve for classifier test on K562 (cross-validation ROC plot of the optimum classifier to predict enhancers in K562 cells)
Fig. 3 ROC curve for classifier test on HeLa (cross-validation ROC plot of the optimum classifier to predict enhancers in HeLa cells)
Results testing on adipose with ChIP-Seq datasets
| Classifiers | Precision | Recall | F-score | Accuracy |
|---|---|---|---|---|
| blood | 55.80 % | 31.56 % | 40.32 % | 91.51 % |
| liver | 80.73 % | 27.50 % | 41.03 % | 92.81 % |
| lung | 83.72 % | 11.25 % | 19.83 % | 91.73 % |
| kidney | 71.83 % | 47.81 % | 57.41 % | 93.55 % |
| eRFSVM-FANTOM5 | 86.17 % | 36.06 % | 50.84 % | 93.38 % |
| SVMs-ANN | 65.30 % | 28.07 % | 39.26 % | 92.58 % |
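The F-score column can be cross-checked against the Precision and Recall columns. The sketch below assumes the reported F-score is the standard F1 measure (the harmonic mean of precision and recall); the record does not state the formula, so this is an assumption, and the names `f_score`, `ADIPOSE_CHIP_SEQ`, and `max_deviation` are illustrative only. The input values are copied from the adipose ChIP-Seq table above.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1), in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F-score %) copied from the table above.
ADIPOSE_CHIP_SEQ = {
    "blood": (55.80, 31.56, 40.32),
    "liver": (80.73, 27.50, 41.03),
    "lung": (83.72, 11.25, 19.83),
    "kidney": (71.83, 47.81, 57.41),
    "eRFSVM-FANTOM5": (86.17, 36.06, 50.84),
    "SVMs-ANN": (65.30, 28.07, 39.26),
}


def max_deviation(rows: dict) -> float:
    """Largest absolute gap between the recomputed F1 and the reported F-score."""
    return max(abs(f_score(p, r) - reported) for p, r, reported in rows.values())


if __name__ == "__main__":
    for name, (p, r, reported) in ADIPOSE_CHIP_SEQ.items():
        print(f"{name}: recomputed {f_score(p, r):.2f} %, reported {reported:.2f} %")
```

Under that assumption, the recomputed values agree with the reported column to within rounding. The same check can be applied to the other tables, where small differences may appear because the published precision and recall are themselves rounded.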
Fig. 4 ROC curve for eRFSVM test on adipose (cross-validation ROC plot of the optimum classifier to predict enhancers in adipose)
Comparative performance analysis of the enhancer predictions in K562
| Classifiers | Precision | Recall | F-score | Number of predicted enhancer bases (portion of whole genome) |
|---|---|---|---|---|
| eRFSVM-ENCODE | 83.69 % | 4.92 % | 9.29 % | 120,670,200 (3.89 %) |
| CSI_ANN | 67.36 % | 9.05 % | 15.96 % | 34,635,309 (1.12 %) |
| RFECS | 69.56 % | 10.16 % | 17.74 % | 130,723,329 (4.22 %) |
| DEEP-ENCODE | 83.56 % | 3.45 % | 6.62 % | 28,238,758 (0.91 %) |
Results testing on adipose with DNA sequence features
| Classifiers | Precision | Recall | F-score | Accuracy |
|---|---|---|---|---|
| blood | 20.77 % | 33.75 % | 25.71 % | 82.27 % |
| liver | 61.84 % | 14.69 % | 23.74 % | 91.42 % |
| lung | 40.07 % | 34.06 % | 36.82 % | 89.38 % |
| kidney | 53.42 % | 26.87 % | 35.76 % | 91.22 % |
| RF-SVMs | 23.61 % | 51.25 % | 32.31 % | 87.53 % |
| DEEP-FANTOM5 | 69.92 % | 18.30 % | 28.74 % | 89.20 % |