Honglin Wang1, Pujan Joshi2, Seung-Hyun Hong2, Peter F Maye3, David W Rowe4, Dong-Guk Shin5.
Abstract
BACKGROUND: Interferon regulatory factor-8 (IRF8) and nuclear factor of activated T cells c1 (NFATc1) are two transcription factors that play an important role in osteoclast differentiation. Thanks to ChIP-seq technology, scientists can now estimate potential genome-wide target genes of IRF8 and NFATc1. However, finding target genes that are consistently up-regulated or down-regulated across different studies is difficult because it requires analysis of a large number of high-throughput expression studies from a comparable context.
Keywords: IRF8; Machine learning; NFATc1; Osteoclast differentiation; Target prediction
Year: 2022 PMID: 34991467 PMCID: PMC8740472 DOI: 10.1186/s12864-021-08159-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Predicted regulation of IRF8 target genes using SVM, NN and log2FC only
| Model | Up regulation | Down regulation |
|---|---|---|
| | AIF1, CD164, MARCKS, MEF2C, RNASE4, TLR6, LSP1 | NDUFS7, RAB3IP, NUDT13, MCRS1, COX15, ATP5L |
| | AIF1, BID, CASP1, CTPS2, H2-M3, IRF5, LSP1, SLC15A3, TLR6 | NDUFS7, PARP8, NUDT13, MCRS1, COX15, ATP5L, ALG9 |
| | AIF1, BID, CTPS2, MARCKS, RNASE4, H2-M3, SLC15A3, NOTCH1, LSP1 | NDUFS7, RAB3IP, NUDT13, MCRS1, TAP2, COX15, ATP5L |
| | ATP6V0A1, CCL6, CCRL2, FGD2, LSP1, LY86, NOTCH1, P2RY12, PLSCR3, RNASE4, SLA, SLC15A3, STAT2, TNFRSF13B, TRIM21 | GTF3C5, RAB3IP |
| | AIF1, CTPS2, H2-M3, MARCKS, NOTCH1, RNASE4, S100A13, SLC15A3, TLR6, TNFRSF1B | ATP5L, NDUFS7 |
Predicted regulation of NFATc1 target genes using SVM, NN and log2FC only
| Model | Up regulation | Down regulation |
|---|---|---|
| | ABCB4, ACADM, ACADS, ACADVL, ACAT1, ACBD6, ACO2, AK2, ALDH4A1, AP1G2, APBA3, ATP5E, ATP5G1, ATP5G2, ATP5G3, ATP5J, ATP5L, ATP6V0B, ATP6V1C1, ATP6V1D, ATP6V1H, BCAT2, BCL2L13, BSG, C1QBP, CIAPIN1, COMTD1, COQ6, COX15, COX5B, COX7A2, COX7A2L, CS, CTTN, CYC1, CYCS, DLAT, DLST, ECSIT, ETFA, ETFDH, EXT1, FAHD1, FASTK, FDXR, GNB1L, GPX4, GSS, GTF2H3, HAGH, HINT3, IDH3A, IDH3B, IK, IMMT, KIF13A, LETM1, LRPPRC, MANBAL, MCAT, MCRS1, MDH1, MDH2, MFN2, MRPL1, MRPL14, MRPL34, MRPL36, MRPL38, MRPL46, MRPL48, MRPL51, MRPL9, MRPS25, MRPS34, MRPS35, MTX2, NDUFA10, NDUFA4, NDUFA5, NDUFA6, NDUFAB1, NDUFAF1, NDUFB10, NDUFB3, NDUFB5, NDUFB6, NDUFB7, NDUFC1, NDUFS3, NDUFS7, NDUFS8, NHEJ1, NSMAF, NUDT8, OGDH, OPA1, OXNAD1, PABPC4, PGAM5, PGLS, PIGO, PIP5K1B, PITPNM1, PPA2, PPARGC1B, PTGES2, PTPN12, PTPN9, RABGEF1, RELB, REPIN1, RREB1, SDHD, SLC25A11, SLC25A19, SLC25A3, SLC25A39, SLC25A5, SLC30A6, SLC39A13, SOD2, ST5, STARD7, TANK, TARBP2, TAX1BP3, TBC1D10B, TBRG4, TCIRG1, TERF2IP, TFRC, TIMM17A, TIMM44, TMEM60, TNFAIP3, TTC19, TUFM, UBE2G1, UBLCP1, UQCRC1, UQCRC2, UQCRH, USP4, VDAC1, VTI1B | CD164, CD48, CHST12, CNR2, CORO1A, EBI3, EPS15, FLI1, GCA, GNG2, IER3, IL10RA, IRF8, LAMP1, LSP1, MARCKS, NEDD9, NUCB2, P2RY6, PKIB, POU2F2, PRKD2, RASSF5, RB1, RBM43, RNASE4, RPS6KA3, SSBP2, TLE3, TLR6, TNFSF9, TPD52, WSB1, ZFP90 |
| | ABCB4, ACAD10, ACADS, AP1G2, ATP5L, CIAPIN1, CNIH4, COX15, COX7A2, CYC1, DLAT, DNAJA3, DUS3L, EXT1, FAHD1, FDXR, FEM1A, HAGH, HINT3, IDH3A, IMMT, LRPPRC, MANBAL, MCAT, MCRS1, MRPL36, MRPL38, MRPL9, MRPS34, MRPS35, NDUFB10, NDUFB5, NDUFB6, NDUFS3, NDUFS7, NSMAF, PABPC4, PEX16, PGLS, PIGQ, PIP5K1B, PPARGC1B, PRDX3, PTGES2, PTPN12, SEMA7A, SLC25A19, SLC39A13, SSNA1, ST5, TIMM44, TTC19, TUFM, USP4 | ANXA6, ATF3, CCL3, CD48, CDC42EP3, CHST12, CPEB2, DCK, EBI3, FLI1, GNG2, IRF5, IRF8, LSP1, LXN, MAP4K2, PIK3CG, SLC15A3, SLC9A3R1, SP100, TEX2, TLR6, TNFSF9, ZFP90 |
| | ABCB4, ACAD10, ACADS, AK2, AP1G2, ATP5G3, ATP5L, BCAT2, CIAPIN1, COG8, COQ6, COX15, COX17, CYC1, DLAT, DNAJA3, DUS3L, ECSIT, ETFDH, EXT1, FASTK, FDXR, GSS, HAGH, HINT2, HINT3, IDH3A, LRPPRC, MANBAL, MBTPS1, MCRS1, MRPL12, MRPL38, MRPL9, MRPS25, MRPS34, MRPS35, NDUFB10, NDUFC1, NDUFS3, NDUFS7, NSMAF, NUDT8, OXNAD1, PABPC4, PIP5K1B, PITPNM1, PPARGC1B, PRDX3, PTDSS2, PTGES2, RABGEF1, RELB, SLC39A13, STARD7, TANK, TAX1BP3, TBRG4, TCF12, TTC19, TUFM, UBE2G1, UBLCP1, UMPS, USP4 | ADAM15, CCL3, CD48, CHST12, CPEB2, EBI3, FLI1, GNG2, IL10RA, IRF8, LSP1, LXN, MAP4K2, MARCKS, MS4A7, NAB2, NOTCH1, NUCB2, P2RY6, POU2F2, RB1, RNASE4, RPS6KA3, SLC15A3, SLC9A3R1, SNAP29, TEX2, TMBIM1, TNFSF9 |
| | ACO2, ACSL1, DLAT, DNAJA3, FDXR, GSS, MCRS1, NFKB2, NFKBIE, NUDT8, OTUD7B, OXNAD1, PPARGC1B, PTGES2, SARS2, SDC1, SEMA7A, SLC25A39, SLC39A13, TARBP2, TBRG4, TRAF1 | CCR5, CDC42EP3, CNR2, ECE1, GSTK1, ICAM2, NAB2, PIAS3, SORL1, TLR6, TNFSF9 |
| | ABCB4, ACAD10, AK2, AP1G2, ATOX1, ATP5G2, BCAT2, DNAJA3, DUS3L, EXT1, FAHD1, FASTK, FDXR, IDH3A, IVNS1ABP, MCRS1, MFN2, NDUFS3, NFKBIE, NSMAF, OTUD7B, OXNAD1, PABPC4, PPARGC1B, PTGES2, PTPN12, REPIN1, SDC1, SEMA7A, SERPINB8, SLC39A13, ST5, STARD7, TANK, TARBP2, TBRG4, XRCC5 | GSTK1, HSPA2, ICAM2, NFKBIZ, PARVG, RASSF5, RBM43, SORL1, TNFSF9 |
Fig. 1 Venn diagrams of all targets, up-regulated targets and down-regulated targets for IRF8 and NFATc1
Fig. 2 The overall process of cTAP
Fig. 3 CP examples illustrating the issues and complexity involved in determining which pair of gene expression populations should be chosen for the analysis. (a) “No RANKL” vs. “RANKL” for OCU and “RANKL” vs. “RANKL with LEA” for OCD. (b) “osteoclast progenitors” vs. “mature-osteoclast progenitors” for OCU
Fig. 4 Osteoclast differentiation pathway diagram including IRF8, NFATc1 and functional groups of marker genes
Fig. 5 The overall process of trimmed quantile normalization
CASP1 is downregulated in most OCU CPs and upregulated in most OCD CPs
| ID | Context | CASP1 log2FC |
|---|---|---|
| 1 | OCU | -0.128 |
| 2 | OCU | -0.400 |
| 3 | OCD | 0.495 |
| 4 | OCU | -0.350 |
| 5 | OCU | 0.247 |
| 6 | OCU | -0.861 |
| 7 | OCU | -1.314 |
| 8 | OCD | 0.144 |
| 9 | OCD | 0.387 |
| 10 | OCD | 0.748 |
| 11 | OCD | 1.001 |
| 12 | OCD | 1.251 |
| 13 | OCD | 0.596 |
| 14 | OCU | -1.034 |
| 15 | OCD | 0.494 |
| 16 | OCU | 0.069 |
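
The claim in the table caption can be checked by tallying the sign of the CASP1 log2FC within each context. The sketch below copies the values from the table; the helper name `direction_counts` is an illustrative assumption and is not part of the original work.

```python
# CASP1 log2FC per CP, copied from the table as (ID, context, log2FC).
casp1 = [
    (1, "OCU", -0.128), (2, "OCU", -0.400), (3, "OCD", 0.495),
    (4, "OCU", -0.350), (5, "OCU", 0.247), (6, "OCU", -0.861),
    (7, "OCU", -1.314), (8, "OCD", 0.144), (9, "OCD", 0.387),
    (10, "OCD", 0.748), (11, "OCD", 1.001), (12, "OCD", 1.251),
    (13, "OCD", 0.596), (14, "OCU", -1.034), (15, "OCD", 0.494),
    (16, "OCU", 0.069),
]


def direction_counts(rows):
    """Return {context: (n_down, n_up)} based on the sign of log2FC."""
    counts = {}
    for _cp_id, context, log2fc in rows:
        down, up = counts.get(context, (0, 0))
        if log2fc < 0:
            down += 1
        elif log2fc > 0:
            up += 1
        counts[context] = (down, up)
    return counts


counts = direction_counts(casp1)
for context, (down, up) in counts.items():
    print(f"{context}: {down} down, {up} up")
```

With these values, 6 of the 8 OCU CPs have a negative log2FC and all 8 OCD CPs have a positive one.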
A total of 16 CPs related to osteogenesis, generated from 11 GEO data sets
| ID | GSE ID | Platform | Control population | Test population | Context |
|---|---|---|---|---|---|
| 1 | GSE111237 | GPL6885 | osteoclast progenitors | mature-osteoclast progenitors | OCU |
| 2 | GSE142866 | GPL17021 | No RANKL | RANKL | OCU |
| 3 | GSE142866 | GPL17021 | RANKL | RANKL with LEA | OCD |
| 4 | GSE149887 | GPL21103 | Mo (macrophages) | Oc (osteoclasts) | OCU |
| 5 | GSE17563 | GPL339 | bone marrow treated with hRANKL 0 hr | bone marrow treated with hRANKL 24h | OCU |
| 6 | GSE17563 | GPL339 | bone marrow treated with hRANKL 0 hr | bone marrow treated with hRANKL 72h | OCU |
| 7 | GSE20850 | GPL1261 | Macrophages | Osteoclasts | OCU |
| 8 | GSE30160 | GPL1261 | WT | RANK IVVY Knockin | OCD |
| 9 | GSE37219 | GPL8321 | WT | NFATc1-deficient OC | OCD |
| 10 | GSE57468 | GPL6885 | BMM RANKL 1day | BMM RANKL 0day | OCD |
| 11 | GSE57468 | GPL6885 | BMM RANKL 2day | BMM RANKL 0day | OCD |
| 12 | GSE57468 | GPL6885 | BMM RANKL 3day | BMM RANKL 0day | OCD |
| 13 | GSE76988 | GPL13112 | wild-type osteoclast M-CSF RANKL 24H | wild-type osteoclast M-CSF RANKL IL-3 24H | OCD |
| 14 | GSE76988 | GPL13112 | wild-type osteoclast precursor M-CSF 24H | wild-type osteoclast M-CSF RANKL 24H | OCU |
| 15 | GSE72846 | GPL17021 | Control | MMP9 KO | OCD |
| 16 | GSE135479 | GPL21103 | RANKL | FOXO3 RANKL | OCU |
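
The counts in the table caption can be reproduced from the rows themselves. The sketch below keeps only the ID, GSE ID and context columns; the variable names are illustrative assumptions.

```python
from collections import Counter

# (CP ID, GSE ID, context), copied from the table.
cps = [
    (1, "GSE111237", "OCU"), (2, "GSE142866", "OCU"), (3, "GSE142866", "OCD"),
    (4, "GSE149887", "OCU"), (5, "GSE17563", "OCU"), (6, "GSE17563", "OCU"),
    (7, "GSE20850", "OCU"), (8, "GSE30160", "OCD"), (9, "GSE37219", "OCD"),
    (10, "GSE57468", "OCD"), (11, "GSE57468", "OCD"), (12, "GSE57468", "OCD"),
    (13, "GSE76988", "OCD"), (14, "GSE76988", "OCU"), (15, "GSE72846", "OCD"),
    (16, "GSE135479", "OCU"),
]

n_cps = len(cps)                                  # number of CPs
n_datasets = len({gse for _, gse, _ in cps})      # distinct GEO series
per_context = Counter(ctx for _, _, ctx in cps)   # CPs per context
per_dataset = Counter(gse for _, gse, _ in cps)   # CPs per GEO series

print(n_cps, n_datasets, dict(per_context))
```

The tally gives 16 CPs from 11 distinct GSE IDs, split evenly into 8 OCU and 8 OCD CPs; GSE57468 contributes the most CPs (three).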
A total of 14 functional groups that behave differently in different contexts
| Functional group | Contained genes | OCU | OCD |
|---|---|---|---|
| 1. Autoregulatory - up | CCL9, CCR1, CD109, CXCL10, SDC1, VEGFC | Activation | Inhibition |
| 2. Cell Differentiation signaling factors - down | PLCB4, PLCB2, GIT1, DOCK5, TRAF1, TRAF6 | Activation | Inhibition |
| 3. Cell Differentiation | CLCN7, CAR2, CALCR, CSF1R, TREM2, TNFRSF11A, OSCAR, OCSTAMP, MST1R, ITGB3, DCSTAMP | Activation | Inhibition |
| 4. Cytoskeleton Control | DCSTAMP, LAD1, MYO1B, OCSTAMP, SCIN, MYOD1, MARCKS | Activation | Inhibition |
| 5. Integrin Beta3 | CLCN7, OCSTAMP, CALCR, MST1R, CTSK, MMP14, ITGB3, MYO1D, ACP5, MMP9, CAR2, OSCAR | Activation | Inhibition |
| 6. Secreted Factors for External Cells - up | INF2, SEMA4D, SGPL1, SPP1, CXCL10, CCL9 | Activation | Inhibition |
| 7. Coupling Factors | PGF, SPNS2, CD200, SGPL1, SEMA7A, LIF, CST7 | Activation | Inhibition |
| 8. ACID & Enzymes for Matrix Dissolution | VCAN, ATP6V0D2, CAR2, CLCN7, SLC9B6, CTSK, ACP5, PDE2A, MMP14, HTRA1, MMP9, ADAM10, ATP6V0B, ATP6V0C, ATP6V0C-PS2 | Activation | Inhibition |
| 9. Autoregulatory - down | C1QA, C1QB, C1QC, CCL2, CCL3, CCL4, CCL6, CCL7, CXCL14, IGFBP4, PF4 | Inhibition | Activation |
| 10. Calcineurin Pathway | CALM1, CAMK1, CAMK2A, CALM2, CALM3, PPP3CA | Inhibition | Activation |
| 11. Cell Differentiation signaling factors - up | PPP2R3A, PPP3CA, CALM2, PPP2R3C, CAMK1, TNFAIP2, CAMK2A, CALM3, CALM1 | Inhibition | Activation |
| 12. Cell Signaling | SLIT1, SGPL1, INFB, IL10, CXCL5, IGF1, SPP1, SLIT3, C1QA, CCL8, CCL7, C1QC, C1QB | Inhibition | Activation |
| 13. MSC Signature | ACTA2, ACTG2, BGN, CCND1, COL1A1, COL1A2, COL2A1, DKK3, FN1, SERPINH1, SPARC, TNC | Inhibition | Activation |
| 14. Secreted Factors for External Cells - down | CCL6, CCL4, CCL3, CCL2, C1QC, C1QB, CCL7, CXCL14, CD200R1, CXCL16, IGF1, APOE, C1QA | Inhibition | Activation |
Fig. 6 The 2-D plot produced by applying t-SNE to FGS of 16 CPs. Orange dots denote OCU CPs and blue dots denote OCD CPs
Fig. 7 ROC curves of the 4 models’ prediction results using genes in FGs and random genes
Performance of 4 models
| Models | ACC | MCC | Sn | Sp | TP | TN | FP | FN |
|---|---|---|---|---|---|---|---|---|
| SVM | 0.79 | 0.58 | 0.82 | 0.76 | 303 | 283 | 87 | 65 |
| Gaussian NB | 0.69 | 0.41 | 0.48 | 0.89 | 177 | 332 | 38 | 191 |
| NN | 0.81 | 0.61 | 0.84 | 0.77 | 310 | 285 | 85 | 58 |
| LR | 0.79 | 0.59 | 0.78 | 0.81 | 288 | 300 | 70 | 80 |
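
The ACC, MCC, Sn and Sp columns follow from the TP, TN, FP and FN counts through the standard definitions of accuracy, Matthews correlation coefficient, sensitivity and specificity. The sketch below recomputes them from the table; the function name `metrics` is an illustrative assumption. The recomputed values agree with the reported ones to within 0.01, which suggests the reported figures were rounded or truncated to two decimals.

```python
import math


def metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    sn = tp / (tp + fn)   # sensitivity (true positive rate)
    sp = tn / (tn + fp)   # specificity (true negative rate)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "MCC": mcc, "Sn": sn, "Sp": sp}


# (TP, TN, FP, FN) per model, copied from the table.
confusion = {
    "SVM": (303, 283, 87, 65),
    "Gaussian NB": (177, 332, 38, 191),
    "NN": (310, 285, 85, 58),
    "LR": (288, 300, 70, 80),
}

results = {name: metrics(*counts) for name, counts in confusion.items()}
for name, m in results.items():
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Every model is evaluated on the same 738 examples (368 positives and 370 negatives), so the rows are directly comparable.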
Fig. 8 TFGS obtained by each machine learning model compared with that obtained by log2 ratio only. A higher score means that more FGs across all CPs follow the expected activation or inhibition
Distinct target genes of IRF8 in the SVM and NN models compared with log2FC
| Models | Differently called genes |
|---|---|
| SVM | PARP8; BID; CCRL2; CTPS2; CASP1; H2-M3; SLC15A3; IRF5 |
| NN | TAP2; BID; BST2; CCRL2; CTPS2; H2-M3; NOTCH1 |
Distinct target genes of NFATc1 in the SVM and NN models compared with log2FC
| Models | Differently called genes |
|---|---|
| SVM | ACAD10; CNIH4; DNAJA3; DUS3L; FEM1A; PEX16; PIGQ; PRDX3; SEMA7A; SSNA1; ANXA6; ATF3; CCL3; CDC42EP3; CPEB2; DCK; IRF5; LXN; MAP4K2; PIK3CG; SLC15A3; SLC9A3R1; SP100; TEX2 |
| NN | ACAD10; COG8; COX17; DNAJA3; DUS3L; HINT2; MBTPS1; MRPL12; PRDX3; PTDSS2; TCF12; UMPS; ADAM15; CCL3; CPEB2; LXN; MAP4K2; MS4A7; NAB2; NOTCH1; SLC15A3; SLC9A3R1; SNAP29; TEX2; TMBIM1 |
Target genes with 5 features in different CPs and their prediction results
| CP ID | Gene Symbol | F1 | F2 | F3 | F4 | F5 | SVM prediction result | NN prediction result | Pattern |
|---|---|---|---|---|---|---|---|---|---|
| 9 | NOTCH1 | – | + | + | – | – | Up | Up | GP |
| 5 | BID | + | + | + | – | – | Down | Down | GA |
| 6 | BID | + | + | + | – | – | Down | Down | GA |
| 5 | IRF5 | + | + | + | + | – | Down | Down | GA |
| 9 | CCRL2 | – | + | + | + | + | Up | Up | GP |
| 5 | CASP1 | + | + | + | + | – | Down | Down | GA |
| 16 | CASP1 | + | + | + | + | + | Down | Up | - |
| 10 | TAP2 | – | + | + | + | + | Up | Down | - |
| 6 | CTPS2 | + | + | + | + | – | Down | Down | GA |