Srinivasulu Yerukala Sathipati, Divya Sahu, Hsuan-Cheng Huang, Yenching Lin, Shinn-Ying Ho.
Abstract
Neuroblastoma (NB) is a commonly occurring cancer among infants and young children. Recently, long non-coding RNAs (lncRNAs) have been used as prognostic biomarkers for therapeutics and interventions in various cancers. Given the poor survival of patients with NB, lncRNA-based therapeutic strategies must be improved. This work proposes an overall survival time estimator called SVR-NB to identify the lncRNA signature that is associated with the overall survival of patients with NB. SVR-NB is an optimized support vector regression (SVR)-based method that uses an inheritable bi-objective combinatorial genetic algorithm for feature selection. A dataset of 231 NB patients, taken from Gene Expression Omnibus accession GSE62564 and containing overall survival information and expression profiles of 783 lncRNAs, was used to design and evaluate SVR-NB. SVR-NB identified a signature of 35 lncRNAs and achieved a mean squared correlation coefficient of 0.85 and a mean absolute error of 0.56 years between the actual and estimated overall survival time using 10-fold cross-validation. Further, we ranked and characterized the 35 lncRNAs according to their contribution towards the estimation accuracy. Functional annotations and co-expression gene analysis of LOC440896, LINC00632, and IGF2-AS revealed the association of co-expressed genes in Kyoto Encyclopedia of Genes and Genomes pathways.
Year: 2019 PMID: 30914706 PMCID: PMC6435792 DOI: 10.1038/s41598-019-41553-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Performance of SVR-NB.
| Method | Features Selected | Squared correlation coefficient | Mean absolute error (years) |
|---|---|---|---|
| SVR-NB | 33 | 0.89 | 0.49 |
| SVR-NB(Mean) | 30.26 | 0.85 ± 0.009 | 0.56 ± 0.09 |
| SVR-NB(FFS) | 35 | 0.84 | 0.63 |
| Ridge regression | 783 | 0.62 | 0.87 |
| LASSO | 41 | 0.68 | 0.78 |
| Elastic net | 44 | 0.67 | 0.81 |
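The two metrics reported in the table can be computed from paired actual and estimated survival times. The following is a minimal sketch, assuming that the squared correlation coefficient is the square of the Pearson correlation (the record does not give the formula). The function names and the survival values are hypothetical and for illustration only; they are not data from GSE62564.

```python
def squared_correlation(actual, estimated):
    """Square of the Pearson correlation between two equal-length sequences.

    Assumption: this is the definition behind the table's
    "squared correlation coefficient"; the source does not state it.
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov * cov / (var_a * var_e)


def mean_absolute_error(actual, estimated):
    """Average absolute difference, in the same unit as the inputs (years here)."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)


# Hypothetical survival times in years (illustrative values only).
actual_years = [1.0, 2.5, 4.0, 6.5, 8.0]
estimated_years = [1.5, 2.0, 4.5, 6.0, 8.5]

r2 = squared_correlation(actual_years, estimated_years)
mae = mean_absolute_error(actual_years, estimated_years)
print(f"squared correlation = {r2:.3f}, MAE = {mae:.2f} years")
```

Under 10-fold cross-validation, these two values would be computed on the held-out estimates; how the source aggregates them across folds is not stated here.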
Figure 1 (a) Estimation performance of SVR-NB. (b) Estimation performance of ridge regression. (c) Estimation performance of LASSO regression. (d) Estimation performance of elastic net regression. The X-axis shows actual overall survival time, and the Y-axis shows estimated survival time.
Figure 2 SVR-NB validation using an independent test cohort of 127 NB patients.
MED ranking of lncRNAs.
| Rank | Ref-Seq ID | LncRNA-Symbol | MED score |
|---|---|---|---|
| 1 | NR_015361 | LOC440896 | 2.588 |
| 2 | XR_108432 | LOC729770 | 1.694 |
| 3 | NR_028344 | LINC00632 | 1.655 |
| 4 | NR_002712 | CXCR2P1 | 1.451 |
| 5 | NR_033921 | LOC643542 | 1.388 |
| 6 | XR_109027 | LOC387720 | 1.296 |
| 7 | NR_028043 | IGF2-AS | 1.282 |
| 8 | NM_001164467 | DUX4L3 | 1.279 |
| 9 | NR_002835 | HAS2-AS1 | 0.983 |
| 10 | NR_038235 | LINC01606 | 0.981 |
| 11 | NR_030171 | MIR492 | 0.975 |
| 12 | NR_027088 | LOC284661 | 0.953 |
| 13 | NR_002145 | OR2L1P | 0.945 |
| 14 | NR_003503 | GGT8P | 0.925 |
| 15 | XR_109271 | LOC400511 | 0.857 |
| 16 | NR_027284 | LINC00602 | 0.811 |
| 17 | NR_033942 | ARHGEF34P | 0.768 |
| 18 | XM_001717149 | LOC100130503 | 0.719 |
| 19 | NR_027321 | LINC00964 | 0.649 |
| 20 | NR_002766 | MEG3 | 0.614 |
| 21 | NR_026816 | PSORS1C3 | 0.589 |
| 22 | NR_003187 | NCF1C | 0.511 |
| 23 | XR_109119 | LOC100129223 | 0.369 |
| 24 | XR_111273 | LOC100509445 | 0.300 |
| 25 | NR_033400 | CSNK1G2-AS1 | 0.290 |
| 26 | NR_029965 | MIR431 | 0.258 |
| 27 | NR_024192 | HILS1 | 0.255 |
| 28 | NR_026766 | MYCNOS | 0.236 |
| 29 | NR_038977 | LINC01239 | 0.155 |
| 30 | NR_073404 | LOC441081 | 0.125 |
| 31 | NR_037890 | DNAJB8-AS1 | 0.107 |
| 32 | NR_024119 | LINC00244 | 0.106 |
| 33 | NR_046173 | LOC254896 | 0.102 |
| 34 | XR_110545 | LOC730376 | 0.088 |
| 35 | XR_109597 | GDF5OS | 0.066 |
LncRNA and their predicted protein interactions.
| ID | Gene Name | Species | UCSC_TFBS |
|---|---|---|---|
| 3580 | C-X-C motif chemokine receptor 2 pseudogene 1 (CXCR2P1) | Homo sapiens | AP1, AP4, AREB6, ARP1, CDP, CDPCR3, CEBP, CETS1P54, CP2, E47, GATA1, GATA3, GR, GRE, HEN1, HNF1, HTF, IK3, LUN1, MYOD, MZF1, NF1, NFAT, P300, PAX4, PAX5, SEF1, SRF, TAL1ALPHAE47, TAL1BETAE47, TAL1BETAITF2, TAXCREB, TCF11, YY1 |
| 594842 | HAS2 antisense RNA 1 (HAS2-AS1) | Homo sapiens | AHR, AHRARNT, AML1, AP1, AP4, AREB6, ARNT, ATF, ATF6, BACH1, BACH2, BRACH, CART1, CDC5, CDPCR3HD, CEBP, CREB, CREBP1, CREBP1CJUN, E2F, E47, E4BP4, EGR3, EVI1, FOXJ2, FOXO3, FOXO4, FREAC3, FREAC4, FREAC7, GATA1, GFI1, GRE, HAND1E47, HEN1, HFH1, HFH3, HSF1, HSF2, HTF, IK2, IK3, LHX3, LMO2COM, LUN1, MEIS1BHOXA9, MYCMAX, MYOD, NFE2, NFKB, NFY, NKX25, NKX61, NMYC, OCT1, P300, PAX2, PAX4, PAX6, PBX1, PPARG, RFX1, S8, SOX5, SRY, STAT3, STAT5A, USF, XBP1, YY1, ZIC3 |
| 653548 | double homeobox 4 like 3 (DUX4L3) | Homo sapiens | AP2REP, AREB6, CDPCR3HD, FOXO3, FREAC4, HSF2, OCT1, P53, PAX3, PAX5, SPZ1, TCF11MAFG |
| 440896 | uncharacterized LOC440896 (LOC440896) | Homo sapiens | CEBPB, EVI1, FOXJ2, FREAC2, GATA1, IK3, ISRE, NKX25, PAX3, RP58, TCF11MAFG, TST1 |
KEGG pathway association of co-expressed genes for LOC440896, LINC00632, and IGF2-AS.
| LncRNA | Gene symbol | Gene name | Correlation with lncRNA | KEGG Pathway name (KEGG ID) |
|---|---|---|---|---|
| LOC440896 | SPATA24 | spermatogenesis associated 24 | 0.42 | Jak-STAT signaling pathway (hsa05630) |
| | LOC644090 | uncharacterized LOC644090 | 0.33 | |
| | CRLF2 | cytokine receptor-like factor 2 | 0.29 | |
| | RANBP3L | RAN binding protein 3-like | 0.19 | |
| LINC00632 | STMN4 | stathmin-like 4 | 0.26 | GABAergic synapse (hsa04727) |
| | MOBP | myelin-associated oligodendrocyte basic | 0.26 | |
| | KIF1A | kinesin family member 1A | 0.24 | |
| IGF2-AS | NYX | nyctalopin | 0.55 | Platelet activation (hsa04611) |
| | LARGE | like-glycosyltransferase | 0.52 | |
| | DBP | D site of albumin promoter (albumin D-box) binding protein | 0.46 | |
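The "Correlation with lncRNA" column lists co-expressed genes ordered by their correlation with each lncRNA. A minimal sketch of such a ranking follows, assuming a Pearson correlation across samples (the record does not say which correlation measure was used). The gene names `GENE_A` to `GENE_C`, the helper `rank_coexpressed`, and all expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_coexpressed(lncrna_expr, gene_expr, top_n=3):
    """Return the top_n (gene, correlation) pairs, highest correlation first."""
    scores = {gene: pearson(lncrna_expr, values) for gene, values in gene_expr.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Made-up expression values for one lncRNA and three genes over five samples.
lncrna = [1.0, 2.0, 3.0, 4.0, 5.0]
genes = {
    "GENE_A": [2.0, 4.1, 5.9, 8.2, 9.8],  # rises with the lncRNA
    "GENE_B": [5.0, 3.0, 4.0, 2.0, 1.0],  # falls as the lncRNA rises
    "GENE_C": [1.0, 3.0, 2.0, 5.0, 4.0],  # moderately positive
}

top = rank_coexpressed(lncrna, genes, top_n=2)
for gene, r in top:
    print(f"{gene}: r = {r:.2f}")
```

Sorting in descending order keeps only positively co-expressed genes near the top, which matches the positive correlations shown in the table; a different cutoff or a signed/absolute ranking would be an equally plausible design.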
Figure 3 Co-expressed gene regulatory network of (a) LOC440896, (b) LINC00632, and (c) IGF2-AS. Nodes represent lncRNAs, and edges represent co-expressed genes. The red node in the middle indicates the lncRNA. The yellow, blue, green, aqua, and grey coloured nodes indicate the genes that are involved in different KEGG pathways.