Mona Maharjan, Raihanul Bari Tanvir, Kamal Chowdhury, Wenrui Duan, Ananda Mohan Mondal.
Abstract
BACKGROUND: Lung cancer is the leading cause of cancer death worldwide, with more than 142,670 deaths estimated in the United States alone in 2019. Consequently, there is an overarching need to identify the key biomarkers for lung cancer. The aim of this study is to computationally identify biomarker genes for lung cancer that can aid in its diagnosis and treatment. Gene expression profiles from two types of studies, non-treatment and treatment, are considered for discovering biomarker genes. In non-treatment studies, healthy samples are the controls and cancer samples are the cases, whereas in treatment studies the controls are cancer cell lines without treatment and the cases are cancer cell lines with treatment.
Keywords: Bioinformatics; Computational identification of biomarker; Lung cancer biomarkers; Non-treatment studies; Treatment studies
Year: 2020 PMID: 33272232 PMCID: PMC7713218 DOI: 10.1186/s12859-020-3524-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Data cleaning and identification of the top 250 differentially expressed genes (DEGs)
Summary of expression profile studies for lung cancer. GDS1204 and GDS2499 are treatment studies and the others are non-treatment studies
| Study | Data Type | #Samples | #Control | #Case | Cancer Type | Treatment |
|---|---|---|---|---|---|---|
| GDS1204 | Microarray | 18 | 9 | 9 | A549 Cell Line | Yes |
| GDS1312 | Microarray | 10 | 5 | 5 | Squamous Cell | No |
| GDS2499 | Microarray | 12 | 3 | 9 | A549 Cell Line | Yes |
| GDS4794 | Microarray | 65 | 42 | 23 | Small Cell | No |
| GDS5201 | Microarray | 6 | 2 | 4 | Mouse Model | No |
Fig. 2 Overview of data analysis methodology
Fig. 3 Functionally interacting protein network of non-treatment DEGs created using ReactomeFI
Fig. 4 Top 10 hub genes highlighted in the non-treatment network. These hub genes are identified using the “Degree” scoring method in Cytohubba. Dark red nodes have the highest rank and light yellow nodes have the lowest rank
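As a rough illustration of what a degree-based hub ranking involves, the sketch below counts the distinct neighbours of each node in an undirected edge list and returns the highest-ranked nodes. It is a minimal stand-in written in plain Python, not the Cytohubba implementation; the function name `top_hubs_by_degree` and the placeholder gene names are hypothetical and the edges are not taken from the study's network.

```python
from collections import defaultdict


def top_hubs_by_degree(edges, k=10):
    """Rank nodes by degree (number of distinct neighbours) and return the top k."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops so they do not inflate the degree
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Sort by degree (descending), then by name so ties are ordered deterministically.
    ranked = sorted(neighbours, key=lambda n: (-len(neighbours[n]), n))
    return [(n, len(neighbours[n])) for n in ranked[:k]]


# Hypothetical toy network with placeholder gene names (illustrative only).
toy_edges = [
    ("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD"),
    ("geneB", "geneD"), ("geneE", "geneD"), ("geneF", "geneA"),
]
hubs = top_hubs_by_degree(toy_edges, k=3)
print(hubs)
```

In this toy network `geneA` has four neighbours and therefore ranks first, which corresponds to the darkest node in a degree-coloured view.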
The biomarker genes discovered from non-treatment studies along with their regulation and description
| Gene | Regulation | Description |
|---|---|---|
| BUB3 | UP | BUB3 mitotic checkpoint protein |
| CCNB1 | UP | Cyclin B1 |
| CCNB2 | DOWN | Cyclin B2 |
| CDC20 | DOWN | Cell division cycle 20 |
| CDCA8 | DOWN | Cell division cycle associated 8 |
| CDK1 | DOWN | Cyclin-dependent kinase 1 |
| CENPF | UP | Centromere Protein F |
| CENPI | UP | Centromere Protein I |
| KIF18A | UP | Kinesin Family Member 18A |
| KNTC1 | UP | Kinetochore Associated 1 |
| MAD2L1 | UP | MAD2 mitotic arrest deficient like 1 |
| NDC80 | UP | NDC80 Kinetochore Complex Component |
| NUP37 | DOWN | Nucleoporin 37 kDa |
| PCNA | UP | Proliferating Cell Nuclear Antigen |
| RAD21 | UP | RAD21 homolog |
| ZWINT | UP | ZW10 Interacting Kinetochore Protein |
The biomarker genes discovered from treatment studies along with their regulation and description
| Gene | Regulation | Description |
|---|---|---|
| CEBPB | DOWN | CCAAT/enhancer binding protein beta |
| FBXL14 | DOWN | F-box and leucine rich repeat protein 14 |
| FBXL3 | DOWN | F-box and leucine rich repeat protein 3 |
| FBXO30 | DOWN | F-box protein 30 |
| FBXO9 | DOWN | F-box protein 9 |
| FOXA1 | DOWN | Forkhead box A1 |
| FOXA2 | DOWN | Forkhead box A2 |
| JUN | UP | Jun proto-oncogene, AP-1 transcription factor subunit |
| JUND | DOWN | JunD proto-oncogene, AP-1 transcription factor subunit |
| MAPK8 | UP | Mitogen activated protein kinase 8 |
| MYC | DOWN | V-myc avian myelocytomatosis viral oncogene homolog |
| MYLIP | DOWN | Myosin regulatory light chain interacting protein |
| NFE2L2 | DOWN | Nuclear factor, erythroid 2 like 2 |
| RNF19A | DOWN | Ring finger protein 19A, RBR E3 ubiquitin protein ligase |
| RNF217 | DOWN | Ring finger protein 217 |
| UBC | DOWN | Ubiquitin C |
Fig. 5 Pathway enrichment analysis of non-treatment and treatment biomarker genes, showing the significantly enriched KEGG pathways. Each point represents a pathway. The enrichment ratio is the number of observed genes in a pathway divided by the number of genes expected in that pathway
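The enrichment ratio described in the caption can be sketched as follows. The caption does not say how the expected count is obtained; this sketch assumes the usual over-representation convention, in which the expected count is the pathway size scaled by the fraction of the background that the input gene list covers. The function name and all numbers are hypothetical.

```python
def enrichment_ratio(observed, pathway_size, list_size, background_size):
    """Observed gene count in a pathway divided by the count expected by chance.

    Assumption: expected = pathway_size * list_size / background_size,
    i.e. the mean overlap when list_size genes are drawn at random from the background.
    """
    if min(pathway_size, list_size, background_size) <= 0:
        raise ValueError("pathway, list and background sizes must be positive")
    expected = pathway_size * list_size / background_size
    return observed / expected


# Hypothetical numbers: 4 of 16 input genes fall in a 100-gene pathway,
# against a background of 16,000 genes.
ratio = enrichment_ratio(observed=4, pathway_size=100, list_size=16, background_size=16000)
print(round(ratio, 2))
```

A ratio well above 1 means the pathway contains more of the input genes than chance alone would suggest; significance still has to be assessed separately, for example with a hypergeometric test.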
Fig. 6 Survival analysis of non-treatment and treatment biomarker genes. a) CCNB2 and b) CDC20 are biomarker genes from non-treatment studies. Both genes have a hazard ratio (HR) > 1, which indicates that the low-expression group has a higher chance of survival. c) FBXL3 and d) FOXA2 are biomarker genes from treatment studies. Both genes have HR < 1, which indicates that the high-expression group has a higher chance of survival
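The reading of the hazard ratio used in the caption can be written down as a small helper. It assumes, consistent with the caption, that the HR compares the high-expression group against the low-expression group; the function name and the example values are hypothetical and are not results from the study.

```python
def group_with_better_survival(hazard_ratio):
    """Interpret an HR defined as hazard(high expression) / hazard(low expression)."""
    if hazard_ratio <= 0:
        raise ValueError("a hazard ratio must be positive")
    if hazard_ratio > 1:
        return "low expression"   # high expression carries the higher hazard
    if hazard_ratio < 1:
        return "high expression"  # high expression carries the lower hazard
    return "no difference"


# Hypothetical hazard ratios, for illustration only.
print(group_with_better_survival(1.4))
print(group_with_better_survival(0.7))
```

The helper ignores confidence intervals and p-values, which would be needed before drawing any conclusion from a real survival curve.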
Number of DEGs remaining at each step of the analysis. Each row represents a step of analysis. The second-to-last row gives the number of biomarker genes discovered from non-treatment and treatment studies. The last row shows the number of biomarker genes that could be used to design a lab experiment for further exploration of dynamics in lung cancer development
| Tools/Analysis | Number of DEGs (non-treatment) | Number of DEGs (treatment) |
|---|---|---|
| GEO Built-in Filter | 16,876 | |
| GEO2R | 407 | 547 |
| ReactomeFI | 166 | 260 |
| Cytohubba | 38 | 51 |
| Hub genes (Common in two algorithms) | 21 | 24 |
| MCODE (Genes present in clusters) | 63 | 55 |
| Biomarkers (Common in Hub and MCODE) | 16 | 16 |
| Survival Analysis | 14 | 14 |
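The two "common in" rows of the table are set intersections: hub genes are those shared by two ranking algorithms, and biomarkers are the hub genes that also appear in MCODE clusters. A minimal sketch of that filtering, assuming each step simply yields a list of gene symbols, is shown below; the helper name and the placeholder gene identifiers are hypothetical.

```python
def common_genes(*gene_lists):
    """Return the genes present in every input list, sorted for stable output."""
    sets = [set(genes) for genes in gene_lists]
    return sorted(set.intersection(*sets)) if sets else []


# Placeholder identifiers standing in for real gene symbols (illustrative only).
ranking_one = ["g1", "g2", "g3", "g4"]    # top genes from a first scoring method
ranking_two = ["g2", "g3", "g4", "g5"]    # top genes from a second scoring method
cluster_genes = ["g3", "g4", "g6"]        # genes found in network clusters

hub_genes = common_genes(ranking_one, ranking_two)
biomarkers = common_genes(hub_genes, cluster_genes)
print(hub_genes, biomarkers)
```

Each intersection can only shrink the candidate list, which matches the decreasing counts from the hub-gene row to the biomarker row.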