Abstract
Non-small cell lung cancer (NSCLC) has two major subtypes: adenocarcinoma (AC) and squamous cell carcinoma (SCC). The diagnosis and treatment of NSCLC are hindered by limited knowledge of the pathogenic mechanisms of its subtypes, so the molecular mechanisms underlying AC and SCC need to be investigated. In this work, we improved the logic analysis algorithm to mine the sufficient and necessary conditions for the presence states (presence or absence) of phenotypes. We applied our method to AC and SCC specimens and identified [Formula: see text] lower and [Formula: see text] higher logic relationships between genes and the two subtypes of NSCLC. The discovered relationships were independent of the specimens selected, and their significance was validated by statistical testing. Compared with two earlier methods (the non-negative matrix factorization method and the relevance analysis method), the current method outperformed both in recall rate and classification accuracy on NSCLC and normal specimens. We obtained [Formula: see text] biomarkers. Among the [Formula: see text] biomarkers, [Formula: see text] genes have been used to distinguish AC from SCC in practice, and the other six genes are newly discovered biomarkers for distinguishing the subtypes. Furthermore, NKX2-1 has been considered a molecular target for the targeted therapy of AC, and [Formula: see text] other genes may be novel molecular targets. By gene ontology analysis, we found that two biological processes ('epidermis development' and 'cell adhesion') were closely related to the tumorigenesis of the NSCLC subtypes. More generally, the current method could be extended to other complex diseases to distinguish subtypes and detect molecular targets for targeted therapy.
Year: 2014 PMID: 24743794 PMCID: PMC3990524 DOI: 10.1371/journal.pone.0094644
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Data source.
| Subtype | No.(n) | | | |
| AC | GSE10245(40) | GSE37745(106) | GSE18842(14) | GSE28571 (50) |
| SCC | GSE10245(18) | GSE37745(66) | GSE18842(32) | GSE28571 (28) |
| Normal | — | — | GSE18842(45) | — |
‘No.’ is the accession number from the Gene Expression Omnibus (GEO) database in NCBI; ‘n’ is the number of specimens; ‘—’ means there are no specimens from the corresponding data set.
Figure 1. The recall rate of genes obtained by three methods.
For each method, we rank the genes in descending order of the coefficients relating genes to phenotypes. We select the top genes, where [Formula: see text]. The classification accuracy is calculated based on the top genes. ‘RA’, ‘NMF’ and ‘U’ represent the relevance analysis method, the non-negative matrix factorization method and the current method, respectively.
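The ranking step in this caption can be illustrated with a short sketch. The caption does not define the recall rate, so the code assumes it is the fraction of a reference set of known genes that appears among the top-ranked genes. The function names, the toy coefficients and the reference set are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def top_k_genes(coefficients: Dict[str, float], k: int) -> List[str]:
    """Rank genes by coefficient (descending) and return the first k.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    ranked = sorted(coefficients, key=lambda g: (-coefficients[g], g))
    return ranked[:k]


def recall_at_k(coefficients: Dict[str, float], reference: Set[str], k: int) -> float:
    """Assumed definition: share of reference genes found among the top k."""
    if not reference:
        raise ValueError("reference set must not be empty")
    selected = set(top_k_genes(coefficients, k))
    return len(selected & reference) / len(reference)


# Made-up coefficients and reference genes, for illustration only.
toy = {"g1": 0.9, "g2": 0.7, "g3": 0.4, "g4": 0.1}
known = {"g1", "g3"}
for k in (1, 2, 3):
    print(k, recall_at_k(toy, known, k))
```

Plotting this quantity against the number of selected genes, once per method, would give curves of the kind the figure compares.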
Figure 2. 25 genes are related to the subtypes of NSCLC.
There are genes related to the subtypes of NSCLC by lower logic relationships, and each gene has an associated coefficient. The genes are ranked by coefficient in descending order, and the top genes are selected to identify biomarkers. The blue nodes represent biomarkers identified in this work. The yellow nodes represent six genes that are not related to NSCLC on the NSCLC and normal specimens. The red nodes represent the subtypes, i.e. AC and SCC.
Significant GO terms.
| GO terms | Description | P-value1 | P-value2 | E1 | E2 |
| GO:0009888 | tissue development | | | | |
| GO:0008544 | epidermis development | | | | |
| GO:0030855 | epithelial cell differentiation | | | | |
| GO:0048856 | anatomical structure development | | | | |
| GO:0032502 | developmental process | | | | |
| GO:0007155 | cell adhesion | | | | |
| GO:0022610 | biological adhesion | | | | |
‘P-value1’ and ‘P-value2’ denote the p-values of the GO terms based on the NSCLC subtype data and on the NSCLC and normal data, respectively. ‘E1’ and ‘E2’ are the corresponding enrichment values of the GO terms on the same two data sets.
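The record does not state how the p-values and enrichment values were computed. A common choice for GO term analysis is a hypergeometric test together with a ratio of observed to background annotation frequency, and the sketch below assumes exactly that. The function names and the parameters `N`, `K`, `n` and `k` are hypothetical, and the numbers in the usage lines are made up.

```python
from math import comb


def hypergeom_pvalue(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K carry the GO term.

    N: background genes, K: background genes annotated with the term,
    n: selected genes, k: selected genes annotated with the term.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enrichment(N: int, K: int, n: int, k: int) -> float:
    """Assumed definition: annotated fraction in the selection over the background fraction."""
    return (k / n) / (K / N)


# Made-up counts, for illustration only.
print(hypergeom_pvalue(N=10, K=5, n=5, k=5))
print(enrichment(N=10, K=5, n=5, k=5))
```

An enrichment value above 1 means the term is more frequent among the selected genes than in the background, while the p-value measures how surprising that excess is under random selection.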
Lower logic function of vector .
| Type | Symbol | Lower logic function | Logic statement |
| | | | The value of |
| | | | The value of |
‘’ denotes the function symbol of type of lower logic relationships, where and represents the sign for the lower logic relationships.
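The formulas in this table did not survive extraction, so the following is only a sketch under stated assumptions. It assumes the two lower logic types are "the phenotype is present if and only if the gene is present" and "the phenotype is present if and only if the gene is absent", and that a relationship is scored with an uncertainty coefficient over binarized presence vectors. The names `uncertainty_coefficient` and `lower_logic_type`, and the labels 1 and 2, are hypothetical.

```python
from collections import Counter
from math import log2
from typing import Sequence


def entropy(values: Sequence) -> float:
    """Shannon entropy (bits) of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def uncertainty_coefficient(a: Sequence[int], c: Sequence[int]) -> float:
    """U(c|a) = [H(c) + H(a) - H(c, a)] / H(c); returns 0 if c is constant."""
    h_c = entropy(c)
    if h_c == 0:
        return 0.0
    return (h_c + entropy(a) - entropy(list(zip(c, a)))) / h_c


def lower_logic_type(a: Sequence[int], c: Sequence[int]) -> int:
    """Assumed labels: 1 if c mostly equals a, 2 if c mostly equals NOT a.

    U(c|a) is identical for a and NOT a, so the type has to be read off the
    direction of agreement. Exact ties are assigned to type 1.
    """
    agree = sum(int(x == y) for x, y in zip(a, c))
    return 1 if 2 * agree >= len(a) else 2


# Made-up binary vectors: gene state a and phenotype state c per specimen.
a = [1, 1, 0, 0]
c = [0, 0, 1, 1]
print(uncertainty_coefficient(a, c), lower_logic_type(a, c))
```

A coefficient near 1 says the gene state almost fully determines the phenotype state, and the type label says whether presence or absence of the gene goes with the phenotype.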
Higher logic function of vectors and .
| Type | Symbol | Higher logic function | Logic statement |
| | | | The value of |
| | | | The value of |
| | | | The value of |
| | | | The value of |
| | | | The value of |
| | | | The value of |
| | | | The value of |
| | | | The value of |
| | | | The value of |
| | | | The value of |
‘’ denotes function symbol of type of higher logic relationships, where and represents the sign for the higher logic relationships.
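The table has ten rows, which matches a known count: of the 16 Boolean functions of two binary inputs, exactly ten depend on both inputs (the two constants and the four functions of a single input are excluded). The sketch below enumerates those ten functions as truth tables. Whether they correspond one-to-one to the ten rows above is an assumption, since the formulas are missing, and the helper names are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Input combinations (a, b) in a fixed order; a truth table lists the
# output for each combination in this order.
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def depends_on_both(table: Tuple[int, ...]) -> bool:
    """True if flipping a changes the output for some b, and likewise for b."""
    f = dict(zip(INPUTS, table))
    on_a = any(f[(0, b)] != f[(1, b)] for b in (0, 1))
    on_b = any(f[(a, 0)] != f[(a, 1)] for a in (0, 1))
    return on_a and on_b


def apply_table(table: Tuple[int, ...], a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Evaluate a truth table element-wise on two binary vectors."""
    f = dict(zip(INPUTS, table))
    return [f[(x, y)] for x, y in zip(a, b)]


higher_logic_tables = [t for t in product((0, 1), repeat=4) if depends_on_both(t)]
print(len(higher_logic_tables))

# Example: the table (0, 0, 0, 1) is logical AND of the two gene states.
print(apply_table((0, 0, 0, 1), [1, 1, 0], [1, 0, 1]))
```

Each resulting vector can then be compared with a phenotype vector in the same way as a single gene vector, which is what makes a two-gene relationship testable.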