Rian Pratama [1], Jae Joon Hwang [2], Ji Hye Lee [3,4], Giltae Song [5], Hae Ryoun Park [6,7]

1. School of Computer Science and Engineering, Pusan National University, 63 Busandaehak-Ro, Busan, 46241, Republic of Korea.
2. Department of Oral and Maxillofacial Radiology, School of Dentistry, Pusan National University, Dental Research Institute, Yangsan, 50610, Republic of Korea.
3. Department of Oral Pathology, School of Dentistry, Pusan National University, 49 Busandaehak-Ro, Yangsan, 50612, Republic of Korea.
4. Periodontal Disease Signaling Network Research Center, School of Dentistry, Pusan National University, Yangsan, 50612, Republic of Korea.
5. School of Computer Science and Engineering, Pusan National University, 63 Busandaehak-Ro, Busan, 46241, Republic of Korea. gsong@pusan.ac.kr.
6. Department of Oral Pathology, School of Dentistry, Pusan National University, 49 Busandaehak-Ro, Yangsan, 50612, Republic of Korea. parkhr@pusan.ac.kr.
7. Periodontal Disease Signaling Network Research Center, School of Dentistry, Pusan National University, Yangsan, 50612, Republic of Korea. parkhr@pusan.ac.kr.
Abstract
BACKGROUND: Tumour classification based on genetic data has recently been investigated. However, genetic datasets are difficult to handle because of their massive size and the complexity of manipulating them. In the present study, we examined the diagnostic performance of machine learning applications using imaging-based classifications of oral squamous cell carcinoma (OSCC) gene sets.

METHODS: RNA sequencing data from SCC tissues from various sites, including oral, non-oral head and neck, oesophageal, and cervical regions, were downloaded from The Cancer Genome Atlas (TCGA). Feature genes were extracted through a convolutional neural network (CNN) and machine learning, and the performance of each analysis was compared.

RESULTS: The machine learning analysis classified OSCC tumours with excellent performance. However, the tool performed worse at discriminating histopathologically dissimilar cancers derived from the same type of tissue than at differentiating cancers of the same histopathologic type with different tissue origins. This indicates that the differential gene expression pattern is a more important factor than histopathologic features for differentiating cancer types.

CONCLUSION: The CNN-based diagnostic model and the visualisation methods using RNA sequencing data were useful for correctly categorising OSCC. The analysis identified differentially expressed genes in multiwise comparisons of various types of SCCs, such as KCNA10, FOSL2, and PRDM16; the leader genes extracted from pairwise comparisons were FGF20, DLC1, and ZNF705D.
Keywords:
Convolutional neural network; Diagnostic model; Oral squamous cell carcinoma; The Cancer Genome Atlas; Tumour classification