Yuwei Zhang1, Yang Tao1, Huihui Ji2, Wei Li3, Xingli Guo4, Derry Minyao Ng1, Maria Haleem1, Yang Xi2, Changzheng Dong1, Jinshun Zhao1, Lina Zhang1, Xiaohong Zhang1, Yangyang Xie5, Xiaoyu Dai5, Qi Liao1. 1. Department of Preventative Medicine, Zhejiang Provincial Key Laboratory of Pathological and Physiological Technology, Medicine School of Ningbo University, Ningbo, Zhejiang, China. 2. Zhejiang Provincial Key Laboratory of Pathophysiology, School of Medicine, Ningbo University, Zhejiang, Ningbo, China. 3. Center for Genetic Medicine Research, Children's National Medical Center, Department of Genomics and Precision Medicine, George Washington University, Washington, DC, USA. 4. School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi Province 710071, China. 5. Anorectal Surgery, Ningbo Second Hospital, Ningbo, China.
Abstract
MOTIVATION: The genome-scale CRISPR/Cas9 system has become a widely accessible gene-editing technique and is commonly used to investigate gene function in biological processes and diseases, especially cancers. To characterize gene aberrations and assess their effects on cancer, we designed a pipeline to identify pan-cancer essential genes. METHODS: CRISPR screening data were collected from published studies and integrated with the Robust Rank Aggregation algorithm to identify essential genes. A hypergeometric test and random walk with restart (RWR) were then used to predict additional essential genes on a broader scale. Finally, the expression status and potential roles of these genes were explored using the TCGA portal and regulatory network analysis. RESULTS: We collected 926 samples from 10 CRISPR-based screening studies covering 33 cancer types and identified cancer-essential genes comprising 799 protein-coding genes (PCGs) and 97 long non-coding RNAs (lncRNAs). We then constructed a 'bi-colored' network containing both PCGs and lncRNAs and applied the hypergeometric test and RWR to it to predict additional essential genes, including 495 PCGs and 280 lncRNAs. With the full set of essential genes in hand, we investigated their potential roles in cancer and found that essential genes have higher and more stable expression levels and are associated with multiple cancer-related biological processes and with survival time. The regulatory network analysis detected two intriguing modules of essential genes that participate in the regulation of the cell cycle and ribosome biogenesis in cancer. AVAILABILITY AND IMPLEMENTATION: . SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
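The abstract does not spell out how the hypergeometric test is applied to the network. One common reading, shown here only as a minimal sketch under that assumption, is a neighbor-enrichment test: a candidate gene is scored by whether its network neighbors contain more known essential genes than expected by chance. The function names, the toy network, and the choice of background population (all other genes in the network) are illustrative and are not taken from the paper.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper-tail probability P(X >= k) for a hypergeometric variable.

    N: population size, K: number of 'successes' in the population,
    n: number of draws, k: observed successes among the draws.
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def neighbor_enrichment(adjacency, seeds):
    """Score every non-seed gene by enrichment of seed genes among its neighbors.

    adjacency: dict mapping each gene to a list of neighboring genes.
    seeds: known essential genes (assumed to be keys of `adjacency`).
    Returns a dict gene -> p-value (smaller means stronger enrichment).
    """
    seeds = set(seeds) & set(adjacency)
    population = len(adjacency) - 1  # background: every gene except the candidate
    p_values = {}
    for gene, neighbors in adjacency.items():
        if gene in seeds:
            continue
        nbrs = set(neighbors) - {gene}
        overlap = len(nbrs & seeds)
        p_values[gene] = hypergeom_sf(overlap, population, len(seeds), len(nbrs))
    return p_values


# Hypothetical toy network: A and B are known essential PCGs,
# L1 and L2 are candidate lncRNAs.
toy_network = {
    "A": ["B", "L1"],
    "B": ["A", "L1"],
    "C": ["L2"],
    "D": [],
    "L1": ["A", "B"],
    "L2": ["C"],
}
p_values = neighbor_enrichment(toy_network, seeds={"A", "B"})
for gene in sorted(p_values):
    print(gene, round(p_values[gene], 4))
```

In this toy example L1, whose two neighbors are both essential, receives p = 0.1, while L2 has no essential neighbors and receives p = 1. A real analysis would also need multiple-testing correction across candidates, which is omitted here.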
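Random walk with restart, as named in the METHODS, can be illustrated with a short sketch. It iterates the standard update p(t+1) = (1 - r) W p(t) + r p(0), where W is the column-normalized adjacency matrix, p(0) places equal probability on the known essential (seed) genes, and r is the restart probability. The value r = 0.7, the convergence tolerance, the handling of isolated nodes, and the toy network are assumptions made for illustration; the abstract does not report the settings actually used.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walker restarting at the seeds.

    adjacency: dict mapping each node to a list of neighbors (undirected edges
               listed in both directions).
    seeds: nodes that the walker restarts from (known essential genes).
    restart: probability of jumping back to a seed at each step (assumed value).
    """
    nodes = list(adjacency)
    seed_set = {s for s in seeds if s in adjacency}
    if not seed_set:
        raise ValueError("at least one seed must be present in the network")

    # Initial distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            moving = (1.0 - restart) * p[u]
            nbrs = adjacency[u]
            if nbrs:
                # Spread the moving mass evenly over the neighbors of u.
                share = moving / len(nbrs)
                for v in nbrs:
                    nxt[v] += share
            else:
                # Isolated node: send its mass back to the seeds so that
                # the probabilities keep summing to 1.
                for n in nodes:
                    nxt[n] += moving * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: A and B are seed PCGs; L1 is adjacent to both
# seeds, while L2 is reachable only through C.
toy_network = {
    "A": ["B", "L1"],
    "B": ["A", "L1"],
    "L1": ["A", "B", "C"],
    "C": ["L1", "L2"],
    "L2": ["C"],
}
scores = random_walk_with_restart(toy_network, seeds=["A", "B"])
ranking = sorted((n for n in scores if n not in {"A", "B"}),
                 key=scores.get, reverse=True)
print(ranking)
```

Non-seed genes are then ranked by their steady-state probability, so candidates that sit close to many essential genes (L1 in the toy network) rank above more distant ones (L2). How the resulting scores are thresholded to call a gene essential is a separate choice that this sketch does not address.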