
HLungDB: an integrated database of human lung cancer research.

Lishan Wang, Yuanyuan Xiong, Yihua Sun, Zhaoyuan Fang, Li Li, Hongbin Ji, Tieliu Shi.

Abstract

The human lung cancer database (HLungDB) integrates lung cancer-related genes, proteins and miRNAs with the corresponding clinical information. The main purpose of this platform is to establish a network of lung cancer-related molecules and to facilitate the mechanistic study of lung carcinogenesis. The entries describing the relationships between molecules and human lung cancer in the current release were extracted manually from the literature. Currently, we have collected 2585 genes and 212 miRNAs with experimental evidence of involvement in the different stages of lung carcinogenesis through text mining. Furthermore, we have incorporated the results from analysis of transcription factor-binding motifs, promoters and SNP sites for each gene. Since epigenetic alterations also play an important role in lung carcinogenesis, genes with epigenetic regulation were also included. We hope HLungDB will enrich our knowledge about lung cancer biology and eventually lead to the development of novel therapeutic strategies. HLungDB can be freely accessed at http://www.megabionet.org/bio/hlung.

Year:  2009        PMID: 19900972      PMCID: PMC2808962          DOI: 10.1093/nar/gkp945

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Lung cancer, one of the most common causes of cancer-related death in both men and women, is responsible for 1.3 million deaths worldwide every year. Lung cancer can be roughly divided into two groups according to pathology: non-small cell lung cancer (NSCLC) (80.4%) and small cell lung cancer (SCLC) (16.8%) (1). Many factors potentially contribute to lung cancer formation, e.g. tobacco smoke, ionizing radiation and viral infection. However, the mechanisms involved in lung carcinogenesis remain largely unknown. Similar to many other cancers, lung cancer is initiated by activation of oncogenes or inactivation of tumor suppressor genes (2). Previous studies have revealed various causes of lung cancer at the genomic level. Mutations in the K-ras proto-oncogene are responsible for 10–30% of lung adenocarcinomas (3,4). The epidermal growth factor receptor (EGFR) regulates cell proliferation, apoptosis, angiogenesis and tumor invasion (3). Oncogenic mutations and amplification of EGFR are common in NSCLC and thus provide the basis for treatment with EGFR inhibitors. In contrast, oncogenic mutation of Her2/neu is less frequently observed (3). Other oncogenes involved include c-MET, NKX2-1, PIK3CA and BRAF (3). Inactivation of tumor suppressor genes also plays an important role in lung carcinogenesis. The p53 tumor suppressor gene, located on chromosome 17p, is affected in 60–75% of lung cancers, including both NSCLC and SCLC, while Rb is more likely to be inactivated in SCLC (5). The p16 gene is also frequently inactivated through methylation of its promoter region at the genomic DNA level. Another important tumor suppressor gene is LKB1, whose loss-of-function mutation/deletion is observed in ∼30% of lung adenocarcinomas and 20% of squamous cell carcinomas (6,7). Genetic polymorphisms have also been implicated in lung carcinogenesis, e.g. in interleukin-1 (8), cytochrome P450 (9), apoptosis promoters such as caspase-8 (10) and DNA repair molecules such as XRCC1 (11).
People with these polymorphisms are susceptible to lung cancer development after exposure to carcinogens. Studies also suggest that the MDM2 309G allele is a low-penetrant risk factor for lung cancer development in Asian populations (12). Although lung cancer research data have accumulated dramatically during the past several years, to our knowledge, no database specifically focusing on lung cancer molecular biology is yet available. OMIM contains information on all known Mendelian disorders and focuses on the relationship between phenotype and genotype (13). MethyCancer was developed to study the interplay of DNA methylation, gene expression and cancer. It contains both highly integrated data on DNA methylation, cancer-related genes, mutation and cancer information from public resources, and the CpG Island (CGI) clones derived from large-scale sequencing projects (14). MiR2Disease aims to provide a comprehensive resource of microRNA misregulation in various human diseases (15). The EGFR Mutation Database offers a convenient compilation of somatic EGFR mutations in NSCLC and associated epidemiological and methodological data, including response to the tyrosine kinase inhibitors Gefitinib and Erlotinib (16). These databases address cancer pathogenesis from different angles but touch on lung cancer only in passing. Thus, it is beneficial to establish a lung cancer-related database or platform involving genes/proteins/miRNAs. High-throughput techniques applied in lung cancer research have generated a large volume of data and provide important resources for exploring molecular mechanisms and identifying lung cancer-related molecules. Integrating the information generated by small-scale studies with that from high-throughput technologies could provide a unique resource to facilitate the systematic study of the lung carcinogenesis process.
To this end, we collected lung cancer-related molecules and other detailed information for database construction through text mining in combination with bioinformatics analysis. This repository and maintenance system, specially designed for lung cancer information, should facilitate future lung cancer investigations. Overall, HLungDB enables the exploration of relevant information for human lung cancer-related molecules from multiple angles, making it a unique resource for human lung cancer that will serve as a useful platform for those interested in lung cancer biology.

DATA COLLECTION AND CONTENT

As mentioned above, the initial entries describing the relationships between genes and human lung cancer were collected manually. The gene–lung cancer relationships documented in the current release were collected by searching the PubMed database with a list of keywords, such as ‘lung cancer gene’, ‘pulmonary cancer gene’, ‘pulmonary adenocarcinoma gene’, etc. After we obtained the literature with the keywords above, we read through and interpreted each paper, collecting the important information, including the type of gene alteration, the clinical correlation and/or significance of the gene alteration with lung cancer, the lung cancer subtype, the potential mechanism of gene regulation and the experimental methods involved. Each entry in the database contains detailed information on a lung cancer–gene relationship, including a basic description of the gene, the expression pattern of the gene (up- or down-regulated) in lung cancer patients, the experimentally validated regulatory information (transcription factors, their binding motifs and the promoter) and the protein–protein interaction (PPI) network, etc. Gene expression profiling data for lung cancer patient samples were also retrieved from GEO. A gene was considered differentially expressed if the change between lung cancer samples and normal controls was larger than 2-fold. To make the results more reliable, we selected only those genes differentially expressed in at least three patients in a dataset and displayed them on our web site. In the current release of HLungDB, 2585 genes were selected for their relationships with lung carcinogenesis. A total of 271 lung cancer samples from six expression profiling datasets were analyzed to obtain the gene expression patterns (17–22). For the lung cancer-related SNPs, we searched PubMed with the keywords ‘SNP’ and ‘lung cancer’. Then, we collected the SNPs proven to be correlated with lung cancer from the returned papers.
In total, 424 SNPs, regardless of whether they could be mapped to a gene, were added to the database. Additionally, 360 transcription factors with 1160 binding motifs and 253 lung cancer-related genes with detailed epigenetic information were also placed into the database. Accumulating evidence has indicated that miRNAs play an important role in lung cancer pathology. Previous experiments, both with high-throughput and small-scale methods, have identified many miRNAs differentially expressed in lung cancer and/or confirmed to be related to lung cancer. miRNA data are therefore an important resource for lung cancer research, and we selected lung cancer-related miRNAs with experimental information from the literature. For those miRNAs with identified targets, the targets along with the experimental methods used are also provided in the platform. Currently, there are 212 lung cancer-related miRNAs included in HLungDB. Next, we built the HLungDB database by integrating the data we collected with information from other resources (Figure 1), which makes our database a one-stop knowledge platform for the lung cancer research community.
Figure 1.

The database structure of HLungDB.
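The selection rule for differentially expressed genes described in this section (a change larger than 2-fold, observed in at least three patients of a dataset) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout, the function name `select_differential_genes`, the use of a linear-scale tumor/normal ratio and the per-direction patient count are all assumptions made for illustration, and the gene names and values are invented.

```python
from collections import defaultdict


def select_differential_genes(records, fold_threshold=2.0, min_patients=3):
    """Return genes changed more than `fold_threshold`-fold in >= `min_patients` patients.

    `records` is an iterable of (patient_id, gene, tumor_value, normal_value)
    tuples from ONE dataset, with expression values on a linear scale
    (an assumption; the paper does not state the scale).
    """
    up = defaultdict(set)    # gene -> patients with up-regulation
    down = defaultdict(set)  # gene -> patients with down-regulation
    for patient_id, gene, tumor, normal in records:
        if tumor <= 0 or normal <= 0:
            continue  # a ratio is undefined for non-positive values
        ratio = tumor / normal
        if ratio > fold_threshold:
            up[gene].add(patient_id)
        elif ratio < 1.0 / fold_threshold:
            down[gene].add(patient_id)

    selected = {}
    for gene in set(up) | set(down):
        n_up, n_down = len(up[gene]), len(down[gene])
        # Assumption: the three-patient rule is applied per direction.
        if n_up >= min_patients or n_down >= min_patients:
            selected[gene] = {"up": n_up, "down": n_down}
    return selected


# Invented example values, not taken from any GEO dataset.
example_records = [
    ("P1", "GENE_A", 8.0, 2.0), ("P2", "GENE_A", 9.0, 3.0), ("P3", "GENE_A", 10.0, 4.0),
    ("P1", "GENE_B", 5.0, 2.0), ("P2", "GENE_B", 1.0, 1.0),
    ("P1", "GENE_C", 1.0, 4.0), ("P2", "GENE_C", 1.0, 3.0),
    ("P3", "GENE_C", 2.0, 5.0), ("P4", "GENE_C", 6.0, 2.0),
]
example_selected = select_differential_genes(example_records)
```

Keeping separate up and down counts per gene mirrors the ‘Expression Alteration’ display, which reports how many patients show upregulation and/or downregulation.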

DATA ACCESS

HLungDB provides a search engine to query detailed information on each gene–lung cancer relationship documented in the database. A query may use a gene or protein symbol or any of its synonyms. The information flow is outlined in Figure 2.
Figure 2.

The flowchart of query in the HLungDB database.

After the symbol or alias of a gene is submitted, gene-centered information is displayed on a new page, including the symbol, alias, description, protein–protein interactions, expression alterations based on the microarray data and regulatory information, provided the gene has been confirmed to be related to lung cancer in our database. To see more details about how the gene is related to lung cancer, the user can click on the gene symbol link, and a new page will display evidence of the gene’s relationship to lung cancer. ‘Clinical Significance’ indicates the effect of the gene alteration on lung cancer from a clinical point of view, as collected from the literature; ‘Function’ describes the gene’s role in lung cancer as extracted from the published papers; ‘Gene Regulation’ presents the regulatory relationship of the gene with other genes, while ‘Expression Alteration’ shows the analysis results of lung cancer-related microarray datasets, in which the user can see how many patients show gene upregulation and/or downregulation. The PPI link leads to a new page that shows the proteins interacting with the query protein; the ‘Show PPI Network’ link displays the selected protein–protein interaction network based on experimental evidence, mostly from the HPRD system (23). In the PPI network section, user-friendly interfaces make all the features of HLungDB PPI easily accessible and give the user a direct view of the relationships among the proteins. ‘Regulatory Information’ links the user to the names of the transcription factors confirmed to regulate the gene. ‘See Details’ links the user to a new page that displays the binding site motifs of those transcription factors with the supporting PubMed ID. The ‘Show Promoter of Gene’ link displays the promoter sequence(s) of the selected gene. A gene with no known transcription factor will show only its promoter sequence(s).
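The symbol-or-alias query described above can be sketched as a simple index lookup. This is only an illustration of the idea, not the HLungDB implementation: the record fields (`symbol`, `aliases`), the function names and the case-insensitive matching are assumptions, and the three example records are a tiny hand-written sample.

```python
def build_alias_index(genes):
    """Map each upper-cased symbol or alias to the official symbol(s) it names."""
    index = {}
    for gene in genes:
        for name in [gene["symbol"], *gene.get("aliases", [])]:
            # An alias can be shared by several genes, so keep a set.
            index.setdefault(name.upper(), set()).add(gene["symbol"])
    return index


def lookup(index, query):
    """Return the sorted official symbols matching a symbol or alias query."""
    return sorted(index.get(query.strip().upper(), set()))


# Hand-written sample records (hypothetical layout).
sample_genes = [
    {"symbol": "EGFR", "aliases": ["ERBB1", "HER1"]},
    {"symbol": "KRAS", "aliases": ["K-RAS", "KRAS2"]},
    {"symbol": "STK11", "aliases": ["LKB1"]},
]
alias_index = build_alias_index(sample_genes)
lkb1_hits = lookup(alias_index, "lkb1")
her1_hits = lookup(alias_index, " HER1 ")
```

Resolving every query to an official symbol first means the gene page, the PPI page and the regulatory pages can all be keyed on one identifier.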
Alternatively, the user can query our system with the protein symbol, and a new summary page will provide a brief description of the protein, the PPI, the links to other related resources and the PPI network. Users can navigate each item in detail by clicking the related links. Users can also check whether a miRNA is related to lung cancer by querying with the miRNA symbol. The results page will display the manually collected details for the related miRNA, including the disease type the miRNA is related to, the alterations in expression of the miRNA, the mechanism of the miRNA in lung cancer, the experimental methods used to confirm the mechanism, the targets of the miRNA (if any) with PubMed ID and the description of the miRNA's involvement in lung cancer. HLungDB provides two ways to view all lung cancer-related genes. The first approach is to query the database via a visual chromosome browser, reached through ‘Chromosome’ on the first page. The user clicks ‘Chromosome’ at the top of this page, and a chromosome map is returned. On the Chromosome page, the user can view lung cancer-related genes by chromosome ID. With the second approach, ‘Browse’ on the first page of HLungDB allows users to see all the genes confirmed to be related to lung cancer. The genes in this list are sorted alphabetically. Using these two approaches, users can easily retrieve all genes that are related to lung cancer. Another way to view lung cancer-related genes is provided in the pathway view. On the pathway list, users can check the lung cancer genes by clicking on the pathway name and view the network for this pathway through the ‘Pathway Network’ entrance. Users can also click on a marginal node of the network to expand it. For more detailed usage of the network, users can read the annotation on the pathway network page. Users can also view lung cancer-related information in our database by browsing the SNP, transcription factor and methylation lists.
The ‘SNP View’ provides the user with lung cancer-related SNPs obtained from PubMed by searching ‘SNP’ and ‘lung cancer’. The ‘TransFactor View’ presents transcription factors related to lung cancer with other detailed information. The ‘Methylation View’ displays genes with epigenetic alterations observed in lung cancer. In addition, HLungDB provides convenient cross-links to other relevant external resources. These include the National Center for Biotechnology Information, a repository for published gene information, and PubMed from the US National Library of Medicine, which includes over 18 million citations from MEDLINE and other life science journals. HPRD, HUGO, IPI, EBI and KEGG are also linked to HLungDB.
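The two browsing modes described in this section, an alphabetical gene list and a per-chromosome view, amount to sorting and grouping the same gene records. The sketch below shows one way to do this in Python; the record layout and function names are assumptions for illustration, and the five records are a small hand-picked sample rather than database content.

```python
from collections import defaultdict


def browse_alphabetical(genes):
    """Return all gene symbols sorted alphabetically, ignoring case."""
    return sorted((g["symbol"] for g in genes), key=str.upper)


def browse_by_chromosome(genes):
    """Group gene symbols by chromosome ID, each group sorted alphabetically."""
    grouped = defaultdict(list)
    for g in genes:
        grouped[g["chromosome"]].append(g["symbol"])
    return {chrom: sorted(symbols) for chrom, symbols in grouped.items()}


# Hand-picked sample records (hypothetical layout).
browse_sample = [
    {"symbol": "TP53", "chromosome": "17"},
    {"symbol": "EGFR", "chromosome": "7"},
    {"symbol": "KRAS", "chromosome": "12"},
    {"symbol": "BRAF", "chromosome": "7"},
    {"symbol": "STK11", "chromosome": "19"},
]
all_symbols = browse_alphabetical(browse_sample)
by_chromosome = browse_by_chromosome(browse_sample)
```

Chromosome IDs are kept as strings here so that X and Y can be handled the same way as the numbered chromosomes.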

DISCUSSION

To provide a central resource for biologists in the lung cancer research community, we developed HLungDB, a database system that aims to provide a comprehensive resource of gene information and its relationship to lung cancer. The goal of the lung cancer database project was to construct a large-scale platform for lung cancer that would contribute to basic and clinical research in the future. In the past 2 years, large amounts of data have been collected for this project. Lung cancer data were obtained from the PubMed and GEO databases. Genes, miRNAs, gene promoters, transcription factors, transcription factor-binding sites and SNPs related to lung cancer have been collected and integrated into this system. Clinical information related to gene expression profile data was also extracted from GEO. We have systematically extracted information from published lung cancer-related studies. The database currently contains 2585 full-text entries describing relationships between genes and lung cancer. They have been integrated in such a way that investigators can rapidly query whether a gene or protein is found in human lung cancer and retrieve other detailed lung cancer-related information about this gene. User-friendly query interfaces make all the features of HLungDB easily accessible. We believe that HLungDB will be of particular interest to the life science community and will greatly facilitate cancer biologists’ mission of unraveling the pathogenesis of lung cancer.

FUTURE DIRECTIONS

We are working to increase the quality and quantity of the data and to add database functions. We plan to adopt two strategies to achieve these goals. First, text-mining tools will be adopted to improve our data collection. We will use them to regularly prescreen PubMed abstracts that potentially describe lung cancer–gene relationships. Second, since many proteins in signal transduction pathways are involved in lung cancer development and progression, our next step is to identify the signal transduction pathways that show significant changes and to display their components with identified alterations in lung cancer in a network view. At the same time, we will also collect the downstream genes for each altered signaling pathway in lung cancer and further characterize the relationships between them, with the ultimate goal of identifying new potentially relevant lung cancer genes and new mechanisms.
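As an illustration of what prescreening abstracts for potential lung cancer–gene relationships could look like, the following Python sketch flags abstracts that mention both a lung cancer term and a known gene symbol. It is a deliberately simple co-occurrence filter and an assumption on our part, not the text-mining tool the project will adopt; the function name, the term pattern and the placeholder abstract identifiers and texts are hypothetical.

```python
import re

# Disease terms modelled on the search keywords used for data collection.
DISEASE_PATTERN = re.compile(
    r"\b(lung|pulmonary)\s+(cancer|carcinoma|adenocarcinoma)\b", re.IGNORECASE
)


def prescreen(abstracts, gene_symbols):
    """Return {abstract_id: sorted gene symbols} for abstracts worth manual review.

    An abstract passes only if it contains a lung cancer term AND at least one
    gene symbol. Symbols are matched case-sensitively as whole words.
    """
    if not gene_symbols:
        return {}
    # Longest symbols first so that overlapping alternatives match greedily.
    alternatives = "|".join(
        re.escape(s) for s in sorted(gene_symbols, key=len, reverse=True)
    )
    gene_pattern = re.compile(r"\b(" + alternatives + r")\b")

    hits = {}
    for abstract_id, text in abstracts.items():
        if not DISEASE_PATTERN.search(text):
            continue
        found = sorted(set(gene_pattern.findall(text)))
        if found:
            hits[abstract_id] = found
    return hits


# Placeholder identifiers and invented texts, not real PubMed records.
demo_abstracts = {
    "A1": "EGFR mutations are frequent in lung adenocarcinoma, and KRAS mutations also occur.",
    "A2": "EGFR signalling was studied in breast cancer cell lines.",
    "A3": "Smoking is a major risk factor for lung cancer.",
}
demo_hits = prescreen(demo_abstracts, {"EGFR", "KRAS", "TP53"})
```

A filter of this kind only narrows the reading list; each flagged abstract would still need the manual interpretation described under data collection.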

FUNDING

State Key Program of Basic Research of China (Grant 2007CB108800, 2009CB918402, 2010CB912102); National High Technology Research and Development Program of China (863 project) (Grant No. 2006AA02Z313); National Natural Science Foundation of China (Grant 30870575, 30740084 and 30871284); Chinese Academy of Sciences (2008KIP101); Science and Technology Commission of Shanghai Municipality (06DZ22923, 08PJ14105). H.J. is a scholar of the Hundred Talents Program of the Chinese Academy of Sciences. Funding for open access charge: National Natural Science Foundation of China and the State Key Program of Basic Research of China.

Conflict of interest statement. None declared.
  23 in total

1.  Preparation and properties of inhalable nanocomposite particles for treatment of lung cancer.

Authors:  Keishiro Tomoda; Takumi Ohkoshi; Keiji Hirota; Ganeshchandra S Sonavane; Takehisa Nakajima; Hiroshi Terada; Masahito Komuro; Kenji Kitazato; Kimiko Makino
Journal:  Colloids Surf B Biointerfaces       Date:  2009-02-11       Impact factor: 5.268

Review 2.  Lung cancer.

Authors:  Roy S Herbst; John V Heymach; Scott M Lippman
Journal:  N Engl J Med       Date:  2008-09-25       Impact factor: 91.245

Review 3.  Molecular mechanisms of lung cancer. Interaction of environmental and genetic factors. Giles F. Filley Lecture.

Authors:  T R Devereux; J A Taylor; J C Barrett
Journal:  Chest       Date:  1996-03       Impact factor: 9.410

4.  CYP1A1 and CYP1B1 polymorphisms and risk of lung cancer among never smokers: a population-based study.

Authors:  A S Wenzlaff; M L Cote; C H Bock; S J Land; S K Santer; D R Schwartz; A G Schwartz
Journal:  Carcinogenesis       Date:  2005-07-28       Impact factor: 4.944

Review 5.  Lung cancer. 9: Molecular biology of lung cancer: clinical implications.

Authors:  K M Fong; Y Sekido; A F Gazdar; J D Minna
Journal:  Thorax       Date:  2003-10       Impact factor: 9.139

6.  Genomic profiles associated with early micrometastasis in lung cancer: relevance of 4q deletion.

Authors:  Michaela Wrage; Salla Ruosaari; Paul P Eijk; Jussuf T Kaifi; Jaakko Hollmén; Emre F Yekebas; Jakob R Izbicki; Ruud H Brakenhoff; Thomas Streichert; Sabine Riethdorf; Markus Glatzel; Bauke Ylstra; Klaus Pantel; Harriet Wikman
Journal:  Clin Cancer Res       Date:  2009-02-10       Impact factor: 12.531

7.  Somatic mutations affect key pathways in lung adenocarcinoma.

Authors:  Li Ding; Gad Getz; David A Wheeler; Elaine R Mardis; Michael D McLellan; Kristian Cibulskis; Carrie Sougnez; Heidi Greulich; Donna M Muzny; Margaret B Morgan; Lucinda Fulton; Robert S Fulton; Qunyuan Zhang; Michael C Wendl; Michael S Lawrence; David E Larson; Ken Chen; David J Dooling; Aniko Sabo; Alicia C Hawes; Hua Shen; Shalini N Jhangiani; Lora R Lewis; Otis Hall; Yiming Zhu; Tittu Mathew; Yanru Ren; Jiqiang Yao; Steven E Scherer; Kerstin Clerc; Ginger A Metcalf; Brian Ng; Aleksandar Milosavljevic; Manuel L Gonzalez-Garay; John R Osborne; Rick Meyer; Xiaoqi Shi; Yuzhu Tang; Daniel C Koboldt; Ling Lin; Rachel Abbott; Tracie L Miner; Craig Pohl; Ginger Fewell; Carrie Haipek; Heather Schmidt; Brian H Dunford-Shore; Aldi Kraja; Seth D Crosby; Christopher S Sawyer; Tammi Vickery; Sacha Sander; Jody Robinson; Wendy Winckler; Jennifer Baldwin; Lucian R Chirieac; Amit Dutt; Tim Fennell; Megan Hanna; Bruce E Johnson; Robert C Onofrio; Roman K Thomas; Giovanni Tonon; Barbara A Weir; Xiaojun Zhao; Liuda Ziaugra; Michael C Zody; Thomas Giordano; Mark B Orringer; Jack A Roth; Margaret R Spitz; Ignacio I Wistuba; Bradley Ozenberger; Peter J Good; Andrew C Chang; David G Beer; Mark A Watson; Marc Ladanyi; Stephen Broderick; Akihiko Yoshizawa; William D Travis; William Pao; Michael A Province; George M Weinstock; Harold E Varmus; Stacey B Gabriel; Eric S Lander; Richard A Gibbs; Matthew Meyerson; Richard K Wilson
Journal:  Nature       Date:  2008-10-23       Impact factor: 49.962

8.  Lung cancer.

Authors:  W D Travis; L B Travis; S S Devesa
Journal:  Cancer       Date:  1995-01-01       Impact factor: 6.860

9.  Human Protein Reference Database--2009 update.

Authors:  T S Keshava Prasad; Renu Goel; Kumaran Kandasamy; Shivakumar Keerthikumar; Sameer Kumar; Suresh Mathivanan; Deepthi Telikicherla; Rajesh Raju; Beema Shafreen; Abhilash Venugopal; Lavanya Balakrishnan; Arivusudar Marimuthu; Sutopa Banerjee; Devi S Somanathan; Aimy Sebastian; Sandhya Rani; Somak Ray; C J Harrys Kishore; Sashi Kanth; Mukhtar Ahmed; Manoj K Kashyap; Riaz Mohmood; Y L Ramachandra; V Krishna; B Abdul Rahiman; Sujatha Mohan; Prathibha Ranganathan; Subhashri Ramabadran; Raghothama Chaerkady; Akhilesh Pandey
Journal:  Nucleic Acids Res       Date:  2008-11-06       Impact factor: 16.971

10.  miR2Disease: a manually curated database for microRNA deregulation in human disease.

Authors:  Qinghua Jiang; Yadong Wang; Yangyang Hao; Liran Juan; Mingxiang Teng; Xinjun Zhang; Meimei Li; Guohua Wang; Yunlong Liu
Journal:  Nucleic Acids Res       Date:  2008-10-15       Impact factor: 16.971

