
International network of cancer genome projects.

Thomas J Hudson, Warwick Anderson, Axel Artez, Anna D Barker, Cindy Bell, Rosa R Bernabé, M K Bhan, Fabien Calvo, Iiro Eerola, Daniela S Gerhard, Alan Guttmacher, Mark Guyer, Fiona M Hemsley, Jennifer L Jennings, David Kerr, Peter Klatt, Patrik Kolar, Jun Kusada, David P Lane, Frank Laplace, Lu Youyong, Gerd Nettekoven, Brad Ozenberger, Jane Peterson, T S Rao, Jacques Remacle, Alan J Schafer, Tatsuhiro Shibata, Michael R Stratton, Joseph G Vockley, Koichi Watanabe, Huanming Yang, Matthew M F Yuen, Bartha M Knoppers, Martin Bobrow, Anne Cambon-Thomsen, Lynn G Dressler, Stephanie O M Dyke, Yann Joly, Kazuto Kato, Karen L Kennedy, Pilar Nicolás, Michael J Parker, Emmanuelle Rial-Sebbag, Carlos M Romeo-Casabona, Kenna M Shaw, Susan Wallace, Georgia L Wiesner, Nikolajs Zeps, Peter Lichter, Andrew V Biankin, Christian Chabannon, Lynda Chin, Bruno Clément, Enrique de Alava, Françoise Degos, Martin L Ferguson, Peter Geary, D Neil Hayes, Thomas J Hudson, Amber L Johns, Arek Kasprzyk, Hidewaki Nakagawa, Robert Penny, Miguel A Piris, Rajiv Sarin, Aldo Scarpa, Tatsuhiro Shibata, Marc van de Vijver, P Andrew Futreal, Hiroyuki Aburatani, Mónica Bayés, David D L Botwell, Peter J Campbell, Xavier Estivill, Daniela S Gerhard, Sean M Grimmond, Ivo Gut, Martin Hirst, Carlos López-Otín, Partha Majumder, Marco Marra, John D McPherson, Hidewaki Nakagawa, Zemin Ning, Xose S Puente, Yijun Ruan, Tatsuhiro Shibata, Michael R Stratton, Hendrik G Stunnenberg, Harold Swerdlow, Victor E Velculescu, Richard K Wilson, Hong H Xue, Liu Yang, Paul T Spellman, Gary D Bader, Paul C Boutros, Peter J Campbell, Paul Flicek, Gad Getz, Roderic Guigó, Guangwu Guo, David Haussler, Simon Heath, Tim J Hubbard, Tao Jiang, Steven M Jones, Qibin Li, Nuria López-Bigas, Ruibang Luo, Lakshmi Muthuswamy, B F Francis Ouellette, John V Pearson, Xose S Puente, Victor Quesada, Benjamin J Raphael, Chris Sander, Tatsuhiro Shibata, Terence P Speed, Lincoln D Stein, Joshua M Stuart, Jon W Teague, Yasushi Totoki, Tatsuhiko 
Tsunoda, Alfonso Valencia, David A Wheeler, Honglong Wu, Shancen Zhao, Guangyu Zhou, Lincoln D Stein, Roderic Guigó, Tim J Hubbard, Yann Joly, Steven M Jones, Arek Kasprzyk, Mark Lathrop, Nuria López-Bigas, B F Francis Ouellette, Paul T Spellman, Jon W Teague, Gilles Thomas, Alfonso Valencia, Teruhiko Yoshida, Karen L Kennedy, Myles Axton, Stephanie O M Dyke, P Andrew Futreal, Daniela S Gerhard, Chris Gunter, Mark Guyer, Thomas J Hudson, John D McPherson, Linda J Miller, Brad Ozenberger, Kenna M Shaw, Arek Kasprzyk, Lincoln D Stein, Junjun Zhang, Syed A Haider, Jianxin Wang, Christina K Yung, Anthony Cros, Anthony Cross, Yong Liang, Saravanamuttu Gnaneshan, Jonathan Guberman, Jack Hsu, Martin Bobrow, Don R C Chalmers, Karl W Hasel, Yann Joly, Terry S H Kaan, Karen L Kennedy, Bartha M Knoppers, William W Lowrance, Tohru Masui, Pilar Nicolás, Emmanuelle Rial-Sebbag, Laura Lyman Rodriguez, Catherine Vergely, Teruhiko Yoshida, Sean M Grimmond, Andrew V Biankin, David D L Bowtell, Nicole Cloonan, Anna deFazio, James R Eshleman, Dariush Etemadmoghadam, Brooke B Gardiner, Brooke A Gardiner, James G Kench, Aldo Scarpa, Robert L Sutherland, Margaret A Tempero, Nicola J Waddell, Peter J Wilson, John D McPherson, Steve Gallinger, Ming-Sound Tsao, Patricia A Shaw, Gloria M Petersen, Debabrata Mukhopadhyay, Lynda Chin, Ronald A DePinho, Sarah Thayer, Lakshmi Muthuswamy, Kamran Shazand, Timothy Beck, Michelle Sam, Lee Timms, Vanessa Ballin, Youyong Lu, Jiafu Ji, Xiuqing Zhang, Feng Chen, Xueda Hu, Guangyu Zhou, Qi Yang, Geng Tian, Lianhai Zhang, Xiaofang Xing, Xianghong Li, Zhenggang Zhu, Yingyan Yu, Jun Yu, Huanming Yang, Mark Lathrop, Jörg Tost, Paul Brennan, Ivana Holcatova, David Zaridze, Alvis Brazma, Lars Egevard, Egor Prokhortchouk, Rosamonde Elizabeth Banks, Mathias Uhlén, Anne Cambon-Thomsen, Juris Viksna, Fredrik Ponten, Konstantin Skryabin, Michael R Stratton, P Andrew Futreal, Ewan Birney, Ake Borg, Anne-Lise Børresen-Dale, Carlos Caldas, John A Foekens, Sancha 
Martin, Jorge S Reis-Filho, Andrea L Richardson, Christos Sotiriou, Hendrik G Stunnenberg, Giles Thoms, Marc van de Vijver, Laura van't Veer, Fabien Calvo, Daniel Birnbaum, Hélène Blanche, Pascal Boucher, Sandrine Boyault, Christian Chabannon, Ivo Gut, Jocelyne D Masson-Jacquemier, Mark Lathrop, Iris Pauporté, Xavier Pivot, Anne Vincent-Salomon, Eric Tabone, Charles Theillet, Gilles Thomas, Jörg Tost, Isabelle Treilleux, Fabien Calvo, Paulette Bioulac-Sage, Bruno Clément, Thomas Decaens, Françoise Degos, Dominique Franco, Ivo Gut, Marta Gut, Simon Heath, Mark Lathrop, Didier Samuel, Gilles Thomas, Jessica Zucman-Rossi, Peter Lichter, Roland Eils, Benedikt Brors, Jan O Korbel, Andrey Korshunov, Pablo Landgraf, Hans Lehrach, Stefan Pfister, Bernhard Radlwimmer, Guido Reifenberger, Michael D Taylor, Christof von Kalle, Partha P Majumder, Rajiv Sarin, T S Rao, M K Bhan, Aldo Scarpa, Paolo Pederzoli, Rita A Lawlor, Massimo Delledonne, Alberto Bardelli, Andrew V Biankin, Sean M Grimmond, Thomas Gress, David Klimstra, Giuseppe Zamboni, Tatsuhiro Shibata, Yusuke Nakamura, Hidewaki Nakagawa, Jun Kusada, Tatsuhiko Tsunoda, Satoru Miyano, Hiroyuki Aburatani, Kazuto Kato, Akihiro Fujimoto, Teruhiko Yoshida, Elias Campo, Carlos López-Otín, Xavier Estivill, Roderic Guigó, Silvia de Sanjosé, Miguel A Piris, Emili Montserrat, Marcos González-Díaz, Xose S Puente, Pedro Jares, Alfonso Valencia, Heinz Himmelbauer, Heinz Himmelbaue, Victor Quesada, Silvia Bea, Michael R Stratton, P Andrew Futreal, Peter J Campbell, Anne Vincent-Salomon, Andrea L Richardson, Jorge S Reis-Filho, Marc van de Vijver, Gilles Thomas, Jocelyne D Masson-Jacquemier, Samuel Aparicio, Ake Borg, Anne-Lise Børresen-Dale, Carlos Caldas, John A Foekens, Hendrik G Stunnenberg, Laura van't Veer, Douglas F Easton, Paul T Spellman, Sancha Martin, Anna D Barker, Lynda Chin, Francis S Collins, Carolyn C Compton, Martin L Ferguson, Daniela S Gerhard, Gad Getz, Chris Gunter, Alan Guttmacher, Mark Guyer, D Neil Hayes, Eric S 
Lander, Brad Ozenberger, Robert Penny, Jane Peterson, Chris Sander, Kenna M Shaw, Terence P Speed, Paul T Spellman, Joseph G Vockley, David A Wheeler, Richard K Wilson, Thomas J Hudson, Lynda Chin, Bartha M Knoppers, Eric S Lander, Peter Lichter, Lincoln D Stein, Michael R Stratton, Warwick Anderson, Anna D Barker, Cindy Bell, Martin Bobrow, Wylie Burke, Francis S Collins, Carolyn C Compton, Ronald A DePinho, Douglas F Easton, P Andrew Futreal, Daniela S Gerhard, Anthony R Green, Mark Guyer, Stanley R Hamilton, Tim J Hubbard, Olli P Kallioniemi, Karen L Kennedy, Timothy J Ley, Edison T Liu, Youyong Lu, Partha Majumder, Marco Marra, Brad Ozenberger, Jane Peterson, Alan J Schafer, Paul T Spellman, Hendrik G Stunnenberg, Brandon J Wainwright, Richard K Wilson, Huanming Yang.

Abstract

The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of more than 25,000 cancer genomes at the genomic, epigenomic and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies.


Year: 2010 PMID: 20393554 PMCID: PMC2902243 DOI: 10.1038/nature08987

Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962

The genomes of all cancers accumulate somatic mutations [1]. These include nucleotide substitutions, small insertions and deletions, chromosomal rearrangements and copy number changes that can affect protein-coding or regulatory components of genes. In addition, cancer genomes usually acquire somatic epigenetic “marks” compared to non-neoplastic tissues from the same organ, notably changes in the methylation status of cytosines at CpG dinucleotides. A subset of the somatic mutations in cancer cells confers oncogenic properties such as growth advantage, tissue invasion and metastasis, angiogenesis, and evasion of apoptosis [2]. These are termed “driver” mutations. The identification of driver mutations will provide insights into cancer biology and highlight novel drug targets and diagnostic tests. Knowledge of cancer mutations has already led to the development of specific therapies, such as trastuzumab for HER2/neu positive breast cancers [3] and imatinib, which targets BCR-ABL tyrosine kinase for the treatment of chronic myeloid leukemia [4,5]. The remaining somatic mutations in cancer genomes that do not contribute to cancer development are called “passengers”. These mutations provide insights into the DNA damage and repair processes that have been operative during cancer development, including exogenous environmental exposures [6,7]. In most cancer genomes, it is anticipated that passenger mutations, as well as germline variants not yet catalogued in polymorphism databases, will substantially outnumber drivers.

Large-scale analyses of genes in tumors have revealed that the mutation load in cancer is abundant and heterogeneous [8-13]. Preliminary surveys of cancer genomes have already demonstrated their relevance in identifying new cancer genes that constitute potential therapeutic targets for several types of cancer, including PIK3CA [14], BRAF [15], NF1 [10], KDR [10], PIK3R1 [9], and histone methyltransferases and demethylases [16,17]. These projects have also yielded correlations between cancer mutations and prognosis, such as IDH1 and IDH2 mutations in several types of gliomas [13,18]. Advances in massively parallel sequencing technology have enabled sequencing of entire cancer genomes [19-22].

Following the launch of comprehensive cancer genome projects in the United Kingdom (Cancer Genome Project) [23] and the United States (The Cancer Genome Atlas) [24], cancer genome scientists and funding agencies met in Toronto (Canada) in October 2007 to discuss the opportunity to launch an international consortium. Key reasons for its formation were: (1) the scope is huge; (2) independent cancer genome initiatives could lead to duplication of effort or incomplete studies; (3) lack of standardization across studies could diminish the opportunities to merge and compare datasets; (4) the spectrum of many cancers is known to vary across the world; (5) an international consortium will accelerate the dissemination of datasets and analytical methods into the user community. Working groups were created to develop strategies and policies that would form the basis for participation in the ICGC. The goals of the Consortium (Box 1) were released in April 2008 (http://www.icgc.org/files/ICGC_April_29_2008.pdf). Since then, working groups and initial member projects have further refined the policies and plans for international collaboration.

Bioethical Framework

ICGC members agreed to a core set of bioethical elements for consent as a precondition of membership (Box 2). The Ethics and Policy Committee has created patient consent templates for both prospective collection and retrospective use of samples and data for ICGC projects. Differences in project-specific requirements and national legal frameworks may require some local amendments, while still reflecting the core principles of ICGC. The ICGC recognizes a delicate balance between protecting participants' personal data and sharing these data to accelerate cancer research. Data access policies have been drawn up that are respectful of the rights of the donors, while allowing ICGC data derived from samples to be shared ethically among a wide research community. Two levels of access have been implemented. “Open access” datasets, which contain only data that cannot be used to identify individuals, are publicly available. These include data such as gender, age range, histology, normalized gene expression values, epigenetic datasets, somatic mutations, summaries of germline data, and study protocols. “Controlled access” datasets contain germline genomic data and detailed clinical information that are associated with a unique individual whose personal identifiers have been removed. To access controlled datasets, researchers must seek authorization by contacting the Data Access Compliance Office (DACO) (http://www.icgc.org/daco). An independent International Data Access Committee (IDAC) oversees the work of the DACO and provides assistance with resolving issues that arise.
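
The two access levels described above can be pictured as a simple partition of each donor record. The sketch below is illustrative only and is not the ICGC's implementation: the field names, the `split_by_tier` and `visible_fields` helpers, and the boolean `authorized` flag (standing in for an approved DACO request) are all assumptions made for this example. Fields not explicitly listed as open are treated as controlled, which is a conservative choice of this sketch rather than a stated ICGC rule.

```python
from typing import Any, Dict, Tuple

# Hypothetical field names; the open-tier list paraphrases data types named in the text.
OPEN_FIELDS = frozenset({
    "gender",
    "age_range",
    "histology",
    "normalized_expression",
    "somatic_mutations",
})


def split_by_tier(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split one donor record into (open, controlled) parts.

    Any field not explicitly listed as open is treated as controlled.
    """
    open_part = {k: v for k, v in record.items() if k in OPEN_FIELDS}
    controlled_part = {k: v for k, v in record.items() if k not in OPEN_FIELDS}
    return open_part, controlled_part


def visible_fields(record: Dict[str, Any], authorized: bool) -> Dict[str, Any]:
    """Return the fields a user may see; `authorized` stands in for an approved request."""
    open_part, controlled_part = split_by_tier(record)
    return {**open_part, **controlled_part} if authorized else open_part


# Invented example record (no real donor data).
donor = {
    "gender": "female",
    "age_range": "60-69",
    "histology": "ductal adenocarcinoma",
    "germline_variants": ["<redacted>"],
    "detailed_clinical": {"treatment": "<redacted>"},
}

print(sorted(visible_fields(donor, authorized=False)))
print(sorted(visible_fields(donor, authorized=True)))
```

Defaulting unknown fields to the controlled tier means that a newly added field is never exposed by accident; it must be deliberately added to the open list.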

Pathology and Clinical Annotation

Large-scale genomic studies of human tumors rely on the availability of fresh frozen tumor tissue. To address the paucity of samples that meet ICGC standards, many projects have initiated prospective collections of high-quality source material. Accordingly, the ICGC recommended procedures to promote consistency of sample processing throughout the Consortium and ensure a series of quality features such as high tissue integrity and tumor cell content. Each project will need to include diverse data types such as environmental exposures, clinical history of participants, tumor histopathology, and clinical outcomes. Tumors display considerable clinical and biological heterogeneity, which has resulted in a variety of tumor classifications. Within the ICGC, special measures are taken to promote the consistency of diagnosis. These include the coordination of diagnostic criteria among groups investigating related tumors, and a policy that all samples be reviewed by at least two independent reference pathologists. Furthermore, images of the stained tumor sections (or blood smears or cytospins for hematological neoplasias) from which diagnoses were made will be stored and made available to the community. Although different tumor types may require specific procedures for tumor acquisition or compilation of clinical and environmental data, ICGC has set guidelines regarding the use of common definitions and data standards. This will allow ICGC data users to identify correlations between tumor-specific molecular changes and clinical and histopathological data, including prognosis, prediction of therapy response and tumor classification schemes for diagnosis.

Study Design and Statistical Issues

To identify cancer-related genes, one needs to detect genes that are mutated at a higher frequency than the background mutation rate. Given that several driver genes have been found to be mutated at low frequencies, ICGC will identify somatic mutations observed in at least 3% of tumors of a given subtype. To achieve this, ICGC determined that 500 samples would be needed per tumor type (although for rare tumor types, a smaller sample size may be justified). In practice, the degree of heterogeneity of a given tumor type is difficult to know in advance, so some particularly heterogeneous tumor types may require larger sample collections.
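
The relationship between sample size and a 3% mutation frequency can be illustrated with a simple binomial calculation. The sketch below is not the Consortium's power analysis. It assumes that each tumor independently carries a mutation in a given gene with probability equal to the stated frequency, and asks how likely it is that at least `min_hits` mutated tumors are observed among `n_samples`. The function names and the choice of `min_hits = 5` are illustrative assumptions; a real analysis would also have to model the background mutation rate, gene length and multiple testing across genes.

```python
from math import comb


def prob_at_least(n_samples: int, freq: float, min_hits: int) -> float:
    """P(X >= min_hits) for X ~ Binomial(n_samples, freq)."""
    # Sum the probabilities of seeing fewer than min_hits mutated tumors,
    # then take the complement.
    below = sum(
        comb(n_samples, k) * freq**k * (1.0 - freq) ** (n_samples - k)
        for k in range(min_hits)
    )
    return 1.0 - below


def expected_hits(n_samples: int, freq: float) -> float:
    """Expected number of tumors carrying the mutation."""
    return n_samples * freq


if __name__ == "__main__":
    for n in (100, 250, 500):
        print(n, expected_hits(n, 0.03), round(prob_at_least(n, 0.03, 5), 4))
```

Under these assumptions, 500 samples give an expected 15 mutated tumors for a gene mutated in 3% of cases, whereas 100 samples give an expected 3, which leaves far less room to separate a recurrent gene from background.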

Cancer Genome Analyses

High-quality catalogues of somatic mutations from whole cancer genomes will ultimately be the ICGC standard. Shotgun sequencing employing second-generation technologies can detect all classes of somatic mutation implicated in cancer. Moreover, if the level of coverage is sufficient, comprehensive high-quality catalogues of somatic mutations from individual cancer genomes can be acquired with >90% sensitivity and >95% specificity. In order to achieve this, it will be necessary to sequence both the genome of the cancer and that of a normal tissue from the same individual, so that somatic mutations can be distinguished from germline variants. Although a few genomes of this standard have already been generated, the cost and the continuing technology development will mean that interim analyses of particularly informative sectors of the genome will be carried out, for example of all coding exons and microRNAs. For each individual cancer genome, the catalogue of somatic mutations will be supplemented by genome-wide information on the state of methylation of CpG dinucleotides. The optimal strategies and technologies to achieve this are not yet clear. Moreover, the genomes of individual cancers will be accompanied, where possible, by analyses of the transcriptome. Although conventional array-based approaches currently predominate, it is preferable that RNA sequencing becomes the standard as sequencing has a greater dynamic range [25] and provides additional information including novel transcripts and sequence variants [26].
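
The role of the matched normal genome can be sketched as a set difference: variants seen in the tumor but absent from the normal tissue of the same individual are candidate somatic mutations. This is a deliberately simplified illustration, not a description of any ICGC pipeline. Real somatic callers work from read-level evidence and error models rather than from two finished variant lists. The `Variant` type, the function names and the toy coordinates are all invented for this example.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position
    ref: str
    alt: str


def somatic_candidates(tumor: Set[Variant], normal: Set[Variant]) -> Set[Variant]:
    """Variants seen in the tumor but not in the matched normal tissue."""
    return tumor - normal


def sensitivity(called: Set[Variant], truth: Set[Variant]) -> float:
    """Fraction of true somatic variants that were called."""
    if not truth:
        raise ValueError("truth set is empty")
    return len(called & truth) / len(truth)


# Toy data, invented for illustration only.
germline = Variant("chr1", 1000, "A", "G")
somatic_a = Variant("chr7", 2000, "T", "A")
somatic_b = Variant("chr17", 3000, "C", "T")

tumor_calls = {germline, somatic_a}  # somatic_b was missed by the toy "caller"
normal_calls = {germline}

candidates = somatic_candidates(tumor_calls, normal_calls)
print(candidates)
print(sensitivity(candidates, {somatic_a, somatic_b}))
```

In this toy case the germline variant is removed because it also appears in the normal sample, and the sensitivity is one half because one of the two true somatic variants was never called in the tumor.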

ICGC Datasets

The distributed nature of the Consortium coupled with the large size of the datasets makes it cumbersome to store all data in a single centralized repository. For this reason, the ICGC has adopted a “franchise” database model for integrating the information and making it available to the public. Under this model, each member project releases tumor information by copying it into its local franchise database after it has been quality checked. Each franchise database shares a common schema to describe the specimens, the associated clinical information, and their genome characterization data. ICGC primary data files, including sequencing traces, are sent to the National Center for Biotechnology Information (NCBI) and/or the European Bioinformatics Institute (EBI) for archiving, while interpreted data sets, such as somatic mutation calls, are stored in franchise databases. The ICGC franchise databases and web portal use BioMart [27], a data federation technology originally developed for use in Ensembl [28], and since adopted for use by multiple model organism and genome databases. The management of the ICGC data flow is the responsibility of the ICGC Data Coordination Center (DCC) located at the Ontario Institute for Cancer Research. The DCC also operates the ICGC data portal which allows researchers to access both Open and Controlled access portions of the ICGC data. The portal provides a variety of user interfaces that range from simple gene-oriented queries (“Show me all the non-silent coding mutations identified in PIK3R1 for all cancers.”) to queries that integrate genomic, clinical, and functional information (“Show me all members of the toll receptor pathway having deletions in stage III breast cancer.”). These queries will be distributed across the franchise databases in a manner that is invisible to the user. The portal will also provide links to the primary files at NCBI and EBI, interfaces for generating tabular reports, data dumps in common bioinformatics formats, and other visualizations including genome browser tracks, pathway diagrams and survival curves. The portal is available via a link at http://www.icgc.org.

At the time of this publication, the following cancer and reference datasets will be available through the ICGC web portal:
Initial data releases from ICGC members for breast cancer (UK), liver cancer (Japan), and pancreatic cancer (Australia and Canada);
A whole genome dataset of a metastatic melanoma cell line (COLO829) [6];
Open datasets from the TCGA for glioblastoma multiforme (GBM) and serous cystadenocarcinoma of the ovary (see below);
Whole exome somatic mutation data from 68 individuals with breast, colorectal, pancreatic cancer and GBM [11-13];
Links to the human reference genome (http://www.genomereference.org/) and gene annotations from the GENCODE Project (http://www.sanger.ac.uk/gencode/) which includes the CCDS gene set [29];
Links to dbSNP [30] and the HapMap [31] databases, providing access to common patterns of variation in reference population samples;
Links to Reactome [32], a curated database of biological pathways in human;
A set of reference gene models, mirrored from ENSEMBL [28].

The current version of the web portal provides an entry point to the open access data tier via interactive query as well as bulk download of data files. We expect that in mid 2010 both open access and controlled data will be available. The ICGC recently established a bioinformatics analysis working group to compare pipelines, analytic methods, consistency within and among algorithms, and establish guidelines or best practices for the Consortium. Over time, significant resources will be deployed to develop strategies to analyze the large complex datasets generated by ICGC member projects, and provide value-added views of cancer genomic data by integrating them with other biological and epidemiological datasets.
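
The franchise model described in this section, in which one portal query is distributed across several local databases that share a common schema, can be sketched as a fan-out and merge. The example below is a toy illustration under stated assumptions: it does not use BioMart or any real ICGC interface, in-memory lists stand in for the franchise databases, and the class names (`FranchiseDB`, `Portal`), record fields and records themselves are invented.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


class FranchiseDB:
    """Stand-in for one member project's local database (an in-memory list)."""

    def __init__(self, name: str, records: Iterable[Record]) -> None:
        self.name = name
        self._records = list(records)

    def query(self, predicate: Callable[[Record], bool]) -> List[Record]:
        # Tag each hit with its source so merged results stay traceable.
        return [dict(r, source=self.name) for r in self._records if predicate(r)]


class Portal:
    """Fans one query out to every franchise and merges the answers."""

    def __init__(self, franchises: Iterable[FranchiseDB]) -> None:
        self._franchises = list(franchises)

    def query(self, predicate: Callable[[Record], bool]) -> List[Record]:
        merged: List[Record] = []
        for db in self._franchises:
            merged.extend(db.query(predicate))
        return merged


# Invented toy records; they do not describe real ICGC data.
breast = FranchiseDB("breast_project", [
    {"gene": "PIK3R1", "consequence": "missense"},
    {"gene": "PIK3R1", "consequence": "synonymous"},
])
liver = FranchiseDB("liver_project", [
    {"gene": "PIK3R1", "consequence": "frameshift"},
    {"gene": "BRAF", "consequence": "missense"},
])
portal = Portal([breast, liver])


def non_silent_pik3r1(rec: Record) -> bool:
    """Non-silent coding mutations in PIK3R1, across all cancers."""
    return rec["gene"] == "PIK3R1" and rec["consequence"] != "synonymous"


hits = portal.query(non_silent_pik3r1)
for hit in hits:
    print(hit)
```

Because every franchise exposes the same record fields, the same predicate can be evaluated everywhere, and the user sees a single merged result without needing to know which project holds which record.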

Data Release and IP Policies

The data release policies of the ICGC are intended to maximize public benefit while, at the same time, protecting the interests and rights of sample donors and their relatives. Members of the ICGC are committed to the principles of rapid data release (with appropriate controlled access mechanisms), in concordance with the Toronto Statement [33]. ICGC members encourage the scientific community to use any data that targets specific genes and mutations, without any restrictions. In order to allow ICGC members the opportunity to be the first to publish global analyses from datasets they generate, the Consortium has also agreed that member projects may specify conditions that include a time limit during which other data users are asked to refrain from publishing global analyses (defined by several ICGC member projects as 100 tumors and matched controls), a provision referred to as a “publication moratorium”. In order to allow time for a dataset to be analyzed and submitted for publication, ICGC members will have at most one year after released datasets reach the specified threshold before third parties are permitted to submit manuscripts describing global analyses. Further details on data release guidelines for data producers, users and reviewers are available at http://www.icgc.org. Users of ICGC data are expected to respect these terms and to cite this manuscript and the source of pre-publication data, including the version of the dataset. In cases of uncertainty, scientists using ICGC data are encouraged to contact the member projects to discuss publication plans. ICGC members believe that maximum public benefit will be achieved if the data remain publicly accessible without patent restrictions; hence, no claims to possible intellectual property (IP) derived from primary data (including somatic mutations) will be made. Users of ICGC data (including ICGC members) may elect to perform further research and to exercise their IP rights on these downstream discoveries. If this occurs, users are expected to implement licensing policies that do not obstruct further research.

Initial ICGC Projects

Currently nine countries and two European consortia have initiated cancer genome projects under the umbrella of the ICGC. The initial projects, listed in an online table that accompanies this article, will analyze tumor types found around the globe and throughout the human body affecting a diversity of organs including blood, brain, breast, kidney, liver, pancreas, stomach, oral cavity, and ovary. Over time, the ICGC will investigate fifty or more types and subtypes of cancer in adults and children. In the case of tumors with multiple subtypes, analyses should be focused on subtypes that may be defined on pathological, molecular, etiological or geographical differences. It is expected that some cancer types will be studied in parallel in different parts of the world, as the mutation profiles may differ among populations. The consortium has enabled the coordination of initial projects analyzing similar cancers in different countries, and in some cases, the redirection of resources to launch new projects.

The Cancer Genome Atlas (TCGA)

TCGA is a comprehensive program in cancer genomics that is jointly supported and managed by the National Cancer Institute and the National Human Genome Research Institute of the U.S. National Institutes of Health. TCGA began in 2006 as a pilot focused on three projects, glioblastoma multiforme (GBM), serous cystadenocarcinoma of the ovary, and lung squamous carcinoma, and has recently expanded to produce comprehensive genomic data sets for at least 10 additional cancers in the next two years. Given TCGA's contributions in launching the ICGC and cooperation to ensure that its policies (posted at http://cancergenome.nih.gov) are coordinated with those of the ICGC, TCGA's participation in the ICGC is considered to be equivalent to that of a full member. TCGA, however, is not able to join the ICGC formally at this time, because of technical and legal issues in the U.S. related to the mechanisms of the distribution of controlled-access data, although such data are directly available to investigators at http://cancergenome.nih.gov/dataportal. The National Institutes of Health policies relating to distribution of controlled-access datasets are being reviewed with the intent of enabling researchers to integrate and analyze across databases, for example, using the franchise model adopted by the ICGC. Meanwhile, TCGA is ensuring that projects are coordinated and data sets are compatible with those of the Consortium.

ICGC in the Next Decade

A large proportion of common cancers affecting patients around the world have been or will soon be selected for comprehensive cancer genome studies. Further efforts will be needed to leverage support and expertise to tackle the remaining tumor types, including rare cancers. The challenges of the ICGC are daunting due to the scope of the initiative, the complexity that is inherent to the heterogeneity of cancer and the limitations of current technologies to provide accurate long-range assemblies of highly rearranged chromosomes found in tumor cells. These challenges underscore the importance of continued international coordination and further engagement of the scientific community in the next decade.

Moving Towards Clinical Applications

ICGC catalogues, which are expected to grow exponentially, will have immediate relevance in the cancer research community. Early insight into the biology of somatic mutations will come from functional studies in cell-based and animal models of tumors. Mutation screens in retrospective tumor banks linked to registries or clinical trials having significant clinical data will inform on the potential clinical utility of somatic mutations as biomarkers for prognosis or drug-response. Germline variants identified by ICGC projects may allow the discovery of genes predisposing to familial malignancies, such as PALB2 and pancreatic cancer [12,34]. High throughput screens of RNAi or small molecule libraries, and the adaptation of existing model systems, will play a major role in refining potential therapeutic candidates for further study [35]. Translating these discoveries into clinical practice will require more sophisticated clinical trials that take into account the increases in phenotypic subdivisions, additional coordination to identify subjects having tumors with similar profiles, and increased use of biomarkers, genomic analyses, informatics and other technologies in the clinical development of new therapeutics. Given the tremendous potential for relatively low-cost genomic sequencing to reveal clinically useful information, we anticipate that in the not so distant future, partial or full cancer genomes will routinely be sequenced as part of the clinical evaluation of cancer patients and as part of their on-going clinical management. The successful and appropriate translation of cancer genome research into clinical practice will raise important social and ethical questions. 
It will be essential to combine the expertise of oncologists, biostatisticians, pathologists, geneticists, policy-makers and members of the biopharmaceutical industry to meet this challenge by developing new policies and clinical paradigms that enable rapid translation of many new biomarkers and cancer targets into new clinical tests and therapeutic interventions that will benefit cancer patients.

Box 1: Goals of the Consortium

Coordinate the generation of comprehensive catalogues of genomic abnormalities (somatic mutations) in tumors in 50 different cancer types and/or subtypes which are of clinical and societal importance across the globe.
Ensure high quality by defining the catalogue for each tumor type or subtype to include the full range of somatic mutations such as single-nucleotide variants, insertions, deletions, copy number changes, translocations and other chromosomal rearrangements, and to have the following features: Comprehensiveness, such that most cancer genes with somatic abnormalities occurring at a frequency of greater than 3% are discovered; High resolution, ideally at a single nucleotide level; High quality, using common standards for pathology and technology; Data from matched non-tumor tissue, to distinguish somatic from inherited sequence variants and aberrations.
Generate complementary catalogues of transcriptomic and epigenomic datasets from the same tumors.
Make the data available to the entire research community as rapidly as possible, and with minimal restrictions, to accelerate research into the causes and control of cancer.
Coordinate research efforts so that the interests and priorities of individual participants, self-organizing consortia, funding agencies and nations are addressed, including use of the burden of disease and the minimization of unnecessary redundancy in tumor analysis efforts.
Support the dissemination of knowledge and standards related to new technologies, software, and methods to facilitate data integration and sharing with cancer researchers around the globe.

Box 2: Core Bioethical Elements for Consent

For prospective research, ICGC members should convey to potential participants that:
The ICGC is a coordinated effort among related scientific research projects being carried on around the world.
Participation in the ICGC and its component projects is voluntary.
Samples and data collected will be used for cancer research, which may include whole genome sequencing.
The patient's care will not be affected by their decision regarding participation.
The samples collected will be in limited quantities; access to them will be tightly controlled and will depend on the policy and practices of the ICGC-member project. At least a small percentage of the samples may be shared with laboratories in other countries for the purposes of performing quality control studies.
Data derived from the samples collected and data generated by the ICGC members will be made accessible to ICGC members and other international researchers through either an open or a controlled access database under terms and conditions that will maximize participant confidentiality.
The researchers accessing data and samples will be required to affirm that they will not attempt to re-identify participants.
There is a remote risk of being identified from data available on the databases.
Once data are placed in open databases, those data cannot be withdrawn later.
In controlled access databases the links to (local) data that can identify an individual will be destroyed upon withdrawal. Data previously distributed will continue to be used.
ICGC members agree not to make claims to possible IP on primary data.
No profit from eventual commercial products will be returned to subjects donating samples.

For retrospective research, the above guidelines remain the same, with the exception that where the individual is no longer a patient, there will not be a concern that their care could be affected by participation.

For research involving samples and data from deceased individuals:
Where required by law or ethics, consent should always be obtained from the families of a deceased individual if their samples and data are to be used; if re-consent is not required, however, ethics review is sufficient.
Ethics committee review should be sought for all research proposing the use of existing sample and data collections.
Existing collections are a limited and valuable resource; access to them will be tightly controlled.

For research using anonymized samples, ethics review may be required in some jurisdictions.

