
Right data for right patient: a precisionFDA NCI-CPTAC Multi-omics Mislabeling Challenge.

Emily Boja, Živana Težak, Bing Zhang, Pei Wang, Elaine Johanson, Denise Hinton, Henry Rodriguez.

Year: 2018; PMID: 30194412; PMCID: PMC6892367; DOI: 10.1038/s41591-018-0180-x

Source DB: PubMed; Journal: Nat Med; ISSN: 1078-8956; Impact factor: 53.440


Although genomics has shaped the current scope of precision medicine, it is becoming increasingly clear that molecular phenotypes, such as DNA and RNA profiles and, in particular, protein abundance profiles, are essential to our understanding of biology and for enhancing our ability to achieve the promise of precision medicine for patients. Hence, simultaneous generation and integration of multidimensional multi-omics datasets from a large set of tumor samples, such as those used in the National Cancer Institute’s (NCI) The Cancer Genome Atlas (TCGA; https://cancergenome.nih.gov) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC; https://proteomics.cancer.gov) projects[1-4], is becoming a powerful approach to understanding the molecular basis of diseases and speeding the translation of new discoveries to patient care. This development has been largely enabled by the rapid technological advancement, standardization and harmonization in tumor molecular profiling in recent years. Consequently, several initiatives have been launched to leverage this development for application to clinical practice, including the International Cancer Proteogenome Consortium[5] and the Applied Proteogenomics Organizational Learning and Outcomes[6] programs. These efforts promise to revolutionize our understanding of cancer biology and change the way cancer is treated. The value of multi-omics technologies and datasets lies in the possibility of accurately extracting rich information to help understand the molecular complexities specific to individual patients through use of sophisticated integrative computational algorithms. Such information can be used to reach a deeper understanding of a disease, which then can be applied clinically, for example, to elucidate the relationship between the genome and proteome of a patient’s tumor or to deconvolute tumor heterogeneity associated with clinical outcome. 
Ideally, individual and population data would ultimately serve to inform a physician and a patient and to help determine the most appropriate treatment options. Furthermore, the comprehensive information obtained on the same sample in multiple dimensions can add value in pinpointing and correcting problems that can be encountered, such as sample mislabeling by accidental swapping of patient samples or data mislabeling (accidental swapping of patient omics data), which could lead to multiple patients receiving the wrong medical treatment, resulting in severe, irreversible consequences. Sample mislabeling that contributes to irreproducible results and invalid conclusions is known to be one of the obstacles in basic and translational research[7]. This is also prevalent in data-rich large-scale omics studies[8,9], in which human errors could arise anywhere in the data production and analysis pipeline—either sample mislabeling (early in the pipeline) or data mislabeling (later in the pipeline). The Food and Drug Administration (FDA) and NCI-CPTAC, with a history of collaboration[10], also have experience in building challenges, such as the precisionFDA Challenges (https://precision.fda.gov/challenges) and NCI–CPTAC DREAM Proteogenomics Challenge (https://www.synapse.org/#!Synapse:syn8228304/wiki/413428), to solve complex problems. Now they are joining forces to launch a Multi-omics Enabled Sample Mislabeling and Correction Challenge (https://precision.fda.gov/mislabeling) in September 2018. The objective of this challenge is to encourage development and evaluation of computational algorithms that can accurately detect and correct mislabeled samples using rich multi-omics datasets, enhancing the assurance that the right data is attributed to the right patient.

Challenge design

The challenge comprises two subchallenges to be conducted sequentially.
In Subchallenge 1, participants will be asked to detect mislabeled samples. They will be presented with a training dataset and a test dataset, each comprising real-world clinical and proteomics data. Mislabeled samples will be identified in the training dataset but not in the test dataset. Using the training dataset, participants will develop computational models to distinguish samples with matched clinical and proteomics data from those with nonmatched data. The models will then be used to identify mislabeled samples in the test dataset.
In Subchallenge 2, participants will be asked to correct mislabeled samples in richer data. They will be presented with real-world RNA profiling data for all samples in both the training and test datasets. Like the clinical and proteomics data, the newly introduced RNA profiling data will also include mislabeled samples. As in Subchallenge 1, the mislabeled samples will be known in the training dataset but not in the test dataset. Participants will develop computational algorithms to model the relationships among the three data types in the training dataset and will then apply those algorithms to identify and correct instances in which a single data type is mislabeled among the trio of data types in the test dataset.
Subchallenge results will be independently evaluated (Fig. 1).
Fig. 1: Challenge design and timelines.
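To make the matching idea behind the subchallenges concrete, the following is a minimal sketch of one simple way to flag and re-pair swapped samples between two data types. It is not the challenge's scoring method or any participant's algorithm. It assumes that RNA and protein measurements for the same genes are positively correlated within a sample, and every name in it (`zscore_columns`, `pearson`, `find_mismatches`) and all of the data are hypothetical and synthetic.

```python
import math
import random


def zscore_columns(matrix):
    """Standardize each feature (column) across samples (rows)."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        scaled_cols.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def find_mismatches(rna, protein):
    """Map each RNA row whose best-correlated protein row is not its own
    to the index of that better-matching protein row."""
    rna_z, prot_z = zscore_columns(rna), zscore_columns(protein)
    suggestions = {}
    for i, r in enumerate(rna_z):
        scores = [pearson(r, p) for p in prot_z]
        best = max(range(len(scores)), key=scores.__getitem__)
        if best != i:
            suggestions[i] = best
    return suggestions


# Synthetic data (an assumption for illustration): protein tracks RNA plus noise.
rng = random.Random(0)
n_samples, n_genes = 12, 200
rna = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
protein = [[v + rng.gauss(0, 0.5) for v in row] for row in rna]

# Simulate a labeling error by swapping the protein profiles of samples 3 and 7.
protein[3], protein[7] = protein[7], protein[3]

# Rows 3 and 7 should be flagged, each pointing at the other as its partner.
print(find_mismatches(rna, protein))
```

Greedy best-match pairing of this kind cannot tell which of the two data types carries the wrong label, which is why a third data type, as in Subchallenge 2, is needed to localize the error. Real data would also likely call for a one-to-one assignment rather than independent per-sample choices.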

Anticipated outcome and impact

An immediate outcome envisioned is a flagship challenge manuscript that gives an overview of the challenge data, questions, design, and outcomes[11]. Additionally, the algorithms that the participants propose will be aggregated with the aim of refining a final open-source product to be incorporated into an analysis pipeline and ultimately as part of a quality-management system to reduce errors. This could help speed the translation of multidimensional omics technologies and datasets to the clinic. Meanwhile, NCI and FDA hope to build and expand a community of scientists that will collaborate to solve important problems that prevent the translation of multi-omics data to the clinical labs.
References

1.  Perspective on Oncogenic Processes at the End of the Beginning of Cancer Genomics.

Authors:  Li Ding; Matthew H Bailey; Eduard Porta-Pardo; Vesteinn Thorsson; Antonio Colaprico; Denis Bertrand; David L Gibbs; Amila Weerasinghe; Kuan-Lin Huang; Collin Tokheim; Isidro Cortés-Ciriano; Reyka Jayasinghe; Feng Chen; Lihua Yu; Sam Sun; Catharina Olsen; Jaegil Kim; Alison M Taylor; Andrew D Cherniack; Rehan Akbani; Chayaporn Suphavilai; Niranjan Nagarajan; Joshua M Stuart; Gordon B Mills; Matthew A Wyczalkowski; Benjamin G Vincent; Carolyn M Hutter; Jean Claude Zenklusen; Katherine A Hoadley; Michael C Wendl; Ilya Shmulevich; Alexander J Lazar; David A Wheeler; Gad Getz
Journal:  Cell       Date:  2018-04-05       Impact factor: 41.582

2.  Protein-based multiplex assays: mock presubmissions to the US Food and Drug Administration.

Authors:  Fred E Regnier; Steven J Skates; Mehdi Mesri; Henry Rodriguez; Zivana Tezak; Marina V Kondratovich; Michail A Alterman; Joshua D Levin; Donna Roscoe; Eugene Reilly; James Callaghan; Kellie Kelm; David Brown; Reena Philip; Steven A Carr; Daniel C Liebler; Susan J Fisher; Paul Tempst; Tara Hiltke; Larry G Kessler; Christopher R Kinsinger; David F Ransohoff; Elizabeth Mansfield; N Leigh Anderson
Journal:  Clin Chem       Date:  2009-12-10       Impact factor: 8.327

Review 3.  Collaboration to Accelerate Proteogenomics Cancer Care: The Department of Veterans Affairs, Department of Defense, and the National Cancer Institute's Applied Proteogenomics OrganizationaL Learning and Outcomes (APOLLO) Network.

Authors:  L D Fiore; H Rodriguez; C D Shriver
Journal:  Clin Pharmacol Ther       Date:  2017-05       Impact factor: 6.875

4.  Proteogenomic characterization of human colon and rectal cancer.

Authors:  Bing Zhang; Jing Wang; Xiaojing Wang; Jing Zhu; Qi Liu; Zhiao Shi; Matthew C Chambers; Lisa J Zimmerman; Kent F Shaddox; Sangtae Kim; Sherri R Davies; Sean Wang; Pei Wang; Christopher R Kinsinger; Robert C Rivers; Henry Rodriguez; R Reid Townsend; Matthew J C Ellis; Steven A Carr; David L Tabb; Robert J Coffey; Robbert J C Slebos; Daniel C Liebler
Journal:  Nature       Date:  2014-07-20       Impact factor: 49.962

5.  Revolutionizing Precision Oncology through Collaborative Proteogenomics and Data Sharing.

Authors:  Henry Rodriguez; Stephen R Pennington
Journal:  Cell       Date:  2018-04-19       Impact factor: 41.582

6.  Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer.

Authors:  Hui Zhang; Tao Liu; Zhen Zhang; Samuel H Payne; Bai Zhang; Jason E McDermott; Jian-Ying Zhou; Vladislav A Petyuk; Li Chen; Debjit Ray; Shisheng Sun; Feng Yang; Lijun Chen; Jing Wang; Punit Shah; Seong Won Cha; Paul Aiyetan; Sunghee Woo; Yuan Tian; Marina A Gritsenko; Therese R Clauss; Caitlin Choi; Matthew E Monroe; Stefani Thomas; Song Nie; Chaochao Wu; Ronald J Moore; Kun-Hsing Yu; David L Tabb; David Fenyö; Vineet Bafna; Yue Wang; Henry Rodriguez; Emily S Boja; Tara Hiltke; Robert C Rivers; Lori Sokoll; Heng Zhu; Ie-Ming Shih; Leslie Cope; Akhilesh Pandey; Bing Zhang; Michael P Snyder; Douglas A Levine; Richard D Smith; Daniel W Chan; Karin D Rodland
Journal:  Cell       Date:  2016-06-29       Impact factor: 41.582

7.  A community effort to assess and improve drug sensitivity prediction algorithms.

Authors:  James C Costello; Laura M Heiser; Elisabeth Georgii; Mehmet Gönen; Michael P Menden; Nicholas J Wang; Mukesh Bansal; Muhammad Ammad-ud-din; Petteri Hintsanen; Suleiman A Khan; John-Patrick Mpindi; Olli Kallioniemi; Antti Honkela; Tero Aittokallio; Krister Wennerberg; James J Collins; Dan Gallahan; Dinah Singer; Julio Saez-Rodriguez; Samuel Kaski; Joe W Gray; Gustavo Stolovitzky
Journal:  Nat Biotechnol       Date:  2014-06-01       Impact factor: 54.908

8.  reGenotyper: Detecting mislabeled samples in genetic data.

Authors:  Konrad Zych; Basten L Snoek; Mark Elvin; Miriam Rodriguez; K Joeri Van der Velde; Danny Arends; Harm-Jan Westra; Morris A Swertz; Gino Poulin; Jan E Kammenga; Rainer Breitling; Ritsert C Jansen; Yang Li
Journal:  PLoS One       Date:  2017-02-13       Impact factor: 3.240

9.  Proteogenomics connects somatic mutations to signalling in breast cancer.

Authors:  Philipp Mertins; D R Mani; Kelly V Ruggles; Michael A Gillette; Karl R Clauser; Pei Wang; Xianlong Wang; Jana W Qiao; Song Cao; Francesca Petralia; Emily Kawaler; Filip Mundt; Karsten Krug; Zhidong Tu; Jonathan T Lei; Michael L Gatza; Matthew Wilkerson; Charles M Perou; Venkata Yellapantula; Kuan-lin Huang; Chenwei Lin; Michael D McLellan; Ping Yan; Sherri R Davies; R Reid Townsend; Steven J Skates; Jing Wang; Bing Zhang; Christopher R Kinsinger; Mehdi Mesri; Henry Rodriguez; Li Ding; Amanda G Paulovich; David Fenyö; Matthew J Ellis; Steven A Carr
Journal:  Nature       Date:  2016-05-25       Impact factor: 49.962

